This presentation describes the efficient parallisation of the multilevel fast multipole method (MLFMM) for distributed memory systems. Data is distributed between the processes to obtain a good load balance and to minimise the number of communications. The total runtime efficiency is highly dependent on the efficiency of the matrix-vector-product, as well as the computation of the preconditioner.