On April I published an entry explaining how to calculate PI using the most basic Monte Carlo algorithm and now, using the Middle Earth cluster I can do it parallelized.

Parallelizing Monte Carlo is a very simple task because of it’s random and independent nature and this basic monte carlo is even easier. I can just run exactly the same routine as before on all nodes and at the end sum everything and divide by the number of nodes. To achieve that, I just changed the main.cc file to use MPI, quite simple indeed.

The old main.cc just called the function and returned the value:

area_disk(pi, max_iter, delta); cout << "PI = " << pi << endl;

But now, the new version should know whether it’s the main node or a computing node. After, all computing nodes should calculate the area and the main node should gather and sum.

/* Nodes, compute and return */ if (myrank) { area_disk(tmp, max_iter, delta); MPI_Send( &tmp, 1, MPI_DOUBLE, 0, 17, MPI_COMM_WORLD ); /* Main, receive all areas and sum */ } else { for (int i=1; i < size; i++) { MPI_Recv( &tmp, 1, MPI_DOUBLE, i, 17, MPI_COMM_WORLD, &status ); pi += tmp; } pi /= (size-1); cout << "PI = " << pi << endl; }

On MPI, **myrank** says your node number and **size** shows you the total number of nodes. On the most basic MPI program, if it’s zero you’re the main node, otherwise you’re a computing node.

All computing nodes calculate the area and MPI_Send the result to the main node. The main node waits for all responses on the main loop and sum the temporary result **tmp** to **pi** and at the end divide by the number of computing nodes.

**Benefits:**

This monte carlo is extremely basic and very easy to parallelize. As this copy is run over N computing nodes and there’s no dependency between them you should achieve an increase in speed of over N times the non-parallel one.

Unfortunately, this algorithm is so slow and inaccurate that even running on 9 computing nodes (ie. 9 times faster) it’s still wrong on the third digit.

The slowness is due to the algorithm’s stupidity but the inaccuracy is due to the lack of a really good standard random number generators. Almost all machines yielded results far from the 5-digit answer on M_PI macro of C standard library and the result was also far from it. Also, there are so many other ways of calculating PI that are so much faster that it wouldn’t be a good approach ever!

The good thing is that it was just to show a distributes monte carlo algorithm working… 😉