Distributed Computing using JuliaBox
JuliaBox lets you create distributed algorithms and run them on clusters of hundreds of CPUs without having to manage or maintain that infrastructure. You can concentrate on your problem domain and your algorithm while we take care of creating and managing the cluster.
Your JuliaBox environment includes a JuliaRunClient package that provides methods to create a distributed cluster of Julia processes. Once the cluster is created, you can use the standard Julia distributed computing primitives, such as @parallel for and pmap, to implement your algorithm.
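As a minimal sketch of these primitives, the snippet below uses local worker processes started with addprocs in place of a JuliaBox cluster (an assumption, made so that it runs anywhere). It targets Julia 1.x, where the primitives live in the Distributed standard library and @parallel for is spelled @distributed for. The helper name square is purely illustrative.

```julia
using Distributed

# Assumption: local workers stand in for a JuliaBox cluster.
addprocs(2)

# Define the helper on every process so that workers can call it.
@everywhere square(x) = x^2

# pmap hands each element to an available worker and collects results in order.
squares = pmap(square, 1:10)

# Distributed for loop with a (+) reducer: each worker sums its share of the
# iterations, and the partial sums are combined on the calling process.
total = @distributed (+) for i in 1:10
    square(i)
end

println(squares)
println(total)

# Shut the local workers down again.
rmprocs(workers())
```

pmap suits a modest number of expensive calls, while a reducing distributed loop suits many cheap iterations whose results are combined into a single value.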
Distributed Monte Carlo
As a simple example, let us calculate the value of pi using parallel Monte Carlo simulation.
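The estimate rests on a standard geometric argument, sketched here. A point (x, y) drawn uniformly from the unit square lands inside the quarter circle of radius 1 with probability equal to that quarter circle's area. The symbols n (number of points that land inside) and N_total (number of points drawn) are notation introduced for this sketch.

```latex
P\left(x^2 + y^2 < 1\right) = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4 \cdot \frac{n}{N_{\text{total}}}
```

Because every point is drawn independently, the count n can be accumulated in separate batches on separate processes and summed at the end, which is what makes the problem easy to parallelize.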
Initialize the cluster by first loading the JuliaRunClient library and then calling the initializeCluster method. This method takes the number of worker nodes as a parameter and boots up a cluster of that size.
    using JuliaRunClient
    initializeCluster(2)
Then we define the parallel algorithm. It uses the standard Julia pmap function to evaluate the simulation across multiple processes.
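    # Run `loops` batches of N samples across the workers and combine the counts.
    function estimate_pi(N, loops)
        n = sum(pmap((x)->darts_in_circle(N), 1:loops))
        4 * n / (loops * N)
    end

    # Count how many of N random points in the unit square fall inside the unit circle.
    # @everywhere makes the function available on every worker process.
    @everywhere function darts_in_circle(N)
        n = 0
        for i in 1:N
            if rand()^2 + rand()^2 < 1
                n += 1
            end
        end
        n
    end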
Finally, we run the simulation for 1 million samples and 50 iterations.
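A minimal self-contained sketch of such a run follows. It assumes Julia 1.x and local workers started with addprocs rather than a cluster created by initializeCluster. Reading "1 million samples and 50 iterations" as N = 1,000,000 and loops = 50 is an interpretation based on the estimate_pi(N, loops) signature, and the variable name pi_estimate is illustrative.

```julia
using Distributed

# Assumption: local workers stand in for the JuliaBox cluster.
addprocs(2)

# Count how many of N random points in the unit square fall inside the unit circle.
@everywhere function darts_in_circle(N)
    n = 0
    for i in 1:N
        if rand()^2 + rand()^2 < 1
            n += 1
        end
    end
    n
end

# Spread `loops` independent batches of N samples across the workers.
function estimate_pi(N, loops)
    n = sum(pmap(x -> darts_in_circle(N), 1:loops))
    4 * n / (loops * N)
end

# Assumed reading: 1 million samples per iteration, 50 iterations.
pi_estimate = estimate_pi(1_000_000, 50)
println(pi_estimate)

# Shut the local workers down again.
rmprocs(workers())
```

The result varies from run to run because the samples are random, but with 50 million points in total it should agree with pi to roughly three decimal places.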
Once the task is complete, the cluster can be brought down.
You can run this and other examples using the JuliaRun-parallel.ipynb and the
Scaling from UI
The "Customize" interface lets you edit the number of workers attached to your Jupyter instance. This is equivalent to calling initializeCluster as shown in the section above.