Introduction

Distributed Computing using JuliaBox

JuliaBox lets you write distributed algorithms and run them on clusters of hundreds of CPUs without having to manage or maintain that infrastructure. You can concentrate on your problem domain and your algorithm while we take care of creating and managing the cluster.

Basic API

Your JuliaBox environment includes the JuliaRunClient package, which provides methods to create a distributed cluster of Julia processes. Once the cluster is created, use the standard Julia distributed computing primitives, such as @everywhere, @parallel for, and pmap, to implement your algorithm.
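To get a feel for these primitives before starting a cluster, the following is a minimal local sketch. It rests on two assumptions that are not part of the JuliaBox setup described here: it targets Julia 0.7 or later, where the primitives live in the Distributed standard library and @parallel is named @distributed, and it uses addprocs to start two local worker processes in place of a cluster. The function name square is illustrative only.

```julia
# Sketch for Julia 0.7 or later. On Julia 0.6 these primitives are in Base,
# so the `using` line is not needed and @distributed is spelled @parallel.
using Distributed

# Assumption: two local worker processes stand in for the cluster.
addprocs(2)

# @everywhere evaluates the definition on every process, so workers can call it.
@everywhere square(x) = x^2

# pmap hands the elements of the collection to the available workers
# and returns the results in input order.
squares = pmap(square, 1:10)

# @distributed with a reducer splits the range across the workers and
# combines the per-iteration results with (+).
total = @distributed (+) for i in 1:10
    square(i)
end

println(nworkers(), " workers; squares = ", squares, "; total = ", total)

# Shut the local workers down again.
rmprocs(workers())
```

Roughly speaking, pmap suits a moderate number of expensive calls, while the reducing for loop suits many cheap iterations.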

Distributed Monte Carlo

As a simple example, let us calculate the value of pi using parallel Monte Carlo simulation.

Initialize the cluster by first loading the JuliaRunClient library, and then calling the initializeCluster method. This method takes the number of worker nodes as a parameter, and boots up a cluster of that size.

using JuliaRunClient
initializeCluster(2)

Then we define the parallel algorithm. It uses the standard Julia pmap function to evaluate the simulation across multiple processes.

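# Run `loops` independent batches of N samples on the workers and combine the counts.
function estimate_pi(N, loops)
    n = sum(pmap((x)->darts_in_circle(N), 1:loops))
    4 * n / (loops * N)   # the quarter circle covers pi/4 of the unit square
end

# Defined on every process so that the workers can call it.
@everywhere function darts_in_circle(N)
    n = 0
    for i in 1:N
        # Count random points in the unit square that fall inside the quarter circle.
        if rand()^2 + rand()^2 < 1
            n += 1
        end
    end
    n
end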

Finally, we run the simulation with 1 million samples per iteration over 20 iterations.

estimate_pi(1_000_000, 20)
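To judge whether the returned value is plausible, it helps to know how much random scatter to expect. The sketch below is an illustration rather than part of the JuliaBox example: it runs a serial copy of the sampling kernel on a single process and computes the standard error predicted by the binomial model, in which each sample lands inside the quarter circle with probability pi/4. The names darts_in_circle_serial and expected_error are hypothetical.

```julia
# Serial copy of the sampling kernel, renamed (hypothetical name) so that it
# does not clash with the distributed definition.
function darts_in_circle_serial(N)
    n = 0
    for i in 1:N
        if rand()^2 + rand()^2 < 1
            n += 1
        end
    end
    n
end

# Each sample is a hit with probability p = pi/4, so the estimate
# 4 * hits / total has standard error 4 * sqrt(p * (1 - p) / total).
expected_error(total) = 4 * sqrt((pi / 4) * (1 - pi / 4) / total)

N = 1_000_000
serial_estimate = 4 * darts_in_circle_serial(N) / N

println("serial estimate:            ", serial_estimate)
println("standard error, 1 batch:    ", expected_error(N))
println("standard error, 20 batches: ", expected_error(20 * N))
```

Because the error shrinks only with the square root of the sample count, 20 batches narrow the scatter by a factor of about 4.5 compared with a single batch.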

Once the task is complete, the cluster can be brought down.

releaseCluster()
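If the computation throws an error, the call to releaseCluster may never be reached and the cluster keeps running. One way to guard against that is a small wrapper built on try and finally. The helper with_cluster below is hypothetical and not part of JuliaRunClient. It takes the start and stop functions as arguments so that the sketch can be tried without a cluster, and it assumes only the call shapes shown above: a start function that accepts a worker count and a stop function that takes no arguments.

```julia
# Hypothetical helper, not part of JuliaRunClient: run `work` on a cluster of
# `n` workers and always release the cluster afterwards, even if `work` throws.
function with_cluster(work, n, start, stop)
    start(n)
    try
        return work()
    finally
        stop()
    end
end

# Local dry run with stand-in functions that only record what they would do.
events = String[]
result = with_cluster(() -> 42, 2,
                      n -> push!(events, "start $n"),
                      () -> push!(events, "stop"))

println(result, " ", events)
```

On JuliaBox the same pattern would presumably read with_cluster(() -> estimate_pi(1_000_000, 20), 2, initializeCluster, releaseCluster).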

You can run this and other examples using the JuliaRun-parallel.ipynb and the JuliaRun-parallel-batch.ipynb notebooks.

Scaling from UI

The "Customize" interface lets you edit the number of workers attached to your Jupyter instance. This is equivalent to calling initializeCluster as shown in the section above.