We’ve been testing some Spark code that will eventually be moved to AWS. For now, to save costs, we’ve created an 8-node Spark cluster that runs on a set of virtual machines running Ubuntu on VirtualBox. We’ve developed some Bash scripts to make starting (and shutting down) the VMs easy.
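To give a rough idea of what such start/stop helpers can look like, here is a minimal sketch rather than our actual scripts. It assumes the VMs are named `spark-node-1` through `spark-node-8`; that naming scheme, and the variables `VM_PREFIX`, `NODE_COUNT`, and `VBOX`, are made up for illustration. It relies only on the standard `VBoxManage startvm` and `VBoxManage controlvm` subcommands.

```shell
#!/usr/bin/env bash
# Sketch: start or stop a group of VirtualBox VMs.
# VM_PREFIX and NODE_COUNT are assumptions; adjust them to match
# the names reported by `VBoxManage list vms`.
VBOX="${VBOX:-VBoxManage}"            # set VBOX=echo for a dry run
VM_PREFIX="${VM_PREFIX:-spark-node-}"
NODE_COUNT="${NODE_COUNT:-8}"

# Boot every node without opening a GUI window.
start_cluster() {
  i=1
  while [ "$i" -le "$NODE_COUNT" ]; do
    "$VBOX" startvm "${VM_PREFIX}${i}" --type headless
    i=$((i + 1))
  done
}

# Ask every node to shut down cleanly via an ACPI power-button event.
stop_cluster() {
  i=1
  while [ "$i" -le "$NODE_COUNT" ]; do
    "$VBOX" controlvm "${VM_PREFIX}${i}" acpipowerbutton
    i=$((i + 1))
  done
}
```

After sourcing the file, `start_cluster` brings the nodes up and `stop_cluster` shuts them down. Setting `VBOX=echo` first prints the commands instead of running them, which is a cheap way to check the VM names before touching the real machines.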