R, Spark, and TextMate

I've been keeping my eye on Spark for a while now, and I recently took the plunge after having to do some brief R analyses that were not that complicated and were a perfect fit for Spark.  I use TextMate as my R IDE, and I wanted to run my scripts from TextMate straight into Spark.  Below are a couple of tips and tricks I found for setting everything up so that you can start Spark with the Command-R (⌘-R) shortcut.

 

• Installing Spark on a Mac

Getting Spark on a Mac is easily done with Homebrew:
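
Assuming the Homebrew formula is named apache-spark, which matches the Cellar path used in the next section, the install is a one-liner. Homebrew installs whatever version is current, so the 1.6.0 in the paths further down may differ on your machine; the second command is just a quick way to see where it landed.

```shell
# Install Spark with Homebrew (formula name assumed to be apache-spark,
# inferred from the /usr/local/Cellar/apache-spark/... paths in this post)
brew install apache-spark

# Print the install prefix so you can confirm the version directory
brew --prefix apache-spark
```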

• Environment Variables

Next, you're going to need to set up some environment variables.  Open your ".bash_profile" and add the following lines, which extend your PATH and set SPARK_HOME:

PATH=$PATH:/usr/local/Cellar/apache-spark/1.6.0
export PATH
export SPARK_HOME="/usr/local/Cellar/apache-spark/1.6.0/libexec/"
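
Before moving on to R, it can save some head-scratching to confirm that the variables actually took effect. The following is a minimal sketch of such a check; check_spark_env is a hypothetical helper name I made up for illustration, not something that ships with Spark or Homebrew.

```shell
# Sanity check for the SPARK_HOME variable set in .bash_profile.
# check_spark_env is a hypothetical helper, not part of Spark.
check_spark_env() {
  # Fail if the variable is unset or empty
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set"
    return 1
  fi
  # Fail if the variable points somewhere that does not exist
  if [ ! -d "$SPARK_HOME" ]; then
    echo "SPARK_HOME points to a missing directory: $SPARK_HOME"
    return 1
  fi
  echo "SPARK_HOME looks OK: $SPARK_HOME"
}
```

To use it, open a new terminal (or run source ~/.bash_profile), paste the function in, and call check_spark_env. If it complains about a missing directory, the version number in your Cellar path probably differs from 1.6.0.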

 

• R Source Files

Once the environment variables are set, you'll need to add the following lines to the beginning of your R script.  These lines are the ones that fire up Spark and get things rolling:

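# Load SparkR from the Homebrew install, since it is not on R's default library path
library(SparkR, lib.loc = "/usr/local/Cellar/apache-spark/1.6.0/libexec/R/lib")
# Start a Spark context, then a SQL context for working with DataFrames
sc <- sparkR.init(sparkHome = "/usr/local/Cellar/apache-spark/1.6.0/libexec")
sqc <- sparkRSQL.init(sc)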

 

The following link is a great resource for getting started with SparkR:

https://spark.apache.org/docs/1.6.0/sparkr.html