Wednesday, May 20, 2015

Teaching R course? Use analogsea to run your customized RStudio in Digital Ocean!

Two years ago I taught an introductory R/Shiny course here at The Jackson Lab. We all learnt a lot. Unfortunately not about Shiny itself, but rather about incompatibilities between its versions and trouble with its installation to some machines.

And it is not only my experience. If you look into forums of Rafael Irizarry MOOC courses, so many questions are just about installation / incompatibilities of R packages. The solution exists for a long time: run your R in a cloud. However, customization of virtual machines (like Amazon EC2) used to be a nontrivial task.

In this post I will show how a few lines of R code can start a customized RStudio docklet in a cloud and email login credentials to course participants. So, the participant do not need to install R and the required packages. Moreover, it is guaranteed they all run exactly the same software. All they need is a decent web browser to access RStudio server.

RStudio server login

Running RStudio in Digital Ocean with R/analogsea

So how complicated is it today to start your RStudio on clouds? It is (almost) a one-liner:
  1. If you do not have  Digital Ocean account, get one. You should receive a promotional credit $10 (= 1 regular machine running without interruption for 1 month):
    (full disclosure: if you create your account using the link above I might get an extra credit)
  2. Install analogsea package from Github. Make sure to create Digital Ocean personal access token and in R set DO_PAT environment variable. Also create your personal SSH key and upload it to Digital Ocean.
  3. And now it is really easy:

    # Sys.setenv(DO_PAT = "*****") set access token

    # start your machine in Digital Ocean
    d <- docklet_create(size = getOption("do_size", "512mb"))
    # run RStudio on machine 'd' (rocker/rstudio docker image)
    d %>% docklet_rstudio()
The last line should open your browser with RStudio login page (user "rstudio", password "rstudio"). If not, use summary(d) to get the IP address of your machine and go to http://your_machine:8787

It will cost you ~$0.01 per hour ($5 per month, May 2015). When you are done, do not forget to stop your Digital Ocean machine (droplet_delete(d)). At the end, make sure that you successfully killed all your machines - either log in to Digital Ocean or by calling droplets() in R.

Wednesday, February 11, 2015

Make your R plots interactive


As a part of my daily job, I draw scatterplots, lots of them. And because there are thousands of genes expressed in any mouse or human tissue, my typical plot looks something like this (code). (Actually, it is a comparison of variance that can be attributed to "sex" factor in mRNA vs. protein expression.)

The question is - what is a gene in the top right corner? And what is the one next to him? And this one? And that?