Installing R packages onto your EC2 RStudio instance
Once you’ve got your EC2 instance running with RStudio, you will probably want to install some of your favourite packages. I use ggplot2 and plyr a lot, but installing them isn’t as simple as on your local PC.
First you need to connect to your instance. If you use Windows, you’ll need to download PuTTY. Amazon provide a walk-through of how to do this, and I’ve replicated the key steps here.
You will need to create a version of the Key-Pair file (the one you created when you first set up your EC2 security group) that PuTTY can recognise. Use PuTTYgen for this: click ‘Load’ to load an existing key and browse for your Key-Pair file. You will need to select ‘All file types’, as the Amazon Key-Pair is a .pem file. Once you’ve found it, click ‘Save private key’. You should now see a .ppk file in the same folder as your .pem key-pair.
Now start the PuTTY.exe program.
In the Host Name box write ‘ubuntu@’ followed by the public DNS name for your instance from the EC2 dashboard.
Then expand the SSH and Auth categories on the left-hand side and browse to the .ppk key-pair you generated earlier.
Click ‘Open’ and ‘Yes’ on the next dialog box. The command window that appears will allow you to control your EC2 instance.
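If you are on a Mac or Linux machine rather than Windows, the same connection can usually be made with the built-in ssh client and the original .pem file, so no .ppk conversion is needed. The sketch below is only an illustration: the function names ec2_target and ec2_connect are my own, and the key path and DNS name in the usage line are placeholders.

```shell
# Build the user@host string that goes in PuTTY's Host Name box (or on the ssh command line).
# "ubuntu" is the login user used in this post; the DNS name comes from your EC2 dashboard.
ec2_target() {
    if [ -z "$1" ]; then
        echo "usage: ec2_target <public-dns-name>" >&2
        return 1
    fi
    printf 'ubuntu@%s\n' "$1"
}

# Connect with OpenSSH using the .pem key-pair file directly.
# $1 = path to the .pem file, $2 = public DNS name of the instance.
ec2_connect() {
    # ssh refuses private keys that other users can read, so restrict the file first.
    chmod 400 "$1" || return 1
    ssh -i "$1" "$(ec2_target "$2")"
}
```

You would then run something like ec2_connect ~/my-key-pair.pem followed by your instance’s public DNS name.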
For some reason (and this may only be with micro instances) you need to add more memory to install packages. Don’t worry, you won’t be charged for this. To add more, type the following commands, which create a 1 GB swap file on the instance’s disk and switch it on:
sudo /bin/dd if=/dev/zero of=/var/swap.1 bs=1M count=1024
sudo /sbin/mkswap /var/swap.1
sudo /sbin/swapon /var/swap.1
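To check that the extra memory is actually in use, you can look for the swap file in the kernel’s swap table. This is a minimal sketch and the function name swap_is_active is my own. It reads /proc/swaps by default, and the optional second argument exists only so the function can be pointed at a different file.

```shell
# Succeed if the given swap file is listed in the swap table.
# $1 = swap file to look for, $2 = table to read (defaults to /proc/swaps).
swap_is_active() {
    # The first column of /proc/swaps is the file or device name; line 1 is a header.
    awk -v f="$1" 'NR > 1 && $1 == f { found = 1 } END { exit !found }' "${2:-/proc/swaps}"
}
```

For example, swap_is_active /var/swap.1 && echo "swap is on". Running free -m should also show a non-zero Swap total. Note that swap switched on this way does not survive a reboot unless you run swapon again or add the file to /etc/fstab.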
To install packages you need to be the ‘root’ user. I’m still a little unsure what this actually means, but it seems to be like having administrator privileges. The following command starts a shell as root:
sudo -s
Now you’re ready to install packages from CRAN. To install ggplot2, plyr or any other CRAN package, type:
apt-get install r-cran-ggplot2
apt-get install r-cran-plyr
apt-get install r-cran-[package_name]
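If you have several packages to install, the pattern above can be wrapped in a small loop. This is a sketch rather than a definitive recipe: the function names cran_to_apt and install_cran_packages are my own, and it assumes the usual Ubuntu naming convention of r-cran- followed by the package name in lower case. Not every CRAN package is available this way, so a failure is reported and the loop moves on.

```shell
# Map a CRAN package name to the matching apt package name (r-cran-<lowercase name>).
cran_to_apt() {
    printf 'r-cran-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Install each named CRAN package through apt. Run this as root (after sudo -s).
install_cran_packages() {
    for pkg in "$@"; do
        apt-get install -y "$(cran_to_apt "$pkg")" \
            || echo "could not install $pkg via apt" >&2
    done
}
```

For example, install_cran_packages ggplot2 plyr is equivalent to the first two commands above, with -y answering the confirmation prompt for you.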
You can now load the packages you’ve installed in RStudio using the library function.
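You can also confirm from the PuTTY window that R can find the new packages, without opening RStudio. The sketch below assumes the Rscript command that ships with R is on the path. The function name check_r_packages and the RSCRIPT override variable are my own additions for illustration.

```shell
# Try to load each named package in R and report the result.
# Set RSCRIPT to use a different command than the default Rscript.
check_r_packages() {
    status=0
    for pkg in "$@"; do
        # Rscript exits with a non-zero status if library() raises an error.
        if "${RSCRIPT:-Rscript}" -e "library($pkg)" >/dev/null 2>&1; then
            echo "$pkg: ok"
        else
            echo "$pkg: failed to load"
            status=1
        fi
    done
    return "$status"
}
```

For example, check_r_packages ggplot2 plyr prints one line per package and returns a non-zero status if any of them could not be loaded.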