A Framework for Bioconductor and R-based Applications on the Cloud


Requirements

The following are the requirements for installing RBioCloud:

  1. Any Linux OS

    We tested our installation using Ubuntu.

  2. Python

    RBioCloud is packaged for Python 2.7 (RBIOCLOUD-1.0-py2.7.egg). If Python is not installed, you can install it with the OS package manager. For example, in Ubuntu use:
    # sudo apt-get -y install python
    Advanced users can instead download Python from python.org.

  3. Virtual Python Environment (virtualenv) Builder

    If virtualenv is not installed, you can install it with the OS package manager. For example, in Ubuntu use:
    # sudo apt-get -y install python-virtualenv
    Advanced users can instead download virtualenv from the Python Package Index (PyPI).

  4. A User Account for Amazon Web Services

    If you do not have an Amazon Web Services account, you can sign up at aws.amazon.com.
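
The tool requirements above can be checked before installation. The following is a minimal sketch and not part of RBioCloud: the helper names find_executable and missing_requirements are hypothetical, and it assumes the tools are installed under the executable names python and virtualenv. It only searches the PATH; it does not verify versions or the AWS account.

```python
import os
import sys


def find_executable(name, path=None):
    """Return the full path of `name` if it is an executable on PATH, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_requirements(tools=("python", "virtualenv"), path=None):
    """Return the tools from `tools` that were not found on PATH."""
    return [tool for tool in tools if find_executable(tool, path) is None]


if __name__ == "__main__":
    # RBioCloud lists "Any Linux OS" as its first requirement.
    if not sys.platform.startswith("linux"):
        print("Warning: a Linux OS is required; found " + sys.platform)
    missing = missing_requirements()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("python and virtualenv were found on PATH")
```

If a tool is reported as missing, install it with the apt-get commands given above and run the check again.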

Installing RBioCloud

Please follow the steps below to install RBioCloud:

  1. Create a virtual environment

    A virtual environment named ve1 can be created using:
    # virtualenv ve1

  2. Activate the virtual environment

    The virtual environment ve1 can be activated using:
    # source ve1/bin/activate

  3. Install RBioCloud

    RBioCloud must be installed into the virtual environment created and activated in the steps above, using:
    # easy_install /download-directory/RBIOCLOUD-1.0-py2.7.egg
    Here /download-directory is the path of the directory to which the RBioCloud package was downloaded. Run the command without sudo: sudo typically resets the PATH, so the package would be installed into the system Python instead of ve1.

RBioCloud Commands

The following commands are available within the virtual environment ve1 after installing RBioCloud:

  i. RBC_Configure - Configure RBioCloud after installation
  ii. RBC_GatherResource - Create resources, both instances and clusters, on the Amazon cloud
  iii. RBC_SubmitJob - Submit a job (a directory comprising R scripts and data) for execution on the cloud resource
  iv. RBC_ExecuteJob - Execute a job on the cloud resource
  v. RBC_GetResults - Retrieve results from the cloud resource after executing a job
  vi. RBC_TerminateResource - Release resources gathered on the cloud
  vii. RBC_ListResources - List all instances and clusters created
  viii. RBC_LoginToResource - Log in remotely to instances or cluster instances
  ix. RBC_RemoveResource - Clean up configuration files

The usage of a command can be obtained by using:
# <RBioCloud command> -h
The documentation of all commands can be obtained here.
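
To review the usage of every command in one pass, the -h call above can be looped over the command names. This is a sketch that assumes the commands are on the PATH of the activated virtual environment; RBC_COMMANDS, help_command and show_all_help are illustrative names and not part of RBioCloud.

```python
import subprocess

# Command names as listed above.
RBC_COMMANDS = [
    "RBC_Configure",
    "RBC_GatherResource",
    "RBC_SubmitJob",
    "RBC_ExecuteJob",
    "RBC_GetResults",
    "RBC_TerminateResource",
    "RBC_ListResources",
    "RBC_LoginToResource",
    "RBC_RemoveResource",
]


def help_command(name):
    """Build the argument list for `<RBioCloud command> -h`."""
    return [name, "-h"]


def show_all_help(commands=RBC_COMMANDS):
    """Run `<command> -h` for each command.

    Returns the names of commands that could not be started, for example
    because the virtual environment is not activated.
    """
    unavailable = []
    for name in commands:
        try:
            subprocess.call(help_command(name))
        except OSError:
            unavailable.append(name)
    return unavailable
```

Calling show_all_help() inside the activated ve1 environment prints each command's usage text. A non-empty return value indicates commands that were not found.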

Configuring RBioCloud

After installing RBioCloud, the framework needs to be configured using:
# RBC_Configure
In the home directory of the user, locate RBC/rbc-1.0/etc/RBCconfig.py. RBCconfig.py is a configuration file containing parameters that the user can edit. The parameters that need to be edited are as follows:

  i. RBC_CLUSTER_DEFAULT_NAME = <name> - Default name for configuring a cluster
  ii. RBC_INSTANCE_DEFAULT_NAME = <name> - Default name for configuring an instance
  iii. MASTER_IMAGE_ID = <image-id> - Amazon Machine Image (AMI) ID for launching clusters and instances. To use Bioconductor packages, the Bioconductor Cloud AMI is required. The Bioconductor Cloud AMI IDs are listed on the Bioconductor website.
  iv. INSTANCE_TYPE = <instance-type> - Amazon EC2 instance type
  v. DEFAULT_CLUSTER_SIZE = <cluster-size> - Default size of a cluster, used when the user does not provide a size through -rsize while creating a cluster
  vi. VOLUME_SNAP = <snapshot-id> - Default snapshot ID used to create an Amazon Elastic Block Store (EBS) volume
  vii. AWS_ACCESS_KEY_ID = <aws-user-access-key> - The user's AWS access key, available from Security Credentials in the user's AWS account
  viii. AWS_SECRET_KEY_ID = <aws-user-secret-key> - The user's AWS secret key, available from Security Credentials in the user's AWS account
  ix. SECURITY_GROUPS = <security-group-name> - Security group the user wants for resources. Open port 22 (SSH) in the group's rules when the security group is created.
  x. KEY_NAME = <aws-keypair-name> - Name of the Amazon key pair file (.pem format) generated using the AWS Management Console
  xi. KEY_LOCATION = <aws-keypair-location> - Location of the Amazon key pair file on the user's system. The preferred location for the key file is RBC/rbc-1.0/etc/.
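
As an illustration, the edited parameters in RBCconfig.py might look like the following. Every value is a placeholder invented for this sketch. The value types (strings, an integer cluster size, a full file path for KEY_LOCATION) are also assumptions. Check the generated RBCconfig.py for the expected form and substitute your own details.

```python
# Illustrative excerpt of RBC/rbc-1.0/etc/RBCconfig.py.
# All values below are hypothetical placeholders, not working settings.

RBC_CLUSTER_DEFAULT_NAME = "mycluster"      # assumed example name
RBC_INSTANCE_DEFAULT_NAME = "myinstance"    # assumed example name

MASTER_IMAGE_ID = "<image-id>"              # e.g. a Bioconductor Cloud AMI ID
INSTANCE_TYPE = "m1.large"                  # example type; pick one suited to your jobs
DEFAULT_CLUSTER_SIZE = 2                    # used when -rsize is not given (assumed integer)
VOLUME_SNAP = "<snapshot-id>"               # snapshot used to create the EBS volume

AWS_ACCESS_KEY_ID = "<aws-user-access-key>"
AWS_SECRET_KEY_ID = "<aws-user-secret-key>"

SECURITY_GROUPS = "my-security-group"       # group with port 22 open (assumed single name)
KEY_NAME = "mykeypair"                      # assumed example key pair name
# Assumed to be the full path of the .pem file; it may instead be a directory.
KEY_LOCATION = "/home/<user>/RBC/rbc-1.0/etc/mykeypair.pem"
```

Because the file holds the AWS secret key, it is sensible to keep it readable only by your own user account.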

Thank you for downloading, installing and configuring RBioCloud. We recommend following the simple tutorial we have provided to confirm that installation and configuration were successful.