Tutorial: GenPipes in the Cloud (GCP)

v6.x Support for Cloud

GenPipes support for GCP / Cloud has not yet been verified or released for version 6.x. The following tutorial works for GenPipes v5.x only.

GenPipes bioinformatics pipelines are developed as part of the GenAP project at the Canadian Centre for Computational Genomics (C3G).

This tutorial shows you how to run GenPipes in Google Cloud (GCP). It uses the “Try GCP for free” account to run a pipeline in the cloud.

Prerequisites 

Create an account on GCP. Learn more….
Get acquainted with using the Google Cloud Shell. Learn more….
Create a new project. Learn how to create a cloud project….

Step 1: Deploy GenPipes in GCP cloud 

Set up GenPipes in your cloud server instance. Run the following commands in the Google Cloud Shell:

user@machine:~$ git clone https://bitbucket.org/mugqic/cloud_deplyoment.git

user@machine:~$ cd cloud_deplyoment/gcp/

user@machine:~$ gcloud deployment-manager deployments create slurm --config slurm-cluster.yaml

For more details on how to set up GenPipes in the cloud, see GenPipes Cloud Deployment Guide.

Cloud billing

Please note that from this point on, your GenPipes cloud deployment is running and Google is billing your account. Remember to shut down the cloud server cluster when the analysis is done to avoid unintended charges.

Step 2: Verify Slurm Deployment 

Once the gcloud command finishes, a configuration script starts installing Slurm on the cluster running in the cloud. You can monitor the installation after you run the next command.

Use the Google Cloud Shell to log in to the login node of the Slurm cluster:

user@machine:~$ gcloud compute ssh login1 --zone=northamerica-northeast1-a

You are now on your cloud deployment login node.

The installation may still be running. If it is, you will see the following message when you log in:

Slurm is currently being installed/configured in the background.

A terminal broadcast will announce when installation and configuration is
complete.

Wait for the terminal broadcast. It can take up to 10 minutes.
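If you would rather not watch the terminal, the wait can be scripted. The following is a minimal sketch, not part of GenPipes: the function name wait_for_slurm and its arguments are hypothetical, and it assumes that a successful sinfo call means the Slurm controller is answering. The terminal broadcast remains the authoritative signal that installation is complete.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a probe command until it succeeds or a timeout passes.
# Arguments: timeout in seconds (default 900), polling interval in seconds
# (default 30), probe command (default "sinfo", assumed to fail while the
# Slurm controller is not yet reachable).
wait_for_slurm() {
    local timeout="${1:-900}"
    local interval="${2:-30}"
    local probe="${3:-sinfo}"
    local waited=0

    # $probe is intentionally unquoted so that a probe with arguments works.
    until $probe >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Slurm did not respond within ${timeout}s" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "Slurm is responding"
}

# Example (run on the login node): wait up to 15 minutes, checking every 30 s.
#   wait_for_slurm 900 30
```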

Step 3: Run GenPipes Pipeline 

In this tutorial, we will run the chipseq pipeline in the cloud.

First, create a test folder as shown below:

user@machine:~$ mkdir -p chipseq_test
user@machine:~$ cd chipseq_test

Then, download the Chip Sequencing Test Dataset and extract it:

user@machine:~$ wget https://m-f39e09.071823.8540.data.globus.org/genpipes-test-datasets/chipseq.chr19.new.tar.gz
user@machine:~$ tar -xzf chipseq.chr19.new.tar.gz
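The launch command later in this step refers to a readset file and a design file by name, so it can help to see which such files the dataset actually provides. The sketch below is only an illustration: the function name list_pipeline_inputs is hypothetical, and the file name patterns (readset*.tsv, design*.txt) are assumptions based on the file names used in this tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list candidate readset and design files under a directory.
# The name patterns are assumptions taken from the file names in this tutorial.
list_pipeline_inputs() {
    local dir="${1:-.}"
    find "$dir" -type f \( -name 'readset*.tsv' -o -name 'design*.txt' \) | LC_ALL=C sort
}

# Example: search the current test folder and everything unpacked below it.
#   list_pipeline_inputs .
```

If the files turn up in a subdirectory, or under different names, adjust the paths passed to -r and -d accordingly.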

Next, download the chipseq configuration file for use in the cloud:

user@machine:~$ wget https://bitbucket.org/mugqic/cloud_deplyoment/raw/master/quick_start.ini

Then construct the chipseq pipeline launch command:

user@machine:~$ genpipes chipseq -c $MUGQIC_PIPELINES_HOME/pipelines/chipseq/chipseq.base.ini \
                    $MUGQIC_PIPELINES_HOME/pipelines/common_ini/rorqual.ini \
                    quick_start.ini \
                -j slurm \
                -r readsets.chipseqTest.chr22.tsv \
                -d designfile_chipseq.chr22.txt \
                -s 1-18 \
                -g chipseqScript.sh
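The command above reads several local files, so a quick pre-flight check can catch a missing file early. This is a minimal sketch, assuming the three local file names used in the command; require_files is a hypothetical helper, not a GenPipes command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every argument that is not an existing regular file.
# Returns 0 when all files exist, 1 otherwise.
require_files() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example, using the file names from the launch command in this tutorial:
#   require_files quick_start.ini \
#                 readsets.chipseqTest.chr22.tsv \
#                 designfile_chipseq.chr22.txt \
#       && echo "all inputs present"
```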

Finally, launch the pipeline using the command:

user@machine:~$ bash chipseqScript.sh

Step 4: Monitor Pipeline Status 

Use the squeue command to monitor the GenPipes analysis run through the Slurm scheduler. For details on how to monitor scheduler jobs, refer to the job monitoring step in the tutorial GenPipes on DRAC.

For more details on viewing log files and generating reports, refer to the section Monitor Job Status in the Tutorial: GenPipes on DRAC servers.
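For a compact view of progress, the job states reported by squeue can be tallied. The sketch below assumes squeue's -h (no header) and -o '%T' (job state) options; summarize_states is a hypothetical helper that only counts the state names it receives on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how many times each job state appears on stdin.
# Input: one state per line (blank lines are ignored).
# Output: "STATE COUNT" lines, sorted by state name.
summarize_states() {
    awk 'NF { count[$1]++ } END { for (s in count) print s, count[s] }' | LC_ALL=C sort
}

# Example (run on the login node):
#   squeue -u "$USER" -h -o '%T' | summarize_states
```

No output means squeue listed no jobs for your user, which is expected once the jobs have left the queue.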

Note

Shut down your GenPipes Cloud setup once you are done to ensure you are not billed for unintentional cloud usage.

After the jobs have run, you can exit the login node:

user@machine:~$ exit

You are now back on your cloud shell administrative machine. You can now shut down your GenPipes cloud cluster:

user@machine:~$ gcloud deployment-manager deployments delete slurm

Once this command completes, you are no longer billed for the cluster.
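To confirm that the deployment is really gone, you can check the list of remaining deployments. This is a sketch under stated assumptions: deployment_listed is a hypothetical helper, and it expects one deployment name per line on standard input, for example from gcloud deployment-manager deployments list with --format='value(name)'.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given name appears as a whole line on stdin.
deployment_listed() {
    grep -Fxq -- "$1"
}

# Example (run in the cloud shell after deleting the deployment):
#   if gcloud deployment-manager deployments list --format='value(name)' \
#           | deployment_listed slurm; then
#       echo "deployment 'slurm' still exists"
#   else
#       echo "deployment 'slurm' is gone"
#   fi
```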

Note

You need to enable the “deployment manager” API on your project. See this page. You also need to make sure that billing is enabled (even for a free trial). For more detailed information, check out our Bitbucket repo.
