The first time you perform RNA-Seq, you will need to download a slew
of programs, packages, and files before you can run the analysis.
This short walkthrough will help you prepare your digital workspace in a
streamlined manner. For example, you can find all of the Bioconductor
install.packages() commands here, rather than searching for every single
one. This walkthrough will take you through setting up a terminal,
Anaconda and Bioconda, RStudio, Java, FastQC, TrimGalore, kallisto, and
the R packages used in the analysis.
Mac: macOS is already Unix-based, so you just need to open a terminal:
press command+space, type Terminal, then press enter.
Windows: You will need to install WSL (Windows Subsystem for Linux) and Ubuntu, or your Linux distribution of choice (we recommend Ubuntu for Windows). Follow the instructions below or follow along with this YouTube video. The quickest route is to run this in PowerShell or Command Prompt opened as administrator:
wsl --install
Alternatively:
1) Search for features and click on Turn Windows features on or off.
2) Find Windows Subsystem for Linux and check the box. Click OK.
3) Search for microsoft store and click the Microsoft Store.
4) In the store, type U and you should see Ubuntu. Install it.
5) Once it has finished installing, type bash to run Linux on WSL.
To install Anaconda, Windows/Linux users should follow the instructions here: (https://gist.github.com/kauffmanes/5e74916617f9993bc3479f401dfec7da). Make certain that you are installing the LINUX version, and install it from inside WSL.
Otherwise, download Anaconda here and install it.
This will install:
1) Anaconda
2) Python
3) And you can also choose to install Jupyter Notebooks/Spyder IDE,
which is recommended.
You can check your version of Python in bash/Linux by typing python.
This also opens the Python interpreter on the command line. Exit by
typing exit().
Start WSL in a new command window by typing:
wsl
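If it is unclear whether a given prompt is really running inside WSL, one quick check is to look for the word "microsoft" in the kernel version string. This is a minimal sketch: it assumes /proc/version mentions Microsoft on WSL, which is the usual behaviour but not a guarantee, and the function name is_wsl is made up for illustration.

```shell
# Return success if the kernel version string mentions Microsoft (i.e. WSL).
# The optional argument points at a different file; it defaults to /proc/version.
is_wsl() {
    version_file="${1:-/proc/version}"
    [ -r "$version_file" ] && grep -qi "microsoft" "$version_file"
}

if is_wsl; then
    echo "Running inside WSL"
else
    echo "Not running inside WSL"
fi
```

On a Mac or a native Linux machine the check simply reports that WSL is not in use.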
Set up Bioconda by typing in bash:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
Then update all installed packages:
conda update --all
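The order of the three conda config commands matters: each --add places the new channel at the top of the list, so the channel added last (conda-forge) ends up with the highest priority. The sketch below wraps the same commands in a function and then prints the resulting channel list so the order can be inspected. The function name setup_bioconda_channels is hypothetical, and the sketch assumes conda is already on your PATH.

```shell
# Add the channels in the same order as above, then print the resulting list.
# Each --add puts the channel at the top, so the last one added has top priority.
setup_bioconda_channels() {
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    for channel in defaults bioconda conda-forge; do
        conda config --add channels "$channel" || return 1
    done
    conda config --show channels
}
```

Calling setup_bioconda_channels in a shell where Anaconda is installed should list conda-forge first, then bioconda, then defaults.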
If you need more detail, the instructions on the Bioconda website are here.
Download RStudio and install it.
If you are having trouble installing packages into the correct library, you can run RStudio as an administrator on Windows by right-clicking the RStudio icon and choosing to run as administrator.
Some programs, like FastQC, depend on Java. To check whether you have Java,
type java -version in bash or on the command line. Install Java here. Once
Java is installed, check that it installed correctly by typing
java -version again.
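One detail that can be confusing when scripting this check: java -version writes its output to stderr rather than stdout, so it has to be redirected before it can be captured or piped. A minimal sketch, with the hypothetical function name java_version_line:

```shell
# Print the first line of `java -version`, or a message if Java is missing.
# `java -version` writes to stderr, so stderr is redirected into the pipe.
java_version_line() {
    if command -v java >/dev/null 2>&1; then
        java -version 2>&1 | head -n 1
    else
        echo "java not found"
    fi
}

java_version_line
```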
Download FastQC here and install it by unzipping the downloaded file. On Windows, you can unzip files by right-clicking the file and opening it with Windows Explorer. Move the application folder out of the archive to a new location, and it is now unzipped.
Open the FastQC GUI by following the instructions from the website:
Windows: Simply double click on the run_fastqc bat file. If you want to make a pretty shortcut then we’ve included an icon file in the top level directory so you don’t have to use the generic bat file icon.
MacOSX: Double click on the FastQC application icon.
Notice that, to the left of your cursor, the line begins with (base) or something like it. This tells you which environment you are in. It is good practice to make a new environment for each workflow you run on your computer, so that the packages you download are specific to that environment and don’t contaminate the global environment, for example when a package version is not compatible with your version of Python. To create a new conda environment, run:
conda create --name RNA
Every time you open bash in a new session to perform this RNA-Seq analysis workflow, you’ll want to activate this environment so that you can run the packages you download here:
conda activate RNA
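If you rerun the setup, the create step can be skipped when the environment already exists. The sketch below is one way to do that, assuming conda env list prints the environment name in the first column of each line. The function names conda_env_exists and ensure_conda_env are made up for this example.

```shell
# Succeed if a conda environment with the given name is already listed.
# Reads `conda env list` and compares the first column of each line.
conda_env_exists() {
    conda env list | awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Create the environment only when it is missing.
ensure_conda_env() {
    if conda_env_exists "$1"; then
        echo "environment $1 already exists"
    else
        conda create --yes --name "$1"
    fi
}
```

With these defined, ensure_conda_env RNA followed by conda activate RNA gives the same result as the two commands above, without an error on the second run.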
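TrimGalore depends on two packages which you’ll need to install first: cutadapt and FastQC. You can install FastQC again, this time on the Linux side, so that you can optionally run FastQC following TrimGalore. Note that the apt command below installs FastQC system-wide in Ubuntu rather than into the conda environment; conda install -c bioconda fastqc is the conda equivalent.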
conda install -c bioconda cutadapt
sudo apt install fastqc
Sometimes FastQC does not install unless you update Anaconda and all packages. If you cannot install FastQC, try updating all your programs and packages first.
Check the versions to make sure they installed correctly:
cutadapt --version
fastqc -v
Install TrimGalore. The Bioconda package is named trim-galore, so try:
conda install -c bioconda trim-galore
If this doesn’t work, you can download the tarball, extract it, and run TrimGalore yourself.
curl -fsSL https://github.com/FelixKrueger/TrimGalore/archive/0.6.5.tar.gz -o trim_galore.tar.gz
tar xvzf trim_galore.tar.gz
You can extract tarballs in bash using administrator permissions (if needed) by running:
sudo tar -xvzf /mnt/c/PATH/TO/TAR-FILE/Desktop/FILE-NAME.tar.gz -C /mnt/c/PATH/TO/DESTINATION/FOLDER
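The -C flag tells tar which folder to extract into, and that folder must already exist. The sketch below wraps the same idea in a small function that creates the destination first. The name extract_to and its two arguments are illustrative and not part of any tool.

```shell
# Extract a gzipped tarball into a destination folder, creating it if needed.
extract_to() {
    archive="$1"
    dest="$2"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

For example, extract_to trim_galore.tar.gz "$HOME/tools" would unpack the TrimGalore download into a tools folder in your home directory (the tools folder name is only an example).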
Make sure you are in the correct conda environment by running
conda env list. The environment with the * next to it is the current
environment. Change environments by typing conda activate environmentname.
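The same check can be scripted. When an environment is activated, conda exports its name in the CONDA_DEFAULT_ENV variable, so a script can refuse to continue if the wrong environment is active. This is a minimal sketch, and the function name require_env is hypothetical.

```shell
# Succeed only when the active conda environment matches the expected name.
# conda exports CONDA_DEFAULT_ENV when an environment is activated.
require_env() {
    expected="$1"
    active="${CONDA_DEFAULT_ENV:-none}"
    if [ "$active" != "$expected" ]; then
        echo "active environment is '$active', expected '$expected'" >&2
        return 1
    fi
}
```

A line such as require_env RNA && conda install kallisto then only installs when the RNA environment is active.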
conda install kallisto
kallisto
You’ll be prompted to install a lot of additional libraries during
this process. Make sure you install all of them. At the end, it is
good practice to update all the packages using conda update --all.
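Once the command-line tools are installed, a quick way to confirm that they are all reachable from the current environment is to test each one with command -v. The sketch below does that for any list of command names. check_tools is a made-up helper, and the tool names in the usage note are the executables this walkthrough installs.

```shell
# Print one line per tool saying whether it is on PATH; return 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok       $tool"
        else
            echo "MISSING  $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it as check_tools java fastqc cutadapt trim_galore kallisto after activating the RNA environment. Anything reported as MISSING needs to be installed or added to your PATH.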
Open RStudio. Update R. For Mac, download the latest update, then open RStudio and it will automatically install. On Windows:
install.packages("installr")
library(installr)
updateR()
Next, install all of these packages:
## BiocManager only needs to be installed once
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
## Bioconductor 3.11 pairs with R 4.0; use the release that matches your R version
BiocManager::install(version = "3.11")
install.packages("devtools")
install.packages("tidyverse")
BiocManager::install("tximport")
BiocManager::install("tximportData")
BiocManager::install("rhdf5")
BiocManager::install("GenomicFeatures")
install.packages("pacman")
## or install the source package from
## https://cran.r-project.org/web/packages/pacman/index.html
## and install in R with
## install.packages(path_to_file, repos = NULL, type="source")
BiocManager::install("DESeq2")
This should download and install all of the packages you will need. If a package is missing, please email aaron.mitchell.dick@duke.edu and I will include it here.
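To confirm from the command line that the R packages actually installed, one option is to ask Rscript whether each package can be loaded. This sketch assumes that Rscript is on your PATH and belongs to the same R installation that RStudio uses. That may not hold on Windows, where RStudio runs outside WSL. The function names r_pkg_installed and report_r_pkgs are hypothetical.

```shell
# Succeed if the named R package can be loaded by the R that Rscript runs.
r_pkg_installed() {
    Rscript -e 'pkg <- commandArgs(trailingOnly = TRUE)[1]; quit(save = "no", status = if (requireNamespace(pkg, quietly = TRUE)) 0L else 1L)' "$1" >/dev/null 2>&1
}

# Report the status of each package passed as an argument.
report_r_pkgs() {
    for pkg in "$@"; do
        if r_pkg_installed "$pkg"; then
            echo "installed  $pkg"
        else
            echo "missing    $pkg"
        fi
    done
}
```

For example, report_r_pkgs tximport rhdf5 GenomicFeatures DESeq2 tidyverse prints one line per package. If Rscript itself is not found, every package is reported as missing.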
You should now be ready to run Part 1 and Part 2 of the RNA-Seq Walkthrough.