This week falls under the topic of “dependency management”. Basically, we are setting you up to more easily get stuff done later.
You need software on your system to do your work. Some software comes with your operating system. But a lot of research software will be “third-party” and you’ll have to get it by “some method”.
One option is to install from source code. However, your ability to do this is a function of background/experience and how much access you have to your system.
A popular alternative to installing from source is to use third-party “package managers” that provide code already compiled for your system. The two leaders in this area are:
They are superficially similar, but work quite differently.
For R packages, and for report generation, RStudio is a key tool.
This one is the easiest:
We do have some possible issues:
If all else fails, you can try the cloud version in the short term.
You want the miniconda version for your system, found here.
You want the 64-bit version for your platform!
You also want Python 3.8.
To install, the steps will look something like this (on a Linux machine):
curl -L https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh > Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
Follow the instructions. Say “yes” to everything.
Once the install is done, you must restart your terminal completely.
Restarting is very important.
If you have successfully installed and restarted, you will see a (base) in your command prompt:

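If you are unsure whether the install and restart worked, a quick check from the terminal can help. This is only a sketch: `check_conda` is a made-up helper name, not something conda provides. It relies on `conda --version` and on the `CONDA_DEFAULT_ENV` variable, which conda sets when an environment is active.

```shell
# Hypothetical helper (not part of conda): report whether conda is usable
# in the current terminal session.
check_conda() {
    if command -v conda >/dev/null 2>&1; then
        # Print the installed conda version.
        conda --version
        # CONDA_DEFAULT_ENV is set by conda when an environment is active.
        echo "active environment: ${CONDA_DEFAULT_ENV:-none}"
    else
        echo "conda not found on PATH; try restarting your terminal"
        return 1
    fi
}
```

Run `check_conda` in a fresh terminal. If it reports that conda is not found, the restart step is the first thing to revisit.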
This is a bio class, so we want bio software!
conda uses “channels” to organize packages.
The bioconda channel is for bio stuff.
To use bioconda, we need to make our environment aware of it.
We follow the instructions here, which are:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
Many users fail to do this! The symptom is that “really odd stuff” happens when you try to install things and/or run them.
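If you have succeeded, your ~/.condarc file will look exactly like this. The order is the reverse of the commands above because each --add puts its channel at the top of the list, so conda-forge ends up with the highest priority: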
channels:
- conda-forge
- bioconda
- defaults
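To compare your own file against the listing above, you can run `conda config --show channels`, or read the file directly. The sketch below does the latter. `list_channels` is a hypothetical helper written for these notes, and it assumes the simple layout shown above (a `channels:` key followed by `- name` lines).

```shell
# Hypothetical helper: print the channels from a .condarc-style file,
# one per line, highest priority first.
list_channels() {
    condarc="${1:-$HOME/.condarc}"
    # Start collecting at "channels:", print each "- name" entry,
    # and stop at the first line that is not a list entry.
    awk '/^channels:/ {in_list=1; next}
         in_list && /^[ \t]*-/ {sub(/^[ \t]*-[ \t]*/, ""); print; next}
         in_list {in_list=0}' "$condarc"
}
```

Calling `list_channels` with no argument reads `~/.condarc`. For the file shown above it prints conda-forge, bioconda, and defaults, in that order. Any other order suggests the commands were run in a different sequence.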
To test, let’s install something from bioconda:
conda install bwa
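After the install finishes, it is worth confirming that the tool really is available. A minimal sketch, assuming the package provides a command with the same name as the package (true for bwa, not for every package); `verify_install` is a hypothetical name. `conda list` is a real conda command that shows the installed version and the channel it came from.

```shell
# Hypothetical helper: check that a tool is on PATH and, when conda is
# available, show which version and channel it was installed from.
verify_install() {
    tool="$1"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found on PATH"
        return 1
    fi
    echo "$tool found at $(command -v "$tool")"
    # conda list prints name, version, build, and channel for matching packages.
    if command -v conda >/dev/null 2>&1; then
        conda list "$tool" || true
    fi
}
```

For example, `verify_install bwa` should report a path inside your miniconda directory and a channel of bioconda.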
At this point, you can install stuff from any conda channel.
(You may have to add some others, though.)
Reproducibility! Some clarification is needed. Your installation will work now, but there is no guarantee that you’ll be able to install exactly the same versions of everything later. However, if you could manually install everything from source, you could completely recreate your environment.
The problem is that conda packages update continuously.
This continuous updating means that you are very likely to run into dependency requirements that are impossible to satisfy.
You also cannot use conda to replicate a computing environment on a different operating system.
There is no guarantee that all the same versions of all the same things exist for all platforms at any given time.
The answer is “containers” – Docker, Singularity, etc. This is “beyond the scope” of this class.