Using R inside of Anaconda

Whilst most Apocrita users will want to use the R module or RStudio via OnDemand for R workflows, it is also possible to use R inside of Anaconda.

You may wish to do this when you want to have R packages available at the same time as other packages you can install via Conda, or when your workflow requires the use of both Python and R at the same time.

Setting up your environment and library

To run R inside of Anaconda, first you will need to create and activate an Anaconda environment according to our documentation. Once you are inside your activated environment, you can then install any R packages required.

When running R inside Anaconda, it is best to install any additional R packages you require using the Conda package manager:

https://docs.anaconda.com/free/working-with-conda/packages/using-r-language/

There are over 6,000 commonly used R packages for data science available from Anaconda, as well as many others in the Bioconda and Conda Forge channels. You can use the search facility at anaconda.org to search for packages.

Use Mamba instead of Conda

The official documentation for many Conda packages often tells you to use conda for commands such as conda install. We recommend using mamba instead as it is much faster. See this blog post for further information.

CRAN packages

A large number of CRAN packages are available to install using Anaconda. You will need to add r- before the regular package name and write the name in lower case. For instance, if you want to install Seurat, you will need to use mamba install r-seurat or for rJava, type mamba install r-rjava.
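
The naming rule can be sketched as a small shell helper. This is only an illustration: cran_to_conda is a hypothetical function name, not part of Conda or Mamba, and it assumes the Conda package name is simply the lower-cased CRAN name with r- in front. Always confirm that the resulting name exists on anaconda.org before installing.

```shell
# cran_to_conda is a hypothetical helper, not part of Conda or Mamba.
# Assumption: the Conda name is "r-" followed by the lower-cased CRAN name.
cran_to_conda() {
    printf 'r-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# The two examples from the text above
cran_to_conda Seurat
cran_to_conda rJava
```

The printed name can then be passed to mamba install, for example mamba install r-seurat.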

Pay attention to the output for the proposed installation versions of your package and its dependencies. You might find that the default version you are offered is too old and that another channel offers a newer version. For example, if you search for the r-seurat package on anaconda.org, you will see that it is available from multiple channels.

To select a specific channel, add it to your installation command. For instance, if you wanted to install Seurat from Conda Forge:

mamba install -c conda-forge r-seurat

When you specify a channel in this way, Conda may complain that some dependencies can't be fulfilled. You can specify multiple channels in your installation command, and they will be used in the order specified:

mamba install -c bioconda -c conda-forge <package name>

The above command would install your package from the Bioconda channel, and if any required dependencies aren't found in Bioconda, then the installation process will search Conda Forge for them as well. You may also find that adding the Conda Forge channel is preferable as it gives you newer versions of some dependencies.
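
To make the channel ordering concrete, the sketch below assembles an installation command from a package name followed by channels in priority order, and prints it rather than running it. build_install_cmd is a hypothetical helper written for this page, not a Mamba feature. The package and channels are taken from the Bioconductor example further down.

```shell
# build_install_cmd is a hypothetical helper: it only prints the command,
# so you can review the channel order before running anything.
# Usage: build_install_cmd <package name> <channel> [<channel> ...]
build_install_cmd() {
    pkg="$1"
    shift
    cmd="mamba install"
    # Channels are appended in the order given, so the first has priority
    for channel in "$@"; do
        cmd="$cmd -c $channel"
    done
    printf '%s %s\n' "$cmd" "$pkg"
}

build_install_cmd bioconductor-hibag bioconda conda-forge
```

Swapping the two channel arguments swaps the order of the -c options, and therefore which channel is searched first.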

Bioconductor packages

There are also a lot of Bioconductor packages available to install using Anaconda. Most of these are available from the Bioconda channel; you will usually need to add bioconductor- before the regular package name.

Again, depending on what packages are already in your library, you may find you need to specify additional channel(s) to fulfil all required dependencies. For example, to install HIBAG:

mamba install -c bioconda -c conda-forge bioconductor-hibag

This will install HIBAG from the Bioconda channel, and use Conda Forge to fulfil any missing dependencies that aren't available from Bioconda.

Beware install.packages

Whilst it may be tempting to install packages within the R shell running inside Anaconda using install.packages, as you might do when running R using our module or RStudio OnDemand, it is best to avoid this. Doing so might install those packages into your personal R library (default path ~/R/x86_64-pc-linux-gnu-library/<R version>), which may already contain packages that have been compiled and installed using the R module or RStudio.

The issue here is that the compilation environment inside your Conda environment is likely to be markedly different from that of the R module or RStudio, particularly if you have loaded additional modules and compiled and installed packages in your personal library. There will be different versions of key packages and libraries such as GCC, and all sorts of issues can arise.

Wherever possible, it's best to install R packages in your Conda environment using only the mamba install methods detailed above, so that they are compiled consistently using the same environment and don't get mixed up with any packages in your personal R library. Using install.packages inside Conda also runs the risk that some packages might be compiled using a mix of dependencies from both inside and outside of the Conda environment, which can lead to the kind of issues described above. Using mamba install should also be markedly faster than install.packages.

If something is missing from all Anaconda channels, then you can proceed to use installation methods such as install.packages. Just be sure to check your personal library path carefully first:

> .libPaths()
[1] "/path/to/anaconda/env/lib/R/library"

This path should point to a directory inside your Conda environment. If it doesn't, then make sure that it does before installing any additional packages by running the following:

> .libPaths("/path/to/anaconda/env/lib/R/library")

An additional check is to run library(); the output should start with something similar to:

Packages in library ‘/path/to/anaconda/env/lib/R/library’:

The only path listed should be inside your Conda environment; if there are any other library paths listed then tread very carefully.
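
The same check can be sketched from the shell before starting R. The helper below is a minimal illustration, not a site-provided tool: path_in_env is a hypothetical function name, and the paths are the placeholders used on this page. It reports whether a library path lies inside a given environment directory.

```shell
# path_in_env is a hypothetical helper: it succeeds when the path in $1
# lies inside the directory given in $2.
path_in_env() {
    prefix="${2%/}"   # tolerate a trailing slash on the prefix
    case "$1" in
        "$prefix"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder paths for illustration only. In a real session, with the
# environment activated, $CONDA_PREFIX holds the environment directory and
#   Rscript -e 'writeLines(.libPaths())'
# prints one library path per line.
env_prefix="/path/to/anaconda/env"
for lib in "/path/to/anaconda/env/lib/R/library" \
           "$HOME/R/x86_64-pc-linux-gnu-library/<R version>"; do
    if path_in_env "$lib" "$env_prefix"; then
        echo "inside the environment:  $lib"
    else
        echo "OUTSIDE the environment: $lib"
    fi
done
```

Any path reported as outside the environment is a sign that install.packages could write to, or load from, your personal library.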
