Learning Objectives
- Describe what a R add-on package is
- Install a package from CRAN
- Install a package from Bioconductor
- Learn how to manage package conflicts
- Learn about other ways to practice R
In the last two lessons, we used an add-on package called dplyr
. R is open source, so anyone can write their own functions for particular tasks. When you want to make these functions available to other people, you can develop them into a package with extra documentation like help pages for each function and a guide on how to put all the functions together in a typical analysis. If your package meets certain requirements, you can submit it to the Comprehensive R Archive Network (CRAN), who will host it and make it available to anyone in the world. CRAN currently has over 15,000 packages! To get any one of them, use R’s built in install.packages function:
You might get asked to choose a CRAN mirror – this is basically asking you to choose a site to download the package from. The choice doesn’t matter too much. I’d recommend choosing the RStudio mirror or one that is geographically close. You might also notice that more than one package gets downloaded. install.packages()
checks to see if the package depends upon additional packages and downloads them as well.
Besides CRAN, there are other repositories for R packages. Bioconductor provides packages for analyzing genomic data. They currently have over 1,800 software packages and 900 Annotation packages. To install a package from Bioconductor, you cannot use install.packages with default options. Instead, Bioconductor submitted their own small package to CRAN with a function to download packages from either Bioconductor or CRAN.
What was that ::
in the line of code above? It is a way to call the function install
that is specifically from the BiocManager
package. You can imagine that with 13,000 + 1,500 software packages, the same simple function names get used often. It is also a way to use a function without loading its package first. Remember from the dplyr lesson, in order to use the functions we had to first load the dplyr package with the library
function. Every time you open a new R session, you have to load packages from your library to use the functions unless you use the ::
designation. The ::
can also come in handy when you have two packages loaded that share a function name. Try this:
If you already had the dplyr package loaded this session, then the first line may not produce any output in the console. But the second line should produce a lot of output, much of it similar to this:
The following object is masked from ‘package:dplyr’:
select
“Masked” means that since AnnotationDbi
was loaded second, using the function select
will typically result in using the one from AnnotationDbi
instead of dplyr
. Ask for help on select
in the following ways:
?select #R gives both options
?dplyr::select #takes you to dplyr's help page
?AnnotationDbi::select #takes you to AnnotationDbi's help page
If the order of loading packages had been reversed, then the select
function from dplyr
would mask the one from AnnotationDbi
. This package confict becomes an issue when you need more and more packages for an analysis. Package maintainers can mitigate conflicts by writing their functions in a way that tells R to check the class of object given to the function and then select the correction function.
We have covered many of the basics of R, but because R is a language, you will not become fluent unless you get lots of practice. In addition to this workshop, there are many other ways to learn R. One excellent way is using the swirl
package that we installed above. swirl
uses self-paced, interactive lessons directly in R to cover many more topics that we have time to cover in this workshop. The only thing you have to remember to use swirl is to load the package with the library
function:
Just read the output and do what it says. It will wait for you to type in a response and hit enter. It then covers some basics of how it works. There are several different courses and I suggest to start with 1: R Programming: The basics of programming in R
. Once the course installs, choose 1: R Programming
, then from the lessons start with 1: Basic Building Blocks
. Each lesson is interactive, giving you some information or asking a question and then waiting for your response. It checks to make sure your response is correct, and if not, will ask you to try again. These are great to work through on your own, at your own pace.
Another excellent way to get more R practice is to run through Software Carpentry’s Programming with R lessons. It’s focus is more on general programming, writing functions, for loops, if/else statements, writing packages and writing dynamic reports.
Data Carpentry,
2017. License. Contributing.
Questions? Feedback?
Please file
an issue on GitHub.
On
Twitter: @datacarpentry