R on ARC: Difference between revisions
Revision as of 18:17, 25 May 2020
General
Text mode interactive shell
When you start R in the usual way, you enter the interactive R shell, where you can type commands and get the results back. For example:
    $ module load R/3.6.2
    $ R

    R version 3.6.2 (2019-12-12) -- "Dark and Stormy Night"
    Copyright (C) 2019 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)
    ....
    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > Sys.info()
                                  sysname                               release
                                  "Linux"              "3.10.0-1127.el7.x86_64"
                                  version                              nodename
    "#1 SMP Tue Mar 31 23:36:51 UTC 2020"                                 "arc"
                                  machine                                 login
                                 "x86_64"                            "drozmano"
                                     user                        effective_user
                               "drozmano"                            "drozmano"
    > quit()
    $
Running R scripts from the command line
In order to run R scripts or programs on ARC as jobs, you have to record the commands in a text file, for example test.R, and run it as a script non-interactively.
test.R:
    cwd = getwd()
    cat(" Current Directory: ", cwd, "\n")

    t = Sys.time()
    cat(" Current time: ", format(t), "\n")

    u = Sys.info()["user"]
    cat(" User name: ", u, "\n")
There are three ways to run an R script.
From standard input
An R script can be sent to the standard input of the R interactive shell. This is similar to typing the commands in R:
    $ R --no-save < test.R

    R version 3.6.2 (2019-12-12) -- "Dark and Stormy Night"
    Copyright (C) 2019 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > cwd = getwd()
    > cat(" Current Directory: ", cwd, "\n")
    Current Directory: /global/software/src/r/tests
    >
    > t = Sys.time()
    > cat(" Current time: ", format(t), "\n")
    Current time: 2020-05-07 15:16:12
    >
    > u = Sys.info()["user"]
    > cat(" User name: ", u, "\n")
    User name: drozmano
    >
    >
After executing all the commands from the script, R terminates. Note that both the commands and the printed output are shown.
Using the CMD BATCH command
An R script can be passed as an argument to the "R CMD BATCH" command. The output does not go to the screen, but is saved to the .Rout file:
    $ R CMD BATCH test.R
    $ ls -l
    -rw-r--r-- 1 drozmano drozmano  176 May  7 15:03 test.R
    -rw-r--r-- 1 drozmano drozmano 1121 May  7 15:19 test.Rout
To see the output use the cat or less commands:
    $ cat test.Rout

    R version 3.6.2 (2019-12-12) -- "Dark and Stormy Night"
    Copyright (C) 2019 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > cwd = getwd()
    > cat(" Current Directory: ", cwd, "\n")
    Current Directory: /global/software/src/r/tests
    >
    > t = Sys.time()
    > cat(" Current time: ", format(t), "\n")
    Current time: 2020-05-07 15:19:07
    >
    > u = Sys.info()["user"]
    > cat(" User name: ", u, "\n")
    User name: drozmano
    >
    >
    > proc.time()
       user  system elapsed
      0.219   0.079   0.369
The output is very similar to that of the first method, but it contains some additional timing information. Again, both the commands and the output are shown.
Using the Rscript version of R
Probably the best way to run an R script non-interactively is to use Rscript, a special non-interactive version of R:
    $ Rscript test.R
    Current Directory: /global/software/src/r/tests
    Current time: 2020-05-07 15:22:17
    User name: drozmano
In this case R does not print any extra information: only explicitly printed values are shown in the output, and the commands themselves are not printed.
Using R on ARC
Like other calculations on ARC systems, R scripts and programs are run by submitting an appropriate script for batch scheduling using the sbatch command. For more information about submitting jobs, see the Running jobs article.
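As a minimal sketch of such a batch script, the commands below write a job file that loads the R/3.6.2 module and runs test.R with Rscript, then submit it if sbatch is available. The file name r_job.slurm, the job name, and the resource requests (1 CPU, 1 GB of memory, 10 minutes) are illustrative assumptions, not ARC recommendations; adjust them to your own calculation.

```shell
# Write a minimal Slurm job script.
# The file name and all resource values below are assumptions for illustration.
cat > r_job.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=r_test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:10:00

# Load the same R version that is used in the interactive examples.
module load R/3.6.2

# Run the script non-interactively; only explicitly printed values are shown.
Rscript test.R
EOF

# Submit the job only if the Slurm client is available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    sbatch r_job.slurm
else
    echo "sbatch not found; r_job.slurm was written but not submitted"
fi
```

Unless an output file is requested explicitly, Slurm writes the job's output to a file named slurm-<jobid>.out in the directory the job was submitted from.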
R modules
Currently there are several software modules on ARC that provide different versions of R. The versions differ in the release date.
You can see them using the module command:
    $ module avail R

    ----------- /global/software/Modules/3.2.10/modulefiles ---------
    R/3.5.3
    R/3.6.2
In addition,
- Module biobuilds/2017.11 provides R v.3.4.2.
- Module bioconda/2018.11 provides R v.3.4.1.
These modules are designed with bioinformatics applications in mind and have a number of specialized R packages preinstalled.
Installed R packages
When installing a new R version, the following packages are typically installed at the same time.
    arules purrr xaringan glue covr lintr reprex reticulate utf8 promises
    devtools cluster dbscan epiR epitools glasso Hmisc irr mi RSQLite
    foreign openxlsx dplyr tidyr stringr stringi lubridate ggplot2 ggvis rgl
    htmlwidgets googleVis car lme4 nlme mgcv randomForest multcomp vcd glmnet
    survival caret shiny rmarkdown xtable sp maptools maps ggmap zoo
    xts quantmod Rcpp data.table XML jsonlite httr RcppArmadillo manipulate proto
    dichromat reshape2 mice rpart party caret randomForest nnet e1071 kernlab
    neuralnet rnn h2o RSNNS tensorflow keras infer janitor DataExplorer sparklyr
    drake DALEX raster gpclib

    # BioConductor
    BiocManager GenomicFeatures AnnotationDbi DESeq DESeq2 MAST FEM DEGseq
    EBSeq DRIMSeq SGSeq RNASeqR
If you want to use a specific R package with a centrally installed R, you can check whether it has already been installed before attempting to install it:
    > is.installed <- function(mypkg) is.element(mypkg, installed.packages()[,1])
    > is.installed("FEM")
    [1] TRUE
    > is.installed("e1071")
    [1] TRUE
    > is.installed("rgdal")
    [1] FALSE
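The same check can also be run from the shell, for example at the top of a job script, so that a missing package is noticed before the calculation starts. The sketch below wraps the is.element() test from the example above in a shell function; the name has_r_pkg is a hypothetical helper introduced here, and the package names are the ones from that example.

```shell
# has_r_pkg is a hypothetical helper, not an ARC or R command.
# Exit status is 0 if the named package is installed and non-zero otherwise.
has_r_pkg() {
    Rscript -e "quit(status = if (is.element('$1', installed.packages()[,1])) 0 else 1)"
}

# Only run the checks if Rscript is on the PATH (an R module has been loaded).
if command -v Rscript >/dev/null 2>&1; then
    for pkg in FEM e1071 rgdal; do
        if has_r_pkg "$pkg"; then
            echo "$pkg: installed"
        else
            echo "$pkg: not installed"
        fi
    done
else
    echo "Rscript not found; load an R module first, for example: module load R/3.6.2"
fi
```

The result depends on which R module is loaded, because each R version has its own set of installed packages.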