Batch processing with R

According to Wikipedia, batch processing is the execution of a series of programs (“jobs”) without human interaction. A batch job runs non-interactively, so all input data is preselected through scripts or command-line parameters.

R provides a simple way to run a script non-interactively, reading input from an “infile” and sending output to an “outfile”. You can also pass arguments to the batch job.

First we use the “cat” command and the “>” operator to create a small script file. Here “cat” reads from the keyboard, and the shell redirects what we type into the file:

$ cat > hello_world.R
# Hello World example
a <- c("Hello, world!")
print(a)

Press “Control-D” to signal the end of the input. Now we run R non-interactively, feeding it “hello_world.R” on standard input (STDIN) and showing the result on standard output (STDOUT):

$ R --vanilla --slave < hello_world.R

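This command says “run the code in hello_world.R non-interactively”. The “--vanilla” option starts R without restoring or saving a workspace and without reading startup files, and “--slave” suppresses the startup banner and the echo of the commands. By default, the output is shown on the screen, but we can use the “>” operator to redirect it to a file: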

$ R --vanilla --slave < hello_world.R > result.txt

A more interesting job is to pass our own arguments to a script. Let’s put some commands into a script file:

$ cat > print_args.R << EOF
args <- commandArgs()
print(args)
q()
EOF

Running it prints the full command line that R was started with:

$ R --slave "--args a=100 b=200" < print_args.R
[1] "/Library/Frameworks/R.framework/Resources/bin/exec/i386/R"
[2] "--slave"                                                  
[3] "--args"                                                   
[4] "a=100"                                                 
[5] "b=200" 
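
The vector above mixes the path of the R executable, R's own options and our arguments. As a minimal sketch of how the part we care about could be picked out by hand, the helper below keeps only the elements that follow “--args”. The name trailing_args and the example vector are made up for illustration; they are not part of R.

```r
# Hypothetical helper: keep only the elements that follow "--args".
# all_args stands in for the value returned by commandArgs().
trailing_args <- function(all_args) {
  pos <- match("--args", all_args)   # index of the first "--args", NA if absent
  if (is.na(pos)) {
    return(character(0))
  }
  all_args[seq_len(length(all_args) - pos) + pos]
}

# Same shape as the output shown above (the executable path is shortened here).
example <- c("R", "--slave", "--args", "a=100", "b=200")
print(trailing_args(example))
```

In a real script the argument would be commandArgs() itself rather than a hand-written vector.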

The “commandArgs()” function has an argument “trailingOnly”; when it is TRUE, the function returns only the script-specific arguments that follow “--args”. So we can slightly modify our script:

$ cat > print_my_args.R << EOF
args <- commandArgs(TRUE)
print(args)
q()
EOF

Running the modified script prints only our own arguments:

$ R --slave "--args a=100 b=200" < print_my_args.R
[1] "a=100" "b=200"
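
The arguments arrive as plain strings such as "a=100", so a script usually has to split them into names and values before using them. The following is only a sketch of one way to do that, assuming every argument has the form name=value; the helper parse_key_value is a made-up name, and the input vector is hard-coded here so the example runs on its own.

```r
# Hypothetical helper: turn c("a=100", "b=200") into a named list of strings.
parse_key_value <- function(args) {
  parts  <- strsplit(args, "=", fixed = TRUE)
  keys   <- vapply(parts, function(p) p[1], character(1))
  # Re-join the rest so that a value may itself contain "=".
  values <- vapply(parts, function(p) paste(p[-1], collapse = "="), character(1))
  names(values) <- keys
  as.list(values)
}

# In a batch job this vector would come from commandArgs(trailingOnly = TRUE).
params <- parse_key_value(c("a=100", "b=200"))

# Values are still character strings, so convert them before doing arithmetic.
a <- as.numeric(params$a)
b <- as.numeric(params$b)
print(a + b)
```

Keeping the values as strings until they are needed leaves the choice of conversion (numeric, logical, file path) to the script that uses them.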


About Andrej

I am currently working towards my Ph.D. in statistics at the University of Ljubljana. I am also a teaching assistant at the Faculty of Information Studies. My research is in the areas of statistics, bioinformatics, machine learning, artificial intelligence, cognitive science, human-computer interaction, and knowledge representation.

2 Responses to Batch processing with R

  1. Could I suggest that you try out Rscript? Put #!/usr/bin/Rscript at the top of your code, chmod a+x and then everything works like magic (including command line arguments).

  2. Nice. I didn’t know that. Thanks for your advice.
