parallel
The Spotfire Enterprise Runtime for R Parallel Package Overview
Description
Provides an overview of the parallel package.
Introduction to the parallel Package
The parallel package contains a subset of functions that provide compatibility with the
open-source R parallelized computing feature. Using the Spotfire Enterprise Runtime for R parallel package, you can:
- Define and create a cluster of Spotfire Enterprise Runtime for R computation nodes, either locally (multiple cores on
a single machine) or remotely (on multiple machines running Spotfire Statistics Services).
- Execute a parallelized computation on a cluster using one of a family of parallelized
apply functions.
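As a minimal sketch of these two steps, the following creates a small local cluster and runs a parallelized apply function on it. The node count of 2 and the sample input are illustrative assumptions, not recommendations.

```r
library(parallel)

# Step 1: define and create a cluster (two local nodes; the count is an assumption).
cl <- makeCluster(2)

# Step 2: run a parallelized apply, stopping the cluster even if an error occurs.
words <- c("a", "bb", "ccc")
lens <- tryCatch(parLapply(cl, words, nchar),
                 finally = stopCluster(cl))
```

Here lens is a list with one element per input word, holding the character count of that word.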
The parallel package implements several "dummy" functions. These are functions that exist only for
compatibility with open-source R parallel functions.
Java Requirement
To use the parallel package, the JAVA_HOME environment variable must be set. You can check
its value by running the command Sys.getenv("JAVA_HOME") in the Spotfire Enterprise Runtime for R console.
Running library(parallel) loads the terrJava package if it is not already loaded.
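The check can also be scripted. The sketch below relies only on Sys.getenv, which returns an empty string when a variable is unset; the variable names java_home and java_home_set and the message wording are illustrative.

```r
# Read JAVA_HOME; Sys.getenv returns "" when the variable is not set.
java_home <- Sys.getenv("JAVA_HOME")
java_home_set <- nzchar(java_home)

if (java_home_set) {
  message("JAVA_HOME is set to: ", java_home)
} else {
  message("JAVA_HOME is not set; set it before loading the parallel package.")
}
```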
Using parallel with Spotfire Statistics Services
Spotfire Statistics Services allocates the parallel nodes you create to its available engines.
For example, if you are running Spotfire Statistics Services with three engines, but you
create a cluster with more than three nodes, Spotfire Statistics Services allocates the
nodes to the engines. You can submit as many tasks as there are virtual nodes, but they
are allocated to engines according to their availability.
If you use the parallel package with Spotfire Statistics Services, remember that each
call to the server starts a new engine session. You cannot depend on a particular engine
being used from one call to another. Each individual call could (and probably would) use
a different engine, or even an entirely different machine. If you need to perform
setup steps followed by an evaluation, write the script as a single evaluation.
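One way to read this advice is to put the setup and the computation inside the same function, so that nothing has to survive on an engine between calls. The sketch below assumes a hypothetical task function and scaling factor, and it uses a local two-node cluster only so that it can run standalone.

```r
library(parallel)

# Hypothetical task: setup and evaluation live in one function, so each
# call is self-contained and does not depend on state from an earlier call.
task <- function(x) {
  scale <- 10      # setup step (illustrative)
  x * scale        # evaluation step
}

cl <- makeCluster(2)   # local cluster, used here only for illustration
res <- tryCatch(clusterApply(cl, 1:4, task),
                finally = stopCluster(cl))
```

The alternative, defining scale in one call and using it in a later call, is the pattern that the paragraph above warns against.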
Warning: Clean Up Spawned Parallel Engines
When makeCluster (with type="TERR") creates a cluster of spawned
engines, these processes remain until they are explicitly stopped
by calling stopCluster, or until the Spotfire Enterprise Runtime for R process that spawned them exits.
This can cause problems if you call makeCluster repeatedly in a
long-running Spotfire Enterprise Runtime for R engine, such as a Spotfire local Spotfire Enterprise Runtime for R engine, or an engine in
Spotfire Statistics Services that is reused to execute multiple Spotfire
Statistics Services tasks. In this case, you could create many spawned engine
processes, which could ultimately slow down the computer.
To avoid this problem, call stopCluster after the cluster is used.
Calling on.exit in a function is another option. The following code
uses tryCatch so that the cluster is stopped even if an error occurs
while computing with the cluster.
clust <- makeCluster(3)
# mylst and myfun stand for your own list and function.
tryCatch(val <- clusterApply(clust, mylst, myfun),
         finally = stopCluster(clust))
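The on.exit alternative can be sketched as follows. The wrapper name run_on_cluster and its arguments are hypothetical; the point is only that on.exit registers stopCluster as soon as the cluster exists, so the cluster is stopped when the function returns or fails.

```r
library(parallel)

# Hypothetical wrapper: creates a cluster, uses it, and always stops it.
run_on_cluster <- function(x, fun, nodes = 2) {
  cl <- makeCluster(nodes)
  # Registered immediately, so it runs on normal exit and on error.
  on.exit(stopCluster(cl), add = TRUE)
  clusterApply(cl, x, fun)
}

squares <- run_on_cluster(1:3, function(i) i^2)
```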
Package Functions
You can find the following functions in the parallel package. For more information on each
function, see the package help.
Create parallel nodes
The following function creates a cluster of parallel nodes.
makeCluster | Creates a cluster. Use the spec argument to specify the number of nodes. |
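As an illustration of the spec argument, the sketch below picks a node count from detectCores and then asks each node for its process ID to confirm how many nodes were created. The cap of two nodes and the fallback to one node are arbitrary choices for the example.

```r
library(parallel)

# detectCores may return NA, so fall back to a single node in that case.
n_cores <- detectCores()
n_nodes <- if (is.na(n_cores)) 1L else min(2L, n_cores)

cl <- makeCluster(spec = n_nodes)
# clusterCall runs the function once on every node and returns one result per node.
ids <- tryCatch(clusterCall(cl, Sys.getpid),
                finally = stopCluster(cl))
```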
Perform parallel computation
The following functions perform various computation chores on clusters.
clusterApply | Applies the specified function to the components of x on each node. |
clusterApplyLB | Similar to clusterApply, but with load balancing. |
clusterCall | Calls the specified function on each cluster node. |
clusterEvalQ | Evaluates a literal expression on each cluster node. |
clusterExport | Exports the specified objects to each cluster node. |
clusterMap | Applies a function to multiple list or vector arguments on each node. Similar to mapply. |
clusterSplit | Splits the specified sequence into a contiguous piece for each cluster node. |
parApply | A parallel version of the apply function. |
parCapply | A parallel column version of apply. |
parLapply | A parallel version of lapply. |
parRapply | A parallel row version of apply. |
parSapply | A parallel version of sapply. |
parLapplyLB | A parallel version of lapply, with load balancing. |
parSapplyLB | A parallel version of sapply, with load balancing. |
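The functions above are often combined. The following sketch exports a variable to the nodes with clusterExport and then uses it inside parSapply. The variable offset is hypothetical, and the envir argument is assumed to behave as it does in open-source R.

```r
library(parallel)

offset <- 100   # hypothetical value that the nodes need

cl <- makeCluster(2)
res <- tryCatch({
  # Copy 'offset' from the current environment to every node.
  clusterExport(cl, "offset", envir = environment())
  # Each node can now refer to 'offset' inside the applied function.
  parSapply(cl, 1:4, function(i) i + offset)
}, finally = stopCluster(cl))
```

Without the export step, the applied function would fail on the nodes whenever offset is not otherwise available there.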
Miscellaneous parallel functions
clusterSetRNGStream | Sets the random number generator for each node in the cluster to "L'Ecuyer-CMRG". |
detectCores | Returns an integer value indicating the number of CPU cores, or NA if retrieving processor information is not supported on the current system. |
nextRNGStream and nextRNGSubStream | Each takes a seed for the "L'Ecuyer-CMRG" random number generator and produces a new seed of the same kind. |
setDefaultCluster | Registers a cluster as the default for the current session. |
splitIndices | Splits the sequence of integers from 1 to nx into contiguous pieces for each of ncl cluster nodes. |
stopCluster | Stops the engine nodes in the cluster cl. |
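Two of these functions are sketched below: splitIndices on its own, and clusterSetRNGStream used to make random draws on the nodes repeatable. The iseed argument and the seed value 42 follow open-source R usage and are assumptions here; the exact split sizes can vary, so the sketch does not rely on them.

```r
library(parallel)

# Split the indices 1..10 into contiguous pieces for 3 nodes.
pieces <- splitIndices(10, 3)

cl <- makeCluster(2)
draws <- tryCatch({
  clusterSetRNGStream(cl, iseed = 42)   # seed value is arbitrary
  first <- parSapply(cl, 1:4, function(i) runif(1))
  clusterSetRNGStream(cl, iseed = 42)   # reset the streams with the same seed
  second <- parSapply(cl, 1:4, function(i) runif(1))
  list(first = first, second = second)
}, finally = stopCluster(cl))

# Compare the two runs; with identical seeding they are expected to match.
same_draws <- identical(draws$first, draws$second)
```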
Dummy parallel functions
The following functions are implemented as dummy functions to provide compatibility with
the open-source R parallel functions. (These functions run serial evaluation only, not
parallel evaluation.)
pvec | To support existing code, it just applies FUN to v and the other ... arguments. |
mc.reset.stream | Does not reset the random number generators. |
mcaffinity | This function does nothing. |
mccollect | This function does nothing. |
mclapply | This function just calls the non-parallel lapply. |
mcparallel | Immediately evaluates and saves the value of expr. |
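Because mclapply is described as calling the non-parallel lapply, existing code that uses it should return the same values either way. The sketch below compares the two; the function double_it is a hypothetical example.

```r
library(parallel)

double_it <- function(i) i * 2   # hypothetical example function

# Same inputs through the compatibility function and through plain lapply.
via_mclapply <- mclapply(1:3, double_it)
via_lapply <- lapply(1:3, double_it)

same_result <- identical(via_mclapply, via_lapply)
```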
Placeholder functions for R compatibility
The following functions are defined to be compatible with open-source R, but Spotfire Enterprise Runtime for R
does not support these types of clusters. If they are called, they generate an error.
See Also
clusterApply,
clusterApplyLB,
clusterCall,
clusterEvalQ,
clusterExport,
clusterMap,
clusterSetRNGStream,
clusterSplit,
detectCores,
makeCluster,
makeForkCluster,
makePSOCKcluster,
mc.reset.stream,
mcaffinity,
mccollect,
mclapply,
mcparallel,
nextRNGStream,
nextRNGSubStream,
parApply,
parCapply,
parLapply,
parLapplyLB,
parRapply,
parSapply,
parSapplyLB,
pvec,
setDefaultCluster,
splitIndices,
stopCluster.