clusterApply
Apply Operations Using Clusters

Description

Provides some functions for parallel computation using clusters.

Usage

clusterApply(cl = NULL, x, fun, ...)
clusterApplyLB(cl = NULL, x, fun, ...) 
clusterCall(cl = NULL, fun, ...) 
clusterEvalQ(cl = NULL, expr) 
clusterExport(cl = NULL, varlist, envir = .GlobalEnv) 
clusterMap(cl = NULL, fun, ..., MoreArgs = NULL,
           RECYCLE = TRUE, SIMPLIFY = FALSE, USE.NAMES = TRUE,
           .scheduling = c("static","dynamic")) 
clusterSplit(cl = NULL, seq) 
parApply(cl = NULL, X, MARGIN, FUN, ...)	
parCapply(cl = NULL, x, FUN, ...) 	
parLapply(cl = NULL, X, fun, ...) 
parRapply(cl = NULL, x, FUN, ...) 
parSapply(cl = NULL, X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
parLapplyLB(cl = NULL, X, fun, ...) 
parSapplyLB(cl = NULL, X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE) 

Arguments

cl a cluster object. It is normally created by makeCluster. If NULL, the default cluster set by setDefaultCluster is used.
x
  • a vector for clusterApply and clusterApplyLB
  • a matrix for parCapply and parRapply.
fun, FUN a function or a character string giving the function name.
... additional arguments passed to fun or FUN.
expr an expression to be evaluated.
varlist a character vector giving the names of objects to be exported.
envir an environment from which to find the objects to be exported.
MoreArgs an optional list of additional arguments passed to fun.
RECYCLE a logical value. If TRUE (the default), shorter arguments to clusterMap are recycled. If FALSE, all the arguments are truncated to the length of the shortest one.
SIMPLIFY a logical flag. If TRUE, the result is simplified to a vector or a matrix, if possible, as in mapply. The default is FALSE.
USE.NAMES a logical flag. See mapply for its use in clusterMap, or sapply for use in parSapply and parSapplyLB.
.scheduling a character string specifying how the tasks are scheduled.
  • If "static", the tasks are statically allocated to the nodes of the cluster, as in clusterApply.
  • If "dynamic", dynamic load balancing is applied to the tasks, as in clusterApplyLB.
Partial string matching is allowed.
seq a vector to be split.
X
  • a list or vector for parLapply, parLapplyLB, parSapply, and parSapplyLB.
  • a matrix or array for parApply.
MARGIN the subscripts over which the function is applied, as in apply.
simplify a logical flag or character string. If TRUE (the default), the result is simplified to an array, as in sapply.
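
As an illustration of the clusterMap arguments described above, the following is a minimal sketch, assuming a two-node local cluster. The anonymous functions and the argument name scale are hypothetical and chosen only for this example.

```r
library(parallel)

cl <- makeCluster(2)  # assumption: two worker processes on the local host

# Element-wise over two vectors; MoreArgs holds arguments shared by every call.
res <- clusterMap(cl, function(x, y, scale) (x + y) * scale,
                  1:3, 4:6, MoreArgs = list(scale = 10))
# res is a list, because SIMPLIFY defaults to FALSE.

# SIMPLIFY = TRUE collapses the result to a vector when possible.
vec <- clusterMap(cl, function(x, y) x + y, 1:3, 4:6, SIMPLIFY = TRUE)

# RECYCLE = TRUE (the default) recycles the shorter argument, 1:2,
# to the length of the longer one.
rec <- clusterMap(cl, function(x, y) x + y, 1:4, 1:2)

stopCluster(cl)
```

Passing shared values through MoreArgs rather than through ... keeps them from being treated as vectors to iterate over.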

Details

clusterApply Applies the function fun to components of x on each cluster node, calling fun on the first node with arguments x[[1]] and ..., on the second node with arguments x[[2]] and ..., and so on. The cluster nodes are recycled if the number of nodes is less than the length of x. The result of function call fun on each node is returned as an element of the list returned by clusterApply.
clusterApplyLB A load-balancing version of clusterApply. Suppose that there are n nodes and p is the length of x. If n is greater than or equal to p, the jobs are simply submitted to the first p nodes. Otherwise, the first n jobs are submitted in order to the n nodes. When a job finishes, the next job is submitted to the node that has become free. This process continues until all jobs are finished. Load balancing gives better cluster utilization if one node takes much longer than the others; however, the increased communication between parent and child processes can reduce performance. Moreover, it is harder to predict which job is executed on which node, because the assignment depends on how long each job takes.
clusterCall Calls the function fun on each cluster node with arguments ....
clusterEvalQ Evaluates a literal expression on each cluster node. It is a parallel version of evalq.
clusterExport Exports the objects named in varlist to each cluster node. Each variable is assigned in the global environment on each node. The environment from which to get objects is specified in envir.
clusterMap Applies a function to multiple list or vector arguments on each cluster node. Most of the arguments are similar to mapply.
clusterSplit Splits a sequence seq into one contiguous piece for each cluster node. The number of pieces is equal to the number of nodes in cl, and the lengths of these pieces are roughly equal. See splitIndices.
parApply The parallel version of apply, applying a function to sections of an array, distributing the work over multiple cluster nodes.
parRapply and parCapply Parallel row and column versions of apply, applying a function to the rows or columns of a matrix, distributing the work over multiple cluster nodes.
parLapply and parSapply Parallel versions of lapply and sapply, applying a function to components of a list or vector, distributing the work over multiple cluster nodes.
parLapplyLB and parSapplyLB Load-balancing versions of parLapply and parSapply. Internally, clusterApplyLB is called.
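
The difference between static allocation (clusterApply) and load balancing (clusterApplyLB) can be sketched as follows. This is an illustrative example, assuming a two-node local cluster; the workload (delays and slow_task) is hypothetical, with one deliberately slow task.

```r
library(parallel)

cl <- makeCluster(2)  # assumption: two worker processes on the local host

# Hypothetical workload: the first task is slow, the rest are fast.
delays <- c(0.5, 0.05, 0.05, 0.05, 0.05, 0.05)
slow_task <- function(d) { Sys.sleep(d); Sys.getpid() }

# Static: tasks are dealt out to the nodes in order, recycling the nodes.
static_pids <- unlist(clusterApply(cl, delays, slow_task))

# Dynamic: each remaining task goes to whichever node becomes free first.
dynamic_pids <- unlist(clusterApplyLB(cl, delays, slow_task))

table(static_pids)   # number of tasks handled by each worker, static
table(dynamic_pids)  # number of tasks handled by each worker, dynamic

stopCluster(cl)
```

With static allocation each of the two workers handles three tasks regardless of how long they take. With load balancing the split depends on timing, so it can differ from run to run.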
For all of the above functions except for clusterSplit, the function generates an error if any error occurs on any node.
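
One way to deal with this behavior is sketched below, assuming a two-node local cluster. The functions risky and safe are hypothetical helpers written for this example and are not part of the package.

```r
library(parallel)

cl <- makeCluster(2)  # assumption: two worker processes on the local host

# Hypothetical task that fails for one input.
risky <- function(i) if (i == 3) stop("bad input: ", i) else sqrt(i)

# An error on any node surfaces as an error in the calling session,
# so the whole call can be wrapped in tryCatch.
outcome <- tryCatch(
  clusterApply(cl, 1:4, risky),
  error = function(e) conditionMessage(e)
)

# Alternative: catch the error on the worker so the other results survive.
# The argument is named 'task' rather than 'f' so that it cannot
# partially match the 'fun' argument of clusterApply.
safe <- function(i, task) tryCatch(task(i), error = function(e) NA_real_)
results <- clusterApply(cl, 1:4, safe, task = risky)

stopCluster(cl)
```

In the first call no partial results are returned, and outcome holds only the error message. In the second, the failing element is replaced by NA and the remaining elements are kept.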

Value

clusterApply returns a list with the same length as x, with no names.
clusterApplyLB returns a list with the same length as x, with no names.
clusterCall returns a list. Each element is the result of calling fun on one cluster node.
clusterEvalQ returns a list. Each element is the result of evaluating expr on one cluster node.
clusterExport returns NULL invisibly.
clusterMap returns a list, a vector, or a matrix. See mapply.
clusterSplit returns a list with length equal to the number of nodes.
parApply returns a vector or array. See apply.
parRapply returns a vector.
parCapply returns a vector.
parLapply returns a list with the length of X, with the same names as X. See lapply.
parLapplyLB returns a list with the length of X, with the same names as X. See lapply.
parSapply returns a vector or matrix. See sapply.
parSapplyLB returns a vector or matrix. See sapply.

Differences between Spotfire Enterprise Runtime for R and Open-source R

clusterSplit gives incorrect results in open-source R if cl has one or fewer nodes, or seq has zero elements. Spotfire Enterprise Runtime for R handles these cases better, returning a list with length equal to the number of nodes in cl.
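
For the ordinary case (several nodes and a non-empty seq), the shape of the clusterSplit result can be checked with a short sketch like the one below. It assumes a three-node local cluster and does not exercise the edge cases mentioned above.

```r
library(parallel)

cl <- makeCluster(3)  # assumption: three worker processes on the local host

pieces <- clusterSplit(cl, 1:20)
length(pieces)          # one piece per node
sapply(pieces, length)  # piece sizes, roughly equal

# The pieces partition seq: concatenating them restores the input.
identical(unlist(pieces), 1:20)

# A typical use: send each node its own chunk in a single call.
chunk_sums <- clusterApply(cl, pieces, sum)
sum(unlist(chunk_sums)) == sum(1:20)

stopCluster(cl)
```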

See Also

makeCluster, apply, lapply, mapply, sapply, evalq, splitIndices.

Examples

cl <- makeCluster(3)  # Create cluster with 3 nodes on local host
clusterApply(cl, 1:6, sin)
clusterApplyLB(cl, 1:6, function(i, x, y){ x^i+y}, 3, 4)
clusterApply(cl, 1:6, get("+"), c(3, -3))
clusterCall(cl, function(i, x, y){ x^i+y}, 2, 3, 4)
clusterEvalQ(cl, {x<-1; y<-1; x+y})

x <- 1; y <- 1
clusterExport(cl, c("x", "y"))
clusterMap(cl, matrix, list(aa = 1:12, bb=1:4, cc=1:6), nrow=c(3,2,2),
           .scheduling = "dynamic")
clusterSplit(cl, 1:20)

z <- array(c(1:24, 101:124, 201:224, 301:324, 401:424), dim=c(2,3,4,5))
parApply(cl, z, c(1, 3), sum)

parRapply(cl, matrix(20:1, ncol=5), mean, trim=.25)
parCapply(cl, matrix(20:1, ncol=5), mean, trim=.25)

parLapply(cl, list(a=1:10, b=runif(20)), quantile, probs=1:3/4)
parSapply(cl, list(a=1:10, b=runif(20)), quantile, probs=1:3/4)

# Load Balance version
parLapplyLB(cl, list(a=1:10, b=runif(20)), quantile, probs=1:3/4)
parSapplyLB(cl, list(a=1:10, b=runif(20)), quantile, probs=1:3/4)

stopCluster(cl)

Package parallel version 6.1.1-7