reshape(data, varying = NULL, v.names = NULL, timevar = "time",
        idvar = "id", ids = NULL, times = NULL, drop = NULL,
        direction, new.row.names = NULL,
        sep = ".",
        split = if (sep == "") {
            list(regexp = "[A-Za-z][0-9]", include = TRUE)
        } else {
            list(regexp = sep, include = FALSE, fixed = TRUE)
        })
data | a data frame with repeated measurements. Each "individual" will have measurements on various aspects of it taken at a number of "times". In "long" format there will be a column of times, a column of ids, and one column per measurement type. It is expected that each individual will be measured at the same set of times. In "wide" format there will be a column of identifiers (with no repeated entries) and a column for each measurement at each time (and no time column). If data has the attribute reshapeWide or reshapeLong, then the components of the attribute will be used as arguments to reshape and all other arguments are ignored. |
varying | a list of equal-length vectors of variable names in wide format: each vector of variable names in the list corresponds to a single variable in the long format. That is, each component of the list contains the names of the columns referring to a single measurement type taken at different times. It can also be a matrix of variable names, where each row of the matrix acts like a component of the list described above. It can also be a vector of variable names, in which case reshape will try to intuit their meanings, assuming that the names are pasted together from the column names in v.names, the time values from the timevar column, and the sep string. |
v.names | a character vector of variable names in long format that correspond to multiple variables in the wide format. If not supplied, all columns in the data argument except those named by the idvar, timevar, and drop arguments will be used. |
timevar | a character string naming the time variable in long format, which distinguishes multiple records from the same group/individual. |
idvar | a character vector of names of one or more variables in long format that identify multiple records from the same group/individual. These variables may also be present in wide format. |
ids | the values to be used in the idvar variables in long format. |
times | the values to be used in the timevar variable in long format. |
drop | a character vector of names of variables to be dropped before the data is reshaped. |
direction | a character string specifying the direction of the reshape: "wide" reshapes to wide format, and "long" reshapes to long format. This argument must be supplied unless data carries the attribute "reshapeLong" or "reshapeWide". |
new.row.names | a character vector to be used as the row names of the reshaped data. If NULL, the row names are created from the values of the idvar and timevar variables in long format. |
sep | a character string used as the separator between the variable name and time point parts of the measurement columns in wide format. When converting to wide format, sep is used to generate the new column names; when converting to long format with varying given as a vector of names and v.names not supplied, sep is used, via split, to find the measurement name and time point value encoded in the column names. |
split | a list with three components: regexp, a regular expression; include, a logical; and fixed, a logical. This is used when converting to long format to decode the variable names given by varying. regexp is either a regular expression pattern (see regexpr) identifying where to split up the names (if fixed is FALSE) or a fixed string identifying where to split up the names (if fixed is TRUE). If include is TRUE then the first character of the matched text is included in the first part of the split name, otherwise it is not (this is useful when there is no separator character so, e.g., the name "Joe10" may be split into "Joe" and "10" with the regular expression "[[:alpha:]][[:digit:]]"). |
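The interaction between varying, v.names, times, and sep can be sketched with a small example. The data frame `wide` and its columns (`ht_1`, `wt_1`, and so on) are invented for illustration and do not come from the examples in this page. The first call passes varying as a plain vector and relies on sep to decode the names; the second spells out the same grouping explicitly as a list.

```r
# Hypothetical wide data: two measurement types (ht, wt) at times 1 and 2.
wide <- data.frame(
  id   = c("a", "b"),
  ht_1 = c(10, 20), ht_2 = c(11, 21),
  wt_1 = c(30, 40), wt_2 = c(31, 41)
)

# varying as a vector: reshape splits each name at sep = "_" to infer
# the measurement names (ht, wt) and the time values (1, 2).
long_guess <- reshape(wide, direction = "long",
                      varying = c("ht_1", "ht_2", "wt_1", "wt_2"),
                      sep = "_", idvar = "id")

# varying as a list: one component per measurement type, with the
# long-format names and time values given explicitly.
long_list <- reshape(wide, direction = "long",
                     varying = list(c("ht_1", "ht_2"), c("wt_1", "wt_2")),
                     v.names = c("ht", "wt"), times = c(1, 2),
                     idvar = "id")

long_guess
long_list
```

Both calls should produce four rows with the columns id, time, ht, and wt; the explicit list form is the safer choice when column names do not follow a single naming pattern.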
# "wide" format data: measured 'conc' on days 1, 3, and 5 for 2 animals
d1 <- data.frame(animal = c("Dog", "Cat"),
                 conc1 = c(10.1, 1.1),
                 conc3 = c(30.3, 3.3),
                 conc5 = c(50.5, 5.5),
                 treatment = c("t1", "t2"))

# reshape to long format; with no 'times' given, time is numbered 1, 2, 3
d1.long <- reshape(d1, direction = "long",
                   varying = list(c("conc1", "conc3", "conc5")),
                   v.names = "conc", idvar = "animal")
d1.long

# same measurements as above, but let it intuit the v.names and the
# time values (1, 3, 5) from the column names
reshape(d1, direction = "long",
        varying = c("conc1", "conc3", "conc5"),
        sep = "", idvar = "animal")

# convert it back to original shape
reshape(d1.long)
# "long" format: one row for each parent
d2 <- data.frame(
    child = c("Alan", "Alan", "Susan", "Susan"),
    childAge = c(2, 2, 10, 10),
    parent = c("Betty", "Chris", "Ulam", "Tammy"),
    parentSex = c("Female", "Male", "Male", "Female"),
    parentAge = c(26, 28, 44, 42))

# reshape to have one row per child, with columns for each parent
reshape(d2, direction = "wide", idvar = "child",
        timevar = "parentSex", v.names = c("parent", "parentAge"))
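As a further sketch, the role of sep when converting to wide format can be seen by changing it from the default ".". The data frame `long` below is made up for illustration. The final call relies on the reshapeWide attribute stored on the result, as described under the data argument.

```r
# Hypothetical long data: one measurement 'x' at times 1 and 2 for two ids.
long <- data.frame(id   = c(1, 1, 2, 2),
                   time = c(1, 2, 1, 2),
                   x    = c(5, 7, 6, 8))

# sep = "_" is pasted between the v.names and the time values
# to build the new wide-format column names.
wide <- reshape(long, direction = "wide",
                idvar = "id", timevar = "time",
                v.names = "x", sep = "_")
names(wide)

# The result carries a reshapeWide attribute, so no further
# arguments are needed to go back to long format.
back <- reshape(wide)
back
```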