What syntax can be used to create case selection/verification/recode conditions (or to calculate variable values using spreadsheet formulas)?
Recoding variable values, performing case selection for analyses, or verifying quality of data involve a comparison of conditions or values encountered in the spreadsheet with conditions specified by an expression to determine a further action for each case (that is, respectively assigning a new variable value, including the case in an analysis, or verifying that the data associated with a case are correct). Assigning variable values using a spreadsheet formula involves case-by-case calculation of values for a variable based directly on the values of other spreadsheet variables and parameters not included in the spreadsheet. For data transformations more extensive or more complex than can be accomplished using the variable recode and spreadsheet formula facilities provided, use the Statistica Visual Basic language.
Follow these guidelines to create case selection, verification, or recode conditions.
- Refer to variables by either their numbers (example, v1 = 1) or their names (example, Gender = 1). Note that you can type variable names in either upper case or lower case letters (that is, "GENDER" is equivalent to "gender"). Note also that v0 refers to the case number when used in expressions.
- In expressions, enclose text labels of a variable in single quotation marks (example, v1 = 'MALE'). Note that you can type text labels in either upper case or lower case letters (that is, 'YES' is equivalent to 'yes').
- In expressions or spreadsheet formulas, enclose variable names containing special characters (example, spaces, plus or minus signs) in single quotation marks. If the single quotation mark itself occurs in the variable name, use double quotation marks instead (example, "A's Score"). Note that if double quotation marks are used in the name, then the variable name must be placed in single quotation marks.
Arithmetic: +, -, *, /, ** or ^ (exponentiation), ( )
Relational:
= (equal to)
<>, >< (not equal to)
< (less than)
> (greater than)
<= (less than or equal to)
>= (greater than or equal to)
Logical:
AND (equivalent to &)
OR (equivalent to !)
NOT (equivalent to ~)
Abs(x) | absolute value of x |
Arccos(x) | arc cosine of x |
Arcsin(x) | arc sine of x |
Arctan(x) | arc tangent of x |
Cos(x) | cosine of x |
hyperbolic cosine of x | |
Exp(x) | e to the power of x |
Hypot(x,y) | returns hypotenuse of x and y (square root(x2 + y2)) |
Log(x) | natural logarithm of x |
Log2(x) | binary logarithm of x |
Log10(x) | common logarithm of x |
Max(x,y) | returns the greater of x and y |
Min(x,y) | returns the lesser of x and y |
Rnd(x) | random number in the range of 0 to x |
RndNormal(x) | random number from a normal distribution with mu=0 and sigma=x |
RndPoisson(x) | random number from a Poisson distribution with parameter x |
Sign(x) | sign of x: if x>0 then +1, if x<0 then -1, if x = 0 remains 0 |
Sin(x) | sine of x |
SinH(x) | hyperbolic sine of x |
Sqrt(x) | square root of x |
Tan(x) | tangent of x |
TanH(x) | hyperbolic tangent of x |
Trunc(x) | truncate x to an integer "towards zero" |
Uniform(x) | random value in range 0 to x (same as Rnd(x)) |
Function names are not case sensitive, that is, Log(x) is the same as log(x) or LOG(x). As indicated in the function list above, math functions accept one to two arguments depending on the function. The position of the function in the expression or spreadsheet formula is replaced by the return value of the function. Numeric values (example, Sqrt(155)), variable names (example, Max(SCORE1,SCORE9)), or variable numbers (example, Log(v8)) are acceptable arguments. Additional arguments acceptable to math functions are expressions that evaluate to a number (example, Max(v7,(v5+v8-BASELINEVALUE)/3)) or functions that return a numeric result (example, Sin(Sqrt(v5))). Some commonly used constants can also be specified in expressions and formulas by reference: example, Pi = 3.14... Euler (e) = 2.71...
Mean(x1, x2,..xn) | mean of n arguments |
Median(x1, x2,..xn) | 50th percentile of n arguments |
Perc25(x1, x2,..xn) | 25th percentile of n arguments |
Perc75(x1, x2,..xn) | 75th percentile of n arguments |
Statmax(x1, x2,..xn) | maximum of n arguments |
Statmin(x1, x2,..xn) | minimum of n arguments |
Stdev(x1, x2,..xn) | standard deviation of n arguments |
Sum(x1, x2,..xn) | sum of n arguments |
Validn(x1, x2,..xn) | number non-missing of n arguments |
All statistics functions ignore arguments that contain missing values, basing their results on non-missing data only. Unlike math functions, statistics functions do not evaluate to a missing result unless all arguments are missing values or an argument is encountered that evaluates to an undefined value (example, square root of a negative number or division by zero). Also unlike math functions, each statistics function accepts any number of arguments placed in parentheses and separated by commas.
The following are acceptable statistics function arguments:
Numbers | example, Mean(18,80,120,68,40) |
Variable names | example, Mean(SALARY,1050,BONUS,500) |
Variable numbers | example, Stdev(120,v3,v2,v8,255) |
Ranges of variables designated by name or number (use colons to define ranges) | example, Sum(v54,COST1:COST5,2550,v23:v35,1575,OVERHEAD) |
Expressions that evaluate to a number | example, Mean(500,(v6+456)/v3),TRAVELEXPENSE,v8) |
Functions that return a numeric result | example, Mean(220,Sqrt(v8+v7),v12) |
Below is a listing of all available distributions (parameters are given in parentheses):
Distribution | Density/ Probability Function | Distribution Function | Inverse Distribution Function |
Beta | beta(x,n,w) | ibeta(x,n,w) | vbeta(x,n,w) |
Binomial | binom(x,p,n) | ibinom(x,p,n) | |
Cauchy | cauchy(x,h,q) | icauchy(x,h,q) | vcauchy(x,h,q) |
Chi-square | chi2(x,n) | ichi2(x,n) | vchi2(x,n) |
Exponential | expon(x,l) | iexpon(x,l) | vexpon(x,l) |
Extreme | extreme(x,a,b) | iextreme(x,a,b) | vextreme(x,a,b) |
F | F(x,n,w) | iF(x,n,w) | vF(x,n,w) |
Gamma | gamma(x,c) | igamma(x,c) | vgamma(x,c) |
Geometric | geom(x,p) | igeom(x,p) | |
Laplace | laplace(x,a,b) | ilaplace(x,a,b) | vlaplace(x,a,b) |
Logistic | logis(x,a,b) | ilogis(x,a,b) | vlogis(x,a,b) |
Lognormal | lognorm(x,m,s) | ilognorm(x,m,s) | vlognorm(x,m,s) |
Normal | normal(x,m,s) | inormal(x,m,s) | vnormal(x,m,s) |
Pareto | pareto(x,c) | ipareto(x,c) | vpareto(x,c) |
Poisson | poisson(x,l) | ipoisson(x,l) | |
Rayleigh | rayleigh(x,b) | irayleigh(x,b) | vrayleigh(x,b) |
Student's t | student(x,df) | istudent(x,df) | vstudent(x,df) |
Weibull | weibull(x,b,c,q) | iweibull(x,b,c,q) | vweibull(x,b,c,q) |