Workspace Node: Merge Variables - Specifications Tab

The Merge Variables workspace node can be accessed from the Feature Finder, ribbon bar, or Node Browser. Note that the file connected first will be ordered first in the merge. Double-click the node to display the specifications dialog box.

Input.

Match variables
Use this option to match the cases from the second file with those of the first file based on the values of specified variable pairs. When merging variables from two files, specify the matching variables in each data file via the Match variables button; for each case, Statistica checks the values of these variables in both data files and merges the cases only if their respective values match. Note that missing data is ignored.

Merge matches variable data types as follows:

  • Continuous (Byte, Integer, or Double)
  • Categorical (Text or Double with Text Labels); data will be sorted as Text, case insensitive (e.g., "ABC" and "abc" are treated the same)

Note: If more than one variable is selected for each file, the variables are paired in the order in which they appear in the spreadsheet. For example, if you select variables 1-3 from File 1 and variables 2-4 from File 2, the values of variable 1 from File 1 are compared against the values of variable 2 from File 2, variable 2 against variable 3, and so on. For a case in File 1 and a case in File 2 to match, the values of every pair of matching variables must match.
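
The pairing and matching rules above can be sketched in a few lines of Python. This is an illustration only, not Statistica code: the function names, the dictionary representation of a case, and the requirement that both selections have the same length are assumptions made for the example.

```python
def pair_match_variables(file1_vars, file2_vars):
    """Pair the selected variables positionally: first with first, and so on."""
    # Assumption for this sketch: both selections contain the same number of variables.
    if len(file1_vars) != len(file2_vars):
        raise ValueError("select the same number of match variables in each file")
    return list(zip(file1_vars, file2_vars))


def normalize(value):
    """Compare text case-insensitively; leave numeric values unchanged."""
    return value.casefold() if isinstance(value, str) else value


def cases_match(case1, case2, pairs):
    """Two cases match only if every paired variable holds equal values."""
    return all(normalize(case1[a]) == normalize(case2[b]) for a, b in pairs)


# Variables 1-3 of File 1 are paired with variables 2-4 of File 2.
pairs = pair_match_variables(["v1", "v2", "v3"], ["v2", "v3", "v4"])

row_file1 = {"v1": 612, "v2": "ABC", "v3": 7}
row_file2_same = {"v2": 612, "v3": "abc", "v4": 7}   # differs only in letter case
row_file2_diff = {"v2": 612, "v3": "abd", "v4": 7}   # one paired value differs

match_same = cases_match(row_file1, row_file2_same, pairs)
match_diff = cases_match(row_file1, row_file2_diff, pairs)
```

Here `match_same` is true because "ABC" and "abc" are treated the same, while `match_diff` is false because a single differing pair is enough to prevent a match.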

For example, select the variables ID and NAME as your matching variables for the files below:

File 1:

ID   NAME     HEIGHT
612  Chester  1. 7

File 2:

ID   NAME       WEIGHT
612  Sylvester  1. 7

Although these cases have matching values for the variable ID, the values of the NAME variables differ, so Statistica does not treat them as matching cases. When the variable WEIGHT is merged into File 1, the case Sylvester is added below Chester:

ID   NAME       HEIGHT  WEIGHT
612  Chester    1. 7
612  Sylvester          1. 7

Conversely, suppose that the name in File 2 is Chester:

File 2:

ID   NAME     WEIGHT
612  Chester  1. 7

Statistica recognizes these cases as matching because the values of both matching variables are the same. The result is as follows:

ID   NAME     HEIGHT  WEIGHT
612  Chester  1. 7    1. 7
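
The ID/NAME example can be reproduced with a small Python sketch. It is not Statistica's implementation: `merge_cases` is a hypothetical helper, the HEIGHT and WEIGHT values (70 and 150) are placeholders chosen for illustration, `None` stands in for missing data, names are compared exactly for brevity, and unmatched cases are assumed to be kept and padded with missing data.

```python
def merge_cases(file1, file2, keys):
    """Merge two lists of cases (dicts) on `keys`, keeping unmatched cases."""
    cols1 = list(file1[0].keys())
    extra = [c for c in file2[0] if c not in keys and c not in cols1]
    columns = cols1 + extra
    merged, used = [], set()
    for c1 in file1:
        row = {col: c1.get(col) for col in columns}  # File 2 columns start as None
        for i, c2 in enumerate(file2):
            if i not in used and all(c1[k] == c2[k] for k in keys):
                row.update({col: c2[col] for col in extra})
                used.add(i)
                break
        merged.append(row)
    # Cases found only in File 2 are appended below the File 1 cases.
    for i, c2 in enumerate(file2):
        if i not in used:
            merged.append({col: c2.get(col) for col in columns})
    return merged


file1 = [{"ID": 612, "NAME": "Chester", "HEIGHT": 70}]           # placeholder height
file2_other = [{"ID": 612, "NAME": "Sylvester", "WEIGHT": 150}]  # placeholder weight
file2_same = [{"ID": 612, "NAME": "Chester", "WEIGHT": 150}]

no_match = merge_cases(file1, file2_other, ["ID", "NAME"])
match = merge_cases(file1, file2_same, ["ID", "NAME"])
```

With these inputs, `no_match` contains two cases (Chester with a missing WEIGHT, then Sylvester with a missing HEIGHT), and `match` contains a single Chester case with both values filled in.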

Output. In this group box, select the variables to merge from the two data files.

File 1 variables
Click this button to display a variable selection dialog box containing variables from the first data file; select the desired variables for the output file.
File 2 variables
Click this button to display a variable selection dialog box containing variables from the second data file; select the desired variables to merge into the output file.

Match Criteria by. These options determine the criterion for matching cases.

Numeric
Select this option button to match cases by the numeric value of the case name.
Text
Select this option button to match cases by the text value of the case name.
Auto
Select this option button to automatically choose the appropriate criterion (Numeric or Text) based on the contents of the case names.
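
As a rough illustration of the difference between the three criteria, the sketch below compares case names numerically, as text, or by an automatic choice. The rule used for Auto (numeric only when every name parses as a number) and the exact, case-sensitive text comparison are assumptions made for this example; the function names are hypothetical and do not correspond to Statistica functions.

```python
def as_number(name):
    """Return the numeric value of a case name, or None if it is not numeric."""
    try:
        return float(name)
    except (TypeError, ValueError):
        return None


def choose_criterion(case_names):
    """Assumed Auto rule: numeric only if every case name parses as a number."""
    if all(as_number(n) is not None for n in case_names):
        return "numeric"
    return "text"


def names_match(a, b, criterion="auto"):
    """Compare two case names under the chosen criterion."""
    if criterion == "auto":
        criterion = choose_criterion([a, b])
    if criterion == "numeric":
        x, y = as_number(a), as_number(b)
        return x is not None and x == y
    return a == b


numeric_result = names_match("007", "7", "numeric")  # same numeric value
text_result = names_match("007", "7", "text")        # different text
auto_result = names_match("007", "7")                # both numeric, so compared as numbers
```

The case names "007" and "7" match under Numeric but not under Text, which is why the choice of criterion matters when case names carry leading zeros or other formatting.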

Unmatched cases. Select an option to specify how unmatched cases are handled when the two files are merged. Unmatched cases can result from unequal numbers of cases in the merged files, or from cases that do not meet the relational merge criteria.

Fill with MD
Select this option button to pad unmatched cases with missing data.
Delete cases
When this option button is selected, cases from either file that cannot be matched will be removed from the merged file.
Generate Cartesian
Select this option button to create a cross product of every unmatched case with every other case; that is, if a case is found in only File 1 or only File 2, every combination of that case with every other case is created.
Abort merge
When this option button is selected, the presence of unmatched cases in either file will cause an error message to be displayed and the merge procedure to be abandoned.
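
The following Python sketch contrasts three of these options on a toy data set. It assumes each file is a dictionary mapping a unique match value to one data value, uses `None` for missing data, and uses hypothetical mode names; Generate Cartesian is left out because its output layout is not spelled out here.

```python
MD = None  # stand-in for missing data


def merge_unmatched(file1, file2, mode):
    """Merge two {match value: data value} dicts under an unmatched-cases mode."""
    rows = [(k, file1[k], file2[k]) for k in file1 if k in file2]
    only1 = [k for k in file1 if k not in file2]
    only2 = [k for k in file2 if k not in file1]
    if mode == "delete":
        return rows  # unmatched cases from either file are removed
    if mode == "abort":
        if only1 or only2:
            raise ValueError("unmatched cases found; merge abandoned")
        return rows
    if mode == "fill_md":
        rows += [(k, file1[k], MD) for k in only1]  # pad the File 2 side
        rows += [(k, MD, file2[k]) for k in only2]  # pad the File 1 side
        return rows
    raise ValueError(f"unknown mode: {mode}")


file1 = {"a": 1, "b": 2}
file2 = {"b": 20, "c": 30}

deleted = merge_unmatched(file1, file2, "delete")
filled = merge_unmatched(file1, file2, "fill_md")
```

Only case "b" appears in both files, so `deleted` holds that single case, `filled` also keeps "a" and "c" padded with missing data, and the "abort" mode raises an error for the same input.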

Multiple cases. Select an option to specify what to do if duplicate matching cases are encountered.

Fill with MD
When this option button is selected, the file will be padded with missing data in the duplicate cases that were matched.
Copy down
When this option button is selected, a Cartesian product will be generated for duplicate matches of the same value.
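
One way to picture the two options is to take the data values of all cases that share a single match value and combine them either pairwise or as a Cartesian product. The sketch below is an interpretation of the descriptions above, not Statistica's algorithm; the function name, the mode names, and the use of `None` for missing data are assumptions.

```python
from itertools import product, zip_longest


def merge_duplicates(values1, values2, mode):
    """Combine the data values of cases that share one match value."""
    if mode == "copy_down":
        # Cartesian product: every File 1 value is repeated for every File 2 value.
        return list(product(values1, values2))
    if mode == "fill_md":
        # Pair values in order; the shorter side is padded with missing data.
        return list(zip_longest(values1, values2, fillvalue=None))
    raise ValueError(f"unknown mode: {mode}")


# One case in File 1 and two cases in File 2 share the same match value.
copied = merge_duplicates([70], [150, 155], "copy_down")
padded = merge_duplicates([70], [150, 155], "fill_md")
```

With one File 1 case and two File 2 cases, `copied` repeats the File 1 value on both output cases, while `padded` leaves the File 1 side of the second case as missing data.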

Drop multiples. Use the options in this group box to specify how duplicate cases in the file should be handled.

File 1
When this check box is selected, duplicate cases from the File 1 data file will not be included.
File 2
When this check box is selected, duplicate cases from the File 2 data file will not be included.

Input order. This group box contains options to specify the order of the output.

Preserve input order in output
When this check box is selected, the output order will match the input order.
File 1 data are sorted
When this check box is selected, Statistica expects the first input spreadsheet to be sorted by case.
File 2 data are sorted
When this check box is selected, Statistica expects the second input spreadsheet to be sorted by case.

Options. See Common Options.

OK. Click the OK button to accept all the specifications made in the dialog box and to close it. To view the new spreadsheet, click the icon on the lower-right corner of the Merge Variables node.

See also, Home tab.