Merge Options - Variables Tab

 

Select the Variables tab in the Merge Options dialog box to display options to align (match) the cases of the second file with those of the first.

Option Description
ModeIf the number and order of cases in the two data files are not precisely the same (i.e., if at least one case is missing in the second data file or there is one case that has no equivalent in the first file), all subsequent data will be merged with the wrong cases. To prevent this, specify a mode for merging from the options in the Mode group box.
ConcatenateSelect the Concatenate option button to simply add the variables of the second file to those of the first (this is the default mode). If there are more cases in either File 1 or File 2, the option selected in the Unmatched Cases group box will determine what Statistica will do. By default, Statistica fills any extra cases with missing data.
CartesianSelect the Cartesian option button to create a cross product of the variables. This is to say, every combination of every case will be created. For example:
  • File 1:
    SEATDYNAMO
    DOWNOFF
    UPON
  • File 2:
    GEARTIRES
    MEDIUMHARD
    LOWSOFT
  • Result:
    SEATDYNAMOGEARTIRES
    DOWNOFFMEDIUMHARD
    DOWNOFFLOWSOFT
    UPONMEDIUMHARD
    UPONLOWSOFT

A possible application of a Cartesian merge is to generate an experimental design spreadsheet.

Match casenamesSelect the Match casenames option button to match the cases from the second file with those of the first file based on the case names. For each case in both files, Statistica checks the case names and merges them if their respective case names match.
Match variables/SelectSelect the Match variables option button to match the cases from the second file with those of the first file based on the values of specified variable pairs. When merging variables from two files, specify the matching variables in each data file via the Select button; for each case, Statistica checks the values of these variables in both data files and merges the cases only if their respective values match. Note that missing data is ignored.

Merge matches variable data types as follows:

  • Continuous (Byte, Integer, or Double)
  • Categorical (Text or Double with Text Labels); data will be sorted as Text, case insensitive (e.g., "ABC" and "abc" are treated the same)

Note that if more than one variable is selected for each of the files, the variables are paired up by the order as they appear in the spreadsheet. For example, if you select variables 1-3 from the File 1 and variables 2-4 from the File 2, the values of variable 1 from File 1 will be compared against the values of variable 2 from the File 2, etc. For a pair of cases between the File 1 and the File 2 to match, all of the values between their matching variables must match.

For example, select the variables ID and NAME as your matching variables for the files below:

File 1:

IDNAMEHEIGHT
612CHESTER2.7

File 2:

IDNAMEWEIGHT
612SYLVESTER8.7

Although these cases have matching values for the variable ID, the values in the NAME variables differ and Statistica will not consider these matching cases. When the variable WEIGHT is added to File 1, the case Sylvester will be added below Chester:

IDNAMEHEIGHTWEIGHT
612CHESTER2.7 
612SYLVESTER 8.7

Conversely, suppose that the name in File 2 is Chester:

IDNAMEWEIGHT
612CHESTER8.7

Statistica would recognize these cases as matching because the values in both matching variables are the same. The result would appear as such:

IDNAMEHEIGHTWEIGHT
612CHESTER2.78.7

Match CriteriaSelect an option in the Match Criteria drop-down list to specify how to compare values when using Match casenames or Match variables. The default option is By Auto, which will auto-detect the most appropriate comparison method. By text compares values by textual content, and By numeric compares by numeric value. Note that if you select By numeric when comparing variables that contain double with text labels, the underlying values will be compared.
Unmatched casesSelect an option to specify the manner of dealing with unmatched cases when the two files are merged. Unmatched cases may result from unequal numbers of cases in the merged files, or because some of the cases do not meet the relational merge criteria.
Fill with MDSelect Fill with MD to pad unmatched cases with missing data.
Delete casesIf this option button is selected, cases from either file that cannot be matched will be removed from the merged file.
Generate CartesianSelect Generate Cartesian to create a cross product between every unmatched case against every other case, i.e., if a unique case is found in only File 1 or File 2, then every combination of that case against every other case will be created.
Abort mergeWhen this option button is selected, the presence of unmatched cases in either file will cause an error message to be displayed and the merge procedure to be abandoned.
Multiple CasesWhen either the Match casenames merge or Match variables merge options are selected in the Mode group box, the options in the Multiple Cases group box become available to specify what to do if duplicate matching cases are encountered.
Fill with MDWhen the Fill with MD option button is selected, the file will be padded with missing data in the duplicate cases that were matched. This option (in conjunction with Match variables) is similar to Relational in Statistica 6.
Copy downWhen the Copy down option button is selected, a Cartesian product will be generated for duplicate matches of the same value. This option (in conjunction with Match variables) is similar to Relational hierarchical in Statistica 6.
Drop File 1 multiplesSelect the Drop File 1 multiples check box so that duplicate cases from the File 1 data set will not be included when using Match casenames or match variables.
Drop File 2 multiplesSelect the Drop File 2 multiples check box so that duplicate cases from the File 2 data file will not be included when using Match casenames or Match variables.