Input Formats in Correspondence Analysis - Frequencies with Grouping Variables

If the Frequencies with grouping variables option button is selected [from the Input group box on either the Correspondence Analysis (CA): Table Specifications Startup Panel - Correspondence Analysis (CA) tab or the Multiple Correspondence Analysis (MCA): Table Specifications Startup Panel - Multiple Correspondence Analysis (MCA) tab], Statistica expects two kinds of input: grouping variables whose code values uniquely identify each category, and a variable containing the frequencies (or some other measure of correspondence) for the cells identified by those grouping variables. For example, the file may look like this:

STAFFGRP	SMOKING	FREQUENCY
Sr.Manag	None	4
Sr.Manag	Light	2
Sr.Manag	Medium	3
Sr.Manag	Heavy	2
Jr.Manag	None	4
Jr.Manag	Light	3
Jr.Manag	Medium	7
Jr.Manag	Heavy	4
Sr.Empl	None	25
Sr.Empl	Light	10
Sr.Empl	Medium	12
.......	.......	.......
.......	.......	.......
.......	.......	.......

If you selected variables StaffGrp and Smoking for the analysis, and variable Frequency as the Variable with frequencies/counts, then Statistica would assign the value of Frequency in each case to the table cell identified by the grouping variables in that case (for example, 4 to the cell Sr.Manag by None).
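The assignment of values to cells can be illustrated outside Statistica with a short Python sketch. This is not Statistica code. The function name build_table is hypothetical, and only the cases actually listed in the example file are used (the elided cases are left out).

```python
# Illustrative sketch only; build_table is a hypothetical helper, not a Statistica function.
# Only the cases shown in the example listing are included; elided cases are omitted.
cases = [
    ("Sr.Manag", "None", 4),
    ("Sr.Manag", "Light", 2),
    ("Sr.Manag", "Medium", 3),
    ("Sr.Manag", "Heavy", 2),
    ("Jr.Manag", "None", 4),
    ("Jr.Manag", "Light", 3),
    ("Jr.Manag", "Medium", 7),
    ("Jr.Manag", "Heavy", 4),
    ("Sr.Empl", "None", 25),
    ("Sr.Empl", "Light", 10),
    ("Sr.Empl", "Medium", 12),
]


def build_table(cases):
    """Return {row_code: {column_code: frequency}} from (row, column, frequency) cases."""
    table = {}
    for row_code, column_code, frequency in cases:
        row = table.setdefault(row_code, {})
        # Accumulate, so that a cell referenced by more than one case receives the sum.
        row[column_code] = row.get(column_code, 0) + frequency
    return table


table = build_table(cases)
print(table["Jr.Manag"]["Medium"])  # 7
print(table["Sr.Empl"])             # {'None': 25, 'Light': 10, 'Medium': 12}
```

Each (StaffGrp, Smoking) pair acts as the address of one cell, and the Frequency value is what is stored at that address.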

Selection of variables and codes

The required selection of variables and codes is the same as that described under the option Raw data (requires tabulation), except that in addition, you will be prompted to select the Variable with frequencies/counts (that is, the variable containing the measure of correspondence, similarity, confusion, or association). Note that only positive values or zero are allowed in that variable (for example, Statistica does not permit negative frequencies).
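If the data are prepared outside Statistica, the restriction on the frequency variable can be checked in advance. The following Python sketch is one possible check. The name check_frequencies is hypothetical, and the error message is not the one Statistica would display.

```python
# Illustrative pre-check; check_frequencies is a hypothetical helper, not a Statistica function.
def check_frequencies(values):
    """Return True if every value is zero or positive; raise ValueError otherwise."""
    for case_number, value in enumerate(values, start=1):
        if value < 0:
            raise ValueError(f"case {case_number}: negative value {value} is not allowed")
    return True


print(check_frequencies([4, 2, 3, 0]))  # True
```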

Multiple references to the same cell

If there are multiple references to the same cell in the table, then the multiple values of the frequency variable are summed, and the sum is assigned to the respective cell in the table. For example, consider the following data specifying a 2 by 2 table:

GENDER	INCOME	FREQUENCY
MALE	HIGH	4
MALE	HIGH	6
MALE	LOW	3
FEMALE	HIGH	2
FEMALE	LOW	4

There are two references to the cell Male-High (that is, the first two cases in the listing above). Thus, the frequency assigned to that cell will be 4+6=10. This way of handling multiple references enables you to analyze subsets of tables that are coded in this manner. For example, suppose you had three grouping variables Gender, Income, and Occupation, and a fourth variable containing the frequencies for each cell in the three-way table. If you now selected only Gender and Income for the analysis, then Statistica would sum all the frequencies that fall into the same cell of the two-way table defined by those two variables, and, in effect, compute the Gender by Income marginal frequency table, collapsing across the levels of the variable Occupation.
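The summation rule can be sketched in Python using the listing above. This is an illustration, not Statistica code, and the function name tabulate is hypothetical. Because the listing contains no Occupation variable, the collapsing step is shown by selecting Gender alone, which collapses across the levels of Income in the same way.

```python
from collections import Counter

# Cases from the 2 by 2 listing above.
cases = [
    {"GENDER": "MALE", "INCOME": "HIGH", "FREQUENCY": 4},
    {"GENDER": "MALE", "INCOME": "HIGH", "FREQUENCY": 6},
    {"GENDER": "MALE", "INCOME": "LOW", "FREQUENCY": 3},
    {"GENDER": "FEMALE", "INCOME": "HIGH", "FREQUENCY": 2},
    {"GENDER": "FEMALE", "INCOME": "LOW", "FREQUENCY": 4},
]


def tabulate(cases, grouping_vars, freq_var="FREQUENCY"):
    """Sum freq_var over all cases that refer to the same cell.

    A cell is identified by the tuple of codes of the selected grouping
    variables; variables that are not selected are collapsed across.
    (Hypothetical helper for illustration only.)
    """
    cells = Counter()
    for case in cases:
        cell = tuple(case[var] for var in grouping_vars)
        cells[cell] += case[freq_var]
    return dict(cells)


two_way = tabulate(cases, ["GENDER", "INCOME"])
print(two_way[("MALE", "HIGH")])  # 10

by_gender = tabulate(cases, ["GENDER"])
print(by_gender)  # {('MALE',): 13, ('FEMALE',): 6}
```

Selecting fewer grouping variables means more cases share the same cell, so the same summation that resolves duplicate references also produces the marginal table.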
