Summary: Observed and expected distribution
Click the Summary: Observed and expected distribution button to display either the standard variable selection dialog or the results of the analysis (if you have already selected a variable). Note that you can choose whether the Kolmogorov-Smirnov test is displayed in the results and, if you choose to display it, whether it is categorized or continuous, by selecting the appropriate option buttons on the Options tab.
By default, Statistica will compute the Chi-square test based on the observed and expected frequencies. Categories where the expected frequency is less than 5 are collapsed to form larger categories. If this test is significant, you reject the hypothesis that the observed data follow the hypothesized distribution.
Note: degrees of freedom. The degrees of freedom for the Chi-square test are computed as:
df = number of categories - number of parameters - 1
where number of categories is the number of categories in the frequency table whose expected frequencies are greater than 5 (that is, the categories that remain after sparse categories have been combined), and number of parameters is the number of parameters defining the respective theoretical distribution.
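As a minimal sketch of the arithmetic in this formula (the function name and the example counts are illustrative assumptions, not part of Statistica):

```python
def chi_square_df(n_categories, n_parameters):
    """Return df = number of categories - number of parameters - 1.

    n_categories: categories left in the frequency table after sparse
                  categories have been combined (hypothetical input).
    n_parameters: parameters defining the theoretical distribution,
                  e.g. 2 for the Normal distribution (mean, standard deviation).
    """
    df = n_categories - n_parameters - 1
    if df < 1:
        # With no degrees of freedom left, the test cannot be evaluated.
        raise ValueError("not enough categories for the Chi-square test")
    return df


# Example: a Normal distribution (2 parameters) and 8 remaining categories.
normal_df = chi_square_df(n_categories=8, n_parameters=2)
print(normal_df)
```

With 8 categories and 2 parameters, the result is 8 - 2 - 1 = 5 degrees of freedom.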
Note: df adjusted. If the Chi-square test results shown in the resulting spreadsheet or histogram are accompanied by the qualifier df adjusted, Statistica has combined categories where the expected frequencies are less than 5 in order to compute the Chi-square test. Specifically, those categories are combined with adjacent categories until the expected frequency for the combined category exceeds 5.0.
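The combining step can be sketched as follows. This is an illustration only: it merges categories from left to right and folds any sparse remainder into the last combined category, which is an assumption; the exact merging order Statistica uses is not described here. The function names and the example frequencies are hypothetical, and the statistic is the usual Pearson form, the sum of (observed - expected)^2 / expected.

```python
def combine_sparse_categories(observed, expected, min_expected=5.0):
    """Merge adjacent categories until each combined expected frequency
    exceeds min_expected (left-to-right merging is an assumption)."""
    obs_out, exp_out = [], []
    obs_acc = exp_acc = 0.0
    for o, e in zip(observed, expected):
        obs_acc += o
        exp_acc += e
        if exp_acc > min_expected:
            obs_out.append(obs_acc)
            exp_out.append(exp_acc)
            obs_acc = exp_acc = 0.0
    # A sparse remainder at the end is folded into the last combined category.
    if obs_acc or exp_acc:
        if exp_out:
            obs_out[-1] += obs_acc
            exp_out[-1] += exp_acc
        else:
            obs_out.append(obs_acc)
            exp_out.append(exp_acc)
    return obs_out, exp_out


def chi_square_statistic(observed, expected):
    """Pearson Chi-square: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequency table with sparse categories in both tails.
observed = [1, 4, 10, 20, 9, 5, 1]
expected = [2.0, 4.5, 11.0, 15.0, 11.0, 4.5, 2.0]

obs_c, exp_c = combine_sparse_categories(observed, expected)
stat = chi_square_statistic(obs_c, exp_c)
print(obs_c, exp_c, stat)
```

In this example the two leftmost and the two rightmost categories are merged, so seven categories become five, and the degrees of freedom are computed from the five combined categories.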
Plot of observed and expected distribution
Click the Plot of observed and expected distribution button to display either the standard variable selection dialog or to produce a graph of the observed and expected distribution (if you have already selected a variable). Note that you can choose to plot the frequency distribution or cumulative distribution and raw frequencies or relative frequencies by selecting the appropriate option buttons on the Options tab.
Note: The tabulation (assignment of values into categories) is based on the first 6 significant digits of the data values. Use Basic Statistics for computing standard frequency tables. You can also use Process Analysis to fit various distributions to the data, including Weibull, Beta, Rayleigh, etc., using either the method of matching moments or maximum likelihood.
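To illustrate what tabulating on 6 significant digits implies, the sketch below reduces values to 6 significant digits before they would be assigned to categories. Whether Statistica rounds or truncates is not stated here; rounding is an assumption of this example, and the function name is hypothetical.

```python
def to_six_significant(value):
    """Reduce a value to 6 significant digits (rounding is an assumption)."""
    return float(f"{value:.6g}")


# Two values that differ only beyond the sixth significant digit
# become indistinguishable for the purpose of tabulation.
a = to_six_significant(1.2345671)
b = to_six_significant(1.2345674)
print(a, b, a == b)
```

Values that agree in their first 6 significant digits are therefore always assigned to the same category.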
Note: If you record a macro from this dialog, or request by-group analyses, the specific user-defined parameters shown here will be used in the results dialog to compute the expected values and related results for the respective distribution, regardless of the data (e.g., subgroup) to which the macro (or by-group analysis) is applied. For example, if the Normal Distribution is selected, the specific Mean and Standard Deviation shown in the results dialog will be used to compute the expected normal distribution values for the data to which the recorded macro is applied, or for the respective subgroup.
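The effect of this behavior can be sketched in Python (not a Statistica macro). The parameter values, category boundaries, and subgroup data below are hypothetical; the point is that the same fixed Mean and Standard Deviation are used for every subgroup rather than being re-estimated from each subgroup's data.

```python
from statistics import NormalDist


def expected_normal_frequencies(boundaries, n, mean, sd):
    """Expected counts in each interval between consecutive boundaries
    for n observations from a Normal(mean, sd) distribution."""
    dist = NormalDist(mu=mean, sigma=sd)
    cdf = [dist.cdf(b) for b in boundaries]
    return [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical parameters as they might appear in the results dialog.
FIXED_MEAN, FIXED_SD = 50.0, 10.0
boundaries = [30, 40, 50, 60, 70]

# Hypothetical subgroups; note that group B is centered well above 50.
groups = {
    "A": [42.0, 47.5, 51.0, 55.2, 61.3, 38.9],
    "B": [62.0, 65.5, 58.1, 69.9],
}

# The same fixed parameters are applied to every subgroup.
expected_by_group = {
    name: expected_normal_frequencies(boundaries, len(values), FIXED_MEAN, FIXED_SD)
    for name, values in groups.items()
}
print(expected_by_group)
```

Because the parameters are fixed, the expected proportions per category are identical for both subgroups and only the scaling by subgroup size differs. To fit each subgroup with its own parameters, the parameters would have to be specified separately for each analysis.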