The Nature of Sets

In the Statistica Quality Control module, a set is a collection of samples that share the same control or specification limits. Sets can be defined before the analysis (on the Sets tab of the Defining Variables dialog) or after the initial charts have been created (on the Sets tab of the Results dialog). If sets are specified before the analysis, they cannot also be specified afterwards: the options for creating sets on the Results dialog are not available.

Specifying sets before the analysis
Membership in a set can be specified in several ways. First, a new set is always defined when the specification and/or control limits change between samples, regardless of any other method of specifying sets. Second, the data can contain a categorical variable that indicates set membership. Third, sets can be defined through a cycling sequence number (e.g., a variable that counts the number of samples in an hour): you can specify that a new set is created every time this number returns to a specific value (e.g., 1, indicating the top of the hour).
Note: when specifying sets before the analysis, you can also choose to calculate sigma (for each set) using all the samples in the set, or only the first k samples in the set. When sets are specified in advance, you cannot use the options on the Sets tab of the Results dialog to create more sets; however, you can review summary specifications for all sets by using the available options on that tab.
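As an illustration only, the cycling-sequence rule and the "first k samples" option can be sketched in Python. This is not Statistica code; the function names assign_sets and first_k_per_set, the 0-based set numbering, and the rule that every occurrence of the reset value opens a new set are assumptions made for this sketch.

```python
def assign_sets(cycle_counter, reset_value=1):
    """Return a set index for each sample (hypothetical helper).

    Assumption: a new set starts at every sample (after the first)
    whose cycling counter equals reset_value.
    """
    set_ids = []
    current = 0
    for i, value in enumerate(cycle_counter):
        if i > 0 and value == reset_value:
            current += 1  # counter returned to the reset value: open a new set
        set_ids.append(current)
    return set_ids


def first_k_per_set(set_ids, k):
    """Return the indices of the first k samples in each set (hypothetical helper).

    These would be the samples used to estimate sigma when only the
    first k samples of each set are requested.
    """
    seen = {}
    picked = []
    for i, s in enumerate(set_ids):
        seen[s] = seen.get(s, 0) + 1
        if seen[s] <= k:
            picked.append(i)
    return picked


# Three samples per hour; the counter returns to 1 at the top of each hour.
counter = [1, 2, 3, 1, 2, 3, 1, 2]
set_ids = assign_sets(counter)
computation_idx = first_k_per_set(set_ids, k=2)
```

Here set_ids groups the samples by hour, and computation_idx lists the samples that would feed the sigma estimate for each set when k = 2.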
Specifying sets after the analysis
While monitoring an ongoing process, it often becomes necessary to adjust the center line values or control limits as those values are refined over time. You may also want to compute the control limits and center line values from a set of samples that are known to be in control, and apply those values to all subsequent samples. Thus, each set is defined by a set of computation samples (from which statistics such as sigma and the means are computed) and a set of application samples (to which those statistics are applied). The computation samples and application samples can be (and often are) different. For example, you may want to estimate sigma from a set of samples that are known to be in control (the computation set), and use that estimate to establish control limits for all remaining and new samples (the application set).
Note: each sample must be uniquely assigned to one application set; in other words, each sample has control limits based on statistics (e.g., sigma) computed for one particular set. The assignment of application samples to sets proceeds in a hierarchical manner, i.e., each sample is assigned to the first set where it "fits" (where the definition of the application sample set would include the respective sample). This hierarchical search always begins at the last set that you specified, and not with the All-samples set. Hence, if the user-specified sets encompass all valid samples, the default All-samples set will actually become empty (since all samples will be assigned to one of the user-defined sets).
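The hierarchical assignment described in this note can be sketched as follows. This is a minimal Python illustration, not the module's implementation; the function name, the representation of each set as a (name, start, end) range of 0-based sample indices, and the label "All samples" are assumptions made for this sketch.

```python
def assign_application_sets(n_samples, user_sets):
    """Assign each sample to exactly one application set (hypothetical helper).

    user_sets: list of (name, start, end) tuples with inclusive 0-based
    sample ranges, in the order in which the sets were specified.
    The search starts at the last specified set; samples that fit no
    user-defined set fall back to the default "All samples" set.
    """
    assignment = []
    for i in range(n_samples):
        chosen = "All samples"
        for name, start, end in reversed(user_sets):
            if start <= i <= end:
                chosen = name  # first fit wins, searching from the last set
                break
        assignment.append(chosen)
    return assignment


# Set B was specified after set A, so it wins where the ranges overlap.
sets = [("A", 0, 3), ("B", 2, 5)]
result = assign_application_sets(6, sets)
```

In this example the two user-defined sets cover all six samples, so no sample is left in the default "All samples" set.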

With the Statistica Interactive Quality Control module, you can define multiple sets, either as ranges of samples or by a coding variable specified in the data file.

Sets, and chart statistics that rely on sequences of points
Transition points between sets are handled as follows:
  1. For Moving Range (MR) charts, the first sample after a transition point is ignored in the chart; Sigma is only computed from moving ranges that do not cross set boundaries.
  2. For Moving Average and Exponentially Weighted Moving Average charts (MA, EWMA), the computations start over at each transition point.
  3. CUSUM charts are not affected, because the cumulative sum is computed from the respective means (X-bars) for each set. Hence, the plot points for CUSUM charts are not disrupted by set transitions; they are scaled correctly using the respective set means.
  4. Runs tests are computed ignoring the set transition points. Most runs tests count the number of times that plot points (e.g., means) fall outside a particular "zone", as defined by sigma. For these tests, sigma (as well as center line) values are taken from the set to which each sample belongs, and hence set transitions can be ignored. For the few runs tests that count trends (e.g., the number of consecutive samples with increasing or decreasing values), adjustments to the center line between sets may cause an expected increase or decrease in values from one sample to the next when crossing the set boundary. Such expected ("artificial") trends involve only two samples in a series (namely the two at the set transition point) and usually will not affect the usefulness of the respective runs tests. If transition points between sets are very frequent, however, consider clearing the Increasing or decreasing (trend) and Alternating up and down check boxes on the Runs Tests for Control Charts dialog. See also the Introductory Overview for more details on runs tests.
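
The handling of transition points for MR and EWMA charts (items 1 and 2 above) can be sketched in Python. This is an illustration under stated assumptions, not the module's implementation: the function names are hypothetical, sigma is estimated separately per set as the mean moving range divided by the standard constant d2 = 1.128 for ranges of span 2, and each EWMA restart is seeded with the first value of the new set (a chart could equally be seeded with the set's center line).

```python
D2_SPAN2 = 1.128  # standard d2 constant for moving ranges of span 2


def moving_ranges(values, set_ids):
    """Span-2 moving ranges; None marks the first sample of each set."""
    mrs = [None] if values else []
    for i in range(1, len(values)):
        if set_ids[i] != set_ids[i - 1]:
            mrs.append(None)  # a range here would cross a set boundary
        else:
            mrs.append(abs(values[i] - values[i - 1]))
    return mrs


def sigma_per_set(values, set_ids):
    """Estimate sigma per set from within-set moving ranges only."""
    by_set = {}
    for s, mr in zip(set_ids, moving_ranges(values, set_ids)):
        if mr is not None:
            by_set.setdefault(s, []).append(mr)
    return {s: (sum(r) / len(r)) / D2_SPAN2 for s, r in by_set.items()}


def ewma_with_restart(values, set_ids, lam=0.2):
    """EWMA that starts over at every set transition.

    Assumption: each restart is seeded with the first value of the set.
    """
    out = []
    prev = None
    for i, x in enumerate(values):
        if i == 0 or set_ids[i] != set_ids[i - 1]:
            prev = x  # restart the recursion at the transition point
        else:
            prev = lam * x + (1 - lam) * prev
        out.append(prev)
    return out


values = [10, 12, 11, 20, 22, 21]
set_ids = [0, 0, 0, 1, 1, 1]
mrs = moving_ranges(values, set_ids)
sigmas = sigma_per_set(values, set_ids)
smoothed = ewma_with_restart(values, set_ids, lam=0.5)
```

The jump from 11 to 20 at the set boundary contributes neither a moving range nor a smoothed value, so it inflates neither the sigma estimate nor the EWMA of the second set.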