Text Mining Words Tab

Select the Words tab in the Text Mining dialog box to access options to fine-tune the words and phrases indexed for the final results. Note that you can use the options on the Defaults tab to save or retrieve the settings for these options, and to set the defaults for future analyses.

Note: Selected and unselected words; indexed and non-indexed words. It is important to distinguish between selected and unselected words vs. indexed and non-indexed words. Words or terms can be indexed in the (internal) database without being selected into the word list from which the final results (e.g., singular value decomposition) are computed. The options on this tab pertain to the indexing of words; e.g., stop words specified on this tab are discarded and not indexed (and, hence, cannot be selected).
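
The distinction in the note above can be illustrated with a minimal Python sketch. It is not the program's own implementation: the whitespace tokenization and the selection rule (keep words that occur at least twice) are assumptions made only for this example.

```python
from collections import Counter

stop_words = {"the", "a"}  # discarded before indexing
documents = ["the pump failed", "a pump leaked", "the valve failed"]

# Indexing: every non-stop word enters the index with its frequency.
index = Counter(
    w for doc in documents for w in doc.lower().split() if w not in stop_words
)

# Selection (hypothetical rule): keep only indexed words seen at least twice.
selected = sorted(w for w, n in index.items() if n >= 2)

print(sorted(index))  # ['failed', 'leaked', 'pump', 'valve']
print(selected)       # ['failed', 'pump']
```

Here "leaked" and "valve" are indexed but not selected, while "the" and "a" are never indexed at all.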

Phrases (word combinations treated as single word). Select this check box to search for multiple words as phrases, so that the entire phrase is treated as a separate term during indexing (e.g., Microsoft Windows should be treated as a phrase, while Microsoft and Windows could also be indexed as separate terms). After you select this check box, the Edit and Select buttons are enabled.
Button    Description
Edit      Click this button to display the Phrase editor, where you can edit the list of phrases to be included for indexing (one phrase per line). This option is only available if the Phrases (word combinations treated as single word) check box is selected.
Select    Click this button to display the Open phrase (text) file dialog box, where you can locate and select a file including the phrases for indexing. These should be simple text files with a single phrase per line. This option is only available if the Phrases (word combinations treated as single word) check box is selected.
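
As a rough illustration of what treating a phrase as a single term means, the following Python sketch reads phrases in the one-phrase-per-line format and merges matching word sequences into one term. The function names `load_phrases` and `tokenize_with_phrases` are hypothetical, and lowercase matching on whitespace-separated words is an assumption; the program's actual matching rules may differ.

```python
def load_phrases(text):
    """Parse phrase-file content: one phrase per line, blank lines ignored."""
    return [line.strip().lower().split() for line in text.splitlines() if line.strip()]


def tokenize_with_phrases(document, phrases):
    """Split on whitespace, emitting each listed phrase as a single term."""
    words = document.lower().split()
    # Try longer phrases first so the longest match wins.
    phrases = sorted(phrases, key=len, reverse=True)
    terms = []
    i = 0
    while i < len(words):
        for phrase in phrases:
            if words[i:i + len(phrase)] == phrase:
                terms.append(" ".join(phrase))
                i += len(phrase)
                break
        else:
            # No phrase starts at this position: keep the single word.
            terms.append(words[i])
            i += 1
    return terms


phrase_file = "Microsoft Windows\nNew York\n"
phrases = load_phrases(phrase_file)
terms = tokenize_with_phrases("Microsoft Windows runs in New York offices", phrases)
print(terms)  # ['microsoft windows', 'runs', 'in', 'new york', 'offices']
```

This sketch emits only the combined phrase; it does not model whether the component words are also indexed as separate terms.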
Stop words (discarded, excluded from indexing). Select this check box to exclude non-informative or non-diagnostic terms during indexing and, hence, from the analyses and final results. After you select this check box, the Edit and Select buttons are enabled.
Button    Description
Edit      Click this button to display the Stop-word editor, where you can edit the list of stop words (one word or term per line). This option is only available if the Stop words (discarded, excluded from indexing) check box is selected.
Select    Click this button to display the Open stop-word (text) file dialog box, where you can select a file that includes the list of stop words. This option is only available if the Stop words (discarded, excluded from indexing) check box is selected. For most languages, a default list is supplied (e.g., EnglishStopList.txt) that includes the most common words, such as the English "a," "the," and "also." These files can be edited further via the Edit button.
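
The effect of a stop-word list can be sketched in a few lines of Python. This is only an illustration under assumptions (case-insensitive matching, whitespace tokenization); the helper names `load_word_list` and `remove_stop_words` are hypothetical and not part of the program.

```python
def load_word_list(text):
    """Parse list-file content: one word or term per line, blank lines ignored."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def remove_stop_words(words, stop_words):
    """Drop every word found in the stop list (case-insensitive)."""
    return [w for w in words if w.lower() not in stop_words]


stop_file = "a\nthe\nalso\n"  # same layout as a stop-word text file
stop_words = load_word_list(stop_file)
kept = remove_stop_words("The index also stores a term".split(), stop_words)
print(kept)  # ['index', 'stores', 'term']
```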
Synonyms (replace, combine words). Select this check box to specify words that are to be treated as synonyms during indexing and when computing results. For example, you could combine the words "supper" and "dinner" as synonyms, and count each as a reference to meals consumed in the late afternoon or evening. After you select this check box, the Edit and Select buttons are enabled.
Button    Description
Edit      Click this button to edit the synonym list (one set of synonyms per line). This option is available only when the Synonyms (replace, combine words) check box is selected.
Select    Click this button to display the Open synonym (text) file dialog box, where you can locate and select a file containing the synonym list for indexing. This option is only available when the Synonyms (replace, combine words) check box is selected.
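
To show how combining synonyms changes term counts, here is a minimal Python sketch. The text above states only that each line holds one set of synonyms; the comma separator and the choice of the first term on a line as the combined term are assumptions made for this example, and `load_synonyms` is a hypothetical helper.

```python
from collections import Counter


def load_synonyms(text):
    """Map every term on a line to the first term on that line.

    Assumes comma-separated terms; the real file's separator may differ.
    """
    mapping = {}
    for line in text.splitlines():
        terms = [t.strip().lower() for t in line.split(",") if t.strip()]
        for term in terms:
            mapping[term] = terms[0]
    return mapping


synonym_file = "dinner, supper\ncar, automobile\n"
synonyms = load_synonyms(synonym_file)
words = "Supper was late so dinner was cold".lower().split()
# Replace each word by its combined term before counting.
counts = Counter(synonyms.get(w, w) for w in words)
print(counts["dinner"])  # 2
```

Both "supper" and "dinner" are counted under the single term "dinner", so "supper" no longer appears as a separate term.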
Inclusion words (words not in this list are discarded). Select this check box to specify the words and terms that are to be indexed and included in the analyses. These options are useful when you want to use an a priori list of words or terms and enumerate the frequencies with which these occur in the input documents. After you select this check box, the Edit and Select buttons are enabled.
Button    Description
Edit      Click this button to display the Inclusion word editor, where you can edit the word or term list for the analyses. This option is only available if the Inclusion words (words not in this list are discarded) check box is selected.
Select    Click this button to display the Open inclusion word (text) file dialog box, where you can locate and select a file including the words and terms that are to be indexed, selected, and included in the analyses. The file should be a simple text file, where each term or word is placed on a separate line. This option is only available if the Inclusion words (words not in this list are discarded) check box is selected.
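
Enumerating the frequencies of an a priori word list, as described for the inclusion words, might look like the following Python sketch. It is an illustration only: `count_inclusion_words` is a hypothetical function, and lowercase whitespace tokenization is an assumption rather than the program's documented behavior.

```python
from collections import Counter


def count_inclusion_words(documents, inclusion_words):
    """Count, per document, only the terms that appear in the inclusion list."""
    rows = []
    for doc in documents:
        counts = Counter(w for w in doc.lower().split() if w in inclusion_words)
        # Report every inclusion word, with 0 where it does not occur.
        rows.append({term: counts[term] for term in sorted(inclusion_words)})
    return rows


inclusion_file = "engine\nbrake\n"  # one term per line
inclusion_words = {
    line.strip().lower() for line in inclusion_file.splitlines() if line.strip()
}
docs = ["Engine noise and engine stall", "Brake pedal feels soft"]
table = count_inclusion_words(docs, inclusion_words)
print(table)  # [{'brake': 0, 'engine': 2}, {'brake': 1, 'engine': 0}]
```

All words outside the list ("noise," "pedal," and so on) are discarded and never counted.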