Workspace Node: Text Mining - Results - Summary Tab
In the Text Mining node dialog box, under the Results heading, select the Summary tab to access options to review the main "raw" results of the text mining analysis - the frequencies or transformed frequencies for all selected words and documents, as well as the document frequencies. Options are also available for browsing summaries for the documents, i.e., the words that were indexed for each input document. See also the Introductory Overview.
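
As an illustration of the quantities named above, the following is a minimal Python sketch, not Statistica code. It builds a small word-by-document frequency matrix and derives the total word frequencies and the document frequencies from it. The sample documents, the word list, and the function name `term_document_matrix` are hypothetical, and the sketch assumes plain whitespace tokenization with no stemming.

```python
from collections import Counter

def term_document_matrix(documents, selected_words):
    """Return (matrix, totals, doc_freq) for the selected words.

    matrix[i][j] is the raw count of selected_words[j] in documents[i].
    """
    matrix = []
    for text in documents:
        # Assumption: lower-cased whitespace tokens stand in for indexed words.
        counts = Counter(text.lower().split())
        matrix.append([counts[word] for word in selected_words])
    # Total word frequency: sum of each column over all documents.
    totals = [sum(column) for column in zip(*matrix)]
    # Document frequency: number of documents in which the word occurs.
    doc_freq = [sum(1 for c in column if c > 0) for column in zip(*matrix)]
    return matrix, totals, doc_freq

# Hypothetical sample data.
docs = [
    "the engine stalled and the engine overheated",
    "brakes squeal when cold",
    "engine noise and brakes fade",
]
words = ["engine", "brakes", "cold"]
matrix, totals, doc_freq = term_document_matrix(docs, words)
```

Here each row of `matrix` corresponds to one document and each column to one selected word; a transformed frequency measure would be applied to these raw counts.
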
Element Name | Description |
---|---|
Frequency matrix: word <=> document | Select this check box to produce a term-document matrix spreadsheet with the summary word frequencies or transformations of frequencies (see the options for Frequency measure on the Results - Frequency Measure tab). Only selected words will be shown. |
Generate a new spreadsheet with current results | Select this check box to produce a stand-alone input spreadsheet with the results of the current analysis, along with selected other variables from the current input file. A variable selection dialog box will then be displayed, where you can select the variables to save along with the text mining results (word frequencies or transformed frequencies, and SVD scores if available). The program will then create an input spreadsheet that can be used for subsequent analyses using the various facilities available in Statistica and Statistica Data Miner. |
Selected words | Select this check box to produce a results spreadsheet with the total word frequencies and document frequencies for selected words/phrases/stems (see stemming). This spreadsheet is used for deployment. |
Indexed documents | Select this check box to produce a results spreadsheet that shows the document size in number of characters, number of words, and actual (stemmed and indexed) words for each document in the analysis (either in a single text column or in multiple columns, one for each word). |
As report | Select this check box to produce a report containing a list of stemmed and indexed terms for each document; if the documents are external files, a link is added for each document, which you can click to display that document. The terms are listed in the order in which they were encountered in the respective document, not in order of frequency. The list is limited to 255 characters in total; if a document contains more terms than fit, the list ends with an ellipsis ("...") to denote that it is not complete. |
Each word in a separate column | Select this check box to display each word in a separate column. |
Only those that satisfy the search query | Select this check box to enable filtering of the Indexed documents spreadsheet and As report outputs by limiting them to documents containing the term entered into the Search edit box. |
Search | See Only those that satisfy the search query, described above. |
Options / C / W | See Common Options. |
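
The behavior described for the Indexed documents, As report, and Only those that satisfy the search query options can be sketched in Python as follows. This is an illustration only, not Statistica code: all function names and sample data are hypothetical. The sketch assumes that each indexed term is listed once, at its first occurrence, and that an over-long list is cut at 255 characters before the ellipsis is appended; the text above does not specify either detail.

```python
MAX_CHARS = 255  # limit stated in the text; how the cut is made is an assumption

def indexed_terms(text, index):
    """Indexed terms of one document, in order of first occurrence."""
    seen = []
    for token in text.lower().split():
        if token in index and token not in seen:
            seen.append(token)
    return seen

def term_list(terms, max_chars=MAX_CHARS):
    """Join terms into one string; truncate and append '...' if too long."""
    joined = " ".join(terms)
    if len(joined) <= max_chars:
        return joined
    return joined[:max_chars] + "..."

def filter_by_term(documents, index, search_term):
    """Return (document number, term list) for documents containing search_term."""
    rows = []
    for number, text in enumerate(documents, start=1):
        terms = indexed_terms(text, index)
        if search_term in terms:
            rows.append((number, term_list(terms)))
    return rows

# Hypothetical sample data.
docs = [
    "the engine stalled and the engine overheated",
    "brakes squeal when cold",
    "engine noise and brakes fade",
]
index = {"engine", "brakes", "cold"}
rows = filter_by_term(docs, index, "brakes")
```

With this sample data, only the second and third documents contain the search term, so `rows` holds two entries, each listing that document's indexed terms in the order encountered.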