Workspace Node: Text Mining - Specifications - Quick Tab

The Text Mining workspace node can be accessed from the Feature Finder, the ribbon bar, or the Node Browser. The Specifications - Quick tab is displayed by default when the specifications dialog box is opened. See also the Introductory Overview.

| Element Name | Description |
|---|---|
| Retrieve text from | Use the options in this group box to specify the source from which to retrieve text. |
| Spreadsheet | Available only when an active data file is open. Select this option button to use the current active data set (e.g., Statistica Spreadsheet, Statistica Streaming DB Connector, or Excel spreadsheet), and then click the Text variable(s) button, described next. |
| Text variable(s) | Click the Text variable(s) button to select the variables containing the unstructured/textual information to be analyzed. |
| Files | Select this option button to use document files (e.g., .txt, .doc, .rtf, .pdf, .html, .htm, or .xml) for the Text Mining analysis, and then click the Browse documents button, described next. |
| Browse documents | Click the Browse documents button to display the Select documents dialog box, where you can select external text/document files or URL (Web) addresses for the Text Mining analysis. |
| Paths in spreadsheet | Available only when the Files option button is selected and an active data file is open. When using a spreadsheet containing text variables with file paths (or URL links), select the Paths in spreadsheet check box, and then click the Document paths button, described next. Note that the Web Crawling analysis is used to create an input spreadsheet with text variables containing such file paths or links. |
| Document paths | Available only when an active data file is open. Click the Document paths button to display the Select Variables dialog box, where you can select a variable containing file paths or links. |
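
The Paths in spreadsheet option expects a text variable whose cases hold file paths or links. Besides the Web Crawling analysis, such a table could also be prepared outside Statistica. The following Python sketch is one possible way to do that and is not part of the product: it collects files with the extensions listed for the Files option and writes them to a single-column CSV file. The function names, the column name DocumentPath, and the assumption that the CSV file is then opened as the active data file are all illustrative.

```python
import csv
from pathlib import Path

# Extensions taken from the Files option description; adjust as needed.
DOC_EXTENSIONS = {".txt", ".doc", ".rtf", ".pdf", ".html", ".htm", ".xml"}


def collect_document_paths(folder):
    """Return sorted absolute paths of document files found under folder."""
    root = Path(folder)
    return sorted(
        str(p.resolve())
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in DOC_EXTENSIONS
    )


def write_path_table(paths, out_csv, column="DocumentPath"):
    """Write one path per row under a single column (hypothetical name)."""
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow([column])  # header row becomes the variable name
        for path in paths:
            writer.writerow([path])
    return len(paths)
```

For example, calling write_path_table(collect_document_paths(folder), "document_paths.csv") for a folder of your choice produces a file with one path per case; the variable it contains is what you would select with the Document paths button.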

Options / C / W. See Common Options.

See also: Specifications - Advanced tab, Specifications - Words tab, Specifications - Projects tab, Specifications - Filters tab, Specifications - Characters tab, Specifications - Delimiters tab, Results - Frequency measure tab, Results - Summary tab, Results - Concept extraction tab, Results - Search tab, Downstream tab, and Home tab.