Data Mining with Statistica

Auto-Updating (Running) Workspace Projects (e.g., When the Data Change)
Controlling the Flow of Data through Nodes: Some Special Issues and Considerations
The Statistica workspace environment is extremely flexible and can be customized to accept a variety of data sources, with or without case selection conditions and case weights.
Data Mining with Statistica Data Miner
Data Mining Overview
Data Mining Very Large Data Sets (Databases): Scalability of Statistica Data Miner
Data Warehousing
Data warehousing is a process of organizing the storage of large, multivariate data sets in a way that facilitates the retrieval of information for analytic purposes.
Exploratory Data Analysis (EDA) and Data Mining Techniques
Getting Started with Statistica Data Miner
To use the Statistica data mining tools, follow these steps:
Global Dictionary
How to Write Custom Workspace Nodes
A custom workspace node requires two files: an *.svx file that defines the operation of the node and a *.dmi file that defines the node’s user interface. Both files must share the same base file name and differ only in the file extension.
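The pairing rule above can be checked mechanically before a node is installed. The following is a minimal sketch, not part of Statistica: the function name `find_unpaired_node_files` is hypothetical, and the case-insensitive comparison of file names is an assumption based on typical Windows file-system behavior.

```python
from pathlib import Path


def find_unpaired_node_files(folder):
    """Return (svx_without_dmi, dmi_without_svx) as sorted lists of base names.

    Hypothetical helper: it only compares file names in one folder and
    knows nothing about the contents of the *.svx or *.dmi files.
    """
    folder = Path(folder)
    # Assumption: names are compared case-insensitively, as on Windows.
    svx = {p.stem.lower() for p in folder.iterdir() if p.suffix.lower() == ".svx"}
    dmi = {p.stem.lower() for p in folder.iterdir() if p.suffix.lower() == ".dmi"}
    return sorted(svx - dmi), sorted(dmi - svx)
```

An empty pair of lists means every *.svx file in the folder has a matching *.dmi file and vice versa.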
On-Line Analytic Processing (OLAP)
The term On-Line Analytic Processing (OLAP) [or Fast Analysis of Shared Multidimensional Information (FASMI)] refers to technology that allows users of multidimensional databases to generate on-line descriptive or comparative summaries (views) of data and other analytic queries. Note that despite its name, analyses referred to as OLAP do not need to be performed truly on-line (or in real-time); the term applies to analyses of multidimensional databases (that can, obviously, contain dynamically updated information) through efficient multidimensional queries that reference various types of data.
Statistica Data Miner Summary
Statistica Data Miner is an extremely comprehensive and effective system of user-friendly tools for the entire data mining process, from querying databases to generating the final reports.
Structure and user interface of Statistica Data Miner
Statistica Data Miner is based on libraries of more than 250 different nodes that contain the complete functionality of Statistica, as well as specialized methods and functions for data mining.
The Client-Server Version of Statistica Data Miner and Data Mining via Statistica Enterprise Server
Using Statistica Data Miner with Extremely Large Data Sets
Using C/C++, C#, and Java Code for Deployment
Practically all Statistica modules will generate C/C++, C#, and Java code for trained models, which can be incorporated into compiled programs to compute predictions or predicted classifications for new observations. Writing and debugging compiled programs requires some experience and experimentation to make sure that all information is passed and processed as expected. As a general initial recommendation, we strongly urge you to verify the predictions computed by your specific compiled programs by comparing them to those computed, for example, by the Rapid Deployment of Predictive Models module (which computes predictions or predicted classifications from PMML-based deployment files).
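One way to carry out the recommended verification is to export the predictions from both sources and compare them case by case. The sketch below is illustrative only: the function name `compare_predictions` and the tolerance defaults are assumptions, and it presumes that both sets of predictions have already been loaded into lists in the same case order (how they are exported is not covered here).

```python
import math


def compare_predictions(deployed, reference, rel_tol=1e-9, abs_tol=1e-12):
    """Return a list of (index, deployed_value, reference_value) mismatches.

    Hypothetical helper: numeric predictions are compared within a
    tolerance, predicted class labels (strings) must match exactly.
    """
    if len(deployed) != len(reference):
        raise ValueError("prediction lists differ in length")
    mismatches = []
    for i, (d, r) in enumerate(zip(deployed, reference)):
        if isinstance(d, str) or isinstance(r, str):
            same = d == r  # predicted classifications: exact match
        else:
            same = math.isclose(d, r, rel_tol=rel_tol, abs_tol=abs_tol)
        if not same:
            mismatches.append((i, d, r))
    return mismatches
```

An empty result indicates agreement within the chosen tolerance; any reported index points to a case whose inputs are worth tracing through the compiled program.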
Working with Text Variables and Text Values: Ensuring Consistent Coding
Data Mining - Workspaces Menu Commands
Statistica Workspace
Data Miner Nodes Dialogs
Examples
Statistica™ Live Score
Execute External Workspace Node
The Statistica Workspace is a graphical data preparation and analysis environment, which allows you to create, view and edit a symbolic representation of a flow from the input data to the final results and models. Such flows sometimes become very complex and hard to read when deployed to production or business processes with complex data structures. It is also difficult to collaborate with others when the whole workflow is presented on a single workspace.
