In-Database Analytics Workspace Nodes

In Database Analytics: Overview
As data volumes keep growing, it becomes more challenging to move the data across the network for analysis. It can even be impossible if the size of the data exceeds the hardware limits of the application server used for the analysis. This calls for bringing the analysis to the data, in a way that uses the technical resources of the database system and, when that system consists of multiple nodes, possibly distributes the computations across them. Consider a simple example of how this can be accomplished. Assume that the PredictorsOfYield table is uploaded to a Microsoft SQL Server database, and we need to compute the correlation between two columns: Yield and Param_3_avgval1. Traditionally, we would extract the data using, e.g., the following query:
In Database Correlation Matrices Specifications: Quick Tab
Use this node to create correlation and partial correlation matrices for the selected continuous variables and an optional weight variable.
In Database Enterprise Data Configuration: Specifications Tab
Workspace Node: Descriptive Statistics - Results - Advanced Tab / In-Database Descriptive Statistics - Specifications - Advanced Tab
In Database Filter Duplicate Cases: Specifications Tab
In Database Logistic Regression: Specifications Quick Tab
In Database Multiple Regression: Specifications Quick Tab
WS Node - Random Sample Filtering - Specifications Tab / In-Database Random Sample Filtering - Specifications Tab
In Database Sort: Specifications Tab
Write to Database: Specifications Tab
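
The idea described in the overview, computing the correlation between Yield and Param_3_avgval1 inside the database rather than extracting every row, can be illustrated with a minimal sketch. It is not the query used by the workspace nodes. It assumes an in-memory SQLite database as a stand-in for Microsoft SQL Server, a handful of made-up sample rows, and a hypothetical helper named in_database_correlation. Only six aggregate values travel back to the client, regardless of how many rows the table holds.

```python
import math
import sqlite3

# Assumption: an in-memory SQLite database stands in for Microsoft SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PredictorsOfYield (Yield REAL, Param_3_avgval1 REAL)")

# Hypothetical sample rows, for illustration only.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
conn.executemany("INSERT INTO PredictorsOfYield VALUES (?, ?)", rows)


def in_database_correlation(conn):
    """Pearson correlation computed from aggregates evaluated by the database."""
    # The database scans the table and returns a single row of sums.
    n, sx, sy, sxx, syy, sxy = conn.execute(
        """
        SELECT COUNT(*),
               SUM(Yield),
               SUM(Param_3_avgval1),
               SUM(Yield * Yield),
               SUM(Param_3_avgval1 * Param_3_avgval1),
               SUM(Yield * Param_3_avgval1)
        FROM PredictorsOfYield
        WHERE Yield IS NOT NULL AND Param_3_avgval1 IS NOT NULL
        """
    ).fetchone()
    # Combine the sums on the client: r = cov(x, y) / sqrt(var(x) * var(y)).
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / math.sqrt(var_x * var_y)


r = in_database_correlation(conn)
print(round(r, 4))
```

The sums-of-products form keeps the query to a single pass over the table, but it can lose precision when values are large relative to their spread; centering the columns first is a common alternative.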
