PCA
The Principal Component Analysis (PCA) operator generates an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables.
Information at a Glance
| Parameter | Description |
|---|---|
| Category | Model |
| Data source type | TIBCO® Data Virtualization |
| Send output to other operators | Yes |
| Data processing tool | TIBCO® DV, Apache Spark 3.2 or later |
Algorithm
PCA is an orthogonal linear transformation that maps the data into a new coordinate system. In that system, the projection of the data with the greatest variance lies on the first coordinate (called the first principal component), the projection with the second greatest variance lies on the second coordinate, and so on, until either the number of coordinates or a preset maximum number of principal components is reached.
This operator applies a Center and Scale transformation to the selected columns before generating the principal components. It generates the full number of principal components.
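The steps described above can be sketched in Python as follows. This is a minimal illustration, not the operator's implementation: it assumes NumPy is available, and the function name `pca_center_scale`, the synthetic input data, and the choice of the sample standard deviation for scaling are all assumptions made for the example.

```python
import numpy as np

def pca_center_scale(X):
    """Center and scale each column, then compute all principal components.

    Hypothetical helper for illustration only.
    Returns (components, variances, scores).
    """
    X = np.asarray(X, dtype=float)
    # Center and scale: subtract the column mean, divide by the sample std.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # The covariance of standardized data is the correlation matrix.
    cov = np.cov(Z, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so that the component with the greatest variance comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Project the standardized data onto the new coordinate system.
    scores = Z @ eigvecs
    return eigvecs, eigvals, scores

# Synthetic data (assumption): two correlated columns and one independent column.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(200, 1)),
    2.0 * base + 0.5 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

components, variances, scores = pca_center_scale(X)
print(components.shape)  # one column per principal component
print(variances)         # variance along each component, largest first
```

The columns of `components` are mutually orthogonal, and the projected columns in `scores` are uncorrelated, which matches the description of PCA as an orthogonal transformation into uncorrelated variables.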
Input
The input is a single tabular data set.
Configuration
The following table provides the configuration details for the PCA operator.
| Parameter | Description |
|---|---|
| Notes | Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator. |
| Continuous Predictors | Specify the numerical data columns to use as independent columns. Only numerical columns are allowed. Click Select Columns to select the required columns. |
| Use all available columns as Predictors | When set to Yes, the operator enables the wildcard feature. When set to No, users must select at least one column for Continuous Predictors. |
Output
- Components: Displays the component matrix used for generating the principal components.
- Variance: Captures information on variance explained by each principal component (in descending order) alongside a cumulative total of variance explained.
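To illustrate how a variance summary of this kind can be derived, the following sketch computes the proportion of variance explained by each component and the running cumulative total. The function name `variance_table` and the input values are hypothetical; the values are made up for illustration and are not output from the operator, and the exact columns the operator produces may differ.

```python
def variance_table(variances):
    """Return (component, variance, proportion, cumulative) rows, largest first.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    rows = []
    cumulative = 0.0
    for i, value in enumerate(ordered, start=1):
        proportion = value / total
        cumulative += proportion
        rows.append((f"PC{i}", value, proportion, cumulative))
    return rows

# Made-up variances for four standardized columns (assumption, not real output).
example_variances = [2.0, 1.0, 0.75, 0.25]
table = variance_table(example_variances)

for name, value, proportion, cumulative in table:
    print(f"{name}: variance={value:.2f} "
          f"proportion={proportion:.4f} cumulative={cumulative:.4f}")
```

The cumulative column always ends at 1.0 when every principal component is included, which is the case when the full number of components is generated.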
Example
The following example demonstrates the PCA operator. It uses the following input columns:
- Sepal length
- Sepal width
- Petal length
- Petal width

The parameter settings are as follows:
- Continuous Predictors: sepal_length,sepal_width,petal_length,petal_width
- Use all available columns as Predictors: No