The Team Studio Environment
Glossary
Workspaces
Workspace Tabs
Overview Tab
Workspace Stages
Workspace Roles
Data Sources Tab
Associating a Data Source with a Workspace from Your Workspace
Associating a Data Source with a Workspace from the Data Section
Data Tab
Exploring a Data Source
Previewing Data in a Workspace Table
Visualizing Data in a Workspace Table
Analyzing Data in a Workspace Table
Associating Datasets with a Workspace
Importing Data into a Workspace’s Sandbox
Importing Associated Data
Importing Oracle Datasets
Importing Hadoop Datasets
Scheduling Recurring Imports
Work Files Tab
Creating a Work File
Editing a Work File
Searching for a Work File
Copying a Work File to a Workspace
Importing a Work File
Deleting a Work File
Creating or Importing a SQL Work File
SQL Editor
Running a SQL Work File
Version Control in SQL Work Files
Jobs Tab
Viewing the Jobs List
Creating a Job
Adding Tasks to a Job
Running a Job
Viewing Job Results
Milestones Tab
Viewing a Milestone
Creating a Milestone
Engines Tab
Viewing the Engines List
Creating an Engine
Deploying and Governing an Engine
Testing an Engine
Workspace Activities
Creating a Workspace
Workflows
Workflow Editor
Workflow Actions
Your Workflow Results
Refresh Metadata
Flow History
Export Flow
Workflow Variables
Clear Temporary Data
Convert to Spark/Revert to Non-Spark
Output All Results: Table or View
Preferences
Manage Custom Operators
System Logs
Workflow Menu
Data Explorer
Browse Hadoop Sources
Browse Database Sources
Data Sources
Operator Explorer
Operator Help
Creating a New Workflow
Importing Database Data into a Workflow
Importing Hadoop Data into a Workflow
Explore Visual Results
Navigating the Results Panel
Plotly Charts and Graphs
Spark Optimization for Data Scientists
Spark Autotuning
Settings for Spark-Enabled Operators
Advanced Settings Dialog Box
alpine.conf Spark Settings
Spark Values
Team Studio-Specific Spark Values
YARN Configuration Values
Deploying Models and Workflows
Moving a Workflow from Development to Production
Preparing Data and Deploying Models
Optimizing Models
Batch Model Scoring
Real-Time Model Scoring
Code Generation
Model Management
Workflow Scheduling
Preview and Visualize Data
Preview
Show Table Metadata or Inspect Hadoop File Properties
Scatter Plot Chart
Bar Chart
Univariate Plot Chart
Box and Whisker Chart
Histogram
Summary Statistics (right-click)
Time Series Chart
Correlation Analysis
Frequency Analysis
Running a Workflow
Stepping Through a Workflow
Stopping a Workflow
Clearing a Workflow
Saving a Workflow
Reverting a Workflow
Running a Flow in Local Mode
Running Workflow Branches in Parallel
Handling Bad Data in Hadoop
Viewing Workflow Results
Saving Flow Output
Results Management
Downloading Results
Viewing Database SQL
Workflow Variables
Defining New Workflow Variables
Overriding Hadoop Data Source Parameters Using Workflow Variables
Team Studio Operator Job Names
Touchpoints
Creating a Touchpoint
Run Settings Tab
Adding Parameters to a Touchpoint
Testing a Touchpoint
Downloading Touchpoint Test Results
Running a Touchpoint
Publishing a Touchpoint to the Catalog
Touchpoint Parameters
Text Touchpoint Parameter
Multiline Text Touchpoint Parameter
Number Touchpoint Parameter
Single-Select Option Touchpoint Parameter
Multiple-Select Option Touchpoint Parameter
Date/Time Touchpoint Parameter
Search
Search Page Options
Tags
Viewing Tags
Navigating with Tags
Adding and Editing Tags
Deleting a Tag
Renaming a Tag
Jupyter Notebooks
Creating a Jupyter Notebook
Installing Python Packages
Python Packages Required for Jupyter Notebooks in Team Studio
Initializing PySpark
Creating a Custom Environment for Running Jupyter Notebooks
Uploading and Running the Conda Environment Example
Adding Your Data to a Notebook
Incorporating Notebooks in a Workflow
Workflow Operators
Operator Actions
Connecting Operators
Selecting Multiple Operators
Editing Operator Properties
Deleting Operators
Moving Connections
Deleting Connections
Data Management
Selecting Groups of HDFS Files
Data Exploration
Visualizing Data with Charts and Graphs
Explore Visual Results
Navigating the Results Panel
Plotly Charts and Graphs
Correlation and Covariance
Information Value and Weight of Evidence Analysis
Data Transformation
Aggregation Methods for Batch Aggregation
Outliers in Numerical Data
Creating a Join Condition for a Database Join
Key-Value Pairs Parsing Example Using the Variable Operator
datetime Format Conversion Examples
Data Modeling and Model Validation
Cluster Analysis Using K-Means
K-Means Use Case
Patterns in Data Sets
Alpine Forest Operators
Ensemble Decision Tree Modeling with Alpine Forest
Model Export Formats
Fitting a Trend Line for Linearly Dependent Data Values
Linear Regression Use Case (1)
Linear Regression Use Case (2)
Probability Calculation Using Logistic Regression
Logistic Regression Use Case (1)
Logistic Regression Use Case (2)
Classification Modeling with Decision Tree
Decision Tree and CART Operator General Principles
Decision Tree Concept of Purity
Information Gain
Pruning or Pre-Pruning
Differences in Decision Tree Algorithms
Decision Tree Output Troubleshooting
Decision Tree Use Case
Classification Modeling with Naive Bayes
Naive Bayes Use Case
Computed Metrics and Use Case for the Regression Evaluator
Collaborative Filtering
Prediction Threshold
Principal Component Analysis
Support Vector Machine Classification
SVM Use Case
T-Tests
Independent Samples T-Test Use Case
Paired Samples T-Test Use Case
Single Sample T-Test Use Case
Testing Models for Performance Decay
Prediction
Prediction and Modeling Operator Pairings
Pearson's Chi Square Operations
Spark Node Fusion
Viewing Results for Individual Operators
Specialized Tools
Natural Language Processing Tools
Using the Results of Text Featurizer
Unsupervised Text Mining
LDA Training and Model Evaluation Tips
NLP Use Case
Test Corpus Parsing
Using Pig User-Defined Functions
DateTime Input Values
Setting Up Notebooks for Python Execute
R Execute
R Execute Error Messages
Syntax Errors in the R Script
Logical Errors in the User's R Script
Input Data Size Limitations
Output Data Size Limitations
Network Issues
Missing Output Reference in the R Script
Column Name or Type Mismatches
Type Coercion Error
Workflow Operator Reference
Core Operators
Data Operators
Copy Between Databases
Copy To Database
Copy To Hadoop
Dataset
Hadoop File
Hive Table
Import Excel (DB)
Import Excel (HD)
Load To Hive
Exploration Operators
Bar Chart
Box Plot
Correlation (DB)
Correlation (HD)
Frequency
Histogram
Information Value
Line Chart
Scatter Plot Matrix
Summary Statistics (DB)
Summary Statistics (HD)
Variable Selection (DB)
Variable Selection (HD)
Transformation Operators
Aggregation (DB)
Aggregation (HD)
Batch Aggregation
Collapse
Column Filter (DB)
Column Filter (HD)
Correlation Filter (DB)
Correlation Filter (HD)
Distinct (DB)
Distinct (HD)
Fuzzy Join
Join (DB)
Join (HD)
Normalization (DB)
Normalization (HD)
Null Value Replacement (DB)
Null Value Replacement (HD)
Numeric to Text (DB)
Numeric to Text (HD)
One-Hot Encoding
Pivot (DB)
Pivot (HD)
Reorder Columns (DB)
Reorder Columns (HD)
Replace Outliers (DB)
Replace Outliers (HD)
Row Filter (DB)
Row Filter (HD)
Sessionization
Set Operations (DB)
Set Operations (HD)
Sort By Multiple Columns
Time Series SAX Encoder
Transpose
Unpivot (DB)
Unpivot (HD)
Unstack
Variable (DB)
Variable (HD)
Wide Data Variable Selector - Correlations
Wide Data Variable Selector - Chi Square / Anova
Window Functions - Aggregate
Window Functions - Lag/Lead
Window Functions - Rank
Sampling Operators
Random Sampling (DB)
Random Sampling (HD)
Resampling
Sample Selector
Stratified Sampling
Modeling Operators
Alpine Forest - MADlib
Alpine Forest Classification
Alpine Forest Predictor - MADlib
Alpine Forest Regression
ARIMA Time Series (DB)
ARIMA Time Series (HD)
Association Rules
Collaborative Filter Trainer
Decision Tree
Decision Tree - MADlib
Decision Tree Classification - CART
Decision Tree Regression - CART
Elastic Net Linear - MADlib
Elastic Net Logistic - MADlib
Generalized Linear Regression Models
Gradient Boosting Classification
Gradient Boosting Regression
K-Means (DB)
K-Means (HD)
K-Means Clustering - MADlib
Linear Regression (HD)
Linear Regression (DB)
Linear Regression - MADlib
Logistic Regression (DB)
Logistic Regression (HD)
Logistic Regression - MADlib
Naive Bayes (DB)
Naive Bayes (HD)
Neural Network
PCA (DB)
PCA (HD)
SVM Classification
NLP Operators
N-gram Dictionary Builder
N-gram Dictionary Loader
Text Extractor
Text Featurizer
Stop Words
LDA Predictor
LDA Trainer
Prediction Operators
Chi Square, Goodness of Fit
Chi Square, Independence Test
Classifier (DB)
Classifier (HD)
Collaborative Filter Predictor
Collaborative Filter Recommender
K-Means Predictor - MADlib
PCA Apply
Predictor (DB)
Predictor (HD)
Model Validation Operators
Alpine Forest Evaluator
Classification Threshold Metrics
Confusion Matrix
Goodness of Fit
Lift (DB)
Lift (HD)
Regression Evaluator (DB)
Regression Evaluator (HD)
ROC
T-Test - Independent Samples
T-Test - Paired Samples
T-Test - Single Sample
Tool Operators
Convert
Export
Export to Excel (DB)
Export to Excel (HD)
Export to FTP
Export to SBDF (DB)
Export to SBDF (HD)
Flow Control
HQL Execute
Load Model
Note
Pig Execute
Python Execute (DB)
Python Execute (HD)
R Execute (DB)
R Execute (HD)
SQL Execute
Sub-Flow
Operator Dialog Boxes
Advanced Parameter Configuration Dialog Box
Advanced Settings Dialog Box
Bin Configuration Dialog Box
Choose Collapse Columns Dialog Box
Configure Columns Dialog Box
Configure Columns: Text Files
Configure Columns: XML and JSON Files
Configure Columns: Log Files
Define Column Aggregations Dialog Box
Define Filter Dialog Box
Define Join Conditions Dialog Box (Hadoop)
Define Pig Script Dialog Box
Define Quantile Variables Dialog Box
Define R Script Dialog Box
Define Sample Size Dialog Box
Define Sets Dialog Box
Define SQL Statement Dialog Box
Define Variables Dialog Box
Edit Table Columns Dialog Box
Input Table Mapping Dialog Box
Interaction Parameters Dialog Box
Join Properties - Database Dialog Box
Key Columns Dialog Box
Null Value Replacement Configuration Dialog Box (DB)
Null Value Replacement Configuration Dialog Box (HD)
Ordered Columns Dialog Box
Results File Structure Dialog Box
Select Columns Dialog Box
Storage Parameters Dialog Box
Store Intermediate Results
Sub-Flow Variable Dialog Box
Window Column Configuration Dialog Box
Operator Compatibility
Operator and Data Source Compatibility
Hadoop Data Source and Operator Compatibility
AWS EMR
Cloudera Hadoop
Dataproc
Hive Hadoop
Hortonworks
MapR Hadoop
Pivotal HD
Analytic Data Sources (JDBC) and Operator Compatibility
Greenplum Database
Pivotal HAWQ
Postgres
Non-Analytic Data Sources (JDBC) and Operator Compatibility
Apache Impala
AWS Redshift
Azure Data Warehouse
Google BigQuery
Hive JDBC
MS SQL Server
SAP HANA
Vertica JDBC
TIBCO Data Virtualization Compatibility
Deprecated, Removed, or Replaced Operators
Processing Tools for Hadoop-Enabled Operators
Processing Tools for Data Load Operators (HDFS)
Processing Tools for Exploration Operators (HDFS)
Processing Tools for Transformation Operators (HDFS)
Processing Tools for Sample Operators (HDFS)
Processing Tools for Modeling Operators (HDFS)
Processing Tools for NLP Operators (HDFS)
Processing Tools for Prediction Operators (HDFS)
Processing Tools for Model Validation Operators (HDFS)
Processing Tools for Tool Operators (HDFS)
TIBCO Documentation and Support Services
Legal and Third-Party Notices