Machine Learning (Python-based) Functions

The Machine Learning (Python-based) functions are FOCUS functions implemented as Python scripts. These scripts take advantage of Python packages such as scipy, numpy, scikit-learn, and pandas, which extend Python's capabilities to machine learning.

Before running the functions, you must configure the Adapter for Python. For information about configuring the adapter, see the Adapter Administration manual.

The Machine Learning (Python-based) functions perform regression and classification using a variety of machine learning methods. The Python scripts perform a sequence of conventional machine learning tasks, including scaling the data where appropriate. They are built around a grid search with cross-validation. That is, some hyper-parameters (parameters that influence the learning process but are not model parameters) are identified, and models are built using several candidate values for each hyper-parameter in order to determine the optimal values. Optimality is determined by cross-validation, which ensures that performance is measured on a validation subset of the data that is distinct from the training subset. Rows with missing values in the target column are not used for training or validation, but the trained model still computes a predicted value for them.
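The workflow described above can be sketched with scikit-learn. This is a minimal illustration, not the scripts shipped with the product: the helper name fit_and_predict, the choice of a Ridge regressor, the StandardScaler step, the alpha grid, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def fit_and_predict(df, feature_cols, target_col, param_grid, cv=5):
    """Hypothetical helper: grid search with cross-validation, then predict every row."""
    # Rows with a missing target are left out of training and validation.
    known = df[target_col].notna()
    X_train = df.loc[known, feature_cols]
    y_train = df.loc[known, target_col]

    # Scaling sits inside the pipeline so it is refit on each training fold
    # and never sees the corresponding validation fold.
    pipeline = Pipeline([("scale", StandardScaler()), ("model", Ridge())])

    # One model is fit per hyper-parameter combination and per fold; the
    # combination with the best mean validation score is refit on all known rows.
    search = GridSearchCV(pipeline, param_grid, cv=cv,
                          scoring="neg_mean_squared_error")
    search.fit(X_train, y_train)

    # The refit model predicts for all rows, including those with no target.
    predictions = pd.Series(search.predict(df[feature_cols]), index=df.index)
    return search.best_params_, predictions


# Synthetic data (assumed for illustration): 60 rows, 3 of them with no target.
rng = np.random.default_rng(0)
frame = pd.DataFrame({"x1": rng.normal(size=60), "x2": rng.normal(size=60)})
frame["y"] = 3.0 * frame["x1"] - 2.0 * frame["x2"] + rng.normal(scale=0.1, size=60)
frame.loc[[5, 17, 42], "y"] = np.nan

alpha_grid = {"model__alpha": [0.01, 0.1, 1.0, 10.0]}
best_params, predicted = fit_and_predict(frame, ["x1", "x2"], "y", alpha_grid)
```

Placing the scaler inside the pipeline is a design choice in this sketch: it keeps the validation subset of each fold out of the scaling statistics, which matches the requirement that performance be measured on data distinct from the training subset.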