Scoring Service

A scoring service provides a REST endpoint to load and unload models, and to score requests against the loaded models.

Architecture

A scoring service is a Kubernetes service executing in one or more Pods. The scoring service Pods scale elastically based on metrics collected at runtime.

Scoring services are automatically started as needed when a scoring pipeline is scheduled. When a scheduled job starts, these steps are taken for each model in the scoring pipeline:

  1. Determine the required model runner.
  2. Start a scoring service with the required model runner.
  3. Load the model into the started scoring service.

When the job completes, all scoring services associated with the job are shut down.

Scoring services have a REST API to load and unload models, and to accept scoring requests. The response to a scoring request is a score calculated by the model. The request and response format is defined by an input and an output schema associated with the loaded model.
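As a hypothetical illustration of schema-driven payloads, a model whose input schema defines two numeric fields and whose output schema defines a single score might exchange request and response bodies like these (the field names and structure are illustrative only, not the actual wire format, which is determined by the schemas associated with the loaded model):

```
Request:  { "age": 42, "income": 55000.0 }
Response: { "score": 0.87 }
```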

Models are associated with a model runner using a model type. When a model is loaded into a scoring service, the model type determines which model runner executes the model and responds to its scoring requests. All details of the model syntax, execution semantics, and so on are handled by the model runner.


Model Runners

A Java Model Runner API is available to allow new model runners to be added. Model runners are hosted in-process in a scoring server.

These broad model execution strategies are possible:

  • directly implement logic in Java
  • call an embedded library
  • invoke an external command
  • call out to an external (proxied) scoring service

All of these model execution strategies are supported by the Model Runner API. In addition, no restrictions are placed on model runners accessing other external services, for example data sources, to perform model execution and scoring.
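As a sketch of the "invoke an external command" strategy, the class below runs an external process, pipes the scoring request to its stdin, and returns its stdout as the score. Note that the ModelRunner interface shown here is a hypothetical stand-in, not the real Model Runner API; consult the Model Runner API Javadoc for the actual types and implementation steps.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.stream.Collectors;

// HYPOTHETICAL stand-in for the real Model Runner API interface.
interface ModelRunner {
    String score(String request) throws Exception;
}

// One possible execution strategy: delegate scoring to an external command.
class ExternalCommandRunner implements ModelRunner {
    private final String[] command;

    ExternalCommandRunner(String... command) {
        this.command = command;
    }

    @Override
    public String score(String request) throws Exception {
        // Start the external process and write the scoring request to its stdin.
        Process p = new ProcessBuilder(command).start();
        p.getOutputStream().write(request.getBytes());
        p.getOutputStream().close();
        // Treat the process's stdout as the score and return it.
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String out = r.lines().collect(Collectors.joining("\n"));
            p.waitFor();
            return out;
        }
    }
}
```

For example, `new ExternalCommandRunner("cat").score(request)` simply echoes the request back, standing in for a real scoring command.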

The Model Runner API is available as a Maven artifact:

    <dependency>
        <groupId>com.tibco.modelops.runner</groupId>
        <artifactId>api</artifactId>
        <version>1.2.0</version>
    </dependency>

The high-level implementation steps are summarized in the Model Runner API Javadoc.

Once a model runner is implemented, it is loaded into a scoring server using the --runners option.

java -jar scoring-server.jar --help
java -jar scoring-server.jar [-c <file>] [-d] [-h] [-r <path,...>] [-s <name=value,...> | -sf <file>]  [-v]
scoring-server.jar: --help: help
    ...
-r,--runners <path,...> model runner directories (default /opt/tibco/distrib/tibco/modelops/runner)
    ...

For example, if a model runner is installed into a directory named runner, it would be loaded into the scoring server using this command line:

java -jar scoring-server.jar --runners runner

Finally, any required model runner configuration should be specified using the Scoring.runners configuration property. Model runners should make no assumptions about how local or external resources are accessed; this information should be passed into the runner using configuration. See Configuration for details.
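A hypothetical sketch of such configuration is shown below. The property names, nesting, and database URL are illustrative only; the actual configuration file format and supported properties are described in Configuration.

```
Scoring = {
    runners = {
        "my-runner" = {
            "database.url" = "jdbc:postgresql://db.example.com/features"
        }
    }
}
```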