Cloud Installation
Introduction
This section provides general information on deploying TIBCO Streaming Model Management Server (MMS) to a Kubernetes cluster.
- Requirements
- Sizing
- Additional Model Management Server Configuration
- Third Party Operator Configuration
- Passwords and secrets
- Quick run through
- Alternative ways to manage volumes
- The installation pipeline
- Cloud platform differences
- Upgrading
- Uninstalling
- Troubleshooting
- Handling Multi-Attach Errors with External Storage
Requirements
The following tools are required to complete the installation - these must be downloaded and installed prior to installing Model Management Server:
- Kubernetes CLI tool
- macOS:
brew install kubectl
- Helm CLI tool
- macOS:
brew install helm
- Tekton CLI tool
- macOS:
brew install tektoncd-cli
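Once installed, each tool can be verified from the command line, for example :
$ kubectl version --client
$ helm version
$ tkn version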
Additional requirements depend on the Kubernetes cloud platform being used - see the platform-specific sections below.
Optional Tools
These tools are optional, but have been found to be useful:
- Lens - Lens is a powerful Kubernetes IDE that simplifies the management of Kubernetes clusters. It provides a graphical interface to monitor cluster resources, manage configurations, and troubleshoot issues.
- macOS:
brew install lens
Azure AKS
Azure AKS also requires:
- Azure CLI Tools must be installed and configured.
- macOS:
brew install azure-cli
OpenShift
OpenShift also requires:
- OpenShift CLI tools must be installed and configured.
- macOS:
brew install openshift-cli
Sizing
Model Management Server can be quickly configured for a small, medium or large installation whilst also allowing for further customizations as needed.
Whilst many values are best kept at defaults, the values listed below have the biggest effect on sizing and so are exposed as install options.
Small - for single scoring
Minimum hardware is a single virtual machine with 6 cores, 30GB memory and 100GB disk space. Larger configurations allow for more concurrent scoring and faster executions.
The helm install option, --set size=small, defaults to :
Medium - for small teams, the installation default
Minimum hardware is a single virtual machine with 8 CPUs, 32GB memory and 100GB disk space. Cloud infrastructure configured for cluster scaling is recommended to absorb variable demand. A useful cluster scaling configuration is a minimum of 1 virtual machine and a maximum of 5 virtual machines, although experience shows 2 servers are usually sufficient.
The helm install option, --set size=medium, defaults to :
Large - for larger teams
Minimum hardware is a single virtual machine with 16 CPUs, 64GB memory and 500GB disk space. Cloud infrastructure configured for cluster scaling is recommended to absorb variable demand. A useful cluster scaling configuration is a minimum of 1 virtual machine and a maximum of 10 virtual machines.
The helm install option, --set size=large, defaults to :
Further sizing customizations
Each of the above values can be overridden as needed using the override name included above. For example, to increase the git server disk space with the medium configuration, use --set size=medium --set medium.git.disk=50Gi.
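Combined with the full install command shown later in this section, such an override might look like this (the values are illustrative) :
$ helm upgrade --install installer helm-charts/kubernetes-installer-1.4.0.tgz --atomic \
    --set cloud=aks --set size=medium --set medium.git.disk=50Gi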
Additional Model Management Server Configuration
Feature | Value | Override name | Description |
---|---|---|---|
CPU utilization target | 75 | artifactManagement.cpuUtilizationTarget | Average CPU utilization percentage to use for autoscaling |
Memory utilization target | 75 | artifactManagement.memoryUtilizationTarget | Average memory utilization percentage to use for autoscaling |
Artifact Server JVM Arguments | | artifactManagement.jvmargs | Java command line arguments for the JVM. Defaults to container.linux.jvmargs. |
Data Channel Registry JVM Arguments | | datachannelregistry.jvmargs | Java command line arguments for the JVM. Defaults to container.linux.jvmargs. |
Scheduling Server JVM Arguments | | schedulingserver.jvmargs | Java command line arguments for the JVM. Defaults to container.linux.jvmargs. |
Scoring Server JVM Arguments | | scoringserver.<runnerName>.jvmargs | Java command line arguments for the JVM. Defaults to container.<os>.jvmargs. |
Data Channel JVM Arguments | | datachannel.<channelType>.jvmargs | Java command line arguments for the JVM. Defaults to container.linux.jvmargs. |
The utilization target values specify targets which are used in horizontal pod autoscaling. This is used to automatically scale the number of replicas up or down based on load. The current utilization metric for a pod is calculated as a percentage of the resource request (e.g., {size}.artifactManagement.requests.memory). These percentages are averaged over all Artifact Management pods. If the resulting value is above (below) the target value, the number of replicas is scaled up (down).
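For example, with a memory utilization target of 75, pods that each request 4GiB of memory would trigger a scale-up once their average usage rises above 3GiB (these figures are illustrative). The state of the autoscalers can be inspected with the standard kubectl command :
$ kubectl get hpa --namespace mms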
Third Party Operator Configuration
As part of a Kubernetes installation, several third party operators (Elastic, Git, Kibana, Nexus, Prometheus, Jaeger, OpenTelemetry) are also installed. Each of these operators has key parameters available for customization. The default helm chart values for these operators can be viewed with the helm show values kubernetes-installer-1.4.0.tgz command. Alternatively, the values.yaml file inside the kubernetes-installer-1.4.0.tgz archive can be inspected.
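For example, to review the chart defaults before overriding any of them :
$ helm show values kubernetes-installer-1.4.0.tgz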
Passwords and secrets
In order to avoid clear-text passwords, Kubernetes provides a Secrets facility, so prior to installation Kubernetes Secrets have to be created to contain the passwords required by Model Management Server.
These are :
Description | Secret name | Key name | Comments |
---|---|---|---|
Elastic search | elasticsearch-es-elastic-user | Elastic search user name | See https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-users-and-roles.html - if not set, Elasticsearch generates a password |
Git server | git-server | git user name | |
Nexus server | nexus-server | admin | |
Prometheus server | prometheus-server | usernames and passwords | Only used when not using OAuth2 |
Artifact management server | artifact-management | admin | |
Artifact management Env | artifact-management-env | | Configures environment variables related to artifact management |
Scoring flow admin | scoring-admin | admin | |
These secrets may be created via the cloud infrastructure or on the command-line using kubectl. For example :
# elastic search
#
# note in this case we use apply to avoid elastic search re-creating the secret
#
kubectl create secret generic elasticsearch-es-elastic-user --from-literal=elastic=mysecretpassword --namespace mms --dry-run=client --output=yaml 2>/dev/null > secret.yaml
kubectl apply --filename secret.yaml
# git server
#
kubectl create secret generic git-server --from-literal=mms=mysecretpassword --namespace mms
# nexus server
#
kubectl create secret generic nexus-server --from-literal=admin=mysecretpassword --namespace mms
# prometheus server
#
kubectl create secret generic prometheus-server --from-literal=user1=mysecretpassword2 --from-literal=user2=mysecretpassword2 ... --namespace mms
# scoring admin
#
kubectl create secret generic scoring-admin --from-literal=admin=mysecretpassword --namespace mms
# artifact management server
#
kubectl create secret generic artifact-management --from-literal=admin=mysecretpassword --namespace mms
kubectl create secret generic artifact-management-env --namespace mms
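The created secrets can then be listed to confirm they all exist :
$ kubectl get secrets --namespace mms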
NOTE: The Elasticsearch password is limited to alphanumeric, ".", "_", "~", and "-" characters, i.e. it must conform to the regular expression ^[a-zA-Z0-9._~-]+$.
NOTE: The prometheus-server secret is only used if you are not using OAuth2 authentication; if using OAuth2, this secret need not be created. It should consist of name/value pairs where each name is a user name and each value is the corresponding password. These are used to secure the Prometheus ingress with basic authentication.
It is recommended to install an encryption provider for maximum security - see https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/.
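On self-managed clusters where the API server flags can be set, a minimal sketch of such a configuration, following the upstream Kubernetes documentation, looks as follows (the key name and placeholder secret are illustrative; the file is referenced by the API server's --encryption-provider-config flag) :
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <BASE64-ENCODED-32-BYTE-KEY>
      - identity: {}
On managed platforms, the provider's own encryption-at-rest features apply instead.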
Quick run through
The Helm CLI tool is used to install the Model Management Server components to Kubernetes :
$ helm upgrade --install installer helm-charts/kubernetes-installer-1.4.0.tgz --atomic --set cloud=aks
This command first installs and starts the bootstrap pipeline which installs the required Kubernetes operators - this takes a few seconds, after which the helm command returns with a summary of the installation.
For example :
Release "installer" does not exist. Installing it now.
NAME: installer
LAST DEPLOYED: Mon Jan 24 14:04:38 2022
NAMESPACE: mms
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing ep-kubernetes-installer configured for docker-for-desktop in kubernetes v1.22.5
The Operator Lifecycle Manager has been installed
The bootstrap pipeline has started which includes :
Adding kubernetes permissions
Installing a nexus server
Installing ElasticSearch and Kibana
Installing Prometheus
Populating the nexus repository with artifacts
Creating a product install pipeline from helm charts
Starting the nexus server at :
Internal web console URL - http://artifact-repository:80/
Maven repository - http://artifact-repository:80/repository/maven-public/
Helm repository - http://artifact-repository:80/repository/helm/
PyPi proxy - http://artifact-repository:80/repository/pypi-group
Container registry - 192.168.175.10:8082
Starting prometheus server at :
Internal URL - http://prometheus.mms:9090
Starting elasticsearch server at :
Internal URL - https://elasticsearch-es-http:9200
Userid is elastic, password set in kubernetes secret
Starting kibana server at :
Internal URL - http://kibana-kb-http
Userid is elastic, password set in kubernetes secret
The docker daemon should be configured to allow http pull requests :
{
"insecure-registries": [
"192.168.175.10:8082"
]
}
To track the progress of the bootstrap pipeline run :
tkn pipelinerun logs bootstrap --follow --namespace mms
The output depends on the cloud platform and any additional options selected. These details are also displayed with the helm status mms command.
The zip of maven artifacts should be copied using the kubectl cp command :
$ kubectl cp product-repository-<version>.zip mavenrepo-0:/tmp/ --namespace mms
At this point the installation has been started and, as mentioned above, the status of the installation can be monitored with tkn pipelinerun logs bootstrap -f. For example :
$ tkn pipelinerun logs bootstrap --follow --namespace mms
[nexus : nexus] Installing nexus operator
[nexus : nexus] namespace/nexus-operator-system created
[nexus : nexus] customresourcedefinition.apiextensions.k8s.io/nexus.apps.m88i.io created
[nexus : nexus] role.rbac.authorization.k8s.io/nexus-operator-leader-election-role created
[nexus : nexus] clusterrole.rbac.authorization.k8s.io/nexus-operator-manager-role created
[nexus : nexus] clusterrole.rbac.authorization.k8s.io/nexus-operator-metrics-reader created
[nexus : nexus] clusterrole.rbac.authorization.k8s.io/nexus-operator-proxy-role created
[nexus : nexus] rolebinding.rbac.authorization.k8s.io/nexus-operator-leader-election-rolebinding created
[nexus : nexus] clusterrolebinding.rbac.authorization.k8s.io/nexus-operator-manager-rolebinding created
[nexus : nexus] clusterrolebinding.rbac.authorization.k8s.io/nexus-operator-proxy-rolebinding created
[nexus : nexus] service/nexus-operator-controller-manager-metrics-service created
....
[install-pipeline-run : run] 14:16:27.765 [main] INFO com.tibco.streaming.installpipeline.Kubernetes - To track the progress of the artifact-management pipeline run :
[install-pipeline-run : run] 14:16:27.766 [main] INFO com.tibco.streaming.installpipeline.Kubernetes - tkn pipelinerun logs artifact-management --follow --namespace mms
The installation process can run tasks in parallel - hence the output is prefixed with the task name and lines are coloured.
Once the bootstrap pipeline has completed, the application pipeline can be monitored in a similar way :
$ tkn pipelinerun logs artifact-management --follow --namespace mms
....
[scheduling-server-scale : scale] Resuming rollout of scheduling-server
[scheduling-server-scale : scale] deployment.apps/scheduling-server resumed
[data-channel-registry-prepare : prepare] Preparing directory for data-channel-registry
....
The installation is completed when the tkn pipelinerun logs artifact-management --follow --namespace mms command completes. The tkn taskrun list command shows the task status :
$ tkn taskrun list --namespace mms
NAME TASK NAME STARTED DURATION STATUS
artifact-management-python-scale python-scale 1 minute ago 7s Succeeded
artifact-management-spark-scale spark-scale 2 minutes ago 16s Succeeded
artifact-management-rmodelrunner-scale rmodelrunner-scale 2 minutes ago 6s Succeeded
artifact-management-tensorflow-scale tensorflow-scale 2 minutes ago 6s Succeeded
artifact-management-pmml-scale pmml-scale 2 minutes ago 6s Succeeded
artifact-management-artifact-management-scale artifact-management-scale 2 minutes ago 7s Succeeded
artifact-management-artifact-management-initindex artifact-management-initindex 2 minutes ago 11s Succeeded
artifact-management-spark-image spark-image 3 minutes ago 1m37s Succeeded
artifact-management-pmml-image pmml-image 3 minutes ago 1m33s Succeeded
artifact-management-artifact-management-image artifact-management-image 4 minutes ago 1m32s Succeeded
artifact-management-tensorflow-image tensorflow-image 4 minutes ago 1m50s Succeeded
artifact-management-rmodelrunner-image rmodelrunner-image 4 minutes ago 1m57s Succeeded
artifact-management-python-image python-image 4 minutes ago 3m16s Succeeded
artifact-management-jdbc-source-image jdbc-source-image 4 minutes ago 1m41s Succeeded
artifact-management-pipeline-image pipeline-image 4 minutes ago 3m6s Succeeded
artifact-management-rest-source-image rest-source-image 5 minutes ago 1m48s Succeeded
artifact-management-rest-sink-image rest-sink-image 5 minutes ago 1m44s Succeeded
artifact-management-kafka-sink-image kafka-sink-image 5 minutes ago 1m44s Succeeded
artifact-management-file-sink-image file-sink-image 5 minutes ago 1m23s Succeeded
artifact-management-file-source-image file-source-image 5 minutes ago 1m23s Succeeded
artifact-management-jdbc-sink-image jdbc-sink-image 5 minutes ago 1m25s Succeeded
artifact-management-kafka-source-image kafka-source-image 5 minutes ago 1m30s Succeeded
artifact-management-rest-request-response-image rest-request-response-image 5 minutes ago 1m35s Succeeded
artifact-management-statistica-scale statistica-scale 5 minutes ago 13s Succeeded
artifact-management-jdbc-source-maven jdbc-source-maven 5 minutes ago 44s Succeeded
artifact-management-rmodelrunner-maven rmodelrunner-maven 5 minutes ago 1m30s Succeeded
artifact-management-spark-maven spark-maven 5 minutes ago 2m1s Succeeded
artifact-management-kafka-sink-maven kafka-sink-maven 5 minutes ago 38s Succeeded
artifact-management-rest-source-maven rest-source-maven 5 minutes ago 38s Succeeded
artifact-management-tensorflow-maven tensorflow-maven 5 minutes ago 1m33s Succeeded
artifact-management-rest-sink-maven rest-sink-maven 5 minutes ago 39s Succeeded
artifact-management-file-source-maven file-source-maven 5 minutes ago 39s Succeeded
artifact-management-pipeline-maven pipeline-maven 5 minutes ago 46s Succeeded
artifact-management-statistica-image statistica-image 5 minutes ago 9s Succeeded
artifact-management-file-sink-maven file-sink-maven 5 minutes ago 39s Succeeded
artifact-management-jdbc-sink-maven jdbc-sink-maven 5 minutes ago 39s Succeeded
artifact-management-kafka-source-maven kafka-source-maven 5 minutes ago 34s Succeeded
artifact-management-pmml-maven pmml-maven 5 minutes ago 1m54s Succeeded
artifact-management-rest-request-response-maven rest-request-response-maven 5 minutes ago 32s Succeeded
artifact-management-python-maven python-maven 5 minutes ago 1m24s Succeeded
artifact-management-artifact-management-maven artifact-management-maven 5 minutes ago 1m40s Succeeded
artifact-management-spark-prepare spark-prepare 5 minutes ago 8s Succeeded
artifact-management-jdbc-source-prepare jdbc-source-prepare 5 minutes ago 9s Succeeded
artifact-management-rmodelrunner-prepare rmodelrunner-prepare 5 minutes ago 8s Succeeded
artifact-management-rest-source-prepare rest-source-prepare 5 minutes ago 7s Succeeded
artifact-management-kafka-sink-prepare kafka-sink-prepare 5 minutes ago 7s Succeeded
artifact-management-tensorflow-prepare tensorflow-prepare 5 minutes ago 10s Succeeded
artifact-management-pipeline-prepare pipeline-prepare 5 minutes ago 7s Succeeded
artifact-management-rest-sink-prepare rest-sink-prepare 5 minutes ago 7s Succeeded
artifact-management-file-source-prepare file-source-prepare 6 minutes ago 7s Succeeded
artifact-management-file-sink-prepare file-sink-prepare 6 minutes ago 7s Succeeded
artifact-management-jdbc-sink-prepare jdbc-sink-prepare 6 minutes ago 8s Succeeded
artifact-management-pmml-prepare pmml-prepare 6 minutes ago 9s Succeeded
artifact-management-rest-request-response-prepare rest-request-response-prepare 6 minutes ago 8s Succeeded
artifact-management-python-prepare python-prepare 6 minutes ago 8s Succeeded
artifact-management-kafka-source-prepare kafka-source-prepare 6 minutes ago 10s Succeeded
artifact-management-artifact-management-prepare artifact-management-prepare 6 minutes ago 9s Succeeded
artifact-management-statistica-maven statistica-maven 6 minutes ago 27s Succeeded
artifact-management-statistica-prepare statistica-prepare 6 minutes ago 7s Succeeded
artifact-management-sw-container-base-windows-image sw-container-base-windows-image 6 minutes ago 12s Succeeded
artifact-management-sw-container-base-linux-image sw-container-base-linux-image 6 minutes ago 40s Succeeded
artifact-management-git-server-scale git-server-scale 6 minutes ago 7s Succeeded
artifact-management-git-server-image git-server-image 7 minutes ago 24s Succeeded
artifact-management-sw-container-base-linux-maven sw-container-base-linux-maven 7 minutes ago 31s Succeeded
artifact-management-sw-container-base-windows-maven sw-container-base-windows-maven 7 minutes ago 31s Succeeded
artifact-management-git-server-prepare git-server-prepare 7 minutes ago 31s Succeeded
artifact-management-sw-container-base-linux-prepare sw-container-base-linux-prepare 7 minutes ago 32s Succeeded
artifact-management-sw-container-base-windows-prepare sw-container-base-windows-prepare 7 minutes ago 32s Succeeded
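If any task does not show Succeeded, its individual logs can be inspected; for example, using one of the task runs listed above :
$ tkn taskrun logs artifact-management-python-image --namespace mms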
Alternative ways to manage volumes
By default, the MMS installation will create Kubernetes persistent volume claims for the artifact-management and git servers. However, if needed, these volumes can be managed differently.
- Pre-create persistent volume claims
In this case the administrator pre-creates git-server and artifact-management persistent volume claims (perhaps with a custom storage class and reclaim policy) and specifies the createPVC=false option when installing MMS :
$ helm upgrade --install installer helm-charts/kubernetes-installer-1.4.0.tgz --atomic --set cloud=aks --set createPVC=false
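For example, one of the claims might be pre-created as follows (the storage class, size and access mode are illustrative and should match your environment) :
kubectl apply --namespace mms -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: artifact-management
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: customclass
  resources:
    requests:
      storage: 100Gi
EOF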
- Use a custom storage class
In this case the administrator pre-creates a custom storage class (or specifies a non-default one) and specifies the artifactManagement.storageClass option when installing MMS :
$ helm upgrade --install installer helm-charts/kubernetes-installer-1.4.0.tgz --atomic --set cloud=aks --set artifactManagement.storageClass=customclass
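For example, on AKS a custom storage class that retains volumes on deletion might be sketched as follows (the provisioner shown is the Azure disk CSI driver; adjust for your platform) :
kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: customclass
provisioner: disk.csi.azure.com
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
EOF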
The installation pipeline
The install process is controlled via a Tekton pipeline called installation. This pipeline first installs the following Kubernetes Operators during the pre-install hook :
- Operator Lifecycle Manager (if required)
- Tekton pipeline operator
- Tekton triggers operator
- Nexus 3 operator
- Prometheus operator
- Elastic Cloud on Kubernetes operator
- OpenTelemetry
- Jaeger
- General purpose tools image - used for various build and deploy tasks
Kubernetes permissions are added to support Role-based access control (RBAC), security context constraints (SCC) and Streaming discovery.
The following container images are built in Kubernetes :
- GIT server image - used to hold the Model Management Server artifacts
Model Management Server helm charts also create :
- SW container base Windows image - A common base image for Windows-based MMS components. This is only supported in Kubernetes environments that support Windows containers (e.g., Azure).
- Windows base container image for Statistica Model - No Kubernetes deployments are created for Statistica model runners in MMS. The containers are built but are not used in MMS.
- SW container base Linux image - A common base image for most MMS components.
- Linux base container for Scoring Pipeline - Scoring pipelines are created and deployed dynamically at runtime when users create Cloud Deployments.
- Linux base container for Model Runner - For each Model Runner extension shipped with the product.
- Linux base container for Data Channels - For each data channel extension shipped with the product.
- Linux base container for Artifact Management Server - Creates the Artifact Management Server as the last step.
The following services are started :
- GIT server
- TIBCO Model Management Server
- Nexus repository configured with :
- Maven repository, populated with TIBCO artifacts
- Python repository (both proxy and hosted)
- Container registry
- Helm chart repository
- TIBCO Scheduling Server
Finally, the installation deploys a helm chart used later to deploy Model Management Server.
Kubernetes rollout is paused during the installation process and resumed once new container images are available.
Individual pipeline tasks are scheduled by dependency and available resources.
Cloud platform differences
Kubernetes features differ between platforms, and so the installation process also varies slightly. In general, natively provided features are used in preference to custom-provided features. These differences are shown below :
Feature | OpenShift | AKS | EKS |
---|---|---|---|
Operator Lifecycle Manager | Provided | Installed | Installed |
Container registry | ImageStream | ACR | ECR |
Network exposure | route | Ingress | Ingress |
RBAC supported | Yes | Yes | Yes |
SCC supported | Yes | No | No |
Windows images supported | No | Yes | No |
These differences are controlled via Model Management Server helm chart values parameters - these can be viewed with the helm show values kubernetes-installer-1.4.0.tgz command. The cloud platform is selected with the cloud option :
$ helm install mms kubernetes-installer-1.4.0.tgz --set cloud=aks
However, individual settings can be overridden if required, using the <cloud name>.<parameter> format. For example :
$ helm install mms kubernetes-installer-1.4.0.tgz --set cloud=aks \
--set aks.containerRegistry=myserver:30030
Upgrading
To upgrade the Model Management Server components use :
$ helm upgrade installer kubernetes-installer-1.4.0.tgz ...
However, it's common practice to use the same command for installation and upgrades :
$ helm upgrade installer kubernetes-installer-1.4.0.tgz --install ...
When the installation is upgraded, the installation pipeline is re-executed and a rollout restart is performed on existing pods.
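The restart of an individual deployment can be watched with kubectl; for example, using the scheduling-server deployment seen earlier in the pipeline logs :
$ kubectl rollout status deployment/scheduling-server --namespace mms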
Uninstalling
To uninstall the Model Management Server components use:
$ helm uninstall mms
Note that this doesn't uninstall the Kubernetes operators (so that a further install is faster).
Troubleshooting
Always ensure the Kubernetes context is what you expect. For example :
$ kubectl config current-context
mms
The context is also displayed in the Docker Desktop UI.
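If the current context is wrong, switch to the expected one :
$ kubectl config use-context mms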
Storage Issues
Multi-Attach Errors and External Storage
When running multiple replicas of the artifact-management server, especially in clusters with autoscaling or when pods are rescheduled across nodes, you may encounter storage errors such as Multi-Attach. This typically happens if your storage class does not support multi-node access (e.g., ReadWriteMany).
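The access modes of the existing claims can be checked with :
$ kubectl get pvc --namespace mms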
For a step-by-step guide on resolving these issues, including using external storage and migrating data, see Handling Multi-Attach Errors with External Storage.