Cloud Installation

Introduction

This section provides general information on deploying ModelOps to a Kubernetes cluster.

Requirements
Sizing
Passwords and secrets
Quick run through
The installation pipeline
Cloud platform differences
Upgrading
Uninstalling
Troubleshooting

Requirements

The following tools are required to complete the installation. They must be downloaded and installed before installing ModelOps:

Kubernetes CLI tool
- macOS: brew install kubectl
Helm CLI tool
- macOS: brew install helm
Tekton CLI tool
- macOS: brew install tektoncd-cli
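
Before starting, it can help to confirm that the three CLI tools are on the PATH. The following is a minimal sketch for a POSIX shell; the helper name missing_tools is illustrative and not part of ModelOps.

```shell
# Print the name of every listed tool that is not found on the PATH.
# missing_tools is a hypothetical helper, not part of ModelOps.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Report which of the required CLIs still need installing.
# (tkn is the binary provided by the tektoncd-cli package.)
missing_tools kubectl helm tkn
```

No output means all three tools were found.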

Additional requirements depend on the Kubernetes cloud platform being used:

Optional Tools

These tools are optional, but have been found to be useful:

Lens
- macOS: brew install lens

Docker for desktop

Docker for desktop also requires:

Kubernetes must be enabled
Docker Engine must enable insecure registries for the local machine
Allocate required resources, roughly

Kind

Kind has no additional requirements.

Azure AKS

Azure AKS also requires:

Azure CLI Tools must be installed and configured.
- macOS: brew install azure-cli

OpenShift

OpenShift also requires:

OpenShift CLI tools must be installed and configured.
- macOS: brew install openshift-cli

Sizing

ModelOps can be quickly configured for a small, medium or large installation whilst also allowing for further customizations as needed.

Whilst many values are best kept at their defaults, the values listed below have the biggest effect on sizing and so are exposed as install options.

Small - for single scoring

Minimum hardware is a single virtual machine with 6 cores, 30GB memory and 100GB disk space. Larger configurations allow for more concurrent scoring and faster executions. This size can be run in desktop Kubernetes environments such as Docker for Desktop.

Linux-based scoring only (a hybrid deployment of Linux and Windows nodes is required for Statistica scoring).

The helm install option, --set global.size=small, defaults to:

Feature	Value	Units
Scoring flow memory limit	1.5Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Scoring flow cpu limit	2	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Nexus disk space	20Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Nexus memory limit	1.5Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Git server disk space	5Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops server disk space	5Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops metrics memory limit	10Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops metrics sampling interval	30	Seconds
Elastic search disk space	10Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Elastic search memory limit	2Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Prometheus sampling interval	30s	Seconds

Medium - for small teams, the installation default

Minimum hardware is a single virtual machine with 8 CPUs, 32GB memory and 100GB disk space. Cloud infrastructure configured for cluster scaling is recommended to absorb variable demand. A useful cluster scaling configuration is a minimum of 1 virtual machine and a maximum of 5 virtual machines, although experience shows that 2 are usually sufficient.

Additional Windows virtual machines can be added to the cluster to support Statistica scoring if required.

The helm install option, --set global.size=medium, defaults to:

Feature	Value	Units
Scoring flow memory limit	2Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Scoring flow cpu limit	4	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Nexus disk space	20Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Nexus memory limit	1.5Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Git server disk space	20Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops server disk space	20Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops metrics memory limit	15Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops metrics sampling interval	10	Seconds
Elastic search disk space	50Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Elastic search memory limit	5Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Prometheus sampling interval	10s	Seconds

Large - for larger teams

Minimum hardware is a single virtual machine with 16 CPUs, 64GB memory and 500GB disk space. Cloud infrastructure configured for cluster scaling is recommended to absorb variable demand. A useful cluster scaling configuration is a minimum of 1 virtual machine and a maximum of 10 virtual machines.

Additional Windows virtual machines can be added to the cluster to support Statistica scoring if required.

The helm install option, --set global.size=large, defaults to:

Feature	Value	Units
Scoring flow memory limit	4Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Scoring flow cpu limit	6	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Nexus disk space	20Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Nexus memory limit	1.5Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Git server disk space	100Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops server disk space	100Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops metrics memory limit	20Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Modelops metrics sampling interval	10	Seconds
Elastic search disk space	100Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Elastic search memory limit	10Gi	See https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity
Prometheus sampling interval	10s	Seconds

Further sizing customizations

Each of the above values can be overridden as needed. For example, to increase the git server disk space with the medium configuration, use --set global.size=medium --set global.medium.git.disk=50Gi.

The installation values file listed below shows the actual customization names.
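
When several sizing values change together, the overrides can be kept in a small values file rather than a long list of --set options. The following is a sketch only: the file name sizing-overrides.yaml and the chosen sizes are assumptions, while the key names follow the global.<size>.<component> layout of the chart values file.

```shell
# Write a values file that overrides two of the medium defaults.
# The file name and the sizes are illustrative assumptions.
cat > sizing-overrides.yaml <<'EOF'
global:
  size: medium
  medium:
    git:
      disk: "50Gi"
    elasticsearch:
      disk: "100Gi"
EOF

# Pass the file to helm together with the chart, for example:
#   helm upgrade --install modelops modelops-1.0.0.tgz --values sizing-overrides.yaml
```

Keeping the overrides in a file makes it easier to reuse exactly the same sizing on a later upgrade.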

Passwords and secrets

To avoid clear-text passwords, Kubernetes provides a Secrets facility. Before installation, Kubernetes Secrets must be created to contain the passwords required by ModelOps.

These are:

Description	Secret name	Key name	Comments
Elastic search	elasticsearch-es-elastic-user	Elastic search user name	See https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-users-and-roles.html - if not set elastic search generates a password
Git server	git-server	git user name
Nexus server	nexus-server	admin
Modelops server	modelops-server	admin
Grafana server	grafana-server	admin
Scoring flow admin	scoring-admin	admin

These secrets may be created via the cloud infrastructure or on the command-line using kubectl. For example :

# elastic search
#
# note in this case we use apply to avoid elastic search re-creating the secret
#
kubectl create secret generic elasticsearch-es-elastic-user --from-literal=elastic=mysecretpassword --namespace modelops --dry-run=client --output=yaml 2>/dev/null > secret.yaml
kubectl apply --filename secret.yaml
 
# git server
#
kubectl create secret generic git-server --from-literal=modelops=mysecretpassword --namespace modelops
 
# nexus server
#
kubectl create secret generic nexus-server --from-literal=admin=mysecretpassword --namespace modelops
 
# modelops server
#
kubectl create secret generic modelops-server --from-literal=admin=mysecretpassword --namespace modelops
 
# grafana server
#
kubectl create secret generic grafana-server --from-literal=admin=mysecretpassword --namespace modelops
 
# scoring admin
#
kubectl create secret generic scoring-admin --from-literal=admin=mysecretpassword --namespace modelops

It is recommended to install an encryption provider for maximum security - see https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/.
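
A plain kubectl create secret fails if the secret already exists, so re-running the commands above after a partial setup can stop part-way. One possible approach, sketched here, is to apply the create-then-apply pattern from the elastic search example to every secret. The helper name create_secret is hypothetical, and the modelops namespace is assumed to exist already.

```shell
# Create or update one ModelOps secret; safe to run more than once.
# create_secret is a hypothetical helper, not part of ModelOps.
#   $1 = secret name, $2 = key name, $3 = password
create_secret() {
    kubectl create secret generic "$1" \
        --from-literal="$2=$3" \
        --namespace modelops \
        --dry-run=client --output=yaml |
        kubectl apply --namespace modelops --filename -
}

# Example usage:
#   create_secret nexus-server admin mysecretpassword
#   create_secret grafana-server admin mysecretpassword
```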

Quick run through

The Helm CLI tool is used to install the ModelOps components to Kubernetes :

$ helm upgrade --install modelops modelops-1.0.0.tgz --atomic --set global.cloud=aks

This command first installs the required Kubernetes operators, which takes a few seconds, and then returns with a summary of the installation.

For example :

NAME: modelops
LAST DEPLOYED: Thu Oct 29 10:22:35 2020
NAMESPACE: modelops
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Thank you for installing modelops configured for aks in kubernetes v1.17.11
 
 
The installation pipeline has started which includes :
 
  Building tools image
  Building git server image
  Building sbrt base image
  Building modelops server image
  Building modelops scoring flow image
  Adding kubernetes permissions
  The following sub-charts :
    data-channel
    modelops-server
    pmml
    python
    sbrt-base
    scheduling-server
    scoring-flow
    statistica
    tensorflow
    test-datasink
    test-datasource
    tools
 
 
  Starting the git server at :
 
    Internal URL - http://git:3000/
    External web console URL - run kubectl get service git --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
  Starting the nexus server at :
 
    Internal web console URL - http://nexus:8081/
      Maven repository - http://nexus:8081/repository/maven-public/
      Helm repository - http://nexus:8081/repository/helm/
      PyPi proxy - http://nexus:8081/repository/pypi-group
      Container registry - container:8082
    External web console URL - run kubectl get service nexuslb --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
  Starting the modelops server at :
 
    Internal URL - http://modelops-server/
    External URL - run kubectl get service modelops-server --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
  Starting the data channel server at :
 
    Internal URL - http://data-channel/
 
  Starting the scheduling server at :
 
    Internal URL - http://scheduling-server/
    External web console URL - run kubectl get service scheduling-server --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
  Starting prometheus server at :
 
    Internal URL - http://prometheus.modelops.svc.cluster.local:9090
    External URL - run kubectl get service prometheus --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
  Starting grafana server at :
 
    Internal URL - http://grafana:3000
    External URL - run kubectl get service grafana --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
  Starting elasticsearch server at :
 
    Internal URL - http://elasticsearch-es-http:9200
    External URL - run kubectl get service elasticsearch-es-http --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
    Userid is elastic, password elastic
 
  Starting kibana server at :
 
    Internal URL - http://kibana-kb-http
    External URL - run kubectl get service kibana-kb-http --namespace modelops -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
 
    Userid is elastic, password elastic
 
 
 
  Populating the nexus maven repository with TIBCO artifacts
 
To track the progress of the installation pipeline run :
 
  tkn pipelinerun logs installation-2 --follow --namespace modelops

The output depends on the cloud platform and any additional options selected. These details are also displayed with the helm status modelops command.
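
The external URL commands printed in the notes can be wrapped in a small function so they do not need to be copied each time. This is a sketch that assumes load balancer exposure (as on AKS) and the modelops namespace; the helper name external_url is hypothetical.

```shell
# Print the external URL of a ModelOps service exposed through a load balancer.
# external_url is a hypothetical helper; $1 is the service name, e.g. grafana.
external_url() {
    kubectl get service "$1" --namespace modelops \
        -o jsonpath='http://{.status.loadBalancer.ingress[0].ip}:{.spec.ports[0].targetPort}'
}

# Example usage:
#   external_url modelops-server
#   helm status modelops     # re-display the full installation notes
```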

The zip of Maven artifacts should be copied using the kubectl cp command:

$ kubectl cp kubernetes-installer-1.0.0-mavenrepo.zip mavenrepo-0:/tmp/mavenrepo.zip
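
The copy can only succeed once the target pod is running. A possible way to guard against copying too early is sketched below. It assumes the mavenrepo-0 pod lives in the modelops namespace and that a five minute timeout is acceptable; the helper name copy_mavenrepo is hypothetical. Note that kubectl wait reports an error straight away if the pod has not been created yet.

```shell
# Wait for the mavenrepo-0 pod to be Ready, then copy the artifact zip.
# copy_mavenrepo is a hypothetical helper; $1 is the path to the zip file.
# The namespace and timeout are assumptions.
copy_mavenrepo() {
    kubectl wait --for=condition=Ready pod/mavenrepo-0 \
        --namespace modelops --timeout=300s &&
        kubectl cp "$1" mavenrepo-0:/tmp/mavenrepo.zip --namespace modelops
}

# Example usage:
#   copy_mavenrepo kubernetes-installer-1.0.0-mavenrepo.zip
```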

At this point the installation has been started and, as mentioned above, its status can be monitored with the tkn pipelinerun logs command. For example:

$ tkn pipelinerun logs installation-2 --follow --namespace modelops
[tools-prepare : prepare] Preparing directory for build tools image
 
[git-server : build-and-push] INFO[0005] Retrieving image manifest gitea/gitea:1.10.2
[git-server : build-and-push] INFO[0007] Retrieving image manifest gitea/gitea:1.10.2
[git-server : build-and-push] INFO[0010] Built cross stage deps: map[]
[git-server : build-and-push] INFO[0010] Retrieving image manifest gitea/gitea:1.10.2
[git-server : build-and-push] INFO[0011] Retrieving image manifest gitea/gitea:1.10.2
....

The installation process can run tasks in parallel - hence the output is prefixed with the task and lines are coloured.

The installation is complete when the tkn pipelinerun logs installation --follow --namespace modelops command completes. The tkn taskrun list command shows the task status:

$ tkn taskrun list --namespace modelops
 NAME                                             TASK NAME                   STARTED          DURATION     STATUS
 installation-2-modelops-server-scale-tq26d       modelops-server-scale       9 minutes ago    26 seconds   Succeeded
 installation-2-test-datasink-scale-7x6dv         test-datasink-scale         12 minutes ago   43 seconds   Succeeded
 installation-2-test-datasource-scale-n2bn4       test-datasource-scale       12 minutes ago   27 seconds   Succeeded
 installation-2-modelops-server-image-n8pc8       modelops-server-image       15 minutes ago   5 minutes    Succeeded
 installation-2-scoring-flow-image-dt5d6          scoring-flow-image          15 minutes ago   6 minutes    Succeeded
 installation-2-test-datasource-image-ldxrk       test-datasource-image       15 minutes ago   2 minutes    Succeeded
 installation-2-test-datasink-image-45thd         test-datasink-image         15 minutes ago   3 minutes    Succeeded
 installation-2-git-server-scale-pbclk            git-server-scale            16 minutes ago   21 seconds   Succeeded
 installation-2-modelops-server-maven-kc4s4       modelops-server-maven       16 minutes ago   1 minute     Succeeded
 installation-2-test-datasource-maven-m7ghv       test-datasource-maven       16 minutes ago   1 minute     Succeeded
 installation-2-test-datasink-maven-z9pd6         test-datasink-maven         16 minutes ago   1 minute     Succeeded
 installation-2-scoring-flow-maven-znt2n          scoring-flow-maven          16 minutes ago   1 minute     Succeeded
 installation-2-git-server-qq8dl                  git-server                  17 minutes ago   1 minute     Succeeded
 installation-2-test-datasource-prepare-lt6xd     test-datasource-prepare     17 minutes ago   53 seconds   Succeeded
 installation-2-test-datasink-prepare-4bmzj       test-datasink-prepare       17 minutes ago   51 seconds   Succeeded
 installation-2-scoring-flow-prepare-tmh55        scoring-flow-prepare        17 minutes ago   51 seconds   Succeeded
 installation-2-modelops-server-prepare-rm6d5     modelops-server-prepare     17 minutes ago   57 seconds   Succeeded
 installation-2-git-server-prepare-269qk          git-server-prepare          19 minutes ago   1 minute     Succeeded
 installation-2-data-channel-scale-5792c          data-channel-scale          19 minutes ago   1 minute     Succeeded
 installation-2-kibana-gzjr4                      kibana                      19 minutes ago   1 minute     Succeeded
 installation-2-scoring-pipeline-helm-qcx7d       scoring-pipeline-helm       19 minutes ago   1 minute     Succeeded
 installation-2-scheduling-server-scale-m5c5q     scheduling-server-scale     20 minutes ago   1 minute     Succeeded
 installation-2-python-image-j5l5q                python-image                25 minutes ago   7 minutes    Succeeded
 installation-2-tools-image-ddh4r                 tools-image                 25 minutes ago   6 minutes    Succeeded
 installation-2-statistica-image-mq5cq            statistica-image            25 minutes ago   24 minutes   Succeeded
 installation-2-pmml-image-sw2s6                  pmml-image                  25 minutes ago   6 minutes    Succeeded
 installation-2-data-channel-image-r7hs8          data-channel-image          26 minutes ago   7 minutes    Succeeded
 installation-2-scheduling-server-image-df9jh     scheduling-server-image     26 minutes ago   6 minutes    Succeeded
 installation-2-tensorflow-image-w6ssk            tensorflow-image            26 minutes ago   6 minutes    Succeeded
 installation-2-sbrt-base-image-xk6hg             sbrt-base-image             26 minutes ago   8 minutes    Succeeded
 installation-2-python-maven-ffsbw                python-maven                26 minutes ago   1 minute     Succeeded
 installation-2-statistica-maven-vkxvd            statistica-maven            26 minutes ago   59 seconds   Succeeded
 installation-2-pmml-maven-nvz42                  pmml-maven                  26 minutes ago   1 minute     Succeeded
 installation-2-tools-maven-8bm4r                 tools-maven                 27 minutes ago   1 minute     Succeeded
 installation-2-scheduling-server-maven-kf2br     scheduling-server-maven     27 minutes ago   1 minute     Succeeded
 installation-2-sbrt-base-maven-hdh67             sbrt-base-maven             27 minutes ago   1 minute     Succeeded
 installation-2-data-channel-maven-hwb7t          data-channel-maven          27 minutes ago   1 minute     Succeeded
 installation-2-tensorflow-maven-m4svb            tensorflow-maven            27 minutes ago   1 minute     Succeeded
 installation-2-pmml-prepare-b8mqn                pmml-prepare                28 minutes ago   1 minute     Succeeded
 installation-2-data-channel-prepare-77sk2        data-channel-prepare        28 minutes ago   47 seconds   Succeeded
 installation-2-tools-prepare-t4fgk               tools-prepare               28 minutes ago   53 seconds   Succeeded
 installation-2-statistica-prepare-prhn9          statistica-prepare          28 minutes ago   1 minute     Succeeded
 installation-2-tensorflow-prepare-kxfm6          tensorflow-prepare          28 minutes ago   48 seconds   Succeeded
 installation-2-scheduling-server-prepare-lfn8l   scheduling-server-prepare   28 minutes ago   54 seconds   Succeeded
 installation-2-python-prepare-5bwlm              python-prepare              28 minutes ago   2 minutes    Succeeded
 installation-2-sbrt-base-prepare-b94jj           sbrt-base-prepare           28 minutes ago   54 seconds   Succeeded
 installation-2-deploy-artifacts-fqckl            deploy-artifacts            33 minutes ago   4 minutes    Succeeded
 installation-2-nexus-repositories-7qvfx          nexus-repositories          33 minutes ago   12 seconds   Succeeded

The installation pipeline

The install process is controlled via a Tekton pipeline called installation. This pipeline first installs the following Kubernetes Operators during the pre-install hook :

Kubernetes permissions are added to support Role-based access control (RBAC), security context constraints (SCC) and Streaming discovery.

The following container images are built in Kubernetes :

GIT server image - used to hold the ModelOps artifacts

Dependent helm sub charts also create :

General purpose tools image - used for various build and deploy tasks
TIBCO Streaming runtime base image - used as a base for further images
TIBCO ModelOps Server image - scoring pipeline, flow, and model management
TIBCO ModelOps Scoring Server image for each runner - model scoring
TIBCO Data Channel Registry - data source and sink registration and discovery
TIBCO ModelOps Scheduling Server - job scheduling

The following services are started :

GIT server
TIBCO ModelOps Server
Nexus repository configured with :
- Maven repository, populated with TIBCO artifacts
- Python repository (both proxy and hosted)
- Container registry
- Helm chart repository
TIBCO Data Channel Registry
TIBCO Scheduling Server

Finally the installation deploys a helm chart used to later deploy a ModelOps server.

Kubernetes rollout is paused during the installation process and resumed once new container images are available.

Individual pipeline tasks are scheduled by dependency and available resources.

Old pipeline runs left over from earlier upgrades are cleaned up so that the logs for the last 3 installations only are kept.

Figure: installation pipeline

Cloud platform differences

Kubernetes features differ between platforms and so the installation process also varies slightly. In general, natively provided features are used in preference to custom-provided features. These differences are shown below:

Feature	Docker for desktop	Kind	OpenShift	AKS	EKS	Nutanix
Operator Lifecycle Manager	Installed	Installed	Provided	Installed	TBD	TBD
Container registry	Nexus	Kind	ImageStream	ACR	TBD	TBD
Network exposure	node port	node port	route	load balancer	TBD	TBD
RBAC supported	No	Yes	Yes	Yes	TBD	TBD
SCC supported	No	No	Yes	No	TBD	TBD
Windows images supported	No	No	No	Yes	TBD	TBD

These differences are controlled via ModelOps helm chart values parameters - these can be viewed with the helm show values modelops-1.0.0.tgz command, for example :

$ helm show values modelops-1.0.0.tgz
#
# Default values for the chart
#
 
#
# declare as global so subcharts get the same
#
global:
 
#
# empty map for sub charts to populate to extend this chart
#
  buildsteps: {}
  supportedmodels: {}
  runafter: {}
 
#
# cloud environment
#
  cloud: docker-for-desktop
 
#
# Container timezone
#
  timeZone: Europe/London
 
#
# nexus specific settings
#
  nexus:
    nodePort:           30020
    containerNodePort:  30030
    adminPassword:      "admin123"
    hostname:           "nexus"
 
#
# The following values are defaulted depending on cloud type :
#
# installOLM - install the operator lifecycle manager
#
# containerRegistry - base URI of container registry.  Use the supplied one
#   if available.
#
# containerUsername/containerPassword - if set, used to access container registry
#
# networkExposure - mechanism to use to expose network
#
# createPVC - if true create persistent volume claim in helm chart, if false
#   the persistent volume claim must be created before installing the chart.
#
# selfSignedRegistry - if true then skip tls verification on registry
#
# httpRegistry - if true then use http registry
#
# adminRBAC - if true, create and use admin service account for admin tasks
#
# kubernetesRBAC - if true, create role for streaming pod to manage service (plain kubernetes)
#
# openshiftRBAC - if true, create role for streaming pod to manage service (openshift)
#
# openshiftOperator - if true, use the OpenShift operator hub
#
# windows - if true build windows container (currently statistica scoring server)
#
# dnsSuffix - AKS only, set azure annotation for public dns name, ie <container>-<dnsSuffix>.<region>.cloudapp.azure.com
#
 
  docker-for-desktop:
    installOLM:         true
    installMetrics:     true
    installLogs:        true
    containerRegistry:  "localhost:5000"
    networkExposure:    "nodePort"
    createPVC:          true
    httpRegistry:       true
    selfSignedRegistry: false
    adminRBAC:          false
    kubernetesRBAC:     false
    openshiftRBAC:      false
    openshiftOperator:  false
    windows:            false
 
  kind:
    installOLM:         true
    installMetrics:     true
    installLogs:        true
    containerRegistry:  "kind-registry:5000"
    networkExposure:    "nodePort"
    createPVC:          true
    selfSignedRegistry: false
    httpRegistry:       true
    adminRBAC:          true
    kubernetesRBAC:     true
    openshiftRBAC:      false
    openshiftOperator:  false
    windows:            false
 
  openshift:
    installOLM:         false
    installMetrics:     true
    installLogs:        true
    containerRegistry:  "image-registry.openshift-image-registry.svc:5000/default"
    networkExposure:    "route"
    createPVC:          true
    selfSignedRegistry: true
    httpRegistry:       false
    adminRBAC:          true
    kubernetesRBAC:     false
    openshiftRBAC:      true
    openshiftOperator:  true
    windows:            false
 
  aks:
    installOLM:         true
    installMetrics:     true
    installLogs:        true
    containerRegistry:  "myregistry.azurecr.io"
    containerUsername:  "azure appid"
    containerPassword:  "azure password"
    azureTenantId:      "azure tenantId"
    networkExposure:    "loadBalancer"
    domain:             "tobeset"
    createPVC:          false
    selfSignedRegistry: false
    httpRegistry:       true
    adminRBAC:          true
    kubernetesRBAC:     true
    openshiftRBAC:      false
    openshiftOperator:  false
    windows:            true
 
#
# sizing details
#
  small:
    nexus:
      disk: "20Gi"
      memory: "1.5Gi"
    git:
      disk: "5Gi"
    modelopsserver:
      disk: "5Gi"
    modelopsmetrics:
      memory: "10Gi"
      interval: "30"
    elasticsearch:
      disk: "5Gi"
      memory: "2Gi"
    prometheus:
      interval: "30s"
 
  medium:
    nexus:
      disk: "20Gi"
      memory: "1.5Gi"
    git:
      disk: "20Gi"
    modelopsserver:
      disk: "20Gi"
    modelopsmetrics:
      memory: "15Gi"
      interval: "10"
    elasticsearch:
      disk: "20Gi"
      memory: "5Gi"
    prometheus:
      interval: "10s"
 
  large:
    nexus:
      disk: "20Gi"
      memory: "1.5Gi"
    git:
      disk: "100Gi"
    modelopsserver:
      disk: "100Gi"
    modelopsmetrics:
      memory: "20Gi"
      interval: "10"
    elasticsearch:
      disk: "100Gi"
      memory: "10Gi"
    prometheus:
      interval: "10s"
  
#
# hence the chart may be installed :
#
#   helm install modelops target/helm/repo/modelops-1.0.0.tgz --set cloud=openshift
#
# or override individual settings
#
#   helm install modelops target/helm/repo/modelops-1.0.0.tgz --set cloud=openshift --set openshift.createPVC=true
#
 
#
# auto start deployments ( after image is built )
#
autostartdeployments:
  tools:              false
  statistica:         false
  pmml:               false
  tensorflow:         false
  scoring-flow:       false
  data-channel:       true
  scheduling-server:  true
  modelops-server:    true
 
#
#
# git specific settings
#
# if azureDiskURL is set, use azureDisk with that URL
#
git:
  nodePort:           30010
  username:           "modelops"
  password:           "modelops"
  repository:         "scoringpipelines"
  azureDiskURL:       ""
 
#
# modelops-server specific settings
#
# if azureDiskURL is set, use azureDisk with that URL
#
modelopsserver:
  nodePort:           30040
  username:           "admin"
  password:           "admin"
  azureDiskURL:       ""
 
#
# prometheus specific settings
#
prometheus:
  nodePort:           30050
 
#
# grafana specific settings
#
grafana:
  nodePort:           30060
 
#
# elasticsearch specific settings
#
elasticsearch:
  nodePort:           30070
  password:           "elastic"
 
#
# kibana specific settings
#
kibana:
  nodePort:           30080
 
#
# scheduling-server specific settings
#
schedulingserver:
  nodePort:           30090
  logLevel:           "INFO"
 
#
# data channel specific settings
#
#
datachannel:
  nodePort:           30100

So to choose the defaults for a given environment, just set global.cloud to the right environment :

$ helm install modelops modelops-1.0.0.tgz --set global.cloud=kind

However, individual settings can be overridden if required, using the global.<cloud name>.<parameter> format. For example:

$ helm install modelops modelops-1.0.0.tgz --set global.cloud=docker-for-desktop \
    --set global.docker-for-desktop.containerRegistry=myserver:30030

Some examples are shown in the sections below :


Docker for Desktop

To install docker for desktop :

Follow the instructions at https://www.docker.com/products/docker-desktop
Enable Kubernetes in the GUI
Ensure there are sufficient resources allocated

ModelOps is installed using these commands:

#
#  Install ModelOps helm charts
#
$ cd ${TIBCO_EP_HOME}/ep-modelops/kubernetes-installer/target/helm/repo
$ helm install modelops \
    modelops-1.0.0.tgz \
    --set global.cloud=docker-for-desktop \
    --set global.docker-for-desktop.containerRegistry=$(hostname -s):30030
#
#  Populate the Maven repository inside the K8s cluster
#
$ cd ${TIBCO_EP_HOME}/ep-modelops/kubernetes-installer/target/
$ kubectl cp kubernetes-installer-1.0.0-mavenrepo.zip \
    mavenrepo-0:/tmp/mavenrepo.zip

Kind

To install kind :

Install docker for desktop as above
Follow the instructions at https://kind.sigs.k8s.io/docs/user/quick-start/
Follow the instructions at https://kind.sigs.k8s.io/docs/user/local-registry/ to configure the registry

Once installed, a typical ModelOps installation command for kind is :

$ helm install modelops modelops-1.0.0.tgz --set global.cloud=kind
$ kubectl cp kubernetes-installer-1.0.0-mavenrepo.zip mavenrepo-0:/tmp/mavenrepo.zip

OpenShift Code Ready Containers (CRC)

To install code ready containers :

Follow the instructions at https://cloud.redhat.com/openshift/install/crc/installer-provisioned
Start with crc start --cpus 6 --memory 16384 --pull-secret-file ~/pull-secret.txt

Then a typical ModelOps installation command for OpenShift code ready containers is :

$ helm install modelops modelops-1.0.0.tgz --set global.cloud=openshift
$ kubectl cp kubernetes-installer-1.0.0-mavenrepo.zip mavenrepo-0:/tmp/mavenrepo.zip

AKS

To install AKS :

Follow the instructions at https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough
To support windows containers, follow the instructions at https://docs.microsoft.com/en-us/azure/aks/windows-container-cli#add-a-windows-server-node-pool

An example AKS script is :

namespace=modelops
group=modelops
cluster=modelops
acr=registry
azure_winpassword="set this"
azure_appId="set this"
azure_password="set this"
azure_tenantId="set this"
 
#
# create cluster
#
az group create --name ${group} --location uksouth
az acr create --name ${acr} --resource-group ${group} --sku basic
az aks create \
    --resource-group ${group} \
    --service-principal "${azure_appId}" \
    --client-secret "${azure_password}" \
    --name ${cluster} \
    --node-count 1 \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 5 \
    --no-ssh-key \
    --windows-admin-password "${azure_winpassword}" \
    --windows-admin-username azureuser \
    --vm-set-type VirtualMachineScaleSets \
    --node-vm-size Standard_B8ms \
    --network-plugin azure \
    --attach-acr ${acr}
 
#
# add windows pool and set taint to avoid default use
# (some operators will attempt to schedule linux pod on windows node)
#
az aks nodepool add \
    --resource-group ${group} \
    --cluster-name ${cluster} \
    --os-type Windows \
    --name npwin \
    --node-count 1 \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 2 \
    --node-vm-size Standard_B8ms \
    --node-taints os=windows:NoSchedule
 
#
# create any persistent volumes ( external to kubernetes / namespace )
#
nodegroup=$(az aks show --resource-group ${group} --name ${cluster} --query nodeResourceGroup -o tsv)
az disk create \
  --resource-group ${nodegroup} \
  --name modelops-server \
  --size-gb 5 \
  --query id --output tsv
az disk create \
  --resource-group ${nodegroup} \
  --name git-server \
  --size-gb 5 \
  --query id --output tsv

Once installed, a typical ModelOps installation command for AKS is :

$ helm install modelops modelops-1.0.0.tgz --atomic --set global.cloud=aks \
    --set global.aks.containerRegistry=${acr}.azurecr.io \
    --set global.aks.containerUsername=${azure_appId} \
    --set global.aks.containerPassword=${azure_password} \
    --set global.aks.azureTenantId=${azure_tenantId}
$ kubectl cp kubernetes-installer-1.0.0-mavenrepo.zip mavenrepo-0:/tmp/mavenrepo.zip

Note that the chosen Azure registry URL and authentication details must be provided.

Upgrading

To upgrade the ModelOps components use :

$ helm upgrade modelops modelops-1.0.0.tgz ...

However, it is common practice to use the same command for installation and upgrades:

$ helm upgrade modelops modelops-1.0.0.tgz --install ...

When the installation is upgraded the installation pipeline is re-executed and a rollout restart is performed on existing pods.
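
Because each upgrade re-executes the installation pipeline, it can be convenient to follow the new pipeline run straight after the helm command returns. The following is a sketch under the assumption that the most recent pipeline run in the modelops namespace is the one just started; the helper name upgrade_and_follow is hypothetical.

```shell
# Upgrade (or install) ModelOps, then follow the latest pipeline run.
# upgrade_and_follow is a hypothetical helper; extra arguments are passed
# through to helm, for example --set global.cloud=aks.
upgrade_and_follow() {
    helm upgrade --install modelops modelops-1.0.0.tgz "$@" &&
        tkn pipelinerun logs --last --follow --namespace modelops
}

# Example usage:
#   upgrade_and_follow --atomic --set global.cloud=aks
```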

Uninstalling

To uninstall the ModelOps components use:

$ helm uninstall modelops

Note that this doesn’t uninstall the Kubernetes operators (so that a further install is faster).

To uninstall everything and start from scratch, reset the Kubernetes cluster, for example in Docker for Desktop:

Reset Kubernetes Cluster

Troubleshooting

Always ensure the Kubernetes context is what you expect. For example, with Docker for Desktop:

$ kubectl config current-context
docker-desktop

The context is also displayed in the Docker for Desktop UI.
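
The context check can also be scripted so that an install or upgrade is refused when the wrong cluster is selected. This is a minimal sketch; the helper name require_context is hypothetical.

```shell
# Succeed only when the current kubectl context matches the expected one.
# require_context is a hypothetical helper; $1 is the expected context name.
require_context() {
    current=$(kubectl config current-context)
    if [ "$current" != "$1" ]; then
        echo "unexpected context: $current (wanted $1)" >&2
        return 1
    fi
}

# Example usage:
#   require_context docker-desktop && helm upgrade --install modelops modelops-1.0.0.tgz
```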