Cloud Installation

Introduction

This section provides general information on deploying ModelOps to a Kubernetes cluster.

Requirements

The following tools are required to complete the installation. They must be downloaded and installed before installing ModelOps:

Additional requirements depend on the Kubernetes cloud platform being used:

Optional Tools

These tools are optional, but have been found to be useful:

  • Lens
    • macOS: brew install lens

Azure AKS

Azure AKS also requires:

  • Azure CLI Tools must be installed and configured.
    • macOS: brew install azure-cli

OpenShift

OpenShift also requires:

Sizing

ModelOps can be quickly configured for a small, medium or large installation, whilst still allowing further customization as needed.

Whilst most values are best kept at their defaults, the values listed below have the biggest effect on sizing and so are exposed as install options.

Small - for single scoring

Minimum hardware is a single virtual machine with 6 cores, 30GB memory and 100GB disk space. Larger configurations allow for more concurrent scoring and faster execution.

Linux-based scoring only (a hybrid deployment of Linux and Windows nodes is required for Statistica scoring).

The helm install option, --set size=small, defaults to:

Feature | Value | Override name | Units
Scoring flow memory limit | 1.5Gi | small.scoringflow.memory | Kubernetes quantity
Scoring flow cpu limit | 2 | small.scoringflow.cpu | Kubernetes quantity
Nexus memory limit | 2Gi | small.nexus.memory | Kubernetes quantity
Nexus disk space | 20Gi | small.nexus.disk | Kubernetes quantity
Git server disk space | 5Gi | small.git.disk | Kubernetes quantity
Modelops server memory limit | 1Gi | small.modelopsserver.memory | Kubernetes quantity
Modelops server cpu limit | 2 | small.modelopsserver.cpu | Kubernetes quantity
Modelops server disk space | 5Gi | small.modelopsserver.disk | Kubernetes quantity
Modelops metrics memory limit | 10Gi | small.modelopsmetrics.memory | Kubernetes quantity
Modelops metrics sampling interval | 30 | small.modelopsmetrics.interval | Seconds
Modelops metrics table size | 50 | small.modelopsmetrics.tablesize | Megabytes
Modelops metrics data age | 5 | small.modelopsmetrics.age | Minutes
Elastic search disk space | 10Gi | small.elasticsearch.disk | Kubernetes quantity
Elastic search memory limit | 2Gi | small.elasticsearch.memory | Kubernetes quantity
Prometheus sampling interval | 30s | small.prometheus.interval | Seconds
Prometheus disk space | 10Gi | small.prometheus.disk | Kubernetes quantity

Kubernetes quantity: see https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity

Medium - for small teams, the installation default

Minimum hardware is a single virtual machine with 8 CPUs, 32GB memory and 100GB disk space. Cloud infrastructure configured for cluster scaling is recommended, to absorb variable demand. A useful cluster scaling configuration is a minimum of 1 virtual machine and a maximum of 5 virtual machines, although experience shows that 2 servers are usually sufficient.

Additional Windows virtual machines can be added to the cluster to support Statistica scoring if required.

The helm install option, --set size=medium, defaults to:

Feature | Value | Override name | Units
Scoring flow memory limit | 2Gi | medium.scoringflow.memory | Kubernetes quantity
Scoring flow cpu limit | 4 | medium.scoringflow.cpu | Kubernetes quantity
Nexus memory limit | 2Gi | medium.nexus.memory | Kubernetes quantity
Nexus disk space | 20Gi | medium.nexus.disk | Kubernetes quantity
Git server disk space | 20Gi | medium.git.disk | Kubernetes quantity
Modelops server memory limit | 2Gi | medium.modelopsserver.memory | Kubernetes quantity
Modelops server cpu limit | 4 | medium.modelopsserver.cpu | Kubernetes quantity
Modelops server disk space | 20Gi | medium.modelopsserver.disk | Kubernetes quantity
Modelops metrics memory limit | 15Gi | medium.modelopsmetrics.memory | Kubernetes quantity
Modelops metrics sampling interval | 10 | medium.modelopsmetrics.interval | Seconds
Modelops metrics table size | 50 | medium.modelopsmetrics.tablesize | Megabytes
Modelops metrics data age | 5 | medium.modelopsmetrics.age | Minutes
Elastic search disk space | 50Gi | medium.elasticsearch.disk | Kubernetes quantity
Elastic search memory limit | 5Gi | medium.elasticsearch.memory | Kubernetes quantity
Prometheus sampling interval | 10s | medium.prometheus.interval | Seconds
Prometheus disk space | 50Gi | medium.prometheus.disk | Kubernetes quantity

Kubernetes quantity: see https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity

Large - for larger teams

Minimum hardware is a single virtual machine with 16 CPUs, 64GB memory and 500GB disk space. Cloud infrastructure configured for cluster scaling is recommended, to absorb variable demand. A useful cluster scaling configuration is a minimum of 1 virtual machine and a maximum of 10 virtual machines.

Additional Windows virtual machines can be added to the cluster to support Statistica scoring if required.

The helm install option, --set size=large, defaults to:

Feature | Value | Override name | Units
Scoring flow memory limit | 5Gi | large.scoringflow.memory | Kubernetes quantity
Scoring flow cpu limit | 6 | large.scoringflow.cpu | Kubernetes quantity
Nexus memory limit | 2Gi | large.nexus.memory | Kubernetes quantity
Nexus disk space | 20Gi | large.nexus.disk | Kubernetes quantity
Git server disk space | 100Gi | large.git.disk | Kubernetes quantity
Modelops server memory limit | 4Gi | large.modelopsserver.memory | Kubernetes quantity
Modelops server cpu limit | 6 | large.modelopsserver.cpu | Kubernetes quantity
Modelops server disk space | 100Gi | large.modelopsserver.disk | Kubernetes quantity
Modelops metrics memory limit | 20Gi | large.modelopsmetrics.memory | Kubernetes quantity
Modelops metrics sampling interval | 10 | large.modelopsmetrics.interval | Seconds
Modelops metrics table size | 50 | large.modelopsmetrics.tablesize | Megabytes
Modelops metrics data age | 5 | large.modelopsmetrics.age | Minutes
Elastic search disk space | 100Gi | large.elasticsearch.disk | Kubernetes quantity
Elastic search memory limit | 10Gi | large.elasticsearch.memory | Kubernetes quantity
Prometheus sampling interval | 10s | large.prometheus.interval | Seconds
Prometheus disk space | 100Gi | large.prometheus.disk | Kubernetes quantity

Kubernetes quantity: see https://kubernetes.io/docs/reference/kubernetes-api/common-definitions/quantity/#Quantity

Further sizing customizations

Each of the above values can be overridden as needed using the override name listed above. For example, to increase the git server disk space with the medium configuration, use --set size=medium --set medium.git.disk=50Gi.
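
When several values are overridden at once, the size and override options can be assembled by a small wrapper. The following is only a sketch: it prints the helm command rather than running it, the sizing_command function name is hypothetical, and the chart path and cloud=aks setting are copied from the example in the Quick run through section.

```shell
#!/bin/sh
# Sketch: build (and print, not run) a helm command for a given size
# plus any number of override=value pairs.
# sizing_command is a hypothetical helper, not part of ModelOps.
sizing_command() {
    size="$1"
    shift
    cmd="helm upgrade --install installer helm-charts/kubernetes-installer-1.0.2.tgz --atomic --set cloud=aks --set size=${size}"
    for override in "$@"; do
        # each override is prefixed with the size, e.g. medium.git.disk=50Gi
        cmd="${cmd} --set ${size}.${override}"
    done
    printf '%s\n' "$cmd"
}

# Medium installation with a larger git server disk and more scoring flow memory
sizing_command medium git.disk=50Gi scoringflow.memory=4Gi
```

Printing the command first makes it easy to review the combined overrides before passing them to helm.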

Passwords and secrets

To avoid clear-text passwords, Kubernetes provides a Secrets facility. Prior to installation, Kubernetes Secrets must be created to contain the passwords required by ModelOps.

These are:

Description | Secret name | Key name | Comments
Elastic search | elasticsearch-es-elastic-user | Elastic search user name | See https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-users-and-roles.html - if not set elastic search generates a password
Git server | git-server | git user name |
Nexus server | nexus-server | admin |
Modelops server | modelops-server | admin |
Scoring flow admin | scoring-admin | admin |

These secrets may be created via the cloud infrastructure or on the command-line using kubectl. For example :

# elastic search
#
# note in this case we use apply to avoid elastic search re-creating the secret
#
kubectl create secret generic elasticsearch-es-elastic-user --from-literal=elastic=mysecretpassword --namespace modelops --dry-run=client --output=yaml 2>/dev/null > secret.yaml
kubectl apply --filename secret.yaml

# git server
#
kubectl create secret generic git-server --from-literal=modelops=mysecretpassword --namespace modelops

# nexus server
#
kubectl create secret generic nexus-server --from-literal=admin=mysecretpassword --namespace modelops

# modelops server
#
kubectl create secret generic modelops-server --from-literal=admin=mysecretpassword --namespace modelops

# scoring admin
#
kubectl create secret generic scoring-admin --from-literal=admin=mysecretpassword --namespace modelops

It is recommended to install an encryption provider for maximum security - see https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/.
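
The kubectl create commands above for the git, nexus, modelops and scoring secrets follow the same pattern, so they can be wrapped in a loop. This is a minimal sketch rather than a documented procedure: the create_secret and create_all_secrets helpers and the KUBECTL variable are hypothetical, and generating passwords with openssl is an assumption (any password source would do).

```shell
#!/bin/sh
# Sketch: create the git, nexus, modelops and scoring secrets in one pass.
# KUBECTL defaults to kubectl; set KUBECTL=echo for a dry run that only prints.
KUBECTL="${KUBECTL:-kubectl}"

# create_secret NAME KEY PASSWORD - hypothetical helper wrapping kubectl
create_secret() {
    "$KUBECTL" create secret generic "$1" \
        --from-literal="$2=$3" --namespace modelops
}

create_all_secrets() {
    # secret-name:key pairs as used in the examples above
    for pair in git-server:modelops nexus-server:admin \
                modelops-server:admin scoring-admin:admin; do
        name="${pair%%:*}"
        key="${pair##*:}"
        # assumption: openssl is installed; it generates a random password
        password="$(openssl rand -base64 18)"
        create_secret "$name" "$key" "$password"
    done
}
```

Calling create_all_secrets creates all four secrets. Because the passwords are generated rather than chosen, they would need to be read back from the secrets afterwards, for example with kubectl get secret git-server --namespace modelops --output yaml (the values are base64 encoded).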

Quick run through

The Helm CLI tool is used to install the ModelOps components to Kubernetes :

$ helm upgrade --install installer helm-charts/kubernetes-installer-1.0.2.tgz --atomic --set cloud=aks

This command first installs and starts the bootstrap pipeline, which installs the required Kubernetes operators. This takes a few seconds, after which the helm command returns with a summary of the installation.

For example :

Release "installer" does not exist. Installing it now.
NAME: installer
LAST DEPLOYED: Mon Jan 24 14:04:38 2022
NAMESPACE: modelops
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing ep-kubernetes-installer configured for docker-for-desktop in kubernetes v1.22.5

The Operator Lifecycle Manager has been installed

The bootstrap pipeline has started which includes :

  Adding kubernetes permissions
  Installing a nexus server 
  Installing ElasticSearch and Kibana
  Installing Prometheus
  Populating the nexus repository with artifacts
  Creating a product install pipeline from helm charts


  Starting the nexus server at :

    Internal web console URL - http://artifact-repository:80/
      Maven repository - http://artifact-repository:80/repository/maven-public/
      Helm repository - http://artifact-repository:80/repository/helm/
      PyPi proxy - http://artifact-repository:80/repository/pypi-group
      Container registry - 192.168.175.10:8082

  Starting prometheus server at :

    Internal URL - http://prometheus.modelops:9090

  Starting elasticsearch server at :

    Internal URL - http://elasticsearch-es-http:9200

    Userid is elastic, password set in kubernetes secret

  Starting kibana server at :

    Internal URL - http://kibana-kb-http

    Userid is elastic, password set in kubernetes secret


    The docker daemon should be configured to allow http pull requests :

    {
      "insecure-registries": [
        "192.168.175.10:8082"
      ]
    }

To track the progress of the bootstrap pipeline run :

  tkn pipelinerun logs bootstrap --follow --namespace modelops

The output depends on the cloud platform and any additional options selected. These details are also displayed with the helm status installer command.

The zip of maven artifacts should be copied using the kubectl cp command:

$ kubectl cp modelops-repo-1.2.0-mavenrepo.zip mavenrepo-0:/tmp/ --namespace modelops
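
Since kubectl cp needs a running container in the target pod, it can help to wait for the pod before copying. The sketch below assumes the mavenrepo-0 pod name from the command above; the copy_maven_zip helper, the KUBECTL variable and the 300 second timeout are hypothetical choices for illustration.

```shell
#!/bin/sh
# Sketch: wait for the mavenrepo-0 pod, then copy the maven artifact zip.
# KUBECTL defaults to kubectl; set KUBECTL=echo to print the commands only.
KUBECTL="${KUBECTL:-kubectl}"

# copy_maven_zip ZIPFILE - hypothetical helper
copy_maven_zip() {
    # wait until the pod reports Ready (timeout value is an assumption)
    "$KUBECTL" wait --for=condition=Ready pod/mavenrepo-0 \
        --namespace modelops --timeout=300s || return 1
    "$KUBECTL" cp "$1" mavenrepo-0:/tmp/ --namespace modelops
}
```

For example, copy_maven_zip modelops-repo-1.2.0-mavenrepo.zip performs the wait and then the same copy as shown above.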

At this point the installation has been started and, as mentioned above, the status of the installation can be monitored with tkn pipelinerun logs bootstrap -f. For example :

$ tkn pipelinerun logs bootstrap --follow --namespace modelops

[nexus : nexus] Installing nexus operator
[nexus : nexus] namespace/nexus-operator-system created
[nexus : nexus] customresourcedefinition.apiextensions.k8s.io/nexus.apps.m88i.io created
[nexus : nexus] role.rbac.authorization.k8s.io/nexus-operator-leader-election-role created
[nexus : nexus] clusterrole.rbac.authorization.k8s.io/nexus-operator-manager-role created
[nexus : nexus] clusterrole.rbac.authorization.k8s.io/nexus-operator-metrics-reader created
[nexus : nexus] clusterrole.rbac.authorization.k8s.io/nexus-operator-proxy-role created
[nexus : nexus] rolebinding.rbac.authorization.k8s.io/nexus-operator-leader-election-rolebinding created
[nexus : nexus] clusterrolebinding.rbac.authorization.k8s.io/nexus-operator-manager-rolebinding created
[nexus : nexus] clusterrolebinding.rbac.authorization.k8s.io/nexus-operator-proxy-rolebinding created
[nexus : nexus] service/nexus-operator-controller-manager-metrics-service created
....
[install-pipeline-run : run] 14:16:27.765 [main] INFO com.tibco.streaming.installpipeline.Kubernetes - To track the progress of the modelops-server pipeline run :
[install-pipeline-run : run] 14:16:27.766 [main] INFO com.tibco.streaming.installpipeline.Kubernetes -   tkn pipelinerun logs modelops-server --follow --namespace modelops

The installation process can run tasks in parallel - hence the output is prefixed with the task and lines are coloured.

Once the bootstrap pipeline has completed, the application pipeline can be monitored in a similar way :

$ tkn pipelinerun logs modelops-server --follow --namespace modelops
....
[scheduling-server-scale : scale] Resuming rollout of scheduling-server
[scheduling-server-scale : scale] deployment.apps/scheduling-server resumed

[data-channel-registry-prepare : prepare] Preparing directory for data-channel-registry
....

The installation is complete when the tkn pipelinerun logs modelops-server --follow --namespace modelops command completes. The tkn taskrun list command shows the task status:

$ tkn taskrun list --namespace modelops
NAME                                                        STARTED          DURATION     STATUS
modelops-server-modelops-server-scale                       1 minute ago     20 seconds   Succeeded
modelops-server-modelops-metrics-scale                      3 minutes ago    11 seconds   Succeeded
modelops-server-kafka-datasource-image                      4 minutes ago    2 minutes    Succeeded
modelops-server-kafka-datasink-image                        4 minutes ago    2 minutes    Succeeded
modelops-server-modelops-server-image                       4 minutes ago    3 minutes    Succeeded
modelops-server-scoring-flow-image                          5 minutes ago    5 minutes    Succeeded
modelops-server-test-datasource-image                       5 minutes ago    1 minute     Succeeded
modelops-server-test-datasink-image                         5 minutes ago    2 minutes    Succeeded
modelops-server-modelops-metrics-image                      6 minutes ago    2 minutes    Succeeded
modelops-server-modelops-metrics-maven                      7 minutes ago    1 minute     Succeeded
modelops-server-kafka-datasource-maven                      8 minutes ago    3 minutes    Succeeded
modelops-server-test-datasource-maven                       8 minutes ago    2 minutes    Succeeded
modelops-server-scoring-flow-maven                          8 minutes ago    2 minutes    Succeeded
modelops-server-test-datasink-maven                         8 minutes ago    2 minutes    Succeeded
modelops-server-kafka-datasink-maven                        8 minutes ago    3 minutes    Succeeded
modelops-server-modelops-server-maven                       8 minutes ago    3 minutes    Succeeded
modelops-server-scoring-flow-prepare                        8 minutes ago    43 seconds   Succeeded
modelops-server-kafka-datasource-prepare                    8 minutes ago    51 seconds   Succeeded
modelops-server-test-datasource-prepare                     8 minutes ago    44 seconds   Succeeded
modelops-server-kafka-datasink-prepare                      9 minutes ago    42 seconds   Succeeded
modelops-server-test-datasink-prepare                       9 minutes ago    42 seconds   Succeeded
modelops-server-modelops-metrics-prepare                    9 minutes ago    1 minute     Succeeded
modelops-server-modelops-server-prepare                     9 minutes ago    42 seconds   Succeeded
modelops-server-scheduling-server-scale                     10 minutes ago   39 seconds   Succeeded
modelops-server-data-channel-registry-scale                 10 minutes ago   19 seconds   Succeeded
modelops-server-sbrt-base-image                             20 minutes ago   11 minutes   Succeeded
modelops-server-statistica-image                            21 minutes ago   3 minutes    Succeeded
modelops-server-tensorflow-image                            21 minutes ago   14 minutes   Succeeded
modelops-server-rest-datasink-image                         21 minutes ago   4 minutes    Succeeded
modelops-server-spark-image                                 21 minutes ago   14 minutes   Succeeded
modelops-server-rest-datasource-image                       21 minutes ago   4 minutes    Succeeded
modelops-server-python-image                                21 minutes ago   15 minutes   Succeeded
modelops-server-file-datasource-image                       21 minutes ago   5 minutes    Succeeded
modelops-server-file-datasink-image                         21 minutes ago   3 minutes    Succeeded
modelops-server-rest-request-response-datachannel-image     21 minutes ago   4 minutes    Succeeded
modelops-server-pmml-image                                  21 minutes ago   10 minutes   Succeeded
modelops-server-scheduling-server-image                     21 minutes ago   11 minutes   Succeeded
modelops-server-jdbc-datasource-image                       21 minutes ago   2 minutes    Succeeded
modelops-server-data-channel-registry-image                 21 minutes ago   10 minutes   Succeeded
modelops-server-git-server-scale                            23 minutes ago   15 seconds   Succeeded
modelops-server-sbrt-base-maven                             24 minutes ago   3 minutes    Succeeded
modelops-server-rest-datasink-maven                         24 minutes ago   3 minutes    Succeeded
modelops-server-file-datasink-maven                         24 minutes ago   3 minutes    Succeeded
modelops-server-data-channel-registry-maven                 24 minutes ago   3 minutes    Succeeded
modelops-server-pmml-maven                                  25 minutes ago   3 minutes    Succeeded
modelops-server-rest-datasource-maven                       25 minutes ago   4 minutes    Succeeded
modelops-server-spark-maven                                 25 minutes ago   4 minutes    Succeeded
modelops-server-scheduling-server-maven                     25 minutes ago   4 minutes    Succeeded
modelops-server-git-server-image                            25 minutes ago   2 minutes    Succeeded
modelops-server-file-datasource-maven                       25 minutes ago   4 minutes    Succeeded
modelops-server-tensorflow-maven                            25 minutes ago   4 minutes    Succeeded
modelops-server-python-maven                                25 minutes ago   4 minutes    Succeeded
modelops-server-rest-request-response-datachannel-maven     25 minutes ago   4 minutes    Succeeded
modelops-server-jdbc-datasource-maven                       25 minutes ago   4 minutes    Succeeded
modelops-server-statistica-maven                            25 minutes ago   4 minutes    Succeeded
modelops-server-rest-datasink-prepare                       26 minutes ago   1 minute     Succeeded
modelops-server-git-server-prepare                          26 minutes ago   40 seconds   Succeeded
modelops-server-spark-prepare                               26 minutes ago   46 seconds   Succeeded
modelops-server-sbrt-base-prepare                           26 minutes ago   1 minute     Succeeded
modelops-server-tensorflow-prepare                          26 minutes ago   38 seconds   Succeeded
modelops-server-data-channel-registry-prepare               26 minutes ago   1 minute     Succeeded
modelops-server-file-datasink-prepare                       26 minutes ago   1 minute     Succeeded
modelops-server-file-datasource-prepare                     26 minutes ago   37 seconds   Succeeded
modelops-server-rest-request-response-datachannel-prepare   26 minutes ago   39 seconds   Succeeded
modelops-server-statistica-prepare                          26 minutes ago   35 seconds   Succeeded
modelops-server-scheduling-server-prepare                   26 minutes ago   48 seconds   Succeeded
modelops-server-pmml-prepare                                26 minutes ago   1 minute     Succeeded
modelops-server-rest-datasource-prepare                     26 minutes ago   52 seconds   Succeeded
modelops-server-python-prepare                              26 minutes ago   39 seconds   Succeeded
modelops-server-jdbc-datasource-prepare                     26 minutes ago   36 seconds   Succeeded
bootstrap-install-pipeline-run                              27 minutes ago   23 seconds   Succeeded
bootstrap-install-pipeline-image                            27 minutes ago   55 seconds   Succeeded
bootstrap-install-pipeline-maven                            29 minutes ago   1 minute     Succeeded
bootstrap-nexus-helm-index                                  29 minutes ago   20 seconds   Succeeded
bootstrap-install-pipeline-prepare                          29 minutes ago   20 seconds   Succeeded
bootstrap-ingress                                           31 minutes ago   1 minute     Succeeded
bootstrap-elasticsearch                                     31 minutes ago   2 minutes    Succeeded
bootstrap-deploy-artifacts                                  31 minutes ago   2 minutes    Succeeded
bootstrap-prometheus                                        31 minutes ago   1 minute     Succeeded
bootstrap-tools-image                                       33 minutes ago   1 minute     Succeeded
bootstrap-tools-prepare                                     33 minutes ago   7 seconds    Succeeded
bootstrap-nexus-repositories                                37 minutes ago   4 minutes    Succeeded
bootstrap-nexus                                             37 minutes ago   15 seconds   Succeeded
bootstrap-tidy-up                                           37 minutes ago   10 seconds   Succeeded
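
With this many task runs, a task that has not succeeded is easy to miss. As a sketch, the listing can be filtered to show only rows whose STATUS is not Succeeded. The not_succeeded function name is hypothetical, the filter assumes the column layout shown above with STATUS as the last field, and the Failed row in the sample input is invented purely for illustration.

```shell
#!/bin/sh
# Sketch: read `tkn taskrun list` output on stdin and print the name and
# status of every task run whose last column (STATUS) is not "Succeeded".
not_succeeded() {
    awk 'NR > 1 && NF > 0 && $NF != "Succeeded" { print $1, $NF }'
}

# Example with sample input (the Failed row is hypothetical).
# In practice, pipe the real command into the filter:
#   tkn taskrun list --namespace modelops | not_succeeded
not_succeeded <<'EOF'
NAME                                    STARTED          DURATION     STATUS
modelops-server-modelops-server-scale   1 minute ago     20 seconds   Succeeded
modelops-server-python-image            21 minutes ago   15 minutes   Failed
bootstrap-nexus                         37 minutes ago   15 seconds   Succeeded
EOF
```

An empty result means every listed task run succeeded.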

Alternative ways to manage volumes

By default, the ModelOps installation creates Kubernetes persistent volume claims for the ModelOps and git servers. If needed, these volumes can be managed differently.

  • Pre-create persistent volume claims

In this case the administrator pre-creates the git-server and modelops-server persistent volume claims (perhaps with a custom storage class and reclaim policy) and specifies the createPVC=false option when installing ModelOps:

$ helm upgrade --install installer helm-charts/kubernetes-installer-1.0.2.tgz --atomic --set cloud=aks --set createPVC=false

  • Use a custom storage class

In this case the administrator pre-creates a custom storage class (or chooses an existing non-default one) and specifies the modelopsserver.storageClass option when installing ModelOps:

$ helm upgrade --install installer helm-charts/kubernetes-installer-1.0.2.tgz --atomic --set cloud=aks --set modelopsserver.storageClass=customclass
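
For the pre-created claim case, the two claims could be generated along the following lines. This is a sketch under assumptions: the claim names git-server and modelops-server come from the text above and the 20Gi size mirrors the medium sizing defaults, but the ReadWriteOnce access mode, the customclass storage class name and the pvc_manifest helper are illustrative choices, not values confirmed by the ModelOps charts.

```shell
#!/bin/sh
# Sketch: print a PersistentVolumeClaim manifest for a named claim.
# pvc_manifest NAME SIZE STORAGE_CLASS - hypothetical helper
pvc_manifest() {
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
  namespace: modelops
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: $3
  resources:
    requests:
      storage: $2
EOF
}

# Print both claims as one multi-document YAML stream.
# To create them, pipe the output into: kubectl apply --filename -
pvc_manifest git-server 20Gi customclass
echo "---"
pvc_manifest modelops-server 20Gi customclass
```

Printing the manifests first allows them to be reviewed, or saved to a file, before they are applied and the installer is run with createPVC=false.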

The installation pipeline

The install process is controlled via a Tekton pipeline called installation. This pipeline first installs the following Kubernetes Operators during the pre-install hook :

Kubernetes permissions are added to support Role-based access control (RBAC), security context constraints (SCC) and Streaming discovery.

The following container images are built in Kubernetes :

  • GIT server image - used to hold the ModelOps artifacts

ModelOps helm charts also create:

  • TIBCO Streaming runtime base image - used as a base for further images
  • TIBCO ModelOps Server image - scoring pipeline, flow, and model management
  • TIBCO ModelOps Scoring Server image for each runner - model scoring
  • TIBCO Data Channel Registry - data source and sink registration and discovery
  • TIBCO ModelOps Scheduling Server - job scheduling
  • Various data channels

The following services are started :

  • GIT server
  • TIBCO ModelOps Server
  • Nexus repository configured with :
    • Maven repository, populated with TIBCO artifacts
    • Python repository (both proxy and hosted)
    • Container registry
    • Helm chart repository
  • TIBCO Data Channel Registry
  • TIBCO Scheduling Server

Finally the installation deploys a helm chart used to later deploy a ModelOps server.

Kubernetes rollout is paused during the installation process and resumed once new container images are available.

Individual pipeline tasks are scheduled by dependency and available resources.

[Figure: installation pipeline]

Cloud platform differences

Kubernetes features differ between platforms, so the installation process also varies slightly. In general, natively provided features are used in preference to custom-provided features. These differences are shown below:

Feature | OpenShift | AKS | EKS
Operator Lifecycle Manager | Provided | Installed | Installed
Container registry | ImageStream | ACR | ECR
Network exposure | route | Ingress | Ingress
RBAC supported | Yes | Yes | Yes
SCC supported | Yes | No | No
Windows images supported | No | Yes | No

These differences are controlled via values in the ModelOps helm chart, which can be viewed with the helm show values kubernetes-installer-1.0.2.tgz command. For example:

$ helm show values kubernetes-installer-1.0.2.tgz
#
# Default values for the chart
#

#
# cloud environment
#
cloud: docker-for-desktop

#
# image pull policy
#
pullpolicy:           "IfNotPresent"

#
# sizing
#
size: medium

#
# operator lifecycle manager specific settings
#
olm:
  operatorVersion:    "v0.17.0"

#
# tekton specific settings
#
tekton:
  operatorVersion:    "latest"

#
# nexus specific settings
#
nexus:
  operatorVersion:    "v0.6.0"
  internalPort:       80
  nodePort:           30020
  containerNodePort:  30030
  hostname:           "artifact-repository"
  maven:
    maven-proxy:
      url:            "https://repo1.maven.org/maven2/"
  pypi:
    pypi-proxy:
      url:            "https://pypi.org/"
  yum:
    yum-proxy:
      url:            "https://repo.almalinux.org/almalinux"
 
#
# The following values are defaulted depending on cloud type :
#
# installOLM - install the operator lifecycle manager
#
# containerRegistry - base URI of container registry.  Use the supplied one 
#   if available.
#
# containerUsername/containerPassword - if set, used to access container registry
#
# networkExposure - mechanism to use to expose network
#
# createPVC - if true create persistent volume claim in helm chart, if false 
#   the persistent volume claim must be created before installing the chart.
#
# selfSignedRegistry - if true then skip tls verification on registry
#
# httpRegistry - if true then use http registry
#
# roleBasedAccessControl - kubernetes or openshift
#
# windows - if true build windows container (currently statistica scoring server)
#
# dnsSuffix - AKS only, set azure annotation for public dns name, ie <container>-<dnsSuffix>.<region>.cloudapp.azure.com
#

docker-for-desktop:
  installOLM:         true
  installMetrics:     true
  installLogs:        true
  containerRegistry:  "localhost:5000"
  networkExposure:    "nodePort"
  createPVC:          true
  httpRegistry:       true
  selfSignedRegistry: false
  roleBasedAccessControl: "kubernetes"
  windows:            false
  ingressDomain:      "tobeset"

kind:
  installOLM:         true
  installMetrics:     true
  installLogs:        true
  containerRegistry:  "kind-registry:5000"
  networkExposure:    "ingress"
  createPVC:          true
  selfSignedRegistry: false
  httpRegistry:       true
  roleBasedAccessControl: "kubernetes"
  windows:            false
  ingressDomain:      "tobeset"

colima:
  installOLM:         true
  installMetrics:     true
  installLogs:        true
  containerRegistry:  "localhost:5000"
  networkExposure:    "nodePort"
  createPVC:          true
  httpRegistry:       true
  selfSignedRegistry: false
  roleBasedAccessControl: "kubernetes"
  windows:            false
  ingressDomain:      "tobeset"

openshift:
  installOLM:         false
  installMetrics:     true
  installLogs:        true
  containerRegistry:  "image-registry.openshift-image-registry.svc:5000/{{ .Release.Namespace }}"
  networkExposure:    "route"
  createPVC:          true
  selfSignedRegistry: true
  httpRegistry:       false
  roleBasedAccessControl: "openshift"
  windows:            false
  ingressDomain:      "tobeset"

aks:
  installOLM:         true
  installMetrics:     true
  installLogs:        true
  containerRegistry:  "myregistry.azurecr.io"
  containerUsername:  "azure appid"
  containerPassword:  "azure password"
  azureTenantId:      "azure tenantId"
  networkExposure:    "ingress"
  createPVC:          true
  selfSignedRegistry: false
  httpRegistry:       true
  roleBasedAccessControl: "kubernetes"
  windows:            true
  ingressDomain:      "tobeset"
  # oauth2:             "azure"

eks:
  installOLM:         true
  installMetrics:     true
  installLogs:        true
  containerRegistry:  "eks registry"
  region:             "region"
  networkExposure:    "ingress"
  createPVC:          true
  selfSignedRegistry: false
  httpRegistry:       true
  roleBasedAccessControl: "kubernetes"
  windows:            false
  ingressDomain:      "tobeset"
  # oauth2:             "cognito"

#
# sizing details
#
small:
  general:
    cpu: "2"
    memory: "400Mi"
  nexus:
    disk: "20Gi"
    memory: "2Gi"
  elasticsearch:
    disk: "10Gi"
    memory: "2Gi"
  prometheus:
    interval: "30s"
    disk: "10Gi"

medium:
  general:
    cpu: "2"
    memory: "400Mi"
  nexus:
    disk: "20Gi"
    memory: "2Gi"
  elasticsearch:
    disk: "50Gi"
    memory: "5Gi"
  prometheus:
    interval: "10s"
    disk: "50Gi"

large:
  general:
    cpu: "2"
    memory: "400Mi"
  nexus:
    disk: "20Gi"
    memory: "2Gi"
  elasticsearch:
    disk: "100Gi"
    memory: "10Gi"
  prometheus:
    interval: "10s"
    disk: "100Gi"

#
# hence the chart may be installed :
#
#   helm install installer kubernetes-installer-[version].tgz --set cloud=openshift
#
# or override individual settings
#
#   helm install installer kubernetes-installer-[version].tgz --set cloud=openshift --set openshift.createPVC=true
#

#
# Kubernetes DNS domain - not generally used but needed for windows work-arounds (see TMO-1156)
#
clusterName:              "svc.cluster.local"

#
# prometheus specific settings
#
# if storageClass is set, use storageClass in volumeClaimTemplate (otherwise system default is used)
#
# See https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects for retention time
#
prometheus:
  operatorVersion:    "30.0.1"
  nodePort:           30050
  retention:          "1y"
  storageClass:       ""
# see https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#alertmanagerconfigspec
#  alerts:
#    route:
#      groupBy: ['job']
#      receiver: "test"
#    receivers:
#    - name: "test"
#      emailConfigs:
#      - to: plord@tibco.com
#        from: plord@tibco.com
#        smarthost: smtp-relay.gmail.com:587

#
# elasticsearch specific settings
#
elasticsearch:
  operatorVersion:    "1.9.1"
  version:            "7.16.2"
  nodePort:           30070
  username:           "elastic"

#
# kibana specific settings
#
kibana:
  version:            "7.16.2"
  nodePort:           30080
  operatorVersion:    "1.9.1"

#
# ingress nginx specific settings
#
ingressnginx:
  version:            "4.0.1"

#
# cert manager specific settings
#
certmanager:
  version:            "v1.6.1"

#
# OAuth2
# 
oauth2:

  azure:
    # oauth2 values for azure
    #
    # need a secret "oauth2" with 
    #
    # TENANT_ID set to azure tenantid
    # CLIENT_ID set to azure application id
    # CLIENT_SECRET set to azure client secret
    #
    identityAttributeName:    "unique_name"
    roleAttributeName:        "roles"
    jwtAudience:              "${CLIENT_ID}"
    jwtIssuer:                "https://sts.windows.net/${TENANT_ID}/"
    jwksURL:                  "https://login.microsoftonline.com/common/discovery/keys"
    jwksCacheTimeoutSeconds:  "3600"
    ssoLogoutURL:             "https://login.microsoftonline.com/${TENANT_ID}/oauth2/logout?post_logout_redirect_uri=https://modelops-server.${MODELOPS_DOMAIN}/oauth2/sign_out"
    # oauth2-proxy settings - see https://oauth2-proxy.github.io/oauth2-proxy/docs/
    provider:                 "azure"
    emailclaim:               "unique_name"
    azuretenant:              "${TENANT_ID}"
    oidcissuerurl:            "https://sts.windows.net/${TENANT_ID}/"
    extrajwtissuers:          "https://login.microsoftonline.com/${TENANT_ID}/v2.0=${CLIENT_ID}"
    clientid:                 "${CLIENT_ID}"
    clientsecret:             "${CLIENT_SECRET}"
    whitelist:                "login.microsoftonline.com/${TENANT_ID}"

  cognito:
    # oauth2 values for amazon cognito
    #
    # need a secret "oauth2" with 
    #
    # REGION set to cognito region
    # POOL_ID set to cognito pool id
    # CLIENT_ID set to cognito client id
    # CLIENT_SECRET set to cognito client secret
    # DOMAIN set to cognito domain
    #
    identityAttributeName:    "email"
    roleAttributeName:        "cognito:groups"
    jwtAudience:              "${CLIENT_ID}"
    jwtIssuer:                "https://cognito-idp.${REGION}.amazonaws.com/${POOL_ID}"
    jwksURL:                  "https://cognito-idp.${REGION}.amazonaws.com/${POOL_ID}/.well-known/jwks.json"
    jwksCacheTimeoutSeconds:  "3600"
    ssoLogoutURL:             "https://${DOMAIN}.auth.${REGION}.amazoncognito.com/logout?client_id=${CLIENT_ID}&logout_uri=https://modelops-server.${MODELOPS_DOMAIN}/oauth2/sign_out"
    # oauth2-proxy settings - see https://oauth2-proxy.github.io/oauth2-proxy/docs/
    provider:                 "oidc"
    emailclaim:               "email"
    oidcissuerurl:            "https://cognito-idp.${REGION}.amazonaws.com/${POOL_ID}"
    clientid:                 "${CLIENT_ID}"
    clientsecret:             "${CLIENT_SECRET}"
    whitelist:                "tibco-modelops.auth.${REGION}.amazoncognito.com"

To choose the defaults for a given environment, set cloud to that environment :

$ helm install modelops kubernetes-installer-1.0.2.tgz --set cloud=aks

However, individual settings can be overridden if required, using the <cloud name>.<parameter> format. For example :

$ helm install modelops kubernetes-installer-1.0.2.tgz --set cloud=aks \
    --set aks.containerRegistry=myserver:30030
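
When several settings need overriding, a values file can be easier to maintain than a long list of --set flags. The following is a minimal sketch, assuming the same aks override shown above; the file name my-values.yaml is hypothetical, and -f is Helm's standard option for supplying a values file.

```shell
# Write the overrides to a values file (the file name is an assumption).
# The keys mirror the --set flags used above: cloud and aks.containerRegistry.
cat > my-values.yaml <<'EOF'
cloud: aks
aks:
  containerRegistry: "myserver:30030"
EOF

# Then pass the file to helm instead of repeating --set flags:
#   helm install modelops kubernetes-installer-1.0.2.tgz -f my-values.yaml
```

Keeping the overrides in a file also means the same file can be reused for later upgrades.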

Upgrading

To upgrade the ModelOps components use :

$ helm upgrade modelops kubernetes-installer-1.0.2.tgz ...

However, it's common practice to use the same command for installation and upgrades :

$ helm upgrade modelops kubernetes-installer-1.0.2.tgz --install ...

When the installation is upgraded, the installation pipeline is re-executed and a rollout restart is performed on existing pods.

Uninstalling

To uninstall the ModelOps components use:

$ helm uninstall modelops

Note that this doesn’t uninstall the Kubernetes operators (so that a further install is faster).

Troubleshooting

Always ensure the Kubernetes context is what you expect. For example :

$ kubectl config current-context
modelops

The context is also displayed in the Docker Desktop UI.
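
The context check can also be scripted so that an install or upgrade stops early when the wrong cluster is selected. This is a minimal sketch, not part of ModelOps: the function name require_context is hypothetical, and "modelops" is simply the example context name used above.

```shell
#!/bin/sh
# Succeed only when the current context matches the expected one.
# Both values are passed in by the caller, so the function itself
# does not call kubectl.
require_context() {
  expected="$1"
  current="$2"
  if [ "$current" != "$expected" ]; then
    echo "Refusing to continue: context is '$current', expected '$expected'" >&2
    return 1
  fi
  echo "Context OK: $current"
}

# Typical use before an install or upgrade:
#   require_context modelops "$(kubectl config current-context)" || exit 1
```

If the context is wrong, kubectl config get-contexts lists the available contexts and kubectl config use-context switches between them.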