Monitoring
This section describes how to use the installed ModelOps components to support monitoring, including how to access the services associated with those components.
Logging
All ModelOps components, including scoring pipelines, generate log records that are stored in the logging store (Elasticsearch). These services are available for accessing the logging components:
Component | Service | Default Credentials (username/password) |
---|---|---|
Logging Store | elasticsearch-es-http | elastic/elastic |
Logging Visualization | kibana-kb-http | elastic/elastic |
Accessing Logging Store - Elasticsearch
//
// Get elasticsearch-es-http service port
//
kubectl get services --namespace modelops elasticsearch-es-http
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch-es-http ClusterIP 10.0.86.229 <none> 9200/TCP 20d
//
// Set up port-forward to local port 9200
//
kubectl port-forward service/elasticsearch-es-http --namespace modelops 9200:9200
//
// Open browser window (macOS only)
//
open http://localhost:9200
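With the port-forward in place, the logging store can also be queried directly over its REST API. The command below is a minimal sketch that lists the indices holding log records, assuming the default elastic/elastic credentials from the table above:
//
// List indices in the logging store (assumes the port-forward above is running)
//
curl -u elastic:elastic "http://localhost:9200/_cat/indices?v"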
Accessing Logging Visualization - Kibana
//
// Get kibana-kb-http service port
//
kubectl get services --namespace modelops kibana-kb-http
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kibana-kb-http ClusterIP 10.0.11.207 <none> 80/TCP 20d
//
// Set up port-forward to local port 9300
//
kubectl port-forward service/kibana-kb-http --namespace modelops 9300:80
//
// Open browser window (macOS only)
//
open http://localhost:9300
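As a quick check that Kibana is reachable before opening the browser, its status API can be queried through the same port-forward. This is a minimal sketch assuming the default elastic/elastic credentials:
//
// Check Kibana status through the port-forward (default credentials assumed)
//
curl -u elastic:elastic "http://localhost:9300/api/status"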
Pod Logging
In addition to accessing log records in the logging store, log output can also be viewed directly from a Pod using this command:
//
// Follow log output - replace <pod-name> with actual Pod name
//
kubectl logs <pod-name> --namespace modelops --follow
See Service Pods for instructions on getting Pod names.
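For a quick look at the candidate Pod names (the Service Pods section covers this in more detail), the Pods in the ModelOps namespace can simply be listed:
//
// List Pods in the modelops namespace to find the Pod name
//
kubectl get pods --namespace modelops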
Metrics
The metrics architecture and metric names are described separately. These services are available for accessing the metrics components:
Component | Service | Default Credentials (username/password) |
---|---|---|
Metrics Store | prometheus | None |
Real-Time Metrics | modelops-metrics | None |
Accessing Metrics Store - Prometheus
//
// Get prometheus service port
//
kubectl get services --namespace modelops prometheus
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
prometheus ClusterIP 10.0.109.248 <none> 9090/TCP 20d
//
// Set up port-forward to local port 9090
//
kubectl port-forward service/prometheus --namespace modelops 9090:9090
//
// Open browser window (macOS only)
//
open http://localhost:9090
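The metrics store can also be queried over the Prometheus HTTP API through the same port-forward. The query below is a minimal sketch using the built-in up metric; the actual ModelOps metric names are covered in the metrics documentation referenced above:
//
// Query the Prometheus HTTP API through the port-forward
// The built-in "up" metric reports which scrape targets are healthy
//
curl "http://localhost:9090/api/v1/query?query=up"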
Accessing Real-Time Metrics - LiveView Web
//
// Get real-time metrics service port
//
kubectl get services --namespace modelops modelops-metrics
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
modelops-metrics ClusterIP 10.0.42.12 <none> 80/TCP 20d
//
// Set up port-forward to local port 9070
//
kubectl port-forward service/modelops-metrics --namespace modelops 9070:80
//
// Open browser window (macOS only)
//
open http://localhost:9070
Scoring Pipelines and Data Channels
Scoring pipelines and data channels are started using Tekton pipelines and tasks, which are created using Helm charts.
When a scoring pipeline or data channel is deployed, a PipelineRun instance is created, along with associated TaskRun instances, in the modelops namespace. The PipelineRun and TaskRun instances can be used to monitor the status of running scoring pipelines and data channels.
tkn must be installed on the local workstation. See the general installation instructions for details on installing Tekton.
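A simple way to confirm that the Tekton CLI is available on the workstation:
//
// Verify the Tekton CLI is installed
//
tkn version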
Running Pipelines and Data Channels
The running scoring pipelines and data channels are displayed with:
//
// Display all PipelineRun instances
//
tkn pipelinerun list --namespace modelops
The PipelineRun naming conventions are:
- file-datasink-* - file data sinks
- file-datasource-* - file data sources
- installation-* - ModelOps installation
- kafka-datasink-* - Kafka data sinks
- kafka-datasource-* - Kafka data sources
- scoringpipeline-* - scoring pipelines
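These prefixes can be used to narrow the listing to one kind of deployment. For example, a minimal sketch that shows only scoring pipelines:
//
// List only scoring pipeline PipelineRun instances
//
tkn pipelinerun list --namespace modelops | grep scoringpipeline-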
Logging
Logging output can be displayed for both PipelineRun and TaskRun instances using these commands:
//
// Display PipelineRun logs - replace <pipelinerun-name> with actual PipelineRun name
//
tkn pipelinerun logs --namespace modelops <pipelinerun-name>
//
// Display TaskRun logs - replace <taskrun-name> with actual TaskRun name
//
tkn taskrun logs --namespace modelops <taskrun-name>
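To stream log output while a PipelineRun is still executing, the logs can be followed instead of displayed once; a minimal sketch:
//
// Follow live PipelineRun logs - replace <pipelinerun-name> with actual PipelineRun name
//
tkn pipelinerun logs --namespace modelops --follow <pipelinerun-name>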
Identifying Pods
Scoring pipelines and data channels run in a Pod. The Pod that was started by a PipelineRun to deploy the scoring pipeline or data channel can be determined using the jobIdentifier and namespace associated with the wait TaskRun.
The TaskRun instances associated with a PipelineRun are displayed by the describe command. The wait TaskRun has a TASK NAME of wait in this command output.
//
// Display TaskRuns associated with a PipelineRun
// Replace <pipelinerun-name> with actual PipelineRun name
//
tkn pipelinerun describe --namespace modelops <pipelinerun-name>
For example:
tkn pipelinerun describe --namespace modelops kafka-datasource-bt2b9
Name: kafka-datasource-bt2b9
Namespace: modelops
Pipeline Ref: install-datachannel
Service Account: default
Labels:
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=install-datachannel
app.kubernetes.io/part-of=modelops
tekton.dev/pipeline=install-datachannel
🌡️ Status
STARTED DURATION STATUS
1 day ago --- Running
📦 Resources
No resources
⚓ Params
NAME VALUE
∙ logLevel INFO
∙ sourceUrls [raw/commit/a75e7941e584889af60260883f4bd9c5c926fd76/sbhosle-race-cardata/car-kafka-source.datachannel.values.yaml]
∙ namespace datachannels
∙ externalNamespaces [development]
∙ deployParameters []
∙ durationMinutes 0m
∙ trace false
📝 Results
No results
📂 Workspaces
No workspaces
🗂 Taskruns
NAME TASK NAME STARTED DURATION STATUS
∙ kafka-datasource-bt2b9-wait-jgj2f wait 1 day ago --- Running
∙ kafka-datasource-bt2b9-install-datachannel-87mrv install-datachannel 1 day ago 11 seconds Succeeded
⏭️ Skipped Tasks
No Skipped Tasks
The wait TaskRun is named kafka-datasource-bt2b9-wait-jgj2f.
This command is then used to display the jobIdentifier and namespace:
//
// Describe TaskRun details - replace <taskrun-name> with actual TaskRun name
//
tkn taskrun describe --namespace modelops <taskrun-name>
For example:
tkn taskrun describe --namespace modelops kafka-datasource-bt2b9-wait-jgj2f
Name: kafka-datasource-bt2b9-wait-jgj2f
Namespace: modelops
Task Ref: wait-datachannel
Service Account: default
Labels:
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=wait-datachannel
app.kubernetes.io/part-of=modelops
tekton.dev/pipeline=install-datachannel
tekton.dev/pipelineRun=kafka-datasource-bt2b9
tekton.dev/pipelineTask=wait
tekton.dev/task=wait-datachannel
🌡️ Status
STARTED DURATION STATUS
1 day ago --- Running
📨 Input Resources
No input resources
📡 Output Resources
No output resources
⚓ Params
NAME VALUE
∙ jobIdentifier kafka-datasource-bt2b9
∙ namespace datachannels
∙ durationMinutes 0m
∙ install-error-message
📝 Results
No results
📂 Workspaces
No workspaces
🦶 Steps
NAME STATUS
∙ install Running
🚗 Sidecars
No sidecars
This output indicates that this Kafka Data Source is deployed in the datachannels namespace and has a jobIdentifier of kafka-datasource-bt2b9.
Finally, the associated Pod can be found using this command:
//
// Get all pods in namespace and filter by job-identifier prefix
// Replace <namespace> and <job-identifier> with values found above
//
kubectl get pods --namespace <namespace> | grep <job-identifier>
Pods started by TaskRun instances have names prefixed with the jobIdentifier.
For example:
kubectl get pods --namespace datachannels | grep kafka-datasource-bt2b9
kafka-datasource-bt2b9-b79bd6cc8-pnwpx 1/1 Running 0 31h
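Once the Pod name is known, its log output can be followed directly, as described in Pod Logging above. A minimal sketch using the example Pod found here:
//
// Follow the log output of the identified Pod
//
kubectl logs kafka-datasource-bt2b9-b79bd6cc8-pnwpx --namespace datachannels --follow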