Kubernetes Deployment
This section provides prerequisites and describes how to deploy ibi Data Quality in a Kubernetes cluster.
Kubernetes Overview
Kubernetes® (K8s) is an open-source system for running, managing, and orchestrating containerized applications in a cluster of servers (known as a Kubernetes cluster). Kubernetes clusters can run in any cloud environment (e.g., private, public, hybrid) or on-premises.
A Kubernetes cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node.
The size of a Kubernetes cluster (determined by the number of nodes) depends on the application workloads that will be running. For example, each node can represent an 8-core / 64 GB RAM system. Pods are the basic objects in a Kubernetes cluster; each Pod consists of one or more software containers. The worker nodes host the Pods that make up the application workload. Usually, only one container runs in a Pod, but multiple containers can run in a Pod if needed (depending on specific environment requirements). If a Pod fails, Kubernetes can automatically replace it with a new instance.
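For illustration, a minimal Pod manifest with a single container might look like the following sketch (the image name is a placeholder, not an ibi Data Quality component):
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app             # a single container per Pod is the common case
    image: nginx:1.25     # placeholder image, for illustration only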
Key benefits of Kubernetes include automatic scalability, efficient load balancing, high availability, failover/fault tolerance, and deploying updates across your environment without any disruptions to users.
Hardware Requirements
ibi Data Quality is a collection of microservices running as separate containers in a private cloud environment such as Kubernetes. Your Kubernetes cluster must have at least the following resources:
- With Horizontal Pod Autoscaler (HPA):
- 42.5 vCPUs
- 33.7 GB of memory
- Without Horizontal Pod Autoscaler (HPA):
- 22.5 vCPUs
- 23.7 GB of memory
Both configurations also require 500 GB of available persistent volume storage.
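To confirm that the cluster nodes have enough allocatable capacity, you can list each node's allocatable CPU and memory, for example:
$ kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory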
The following deployment resource spec table lists the recommended replica counts and per-replica CPU and memory requests and limits for each container.
| Container Name | Min Replicas | Max Replicas | CPU Requests (per replica) | CPU Limits (per replica) | Memory Requests (per replica) | Memory Limits (per replica) |
|---|---|---|---|---|---|---|
| dq-address-server | 1 | 1 | 2000m | 2000m | 6Gi | 8Gi |
| dq-butler | 1 | 2 | 1000m | 3000m | 2Gi | 3Gi |
| dq-dqs | 1 | 1 | 500m | 1000m | 1Gi | 1Gi |
| dq-dsml-classifier | 1 | 1 | 2000m | 5000m | 2Gi | 4Gi |
| dq-grafana | 1 | 1 | 1000m | 2000m | 1Gi | 3Gi |
| dq-ksource | 1 | 1 | 1000m | 2000m | 1Gi | 2Gi |
| dq-postgres | 1 | 1 | 2000m | 4000m | 2Gi | 8Gi |
| dq-profiler | 4 | 8 | 4000m | 4000m | 2Gi | 4Gi |
| dq-python-services | 3 | 6 | 4000m | 4000m | 2Gi | 8Gi |
| dq-valet-services | 1 | 3 | 500m | 1000m | 1Gi | 2Gi |
| dq-valet-ui | 1 | 1 | 500m | 1000m | 1Gi | 2Gi |
| dq-wso2 | 1 | 1 | 2000m | 4000m | 2Gi | 3Gi |
| dq-redis | 1 | 1 | 500m | 2000m | 200Mi | 1Gi |
| dq-log-viewer | 1 | 1 | 1000m | 1000m | 0.5Gi | 1Gi |
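As an illustration of how the table values map onto Kubernetes settings, the dq-butler row corresponds to a container resources block like the following sketch (the exact structure in the shipped Helm chart may differ):
resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 3000m
    memory: 3Gi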
Installation
To install ibi Data Quality:
- Download and unzip the ibi_tdq_{product_version}_container.zip file.
- Download and extract a Linux installation of Loqate into the NFS server volume that will be used by the instance of ibi Data Quality being deployed to the Kubernetes cluster:
<NFS_Volume_Mount_Path>/loqate
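For example, assuming the Loqate distribution is delivered as an archive named loqate_linux.tar.gz (a hypothetical file name; use the name of the archive you downloaded), the extraction might look like:
$ mkdir -p <NFS_Volume_Mount_Path>/loqate
$ tar -xzf loqate_linux.tar.gz -C <NFS_Volume_Mount_Path>/loqate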
Building Docker Images for Kubernetes
Before you deploy ibi Data Quality in a Kubernetes cluster, you must first create Docker images for the ibi Data Quality components.
- Open Windows PowerShell or any Linux shell.
- Change your directory to the folder where you extracted the ibi_tdq_5.2.0_container.zip file.
- Change your directory to the install/dq-k8s/scripts folder.
- Deploy the NGINX Ingress controller using the following command:
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.3.1/deploy/static/provider/baremetal/deploy.yaml
This command will deploy Kubernetes objects into the ingress-nginx namespace.
After deployment, run the following command to extract the mapped controller HTTPS port of the cluster, which will be required in the next step:
$ kubectl -n ingress-nginx get service ingress-nginx-controller | grep -oP '(?<=443:).*(?=/TCP)'
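Alternatively, the same NodePort can be read with kubectl jsonpath output, which avoids parsing the formatted table:
$ kubectl -n ingress-nginx get service ingress-nginx-controller -o jsonpath='{.spec.ports[?(@.port==443)].nodePort}'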
- Change your directory to the install/ folder under the location where you extracted the ibi_tdq_5.2.0_container.zip file, and run the following command:
$ docker-compose -f docker-compose-kubernetes.yaml build
- For the steps to build the DSML images, refer to the Installing WebFOCUS DSML Services Container Edition section.
- (Optional) Push the Docker images to the image registry that can be accessed by the Kubernetes cluster.
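For example, tagging and pushing one of the built images might look like the following (the image name and tag are illustrative; use the names produced by the build step):
$ docker tag dq-butler:5.2.0 <IMAGE_REGISTRY>/dq-butler:5.2.0
$ docker push <IMAGE_REGISTRY>/dq-butler:5.2.0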
Prerequisites
- NFS mount made available to the Kubernetes nodes.
- OpenSSL toolchain installed.
- Helm package management software installed.
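You can verify these prerequisites from a shell, for example (showmount requires the NFS client utilities to be installed):
$ showmount -e <NFS_SERVER_IP>
$ openssl version
$ helm version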
Deploying Infrastructure Components
This section describes how to deploy ibi Data Quality infrastructure components.
- Change your directory to the install/dq-k8s/helm/ folder under the location where you extracted the ibi_tdq_5.2.0_container.zip file.
- Create a namespace (for example, idq-ns) using the following command:
$ kubectl create namespace idq-ns
This will create the namespace object idq-ns in the Kubernetes cluster.
- Create a secret for ingresses.
Either create a self-signed certificate using the following command, or use a trusted certificate key and certificate to create a Kubernetes secret.
The following command will create a self-signed certificate for IP address 10.10.11.12, with dq.key and dq.cert as the output files. The -addext option used to set the subject alternative name requires OpenSSL 1.1.1 or later:
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout dq.key -out dq.cert -subj "/CN=10.10.11.12/O=customer.org" -addext "subjectAltName=DNS:*.10.10.11.12"
The following command stores the dq.key and dq.cert files created by the previous command in a Kubernetes TLS secret named dq-tls-secret in the idq-ns namespace:
$ kubectl create secret tls dq-tls-secret --key dq.key --cert dq.cert -n idq-ns
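You can confirm that the secret was created using the following command:
$ kubectl -n idq-ns get secret dq-tls-secret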
- In the dq/values.yaml file, replace the following string patterns with the appropriate values, as shown in the example after this list:
- <CLUSTER_DOMAIN>: The cluster domain intended to be used for the HTTPS URL.
- <INGRESS_NODEPORT>: The Ingress Controller’s HTTPS NodePort.
- <IMAGE_REGISTRY>: The Docker image registry URL.
- <NFS_SERVER_IP>: The NFS server IP used for Persistent Volume.
- <NFS_SERVER_MOUNT_PATH>: The exported path on the NFS server.
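For example, the placeholders can be substituted in place with sed (the values shown are illustrative):
$ sed -i -e 's|<CLUSTER_DOMAIN>|10.10.11.12|g' -e 's|<INGRESS_NODEPORT>|30443|g' -e 's|<IMAGE_REGISTRY>|registry.example.com|g' -e 's|<NFS_SERVER_IP>|10.10.11.20|g' -e 's|<NFS_SERVER_MOUNT_PATH>|/exports/dq|g' dq/values.yaml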
- (Optional) For users who require the Horizontal Pod Autoscaler (HPA) feature, set autoscaling.enabled to true in the dq/values.yaml file.
HorizontalPodAutoscaler automatically scales the ibi Data Quality components to match demand based on observed metrics, such as average CPU utilization and average memory utilization.
All the HPA component targets are set to be 50% of the requested CPU and memory usage. Please check the minimum and maximum replica number for each component, as noted in the deployment resource spec table in Hardware Requirements.
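Assuming the chart follows the common Helm convention for this setting, the relevant fragment of dq/values.yaml would look like the following sketch:
autoscaling:
  enabled: true
The same value can also be supplied at install time with the Helm command-line option --set autoscaling.enabled=true.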
- Install the ibi Data Quality Helm chart.
The following command will install the ibi Data Quality Helm chart into the idq-ns namespace with the release name idq:
$ helm upgrade --install -n idq-ns --create-namespace idq ./dq
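You can confirm that the release was installed using the following commands:
$ helm -n idq-ns list
$ helm -n idq-ns status idq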
Verifying and Using
This section describes how to verify and test ibi Data Quality in a Kubernetes cluster by confirming that all required services and components are running.
Confirming Services and Components Are Up and Running
To confirm ibi Data Quality services and components are running:
- Confirm that the WSO2 Identity Server is started and running.
The following is a sample command you can use to check the status of the dq-wso2 pod:
$ kubectl -n idq-ns get pod -l app.kubernetes.io/component=dq-wso2
$ kubectl -n idq-ns describe pod -l app.kubernetes.io/component=dq-wso2
- Confirm that the dq-valet-ui service is started and running; this can take approximately 15 minutes after deployment.
The following is a sample command you can use to check the status of the dq-valet-ui pod:
$ kubectl -n idq-ns get pod -l app.kubernetes.io/component=dq-valet-ui
$ kubectl -n idq-ns describe pod -l app.kubernetes.io/component=dq-valet-ui
- Verify that all other ibi Data Quality pods in the deployment are started and running.
The following is a sample command you can use to check the status of each ibi Data Quality pod:
$ kubectl -n idq-ns get pod
The following is a sample command you can use to check the status of each ibi Data Quality service:
$ kubectl -n idq-ns get svc
The following is a sample command you can use to check the status of each ibi Data Quality ingress:
$ kubectl -n idq-ns get ingress
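Instead of polling manually, you can also block until every pod in the namespace reports Ready (the 900-second timeout is an illustrative value):
$ kubectl -n idq-ns wait --for=condition=Ready pod --all --timeout=900s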
Logging in to the ibi Data Quality Console
If a self-signed or untrusted certificate is used, there are three certificates that must be accepted by the browser before logging into the ibi Data Quality console.
- Visit the WSO2 console at the following URL (substituting values for <DQ_CLUSTER_DOMAIN> and <DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>) and accept the self-signed certificate:
https://dq-wso2.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
- Visit the following URL in your browser (substituting values for <DQ_CLUSTER_DOMAIN> and <DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>) and accept the certificate:
https://dq-valet-services.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>/api/v1/about
- Visit the following URL in your browser (substituting values for <DQ_CLUSTER_DOMAIN> and <DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>) to access the ibi Data Quality Console:
https://dq-valet-ui.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
The default login credentials (user ID and password) are:
- User ID: dqadmin
- Password: dqadmin
- Visit the following URL in your browser (substituting values for <DQ_CLUSTER_DOMAIN> and <DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>) to access the log files for ibi Data Quality:
https://dq-log-viewer.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
The default login credentials (user ID and password) are:
- User ID: dqadmin
- Password: dqadmin
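If you prefer to verify the endpoints from a shell before using a browser, you can use curl with the -k option, which skips certificate verification and is appropriate only for self-signed certificates, for example:
$ curl -k https://dq-valet-services.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>/api/v1/about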