This section provides prerequisites and describes how to deploy ibi Data Quality in a Kubernetes cluster.
Kubernetes® (K8s) is an open-source system for running, managing, and orchestrating containerized applications in a cluster of servers (known as a Kubernetes cluster). Kubernetes clusters can run in any cloud environment (e.g., private, public, hybrid) or on-premises.
A Kubernetes cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node.
The size of a Kubernetes cluster (the number of nodes) depends on the application workloads that it will run. For example, each node might be an 8-core system with 64 GB of RAM. Pods are the basic objects in a Kubernetes cluster; each Pod consists of one or more software containers, and the worker nodes host the Pods that make up the application workload. Usually a Pod runs a single container, but it can run several if the environment requires it. If a Pod fails, Kubernetes can automatically replace it with a new instance.
Key benefits of Kubernetes include automatic scalability, efficient load balancing, high availability, failover and fault tolerance, and the ability to deploy updates across your environment without disrupting users.
ibi Data Quality is a collection of microservices running as separate containers in a private cloud environment such as Kubernetes. Your Kubernetes cluster must have at least the following:
Both cases would require 500 GB of persistent volume to be available.
The following deployment resource spec table lists the recommended CPU, memory, and volume requirements for each of the containers.
Container Name | Replicas (Min) | Replicas (Max) | CPU Requests (Per Replica) | CPU Limits (Per Replica) | Memory Requests (Per Replica) | Memory Limits (Per Replica) |
---|---|---|---|---|---|---|
dq-address-server | 1 | 1 | 2000m | 2000m | 6Gi | 8Gi |
dq-butler | 1 | 2 | 1000m | 3000m | 2Gi | 3Gi |
dq-dqs | 1 | 1 | 500m | 1000m | 1Gi | 1Gi |
dq-dsml-classifier | 1 | 1 | 2000m | 5000m | 2Gi | 4Gi |
dq-grafana | 1 | 1 | 1000m | 2000m | 1Gi | 3Gi |
dq-ksource | 1 | 1 | 1000m | 2000m | 1Gi | 2Gi |
dq-postgres | 1 | 1 | 2000m | 4000m | 2Gi | 8Gi |
dq-profiler | 4 | 8 | 4000m | 4000m | 2Gi | 4Gi |
dq-python-services | 3 | 6 | 4000m | 4000m | 2Gi | 8Gi |
dq-valet-services | 1 | 3 | 500m | 1000m | 1Gi | 2Gi |
dq-valet-ui | 1 | 1 | 500m | 1000m | 1Gi | 2Gi |
dq-wso2 | 1 | 1 | 2000m | 4000m | 2Gi | 3Gi |
dq-redis | 1 | 1 | 500m | 2000m | 200Mi | 1Gi |
dq-log-viewer | 1 | 1 | 1000m | 1000m | 0.5Gi | 1Gi |
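As a rough sizing aid, the minimum CPU and memory that the cluster must be able to schedule can be estimated by multiplying each container's minimum replica count by its per-replica request and summing the results. The sketch below does that arithmetic with awk. The function name `dq_min_requests` is illustrative, the figures are copied from the table above, and the estimate ignores overhead from Kubernetes system Pods and the ingress controller.

```shell
# Sum (min replicas x request) over the deployment resource spec table.
# Columns: container, min replicas, CPU request (millicores), memory request (Mi).
# Memory values are converted to Mi (1Gi = 1024Mi, 0.5Gi = 512Mi).
dq_min_requests() {
  awk '{ cpu += $2 * $3; mem += $2 * $4 }
       END { printf "cpu_millicores=%d memory_mi=%d\n", cpu, mem }' <<'EOF'
dq-address-server   1 2000 6144
dq-butler           1 1000 2048
dq-dqs              1 500  1024
dq-dsml-classifier  1 2000 2048
dq-grafana          1 1000 1024
dq-ksource          1 1000 1024
dq-postgres         1 2000 2048
dq-profiler         4 4000 2048
dq-python-services  3 4000 2048
dq-valet-services   1 500  1024
dq-valet-ui         1 500  1024
dq-wso2             1 2000 2048
dq-redis            1 500  200
dq-log-viewer       1 1000 512
EOF
}

dq_min_requests
```

With the table's values this comes to 42000 millicores (42 cores) and 34504 Mi (about 33.7 Gi) of requests at minimum replica counts. Limits and maximum replica counts are higher, so the cluster needs headroom beyond this floor.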
To install ibi Data Quality:
<NFS_Volume_Mount_Path>/loqate
Before you deploy ibi Data Quality in a Kubernetes cluster, you must first create Docker images for the ibi Data Quality components.
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.3.1/deploy/static/provider/baremetal/deploy.yaml
This command will deploy Kubernetes objects into the ingress-nginx namespace.
After deployment, run the following command to extract the mapped controller HTTPS port of the cluster, which will be required in the next step:
$ kubectl -n ingress-nginx get service ingress-nginx-controller | grep -oP '(?<=443:).*(?=/TCP)'
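The `-P` option used above is a GNU grep extension and is not available in every grep implementation. As an alternative sketch, the same value can be extracted with sed, or read directly from the Service with a JSONPath query. The function name `extract_https_nodeport` is illustrative, and the port numbers in the example are made up.

```shell
# Read `kubectl get service` output (or just the PORT(S) column) on stdin
# and print the NodePort that is mapped to port 443.
extract_https_nodeport() {
  sed -nE 's#(^|.*[ ,])443:([0-9]+)/TCP.*#\2#p'
}

# Example with a sample PORT(S) value (hypothetical port numbers):
echo '80:30080/TCP,443:31443/TCP' | extract_https_nodeport

# Against a live cluster (not run here):
#   kubectl -n ingress-nginx get service ingress-nginx-controller | extract_https_nodeport
# or, without text parsing:
#   kubectl -n ingress-nginx get service ingress-nginx-controller \
#     -o jsonpath='{.spec.ports[?(@.port==443)].nodePort}'
```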
$ ./init_kubernetes_configuration.sh <DQ_CLUSTER_DOMAIN> <DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
where:
$ docker-compose -f docker-compose-kubernetes.yaml build
For more information, see Installing WebFOCUS DSML Services Container Edition.
ibi Data Quality deployed in a Kubernetes cluster requires a shared PersistentVolume (PV) with the ReadWriteMany access mode. The following procedures use an NFS volume as an example.
Prerequisites
This section describes how to deploy ibi Data Quality infrastructure components.
$ kubectl create namespace idq-ns
This will create the namespace object idq-ns in the Kubernetes cluster.
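`kubectl create namespace` returns an error if the namespace already exists, which can get in the way when the procedure is repeated. One common way around this, sketched below, is to render the object with a client-side dry run and pipe it to `kubectl apply`. The wrapper name `ensure_namespace` is illustrative.

```shell
# Create the namespace if it is missing; leave it unchanged if it already exists.
ensure_namespace() {
  kubectl create namespace "$1" --dry-run=client -o yaml | kubectl apply -f -
}

# Usage (requires access to the cluster):
#   ensure_namespace idq-ns
#   kubectl get namespace idq-ns
```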
Either create a self-signed certificate using the following command, or use a trusted certificate key and certificate to create a Kubernetes secret.
The following command will create a self-signed certificate for IP 10.10.11.12 and the output files will be dq.key and dq.cert:
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout dq.key -out dq.cert -subj "/CN=10.10.11.12/O=customer.org/subjectAltName=DNS.1=*.10.10.11.12"
The following command stores the dq.key and dq.cert files created by the previous command in a Kubernetes TLS secret named dq-tls-secret in the idq-ns namespace:
$ kubectl create secret tls dq-tls-secret --key dq.key --cert dq.cert -n idq-ns
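Before relying on the secret, it can help to confirm what the certificate contains and that the secret has the expected type. This is a minimal sketch using the file, secret, and namespace names from the commands above; the function names are illustrative.

```shell
# Print the subject and expiry date of the generated certificate.
show_cert_summary() {
  openssl x509 -in "${1:-dq.cert}" -noout -subject -enddate
}

# Print the secret type; a secret created with `kubectl create secret tls`
# reports kubernetes.io/tls.
show_secret_type() {
  kubectl -n "${2:-idq-ns}" get secret "${1:-dq-tls-secret}" -o jsonpath='{.type}'
}

# Usage (requires the files from the previous step and cluster access):
#   show_cert_summary dq.cert
#   show_secret_type dq-tls-secret idq-ns
```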
HorizontalPodAutoscaler automatically scales the ibi Data Quality components to match demand based on observed metrics, such as average CPU utilization and average memory utilization.
Every HPA target is set to 50% of the requested CPU and 50% of the requested memory. For the minimum and maximum number of replicas for each component, see the deployment resource spec table in Hardware Requirements.
The following command installs the ibi Data Quality Helm chart into the idq-ns namespace under the release name idq:
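To make the 50% target concrete: the HorizontalPodAutoscaler algorithm documented by Kubernetes computes the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization) and then clamps it to the configured minimum and maximum. The sketch below reproduces that arithmetic for illustration only. The function name is hypothetical, the utilization figures in the examples are invented, and the real controller also applies a tolerance band and stabilization windows that are not modeled here.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
# Integer ceiling of current * util / target, clamped to the range [MIN, MAX].
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  local desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# dq-profiler (min 4, max 8): 4 replicas averaging 80% of requested CPU.
desired_replicas 4 80 50 4 8
# dq-butler (min 1, max 2): 1 replica averaging 180% of requested CPU.
desired_replicas 1 180 50 1 2
```

The first call prints 7 (ceil of 6.4); the second prints 2, because the unclamped result of 4 exceeds the maximum replica count for dq-butler.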
$ helm upgrade --install -n idq-ns --create-namespace idq ./dq
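After the command returns, the release can be inspected with Helm itself. A minimal sketch, assuming the release and namespace names used above; the wrapper name `check_dq_release` is illustrative.

```shell
# Show the current state of the release and its most recent revisions.
check_dq_release() {
  local ns=${1:-idq-ns} release=${2:-idq}
  helm status "$release" -n "$ns" || return 1
  helm history "$release" -n "$ns" --max 5
}

# Usage (requires Helm and cluster access):
#   check_dq_release idq-ns idq
```

Because `helm upgrade --install` installs the release when it is absent and upgrades it otherwise, the same command can be repeated after changing chart values; each run adds a revision to the history.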
This section describes how to verify and test ibi Data Quality in a Kubernetes cluster by confirming all required services and components are running.
To confirm ibi Data Quality services and components are running:
The following is a sample command you can use to check the status of the dq-wso2 pod:
$ kubectl -n idq-ns get pod -l app.kubernetes.io/component=dq-wso2
$ kubectl -n idq-ns describe pod -l app.kubernetes.io/component=dq-wso2
The following is a sample command you can use to check the status of the dq-valet-ui pod:
$ kubectl -n idq-ns get pod -l app.kubernetes.io/component=dq-valet-ui
$ kubectl -n idq-ns describe pod -l app.kubernetes.io/component=dq-valet-ui
The following is a sample command you can use to check the status of each ibi Data Quality pod:
$ kubectl -n idq-ns get pod
The following is a sample command you can use to check the status of each ibi Data Quality service:
$ kubectl -n idq-ns get svc
The following is a sample command you can use to check the status of each ibi Data Quality ingress:
$ kubectl -n idq-ns get ingress
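With more than a dozen components, scanning the Pod list by eye is error-prone. The sketch below filters `kubectl get pod` output down to the Pods that are not fully ready. The function name `pods_not_ready` and the Pod names in the sample are made up for illustration.

```shell
# Read `kubectl get pod --no-headers` output on stdin and print the name of
# every Pod that is not fully ready. Columns: NAME READY STATUS RESTARTS AGE.
# Pods in the Completed state (finished Jobs) are skipped.
pods_not_ready() {
  awk '$3 != "Completed" {
         split($2, r, "/")
         if ($3 != "Running" || r[1] != r[2]) print $1
       }'
}

# Example with made-up sample output:
pods_not_ready <<'EOF'
dq-wso2-7c9f8d6b5-abcde       1/1   Running            0   12m
dq-profiler-5d4c7b9f6-fghij   0/1   CrashLoopBackOff   5   12m
dq-valet-ui-6b8d9c7f4-klmno   0/1   Running            0   1m
EOF

# Against a live cluster (not run here):
#   kubectl -n idq-ns get pod --no-headers | pods_not_ready
```

For the sample input, the function prints the dq-profiler and dq-valet-ui Pod names. An empty result against a live cluster suggests every Pod is running with all of its containers ready.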
If a self-signed or untrusted certificate is used, you must accept three certificates in the browser before logging in to the ibi Data Quality console:
https://dq-wso2.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
https://dq-valet-services.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>/api/v1/about
https://dq-valet-ui.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
The default login credentials (user ID and password) are:
https://dq-log-viewer.<DQ_CLUSTER_DOMAIN>:<DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
The default login credentials (user ID and password) are:
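The endpoint URLs in this section all follow one pattern, so they can be generated and probed from the command line before opening a browser. This is a sketch under stated assumptions: the function names are illustrative, the domain and port in the example are placeholders, and the check only reports the HTTP status code of each endpoint without logging in.

```shell
# Print the four endpoint URLs for a given cluster domain and HTTPS NodePort.
dq_endpoint_urls() {
  local domain=$1 port=$2
  printf 'https://%s.%s:%s%s\n' \
    dq-wso2           "$domain" "$port" "" \
    dq-valet-services "$domain" "$port" "/api/v1/about" \
    dq-valet-ui       "$domain" "$port" "" \
    dq-log-viewer     "$domain" "$port" ""
}

# Print the HTTP status code returned by each endpoint.
# -k skips certificate verification, which a self-signed certificate requires.
check_dq_endpoints() {
  local url
  dq_endpoint_urls "$1" "$2" | while read -r url; do
    printf '%s %s\n' "$(curl -k -s -o /dev/null -w '%{http_code}' "$url")" "$url"
  done
}

# Example with placeholder values (hypothetical domain and port):
dq_endpoint_urls dq.example.com 31443

# Against a live cluster (not run here):
#   check_dq_endpoints <DQ_CLUSTER_DOMAIN> <DQ_INGRESS_CONTROLLER_HTTPS_NODEPORT>
```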