Troubleshooting

How do I check for failures in the Reporting pod's deployment?

If the Reporting pod does not reach the Running state, describe the pod using the following command:

kubectl describe pod reporting-set-0-0

For an OpenShift cluster:

oc describe pod reporting-set-0-0

Check the output for any errors. The most common error is that no node can be found to deploy the pod on, which happens if you forgot to label the node.
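The describe output normally ends with an Events section, which is where scheduling failures are reported. The following is a minimal sketch that prints only that section. The function name pod_events is hypothetical and not part of the product; it assumes the output contains a line starting with "Events:".

```shell
# pod_events is a hypothetical helper name, not part of the product.
# It prints the describe output from the "Events:" heading to the end.
# The pod name defaults to the one used in this section.
pod_events() {
  kubectl describe pod "${1:-reporting-set-0-0}" | sed -n '/^Events:/,$p'
}
```

Call it as pod_events reporting-set-0-0. On an OpenShift cluster, replace kubectl with oc inside the function.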

Why do I see the following errors in the fluentd-others.log file?

2021-02-08 11:56:14 +0000 [warn]: #1 failed to write post to http://localhost:3100/loki/api/v1/push (400 Bad Request entry with timestamp 2021-02-08 11:56:10 +0000 UTC ignored, reason: 'entry out of order' for stream: {cluster_name="PLVal-multizone-kk-pune", fluentd_thread="flush_thread_0", function="lifecycle", process="containeragent", tag="tml-tm.tmdata", type="container", zone_name="eastus-2"},
entry with timestamp 2021-02-08 11:56:10 +0000 UTC ignored, reason: 'entry out of order' for stream: {cluster_name="PLVal-multizone-kk-pune", fluentd_thread="flush_thread_0", function="lifecycle", process="containeragent", tag="tml-tm.tmdata", type="container", zone_name="eastus-2"},
entry with timestamp 2021-02-08 11:56:10 +0000 UTC ignored, reason: 'entry out of order' for stream: {cluster_name="PLVal-multizone-kk-pune", fluentd_thread="flush_thread_0", function="lifecycle", process="containeragent", tag="tml-tm.tmdata", type="container", zone_name="eastus-2"},
entry with timestamp 2021-02-08 11:56:10 +0000 UTC ignored, reason: 'entry out of order' for stream: {cluster_name="PLVal-multizone-kk-pune", fluentd_thread="flush_thread_0", function="lifecycle", process="containeragent", tag="tml-tm.tmdata", type="container", zone_name="eastus-2"},
entry with timestamp 2021-02-08 11:56:10 +0000 UTC ignored, reason: 'entry out of order' for stream: {cluster_name="PLVal-multizone-kk-pune", fluentd_thread="flush_thread_0", function="lifecycle", process="containeragent", tag="tml-tm.tmdata", type="container", zone_name="eastus-2"},
entry with timestamp 2021-02-08 11:56:10 +0000 UTC ignored, reason: 'entry out of order' for stream: {cluster_name="PLVal-multizone-kk-pune", fluentd_thread="flush_thread_0", function="lifecycle", process="containeragent", tag="tml-tm.tmdata", type="container", zone_name="eastus-2"},
entry with timestamp 2021-02-08 11:56:10 +0000 UTC ignored, reason: 'entry out of order' for stream: {cluster_name="PLVal-multizone-kk-pune", fluentd_thread="flush_thread_0", function="lifecycle", process="containeragent", tag="tml-tm.tmdata", type="container", zone_name="eastus-2"},
total ignored: 7 out of 15
)

This happens when Fluentd pushes multiple log entries with the same timestamp to the Loki service.
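To gauge how often the warning occurs, you can count the lines that contain the 'entry out of order' reason. This is a rough sketch: count_out_of_order is a hypothetical helper name, and the path to fluentd-others.log is passed as an argument because its location is not stated here.

```shell
# count_out_of_order is a hypothetical helper name.
# Argument: path to the fluentd-others.log file (location depends on your setup).
count_out_of_order() {
  if [ ! -r "$1" ]; then
    echo "cannot read log file: $1" >&2
    return 1
  fi
  # grep -c prints the number of matching lines. It exits non-zero when
  # the count is 0, so that status is ignored here.
  grep -c "entry out of order" "$1" || true
}
```

In the sample above, each ignored entry is printed on its own line, so the line count matches the "total ignored" figure.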

Why isn't the new dashboard loading in the Grafana service?

First, check the Grafana service logs for any errors:

/var/log/grafana/

Next, correct the dashboard and redeploy the reporting container with the changes.
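To scan the log directory mentioned above for error entries, a sketch like the following may help. The helper name grafana_errors is hypothetical, and the pattern is an assumption about the log format: older Grafana releases write "lvl=eror" and newer ones write "level=error", so both are matched.

```shell
# grafana_errors is a hypothetical helper name.
# Argument: log directory; defaults to /var/log/grafana/ as given above.
grafana_errors() {
  # Assumption: error entries carry "lvl=eror" (older Grafana) or
  # "level=error" (newer Grafana). Adjust the pattern if your logs differ.
  grep -rhiE 'lvl=eror|level=error' "${1:-/var/log/grafana/}"
}
```

Run it wherever the Grafana logs are accessible. It prints the matching lines and returns a non-zero status when none are found.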

Why isn't the reporting pod getting deployed in the k8s cluster?

First, check that the node in the first zone (for multi-zone) or in the default zone (for single-zone) has the required label. Next, describe the pod and check for any deployment errors:

kubectl describe pod reporting-set-0-0

For an OpenShift cluster:

oc describe pod reporting-set-0-0
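To confirm the label check, you can list the nodes that carry a given label key. This is a sketch only: nodes_with_label is a hypothetical helper name, and the label key is passed as an argument because this section does not name the required label.

```shell
# nodes_with_label is a hypothetical helper name.
# Argument: the label key to look for (not named in this section).
nodes_with_label() {
  if [ -z "$1" ]; then
    echo "usage: nodes_with_label <label_key>" >&2
    return 2
  fi
  # A selector that consists of only a key matches nodes that carry
  # that key with any value. --show-labels prints all labels per node.
  kubectl get nodes -l "$1" --show-labels
}
```

If the command lists no nodes, no node carries that label key. On an OpenShift cluster, replace kubectl with oc inside the function.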

Why isn't the reporting container getting deployed in the Swarm cluster?

First, check the placement constraint for the "node.hostname" key in tmgc-reporting.yml. Next, add the required constraint as mentioned in Prerequisites. Then, check whether the required variable is set in the shell where the deployment script runs:

echo $REPORTING_HOST_NAME

If the variable is not set, set it to the name of the worker node where reporting has to run:

export REPORTING_HOST_NAME=<node_name>
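The check above can also be wrapped in a small guard that is run before the deployment script. This is a minimal sketch; check_reporting_host is a hypothetical helper name and is not part of the deployment scripts.

```shell
# check_reporting_host is a hypothetical helper name.
# It fails when REPORTING_HOST_NAME is unset or empty,
# and prints the current value otherwise.
check_reporting_host() {
  if [ -z "${REPORTING_HOST_NAME:-}" ]; then
    echo "REPORTING_HOST_NAME is not set" >&2
    return 1
  fi
  echo "REPORTING_HOST_NAME=$REPORTING_HOST_NAME"
}
```

For example, running check_reporting_host || exit 1 before the deployment stops early when the variable is missing.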
