Performing Log Cleanup

Log cleanup is performed by an automated process that runs in the log container at regular intervals. The cleanup utility removes logs according to the configured disk.usage.warning and disk.usage.critical levels. By default, these values are set to:
  • disk.usage.warning=70
  • disk.usage.critical=80
These numbers represent percentages of the total disk allocated to the log container.
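As a minimal sketch of how the two thresholds partition disk utilization, the following Python function classifies a usage figure as ok, warning or critical. The function name and the use of `>=` at the boundaries are assumptions made for illustration; the actual comparison used by the cleanup utility is not documented here.

```python
import shutil


def classify_disk_usage(used_bytes, total_bytes, warning=70, critical=80):
    """Return (percent_used, level), where level is 'ok', 'warning' or 'critical'.

    The defaults mirror disk.usage.warning=70 and disk.usage.critical=80.
    Treating a value equal to a threshold as having reached it is an assumption.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent = 100.0 * used_bytes / total_bytes
    if percent >= critical:
        level = "critical"
    elif percent >= warning:
        level = "warning"
    else:
        level = "ok"
    return percent, level


if __name__ == "__main__":
    # Inspect the filesystem that holds "/"; in the log container the
    # relevant path would be the configured base path instead.
    usage = shutil.disk_usage("/")
    print(classify_disk_usage(usage.used, usage.total))
```

With the defaults, a disk that is 75% full is classified as warning and one that is 85% full as critical.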

The log cleanup cron can be found at /opt/mashery/containeragent/quartz/jobs/log-cleanup-jobs.xml.

log-cleanup-cron

*/15 * * * * root /usr/bin/flock -w 0 /var/lock/logcleanup-task /usr/bin/python /opt/mashery/containeragent/bin/cleanup_disk.py 2>&1 | tee -a /var/log/logcleanuupcron.log
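The cron entry wraps the script in `flock -w 0`, which gives up immediately if the lock file is already held, so a new run is skipped while a previous one is still in progress. The same non-blocking pattern can be sketched in Python as follows. The helper name `run_exclusively` is hypothetical and is not part of the product.

```python
import fcntl


def run_exclusively(lock_path, task):
    """Run task() only if no one else holds the lock; return True if it ran.

    Mirrors `flock -w 0 <lockfile> <command>`: do not wait for the lock.
    """
    # Append mode creates the lock file if needed without truncating it.
    with open(lock_path, "a") as lock_file:
        try:
            # LOCK_NB makes the call fail at once instead of blocking.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # another run holds the lock; skip this one
        try:
            task()
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A caller would pass the lock path and a function that performs the cleanup; a return value of False means the run was skipped.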
You can change the disk warning and critical levels in the file /opt/mashery/containeragent/bin/logservicecleanup.ini:

logservicecleanup.ini

[PATH_NOT_TO_BE_DELETED]
#please provide other log path if those logs needs to be excluded from deletion. But would be deleted after 31 days or
# size of that dir is reached the limit allowed for that log path.
# e.g. /mnt/log1,/mnt/log2
log.path=
 
[BASE_PATH]
# base path for the log directory
# e.g. /mnt/data
base.path=/mnt/data
 
[OTHER_LOG]
# duration for which the logs need to be kept. It is in days
# This would keep the other logs of log container.
other.log.duration=1
other.log.path=/var/log/,/tmp
 
[CONTAINER_LOG]
# duration for which the logs need to be kept. It is in days
# This would keep the application logs of all the containers
container.log.duration=4
 
[METRIC_LOG]
# duration for which the logs need to be kept. It is in days
# This would keep the metrics log for the below mentioned days
metric.log.duration=4
 
[ACCESS_LOG]
# duration for which the logs need to be kept. It is in days
# This would keep the access logs for the below mentioned days
access.log.duration=31
access.log.path=/mnt/data/trafficmanager
 
[DISK_USAGE_MONITOR]
#This is the level set for cleaning up logs. Below values represents % of the configured disk
disk.usage.warning=70
disk.usage.critical=80
1. PATH_NOT_TO_BE_DELETED : The paths that the user does not want the utility to clean up, so that the logs remain available for later debugging. Provide the paths as comma-separated values.

2. BASE_PATH : The base path where the logs are stored.

3. OTHER_LOG : The section for the log container's /var/log/ and /tmp logs. The "other.log.duration" property defines the number of days for which the logs are kept when the disk usage is above the warning level.

4. CONTAINER_LOG : The section for the application logs of the containers. The "container.log.duration" property defines the number of days for which the logs are kept when the disk usage is above the warning level.

5. METRIC_LOG : The section for all the metrics logs. The "metric.log.duration" property defines the number of days for which the logs are kept when the disk usage is above the warning level.

6. ACCESS_LOG : The section for all the access logs. The "access.log.duration" property defines the number of days for which the access logs are kept when the disk usage is above the critical level. The "access.log.path" property defines the base path where the access logs are stored.

7. DISK_USAGE_MONITOR : The section that defines the disk usage warning and critical levels, expressed as percentages of disk usage. The "disk.usage.warning" and "disk.usage.critical" properties define these threshold levels.
 
NOTE : The appropriate threshold levels depend entirely on the inflow of logs and the size of the configured disk. Because the log cleanup utility runs every 15 minutes, log cleanup is a continuous exercise. You are free to set the above parameters according to your needs and requirements.
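To illustrate how the sections and properties described above fit together, here is a minimal sketch that reads a subset of logservicecleanup.ini with Python's standard `configparser` module. The function name, the returned dictionary keys and the check that the warning level is below the critical level are assumptions for illustration; this is not necessarily how cleanup_disk.py reads the file.

```python
import configparser

# A subset of logservicecleanup.ini, copied from the listing above.
SAMPLE_INI = """
[PATH_NOT_TO_BE_DELETED]
log.path=

[CONTAINER_LOG]
container.log.duration=4

[ACCESS_LOG]
access.log.duration=31
access.log.path=/mnt/data/trafficmanager

[DISK_USAGE_MONITOR]
disk.usage.warning=70
disk.usage.critical=80
"""


def load_cleanup_settings(ini_text):
    """Parse the INI text and return the settings as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # log.path is a comma-separated list and may be empty.
    raw_paths = parser.get("PATH_NOT_TO_BE_DELETED", "log.path")
    excluded = [p.strip() for p in raw_paths.split(",") if p.strip()]
    warning = parser.getint("DISK_USAGE_MONITOR", "disk.usage.warning")
    critical = parser.getint("DISK_USAGE_MONITOR", "disk.usage.critical")
    if warning >= critical:
        # Assumed sanity check: warning should trigger before critical.
        raise ValueError("disk.usage.warning must be below disk.usage.critical")
    return {
        "excluded_paths": excluded,
        "container_days": parser.getint("CONTAINER_LOG", "container.log.duration"),
        "access_days": parser.getint("ACCESS_LOG", "access.log.duration"),
        "access_path": parser.get("ACCESS_LOG", "access.log.path"),
        "warning": warning,
        "critical": critical,
    }


settings = load_cleanup_settings(SAMPLE_INI)
```

Setting `log.path=/mnt/log1,/mnt/log2` in the sample would yield two entries in `excluded_paths`.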

Log Cleanup Configurations

These configurations can be provided in the tml_log_properties.json file before starting the deployment, or they can be set using the cluster manager after the cluster is active. The log cleanup policy and configuration dictate the cleanup threshold and its occurrence. Cleanup is triggered only by utilization thresholds, not by time or date.

The following configurations apply only when deletion happens.
| Key | Description | Data Type and Expected Values | Default Value | Example |
|---|---|---|---|---|
| disk_usage_warning | A percentage threshold of current disk utilization for deleting all logs except access logs. | Positive integer (ideally > 60%) | 70 | disk_usage_warning="70" |
| disk_usage_critical | A percentage threshold of current disk utilization for deleting all logs including access logs. | Positive integer (ideally > 70% and less than 95) | 80 | disk_usage_critical="80" |
| container_log_duration | Retention period (in days) of container logs. A value of 0 means delete everything. | Positive integer | 4 days | container_log_duration="1" |
| metric_log_duration | Retention period (in days) of metrics of different services from different containers. A value of 0 means delete everything. | Positive integer | 4 days | metric_log_duration="1" |
| access_log_duration | Retention period (in days) of access logs. A value of 0 means delete everything. | Positive integer | 31 days | access_log_duration="5" |
| payloads_duration | Retention period (in days) of payload logs. A value of 0 means delete everything. | Positive integer | 1 day | payloads_duration="2" |
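The interaction between the thresholds and the retention periods can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: the function `plan_cleanup`, the category names, the `RETENTION_KEY` mapping and the use of file modification time to measure age are all hypothetical. The keys and string-valued defaults follow the table above.

```python
import time

DAY_SECONDS = 24 * 60 * 60

# Keys and default values from the table above, as strings like the examples.
DEFAULTS = {
    "disk_usage_warning": "70",
    "disk_usage_critical": "80",
    "container_log_duration": "4",
    "metric_log_duration": "4",
    "access_log_duration": "31",
    "payloads_duration": "1",
}

# Hypothetical mapping from a log category to its retention key.
RETENTION_KEY = {
    "container": "container_log_duration",
    "metric": "metric_log_duration",
    "payloads": "payloads_duration",
    "access": "access_log_duration",
}


def plan_cleanup(usage_percent, files_by_category, settings=DEFAULTS, now=None):
    """Return the paths that would be deleted.

    files_by_category maps a category name to a list of (path, mtime) pairs.
    """
    now = time.time() if now is None else now
    warning = int(settings["disk_usage_warning"])
    critical = int(settings["disk_usage_critical"])
    if usage_percent < warning:
        return []  # below the warning level nothing is deleted, whatever its age
    to_delete = []
    for category, files in files_by_category.items():
        if category == "access" and usage_percent < critical:
            continue  # access logs are only deleted at the critical level
        days = int(settings[RETENTION_KEY[category]])
        cutoff = now - days * DAY_SECONDS
        # With a duration of 0 the cutoff is "now", so every file qualifies.
        to_delete.extend(path for path, mtime in files if mtime <= cutoff)
    return to_delete
```

For example, at 75% utilization only non-access logs older than their retention period are selected, while at 85% expired access logs are selected as well.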