Resource Monitoring
Resource monitoring covers CPU, memory, disk, and network I/O.
Container Metrics
These metrics report the utilization of the CPU, memory, network, and disk resources assigned to a container or pod.
Container CPU Metrics
The captured metrics report the percentage of CPU used by the container or pod, split into user-space and kernel usage, along with a breakdown of the usage per core.
CPU Metrics example:
{ "time": 1554194126, "message": { "cpu_p": 41.88333333333333, "user_p": 25.033333333333335, "system_p": 16.85, "cpu0.p_cpu": 41.88333333333333, "cpu0.p_user": 25.033333333333335, "cpu0.p_system": 16.85, "ingestion_time": "2019-04-02T08:35:26+00:00", "tag": "tml-nosql.6c680d874676.metrics.cpu" } }
Metric Name | Field Name | Units | Data Type | Notes |
---|---|---|---|---|
Total CPU consumption | cpu_p | % | Number | Total CPU usage across all cores assigned to the container, including user and kernel processes. If the container can use 4 cores, the value can reach 400%. |
CPU consumption by user processes | user_p | % | Number | Total CPU used by user processes across all cores. |
CPU consumption by kernel processes | system_p | % | Number | Total CPU used by kernel processes across all cores. |
Total usage per core N | cpuN.p_cpu | % | Number | Usage of core N by user and kernel processes. |
User processes usage of core N | cpuN.p_user | % | Number | Usage of core N by user processes. |
Kernel processes usage of core N | cpuN.p_system | % | Number | Usage of core N by kernel processes. |
Container Memory
Pod/Container memory metrics example:
{ "time": 1554198840, "message": { "Mem.total": 4045520, "Mem.used": 3932664, "Mem.free": 112856, "Swap.total": 1928204, "Swap.used": 1483436, "Swap.free": 444768, "ingestion_time": "2019-04-02T09:54:00+00:00", "tag": "tml-log.6a6873b34d5e.metrics.mem" } }
Metric Name | Field Name | Units | Data Type | Notes |
---|---|---|---|---|
Total memory (RAM) | Mem.total | bytes | Number | Total memory available to the container or pod, in bytes. |
Used memory (RAM) | Mem.used | bytes | Number | Memory used by the container, in bytes. |
Free memory (RAM) | Mem.free | bytes | Number | Free RAM available, in bytes. |
Total swap space | Swap.total | bytes | Number | Total swap space. |
Used swap space | Swap.used | bytes | Number | Used swap space. |
Free swap space | Swap.free | bytes | Number | Free swap space. |
Container Disk
The captured metrics report the number of bytes read and written at the time of capture.
Pod/Container disk metrics example:
{ "time": 1554193560, "message": { "read_size": 7029587968, "write_size": 14102749184, "ingestion_time": "2019-04-02T08:26:00+00:00", "tag": "tml-log.6a6873b34d5e.metrics.disk" } }
Container Network
Network metrics are available per network interface, such as eth1 or lo. The captured metrics report the transmit and receive sizes at the time of capture.
Pod/Container network metrics example:
{ "time": 1554199020, "message": { "eth0.rx.bytes": 516319, "eth0.rx.packets": 1062, "eth0.rx.errors": 0, "eth0.tx.bytes": 61578, "eth0.tx.packets": 893, "eth0.tx.errors": 0, "ingestion_time": "2019-04-02T09:57:00+00:00", "tag": "tml-log.6a6873b34d5e.metrics.netif" } }
Metric Name | Field Name | Units | Data Type | Notes |
---|---|---|---|---|
Bytes transmitted on a netif_name | netif_name.tx.bytes | bytes | Number | Total bytes transmitted on the network interface. |
Packets transmitted on a netif_name | netif_name.tx.packets | packets | Number | Total packets transmitted on the network interface. |
Errors in transmitting packets on a netif_name | netif_name.tx.errors | packets | Number | Number of packets that failed to be transmitted on the network interface due to window, carrier, aborted, or heartbeat errors. |
Bytes received on a netif_name | netif_name.rx.bytes | bytes | Number | Total bytes received on the network interface. |
Packets received on a netif_name | netif_name.rx.packets | packets | Number | Total packets received on the network interface. |
Errors receiving packets on a netif_name | netif_name.rx.errors | packets | Number | Number of packets dropped. |
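As an illustration of how the container metric records might be consumed, the following Python sketch classifies a record by its tag and derives a few simple values from it. The function names (`metric_kind`, `group_by_interface`, `summarize`) are hypothetical and not part of Mashery Local. The sketch assumes that every tag contains `.metrics.` followed by the metric kind, as in the examples, and that per-core fields follow the `cpuN.p_cpu` pattern.

```python
import json


def metric_kind(tag):
    """Return the metric kind from a tag such as 'tml-log.6a6873b34d5e.metrics.mem'."""
    _, sep, rest = tag.partition(".metrics.")
    if not sep:
        raise ValueError("tag does not contain '.metrics.': %r" % tag)
    return rest.split(".")[0]


def group_by_interface(message):
    """Group fields such as 'eth0.rx.bytes' into {iface: {direction: {counter: value}}}."""
    interfaces = {}
    for key, value in message.items():
        # Split from the right so an interface name containing dots stays intact.
        parts = key.rsplit(".", 2)
        if len(parts) == 3 and parts[1] in ("rx", "tx"):
            iface, direction, counter = parts
            interfaces.setdefault(iface, {}).setdefault(direction, {})[counter] = value
    return interfaces


def summarize(record):
    """Derive a small summary from one metric record (assumed layout, see lead-in)."""
    msg = record["message"]
    kind = metric_kind(msg["tag"])
    if kind == "cpu":
        # Count per-core fields; cpu_p can exceed 100% when several cores are assigned.
        cores = sum(1 for k in msg if k.startswith("cpu") and k.endswith(".p_cpu"))
        return {"kind": kind, "cores": cores, "avg_per_core": msg["cpu_p"] / max(cores, 1)}
    if kind == "mem":
        return {"kind": kind, "used_ratio": msg["Mem.used"] / msg["Mem.total"]}
    if kind == "disk":
        return {"kind": kind, "read_size": msg["read_size"], "write_size": msg["write_size"]}
    if kind == "netif":
        return {"kind": kind, "interfaces": group_by_interface(msg)}
    return {"kind": kind}


if __name__ == "__main__":
    sample = json.loads(
        '{"time": 1554198840, "message": {"Mem.total": 4045520, "Mem.used": 3932664, '
        '"Mem.free": 112856, "tag": "tml-log.6a6873b34d5e.metrics.mem"}}'
    )
    print(summarize(sample))
```

Dividing `cpu_p` by the number of cores gives a value that stays within 0 to 100%, which can be easier to compare across containers with different core assignments.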
Common Process Metrics
{ "time": 1554199440, "message": { "alive": true, "proc_name": "td-agent-bit", "pid": 2156, "mem.VmPeak": 83856000, "mem.VmSize": 83852000, "mem.VmLck": 0, "mem.VmHWM": 7416000, "mem.VmRSS": 3412000, "mem.VmData": 31028000, "mem.VmStk": 132000, "mem.VmExe": 4184000, "mem.VmLib": 5352000, "mem.VmPTE": 140000, "mem.VmSwap": 2040000, "fd": 65, "ingestion_time": "2019-04-02T10:04:00+00:00", "tag": "tml-log.6a6873b34d5e.metrics.proc.td-agent-bit" } }
Metric Name | Field Name | Units | Data Type | Notes |
---|---|---|---|---|
Process status | alive | | Boolean | Is the process running? |
Process name | proc_name | | String | Name of the process as identified by /proc/pid/cmd |
Peak virtual memory usage | mem.VmPeak | bytes | Number | Max memory used by this process so far |
Virtual memory size | mem.VmSize | bytes | Number | |
Current mlocked memory | mem.VmLck | bytes | Number | Amount of memory locked by the process. This memory is released after the process exits. |
Peak RAM used | mem.VmHWM | bytes | Number | |
Current RAM being used | mem.VmRSS | bytes | Number | |
Size of "data" | mem.VmData | bytes | Number | |
Size of stack | mem.VmStk | bytes | Number | |
Size of "text" segment | mem.VmExe | bytes | Number | |
Shared library mem usage | mem.VmLib | bytes | Number | |
Current swap space used | mem.VmSwap | bytes | Number | |
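A process record could be screened for obvious problems along the following lines. This is a minimal Python sketch, not part of the product: `check_process` and the `fd_limit` threshold are hypothetical, and it assumes that the `fd` field shown in the example (which the table does not describe) is the number of open file descriptors.

```python
from typing import List


def check_process(message: dict, fd_limit: int = 1024) -> List[str]:
    """Return human-readable warnings for one process metrics message."""
    problems = []
    if not message.get("alive", False):
        problems.append("process is not running")
    # Any swapped-out memory may indicate memory pressure in the container.
    if message.get("mem.VmSwap", 0) > 0:
        problems.append("process has memory swapped out")
    # Assumption: 'fd' is the open file descriptor count; fd_limit is illustrative.
    if message.get("fd", 0) >= 0.8 * fd_limit:
        problems.append("file descriptor count is close to the limit")
    return problems
```

Applied to the `td-agent-bit` example, the only warning raised would be the swap one, because `mem.VmSwap` is non-zero while the process is alive and `fd` is well below the illustrative limit.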
Process List
Processes on all containers
Process Name | Description |
---|---|
Containeragent | The Mashery Local container agent, which manages all processes running inside a container. |
td-agent-bit | The log and metrics forwarder. It forwards all logs to the Log service. |
syslog-ng | Supervisor + worker. |
Per Container Processes
Container Name | Process Name | Description |
---|---|---|
TM | proxy | Traffic Manager Proxy (embedded Jetty) |
Sql | Jetty | On-Prem Loader: syncs with MOM in tethered mode; for untethered_cm, loads from data.zip |
Sql | mysqld | MySQL service |
NoSql (seed and non-seed) | Cassandra | |
NoSql (only on non-seed) | Jetty | Jetty server hosting the ML Registry Java webapp |
Cache | Memcached | 6 processes, one each for pools 11211, 11212, 11213, 11214, 11215, and 11216 |
Cache | pxrt | The memcache loader, which keeps memcache up to date with changes to service definitions, packages, and so on. |
API | lighttpd | CGI server supporting PHP CGI |
API | memcached | 2 processes, one each for pools 11211 and 11214 |
API | pxrt | Embedded Jetty server hosting the V3 API |
API | php-cgi | About 20 php-cgi worker processes, each of which executes V2 API requests |
CM | Jetty | Jetty server hosting the certificate manager Java webapp. |
logservice | td-agent | Log collector and forwarder. Collects logs from other containers and forwards them to the user-chosen destination. 1 supervisor + 9 workers. |
logservice | java | Process that syncs access logs to TIBCO Cloud Mashery in "tethered" mode. |
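The process tables can be cross-checked against the records described under Common Process Metrics. The Python sketch below reports which of the processes expected on all containers have no record with `alive` set to true. It is illustrative only: `COMMON_PROCESSES` and `missing_processes` are hypothetical names, and the sketch assumes that `proc_name` is reported with exactly the spelling used in the table, which may not hold in practice.

```python
# Processes expected on all containers, taken from the table above.
# Assumption: proc_name in the metrics uses these exact spellings.
COMMON_PROCESSES = {"Containeragent", "td-agent-bit", "syslog-ng"}


def missing_processes(records, expected=COMMON_PROCESSES):
    """Return the expected process names that no record reports as alive."""
    alive = {
        record["message"]["proc_name"]
        for record in records
        if record["message"].get("alive")
    }
    return sorted(set(expected) - alive)
```

The same function could be called with a container-specific set, for example the Sql processes, by passing a different `expected` argument.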
Diagnostic Recipe / Alerts
Copyright © Cloud Software Group, Inc. All rights reserved.