Cluster Storage Summary

This section describes storage sizing for different components.

Log

The Log container receives logs from all other containers. The following information summarizes the disk requirements for a single Log pod/container, assuming one instance of each of the other containers.

Some processes, such as the container agent or the metrics collector, run inside every container and generate logs at a nearly constant rate even when there is no traffic. The variation comes from two sources: the Traffic Manager container, whose access log volume changes with QPS, and the NoSQL pod, when tokens are created at a high rate. On average, the disk size for a Log container can therefore be estimated as the sum of the logs from the constant processes of the other containers and the access logs, which vary with QPS.

Disk size for one Log pod/container for 1 day = disk size for logs from all containers (1 instance of each type) for 1 day + disk size for the access log generated by Traffic Manager in 1 day.

Size of logs generated by one instance of each component in one day:
  • NoSQL - 65 MB

  • SQL - 60 MB

  • Cache - 65 MB

  • CM - 95 MB

  • TM (excluding Access Log) - 65 MB

  • Log - 40 MB

The approximate disk required for one day can therefore be calculated as:

Total disk (in MB) = (No. of NoSQL * 65) + (No. of SQL * 60) + (No. of Cache * 65) + (No. of CM * 95) + (No. of TM * 65) + (No. of Log * 40) + (access log size in MB for one day at a given QPS)

For example, if you are running one instance of each component and the average QPS is around 200, the disk size required for a single day would be:

disk = (1 * 65) + (1 * 60) + (1 * 65) + (1 * 95) + (1 * 65) + (1 * 40) + (360 * 2 * 24) = 17670 MB (17.67 GB)

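Note: 360 MB (0.36 GB) is the size of the access log data generated in 30 minutes at roughly 200 QPS (210 QPS in the test), so it is multiplied by 2 to get the hourly size and by 24 to get the daily size. Refer to the following table on Unprotected Traffic.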
Test duration: 30 minutes
Container: log-set-0-0
Sizes in GB

                                    210 QPS                 406 QPS                 602 QPS                 788 QPS
Path                                before  after   delta   before  after   delta   before  after   delta   before  after   delta
/mnt/data/trafficmanager/           0.46    0.82    0.36    0.82    1.60    0.78    1.60    2.50    0.90    2.50    3.80    1.30
/mnt/data/trafficmanager/access/    0.21    0.37    0.16    0.37    0.67    0.30    0.67    1.20    0.53    1.20    1.70    0.50
/mnt/data/trafficmanager/enriched/  0.25    0.45    0.20    0.45    0.83    0.38    0.83    1.40    0.57    1.40    2.10    0.70
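
As an illustrative sketch (not part of the product), the daily Log disk formula in this section can be expressed in Python. The function and variable names are hypothetical. The per-component figures and the 360 MB access log delta come from this section, and scaling a 30-minute delta by 2 * 24 assumes that the access log rate stays constant over the whole day.

```python
# Per-instance daily log volume in MB, taken from the list in this section.
DAILY_LOG_MB = {
    "nosql": 65,
    "sql": 60,
    "cache": 65,
    "cm": 95,
    "tm": 65,   # excludes the access log
    "log": 40,
}


def access_log_mb_per_day(delta_mb_30_min):
    """Scale a 30-minute access log delta to one day (2 per hour * 24 hours).

    Assumption: traffic, and therefore log growth, is constant all day.
    """
    return delta_mb_30_min * 2 * 24


def daily_log_disk_mb(instances, access_log_delta_mb_30_min):
    """Approximate disk (MB) that one Log container needs for one day."""
    fixed = sum(DAILY_LOG_MB[name] * count for name, count in instances.items())
    return fixed + access_log_mb_per_day(access_log_delta_mb_30_min)


# Worked example from this section: one instance of each component and the
# 360 MB / 30 min access log delta measured at about 200 QPS.
one_of_each = {name: 1 for name in DAILY_LOG_MB}
total_mb = daily_log_disk_mb(one_of_each, 360)
print(total_mb)  # 17670
```

To size for a different load, pass the 30-minute delta for the closest measured QPS from the table (converted to MB) and the actual instance counts, then add a buffer on top of the result.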

NoSQL

NoSQL storage requirements are driven primarily by the number of OAuth tokens stored. For example, if tokens are created at a rate of 200 QPS, close to 177 MB of space is needed for 30 minutes of token creation. Refer to the following table on OAuth (Token Creation).
Test duration: 30 minutes
Container: cass-set-0-0
Sizes in MB (the first row is a token count)

                                200 QPS                          354 QPS                          590 QPS                          757 QPS
Item                            before     after      delta      before     after      delta      before     after      delta      before     after      delta
Number of tokens                179,319    394,787    215,468    394,787    727,040    332,253    727,040    1,215,676  488,636    1,215,676  1,559,298  343,622
/var/lib/cassandra              87         264        177        264        575        311        575        1126       551        1126       1843       717
/var/lib/cassandra/commitlog/   48         137        89         137        301        164        301        573        272        573        922        349
/var/lib/cassandra/data/        39         128        89         128        275        147        275        510        235        510        828        318
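
The table above can also be used to derive a rough per-token storage figure. The following Python sketch is only an illustration: the function names are hypothetical, it assumes 1 MB = 1024 KB, and it assumes that disk usage grows linearly with the number of tokens, with no expiry or cleanup reducing the footprint.

```python
# Figures from the 200 QPS column of the OAuth (Token Creation) table.
TOKENS_CREATED = 215_468   # tokens added during the 30-minute test
DISK_DELTA_MB = 177        # growth of /var/lib/cassandra in MB


def kb_per_token(disk_delta_mb, tokens_created):
    """Average disk growth per created token, in KB (assumes 1 MB = 1024 KB)."""
    return disk_delta_mb * 1024 / tokens_created


def nosql_mb_for_tokens(expected_tokens, per_token_kb):
    """Rough NoSQL disk estimate in MB, assuming linear growth per token."""
    return expected_tokens * per_token_kb / 1024


per_token = kb_per_token(DISK_DELTA_MB, TOKENS_CREATED)
print(round(per_token, 2))  # 0.84

# Hypothetical planning figure: one million stored tokens.
print(round(nosql_mb_for_tokens(1_000_000, per_token)))
```

The per-token figure differs between the QPS columns of the table, so treat the result as an order-of-magnitude estimate and add a buffer.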

Cache

The Cache component does not require much storage, so you can choose the smallest practical disk size.

SQL

The SQL database primarily stores configuration data. 2 GB of storage should suffice.
Note: The above calculations are based on assumptions. Therefore, allow some buffer when sizing.