The time taken to process a unit of application work (e.g. handling a request and sending its response), excluding any time spent blocked (e.g. on disk I/O, or waiting for a response from an intermediate system).
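As a rough illustration of the definition above, the sketch below subtracts blocked time from total elapsed time for one unit of work. The function name and the numbers are hypothetical, chosen only for the example.

```python
def service_time_ms(elapsed_ms: float, blocked_ms: float) -> float:
    """Time spent actually processing: total elapsed time minus time spent blocked."""
    if blocked_ms > elapsed_ms:
        raise ValueError("blocked time cannot exceed elapsed time")
    return elapsed_ms - blocked_ms


# Illustrative values: a request took 50 ms end to end, 35 ms of which
# was spent waiting on disk I/O and a downstream system.
print(service_time_ms(50.0, 35.0))  # 15.0
```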
Adding more computing nodes (i.e. machines) to a system.
Adding more resources (e.g. CPUs or memory) to a single computing node in a system.
Competition for computing resources. When a requested resource is not available, the application waits, and it often uses up other system resources while competing for the one it requested.
The time between when a request is issued and when its response is received. Latency can consist of a variety of components (network, disk, application, etc.).
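To make the idea of latency components concrete, here is a minimal sketch that sums per-component timings for a single request. It assumes the components run one after another without overlapping; the component names and values are invented for the example, not measurements.

```python
# Hypothetical timings (milliseconds) for one request; values are illustrative only.
components_ms = {
    "network": 12.0,
    "disk": 5.0,
    "application": 3.0,
}


def total_latency_ms(components: dict) -> float:
    """Latency as the sum of the time spent in each (non-overlapping) component."""
    return sum(components.values())


print(f"latency: {total_latency_ms(components_ms)} ms")  # latency: 20.0 ms
```

If components overlap in time (for example, disk reads issued while a network call is in flight), a simple sum would overstate the latency.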
A measure of the total amount of work that a system can complete over a given period of time.
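One common way to express this is as units of work completed per second. The sketch below shows that arithmetic; the function name and the sample figures are assumptions made for illustration.

```python
def throughput_per_second(units_completed: int, elapsed_seconds: float) -> float:
    """Work completed per second over the measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return units_completed / elapsed_seconds


# Illustrative values: 1200 requests completed in a 60-second window.
print(throughput_per_second(1200, 60.0))  # 20.0
```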