Engine Interruption and Smoothing
If an Engine is interrupted or fails during a task, that task must run again from the beginning. Therefore, divide work into tasks with short execution times. The shorter the task, the less work you lose when an Engine fails.
Shorter tasks also tend to improve overall performance, because they reduce the variability of task durations within a computation.
For example, suppose that you divide the work of computation so that 10 Engines each have one task. You expect that this minimizes communication overhead and that Engines do not fail. However, what if you wrongly estimate one task and it takes twice as long as the others? Since all tasks must finish for the computation to be complete, the longest task determines the computation time. If you have nine one-minute tasks and one two-minute task on 10 Engines, the computation takes two minutes, with the last minute consisting of nine idle Engines and one Engine still working on the two-minute task. With exactly as many tasks as Engines, your program runs as long as the longest task. (This section simplifies this discussion by ignoring communication time.)
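The arithmetic in this example can be checked with a minimal Python sketch. The function name and the use of minutes as the unit are illustrative assumptions, not part of any product API, and communication time is ignored as in the text.

```python
def one_task_per_engine(task_minutes):
    """Return (elapsed minutes, idle Engine-minutes) when each Engine runs one task.

    Hypothetical helper for illustration only.
    """
    # The computation finishes only when the longest task finishes.
    elapsed = max(task_minutes)
    # Every Engine that finishes early sits idle until then.
    idle = sum(elapsed - t for t in task_minutes)
    return elapsed, idle


# Nine one-minute tasks and one two-minute task on 10 Engines.
tasks = [1.0] * 9 + [2.0]
elapsed, idle = one_task_per_engine(tasks)
print(elapsed, idle)  # 2.0 9.0
```

The nine idle Engine-minutes are capacity that a finer-grained division of work could put to use.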
Now suppose that you use twice as many tasks as Engines. This significantly improves the expected running time. To understand why, continue the above example. If you divide each of the 10 tasks in two, you have 20 tasks for 10 Engines: eighteen 30-second tasks and two one-minute tasks. Each Engine takes two tasks at random. The chance of the same Engine receiving both long tasks is small (1 in 19, or about 5%, if the tasks are paired uniformly at random), so the computation usually takes one and a half minutes instead of two.
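As a rough check of this reasoning, the following Python sketch computes the exact expected running time and confirms it by simulation. It assumes that the 20 tasks are paired uniformly at random, two per Engine, and that communication time is ignored; the names used are illustrative only.

```python
import random
from fractions import Fraction

# Under the random-pairing assumption, the first long task's partner is
# equally likely to be any of the other 19 tasks, one of which is long.
P_BOTH_LONG = Fraction(1, 19)
# 2 minutes if one Engine gets both long tasks, otherwise 1.5 minutes.
EXPECTED_MINUTES = Fraction(3, 2) * (1 - P_BOTH_LONG) + 2 * P_BOTH_LONG


def simulate_random_pairs(trials=20000, seed=0):
    """Return (mean elapsed minutes, fraction of runs that take 2 minutes)."""
    rng = random.Random(seed)
    tasks = [0.5] * 18 + [1.0] * 2
    total = 0.0
    both_long = 0
    for _ in range(trials):
        rng.shuffle(tasks)
        # Engine i runs the tasks at positions 2i and 2i + 1.
        elapsed = max(a + b for a, b in zip(tasks[0::2], tasks[1::2]))
        total += elapsed
        both_long += elapsed == 2.0
    return total / trials, both_long / trials


mean_minutes, fraction_worst = simulate_random_pairs()
print(float(EXPECTED_MINUTES), mean_minutes, fraction_worst)
```

The exact expectation is 29/19 minutes, roughly 1.53, compared with 2 minutes for one task per Engine.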
Similarly, a larger number of shorter tasks smooths out the effect of different processor speeds. Assume that all tasks take the same time, but that one Engine is slower than the others. With exactly one task per Engine, the slow Engine determines the computation time. With many short tasks, the slow Engine takes fewer tasks than the other Engines, and all Engines finish at close to the same time, which shortens the whole computation.
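The effect of a slow Engine can be sketched in the same way. The Python example below assumes that each Engine pulls the next task as soon as it is free, that the total work is 600 seconds on a normal Engine, and that one of 10 Engines runs at half speed. The scheduling model and all names are illustrative assumptions, not a description of the actual scheduler.

```python
import heapq


def simulate_pull(num_tasks, total_work_s=600, slowdown=(1,) * 9 + (2,)):
    """Return (elapsed seconds, tasks completed per Engine).

    slowdown[i] is how many times longer Engine i takes per task.
    Hypothetical model: free Engines pull equal-sized tasks in turn.
    """
    task_s = total_work_s // num_tasks
    # Min-heap of (time at which the Engine becomes free, Engine index).
    free_at = [(0, engine) for engine in range(len(slowdown))]
    heapq.heapify(free_at)
    done = [0] * len(slowdown)
    elapsed = 0
    for _ in range(num_tasks):
        now, engine = heapq.heappop(free_at)
        finish = now + task_s * slowdown[engine]
        done[engine] += 1
        elapsed = max(elapsed, finish)
        heapq.heappush(free_at, (finish, engine))
    return elapsed, done


coarse_elapsed, coarse_done = simulate_pull(10)   # one task per Engine
fine_elapsed, fine_done = simulate_pull(100)      # ten times as many tasks
print(coarse_elapsed, fine_elapsed)  # 120 66
print(fine_done[-1])                 # 5 tasks on the slow Engine
```

In this model the slow Engine completes 5 of the 100 short tasks while each fast Engine completes 10 or 11, so the computation takes 66 seconds rather than 120.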