Direct Data Transfer
By default, GridServer uses Direct Data Transfer, or peer-to-peer communication, to optimize data throughput between Drivers and Engines. Without Direct Data Transfer, all task input and output passes through the Manager, which consumes more memory and disk space on the Manager and lowers throughput.
Using Direct Data Transfer, the Driver and Engine nodes do the “heavy lifting,” and only lightweight messages go through the Manager. Direct Data Transfer requires each peer to know the IP address that it presents to other peers. In most cases, therefore, Direct Data Transfer precludes the use of NAT between the peers. Likewise, Direct Data Transfer does not support proxies.
When using Direct Data Transfer to move TaskInput data from Drivers to Engines, the Engines pull the data via HTTP directly from the Driver. Whether the Driver's embedded HTTP server listens on a static TCP port or an ephemeral TCP port is set in the driver.properties file.
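The pull model and the static-versus-ephemeral port choice can be illustrated with a minimal sketch in Python. This is not GridServer code and does not use any GridServer API: the handler, the function names, and the `/input` path are hypothetical, and the standard-library HTTP server merely stands in for the Driver's embedded HTTP server. Passing port 0 asks the operating system for an ephemeral port, while any other value behaves like a static port.

```python
import http.server
import threading
import urllib.request

# Hypothetical stand-in for serialized TaskInput data held by a Driver.
TASK_INPUT = b"example task input"


class InputHandler(http.server.BaseHTTPRequestHandler):
    """Serves the task input to any peer that requests it."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(TASK_INPUT)))
        self.end_headers()
        self.wfile.write(TASK_INPUT)

    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch


def start_driver_server(port=0):
    """Start the stand-in 'Driver' server; port 0 requests an ephemeral port."""
    server = http.server.HTTPServer(("127.0.0.1", port), InputHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def engine_pull(host, port):
    """Stand-in 'Engine': pull the input directly from the Driver over HTTP."""
    # An empty ProxyHandler forces a direct connection with no proxy in between.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}/input", timeout=5) as response:
        return response.read()


server = start_driver_server()        # ephemeral port chosen by the OS
host, port = server.server_address    # the address the peer must be able to reach
data = engine_pull(host, port)
server.shutdown()
server.server_close()
```

The sketch also shows why NAT is a problem: the Engine can only pull the data if the address and port reported by the Driver are reachable from the Engine's side of the network. A static port is easier to allow through a firewall, whereas an ephemeral port avoids conflicts when several Drivers run on one host.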
You can also install an optional Engine Hook to enable communication between Drivers and Engines when NAT is in use.
For GridServer deployments that use NAT and do not have the optional Engine Hook, you can support NAT between Drivers and Engines by disabling peer-to-peer communication in one of two ways:
• If, from the perspective of the Drivers, the Engines are behind a NAT device, the Engines cannot provide peer-to-peer communication. In this case, disable Direct Data Transfer in the Engine configuration.
• If, from the perspective of the Engines, the Drivers are behind a NAT device, then the Drivers cannot provide peer-to-peer communication. Provide the Driver addresses if you know them in advance. Otherwise, disable Direct Data Transfer in the Driver properties file.
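The two cases above can be summarized as a small decision helper. The following Python sketch is only an illustration of the rules as stated here: the function name, its parameters, and the action strings are hypothetical and do not correspond to GridServer settings. It also assumes that no workaround is needed when the optional Engine Hook is installed, which is one reading of the text above.

```python
DISABLE_ON_ENGINE = "disable Direct Data Transfer in the Engine configuration"
DISABLE_ON_DRIVER = "disable Direct Data Transfer in the Driver properties file"
PROVIDE_DRIVER_ADDRESSES = "provide the Driver addresses"


def nat_workaround(engines_behind_nat, drivers_behind_nat,
                   driver_addresses_known=False, engine_hook_installed=False):
    """Return the list of actions suggested by the two NAT cases."""
    actions = []
    if engine_hook_installed:
        # Assumption: the Engine Hook handles NAT, so nothing is disabled.
        return actions
    if engines_behind_nat:
        # Drivers cannot reach the Engines directly.
        actions.append(DISABLE_ON_ENGINE)
    if drivers_behind_nat:
        # Engines cannot reach the Drivers unless their addresses are supplied.
        if driver_addresses_known:
            actions.append(PROVIDE_DRIVER_ADDRESSES)
        else:
            actions.append(DISABLE_ON_DRIVER)
    return actions


engine_case = nat_workaround(engines_behind_nat=True, drivers_behind_nat=False)
driver_case = nat_workaround(engines_behind_nat=False, drivers_behind_nat=True)
known_case = nat_workaround(engines_behind_nat=False, drivers_behind_nat=True,
                            driver_addresses_known=True)
```

Here `engine_case` contains only the Engine-side action, `driver_case` only the Driver-side action, and `known_case` only the instruction to provide the Driver addresses.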
When using Direct Data Transfer to move TaskOutput data from Engines to Drivers, the Drivers pull the data via HTTP directly from the Engines.