User Guide > Data Ship Performance Optimization > About Data Ship
 
About Data Ship
The data ship optimization accomplishes efficient federated query execution by transformation of federated queries into locally executed SQL operations. When SQL operations involve a very large table (with millions or hundreds of thousands of rows) and a small table with significantly fewer rows, the TDV Server can enhance query performance by shipping a copy of the smaller table to the data source of the larger table. This optimization can yield significantly faster results than a traditional federated query.
The following SQL operators can take advantage of the data ship optimization:
JOIN
UNION
INTERSECT
EXCEPT
When data ship optimization is enabled the TDV query engine evaluates a query, which can include explicitly defined query option hints, with all relevant data source statistics gathered on query dependencies to determine whether using the data ship optimization yields best performance. If data ship optimization can yield faster execution times according to the analysis and settings, an execution of the query sends data source specific instructions to move copies of tables (or results sets from an execution) from the database with the smaller table to a temporary table on the target data source. Processing of the query occurs on the data source with the larger source table. Because the two sides of the federated query reside on a single data source, processing efficiencies yield a faster return of the result set.
When a query is submitted to TDV, TDV evaluates the query and produces an execution plan without using the data ship optimization. This plan is a pre-data ship plan. TDV goes on to examine the fetch nodes of this plan and decides whether to use the data ship optimization. The data ship optimization is used if:
The data sources corresponding to fetch node supports data ship.
Data ship is not disabled for the query using query hints.
Data source is configured as data ship source or target.
Among all the nodes involved, TDV dedicates one node as the target and all others as sources. TDV creates temp tables in the target data source and moves data from other nodes into this temp table. An alternate query plan is generated that transfers the federated SQL operation of a prior plan to the targeted data source.
Determining the target node is based on the:
Data source corresponding to the fetch node is configured as source or target.
Cost of retrieving the data with in the lower and upper bound set for data ship.