Tuning the Server Configuration

There are certain configuration parameters that can be tuned, in order to use the MPP Engine for your query. Certain tuning options are explored in this section with examples.

This setting controls the estimated maximum number of rows in the source side of a join to trigger an automatic semi-join. This value can be overridden at data source level using the Studio.

n1.n_name supp_nation,

n2.n_name cust_nation,

l_extendedprice volume

FROM /shared/new_DS/hive211/tpch_11/supplier,

/shared/new_DS/hive211/tpch_11/lineitem,

/shared/new_DS/hive211/tpch_11/orders,

/shared/new_DS/hive211/tpch_11/customer,

/shared/new_DS/imp27/tpch_11/nation n1,

/shared/new_DS/imp27/tpch_11/nation n2

WHERE ((((((s_suppkey = l_suppkey AND o_orderkey = l_orderkey) AND c_custkey = o_custkey) AND s_nationkey = n1.n_nationkey) AND c_nationkey = n2.n_nationkey) AND ((n1.n_name LIKE 'FRANCE%' AND n2.n_name LIKE 'GERMANY%') OR (n1.n_name LIKE 'EGYPT%' AND n2.n_name LIKE 'IRAN%'))) AND l_shipdate BETWEEN DATE '1995-01-01' AND DATE '1996-12-31')) shipping

GROUP BY supp_nation,

ORDER BY supp_nation ASC,

Query plan contains a SCAN node using a SEMI join: |physical.operator.SqlScan (@44093785, @471fc95b) |Type : |Node Id : 10 |Datasource Id : 20173 |Plan Level : 0 |Cardinality : 8 (between 0 and 9223372036854775807 rows) | Selectables: customer.c_nationkey, supplier.s_nationkey | Map : 0, 1 | Matching Id : 30, 32 |semiJoinID: 2 |semiJoinNode: physical.operator.DynamicHashJoin@2ced0d21 |Push SQL is SELECT customer.`c_nationkey`,supplier.`s_nationkey` FROM tpch_11.supplier JOIN tpch_11.lineitem ON supplier.`s_suppkey` = lineitem.`l_suppkey` JOIN tpch_11.orders ON lineitem.`l_orderkey` = orders.`o_orderkey` JOIN tpch_11.customer ON orders.`o_custkey` = customer.`c_custkey` WHERE (lineitem.`l_shipdate` >= '1995-01-01' AND lineitem.`l_shipdate` <= '1996-12-31')

1. Navigate to Administration -> Configuration -> Server -> SQL Engine -> Optimizations -> Semi join and set the “Max Source Side Cardinality Estimate” to -1. You can also navigate to the property by searching for it using the search option in the configuration window.

This setting controls the default case-sensitivity of queries. If false, string comparisons are done ignoring case. If this setting does not match the case-sensitivity setting of a data source, performance will be degraded when querying that source. The default value is False. Depending on your configuration of this option, you may come across the following fallback reason:

Cannot delegate plan execution to the parallel runtime engine: Cannot push Selectable because the data source Apache Drill 1.14 is case sensitive and our system is not case sensitive and the selectable is not wrapped with upper or lower function --`R_1`.`supp_nation`--

1. Navigate to Administration -> Configuration -> Server -> SQL Engine -> SQL Language and set the “Case Sensitivity” to True. You can also navigate to the property by searching for it using the search option in the configuration window.

This setting controls the ignore trailing spaces during string comparisons in queries. If true, string comparisons are done ignoring trailing spaces. If this setting does not match the trailing spaces setting of a data source, performance will be degraded when querying that source. Default value is True. Depending on your configuration of this option, you may come across the following fallback reason:

Cannot delegate plan execution to the parallel runtime engine: Cannot push selectable because the data source Apache Drill 1.14 does not ignore trailing spaces and our system ignores trailing spaces and the selectable is not wrapped with rtrim function --`R_1`.`supp_nation`--

1. Navigate to Administration -> Configuration -> Server -> SQL Engine -> SQL Language and set the “Ignore Trailing Spaces” to False. You can also navigate to the property by searching for it using the search option in the configuration window.

The Minimum Partition Volume configuration is used to set the data volume to be fetched by each partition. The default value is 256. If set to 0 or a negative value, this has no effect. Run the following query to analyze this option:

select CASE (N_NAME)

WHEN 'ALGERIA' THEN 'result1'

WHEN 'KENYA' THEN 'result2'

from /users/composite/drill/ds/tpchorcl234/C##TPCH10/H_NATION inner join

/users/composite/drill/ds/tpchorcl236/C##TPCH10/H_REGION

on N_REGIONKEY=R_REGIONKEY

Plan has scans with low estimated data volumes. Maximum estimated data volume: 4250

1. Navigate to Server -> SQL Engine -> Parallel Processing -> Minimum Partition Volume and set it to a lower value. You can also navigate to the property by searching for it using the search option in the configuration window.

Note: Queries handling data of low volume are typically not handled by the parallel processing engine. The setting “Disable Low Data Volume Restriction” can be turned on to disable this rule.