User Guide > Configuring the TDV Massively Parallel Processing Engine > Configuring the MPP engine from the Query Execution Plan Viewer
 
Configuring the MPP engine from the Query Execution Plan Viewer
From the Studio View editor, generate the query execution plan by clicking on the button Show Execution Plan. The following plan is displayed in the Execution Plan pane.
 
Click on the PARALEL_FETCH on the left hand side to display a detailed Query Execution Plan on the right hand side. In case PARALLEL _FETCH did not happen, click on SELECT, to investigate the reason. The reason explained in the right-pane will help you tune your settings. Refer the section Considerations for using MPP Engine for a list of things to consider while using the MPP Engine.
Name: PARALLEL FETCH
Estimated Rows Returned: 8000000 (between 0 and 9223372036854775807 rows)
SQL: SELECT `R_1`.`nation`,`R_1`.`o_year`,SUM(1) AS `sum_profit` FROM (SELECT `R_3`.`nation`,EXTRACT(YEAR FROM `R_5`.`o_orderdate`) AS `o_year` FROM `ciscodv.Q-1919832149200026`.`R_3` INNER JOIN `ciscodv.Q-1919832149200026`.`R_4` ON `R_3`.`n_nationkey` = `R_4`.`s_nationkey` INNER JOIN `ciscodv.Q-1919832149200026`.`R_5` ON `R_4`.`s_suppkey` = `R_5`.`l_suppkey` INNER JOIN `ciscodv.Q-1919832149200026`.`R_2` ON `R_5`.`l_partkey` = `R_2`.`ps_partkey` AND `R_5`.`l_suppkey` = `R_2`.`ps_suppkey`) `R_1` GROUP BY `R_1`.`nation`,`R_1`.`o_year` ORDER BY `R_1`.`nation` NULLS FIRST,`R_1`.`o_year` DESC NULLS LAST
Virtual Scan R_2:
R_2 - Data source path: /users/composite/drill/ds/impala20_247
R_2 - Data source type: Impala
R_2 - SQL: SELECT tpch.partsupp.`ps_partkey`,tpch.partsupp.`ps_suppkey` FROM tpch.partsupp
R_2 - Estimated row data volume: 64
R_2 - Estimated number of rows: 8000000
R_2 - Estimated total data volume: 512000000
R_2 - Partition count: 1
R_2 - Partition column name: ps_partkey
R_2 - Partition rank: 100
Virtual Scan R_3:
R_3 - Data source path: /users/composite/drill/ds/impala20_247
R_3 - Data source type: Impala
R_3 - SQL: SELECT tpch.nation.`n_name` AS `nation`,tpch.nation.`n_nationkey` FROM tpch.nation WHERE (tpch.nation.`n_nationkey` > 5 AND tpch.nation.`n_nationkey` < 19)
R_3 - Estimated row data volume: 4294967398
R_3 - Estimated number of rows: 14
R_3 - Estimated total data volume: 60129543572
R_3 - Partition count: 8
R_3 - Partition column name: nation
R_3 - Partition rank: 150
Virtual Scan R_4:
R_4 - Data source path: /users/composite/drill/ds/tpchorcl236
R_4 - Data source type: Oracle
R_4 - SQL: SELECT "C##TPCH10"."H_SUPPLIER"."S_SUPPKEY","C##TPCH10"."H_SUPPLIER"."S_NATIONKEY" FROM "C##TPCH10"."H_SUPPLIER" WHERE ("C##TPCH10"."H_SUPPLIER"."S_NATIONKEY" < 19 AND "C##TPCH10"."H_SUPPLIER"."S_NATIONKEY" > 5)
R_4 - Estimated row data volume: 64
R_4 - Estimated number of rows: 56089
R_4 - Estimated total data volume: 3589696
R_4 - Partition count: 1
R_4 - Partition column name: s_suppkey
R_4 - Partition rank: 100
Virtual Scan R_5:
R_5 - Data source path: /users/composite/drill/ds/tpchorcl234
R_5 - Data source type: Oracle
R_5 - SQL: SELECT "C##TPCH10"."H_PART"."P_PARTKEY","C##TPCH10"."H_ORDER"."O_ORDERDATE","C##TPCH10"."H_LINEITEM"."L_PARTKEY","C##TPCH10"."H_LINEITEM"."L_SUPPKEY" FROM "C##TPCH10"."H_PART" INNER JOIN ("C##TPCH10"."H_LINEITEM" INNER JOIN "C##TPCH10"."H_ORDER" ON "C##TPCH10"."H_LINEITEM"."L_ORDERKEY" = "C##TPCH10"."H_ORDER"."O_ORDERKEY") ON "C##TPCH10"."H_PART"."P_NAME" LIKE '%dodger%' AND "C##TPCH10"."H_PART"."P_PARTKEY" = "C##TPCH10"."H_LINEITEM"."L_PARTKEY"
R_5 - Estimated row data volume: 168
R_5 - Estimated number of rows: 31959459
R_5 - Estimated total data volume: 5369189112
R_5 - Partition count: 1
R_5 - Partition column name: p_partkey
R_5 - Partition rank: 100
The following table explains the different fields in the query execution plan..
Field
Description
Estimated Rows Returned
The estimated number of rows returned for the whole query. Notice that in the above example, multiple data sources are being used in the query and the estimated number of rows returned is 8M.
SQL
The SQL query that is being executed. Notice that in the above example, the topmost query is the main query which is decomposed into partitioned queries.
Virtual Scan
 
 
A Virtual Scan represents a FETCH against a datasource that would spawn multiple queries partitioned by the partition column.
Data Source Path
The path of each of the datasource that is being used in each virtual scan.
Data Source Type
The type of each of the datasource that is being used in each virtual scan.
Estimated Row Data Volume
The estimated row data volume in bytes.
Estimated Number of Rows
The estimated number of rows that is returned in each virtual scan.
Estimated Total Data Volume
The estimated total data volume for each virtual scan.
Partition Count
Partition Count represents the number of ranges of the partition column that is dynamically computed. It indicates the number of partitioned queries that will be issued as part of each virtual scan.
Partition Column Name
This is the column chosen as the most suitable candidate for issuing partitioned queries, most likely candidates are numeric columns.
Partition Rank
Partition rank is a weight assigned to candidate partition columns based on their data type, the uniqueness of their values and the type of statistics (detailed or boundary) available for the distribution of their values. One of the columns with the highest weight is designated as the partition column.
Note: On the top level SELECT node of the query execution plan, when there is a note “No Parallelization”, it indicates that the query did not match the conditions to be processed in parallel. Some of the reasons for such a scenario include Concurrent Request Limit not set, data sources characteristics not suitable for parallel processing, Case-sensitivity or Trailing space settings do not allow for parallel processing, Semi-join optimization takes precedence over parallel processing, etc. In these cases tune the appropriate settings to force parallel mode.
If you find that your query does not run in parallel mode though the query plan shows a Parallel Fetch then there could be a runtime failure that causes the Fallback to classic engine. The fallback reason will be logged in the DV log under Server > SQL Engine > Parallel Processing > Log Level (when it is set to HIGH). There could be various reasons why a runtime failure occurs: for example, the node got disconnected from the cluster, configured memory was not sufficient for query to finish, runtime code generation caused a failure, etc.