About the Semijoin Optimization

The semijoin optimization reduces the number of rows retrieved from the right-hand side (RHS) by rewriting the FETCH pushed to the second data source using the unique values returned by the left-hand side (LHS). Table cardinalities must be supplied for this optimization.

Inner joins can use the semijoin optimization. Left and right outer joins can use semijoin optimization, but only from the outer side to the inner side. The full outer join is not supported with the semijoin optimization.

You can specify use of the semijoin optimization in the SQL defining the view. When the table cardinalities of source and target sides are known, and when the ratio between the two is favorable, the TDV query engine automatically uses the semijoin optimization.

The semijoin can only be attempted if the right side can be queried as a node that fetches against a data source that supports an IN or an OR clause.

For inner and left outer joins, the semijoin optimization always uses rows from the LHS to constrain the RHS—for example, where LHS is the source side and RHS is the target. For a right outer join the situation is reversed: RHS is the source and LHS is the target.

As another example, consider the tables Employee and Dept and their semijoin.

The semijoin of the employee and department tables would result in the following table.

Consider another query like this:

The usual evaluation would hash the R table expression and then iterate over T table expressions, and do look-ups into the hash table to find matching rows. If the join in the example is such that the number of rows from the R-side is much smaller than the number of rows from the T-side, additional optimization is possible. The rows from R are buffered in TDV, but an additional predicate of the form “t IN (1,6,8,93…)” is added to the SQL generated for the RHS. The values on RHS of the IN clause are all the ‘r’ values from LHS of the join. This additional predicate ensures that TDV retrieves the smallest possible number of rows from the T-side to perform the join.

Employee Name	EmpId	DeptName
Harry	3415	Finance
Sally	2241	Sales
George	3401	Finance
Harriet	2202	Production

Dept DeptName	Manager
Sales	Bob
Sales	Thomas
Production	Katie
Production	Mark

Employee joined to Dept Name	EmpId	DeptName
Sally	2241	Sales
Harriet	2202	Production