Decision Table Performance Considerations

Performance Considerations

The Decision Table operator, in loading a decision table, evaluates the expression in every cell of the table. This is done for two reasons. First, if any invalid expressions are present in the decision table, the load can be rejected and the previously loaded decision table, if any, can be retained. Second, the load-time evaluations populate an expression cache, increasing subsequent runtime performance by several orders of magnitude.

Expressions that appear more than once per column are evaluated just once. Thus, the total time to load a decision table depends upon the total number of unique expressions across all the table columns. For example, in a 100-row, 10-column table, if every cell in each column is unique, 1,000 expressions will be evaluated during loading. If, on the other hand, each column has only 20 unique values, only 200 expressions will be evaluated.

When a decision table is loaded, the Decision Table operator builds a decision tree, in which the root node holds the unique expressions in the first column of the decision table. The expressions in the root node point to second-level nodes holding expressions in the second column of the decision table, and so on. In processing an input tuple against the decision tree, all the expressions in the root node are evaluated, but only those second-level expressions whose parent expressions evaluated to true are evaluated, and so on. Thus, the total number of expressions evaluated in processing an input tuple against a "well-formed" decision tree is typically a small subset of the total set of expressions in the original decision table. A well-formed decision tree is one derived from a decision table containing many duplicate expressions, particularly in the left-most columns. Each expression that evaluates to false eliminates a large portion of the tree, so processing an input tuple typically requires evaluating a small number of expressions.