Predicate Functions and Argument Lists
As described in the previous section, Predicate lpar Expressions, predicate expressions consist of constant or field values, unary operators, and binary operators. Some operations need more than one or two inputs. Consider determining the distance between two points on the Earth’s surface. To do so we need the latitude and longitude of the first point, plus the latitude and longitude of the second point, for a total of four values. Predicate expressions handle this by assembling the four values into a special type called an argument list. The distance operation is then implemented as a unary operator that accepts a value of the special type "argument list". Unary operators that accept a value of type argument list are called predicate functions.
An argument list is created using the PRED_OP_ARGS_CREATE unary operator. This creates an argument list with one value. Additional values are appended to the argument list using the PRED_OP_ARGS_APPEND binary operator. The left side of this operator is an argument list, the right side is any value, and the result is a new argument list with the value added to the end. An example of creating an argument list for the distance function is given below. The distance function actually takes five arguments; the fifth argument is the unit of measure for the output value.
/*
 * Create call to geo-distance function.
 */
/* create the argument list with the first argument. */
lpar_t geo_args = lpar_create_predicate2(PRED_OP_ARGS_CREATE,
                      lpar_create_dbl(LPAR_DBL_PREDVALUE, 45.0));
/* append each of the next arguments */
geo_args = lpar_create_predicate3(geo_args, PRED_OP_ARGS_APPEND,
               lpar_create_dbl(LPAR_DBL_PREDVALUE, 75.0));
geo_args = lpar_create_predicate3(geo_args, PRED_OP_ARGS_APPEND,
               lpar_create_str(LPAR_STR_PREDFIELD, "latitude"));
geo_args = lpar_create_predicate3(geo_args, PRED_OP_ARGS_APPEND,
               lpar_create_str(LPAR_STR_PREDFIELD, "longitude"));
geo_args = lpar_create_predicate3(geo_args, PRED_OP_ARGS_APPEND,
               lpar_create_str(LPAR_STR_PREDVALUE, "miles"));
/* now create the geo-distance call. */
lpar_t geo_distance = lpar_create_predicate2(PRED_OP_FUNC_GEOD, geo_args);
The order in which arguments are appended is critical. Each predicate function expects a specific set of arguments in its argument list in a specific order.
In the example above all of the arguments are simple values, but they can be any predicate expression. For example, to use the distance value returned by the above predicate expression to score records, you need to convert it from a distance measure to a score in the range of 0.0 to 1.0. This can be done using the PRED_OP_FUNC_TOSCORE predicate function. This expects 3 values: the raw value to be converted, a value that represents the zero score, and a value that represents the 1.0 score. Using the geo_distance value from the above example, this call is created as follows:
/*
 * Now convert distance to a score value.
 */
/* create the argument list for the to score function,
 * starting with the raw value (the geo-distance call). */
lpar_t toscore_args = lpar_create_predicate2(PRED_OP_ARGS_CREATE,
                          geo_distance);
/* the zero score value: 20.0 miles or more scores 0.0 */
toscore_args = lpar_create_predicate3(toscore_args, PRED_OP_ARGS_APPEND,
                   lpar_create_dbl(LPAR_DBL_PREDVALUE, 20.0));
/* the 1.0 score value: 0.0 miles scores 1.0 */
toscore_args = lpar_create_predicate3(toscore_args, PRED_OP_ARGS_APPEND,
                   lpar_create_dbl(LPAR_DBL_PREDVALUE, 0.0));
/* call the function. */
lpar_t toscore_func = lpar_create_predicate2(PRED_OP_FUNC_TOSCORE,
                          toscore_args);
As mentioned previously, a predicate function is a unary operator that accepts an argument list as its operand. When the expression is compiled, a syntax check verifies that the operand is an argument list and not, for example, an integer value; if it is not, a DVK_ERR_PARAMVAL error is returned. In addition, each predicate function operator verifies that its argument list contains the number and type of arguments it expects. A DVK_ERR_PARAMVAL error is also returned if this validation fails.
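The count-and-type validation described above can be pictured with a small standalone model. This is only a sketch: the type tags, status codes, and the toy_check_args function are hypothetical names invented for illustration, and the library's actual checks and internal representation are not documented here.

```c
#include <stddef.h>

/* Hypothetical value types and status codes, used by this sketch only.
 * They are not the library's definitions. */
typedef enum { TOY_TYPE_DOUBLE, TOY_TYPE_STRING, TOY_TYPE_BOOL } toy_type_t;
typedef enum { TOY_OK = 0, TOY_ERR_PARAMVAL = 1 } toy_status_t;

/* Check that an argument list holds exactly the expected types, in order.
 * A wrong count or a type in the wrong position is rejected. */
static toy_status_t toy_check_args(const toy_type_t *actual, size_t actual_count,
                                   const toy_type_t *expected, size_t expected_count)
{
    size_t i;

    if (actual_count != expected_count)
        return TOY_ERR_PARAMVAL;
    for (i = 0; i < expected_count; i++) {
        if (actual[i] != expected[i])
            return TOY_ERR_PARAMVAL;
    }
    return TOY_OK;
}
```

In this model, a distance call would be checked against the pattern double, double, double, double, string. A list with the units string in the first position has the right number of arguments but still fails, which is why the order of appends matters.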
The following are descriptions of each of the predicate functions:
PRED_OP_FUNC_GEOD
This function computes the distance between two points on the Earth’s surface. It takes the following five arguments:
| 1. | The first argument is the latitude of the first point. It is expressed in degrees as a floating point number. |
| 2. | The second argument is the longitude of the first point. It is expressed in degrees as a floating point number. |
| 3. | The third argument is the latitude of the second point. It is expressed in degrees as a floating point number. |
| 4. | The fourth argument is the longitude of the second point. It is expressed in degrees as a floating point number. |
| 5. | The fifth argument defines the units to be used. The units supported are either miles or kilometers. The argument value is a string value and must be one of the following recognized strings (letters are not case sensitive): “km”, “mi”, “mile”, “miles”, “kilometer”, “kilometers”. |
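As an illustration of the case-insensitive matching described for the fifth argument, the following standalone sketch maps a units string to one of two unit codes. The geo_unit_t enum and the parse_geo_unit function are hypothetical names made up for this example; they are not part of the library, and how the library itself parses the string is not stated.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical unit codes for this sketch; not part of the documented API. */
typedef enum { GEO_UNIT_INVALID = 0, GEO_UNIT_MILES, GEO_UNIT_KILOMETERS } geo_unit_t;

/* Case-insensitive equality of two NUL-terminated ASCII strings. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == '\0' && *b == '\0';
}

/* Map a units string to a unit code, accepting only the six listed spellings. */
static geo_unit_t parse_geo_unit(const char *text)
{
    static const struct { const char *name; geo_unit_t unit; } table[] = {
        { "mi",         GEO_UNIT_MILES },
        { "mile",       GEO_UNIT_MILES },
        { "miles",      GEO_UNIT_MILES },
        { "km",         GEO_UNIT_KILOMETERS },
        { "kilometer",  GEO_UNIT_KILOMETERS },
        { "kilometers", GEO_UNIT_KILOMETERS }
    };
    size_t i;

    if (text == NULL)
        return GEO_UNIT_INVALID;
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (equals_ignore_case(text, table[i].name))
            return table[i].unit;
    }
    return GEO_UNIT_INVALID;
}
```

Under this sketch, "Miles" and "KM" are accepted, while spellings outside the list, such as "kilometres" or "meters", are rejected.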
PRED_OP_FUNC_TOSCORE
This function converts a raw value to a value in the valid score range of 0.0 to 1.0. You identify a raw value that is to be considered the 0.0 score and a raw value that is to be considered the 1.0 score. Raw values between these two are converted using linear interpolation. Raw values outside the range are converted to either 0.0 or 1.0. The 0.0 score value may be greater than the 1.0 score value. In this case the interpolation is reversed: lower values get higher scores than higher values. For example, to convert a distance to a score, where anything more than 20 miles away is a non-match and a distance of 0.0 miles is a perfect match, the zero score value would be 20.0 and the 1.0 score value would be 0.0. All arguments must be of type double. The following are the arguments:
| 1. | The raw value. |
| 2. | The zero score value. |
| 3. | The 1.0 score value. |
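The conversion described above amounts to a clamped linear interpolation. The following standalone function is a minimal sketch of that arithmetic, not the library's implementation. The name to_score is hypothetical, and the handling of the degenerate case where the two boundary values are equal is an assumption of this sketch, since the text does not specify it.

```c
/* Sketch of the score conversion: linear interpolation between the raw value
 * that maps to 0.0 (zero_value) and the raw value that maps to 1.0
 * (one_value), clamped to the range [0.0, 1.0]. */
static double to_score(double raw, double zero_value, double one_value)
{
    double score;

    if (zero_value == one_value) {
        /* Assumption of this sketch: with an empty range, only an exact
         * match on the boundary value scores 1.0. */
        return raw == one_value ? 1.0 : 0.0;
    }

    /* The same formula covers the reversed case (zero_value > one_value),
     * because numerator and denominator change sign together. */
    score = (raw - zero_value) / (one_value - zero_value);

    if (score < 0.0)
        score = 0.0;
    if (score > 1.0)
        score = 1.0;
    return score;
}
```

With the distance example from the text (zero score value 20.0, 1.0 score value 0.0), a distance of 10 miles gives (10 - 20) / (0 - 20) = 0.5, and any distance of 20 miles or more is clamped to 0.0.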
PRED_OP_FUNC_IF
This function can be thought of as an extended version of the “C” ternary operator. It is used to select one of a set of possible values based on conditional expressions.
Unlike the other functions, this one accepts a variable number of arguments, but the arguments must follow a specific set of rules. The arguments come in pairs, followed by a final singleton argument. The first argument of each pair must evaluate to a Boolean value. The second argument of each pair may have any value type, with the restriction that all second arguments, and the final singleton argument, must have the same value type. Zero or more argument pairs are allowed.
The value returned by this expression is the value of the second argument of the first pair whose first argument evaluates to true. If no such pair exists, the value of the final singleton argument is returned.
The following example returns “good comparison”:
{ 0 > 10, “bad comparison”, 10 > 0, “good comparison”, “no comparison” }
But this example returns "ok math":
{ 0 > 10, “bad comparison”, 2*5 > 3*5, “bad math”, “ok math” }
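The selection rule can be modeled in plain C as a scan over condition and value pairs with a fallback. This is a sketch of the semantics only: if_pair_t and select_if are hypothetical names, the values are restricted to strings for brevity, and the conditions are already evaluated before the call, which may differ from how the library evaluates its argument expressions.

```c
#include <stddef.h>

/* One condition/value pair. The condition is an already evaluated Boolean. */
typedef struct {
    int         condition;
    const char *value;
} if_pair_t;

/* Return the value of the first pair whose condition is true, or the
 * fallback (the final singleton argument) if no condition is true.
 * pair_count may be zero, in which case the fallback is always returned. */
static const char *select_if(const if_pair_t *pairs, size_t pair_count,
                             const char *fallback)
{
    size_t i;

    for (i = 0; i < pair_count; i++) {
        if (pairs[i].condition)
            return pairs[i].value;
    }
    return fallback;
}
```

Feeding the two examples above into this model, the pairs (0 > 10, "bad comparison") and (10 > 0, "good comparison") with fallback "no comparison" select "good comparison", while the pairs (0 > 10, "bad comparison") and (2*5 > 3*5, "bad math") with fallback "ok math" fall through to "ok math".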