The PSI Prefilter - Usage Details

Like the SORT prefilter, a PSI prefilter uses a collection of sorted indexes. The terminology, defaults, and restrictions of the SORT prefilter apply to the PSI prefilter.

Creating a PSI prefilter is the same as creating a SORT prefilter: pass an LPAR_LST_ENCODINGS in the dbpars parameter of the lkt_dbload function. Each element in the LPAR_LST_ENCODINGS lpar must be an LPAR_LST_ENCODING lpar. Within each LPAR_LST_ENCODING, PSI allows five lpars (SORT only allows two).

The first is LPAR_STRARR_FIELDNAMES. This string array lists the field names used for a given encoding. The first field name is the primary sort key, the second is the secondary sort key, and so on.
The second is LPAR_INTARR_BKWDSFLDS. Fields can also be sorted backwards (from the last character in the field to the first). Important fields can and should be used as primary sort keys in both their forward and reverse directions. The backward fields array, if given, must be the same length as the field names array. Each integer is set to one if the field in the corresponding position in the field names array is to be sorted in the backward direction, and to zero for the forward direction. If this array is not given, all fields are sorted in the forward direction.

These first two parameters, LPAR_STRARR_FIELDNAMES and LPAR_INTARR_BKWDSFLDS, operate exactly as in SORT.
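
The ordering that the field names array and the backward fields array describe can be illustrated with a short Python sketch. This is not TIBCO Patterns code and does not call its API. The record layout, the helper name encoding_sort_key, and the sample data are assumptions made for illustration, and the server's actual comparison rules (for example, case handling) might differ.

```python
def encoding_sort_key(record, field_names, backward_flags=None):
    """Build a sort key for one record under one encoding (illustrative only)."""
    if backward_flags is None:
        # No backward array given: every field sorts in the forward direction.
        backward_flags = [0] * len(field_names)
    if len(backward_flags) != len(field_names):
        raise ValueError("backward flags must match field names in length")
    key = []
    for name, backward in zip(field_names, backward_flags):
        value = record[name]
        # A backward field is compared from its last character to its first.
        key.append(value[::-1] if backward else value)
    return tuple(key)


records = [
    {"last": "Johnson", "first": "Ann"},
    {"last": "Jackson", "first": "Bob"},
    {"last": "Smith", "first": "Cal"},
]

# Forward on "last" (primary key), then "first" (secondary key).
forward = sorted(records, key=lambda r: encoding_sort_key(r, ["last", "first"]))

# Backward on "last": the compared strings are "nosnhoJ", "noskcaJ", "htimS".
backward = sorted(
    records, key=lambda r: encoding_sort_key(r, ["last", "first"], [1, 0])
)
```

In the backward ordering, Jackson and Johnson sit next to each other because they share the ending "son", which is why a backward encoding can help with values that differ near the start of a field.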

The third is LPAR_INT_ENCODING_SFX_CNT, which holds the number of fields to which suffixing is applied. This is optional. If present, it must be 0 or 1. The default is 1. Passing 0 turns off suffixing for the encoding, making it behave much like a SORT encoding.
The fourth is LPAR_INTARR_PSIMINMATCHSIZES. This is also optional. If present, it must be the same length as LPAR_STRARR_FIELDNAMES. It is used for improving search speed. Larger values improve speed, but might degrade accuracy. Contact your TIBCO representative for assistance in balancing this parameter. Each entry controls the minimum amount of data that must match within a specific field. If no field meets the minimum, the prefilter rejects the record and prunes the scan of the encoding.
The fifth is LPAR_INT_PSI_DENSITY. This is optional. The allowable values are 0 (high density), 1 (standard density, the default), and 2 (low density). High density might improve accuracy, but it increases memory usage and might lower throughput. Low density lowers memory usage and might increase throughput, but it might lower accuracy.
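
The constraints on the five parameters can be summarized as a small Python check. This sketch models one LPAR_LST_ENCODING as a plain dictionary keyed by the parameter names; the dictionary representation and the function name validate_psi_encoding are hypothetical and are not part of the TIBCO Patterns API. It also assumes that the field names array is required, which the text above implies but does not state.

```python
def validate_psi_encoding(enc):
    """Return a list of rule violations for one modeled encoding (empty if none)."""
    names = enc.get("LPAR_STRARR_FIELDNAMES")
    if not names:
        # Assumption: an encoding without field names is not meaningful.
        return ["LPAR_STRARR_FIELDNAMES is missing or empty"]

    problems = []

    bkwds = enc.get("LPAR_INTARR_BKWDSFLDS")
    if bkwds is not None:
        if len(bkwds) != len(names):
            problems.append("LPAR_INTARR_BKWDSFLDS length differs from field names")
        if any(flag not in (0, 1) for flag in bkwds):
            problems.append("LPAR_INTARR_BKWDSFLDS entries must be 0 or 1")

    # Optional; the default of 1 applies suffixing.
    if enc.get("LPAR_INT_ENCODING_SFX_CNT", 1) not in (0, 1):
        problems.append("LPAR_INT_ENCODING_SFX_CNT must be 0 or 1")

    min_sizes = enc.get("LPAR_INTARR_PSIMINMATCHSIZES")
    if min_sizes is not None and len(min_sizes) != len(names):
        problems.append("LPAR_INTARR_PSIMINMATCHSIZES length differs from field names")

    # Optional; 0 = high, 1 = standard (default), 2 = low density.
    if enc.get("LPAR_INT_PSI_DENSITY", 1) not in (0, 1, 2):
        problems.append("LPAR_INT_PSI_DENSITY must be 0, 1, or 2")

    return problems


good = {
    "LPAR_STRARR_FIELDNAMES": ["last", "first"],
    "LPAR_INTARR_BKWDSFLDS": [1, 0],
    "LPAR_INT_PSI_DENSITY": 2,
}
bad = {
    "LPAR_STRARR_FIELDNAMES": ["last", "first"],
    "LPAR_INTARR_PSIMINMATCHSIZES": [3],
    "LPAR_INT_ENCODING_SFX_CNT": 2,
    "LPAR_INT_PSI_DENSITY": 5,
}
good_problems = validate_psi_encoding(good)
bad_problems = validate_psi_encoding(bad)
```

Here good_problems is empty, and bad_problems reports the suffix count, the minimum match sizes length, and the density value.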

Like the SORT prefilter, if no encodings are given when a PSI-enabled table is created, a default set of encodings is generated automatically. This default set should provide good accuracy for record-matching operations but might take more memory than a carefully crafted set of encodings.

Contact your TIBCO representative for further information on which encodings to use for your specific application.

To use the PSI prefilter in a search, the table must be created with PSI enabled.

Note: Unlike SORT, LPAR_LST_DEDUPQUERY in the dbpars is not supported by PSI.

Like SORT, the PSI prefilter accepts a set of field values corresponding to the fields of the table. The LPAR_BLKARR_PSILOOKUPFIELD parameter specifies these field values explicitly. The length of the array must be the same as the number of fields in the table, and the order of the values must correspond to the field order when the table was defined.

If the look-up fields are not specified, the TIBCO Patterns server attempts to infer them from the query given. If it cannot do so, an error is returned and no query is performed.
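
The shape of an explicit look-up array, with one value per table field in definition order, can be sketched in Python as follows. The helper build_lookup_fields and the sample field names are hypothetical and not part of the TIBCO Patterns API. How a field with no look-up value should be represented is not stated above; an empty string is used here purely as an assumption.

```python
def build_lookup_fields(table_fields, values_by_field):
    """Arrange look-up values in table field order (illustrative only)."""
    unknown = set(values_by_field) - set(table_fields)
    if unknown:
        raise KeyError(f"not a field of the table: {sorted(unknown)}")
    # One entry per table field, in the order the table was defined.
    # Assumption: a field with no look-up value is sent as an empty string.
    return [values_by_field.get(name, "") for name in table_fields]


table_fields = ["first", "last", "city"]
lookup = build_lookup_fields(table_fields, {"last": "Smith", "first": "Cal"})
```

The resulting list has the same length as table_fields regardless of how many values were supplied, which is the length requirement described above.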

If the PSI prefilter is enabled for a table, the server uses PSI for queries on that table unless the LPAR_BOOL_PSISEARCH parameter is sent with the value false.

Whenever lkt_dbsearch is called, the same Boolean parameter, LPAR_BOOL_PSISEARCH, is returned in the stats list. It is true if the PSI prefilter was used and false otherwise.
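
The default behavior of LPAR_BOOL_PSISEARCH can be modeled in a few lines of Python. This is a simplified reading of the rule stated above, not server code: the function name is hypothetical, None stands for "parameter not sent", and the model ignores any other condition under which the server might skip or reject a PSI search.

```python
def psi_prefilter_used(table_psi_enabled, psisearch=None):
    """Model the rule: PSI is used on a PSI-enabled table unless turned off."""
    # psisearch is None when LPAR_BOOL_PSISEARCH is not sent with the query.
    # Assumption: on a table without PSI, the prefilter is simply not used.
    return bool(table_psi_enabled) and psisearch is not False


default_case = psi_prefilter_used(True)           # parameter omitted
disabled_case = psi_prefilter_used(True, False)   # explicitly turned off
```

In this model the returned value plays the role of the LPAR_BOOL_PSISEARCH entry in the stats list.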

Whenever database statistics are returned for a PSI-enabled database, the value of LPAR_INT_DBIDXTOTALKBYTES gives the total size of the PSI prefilter data structures associated with that database. The value of LPAR_INTARR_PSI_DENSITIES lists the suffix densities of the PSI encodings.

Note: If predicate indexes also exist for the table, the value of LPAR_INT_DBIDXTOTALKBYTES also includes the space used by the predicate indexes. See Predicate Indexes for more details.