Old-style Query Construction
This style of query construction is deprecated as of version 4.1. It remains only for backward compatibility and to document functionality that existing users might still rely on. See Query Construction for more information on the currently recommended query formulation.
Note that not all pre-4.1 forms and options are supported in version 4.1 and later. If you are upgrading from a pre-4.1 version, your old-style queries will most likely work as before. However, in some cases you might need to modify your queries to meet the forms required by this version of the TIBCO Patterns servers.
In its simplest form, a query is a simple, contiguous block of text, held in an LPAR_BLK_SEARCHQUERY. Records are compared against this text and evaluated for similarity.
For more advanced searches, it might be desirable to have multiple, independent blocks of query text, with different search options for each block. A query text block combined with search options that pertain only to that text block is referred to as a querylet. A querylet is a list-type lpar, LPAR_LST_QUERYLET, which must contain exactly one LPAR_BLK_SEARCHQUERY and can also contain any of the following options:
| • | LPAR_INTARR_SELECTFLDS selects the record fields against which this querylet is compared. Fields are specified by number with this option. |
| • | LPAR_STRARR_FIELDNAMES selects the record fields against which this querylet is compared. Fields are specified by name with this option. |
| • | LPAR_DBL_QUERYLETWEIGHT specifies a weighting factor for this querylet. The raw score is multiplied by this factor to obtain the final score for this querylet. This can be used to adjust the relative importance of each querylet. |
| • | LPAR_DBLARR_QFIELDWEIGHTS specifies the weight for matched text against each field in the LPAR_INTARR_SELECTFLDS or LPAR_STRARR_FIELDNAMES array. The maximum weight is 1.0, with values less than 1.0 representing penalized matches. |
| • | LPAR_INT_ALIGNFIELD The meaning and use of this field changed somewhat as of the 4.1 release. Prior to 4.1 it defined the starting point for alignment of the query string, that is, the field of the merged field set over which the querylet string should be aligned. As of 4.1 it defines which field is considered the cognate field for this querylet (see Cognate Query). This value is therefore relevant only to interdependent (described below) multi-queries or multiquerylets, which correspond to cognate queries in 4.1. The value is still a zero-based index into the querylet's field set, not a field number in the database. In other words, if the field set consists of fields 1, 3, and 9, an ALIGNFIELD value of 1 assigns field 3 as the cognate field for this querylet. |
| • | LPAR_STR_THESAURUSNAME is the thesaurus to use for just this querylet. For example, this makes it possible to use one thesaurus for mapping St. to Street for an address field querylet, and St. to Saint for a City field querylet, allowing the correct recognition of Main St., St. Louis. In version 4.1 and later there is a restriction that all querylets of an interdependent multi-query or multiquerylet must use the same thesaurus. An attempt to define a different thesaurus for such interdependent querylets is rejected with a DVK_ERR_PARAMCONFLICT error. |
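The ALIGNFIELD indexing rule described in the list above can be illustrated with a short sketch. The helper resolve_align_field is hypothetical and is not part of the TIBCO Patterns API; it only shows that the value is treated as a zero-based index into the querylet's own field set rather than as a database field number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a TIBCO Patterns API call): map an ALIGNFIELD
 * value, which is a zero-based index into the querylet's field set, to the
 * database field number it designates. Returns -1 if the index is out of
 * range. */
int resolve_align_field(const int *field_set, size_t n_fields, int align_field)
{
    assert(field_set != NULL);
    if (align_field < 0 || (size_t)align_field >= n_fields)
        return -1;
    return field_set[align_field];
}
```

With the field set { 1, 3, 9 } from the example in the text, resolve_align_field(set, 3, 1) yields 3.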
Querylets must be combined into a multi-part query, or multi-query, by placing them in an LPAR_LST_MULTIQUERY. This multi-query can then be used as the querypar for the dbsearch command.
For a concrete example, consider a database with the following fields:
| • | First Name |
| • | Last Name |
| • | Social Security Number |
| • | House Number |
| • | Street Name |
| • | City |
and a search application which provides the user three text entry boxes labeled Name, Address, and SSN.
lpar_t multiquery,querylet,sq,sf;
unsigned char *name,*address,*ssn;
unsigned char *name_fields[] = { "First Name",
"Last Name" };
unsigned char *ssn_fields[] = { "Social Security Number" };
unsigned char *address_fields[] = { "House Number",
"Street Name",
"City" };
name = get_name_box();
address = get_address_box();
ssn = get_ssn_box();
multiquery=lpar_create_lst(LPAR_LST_MULTIQUERY);
/* Name querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,name,strlen(name));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,name_fields,2);
lpar_append_lst(querylet,sf);
lpar_append_lst(multiquery,querylet);
/* SSN querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,ssn,strlen(ssn));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,ssn_fields,1);
lpar_append_lst(querylet,sf);
lpar_append_lst(multiquery,querylet);
/* Address querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,address,strlen(address));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,address_fields,3);
lpar_append_lst(querylet,sf);
lpar_append_lst(multiquery,querylet);
Here a multiquery containing three querylets has been built. Each querylet contains a SEARCHQUERY text block and a field selection array. Notice that there does not need to be a one-to-one mapping between record fields and querylets. When one querylet lists multiple fields in its field set, those fields are concatenated and the querylet text is compared against this concatenated field string.
In the above example, the querylets are said to be independent, because no two querylets share a common record field. When two querylets have the same or overlapping field sets, the querylets become interdependent. The match scores for interdependent querylets must be computed simultaneously, since these querylets belong to the same bipartite graph.
When a multiquery contains interdependent querylets, the field sets for all querylets are joined and the query as a whole is compared with the union of all selected fields. (In other words, when there are any interdependent querylets, all querylets are considered to be interdependent, even if their specified field sets do not actually overlap the field sets of other querylets.) In version 4.1 and later this corresponds to a cognate query (see Cognate Query) and must conform to the cognate model. That is, there must be a one-to-one mapping of querylets to fields in the combined field set.
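The interdependence rule can be made concrete with a minimal sketch. The struct field_set type and both functions below are hypothetical illustrations, not TIBCO Patterns API calls, and the sketch assumes that field names are compared exactly (the server's own name-matching rules are not described here). It reports a multi-query as interdependent as soon as any two querylets share a field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration type: the field names selected by one querylet. */
struct field_set {
    const char *const *names;
    size_t count;
};

/* True if the two field sets share at least one field name
 * (exact string comparison is an assumption of this sketch). */
static bool field_sets_overlap(const struct field_set *a,
                               const struct field_set *b)
{
    for (size_t i = 0; i < a->count; i++)
        for (size_t j = 0; j < b->count; j++)
            if (strcmp(a->names[i], b->names[j]) == 0)
                return true;
    return false;
}

/* True if any pair of querylets overlaps. Per the text, a single overlapping
 * pair makes every querylet in the multi-query interdependent. */
bool multiquery_is_interdependent(const struct field_set *sets, size_t n)
{
    assert(sets != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (field_sets_overlap(&sets[i], &sets[j]))
                return true;
    return false;
}
```

Applied to the Name, SSN, and Address field sets of the first example, the check reports no interdependence. Adding a second querylet over the name fields makes it report interdependence for the whole multi-query.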
Consider the previous example, but with the additional complication that the user now has four text boxes. The first and last names have been isolated into separate entries, but we still want to allow for queries or records where the first and last name have been accidentally transposed. The following code sets up a new query with interdependent querylets.
lpar_t multiquery,querylet,sq,sf,af,qfw;
unsigned char *fname,*lname,*address,*ssn;
unsigned char *name_fields[] = { "First Name",
"Last Name" };
double name_weights[2][2] = { { 1.0 , 0.8 } ,
{ 0.8 , 1.0 } };
unsigned char *ssn_fields[] = { "Social Security Number" };
unsigned char *address_fields[] = { "House Number",
"Street Name",
"City" };
double house_weights[3] = { 1.0, 0.8, 0.8 } ;
double street_weights[3] = { 0.8, 1.0, 0.8 } ;
double city_weights[3] = { 0.8, 0.8, 1.0 } ;
fname = get_fname_box();
lname = get_lname_box();
address = get_address_box();
ssn = get_ssn_box();
multiquery=lpar_create_lst(LPAR_LST_MULTIQUERY);
/* First name querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,fname,strlen(fname));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,name_fields,2);
lpar_append_lst(querylet,sf);
af=lpar_create_int(LPAR_INT_ALIGNFIELD,0);
lpar_append_lst(querylet,af);
qfw=lpar_create_dblarr(LPAR_DBLARR_QFIELDWEIGHTS,name_weights[0],2);
lpar_append_lst(querylet,qfw);
lpar_append_lst(multiquery,querylet);
/* Last name querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,lname,strlen(lname));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,name_fields,2);
lpar_append_lst(querylet,sf);
af=lpar_create_int(LPAR_INT_ALIGNFIELD,1);
lpar_append_lst(querylet,af);
qfw=lpar_create_dblarr(LPAR_DBLARR_QFIELDWEIGHTS,name_weights[1],2);
lpar_append_lst(querylet,qfw);
lpar_append_lst(multiquery,querylet);
/* SSN querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,ssn,strlen(ssn));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,ssn_fields,1);
lpar_append_lst(querylet,sf);
lpar_append_lst(multiquery,querylet);
/* Address querylets */
/* house */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,address,strlen(address));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,address_fields,3);
lpar_append_lst(querylet,sf);
qfw=lpar_create_dblarr(LPAR_DBLARR_QFIELDWEIGHTS,house_weights,3);
lpar_append_lst(querylet,qfw);
lpar_append_lst(multiquery,querylet);
/* street */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,address,strlen(address));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,address_fields,3);
lpar_append_lst(querylet,sf);
qfw=lpar_create_dblarr(LPAR_DBLARR_QFIELDWEIGHTS,street_weights,3);
lpar_append_lst(querylet,qfw);
lpar_append_lst(multiquery,querylet);
/* city */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,address,strlen(address));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,address_fields,3);
lpar_append_lst(querylet,sf);
qfw=lpar_create_dblarr(LPAR_DBLARR_QFIELDWEIGHTS,city_weights,3);
lpar_append_lst(querylet,qfw);
lpar_append_lst(multiquery,querylet);
The first two querylets have the same field set, so the record score is calculated using a single bipartite graph over the six named record fields. These two querylets also include two additional optional parameters. The ALIGNFIELD parameter specifies the cognate field for the querylet (see Cognate Query). In version 4.1 or later of the TIBCO Patterns servers, the cognate field for a querylet must be specified either by the ALIGNFIELD parameter or by providing a QFIELDWEIGHTS parameter that has a single unique maximum field weight. (By definition, a field set that consists of a single field has a single maximum field weight, so it is not necessary to specify either value in that case.) The next parameter, QFIELDWEIGHTS, sets up a penalty factor for field transpositions. In version 4.1 or later these weights are not applied directly, but are combined to determine the non-cognate field weight. The non-cognate weight is taken as the average of the field weights of all non-cognate fields in the field set. Because all non-cognate fields in this example have a field weight of 0.8, the average is 0.8.
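The two rules just described, choosing the cognate field from a unique maximum field weight and averaging the remaining weights, can be sketched as follows. Both functions are hypothetical illustrations rather than TIBCO Patterns API calls. The return value of -1 for a tied maximum and the sentinel of -1.0 for a single-field set are choices made for this sketch only, not documented server behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the single largest weight, or -1 if the
 * maximum is tied or the array is empty (no unique cognate field). */
int cognate_from_weights(const double *w, size_t n)
{
    assert(w != NULL || n == 0);
    if (n == 0)
        return -1;
    size_t best = 0;
    int unique = 1;
    for (size_t i = 1; i < n; i++) {
        if (w[i] > w[best]) {
            best = i;        /* new strict maximum */
            unique = 1;
        } else if (w[i] == w[best]) {
            unique = 0;      /* current maximum is tied */
        }
    }
    return unique ? (int)best : -1;
}

/* Hypothetical helper: average of the weights of all fields other than the
 * cognate field. Returns -1.0 when there are no non-cognate fields. */
double noncognate_weight(const double *w, size_t n, int cognate)
{
    double sum = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if ((int)i != cognate) {
            sum += w[i];
            count++;
        }
    }
    return count ? sum / (double)count : -1.0;
}
```

For the street weights { 0.8, 1.0, 0.8 } used in the example, the cognate index is 1 and the non-cognate weight is the average of the two 0.8 entries.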
Note that the address value is matched three times. Because there must be a one-to-one correspondence of querylets to fields, there must be three address querylets for the three address fields.
The field weights are needed to determine which querylet matches which field. Versions prior to 4.1 did not impose this one-to-one correspondence restriction on interdependent querylets.
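The one-to-one restriction can be expressed as a simple check. The sketch below is a hypothetical illustration, not a TIBCO Patterns API call: it assumes each querylet's cognate field has already been resolved to a zero-based index into the combined field set, and it verifies that every field is claimed by exactly one querylet. MAX_FIELDS is an arbitrary limit chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 64  /* arbitrary limit for this sketch */

/* Hypothetical check: cognate[q] is the index, within the combined field
 * set, of the cognate field of querylet q. Returns true only if there are
 * as many querylets as fields and no field is claimed twice. */
bool is_one_to_one(const int *cognate, size_t n_querylets, size_t n_fields)
{
    bool seen[MAX_FIELDS] = { false };

    assert(cognate != NULL || n_querylets == 0);
    if (n_querylets != n_fields || n_fields > MAX_FIELDS)
        return false;
    for (size_t q = 0; q < n_querylets; q++) {
        int f = cognate[q];
        if (f < 0 || (size_t)f >= n_fields || seen[f])
            return false;   /* out of range, or field already claimed */
        seen[f] = true;
    }
    return true;
}
```

Six querylets mapped to six distinct fields pass the check, whereas four querylets over a six-field combined set, or two querylets claiming the same field, fail it.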
Even though the SSN and address querylets do not appear to be matched against the name fields, they are, because all querylets of the multi-query are merged into a single cognate query. This is probably not what is intended. What is more likely desired is a means of combining the scores of a cross match on the name fields with the scores of simple direct matches of the SSN querylet to the SSN field and the address querylet to the address fields. In current versions this can be expressed directly and flexibly using the AND score combiner. Prior to version 4.1 there was a more limited method for doing this, using an additional layer of querylet grouping called a multiquerylet. A multiquerylet is used to explicitly specify querylet dependency. A multiquerylet is another list-type lpar, LPAR_LST_MULTIQUERYLET, which contains a field set specification (either by name or number) and a collection of interdependent querylets.
Here is the same search done with multiquerylets.
lpar_t multiquery,multiquerylet,querylet,sq,sf,af,qfw;
unsigned char *fname,*lname,*address,*ssn;
unsigned char *name_fields[] = { "First Name",
"Last Name" };
unsigned char *ssn_fields[] = { "Social Security Number" };
unsigned char *address_fields[] = { "House Number",
"Street Name",
"City" };
double name_weights[2][2] = { { 1.0 , 0.8 } ,
{ 0.8 , 1.0 } };
fname = get_fname_box();
lname = get_lname_box();
address = get_address_box();
ssn = get_ssn_box();
multiquery=lpar_create_lst(LPAR_LST_MULTIQUERY);
/* Name multiquerylet */
multiquerylet=lpar_create_lst(LPAR_LST_MULTIQUERYLET);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,name_fields,2);
lpar_append_lst(multiquerylet,sf);
/* First name querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,fname,strlen(fname));
lpar_append_lst(querylet,sq);
af=lpar_create_int(LPAR_INT_ALIGNFIELD,0);
lpar_append_lst(querylet,af);
qfw=lpar_create_dblarr(LPAR_DBLARR_QFIELDWEIGHTS,name_weights[0],2);
lpar_append_lst(querylet,qfw);
lpar_append_lst(multiquerylet,querylet);
/* Last name querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,lname,strlen(lname));
lpar_append_lst(querylet,sq);
af=lpar_create_int(LPAR_INT_ALIGNFIELD,1);
lpar_append_lst(querylet,af);
qfw=lpar_create_dblarr(LPAR_DBLARR_QFIELDWEIGHTS,name_weights[1],2);
lpar_append_lst(querylet,qfw);
lpar_append_lst(multiquerylet,querylet);
lpar_append_lst(multiquery,multiquerylet);
/* SSN querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,ssn,strlen(ssn));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,ssn_fields,1);
lpar_append_lst(querylet,sf);
lpar_append_lst(multiquery,querylet);
/* Address querylet */
querylet=lpar_create_lst(LPAR_LST_QUERYLET);
sq=lpar_create_blk(LPAR_BLK_SEARCHQUERY,address,strlen(address));
lpar_append_lst(querylet,sq);
sf=lpar_create_strarr(LPAR_STRARR_FIELDNAMES,address_fields,3);
lpar_append_lst(querylet,sf);
lpar_append_lst(multiquery,querylet);
In this case, defining the first and last name querylets as part of a single multiquerylet explicitly declares their interdependency and sets them apart from the SSN and address querylets, which are now computed independently. Because the address querylet is now independent, it is no longer bound by the one-to-one restriction between querylets and database fields, so a single address querylet is sufficient.