A search strategy defines how a field is indexed and queried. Any field is associated with a default search strategy, primarily based on its data type.
Search strategies are specified in the Data Model Assistant:
when editing a field, its search strategies can be set in the 'Extensions' tab;
at the data model level, custom search strategies can be specified under the 'Extensions > Search' element in the left pane.
Value-labeling is a global feature in EBX® to display user-friendly labels instead of raw values. For example, in the user interface, a foreign key field displays the label of the linked record, or a field based on a static enumeration displays the localized label associated with the raw value, as specified by the data model.
If a field supports value-labeling, quick search and sorting in the user interface usually apply to the displayed label, to preserve an intuitive user interface.
There are some exceptions, where the raw value is still used by the quick search and the sort operation:
Programmatic labels and programmatic enumeration constraints (a foreign key specifying a TableRefDisplay or whose display depends on a UILabelRenderer specified on the target table, or a field constrained by a ConstraintEnumeration). It is recommended to use alternative solutions (display patterns and foreign keys).
Enumeration constraints defined using another node (<osd:enumeration osd:path=...). It is recommended to use an alternative solution (a foreign key).
If a field is displayed through a UIWidget (or a UIBean), the custom component is expected to display the label (or the value, if the field does not enable value-labeling), to preserve an intuitive user interface.
The following fields are not optimized for search and other operators:
computed fields that implement the ValueFunction API or computation rules and scripts that do not depend only on the fields from the container table;
inherited fields.
As a consequence, these are generally excluded from the quick search and from XPath search on a table via osd:search('', 'pattern').
In the specific cases of inherited datasets, history views or mapped tables, legacy search is used. This implies that the size of the table cannot be quickly estimated, and might not be presented in the UI. It also implies that quick search:
considers all searchable fields (including computed fields with non-local dependency);
behaves like a 'contains' (Lucene syntax cannot be used);
does not support sort by relevancy;
may perform poorly on tables with large volumes.
In the case of a node defining a display pattern, non-sortable search strategies are prohibited for the default search template. They are still allowed in the other search templates.
'Text' | The 'Text' search strategy is intended for fields that contain multiple words, such as descriptions, texts or comments. This strategy supports full-text search and fuzzy search. Sorting, and some functions such as the 'equals' and 'starts-with' operators, are irrelevant and are not supported. This strategy is lightweight, consuming little disk space. See also Quick Search |
'Code' | The 'Code' search strategy is intended for codes and identifiers. Values are considered as one single token, allowing any kind of case-sensitive and case-insensitive filter. Full-text search is irrelevant and is replaced by a 'contains' by default. This can be changed to a 'starts-with' by defining a custom 'Code' strategy. For large volumes, a 'starts-with' is preferable to achieve better performance. |
'Name' | The 'Name' search strategy is intended for names and labels that contain only a few words. Besides having the same search capabilities as 'Text', the 'Name' strategy also allows sorting, and supports the same filters as 'Code'. This strategy has the most capabilities, but consumes more disk space. If the purpose of the field allows it, it is advised to choose the 'Text', 'Code' or 'Excluded from search' strategy, rather than this one. |
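The difference between a multi-word 'Text' value and a single-token 'Code' value can be illustrated with a small sketch. This is only a simplified model of the behaviour described in the table, not the product's indexing code: the function names are hypothetical, and the word splitting and the case-insensitive comparison are assumptions made for the example.

```python
import re


def text_tokens(value):
    """Split a value into lowercase word tokens (assumed tokenization)."""
    return re.findall(r"\w+", value.lower())


def text_match(value, word):
    """'Text'-like match: the searched word must equal one of the tokens."""
    return word.lower() in text_tokens(value)


def code_contains(value, fragment):
    """'Code'-like default match: the whole value is one token, tested with 'contains'."""
    return fragment.lower() in value.lower()


def code_starts_with(value, prefix):
    """'Code'-like match for a custom strategy configured as 'starts-with'."""
    return value.lower().startswith(prefix.lower())


if __name__ == "__main__":
    print(text_match("Red cotton shirt", "cotton"))
    print(code_contains("AB-1234-X", "1234"))
    print(code_starts_with("AB-1234-X", "1234"))
```

In this model, a 'starts-with' only has to compare the beginning of each value, which is one intuition for why it is preferred over 'contains' on large volumes.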
Advanced search strategies are meant to support searching using alternative fuzzy algorithms, and do not support sorting or filtering operations. It is advised to use them in a custom search template, and to use a basic strategy with the default search template.
'Levenshtein' | The 'Levenshtein' strategy is intended for names and labels that contain a few words. This strategy only supports fuzzy search, based on an edit distance algorithm. The search syntax (+,-,...) is not supported. This strategy does not perform well on large volumes. The parameters of the strategy allow favoring performance, by setting the maximum edit distance to 1 and using an invariant prefix. |
'Soundex' | The 'Soundex' strategy is intended for names and labels that contain a few words. This strategy only supports fuzzy search, based on the Soundex approximate phonetic algorithm. This algorithm is meant for English, and ignores digits. |
'Double Metaphone' | The 'Double Metaphone' strategy is intended for names and labels that contain a few words. This strategy only supports fuzzy search, based on the Double Metaphone approximate phonetic algorithm. This algorithm is more modern than Soundex, and gives better results, especially for non-English languages. Digits are ignored. |
'NGram' | The 'NGram' strategy is intended for names and labels that contain a few words. This strategy only supports fuzzy search, based on a distance algorithm. This algorithm splits the values into smaller sequences called 'grams'. It performs better than the 'Levenshtein' search strategy, but consumes more disk space. |
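To make the 'Levenshtein' and 'NGram' rows more concrete, the following sketch shows a textbook edit distance with a maximum distance and an invariant prefix, and a plain split of a value into grams. It illustrates the general techniques only. The function names, the default values and the gram size of 3 are assumptions for the example and do not describe how the product implements or parameterizes these strategies.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term, value, max_distance=1, prefix_length=0):
    """Match if the first prefix_length characters are identical and the
    edit distance does not exceed max_distance."""
    if term[:prefix_length] != value[:prefix_length]:
        return False  # the invariant prefix rules out the candidate cheaply
    return edit_distance(term, value) <= max_distance


def ngrams(value, n=3):
    """Split a value into overlapping sequences of n characters."""
    return [value[i:i + n] for i in range(len(value) - n + 1)]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(fuzzy_match("smith", "smyth", max_distance=1, prefix_length=2))
    print(ngrams("search"))
```

The invariant prefix lets most candidates be rejected before any distance is computed, which is the intuition behind using it to favor performance.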
The 'Name' strategy is applied to string fields by default, except:
If the field is part of the primary key, it is set by default to 'Code'.
If the field is a foreign key, it is forced to 'Code' and cannot be changed.
If the field has a built-in datatype extending xs:string, then it has a strategy relevant to its datatype; for instance osd:text, xs:Name, osd:email, osd:html, etc.
As the default strategy 'Name' can be irrelevant and consumes more disk space, the data model compilation reports warnings for fields with the 'Name' strategy set as default, so as to ensure that strategies are defined on purpose. We advise choosing the 'Text' strategy when the length of the expected values is greater than 80, as a rough estimate. Long values (> 32766 bytes once encoded into UTF-8) will not be fully indexed with the 'Name' or 'Code' strategy. Quick search is not affected, but sorting will consider only the first 1000 characters, and some operators ('equals' and 'ends-with', SQL DISTINCT and COUNT DISTINCT, ...) will not return the correct results.
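The 32766-byte limit applies to the UTF-8 encoded form of a value, so the number of characters alone does not tell whether a value is fully indexed. A minimal sketch of such a check, useful for instance when auditing data before import, is shown below. The constant and function names are hypothetical and are not part of any product API.

```python
MAX_INDEXED_BYTES = 32766  # limit quoted in the paragraph above


def utf8_length(value):
    """Number of bytes the value occupies once encoded into UTF-8."""
    return len(value.encode("utf-8"))


def fully_indexable(value):
    """True if the value stays within the limit for 'Name' or 'Code'."""
    return utf8_length(value) <= MAX_INDEXED_BYTES


if __name__ == "__main__":
    ascii_value = "a" * 20000
    accented_value = "é" * 20000  # each 'é' takes 2 bytes in UTF-8
    print(fully_indexable(ascii_value))
    print(fully_indexable(accented_value))
```

Two values with the same number of characters can therefore fall on different sides of the limit, depending on how many non-ASCII characters they contain.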
Some strategies accept parameters, for example to define stop words, or a specific language. This is done by creating a record in the 'Search strategies' table of the 'Search' data model extension. The new parameterized strategy will be available for selection in the 'Extension' tab, for compatible fields.
It is possible to define stop words and synonyms lists in the 'Search' data model extension. Create a new record in the 'Stop words lists' or 'Synonyms lists' table and select the created list in the parameters tab of the 'Custom search strategies' table.
Primary key fields must have a sortable search strategy defined on the default search template. This excludes the 'Void' strategy for all data types, and the 'Text' strategy for strings. Note that it is still possible to use a non-sortable search strategy for a primary key field, if it is defined within a search template other than the default search template.
Foreign key fields have two levels of search:
First, if applicable, the search is performed on each field of the displayed label of the foreign key. Each field strategy is inherited from the field in its target table. This first level is not always applicable, for instance when the search string cannot be converted to any of the target field data types.
Secondly, when the first level of search cannot be applied, the search is performed on the string representation of the target primary key. Modifying the search strategy of a foreign key field in the 'Extension' tab in the Data Model Assistant only affects this second level of search. It can only be a 'Code' search strategy (built-in or customized).
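The two levels described above amount to a simple decision: search the label fields whose data type can accept the search string, and fall back to the primary key string otherwise. The sketch below models only that decision. All names are hypothetical, and representing data types by Python constructors such as int and str is an assumption made for the example, not the product's conversion logic.

```python
def convertible(search_text, field_type):
    """True if the search string can be converted to the given type."""
    try:
        field_type(search_text)
        return True
    except ValueError:
        return False


def foreign_key_search_level(search_text, label_field_types):
    """Return which level applies and, for the first level, on which fields.

    label_field_types maps each field of the displayed label to its type.
    """
    applicable = [
        name
        for name, field_type in label_field_types.items()
        if convertible(search_text, field_type)
    ]
    if applicable:
        return ("label", applicable)
    # No label field can accept the search string: use the string
    # representation of the target primary key instead.
    return ("primary_key", [])


if __name__ == "__main__":
    print(foreign_key_search_level("2020", {"year": int, "name": str}))
    print(foreign_key_search_level("abc", {"year": int}))
```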
In the case of associations, the search is performed on each field used in the label of the association records (each field strategy is inherited from the field in its target table). To be applicable, the target field must be optimized for search, and the search criterion must be convertible to its type.
Since the search on associations can be applied only on the optimized fields of its label, it will not work on inherited datasets (which do not support optimized search, as described in Limitations).
The 'Excluded from search' (or Void) strategy deactivates indexing, making filtering, searching, and sorting impossible. It is available for all data types, and is intended for fields that are never queried. Values can still be accessed through their record. Disabling the indexing reduces the disk space consumed and speeds up some operations like data import.
It is also possible to exclude fields from the quick search tool using the property osd:defaultView/hiddenInQuickSearch="true|false".
See Default view for more information.
A search strategy can be associated with a field by means of a search template. This is done in the 'Extension' tab of the field, in the Data Model Assistant. Assigning multiple search strategies to a field requires registering additional search templates into a module. Additional search templates concern only the add-ons EBX® Information Search and EBX® Match and merge.