Text Search
Searching by text is an important feature when dealing with logs.
LIKE | REGEXP | CONTAINS | |
---|---|---|---|
Matching level | Character | Character | Token |
Syntax | Wildcards: _ % | Java Regular Expressions | Wildcards: ? * |
Expression matches | Full string | Part of the string | Part of the string |
Indexed | No | No | Yes |
Case-sensitive | Yes | Per syntax | No |
LIKE is the classical SQL operator. It matches the full string (so leading and trailing wildcards should be added if trying to match only a fragment). It has a granularity of character, that is, we can match character by character. The supported wildcards are _ for one character and % for many characters. It is not indexed, so it is not particularly fast.
REGEXP allows matching regular expressions. It searches a match within the string. It has character granularity. The syntax for the regular expression language is the same that provided by the Java language. It is not indexed.
CONTAINS searches within the index, with a token granularity. The index stores tokens, or full words, so we must search for the full words, or add wildcards. The wildcards allowed are ? for one character and * for multiple characters, and matching part of the string is enough. This operator takes advantage of the index, and hence CONTAINS speeds up queries.
See Types of FILTER statements for details on the syntax of this operators and available string functions that can be useful to manipulate text.