Parser Field Reference - Advanced Data Models
Key-Value Parser
Field | Description |
---|---|
Values separator | Enter the delimiter that you want to use to separate key-value pairs. You can add only one separator at a time. The delimiters are case sensitive. For example, in user=bob,vm=windows, user=bob is one pair and vm=windows is another pair, separated by the comma (,) delimiter. The delimiter can be a single character, a string that must be matched exactly, or a Java regular expression. RegEx: Select ON to use the separator as a Java regular expression or OFF to use it as a literal string. |
Key-value separator | Enter the delimiter that you want to use to separate keys from their values. The delimiters are case sensitive. For example, in user=bob, user is a key and bob is a value, separated by the equal sign (=) delimiter. The delimiter can be a single character, a string that must be matched exactly, or a Java regular expression. RegEx: Select ON to use the separator as a Java regular expression or OFF to use it as a literal string. |
Beginning (RegEx) | To ignore some initial characters in each line, enter a regular expression for them. If a segment at the beginning of the line matches this regular expression, it is ignored. For example, if a line starts with Login followed by key-value pairs, entering Login in this field causes the first word Login to be ignored when extracting columns. Named groups in the regular expression are extracted as columns. Note: For logs sent through UDP, when you create a new data model, type .?.?.? in the Beginning (RegEx) field so that LogLogic LMI can parse the logs correctly. |
Ending (RegEx) | To ignore some characters at the end of each line, enter a regular expression for those characters. If a segment at the end of the line matches this regular expression, then it is ignored. Named groups in the regular expression are extracted as columns. |
Predefined Columns | Used to define a fixed list of columns to be parsed. This field is useful when the column names contain more than one word and the separator is a space. For example, for the log Account Name:acc1, Account Domain:loglogic, Caller Computer Name:dell, specify "Account Name", "Account Domain", and "Caller Computer Name" in the Predefined Columns field to have the columns and their values extracted correctly. |
Last key | Enter a key name. When that key is found in a line, the parser stops searching for more key-value pairs in that line, and the value for that key is the remaining content of the line. For example, if the line ends with Severity="high",EventSubClass="1",ObjectID="389576426" and you specify Severity as the last key, then the value for Severity is "high",EventSubClass="1",ObjectID="389576426". Note: To specify a <space>, enter \s (backslash followed by s). For a <tab>, enter \t (backslash followed by t). |
Expression | The expression uses a key name preceded by “$” to extract the value for the column. For example, $user is the value of the key "user" in the log line, or empty if the key is not present. For an illustrative sketch of this parsing behavior, see the example after this table. |
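The key-value extraction and the $key lookup described above can be approximated with a minimal Java sketch, assuming a literal comma as the values separator and a literal equal sign as the key-value separator. The KeyValueSketch class and its method names are hypothetical and are not the LogLogic LMI implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of key-value extraction, assuming "," as the values
// separator and "=" as the key-value separator (both literal strings).
// Illustrative only; not the LogLogic LMI implementation.
public class KeyValueSketch {
    public static Map<String, String> parse(String line) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (String pair : line.split(",")) {      // values separator
            String[] kv = pair.split("=", 2);      // key-value separator
            if (kv.length == 2) {
                columns.put(kv[0].trim(), kv[1].trim());
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, String> cols = parse("user=bob,vm=windows");
        // $user resolves to the value of the key "user", or empty if absent
        System.out.println(cols.getOrDefault("user", ""));  // bob
        System.out.println(cols.getOrDefault("vm", ""));    // windows
    }
}
```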
Back to Adding a Parsing Rule in an Advanced Data Model
JSON Parser
Back to Adding a Parsing Rule in an Advanced Data Model
XML Parser
Field | Description |
---|---|
Beginning (RegEx) | To ignore some initial characters in each line, enter a regular expression for them. If a segment at the beginning of the line matches this regular expression, it is ignored. For example, if a line starts with Login followed by the XML content, entering Login in this field causes the first word Login to be ignored when extracting columns. Named groups in the regular expression are extracted as columns. Note: For logs sent through UDP, when you create a new data model, type .?.?.? in the Beginning (RegEx) field so that LogLogic LMI can parse the logs correctly. |
Ending (RegEx) | To ignore some characters at the end of each line, enter a regular expression for those characters. If a segment at the end of the line matches this regular expression, then it is ignored. Named groups in the regular expression are extracted as columns. |
Root path | If you need to extract a portion of the XML log, you can provide the starting point from a specific hierarchy within the XML log. If you leave the field empty or provide "/", the parser parses the entire XML log. To provide a starting point, use "/" as the separator between elements, for example, /files/fileInfo_1/location. If the XML log contains sibling elements with the same name, you can address them by using "[index]", for example, /files/fileInfo[1] or /files/fileInfo[2]. If an XML element has attributes, the attribute name is also separated by an underscore (_) in the column name. Both of these cases apply to the following XML log, which has two fileInfo siblings, each with a sizeUnit attribute (see the sketch after this table): <files> <fileInfo sizeUnit = "kb"> <fullName>/vaibhav/data.txt</fullName> <fileName>vaibhav.txt</fileName> <location>/vaibhav</location> </fileInfo> <fileInfo sizeUnit = "kb"> <fullName>/shane/data.txt</fullName> <fileName>shane.txt</fileName> <location>/shane</location> </fileInfo> </files> |
New Line Delimiter | This field is required when the parser operates on multiline logs, especially those arriving from TIBCO LogLogic® Universal Collector. It is recommended to assign the same delimiter that is set in LogLogic® Universal Collector so that the parser removes the delimiters and the logs are parsed successfully. |
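For the Root path examples above, a standard Java XPath query illustrates how the "[index]" notation addresses sibling elements that share the same name in the sample log. This is plain Java XPath used only to demonstrate the path semantics; the XmlRootPathSketch class is hypothetical and is not the LogLogic LMI parser implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: shows how a root path such as /files/fileInfo[1]
// selects one of two sibling elements in the sample log above.
public class XmlRootPathSketch {
    public static void main(String[] args) throws Exception {
        String log = "<files>"
                + "<fileInfo sizeUnit=\"kb\"><fullName>/vaibhav/data.txt</fullName>"
                + "<fileName>vaibhav.txt</fileName><location>/vaibhav</location></fileInfo>"
                + "<fileInfo sizeUnit=\"kb\"><fullName>/shane/data.txt</fullName>"
                + "<fileName>shane.txt</fileName><location>/shane</location></fileInfo>"
                + "</files>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(log.getBytes(StandardCharsets.UTF_8)));

        XPathFactory xpath = XPathFactory.newInstance();
        // "[index]" selects one sibling when several share the same name (1-based).
        String firstLocation = (String) xpath.newXPath()
                .evaluate("/files/fileInfo[1]/location", doc, XPathConstants.STRING);
        String secondLocation = (String) xpath.newXPath()
                .evaluate("/files/fileInfo[2]/location", doc, XPathConstants.STRING);
        System.out.println(firstLocation);   // /vaibhav
        System.out.println(secondLocation);  // /shane
    }
}
```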
Back to Adding a Parsing Rule in an Advanced Data Model
Columnar Parser
Field | Description |
---|---|
Separator | Enter the delimiter that you want to use as a column separator. The separator can be a string of one or more characters, or a Java regular expression. The delimiters are case sensitive. For example, in bob,windows, the comma (,) character separates the two columns. |
RegEx | Use this option to define how the separator is interpreted. Select ON to use it as a Java regular expression or OFF to use it as a literal string. |
Escape character | Define the character that is used to escape the column delimiter. The delimiters are case sensitive. For example, if you use a comma as a column separator and a column value contains a comma, that comma must be escaped so that the parser does not treat it as the start of a new column. |
Beginning (RegEx) | To ignore some initial characters in each line, enter a regular expression for them. If a segment at the beginning of the line matches this regular expression, it is ignored. For example, if a line starts with Login followed by columnar data, entering Login in this field causes the first word Login to be ignored when extracting columns. Named groups in the regular expression are extracted as columns. Note: For logs sent through UDP, when you create a new data model, type .?.?.? in the Beginning (RegEx) field so that LogLogic LMI can parse the logs correctly. |
Ending (RegEx) | To ignore some characters at the end of each line, enter a regular expression for those characters. If a segment at the end of the line matches this regular expression, then it is ignored. Named groups in the regular expression are extracted as columns. |
Max columns | Enter the maximum number of columns to be extracted. If more columns than the Max columns value are found, the content of the additional columns is included in the last column. For example, if the separator is <space> and the Max columns value is 3 for a message like “a b c d”, there are 3 columns with values “a”, “b”, and “c <space> d”. |
Trim values | If set to ON, the extra white space at the beginning and end of each column value is removed. If set to OFF, the extra space is not removed. |
Expression | The expression uses the $<n> identifier, where n is the column number, to extract the value of column n. For example, $2 is the value of column 2. For an illustrative sketch of this parsing behavior, see the example after this table. |
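The Max columns and Trim values behavior can be approximated with a minimal Java sketch, assuming a literal space separator, Max columns set to 3, and Trim values set to ON. The ColumnarSketch class is hypothetical and is not the LogLogic LMI implementation.

```java
import java.util.Arrays;

// Minimal sketch of columnar splitting, assuming a literal space separator,
// Max columns = 3, and Trim values = ON. Illustrative only; not the
// LogLogic LMI implementation.
public class ColumnarSketch {
    public static void main(String[] args) {
        String line = "a b c d";
        int maxColumns = 3;

        // With a split limit, the final element keeps the rest of the line,
        // which mirrors the "extra content goes into the last column" rule.
        String[] columns = line.split(" ", maxColumns);
        for (int i = 0; i < columns.length; i++) {
            columns[i] = columns[i].trim();            // Trim values = ON
        }
        System.out.println(Arrays.toString(columns));  // [a, b, c d]

        // $2 refers to the value of column 2 (1-based), here "b".
        System.out.println(columns[2 - 1]);
    }
}
```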
Back to Adding a Parsing Rule in an Advanced Data Model
Regex Parser
Field | Description |
---|---|
Regex pattern | Enter a valid PCRE regular expression that contains the groups (named or unnamed) to be extracted into column values from the log event. It is good practice to use one or more sample events to validate your regular expression and make sure that the correct values are extracted from the event. For a list of supported regular expression metacharacters, based on Java regular expressions, see Supported Regular Expression Characters. For example, (?<Sequence>\d+).*(?<ACL>\%\w+\-\d\-\w+)\:\s(?<Name>\w+)\s(?<Version>\w+)\s(?<Status>\w+)\s(?<Protocol>\w+)\s(?<SourceIP>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*(?<DestinationIP>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).* extracts 8 fields: Sequence, ACL, Name, Version, Status, Protocol, SourceIP, and DestinationIP. |
Expression | The columns are extracted using the capturing group pattern, the named capturing group pattern, or a combination of both. If you select the parser and the column list is empty, the parser tries to guess the columns from the sample data. For an illustrative sketch of named-group extraction, see the example after this table. |
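Named-group extraction works as in standard Java regular expressions. The following minimal sketch uses a simplified, made-up pattern and sample line rather than the full ACL pattern from the table; the RegexParserSketch class is hypothetical and is not the LogLogic LMI implementation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of named-group extraction with a simplified pattern and a
// made-up sample line. Illustrative only; not the LogLogic LMI implementation.
public class RegexParserSketch {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(
                "(?<Sequence>\\d+)\\s+(?<Status>\\w+)\\s+(?<SourceIP>\\d{1,3}(?:\\.\\d{1,3}){3})");
        Matcher matcher = pattern.matcher("104 permitted 10.0.0.1");

        if (matcher.find()) {
            // Each named group becomes a column with the group name.
            System.out.println(matcher.group("Sequence")); // 104
            System.out.println(matcher.group("Status"));   // permitted
            System.out.println(matcher.group("SourceIP")); // 10.0.0.1
        }
    }
}
```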
Back to Adding a Parsing Rule in an Advanced Data Model
CEF Parser
Field | Description |
---|---|
Expression | Based on the ArcSight Extension Dictionary, the CEF header columns are extracted and the remaining data is formatted as key-value pairs. For example, Sep 19 08:26:10 host CEF:0\|Security\|threatmanager\|1.0\|100\|worm successfully stopped\|10\| src=10.0.0.1 dst=2.1.2.2 spt=1232 extracts these columns and their values: $cefVersion=0, $cefDeviceVendor=Security, $cefDeviceProduct=threatmanager, $cefDeviceVersion=1.0, $cefSignatureID=100, $cefName=worm successfully stopped, $cefSeverity=10, $sourceAddress=10.0.0.1, $destinationAddress=2.1.2.2, $sourcePort=1232. For an illustrative sketch of the header and extension splitting, see the example after this table. |
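The split between the pipe-delimited CEF header and the key-value extension can be approximated with a minimal Java sketch of the sample event above. It ignores CEF escaping rules and does not map extension keys such as src to dictionary names such as sourceAddress, which the product does through the ArcSight Extension Dictionary; the CefSketch class is hypothetical and is not the LogLogic LMI implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of CEF header/extension splitting for the sample event in
// the table above. Real CEF allows escaped pipes and spaces in values, which
// this sketch ignores. Illustrative only; not the LogLogic LMI implementation.
public class CefSketch {
    public static void main(String[] args) {
        String event = "CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|"
                + "src=10.0.0.1 dst=2.1.2.2 spt=1232";

        // Header: seven pipe-delimited fields after the "CEF:" prefix,
        // followed by the extension as the eighth element.
        String[] parts = event.substring("CEF:".length()).split("\\|", 8);
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("cefVersion", parts[0]);
        columns.put("cefDeviceVendor", parts[1]);
        columns.put("cefDeviceProduct", parts[2]);
        columns.put("cefDeviceVersion", parts[3]);
        columns.put("cefSignatureID", parts[4]);
        columns.put("cefName", parts[5]);
        columns.put("cefSeverity", parts[6]);

        // Extension: space-separated key=value pairs (raw keys kept here;
        // LMI maps them to dictionary column names such as sourceAddress).
        for (String pair : parts[7].split(" ")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                columns.put(kv[0], kv[1]);
            }
        }
        System.out.println(columns);
        // {cefVersion=0, cefDeviceVendor=Security, ..., src=10.0.0.1, dst=2.1.2.2, spt=1232}
    }
}
```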