Downstream Parsing

With downstream parsing, you can chain a parsing rule of an advanced data model to another advanced or GP parser-based data model, which further processes the columns extracted by the main parsing rule. Applying two levels of parsing in sequence makes it possible to parse log messages that mix formats. Consider the following example.

Example

In Windows Snare logs, the tab character is escaped with a backslash (\), so each tab appears as the two-character sequence \t. Consider the following sample log message received in LogLogic LMI:

<13>Apr 15 00:54:19 10.199.187.140 MSWinEventLog\t0\tSecurity\t406243509\tSat Dec 25 17:16:57 2004\t4699\tMicrosoft-Windows-Security-Auditing\tSYSTEM\tUser\tSuccess Audit\tX78UNT2AJIC1Y\tObject Access\t\tA scheduled task was deleted.    Subject:    Security ID:    S-1-5-21-2798475463-3993569027-3406240830-49243    Account Name:    QPAGjyT74D    Account Domain:    BJ    Logon ID:    0x6b441b23    Task Information:    Task Name:     \dEmG_Jclbu7ZtMGKUmLe3vArEqF_\erCw\TACZZvZ4mbn\M36kGL-2mYD-oI\cefE4AJTOA0rGj1V7G0LLcTmen_    Task Content:     <Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task"> <RegistrationInfo> <Date>2017-04-21T17:54:26.989698</Date> <Author>BJQPAGjyT74D</Author> </RegistrationInfo> <Triggers/> <Principals> <Principal id="Author"> <RunLevel>LeastPrivilege</RunLevel> <UserId>BJQPAGjyT74D</UserId> <LogonType>S4U</LogonType> </Principal> </Principals> <Settings> <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy> <DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries> <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries> <AllowHardTerminate>true</AllowHardTerminate> <StartWhenAvailable>false</StartWhenAvailable> <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable> <IdleSettings> <StopOnIdleEnd>true</StopOnIdleEnd> <RestartOnIdle>false</RestartOnIdle> </IdleSettings> <AllowStartOnDemand>true</AllowStartOnDemand> <Enabled>true</Enabled> <Hidden>false</Hidden> <RunOnlyIfIdle>false</RunOnlyIfIdle> <WakeToRun>false</WakeToRun> <ExecutionTimeLimit>P3D</ExecutionTimeLimit> <Priority>7</Priority> </Settings> <Actions Context="Author"> <Exec> <Command>CWindows\System32\svchost.exe</Command> </Exec> </Actions></Task>\t4054120554

Log messages that contain this two-character tab sequence are not parsed in LogLogic LMI. To parse such logs accurately, create a new data model, create a parsing rule with Regex as the parser type, and enter the following pattern in the Regex pattern field:

^(.*)

Next, select Microsoft_Windows as the downstream data model and provide an expression that replaces each two-character \t sequence (a backslash followed by t) with a single tab character.

The main parsing rule captures the entire log message in the $1 variable. The $1 variable is passed as the parameter of the function that performs the character substitution, and the function's output becomes the input of the downstream data model.
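
The flow described above can be sketched outside LogLogic LMI in a few lines of Python. This is only an illustration of the idea, not the LMI expression syntax: the function name normalize_snare_tabs and the shortened sample message are assumptions made for the example.

```python
import re


def normalize_snare_tabs(raw_message: str) -> str:
    """Mimic the main rule plus substitution step (hypothetical helper).

    1. The regex ^(.*) captures the whole message, like $1 in the main rule.
    2. Every literal backslash + 't' pair is replaced with a real tab.
    """
    match = re.match(r"^(.*)", raw_message)
    captured = match.group(1)  # equivalent of the $1 variable
    return captured.replace("\\t", "\t")


# Shortened version of the sample message; the raw string keeps "\t" as
# two characters, exactly as Snare sends it.
sample = r"<13>Apr 15 00:54:19 10.199.187.140 MSWinEventLog\t0\tSecurity\t406243509"

normalized = normalize_snare_tabs(sample)

# A tab-delimited downstream parser can now split the message into columns.
fields = normalized.split("\t")
print(fields)
```

A plain literal replacement like this one also rewrites any genuine backslash followed by a lowercase t inside a field value (for example, in a file path), so the substitution should be checked against representative messages.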

Downstream Parser Matrix

The following matrix shows which combinations of parsers work in downstream parsing (indicated by OK) and which combinations have limitations (indicated by Limited). Each row represents a main parser type, and each column in that row represents a downstream parser type.

For an overview of each parser, see Types of Parsers in Advanced Data Models.

Main \ Downstream       Syslog  KVP  JSON     XML      Columnar  Regex  CEF      GP
Syslog                  N/A     OK   OK       OK       OK        OK     OK       OK
Key-value parser (KVP)  OK      N/A  Limited  Limited  Limited   OK     Limited  OK
JSON                    OK      OK   N/A      OK       OK        OK     OK       OK
XML                     OK      OK   OK       N/A      OK        OK     OK       OK
Columnar                OK      OK   OK       OK       OK        OK     OK       OK
Regex parser            OK      OK   OK       OK       OK        OK     OK       OK
CEF                     OK      OK   OK       OK       OK        OK     N/A      OK
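
For scripted checks, the matrix above can be held in a small lookup table. The following Python sketch is one possible encoding; the names DOWNSTREAM_TYPES, MATRIX_ROWS, and downstream_support are hypothetical and are not part of LogLogic LMI.

```python
# Column order of the matrix (downstream parser types).
DOWNSTREAM_TYPES = ["Syslog", "KVP", "JSON", "XML", "Columnar", "Regex", "CEF", "GP"]

# One entry per main parser type, values in the column order above.
MATRIX_ROWS = {
    "Syslog":   ["N/A", "OK", "OK", "OK", "OK", "OK", "OK", "OK"],
    "KVP":      ["OK", "N/A", "Limited", "Limited", "Limited", "OK", "Limited", "OK"],
    "JSON":     ["OK", "OK", "N/A", "OK", "OK", "OK", "OK", "OK"],
    "XML":      ["OK", "OK", "OK", "N/A", "OK", "OK", "OK", "OK"],
    "Columnar": ["OK", "OK", "OK", "OK", "OK", "OK", "OK", "OK"],
    "Regex":    ["OK", "OK", "OK", "OK", "OK", "OK", "OK", "OK"],
    "CEF":      ["OK", "OK", "OK", "OK", "OK", "OK", "N/A", "OK"],
}


def downstream_support(main_type: str, downstream_type: str) -> str:
    """Return 'OK', 'Limited', or 'N/A' for a main/downstream pair."""
    column = DOWNSTREAM_TYPES.index(downstream_type)
    return MATRIX_ROWS[main_type][column]


print(downstream_support("KVP", "CEF"))    # Limited
print(downstream_support("Regex", "GP"))   # OK
```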

Limitations of Downstream Parsing

  • You can configure downstream parsing for only one parsing rule, and that rule can chain to only one advanced data model. That parsing rule must be the only enabled rule; all other parsing rules must be disabled.
  • If the main parser type is KVP, the following limitations apply when you configure JSON, XML, columnar, or CEF as the downstream parser type:
    • When using JSON, XML, or columnar as the downstream parser type: the key separator must not be a comma (,) and the JSON statement must be enclosed in single quotes (').
    • When using CEF as the downstream parser type: the key separator must not be a vertical bar (|) and the JSON statement must be enclosed in single quotes (').
  • When using GP as the downstream parser type, take particular care to provide an accurate and precise expression. The result of that expression must be a string that meets the requirements of the downstream parser.
    Tip: You can copy the exact expression from the GP parser-based data model and then paste it in the Expression field.