Processes Search API - Raw Command Line and Content Fields


Overview

The Carbon Black Cloud Process Search API tokenizes certain fields as command line fields. For these fields, many special characters (such as path separators, parentheses, and brackets) are stripped out during indexing to keep searches as simple as possible. This lets a user find a given word or series of words anywhere in the command line without specifying wildcards, which meets most needs. However, a user sometimes needs to find an exact string of characters on the command line, including special characters. With the existing command line tokenization, that can be difficult or even impossible.

We have introduced raw versions of these fields to accommodate this need. The tokenized fields and their equivalent raw fields are listed below. In a raw field, the entire command line or content is indexed as a single token, so searching it requires wildcards or regular expressions. These fields are intended for advanced use cases, and queries that use them may not perform as well as other queries.

Tokenized Field                Raw Field
process_cmdline                process_cmdline_raw
parent_cmdline                 parent_cmdline_raw
childproc_cmdline              childproc_cmdline_raw
fileless_scriptload_cmdline    fileless_scriptload_cmdline_raw
scriptload_content             scriptload_content_raw

Example

C:\Windows\System32\test.exe \\this;that

The existing tokenized fields allow you to search for words or combinations of words, but not specific character sequences.

In this example command line, backslashes and semicolons would be converted to whitespace for the purpose of tokenization.

process_cmdline:\\\\this;that

Special characters become indistinguishable from whitespace, so this query is effectively a search for the phrase “this that”.

Note: Backslashes must be escaped in the query, which is why they are doubled.
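The effect of tokenization on the example command line can be approximated locally. The sketch below is only an illustration: it treats just the two characters named above (backslash and semicolon) as separators, whereas the real indexer strips more characters, and the function name approximate_tokens is hypothetical.

```python
import re

# Assumption: only backslash and semicolon are separators in this sketch.
# The service's tokenizer handles more special characters than these two.
SEPARATORS = re.compile(r"[\\;]+")


def approximate_tokens(text: str) -> list[str]:
    """Replace separator characters with whitespace, then split into words."""
    return SEPARATORS.sub(" ", text).split()


cmdline = r"C:\Windows\System32\test.exe \\this;that"
print(approximate_tokens(cmdline))
# ['C:', 'Windows', 'System32', 'test.exe', 'this', 'that']

# The escaped query text collapses to the same two words, so the exact
# characters between "this" and "that" no longer matter.
print(approximate_tokens(r"\\\\this;that"))
# ['this', 'that']
```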

With the new “raw” versions of these fields, you can search for exact character sequences. Because the whole command line is a single token, you need a wildcard before and after the text you are searching for unless that text sits at the beginning or end of the command line.

process_cmdline_raw:/.*\\\\\\\\this;that.*/
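The query above can be assembled mechanically. The following is a minimal sketch, not an official helper: the function names are hypothetical, the escaping model (one literal backslash becomes four, once for the regular expression and once for the query syntax) is inferred from the examples in this document, and Python's re module is used only as a local stand-in for the service's regular expression engine. Other regular expression metacharacters are passed through unchanged and may need escaping of their own.

```python
import re


def escape_for_raw_regex(literal: str) -> str:
    """Lowercase the text and turn each literal backslash into four."""
    return literal.lower().replace("\\", "\\" * 4)


def raw_contains_query(field: str, literal: str) -> str:
    """Wrap the escaped text in leading and trailing wildcards."""
    return f"{field}:/.*{escape_for_raw_regex(literal)}.*/"


def matches_whole_value(query: str, cmdline: str) -> bool:
    """Local approximation: undo one escaping layer, then require a full match."""
    pattern = query.split(":/", 1)[1][:-1]   # text between the two slashes
    pattern = pattern.replace("\\\\", "\\")  # query layer -> regex layer
    return re.fullmatch(pattern, cmdline.lower()) is not None


cmdline = r"C:\Windows\System32\test.exe \\this;that"
query = raw_contains_query("process_cmdline_raw", r"\\this;that")
print(query)
# process_cmdline_raw:/.*\\\\\\\\this;that.*/

print(matches_whole_value(query, cmdline))
# True

# Without the wildcards, the pattern would have to equal the whole command line.
no_wildcards = "process_cmdline_raw:/" + escape_for_raw_regex(r"\\this;that") + "/"
print(matches_whole_value(no_wildcards, cmdline))
# False
```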

Tips:

  • For regular expressions, specify all characters in lowercase, and double escape backslashes (and possibly other characters that are special in both regular expressions and Lucene).
  • To maximize query performance, include as many leading characters as possible before the first wildcard. For example, if the command line is known to always run something in the “C:\Windows” directory, you could use this query: process_cmdline_raw:/c:\\\\windows.*\\\\\\\\this;that.*/
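The second tip above can be sketched as a small query builder that anchors the pattern on a known prefix instead of a leading wildcard. This is an illustration under assumptions: the function names are hypothetical, and the escaping rule (lowercase everything, four backslashes per literal backslash) follows the double-escaping described in the first tip.

```python
def escape_for_raw_regex(literal: str) -> str:
    """Lowercase the text and turn each literal backslash into four."""
    return literal.lower().replace("\\", "\\" * 4)


def raw_prefixed_query(field: str, known_prefix: str, fragment: str) -> str:
    """Start with the known leading characters, then a wildcard, then the fragment."""
    prefix = escape_for_raw_regex(known_prefix)
    body = escape_for_raw_regex(fragment)
    return f"{field}:/{prefix}.*{body}.*/"


print(raw_prefixed_query("process_cmdline_raw", r"C:\Windows", r"\\this;that"))
# process_cmdline_raw:/c:\\\\windows.*\\\\\\\\this;that.*/
```

The longer the known prefix, the fewer leading characters the wildcard has to cover.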

Last modified on November 27, 2023