Data Formatter Usage Guidelines
Overview
In the playbook editor, the data formatter task serves three main purposes:
✅ Extracting useful information from large pieces of data.
✅ Constructing data from extracted and/or predefined data and outputting it in another format.
✅ Performing simple data calculations and transformations.
However, the data formatter is not intended to:
❌ Act as middleware for transferring large amounts of data, for example by simply outputting the event data.
❌ Handle complicated logic that requires multiple macros (functions) or nested loops; implement such logic with Python command tasks.
Instead, follow these guidelines when using the data formatter:
Reduce large data transfers between playbook tasks. For example, instead of downloading a large file and processing it in separate Python playbook tasks, merge these functions into one task.
Limit the use of excessive loops, nested loops, or other complicated logic in a Jinja template task. Although the data formatter has been designed with extensive features for text formatting and JSON object creation, it is not a substitute for Python command tasks. Since Jinja is built on Python, it incurs additional overhead and will not match Python's performance.
Prerequisites
Before diving into the use cases and examples, make sure you have a basic understanding of the syntax explained in the Data Formatter documentation. The Data Formatter documentation can be accessed by clicking Quick Start within any data formatter task.
General Use Cases
Constructing Simple Text
A common use of the data formatter task is to construct simple text from a given data source. For instance:
INPUT
Build a string: {{PlaybookData | jsonpath('$.DataSource.playbookName')}}
OUTPUT
Build a string: Data Formatter
Constructing a JSON Object
When constructing a JSON object using data formatter, it's important to handle string escape issues properly. Here are examples of bad and better practices:
Bad Practice: INPUT
CODE
OUTPUT
JSON
This approach is flawed because it can lead to string escape issues. When the string contains characters that need to be escaped, such as a double quote in this case, the result is an invalid JSON object because the unescaped string is enclosed directly within double quotes.
Better Practice: INPUT
CODE
OUTPUT
JSON
The recommended approach places the double curly braces ({{ }}) outside the JSON object and concatenates the string using '+'. This lets the data formatter construct the JSON object itself: it automatically escapes characters when processing the string, resulting in a valid JSON object.
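The same escaping problem can be reproduced outside the data formatter. As a minimal Python sketch (the variable names and the sample string are illustrative assumptions, not taken from the product), compare wrapping a raw string in quotes by hand with letting a JSON serializer build the object:

```python
import json

# Hypothetical value containing double quotes that must be escaped in JSON.
description = 'User said "hello"'

# Naive approach: wrap the raw string in double quotes by hand.
naive = '{"description": "' + description + '"}'

# Safer approach: build the object first and let the serializer escape it.
safe = json.dumps({"description": description})


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(naive))  # False: the inner quotes end the string early
print(is_valid_json(safe))   # True: the inner quotes were escaped
```

The second form mirrors the better practice: the tool that produces the JSON is also the one responsible for escaping it.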
Constructing an If/Else Statement to Output Data
Acceptable Practice: INPUT
CODE
OUTPUT
CODE
This approach is acceptable, but it explicitly checks if the variable is an empty string, which can be verbose and less efficient.
Better Practice: INPUT
CODE
OUTPUT
CODE
The recommended approach uses However, if you specifically want to check for a non-empty string and do not want any other truthy values to be considered (e.g., you want to exclude
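The text contrasts a check that accepts any truthy value with a check for a non-empty string. Standard Jinja evaluates conditions using Python truthiness, so the difference can be sketched in plain Python. The helper names and sample values below are hypothetical, and the sketch assumes the data formatter follows standard Jinja behavior here:

```python
def is_truthy(value):
    """Truthiness test, as a bare `if value` would perform it."""
    return bool(value)


def is_non_empty_string(value):
    """Stricter test: only a string with at least one character passes."""
    return isinstance(value, str) and value != ""


# Illustrative inputs covering strings, numbers, lists, and None.
samples = ["", "text", 0, 1, [], ["x"], None]

for sample in samples:
    print(repr(sample), is_truthy(sample), is_non_empty_string(sample))
```

Values such as 1 or ["x"] pass the truthiness test but not the non-empty-string test, which is the case where the more explicit comparison is still worth writing.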
Additional Examples of Basic Comparisons and Built-in Filters:
Basic Comparisons:
{{"a"=="b"}}
{{1 <= 2}}
{{1 >= 2 }}
{{0 != 0}}
{{['a','b'] == ['b','a']}}
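For reference, these comparison operators behave in standard Jinja as they do in Python. The sketch below evaluates the same five expressions in Python, assuming the data formatter does not override the standard operators:

```python
# Python equivalents of the five template comparisons above.
comparisons = {
    '"a" == "b"': "a" == "b",
    "1 <= 2": 1 <= 2,
    "1 >= 2": 1 >= 2,
    "0 != 0": 0 != 0,
    "['a','b'] == ['b','a']": ["a", "b"] == ["b", "a"],
}

for expression, result in comparisons.items():
    print(f"{expression} -> {result}")
```

Only `1 <= 2` evaluates to True. In particular, list equality is order-sensitive, so two lists with the same elements in a different order are not equal.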
Built-in Filters:
{{ ['Alice','Bob'] | contains_element('Alice') }}
{{ {'alice': 100, 'bob': 99} | contains_key('alice') }}
{{ {'alice': 100, 'bob': 99} | contains_key_value_pair( {'bob':99} ) }}
{{ {'alice': 100, 'bob': 99} | contains_value(100) }}
{{ 'Hello World' | end_with('Hello') }}
{{ 'Hello World' | equals('Hello') }}
{{ 20 | greater_or_equal(10) }}
{{ 20 | greater_than(10) }}
{{ false | is_false }}
{{ true | is_true }}
{{ ['Hello', 'World'] | json_equal(['World', 'Hello']) }}
{% set input1={'key1': 1, 'key2': 2} %}
{% set input2={'key2': 2, 'key1': 1} %}
{{ input1 | json_equal(input2) }}
{{ 'Hello World' | length_equals_to(11) }}
{{ globalList | jsonpath("$[?(@.Email== 'gdevenish0@msn.com')]") | length | greater_or_equal(1)}}
{{(jsonArray|selectattr('id', '==', 3)|first).email}}
{{ 20 | less_or_equal(10) }}
{{ 20 | less_than(10) }}
{{ ['Alice','Bob'] | not_contain_array(['Alice']) }}
{{ ['Alice','Bob'] | not_contain_element('Alice') }}
{{ {'alice': 100, 'bob': 99} | not_contain_key('alice') }}
{{ {'alice': 100, 'bob': 99} | not_contain_key_value_pair( {'bob':99} ) }}
{{ {'alice': 100, 'bob': 99} | not_contain_value(100) }}
{{ 'Hello World' | not_end_with('Hello') }}
{{ 'Hello World' | not_equal('Hello') }}
{{ 'Hello World' | not_start_with('Hello') }}
{{'Bob has 10 tasks in his 3 task lists.' | regex_match('[0-9]+')}}
{{ 'Hello World' | start_with('Hello') }}
{{ 'Hello World' | string_contains('Hello') }}
{{ 'Hello World' | string_not_contain('Hello') }}
{{ ['Alice','Bob'] | text_array_contains_text('Alice') }}
{{ '2000-01-01 11:00:00' | time_after('now') }}
{{ '2000-01-01 11:00:00' | time_after_or_same_as('now') }}
{{ '2000-01-01 11:00:00' | time_before('now') }}
{{ '2000-01-01 11:00:00' | time_before_or_same_as('now') }}
Constructing JSON Object/Array Using Given Data
Bad Practice: INPUT
CODE
OUTPUT
CODE
This approach is not recommended due to the excessive nesting of loops, which can lead to inefficient code and performance issues.
Better Practice: For nested loops greater than two, consider using Python to handle such complex logic and calculations. Python has a variety of well-supported libraries and is better equipped to execute complex tasks.
All the above examples use no more than two loops, which is a standard usage of loops to iterate through arrays and JSON objects to assemble HTML. However, if the data set is too large, it is recommended to use a Python command instead.
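As an illustration of moving deeply nested iteration into Python, here is a minimal sketch. The function name and the sample incident are hypothetical, and how a Python command task receives its input is not covered here:

```python
def collect_ip_addresses(incident):
    """Flatten the IP addresses of every endpoint of every artifact."""
    addresses = []
    for artifact in incident.get("Artifacts", []):
        for endpoint in artifact.get("ExternalEndpoints", []):
            # Endpoints without an IPAddresses key contribute nothing.
            addresses.extend(endpoint.get("IPAddresses", []))
    return addresses


# Hypothetical input shaped as Artifacts -> ExternalEndpoints -> IPAddresses.
sample_incident = {
    "Artifacts": [
        {"ExternalEndpoints": [{"IPAddresses": ["10.0.0.1", "10.0.0.2"]}]},
        {"ExternalEndpoints": [{"IPAddresses": ["192.0.2.7"]}, {}]},
    ]
}

print(collect_ip_addresses(sample_incident))
# ['10.0.0.1', '10.0.0.2', '192.0.2.7']
```

Missing keys are handled with `dict.get` defaults, which tends to be harder to express cleanly in a template with three levels of loops.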
Using Namespace
The main purpose of using namespaces is to carry a value from within a loop structure to the outer context.
In this example, we define the namespace object in the outer scope, and inside the loop, we alter the array of the namespace object. Ultimately, we print out the array, demonstrating how the value can be transferred from inside the for loop. Initial values (in this scenario, final=[]) can also be supplied as a dictionary or a single value. Using a namespace provides an added layer of adaptability to the data formatting code when working with loops.
INPUT
{% set result = namespace(final=[]) %}
{% set jsonArray= ["John", "Jack", "Jay"] %}
{% for object in jsonArray if jsonArray %}
{# change object format as needed#}
{% set object = [{"Name" : object}]%}
{% set result.final = object | to_list | plus(result.final) %}
{% endfor %}
{{result.final}}
OUTPUT
[
{
"Name": "Jay"
},
{
"Name": "Jack"
},
{
"Name": "John"
}
]
Using {%-%} Statements
When working with data formatting code, the {% %} statement is frequently used to define or set variables. However, because it is a template-language construct, a {% %} statement placed on its own line still leaves a line break (a blank line) in the output when the surrounding text is printed directly. To suppress this extra line, add a dash (-) after the opening delimiter, yielding {%-%}. The dash prevents the blank line from being generated in the output.
Without Dash
INPUT
This is line 1
{% set not_use_variable=1 %}
This is line 2
{% set not_use_variable=2 %}
This is line 3
OUTPUT
This is line 1

This is line 2

This is line 3
With Dash
INPUT
This is line 1
{%- set not_use_variable=1 %}
This is line 2
{%- set not_use_variable=2 %}
This is line 3
OUTPUT
This is line 1
This is line 2
This is line 3
Advanced Use Cases with JSONPath Filter
JSONPath filtering offers advanced capabilities to extract specific parts of JSON data effortlessly using its built-in filters, eliminating the need for tedious loops to iterate through objects or arrays. In D3 SOAR, this feature is implemented using the jsonpath-ng library, a Python library compliant with JSONPath standards.
For comprehensive documentation and references, please visit https://pypi.org/project/jsonpath-ng/.
Reader Note
While the native dot-notation is commonly used in JSONPath expressions, it has its limitations when used within D3's data formatter.
For example, if you wish to use the wildcard selection feature (*), you should format your query like this:
{{PlaybookData | jsonpath('$.DataSource.incident.Artifacts[*].ExternalEndpoints[*].IPAddresses')}}
This is more effective than the native notation:
{{PlaybookData.DataSource.incident.Artifacts[*].ExternalEndpoints[*].IPAddresses}}
The latter does not support wildcard selections.
Searching within a JSON Object
To determine if a value exists within a JSON object and return the entire object, you can use the == operator. For example:
INPUT
{% set country=[{
"Canada": "Vancouver",
"US": "Seattle",
"Japan":"Tokyo"
},{
"Canada": "Toronto",
"US": "New York",
},{
"Canada": "Toronto",
"UK": "London"
} ] %}
{{ country | jsonpath("$[?(@.Canada=='Vancouver' & @.US=='Seattle')]")}}
OUTPUT
{
"Canada": "Vancouver",
"US": "Seattle",
"Japan": "Tokyo"
}
For substring checks, use the =~ operator. For example:
INPUT
{% set country=[{
"Canada": "Vancouver",
"US": "Seattle",
"Japan":"Tokyo"
},{
"Canada": "Toronto",
"US": "New York",
},{
"Canada": "Toronto",
"UK": "London"
} ] %}
{{ country | jsonpath("$[?(@.Canada=~'Van' & @.US=='Seattle')]")}}
OUTPUT
{
"Canada": "Vancouver",
"US": "Seattle",
"Japan": "Tokyo"
}
Reader Note
Other comparison operators (>, <, <=, >=) can also be used where applicable.
Searching and Retrieving Data
Determine if a value exists within a JSON object without writing loops to iterate through the entire object, and then retrieve the specific data.
INPUT
{% set country=[{
"Canada": "Vancouver",
"US": "Seattle",
"Japan":"Tokyo"
},{
"Canada": "Toronto",
"US": "New York"
},{
"Canada": "Toronto",
"UK": "London"
} ] %}
{{ country | jsonpath("$[?(@.Canada=='Vancouver' & @.US=='Seattle')].Japan")}}
OUTPUT
Tokyo
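To make explicit what the filter expression replaces, the following plain-Python sketch performs the same search and retrieval with a hand-written loop (a list comprehension). It does not use jsonpath-ng, and the variable names are illustrative:

```python
# Same sample data as the template above.
countries = [
    {"Canada": "Vancouver", "US": "Seattle", "Japan": "Tokyo"},
    {"Canada": "Toronto", "US": "New York"},
    {"Canada": "Toronto", "UK": "London"},
]

# Keep only objects where both conditions hold (the "&" in the filter).
matches = [
    country
    for country in countries
    if country.get("Canada") == "Vancouver" and country.get("US") == "Seattle"
]

# Retrieve the Japan field from each match (the trailing ".Japan").
japan_values = [country["Japan"] for country in matches if "Japan" in country]

print(japan_values)  # ['Tokyo']
```

Python returns the matches as a list here, whereas the data formatter output above shows the single matching value directly.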
Sorting JSON Arrays
Use the forward slash (/) to sort the array in ascending order and the backslash (\) for descending order.
Example 1: Ascending Order
INPUT
{% set animals = [{ "cat": 1, "cow": 2 }, { "cat": 2, "cow": 1 }, { "cat": 3, "cow": 3 }] %}
{{ animals| jsonpath('$[/cat]')}}
OUTPUT
[ { "cat": 1, "cow": 2 }, { "cat": 2, "cow": 1 }, { "cat": 3, "cow": 3 } ]
Example 2: Descending Order
INPUT
{% set animals = [{ "cat": 1, "cow": 2 }, { "cat": 2, "cow": 1 }, { "cat": 3, "cow": 3 }] %}
{{ animals| jsonpath('$[\cat]')}}
OUTPUT
[ { "cat": 3, "cow": 3 }, { "cat": 2, "cow": 1 }, { "cat": 1, "cow": 2 } ]
Example 3: Descending Order and Retrieve the First Object Field
INPUT
{% set animals = [{ "cat": 1, "cow": 2 }, { "cat": 2, "cow": 1 }, { "cat": 3, "cow": 3 }] %}
{{ animals| jsonpath('$[\cat][0].cow')}}
OUTPUT
3
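For comparison, the three sorting examples correspond to the following plain-Python sketch using `sorted`. It illustrates the intended results only and is not how the data formatter implements sorting:

```python
# Same sample data as the sorting examples.
animals = [
    {"cat": 1, "cow": 2},
    {"cat": 2, "cow": 1},
    {"cat": 3, "cow": 3},
]

# Example 1: ascending by "cat" (the forward slash form).
ascending = sorted(animals, key=lambda animal: animal["cat"])

# Example 2: descending by "cat" (the backslash form).
descending = sorted(animals, key=lambda animal: animal["cat"], reverse=True)

# Example 3: the "cow" field of the first object after the descending sort.
first_cow = descending[0]["cow"]

print([animal["cat"] for animal in ascending])   # [1, 2, 3]
print([animal["cat"] for animal in descending])  # [3, 2, 1]
print(first_cow)                                 # 3
```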
Retrieving Multiple Key Values
A single selection can be used to retrieve multiple key values of a JSON object. For example:
INPUT
{% set country=[{
"Canada": "Vancouver",
"US": "Seattle",
"Japan":"Tokyo"
},{
"Canada": "Toronto",
"US": "New York"
},{
"Canada": "Toronto",
"UK": "London"
} ] %}
{{ country|jsonpath('$[*].["Canada","UK"]') }}
OUTPUT
[
"Vancouver",
"Toronto",
"Toronto",
"London"
]
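The result above skips keys that an object does not have. A plain-Python sketch of the same selection, with illustrative variable names and no jsonpath-ng dependency, makes that behavior visible:

```python
# Same sample data as the template above.
countries = [
    {"Canada": "Vancouver", "US": "Seattle", "Japan": "Tokyo"},
    {"Canada": "Toronto", "US": "New York"},
    {"Canada": "Toronto", "UK": "London"},
]

wanted_keys = ("Canada", "UK")

# For each object in order, collect the value of every wanted key it contains.
values = [
    country[key]
    for country in countries
    for key in wanted_keys
    if key in country
]

print(values)  # ['Vancouver', 'Toronto', 'Toronto', 'London']
```

Only the third object has a "UK" key, so "London" appears once, after that object's "Canada" value.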
A single selection and appended key values in a string can be used to retrieve multiple key values of a JSON object. Note that each key should be present in every object for this query to work. For example:
INPUT
{% set country=[{
"Canada": "Vancouver",
"US": "Seattle",
"Japan":"Tokyo"
},{
"Canada": "Toronto",
"US": "New York",
"Japan":"Tokyo"
},{
"Canada": "Toronto",
"UK": "London",
"Japan":"Tokyo"
} ] %}
{{ country|jsonpath('$[*].Canada + ", "+ $[*].Japan') }}
OUTPUT
[
"Vancouver, Tokyo",
"Toronto, Tokyo",
"Toronto, Tokyo"
]
A single selection can be used to retrieve multiple layers of a JSON object.
INPUT
{% set country={
"Canada": {
"Vancouver": "D3 Security"
},
"US": "Seattle",
"Japan": "Tokyo"
} %}
{{ country | jsonpath('$.US|(Canada.Vancouver)')}}
OUTPUT
[
"Seattle",
"D3 Security"
]