Data Formatter Usage Guidelines

Overview

In the playbook editor, the data formatter task serves three main purposes:

✅ Extracting useful information from large pieces of data.

✅ Constructing data using extracted and/or predefined data and outputting it in another format.

✅ Performing simple data calculations and transformations.

However, it's important to note that the data formatter is not intended to:

❌ Act as middleware for transferring large amounts of data. For example, simply passing the event data through with {{PlaybookData | jsonpath('$.DataSource.incident.Events')}} is not a recommended use case.

❌ Handle complicated logic that requires multiple macros (functions) or nested loops. Such logic should be implemented in Python command tasks.

Follow these guidelines when using the data formatter:

Reduce large data transfers between playbook tasks. Rather than splitting the download and processing of a large file across separate Python playbook tasks, merge these functions into one task.
Avoid excessive loops, nested loops, or other complicated logic in a Jinja template task. Although the data formatter has extensive features for text formatting and JSON object creation, it is not a substitute for Python command tasks. Because Jinja templates are executed on top of Python, they incur additional overhead and will not match the performance of native Python code.
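
As a rough illustration of the first guideline, a Python command task can trim a large payload down to the fields that later tasks actually need before returning it. The sketch below is only an example under stated assumptions: the function name summarize_events and the event field names (id, severity, raw) are hypothetical and are not part of D3's API.

```python
def summarize_events(events):
    """Keep only the small fields that downstream tasks need."""
    summary = []
    for event in events:
        summary.append({
            "id": event.get("id"),
            "severity": event.get("severity"),
        })
    return summary


# Hypothetical sample payload; real event data would be far larger.
events = [
    {"id": 1, "severity": "High", "raw": "x" * 10_000},
    {"id": 2, "severity": "Low", "raw": "y" * 10_000},
]
trimmed = summarize_events(events)
print(trimmed)
```

Returning the trimmed list instead of the full events keeps the data passed between tasks small.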

Prerequisites

Before diving into the use cases and examples, make sure you have a basic understanding of the syntax explained in the Data Formatter documentation. The Data Formatter documentation can be accessed by clicking Quick Start within any data formatter task.

General Use Cases

Constructing Simple Text

A common use of the data formatter task is to construct simple text from a given data source. For instance:

INPUT

CODE

Build a string: {{PlaybookData | jsonpath('$.DataSource.playbookName')}}

OUTPUT

CODE

Build a string: Data Formatter

Constructing a JSON Object

When constructing a JSON object using data formatter, it's important to handle string escape issues properly. Here are examples of bad and better practices:

Bad Practice:

INPUT

CODE

{% set var = {"title": 'This title contains "'} %}

{
  "New Playbook Title": "New {{ var | jsonpath('$.title')}}"
}

OUTPUT

JSON

{
  "New Playbook Title": "New This title contains "" 
}

This approach is flawed as it can lead to string escape issues. When the string contains characters that need to be escaped, such as a double quote in this case, it results in an invalid JSON object because the unescaped string is directly enclosed within double quotes.

Better Practice:

INPUT

CODE

{% set var = {"title": 'This title contains "'} %}
{{
{
  "New Playbook Title": "New " + var | jsonpath('$.title')
}
}}

OUTPUT

JSON

{
  "New Playbook Title": "New This title contains \""
}

The recommended approach places the double curly braces ({{ }}) around the entire JSON object and concatenates the string with the + operator. The data formatter then constructs the JSON object itself and automatically escapes special characters in the string, resulting in a valid JSON object.
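
The same pitfall can be reproduced outside the data formatter. The following Python sketch uses the standard json module and is offered only as an analogy; it does not describe how the data formatter is implemented internally.

```python
import json

title = 'This title contains "'

# Naive interpolation: the quote inside the value ends the string early.
naive = '{"New Playbook Title": "New ' + title + '"}'
try:
    json.loads(naive)
    naive_is_valid = True
except json.JSONDecodeError:
    naive_is_valid = False

# Build the object first, then serialize: the quote is escaped automatically.
safe = json.dumps({"New Playbook Title": "New " + title})

print(naive_is_valid)
print(safe)
```

In both environments the fix is the same idea: let the serializer produce the quoting rather than typing the quotes around an unescaped value.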

Constructing an If/Else Statement to Output Data

Acceptable Practice:

INPUT

CODE

{% set empty_string = "" %}

{% if empty_string == ""%}
Empty String
{% else %}
Not Empty!
{% endif %}

OUTPUT

CODE

Empty String

This approach is acceptable, but it explicitly checks if the variable is an empty string, which can be verbose and less efficient.

Better Practice:

INPUT

CODE

{% set empty_string = "" %}

{% if empty_string %}
Not Empty!
{% else %}
Empty String
{% endif %}

OUTPUT

CODE

Empty String

The recommended approach uses {% if variable %} to check if a variable is not empty. This is a more concise and efficient way of checking if a variable has a value. It covers not only empty strings but also other falsy values such as NULL, empty arrays, empty objects, or the number 0. This approach is preferred when you want to check for any truthy value (not just non-empty strings).

However, if you specifically need to detect an empty string, and do not want other falsy values (e.g., 0, an empty array, or an empty object) to be treated the same way, then the explicit comparison {% if variable == "" %} is more appropriate.
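
The difference between the two checks can be seen in plain Python, whose truthiness rules Jinja conditions follow. This is a minimal sketch, assuming the data formatter uses standard Jinja behavior; the sample values are illustrative only.

```python
# Values a template variable might hold.
samples = ["", "text", None, 0, 1, [], [1], {}, True]

for value in samples:
    is_truthy = bool(value)           # what {% if variable %} tests
    is_empty_string = (value == "")   # what {% if variable == "" %} tests
    print(repr(value), is_truthy, is_empty_string)

# Several values are falsy, but only "" satisfies the explicit comparison.
falsy = [v for v in samples if not v]
empty_strings = [v for v in samples if v == ""]
```

Here falsy contains five values while empty_strings contains only the empty string, which is why the choice between the two checks depends on which values should count as empty.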

Additional Examples of Basic Comparisons and Built-in Filters:

Basic Comparisons:

CODE

{{"a"=="b"}}
{{1 <= 2}}
{{1 >= 2 }}
{{0 != 0}}
{{['a','b'] == ['b','a']}}

Built-in Filters:

CODE

{{ ['Alice','Bob'] | contains_element('Alice') }}
{{ {'alice': 100, 'bob': 99} | contains_key('alice') }}
{{ {'alice': 100, 'bob': 99} | contains_key_value_pair( {'bob':99} ) }}
{{ {'alice': 100, 'bob': 99} | contains_value(100) }}
{{ 'Hello World' | end_with('Hello') }}
{{ 'Hello World' | equals('Hello') }}
{{ 20 | greater_or_equal(10) }}
{{ 20 | greater_than(10) }}
{{ false | is_false }}
{{ true | is_true }}
{{ ['Hello', 'World'] | json_equal(['World', 'Hello']) }}
{% set input1={'key1': 1, 'key2': 2} %}
{% set input2={'key2': 2, 'key1': 1} %}
{{ input1 | json_equal(input2) }}
{{ 'Hello World' | length_equals_to(11) }}
{{ globalList | jsonpath("$[?(@.Email== 'gdevenish0@msn.com')]") | length | greater_or_equal(1)}}
{{(jsonArray|selectattr('id', '==', 3)|first).email}}
{{ 20 | less_or_equal(10) }}
{{ 20 | less_than(10) }}
{{ ['Alice','Bob'] | not_contain_array(['Alice']) }}
{{ ['Alice','Bob'] | not_contain_element('Alice') }}
{{ {'alice': 100, 'bob': 99} | not_contain_key('alice') }}
{{ {'alice': 100, 'bob': 99} | not_contain_key_value_pair( {'bob':99} ) }}
{{ {'alice': 100, 'bob': 99} | not_contain_value(100) }}
{{ 'Hello World' | not_end_with('Hello') }}
{{ 'Hello World' | not_equal('Hello') }}
{{ 'Hello World' | not_start_with('Hello') }}
{{'Bob has 10 tasks in his 3 task lists.' | regex_match('[0-9]+')}}
{{ 'Hello World' | start_with('Hello') }}
{{ 'Hello World' | string_contains('Hello') }}
{{ 'Hello World' | string_not_contain('Hello') }}
{{ ['Alice','Bob'] | text_array_contains_text('Alice') }}
{{ '2000-01-01 11:00:00' | time_after('now') }}
{{ '2000-01-01 11:00:00' | time_after_or_same_as('now') }}
{{ '2000-01-01 11:00:00' | time_before('now') }}
{{ '2000-01-01 11:00:00' | time_before_or_same_as('now') }}

Constructing JSON Object/Array Using Given Data

Bad Practice:

INPUT

CODE

{% set my_list= [1,2,3] %}
{%- for item in my_list %}
    {%- for i in range(my_list|length) %}
        {%- for j in range(my_list|length) %}
            {%- for k in range(my_list|length) %}
                {% set combination = [my_list[i], my_list[j], my_list[k]] %}
                {{ combination }}
            {%- endfor %}
        {%- endfor %}
    {%- endfor %}
{%- endfor %}

OUTPUT

CODE

[1, 1, 1]
[1, 1, 2]
[1, 1, 3]
[1, 2, 1]
[1, 2, 2]
[1, 2, 3]
[1, 3, 1]
[1, 3, 2]
[1, 3, 3]
[2, 1, 1] ...

This approach is not recommended due to the excessive nesting of loops, which can lead to inefficient code and performance issues.

Better Practice:

For loops nested more than two levels deep, consider using a Python command task to handle the logic and calculations. Python has a variety of well-supported libraries and is better equipped to execute complex tasks.
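
For instance, the ordered triples generated by the three index loops in the bad-practice template can be produced in Python with the standard library's itertools.product. This is a sketch of one possible approach, not a prescribed implementation; how the result is returned from a Python command task is outside its scope.

```python
from itertools import product

my_list = [1, 2, 3]

# All ordered triples drawn from my_list, with repetition allowed.
combinations = [list(triple) for triple in product(my_list, repeat=3)]

print(len(combinations))
print(combinations[:3])
```

A single call replaces the nested loops, and the nesting depth is controlled by the repeat argument rather than by additional loop blocks.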

Example 1: Constructing a New JSON Object Array Using a Given Array
INPUT

CODE

{% set artifacts=[]%}
{% set array=[1,2,3]%}
{% for item in array|to_list if array%}
	{% set object = {
  	"Identity": item
	}%}
	{% set _temp = artifacts.append(object) %}
{% endfor %}
{{artifacts}}

OUTPUT

CODE

[
  {
    "Identity": 1
  },
  {
    "Identity": 2
  },
  {
    "Identity": 3
  }
]

Example 2: Iterating Through a JSON Object
INPUT

CODE

{% set my_dict1 ={"City1":"Vancouver", "City2":"Burnaby","City3":"Richmond"} %}
{# Keys must be unique #}
<table>
    {% for key, value in my_dict1.items() %}
   	 <tr>
   	 <td>{{ key }}</td>
   	 <td>{{ value }}</td>
   	 </tr>
    {% endfor %}
</table>

OUTPUT

HTML

<table>
    <tr>
        <td>City1</td>
        <td>Vancouver</td>
    </tr>
    
    <tr>
        <td>City2</td>
        <td>Burnaby</td>
    </tr>
    
    <tr>
        <td>City3</td>
        <td>Richmond</td>
    </tr>
</table>

Example 3: Iterating Through a JSON Object Array
INPUT

CODE

{% set my_dict =[{"City":"Vancouver"}, {"City":"Burnaby"} , {"City":"Richmond"}] %}    
<table>
{% for i in my_dict %}
    {% for key, value in i.items() %}
   	 <tr>
   	 <td>{{ key }}</td>
   	 <td>{{ value }}</td>
   	 </tr>
    {% endfor %}
{% endfor %}
</table>

OUTPUT

HTML

<table>
    <tr>
        <td>City</td>
        <td>Vancouver</td>
    </tr>
    
    <tr>
        <td>City</td>
        <td>Burnaby</td>
    </tr>
    
    <tr>
        <td>City</td>
        <td>Richmond</td>
    </tr>
</table>

All of the examples above use no more than two levels of loops, which is the standard way to iterate through arrays and JSON objects when assembling HTML. However, if the data set is very large, a Python command task is recommended instead.
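
For larger data sets, the same table can be assembled in Python. The sketch below is a minimal example; the function name build_table is hypothetical, and the way a Python command task receives its input and returns its output is not shown.

```python
from html import escape


def build_table(rows):
    """Render a list of flat dictionaries as an HTML table."""
    lines = ["<table>"]
    for row in rows:
        for key, value in row.items():
            # Escape values so characters such as < or & cannot break the markup.
            lines.append(
                f"  <tr><td>{escape(str(key))}</td><td>{escape(str(value))}</td></tr>"
            )
    lines.append("</table>")
    return "\n".join(lines)


cities = [{"City": "Vancouver"}, {"City": "Burnaby"}, {"City": "Richmond"}]
table_html = build_table(cities)
print(table_html)
```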

Using Namespace

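The main purpose of a namespace is to carry a value from inside a loop to the outer context. In standard Jinja, a variable assigned with {% set %} inside a for loop is scoped to that loop and is not visible after the loop ends; attributes of a namespace object are not subject to this restriction.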

In this example, we define the namespace object in the outer scope, and inside the loop, we alter the array of the namespace object. Ultimately, we print out the array, demonstrating how the value can be transferred from inside the for loop. Initial values (in this scenario, final=[]) can also be supplied as a dictionary or a single value. Using namespace provides an added layer of adaptability to the data formatting code when working with loops.

INPUT

CODE

{% set result = namespace(final=[]) %}
{% set jsonArray= ["John", "Jack", "Jay"] %}

{% for object in jsonArray if jsonArray %}
    {# change object format as needed#}
    {% set object = [{"Name" : object}]%}
    {% set result.final =  object | to_list | plus(result.final) %}
{% endfor %}

{{result.final}}

OUTPUT

JSON

[
  {
    "Name": "Jay"
  },
  {
    "Name": "Jack"
  },
  {
    "Name": "John"
  }
]
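
The output lists the names in the reverse of the input order, which is consistent with each new element being placed in front of the accumulated list. The following Python sketch mirrors that accumulation, assuming the plus filter behaves like list concatenation (an assumption; consult the Data Formatter documentation for the filter's exact semantics).

```python
json_array = ["John", "Jack", "Jay"]

final = []
for name in json_array:
    item = [{"Name": name}]
    # New element first, accumulated list second: each item is prepended.
    final = item + final

print(final)
```

Swapping the operands (final + item) would keep the original order instead.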

Using `{%-%}` Statements

When working with data formatting code, the {% %} statement is frequently used to define or set variables. The statement itself produces no text, but the line break and any spaces around it in the template are still written to the output, which leaves a blank line. To prevent this, add a dash (-) right after the opening delimiter, yielding {%- %}. The dash strips the whitespace, including the line break, that precedes the statement.

Without Dash

INPUT

CODE

This is line 1
{% set not_use_variable=1 %}
This is line 2
{% set not_use_variable=2 %}
This is line 3

OUTPUT

CODE

This is line 1

This is line 2

This is line 3

With Dash

INPUT

CODE

This is line 1
{%- set not_use_variable=1 %}
This is line 2
{%- set not_use_variable=2 %}
This is line 3

OUTPUT

CODE

This is line 1
This is line 2
This is line 3

Advanced Use Cases with JSONPath Filter

The JSONPath filter extracts specific parts of JSON data with a single expression, eliminating the need for loops to iterate through objects or arrays. In D3 SOAR, this feature is implemented using the jsonpath-ng library, a Python library compliant with JSONPath standards.

For comprehensive documentation and references, please visit https://pypi.org/project/jsonpath-ng/.

Reader Note

While the native dot-notation is commonly used in JSONPath expressions, it has its limitations when used within D3's data formatter.
For example, if you wish to use the wildcard selection feature (*), you should format your query like this:

CODE

{{PlaybookData | jsonpath('$.DataSource.incident.Artifacts[*].ExternalEndpoints[*].IPAddresses')}}

This is more effective than the native notation:

CODE

{{PlaybookData.DataSource.incident.Artifacts[*].ExternalEndpoints[*].IPAddresses}}

The latter does not support wildcard selections.
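
To show what the wildcard query saves you from writing, here is the equivalent traversal in plain Python. The sample data is hypothetical and only mirrors the shape of the path in the query above; real incident data will contain other fields.

```python
# Hypothetical sample shaped like the path in the query above.
playbook_data = {
    "DataSource": {"incident": {"Artifacts": [
        {"ExternalEndpoints": [{"IPAddresses": "10.0.0.1"},
                               {"IPAddresses": "10.0.0.2"}]},
        {"ExternalEndpoints": [{"IPAddresses": "10.0.0.3"}]},
    ]}}
}

# Each [*] in the JSONPath corresponds to one level of iteration.
ip_addresses = [
    endpoint["IPAddresses"]
    for artifact in playbook_data["DataSource"]["incident"]["Artifacts"]
    for endpoint in artifact["ExternalEndpoints"]
]
print(ip_addresses)
```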

Searching within a JSON Object

To find the objects in an array whose fields match given values, and return each matching object in full, use the == operator inside a filter expression. Multiple conditions can be combined with &. For example:

INPUT

CODE

{% set country=[{
  "Canada": "Vancouver",
  "US": "Seattle",
  "Japan":"Tokyo"
},{
  "Canada": "Toronto",
  "US": "New York"
},{
  "Canada": "Toronto",
  "UK": "London"
} ] %}

{{ country | jsonpath("$[?(@.Canada=='Vancouver' & @.US=='Seattle')]")}}

OUTPUT

CODE

{
  "Canada": "Vancouver",
  "US": "Seattle",
  "Japan": "Tokyo"
}

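For partial matches, use the =~ operator. In jsonpath-ng, the right-hand side is treated as a regular expression that is searched for within the field value, so a plain substring such as 'Van' matches 'Vancouver'. For example: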

INPUT

CODE

{% set country=[{
  "Canada": "Vancouver",
  "US": "Seattle",
  "Japan":"Tokyo"
},{
  "Canada": "Toronto",
  "US": "New York"
},{
  "Canada": "Toronto",
  "UK": "London"
} ] %}

{{ country | jsonpath("$[?(@.Canada=~'Van' & @.US=='Seattle')]")}}

OUTPUT

CODE

{
  "Canada": "Vancouver",
  "US": "Seattle",
  "Japan": "Tokyo"
}

Reader Note

Other comparison operators (>, <, <=, >=) can also be used where applicable.
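
For readers more familiar with Python, the two filter expressions above correspond roughly to the list comprehensions below. This is a sketch for comparison only, assuming =~ behaves as a pattern search; it is not how the jsonpath filter is implemented.

```python
import re

country = [
    {"Canada": "Vancouver", "US": "Seattle", "Japan": "Tokyo"},
    {"Canada": "Toronto", "US": "New York"},
    {"Canada": "Toronto", "UK": "London"},
]

# Counterpart of @.Canada=='Vancouver' & @.US=='Seattle'
exact = [c for c in country
         if c.get("Canada") == "Vancouver" and c.get("US") == "Seattle"]

# Counterpart of @.Canada=~'Van' & @.US=='Seattle', treating 'Van' as a pattern
partial = [c for c in country
           if re.search("Van", c.get("Canada", "")) and c.get("US") == "Seattle"]

print(exact)
print(partial)
```

Both comprehensions select only the first object, matching the outputs shown above.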

Searching and Retrieving Data

To check whether a matching object exists and retrieve a specific field from it, without writing loops to iterate through the entire object, append the field name to the filter expression.

INPUT

CODE

{% set country=[{
  "Canada": "Vancouver",
  "US": "Seattle",
  "Japan":"Tokyo"
},{
  "Canada": "Toronto",
  "US": "New York"
},{
  "Canada": "Toronto",
  "UK": "London"
} ] %}

{{ country | jsonpath("$[?(@.Canada=='Vancouver' & @.US=='Seattle')].Japan")}}

OUTPUT

CODE

Tokyo

Sorting JSON Arrays

Use the forward slash (/) to sort the array in ascending order and the backslash (\) to sort it in descending order.

Example 1: Ascending Order
INPUT

CODE

{% set animals = 
[{
  "cat": 1,
  "cow": 2
},
    {
  "cat": 2,
  "cow": 1
},
    {
  "cat": 3,
  "cow": 3
}] %}
                               		 
{{ animals| jsonpath('$[/cat]')}}

OUTPUT

CODE

[
  {
    "cat": 1,
    "cow": 2
  },
  {
    "cat": 2,
    "cow": 1
  },
  {
    "cat": 3,
    "cow": 3
  }
]

Example 2: Descending Order
INPUT

CODE

{% set animals = 
[{
  "cat": 1,
  "cow": 2
},
    {
  "cat": 2,
  "cow": 1
},
    {
  "cat": 3,
  "cow": 3
}] %}
                               		 
{{ animals| jsonpath('$[\cat]')}}

OUTPUT

CODE

[
  {
    "cat": 3,
    "cow": 3
  },
  {
    "cat": 2,
    "cow": 1
  },
  {
    "cat": 1,
    "cow": 2
  }
]
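
As a point of comparison, the two sorts above behave like Python's built-in sorted with a key function. This sketch is an analogy only and makes no claim about how the jsonpath filter performs the sort.

```python
animals = [
    {"cat": 1, "cow": 2},
    {"cat": 2, "cow": 1},
    {"cat": 3, "cow": 3},
]

# Counterpart of $[/cat]: ascending by the "cat" field.
ascending = sorted(animals, key=lambda a: a["cat"])
# Counterpart of $[\cat]: descending by the "cat" field.
descending = sorted(animals, key=lambda a: a["cat"], reverse=True)

print(ascending)
print(descending)
```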

Example 3: Descending Order, Retrieving a Field from the First Object
INPUT

CODE

{% set animals = 
[{
  "cat": 1,
  "cow": 2
},
    {
  "cat": 2,
  "cow": 1
},
    {
  "cat": 3,
  "cow": 3
}] %}
                               		 
{{ animals| jsonpath('$[\cat][0].cow')}}

OUTPUT

CODE

3

Retrieving Multiple Key Values

A single selection can be used to retrieve the values of multiple keys from a JSON object. For example:

INPUT

CODE

{% set country=[{
  "Canada": "Vancouver",
  "US": "Seattle",
  "Japan":"Tokyo"
},{
  "Canada": "Toronto",
  "US": "New York"
},{
  "Canada": "Toronto",
  "UK": "London"
} ] %}


{{ country|jsonpath('$[*].["Canada","UK"]') }}

OUTPUT

JSON

[
  "Vancouver",
  "Toronto",
  "Toronto",
  "London"
]

A single selection can also concatenate the values of multiple keys into one string per object. Note that each key must be present in every object for this query to work. For example:

INPUT

CODE

{% set country=[{
  "Canada": "Vancouver",
  "US": "Seattle",
  "Japan":"Tokyo"
},{
  "Canada": "Toronto",
  "US": "New York",
  "Japan":"Tokyo"
},{
  "Canada": "Toronto",
  "UK": "London",
  "Japan":"Tokyo"
} ] %}


{{ country|jsonpath('$[*].Canada + ", "+ $[*].Japan') }}

OUTPUT

JSON

[
  "Vancouver, Tokyo",
  "Toronto, Tokyo",
  "Toronto, Tokyo"
]
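
The requirement that every object contain each key has a direct parallel in Python, sketched below with the same data. In Python a missing key raises a KeyError; how the data formatter reports the equivalent case is not covered here.

```python
country = [
    {"Canada": "Vancouver", "US": "Seattle", "Japan": "Tokyo"},
    {"Canada": "Toronto", "US": "New York", "Japan": "Tokyo"},
    {"Canada": "Toronto", "UK": "London", "Japan": "Tokyo"},
]

# Direct indexing needs both keys in every object.
pairs = [c["Canada"] + ", " + c["Japan"] for c in country]
print(pairs)

# Dropping "Japan" from one object makes the same expression fail.
incomplete = country[:2] + [{"Canada": "Toronto", "UK": "London"}]
try:
    [c["Canada"] + ", " + c["Japan"] for c in incomplete]
    failed = False
except KeyError:
    failed = True
print(failed)
```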

A single selection can also retrieve values from different levels of a JSON object. For example:

INPUT

CODE

{% set country={
  "Canada": {
    "Vancouver": "D3 Security"
  },
  "US": "Seattle",
  "Japan": "Tokyo"
} %}
                       		 
{{ country | jsonpath('$.US|(Canada.Vancouver)')}}

OUTPUT

JSON

[
  "Seattle",
  "D3 Security"
]
