Mutex Lock and Unlock Tasks
LAST UPDATED: OCT 04, 2024
Introduction
This manual aims to help you understand and use Lock and Unlock Tasks in D3 SOAR Playbooks. Here, we explain the concept of mutex locks and their role in controlling task execution flow, demonstrate how to implement Lock and Unlock Tasks, and highlight the areas for monitoring and troubleshooting.
By following this guide, you will learn how to configure Lock and Unlock playbook tasks, avoid deadlock situations, and gain familiarity with the foundational methods of addressing playbook issues using the Investigation Dashboard and Event Playbook Viewer.
Understanding Lock and Unlock Tasks
Concept of Mutex Locks
Mutual-exclusion locks, or mutex locks, are tools used to control access to shared resources. Imagine a mutex lock as a key to a door that leads to a shared room. Only one person can have the key and enter the room at a time, preventing chaos and conflicts.
In D3 Playbooks, mutex locks are used to control the order of task execution. When a Lock Task (e.g., Lock A) in one execution stream runs, it "locks the door," preventing other execution streams from proceeding if they face the same lock. Streams that are parallel to Lock A and either acquire a different lock or have no Lock Task will execute asynchronously. Tasks in blocked streams will not execute until all tasks in every pathway downstream of Lock A execute to completion, or until an Unlock Task downstream of Lock A "unlocks the door."
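As a rough analogy outside of D3, the blocking behaviour described above can be sketched with Python's standard threading.Lock. This is only an illustration of the general mutex idea, not D3's implementation; the function name run_streams and the stream names are hypothetical.

```python
import threading
import time


def run_streams(stream_names):
    """Run one thread per stream; each needs the shared lock to do its work."""
    lock_a = threading.Lock()  # stands in for the mutex acquired by "Lock A"
    log = []                   # records when each stream starts and finishes

    def stream(name):
        with lock_a:           # Lock Task: waits here while another stream holds the lock
            log.append((name, "start"))
            time.sleep(0.01)   # placeholder for the tasks downstream of the Lock Task
            log.append((name, "end"))
        # leaving the "with" block releases the lock, like a stream running to completion

    threads = [threading.Thread(target=stream, args=(n,)) for n in stream_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for entry in run_streams(["stream-1", "stream-2"]):
        print(entry)
```

Whichever stream reaches the lock first runs its "start" and "end" entries back to back; the other stream's entries only appear afterwards.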
READER NOTE
Two execution streams face the same mutex lock if they meet these two conditions:
Their Lock Tasks have the same lock name
Their Lock Tasks have the same lock Scope
In the scenario where Lock Task streams share the same lock, execution is carried out on a first-come first-served basis. If two or more of these Lock Tasks branch out from the same preceding node, the first Lock Task to execute is chosen randomly, since execution can reach any of them first.
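The two conditions in this note can be pictured as a lookup keyed by both lock name and scope. The sketch below is a hypothetical model written for illustration only (get_mutex and the registry are not part of D3): two requests return the same lock object only when both values match.

```python
import threading

_registry = {}                      # hypothetical store: (name, scope) -> lock
_registry_guard = threading.Lock()  # protects the dictionary itself


def get_mutex(name, scope):
    """Return the lock for this name and scope, creating it on first use."""
    key = (name, scope)
    with _registry_guard:
        if key not in _registry:
            _registry[key] = threading.Lock()
        return _registry[key]


if __name__ == "__main__":
    a = get_mutex("Lock A", "All Instances of Current Playbook")
    b = get_mutex("Lock A", "All Instances of Current Playbook")
    c = get_mutex("Lock A", "Current Instance of Current Playbook")
    print(a is b)  # same name and same scope: same lock
    print(a is c)  # same name, different scope: different lock
```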
Take care when using multiple Lock Tasks within the same execution stream. A deadlock arises when a Lock Task downstream of another attempts to acquire the same mutex lock. In this situation, the second Lock Task will wait indefinitely for the first to complete, which is impossible because the tasks following the second Lock Task are part of the execution path of the first Lock Task. Refer to the Troubleshooting and Monitoring section for the tools in the D3 platform that help you view, monitor, discover, and resolve such issues.
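The same kind of self-deadlock can be reproduced in miniature with a non-reentrant Python lock. In this sketch, which is an analogy rather than D3 behaviour, the second acquire is given a short wait limit purely so the demonstration returns instead of hanging; that wait limit is a Python detail and is unrelated to the Timeout input parameter of the Lock Task. The function name nested_lock_attempt is hypothetical.

```python
import threading


def nested_lock_attempt(wait_seconds=0.1):
    """Try to take the same lock twice in one stream; report whether the second try succeeds."""
    lock_a = threading.Lock()  # non-reentrant, like a mutex already held upstream
    lock_a.acquire()           # first (upstream) Lock Task takes the lock
    try:
        # Second (downstream) Lock Task asks for the same lock.
        # Without the wait limit, this call would block forever.
        got_it = lock_a.acquire(timeout=wait_seconds)
        if got_it:
            lock_a.release()
        return got_it
    finally:
        lock_a.release()       # only reached here because the demo gave up waiting


if __name__ == "__main__":
    print(nested_lock_attempt())
```

The second attempt can never succeed, because the only code that would release the lock sits after the point where the stream is waiting.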
Using Lock and Unlock Tasks in Practice
Scopes
Current Instance of Current Playbook
This option means that only the current event or incident being used to run this current Event Playbook or Incident Playbook will be considered when the system searches for the specified lock name.
All Instances of Current Playbook
This option means that all ingested events or incidents visible to this current Event Playbook or Incident Playbook will be considered when the system searches for the specified lock name.
All Instances of All Playbooks
This option means that any ingested events or incidents used in the current playbook and all other Event Playbooks or Incident Playbooks in D3 SOAR will be considered when the system searches for the specified lock name.
READER NOTE
Independent of the scopes used, lock and unlock tasks will always operate within the site where the playbook is located.
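One way to reason about the three scopes is to think of each as deciding how much context becomes part of the lock's identity. The sketch below is a hypothetical mental model, not D3's internal logic; the function lock_key and its parameters (site, playbook, instance) are assumptions introduced for illustration.

```python
from enum import Enum


class Scope(Enum):
    CURRENT_INSTANCE = "Current Instance of Current Playbook"
    ALL_INSTANCES_CURRENT_PLAYBOOK = "All Instances of Current Playbook"
    ALL_INSTANCES_ALL_PLAYBOOKS = "All Instances of All Playbooks"


def lock_key(name, scope, site, playbook, instance):
    """Build an identity for a lock; equal keys mean the streams face the same lock."""
    # The site is always part of the key: locks never reach across sites.
    if scope is Scope.CURRENT_INSTANCE:
        return (site, scope, name, playbook, instance)
    if scope is Scope.ALL_INSTANCES_CURRENT_PLAYBOOK:
        return (site, scope, name, playbook)
    return (site, scope, name)  # All Instances of All Playbooks


if __name__ == "__main__":
    s = Scope.ALL_INSTANCES_CURRENT_PLAYBOOK
    # Two different events run through the same playbook share the lock in this scope.
    print(lock_key("Lock A", s, "Site 1", "Playbook X", "event-1")
          == lock_key("Lock A", s, "Site 1", "Playbook X", "event-2"))
```

Under this model, the narrowest scope only blocks streams within a single event or incident run, while the widest scope blocks any stream in the same site that uses the same lock name and scope.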
Lock Acquisition
Identical Mutex Locks
Below are steps to use Lock Tasks and acquire identical mutex locks:
1. Build an Event/Incident Playbook that includes paths with an upstream Lock Task.
2. Click on your Lock Task node to render its corresponding task details popup.
3. Enter a lock name within the Name textarea of the Input Parameters section.
4. Select lock scope within the Scope dropdown menu of the Input Parameters section.
5. Click on the check mark button at the top right corner of the task details popup.
6. Repeat steps 2-5 for your other Lock Tasks, ensuring that they have identical lock names and lock scopes.
If even one task of the executing Lock Task stream is unfinished during a playbook run, your other Lock Task stream(s) will be blocked.
Different Mutex Locks
Acquiring separate mutex locks follows a process similar to that outlined for Identical Mutex Locks, with the distinction being that at least one of the lock names or lock scopes differs.
Unblocking Execution Streams
Exhaustive Task Execution
When the final task of a Lock Task stream completes, another Lock Task stream sharing the same lock will be unblocked.
Using the Unlock Task
Below are the steps to unblock Lock Task streams using the Unlock Task:
Build a playbook that has an Unlock Task downstream of a Lock Task.
Click on the Unlock Task node to render its corresponding task details popup.
Enter the lock name of the lock you intend to unlock in the Name textarea of the Input Parameters section.
Select the lock scope of the lock you intend to unlock within the Scope dropdown menu of the Input Parameters section.
READER NOTE
Both the Name and Scope fields must match those of the mutex lock acquired by the Lock Task of the executing stream.
Click on the check mark button at the top right corner of the task details popup.
Now, the second Lock Task stream can execute as soon as the Unlock Task completes, even if unfinished tasks remain in the first Lock Task stream.
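The effect of an early unlock can be illustrated with plain Python threads. This is an analogy only: the function run_with_unlock and the log messages are hypothetical, and the events exist just to make the ordering repeatable. The first stream releases the lock partway through, and the second stream runs its locked work before the first stream finishes its remaining tasks.

```python
import threading


def run_with_unlock():
    """Show a second stream proceeding after an early release by the first stream."""
    lock_a = threading.Lock()
    first_has_lock = threading.Event()  # set once the first stream holds the lock
    second_done = threading.Event()     # set once the second stream has run
    log = []

    def first_stream():
        lock_a.acquire()                # Lock Task
        first_has_lock.set()
        log.append("first: locked work")
        lock_a.release()                # Unlock Task: frees the lock early
        second_done.wait(timeout=5)     # remaining tasks are still pending here
        log.append("first: remaining tasks")

    def second_stream():
        first_has_lock.wait(timeout=5)  # make sure the first stream got there first
        with lock_a:                    # blocked until the Unlock Task runs
            log.append("second: locked work")
        second_done.set()

    threads = [threading.Thread(target=first_stream),
               threading.Thread(target=second_stream)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    print(run_with_unlock())
```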
Timeout
A timeout is defined as the number of seconds to wait before a mutex lock is released. This value is set in the Timeout textarea within the Input Parameters section of the task details popup.
READER NOTE
The mutex lock will not be released if the timeout value is less than or equal to 0.
The Timeout input parameter cannot be used to resolve deadlocks. It is typically used when the second Lock Task, sharing the same mutex lock, operates in a parallel stream to the first Lock Task.
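One possible way to picture the timeout is as a lease on the lock that expires after the configured number of seconds. The sketch below is an interpretation under that assumption, not a description of how D3 implements the Timeout parameter; the class LeasedLock, its methods, and the injectable clock are all hypothetical.

```python
import time


class LeasedLock:
    """A lock whose hold expires after timeout_seconds; values <= 0 never expire."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock        # injectable so the behaviour can be checked without waiting
        self.acquired_at = None   # None means the lock is free

    def _expired(self, now):
        # Assumption taken from the note above: a timeout of 0 or less never releases.
        return self.timeout_seconds > 0 and now - self.acquired_at >= self.timeout_seconds

    def try_acquire(self):
        """Take the lock if it is free or its lease has run out; return True on success."""
        now = self.clock()
        if self.acquired_at is not None and not self._expired(now):
            return False
        self.acquired_at = now
        return True

    def release(self):
        self.acquired_at = None


if __name__ == "__main__":
    lock = LeasedLock(timeout_seconds=30)
    print(lock.try_acquire())  # first stream takes the lock
    print(lock.try_acquire())  # a parallel stream is refused while the lease is active
```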
Troubleshooting and Monitoring
Event Playbook Troubleshooting
Suppose two Lock Tasks share a mutex lock, with one being downstream of the other.
In this situation, the execution path will never complete and will result in a playbook error when an ingested event is processed by this Event Playbook. Since the upstream Lock Task has already acquired the shared mutex lock, the downstream Lock Task is blocked and cannot execute. If the downstream Lock Task does not execute, the mutex lock held by the upstream Lock Task will never be released. This causes the second Lock Task to wait indefinitely for the first to complete.
Errors within the Event Playbook can be monitored through two interfaces – the event list view within the Investigation Dashboard, and the Event Playbook Viewer.
Investigation Dashboard
To identify the point of error, follow these steps:
Click on the Investigation Dashboard navigation tab.
Click on the Playbook Errors accordion header.
Click on All Event Playbook Errors within the expanded accordion.
You can now filter a list of all Event Playbook errors. By clicking on an item in the list view, you will be redirected to the triggered flow of that specific playbook, allowing you to examine where and why the error occurred.
Event Playbook Viewer
For a detailed overview of the status and progress of event-triggered playbooks, follow these steps:
Click on the Event Playbook Viewer navigation tab.
Select your Event Playbook site within the dropdown menu.
Click on your Event Intake Summary Card on the left-hand side to render the corresponding Trigger Log.
Click on the event ingestion of interest within the Trigger Log to render the corresponding Event Playbook.
From here, you will be able to easily identify error points through task icons.
Incident Playbook Troubleshooting
Similar to the scenario in the Event Playbook Troubleshooting section, suppose again that we have two Lock Tasks that share a mutex lock with one downstream of the other.
Investigation Dashboard
To identify and address errors that occur across Incident Playbooks, follow these steps:
Click on the Investigation Dashboard navigation tab.
Click on the Playbook Errors accordion header.
Click on All Incident Playbook Errors within the expanded accordion.
Select the site to which the Incident Playbook belongs at the top right side.
Click on the incident list view item of interest to be redirected to the corresponding Incident Playbook.
From here, you can identify error points through task icons and understand why the error(s) occurred.