Prefer Atomic Test UUID over calculated hash for Ability ID #45

leba-atr · 2025-03-06T13:44:31Z

Description

Currently, the plugin calculates an MD5 hash over the JSON-serialized test definition of each atomic test and uses the hash value as the ability ID. While this is fine for static datasets, Atomic Tests receive updates every now and then. These updates cause the MD5 hash to change and any reference in Caldera to break (e.g. abilities referenced from the stockpile plugin; check the debug output of Caldera for current examples).
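A minimal sketch of the problem, assuming a heavily trimmed, made-up test definition (only `auto_generated_guid` and the hashing expression come from the plugin code discussed in this PR; the helper name `legacy_ability_id` and the sample values are illustrative):

```python
import hashlib
import json


def legacy_ability_id(test: dict) -> str:
    """Old scheme: MD5 over the JSON-serialized test definition."""
    return hashlib.md5(json.dumps(test).encode()).hexdigest()


# Hypothetical test definition, for illustration only.
test_v1 = {
    "name": "Example test",
    "auto_generated_guid": "11111111-2222-3333-4444-555555555555",
    "executor": {"name": "sh", "command": "whoami"},
}

# The same test after an upstream edit to its command.
test_v2 = {**test_v1, "executor": {"name": "sh", "command": "whoami; id"}}

# The hash-based ID changes with the content, the GUID does not.
hash_changed = legacy_ability_id(test_v1) != legacy_ability_id(test_v2)
guid_stable = test_v1["auto_generated_guid"] == test_v2["auto_generated_guid"]
```

Any edit to the test body yields a different hash, so a reference stored under the old hash no longer resolves, while the GUID keeps pointing at the same test.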

This PR introduces a breaking change in how this plugin generates ability IDs. Instead of calculating a hash over the test data, the plugin now prefers the auto-generated UUID that each test is assigned. This also affects how one references Atomic abilities in plugins like stockpile, as well as in custom Adversaries created manually via API calls.

Context: when creating adversaries via manual API calls, one cannot just use the Atomic test UUID to link Abilities with a given Adversary; instead, the custom hash value must be retrieved from the API beforehand. This makes it more tedious than I expected to reference Atomic tests in custom Adversaries.

Type of change

Breaking change (fix or feature that would cause existing functionality to not work as expected)

How Has This Been Tested?

caldera setup completes successfully
all warnings from Stockpile plugin regarding missing abilities in adversaries are gone
visual verification that especially the Stockpile adversaries contain only abilities which 'make sense' (see use atomic UUIDs instead of md5 hash stockpile#579)

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works

mkultraWasHere · 2025-03-11T16:42:55Z

We will need to marinate on this change. Not against it logically, but not sure of the fallout from the breaking change.

leba-atr · 2025-03-12T08:56:55Z

I thought about that for a minute yesterday evening and came up with an idea that would on the one hand not be breaking but on the other hand require refactoring in the main Caldera repository.

Roughly, the idea is as follows:

store both the UUID and the hash sum
when looking up abilities, the REST API call handler looks up the ID from the request as a UUID first; if nothing is found, the handler falls back to the hash sum
alternatively, the API call handler could use a regex to check whether the ID is a UUID (e.g. [a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}) or an MD5 hash (e.g. [a-fA-F0-9]{32}) and then look up the requested object by the corresponding property
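The regex variant could be sketched roughly as follows. This is not Caldera's handler code: `classify_ability_id`, `find_ability`, and the `uuid` / `md5` dictionary keys are hypothetical names chosen for illustration. Note that an MD5 hex digest is 32 hexadecimal characters long.

```python
import re
from typing import Optional

UUID_RE = re.compile(
    r"[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}"
)
MD5_RE = re.compile(r"[a-fA-F0-9]{32}")


def classify_ability_id(ability_id: str) -> str:
    """Return 'uuid', 'md5', or 'unknown' based on the shape of the ID."""
    if UUID_RE.fullmatch(ability_id):
        return "uuid"
    if MD5_RE.fullmatch(ability_id):
        return "md5"
    return "unknown"


def find_ability(abilities: list, ability_id: str) -> Optional[dict]:
    """Look up an ability by the property that matches the ID's shape.

    Assumes each stored ability is a dict carrying both a 'uuid' and
    an 'md5' key (hypothetical field names).
    """
    kind = classify_ability_id(ability_id)
    if kind == "unknown":
        return None
    for ability in abilities:
        if ability.get(kind) == ability_id:
            return ability
    return None
```

Using `fullmatch` rather than `search` avoids treating a longer string that merely contains 32 hex characters as an MD5 hash.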

Maybe that opens another way forward with this MR.

uruwhy · 2025-05-19T12:53:51Z

Here's an idea that wouldn't require refactoring:

Have a setting specific to the atomic plugin that enables/disables legacy IDs.
If enabled (default setting), the atomic plugin will generate 2 copies of each ability - one copy with the MD5 hash ID (legacy ID) and (legacy) appended to the ability name, and the 2nd copy using the atomic test UUID and original name.
This way, users can fully migrate on their own when they are ready by simply disabling the setting
Publicly released content moving forward can reference the atomic test UUIDs only, but anything referencing the old MD5-based ID won't break unless the legacy atomic setting is disabled

Thoughts?

uruwhy · 2025-05-30T16:02:30Z

Alright, I have an alternative idea based on @leba-atr's suggestion that will require some adjustments to core code and ability format, but will actually provide users with more flexibility down the line.

Proposal: add an alternative_ids field to the ability object structure. This will be a list of ID strings that represent alternative/backup IDs that users can specify for a single ability. So if users want to label abilities with their own ID schema in addition to the typical UUIDs, they can do so, and CALDERA will be able to search for and manage abilities using any of those IDs (users will just need to make sure all alternative IDs are globally unique within their CALDERA installation)

For the atomic plugin in particular, the plugin service would grab the atomic test ID and use that as the primary ability ID, and use the hash as an alternate ID. This way, we don't break any user's custom adversary profiles that rely on the hash-based IDs, and users can use the more stable atomic test ID going forward or switch at their own pace

Rather than looking up the primary ID first, CALDERA would simply look up whatever ID is referenced in the adversary profile. When abilities get imported, we'll have all of the associated IDs point to the same underlying ability object, but we'll still maintain the notion of a primary ID since that ID will be used for the yaml file name
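As a rough illustration of "all associated IDs point to the same underlying ability object", here is a minimal sketch. The `Ability` dataclass and `build_index` helper are hypothetical and are not Caldera's actual object model; only the `alternative_ids` field name comes from the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Ability:
    ability_id: str  # primary ID (would also name the yaml file)
    name: str
    alternative_ids: List[str] = field(default_factory=list)


def build_index(abilities: List[Ability]) -> Dict[str, Ability]:
    """Map every primary and alternative ID to its ability object.

    Raises ValueError if two different abilities claim the same ID,
    since all IDs must be globally unique within one installation.
    """
    index: Dict[str, Ability] = {}
    for ability in abilities:
        for key in (ability.ability_id, *ability.alternative_ids):
            if key in index and index[key] is not ability:
                raise ValueError(f"duplicate ability ID: {key}")
            index[key] = ability
    return index
```

With such an index, a lookup does not need to know whether an adversary profile references the primary ID or the legacy hash: both keys resolve to the same object.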

leba-atr · 2025-06-02T10:36:57Z

Personally, I'm in favor of the second approach, especially because it opens up the possibility of using custom identifiers in addition to the ones Caldera uses internally. It also allows postponing the breaking change for users until the next major release, where the deprecated approach can simply be deleted from the code without rewriting lots of functionality.

uruwhy · 2025-06-11T17:13:37Z

As much as I'd like to introduce a secondary ID(s) field, doing so would require quite significant updates to the core code in order to accommodate looking up and referencing abilities (and potentially other objects) with more than 1 ID. Similar amounts of effort would also be required to store and manage both atomic ability IDs as "official" ability object fields. However, I do have a middle approach in the works that looks like the following:

Incorporate your atomic plugin fix to prioritize the pre-generated atomic UUID and only use the MD5 hash if that UUID field isn't there for whatever reason. This will apply to any new installations of the atomic plugin (e.g. fresh install of caldera, first time enabling the atomic plugin, or recloning and re-processing the atomic repo to populate the abilities directory).
- Generated abilities will also have the corresponding MD5-hash-based legacy ID appended to the ability description so that users can easily look up what the correct UUID should be when transitioning any custom adversary profiles
Add a flag file (e.g. .processed_atomic) that indicates whether or not the "new" method of processing the atomic plugin has been performed. If this file does not exist in the data folder, then either the user has never used the atomic plugin, or only has legacy-style atomic abilities in an existing atomic plugin installation
When the atomic plugin runs, it will check if the abilities directory exists.
- If the directory does not exist, then we can safely assume that the user has no legacy abilities with the MD5 hash and can proceed to only generate atomic abilities with the real UUIDs.
- If the directory exists, but the processing flag file does not exist, then we can assume that the user has some legacy abilities with the MD5 hash. The atomic service will then generate atomic abilities with the real UUIDs but still keep the legacy abilities to avoid breaking any custom profiles that the user has that might depend on those MD5 hash IDs.
- If both the directory and processing flag file exist, then we don't need to reprocess anything
If the atomic plugin detects any MD5 hash legacy ID yaml files in the abilities directory, it will provide a warning that notifies the user that they have legacy atomic abilities and should retire them when possible. It will also append (legacy ID) or a similar suffix to each ability name, so that users easily know which copy of the ability is legacy or not when viewing them in the GUI
- During this transition period, users will have an artificially inflated number of abilities in their caldera installation since every atomic ability will essentially be duplicated
We will provide a bash script that users can run to move all the MD5-hash-based abilities to a backup directory in the atomic plugin, effectively "retiring" them
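The three-way check described above could look roughly like this. It is only a sketch under stated assumptions: the `Action` enum, `decide_action`, and `is_legacy_ability_file` are hypothetical names, the directory layout is passed in by the caller, and detecting legacy files by a 32-hex-character file stem assumes the ability ID is used as the yaml file name.

```python
import re
from enum import Enum
from pathlib import Path

FLAG_FILE_NAME = ".processed_atomic"  # example name from the proposal
MD5_STEM_RE = re.compile(r"[a-fA-F0-9]{32}")


class Action(Enum):
    GENERATE_UUID_ONLY = "generate_uuid_only"
    GENERATE_UUID_KEEP_LEGACY = "generate_uuid_keep_legacy"
    SKIP = "skip"


def decide_action(abilities_dir: Path, data_dir: Path) -> Action:
    """Pick a processing mode from the abilities directory and flag file."""
    if not abilities_dir.is_dir():
        # No abilities yet, so there can be no legacy MD5-based ones.
        return Action.GENERATE_UUID_ONLY
    if not (data_dir / FLAG_FILE_NAME).exists():
        # Existing abilities predate the new scheme: keep them, add UUID ones.
        return Action.GENERATE_UUID_KEEP_LEGACY
    # Directory and flag file both exist: nothing to reprocess.
    return Action.SKIP


def is_legacy_ability_file(path: Path) -> bool:
    """True if the yaml file name looks like an MD5-based legacy ID."""
    return (
        path.suffix in (".yml", ".yaml")
        and MD5_STEM_RE.fullmatch(path.stem) is not None
    )
```

A caller would run `decide_action` once at plugin start, write the flag file after a successful run, and use `is_legacy_ability_file` to decide whether to emit the legacy warning.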

Copilot

Pull Request Overview

This PR changes how ability IDs are generated for Atomic tests by preferring the auto-generated UUID over the calculated MD5 hash. This addresses issues where hash-based IDs break references when test definitions are updated.

Replaces MD5 hash-based ability ID generation with UUID-based approach for stability
Maintains backward compatibility by falling back to hash when UUID is unavailable
Introduces breaking changes for existing ability references

Copilot · 2025-10-06T22:53:44Z

app/atomic_svc.py

        Return True if an ability was saved.
        """
-        ability_id = hashlib.md5(json.dumps(test).encode()).hexdigest()
+        ability_id = test.get('auto_generated_guid') or hashlib.md5(json.dumps(test).encode()).hexdigest()


[nitpick] The fallback to MD5 hash maintains the old behavior but could lead to inconsistent ID types. Consider documenting the expected format of 'auto_generated_guid' and whether it should be validated before use.
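One possible way to address the validation point, sketched under the assumption that `auto_generated_guid` is expected to be a standard UUID string. The helper name `resolve_ability_id` is hypothetical and not part of this PR; `uuid.UUID` is the Python standard library parser.

```python
import hashlib
import json
import uuid


def resolve_ability_id(test: dict) -> str:
    """Prefer a well-formed auto_generated_guid, else fall back to MD5.

    Note: uuid.UUID also accepts 32 hex digits without hyphens, and
    str() normalizes the result to lowercase hyphenated form, so the
    returned ID may differ in spelling from the raw field value.
    """
    guid = test.get("auto_generated_guid")
    if isinstance(guid, str):
        try:
            return str(uuid.UUID(guid))
        except ValueError:
            pass  # malformed GUID: use the legacy hash instead
    return hashlib.md5(json.dumps(test).encode()).hexdigest()
```

Whether to normalize the GUID or use it verbatim is a design choice: normalizing gives a single canonical spelling, while using it verbatim keeps the ID identical to what appears in the upstream test file.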

deacon-mp · 2025-10-06T23:12:49Z

Can you address the above and resubmit for review?

prefer Atomic Test UUID over calculated hash for Ability ID

cce1e3d

leba-atr mentioned this pull request Mar 6, 2025

use atomic UUIDs instead of md5 hash mitre/stockpile#579

Open

6 tasks

endiz mentioned this pull request Mar 7, 2025

Defense Evasion GUID Bug 580 mitre/stockpile#581

Merged

5 tasks

mkultraWasHere requested review from clenk and mkultraWasHere March 11, 2025 16:41

mkultraWasHere self-assigned this Mar 11, 2025

mkultraWasHere added the good write up label Mar 11, 2025

deacon-mp requested review from Copilot and removed request for mkultraWasHere October 6, 2025 22:53

Copilot AI reviewed Oct 6, 2025


4 participants
