HDF Mapper and Converter Creation Guide (for SAF CLI & Heimdall2)
HDF Converters is a custom data normalization tool for transforming exported data from various security tool formats into the Heimdall Data Format (HDF). It is currently integrated into Heimdall2 and the SAF CLI, which collectively are part of the Security Automation Framework (SAF), a set of tools and processes which aim to standardize and ease security compliance and testing within an automated build pipeline.
Mappers are frameworks that allow the underlying conversion infrastructure to correlate objects or values in one overarching object or file with those in another. In the case of HDF Converters, these mappers allow for the conversion of various security service formats to HDF (*-to-HDF) and vice versa (HDF-to-*) using the tools provided by the existing conversion infrastructure.
The process for creating a mapper for HDF Converters is detailed below. To ensure that the created mapper produces an HDF file that is both accurate and detailed, it is important that you provide as much information as possible for prototyping and that you understand the full (or general) schema of your security export so that its information can be converted comprehensively.
The full generalized HDF schema is as follows:
{
platform: { //required field
name //required field
release //required field
target_id
}
version //required field
statistics: { //required field
duration
}
profiles: [ //required field
0: {
name //required field
version
sha256 //required field
title
maintainer
summary
license
copyright
copyright_email
supports //required field
attributes //required field
groups //required field
controls: [ //required field
0: {
id //required field
title
desc
descriptions
impact //required field
refs //required field
tags //required field
code
source_location //required field
results: [ //required field
0: {
status
code_desc //required field
message
run_time
start_time //required field
}
]
}
]
status
}
]
passthrough: {
auxiliary_data: [
0: {
name
data
}
]
raw
}
}
(Note: The documented schema is subject to change and not all required fields need to be populated; for the full schema and more information on the fields, refer to saf.mitre.org/#/normalize)
The HDF schema can be grouped into 3 sets of structures, with each structure being a subset of the previous structure. These groupings are: profiles, controls, and results.
The profiles structure contains metadata on the scan target of the original security service export and on the run performed by the security tool. This provides a high-level overview of the scan service run and target which are both digestible and easily accessible to the user. A generalized format is as follows:
profiles: [
0: {
name //Name of profile, usually the original security service tool; should be unique
version //Version of security service tool
sha256 //Hash of HDF file; NOTE: AUTOMATICALLY GENERATED BY HDF CONVERTERS, DO NOT POPULATE
title //Title of security service scan; should be human readable
maintainer //Maintainer
summary //Summary of security service export
license //Copyright license
copyright //Copyright holder
copyright_email //Copyright holder's email
supports //Supported platform targets
attributes //Inputs/attributes used in scan
groups //Set of descriptions for the control groups
controls //Controls substructure (see below)
status //Status of profile (typically 'loaded')
}
... //More items may exist if the security service produces multiple scan targets per export
]
Controls are security parameters used to prevent unauthorized access to sensitive information or infrastructure. In the case of HDF Converters, the controls structure is a collection of such controls tested for retroactively by an external security service to ensure that the target complies with vulnerability and weakness prevention standards. The controls structure is a subset of the profiles structure. A generalized format is as follows:
controls: [
0: {
id //ID of control; used for sorting, should be unique for each unique control
title //Title of control
desc //Description of the control
descriptions //Additional descriptions; usually 'check' and 'fix' text for control
impact //Security severity of control
refs //References to external control documentation
tags //Control tags; typically correlate to existing vulnerability/weakness database (e.g., NIST, CVE, CWE)
code //Control source code for code preservation
source_location //Location of control within source code
results //Results substructure (see below)
}
... //More items may exist if there are multiple controls reported per profile
]
The results structure contains information on the results of specific tests run by the security service on the scan target against a set of security controls. These results always correlate to a certain control and will report either 'passed' or 'failed' to indicate the test status (other statuses exist but are rare); cumulatively, they determine how compliant the scan target is with the indicated control set. The results structure is a subset of the controls structure. A generalized structure is as follows:
results: [
0: {
status //Pass/fail status of test (other statuses exist but are rare)
code_desc //Test expectations as defined by control
message //Demonstration of expected and actual result of test to justify test status
run_time //Overall runtime of test
start_time //Starting time of test
}
... //More items may exist if there are multiple results reported per control
]
These aforementioned structures cumulatively result in the following generalized structure which primarily defines the HDF:
//Data fields have been removed for the sake of demonstration
profiles: [
0: {
controls: [
0: {
results: [
0: {
},
...
]
},
...
]
},
...
]
There are additional structures in the HDF schema which are used for metadata/extraneous information storage. These exist alongside the profiles structure on the top level of the HDF schema. The general structure for the top level of the HDF schema is as follows:
{
platform: { //Information on the platform handling the HDF file; usually 'Heimdall Tools'
name //Platform name
release //Platform version
target_id //Platform target ID
}
version //Platform version
statistics: { //Statistics relating to target scan run
duration //Duration of run
}
profiles //Profiles structure
passthrough: { //Extraneous information storage
auxiliary_data: [ //Storage for unused data from the sample file
0: {
name //Name of auxiliary data source
data //Auxiliary data
}
... //More items may exist if there are multiple auxiliary data sources available
]
raw //Raw data dump of input security service export
}
}
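To make the layout above concrete, the following is a minimal, hand-filled HDF document sketch. All values are illustrative only (loosely based on the Twistlock example later in this guide); a real conversion will populate many more fields.
//Minimal illustrative HDF document; all values are hypothetical
const minimalHdf = {
  platform: {name: 'Heimdall Tools', release: '2.x.x'},
  version: '2.x.x',
  statistics: {},
  profiles: [
    {
      name: 'Example Scan',
      title: 'Example Scan of registry.io/test',
      supports: [],
      attributes: [],
      groups: [],
      status: 'loaded',
      controls: [
        {
          id: 'CVE-2021-43529',
          title: 'CVE-2021-43529',
          desc: 'A remote code execution flaw was found...',
          impact: 0.9,
          refs: [],
          tags: {},
          source_location: {},
          results: [
            {
              status: 'failed',
              code_desc: 'Package nss-util should be updated...',
              start_time: '2022-05-18T12:24:22Z'
            }
          ]
        }
      ],
      sha256: '' //Automatically generated by HDF Converters
    }
  ],
  passthrough: {
    auxiliary_data: [{name: 'Example', data: {}}]
  }
};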
The following is an example of a high-level mapping from the Twistlock file format to the HDF. The purpose of this demonstration is to give an easy, non-technical approach to generating a prototype for *-to-HDF mappers that can be used as a guideline for the development of actual technical mappers for the HDF Converter. This process is generally recommended as the first step for the development of any mapper for the HDF Converter.
(NOTE: The format used by your export may not match the one being used in this demonstration. The mappings used in this example are for demonstration purposes and should not be taken as a definitive resource; creative interpretation is necessary for the most accurate mapping according to the specifics of your security service export.)
Given a sample Twistlock scan export (as seen below), our goal is to roughly identify and group data fields according to our 3 primary structures in HDF (profiles, controls, and results) and the non-applicable structure (passthrough). For profiles, we want to find metadata; for controls, we want to find general security control information; for results, we want to find specific security control testing information; and we can place everything else into passthrough.
//Sample Twistlock scan export
{
"results": [
{
"id": "sha256:111",
"name": "registry.io/test",
"distro": "Red Hat Enterprise Linux release 8.6 (Ootpa)",
"distroRelease": "RHEL8",
"digest": "sha256:222",
"collections": [
"All",
"TEST-COLLECTION"
],
"packages": [
{
"type": "os",
"name": "nss-util",
"version": "3.67.0-7.el8_5",
"licenses": [
"MPLv2.0"
]
}
],
"vulnerabilities": [
{
"id": "CVE-2021-43529",
"status": "affected",
"cvss": 9.8,
"description": "DOCUMENTATION: A remote code execution flaw was found in the way NSS verifies certificates. This flaw allows an attacker posing as an SSL/TLS server to trigger this issue in a client application compiled with NSS when it tries to initiate an SSL/TLS connection. Similarly, a server application compiled with NSS, which processes client certificates, can receive a malicious certificate via a client, triggering the flaw. The highest threat to this vulnerability is confidentiality, integrity, as well as system availability. STATEMENT: The issue is not limited to TLS. Any applications that use NSS certificate verification are vulnerable; S/MIME is impacted as well. Similarly, a server application compiled with NSS, which processes client certificates, can receive a malicious certificate via a client. Firefox is not vulnerable to this flaw as it uses the mozilla::pkix for certificate verification. Thunderbird is affected when parsing email with the S/MIME signature. Thunderbird on Red Hat Enterprise Linux 8.4 and later does not need to be updated since it uses the system NSS library, but earlier Red Hat Enterprise Linux 8 extended life streams will need to update Thunderbird as well as NSS. MITIGATION: Red Hat has investigated whether a possible mitigation exists for this issue, and has not been able to identify a practical example. Please update the affec",
"severity": "critical",
"packageName": "nss-util",
"packageVersion": "3.67.0-7.el8_5",
"link": "https://access.redhat.com/security/cve/CVE-2021-43529",
"riskFactors": [
"Remote execution",
"Attack complexity: low",
"Attack vector: network",
"Critical severity",
"Recent vulnerability"
],
"impactedVersions": [
"*"
],
"publishedDate": "2021-12-01T00:00:00Z",
"discoveredDate": "2022-05-18T12:24:22Z",
"layerTime": "2022-05-16T23:12:25Z"
}
],
"vulnerabilityDistribution": {
"critical": 1,
"high": 0,
"medium": 0,
"low": 0,
"total": 1
},
"vulnerabilityScanPassed": true,
"history": [
{
"created": "2022-05-03T08:38:31Z"
},
{
"created": "2022-05-03T08:39:27Z"
}
],
"scanTime": "2022-05-18T12:24:32.855444532Z",
"scanID": "asdfghjkl"
}
],
"consoleURL": "https://twistlock.test.net/#!/monitor/vulnerabilities/images/ci?search=sha256%333"
}
Thus, upon successive passes we can roughly outline what we expect each data field in the Twistlock scan export to correlate to in the HDF. We first want to identify metadata which will most likely belong in the profiles structure. Such data fields will primarily be related to the general security scan itself or be related to the target system that is being scanned, as seen below:
//Data values are removed for visual clarity
{
"results": [
{
"id", //Scan target metadata -> profiles
"name", //
"distro", //
"distroRelease", //
"digest", //
"collections", //
"packages": [], //
"vulnerabilities": [
{
"id",
"status",
"cvss",
"description",
"severity",
"packageName",
"packageVersion",
"link",
"riskFactors": [],
"impactedVersions": [],
"publishedDate",
"discoveredDate",
"layerTime"
}
],
"vulnerabilityDistribution": {}, //Twistlock scan metadata -> profiles
"vulnerabilityScanPassed", //
"history": [], //Scan target package install history -> profiles
"scanTime", //Twistlock scan metadata -> profiles
"scanID" //
}
],
"consoleURL" //Twistlock scan metadata -> profiles
}
Next, we want to roughly outline general security control information that correlates to our controls structure. For this, we want to look for information that provides a background for the tests performed by the security service. Usually, this strongly correlates to information that gives us a why, what, and how for the tests that are performed, as seen with the fields that are highlighted below:
//Data values are removed for visual clarity
{
"results": [
{
"id", //Scan target metadata -> profiles
"name", //
"distro", //
"distroRelease", //
"digest", //
"collections", //
"packages": [], //
"vulnerabilities": [
{
"id", //ID of control tested against -> controls
"status",
"cvss", //CVSS severity score of control -> controls
"description", //Description of control -> controls
"severity", //Severity of control failure -> controls
"packageName",
"packageVersion",
"link", //Link to control documentation -> controls
"riskFactors": [],
"impactedVersions": [],
"publishedDate", //Control discovery date -> controls
"discoveredDate",
"layerTime"
}
],
"vulnerabilityDistribution": {}, //Twistlock scan metadata -> profiles
"vulnerabilityScanPassed", //
"history": [], //Scan target package install history -> profiles
"scanTime", //Twistlock scan metadata -> profiles
"scanID" //
}
],
"consoleURL" //Twistlock scan metadata -> profiles
}
After that, we want to outline items that relate to specific instances of control tests run against the scan target as part of the results structure. Usually, this strongly correlates to information that gives us a who, what, and when for the specific tests that are performed, as seen with the fields highlighted below:
//Data values are removed for visual clarity
{
"results": [
{
"id", //Scan target metadata -> profiles
"name", //
"distro", //
"distroRelease", //
"digest", //
"collections", //
"packages": [], //
"vulnerabilities": [
{
"id", //ID of control tested against -> controls
"status", //Pass/fail result of the control test -> results
"cvss", //CVSS severity score of control -> controls
"description", //Description of control -> controls
"severity", //Severity of control failure -> controls
"packageName", //Package ran against control test -> results
"packageVersion", //Version of package ran against control test -> results
"link", //Link to control documentation -> controls
"riskFactors": [], //Risk factors associated with failing this specific control test -> results
"impactedVersions": [], //Vulnerable versions of package ran against control test -> results
"publishedDate", //Control discovery date -> controls
"discoveredDate", //Date this control result was discovered -> results
"layerTime"
}
],
"vulnerabilityDistribution": {}, //Twistlock scan metadata -> profiles
"vulnerabilityScanPassed", //
"history": [], //Scan target package install history -> profiles
"scanTime", //Twistlock scan metadata -> profiles
"scanID" //
}
],
"consoleURL" //Twistlock scan metadata -> profiles
}
For fields that we cannot reasonably categorize or have no information about, we can instead just place them into the passthrough structure, as seen below:
//Data values are removed for visual clarity
{
"results": [
{
"id", //Scan target metadata -> profiles
"name", //
"distro", //
"distroRelease", //
"digest", //
"collections", //
"packages": [], //
"vulnerabilities": [
{
"id", //ID of control tested against -> controls
"status", //Pass/fail result of the control test -> results
"cvss", //CVSS severity score of control -> controls
"description", //Description of control -> controls
"severity", //Severity of control failure -> controls
"packageName", //Package ran against control test -> results
"packageVersion", //Version of package ran against control test -> results
"link", //Link to control documentation -> controls
"riskFactors": [], //Risk factors associated with failing this specific control test -> results
"impactedVersions": [], //Vulnerable versions of package ran against control test -> results
"publishedDate", //Control discovery date -> controls
"discoveredDate", //Date this control result was discovered -> results
"layerTime" //Information on package install time; extraneous -> passthrough
}
],
"vulnerabilityDistribution": {}, //Twistlock scan metadata -> profiles
"vulnerabilityScanPassed", //
"history": [], //Scan target package install history -> profiles
"scanTime", //Twistlock scan metadata -> profiles
"scanID" //
}
],
"consoleURL" //Twistlock scan metadata -> profiles
}
With this, we now have a general outline which roughly connects each data field in the Twistlock sample export to one of our structures in the HDF. In order to improve the accuracy of this mapping, we can now begin connecting specific fields in the HDF schema with the data fields in the sample export using our rough draft as a guide.
If we cannot find a field in the HDF schema that fits with a certain field in the sample export per our original groupings, we can instead look to the other structures to see if they have applicable fields or place the field into the passthrough structure as a last resort.
//Data values are removed for visual clarity
{
"results": [
{
"id", //profiles -> passthrough.auxiliary_data.data
"name", //profiles -> profiles.name
"distro", //profiles -> passthrough.auxiliary_data.data
"distroRelease", //profiles -> passthrough.auxiliary_data.data
"digest", //profiles -> passthrough.auxiliary_data.data
"collections", //profiles -> profiles.title
"packages": [], //profiles -> passthrough.auxiliary_data.data
"vulnerabilities": [
{
"id", //controls -> profiles.controls.id
"status", //results -> profiles.controls.results.status
"cvss", //controls -> profiles.controls.code
"description", //controls -> profiles.controls.desc
"severity", //controls -> profiles.controls.impact
"packageName", //results -> profiles.controls.results.code_desc
"packageVersion", //results -> profiles.controls.results.code_desc
"link", //controls -> profiles.controls.code
"riskFactors": [], //results -> profiles.controls.code
"impactedVersions": [], //results -> profiles.controls.results.code_desc
"publishedDate", //controls -> profiles.controls.code
"discoveredDate", //results -> profiles.controls.results.start_time
"layerTime" //passthrough -> profiles.controls.code
}
],
"vulnerabilityDistribution": {}, //profiles -> profiles.summary
"vulnerabilityScanPassed", //profiles -> passthrough.auxiliary_data.data
"history": [], //profiles -> passthrough.auxiliary_data.data
"scanTime", //profiles -> passthrough.auxiliary_data.data
"scanID" //profiles -> passthrough.auxiliary_data.data
}
],
"consoleURL" //profiles -> passthrough.auxiliary_data.data
}
With this, we now have a detailed high-level mapping for the conversion from an external file format to the HDF, which we can use for the technical implementation of a *-to-HDF mapper.
The following is a simplified depiction of the directory tree for the HDF Converter. Only noteworthy and potentially useful files and directories are included. It is not imperative to memorize the structure, but it is useful to familiarize yourself with it to better understand what exists where within the HDF Converter for future reference.
hdf-converters
+-- data
| +-- converters
| | +-- csv2json.ts
| | +-- xml2json.ts
+-- sample_jsons //Sample exports for mapper testing are located here
+-- src //*-to-HDF mappers are located here
| +-- converters-from-hdf //HDF-to-* mappers are located here
| | +-- reverse-any-base-converter.ts
| | +-- reverse-base-converter.ts
| +-- mappings //Non-HDF mappers are located here (e.g., CVE, CCI, NIST)
| +-- utils
| | +-- fingerprinting.ts
| | +-- global.ts
| +-- base-converter.ts
+-- test //Mapper tests are located here
| +-- mappers
| | +-- forward //*-to-HDF tests
| | +-- reverse //HDF-to-* tests
| | +-- utils.ts
+-- types //Explicit data typing for known export schemas
+-- index.ts
+-- package.json
The base-converter class is the underlying foundation which enables *-to-HDF mapping in HDF Converters. It defines *-to-HDF mappers and provides critical tools which allow for the construction of such mappers. All *-to-HDF mappers inherit from this class and therefore have access to the tools that this class provides; it is thus imperative that you utilize these tools to their fullest potential to ease and simplify mapper development. The provided tools are as follows:
-
path: Denotes the path within the input JSON object from which to pull data
- Use:
path: PATH_AS_STRING
- Example:
//Attribute 'id' will be set as whatever the JSON object attribute 'vulnerability.id' is
id: {path: 'vulnerability.id'}
-
transformer: Executes a given code sequence; operates similarly to an anonymous function
- Use:
transformer: (PARAMETERS): OUTPUT_TYPE => {CODE_TO_EXECUTE}
- Example:
//Attribute 'code' will be set as the returned stringified JSON object of input 'vulnerability'
code: {
  transformer: (vulnerability: Record<string, unknown>): string => {
    return JSON.stringify(vulnerability, null, 2);
  }
}
-
arrayTransformer: Executes a given code sequence on a given array; primarily used within an attribute that maps to an array of objects
- Use:
arrayTransformer: CODE_TO_EXECUTE
- Example:
//The function 'deduplicateId' will run against all items in the current array that the 'arrayTransformer' was called inside
arrayTransformer: deduplicateId
-
pathTransform:
- Use:
- Example:
-
key: Used by base-converter to sort an array of objects by the given attribute
- Use:
key: KEY_AS_STRING
- Example:
//'id' is now considered the key by which this section will be sorted
key: 'id'
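Putting these tools together, the sketch below shows how a hypothetical controls section of a mapper might combine path, key, arrayTransformer, and transformer. The 'vulnerabilities' path and the deduplicateId helper are placeholder assumptions for illustration, not part of any real export or of base-converter itself.
import _ from 'lodash';

//Hypothetical helper: drop duplicate controls that share the same 'id'
function deduplicateId(controls: Record<string, unknown>[]): Record<string, unknown>[] {
  return _.uniqBy(controls, 'id');
}

//Fragment of a mapper's 'mappings' object combining the tools above
const controlsFragment = [
  {
    path: 'vulnerabilities', //Iterate over every object under 'vulnerabilities' in the export
    key: 'id', //base-converter sorts the resulting controls by 'id'
    arrayTransformer: deduplicateId, //Runs once against the whole generated array of controls
    id: {path: 'id'}, //Pull 'id' directly from each vulnerability object
    code: {
      //Build 'code' from the entire vulnerability object
      transformer: (vulnerability: Record<string, unknown>): string =>
        JSON.stringify(vulnerability, null, 2)
    }
  }
];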
Node.js (a JavaScript runtime environment) and Yarn (a package manager for Node.js) are external utilities that are used extensively within this guide. The following section details their installation process.
Linux/Mac OS:
- Install nvm.
-
1a. Use either of the following commands to install nvm:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash
-
1b. Either restart the terminal or run the following commands to use nvm:
export NVM_DIR="$([ -z "${XDG_CONFIG_HOME-}" ] && printf %s "${HOME}/.nvm" || printf %s "${XDG_CONFIG_HOME}/nvm")"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
- Run the following command to install and use Node.js v16:
nvm install 16
- Install Yarn:
npm install --global yarn
Windows:
-
Install Node.js v16 via the installer. If v16 is not available from the page, use this archive.
-
Install Yarn:
npm install --global yarn
-
Fork/branch a development repository from the main Heimdall2 GitHub repository.
- SAF team developers have write access to the main repository and should create a branch on the primary development repository. Non-SAF team developers should instead create a fork of the main repository and create a development branch there.
-
Create a draft pull request for your development branch against the main repository branch.
-
Have a rough, high-level outline of how your export translates to the HDF. For an example of this process, refer to the HDF Schema Mapping Example Walkthrough section.
-
Set up for *-to-HDF mapper development.
-
4a. Install the necessary dependencies for Heimdall2. Under the heimdall2 directory, enter the following command in the terminal:
yarn install
-
4b. Create a blank TypeScript file under the src directory in hdf-converters. It should be named:
{YOUR-EXPORT-NAME-HERE}-mapper.ts
-
4c. Select the appropriate mapper skeleton for your export type. Place it in the file created in step 4b. Replace names (skeleton by default) as necessary.
-
4d. Export your mapper class created in the previous steps by specifying its export in the index.ts file. Add the following line:
export * from './src/{YOUR-EXPORT-NAME-HERE}-mapper';
-
4e. Create a new directory named {YOUR-EXPORT-NAME-HERE}_mapper under the sample_jsons directory in hdf-converters. Create another directory named sample_input_report in the directory you just made. The file structure should look like this:
+-- sample_jsons
| +-- {YOUR-EXPORT-NAME-HERE}_mapper
| | +-- sample_input_report
-
4f. Place your sample export under the sample_input_report directory. Your sample export should be genericized to avoid any leaking of sensitive information. The file structure should now look like this:
+-- sample_jsons
| +-- {YOUR-EXPORT-NAME-HERE}_mapper
| | +-- sample_input_report
| | | +-- {YOUR-SAMPLE-EXPORT}
-
4g. Create a blank TypeScript file under the test/mappers/forward directory in hdf-converters. It should be named:
{YOUR-EXPORT-NAME-HERE}_mapper.spec.ts
-
4h. Select the appropriate mapper testing skeleton for your export type. Place it in the file created in step 4g. Replace names (skeleton by default) as necessary.
- Add fingerprinting to identify your security service scan export.
-
5a. Go to the file report_intake.ts under the heimdall2/apps/frontend/src/store directory.
-
5b. Import your mapper file. You should be able to add the name of your mapper class to a pre-existing import statement pointing at hdf-converters as follows:
import { ASFFResults as ASFFResultsMapper, BurpSuiteMapper, ... {YOUR-MAPPER-CLASS-HERE} } from '@mitre/hdf-converters';
-
5c. Instantiate your mapper class in the convertToHdf switch block. Add the following lines:
case '{YOUR-EXPORT-SERVICE-NAME-HERE}':
  return new {YOUR-MAPPER-CLASS-HERE}(convertOptions.data).toHdf();
-
5d. Navigate to the file fingerprinting.ts in the src/utils directory in hdf-converters. Add keywords that are unique to your sample export to the fileTypeFingerprints variable. It should be formatted as follows:
export const fileTypeFingerprints = {
  asff: ['Findings', 'AwsAccountId', 'ProductArn'],
  ...
  {YOUR-EXPORT-SERVICE-NAME-HERE}: [{UNIQUE KEYWORDS AS STRINGS}]
};
- Create the *-to-HDF mapper.
-
6a. Return to the {YOUR-EXPORT-NAME-HERE}-mapper.ts file. In the file, you should have a generic skeleton mapper picked according to your export type.
-
6b. Certain security services produce exports which are not immediately usable by the skeleton mapper. In this case, pre-processing on the export and post-processing on the generated HDF file is necessary in order to ensure compatibility.
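For example, the hypothetical sketch below (not part of the skeleton; the 'report' wrapper key is an assumption for illustration only) unwraps an outer envelope before the data is handed to base-converter:
import _ from 'lodash';

//Hypothetical pre-processing helper: unwrap an assumed outer 'report' envelope
function unwrapReport(exportJson: string): Record<string, unknown> {
  const parsed = JSON.parse(exportJson);
  //Fall back to the whole parsed object if the assumed wrapper key is absent
  return _.get(parsed, 'report', parsed) as Record<string, unknown>;
}

//In the mapper's constructor, the pre-processed data would replace the plain JSON.parse call:
//  super(unwrapReport(exportJson), true);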
-
6c. The skeleton mapper and base-converter have been designed to provide the base functionality needed for *-to-HDF mapper generation. For most developers, mapper creation will be limited to assigning objects from the export structure to correlating attributes in the mapper according to the HDF schema.
- An example of this process is provided in the *-to-HDF Mapper Construction Example section.
-
6d. Commit your changes with the signoff option and push them to your development branch. This queues up the GitHub Actions pipeline, which includes a Netlify instance of Heimdall2 that you can use to test whether your mapper generates an HDF file correctly.
- Set up and use regression testing on your mapper.
-
7a. Uncomment the commented-out lines in the {YOUR-EXPORT-NAME-HERE}_mapper.spec.ts test file created in step 4g. This allows the regression tests to automatically generate an HDF output file whenever you run them. The commented-out lines should look similar to the following:
// fs.writeFileSync(
//   'sample_jsons/skeleton_mapper/skeleton-hdf.json',
//   JSON.stringify(mapper.toHdf(), null, 2)
// );
-
7b. Using the terminal, cd into the hdf-converters directory and run the following command. This command will run your mapper against the sample export file in sample_jsons and test to see if the output is generated as expected.
yarn run test --verbose --silent=false ./test/mappers/forward/{YOUR-EXPORT-NAME-HERE}_mapper.spec.ts
-
7c. Your tests should generate HDF output files both when --with-raw is not flagged (default behavior) and when it is flagged (denoted by -withraw in the filename). They will also compare the contents of these generated files with a temporary mapper instance created in the test itself. Review the test output to ensure that all tests pass and review the HDF output files to ensure that the contents of the mapping are generated correctly.
-
7d. Re-comment the lines you uncommented in step 7a.
- Document your new mapper in the README for hdf-converters under the Supported Formats section. It should be formatted as follows:
{#}. [{YOUR-EXPORT-NAME-HERE}] - {MAPPER INPUT DESCRIPTION}
-
Commit your final changes and mark your pull request as 'ready for review'. Request a code review from members of the SAF team and edit your code as necessary. Once approved, your mapper will be merged into the main development branch and scheduled for release as an officially supported conversion format for HDF Converters.
-
Create a development branch against the SAF CLI repository and create a draft pull request for your new branch.
-
Set up for SAF CLI mapper integration.
-
11a. In the package.json file, update the versions of @mitre/hdf-converters and @mitre/heimdall-lite to the latest release of Heimdall2.
-
11b. In the src/commands/convert directory, create a blank TypeScript file. It should be named:
{YOUR-EXPORT-NAME-HERE}2hdf.ts
-
11c. In the test/sample_data directory, create a directory named {YOUR-EXPORT-NAME-HERE}. Underneath it, create a directory named sample_input_report. The file structure should now look like this:
+-- sample_data
| +-- {YOUR-EXPORT-NAME-HERE}
| | +-- sample_input_report
-
11d. Place your sample export under the sample_input_report directory. Your sample export should be genericized to avoid any leaking of sensitive information. Under the {YOUR-EXPORT-NAME-HERE} directory, place your output HDF files generated during the testing phase of step 7c. The file structure should now look like this:
+-- sample_data
| +-- {YOUR-EXPORT-NAME-HERE}
| | +-- sample_input_report
| | | +-- {YOUR-SAMPLE-EXPORT}
| | +-- {YOUR-EXPORT-NAME-HERE}-hdf.json
| | +-- {YOUR-EXPORT-NAME-HERE}-hdf-withraw.json
-
11e. In the test/commands/convert directory, create a blank TypeScript file. It should be named:
{YOUR-EXPORT-NAME-HERE}2hdf.test.ts
- Integrate your mapper with the SAF CLI.
-
12a. Insert the skeleton for integrating an HDF mapper with the SAF CLI. Replace names (skeleton by default) as necessary.
-
12b. Insert the skeleton for a convert command test for the SAF CLI. Replace names (skeleton by default) as necessary.
-
12c. Navigate to the index.ts file under the src/commands/convert directory. Import your mapper using the existing import block as follows:
import { ASFFResults, ... {YOUR-MAPPER-CLASS-HERE} } from '@mitre/hdf-converters'
-
12d. Under the switch block in the getFlagsForInputFile function, add a case for your export service name as it is defined for fingerprinting in step 5d so that the generic convert command can detect your format. If the convert command for your mapper has any additional flags beyond the standard --input and --output flags, return the entire flag block in that switch case. This is demonstrated as follows:
switch (Convert.detectedType) {
  ...
  case {YOUR-EXPORT-SERVICE-NAME-HERE}:
    return {YOUR-CLI-CONVERT-CLASS}.flags //Only add if special flags exist
  ...
  return {}
}
- Edit the README file to reflect your newly added conversion command under the To HDF section. It should be formatted as follows:
##### {YOUR-EXPORT-NAME-HERE} to HDF
\```
convert {YOUR-EXPORT-NAME-HERE}2hdf Translate a {YOUR-EXPORT-NAME-HERE} results {EXPORT-TYPE} into
a Heimdall Data Format JSON file
OPTIONS
-i, --input=input Input {EXPORT-TYPE} File
-o, --output=output Output HDF JSON File
-w, --with-raw Include raw input file in HDF JSON file
EXAMPLES
saf convert {YOUR-EXPORT-NAME-HERE}2hdf -i {INPUT-NAME} -o output-hdf-name.json
\```
- Commit your changes and mark your pull request as 'ready for review'. Request a code review from members of the SAF team and edit your code as necessary. Once approved, merged, and released, your mapper will be callable using the SAF CLI.
- Skeleton for a general file-based *-to-HDF mapper:
import {ExecJSON} from 'inspecjs';
import _ from 'lodash';
import {version as HeimdallToolsVersion} from '../package.json';
import {
BaseConverter,
ILookupPath,
impactMapping,
MappedTransform
} from './base-converter';
const IMPACT_MAPPING: Map<string, number> = new Map([
['critical', 0.9],
['high', 0.7],
['medium', 0.5],
['low', 0.3]
]);
export class SkeletonMapper extends BaseConverter {
withRaw: boolean;
mappings: MappedTransform<
ExecJSON.Execution & {passthrough: unknown},
ILookupPath
> = {
platform: {
name: 'Heimdall Tools',
release: HeimdallToolsVersion,
target_id: null //Insert data
},
version: HeimdallToolsVersion,
statistics: {
duration: null //Insert data
},
profiles: [
{
name: '', //Insert data
title: null, //Insert data
maintainer: null, //Insert data
summary: null, //Insert data
license: null, //Insert data
copyright: null, //Insert data
copyright_email: null, //Insert data
supports: [], //Insert data
attributes: [], //Insert data
depends: [], //Insert data
groups: [], //Insert data
status: 'loaded', //Insert data
controls: [
{
key: 'id',
tags: {}, //Insert data
descriptions: [], //Insert data
refs: [], //Insert data
source_location: {}, //Insert data
title: null, //Insert data
id: '', //Insert data
desc: null, //Insert data
impact: 0, //Insert data
code: null, //Insert data
results: [
{
status: ExecJSON.ControlResultStatus.Failed, //Insert data
code_desc: '', //Insert data
message: null, //Insert data
run_time: null, //Insert data
start_time: '' //Insert data
}
]
}
],
sha256: ''
}
],
passthrough: {
transformer: (data: Record<string, any>): Record<string, unknown> => {
return {
auxiliary_data: [{name: '', data: _.omit(data, [])}], //Insert service name and list the mapped fields to remove from the export data
...(this.withRaw && {raw: data})
};
}
}
};
constructor(exportJson: string, withRaw = false) {
super(JSON.parse(exportJson), true);
this.withRaw = withRaw
}
}
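As a quick sanity check, the skeleton mapper can be exercised directly. This sketch assumes the skeleton above is saved as src/skeleton-mapper.ts and that a sample export exists at the path used by the test skeleton below.
import fs from 'fs';
import {SkeletonMapper} from './src/skeleton-mapper';

//Read the sample export and convert it to HDF
const inputJson = fs.readFileSync(
  'sample_jsons/skeleton_mapper/sample_input_report/skeleton.json',
  {encoding: 'utf-8'}
);
const mapper = new SkeletonMapper(inputJson, false); //withRaw = false
//Write the generated HDF file out for manual inspection
fs.writeFileSync('skeleton-hdf.json', JSON.stringify(mapper.toHdf(), null, 2));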
- Skeleton for a general test for a *-to-HDF mapper in HDF Converters:
import fs from 'fs';
import {SkeletonMapper} from '../../../src/skeleton-mapper';
import {omitVersions} from '../../utils';
describe('skeleton_mapper', () => {
it('Successfully converts Skeleton targeted at a local/cloned repository data', () => {
const mapper = new SkeletonMapper(
fs.readFileSync(
'sample_jsons/skeleton_mapper/sample_input_report/skeleton.json',
{encoding: 'utf-8'}
)
);
// fs.writeFileSync(
// 'sample_jsons/skeleton_mapper/skeleton-hdf.json',
// JSON.stringify(mapper.toHdf(), null, 2)
// );
expect(omitVersions(mapper.toHdf())).toEqual(
omitVersions(
JSON.parse(
fs.readFileSync(
'sample_jsons/skeleton_mapper/skeleton-hdf.json',
{
encoding: 'utf-8'
}
)
)
)
);
});
});
describe('skeleton_mapper_withraw', () => {
it('Successfully converts withraw flagged Skeleton targeted at a local/cloned repository data', () => {
const mapper = new SkeletonMapper(
fs.readFileSync(
'sample_jsons/skeleton_mapper/sample_input_report/skeleton.json',
{encoding: 'utf-8'}
),
true
);
// fs.writeFileSync(
// 'sample_jsons/skeleton_mapper/skeleton-hdf-withraw.json',
// JSON.stringify(mapper.toHdf(), null, 2)
// );
expect(omitVersions(mapper.toHdf())).toEqual(
omitVersions(
JSON.parse(
fs.readFileSync(
'sample_jsons/skeleton_mapper/skeleton-hdf-withraw.json',
{
encoding: 'utf-8'
}
)
)
)
);
});
});
- Skeleton for SAF CLI mapper conversion integration:
import {Command, Flags} from '@oclif/core'
import fs from 'fs'
import {SkeletonMapper as Mapper} from '@mitre/hdf-converters'
import {checkSuffix} from '../../utils/global'
export default class Skeleton2HDF extends Command {
static usage = 'convert skeleton2hdf -i <skeleton-json> -o <hdf-scan-results-json>'
static description = 'Translate a Skeleton output file into an HDF results set'
static examples = ['saf convert skeleton2hdf -i skeleton.json -o output-hdf-name.json']
static flags = {
help: Flags.help({char: 'h'}),
input: Flags.string({char: 'i', required: true, description: 'Input Skeleton file'}),
output: Flags.string({char: 'o', required: true, description: 'Output HDF file'}),
'with-raw': Flags.boolean({char: 'w', required: false}),
}
async run() {
const {flags} = await this.parse(Skeleton2HDF)
const input = fs.readFileSync(flags.input, 'utf8')
const converter = new Mapper(input, flags['with-raw'])
fs.writeFileSync(checkSuffix(flags.output), JSON.stringify(converter.toHdf()))
}
}
- Skeleton for a convert command test for the SAF CLI:
import {expect, test} from '@oclif/test'
import tmp from 'tmp'
import path from 'path'
import fs from 'fs'
import {omitHDFChangingFields} from '../utils'
describe('Test skeleton', () => {
const tmpobj = tmp.dirSync({unsafeCleanup: true})
test
.stdout()
.command([
'convert skeleton2hdf',
'-i',
path.resolve(
'./test/sample_data/skeleton/sample_input_report/skeleton_sample.json',
),
'-o',
`${tmpobj.name}/skeletontest.json`,
])
.it('hdf-converter output test', () => {
const converted = JSON.parse(
fs.readFileSync(`${tmpobj.name}/skeletontest.json`, 'utf8'),
)
const sample = JSON.parse(
fs.readFileSync(
path.resolve('./test/sample_data/skeleton/skeleton-hdf.json'),
'utf8',
),
)
expect(omitHDFChangingFields(converted)).to.eql(
omitHDFChangingFields(sample),
)
})
})
describe('Test skeleton withraw flag', () => {
const tmpobj = tmp.dirSync({unsafeCleanup: true})
test
.stdout()
.command([
'convert skeleton2hdf',
'-i',
path.resolve(
'./test/sample_data/skeleton/sample_input_report/skeleton_sample.json',
),
'-o',
`${tmpobj.name}/skeletontest.json`,
'-w',
])
.it('hdf-converter withraw output test', () => {
const converted = JSON.parse(
fs.readFileSync(`${tmpobj.name}/skeletontest.json`, 'utf8'),
)
const sample = JSON.parse(
fs.readFileSync(
path.resolve('./test/sample_data/skeleton/skeleton-hdf-withraw.json'),
'utf8',
),
)
expect(omitHDFChangingFields(converted)).to.eql(
omitHDFChangingFields(sample),
)
})
})
The following is an example of an implemented HDF mapper for Twistlock, created using a high-level Twistlock-to-HDF mapping and the tools from base-converter. (The imports are the same as those in the mapper skeleton above and are omitted here.)
const IMPACT_MAPPING: Map<string, number> = new Map([
['critical', 0.9],
['important', 0.9],
['high', 0.7],
['medium', 0.5],
['moderate', 0.5],
['low', 0.3]
]);
export class TwistlockMapper extends BaseConverter {
withRaw: boolean;
mappings: MappedTransform<
ExecJSON.Execution & {passthrough: unknown},
ILookupPath
> = {
platform: {
name: 'Heimdall Tools',
release: HeimdallToolsVersion,
target_id: {path: 'results[0].name'}
},
version: HeimdallToolsVersion,
statistics: {},
profiles: [
{
path: 'results',
name: 'Twistlock Scan',
title: {
transformer: (data: Record<string, unknown>): string => {
const projectArr = _.has(data, 'collections')
? _.get(data, 'collections')
: 'N/A';
const projectName = Array.isArray(projectArr)
? projectArr.join(' / ')
: projectArr;
return `Twistlock Project: ${projectName}`;
}
},
summary: {
transformer: (data: Record<string, unknown>): string => {
const vulnerabilityTotal = _.has(data, 'vulnerabilityDistribution')
? `${JSON.stringify(
_.get(data, 'vulnerabilityDistribution.total')
)}`
: 'N/A';
const complianceTotal = _.has(data, 'complianceDistribution')
? `${JSON.stringify(_.get(data, 'complianceDistribution.total'))}`
: 'N/A';
return `Package Vulnerability Summary: ${vulnerabilityTotal} Application Compliance Issue Total: ${complianceTotal}`;
}
},
supports: [],
attributes: [],
groups: [],
status: 'loaded',
controls: [
{
path: 'vulnerabilities',
key: 'id',
tags: {
nist: ['SI-2', 'RA-5'],
cci: ['CCI-002605', 'CCI-001643'],
cveid: {path: 'id'}
},
refs: [],
source_location: {},
title: {path: 'id'},
id: {path: 'id'},
desc: {path: 'description'},
impact: {
path: 'severity',
transformer: impactMapping(IMPACT_MAPPING)
},
code: {
transformer: (vulnerability: Record<string, unknown>): string => {
return JSON.stringify(vulnerability, null, 2);
}
},
results: [
{
status: ExecJSON.ControlResultStatus.Failed,
code_desc: {
transformer: (data: Record<string, unknown>): string => {
const packageName = _.has(data, 'packageName')
? `${JSON.stringify(_.get(data, 'packageName'))}`
: 'N/A';
const impactedVersions = _.has(data, 'impactedVersions')
? `${JSON.stringify(_.get(data, 'impactedVersions'))}`
: 'N/A';
return `Package ${packageName} should be updated to latest version above impacted versions ${impactedVersions}`;
}
},
message: {
transformer: (data: Record<string, unknown>): string => {
const packageName = _.has(data, 'packageName')
? `${JSON.stringify(_.get(data, 'packageName'))}`
: 'N/A';
const packageVersion = _.has(data, 'packageVersion')
? `${JSON.stringify(_.get(data, 'packageVersion'))}`
: 'N/A';
return `Expected latest version of ${packageName}\nDetected vulnerable version ${packageVersion} of ${packageName}`;
}
},
start_time: {path: 'discoveredDate'}
}
]
}
],
sha256: ''
}
],
passthrough: {
transformer: (data: Record<string, unknown>): Record<string, unknown> => {
let resultsData = _.get(data, 'results');
if (Array.isArray(resultsData)) {
resultsData = resultsData.map((result: Record<string, unknown>) =>
_.omit(result, [
'name',
'collections',
'complianceDistribution',
'vulnerabilities',
'vulnerabilityDistribution'
])
);
}
return {
auxiliary_data: [
{
name: 'Twistlock',
data: {
results: resultsData,
consoleURL: _.get(data, 'consoleURL')
}
}
],
...(this.withRaw && {raw: data})
};
}
}
};
constructor(twistlockJson: string, withRaw = false) {
super(JSON.parse(twistlockJson), true);
this.withRaw = withRaw;
}
}