diff --git a/docs/LANGUAGE_SUPPORT.md b/docs/LANGUAGE_SUPPORT.md
index 6b29a989..bc01dfa0 100644
--- a/docs/LANGUAGE_SUPPORT.md
+++ b/docs/LANGUAGE_SUPPORT.md
@@ -73,13 +73,13 @@ Writing detection rules allows us to capture all the values linked to a cryptogr
 Those values are captured in a tree-like structure based on the definition of detection rules and their dependent detection rules that detected them.
 The tree does not contain any semantic information about how the cryptographic values relate to each other.
-What we want instead is a meaningful representation of all cryptography related values: a tree structure where relationships between nodes carry some meaning.
+Ultimately, we want a meaningful representation of all cryptography-relevant values: a tree structure in which the relationships between the nodes have a meaning.
 Back to our example, we want a tree where the mode is a child node of the algorithm node, to indicate that it's the mode of this algorithm.
 This process of building a meaningful tree representation of the captured cryptography values is called the translation.
 This process is also part of the language module.
 In certain cases where translation requires to parse a string, the parsing and translation process is outsourced to the `mapper` module for better modularity.
 The last step of the translation process is called the enrichment, and is done by the `enricher` module.
-This step aims at adding content to the translated tree, based on external knowledge (i.e. not based on the values we captured in the source code).
+This step aims at adding content to the translated tree, based on external knowledge.
 Indeed, we can get additional information from the documentation of a cryptography library, like some default values.
 Maybe our algorithm has a default mode when no mode is specified in the code, in such case we can "enrich" the translated tree with this default mode.
 Additionally, we can enrich most cryptography assets with an [object identifier](https://en.wikipedia.org/wiki/Object_identifier) (OID) that uniquely identifies an algorithm and plays an important role in a CBOM.
@@ -92,7 +92,7 @@ Additionally, we can enrich most cryptography assets with an [object identifier]
 The [`engine`](../engine/) module bridges the gap between the language-specific sonar APIs used for navigating the AST, and the high level language-agnostic API used for writing detection rules.
-It has a [`language`](../engine/src/main/java/com/ibm/engine/language/) subfolder which contains the set of functions (defined by five interfaces) to implement to enable this high level API for a language supported for SonarQube plugins.
+It has a [`language`](../engine/src/main/java/com/ibm/engine/language/) subfolder that contains the set of functions (defined by five interfaces) to be implemented to set up this high-level API for a language.
 More details about these interfaces are given [later](#implementing-the-language-specific-parts-of-the-engine).
 The `engine` module contains a lot of other subfolders and files that enable strong detection capabilities, but they are all based on the functions provided by the `language` files and therefore do not need to be modified when adding support for another language.
@@ -107,18 +107,15 @@ Once the plugin supports the targeted language, the [next section](#adding-suppo
 > [!TIP]
 > If your language is already supported by our plugin, you can directly skip to [the section](#adding-support-for-another-cryptography-library) explaining in detail how to add support for a cryptography library.
-Recall that only languages supported for SonarQube plugins are supported (they are listed [here](https://docs.sonarsource.com/sonarqube/latest/extension-guide/adding-coding-rules/#custom-rule-support-by-language) in the *Java* column), because they come with a sonar analyzer and API.
+Currently, we only support languages for which Sonar provides a parser that generates an abstract syntax tree (AST) ([see](https://github.com/search?q=topic%3Alanguage-team+org%3ASonarSource+&type=repositories) the supported language parsers). In theory, any parser written in Java that generates an AST from the source code should be integrable, and therefore so should any language for which such a parser exists. So far, however, we have only tested the parsers provided by Sonar.
-> [!NOTE]
-> If you really want to add support for a language that is not supported for SonarQube plugins, it may be possible to use a third-party analyzer and integrate with SonarQube using [generic formatted issue reports](https://docs.sonarsource.com/sonarqube/latest/analyzing-source-code/importing-external-issues/generic-issue-import-format/).
-> However, this has not been attempted yet and will probably result in significantly more work.
 In the following, we will take the example of adding support for the Java language.
-### Adding the language analyzer
+### Adding the language parser
-The first step is to add a dependency for your sonar language analyzer to the main `sonar-cryptography` [`pom.xml`](../pom.xml).
-You should find information about the analyzer group and artifact identifiers in the appropriate language page of the [documentation of sonar languages](https://docs.sonarsource.com/sonarqube/latest/analyzing-source-code/languages/overview/).
+The first step is to add a dependency for your sonar language parser to the main `sonar-cryptography` [`pom.xml`](../pom.xml).
+You can find a parser for your language either among the parsers provided by Sonar ([see](https://github.com/search?q=topic%3Alanguage-team+org%3ASonarSource+&type=repositories)) or among those created by the community, such as the one for C/C++ ([see](https://github.com/SonarOpenCommunity/sonar-cxx)).
 Then, first add its version under ``:
 ```xml
@@ -136,11 +133,11 @@ And add the dependency (using this version reference) under `