Begin adding material CWE-74 (CWE-79 and CWE-94)

Here is a start at adding subclasses of CWE-74, specifically CWE-79 and CWE-94. CWE-79 is one of the most common vulnerabilities on the planet, including in code written in Python, so saying *nothing* about it would be a mistake. This is complicated because the only proper solutions to CWE-79 are external modules, which are out of scope. However, I think it's reasonable to make it clear that you *cannot* solve this at scale with only the built-in modules, so you *must* think beyond them. I didn't try to fill in the examples, because I wanted to see if this was a reasonable direction first.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
ossf · Dec 17, 2024 · 4c97742 · 4c97742
1 parent 0546496
commit 4c97742
Showing 3 changed files with 72 additions and 0 deletions.
diff --git a/docs/Secure-Coding-Guide-for-Python/CWE-74/CWE-79/README.md b/docs/Secure-Coding-Guide-for-Python/CWE-74/CWE-79/README.md
@@ -0,0 +1,35 @@
+# CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
+
+Ensure that web pages, when generated, completely neutralize untrusted input. One of the most common vulnerabilities today is cross-site scripting (XSS), caused by failure to neutralize untrusted input. This is especially common when generating HTML, but can also occur in URLs, CSS, and JavaScript if they are generated from untrusted input.
+
+This vulnerability is common because it is easy to use string concatenation and format strings to combine constant values with data from a potential attacker.
+
+The proper solution is to use a template system that automatically neutralizes (escapes) all data except for data that is trusted (e.g., constant data) and data *specifically* identified and marked as being trusted. Python does not include, in its built-in libraries, such a template system. There are many such libraries available via PyPI; we encourage you to use one of them. Selecting such a library is outside the scope of this document.
+
+Be certain that the template system you use is configured to automatically escape data. For example, the Jinja2 template engine *can* escape HTML data, and is configured to do so by Flask, but when Jinja2 is used in isolation it does *not* escape data by default. Ensure that automatic escaping is enabled, and include tests in your automated test suite to verify this.
+
+Python 3's built-in libraries include [`html.escape`](https://docs.python.org/3/library/html.html#html.escape), which escapes HTML characters in a string. If your Python program is only a few lines long, and you carefully ensure that all untrusted input goes through `html.escape` before being included in HTML, this can also counter HTML generation issues. However, this approach is ineffective as programs get larger. The fundamental flaw is that it requires developers to always call the escape function on every use, so any time a developer forgets to make the call, it becomes a vulnerability. Put another way, this approach makes the default insecure, and as the program is maintained a mistake is bound to happen. Templating systems, in contrast, make the default the secure result.
+
+
+## Non-Compliant Code Example
+
+The `noncompliant01.py` code example demonstrates this:
+
+*[noncompliant01.py](noncompliant01.py):*
+
+```py
+```
+
+## Compliant Solution
+
+A compliant solution uses a templating engine. Selecting a templating engine is out of scope for this document.
+
+## Automated Detection
+
+unknown
+
+## Related Guidelines
+
+
+## Bibliography
+
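As a minimal sketch of the `html.escape` approach described in the CWE-79 README above, and of why it is fragile, the first function below forgets the call and the second remembers it. The function names `render_greeting_unsafe` and `render_greeting_escaped` and the sample payload are hypothetical, chosen only for illustration; this is not the content of `noncompliant01.py`, which the commit leaves empty.

```python
import html


def render_greeting_unsafe(name: str) -> str:
    # Hypothetical example: untrusted input is concatenated into HTML as-is.
    return "<p>Hello, " + name + "!</p>"


def render_greeting_escaped(name: str) -> str:
    # html.escape replaces &, < and > (and, by default, quotes) with entities.
    return "<p>Hello, " + html.escape(name) + "!</p>"


payload = "<script>alert(1)</script>"
print(render_greeting_unsafe(payload))   # the script tag survives intact
print(render_greeting_escaped(payload))  # the script tag is rendered as text
```

The two functions differ by a single call, which is the point the README makes: nothing forces a developer to remember it, so the insecure version is the default.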
diff --git a/docs/Secure-Coding-Guide-for-Python/CWE-74/CWE-94/README.md b/docs/Secure-Coding-Guide-for-Python/CWE-74/CWE-94/README.md
@@ -0,0 +1,32 @@
+# CWE-94: Improper Control of Generation of Code ('Code Injection')
+
+Ensure that code generated from untrusted input and then directly executed
+cannot cause malicious results.
+
+As a general rule, never use untrusted inputs to create something that will be passed to Python's built-in [eval](https://docs.python.org/3/library/functions.html#eval) function. It is difficult to escape properly, and there are far too many ways to subvert it.
+
+As noted by [MITRE's CWE-94 information](https://cwe.mitre.org/data/definitions/94.html), "it is frequently encouraged to use the `ast.literal_eval()` function instead of eval, since it is intentionally designed to avoid executing code. However, an adversary could still cause excessive memory or stack consumption via deeply nested structures [REF-1372], so the python documentation discourages use of ast.literal_eval() on untrusted data [REF-1373]."
+
+Solutions generally involve changing the software architecture and approach so it does not need to dynamically generate code for execution that uses untrusted input. For example, if you know that you're reading JSON data that could also be interpreted as a Python expression, a *terrible* solution would be to use `eval` to evaluate it. A much better solution would be to use Python's built-in [json](https://docs.python.org/3/library/json.html) module to read in the data, as it does not allow attackers to arbitrarily control execution. As noted in the `json` module documentation, "Be cautious when parsing JSON data from untrusted sources. A malicious JSON string may cause the decoder to consume considerable CPU and memory resources. Limiting the size of data to be parsed is recommended."
+
+## Non-Compliant Code Example
+
+The `noncompliant01.py` code example demonstrates this:
+
+*[noncompliant01.py](noncompliant01.py):*
+
+```py
+```
+
+## Compliant Solution
+
+
+## Automated Detection
+
+unknown
+
+## Related Guidelines
+
+
+## Bibliography
+
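The CWE-94 README above recommends reading JSON with the `json` module rather than `eval`, and quotes the advice to limit input size. A minimal sketch of that combination follows. The names `MAX_JSON_BYTES` and `parse_untrusted_json` and the 64 KiB limit are assumptions made for illustration, not part of the commit.

```python
import json

MAX_JSON_BYTES = 64 * 1024  # hypothetical limit; choose one that fits your data


def parse_untrusted_json(text: str):
    # Reject oversized input before parsing, as the json documentation suggests.
    if len(text.encode("utf-8")) > MAX_JSON_BYTES:
        raise ValueError("JSON input too large")
    # json.loads only builds data (dict, list, str, numbers, bool, None);
    # it never evaluates the input as Python code.
    return json.loads(text)


print(parse_untrusted_json('{"user": "alice", "roles": ["admin"]}'))

# A string that eval() would execute is rejected here as invalid JSON.
try:
    parse_untrusted_json('__import__("os").system("id")')
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```

The size check is only a first line of defence against resource exhaustion; it does not replace validating the structure of the parsed data.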
diff --git a/docs/Secure-Coding-Guide-for-Python/readme.md b/docs/Secure-Coding-Guide-for-Python/readme.md
@@ -39,6 +39,11 @@ It is **not production code** and requires code-style or python best practices t
 * Proper logging instead of printing to `stdout`
 * Secure coding compliance outside of described issue
 
+|[CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')](https://cwe.mitre.org/data/definitions/74.html)|Prominent CVE|
+|:---------------------------------------------------------------------------------------------------------------|:----|
+|[CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')](CWE-74/CWE-79/README.md)||
+|[CWE-94: Improper Control of Generation of Code ('Code Injection')](CWE-74/CWE-94/README.md)||
+
 |[CWE-664: Improper Control of a Resource Through its Lifetime](https://cwe.mitre.org/data/definitions/664.html)|Prominent CVE|
 |:-----------------------------------------------------------------------------------------------------------------------------------------------|:----|
 |[CWE-134: Use of Externally-Controlled Format String](CWE-664/CWE-134/README.md)|[CVE-2022-27177](https://www.cvedetails.com/cve/CVE-2022-27177/),<br/>CVSSv3.1: **9.8**,<br/>EPSS: **00.37** (01.12.2023)|