adds initial draft of memory safety continuum #20

Merged
merged 9 commits into from
Apr 19, 2024
99 changes: 99 additions & 0 deletions docs/memory-safety-continuum.md
@@ -0,0 +1,99 @@
# Memory Safety Continuum
Review comment (Contributor):

I wonder if this should merge into the definitions.md, since it extends upon the notions in that and that way we would have a sharable, unified artifact that defines memory-safety?

Reply (Contributor Author):

Possibly - though I'd like to keep it in a separate doc until we further discuss and refine the idea of the continuum.


At this SIG's January 25, 2024 meeting, we discussed the idea of a continuum when it comes to software memory safety.

When discussing memory safety, it's easy to fall into the idea of a memory safety binary - either your software is "memory safe" or it is not. In reality - especially given how much existing software there is (legacy or not) - the picture is more nuanced.

This DRAFT document lays out the idea of the "Memory Safety Continuum". Eventually, this idea may be used as a foundation for other content which will help developers and organizations identify where they are on the continuum and how to get to where they want to be.

## Audience

The audience for this draft is developers - however, the audience may expand with future drafts/versions of this document.

## What is Memory Safety?

Rather than using terms like "Memory Safe Language" and "Memory Unsafe Language", this SIG prefers the terms "memory safe by default" and "non-memory safe by default". Please see our [useful definitions](definitions.md) file for more information about memory safety and undefined behavior.

Review comment:

I like this!


## The Continuum

This is a very rough idea of what the continuum might look like, ordered from "least safety" to "most safety":

* Using a non-memory safe by default language (such as C or C++) without developer best practices or automated tooling to check for memory safety
* Using a non-memory safe by default language with developer best practices, but no automated tooling to check for memory safety
* Using a non-memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code
* Using a non-memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code AND automated tooling to check for memory safety in third party code (dependencies)
* Using a memory safe by default language (such as Rust, Go, Python, Java, JavaScript, C#) without developer best practices or automated tooling
* Using a memory safe by default language with developer best practices, but no automated tooling to check for memory safety
* Using a memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code
* Using a memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code AND automated tooling to check for memory safety in third party code (dependencies)

### Using a non-memory safe by default language without developer best practices or automated tooling

Using raw C or C++ (or old versions of the C and C++ languages and compilers)

### Using a non-memory safe by default language with developer best practices, but no automated tooling

Examples:

* [Using attributes such as `cleanup` and classes when writing C](https://lwn.net/Articles/934679/) (and depending on developers to manually check this)
* Following the [C++ Core Guidelines](https://github.com/isocpp/CppCoreGuidelines) when writing C++
* Using the [Compiler Options Hardening Guide for C and C++](https://github.com/ossf/wg-best-practices-os-developers/tree/main/docs/Compiler-Hardening-Guides) when compiling C and C++ code

Review comment:

How about:

Isolating code that processes untrusted data from code that performs direct memory management operations or uses raw pointers (See: https://langsec.org)

### Using a non-memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code

TO DO

### Using a non-memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code AND automated tooling to check for memory safety in third party code (dependencies)

TO DO

### Using a memory safe by default language without developer best practices or automated tooling

Examples:

* Using unsafe blocks in Rust [without assessing the entire module in which the unsafe code appears](https://github.com/ossf/Memory-Safety/issues/15#issuecomment-1847939439)
* Using the [no_mangle](https://github.com/rust-lang/rust/issues/28179) attribute in Rust
* Using compiler options which turn off some or all of the compiler's memory safety checks
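
To illustrate the first example above, here is a minimal sketch of why an `unsafe` block cannot be assessed in isolation. The `Buffer` type and its methods are hypothetical names invented for this illustration, not taken from any real crate. The `unsafe` block in `last` is only sound because the entirely safe method `set_len` keeps `len` within the array bounds.

```rust
mod buffer {
    /// Hypothetical fixed-capacity byte buffer.
    /// Module-wide invariant: `len <= data.len()`.
    pub struct Buffer {
        data: [u8; 4],
        len: usize,
    }

    impl Buffer {
        pub fn new() -> Self {
            Buffer { data: [0; 4], len: 0 }
        }

        /// Appends a byte; returns false when the buffer is full.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        /// Contains no `unsafe`, yet the soundness of `last` depends on it:
        /// if this bounds check were removed, safe callers could push `len`
        /// past the end of the array.
        pub fn set_len(&mut self, new_len: usize) -> bool {
            if new_len > self.data.len() {
                return false;
            }
            self.len = new_len;
            true
        }

        /// Returns the last stored byte without a bounds check.
        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: every method in this module keeps `len <= data.len()`,
            // and `len > 0` was checked above, so the index is in bounds.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }
    }
}

fn main() {
    let mut buf = buffer::Buffer::new();
    assert_eq!(buf.last(), None);
    assert!(buf.push(7));
    assert!(buf.push(9));
    assert_eq!(buf.last(), Some(9));
    assert!(buf.set_len(1));
    assert_eq!(buf.last(), Some(7));
    // An out-of-range length is rejected, preserving the invariant.
    assert!(!buf.set_len(5));
    println!("ok");
}
```

A reviewer who reads only the `unsafe` block sees a plausible safety comment. Confirming that the comment is true requires reading every function in the module that can modify `len`.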

### Using a memory safe by default language with developer best practices, but no automated tooling

Examples:

* Following the careful practices described in the [Rustonomicon](https://doc.rust-lang.org/nomicon/intro.html) when using unsafe blocks in Rust
* Following best practices (LINK NEEDED) when using the Go [unsafe](https://pkg.go.dev/unsafe#pkg-overview) package
* Following [JavaScript Memory Management](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_management) practices

### Using a memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code

Examples:

* Using the [Go Data Race Detector](https://go.dev/doc/articles/race_detector)
* Using other tools such as [govulncheck, fuzzing, and vet](https://go.dev/doc/security/best-practices) when writing Go code

### Using a memory safe by default language with developer best practices and automated tooling to check for memory safety in first party code AND automated tooling to check for memory safety in third party code (dependencies)

TO DO

Review comment:

Fuzzing components (including third-party code) that parse un-trusted data is a good way to find "low-hanging fruit" when auditing third party code is not feasible.
(One popular fuzzer is AFL++.)

@nellshamrell , I think if I were to say much more than this, it would make sense to stick it in its own file in /docs... should I have a go and make a PR?


## FAQ

### Is there a way to be COMPLETELY memory safe?

In theory, yes. In reality, it's very hard. It is theoretically possible to write C++ code in a way that is free from memory safety bugs. In practice, however, we still see a very large number of memory safety related security vulnerabilities every year - known vulnerabilities that could be prevented if the code were written in a memory safe by default language.

This is not to say that using a memory safe by default language will protect you from all vulnerabilities. Saying something is "default" also implies that there are ways of using the language in a non-default way. This is also true of your software's dependencies - for example, your Rust code may be free of unsafe blocks (or, at least, use them sparingly), but an Open Source package you depend on may not be. Evaluating the safety of your software includes evaluating anything your software depends on.

It is also possible that your software written in a memory safe by default language will need to interface with software written in a non-memory safe by default language (for example, Rust code which must interface with a C++ driver).
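
As a rough sketch of what such a boundary can look like, the following Rust code calls the C standard library's `strlen` through the foreign function interface. `strlen` is only a stand-in here for whatever non-memory safe by default code your software must call, and the wrapper name `c_string_length` is hypothetical. The example uses the `unsafe extern` block syntax, which assumes a reasonably recent Rust toolchain.

```rust
use std::ffi::{c_char, CString};

// Declaration of a function implemented in C. The Rust compiler cannot check
// that this signature matches the real one, nor what the function does with
// the pointer, so every call to it is `unsafe`.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper that establishes the preconditions `strlen` relies on.
/// Returns `None` if `text` contains an interior NUL byte.
pub fn c_string_length(text: &str) -> Option<usize> {
    // `CString::new` rejects interior NUL bytes and appends the terminator.
    let owned = CString::new(text).ok()?;
    // SAFETY: `owned` is a valid, NUL-terminated buffer that stays alive
    // for the duration of the call, and `strlen` only reads from it.
    Some(unsafe { strlen(owned.as_ptr()) })
}

fn main() {
    assert_eq!(c_string_length("hello"), Some(5));
    assert_eq!(c_string_length("a\0b"), None);
    println!("ok");
}
```

The memory safety guarantees of the Rust side end at the `unsafe` call. Whether the call is actually safe depends on the foreign code and on the wrapper upholding its contract, which is why such boundaries deserve focused review.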

No matter where your software is on this memory safety continuum, you will still need to exercise some level of personal/professional judgement on what is an acceptable amount of risk and what is not.

### Are you saying we should just re-write billions of lines of C and C++ code in Rust?

No.

This is not a real world possibility. Even if it were, it would be neither practical nor truly helpful.

There are times a rewrite of a particular section of software (such as one where vulnerabilities are particularly prevalent) may be necessary. However, this SIG's focus is not on rewriting the world in Rust (or any other language). Instead, it is on helping developers (and others) understand the memory safety of their code now and how to improve it in meaningful and achievable ways.

### Why do you rank using automated tooling higher than just using developer best practices?

Review comment (Contributor):

this is great!


The amount of software that has already been produced is staggering - and it grows every day. Expecting every developer to manually check their code (and the code their code interacts with) is no longer practical or possible. Automated tooling must be used to catch (at the least) known, common vulnerabilities that a developer may unintentionally introduce in their code.