From 4156e5fc643845a614957e94d860cf2cd0926e91 Mon Sep 17 00:00:00 2001 From: davidrackerby Date: Wed, 18 Oct 2023 23:58:37 -0400 Subject: [PATCH 1/7] Blog post draft --- blogposts/Rackerby/post.md | 123 +++++++++++++++++++++++++++++++++++++ 1 file changed, 123 insertions(+) create mode 100644 blogposts/Rackerby/post.md diff --git a/blogposts/Rackerby/post.md b/blogposts/Rackerby/post.md new file mode 100644 index 00000000..75aaf6ca --- /dev/null +++ b/blogposts/Rackerby/post.md @@ -0,0 +1,123 @@ +# Fantastic pointers and how to `std::launder` them + +C++ has a wide variety of memory-management options, offering many different levels of abstraction. You do so manually using `new` and `delete`, forgoing the need to keep track of how much memory to allocate. You can utilize smart pointers to take advantage of RAII principles to automatically release memory. Even until C++ 23 there was support for [garbage collection](https://en.cppreference.com/w/cpp/memory#Garbage_collector_support_.28until_C.2B.2B23.29). + +These high-level abstractions have made memory-management and its associated bugs easier to work with. Modern C++ compilers have been fine-tuned to generate efficient low-level code from these abstractions, and optimizers aid in this step by making assumptions about the operations programmers are allowed to write. However, as a general-purpose systems programming language, C++ must give the programmer access to all levels of abstraction. This includes low levels that allow the user to write programs that violate assumptions the compiler makes. Here, we consider how `std::launder` acts as a back door when the compiler doesn't know how to handle uses of placement `new`. + +First, a brief overview of placement `new`. The typical call to `operator new` ([full documentation](https://en.cppreference.com/w/cpp/language/new)) is of the form `new (type) initializer`. This syntax both allocates memory and initializes it with the supplied arguments. 
However, there exists another syntax to decouple the allocation from the initialization called placement `new`. + +```cpp +struct Foo { + int bar; + int baz; +} + +// Stack-allocate enough memory to hold an int with the proper alignment +alignas(Foo) unsigned char buf[sizeof(Foo)]; + +Foo* foo_ptr = new(&buf) Foo{1, 2}; // Construct a `Foo` object, placing it into the +// pre-allocated storage at memory address of `buf` +// and returning a pointer to that memory foo_ptr. +``` + +> Side-note: the `alignas` specifier ensures that the byte-boundaries of the buffer are the same as that of a `Foo` ([full documentation](https://en.cppreference.com/w/cpp/language/object#Alignment)). + +There are many reasons outside the scope of this article as to why one would prefer this syntax, but some simple ones are: + +- It's faster to reuse pre-allocated memory than it is to allocate new memory +- When writing code for an embedded system that has memory-mapped hardware, one needs to reuse the same fixed address + +Suppose we tried to access that memory via the first pointer to it with `reinterpret_cast(&buf)->bar`. This is actually undefined behavior, since the underlying type of &buf is `unsigned char*` and doesn't point to a `Foo` object. Even if `&buf` and `foo_ptr` point to the same address, their differing types mean that we cannot safely use them interchangeably. One way to solve this issue is with `std::launder`. Launder has an esoteric definition on [cppreference](https://en.cppreference.com/w/cpp/utility/launder): + +Provenance fence with respect to p. Returns a pointer to the same memory that p points to, but where the referent object is assumed to have a distinct lifetime and dynamic type. 
Formally, given + +- the pointer p represents the address A of a byte in memory +- an object x is located at the address A +- x is within its lifetime +- the type of x is the same as T, ignoring cv-qualifiers at every level +- every byte that would be reachable through the result is reachable through p (bytes are reachable through a pointer that points to an object y if those bytes are within the storage of an object z that is pointer-interconvertible with y, or within the immediately enclosing array of which z is an element). + +Then `std::launder(p)` returns a value of type T* that points to the object x. Otherwise, the behavior is undefined. + +The program is ill-formed if T is a function type or (possibly cv-qualified) void." + +How does this arcane definition apply in this example? Well, + +1. `&buf` points to an address A +2. an object foo is located at A +3. this object is within its lifetime (simply put, memory has been allocated and initialized for it) +4. the type of foo is `Foo` +5. every byte in the returned pointer is reachable through `&buf` + +In order to safely reach the memory through `&buf`, we must wrap the cast in a call to launder: `std::launder(reinterpret_cast(&buf))->bar`. This informs the compiler that we *can* access the memory through that pointer because a call to launder effectively treats that pointer as if it were a freshly made object (similar to a normal call to `new`). The full example is below: + +```cpp +#include + +struct Foo { + int bar; + int baz; +} + +// Stack-allocate enough memory to hold an int with the proper alignment +alignas(Foo) unsigned char buf[sizeof(Foo)]; + +Foo* foo_ptr = new(&buf) Foo{0, 1}; // Construct a `Foo` object, placing it into the +// pre-allocated storage at memory address of `buf` +// and returning a pointer to that memory foo_ptr. 
+ +foo_ptr->bar = 2; // Ok + +reinterpret_cast(&buf)->bar = 3 // Undefined behavior + +std::launder(reinterpret_cast(&buf))->bar = 4 // Ok +``` + +You're likely wondering, "why not just use `foo_ptr`?" There may be scenarios where we call placement `new` without saving its return value to a fresh pointer. Consider another example from cppreference: + +```cpp +struct Base { + virtual int transmogrify(); +}; + +struct Derived : Base { + int transmogrify() override { + new(this) Base; + return 2; + } +}; + +int Base::transmogrify() { + new(this) Derived; + return 1; +} + +static_assert(sizeof(Derived) == sizeof(Base)); + +int main() { + // Case 1: the new object failed to be transparently replaceable because + // it is a base subobject but the old object is a complete object. + Base base; + int n = base.transmogrify(); + // int m = base.transmogrify(); // undefined Behavior + int m = std::launder(&base)->transmogrify(); // OK + assert(m + n == 3); +} +``` + +In this example, the first call to `transmogrify` sneakily change the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time. It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to are different, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address." Since launder doesn't affects its arguments, it's return value must be stored in a variable in order to still avoid the problem that *not storing* the result of placement `new` caused. What's the solution here? Unless we absolutely must use placement `new`, it's a better option to use higher-level memory-management options such as smart pointers. 
And in cases where we *do* have to use placement new, a good way to forego this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. + +## Resources used + +- https://en.cppreference.com/w/cpp/utility/launder +- https://eel.is/c++draft/ptr.launder +- https://en.cppreference.com/w/cpp/language/new +- https://en.cppreference.com/w/cpp/types/byte +- https://stackoverflow.com/questions/63795395/c20-transparently-replaceable-relation +- http://eel.is/c++draft/basic.life#def:transparently_replaceable +- http://eel.is/c++draft/class.derived +- https://stackoverflow.com/questions/18451683/c-disambiguation-subobject-and-subclass-object +- https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0532r0.pdf +- https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html +- https://www.ralfj.de/blog/2020/12/14/provenance.html +- https://stackoverflow.com/questions/222557/what-uses-are-there-for-placement-new From 4446661caca684413c548b023861c874cbde6be7 Mon Sep 17 00:00:00 2001 From: ProtoRiki Date: Sun, 19 Nov 2023 13:12:46 -0500 Subject: [PATCH 2/7] Intermediate commit --- blogposts/Rackerby/post.md | 104 +++++++++++++++++++++++++++---------- 1 file changed, 76 insertions(+), 28 deletions(-) diff --git a/blogposts/Rackerby/post.md b/blogposts/Rackerby/post.md index 75aaf6ca..9358b431 100644 --- a/blogposts/Rackerby/post.md +++ b/blogposts/Rackerby/post.md @@ -4,7 +4,56 @@ C++ has a wide variety of memory-management options, offering many different lev These high-level abstractions have made memory-management and its associated bugs easier to work with. Modern C++ compilers have been fine-tuned to generate efficient low-level code from these abstractions, and optimizers aid in this step by making assumptions about the operations programmers are allowed to write. However, as a general-purpose systems programming language, C++ must give the programmer access to all levels of abstraction. 
This includes low levels that allow the user to write programs that violate assumptions the compiler makes. Here, we consider how `std::launder` acts as a back door when the compiler doesn't know how to handle uses of placement `new`. -First, a brief overview of placement `new`. The typical call to `operator new` ([full documentation](https://en.cppreference.com/w/cpp/language/new)) is of the form `new (type) initializer`. This syntax both allocates memory and initializes it with the supplied arguments. However, there exists another syntax to decouple the allocation from the initialization called placement `new`. +First, a brief overview of placement `new` and transparent replaceability. The familiar call to `operator new` ([full documentation](https://en.cppreference.com/w/cpp/language/new)) is of the form `new (type) (initializer)`. For example: +```cpp +struct Foo { + int bar; + int baz; +} + +// Heap-allocate the memory and initialize the object +Foo* a = new Foo{1, 2}; +``` + This syntax both allocates memory and initializes it with the supplied arguments. However, if one wishes to decouple the memory allocation from its initialization, a different syntax called placement `new` exists for that purpose. Cppreference provides an example of such: + +```cpp +struct C { + int i; + void f(); + const C& operator=(const C&); +}; + +const C& C::operator=(const C& other) +{ + if (this != &other) { + this->~C(); // lifetime of *this ends + new (this) C(other); // new object of type C created + f(); // well-defined + } + return *this; +} + +C c1; +C c2; +c1 = c2; // well-defined +c1.f(); // well-defined; c1 refers to a new object of type C +``` + +In this example, we've avoided allocating new memory, instead reusing the same memory that was allocated for `c1`. The above operations were well-defined because object `c1` was **transparently replaceable** by `c2`. 
+ +According to Cppreference, "If a new object is created at the address that was occupied by another object, then all pointers, references, and the name of the original object will automatically refer to the new object and, once the lifetime of the new object begins, can be used to manipulate the new object, but only if the original object is transparently replaceable by the new object. Object x is *transparently replaceable* by object y if: + +- the storage for y exactly overlays the storage location which x occupied +- y is of the same type as x (ignoring the top-level cv-qualifiers) +- x is not a complete const object + neither x nor y is a base class subobject, or a member subobject declared with [[no_unique_address]](since C++20) +- either + + - x and y are both complete objects, or + - x and y are direct subobjects of objects ox and oy respectively, and ox is transparently replaceable by oy. + + + ```cpp struct Foo { @@ -12,7 +61,7 @@ struct Foo { int baz; } -// Stack-allocate enough memory to hold an int with the proper alignment +// Stack-allocate enough memory to hold a Foo with the proper alignment alignas(Foo) unsigned char buf[sizeof(Foo)]; Foo* foo_ptr = new(&buf) Foo{1, 2}; // Construct a `Foo` object, placing it into the @@ -20,33 +69,40 @@ Foo* foo_ptr = new(&buf) Foo{1, 2}; // Construct a `Foo` object, placing it into // and returning a pointer to that memory foo_ptr. ``` -> Side-note: the `alignas` specifier ensures that the byte-boundaries of the buffer are the same as that of a `Foo` ([full documentation](https://en.cppreference.com/w/cpp/language/object#Alignment)). +> Note: the `alignas` specifier ensures that the byte-boundaries of the buffer are the same as that of a `Foo` ([full documentation](https://en.cppreference.com/w/cpp/language/object#Alignment)). 
-There are many reasons outside the scope of this article as to why one would prefer this syntax, but some simple ones are: +There are many reasons outside the scope of this article as to why one would want to separate allocation from initialization, but some simple ones are: - It's faster to reuse pre-allocated memory than it is to allocate new memory - When writing code for an embedded system that has memory-mapped hardware, one needs to reuse the same fixed address Suppose we tried to access that memory via the first pointer to it with `reinterpret_cast(&buf)->bar`. This is actually undefined behavior, since the underlying type of &buf is `unsigned char*` and doesn't point to a `Foo` object. Even if `&buf` and `foo_ptr` point to the same address, their differing types mean that we cannot safely use them interchangeably. One way to solve this issue is with `std::launder`. Launder has an esoteric definition on [cppreference](https://en.cppreference.com/w/cpp/utility/launder): -Provenance fence with respect to p. Returns a pointer to the same memory that p points to, but where the referent object is assumed to have a distinct lifetime and dynamic type. Formally, given +```cpp +// Defined in + +template< class T > +[[nodiscard]] constexpr T* launder( T* p ) noexcept; // Since C++20 +``` + +"Provenance fence with respect to `p`. Returns a pointer to the same memory that `p` points to, but where the referent object is assumed to have a distinct lifetime and dynamic type. 
Formally, given -- the pointer p represents the address A of a byte in memory -- an object x is located at the address A -- x is within its lifetime -- the type of x is the same as T, ignoring cv-qualifiers at every level -- every byte that would be reachable through the result is reachable through p (bytes are reachable through a pointer that points to an object y if those bytes are within the storage of an object z that is pointer-interconvertible with y, or within the immediately enclosing array of which z is an element). +- the pointer `p` represents the address `A` of a byte in memory +- an object `x` is located at the address `A` +- `x` is within its lifetime +- the type of `x` is the same as `T`, ignoring cv-qualifiers at every level +- every byte that would be reachable through the result is reachable through `p` (bytes are reachable through a pointer that points to an object `y` if those bytes are within the storage of an object `z` that is pointer-interconvertible with `y`, or within the immediately enclosing array of which `z` is an element). -Then `std::launder(p)` returns a value of type T* that points to the object x. Otherwise, the behavior is undefined. +Then `std::launder(p)` returns a value of type `T*` that points to the object `x`. Otherwise, the behavior is undefined. -The program is ill-formed if T is a function type or (possibly cv-qualified) void." +The program is ill-formed if `T` is a function type or (possibly cv-qualified) `void`." How does this arcane definition apply in this example? Well, 1. `&buf` points to an address A -2. an object foo is located at A -3. this object is within its lifetime (simply put, memory has been allocated and initialized for it) -4. the type of foo is `Foo` +2. an object (let's call it `foo`) is located at A +3. this object is within its lifetime (i.e. memory is allocated for it and initialized it has been initialized) +4. the type of `foo` is `Foo` 5. 
every byte in the returned pointer is reachable through `&buf` In order to safely reach the memory through `&buf`, we must wrap the cast in a call to launder: `std::launder(reinterpret_cast(&buf))->bar`. This informs the compiler that we *can* access the memory through that pointer because a call to launder effectively treats that pointer as if it were a freshly made object (similar to a normal call to `new`). The full example is below: @@ -66,14 +122,14 @@ Foo* foo_ptr = new(&buf) Foo{0, 1}; // Construct a `Foo` object, placing it into // pre-allocated storage at memory address of `buf` // and returning a pointer to that memory foo_ptr. -foo_ptr->bar = 2; // Ok +foo_ptr->bar = 2; // Ok, normal access reinterpret_cast(&buf)->bar = 3 // Undefined behavior -std::launder(reinterpret_cast(&buf))->bar = 4 // Ok +std::launder(reinterpret_cast(&buf))->bar = 4 // Ok, treated as a pointer to a fresh object similar ``` -You're likely wondering, "why not just use `foo_ptr`?" There may be scenarios where we call placement `new` without saving its return value to a fresh pointer. Consider another example from cppreference: +You're likely wondering, "why not use `foo_ptr` directly if we already called placement `new`?" There may be scenarios where we call placement `new` without saving its return value to a fresh pointer. Consider another example from cppreference: ```cpp struct Base { @@ -105,19 +161,11 @@ int main() { } ``` -In this example, the first call to `transmogrify` sneakily change the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time. It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to are different, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address." 
Since launder doesn't affects its arguments, it's return value must be stored in a variable in order to still avoid the problem that *not storing* the result of placement `new` caused. What's the solution here? Unless we absolutely must use placement `new`, it's a better option to use higher-level memory-management options such as smart pointers. And in cases where we *do* have to use placement new, a good way to forego this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. +In this example, the first call to `transmogrify` changes the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time. It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to should be the same, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address." Since launder doesn't affects its arguments, its return value must be stored in a variable in order to avoid the problem that **not storing the result of placement `new`** caused. What's the solution here? Unless we absolutely must use placement `new`, it's a better option to use higher-level memory-management options such as smart pointers. In cases where we *must* use placement new, a good way to forego this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. 
Although `std::launder`'s use is niche, its necessity comes about when the compiler cannot ## Resources used - https://en.cppreference.com/w/cpp/utility/launder -- https://eel.is/c++draft/ptr.launder - https://en.cppreference.com/w/cpp/language/new - https://en.cppreference.com/w/cpp/types/byte -- https://stackoverflow.com/questions/63795395/c20-transparently-replaceable-relation -- http://eel.is/c++draft/basic.life#def:transparently_replaceable -- http://eel.is/c++draft/class.derived -- https://stackoverflow.com/questions/18451683/c-disambiguation-subobject-and-subclass-object -- https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0532r0.pdf -- https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html -- https://www.ralfj.de/blog/2020/12/14/provenance.html -- https://stackoverflow.com/questions/222557/what-uses-are-there-for-placement-new +- https://en.cppreference.com/w/cpp/language/lifetime From 1d669ddb1b78748cc6218c140093ee9bf57e4b3c Mon Sep 17 00:00:00 2001 From: ProtoRiki Date: Sun, 19 Nov 2023 14:11:34 -0500 Subject: [PATCH 3/7] Final draft finished --- blogposts/Rackerby/post.md | 83 +++++++++++++++++++++----------------- 1 file changed, 47 insertions(+), 36 deletions(-) diff --git a/blogposts/Rackerby/post.md b/blogposts/Rackerby/post.md index 9358b431..05d1c860 100644 --- a/blogposts/Rackerby/post.md +++ b/blogposts/Rackerby/post.md @@ -2,7 +2,7 @@ C++ has a wide variety of memory-management options, offering many different levels of abstraction. You do so manually using `new` and `delete`, forgoing the need to keep track of how much memory to allocate. You can utilize smart pointers to take advantage of RAII principles to automatically release memory. Even until C++ 23 there was support for [garbage collection](https://en.cppreference.com/w/cpp/memory#Garbage_collector_support_.28until_C.2B.2B23.29). -These high-level abstractions have made memory-management and its associated bugs easier to work with. 
Modern C++ compilers have been fine-tuned to generate efficient low-level code from these abstractions, and optimizers aid in this step by making assumptions about the operations programmers are allowed to write. However, as a general-purpose systems programming language, C++ must give the programmer access to all levels of abstraction. This includes low levels that allow the user to write programs that violate assumptions the compiler makes. Here, we consider how `std::launder` acts as a back door when the compiler doesn't know how to handle uses of placement `new`. +These high-level abstractions have made memory-management and its associated bugs easier to work with. Modern C++ compilers have been fine-tuned to generate efficient low-level code from these abstractions, and optimizers aid in this step by making assumptions about the operations programmers are allowed to write. However, as a general-purpose systems programming language, C++ must give the programmer access to all levels of abstraction. This includes low levels that allow the user to write programs that violate assumptions the compiler makes. Here, we consider how `std::launder` acts as a back door when the compiler doesn't know how to handle certain uses of placement `new`. First, a brief overview of placement `new` and transparent replaceability. The familiar call to `operator new` ([full documentation](https://en.cppreference.com/w/cpp/language/new)) is of the form `new (type) (initializer)`. For example: ```cpp @@ -11,7 +11,7 @@ struct Foo { int baz; } -// Heap-allocate the memory and initialize the object +// Allocate the memory and initialize the object Foo* a = new Foo{1, 2}; ``` This syntax both allocates memory and initializes it with the supplied arguments. However, if one wishes to decouple the memory allocation from its initialization, a different syntax called placement `new` exists for that purpose. 
Cppreference provides an example of such: @@ -39,21 +39,35 @@ c1 = c2; // well-defined c1.f(); // well-defined; c1 refers to a new object of type C ``` -In this example, we've avoided allocating new memory, instead reusing the same memory that was allocated for `c1`. The above operations were well-defined because object `c1` was **transparently replaceable** by `c2`. +We've reused the memory that was already allocated for `c1` instead of allocating new memory. There are many reasons to separate allocation from initialization, most of them outside the scope of this article, but two simple ones are: +- Reusing pre-allocated memory is usually faster than allocating new memory +- When writing code for an embedded system that has memory-mapped hardware, one needs to reuse the same fixed address -According to Cppreference, "If a new object is created at the address that was occupied by another object, then all pointers, references, and the name of the original object will automatically refer to the new object and, once the lifetime of the new object begins, can be used to manipulate the new object, but only if the original object is transparently replaceable by the new object. Object x is *transparently replaceable* by object y if: +The above operations were well-defined because the original object `c1` was **transparently replaceable** by the new `C` object constructed in its storage. 
-- the storage for y exactly overlays the storage location which x occupied -- y is of the same type as x (ignoring the top-level cv-qualifiers) -- x is not a complete const object - neither x nor y is a base class subobject, or a member subobject declared with [[no_unique_address]](since C++20) -- either +According to [Cppreference](https://en.cppreference.com/w/cpp/language/lifetime): +> If a new object is created at the address that was occupied by another object, then all pointers, references, and the name of the original object will automatically refer to the new object and, once the lifetime of the new object begins, can be used to manipulate the new object, but only if the original object is transparently replaceable by the new object. - - x and y are both complete objects, or - - x and y are direct subobjects of objects ox and oy respectively, and ox is transparently replaceable by oy. +> Object `x` is *transparently replaceable* by object `y` if: +> - the storage for `y` exactly overlays the storage location which `x` occupied +> - `y` is of the same type as `x` (ignoring the top-level cv-qualifiers) +> - `x` is not a complete const object +> - neither `x` nor `y` is a base class subobject, or a member subobject declared with `[[no_unique_address]]` (since C++20) +> - either +>   - `x` and `y` are both complete objects, or +>   - `x` and `y` are direct subobjects of objects `ox` and `oy` respectively, and `ox` is transparently replaceable by `oy`. +These requirements make transparent replaceability rather strict. The example was also a tad contrived: why go through all the trouble of writing a special copy assignment operator when `C& c2_ref = c2` works just as well? +Breaking the rules of transparent replaceability allows for more general memory reuse. Recall our `Foo` struct: +```cpp +struct Foo { + int bar; + int baz; +}; +``` +Suppose we wanted to reuse the memory of *anything* with the same size and alignment as a `Foo`. 
We can do that by using an `unsigned char[]` or `std::byte[]`: ```cpp struct Foo { @@ -61,8 +75,9 @@ struct Foo { int baz; } -// Stack-allocate enough memory to hold a Foo with the proper alignment +// Allocate enough memory to hold a Foo with the proper alignment alignas(Foo) unsigned char buf[sizeof(Foo)]; +// alignas(Foo) std::byte buf[sizeof(Foo)]; Foo* foo_ptr = new(&buf) Foo{1, 2}; // Construct a `Foo` object, placing it into the // pre-allocated storage at memory address of `buf` @@ -71,37 +86,30 @@ Foo* foo_ptr = new(&buf) Foo{1, 2}; // Construct a `Foo` object, placing it into > Note: the `alignas` specifier ensures that the byte-boundaries of the buffer are the same as that of a `Foo` ([full documentation](https://en.cppreference.com/w/cpp/language/object#Alignment)). -There are many reasons outside the scope of this article as to why one would want to separate allocation from initialization, but some simple ones are: +Pre-allocating a chunk of bytes is more generic than the first example, but it comes at a cost: the buffer's own address no longer gives safe access to the object stored in it. Suppose we tried to access that memory through the buffer with `reinterpret_cast<Foo*>(&buf)->bar`. 
This is actually undefined behavior, since the underlying type of &buf is `unsigned char*` and doesn't point to a `Foo` object. Even if `&buf` and `foo_ptr` point to the same address, their differing types mean that we cannot safely use them interchangeably. One way to solve this issue is with `std::launder`. Launder has an esoteric definition on [cppreference](https://en.cppreference.com/w/cpp/utility/launder): +Launder has an esoteric definition on [Cppreference](https://en.cppreference.com/w/cpp/utility/launder): ```cpp -// Defined in - -template< class T > -[[nodiscard]] constexpr T* launder( T* p ) noexcept; // Since C++20 +template +[[nodiscard]] constexpr T* launder(T* p) noexcept; // Since C++20 ``` -"Provenance fence with respect to `p`. Returns a pointer to the same memory that `p` points to, but where the referent object is assumed to have a distinct lifetime and dynamic type. Formally, given - -- the pointer `p` represents the address `A` of a byte in memory -- an object `x` is located at the address `A` -- `x` is within its lifetime -- the type of `x` is the same as `T`, ignoring cv-qualifiers at every level -- every byte that would be reachable through the result is reachable through `p` (bytes are reachable through a pointer that points to an object `y` if those bytes are within the storage of an object `z` that is pointer-interconvertible with `y`, or within the immediately enclosing array of which `z` is an element). - -Then `std::launder(p)` returns a value of type `T*` that points to the object `x`. Otherwise, the behavior is undefined. +> "Provenance fence with respect to `p`. Returns a pointer to the same memory that `p` points to, but where the referent object is assumed to have a distinct lifetime and dynamic type. 
Formally, given +> - the pointer `p` represents the address `A` of a byte in memory +> - an object `x` is located at the address `A` +> - `x` is within its lifetime +> - the type of `x` is the same as `T`, ignoring cv-qualifiers at every level +> - every byte that would be reachable through the result is reachable through `p` (bytes are reachable through a pointer that points to an object `y` if those bytes are within the storage of an object `z` that is pointer-interconvertible with `y`, or within the immediately enclosing array of which `z` is an element). +> Then `std::launder(p)` returns a value of type `T*` that points to the object `x`. Otherwise, the behavior is undefined. The program is ill-formed if `T` is a function type or (possibly cv-qualified) `void`." -How does this arcane definition apply in this example? Well, +How does this arcane definition apply in this example? 1. `&buf` points to an address A 2. an object (let's call it `foo`) is located at A -3. this object is within its lifetime (i.e. memory is allocated for it and initialized it has been initialized) +3. this object is within its lifetime (i.e. memory is allocated for it and it has been initialized) 4. the type of `foo` is `Foo` 5. every byte in the returned pointer is reachable through `&buf` @@ -126,10 +134,10 @@ foo_ptr->bar = 2; // Ok, normal access reinterpret_cast(&buf)->bar = 3 // Undefined behavior -std::launder(reinterpret_cast(&buf))->bar = 4 // Ok, treated as a pointer to a fresh object similar +std::launder(reinterpret_cast<Foo*>(&buf))->bar = 4; // Ok, treated as a pointer to a fresh object ``` -You're likely wondering, "why not use `foo_ptr` directly if we already called placement `new`?" There may be scenarios where we call placement `new` without saving its return value to a fresh pointer. Consider another example from cppreference: +You're likely wondering, "why not use `foo_ptr`? We already called placement `new`!" 
There may be scenarios where we call placement `new` without saving its return value to a fresh pointer. Consider another example from Cppreference: ```cpp struct Base { @@ -151,7 +159,7 @@ int Base::transmogrify() { static_assert(sizeof(Derived) == sizeof(Base)); int main() { - // Case 1: the new object failed to be transparently replaceable because + // The new object failed to be transparently replaceable because // it is a base subobject but the old object is a complete object. Base base; int n = base.transmogrify(); @@ -161,7 +169,9 @@ int main() { } ``` -In this example, the first call to `transmogrify` changes the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time. It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to should be the same, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address." Since launder doesn't affects its arguments, its return value must be stored in a variable in order to avoid the problem that **not storing the result of placement `new`** caused. What's the solution here? Unless we absolutely must use placement `new`, it's a better option to use higher-level memory-management options such as smart pointers. In cases where we *must* use placement new, a good way to forego this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. Although `std::launder`'s use is niche, its necessity comes about when the compiler cannot +In this example, the first call to `transmogrify` changes the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time. 
It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to should be the same, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address." Since launder doesn't affect its argument, its return value must be stored in a variable in order to avoid the problem that **not storing the result of placement `new`** caused. + +What's the solution here? Unless we absolutely must use placement `new`, it's likely a better option to let each variable point to its own memory and/or to use higher-level memory-management options like smart pointers. In cases where we *must* use placement new, a good way to forego this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. Although `std::launder`'s use is niche, its necessity comes about when the compiler cannot reason about the memory lifetime of objects. ## Resources used @@ -169,3 +179,4 @@ In this example, the first call to `transmogrify` changes the underlying type of - https://en.cppreference.com/w/cpp/language/new - https://en.cppreference.com/w/cpp/types/byte - https://en.cppreference.com/w/cpp/language/lifetime +- https://en.cppreference.com/w/cpp/language/object \ No newline at end of file From 175ebc7eaef07f2dad87330421f49fb65a4b6e0f Mon Sep 17 00:00:00 2001 From: ProtoRiki Date: Sun, 19 Nov 2023 14:17:51 -0500 Subject: [PATCH 4/7] Minor fix --- blogposts/Rackerby/post.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/blogposts/Rackerby/post.md b/blogposts/Rackerby/post.md index 05d1c860..042fc2ab 100644 --- a/blogposts/Rackerby/post.md +++ b/blogposts/Rackerby/post.md @@ -113,7 +113,12 @@ How does this arcane definition apply in this example? 4. the type of `foo` is `Foo` 5.
every byte in the returned pointer is reachable through `&buf` -In order to safely reach the memory through `&buf`, we must wrap the cast in a call to launder: `std::launder(reinterpret_cast<Foo*>(&buf))->bar`. This informs the compiler that we *can* access the memory through that pointer because a call to launder effectively treats that pointer as if it pointed to a freshly made object (similar to a normal call to `new`). The full example is below: +In order to safely reach the memory through `&buf`, we must wrap the cast in a call to launder: +```cpp +std::launder(reinterpret_cast<Foo*>(&buf))->bar; +``` + +This informs the compiler that we *can* access the memory through that pointer because a call to launder effectively treats that pointer as if it pointed to a freshly made object (similar to a normal call to `new`). The full example is below: ```cpp #include @@ -163,7 +168,7 @@ int main() { // it is a base subobject but the old object is a complete object. Base base; int n = base.transmogrify(); - // int m = base.transmogrify(); // undefined Behavior + // int m = base.transmogrify(); // Undefined Behavior int m = std::launder(&base)->transmogrify(); // OK assert(m + n == 3); } From f6af9a60aa96df4328a859476c0e7a0b36b9504e Mon Sep 17 00:00:00 2001 From: ProtoRiki Date: Sun, 19 Nov 2023 14:21:36 -0500 Subject: [PATCH 5/7] Final change (real) --- blogposts/Rackerby/post.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blogposts/Rackerby/post.md b/blogposts/Rackerby/post.md index 042fc2ab..9e19e2cc 100644 --- a/blogposts/Rackerby/post.md +++ b/blogposts/Rackerby/post.md @@ -176,7 +176,7 @@ int main() { In this example, the first call to `transmogrify` changes the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time.
It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to should be the same, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address." Since launder doesn't affect its argument, its return value must be stored in a variable in order to avoid the problem that **not storing the result of placement `new`** caused. -What's the solution here? Unless we absolutely must use placement `new`, it's likely a better option to let each variable point to its own memory and/or to use higher-level memory-management options like smart pointers. In cases where we *must* use placement new, a good way to forego this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. Although `std::launder`'s use is niche, its necessity comes about when the compiler cannot reason about the memory lifetime of objects. +What's the solution here? Unless we absolutely must use placement `new`, it's likely a better option to let each variable point to its own memory and/or to use higher-level memory-management options like smart pointers. In cases where we *must* use placement new, a good way to forgo this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. Although `std::launder`'s use is niche, its necessity comes about when the compiler cannot reason about the memory lifetime of objects.
## Resources used From 37a5cd86d320c74beb769b8f24985e0802968921 Mon Sep 17 00:00:00 2001 From: ProtoRiki Date: Sun, 19 Nov 2023 14:39:07 -0500 Subject: [PATCH 6/7] Add blurb about base class subobjects --- blogposts/Rackerby/post.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/blogposts/Rackerby/post.md b/blogposts/Rackerby/post.md index 9e19e2cc..11fab6f9 100644 --- a/blogposts/Rackerby/post.md +++ b/blogposts/Rackerby/post.md @@ -57,6 +57,8 @@ According to [Cppreference](https://en.cppreference.com/w/cpp/language/lifetime) > - `x` and `y` are both complete objects, or > - `x` and `y` are direct subobjects of objects `ox` and `oy` respectively, and `ox` is transparently replaceable by `oy`. +Some of these definitions are outside of this article's scope, but the most important non-obvious definition is that of subobjects. You can think of a base class subobject as the part of a derived-class object that holds its base class's members; its storage lives inside the derived object. + These requirements suggest that transparent replaceability is rather strict. The example was also a tad contrived: why go through all the trouble of writing a special copy assignment operator when `C & c2_ref = c2` works just as well? If we break the rules of transparent replaceability, it allows for more general memory-reuse. Recall our `Foo` struct: @@ -173,8 +175,8 @@ int main() { assert(m + n == 3); } ``` - -In this example, the first call to `transmogrify` changes the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time. It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to should be the same, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address."
Since launder doesn't affect its argument, its return value must be stored in a variable in order to avoid the problem that **not storing the result of placement `new`** caused. +Here is another case where transparent replaceability fails: we replace a complete `Base` object with a `Derived` object, so the `Base` that now lives at that address is only a base class subobject of the new `Derived`. +The first call to `transmogrify` changes the underlying type of `base` from `Base` to `Derived`. However, the compiler views `base` as a `Base` object and doesn't know which call to `transmogrify` to use the second time. It assumes that the "pointer" to the memory at `base` and the **actual** type of the memory it points to should be the same, leading to undefined behavior. Once again, a band-aid solution here is to use `std::launder` to tell the compiler "trust me, there really is a valid, freshly-made object at this address." Since launder doesn't affect its argument, its return value must be stored in a variable in order to avoid the problem that **not storing the result of placement `new`** caused. What's the solution here? Unless we absolutely must use placement `new`, it's likely a better option to let each variable point to its own memory and/or to use higher-level memory-management options like smart pointers. In cases where we *must* use placement new, a good way to forgo this indirection is to save the result of placement `new` somewhere since we'll need to eventually call `std::launder` if we do not. Although `std::launder`'s use is niche, its necessity comes about when the compiler cannot reason about the memory lifetime of objects.
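
The closing advice in this patch (keep the pointer that placement `new` returns, and reach for `std::launder` only when that pointer was not kept) can be sketched roughly as follows. This is a minimal illustration, not code from the post: the helper names `sum_via_saved_pointer` and `sum_via_launder` are hypothetical, `Foo` mirrors the struct used earlier in the article, and C++17 or later is assumed for `std::launder`.

```cpp
#include <cassert>
#include <new>

struct Foo {
    int bar;
    int baz;
};

// Hypothetical helper: keep the pointer returned by placement new,
// so no laundering is needed to reach the new object.
int sum_via_saved_pointer() {
    alignas(Foo) unsigned char buf[sizeof(Foo)];
    Foo* foo_ptr = new (buf) Foo{1, 2};  // construct into buf, save the result
    int sum = foo_ptr->bar + foo_ptr->baz;
    foo_ptr->~Foo();                     // end the lifetime explicitly
    return sum;
}

// Hypothetical helper: the result of placement new is discarded, so the
// only handle left is the buffer itself; std::launder recovers a usable Foo*.
int sum_via_launder() {
    alignas(Foo) unsigned char buf[sizeof(Foo)];
    new (buf) Foo{3, 4};                 // return value intentionally dropped
    Foo* foo_ptr = std::launder(reinterpret_cast<Foo*>(buf));
    int sum = foo_ptr->bar + foo_ptr->baz;
    foo_ptr->~Foo();
    return sum;
}
```

Both helpers read the same kind of object; the first simply never loses track of it, which is why saving the pointer is usually the less surprising choice.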
From ed1e0f9a94ae2e165306c207d59f5506863d84bc Mon Sep 17 00:00:00 2001 From: ProtoRiki Date: Sun, 19 Nov 2023 14:44:30 -0500 Subject: [PATCH 7/7] Placement new syntax --- blogposts/Rackerby/post.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/blogposts/Rackerby/post.md b/blogposts/Rackerby/post.md index 11fab6f9..6a21796e 100644 --- a/blogposts/Rackerby/post.md +++ b/blogposts/Rackerby/post.md @@ -4,7 +4,7 @@ C++ has a wide variety of memory-management options, offering many different lev These high-level abstractions have made memory-management and its associated bugs easier to work with. Modern C++ compilers have been fine-tuned to generate efficient low-level code from these abstractions, and optimizers aid in this step by making assumptions about the operations programmers are allowed to write. However, as a general-purpose systems programming language, C++ must give the programmer access to all levels of abstraction. This includes low levels that allow the user to write programs that violate assumptions the compiler makes. Here, we consider how `std::launder` acts as a back door when the compiler doesn't know how to handle certain uses of placement `new`. -First, a brief overview of placement `new` and transparent replaceability. The familiar call to `operator new` ([full documentation](https://en.cppreference.com/w/cpp/language/new)) is of the form `new (type) (initializer)`. For example: +First, a brief overview of placement `new` and transparent replaceability. The familiar call to `operator new` ([full documentation](https://en.cppreference.com/w/cpp/language/new)) is of the form `new <type> <initializer>`. For example: ```cpp struct Foo { int bar; @@ -14,7 +14,11 @@ struct Foo { // Allocate the memory and initialize the object Foo* a = new Foo{1, 2}; ``` - This syntax both allocates memory and initializes it with the supplied arguments.
However, if one wishes to decouple the memory allocation from its initialization, a different syntax called placement `new` exists for that purpose. Cppreference provides an example of such: + This syntax both allocates memory and initializes it with the supplied arguments. However, if one wishes to decouple the memory allocation from its initialization, a different syntax called placement `new` exists for that purpose. +```cpp +new (address_to_store_memory_at) <type> <initializer> +``` + Cppreference provides an example of such: ```cpp struct C {