Skip to content

A performant recreation of the C++ library containers

License

Notifications You must be signed in to change notification settings

jmz-mzr/containers

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

58 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Containers


A cargo ship carrying containers on the sea, its name written on its side is CPP

Table of Contents

Overview

“Containers“ is a C++ project aimed at recreating a handful of containers from the C++ standard library: vector, stack, map, and set.
This project is not just about coding; it's about delving deep into the essence of C++, crafting performance-efficient, memory-safe containers with a clean coding style.

To make it fun & useful, two details matter:

  • The containers must follow the C++98 standard, and re-implement every single feature they had back then, even deprecated ones. No C++11, no external libraries, just pure & beautiful basics.
  • They need to be memory safe, efficient, and closely mirror or outperform the performances of the STD library. For map & set, that means implementing a Red / Black binary search tree.

To achieve our goal, what best start other than digging into the LIBSTDC++ and LIBC++ codebases?
Being able to read and make sense of complex code is an essential skill — and it sometimes leads to little gems (see my findings on stack).

A word about Coding Style

C++ is more verbose than C. However, being a strong advocate for Clean Code & Clean Architecture practices — and to pay homage to the 42 School’s Norminette coding style — I made a point on delivering an explicit, clean, readable and well organised code. Have a look at vector for a start.


Key Features

  • Complete Implementation: Faithfully recreated vector, stack, map, and set as per C++98 standards
  • High Performance: Performances even better than standard library containers
  • Memory Safety: Carefully crafted to prevent memory leaks and ensure safe memory operations
  • Red / Black Tree: Advanced Red / Black binary search tree implementation for map and set
  • Coding Style: Emphasis on clean, self-explanatory, readable and organized code

Testing

Nothing can be great until it’s well tested!

make test

Two testers are provided: the first is an introduction one, made by 42.
The other is a comprehensive test suite, running small and then memory-heavy operations on both our containers (in the ‘ft’ namespace), and the containers in the STD library.
The output and the time performances are compared: the output must be the same, and the time performances as well (Kudos, ours are even faster! 🎉)

Everything is tested, even the subtle requirements, for volatile, const and non-const versions:

  • Containers created empty (even with types that cannot be default constructed)
  • Working with custom allocators
  • Constructors and destructors
  • Typedefs
  • Member operators
  • Iterators
  • Element access functions
  • Capacity functions
  • Modifiers functions
  • Lookup functions
  • Observers functions
  • Non-member functions & operators

Testers.mp4

Core Concepts

Core C++ concepts and techniques — SFINAE, Template Metaprogramming, etc — are the heart of the containers. They are illustrated in the recreation of some handy tools: enable_if, is_integral, iterator_traits, reverse_iterator, is_integral, equal, lexicographical_compare, pair, and make_pair.

Some are hard to grasp, and it’s easier with examples: take a look at the tests written for them here.

Useful links:


Vector

template <
    typename T,
    class Allocator = std::allocator<T>
> class vector;

Vectors represent arrays that can change in size. They use contiguous storage locations, so their elements can be accessed with offsets on regular pointers as efficiently as in arrays. But their size can change dynamically, and their storage is handled automatically.

Compared to arrays, vectors consume more space, with extra memory allocated for future growth. But they grow in an efficient way: they reallocate only when their memory is exhausted. Reallocations happen at logarithmically growing intervals of their size, so insertions at the end are provided with amortized constant time complexity.

The total amount of allocated memory can be queried using capacity().
The reserve() function can be used to eliminate reallocations if the number of elements is known beforehand.

Like arrays, vectors are very efficient at accessing elements, and adding / removing elements from their end.
For inserting / removing elements other than at the end, containers like deque or list perform better.

Internally, the implementation leverages SFINAE (e.g. to recognize pointers in insert() or assign()), the allocator’s allocate() & deallocate() functions, or the std::uninitialized_fill(), std::uninitialized_copy(), and std::copy_backward() functions.
As with the other containers, a clever architecture allows for core private functions (like _construct_at_end(), _reallocate()) to be reused in other private & public ones.

Useful links:


Stack

template <
    typename T,
    class Container = ft::vector<T>
> class stack;

Stacks are container adaptors, a LIFO (last-in-first-out) data structure where elements are inserted and extracted only from one end. Stacks wrap an underlying container (by default here a vector), and push and pop the elements from the back of the container, known as the top of the stack.

An interesting finding

Stacks leverage two comparison operators to achieve all comparisons: operator== and operator<.
As stacks are template containers, in order for the compiler & linker to find the instantiated operators definitions, we either:

  • Forward declare both the class and the operators templates, before declaring the operators as friend function template specializations (if not defining them directly inside the class template body)
  • Or we declare all the instantiations of the operator templates as friend

These two methods have different implications, well detailed here.
One of them allows the operator to access a stack of any type, not just the T type of the stack currently compared!

Looking at their codebase, we observe that LIBC++ and LIBSTDC++ are not protected against this resulting potential vulnerability. Even more interesting, when we go back in LIBSTDC++ commit history, we see that the most secure way to grant this friendship was there, commented, before being partially, then fully deleted later.

It might be hard to exploit, however people tend to get smart when trying to hack stuff ;)
Launch the tester to see a live demo.

Useful links:


Map

template <
    typename Key,
    typename T,
    class Compare = std::less<Key>,
    class Allocator = std::allocator<ft::pair<const Key, T> >
> class map;

Maps are associative containers, storing elements as pairs of key value & mapped value, with unique keys.

Key values are sorted using the Compare function. The mapped values store the content associated to a key.
The content can be accessed directly with the corresponding key using the bracket operator[] (which creates a new element if the key doesn't already exist).

Since the elements are sorted, search, removal, and insertion operations have logarithmic complexity.
Maps are typically implemented as binary search trees — here it is a Red / Black tree.

Useful links:


Set

template <
    typename Key,
    class Compare = std::less<Key>,
    class Allocator = std::allocator<Key>
> class set;

Sets are containers that store unique elements of type Key.

The elements are sorted using the Compare function. They cannot be modified once in the container (the values are always const), but they can be inserted and removed from the container.

Since the elements are sorted, search, removal, and insertion operations have logarithmic complexity.
Sets are typically implemented as binary search trees — here it is a Red / Black tree.

Useful links:


The Red / Black Tree

Map and Set both rely on an underlying Red / Black binary search tree.

It is a self-balancing binary tree, where each node has an extra information of color (red or black) used in the insertion & removal algorithms to satisfy the tree’s core properties.
These algorithms keep the tree balanced, and ensure that the time complexity for insertion, deletion, and searching is always O(log n).

Internally, the tree’s root is the left child of its end_node. So if root != NULL, end_node->left points to root, and root->parent points to end_node.
To optimize memory usage and minimize complexity, the node_base class manages the pointers and the color bit, while the deriving node class contains the actual node’s value.

The nature of the tree allows the use of recursion, for example in _structural_copy(), in _destroy() or in _print(). This print function is used to visualize the content of the tree, especially in the context of...

The tree tester:

make tree && ./_tree_tests

A custom tester, showing what the current tree looks like, along with the expected results.
As with the containers’ tester, everything is tested, from the ability to create an empty tree, with custom allocators, to all the member functions & operators.

Tree.Tester.mp4

Documentation