-
Standard Library
-
"The C++ Standard Library is a collection of classes and functions, which are written in the core language and part of the C++ ISO Standard itself." Wikipedia
-
Learning how to use the Standard Library is an important part of becoming a proficient C++ software engineer. In almost all cases, it is preferable to use functionality that already exists in the Standard Library instead of implementing it from scratch. This is both because development with the Standard Library is faster (it is well-documented) and because many expert software engineers have worked on it. Standard Library facilities are optimized and robust, and they are almost always as fast as or faster than an initial re-implementation of the same functionality.
-
In fact, guideline SL.1 of the C++ Core Guidelines is:
-
Use libraries wherever possible
-
Reason Save time. Don’t re-invent the wheel. Don’t replicate the work of others. Benefit from other people’s work when they make improvements. Help other people when you make improvements.
-
-
And guideline SL.2 is:
-
Prefer the standard library to other libraries
-
Reason More people know the standard library. It is more likely to be stable, well-maintained, and widely available than your own code or most other libraries.
-
-
-
Namespace
- Standard Library functions and classes exist in the std:: namespace. std::vector, for example, refers to the vector class within the Standard Library. Typically, in order to use a Standard Library feature we must both include the necessary header file (e.g. #include <vector>) and qualify the name with std:: (e.g. std::vector).
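A minimal sketch of this pattern, assuming a small standalone program:
#include <iostream>
#include <vector>

int main() {
    // std::vector is declared in <vector> and lives in the std namespace.
    std::vector<int> values {1, 2, 3};
    std::cout << values.size() << "\n";  // prints 3
}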
-
Compilers
-
C++ is a compiled programming language, which means that programmers use a program to compile their human-readable source code into machine-readable object and executable files. The program that performs this task is called a compiler.
-
C++ does not have an "official" compiler. Instead, there are many different compilers that a programmer can use.
-
GNU Compiler Collection (GCC)
- In this program we primarily use the GNU Compiler Collection, which is a popular, open-source, cross-platform compiler from the larger GNU Project. In particular, we use the
g++
program, which is a command line executable that compiles C++ source code and automatically links the C++ Standard Library.
-
LLVM (low level virtual machine)
-
LLVM is a newer compiler infrastructure; you can read more on the LLVM Project website.
-
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Despite its name, LLVM has little to do with traditional virtual machines. The name "LLVM" itself is not an acronym; it is the full name of the project.
-
LLVM began as a research project at the University of Illinois, with the goal of providing a modern, SSA-based compilation strategy capable of supporting both static and dynamic compilation of arbitrary programming languages. Since then, LLVM has grown to be an umbrella project consisting of a number of subprojects, many of which are being used in production by a wide variety of commercial and open source projects as well as being widely used in academic research. Code in the LLVM project is licensed under the "Apache 2.0 License with LLVM exceptions"
-
Dynamic compiling means that the language is compiled to machine code while the program is being executed, not before. This allows, for example, just-in-time optimization - the code is optimized while the application is running. A JIT optimizer has the advantage that it has much more reliable information about which branches of the code are used most often and how they are usually used, because it can observe the application in action before applying optimizations.
-
Dynamic compilation is a problem for automatic benchmarking, because multiple measurements of the same section of code may be measuring completely different machine code: the optimizer may have decided to change the implementation between two runs.
-
-
-
Linking
-
In order to use classes and functions from the C++ Standard Library, the compiler must have access to a compiled version of the standard library, stored in object files. Most compiler implementations, including GCC, include those object files as part of the installation process. In order to use the Standard Library facilities, the compiler must "link" the standard library object files to the object files created from the programmer's source code.
-
Once the compiler links together the necessary object files, it is able to generate a standalone executable file that can run on the operating system.
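As a rough sketch of what these steps look like on the command line (the file names main.cpp, main.o, and my_program are placeholders):
# Compile a source file into an object file without linking.
g++ -c main.cpp -o main.o

# Link the object file into an executable; g++ links the C++ Standard Library automatically.
g++ main.o -o my_program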
-
-
Build Tools
-
Make and CMake are two separate but similar build tools that both serve to help simplify the process of building software.
-
In particular, build tools automate the process of compiling multiple source code files into object files, linking those object files together, and generating an executable. Build tools also often automate the process of determining which files have changed since the last build and thus need to be recompiled.
-
Make
-
- GNU Make is a widely-used build tool that relies on Makefiles to automate the process of building a project.
- A Makefile typically includes one or more "targets". Each target performs a different action.
- build is a common target name that is configured in the Makefile to compile all of the project's source code into an executable file. clean, on the other hand, is a common target to delete all object files and other artifacts of the build process, resulting in a clean, unbuilt project state.
- Running either make build or make clean (or any other target) on the command line would cause Make to search for a local Makefile, search for a matching target within that Makefile, and then execute the target.
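A minimal Makefile along these lines might look like the following sketch (the compiler flags and file names are illustrative assumptions, not taken from a specific project):
# Makefile (recipe lines must be indented with a tab character)
build:
	g++ -std=c++17 *.cpp -o my_program

clean:
	rm -f my_program *.o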
-
-
CMake
-
- CMake is a build tool that facilitates cross-platform builds, so that it is straightforward to build the same source code on Linux, macOS, Windows, or any other operating system. CMake relies on a CMakeLists.txt file, which configures appropriate cross-platform targets.
- Building a CMakeLists.txt file can be a bit daunting, but CMake provides a helpful tutorial.
- In this Nanodegree program, you will not need to build your own Makefiles or CMakeLists.txt files. We provide the appropriate configuration files for each project and instruct you as to their usage.
-
-
-
Installation
-
You are welcome to write all of your code in Udacity's web-based Workspaces. If, however, you prefer to work locally on your machine, you will need to install certain software.
-
- llvm and clang (this tutorial uses this compiler)
- g++, gdb, make
-
MacOS
-
macOS includes g++ as part of Command Line Tools.
-
Launch Terminal, which can be found in the Utilities folder in Applications.
-
Type
xcode-select --install
into the Terminal window and press "Enter" -
If you don't already have Xcode or Command Line Tools installed, a window will pop up. Press the Install button.
-
Verify: Type g++ into Terminal and press enter. If the output is clang: error: no input files, then the installation was successful.
-
-
Linux
-
These programs are typically available through the default package manager for each Linux distribution. For example, we can use APT on Ubuntu systems.
sudo apt update
sudo apt install build-essential
sudo apt install gdb
-
-
Windows
- MinGW provides the necessary software.
- Proceed from Section 3.2 of these linked instructions.
-
-
-
Style
-
A consistent style (hopefully) helps make your code more readable.
-
There are many different C++ styles, none of which is authoritative.
-
ClangFormat
-
- clang-format is a command line text formatter that automatically reformats source code according to a configurable set of policies. The tool includes several pre-configured styles, or you can create your own.
- clang-format is an open-source application that you can install on your system, or it is straightforward to install as a Visual Studio Code extension. On macOS it can be installed with Homebrew:
brew install clang-format
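Typical command line usage looks like this (the file name main.cpp and the chosen preset styles are just examples):
# Reformat main.cpp in place using the built-in LLVM style.
clang-format -i -style=llvm main.cpp

# Write a configuration file based on the Google preset; the tool
# will pick up .clang-format automatically on later runs.
clang-format -style=google -dump-config > .clang-format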
-
-
-
Debugging
-
Debugging is an important part of software development! Therefore, learning how to use a debugger is an important part of becoming a software developer 😬
-
Debuggers
-
Debuggers are tools that allow you to pause the execution of your code in various locations, inspect the state of the program, and step through your code line-by-line.
-
GDB and LLDB are two popular, open-source debuggers for C++. Integrating them into a code editor often makes debugging easier.
-
In order to use Visual Studio Code's debugger with C++ files, you must install the free C/C++ extension.
- Remember to compile your code with debug symbols using the -g flag:
clang++ -std=c++20 -g hello.cpp -o a.out
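Once the executable is built with -g, a typical debugger session looks roughly like this (shown with lldb, since the example above uses clang++; gdb accepts very similar commands, and the variable name i is just an illustrative assumption):
$ lldb ./a.out                       # start the debugger with the executable
(lldb) breakpoint set --name main    # pause execution when main is entered
(lldb) run                           # run the program until the breakpoint
(lldb) next                          # step over the current line
(lldb) print i                       # inspect a variable (assumes a variable named i)
(lldb) continue                      # resume execution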
-
-
-
#include <iostream>
- The #include is a preprocessor command which is executed before the code is compiled. It searches for the iostream header file and pastes its contents into the program. iostream contains the declarations for the input/output stream objects.
-
using std::cout;
- Namespaces are a way in C++ to group identifiers (names) together. They provide context for identifiers to avoid naming collisions. The std namespace is the namespace used for the standard library.
- The using declaration adds std::cout to the global scope of the program. This way you can use cout in your code instead of having to write std::cout.
- cout is an output stream you will use to send output to the notebook or to a terminal, if you are using one.
- Note that the second two lines in the example end with a semicolon ;. Coding statements end with a semicolon in C++. The #include statement is a preprocessor command, so it doesn't need one.
-
cout << "Hello!" << "\n";
- In this line, the code is using cout to send output to the notebook. The << operator is the stream insertion operator, and it writes what's on the right side of the operator to the left side. So in this case, "Hello!" is written to the output stream cout.
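Putting these pieces together, the complete minimal program discussed above looks like this:
#include <iostream>
using std::cout;

int main() {
    // Send the string to the standard output stream.
    cout << "Hello!" << "\n";
}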
C++ has several "primitive" variable types, which are things like ints (integers), strings, floats, and others. These should be similar to variable types in other programming languages you have used.
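For example, declaring and initializing a few of these types might look like the following short sketch:
#include <iostream>
#include <string>
using std::cout;
using std::string;

int main() {
    int i = 3;           // an integer
    float f = 4.5;       // a floating point number
    string s = "hello";  // a string (from the Standard Library)
    cout << i << " " << f << " " << s << "\n";
}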
-
In the previous concept, you learned about some of the primitive types that C++ offers, including strings and ints, and you learned how to store these types in your program. In this concept, you will learn about one of the most common data structures in C++: the vector.
-
C++ also has several container types that can be used for storing data. We will start with vectors, as these will be used throughout this lesson, but we will also introduce other container types as needed.
Vectors are a sequence of elements of a single type, and have useful methods for getting the size, testing if the vector is empty, and adding elements to the vector.
-
Check
1-Foundations/2-vector
-
Unfortunately, there isn't a built-in way to print vectors in C++ using cout. You will learn how to access vector elements and you will write your own function to print vectors later. For now, you can see how vectors are created and stored. Below, you can see how to nest vectors to create 2D containers.
Check
1-Foundations/2-vector
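A short sketch of creating vectors, including a nested (2D) vector, might look like this:
#include <vector>
using std::vector;

int main() {
    // A one-dimensional vector of ints.
    vector<int> v_1 {0, 1, 2};

    // A two-dimensional vector: a vector whose elements are themselves vectors.
    vector<vector<int>> v_2 {{1, 2}, {3, 4}};
}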
-
You may have noticed comments in some of the code up until this point. C++ provides two kinds of comments:
-
// You can use two forward slashes for single line comments.

/*
For longer comments, you can enclose the text with an opening
slash-star and closing star-slash.
*/
-
-
You have now seen how to store basic types and vectors containing those types. As you practiced declaring variables, in each case you indicated the type of the variable. It is possible for C++ to do automatic type inference, using the auto keyword.
- Check 1-Foundations/3-auto
-
It is helpful to manually declare the type of a variable if you want the variable type to be clear for the reader of your code, or if you want to be explicit about the number precision being used; C++ has several number types with different levels of precision, and this precision might not be clear from the value being assigned.
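A brief sketch contrasting automatic type inference with explicit declarations:
#include <vector>
using std::vector;

int main() {
    // The compiler infers the types of i and v from their initializers.
    auto i = 5;                     // deduced as int
    auto v = vector<int>{1, 2, 3};  // deduced as vector<int>

    // Explicit declarations make the intended type and precision clear.
    double d = 5.0;
    vector<double> w {1.0, 2.0, 3.0};
}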
-
-
In order to write the A* search algorithm, you will need a grid or "board" to search through. We'll be working with this board throughout the remaining exercises, and we'll start by storing a hard-coded board in the main function. In later exercises, you will write code to read the board from a file.
-
Note: you will need to include the vector library, just as iostream is included. You will also need a using declaration for std::vector if you want to write vector rather than std::vector in your code.
This exercise will be ungraded, but if you get stuck, you can find the solution in solution.cpp. Finally, if you feel a little crowded in the editor below and need more space to work, you can click the "Expand" button in the lower left corner.
-
Check
1-Foundations/4-grid
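A hard-coded board stored in main might look roughly like this (the particular values are only illustrative; the exercise files define the actual board):
#include <iostream>
#include <vector>
using std::cout;
using std::vector;

int main() {
    // A small board where 0 marks open space and 1 marks an obstacle.
    vector<vector<int>> board {
        {0, 1, 0, 0, 0, 0},
        {0, 1, 0, 0, 0, 0},
        {0, 1, 0, 0, 0, 0},
        {0, 1, 0, 0, 0, 0},
        {0, 0, 0, 0, 1, 0}
    };
}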
-
-
Just as in other languages you've worked with, C++ has both for loops and while loops. You will learn about for loops in the notebook below, and you will see while loops later in the course.
-
Check 1-Foundations/5-loops
If you haven't seen the ++ operator before, this is the post-increment operator, and it is where the ++ in the name "C++" comes from. The operator increments the value of i.
There is also a pre-increment operator which is used before a variable, as well as pre- and post-decrement operators: --. The difference between pre and post lies in what value is returned by the operator when it is used.
You will only use the post-increment operator i++ for now, but if you are curious, see the notebook below for an explanation of the code:
Check 1-Foundations/5-loops
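For reference, a simple index-based for loop using the post-increment operator looks like this:
#include <iostream>
using std::cout;

int main() {
    // Loop from 0 to 4; i++ increments i after each iteration.
    for (int i = 0; i < 5; i++) {
        cout << i << "\n";
    }
}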
C++ offers several ways to iterate over containers. One way is to use an index-based loop as above. Another way is using a "range-based loop", which you will see frequently in the rest of this course. See the following code for an example of how this works:
-
#include <iostream>
#include <vector>
using std::cout;
using std::vector;

int main() {
    // Add your code here.
    vector<int> a {1, 2, 3, 4, 5};
    for (int i : a) {
        cout << i << "\n";
    }
}
In the cell below, there is a simple function to add two numbers and return the result. Test the code below, and click the button for a more in-depth explanation.
-
#include <iostream>
using std::cout;

// Function declared and defined here.
int AdditionFunction(int i, int j) {
    return i + j;
}

int main() {
    auto d = 3;
    auto f = 7;
    cout << AdditionFunction(d, f) << "\n";
}
-
In C++, you can use the std::ifstream object to handle input file streams. To do this, you will need to include the header file that provides the file streaming classes: <fstream>.
Once the <fstream> header is included, a new input stream object can be declared and initialized using a file path path:
std::ifstream my_file;
my_file.open(path);
-
-
Alternatively, the declaration and initialization can be done in a single line as follows:
std::ifstream my_file(path);
-
C++ ifstream objects can also be used as a boolean to check if the stream has been created successfully. If the stream was initialized successfully, then the ifstream object evaluates to true. If there was an error opening the file or some other error creating the stream, then the ifstream object evaluates to false.
The following cell creates an input stream from the file
"files/1.board"
: -
#include <fstream> #include <iostream> #include <string> int main() { std::ifstream my_file; my_file.open("files/1.board"); if (my_file) { std::cout << "The file stream has been created!" << "\n"; } }
-
If the input file stream object has been successfully created, the lines of the input stream can be read using the
getline
method. In the cell below, a while loop has been added to the previous example to get each line from the stream and print it to the console. -
#include <fstream> #include <iostream> #include <string> int main() { std::ifstream my_file; my_file.open("files/1.board"); if (my_file) { std::cout << "The file stream has been created!" << "\n"; std::string line; while (getline(my_file, line)) { std::cout << line << "\n"; } } }
-
In C++ strings can be streamed into temporary variables, similarly to how files can be streamed into strings. Streaming a string allows us to work with each character individually.
-
One way to stream a string is to use an input string stream object, istringstream, from the <sstream> header.
Once an istringstream object has been created, parts of the string can be streamed and stored using the "extraction operator": >>. The extraction operator will read until whitespace is reached or until the stream fails. Execute the following code to see how this works:
#include <iostream>
#include <sstream>
#include <string>
using std::istringstream;
using std::string;
using std::cout;
int main ()
{
string a("j 2 3");
istringstream my_stream(a);
char n;
my_stream >> n;
cout << n << "\n";
}
- The istringstream object can also be used as a boolean to determine if the last extraction operation failed - this happens if there wasn't any more of the string to stream, for example. If the stream still has more characters, you are able to stream again. See the following code for an example of using the istringstream this way:
#include <iostream>
#include <sstream>
#include <string>
using std::istringstream;
using std::string;
using std::cout;
int main()
{
string a("1 2 3");
istringstream my_stream(a);
int n;
// Testing to see if the stream was successful and printing results.
while (my_stream) {
my_stream >> n;
if (my_stream) {
cout << "That stream was successful: " << n << "\n";
}
else {
cout << "That stream was NOT successful!" << "\n";
}
}
}
- In the previous exercises, you have declared and initialized vectors, and you have also accessed vector elements. In order to make full use of vectors in your code though, you will need to be able to add additional elements to them. Have a look at the following notebook for examples of how to do this.
- Now that you are able to process a string, you may want to store the results of the processing in a convenient container for later use. In the next exercise, you will store the streamed ints from each line of the board in a vector<int>. To do this, you will add the ints to the back of the vector, using the vector method push_back:
#include <vector>
#include <iostream>
using std::vector;
using std::cout;
int main() {
// Initial Vector
vector<int> v {1, 2, 3};
// Print the contents of the vector
for (int i=0; i < v.size(); i++) {
cout << v[i] << "\n";
}
// Push 4 to the back of the vector
v.push_back(4);
// Print the contents again
for (int i=0; i < v.size(); i++) {
cout << v[i] << "\n";
}
}
-
In the previous exercises, you stored and printed the board as a vector<vector<int>>, where only two states were used for each cell: 0 and 1. This is a great way to get started, but as the program becomes more complicated, there will be more than two possible states for each cell. Additionally, it would be nice to print the board in a way that clearly indicates open areas and obstacles, just as the board is printed above.
-
To do this clearly in your code, you will learn about and use something called an enum. An enum, short for enumeration, is a way to define a type in C++ with values that are restricted to a fixed range. For an explanation and examples, see the notebook below.
-
#include <iostream> using std::cout; int main() { enum class Direction {kUp, kDown, kLeft, kRight}; Direction a = Direction::kUp; switch (a) { case Direction::kUp : cout << "Going up!" << "\n"; break; case Direction::kDown : cout << "Going down!" << "\n"; break; case Direction::kLeft : cout << "Going left!" << "\n"; break; case Direction::kRight : cout << "Going right!" << "\n"; break; } }
-
Motion Planning
- The next videos and quizzes are taught by Sebastian Thrun (Udacity's former CEO) and they come from one of Udacity's first courses. The production style is a little different from what you will see in the rest of the course, but the content is very good. In these videos, Sebastian will discuss motion planning in robotics and provide the conceptual foundation for the project that you will build.
-
Pass by Reference
-
In the previous exercises, you've written functions that accept and return various kinds of objects. However, in all of the functions you've written so far, the objects returned by the function are different from the objects provided to the function. In other words, when the function is called on some data, a copy of that data is made, and the function operates on a copy of the data instead of the original data. This is referred to as pass by value, since only a copy of the object's values is passed to the function, and not the actual object itself.
-
In the following example, the value of int i is passed to the function MultiplyByTwo. Look carefully at the code and try to guess what the output will be before you execute it. When you are finished executing, click the button for an explanation.
#include <iostream>
using std::cout;

int MultiplyByTwo(int i) {
    i = 2*i;
    return i;
}

int main() {
    int a = 5;
    cout << "The int a equals: " << a << "\n";
    int b = MultiplyByTwo(a);
    cout << "The int b equals: " << b << "\n";
    cout << "The int a still equals: " << a << "\n";
}
-
In the code above, a is passed by value to the function, so the variable a is not affected by what happens inside the function.
But what if we wanted to change the value of
a
itself? For example, it might be that the variable you are passing into a function maintains some state in the program, and you want to write the function to update that state. -
It turns out, it is possible to modify a from within the function. To do this, you must pass a reference to the variable a, instead of the value of a. In C++, a reference is just an alternative name for the same variable.
To pass by reference, you simply need to add an ampersand
&
before the variable in the function declaration. Try the code below to see how this works: -
#include <iostream>
using std::cout;

int MultiplyByTwo(int &i) {
    i = 2*i;
    return i;
}

int main() {
    int a = 5;
    cout << "The int a equals: " << a << "\n";
    int b = MultiplyByTwo(a);
    cout << "The int b equals: " << b << "\n";
    cout << "The int a now equals: " << a << "\n";
}
-
In the code above, a is passed by reference to the function MultiplyByTwo since the argument to MultiplyByTwo is a reference: &i. This means that i becomes another name for whatever variable is passed into the function. When the function changes the value of i, then the value of a is changed as well.
#include <iostream>
#include <string>
using std::cout;
using std::string;

void DoubleString(string value) {
    // Concatenate the string with a space and itself.
    value = value + " " + value;
}

int main() {
    string s = "Hello";
    cout << "The string s is: " << s << "\n";
    DoubleString(s);
    cout << "The string s is now: " << s << "\n";
}
-
-
C++ supports two notions of immutability:
- const: meaning roughly "I promise not to change this value." ... The compiler enforces the promise made by const. ...
- constexpr: meaning roughly "to be evaluated at compile time." This is used primarily to specify constants...
#include <iostream>

int main() {
    int i;
    std::cout << "Enter an integer value for i: ";
    std::cin >> i;
    const int j = i * 2;  // "j can only be evaluated at run time."
                          // "But I promise not to change it after it is initialized."

    constexpr int k = 3;  // "k, in contrast, can be evaluated at compile time."

    std::cout << "j = " << j << "\n";
    std::cout << "k = " << k << "\n";
}
-
The major difference between const and constexpr, though, is that constexpr must be evaluated at compile time.
The compiler will catch a constexpr variable that cannot be evaluated at compile time.
#include <iostream>

int main() {
    int i;
    std::cout << "Enter an integer value for i: ";
    std::cin >> i;
    constexpr int j = i * 2;  // "j can only be evaluated at run time."
                              // "constexpr must be evaluated at compile time."
                              // "So this code will produce a compilation error."
}
-
A common usage of const is to guard against accidentally changing a variable, especially when it is passed by reference as a function argument.
#include <iostream>
#include <vector>

int sum(const std::vector<int> &v) {
    int sum = 0;
    for (int i : v)
        sum += i;
    return sum;
}

int main() {
    std::vector<int> v {0, 1, 2, 3, 4};
    std::cout << sum(v) << "\n";
}
- In the previous exercise, we included an array of directional deltas for convenience:
// directional deltas
const int delta[4][2]{{-1, 0}, {0, -1}, {1, 0}, {0, 1}};
-
Arrays are a lower level data structure than vectors, and can be slightly more efficient in terms of memory and element access. However, this efficiency comes with a price. Unlike vectors, which can be extended with more elements, arrays have a fixed length. Additionally, arrays may require careful memory management, depending on how they are used.
-
The example in the project code is a good use case for an array, as it was not intended to be changed during the execution of the program. However, a vector would have worked there as well.
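For comparison, a fixed-size array and a growable vector might be declared like this (a small illustrative sketch):
#include <vector>

int main() {
    // A raw array: its size (4 rows of 2 ints) is fixed at compile time.
    const int delta[4][2] {{-1, 0}, {0, -1}, {1, 0}, {0, 1}};

    // A vector: elements can be added or removed while the program runs.
    std::vector<int> values {1, 2, 3};
    values.push_back(4);
}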
-
Header files, or .h files, allow related function, method, and class declarations to be collected in one place. The corresponding definitions can then be placed in .cpp files. The compiler considers a header declaration a "promise" that the definition will be found later in the code, so if the compiler reaches a function that hasn't been defined yet, it can continue on compiling until the definition is found. This allows functions to be defined (and declared) in arbitrary order.
In the following code example, the functions are out of order, and the code will not compile. Try to fix this by rearranging the functions to be in the correct order.
-
#include <iostream> using std::cout; void OuterFunction(int i) { InnerFunction(i); } void InnerFunction(int i) { cout << "The value of the integer is: " << i << "\n"; } int main() { int a = 5; OuterFunction(a); }
-
In the mini-project for the first half of the course, the instructions were very careful to indicate where each function should be placed, so you didn't run into the problem of functions being out of order.
-
Using a Header
-
One other way to solve the code problem above (without rearranging the functions) would have been to declare each function at the top of the file. A function declaration is much like the first line of a function definition - it contains the return type, function name, and input variable types. The details of the function definition are not needed for the declaration though.
To avoid a single file from becoming cluttered with declarations and definitions for every function, it is customary to declare the functions in another file, called the header file. In C++, the header file will have filetype .h, and the contents of the header file must be included at the top of the .cpp file. See the following example for a refactoring of the code above into a header and a cpp file.
-
// The header file with just the function declarations.
// When you click the "Run Code" button, this file will
// be saved as header_example.h.
#ifndef HEADER_EXAMPLE_H
#define HEADER_EXAMPLE_H

void OuterFunction(int);
void InnerFunction(int);

#endif
-
// The contents of header_example.h are included in
// the corresponding .cpp file using quotes:
#include "header_example.h"

#include <iostream>
using std::cout;

void OuterFunction(int i) {
    InnerFunction(i);
}

void InnerFunction(int i) {
    cout << "The value of the integer is: " << i << "\n";
}

int main() {
    int a = 5;
    OuterFunction(a);
}
-
Notice that the code from the first example was fixed without having to rearrange the functions! In the code above, you might also have noticed several other things:
-
The function declarations in the header file don't need variable names, just variable types. You can put names in the declaration, however, and doing this often makes the code easier to read.
-
The
#include
statement for the header used quotes " " around the file name, and not angle brackets <>. We have stored the header in the same directory as the .cpp file, and the quotes tell the preprocessor to look for the file in the same directory as the current file - not in the usual set of directories where libraries are typically stored.
Finally, there is a preprocessor directive:
#ifndef HEADER_EXAMPLE_H
#define HEADER_EXAMPLE_H
-
at the top of the header, along with an #endif at the end. This is called an "include guard". Since the header will be included into another file, and #include just pastes contents into a file, the include guard prevents the same file from being pasted multiple times into another file. This might happen if multiple files include the same header, and then are all included into the same main.cpp, for example. The ifndef checks if HEADER_EXAMPLE_H has not been defined in the file already. If it has not been defined yet, then it is defined with #define HEADER_EXAMPLE_H, and the rest of the header is used. If HEADER_EXAMPLE_H has already been defined, then the preprocessor does not enter the ifndef block. Note: there are other ways to do this. Another common way is to use a #pragma once preprocessor directive, but we won't cover that in detail here. See the Wikipedia article on include guards for examples.
The addition of #include guards to a header file is one way to make that file idempotent. Another construct to combat double inclusion is #pragma once, which is non-standard but nearly universally supported among C and C++ compilers.
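For comparison, the same header written with #pragma once would look roughly like this:
// header_example.h, using #pragma once instead of an include guard.
#pragma once

void OuterFunction(int);
void InnerFunction(int);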
-
-
In the previous notebook, you saw how example code could be split into multiple .h and .cpp files, and you used g++ to build all of the files together. For small projects with a handful of files, this works well. But what would happen if there were hundreds, or even thousands, of files in the project? You could type the names of the files at the command line each time, but there are tools to make this easier.
-
Many larger C++ projects use a build system to manage all the files during the build process. The build system allows for large projects to be compiled with a few commands, and build systems are able to do this in an efficient way by only recompiling files that have been changed.
-
In this workspace you will learn about:
- Object files: what actually happens when you run g++.
- How to use object files to compile only a single file at a time. If you have many files in a project, this will allow you to compile only the files that have changed and need to be re-compiled.
- How to use cmake (and make), a build system which is popular in large C++ projects. CMake will simplify the process of building a project and re-compiling only the changed files.
-
Object Files
-
When you compile a project with g++, g++ actually performs several distinct tasks:
-
The preprocessor runs and executes any statement beginning with a hash symbol: #, such as #include statements. This ensures all code is in the correct location and ready to compile.
-
Each file in the source code is compiled into an "object file" (a .o file). Object files are platform-specific machine code that will be used to create an executable.
-
The object files are "linked" together to make a single executable. In the examples you have seen so far, this executable is a.out, but you can specify whatever name you want.
-
It is possible to have g++ perform each of the steps separately by using the -c flag. For example,
g++ -c main.cpp
-
will produce a main.o file, and that file can be converted to an executable with
g++ main.o
-
Generate all of the object files with
g++ -c *.cpp
and then link them with
g++ *.o
-
But what if you make changes to your code and you need to re-compile? In that case, you can compile only the file that you changed, and you can use the existing object files from the unchanged source files for linking.
-
Compiling just the file you have changed saves time if there are many files and compilation takes a long time. However, the process above is tedious when using many files, especially if you don't remember which ones you have modified.
-
For larger projects, it is helpful to use a build system which can compile exactly the right files for you and take care of linking.
-
-
CMake and Make
-
CMake is an open-source, platform-independent build system. CMake uses text documents, denoted as CMakeLists.txt files, to manage build environments, like make. A comprehensive tutorial on CMake would require an entire course, but you can learn the basics of CMake here, so you'll be ready to use it in the upcoming projects.
-
CMakeLists.txt
-
CMakeLists.txt files are simple text configuration files that tell CMake how to build your project. There can be multiple CMakeLists.txt files in a project. In fact, one CMakeLists.txt file can be included in each directory of the project, indicating how the files in that directory should be built.
-
These files can be used to specify the locations of necessary packages, set build flags and environment variables, specify build target names and locations, and other actions.
-
-
The first lines that you'll want in your CMakeLists.txt are lines that specify the minimum versions of cmake and C++ required to build the project. Add the following lines to your CMakeLists.txt and save the file:
-
cmake_minimum_required(VERSION 3.5.1)
set(CMAKE_CXX_STANDARD 14)
-
These lines set the minimum cmake version required to 3.5.1 and set the environment variable CMAKE_CXX_STANDARD so CMake uses C++ 14. On your own computer, if you have a recent g++ compiler, you could use C++ 17 instead.
-
-
CMake requires that we name the project, so you should choose a name for the project and then add the following line to CMakeLists.txt:
project(<your_project_name>)
-
Next, we want to add an executable to this project. You can do that with the add_executable command by specifying the executable name, along with the locations of all the source files that you will need. CMake has the ability to automatically find source files in a directory, but for now, you can just specify each file needed:
add_executable(your_executable_name path_to_file_1 path_to_file_2 ...)
-
A typical CMake project will have a build directory in the same place as the top-level CMakeLists.txt. Make a build directory in the /home/workspace/cmake_example folder:
-
root@abc123defg:/home/workspace/cmake_example# mkdir build
root@abc123defg:/home/workspace/cmake_example# cd build
root@abc123defg:/home/workspace/cmake_example/build# cmake ..
root@abc123defg:/home/workspace/cmake_example/build# make
root@abc123defg:/home/workspace/cmake_example/build# ./your_executable_name
-
The first line directs the cmake command at the top-level CMakeLists.txt file with the .. argument. This command uses the CMakeLists.txt to configure the project and create a Makefile in the build directory.
-
In the second line, make finds the Makefile and uses the instructions in the Makefile to build the project.
-
Now that your project builds correctly, try modifying one of the files. When you are ready to run the project again, you'll only need to run the make command from the build folder, and only that file will be compiled again. Try it now!
-
In general, CMake only needs to be run once for a project, unless you are changing build options (e.g. using different build flags or changing where you store your files).
-
Make will be able to keep track of which files have changed and compile only those that need to be compiled before building.
-
-
-
You have seen references used previously, in both pass-by-reference for functions, and in a range-based for loop example that used references to modify a vector. As you write larger C++ programs, you will find references useful in a variety of situations. In this short notebook, you will see a few more examples of references to solidify your knowledge.
-
As mentioned previously, a reference is another name given to an existing variable. On the left hand side of any variable declaration, the & operator can be used to declare a reference.
-
#include <iostream>
using std::cout;

int main() {
    int i = 1;

    // Declare a reference to i.
    int& j = i;
    cout << "The value of j is: " << j << "\n";

    // Change the value of i.
    i = 5;
    cout << "The value of i is changed to: " << i << "\n";
    cout << "The value of j is now: " << j << "\n";

    // Change the value of the reference.
    // Since a reference is just another name for the variable,
    // this also changes the value of i.
    j = 7;
    cout << "The value of j is now: " << j << "\n";
    cout << "The value of i is changed to: " << i << "\n";
}
-
Pointers have traditionally been a stumbling block for many students learning C++, but they do not need to be!
-
A C++ pointer is just a variable that stores the memory address of an object in your program.
-
That is the most important thing to understand and remember about pointers - they essentially keep track of where a variable is stored in the computer's memory.
-
In the previous lessons, you implemented A* search in a single file without using C++ pointers, except in the CellSort code that was provided for you; a C++ program can be written without using pointers extensively (or at all). However, pointers give you better control over how your program uses memory, and much like the pass-by-reference example that you saw previously, it can often be far more efficient to perform an operation with a pointer to an object than to perform the same operation using the object itself.
Pointers are an extremely important part of the C++ language, and as you are exposed to more C++ code, you will certainly encounter them.
-
Each variable in a program stores its contents in the computer's memory, and each chunk of the memory has an address number. For a given variable, the memory address can be accessed using an ampersand in front of the variable. To see an example of this, execute the following code which displays the hexadecimal memory addresses of the variables i and j:
-
#include <iostream> using std::cout; int main() { int i = 5; int j = 6; // Print the memory addresses of i and j cout << "The address of i is: " << &i << "\n"; cout << "The address of j is: " << &j << "\n"; }
-
-
At this point, you might be wondering why the same symbol & can be used to both access memory addresses and, as you've seen before, pass references into a function. This is a great thing to wonder about. The overloading of the ampersand symbol & and the * symbol probably contributes to much of the confusion around pointers.
-
The symbols & and * have a different meaning, depending on which side of an equation they appear.
-
This is extremely important to remember. For the & symbol, if it appears on the left side of an equation (e.g. when declaring a variable), it means that the variable is declared as a reference. If the & appears on the right side of an equation, or before a previously defined variable, it is used to return a memory address, as in the example above.
-
#include <iostream> using std::cout; int main() { int i = 5; // A pointer pointer_to_i is declared and initialized to the address of i. int* pointer_to_i = &i; // Print the memory addresses of i and j cout << "The address of i is: " << &i << "\n"; cout << "The variable pointer_to_i is: " << pointer_to_i << "\n"; }
-
As you can see from the code, the variable pointer_to_i is declared as a pointer to an int using the * symbol, and pointer_to_i is set to the address of i. From the printout, it can be seen that pointer_to_i holds the same value as the address of i.
-
Once you have a pointer, you may want to retrieve the object it is pointing to. In this case, the * symbol can be used again. This time, however, it will appear on the right hand side of an equation or in front of an already-defined variable, so the meaning is different. In this case, it is called the "dereferencing operator", and it returns the object being pointed to. You can see how this works with the code below:
-
#include <iostream> using std::cout; int main() { int i = 5; // A pointer pointer_to_i is declared and initialized to the address of i. int* pointer_to_i = &i; // Print the memory addresses of i and j cout << "The address of i is: " << &i << "\n"; cout << "The variable pointer_to_i is: " << pointer_to_i << "\n"; cout << "The value of the variable pointed to by pointer_to_i is: " << *pointer_to_i << "\n"; }
-
In the following example, the code is similar to above, except that the object that is being pointed to is changed before the pointer is dereferenced. Before executing the following code, guess what you think will happen to the value of the dereferenced pointer.
-
#include <iostream> using std::cout; int main() { int i = 5; // A pointer pointer_to_i is declared and initialized to the address of i. int* pointer_to_i = &i; // Print the memory addresses of i and j cout << "The address of i is: " << &i << "\n"; cout << "The variable pointer_to_i is: " << pointer_to_i << "\n"; // The value of i is changed. i = 7; cout << "The new value of the variable i is : " << i << "\n"; cout << "The value of the variable pointed to by pointer_to_i is: " << *pointer_to_i << "\n"; }
-
-
In the previous concept, you were introduced to int pointers, and you learned the syntax for creating a pointer and retrieving an object from a pointer.
-
Although the type of object being pointed to must be included in a pointer declaration, pointers hold the same kind of value for every type of object: just a memory address to where the object is stored. In the following code, a vector is declared. Write your own code to create a pointer to the address of that vector. Then, dereference your pointer and print the value of the first item in the vector.
-
#include <iostream> #include <vector> using std::cout; using std::vector; int main() { // Vector v is declared and initialized to {1, 2, 3} vector<int> v {1, 2, 3}; // Declare and initialize a pointer to the address of v here: vector<int> *pointer_to_v = &v; // The following loops over each int a in the vector v and prints. // Note that this uses a "range-based" for loop: https://www.geeksforgeeks.org/range-based-loop-c/ for (int a: v) { cout << a << "\n"; } // Dereference your pointer to v and print the int at index 0 here (note: you should print 1): cout << "The first element of v is: " << (*pointer_to_v)[0] << "\n"; }
-
-
Pointers can be used in another form of pass-by-reference when working with functions. When used in this context, they work much like the references that you used for pass-by-reference previously. If the pointer is pointing to a large object, it can be much more efficient to pass the pointer to a function than to pass a copy of the object as with pass-by-value.
-
In the following code, a pointer to an int is created, and that pointer is passed to a function. The object pointed to is then modified in the function.
-
#include <iostream> using std::cout; void AddOne(int* j) { // Dereference the pointer and increment the int being pointed to. (*j)++; } int main() { int i = 1; cout << "The value of i is: " << i << "\n"; // Declare a pointer to i: int* pi = &i; AddOne(pi); cout << "The value of i is now: " << i << "\n"; }
-
-
You can also return a pointer from a function. As mentioned just above, if you do this, you must be careful that the object being pointed to doesn't go out of scope when the function finishes executing. If the object goes out of scope, the memory address being pointed to might then be used for something else.
-
In the example below, a reference is passed into a function and a pointer is returned. This is safe since the pointer being returned points to a reference - a variable that exists outside of the function and will not go out of scope in the function.
-
#include <iostream> using std::cout; int* AddOne(int& j) { // Increment the referenced int and return the // address of j. j++; return &j; } int main() { int i = 1; cout << "The value of i is: " << i << "\n"; // Declare a pointer and initialize to the value // returned by AddOne: int* my_pointer = AddOne(i); cout << "The value of i is now: " << i << "\n"; cout << "The value of the int pointed to by my_pointer is: " << *my_pointer << "\n"; }
-
- Pointers and references can have similar use cases in C++. As seen previously both references and pointers can be used in pass-by-reference to a function. Additionally, they both provide an alternative way to access an existing variable: pointers through the variable's address, and references through another name for that variable. But what are the differences between the two, and when should each be used? The following list summarizes some of the differences between pointers and references, as well as when each should be used:
References | Pointers |
---|---|
References must be initialized when they are declared. This means that a reference will always point to data that was intentionally assigned to it. | Pointers can be declared without being initialized, which is dangerous. If this happens mistakenly, the pointer could be pointing to an arbitrary address in memory, and the data associated with that address could be meaningless, leading to undefined behavior and difficult-to-resolve bugs. |
References can not be null. This means that a reference should point to meaningful data in the program. | Pointers can be null. In fact, if a pointer is not initialized immediately, it is often best practice to initialize to nullptr, a special type which indicates that the pointer is null. |
When used in a function for pass-by-reference, the reference can be used just as a variable of the same type would be. | When used in a function for pass-by-reference, a pointer must be dereferenced in order to access the underlying object. |
-
References are generally easier and safer than pointers. As a decent rule of thumb, references should be used in place of pointers when possible.
-
However, there are times when it is not possible to use references. One example is object initialization. You might like one object to store a reference to another object. However, if the other object is not yet available when the first object is created, then the first object will need to use a pointer, not a reference, since a reference cannot be null. The reference could only be initialized once the other object is created.
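A brief sketch of that situation (the class and member names here are purely illustrative):
#include <iostream>

class Engine {
 public:
  int power = 100;
};

class Car {
 public:
  // The Engine may not exist yet when a Car is constructed,
  // so the member starts out as a null pointer.
  Engine* engine = nullptr;
};

int main() {
  Car car;               // created before any Engine exists
  Engine engine;         // the other object becomes available later
  car.engine = &engine;  // now the pointer can be set
  std::cout << car.engine->power << "\n";
}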
-
So far in this course you have seen container data structures, like the vector and the array. Additionally, you have used classes in your code for this project. Container data structures are fantastic for storing ordered data, and classes are useful for grouping related data and functions together, but neither of these data structures is optimal for storing associated data.
-
A map (alternatively hash table, hash map, or dictionary) is a data structure that uses key/value pairs to store data, and provides efficient lookup and insertion of the data. The name "dictionary" should provide an excellent idea of how these work, since a dictionary is a real life example of a map. Here is a slightly edited entry from www.dictionary.com defining the word "word":
-
word
- a unit of language, consisting of one or more spoken sounds or their written representation, that functions as a principal carrier of meaning.
- speech or talk: to express one's emotion in words.
- a short talk or conversation: "Marston, I'd like a word with you."
- an expression or utterance: a word of warning.
-
-
In the following notebook, you will learn how to use an unordered_map, which is the C++ standard library implementation of a map. Although C++ has several different implementations of map data structures which are similar, unordered_map is the structure that you will use in your project.
-
In the cell below, we have created a hash table (unordered_map) to store the data from the example above. To create an unordered_map in C++, you must include the <unordered_map> header, and the syntax for declaring an unordered_map is as follows:
-
unordered_map <key_type, value_type> variable_name;
-
In the code below, we check if the key is in the unordered_map using the .find() method. If the key does not exist in the map, then .find() returns an iterator equal to unordered_map::end(). Otherwise, .find() returns a C++ iterator, which is a pointer-like object that points to the key-value pair.
We haven't covered iterators in this course, and you won't need them for this project, but they are a lot like pointers that can "iterate" forward or backward through a range.
-
#include <iostream> #include <vector> #include <unordered_map> #include <string> using std::vector; using std::cout; using std::unordered_map; using std::string; int main() { // Create strings to use in the hash table. string key = "word"; string def_1 = "a unit of language, consisting of one or more spoken sounds or their written representation, that functions as a principal carrier of meaning"; string def_2 = "speech or talk: to express one's emotion in words"; string def_3 = "a short talk or conversation: 'Marston, I'd like a word with you.'"; string def_4 = "an expression or utterance: a word of warning"; unordered_map <string, vector<string>> my_dictionary; // Check if key is in the hash table. if (my_dictionary.find(key) == my_dictionary.end()) { cout << "The key 'word' is not in the dictionary." << "\n"; cout << "Inserting a key-value pair into the dictionary." << "\n\n"; // Set the value for the key. my_dictionary[key] = vector<string> {def_1, def_2, def_3, def_4}; } // The key should now be in the hash table. You can access the // value corresponding to the key with square brackets []. // Here, the value my_dictionary[key] is a vector of strings. // We iterate over the vector and print the strings. cout << key << ": \n"; auto definitions = my_dictionary[key]; for (string definition : definitions) { cout << definition << "\n"; } }
-
#include<unordered_map> #include<string> #include<iostream> #include<vector> using std::unordered_map; using std::string; using std::cout; using std::vector; // Write your program here. int main() { unordered_map<int, string> IDD_codes {{972, "Israel"}, {93, "Afghanistan"}, {355, "Albania"}, {213, "Algeria"}, {376, "Andorra"}, {244, "Angola"}, {54, "Argentina"}, {374, "Armenia"}, {297, "Aruba"}, {61, "Australia"}, {43, "Austria"}, {994, "Azerbaijan"}, {973, "Bahrain"}, {880, "Bangladesh"}, {375, "Belarus"}, {32, "Belgium"}, {501, "Belize"}, {229, "Benin"}, {975, "Bhutan"}, {387, "Bosnia and Herzegovina"}, {267, "Botswana"}, {55, "Brazil"}, {246, "British Indian Ocean Territory"}, {359, "Bulgaria"}, {226, "Burkina Faso"}, {257, "Burundi"}, {855, "Cambodia"}, {237, "Cameroon"}, {1, "Canada"}, {238, "Cape Verde"}, {236, "Central African Republic"}, {235, "Chad"}, {56, "Chile"}, {86, "China"}, {61, "Christmas Island"}, {57, "Colombia"}, {269, "Comoros"}, {242, "Congo"}, {682, "Cook Islands"}, {506, "Costa Rica"}, {385, "Croatia"}, {53, "Cuba"}, {537, "Cyprus"}, {420, "Czech Republic"}, {45, "Denmark"}, {253, "Djibouti"}, {593, "Ecuador"}, {20, "Egypt"}, {503, "El Salvador"}, {240, "Equatorial Guinea"}, {291, "Eritrea"}, {372, "Estonia"}, {251, "Ethiopia"}, {298, "Faroe Islands"}, {679, "Fiji"}, {358, "Finland"}, {33, "France"}, {594, "French Guiana"}, {689, "French Polynesia"}, {241, "Gabon"}, {220, "Gambia"}, {995, "Georgia"}, {49, "Germany"}, {233, "Ghana"}, {350, "Gibraltar"}, {30, "Greece"}, {299, "Greenland"}, {590, "Guadeloupe"}, {502, "Guatemala"}, {224, "Guinea"}, {245, "Guinea-Bissau"}, {595, "Guyana"}, {509, "Haiti"}, {504, "Honduras"}, {36, "Hungary"}, {354, "Iceland"}, {91, "India"}, {62, "Indonesia"}, {964, "Iraq"}, {353, "Ireland"}, {972, "Israel"}, {39, "Italy"}, {81, "Japan"}, {962, "Jordan"}, {254, "Kenya"}, {686, "Kiribati"}, {965, "Kuwait"}, {996, "Kyrgyzstan"}, {371, "Latvia"}, {961, "Lebanon"}, {266, "Lesotho"}, {231, "Liberia"}, {423, "Liechtenstein"}, {370, "Lithuania"}, {352, "Luxembourg"}, {261, "Madagascar"}, {265, "Malawi"}, {60, "Malaysia"}, {223, "Mali"}, {356, "Malta"}, {692, "Marshall Islands"}, {596, "Martinique"}, {222, "Mauritania"}, {230, "Mauritius"}, {262, "Mayotte"}, {52, "Mexico"}, {377, "Monaco"}, {976, "Mongolia"}, {382, "Montenegro"}, {212, "Morocco"}, {95, "Myanmar"}, {264, "Namibia"}, {674, "Nauru"}, {977, "Nepal"}, {31, "Netherlands"}, {599, "Netherlands Antilles"}, {687, "New Caledonia"}, {64, "New Zealand"}, {505, "Nicaragua"}, {227, "Niger"}, {234, "Nigeria"}, {683, "Niue"}, {672, "Norfolk Island"}, {47, "Norway"}, {968, "Oman"}, {92, "Pakistan"}, {680, "Palau"}, {507, "Panama"}, {675, "Papua New Guinea"}, {595, "Paraguay"}, {51, "Peru"}, {63, "Philippines"}, {48, "Poland"}, {351, "Portugal"}, {974, "Qatar"}, {40, "Romania"}, {250, "Rwanda"}, {685, "Samoa"}, {378, "San Marino"}, {966, "Saudi Arabia"}, {221, "Senegal"}, {381, "Serbia"}, {248, "Seychelles"}, {232, "Sierra Leone"}, {65, "Singapore"}, {421, "Slovakia"}, {386, "Slovenia"}, {677, "Solomon Islands"}, {27, "South Africa"}, {500, "South Georgia and the South Sandwich Islands"}, {34, "Spain"}, {94, "Sri Lanka"}, {249, "Sudan"}, {597, "Suriname"}, {268, "Swaziland"}, {46, "Sweden"}, {41, "Switzerland"}, {992, "Tajikistan"}, {66, "Thailand"}, {228, "Togo"}, {690, "Tokelau"}, {676, "Tonga"}, {216, "Tunisia"}, {90, "Turkey"}, {993, "Turkmenistan"}, {688, "Tuvalu"}, {256, "Uganda"}, {380, "Ukraine"}, {971, "United Arab Emirates"}, {44, "United Kingdom"}, {1, "United States"}, {598, "Uruguay"}, 
{998, "Uzbekistan"}, {678, "Vanuatu"}, {681, "Wallis and Futuna"}, {967, "Yemen"}, {260, "Zambia"}, {263, "Zimbabwe"}, {591, "Bolivia, Plurinational State of"}, {673, "Brunei Darussalam"}, {61, "Cocos (Keeling) Islands"}, {243, "Congo, The Democratic Republic of the"}, {225, "Cote dIvoire"}, {500, "Falkland Islands (Malvinas)"}, {44, "Guernsey"}, {379, "Holy See (Vatican City State)"}, {852, "Hong Kong"}, {98, "Iran, Islamic Republic of"}, {44, "Isle of Man"}, {44, "Jersey"}, {850, "Korea, Democratic People's Republic of"}, {82, "Korea, Republic of"}, {856, "Lao People's Democratic Republic"}, {218, "Libyan Arab Jamahiriya"}, {853, "Macao"}, {389, "Macedonia, The Former Yugoslav Republic of"}, {691, "Micronesia, Federated States of"}, {373, "Moldova, Republic of"}, {258, "Mozambique"}, {970, "Palestinian Territory, Occupied"}, {872, "Pitcairn"}, {262, "Réunion"}, {7, "Russia"}, {590, "Saint Barthélemy"}, {290, "Saint Helena, Ascension and Tristan Da Cunha"}, {590, "Saint Martin"}, {508, "Saint Pierre and Miquelon"}, {239, "Sao Tome and Principe"}, {252, "Somalia"}, {47, "Svalbard and Jan Mayen"}, {963, "Syrian Arab Republic"}, {886, "Taiwan, Province of China"}, {255, "Tanzania, United Republic of"}, {670, "Timor-Leste"}, {58, "Venezuela, Bolivarian Republic of"}, {84, "Viet Nam"}}; if (IDD_codes.find(960) == IDD_codes.end()) { IDD_codes[960] = "Maldives"; } vector<int> my_codes {1, 55, 960}; for (int code : my_codes) { cout << code << ": " << IDD_codes[code] << "\n"; } }
-
-
If you are taking this course, you have probably used object-oriented programming (OOP) previously in another language. If it's been a while since you've used OOP, recall that OOP is a style of coding that collects related data (object attributes) and functions (object methods) together to form a single data structure, called an object. This allows that collection of attributes and methods to be used repeatedly in your program without code repetition.
-
In C++ the attributes and methods that make up an object are specified in a code class, and each object in the program is an instance of that class.
-
This concept is intended to provide you with the basic syntax for writing classes in C++. In this Foundations course, you will not need to write your own classes for the project, but you will be modifying existing classes in the code. You will be writing your own classes in the next course of this Nanodegree: Object-Oriented Programming.
-
In the next cell, the code above has been rewritten with a
Car
class. -
#include <iostream> #include <string> using std::string; using std::cout; // The Car class class Car { public: // Method to print data. void PrintCarData() { cout << "The distance that the " << color << " car " << number << " has traveled is: " << distance << "\n"; } // Method to increment the distance travelled. void IncrementDistance() { distance++; } // Class/object attributes string color; int distance = 0; int number; }; int main() { // Create class instances for each car. Car car_1, car_2, car_3; // Set each instance's color. car_1.color = "green"; car_2.color = "red"; car_3.color = "blue"; // Set each instance's number. car_1.number = 1; car_2.number = 2; car_3.number = 3; // Increment car_1's position by 1. car_1.IncrementDistance(); // Print out the position and color of each car. car_1.PrintCarData(); car_2.PrintCarData(); car_3.PrintCarData(); }
-
This looks OK, and you have reduced the number of variables in main, so you might see how this could be more organized going forward. However, there is now a lot more code than you started with, and main doesn't seem much more organized. The code above still sets the attributes for each car after the car has been created.
-
The best way to fix this is to add a constructor to the Car class. The constructor allows you to instantiate new objects with the data that you want. In the next code cell, we have added a constructor for Car that allows the number and color to be passed in. This means that each Car object can be created with those variables.
-
#include <iostream> #include <string> using std::string; using std::cout; class Car { public: void PrintCarData() { cout << "The distance that the " << color << " car " << number << " has traveled is: " << distance << "\n"; } void IncrementDistance() { distance++; } // Adding a constructor here: Car(string c, int n) { // Setting the class attributes with // The values passed into the constructor. color = c; number = n; } string color; int distance = 0; int number; }; int main() { // Create class instances for each car. Car car_1 = Car("green", 1); Car car_2 = Car("red", 2); Car car_3 = Car("blue", 3); // Increment car_1's position by 1. car_1.IncrementDistance(); // Print out the position and color of each car. car_1.PrintCarData(); car_2.PrintCarData(); car_3.PrintCarData(); }
-
This is now beginning to look better. The main is more organized than when we first started, although there is a little more code overall to accommodate the class definition. At this point, you might want to separate your class definition into its own .h and .cpp files. We'll do that in the next concept!
-
It is possible for a class to use methods and attributes from another class using class inheritance. For example, if you wanted to make a Sedan class with additional attributes or methods not found in the generic Car class, you could create a Sedan class that inherited from the Car by using the colon notation:
-
class Sedan : public Car { // Sedan class declarations/definitions here. };
-
By doing this, each Sedan class instance will have access to any of the public methods and attributes of Car. In the code above, these are IncrementDistance() and PrintCarData(). You can add additional features to the Sedan class as well. In the example above, Car is often referred to as the parent class, and Sedan as the child or derived class.
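-
As a small sketch of how this works in practice (assuming the Car class with the constructor from above; the Sedan constructor and the sunroof attribute here are illustrative assumptions, not part of the course code), a Sedan object can call the inherited public methods directly:
-
#include <iostream>
#include <string>
using std::string;
using std::cout;

class Car {
  public:
    Car(string c, int n) {
        color = c;
        number = n;
    }
    void PrintCarData() {
        cout << "The distance that the " << color << " car " << number << " has traveled is: " << distance << "\n";
    }
    void IncrementDistance() { distance++; }
    string color;
    int distance = 0;
    int number;
};

// Sedan inherits publicly from Car and adds its own attribute.
class Sedan : public Car {
  public:
    Sedan(string c, int n) : Car(c, n) {}
    bool sunroof = false;  // hypothetical extra attribute
};

int main() {
    Sedan sedan("silver", 4);
    sedan.IncrementDistance();  // method inherited from Car
    sedan.PrintCarData();       // method inherited from Car
}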
-
A full discussion of inheritance is beyond the scope of this course, but you will encounter it briefly in the project code later. In the project code, the classes are set up to inherit from existing classes of an open source code project. You won't need to use inheritance otherwise, but keep in mind that your classes can use all of the public methods and attributes of their parent class.
-
In the previous concept, you saw how to create a
Car
class and use a constructor. At the end of that concept, your code looked like this: -
#include <iostream> #include <string> using std::string; using std::cout; class Car { public: void PrintCarData() { cout << "The distance that the " << color << " car " << number << " has traveled is: " << distance << "\n"; } void IncrementDistance() { distance++; } // Adding a constructor here: Car(string c, int n) { // Setting the class attributes with // The values passed into the constructor. color = c; number = n; } string color; int distance = 0; int number; }; int main() { // Create class instances for each car. Car car_1 = Car("green", 1); Car car_2 = Car("red", 2); Car car_3 = Car("blue", 3); // Increment car_1's position by 1. car_1.IncrementDistance(); // Print out the position and color of each car. car_1.PrintCarData(); car_2.PrintCarData(); car_3.PrintCarData(); }
-
If you were planning to build a larger program, at this point it might be good to put your class definition and function declarations into a separate file. Just as when we discussed header files before, putting the class definition into a separate header helps to organize your code, and prevents problems with trying to use class objects before the class is defined.
-
There are two things to note in the code below.
- When the class methods are defined outside the class, the scope resolution operator
::
must be used to indicate which class the method belongs to. For example, in the definition of thePrintCarData
method you see:
-
void Car::PrintCarData()
- This prevents any compiler issues if there are two classes with methods that have the same name.
- We have changed how the constructor initializes the variables. Instead of the previous constructor:
-
Car(string c, int n) { color = c; number = n; }
-
-
the constructor now uses an initializer list:
-
Car(string c, int n) : color(c), number(n) {}
-
-
Here, the class members are initialized before the body of the constructor (which is now empty). Initializer lists are a quick way to initialize many class attributes in the constructor. Additionally, the compiler treats attributes initialized in the list slightly differently than if they are initialized in the constructor body. For reasons beyond the scope of this course, if a class attribute is a reference, it must be initialized using an initializer list.
- Variables that don't need to be visible outside of the class are set as
private
. This means that they can not be accessed outside of the class, which prevents them from being accidentally changed.
Check out the cells below to see this code in practice. In this code, we have separated the class into declarations and definitions, with declarations being in the .h
file and definitions being in .cpp
. Note that only the .h
file needs to be included in any other file where the definitions are used.
-
car.h
-
#ifndef CAR_H #define CAR_H #include <iostream> #include <string> using std::string; using std::cout; class Car { public: void PrintCarData(); void IncrementDistance(); // Using a constructor list in the constructor: Car(string c, int n) : color(c), number(n) {} // The variables do not need to be accessed outside of // functions from this class, so we can set them to private. private: string color; int distance = 0; int number; }; #endif
-
car.cpp
-
#include <iostream> #include "car.h" // Method definitions for the Car class. void Car::PrintCarData() { cout << "The distance that the " << color << " car " << number << " has traveled is: " << distance << "\n"; } void Car::IncrementDistance() { distance++; }
-
main.cpp
-
#include <iostream> #include <string> #include "car.h" using std::string; using std::cout; int main() { // Create class instances for each car. Car car_1 = Car("green", 1); Car car_2 = Car("red", 2); Car car_3 = Car("blue", 3); // Increment car_1's position by 1. car_1.IncrementDistance(); // Print out the position and color of each car. car_1.PrintCarData(); car_2.PrintCarData(); car_3.PrintCarData(); }
-
There is a lot going on in the code below to unpack, including the
new
keyword and the->
operator. The arrow operator->
is used to simultaneously- dereference a pointer to an object and
- access an attribute or method.
-
For example, in the code below, cp is a pointer to a Car object, and the following two are equivalent:
-
// Simultaneously dereference the pointer and // access IncrementDistance(). cp->IncrementDistance(); // Dereference the pointer using *, then // access IncrementDistance() with traditional // dot notation. (*cp).IncrementDistance();
-
The new operator allocates memory on the "heap" for a new Car. In general, this memory must be manually managed (deallocated) to avoid memory leaks in your program. Memory management is the primary focus of one of the later courses in this Nanodegree program, so we won't go into greater depth about the difference between
stack
andheap
in this lesson. -
#include <iostream> #include <string> #include <vector> #include "car.h" using std::string; using std::cout; using std::vector; int main() { // Create an empty vector of pointers to Cars // and a null pointer to a car. vector<Car*> car_vect; Car* cp = nullptr; // The vector of colors for the cars: vector<string> colors {"red", "blue", "green"}; // Create 100 cars with different colors and // push pointers to each of those cars into the vector. for (int i=0; i < 100; i++) { cp = new Car(colors[i%3], i+1); car_vect.push_back(cp); } // Move each car forward by 1. for (Car* cp: car_vect) { cp->IncrementDistance(); } // Print data about each car. for (Car* cp: car_vect) { cp->PrintCarData(); } }
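-
Because every Car in the exercise above is allocated with new and never deallocated, the program leaks that memory. As a minimal sketch (assuming the car.h from earlier), cleanup with delete could look like the following; memory management is treated in depth later in the program:
-
#include <string>
#include <vector>
#include "car.h"   // assumes the Car class from the car.h file above

int main() {
    std::vector<Car*> car_vect;

    // Allocate a few cars on the heap with new.
    for (int i = 0; i < 3; i++) {
        car_vect.push_back(new Car("green", i + 1));
    }

    // ... use the cars ...

    // Deallocate each heap-allocated Car to avoid a memory leak,
    // then drop the now-dangling pointers.
    for (Car* cp : car_vect) {
        delete cp;
    }
    car_vect.clear();
}
-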
When working with classes it is often helpful to be able to refer to the current class instance or object. For example, given the following Car class from a previous lesson, the IncrementDistance() method implicitly refers to the current Car instance's distance attribute:
-
// The Car class class Car { public: // Method to print data. void PrintCarData() { cout << "The distance that the " << color << " car " << number << " has traveled is: " << distance << "\n"; } // Method to increment the distance travelled. void IncrementDistance() { distance++; } // Class/object attributes string color; int distance = 0; int number; };
-
It is possible to make this explicit in C++ by using the this pointer, which points to the current class instance. Using this can sometimes be helpful to add clarity to more complicated code:
-
// The Car class class Car { public: // Method to print data. void PrintCarData() { cout << "The distance that the " << this->color << " car " << this->number << " has traveled is: " << this->distance << "\n"; } // Method to increment the distance travelled. void IncrementDistance() { this->distance++; } // Class/object attributes string color; int distance = 0; int number; };
-
Note: you may see this used in some code in the remainder of the course.
-
Structures
-
Structures allow developers to create their own types ("user-defined" types) to aggregate data relevant to their needs.
-
For example, a user might define a Rectangle structure to hold data about rectangles used in a program.
-
struct Rectangle { float length; float width; };
length
andwidth
are member variables
-
Types
-
Every C++ variable is defined with a type.
-
int value; Rectangle rectangle; Sphere earth;
-
In this example, the "type" of
value
isint
. Furthermore,rectangle
is "of type"Rectangle
, andearth
has typeSphere
.
-
-
Fundamental Types
-
C++ includes fundamental types, such as
int
andfloat
. These fundamental types are sometimes called "primitives". -
The Standard Library includes additional types, such as
std::size_t
andstd::string
.
-
-
User-Defined Types
-
Structures are "user-defined" types. Structures are a way for programmers to create types that aggregate and store data in way that makes sense in the context of a program.
-
For example, C++ does not have a fundamental type for storing a date. (The Standard Library does include types related to time, which can be converted to dates.)
-
A programmer might desire to create a type to store a date.
-
Consider the following example:
-
struct Date { int day; int month; int year; };
-
The code above creates a structure containing three "member variables" of type int: day, month and year.
-
If you then create an "instance" of this structure, you can initialize these member variables:
-
// Create an instance of the Date structure Date date; // Initialize the attributes of Date date.day = 1; date.month = 10; date.year = 2019;
-
-
Generally, we want to avoid instantiating an object with undefined members. Ideally, we would like all members of an object to be in a valid state once the object is instantiated. We can change the values of the members later, but we want to avoid any situation in which the members are ever in an invalid state or undefined.
-
In order to ensure that objects of our Date structure always start in a valid state, we can initialize the members from within the structure definition.
-
struct Date { int day{1}; int month{1}; int year{0}; };
-
There are also several other approaches to either initialize or assign member variables when the object is instantiated. For now, however, this approach ensures that every object of Date begins its life in a defined and valid state.
-
Members of a structure can be specified as
public
orprivate
. -
By default, all members of a structure are
public
, unless they are specifically markedprivate
. -
Public members can be changed directly, by any user of the object, whereas private members can only be changed by the object itself.
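-
A brief sketch of this difference (the member names here are just for illustration): the public member below can be changed from main, while uncommenting the line that touches the private member would cause a compiler error.
-
struct Date {
  public:
    int day{1};      // public: any user of the object may change it
  private:
    int year{0};     // private: only Date's own member functions may change it
};

int main() {
    Date date;
    date.day = 15;       // OK: day is public
    // date.year = 2021; // compiler error: year is private within this context
}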
-
Private Members
-
This is an implementation of the
Date
structure, with all members marked as private. -
struct Date { private: int day{1}; int month{1}; int year{0}; };
-
Private members of a class are accessible only from within other member functions of the same class (or from their "friends", which we’ll talk about later).
-
There is a third access modifier called
protected
, which implies that members are accessible from other member functions of the same class (or from their "friends"), and also from members of their derived classes. We'll also discuss derived classes later, when we learn about inheritance. -
The differences between a class and a struct in C++ are:
-
struct
members and base classes/structs are public
by default. -
class
members and base classes/structs are private
by default. -
Both classes and structs can have a mixture of
public, protected and private
members, can use inheritance and can have member functions.
-
-
Accessors And Mutators
-
To access private members, we typically define public "accessor" and "mutator" member functions (sometimes called "getter" and "setter" functions).
-
struct Date { public: int Day() { return day; } void Day(int day) { this->day = day; } int Month() { return month; } void Month(int month) { this->month = month; } int Year() { return year; } void Year(int year) { this->year = year; } private: int day{1}; int month{1}; int year{0}; };
-
-
Avoid Trivial Getters And Setters
-
Sometimes accessors are not necessary, or even advisable. The C++ Core Guidelines recommend, "A trivial getter or setter adds no semantic value; the data item could just as well be public."
-
class Point { int x; int y; public: Point(int xx, int yy) : x{xx}, y{yy} { } int get_x() const { return x; } // const here promises not to modify the object void set_x(int xx) { x = xx; } int get_y() const { return y; } // const here promises not to modify the object void set_y(int yy) { y = yy; } // no behavioral member functions };
-
This
class
could be made into astruct
, with no logic or "invariants", just passive data. The member variables could both be public, with no accessor functions: -
struct Point { // Good: concise int x {0}; // public member variable with a default initializer of 0 int y {0}; // public member variable with a default initializer of 0 };
-
-
-
Classes
-
Classes, like structures, provide a way for C++ programmers to aggregate data together in a way that makes sense in the context of a specific program. By convention, programmers use structures when member variables are independent of each other, and use classes when member variables are related by an "invariant".
-
Invariants
-
An "invariant" is a rule that limits the values of member variables.
-
For example, in a
Date
class, an invariant would specify that the member variableday
cannot be less than 1. Another invariant would specify that the value of day cannot exceed 28, 29, 30, or 31, depending on the month and year. Yet another invariant would limit the value of month to the range of 1 to 12. -
Date
Class -
Let's define a
Date
class: -
// Use the keyword “class” to define a Date class: class Date { int day{1}; int month{1}; int year{0}; };
-
So far, this class definition provides no invariants. The data members can vary independently of each other.
-
There is one subtle but important change that takes place when we change
struct
Date toclass
Date. By default, all members of a struct default to public, whereas all members of a class default to private. Since we have not specified access for the members of class Date, all of the members are private. In fact, we are not able to assign values to them at all! -
Date
Accessors And Mutators -
As the first step to adding the appropriate invariants, let's specify that the member variable
day
is private. In order to access this member, we'll provide accessor and mutator functions. Then we can add the appropriate invariants to the mutators. -
class Date { public: int Day() { return day_; } void Day(int d) { day_ = d; } private: int day_{1}; int month_{1}; int year_{0}; };
-
Date
Invariants -
Now we can add the invariants within the mutators.
-
class Date { public: int Day() { return day_; } void Day(int d) { if (d >= 1 && d <= 31) day_ = d; } private: int day_{1}; int month_{1}; int year_{0}; };
-
Now we have a set of invariants for the class members!
-
As a general rule, member data subject to an invariant should be specified private, in order to enforce the invariant before updating the member's value.
-
-
-
Constructors
-
Constructors are member functions of a class or struct that initialize an object. The Core Guidelines define a constructor as:
- constructor: an operation that initializes (“constructs”) an object. Typically a constructor establishes an invariant and often acquires resources needed for an object to be used (which are then typically released by a destructor).
-
A constructor can take arguments, which can be used to assign values to member variables.
-
class Date { public: Date(int d, int m, int y) { // This is a constructor. Day(d); } int Day() { return day; } void Day(int d) { if (d >= 1 && d <= 31) day = d; } int Month() { return month; } void Month(int m) { if (m >= 1 && m <= 12) month = m; } int Year() { return year; } void Year(int y) { year = y; } private: int day{1}; int month{1}; int year{0}; };
-
As you can see, a constructor is also able to call other member functions of the object it is constructing. In the example above,
Date(int d, int m, int y)
assigns a member variable by calling Day(int d)
.
-
-
Default Constructor
-
A class object is always initialized by calling a constructor. That might lead you to wonder how it is possible to initialize a class or structure that does not define any constructor at all.
-
For example:
-
class Date { int day{1}; int month{1}; int year{0}; };
-
-
We can initialize an object of this class, even though this class does not explicitly define a constructor.
-
This is possible because of the default constructor. The compiler will define a default constructor, which accepts no arguments, for any class or structure that does not contain an explicitly-defined constructor.
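-
A minimal sketch of this in action:
-
class Date {
    int day{1};
    int month{1};
    int year{0};
};

int main() {
    // No constructor was declared, so the compiler generates the
    // default constructor Date(); the members take their in-class
    // initializer values (1, 1 and 0).
    Date date;
}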
-
-
-
Scope Resolution
-
C++ allows different identifiers (variable and function names) to have the same name, as long as they have different scope. For example, two different functions can each declare the variable int i, because each variable only exists within the scope of its parent function.
-
In some cases, scopes can overlap, in which case the compiler may need assistance in determining which identifier the programmer means to use. The process of determining which identifier to use is called "scope resolution".
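-
For example (a small sketch of the first point), two functions can each declare their own int i without any conflict, because each i lives only in its parent function's scope:
-
#include <iostream>

void CountUp() {
    // This i exists only within the scope of CountUp().
    for (int i = 0; i < 3; i++) { std::cout << i << " "; }
}

void CountDown() {
    // A different i, local to CountDown(); the two names never conflict.
    for (int i = 3; i > 0; i--) { std::cout << i << " "; }
}

int main() {
    CountUp();
    CountDown();
    std::cout << "\n";
}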
-
Scope Resolution Operator
-
::
is the scope resolution operator. We can use this operator to specify which namespace or class to search in order to resolve an identifier. -
Person::move(); // Call the move function that is a member of the Person class. std::map<std::string, int> m; // Instantiate a map container from the C++ Standard Library.
-
-
Class
-
Each class provides its own scope. We can use the scope resolution operator to specify identifiers from a class.
-
This becomes particularly useful if we want to separate class declaration from class definition.
-
class Date { public: int Day() const { return day; } void Day(int day); // Declare member function Date::Day(). int Month() const { return month; } void Month(int month) { if (month >= 1 && month <= 12) Date::month = month; } int Year() const { return year; } void Year(int year) { Date::year = year; } private: int day{1}; int month{1}; int year{0}; }; // Define member function Date::Day(). void Date::Day(int day) { if (day >= 1 && day <= 31) Date::day = day; }
-
-
Namespaces
-
Namespaces allow programmers to group logically related variables and functions together. Namespaces also help to avoid conflicts between two variables that have the same name in different parts of a program.
-
#include <iostream> namespace English { void Hello() { std::cout << "Hello, World!\n"; } } // namespace English namespace Spanish { void Hello() { std::cout << "Hola, Mundo!\n"; } } // namespace Spanish int main() { English::Hello(); Spanish::Hello(); }
-
In this example, we have two different
void Hello()
functions. If we put both of these functions in the same namespace, they would conflict and the program would not compile. However, by declaring each of these functions in a separate namespace, they are able to co-exist. Furthermore, we can specify which function to call by prefixing Hello() with the appropriate namespace, followed by the :: operator. -
#include <cassert> class Date { public: int Day() { return day; } void Day(int day); int Month() { return month; } void Month(int month); int Year() { return year; } void Year(int year); private: int day{1}; int month{1}; int year{0}; }; // TODO: Define Date::Day(int day) void Date::Day(int day) { if(day >= 1 && day <= 31) Date::day = day; } // TODO: Define Date::Month(int month) void Date::Month(int month) { if(month >= 1 && month <= 12) Date::month = month; } // TODO: Define Date::Year(int year) void Date::Year(int year) { Date::year = year; } // Test in main int main() { Date date; date.Day(29); date.Month(8); date.Year(1981); assert(date.Day() == 29); assert(date.Month() == 8); assert(date.Year() == 1981); }
-
-
-
Initializer List
-
Initializer lists initialize member variables to specific values, just before the class constructor runs. This initialization ensures that class members are automatically initialized when an instance of the class is created.
-
Date::Date(int day, int month, int year) : year_(year) { Day(day); Month(month); }
-
In this example, the member variable year_ is initialized through the initializer list, while day and month are assigned from within the constructor body. Assigning day and month allows us to apply the invariants set in the mutators.
-
In general, prefer initialization to assignment. Initialization sets the value as soon as the object exists, whereas assignment sets the value only after the object comes into being. This means that assignment creates an opportunity to accidentally use a variable before its value is set.
-
In fact, initialization lists ensure that member variables are initialized before the constructor body runs. This is why class member variables can be declared const, but only if the member variable is initialized through an initialization list. Trying to initialize a const class member within the body of the constructor will not work.
-
#include <assert.h> #include <string> // TODO: Define class Person struct Person { // TODO: Define a public constructor with an initialization list Person(std::string name) : name(name) {} // TODO: Define a public member variable: name std::string name; }; // Test int main() { Person alice("Alice"); Person bob("Bob"); assert(alice.name != bob.name); }
-
Initializer lists exist for a number of reasons. First, the compiler can initialize members more efficiently from an initialization list than through assignment inside the constructor body.
-
A second reason is a bit of a technical paradox. If you have a const class attribute, you can only initialize it using an initialization list. Otherwise, you would violate the const keyword simply by initializing the member in the constructor!
-
The third reason is that attributes defined as references must use initialization lists.
-
#include <assert.h> #include <string> struct Person { public: // TODO: Add an initialization list Person(std::string const & n) : name(n) {} std::string const name; }; // Test int main() { Person alice("Alice"); Person bob("Bob"); assert(alice.name != bob.name); }
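-
And for the third reason, here is a hedged sketch (not from the course) of a class with a reference attribute; it compiles only because the reference is bound in the initializer list:
-
#include <assert.h>
#include <string>

struct Person {
    // A reference member must be bound in the initializer list;
    // it cannot be left unbound and assigned later in the body.
    Person(std::string& n) : name(n) {}
    std::string& name;
};

int main() {
    std::string alice_name("Alice");
    Person alice(alice_name);

    alice_name = "Alicia";            // changing the referenced string...
    assert(alice.name == "Alicia");   // ...is visible through the member reference
}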
-
-
Encapsulation
-
Encapsulation is the grouping together of data and logic into a single unit. In object-oriented programming, classes encapsulate data and functions that operate on that data.
-
This can be a delicate balance, because on the one hand we want to group together relevant data and functions, but on the other hand we want to limit member functions to only those functions that need direct access to the representation of a class.
-
In the context of a Date class, a function Date Tomorrow(Date const & date) probably does not need to be encapsulated as a class member. It can exist outside the Date class.
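-
As a hedged sketch of that point (using a stripped-down Date for illustration, not the exercise class below, and deliberately ignoring month and year rollover), such a function can be written entirely in terms of the class's public interface:
-
#include <cassert>

class Date {
  public:
    Date(int d, int m, int y) : day_(d), month_(m), year_(y) {}
    int Day() const { return day_; }
    int Month() const { return month_; }
    int Year() const { return year_; }
  private:
    int day_;
    int month_;
    int year_;
};

// Tomorrow() only needs Date's public interface, so it can live
// outside the class. (Simplified: ignores month and year rollover.)
Date Tomorrow(Date const& date) {
    return Date(date.Day() + 1, date.Month(), date.Year());
}

int main() {
    Date today(15, 8, 2021);          // hypothetical date
    Date next = Tomorrow(today);
    assert(next.Day() == 16);
}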
-
However, a function that calculates the number of days in a month probably should be encapsulated with the class, because the class needs this function in order to operate correctly.
-
#include <cassert> class Date { public: Date(int day, int month, int year); int Day() const { return day_; } void Day(int day); int Month() const { return month_; } void Month(int month); int Year() const { return year_; } void Year(int year); private: bool LeapYear(int year) const; int DaysInMonth(int month, int year) const; int day_{1}; int month_{1}; int year_{0}; }; Date::Date(int day, int month, int year) { Year(year); Month(month); Day(day); } bool Date::LeapYear(int year) const { if(year % 4 != 0) return false; else if(year % 100 != 0) return true; else if(year % 400 != 0) return false; else return true; } int Date::DaysInMonth(int month, int year) const { if(month == 2) return LeapYear(year) ? 29 : 28; else if(month == 4 || month == 6 || month == 9 || month == 11) return 30; else return 31; } void Date::Day(int day) { if (day >= 1 && day <= DaysInMonth(Month(), Year())) day_ = day; } void Date::Month(int month) { if (month >= 1 && month <= 12) month_ = month; } void Date::Year(int year) { year_ = year; } // Test int main() { Date date(29, 2, 2016); assert(date.Day() == 29); assert(date.Month() == 2); assert(date.Year() == 2016); Date date2(29, 2, 2019); assert(date2.Day() != 29); assert(date2.Month() == 2); assert(date2.Year() == 2019); }
-
-
Accessor Functions
-
Accessor functions are public member functions that allow users to access an object's data, albeit indirectly.
-
const
-
Accessors should only retrieve data. They should not change the data stored in the object.
-
The main role of the const specifier in accessor methods is to protect member data. When you specify a member function as const, the compiler will prohibit that function from changing any of the object's member data.
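-
A small sketch of the idea, reusing the Date accessor pattern from earlier:
-
#include <cassert>

class Date {
  public:
    // const accessor: the compiler rejects any statement in this
    // function that would modify a member of the object.
    int Day() const { return day_; }

    // mutator: not const, because it changes member data.
    void Day(int day) {
        if (day >= 1 && day <= 31) day_ = day;
    }

  private:
    int day_{1};
};

int main() {
    Date date;
    date.Day(29);
    assert(date.Day() == 29);
}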
-
#include <iostream> #include <string> class BankAccount { public: int number; std::string owner; double funds; }; int main(){ // TODO: instantiate and output a bank account BankAccount account; account.number = 123456789; account.owner = "David Silver"; account.funds = 1000000.01; std::cout << "Account Information\n"; std::cout << "-------------------\n"; std::cout << "ID: " << account.number << "\n"; std::cout << "Owner: " << account.owner << "\n"; std::cout << "Funds: $" << account.funds << "\n"; }
-
-
Mutator Functions
-
#include <string> #include <cstring> #include <iostream> class Car { // TODO: Declare private attributes private: std::string _brand; // TODO: Declare getter and setter for brand public: void brand(char*); std::string brand() const; }; // Define setters void Car::brand(char* brand) { Car::_brand = brand; } // Define getters std::string Car::brand() const { return _brand; } // Test in main() int main() { Car car; char brand[] = "Peugeot"; car.brand(brand); std::cout << car.brand() << "\n"; }
-
-
Abstraction
-
Abstraction refers to the separation of a class's interface from the details of its implementation. The interface provides a way to interact with an object, while hiding the details and implementation of how the class works.
-
Example
-
The String() function within this Date class is an example of abstraction.
-
class Date { public: ... std::string String() const; ... };
-
The user is able to interact with the Date class through the String() function, but the user does not need to know about the implementation of either Date or String().
-
For example, the user does not know, or need to know, that this object internally contains three int member variables. The user can just call the String() method to get data.
-
If the designer of this class ever decides to change how the data is stored internally -- using a vector of ints instead of three separate ints, for example -- the user of the Date class will not need to know.
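-
A hedged sketch of one possible implementation (the storage as three ints and the exact output format are assumptions for illustration, not the course's definition):
-
#include <string>

class Date {
  public:
    Date(int d, int m, int y) : day_(d), month_(m), year_(y) {}

    // The interface: callers only see that they get a string back.
    std::string String() const {
        // Implementation detail hidden from the user: the date happens
        // to be stored as three ints and is formatted here.
        return std::to_string(day_) + "/" + std::to_string(month_) + "/" + std::to_string(year_);
    }

  private:
    int day_;
    int month_;
    int year_;
};

int main() {
    Date date(1, 10, 2019);
    std::string s = date.String();  // e.g. "1/10/2019"
}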
-
-
-
Static Members
-
Class members can be declared
static
, which means that the member belongs to the entire class, instead of to a specific instance of the class. More specifically, astatic
member is created only once and then shared by all instances (i.e. objects) of the class. That means that if the static member gets changed, either by a user of the class or within a member function of the class itself, then all instances of the class will see that change the next time they access the static member. -
Implementation
-
static
members are declared within their class (often in a header file) but in most cases they must be defined within the global scope. That's because memory is allocated for static variables immediately when the program begins, at the same time any global variables are initialized. -
Here is an example:
-
#include <cassert> class Foo { public: static int count; Foo() { Foo::count += 1; } }; int Foo::count{0}; int main() { Foo f{}; assert(Foo::count == 1); }
-
An exception to the global definition of
static
members is if such members can be marked asconstexpr
. In that case, thestatic
member variable can be both declared and defined within the class definition: -
struct Kilometer { static constexpr int meters{1000}; };
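-
A short usage sketch: the constant can then be read through the class name, without ever creating an instance:
-
#include <iostream>

struct Kilometer {
    static constexpr int meters{1000};
};

int main() {
    // No Kilometer object is needed: the member belongs to the class itself.
    std::cout << Kilometer::meters << "\n";
}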
-
-
-
Inheritance
-
In our everyday life, we tend to divide things into groups, based on their shared characteristics. Here are some groups that you have probably used yourself: electronics, tools, vehicles, or plants.
-
Sometimes these groups have hierarchies. For example, computers and smartphones are both types of electronics, but computers and smartphones are also groups in and of themselves. You can imagine a tree with "electronics" at the top, and "computers" and "smartphones" each as children of the "electronics" node.
-
Object-oriented programming uses the same principles! For instance, imagine a Vehicle class:
-
class Vehicle { public: int wheels = 0; string color = "blue"; void Print() const { std::cout << "This " << color << " vehicle has " << wheels << " wheels!\n"; } };
-
-
We can derive other classes from Vehicle, such as Car or Bicycle. One advantage is that this saves us from having to re-define all of the common member variables - in this case, wheels and color - in each derived class.
-
Another benefit is that derived classes, for example Car and Bicycle, can have distinct member variables, such as sunroof or kickstand. Different derived classes will have different member variables:
-
class Car : public Vehicle { public: bool sunroof = false; }; class Bicycle : public Vehicle { public: bool kickstand = true; };
-
-
Another example:
-
#include <iostream> #include <string> using std::string; class Vehicle { public: int wheels = 0; string color = "blue"; string make = "generic"; void Print() const { std::cout << "This " << color << " " << make << " vehicle has " << wheels << " wheels!\n"; } }; class Car : public Vehicle { public: bool sunroof = false; }; class Bicycle : public Vehicle { public: bool kickstand = true; }; class Scooter : public Vehicle { public: bool electric = false; }; int main() { Scooter scooter; scooter.wheels = 2; scooter.Print(); };
-
-
-
Inherited Access Specifiers
-
Just as access specifiers (i.e. public, protected, and private) define which class members users can access, the same access modifiers also define which class members users of a derived class can access.
-
Public inheritance: the public and protected members of the base class listed after the specifier keep their member access in the derived class
-
Protected inheritance: the public and protected members of the base class listed after the specifier are protected members of the derived class
-
Private inheritance: the public and protected members of the base class listed after the specifier are private members of the derived class
-
// This example demonstrates the privacy levels // between parent and child classes #include <iostream> #include <string> using std::string; class Vehicle { public: int wheels = 0; string color = "blue"; void Print() const { std::cout << "This " << color << " vehicle has " << wheels << " wheels!\n"; } }; class Car : public Vehicle { public: bool sunroof = false; }; class Bicycle : protected Vehicle { public: bool kickstand = true; void Wheels(int w) { wheels = w; } }; class Scooter : private Vehicle { public: bool electric = false; void Wheels(int w) { wheels = w; } }; int main() { Car car; car.wheels = 4; Bicycle bicycle; bicycle.Wheels(2); Scooter scooter; scooter.Wheels(1); };
-
Another example
-
// Example solution for Animal class #include <iostream> #include <string> // Define base class Animal class Animal { public: std::string color; std::string name; int age; }; // Declare derived class Snake class Snake : public Animal { public: int length; void MakeSound() const { std::cout << "Hiss\n"; } }; // Declare derived class Cat class Cat : public Animal { public: int height; void MakeSound() const { std::cout << "Meow\n"; } }; // Test in main() int main() { Cat cat; Snake snake; cat.age = 10; cat.name = "Lucy"; cat.MakeSound(); snake.MakeSound(); std::cout << cat.age << " " << cat.name << "\n"; }
-
-
-
Composition
-
Composition is a closely related alternative to inheritance. Composition involves constructing ("composing") classes from other classes, instead of inheriting traits from a parent class.
-
A common way to distinguish "composition" from "inheritance" is to think about what an object can do, rather than what it is. This is often expressed as "has a" versus "is a".
-
From the standpoint of composition, a cat "has a" head and "has a" set of paws and "has a" tail.
-
From the standpoint of inheritance, a cat "is a" mammal.
-
There is no hard and fast rule about when to prefer composition over inheritance. In general, if a class only needs to extend a small amount of functionality beyond what is already offered by another class, it makes sense to inherit from that other class. However, if a class needs to contain functionality from a variety of otherwise unrelated classes, it makes sense to compose the class from those other classes.
-
In this example, you'll practice working with composition in C++.
-
// Example solution for Circle class #include <iostream> #include <cmath> #include <assert.h> // Define PI #define PI 3.14159 // Define LineSegment struct struct LineSegment { // Define public attribute length public: double length; }; // Define Circle class class Circle { public: Circle(LineSegment& radius); double Area(); private: LineSegment& radius_; }; // Declare Circle class Circle::Circle(LineSegment& radius) : radius_(radius) {} double Circle::Area() { return pow(Circle::radius_.length, 2) * PI; } // Test in main() int main() { LineSegment radius {3}; Circle circle(radius); assert(int(circle.Area()) == 28); }
-
-
Class Hierarchy
-
#include <cassert> // TODO: Declare Vehicle as the base class class Vehicle {}; // TODO: Derive Car from Vehicle class Car : public Vehicle { public: int wheels{4}; }; // TODO: Derive Sedan from Car class Sedan : public Car { public: bool trunk{true}; int seats{4}; }; // TODO: Update main to pass the tests int main() { Sedan sedan; assert(sedan.trunk == true); assert(sedan.seats == 4); assert(sedan.wheels == 4); }
-
-
-
Friends
-
In C++,
friend
classes provide an alternative inheritance mechanism to derived classes. The main difference between classical inheritance and friend inheritance is that afriend
class can access private members of the base class, which isn't the case for classical inheritance. In classical inheritance, a derived class can only access public and protected members of the base class. -
// Example solution for Rectangle and Square friend classes #include <assert.h> // Declare class Rectangle class Rectangle; // Define class Square as friend of Rectangle class Square { // Add public constructor to Square, initialize side public: Square(int s) : side(s) {} private: // Add friend class Rectangle friend class Rectangle; // Add private attribute side to Square int side; }; // Define class Rectangle class Rectangle { // Add public functions to Rectangle: area() and convert() public: Rectangle(const Square& a); int Area() const; private: // Add private attributes width, height int width {0}; int height {0}; }; // Define a Rectangle constructor that takes a Square Rectangle::Rectangle(const Square& a) : width(a.side), height(a.side) {} // Define Area() to compute area of Rectangle int Rectangle::Area() const { return width * height; } // Update main() to pass the tests int main() { Square square(4); Rectangle rectangle(square); assert(rectangle.Area() == 16); }
-
-
Polymorphism
-
Polymorphism means "assuming many forms".
-
In the context of object-oriented programming, polymorphism describes a paradigm in which a function may behave differently depending on how it is called. In particular, the function will perform differently based on its inputs.
-
Polymorphism can be achieved in two ways in C++: overloading and overriding. In this exercise we will focus on overloading.
-
Overloading
-
In C++, you can write two (or more) versions of a function with the same name. This is called "overloading". Overloading requires that we leave the function name the same, but we modify the function signature. For example, we might define the same function name with multiple different configurations of input arguments.
-
This example overloads the Date class constructor:
-
#include <ctime> class Date { public: Date(int day, int month, int year) : day_(day), month_(month), year_(year) {} Date(int day, int month) : day_(day), month_(month) // automatically sets the Date to the current year { time_t t = time(NULL); tm* timePtr = localtime(&t); year_ = timePtr->tm_year + 1900; // tm_year counts years since 1900 } private: int day_; int month_; int year_; };
-
-
#include <iostream> class Human {}; class Dog {}; class Cat {}; // TODO: Write hello() function void hello() { std::cout << "Hello, World!\n"; } // TODO: Overload hello() three times void hello(Human human) { std::cout << "Hello, Human!\n"; } void hello(Dog dog) { std::cout << "Hello, Dog!\n"; } void hello(Cat cat) { std::cout << "Hello, Cat!\n"; } // TODO: Call hello() from main() int main() { hello(); hello(Human()); hello(Dog()); hello(Cat()); }
-
-
-
Operator Overloading
-
In this exercise you'll see how to achieve polymorphism with operator overloading. You can choose any operator from the ASCII table and give it your own set of rules!
-
Operator overloading can be useful for many things. Consider the + operator. We can use it to add ints, doubles, floats, or even std::strings.
-
In order to overload an operator, use the operator keyword in the function signature:
-
Complex operator+(const Complex& addend) { //...logic to add complex numbers }
-
-
Imagine vector addition. You might want to perform vector addition on a pair of points to add their x and y components. The compiler won't recognize this type of operation on its own, because this data is user defined. However, you can overload the + operator so it performs the action that you want to implement.
-
#include <assert.h> // TODO: Define Point class class Point { public: // TODO: Define public constructor Point(int x = 0, int y = 0) : x(x), y(y) {} // TODO: Define + operator overload Point operator+(const Point& addend) { Point sum; sum.x = x + addend.x; sum.y = y + addend.y; return sum; } // TODO: Declare attributes x and y int x, y; }; // Test in main() int main() { Point p1(10, 5), p2(2, 4); Point p3 = p1 + p2; // An example call to "operator +"; assert(p3.x == p1.x + p2.x); assert(p3.y == p1.y + p2.y); }
-
-
-
Virtual Functions
-
Virtual functions are a polymorphic feature. These functions are declared (and possibly defined) in a base class, and can be overridden by derived classes.
-
This approach declares an interface at the base level, but delegates the implementation of the interface to the derived classes.
-
In this exercise, class Shape is the base class. Geometrical shapes possess both an area and a perimeter. Area() and Perimeter() should be virtual functions of the base class interface. Append
= 0
to each of these functions in order to declare them to be "pure" virtual functions. -
A pure virtual function is a virtual function that the base class declares but does not define.
-
A pure virtual function has the side effect of making its class abstract. This means that the class cannot be instantiated. Instead, only classes that derive from the abstract class and override the pure virtual function can be instantiated.
-
class Shape { public: Shape() {} virtual double Area() const = 0; virtual double Perimeter() const = 0; };
-
Virtual functions can be defined by derived classes, but this is not required. However, if we mark the virtual function with
= 0
in the base class, then we are declaring the function to be a pure virtual function. This means that the base class does not define this function. A derived class must define this function, or else the derived class will be abstract. -
// Example solution for Shape inheritance #include <assert.h> #include <cmath> // TODO: Define pi #define PI 3.14159 // TODO: Define the abstract class Shape class Shape { public: // TODO: Define public virtual functions Area() and Perimeter() // TODO: Append the declarations with = 0 to specify pure virtual functions virtual double Area() const = 0; virtual double Perimeter() const = 0; }; // TODO: Define Rectangle to inherit publicly from Shape class Rectangle : public Shape { public: // TODO: Declare public constructor Rectangle(double width, double height) : width_(width), height_(height) {} // TODO: Override virtual base class functions Area() and Perimeter() double Area() const override { return width_ * height_; } double Perimeter() const override { return 2 * (width_ + height_); } private: // TODO: Declare private attributes width and height double width_; double height_; }; // TODO: Define Circle to inherit from Shape class Circle : public Shape { public: // TODO: Declare public constructor Circle(double radius) : radius_(radius) {} // TODO: Override virtual base class functions Area() and Perimeter() double Area() const override { return pow(radius_, 2) * PI; } double Perimeter() const override { return 2 * radius_ * PI; } private: // TODO: Declare private member variable radius double radius_; }; // Test in main() int main() { double epsilon = 0.1; // useful for floating point equality // Test circle Circle circle(12.31); assert(std::abs(circle.Perimeter() - 77.35) < epsilon); assert(std::abs(circle.Area() - 476.06) < epsilon); // Test rectangle Rectangle rectangle(10, 6); assert(rectangle.Perimeter() == 32); assert(rectangle.Area() == 60); }
-
Polymorphism: Overriding
-
"Overriding" a function occurs when:
-
A base class declares a virtual function.
-
A derived class overrides that virtual function by defining its own implementation with an identical function signature (i.e. the same function name and argument types).
-
-
class Animal { public: virtual std::string Talk() const = 0; }; class Cat : public Animal { public: std::string Talk() const override { return std::string("Meow"); } };
-
#include <assert.h> #include <string> class Animal { public: virtual std::string Talk() const = 0; }; // TODO: Declare a class Dog that inherits from Animal class Dog : public Animal { public: std::string Talk() const override; }; std::string Dog::Talk() const { return "Woof"; } int main() { Dog dog; assert(dog.Talk() == "Woof"); }
-
-
Override
-
"Overriding" a function occurs when a derived class defines the implementation of a
virtual
function that it inherits from a base class. -
It is possible, but not required, to specify a function declaration as
override
. -
class Shape { public: virtual double Area() const = 0; virtual double Perimeter() const = 0; }; class Circle : public Shape { public: Circle(double radius) : radius_(radius) {} double Area() const override { return pow(radius_, 2) * PI; } // specified as an override function double Perimeter() const override { return 2 * radius_ * PI; } // specified as an override function private: double radius_; };
-
This specification tells both the compiler and the human programmer that the purpose of this function is to override a virtual function. The compiler will verify that a function specified as
override
does indeed override some other virtual function, or otherwise the compiler will generate an error. -
Specifying a function as
override
is good practice, as it empowers the compiler to verify the code, and communicates the intention of the code to future users. -
#include <assert.h> #include <cmath> // TODO: Define PI #define PI 3.14159 // TODO: Declare abstract class VehicleModel class VehicleModel { // TODO: Declare virtual function Move() virtual void Move(double v, double phi) = 0; }; // TODO: Derive class ParticleModel from VehicleModel class ParticleModel : public VehicleModel { public: // TODO: Override the Move() function void Move(double v, double phi) override { theta += phi; x += v * cos(theta); y += v * sin(theta); } // TODO: Define x, y, and theta double x = 0; double y = 0; double theta = 0; }; // TODO: Derive class BicycleModel from ParticleModel class BicycleModel : public ParticleModel { public: // TODO: Override the Move() function void Move(double v, double phi) override { theta += v / L * tan(phi); x += v * cos(theta); y += v * sin(theta); } // TODO: Define L double L = 1; }; // TODO: Pass the tests int main() { // Test function overriding ParticleModel particle; BicycleModel bicycle; particle.Move(10, PI / 9); bicycle.Move(10, PI / 9); assert(particle.x != bicycle.x); assert(particle.y != bicycle.y); assert(particle.theta != bicycle.theta); }
-
-
Multiple Inheritance
-
In this exercise, you'll get some practical experience with multiple inheritance. If you have class Animal and another class Pet, then you can construct a class Dog, which inherits from both of these base classes. In doing this, you are able to incorporate attributes of multiple base classes.
-
The Core Guidelines have some worthwhile recommendations about how and when to use multiple inheritance:
-
#include <iostream> #include <string> #include <assert.h> class Animal { public: double age; }; class Pet { public: std::string name; }; // Dog derives from *both* Animal and Pet class Dog : public Animal, public Pet { public: std::string breed; }; class Cat : public Animal, public Pet { public: std::string color; }; int main() { Cat cat; cat.color = "black"; cat.age = 10; cat.name = "Max"; assert(cat.color == "black"); assert(cat.age == 10); assert(cat.name == "Max"); }
-
-
-
Generic Programming / Templates
-
Templates
-
Templates enable generic programming by generalizing a function to apply to any class. Specifically, templates use types as parameters so that the same implementation can operate on different data types.
-
For example, you might need a function to accept many different data types. The function acts on those arguments, perhaps dividing them or sorting them or something else. Rather than writing and maintaining the multiple function declarations, each accepting slightly different arguments, you can write one function and pass the argument types as parameters. At compile time, the compiler then expands the code using the types that are passed as parameters.
-
#include <iostream> template <typename Type> Type Sum(Type a, Type b) { return a + b; } int main() { std::cout << Sum<double>(20.0, 13.7) << "\n"; }
-
Because Sum() is defined with a template, when the program calls Sum() with doubles as parameters, the function expands to become:
-
double Sum(double a, double b) { return a+b; }
-
-
Or in this case:
-
std::cout << Sum<char>('Z', 'j') << "\n";
-
-
The program expands to become:
-
char Sum(char a, char b) { return a+b; }
-
-
-
We use the keyword
template
to specify which function is generic. Generic code is the term for code that is independent of types. It is mandatory to put the template<>
tag (with the template parameters inside the angle brackets, e.g. template<typename T>) before the function signature, to mark that the declaration is generic. -
Besides
template
, the keywordtypename
(or, alternatively,class
) specifies the generic type in the function prototype. The parameters that follow typename (or class) represent generic types in the function declaration. -
In order to instantiate a template with a specific type, name the type in angle brackets, for example:
Sum<double>(20.0, 13.7)
. You might recognize this form as the same form used to construct a vector. That's because vectors are indeed a generic class! -
#include <assert.h> // TODO: Create a generic function Product that multiplies two parameters template <typename T> T Product(T a, T b) { return a * b; } int main() { assert(Product<int>(10, 2) == 20); }
-
C++20 has a new feature for generic programming called
concept
. Class templates, function templates, and non-template functions (typically members of class templates) may be associated with a constraint, which specifies the requirements on template arguments, which can be used to select the most appropriate function overloads and template specializations. -
Named sets of such requirements are called concepts. Each concept is a predicate, evaluated at compile time, and becomes a part of the interface of a template where it is used as a constraint:
-
#include <string> #include <cstddef> #include <concepts> // Declaration of the concept "Hashable", which is satisfied by any type 'T' // such that for values 'a' of type 'T', the expression std::hash<T>{}(a) // compiles and its result is convertible to std::size_t template<typename T> concept Hashable = requires(T a) { { std::hash<T>{}(a) } -> std::convertible_to<std::size_t>; }; struct meow {}; // Constrained C++20 function template: template<Hashable T> void f(T) {} // // Alternative ways to apply the same constraint: // template<typename T> // requires Hashable<T> // void f(T) {} // // template<typename T> // void f(T) requires Hashable<T> {} int main() { using std::operator""s; f("abc"s); // OK, std::string satisfies Hashable //f(meow{}); // Error: meow does not satisfy Hashable }
-
#include <assert.h> // TODO: Declare a generic, templatized function Max() template <typename T> T Max(T a, T b) { return a > b ? a : b; } int main() { assert(Max(10, 50) == 50); assert(Max(5.7, 1.436246) == 5.7); }
-
-
Deduction
-
In this example, you will see the difference between total and partial deduction.
-
Deduction occurs when you instantiate an object without explicitly identifying the types. Instead, the compiler "deduces" the types. This can be helpful for writing code that is generic and can handle a variety of inputs.
-
#include <assert.h> // TODO: Declare a generic, templatized average function template <typename T> T average(T a, T b) { return (a+b)/2; } int main() { assert(average(2.0,5.0) == 3.5); }
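-
To make the contrast concrete, here is a small sketch (reusing the average function from the exercise above) of explicitly naming the type versus letting the compiler deduce it:
-
#include <assert.h>

template <typename T>
T average(T a, T b) { return (a + b) / 2; }

int main() {
    // Explicit: T is named, so no deduction takes place.
    assert(average<double>(2, 5) == 3.5);

    // Deduced: the compiler infers T = double from both arguments.
    assert(average(2.0, 5.0) == 3.5);

    // Mixing an int and a double without naming T would fail, because
    // deduction finds conflicting candidates for T:
    // average(2, 5.0);                       // error: deduction conflict
    assert(average<double>(2, 5.0) == 3.5);   // naming T resolves it
}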
-
-
Exercise
-
#include <assert.h> #include <string> #include <sstream> // TODO: Add the correct template specification template <typename KeyType, typename ValueType> class Mapping { public: Mapping(KeyType key, ValueType value) : key(key), value(value) {} std::string Print() const { std::ostringstream stream; stream << key << ": " << value; return stream.str(); } KeyType key; ValueType value; }; // Test int main() { Mapping<std::string, int> mapping("age", 20); assert(mapping.Print() == "age: 20"); }
-
-
Memory Addresses and Hexadecimal Numbers
-
Understanding the number system used by computers to store and process data is essential for effective memory management, which is why we will start with an introduction into the binary and hexadecimal number systems and the structure of memory addresses.
-
Early attempts to invent an electronic computing device met with disappointing results as long as engineers and computer scientists tried to use the decimal system. One of the biggest problems was the low distinctiveness of the individual symbols in the presence of noise. A 'symbol' in our alphabet might be a letter in the range A-Z while in our decimal system it might be a number in the range 0-9. The more symbols there are, the harder it can be to differentiate between them, especially when there is electrical interference. After many years of research, an early pioneer in computing, John Atanasoff, proposed to use a coding system that expressed numbers as sequences of only two digits: one by the presence of a charge and one by the absence of a charge. This numbering system is called Base 2 or binary and it is represented by the digits 0 and 1 (called 'bit') instead of 0-9 as with the decimal system. Differentiating between only two symbols, especially at high frequencies, was much easier and more robust than with 10 digits. In a way, the ones and zeroes of the binary system can be compared to Morse Code, which is also a very robust way to transmit information in the presence of much interference. This was one of the primary reasons why the binary system quickly became the standard for computing.
-
Inside each computer, all numbers, characters, commands and every imaginable type of information are represented in binary form. Over the years, many coding schemes and techniques were invented to manipulate these 0s and 1s effectively. One of the most widely used schemes is called ASCII (American Standard Code for Information Interchange), which lists the binary code for a set of 128 characters. The idea was to represent each letter with a sequence of binary numbers so that storing text in computer memory and on hard (or floppy) disks would be possible.
-
The film enthusiasts among you might know the scene in the hit movie "The Martian" with Matt Damon, in which an ASCII table plays an important role in the rescue from Mars.
-
The following figure shows an ASCII table, where each character (rightmost column) is associated with an 8-digit binary number:
-
In addition to the decimal number (column "Dec") and the binary number, the ASCII table provides a third number for each character (column "Hex"). According to the table above, the letter z is referenced by the decimal number 122, by the binary number 0111 1010 and by 7A. You have probably seen this type of notation before, which is called "hexadecimal". Hexadecimal (hex) numbers are used often in computer systems, e.g. for displaying memory readouts - which is why we will look into this topic in a little more depth. Instead of having a base of 2 (such as binary numbers) or a base of 10 (such as our conventional decimal numbers), hex numbers have a base of 16. The conversion between the different numbering systems is a straightforward operation and can be easily performed with any scientific calculator. More details on how to do this can e.g. be found here.
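-
As a small illustration (a sketch added to these notes, not part of the original lesson), C++ stream manipulators and std::bitset can print the same value in all three bases:
-
#include <bitset>
#include <iostream>

int main() {
    int z = 122;  // the ASCII code of the letter 'z'

    std::cout << "decimal:     " << std::dec << z << "\n";       // 122
    std::cout << "hexadecimal: " << std::hex << z << "\n";       // 7a
    std::cout << "binary:      " << std::bitset<8>(z) << "\n";   // 01111010
}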
-
There are several reasons why it is preferable to use hex numbers instead of binary numbers (which computers store at the lowest level), three of which are given below:
-
Readability: It is significantly easier for a human to understand hex numbers as they resemble the decimal numbers we are used to. It is simply not intuitive to look at binary numbers and decide how big they are and how they relate to another binary number.
-
Information density: A hex number with two digits can express any number from 0 to 255 (because 16^2 is 256). To do the same in the binary system, we would require 8 digits. This difference is even more pronounced as numbers get larger and thus harder to deal with.
-
Conversion into bytes: Bytes are units of information consisting of 8 bits. Almost all computers are byte-addressed, meaning all memory is referenced by byte, instead of by bit. Therefore, using a counting system that can easily convert into bytes is an important requirement. We will shortly see why grouping bits into a byte plays a central role in understanding how computer memory works.
-
-
The reason why early computer scientists decided not to use decimal numbers can also be seen in the figure below. In those days (before pocket calculators were widely available), programmers had to interpret computer output in their heads on a regular basis. For them, it was much easier and quicker to look at and interpret
7E
instead of 0111 1110
. Ideally, they would have used the decimal system, but the conversion between base 2 and base 10 is much harder than between base 2 and base 16. Note in the figure that the decimal system's digit transitions never match those of the binary system. With the hexadecimal system, whose base is a power of 2, digit transitions match up each time, thus making it much easier to convert quickly between these numbering systems.
-
-
Using the Debugger to Analyze Memory
-
As you have seen in the last section, binary numbers and hex numbers can be used to represent information. A coding scheme such as an ASCII table makes it possible to convert text into binary form. In the following, we will try to look at computer memory and locate information there.
-
In the following example, we will use the debugger to look for a particular string in computer memory. Depending on your computer operating system and on the compiler you have installed, there might be several debugging tools available to you. In the following video, we will use the gdb debugger to locate the character sequence "UDACITY" in computer memory. The code creates an array of characters in computer memory (on the stack, which we will learn more about shortly) and prints it to the console (a minimal sketch of such a program is shown below).
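The original course listing is not reproduced here; a minimal sketch of such a program (assuming a file called main.cpp) might look like this:
-
#include <stdio.h>

int main()
{
    char str1[] = "UDACITY"; // character array allocated on the stack
    printf("%s\n", str1);    // print the string to the console
    return 0;
}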
For clang users, the lldb debugger can be used instead of gdb:
-
Start the binary by running lldb a.out. Make sure to compile with debug symbols enabled first: clang++ --std=c++14 -g main.cpp
-
b main / break main to mark a breakpoint at the main function
-
r / run to run the program
-
s / step to step into the code
-
c / continue to continue the process
-
More commands can be found here
-
bt / backtrace to print the stack
-
p var / print var to print a variable
- expr --raw -- &var to get a variable's address in lldb
-
x/nfu memory_address to examine memory, where
- n is how many units to print
- f is the format to print in (a: pointer, c: read as integer, print as character, d: signed decimal integer, x: hexadecimal)
- u is the unit (b: byte, h: half-word, two bytes; w: word, four bytes; g: giant word, eight bytes)
-
Example: x/7xb prints 7 units, in hexadecimal format, one byte per unit
Computer memory is treated as a sequence of cells. This means that we can use the starting address to retrieve the byte of information stored there. The following figure illustrates the principle:
-
Let us perform a short experiment using gdb again: By adding 1, 2, 3, … to the address of the string variable str1, we can proceed to the next cell until we reach the end of the memory we want to look at.
-
Note that the numbers above represent the string "UDACITY" again. Also note that once we exceed the end of the string, the memory cell has the value 0x00. This means that the experiment has shown that an offset of 1 in a hexadecimal address corresponds to an offset of 8 bits (or 1 byte) in computer memory.
-
-
Types of Computer Memory
-
In a course on memory management we obviously need to take a look at the available memory types in computer systems. Below you will find a small list of some common memory types that you will surely have heard of:
- RAM / ROM
- Cache (L1, L2)
- Registers
- Virtual Memory
- Hard Disks, USB drives
-
Let us look into these types more deeply: When the CPU of a computer needs to access memory, it wants to do this with minimal latency. Also, as large amounts of information need to be processed, the available memory should be sufficiently large with regard to the tasks we want to accomplish.
-
Regrettably though, low latency and large memory are not compatible with each other (at least not at a reasonable price). In practice, the decision for low latency usually results in a reduction of the available storage capacity (and vice versa). This is the reason why a computer has multiple memory types that are arranged hierarchically. The following pyramid illustrates the principle:
-
As you can see, the CPU and its ultra-fast (but small) registers used for short-term data storage reside at the top of the pyramid. Below are cache and RAM, which belong to the category of temporary memory which quickly loses its content once power is cut off. Finally, there are permanent storage devices such as the ROM, hard drives as well as removable drives such as USB sticks.
-
Let us take a look at a typical computer usage scenario to see how the different types of memory are used:
-
After switching on the computer, it loads data from its read-only memory (ROM) and performs a power-on self-test (POST) to ensure that all major components are working properly. Additionally, the computer memory controller checks all of the memory addresses with a simple read/write operation to ensure that memory is functioning correctly.
-
After performing the self-test, the computer loads the basic input/output system (BIOS) from ROM. The major task of the BIOS is to make the computer functional by providing basic information about such things as storage devices, boot sequence, security or auto device recognition capability.
-
The process of activating a more complex system on a simple system is called "bootstrapping": it is a solution to the chicken-and-egg problem of starting a software-driven system by itself using software. During bootstrapping, the computer loads the operating system (OS) from the hard drive into random access memory (RAM). RAM is considered "random access" because any memory cell can be accessed directly by intersecting the respective row and column in the matrix-like memory layout. For performance reasons, many parts of the OS are kept in RAM as long as the computer is powered on.
-
When an application is started, it is loaded into RAM. However, several application components are only loaded into RAM on demand to preserve memory. Files that are opened during runtime are also loaded into RAM. When a file is saved, it is written to the specified storage device. After closing the application, it is deleted from RAM.
-
-
This simple usage scenario shows the central importance of the RAM. Every time data is loaded or a file is opened, it is placed into this temporary storage area - but what about the other memory types above the RAM layer in the pyramid?
-
To maximize CPU performance, fast access to large amounts of data is critical. If the CPU cannot get the data it needs, it stops and waits for data availability. Thus, when designing new memory chips, engineers must adapt to the speed of the available CPUs. The problem they are facing is that memory which is able to keep up with modern CPUs running at several GHz is extremely expensive. To combat this, computer designers have created the memory tier system which has already been shown in the pyramid diagram above. The solution is to use expensive memory in small quantities and then back it up using larger quantities of less expensive memory.
-
The cheapest form of memory available today is the hard disk. It provides large quantities of inexpensive and permanent storage. The problem with a hard disk is its comparatively low speed - even though access times with modern solid state disks (SSD) have decreased significantly compared to older magnetic-disc models.
-
The next hierarchical level above hard disks or other external storage devices is the RAM. We will not discuss in detail how it works but only take a look at some key performance metrics of the CPU at this point, which place certain performance expectations on the RAM and its designers:
-
The bit size of the CPU decides how many bytes of data it can access in RAM at the same time. A 16-bit CPU can access 2 bytes (with each byte consisting of 8 bits) while a 64-bit CPU can access 8 bytes at a time.
-
The processing speed of the CPU is measured in Gigahertz or Megahertz and denotes the number of operations it can perform in one second.
-
-
From processing speed and bit size, the data rate required to keep the CPU busy can easily be computed by multiplying bit size with processing speed. With modern CPUs and ever-increasing speeds, the available RAM in the market will not be fast enough to match the CPU data rate requirements.
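As a worked example (the numbers are chosen for illustration and are not taken from the course material): a hypothetical 64-bit CPU running at 3 GHz can request 8 bytes per cycle, so keeping it busy would require a data rate on the order of 3 × 10^9 × 8 bytes ≈ 24 GB/s - well beyond what typical RAM modules deliver.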
-
Cache Levels
-
Cache memory is much faster but also significantly smaller than standard RAM. It holds the data that will (or might) be used by the CPU more often. In the memory hierarchy we have seen in the last section, the cache plays an intermediary role between fast CPU and slow RAM and hard disk. The figure below gives a rough overview of a typical system architecture:
-
System architecture diagram showing caches, ALU (arithmetic logic unit), main memory, and the buses connecting each component.
-
-
The central CPU chip is connected to the outside world by a number of buses. There is a cache bus, which leads to a block denoted as L2 cache, and there is a system bus as well as a memory bus that leads to the computer main memory. The latter holds the comparatively large RAM while the L2 cache as well as the L1 cache are very small with the latter also being a part of the CPU itself.
-
The concept of L1 and L2 (and even L3) cache is further illustrated by the following figure, which shows a multi-core CPU and its interplay with L1, L2 and L3 caches:
-
Level 1 cache is the fastest and smallest memory type in the cache hierarchy. In most systems, the L1 cache is not very large. Mostly it is in the range of 16 to 64 kBytes, where the memory areas for instructions and data are separated from each other (L1i and L1d, where "i" stands for "instruction" and "d" stands for "data". Also see "Harvard architecture" for further reference). The importance of the L1 cache grows with increasing speed of the CPU. In the L1 cache, the most frequently required instructions and data are buffered so that as few accesses as possible to the slow main memory are required. This cache avoids delays in data transmission and helps to make optimum use of the CPU's capacity.
-
Level 2 cache is located close to the CPU and has a direct connection to it. The information exchange between L2 cache and CPU is managed by the L2 controller on the computer main board. The size of the L2 cache is usually at or below 2 megabytes. On modern multi-core processors, the L2 cache is often located within the CPU itself. The choice between a processor with a higher clock speed or a larger L2 cache can be answered as follows: with a higher clock speed, individual programs run faster, especially those with high computing requirements. As soon as several programs run simultaneously, a larger cache is advantageous. For a normal desktop computer, a processor with a large cache is usually a better choice than one with a slightly higher clock rate.
-
Level 3 cache is shared among all cores of a multicore processor. With the L3 cache, the cache coherence protocol of multicore processors can work much faster. This protocol compares the caches of all cores to maintain data consistency so that all processors have access to the same data at the same time. The L3 cache therefore acts less as a classic cache and more as a facility to simplify and accelerate the cache coherence protocol and the data exchange between the cores.
-
On a Mac, information about the system cache can be obtained by executing the command sysctl -a hw in a terminal. On Debian Linux, this information can be found with lscpu | grep cache. On my iMac Pro (2017), the Mac command yielded (among others) the following output:
hw.memsize: 34359738368
hw.l1icachesize: 32768
hw.l1dcachesize: 32768
hw.l2cachesize: 1048576
hw.l3cachesize: 14417920
-
- hw.l1icachesize is the size of the L1 instruction cache, which is 32 kB. This cache is strictly reserved for storing CPU instructions only.
- hw.l1dcachesize is also 32 kB and is dedicated to data as opposed to instructions.
- hw.l2cachesize and hw.l3cachesize show the sizes of the L2 and L3 caches, which are 1 MB and roughly 14 MB respectively.
It should be noted that the size of all caches combined is very small when compared to the size of the main memory (the RAM), which is at 32GB on my system.
-
-
Ideally, data needed by the CPU should be read from the various caches for more than 90% of all memory access operations. This way, the high latency of RAM and hard disk can be efficiently compensated.
-
Temporal and Spatial Locality
-
The following table presents a rough overview of the latency of various memory access operations. Even though these numbers will differ significantly between systems, the order of magnitude between the different memory types is noteworthy. While an L1 access takes roughly as long as a photon needs to travel one foot at the speed of light (about a nanosecond), L2 access is already roughly one order of magnitude slower, and access to main memory is two orders of magnitude slower.
-
Originally from Peter Norvig: http://norvig.com/21-days.html#answers
-
-
In algorithm design, programmers can exploit two principles to increase runtime performance:
-
Temporal locality means that address ranges that are accessed are likely to be used again in the near future. In the course of time, the same memory address is accessed relatively frequently (e.g. in a loop). This property can be used at all levels of the memory hierarchy to keep memory areas accessible as quickly as possible.
-
Spatial locality means that after an access to an address range, the next access to an address in the immediate vicinity is highly probable (e.g. in arrays). In the course of time, memory addresses that are very close to each other are accessed again multiple times. This can be exploited by moving the adjacent address areas upwards into the next hierarchy level during a memory access.
-
-
Let us consider the following code example:
-
#include <chrono>
#include <iostream>

int main()
{
    // create array
    const int size = 4;

    // static means the variable is not allocated in the stack (stored in data segment or in BSS segment).
    // What it is useful for however is if you have some large structure used in main that would be too big for the stack.
    // Then, declaring the variable as static means it lives in the data segment.
    // Being static also means that, if uninitialized, the variable will be initialized with all 0's, just like globals.
    static int x[size][size];

    auto t1 = std::chrono::high_resolution_clock::now();

    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            x[j][i] = i + j;
            // std::cout << &x[j][i] << ": i=" << i << ", j=" << j << std::endl;
        }
    }

    // print execution time to console
    auto t2 = std::chrono::high_resolution_clock::now(); // stop time measurement
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
    std::cout << "Execution time: " << duration << " microseconds" << std::endl;

    return 0;
}
-
The order in which we access our array impacts speed in this example. Accessing "row" neighbour addresses via x[i][j] is faster than accessing "column" neighbour addresses via x[j][i]. Remember, a matrix is stored row by row in one contiguous block of memory, just like an array.
As can be seen, the rows of the two-dimensional matrix are stored one after the other. This format is called "row major" and is the default for both C and C++. Some other languages such as Fortran are "column major" and a memory-aware programmer should always know the memory layout of the language he or she is using.
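As a general rule of thumb (not specific to the course code): for a row-major array declared as int x[ROWS][COLS], the element x[i][j] lives at the address of x[0][0] plus (i * COLS + j) * sizeof(int) bytes, so iterating over j in the inner loop touches consecutive addresses.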
-
Note that even though the row major memory layout is used in C++, this doesn't mean that all C++ libraries have the same default; for example, the popular Eigen library used for linear algebra stores matrices in column-major order by default.
-
-
As we have created an array of integers, the difference between two adjacent memory cells will be sizeof(int), which is 4 bytes. Let us verify this with the 4x4 array by plotting both the address and the index numbers to the console. Be sure to revert the array access back to x[i][j] = i + j.
You can plot by uncommenting the printout line in the inner for loop:
-
0x6021e0: i=0, j=0
0x6021e4: i=0, j=1
0x6021e8: i=0, j=2
0x6021ec: i=0, j=3
0x6021f0: i=1, j=0
0x6021f4: i=1, j=1
0x6021f8: i=1, j=2
0x6021fc: i=1, j=3
0x602200: i=2, j=0
0x602204: i=2, j=1
0x602208: i=2, j=2
0x60220c: i=2, j=3
0x602210: i=3, j=0
0x602214: i=3, j=1
0x602218: i=3, j=2
0x60221c: i=3, j=3

Execution time: 83 microseconds
-
-
When we interchange the indices i and j when accessing the array as
-
x[j][i] = i + j;
std::cout << &x[j][i] << ": i=" << j << ", j=" << i << std::endl;
-
-
we get the following output:
-
0x6021e0: i=0, j=0
0x6021f0: i=1, j=0
0x602200: i=2, j=0
0x602210: i=3, j=0
0x6021e4: i=0, j=1
0x6021f4: i=1, j=1
0x602204: i=2, j=1
0x602214: i=3, j=1
0x6021e8: i=0, j=2
0x6021f8: i=1, j=2
0x602208: i=2, j=2
0x602218: i=3, j=2
0x6021ec: i=0, j=3
0x6021fc: i=1, j=3
0x60220c: i=2, j=3
0x60221c: i=3, j=3

Execution time: 115 microseconds
-
-
As can be seen, the difference between two consecutive accesses is now 0x10, which is 16 in the decimal system. This means that with each access to the matrix, four memory cells are skipped and the principle of spatial locality is violated. As a result, data that is not needed next is loaded into the L1 cache, leading to cache misses and costly reload operations - hence the significantly increased execution time between the two code samples. The difference in execution time of both code samples shows that cache-aware programming can increase runtime performance significantly.
-
-
Virtual Memory
-
Problems with physical memory
-
Virtual memory is a very useful concept in computer architecture because it helps with making your software work well given the configuration of the respective hardware on the computer it is running on.
-
The idea of virtual memory stems back from a time (not so long ago) when the random access memory (RAM) of most computers was severely limited. Programmers needed to treat memory as a precious resource and use it most efficiently. Also, they wanted to be able to run programs even if there was not enough RAM available. At the time of writing (August 2019), the amount of RAM is no longer a large concern for most computers and programs usually have enough memory available to them. But in some cases, for example when trying to do video editing or when running multiple large programs at the same time, the RAM can be exhausted. In such a case, the computer can slow down drastically.
-
There are several other memory-related problems that programmers need to know about:
-
Holes in address space: If several programs are started one after the other and then shortly afterwards some of these are terminated again, it must be ensured that the freed-up space in between the remaining programs does not remain unused. If memory becomes too fragmented, it might not be possible to allocate a large block of memory due to a large-enough free contiguous block not being available any more.
-
Programs writing over each other: If several programs are allowed to access the same memory address, they will overwrite each other's data at this location. In some cases, this might even lead to one program reading sensitive information (e.g. bank account info) that was written by another program. This problem is of particular concern when writing concurrent programs which run several threads at the same time.
-
-
The basic idea of virtual memory is to separate the addresses a program may use from the addresses in physical computer memory. By using a mapping function, an access to (virtual) program memory can be redirected to a real address which is guaranteed to be protected from other programs.
-
In the following, you will see how virtual memory solves the problems mentioned above and you will also learn about the concepts of memory pages, frames and mapping. A sound knowledge of virtual memory will help you understand the C++ memory model, which will be introduced in the next lesson of this course.
-
Quiz
-
On a 32-bit machine, each program has its own 32-bit address space. When a program wants to access a memory location, it must specify a 32-bit address, which directs it to the byte stored at this location. On a hardware level, this address is transported to the physical memory via a parallel bus with 32 cables, i.e. each cable can carry either 'high voltage' or 'low voltage' (i.e. '1' or '0').
-
How large is the address space on a 32-bit system? What is the upper limit for program memory in GB?
- Correct! 2^32 bytes = 4 GB; a 32-bit address space gives a program a (theoretical) total of 4 GB of memory it can address. In practice, the operating system reserves some of this space, however.
-
-
Expanding the available memory
-
As you have just learned in the quiz, the total amount of addressable memory is limited and depends on the architecture of the system (e.g. 32-bit). But what would happen if the available physical memory was below the upper bound imposed by the architecture? The following figure illustrates the problem for such a case:
-
In the image above, the available physical memory is less than the upper bound provided by the 32-bit address space.
-
-
On a typical architecture such as MIPS ("Microprocessor without interlocked pipeline stages"), each program is promised to have access to an address space ranging from 0x00000000 up to 0xFFFFFFFF. If however, the available physical memory is only 1GB in size, a 1-on-1 mapping would lead to undefined behavior as soon as the 30-bit RAM address space were exceeded.
-
With virtual memory however, a mapping is performed between the virtual address space a program sees and the physical addresses of various storage devices such as the RAM but also the hard disk. Mapping makes it possible for the operating system to use physical memory for the parts of a process that are currently being used and back up the rest of the virtual memory to a secondary storage location such as the hard disk. With virtual memory, the size of RAM is not the limit anymore as the system hard disk can be used to store information as well.
-
The following figure illustrates the principle:
-
With virtual memory, the RAM acts as a cache for the virtual memory space which resides on secondary storage devices. On Windows systems, the file pagefile.sys is such a virtual memory container of varying size. To speed up your system, it makes sense to adjust the system settings in a way that this file is stored on an SSD instead of a slow magnetic hard drive, thus reducing the latency. On a Mac, swap files are stored in /private/var/vm/.
-
-
In a nutshell, virtual memory guarantees us a fixed-size address space which is largely independent of the system configuration. Also, the OS guarantees that the virtual address spaces of different programs do not interfere with each other.
-
The task of mapping addresses and of providing each program with its own virtual address space is performed entirely by the operating system, so from a programmer’s perspective, we usually don’t have to bother much about memory that is being used by other processes.
-
Before we take a closer look at an example though, let us define two important terms which are often used in the context of caches and virtual memory:
-
A memory page is a number of directly successive memory locations in virtual memory defined by the computer architecture and by the operating system. The computer memory is divided into memory pages of equal size. The use of memory pages enables the operating system to perform virtual memory management. The entire working memory is divided into tiles and each address in this computer architecture is interpreted by the Memory Management Unit (MMU) as a logical address and converted into a physical address.
-
A memory frame is mostly identical to the concept of a memory page with the key difference being its location in the physical main memory instead of the virtual memory.
-
-
The following diagram shows two running processes and a collection of memory pages and frames:
-
As can be seen, both processes have their own virtual memory space. Some of the pages are mapped to frames in the physical memory and some are not. If process 1 needs to use memory in the memory page that starts at address 0x1000, a page fault will occur if the required data is not there. The memory page will then be mapped to a vacant memory frame in physical memory. Also, note that the virtual memory addresses are not the same as the physical addresses. The first memory page of process 1, which starts at the virtual address 0x0000, is mapped to a memory frame that starts at the physical address 0x2000.
-
In summary, virtual memory management is performed by the operating system and programmers usually do not interfere with this process. The major benefit is a unique perspective on a chunk of memory for each program that is only limited in its size by the architecture of the system (32 bit, 64 bit) and by the available physical memory, including the hard disk.
-
-
Variables and Memory
-
The Process Memory Model
-
As we have seen in the previous lesson, each program is assigned its own virtual memory by the operating system. This address space is arranged in a linear fashion with one block of data being stored at each address. It is also divided into several distinct areas as illustrated by the figure below:
-
The last address 0xFFFFFFFF converts to the decimal number 4,294,967,295, which is the total number of memory blocks that can theoretically be addressed on a 32-bit operating system - hence the well-known limit of 4 GB of memory. On a 64-bit system, the available space is significantly (!) larger. Also, the addresses are stored with 8 bytes instead of 4 bytes.
From a programming perspective though, we are not able to use the entire address space. Instead, the blocks "OS Kernel Space" and "Text" are reserved for the operating system. In kernel space, only the most trusted code is executed - it is fully maintained by the operating system and serves as an interface between the user code and the system kernel. In this course, we will not be directly concerned with this part of memory. The section called 'text' holds the program code generated by the compiler and linker. As with the kernel space, we will not be using this block directly in this course. Let us now take a look at the remaining blocks, starting from the top:
-
The stack is a contiguous memory block with a fixed maximum size. If a program exceeds this size, it will crash. The stack is used for storing automatically allocated variables such as local variables or function parameters. If there are multiple threads in a program, then each thread has its own stack memory. New memory on the stack is allocated when the path of execution enters a scope and freed again once the scope is left. It is important to know that the stack is managed "automatically" by the compiler, which means we do not have to concern ourselves with allocation and deallocation.
-
The heap (also called "free store" in C++) is where data with dynamic storage lives. It is shared among multiple threads in a program, which means that memory management for the heap needs to take concurrency into account. This makes memory allocations in the heap more complicated than stack allocations. In general, managing memory on the heap is more (computationally) expensive for the operating system, which makes it slower than stack memory. Contrary to the stack, the heap is not managed automatically by the system, but by the programmer. If memory is allocated on the heap, it is the programmer’s responsibility to free it again when it is no longer needed. If the programmer manages the heap poorly or not at all, there will be trouble.
-
The BSS (Block Started by Symbol) segment is used in many compilers and linkers for a segment that contains global and static variables that are initialized with zero values. This memory area is suitable, for example, for arrays that are not initialized with predefined values.
-
The Data segment serves the same purpose as the BSS segment with the major difference being that variables in the Data segment have been initialized with a value other than zero. Memory for variables in the Data segment (and in BSS) is allocated once when a program is run and persists throughout its lifetime.
-
-
-
-
Memory Allocation in C++
-
Now that we have an understanding of the available process memory, let us take a look at memory allocation in C++.
-
Not every variable in a program has a permanently assigned area of memory. The term allocate refers to the process of assigning an area of memory to a variable to store its value. A variable is deallocated when the system reclaims the memory from the variable, so it no longer has an area to store its value.
-
Generally, three basic types of memory allocation are supported:
-
Static memory allocation is performed for static and global variables, which are stored in the BSS and Data segment. Memory for these types of variables is allocated once when your program is run and persists throughout the life of your program.
-
Automatic memory allocation is performed for function parameters as well as local variables, which are stored on the stack. Memory for these types of variables is allocated when the path of execution enters a scope and freed again once the scope is left.
-
Dynamic memory allocation allows programs to request memory from the operating system at runtime when needed. This is the major difference from automatic and static allocation, where the size of the variable must be known at compile time. Dynamic memory allocation is not performed on the limited stack but on the heap and is thus (almost) only limited by the size of the address space. A short sketch contrasting the three allocation types is shown below.
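The following minimal sketch (not from the course material; variable and function names are purely illustrative) shows where variables of each allocation type live:
-
#include <cstdlib>

int counter = 42;        // static allocation: Data segment (initialized with a non-zero value)
static int buffer[1024]; // static allocation: BSS segment (zero-initialized)

void Foo(int param)      // param: automatic allocation on the stack
{
    int local = param + 1;                      // automatic allocation, freed when Foo returns
    int *heap_val = (int *)malloc(sizeof(int)); // dynamic allocation on the heap
    *heap_val = local;
    free(heap_val);                             // must be released explicitly by the programmer
}

int main()
{
    Foo(counter + buffer[0]);
    return 0;
}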
-
-
Properties of Stack Memory
-
In the available literature on C++, the terms stack and heap are used regularly, even though this is not formally correct: C++ has the free store, storage classes and the storage duration of objects. However, since stack and heap are widely used in the C++ community, we will also use them throughout this course. Should you come across the above-mentioned terms in a book or tutorial on the subject, you now know that they refer to the same concepts as stack and heap do.
-
As mentioned in the last section, the stack is the place in virtual memory where the local variables reside, including arguments to functions. Each time a function is called, the stack grows (from top to bottom) and each time a function returns, the stack contracts. When using multiple threads (as in concurrent programming), it is important to know that each thread has its own stack memory - which can be considered thread-safe.
-
In the following, a short list of key properties of the stack is listed:
-
The stack is a contiguous block of memory. It will not become fragmented (as opposed to the heap) and it has a fixed maximum size.
-
When the maximum size of the stack memory is exceeded, a program will crash.
-
Allocating and deallocating memory is fast on the stack. It only involves moving the stack pointer to a new position.
-
-
The following diagram shows the stack memory during a function call:
-
In the example, the variable x is created on the stack within the scope of main. Then, a stack frame which represents the function Add and its variables is pushed to the stack, moving the stack pointer further downwards. It can be seen that this includes the local variables a and b, as well as the return address, a base pointer and finally the return value s.
-
-
When a thread is created, stack memory is allocated by the operating system as a contiguous block. With each new function call or local variable allocation, the stack pointer is moved until eventually it will reach the bottom of said memory block. Once it exceeds this limit (which is called "stack overflow"), the program will crash. We will try to find out the limit of your computer’s stack memory in the following exercise.
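One simple (and deliberately crashing) way to probe the stack limit - a hedged sketch, not the official course exercise - is unbounded recursion that prints the address of a local variable; the difference between the first and last printed address approximates the usable stack size:
-
#include <cstdio>

void Overflow(int depth)
{
    int local = depth; // one new stack allocation per call
    printf("depth=%d, address=%p\n", depth, (void *)&local);
    Overflow(depth + 1); // no termination condition on purpose
}

int main()
{
    Overflow(0); // runs until the stack is exhausted and the program crashes
    return 0;
}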
-
Before we take a look at the heap memory in the next lesson, let us briefly revisit the principles of call-by-value and call-by-reference with regard to stack usage.
-
-
Call-by-Value vs Call-by-Reference
-
When passing parameters to a function in C++, there is a variety of strategies a programmer can choose from. In this section, we will take a look at these in turn from the perspective of stack usage. First, we will briefly revisit the definition of scope, as well as the strategies call-by-value and call-by-reference. Then, we will look at the amount of stack memory used by these methods.
-
The time between allocation and deallocation is called the lifetime of a variable. Using a variable after its lifetime has ended is a common programming error, against which most modern languages try to protect: Local variables are only available within their respective scope (e.g. inside a function) and are simply not available outside - so using them inappropriately will result in a compile-time error. When using pointer variables however, programmers must make sure that allocation is handled correctly and that no invalid memory addresses are accessed.
-
bool MyLocalFunction(int myInt)
{
    bool isBelowThreshold = myInt < 42 ? true : false;
    return isBelowThreshold;
}

int main()
{
    bool res = MyLocalFunction(23);
    return 0;
}
-
When calling a function as in the previous code example, its parameters (in this case myInt) are used to create local copies of the information provided by the caller. The caller is not sharing the parameter with the function but instead a proprietary copy is created using the assignment operator = (more about that later). When passing parameters in such a way, it is ensured that changes made to the local copy will not affect the original on the caller side. The upside to this is that inner workings of the function and the data owned by the caller are kept neatly separate.
-
However, with a slight modification, we can easily create a backchannel to the caller side. Consider the code below.
-
#include <iostream>

void AddThree(int *val)
{
    *val += 3;
}

int main()
{
    int val = 0;
    AddThree(&val);
    val += 2;

    std::cout << "val = " << val << std::endl;

    return 0;
}
-
Pointers vs. References
-
As we have seen in the examples above, the use of pointers and references to directly manipulate function arguments in a memory-effective way is very similar. Let us compare the two methods in the sketch below.
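The following side-by-side sketch (the function names are illustrative and not taken from the course code) shows the same modification performed via a pointer and via a reference:
-
#include <iostream>

// call-by-pointer: the caller must pass an address explicitly with &
void AddThreeViaPointer(int *val)
{
    *val += 3;
}

// call-by-reference: the call site looks like call-by-value, but val is an alias for the argument
void AddThreeViaReference(int &val)
{
    val += 3;
}

int main()
{
    int val = 0;
    AddThreeViaPointer(&val);  // the & makes the possible modification visible at the call site
    AddThreeViaReference(val); // the modification is not visible at the call site
    std::cout << "val = " << val << std::endl; // prints: val = 6
    return 0;
}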
-
Pointers can be declared without initialization. This means we can pass an uninitialized pointer to a function which then internally performs the initialization for us.
-
Pointers can be reassigned to another memory block on the heap.
-
References are usually easier to use (depending on the expertise level of the programmer). Sometimes however, if a third-party function is used without properly looking at the parameter definition, it might go unnoticed that a value has been modified.
-
Remember, passing a pointer may be expensive:
-
printf("size of int: %lu\n", sizeof(int)); printf("size of *int: %lu\n", sizeof(int *)); // size of int: 4 // size of *int: 8
-
Obviously, the size of the pointer variable is larger than the actual data type. As my machine has a 64-bit architecture, an address requires 8 bytes.
-
In order to benefit from call-by-reference, the size of the data type passed to the function has to surpass the size of the pointer on the respective architecture (i.e. 32 bit or 64 bit).
-
-
#include <stdio.h>

void CallByValue(int i)
{
    int j = 1;
    printf("call-by-value: %p\n", &j);
}

void CallByPointer(int *i)
{
    int j = 1;
    printf("call-by-pointer: %p\n", &j);
}

void CallByReference(int &i)
{
    int j = 1;
    printf("call-by-reference: %p\n", &j);
}

int main()
{
    int i = 0;
    printf("stack bottom: %p\n", &i);

    CallByValue(i);
    CallByPointer(&i);
    CallByReference(i);

    return 0;
}
-
CallByValue requires 32 bytes of memory. As discussed before, this is reserved for e.g. the function return address and for the local variables within the function (including the copy of i).
-
CallByPointer on the other hand requires - perhaps surprisingly - 36 bytes of memory. Let us complete the examination before going into more details on this result.
-
CallByReference finally has the same memory requirements as CallByPointer.
-
-
-
Dynamic Memory Allocation
-
Heap memory, also known as dynamic memory, is an important resource available to programs (and programmers) to store data. The following diagram again shows the layout of virtual memory with the heap being right above the BSS and Data segment.
-
As mentioned earlier, the heap memory grows upwards while the stack grows in the opposite direction. We have seen in the last lesson that the automatic stack memory shrinks and grows with each function call and local variable. As soon as the scope of a variable is left, it is automatically deallocated and the stack pointer is shifted upwards accordingly.
-
Heap memory is different in many ways: The programmer can request the allocation of memory by issuing a command such as malloc or new (more on that shortly). This block of memory will remain allocated until the programmer explicitly issues a command such as free or delete. The huge advantage of heap memory is the high degree of control a programmer can exert, albeit at the price of greater responsibility since memory on the heap must be actively managed.
Let us take a look at some properties of heap memory:
-
As opposed to local variables on the stack, memory can now be allocated in an arbitrary scope (e.g. inside a function) without it being deleted when the scope is left. Thus, as long as the address to an allocated block of memory is returned by a function, the caller can freely use it.
-
Local variables on the stack are allocated at compile-time. Thus, the size of e.g. a string variable might not be appropriate as the length of the string will not be known until the program is executed and the user inputs it. With local variables, a solution would be to allocate a long-enough array of characters and hope that the actual length does not exceed the buffer size. With dynamically allocated heap memory, variables are allocated at run-time. This means that the size of the above-mentioned string variable can be tailored to the actual length of the user input.
-
Heap memory is only constrained by the size of the address space and by the available memory. With modern 64 bit operating systems and large RAM memory and hard disks the programmer commands a vast amount of memory. However, if the programmer forgets to deallocate a block of heap memory, it will remain unused until the program is terminated. This is called a "memory leak".
-
Unlike the stack, the heap is shared among multiple threads, which means that memory management for the heap needs to take concurrency into account as several threads might compete for the same memory resource.
-
When memory is allocated or deallocated on the stack, the stack pointer is simply shifted upwards or downwards. Due to the sequential structure of stack memory management, stack memory can be managed (by the operating system) easily and securely. With heap memory, allocation and deallocation can occur arbitrarily, depending on the lifetime of the variables. This can result in fragmented memory over time, which is much more difficult and expensive to manage.
-
-
Memory Fragmentation
-
Let us construct a theoretical example of how memory on the heap can become fragmented: Suppose we are interleaving the allocation of two data types X and Y in the following fashion: First, we allocate a block of memory for a variable of type X, then another block for Y and so on in a repeated manner until some upper bound is reached. At the end of this operation, the heap might look like the following:
At some point, we might then decide to deallocate all variables of type Y, leading to empty spaces in between the remaining variables of type X. In this example, no memory for an additional X could now be squeezed in between two existing blocks of type X. A code sketch of this interleaved allocation pattern follows below.
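The following sketch reproduces this pattern in code (the struct sizes and the loop count are arbitrary, and a real allocator may group allocations by size class, so treat this purely as a conceptual illustration):
-
#include <cstdlib>

struct X { char payload[32]; };
struct Y { char payload[16]; };

int main()
{
    const int kPairs = 1000;
    X *xs[kPairs];
    Y *ys[kPairs];

    // interleave allocations of X and Y on the heap
    for (int i = 0; i < kPairs; ++i)
    {
        xs[i] = (X *)malloc(sizeof(X));
        ys[i] = (Y *)malloc(sizeof(Y));
    }

    // free all Y blocks, leaving small holes between the remaining X blocks
    for (int i = 0; i < kPairs; ++i)
    {
        free(ys[i]);
    }

    // each hole is too small to hold another X, even though
    // a considerable amount of memory is technically free

    // clean up
    for (int i = 0; i < kPairs; ++i)
    {
        free(xs[i]);
    }
    return 0;
}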
-
A classic symptom of memory fragmentation is that you try to allocate a large block and you can’t, even though you appear to have enough memory free. On systems with virtual memory however, this is less of a problem, because large allocations only need to be contiguous in virtual address space, not in physical address space.
-
When memory is heavily fragmented however, memory allocations will likely take longer because the memory allocator has to do more work to find a suitable space for the new object.
-
Until now, our examples have been only theoretical. It is time to gain some practical experience in the next section using malloc and free as C-style methods for dynamic memory management.
-
-
-
malloc and free
- So far we have only considered primitive data types, whose storage space requirement was already fixed at compile time and could be planned when building the program executable. However, it is not always possible to plan the memory requirements exactly in advance, and it is inefficient to reserve the maximum memory space each time just to be on the safe side. C and C++ offer the option to reserve memory areas during the program execution, i.e. at runtime. It is important that the reserved memory areas are released again at the "appropriate point" to avoid memory leaks. It is one of the major challenges in memory management to always locate this "appropriate point" though.
-
Allocating Dynamic Memory
-
To allocate dynamic memory on the heap means to make a contiguous memory area accessible to the program at runtime and to mark this memory as occupied so that no one else can write there by mistake.
-
To reserve memory on the heap, one of the two functions malloc (which stands for Memory Allocation) or calloc (which stands for Cleared Memory Allocation) is used. The header file stdlib.h or malloc.h must be included to use the functions.
Here is the syntax of
malloc
andcalloc
in C/C++:-
pointer_name = (cast-type*) malloc(size);
pointer_name = (cast-type*) calloc(num_elems, size_elem);
-
-
malloc
is used to dynamically allocate a single large block of memory with the specified size. It returns a pointer of typevoid
which can be cast into a pointer of any form. -
calloc
is used to dynamically allocate the specified number of blocks of memory of the specified type. It initializes each block with a default value '0'. -
Both functions return a pointer of type
void
which can be cast into a pointer of any form. If the space for the allocation is insufficient, a NULL pointer is returned. -
#include <stdio.h>
#include <stdlib.h>

int main()
{
    void *p = malloc(sizeof(int));
    printf("address=%p, value=%d\n", p, *p);
    return 0;
}
-
The sizeof operator is a convenient way of specifying the amount of memory (in bytes) needed to store a certain data type. For an int, sizeof returns 4 on most platforms. However, when compiling this code, the following warning is generated on my machine:
warning: ISO C++ does not allow indirection on operand of type 'void *' [-Wvoid-ptr-dereference]
    printf("address=%p, value=%d", p, *p);
-
In the virtual workspace, when compiling with
g++
, an error is thrown instead of a warning. -
The problem with void pointers is that there is no way of knowing the offset to the end of the allocated memory block. For an int, this would be 4 bytes but for a double, the offset would be 8 bytes. So in order to retrieve the entire block of memory that has been reserved, we need to know the data type, and the way to achieve this with malloc is by casting the return pointer:
-
int *p = (int*)malloc(sizeof(int));
-
This code now produces the following output without compiler warnings:
address=0x1003001f0, value=0
-
Obviously, the memory has been initialized with 0 in this case. However, you should not rely on such pre-initialization: the content of memory returned by malloc is indeterminate, and what you actually see depends on the platform and the compiler you are using.
-
At compile time, only the space for the pointer is reserved (on the stack). When the pointer is initialized, a block of memory of
sizeof(int)
bytes is allocated (on the heap) at program runtime. The pointer on the stack then points to this memory location on the heap. -
Modify the example in a way that memory for 3 integers is reserved.
-
// reserve memory for several integers
int *p2 = (int*)malloc(3*sizeof(int));
printf("address=%p, value=%d\n", p2, *p2);
-
-
Memory for Arrays and Structs
-
Since arrays and pointers are represented and processed identically internally, individual blocks of data can also be accessed using array syntax:
-
int *p = (int*)malloc(3*sizeof(int));
p[0] = 1;
p[1] = 2;
p[2] = 3;
printf("address=%p, second value=%d\n", p, p[1]);
-
Until now, we have only allocated memory for a C/C++ data primitive (i.e. int). However, we can also define a proprietary structure which consists of several primitive data types and use
malloc
orcalloc
in the same manner as before: -
struct MyStruct
{
    int i;
    double d;
    char a[5];
};

MyStruct *p = (MyStruct*)calloc(4, sizeof(MyStruct));
p[0].i = 1;
p[0].d = 3.14159;
p[0].a[0] = 'a';
-
After defining the struct
MyStruct
which contains a number of data primitives, a block of memory four times the size of MyStruct is created using thecalloc
command. As can be seen, the various data elements can be accessed very conveniently. -
The size of the memory area reserved with
malloc
orcalloc
can be increased or decreased with therealloc
function. -
pointer_name = (cast-type*) realloc( (cast-type*)old_memblock, new_size );
-
To do this, the function must be given a pointer to the previous memory area and the new size in bytes. Depending on the compiler, the reserved memory area is either (a) expanded or reduced internally (if there is still enough free heap after the previously reserved memory area) or (b) a new memory area is reserved in the desired size and the old memory area is released afterwards.
-
The data from the old memory area is retained, i.e. if the new memory area is larger, the data will be available within the new memory area as well. If the new memory area is smaller, the data from the old area will be available only up to the size of the new area - the rest is lost.
-
In the example below, a block of memory of initially 8 bytes (two integers) is resized to 16 bytes (four integers) using realloc.
Note that
realloc
has been used to increase the memory size and then decrease it immediately after assigning the values 3 and 4 to the new blocks. The output looks like the following:-
address=0x100300060, value=1
address=0x100300064, value=2
address=0x100300068, value=3
address=0x10030006c, value=4
-
-
#include <stdio.h>
#include <stdlib.h>

int main()
{
    // reserve memory for two integers
    int *p = (int*)malloc(2*sizeof(int));
    p[0] = 1;
    p[1] = 2;

    // resize memory to hold four integers
    p = (int*)realloc(p, 4*sizeof(int));
    p[2] = 3;
    p[3] = 4;

    // resize memory again to hold two integers
    p = (int*)realloc(p, 2*sizeof(int));

    printf("address=%p, value=%d\n", p+0, *(p+0)); // valid
    printf("address=%p, value=%d\n", p+1, *(p+1)); // valid
    printf("address=%p, value=%d\n", p+2, *(p+2)); // INVALID
    printf("address=%p, value=%d\n", p+3, *(p+3)); // INVALID

    return 0;
}
-
Interestingly, the pointers p+2 and p+3 can still access the memory location they point to. Also, the original data (numbers 3 and 4) is still there. So realloc will not erase memory but merely mark it as "available" for future allocations. It should be noted however that accessing a memory location after such an operation must be avoided as it could cause a segmentation fault. We will encounter segmentation faults soon when we discuss "dangling pointers" in one of the next lessons.
-
-
Freeing up Memory
-
If memory has been reserved, it should also be released as soon as it is no longer needed. If memory is reserved regularly without releasing it again, the memory capacity may be exhausted at some point. If the RAM memory is completely used up, the data is swapped out to the hard disk, which slows down the computer significantly.
-
The free function releases the reserved memory area so that it can be used again or made available to other programs. To do this, the pointer pointing to the memory area to be freed is specified as a parameter for the function. In the free_example.cpp, a memory area is reserved and immediately released again.
-
#include <stdio.h>
#include <stdlib.h>

int main()
{
    void *p = malloc(100);
    free(p);
    return 0;
}
-
Some things should be considered with dynamic memory management, whose neglect in some cases might result in unpredictable program behavior or a system crash - in some cases unfortunately without error messages from the compiler or the operating system:
-
free can only free memory that was reserved with malloc or calloc.
-
free can only release memory that has not been released before. Releasing the same block of memory twice will result in an error.
-
-
In the example below, a pointer p is copied into a new variable p2, which is then passed to free AFTER the original pointer has already been released.
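The course listing is not reproduced here; a minimal sketch of such a double free might look like this:
-
#include <stdlib.h>

int main()
{
    void *p = malloc(100);
    void *p2 = p; // copy of the pointer, NOT a copy of the memory block
    free(p);      // first release: fine
    free(p2);     // second release of the same block: runtime error
    return 0;
}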
free(41143,0x1000a55c0) malloc: *** error for object 0x1003001f0: pointer being freed was not allocated.
-
In the workspace, you will see this error:
*** Error in './a.out': double free or corruption (fasttop): 0x0000000000755010 ***
-
The pointer p2 in the example is invalid as soon as free(p) is called. It still holds the address of the memory location which has been freed, but may not access it anymore. Such a pointer is called a "dangling pointer".
Memory allocated with malloc or calloc is not subject to the familiar rules of variables in their respective scopes. This means that it exists independently of block limits until it is released again or the program is terminated. However, the pointers which refer to such heap-allocated memory are created on the stack and thus only exist within a limited scope. As soon as the scope is left, the pointer variable will be lost - but not the heap memory it refers to.
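To illustrate this last point, here is a minimal sketch (not from the course material) of a pointer going out of scope while the heap block it refers to lives on:
-
#include <stdlib.h>

void LeakyFunction()
{
    int *p = (int *)malloc(sizeof(int)); // block allocated on the heap
    *p = 42;
} // p (on the stack) is destroyed here, but the heap block is never freed

int main()
{
    LeakyFunction();
    // the allocated block is now unreachable - a memory leak
    return 0;
}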
-
-
Using new and delete
-
Comparing malloc with new
-
The functions malloc and free are library functions and represent the default way of allocating and deallocating memory in C. In C++, they are also part of the standard and can be used to allocate blocks of memory on the heap.
With the introduction of classes and object oriented programming in C++ however, memory allocation and deallocation have become more complex: When an object is created, its constructor needs to be called to allow for member initialization. Also, on object deletion, the destructor is called to free resources and to allow for programmer-defined clean-up tasks. For this reason, C++ introduces the operators new/delete, which represent the object-oriented counterpart to memory management with malloc/free.
#include <stdlib.h>
#include <iostream>

class MyClass
{
private:
    int *_number;

public:
    MyClass()
    {
        std::cout << "Allocate memory\n";
        _number = (int *)malloc(sizeof(int));
    }
    ~MyClass()
    {
        std::cout << "Delete memory\n";
        free(_number);
    }
    void setNumber(int number)
    {
        *_number = number;
        std::cout << "Number: " << *_number << "\n";
    }
};

int main()
{
    // allocate memory using malloc
    // comment these lines out to run the example below
    MyClass *myClass = (MyClass *)malloc(sizeof(MyClass));
    myClass->setNumber(42); // EXC_BAD_ACCESS
    free(myClass);

    // allocate memory using new
    myClass = new MyClass();
    myClass->setNumber(42); // works as expected
    delete myClass;

    return 0;
}
-
If we were to create a C++ object with malloc, the constructor and destructor of such an object would not be called. Consider the class above: the constructor allocates memory for the private element _number (yes, we could have simply used int instead of int*, but that's for educational purposes only), and the destructor releases the memory again. The setter method setNumber finally assigns a value to _number under the assumption that memory has been allocated previously.
In main, we will allocate memory for an instance of MyClass using both malloc/free and new/delete.
With malloc, the program crashes on calling the method setNumber, as no memory has been allocated for _number - because the constructor has not been called. Hence, an EXC_BAD_ACCESS error occurs when trying to access the memory location to which _number is pointing. With new, the output looks like the following:
Allocate memory
Number: 42
Delete memory
-
Before we go into further details of new/delete, let us briefly summarize the major differences between malloc/free and new/delete:
-
Constructors / Destructors: Unlike malloc(sizeof(MyClass)), the call new MyClass() calls the constructor. Similarly, delete calls the destructor.
-
Type safety: malloc returns a void pointer, which needs to be cast into the appropriate data type it points to. This is not type safe, as you can freely vary the pointer type without any warnings or errors from the compiler, as in the following small example: MyObject *p = (MyObject*)malloc(sizeof(int));
-
In C++, the call MyObject *p = new MyObject() returns the correct type automatically - it is thus type-safe.
-
Operator Overloading: As malloc and free are functions defined in a library, their behavior cannot be changed easily. The new and delete operators however can be overloaded by a class in order to include optional proprietary behavior. We will look at an example of overloading new further down in this section.
-
-
Creating and Deleting Objects
-
As with malloc and free, a call to new always has to be followed by a call to delete to ensure that memory is properly deallocated. If the programmer forgets to call delete on the object (which happens quite often, even with experienced programmers), the object resides in memory until the program terminates at some point in the future, causing a memory leak.
Let us revisit a part of the code example from above:
-
myClass = new MyClass();
myClass->setNumber(42); // works as expected
delete myClass;
-
-
The call to new has the following consequences:
-
Memory is allocated to hold a new object of type MyClass
-
A new object of type MyClass is constructed within the allocated memory by calling the constructor of MyClass
-
-
The call to delete causes the following:
-
The object of type MyClass is destroyed by calling its destructor
The memory which the object was placed in is deallocated
-
-
-
Optimizing Performance with placement new
-
In some cases, it makes sense to separate memory allocation from object construction. Consider a case where we need to reconstruct an object several times. If we were to use the standard new/delete construct, memory would be allocated and freed unnecessarily as only the content of the memory block changes but not its size. By separating allocation from construction, we can get a significant performance increase.
C++ allows us to do this by using a construct called placement new: With placement new, we can pass preallocated memory and construct an object at that memory location. Consider the following code:
void *memory = malloc(sizeof(MyClass));
MyClass *object = new (memory) MyClass;
-
The syntax new (memory) is denoted as placement new. The difference to the "conventional" new we have been using so far is that no memory is allocated. The call constructs an object and places it in the assigned memory location. There is however no delete equivalent to placement new, so we have to call the destructor explicitly in this case instead of using delete as we would have done with a regular call to new:
object->~MyClass();
free(memory);
-
-
Important: Note that this should never be done outside of placement new.
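To illustrate the reuse scenario described above, here is a hedged sketch (the class and the loop count are illustrative and not taken from the course code) that constructs and destroys an object repeatedly inside the same preallocated buffer:
-
#include <cstdlib>
#include <new> // required for placement new

struct MyClass
{
    int value;
    explicit MyClass(int v) : value(v) {}
};

int main()
{
    // allocate the buffer once
    void *memory = std::malloc(sizeof(MyClass));

    for (int i = 0; i < 1000; ++i)
    {
        MyClass *object = new (memory) MyClass(i); // construction only, no allocation
        // ... use object ...
        object->~MyClass();                        // destruction only, no deallocation
    }

    std::free(memory); // release the buffer once
    return 0;
}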
In the next section, we will look at how to overload the new operator and show the performance difference between placement new and new.
-
-
Overloading new and delete
-
#include <iostream>
#include <stdlib.h>

class MyClass
{
    int _mymember;

public:
    MyClass()
    {
        std::cout << "Constructor is called\n";
    }

    ~MyClass()
    {
        std::cout << "Destructor is called\n";
    }

    void *operator new(size_t size)
    {
        std::cout << "new: Allocating " << size << " bytes of memory" << std::endl;
        void *p = malloc(size);
        return p;
    }

    void operator delete(void *p)
    {
        std::cout << "delete: Memory is freed again " << std::endl;
        free(p);
    }
};

int main()
{
    MyClass *p = new MyClass();
    delete p;
}
-
One of the major advantages of new/delete over malloc/free is the possibility of overloading. While both malloc and free are function calls and thus cannot be changed easily, new and delete are operators and can thus be overloaded to integrate customized functionality, if needed.
The syntax for overloading the new operator looks as follows:
void* operator new(size_t size);
-
The operator receives a parameter size of type size_t, which specifies the number of bytes of memory to be allocated. The return type of the overloaded new is a void pointer, which references the beginning of the block of allocated memory.
The syntax for overloading the delete operator looks as follows:
void operator delete(void*);
-
The operator takes a pointer to the object which is to be deleted. As opposed to
new
, the operatordelete
does not have a return value. -
In the code above, both the new and the delete operator are overloaded. In new, the size of the class object in bytes is printed to the console. Also, a block of memory of that size is allocated on the heap and the pointer to this block is returned. In delete, the block of memory is freed again. The console output of this example looks as follows:
-
new: Allocating 4 bytes of memory
Constructor is called
Destructor is called
delete: Memory is freed again
-
-
As can be seen from the order of the text output, memory is allocated in
new
before the constructor is called, while the order is reversed for the destructor and the call todelete
.
-
-
Overloading new[] and delete[]:
-
In addition to the
new
anddelete
operators we have seen so far, their array forms new[] and delete[] can be overloaded as well; they are used when creating or deleting an array of objects: -
void* operator new[](size_t size); void operator delete[](void*);
-
#include <iostream> #include <stdlib.h> class MyClass { int _mymember; public: MyClass() { std::cout << "Constructor is called\n"; } ~MyClass() { std::cout << "Destructor is called\n"; } void *operator new[](size_t size) { std::cout << "new: Allocating " << size << " bytes of memory" << std::endl; void *p = malloc(size); return p; } void operator delete[](void *p) { std::cout << "delete: Memory is freed again " << std::endl; free(p); } }; int main() { MyClass *p = new MyClass[3](); delete[] p; }
-
-
In main, we are now creating an array of three objects of
MyClass
. Also, the overloadednew
anddelete
operators have been changed to accept arrays. Let us take a look at the console output:-
new: Allocating 20 bytes of memory Constructor is called Constructor is called Constructor is called Destructor is called Destructor is called Destructor is called delete: Memory is freed again
-
-
Interestingly, the memory requirement is larger than expected: With
new
, the block size was 4 bytes, which is exactly the space required for a single integer. Thus, with three integers, it should now be 12 bytes instead of 20 bytes. The reason for the extra 8 bytes is bookkeeping overhead: the runtime stores additional information alongside the array (such as the number of elements) so that delete[] can later destroy the correct number of objects, and this bookkeeping itself consumes memory. If we change the above call to e.g.new MyClass[100]()
, we will see that the overhead of 8 bytes does not change:-
new: Allocating 408 bytes of memory Constructor is called … Destructor is called delete: Memory is freed again
-
-
Reasons for overloading
new
anddelete
-
Now that we have seen how to overload the
new
anddelete
operators, let us summarize the major scenarios where it makes sense to do this:-
The overloaded
new
operator function allows the programmer to add additional parameters. Therefore, a class can have multiple overloaded new operator functions. This gives the programmer more flexibility in customizing the memory allocation for objects (see the sketch after this list). -
Overloading the new and delete operators provides an easy way to integrate a mechanism similar to garbage collection (such as in Java), as we will shortly see later in this course. -
By adding exception handling capabilities into new and delete, the code can be made more robust.
-
It is very easy to add customized behavior, such as overwriting deallocated memory with zeros in order to increase the security of critical application data.
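To make the first point concrete, here is a sketch of an overloaded new that takes an additional parameter (the Tag type, its label member and the "debug" string are illustrative assumptions, not part of the class discussed above):
#include <iostream>
#include <stdlib.h>

struct Tag
{
    const char *label; // illustrative extra information passed to the allocation
};

class MyClass
{
    int _mymember;

public:
    // additional parameters appear after the mandatory size parameter
    void *operator new(size_t size, Tag tag)
    {
        std::cout << "new (" << tag.label << "): allocating " << size << " bytes" << std::endl;
        return malloc(size);
    }
    // matching overload, invoked automatically only if the constructor throws
    void operator delete(void *p, Tag) { free(p); }
    // regular delete, used by ordinary delete expressions
    void operator delete(void *p) { free(p); }
};

int main()
{
    MyClass *p = new (Tag{"debug"}) MyClass; // the extra argument selects the overload
    delete p;
    return 0;
}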
-
-
-
-
Overview of memory management problems
-
One of the primary advantages of C++ is the flexibility and control of resources such as memory it gives to the programmer. This advantage is further amplified by a significant increase in the performance of C++ programs compared to other languages such as Python or Java.
-
However, these advantages come at a price, as they demand a high level of experience from the programmer. As Bjarne Stroustrup put it so elegantly:
- "C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off".
-
In this chapter, we will look at a collection of typical errors in memory management that you need to watch out for.
-
Memory Leaks Memory leaks occur when data is allocated on the heap at runtime, but not properly deallocated. A program that forgets to free a memory block is said to have a memory leak - this may be a serious problem or not, depending on the circumstances and on the nature of the program. For a program that runs, computes something, and quits immediately, memory leaks are usually not a big concern. Memory leaks are mostly problematic for programs that run for a long time and/or use large data structures. In such a case, memory leaks can gradually fill the heap until allocation requests can no longer be properly met and the program stops responding or crashes completely. We will look at an example further down in this section.
-
Buffer Overruns Buffer overruns occur when memory outside the allocated limits is overwritten and thus corrupted. One of the resulting problems is that this effect may not become immediately visible. When a problem finally does occur, cause and effect are often hard to discern. It is also sometimes possible to inject malicious code into programs in this way, but this shall not be discussed here.
-
In this example, the allocated stack memory is too small to hold the entire string. Writing past the end of the buffer is undefined behavior and typically corrupts adjacent memory or causes a crash such as a segmentation fault:
-
char str[5]; strcpy(str,"BufferOverrun"); printf("%s",str);
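One possible remedy (among others) is to make the buffer large enough for the string, or to avoid raw character buffers entirely by using std::string:
#include <cstdio>
#include <cstring>
#include <string>

int main()
{
    char str[14]; // "BufferOverrun" has 13 characters plus the terminating '\0'
    strcpy(str, "BufferOverrun");
    printf("%s\n", str);

    std::string s = "BufferOverrun"; // safer: the string manages its own memory
    printf("%s\n", s.c_str());
    return 0;
}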
-
Uninitialized Memory Depending on the C++ compiler, data structures are sometimes initialized (most often to zero) and sometimes not. So when allocating memory on the heap without proper initialization, it might sometimes contain garbage that can cause problems.
-
Generally, a variable will be automatically initialized in these cases:
- it is a class instance where the default constructor initializes all primitive types
- array initializer syntax is used, such as int a[10] = {}
- it is a global or extern variable
- it is defined
static
-
The behavior of the following code is potentially undefined:
-
int a; int b=a*42; printf("%d",b);
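A simple way to avoid this (one option among several) is to initialize the variable explicitly before it is read:
#include <cstdio>

int main()
{
    int a = 0;      // explicit initialization
    int b = a * 42; // b is now well-defined
    printf("%d\n", b);
    return 0;
}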
-
-
Incorrect pairing of allocation and deallocation Freeing a block of memory more than once is undefined behavior and will usually cause a program to crash. This can happen when a block of memory is freed that has never been allocated or has been freed before. Such behavior can also occur when improper pairings of allocation and deallocation are used, such as using malloc() with delete or new with free()
. -
In this first example, the wrong new and delete operators are paired:
-
double *pDbl=new double[5]; delete pDbl;
-
-
In this second example, the pairing is correct but a double deletion is performed:
-
char *pChr=new char[5]; delete[] pChr; delete[] pChr;
-
-
Invalid memory access This error occurs when trying to access a block of heap memory that has not yet been allocated or has already been deallocated.
-
In this example, the heap memory has already been deallocated at the time when
strcpy()
tries to access it: -
char *pStr=new char[25]; delete[] pStr; strcpy(pStr, "Invalid Access");
-
-
-
Valgrind for debugging memory leaks
-
Even experienced developers sometimes make mistakes that cannot be discovered at first glance. Instead of spending a lot of time searching, it makes sense for C and C++ programmers to use helper tools to perform automatic analyses of their code.
-
In this section, we will look at
Valgrind
, a free tool for Linux and Mac that is able to automatically detect memory errors such as leaks. Windows programmers can, for example, use the Visual Studio debugger and the C Run-time Library (CRT) to detect and identify memory leaks. More information on how to do this can be found here: Find memory leaks with the CRT Library - Visual Studio | Microsoft Docs
With recent versions of MacOS, occasional difficulties have been reported with installing
Valgrind.
A working version for MacOS Mojave can be downloaded from GitHub via Homebrew: GitHub - sowson/valgrind: Experimental Version of Valgrind for macOS 10.14.6 Mojave -
Valgrind
is a framework that facilitates the development of tools for the dynamic analysis of programs. Dynamic analysis examines the behavior of a program at runtime, in contrast to static analysis, which often checks programs for various criteria or potential errors at the source code level before, during, or after translation. More information on Valgrind can be found here: Valgrind: About -
The Memcheck tool within Valgrind can be used to detect typical errors in programs written in C or C++ that occur in connection with memory management. It is probably the best-known tool in the Valgrind suite, and the name Valgrind is often used as a synonym for Memcheck.
-
The following code generates a memory leak as the integer array has been allocated on the heap but the deallocation has been forgotten by the programmer:
-
int main() { int *pInt = new int[10]; return 0; }
-
-
The array of integers on the heap to which pInt is pointing has a size of 10 * sizeof(int), which is 40 bytes. Let us now use Valgrind to search for this leak.
-
After compiling the
memory_leaks_debugging.cpp
code file on the right to a.out, the terminal can be used to start Valgrind with the following command:valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --log-file=/home/workspace/valgrind-out.txt /home/workspace/a.out
-
Let us look at the call parameters one by one:
-
--leak-check
: Controls the search for memory leaks when the client program finishes. If set to summary, it says how many leaks occurred. If set to full, each individual leak will be shown in detail. -
--show-leak-kinds
: controls the set of leak kinds to show when --leak-check=full is specified. Options are definite, indirect, possible, reachable, all and none
--track-origins
: can be used to see where uninitialised values come from.
-
-
You can read the file into the terminal with:
cat valgrind-out.txt
-
In the following, a (small) excerpt of the
valgrind-out.txt
log file is given:-
==952== 40 bytes in 1 blocks are definitely lost in loss record 18 of 45 ... ==952== by 0x10019A377: operator new(unsigned long) (in /usr/lib/libc++abi.dylib) ... ==952== by 0x100000F8A: main (memory_leaks_debugging.cpp:12) ... ==952== LEAK SUMMARY: ==952== definitely lost: 40 bytes in 1 blocks ==952== indirectly lost: 0 bytes in 0 blocks ==952== possibly lost: 72 bytes in 3 blocks ==952== still reachable: 200 bytes in 6 blocks ==952== suppressed: 18,876 bytes in 160 blocks
-
-
As expected, the memory leak caused by the omitted deletion of the array of 10 integers in the code sample above shows up in the leak summary. Additionally, the exact position where the leak occurs in the code (line 12) can also be seen, together with the call that caused the leak.
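For reference, the leak disappears once the matching deallocation is added to the program; re-running Valgrind on the version below should no longer report the 40 lost bytes:
int main()
{
    int *pInt = new int[10];
    delete[] pInt; // matching deallocation for new[]
    return 0;
}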
-
This short introduction into memory leak search is only an example of how powerful analysis tools such as Valgrind can be used to detect memory-related problems in your code.
-
-
Resource Copying Policies
-
Default copying
-
Resource management is one of the primary responsibilities of a C++ programmer. Among resources such as multi-threaded locks, files, network and database connections, this also includes memory. The common denominator in all of these examples is that access to the resource is often managed through a handle such as a pointer. Also, after the resource has been used and is no longer needed, it must be released again so that it is available for re-use by someone else.
-
In C++, a common way of safely accessing resources is by wrapping a manager class around the handle, which is initialized when the resource is acquired (in the class constructor) and released when it is deleted (in the class destructor). This concept is often referred to as Resource Acquisition is Initialization (RAII), which we will discuss in greater depth in the next concept. One problem with this approach though is that copying the manager object will also copy the handle of the resource. This allows two objects access to the same resource - and this can mean trouble.
-
Are member variables of an object that is on the heap also automatically on the heap?
- Yes. It's on the heap. Basically, the space allocated to an object on the heap is big enough to hold all its member variables. More here
-
Consider the example on the right of managing access to a block of heap memory.
-
#include <iostream> class MyClass { private: int *_myInt; public: MyClass() { _myInt = (int *)malloc(sizeof(int)); }; ~MyClass() { free(_myInt); }; void printOwnAddress() { std::cout << "Own address on the stack is " << this << std::endl; } void printMemberAddress() { std::cout << "Managing memory block on the heap at " << _myInt << std::endl; } }; int main() { // instantiate object 1 MyClass myClass1; myClass1.printOwnAddress(); myClass1.printMemberAddress(); // copy object 1 into object 2 MyClass myClass2(myClass1); // copy constructor myClass2.printOwnAddress(); myClass2.printMemberAddress(); return 0; }
-
The class
MyClass
has a private member, which is a pointer to a heap-allocated integer. Allocation is performed in the constructor, deallocation is done in the destructor. This means that the memory block of sizesizeof(int)
is allocated when the objectsmyClass1
andmyClass2
are created on the stack and deallocated when their scope is left, which happens at the end of the main. The difference betweenmyClass1
andmyClass2
is that the latter is instantiated using the copy constructor, which duplicates the members in myClass1 - including the pointer to the heap memory where_myInt
resides. -
The output of the program looks like the following:
-
Own address on the stack is 0x7ffeefbff670 Managing memory block on the heap at 0x100300060 Own address on the stack is 0x7ffeefbff658 Managing memory block on the heap at 0x100300060 copy_constructor_1(87582,0x1000a95c0) malloc: *** error for object 0x100300060: pointer being freed was not allocated
-
Note that in the workspace, the error will read:
-
*** Error in './a.out': double free or corruption (fasttop): 0x0000000001133c20 ***
-
From the output we can see that the stack address is different for
myClass1
andmyClass2
- as was expected. The address of the managed memory block on the heap however is identical. This means that when the first object goes out of scope, it releases the memory resource by calling free in its destructor. The second object does the same - which causes the program to crash as the pointer is now referencing an invalid area of memory, which has already been freed. -
The default behavior of both copy constructor and assignment operator is to perform a shallow copy as with the example above. The following figure illustrates the concept:
-
Fortunately, in C++, the copying process can be controlled by defining a tailored copy constructor as well as a copy assignment operator. The copying process must be closely linked to the respective resource release mechanism and is often referred to as
copy-ownership
policy. Tailoring the copy constructor according to your memory management policy is an important choice you often need to make when designing a class. In the following, we will closely examine several well-known copy-ownership policies. -
It is important to point out that the = sign will not always invoke the copy constructor. When = is used to initialize a brand-new object, as in MyMovableClass obj2 = obj1, the copy constructor is called. However, if = is used on an object that has already been initialized, as in obj2 = obj3, the overloaded copy assignment operator is called instead. The short snippet below illustrates the difference.
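As a brief illustration of this distinction (SomeClass is a hypothetical class name used only for this snippet; any class with both special member functions defined behaves the same way):
#include <iostream>

class SomeClass
{
public:
    SomeClass() {}
    SomeClass(const SomeClass &) { std::cout << "copy constructor" << std::endl; }
    SomeClass &operator=(const SomeClass &) { std::cout << "copy assignment operator" << std::endl; return *this; }
};

int main()
{
    SomeClass obj1;
    SomeClass obj2 = obj1; // initialization of a new object: copy constructor is called
    SomeClass obj3;
    obj3 = obj1;           // obj3 already exists: copy assignment operator is called
    return 0;
}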
-
-
No copying policy
-
The simplest policy of all is to forbid copying and assigning class instances altogether. This can be achieved by declaring, but not defining, a private copy constructor and assignment operator (see NoCopyClass1 below) or alternatively by making both public and marking them as deleted with = delete (see NoCopyClass2 below). The second choice is more explicit and makes it clearer to the programmer that copying has been actively forbidden. Let us have a look at a code example on the right that illustrates both cases. -
class NoCopyClass1 { private: NoCopyClass1(const NoCopyClass1 &); NoCopyClass1 &operator=(const NoCopyClass1 &); public: NoCopyClass1(){}; }; class NoCopyClass2 { public: NoCopyClass2(){} NoCopyClass2(const NoCopyClass2 &) = delete; NoCopyClass2 &operator=(const NoCopyClass2 &) = delete; }; int main() { NoCopyClass1 original1; NoCopyClass1 copy1a(original1); // copy c’tor NoCopyClass1 copy1b = original1; // assigment operator NoCopyClass2 original2; NoCopyClass2 copy2a(original2); // copy c’tor NoCopyClass2 copy2b = original2; // assigment operator return 0; }
-
On compiling, we get the following error messages
-
error: calling a private constructor of class 'NoCopyClass1' NoCopyClass1 copy1(original1); NoCopyClass1 copy1b = original1; error: call to deleted constructor of 'NoCopyClass2' NoCopyClass2 copy2(original2); NoCopyClass2 copy2b = original2;
-
Both cases effectively prevent the original object from being copied or assigned. In the C++11 standard library, there are some classes for multi-threaded synchronization which use the no copying policy.
-
-
Exclusive ownership policy
-
This policy states that whenever a resource management object is copied, the resource handle is transferred from the source pointer to the destination pointer. In the process, the source pointer is set to
nullptr
to make ownership exclusive. At any time, the resource handle belongs only to a single object, which is responsible for its deletion when it is no longer needed. -
The code example on the right illustrates the basic idea of exclusive ownership.
-
#include <iostream> class ExclusiveCopy { private: int *_myInt; public: ExclusiveCopy() { _myInt = (int *)malloc(sizeof(int)); std::cout << "resource allocated" << std::endl; } ~ExclusiveCopy() { if (_myInt != nullptr) { free(_myInt); std::cout << "resource freed" << std::endl; } } ExclusiveCopy(ExclusiveCopy &source) { _myInt = source._myInt; source._myInt = nullptr; } ExclusiveCopy &operator=(ExclusiveCopy &source) { _myInt = source._myInt; source._myInt = nullptr; return *this; } }; int main() { ExclusiveCopy source; ExclusiveCopy destination(source); return 0; }
-
The class
ExclusiveCopy
overrides both the copy constructor as well as the assignment operator. Inside, the handle to the resource _myInt
is first copied from the source object and then set to null so that only a single valid handle exists. After copying, the new object is responsible for properly deleting the memory resource on the heap. The output of the program looks like the following: -
resource allocated resource freed
-
As can be seen, only a single resource is allocated and freed. So by passing handles and invalidating them, we can implement a basic version of an exclusive
ownership policy
. However, this example is not the way exclusive ownership is handled in the standard template library. One problem in this implementation is that for a short time there are effectively two valid handles to the same resource - after the handle has been copied and before it is set tonullptr
. In concurrent programs, this would cause a data race for the resource. A much better alternative to handle exclusive ownership in C++ would be to use move semantics, which we will discuss shortly in a very detailed lesson.
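As a brief preview of how the Standard Library handles exclusive ownership safely (std::unique_ptr and std::move are covered in detail later in this course), note that the Standard Library class is move-only, so an accidental copy is rejected at compile time and ownership is transferred explicitly:
#include <iostream>
#include <memory>
#include <utility>

int main()
{
    std::unique_ptr<int> source(new int(42));
    // std::unique_ptr<int> copy = source;               // would not compile: copying is forbidden
    std::unique_ptr<int> destination = std::move(source); // ownership is transferred explicitly

    std::cout << *destination << std::endl;               // prints 42
    std::cout << (source == nullptr) << std::endl;        // prints 1: source no longer owns the resource
    return 0;
}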
-
-
Deep copying policy
-
With this policy, copying and assigning class instances to each other is possible without the danger of resource conflicts. The idea is to allocate proprietary memory in the destination object and then to copy the content to which the source object handle is pointing into the newly allocated block of memory. This way, the content is preserved during copy or assignment. However, this approach increases the memory demands and the uniqueness of the data is lost: After the deep copy has been made, two versions of the same resource exist in memory.
-
Let us look at an example in the code on the bottom.
-
#include <iostream> #include <cstdlib> class DeepCopy { private: int *_myInt; public: DeepCopy(int val) { _myInt = (int *)malloc(sizeof(int)); *_myInt = val; std::cout << "resource allocated at address " << _myInt << std::endl; } ~DeepCopy() { free(_myInt); std::cout << "resource freed at address " << _myInt << std::endl; } DeepCopy(DeepCopy &source) { _myInt = (int *)malloc(sizeof(int)); *_myInt = *source._myInt; std::cout << "resource allocated at address " << _myInt << " with _myInt = " << *_myInt << std::endl; } DeepCopy &operator=(DeepCopy &source) { _myInt = (int *)malloc(sizeof(int)); *_myInt = *source._myInt; // copy the content before printing it std::cout << "resource allocated at address " << _myInt << " with _myInt = " << *_myInt << std::endl; return *this; } }; int main() { DeepCopy source(42); DeepCopy dest1(source); DeepCopy dest2 = dest1; return 0; }
-
The deep-copy version of
DeepCopy
looks similar to the exclusive ownership policy: Both the assignment operator and the copy constructor have been overloaded with the source object passed by reference. But instead of copying the source handle (and then deleting it), a proprietary block of memory is allocated on the heap and the content of the source is copied into it. -
The output of the program looks like the following:
-
resource allocated at address 0x100300060 resource allocated at address 0x100300070 with _myInt = 42 resource allocated at address 0x100300080 with _myInt = 42 resource freed at address 0x100300080 resource freed at address 0x100300070 resource freed at address 0x100300060
-
As can be seen, all copies have the same value of 42 while the address of the handle differs between source, dest1 and dest2.
-
To conclude, the following figure illustrates the idea of a deep copy:
-
-
Shared ownership policy
-
The last ownership policy we will be discussing in this course implements a
shared ownership
behavior. The idea is to perform a copy or assignment similar to the default behavior, i.e. copying the handle instead of the content (as with a shallow copy) while at the same time keeping track of the number of instances that also point to the same resource. Each time an instance goes out of scope, the counter is decremented. Once the last object is about to be deleted, it can safely deallocate the memory resource. We will see later in this course that this is the central idea of shared_ptr, which is a representative of the group of smart pointers.
The example on the right illustrates the principle.
-
#include <iostream> class SharedCopy { private: int *_myInt; static int _cnt; public: SharedCopy(int val); ~SharedCopy(); SharedCopy(SharedCopy &source); }; int SharedCopy::_cnt = 0; SharedCopy::SharedCopy(int val) { _myInt = (int *)malloc(sizeof(int)); *_myInt = val; ++_cnt; std::cout << "resource allocated at address " << _myInt << std::endl; } SharedCopy::~SharedCopy() { --_cnt; if (_cnt == 0) { free(_myInt); std::cout << "resource freed at address " << _myInt << std::endl; } else { std::cout << "instance at address " << this << " goes out of scope with _cnt = " << _cnt << std::endl; } } SharedCopy::SharedCopy(SharedCopy &source) { _myInt = source._myInt; ++_cnt; std::cout << _cnt << " instances with handles to address " << _myInt << " with _myInt = " << *_myInt << std::endl; } int main() { SharedCopy source(42); SharedCopy destination1(source); SharedCopy destination2(source); SharedCopy destination3(source); return 0; }
-
Note that the class SharedCopy has a static member _cnt, which is incremented every time a new instance of SharedCopy is created and decremented once an instance is deleted. On deletion of the last instance, i.e. when _cnt==0, the block of memory to which the handle points is deallocated.
-
The output of the program is the following:
-
resource allocated at address 0x100300060 2 instances with handles to address 0x100300060 with _myInt = 42 3 instances with handles to address 0x100300060 with _myInt = 42 4 instances with handles to address 0x100300060 with _myInt = 42 instance at address 0x7ffeefbff6f8 goes out of scope with _cnt = 3 instance at address 0x7ffeefbff700 goes out of scope with _cnt = 2 instance at address 0x7ffeefbff718 goes out of scope with _cnt = 1 resource freed at address 0x100300060
-
As can be seen, the memory is released only once as soon as the reference counter reaches zero.
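For comparison, the Standard Library implements shared ownership with std::shared_ptr, which keeps such a reference counter internally (smart pointers are discussed in detail later in this course); a minimal preview:
#include <iostream>
#include <memory>

int main()
{
    std::shared_ptr<int> source = std::make_shared<int>(42);
    std::shared_ptr<int> destination1 = source; // copying increments the internal reference counter
    std::shared_ptr<int> destination2 = source;

    std::cout << "use count = " << source.use_count() << std::endl; // prints: use count = 3
    return 0;
} // the int is freed exactly once, when the last shared_ptr goes out of scope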
-
-
The Rule of Three
-
In the previous examples we have taken a first look at several copying policies:
- Default copying
- No copying
- Exclusive ownership
- Deep copying
- Shared ownership
-
In the first example we have seen that the default implementation of the copy constructor does not consider the "special" needs of a class which allocates and deallocates a shared resource on the heap. The problem with implicitly using the default copy constructor or assignment operator is that programmers are not forced to consider the implications for the memory management policy of their program. In the case of the first example, this leads to a segmentation fault and thus a program crash.
-
In order to properly manage memory allocation, deallocation and copying behavior, we have seen that there is an intricate relationship between destructor, copy constructor and copy assignment operator. To this end, the Rule of Three states that if a class needs to have an overloaded copy constructor, copy assignment operator,
or destructor, then it must also implement the other two as well to ensure that memory is managed consistently. As we have seen, the copy constructor and copy assignment operator (which are often almost identical) control how the resource gets copied between objects, while the destructor manages the resource deletion.
You may have noted that in the previous code example, the class
SharedCopy
does not implement the assignment operator. This is a violation of the Rule of Three and thus, if we were to use something likedestination3 = source
instead ofSharedCopy destination3(source)
, the counter variable would not be properly maintained.
The copying policies discussed in this chapter are the basis for a powerful concept in C++11 - smart pointers. But before we discuss these, we need to go into further detail on move semantics, which is a prerequisite you need to learn more about so you can properly understand the exclusive ownership policy as well as the Rule of Five, both of which we will discuss very soon. But before we discuss move semantics, we need to look into the concept of
lvalues
andrvalues
in the next section.
-
-
Lvalues and Rvalues
-
What are lvalues and rvalues?
-
A good grasp of
lvalues
andrvalues
in C++ is essential for understanding the more advanced concepts ofrvalue
references and move semantics
. -
Let us start by stating that every expression in C++ has a type and belongs to a value category. When objects are created, copied or moved during the evaluation of an expression, the compiler uses these value categories to decide which method to call or which operator to use.
-
int main() { // initialize some variables on the stack int i, j, *p; // correct usage of lvalues and rvalues i = 42; // i is an lvalue and 42 is an rvalue p = new int; *p = i; // the dereferenced pointer is an lvalue delete p; ((i < 42) ? i : j) = 23; // the conditional operator returns an lvalue (either i or j) // incorrect usage of lvalues and rvalues //42 = i; // error : the left operand must be an lvalue //j * 42 = 23; // error : the left operand must be an lvalue return 0; }
-
-
Lvalue reference
-
An
lvalue
reference can be considered as an alternative name for an object. It is a reference that binds to anlvalue
and is declared using an optional list of specifiers (which we will not further discuss here) followed by the reference declarator &. The short code sample on the right declares an integeri
and a referencej
which can be used as an alias for the existing object. -
#include <iostream> int main() { int i = 1; int &j = i; ++i; ++j; std::cout << "i = " << i << ", j = " << j << std::endl; return 0; }
-
The output of the program is
i = 3, j = 3
-
We can see that the
lvalue
referencej
can be used just asi
can. A change to eitheri
orj
will affect the same memory location on the stack. -
One of the primary use-cases for
lvalue
references is the pass-by-reference semantics in function calls as in the example on the right. -
#include <iostream> void myFunction(int &val) { ++val; } int main() { int i = 1; myFunction(i); std::cout << "i = " << i << std::endl; return 0; }
-
The function
myFunction
has anlvalue
reference as a parameter, which establishes an alias to the integeri
which is passed to it inmain
.
-
-
Rvalue references
-
You already know that an
rvalue
is a temporary expression which is, among other use-cases, a means of initializing objects. In the statement int i = 42
, 42 is thervalue
. -
Let us consider an example similar to the last one, shown on the right.
-
#include <iostream> void myFunction(int &val) { std::cout << "val = " << val << std::endl; } int main() { int j = 42; myFunction(j); myFunction(42); int k = 23; myFunction(j+k); return 0; }
-
As before, the function
myFunction
takes anlvalue
reference as its argument. In main, the callmyFunction(j)
works just fine whilemyFunction(42)
as well asmyFunction(j+k)
produces the following compiler error on Mac: -
candidate function not viable: expects an l-value for 1st argument
-
and the following error in the workspace with g++:
-
error: cannot bind non-const lvalue reference of type ‘int&’ to an rvalue of type ‘int’
-
While the number
42
is obviously anrvalue
, withj+k
things might not be so obvious, asj
andk
are variables and thuslvalues
. To compute the result of the addition, the compiler has to create a temporary object to place it in - and this object is anrvalue
.
-
-
Since C++11, there is a new type available called
rvalue
reference, which can be identified from the double ampersand&&
after a type name. With this operator, it is possible to store and even modify anrvalue
, i.e. a temporary object which would otherwise be lost quickly. -
#include <iostream> int main() { int i = 1; int j = 2; int k = i + j; int &&l = i + j; std::cout << "k = " << k << ", l = " << l << std::endl; return 0; }
-
But what do we need this for? Before we look into the answer to this question, let us consider the example on the top.
-
After creating the integers
i
andj
on the stack, the sum of both is added to a third integerk
. Let us examine this simple example a little more closely. In the first and second assignment,i
andj
are created aslvalues
, while1
and2
arervalues
, whose value is copied into the memory location ofi
andj
. Then, a thirdlvalue
,k
, is created. The sumi+j
is created as anrvalue
, which holds the result of the addition before being copied into the memory location ofk
. This is quite a lot of copying and holding of temporary values in memory. With anrvalue
reference, this can be done more efficiently. -
The expression
int &&l
creates anrvalue
reference, which binds to the temporary object holding the result of the addition. So instead of first creating the rvalue i+j
, then copying it and finally deleting it, we can now keep the temporary object alive in memory. This is much more efficient than the first approach, even though saving a few bytes of storage in the example might not seem like much at first glance. One of the most important aspects of rvalue
references is that they pave the way for move semantics
, which is a powerful technique in modern C++ to optimize memory usage and processing speed. Move semantics
andrvalue
references make it possible to write code that transfers resources such as dynamically allocated memory from one object to another in a very efficient manner and also supports the concept of exclusive ownership, as we will shortly see when discussingsmart pointers
. In the next section we will take a close look at move semantics and its benefits for memory management. -
External Resources
-
Here are some good resources to learn more about Lvalues and Rvalues:
-
How to crack the confusing world of lvalues and rvalues in C++? It is easy!
-
-
Rvalue
references andstd::move
-
In order to fully understand the concept of smart pointers in the next lesson, we first need to take a look at a powerful concept introduced with C++11 called
move semantics
. -
The last section on
lvalues
,rvalues
and especiallyrvalue
references is an important prerequisite for understanding the concept of moving data structures. -
Let us consider the function on the right which takes an
rvalue
reference as its parameter. -
#include <iostream> void myFunction(int &&val) { std::cout << "val = " << val << std::endl; } int main() { myFunction(42); return 0; }
-
The important message of the function argument of
myFunction
to the programmer is: The object that binds to the rvalue
reference&&val
is yours, it is not needed anymore within the scope of the caller (which is main). As discussed in the previous section onrvalue
references, this is interesting from two perspectives: -
Passing values like this
improves performance
as no temporary copy needs to be made anymore andownership changes
, since the object the reference binds to has been abandoned by the caller and now binds to a handle which is available only to the receiver. This could not have been achieved withlvalue
references as any change to the object that binds to thelvalue
reference would also be visible on the caller side. -
There is one more important aspect we need to consider:
rvalue
references are themselveslvalues
. While this might seem confusing at first glance, it really is the mechanism that enablesmove semantics
: A reference is always defined in a certain context (such as the variable val in the above example). Even though the object it refers to (the number 42) may be disposable in the context in which it has been created (the main function), it is not disposable in the context of the reference. So within the scope of myFunction
,val
is anlvalue
as it gives access to the memory location where the number 42 is stored. -
Note however that in the above code example we cannot pass an
lvalue
tomyFunction
, because anrvalue
reference cannot bind to anlvalue
. The code -
int i = 23; myFunction(i)
-
would result in a compiler error. There is a solution to this problem though: The function
std::move
converts anlvalue
into anrvalue
(actually, to be exact, into anxvalue
, which we will not discuss here for the sake of clarity), which makes it possible to use thelvalue
as an argument for the function: -
int i = 23; myFunction(std::move(i));
-
In doing this, we state that in the scope of
main
we will not usei
anymore, which now exists only in the scope ofmyFunction
. Usingstd::move
in this way is one of the components ofmove semantics
, which we will look into shortly. But first let us consider an example of theRule of Three
.
-
-
Let us consider the example on the right of a class which manages a block of dynamic memory and incrementally add new functionality to it. The corresponding main function will be added further down.
-
#include <stdlib.h> #include <iostream> class MyMovableClass { private: int _size; int *_data; public: MyMovableClass(size_t size) // constructor { _size = size; _data = new int[_size]; std::cout << "CREATING instance of MyMovableClass at " << this << " allocated with size = " << _size*sizeof(int) << " bytes" << std::endl; } ~MyMovableClass() // 1 : destructor { std::cout << "DELETING instance of MyMovableClass at " << this << std::endl; delete[] _data; } };
-
In this class, a block of heap memory is allocated in the constructor and deallocated in the destructor. As we have discussed before, when either destructor, copy constructor or copy assignment operator are defined, it is good practice to also define the other two (known as the
Rule of Three
). While the compiler would generate default versions of the missing components, these would not properly reflect the memory management strategy of our class, so leaving out the manual implementation of the copy constructor and copy assignment operator is not an option here. -
So let us start with the copy constructor of
MyMovableClass
, which could look like the following: -
MyMovableClass(const MyMovableClass &source) // 2 : copy constructor { _size = source._size; _data = new int[_size]; for (int i = 0; i < _size; ++i) { _data[i] = source._data[i]; } // deep copy of all elements std::cout << "COPYING content of instance " << &source << " to instance " << this << std::endl; }
-
Similar to an example in the section on copy semantics, the copy constructor takes an lvalue reference to the source instance, allocates a block of memory of the same size as in the source and then copies the data into its members (as a deep copy).
-
Next, let us take a look at the copy assignment operator:
-
MyMovableClass &operator=(const MyMovableClass &source) // 3 : copy assignment operator { std::cout << "ASSIGNING content of instance " << &source << " to instance " << this << std::endl; if (this == &source) return *this; delete[] _data; _size = source._size; _data = new int[_size]; for (int i = 0; i < _size; ++i) { _data[i] = source._data[i]; } // deep copy of all elements return *this; }
-
The
if-statement
at the top of the above implementation protects against self-assignment and is standard boilerplate code for the user-defined assignment operator. The remainder of the code is more or less identical to the copy constructor, apart from returning a reference to the current instance using *this.
You might have noticed that both copy constructor and assignment operator take a
const
reference to the source object as an argument, by which they promise that they won't (and can't) modify the content of source.
We can now use our class to copy objects as shown in the following implementation of main:
-
int main() { MyMovableClass obj1(10); // regular constructor MyMovableClass obj2(obj1); // copy constructor obj2 = obj1; // copy assignment operator return 0; }
-
In the main above, the object
obj1
is created using the regular constructor ofMyMovableClass
. Then, both the copy constructor as well as the assignment operator are used with the latter one not creating a new object but instead assigning the content ofobj1
toobj2
as defined by our copying policy. -
The output of this textbook implementation of the Rule of Three looks like this:
-
CREATING instance of MyMovableClass at 0x7ffeefbff618 allocated with size = 40 bytes COPYING content of instance 0x7ffeefbff618 to instance 0x7ffeefbff608 ASSIGNING content of instance 0x7ffeefbff618 to instance 0x7ffeefbff608 DELETING instance of MyMovableClass at 0x7ffeefbff608 DELETING instance of MyMovableClass at 0x7ffeefbff618
-
Limitations of Our Current Class Design
-
Let us now consider one more way to instantiate a MyMovableClass object, by using the createObject() function. Add the following function definition to rule_of_three.cpp, outside the scope of the class MyMovableClass: -
MyMovableClass createObject(int size){ MyMovableClass obj(size); // regular constructor return obj; // return MyMovableClass object by value }
-
Note that when a function returns an object by value, the compiler creates a temporary object as an
rvalue
. Let's call this function insidemain
to create anobj4
instance, as follows:-
int main(){ // call to copy constructor, (alternate syntax) MyMovableClass obj3 = obj1; // Here, we are instantiating obj3 in the same statement; hence the copy assignment operator would not be called. MyMovableClass obj4 = createObject(10); // createObject(10) returns a temporary copy of the object as an rvalue, which is passed to the copy constructor. /* * You can try executing the statement below as well * MyMovableClass obj4(createObject(10)); */ return 0; }
-
-
In the
main
above, the returned value ofcreateObject(10)
is passed to the copy constructor. The functioncreateObject()
returns an instance ofMyMovableClass
by value. In such a case, the compiler creates a temporary copy of the object as anrvalue
, which is passed to the copy constructor.- A special call to copy constructor
- Try compiling and then running the rule_of_three.cpp to notice that
MyMovableClass obj4 = createObject(10);
would not print the cout statement of the copy constructor on the console. This is because the compiler typically performs copy elision and constructs the object directly in place, so no copy constructor call is actually made (unless copy elision is explicitly disabled).
-
-
In our current class design, while creating
obj4
, the data is dynamically allocated on the heap and then copied from the temporary object to its target destination. This means that two expensive memory operations are performed, with the first occurring during the creation of the temporary rvalue and the second during the execution of the copy constructor. Two similarly expensive memory operations would be performed with the copy assignment operator if we execute the following statement inside main
: -
MyMovableClass obj4 = createObject(10); // Don't write this statement if you have already written it before obj4 = createObject(10); // call to copy assignment operator
-
In the above call to the copy assignment operator, the temporary object is created first; the assignment then erases the existing memory of obj4
, allocates a new block of the required size, and copies the data from the temporary object to obj4
. -
From a performance viewpoint, this code involves far too many copies, making it inefficient - especially with large data structures. Prior to
C++11
, the proper solution in such a case was to simply avoid returning large data structures by value to prevent the expensive and unnecessary copying process. WithC++11
however, there is a way we can optimize this and return even large data structures by value. The solution is the move constructor and the Rule of Five. -
The move constructor
-
The basic idea to optimize the code from the last example is to "steal" the rvalue generated by the compiler during the return-by-value operation and move the expensive data in the source object to the target object - not by copying it but by redirecting the data handles. Moving data in such a way is always cheaper than making copies, which is why programmers are highly encouraged to make use of this powerful tool.
-
The following diagram illustrates the basic principle of moving a resource from a source object to a destination object:
-
In order to achieve this, we will be using a construct called
move constructor
, which is similar to the copy constructor with the key difference being the re-use of existing data without unnecessarily copying it. In addition to the move constructor, there is also a move assignment operator, which we need to look at.
-
-
Just like the copy constructor, the move constructor builds an instance of a class using a source instance. The key difference between the two is that with the move constructor, the source instance will no longer be usable afterwards. Let us take a look at an implementation of the move constructor for our
MyMovableClass
: -
MyMovableClass(MyMovableClass &&source) // 4 : move constructor { std::cout << "MOVING (c’tor) instance " << &source << " to instance " << this << std::endl; _data = source._data; _size = source._size; source._data = nullptr; source._size = 0; }
-
In this code, the
move constructor
takes as its input anrvalue
reference to a source object of the same class. In doing so, we are able to use the object within the scope of themove constructor
. As can be seen, the implementation copies the data handle from source to target and immediately invalidates source after copying is complete. Now, the target object (this) is responsible for the data and must also release the memory on destruction - the ownership has been successfully changed (or moved) without the need to copy the data on the heap.
The move assignment operator works in a similar way:
-
MyMovableClass &operator=(MyMovableClass &&source) // 5 : move assignment operator { std::cout << "MOVING (assign) instance " << &source << " to instance " << this << std::endl; if (this == &source) return *this; delete[] _data; _data = source._data; _size = source._size; source._data = nullptr; source._size = 0; return *this; }
-
-
As with the move constructor, the data handle is copied from source to target, which is coming in as an rvalue reference again. Afterwards, the data members of source are invalidated. The rest of the code is identical to the copy assignment operator we have already implemented.
-
The Rule of Five
-
By adding both the
move constructor
and themove assignment operator
to ourMyMovableClass
, we have adhered to the Rule of Five. This rule is an extension of the Rule of Three, which we have already seen, and has existed since the introduction of the C++11 standard. The Rule of Five is especially important in resource management, where unnecessary copying needs to be avoided due to limited resources and for performance reasons. Also, all the STL container classes such as std::vector
implement the Rule of Five and usemove semantics
for increased efficiency. -
The Rule of Five states that if you have to write one of the functions listed below, then you should consider implementing all of them with a proper resource management policy in place. If you forget to implement one or more, the compiler will usually generate the missing ones (without a warning), but the default versions might not be suitable for the purpose you have in mind. The five functions are listed below, followed by a condensed skeleton:
-
The
destructor
: Responsible for freeing the resource once the object it belongs to goes out of scope. -
The
assignment operator
: The default assignment operation performs a member-wise shallow copy, which does not copy the content behind the resource handle. If a deep copy is needed, it has to be implemented by the programmer.
The
copy constructor
: As with the assignment operator, the default copy constructor performs a shallow copy of the data members. If something else is needed, the programmer has to implement it accordingly. -
The
move constructor
: Because copying objects can be an expensive operation which involves creating, copying and destroying temporary objects,rvalue
references are used to bind to anrvalue
. Using this mechanism, the move constructor transfers the ownership of a resource from a (temporary)rvalue
object to a permanentlvalue
object. -
The
move assignment operator
: With this operator, ownership of a resource can be transferred from one object to another. The internal behavior is very similar to the move constructor.
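A condensed skeleton tying the five functions together (the member names follow MyMovableClass from above; the console output has been omitted, so treat this as a summary sketch rather than new functionality):
#include <algorithm>
#include <cstddef>

class MyMovableClass
{
    int _size;
    int *_data;

public:
    MyMovableClass(std::size_t size) : _size(size), _data(new int[size]) {}

    ~MyMovableClass() { delete[] _data; } // 1 : destructor

    MyMovableClass(const MyMovableClass &source) // 2 : copy constructor
        : _size(source._size), _data(new int[source._size])
    {
        std::copy(source._data, source._data + _size, _data);
    }

    MyMovableClass &operator=(const MyMovableClass &source) // 3 : copy assignment operator
    {
        if (this != &source)
        {
            delete[] _data;
            _size = source._size;
            _data = new int[_size];
            std::copy(source._data, source._data + _size, _data);
        }
        return *this;
    }

    MyMovableClass(MyMovableClass &&source) // 4 : move constructor
        : _size(source._size), _data(source._data)
    {
        source._data = nullptr;
        source._size = 0;
    }

    MyMovableClass &operator=(MyMovableClass &&source) // 5 : move assignment operator
    {
        if (this != &source)
        {
            delete[] _data;
            _data = source._data;
            _size = source._size;
            source._data = nullptr;
            source._size = 0;
        }
        return *this;
    }
};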
-
-
-
When are move semantics used?
-
Now that we have seen how move semantics work, let us take a look at situations where they actually apply.
-
One of the primary areas of application are cases where heavy-weight objects need to be passed around in a program. Copying these without move semantics can cause serious performance issues. The idea in this scenario is to create the object a single time and then "simply" move it around using
rvalue
references andmove semantics
. -
A second area of application are cases where ownership needs to be transferred (such as with unique pointers, as we will soon see). The primary difference to shared references is that with move semantics we are not sharing anything but instead we are ensuring through a smart policy that only a single object at a time has access to and thus owns the resource.
-
-
Let us look at some code examples:
-
int main() { MyMovableClass obj1(100), obj2(200); // constructor MyMovableClass obj3(obj1); // copy constructor MyMovableClass obj4 = obj1; // copy constructor obj4 = obj2; // copy assignment operator return 0; }
-
If you compile and run this code, be sure to use the
-std=c++11
flag. The reasons for this will be explained below. -
In the code above, in total, four instances of
MyMovableClass
are constructed. While obj1 and obj2 are created using the conventional constructor, obj3 is created using the copy constructor according to our implementation. Interestingly, even though the creation of obj4 looks like an assignment, the compiler calls the copy constructor in this case. Finally, the last line calls the copy assignment operator. The output of the above main function looks like the following:-
CREATING instance of MyMovableClass at 0x7ffeefbff718 allocated with size = 400 bytes CREATING instance of MyMovableClass at 0x7ffeefbff708 allocated with size = 800 bytes COPYING content of instance 0x7ffeefbff718 to instance 0x7ffeefbff6e8 COPYING content of instance 0x7ffeefbff718 to instance 0x7ffeefbff6d8 ASSIGNING content of instance 0x7ffeefbff708 to instance 0x7ffeefbff6d8 DELETING instance of MyMovableClass at 0x7ffeefbff6d8 DELETING instance of MyMovableClass at 0x7ffeefbff6e8 DELETING instance of MyMovableClass at 0x7ffeefbff708 DELETING instance of MyMovableClass at 0x7ffeefbff718
-
-
Note that the compiler has been called with the option
-fno-elide-constructors
to turn off an optimization technique called copy elision, which would make it harder to understand the various calls and the operations they entail. This technique is guaranteed to be used in certain situations as of C++17, which is why we are also reverting to the C++11 standard for the remainder of this chapter using -std=c++11. Until now, no move operation has been performed, as all of the above calls involved lvalues
. -
Now consider the following main function instead:
-
int main() { MyMovableClass obj1(100); // constructor obj1 = MyMovableClass(200); // move assignment operator MyMovableClass obj2 = MyMovableClass(300); // move constructor return 0; }
-
-
In this version, we also have an instance of
MyMovableClass
, obj1. Then, a second instance ofMyMovableClass
is created as anrvalue
, which is assigned to obj1. Finally, we have a second lvalueobj2
, which is created by assigning it an rvalue object. Let us take a look at the output of the program:-
CREATING instance of MyMovableClass at 0x7ffeefbff718 allocated with size = 400 bytes CREATING instance of MyMovableClass at 0x7ffeefbff708 allocated with size = 800 bytes MOVING (assign) instance 0x7ffeefbff708 to instance 0x7ffeefbff718 DELETING instance of MyMovableClass at 0x7ffeefbff708 CREATING instance of MyMovableClass at 0x7ffeefbff6d8 allocated with size = 1200 bytes MOVING (c'tor) instance 0x7ffeefbff6d8 to instance 0x7ffeefbff6e8 DELETING instance of MyMovableClass at 0x7ffeefbff6d8 DELETING instance of MyMovableClass at 0x7ffeefbff6e8 DELETING instance of MyMovableClass at 0x7ffeefbff718
-
-
If that is not the output you see, check this post about copy elision: https://stackoverflow.com/questions/13099603/c11-move-constructor-not-called-default-constructor-preferred
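For reference, a compile command that matches this setup, with copy elision explicitly disabled, could look as follows (main.cpp is a placeholder for the actual file name):
g++ -std=c++11 -fno-elide-constructors main.cpp -o a.out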
-
By looking at the stack addresses of the objects, we can see that the temporary object at
0x7ffeefbff708
is moved to0x7ffeefbff718
using themove assignment operator
we wrote earlier, because the instanceobj1
is assigned anrvalue
. As expected from anrvalue
, its destructor is called immediately afterwards. But as we have made sure to null its data pointer in the move assignment operator, the actual data will not be deleted. The advantage from a performance perspective in this case is that no deep copy of the rvalue
object needs to be made; we are simply redirecting the internal resource handle, thus making an efficient shallow copy.
Next, another temporary instance with a size of 1200 bytes is created as a temporary object and "assigned" to obj2. Note that while the call looks like an assignment, the move constructor is called under the hood, making the call identical to
MyMovableClass obj2(MyMovableClass(300));
. By creating obj2 in such a way, we are reusing the temporary rvalue and transferring ownership of its resources to the newly created obj2.
Let us now consider a final example:
-
void useObject(MyMovableClass obj) { std::cout << "using object " << &obj << std::endl; } int main() { MyMovableClass obj1(100); // constructor useObject(obj1); // passes an lvalue: copy constructor is called useObject(MyMovableClass(200)); // passes a temporary rvalue: move constructor is called return 0; }
-
In this case, an instance of
MyMovableClass
, obj1, is passed to a function useObject by value, thus making a copy of it. -
Let us take an immediate look at the output of the program, before going into details:
-
(1) CREATING instance of MyMovableClass at 0x7ffeefbff718 allocated with size = 400 bytes (2) COPYING content of instance 0x7ffeefbff718 to instance 0x7ffeefbff708 using object 0x7ffeefbff708 (3) DELETING instance of MyMovableClass at 0x7ffeefbff708 (4) CREATING instance of MyMovableClass at 0x7ffeefbff6d8 allocated with size = 800 bytes (5) MOVING (c'tor) instance 0x7ffeefbff6d8 to instance 0x7ffeefbff6e8 using object 0x7ffeefbff6e8 DELETING instance of MyMovableClass at 0x7ffeefbff6e8 DELETING instance of MyMovableClass at 0x7ffeefbff6d8 DELETING instance of MyMovableClass at 0x7ffeefbff718
-
First, we are creating an instance of
MyMovableClass
,obj1
, by calling the constructor of the class (1). -
Then, we are passing
obj1
by-value to a functionuseObject
, which causes a temporary objectobj
to be instantiated, which is a copy ofobj1
(2) and is deleted immediately after the function scope is left (3). -
Then, the function is called with a temporary instance of MyMovableClass as its argument, which creates a temporary instance of MyMovableClass as an rvalue (4). But instead of making a copy of it as before, the move constructor is used (5) to transfer ownership of that temporary object to the function scope, which saves us one expensive deep-copy.
-
Moving
lvalues
-
There is one final aspect we need to look at: In some cases, it can make sense to treat
lvalues
likervalues
. At some point in your code, you might want to transfer ownership of a resource to another part of your program as it is not needed anymore in the current scope. But instead of copying it, you want to just move it as we have seen before. The "problem" with our implementation ofMyMovableClass
is that the calluseObject(obj1)
will trigger thecopy constructor
as we have seen in one of the last examples. But in order to move it, we would have to pretend to the compiler thatobj1
was anrvalue
instead of anlvalue
so that we can make an efficientmove operation
instead of an expensive copy. -
There is a solution to this problem in
C++
, which isstd::move
. This function accepts anlvalue
argument and returns it as anrvalue
without triggering copy construction. So by passing an object tostd::move
we can force the compiler to usemove semantics
, either in the form ofmove constructor
or themove assignment operator
: -
int main() { MyMovableClass obj1(100); // constructor useObject(std::move(obj1)); return 0; }
-
Nothing much has changed, apart from
obj1
being passed to thestd::move
function. The output would look like the following: -
CREATING instance of MyMovableClass at 0x7ffeefbff718 allocated with size = 400 bytes MOVING (c'tor) instance 0x7ffeefbff718 to instance 0x7ffeefbff708 using object 0x7ffeefbff708 DELETING instance of MyMovableClass at 0x7ffeefbff708 DELETING instance of MyMovableClass at 0x7ffeefbff718
-
By using
std::move
, we were able to pass the ownership of the resources withinobj1
to the functionuseObject
. The local copy obj
in the argument list was created with the move constructor
and thus accepted the ownership transfer fromobj1
toobj
. Note that after the call touseObject
, the instanceobj1
has been invalidated by setting its internal handle to null and thus may not be used anymore within the scope of main (even though you could theoretically try to access it, but this would be a really bad idea).
-
-
Exercises - Move semantics
-
Exercise 1:
-
/* Memory Management exercises part 1: Pass data between functions without using move semantics */ #include <iostream> #include <vector> #include <cmath> using namespace std; // pass back by pointer (old C++) const int array_size = 1e6; // determines size of the random number array vector<int> *RandomNumbers1() { vector<int> *random_numbers = new vector<int>(); // allocate a single vector on the heap... for (int i = 0; i < array_size; i++) { int b = rand(); (*random_numbers).push_back(b); // ...and fill it with random numbers } return random_numbers; // return pointer to heap memory } // pass back by reference (old C++) void RandomNumbers2(vector<int> &random_numbers) { random_numbers.resize(array_size); // expand vector to desired size for (int i = 0; i < array_size; i++) { random_numbers[i] = rand(); } } int main() { /* EXERCISE 1-1: Get access to random data using a returned pointer from function RandomNumbers1 and make sure that there are no memory leaks.*/ // store the data in a suitable variable named 'random_numbers_1' and free the associated memory immediately afterwards // SOLUTION to exercise 1-1 vector<int> *random_numbers_1 = RandomNumbers1(); // return-by-pointer delete random_numbers_1; /* EXERCISE 1-2: Get access to data using pass-by-reference */ // store the data in a suitable variable named 'random_numbers_2' // SOLUTION to exercise 1-2 vector<int> random_numbers_2; // create identifier to pass to the function RandomNumbers2(random_numbers_2); }
-
-
-
Resource Acquisition is Initialization
-
Error-prone memory management with new and delete
-
In the previous chapters, we have seen that memory management on the heap using
malloc/free
ornew/delete
is extremely powerful, as they allow for a fine-grained control over the precious memory resource. However, the correct use of these concepts requires some degree of skill and experience (and concentration) from the programmer. If they are not handled correctly, bugs will quickly be introduced into the code. A major source of error is that the details around memory management withnew/delete
are completely left to the programmer. In the remainder of this lesson, the pair malloc/free
will be omitted for reasons of brevity. However, many of the aspects that hold fornew/delete
will also apply tomalloc/free
. -
Let us take a look at some of the worst problems with
new
anddelete
:-
Proper pairing of
new
anddelete
: Every dynamically allocated object that is created withnew
must be followed by a manual deallocation at a "proper" place in the program. If the programmer forgets to calldelete
(which can happen very quickly) or if it is done at an "inappropriate" position, memory leaks will occur which might clog up a large portion of memory. -
Correct operator pairing : C++ offers a variety of
new/delete
operators, especially when dealing with arrays on the heap. A dynamically allocated array initialized withnew[]
may only be deleted with the operatordelete[]
. If the wrong operator is used, program behavior will be undefined - which is to be avoided at all costs in C++ (see the short sketch after this list). -
Memory ownership : If a third-party function returns a pointer to a data structure, the only way of knowing who will be responsible for resource deallocation is by looking into either the code or the documentation. If both are not available (as is often the case), there is no way to infer the ownership from the return type. As an example, in the final project of this course, we will use the graphical library
wxWidgets
to create the user interface of a chatbot application. InwxWidgets
, the programmer can create child windows and control elements on the heap usingnew
, but the framework will take care of deletion altogether. If for some reason the programmer does not know this, he or she might calldelete
and thus interfere with the inner workings of thewxWidgets
library.
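To make the pairing rules concrete, here is a minimal sketch (not part of the original example code) showing both pairings:
int main()
{
    // single object: new pairs with delete
    int *single = new int(42);
    delete single;

    // array: new[] pairs with delete[]
    int *block = new int[5]{};
    delete[] block; // plain delete here would be undefined behavior

    return 0;
}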
-
-
-
The benefits of smart pointers
-
To put it briefly: Smart pointers were introduced in C++ to solve the above mentioned problems by providing a degree of automatic memory management: When a
smart pointer
is no longer needed (which is the case as soon as it goes out of scope), the memory to which it points is automatically deallocated. When contrasted with smart pointers, the conventional pointers we have seen so far are often termed"raw pointers"
. -
In essence,
smart pointers
are classes that are wrapped aroundraw pointers
. By overloading the->
and*
operators, smart pointer objects make sure that the memory to which their internal raw pointer refers is properly deallocated. This makes it possible to use smart pointers with the same syntax as raw pointers. As soon as a smart pointer goes out of scope, its destructor is called and the block of memory to which the internal raw pointer refers is properly deallocated. This technique of wrapping a management class around a resource has been conceived by Bjarne Stroustrup and is calledResource Acquisition Is Initialization (RAII)
. Before we continue with smart pointers and their usage let us take a close look at this powerful concept.
-
-
Resource Acquisition Is Initialization
-
The
RAII
is a widespread programming paradigm that can be used to protect a resource such as a file stream, a network connection or a block of memory, all of which need proper management. -
Acquiring and releasing resources
- In most programs of reasonable size, there will be many situations where a certain action at some point will necessitate a proper reaction at another point, such as:
-
Allocating memory with new or
malloc
, which must be matched with a call todelete
orfree
. -
Opening a file or network connection, which must be closed again after the content has been read or written.
-
Protecting synchronization primitives such as atomic operations, memory barriers, monitors or critical sections, which must be released to allow other threads to obtain them.
-
-
The following table gives a brief overview of some resources and their respective allocation and deallocation calls in C++:
-
-
The problem of reliable resource release
-
However, there are several problems with this seemingly simple pattern:
-
The program might throw an exception during resource use and thus the point of release might never be reached.
-
There might be several points where the resource could potentially be released, making it hard for a programmer to keep track of all eventualities.
-
We might simply forget to release the resource again.
-
-
RAII
to the rescue-
The major idea of
RAII
revolves around object ownership and information hiding:Allocation
anddeallocation
are hidden within the management class, so a programmer using the class does not have to worry about memory management responsibilities. If he has not directly allocated a resource, he will not need to directly deallocate it - whoever owns a resource deals with it. In the case of RAII this is the management class around the protected resource. The overall goal is to haveallocation
anddeallocation
(e.g. withnew
anddelete
) disappear from the surface level of the code you write. -
RAII can be used to leverage - among others - the following advantages:
- Use class destructors to perform resource clean-up tasks such as proper memory deallocation when the
RAII
object gets out of scope - Manage ownership and lifetime of dynamically allocated objects
- Implement encapsulation and information hiding due to resource acquisition and release being performed within the same object.
-
In the following, let us look at
RAII
from the perspective of memory management. There are three major parts to anRAII
class:- A resource is allocated in the constructor of the RAII class
- The resource is deallocated in the destructor
- All instances of the RAII class are allocated on the stack to reliably control the lifetime via the object scope
-
Let us now take a look at the code example on the right.
-
int main()
{
    double den[] = {1.0, 2.0, 3.0, 4.0, 5.0};
    for (size_t i = 0; i < 5; ++i)
    {
        // allocate the resource on the heap
        double *en = new double(i);

        // use the resource
        std::cout << *en << "/" << den[i] << " = " << *en / den[i] << std::endl;

        // deallocate the resource
        delete en;
    }

    return 0;
}
-
At the beginning of the program, an array of double values
den
is allocated on the stack. Within the loop, a new double is created on the heap usingnew
. Then, the result of a division is printed to the console. At the end of the loop,delete
is called to properly deallocate the heap memory to which en is pointing. Even though this code is working as it is supposed to, it is very easy to forget to calldelete
at the end. Let us therefore use the principles ofRAII
to create a management class that calls delete automatically: -
class MyInt
{
    int *_p; // pointer to heap data

public:
    MyInt(int *p = NULL) { _p = p; }
    ~MyInt()
    {
        std::cout << "resource " << *_p << " deallocated" << std::endl;
        delete _p;
    }
    int &operator*() { return *_p; } // overload dereferencing operator
};
-
In this example, the constructor of class
MyInt
takes a pointer to a memory resource. When the destructor of a MyInt object is called, the resource is deleted from memory - which makesMyInt
anRAII
memory management class. Also, the*
operator is overloaded which enables us to dereferenceMyInt
objects in the same manner as with raw pointers. Let us therefore slightly alter our code example from above to see how we can properly use this new construct: -
int main()
{
    double den[] = {1.0, 2.0, 3.0, 4.0, 5.0};
    for (size_t i = 0; i < 5; ++i)
    {
        // allocate the resource on the stack, wrapping a heap-allocated int
        MyInt en(new int(i));

        // use the resource
        std::cout << *en << "/" << den[i] << " = " << *en / den[i] << std::endl;
    }

    return 0;
}
-
Update the code on the right with the snippets above before proceeding.
-
-
Let us break down the resource allocation part in two steps:
- The part
new int(i)
creates a new block of memory on the heap and initializes it with the value ofi
. The returned result is the address of the block of memory. - The part
MyInt en(…)
calls the constructor of classMyInt
, passing the address of a valid memory block as a parameter.
-
After creating an object of class
MyInt
on the stack, which, internally, created an integer on the heap, we can use the dereference operator in the same manner as before to retrieve the value to which the internal raw pointer is pointing. Because theMyInt
objecten
lives on the stack, it is automatically deallocated after each loop cycle - which automatically calls the destructor to release the heap memory. The following console output verifies this:-
0/1 = 0
resource 0 deallocated
1/2 = 0.5
resource 1 deallocated
2/3 = 0.666667
resource 2 deallocated
3/4 = 0.75
resource 3 deallocated
4/5 = 0.8
resource 4 deallocated
-
-
We have thus successfully used the
RAII
idiom to create a memory management class that spares us from thinking about calling delete. By creating the MyInt object on the stack, we ensure that the deallocation occurs as soon as the object goes out of scope. -
Quiz : What would be the major difference of the following program compared to the last example?
-
int main()
{
    double den[] = {1.0, 2.0, 3.0, 4.0, 5.0};
    for (size_t i = 0; i < 5; ++i)
    {
        // allocate the resource on the heap
        MyInt *en = new MyInt(new int(i));

        // use the resource
        std::cout << **en << "/" << den[i] << " = " << **en / den[i] << std::endl;
    } // memory leak (en not deallocated)

    return 0;
}
-
-
RAII and smart pointers
- In the last section, we have discussed the powerful RAII idiom, which reduces the risk of improperly managed resources. Applied to the concept of memory management,
RAII
enables us to encapsulatenew
anddelete
calls within a class and thus present the programmer with a clean interface to the resource he intends to use. Since C++11, there exists a language feature called smart pointers, which builds on the concept ofRAII
and - without exaggeration - revolutionizes the way we use resources on the heap. Let’s take a look.
-
Smart pointer overview
-
Since C++11, the standard library includes smart pointers, which help to ensure that programs are free of memory leaks while also remaining exception-safe. With smart pointers, resource acquisition occurs at the same time that the object is initialized (when instantiated with
make_shared
ormake_unique
), so that all resources for the object are created and initialized in a single line of code. -
In modern C++,
raw pointers
managed withnew
anddelete
should only be used in small blocks of code with limited scope, where performance is critical (such as withplacement new
) and ownership rights of the memory resource are clear. We will look at some guidelines on where to use which pointer later. -
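The lesson does not show placement new itself; as a rough sketch (with a made-up Widget type), it separates allocation from construction like this:
#include <new>     // placement new
#include <cstdlib> // malloc / free

struct Widget
{
    int x;
    Widget(int val) : x(val) {}
};

int main()
{
    void *buffer = std::malloc(sizeof(Widget)); // acquire raw storage
    Widget *w = new (buffer) Widget(42);        // construct the object in-place (placement new)
    w->~Widget();                               // the destructor must be called manually
    std::free(buffer);                          // release the raw storage
    return 0;
}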
C++11 has introduced three types of
smart pointers
, which are defined in the <memory> header of the standard library:-
The unique pointer
std::unique_ptr
is a smart pointer which exclusively owns a dynamically allocated resource on the heap. There must not be a second unique pointer to the same resource. -
The shared pointer
std::shared_ptr
points to a heap resource but does not explicitly own it. There may even be several shared pointers to the same resource, each of which will increase an internal reference count. As soon as this count reaches zero, the resource will automatically be deallocated. -
The weak pointer
std::weak_ptr
behaves similar to the shared pointer but does not increase the reference counter.
-
-
Prior to C++11, there was a concept called
std::auto_ptr
, which tried to realize a similar idea. However, this concept can now be safely considered as deprecated and should not be used anymore. -
Let us now look at each of the three smart pointer types in detail.
-
-
The Unique pointer
-
A unique pointer is the exclusive owner of the memory resource it represents. There must not be a second unique pointer to the same memory resource; attempting to copy a unique pointer results in a compile error. As soon as the unique pointer goes out of scope, the memory resource is deallocated again. Unique pointers are useful when working with a temporary heap resource that is no longer needed once it goes out of scope.
-
The following diagram illustrates the basic idea of a unique pointer:
-
In the example, a resource in memory is referenced by a unique pointer instance
sourcePtr
. Then, the resource is reassigned to another unique pointer instancedestPtr
usingstd::move
. The resource is now owned bydestPtr
whilesourcePtr
can still be used but does not manage a resource anymore. -
A unique pointer is constructed using the following syntax:
std::unique_ptr<Type> p(new Type);
-
#include <memory>

void RawPointer()
{
    int *raw = new int; // create a raw pointer on the heap
    *raw = 1;           // assign a value
    delete raw;         // delete the resource again
}

void UniquePointer()
{
    std::unique_ptr<int> unique(new int); // create a unique pointer on the stack
    *unique = 2;                          // assign a value
    // delete is not necessary
}
-
In the example on the right we will see how a unique pointer is constructed and how it compares to a raw pointer.
-
The function
RawPointer
contains the familiar steps of (1) allocating memory on the heap with new and storing the address in a pointer variable, (2) assigning a value to the memory block using thedereferencing operator *
and (3) finally deleting the resource on the heap. As we already know, forgetting to call delete will result in a memory leak. -
The function
UniquePointer
shows how to achieve the same goal using a smart pointer from the standard library. As can be seen, a smart pointer is a class template that is declared on the stack and then initialized by a raw pointer (returned by new ) to a heap-allocated object. The smart pointer is now responsible for deleting the memory that the raw pointer specifies - which happens as soon as the smart pointer goes out of scope. Note that smart pointers always need to be declared on the stack, otherwise the scoping mechanism would not work. -
The smart pointer destructor contains the call to delete, and because the smart pointer is declared on the stack, its destructor is invoked when the smart pointer goes out of scope, even if an exception is thrown.
-
In the example now on the right, we will construct a
unique pointer
to a custom class. Also, we will see how the standard->
and*
operators can be used to access member functions of the managed object, just as we would with a raw pointer: -
#include <iostream>
#include <memory>
#include <string>

class MyClass
{
private:
    std::string _text;

public:
    MyClass() {}
    MyClass(std::string text) { _text = text; }
    ~MyClass() { std::cout << _text << " destroyed" << std::endl; }
    void setText(std::string text) { _text = text; }
};

int main()
{
    // create unique pointer to proprietary class
    std::unique_ptr<MyClass> myClass1(new MyClass());
    std::unique_ptr<MyClass> myClass2(new MyClass("String 2"));

    // call member function using ->
    myClass1->setText("String 1");

    // use the dereference operator *
    *myClass1 = *myClass2;

    // use the .get() function to retrieve a raw pointer to the object
    std::cout << "Objects have heap addresses " << myClass1.get() << " and " << myClass2.get() << std::endl;

    return 0;
}
-
Note that the custom class
MyClass
has two constructors, one without arguments and one with astring
to be passed, which initializes a member variable_text
. Also, once an object of this class gets destroyed, a message is printed to the console, along with the value of_text
. In main, two unique pointers are created with the address of aMyClass
object on the heap as arguments. WithmyClass2
, we can see that constructor arguments can be passed just as we would with raw pointers. After both pointers have been created, we can use the->
operator to access members of the class, such as calling the functionsetText
. From looking at the function call alone you would not be able to tell thatmyClass1
is in fact a smart pointer. Also, we can use the dereference operator*
to access the value ofmyClass1
andmyClass2
and assign the one to the other. Finally, the.
operator gives us access to proprietary functions of the smart pointer, such as retrieving the internal raw pointer withget()
. -
The console output of the program looks like the following:
-
Objects have heap addresses 0x1004000e0 and 0x100400100
String 2 destroyed
String 2 destroyed
-
Obviously, the two managed objects reside at different heap addresses, even after copying the contents from
myClass2
tomyClass1
. As can be seen from the last two lines of the output, the destructor of both objects gets called automatically at the end of the program and - as expected - the value of the internal string is identical due to the copy operation. -
Summing up, the unique pointer allows a single owner of the underlying internal raw pointer. Unique pointers should be the default choice unless you know for certain that sharing is required at a later stage. We have already seen how to transfer ownership of a resource using the
Rule of Five
and move semantics. Internally, the unique pointer uses this very concept along withRAII
to encapsulate a resource (the raw pointer) and transfer it between pointer objects when either the move assignment operator or the move constructor are called. Also, a key feature of aunique pointer
, which makes it so well-suited as a return type for many functions, is the possibility to convert it to ashared pointer
. We will have a deeper look into this in the section on ownership transfer.
-
-
The Shared Pointer
-
Just as the
unique pointer
, ashared pointer
owns the resource it points to. The main difference between the two smart pointers is that shared pointers keep a reference counter on how many of them point to the same memory resource. Each time ashared pointer
goes out of scope, the counter is decreased. When it reaches zero (i.e. when the last shared pointer to the resource is about to vanish), the memory is properly deallocated. Thissmart pointer
type is useful for cases where you require access to a memory location on the heap in multiple parts of your program and you want to make sure that whoever owns ashared pointer
to the memory can rely on the fact that it will be accessible throughout the lifetime of that pointer. -
The following diagram illustrates the basic idea of a shared pointer:
-
Please take a look at the code on the right.
-
#include <iostream> #include <memory> int main() { std::shared_ptr<int> shared1(new int); std::cout << "shared pointer count = " << shared1.use_count() << std::endl; { std::shared_ptr<int> shared2 = shared1; std::cout << "shared pointer count = " << shared1.use_count() << std::endl; } std::cout << "shared pointer count = " << shared1.use_count() << std::endl; return 0; }
-
We can see that shared pointers are constructed just as
unique pointers
are. Also, we can access the internal reference count by using the methoduse_count()
. In the inner block, a second shared pointershared2
is created andshared1
is assigned to it. In thecopy constructor
, the internal resource pointer is copied toshared2
and the resource counter is incremented in bothshared1
andshared2
. Let us take a look at the output of the code: -
shared pointer count = 1 shared pointer count = 2 shared pointer count = 1
-
You may have noticed that the lifetime of
shared2
is limited to the scope denoted by the enclosing curly brackets. Thus, once this scope is left andshared2
is destroyed, the reference counter inshared1
is decremented by one - which is reflected in the three console outputs given above. -
A
shared pointer
can also be redirected by using thereset()
function. If the resource which ashared pointer
manages is no longer needed in the current scope, the pointer can be reset to manage a different resource, as illustrated in the example on the right.
#include <iostream> #include <memory> class MyClass { public: ~MyClass() { std::cout << "Destructor of MyClass called" << std::endl; } }; int main() { std::shared_ptr<MyClass> shared(new MyClass); std::cout << "shared pointer count = " << shared.use_count() << std::endl; shared.reset(new MyClass); std::cout << "shared pointer count = " << shared.use_count() << std::endl; return 0; }
-
Note that in the example, the destructor of
MyClass
prints a string to the console when called. The output of the program looks like the following: -
shared pointer count = 1 Destructor of MyClass called shared pointer count = 1 Destructor of MyClass called
-
After creation, the program prints
1
as the reference count ofshared
. Then, thereset
function is called with a new instance ofMyClass
as an argument. This causes the destructor of the firstMyClass
instance to be called, hence the console output. As can be seen, the reference count of theshared pointer
is still at1
. Then, at the end of the program, the destructor of the secondMyClass
object is called once the path of execution leaves the scope of main. -
Despite all the advantages of
shared pointers
, it is still possible to run into memory management problems. Consider the scenario on the right.
#include <iostream> #include <memory> class MyClass { public: std::shared_ptr<MyClass> _member; ~MyClass() { std::cout << "Destructor of MyClass called" << std::endl; } }; int main() { std::shared_ptr<MyClass> myClass1(new MyClass); std::shared_ptr<MyClass> myClass2(new MyClass); return 0; }
-
In main, two shared pointers
myClass1
andmyClass2
which are managing objects of typeMyClass
are allocated on the stack. As can be seen from the console output, both smart pointers are automatically deallocated when the scope of main ends: -
Destructor of MyClass called Destructor of MyClass called
-
When the following two lines are added to main, the result is quite different:
-
myClass1->_member = myClass2; myClass2->_member = myClass1;
-
These two lines produce a circular reference. When
myClass1
goes out of scope at the end of main, its destructor can’t clean up memory as there is still a reference count of1
in thesmart pointer
, which is caused by the shared pointer_member
inmyClass2
. The same holds true formyClass2
, which can not be properly deleted as there is still a shared pointer to it inmyClass1
. Thisdeadlock
situation prevents the destructors from being called and causes a memory leak. When we useValgrind
on this program, we get the following summary: -
==20360== LEAK SUMMARY:
==20360==    definitely lost: 16 bytes in 1 blocks
==20360==    indirectly lost: 80 bytes in 3 blocks
==20360==      possibly lost: 72 bytes in 3 blocks
==20360==    still reachable: 200 bytes in 6 blocks
==20360==         suppressed: 18,985 bytes in 160 blocks
-
As can be seen, the memory leak is clearly visible with 16 bytes being marked as "definitely lost". To prevent such circular references, there is a third smart pointer, which we will look at in the following.
-
-
The Weak Pointer
-
Similar to
shared pointers
, there can be multiple weak pointers to the same resource. The main difference though is that weak pointers do not increase the reference count. Weak pointers hold a non-owning reference to an object that is managed by anothershared pointer
. -
The following rule applies to
weak pointers
: You can only create weak pointers out of shared pointers or out of another weak pointer. The code on the right shows a few examples of how to use and how not to use weak pointers. -
#include <iostream>
#include <memory>

int main()
{
    std::shared_ptr<int> mySharedPtr(new int);
    std::cout << "shared pointer count = " << mySharedPtr.use_count() << std::endl;

    std::weak_ptr<int> myWeakPtr1(mySharedPtr);
    std::weak_ptr<int> myWeakPtr2(myWeakPtr1);
    std::cout << "shared pointer count = " << mySharedPtr.use_count() << std::endl;

    // std::weak_ptr<int> myWeakPtr3(new int); // COMPILE ERROR

    return 0;
}
-
The output looks as follows:
-
shared pointer count = 1 shared pointer count = 1
-
First, a
shared pointer
to an integer is created with a reference count of1
after creation. Then, twoweak pointers
to the integer resource are created, the first directly from theshared pointer
and the second indirectly from the first weak pointer. As can be seen from the output, neither of the two weak pointers increased the reference count. At the end ofmain
, the attempt to directly create a weak pointer to an integer resource would lead to a compile error. -
As we have seen with
raw pointers
, you can never be sure whether the memory resource to which the pointer refers is still valid. With aweak pointer
, even though this type does not prevent an object from being deleted, the validity of its resource can be checked. The code on the right illustrates how to use theexpired()
function to do this. -
#include <iostream> #include <memory> int main() { std::shared_ptr<int> mySharedPtr(new int); std::weak_ptr<int> myWeakPtr(mySharedPtr); mySharedPtr.reset(new int); if (myWeakPtr.expired() == true) { std::cout << "Weak pointer expired!" << std::endl; } return 0; }
-
Thus, with
smart pointers
, there will always be a managing instance which is responsible for the properallocation
anddeallocation
of a resource. In some cases it might be necessary to convert from onesmart pointer
type to another. Let us take a look at the set of possible conversions in the following.
-
-
Converting between smart pointers
-
The example on the right illustrates how to convert between the different pointer types.
-
#include <iostream>
#include <memory>

int main()
{
    // construct a unique pointer
    std::unique_ptr<int> uniquePtr(new int);

    // (1) shared pointer from unique pointer
    std::shared_ptr<int> sharedPtr1 = std::move(uniquePtr);

    // (2) shared pointer from weak pointer
    std::weak_ptr<int> weakPtr(sharedPtr1);
    std::shared_ptr<int> sharedPtr2 = weakPtr.lock();

    // (3) raw pointer from shared (or unique) pointer
    int *rawPtr = sharedPtr2.get();
    delete rawPtr;

    return 0;
}
-
In
(1)
, a conversion fromunique pointer
toshared pointer
is performed. You can see that this can be achieved by usingstd::move
, which calls themove assignment operator
onsharedPtr1
and steals the resource fromuniquePtr
while at the same time invalidating its resource handle on the heap-allocated integer. -
In
(2)
, you can see how to convert fromweak
toshared pointer
. Imagine that you have been passed aweak pointer
to a memory object which you want to work on. To avoid invalid memory access, you want to make sure that the object will not be deallocated before your work on it has been finished. To do this, you can convert aweak pointer
to ashared pointer
by calling thelock()
function on theweak pointer
. -
In
(3)
, araw pointer
is extracted from ashared pointer
. However, this operation does not decrease the reference count withinsharedPtr2
. This means that callingdelete
onrawPtr
in the last line beforemain
returns will generate aruntime error
because the resource it points to is still managed bysharedPtr2
and will be deleted a second time when that shared pointer goes out of scope. The output of the program when compiled withg++
thus is:malloc: *** error for object 0x1003001f0: pointer being freed was not allocated
. -
Note that there are no options for converting away from a shared pointer. Once you have created a shared pointer, you must stick to it (or a copy of it) for the remainder of your program.
-
-
When to use
raw pointers
andsmart pointers
?-
As a general rule of thumb with modern C++,
smart pointers
should be used often. They will make your code safer as you no longer need to think (much) about the properallocation
anddeallocation
of memory. As a consequence, there will be much fewer memory leaks caused by dangling pointers or crashes from accessing invalidated memory blocks. -
When using
raw pointers
on the other hand, your code might be susceptible to the following bugs (a small illustration follows this list):- Memory leaks
- Freeing memory that shouldn’t be freed
- Freeing memory incorrectly
- Using memory that has not yet been allocated
- Thinking that memory is still allocated after being freed
-
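As a small illustration (hypothetical code, not from the lesson), the last two bugs in this list often look like this:
#include <iostream>

int main()
{
    int *p = new int(42);
    std::cout << *p << std::endl; // fine: memory is still allocated
    delete p;                     // memory is released here...

    // std::cout << *p;           // ...but p still holds the old address (dangling pointer);
                                  // dereferencing it would be undefined behavior

    p = nullptr; // defensive: make the invalid pointer obvious
    return 0;
}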
With all the advantages of smart pointers in modern C++, one could easily assume that it would be best to completely ban the use of
new
anddelete
from your code. However, while this is possible in many cases, it is not always advisable either. Let us take a look at the C++ core guidelines, which have several rules for explicit memoryallocation
anddeallocation
. In the scope of this course, we will briefly discuss three of them:-
R. 10: Avoid
malloc
andfree
While the calls(MyClass*)malloc( sizeof(MyClass) )
andnew MyClass
both allocate a block of memory on the heap in a perfectly valid manner, onlynew
will also call the constructor of the class, and onlydelete
will call its destructor (malloc and free do neither). To reduce the risk of undefined behavior,malloc
andfree
should thus be avoided (a brief sketch of this difference follows this list of rules). -
R. 11: Avoid calling
new
anddelete
explicitly Programmers have to make sure that every call ofnew
is paired with the appropriatedelete
at the correct position so that no memory leak or invalid memory access occur. The emphasis here lies in the word "explicitly" as opposed to implicitly, such as withsmart pointers
orcontainers
in the standard template library. -
R. 12: Immediately give the result of an explicit resource allocation to a manager object It is recommended to make use of manager objects for controlling resources such as files, memory or network connections to mitigate the risk of memory leaks. This is the core idea of
smart pointers
as discussed at length in this section.
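A brief sketch of the difference behind R. 10, assuming a hypothetical class with a non-trivial constructor and destructor:
#include <cstdlib>
#include <string>

class MyClass
{
    std::string _text{"Hello"}; // requires construction / destruction to manage its own memory
};

int main()
{
    // malloc only reserves raw bytes: no constructor is called, _text is never initialized
    MyClass *p1 = (MyClass *)std::malloc(sizeof(MyClass));
    std::free(p1); // and free calls no destructor

    // new allocates AND constructs, delete destructs AND deallocates
    MyClass *p2 = new MyClass;
    delete p2;

    return 0;
}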
-
-
Summarizing, raw pointers created with
new
anddelete
allow for a high degree of flexibility and control over the managed memory as we have seen in earlier lessons of this course. To mitigate their proneness to errors, the following additional recommendations can be given:-
A call to
new
should not be located too far away from the correspondingdelete
. It is bad style to stretch yournew
/delete
pairs throughout your program with references criss-crossing your entire code. -
Calls to
new
anddelete
should always be hidden from third parties so that they do not have to concern themselves with managing memory manually (which is similar to R. 12).
-
-
In addition to the above recommendations, the C++ core guidelines also contain a total of 13 rules for the recommended use of
smart pointers
. In the following, we will discuss a selection of these:-
R. 20 : Use
unique_ptr
orshared_ptr
to represent ownership -
R. 21 : Prefer
unique_ptr
overstd::shared_ptr
unless you need to share ownership
-
-
Both pointer types express ownership and responsibilities (R. 20). A
unique_ptr
is an exclusive owner of the managed resource; therefore, it cannot be copied, only moved. In contrast, ashared_ptr
shares the managed resource with others. As described above, this mechanism works by incrementing and decrementing a common reference counter. The resulting administration overhead makesshared_ptr
more expensive thanunique_ptr
. For this reasonunique_ptr
should always be the first choice (R. 21).-
R. 22 : Use
make_shared()
to makeshared_ptr
-
R. 23 : Use
make_unique()
to makestd::unique_ptr
-
-
The increased management overhead compared to raw pointers is particularly noticeable when a
shared_ptr
is used. Creating ashared_ptr
requires (1) the allocation of the resource usingnew
and (2) the allocation and management of the reference counter. Using the factory functionmake_shared
is a one-step operation with lower overhead and should thus always be preferred. (R.22). This also holds forunique_ptr
(R.23), although the performance gain in this case is minimal (if existent at all). -
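A brief sketch contrasting the two creation styles (the make_unique line assumes a compiler supporting at least C++14):
#include <memory>

int main()
{
    // two-step creation: one allocation for the int, another for the reference counter
    std::shared_ptr<int> p1(new int(42));

    // one-step creation: object and control block are allocated together
    auto p2 = std::make_shared<int>(42);

    // the unique_ptr equivalent (C++14 or later)
    auto p3 = std::make_unique<int>(42);

    return 0;
}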
But there is an additional reason for using the
make_...
factory functions: Creating asmart pointer
in a single step removes the risk of a memory leak. Imagine a scenario where an exception happens in the constructor of the resource. In such a case, the object would not be handled properly and its destructor would never be called - even if the managing object goes out of scope. Therefore,make_shared
andmake_unique
should always be preferred. Note thatmake_unique
is only available with compilers that support at least the C++14 standard.- R. 24 : Use
weak_ptr
to break cycles ofshared_ptr
-
We have seen that
weak pointers
provide a way to break a deadlock caused by two owning references which are cyclically referring to each other. Withweak pointers
, a resource can be safely deallocated as the reference counter is not increased. -
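As a sketch of R. 24, the circular-reference example from the shared pointer discussion above can be fixed by changing the member to a weak_ptr (a variation, not shown in the original lesson):
#include <iostream>
#include <memory>

class MyClass
{
public:
    std::weak_ptr<MyClass> _member; // weak_ptr instead of shared_ptr breaks the cycle
    ~MyClass() { std::cout << "Destructor of MyClass called" << std::endl; }
};

int main()
{
    std::shared_ptr<MyClass> myClass1(new MyClass);
    std::shared_ptr<MyClass> myClass2(new MyClass);

    myClass1->_member = myClass2; // does not increase the reference count
    myClass2->_member = myClass1;

    return 0; // both destructors are now called
}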
The remaining set of guideline rules referring to
smart pointers
are mostly concerning the question of how to pass asmart pointer
to a function. We will discuss this question in the next concept.
-
-
Transferring Ownership
-
In the previous section, we have taken a look at the three smart pointer types in C++. In addition to
smart pointers
, you are now also familiar withmove semantics
, which is of particular importance in this section. In the following, we will discuss how to properly pass and returnsmart pointers
to functions and vice-versa. In modern C++, there are various ways of doing this and in many cases, the method of choice has an impact on both performance and code robustness. The basis of this section are the C++ core guidelines onsmart pointers
, some of which we will be examining in the following. -
Passing
smart pointers
to functions-
Let us consider the following recommendation of the C++ guidelines on smart pointers:
-
R. 30 : Take
smart pointers
as parameters only to explicitly express lifetime semantics -
The core idea behind this rule is the notion that functions that only manipulate objects without affecting their lifetime in any way should not be concerned with a particular kind of
smart pointer
. A function that does not manipulate the lifetime or ownership should useraw pointers
orreferences
instead. A function should takesmart pointers
as parameter only if it examines or manipulates thesmart pointer
itself. As we have seen,smart pointers
are classes that provide several features such as counting the references of ashared_ptr
or increasing them by making a copy. Also, data can be moved from oneunique_ptr
to another and thus transferring the ownership. A particular function should accept smart pointers only if it expects to do something of this sort. If a function just needs to operate on the underlying object without the need of using anysmart pointer property
, it should accept the objects via raw pointers or references instead. -
The following examples are pass-by-value types that lend the ownership of the underlying object:
void f(std::unique_ptr<MyObject> ptr)
void f(std::shared_ptr<MyObject> ptr)
void f(std::weak_ptr<MyObject> ptr)
-
Passing
smart pointers
by value means to lend their ownership to a particular functionf
. In the above examples 1-3, all pointers are passed by value, i.e. the functionf
has a private copy of it which it can (and should) modify. Depending on the type ofsmart pointer
, a tailored strategy needs to be used. Before going into details, let us take a look at the underlying rule from the C++ guidelines (where "widget" can be understood as "class"). -
R.32: Take a
unique_ptr
parameter to express that a function assumes ownership of a widget
-
-
The basic idea of a
unique_ptr
is that there exists only a single instance of it. This is why it can’t be copied to a local function but needs to be moved instead with the functionstd::move
. The code example on the right illustrates the principle of transferring the object managed by the unique pointeruniquePtr
into a functionf
. -
#include <iostream> #include <memory> class MyClass { private: int _member; public: MyClass(int val) : _member{val} {} void printVal() { std::cout << ", managed object " << this << " with val = " << _member << std::endl; } }; void f(std::unique_ptr<MyClass> ptr) { std::cout << "unique_ptr " << &ptr; ptr->printVal(); } int main() { std::unique_ptr<MyClass> uniquePtr = std::make_unique<MyClass>(23); std::cout << "unique_ptr " << &uniquePtr; uniquePtr->printVal(); f(std::move(uniquePtr)); if (uniquePtr) uniquePtr->printVal(); return 0; }
-
The class
MyClass
has a private object_member
and a public functionprintVal()
which prints the address of the managed object(this)
as well as the member value to the console. In main, an instance ofMyClass
is created by the factory functionmake_unique()
and assigned to aunique pointer
instanceuniquePtr
for management. Then, the pointer instance is moved into the functionf
usingmove semantics
. As we have not overloaded themove constructor
ormove assignment
operator inMyClass
, the compiler is using the default implementation. Inf
, the address of the copied / moved unique pointerptr
is printed and the functionprintVal()
is called on it. When the path of execution returns tomain()
, the program checks for the validity ofuniquePtr
and, if valid, calls the functionprintVal()
on it again. Here is the console output of the program: -
unique_ptr 0x7ffeefbff710, managed object 0x100300060 with val = 23 unique_ptr 0x7ffeefbff6f0, managed object 0x100300060 with val = 23
-
The output nicely illustrates the
copy / move operation
. Note that the address ofunique_ptr
differs between the two calls while the address of the managed object as well as of the value are identical. This is consistent with the inner workings of themove constructor
, which we overloaded in a previous section. Thecopy-by-value
behavior off()
creates a new instance of theunique pointer
but then switches the address of the managedMyClass
instance fromsource
todestination
. After the move is complete, we can still use the variableuniquePtr
inmain
but it now is only an empty shell which does not contain an object to manage. -
When passing a
shared pointer
by value,move semantics
are not needed. As withunique pointers
, there is an underlying rule for transferring the ownership of ashared pointer
to a function: -
R.34: Take a
shared_ptr
parameter to express that a function is part owner -
#include <iostream> #include <memory> void f(std::shared_ptr<MyClass> ptr) { std::cout << "shared_ptr (ref_cnt= " << ptr.use_count() << ") " << &ptr; ptr->printVal(); } int main() { std::shared_ptr<MyClass> sharedPtr = std::make_shared<MyClass>(23); std::cout << "shared_ptr (ref_cnt= " << sharedPtr.use_count() << ") " << &sharedPtr; sharedPtr->printVal(); f(sharedPtr); std::cout << "shared_ptr (ref_cnt= " << sharedPtr.use_count() << ") " << &sharedPtr; sharedPtr->printVal(); return 0; }
-
Consider the example on the right. The main difference in this example is that the
MyClass
instance is managed by ashared pointer
. After creation inmain()
, the address of the pointer object as well as the current reference count are printed to the console. Then,sharedPtr
is passed to the functionf()
by value, i.e. a copy is made. After returning to main, pointer address and reference counter are printed again. Here is what the output of the program looks like: -
shared_ptr (ref_cnt= 1) 0x7ffeefbff708, managed object 0x100300208 with val = 23 shared_ptr (ref_cnt= 2) 0x7ffeefbff6e0, managed object 0x100300208 with val = 23 shared_ptr (ref_cnt= 1) 0x7ffeefbff708, managed object 0x100300208 with val = 23
-
Throughout the program, the address of the managed object does not change. When passed to
f()
, the reference count changes to2
. After the function returns and the localshared_ptr
is destroyed, the reference count changes back to1
. In summary,move semantics
are usually not needed when usingshared pointers
. Shared pointers can be passed by value safely and the main thing to remember is that with each pass, the internal reference counter is increased while the managed object stays the same. -
Without giving an example here, the
weak_ptr
can be passed by value as well, just like theshared pointer
. The only difference is that the pass does not increase the reference counter. -
With the above examples, pass-by-value has been used to lend the ownership of
smart pointers
. Now let us consider the following additional rules from the C++ guidelines onsmart pointers
: -
R.33: Take a
unique_ptr&
parameter to express that a function reseats the widget -
and
-
R.35: Take a
shared_ptr&
parameter to express that a function might reseat theshared pointer
-
Both rules recommend passing-by-reference, when the function is supposed to modify the ownership of an existing
smart pointer
and not a copy. We pass a non-const reference to aunique_ptr
to a function if it might modify it in any way, including deletion and reassignment to a different resource. -
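A minimal sketch of such a reseating function (the name reseat is made up for illustration):
#include <iostream>
#include <memory>

void reseat(std::unique_ptr<int> &ptr)
{
    ptr.reset(new int(2021)); // deletes the old int and takes ownership of a new one
}

int main()
{
    auto uniquePtr = std::make_unique<int>(23);
    reseat(uniquePtr);
    std::cout << *uniquePtr << std::endl; // prints 2021
    return 0;
}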
Passing a
unique_ptr
asconst
is not useful as the function will not be able to do anything with it:Unique pointers
are all about proprietary ownership and as soon as the pointer is passed, the function will assume ownership. But without the right to modify the pointer, the options are very limited. -
A
shared_ptr
can either be passed asconst
ornon-const
reference. Theconst
should be used when you want to express that the function will only read from the pointer or it might create alocal copy
andshare ownership
. -
Lastly, we will take a look at passing
raw pointers
andreferences
. The general rule of thumb is that we can use asimple raw pointer
(which can be null) or a plain reference (which can not be null), when the function we are passing will only inspect the managed object without modifying thesmart pointer
. The internal (raw) pointer to the object can be retrieved using theget()
member function. Also, by providing access to theraw pointer
, you can use thesmart pointer
to manage memory in your own code and pass the raw pointer to code that does not supportsmart pointers
. -
When using
raw pointers
retrieved from theget()
function, you should take special care not todelete
them or to createnew smart pointers
from them. If you did so, the ownership rules applying to the resource would be severely violated. When passing araw pointer
to a function or when returning it (see next section),raw pointers
should always be considered as owned by thesmart pointer
from which the raw reference to the resource has been obtained. -
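A small sketch of this usage pattern (printValue is a hypothetical observer function that does not manage the object's lifetime):
#include <iostream>
#include <memory>

void printValue(const int *raw)
{
    if (raw != nullptr)
        std::cout << *raw << std::endl;
}

int main()
{
    auto smartPtr = std::make_unique<int>(42);
    printValue(smartPtr.get()); // pass the raw pointer; ownership stays with smartPtr
    return 0;                   // never call delete on the pointer returned by get()
}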
Returning
smart pointers
from functions-
With return values, the same logic that we have used for passing
smart pointers
to functions applies: Return asmart pointer
, eitherunique
orshared
, if the caller needs to manipulate or access the pointer properties. In case the caller just needs the underlying object, araw pointer
should be returned. -
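A minimal factory sketch returning a unique pointer by value (createString is a made-up helper, assuming C++14 for make_unique):
#include <iostream>
#include <memory>
#include <string>

std::unique_ptr<std::string> createString(const char *text)
{
    return std::make_unique<std::string>(text); // returned by value; moved or elided
}

int main()
{
    std::unique_ptr<std::string> str = createString("Hello factory");
    std::cout << *str << std::endl;
    return 0;
}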
Smart pointers
should always be returned by value. This is not only simpler but also has the following advantages:-
The overhead usually associated with return-by-value due to the expensive copying process is significantly mitigated by the built-in
move semantics
ofsmart pointers
. They only hold a reference to themanaged object
, which is quickly switched fromdestination
tosource
during themove process
. -
Since C++17, the compiler uses
Return Value Optimization (RVO)
to avoid the copy usually associated with return-by-value. This technique, together with copy-elision, is able to optimize evenmove semantics
andsmart pointers
(not in all cases though, so they are still an essential part of modern C++).
When returning a
shared_ptr
by value, the internal reference counter is guaranteed to be properly incremented. This is not the case when returning bypointer
or byreference
.
-
-
The topic of
smart pointers
is a complex one. In this course, we have covered many basics and some of the more advanced concepts. However, there are many more aspects to consider and features to use when integratingsmart pointers
into your code. The full set of smart pointer rules in the C++ guidelines is a good start to dig deeper into one of the most powerful features of modern C++.
-
-
-
Best-Practices for
Passing Smart Pointers
-
This section contains a condensed summary of when (and when not) to use
smart pointers
and how to properly pass them between functions. This section is intended as a guide for your future use of this important feature in modern C++ and will hopefully encourage you not to ditchraw pointers
altogether but instead to think about where your code could benefit fromsmart pointers
- and when it would most probably not. -
The following list contains all the variations (omitting const) of passing an object to a function:
-
void f( object* );             // (a)
void f( object& );             // (b)
void f( unique_ptr<object> );  // (c)
void f( unique_ptr<object>& ); // (d)
void f( shared_ptr<object> );  // (e)
void f( shared_ptr<object>& ); // (f)
-
The Preferred Way
-
The preferred way to pass object parameters is by using a) or b):
-
void f( object* ); void f( object& );
-
In doing so, we do not have to worry about the
lifetime policy
a caller might have implemented. Using a specificsmart pointer
in a case where we only want to observe an object or manipulate a member might be overly restrictive. -
With the
non-owning raw pointer
*
or thereference
&
we can observe an object for which we can assume that its lifetime will exceed the lifetime of the function parameter. In concurrent code, however, this might not be the case, but for linear code we can safely assume it.
To decide whether a
*
or&
is more appropriate, you should think about whether you need to express that there is no object. This can only be done with pointers by passing e.g.nullptr
. In most other cases, you should use a reference instead.
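A short sketch of variants a) and b), with hypothetical observer functions that leave ownership with the caller:
#include <iostream>
#include <memory>

struct Object { int value = 42; };

void inspectByPointer(const Object *obj)
{
    if (obj) std::cout << obj->value << std::endl; // the pointer may be null
}

void inspectByReference(const Object &obj)
{
    std::cout << obj.value << std::endl; // a reference is never null
}

int main()
{
    auto owner = std::make_unique<Object>(); // ownership stays in main
    inspectByPointer(owner.get());
    inspectByReference(*owner);
    return 0;
}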
-
-
The Object Sink
-
The preferred way of passing an object to a function so that the function takes ownership of the object (or "consumes" it) is by using method c) from the above list:
-
void f( unique_ptr<object> );
-
In this case, we are passing a
unique pointer
by value from caller to function, which then takes ownership of the pointer and the underlying object. This is only possible usingmove semantics
as there may be only a single reference to the object managed by theunique pointer
. -
After the object has been passed in this way, the caller will have an invalid
unique pointer
and the function to which the object now belongs may destroy it or move it somewhere else. -
Using
const
with this particular call does not make sense, as it models an ownership transfer, and so the source will definitely be modified.
-
-
In And Out Again 1
-
In some cases, we want to modify a unique pointer (not necessarily the underlying object) and re-use it in the context of the caller. In this case, method d) from the above list might be most suitable:
-
void
f( unique_ptr<object>& );
-
Using this call structure, the function states that it might modify the smart pointer, e.g. by redirecting it to another object. It is not recommended to use this form merely for accepting an object, because that would unnecessarily restrict the caller to a particular object lifetime strategy.
-
Using
const
with this call structure is not recommendable as we would not be able to modify theunique_ptr
in this case. In case you want to modify the underlying object, use method a) instead.
-
-
Sharing Object Ownership
-
In the last examples, we have looked at strategies involving unique ownership. In this example, we want to express that a function will store and share ownership of an object on the heap. This can be achieved by using method e) from the list above:
-
void f( shared_ptr<object> );
-
In this example, we are making a copy of the
shared pointer
passed to the function. In doing so, the internal reference counter within all shared pointers referring to the same heap object is incremented by one. -
This strategy can be recommended for cases where the function needs to retain a copy of the
shared_ptr
and thus share ownership of the object. This is of interest when we need access tosmart pointer
functions such as thereference count
or we must make sure that the object to which theshared pointer
refers is not prematurely deallocated (which might happen in concurrent programming). -
If the local scope of the function is not the final destination, a
shared pointer
can also be moved, which does not increase the reference count and is thus more efficient.
A disadvantage of using a
shared_ptr
as a function argument is that the function will be limited to using only objects that are managed byshared pointers
- which limits flexibility and reusability of the code.
-
-
In And Out Again 2
-
As with
unique pointers
, the need to modifyshared pointers
and re-use them in the context of the caller might arise. In this case, method f) might be the right choice: -
void f( shared_ptr<object>& );
-
This particular way of passing a
shared pointer
expresses that the functionf
will modify the pointer itself. As with method e), we will be limiting the usability of the function to cases where the object is managed by ashared_ptr
and nothing else.
-
-
Last Words
- The topic of
smart pointers
is a complex one. In this course, we have covered many basics and some of the more advanced concepts. However, for some cases there are more aspects to consider and features to use when integrating smart pointers into your code. The full set of smart pointer rules in the C++ guidelines is a good start to dig deeper into one of the most powerful features of modern C++.
-
-
Exercise
-
/* Smart pointer exercises: Handling unique, shared and smart pointers

// If all tasks are solved properly, the following text should appear in the terminal
Learn Coding with Udacity!
weak pointer is expired

Note: Compile with C++17
*/

#include <string>
#include <iostream>
#include <memory>

void f1(std::unique_ptr<std::string> unique_ptr)
{
    // TASK 3: Print the content of unique_ptr to the terminal
    // SOLUTION 3:
    std::cout << *unique_ptr;
}

void f2(std::shared_ptr<std::string> shared_ptr)
{
    // TASK 4: Print the use count property of shared_ptr to the terminal to see how many pointers refer to its resource
    // If the use count is 2, print the content of shared_ptr to the terminal
    // SOLUTION 4:
    if (shared_ptr.use_count() == 2)
        std::cout << *shared_ptr;
}

void f3(std::weak_ptr<std::string> weak_ptr)
{
    // TASK 5: Lock the weak pointer by assigning it to a shared pointer. Then, print its content to the terminal.
    // If the weak ptr can not be locked because the resource it refers to has expired, print the string "weak pointer is expired" to the terminal.
    // SOLUTION 5:
    if (auto shared_ptr = weak_ptr.lock()) // copy into a shared_ptr to use it
    {
        std::cout << *shared_ptr << "\n";
    }
    else
    {
        std::cout << "weak pointer is expired\n";
    }
}

int main()
{
    // create resources to move around
    auto unique_str = std::make_unique<std::string>("Learn ");
    auto shared_str_1 = std::make_shared<std::string>("Coding ");
    auto shared_str_2 = std::make_shared<std::string>("with Udacity!");

    // Moving a unique pointer to transfer ownership
    // TASK 1 : pass the pointer 'unique_str' into the function f1
    // SOLUTION 1:
    f1(std::move(unique_str));

    // Pass a shared pointer by value
    // TASK 2 : pass the pointer 'shared_str_1' into the function f2
    // SOLUTION 2:
    f2(shared_str_1);

    // Pass a weak ptr by value and create a shared ptr from it to use it
    std::weak_ptr<std::string> weak_ptr_1;
    weak_ptr_1 = shared_str_2;
    f3(weak_ptr_1);

    // Pass a weak ptr by value after the shared ptr has expired
    std::weak_ptr<std::string> weak_ptr_2;
    {
        auto shared_str_3 = std::make_shared<std::string>("without Udacity");
        weak_ptr_2 = shared_str_3;
    }
    f3(weak_ptr_2);
}
-
Processes and Threads
-
In this lesson, you will learn how to start and manage your first parallel path of execution, which runs concurrently with the main program and is thus asynchronous. In contrast to synchronous programs, the main program can continue with its line of execution without the need to wait for the parallel task to complete. The following figure illustrates this difference.
-
Before we start writing a first asynchronous program in C++, let us take a look at the differences between two important concepts :
processes
andthreads
. -
A
process
(also called atask
) is a computer program at runtime. It is comprised of the runtime environment provided by the operating system (OS), as well as of the embedded binary code of the program during execution. A process is controlled by the OS through certain actions with which it sets the process into one of several carefully defined states: -
Ready : After its creation, a process enters the ready state and is loaded into main memory. The process now is ready to run and is waiting for CPU time to be executed. Processes that are ready for execution by the CPU are stored in a queue managed by the OS.
-
Running : The operating system has selected the process for execution and the instructions within the process are executed on one or more of the available CPU cores.
-
Blocked : A process that is blocked is one that is waiting for an event (such as a system resource becoming available) or the completion of an I/O operation.
-
Terminated: When a process completes its execution or when it is being explicitly killed, it changes to the "terminated" state. The underlying program is no longer executing, but the process remains in the process table as a "zombie process". When it is finally removed from the process table, its lifetime ends.
-
Ready suspended : A process that was initially in ready state but has been swapped out of main memory and placed onto external storage is said to be in suspend ready state. The process will transition back to ready state whenever it is moved to main memory again.
-
Blocked suspended : A process that is blocked may also be swapped out of main memory. It may be swapped back in again under the same conditions as a "ready suspended" process. In such a case, the process will move to the blocked state, and may still be waiting for a resource to become available.
-
-
Processes are managed by the
scheduler
of the OS. Thescheduler
can either let a process run until it ends or blocks (non-interrupting scheduler), or it can ensure that the currently running process is interrupted after a short period of time. The scheduler can switch back and forth between different active processes (interrupting scheduler), alternately assigning them CPU time. The latter is the typical scheduling strategy of any modern operating system. -
Since the administration of processes is computationally taxing, operating systems support a more resource-friendly way of realizing concurrent operations: the threads.
-
A
thread
represents a concurrent execution unit within a process. In contrast to full-blown processes as described above, threads are characterized as light-weight processes (LWP). These are significantly easier to create and destroy: In many systems the creation of a thread is up to 100 times faster than the creation of a process. This is especially advantageous in situations, when the need for concurrent operations changes dynamically. -
Threads exist within processes and share their resources. As illustrated by the figure above, a process can contain several threads or - if no parallel processing is provided for in the program flow - only a single thread.
-
A major difference between a process and a thread is that each process has its own address space, while a thread does not require a new address space to be created. All the
threads
in aprocess
can access itsshared memory
. Threads also share other OS dependent resources such as processors, files, and network connections. As a result, the management overhead for threads is typically less than for processes. Threads, however, are not protected against each other and must carefully synchronize when accessing the shared process resources to avoid conflicts. -
Similar to processes, threads exist in different states, which are illustrated in the figure below:
-
New : A thread is in this state once it has been created. Until it is actually running, it will not take any CPU resources.
-
Runnable : In this state, a thread might actually be running or it might be ready to run at any instant of time. It is the responsibility of the
thread scheduler
to assign CPU time to the thread. -
Blocked : A
thread
might be in this state, when it is waiting for I/O operations to complete. When blocked, a thread cannot continue its execution any further until it is moved to therunnable state
again. It will not consume any CPU time in this state. Thethread scheduler
is responsible for reactivating the thread. -
Concurrency Support in C++11
-
The concurrency support in C++ makes it possible for a program to execute multiple
threads
in parallel. Concurrency was first introduced into the standard with C++11. Since then, new concurrency features have been added with each new standard update, such as in C++14 and C++17. Before C++11, concurrent behavior had to be implemented using native concurrency support from the OS, usingPOSIX Threads
, or third-party libraries such asBOOST
. The standardization of concurrency in C++ now makes it possible to develop cross-platform concurrent programs, which is a significant improvement that saves time and reduces error proneness. Concurrency in C++ is provided by thethread
support library, which can be accessed by including the <thread> header.
A running program consists of at least one thread. When the main function is executed, we refer to it as the "main thread". Threads are uniquely identified by their
thread ID
, which can be particularly useful for debugging a program. The following code prints the thread identifier of the main thread to the console: -
#include <iostream>
#include <thread>

int main()
{
    std::cout << "Hello concurrent world from main! Thread id = " << std::this_thread::get_id() << std::endl;
    return 0;
}
-
These are the results when run:
-
Hello concurrent world from main! Thread id = 1
-
Also, it is possible to retrieve the number of CPU cores available on a system. The following example prints the number of CPU cores to the console.
-
#include <iostream>
#include <thread>

int main()
{
    unsigned int nCores = std::thread::hardware_concurrency();
    std::cout << "This machine supports concurrency with " << nCores << " cores available" << std::endl;
    return 0;
}
-
These are the results from a local machine at the time of writing:
-
This machine supports concurrency with 2 cores available
-
-
Starting a second thread
-
In this section, we will start a second
thread
in addition to the mainthread
of our program. To do this, we need to construct a thread object and pass it the function we want to be executed by the thread. Once the thread enters the runnable state, the execution of the associated thread
function may start at any point in time. -
#include <iostream> #include <thread> void threadFunction() { std::this_thread::sleep_for(std::chrono::milliseconds(100)); // simulate work std::cout << "Finished work in thread\n"; } int main() { // create thread std::thread t(threadFunction); // do something in main() std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulate work std::cout << "Finished work in main\n"; // wait for thread to finish t.join(); return 0; }
-
After the thread object has been constructed, the main thread will continue and execute the remaining instructions until it reaches the end and returns. It is possible that by this point in time, the thread will also have finished. But if this is not the case, the main program will terminate and the resources of the associated process will be freed by the OS. As the thread exists within the process, it can no longer access those resources and thus cannot finish its execution as intended.
-
To prevent this from happening and have the main program wait for the thread to finish the execution of the thread function, we need to call
join()
on the thread object. This call blocks the calling (main) thread until the worker thread has reached the end of its thread function. -
The code above shows how to use
join()
to ensure that main()
waits for the thread t
to finish its operations before returning. It uses the function sleep_for()
, which pauses the execution of the respective threads for a specified amount of time. The idea is to simulate some work to be done in the respective threads of execution. -
To compile this code with
g++
, you will need to use the -pthread
flag. pthread
adds support for multithreading with the pthreads
library, and the option sets flags for both the preprocessor and the linker: POSIX Threads
, usually referred to as pthreads
, is an execution model that exists independently of any programming language. It allows a program to control multiple different flows of work that overlap in time. Each flow of work is referred to as a thread, and creation and control over these flows is achieved by making calls to the POSIX Threads API. POSIX Threads is an API defined by the standard POSIX.1c, Threads extensions (IEEE Std 1003.1c-1995).
-
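For reference, a typical compile command for the example above looks like this (the file name here is only an example):

g++ example.cpp -pthread -o example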
Note: If you compile without the
-pthread
flag, you will see an error of the form: undefined reference to pthread_create. You will need to use the -pthread flag for all other multithreaded examples in this course going forward. -
The code produces the following output:
-
Finished work in main Finished work in thread
-
Not surprisingly, the main function finishes before the thread because the delay inserted into the thread function is much larger than in the main path of execution. The call to
join()
at the end of the main function ensures that it will not prematurely return. As an experiment, comment out t.join()
and execute the program. What do you expect will happen?
-
-
Randomness of events
-
One very important trait of concurrent programs is their non-deterministic behavior. It can not be predicted which
thread
the scheduler
will execute at which point in time. In the code below, the amount of work to be performed both in the thread function and in main has been split into two separate jobs. -
#include <iostream> #include <thread> void threadFunction() { std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulate work std::cout << "Finished work 1 in thread\n"; std::this_thread::sleep_for(std::chrono::milliseconds(50)); std::cout << "Finished work 2 in thread\n"; } int main() { // create thread std::thread t(threadFunction); // do something in main() std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulate work std::cout << "Finished work 1 in main\n"; std::this_thread::sleep_for(std::chrono::milliseconds(50)); std::cout << "Finished work 2 in main\n"; // wait for thread to finish t.join(); return 0; }
-
The console output shows that the work packages in both threads have been interleaved with the first package being performed before the second package.
-
Finished work 1 in thread Finished work 1 in main Finished work 2 in thread Finished work 2 in main
-
Interestingly, when the program is executed again, the order of execution may change: instead of the thread finishing its second work package first, main may get there first.
-
Executing the code several more times shows that the two versions of program output alternate in a seemingly random manner. This element of randomness is an important characteristic of concurrent programs, and we have to take measures to deal with it in a controlled way that prevents unwanted behavior or even program crashes.
-
Reminder: You will need to use the
-pthread
flag when compiling this code, just as you did with the previous example. This flag will be needed for all future multithreaded programs in this course as well.
-
-
Using
join()
as a barrier-
In the previous example, the order of execution is determined by the scheduler. If we wanted to ensure that the
thread
function completed its work before the main function
started its own work (because it might be waiting for a result to be available), we could achieve this by repositioning the call to join. -
#include <iostream> #include <thread> void threadFunction() { std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulate work std::cout << "Finished work 1 in thread\n"; std::this_thread::sleep_for(std::chrono::milliseconds(50)); std::cout << "Finished work 2 in thread\n"; } int main() { // create thread std::thread t(threadFunction); // wait for thread to finish t.join(); // do something in main() std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulate work std::cout << "Finished work 1 in main\n"; std::this_thread::sleep_for(std::chrono::milliseconds(50)); std::cout << "Finished work 2 in main\n"; return 0; }
-
In the code above, the
.join()
has been moved to before the work in main()
. The order of execution now always looks like the following: -
Finished work 1 in thread Finished work 2 in thread Finished work 1 in main Finished work 2 in main
-
In later sections of this course, we will make extended use of the join() function to carefully control the flow of execution in our programs and to ensure that results of thread functions are available and complete where we need them to be.
-
-
Detach
-
Let us now take a look at what happens if we don’t join a thread before its destructor is called. When we comment out
join
in the example above and then run the program again, it aborts with an error. The reason for this is that the designers of the C++ standard wanted to make debugging a multi-threaded program easier: Having the program crash forces the programmer to remember to properly join the threads
that have been created. Such a hard error is usually much easier to detect than soft errors that do not show themselves so obviously. -
There are some situations, however, where it might make sense not to wait for a thread to finish its work. This can be achieved by "detaching" the thread, which sets its internal state "joinable" to "false". This works by calling the
detach()
method on the thread
. The destructor of a detached thread does nothing: It neither blocks nor does it terminate the thread. In the following example, detach
is called on the thread object, which causes the main thread to immediately continue until it reaches the end of the program code and returns. Note that a detached thread can not be joined ever again. -
#include <iostream> #include <thread> void threadFunction() { std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulate work std::cout << "Finished work in thread\n"; } int main() { // create thread std::thread t(threadFunction); // detach thread and continue with main t.detach(); // do something in main() std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulate work std::cout << "Finished work in main\n"; return 0; }
-
You can run the code above (example_6.cpp) to observe this behavior. -
Programmers should be very careful, though, when using the
detach()
method. You have to make sure that the thread does not access any data that might go out of scope or be deleted. Also, we do not want the program to terminate with threads still running. Should this happen, such threads will be terminated very harshly, without the chance to properly clean up their resources - cleanup that would normally happen in their destructors. A well-designed program therefore provides a mechanism for joining all threads before exiting. -
#include <iostream> #include <thread> void threadFunctionEven() { std::this_thread::sleep_for(std::chrono::milliseconds(1)); // simulate work std::cout << "Even thread\n"; } /* Student Task START */ void threadFunctionOdd() { std::this_thread::sleep_for(std::chrono::milliseconds(1)); // simulate work std::cout << "Odd thread\n"; } /* Student Task END */ int main() { /* Student Task START */ for (int i = 0; i < 6; ++i) { if (i % 2 == 0) { std::thread t(threadFunctionEven); t.detach(); } else { std::thread t(threadFunctionOdd); t.detach(); } } /* Student Task END */ // ensure that main does not return before the threads are finished std::this_thread::sleep_for(std::chrono::milliseconds(1)); // simulate work std::cout << "End of main is reached" << std::endl; return 0; }
-
Run the program several times and look at the console output. What do you observe? As a second experiment, comment out the
sleep_for
function in the main thread. What happens to the detached threads in this case? -
The order in which even and odd threads are executed changes. Also, some threads are executed after the main function reaches its end. When
sleep_for
is removed, the detached threads do not get a chance to finish before the program terminates.
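As a side note, std::thread offers the joinable() member function, which reports whether a thread object still has an associated thread that must be joined or detached. A minimal sketch (not part of the original examples) of a defensive join before returning from main:

#include <thread>

int main()
{
    std::thread t([]() { /* ... do some work ... */ });

    // ... other work in main ...

    // join only if the thread has been neither joined nor detached yet;
    // calling join() on a non-joinable thread would throw std::system_error
    if (t.joinable())
        t.join();

    return 0;
}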
-
-
Starting a Thread with a Function Object
-
Functions and Callable Objects
-
In the previous section, we have created our first thread by passing it a function to execute. We did not discuss this concept in depth at the time, but in this section we will focus on the details of passing functions to other functions, which is one form of a
callable object
. -
In C++,
callable objects
are objects that can appear as the left-hand operand of the call operator. These can be pointers to functions, objects of a class that defines an overloaded function call operator, or lambdas (anonymous inline functions), with which function objects can be created in a very simple way. In the context of concurrency, we can use callable objects to attach a function to a thread. -
Functor example (defines an overloaded function call operator):
-
// this is a functor
struct add_x {
    add_x(int val) : x(val) {} // Constructor
    int operator()(int y) const { return x + y; }

private:
    int x;
};

// Now you can use it like this:
add_x add42(42); // create an instance of the functor class
-
In the last section, we constructed a thread object by passing a function to it without any arguments. If we were limited to this approach, the only way to make data available from within the thread function would be to use global variables - which is not recommended and quickly becomes messy.
-
In this section, we will therefore look at several ways of passing data to a thread function.
-
The
std::thread
constructor can also be called with instances of classes that implement the function-call operator. In the following, we will thus define a class that has an overloaded()
-operator. In preparation for the final project of this course, which will be a traffic simulation with vehicles moving through intersections in a street grid, we will define a (very) early version of the Vehicle
class in this example: -
#include <iostream> #include <thread> class Vehicle { public: void operator()() { std::cout << "Vehicle object has been created \n" << std::endl; } }; int main() { // create thread std::thread t(Vehicle()); // C++'s most vexing parse // do something in main() std::cout << "Finished work in main \n"; // wait for thread to finish t.join(); return 0; }
-
When compiling this code, the compiler generates a warning about the ambiguous declaration, followed by an error:
-
example_1.cpp: In function ‘int main()’: example_1.cpp:23:7: error: request for member ‘join’ in ‘t’, which is of non-class type ‘std::thread(Vehicle (*)())’ t.join();
-
Adding an extra pair of parentheses avoids what is known as C++'s "most vexing parse", which is a specific form of syntactic ambiguity resolution in the C++ programming language.
-
The term was coined by Scott Meyers in 2001, who discusses it in detail in his book "Effective STL". The "most vexing parse" stems from a rule in C++ that anything that can be parsed as a function declaration must be parsed as a function declaration - even if it could also be interpreted as something else.
-
In the previous code example, the line
std::thread t(Vehicle());
is seemingly ambiguous, since it could be interpreted either as:- a variable definition for variable
t
of classstd::thread
, initialized with an anonymous instance of class Vehicle or - a function declaration for a function
t
that returns an object of typestd::thread
and has a single (unnamed) parameter that is a pointer to function returning an object of typeVehicle
- a variable definition for variable
-
Most programmers would presumably expect the first case to be true, but the C++ standard requires the line to be interpreted as the second - hence the compiler output shown above.
-
There are three ways of forcing the compiler to consider the line as the first case, which would create the thread object we want:
- Add an extra pair of parentheses
- Use copy initialization
- Use uniform initialization with braces
-
The following code shows all three variants:
-
std::thread t1( (Vehicle()) ); // Add an extra pair of parentheses
std::thread t2 = std::thread( Vehicle() ); // Use copy initialization
std::thread t3{ Vehicle() }; // Use uniform initialization with braces
-
Whichever option we use, the idea is the same: the function object is copied into internal storage accessible to the new
thread
, and the new thread invokes the operator()
. TheVehicle
class can of course have data members and other member functions too, and this is one way of passing data to the thread function: pass it in as a constructor argument and store it as a data member: -
#include <iostream> #include <thread> class Vehicle { public: Vehicle(int id) : _id(id) {} void operator()() { std::cout << "Vehicle #" << _id << " has been created" << std::endl; } private: int _id; }; int main() { // create thread std::thread t = std::thread(Vehicle(1)); // Use copy initialization // do something in main() std::cout << "Finished work in main \n"; // wait for thread to finish t.join(); return 0; }
-
In the above code example, the class
Vehicle
has a constructor that takes an integer and stores it internally in a variable _id
. In the overloaded function call operator, the vehicle id
is printed to the console. In main()
, we are creating the Vehicle
object using copy initialization
. The output of the program is given below: -
Finished work in main Vehicle #1 has been created
-
As can easily be seen, the integer ID has been successfully passed into the thread function.
-
-
Lambdas
-
Another very useful way of starting a
thread
and passing information to it is by using a lambda expression ("Lambda" for short). With a Lambda you can easily create simple function objects. -
The name "Lambda" comes from Lambda Calculus , a mathematical formalism invented by Alonzo Church in the 1930s to investigate questions of logic and computability. Lambda calculus formed the basis of LISP, a functional programming language. Compared to Lambda Calculus and LISP, C ++ - Lambdas have the properties of being unnamed and capturing variables from the surrounding context, but lack the ability to execute and return functions.
-
A Lambda is often used as an argument for functions that can take a callable object. This can be easier than creating a named function that is used only when passed as an argument. In such cases, Lambdas are generally preferred because they allow the function objects to be defined inline. If Lambdas were not available, we would have to define an extra function somewhere else in our source file - which would work but at the expense of the clarity of the source code.
-
A Lambda is a function object (a
"functor"
), so it has a type and can be stored and passed around. Its result object is called a"closure"
, which can be called using the operator()
as we will see shortly. -
A lambda formally consists of three parts: a capture list
[]
, a parameter list()
and a main part{}
, which contains the code to be executed when the Lambda is called. Note that in principle all parts could be empty. -
The capture list
[]
: By default, variables outside of the enclosing{}
around the main part of the Lambda can not be accessed. By adding a variable to the capture list however, it becomes available within the Lambda either as a copy or as a reference. The captured variables become a part of the Lambda. -
By default, variables in the capture block can not be modified within the Lambda. Using the keyword "mutable" allows us to modify the parameters captured by copy, and to call their non-const member functions within the body of the Lambda. The following code examples show several ways of making the external variable "id" accessible within a Lambda.
-
#include <iostream> int main() { // create lambdas int id = 0; // Define an integer variable //auto f0 = []() { std::cout << "ID = " << id << std::endl; }; // Error: 'id' cannot be accessed id++; auto f1 = [id]() { std::cout << "ID = " << id << std::endl; }; // OK, 'id' is captured by value id++; auto f2 = [&id]() { std::cout << "ID = " << id << std::endl; }; // OK, 'id' is captured by reference //auto f3 = [id]() { std::cout << "ID = " << ++id << std::endl; }; // Error, 'id' may not be modified auto f4 = [id]() mutable { std::cout << "ID = " << ++id << std::endl; }; // OK, 'id' may be modified // execute lambdas f1(); f2(); f4(); return 0; }
-
Even though we have been using Lambdas in the above example in various ways, it is important to note that a Lambda does not exist at runtime. The
runtime
effect of a Lambda is the generation of an object
, which is known as a closure
. The difference between a Lambda and the corresponding closure
is similar to the distinction between a class
and an instance
of the class
. A class
exists only in the source code, while the objects created from it exist at runtime. -
We can use (a copy of) the closure (i.e. f0, f1, …) to execute the code within the Lambda at a position in our program different to the line where the function object was created.
-
The parameter list
()
: The way parameters are passed to a Lambda is basically identical to calling a regular function. If the Lambda takes no arguments, these parentheses can be omitted (except when "mutable" is used). -
The following example illustrates how the function object is first created and then used to pass the parameter id later in the code.
-
#include <iostream> int main() { int id = 0; // Define an integer variable // create lambda auto f = [](const int id) { std::cout << "ID = " << id << std::endl; }; // ID is passed as a parameter // execute function object and pass the parameter f(id); return 0; }
-
-
Starting Threads with Lambdas
-
A Lambda is, as we’ve seen, just an object and, like other objects, it may be copied, passed as a parameter, stored in a container, etc. The Lambda object has its own scope and lifetime which may, in some circumstances, be different from those of the objects it has ‘captured’. Programmers need to take special care when capturing local objects by reference, because a Lambda’s lifetime may exceed the lifetime of the objects it captures: It must be ensured that the object to which the reference points is still in scope when the Lambda is called. This is especially important in multi-threaded programs.
-
So let us start a thread and pass it a Lambda object to execute:
-
#include <iostream> #include <thread> int main() { int id = 0; // Define an integer variable // starting a first thread (by reference) auto f0 = [&id]() { std::this_thread::sleep_for(std::chrono::milliseconds(100)); std::cout << "a) ID in Thread (call-by-reference) = " << id << std::endl; }; std::thread t1(f0); // starting a second thread (by value) std::thread t2([id]() mutable { std::this_thread::sleep_for(std::chrono::milliseconds(50)); std::cout << "b) ID in Thread (call-by-value) = " << id << std::endl; }); // increment and print id in main ++id; std::cout << "c) ID in Main (call-by-value) = " << id << std::endl; // wait for threads before returning t1.join(); t2.join(); return 0; }
-
c) ID in Main (call-by-value) = 1 b) ID in Thread (call-by-value) = 0 a) ID in Thread (call-by-reference) = 1
-
As you can see, the output in the main thread is generated first, at which point the variable ID has taken the value 1. Then, the call-by-value thread is executed with ID at a value of 0. Then, the call-by-reference thread is executed with ID at a value of 1. This illustrates the effect of capturing by reference: when the data to which the reference refers changes before the thread is executed, those changes will be visible to the thread. We will see other examples of such behavior later in the course, as this is a primary source of concurrency bugs.
-
-
Starting a Thread with Variadic Templates and Member Functions
-
Passing Arguments using a Variadic Template
-
In the previous section, we have seen that one way to pass arguments to the thread function is to package them in a class using the
function call operator
. Even though this worked well, it would be very cumbersome to write a special class every time we need to pass data to a thread
. We can also use a Lambda that captures the arguments and then calls the function. But there is a simpler way: The thread constructor
may be called with a function and all its arguments. That is possible because the thread constructor is a variadic template
that can take a variable number of arguments. -
Before C++11, class and function templates could only accept a fixed number of template arguments, which had to be specified in the declaration. With variadic templates it is possible to accept any number of arguments of any type, as the sketch below illustrates.
- Variadic functions are functions (e.g. printf) which take a variable number of arguments.
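To illustrate the general mechanism, here is a minimal sketch of a variadic template (a hypothetical helper, not taken from the course code):

#include <iostream>

// base case: a single argument
template <typename T>
void printAll(const T &first)
{
    std::cout << first << std::endl;
}

// variadic case: peel off the first argument and recurse over the rest
template <typename T, typename... Args>
void printAll(const T &first, const Args &... rest)
{
    std::cout << first << " ";
    printAll(rest...);
}

int main()
{
    printAll(1, 2.5, "three"); // any number of arguments of any printable type
    return 0;
}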
-
#include <iostream>
#include <thread>
#include <string>

void printID(int id)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    std::cout << "ID = " << id << std::endl;
}

void printIDAndName(int id, std::string name)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    std::cout << "ID = " << id << ", name = " << name << std::endl;
}

int main()
{
    int id = 0; // Define an integer variable

    // starting threads using variadic templates
    std::thread t1(printID, id);
    std::thread t2(printIDAndName, ++id, "MyString");
    std::thread t3(printIDAndName, ++id); // this produces a compiler error

    // wait for threads before returning
    t1.join();
    t2.join();
    //t3.join();

    return 0;
}
-
As seen in the code example above, a first thread object is constructed by passing it the function
printID
and an integer argument. Then, a second thread object is constructed with a function printIDAndName
, which requires an integer and a string parameter. If only a single argument was provided to the thread when calling printIDAndName
, a compiler error would occur (see std::thread t3
in the example) - which is the same type checking we would get when calling the function directly. -
There is one more difference between calling a function directly and passing it to a thread: With the former, arguments may be passed by value, by reference or by using move semantics - depending on the signature of the function. When calling a function using a
variadic template
, the arguments are by default either moved or copied - depending on whether they are rvalues
or lvalues
. There are ways, however, to override this behavior. If we want to move an lvalue
, for example, we can call std::move
. In the following example, two threads are started, each with a different string as a parameter. With t1
, the string name1
is copied by value, which allows us to print name1
even after join has been called. The second string name2
is passed to the thread function using move semantics
, which means that it is not available any more after join
has been called on t2
. -
#include <iostream>
#include <thread>
#include <string>

void printName(std::string name, int waitTime)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(waitTime));
    std::cout << "Name (from Thread) = " << name << std::endl;
}

int main()
{
    std::string name1 = "MyThread1";
    std::string name2 = "MyThread2";

    // starting threads using value-copy and move semantics
    std::thread t1(printName, name1, 50);
    std::thread t2(printName, std::move(name2), 100);

    // wait for threads before returning
    t1.join();
    t2.join();

    // print name from main
    std::cout << "Name (from Main) = " << name1 << std::endl;
    std::cout << "Name (from Main) = " << name2 << std::endl;

    return 0;
}
-
The console output shows how using copy-by-value and
std::move
affect the string parameters: -
Name (from Thread) = MyThread1 Name (from Thread) = MyThread2 Name (from Main) = MyThread1 Name (from Main) =
-
In the following example, the signature of the thread function is modified to take a non-const reference to the string instead.
-
#include <iostream> #include <thread> #include <string> void printName(std::string &name, int waitTime) { std::this_thread::sleep_for(std::chrono::milliseconds(waitTime)); name += " (from Thread)"; std::cout << name << std::endl; } int main() { std::string name("MyThread"); // starting thread std::thread t(printName, std::ref(name), 50); // wait for thread before returning t.join(); // print name from main name += " (from Main)"; std::cout << name << std::endl; return 0; }
-
MyThread (from Thread) MyThread (from Thread) (from Main)
-
When passing the string variable name to the thread function, we need to explicitly mark it as a
reference
, so the compiler will treat it as such. This can be done by using thestd::ref
function. In the console output it becomes clear that the string has been successfully modified within the thread function before being passed to main. -
Even though the code works, we are now sharing mutable data between threads - which will be something we discuss in later sections of this course as a primary source for concurrency bugs.
-
-
Starting Threads with Member Functions
-
In the previous sections, you have seen how to start threads with functions and function objects, with and without additional arguments. Also, you now know how to pass arguments to a
thread
function by reference
. But what if we wish to run a member function other than the function call operator
, such as a member function of an existing object? Luckily, the C++ library can handle this use case: For calling member functions, the std::thread
constructor requires an additional argument for the object on which to invoke the member function. -
#include <iostream> #include <thread> class Vehicle { public: Vehicle() : _id(0) {} void addID(int id) { _id = id; } void printID() { std::cout << "Vehicle ID=" << _id << std::endl; } private: int _id; }; int main() { // create thread Vehicle v1, v2; std::thread t1 = std::thread(&Vehicle::addID, v1, 1); // call member function on object v std::thread t2 = std::thread(&Vehicle::addID, &v2, 2); // call member function on object v // wait for thread to finish t1.join(); t2.join(); // print Vehicle id v1.printID(); v2.printID(); return 0; }
-
In the example above, the
Vehicle
object v1
is passed to the thread function by value, thus a copy is made which does not affect the original
living in the main thread. Changes to its member variable _id
will thus not show when calling printID()
later in main
. The second Vehicle
object v2
is instead passed in by address (via the pointer &v2). Therefore, changes to its _id
variable will also be visible in the main
thread - hence the following console output: -
Vehicle ID=0 Vehicle ID=2
-
In the previous example, we have to ensure that the existence of
v2
outlives the completion of the threadt2
- otherwise there will be an attempt to access an invalidated memory address. An alternative is to use a heap-allocated object and a reference-counted pointer such asstd::shared_ptr<Vehicle>
to ensure that the object lives as long as it takes the thread to finish its work. The following example shows how this can be implemented: -
#include <iostream> #include <thread> class Vehicle { public: Vehicle() : _id(0) {} void addID(int id) { _id = id; } void printID() { std::cout << "Vehicle ID=" << _id << std::endl; } private: int _id; }; int main() { // create thread std::shared_ptr<Vehicle> v(new Vehicle); std::thread t = std::thread(&Vehicle::addID, v, 1); // call member function on object v // wait for thread to finish t.join(); // print Vehicle id v->printID(); return 0; }
-
Change the code from the previous example so that a new member variable
_name
of type std::string
is added to the Vehicle
class. Then, define a function setName
which takes a string as an argument and assigns it to _name
. The function setName
needs to be started as a thread from main
. Also, add a function printName
to the Vehicle class which is used at the end of main to print the name to the console. -
#include <iostream> #include <thread> class Vehicle { public: Vehicle() : _id(0) {} void addID(int id) { _id = id; } void setName(std::string name) { _name = name; } void printID() { std::cout << "Vehicle ID=" << _id << std::endl; } void printName() { std::cout << "Vehicle name=" << _name << std::endl; } private: int _id; std::string _name; }; int main() { // create thread 1 std::shared_ptr<Vehicle> v(new Vehicle); std::thread t1 = std::thread(&Vehicle::addID, v, 1); // create thread 2 std::thread t2 = std::thread(&Vehicle::setName, v, "MyVehicle"); // wait for thread to finish t1.join(); t2.join(); // print Vehicle id v->printID(); v->printName(); return 0; }
-
-
Running Multiple Threads
-
Fork-Join Parallelism
-
Using threads follows a basic concept called "fork-join-parallelism". The basic mechanism of this concept follows a simple three-step pattern:
- Split the flow of execution into a parallel thread ("fork")
- Perform some work in both the
main
thread and the parallel thread
- Wait for the parallel thread to finish and unite the split flow of execution again ("join")
-
The following diagram illustrates the basic idea of forking:
-
In the
main
thread, the program flow is forked into three parallel branches. In both worker branches, some work is performed - which is why threads
are often referred to as "worker threads"
. Once the work is completed, the flow of execution is united again in the main function using the join()
command. In this example, join acts as a barrier where all threads are united. The execution of main
is in fact halted until both worker threads have successfully completed their respective work. -
In the following example, a number of threads is created and added to a
vector
. The basic idea is to loop over the vector at the end of the main
function and call join
on all the thread objects inside the vector. -
#include <iostream> #include <thread> #include <vector> void printHello() { // perform work std::cout << "Hello from Worker thread #" << std::this_thread::get_id() << std::endl; } int main() { // create threads std::vector<std::thread> threads; for (size_t i = 0; i < 5; ++i) { // copying thread objects causes a compile error /* std::thread t(printHello); threads.push_back(t); */ // moving thread objects will work threads.emplace_back(std::thread(printHello)); } // do something in main() std::cout << "Hello from Main thread #" << std::this_thread::get_id() << std::endl; // call join on all thread objects using a range-based loop for (auto &t : threads) t.join(); return 0; }
-
When we try to compile the program using the
push_back()
function (which is the usual way in most cases), we get a compiler error. The problem with our code is that by pushing the thread object into the vector, we attempt to make a copy of it. However, thread objects do not have a copy constructor
and thus can not be duplicated. If this were possible, we would create yet another branch in the flow of execution - which is not what we want. The solution to this problem is to use move semantics
, which provide a convenient way for the contents of objects to be 'moved' between objects, rather than copied. It might be a good idea at this point to refresh your knowledge on move semantics
, on rvalues
and lvalues
as well as on rvalue references
, as we will make use of these concepts throughout the course. -
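As a side note (a sketch, not taken from the example code): since std::thread is movable, an equivalent alternative is to create a named thread object and explicitly move it into the vector:

std::thread t(printHello);
threads.push_back(std::move(t)); // transfers ownership of the thread instead of copying it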
To solve our problem, we can use the function
emplace_back()
instead ofpush_back()
, which internally usesmove semantics
to move our thread object into the vector without making a copy. When executing the code, we get the following output: -
Hello from Worker thread #Hello from Worker thread #140370329347840140370337740544 Hello from Worker thread #140370320955136 Hello from Worker thread #140370346133248 Hello from Main thread #140370363660096 Hello from Worker thread #140370312562432
-
This is surely not how we intended the console output to look. When we take a close look at the call to
std::cout
in the thread function, we can see that it actually consists of three parts: the string "Hello from worker…", the respective thread id
and finally the line break at the end. In the output, all three components are completely intermingled. Also, when the program is run several times, the output will look different with each execution. This shows us two important properties of concurrent programs:
- The order in which threads are executed is non-deterministic. Every time a program is executed, there is a chance for a completely different order of execution.
- Threads may get preempted in the middle of execution and another thread may be selected to run.
-
These two properties pose a major problem with concurrent applications: A program may run correctly thousands of times and suddenly, due to a particular interleaving of threads, there might be a problem. From a debugging perspective, such errors are very hard to detect as they can not be reproduced easily.
-
-
A First Concurrency Bug
-
Let us adjust the program code from the previous example and use a Lambda instead of the function
printHello()
. Also, we will pass the loop counter i
into the Lambda to enforce an individual wait time for each thread. The idea is to prevent the interleaving of text on the command line which we saw in the previous example. -
#include <iostream> #include <thread> #include <chrono> #include <random> #include <vector> int main() { // create threads std::vector<std::thread> threads; for (size_t i = 0; i < 10; ++i) { // create new thread from a Lambda threads.emplace_back([&i]() { // wait for certain amount of time std::this_thread::sleep_for(std::chrono::milliseconds(10 * i)); // perform work std::cout << "Hello from Worker thread #" << i << std::endl; }); } // do something in main() std::cout << "Hello from Main thread" << std::endl; // call join on all thread objects using a range-based loop for (auto &t : threads) t.join(); return 0; }
-
In order to ensure that each Lambda sees the intended value of the counter variable
i
, pass it to the Lambda
function by value and not by reference.
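For clarity, here is the corrected loop from the example; only the capture changes from [&i] to [i]:

for (size_t i = 0; i < 10; ++i)
{
    // capture i by value so that each thread gets its own copy of the counter
    threads.emplace_back([i]() {
        std::this_thread::sleep_for(std::chrono::milliseconds(10 * i));
        std::cout << "Hello from Worker thread #" << i << std::endl;
    });
}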
-
-
C3.2 : Promises and Futures
-
The promise - future communication channel
-
The methods for passing data to a thread we have discussed so far are both useful during
thread construction
: We can either pass arguments to the thread function using variadic templates
or we can use a Lambda
to capture arguments by value or by reference. The following example illustrates the use of these methods again: -
#include <iostream> #include <thread> void printMessage(std::string message) { std::this_thread::sleep_for(std::chrono::milliseconds(10)); // simulate work std::cout << "Thread 1: " << message << std::endl; } int main() { // define message std::string message = "My Message"; // start thread using variadic templates std::thread t1(printMessage, message); // start thread using a Lambda std::thread t2([message] { std::this_thread::sleep_for(std::chrono::milliseconds(10)); // simulate work std::cout << "Thread 2: " << message << std::endl; }); // thread barrier t1.join(); t2.join(); return 0; }
-
A drawback of these two approaches is that the information flows from the parent thread
(main)
to the worker threads (t1 and t2)
. In this section, we want to look at a way to pass data in the opposite direction - that is, from the worker threads back to the parent thread. -
In order to achieve this, the threads need to adhere to a strict
synchronization protocol
. There is such a mechanism available in the C++ standard that we can use for this purpose. This mechanism acts as a single-use
channel between the threads
. The sending end of the channel is called "promise"
while the receiving end is called "future"
. -
In the C++ standard, the class template
std::promise
provides a convenient way to store a value
or an exception
that will be acquired asynchronously
at a later time via a std::future
object. Each std::promise
object is meant to be used only a single time. -
In the following example, we want to declare a promise which allows for transmitting a string between two threads and modifying it in the process.
-
#include <iostream> #include <thread> #include <future> void modifyMessage(std::promise<std::string> && prms, std::string message) { std::this_thread::sleep_for(std::chrono::milliseconds(4000)); // simulate work std::string modifiedMessage = message + " has been modified"; prms.set_value(modifiedMessage); } int main() { // define message std::string messageToThread = "My Message"; // create promise and future std::promise<std::string> prms; std::future<std::string> ftr = prms.get_future(); // start thread and pass promise as argument std::thread t(modifyMessage, std::move(prms), messageToThread); // print original message to console std::cout << "Original message from main(): " << messageToThread << std::endl; // retrieve modified message via future and print to console std::string messageFromThread = ftr.get(); std::cout << "Modified message from thread(): " << messageFromThread << std::endl; // thread barrier t.join(); return 0; }
-
After defining a message, we have to create a suitable
promise
that can take a string
object. To obtain the corresponding future, we need to call the method get_future()
on the promise. Promise
and future
are the two ends of the communication channel we want to use to pass a string between threads. The communication channel set up in this manner can only pass a string. -
We can now create a thread that takes a function and we will pass it the
promise
as an argument as well as the message to be modified. Promises
can not be copied, because the promise-future
concept is a two-point communication channel for one-time use. Therefore, we must pass the promise
to the thread
function using std::move
. The thread will then, during its execution, use the promise to pass back the modified message. -
The thread function takes the
promise
as an rvalue
reference in accordance with move semantics
. After waiting for several seconds, the message is modified and the method set_value()
is called on the promise
. -
Back in the main thread, after starting the thread, the original message is printed to the console. Then, we start listening on the other end of the communication channel by calling the function
get()
on the future
. This method will block until data is available - which happens as soon as set_value
has been called on the promise
(from the thread). If the result is movable (which is the case for std::string
), it will be moved - otherwise it will be copied instead. After the data has been received (with a considerable delay), the modified message is printed to the console. -
Original message from main(): My Message Modified message from thread(): My Message has been modified
-
It is also possible that the worker thread calls
set_value
on the promise before get()
is called on the future
. In this case, get()
returns immediately without any delay. After get()
has been called once, the future
is no longer usable. This makes sense, as the normal mode of data exchange between promise
and future
works with std::move
- and in this case, the data is no longer available in the channel after the first call to get()
. If get()
is called a second time, an exception is thrown.
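As a small defensive sketch (an addition, not part of the original example): std::future provides the valid() member function, which reports whether the shared state may still be retrieved:

if (ftr.valid())
{
    std::string messageFromThread = ftr.get(); // consumes the shared state
}
// ftr.valid() is now false; as described above, another call to get() is not allowed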
-
-
get()
vs.wait()
-
There are some situations where it might be interesting to separate the waiting for the content from the actual retrieving. Futures allow us to do that using the wait() function. This method will block until the future is ready. Once it returns, it is guaranteed that data is available and we can use get() to retrieve it without delay.
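A minimal usage sketch, assuming the promise/future setup from the previous examples:

ftr.wait();                      // block until the result is ready
std::string message = ftr.get(); // guaranteed to return without further delay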
-
In addition to wait, the C++ standard also offers the method wait_for, which takes a time duration as an input and also waits for a result to become available. The method wait_for() will block either until the specified timeout duration has elapsed or the result becomes available - whichever comes first. The return value identifies the state of the result.
-
In the following example, please use the
wait_for
method to wait for the availability of a result for one second. After the time has passed (or the result is available) print the result to the console. Should the time be up without the result being available, print an error message to the console instead. -
#include <iostream> #include <thread> #include <future> #include <cmath> void computeSqrt(std::promise<double> &&prms, double input) { std::this_thread::sleep_for(std::chrono::milliseconds(2000)); // simulate work double output = sqrt(input); prms.set_value(output); } int main() { // define input data double inputData = 42.0; // create promise and future std::promise<double> prms; std::future<double> ftr = prms.get_future(); // start thread and pass promise as argument std::thread t(computeSqrt, std::move(prms), inputData); // Student task STARTS here // wait for result to become available auto status = ftr.wait_for(std::chrono::milliseconds(1000)); if (status == std::future_status::ready) // result is ready { std::cout << "Result = " << ftr.get() << std::endl; } // timeout has expired or function has not yet been started else if (status == std::future_status::timeout || status == std::future_status::deferred) { std::cout << "Result unavailable" << std::endl; } // Student task ENDS here // thread barrier t.join(); return 0; }
-
-
Passing exceptions
-
The
future-promise
communication channel may also be used for passing exceptions
. To do this, the worker thread simply sets an exception
rather than a value in the promise
. In the parent thread, the exception
is then re-thrown
once get()
is called on the future
. -
Let us take a look at the following example to see how this mechanism works:
-
#include <iostream> #include <thread> #include <future> #include <cmath> #include <memory> void divideByNumber(std::promise<double> &&prms, double num, double denom) { std::this_thread::sleep_for(std::chrono::milliseconds(500)); // simulate work try { if (denom == 0) throw std::runtime_error("Exception from thread: Division by zero!"); else prms.set_value(num / denom); } catch (...) { prms.set_exception(std::current_exception()); } } int main() { // create promise and future std::promise<double> prms; std::future<double> ftr = prms.get_future(); // start thread and pass promise as argument double num = 42.0, denom = 0.0; std::thread t(divideByNumber, std::move(prms), num, denom); // retrieve result within try-catch-block try { double result = ftr.get(); std::cout << "Result = " << result << std::endl; } catch (std::runtime_error e) { std::cout << e.what() << std::endl; } // thread barrier t.join(); return 0; }
-
In the thread function, we need to implement a
try-catch
block which can be set to catch a particular exception or - as in our case - to catch all exceptions
. Instead of setting a value, we now want to throw a std::exception
along with a customized error message. In the catch-block
, we catch this exception and pass it to the parent thread via the promise using set_exception
. The function std::current_exception
allows us to easily retrieve the exception which has been thrown. -
On the parent side, we now need to catch this exception. In order to do this, we can use a
try-block
around the call to get()
. We can set the catch-block
to catch all exceptions or - as in this example - we could also catch a particular one such as the standard exception
. Calling the method what()
on the exception allows us to retrieve the message from the exception - which is the one defined on the promise side of the communication channel. -
When we run the program, we can see that the exception is being thrown in the
worker thread
with themain thread
printing the corresponding error message to the console. -
So a
promise future
pair can be used to pass either values or exceptions between threads.
-
-
Threads vs. Tasks
-
Starting threads with async
-
In the last section we have seen how data can be passed from a
worker thread
to the parent thread using promises
and futures
. A disadvantage of the promise-future
approach, however, is that it is very cumbersome (and involves a lot of boilerplate code) to pass the promise
to the thread
function using an rvalue
reference and std::move
. For the straightforward task of returning data
or exceptions
from a worker thread
to the parent thread
, however, there is a simpler and more convenient way using std::async()
instead of std::thread()
. -
Let us adapt the code example from the last section to use
std::async
: -
#include <iostream> #include <thread> #include <future> #include <cmath> #include <memory> double divideByNumber(double num, double denom) { std::this_thread::sleep_for(std::chrono::milliseconds(500)); // simulate work if (denom == 0) throw std::runtime_error("Exception from thread: Division by zero!"); return num / denom; } int main() { // use async to start a task double num = 42.0, denom = 2.0; std::future<double> ftr = std::async(divideByNumber, num, denom); // retrieve result within try-catch-block try { double result = ftr.get(); std::cout << "Result = " << result << std::endl; } catch (std::runtime_error e) { std::cout << e.what() << std::endl; } return 0; }
-
The first change we are making is in the thread function: We are removing the
promise
from the argument list as well as the try-catch
block. Also, the return type of the function is changed from void
to double
, as the result of the computation will be channeled back to the main thread
using a simple return
. After these changes, the function has no knowledge of threads
, nor of futures
or promises
- it is a simple function that takes two doubles
as arguments and returns a double
as a result. Also, it will throw an exception when a division by zero is attempted. -
In the
main thread
, we need to replace the call to std::thread
with std::async
. Note that async
returns a future
, which we will use later in the code to retrieve the value that is returned by the function. A promise
, as used with std::thread
, is no longer needed, so the code becomes much shorter. In the try-catch
block, nothing has changed - we are still calling get()
on the future in the try-block
and exception handling happens unaltered in the catch-block. Also, we do not need to call join()
any more. With async
, the thread
destructor will be called automatically - which reduces the risk of a concurrency bug. -
When we execute the code in the previous example, the output is identical to before, so we seemingly have the same functionality as before - or do we? When we use the
std::this_thread::get_id()
to print the system thread ids of the main
and of the worker thread
, we get the following command line output: -
Main thread id = 0x10bf90e00 Worker thread id = 0x10bf90e00 Result = 21
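The thread ids shown above can be produced with a small modification of the example - a simplified sketch with error handling omitted:

#include <iostream>
#include <thread>
#include <future>

double divideByNumber(double num, double denom)
{
    std::cout << "Worker thread id = " << std::this_thread::get_id() << std::endl;
    return num / denom;
}

int main()
{
    std::cout << "Main thread id = " << std::this_thread::get_id() << std::endl;

    std::future<double> ftr = std::async(divideByNumber, 42.0, 2.0);
    std::cout << "Result = " << ftr.get() << std::endl;

    return 0;
}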
-
Depending on the system's decision, the
ids
between the two threads may differ
from each other - in that case the function runs in parallel in its own thread - or they may be identical, as in the output above, where the system chose to execute the function synchronously in the main thread. This is one of the major differences between std::thread
and std::async
: with the latter, the system decides whether the associated function should be run asynchronously
or synchronously
. By adjusting the launch parameters of std::async
manually, we can directly influence whether the associated thread function will be executed synchronously
or asynchronously
. -
std::future<double> ftr = std::async(std::launch::deferred, divideByNumber, num, denom);
-
enforces the
synchronous
execution of divideByNumber
, which results in output where the thread ids for main and worker thread are identical, as in the listing shown above. -
If we were to use the launch option
"async"
instead of "deferred"
, we would enforce an asynchronous
execution. Note that the C++ standard does not define a launch option "any"
; the default behavior of letting the system decide corresponds to passing both flags combined, i.e. std::launch::async | std::launch::deferred. -
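For reference, a sketch of the explicit spellings of the three cases, reusing divideByNumber, num and denom from the example above:

// run asynchronously in a new thread
auto ftr1 = std::async(std::launch::async, divideByNumber, num, denom);

// defer execution until get()/wait() is called (runs in the calling thread)
auto ftr2 = std::async(std::launch::deferred, divideByNumber, num, denom);

// let the system decide - this combination is the default policy
auto ftr3 = std::async(std::launch::async | std::launch::deferred, divideByNumber, num, denom);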
At this point, let us compare
std::thread
with std::async
: Internally, std::async
creates a promise
, gets a future
from it and runs a template function that takes the promise
, calls our function and then either sets the value
or the exception
of that promise
- depending on the function's behavior. The code used internally by std::async
is more or less identical to the code we used in the previous example, except that this time it has been generated by the compiler and is hidden from us - which means that the code we write appears much cleaner and leaner. Also, std::async
makes it possible to control the amount of concurrency by passing an optional launch parameter, which enforces either synchronous
or asynchronous
behavior. This ability, especially when left to the system, allows us to prevent an overload of threads
, which would eventually slow down the system, as threads consume resources for both management and communication. If we were to use too many threads, the increased resource consumption would outweigh the advantages of parallelism
and slow down the program. By leaving the decision to the system, we can ensure that the number of threads is chosen in a carefully balanced way that optimizes runtime performance by looking at the current workload and the multi-core architecture of the system.
-
-
Task-based concurrency
-
Determining the optimal number of
threads
to use is a hard problem. It usually depends on the number of available cores
whether it makes sense to execute code as a thread or in a sequential manner. The use of std::async (and thus tasks)
takes the burden of this decision away from the user and lets the system decide whether to execute the code sequentially or as a thread
. With tasks, the programmer decides what CAN be run in parallel in principle, and the system then decides at runtime what WILL be run in parallel. -
Internally, this is achieved by using
thread-pools
which represent the number of available threads based on the cores/processors
as well as by using work-stealing queues
, where tasks are re-distributed among the available processors dynamically. The following diagram shows the principle of task distribution on a multi-core system using work stealing queues
. -
As can be seen, the first core in the example is heavily oversubscribed with several tasks that are waiting to be executed. The other
cores
however are running idle
. The idea of a work-stealing queue
is to have a watchdog program
running in the background that regularly monitors the amount of work performed by each processor and redistributes it as needed. For the above example this would mean that tasks waiting for execution on the first core would be shifted (or "stolen") from the busy core and added to available free cores such that idle
time is reduced. After this rearranging procedure, the task distribution in our example could look as shown in the following diagram. -
A work distribution in this manner can only work when
parallelism
is explicitly described in the program by the programmer. If this is not the case, work-stealing
will not perform effectively. -
To conclude this section, a general comparison of
task-based
andthread-based
programming is given in the following: -
With
tasks
, the system takes care of many details (e.g. join)
. With threads
, the programmer is responsible for many details. As far as resources go, threads
are usually more heavy-weight
as they are generated by the operating system (OS)
. It takes time for the OS
to be called and to allocate memory / stack / kernel data structures for the thread. Also, destroying the thread
is expensive
. Tasks
on the other hand are more light-weight
as they will be using a pool of already created threads (the "thread pool"
). -
Threads
and tasks
are used for different problems. Threads
have more to do with latency
. When you have functions that can block (e.g. file input, server connection), threads can prevent the program from being blocked while, for example, the server is waiting for a response. Tasks
on the other hand focus on throughput
, where many operations are executed in parallel
.
-
-
Assessing the advantage of parallel execution
-
In this section, we want to explore the influence of the number of
threads
on the performance of a program with respect to its overall runtime. The example below has a thread
function called "workerFunction", which contains a loop with an adjustable number of cycles in which a mathematical operation is performed. -
#include <iostream>
#include <thread>
#include <future>
#include <cmath>
#include <vector>
#include <chrono>

void workerFunction(int n)
{
    // print system id of worker thread
    std::cout << "Worker thread id = " << std::this_thread::get_id() << std::endl;

    // perform work
    for (int i = 0; i < n; ++i)
    {
        sqrt(12345.6789);
    }
}

int main()
{
    // print system id of main thread
    std::cout << "Main thread id = " << std::this_thread::get_id() << std::endl;

    // start time measurement
    std::chrono::high_resolution_clock::time_point t1 = std::chrono::high_resolution_clock::now();

    // launch various tasks
    std::vector<std::future<void>> futures;
    int nLoops = 10, nThreads = 5;
    for (int i = 0; i < nThreads; ++i)
    {
        // default launch policy (note: the non-standard std::launch::any used in some course
        // notes does not exist); the experiments below replace this with std::launch::async
        // or std::launch::deferred explicitly
        futures.emplace_back(std::async(std::launch::async | std::launch::deferred, workerFunction, nLoops));
    }

    // wait for tasks to complete
    for (const std::future<void> &ftr : futures)
        ftr.wait();

    // stop time measurement and print execution time
    std::chrono::high_resolution_clock::time_point t2 = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
    std::cout << "Execution finished after " << duration << " microseconds" << std::endl;

    return 0;
}
-
In
main()
, a for-loop starts a configurable number of tasks that can either be executed synchronously
or asynchronously
. As an experiment, we will now use a number of different parameter settings to execute the program and evaluate the time it takes to finish the computations. The idea is to gauge the effect of the number of threads on the overall runtime: -
int nLoops = 1e7 , nThreads = 4 , std::launch::async
-
Main thread id = 0x115ae0e00 Worker thread id = Worker thread id = 0x70000e3c7000 Worker thread id = 0x70000e344000 0x70000e4cd000 Worker thread id = 0x70000e44a000 Execution finished after 21401 microseconds
-
With this set of parameters, the high workload is computed in parallel, with an overall runtime of
~21
milliseconds. -
int nLoops = 1e7 , nThreads = 5 , std::launch::deferred
-
Main thread id = 0x112aebe00 Worker thread id = 0x112aebe00 Worker thread id = 0x112aebe00 Worker thread id = 0x112aebe00 Worker thread id = 0x112aebe00 Execution finished after 78357 microseconds
-
The difference to the first set of parameters is the
synchronous
execution of the tasks - all computations are performed sequentially - with an overall runtime of ~78 milliseconds. The relative runtime advantage of setting 1 over this setting is a factor of roughly 3.7 on a 4-core machine - an impressive speed-up. -
int nLoops = 10 , nThreads = 5 , std::launch::async
-
Main thread id = 0x10edb1e00 Worker thread id = 0x7000001cf000 Worker thread id = 0x7000002d5000 Worker thread id = 0x700000252000 Worker thread id = 0x7000003db000Worker thread id = 0x700000358000 Execution finished after 287 microseconds
-
In this parameter setting, the tasks are run in
parallel
again but with a significantly lower number of computations: the thread function now computes only 10 square roots, whereas with settings 1 and 2 a total of 10,000,000 square roots were computed. The overall runtime of this example is therefore significantly lower, at only ~0.3 milliseconds. -
int nLoops = 10 , nThreads = 5 , std::launch::deferred
-
Main thread id = 0x116bbee00 Worker thread id = 0x116bbee00 Worker thread id = 0x116bbee00 Worker thread id = 0x116bbee00 Worker thread id = 0x116bbee00 Worker thread id = 0x116bbee00 Execution finished after 29 microseconds
-
In this last example, the same 10 square roots are computed sequentially. Surprisingly, the overall runtime is only ~0.03 milliseconds - an astounding difference to the asynchronous execution and a stark reminder that starting and managing threads takes a significant amount of time. Performing computations in parallel is therefore not an advantage per se: whether parallelization makes sense must be carefully weighed against the computational effort involved.
-
-
Avoiding Data Races
-
Understanding data races
-
One of the primary sources of error in concurrent programming is the data race. Data races occur when two concurrent
threads
are accessing the same memory location while at least one of them is modifying it (the other thread might be reading or modifying). In this scenario, the value at the memory location is completely undefined. Depending on the system scheduler, the second thread will be executed at an unknown point in time and thus see different data at the memory location with each execution. Depending on the type of program, the result might be anything from a crash to a security breach - for example, when data that was not meant to be read, such as a user password or other sensitive information, is read by another thread. Such an error is called a data race
because two threads are racing to get access to a memory location first, with the content at the memory location depending on the result of the race. -
The following diagram illustrates the principle: One
thread
wants to increment a variablex
, whereas the other thread wants to print the same variable. Depending on the timing of the program and thus the order of execution, the printed result might change each time the program is executed. -
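The following minimal sketch (not part of the course code) reproduces this situation: depending on which thread the scheduler runs first, the printed value may be 0 or 1 - and because an unsynchronized write and read of x happen concurrently, this is a data race. -
#include <iostream>
#include <thread>

int main()
{
    int x = 0; // shared variable without any protection

    // thread 1 increments x while thread 2 prints it - a data race,
    // since both access x concurrently and one of them writes
    std::thread t1([&x]() { ++x; });
    std::thread t2([&x]() { std::cout << "x = " << x << std::endl; });

    t1.join();
    t2.join();

    return 0;
}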
-
In this example, one safe way of passing data to a thread would be to carefully
synchronize
the two threads using eitherjoin()
or thepromise-future
concept that can guarantee the availability of a result. Data races are always to be avoided. Even if nothing bad seems to happen, they are a bug and should always be treated as such. Another possible solution for the above example would be to make a copy of the original argument and pass the copy to the thread, thereby preventing the data race.
-
-
Passing data to a thread by value
-
In the following example, an instance of the proprietary class
Vehicle
is created and passed to a thread by value, thus making a copy of it. -
#include <iostream> #include <thread> #include <future> class Vehicle { public: //default constructor Vehicle() : _id(0) { std::cout << "Vehicle #" << _id << " Default constructor called" << std::endl; } //initializing constructor Vehicle(int id) : _id(id) { std::cout << "Vehicle #" << _id << " Initializing constructor called" << std::endl; } // setter and getter void setID(int id) { _id = id; } int getID() { return _id; } private: int _id; }; int main() { // create instances of class Vehicle Vehicle v0; // default constructor Vehicle v1(1); // initializing constructor // read and write name in different threads (which one of the above creates a data race?) std::future<void> ftr = std::async([](Vehicle v) { std::this_thread::sleep_for(std::chrono::milliseconds(500)); // simulate work v.setID(2); }, v0); v0.setID(3); ftr.wait(); std::cout << "Vehicle #" << v0.getID() << std::endl; return 0; }
-
Note that the class
Vehicle
has a default constructor and an initializing constructor. In themain
function, when the instancesv0
andv1
are created, each constructor is called respectively. Note thatv0
is passed by value to aLambda
, which serves as the thread function forstd::async
. Within the Lambda, theid
of theVehicle
object is changed from the default (which is0
) to a new value2
. Note that the thread execution is paused for500
milliseconds to guarantee that the change is performed well after themain
thread has proceeded with its execution. -
In the
main
thread, immediately after starting up the worker thread, theid
ofv0
is changed to3
. Then, after waiting for the completion of thethread
, the vehicleid
is printed to the console. In this program, the output will always be the following: -
Vehicle #0 Default constructor called Vehicle #1 Initializing constructor called Vehicle #3
-
Passing data to a
thread
in this way is a clean and safe method as there is no danger of a data race - at least when atomic data types such as integers, doubles, chars or booleans are passed. -
When passing a
complex data structure
however, there are sometimes pointer variables hidden within, that point to a (potentially) shareddata buffer
- which might cause a data race even though the programmer believes that the copied data will effectively preempt this. The next example illustrates this case by adding a newmember variable
to theVehicle
class, which is a pointer to astring
object, as well as the correspondinggetter
andsetter
functions. -
#include <iostream> #include <thread> #include <future> class Vehicle { public: //default constructor Vehicle() : _id(0), _name(new std::string("Default Name")) { std::cout << "Vehicle #" << _id << " Default constructor called" << std::endl; } //initializing constructor Vehicle(int id, std::string name) : _id(id), _name(new std::string(name)) { std::cout << "Vehicle #" << _id << " Initializing constructor called" << std::endl; } // setter and getter void setID(int id) { _id = id; } int getID() { return _id; } void setName(std::string name) { *_name = name; } std::string getName() { return *_name; } private: int _id; std::string *_name; }; int main() { // create instances of class Vehicle Vehicle v0; // default constructor Vehicle v1(1, "Vehicle 1"); // initializing constructor // launch a thread that modifies the Vehicle name std::future<void> ftr = std::async([](Vehicle v) { // std::this_thread::sleep_for(std::chrono::milliseconds(500)); // simulate work v.setName("Vehicle 2"); },v0); v0.setName("Vehicle 3"); ftr.wait(); std::cout << v0.getName() << std::endl; return 0; }
-
The output of the program looks like this:
-
Vehicle #0 Default constructor called Vehicle #1 Initializing constructor called Vehicle 2
-
The basic program structure is mostly identical to the previous example with the object
v0
being copied by value when passed to thethread
function. This time however, even though a copy has been made, the original objectv0
is modified when the thread function sets the new name
. This happens because the member_name
is a pointer to astring
and after copying, even though the pointer variable has been duplicated, it still points to the same location as its value (i.e. thememory location
) has not changed. Note that when the delay is removed in the thread function, the console output varies between"Vehicle 2" and "Vehicle 3"
, depending on the system scheduler. Such an error might go unnoticed for a long time. It could show itself well after a program has been shipped to the client - which is what makes this error type so treacherous. -
Classes from the standard template library usually implement a
deep copy
behavior by default (such asstd::vector
). When dealing with proprietary data types, this is not guaranteed. The only safe way to tell whether a data structure can be safely passed is by looking at its implementation: Does it contain onlyatomic
data types or are therepointers
somewhere? If this is the case, does the data structure implement the copy constructor (and the assignment operator) correctly? Also, if the data structure under scrutiny contains sub-objects, their respective implementation has to be analyzed as well to ensure that deep copies are made everywhere. -
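As an illustration of what such an inspection looks for, here is a minimal sketch (the class name and members are made up for this example) of a type with a pointer member that implements both a deep-copying copy constructor and a deep-copying assignment operator. -
#include <string>

class DeepCopyExample
{
public:
    DeepCopyExample() : _data(new std::string("default")) {}

    // copy constructor: duplicate the pointed-to string (deep copy)
    DeepCopyExample(const DeepCopyExample &src)
        : _data(new std::string(*src._data)) {}

    // copy assignment operator: copy the contents, not the pointer
    DeepCopyExample &operator=(const DeepCopyExample &src)
    {
        if (this != &src)
            *_data = *src._data;
        return *this;
    }

    ~DeepCopyExample() { delete _data; }

private:
    std::string *_data;
};
-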
Unfortunately, one of the primary concepts of object-oriented programming -
information hiding
- often prevents us from looking at the implementation details of a class - we can only see the interface, which does not tell us what we need to know to make sure that an object of the class may be safely passed by value.
-
-
Overwriting the copy constructor
-
The problem with passing a proprietary class is that the standard
copy constructor
makes a1:1
copy of all data members, including pointers to objects. This behavior is also referred to as "shallow copy". In the above example we would have liked (and maybe expected) a "deep copy" of the object though, i.e. a copy of the data to which the pointer refers. A solution to this problem is to create a proprietarycopy constructor
in the classVehicle
. The following piece of code overwrites the defaultcopy constructor
and can be modified to make a customized copy of the data members. -
#include <iostream> #include <thread> #include <future> class Vehicle { public: //default constructor Vehicle() : _id(0), _name(new std::string("Default Name")) { std::cout << "Vehicle #" << _id << " Default constructor called" << std::endl; } //initializing constructor Vehicle(int id, std::string name) : _id(id), _name(new std::string(name)) { std::cout << "Vehicle #" << _id << " Initializing constructor called" << std::endl; } // copy constructor Vehicle(Vehicle const &src) { // QUIZ: Student code STARTS here _id = src._id; if (src._name != nullptr) { _name = new std::string; *_name = *src._name; } // QUIZ: Student code ENDS here std::cout << "Vehicle #" << _id << " copy constructor called" << std::endl; }; // setter and getter void setID(int id) { _id = id; } int getID() { return _id; } void setName(std::string name) { *_name = name; } std::string getName() { return *_name; } private: int _id; std::string *_name; }; int main() { // create instances of class Vehicle Vehicle v0; // default constructor Vehicle v1(1, "Vehicle 1"); // initializing constructor // launch a thread that modifies the Vehicle name std::future<void> ftr = std::async([](Vehicle v) { std::this_thread::sleep_for(std::chrono::milliseconds(500)); // simulate work v.setName("Vehicle 2"); },v0); v0.setName("Vehicle 3"); ftr.wait(); std::cout << v0.getName() << std::endl; return 0; }
-
Expanding on the code example from above, please implement the code required for a deep copy so that the program always prints
"Vehicle 3"
to the console, regardless of the delay within the thread function.
-
-
Passing data using move semantics
-
Even though a customized
copy constructor
can help us to avoid data races, it is also time (and memory) consuming. In the following, we will usemove semantics
to implement a more effective way of safely passing data to athread
. -
A
move constructor
enables the resources owned by anrvalue
object to be moved into anlvalue
without physically copying it.Rvalue
references support the implementation ofmove semantics
, which enables the programmer to write code that transfers resources (such as dynamically allocated memory) from one object to another. -
To make use of
move semantics
, we need to provide amove constructor
(and optionally amove assignment operator
).Copy and assignment operations
whose sources arervalues
automatically take advantage ofmove semantics
. Unlike the copy constructor, however, the compiler will not generate a default move constructor if the class declares a copy constructor, a copy assignment operator, or a destructor - as is the case here. -
To define a
move constructor
for a C++ class, the following steps are required: -
- Define an empty constructor method that takes an
rvalue
reference to the class type as its parameter
-
// move constructor Vehicle(Vehicle && src) { //... std::cout << "Vehicle #" << _id << " move constructor called" << std::endl; };
-
- In the
move constructor
, assign the class data members from the source object to the object that is being constructed
-
_id = src.getID(); _name = new std::string(src.getName());
-
- Assign the data members of the source object to default values.
-
src.setID(0); src.setName("Default Name");
-
When launching the thread, the Vehicle object
v0
can be passed usingstd::move()
- which calls the move constructor and invalidates the original objectv0
in the main thread. -
#include <iostream> #include <thread> #include <future> class Vehicle { public: //default constructor Vehicle() : _id(0), _name(new std::string("Default Name")) { std::cout << "Vehicle #" << _id << " Default constructor called" << std::endl; } //initializing constructor Vehicle(int id, std::string name) : _id(id), _name(new std::string(name)) { std::cout << "Vehicle #" << _id << " Initializing constructor called" << std::endl; } // copy constructor Vehicle(Vehicle const &src) { //... std::cout << "Vehicle #" << _id << " copy constructor called" << std::endl; }; // move constructor Vehicle(Vehicle && src) { _id = src.getID(); _name = new std::string(src.getName()); src.setID(0); src.setName("Default Name"); std::cout << "Vehicle #" << _id << " move constructor called" << std::endl; }; // setter and getter void setID(int id) { _id = id; } int getID() { return _id; } void setName(std::string name) { *_name = name; } std::string getName() { return *_name; } private: int _id; std::string *_name; }; int main() { // create instances of class Vehicle Vehicle v0; // default constructor Vehicle v1(1, "Vehicle 1"); // initializing constructor // launch a thread that modifies the Vehicle name std::future<void> ftr = std::async([](Vehicle v) { v.setName("Vehicle 2"); },std::move(v0)); ftr.wait(); std::cout << v0.getName() << std::endl; return 0; }
-
-
Move semantics and uniqueness
-
As with the above-mentioned
copy constructor
, passing by value is usually safe - provided that a deep copy is made of all the data structures within the object that is to be passed. Withmove semantics
, we can additionally use the notion of uniqueness to prevent data races by default. In the following example, aunique_pointer
instead of araw pointer
is used for the string member in theVehicle
class. -
#include <iostream> #include <thread> #include <future> #include <memory> class Vehicle { public: //default constructor Vehicle() : _id(0), _name(new std::string("Default Name")) { std::cout << "Vehicle #" << _id << " Default constructor called" << std::endl; } //initializing constructor Vehicle(int id, std::string name) : _id(id), _name(new std::string(name)) { std::cout << "Vehicle #" << _id << " Initializing constructor called" << std::endl; } // move constructor with unique pointer Vehicle(Vehicle && src) : _name(std::move(src._name)) { // move id to this and reset id in source _id = src.getID(); src.setID(0); std::cout << "Vehicle #" << _id << " move constructor called" << std::endl; }; // setter and getter void setID(int id) { _id = id; } int getID() { return _id; } void setName(std::string name) { *_name = name; } std::string getName() { return *_name; } private: int _id; std::unique_ptr<std::string> _name; }; int main() { // create instances of class Vehicle Vehicle v0; // default constructor Vehicle v1(1, "Vehicle 1"); // initializing constructor // launch a thread that modifies the Vehicle name std::future<void> ftr = std::async([](Vehicle v) { v.setName("Vehicle 2"); },std::move(v0)); ftr.wait(); std::cout << v0.getName() << std::endl; // this will now cause an exception return 0; }
-
As can be seen, the
std::string
has now been changed to aunique pointer
, which means that only a single reference to the memory location it points to is allowed. Accordingly, themove constructor
transfers theunique pointer
to the worker by usingstd::move
and thus invalidates the pointer in the main thread. When callingv0.getName()
, the program will most likely crash or throw, since dereferencing the now-empty unique pointer is not permissible - making it clear to the programmer that accessing the data at this point is not allowed, which is the whole point of using a unique pointer
here as a data race will now be effectively prevented. -
The point of this example has been to illustrate that
move semantics
on its own is not enough to avoid data races. The key to thread safety is to usemove semantics
in conjunction withuniqueness
. It is the responsibility of the programmer to ensure that pointers to objects that are moved between threads are unique.
-
-
Mutexes and Locks
-
Using a Mutex To Protect Shared Data
-
The mutex entity
-
Until now, the methods we have used to pass data between threads were short-term and involved passing an argument
(the promise)
from a parentthread
to aworker
thread and then passing a result back to the parent thread (via thefuture
) once it has become available. Thepromise-future
construct is a non-permanent communication channel for one-time usage. -
We have seen that in order to avoid data races, we need to either forego accessing
shared data
or use it inread-only
access without mutating the data. In this chapter, we want to look at a way to establish a stablelong-term
communication channel that allows for bothsharing
andmutation
. Ideally, we would like to have a communication protocol that corresponds to voice communication over a radio channel, where the transmitter uses the expression"over"
to indicate the end of the transmission to the receiver. By using such a protocol, sender and receiver can take turns in transmitting their data. In C++, this concept of taking turns can be constructed by an entity called a"mutex"
- which stands forMUtual EXclusion
. -
Recall that a
data race
requires simultaneous access fromtwo threads
. If we can guarantee that only asingle thread
at a time can access a particular memory location,data races
would not occur. In order for this to work, we would need to establish a communication protocol. It is important to note that amutex
is not the solution to the data race problem per se but merely an enabler for athread-safe
communication protocol that has to be implemented and adhered to by the programmer. -
Let us take a look at how this protocol works: Assuming we have a piece of memory (e.g. a
shared variable
) that we want to protect from simultaneous access, we can assign amutex
to be the guardian of this particular memory. It is important to understand that amutex
is bound tothe memory it protects
. A thread 1 that wants to access the protected memory must "lock" the mutex first. Once thread 1 is "under the lock", a thread 2 is blocked from accessing the shared variable: it can not acquire the lock on the mutex and is temporarily suspended by the system. -
Once the reading or writing operation of
thread 1
is complete, it must"unlock"
themutex
so thatthread 2
can access the memory location. Often, the code which is executed"under the lock"
is referred to as a"critical section"
. It is important to note that even read-only access to the shared memory has to lock the mutex to prevent a data race - which would happen if another thread, which might be under the lock at that time, were to modify the data. -
When several
threads
were to try to acquire andlock
themutex
, only one of them would be successful. All otherthreads
would automatically be put on hold - just as cars waiting at an intersection for a green light (see the final project of this course). Once thethread
who has succeeded in acquiring thelock
had finished its job andunlocked
themutex
, a queuedthread
waiting for access would be woken up and allowed tolock
themutex
to proceed with its read / write
operation. If allthreads
were to follow this protocol, a data race would effectively be avoided. Before we take a closer look at such a protocol, let us analyze a code example next. -
#include <iostream> #include <thread> #include <vector> #include <future> #include<algorithm> class Vehicle { public: Vehicle(int id) : _id(id) {} private: int _id; }; class WaitingVehicles { public: WaitingVehicles() : _tmpVehicles(0) {} // getters / setters void printSize() { std::cout << "#vehicles = " << _tmpVehicles << std::endl; } // typical behaviour methods void pushBack(Vehicle &&v) { //_vehicles.push_back(std::move(v)); // data race would cause an exception int oldNum = _tmpVehicles; std::this_thread::sleep_for(std::chrono::microseconds(1)); // wait deliberately to expose the data race _tmpVehicles = oldNum + 1; } private: std::vector<Vehicle> _vehicles; // list of all vehicles waiting to enter this intersection int _tmpVehicles; }; int main() { std::shared_ptr<WaitingVehicles> queue(new WaitingVehicles); std::vector<std::future<void>> futures; for (int i = 0; i < 1000; ++i) { Vehicle v(i); futures.emplace_back(std::async(std::launch::async, &WaitingVehicles::pushBack, queue, std::move(v))); } std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); queue->printSize(); return 0; }
-
This code builds on some of the classes we have seen in the previous lesson project - the concurrent traffic simulation. There is a class
Vehicle
that has a single data member (int_id
). Also, there is a classWaitingVehicles
, which is supposed to store a number of vehicles in an internal vector. Note that contrary to the lesson project, a vehicle is moved into the vector using anrvalue
reference. Also note that thepush_back
function is commented out here. The reason for this is that we are trying to provoke a data race - leavingpush_back
active would cause the program to crash (we will comment it in later). This is also the reason why there is an auxiliary member_tmpVehicles
, which will be used to count the number ofVehicles
added via calls topushBack()
. This temporary variable will help us expose the data race without crashing the program. -
In
main()
, afor-loop
is used to launch a large number of tasks that all try to add a newly created Vehicle
to the queue. Running the programsynchronously
with launch optionstd::launch::deferred
generates the following output on the console: -
#vehicles = 1000
-
Just as one would have expected, each task inserted an element into the queue with the total number of vehicles amounting to
1000
. -
Now let us enforce a concurrent behavior and change the launch option to
std::launch::async
. This generates the following output (with different results each time): one execution gives #vehicles = 191, while another gives #vehicles = 179
-
It seems that not all the vehicles could be added to the queue. But why is that? Note that in the thread function
"pushBack"
there is a call tosleep_for
, which pauses thethread
execution for a short time. This is the position where the data race occurs: First, the current value of_tmpVehicles
is stored in a temporary variableoldNum
. While the thread is paused, there might (and will) be changes to_tmpVehicles
performed by otherthreads
. When the execution resumes, the stale value oldNum, incremented by one, is written back to _tmpVehicles, thus discarding the contributions of all the threads that wrote to it in the meantime. Interestingly, when sleep_for
is commented out, the output of the program is the same as withstd::launch::deferred
- at least that will be the case for most of the time when we run the program. But once in a while, there might be ascheduling constellation
which causes the bug to expose itself. Apart from understanding the data race, you should take away the advice that introducing deliberate time delays in the testing / debugging phase of development can help expose many concurrency bugs.
-
-
Using mutex to protect data
-
In its simplest form, using a mutex consists of four straight-forward steps:
- Include the <mutex> header
- Create an std::mutex
- Lock the mutex using lock() before the read/write is called
- Unlock the mutex after the read/write operation is finished using unlock()
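As a minimal illustrative sketch of these four steps (the full class-based example follows further below), protecting a simple shared counter could look like this: -
#include <iostream>
#include <thread>
#include <mutex> // step 1: include the <mutex> header

std::mutex mtx;  // step 2: create an std::mutex
int counter = 0; // shared data that needs protection

void increment()
{
    mtx.lock();   // step 3: lock the mutex before the write
    ++counter;
    mtx.unlock(); // step 4: unlock the mutex after the write
}

int main()
{
    std::thread t1(increment);
    std::thread t2(increment);
    t1.join();
    t2.join();

    std::cout << "counter = " << counter << std::endl; // always prints 2

    return 0;
}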
-
In order to protect
_vehicles
from being manipulated by severalthreads
at once, amutex
has been added to the class as aprivate data member
. In thepushBack
function, themutex
is locked before a new element is added to the vector and unlocked after the operation is complete. -
Note that the
mutex
is also locked in the functionprintSize
just before printing the size of thevector
. The reason for this lock is two-fold: First, we want to prevent adata race
that would occur when aread-access
to the vector and a simultaneouswrite access
(even when under the lock) would occur. And second, we want to exclusively reserve thestandard output
to the console for printing the vector size without otherthreads
printing to it at the same time. -
When this code is executed,
1000
elements will be in the vector. By using a mutex to protect access to our shared resource, a data race has been effectively avoided. -
#include <iostream> #include <thread> #include <vector> #include <future> #include <mutex> #include<algorithm> class Vehicle { public: Vehicle(int id) : _id(id) {} private: int _id; }; class WaitingVehicles { public: WaitingVehicles() {} // getters / setters void printSize() { _mutex.lock(); std::cout << "#vehicles = " << _vehicles.size() << std::endl; _mutex.unlock(); } // typical behaviour methods void pushBack(Vehicle &&v) { _mutex.lock(); _vehicles.emplace_back(std::move(v)); // data race would cause an exception _mutex.unlock(); } private: std::vector<Vehicle> _vehicles; // list of all vehicles waiting to enter this intersection std::mutex _mutex; }; int main() { std::shared_ptr<WaitingVehicles> queue(new WaitingVehicles); std::vector<std::future<void>> futures; for (int i = 0; i < 1000; ++i) { Vehicle v(i); futures.emplace_back(std::async(std::launch::async, &WaitingVehicles::pushBack, queue, std::move(v))); } std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); queue->printSize(); return 0; }
-
-
Using timed_mutex
-
In the following, a short overview of the different available mutex types is given:
- mutex: provides the core functions lock() and unlock() and the non-blocking try_lock() method that returns immediately if the mutex is not available.
- recursive_mutex: allows multiple acquisitions of the mutex from the same thread.
- timed_mutex: similar to mutex, but it comes with two additional methods, try_lock_for() and try_lock_until(), which try to acquire the mutex for a period of time or until a moment in time is reached.
- recursive_timed_mutex: is a combination of timed_mutex and recursive_mutex.
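As a small illustrative sketch (not part of the exercise below), the non-blocking try_lock() lets a thread do something else instead of waiting when the mutex is already held: -
#include <iostream>
#include <thread>
#include <mutex>
#include <chrono>

std::mutex mtx;

void tryToWork(int id)
{
    if (mtx.try_lock()) // returns immediately: true if the lock was acquired
    {
        std::cout << "Thread " << id << " acquired the lock" << std::endl;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        mtx.unlock();
    }
    else // the mutex was busy - do something else instead of blocking
    {
        std::cout << "Thread " << id << " could not acquire the lock" << std::endl;
    }
}

int main()
{
    std::thread t1(tryToWork, 1);
    std::thread t2(tryToWork, 2);
    t1.join();
    t2.join();

    return 0;
}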
-
Please adapt the code from the previous example (example_2.cpp) in a way that a
timed_mutex
is used. Also, in the functionpushBack
, please use the method try_lock_for instead of lock, which should be executed until a maximum number of attempts is reached (e.g. 3 times) or until it succeeds. When an attempt fails, you should print an error message to the console that also contains the respective vehicle id and then put the thread to sleep for an amount of time before the next attempt is tried. Also, to expose the timing issues in this example, please introduce a call to sleep_for with a delay of several milliseconds before releasing the lock on the mutex. When done, experiment with the timing parameters to see how many vehicles will be added to the vector in the end. -
#include <iostream> #include <thread> #include <vector> #include <future> #include <mutex> class Vehicle { public: Vehicle(int id) : _id(id) {} int getID() { return _id; } private: int _id; }; class WaitingVehicles { public: WaitingVehicles() {} // getters / setters void printSize() { _mutex.lock(); std::cout << "#vehicles = " << _vehicles.size() << std::endl; _mutex.unlock(); } // typical behaviour methods void pushBack(Vehicle &&v) { for (size_t i = 0; i < 3; ++i) { if (_mutex.try_lock_for(std::chrono::milliseconds(100))) { _vehicles.emplace_back(std::move(v)); //std::this_thread::sleep_for(std::chrono::milliseconds(10)); _mutex.unlock(); break; } else { std::cout << "Error! Vehicle #" << v.getID() << " could not be added to the vector" << std::endl; std::this_thread::sleep_for(std::chrono::milliseconds(100)); } } } private: std::vector<Vehicle> _vehicles; // list of all vehicles waiting to enter this intersection std::timed_mutex _mutex; }; int main() { std::shared_ptr<WaitingVehicles> queue(new WaitingVehicles); std::vector<std::future<void>> futures; for (int i = 0; i < 1000; ++i) { Vehicle v(i); futures.emplace_back(std::async(std::launch::async, &WaitingVehicles::pushBack, queue, std::move(v))); } std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); queue->printSize(); return 0; }
-
-
Deadlock 1
-
Using
mutexes
can significantly reduce the risk of data races as seen in the example above. But imagine what would happen if an exception was thrown while executing code in the critical section, i.e. betweenlock
andunlock
. In such a case, themutex
would remainlocked
indefinitely and no otherthread
could unlock it - the program would most likely freeze. -
Let us take a look at the following code example, which performs a division of numbers:
-
#include <iostream> #include <thread> #include <vector> #include <future> #include<algorithm> double result; void printResult(int denom) { std::cout << "for denom = " << denom << ", the result is " << result << std::endl; } void divideByNumber(double num, double denom) { try { // divide num by denom but throw an exception if division by zero is attempted if (denom != 0) { result = num / denom; std::this_thread::sleep_for(std::chrono::milliseconds(1)); printResult(denom); } else { throw std::invalid_argument("Exception from thread: Division by zero!"); } } catch (const std::invalid_argument &e) { // notify the user about the exception and return std::cout << e.what() << std::endl; return; } } int main() { // create a number of threads which execute the function "divideByNumber" with varying parameters std::vector<std::future<void>> futures; for (double i = -5; i <= +5; ++i) { futures.emplace_back(std::async(std::launch::async, divideByNumber, 50.0, i)); } // wait for the results std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); return 0; }
-
In this example, a number of tasks are started up in
main()
with the methoddivideByNumber
as thethread
function. Eachtask
is given a different denominator and withindivideByNumber
a check is performed to avoid a division by zero. Ifdenom
should be zero, anexception is thrown
. In thecatch-block
, the exception is caught, printed to the console and then the function returns immediately. The output of the program changes with each execution and might look like this: -
for denom = -3, the result is -25 for denom = -2, the result is -50 for denom = -5, the result is -50 for denom = -4, the result is 50 Exception from thread: Division by zero! for denom = -1, the result is 16.6667 for denom = 3, the result is 12.5 for denom = 1, the result is 10 for denom = 2, the result is 10 for denom = 4, the result is 10 for denom = 5, the result is 10
-
As can easily be seen, the console output is totally mixed up and some results appear multiple times. There are several issues with this program, so let us look at them in turn:
- First, the thread function writes its result to a
global variable
that is shared by all threads
. This will cause a data race as illustrated in the last section. Thesleep_for
function exposes the data race clearly. - Second, the result is printed to the console by several
threads
at the same time, causing the chaotic output.
-
As we have seen already, using a
mutex
can protect shared resources. So please modify the code in a way that both the console as well as the sharedglobal
variable result are properly protected. -
The problem you have just seen is one type of deadlock, which causes a program to freeze because one thread does not release the lock on the mutex while all other threads are waiting for access indefinitely. Let us now look at another type.
-
#include <iostream> #include <thread> #include <vector> #include <future> #include <mutex> std::mutex mtx; double result; void printResult(int denom) { std::cout << "for denom = " << denom << ", the result is " << result << std::endl; } void divideByNumber(double num, double denom) { mtx.lock(); try { // divide num by denom but throw an exception if division by zero is attempted if (denom != 0) { result = num / denom; std::this_thread::sleep_for(std::chrono::milliseconds(1)); printResult(denom); } else { throw std::invalid_argument("Exception from thread: Division by zero!"); } } catch (const std::invalid_argument &e) { // notify the user about the exception and return std::cout << e.what() << std::endl; return; // deadlock } mtx.unlock(); } int main() { // create a number of threads which execute the function "divideByNumber" with varying parameters std::vector<std::future<void>> futures; for (double i = -5; i <= +5; ++i) { futures.emplace_back(std::async(std::launch::async, divideByNumber, 50.0, i)); } // wait for the results std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); return 0; }
-
-
Deadlock 2
-
A second type of
deadlock
is a state in which two or more threads are blocked because each thread waits for the resource of the otherthread
to be released before releasing its resource. The result of thedeadlock
is a complete standstill. Thethread
and therefore usually the whole program is blocked forever. The following code illustrates the problem: -
#include <iostream> #include <thread> #include <mutex> std::mutex mutex1, mutex2; void ThreadA() { // Creates deadlock problem mutex2.lock(); std::cout << "Thread A" << std::endl; mutex1.lock(); mutex2.unlock(); mutex1.unlock(); } void ThreadB() { // Creates deadlock problem mutex1.lock(); std::cout << "Thread B" << std::endl; mutex2.lock(); mutex1.unlock(); mutex2.unlock(); } void ExecuteThreads() { std::thread t1( ThreadA ); std::thread t2( ThreadB ); t1.join(); t2.join(); std::cout << "Finished" << std::endl; } int main() { ExecuteThreads(); return 0; }
-
When the program is executed, it produces the following output:
-
Thread AThread B
-
Notice that it does not print the
"Finished"
statement nor does it return - the program is in adeadlock
, which it can never leave. -
Let us take a closer look at this problem:
-
ThreadA
andThreadB
both require access to the console. Unfortunately, they request this resource which is protected by twomutexes
in a different order. If the two threads are interleaved in such a way that first ThreadA locks mutex2 and then ThreadB locks mutex1
, the program is in adeadlock
: Eachthread
tries to lock the othermutex
and needs to wait for its release, which never comes. The following figure illustrates the problem graphically. -
One way to avoid such a
deadlock
would be to number all resources and require that processes request resources only in strictly increasing (or decreasing) order. Please try to manually rearrange the locks and unlocks so that the deadlock does not occur and the program prints "Thread A", "Thread B" and "Finished" to the console. One possible solution is shown below: -
#include <iostream> #include <thread> #include <mutex> std::mutex mutex1, mutex2; void ThreadA() { // Solves deadlock problem mutex1.lock(); std::cout << "Thread A" << std::endl; mutex2.lock(); mutex2.unlock(); mutex1.unlock(); } void ThreadB() { // Solves deadlock problem mutex1.lock(); std::cout << "Thread B" << std::endl; mutex2.lock(); mutex1.unlock(); mutex2.unlock(); } void ExecuteThreads() { std::thread t1( ThreadA ); std::thread t2( ThreadB ); t1.join(); t2.join(); std::cout << "Finished" << std::endl; } int main() { ExecuteThreads(); return 0; }
-
As you have seen, avoiding such a deadlock is possible but requires time and a great deal of experience. In the next section, we will look at ways to avoid deadlocks - both of this type as well as the previous type, where a call to unlock the mutex had not been issued.
-
-
Using Locks to Avoid Deadlocks
-
Lock Guard
-
In the previous example, we have directly called the
lock()
andunlock()
functions of amutex
. The idea of "working under the lock" is to block unwanted access by otherthreads
to the same resource. Only the thread which acquired thelock
canunlock
themutex
and give all remainingthreads
the chance to acquire thelock
. In practice however, direct calls tolock()
should be avoided at all cost! Imagine that while working under thelock
, athread
would throw anexception
and exit the critical section without calling theunlock
function on themutex
. In such a situation, the program would most likely freeze as no otherthread
could acquire themutex
any more. This is exactly what we have seen in the functiondivideByNumber
from the previous example. -
We can avoid this problem by creating a
std::lock_guard
object, which keeps an associatedmutex
locked during the entire object life time. The lock is acquired onconstruction
and released automatically ondestruction
. This makes it impossible to forgetunlocking
a critical section. Also,std::lock_guard
guaranteesexception
safety because any critical section is automaticallyunlocked
when an exception is thrown. In our previous example, we can simply replace_mutex.lock()
and_mutex.unlock()
with the following code: -
#include <iostream> #include <thread> #include <vector> #include <future> #include <mutex> #include<algorithm> std::mutex mtx; double result; void printResult(int denom) { std::cout << "for denom = " << denom << ", the result is " << result << std::endl; } void divideByNumber(double num, double denom) { try { // divide num by denom but throw an exception if division by zero is attempted if (denom != 0) { std::lock_guard<std::mutex> lck(mtx); result = num / denom; std::this_thread::sleep_for(std::chrono::milliseconds(1)); printResult(denom); } else { throw std::invalid_argument("Exception from thread: Division by zero!"); } } catch (const std::invalid_argument &e) { // notify the user about the exception and return std::cout << e.what() << std::endl; return; } } int main() { // create a number of threads which execute the function "divideByNumber" with varying parameters std::vector<std::future<void>> futures; for (double i = -5; i <= +5; ++i) { futures.emplace_back(std::async(std::launch::async, divideByNumber, 50.0, i)); } // wait for the results std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); return 0; }
-
Note that there is no direct call to
lock
orunlock
themutex
anymore. We now have astd::lock_guard
object that takes the mutex as an argument andlocks
it at creation. When the methoddivideByNumber
exits, themutex
is automaticallyunlocked
by thestd::lock_guard
object as soon as it is destroyed - which happens when the local variable goes out of scope. -
We can improve even further on this code by limiting the scope of the
mutex
to the section which accesses the critical resource. Please change the code in a way that themutex
is only locked for the time when result is modified and the result is printed. -
#include <iostream> #include <thread> #include <vector> #include <future> #include <mutex> std::mutex mtx; double result; void printResult(int denom) { std::cout << "for denom = " << denom << ", the result is " << result << std::endl; } void divideByNumber(double num, double denom) { try { // divide num by denom but throw an exception if division by zero is attempted if (denom != 0) { std::lock_guard<std::mutex> lck(mtx); result = num / denom; std::this_thread::sleep_for(std::chrono::milliseconds(1)); printResult(denom); } else { throw std::invalid_argument("Exception from thread: Division by zero!"); } } catch (const std::invalid_argument &e) { // notify the user about the exception and return std::cout << e.what() << std::endl; return; } } int main() { // create a number of threads which execute the function "divideByNumber" with varying parameters std::vector<std::future<void>> futures; for (double i = -5; i <= +5; ++i) { futures.emplace_back(std::async(std::launch::async, divideByNumber, 50.0, i)); } // wait for the results std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); return 0; }
-
-
Unique Lock
-
The problem with the previous example is that we can only
lock
themutex
once and the only way to controllock
andunlock
is by invalidating the scope of thestd::lock_guard
object. But what if we wanted (or needed) a finer control of the locking mechanism? -
A more flexible alternative to
std::lock_guard
is unique_lock, which also provides support for more advanced mechanisms, such as deferred locking
,time locking
,recursive locking
,transfer of lock
ownership and the use of condition variables (which we will discuss later). It behaves similarly to lock_guard
but provides much more flexibility, especially with regard to the timing behavior of the locking mechanism. -
Let us take a look at an adapted version of the code from the previous section above:
-
#include <iostream> #include <thread> #include <vector> #include <future> #include <mutex> #include<algorithm> std::mutex mtx; double result; void printResult(int denom) { std::cout << "for denom = " << denom << ", the result is " << result << std::endl; } void divideByNumber(double num, double denom) { std::unique_lock<std::mutex> lck(mtx); try { // divide num by denom but throw an exception if division by zero is attempted if (denom != 0) { result = num / denom; std::this_thread::sleep_for(std::chrono::milliseconds(100)); printResult(denom); lck.unlock(); // do something outside of the lock std::this_thread::sleep_for(std::chrono::milliseconds(100)); lck.lock(); // do someting else under the lock std::this_thread::sleep_for(std::chrono::milliseconds(100)); } else { throw std::invalid_argument("Exception from thread: Division by zero!"); } } catch (const std::invalid_argument &e) { // notify the user about the exception and return std::cout << e.what() << std::endl; return; } } int main() { // create a number of threads which execute the function "divideByNumber" with varying parameters std::vector<std::future<void>> futures; for (double i = -5; i <= +5; ++i) { futures.emplace_back(std::async(std::launch::async, divideByNumber, 50.0, i)); } // wait for the results std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); return 0; }
-
In this version of the code,
std::lock_guard
has been replaced withstd::unique_lock
. As before, thelock
objectlck
willunlock
themutex
in itsdestructor
, i.e. when the functiondivideByNumber
returns andlck
gets out of scope. In addition to this automatic unlocking,std::unique_lock
offers the additional flexibility toengage
anddisengage
thelock
as needed by manually calling the methodslock()
andunlock()
. This ability can greatly improve the performance of a concurrent program, especially when manythreads
are waiting for access to alocked
resource. In the example, thelock
is released before some non-critical work is performed (simulated bysleep_for
) and re-engaged before some other work is performed in the critical section and thus under thelock
again at the end of the function. This is particularly useful for optimizing performance and responsiveness when a significant amount of time passes between two accesses to a critical resource. -
The main advantages of using
std::unique_lock<>
overstd::lock_guard
are briefly summarized in the following. Usingstd::unique_lock
allows you to…- …construct an instance without an associated
mutex
using the default constructor - …construct an instance with an associated
mutex
while leaving themutex
unlocked
at first using thedeferred-locking constructor
- …construct an instance that tries to
lock
amutex
, but leaves itunlocked
if thelock
failed using thetry-lock
constructor - …construct an instance that tries to acquire a
lock
for either a specified time period or until a specified point in time
- …construct an instance without an associated
-
Despite the advantages of
std::unique_lock<>
andstd::lock_guard
over accessing themutex
directly, however, thedeadlock
situation where two mutexes are accessed simultaneously (see the last section) will still occur.
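To make the constructor variants listed above more concrete, here is a brief sketch (illustrative only, not from the lesson code) showing deferred locking, try-to-lock and timed locking: -
#include <iostream>
#include <mutex>
#include <chrono>

std::mutex m1, m2;
std::timed_mutex m3;

int main()
{
    // deferred locking: associate the mutex but do not lock it yet
    std::unique_lock<std::mutex> lck1(m1, std::defer_lock);
    lck1.lock(); // lock manually at a later point

    // try-to-lock: attempt to acquire the lock without blocking
    std::unique_lock<std::mutex> lck2(m2, std::try_to_lock);
    std::cout << "try-to-lock succeeded: " << lck2.owns_lock() << std::endl;

    // timed locking: give up after a maximum waiting time (requires a timed mutex)
    std::unique_lock<std::timed_mutex> lck3(m3, std::chrono::milliseconds(100));
    std::cout << "timed lock succeeded: " << lck3.owns_lock() << std::endl;

    return 0;
}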
-
-
Avoiding deadlocks with
std::lock()
-
In most cases, your code should only hold one lock on a
mutex
at a time. Occasionally you can nest yourlocks
, for example by calling a subsystem that protects its internal data with amutex
while holding a lock on anothermutex
, but it is generally better to avoidlocks
on multiplemutexes
at the same time, if possible. Sometimes, however, it is necessary to hold alock
on more than onemutex
because you need to perform an operation on two different data elements, each protected by its own mutex. -
In the last section, we have seen that using several
mutexes
at once can lead to adeadlock
, if the order of locking them is not carefully managed. To avoid this problem, the system must be told that bothmutexes
should be locked at the same time, so that one of thethreads
takes over bothlocks
and blocking is avoided. That's what thestd::lock()
function is for - you pass it the mutexes to be locked (or deferred unique_lock objects) and it ensures that they are all locked when the function returns, using a deadlock-avoidance algorithm. -
The following example is a version of the code we saw in the last section, where the direct calls to lock() and unlock() have been replaced with std::lock_guard objects. -
#include <iostream> #include <thread> #include <mutex> std::mutex mutex1, mutex2; void ThreadA() { // Creates deadlock problem std::lock_guard<std::mutex> lock2(mutex2); std::cout << "Thread A" << std::endl; std::lock_guard<std::mutex> lock1(mutex1); } void ThreadB() { // Creates deadlock problem std::lock_guard<std::mutex> lock1(mutex1); std::cout << "Thread B" << std::endl; std::lock_guard<std::mutex> lock2(mutex2); } void ExecuteThreads() { std::thread t1( ThreadA ); std::thread t2( ThreadB ); t1.join(); t2.join(); std::cout << "Finished" << std::endl; } int main() { ExecuteThreads(); return 0; }
-
Note that when executing this code, it still produces a deadlock, despite the use of
std::lock_guard
. -
In the following
deadlock-free
code,std::lock
is used to ensure that themutexes
are always locked in a deadlock-free manner, regardless of the order in which the arguments are passed. Note that the std::adopt_lock option allows us to use std::lock_guard
on an already lockedmutex
. -
#include <iostream> #include <thread> #include <mutex> std::mutex mutex1, mutex2; void ThreadA() { // Ensure that locks are always executed in the same order std::lock(mutex1, mutex2); std::lock_guard<std::mutex> lock2(mutex2, std::adopt_lock); std::cout << "Thread A" << std::endl; std::lock_guard<std::mutex> lock1(mutex1, std::adopt_lock); } void ThreadB() { std::lock(mutex1, mutex2); std::lock_guard<std::mutex> lock1(mutex1, std::adopt_lock); std::cout << "Thread B" << std::endl; std::lock_guard<std::mutex> lock2(mutex2, std::adopt_lock); } void ExecuteThreads() { std::thread t1( ThreadA ); std::thread t2( ThreadB ); t1.join(); t2.join(); std::cout << "Finished" << std::endl; } int main() { ExecuteThreads(); return 0; }
-
As a rule of thumb, programmers should try to avoid using several mutexes at once. Practice shows that this can be achieved in the majority of cases. For the remaining cases though, using
std::lock
is a safe way to avoid a deadlock situation.
-
-
Condition Variables and Message Queues
-
The Monitor Object Pattern
-
In the previous sections we have learned that data protection is a critical element in concurrent programming. After looking at several ways to achieve this, we now want to build on these concepts to devise a method for a controlled and finely-grained data exchange between
threads
- a concurrent message queue. One important step towards such a construct is to implement amonitor object
, which is adesign pattern
thatsynchronizes
concurrent method execution to ensure that only one method at a time runs within an object. It also allows an object's methods to cooperatively schedule their execution sequences. The problem solved by this pattern is based on the observation that many applications contain objects whose methods are invoked concurrently by multiple client threads. These methods often modify the state of their objects, for example by adding data to an internal vector. For such concurrent programs to execute correctly, it is necessary to synchronize and schedule access to the objects very carefully. The idea of a monitor object is tosynchronize the access to an object's methods so that only one method can execute at any one time.
-
#include <iostream> #include <thread> #include <vector> #include <future> #include <mutex> class Vehicle { public: Vehicle(int id) : _id(id) {} int getID() { return _id; } private: int _id; }; class WaitingVehicles { public: WaitingVehicles() {} void printIDs() { std::lock_guard<std::mutex> myLock(_mutex); // lock is released when myLock goes out of scope for(auto &v : _vehicles) std::cout << " Vehicle #" << v.getID() << " is now waiting in the queue" << std::endl; } void pushBack(Vehicle &&v) { // perform vector modification under the lock std::lock_guard<std::mutex> uLock(_mutex); std::cout << " Vehicle #" << v.getID() << " will be added to the queue" << std::endl; _vehicles.emplace_back(std::move(v)); // simulate some work std::this_thread::sleep_for(std::chrono::milliseconds(500)); } private: std::vector<Vehicle> _vehicles; // list of all vehicles waiting to enter this intersection std::mutex _mutex; }; int main() { // create monitor object as a shared pointer to enable access by multiple threads std::shared_ptr<WaitingVehicles> queue(new WaitingVehicles); std::cout << "Spawning threads..." << std::endl; std::vector<std::future<void>> futures; for (int i = 0; i < 10; ++i) { // create a new Vehicle instance and move it into the queue Vehicle v(i); futures.emplace_back(std::async(std::launch::async, &WaitingVehicles::pushBack, queue, std::move(v))); } std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) { ftr.wait(); }); std::cout << "Collecting results..." << std::endl; queue->printIDs(); return 0; }
-
In a previous section, we have looked at a code example which came pretty close to the functionality of a
monitor object
: the classWaitingVehicles
. -
Let us modify and partially reimplement this class, which we want to use as a shared place where concurrent threads may store data, in our case instances of the class
Vehicle
. As we will be using the sameWaitingVehicles
object for all the threads, we have to pass it to them byreference
- and as all threads will be writing to this object at the same time (which is a mutating operation) we will pass it as ashared pointer
. Keep in mind that there will be many threads that will try to pass data to theWaitingVehicles
object simultaneously and thus there is the danger of a data race. -
Before we take a look at the implementation of
WaitingVehicles
, let us look at themain
function first where all the threads are spawned. We need a vector to store the futures as there is no data to be returned from the threads. Also, we need to callwait()
on the futures at the end ofmain()
so the program will not prematurely exit before the thread executions are complete. -
Instead of using
push_back
we will again be usingemplace_back
to construct thefutures
in place rather than moving them into thevector
. After constructing a newVehicle
object within thefor-loop
, we start a new task by passing it a reference to thepushBack
function, ashared pointer
to ourWaitingVehicles
object and the newly created vehicle. Note that the latter is passed usingmove semantics
. -
Now let us take a look at the implementation of the
WaitingVehicles class. -
We need to enable it to process write requests from several threads at the same time. Every time a request comes in from a
thread
, the object needs to add the new data to its internal storage. Our storage container will be anstd::vector
. As we need to protect the vector from simultaneous access later, we also need to integrate amutex
into the class. As we already know, amutex
has the methodslock
andunlock
. In order to avoid data races, themutex
needs to be locked every time a thread wants to access the vector and it needs to be unlocked one the write operation is complete. In order to avoid a program freeze due to a missing unlock operation, we will be using alock guard
object, which automaticallyunlocks once the lock object gets out of scope
. -
In our modified
pushBack
function, we will first create alock guard
object and pass it themutex
member variable. Now we can freely move theVehicle
object into our vector without the danger of a data race. At the end of the function, there is a call tostd::sleep_for
, which simulates some work and helps us to better expose potential concurrency problems. With each newVehicle
object that is passed into the queue, we will see an output to the console. -
Another function within the
WaitingVehicles
class isprintIDs()
, which loops over all the elements of the vector and prints their respective IDs to the console. One major difference betweenpushBack()
andprintIDs()
is that the latter function accesses allVehicle
objects by looping through the vector whilepushBack
only accesses a single object - which is the newest addition to theVehicle
vector. -
When the program is executed, the following output is printed to the console:
-
Spawning threads... Vehicle #0 will be added to the queue Vehicle #3 will be added to the queue Vehicle #2 will be added to the queue Vehicle #1 will be added to the queue Vehicle #4 will be added to the queue Vehicle #5 will be added to the queue Vehicle #6 will be added to the queue Vehicle #8 will be added to the queue Vehicle #7 will be added to the queue Vehicle #9 will be added to the queue Collecting results... Vehicle #0 is now waiting in the queue Vehicle #3 is now waiting in the queue Vehicle #2 is now waiting in the queue Vehicle #1 is now waiting in the queue Vehicle #4 is now waiting in the queue Vehicle #5 is now waiting in the queue Vehicle #6 is now waiting in the queue Vehicle #8 is now waiting in the queue Vehicle #7 is now waiting in the queue Vehicle #9 is now waiting in the queue
-
As can be seen, the
Vehicle
objects are added one at a time, with all threads duly waiting for their turn. Then, once allVehicle
objects have been stored, the call toprintIDs
prints the entire content of the vector all at once. -
While the functionality of the
monitor object
we have constructed is an improvement over many other methods that allow passing data to threads, it has one significant disadvantage: Themain thread
has to wait until allworker threads
have completed their jobs and only then can it access the added data in bulk. A system which is truly interactive however has to react to events as they arrive - it should not wait until all threads have completed their jobs but insteadact
immediately as soon as new data arrives. In the following, we want to add this functionality to our monitor object.
-
-
Creating an infinite polling loop
-
While the pushBack method is used by the threads to add data to the monitor incrementally, the main thread uses printIDs at the end to display all the results at once. Our goal is to change the code in a way that the main thread gets notified every time new data becomes available. But how can the main thread know whether new data has become available? The solution is to write a new method that regularly checks for the arrival of new data. -
In the code listed below, a new method dataIsAvailable() has been added while printIDs() has been removed. This method returns true if data is available in the vector and false otherwise. Once the main thread has found out via dataIsAvailable() that new data is in the vector, it can call the method popBack() to retrieve the data from the monitor object. Note that instead of copying the data, it is moved from the vector to the main method. -
#include <iostream>
#include <thread>
#include <vector>
#include <future>
#include <mutex>
#include <algorithm> // for std::for_each

class Vehicle
{
public:
    Vehicle(int id) : _id(id) {}
    int getID() { return _id; }

private:
    int _id;
};

class WaitingVehicles
{
public:
    WaitingVehicles() {}

    bool dataIsAvailable()
    {
        std::lock_guard<std::mutex> myLock(_mutex);
        return !_vehicles.empty();
    }

    Vehicle popBack()
    {
        // perform vector modification under the lock
        std::lock_guard<std::mutex> uLock(_mutex);

        // remove last vector element from queue
        Vehicle v = std::move(_vehicles.back());
        _vehicles.pop_back();

        return v; // will not be copied due to return value optimization (RVO) in C++
    }

    void pushBack(Vehicle &&v)
    {
        // simulate some work
        std::this_thread::sleep_for(std::chrono::milliseconds(100));

        // perform vector modification under the lock
        std::lock_guard<std::mutex> uLock(_mutex);

        // add vehicle to queue
        std::cout << "   Vehicle #" << v.getID() << " will be added to the queue" << std::endl;
        _vehicles.emplace_back(std::move(v));
    }

private:
    std::vector<Vehicle> _vehicles; // list of all vehicles waiting to enter this intersection
    std::mutex _mutex;
};

int main()
{
    // create monitor object as a shared pointer to enable access by multiple threads
    std::shared_ptr<WaitingVehicles> queue(new WaitingVehicles);

    std::cout << "Spawning threads..." << std::endl;
    std::vector<std::future<void>> futures;
    for (int i = 0; i < 10; ++i)
    {
        // create a new Vehicle instance and move it into the queue
        Vehicle v(i);
        futures.emplace_back(std::async(std::launch::async, &WaitingVehicles::pushBack, queue, std::move(v)));
    }

    std::cout << "Collecting results..." << std::endl;
    while (true) // note: this loop never exits - see the discussion below
    {
        if (queue->dataIsAvailable())
        {
            Vehicle v = queue->popBack();
            std::cout << "   Vehicle #" << v.getID() << " has been removed from the queue" << std::endl;
        }
    }

    std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) {
        ftr.wait();
    });

    std::cout << "Finished processing queue" << std::endl;

    return 0;
}
-
In the main thread, we will use an infinite while-loop to frequently poll the monitor object and check whether new data has become available. Contrary to before, we will now perform the read operation before the workers are done - so we have to integrate our loop before wait() is called on the futures at the end of main(). Once a new Vehicle object becomes available, we want to print it within the loop. -
When we execute the code, the console output shows that adding to and removing from the monitor object are now interleaved: "will be added" and "has been removed" messages alternate instead of appearing in two separate blocks. When executed repeatedly, the order of the vehicles will most probably differ between executions. -
-
-
Writing a vehicle counter
-
Note that the program in the example above did not terminate - even though no new Vehicles are added to the queue, the infinite while-loop will not exit. -
One possible solution to this problem would be to integrate a vehicle counter into the WaitingVehicles class that is incremented each time a Vehicle object is added and decremented when it is removed. The while-loop could then be terminated as soon as the counter reaches zero. Please go ahead and implement this functionality - but remember to protect the counter, as it will also be accessed by several threads at once. Also, it is a good idea to introduce a small delay between spawning threads and collecting results; otherwise, the queue will still be empty when the loop starts and the program will terminate prematurely. At the end of main(), please also print the number of remaining Vehicle objects in the vector. -
#include <iostream>
#include <thread>
#include <vector>
#include <future>
#include <mutex>
#include <algorithm> // for std::for_each

class Vehicle
{
public:
    Vehicle(int id) : _id(id) {}
    int getID() { return _id; }

private:
    int _id;
};

class WaitingVehicles
{
public:
    WaitingVehicles() : _numVehicles(0) {}

    int getNumVehicles()
    {
        std::lock_guard<std::mutex> uLock(_mutex);
        return _numVehicles;
    }

    bool dataIsAvailable()
    {
        std::lock_guard<std::mutex> myLock(_mutex);
        return !_vehicles.empty();
    }

    Vehicle popBack()
    {
        // perform vector modification under the lock
        std::lock_guard<std::mutex> uLock(_mutex);

        // remove last vector element from queue
        Vehicle v = std::move(_vehicles.back());
        _vehicles.pop_back();
        --_numVehicles;

        return v; // will not be copied due to return value optimization (RVO) in C++
    }

    void pushBack(Vehicle &&v)
    {
        // simulate some work
        std::this_thread::sleep_for(std::chrono::milliseconds(100));

        // perform vector modification under the lock
        std::lock_guard<std::mutex> uLock(_mutex);

        // add vehicle to queue
        std::cout << "   Vehicle #" << v.getID() << " will be added to the queue" << std::endl;
        _vehicles.emplace_back(std::move(v));
        ++_numVehicles;
    }

private:
    std::vector<Vehicle> _vehicles; // list of all vehicles waiting to enter this intersection
    std::mutex _mutex;
    int _numVehicles;
};

int main()
{
    // create monitor object as a shared pointer to enable access by multiple threads
    std::shared_ptr<WaitingVehicles> queue(new WaitingVehicles);

    std::cout << "Spawning threads..." << std::endl;
    std::vector<std::future<void>> futures;
    for (int i = 0; i < 10; ++i)
    {
        // create a new Vehicle instance and move it into the queue
        Vehicle v(i);
        futures.emplace_back(std::async(std::launch::async, &WaitingVehicles::pushBack, queue, std::move(v)));
    }

    std::cout << "Collecting results..." << std::endl;
    while (true)
    {
        if (queue->dataIsAvailable())
        {
            Vehicle v = queue->popBack();
            std::cout << "   Vehicle #" << v.getID() << " has been removed from the queue" << std::endl;

            if (queue->getNumVehicles() <= 0)
            {
                // counter reached zero: pause briefly and stop consuming; vehicles added
                // after this point will show up as leftovers in the final count below
                std::this_thread::sleep_for(std::chrono::milliseconds(200));
                break;
            }
        }
    }

    std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) {
        ftr.wait();
    });

    std::cout << "Finished : " << queue->getNumVehicles() << " vehicle(s) left in the queue" << std::endl;

    return 0;
}
-
-
Building a Concurrent Message Queue
-
Condition variables
-
The polling loop we have used in the previous example has not been programmed optimally: as long as the program is running, the while-loop will keep the processor busy, constantly asking whether new data is available. In the following, we will look at a better way to solve this problem without putting too much load on the processor. -
The alternative to a polling loop is for the main thread to block and wait for a signal that new data is available. This would prevent the infinite loop from keeping the processor busy. We have already discussed a mechanism that would fulfill this purpose - the promise-future construct. The problem with futures is that they can only be used a single time. Once a future is ready and get() has been called, it can not be used any more. For our purpose, we need a signaling mechanism that can be re-used. The C++ standard offers such a construct in the form of "condition variables". -
A std::condition_variable has a method wait(), which blocks the thread that calls it. The calling thread remains blocked until it is released by another thread. The release works via the method notify_one() or the method notify_all(). The key difference between the two is that notify_one() will only wake up a single waiting thread, while notify_all() will wake up all the waiting threads at once. -
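To make the difference between notify_one() and notify_all() tangible, here is a small standalone sketch (it is not part of the traffic-simulation code and already uses the predicate form of wait() together with a shared flag, both of which are discussed in detail further below): -
#include <iostream>
#include <thread>
#include <chrono>
#include <vector>
#include <mutex>
#include <condition_variable>

std::mutex mtx;
std::condition_variable cond;
bool ready = false; // shared state checked by the waiting threads

void waiter(int id)
{
    std::unique_lock<std::mutex> lock(mtx);
    cond.wait(lock, [] { return ready; }); // block until notified AND ready == true
    std::cout << "Thread " << id << " woke up" << std::endl;
}

int main()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 3; ++i)
        threads.emplace_back(waiter, i);

    // give the waiters a moment to start waiting
    std::this_thread::sleep_for(std::chrono::milliseconds(100));

    {
        std::lock_guard<std::mutex> lock(mtx);
        ready = true; // modify the shared state under the lock
    }
    cond.notify_all(); // wakes all three waiters; notify_one() would wake only one of them

    for (auto &t : threads)
        t.join();

    return 0;
}
-
With notify_all(), all three waiters resume one after the other (each re-acquires the mutex on wake-up); with notify_one(), only a single thread would be woken and the remaining joins would block. -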
-
A condition variable is a low-level building block for more advanced communication protocols. It neither has a memory of its own nor does it remember notifications. If one thread calls wait() before another thread calls notify(), the condition variable works as expected and the waiting thread will wake up. If, however, the call order is reversed such that notify() is called before wait(), the notification will be lost and the thread will block indefinitely. So in more sophisticated communication protocols, a condition variable should always be used in conjunction with another shared state that can be checked independently. Notifying the condition variable in this case only means: proceed and check this other shared state. -
Let us pretend our shared variable was a boolean called dataIsAvailable. Now let's discuss two scenarios for the protocol, depending on who acts first, the producer or the consumer thread. -
In the first scenario, the consumer thread acts first: it checks dataIsAvailable and, since it is false, blocks and waits on the condition variable. Later in time, the producer thread sets dataIsAvailable to true and calls notify_one on the condition variable. At this point, the consumer wakes up and proceeds with its work. -
In the second scenario, the producer thread comes first, sets dataIsAvailable to true and calls notify_one. Then the consumer thread checks dataIsAvailable and finds it to be true - so it does not call wait and proceeds directly with its work. Even though the notification is lost, this does not cause a problem in this construct - the message has been passed successfully through dataIsAvailable and the wait has been avoided. -
In an ideal (non-concurrent) world, these two scenarios would probably be sufficient to describe the possible combinations. But in concurrent programming, things are not so easy: each scenario consists of four individual operations, two for each thread, and the two operations of a thread are not executed as one atomic step. So when executed often enough, all possible interleavings will show themselves - and we have to find the ones that still cause a problem. -
Here is one combination that will cause the program to lock up:
-
The consumer thread reads dataIsAvailable, which is false in the example, but is interrupted before it can call wait. The producer then sets dataIsAvailable to true and calls notify - at this point no thread is waiting yet, so the notification is lost. Due to this unlucky interleaving of actions, the consumer thread now calls wait because it has seen dataIsAvailable as false. This is possible because the consumer's two actions are not a joint atomic operation but may be separated by the scheduler and interleaved with other tasks - in this case the two actions performed by the producer thread. The problem is that after calling wait, the consumer thread will never wake up again. Also, as you may have noticed, the shared variable dataIsAvailable is not protected by a mutex here - which makes it even more likely that something will go wrong. -
One quick idea for a solution which might come to mind would be to perform the two operations - reading dataIsAvailable and calling wait - under a locked mutex. While this would effectively prevent the interleaving of tasks between different threads, holding the mutex for the entire duration of the wait would also prevent another thread from ever modifying dataIsAvailable again. -
One reason for discussing these failed scenarios in such depth is to make you aware of the complexity of concurrent behavior - even with a simple protocol like the one we are discussing right now.
-
So let us now look at the final solution to the above problems and thus a working version of our communication protocol.
-
As discussed above, we need to close the gap between reading the state and entering the wait: we read the state under the lock and we call wait still under the lock. wait then releases the lock and enters the wait state in one atomic step. This is only possible because the wait() method takes a lock as an argument. The lock we pass to wait, however, can not be the lock_guard we have been using so often until now; it has to be a lock that can be temporarily unlocked inside wait - a suitable lock for this purpose is the unique_lock type, which we have discussed in the previous section.
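As a sketch, the pattern just described looks as follows; dataIsAvailable is the illustrative shared flag from the scenarios above, and mtx and cond are assumed names for the mutex and condition variable: -
std::unique_lock<std::mutex> lck(mtx);
while (!dataIsAvailable)  // read the shared state under the lock
    cond.wait(lck);       // atomically releases the lock and blocks; re-acquires the lock on wake-up
// at this point the lock is held again and dataIsAvailable is true
-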
-
-
Implementing the WaitingVehicles queue
-
Now that we have all the ingredients to implement the concurrent queue to store waiting Vehicle objects, let us start with the implementation, following the protocol outlined above. -
The first step is to add a condition variable to the WaitingVehicles class as a private member - just like the mutex. -
private:
    std::mutex _mutex;
    std::condition_variable _cond;
-
The next step is to notify the client after pushing a new Vehicle into the vector. -
// add vehicle to queue
std::cout << "   Vehicle #" << v.getID() << " will be added to the queue" << std::endl;
_vehicles.push_back(std::move(v));
_cond.notify_one(); // notify client after pushing new Vehicle into vector
-
In the method popBack, we need to create the lock first - it can not be a lock_guard any more, as we need to pass it to the condition variable's wait method. Thus it must be a unique_lock. Now we can enter the wait state while at the same time releasing the lock. It is only inside wait that the mutex is temporarily unlocked - which is a very important point to remember: we are holding the lock before AND after our call to wait - which means that we are free to access whatever data is protected by the mutex. In our example, this will be the vector of Vehicle objects. -
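As a sketch, this intermediate version of popBack - not yet guarding against the spurious wake-ups discussed next - could look like this: -
Vehicle popBack()
{
    // create a unique_lock so that it can be handed over to the condition variable
    std::unique_lock<std::mutex> uLock(_mutex);
    _cond.wait(uLock); // releases the lock while waiting, re-acquires it on wake-up

    // the lock is held again here, so it is safe to access the vector
    Vehicle v = std::move(_vehicles.back());
    _vehicles.pop_back();

    return v;
}
-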
Before we continue, we need to discuss the problem of "spurious wake-ups": once in a while, the system will - for no obvious reason - wake up a thread. If such a spurious wake-up happened without us taking proper precautions, we would proceed past wait even though no new data is available (because the wake-up has not been caused by the condition variable but by the system in this case). To guard against this, we have to modify the code slightly: -
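The modified listing was omitted here; the following is a reconstruction of popBack with a while-loop around wait, consistent with the description below: -
Vehicle popBack()
{
    std::unique_lock<std::mutex> uLock(_mutex);
    while (_vehicles.empty())   // re-check the condition after every wake-up
        _cond.wait(uLock);      // guards against spurious wake-ups

    Vehicle v = std::move(_vehicles.back());
    _vehicles.pop_back();

    return v;
}
-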
In this code, even after a spurious wake-up, we check whether data really is available before proceeding; if the vector is still empty, we simply issue the call to wait on the condition variable again. And only while we are inside wait - with the mutex temporarily released - may other threads modify and access the shared data. -
If the vector is empty, wait is called. When the thread wakes up again, the condition is immediately re-checked and - in case it has not been a spurious wake-up - we can continue with our work and retrieve an element from the vector. -
We can further simplify this code by letting the wait() function do the testing as well as the looping for us. Instead of the while loop, we can just pass a lambda to wait(), which repeatedly checks whether the vector contains elements (thus the inverted logical expression): -
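This snippet is taken from the complete listing further below: -
Vehicle popBack()
{
    std::unique_lock<std::mutex> uLock(_mutex);
    _cond.wait(uLock, [this] { return !_vehicles.empty(); }); // wait loops internally until the predicate is true

    Vehicle v = std::move(_vehicles.back());
    _vehicles.pop_back();

    return v;
}
-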
When wait() finishes, we are guaranteed to find a new element in the vector. Also, we are still holding the lock, so no other thread is able to access the vector - there is no danger of a data race in this situation. As soon as we are out of scope, the lock will be automatically released. -
In the main() function, there is still an infinite loop that retrieves new Vehicle objects. But contrary to the example before, a call to popBack now puts the main thread into a wait state and only resumes when new data is available - thus significantly reducing the load on the processor. -
Let us now take a look at the complete code…
-
#include <iostream>
#include <thread>
#include <vector>
#include <future>
#include <mutex>
#include <condition_variable>
#include <algorithm> // for std::for_each

class Vehicle
{
public:
    Vehicle(int id) : _id(id) {}
    int getID() { return _id; }

private:
    int _id;
};

class WaitingVehicles
{
public:
    WaitingVehicles() {}

    Vehicle popBack()
    {
        // perform vector modification under the lock
        std::unique_lock<std::mutex> uLock(_mutex);
        _cond.wait(uLock, [this] { return !_vehicles.empty(); }); // pass unique lock to condition variable

        // remove last vector element from queue
        Vehicle v = std::move(_vehicles.back());
        _vehicles.pop_back();

        return v; // will not be copied due to return value optimization (RVO) in C++
    }

    void pushBack(Vehicle &&v)
    {
        // simulate some work
        std::this_thread::sleep_for(std::chrono::milliseconds(100));

        // perform vector modification under the lock
        std::lock_guard<std::mutex> uLock(_mutex);

        // add vehicle to queue
        std::cout << "   Vehicle #" << v.getID() << " will be added to the queue" << std::endl;
        _vehicles.push_back(std::move(v));
        _cond.notify_one(); // notify client after pushing new Vehicle into vector
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    std::vector<Vehicle> _vehicles; // list of all vehicles waiting to enter this intersection
};

int main()
{
    // create monitor object as a shared pointer to enable access by multiple threads
    std::shared_ptr<WaitingVehicles> queue(new WaitingVehicles);

    std::cout << "Spawning threads..." << std::endl;
    std::vector<std::future<void>> futures;
    for (int i = 0; i < 10; ++i)
    {
        // create a new Vehicle instance and move it into the queue
        Vehicle v(i);
        futures.emplace_back(std::async(std::launch::async, &WaitingVehicles::pushBack, queue, std::move(v)));
    }

    std::cout << "Collecting results..." << std::endl;
    while (true) // note: once the queue stays empty, popBack blocks forever and this loop never exits
    {
        // popBack wakes up when a new element is available in the queue
        Vehicle v = queue->popBack();
        std::cout << "   Vehicle #" << v.getID() << " has been removed from the queue" << std::endl;
    }

    std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) {
        ftr.wait();
    });

    std::cout << "Finished!" << std::endl;

    return 0;
}
-
… and at the console output it produces. Note that the output below was captured from the generalized message-queue variant listed at the end of this section, which prints "Message" instead of "Vehicle"; the interleaving pattern is the same:
-
Spawning threads...
Collecting results...
Message 0 has been sent to the queue
Message #0 has been removed from the queue
Message 2 has been sent to the queue
Message #2 has been removed from the queue
Message 1 has been sent to the queue
Message 6 has been sent to the queue
Message 7 has been sent to the queue
Message 8 has been sent to the queue
Message 9 has been sent to the queue
Message 3 has been sent to the queue
Message 4 has been sent to the queue
Message 5 has been sent to the queue
Message #5 has been removed from the queue
Message #4 has been removed from the queue
Message #3 has been removed from the queue
Message #9 has been removed from the queue
Message #8 has been removed from the queue
Message #7 has been removed from the queue
Message #6 has been removed from the queue
Message #1 has been removed from the queue
-
Better code
-
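The listing below generalizes the WaitingVehicles monitor into a class template MessageQueue<T>: the vector of Vehicle objects is replaced by a std::deque<T>, pushBack and popBack become send and receive, and the same unique_lock/condition-variable pattern is reused. This is the variant that produced the "Message" console output shown above. -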
#include <iostream>
#include <thread>
#include <vector>
#include <deque>
#include <future>
#include <mutex>
#include <condition_variable>
#include <algorithm> // for std::for_each

template <class T>
class MessageQueue
{
public:
    T receive()
    {
        // perform queue modification under the lock
        std::unique_lock<std::mutex> uLock(_mutex);
        _cond.wait(uLock, [this] { return !_messages.empty(); }); // pass unique lock to condition variable

        // remove latest message from the queue
        T msg = std::move(_messages.back());
        _messages.pop_back();

        return msg; // will not be copied due to return value optimization (RVO) in C++
    }

    void send(T &&msg)
    {
        // simulate some work
        std::this_thread::sleep_for(std::chrono::milliseconds(100));

        // perform queue modification under the lock
        std::lock_guard<std::mutex> uLock(_mutex);

        // add message to queue
        std::cout << "   Message " << msg << " has been sent to the queue" << std::endl;
        _messages.push_back(std::move(msg));
        _cond.notify_one(); // notify client after pushing a new message into the queue
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    std::deque<T> _messages;
};

int main()
{
    // create monitor object as a shared pointer to enable access by multiple threads
    std::shared_ptr<MessageQueue<int>> queue(new MessageQueue<int>);

    std::cout << "Spawning threads..." << std::endl;
    std::vector<std::future<void>> futures;
    for (int i = 0; i < 10; ++i)
    {
        int message = i;
        futures.emplace_back(std::async(std::launch::async, &MessageQueue<int>::send, queue, std::move(message)));
    }

    std::cout << "Collecting results..." << std::endl;
    while (true) // note: once the queue stays empty, receive blocks forever and this loop never exits
    {
        int message = queue->receive();
        std::cout << "   Message #" << message << " has been removed from the queue" << std::endl;
    }

    std::for_each(futures.begin(), futures.end(), [](std::future<void> &ftr) {
        ftr.wait();
    });

    std::cout << "Finished!" << std::endl;

    return 0;
}
-
A message queue is an effective and very useful mechanism to enable a safe and reusable communication channel between threads. In the final project, we will shortly use this construct to integrate another component into our simulation - traffic lights at intersections.
-