-
Notifications
You must be signed in to change notification settings - Fork 398
Coding Guidelines
Introduction
Source Files
Source Organization
Source Structure
Naming
Namespaces
Declarations
Formatting Style
C++ Usage
Object-Oriented C++
Performance
Resources
This is the preliminary EnergyPlus C++ coding guidelines. It is likely that this will evolve and grow considerably over time.
These guidelines will gradually become enforced by both code scanning tools and code reviewers. Any C++ guideline should allow exceptions for unusual situations: code reviewers should look at the whole picture to see if bending a rule can improve the resulting code in terms of improved clarity, reduced code/duplication, reduced exposure to bugs, and so forth.
At this time EnergyPlus does not adhere to some of these rules due to its recent conversion from Fortran. The intent is to gradually refactor EnergyPlus to use natural C++ constructs and to follow these guidelines.
- Header file names have a .hh extension
- Source file names have a .cc extension
- Header and source files (we call these both "source" files in some contexts) have Linux-style (\n) line terminators (dos2unix can convert to this)
- Smaller source files for lower coupling and fast compiles
- Maximal use of forward declarations instead of full header inclusion to lower coupling and speed up compilation
- Dependency management
- Organize the source into coherent packages of manageable size: A package is generally implemented as a namespace and can live in its own subdirectory
- Code should be organized in conceptual layers with dependencies carefully thought out. Dependencies between source files can become a problem with C++, both in terms of the need to avoid circular dependencies and to avoid long compilation times.
- High-level components can use and depend on lower level "utility" components but the opposite would indicate a design flaw
- Finer source granularity can reduce dependency and compile speed problems
- Ask for help if designing for dependency control is presenting a challenge
- One pair of .hh and .cc files per class is usually preferable
- A forward declaration header with a .fwd.hh extension is recommended to make it easier for other code to forward declare a class that lives in a namespace. A forward declaration may also provide convenient aliases for class templates.
- Put class method implementations in the .cc file if they would require inclusion of "heavy" header files into the .hh file (unless inlining them is important for performance) or are too big or slow to make inlining useful
- A header file should be self-contained (compilable) and include no more than it needs to
- Namespaces should be used to associate parts of a coherent "package" that work together to provide some functionality
- Namespaces placed in the corresponding subdirectory can be used to group a package: It is recommended that EnergyPlus start to move to namespace directories and away from a monolithic source directory
- The use of namespaces increases the benefit of having .fwd.hh forward declaration headers for classes (writing out such forward declarations is tedious)
- File names should match the contained class name exactly: MyClass.hh declares MyClass
- No trailing spaces are allowed: Most good editors/IDEs can be set to trim off trailing spaces on Save and stand-alone tools can do this
- 4 space indentation is used
- Line wrapping is discouraged for now: Most editors/IDEs can do soft wrapping in the display
- Line wrapping that is permitted should wrap at the same indentation as the initial line but with a single extra indent space
- emacs style wrapping (aligning wrapped lines at the open parenthesis of a function) is not allowed
- Function arguments that are wrapped on separate lines should get an extra indent
- Functions with more than a few arguments should be wrapped to one argument per line
- Functions with one argument should not wrap the argument to a separate line unless an argument comment is needed or it is part of a family of functions that have wrapped multiple arguments
- Header files have include guards wrapping the whole file of the form
#ifndef NamespaceName_FileName_hh_INCLUDED
#define NamespaceName_FileName_hh_INCLUDED
...
#endif
- Use
<>
include wrappers with complete local paths relative to the source root rather than""
because they fully specify the search logic:
#include <path/to/MyClass.hh>
- Includes must be complete and minimal:
- Headers should include everything they need so that they can be compiled if included alone in a .cc file
- Headers should not include anything they don't need
- Forward declarations (or forward declaration headers) should be used instead of full headers wherever they suffice:
- A type only appears in a declaration
- A function definition argument that is passed by reference or pointer and only used to pass along to other functions by reference or pointer
- If you aren't sure try the forward declaration: the compiler will tell you if it isn't enough
- When you modify a file you are responsible for checking for any obsolete, duplicate, or too-strong headers
- Includes should be grouped into sections in this order and with these comments:
// C++ Headers
...
// Some_Package Headers
...
// ObjexxFCL Headers
...
// EnergyPlus Headers
...
- A .cc file should include its own header file first in the EnergyPlus Headers section
- A .hh file should include any base header(s) first in the EnergyPlus Headers section
- Other headers should be in alphabetical order in each section
- Declare (and initialize if needed) variables as close as possible to and in the smallest scope around their point of first use:
- Adding scoping {} braces around a code block to localize the scope of a variable can be useful
- Prefer construction syntax for constructing objects instead of assignment syntax:
X x(a); instead of X x = a;
Y y(a, b, c); instead of Y y = Y(a, b, c);
- CamelCase should be used for type (class, struct, union, enum, ...) names
- camelCase should be used for variable (including class/struct instance) and function names
- Exceptions can be made when the case of a name has particular significance such as providing Name() and NAME() methods to get an item's name with Tile or UPPER case
- C++ library lower_case naming can be used for code that is analogous to a standard library component like a custom container
- Underscores can be used within more complicated camel case names with multiple parts for readability
- Use OO naming: names don't need to indicate the concrete class or expose information about the implementation details
- Use clear, non-cryptic names
- Use contextual names to avoid redundancies:
myWidget.name
notmyWidget.widgetName
- Namespaces are normally given lowercase names
- No extra indentation in namespaces
- Namespaces are marked like this:
namespace thing1 {
namespace thing2 {
// ...
} // thing2
} // thing1
- Prefer specific using declarations such as
using pkg::name
over using directives such asusing namespace pkg
- Never put a using declaration or directive at file or namespace scope in a header ("infects" including files)
- Prefer placing using declarations and directives in a narrow scope where they are needed
To aid with formatting style conformance two files are provided, VisualStudio.c++.settings
which can be directly imported into VisualStudio and .clang-format
in the top level directory which can be automatically picked up by supporting tools:
- VSCode https://marketplace.visualstudio.com/items?itemName=xaver.clang-format
- VisualStudio https://marketplace.visualstudio.com/items?itemName=LLVMExtensions.ClangFormat#review-details
- Resharper++: https://www.jetbrains.com/help/resharper/2017.2/Using_Clang_Format.html
- Vim
- XCode: https://github.com/travisjeffery/ClangFormat-Xcode
- Functions with no, one, and multiple arguments should look like this:
return_type no_arg_function()
{
// ...
}
return_type one_arg_function(arg)
{
// ...
}
return_type multi_arg_function(arg1,
/* ... */
argN)
{
// ...
}
- Function declarations and implementations must be kept in sync wrt variable names
- Default arguments should appear only in function declarations when separate declaration and implementation are used
- The following format is suggested, with appropriate tailoring, for classes:
class MyClass // [: public MyBaseClass]
{
public: // Types (typedefs, using declarations, etc.)
public: // Creation (constructors and destructors)
public: // Assignment
public: // Methods
// (return by value instead of reference if small/builtin type)
TypeB const & b() const
{
return b_;
}
TypeB & b()
{
return b_;
}
public: // Data (if any)
TypeA a;
private: // Data
TypeB b_;
};
- Don't decorate member names with
m_
prefixes (hurts readability and IDEs can tell you about the variable when you hover your mouse over it)
- Braces other than on class and function declarations are placed to avoid extra lines of code (to maximize the lines that we can see on the screen)
if (condition) {
} else if (condition) {
} else {
}
for (/*...*/) {
}
while (/*...*/) {
}
- Conditional statements that wrap onto multiple lines should use {} braces even if there is only one action statement so that someone doesn't add another statement thinking it is in the same conditional block:
// Wrong
if (condition)
freds_function();
dinos_function(); // Oops! Dino didn't notice there was no block
// Correct
if (condition) {
freds_function();
dinos_function(); // That's what Dino meant
}
- Use one blank line between code sections when it aids clarity
- Multiple sequential blank lines and extra comment lines filled with repeat characters are strongly discouraged
- Use spaces to make code more readable for everyone:
- Put single spaces around operators
- Put spaces after commas
var = 2 * foo(arg1, arg2);
- Prefer
// trailing comment
to/* inline comment */
- Inline comments are acceptable in limited situations such as in function call argument lists when breaking into multiple lines would harm clarity
- Leave a single space after
//
before the comment contents - Tag comments directly after the
//
for some basic categories to aid code searching. The list of tags will evolve but here is a starter set: - Permanent comments should be indented to the same column as the code to which they apply
- Commented out code sections normally should have the `//`` at col 1
Much of the inherited Fortran subroutine/function comment block is not a good fit for C++. Unless/until we adopt tagged commenting for documentation systems such as Doxygen the following style of comment block is recommended. A comment block is not necessary for each method in a class since many methods are obvious "boilerplate" like getters and setters.
// Package: Brief Description of the File/Component/Function
//
// Author: Your Name (YourEmail@your.domain)
//
// Purpose: Short description of the purpose of the code
//
// History:
// Sep 2015: Your Name: Initial version
// Oct 2015: Jane Doe: Added support for new widget
//
// Notes:
// Note on method used
// References for algorithm
// Other important information
- History entries should be made for significant revisions: small updates are tracked by Git
- Notes section is optional
- Never put a using declaration or directive at file or namespace scope in a header ("infects" including files)
- Pass large (non-builtin) types by reference
- This includes
std::string
!!! - In some cases move semantics mean this is relaxed
- This includes
- Use
++var
and--var
, notvar++
andvar--
if returning the old value is not needed: It isn't faster in optimized builds anymore but it does express the intent better - Use size types like
std::size_t
instead ofint
when comparing against sizes to avoid compiler warnings - Prefer the exact-sized types in the cstdint header to types like
long int
that have different sizes on different platforms - Use
nullptr
instead of0
orNULL
for a null pointer value - Avoid the use of macros
- Avoid global (including namespace scope) data in new code
- Constructors should initialize data members in the initializer list section rather than inside the constructor body whenever possible:
MyClass(X const & x) :
x_( x )
{}
- Constructor initializer lists should have data members in their declared order since that is the order they are actually initialized and mixing them will lead to compiler warnings
- Constructor initializer lists should skip data members and base classes being default initialized
- std::string default initializes to the empty string so putting things like
name("")
in initializer lists is not needed
- std::string default initializes to the empty string so putting things like
- Don't use
this->
prefixes on all class members (hurts readability and is rarely needed for disambiguation) - Use
x_
for private data andx()
(notgetX()
) for simple accessors- If the class must control modifications of x_ (such as to make other changes to maintain invariants) the "set" method can be
x( X const & x ) // Pass by value if X is builtin type
{
x_ = x;
}
- If the
x
getter and setter methods have different access specifiers (public, protected, or private) having them be overloads can create problems with using declarations: In that case usingset_x
for the setter is fine (and this is an example where setX camelCase rule is best relaxed)
- Use const maximally on variables and arguments to document and enforce intent
- Use const on pass-by-value arguments that are not and should not be altered to have the compiler catch someone modifying them in the mistaken assumption that it will alter the passed variable
- Use shorthand reference aliases for complicated objects like
array(x).foo(a,b,c).q(3)
to reduce typing and thus bug exposures and to improve performance
- Use asserts liberally:
- Live testing is easier than unit tests for legacy code
- Serves as enforced documentation
- Assert function/block pre and post conditions
- Assert class invariants are preserved after creation and modification
- Prefer natural/native C++ constructs for new and refactored code: Don't just copy and paste similar code: Ask for advice if unsure what a "native" approach might look like
- Try to avoid global/namespace data in new code where practical
- Use class owned file resources instead of adding more global files
- Don't try to do memory management with raw pointers and new/delete calls unless you really know what you are doing: Use std::unique_ptr or another, appropriate, smart pointer to manage the lifetime for you
- Don't use C style arrays: Buffer overflows are a very common source of hard to find bugs
- Don't use the scanf/printf family of functions: These are another notorious source of hard to find bugs
- Don't iterate over a collection while inserting/deleting objects into it unless you know that the operations are guaranteed not to invalidate iterators
- This varies depending on the container type so even if it works now someone may change the container type later and introduce a bug: Add a // WARNING comment
- Use assertions to test for events that should not occur for any program inputs and operations: function pre- and post-conditions, object invariants, divisions by zero, and so forth
- Use run-time if blocks to test for errors that could happen in normal program operation due to invalid inputs or system resource limitations:
- Errors should generate useful reports that include the values of relevant variables
- Use a macro function to exit from an error condition that shows you the file and line number where the exit originated
- Use const maximally: If something shouldn't be changed declare it const:
- Combine declaration with initialization where possible to allow logically const objects to be declared const
- Consider "edge cases" when developing algorithms and make sure your code handles unexpected inputs
- Eliminate all legitimate compiler warnings: If a compiler has a intrusive and rarely useful warning it is OK to suppress it if the compiler has switches for that
- Test on as many platforms and compilers as you can, especially after substantial changes
- Add unit tests for new functions and classes and make sure they pass before your code is released
- Doing memory management with raw pointers is very hard to do right and leaves holes for the next modifier of your code to add bugs: Use smart pointers and very careful, assert-checked code with lots of // WARNING comments near any place where there is an exposure to bugs
- Set a pointer to zero right after you call delete on it unless you are about to exit the destructor of the class containing the pointer
- Never allocate an object on the heap (with 'new') in one scope and pass it out as a raw pointer expecting the users to remember to delete it: Either retain lifetime control of the object in a class or smart pointer
- If you allocate an object with new in a loop/scope delete it in that same scope, otherwise you can get a memory leak like this:
X * x_p;
for ( int i = 1; i <= n; ++i ) {
x_p = new X();
// Do something with x_p
}
delete x_p; // This only cleans up the last one: The others leaked
- Use
std::shared_ptr< MyType >
for shared ownership of objects in collections: This lets you deal with membership and forget about ownership
- A class should represent a coherent entity with a well-defined and limited set of responsibilities and services
- A class should be simple enough that its valid states and invariants can be defined and tested
- Document the invariants in the top of the Class.hh
- Test a class' invariant state is preserved by every modifying operation with asserts
- Classes with invariant state should not expose their state-related data members to external modification: Instead of:
Type & x() { return x_; }
use:
void x( Type const & x_a ) { x_ = x_a; /* Check invariants here */ }
- Insulate users of your classes from implementation details:
- Data members should generally be private
- Prefer to provide services on the class' containers or provide iterators to them rather than accessors returning the container so that users of the class are less coupled to the type of container
- You don't need to provide accessors for every data member: Distinguish the external properties of a class from its internal details
- You don't need to provide a non-const accessor for every data member with a const accessor
- If your class allocates data on the heap (with 'new') you will generally need so-called "deep copying" semantics so you will have to provide a copy constructor, a destructor, and a copy assignment operator: This is sometimes called the Rule of Five (was Rule of Three before C++11 added move constructors and assignment)
- If you don't want copying you can declare the copy constructor and/or assignment operator without defining them
- Base classes should not be instantiable and should prevent slicing. If default (automatically provided) implementations of the constructor and assignment operators are appropriate it might look as below. For some classes the non-default versions would appear instead.
protected: // Constructors
Base() = default; // Default constructor
Base( Base const & ) = default; // Copy constructor
Base( Base && ) = default; // Move constructor
public: // Destructor (public and virtual)
virtual
~Base()
{}
protected: // Assignment
Base &
operator =( Base const & ) = default; // Copy assignment
Base &
operator =( Base && ) = default; // Move assignment
- Base classes should declare protected assignment operators unless you are sure you want to allow cross assignment between the derived types (this generally leads to "slicing" off of some data members)
- Base classes should have a virtual destructor (unless you know and need a rare exception to this rule)
- Limit dependencies
- Put coupling operations (those that pull in another class' Class.hh header) in your class' .cc file if inlining them isn't essential
- Use free (non-class) functions for complex multi-class operations to keep the classes decoupled
- Resource Management
- Prefer to acquire resources (heap memory, file handles, etc.) in the constructor and always release such resources in the destructor (this is sometimes called RAII)
- Never pass out a raw resource with the expectation that the using code will release it for you: Use ownership managing smart pointers and analogous "smart" handles for such purposes
- Don't expend a lot of effort optimizing performance of non-performance-critical parts of the code: The guidelines below assume that you are working on code where you know the performance can matter. But don't write something the slow way when the fast way is just as easy and clear: That is called "premature pessimation"
- Hoist as much computation out of inner loops as you can
- The C++ standard container types have very different (and sometimes nasty) performance profiles: Choose the right container type or design one that better meets your needs
- Don't stick a conditional test in the middle of a performance-critical function or loop unless you are sure there is no alternative: Sometimes a special version of a function is justified to avoid taking a performance hit
- Avoid dynamic (heap) allocation in loops and frequently called functions: This can include the construction of objects that allocate heap memory as well as explicit calls to new
- Objects that use heap allocation in high call-count functions can sometimes be made static and reused (with any necessary reinitializtion) on each call to improve performance
- ObjexxFCL Arrays of rank greater than one are row-major ordered and so are most efficiently accessed in loops where the innermost loop varies the last index
- ObjexxFCL Array linear indexing with operator[] can provide a big speed boost for performance-critical code sections at some cost to code clarity: See the ObjexxFCL documentation for details on linear indexing
- Built-in numeric and bool types are slightly more efficiently passed to functions by value rather than by reference (when the function does not need to modify the actual argument's value):
- You can declare the argument as passed by constant value to document and check at compile-time that the function does not alter the argument
- Classes with heap allocated resources should usually have user-defined move constructors and assignment operators as well as the traditional "Rule of 3" items (copy constructor, copy assignment operator, and destructor)
- Exploit C++ short-circuit conditional evaluations for performance by considering cost and true/false distribution of each conditional expression: putting inexpensive expressions leftmost is normally best