
Parameters and Return Values

amirroth edited this page Sep 8, 2021 · 11 revisions

What is there to say about parameters and return values? Quite a bit as it turns out!

As with everything else, it helps to understand how things work "under the hood" in order to motivate the recommendations.

One important piece of "under the hood" information is that the x86_64 architecture has 16 general purpose registers that can each hold a 64-bit scalar. These registers are used for local variables, intermediate results, and for passing parameters to functions and grabbing returned results. The x86_64 ABI (application binary interface) used on Linux and macOS specifies that the first six integer or pointer parameters are passed via six specific registers (doesn't matter what they are, just that there are six of them; the Windows x64 convention uses four) and the rest are passed via stack memory. The same ABI specifies that a scalar result can be returned via a register but that large objects must be returned via stack memory. Registers are faster than memory. Passing a value via the stack costs an additional instruction in both the caller and the callee, and these additional instructions are a STORE and a LOAD, which are somewhat more expensive than instructions like ADD. The goal is therefore to maximize the use of registers and minimize the use of stack memory in the function call process.

Another important piece of "under the hood" information concerns compiler optimizations. Passing parameters by pointer/reference interferes with compiler optimizations in the caller because the caller cannot keep a variable that is passed by pointer/reference in a register across a call. There is no such thing as a pointer/reference to a register and so the caller has to place the variable in memory before the call--this may not cost anything because the variable may already be somewhere in memory, but it does cost something if the variable was just a simple local variable. In addition, the callee could have changed the value of the variable and so it has to be re-LOAD-ed into a register after the call. The upshot here is that parameters should be passed by value as much as possible.
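The effect on the caller can be sketched as follows. The function names here (`scaleByValue`, `scaleByRef`, and the two loops) are hypothetical and not taken from EnergyPlus, and what a particular compiler actually emits depends on inlining and optimization level. The sketch only shows the two interface shapes being compared.

```cpp
#include <cassert>

// Hypothetical callee: input by value, result via the return value.
// The caller is free to keep x in a register before and after the call.
double scaleByValue(double x, double factor)
{
    return x * factor;
}

// Hypothetical callee: operand passed by reference and modified in place.
// Unless the call is inlined, the caller must give x a memory address and
// re-read x after the call because the callee may have changed it.
void scaleByRef(double &x, double factor)
{
    x *= factor;
}

double sumScaledByValue(int n)
{
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        double x = static_cast<double>(i);
        total += scaleByValue(x, 2.0); // x never needs an address
    }
    return total;
}

double sumScaledByRef(int n)
{
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        double x = static_cast<double>(i);
        scaleByRef(x, 2.0); // x is passed by address and read back afterwards
        total += x;
    }
    return total;
}
```

Both loops compute the same result. The difference is only in how much freedom the compiler has with `x` around the call.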

Let's get to the meat and potatoes.

Value, Reference, and Constant Reference Arguments and Return Values

There are three basic ways to pass parameters: by value, by reference, and by constant reference. Here are the rules for them.

  • Scalars that are strictly inputs to a function should be passed by value.
  • Non-scalars (i.e., objects) that are strictly inputs to the function should be passed by constant reference (i.e., const &). The two exceptions to this are std::string_view and gsl::span which have two members each and are themselves essentially references.
  • Scalars that are outputs of a function should be returned via the return value as opposed to reference (i.e., &) parameters.
  • Scalars that are outputs of a function but cannot be returned via the return value (e.g., because another scalar is already being returned) and non-scalar outputs should be passed via reference (i.e., &) parameters.
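The four rules above might look like this in practice. This is a minimal sketch: every function name is hypothetical, and `std::string_view` assumes a C++17 compiler.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Scalar input passed by value; scalar output returned via the return value.
double celsiusToKelvin(double tempC)
{
    return tempC + 273.15;
}

// Non-scalar input passed by constant reference.
double sumOf(std::vector<double> const &values)
{
    double total = 0.0;
    for (double v : values) total += v;
    return total;
}

// Exception: std::string_view is itself just a pointer and a length,
// so it is passed by value.
bool isBlank(std::string_view name)
{
    return name.empty();
}

// Two scalar outputs: one uses the return value, the other a reference
// parameter because the return value is already taken.
double minAndMax(std::vector<double> const &values, double &maxValue)
{
    assert(!values.empty()); // precondition assumed by this sketch
    double minValue = values.front();
    maxValue = values.front();
    for (double v : values) {
        if (v < minValue) minValue = v;
        if (v > maxValue) maxValue = v;
    }
    return minValue;
}

// Non-scalar output passed via a reference parameter.
void runningTotals(std::vector<double> const &values, std::vector<double> &totals)
{
    totals.clear();
    double total = 0.0;
    for (double v : values) {
        total += v;
        totals.push_back(total);
    }
}
```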

Are we done? Mostly, but there are some nuances.

  • What if you have more than six scalar input parameters? Are you better off collecting some of them into an object and passing that object by constant reference, or passing the parameters individually by value, understanding that some of them will have to be passed via the stack? This is not empirical, but my sense is you are better off collecting some of them into an object--if it makes logical sense to do so--and passing that object by reference. Compilers have no latitude to optimize the calling convention or ABI; it is what it is. They have more freedom with, and are getting better at, optimizing local object storage. This would be easier if EnergyPlus classes had better internal organization and made heavier use of "sub-objects" as opposed to being flat lists of fields. A canonical example of this is the PlantLocation struct. This is a logical group of variables, but too often they are represented as individual variables and passed to functions individually rather than as a structure by constant reference (or by reference if they need to be written). Representing these as structures more pervasively would facilitate passing them as a structure, reducing function call cost.
struct PlantLocation {
   int LoopNum;
   int LoopSideNum;
   int BranchNum;
   int CompNum;
};
  • Same question about output parameters, what if you have many of them? Should you collect them into an object and pass that single object by reference? This is actually a much easier question. If you have multiple output parameters you are already committed to returning them via memory, and so you may as well combine them into an object.
  • If you need to return multiple values, which one should you return as the return value? Should you use std::pair or std::tuple to return multiple values? This is a good discussion for a future EnergyPlus Technicalities meeting.
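The grouping idea for inputs and outputs can be sketched as follows. `PlantLocation` is modeled on the struct above. Everything else (`sameComponentFlat`, `sameComponent`, `FlowLimits`, `calcFlowLimits`, and the flow fractions) is hypothetical and exists only to show the interface shapes.

```cpp
#include <cassert>

struct PlantLocation
{
    int LoopNum;
    int LoopSideNum;
    int BranchNum;
    int CompNum;
};

// Flat interface: eight scalar inputs, which is more than the six parameter
// registers discussed earlier, so some of them travel via stack memory.
bool sameComponentFlat(int loopNumA, int loopSideNumA, int branchNumA, int compNumA,
                       int loopNumB, int loopSideNumB, int branchNumB, int compNumB)
{
    return loopNumA == loopNumB && loopSideNumA == loopSideNumB &&
           branchNumA == branchNumB && compNumA == compNumB;
}

// Grouped interface: two constant references, each of which fits in a register.
bool sameComponent(PlantLocation const &a, PlantLocation const &b)
{
    return a.LoopNum == b.LoopNum && a.LoopSideNum == b.LoopSideNum &&
           a.BranchNum == b.BranchNum && a.CompNum == b.CompNum;
}

// Hypothetical group of related outputs.
struct FlowLimits
{
    double MinFlow;
    double MaxFlow;
    double AvailFlow;
};

// Multiple outputs already go through memory, so they are combined into one
// object passed by reference. The fractions are illustrative only.
void calcFlowLimits(double designFlow, FlowLimits &limits)
{
    limits.MinFlow = 0.25 * designFlow;
    limits.MaxFlow = designFlow;
    limits.AvailFlow = 0.5 * designFlow;
}

// Usage sketch.
void example()
{
    PlantLocation const supply{1, 1, 2, 3};
    PlantLocation const demand{1, 2, 2, 3};
    assert(!sameComponent(supply, demand)); // the loop sides differ

    FlowLimits limits{};
    calcFlowLimits(8.0, limits);
    assert(limits.MaxFlow == 8.0);
}
```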

Pointer vs. Reference Arguments

Here's a religious argument for you: the pointer vs. reference argument. The C language had only pointers; references are a C++ construct. And they are probably the worst C++ construct at that. What is the difference between a reference and a pointer, you ask? Nothing except for syntactic sugar and the fact that references technically cannot be nullptr. Other than that, all a reference does is obscure the fact that something is actually a pointer. If it were up to me--and it may be--I would say that arguments should be passed by pointer rather than by reference. This would require a & in front of the argument at the call-site, making it obvious that the argument is being passed by pointer rather than by value. And it would require the use of * or -> inside the function, again making it clear that we are dealing with a parameter that was passed by address rather than by value.
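The call-site difference described above can be sketched as follows. The function names (`addGainRef`, `addGainPtr`, `accumulate`) are hypothetical.

```cpp
#include <cassert>

// Reference version: inside the function, total reads like a plain value.
void addGainRef(double &total, double gain)
{
    total += gain;
}

// Pointer version: the * makes it clear we are writing through an address.
void addGainPtr(double *total, double gain)
{
    assert(total != nullptr); // unlike a reference, a pointer can be null
    *total += gain;
}

double accumulate()
{
    double total = 0.0;
    addGainRef(total, 10.0); // looks exactly like a pass-by-value call
    addGainPtr(&total, 5.0); // the & shows that total may be modified
    return total;
}
```

The two functions do the same thing. Only the pointer version advertises at the call-site that `total` is passed by address.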

Although I am against references in general, I am particularly against this specific use of them. I am fine with local reference variables to shorten what would otherwise be long names; in fact, we don't do this enough in EnergyPlus.

ZoneData &zone = state.dataHeatBal->Zone(ZoneNum);

I am also somewhat fine with constant reference function parameters, because there is not a big logical difference between a constant reference parameter and a value parameter, but non-const reference parameters are evil in my opinion.
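A minimal sketch of these two accepted uses, a local reference alias and a constant reference parameter, follows. The types `ZoneData` and `HeatBalanceData` here are simplified hypothetical stand-ins, not the real EnergyPlus definitions, and the sketch assumes a zero-based `std::vector` rather than the array type indexed as `Zone(ZoneNum)` above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for EnergyPlus data structures.
struct ZoneData
{
    double FloorArea;
    double CeilingHeight;
};

struct HeatBalanceData
{
    std::vector<ZoneData> Zone;
};

// Constant reference parameter: the callee can read heatBal but not modify
// it, so logically this behaves much like a value parameter.
double totalVolume(HeatBalanceData const &heatBal)
{
    double total = 0.0;
    for (std::size_t zoneNum = 0; zoneNum < heatBal.Zone.size(); ++zoneNum) {
        // Local reference alias that shortens the repeated long name.
        ZoneData const &zone = heatBal.Zone[zoneNum];
        total += zone.FloorArea * zone.CeilingHeight;
    }
    return total;
}

// Usage sketch.
void example()
{
    HeatBalanceData heatBal;
    heatBal.Zone.push_back(ZoneData{10.0, 3.0});
    heatBal.Zone.push_back(ZoneData{20.0, 2.5});
    assert(totalVolume(heatBal) == 80.0);
}
```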

Optional Arguments
