I am confused because I haven't written C in a while. In C++, we would pass them as references, in order not to copy the whole struct. Does this apply to C too? Should we pass them as pointers, even if we don't want to modify them, in order to avoid copying?
In other words, for a function that checks if two structs are equal, we better do
int equal(MyRecord* a, MyRecord* b);
and decrease a bit the readability (because of pointers)
or
int equal(MyRecord a, MyRecord b);
will have the same performance?
Often, passing pointers is faster - and you'll call equal(&r1, &r2) where r1 and r2 are local struct variables. You might declare the formals to be const pointers to a const structure (this could help the optimizing compiler to generate more efficient code). You might also use the restrict keyword (if you are sure you'll never call your equal with two identical pointers, e.g. equal(&r1,&r1), i.e. without pointer aliasing).
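As a rough illustration of the pointer-to-const variant described above, here is a minimal sketch. The fields of MyRecord are an assumption made for the example, since the question never shows them, and restrict is left out to keep the sketch small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the question never shows MyRecord's fields. */
typedef struct {
    int    id;
    char   name[32];
    double balance;
} MyRecord;

/* Pointers to const: only two addresses are passed, and the compiler
   diagnoses accidental writes through a or b. */
int equal(const MyRecord *a, const MyRecord *b)
{
    assert(a != NULL && b != NULL);
    return a->id == b->id
        && strcmp(a->name, b->name) == 0
        && a->balance == b->balance;
}
```

A caller holding two local records r1 and r2 would write equal(&r1, &r2).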
However, some particular ABIs and calling conventions may mandate particular processing for some few particular structures. For example, the x86-64 ABI for Linux (and Unix SVR4) says that a struct with two pointers or integral values will be returned thru two registers. This is usually faster than modifying a memory zone with its pointer in a register. YMMV.
So to know what is faster, you really should benchmark. However, passing a large-enough struct (e.g. with at least 4 integral or pointer fields) by value is almost always slower than passing a pointer to it.
BTW, what really matters on current desktop and laptop processors is the CPU cache. Keeping frequently used data inside L1 or L2 cache will increase performance. See also this.
What is faster massively depends on the size of the struct and its use inside the called function.
If your struct is not larger than a pointer, passing by value is the best choice (less or equal amount of data needs to be copied).
If your struct is larger than a pointer, it heavily depends on the kind of access taking place inside the called function (and apparently also on ABI specifics). If many random accesses are made to the struct, it may be faster to pass by value, even though it’s larger than a pointer, because of the pointer indirection taking place inside the function.
All in all, you have to profile to figure out what’s faster, if your struct is larger than a pointer.
Passing pointers is faster, for the reasons you say yourself.
Actually, I find C more readable than C++ in this case: by passing a pointer in the call, you acknowledge that your parameters might get changed by the called function. With C++ references, you can't immediately tell that from the call alone; you also have to check the called function's prototype to see if it uses references.
A struct can be either passed/returned by value or passed/returned by reference (via a pointer) in C.
The general consensus seems to be that the former can be applied to small structs without penalty in most cases. See Is there any case for which returning a structure directly is good practice? and Are there any downsides to passing structs by value in C, rather than passing a pointer?
And that avoiding a dereference can be beneficial from both a speed and clarity perspective. But what counts as small? I think we can all agree that this is a small struct:
struct Point { int x, y; };
That we can pass by value with relative impunity:
struct Point sum(struct Point a, struct Point b) {
return (struct Point){ .x = a.x + b.x, .y = a.y + b.y };
}
And that Linux's task_struct is a large struct:
https://github.com/torvalds/linux/blob/b953c0d234bc72e8489d3bf51a276c5c4ec85345/include/linux/sched.h#L1292-1727
That we'd want to avoid putting on the stack at all costs (especially with those 8K kernel mode stacks!). But what about middling ones? I assume structs smaller than a register are fine. But what about these?
typedef struct _mx_node_t mx_node_t;
typedef struct _mx_edge_t mx_edge_t;
struct _mx_edge_t {
char symbol;
size_t next;
};
struct _mx_node_t {
size_t id;
mx_edge_t edge[2];
int action;
};
What is the best rule of thumb for determining whether a struct is small enough that it's safe to pass it around by value (short of extenuating circumstances such as some deep recursion)?
Lastly please don't tell me that I need to profile. I'm asking for a heuristic to use when I'm too lazy/it's not worth it to investigate further.
EDIT: I have two followup questions based on the answers so far:
What if the struct is actually smaller than a pointer to it?
What if a shallow copy is the desired behavior (the called function will perform a shallow copy anyway)?
EDIT: Not sure why this got marked as a possible duplicate as I actually link the other question in my question. I'm asking for clarification on what constitutes a small struct and am well aware that most of the time structs should be passed by reference.
On small embedded architectures (8/16-bitters) -- always pass by pointer, as non-trivial structures don't fit into such tiny registers, and those machines are generally register-starved as well.
On PC-like architectures (32 and 64 bit processors) -- passing a structure by value is OK provided sizeof(mystruct_t) <= 2*sizeof(mystruct_t*) and the function does not have many (usually more than 3 machine words' worth of) other arguments. Under these circumstances, a typical optimizing compiler will pass/return the structure in a register or register pair. However, on x86-32, this advice should be taken with a hefty grain of salt, due to the extraordinary register pressure a x86-32 compiler must deal with -- passing a pointer may still be faster due to reduced register spilling and filling.
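The size rule above can be turned into a compile-time check. The following is only a sketch: the macro name is made up for this example, and sizeof(void *) stands in for the size of a pointer to the struct, which is the same on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro: true when T is at most two pointers wide. */
#define SMALL_ENOUGH_FOR_BY_VALUE(T) (sizeof(T) <= 2 * sizeof(void *))

struct Point { int x, y; };

/* The middling structs from the question. */
typedef struct _mx_node_t mx_node_t;
typedef struct _mx_edge_t mx_edge_t;
struct _mx_edge_t { char symbol; size_t next; };
struct _mx_node_t { size_t id; mx_edge_t edge[2]; int action; };

/* Compile-time guard: the build fails if Point outgrows the heuristic. */
static_assert(SMALL_ENOUGH_FOR_BY_VALUE(struct Point),
              "struct Point is too large to pass by value");

/* Run-time views of the same check, for code that wants to log or test it. */
int point_is_small(void)   { return SMALL_ENOUGH_FOR_BY_VALUE(struct Point); }
int mx_node_is_small(void) { return SMALL_ENOUGH_FOR_BY_VALUE(mx_node_t); }
```

On a typical LP64 target mx_node_t works out to 48 bytes against a 16-byte limit, so by this heuristic it would be passed by pointer.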
Returning a structure by value on PC-likes, on the other hand, follows the same rule, save for the fact that when a structure is returned by pointer, the structure to be filled out should be passed in by pointer as well -- otherwise, the callee and the caller are stuck having to agree on how to manage the memory for that structure.
My experience, nearly 40 years of real-time embedded, last 20 using C; is that the best way is to pass a pointer.
In either case the address of the struct needs to be loaded, then the offset for the field of interest needs to be calculated...
When passing the whole struct, if it is not passed by reference, then:
it is not placed on the stack
it is copied, usually by a hidden call to memcpy()
it is copied to a section of memory that is now 'reserved' and unavailable to any other part of the program.
Similar considerations exist for when a struct is returned by value.
However, "small" structs that can be held completely in a working register or two are passed in those registers, especially if certain levels of optimization are used in the compile statement.
The details of what is considered 'small' depend on the compiler and the underlying hardware architecture.
Since the argument-passing part of the question is already answered, I'll focus on the returning part.
The best thing to do IMO is to not return structs or pointers to structs at all, but to pass a pointer to the 'result struct' to the function.
void sum(struct Point* result, struct Point* a, struct Point* b);
This has the following advantages:
The result struct can live either on the stack or on the heap, at the caller's discretion.
There are no ownership problems, as it is clear that the caller is responsible for allocating and freeing the result struct.
The structure could even be longer than what is needed, or be embedded in a larger struct.
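A minimal sketch of this result-pointer style follows. The const qualifiers on the inputs and the Segment type are additions made for the illustration; the prototype above uses plain pointers.

```c
#include <assert.h>
#include <stddef.h>

struct Point { int x, y; };

/* The caller owns *result; the function only fills it in. */
void sum(struct Point *result, const struct Point *a, const struct Point *b)
{
    assert(result != NULL && a != NULL && b != NULL);
    result->x = a->x + b->x;
    result->y = a->y + b->y;
}

/* Hypothetical larger struct, to show the result embedded in another object. */
struct Segment { struct Point start, end; };
```

The result can be a local variable, a heap object, or a member such as the end field of a Segment; the function does not need to know which.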
How a struct is passed to or from a function depends on the application binary interface (ABI) and the procedure call standard (PCS, sometimes included in the ABI) for your target platform (CPU/OS, for some platforms there may be more than one version).
If the PCS actually allows passing a struct in registers, this depends not only on its size but also on its position in the argument list and the types of the preceding arguments. ARM-PCS (AAPCS) for instance packs arguments into the first 4 registers until they are full and passes further data on the stack, even if that means an argument is split (all simplified; if interested, the documents are free for download from ARM).
For structs returned, if they are not passed through registers, most PCS allocate the space on the stack by the caller and pass a pointer to the struct to the callee (implicit variant). This is identical to a local variable in the caller and passing the pointer explicitly - for the callee. However, for the implicit variant, the result has to be copied to another struct, as there is no way to get a reference to the implicitly allocated struct.
Some PCS might do the same for argument structs, others just use the same mechanisms as for scalars. Either way, defer such optimizations until you really know you need them. Also read the PCS of your target platform. Remember that your code might perform even worse on a different platform.
Note: passing a struct through a global temp is not used by modern PCS, as it is not thread-safe. For some small microcontroller architectures, this might be different, however. Mostly if they only have a small stack (S08) or restricted features (PIC). But for these most times structs are not passed in registers, either, and pass-by-pointer is strongly recommended.
If it is just for immutability of the original: pass a const mystruct *ptr. Unless you cast away the const that will give a warning at least when writing to the struct. The pointer itself can also be constant: const mystruct * const ptr.
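The two const forms mentioned above might look like this in practice. The fields of mystruct and the function names are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fields; the answer only names the type. */
typedef struct { int width; int height; } mystruct;

/* Pointer to const data: the struct cannot be written through ptr. */
int area(const mystruct *ptr)
{
    assert(ptr != NULL);
    /* ptr->width = 0;   <- the compiler must diagnose this write */
    return ptr->width * ptr->height;
}

/* Const pointer to const data: ptr itself cannot be reseated either. */
int perimeter(const mystruct *const ptr)
{
    assert(ptr != NULL);
    /* ptr = NULL;       <- also diagnosed, the pointer is const */
    return 2 * (ptr->width + ptr->height);
}
```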
So: No rule of thumb; it depends on too many factors.
Really the best rule of thumb, when it comes to passing a struct as argument to a function by reference vs by value, is to avoid passing it by value.
The risks almost always outweigh the benefits.
For the sake of completeness I'll point out that when passing/returning a struct by value a few things happen:
all the structure's members are copied on the stack
if returning a struct by value, again, all members are copied from the function's stack memory to a new memory location.
the operation is error prone - if the structure's members are pointers a common error is to assume you are safe to pass the parameter by value, since you are operating on pointers - this can cause very difficult to spot bugs.
if your function modifies the value of the input parameters and your inputs are struct variables, passed by value, you have to remember to ALWAYS return a struct variable by value (I've seen this one quite a few times). Which means double the time copying the structure members.
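The pointer-member pitfall from the list above can be shown with a small sketch. The Buffer type and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with a pointer member. */
struct Buffer { char *data; size_t length; };

/* b is a copy of the caller's struct, but b.data still points at the
   caller's bytes: passing by value copies the pointer, not the data. */
void truncate_copy(struct Buffer b)
{
    assert(b.data != NULL);
    if (b.length > 0)
        b.data[0] = '\0';  /* written through the shared pointer: caller sees it */
    b.length = 0;          /* changes only the local copy: caller does not see it */
}
```

After the call, the caller's text is truncated while its length field still holds the old value, which is exactly the kind of inconsistency that is hard to spot.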
Now getting to what small enough means in terms of size of the struct - so that it's 'worth' passing it by value, that would depend on a few things:
the calling convention: what does the compiler automatically save on the stack when calling that function (usually it's the content of a few registers). If your structure members can be copied on the stack taking advantage of this mechanism then there is no penalty.
the structure member's data type: if the registers of your machine are 16 bits and your structure's members data type is 64 bit, it obviously won't fit in one register so multiple operations will have to be performed just for one copy.
the number of registers your machine actually has: assuming you have a structure with only one member, a char (8bit). That should cause the same overhead when passing the parameter by value or by reference (in theory). But there is potentially one other danger. If your architecture has separate data and address registers, the parameter passed by value will take up one data register and the parameter passed by reference will take up one address register. Passing the parameter by value puts pressure on the data registers which are usually used more than the address registers. And this may cause spills on the stack.
Bottom line - it's very difficult to say when it's ok to pass a struct by value. It's safer to just not do it :)
Note: reasons to do so one way or the other overlap.
When to pass/return by value:
The object is a fundamental type like int, double, pointer.
A binary copy of the object must be made - and object is not large.
Speed is important and passing by value is faster.
The object is conceptually a smallish numeric
struct quaternion {
long double i,j,k;
};
struct pixel {
uint16_t r,g,b;
};
struct money {
intmax_t amount;  /* "amount" is a placeholder name */
int exponent;
};
When to use a pointer to the object
Unsure if value or a pointer to value is better - so this is the default choice.
The object is large.
Speed is important and passing by a pointer to the object is faster.
Stack usage is critical. (Strictly this may favor by value in some cases)
Modifications to the passed object are needed.
Object needs memory management.
struct mystring {
char *s;
size_t length;
size_t size;
};
Notes: Recall that in C, nothing is truly passed by reference. Even passing a pointer is passed by value, as the value of the pointer is copied and passed.
I prefer passing numbers, be they int or pixel, by value, as it makes the code conceptually easier to understand. Passing numerics by address is conceptually a bit more difficult. With larger numeric objects, it may be faster to pass by address.
Objects having their address passed may use restrict to inform the function the objects do not overlap.
On a typical PC, performance should not be an issue even for fairly large structures (many dozens of bytes). Consequently other criteria are important, especially semantics: Do you indeed want to work on a copy? Or on the same object, e.g. when manipulating linked lists? The guideline should be to express the desired semantics with the most appropriate language construct in order to make the code readable and maintainable.
That said, if there is any performance impact it may not be as clear as one would think.
Memcpy is fast, and memory locality (which is good for the stack) may be more important than data size: The copying may all happen in the cache, if you pass and return a struct by value on the stack. Also, return value optimization should avoid redundant copying of local variables to be returned (which naive compilers did 20 or 30 years ago).
Passing pointers around introduces aliases to memory locations which then cannot be cached as efficiently any longer. Modern languages are often more value-oriented because all data is isolated from side effects which improves the compiler's ability to optimize.
The bottom line is yes, unless you run into problems feel free to pass by value if it is more convenient or appropriate. It may even be faster.
We do not pass structs by value, nor do we use naked pointers (gasp!) all the time and everywhere. Example.
ERR_HANDLE mx_multiply ( MX_HANDLE result, MX_HANDLE left, MX_HANDLE right ) ;
result, left and right are instances of the same (struct) type for a 2D matrix
the returned ERR_HANDLE refers to some other (error) struct type
a 'handle' is the address of the struct on the memory 'slab' pre-allocated for instances of the same type
Is this safe? Very. Is this slow? A bit slower than naked pointers.
In an abstract way, a set of data values passed to a function is a structure passed by value, albeit undeclared as such.
You can declare a function as a structure, in some cases requiring a type definition. When you do this, everything is on the stack, and that is the problem. By putting your data values on the stack, they become vulnerable to overwriting if a function or sub is called with parameters before you use or copy the data elsewhere. It is best to use pointers and classes.
Although the subject has been discussed many times, I haven't found a satisfying answer so far. When should a function hand data back via its return value, and when should it take a reference and change the data at that address? The classic answer is to pass a variable by reference when it becomes large (to avoid stack copying). This looks true for anything like a structure or array. However, returning a pointer from a function is not uncommon. In fact, some functions from the C library do exactly this. For example:
char *strcat(char *dst, const char *src);
Always returns a pointer to destination even in case of an error. In this case we can just use the passed variable and leave the return for what it is (as most do).
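Because strcat returns its destination, calls can be nested. The helper below is a hypothetical sketch of that style; greet and its capacity parameter are not from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "Hello, <name>!" into dst and returns dst,
   or NULL if dst (capacity bytes) is too small. */
char *greet(char *dst, size_t capacity, const char *name)
{
    assert(dst != NULL && name != NULL);
    /* sizeof "Hello, !" counts the fixed characters plus the terminator. */
    if (strlen(name) + sizeof "Hello, !" > capacity)
        return NULL;
    dst[0] = '\0';
    /* Each strcat returns dst, so the result feeds the next call. */
    return strcat(strcat(strcat(dst, "Hello, "), name), "!");
}
```

Returning dst costs nothing and lets the caller use the function directly inside an expression, while callers that do not need it can ignore the return value.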
When looking at structures I see the same thing happening. I often return pointers when functions only need to be used in variable initialization.
char *p = func(i, s);
Then there is the argument that copying variables on the stack is expensive, and so to use pointers instead. But as mentioned here some compilers are able to decide this themselves (assuming this goes for C as well). Is there a general rule, or at least some unwritten convention when to use one or the other? I value performance above design.
Start by deciding which approach makes the most sense at the logical level, irrespective of what you think the performance implications might be. If returning a struct by value most clearly conveys the intent of the code, then do that.
This isn't the 1980s anymore. Compilers have gotten a lot smarter since then and do a really good job of optimizing code, especially code that's written in a clear, straightforward manner. Similarly, parameter passing and value return conventions have become fairly sophisticated as well. The simplistic stack-based model doesn't really reflect the reality of modern hardware.
If the resulting application doesn't meet your performance criteria, then run it through a profiler to find the bottlenecks. If it turns out that returning that struct by value is causing a problem, then you can experiment with passing by reference to the function.
Unless you're working in a highly constrained, embedded environment, you really don't have to count every byte and CPU cycle. You don't want to be needlessly wasteful, but by that same token you don't want to obsess over how things work at the low level unless a) you have really strict performance requirements and b) you are intimately familiar with the details of your particular platform (meaning that you not only know your platform's function calling conventions inside and out, you know how your compiler uses those conventions as well). Otherwise, you're just guessing. Let the compiler do the hard work for you. That's what it's there for.
Rules of thumb:
If sizeof(return type) is bigger than sizeof(int), you should probably pass it by pointer to avoid the copy overhead. This is a performance issue. There's some penalty for dereferencing the pointer, so there are some exceptions to this rule.
If the return type is complex (containing pointer members), pass it by pointer. Copying the local return value to the stack will not copy dynamic memory, for example.
If you want the function to allocate the memory, it should return a pointer to the newly allocated memory. It's called the factory design pattern.
If you have more than one thing you want to return from a function - return one by value, and pass the rest by pointers.
If you have a complex/big data type which is both input and output, pass it by pointer.
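Two of these rules, sketched with made-up names: divide returns a status by value and delivers its two results through pointers, and point_new is a factory that returns newly allocated memory the caller must free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct Point { int x, y; };

/* One thing by value (the status), the rest through pointers.
   Returns 0 on success, -1 if den is zero (outputs are left untouched). */
int divide(int num, int den, int *quotient, int *remainder)
{
    assert(quotient != NULL && remainder != NULL);
    if (den == 0)
        return -1;
    *quotient  = num / den;
    *remainder = num % den;
    return 0;
}

/* Factory: allocates the object and returns a pointer to it, or NULL
   if malloc fails. Ownership passes to the caller. */
struct Point *point_new(int x, int y)
{
    struct Point *p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}
```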
This is common in functional languages, especially with TCO. I was just wondering if it provided any performance benefits besides being easier to write and keep track of. Is it just as fast to access the variables in the struct as it is to access them if they were just normal arguments? Are there any cons to this method?
There is no benefit, because structs are passed by value. Passing multiple arguments one by one will take the same amount of allocations from a running program as the allocation of a struct. Moreover, struct may give you worse results because of padding.
Even if you pass your struct by pointer, you would still need to allocate a new instance of your struct before passing it to the next level of invocation. Theoretically, you could get some benefit by reusing a struct that you have allocated once in multiple invocations, but in most cases that would be a micro-optimization not worth your trouble (unless your profiler indicates otherwise).
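To make the three options concrete, here is a sketch using an accumulator-style Fibonacci as the recursive function. The function and struct names are invented for the example, and it says nothing about which variant is fastest on a given platform.

```c
#include <assert.h>
#include <stddef.h>

struct FibState { unsigned n; unsigned long long a, b; };

/* Variant 1: separate arguments. */
unsigned long long fib_args(unsigned n, unsigned long long a, unsigned long long b)
{
    return n == 0 ? a : fib_args(n - 1, b, a + b);
}

/* Variant 2: struct by value; a fresh struct is built for every level. */
unsigned long long fib_struct(struct FibState s)
{
    if (s.n == 0)
        return s.a;
    struct FibState next = { s.n - 1, s.b, s.a + s.b };
    return fib_struct(next);
}

/* Variant 3: one struct reused by every level through a pointer.
   The caller's struct is modified in place. */
unsigned long long fib_shared(struct FibState *s)
{
    assert(s != NULL);
    if (s->n == 0)
        return s->a;
    unsigned long long sum = s->a + s->b;
    s->a = s->b;
    s->b = sum;
    s->n -= 1;
    return fib_shared(s);
}
```

All three compute the same value when started with a = 0 and b = 1; only the third avoids creating new argument storage per level, at the price of mutating the caller's struct.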
Recursion by itself already involves many stack operations, which reduce performance. Moreover, passing arguments to recursive functions makes things even more irregular and confusing for the developer, so you do it at your own risk.
Cons:
If you have __fastcall available, then recursive function could get performance boost by passing multiple arguments via registers. Number of general purpose registers available for this can differ from platform to platform. All the extra arguments are passed via stack.
Chances are good that stack hosted arguments are loaded into registers at the very beginning of the function for calculations. So, every stack passed argument would require at least one memory access. If you packed everything into a structure and passed its pointer, then every member access would also generate at least one access to memory. No real benefit here.
Passing multiple arguments by value you have freedom to change them as you please. With structure members you either make temporary copies to use in calculations (effectively repeating passing multiple arguments) or compiler will write modified values back to where the structure instance sits. This can produce needless overhead.
Pros:
I would pack output-type arguments into a structure. This lowers the number of arguments and makes the function prototype easier to grasp, because it is human nature to think in terms of entities with features.
I have a function and I'm accessing a struct's members a lot of times in it.
What I was wondering about is what is the good practice to go about this?
For example:
struct s
{
int x;
int y;
};
and I have allocated memory for 10 objects of that struct using malloc.
So, whenever I need to use only one of the object in a function, I usually create (or is passed as argument) pointer and point it to the required object (My superior told me to avoid array indexing because it adds a calculation when accessing any member of the struct)
But is this the right way? I understand that dereferencing is not as expensive as creating a copy, but what if I'm dereferencing a number of times (like 20 to 30) in the function.
Would it be better if I created temporary variables for the struct members (only the ones I need, I certainly don't use all the members), copied the values over, and then set the actual struct's values before returning?
Also, is this unnecessary micro optimization? Please note that this is for embedded devices.
This is for an embedded system. So, I can't make any assumptions about what the compiler will do. I can't make any assumptions about word size, or the number of registers, or the cost of accessing off the stack, because you didn't tell me what the architecture is. I used to do embedded code on 8080s when they were new...
OK, so what to do?
Pick a real section of code and code it up. Code it up each of the different ways you have listed above. Compile it. Find the compiler option that forces it to print out the assembly code that is produced. Compile each piece of code with every different set of optimization options. Grab the reference manual for the processor and count the cycles used by each case.
Now you will have real data on which to base a decision. Real data is much better than the opinions of a million highly experienced expert programmers. Sit down with your lead programmer and show him the code and the data. He may well show you better ways to code it. If so, recode it his way, compile it, and count the cycles used by his code. Show him how his way worked out.
At the very worst you will have spent a weekend learning something very important about the way your compiler works. You will have examined N ways to code things times M different sets of optimization options. You will have learned a lot about the instruction set of the machine. You will have learned how good, or bad, the compiler is. You will have had a chance to get to know your lead programmer better. And, you will have real data.
Real data is the kind of data that you must have to answer this question. Without that data, nothing anyone tells you is anything but an ego based guess. Data answers the question.
Bob Pendleton
First of all, indexing an array is not very expensive (only like one operation more expensive than a pointer dereference, or sometimes none, depending on the situation).
Secondly, most compilers will perform what is called RVO or return value optimisation when returning structs by value. This is where the caller allocates space for the return value of the function it calls, and secretly passes the address of that memory to the function for it to use, and the effect is that no copies are made. It does this automatically, so
struct mystruct blah = func();
Only constructs one object, passes it to func for it to use transparently to the programmer, and no copying need be done.
What I do not know is if you assign an array index the return value of the function, like this:
someArray[0] = func();
will the compiler pass the address of someArray[0] and do RVO that way, or will it just not do that optimisation? You'll have to get a more experienced programmer to answer that. I would guess that the compiler is smart enough to do it though, but it's just a guess.
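For comparison, here is a sketch of the by-value return next to a hand-written out-parameter version. The contents of struct mystruct are assumed, and func_into is only an analogue of the hidden-pointer mechanism described above, not what any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical contents; the answer only names the type. */
struct mystruct { int values[8]; };

/* Returns the struct by value. */
struct mystruct func(void)
{
    struct mystruct m;
    for (size_t i = 0; i < 8; i++)
        m.values[i] = (int)(i * i);
    return m;
}

/* Hand-written analogue of the hidden pointer: the caller supplies the
   storage and the function fills it in directly. */
void func_into(struct mystruct *out)
{
    assert(out != NULL);
    for (size_t i = 0; i < 8; i++)
        out->values[i] = (int)(i * i);
}
```

Writing someArray[0] = func(); and func_into(&someArray[1]); must leave both elements with identical contents; whether the first form involves an extra copy is up to the compiler and the ABI.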
And yes, I would call it micro optimisation. But we're C programmers. And that's how we roll.
Generally, the case in which you want to make a copy of a passed struct in C is if you want to manipulate the data in place. That is to say, have your changes not be reflected in the struct itself but rather only in the return value. As for which is more expensive, it depends on a lot of things, many of which change from implementation to implementation, so I would need more specific information to be more helpful. Though I would expect that in an embedded environment your memory is at a greater premium than your processing power. Really this reads like needless micro optimization; your compiler should handle it.
In this case creating temp variable on the stack will be faster. But if your structure is much bigger then you might be better with dereferencing.
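The two styles under discussion, sketched on the struct s from the question. The function names and the loop body are invented for the example. An optimizing compiler will often turn the first form into the second by itself when it can prove nothing else aliases *p, so measure before relying on either.

```c
#include <assert.h>
#include <stddef.h>

struct s { int x; int y; };

/* Dereferences the pointer on every access. */
void step_direct(struct s *p, int n)
{
    assert(p != NULL);
    for (int i = 0; i < n; i++) {
        p->x += p->y;
        p->y += 1;
    }
}

/* Copies the members into locals, works on them, writes back once. */
void step_cached(struct s *p, int n)
{
    assert(p != NULL);
    int x = p->x;
    int y = p->y;
    for (int i = 0; i < n; i++) {
        x += y;
        y += 1;
    }
    p->x = x;
    p->y = y;
}
```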
Dear all. I was wondering if there are examples of situations where you would purposefully pass an argument by value in C. Let me rephrase. When do you purposefully use C's pass-by-value for large objects? Or, when do you care that the object argument is fully copied in a local variable?
EDIT: Now that I think about it, if you can avoid pointers, then do. Nowadays, "deep" copying is possible for mostly everything in small apps, and shallow copying is more prone to pointer bugs. Maybe.
In C (sans const references), you pass by value for 3 reasons.
You don't want the source to be modified by the receiving function outside of its context. This is (was) the standard reason taught in school as to why to pass by value.
Passing by value is cheaper if the value fits within the architecture's register - or possibly registers if the compiler is very intelligent. Passing by value means no pointer creation and no dereference to get at the value being passed in. A small gain, but it does add up in certain circumstances.
Passing by value takes less typing. A weak reason to be sure, but there it is.
The const keyword negates most of reason 1, but reason 2 still has merit and is the main reason I pass by value.
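A small sketch of reason 1, contrasting the two styles on a two-int struct. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct Point { int x, y; };

/* By value: p is a private copy, so the caller's object is untouched. */
struct Point moved(struct Point p, int dx, int dy)
{
    p.x += dx;
    p.y += dy;
    return p;
}

/* By pointer: the caller's object itself is changed. */
void move_in_place(struct Point *p, int dx, int dy)
{
    assert(p != NULL);
    p->x += dx;
    p->y += dy;
}
```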
Well, for one thing, if you want to change it.
Imagine the following contrived function:
int getNextCharAndCount (char *pCh);
Each time you call it, it returns the next most frequent character from a list by returning the count from the function and setting a character by way of the character pointer.
I'm having a hard time finding another use case which would require the pointer if you only ever wanted to use (but not change) the underlying character. That doesn't mean one doesn't exist of course :-)
In addition, I'm not sure what you're discussing is deep/shallow copy. That tends to apply to structures with pointers where a shallow copy just duplicates the top level while a deep copy makes copies of all levels.
What you're referring to is pass-by-value and pass-by-reference.
Passing by-reference is cheaper because you don't have to create a local copy of an object. If the function needs a local copy (for any purpose) - that could be a case.
I follow as a rule:
pass built-in types by value (int, char, double, float...)
pass classes and structs by (const) reference. There is no pointer handling involved whatsoever.
Never had any problems with this way of work.
If we're going to be pedantic about this, everything in C is pass-by-value. You may pass a pointer by value instead of passing the actual object by value, but it's still pass-by-value.
Anyway, why pass an entire object instead of a pointer to an object? Well, for one, your compiler may be able to optimize the call such that underneath the covers only an address is copied. Also/Alternatively, once you introduce pointers, your compiler may not be able to do as much optimization of your function because of aliasing. It's also less error prone to not have to remember to dereference. The caller can also be sure that what he passed in is not modified (const doesn't really guarantee this, it can be -dangerously- cast away)
I don't think your argument about chars holds water. Even though your char is conceptually 1 byte, each argument to a function call typically translates to a whole (word-sized) register and to the same amount of space on the stack for efficiency.
You can pass a whole struct on the stack as an argument if you really want to (and, I believe, return them as well). It's a way of avoiding both allocating memory and having to worry about pointer hygiene.
Depending on how the call stack is built the char and char* may take the same amount of space. It is generally better to have values aligned on word boundaries. The cost of accessing a 32 bit pointer on a word boundary may be significantly lower than accessing it on a non-word boundary.
Passing by value is safer if you don't want the value modified. Passing by reference can be dangerous. Consider passing by reference
const int ONE = 1;
increment((int *)&ONE);   /* casts away const; writing through it is undefined behavior */
printf("%d\n", ONE);
Output is 2 if the constant was modified.