Pointers and execution speed in C

Some blogs and sites claim that pointers are beneficial, one reason given being that "execution speed" will be better in a program with pointers than without. What I can work out is this:
Dereferencing a location requires two (or more) memory accesses, depending on the level of indirection. This should increase execution time compared with accessing the variable directly.
Passing a pointer to a large datatype, such as a structure, to a function can be beneficial, since only the address of the structure/union is copied rather than the whole object being passed by value. So it should be faster in this case.
For example, forcibly introducing pointers where none are needed:
int a, b, *p, *q, c, *d;
p = &a;
q = &b;
d = &c;
// get values into a and b
*d = *p + *q; // why would this be faster
c = a + b;    // than this line?
I checked the assembler output using gcc -S -masm=intel file.c. The pointer version does a lot more loading and storing for the dereferences than the direct method.
Am I missing something?
Note: the question is not about this specific code; the code is just an example. Compiler optimizations are not being considered.

I think your conclusions are basically right. The author did not mean that using more pointers will always speed up all code. That's obviously nonsense.
But there are times when it is faster to pass a pointer to data instead of copying that data.

As you pointed out, passing a pointer to a large datatype to a function is the beneficial case; in your example the data is an int, so it's hardly large. BTW, I'd guess gcc will optimize away the pointer accesses entirely when you use -O2.
Apart from that your understanding is correct.

You are right in your example - that code would run slower. One place where it can be faster is when making a function call:
void foo( Object Obj );
void bar( const Object * pObj );

int main()
{
    Object theObject;
    foo( theObject );  // Creates a copy of theObject, which is then used inside the function.
    bar( &theObject ); // Copies only the memory address; the function works on the original object.
    return 0;
}
bar is faster as we don't need to copy the entire object (assuming the object is more than just a base data type). Most people would use a reference rather than a pointer in this instance, however.
void foobar( const Object & Obj );

Mark Byers is absolutely right. You cannot judge the power of pointers in such a simple program. They are used to optimize memory management and to speed up programs that make heavy use of data structures, where references are done through addresses.
Consider that when you start a program it takes some time to load; with efficient use of pointers and some skill, if the program loads even one second earlier, that's a large accomplishment.

How pointer chasing works in this benchmark

I'm trying to understand how the pointer chasing works (basically, the access pattern) in the following benchmark:
https://github.com/google/multichase/blob/master/multichase.c
For the simple chase:
static void chase_simple(per_thread_t *t) {
    void *p = t->x.cycle[0];      // <-- here p points to the initial address
    do {
        x200(p = *(void **)p;)    // <-- didn't get this statement; is x200 some compiler built-in?
    } while (__sync_add_and_fetch(&t->x.count, 200));
    // we never actually reach here, but the compiler doesn't know that
    t->x.dummy = (uintptr_t)p;
}
Where and how in the code is the access pattern made unpredictable enough that the H/W prefetcher can't guess it?
I'd assume x200 is a CPP macro that repeats its operand 200 times, for loop unrolling. Read x200 as "times 200". Possible implementation:
#define x5(x) x x x x x
...
#define x200(x) x100(x) x100(x)
That would make sense to amortize the cost of the atomic RMW to increment t->x.count.
You can check by looking at C preprocessor output, e.g. gcc -E, or search for the macro definition in the included files.
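For illustration, one plausible way the full chain could be built (hypothetical; the actual definitions live in the multichase sources):
#define x5(x)   x x x x x
#define x10(x)  x5(x) x5(x)
#define x50(x)  x10(x) x10(x) x10(x) x10(x) x10(x)
#define x100(x) x50(x) x50(x)
#define x200(x) x100(x) x100(x)
With these, x200(p = *(void **)p;) pastes 200 copies of the statement into the loop body.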
The actual statement being repeated,
p = *(void **)p;, is just dereferencing a pointer to get another pointer. Often you'd have p = p->next; if you had struct foo *p, but with just a void* you need a cast to keep the compiler happy and get an assembly instruction like mov rax, [rax].
Without a struct type that can contain a member that's a pointer to the same struct type, dereferencing a pointer is going to change the type. C doesn't have a way to declare an infinitely-indirect pointer that just produces the same type when you dereference it.
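For comparison, here is what the same chase looks like with a self-referential struct; node and head are names invented for this sketch, not taken from multichase:
struct node { struct node *next; };

void chase(struct node *head, long laps) {
    struct node *p = head;
    while (laps--)
        p = p->next;   /* compiles to the same mov rax, [rax] style load */
}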
Where and how in the code is the access pattern made unpredictable enough that the H/W prefetcher can't guess it?
Nowhere in this function; it's just iterating through an existing linked list.
Look for the code that allocates the space and stores pointers into it. It's probably allocated as one large array, so the code has control over the layout, unlike many small mallocs. A random shuffling of the integers 0..n-1 could be transformed into an array of pointers; perhaps you need to invert that mapping to make sure there aren't short cycles. Or there may be some simpler way to generate a linked list that covers all slots in a random order, with the last element pointing to the first so it's a closed loop that can be iterated as many times as you want; see the sketch below.
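A sketch of that last idea (this is not multichase's actual code): shuffle a visit order with Fisher-Yates, then chain the slots in that order so they form one closed cycle of length n, making short cycles impossible:
#include <stdlib.h>

void **make_cycle(size_t n) {
    void **slots = malloc(n * sizeof *slots);
    size_t *order = malloc(n * sizeof *order);
    for (size_t i = 0; i < n; i++)
        order[i] = i;
    for (size_t i = n - 1; i > 0; i--) {        /* Fisher-Yates shuffle */
        size_t j = (size_t)rand() % (i + 1);
        size_t t = order[i]; order[i] = order[j]; order[j] = t;
    }
    for (size_t k = 0; k < n; k++)              /* chain the visit order */
        slots[order[k]] = &slots[order[(k + 1) % n]];
    free(order);
    return slots;    /* start chasing with: void *p = slots[0]; */
}
Because the slots are chained in one random permutation order, following p = *(void **)p visits every slot once per lap, in an order the hardware prefetcher cannot predict.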

Difference between &data[0] and data

When I want to pass an array to a function, I don't know which form to choose.
void myFunction(int* data);
Is there a difference, or a preferred style, between these two calls:
myFunction(&data[0]);
Or
myFunction(data);
There is no difference. Arrays ("proper" arrays) automatically decay to pointers to their first element.
For example, let's say you have
int my_array[10];
then using plain my_array will automatically decay to a pointer to its first element, which is &my_array[0].
It is this array-to-pointer decay that allows you to use both pointer arithmetic and array indexing for both arrays and pointers. For the array above my_array[i] is exactly equal to *(my_array + i). This equivalence also exists for pointers:
int *my_pointer = my_array; // Make my_pointer point to the first element of my_array
Then my_pointer[i] is also exactly equal to *(my_pointer + i).
For curiosity (and something you should never do in real programs), thanks to the commutative property of addition an expression such as *(my_array + i) will also be equal to *(i + my_array) which is then equal to i[my_array].
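A quick runnable check of these equivalences (including the i[my_array] curiosity):
#include <assert.h>

int main(void) {
    int my_array[10] = {0};
    my_array[2] = 42;
    /* all four spellings denote the same element */
    assert(my_array[2] == *(my_array + 2));
    assert(*(2 + my_array) == 2[my_array]);
    return 0;
}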
An array, when passed as an argument to a function, automatically decays to a pointer to its first element, so passing either data or &data[0] to the function is exactly equivalent.
From a readability standpoint I would opt for the former. It makes it clear to the reader that the function is potentially operating on the entire array and not just on one element.
Apart from the obvious (data being shorter than &data[0]; therefore easier to write and to read), there's no difference.
Think about what &data[0] means:
It's a pointer to data[0].
And data[0] just means *(data+0), i.e. *data.
A pointer to *data is simply data.
data is a pointer to the beginning of the array.
&data[0] is the address of the first element of the array.
When reading code, the first option is more readable to most people, and I suppose it is the way most programmers will, and should, choose.
There isn't any difference, as both point to the same starting location of the array, i.e. the address of data[0].
I would just use my_function(data) because why make it more confusing than it has to be?
If for some reason you needed to find the memory address of a single element somewhere in the middle of data, then my_function(&data[17]) might possibly be warranted, but there are probably better ways to handle that case too.
In general, if you have to manually pick out single pieces of data like that, you are probably not doing it in a very good way.
There are rare cases where it can make sense (like if you are parsing data from some other source and you ALWAYS, 100% of the time, only care about the 17th byte)... but that's not usually the case.
Consider the following:
As your code evolves and you make changes, you will probably also slightly change your data structures. data[17] might no longer be the magical byte you need; now it might be data[18]. If you hard-coded data[17] in 100 or 1000 different places in your code, you will have to go change them all by hand and hope it doesn't cause any new bugs. Also... portability issues.
Instead, design functions that can find and return whatever data you need from your data structures without any hard-coded indices. They will still work (if designed properly) as your code evolves, and will be 1000 times more portable.
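A sketch of what that might look like; Packet and payload_type are names invented for illustration:
typedef struct {
    char header[17];
    char payload_type;   /* the byte formerly hard-coded as data[17] */
} Packet;

char get_payload_type(const Packet *p) {
    return p->payload_type;   /* still correct if the layout changes */
}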
No difference. When coerced into a pointer, an array (data) decays into a pointer to its first element (&data[0]).
Remember that data[0] simply means *(data+0), so &data[0] is equivalent to &*(data+0), which simplifies to data (because &* cancels out).
Demo:
#include <stdio.h>

int main(void) {
    int data[2];
    printf("%p\n", (void*)data);
    printf("%p\n", (void*)&*(data+0));
    printf("%p\n", (void*)&data[0]);
    return 0;
}
Output:
$ gcc -Wall -Wextra -pedantic a.c -o a && a
0x3c2180f4fa0
0x3c2180f4fa0
0x3c2180f4fa0
I always advise taking a general approach.
Just consider the function
void myFunction( char* data);
where the parameter has the type char * instead of int *.
And now let's assume that you want to pass a string literal to the function.
It can be done either like
myFunction( "Hello" );
or like
myFunction( &"Hello"[0] );
It is evident that the first approach is more clear and readable.
So I prefer to use the first approach. :)
In fact, an expression such as
&data[i]
is syntactically redundant. It looks like
&( *( data + i ) )
which is equivalent to just
data + i
When i is equal to 0 you have
data + 0
which in expressions is equivalent (leaving aside, for example, the sizeof operator) to
data
So use data instead of &data[0].

Why is there a need for a pointer to an array? [duplicate]

This question goes out to the C gurus out there:
In C, it is possible to declare a pointer as follows:
char (* p)[10];
.. which basically states that this pointer points to an array of 10 chars. The neat thing about declaring a pointer like this is that you will get a compile-time error if you try to assign a pointer to an array of a different size to p. It will also give you a compile-time error if you try to assign the value of a simple char pointer to p. I tried this with gcc and it seems to work with ANSI, C89 and C99.
It looks to me like declaring a pointer like this would be very useful - particularly, when passing a pointer to a function. Usually, people would write the prototype of such a function like this:
void foo(char * p, int plen);
If you were expecting a buffer of a specific size, you would simply test the value of plen. However, you cannot be guaranteed that the person who passes p to you will really give you plen valid memory locations in that buffer. You have to trust that the caller of this function is doing the right thing. On the other hand:
void foo(char (*p)[10]);
..would force the caller to give you a buffer of the specified size.
This seems very useful, but I have never seen a pointer declared like this in any code I have ever run across.
My question is: Is there any reason why people do not declare pointers like this? Am I not seeing some obvious pitfall?
What you are saying in your post is absolutely correct. I'd say that every C developer comes to exactly the same discovery and exactly the same conclusion when (if) they reach a certain level of proficiency with the C language.
When the specifics of your application area call for an array of specific fixed size (array size is a compile-time constant), the only proper way to pass such an array to a function is by using a pointer-to-array parameter
void foo(char (*p)[10]);
(in C++ language this is also done with references
void foo(char (&p)[10]);
).
This will enable language-level type checking, which will make sure that the array of exactly correct size is supplied as an argument. In fact, in many cases people use this technique implicitly, without even realizing it, hiding the array type behind a typedef name
typedef int Vector3d[3];
void transform(Vector3d *vector);
/* equivalent to `void transform(int (*vector)[3])` */
...
Vector3d vec;
...
transform(&vec);
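To see what the type checking buys, a small sketch (demo is a name invented here; it reuses the transform declaration above, and note the second call is rejected at compile time):
void demo(void) {
    int v3[3];
    int v4[4];
    transform(&v3);   /* OK: &v3 has type int (*)[3] */
    transform(&v4);   /* compile-time error: int (*)[4] is incompatible */
}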
Note additionally that the above code is invariant with respect to whether Vector3d is an array or a struct. You can switch the definition of Vector3d at any time from an array to a struct and back, and you won't have to change the function declaration. In either case the functions will receive an aggregate object "by reference" (there are exceptions to this, but within the context of this discussion this is true).
However, you won't see this method of array passing used explicitly too often, simply because too many people get confused by the rather convoluted syntax and are simply not comfortable enough with such features of the C language to use them properly. For this reason, in everyday practice, passing an array as a pointer to its first element is a more popular approach. It just looks "simpler".
But in reality, using the pointer to the first element for array passing is a very niche technique, a trick, which serves a very specific purpose: its one and only purpose is to facilitate passing arrays of different size (i.e. run-time size). If you really need to be able to process arrays of run-time size, then the proper way to pass such an array is by a pointer to its first element with the concrete size supplied by an additional parameter
void foo(char p[], unsigned plen);
Actually, in many cases it is very useful to be able to process arrays of run-time size, which also contributes to the popularity of the method. Many C developers simply never encounter (or never recognize) the need to process a fixed-size array, thus remaining oblivious to the proper fixed-size technique.
Nevertheless, if the array size is fixed, passing it as a pointer to an element
void foo(char p[])
is a major technique-level error, which unfortunately is rather widespread these days. A pointer-to-array technique is a much better approach in such cases.
Another reason that might hinder the adoption of the fixed-size array-passing technique is the dominance of a naive approach to typing dynamically allocated arrays. For example, if the program calls for fixed arrays of type char[10] (as in your example), an average developer will malloc such arrays as
char *p = malloc(10 * sizeof *p);
This array cannot be passed to a function declared as
void foo(char (*p)[10]);
which confuses the average developer and makes them abandon the fixed-size parameter declaration without giving it a further thought. In reality though, the root of the problem lies in the naive malloc approach. The malloc format shown above should be reserved for arrays of run-time size. If the array type has compile-time size, a better way to malloc it would look as follows
char (*p)[10] = malloc(sizeof *p);
This, of course, can be easily passed to the above declared foo
foo(p);
and the compiler will perform the proper type checking. But again, this is overly confusing to an unprepared C developer, which is why you won't see it too often in "typical" everyday code.
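Putting the pieces together, a minimal self-contained sketch of the fixed-size technique (fill is a function name invented here):
#include <stdlib.h>
#include <string.h>

void fill(char (*p)[10]) {            /* accepts only char[10] buffers */
    memset(*p, 'x', sizeof *p);       /* sizeof *p == 10, known statically */
}

int main(void) {
    char stack_buf[10];
    char (*heap_buf)[10] = malloc(sizeof *heap_buf);
    fill(&stack_buf);                 /* note the & on the array */
    fill(heap_buf);                   /* heap pointer already has the right type */
    free(heap_buf);
    return 0;
}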
I would like to add to AndreyT's answer (in case anyone stumbles upon this page looking for more info on this topic):
As I began to play more with these declarations, I realized that there is a major handicap associated with them in C (though apparently not in C++). It is fairly common to want to give a caller a const pointer to a buffer you have written into. Unfortunately, this is not possible when declaring a pointer like this in C. In other words, the C standard (6.7.3, paragraph 8) is at odds with something like this:
int array[9];
const int (* p2)[9] = &array; /* Not legal unless array is const as well */
This constraint does not seem to be present in C++, making these types of declarations far more useful there. But in the case of C, it is necessary to fall back to a regular pointer declaration whenever you want a const pointer to the fixed-size buffer (unless the buffer itself was declared const to begin with).
This is a severe constraint in my opinion, and it could be one of the main reasons why people do not usually declare pointers like this in C. The other is the fact that most people do not even know that you can declare a pointer like this, as AndreyT pointed out.
The obvious reason is that this code doesn't compile:
extern void foo(char (*p)[10]);

void bar() {
    char p[10];
    foo(p);
}
By default, an array passed as an argument decays to a plain pointer to its first element, which does not match the pointer-to-array type.
Using foo(&p) instead works, as shown below.
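That is, the corrected caller looks like this:
extern void foo(char (*p)[10]);

void bar(void) {
    char p[10];
    foo(&p);   /* &p has type char (*)[10]; plain p would decay to char * */
}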
I also want to use this syntax to enable more type checking.
But I also agree that the syntax and mental model of using pointers is simpler, and easier to remember.
Here are some more obstacles I have come across.
Accessing the array requires using (*p)[]:
void foo(char (*p)[10])
{
    char c = (*p)[3];
    (*p)[0] = 1;
}
It is tempting to use a local pointer-to-char instead:
void foo(char (*p)[10])
{
    char *cp = (char *)p;
    char c = cp[3];
    cp[0] = 1;
}
But this would partially defeat the purpose of using the correct type.
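For what it's worth, a variant with the same effect but no cast is to dereference first and let the array decay:
void foo(char (*p)[10])
{
    char *cp = *p;   /* *p is a char[10], which decays to char * */
    char c = cp[3];
    cp[0] = 1;
}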
One has to remember to use the address-of operator when assigning an array's address to a pointer-to-array:
char a[10];
char (*p)[10] = &a;
The address-of operator gets the address of the whole array in &a, with the correct type to assign it to p. Without the operator, a is automatically converted to the address of the first element of the array, same as in &a[0], which has a different type.
Since this automatic conversion is already taking place, I am always puzzled that the & is necessary. It is consistent with the use of & on variables of other types, but I have to remember that an array is special and that I need the & to get the correct type of address, even though the address value is the same.
One reason for my problem may be that I learned K&R C back in the 80s, which did not allow using the & operator on whole arrays yet (although some compilers ignored that or tolerated the syntax). Which, by the way, may be another reason why pointers-to-arrays have a hard time to get adopted: they only work properly since ANSI C, and the & operator limitation may have been another reason to deem them too awkward.
When typedef is not used to create a type for the pointer-to-array (in a common header file), then a global pointer-to-array needs a more complicated extern declaration to share it across files:
fileA:
char (*p)[10];
fileB:
extern char (*p)[10];
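A typedef (Buf10 is a name invented here) makes both declarations easier to read:
typedef char Buf10[10];   /* in the common header */

Buf10 *p;                 /* fileA */
extern Buf10 *p;          /* fileB */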
Well, simply put, C doesn't do things that way. An array of type T is passed around as a pointer to the first T in the array, and that's all you get.
This allows for some cool and elegant algorithms, such as looping through the array with expressions like
*dst++ = *src++
The downside is that management of the size is up to you. Unfortunately, failure to do this conscientiously has also led to millions of bugs in C coding, and/or opportunities for malevolent exploitation.
What comes close to what you ask for in C is to pass around a struct (by value) or a pointer to one (by reference). As long as the same struct type is used on both sides of this operation, both the code that hands out the reference and the code that uses it agree about the size of the data being handled.
Your struct can contain whatever data you want; it could contain your array of a well-defined size.
Still, nothing prevents you or an incompetent or malevolent coder from using casts to fool the compiler into treating your struct as one of a different size. The almost unshackled ability to do this kind of thing is a part of C's design.
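A sketch of the struct approach (Buffer10 is an invented name):
typedef struct { char bytes[10]; } Buffer10;

void consume_copy(Buffer10 b);       /* by value: all 10 bytes are copied */
void consume_ref(const Buffer10 *b); /* by reference: size is fixed by the type */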
You can declare an array of characters a number of ways:
char p[10];
char* p = (char*)malloc(10 * sizeof(char));
The prototype of a function that takes the array as a pointer (the pointer itself is copied) is:
void foo(char* p); // cannot modify the caller's pointer
or, to let the function modify the caller's pointer:
void foo(char** p); // can modify p; dereference with (*p)[0] = 'f';
or with array syntax:
void foo(char p[]); // same as char*
I would not recommend this solution:
typedef int Vector3d[3];
since it obscures the fact that Vector3d is an array type, which you must know about. Programmers usually don't expect variables of the same type to have different sizes. Consider:
void foo(Vector3d a) {
    Vector3d b;
}
where sizeof a != sizeof b: inside foo the parameter a has decayed to int *, so sizeof a is the size of a pointer, while sizeof b is sizeof(int) * 3.
Maybe I'm missing something, but... since array arguments decay to pointers anyway, doesn't that basically mean there's no point in passing around pointers to them?
Couldn't you just use void foo(char p[10], int plen); ?
type (*)[];
// a pointer to an array, e.g.
int (*ptr)[5];
// a pointer to an array of 5 ints;
// it holds the address of the whole array

type *[];
// an array of pointers, e.g.
int* ptr[5];
// an array of five int pointers;
// it holds 5 addresses
On my compiler (vs2008), char (*p)[10] is treated as an array of character pointers, as if there were no parentheses, even when I compile it as a C file. Is compiler support for this inconsistent? If so, that is a major reason not to use it.

Saving code space by altering a function call in C

I am calling a function that returns a value through a pointer parameter. I do not care about this returned value, nor do I want to create a dummy variable to receive it. For a simple example's sake, let's say the function is as follows, and I don't care about, nor want to make a dummy variable for, parameter "d".
void foo(int a, int b, int* c, int* d)
{
    *c = a+b;
    *d = a+b+*c;
}
I understand that a NULL pointer is, in theory, a pointer to a location that is not the address of any object or function. Would it be correct to pass NULL for "d" in this function if NULL is defined as follows? Or is this going to clobber whatever is at address 0 in memory?
#define NULL ((void *)0)
The target device is an MSP430 and I am using IAR C. No operating system is used, therefore no memory management is implemented.
EDIT: Please note that I do not want to create a dummy variable. Also if there was a way to fool the compiler into optimizing the "d" parameter out without altering the function definition, this is preferable.
EDIT#2: I would rather not use the & operator in the function call, as it generates inefficient code.
EDIT#3: For those who don't believe me when I am talking about the & operator... the compiler manual states "Avoid taking the address of local variables using the & operator. This is inefficient
for two main reasons. First, the variable must be placed in memory, and thus cannot be placed in a processor register. This results in larger and slower code. Second, the optimizer can no longer assume that the local variable is unaffected over function calls."
No, it is not correct.
The C standard does not define the behavior when you do this. On many systems, it will cause an error (some sort of memory fault) when foo attempts to store to address 0. If it does not, then you will have written data to address 0, presumably overwriting something else there that may have been needed, so your system may fail at a later time.
You should change your function a bit to allow passing NULL:
void foo(int a, int b, int* c, int* d)
{
    if (c != NULL)
    {
        *c = a+b;
        if (d != NULL)
        {
            *d = a+b+*c;
        }
    }
}
Now you can safely pass NULL. Otherwise, as the other answers already state, you end up dereferencing a NULL pointer which results in undefined behavior.
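Callers can then simply omit the outputs they don't want (caller is a name invented for this sketch):
void caller(void) {
    int sum;
    foo(2, 3, &sum, NULL);   /* safe: d is now optional */
}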
In your example, if you don't care about the pointer d and you pass NULL as you defined it, then it'll probably crash due to dereferencing NULL.
You should pass a valid pointer even if you don't care about the result.
Why not just declare a temporary and pass that?
int tempd;
foo(a, b, &c, &tempd);
On a system with virtual memory there is no accessible "0th element" of memory: address 0 is typically unmapped, so attempting this would crash your program with a memory exception. (On an MSP430 with no MMU, though, the write would simply land at address 0.) I assume you want to ignore d if it's null, so simply do this:
if (d != NULL)
{
    *d = a+b+*c;
}
Since you don't want to create a dummy variable and can't change the function, you'll most likely end up scribbling at memory position 0 on your device, whatever that means. Maybe it's a memory-mapped hardware register, maybe it's just normal physical memory.
If it's a register, maybe it doesn't have any effect unless you write the magical value 4711 into it which will happen once every three months and the device halts and catches fire. (has happened to me, it's fun to overwrite the boot eeprom on a device)
Or if it's memory maybe you'll send a NULL pointer to a different function later and that function will happily read the value that this function wrote there and you'll end up at 5 in the morning tearing your hair out and yelling "this can't possibly affect that!". (has happened to me on some ancient unix that used to map the NULL page)
Maybe your compiler adds a safety net for you. Maybe it doesn't. Maybe the next version will. Maybe the next hardware revision will come with memory unmapped at address 0 and the device will halt.
I'd create a dummy variable in the calling function and move on to a more interesting problem, but if you're a stress junkie, pass NULL and see what happens (today or in 10 years).
In that specific example code, both passed-in pointers are dereferenced unconditionally. Dereferencing NULL is undefined behavior. Just go with the dummy variables.
In general: if a function accepts a pointer, it should state whether the null pointer is a valid value for the argument. If it doesn't say anything, stay on the safe side and assume it isn't. This will save you a lot of grief.
Interesting question! Generally speaking, NULL is reserved as an "invalid" address. You shouldn't try to write to it, but I don't think the standard specifies what should happen if you do. On a Windows machine, this will generate an access violation exception. I don't know what will happen on your device.
In any case, passing NULL is the usual way to indicate that you're not interested in the value of an out parameter. But your function must be aware of this and act accordingly:
if( c ) {
    *c = a+b;
    if( d ) {
        *d = a+b+*c;
    }
}
Edit
If you can't change the function definition, then you're probably out of luck. You can trick the compiler into not passing d if the calling convention is cdecl. Just declare the function without the d parameter:
extern void foo( int a, int b, int * c );
However, you're definitely into dangerous shenanigans territory here. The function definition will still expect the d parameter, so it will see random garbage.
The only other thing I can think of is passing a fixed address. Since you're writing for a specific device, there might be an address range that's safe to write to. (That is, it won't cause exceptions or corrupt actual memory.)
void * SafeAddress = (void *)0x12345678;
foo( a, b, &c, SafeAddress );
Of course, the easiest thing is to just use the dummy variable. I know you've said more than once that this generates inefficient code, but does that have to be the case? Does it make a difference if the dummy is a local variable versus a global one?
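As for local versus global: one possibility (a sketch, not tested with IAR; discard and caller are names invented here) is a single file-scope sink, so no local variable's address is taken at any call site:
static int discard;   /* shared dummy for all don't-care out-parameters */

void caller(int a, int b, int *result) {
    foo(a, b, result, &discard);
}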
The function tries to store a value at the provided address. If the address is invalid, then there will be a malfunction of some sort -- whether you use an operating system or not is irrelevant.
Either you have to give the function some valid address (even if you don't care about the value in a particular case), or you have to change the function so that it does not store the value (and, probably, does not even compute it) when the address for it is NULL (which may or may not be 0x0 on your platform, BTW).
You keep repeating that you "don't want" to do the former and cannot do the latter. Well, then you have an unsolvable dilemma. Maybe there already exists some other address where dummy values like this can be stored in your program (or on the platform) -- you can pass that.
If there is no OS involved, then you must be dealing with some funky programmable device, which means, there ought to be seasoned C-programmers around you. Ask them for confirmation of what you are told here -- clearly, you aren't trusting the answers given to you by several people already.
It will try to store something at memory location 0x0, so I'd say it will crash.
I was able to save code space without increasing memory usage on the stack by declaring a dummy variable as well as a pointer to the dummy variable.
int Dummy;
int* Dummy_ptr = &Dummy;
This allowed the compiler to make optimizations on the function call as the & operator was not used in the function call.
The call is now
foo(a, b, c_ptr, Dummy_ptr);
EDIT: For those of you who don't believe me.
I took a look at the assembler. The Dummy variable exists on the stack, but because it is never used afterwards and is only written through by the called function, the compiler was able to optimize away any use of it.
