I have a
LS_Led* LS_vol_leds[10];
declared in one C module, and the proper externs in the other modules that access it.
In func1() I have this line:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
And it does not cause an exception. Then
I call func2() in another C module (right after above line), and do the same line, namely:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
first thing, and exception thrown!!!
I don't think I have the powers to debug this one on my own.
Before anything LS_vol_leds is initialized in func1() with:
LS_vol_leds[0] = &led3;
LS_vol_leds[1] = &led4;
LS_vol_leds[2] = &led5;
LS_vol_leds[3] = &led6;
LS_vol_leds[4] = &led7;
LS_vol_leds[5] = &led8;
LS_vol_leds[6] = &led9;
LS_vol_leds[7] = &led10;
LS_vol_leds[8] = &led11;
LS_vol_leds[9] = &led12;
My externs look like
extern LS_Led** LS_vol_leds;
So does that lead to disaster, and how do I prevent disaster?
Thanks.
This leads to disaster:
extern LS_Led** LS_vol_leds;
You should try this instead:
extern LS_Led *LS_vol_leds[];
If you really want to know why, you should read Expert C Programming - Deep C Secrets, by Peter Van Der Linden (amazing book!), especially chapter 4, but the quick answer is that this is one of those corner cases where pointers and arrays are not interchangeable: a pointer is a variable which holds the address of another one, whereas an array name is an address. extern LS_Led** LS_vol_leds; is lying to the compiler and generating the wrong code to access LS_vol_leds[i].
With this:
extern LS_Led** LS_vol_leds;
The compiler will believe that LS_vol_leds is a pointer, and thus LS_vol_leds[i] involves reading the value stored in the memory location that is responsible for LS_vol_leds, using that value as an address, and then scaling i accordingly to get the offset.
However, since LS_vol_leds is an array and not a pointer, the compiler should instead pick the address of LS_vol_leds directly. In other words: what is happening is that your original extern causes the compiler to dereference LS_vol_leds[0] because it believes that LS_vol_leds[0] holds the address of the pointed-to object.
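A minimal sketch of the corrected declaration, collapsed into a single file so that it compiles on its own. The body of LS_Led (a single pin field) is a placeholder, since the question does not show the real type, and first_led_pin is a hypothetical helper used only for illustration.

```c
/* Placeholder for the real LS_Led type, which the question does not show. */
typedef struct {
    int pin;
} LS_Led;

/* What the other modules (or better, one shared header) should declare. */
extern LS_Led *LS_vol_leds[];

/* The single definition, living in exactly one module. */
LS_Led *LS_vol_leds[10];

static LS_Led led3 = { 3 };

/* Hypothetical helper: initialise slot 0 and read it back through the array. */
int first_led_pin(void)
{
    LS_vol_leds[0] = &led3;
    LS_Led led = *(LS_vol_leds[0]);
    return led.pin;
}
```

Putting the extern line in a header that the defining module also includes lets the compiler check the declaration against the definition.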
UPDATE: Fun fact - the back cover of the book talks about this specific case:
So that's why extern char *cp isn't the same as extern char cp[]. I
knew that it didn't work despite their superficial equivalence, but I
didn't know why. [...]
UPDATE2: Ok, since you asked, let's dig deeper. Consider a program split into two files, file1.c and file2.c. Its contents are:
file1.c
#define BUFFER_SIZE 1024
char cp[BUFFER_SIZE];
/* Lots of code using cp[i] */
file2.c
extern char *cp;
/* Code using cp[i] */
The moment you try to assign to cp[i] or use cp[i] in file2.c, your code will most likely crash. This is deeply tied to the mechanics of C and to the code that the compiler generates for array-based accesses and pointer-based accesses.
When you have a pointer, you must think of it as a variable. A pointer is a variable like an int, float or something similar, but instead of storing an integer or a float, it stores a memory address - the address of another object.
Note that variables have addresses. When you have something like:
int a;
Then you know that a is the name for an integer object. When you assign to a, the compiler emits code that writes into whatever address is associated with a.
Now consider you have:
char *p;
What happens when you access *p? Remember - a pointer is a variable. This means that the memory address associated with p holds an address - namely, an address holding a character. When you assign to p (i.e., make it point to somewhere else), then the compiler grabs the address of p and writes a new address (the one you provide it) into that location.
For example, if p lives at 0x27, it means that reading memory location 0x27 yields the address of the object pointed to by p. So, if you use *p in the right hand side of an assignment, the steps to get the value of *p are:
Read the contents of 0x27 - say it's 0x80 - this is the value of the pointer, or, equivalently, the address of the pointed-to object
Read the contents of 0x80 - this finally gives you *p.
What if p is an array? If p is an array, then the variable p itself represents the array. By convention, the address representing an array is the address of its first element. If the compiler chooses to store the array in address 0x59, it means that the first element of p lives at 0x59. So when you read p[0] (or *p), the generated code is simpler: the compiler knows that the variable p is an array, and the address of an array is the address of the first element, so p[0] is the same as reading 0x59. Compare this to the case for which p is a pointer.
If you lie to the compiler, and tell it you have a pointer instead of an array, the compiler will (wrongly) generate code that does what I showed for the pointer case. You're basically telling it that 0x59 is not the address of an array, it's the address of a pointer. So, reading p[i] will cause it to use the pointer version:
Read the contents of 0x59 - note that, in reality, this is p[0]
Use that as an address, and read its contents.
So, what happens is that the compiler thinks that p[0] is an address, and will try to use it as such.
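The difference between the two cases can be observed from inside a program. This is only a sketch with made-up function names: it compares the address of the variable itself with the address of element 0, once for a real array and once for a pointer.

```c
/* Returns 1: an array object starts at the same address as its element 0. */
int array_starts_at_first_element(void)
{
    char arr[8] = "abc";
    return (void *)&arr == (void *)&arr[0];
}

/* Returns 0: a pointer variable is stored somewhere other than its target. */
int pointer_starts_at_first_element(void)
{
    char arr[8] = "abc";
    char *ptr = arr;
    return (void *)&ptr == (void *)&ptr[0];
}
```

For the array there is no stored address to read first; for the pointer, the variable holds an address that must be loaded before the element can be reached.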
Why is this a corner case? Why don't I have to worry about this when passing arrays to functions?
Because what is really happening is that the compiler manages it for you. Yes, when you pass an array to a function, a pointer to the first element is passed, and inside the called function you have no way to know if it is a "real" array or a pointer. However, the address passed into the function is different depending on whether you're passing a real array or a pointer. If you're passing a real array, the pointer you get is the address of the first element of the array (in other words: the compiler immediately grabs the address associated to the array variable from the symbol table). If you're passing a pointer, the compiler passes the address that is stored in the address associated with that variable (and that variable happens to be the pointer), that is, it does exactly those 2 steps mentioned before for pointer-based access. Again, note that we're discussing the value of the pointer here. You must keep this separated from the address of the pointer itself (the address where the address of the pointed-to object is stored).
That's why you don't see a difference. In most situations, arrays are passed around as function arguments, and this rarely raises problems. But sometimes, with some corner cases (like yours), if you don't really know what is happening down there, well.. then it will be a wild ride.
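A small sketch of the function-call case, with hypothetical names. The callee is written once and cannot tell whether the caller started from an array or from a pointer, because the compiler hands it the address of the first element either way.

```c
/* The callee only ever sees a pointer to the first element. */
static char first_char(const char *s)
{
    return s[0];
}

/* Returns 1: calling with an array and with a pointer gives the same result. */
int array_and_pointer_calls_agree(void)
{
    char arr[] = "hello";
    const char *ptr = arr;   /* holds the address of arr[0] */
    return first_char(arr) == 'h' && first_char(ptr) == 'h';
}
```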
Personal advice: read the book, it's totally worth it.
Some blogs and sites claim that pointers are beneficial, and one of the reasons given is that "execution speed" will be better in a program with pointers than in one without. What I can work out is that:
Dereferencing a single location requires two (or more) memory accesses (depending on number of indirection). Which will increase the execution time, compared to if it was used directly.
Passing a pointer to a large datatype to a function, like a structure can be beneficial, as only the address of the structure/union is getting copied and it's not getting passed by value. Therefore it should be faster in this case.
For example just by force introducing pointers without any need as:
int a, b, *p, *q, c, *d;
p = &a;
q = &b;
d = &c;
// get values in a, b
*d = *p + *q; // why the heck this would be faster
c = a + b; // than this code?
I checked the assembler output using gcc -S -masm=intel file.c. The pointer version has many more memory loads and stores for the dereferences than the direct version.
Am I missing something?
Note: The question is not related to just the code. The code is just an example. Not considering compiler optimizations.
I think your conclusions are basically right. The author did not mean that using more pointers will always speed up all code. That's obviously nonsense.
But there are times when it is faster to pass a pointer to data instead of copying that data.
As you pointed out: Passing a pointer to a large datatype to a function; here the structure is an int, so it's hardly large. BTW: I guess gcc will optimize away the pointer accesses when you use -O2.
Apart from that your understanding is correct.
You are right in your example - that code would run slower. One place where it can be faster is when making a function call:
void foo( Object Obj );
void bar( const Object * pObj );
int main()
{
Object theObject;
foo( theObject ); // Creates a copy of theObject which is then used in the function.
bar( &theObject ); // Creates a copy of the memory address only, then the function references the original object within.
}
bar is faster as we don't need to copy the entire object (assuming the object is more than just a base data type). Most people would use a reference rather than a pointer in this instance, however.
void foobar( const Object & Obj );
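The same idea in plain C, as a rough sketch. Big and its 1024-element array are made up for illustration; the point is only that the by-value call copies the whole struct while the by-pointer call copies one address.

```c
/* Hypothetical large type; 1024 ints is an arbitrary size for illustration. */
typedef struct {
    int samples[1024];
} Big;

/* The whole struct is copied into the parameter on every call. */
int first_by_value(Big b)
{
    return b.samples[0];
}

/* Only an address is copied; the callee reads the caller's object. */
int first_by_pointer(const Big *b)
{
    return b->samples[0];
}

/* Returns 1 when a Big is larger than a pointer to it. */
int copy_is_larger_than_pointer(void)
{
    return sizeof(Big) > sizeof(Big *);
}
```

Whether the difference is measurable depends on the size of the object and on what the optimizer does with the call.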
Mark Byers is absolutely right. You cannot judge the power of pointers in such a simple program. They are used to optimize memory management and to speed up programs that make heavy use of data structures, where references are made through addresses.
Consider when you start a program it takes some time in loading but with efficient use of pointers and skills if the program loads even 1 second earlier that's a large accomplishment.
I'm sort of learning C, I'm not a beginner to programming though, I "know" Java and python, and by the way I'm on a mac (leopard).
Firstly,
1: could someone explain when to use a pointer and when not to?
2:
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
And then all but the last would set indexes 0, 1, 2 and 3 to 'f', 'u', 'n' and '\0' respectively. My question is, why isn't the second one a pointer? Why char fun[4] and not char *fun[4]? And how come it seems that a pointer to a struct or an int is always an array?
3:
I understand this:
typedef struct car
{
...
};
is a shortcut for
struct car
{
...
};
typedef struct car car;
Correct? But something I am really confused about:
typedef struct A
{
...
}B;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
4. I understand what pointers do, but I don't understand what the point of them is (no pun intended). And when does something get allocated on the stack vs. the heap? How do I know where it gets allocated? Do pointers have something to do with it?
5. And lastly, know any good tutorial for C game programming (simple) ? And for mac/OS X, not windows?
PS. Is there any other name people use to refer to just C, not C++? I hate how they're all named almost the same thing, so hard to try to google specifically C and not just get C++ and C# stuff.
Thanks!!
It was hard to pick a best answer, they were all great, but the one I picked was the only one that made me understand my 3rd question, which was the only one I was originally going to ask. Thanks again!
My question is, why isn't the second one a pointer?
Because it declares an array. In the two other cases, you have a pointer that refers to data that lives somewhere else. Your array declaration, however, declares an array of data that lives where it's declared. If you declared it within a function, then data will die when you return from that function. Finally char *fun[4] would be an array of 4 pointers - it wouldn't be a char pointer. In case you just want to point to a block of 4 chars, then char* would fully suffice, no need to tell it that there are exactly 4 chars to be pointed to.
The first way which creates an object on the heap is used if you need data to live from thereon until the matching free call. The data will survive a return from a function.
The last way just creates data that's not intended to be written to. It's a pointer which refers to a string literal - it's often stored in read-only memory. If you write to it, then the behavior is undefined.
I understand what pointers do, but I don't understand what the point of them is (no pun intended).
Pointers are used to point to something (no pun, of course). Look at it like this: If you have a row of items on the table, and your friend says "pick the second item", then the item won't magically walk its way to you. You have to grab it. Your hand acts like a pointer, and when you move your hand back to you, you dereference that pointer and get the item. The row of items can be seen as an array of items:
And how come it seems that a pointer to a struct or an int is always an array?
item row[5];
When you do item i = row[1]; then you first point your hand at the first item (get a pointer to the first one), and then you advance till you are at the second item. Then you take your hand with the item back to you :) So, the row[1] syntax is not something special to arrays, but rather special to pointers - it's equivalent to *(row + 1), and a temporary pointer is made up when you use an array like that.
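The equivalence can be checked directly. A minimal sketch, using int elements instead of the item type from the analogy:

```c
/* Returns 1: subscripting is pointer arithmetic followed by a dereference. */
int subscript_matches_pointer_form(void)
{
    int row[5] = { 10, 20, 30, 40, 50 };
    int *first = row;   /* the array converts to a pointer to row[0] */
    return row[1] == *(row + 1) && *(first + 1) == 20;
}
```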
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
typedef struct car
{
...
};
That's not valid code. You basically said "define the type struct car { ... } to be referable by the following ordinary identifier" but you forgot to tell it the identifier. The two following snippets are equivalent instead, as far as I can see
1)
struct car
{
...
};
typedef struct car car;
2)
typedef struct car
{
...
} car;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
In our case, the identifier car was declared two times in the same scope. But the declarations won't conflict because each of the identifiers are in a different namespace. The two namespaces involved are the ordinary namespace and the tag namespace. A tag identifier needs to be used after a struct, union or enum keyword, while an ordinary identifier doesn't need anything around it. You may have heard of the POSIX function stat, whose interface looks like the following
struct stat {
...
};
int stat(const char *path, struct stat *buf);
In that code snippet, stat is registered into the two aforementioned namespaces too. struct stat will refer to the struct, and merely stat will refer to the function. Some people don't like to precede identifiers always with struct, union or enum. Those use typedef to introduce an ordinary identifier that will refer to the struct too. The identifier can of course be the same (both times car), or they can differ (one time A the other time B). It doesn't matter.
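A compilable sketch of the two namespaces, modelled on the stat example but with made-up names (point, Point2) so that it does not depend on any system header:

```c
/* Tag namespace: only reachable as "struct point". */
struct point {
    int x;
    int y;
};

/* Ordinary namespace: a function that happens to share the name. */
int point(const struct point *p)
{
    return p->x + p->y;
}

/* A typedef adds an ordinary identifier for the same struct type. */
typedef struct point Point2;
```

Here struct point and Point2 are the same type, while plain point refers to the function.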
3) It's bad style to use two different names A and B:
typedef struct A
{
...
} B;
With that definition, you can say
struct A a;
B b;
b.field = 42;
a.field = b.field;
because the variables a and b have the same type. C programmers usually say
typedef struct A
{
...
} A;
so that you can use "A" as a type name, equivalent to "struct A" but it saves you a lot of typing.
Use them when you need to. Read some more examples and tutorials until you understand what pointers are, and this ought to be a lot clearer :)
The second case creates an array in memory, with space for four bytes. When you use that array's name, you magically get back a pointer to the first (index 0) element. The [] operator then actually works on a pointer, not an array - x[y] is equivalent to *(x + y). And yes, this means x[y] is the same as y[x]. Sorry.
Note also that when you add an integer to a pointer, it's multiplied by the size of the pointed-to elements, so if you do someIntArray[1], you get the second (index 1) element, not something in between that starts at the second byte.
Also, as a final gotcha - array types in function argument lists - e.g., void foo(int bar[4]) - secretly get turned into pointer types - that is, void foo(int *bar). This is only the case in function arguments.
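Two of these points are easy to verify in code. The function names below are invented for this sketch:

```c
/* Returns 1: both spellings name the same element. */
int subscript_is_commutative(void)
{
    int values[3] = { 7, 8, 9 };
    return values[1] == 1[values];
}

/* Distance in bytes between values and values + 1: sizeof(int), not 1. */
int bytes_per_step(void)
{
    int values[3] = { 7, 8, 9 };
    return (int)((char *)(values + 1) - (char *)values);
}
```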
Your third example declares a struct type with two names - struct A and B. In pure C, the struct is mandatory for A - in C++, you can just refer to it as either A or B. Apart from the name change, the two types are completely equivalent, and you can substitute one for the other anywhere, anytime without any change in behavior.
C has three places things can be stored:
The stack - local variables in functions go here. For example:
void foo() {
int x; // on the stack
}
The heap - things go here when you allocate them explicitly with malloc, calloc, or realloc.
void foo() {
int *x; // on the stack
x = malloc(sizeof(*x)); // the value pointed to by x is on the heap
}
Static storage - global variables and static variables, allocated once at program startup.
int x; // static
void foo() {
static int y; // essentially a global that can only be used in foo()
}
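A short sketch of how the three kinds of storage behave across function calls. next_id and make_heap_int are hypothetical helpers, not anything from the question.

```c
#include <stdlib.h>

/* Static storage: counter keeps its value between calls. */
int next_id(void)
{
    static int counter;   /* zero-initialised once, before the program starts */
    return ++counter;
}

/* Heap: the int outlives this function; the caller must free() it. */
int *make_heap_int(int value)
{
    int *p = malloc(sizeof *p);   /* p itself is on the stack */
    if (p != NULL)
        *p = value;
    return p;
}
```

Returning the address of an ordinary local variable instead would be a bug, because that storage is gone once the function returns.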
No idea. I wish I didn't need to answer all questions at once - this is why you should split them up :)
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
The first one can be set to any size you want at runtime, and be resized later - you can also free the memory when you are done.
The second one is an array, but in most expressions 'fun' behaves like a pointer to its first element - the same as char *ptr = &fun[0].
I understand what pointers do, but I don't understand what the point of them is (no pun intended). And when does something get allocated on the stack vs. the heap? How do I know where it gets allocated? Do pointers have something to do with it?
When you define something in a function like "char fun[4]" it is defined on the stack and the memory isn't available outside the function.
Using malloc (or new in C++) reserves memory on the heap - you can make this data available anywhere in the program by passing the pointer around. This also lets you decide the size of the memory at runtime. Finally, the size of the stack is limited (typically 1 MB), while on the heap you can reserve all the memory you have available.
edit 5. Not really - I would say pure C. C++ is (almost) a superset of C, so unless you are working on a very limited embedded system it's usually OK to use C++.
5. Chipmunk
Fast and lightweight 2D rigid body physics library in C.
Designed with 2D video games in mind.
Lightweight C99 implementation with no external dependencies outside of the Std. C library.
Many language bindings available.
Simple, read the documentation and see!
Unrestrictive MIT license.
Makes you smarter, stronger and more attractive to the opposite gender!
...
In your second question:
char *fun = malloc(sizeof(char) * 4);
vs
char fun[4];
vs
char *fun = "fun";
These all involve an array of 4 chars, but that's where the similarity ends. Where they differ is in the lifetime, modifiability and initialisation of those chars.
The first one creates a single pointer to char object called fun - this pointer variable will live only from when this function starts until the function returns. It also calls the C standard library and asks it to dynamically create a memory block the size of an array of 4 chars, and assigns the location of the first char in the block to fun. This memory block (which you can treat as an array of 4 chars) has a flexible lifetime that's entirely up to the programmer - it lives until you pass that memory location to free(). Note that this means that the memory block created by malloc can live for a longer or shorter time than the pointer variable fun itself does. Note also that the association between fun and that memory block is not fixed - you can change fun so it points to different memory block, or make a different pointer point to that memory block.
One more thing - the array of 4 chars created by malloc is not initialised - it contains garbage values.
The second example creates only one object - an array of 4 chars, called fun. (To test this, change the 4 to 40 and print out sizeof(fun)). This array lives only until the function it's declared in returns (unless it's declared outside of a function, when it lives for as long as the entire program is running). This array of 4 chars isn't initialised either.
The third example creates two objects. The first is a pointer-to-char variable called fun, just like in the first example (and as usual, it lives from the start of this function until it returns). The other object is a bit strange - it's an array of 4 chars, initialised to { 'f', 'u', 'n', 0 }, which has no name and that lives for as long as the entire program is running. It's also not guaranteed to be modifiable (although what happens if you try to modify it is left entirely undefined - it might crash your program, or it might not). The variable fun is initialised with the location of this strange unnamed, unmodifiable, long-lived array (but just like in the first example, this association isn't permanent - you can make fun point to something else).
The reason why there's so many confusing similarities and differences between arrays and pointers is down to two things:
The "array syntax" in C (the [] operator) actually works on pointers, not arrays!
Trying to pin down an array is a bit like catching fog - in almost all cases the array evaporates and is replaced by a pointer to its first element instead.
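The three declarations can be put side by side in one function. This is a sketch with renamed variables (they cannot all be called fun in one scope), and the literal pointer is declared const here as a precaution, which the question's version does not do.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if all three hold the text "fun", -1 if malloc fails. */
int three_ways_hold_same_text(void)
{
    char *heap_fun = malloc(sizeof(char) * 4);  /* uninitialised until written */
    if (heap_fun == NULL)
        return -1;
    memcpy(heap_fun, "fun", 4);                 /* 'f', 'u', 'n', '\0' */

    char array_fun[4] = "fun";                  /* the array itself holds the bytes */
    const char *literal_fun = "fun";            /* points at an unnamed literal */

    int same = strcmp(heap_fun, array_fun) == 0
            && strcmp(array_fun, literal_fun) == 0
            && sizeof array_fun == 4;           /* size of the array, not of a pointer */

    free(heap_fun);                             /* only the malloc'd block is freed */
    return same;
}
```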
This question goes out to the C gurus out there:
In C, it is possible to declare a pointer as follows:
char (* p)[10];
.. which basically states that this pointer points to an array of 10 chars. The neat thing about declaring a pointer like this is that you will get a compile time error if you try to assign a pointer of an array of different size to p. It will also give you a compile time error if you try to assign the value of a simple char pointer to p. I tried this with gcc and it seems to work with ANSI, C89 and C99.
It looks to me like declaring a pointer like this would be very useful - particularly, when passing a pointer to a function. Usually, people would write the prototype of such a function like this:
void foo(char * p, int plen);
If you were expecting a buffer of a specific size, you would simply test the value of plen. However, you cannot be guaranteed that the person who passes p to you will really give you plen valid memory locations in that buffer. You have to trust that the person who called this function is doing the right thing. On the other hand:
void foo(char (*p)[10]);
..would force the caller to give you a buffer of the specified size.
This seems very useful, but I have never seen a pointer declared like this in any code I have ever run across.
My question is: Is there any reason why people do not declare pointers like this? Am I not seeing some obvious pitfall?
What you are saying in your post is absolutely correct. I'd say that every C developer comes to exactly the same discovery and to exactly the same conclusion when (if) they reach certain level of proficiency with C language.
When the specifics of your application area call for an array of specific fixed size (array size is a compile-time constant), the only proper way to pass such an array to a function is by using a pointer-to-array parameter
void foo(char (*p)[10]);
(in C++ language this is also done with references
void foo(char (&p)[10]);
).
This will enable language-level type checking, which will make sure that the array of exactly correct size is supplied as an argument. In fact, in many cases people use this technique implicitly, without even realizing it, hiding the array type behind a typedef name
typedef int Vector3d[3];
void transform(Vector3d *vector);
/* equivalent to `void transform(int (*vector)[3])` */
...
Vector3d vec;
...
transform(&vec);
Note additionally that the above code is invariant with relation to Vector3d type being an array or a struct. You can switch the definition of Vector3d at any time from an array to a struct and back, and you won't have to change the function declaration. In either case the functions will receive an aggregate object "by reference" (there are exceptions to this, but within the context of this discussion this is true).
However, you won't see this method of array passing used explicitly too often, simply because too many people get confused by a rather convoluted syntax and are simply not comfortable enough with such features of C language to use them properly. For this reason, in average real life, passing an array as a pointer to its first element is a more popular approach. It just looks "simpler".
But in reality, using the pointer to the first element for array passing is a very niche technique, a trick, which serves a very specific purpose: its one and only purpose is to facilitate passing arrays of different size (i.e. run-time size). If you really need to be able to process arrays of run-time size, then the proper way to pass such an array is by a pointer to its first element with the concrete size supplied by an additional parameter
void foo(char p[], unsigned plen);
Actually, in many cases it is very useful to be able to process arrays of run-time size, which also contributes to the popularity of the method. Many C developers simply never encounter (or never recognize) the need to process a fixed-size array, thus remaining oblivious to the proper fixed-size technique.
Nevertheless, if the array size is fixed, passing it as a pointer to an element
void foo(char p[])
is a major technique-level error, which unfortunately is rather widespread these days. A pointer-to-array technique is a much better approach in such cases.
Another reason that might hinder the adoption of the fixed-size array passing technique is the dominance of naive approach to typing of dynamically allocated arrays. For example, if the program calls for fixed arrays of type char[10] (as in your example), an average developer will malloc such arrays as
char *p = malloc(10 * sizeof *p);
This array cannot be passed to a function declared as
void foo(char (*p)[10]);
which confuses the average developer and makes them abandon the fixed-size parameter declaration without giving it a further thought. In reality though, the root of the problem lies in the naive malloc approach. The malloc format shown above should be reserved for arrays of run-time size. If the array type has compile-time size, a better way to malloc it would look as follows
char (*p)[10] = malloc(sizeof *p);
This, of course, can be easily passed to the above declared foo
foo(p);
and the compiler will perform the proper type checking. But again, this is overly confusing to an unprepared C developer, which is why you won't see it too often in the "typical" average everyday code.
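Putting the pieces of this answer together, a minimal sketch might look like the following. The names fill and pointer_to_array_demo are invented for illustration; the same function accepts both a local array (via &) and a malloc'd char[10].

```c
#include <stdlib.h>

/* Accepts only a pointer to an array of exactly 10 chars. */
static void fill(char (*p)[10], char value)
{
    for (size_t i = 0; i < sizeof *p; i++)   /* sizeof *p is 10 */
        (*p)[i] = value;
}

/* Returns 1 on success, -1 if malloc fails. */
int pointer_to_array_demo(void)
{
    char local[10];
    fill(&local, 'a');                        /* &local has type char (*)[10] */

    char (*heap)[10] = malloc(sizeof *heap);  /* one char[10] on the heap */
    if (heap == NULL)
        return -1;
    fill(heap, 'b');

    int ok = local[9] == 'a' && (*heap)[0] == 'b' && sizeof *heap == 10;
    free(heap);
    return ok;
}
```

Passing a char[9] or a plain char * to fill is a constraint violation that a conforming compiler must diagnose, though many report it only as a warning by default.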
I would like to add to AndreyT's answer (in case anyone stumbles upon this page looking for more info on this topic):
As I begin to play more with these declarations, I realize that there is a major handicap associated with them in C (apparently not in C++). It is fairly common to have a situation where you would like to give a caller a const pointer to a buffer you have written into. Unfortunately, this is not possible when declaring a pointer like this in C. In other words, the C standard (6.7.3 - Paragraph 8) is at odds with something like this:
int array[9];
const int (* p2)[9] = &array; /* Not legal unless array is const as well */
This constraint does not seem to be present in C++, making these type of declarations far more useful. But in the case of C, it is necessary to fall back to a regular pointer declaration whenever you want a const pointer to the fixed size buffer (unless the buffer itself was declared const to begin with). You can find more info in this mail thread: link text
This is a severe constraint in my opinion and it could be one of the main reasons why people do not usually declare pointers like this in C. The other being the fact that most people do not even know that you can declare a pointer like this as AndreyT has pointed out.
The obvious reason is that this code doesn't compile:
extern void foo(char (*p)[10]);
void bar() {
char p[10];
foo(p);
}
The default promotion of an array is to an unqualified pointer.
Also see this question, using foo(&p) should work.
I also want to use this syntax to enable more type checking.
But I also agree that the syntax and mental model of using pointers is simpler, and easier to remember.
Here are some more obstacles I have come across.
Accessing the array requires using (*p)[]:
void foo(char (*p)[10])
{
char c = (*p)[3];
(*p)[0] = 1;
}
It is tempting to use a local pointer-to-char instead:
void foo(char (*p)[10])
{
char *cp = (char *)p;
char c = cp[3];
cp[0] = 1;
}
But this would partially defeat the purpose of using the correct type.
One has to remember to use the address-of operator when assigning an array's address to a pointer-to-array:
char a[10];
char (*p)[10] = &a;
The address-of operator gets the address of the whole array in &a, with the correct type to assign it to p. Without the operator, a is automatically converted to the address of the first element of the array, same as in &a[0], which has a different type.
Since this automatic conversion is already taking place, I am always puzzled that the & is necessary. It is consistent with the use of & on variables of other types, but I have to remember that an array is special and that I need the & to get the correct type of address, even though the address value is the same.
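The type difference between a and &a shows up as soon as pointer arithmetic is involved. A sketch with invented function names, measuring how far one step moves in each case:

```c
/* &a + 1 skips the whole array: the step is sizeof(char[10]) bytes. */
int whole_array_step(void)
{
    char a[10];
    return (int)((char *)(&a + 1) - (char *)&a);
}

/* a + 1 skips a single element: the step is one char. */
int element_step(void)
{
    char a[10];
    return (int)((a + 1) - a);
}
```

Both expressions start from the same address value; only the pointed-to type, and therefore the stride, differs.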
One reason for my problem may be that I learned K&R C back in the 80s, which did not allow using the & operator on whole arrays yet (although some compilers ignored that or tolerated the syntax). Which, by the way, may be another reason why pointers-to-arrays have a hard time to get adopted: they only work properly since ANSI C, and the & operator limitation may have been another reason to deem them too awkward.
When typedef is not used to create a type for the pointer-to-array (in a common header file), then a global pointer-to-array needs a more complicated extern declaration to share it across files:
fileA:
char (*p)[10];
fileB:
extern char (*p)[10];
Well, simply put, C doesn't do things that way. An array of type T is passed around as a pointer to the first T in the array, and that's all you get.
This allows for some cool and elegant algorithms, such as looping through the array with expressions like
*dst++ = *src++
The downside is that management of the size is up to you. Unfortunately, failure to do this conscientiously has also led to millions of bugs in C coding, and/or opportunities for malevolent exploitation.
What comes close to what you ask in C is to pass around a struct (by value) or a pointer to one (by reference). As long as the same struct type is used on both sides of this operation, both the code that hands out the reference and the code that uses it are in agreement about the size of the data being handled.
Your struct can contain whatever data you want; it could contain your array of a well-defined size.
Still, nothing prevents you or an incompetent or malevolent coder from using casts to fool the compiler into treating your struct as one of a different size. The almost unshackled ability to do this kind of thing is a part of C's design.
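A rough sketch of the struct-wrapping approach. Buffer10 and its helpers are hypothetical names; the size travels with the type, so callee and caller cannot disagree about it without a cast.

```c
#include <stddef.h>

/* Hypothetical wrapper: the array size is part of the struct type. */
typedef struct {
    char data[10];
} Buffer10;

/* The callee can recover the capacity from the type itself. */
size_t buffer_capacity(const Buffer10 *buf)
{
    return sizeof buf->data;
}

/* Writes value into every element, using the size known from the type. */
void buffer_fill(Buffer10 *buf, char value)
{
    for (size_t i = 0; i < sizeof buf->data; i++)
        buf->data[i] = value;
}
```

Unlike a bare array, a Buffer10 can also be assigned and passed by value, which copies all 10 bytes.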
You can declare an array of characters a number of ways:
char p[10];
char* p = (char*)malloc(10 * sizeof(char));
The prototype to a function that takes an array by value is:
void foo(char* p); //cannot modify p
or by reference:
void foo(char** p); //can modify p, dereference by (*p)[0] = 'f';
or by array syntax:
void foo(char p[]); //same as char*
I would not recommend this solution
typedef int Vector3d[3];
since it obscures the fact that Vector3d has a type that you must know about. Programmers usually don't expect variables of the same type to have different sizes. Consider:
void foo(Vector3d a) {
Vector3d b;
}
where sizeof a != sizeof b
Maybe I'm missing something, but... since arrays are constant pointers, basically that means that there's no point in passing around pointers to them.
Couldn't you just use void foo(char p[10], int plen); ?
type (*)[];
// a pointer to an array, e.g.
int (*ptr)[5];
// ptr points to an array of 5 integers
// and holds the address of the whole array
type *[];
// an array of pointers, e.g.
int* ptr[5];
// ptr is an array of five integer pointers
// and holds 5 addresses.
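The two declarations can be compared side by side in a short sketch. The function names sum_row and sum_targets are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* pa is a pointer to one array of 5 ints; (*pa) is the array itself. */
int sum_row(int (*pa)[5])
{
    assert(pa != NULL);
    int total = 0;
    for (size_t i = 0; i < sizeof *pa / sizeof (*pa)[0]; i++)
        total += (*pa)[i];
    return total;
}

/* ap is an array of 5 pointers to int (as a parameter it is adjusted
   to int **); each element is dereferenced separately. */
int sum_targets(int *ap[5])
{
    assert(ap != NULL);
    int total = 0;
    for (size_t i = 0; i < 5; i++)
        total += *ap[i];
    return total;
}
```

A caller passes &row for an int row[5] to sum_row, and an array such as int *ap[5] = {&a, &b, &c, &d, &e} to sum_targets.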
On my compiler (vs2008) it treats char (*p)[10] as an array of character pointers, as if there was no parentheses, even if I compile as a C file. Is compiler support for this "variable"? If so that is a major reason not to use it.
I'm preparing some slides for an introductory C class, and I'm trying to present good examples (and motivation) for using pointer arithmetic over array subscripting.
A lot of the examples I see in books are fairly equivalent. For example, many books show how to reverse the case of all values in a string, but with the exception of replacing an a[i] with a *p the code is identical.
I am looking for a good (and short) example with single-dimensional arrays where pointer arithmetic can produce significantly more elegant code. Any ideas?
Getting a pointer again instead of a value:
One usually uses pointer arithmetic when they want to get a pointer again. To get a pointer while using an array index: you are 1) calculating the pointer offset, then 2) getting the value at that memory location, then 3) you have to use & to get the address again. That's more typing and less clean syntax.
Example 1: Let's say you need a pointer to the byte at offset 512 in a buffer
char buffer[1024];
char *p = buffer + 512;
Is cleaner than:
char buffer[1024];
char *p = &buffer[512];
Example 2: More efficient strcat
char buffer[1024];
strcpy(buffer, "hello ");
strcpy(buffer + 6, "world!");
This is cleaner than:
char buffer[1024];
strcpy(buffer, "hello ");
strcpy(&buffer[6], "world!");
Using pointer arithmetic ++ as an iterator:
Incrementing pointers with ++, and decrementing with -- is useful when iterating over each element in an array of elements. It is cleaner than using a separate variable used to keep track of the offset.
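As a sketch of the iterator idea, here are two hypothetical functions that sum an int array, one with an index and one where the pointer itself is the loop variable.

```c
#include <assert.h>
#include <stddef.h>

/* Index version: a separate counter keeps track of the offset. */
long sum_indexed(const int *a, size_t n)
{
    assert(a != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer version: the pointer is the iterator, and a one-past-the-end
   pointer marks where to stop. */
long sum_pointer(const int *a, size_t n)
{
    assert(a != NULL);
    long total = 0;
    for (const int *p = a, *end = a + n; p != end; ++p)
        total += *p;
    return total;
}
```

Both return the same result; which one reads better is largely a matter of taste.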
Pointer subtraction:
You can also subtract with pointer arithmetic. This can be useful in some cases to get the element before the one you are pointing to. It can be done with array subscripts too, but a negative subscript looks bad and confusing, especially to a Python programmer, for whom a negative subscript indexes from the end of the list.
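A short sketch of both uses of subtraction: the distance between two pointers, and stepping back one element from the current position. The names length_of and chomp are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a string as the distance between two pointers. */
size_t length_of(const char *s)
{
    assert(s != NULL);
    const char *end = s;
    while (*end != '\0')
        ++end;
    return (size_t)(end - s);
}

/* Remove one trailing newline, if present, by looking at the element
   just before the terminator. */
void chomp(char *s)
{
    assert(s != NULL);
    char *end = s + length_of(s);
    if (end != s && *(end - 1) == '\n')
        *(end - 1) = '\0';
}
```

The check end != s guards against stepping back before the start of an empty string.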
char *my_strcpy(const char *s, char *t) {
char *u = t;
while (*t++ = *s++);
return u;
}
Why would you want to spoil such a beauty with an index? (See K&R, and how they build up to this style.) There is a reason the above signature is the way it is. Stop editing without asking for clarification first. For those who think they know, look up the present signature: you missed a few restrict qualifications.
Structure alignment testing and the offsetof macro implementation.
Pointer arithmetic may look fancy and "hackerish", but I have never encountered a case it was FASTER than the standard indexing. Just the opposite, I often encountered cases when it slowed the code down by a large factor.
For example, typical sequential looping through an array with a pointer may be less efficient than looping with a classic index on modern processors that support SSE extensions. Pointer arithmetic in a loop can block compilers from performing loop vectorization, which can typically yield a 2x-4x performance boost. Additionally, using pointers instead of simple integer variables may result in needless memory store operations due to pointer aliasing.
So, generally pointer arithmetic instead of standard indexed access should NEVER be recommended.
Iterating through a 2-dimensional array where the position of a datum does not really matter.
If you don't use pointers, you have to keep track of two subscripts.
With pointers, you can point to the top of your array and, with a single loop, zip through the whole thing.
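A sketch of the single-loop traversal, using a hypothetical 3x4 grid. It relies on the rows of a true 2-dimensional array being laid out contiguously; walking across row boundaries with one int pointer works on common compilers, though a strict reading of the standard's pointer-arithmetic rules is debated, so treat this as an illustration of the idiom rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };

/* Two subscripts: one loop per dimension. */
long sum_two_subscripts(int grid[ROWS][COLS])
{
    assert(grid != NULL);
    long total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += grid[r][c];
    return total;
}

/* One pointer: start at the first element and walk ROWS * COLS ints. */
long sum_one_pointer(int grid[ROWS][COLS])
{
    assert(grid != NULL);
    long total = 0;
    const int *p = &grid[0][0];
    const int *end = p + ROWS * COLS;
    while (p != end)
        total += *p++;
    return total;
}
```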
If you were using an old compiler, or some kind of specialist embedded systems compiler, there might be slight performance differences, but most modern compilers would probably optimize these (tiny) differences out.
The following article might be something you could draw on - depends on the level of your students:
http://geeks.netindonesia.net/blogs/risman/archive/2007/06/25/Pointer-Arithmetic-and-Array-Indexing.aspx
You're asking about C specifically, but C++ builds upon this as well:
Most pointer arithmetic naturally generalizes to the Forward Iterator concept. Walking through memory with *p++ can be used for any sequenced container (linked list, skip list, vector, binary tree, B tree, etc), thanks to operator overloading.
Something fun I hope you never have to deal with: pointers can alias, whereas arrays cannot. Aliasing can cause all sorts of non-ideal code generation, the most common of which is using a pointer as an out parameter to another function. Basically, the compiler cannot assume that the pointer used by the function doesn't alias itself or anything else in that stack frame, so it has to reload the value from the pointer every time it's used. Or rather, to be safe it does.
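A tiny sketch of why aliasing forces the reload. The function add_twice is hypothetical: it reads *src twice, and if dst and src point to the same int, the second read must see the value written by the first store.

```c
#include <assert.h>
#include <stddef.h>

/* Adds *src to *dst twice. The compiler cannot assume dst != src,
   so it has to reload *src after the first store. */
void add_twice(int *dst, const int *src)
{
    assert(dst != NULL && src != NULL);
    *dst += *src;
    *dst += *src;   /* *src may have been changed by the line above */
}
```

With two distinct ints holding 1, the destination ends up as 3. When both arguments point to the same int holding 1, it ends up as 4, which is why the compiler cannot keep *src cached in a register.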
Often the choice is just one of style - one looks or feels more natural than the other for a particular case.
There is also the argument that using indexes can cause the compiler to repeatedly recalculate offsets inside a loop. I'm not sure how often this is the case (other than in non-optimized builds); I imagine it happens, but it's probably rarely a problem.
One area that I think is important in the long run (which might not apply to an introductory C class - but learn 'em early, I say) is that using pointer arithmetic applies to the idioms used in the C++ STL. If you get them to understand pointer arithmetic and use it, then when they move on to the STL, they'll have a leg up on how to properly use iterators.
#include <ctype.h>
#include <stdio.h>
void skip_spaces( const char **ppsz )
{
const char *psz = *ppsz;
while( isspace((unsigned char)*psz) )
psz++;
*ppsz = psz;
}
void fn(void)
{
char a[]=" Hello World!";
const char *psz = a;
skip_spaces( &psz );
printf("\n%s", psz);
}