I'm sort of learning C, I'm not a beginner to programming though, I "know" Java and python, and by the way I'm on a mac (leopard).
Firstly,
1: could someone explain when to use a pointer and when not to?
2:
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
And then all but the last would set indexes 0, 1, 2 and 3 to 'f', 'u', 'n' and '\0' respectively. My question is, why isn't the second one a pointer? Why char fun[4] and not char *fun[4]? And how come it seems that a pointer to a struct or an int is always an array?
3:
I understand this:
typedef struct car
{
...
};
is a shortcut for
struct car
{
...
};
typedef struct car car;
Correct? But something I am really confused about:
typedef struct A
{
...
}B;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
4. I understand what pointers do, but I don't understand what the point of them is (no pun intended). And when does something get allocated on the stack vs. the heap? How do I know where it gets allocated? Do pointers have something to do with it?
5. And lastly, know any good tutorial for C game programming (simple) ? And for mac/OS X, not windows?
PS. Is there any other name people use to refer to just C, not C++? I hate how they're all named almost the same thing, so hard to try to google specifically C and not just get C++ and C# stuff.
Thanks!!
It was hard to pick a best answer, they were all great, but the one I picked was the only one that made me understand my 3rd question, which was the only one I was originally going to ask. Thanks again!
My question is, why isn't the second one a pointer?
Because it declares an array. In the two other cases, you have a pointer that refers to data that lives somewhere else. Your array declaration, however, declares an array of data that lives where it's declared. If you declared it within a function, then data will die when you return from that function. Finally char *fun[4] would be an array of 4 pointers - it wouldn't be a char pointer. In case you just want to point to a block of 4 chars, then char* would fully suffice, no need to tell it that there are exactly 4 chars to be pointed to.
The first way which creates an object on the heap is used if you need data to live from thereon until the matching free call. The data will survive a return from a function.
The last way just creates data that's not intended to be written to. It's a pointer which refers to a string literal - it's often stored in read-only memory. If you write to it, then the behavior is undefined.
I understand what pointers do, but I don't understand what the point of them is (no pun intended).
Pointers are used to point to something (no pun, of course). Look at it like this: If you have a row of items on the table, and your friend says "pick the second item", then the item won't magically walk its way to you. You have to grab it. Your hand acts like a pointer, and when you move your hand back to you, you dereference that pointer and get the item. The row of items can be seen as an array of items:
And how come it seems that a pointer to a struct or an int is always an array?
item row[5];
When you do item i = row[1]; then you first point your hand at the first item (get a pointer to the first one), and then you advance till you are at the second item. Then you take your hand with the item back to you :) So, the row[1] syntax is not something special to arrays, but rather special to pointers - it's equivalent to *(row + 1), and a temporary pointer is made up when you use an array like that.
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
typedef struct car
{
...
};
That's not valid code. You basically said "define the type struct car { ... } to be referable by the following ordinary identifier" but you forgot to give it the identifier. The following two snippets are equivalent instead, as far as I can see
1)
struct car
{
...
};
typedef struct car car;
2)
typedef struct car
{
...
} car;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
In our case, the identifier car was declared two times in the same scope. But the declarations won't conflict because each of the identifiers is in a different namespace. The two namespaces involved are the ordinary namespace and the tag namespace. A tag identifier needs to be used after a struct, union or enum keyword, while an ordinary identifier doesn't need anything around it. You may have heard of the POSIX function stat, whose interface looks like the following
struct stat {
...
};
int stat(const char *path, struct stat *buf);
In that code snippet, stat is registered into the two aforementioned namespaces too. struct stat will refer to the struct, and merely stat will refer to the function. Some people don't like to precede identifiers always with struct, union or enum. Those use typedef to introduce an ordinary identifier that will refer to the struct too. The identifier can of course be the same (both times car), or they can differ (one time A the other time B). It doesn't matter.
3) It's bad style to use two different names A and B:
typedef struct A
{
...
} B;
With that definition, you can say
struct A a;
B b;
b.field = 42;
a.field = b.field;
because the variables a and b have the same type. C programmers usually say
typedef struct A
{
...
} A;
so that you can use "A" as a type name, equivalent to "struct A" but it saves you a lot of typing.
Use them when you need to. Read some more examples and tutorials until you understand what pointers are, and this ought to be a lot clearer :)
The second case creates an array in memory, with space for four bytes. When you use that array's name, you magically get back a pointer to the first (index 0) element. And then the [] operator then actually works on a pointer, not an array - x[y] is equivalent to *(x + y). And yes, this means x[y] is the same as y[x]. Sorry.
Note also that when you add an integer to a pointer, it's multiplied by the size of the pointed-to elements, so if you do someIntArray[1], you get the second (index 1) element, not something in between starting at the second byte of the first element.
Also, as a final gotcha - array types in function argument lists - eg, void foo(int bar[4]) - secretly get turned into pointer types - that is, void foo(int *bar). This is only the case in function arguments.
Your third example declares a struct type with two names - struct A and B. In pure C, the struct is mandatory for A - in C++, you can just refer to it as either A or B. Apart from the name change, the two types are completely equivalent, and you can substitute one for the other anywhere, anytime without any change in behavior.
C has three places things can be stored:
The stack - local variables in functions go here. For example:
void foo() {
int x; // on the stack
}
The heap - things go here when you allocate them explicitly with malloc, calloc, or realloc.
void foo() {
int *x; // on the stack
x = malloc(sizeof(*x)); // the value pointed to by x is on the heap
}
Static storage - global variables and static variables, allocated once at program startup.
int x; // static
void foo() {
static int y; // essentially a global that can only be used in foo()
}
No idea. I wish I didn't need to answer all questions at once - this is why you should split them up :)
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
The first one can be set to any size you want at runtime, and be resized later - you can also free the memory when you are done.
The second one is an array, but when you use its name, 'fun' behaves like a pointer to its first element: char *ptr = &fun[0].
I understand what pointers do, but I don't understand what the point of them is (no pun intended). And when does something get allocated on the stack vs. the heap? How do I know where it gets allocated? Do pointers have something to do with it?
When you define something in a function like "char fun[4]" it is defined on the stack and the memory isn't available outside the function.
Using malloc (or new in C++) reserves memory on the heap - you can make this data available anywhere in the program by passing the pointer around. This also lets you decide the size of the memory at runtime, and finally the size of the stack is limited (typically 1 MB) while on the heap you can reserve all the memory you have available.
edit 5. Not really - I would say pure C. C++ is (almost) a superset of C so unless you are working on a very limited embedded system it's usually OK to use C++.
5. Chipmunk
Fast and lightweight 2D rigid body physics library in C.
Designed with 2D video games in mind.
Lightweight C99 implementation with no external dependencies outside of the Std. C library.
Many language bindings available.
Simple, read the documentation and see!
Unrestrictive MIT license.
Makes you smarter, stronger and more attractive to the opposite gender!
...
In your second question:
char *fun = malloc(sizeof(char) * 4);
vs
char fun[4];
vs
char *fun = "fun";
These all involve an array of 4 chars, but that's where the similarity ends. Where they differ is in the lifetime, modifiability and initialisation of those chars.
The first one creates a single pointer to char object called fun - this pointer variable will live only from when this function starts until the function returns. It also calls the C standard library and asks it to dynamically create a memory block the size of an array of 4 chars, and assigns the location of the first char in the block to fun. This memory block (which you can treat as an array of 4 chars) has a flexible lifetime that's entirely up to the programmer - it lives until you pass that memory location to free(). Note that this means that the memory block created by malloc can live for a longer or shorter time than the pointer variable fun itself does. Note also that the association between fun and that memory block is not fixed - you can change fun so it points to a different memory block, or make a different pointer point to that memory block.
One more thing - the array of 4 chars created by malloc is not initialised - it contains garbage values.
The second example creates only one object - an array of 4 chars, called fun. (To test this, change the 4 to 40 and print out sizeof(fun)). This array lives only until the function it's declared in returns (unless it's declared outside of a function, when it lives for as long as the entire program is running). This array of 4 chars isn't initialised either.
The third example creates two objects. The first is a pointer-to-char variable called fun, just like in the first example (and as usual, it lives from the start of this function until it returns). The other object is a bit strange - it's an array of 4 chars, initialised to { 'f', 'u', 'n', 0 }, which has no name and that lives for as long as the entire program is running. It's also not guaranteed to be modifiable (although what happens if you try to modify it is left entirely undefined - it might crash your program, or it might not). The variable fun is initialised with the location of this strange unnamed, unmodifiable, long-lived array (but just like in the first example, this association isn't permanent - you can make fun point to something else).
The reason why there's so many confusing similarities and differences between arrays and pointers is down to two things:
The "array syntax" in C (the [] operator) actually works on pointers, not arrays!
Trying to pin down an array is a bit like catching fog - in almost all cases the array evaporates and is replaced by a pointer to its first element instead.
I am trying to initialize a C struct containing an array from the Go side.
I am new to cgo. Still trying to understand the use case.
test.h
typedef struct reply {
char *name;
reply_cb callback_fn;
} reply_t;
typedef struct common {
char *name;
int count;
reply_t reply[];
} common_t;
int
init_s (common_t *service);
test.go
name := C.CString("ABCD")
defer C.free(unsafe.Pointer(name))
num := C.int(3)
r := [3]C.reply_t{{C.CString("AB"), (C.s_cb)(unsafe.Pointer(C.g_cb))},
{C.CString("BC"), (C.s_cb)(unsafe.Pointer(C.g_cb))},
{C.CString("CD"), (C.s_cb)(unsafe.Pointer(C.g_cb))}}
g := C.common_t{
name: name,
count: num,
reply : r,
}
rc := C.init_s(&g)
I am getting error on "reply: r" unknown field 'r' in struct literal of type
Any help will be appreciated. The goal is initialize and then use it values in C init_s for processing.
You cannot use a flexible array field from Go: https://go-review.googlesource.com/c/go/+/12864/.
I think the reasoning is simple: this wart of C normally requires you to perform a trick of allocating a properly-aligned memory buffer long enough to accommodate the sizeof(struct_type) itself at the beginning of that buffer plus sizeof(array_member[0]) * array_element_count bytes. This does not map to Go's type system because in it, structs have a fixed size known at compile time. If Go did not hide reply from the definition, it would refer to a zero-length field you cannot do anything useful with anyway; see #20275.
Don't be deceived by code examples where a flexible array member field is initialized with a literal: as torek pointed out, it's a GCC extension, but what is more important, it requires work on the part of the compiler. That is, it analyzes the literal, understands the context it appeared in and generates code which allocates a large enough memory block to accommodate both the struct and all the members of the flexible array.
The initialization of the array in your Go code may look superficially similar but it has an important difference: it allocates a separate array which has nothing to do with the memory block of the struct it's supposed to be "put into".
What's more, Go's arrays are different beasts from C's: in C, arrays decay into pointers in most expressions; in Go, arrays are first-class citizens, and when you assign an array or pass it to a function call, the whole array is copied by value, as opposed to "decaying into a pointer" in C's terms.
So even if the Go compiler would not hide the reply field, assignment to it would fail.
I think you cannot directly use values of this type from Go without additional helper code written in C. For instance, to initialize values of common_t, you would write a C helper which would first allocate a memory buffer long enough and then expose to the Go code a pair of pointers: to the beginning of the buffer (of type *C.common_t), and to the first element of the array—as *C.reply_t.
If this C code is the code you own, I'd recommend to just get rid of the flexible array and maintain a pointer to a "normal" array in the reply field.
Yes, this would mean extra pointer chasing for the CPU but it will be simpler to interoperate with Go.
I am trying to access the members in the struct tCAN_MESSAGE. What I think would work is like the first example in main, i.e. some_ptr->canMessage_ptr->value = 10;. But I have some code that someone else has written, and what I can see is that that person has used some_ptr->canMessage_ptr[i].value;.
Is it possible to do it the first way? We are using pointers to structs which contain a pointer to another struct (like the example below) quite often, but I never see the use of ptr1->ptr2->value?
typedef struct
{
int value1;
int value2;
int value3;
float value4;
}tCAN_MESSAGE;
typedef struct
{
tCAN_MESSAGE *canMessage_ptr;
}tSOMETHING;
int main(void)
{
tCAN_MESSAGE var_canMessage;
tSOMETHING var_something;
tSOMETHING *some_ptr = &var_something;
int i = 0;
some_ptr->canMessage_ptr = &var_canMessage;
some_ptr->canMessage_ptr->value1 = 10; //is this valid?
//I have some code that is doing this, and iterating through it with a for:
some_ptr->canMessage_ptr[i].value1; //Is this valid?
return 0;
}
It's very simple: every pointer has to be set to point at a valid memory location before use. If it isn't, you can't use it. You cannot "store data inside pointers". See this:
Crash or "segmentation fault" when data is copied/scanned/read to an uninitialized pointer
None of your code is valid. some_ptr isn't set to point anywhere, so it cannot be accessed, nor can its members. Similarly, some_ptr->canMessage_ptr isn't set to point anywhere either.
I am trying to access the members in the struct tCAN_MESSAGE. What I think would work is like the first example in main, i.e. some_ptr->canMessage_ptr->value = 10;. But I have some code that someone else has written, and what I can see is that that person has used some_ptr->canMessage_ptr[i].value;. Is it possible to do it the first way?
The expression
some_ptr->canMessage_ptr[i].value
is 100% equivalent to
(*(some_ptr->canMessage_ptr + i)).value
which in turn is 100% equivalent to
(some_ptr->canMessage_ptr + i)->value
When i is 0, that is of course equivalent to
some_ptr->canMessage_ptr->value
So yes, it is possible to use some_ptr->canMessage_ptr->value as long as the index in question is 0. If the index is always 0 then chaining arrow operators as you suggest is good style. Otherwise, the mixture of arrow and indexing operators that you see in practice would be my style recommendation.
We are using pointers to structs which contain a pointer to another struct (like the example below) quite often, but I never see the use of ptr1->ptr2->value?
I'm inclined to suspect that you do not fully understand what you're working with. Usage of the form some_ptr->canMessage_ptr[i].value suggests that your tSOMETHING type contains a pointer to the first element of an array of possibly many tCAN_MESSAGEs, which is a subtle but important distinction to make. In that case, yes, as shown above, you can chain arrow operators to access the first element of such an array (at index 0). However, the cleanest syntax for accessing other elements of that array is to use the indexing operator, and it pays to be consistent.
This question goes out to the C gurus out there:
In C, it is possible to declare a pointer as follows:
char (* p)[10];
.. which basically states that this pointer points to an array of 10 chars. The neat thing about declaring a pointer like this is that you will get a compile time error if you try to assign a pointer of an array of different size to p. It will also give you a compile time error if you try to assign the value of a simple char pointer to p. I tried this with gcc and it seems to work with ANSI, C89 and C99.
It looks to me like declaring a pointer like this would be very useful - particularly, when passing a pointer to a function. Usually, people would write the prototype of such a function like this:
void foo(char * p, int plen);
If you were expecting a buffer of a specific size, you would simply test the value of plen. However, you cannot be guaranteed that the person who passes p to you will really give you plen valid memory locations in that buffer. You have to trust that the person who called this function is doing the right thing. On the other hand:
void foo(char (*p)[10]);
..would force the caller to give you a buffer of the specified size.
This seems very useful but I have never seen a pointer declared like this in any code I have ever run across.
My question is: Is there any reason why people do not declare pointers like this? Am I not seeing some obvious pitfall?
What you are saying in your post is absolutely correct. I'd say that every C developer comes to exactly the same discovery and to exactly the same conclusion when (if) they reach a certain level of proficiency with the C language.
When the specifics of your application area call for an array of specific fixed size (array size is a compile-time constant), the only proper way to pass such an array to a function is by using a pointer-to-array parameter
void foo(char (*p)[10]);
(in C++ language this is also done with references
void foo(char (&p)[10]);
).
This will enable language-level type checking, which will make sure that the array of exactly correct size is supplied as an argument. In fact, in many cases people use this technique implicitly, without even realizing it, hiding the array type behind a typedef name
typedef int Vector3d[3];
void transform(Vector3d *vector);
/* equivalent to `void transform(int (*vector)[3])` */
...
Vector3d vec;
...
transform(&vec);
Note additionally that the above code is invariant with relation to Vector3d type being an array or a struct. You can switch the definition of Vector3d at any time from an array to a struct and back, and you won't have to change the function declaration. In either case the functions will receive an aggregate object "by reference" (there are exceptions to this, but within the context of this discussion this is true).
However, you won't see this method of array passing used explicitly too often, simply because too many people get confused by a rather convoluted syntax and are simply not comfortable enough with such features of C language to use them properly. For this reason, in average real life, passing an array as a pointer to its first element is a more popular approach. It just looks "simpler".
But in reality, using the pointer to the first element for array passing is a very niche technique, a trick, which serves a very specific purpose: its one and only purpose is to facilitate passing arrays of different size (i.e. run-time size). If you really need to be able to process arrays of run-time size, then the proper way to pass such an array is by a pointer to its first element with the concrete size supplied by an additional parameter
void foo(char p[], unsigned plen);
Actually, in many cases it is very useful to be able to process arrays of run-time size, which also contributes to the popularity of the method. Many C developers simply never encounter (or never recognize) the need to process a fixed-size array, thus remaining oblivious to the proper fixed-size technique.
Nevertheless, if the array size is fixed, passing it as a pointer to an element
void foo(char p[])
is a major technique-level error, which unfortunately is rather widespread these days. A pointer-to-array technique is a much better approach in such cases.
Another reason that might hinder the adoption of the fixed-size array passing technique is the dominance of naive approach to typing of dynamically allocated arrays. For example, if the program calls for fixed arrays of type char[10] (as in your example), an average developer will malloc such arrays as
char *p = malloc(10 * sizeof *p);
This array cannot be passed to a function declared as
void foo(char (*p)[10]);
which confuses the average developer and makes them abandon the fixed-size parameter declaration without giving it a further thought. In reality though, the root of the problem lies in the naive malloc approach. The malloc format shown above should be reserved for arrays of run-time size. If the array type has compile-time size, a better way to malloc it would look as follows
char (*p)[10] = malloc(sizeof *p);
This, of course, can be easily passed to the above declared foo
foo(p);
and the compiler will perform the proper type checking. But again, this is overly confusing to an unprepared C developer, which is why you won't see it too often in the "typical" average everyday code.
I would like to add to AndreyT's answer (in case anyone stumbles upon this page looking for more info on this topic):
As I begin to play more with these declarations, I realize that there is a major handicap associated with them in C (apparently not in C++). It is fairly common to have a situation where you would like to give a caller a const pointer to a buffer you have written into. Unfortunately, this is not possible when declaring a pointer like this in C. In other words, the C standard (6.7.3 - Paragraph 8) is at odds with something like this:
int array[9];
const int (* p2)[9] = &array; /* Not legal unless array is const as well */
This constraint does not seem to be present in C++, making this type of declaration far more useful. But in the case of C, it is necessary to fall back to a regular pointer declaration whenever you want a const pointer to the fixed size buffer (unless the buffer itself was declared const to begin with). You can find more info in this mail thread: link text
This is a severe constraint in my opinion and it could be one of the main reasons why people do not usually declare pointers like this in C. The other being the fact that most people do not even know that you can declare a pointer like this as AndreyT has pointed out.
The obvious reason is that this code doesn't compile:
extern void foo(char (*p)[10]);
void bar() {
char p[10];
foo(p);
}
The default promotion of an array is to an unqualified pointer.
Also see this question, using foo(&p) should work.
I also want to use this syntax to enable more type checking.
But I also agree that the syntax and mental model of using pointers is simpler, and easier to remember.
Here are some more obstacles I have come across.
Accessing the array requires using (*p)[]:
void foo(char (*p)[10])
{
char c = (*p)[3];
(*p)[0] = 1;
}
It is tempting to use a local pointer-to-char instead:
void foo(char (*p)[10])
{
char *cp = (char *)p;
char c = cp[3];
cp[0] = 1;
}
But this would partially defeat the purpose of using the correct type.
One has to remember to use the address-of operator when assigning an array's address to a pointer-to-array:
char a[10];
char (*p)[10] = &a;
The address-of operator gets the address of the whole array in &a, with the correct type to assign it to p. Without the operator, a is automatically converted to the address of the first element of the array, same as in &a[0], which has a different type.
Since this automatic conversion is already taking place, I am always puzzled that the & is necessary. It is consistent with the use of & on variables of other types, but I have to remember that an array is special and that I need the & to get the correct type of address, even though the address value is the same.
One reason for my problem may be that I learned K&R C back in the 80s, which did not allow using the & operator on whole arrays yet (although some compilers ignored that or tolerated the syntax). Which, by the way, may be another reason why pointers-to-arrays have a hard time to get adopted: they only work properly since ANSI C, and the & operator limitation may have been another reason to deem them too awkward.
When typedef is not used to create a type for the pointer-to-array (in a common header file), then a global pointer-to-array needs a more complicated extern declaration to share it across files:
fileA:
char (*p)[10];
fileB:
extern char (*p)[10];
Well, simply put, C doesn't do things that way. An array of type T is passed around as a pointer to the first T in the array, and that's all you get.
This allows for some cool and elegant algorithms, such as looping through the array with expressions like
*dst++ = *src++
The downside is that management of the size is up to you. Unfortunately, failure to do this conscientiously has also led to millions of bugs in C coding, and/or opportunities for malevolent exploitation.
What comes close to what you ask in C is to pass around a struct (by value) or a pointer to one (by reference). As long as the same struct type is used on both sides of this operation, both the code that hands out the reference and the code that uses it are in agreement about the size of the data being handled.
Your struct can contain whatever data you want; it could contain your array of a well-defined size.
Still, nothing prevents you or an incompetent or malevolent coder from using casts to fool the compiler into treating your struct as one of a different size. The almost unshackled ability to do this kind of thing is a part of C's design.
You can declare an array of characters a number of ways:
char p[10];
char* p = (char*)malloc(10 * sizeof(char));
The prototype of a function that takes a pointer to the array's first element is:
void foo(char* p); //cannot modify the caller's pointer p, only the chars it points to
or, taking the pointer itself by reference:
void foo(char** p); //can modify the caller's p; dereference with (*p)[0] = 'f';
or by array syntax:
void foo(char p[]); //same as char*
I would not recommend this solution
typedef int Vector3d[3];
since it obscures the fact that Vector3d is an array type, which you must know about. Programmers usually don't expect variables of the same type to have different sizes. Consider:
void foo(Vector3d a) {
Vector3d b;
}
where sizeof a != sizeof b
Maybe I'm missing something, but... since arrays are constant pointers, basically that means that there's no point in passing around pointers to them.
Couldn't you just use void foo(char p[10], int plen); ?
type (*)[];
// pointer to an array, e.g.
int (*ptr)[5];
// ptr points to an array of 5 integers
// and holds the address of the whole array
type *[];
// array of pointers, e.g.
int* ptr[5];
// ptr is an array of five integer pointers
// and holds 5 addresses.
On my compiler (vs2008) it treats char (*p)[10] as an array of character pointers, as if there was no parentheses, even if I compile as a C file. Is compiler support for this "variable"? If so that is a major reason not to use it.
I've just started to learn C so please be kind.
From what I've read so far regarding pointers:
int * test1; //this is a pointer which is basically an address to the process
//memory and usually has the size of 2 bytes (not necessarily, I know)
float test2; //this is an actual value and usually has the size of 4 bytes,
//being of float type
test2 = 3.0; //this assigns 3 to `test2`
Now, what I don't completely understand:
*test1 = 3; //does this assign 3 at the address
//specified by `test1`?
test1 = 3; //this says that the pointer is basically pointing
//at the 3rd byte in process memory,
//which is somehow useless, since anything could be there
&test1; //this I really don't get,
//is it the pointer to the pointer?
//Meaning, the address at which the pointer address is kept?
//Is it of any use?
Similarly:
*test2; //does this make any sense?
&test2; //is this the address at which the 'test2' value is found?
//If so, it's a pointer, which means that you can have pointers pointing
//both to the heap address space and stack address space.
//I ask because I've always been confused by people who speak about
//pointers only in the heap context.
Great question.
Your first block is correct. A pointer is a variable that holds the address of some data. The type of that pointer tells the code how to interpret the contents of the address being held by that pointer.
The construct:
*test1 = 3
Is called the dereferencing of a pointer. That means you can access the location that the pointer points to and read and write to it like a normal variable. Note:
int *test;
/*
* test is a pointer to an int - (int *)
* *test behaves like an int - (int)
*
* So you can think of (*test) as a pseudo-variable which has the type 'int'
*/
The above is just a mnemonic device that I use.
It is rare that you ever assign a numeric value to a pointer... maybe if you're developing for a specific environment which has some 'well-known' memory addresses, but at your level, I wouldn't worry too much about that.
Using
*test2
would result in an error. You'd be trying to dereference something that is not a pointer, and since test2 is a float the compiler will reject it outright.
&test1 and &test2 are, indeed, pointers to test1 and test2.
Pointers to pointers are very useful and a search of pointer to a pointer will lead you to some resources that are way better than I am.
It looks like you've got the first part right.
An incidental thought: there are various conventions about where to put that * sign. I prefer mine nestled with the variable name, as in int *test1 while others prefer int* test1. I'm not sure how common it is to have it floating in the middle.
Another incidental thought: test2 = 3.0 assigns a floating-point 3 to test2. The same end could be achieved with test2=3, in which case the 3 is implicitly converted from an integer to a floating point number. The convention you have chosen is probably safer in terms of clarity, but is not strictly necessary.
Non-incidentals
*test1=3 does assign 3 to the address specified by test1.
test1=3 is a line that has meaning, but which I consider meaningless. We do not know what is at memory location 3, if it is safe to touch it, or even if we are allowed to touch it.
That's why it's handy to use something like
int var=3;
int *pointy=&var;
*pointy=4;
//Now var==4.
The command &var returns the memory location of var and stores it in pointy so that we can later access it with *pointy.
But I could also do something like this:
int var[]={1,2,3};
int *pointy=var; //var decays to a pointer to its first element
int offset=2;
*(pointy+offset)=4;
//Now var[2]==4.
And this is where plain numbers and pointers legitimately mix: an integer offset can be added to or subtracted from a pointer, so you can store offsets like this. Note that the offset itself is an ordinary int, not a pointer.
&test1 is a pointer to a pointer, but that sounds kind of confusing to me. It's really the address in memory where the value of test1 is stored. And test1 just happens to store as its value the address of another variable. Once you start thinking of pointers in this way (address in memory, value stored there), they become easier to work with... or at least I think so.
I don't know if *test2 has "meaning", per se. In principle, it could have a use in that we might imagine that the * command will take the value of test2 to be some location in memory, and it will return the value it finds there. But since you define test2 as a float, it is difficult to predict where in memory we would end up; setting test2=3 will not move us to the third spot of anything (look up the IEEE754 specification to see why). But I would be surprised if a compiler would allow such a thing.
Let's look at another quick example:
int var=3;
int *pointy1=&var;
int **pointy2=&pointy1;
*pointy1=4; //Now var==4
**pointy2=5; //Now var==5
So you see that you can chain pointers together like this, as many in a row as you'd like. This might show up if you had an array of pointers which was filled with the addresses of many structures you'd created from dynamic memory, and those structures contained pointers to dynamically allocated things themselves. When the time comes to use a pointer to a pointer, you'll probably know it. For now, don't worry too much about them.
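One place a pointer to a pointer tends to show up early is a function that needs to change the caller's pointer, for example to hand back freshly allocated memory. The sketch below is only an illustration: make_int and demo_make_int are made-up names, not anything from the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates one int and hands it back through `out`.
 * Returns 0 on success, -1 if the allocation fails. */
int make_int(int **out, int value)
{
    assert(out != NULL);
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = value;   /* write through the new pointer */
    *out = p;     /* overwrite the caller's pointer variable */
    return 0;
}

/* Hypothetical demo: returns the value read back through the caller's pointer. */
int demo_make_int(void)
{
    int *mine = NULL;
    int result = -1;
    if (make_int(&mine, 42) == 0) {
        result = *mine;   /* mine now points at the allocated int */
        free(mine);
    }
    return result;
}
```

Passing &mine is what lets make_int change where mine points; passing mine alone would only give the function a copy of the pointer.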
First let's add some confusion: the word "pointer" can refer to either a variable (or object) with a pointer type, or an expression with the pointer type. In most cases, when people talk about "pointers" they mean pointer variables.
A pointer can (must) point to a thing (An "object" in standards parlance). It can only point to the right kind of thing; a pointer to int is not supposed to point to a float object. A pointer can also be NULL; in that case there is no thing to point to.
A pointer type is also a type, and a pointer object is also an object. So it is allowable to construct a pointer to pointer: the pointer-to-pointer just stores the address of the pointer object.
What a pointer can not be:
It cannot point to a value: p = &4; is impossible. 4 is a literal value, which is not stored in an object, and thus has no address.
The same goes for expressions: p = &(1+4); is impossible, because the expression "1+4" does not have a location.
The same goes for return values: p = &sin(pi); is impossible; the return value is not an object and thus has no address.
Variables marked as "register" (almost extinct now) cannot have an address.
You cannot take the address of a bit-field, basically because these can be smaller than a character (or have a finer granularity), hence it would be possible that different bit-fields would have the same address.
There are some "exceptions" to the above skeleton (void pointers, casting, pointing one element beyond an array object) but for clarity these should be seen as refinements/amendments, IMHO.
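To make the rules above concrete, here is a small sketch of what does and does not have an address. The struct and function names (flags, address_demo) are invented for the illustration; the rejected lines are left in a comment because they would not compile.

```c
#include <assert.h>
#include <stddef.h>

struct flags {
    unsigned ready : 1;   /* bit-field: has no address of its own */
    int count;            /* ordinary member: addressable */
};

int address_demo(void)
{
    int x = 4;
    int *p = &x;          /* fine: x is an object */
    int **pp = &p;        /* fine: p is an object too */
    int *nothing = NULL;  /* fine: points at no object */

    struct flags f = { 1, 0 };
    int *pc = &f.count;   /* fine: ordinary struct member */
    *pc = 7;

    /* Each of the following is rejected by the compiler:
     *   p = &4;          a literal is not an object
     *   p = &(1 + 4);    neither is the result of an expression
     *   &f.ready;        a bit-field has no address
     */

    assert(nothing == NULL);
    return **pp + f.count;   /* 4 + 7 */
}
```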
This question goes out to the C gurus out there:
In C, it is possible to declare a pointer as follows:
char (* p)[10];
.. which basically states that this pointer points to an array of 10 chars. The neat thing about declaring a pointer like this is that you will get a compile time error if you try to assign a pointer of an array of different size to p. It will also give you a compile time error if you try to assign the value of a simple char pointer to p. I tried this with gcc and it seems to work with ANSI, C89 and C99.
It looks to me like declaring a pointer like this would be very useful - particularly, when passing a pointer to a function. Usually, people would write the prototype of such a function like this:
void foo(char * p, int plen);
If you were expecting a buffer of a specific size, you would simply test the value of plen. However, you cannot be guaranteed that the person who passes p to you will really give you plen valid memory locations in that buffer. You have to trust that the person who called this function is doing the right thing. On the other hand:
void foo(char (*p)[10]);
..would force the caller to give you a buffer of the specified size.
This seems very useful but I have never seen a pointer declared like this in any code I have ever run across.
My question is: Is there any reason why people do not declare pointers like this? Am I not seeing some obvious pitfall?
What you are saying in your post is absolutely correct. I'd say that every C developer comes to exactly the same discovery and to exactly the same conclusion when (if) they reach certain level of proficiency with C language.
When the specifics of your application area call for an array of specific fixed size (array size is a compile-time constant), the only proper way to pass such an array to a function is by using a pointer-to-array parameter
void foo(char (*p)[10]);
(in C++ language this is also done with references
void foo(char (&p)[10]);
).
This will enable language-level type checking, which will make sure that the array of exactly correct size is supplied as an argument. In fact, in many cases people use this technique implicitly, without even realizing it, hiding the array type behind a typedef name
typedef int Vector3d[3];
void transform(Vector3d *vector);
/* equivalent to `void transform(int (*vector)[3])` */
...
Vector3d vec;
...
transform(&vec);
Note additionally that the above code is invariant with relation to Vector3d type being an array or a struct. You can switch the definition of Vector3d at any time from an array to a struct and back, and you won't have to change the function declaration. In either case the functions will receive an aggregate object "by reference" (there are exceptions to this, but within the context of this discussion this is true).
However, you won't see this method of array passing used explicitly too often, simply because too many people get confused by a rather convoluted syntax and are simply not comfortable enough with such features of C language to use them properly. For this reason, in average real life, passing an array as a pointer to its first element is a more popular approach. It just looks "simpler".
But in reality, using the pointer to the first element for array passing is a very niche technique, a trick, which serves a very specific purpose: its one and only purpose is to facilitate passing arrays of different size (i.e. run-time size). If you really need to be able to process arrays of run-time size, then the proper way to pass such an array is by a pointer to its first element with the concrete size supplied by an additional parameter
void foo(char p[], unsigned plen);
Actually, in many cases it is very useful to be able to process arrays of run-time size, which also contributes to the popularity of the method. Many C developers simply never encounter (or never recognize) the need to process a fixed-size array, thus remaining oblivious to the proper fixed-size technique.
Nevertheless, if the array size is fixed, passing it as a pointer to an element
void foo(char p[])
is a major technique-level error, which unfortunately is rather widespread these days. A pointer-to-array technique is a much better approach in such cases.
Another reason that might hinder the adoption of the fixed-size array passing technique is the dominance of naive approach to typing of dynamically allocated arrays. For example, if the program calls for fixed arrays of type char[10] (as in your example), an average developer will malloc such arrays as
char *p = malloc(10 * sizeof *p);
This array cannot be passed to a function declared as
void foo(char (*p)[10]);
which confuses the average developer and makes them abandon the fixed-size parameter declaration without giving it a further thought. In reality though, the root of the problem lies in the naive malloc approach. The malloc format shown above should be reserved for arrays of run-time size. If the array type has compile-time size, a better way to malloc it would look as follows
char (*p)[10] = malloc(sizeof *p);
This, of course, can be easily passed to the above declared foo
foo(p);
and the compiler will perform the proper type checking. But again, this is overly confusing to an unprepared C developer, which is why you won't see it too often in "typical" average everyday code.
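Putting the pieces together, a minimal sketch of the fixed-size technique might look like this. The names fill and fixed_size_demo are hypothetical; the point is that the size travels with the parameter type, so sizeof *p inside the function is the array size and not the size of a pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fills the whole array; the size comes from the parameter type itself. */
size_t fill(char (*p)[10], char value)
{
    memset(*p, value, sizeof *p);   /* sizeof *p is the size of char[10] */
    return sizeof *p;
}

/* Returns 1 if both a local and a malloc'ed char[10] were filled correctly. */
int fixed_size_demo(void)
{
    char local[10];
    char (*heap)[10] = malloc(sizeof *heap);
    if (heap == NULL)
        return 0;

    size_t n1 = fill(&local, 'a');   /* note the & on a real array */
    size_t n2 = fill(heap, 'b');     /* heap already has the right type */

    int ok = (n1 == 10 && n2 == 10 &&
              local[9] == 'a' && (*heap)[0] == 'b');
    free(heap);
    return ok;
}
```

Passing a char[9], or a plain char *, to fill should draw a diagnostic from the compiler about incompatible pointer types, which is the type checking discussed above.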
I would like to add to AndreyT's answer (in case anyone stumbles upon this page looking for more info on this topic):
As I begin to play more with these declarations, I realize that there is a major handicap associated with them in C (apparently not in C++). It is fairly common to have a situation where you would like to give a caller a const pointer to a buffer you have written into. Unfortunately, this is not possible when declaring a pointer like this in C. In other words, the C standard (6.7.3 - Paragraph 8) is at odds with something like this:
int array[9];
const int (* p2)[9] = &array; /* Not legal unless array is const as well */
This constraint does not seem to be present in C++, making these types of declarations far more useful. But in the case of C, it is necessary to fall back to a regular pointer declaration whenever you want a const pointer to the fixed size buffer (unless the buffer itself was declared const to begin with). You can find more info in this mail thread: link text
This is a severe constraint in my opinion and it could be one of the main reasons why people do not usually declare pointers like this in C. The other being the fact that most people do not even know that you can declare a pointer like this as AndreyT has pointed out.
The obvious reason is that this code doesn't compile:
extern void foo(char (*p)[10]);
void bar() {
char p[10];
foo(p);
}
By default an array decays to a plain pointer to its first element (char * here), not to a pointer to the whole array.
Also see this question, using foo(&p) should work.
I also want to use this syntax to enable more type checking.
But I also agree that the syntax and mental model of using pointers is simpler, and easier to remember.
Here are some more obstacles I have come across.
Accessing the array requires using (*p)[]:
void foo(char (*p)[10])
{
char c = (*p)[3];
(*p)[0] = 1;
}
It is tempting to use a local pointer-to-char instead:
void foo(char (*p)[10])
{
char *cp = (char *)p;
char c = cp[3];
cp[0] = 1;
}
But this would partially defeat the purpose of using the correct type.
One has to remember to use the address-of operator when assigning an array's address to a pointer-to-array:
char a[10];
char (*p)[10] = &a;
The address-of operator gets the address of the whole array in &a, with the correct type to assign it to p. Without the operator, a is automatically converted to the address of the first element of the array, same as in &a[0], which has a different type.
Since this automatic conversion is already taking place, I am always puzzled that the & is necessary. It is consistent with the use of & on variables of other types, but I have to remember that an array is special and that I need the & to get the correct type of address, even though the address value is the same.
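A quick way to convince yourself that a and &a carry the same address but different types is to look at what "+ 1" does to each. This is only a sketch; the three function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes moved by "+ 1" on a pointer to the first element. */
size_t step_of_element_pointer(void)
{
    char a[10];
    char *first = a;                      /* a decays to &a[0] */
    return (size_t)((first + 1) - first) * sizeof *first;
}

/* Bytes moved by "+ 1" on a pointer to the whole array. */
size_t step_of_array_pointer(void)
{
    char a[10];
    char (*whole)[10] = &a;               /* needs the & to get this type */
    return (size_t)((char *)(whole + 1) - (char *)whole);
}

/* Nonzero if a and &a compare equal as raw addresses. */
int same_address(void)
{
    char a[10];
    return (void *)a == (void *)&a;
}
```

The element pointer steps by one char, the array pointer steps by the whole char[10], even though both start at the same place.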
One reason for my problem may be that I learned K&R C back in the 80s, which did not allow using the & operator on whole arrays yet (although some compilers ignored that or tolerated the syntax). Which, by the way, may be another reason why pointers-to-arrays have had a hard time getting adopted: they only work properly since ANSI C, and the & operator limitation may have been another reason to deem them too awkward.
When typedef is not used to create a type for the pointer-to-array (in a common header file), then a global pointer-to-array needs a more complicated extern declaration to share it across files:
fileA:
char (*p)[10];
fileB:
extern char (*p)[10];
Well, simply put, C doesn't do things that way. An array of type T is passed around as a pointer to the first T in the array, and that's all you get.
This allows for some cool and elegant algorithms, such as looping through the array with expressions like
*dst++ = *src++
The downside is that management of the size is up to you. Unfortunately, failure to do this conscientiously has also led to millions of bugs in C coding, and/or opportunities for malevolent exploitation.
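For illustration, here is that idiom wrapped in a loop with the size passed alongside, which is the bookkeeping the caller is on the hook for. copy_chars and copy_demo are hypothetical names, not library functions.

```c
#include <assert.h>
#include <stddef.h>

/* Copies n chars from src to dst; the caller promises both buffers hold n. */
void copy_chars(char *dst, const char *src, size_t n)
{
    while (n-- > 0)
        *dst++ = *src++;   /* copy one char, then advance both pointers */
}

/* Returns nonzero if "fun" (with its terminator) arrived intact. */
int copy_demo(void)
{
    const char src[4] = "fun";   /* 'f', 'u', 'n', '\0' */
    char dst[4] = "";
    assert(sizeof src == sizeof dst);
    copy_chars(dst, src, sizeof dst);
    return dst[0] == 'f' && dst[1] == 'u' && dst[2] == 'n' && dst[3] == '\0';
}
```

Nothing in copy_chars can check that n is honest; pass a value larger than the real buffers and it will happily run off the end.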
What comes close to what you ask in C is to pass around a struct (by value) or a pointer to one (by reference). As long as the same struct type is used on both sides of this operation, both the code that hands out the reference and the code that uses it are in agreement about the size of the data being handled.
Your struct can contain whatever data you want; it could contain your array of a well-defined size.
Still, nothing prevents you or an incompetent or malevolent coder from using casts to fool the compiler into treating your struct as one of a different size. The almost unshackled ability to do this kind of thing is a part of C's design.
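A rough sketch of the struct approach, assuming a made-up type buffer10 that wraps a 10-char array:

```c
#include <assert.h>
#include <stddef.h>

struct buffer10 {
    char data[10];
};

/* By value: the callee gets its own copy of all 10 chars. */
char first_after_local_change(struct buffer10 b)
{
    b.data[0] = 'x';   /* changes only the copy */
    return b.data[0];
}

/* By pointer: the callee works on the caller's struct. */
void set_first(struct buffer10 *b, char c)
{
    b->data[0] = c;
}

/* Returns nonzero if by-value left buf alone and by-pointer changed it. */
int struct_demo(void)
{
    struct buffer10 buf = { "abc" };
    assert(sizeof buf.data == 10);   /* size is part of the type */

    char seen = first_after_local_change(buf);
    int unchanged = (buf.data[0] == 'a');
    set_first(&buf, 'z');
    return seen == 'x' && unchanged && buf.data[0] == 'z';
}
```

Both sides agree on the size because it is baked into struct buffer10, at the cost of copying the whole array whenever the struct is passed by value.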
You can declare an array of characters a number of ways:
char p[10];
char* p = (char*)malloc(10 * sizeof(char));
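The prototype of a function that takes an array through a pointer passed by value is:
void foo(char* p); //can modify the chars p points to, but not the caller's pointer
or, passing the pointer itself by reference:
void foo(char** p); //can modify the caller's pointer; dereference with (*p)[0] = 'f';
or by array syntax:
void foo(char p[]); //same as char*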
I would not recommend this solution
typedef int Vector3d[3];
since it obscures the fact that Vector3d has a type that you
must know about. Programmers usually don't expect variables of the
same type to have different sizes. Consider :
void foo(Vector3d a) {
Vector3d b;
}
where sizeof a != sizeof b (the parameter a is really an int *, while b is a genuine int[3])
Maybe I'm missing something, but... since arrays are constant pointers, basically that means that there's no point in passing around pointers to them.
Couldn't you just use void foo(char p[10], int plen); ?
type (*)[];
// a pointer to an array, e.g.
int (*ptr)[5];
// ptr points to an array of 5 ints;
// it holds the address of the whole array
type *[];
// an array of pointers, e.g.
int *ptr[5];
// ptr is an array of five pointers to int,
// i.e. it holds 5 addresses
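If you want to check which of the two your compiler thinks it has, sizeof tells them apart. This is a sketch with invented names (declarator_demo, to_array, array_of_ptrs):

```c
#include <assert.h>
#include <stddef.h>

int declarator_demo(void)
{
    int values[5] = { 10, 20, 30, 40, 50 };
    int (*to_array)[5] = &values;   /* one pointer to a whole int[5] */
    int *array_of_ptrs[5];          /* five separate int pointers */
    size_t i;

    for (i = 0; i < 5; i++)
        array_of_ptrs[i] = &values[i];

    /* What the single pointer points at is the full array... */
    assert(sizeof *to_array == 5 * sizeof(int));
    /* ...while the array of pointers is five pointers wide. */
    assert(sizeof array_of_ptrs == 5 * sizeof(int *));

    return (*to_array)[2] + *array_of_ptrs[4];   /* 30 + 50 */
}
```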
On my compiler (vs2008) it treats char (*p)[10] as an array of character pointers, as if there was no parentheses, even if I compile as a C file. Is compiler support for this "variable"? If so that is a major reason not to use it.