I am learning C, mainly by K&R, but now I have found an Object Oriented C pdf tutorial and am fascinated. I'm going through it, but my C skills/knowledge may not be up to the task.
This is the tutorial: http://www.planetpdf.com/codecuts/pdfs/ooc.pdf
My question comes from looking at many different functions in the first couple of chapters of the pdf. Below is one of them. (page 14 of pdf)
void delete(void * self){
const struct Class ** cp = self;
if (self&&*cp&&(*cp)->dtor)
self = (*cp)->dtor(self);
free(self);
}
dtor is a destructor function pointer. But knowledge of this isn't really necessary for my questions.
My first question is, why is **cp constant? Is it necessary or just being thorough so the code writer doesn't do anything damaging by accident?
Secondly, why is cp a pointer-to-a-pointer (double asterisk?). The struct class was defined on page 12 of the pdf. I don't understand why it can't be a single pointer, since we are casting the self pointer to a Class pointer, it seems.
Thirdly, how is a void pointer being changed to a Class pointer (or pointer-to-a-Class-pointer)? I think this question most shows my lack of understanding of C. What I imagine in my head is a void pointer taking up a set amount of memory, but it must be less than Class pointer, because a Class has a lot of "stuff" in it. I know a void pointer can be "cast" to another type of pointer, but I don't understand how, since there may not be enough memory to perform this.
Thanks in advance
Interesting pdf.
My first question is, why is **cp constant? Is it necessary or just
being thorough so the code writer doesn't do anything damaging by
accident?
It's necessary so the writer doesn't do anything by accident, yes, and to communicate something about the nature of the pointer and its use to the reader of the code.
Secondly, why is cp a pointer-to-a-pointer (double asterisk?). The
struct class was defined on page 12 of the pdf. I don't understand why
it can't be a single pointer, since we are casting the self pointer to
a Class pointer, it seems.
Take a look at the definition of new() (pg 13) where the pointer p is created (the same pointer that's passed as self to delete()):
void * new (const void * _class, ...)
{
const struct Class * class = _class;
void * p = calloc(1, class->size);
* (const struct Class **) p = class;
So, 'p' is allocated space, then dereferenced and assigned a pointer value (the address in class; this is like dereferencing and assigning to an int pointer, but instead of an int, we're assigning an address). This means the first thing in p is a pointer to its class definition. However, p was allocated space for more than just that (it will also hold the object's instance data). Now consider delete() again:
const struct Class ** cp = self;
if (self&&*cp&&(*cp)->dtor)
When cp is dereferenced, since it was a pointer to a pointer, the result is a pointer. What does a pointer contain? An address. What address? The address of the class definition, which is stored at the beginning of the block pointed to by p.
This is sort of clever, because p's not really a pointer to a pointer -- it has a larger chunk of memory allocated which contains the specific object data. However, at the very beginning of that block is an address (the address of the class definition), so if p is dereferenced into a pointer (via casting or cp), you have access to that definition. So, the class definition exists only in one place, but each instance of that class contains a reference to the definition. Make sense? It would be clearer if p were typed as a struct like this:
struct object {
const struct Class *class;
[...]
};
Then you could just use something like p->class->dtor() instead of the existing code in delete(). However, this would mess up and complicate the larger picture.
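To make the layout concrete, here is a minimal sketch of the same trick. The names Point, PointClass, new_object and delete_object are made up for illustration, and the struct Class below is a trimmed-down stand-in, not the tutorial's full definition:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down class descriptor. */
struct Class {
    size_t size;                /* bytes to allocate per instance */
    void *(*dtor)(void *self);  /* optional destructor, returns the pointer to free */
};

/* Hypothetical object type: the class pointer must be the first member. */
struct Point {
    const struct Class *class;
    int x, y;
};

static int dtor_calls = 0;

static void *Point_dtor(void *self)
{
    dtor_calls++;               /* just record that the destructor ran */
    return self;
}

static const struct Class PointClass = { sizeof(struct Point), Point_dtor };

static void *new_object(const struct Class *class)
{
    assert(class != NULL);
    void *p = calloc(1, class->size);
    if (p != NULL)
        *(const struct Class **)p = class;  /* store the descriptor at offset 0 */
    return p;
}

static void delete_object(void *self)
{
    const struct Class **cp = self;         /* view the block as "pointer to class pointer" */
    if (self && *cp && (*cp)->dtor)
        self = (*cp)->dtor(self);
    free(self);
}

/* Returns 1 if the descriptor is reachable through the generic pointer
 * and deleting the object runs the destructor exactly once. */
int demo_class_pointer(void)
{
    int before = dtor_calls;
    void *obj = new_object(&PointClass);
    if (obj == NULL)
        return 0;
    const struct Class **cp = obj;
    int found = (*cp == &PointClass);
    delete_object(obj);
    return found && dtor_calls == before + 1;
}
```

delete_object never needs to know about struct Point; it only relies on the first pointer-sized slot of the block holding the class descriptor.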
Thirdly, how is a void pointer being changed to a Class pointer (or
pointer-to-a-Class-pointer)? I think this question most shows my lack
of understanding of C. What I imagine in my head is a void pointer
taking up a set amount of memory, but it must be less than Class
pointer, because a Class has a lot of "stuff" in it.
A pointer is like an int -- it has a small, set size for holding a value. That value is a memory address. When you dereference a pointer (via * or ->) what you are accessing is the memory at that address. But since memory addresses are all the same length (eg, 8 bytes on a 64-bit system) pointers themselves are all the same size regardless of type. This is how the magic of the object pointer 'p' worked. To re-iterate: the first thing in the block of memory p points to is an address, which allows it to function as a pointer to a pointer, and when that is dereferenced, you get the block of memory containing the class definition, which is separate from the instance data in p.
In this case, that's just a precaution. The function shouldn't be modifying the class (in fact, probably nothing should), so casting to const struct Class * makes sure that the class is more difficult to inadvertently change.
I'm not super-familiar with the Object-Oriented C library being used here, but I suspect this is a nasty trick. The first pointer in self is probably a reference to the class, so dereferencing self will give a pointer to the class. In effect, self can always be treated as a struct Class **.
A diagram may help here:
+--------+
self -> | *class | -> [Class]
| .... |
| .... |
+--------+
Remember that all pointers are just addresses.* The type of a pointer has no bearing on the size of the pointer; they're all 32 or 64 bits wide, depending on your system, so you can convert from one type to another at any time. The compiler will warn you if you try to convert between types of pointer without a cast, but void * pointers can always be converted to anything without a cast, as they're used throughout C to indicate a "generic" pointer.
*: There are some odd platforms where this isn't true, and different types of pointers are in fact sometimes different sizes. If you're using one of them, though, you'd know about it. In all probability, you aren't.
const is used to cause a compilation error if the code attempts to change anything within the object pointed to. This is a safety feature when the programmer intends only to read the object and does not intend to change it.
** is used because that must be what was passed to the function. It would be a grave programming error to re-declare it as something it is not.
A pointer is simply an address. On almost all modern CPUs, all addresses are the same size (32 bit or 64 bit). Changing a pointer from one type to another doesn't actually change the value. It says to regard what is at that address as a different layout of data.
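A small sketch of that point, assuming a made-up struct big: the pointer stays small no matter how large the pointed-to type is, and converting through void * leaves the address unchanged.

```c
#include <assert.h>

/* Hypothetical large type, used only for illustration. */
struct big {
    char payload[1024];
};

/* The pointer only has to hold an address, not the 1024 bytes it refers to. */
int pointer_is_smaller_than_pointee(void)
{
    return sizeof(struct big *) < sizeof(struct big);
}

/* Converting to void * and back changes the type, not the address. */
int conversion_keeps_address(void)
{
    struct big object = { {0} };
    void *generic = &object;        /* no cast needed going to void * */
    struct big *typed = generic;    /* no cast needed coming back in C */
    assert(typed == &object);
    return (void *)typed == generic;
}
```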
Suppose I am given a (void*) ptr (my basic understanding is, it represents a pointer to a region of unknown data type) passed through the parameter of a function. I am trying to figure out how to access and check if a struct exists a few addresses behind.
To clarify, I am working with a big char array (not malloced) and the ptr passed into the function should point to an address of an unspecified data type within the array. Located before this data is a struct for which I am trying to access.
void function(void *ptr)
{
void *structPtr = (void*)((void*)ptr - sizeof(struct block));
}
Would this work to get me a pointer to the address of the struct located behind the initial "ptr"? And if so, how could I check if it is the block struct?
Apologies in advance, I know this code is not specific as I am fairly new to the concepts entirely but also, I am in the process of coming up with an algorithm and not yet implementing it. Any references to possibly useful information are much appreciated.
What you are trying to do is risky, as you must be sure that you are addressing a valid place in memory. Usually, we add a magic number to struct block so that we can check that we have really landed on a block header.
This pattern is generally used in memory allocators;
have a look at https://sourceware.org/glibc/wiki/MallocInternals for an example.
The usual way of writing this is something like:
...function(void *ptr) {
struct block *hdr = (struct block *)ptr - 1;
relying on pointer arithmetic automatically scaling by the size of the pointed at type. As long as all the pointers passed in here were originally created by taking a pointer to a valid struct block and adding 1 to it (to get the memory after it), you should be fine.
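Putting the two suggestions together (a header directly in front of the payload, plus a magic number), a rough sketch could look like this. The layout of struct block, the BLOCK_MAGIC value and the function names are all assumptions for illustration, and the sketch uses malloc instead of a char array so that alignment is not an issue:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_MAGIC 0xB10C0DE5u     /* arbitrary, hypothetical marker value */

/* Hypothetical header that sits directly in front of each payload. */
struct block {
    uint32_t magic;
    size_t   size;
};

/* Allocate header + payload in one chunk and return the payload address. */
void *block_alloc(size_t size)
{
    struct block *hdr = malloc(sizeof *hdr + size);
    if (hdr == NULL)
        return NULL;
    hdr->magic = BLOCK_MAGIC;
    hdr->size = size;
    return hdr + 1;                 /* first byte after the header */
}

/* Step back one header and check the marker; NULL if it does not match. */
struct block *block_header(void *ptr)
{
    struct block *hdr;
    if (ptr == NULL)
        return NULL;
    hdr = (struct block *)ptr - 1;  /* arithmetic is scaled by sizeof(struct block) */
    return hdr->magic == BLOCK_MAGIC ? hdr : NULL;
}

void block_free(void *ptr)
{
    struct block *hdr = block_header(ptr);
    assert(hdr != NULL);
    hdr->magic = 0;                 /* make stale pointers fail the check */
    free(hdr);
}
```

The magic number only catches mistakes by luck: if ptr never had a header in front of it, reading hdr->magic is already an invalid access, so the check is a debugging aid rather than a guarantee.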
Why do we have pointer types? eg
int *ptr;
I know it's for type safety, e.g. to dereference 'ptr', the compiler needs to know that it's dereferencing a pointer to int, not to char or long, etc., but as others outlined here Why to specify a pointer type? , it's also because "we should know how many bytes to read. Dereferencing a char pointer would imply taking one byte from memory while for int it could be 4 bytes." That makes sense.
But what if I have something like this:
typedef struct _IP_ADAPTER_INFO {
struct _IP_ADAPTER_INFO* Next;
DWORD ComboIndex;
char AdapterName[MAX_ADAPTER_NAME_LENGTH + 4];
char Description[MAX_ADAPTER_DESCRIPTION_LENGTH + 4];
UINT AddressLength;
BYTE Address[MAX_ADAPTER_ADDRESS_LENGTH];
DWORD Index;
UINT Type;
UINT DhcpEnabled;
PIP_ADDR_STRING CurrentIpAddress;
IP_ADDR_STRING IpAddressList;
IP_ADDR_STRING GatewayList;
IP_ADDR_STRING DhcpServer;
BOOL HaveWins;
IP_ADDR_STRING PrimaryWinsServer;
IP_ADDR_STRING SecondaryWinsServer;
time_t LeaseObtained;
time_t LeaseExpires;
} IP_ADAPTER_INFO, *PIP_ADAPTER_INFO;
PIP_ADAPTER_INFO pAdapterInfo = (IP_ADAPTER_INFO *)malloc(sizeof(IP_ADAPTER_INFO));
What would be the point of declaring the type PIP_ADAPTER_INFO here? After all, unlike the previous example, we've already allocated enough memory for the pointer to point at (using malloc), so isn't defining the type here redundant? We will be reading as much data from memory as there has been allocated.
Also, side note: Is there any difference between the following 4 declarations or is there a best practice?
PIP_ADAPTER_INFO pAdapterInfo = (IP_ADAPTER_INFO *)malloc(sizeof(IP_ADAPTER_INFO));
or
PIP_ADAPTER_INFO pAdapterInfo = (PIP_ADAPTER_INFO)malloc(sizeof(IP_ADAPTER_INFO));
or
IP_ADAPTER_INFO *pAdapterInfo = (IP_ADAPTER_INFO *)malloc(sizeof(IP_ADAPTER_INFO));
or
IP_ADAPTER_INFO *pAdapterInfo = (PIP_ADAPTER_INFO)malloc(sizeof(IP_ADAPTER_INFO));
You’re kind of asking two different questions here - why have different pointer types, and why hide pointers behind typedefs?
The primary reason for distinct pointer types comes from pointer arithmetic - if p points to an object of type T, then the expression p + 1 points to the next object of that type. If p points to a 4-byte int, then p + 1 points to the next int. If p points to a 128-byte struct, then p + 1 points to the next 128-byte struct, and so on. Pointers are abstractions of memory addresses with additional type semantics.
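A quick way to see the scaling is to measure, in bytes, how far p + 1 is from p. This is only a sketch; struct wide is a made-up 128-byte type:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 128-byte type. */
struct wide {
    char bytes[128];
};

/* Distance in bytes between p and p + 1 for an int pointer. */
size_t int_step(void)
{
    int values[2] = { 0, 0 };
    int *p = values;
    assert(p + 1 == &values[1]);
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Same measurement for a pointer to the 128-byte struct. */
size_t wide_step(void)
{
    struct wide items[2];
    struct wide *p = items;
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

int_step() comes out as sizeof(int) and wide_step() as 128, even though both functions wrote the same expression, p + 1.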
As for hiding pointers behind typedefs...
A number of us (including myself) consider hiding pointers behind typedefs to be bad style if the user of the type still has to be aware of the type’s “pointer-ness” (i.e., if you ever have to dereference it, or if you ever assign the result of malloc/calloc/realloc to it, etc.). If you’re trying to abstract away the “pointer-ness” of something, you need to do it in more than just the declaration - you need to provide a full API that hides all the pointer operations as well.
As for your last question, best practice in C is to not cast the result of malloc. Best practice in C++ is to not use malloc at all.
I think this is more a question of type definition style than of dynamic memory allocation.
Old-school C practice is to describe structs by their tags. You say
struct foo {
...
};
and then
struct foo foovar;
or
struct foo *foopointer = malloc(sizeof(struct foo));
But a lot of people don't like having to type that keyword struct all the time. (I guess I can't fault them; C has always favored terseness, sometimes seemingly just to reduce typing.) So a form using typedef became quite popular (and it either influenced, or was influenced by, C++):
typedef struct {
...
} Foo;
and then
Foo foovar;
or
Foo *foopointer = malloc(sizeof(Foo));
But then, for reasons that are less clear, it became popular to throw the pointerness into the typedef, too, like this:
typedef struct {
...
} Foo, *Foop;
Foop foopointer = malloc(sizeof(Foo));
But this is all a matter of style and personal preference, in the service of what someone imagines to be clarity or convenience or usefulness. (But of course opinions on clarity and convenience, like opinions on style, can legitimately vary.) I've seen the pointer typedefs disparaged as being a misleading or Microsoftian practice, but I'm not sure I can fault them right now.
You also asked about the casts, and we could also dissect various options for the sizeof call as the argument to malloc.
It doesn't really matter whether you say
Foop foopointer = (Foop)malloc(sizeof(Foo));
or
Foop foopointer = (Foo *)malloc(sizeof(Foo));
The first one may be clearer, in that you don't have to go back and check that Foop and Foo * are the same thing. But they're both poor practice in C, and in at least some circles they've been deprecated since the 1990s. Those casts are considered distracting and unnecessary in straight C -- although of course they're necessary in C++, or I suppose if you're using a C++ compiler to compile C code. (If you were writing straight C++ code, of course, you'd typically use new instead of malloc.)
But then what should you put in the sizeof()? One option is to name the type:
Foop foopointer = malloc(sizeof(Foo));
(Note that sizeof(*Foop) is not an alternative: Foop names a type, not an object, so it cannot be dereferenced.) Naming the type works, but you have to go back and check that Foo really is what Foop points at. There's another form that can be even clearer:
Foop foopointer = malloc(sizeof(*foopointer));
Now you know that, whatever type foopointer points at, you're allocating the right amount of space for it. This idiom works best, though, if it's maximally clear that foopointer is in fact a pointer that points at some type, meaning that the variants
Foo *foopointer = malloc(sizeof(*foopointer));
or even
struct foo *foopointer = malloc(sizeof(*foopointer));
can be considered clearer still -- and this may be one of the reasons people consider the pointer typedef to be less than perfectly useful.
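As a small sketch of that idiom in use (the members of struct foo and the function names are invented for the example):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical struct; the members are only there to have something to set. */
struct foo {
    int id;
    double weight;
};

/* One object: the size comes from the pointer being assigned, and there is no cast. */
struct foo *foo_create(int id, double weight)
{
    struct foo *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->id = id;
    p->weight = weight;
    return p;
}

/* Same idea for a zero-initialized array of count elements. */
struct foo *foo_create_array(size_t count)
{
    struct foo *items = calloc(count, sizeof *items);
    return items;
}
```

If struct foo is later renamed or the variable changes type, the sizeof expressions stay correct without being touched.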
Bottom line, if you're still with me: If you don't find PIP_ADAPTER_INFO useful, don't use it -- use IP_ADAPTER_INFO (along with explicit *'s when you need them) instead. Someone thought PIP_ADAPTER_INFO might be useful, which is why it's there, but the arguments in favor of its use aren't too compelling.
What is the point of “pointer types” when you dynamically allocate memory?
At least for the example you show there is none.
So the follow up question would be if there were situations where typedefing a pointer made sense.
And the answer is: Yes.
It definitely makes sense if one is in the need of an opaque data type.
A nice example is the pthread_t type which defines a handle to a POSIX thread.
Depending on the implementation it is defined as
typedef struct bla pthread_t;
typedef struct foo * pthread_t;
typedef long pthread_t;
and with this abstracts away the kind of implementation, as it is of no interest to the user, which probably is not the intention with the struct you show in your question.
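A minimal sketch of what such an opaque handle can look like. The counter_t type and its three functions are invented for this example; in a real project the first part would live in a header and the second in a .c file:

```c
#include <assert.h>
#include <stdlib.h>

/* --- What a public header would expose (hypothetical API) --- */
typedef struct counter *counter_t;  /* users never see inside struct counter */

counter_t counter_create(void);
int       counter_next(counter_t c);
void      counter_destroy(counter_t c);

/* --- What stays private in the implementation file --- */
struct counter {
    int value;
};

counter_t counter_create(void)
{
    return calloc(1, sizeof(struct counter));   /* starts at 0 */
}

int counter_next(counter_t c)
{
    assert(c != NULL);
    return ++c->value;
}

void counter_destroy(counter_t c)
{
    free(c);
}
```

Here the pointer typedef earns its keep: callers never dereference or allocate a counter_t themselves, so the implementation could switch to an integer handle without breaking them.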
Why do we have pointer types?
To accommodate architectures where the size and encoding may differ for various types. C ports well to many platforms, even novel ones.
It is not unusual today that pointers to functions have a different size than pointers to objects. An object pointer converts to a void *, yet a function pointer may not.
A pointer to char need not be the same size as a pointer to an int or union or struct. This is uncommon today. The spec details follow (my emphasis):
A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All
pointers to structure types shall have the same representation and alignment requirements
as each other. All pointers to union types shall have the same representation and
alignment requirements as each other. Pointers to other types need not have the same
representation or alignment requirements. C11dr §6.2.5 28
I have a
LS_Led* LS_vol_leds[10];
declared in one C module, and the proper externs in the other modules that access it.
In func1() I have this line:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
And it does not cause an exception. Then
I call func2() in another C module (right after above line), and do the same line, namely:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
first thing, and exception thrown!!!
I don't think I have the powers to debug this one on my own.
Before anything LS_vol_leds is initialized in func1() with:
LS_vol_leds[0] = &led3;
LS_vol_leds[1] = &led4;
LS_vol_leds[2] = &led5;
LS_vol_leds[3] = &led6;
LS_vol_leds[4] = &led7;
LS_vol_leds[5] = &led8;
LS_vol_leds[6] = &led9;
LS_vol_leds[7] = &led10;
LS_vol_leds[8] = &led11;
LS_vol_leds[9] = &led12;
My externs look like
extern LS_Led** LS_vol_leds;
So does that lead to disaster, and how do I prevent disaster?
Thanks.
This leads to disaster:
extern LS_Led** LS_vol_leds;
You should try this instead:
extern LS_Led *LS_vol_leds[];
If you really want to know why, you should read Expert C Programming - Deep C Secrets, by Peter Van Der Linden (amazing book!), especially chapter 4, but the quick answer is that this is one of those corner cases where pointers and arrays are not interchangeable: a pointer is a variable which holds the address of another one, whereas an array name is an address. extern LS_Led** LS_vol_leds; is lying to the compiler and generating the wrong code to access LS_vol_leds[i].
With this:
extern LS_Led** LS_vol_leds;
The compiler will believe that LS_vol_leds is a pointer, and thus LS_vol_leds[i] involves reading the value stored in the memory location associated with LS_vol_leds, using that as an address, and then scaling i accordingly to get the offset.
However, since LS_vol_leds is an array and not a pointer, the compiler should instead pick the address of LS_vol_leds directly. In other words: what is happening is that your original extern causes the compiler to dereference LS_vol_leds[0] because it believes that LS_vol_leds[0] holds the address of the pointed-to object.
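One way to keep the declaration and the definition from drifting apart is to put the extern in a shared header that the defining module also includes; the compiler then sees both and rejects a mismatch such as LS_Led ** against an array. A single-file sketch of that arrangement, with a made-up LS_Led type and helper functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical LED type; the real one is not shown in the question. */
typedef struct {
    int brightness;
} LS_Led;

/* --- Would live in a shared header included by every module --- */
extern LS_Led *LS_vol_leds[];       /* an array of pointers, size left open */

/* --- Defining module: the definition agrees with the declaration above --- */
static LS_Led led3 = { 30 }, led4 = { 40 };
LS_Led *LS_vol_leds[10];

void leds_init(void)
{
    LS_vol_leds[0] = &led3;
    LS_vol_leds[1] = &led4;
}

/* --- Any other module: indexes the array directly, as intended --- */
int first_led_brightness(void)
{
    assert(LS_vol_leds[0] != NULL);
    LS_Led led = *(LS_vol_leds[0]);
    return led.brightness;
}
```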
UPDATE: Fun fact - the back cover of the book talks about this specific case:
So that's why extern char *cp isn't the same as extern char cp[]. I
knew that it didn't work despite their superficial equivalence, but I
didn't know why. [...]
UPDATE2: Ok, since you asked, let's dig deeper. Consider a program split into two files, file1.c and file2.c. Its contents are:
file1.c
#define BUFFER_SIZE 1024
char cp[BUFFER_SIZE];
/* Lots of code using cp[i] */
file2.c
extern char *cp;
/* Code using cp[i] */
The moment you try to assign to cp[i] or use cp[i] in file2.c, your code will most likely crash. This is deeply tied to the mechanics of C and the code that the compiler generates for array-based accesses and pointer-based accesses.
When you have a pointer, you must think of it as a variable. A pointer is a variable like an int, float or something similar, but instead of storing an integer or a float, it stores a memory address - the address of another object.
Note that variables have addresses. When you have something like:
int a;
Then you know that a is the name for an integer object. When you assign to a, the compiler emits code that writes into whatever address is associated with a.
Now consider you have:
char *p;
What happens when you access *p? Remember - a pointer is a variable. This means that the memory address associated with p holds an address - namely, an address holding a character. When you assign to p (i.e., make it point to somewhere else), then the compiler grabs the address of p and writes a new address (the one you provide it) into that location.
For example, if p lives at 0x27, it means that reading memory location 0x27 yields the address of the object pointed to by p. So, if you use *p in the right hand side of an assignment, the steps to get the value of *p are:
Read the contents of 0x27 - say it's 0x80 - this is the value of the pointer, or, equivalently, the address of the pointed-to object
Read the contents of 0x80 - this finally gives you *p.
What if p is an array? If p is an array, then the variable p itself represents the array. By convention, the address representing an array is the address of its first element. If the compiler chooses to store the array in address 0x59, it means that the first element of p lives at 0x59. So when you read p[0] (or *p), the generated code is simpler: the compiler knows that the variable p is an array, and the address of an array is the address of the first element, so p[0] is the same as reading 0x59. Compare this to the case for which p is a pointer.
If you lie to the compiler, and tell it you have a pointer instead of an array, the compiler will (wrongly) generate code that does what I showed for the pointer case. You're basically telling it that 0x59 is not the address of an array, it's the address of a pointer. So, reading p[i] will cause it to use the pointer version:
Read the contents of 0x59 - note that, in reality, this is p[0]
Use that as an address, and read its contents.
So, what happens is that the compiler thinks that p[0] is an address, and will try to use it as such.
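The mismatched extern itself cannot be reproduced safely in one file, but the underlying difference can. In this sketch (the variable names are mine) the array's address is the address of its first element, while the pointer is a separate object that merely stores that address:

```c
#include <assert.h>

static char buffer[8] = "abc";   /* an array: the name stands for the storage itself */
static char *pointer = buffer;   /* a pointer: a separate variable holding an address */

/* For an array, &buffer and &buffer[0] are the same location. */
int array_address_is_first_element(void)
{
    return (void *)&buffer == (void *)&buffer[0];
}

/* For a pointer, the variable lives somewhere other than what it points to. */
int pointer_has_its_own_storage(void)
{
    return (void *)&pointer != (void *)pointer;
}

/* Indexing reaches the same data either way, by different routes. */
int both_index_the_same_data(void)
{
    return buffer[1] == 'b' && pointer[1] == 'b';
}
```

Declaring the array as a pointer in another file makes the compiler take the second route on storage that only supports the first.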
Why is this a corner case? Why don't I have to worry about this when passing arrays to functions?
Because what is really happening is that the compiler manages it for you. Yes, when you pass an array to a function, a pointer to the first element is passed, and inside the called function you have no way to know if it is a "real" array or a pointer. However, the address passed into the function is different depending on whether you're passing a real array or a pointer. If you're passing a real array, the pointer you get is the address of the first element of the array (in other words: the compiler immediately grabs the address associated to the array variable from the symbol table). If you're passing a pointer, the compiler passes the address that is stored in the address associated with that variable (and that variable happens to be the pointer), that is, it does exactly those 2 steps mentioned before for pointer-based access. Again, note that we're discussing the value of the pointer here. You must keep this separated from the address of the pointer itself (the address where the address of the pointed-to object is stored).
That's why you don't see a difference. In most situations, arrays are passed around as function arguments, and this rarely raises problems. But sometimes, with some corner cases (like yours), if you don't really know what is happening down there, well.. then it will be a wild ride.
Personal advice: read the book, it's totally worth it.
I've just started to learn C so please be kind.
From what I've read so far regarding pointers:
int * test1; //this is a pointer which is basically an address to the process
//memory and usually has the size of 2 bytes (not necessarily, I know)
float test2; //this is an actual value and usually has the size of 4 bytes,
//being of float type
test2 = 3.0; //this assigns 3 to `test2`
Now, what I don't completely understand:
*test1 = 3; //does this assign 3 at the address
//specified by `pointerValue`?
test1 = 3; //this says that the pointer is basically pointing
//at the 3rd byte in process memory,
//which is somehow useless, since anything could be there
&test1; //this I really don't get,
//is it the pointer to the pointer?
//Meaning, the address at which the pointer address is kept?
//Is it of any use?
Similarly:
*test2; //does this has any sense?
&test2; //is this the address at which the 'test2' value is found?
//If so, it's a pointer, which means that you can have pointers pointing
//both to the heap address space and stack address space.
//I ask because I've always been confused by people who speak about
//pointers only in the heap context.
Great question.
Your first block is correct. A pointer is a variable that holds the address of some data. The type of that pointer tells the code how to interpret the contents of the address being held by that pointer.
The construct:
*test1 = 3
Is called the dereferencing of a pointer. That means you can access the address that the pointer points to and read and write to it like a normal variable. Note:
int *test;
/*
* test is a pointer to an int - (int *)
* *test behaves like an int - (int)
*
* So you can think of (*test) as a pseudo-variable which has the type 'int'
*/
The above is just a mnemonic device that I use.
It is rare that you ever assign a numeric value to a pointer... maybe if you're developing for a specific environment which has some 'well-known' memory addresses, but at your level, I wouldn't worry too much about that.
Using
*test2
would result in a compile-time error. You'd be trying to dereference something that is not a pointer: test2 is a float, and the unary * operator requires an operand of pointer type, so the compiler rejects it.
&test1 and &test2 are, indeed, pointers to test1 and test2.
Pointers to pointers are very useful and a search of pointer to a pointer will lead you to some resources that are way better than I am.
It looks like you've got the first part right.
An incidental thought: there are various conventions about where to put that * sign. I prefer mine nestled with the variable name, as in int *test1 while others prefer int* test1. I'm not sure how common it is to have it floating in the middle.
Another incidental thought: test2 = 3.0 assigns a floating-point 3 to test2. The same end could be achieved with test2=3, in which case the 3 is implicitly converted from an integer to a floating point number. The convention you have chosen is probably safer in terms of clarity, but is not strictly necessary.
Non-incidentals
*test1=3 does assign 3 to the address specified by test1.
test1=3 is a line that has meaning, but which I consider meaningless. We do not know what is at memory location 3, if it is safe to touch it, or even if we are allowed to touch it.
That's why it's handy to use something like
int var=3;
int *pointy=&var;
*pointy=4;
//Now var==4.
The expression &var yields the memory location of var, which is stored in pointy so that we can later access it with *pointy.
But I could also do something like this:
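int var[]={1,2,3};
int *pointy=var; //an array name converts to a pointer to its first element
int offset=2; //an offset is a plain integer, not a pointer
*(pointy+offset)=4;
//Now var[2]==4.
Integers can be added to and subtracted from pointers like this, with the offset counted in elements rather than bytes. That is different from something like test1=3, which stores the bare number 3 in the pointer itself.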
&test1 is a pointer to a pointer, but that sounds kind of confusing to me. It's really the address in memory where the value of test1 is stored. And test1 just happens to store as its value the address of another variable. Once you start thinking of pointers in this way (address in memory, value stored there), they become easier to work with... or at least I think so.
I don't know if *test2 has "meaning", per se. In principle, it could have a use in that we might imagine that the * operator will take the value of test2 to be some location in memory, and it will return the value it finds there. But since you define test2 as a float, it is difficult to predict where in memory we would end up; setting test2=3 will not move us to the third spot of anything (look up the IEEE754 specification to see why). But I would be surprised if a compiler allowed such a thing.
Let's look at another quick example:
int var=3;
int *pointy1=&var;
int **pointy2=&pointy1;
*pointy1=4; //Now var==4
**pointy2=5; //Now var==5
So you see that you can chain pointers together like this, as many in a row as you'd like. This might show up if you had an array of pointers which was filled with the addresses of many structures you'd created from dynamic memory, and those structures contained pointers to dynamically allocated things themselves. When the time comes to use a pointer to a pointer, you'll probably know it. For now, don't worry too much about them.
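One everyday case where a pointer to a pointer shows up is a function that needs to hand a freshly allocated pointer back to its caller. A sketch, with a made-up function name:

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates an int, stores `initial` in it, and writes the new pointer
 * into *out. Returns 0 on success and -1 if the allocation fails. */
int make_int(int **out, int initial)
{
    assert(out != NULL);
    int *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    *fresh = initial;   /* one *: the int we just allocated */
    *out = fresh;       /* one * on an int **: the caller's pointer variable */
    return 0;
}
```

The caller passes the address of its own pointer, for example make_int(&p, 5), and afterwards *p is 5; the caller is responsible for calling free(p).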
First let's add some confusion: the word "pointer" can refer to either a variable (or object) with a pointer type, or an expression with the pointer type. In most cases, when people talk about "pointers" they mean pointer variables.
A pointer can (must) point to a thing (An "object" in standards parlance). It can only point to the right kind of thing; a pointer to int is not supposed to point to a float object. A pointer can also be NULL; in that case there is no thing to point to.
A pointer type is also a type, and a pointer object is also an object. So it is allowable to construct a pointer to pointer: the pointer-to-pointer just stores the address of the pointer object.
What a pointer can not be:
It cannot point to a value: p = &4; is impossible. 4 is a literal value, which is not stored in an object, and thus has no address.
the same goes for expressions: p = &(1+4); is impossible, because the expression "1+4" does not have a location.
the same goes for return value p = &sin(pi); is impossible; the return value is not an object and thus has no address.
variables marked as "register" (almost extinct now) cannot have their address taken.
you cannot take the address of a bit-field, basically because these can be smaller than a character (or have a finer granularity), hence it would be possible for different bit-fields to have the same address.
There are some "exceptions" to the above skeleton (void pointers, casting, pointing one element beyond an array object) but for clarity these should be seen as refinements/amendments, IMHO.
I'm sort of learning C, I'm not a beginner to programming though, I "know" Java and python, and by the way I'm on a mac (leopard).
Firstly,
1: could someone explain when to use a pointer and when not to?
2:
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
And then all but the last would set indexes 0, 1, 2 and 3 to 'f', 'u', 'n' and '\0' respectively. My question is, why isn't the second one a pointer? Why char fun[4] and not char *fun[4]? And how come it seems that a pointer to a struct or an int is always an array?
3:
I understand this:
typedef struct car
{
...
};
is a shortcut for
struct car
{
...
};
typedef struct car car;
Correct? But something I am really confused about:
typedef struct A
{
...
}B;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
4. I understand what pointers do, but I don't understand what the point of them is (no pun intended). And when does something get allocated on the stack vs. the heap? How do I know where it gets allocated? Do pointers have something to do with it?
5. And lastly, know any good tutorial for C game programming (simple) ? And for mac/OS X, not windows?
PS. Is there any other name people use to refer to just C, not C++? I hate how they're all named almost the same thing, so hard to try to google specifically C and not just get C++ and C# stuff.
Thanks!!
It was hard to pick a best answer, they were all great, but the one I picked was the only one that made me understand my 3rd question, which was the only one I was originally going to ask. Thanks again!
My question is, why isn't the second one a pointer?
Because it declares an array. In the two other cases, you have a pointer that refers to data that lives somewhere else. Your array declaration, however, declares an array of data that lives where it's declared. If you declared it within a function, then data will die when you return from that function. Finally char *fun[4] would be an array of 4 pointers - it wouldn't be a char pointer. In case you just want to point to a block of 4 chars, then char* would fully suffice, no need to tell it that there are exactly 4 chars to be pointed to.
The first way, which creates an object on the heap, is used if you need the data to live from then on until the matching free call. The data will survive a return from a function.
The last way just creates data that's not intended to be written to. It's a pointer that refers to a string literal, which is often stored in read-only memory. If you write to it, the behavior is undefined.
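To make the three lifetimes concrete, here is a minimal sketch. The helper names make_heap_fun, first_char_of_local and literal_fun are made up for illustration and are not from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap storage, valid until the caller free()s it. */
char *make_heap_fun(void) {
    char *fun = malloc(sizeof(char) * 4);
    if (fun == NULL) {
        return NULL;              /* malloc can fail */
    }
    memcpy(fun, "fun", 4);        /* copies 'f', 'u', 'n', '\0' */
    return fun;                   /* fine: the block outlives this call */
}

/* Hypothetical helper: array storage, gone when the function returns. */
char first_char_of_local(void) {
    char fun[4];                  /* the 4 chars live inside this function */
    memcpy(fun, "fun", 4);
    return fun[0];                /* returning a copy of one char is fine;
                                     returning fun itself would dangle */
}

/* Hypothetical helper: string literal, lives for the whole program. */
const char *literal_fun(void) {
    const char *fun = "fun";      /* const guards against accidental writes */
    return fun;
}
```

A caller would free() the result of make_heap_fun when done with it, and must never free() or write through the result of literal_fun.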
I understand what pointers do, but I don't understand what the point of them is (no pun intended).
Pointers are used to point to something (no pun, of course). Look at it like this: If you have a row of items on the table, and your friend says "pick the second item", then the item won't magically walk its way to you. You have to grab it. Your hand acts like a pointer, and when you move your hand back to you, you dereference that pointer and get the item. The row of items can be seen as an array of items:
And how come it seems that a pointer to a struct or an int is always an array?
item row[5];
When you do item i = row[1]; then you first point your hand at the first item (get a pointer to the first one), and then you advance till you are at the second item. Then you take your hand with the item back to you :) So, the row[1] syntax is not something special to arrays, but rather special to pointers - it's equivalent to *(row + 1), and a temporary pointer is made up when you use an array like that.
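The hand analogy can be sketched in code. The item struct below is a stand-in invented for this example (the answer never defines item), and both helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical element type standing in for the "item" in the text. */
typedef struct item { int id; } item;

/* Fetch an element using index syntax. */
item pick_by_index(const item *row, size_t n) {
    return row[n];
}

/* Fetch the same element by moving a pointer and dereferencing it. */
item pick_by_pointer(const item *row, size_t n) {
    const item *hand = row;   /* point your hand at the first item */
    hand = hand + n;          /* advance n items (not n bytes) */
    return *hand;             /* dereference: bring the item back */
}
```

Given item row[5], both pick_by_index(row, 1) and pick_by_pointer(row, 1) return the second item, because row[1] means *(row + 1).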
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
typedef struct car
{
...
};
That code doesn't do what you expect. You basically said "define the type struct car { ... } to be referable by the following ordinary identifier" but forgot to give it the identifier, so no typedef name is declared (compilers such as GCC and Clang warn about this). The following two snippets are equivalent instead, as far as I can see:
1)
struct car
{
...
};
typedef struct car car;
2)
typedef struct car
{
...
} car;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
In our case, the identifier car was declared twice in the same scope. But the declarations don't conflict, because each identifier is in a different namespace. The two namespaces involved are the ordinary namespace and the tag namespace. A tag identifier must be used after a struct, union or enum keyword, while an ordinary identifier doesn't need anything around it. You may have heard of the POSIX function stat, whose interface looks like the following:
struct stat {
...
};
int stat(const char *path, struct stat *buf);
In that code snippet, stat is registered in the two aforementioned namespaces too. struct stat refers to the struct, and plain stat refers to the function. Some people don't like to always precede identifiers with struct, union or enum. They use typedef to introduce an ordinary identifier that refers to the struct too. The two identifiers can of course be the same (car both times), or they can differ (A for the tag, B for the typedef). It doesn't matter.
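Here is a small sketch of the two namespaces side by side. The members (wheels, x, y) and the function names are assumptions made up for the example; struct point and point() mimic the struct stat / stat() pairing.

```c
/* Tag namespace: this type is spelled "struct car". */
struct car { int wheels; };

/* Ordinary namespace: plain "car" now names the same type. */
typedef struct car car;

/* A tag and a function can share a spelling, like struct stat and stat(). */
struct point { int x; int y; };
int point(struct point p) {
    return p.x + p.y;
}

/* Hypothetical helper showing that car and struct car are interchangeable. */
int total_wheels(void) {
    struct car a = { 4 };
    car b = a;                    /* allowed: both names mean one type */
    b.wheels += 2;
    return a.wheels + b.wheels;   /* 4 + 6 */
}
```

Without the typedef line, car b would not compile in C; only struct car b would.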
3) It's bad style to use two different names A and B:
typedef struct A
{
...
} B;
With that definition, you can say
struct A a;
B b;
b.field = 42;
a.field = b.field;
because the variables a and b have the same type. C programmers usually say
typedef struct A
{
...
} A;
so that you can use "A" as a type name, equivalent to "struct A" but it saves you a lot of typing.
Use them when you need to. Read some more examples and tutorials until you understand what pointers are, and this ought to be a lot clearer :)
The second case creates an array in memory, with space for four bytes. When you use that array's name in most expressions, you get back a pointer to its first (index 0) element. The [] operator then actually works on a pointer, not an array - x[y] is equivalent to *(x + y). And yes, this means x[y] is the same as y[x]. Sorry.
Note also that when you add an integer to a pointer, the integer is scaled by the size of the pointed-to elements, so if you do someIntArray[1], you get the second (index 1) element, not some location one byte into the first element.
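Both points can be checked with a short sketch. The function names are hypothetical; casting to char * is used only to measure the distance in bytes.

```c
#include <stddef.h>

/* Byte distance between element 0 and element 1 of an int array. */
size_t step_in_bytes(void) {
    int someIntArray[2] = {0, 0};
    char *first = (char *)&someIntArray[0];
    char *second = (char *)(someIntArray + 1);
    return (size_t)(second - first);   /* sizeof(int), not 1 */
}

/* x[y], *(x + y) and y[x] all name the same element. */
int all_forms_agree(const int *x, size_t y) {
    return x[y] == *(x + y) && *(x + y) == y[x];
}
```

The y[x] form is legal but only a curiosity; nobody should write it in real code.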
Also, as a final gotcha - array types in function argument lists, e.g. void foo(int bar[4]), secretly get turned into pointer types, that is, void foo(int *bar). This happens only for function arguments.
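A rough sketch of that gotcha, with a made-up function name: the parameter is written as an array, yet inside the function it behaves like an ordinary pointer variable.

```c
#include <stddef.h>

/* Written with an array parameter, but the compiler treats bar as int *. */
void add_one_to_each(int bar[4], size_t n) {
    for (size_t i = 0; i < n; i++) {
        *bar += 1;   /* changes the caller's element: nothing was copied */
        bar++;       /* legal only because bar is a pointer variable */
    }
}

/* This initialisation compiles cleanly because the adjusted
   parameter type really is int *. */
void (*as_pointer_version)(int *, size_t) = add_one_to_each;
```

The 4 in int bar[4] is not enforced, which is why the length is passed separately as n here.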
Your third example declares a struct type with two names - struct A and B. In pure C, the struct keyword is mandatory for A - in C++, you can refer to the type as either A or B. Apart from the name change, the two types are completely equivalent, and you can substitute one for the other anywhere, anytime without any change in behavior.
C has three places things can be stored:
The stack - local variables in functions go here. For example:
void foo() {
int x; // on the stack
}
The heap - things go here when you allocate them explicitly with malloc, calloc, or realloc.
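#include <stdlib.h>
void foo() {
int *x; // the pointer x itself is on the stack
x = malloc(sizeof(*x)); // the value pointed to by x is on the heap
free(x); // heap memory stays allocated until you free it
}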
Static storage - global variables and static variables, allocated once at program startup.
int x; // static
void foo() {
static int y; // essentially a global that can only be used in foo()
}
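To see the difference in behaviour rather than just location, here is a small sketch. The names calls_total and next_id are invented for the example.

```c
int calls_total;           /* static storage: zero-initialised at startup */

/* Hypothetical helper that hands out increasing ids. */
int next_id(void) {
    static int last;       /* static storage: keeps its value between calls */
    int step = 1;          /* stack: created afresh on every call */
    last += step;
    calls_total++;
    return last;
}
```

Calling next_id() twice returns 1 and then 2, because last survives between calls while step does not.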
No idea. I wish I didn't need to answer all questions at once - this is why you should split them up :)
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
The first one can be set to any size you want at runtime, and be resized later - you can also free the memory when you are done.
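The second one is an array rather than a pointer, but in most expressions its name fun converts to a pointer to its first element, as if you had written char *ptr = &fun[0].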
I understand what pointers do, but I don't understand what the point of
them is (no pun intended). And when
does something get allocated on the
stack vs. the heap? How do I know
where it gets allocated? Do pointers
have something to do with it?
When you define something in a function like "char fun[4]" it is defined on the stack, and the memory is no longer valid after the function returns.
Using malloc (or new in C++) reserves memory on the heap - you can make this data available anywhere in the program by passing the pointer around. This also lets you decide the size of the memory at runtime. Finally, the size of the stack is limited (often just 1 MB to 8 MB by default, depending on the platform) while on the heap you can reserve all the memory you have available.
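As an illustration of choosing the size at runtime, here is a minimal sketch. The function repeat_char is hypothetical, not a library call.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap string made of n copies of c.
   n is only known at run time; the caller must free() the result. */
char *repeat_char(char c, size_t n) {
    char *buf = malloc(n + 1);    /* +1 for the terminating '\0' */
    if (buf == NULL) {
        return NULL;              /* allocation failed */
    }
    memset(buf, c, n);
    buf[n] = '\0';
    return buf;                   /* still valid after this function returns */
}
```

A fixed-size local array could not be returned this way, since its memory disappears when the function returns.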
Edit: 5. Not really - I would say "pure C". C++ is (almost) a superset of C, so unless you are working on a very limited embedded system it's usually OK to use C++.
5. Chipmunk
Fast and lightweight 2D rigid body physics library in C.
Designed with 2D video games in mind.
Lightweight C99 implementation with no external dependencies outside of the Std. C library.
Many language bindings available.
Simple, read the documentation and see!
Unrestrictive MIT license.
Makes you smarter, stronger and more attractive to the opposite gender!
...
In your second question:
char *fun = malloc(sizeof(char) * 4);
vs
char fun[4];
vs
char *fun = "fun";
These all involve an array of 4 chars, but that's where the similarity ends. Where they differ is in the lifetime, modifiability and initialisation of those chars.
The first one creates a single pointer to char object called fun - this pointer variable will live only from when this function starts until the function returns. It also calls the C standard library and asks it to dynamically create a memory block the size of an array of 4 chars, and assigns the location of the first char in the block to fun. This memory block (which you can treat as an array of 4 chars) has a flexible lifetime that's entirely up to the programmer - it lives until you pass that memory location to free(). Note that this means that the memory block created by malloc can live for a longer or shorter time than the pointer variable fun itself does. Note also that the association between fun and that memory block is not fixed - you can change fun so it points to a different memory block, or make a different pointer point to that memory block.
One more thing - the array of 4 chars created by malloc is not initialised - it contains garbage values.
The second example creates only one object - an array of 4 chars, called fun. (To test this, change the 4 to 40 and print out sizeof(fun)). This array lives only until the function it's declared in returns (unless it's declared outside of a function, when it lives for as long as the entire program is running). This array of 4 chars isn't initialised either.
The third example creates two objects. The first is a pointer-to-char variable called fun, just like in the first example (and as usual, it lives from the start of this function until it returns). The other object is a bit strange - it's an array of 4 chars, initialised to { 'f', 'u', 'n', 0 }, which has no name and that lives for as long as the entire program is running. It's also not guaranteed to be modifiable (although what happens if you try to modify it is left entirely undefined - it might crash your program, or it might not). The variable fun is initialised with the location of this strange unnamed, unmodifiable, long-lived array (but just like in the first example, this association isn't permanent - you can make fun point to something else).
The reason there are so many confusing similarities and differences between arrays and pointers comes down to two things:
The "array syntax" in C (the [] operator) actually works on pointers, not arrays!
Trying to pin down an array is a bit like catching fog - in almost all cases the array evaporates and is replaced by a pointer to its first element instead.
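One of the few places where the array does not evaporate is sizeof, which is what the earlier "change the 4 to 40" test relies on. A minimal sketch, with hypothetical function names:

```c
#include <stddef.h>

/* sizeof applied to the array itself gives the size of the whole array. */
size_t size_of_array(void) {
    char fun[40];
    return sizeof(fun);           /* 40 */
}

/* Once the array has turned into a pointer, only the pointer is measured. */
size_t size_of_pointer(void) {
    char fun[40];
    char *p = fun;                /* fun converts to &fun[0] here */
    return sizeof(p);             /* size of a char *, whatever that is */
}
```

The exact value returned by size_of_pointer depends on the platform, but it does not change when the array length does.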