I am having a problem with my exercise in which I have to explain the running of pointers in C.
Can you explain what is the differences between char *pp and (char*) p and the outputs to me?
#include <stdio.h>
#include <stdlib.h>
/*
*
*/
int main(int argc, char** argv) {
int n=260, *p=&n;
printf("n=%d\n", n);
char *pp=(char*)p;
*pp=0;
printf("n=%d\n",n);
return (EXIT_SUCCESS);
}
n=260
n=256
I'm so sorry for the mistake I've done! Hope you guys can help me.
Your question is a basic question, but one that every new C-programmer wrestles with and is fundamental to understanding C. Understanding pointers. While they are easy to understand once you understand them, getting to that point can be frustrating based on the way the information is presented in many books or tutorials.
Pointer Basics
A pointer is simply a normal variable that holds the address of something else as its value. In other words, a pointer points to the address where something else can be found. Where you normally think of a variable holding an immediate values, such as int n = 260;, a pointer (e.g. int *p = &n;) would simply hold the address where 260 is stored in memory.
If you need to access the value stored at the memory address pointed to by p, you dereference p using the unary '*' operator, (e.g. int j = *p; will initialize j = 260).
If you want to obtain a variables address in memory, you use the & (address of) operator. If you need to pass a variable as a pointer, you simply provide the address of the variable as a parameter.
Since p points to the address where 260 is stored, if you change that value at that address (e.g. *p = 41;) 41 is now stored at the address where 260 was before. Since p points to the address of n and you have changed the value at that address, n now equals 41. However j resides in another memory location and its value was set before you changed the value at the address for n, the value for j remains 260.
Pointer Arithmetic
Pointer arithmetic works the same way regardless of the type of object pointed to because the type of the pointer controls the pointer arithmetic, e.g. with a char * pointer, pointer+1 points to the next byte (next char), for an int * pointer (normal 4-byte integer), pointer+1 will point to the next int at an offset 4-bytes after pointer. (so a pointer, is just a pointer.... where arithmetic is automatically handled by the type)
In your case you create a second pointer of a different type char *pp = (char*)p;. The pointer pp now also holds the address of n but it is interpreted at type char on access instead of type int.
The C standard prohibits access of a value stored at an address though a pointer of a different type. C11 Standard - §6.5 Expressions (p6,7) (known as the strict-aliasing rule). There are exceptions to the rule. One exception (the last point) is that any value may be accessed through a pointer of char type.
What Happens to the Value of n In Your Case?
When you assign:
*pp = 0;
you storing the single-byte 0 (or 00000000 in binary) to the memory location held by pp. Here is where endianess (little-endian, big-endian) come into play. Recall, for little-endian computers (just about all x86 and x86_64 IBM-PC clone type boxes), the values are stored in memory with the Least-Significant Byte first. (big-endian stores values with the Most-Significan Byte first). So your original value of n (10000100in binary) is stored in memory on a little-endian box as
n (little endian) : 00000100-00000001-00000000-00000000 (260)
^
|
p (type int)
The character pointer pp is assigned the address held by p, so both p and pp, hold the same address (the difference being one is a pointer to int the other a pointer to char:
n (little endian) : 00000100-00000001-00000000-00000000 (260)
^
|
p (type int)
pp (type char)
When you dereference pp (e.g. *pp) and assign the value zero (e.g. *pp = 0;), you overwrite the first byte of n in memory with zero. After the assignment, you now have:
n (little endian) : 00000000-00000001-00000000-00000000 (256)
^
|
p (type int)
pp (type char)
Which is the binary value 100000000, (256 or hex 0x0100) and what your code outputs for the value of n. Ask yourself this, if the computer you were using was big-endian, what would be resulting value have been?
Let me know if you have any further questions.
char *pp declares the variable pp as a pointer to char - pp will store the address of a char object.
(char *)p is a cast expression - it means “treat the value of p as a char *”.
p was declared as an int * - it stores the address of an int object (in this case, the address of n). The problem is that the char * and int *types are not compatible - you can’t assign one to the other directly1. You have to use a cast to convert the value to the right type.
Pointers to different types are themselves different types, and do not have to have the same size or representation. The one exception is the void * type - it was introduced specifically to be a “generic” pointer type, and you don’t need to explicitly cast when assigning between void * and other pointer types.
Related
I am trying to figure out all the possible ways I could fill in int pointer k considering the following givens:
int i = 40;
int *p = &i;
int *k = ___;
So far I came up with "&i" and "p". However, is it possible to fill in the blank with "*&p" or "&*p"?
My understanding of "*&p" is that it is dereferencing the address of an integer pointer. Which to me means if printed out would output the content of p, which is &i. Or is that not possible when initializing an int pointer? Or is it even possible at all anytime?
I understand "&*p" as the memory address of the integer *p points to. This one I am really unsure about also.
If anyone has any recommendations or suggestions I will greatly appreciate it! Really trying to understand pointers better.
Pointer Basics
A pointer is simply a normal variable that holds the address of something else as its value. In other words, a pointer points to the address where something else can be found. Where you normally think of a variable holding an immediate values, such as int i = 40;, a pointer (e.g. int *p = &i;) would simply hold the address where 40 is stored in memory.
If you need the value stored at the memory address p points to, you dereference p using the unary '*' operator, e.g. int j = *p; will initialize j = 40).
Since p points to the address where 40 is stored, if you change that value at that address (e.g. *p = 41;) 41 is now stored at the address where 40 was before. Since p points to the address of i and you have changed the value at that address, i now equals 41. However j resides in another memory location and its value was set before you changed the value at the address for i, the value for j remains 40.
If you want to create a second pointer (e.g. int *k;) you are just creating another variable that holds an address as its value. If you want k to reference the same address held by p as its value, you simply initialize k the same way you woul intialize any other varaible by assigning its value when it is declared, e.g. int *k = p; (which is the same as assigning k = p; at some point after initialization).
Pointer Arithmetic
Pointer arithmetic works the same way regardless of the type of object pointed to because the type of the pointer controls the pointer arithmetic, e.g. with a char * pointer, pointer+1 points to the next byte (next char), for an int * pointer (normal 4-byte integer), pointer+1 will point to the next int at an offset 4-bytes after pointer. (so a pointer, is just a pointer.... where arithmetic is automatically handled by the type)
Chaining & and * Together
The operators available to take the address of an object and dereference pointers are the unary '&' (address of) operator and the unary '*' (dereference) operator. '&' in taking the address of an object adds one level of indirection. '*' in dereferening a pointer to get the value (or thing) pointed to by the pointer removes one level of indirection. So as #KamilCuk explained in example in his comment it does not matter how many times you apply one after the other, one simply adds and the other removes a level of indirection making all but the final operator superfluous.
(note: when dealing with an array-of-pointers, the postfix [..] operator used to obtain the pointer at an index of the array also acts to derefernce the array of pointers removing one level of indirection)
Your Options
Given your declarations:
int i = 40;
int *p = &i;
int *k = ___;
and the pointer summary above, you have two options, both are equivalent. You can either initialize the pointer k with the address of i directly, e.g.
int *k = &i;
or you can initialize k by assinging the address held by p, e.g.
int *k = p;
Either way, k now holds, as its value, the memory location for i where 40 is currently stored.
I am a little bit unsure what you're trying to do but,
int* p = &i;
now, saying &*p is really just like saying p since this gives you the address.
Just that p is much clearer.
The rule is (quoting C11 standard footnote 102) that for any pointer E
&*E is equivalent to E
You can have as many &*&*&*... in front of any pointer type variable that is on the right side of =.
With the &*&*&* sequence below I denote: zero or more &* sequences. I've put a space after it so it's, like, somehow visible. So: we can assign pointer k to the address of i:
int *k = &*&*&* &i;
and assign k to the same value as p has:
int *k = &*&*&* p;
We can also take the address of pointer p, so do &p, it will have int** - ie. it will be a pointer to a pointer to int. And then we can dereference that address. So *&p. It will be always equal to p.
int *k = &*&*&* *&p;
is it possible to fill in the blank with "*&p" or "&*p"?
Yes, both are correct. The *&p first takes the address of p variables then deferences it, as I said above. The *&variable should be always equal to the value of variable. The second &*p is equal to p.
My understanding of "*&p" is that it is dereferencing the address of an integer pointer. Which to me means if printed out would output the content of p, which is &i. Or is that not possible when initializing an int pointer? Or is it even possible at all anytime?
Yes and yes. It is possible, anytime, with any type. The &* is possible with complete types only.
Side note: It's get really funny with functions. The dereference operator * is ignored in front of a function or a function pointer. This is just a rule in C. See ex. this question. You can have a infinite sequence of * and & in front of a function or a function pointer as long as there are no && sequences in it. It gets ridiculous:
void func(void);
void (*funcptr)(void) = ***&***********&*&*&*&****func;
void (*funcptr2)(void) = ***&***&***&***&***&*******&******&**funcptr;
Both funcptr and funcptr2 are assigned the same value and both point to function func.
If I have a pointer to an array of char*s, in other words, a char** named p1, what would I get if I do (char*)p1? I’m guessing there would be some loss of precision. What information would I lose, and what would p1 now be pointing to? Thanks!
If you had asked about converting an int ** to an int *, the answer would be different. Let’s consider that first, because I suspect it is more representative of the question you intended to ask, and the char * case is more complicated because there is a special purpose involved in that.
Suppose you have several int: int a, b, c, d;. You can make an array of pointers to them: int *p[] = { &a, &b, &c, &d };. You can also make a pointer to one of these pointers: int **q = &p[1];. Now q points to p[1], which contains the address of b.
When you write *q, the compiler knows q points to a pointer to an int, so it knows *q points to an int. If you write **q, the compiler, knowing that *q points to an int, will get *q from memory and use that as an address to get an int.
What happens if you convert q to an int * and try to use it, as in printf("%d\n", * (int *) q);? When you convert q to an int * you are (falsely) telling the compiler to treat it as a pointer to an int. Then, * (int *) q tells the compiler to go to that address and get an int.
This is invalid code—its behavior is not defined by the C standard. Specifically, it violates C 2018 6.5 7, which says that an object shall be accessed only by an lvalue expression that has a correct type—either a type compatible with that of the actual object or certain other cases, none of which apply here. At the place q points, there is a pointer to an int, but you tried to access it as if it were an int, and that is not allowed.
Now let’s consider the char ** to char * case. As before, you might take some char **q and convert it to char *. Now, you are telling the compiler to go to the place q points, where there is a pointer to a char, and to access that memory location as if there were a char there.
C has special rules for this case. You are allowed to examine the bytes that make up objects by accessing them through a char *. So, if you convert a char ** to char * and use it, as in * (char *) q, the result will be the first (lowest addressed) byte that makes up the pointer there. You can even look at the rest of the bytes, using code like this:
char *t = (char *) q;
printf("%d\n", t[0]);
printf("%d\n", t[1]);
printf("%d\n", t[2]);
printf("%d\n", t[3]);
This code will show you the decimal values for the first four bytes that make up the pointer to char that is at the location specified by q.
In summary, converting a char ** to char * will allow you to examine the bytes that represent a char *. However, in general, you should not convert pointers of one indirection level to pointers of another indirection level.
I find pointers make much more sense when I think of them as memory addresses rather than some abstract high level thing.
So a char** is a memory address which points to, a memory address which points to, a character.
0x0020 -> 0x0010 -> 0x0041 'A'
When you cast you change the interpretation of the data, not the actual data. So the char* is
0x0020 -> 0x0010 (unprintable)
This is almost certainly not useful. Interpreting this random data as a null terminated string would be potentially disastrous.
I am from Java back ground.I am learning C in which i gone through a code snippet for type conversion from int to char.
int a=5;
int *p;
p=&a;
char *a0;
a0=(char* )p;
My question is that , why we use (char *)p instead of (char)p.
We are only casting the 4 byte memory(Integer) to 1 byte(Character) and not the value related to it
You need to consider pointers as variable that contains addresses. Their sole purpose is to show you where to look in the memory.
so consider this:
int a = 65;
void* addr = &a;
now the 'addr' contains the address of the the memory where 'a' is located
what you do with it is up to you.
here I decided to "see" that part of the memory as an ASCII character that you could print to display the character 'A'
char* car_A = (char*)addr;
putchar(*car_A); // print: A (ASCII code for 'A' is 65)
if instead you decide to do what you suggested:
char* a0 = (char)addr;
The left part of the assignment (char)addr will cast a pointer 'addr' (likely to be 4 or 8 bytes) to a char (1 byte)
The right part of the assignment, the truncated address, will be assigned as the address of the pointer 'a0'
If you don't see why it doesn't make sense let me clarify with a concrete example
Say the address of 'a' is 0x002F4A0E (assuming pointers are stored on 4 bytes) then
'*addr' is equal to 65
'addr' is equal to 0x002F4A0E
When casting it like so (char)addr this become equal to 0x0E.
So the line
char* a0 = (char)addr;
become
char* a0 = 0x0E
So 'a0' will end up pointing to the address 0x0000000E and we don't know what is in this location.
I hope this clarify your problem
First of all, p is not necessarily 4 bytes since it's architecture-dependent. Second, p is a pointer to an integer, a0 is a pointer to a character, not a character. You're taking a pointer pointing to an integer and casting it to a pointer to a character. There are few good reasons to do this. You could also cast the value to a character, but I can't imagine any reason for doing this either.
Pointers do not provide information whether they point to a single object of first object of an array.
Consider
int *p;
int a[5] = { 1, 2, 3, 4, 5 };
int x = 1;
p = a;
p = &x;
So having a value in the pointer p you can not say whether the value is the address of the first element of the array a or it is the address of the single object x.
It is your responsibility to interpret the address correctly.
In this expression-statement
a0=(char* )p;
the address of the extent of memory pointed to by the pointer p and occupied by an object of the type int (it is unknown whether it is a single object or the first object of an array) is interpreted as an address of an extent of memory occupied by an object of the type char. Whether it is a single object of the type char or the first object of a character array with the size equal to sizeof( int ) depends on your intention that is how you are going to deal with the pointer.
I've read several posts about casting int pointers to char pointers but i'm still confused on one thing.
I understand that integers take up four bytes of memory (on most 32 bit machines?) and characters take up on byte of memory. By casting a integer pointer to a char pointer, will they both contain the same address? Does the cast operation change the value of what the char pointer points to? ie, it only points to the first 8 bits of an integers and not all 32 bits ? I'm confused as to what actually changes when I cast an int pointer to char pointer.
By casting a integer pointer to a char pointer, will they both contain the same address?
Both pointers would point to the same location in memory.
Does the cast operation change the value of what the char pointer points to?
No, it changes the default interpretation of what the pointer points to.
When you read from an int pointer in an expression *myIntPtr you get back the content of the location interpreted as a multi-byte value of type int. When you read from a char pointer in an expression *myCharPtr, you get back the content of the location interpreted as a single-byte value of type char.
Another consequence of casting a pointer is in pointer arithmetic. When you have two int pointers pointing into the same array, subtracting one from the other produces the difference in ints, for example
int a[20] = {0};
int *p = &a[3];
int *q = &a[13];
ptrdiff_t diff1 = q - p; // This is 10
If you cast p and q to char, you would get the distance in terms of chars, not in terms of ints:
char *x = (char*)p;
char *y = (char*)q;
ptrdiff_t diff2 = y - x; // This is 10 times sizeof(int)
Demo.
The int pointer points to a list of integers in memory. They may be 16, 32, or possibly 64 bits, and they may be big-endian or little endian. By casting the pointer to a char pointer, you reinterpret those bits as characters. So, assuming 16 bit big-endian ints, if we point to an array of two integers, 0x4142 0x4300, the pointer is reinterpreted as pointing to the string "abc" (0x41 is 'a', and the last byte is nul). However if integers are little endian, the same data would be reinterpreted as the string "ba".
Now for practical purposes you are unlikely to want to reinterpret integers as ascii strings. However its often useful to reinterpret as unsigned chars, and thus just a stream of raw bytes.
Casting a pointer just changes how it is interpreted; no change to its value or the data it points to occurs. Using it may change the data it points to, just as using the original may change the data it points to; how it changes that data may differ (which is likely the point of doing the casting in the first place).
A pointer is a particular variable that stores the memory address where another variable begins. Doesnt matter if the variable is a int or a char, if the first bit has the same position in the memory, then a pointer to that variable will look the same.
the difference is when you operate on that pointer. If your pointer variable is p and it's a int pointer, then p++ will increase the address that it contains of 4 bytes.
if your pointer is p and it's a char pointer, then p++ will increase the address that it contains of 1 byte.
this code example will help you understand:
int main(){
int* pi;
int i;
char* pc;
char c;
pi = &i;
pc = &c;
printf("%p\n", pi); // 0x7fff5f72c984
pi++;
printf("%p\n", pi); // 0x7fff5f72c988
printf("%p\n", pc); // 0x7fff5f72c977
pc++;
printf("%p\n", pc); // 0x7fff5f72c978
}
I was looking at an answer to "Pointers in C: when to use the ampersand and the asterisk" and am confused about the example int *p2 = &i; (from Dan Olson's answer)
The address of i is not an int, it's something like 0x02304 say, right? So how can we put this into p2? How can a C pointer of type int hold a memory address, given that a byte memory address is not of type int?
Thanks!
P.S. For anyone confused on this point, another thread I found helpful (though it didn't answer this question for me) is "What exactly is a C pointer if not a memory address" Good luck.
I think the reason int *p2 = &i; is confusing is the combined effect of the simultaneous declaration and instantiation, and the spacing. I'll explain.
Typically, for a pointer-to-integer p2 and an integer i, writing "*p2" dereferences p2 and gives the int living at the address &i.
So the code "int *p2 = &i;" makes it look like an int is being set equal to a memory address.
Indeed, the code
int i = 1;
int *p2;
*p2 = &i
is wrong, because in the last line, *p2 is an int since it is p2 dereferenced, and &i is a pointer.
Why int *p2 = &i; is NOT doing the same thing as the last line of the flawed code above:
The code int *p2 = &i; is different from the 3 lines of code above because it is a declaration. When you declare a pointer variable, you put the type, (e.g. int, long, char, etc.), and the name of your variable (like normal) - and you also put an asterisk * in between these two pieces (see #chqrlie answer above with regard to the many spacing options for this - takehome is best-practice is to adhere the * to the variable name). In a declaration, the * is NOT dereferencing the pointer. Rather, it is telling the compiler that the variable myPointer will be a pointer to memory holding data of the declared type (the int, long, char, etc. from before). So, in the bit of code "int *p2 = &i;", where the pointer is being declared and instantiated at the same time, the * does not render an int on the left-hand side. For declarations, while it is safer (as #chqrlie points out) to put the * next to the variable name, this does not unwrap the pointer even if the pointer is instantiated in the declaration. For declarations, think of the * as being attached to the type (rather than to the variable name, where for good reasons it is likely to be). I like to declare pointers with the * right next to the pointer name, but just understand that when instantiating a pointer in the declaration, a perhaps clearer way for a new learner to imagine the line would be int* p2 = &1:
int* (type pionter-to-int) p2 (name of pointer) = &i (equals the pointer that gives the address of integer i) ;
Thank you to everyone who answered and commented, and good luck to everyone who may have come here trying to figure out something about pointers.
The address of i is not an int, it's something like 0x02304 say, right?
Pointers are variables that hold memory addresses. An address, just like the address of your house, is an integer assigned to the location of a byte of memory.
How can a C pointer of type int hold a memory address
In your example, p2 is variable of type pointer-to-int. It is a pointer - a memory address - that you're declaring points to memory where an int variable will be stored.
p2 is not an int, it is a pointer to int. As such, it can hold the address of an int variable.
The syntax int *p2; defines a pointer to int.
Initializing p2 with &i stores the address of variable i into p2. Modifying the value pointed to by p2 will modify the value of i.
The following alternative syntaxes are all equivalent:
int *p2 = &i;
int * p2 = & i;
int * p2 =& i;
int*p2=&i;
int* p2 = &i;
The preferred syntax is int *p2 = &i; because it avoids a common misunderstanding when defining multiple variables on the same line:
int *p1, *p2; // defines 2 pointers to int
int *p1, p2; // p1 is a pointer-to-int, whereas p2 is an int
Tacking the * to the type makes the latter definition very confusing:
int* p1, p2; // p1 is a pointer-to-int, whereas p2 is an int
As a consequence, defining variables with different indirection levels on the same line is also strongly discouraged.
Pointers are abstractions of memory addresses, with some associated type semantics.
p2 has type int *, or "pointer to int". It stores the location of an integer object (in this case, the location of the object i). The type of the expression *p2 is int:
p2 == &i; // both expressions have type int * and evaluate to an address value
*p2 == i; // both expressions have type int and evaluate to an integer value
Pointers are as big as they need to be to store an address value, however that address value is represented for the given platform (whether that's a single integer value, or a pair of values representing a page number and offset, or some other format). Note that different pointer types may be different sizes, although on modern desktop architectures they are all the same size (32 or 64 bit).
The type of the pointer matters for pointer arithmetic. Given a pointer p, the expression p + 1 yields the address of the next object of the pointed-to type. If the pointed-to type is 1 byte wide (such as char) and its current address is 0x8000, then p + 1 yields the address of the next byte, or 0x8001. If the type is 4 bytes wide (such as a long) and its current address is 0x8000, then p + 1 yields the address of the fourth next byte, or 0x8004.
A pointer is its own type.
If it were used with a different syntax it could be written like this:
Pointer x = new Pointer(int, address);