Multiple Reference and Dereference in C - c

Can somebody clealry explain me the concept behind multiple reference and dereference ? why does the following program gives output as 'h' ?
int main()
{
char *ptr = "hello";
printf("%c\n", *&*&*ptr);
getchar();
return 0;
}
and not this , instead it produces 'd' ?
int main()
{
char *ptr = "hello";
printf("%c\n", *&*&ptr);
getchar();
return 0;
}
I read that consecutive use of '*' and '&' cancels each other but this explanation does not provide the reason behind two different outputs generated in above codes?

The first program produces h because &s and *s "cancel" each other: "dereferencing an address of X" gives back the X:
ptr - a pointer to the initial character of "hello" literal
*ptr - dereference of a pointer to the initial character, i.e. the initial character
&*ptr the address of the dereference of a pointer to the initial character, i.e. a pointer to the initial character, i.e. ptr itself
And so on. As you can see, a pair *& brings you back to where you have started, so you can eliminate all such pairs from your dereference / take address expressions. Therefore, your first program's printf is equivalent to
printf("%c\n", *ptr);
The second program has undefined behavior, because a pointer is being passed to printf with the format specifier of %c. If you pass the same expression to %s, the word hello would be printed:
printf("%s\n", *&*&ptr);

Lets go through the important parts of the program:
char *ptr = "hello";
makes a pointer to char which points to the string literal "hello". Now, for the confusing part:
printf("%c\n", *&*&*ptr);
Here, %c expects a char. Let us look into what type *&*&*ptr is. ptr is a char*. Applying the dereference operator(*) gives a char. Applying the address-of operator to this char gives back the char*. This is repeated again, and finally, the * at the end gives us a char, the first character of the string literal "hello", which gets printed.
In the second program, in *&*&ptr, you first apply the & operator, which gives a char**. Applying * on this gives back the char*. This is repeated again and finally , we get a char*. But %c expects a char, not a char*. So, the second program exhibits Undefined Behavior as per the C11 standard (emphasis mine):
7.21.6.1 The fprintf function
[...]
If a conversion specification is invalid, the behavior is undefined.282 If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
So, basically, anything can happen when you execute the second program. Your program might crash, emit a segmentation-fault, output weird things, or do something else.
BTW, you are right about saying:
I read that consecutive use of '*' and '&' cancels each other

Let's break down what *&*&*ptr actually is, but first, remember that when applying * to a pointer, it gives you what that pointer points to. On the other hand, when applying & to a variable, it gives you the address in memory for that variable.
Now, after getting a steady ground, let's see what you have here:
ptr is a pointer to a char, thus when doing *ptr it gives you the data which ptr points to, in this case, ptr points to a string "hello", however, a char can hold only one char, not a whole string, right? so, it point to the beginning of such a string, which is the first character in it, aka h.
Moving on...*&*&*ptr=*&*&(*ptr) = *&*&('h')= *&*(&'h') = *&*(ptr)=*&(*ptr) = *&('h')= *ptr = 'h'
If you apply the same pattern on the next function, I'm pretty sure you can figure it out.
SUMMARY: read pointers from the right to the left!

Related

What's wrong with this example with string literal?

I'm reading an answer from this site which says the following is undefined
char *fubar = "hello world";
*fubar++; // SQUARELY UNDEFINED BEHAVIOUR!
but isn't that fubar++ is done first, which means moving the pointer to e, and *() is then done, which means extract the e out. I know this is supposed to be asked on chat (I'm a kind person) but no one is there so I ask here to attract notice.
The location of the ++ is the key: If it's a suffix (like in this case) then the increment happens after.
Also due to operator precedence you increment the pointer.
So what happens is that the pointer fubar is dereference (resulting in 'h' which is then ignored), and then the pointer variable fubar is incremented to point to the 'e'.
In short: *fubar++ is fine and valid.
If it was (*fubar)++ then it would be undefined behavior, since then it would attempt to increase the first characters of the string. And literal strings in C are arrays of read-only characters, so attempting to modify a character in a literal string would be undefined behavior.
The expression *fubar++ is essentially equal to
char *temporary_variable = fubar;
fubar = fubar + 1;
*temporary_variable; // the result of the whole expression
The code shown is clearly not undefined behaviour, since *fubar++ is somewhat equal to char result; (result = *fubar, fubar++, result), i.e. it increments the pointer, and not the dereferenced value, and the result of the expression is the (dereferenced) value *fubar before the pointer got incremented. *fubar++ actually gives you the character value to which fubar originally points, but you simply make no use of this "result" and ignore it.
Note, however, that the following code does introduce undefined behaviour:
char *fubar = "hello world";
(*fubar)++;
This is because this increments the value to which fubar points and thereby manipulates a string literal -> undefined behaviour.
When replacing the string literal with an character array, then everything is OK again:
int main() {
char test[] = "hello world";
char* fubar = test;
(*fubar)++;
printf("%s\n",fubar);
}
Output:
iello world

Pointers and Characters Variables in C

I'm trying to store the memory location of the evil variable in the ptr variable, but instead of the memory location, the variable prints out the value within the variable and not it's location. Both the ptr variable and the &evil syntax print the same result which is the value of the variable and not it's location in memory. Could someone please nudge me in the right direction, and help me determine the syntax needed to store the memory location of a string/char variable in C?
int main()
{
char *ptr;
char evil[4];
memset(evil, 0x43, 4);//fill evil variable with 4 C's
ptr = &evil[0];//set ptr variable equal to evil variable's memory address
printf(ptr);//prints 4 C's
printf(&evil);//prints 4 C's
return 0;
}
That's normal, because the first parameter to printf is a format specifier, which is basically a char* pointing to a string. Your string will be printed as-is because it does not contain any format specifiers.
If you want to display a pointer's value as an address, use %p as a format specifier, and your pointer as subsequent parameter:
printf("%p", ptr);
However, note that your code invokes Undefined Behavior (UB) because your string is not null-terminated. Chances are it was null-terminated "out-of-luck".
Also notice that to be "correct code", cast the pointer to void* when sending it to printf, because different types of pointers may differ on certain platforms, and the C standard requires the parameter to be pointer-to-void. See this thread for more details: printf("%p") and casting to (void *)
printf("%p", (void*)ptr); // <-- C standard requires this cast, although it migh work without it on most compilers and platforms.
You need #include <stdio.h> for printf, and #include <string.h> for memset.
int main()
You can get away with this, but int main(void) is better.
{
char *ptr;
char evil[4];
memset(evil, 0x43, 4);//fill evil variable with 4 C's
This would be more legible if you replaced 0x43 by 'C'. They both mean the same thing (assuming an ASCII-based character set). Even better, don't repeat the size:
memset(evil, 'C', sizeof evil);
ptr = &evil[0];//set ptr variable equal to evil variable's memory address
This sets ptr to the address of the initial (0th) element of the array object evil. This is not the same as the address of the array object itself. They're both the same location in memory, but &evil[0] and &evil are of different types.
You could also write this as:
ptr = evil;
You can't assign arrays, but an array expression is, in most contexts, implicitly converted to a pointer to the array's initial element.
The relationship between arrays and pointers in C can be confusing.
Rule 1: Arrays are not pointers.
Rule 2: Read section 6 of the comp.lang.c FAQ.
printf(ptr);//prints 4 C's
The first argument to printf should (almost) always be a string literal, the format string. If you give it the name of a char array variable, and it happens to contain % characters, Bad Things Can Happen. If you're just printing a string, you can use "%s" as the format string:
printf("%s\n", ptr);
(Note that I've added a newline so the output is displayed properly.)
Except that ptr doesn't point to a string. A string, by definition, is terminated by a null ('\0') character. Your evil array isn't. (It's possible that there just happens to be a null byte just after the array in memory. Do not depend on that.)
You can use a field width to determine how many characters to print:
printf("%.4s\n", ptr);
Or, to avoid the error-prone practice of having to write the same number multiple times:
printf("%.*s\n", (int)sizeof evil, evil);
Find a good document for printf if you want to understand that.
(Or, depending on what you're doing, maybe you should arrange for evil to be null-terminated in the first place.)
printf(&evil);//prints 4 C's
Ah, now we have some serious undefined behavior. The first argument to printf is a pointer to a format string; it's of type const char*. &evil is of type char (*)[4], a pointer to an array of 4 char elements. Your compiler should have warned you about that (the format string has a known type; the following arguments do not, so getting their types correct is up to you). If it seems to work, it's because &evil points to the same memory location as &evil[0], and different pointer types probably have the same representation on your systems, and perhaps there happens to be a stray '\0' just after the array -- perhaps preceded by some non-printable characters that you're not seeing.
If you want to print the address of your array object, use the %p format. It requires an argument of the pointer type void*, so you'll need to cast it:
printf("%p\n", (void*)&evil);
return 0;
}
Putting this all together and adding some bells and whistles:
#include <stdio.h>
#include <string.h>
int main(void)
{
char *ptr;
char evil[4];
memset(evil, 'C', sizeof evil);
ptr = &evil[0];
printf("ptr points to the character sequence \"%.*s\"\n",
(int)sizeof evil, evil);
printf("The address of evil[0] is %p\n", (void*)ptr);
printf("The address of evil is also %p\n", (void*)&evil);
return 0;
}
The output on my system is:
ptr points to the character sequence "CCCC"
The address of evil[0] is 0x7ffc060dc650
The address of evil is also 0x7ffc060dc650
Using
printf(ptr);//prints 4 C's
causes undefined behavior since the first argument to printf needs to be a null terminated string. In your case it is not.
Could someone please nudge me in the right direction
To print an address, you need to use the %p format specifier in the call to printf.
printf("%p\n", ptr);
or
printf("%p\n", &evil);
or
printf("%p\n", &evil[0]);
See printf documentation for all the ways it can be used.
if you want to see the address of a char - without all the array / pointer / decay oddness
int main()
{
char *ptr;
char evil;
evil = 4;
ptr = &evil;
printf("%p\n",(void*)ptr);
printf("%p\n",(void*)&evil);
return 0;
}

Char pointers and the printf function

I was trying to learn pointers and I wrote the following code to print the value of the pointer:
#include <stdio.h>
int main(void) {
char *p = "abc";
printf("%c",*p);
return 0;
}
The output is:
a
however, if I change the above code to:
#include <stdio.h>
int main(void) {
char *p = "abc";
printf(p);
return 0;
}
I get the output:
abc
I don't understand the following 2 things:
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
as per my understanding (which is very little), *p points to a contiguous block of memory that contains abc. I expected both outputs to be the same, i.e.
abc
are the different outputs because of the different ways of printing?
Edit 1
Additionally, the following code produces a runtime error. Why so?
#include <stdio.h>
int main(void) {
char *p = "abc";
printf(*p);
return 0;
}
For your first question, the printf function (and family) takes a string as first argument (i.e. a const char *). That string could contain format codes that the printf function will replace with the corresponding argument. The rest of the text is printed as-is, verbatim. And that's what is happening when you pass p as the first argument.
Do note that using printf this way is highly unrecommended, especially if the string is contains input from a user. If the user adds formatting codes in the string, and you don't provide the correct arguments then you will have undefined behavior. It could even lead to security holes.
For your second question, the variable p points to some memory. The expression *p dereferences the pointer to give you a single character, namely the one that p is actually pointing to, which is p[0].
Think of p like this:
+---+ +-----+-----+-----+------+
| p | ---> | 'a' | 'b' | 'c' | '\0' |
+---+ +-----+-----+-----+------+
The variable p doesn't really point to a "string", it only points to some single location in memory, namely the first character in the string "abc". It's the functions using p that treat that memory as a sequence of characters.
Furthermore, constant string literals are actually stored as (read-only) arrays of the number of character in the string plus one for the string terminator.
Also, to help you understand why *p is the same as p[0] you need to know that for any pointer or array p and valid index i, the expressions p[i] is equal to *(p + i). To get the first character, you have index 0, which means you have p[0] which then should be equal to *(p + 0). Adding zero to anything is a no-op, so *(p + 0) is the same as *(p) which is the same as *p. Therefore p[0] is equal to *p.
Regarding your edit (where you do printf(*p)), since *p returns the value of the first "element" pointed to by p (i.e. p[0]) you are passing a single character as the pointer to the format string. This will lead the compiler to convert it to a pointer which is pointing to whatever address has the value of that single character (it doesn't convert the character to a pointer to the character). This address is not a very valid address (in the ASCII alphabet 'a' has the value 97 which is the address where the program will look for the string to print) and you will have undefined behavior.
p is the format string.
char *p = "abc";
printf(p);
is the same as
print("abc");
Doing this is very bad practice because you don't know what the variable
will contain, and if it contains format specifiers, calling printf may do very bad things.
The reason why the first case (with "%c") only printed the first character
is that %c means a byte and *p means the (first) value which p is pointing at.
%s would print the entire string.
char *p = "abc";
printf(p); /* If p is untrusted, bad things will happen, otherwise the string p is written. */
printf("%c", *p); /* print the first byte in the string p */
printf("%s", p); /* print the string p */
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
With your code you told printf to use your string as the format string. Meaning your code turned equivalent to printf("abc").
as per my understanding (which is very little), *p points to a contiguous block of memory that contains abc. I expected both outputs to be the same
If you use %c you get a character printed, if you use %s you get a whole string. But if you tell printf to use the string as the format string, then it will do that too.
char *p = "abc";
printf(*p);
This code crashes because the contents of p, the character 'a' is not a pointer to a format string, it is not even a pointer. That code should not even compile without warnings.
You are misunderstanding, indeed when you do
char *p = "Hello";
p points to the starting address where literal "Hello" is stored. This is how you declare pointers. However, when afterwards, you do
*p
it means dereference p and obtain object where p points. In our above example this would yield 'H'. This should clarify your second question.
In case of printf just try
printf("Hello");
which is also fine; this answers your first question because it is effectively the same what you did when passed just p to printf.
Finally to your edit, indeed
printf(*p);
above line is not correct since printf expects const char * and by using *p you are passing it a char - or in other words 'H' assuming our example. Read more what dereferencing means.
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
"abc" is your format specifier. That's why it's printing "abc". If the string had contained %, then things would have behaved strangely, but they didn't.
printf("abc"); // Perfectly fine!
why did printf not require a format specifier in the second case? Is printf(pointer_name) enough to print the value of the pointer?
%c is the character conversion specifier. It instructs printf to only print the first byte. If you want it to print the string, use...
printf ("%s", p);
The %s seems redundant, but it can be useful for printing control characters or if you use width specifiers.
The best way to understand this really is to try and print the string abc%def using printf.
The %c format specifier expects a char type, and will output a single char value.
The first parameter to printf must be a const char* (a char* can convert implicitly to a const char*) and points to the start of a string of characters. It stops printing when it encounters a \0 in that string. If there is not a \0 present in that string then the behaviour of that function is undefined. Because "abc" doesn't contain any format specifiers, you don't pass any additional arguments to printf in that case.

C Language Pointer Used With Arrays

I compiled and ran the following code and the results too are depicted below.
#include <stdio.h>
int main(void) {
char *ptr = "I am a string";
printf("\n [%s]\n", ptr);
return 0; }
** [I am a string]**
I want to understand how a string has been assigned inside a pointer char. As per my understanding the pointer can hold only an address, not a complete string. Here it is holding one whole sentence. I do not understand how being a pointer allows it to behave such way.
If I change the following line of code in the above example,
printf("\n [%c]\n", ptr);
It does not print one single charactor and stop. What it does is that it prints out an unrecognised character which is completely out of ASCII table. I do not understand how that too is happening. I would appreciate some light shred on this issue.
As per my understanding the pointer can hold only an address, not a
complete string
char *ptr = "I am a string";
Is a string literal the string is stored in the read-only location and the address in which the data is stored is returned to the pointer ptr.
It does not print one single charactor and stop. What it does is that
it prints out an unrecognised character which is completely out of
ASCII table. I do not understand how that too is happening
ptr is a pointer and using wrong format specifier in printf() lead to undefined behvaior.
With %s if you provide the address where the string is stored the printf() prints out the whole string
A pointer does not hold a string, it points to a string. (Easy to remember, it's called a "pointer", not a "holder"). To see the difference, write your postal address on a yellow sticky note. Does this piece of paper hold you? No, it points to you. It holds your address.
Pointers are computer equivalent of postal addresses (in fact things that pointers do hold are called addresses). They don't hold "real things" like strings, they tell where "real things" live.
Back to our string, the pointer actually points to the first character of the string, not to the string as a whole, but that's not a problem because we know the rest of the string lives right next to the first chsracter.
Now "%s" as a format specifier wants a pointer to the first character of a string, so you can correctly pass p to printf. OTOH %c wants a character, not a pointer, so passing p in this case leads to undefined behavior.
So how come we can say things like char* p = "abc"? String literals are arrays of characters, and an array in most cases decays into pointer to its first element. Array-to-pointer decay is another confusing property of C but fortunately there is a lot of information available on it out there. OTOH `char p = "abc" is not valid, because a character is not an array (a house is not a street).
Also
char *ptr = "I am a string";
automatically inserts a null character at the end. So when you do a printf with %s format specifier, it starts from the address of the string literal and prints upto the null character and stops.

I am pretty curious about char pointers

I have learned that char *p means "a pointer to char type"
and also I think I've also learned that char means to read
that amount of memory once that pointer reaches its' destination.
so conclusively, in
char *p = "hello World";
the p contains the strings' address and
the p is pointing right at it
Qusetions.
if the p points to the string, shouldn't it be reading only the 'h'???
since it only reads the sizeof a char?
why does `printf("%s", p) print the whole string???
I also learned in Rithcie's book that pointer variables don't possess a data type.
is that true???
So your string "hello world" occupies some memory.
[h][e][l][l][o][ ][w][o][r][l][d][\0]
[0][1][2][3][4][5][6][7][8][9][A][B ]
The pointer p does, in fact, only point to the first byte. In this case, byte 0. But if you use it in the context of
printf("%s", p)
Then printf knows to print until it gets the null character \0, which is why it will print the entire string and not just 'h'.
As for the second question, pointers do posses a data type. If you were to say it outloud, the name would probably be something like type "pointer to a character" in the case of p, or "pointer to an integer" for int *i.
Pointer variables don't hold a data type, they hold an address. But you use a data type so you know how many bytes you'll advance on each step, when reading from memory using that pointer.
When you call printf, the %s in the expression is telling the function to start reading at the address indicated by *p (which does hold the byte value for 'h', as you said), and stop reading when it reaches a terminating character. That's a character that has no visual representation (you refer to it as \0 in code). It tells the program where a string ends.
Here *p is a pointer to some location in memory, that it assumes to be 1 byte (or char). So it points to the 'h' letter. So p[0] or *(p+0) will give you p. But, your string ends with invisible \0 character, so when you use printf function it outputs all symbols, starting from the one, where *p points to and till `\0'.
And pointer is just a variable, that is able to hold some address (4, 8 or more bytes).
For question:
I also learned in Rithcie's book that pointer variables don't possess a data type. is that true???
Simply put, YES.
Data types in C are used to define a variable before its use. The definition of a variable will assign storage for the variable and define the type of data that will be held in the location.
C has the following basic built-in datatypes.int,float,double,char.
Quoting from C Data types
Pointer is derived data type, each data type can have a pointer associated with. Pointers don't have a keyword, but are marked by a preceding * in the variable and function declaration/definition. Most compilers supplies the predefined constant NULL, which is equivalent to 0.
if the p points to the string, shouldn't it be reading only the 'h'???
since it only reads the sizeof a char? why does printf("%s", *p) print
the whole string???
change your printf("%s", *p) to printf("%c", *p) you see what you want. both calls the printf in different ways on the basis of format specifier i.e string(%s) or char(%c).
to print the string use printf("%s", p);
to print the char use printf("%c", *p);
Second Ans. : Pointers possesses a data type. that's why you used char *p .

Resources