I'm reading an answer from this site which says the following is undefined
char *fubar = "hello world";
*fubar++; // SQUARELY UNDEFINED BEHAVIOUR!
But isn't fubar++ done first, which moves the pointer to the e, and the * applied afterwards, which extracts the e? I know this should be asked in chat, but no one is there, so I am asking here.
The location of the ++ is the key: If it's a suffix (like in this case) then the increment happens after.
Also, because postfix ++ binds more tightly than unary *, it is the pointer that gets incremented, not the character.
So what happens is that the pointer fubar is dereferenced (resulting in 'h', which is then ignored), and then the pointer variable fubar is incremented to point to the 'e'.
In short: *fubar++ is fine and valid.
If it were (*fubar)++ then it would be undefined behavior, since it would attempt to increment the first character of the string. String literals in C are arrays of characters that must not be modified, so attempting to modify a character in a string literal is undefined behavior.
The expression *fubar++ is essentially equal to
char *temporary_variable = fubar;
fubar = fubar + 1;
*temporary_variable; // the result of the whole expression
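As a rough sketch of the equivalence above, the helper below applies the same *p++ pattern through a pointer-to-pointer, so the effect on the caller's pointer can be observed. The name read_and_advance is hypothetical and not part of the question. It only reads the character, so it is safe to use on a string literal.

```c
/* Hypothetical helper, not part of the original question.
 * Evaluates *(*pp)++ : yields the character that *pp pointed to BEFORE
 * the increment, and leaves *pp pointing one character further on. */
char read_and_advance(const char **pp)
{
    return *(*pp)++;
}
```

Called with the address of a pointer to "hello world", it returns 'h' and leaves that pointer at the 'e'. The literal itself is never written.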
The code shown is clearly not undefined behaviour, since *fubar++ is somewhat equal to char result; (result = *fubar, fubar++, result), i.e. it increments the pointer, and not the dereferenced value, and the result of the expression is the (dereferenced) value *fubar before the pointer got incremented. *fubar++ actually gives you the character value to which fubar originally points, but you simply make no use of this "result" and ignore it.
Note, however, that the following code does introduce undefined behaviour:
char *fubar = "hello world";
(*fubar)++;
This is because it increments the value to which fubar points and thereby modifies a string literal, which is undefined behaviour.
When the string literal is replaced with a character array, everything is OK again:
#include <stdio.h>
int main(void) {
char test[] = "hello world";
char* fubar = test;
(*fubar)++;
printf("%s\n",fubar);
}
Output:
iello world
Can somebody clearly explain the concept behind multiple reference and dereference operators? Why does the following program output 'h'?
#include <stdio.h>
int main(void)
{
char *ptr = "hello";
printf("%c\n", *&*&*ptr);
getchar();
return 0;
}
whereas this one produces 'd' instead?
#include <stdio.h>
int main(void)
{
char *ptr = "hello";
printf("%c\n", *&*&ptr);
getchar();
return 0;
}
I read that consecutive use of '*' and '&' cancels each other, but this explanation does not account for the two different outputs produced by the programs above.
The first program produces h because &s and *s "cancel" each other: "dereferencing an address of X" gives back the X:
ptr - a pointer to the initial character of "hello" literal
*ptr - dereference of a pointer to the initial character, i.e. the initial character
&*ptr - the address of the dereference of a pointer to the initial character, i.e. a pointer to the initial character, i.e. ptr itself
And so on. As you can see, a pair *& brings you back to where you have started, so you can eliminate all such pairs from your dereference / take address expressions. Therefore, your first program's printf is equivalent to
printf("%c\n", *ptr);
The second program has undefined behavior, because a pointer is being passed to printf with the format specifier of %c. If you pass the same expression to %s, the word hello would be printed:
printf("%s\n", *&*&ptr);
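The cancellation rule can be checked directly. The two functions below are a minimal sketch with hypothetical names: each compares one of the long expressions from the question with its reduced form.

```c
#include <stdbool.h>

/* Hypothetical checks: each returns true when the "*&" pairs cancel. */

bool char_form_cancels(const char *ptr)
{
    /* Both sides have type char: the first character of the string. */
    return *&*&*ptr == *ptr;
}

bool pointer_form_cancels(const char *ptr)
{
    /* Both sides have type const char *: the pointer itself. */
    return *&*&ptr == ptr;
}
```

Both functions return true for any valid string, which is why the first expression fits %c and the second fits %s.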
Let's go through the important parts of the program:
char *ptr = "hello";
makes a pointer to char which points to the string literal "hello". Now, for the confusing part:
printf("%c\n", *&*&*ptr);
Here, %c expects a char. Let us look into what type *&*&*ptr is. ptr is a char*. Applying the dereference operator(*) gives a char. Applying the address-of operator to this char gives back the char*. This is repeated again, and finally, the * at the end gives us a char, the first character of the string literal "hello", which gets printed.
In the second program, in *&*&ptr, you first apply the & operator, which gives a char**. Applying * to this gives back the char*. This is repeated once more, and finally we get a char*. But %c expects a char, not a char*. So the second program exhibits undefined behavior as per the C11 standard (emphasis mine):
7.21.6.1 The fprintf function
[...]
If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
So, basically, anything can happen when you execute the second program. Your program might crash with a segmentation fault, output weird things, or do something else.
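The type reasoning above can be confirmed by the compiler itself. This is only a sketch and assumes a C11 compiler, since it relies on _Generic. The macro TYPE_NAME and both function names are hypothetical.

```c
/* Hypothetical macro: names the type of an expression (C11 _Generic). */
#define TYPE_NAME(expr) _Generic((expr), \
    char:    "char",                     \
    char *:  "char *",                   \
    char **: "char **",                  \
    default: "other")

/* Type of the expression passed to %c in the first program. */
const char *type_of_char_form(char *ptr)
{
    return TYPE_NAME(*&*&*ptr);
}

/* Type of the expression passed to %c in the second program. */
const char *type_of_pointer_form(char *ptr)
{
    return TYPE_NAME(*&*&ptr);
}
```

The first function reports "char", which matches %c. The second reports "char *", which is the mismatch that makes the second program undefined.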
BTW, you are right about saying:
I read that consecutive use of '*' and '&' cancels each other
Let's break down what *&*&*ptr actually is, but first, remember that when applying * to a pointer, it gives you what that pointer points to. On the other hand, when applying & to a variable, it gives you the address in memory for that variable.
Now, after getting a steady ground, let's see what you have here:
ptr is a pointer to a char, so *ptr gives you the data that ptr points to. Here ptr points to the string "hello", but a char can hold only one character, not a whole string. So ptr points to the beginning of the string, and *ptr is its first character, h.
Moving on: *&*&*ptr = *&*&(*ptr) = *&*(&*ptr) = *&*(ptr) = *&(*ptr) = *(&*ptr) = *(ptr) = 'h'
If you apply the same pattern to the second program, I'm pretty sure you can figure it out.
SUMMARY: read pointer expressions from right to left!
Look at this:
#include <stdio.h>
int main(void) {
char *verse = "zappa";
printf("%c\n", *verse);
// the program correctly prints the first character
*verse++;
printf("%c\n", *verse);
// the program correctly prints the second character, which in fact lies
// in the adjacent memory cell
(*verse)++;
printf("%c\n", *verse);
// the program doesn't print anything and crashes. Why?
return 0;
}
Why does my program crash as I try to increment the value pointed by verse? I was expecting something like the next character in the ASCII table.
This line (*verse)++; modifies a string literal. That is undefined behavior.
Note that the earlier line of code, *verse++, is parsed as *(verse++), so it only advances the pointer.
verse points to a string literal, which you're not allowed to modify. Try:
char verse1[] = "zappa";
char *verse = verse1;
Now your code will work because verse1 is a modifiable array initialized with a copy of the string.
Note that *verse++ is effectively equivalent to just verse++. The indirection returns the value pointed to by the pointer before the increment, but since you're not doing anything with the return value of the expression, the indirection doesn't really do anything.
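To see the two increments side by side, here is a small sketch that replays them on a buffer supplied by the caller. The function name replay_increments is hypothetical. It assumes the buffer is writable and holds at least two characters.

```c
/* Hypothetical helper: replays the question's two statements on a
 * caller-supplied WRITABLE buffer and returns the character that
 * verse points to afterwards. */
char replay_increments(char *verse)
{
    (void)*verse++;   /* parsed as *(verse++): only the pointer advances */
    (*verse)++;       /* increments the character now pointed to */
    return *verse;
}
```

With char verse1[] = "zappa"; the call replay_increments(verse1) returns 'b' and leaves "zbppa" in the array. Passing a string literal instead would be undefined behavior again.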
Question 1:
#include <stdio.h>
int main(void)
{
char *p="abcd";
printf("%c",*(p++));
return 0;
} // Here it will print a
Question 2:
#include <stdio.h>
int main(void)
{
char *p="abcd";
printf("%c",++*(p++));//why it is showing error over here
return 0;
} // Here it shows runtime error.
Can someone please explain to me why the statement ++*(p++) causes a runtime error.
char *p="abcd";
"abcd" is a string literal and string literals are unmodifiable in C. Attempting to modify a string literal invokes undefined behavior.
Use a modifiable array initialized by a string literal to fix your issue:
char p[] ="abcd";
String literals are read-only. Any attempt to modify them invokes undefined behavior.
In the second program you are modifying a string literal, which makes the behavior of the program undefined.
++*(p++) - As a statement, this is equivalent to ++*p; p++;
So first the character at the address stored in p is incremented by 1. Then p itself is incremented, so that it points to the second character of the string literal "abcd".
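The order of effects can be observed on a writable array. The following is a sketch only, and bump_then_advance is a hypothetical name. It evaluates the same expression through a pointer-to-pointer so that the caller can inspect both the buffer and the pointer afterwards.

```c
/* Hypothetical helper: evaluates ++*((*pp)++) on a WRITABLE buffer.
 * The character at the old position is incremented and returned,
 * and *pp ends up pointing at the next character. */
char bump_then_advance(char **pp)
{
    return ++*((*pp)++);
}
```

Starting from char buf[] = "abcd"; and char *p = buf; the call bump_then_advance(&p) returns 'b', the array becomes "bbcd", and p points at buf + 1.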
Now consider the two declarations below, assuming both are local variables.
char *p = "abcdef";
char p1[] = "abcdef";
For the first variable, p, storage for one pointer (typically 4 or 8 bytes) is allocated on the stack to hold the address of the string literal "abcdef". The literal itself takes 7 bytes (six characters plus the terminating null) and is usually placed in a read-only segment of the process image. The C standard does not say where the literal is stored; it only says that modifying it is undefined behavior.
For the second variable, p1, 7 bytes are allocated on the stack itself to store a copy of the string "abcdef". The stack is both readable and writable.
So ++*p (modifying the value at the address) is valid when applied to p1 but not when applied to p.
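The difference in storage shows up in sizeof. In this minimal sketch, both function names are hypothetical. The array holds seven bytes (six characters plus the terminating null), while the pointer holds only an address.

```c
#include <stddef.h>

/* Hypothetical helper: size of an array initialized from the literal. */
size_t array_storage_size(void)
{
    char p1[] = "abcdef";       /* array: a private, writable copy */
    return sizeof p1;           /* the characters plus the '\0' */
}

/* Hypothetical helper: size of a pointer to the literal. */
size_t pointer_storage_size(void)
{
    const char *p = "abcdef";   /* pointer: only an address is stored */
    return sizeof p;            /* size of a pointer, platform dependent */
}
```

The first function returns 7 on every conforming implementation. The second returns sizeof(char *), which is commonly 4 or 8.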
#include <stdio.h>
int main(void){
char *p = "Hello";
p = "Bye"; //Why is this valid C code? Why no dereferencing operator?
int *z;
int x;
*z = x;
*z = 2; //Works
z = 2; //Doesn't Work, Why does it work with characters?
char *str[2] = {"Hello","Good Bye"};
printf("%s", str[1]); //Prints Good-Bye. WHY no dereferencing operator?
// Why is this valid C code? If I created an array with pointers
// shouldn't the element print the memory address and not the string?
return 0;
}
My questions are outlined in the comments. In general I'm having trouble understanding character arrays and pointers. Specifically, why can I access them without the dereferencing operator?
In general I'm having trouble understanding character arrays and pointers.
This is very common for beginning C programmers. I had the same confusion back about 1985.
p = "Bye";
Since p is declared to be char*, p is simply a variable that contains a memory address of a char. The assignment above sets the value of p to be the address of the first char of the constant string "Bye", in other words the address of the letter "B".
z = 2
z is declared to be int*, so the only thing you can assign to it is the memory address of an int. You can't assign 2 to z, because 2 isn't the address of an int, it's a constant integer value.
printf("%s", str[1]);
In this case, str is defined to be an array of two char* variables. In your print statement, you're printing the second of those, which is the address of the first character in the string "Good Bye".
When you type "Bye", you are actually creating what is called a string literal. It's a special case, but essentially, when you do
p = "Bye";
What you are doing is assigning the address of this string literal to p (the string itself is stored by the compiler in an implementation-dependent way). Technically, it is the address of the first element of a char array, as Richard J. Ross III explains.
Since it is a special case, it does not work with other types.
By the way, some compilers can warn about lines like char *p = "Hello"; (GCC does so with -Wwrite-strings). It is safer to declare such pointers as const char *p = "Hello"; since modifying a string literal is undefined behavior.
As to the printing code.
printf("%s", str[1]);
This doesn't need a dereferencing operation, since %s requires a pointer (specifically a char *) to be passed; the dereferencing is done inside printf. You can test this by passing a plain value where printf expects a pointer. The behavior is then undefined, and you will most likely get a runtime crash when printf tries to dereference it.
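A short sketch of the same idea, using snprintf so that the result can be inspected: the helper name format_second is hypothetical, and the array contents are taken from the question.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats the second string of the array into out
 * and returns the number of characters the full string needs. */
int format_second(char *out, size_t size)
{
    const char *str[2] = {"Hello", "Good Bye"};
    /* str[1] is already a pointer to the 'G'; %s follows that pointer
     * and copies characters until it reaches the terminating '\0'. */
    return snprintf(out, size, "%s", str[1]);
}
```

With a buffer of 32 bytes, out receives "Good Bye". No * is written at the call site because %s wants the pointer itself.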
p = "Bye";
Is an assignment of the address of the literal to the pointer.
The
array[n]
operator is defined as a dereference of the pointer "array" increased by n, that is, *(array + n).
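The relation between indexing and pointer arithmetic can be checked with a few lines. This is only a sketch, and index_matches_pointer_arithmetic is a hypothetical name. The caller must pass an index that lies inside the array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: true when indexing and pointer arithmetic agree
 * on both the element value and the element address. */
bool index_matches_pointer_arithmetic(const int *array, size_t n)
{
    return array[n] == *(array + n) && &array[n] == array + n;
}
```

For int a[] = {10, 20, 30}; the function returns true for every valid index 0, 1 and 2.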
Remember that "Hello" and "Bye" are arrays of char that convert to char * in expressions; they are not single chars.
So the line p = "Bye"; makes the pointer p point to the first character of "Bye".
But in the next case with int *
*z=2 means that
`int` pointed by `z` is assigned a value of 2
while z = 2 would try to make the pointer z hold the value 2 as an address. But 2 is an int, not a pointer to int, so the compiler flags the assignment.
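For comparison, here is a minimal sketch of the int case done correctly. The function name store_through_pointer is hypothetical. Unlike the question's code, z is given a valid target before anything is written through it.

```c
/* Hypothetical helper: gives z a valid target before writing through it. */
int store_through_pointer(void)
{
    int x = 0;
    int *z = &x;   /* z holds the address of x */
    *z = 2;        /* writes 2 into the int that z points to, i.e. x */
    return x;
}
```

The function returns 2, because the assignment through z changed x.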
You're confusing something: It does work with characters just as it works with integers et cetera.
What it doesn't work with are strings, because they are character arrays and arrays can only be stored in a variable using the address of their first element.
Later on, you've created an array of character pointers, or an array of strings. That means very simply that the first element of that array is a string, the second is also a string. When it comes to the printing part, you're using the second element of the array. So, unsurprisingly, the second string is printed.
If you look at it this way, you'll see that the syntax is consistent.
I have two pointers to the same C string. If I increment the second pointer by one, and assign the value of the second pointer to that of the first, I expect the first character of the first string to be changed. For example:
#include <stdio.h>
int main() {
char* original_str = "ABC"; // Get pointer to "ABC"
char* off_by_one = original_str; // Duplicate pointer to "ABC"
off_by_one++; // Increment duplicate by one: now "BC"
*original_str = *off_by_one; // Set 1st char of one to 1st char of other
printf("%s\n", original_str); // Prints "ABC" (why not "BBC"?)
*original_str = *(off_by_one + 1); // Set 1st char of one to 2nd char of other
printf("%s\n", original_str); // Prints "ABC" (why not "CBC"?)
return 0;
}
This doesn't work. I'm sure I'm missing something obvious - I have very, very little experience with C.
Thanks for your help!
You are attempting to modify a string literal. String literals are not modifiable (i.e., they are read-only).
A program that attempts to modify a string literal exhibits undefined behavior: the program may be able to "successfully" modify the string literal, the program may crash (immediately or at a later time), a program may exhibit unusual and unexpected behavior, or anything else might happen. All bets are off when the behavior is undefined.
Your code declares original_str as a pointer to the string literal "ABC":
char* original_str = "ABC";
If you change this to:
char original_str[] = "ABC";
you should be good to go. This declares an array of char that is initialized with the contents of the string literal "ABC". The array is automatically given a size of four elements (at compile-time), because that is the size required to hold the string literal (including the null terminator).
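With the array in place, the assignments from the question behave as expected. The sketch below wraps them in a hypothetical helper, copy_into_first. It assumes the buffer is writable and that offset lies within the string.

```c
#include <stddef.h>

/* Hypothetical helper: copies the character at original_str[offset]
 * over the first character. The buffer must be writable. */
void copy_into_first(char *original_str, size_t offset)
{
    const char *off_by_n = original_str + offset;  /* like off_by_one */
    *original_str = *off_by_n;
}
```

Starting from char original_str[] = "ABC"; a call with offset 1 produces "BBC", and a further call with offset 2 produces "CBC".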
The problem is that you can't modify the literal "ABC", which is read only.
Try char original_str[] = "ABC", which uses an array to hold a copy of the string that you can modify.