printing with pointers and strings (C)

printing with pointers and strings (C) - c

can someone explain why this prints out : "d ce"
I think i understood why it prints out "d" but i don't understand the "ce"
#include <stdio.h>
int main(void){
char s1[]="abcd",*cp;
cp=s1+2;
printf("%c %s\n",(*(++cp))++,s1+2);
return 0;
}

TL;DR
In order to understand this code, you need rather in-depth C knowledge of sequence points, undefined behavior and misc "language-lawyer" stuff. If that isn't the case, then just simply forget about writing or understanding artificial crap code like this and you can stop reading here. Advanced explanation follows.
C11 6.5 states:
If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an unsequenced side
effect occurs in any of the orderings.
Generally this renders weird code like this undefined behavior. But in this case, 1) s is not a scalar but an aggregate. The scalar is rather the s[3] lvalue. And 2) the )++ increment that changes the value is not unsequenced to another value computation using the object inside the same expression.
(*(++cp))++ and s1+2 are unsequenced in relation to each other but that doesn't matter for the purpose of making a case for undefined behavior.
cp=s1+2; here cp is pointing at the 'c'.
This expression alone (*(++cp))++ first increases the pointer one step to point at s[3] that contains 'd'. Then it dereferences the pointer so that we get the value 'd'.
It then increments the value by 1 with the postfix ++, which on most symbol tables means 'e'.
The ++ change of the value is not sequenced in relation to the value computation that happens during s1+2 but it doesn't matter.
All arguments are evaluated before the function is called and there is a sequence point after the full evaluation of the argument list.
printf then spots %s and de-references the string, a value computation, but sequenced in relation to the previous )++. This prints ce.
And finally there is another sequence point before the function returns.
So the code is actually well-defined, even though horribly written.

Hope the inline comments makes it clear.
#include <stdio.h>
int main()
{
char s1[]="abcd",s2[]="cdef",s3[5],*cp;
printf("s1 = %p\n", (void*)s1); // prints start address of s1
cp = s1+2; // cp points to 2 chars advanced of s1 i.e 'c'
printf("cp = %p\n", (void*)cp); // prints address at 'c'
//first increment cp to point to next location, which now points at 'd' from 'c'
//the outer ++ increments 'd' to 'e' post-fix, so first prints 'd'
printf("(*(++cp))++ = %c\n", (*(++cp))++);
printf("s1 = %s\n", s1); // you will see "abce"
printf("s1+2 = %s\n", s1+2); // 2 chars advanced of s1, i.e "ce"
//printf("%c %s\n",(*(++cp))++,s1+2);
return 0;
}

Related

C language: find output of given program

So i have this main:
#define NUM 5
int main()
{
int a[NUM]={20,-90,450,-37,87};
int *p;
for (p=a; (char *)p < ((char *)a + sizeof(int) * NUM); ) //same meaning: for (p=a; p<a+NUM;)
*p++ = ++*p < 60 ? *p : 0; //same meaning: *(p++)=++(*p)<60?*p:0;
for(p=a; (char *)p < ((char *)a + sizeof(int) * NUM); )
printf("\n %d ", *p++);
return 0;
}
And i need to find what is the output.
So after try to understand without any idea i run it and this is the output:
21
-89
0
-36
0
So i will glad to explanation how to solve this kind of questions (i have exam soon and this type of questions probably i will see..)
EDIT:
at the beginning i want to understand what the first forstatement doing:
This jump 1 integer ? and what this going inside the block ?
And what is the different between *p++ and ++*p

The question is similar to Why are these constructs (using ++) undefined behavior in C? although not an exact duplicate due to the (subtle) sequence point inside the ?: operator.
There is no predictable output since the program contains undefined behavior.
While the sub-expression ++*p is sequenced in a well-defined way compared to *p because of the internal sequence point of the ?: operator, this is not true for the other combinations of sub-expressions. Most notably, the order of evaluation of the operands to = is not specified:
C11 6.5.15/3:
The evaluations of the operands are unsequenced.
*p++ is not sequenced in relation to ++*p. The order of evaluation of the sub-expressions is unspecified, and since there are multiple unsequenced side-effects on the same variable, the behavior is undefined.
Similarly, *p++ is not sequenced in relation to *p. This also leads to undefined behavior.
Summary: the code is broken and full of bugs. Anything can happen. Whoever gave you the assignment is incompetent.

at the beginning i want to understand what the first for statement doing
This is what one would call code obfuscation... The difficult part is obviously this one:
(char *)p < ((char *)a+sizeof(int)*NUM);
OK, we convert p to a pointer to char, then compare it to another pointer retrieved from array a that points to the first element past a: sizeof(int)*NUM is the size of the array - which we could have gotten much more easily by just having sizeof(a), so (char*)p < (char*)a + sizeof(a)
Be aware that comparing pointers other than with (in-)equality is undefined behaviour if the pointers do not point into the same array or one past the end of the latter (they do, in this example, though).
Typically, one would have this comparison as p < a + sizeof(a)/sizeof(*a) (or sizeof(a)/sizeof(a[0]), if you prefer).
*p++ increments the pointer and dereferences it afterwards, it is short for p = p + 1; *p = .... ++*p, on the other hand first dereferences the pointer and increments the value it is pointing to (note the difference to *++p, yet another variant - can you get it yourself?), i. e. it is equivalent to *p = *p + 1.
The entire line *p++ = ++*p<60 ? *p : 0; then shall do the following:
increment the value of *p
if the result is less than 60, use it, otherwise use 0
assign this to *p
increment p
However, this is undefined behaviour as there is no sequence point in between read and write access of p; you do not know if the left or the right side of the assignment is evaluated first, in the former case we would assign a[i] = ++a[i + 1], in the latter case, a[i] = ++a[i]! You might have gotten different output with another compiler!!!
However, these are only the two most likely outputs – actually, if falling into undefined behaviour, anything might happen, the compiler might just to ignore the piece of code in question, decide not to do anything at all (just exit from main right as the first instruction), the program might crash or it could even switch off the sun...
Be aware that one single location with undefined behaviour results in the whole program itself having undefined behaviour!

Short answer: because of this line
*p++ = ++*p<60 ? *p : 0;
it is impossible to say how the program behaves. When we access *p on the right-hand side, does it use the old or the new value of p, that is, before or after the p++ on the left-hand side gets to it? There is no rule in C to tell us. What there is instead is a rule that says that for this reason the code is undefined.
Unfortunately the person setting the question didn't understand this, thinks that "tricky" code line this is something to make a puzzle about, instead of something to be avoided at all costs.

The only way to really understand this kind of stuff (memory management, pointer behaviour, etc.) is to experiment yourself. Anyway, I smell someone is trying to seem clever fooling students, so I will try to clarify a few things.
int a[NUM]={20,-90,450,-37,87};
int *p;
This structure in memory would be something like:
This creates a vector of five int, so far, so good. The obvious move, given that data, is to run over the elements of a using p. You would do the following:
for(p = a; p < (a + NUM); ++p) {
printf("%d ", *p);
}
However, the first change to notice is that both loops convert the pointers to char. So, they would be:
for (p=a;(char *)p<((char *)a+sizeof(int)*NUM); ++p) {
printf("%d ", *p);
}
Instead of pointing to a with a pointer to int the code converts pto a pointer to char. Say your machine is a 32bit one. Then an int will probably occupy four bytes. With p being a pointer to int, when you do ++p then you effectively go to the next element in a, since transparently your compiler will jump four bytes. If you convert the int pointer to a char instead, then you cannot add NUM and assume that you are the end of the array anymore: a char is stored in one byte, so ((char *)p) + 5 will point to the second byte in the second element of a, provided it was pointing at the beginning of a before. That is way you have to call sizeof(int) and multiply it by NUM, in order to get the end of the array.
And finally, the infamous *p++ = ++*p<60 ? *p : 0;. This is something unfair to face students with, since as others have already pointed out, the behaviour of that code is undefined. Lets go expression by expression.
++*p means "access p and add 1 to the result. If p is pointing to the first position of a, then the result would be 21. ++*pnot only returns 21, but also stored 21 in memory in the place where you had 20. If you were only to return 21, you would write; *p + 1.
++*p<60 ? *p : 0 means "if the result of permanently adding 1 to the value pointed by p is less than 60, then return that result, otherwise return 0.
*p++ = x means "store the value of x in the memory address pointed by p, and then increment p. That's why you don't find ++p or p++ in the increment part of the for loop.
Now about the whole instruction (*p++ = ++*p<60 ? *p : 0;), it is undefined behaviour (check #Lundin's answer for more details). In summary, the most obvious problem is that you don't know which part (the left or the right one), around the assignment operator, is going to be evaluated first. You don't even know how the subexpressions in the expression at the right of the assignment operator are going to be evaluated (which order).
How could you fix this? It would be actually be very simple:
for (p=a;(char *)p<((char *)a+sizeof(int)*NUM); ++p) {
*p = (*p + 1) <60 ? (*p + 1) : 0;
}
And much more readable. Even better:
for (p = a; p < (a + NUM); ++p) {
*p = (*p + 1) <60 ? (*p + 1) : 0;
}
Hope this helps.

Why This Code Prints Second statment?

Why This code fail in first If statement?
My prediction getting wrong as per associations and precedence.
#include<stdio.h>
void main()
{
int i=10;
if(i==i--)
{
printf("In 1:%d\n",i);
printf("TRUE 1\n");
}
i=10;
if(i==--i)
{
printf("In 2:%d\n",i);
printf("TRUE 2\n");
}
}

i==i-- is undefined behaviour. Please check this: http://c-faq.com/expr/ieqiplusplus.html and this: http://c-faq.com/expr/seqpoints.html

The expression i==i-- will cause undefined behavior because there is no sequence point between the two evaluations of i and i--. This means that anything can happen and at that point the program no longer produces meaningful output.
The same is true for the expression i==--i
If an object is read and also modified without a sequence point separating the two events the behavior is undefined1. In this case the same object is modified (side effect): i-- and read (value computation): i, without a sequence point.
Correct code would separate the two expressions with a sequence point (character ;):
const int i1 = i;
const int i2 = i--;
if( i1 == i2 )
{
//...
}
const int i3 = i;
const int i4 = --i;
if( i3 == i4 )
{
//...
}
1 (Quoted from ISO/IEC 9899:201x 6.5 Expressions 2):
If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined.

Undefined behaviour because post and pre increment depend on compiler. Please see stack overflow question here.
C99 standard - 6.5 Expressions, §2
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be read only to determine the value
to be stored.

There would be no compilation error. It does not go inside the first if statement as it is undefined behavior (as user babon mentioned).
The behavior actually depends on the compiler you're using.

I did not quite get your question. Are you asking "why does statements in only first if gets executed"? If yes, following is the reason
i-- evaluates the expression and then decrements
--i decrements first and then evaluate expression. So, second condition evaluates to 'false' (10 == 9)

Please explain the output of following C code

Code :
#include<stdio.h>
#include<stdlib.h>
int arr[] = {1, 2, 3, 4};
static int count = 0;
int incr( ) {
++count;
++count;
return count;
}
int main(void)
{
printf("\ncount= %d \n",count);
int i;
arr[count++]=incr( );
for(i=0;i<4;i++)
printf("arr[%d]=%d\n", i,arr[i]);
printf("\nIncremented count= %d \n",count);
return 0;
}
Output
count = 0
arr[0]=2
arr[1]=2
arr[2]=3
arr[3]=4
Incremented count = 1
The final incremented value of global variable count is 1 even though it has been incremented thrice.
When count++ is replaced by count in arr[count++]=incr( ) the final incremented value of count is 2.

This is undefined behaviour from bad sequencing. On this line:
arr[count++]=incr( );
What happens (with your compiler) is:
arr[count] is resolved to arr[0], postfix ++ will be applied at the end of the
statement;
incr() is called, count is now equal to 2, incr() returns 2;
arr[0] gets assigned 2;
postfix ++'s side effect kicks in, and count is now equal to 1. Previous changes to count are lost.
You will find more info on "side effects" and "sequence points" by googling their real name :)

To understand why your code goes wrong, you must first understand undefined behavior and sequence points, which is a rather advanced topic. You also need to understand what undefined behavior is, and what unspecified behavior is, explained here.
If you do something to a variable which counts as a side-effect, such as modifying it, then you are not allowed to access that variable again before the next sequence point, for other purposes than to calculate which value to store in your variable.
For example i = i++ is undefined behavior because there are two side effects on the same variable with no sequence point in between. But i = i+1; is well-defined, because there is only one side effect (the assignment) and the i+1 is only a read access to determine what value to store.
In your case, there is no sequence point between the arr[count++] sub-expression and the incr() sub-expression, so you get undefined behavior.
This is how sequence points appear in functions, C11 6.5.2.2:
There is a sequence point after the evaluations of the function
designator and the actual arguments but before the actual call. Every
evaluation in the calling function (including other function calls)
that is not otherwise specifically sequenced before or after the
execution of the body of the called function is indeterminately
sequenced with respect to the execution of the called function.
This means that the contents of the function aren't sequenced in relation to the rest of the expression. So you are essentially writing an expression identical to arr[count++] = ++count;, except through the function you managed to squeeze in two unsequenced ++count on the right side of the operation, which wouldn't otherwise be possible. Any any rate, it is undefined behavior.
Fix your code by enforcing sequence points between the left hand and the right hand of the expression. However, the order of evaluation of sub-expressions is unspecified behavior, so you need to ensure that your code is safe no matter if the left or right side is evaluated first. This code will fix the problems:
// artificial example, don't write code like this
0,arr[count++] = 0,incr();
since the comma operator introduces a sequence point. But of course, writing nonsense code like that isn't something you should be doing. The real solution is to never use ++ together with other operators in the same expression.
// good code, write code like this
arr[count] = incr();
count++;

How a discarded value can be used later in the program

About the expression statement(an example)
i = 1;
it is said that after assigning 1 to i the value of entire expression is being discarded. If the value is discarded then how this can be used later in the program,for example
printf("%d",i);
?
I know this is very basic question but I am really confused with discarded.

The value of the expression is indeed discarded, but this expression has a side effect - it changes the value of i. So next time you will access this variable, you will read the new value, which is 1.
The term "discarded" is more helpful when you do things like foo(5); or even simply "hello";. Since the expression "hello" does not have any side effect, and its value is dicarded, it is does absolutely nothing. When a compiler encounters it, as a stand alone statement:
"hello";
It may simply ignore it altogether, as if it does not exist at all. This is what happens when you call functions, or use operators:
4+5;
sin(2.6);
These expressions, too, have no side effect, and their values are ignored. When you do something like
printf("hello");
This is an expression, too. Its value is the total number of characters written. This value is ignored. But the expression must not be comletely ignored, since it has an important side effect: it prints these characters to the standard output.
So let's build a function instead of using the assignment operator (since C has no references, we'll use pointers):
int assign_int(int* var, int value) {
*var = value;
return *var;
}
now, back to your example, you do something like:
assign_int(&i, 1);
the value returned from assign_int is discarded. Just like in the printf() case. But since the function assign_int has a side effect (changing the value of i), it is not ignored by the compiler.

The important point is the i = 1 has two properties.
It changes the value stored in the variable i to be 1
It is an expression and has a value (which is also 1);
That second part is interesting is a case like
if ( (i=1) == 2 ) { // ...
or
y = 3 + (i = 1); // assign 4 to y
The line
the value of entire expression is being discarded.
refers to the value of the expression (my #2), but does not affect assignment to variable i (my #1).

Why doesn't *(ptr++) give the next item in the array?

int my_array[] = {1,23,17,4,-5,100};
int *ptr;
int i;
ptr = &my_array[0]; /* point our pointer to the first
element of the array */
printf("\n\nptr = %d\n\n", *ptr);
for (i = 0; i < 6; i++)
{
printf("my_array[%d] = %d ",i,my_array[i]); /*<-- A */
printf("my_array[%d] = %d\n",i, *(ptr++)); /*<-- B */
}
Why does this display the same thing for both line a and b? It just displays all of the values in my_array in order (1, 23, 17, 4, -5, 100). Why does the '++' in line B not point ptr to the next element of the array before it is dereferenced? Even if you change that line to
printf("ptr + %d = %d\n",i, *ptr++); /*<-- B */
the output is the same. Why is this?

ptr++ increments ptr but returns the original value
++ptr increments and returns the new value
Hence the joke about c++ - it's one more than c but you use the original value = c

It seems you are puzzled by the fact that the parenthesis do not change the value returned as you expectd.
Maybe it would be clearer to you if you think it in English:
p++ means take the value of p, increment the value of p, return the initial value of p
so, *p++ would dereference the original value of p.
Considering that the value of (x) is the same as x, the value of (p++) is the same as p++.
Hence, *(p++) will dereference p, exactly as *p++ does.

It is evident from the naming post-increment and pre-increment. Meaning, the variable is incremented post the operation or before the operation.
A post-increment operator creates a temporary variable to store the current value and increments the variable (but returns the temporary variable with current value). In pre-increment operator, there is no temporary variable. The same variable is incremented and returned.
So using post-increment operator in the same statement, means using the current value of the variable and incrementing after this statement. Whereas post-increment operator means incrementing the variable and using it in the current statement.

In C there is a difference between post incrementing p++ and preincrementing ++p
p++ : uses the current value of p and then updates it
++P: updates the value of p and then uses it
hence your code should use ++ptr

There are 2 types of operators : Postfix and Prefix .
*ptr++ is postfix operator means first use and then increase
while ++ptr means prefix operator means first increase and then use.
if you add another printf with printing the value of just *ptr in your existing code you will notice the difference how the things go about.

To avoid this whole issue, write either of these alternatives:
++ptr;
printf("my_array[%d] = %d\n",i, *ptr);
or
printf("my_array[%d] = %d\n",i, *ptr);
++ptr;
This will yield the same number of instructions, but with the following major advantages:
Is now more readable and understandable.
If ptr would be used several times in the printf() statement, you need not worry about the order of evaluation of function parameters or operands, which are unspecified in the C language (with a few rare exceptions). Had you writted printf("%d %d", *++ptr, *++ptr); you can't know the result, as the code would then rely on order of evaluation, i.e. it contains a possibly severe bug.