Evaluation of expression in printf function [closed] - c

Please have a look at the code below:
#include<stdio.h>
int main(void){
int *ptr,a,b;
a = ptr;
b = ptr + 1;
printf("the value of a,b is %d and %d respectively\n",a,b);
printf("the value of a is %d \n",(ptr));
printf("the value of b is %d \n",(ptr+1));
printf("the value of (ptr+1)-ptr is %d \n",(ptr+1)-ptr);
return 0;
}
Output:
the value of a,b is 0 and 4 respectively
the value of a is 0
the value of b is 4
the value of (ptr+1)-ptr is 1
I am not able to understand why the value of (ptr+1)-ptr is 1 and not 4 (as in 4-0). Is it due to a compiler optimization?

First of all, what you did is wrong: ptr is never initialized, and pointer arithmetic is valid only if the operands point to elements of the same array object or one past its last element. So this is undefined behavior.
In your case, the subtraction returns how far one pointer is from the other in units of the pointed-to type, which is int here. With sizeof(int) being 4 bytes, it returned 1, denoting a separation of one int (the same as 4 bytes, as can be inferred from the values of a and b).
From §6.5.6¶9 C11 standard (this points out both the points mentioned above)
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements...
You are also printing addresses with the %d format specifier. You should use the %p format specifier with the argument cast to void*. And never think of dereferencing these pointers, as doing so would be undefined behavior.
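A minimal sketch of the well-defined version of this experiment, assuming a real two-element array so that ptr + 1 is a valid pointer. The names element_gap and print_gap are made up for illustration and are not from the question.

```c
#include <stddef.h>
#include <stdio.h>

/* Number of int elements between two pointers into the same array. */
ptrdiff_t element_gap(const int *from, const int *to)
{
    return to - from;   /* defined only because both point into one array */
}

/* Prints two addresses and their distance with matching format specifiers. */
void print_gap(void)
{
    int arr[2] = {0, 0};          /* a real array, so ptr + 1 is valid */
    int *ptr = arr;

    printf("ptr     = %p\n", (void *)ptr);        /* %p needs a void * */
    printf("ptr + 1 = %p\n", (void *)(ptr + 1));
    printf("(ptr + 1) - ptr = %td\n", element_gap(ptr, ptr + 1)); /* %td for ptrdiff_t */
}
```

Calling print_gap() shows two implementation-specific addresses, followed by a difference of 1.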

I am not able to understand why the value of (ptr+1)-ptr is 1 not 4
No, pointer arithmetic in C happens in units related to the pointed type (not in bytes). Read some C programming book.
So (ptr+1) - ptr is 1, but if you cast each pointer to a char*, e.g. by coding (char*)(ptr+1) - (char*)ptr, you'll probably get 4 (assuming sizeof(int) is 4).
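The char* cast can be sketched as below, assuming p points to an actual int object so that p + 1 is a valid one-past-the-end pointer. byte_gap is a hypothetical helper name.

```c
#include <stddef.h>

/* Distance between p and p + 1 measured in bytes rather than in ints.
   p must point to a real int object (or array element). */
ptrdiff_t byte_gap(const int *p)
{
    return (const char *)(p + 1) - (const char *)p;
}
```

The result equals sizeof(int), which is 4 on many platforms, although the standard does not require that.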
Don't forget to enable all warnings and debug info when compiling: if using GCC, compile with gcc -Wall -Wextra -g, then improve your code until you get no warnings at all. Read more about undefined behavior (your code has some) and be scared of it. Use the gdb debugger (or whatever other debugger you can use). Read the documentation of every standard function that you are using (and of every external function from some other library), notably that of printf.
Later, consider also reading pieces of the C11 standard, e.g. n1570 (a draft which is actually the standard text).
Care about portability (to different computers or compilers).
By definition, compiler optimization should not change the observed behavior and semantics of your program.

Related

a is a double, printf("%d", a); works differently in IA32 and IA32-64 [closed]

Why does the following code work totally differently on IA-32 and x86-64?
#include <stdio.h>
int main() {
double a = 10;
printf("a = %d\n", a);
return 0;
}
On IA-32, the result is always 0.
However, on x86-64 the result can be anything between INT_MIN and INT_MAX.
%d actually is used for printing int. Historically the d stood for "decimal", to contrast with o for octal and x for hexadecimal.
For printing double you should use %e, %f or %g.
Using the wrong format specifier causes undefined behaviour which means anything may happen, including unexpected results.
Passing an argument to printf() that doesn't match the format specifiers in the format string is undefined behaviour... and with undefined behaviour, anything could happen and the results aren't necessarily consistent from one instance to another -- so it should be avoided.
As for why you see a difference between x86 (32-bit) and x86-64, it's likely because of differences in the way parameters are passed in each case.
In the x86 case, the arguments to printf() are likely being passed on the stack, aligned on 4-byte boundaries -- so when printf() processes the %d specifier it reads a 4-byte int from the stack, which is actually the lower 4 bytes from a. Since a was 10 those bytes have no bits set, so they're interpreted as an int value of 0.
In the x86-64 case, the arguments to printf() are all passed in registers (though some would be on the stack if there were enough of them)... but double arguments are passed in different registers than int arguments (such as %xmm0 instead of %rsi). So when printf() tries to process an int argument to match the %d specifier, it takes it from a different register than the one a was passed in, uses whatever garbage value was left in that register instead of the lower bytes of a, and interprets it as some garbage int value.
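The claim that the lower 4 bytes of 10.0 have no bits set can be checked without undefined behaviour by copying the representation into an integer. This is a sketch that assumes a 64-bit IEEE-754 double, which holds on x86 but is not guaranteed by the standard. low_half and print_double are illustrative names.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t), "assumes a 64-bit double");

/* Returns the low-order 32 bits of the representation of d. */
uint32_t low_half(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);        /* well-defined way to read the bytes */
    return (uint32_t)(bits & 0xFFFFFFFFu);
}

/* The correct way to print the value itself: %f matches a double argument. */
void print_double(double a)
{
    printf("a = %f\n", a);
}
```

Under that assumption, 10.0 is stored as 0x4024000000000000, so low_half(10.0) is 0, which matches the 0 observed on IA-32.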

Values the same but not equal [closed]

Can somebody explain to me why the following code outputs 0 0? I was under the impression that mathematically a-b == 0 ⇒ a == b.
char* V1 = "hello, world!\n";
main(){
F1(V1);
}
F1(A1){
printf("%u %u\n", V1 - A1, V1 == A1);
}
As per the C11 standard, chapter §6.5.6, two pointers can only be subtracted if they point into the same array object.
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header.
If the result is not representable in an object of that type, the behavior is undefined. [...]
That said, you should be using %td for printing a result of type ptrdiff_t, generated from pointer subtraction.
Next, coming to the usage of == operator, it cannot be used to compare the contents of the string (array), it basically compares the pointers themselves (not the content they point to).
As @SouravGhosh pointed out, there is undefined behavior here. Still, it is interesting to understand why the second printed value is sometimes 0 and sometimes 1 (depending on the machine it runs on).
The absence of an explicit type for the parameter of F1 means that A1 is implicitly an int. The call F1(V1) thus converts the pointer V1 to an int. On some machines this is a narrowing conversion, which is why on some machines the comparison V1 == A1 is true while on others it is false. Perhaps on some machines both operands of V1 - A1 are converted to int (hence the value 0), while the comparison V1 == A1 (which is UB) tries to convert A1 to a valid address and either fails or yields a different address.
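For comparison, here is a sketch of the same experiment with explicit types and matching format specifiers. Returning the comparison result from F1 is an addition made here so the outcome can be inspected; it is not in the original code.

```c
#include <stddef.h>
#include <stdio.h>

const char *V1 = "hello, world!\n";

/* Prints the pointer difference and the equality test, and returns the latter. */
int F1(const char *A1)
{
    ptrdiff_t diff = V1 - A1;   /* valid when A1 points into the same string */
    int same = (V1 == A1);

    printf("%td %d\n", diff, same);   /* %td for ptrdiff_t, %d for int */
    return same;
}
```

With the parameter declared as a pointer, F1(V1) prints 0 1, in line with a-b == 0 ⇒ a == b.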

C - Why this strange output in printf() [closed]

I am confused with the strange output of the following c program.
I am using TurboC and DevC compiler
I will be really pleased if someone will help me out in this.
Program
#include<stdio.h>
#include<conio.h>
int main()
{
clrscr();
printf("%d","hb");
printf("%d","abcde"-"abcde");
//Output is -6 why ?
return 0;
}
Outputs
For TurboC
printf("%d","hb");
//Output is 173 Why ?
// No matter what I write in place of "hb" the output is always 173
printf("%d","abcde"-"abcde");
//Output is -6 why ?
For Dev C
printf("%d","hb");
//Output is 4210688 Why ?
// No matter what I write in place of "hb" the output is always 4210688
printf("%d","abcde"-"abcde");
//Output is 0 why ?
Here, you are passing the memory address of a string literal (a char*):
printf("%d","hb");
However, the specifier which should be used is %p (standing for pointer):
printf("%p\n", "hb"); // The output is in hexadecimal
This will ensure that the same representation size is used by printf when displaying it as for when it was passed to printf. Using %d (int specifier) will result in undefined behaviour when sizeof(int) is not the same as sizeof(char*), and even if the sizes would be equal, using %d may result in having negative values printed (if the most significant bit is set - the sign bit of an int).
As for any memory address, you can't expect it to be the same after the program was recompiled, and even less when using different toolchains.
When the output was the same after changing the "hb" literal with another one, it means that it was allocated at the same address.
Here, two pointers to string literals are subtracted:
printf("%d","abcde"-"abcde");
The result of subtracting two pointers is the number of elements of that type between the addresses pointed by them. But please note, the behaviour is only defined when the pointers point to elements from the same array, or to the first element just after the end of the array.
Again, %d may not be the right specifier to be used. An integer type with its size at least equal to the pointer type may be used, maybe long long int (this should be checked against the specific platform). A subtraction overflow may still happen, or the result may not fit into the cast type, and then the behaviour is again undefined.
char *p1, *p2; // These should be initialized and NOT point to different arrays
printf("%lld\n", (long long int)(p1 - p2));
Also note, C standard library provides stddef.h, which defines the ptrdiff_t type used to store a pointer difference. See this: C: Which character should be used for ptrdiff_t in printf?
Note: As there are two different char arrays, the pointer subtraction is undefined, and therefore information below is only based on assumptions, and presented only because the OP mentioned that this was an exam question.
In our case, as sizeof(char) is 1, the result represents exactly the difference in bytes. The difference of -6 tells us that the two identical literals "abcde" were placed next to each other in memory, the first one immediately before the second. The literal includes the string terminator, so its size is 6.
The other thing that can be deduced from here is that the compiler used by DevC++ was "smarter" (or had other optimization options passed to it) and created a single copy of the "abcde" literal in memory, hence the difference of 0.
A string literal is usually placed in the read-only memory, and the program should not try to modify it, so if the compiler (or the linker in some cases) can "detect" a duplicate, it may reuse the previous literal instead.
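Whether a compiler merges identical literals can be probed with ==, which (unlike subtraction) is defined for pointers to different objects. The sketch below uses made-up function names. The pooling result is unspecified by the standard and may change with compiler and options.

```c
#include <stddef.h>

/* Returns 1 if the two identical literals share storage, 0 otherwise.
   Either answer is allowed: the standard leaves this unspecified. */
int literals_are_pooled(void)
{
    const char *first = "abcde";
    const char *second = "abcde";
    return first == second;
}

/* Size of the literal's array, including the terminating '\0'. */
size_t literal_size(void)
{
    return sizeof "abcde";
}
```

literal_size() is always 6, which is consistent with the -6 seen on TurboC if the two copies sit back to back.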

C pointer addition and subtraction in sect. 6.5.6

I am trying to understand paragraph 8 and 9 of C99 sect 6.5.6 (Additive operators)
Does para 8 mean:
int a [4];
int *p = a;
p --; /* undefined behaviour */
p = a + 4; /* okay */
p --; /* okay */
p += 2; /* undefined behaviour */
p = a;
p += 5 - 5; /* okay */
p = p + 5 - 5; /* undefined behaviour */
For paragraph 9, my understanding had been that ptrdiff_t is always large enough to hold the difference of 2 pointers. But the wording:
'provided the value fits in an object of type ptrdiff_t' seems to suggest this understanding is wrong. Is my understanding wrong, or did C99 mean something else?
You can find a link to the draft standards here:
http://cboard.cprogramming.com/c-programming/84349-c-draft-standards.html
I don't think your interpretation is correct. In the version I have (n1256) paragraph 9 states:
If the result is not representable in an object of that type, the
behavior is undefined
That is it. If the difference is larger than PTRDIFF_MAX or smaller than PTRDIFF_MIN, the behavior is undefined.
Notice that this places the burden on the programmer to check whether the difference fits in ptrdiff_t. A "lazy" platform implementation could just choose a narrow type for ptrdiff_t and leave you dealing with that.
Checking for that would not be straightforward, since you can't do the subtraction without provoking UB. You'd have to use the information that the two pointers point inside (or just beyond) the same object, and where the boundaries of that surrounding object are.
I agree with your understanding of paragraph 8. The standard says
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
It seems that C assumes that there is no pointer overflow inside an array, so you can increment/decrement pointers while you stay inside the array. If the result pointer is leaving the array, an overflow might occur and behaviour is undefined.
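The one-past-the-end rule is what makes the usual pointer loop legal. Here is a minimal sketch, with sum_ints as a hypothetical helper: the end pointer may be formed and compared, but is never dereferenced.

```c
#include <stddef.h>

/* Sums n ints using a one-past-the-end pointer as the loop bound. */
long sum_ints(const int *a, size_t n)
{
    const int *end = a + n;   /* one past the last element: valid to form */
    long total = 0;

    for (const int *p = a; p != end; ++p)
        total += *p;          /* p is strictly inside the array here */

    return total;
}
```

Forming a + n + 1, or a - 1, would already be undefined, even if the pointer were never dereferenced.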
Regarding paragraph 9, I guess the standard takes into account that you might, for example, have an architecture that gives you 32-bit pointers and 32-bit data types. Since the difference of two 32-bit pointers is in fact a sign plus 32 bits (so 33 bits), not every pointer difference might fit into a 32-bit ptrdiff_t. With a two's complement architecture this is not a problem, but it might be a problem on other architectures.

What are the common undefined/unspecified behavior for C that you run into? [closed]

An example of unspecified behavior in the C language is the order of evaluation of arguments to a function. It might be left to right or right to left, you just don't know. This would affect how foo(c++, c) or foo(++c, c) gets evaluated.
What other unspecified behavior is there that can surprise the unaware programmer?
A language lawyer question. Hmkay.
My personal top3:
violating the strict aliasing rule
violating the strict aliasing rule
violating the strict aliasing rule
:-)
Edit Here is a little example that does it wrong twice:
(assume 32 bit ints and little endian)
float funky_float_abs (float a)
{
unsigned int temp = *(unsigned int *)&a;
temp &= 0x7fffffff;
return *(float *)&temp;
}
That code tries to get the absolute value of a float by bit-twiddling with the sign bit directly in the representation of a float.
However, accessing an object through a pointer obtained by casting from one type to another, incompatible type is not valid C. The compiler may assume that pointers to different types don't point to the same chunk of memory. This is true for all kinds of pointers except void* and char* (signedness does not matter).
In the case above I do that twice. Once to get an int-alias for the float a, and once to convert the value back to float.
There are three valid ways to do the same.
Use a char or void pointer during the cast. These always alias to anything, so they are safe.
float funky_float_abs (float a)
{
float temp_float = a;
// valid, because it's a char pointer. These are special.
unsigned char * temp = (unsigned char *)&temp_float;
temp[3] &= 0x7f;
return temp_float;
}
Use memcpy. memcpy takes void pointers and copies the bytes, so no aliasing rule is violated. It is declared in <string.h>.
#include <string.h>
float funky_float_abs (float a)
{
int i;
float result;
memcpy (&i, &a, sizeof (int));
i &= 0x7fffffff;
memcpy (&result, &i, sizeof (int));
return result;
}
The third valid way: use unions. This is explicitly not undefined since C99:
float funky_float_abs (float a)
{
union
{
unsigned int i;
float f;
} cast_helper;
cast_helper.f = a;
cast_helper.i &= 0x7fffffff;
return cast_helper.f;
}
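As a variation on the memcpy approach, here is a sketch that uses a fixed-width integer and a compile-time size check, so the 32-bit assumption is stated in the code. It still assumes an IEEE-754 style float with the sign in the top bit. float_abs_bits is an illustrative name.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Clears the sign bit of a by working on a copy of its bytes. */
float float_abs_bits(float a)
{
    uint32_t bits;

    memcpy(&bits, &a, sizeof bits);   /* no aliasing violation */
    bits &= UINT32_C(0x7fffffff);     /* clear the sign bit */
    memcpy(&a, &bits, sizeof a);
    return a;
}
```

In real code, fabsf from <math.h> does the same job portably; the bit-twiddling version is only useful as an illustration of legal type punning.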
My personal favourite undefined behaviour is that if a non-empty source file doesn't end in a newline, behaviour is undefined.
I suspect it's true though that no compiler I will ever see has treated a source file differently according to whether or not it is newline terminated, other than to emit a warning. So it's not really something that will surprise unaware programmers, other than that they might be surprised by the warning.
So for genuine portability issues (which mostly are implementation-dependent rather than unspecified or undefined, but I think that falls into the spirit of the question):
char is not necessarily (un)signed.
int can be any size from 16 bits.
floats are not necessarily IEEE-formatted or conformant.
integer types are not necessarily two's complement, and integer arithmetic overflow causes undefined behaviour (modern hardware won't crash, but some compiler optimizations will result in behavior different from wraparound even though that's what the hardware does. For example if (x+1 < x) may be optimized as always false when x has signed type: see -fstrict-overflow option in GCC).
"/", "." and ".." in a #include have no defined meaning and can be treated differently by different compilers (this does actually vary, and if it goes wrong it will ruin your day).
Really serious ones that can be surprising even on the platform you developed on, because behaviour is only partially undefined / unspecified:
POSIX threading and the ANSI memory model. Concurrent access to memory is not as well defined as novices think. volatile doesn't do what novices think. Order of memory accesses is not as well defined as novices think. Accesses can be moved across memory barriers in certain directions. Memory cache coherency is not required.
Profiling code is not as easy as you think. If your test loop has no effect, the compiler can remove part or all of it. inline has no defined effect.
And, as I think Nils mentioned in passing:
VIOLATING THE STRICT ALIASING RULE.
My favorite is this:
// what does this do?
x = x++;
To answer some comments, it is undefined behaviour according to the standard. Seeing this, the compiler is allowed to do anything up to and including format your hard drive.
The point is not whether some behaviour seems like a reasonable expectation. Because of the way the C standard defines sequence points, this line of code is actually undefined behaviour.
For example, if we had x = 1 before the line above, then what would the valid result be afterwards? Someone commented that it should be
x is incremented by 1
so we should see x == 2 afterwards. However, this is not actually true: you will find some compilers that have x == 1 afterwards, or maybe even x == 3. You would have to look closely at the generated assembly to see why this might be, but the differences are due to the underlying problem. Essentially, I think this is because the compiler is allowed to apply the two modifications of x (the increment and the assignment) in any order it likes, so it could do the x++ first, or the x = first.
Dividing something by a pointer to something. Just won't compile for some reason... :-)
result = x/*y;
Another issue I encountered (which is defined, but definitely unexpected).
char is evil.
signed or unsigned depending on what the compiler feels
not mandated as 8 bits
I can't count the number of times I've corrected printf format specifiers to match their argument. Any mismatch is undefined behavior.
No, you must not pass an int (or long) to %x - an unsigned int is required
No, you must not pass an unsigned int to %d - an int is required
No, you must not pass a size_t to %u or %d - use %zu
No, you must not print a pointer with %d or %x - use %p and cast to a void *
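The rules in the list above can be sketched as follows. format_values writes into a caller-supplied buffer with snprintf so the result can be inspected. Both function names are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats one value of each kind with its matching conversion specifier.
   Returns the number of characters snprintf would have written. */
int format_values(char *buf, size_t size)
{
    unsigned int mask = 255u;   /* %x requires unsigned int */
    int count = -3;             /* %d requires int */
    size_t length = 42;         /* %zu requires size_t */

    return snprintf(buf, size, "%x %d %zu", mask, count, length);
}

/* Pointers go through %p, cast to void *; the output format is
   implementation-defined. */
void print_address(int *p)
{
    printf("%p\n", (void *)p);
}
```

With a large enough buffer, format_values produces the string "ff -3 42".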
I've seen a lot of relatively inexperienced programmers bitten by multi-character constants.
This:
"x"
is a string literal (which is of type char[2] and decays to char* in most contexts).
This:
'x'
is an ordinary character constant (which, for historical reasons, is of type int).
This:
'xy'
is also a perfectly legal character constant, but its value (which is still of type int) is implementation-defined. It's a nearly useless language feature that serves mostly to cause confusion.
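The type difference between the first two forms shows up in sizeof. This is a small sketch that is valid for C only, since in C++ 'x' has type char. The function names are hypothetical.

```c
#include <stddef.h>

/* "x" is a char[2]: the letter plus the terminating '\0'. */
size_t size_of_string_literal(void)
{
    return sizeof "x";
}

/* 'x' has type int in C, so this is sizeof(int), not 1. */
size_t size_of_char_constant(void)
{
    return sizeof 'x';
}
```

The multi-character form 'xy' is left out on purpose: it compiles, but GCC warns about it and its value differs between implementations.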
A compiler doesn't have to tell you that you're calling a function with the wrong number of parameters/wrong parameter types if the function prototype isn't available.
The clang developers posted some great examples a while back, in a post every C programmer should read. Some interesting ones not mentioned before:
Signed integer overflow - no it's not ok to wrap a signed variable past its max.
Dereferencing a NULL Pointer - yes this is undefined, and might be ignored, see part 2 of the link.
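Because signed overflow is undefined, a test like x + 1 < x cannot be relied on. The check has to happen before the addition. Here is a minimal sketch, with add_would_overflow as a hypothetical helper:

```c
#include <limits.h>

/* Returns 1 if a + b would overflow int, without performing the addition. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow for b > 0 */
    if (b < 0)
        return a < INT_MIN - b;   /* INT_MIN - b cannot overflow for b < 0 */
    return 0;
}
```

GCC and Clang also provide __builtin_add_overflow for the same purpose, which is a compiler extension rather than standard C.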
The EE's here just discovered that a>>-2 is a bit fraught.
I nodded and told them it was not natural.
Be sure to always initialize your variables before you use them! When I had just started with C, that caused me a number of headaches.
