Is this Integer Promotion? How does it work? - c

I was just experimenting and I tried out two printf()s.
unsigned char a = 1;
a = ~a;
printf("----> %x %x %x %x", ~a, a, ~a, ++a);
This one gave the output
----> ffffff00 ff ffffff00 ff
Next one was
unsigned char a = 1;
printf("----> %x %x %x %x", ~a, a, ~a, ++a);
This one gave the output
----> fffffffd 2 fffffffd 2
Now, I know what '++' does and '~' does. I also know that the sequence of operation inside printf is from the right.
But could someone explain the difference in the number of bytes printed? A total explanation of output would be helpful of course, but I am more interested in the number of bytes and the difference in both cases [especially in the printf a and ~a parts].
EDIT:
OK, looks like the ++ part and my mistake of "I also know that the sequence of operation inside printf is from the right" has prompted every post other than the answer I was hopefully looking for. So maybe the way I asked was wrong.
I will try again,
unsigned char a = ~1;
a = ~a;
printf("----> %x", a);
OUTPUT: ----> 1
unsigned char a = ~1;
printf("----> %x", ~a);
OUTPUT: ----> ffffff01
Why this difference?

printf("----> %x %x %x %x", ~a, a, ~a, ++a); actually invokes undefined behavior because you have a side effect on a and other expressions depending on the same lvalue. So anything can happen and it is hopeless to try and explain the output produced.
Assuming 32 bit ints in 2's complement representation, if you wrote
printf("----> %x %x %x %x", ~a, a, ~a, a + 1);
You would get different and less surprising output:
ffffff01 fe ffffff01 ff
Let me explain what is going on:
a = ~a;
a contains 1, is converted to an int with the same value, the ~ operator applied to 1 computes to -2, converting that back to unsigned char gives 254 or 0xfe.
The arguments to printf are then computed as follows:
~a: 0xfe is converted to int and all bits are complemented, yielding 0xffffff01.
a is converted to int with the same value and printed as fe.
~a again of course gives the same output.
a+1: a is converted to int before incrementing by one, result is 255, prints as ff.
The explanation for your surprising outputs is that a is first converted to int and then the computation is done on the int value.

a = ~a; You have integer promotion on this line already, since the ~, like most operators in C, promotes the operand according to the rules of integer promotion.
The character containing value 1 gets integer promoted to an int containing the value 1, before the operation is done. Assuming 32 bit int, the result of ~a is a negative, two's complement value with hex representation 0xFFFFFFFE.
You then store this result back into the unsigned char, which truncates it and keeps only the raw binary value of the least significant byte, that is: 0xFE.
I also know that the sequence of operation inside printf is from the right.
No. The order of evaluation of function parameters is not specified by the standard. The compiler is free to evaluate them in any order it likes, and you cannot know or assume any particular order.
Even more problematic is that there is no sequence point between the evaluation of the different parameters. And since in your case you are using the same variable more than once, each access to the variable is unsequenced and your program invokes undefined behavior. Meaning that anything can happen: weird outputs, program crashes, memory corruption etc etc.
Furthermore, printf is a special case, being an obscure, variadic function. All such functions have particular rules for promotion of the arguments ("the default argument promotions"). So regardless of what promotions that happen or don't happen before you pass the result to printf, printf will ruin everything by applying its own integer promotion to the parameter.
So if you wish to toy around with promotion, printf is a very bad choice for displaying the result. Try using the sizeof operator instead. printf("%zu", sizeof(~a)); will for example print 4, because of integer promotion.

Related

Effect of type casting on printf function

Here is a question from my book,
Actually, I don't know what the effect on the printf function will be, so I tried the corresponding statements in standard C. Here is my code:
#include <stdio.h>
void main() {
int x = 4;
printf("%hi\n", x);
printf("%hu\n", x);
printf("%i\n", x);
printf("%u\n", x);
printf("%li\n", x);
printf("%lu\n", x);
}
So, the output is very simple. But, is this really the solution to above problem?
There are numerous problems in this question that make it unsuitable for teaching C.
First, to work on this problem at all, we have to assume a non-standard C implementation is used. In standard C, %x is a complete conversion specification, so %xu and %xd cannot be; the conversion specification has already ended before the u or d. And the uses of z in a conversion specification interferes with its standard use for size_t.
Nonetheless, let’s assume this C variant does not have those standard conversion specifications and instead uses the ones shown in the table but that this C variant otherwise conforms to the C standard with minimal changes.
Our next problem is that, in Y num = 42;, we have a plain Y, not the signed Y or unsigned Y shown in the table. Let’s assume signed Y is intended.
Then num is a signed four-bit integer. The greatest value it can represent is 0111 in binary, which is 7 in decimal. So it cannot represent 42. Attempting to initialize it with 42 results in a conversion specified by C 2018 6.3.1.3, which says, in part:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
The result is we do not know what value is in num or even whether the program continues to execute; it may trap and terminate.
Well, let’s assume this implementation just takes the low bits of the value. 42 is 101010 in binary, so its low four bits are 1010. So if the bits in num are 1010, it is negative. The C standard permits several methods of representation for negative numbers, but we will assume the overwhelmingly most common one, two’s complement, so the bits 1010 in num represent −6.
Now, we get to the printf statements. Except the problem text shows Printf, which is not defined by the C standard. (Are you sure this problem relates to C code at all?) Let’s assume it means printf.
In printf("%xu",num);, if the conversion specification is supposed to work like the ones in standard C, then the corresponding argument should be an unsigned X value that has been promoted to int for the function call. As a two-bit unsigned integer, an unsigned X can represent 0, 1, 2, or 3. Passing it −6 is not defined. So we do not know what the program will print. It might take just the low two bits, 10, and print “2”. Or it might use all the bits and print “-6”. Both of those would be consistent with the requirement that the printf behave as specified for values that are in the range representable by unsigned X.
In printf("%xd",num); and printf("%yu",num);, the same problem exists.
In printf("%yd",num);, we are correctly passing a signed Y value for a signed Y conversion specification, so “-6” is printed.
Then printf("%zu",num); has the same problem with the value mismatched for the type.
Finally, in printf("%zd",num);, the value is again in the correct range, and “-6” is printed.
From all the assumptions we had to make and all the points where the behavior is undefined, you can see this is a terrible exercise. You should question the quality of the book it is in and of any school using it.

Difference b/w getting an address of variable using %p and %d

Here is an example
#include <stdio.h>
int main()
{
int a;
printf("%d\n",&a);
printf("%p\n",&a);
return 0;
}
======Output=======
-2054871028
0x7ffd8585280c
Do these two addresses point to the same address in RAM?
And how can I get the value by using each one of them, especially the second one?
%d format specifier is used to output a signed decimal integer.
From C Standard#7.21.6.1p8
d,i
The int argument is converted to signed decimal in the style [-]dddd. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it is expanded with leading zeros. The default precision is 1. The result of converting a zero value with a precision of zero is no characters.
%p prints the pointer.
From C Standard#7.21.6.1p8
p
The argument shall be a pointer to void. The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner. [emphasis mine]
This statement
printf("%d\n",&a);
leads to undefined behavior because %d is not valid for printing a pointer.
From C Standard#7.21.6.1p9
If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
When you take the address of the variable a by writing &a, what you're really doing is generating a pointer to a.
%p is designed for printing pointers. You should use %p to print pointers.
%d is not designed for printing pointers. It tries to print their values as signed decimal, which can be confusing (as you've seen), and it may not print the entire value, on a machine where pointers are bigger than integers. (For example, if you try to print pointers with %d in most "64 bit" environments, you can get even stranger results -- and that might be part of what happened here.)
This is an easy mistake to make. Good compilers should warn you about it. Mine says "warning: format specifies type 'int' but the argument has type 'int *'".
But yes, both 0x7ffd8585280c and -2054871028 do "point to the same address in RAM", because they're both the same number, the same address. (Well, they're trying to be the same address. See footnote below.)
I'm not sure what you mean by "And how can I get the value". Are you trying to get the value of the pointer, or the value of what the pointer points to?
You've already got the value of the pointer -- it's the address 0x7ffd8585280c. And since we know it points to the variable a, we know the value it points to, too. Things will be a bit more clear if we do it like this:
int a = 5;
int *ip = &a;
printf("value of pointer: %p\n", ip);
printf("pointed-to value: %d\n", *ip);
Without the explicit pointer variable ip, we could write
int a = 5;
printf("value of pointer: %p\n", &a);
printf("pointed-to value: %d\n", *&a);
But that's pretty silly, because the last line is equivalent to the much more straightforward
printf("pointed-to value: %d\n", a);
(Taking the address of a variable with & and then grabbing the contents of the pointer using * is a no-op: it's a lot like writing a + 1 - 1.)
Footnote: I said that 0x7ffd8585280c and -2054871028 were the same number, but they're not, they're just trying to be. 0x7ffd8585280c is really 140726843549708, and -2054871028 is really 0x8585280c, which is the lower-order 8 digits of 0x7ffd8585280c. It looks like %p on your machine is printing pointers as 48-bit values by default. I was about to be surprised by that, but then I realized my Mac does the same thing. Somehow I'd never noticed that.

What is printing out when int pointer is printed with %d?

This code:
#include <stdio.h>
int main() {
int num;
int *pi;
num = 0;
pi = &num;
printf("address: %p | %d\nvalue: %d\n", pi, pi, *pi);
}
produces this output:
address: 0x7fff5952f9cc | 1498610124
value: 0
I know that the left one is supposed to be the correct address, but what is printing out next to the address?
%p tells printf to treat the corresponding variable as a pointer, thus printf prints pi as a pointer; that is, a hexadecimal representation (i.e. 0x7fff5952f9cc). %d on the other hand tells printf to treat the corresponding variable as numeric. Therefore, what is being printed is the numeric value of pi as seen through an int (i.e. 1498610124), which is just 0x5952f9cc in base 10.
Now, the reason why these two representations of the same variable seem to have different values is that %d tells printf to expect an int, which is a 32-bit type here. If you convert 0x7fff5952f9cc (a 64-bit value) to int (a 32-bit type) you get 1498610124 (notice 0x7fff getting dropped).
You are printing the address in decimal instead of in hex but it is truncated to an int.
I guess you are executing this program on a 64bit machine.
The number printed next to the address is still the address held in the pointer, printed in integer format. You can also see that the value is truncated: 0x7fff5952f9cc in decimal is 140734691998156, but it is printed as 1498610124 because of truncation.
You are trying to print the address of num in hex and in decimal respectively. If I am not wrong, you are running your program on a 64-bit architecture, and hence the address is 8 bytes wide. So your address fits in the long data type. With %d, your value is getting truncated. Instead of %d, please use %ld. So your printf statement should actually be as below:
printf("address: %p | %ld\nvalue: %d\n", pi, pi, *pi);
Now run the program you will get your value correctly in decimal format.
It could be anything, because it's Undefined Behaviour. (C standard, § 7.21.6.1, The fprintf function):
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
You print pi (a pointer to an int) with two format specifiers, %p and %d. According to the C standard (and probably reproduced word for word in man fprintf on your system):
d,i The int argument is converted to signed decimal in the style [−]dddd
p The argument shall be a pointer to void. The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner.
So neither of those uses is correct. It's not a pointer to void and it's also not an int. So what you should write is:
printf("address: %p | %d\nvalue: %d\n", (void*)pi, (int)pi, *pi);
On your system, that probably produces the same output (Undefined Behaviour includes unexpectedly producing the incorrectly expected behaviour) but it might not. In any case, writing the line correctly makes it relatively clear what the second number printed is: it's the value of the pointer converted to an integer.
However, there is no guarantee that this will work, either. Again, from the standard (§6.3.2.3, Pointers, para. 6):
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined.
(The "previously specified" case is that a NULL pointer may reliably be converted to any integer type; it will have the value 0.)
So the idea is that a pointer is a lot like an integer, but it might be the size of a long. Or a long long. Or something else. It might even be bigger than any integer type, but if there is some integer type which is the right size and you #include <stdint.h>, that integer type will be typedef'd to intptr_t and its unsigned version (probably more useful) will be uintptr_t. Unfortunately, there are no standard printf conversions for those sizes, so if you really want to, you would need to #include <inttypes.h> and write:
printf("address: %p | %" PRIuPTR "\nvalue: %d\n",
(void*)pi, (uintptr_t)pi, *pi);
Alternatively, because it is always allowed to convert any integer to any unsigned integer (with possible loss of information, but with a well-defined conversion), you could use two casts:
printf("address: %p | %u\nvalue: %d\n",
(void*)pi, (unsigned)(uintptr_t)pi, *pi);
(Because that supplies an unsigned int argument, the format specifier must be u instead of d.)
When using printf, the "p" modifier is intended to print out a memory address in hex format. The "d" modifier tells it to cast the value to a signed integer value. So it's taking 0x7fff5952f9cc and turning it into a signed int.
See this for more details on printf.
In that code you print the address with %d, so the compiler displays a warning at compile time; %p displays the address in hexadecimal, while %d displays it as an integer.
It's the lower part (the 4 least significant bytes) of the pointer value in decimal:
0x5952F9CC == 1498610124.

Defined behavior, passing character to printf("%02X"

I recently came across this question, where the OP was having issues printing the hexadecimal value of a variable. I believe the problem can be summed by the following code:
#include <stdio.h>
int main() {
char signedChar = 0xf0;
printf("Signed\n");
printf("Raw: %02X\n", signedChar);
printf("Masked: %02X\n", signedChar & 0xff);
printf("Cast: %02X\n", (unsigned char)signedChar);
return 0;
}
This gives the following output:
Signed
Raw: FFFFFFF0
Masked: F0
Cast: F0
The format string used for each of the prints is %02X, which I’ve always interpreted as ‘print the supplied int as a hexadecimal value with at least two digits’.
The first case passes the signedCharacter as a parameter and prints out the wrong value (because the other three bytes of the int have all of their bits set).
The second case gets around this problem, by applying a bit mask (0xFF) against the value to remove all but the least significant byte, where the char is stored. Should this work? Surely: signedChar == signedChar & 0xFF?
The third case gets around the problem by casting the character to an unsigned char (which seems to clear the top three bytes?).
For each of the three cases above, can anybody tell me if the behavior is defined? How/Where?
I don't think this behavior is completely defined by the C standard. After all, it depends on the binary representation of signed values. I will just describe how it's likely to work.
printf("Raw: %02X\n", signedChar);
(char)0xf0, which can be written as (char)-16, is converted to (int)-16; its hex representation is 0xfffffff0.
printf("Masked: %02X\n", signedChar & 0xff);
0xff is of type int, so before calculating &, signedChar is converted to (int)-16.
((int)-16) & ((int)0xff) == (int)0x000000f0.
printf("Cast: %02X\n", (unsigned char)signedChar);
(unsigned char)0xf0 which can be written as (unsigned char)240 is converted to (unsigned int)240 as hex it's 0x000000f0

difference between printing a memory address using %u and %d in C?

I am reading a C book. To print out a memory address of a variable, sometimes the book uses:
printf("%u\n",&n);
Sometimes, the author wrote:
printf("%d\n",&n);
The result is always the same, but I do not understand the differences between the two (I know %u for unsigned).
Can anyone elaborate on this, please?
Thanks a lot.
%u treats the integer as unsigned, whereas %d treats the integer as signed. If the integer is between 0 and INT_MAX (which is 2^31-1 on 32-bit systems), then the output is identical for both cases.
It only makes a difference if the integer is negative (for signed inputs) or between INT_MAX+1 and UINT_MAX (e.g. between 2^31 and 2^32-1). In that case, if you use the %d specifier, you'll get a negative number, whereas if you use %u, you'll get a large positive number.
Addresses only make sense as unsigned numbers, so there's never any reason to print them out as signed numbers. Furthermore, when they are printed out, they're usually printed in hexadecimal (with the %x format specifier), not decimal.
You should really just use the %p format specifier for addresses, though; it's guaranteed to work for all valid pointers. If you're on a system with 32-bit integers but 64-bit pointers and you attempt to print a pointer with any of %d, %u, or %x without the ll length modifier, you'll get the wrong result for that and anything else that gets printed later (because printf only reads 4 of the 8 bytes of the pointer argument); if you do add the ll length modifier, then you won't be portable to 32-bit systems.
Bottom line: always use %p for printing out pointers/addresses:
printf("The address of n is: %p\n", &n);
// Output (32-bit system): "The address of n is: 0xbffff9ec"
// Output (64-bit system): "The address of n is: 0x7fff5fbff96c"
The exact output format is implementation-defined (C99 §7.19.6.1/8), but it will almost always be printed as an unsigned hexadecimal number, usually with a leading 0x.
%d and %u will print the same results when the most significant bit is not set. However, this isn't portable code at all, and is not good style. I hope your book is better than it seems from this example.
What value did you try? The difference unsigned vs. signed, just as you said you know. So what did it do and what did you expect?
Positive signed values look the same as unsigned so can I assume you used a smaller value to test? What about a negative value?
Finally, if you are trying to print the variable's address (as it appears you are), use %p instead.
All addresses are unsigned 32-bit or 64-bit depending on machine (can't write to a negative address). The use of %d isn't appropriate, but will usually work. It is recommended to use %u or %lu.
There is no such difference; just don't get confused if you have just started learning pointers.
%u is for unsigned ones, and %d is for signed ones.
