Datatype promotion in C

In the following code:
#include "stdio.h"
signed char a= 0x80;
unsigned char b= 0x01;
void main (void)
{
if(b*a>1)
printf("promoted\n");
else if (b*a<1)
printf("why doesnt promotion work?");
while(1);
}
I expected "promoted' to be printed. But it doesnt. Why?
If I can the datatypes to signed and unsigned int, and have a as a negative number, eg, 0x80000000 and b as a positive number, 0x01, "promoted" gets printed as expected.
PLZ HELP me understand what the problem is!

You've just been caught by the messy type-promotion rules of C.
In C, operands of integer types narrower than int are automatically promoted to int (the "integer promotions") before arithmetic.
So you have:
0x80 * 0x01 = -128 * 1
0x80 gets sign-extended to type int:
0xffffff80 * 0x00000001 = -128 * 1 = -128
So the result is -128 and thus is less than 1.
When you use int and unsigned int, the usual arithmetic conversions apply instead: the signed operand is converted to unsigned int, so the arithmetic is done in unsigned int. 0x80000000 * 0x01 = 0x80000000, which as an unsigned integer is bigger than 1.
So here's the side-by-side comparison of the type promotion that's taking place:
(signed char) * (unsigned char) -> int
(signed int ) * (unsigned int ) -> unsigned int

(signed char)0x80       * (unsigned char)0x01 -> (int)         0xffffff80
(signed int )0x80000000 * (unsigned int )0x01 -> (unsigned int)0x80000000

(int)         0xffffff80 is negative -> prints "why doesn't promotion work?"
(unsigned int)0x80000000 is positive -> prints "promoted"
For the details, see the integer-promotion and usual-arithmetic-conversion rules in the C standard (C11 §6.3.1.1 and §6.3.1.8).
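You can verify both the value and the promoted type of b*a directly. A minimal sketch, assuming a two's-complement platform with 32-bit int (where initializing a signed char with 0x80 yields -128):

#include <stdio.h>

int main(void)
{
    signed char a = 0x80;    /* implementation-defined conversion; -128 on common platforms */
    unsigned char b = 0x01;

    /* The integer promotions convert both operands to int before the
       multiplication, so b*a has type int and value -128. */
    printf("b*a = %d, sizeof(b*a) = %zu\n", b * a, sizeof(b * a));
    return 0;
}

Typical output: b*a = -128, sizeof(b*a) = 4.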

The reason printf("promoted\n"); never runs
is because b*a is always == -128, which is less than 1
a b
0x80 * 0x01 = -128 * 1

Related

%d for unsigned integer

I accidentally used "%d" to print an unsigned integer using an online compiler. I thought errors would pop up, but my program ran successfully. It's good that my code works, but I just don't understand why.
#include <stdio.h>

int main() {
    unsigned int x = 1;
    printf("%d", x);
    return 0;
}
The value of the "unsigned integer" was small enough that the MSB (most significant bit) was not set. If it were, printf() would have treated the value as a "negative signed integer" value.
#include <stdio.h>
#include <stdint.h>

int main() {
    uint32_t x = 0x5;
    uint32_t y = 0xC0000000;
    printf("%d %u %d\n", x, y, y);
    return 0;
}

Output:
5 3221225472 -1073741824
You can see the difference.
Modern compilers parse printf format strings and match the specifiers against the types of the following arguments, so the online compiler may or may not have been able to report this type mismatch with a warning. This is worth looking into (e.g. gcc and clang warn with -Wformat, which -Wall enables).
The printf() manual says:
A character that specifies the type of conversion to be applied. The
conversion specifiers and their meanings are:
d, i
The int argument is
converted to signed decimal notation. The precision, if any, gives the
minimum number of digits that must appear; if the converted value
requires fewer digits, it is padded on the left with zeros. The
default precision is 1. When 0 is printed with an explicit precision
0, the output is empty.
So even if the argument has an unsigned representation, its bits will be interpreted as a signed int and printed (strictly speaking the mismatch is undefined behaviour, but this is what common platforms do). See the following code example:
#include <stdio.h>

int main(){
    signed int x1 = -2147483648;
    unsigned int x2 = -2147483648;
    unsigned long long x3 = -2147483648;
    printf("signed int x1 = %d\n", x1);
    printf("unsigned int x2 = %d\n", x2);
    printf("unsigned long long x3 = %d\n", x3);
}
and this is the output:
signed int x1 = -2147483648
unsigned int x2 = -2147483648
unsigned long long x3 = -2147483648
So no matter what the type of the printed variable is, as long as you specify %d as the format specifier, its bits are interpreted as a signed int and printed.
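A well-defined way to observe the same bit reinterpretation, without a mismatched format specifier, is to copy the bits into a signed int. A minimal sketch, assuming 32-bit int and two's complement:

#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int u = 0x80000000u;   /* bit pattern with the MSB set */
    int s;

    memcpy(&s, &u, sizeof s);       /* reinterpret the same bits as signed int */
    printf("%u as signed: %d\n", u, s);   /* 2147483648 as signed: -2147483648 */
    return 0;
}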
In the case of unsigned char, for example:
#include <stdio.h>

int main(){
    unsigned char s = -10;
    printf("s = %d", s);
}
the output is:
s = 246
because the binary representation of unsigned char s = -10 is:
1111 0110
where the MSB is 1. But when it's promoted to signed int, the new representation is:
0000 0000 0000 0000 0000 0000 1111 0110
so the MSB is no longer 1, and the value is read as the positive number 246 rather than a negative one.

Char automatically converts to int (I guess)

I have the following code:
char temp[] = { 0xAE, 0xFF };
printf("%X\n", temp[0]);
Why is the output FFFFFFAE, and not just AE?
I tried
printf("%X\n", 0b10101110);
And output is correct: AE.
Suggestions?
The answer you're getting, FFFFFFAE, is a result of the char data type being signed. If you check the value, you'll notice that it's equal to -82, where -82 + 256 = 174, or 0xAE in hexadecimal.
The reason you get the correct output when you print 0b10101110 or even 174 directly is that you're using the literal values, whereas in your example the 0xAE value is first stored in a signed char, where it is wrapped into the signed char range [-128, 127] (a reduction modulo 256), if you want to think of it that way.
So in other words (stored value, value as signed char, sign-extended int):
  0  =    0  = 0x00000000
127  =  127  = 0x0000007F
128  = -128  = 0xFFFFFF80
129  = -127  = 0xFFFFFF81
174  =  -82  = 0xFFFFFFAE
255  =   -1  = 0xFFFFFFFF
256  =    0  = 0x00000000
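You can reproduce one row of that table directly. A small sketch (printing a negative value with %X is itself the mismatch under discussion; it is shown here only to expose the sign extension):

#include <stdio.h>

int main(void)
{
    /* Storing 174 in a signed char wraps it to -82 (implementation-defined,
       effectively modulo 256 on common platforms); the promotion back to int
       then sign-extends. */
    signed char c = (signed char)174;

    printf("%d %X\n", c, c);   /* prints: -82 FFFFFFAE */
    return 0;
}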
To fix this "problem", you could declare the same array you initially did, just make sure to use an unsigned char type array and your values should print as you expect.
#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned char temp[] = { 0xAE, 0xFF };

    printf("%X\n", temp[0]);
    printf("%d\n\n", temp[0]);

    printf("%X\n", temp[1]);
    printf("%d\n\n", temp[1]);

    return EXIT_SUCCESS;
}
Output:
AE
174
FF
255
https://linux.die.net/man/3/printf
According to the man page, %x and %X expect an unsigned int, so printf will read an int-sized value from the argument list.
In any case, on most architectures you can't pass a variadic parameter smaller than a word (int-sized), so in your case the char is converted to int.
In the first case, you're passing a char, so it is converted to int. Both are signed, so sign extension is performed, hence the leading FFs you see.
In your second example, you're actually passing an int all the way through, so no conversion changes the value.
If you'd try:
printf("%X\n", (char) 0b10101110);
You'd see that FFFFFFAE will be printed.
When you pass a type smaller than int (as char is) to a variadic function (as printf(3) is), the parameter is converted to int if it is signed and to unsigned int if it is unsigned. What you observe is sign extension: since the most significant bit of the char variable is set, it is replicated into the three bytes needed to complete an int.
To solve this and keep the data in 8 bits, you have two possibilities:
Allow your signed char to convert to an int (with sign extension), then mask off bits 8 and above:
printf("%X\n", (int) my_char & 0xff);
Declare your variable as unsigned, so it is promoted to an unsigned int:
unsigned char my_char;
...
printf("%X\n", my_char);
This code causes undefined behaviour. The argument to %X must have type unsigned int, but you supply a char.
Undefined behaviour means that anything can happen; including, but not limited to, extra F's appearing in the output.

Print wrong value of unsigned int variable in C

I have written this small program using C:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    unsigned int un_i = 1112;
    printf("%d and %d", (1 - un_i), (1 - un_i)/10);
    return 0;
}
My expectation is: "-1111 and -111"
But my result is: "-1111 and 429496618"
I don't know why it prints 429496618 instead of -111. Please explain it to me.
I use gcc 4.4.7 and CentOS with kernel 2.6.32.
Thank you very much!
That's because un_i is of type unsigned int, which cannot represent negative values. If you expect the result to be negative, you need a signed type, like int, in the arithmetic. Try this:
unsigned int un_i = 1112;
printf ("PRINTF: un_i[%d] and u_i/10[%d]\n", (1 - (int)un_i), (1 - (int)un_i)/10);
You expect to print -1111 and -111. However 1 - un_i produces a result that is also of type unsigned int; the result is also always non-negative. If unsigned int is 32 bits wide (as it is in your case), the result will be 4294966185; and that divided by 10 would result in 429496618.
The %d conversion expects a (signed) int, not an unsigned int. The C11 standard says that passing a variable argument of the wrong type is undefined behaviour, except when
one type is a signed integer type, the other type is the corresponding unsigned integer type, and the value is representable in both types
Thus printing 429496618 with %d has defined behaviour, as this same value is representable as both signed int and unsigned int.
However 1 - 1112U has the value UINT_MAX - 1110, which, when used as an argument to printf and converted with %d, leads to undefined behaviour, since UINT_MAX - 1110 is not representable as a signed int; getting -1111 printed just happens to be the (undefined) behaviour in this case.
Since you really want to use signed numbers, then you should declare your variable un_i as an int instead of unsigned int.
In case you expect to do signed math with numbers larger than int can hold, use long int or long long int, or even better, fixed-width types such as int64_t, instead of this unsigned trickery.
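A small sketch of both behaviours side by side, assuming 32-bit int and unsigned int:

#include <stdio.h>

int main(void)
{
    unsigned int un_i = 1112;

    /* Unsigned arithmetic wraps around: 1 - 1112 becomes UINT_MAX - 1110,
       i.e. 4294966185 with a 32-bit unsigned int. */
    printf("%u\n", 1 - un_i);               /* 4294966185 */
    printf("%u\n", (1 - un_i) / 10);        /* 429496618  */

    /* Casting to int first keeps the arithmetic signed. */
    printf("%d\n", (1 - (int)un_i) / 10);   /* -111 */
    return 0;
}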
Your issue comes from signed/unsigned conversion.
1. In the case of unsigned int un_i = 1112:
(1 - un_i) = 0xfffffba9 /* top bit is 1 */
Even though the top bit is 1, un_i is unsigned, so that bit is treated as an ordinary value bit, and it becomes 0 after the division:
(1 - un_i)/10 = 0x1999992a /* top bit is 0 */
printf ("%d", (1 - un_i)/10); /* 0x1999992a = 429496618 because the sign bit is 0 */
2. In the case of int sig_i = 1112, the top bit is treated as the sign bit, and it is still 1 (negative) after the division by 10:
(1 - sig_i) = 0xfffffba9 /* top bit is 1 */
(1 - sig_i)/10 = 0xffffff91 /* top bit is 1 */
printf ("%d", (1 - sig_i)/10); /* 0xffffff91 = -111 because the sign bit is 1 */
Run this code to see the details:

#include <stdio.h>

int main()
{
    unsigned int un_i = 1112;
    int sig_i = 1112;

    printf("Unsigned\n%d [hex: %x]\n", (1 - un_i), (1 - un_i));
    printf("and %d [hex: %x]\n", (1 - un_i)/10, (1 - un_i)/10);
    printf("Signed\n%d [hex: %x]\n", (1 - sig_i), (1 - sig_i));
    printf("and %d [hex: %x]\n", (1 - sig_i)/10, (1 - sig_i)/10);
    return 0;
}
Result:
Unsigned
-1111 [hex: fffffba9]
and 429496618 [hex: 1999992a]
Signed
-1111 [hex: fffffba9]
and -111 [hex: ffffff91]

Unsigned int from 32-bit to 64-bit OS

This code snippet is excerpted from a linux book.
If this is not appropriate to post the code snippet here, please let me know.
I will delete it. Thanks.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char buf[30];
    char *p;
    int i;
    unsigned int index = 0;
    //unsigned long index = 0;

    printf("index-1 = %lx (sizeof %d)\n", index-1, sizeof(index-1));

    for (i = 'A'; i <= 'Z'; i++)
        buf[i - 'A'] = i;

    p = &buf[1];
    printf("%c: buf=%p p=%p p[-1]=%p\n", p[index-1], buf, p, &p[index-1]);

    return 0;
}
On a 32-bit OS:
This program works fine whether index is declared as unsigned int or unsigned long.
On a 64-bit OS:
The same program dumps core if index is declared as unsigned int.
However, if I change the type of index from unsigned int to a) unsigned long or b) unsigned short, the program works fine too.
The book only says that 64-bit causes the core dump because the value is non-negative, but I have no idea exactly why unsigned long and unsigned short work while unsigned int doesn't.
What confuses me is that
p + (0u - 1) == p + UINT_MAX when index is unsigned int,
BUT
p + (0ul - 1) == p - 1 (i.e., &p[0ul - 1] == &p[-1]) when index is unsigned long.
I get stuck here.
If anyone can help to elaborate the details, it is highly appreciated!
Thank you.
Here are some results on my 32-bit machine (RHEL 5.10, gcc 4.1.2 20080704) and my 64-bit machine (RHEL 6.3, gcc 4.4.6 20120305). I am not sure whether the gcc version makes any difference here, so I paste that information as well.
On 32 bit:
I tried two changes:
1) Modify unsigned int index = 0 to unsigned short index = 0.
2) Modify unsigned int index = 0 to unsigned char index = 0.
The program can run without problem.
index-1 = ffffffff (sizeof 4)
A: buf=0xbfbdd5da p=0xbfbdd5db p[-1]=0xbfbdd5da
It seems that the data type of index will be promoted to 4 bytes due to -1.
On 64 bit:
I tried three changes:
1) Modify unsigned int index = 0 to unsigned char index = 0.
It works!
index-1 = ffffffff (sizeof 4)
A: buf=0x7fffef304ae0 p=0x7fffef304ae1 p[-1]=0x7fffef304ae0
2) Modify unsigned int index = 0 to unsigned short index = 0.
It works!
index-1 = ffffffff (sizeof 4)
A: buf=0x7fff48233170 p=0x7fff48233171 p[-1]=0x7fff48233170
3) Modify unsigned int index = 0 to unsigned long index = 0.
It works!
index-1 = ffffffff (sizeof 8)
A: buf=0x7fffb81d6c20 p=0x7fffb81d6c21 p[-1]=0x7fffb81d6c20
BUT, only
unsigned int index = 0 runs into the core dump at the last printf.
index-1 = ffffffff (sizeof 4)
Segmentation fault (core dumped)
Do not lie to the compiler!
Passing printf an int where it expects a long (%lx) is undefined behavior.
(Creating a pointer pointing outside any valid object, and not just one past its end, is UB too...)
Correct the format specifiers and the pointer arithmetic (which includes indexing as a special case) and everything will work.
UB includes "it works as expected" as well as "catastrophic failure".
BTW: if you politely ask your compiler for all warnings, it will warn you. Use -Wall -Wextra -pedantic or similar.
One other problem your code has is in this printf():
printf("index-1 = %lx (sizeof %d)\n", index-1, sizeof(index-1));
Let's simplify:
int i = 100;
printf("%lx", i - 100);
You are telling printf that the argument is a long, but in reality you are sending an int. clang gives the correct warning (I think gcc should also emit it). See:
test1.c:6:19: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat]
printf("%lx", i - 100);
~~~ ^~~~~~~
%x
1 warning generated.
The solution is simple: either pass a long to printf or tell printf to print an int:
printf("%lx", (long)(i-100) );
printf("%x", i-100);
You got lucky on 32-bit and your app did not crash. Porting it to 64-bit revealed a bug in your code, and now you can fix it.
Arithmetic on unsigned values is always defined, in terms of wrap-around. E.g. (unsigned)-1 is the same as UINT_MAX. So an expression like
p + (0u-1)
is equivalent to
p + UINT_MAX
(&p[0u-1] is equivalent to &*(p + (0u-1)), i.e. to p + (0u-1).)
Maybe this is easier to understand if we replace the pointers with unsigned integer types. Consider:
uint32_t p32; // say, this is a 32-bit "pointer"
uint64_t p64; // a 64-bit "pointer"
Assuming 16, 32, and 64 bits for short, int, and long, respectively (entries on the same line are equal):
p32 + (unsigned short)-1    p32 + USHRT_MAX    p32 + (UINT_MAX>>16)
p32 + (0u-1)                p32 + UINT_MAX     p32 - 1
p32 + (0ul-1)               p32 + ULONG_MAX    p32 + UINT_MAX      p32 - 1
p64 + (0u-1)                p64 + UINT_MAX
p64 + (0ul-1)               p64 + ULONG_MAX    p64 - 1
You can always replace operands of addition, subtraction and multiplication on unsigned types by something congruent modulo the maximum value + 1. For example,
-1 ≡ 0xffffffff (mod 2^32)
(0xffffffff is 2^32 - 1, i.e. UINT_MAX), and also
0xffffffffffffffff ≡ 0xffffffff (mod 2^32)
(for a 32-bit unsigned type you can always truncate to the least significant 8 hex digits).
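A quick sketch of that truncation rule in code:

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    /* Converting to a narrower unsigned type reduces modulo 2^32, i.e. it
       keeps only the least significant 8 hex digits. */
    uint64_t wide = 0xffffffffffffffffu;   /* -1 mod 2^64 */
    uint32_t narrow = (uint32_t)wide;      /* -1 mod 2^32 */

    printf("%" PRIx64 " -> %" PRIx32 "\n", wide, narrow);   /* ffffffffffffffff -> ffffffff */
    return 0;
}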
Your examples:
32-bit
unsigned short index = 0;
In index - 1, index is promoted to int. The result has type int and value -1 (which is negative). Same for unsigned char.
64-bit
unsigned char index = 0;
unsigned short index = 0;
Same as for 32-bit. index is promoted to int, index - 1 is negative.
unsigned long index = 0;
The output
index-1 = ffffffff (sizeof 8)
is weird, it’s your only correct use of %lx but looks like you’ve printed it with %x (expecting 4 bytes); on my 64-bit computer (with 64-bit long) and with %lx I get:
index-1 = ffffffffffffffff (sizeof 8)
ffffffffffffffffhex is -1 modulo 264.
unsigned index = 0;
An int cannot hold every value an unsigned int can, so in index - 1 nothing is promoted to int; the result has type unsigned int and value -1 (which is positive, being the same as UINT_MAX or 0xffffffff, since the type is unsigned). For 32-bit addresses, adding this value is the same as subtracting one:
   bfbdd5db        bfbdd5db
 + ffffffff      -        1
 = 1bfbdd5da
 =  bfbdd5da     = bfbdd5da
(Note the wrap-around/truncation.) For 64-bit addresses, however:
   00007fff b81d6c21
 +          ffffffff
 = 00008000 b81d6c20
with no wrap-around. This is trying to access an invalid address, so you get a segfault.
Maybe have a look at 2’s complement on Wikipedia.
Under my 64-bit Linux, using a specifier expecting a 32-bit value while passing a 64-bit type (and the other way round) seems to "work": only the 32 least significant bits are read. But use the correct ones: %lx expects an unsigned long, unmodified %x an unsigned int, and %hx an unsigned short (an unsigned short is promoted to int when passed to printf as a variable argument, due to the default argument promotions). The length modifier for size_t is z, as in %zu:
printf("index-1 = %lx (sizeof %zu)\n", (unsigned long)(index-1), sizeof(index-1));
(The conversion to unsigned long doesn’t change the value of an unsigned int, unsigned short, or unsigned char expression.)
sizeof(index-1) could also have been written as sizeof(+index), the only effect on the size of the expression are the usual arithmetic conversions, which are also triggered by unary +.
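Putting the fixes together, here is one corrected version of the original snippet (a sketch; it uses a signed offset for the pointer arithmetic and casts the %p arguments to void *, as the standard requires):

#include <stdio.h>

int main(void)
{
    char buf[30];
    char *p;
    int i;
    unsigned int index = 0;

    /* Convert the argument so it matches %lx; %zu is the specifier for size_t. */
    printf("index-1 = %lx (sizeof %zu)\n",
           (unsigned long)(index - 1), sizeof(index - 1));

    for (i = 'A'; i <= 'Z'; i++)
        buf[i - 'A'] = i;

    p = &buf[1];
    /* Use a signed offset: (int)index - 1 is -1, so p[(int)index - 1] is buf[0]. */
    printf("%c: buf=%p p=%p p[-1]=%p\n",
           p[(int)index - 1], (void *)buf, (void *)p, (void *)&p[(int)index - 1]);
    return 0;
}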

Bitwise operation in C

I am new to bitwise operations. I have the basic concepts of AND, OR, XOR, and two's complement. However, I came across the following piece of code and am unable to figure out the output.
char c1 = 0xFF; // -1
int shifted = c1 << 8; //-256 (-1 * 256)
printf("%d, %x\n", shifted, shifted);
int myInt;
myInt = 0xFFFFFFE2;
printf("%d\n", myInt);
int i = 0xff;
printf("%d\n", i<<2);
The output is:
-256, ffffff00
-30
1020
Please help me understand what's going on here!
char c1 = 0xFF; // -1
int shifted = c1 << 8; //-256 (-1 * 256)
c1 is promoted to int for the shift, so it's still -1. Left-shifting a negative int is undefined behaviour in standard C, but your implementation does what most do and shifts the bit pattern as if it were unsigned, so a left shift by eight places is a multiplication by 256.
printf( "%d, %x\n", shifted, shifted );
-256, ffffff00
as expected. The bit pattern of -256 in two's complement is 0xFFFFFF00 (32-bit ints).
int myInt;
myInt = 0xFFFFFFE2;
That bit pattern is -30 in two's complement
printf("%d\n",myInt);
int i = 0xff ;
This is 255, 255*4 = 1020
printf("%d\n", i<<2);
-30
1020
Write down c1, myInt and i in binary with the declared number of bits.
Apply the operations to them.
Follow it through.
OK let me explain in some detail what's going on. Any binary notation to clarify will be prefixed with 0b and the most significant bit on the left.
char c1 = 0xFF; // -1
char c1 is an eight-bit integer type, signed on your platform (whether plain char is signed is implementation-defined), which is set to 0b11111111.
For signed types such as int, short and char, the leftmost bit is used as the sign bit. In two's complement, the most common standard for storing signed types, any signed integer with all bits set is equal to -1.
int shifted = c1 << 8; //-256 (-1 * 256)
c1 is implicitly converted to a signed int before the shift, so we have an int of -1, which is 0xffffffff. (Strictly, left-shifting a negative value is undefined behaviour, but common implementations shift the bit pattern.) The shift operator in C is a non-rotating shift, i.e. any bits shifted in from outside the value are set to zero. After the shift we have 0xffffff00, which is equal to -256 in two's complement.
printf( "%d, %x\n", shifted, shifted );
int myInt;
myInt = 0xFFFFFFE2;
printf("%d\n",myInt);
You're reading a signed integer which is printed according to two's complement.
int i = 0xff ;
printf("%d\n", i<<2);
The initial i is equivalent to 0b11111111, with the sign bit not set. After the shift we have 0b1111111100, which is equal to 1020, again because the shift is non-rotating. Hope that clarifies things a bit. If you want to do bit shifts and AND/OR logic, you should generally use unsigned types, as mentioned before.
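For comparison, a minimal sketch of the same shift done entirely with unsigned types, where no sign extension occurs:

#include <stdio.h>

int main(void)
{
    unsigned char c1 = 0xFF;                    /* 255; no sign bit to extend */
    unsigned int shifted = (unsigned int)c1 << 8;

    printf("%u, %x\n", shifted, shifted);       /* prints: 65280, ff00 */
    return 0;
}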
