unsigned type of an integer constant - c

I am confused about the type of an integer constant, as described here:
In the first row, if a constant has no 'u' suffix, why must a decimal constant have a signed type, while an octal or hexadecimal constant can have an unsigned type?
I think that falling back to the unsigned version when the signed version does not fit causes problems, for example:
long long l1 = 0xffffffff + 0xffffffff; // 0xffffffff is unsigned int
long long l2 = 4294967295 + 4294967295; // 4294967295 is signed long
l1 is fffffffe, while l2 is 1fffffffe, and obviously l1 is wrong.

If I had to answer, I'd say that hexadecimal and octal numbers represent bit patterns more closely than decimal ones, and therefore the C standard committee decided that hex and octal constants may be unsigned even without a U suffix.
Think about how many people would write code like this:
uint32_t b = a & 0xFFFFFFF0;
uint32_t b = a & 4294967280; // or -16?

The problems here come more from performing the operations in the wrong type than from the constants having the wrong type.
// some_wide_type = some_narrow_type + some_narrow_type --> trouble
long long l1 = 0xffffffff + 0xffffffff;
long long l2 = 4294967295 + 4294967295;
Instead do the math using the target type
long long l1 = 0LL + 0xffffffff + 0xffffffff;
long long l2 = 0LL + 4294967295 + 4294967295;
or use 1 type (long long) rather than the 3 (long long, unsigned long, long)
long long l1 = 0xffffffffLL + 0xffffffffLL;
long long l2 = 4294967295LL + 4294967295LL;
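To check which types a compiler actually picks for these constants, a small sketch like the following may help. It assumes a C11 compiler (for _Generic); the names TYPE_NAME, hex_constant_type, decimal_constant_type, sum_as_long_long and print_constant_types are made up for illustration, and the reported type names depend on the platform's int and long widths.

```c
#include <stdio.h>

/* Hypothetical helper: maps an expression to the name of its type (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x),                 \
    int: "int",                                    \
    unsigned int: "unsigned int",                  \
    long: "long",                                  \
    unsigned long: "unsigned long",                \
    long long: "long long",                        \
    unsigned long long: "unsigned long long",      \
    default: "other")

/* Type of the unsuffixed hexadecimal constant (may be an unsigned type). */
const char *hex_constant_type(void) { return TYPE_NAME(0xffffffff); }

/* Type of the unsuffixed decimal constant (a signed type whenever one fits). */
const char *decimal_constant_type(void) { return TYPE_NAME(4294967295); }

/* Both operands are long long, so the addition cannot wrap at 32 bits. */
long long sum_as_long_long(void) { return 0xffffffffLL + 0xffffffffLL; }

/* Prints the types chosen on this platform and the sum in hex. */
void print_constant_types(void)
{
    printf("0xffffffff : %s\n", hex_constant_type());
    printf("4294967295 : %s\n", decimal_constant_type());
    printf("sum        : %llx\n", (unsigned long long)sum_as_long_long());
}
```

On a platform with 32-bit int and 64-bit long, print_constant_types() would be expected to report unsigned int for the hex constant and long for the decimal one; other platforms may differ.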

Related

Why left shift 24 bits changed the value of unsigned long in C?

I expect 0b11010010 << 24 should be the same value as 0b11010010000000000000000000000000.
I tested it in C, and 0b11010010 << 24 doesn't work as expected when the result is stored in a C unsigned long.
Does anyone know why C unsigned long behaves like this?
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
int main(){
unsigned long a = 0b11010010000000000000000000000000;
unsigned long b = 0b11010010 << 24;
bool isTheSame1 = a == b;
printf("isTheSame1 %d \n",isTheSame1);
bool isTheSame2 = 0b11010010000000000000000000000000 == (0b11010010 << 24);
printf("isTheSame2 %d",isTheSame2);
}
isTheSame1 should be 1 but it prints 0, as follows:
isTheSame1 0
isTheSame2 1
Compiled and executed by gcc main.c && ./a.out
gcc --version
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin22.2.0
Thread model: posix
Updated
As Allan Wind pointed out, I added UL suffix and now it works as expected.
unsigned long a = 0b11010010000000000000000000000000UL;
unsigned long b = 0b11010010UL << 24;
bool isTheSame1 = a == b;
printf("isTheSame1 %d \n",isTheSame1);
bool isTheSame2 = 0b11010010000000000000000000000000UL == (0b11010010UL << 24);
printf("isTheSame2 %d",isTheSame2);
The constant 0b11010010 has type int which is signed. Assuming an int is 32 bits, the expression 0b11010010 << 24 will shift a "1" bit into the sign bit. Doing so triggers undefined behavior which is why you're getting strange results.
Add the UL suffix to the constant to give it type unsigned long, then the shift will work as expected.
unsigned long b = 0b11010010UL << 24;
You are doing a left shift of a signed value (see the good answer by @dbush).
In the absence of suffixes, numeric constants have type int or double:
b = 0b11010010 ; /* type int */
b = 1.0; /* type double */
If you want b in your example to be computed as unsigned long, use a suffix:
b = 0b11010010UL; /* type unsigned long */
or a cast:
b = (unsigned long)0b11010010; /* type unsigned long */
With 32-bit (or smaller) int, 0b11010010 << 24 is undefined behavior (UB). It attempts to shift into the sign bit.
When int is 32-bit (common), this often results in a negative value corresponding to the bit pattern 11010010-00000000-00000000-00000000.
When a negative value is saved as an unsigned long, ULONG_MAX + 1 is added to it. With a 64-bit unsigned long the value has the bit pattern:
11111111-11111111-11111111-11111111-11010010-00000000-00000000-00000000
This large unsigned long is not equal to 0b11010010000000000000000000000000UL, hence the output "isTheSame1 0".
Had OP's long been 32-bit, it "might" have worked as OP had intended, yet unfortunately still relying on UB.
Appending an L
32-bit long: 0b11010010L << 24 suffers the same UB problem as above, yet might have "worked".
64-bit long: 0b11010010L has type long and 0b11010010L << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
Appending a U
32-bit unsigned: 0b11010010U << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
16-bit unsigned: 0b11010010U << 24 is undefined behavior as the shift is too great. Often the UB results in the same as 0b11010010U << (24-16), yet this is not reliably done.
Appending UL
32 or 64-bit unsigned long: 0b11010010UL << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
Since the left-hand side of the = below is unsigned long, it is better for the right-hand side constant to be unsigned long as well.
unsigned long b = 0b11010010 << 24; // Original
unsigned long b = 0b11010010UL << 24; // Better
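The difference between the two results can be reproduced without invoking the undefined shift. The sketch below is only an illustration: it uses fixed-width types and hex (0xD2 is 0b11010010) so that it does not depend on the platform's long width or on the non-standard 0b prefix, and the function names are made up for the example.

```c
#include <stdint.h>

/* Well-defined: the constant is a 64-bit unsigned value before the shift. */
uint64_t shift_in_unsigned(void)
{
    return UINT64_C(0xD2) << 24;    /* 0xD2 is 0b11010010 */
}

/* Models the common outcome of the undefined shift without performing it:
   a negative 32-bit value with bit pattern 0xD2000000 is widened to a
   64-bit unsigned type. This conversion is well-defined (2^64 is added). */
uint64_t widen_negative_pattern(void)
{
    int32_t negative = -771751936;  /* 0xD2000000 read as two's complement */
    return (uint64_t)negative;
}
```

shift_in_unsigned() yields 0xD2000000, while widen_negative_pattern() yields 0xFFFFFFFFD2000000, which is the sign-extended pattern described above.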

Shifting on Integer Constants shows warning. How to clear this?

Reference: Suffix in Integer Constants
unsigned long long y = 1 << 33;
Results in warning:
left shift count >= width of type [-Wshift-count-overflow]
Two Questions need to be cleared from the above context:
The unsigned long long type has 64 bits, so why can't we do the left shift in it?
How does shifting work on int constants ('1')?
In C, 1 is an int, which is 32 bits on most platforms. The shift is evaluated in the type of its left operand, before the result is stored in the unsigned long long, and shifting a 32-bit int by 33 bits is undefined behavior. You can fix this in 2 ways:
Use 1ULL instead, which is an unsigned long long constant:
unsigned long long y = 1ULL << 33;
Assign the value, then shift it:
unsigned long long y = 1;
y <<= 33;
Both are valid, but I'd suggest the first one since it's shorter and you can make y const.
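If the shift count is not a compile-time constant, the same idea can be wrapped in a small function. This is a minimal sketch, assuming the usual 64-bit unsigned long long; bit_at is a hypothetical name, and returning 0 for an out-of-range count is just one possible choice.

```c
/* Hypothetical helper: returns a value with only bit n set.
   The shift is done in unsigned long long, so counts 0..63 are valid
   (assuming a 64-bit unsigned long long). */
unsigned long long bit_at(unsigned int n)
{
    if (n >= 64)
        return 0;       /* out of range: avoid an undefined shift */
    return 1ULL << n;
}
```

With this, bit_at(33) gives the value the original statement was meant to produce.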

How do I mute this error : "integer literal is too large to be represented in a signed integer type"

I have this school assignment in C where my work will be graded with the following flags:
-Wall -Wextra -Werror
So this harmless warning becomes an error and prevents compilation :
integer literal is too large to be represented in a signed integer type
(code still works) but if I can't mute it my work will be considered wrong
Here is my code :
static unsigned long long piece_to_map(unsigned short little)
{
static unsigned short row;
unsigned long long big;
char i;
unsigned long long mask_left;
unsigned long long mask_top;
mask_left = 9259542123273814144;
mask_top = 18374686479671623680;
row = 15;
big = 0;
i = 0;
while (i < 16)
{
big |= (little & (row << i)) << i;
i += 4;
}
while ((big & mask_top) == 0)
big = big << 8;
while ((big & mask_left) == 0)
big = big << 1;
return (big);
}
What I'm trying to achieve here is to transform an unsigned short (representing a shape in a 4x4 square) to an unsigned long long representing the same shape in a 8x8 square having the shape cornered top-left. It works perfectly and according to my expectations, I just need to avoid having the warning. I was formerly using the (normally equivalent) binary expression instead and didn't get any warning
0b1111111100000000000000000000000000000000000000000000000000000000
and
0b1000000010000000100000001000000010000000100000001000000010000000
The problem is that the 0bxxxx form is not standard C (As I read in this StackOverflow answer), therefore I am not allowed to use it.
I also tried
mask_left = (unsigned long long)9259542123273814144;
mask_top = (unsigned long long)18374686479671623680;
The compiler still tells me that the value is too large to be represented in a signed integer type. What am I doing wrong? Is there any way to fix this at all?
Implicitly, the integer literal is signed, and of course the values are too big for a signed long long, so you need to let the compiler know that they have an unsigned type, like this
mask_left = 9259542123273814144U;
mask_top = 18374686479671623680U;
Rewrite it with explicit size:
mask_left = 9259542123273814144uLL;
mask_top = 18374686479671623680uLL;
Writing (unsigned long long)9259542123273814144 does not help: the compiler first has to give the unsuffixed literal a type, and only then applies the cast. The literal does not fit in any signed type, so the warning is issued before the cast is considered.
In C99 and later, an unsuffixed decimal literal takes the first of int, long, and long long that can represent it, so it cannot be larger than LLONG_MAX (9223372036854775807 with a 64-bit long long). For a number larger than that, you need the ULL suffix, which designates an unsigned long long, the type you're assigning to.
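Since the binary form is not allowed, hexadecimal with a ULL suffix is one standard way to keep the bit pattern readable. The sketch below assumes a 64-bit unsigned long long; MASK_LEFT, MASK_TOP and align_top_left are illustrative names, not part of the original assignment, and the zero check is an addition to keep the loops from spinning forever.

```c
/* Same values as the decimal literals in the question, written in hex
   with an explicit unsigned long long suffix. */
static const unsigned long long MASK_LEFT = 0x8080808080808080ULL; /* 9259542123273814144 */
static const unsigned long long MASK_TOP  = 0xFF00000000000000ULL; /* 18374686479671623680 */

/* Hypothetical helper: shifts a non-empty shape to the top-left corner,
   mirroring the two loops in the question. */
unsigned long long align_top_left(unsigned long long big)
{
    if (big == 0)
        return 0;               /* the loops below would never terminate */
    while ((big & MASK_TOP) == 0)
        big <<= 8;              /* move up one row */
    while ((big & MASK_LEFT) == 0)
        big <<= 1;              /* move left one column */
    return big;
}
```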

Unsigned int from 32 bit to 64bit OS

This code snippet is excerpted from a linux book.
If this is not appropriate to post the code snippet here, please let me know.
I will delete it. Thanks.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char buf[30];
char *p;
int i;
unsigned int index = 0;
//unsigned long index = 0;
printf("index-1 = %lx (sizeof %d)\n", index-1, sizeof(index-1));
for(i = 'A'; i <= 'Z'; i++)
buf[i - 'A'] = i;
p = &buf[1];
printf("%c: buf=%p p=%p p[-1]=%p\n", p[index-1], buf, p, &p[index-1]);
return 0;
}
On 32-bit OS environment:
This program works fine no matter the data type of index is unsigned int or unsigned long.
On 64-bit OS environment:
The same program will run into "core dump" if index is declared as unsigned int.
However, if I only change the data type of index from unsigned int to a) unsigned long or b) unsigned short,
this program works fine too.
The book only says that the core dump on 64-bit is caused by a non-negative number. But I have no idea exactly why unsigned long and unsigned short work but unsigned int does not.
What confuses me is that
p + (0u -1) == p + UINT_MAX when index is unsigned int.
BUT,
p + (0ul - 1) == p[-1] when index is unsigned long.
I am stuck here.
If anyone can help to elaborate the details, it is highly appreciated!
Thank you.
Here comes some result on my 32 bit(RHEL5.10/gcc version 4.1.2 20080704)
and 64 bit machine (RHEL6.3/gcc version 4.4.6 20120305)
I am not sure if gcc version makes any difference here.
So, I paste the information as well.
On 32 bit:
I tried two changes:
1) Modify unsigned int index = 0 to unsigned short index = 0.
2) Modify unsigned int index = 0 to unsigned char index = 0.
The program can run without problem.
index-1 = ffffffff (sizeof 4)
A: buf=0xbfbdd5da p=0xbfbdd5db p[-1]=0xbfbdd5da
It seems that the data type of index will be promoted to 4 bytes due to -1.
On 64 bit:
I tried three changes:
1) Modify unsigned int index = 0 to unsigned char index = 0.
It works!
index-1 = ffffffff (sizeof 4)
A: buf=0x7fffef304ae0 p=0x7fffef304ae1 p[-1]=0x7fffef304ae0
2) Modify unsigned int index = 0 to unsigned short index = 0.
It works!
index-1 = ffffffff (sizeof 4)
A: buf=0x7fff48233170 p=0x7fff48233171 p[-1]=0x7fff48233170
3) Modify unsigned int index = 0 to unsigned long index = 0.
It works!
index-1 = ffffffff (sizeof 8)
A: buf=0x7fffb81d6c20 p=0x7fffb81d6c21 p[-1]=0x7fffb81d6c20
BUT, only
unsigned int index = 0 runs into the core dump at the last printf.
index-1 = ffffffff (sizeof 4)
Segmentation fault (core dumped)
Do not lie to the compiler!
Passing printf an unsigned int where it expects an unsigned long (%lx) is undefined behavior.
(Creating a pointer pointing outside any valid object (and not just behind one) is UB too...)
Correct the format specifiers and the pointer arithmetic (that includes indexing as a special case) and everything will work.
UB includes "It works as expected" as well as "Catastrophic failure".
BTW: If you politely ask your compiler for all warnings, it would warn you. Use -Wall -Wextra -pedantic or similar.
One other problem your code has is in your printf():
printf("index-1 = %lx (sizeof %d)\n", index-1, sizeof(index-1));
Let's simplify:
int i = 100;
printf("%lx", i - 100);
You are telling printf that the argument is a long, but in reality you are passing an int. clang gives you the correct warning (I think gcc should emit it as well). See:
test1.c:6:19: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat]
printf("%lx", i - 100);
~~~ ^~~~~~~
%x
1 warning generated.
The solution is simple: either pass a long to printf or tell printf to print an int:
printf("%lx", (long)(i-100) );
printf("%x", i-100);
You got lucky on 32-bit and your app did not crash. Porting it to 64-bit revealed a bug in your code, and you can now fix it.
Arithmetic on unsigned values is always defined, in terms of wrap-around. E.g. (unsigned)-1 is the same as UINT_MAX. So an expression like
p + (0u-1)
is equivalent to
p + UINT_MAX
(&p[0u-1] is equivalent to &*(p + (0u-1)) and p + (0u-1)).
Maybe this is easier to understand if we replace the pointers with unsigned integer types. Consider:
uint32_t p32; // say, this is a 32-bit "pointer"
uint64_t p64; // a 64-bit "pointer"
Assuming 16, 32, and 64 bits for short, int, and long, respectively (entries on the same line are equal):
p32 + (unsigned short)-1  ==  p32 + USHRT_MAX  ==  p32 + (UINT_MAX>>16)
p32 + (0u-1)              ==  p32 + UINT_MAX   ==  p32 - 1
p32 + (0ul-1)             ==  p32 + ULONG_MAX  ==  p32 + UINT_MAX  ==  p32 - 1
p64 + (0u-1)              ==  p64 + UINT_MAX
p64 + (0ul-1)             ==  p64 + ULONG_MAX  ==  p64 - 1
You can always replace operands of addition, subtraction and multiplication on unsigned types by something congruent modulo the maximum value + 1. For example,
-1 ≡ 0xffffffff (mod 2^32)
(0xffffffff is 2^32 - 1 or UINT_MAX), and also
0xffffffffffffffff ≡ 0xffffffff (mod 2^32)
(for a 32-bit unsigned type you can always truncate to the least-significant 8 hex digits).
Your examples:
32-bit
unsigned short index = 0;
In index - 1, index is promoted to int. The result has type int and value -1 (which is negative). Same for unsigned char.
64-bit
unsigned char index = 0;
unsigned short index = 0;
Same as for 32-bit. index is promoted to int, index - 1 is negative.
unsigned long index = 0;
The output
index-1 = ffffffff (sizeof 8)
is weird, it’s your only correct use of %lx but looks like you’ve printed it with %x (expecting 4 bytes); on my 64-bit computer (with 64-bit long) and with %lx I get:
index-1 = ffffffffffffffff (sizeof 8)
0xffffffffffffffff is -1 modulo 2^64.
unsigned index = 0;
An int cannot hold every value an unsigned int can, so in index - 1 nothing is promoted to int; instead the 1 is converted to unsigned int. The result has type unsigned int and wraps around to UINT_MAX (0xffffffff), which is congruent to -1 modulo 2^32 but is a large positive value. For 32-bit addresses, adding this value is the same as subtracting one:
   bfbdd5db          bfbdd5db
+  ffffffff        -        1
= 1bfbdd5da
=  bfbdd5da        = bfbdd5da
(Note the wrap-around/truncation.) For 64-bit addresses, however:
  00007fff b81d6c21
+          ffffffff
= 00008000 b81d6c20
with no wrap-around. This is trying to access an invalid address, so you get a segfault.
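The two additions above can be reproduced with plain unsigned integers standing in for addresses. This is only a model of the arithmetic, not real pointer code; offset_addr32 and offset_addr64 are hypothetical names chosen for the illustration.

```c
#include <stdint.h>

/* Models a 32-bit address: the sum is reduced modulo 2^32,
   so adding 0xffffffff has the same effect as subtracting 1. */
uint32_t offset_addr32(uint32_t addr, uint32_t offset)
{
    return (uint32_t)(addr + offset);
}

/* Models a 64-bit address: the 32-bit offset is widened first,
   so nothing wraps and the result lies almost 4 GiB past addr. */
uint64_t offset_addr64(uint64_t addr, uint32_t offset)
{
    return addr + offset;
}
```

Calling offset_addr32(0xbfbdd5db, 0xffffffff) gives 0xbfbdd5da, while offset_addr64(0x00007fffb81d6c21, 0xffffffff) gives 0x00008000b81d6c20, matching the two columns worked out by hand.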
Maybe have a look at 2’s complement on Wikipedia.
Under my 64-bit Linux, using a specifier expecting a 32-bit value while passing a 64-bit type (and the other way round) seems to “work”, only the 32 least-significant bits are read. But use the correct ones. lx expects an unsigned long, unmodified x an unsigned int, hx an unsigned short (an unsigned short is promoted to int when passed to printf (it’s passed as a variable argument), due to default argument promotions). The length modifier for size_t is z, as in %zu:
printf("index-1 = %lx (sizeof %zu)\n", (unsigned long)(index-1), sizeof(index-1));
(The conversion to unsigned long doesn’t change the value of an unsigned int, unsigned short, or unsigned char expression.)
sizeof(index-1) could also have been written as sizeof(+index): the only thing affecting the size of the expression here is the integer promotion of index, which is also triggered by unary +.

Concatenate two 32bit numbers to get a 64bit result

I need to concatenate two hexadecimal numbers, 32 bits each, to get a final result of 64 bits.
I tried the following code but didn't get a good result:
unsigned long a,b;
unsigned long long c;
c = (unsigned long long) (a << 32 | b);
Can anybody help me please?
Thanks.
Use proper fixed size types and be careful about type promotion and operator precedence, e.g.
#include <stdint.h>
uint32_t a, b;
uint64_t c;
c = ((uint64_t)a << 32) | b;
You need to cast a to unsigned long long before shifting it:
unsigned long long c = ((unsigned long long)a << 32 | b);
Shortest form is:
c = a+0ULL<<32|b;
The third line should be changed to
((unsigned long long)a) << 32 | ((unsigned long long) b)
What your current code does (assuming unsigned long is 32 bits) is take the 32-bit variable a and shift it 32 bits to the left, then OR it with the 32-bit variable b. Shifting a 32-bit value by 32 bits is undefined behavior in C, and in any case the upper half cannot survive in a 32-bit type; the cast to unsigned long long comes too late, after the shift has already been done.
What the changed version does is cast the 32-bit variable a to 64 bits, shift it 32 bits to the left, cast the 32-bit variable b to 64 bits, then OR the two 64-bit values together. The result is naturally 64 bits.
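As a worked example of the widen-then-shift order described above, here is a minimal sketch with fixed-width types; concat_u32 is a hypothetical name used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: places high in bits 63..32 and low in bits 31..0. */
uint64_t concat_u32(uint32_t high, uint32_t low)
{
    /* Widen before shifting, so the shift happens in a 64-bit type. */
    return ((uint64_t)high << 32) | low;
}
```

For instance, concat_u32(0x12345678, 0x9abcdef0) yields 0x123456789abcdef0. The low operand needs no explicit cast because the | operator converts it to the wider type of the left operand.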
I would imagine that this would do the trick:
typedef unsigned long U64 ; // your unsigned 64-bit int typedef here
typedef unsigned int U32 ; // your unsigned 32-bit int typedef here
U64 join( U32 a , U32 b )
{
U64 result = ((U64)a) << 32
| ((U64)b)
;
return result ;
}
I'll leave it to you to divine the appropriate typedefs for U64 and U32.

Resources