I know this is a simple question but I'm confused. I have a fairly typical gcc warning that's usually easy to fix:
warning: comparison between signed and unsigned integer expressions
Whenever I have a hexadecimal constant with the most significant bit set, like 0x80000000L, the compiler interprets it as unsigned. For example, compiling this code with -Wextra causes the warning (gcc 4.4.x, 4.5.x):
#include <stdio.h>

int main(void)
{
    long test = 1;
    long *p = &test;
    if (*p != 0x80000000L) printf("test");
    return 0;
}
I've specifically suffixed the constant as long, so why is this happening?
The answer to Unsigned hexadecimal constant in C? is relevant. A hex constant with L suffix will have the first of the following types that can hold its value:
long
unsigned long
long long
unsigned long long
See the C99 draft, section 6.4.4.1, for details.
On your platform, long is probably 32 bits, so it is not large enough to hold the (positive) constant 0x80000000. So your constant has type unsigned long, which is the next type on the list and is sufficient to hold the value.
On a platform where long was 64 bits, your constant would have type long.
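To make that concrete, here is a minimal sketch (assuming a 32-bit two's complement long, as on the asker's platform) showing two ways to keep the comparison consistent and silence the warning:

#include <stdio.h>

int main(void)
{
    long test = 1;
    long *p = &test;

    /* Build the intended negative value without writing a constant that
       overflows long: -2147483647L - 1 stays within long's range. */
    if (*p != -2147483647L - 1)
        printf("test\n");

    /* Or make both sides unsigned so the comparison is deliberate. */
    if ((unsigned long)*p != 0x80000000UL)
        printf("test\n");

    return 0;
}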
Because your compiler uses 32-bit longs (and presumably 32-bit ints as well) and 0x80000000 won't fit in a 32-bit signed integer, the compiler interprets it as unsigned. How to work around this depends on what you're trying to do.
According to the C standard, a hex constant whose value doesn't fit in the signed candidate types gets an unsigned type.
It's an unsigned long, then. I'm guessing the compiler decides that a hex literal like that is most likely meant to be unsigned. Try casting it: (unsigned long)0x80000000L
Hex constants in C/C++ end up unsigned whenever their value doesn't fit in the signed candidate types, but you may use an explicit cast to suppress the warning.
Related
The output turns out to be 4294967168, the 32-bit two's complement bit pattern of -128 read as unsigned. How?
#include <stdio.h>

int main()
{
    char a;
    a = 128;
    if (a == -128)
    {
        printf("%u\n", a);
    }
    return 0;
}
Compiling your code with warnings turned on gives:
warning: overflow in conversion from 'int' to 'char' changes value from '128' to '-128' [-Woverflow]
which tells you that the assignment a = 128; isn't well defined on your platform.
The standard says:
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
So we can't know what is going on as it depends on your system.
However, if we do some guessing (and note this is just a guess):
128 as an 8-bit pattern would be 0b1000.0000,
so when you call printf, the char argument is promoted to int, which sign-extends it:
0b1000.0000 ==> 0b1111.1111.1111.1111.1111.1111.1000.0000
which, printed as unsigned, represents the number 4294967168.
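A short program making that sign extension visible (this assumes a typical two's complement machine with an 8-bit signed char and 32-bit int):

#include <stdio.h>

int main(void)
{
    signed char a = -128;              /* the implementation-defined result of a = 128 */

    /* Converting the negative char to unsigned int wraps modulo 2^32,
       which is the same bit pattern that sign extension produces. */
    printf("%x\n", (unsigned int)a);   /* ffffff80 */
    printf("%u\n", (unsigned int)a);   /* 4294967168 */
    return 0;
}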
The sequence of steps that got you there is something like this:
You assign 128 to a char.
On your implementation, char is signed char and has a maximum value of 127, so 128 overflows.
Your implementation interprets 128 as 0x80. It uses two’s-complement math, so (int8_t)0x80 represents (int8_t)-128.
For historical reasons (relating to the instruction sets of the DEC PDP minicomputers on which C was originally developed), C promotes signed types shorter than int to int in many contexts. This includes variadic arguments to functions such as printf(), which aren't covered by a prototype and still follow the old argument-promotion rules of K&R C.
On your implementation, int is 32 bits wide and also two’s-complement, so (int)-128 sign-extends to 0xFFFFFF80.
When you make a call like printf("%u", x), the runtime interprets the int argument as an unsigned int.
As an unsigned 32-bit integer, 0xFFFFFF80 represents 4,294,967,168.
The "%u\n" format specifier prints this out without commas (or other separators) followed by a newline.
This is all legal, but so are many other possible results. The code is buggy and not portable.
Make sure you don’t overflow the range of your type! (Or if that’s unavoidable, overflow for unsigned scalars is defined as modular arithmetic, so it’s better-behaved.) The workaround here is to use unsigned char, which has a range from 0 to (at least) 255, instead of char.
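A minimal sketch of the unsigned char workaround mentioned above:

#include <stdio.h>

int main(void)
{
    unsigned char a;               /* range 0 to at least 255, so 128 fits */
    a = 128;
    printf("%u\n", (unsigned)a);   /* prints 128 */
    return 0;
}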
First of all, as I hope you understand, the code you've posted is full of errors, and you would not want to depend on its output. If you were trying to perform any of these manipulations in a real program, you would want to do so in some other, more well-defined, more portable way.
So I assume you're asking only out of curiosity, and I answer in the same spirit.
Type char on your machine is probably a signed 8-bit quantity. So its range is from -128 to +127. So +128 won't fit.
When you try to jam the value +128 into a signed 8-bit quantity, you probably end up with the value -128 instead. And that seems to be what's happening for you, based on the fact that your if statement is evidently succeeding.
So next we try to take the value -128 and print it as if it were an unsigned int, which on your machine is evidently a 32-bit type. It can hold numbers in the range 0 to 4294967295, which obviously does not include -128. But unsigned integers typically behave pretty nicely modulo their range, so if we add 4294967296 to -128 we get 4294967168, which is precisely the number you saw.
Now that we've worked through this, let's resolve in future not to jam numbers that won't fit into char variables, or to print signed quantities with the %u format specifier.
I am wondering why we cast constants when using the #define preprocessor directive.
For example:
#define WIDTH 128
Would it be equal to -128 on an 8-bit platform?
What if I change it like this:
#define WIDTH 128U
What would it be equal to on an 8-bit platform?
What is the default sizeof of integer constants like the above? Does their size/type depend on the platform architecture, or on the value of the literal they hold?
Sorry about my bad English.
Defining WIDTH as 128 poses no problems; int is at least 16 bits wide on all conforming platforms.
Defining WIDTH as 128U would make it an unsigned integer constant. Since the value fits in an unsigned int (mandated to be at least 16 bits wide), it has type unsigned int. sizeof(WIDTH) evaluates to sizeof(unsigned int), which is entirely platform-specific.
Using this suffix is not recommended. It would have surprising side effects:
if (WIDTH > -1) {
printf("This will never print\n");
}
Since WIDTH expands to 128U, an unsigned constant, the comparison is performed as an unsigned comparison, -1 is converted to unsigned and becomes UINT_MAX, a value much larger than 128. Don't do this.
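For completeness, a self-contained version of that pitfall (assuming a typical 32-bit unsigned int; the compiler will usually warn about the signed/unsigned comparison):

#include <stdio.h>

#define WIDTH 128U

int main(void)
{
    /* -1 is converted to unsigned int and becomes UINT_MAX, so the
       comparison 128U > UINT_MAX is false. */
    if (WIDTH > -1)
        printf("This will never print\n");
    else
        printf("WIDTH > -1 is false because -1 became UINT_MAX\n");
    return 0;
}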
If you subsequently store WIDTH into a char variable, you may have a problem. It would actually not make a difference whether you define it as 128 or 128U: if the char type is 8-bit and signed, the value is out of range and the result of the conversion is implementation-defined. On most platforms, the value stored would indeed be -128, but you cannot even rely on that.
More importantly, you should use all the help the compiler can give you by enabling all compiler warnings and making them errors:
gcc -Wall -Wextra -Werror
or
clang -Weverything -Werror
Very few of these warnings are annoying, most of them are very useful and point to silly mistakes, typos and oversights.
First of all, 128 is not equal to -128 on an 8-bit platform.
Second, this has nothing to do with the preprocessor. All the preprocessor does is replace WIDTH with whatever it's defined as. So the real question is why you would write 128 or 128u in your source.
The suffix u is not about type casting; it indicates the type of the literal. In this example, 128 is a literal with value 128 of type int, while 128u is a literal with value 128 of type unsigned int. It doesn't matter immediately here, but if you start using values that end up larger than 32767 you could run into problems. For example:
#define WIDTH 256u
#define HEIGHT 192u
unsigned npixels = WIDTH * HEIGHT;
It should be noted that the suffixes are required to make this portable: a platform might use only 16-bit ints, and with plain int the multiplication would overflow, which means undefined behavior.
Also note that newer C standards (but not the antique ones) will widen the literal as much as necessary if possible. For example, the literal 32768 denotes a signed integer type with value 32768; if int isn't large enough to hold that value, a larger signed type is used.
The sizeof of these constants is the same as sizeof(int), since the types of the literals are int and unsigned int. The actual value of sizeof(int) could be any positive integer.
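A complete version of the WIDTH * HEIGHT example above, as a sketch (the suffixes keep the multiplication unsigned, which matters if int is only 16 bits wide):

#include <stdio.h>

#define WIDTH  256u
#define HEIGHT 192u

int main(void)
{
    /* 256 * 192 = 49152 would overflow a signed 16-bit int (undefined
       behavior), but fits comfortably in an unsigned 16-bit int. */
    unsigned npixels = WIDTH * HEIGHT;
    printf("npixels = %u\n", npixels);
    return 0;
}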
Giving C99 chapters because I don't have the C11 document at hand.
ISO/IEC 9899:1999, 6.4.4.1 Integer constants
The type of an integer constant is the first of the corresponding list in which its value can be represented.
For decimal constants without suffix:
int
long int
long long int
For decimal constants with u or U suffix:
unsigned int
unsigned long int
unsigned long long int
ISO/IEC 9899:1999, 5.2.4.2.1 Sizes of integer types
The width of integer types is implementation-defined, as is the binary representation.
INT_MAX -- the largest value an int can take -- is guaranteed to be at least +32767.
UINT_MAX -- the largest value an unsigned int can take -- is guaranteed to be at least 65535.
ISO/IEC 9899:1999, 6.3.1.8 Usual arithmetic conversions
If you compare an int with an unsigned int, the int will be implicitly converted to unsigned int for the comparison. Similar rules apply for the short / long / long long types. As @chqrlie pointed out, this can be a problem; your compiler should give you a warning if this happens (you are always compiling with -Wall -Wextra, or /W3 on MSVC, enabled, aren't you?).
Summary
Your constants will fit into an int / unsigned int even on an 8-bit machine. If they did not, the compiler would use the next larger type for them (instead of truncating the value).
As for why we do it...
If you, for example, intend to use WIDTH for comparisons with the result of sizeof(), or the return code of strlen(), or anything else that is unsigned by nature, you would want WIDTH to have the same value domain, i.e. being able to hold all possible values.
That is why you would want WIDTH to be unsigned as well.
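As an illustration of that last point, here is a hedged sketch (the name and string below are made up for the example): comparing an unsigned WIDTH against strlen(), which returns the unsigned type size_t, keeps both operands in the same value domain:

#include <stdio.h>
#include <string.h>

#define WIDTH 128U

int main(void)
{
    const char *name = "example";      /* hypothetical input */

    /* strlen() returns size_t (unsigned); WIDTH is unsigned as well,
       so no signed/unsigned mixing occurs in the comparison. */
    if (strlen(name) < WIDTH)
        printf("name fits\n");
    return 0;
}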
Instead of getting the number "2147483648" I get "-2147483648" because of a signed int overflow. I tried declaring the variable both as long int and as unsigned int, to no avail. They are just not treated as such types. In case anyone is wondering, I'm left-shifting the value.
int multiplier = 1, i;
long int mpy = 0;
for (i = 32; i >= 0; i--) {
    mpy = 1 << multiplier++;
    printf("mpy = %d\n", mpy);
}
Since the constant 1 is an int, when shifted left, it remains an int. If you want an unsigned long long, make it such:
unsigned long long mpy = 1ULL << multiplier++;
You could use one of the suffixes L or UL or LL for long, unsigned long and long long instead (and lower-case versions of these, but the suffix is best written in upper-case to avoid confusion of l and 1). The choice depends on what you're really trying to do.
Note that the type of the result of << is the type of the left-hand operand. The result of the shift is only subsequently converted to the type of the left-hand side of the assignment operator. The LHS of the assignment does not affect how the value on the RHS is calculated.
As user3528438 pointed out in a comment, and as I assumed (perhaps mistakenly) you would know — if multiplier (the RHS of the << operator) evaluates to a negative value or a value equal to or larger than the number of bits in the integer type, then you invoke undefined behaviour.
Note that long long and unsigned long long are standard in C99 (now a decade and a half old) and the newer C11 standard, but they were not part of the quarter-century-old C89/C90 standard. If you're stuck on a platform where the compiler is in a time-warp (release date in the 201x decade and C standard compatibility date of 1990), then you have to go with alternative platform-specific 64-bit techniques. The loop in the updated question covers 33 values, since you count from 32 down to and including 0; no 32-bit type has distinct values for each of the 33 shifts.
(Advanced users might be interested in INT35-C Use correct integer precisions and N1899 — Integer precision bits update; they're a tad esoteric for most people as yet. I'm not sure whether I'll ever find it necessary to worry about the issue raised.)
Note also the discussion in the comments below about printf() formats. You should make sure you print the value with the correct format. For long int, that should be %ld; for unsigned long long, that would be %llu. Other types require other formats. Make sure you're using sensible compiler warning options. If you're using GCC, you should look at gcc -Wall -Wextra -Werror -std=c11 as a rather effective set of options; I use slightly more stringent options than even those when I'm compiling C code.
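Putting that advice together, a sketch of a corrected loop: 1ULL forces a 64-bit shift, %llu matches the type, and the shift count stays below 64 to avoid undefined behaviour.

#include <stdio.h>

int main(void)
{
    int i;
    for (i = 0; i < 64; i++) {
        unsigned long long mpy = 1ULL << i;   /* defined for 0 <= i <= 63 */
        printf("mpy = %llu\n", mpy);
    }
    return 0;
}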
Depending on what compiler you are using, and if you are compiling in 32-bit vs. 64-bit mode, what you are seeing could be exactly as expected.
https://software.intel.com/en-us/articles/size-of-long-integer-type-on-different-architecture-and-os
tl;dr: with MSVC, both int and long are 32 bits; you need to graduate up to __int64 if you want to store a bigger number. With gcc or other compilers in 32-bit mode, you run into the same issue of int = long = 32 bits, which doesn't help you in your situation. Only when you move to 64-bit compilation on non-Microsoft compilers do int and long start to diverge.
edit per comments section:
int64_t or long long would also be standards-compliant types that could be used. Alternatively, unsigned would allow the poster to fit their value into 32 bits.
I was working on an embedded project when I ran into something which I thought was strange behaviour. I managed to reproduce it on codepad (see below) to confirm, but don't have any other C compilers on my machine to try it on them.
Scenario: I have a #define for the most negative value a 32-bit integer can hold, and then I try to use this to compare with a floating point value as shown below:
#include <stdio.h>

#define INT32_MIN (-2147483648L)

void main()
{
    float myNumber = 0.0f;
    if (myNumber > INT32_MIN)
    {
        printf("Everything is OK");
    }
    else
    {
        printf("The universe is broken!!");
    }
}
Codepad link: http://codepad.org/cBneMZL5
To me it looks as though this code should work fine, but to my surprise it prints out The universe is broken!!.
This code implicitly casts the INT32_MIN to a float, but it turns out that this results in a floating point value of 2147483648.0 (positive!), even though the floating point type is perfectly capable of representing -2147483648.0.
Does anyone have any insights into the cause of this behaviour?
CODE SOLUTION: As Steve Jessop mentioned in his answer, limits.h and stdint.h contain correct (working) int range defines already, so I'm now using these instead of my own #define
PROBLEM/SOLUTION EXPLANATION SUMMARY: Given the answers and discussions, I think this is a good summary of what's going on (note: still read the answers/comments because they provide a more detailed explanation):
I'm using a C89 compiler with 32-bit longs, so any value greater than LONG_MAX and less than or equal to ULONG_MAX, followed by the L suffix, has the type unsigned long.
(-2147483648L) is actually a unary minus applied to an unsigned long (see previous point) value: -(2147483648L). This negation 'wraps' the value around to the unsigned long value 2147483648 (because 32-bit unsigned longs have the range 0 to 4294967295).
This unsigned long number looks like the expected negative int value when it gets printed as an int or passed to a function, because it is first converted to an int, which wraps the out-of-range 2147483648 around to -2147483648 (because 32-bit ints have the range -2147483648 to 2147483647).
The cast to float, however, is using the actual unsigned long value 2147483648 for conversion, resulting in the floating-point value of 2147483648.0.
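A small demonstration of that summary (assuming 32-bit unsigned long and a two's complement implementation; the conversion to int is implementation-defined):

#include <stdio.h>

int main(void)
{
    /* Unary minus on the unsigned value 2147483648 wraps modulo 2^32
       and lands back on 2147483648. */
    unsigned long bad = -2147483648UL;

    printf("as unsigned long: %lu\n", bad);        /* 2147483648 */
    printf("as int:           %d\n", (int)bad);    /* typically -2147483648 */
    printf("as float:         %f\n", (float)bad);  /* 2147483648.000000 */
    return 0;
}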
Replace
#define INT32_MIN (-2147483648L)
with
#define INT32_MIN (-2147483647 - 1)
-2147483648 is interpreted by the compiler to be the negation of 2147483648, which causes overflow on an int. So you should write (-2147483647 - 1) instead.
This is all C89 standard though. See Steve Jessop's answer for C99.
Also, long is typically 32 bits on 32-bit machines and 64 bits on 64-bit machines. int gets the job done here.
In C89 with a 32 bit long, 2147483648L has type unsigned long int (see 3.1.3.2 Integer constants). So once modulo arithmetic has been applied to the unary minus operation, INT32_MIN is the positive value 2147483648 with type unsigned long.
In C99, 2147483648L has type long if long is bigger than 32 bits, or long long otherwise (see 6.4.4.1 Integer constants). So there is no problem and INT32_MIN is the negative value -2147483648 with type long or long long.
Similarly in C89 with long larger than 32 bits, 2147483648L has type long and INT32_MIN is negative.
I guess you're using a C89 compiler with a 32 bit long.
One way to look at it is that C99 fixes a "mistake" in C89. In C99 a decimal literal with no U suffix always has signed type, whereas in C89 it may be signed or unsigned depending on its value.
What you should probably do, btw, is include limits.h and use INT_MIN for the minimum value of an int, and LONG_MIN for the minimum value of a long. They have the correct value and the expected type (INT_MIN is an int, LONG_MIN is a long). If you need an exact 32 bit type then (assuming your implementation is 2's complement):
for code that doesn't have to be portable, you could use whichever type you prefer that's the correct size, and assert its size to be on the safe side.
for code that has to be portable, search for a version of the C99 header stdint.h that works on your C89 compiler, and use int32_t and INT32_MIN from that.
if all else fails, write stdint.h yourself, and use the expression in WiSaGaN's answer. It has type int if int is at least 32 bits, otherwise long.
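A minimal sketch of the limits.h / stdint.h approach described above (this assumes a C99 <stdint.h> is available):

#include <stdio.h>
#include <limits.h>   /* INT_MIN, LONG_MIN */
#include <stdint.h>   /* INT32_MIN, int32_t */

int main(void)
{
    float myNumber = 0.0f;

    /* INT32_MIN from <stdint.h> is a genuinely negative constant, so the
       comparison behaves as expected. */
    if (myNumber > INT32_MIN)
        printf("Everything is OK\n");
    else
        printf("The universe is broken!!\n");
    return 0;
}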
On Ubuntu Server 11.10 for i386, sizeof(int) is 4, so I guess 4294967296 is the maximum number for the int type, isn't it? But printf("%u\n", 4294967296) outputs 0 with a warning:
warning: format '%u' expects argument of type 'unsigned int', but
argument 2 has type 'long long unsigned int' [-Wformat]
Any suggestions?
The version of gcc is (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1
4294967296 is exactly 2^32, which is 1 beyond the maximum supported by unsigned int (assuming 32-bit). So C treats that integer literal as if it were of the next integer type that can represent it (long long int). (I'm not sure why the warning complains that it's long long unsigned int).
You've therefore invoked undefined behaviour by providing a mismatch between your format specifier and the argument types (in practice, you're simply seeing modulo wrap-around).
Change your format specifier to "%llu\n".
You can use the constants in limits.h to determine the maximum possible and minimum possible values for various types for your implementation. Try
printf("%u\n", UINT_MAX);
and you'll find that the output is:
4294967295
But you are asking printf() to print 4294967296 which is beyond the maximum possible unsigned integer for your implementation.
The value of UINT_MAX makes sense since you have told us that sizeof (int) is 4. The maximum possible unsigned 4-byte (32-bit) integer is 11111111 11111111 11111111 11111111 when expressed in binary, FFFFFFFF when expressed in hexadecimal, or 4294967295 when expressed in decimal.
You have got two ways to fix this code. A simple way is:
printf("%llu\n", 4294967296ull);
A more portable way to fix this, using ISO C99 facilities, is:
printf("%" PRIu64 "\n", (uint64_t) 4294967296);
For the latter, you need to #include <inttypes.h>.
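Putting those two fixes together into one complete program (assuming a C99 compiler and a 32-bit unsigned int):

#include <stdio.h>
#include <limits.h>
#include <inttypes.h>

int main(void)
{
    printf("%u\n", UINT_MAX);                        /* 4294967295 */
    printf("%llu\n", 4294967296ULL);                 /* 4294967296 */
    printf("%" PRIu64 "\n", (uint64_t)4294967296);   /* 4294967296 */
    return 0;
}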
The type of an unsuffixed integer literal such as 4294967296 depends on the value of the literal and the ranges of the predefined types for the implementation.
The type of a decimal constant is the first of:
int
long int
long long int
(Octal and hexadecimal literals can be of unsigned types; see section 6.4.4.1 of the C99 standard for details.)
in which its value can be represented exactly. The language requires that int is at least 16 bits, long int is at least 32 bits, and long long int is at least 64 bits, but each can be wider.
On your system (Ubuntu, 32 bits, same as mine), int and long int are both 32 bits, and long long int is 64 bits, so 4294967296 is of type long long int. (The message you showed us indicates that it's long long unsigned int, which doesn't seem right; are you sure that's the message you got for that specific code?)
The behavior of calling printf with an argument of a type that's inconsistent with the format is undefined. You probably got 0 as output because the value passed was 0x0000000100000000, and printf happened to use the low-order 32 bits of the 64-bit value. Other behaviors are possible.
You ask for "Any suggestion", but it's not clear from your question what you want to do. A way to produce the output you probably expected is:
puts("4294967296");
but I'm sure that's not the kind of answer you're looking for.
Usually when you print a number with printf, the argument will be a variable rather than a literal, and the type will be determined by the declared type of the variable. Or it might be a more complicated expression, in which case the type is determined by the types of the operands; if they're of different types, there's a moderately complicated set of rules that determine what the final result is. Just make sure the format you use corresponds to the (promoted) type of the argument.
If you have a more specific question, we'll be glad to give a more specific answer.
If you want to do it in a portable way, take a look here.