Does strtol("-2147483648", 0, 0) overflow if LONG_MAX is 2147483647?

Per the specification of strtol:
If the subject sequence has the expected form and the value of base is 0, the sequence of characters starting with the first digit shall be interpreted as an integer constant. If the subject sequence has the expected form and the value of base is between 2 and 36, it shall be used as the base for conversion, ascribing to each letter its value as given above. If the subject sequence begins with a minus-sign, the value resulting from the conversion shall be negated. A pointer to the final string shall be stored in the object pointed to by endptr, provided that endptr is not a null pointer.
The issue at hand is that, prior to the negation, the value is not in the range of long. For example, in C89 (where the integer constant can't take on type long long), writing -2147483648 is possibly an overflow; you have to write (-2147483647-1) or similar.
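For instance, assuming 32-bit long, the pitfall looks like this (a sketch, not from the original question):
#include <limits.h>

/* In C89 the constant 2147483648 does not fit in long, so it gets an
   unsigned type; negating it stays non-negative, and converting the
   result back to long is implementation-defined rather than LONG_MIN. */
long bad  = -2147483648;      /* not portably LONG_MIN */
long good = -2147483647L - 1; /* the portable spelling */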
Since the wording using "integer constant" could be interpreted to apply the C rules for the type of an integer constant, this might be enough to save us from undefined behavior here, but the same issue (without such an easy out) would apply to strtoll.
Finally, note that even if it did overflow, the "right" value should be returned. So this question is really just about whether errno may or must be set in this case.

Although I cannot point to a particular bit of wording in the standard today, when I wrote strtol for 4BSD back in the 1990s I was pretty sure that this should not set errno, and made sure that I would not. Whether this was based on wording in the standard, or personal discussion with someone, I no longer recall.
In order to avoid overflow, this means the calculation has to be done pretty carefully. I did it in unsigned long and included this comment (still in the libc source in the various BSDs):
/*
* Compute the cutoff value between legal numbers and illegal
* numbers. That is the largest legal value, divided by the
* base. An input number that is greater than this value, if
* followed by a legal input character, is too big. One that
* is equal to this value may be valid or not; the limit
* between valid and invalid numbers is then based on the last
* digit. For instance, if the range for longs is
* [-2147483648..2147483647] and the input base is 10,
* cutoff will be set to 214748364 and cutlim to either
* 7 (neg==0) or 8 (neg==1), meaning that if we have accumulated
* a value > 214748364, or equal but the next digit is > 7 (or 8),
* the number is too big, and we will return a range error.
*
* Set 'any' if any `digits' consumed; make it negative to indicate
* overflow.
*/
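For illustration, here is a stripped-down sketch of that approach (not the actual BSD code: base 10 only, no whitespace skipping, no endptr, and the function name is made up):
#include <ctype.h>
#include <errno.h>
#include <limits.h>

long strtol10(const char *s) {
    int neg = (*s == '-');
    if (neg || *s == '+')
        s++;
    /* cutoff/cutlim as described in the comment above */
    unsigned long cutoff = neg ? (unsigned long)-(LONG_MIN + 1) + 1
                               : (unsigned long)LONG_MAX;
    int cutlim = (int)(cutoff % 10);
    cutoff /= 10;
    unsigned long acc = 0;
    int any = 0;
    for (; isdigit((unsigned char)*s); s++) {
        int c = *s - '0';
        if (any < 0 || acc > cutoff || (acc == cutoff && c > cutlim)) {
            any = -1; /* out of range: remember it, keep consuming digits */
        } else {
            any = 1;
            acc = acc * 10 + (unsigned long)c;
        }
    }
    if (any < 0) {
        errno = ERANGE;
        return neg ? LONG_MIN : LONG_MAX;
    }
    /* Negation happens in unsigned arithmetic, so the magnitude of LONG_MIN
       never overflows; the final conversion to long is implementation-defined
       but yields the expected value on two's complement platforms. */
    return (long)(neg ? -acc : acc);
}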
I was (and still am, to some extent) annoyed by the asymmetry between this behavior in the C library and the syntax of the language itself, where a negative number is two separate tokens: - followed by the number. Writing -2147483648 therefore means -(2147483648), which in C89 becomes -(2147483648U), which is of course 2147483648U and hence positive! (Assuming 32-bit int, of course; the problematic value varies for other bit sizes.)

Based on the comp.std.c thread cited in a comment by ouah, the intent is clearly that it does not overflow. The actual language in the standard is still ambiguous:
If the subject sequence has the expected form and the value of base is zero, the sequence of characters starting with the first digit is interpreted as an integer constant according to the rules of 6.4.4.1. If the subject sequence has the expected form and the value of base is between 2 and 36, it is used as the base for conversion, ascribing to each letter its value as given above. If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type).
In order to get the right behavior, you have to interpret the phrase "interpreted as an integer constant according to the rules of 6.4.4.1" as yielding an actual integer value, not a value within some C-language integer type, and the final "in the return type" as the negation happening with a typeless integer value as the operand, but a coerced type for the result.
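Under that reading, the following test (a sketch; assuming 64-bit long long) must yield LLONG_MIN with errno left at 0, even though the pre-negation value 9223372036854775808 exceeds LLONG_MAX:
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    errno = 0;
    long long v = strtoll("-9223372036854775808", NULL, 0);
    /* Intended result: v == LLONG_MIN and errno still 0. */
    printf("v = %lld, errno = %d\n", v, errno);
    return 0;
}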
Moreover, the error condition does not actually even define an "overflow" condition, but "correct value outside the range". This part of the text seems to be ignoring the unsigned issue addressed in DR006, since it only deals with the final value, not the pre-negation value:
If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.
In short, this seems to still be a mess, due to the usual outcome where the committee says "yeah, it's supposed to mean what you think it should mean" and then never updates the ambiguous or outright wrong text in the standard...

On a 32-bit platform, -2147483648 is not an overflow under C89. It's LONG_MIN, with errno == 0.
Quoting directly from the POSIX specification of strtol():
RETURN VALUE
Upon successful completion strtol() returns the converted value, if
any. If no conversion could be performed, 0 is returned and errno may
be set to [EINVAL]. If the correct value is outside the range of
representable values, LONG_MAX or LONG_MIN is returned (according to
the sign of the value), and errno is set to [ERANGE].
This behavior is borne out by the following test:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

int main(int argc, char *argv[]) {
    long val = strtol(argv[1], NULL, 10);
    fprintf(stderr, "long max: %ld, long min: %ld\n", LONG_MAX, LONG_MIN);
    fprintf(stderr, "val: %ld, errno: %d\n", val, errno);
    perror(argv[1]);
    return 0;
}
When compiled on a 32-bit x86 system using:
gcc -std=c89 foo.c -o foo
it produces the following outputs:
./foo -2147483648
Output:
long max: 2147483647, long min: -2147483648
val: -2147483648, errno: 0
-2147483648: Success
./foo -2147483649
Output:
long max: 2147483647, long min: -2147483648
val: -2147483648, errno: 34
-2147483649: Numerical result out of range

Related

Why are these strange values returned from the <limits.h> definitions?

I do not understand why these strange values are returned from LONG_MAX, LONG_MIN and UINT_MAX. Furthermore, since I am following the book "The C Programming Language", I noticed that the values of the range of int on my PC are precisely those of long in the book. Is this even possible?
SOURCE CODE:
#include <stdio.h>
#include <limits.h>

int main() {
    printf(" [INT]\t|%d | %d|\n", INT_MAX, INT_MIN);
    printf("[UINT]\t|%11d| %11d|\n", UINT_MAX);
    printf("[SHRT]\t|%11d| %11d|\n", SHRT_MAX, SHRT_MIN);
    printf("[LONG]\t|%11d| %11d|\n", LONG_MAX, LONG_MIN);
}
OUTPUT:
[INT] |2147483647 | -2147483648|
[UINT] | -1| 0|
[SHRT] | 32767| -32768|
[LONG] | -1| 0|
LONG_MAX and LONG_MIN should be printed using %ld (or %11ld or similarly) because their type is long. When they are printed with %d, the behavior is not defined by the C standard, and the program may misbehave in various ways.
UINT_MAX should be printed using %u (or %11u or similarly) because its type is unsigned int.
Also, printf("[UINT]\t|%11d| %11d|\n", UINT_MAX); has two conversion specifications but only one value supplied. Either add another argument, or remove the second conversion specification.
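Putting those fixes together, a corrected sketch of the program (field widths chosen to match the original layout):
#include <stdio.h>
#include <limits.h>

int main(void) {
    printf(" [INT]\t|%11d| %11d|\n", INT_MAX, INT_MIN);
    printf("[UINT]\t|%11u| %11u|\n", UINT_MAX, 0u);         /* unsigned: %u */
    printf("[SHRT]\t|%11d| %11d|\n", SHRT_MAX, SHRT_MIN);
    printf("[LONG]\t|%11ld| %11ld|\n", LONG_MAX, LONG_MIN); /* long: %ld */
    return 0;
}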
The values are not correct for UINT_MAX, LONG_MAX and for LONG_MIN. The others are correct (maximal and minimal values of the signed integers).
UINT_MAX is defined in limits.h (or in another header file included by it) roughly so:
#define UINT_MAX 4294967295
It is handled by the C preprocessor: effectively, any time you use UINT_MAX in your C code, 4294967295 is substituted into it, as text.
Thus, your relevant code line will be preprocessed to
printf("[UINT]\t|%11d| %11d|\n", 4294967295, 0);
If there is no suffix saying otherwise, this large 4294967295 will be interpreted as a signed integer. But it is too big for a signed int, whose maximum is about 2 billion. This is an integer overflow bug (the standard calls it "undefined behavior" or so, though our language lawyers will likely correct me soon). And the bit pattern of 4294967295, reinterpreted as a signed 32-bit integer, is exactly -1.
The simplest workaround is to advise the compiler to treat the constant as an unsigned integer, roughly so:
printf("[UINT]\t|%11d| %11d|\n", 4294967295UL, 0);
This is still not okay, because %d expects a signed int, so the format string must be fixed as well: %u is the specifier for unsigned int (and %lu for unsigned long); check the docs and turn on all warnings.
Because C is (intentionally) weak in type checking, it is good advice to turn on all possible warnings (in gcc, the -Wall flag) and always write clean, warning-free code. Do not even start debugging code that is not purely warning-free - it is much easier to fix warnings than mysterious bugs.
P.S. In theory, the C standard does not fix the bit width of the integer types, and possibly not even their bit-level representation, so a huge amount of language lawyering is possible around this topic, and it will likely happen.

Effect of type casting on printf function

Here is a question from my book (its table, given as an image, is not reproduced here; it defines hypothetical conversion specifiers such as %xu, %xd, %yu, %yd, %zu and %zd for small signed and unsigned types X, Y and Z, and declares Y num = 42;).
Actually, I don't know what the effect on the printf function will be, so I tried the closest standard conversion specifiers on a real C implementation. Here is my code:
#include <stdio.h>

int main(void) {
    int x = 4;
    printf("%hi\n", x);
    printf("%hu\n", x);
    printf("%i\n", x);
    printf("%u\n", x);
    printf("%li\n", x);
    printf("%lu\n", x);
}
So, the output is very simple. But is this really the solution to the above problem?
There are numerous problems in this question that make it unsuitable for teaching C.
First, to work on this problem at all, we have to assume a non-standard C implementation is used. In standard C, %x is a complete conversion specification, so %xu and %xd cannot be; the conversion specification has already ended before the u or d. And the use of z in a conversion specification interferes with its standard use for size_t.
Nonetheless, let’s assume this C variant does not have those standard conversion specifications and instead uses the ones shown in the table but that this C variant otherwise conforms to the C standard with minimal changes.
Our next problem is that, in Y num = 42;, we have a plain Y, not the signed Y or unsigned Y shown in the table. Let’s assume signed Y is intended.
Then num is a signed four-bit integer. The greatest value it can represent is 0111₂ = 7₁₀. So it cannot represent 42. Attempting to initialize it with 42 results in a conversion specified by C 2018 6.3.1.3, which says, in part:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
The result is we do not know what value is in num or even whether the program continues to execute; it may trap and terminate.
Well, let's assume this implementation just takes the low bits of the value. 42 is 101010₂, so its low four bits are 1010. So if the bits in num are 1010, it is negative. The C standard permits several methods of representation for negative numbers, but we will assume the overwhelmingly most common one, two's complement, so the bits 1010 in num represent −6.
Now, we get to the printf statements. Except the problem text shows Printf, which is not defined by the C standard. (Are you sure this problem relates to C code at all?) Let’s assume it means printf.
In printf("%xu",num);, if the conversion specification is supposed to work like the ones in standard C, then the corresponding argument should be an unsigned X value that has been promoted to int for the function call. As a two-bit unsigned integer, an unsigned X can represent 0, 1, 2, or 3. Passing it −6 is not defined. So we do not know what the program will print. It might take just the low two bits, 10, and print “2”. Or it might use all the bits and print “-6”. Both of those would be consistent with the requirement that the printf behave as specified for values that are in the range representable by unsigned X.
In printf("%xd",num); and printf("%yu",num);, the same problem exists.
In printf("%yd",num);, we are correctly passing a signed Y value for a signed Y conversion specification, so “-6” is printed.
Then printf("%zu",num); has the same problem with the value mismatched for the type.
Finally, in printf("%zd",num);, the value is again in the correct range, and “-6” is printed.
From all the assumptions we had to make and all the points where the behavior is undefined, you can see this is a terrible exercise. You should question the quality of the book it is in and of any school using it.

Size of a float variable and compilation

I'm struggling to understand the behavior of gcc here. The size of a float is 4 bytes on my architecture, but I can still store an 8-byte real value in a float, and my compiler says nothing about it.
For example I have:
#include <stdio.h>

int main(int argc, char** argv) {
    float someFloatNumb = 0xFFFFFFFFFFFF;
    printf("%i\n", sizeof(someFloatNumb));
    printf("%f\n", someFloatNumb);
    printf("%i\n", sizeof(281474976710656));
    return 0;
}
I expected the compiler to insult me, or to display a disclaimer of some sort, because I shouldn't be able to do something like that; at least I think it's kind of twisted wizardry. The program simply runs:
4
281474976710656.000000
8
So, if I print the size of someFloatNumb, I get 4 bytes, which is expected. But the assigned value isn't 4 bytes, as seen in the output above.
So I have a few questions:
Does sizeof(variable) simply get the variable type and return sizeof(type), which in this case would explain the result?
Does/Can gcc grow the capacity of a type? (managing multiple variables behind the curtain to allow us that sort of thing)
1)
Does sizeof(variable) simply get the variable type and return sizeof(type), which in this case would explain the result ?
Except for variable-length arrays, sizeof doesn't evaluate its operand. So yes, all it cares about is the type. So sizeof(someFloatNumb) is 4, which is equivalent to sizeof(float). This explains printf("%i\n", sizeof(someFloatNumb));.
2)
[..] But I can still store a 8 bytes real value in a float, and my compiler says nothing about it.
Does/Can gcc grow the capacity of a type ? (managing multiple variables behind the curtains to allow us that sort of things)
No. The capacity doesn't grow. You simply misunderstood how floats are represented/stored. sizeof(float) being 4 doesn't mean it can't store values greater than 2^32 (assuming 1 byte == 8 bits). See Floating point representation.
The maximum value a float can represent is defined by the constant FLT_MAX (see <float.h>). sizeof(someFloatNumb) simply yields how many bytes the object (someFloatNumb) takes up in memory, which isn't necessarily equal to the range of values it can represent.
This explains why printf("%f\n", someFloatNumb); prints the value as expected (and there's no automatic "capacity growth").
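A small sketch makes the distinction visible (exact output is implementation-specific):
#include <stdio.h>
#include <float.h>

int main(void) {
    /* Storage size and representable range are different questions. */
    printf("sizeof(float) = %zu bytes\n", sizeof(float));
    printf("FLT_MAX       = %e\n", FLT_MAX);
    return 0;
}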
3)
printf("%i\n", sizeof(281474976710656));
This is slightly more involved. As said before in (1), sizeof only cares about the type here. But the type of 281474976710656 is not necessarily int.
The C standard defines the type of integer constants according to the smallest type that can represent the value. See https://stackoverflow.com/a/42115024/1275169 for an explanation.
On my system 281474976710656 can't be represented in an int, so it's given the type long int, which is likely to be the case on your system as well. So what you see is essentially equivalent to sizeof(long).
There's no portable way to determine the type of integer constants. But since you are using gcc, you could use a little trick with typeof:
typeof(281474976710656) x;
printf("%s", x); /* deliberately using '%s' to generate warning from gcc. */
generates:
warning: format ‘%s’ expects argument of type ‘char *’, but argument 2
has type ‘long int’ [-Wformat=]
printf("%s", x);
P.S: sizeof yields a size_t, for which the correct format specifier is %zu. So that's what you should be using in your 1st and 3rd printf statements.
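That is, a corrected sketch of those two statements:
printf("%zu\n", sizeof(someFloatNumb));   /* size_t takes %zu */
printf("%zu\n", sizeof(281474976710656));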
This doesn't store "8 bytes" of data, that value gets converted to an integer by the compiler, then converted to a float for assignment:
float someFloatNumb = 0xFFFFFFFFFFFF; // 6 bytes of data
Since float can represent large values, this isn't a big deal, but you will lose a lot of precision if you're only using 32-bit floats. Notice there's a slight but important difference here:
float value = 281474976710656.000000;
long long ivalue = 281474976710655;
This is because a float becomes an approximation when it runs out of precision.
Capacities don't "grow" for standard C types. You'll have to use a "bignum" library for that.
But I can still store a 8 bytes real value in a float, and my compiler
says nothing about it.
That's not what's happening.
float someFloatNumb = 0xFFFFFFFFFFFF;
0xFFFFFFFFFFFF is an integer constant. Its value, expressed in decimal, is 281474976710655, and its type is probably either long or long long. (Incidentally, that value can be stored in 48 bits, but most systems don't have a 48-bit integer type, so it will probably be stored in 64 bits, of which the high-order 16 bits will be zero.)
When you use an expression of one numeric type to initialize an object of a different numeric type, the value is converted. This conversion doesn't depend on the size of the source expression, only on its numeric value. For an integer-to-float conversion, the result is the closest representation to the integer value. There may be some loss of precision (and in this case, there is). Some compilers may have options to warn about loss of precision, but the conversion is perfectly valid so you probably won't get a warning by default.
Here's a small program to illustrate what's going on:
#include <stdio.h>

int main(void) {
    long long ll = 0xFFFFFFFFFFFF;
    float f = 0xFFFFFFFFFFFF;
    printf("ll = %lld\n", ll);
    printf("f = %f\n", f);
}
The output on my system is:
ll = 281474976710655
f = 281474976710656.000000
As you can see, the conversion has lost some precision. 281474976710656 is an exact power of two, and floating-point types generally can represent those exactly. There's a very small difference between the two values because you chose an integer value that's very close to one that can be represented exactly. If I change the value:
#include <stdio.h>

int main(void) {
    long long ll = 0xEEEEEEEEEEEE;
    float f = 0xEEEEEEEEEEEE;
    printf("ll = %lld\n", ll);
    printf("f = %f\n", f);
}
the apparent loss of precision is much larger:
ll = 262709978263278
f = 262709979381760.000000
0xFFFFFFFFFFFF == 281474976710655
If you init a float with that value, it will end up being
0xFFFFFFFFFFFF + 1 == 0x1000000000000 == 281474976710656 == 1<<48
That fits easily in a 4-byte float: simple mantissa, small exponent.
It does however NOT store the correct value (one lower), because that IS hard to store in a float.
Note that the "+1" does not imply incrementation. It ends up one higher because the representation can only get as close as off-by-one to the attempted value. You may consider that "rounding up to the next power of 2, multiplied by whatever the mantissa can store". The mantissa, by the way, is usually interpreted as a fraction between 0 and 1.
Getting closer would indeed require the 48 bits of your initialisation in the mantissa, plus whatever number of bits is used to store the exponent, and maybe a few more for other details.
Look at the value printed... 0xFFFFFFFFFFFF is an odd value, but the value printed in your example is even. You are feeding the float variable with an int value that is converted to float. The conversion loses precision, as expected, because the value doesn't fit in the 23 bits reserved for the target variable's mantissa. So you finally get an approximation, which is the value 0x1000000000000 (the next value, which is the closest to the one you used, as posted by @Yunnosch in his answer).

C safely taking absolute value of integer

Consider the following program (C99):
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

int main(void)
{
    printf("Enter int in range %jd .. %jd:\n > ", INTMAX_MIN, INTMAX_MAX);
    intmax_t i;
    if (scanf("%jd", &i) == 1)
        printf("Result: |%jd| = %jd\n", i, imaxabs(i));
}
Now as I understand it, this contains easily triggerable undefined behaviour, like this:
Enter int in range -9223372036854775808 .. 9223372036854775807:
> -9223372036854775808
Result: |-9223372036854775808| = -9223372036854775808
Questions:
Is this really undefined behaviour, as in "code is allowed to trigger any code path, whatever strikes the compiler's fancy", when the user enters the bad number? Or is it some other flavor of not-completely-defined?
How would a pedantic programmer go about guarding against this, without making any assumptions not guaranteed by standard?
(There are a few related questions, but I didn't find one which answers question 2 above, so if you suggest duplicate, please make sure it answers that.)
If the result of imaxabs cannot be represented, which can happen when using two's complement, then the behavior is undefined.
7.8.2.1 The imaxabs function
The imaxabs function computes the absolute value of an integer j. If the result cannot
be represented, the behavior is undefined. 221)
221) The absolute value of the most negative number cannot be represented in two’s complement.
The check that makes no assumptions and is always defined is:
intmax_t i = ... ;
if (i < -INTMAX_MAX)
{
    // handle error
}
(This if statement cannot be taken when using one's complement or sign-magnitude representation, so the compiler might give an unreachable-code warning. The code itself is still defined and valid.)
How would a pedantic programmer go about guarding against this, without making any assumptions not guaranteed by standard?
One method is to use unsigned integers. The overflow behaviour of unsigned integers is well-defined, as is the behaviour when converting from a signed to an unsigned integer.
So I think the following should be safe (it turns out to be horribly broken on some really obscure systems; see later in the post for an improved version):
uintmax_t j = i;
if (j > (uintmax_t)INTMAX_MAX) {
    j = -j;
}
printf("Result: |%jd| = %ju\n", i, j);
So how does this work?
uintmax_t j = i;
This converts the signed integer into an unsigned one. If it's positive, the value stays the same; if it's negative, the value increases by 2^n (where n is the number of bits). This converts it to a large number (larger than INTMAX_MAX).
if (j > (uintmax_t)INTMAX_MAX) {
If the original number was positive (and hence less than or equal to INTMAX_MAX) this does nothing. If the original number was negative the inside of the if block is run.
j = -j;
The number is negated. The result of a negation is clearly negative and so cannot be represented as an unsigned integer, so it is increased by 2^n.
So algebraically the result for negative i looks like
j = -(i + 2^n) + 2^n = -i
Clever, but this solution makes assumptions. It fails if INTMAX_MAX == UINTMAX_MAX, which is allowed by the C standard.
Hmm, let's look at this (I'm reading https://busybox.net/~landley/c99-draft.html, which is apparently the last C99 draft prior to standardisation; if anything changed in the final standard please do tell me).
When typedef names differing only in the absence or presence of the initial u are defined, they shall denote corresponding signed and unsigned types as described in 6.2.5; an implementation shall not provide a type without also providing its corresponding type.
In 6.2.5 I see
For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
In 6.2.6.2 I see
#1
For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N-1), so that objects of that type shall be capable of representing values from 0 to 2^N - 1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.39)
#2
For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; there shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M<=N). If the sign bit is zero, it shall not affect the resulting value.
So yes it seems you are right, while the signed and unsigned types have to be the same size it does seem to be valid for the unsigned type to have one more padding bit than the signed type.
Ok, based on the analysis above revealing a flaw in my first attempt, I've written a more paranoid variant. This has two changes from my first version:
I use i < 0 rather than j > (uintmax_t)INTMAX_MAX to check for negative numbers. This means that the algorithm produces correct results for numbers greater than or equal to -INTMAX_MAX even when INTMAX_MAX == UINTMAX_MAX.
I add handling for the error case where INTMAX_MAX == UINTMAX_MAX, INTMAX_MIN == -INTMAX_MAX - 1 and i == INTMAX_MIN. This will result in j == 0 inside the if block, which we can easily test for.
It can be seen from the requirements in the C standard that INTMAX_MIN cannot be smaller than -INTMAX_MAX -1 since there is only one sign bit and the number of value bits must be the same or lower than in the corresponding unsigned type. There are simply no bit patterns left to represent smaller numbers.
uintmax_t j = i;
if (i < 0) {
    j = -j;
    if (j == 0) {
        printf("your platform sucks\n");
        exit(1);
    }
}
printf("Result: |%jd| = %ju\n", i, j);
@plugwash I think 2501 is correct. For example, the value -UINTMAX_MAX becomes 1: (-UINTMAX_MAX + (UINTMAX_MAX + 1)), and is not caught by your if. – hyde
Umm, assuming INTMAX_MAX == UINTMAX_MAX and i = -INTMAX_MAX:
uintmax_t j = i;
After this command, j = -INTMAX_MAX + (UINTMAX_MAX + 1) = 1.
if (i < 0) {
i is less than zero, so we run the commands inside the if:
j = -j;
After this command, j = -1 + (UINTMAX_MAX + 1) = UINTMAX_MAX,
which is the correct answer, so no need to trap it in an error case.
On two's complement systems, getting the absolute value of the most negative value is indeed undefined behavior, as the absolute value would be out of range. And it's nothing the compiler can help you with, as the UB happens at run-time.
The only way to protect against that is to compare the input against the most negative value for the type (INTMAX_MIN in the code you show).
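A minimal sketch of that guard (the helper name is illustrative):
#include <inttypes.h>
#include <stdbool.h>

/* Returns false when |i| is not representable, i.e. i == INTMAX_MIN. */
static bool safe_imaxabs(intmax_t i, intmax_t *out) {
    if (i == INTMAX_MIN)
        return false; /* the single unrepresentable case */
    *out = imaxabs(i);
    return true;
}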
So calculating the absolute value of an integer invokes undefined behaviour in one single case. Actually, while the undefined behaviour can be avoided, it is impossible to give the correct result in one case.
Now consider multiplication of an integer by 3: Here we have a much more serious problem. This operation invokes undefined behaviour in 2/3rds of all cases! And for two thirds of all int values x, finding an int with the value 3x is just impossible. That's a much more serious problem than the absolute value problem.
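For comparison, a sketch of a guard for the multiplication case (the function name is illustrative):
#include <limits.h>
#include <stdbool.h>

/* Stores 3*x in *out and returns true only when the product is representable. */
static bool mul3_checked(int x, int *out) {
    if (x > INT_MAX / 3 || x < INT_MIN / 3)
        return false; /* 3*x would overflow */
    *out = 3 * x;
    return true;
}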
You may want to use some bit hacks:
#include <limits.h> /* for CHAR_BIT */

int v;          // we want to find the absolute value of v
unsigned int r; // the result goes here
int const mask = v >> (sizeof(int) * CHAR_BIT - 1); // all ones if v is negative
r = (v + mask) ^ mask;
This works well when INT_MIN < v <= INT_MAX. For v == INT_MIN the result stays INT_MIN on typical two's complement machines, but note that strictly speaking both the right shift of a negative value and the intermediate addition are not fully defined by the standard, so this is not a portable guarantee.
You can also use bitwise operations to handle this on ones' complement and sign-magnitude systems.
Reference: https://graphics.stanford.edu/~seander/bithacks.html#IntegerAbs
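Wrapped into a runnable sketch (still assuming two's complement and an arithmetic right shift, which the hack itself relies on):
#include <stdio.h>
#include <limits.h>

int main(void) {
    int v = -42; /* value whose absolute value we want */
    /* mask is all ones when v is negative, all zeros otherwise */
    int const mask = v >> (sizeof(int) * CHAR_BIT - 1);
    unsigned int r = (v + mask) ^ mask;
    printf("|%d| = %u\n", v, r);
    return 0;
}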
According to this: http://linux.die.net/man/3/imaxabs
Notes
Trying to take the absolute value of the most negative integer is not defined.
To handle the full range you could add something like this to your code:
if (i != INTMAX_MIN) {
    printf("Result: |%jd| = %jd\n", i, imaxabs(i));
} else { /* Code around undefined abs(INTMAX_MIN) */
    printf("Result: |%jd| = %jd%jd\n", i, -(i/10), -(i%10));
}
Edit: As abs(INTMAX_MIN) cannot be represented on a two's complement machine, two values within the representable range are concatenated on output as a string.
Tested with gcc, though printf required %lld as %jd was not a supported format.
Is this really undefined behaviour, as in "code is allowed to trigger any code path, whatever strikes the compiler's fancy", when the user enters the bad number? Or is it some other flavor of not-completely-defined?
The behaviour of the program is only undefined when the bad number is successfully entered and passed to imaxabs(), which on a typical two's complement system returns a negative result, as you observed.
That is the undefined behaviour in this case; the implementation would also be allowed to terminate the program with an overflow error if the ALU set status flags.
The reason for "undefined behaviour" in C is so compiler writers don't have to guard against overflow, so programs can run more efficiently. While it is within the C standard for every C program using abs() to try to kill your first born just because you called it with too negative a value, writing such code into the object file would simply be perverse.
The real problem with these undefined behaviours is that an optimising compiler can reason away naive checks, so code like:
r = (i < 0) ? -i : i;
if (r < 0) { // This check may be optimised away
    // Do overflow recovery
    doRecoveryProcessing();
} else {
    printf("%jd", r);
}
As the compiler's optimiser can reason that negated negative values are non-negative, it could in principle determine that (r < 0) is always false, so the attempt to trap the problem fails.
How would a pedantic programmer go about guarding against this, without making any assumptions not guaranteed by standard?
By far the best way is simply to ensure that the program works on a valid range, so in this case validating the input suffices (disallow INTMAX_MIN).
Programs printing tables of abs() ought to avoid INT*_MIN and so on.
if (i != INTMAX_MIN) {
    printf("Result: |%jd| = %jd\n", i, imaxabs(i));
} else { /* Code around undefined abs(INTMAX_MIN) */
    printf("Result: |%jd| = %jd%jd\n", i, -(i/10), -(i%10));
}
appears to write out abs(INTMAX_MIN) by fakery, allowing the program to live up to its promise to the user.

Wrong output from printf of a number

#include <stdio.h>

int main()
{
    double i = 4;
    printf("%d", i);
    return 0;
}
Can anybody tell me why this program gives output of 0?
When you create a double initialised with the value 4, its 64 bits are filled according to the IEEE-754 standard for double-precision floating-point numbers. A floating-point number is divided into three parts: a sign, an exponent, and a fraction (also known as a significand, coefficient, or mantissa). The sign is one bit and denotes whether the number is positive or negative. The sizes of the other fields depend on the overall size of the floating-point type. To decode the number, the following formula is used:
1.Fraction × 2^(Exponent − 1023)
In your example, the sign bit is 0 because the number is positive, the fractional part is 0 because the number is initialised as an integer, and the exponent part contains the value 1025 (2 with an offset of 1023). The result is:
1.0 × 2^2
Or, as you would expect, 4. The binary representation of the number (divided into sections) looks like this:
0 10000000001 0000000000000000000000000000000000000000000000000000
Or, in hexadecimal, 0x4010000000000000. When passing a value to printf using the %d specifier, it attempts to read sizeof(int) bytes from the parameters you passed to it. In your case, sizeof(int) is 4, or 32 bits. Since the first (rightmost) 32 bits of the 64-bit floating-point number you supply are all 0, it stands to reason that printf produces 0 as its integer output. If you were to write:
printf("%d %d", i);
Then you might get 0 1074790400, where the second number is equivalent to 0x40100000. I hope you see why this happens. Other answers have already given the fix for this: use the %f format specifier and printf will correctly accept your double.
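For reference, a minimal sketch of the fixed program:
#include <stdio.h>

int main(void) {
    double i = 4;
    printf("%f\n", i); /* %f matches the double argument */
    return 0;
}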
Jon Purdy gave you a wonderful explanation of why you were seeing this particular result. However, bear in mind that the behavior is explicitly undefined by the language standard:
7.19.6.1.9: If a conversion specification is invalid, the behavior is undefined.248) If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
(emphasis mine) where "undefined behavior" means
3.4.3.1: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
IOW, the compiler is under no obligation to produce a meaningful or correct result. Most importantly, you cannot rely on the result being repeatable. There's no guarantee that this program would output 0 on other platforms, or even on the same platform with different compiler settings (it probably will, but you don't want to rely on it).
%d is for integers:
#include <stdio.h>

int main()
{
    int i = 4;
    double f = 4;
    printf("%d", i);    // prints 4
    printf("%0.f", f);  // prints 4
    return 0;
}
Because the language allows you to screw up and you happily do it.
More specifically, '%d' is the formatting for an int, and therefore printf("%d") consumes as many bytes from the arguments as an int takes. But a double is much larger, so printf only gets a bunch of zeros. Use '%f' (or equivalently '%lf' in C99).
Because "%d" specifies that you want to print an int, but i is a double. Try printf("%f\n"); instead (the \n specifies a new-line character).
The simple answer to your question is, as others have said, that you're telling printf to print an integer (for example a variable of type int) while passing it a double-precision number (as your variable is of type double), which is wrong.
Here's a snippet from the printf(3) Linux programmer's manual explaining the %d and %f conversion specifiers:
d, i The int argument is converted to signed decimal notation. The
precision, if any, gives the minimum number of digits that must
appear; if the converted value requires fewer digits, it is
padded on the left with zeros. The default precision is 1.
When 0 is printed with an explicit precision 0, the output is
empty.
f, F The double argument is rounded and converted to decimal notation
in the style [-]ddd.ddd, where the number of digits after the
decimal-point character is equal to the precision specification.
If the precision is missing, it is taken as 6; if the precision
is explicitly zero, no decimal-point character appears. If a
decimal point appears, at least one digit appears before it.
To make your current code work, you can do two things. The first alternative has already been suggested - substitute %d with %f.
The other thing you can do is to cast your double to an int, like this:
printf("%d", (int) i);
The more complex answer (addressing why printf acts as it does) was answered briefly by Jon Purdy. For a more in-depth explanation, have a look at the Wikipedia articles on floating-point arithmetic and double precision.
Because i is a double and you tell printf to use it as if it were an int (%d).
@jagan, regarding the sub-question:
"What is the leftmost third byte? Why is it 00000001? Can somebody explain?"
10000000001 is 1025 in binary format.
