C: Printing big numbers

Take the following:
#include <stdio.h>

int main(void) {
    unsigned long long verybig = 285212672;
    printf("Without variable : %llu\n", 285212672);
    printf("With variable : %llu", verybig);
}
This is the output of the above program:
Without variable : 18035667472744448
With variable : 285212672
As you can see from the above, when printf is passed the number as a constant, it prints some huge incorrect number, but when the value is first stored in a variable, printf prints the correct number.
What is the reasoning behind this?

Try 285212672ULL; if you write it without suffixes, you'll find the compiler treats it as a regular int. The reason it works with a variable is that the int is converted up to an unsigned long long in the assignment, so the value passed to printf() is the right type.
And before you ask, no, the compiler probably isn't smart enough to figure it out from the "%llu" in the printf() format string. That's a different level of abstraction. The compiler is responsible for the language syntax; printf() semantics are not part of that syntax. printf() is a runtime library function, no different really from your own functions except that it's included in the standard library.
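For illustration, here is one way the original program could be written so that both calls print the right value; the ULL suffix and the explicit cast are two equivalent fixes:
#include <stdio.h>

int main(void)
{
    unsigned long long verybig = 285212672;

    /* The ULL suffix makes the constant an unsigned long long, so it
       matches the %llu conversion. */
    printf("Without variable : %llu\n", 285212672ULL);

    /* A cast at the call site works just as well. */
    printf("Cast constant    : %llu\n", (unsigned long long)285212672);

    printf("With variable    : %llu\n", verybig);
    return 0;
}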
Consider the following code for a 32-bit int and 64-bit unsigned long long system:
#include <stdio.h>

int main(void) {
    printf("%llu\n", 1, 2);
    printf("%llu\n", 1ULL, 2);
    return 0;
}
which outputs:
8589934593
1
In the first case, the two 32-bit integers 1 and 2 are pushed on the stack and printf() interprets that as a single 64-bit ULL value, 2 × 2^32 + 1. The 2 argument is being inadvertently included in the ULL value.
In the second, you actually push the 64-bit 1-value and a superfluous 32-bit integer 2, which is ignored.
Note that this "getting out of step" between your format string and your actual arguments is a bad idea. Something like:
printf ("%llu %s %d\n", 0, "hello", 0);
is likely to crash because the 32-bit "hello" pointer will be consumed by the %llu and %s will try to dereference the final 0 argument. The following "picture" illustrates this (let's assume that cells are 32 bits and that the "hello" string is stored at 0xbf000000).
What you pass    Stack frames     What printf() uses

                 +------------+
    0            |     0      |  \
                 +------------+   > 64-bit value for %llu.
  "hello"        | 0xbf000000 |  /
                 +------------+
    0            |     0      |    value for %s (likely core dump here).
                 +------------+
                 |     ?      |    value for %d (could be anything).
                 +------------+

It's worth pointing out that some compilers give a useful warning for this case - for example, this is what GCC says about your code:
x.c: In function ‘main’:
x.c:6: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘int’

285212672 is an int value. printf expects an unsigned long long but you're passing it an int, so it takes more bytes off the stack than you actually supplied and prints garbage. When you store the value in an unsigned long long variable before passing it to the function, it is converted to unsigned long long by the assignment, and that correctly typed value is what printf receives, so it works.
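A quick way to see the size mismatch described above (the sizes shown in the comment are typical, not guaranteed):
#include <stdio.h>

int main(void)
{
    unsigned long long verybig = 285212672;

    /* On a typical platform this prints 4 for the int constant and 8 for
       the unsigned long long variable - exactly the mismatch that %llu
       trips over when it is handed the bare constant. */
    printf("sizeof constant : %zu\n", sizeof 285212672);
    printf("sizeof variable : %zu\n", sizeof verybig);
    return 0;
}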

A data type is simply a way of interpreting the contents of a memory location.
In the first case the constant is passed as an int, but printf is instructed (by %llu) that the value is a long long, so it interprets 8 bytes' worth of argument data and prints a garbage value.
In the second case printf interprets a genuine long long value as 8 bytes and prints what is expected.
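As a sketch of what "interpreting the same bytes differently" means (this assumes a little-endian machine with a 4-byte unsigned int and an 8-byte unsigned long long):
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned long long big = 285212672ULL;
    unsigned int halves[2];

    /* Copy the 8 bytes of the unsigned long long into two 4-byte
       unsigned ints and look at them separately. On a little-endian
       machine the low half holds 285212672 and the high half holds 0. */
    memcpy(halves, &big, sizeof big);
    printf("low half : %u\n", halves[0]);
    printf("high half: %u\n", halves[1]);
    return 0;
}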

Related

Undefined behavior of "%*d" in the C programming language

case 1:
printf("%*d", 10, 5)
output :
_________5
(I am using _ to denote blank spaces)
case 2:
printf("%*d", 10.4, 5)
expected output:
______0005
But it goes into an infinite loop.
Why does %*d behave this way when the field width is given as a decimal (floating-point) value?
By using the * in the format you told printf to expect the field width as an int in the argument list. You gave it a floating-point argument, 10.4, instead, and the behavior is undefined when printf expects an int and finds a double there. You likely intended this:
printf("%*.*d", 10, 4, 5);
with width and precision being represented each by its own separate integral argument.
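A complete, compilable version of that corrected call, for reference:
#include <stdio.h>

int main(void)
{
    /* Width 10, precision 4, value 5: prints "      0005"
       (the value zero-padded to four digits, right-justified in a
       ten-character field). */
    printf("%*.*d\n", 10, 4, 5);
    return 0;
}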
The second line, printf("%*d",10.4,5) leads to undefined behavior. The function printf expects an int but is given the double 10.4 instead.
The handling of the variable number of parameters of printf, the format string and the corresponding types is complex, and goes deep into compiler construction. It's difficult to trace what happens exactly and can vary from compiler to compiler.
See here for the exact specifications of the printf format specifiers.
Addition:
Let's see if we can trace what exactly happens in GNU's implementation of the standard library. First, looking at the source for printf. It uses varargs and a function called __vfprintf_internal. The latter is defined here (line 1318). On line 1363, a sanity check is performed using a macro, but it only checks if the format string pointer is not NULL. Line 1443:
int prec = -1; /* Precision of output; -1 means none specified. */
Line 1582, specifying the argument as an int in case the * modifier was used:
prec = va_arg (ap, int);
From here on, the precision is processed as an int. Let's look at the implementation of va_arg to see what happens if it is given a double. It is part of the stdarg header of GCC, see here, line 49:
#define va_arg(v,l) __builtin_va_arg(v,l)
And now it gets complex. __builtin_va_arg isn't explicitly defined anywhere; it is part of GCC's internal representation of the C language. See the parser file. That means that we cannot read concrete types in the source files anymore.
We can obtain some more information on the processing of varargs in the builtins.c file. From now on I can only guess what happens. The processing appears to start in expand_builtin_va_start, which takes a tree parameter and returns an rtx (RTL Expression) object. This object is a constant and probably has the double type mode. I'm assuming the compiler processes the double expression until it knows what (machine-specific) bit values it has to write into the executable. Since, evidently, the floating-point number is not truncated to an int, I wouldn't be surprised if the value ends up corresponding to, and later being interpreted as, a more or less random value (e.g. 77975616). It is also conceivable that the program's memory ends up misaligned when, e.g., a double (usually 8 bytes) is larger than an int (usually 4 bytes). More on the implementation of varargs here.
Whatever more or less random value the integer ends up with is then processed by process_arg(fspec) back in vfprintf-internal.c.
Additional curiosity:
If printf is given a float as the width argument, even one produced by an explicit cast, GCC will still warn that the value is a double (a float passed through a variadic call is promoted to double):
warning: field width specifier ‘*’ expects argument of type ‘int’, but argument 2 has type ‘double’ [-Wformat=]
10 | printf("%*d\n", (float) 12.3F, 5);
| ~^~ ~~~~~~~~~~~~~
| | |
| int double
When you pass 10.4 into printf, it's treated as a double. However, when printf sees the * character, it tries to read the argument containing the number of characters to print as an int. Although you and I can intuitively see how to treat 10.4 as an integer by rounding down to 10, that's not how C sees things. The result is what's called undefined behavior. On some platforms, C might treat the bytes of the double 10.4 as an integer, producing a colossal integer rather than the expected 10. On other platforms, it might read other data where it expects to find an int argument, data that holds some other unexpected value. In either case, the result is unlikely to be the nice "interpret the value as 10" that you expect.
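If "round 10.4 down to a width of 10" really is the intent, the conversion has to be done before the call; printf will not do it for you. A minimal sketch:
#include <stdio.h>

int main(void)
{
    /* Cast the double to int yourself: the width argument is now a
       proper int, and 5 is printed right-justified in a field of 10. */
    printf("%*d\n", (int)10.4, 5);
    return 0;
}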
Use -Wall -Wextra to see more warnings. You will discover:
<source>:30:12: warning: field width specifier '*' expects argument of type 'int', but argument 2 has type 'double' [-Wformat=]
30 | printf("%*d", 10.4, 5);
| ~^~ ~~~~
| | |
| int double
It is undefined behaviour.
Now an example of what happens inside a printf-like function:
#include <stdio.h>
#include <stdarg.h>

int foo(const char *fmt, ...)
{
    va_list va;
    int retval = 0;
    va_start(va, fmt);
    retval = va_arg(va, int);       /* undefined: the actual argument is a double */
    va_end(va);
    return retval;
}

unsigned bar(const char *fmt, ...)
{
    va_list va;
    unsigned retval = 0;
    va_start(va, fmt);
    retval = va_arg(va, unsigned);  /* same mismatch, read as unsigned */
    va_end(va);
    return retval;
}

int main(void)
{
    printf("as int %d\n", foo("", 10.4));
    printf("as uint %u\n", bar("", 10.4));
}
And let's execute it:
https://godbolt.org/z/EsMaM8EPs

What if I use %d instead of %ld in C?

I am a beginner reading the book "C Primer Plus" and was confused by this passage:
"To print a long value, use the %ld format specifier. If int and long are the same size on your system, just %d will suffice, but your program will not work properly when transferred to a system on which the two types are different, so use the %ld specifier for long."
I tested it myself as the following code:
int num = 2147483647;
short num_short = 32767;
long num_long = 2147483647;
printf("int: %d; short: %d; long: %d", num, num_short, num_long);
the program worked okay.
I searched online and found this question: %d with Long Int
An answer said:
it works because long int and int are actually the same numeric representation: four byte, two's complement. With another platform (for example x86-64 Linux), that might not be the case, and you would probably see some sort of problem
My computer is 64-bit. The int is 32 bits, the long int is 32 bits, and the short int is 16 bits (I checked; that's right). So you can see that the int type and the short int type are different. The answer also said it would cause an error if the types have different numeric representations. So what does the author mean?
Whatever happens depends on the implementation. The standard is quite clear on this, and it does invoke undefined behavior.
If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
7.21.6.1p9
Roberto explained in detail why it "works" in his answer
What the author of your book says is correct.
Your test invokes undefined behavior, as the standard doesn't specify what will happen when a wrong format specifier is used. It means that the implementation of each compiler will be free to interpret how to manage such cases.
Nevertheless, with some common sense (and with a little knowledge about how things work "in practice") the behavior you experience can be explained.
Every format specifier tells printf where, in the list of parameters, it can find the value to be printed. For this reason (and that's the core of your book's assertion), passing an integer with an unexpected size makes the following format specifiers look for their parameters in the wrong place.
Example 1: a sequence of 32 bits integers
Let's start with a "normal" sequence, in order to have a base example.
int a=1, b=2;
printf("%d %d\n", a, b);
The format specifiers tell printf that two 4-byte integers will be found after the format string. The parameters are actually placed on the stack in the expected way:
-------------------------------------------------
... format string | a (4 bytes) | b (4 bytes) |
-------------------------------------------------
Example 2: why doesn't printing an 8-byte long with %d work?
Let's consider the following printf:
long int a=1;
int b=2;
printf("%d %d\n", a, b);
The format specifiers tell printf that two 4-byte integers will be found after the format string. But the first parameter takes 8 bytes instead of 4:
-------------------------------------------------------------
... format string | a (4 bytes) + (4 bytes) | b (4 bytes) |
-------------------------------------------------------------
                    ^             ^
                    printf        printf
                    expects 'a'   expects 'b'
                    here          here
So the output would be
1 0
because b is searched where the 4 most significant bytes of a are, and they are all 0s.
Example 3: why does printing a 2-byte short with %d work?
Let's consider the following printf:
short int a=1;
int b=2;
printf("%d %d\n", a, b);
The format specifiers tell printf that two 4-byte integers will be found after the format string. The first parameter takes only two bytes instead of 4, but... we are lucky, because on a 32-bit platform parameters are 4-byte aligned (and, in fact, a short passed to a variadic function is promoted to int anyway)!
---------------------------------------------------------------------
... format string | a (2 bytes) | (2 bytes PADDING) | b (4 bytes) |
---------------------------------------------------------------------
                    ^                                 ^
                    printf                            printf
                    expects 'a'                       expects 'b'
                    here                              here
So the output would be
1 2
because b is searched for in the correct place. We would have problems with the representation of a if the alignment were done with non-zero padding, but that's not the case.
So, the real difference when %d is used for a short is just the representation of signed values, since the sign bit is expected to be in the most significant byte.
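For completeness, the portable way to write the test from the question is to match every conversion to its argument's type:
#include <stdio.h>

int main(void)
{
    int num = 2147483647;
    short num_short = 32767;
    long num_long = 2147483647;

    /* %d for int, %hd for short, %ld for long: each argument now matches
       its conversion, so the output no longer depends on the platform's
       type sizes. */
    printf("int: %d; short: %hd; long: %ld\n", num, num_short, num_long);
    return 0;
}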

Using a long long integer to store a 32-bit pointer causes printf to misbehave

Messing around a bit with C pointers, I came across a rather strange behavior.
Consider the following code :
#include <stdio.h>

int main(void)
{
    char charac = 'r';
    long long ptr = (long long) &charac; // Stores the address of charac in a long long variable
    printf("[ptr] points to %p containing the char %c\n", ptr, *(char *) ptr);
}
On 64-bit architectures
Now when compiled for a 64-bit target architecture (compilation command: gcc -Wall -Wextra -std=c11 -pedantic test.c -o test), everything is fine; the execution gives
> ./test
[ptr] points to 0x7fff3090ee47 containing the char r
On 32-bit architectures
But if the compilation targets a 32-bit arch (with compilation command: gcc -Wall -Wextra -std=c11 -pedantic -ggdb -m32 test.c -o test), the execution gives this weird result:
> ./test
[ptr] points to 0xff82d4f7 containing the char �
The weirdest part is that if I change the printf call in the previous code to printf ("[ptr] contains the char %c\n", *(char*)ptr);, the execution gives the correct result:
> ./test
[ptr] contains the char r
The issue seems to arise only on the 32-bit arch, and I can't figure out why the change to the printf call causes the execution to behave differently.
PS: It's maybe worth mentioning that the underlying machine is an x86 64-bit architecture, but I'm using the 32-bit compatibility mode triggered by the -m32 option in gcc.
You are basically cheating your compiler.
You tell printf that you pass a pointer as first parameter after the format string. But instead you pass an integer variable.
While this is always undefined behaviour, it may somehow work as long as the sizes of the expected type and the passed type are the same. That's the "undefined" in "undefined behaviour". It is also not defined to crash or immediately show bad results. It may just pretend to work while waiting to hit you from behind.
If your long long has 64 bits while a pointer only has 32 bits, the layout of your stack is broken, causing printf to read from the wrong location.
Depending on your architecture and tools, you have good chances that your stack looks like this when you call a function with variadic parameter list:
+---------------+---------------+---------------+
| last fixed par|  Par 1 type1  |  Par 2 type2  |
|    x bytes    |    x bytes    |    x bytes    |
+---------------+---------------+---------------+
The unknown parameters are pushed on the stack and finally the last known parameter from the signature is pushed. (Other known parameters are ignored here)
Then the function can walk through the parameter list using va_arg and friends. For this purpose the function must know which types of parameters are passed. The printf function uses the format specifier to decide which parameter to consume from the stack.
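A minimal sketch of that mechanism - a toy printf-like function that only understands %d and %s, shown purely for illustration:
#include <stdio.h>
#include <stdarg.h>

/* Walks its variadic arguments driven purely by the format string, just
   like printf does. If the caller lies about the types, the va_arg calls
   read the wrong bytes; nothing here can detect that. */
static void tiny_print(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt == '%' && fmt[1] == 'd') {
            printf("%d", va_arg(ap, int));
            fmt++;
        } else if (*fmt == '%' && fmt[1] == 's') {
            printf("%s", va_arg(ap, char *));
            fmt++;
        } else {
            putchar(*fmt);
        }
    }
    va_end(ap);
}

int main(void)
{
    tiny_print("value %d, name %s\n", 42, "hello");
    return 0;
}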
Now it comes to the point where everything depends on you telling the truth.
What you tell your compiler:
+---------------+---------------+---------------+
|  format char* |  Par 1 void*  |   Par 2 int   |
|    4 bytes    |    4 bytes    |    4 bytes    |
+---------------+---------------+---------------+
For the first parameter (%p) printf takes 4 bytes, which is the size of a void*. Then it takes another 4 bytes (the size of an int) for parameter 2 (%c).
(Note: The last parameter is printed as a character, i.e. only 1 byte will be used in the end. Due to integer type promotion rules for function calls without proper parameter type specification the parameter is stored as an int on the stack. Hence printf must also consume the bytes for an int in this case.)
Now let's look at your function call (What you really put into printf):
+---------------+-------------------------------+---------------+
|  format char* |        Par 1 long long        |   Par 2 int   |
|    4 bytes    |            8 bytes            |    4 bytes    |
+---------------+-------------------------------+---------------+
You still claim to provide a pointer and an integer parameter of 4 bytes each.
But now the first parameter comes with an extra 4 bytes of length which remains unknown to the printf function.
As you have told it, the function reads 4 bytes for the pointer. This may be in line with the first 4 bytes of the long long but the remaining 4 bytes are not consumed.
Now the next 4 bytes, which are used for the %c format, are read, but we are still reading the second half of your long long. Whatever that may be, it is not what you want.
Finally the pushed integer is still untouched when the function returns.
That's the reason why you should not mess with weird type casting and wrong types.
And that's also the reason why you should look at your warnings during compiling.
One big issue: you are using the wrong type for integer/pointer shenanigans. The type intptr_t is an integer type that can store a pointer.
So, what goes wrong on the 32-bit architecture?
The type long long int is (with gcc) a 64-bit type. However, printf with the %p format expects to receive a 32-bit pointer, not a 64-bit value.
The call to printf will have on the call stack: (illustrative purposes only, details may differ)
pointer to format string
ptr (8 bytes)
*(char *)ptr (at least 1 byte, likely 4)
printf reads the format string, discovers that it should receive a 32-bit pointer and a char. It then reads the first 4 bytes of ptr as the pointer to read and next 1-4 bytes as the character to print. It never even knows that there was more data, the actual character it should have printed, on the stack.
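A version of the original program that uses intptr_t, as suggested above, might look like this (a sketch; uintptr_t would work equally well, and intptr_t is formally an optional type, though virtually every platform provides it):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    char charac = 'r';

    /* intptr_t is an integer type wide enough to hold a pointer that has
       been converted to an integer, on both 32-bit and 64-bit targets. */
    intptr_t ptr = (intptr_t)&charac;

    /* Convert back to a real pointer before handing it to %p. */
    printf("[ptr] points to %p containing the char %c\n",
           (void *)ptr, *(char *)ptr);
    return 0;
}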

What is printed out when an int pointer is printed with %d?

This code:
#include <stdio.h>

int main() {
    int num;
    int *pi;
    num = 0;
    pi = &num;
    printf("address: %p | %d\nvalue: %d\n", pi, pi, *pi);
}
produces this output:
address: 0x7fff5952f9cc | 1498610124
value: 0
I know that the left one is supposed to be the correct address, but what is printing out next to the address?
%p tells printf to treat the corresponding variable as a pointer, so printf prints pi as a pointer, that is, as a hexadecimal representation (i.e. 0x7fff5952f9cc). %d on the other hand tells printf to treat the corresponding variable as numeric. Therefore, what is being printed is the actual numeric value of pi (i.e. 1498610124), which is just 0x5952f9cc in base 10.
Now, the reason why these two representations of the same variable seem to have different values is that %d only tells printf to expect a number; it doesn't say how big that number is. If you cast 0x7fff5952f9cc (a 64-bit value) to int (a 32-bit type) you get 1498610124 (notice 0x7fff getting dropped).
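If you want that truncated value on purpose, it is safer to do the conversion yourself rather than let a mismatched %d read the argument; a sketch using <stdint.h>:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int num = 0;
    int *pi = &num;

    /* Convert the pointer to an integer explicitly, then truncate to
       32 bits deliberately: the well-defined version of what the
       mismatched %d happened to show. */
    uintptr_t full = (uintptr_t)pi;
    unsigned int low32 = (unsigned int)full;

    printf("full value : %ju\n", (uintmax_t)full);
    printf("low 32 bits: %u\n", low32);
    return 0;
}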
You are printing the address in decimal instead of in hex, but it is truncated to an int.
I guess you are executing this program on a 64-bit machine.
The number printed next to the address is still the address stored in the pointer, printed in integer format. You can also see that the value is truncated:
the decimal value of 0x7fff5952f9cc is 140734691998156, but it is printed as 1498610124 because of the truncation.
You are trying to print the address of num in hex and in decimal respectively. If I am not wrong you are running your program on a 64-bit architecture, so the address is 8 bytes and fits in the long data type. With %d the value is getting truncated. Instead of %d please use %ld, so your printf statement should actually be as below:
printf("address: %p | %ld\nvalue: %d\n", pi, pi, *pi);
Now run the program and you will get the value correctly in decimal format.
It could be anything, because it's Undefined Behaviour. (C standard, § 7.21.6.1, The fprintf function):
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
You print pi (a pointer to an int) with two format specifiers, %p and %d. According to the C standard (and probably reproduced word for word in man fprintf on your system):
d,i The int argument is converted to signed decimal in the style [−]dddd
p The argument shall be a pointer to void. The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner.
So neither of those uses is correct. It's not a pointer to void and it's also not an int. So what you should write is:
printf("address: %p | %d\nvalue: %d\n", (void*)pi, (int)pi, *pi);
On your system, that probably produces the same output (Undefined Behaviour includes unexpectedly producing the incorrectly expected behaviour) but it might not. In any case, writing the line correctly makes it relatively clear what the second number printed is: it's the value of the pointer converted to an integer.
However, there is no guarantee that this will work, either. Again, from the standard (§6.3.2.3, Pointers, para. 6):
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined.
(The "previously specified" case is that a NULL pointer may reliably be converted to any integer type; it will have the value 0.)
So the idea is that a pointer is a lot like an integer, but it might be the size of a long. Or a long long. Or something else. It might even be bigger than any integer type, but if there is some integer type which is the right size and you #include <stdint.h>, that integer type will be typedef'd to intptr_t and its unsigned version (probably more useful) will be uintptr_t. Unfortunately, there are no standard printf conversions for those types, so if you really want to, you would need to #include <inttypes.h> and write:
printf("address: %p | %" PRIuPTR "\nvalue: %d\n",
(void*)pi, (uintptr_t)pi, *pi);
Alternatively, because it is always allowed to convert any integer to any unsigned integer (with possible loss of information, but with a well-defined conversion), you could use two casts:
printf("address: %p | %u\nvalue: %d\n",
(void*)pi, (unsigned)(uintptr_t)pi, *pi);
(Because that supplies an unsigned int argument, the format specifier must be u instead of d.)
When using printf, the %p conversion is intended to print out a memory address in hex format. The %d conversion tells it to interpret the value as a signed integer. So it's taking 0x7fff5952f9cc and turning it into a signed int.
See this for more details on printf.
In that code you are printing the address with %d, so the compiler displays a warning at compile time; %p displays the value in hexadecimal while %d displays it as an integer.
It is the lower part (the 4 least significant bytes) of the pointer value, in decimal:
0x5952F9CC == 1498610124.

C int datatype and its variations

Greetings. Today, when I was experimenting with C under the C99 standard, I came across a problem which I cannot comprehend and need an expert's help with.
The Code:
#include <stdio.h>

int main(void)
{
    int Fnum = 256; /* The first number to be printed out */
    printf("The number %d in long long specifier is %lld\n", Fnum, Fnum);
    return 0;
}
The Question:
1.) This code gave me a warning message when I tried to build and run it.
2.) But the strange thing is, when I change the specifier %lld to %hd or %ld, the warning message is not shown and the value printed on the console is the correct 256. Everything also seems normal if I try %u, %hu and %lu. In short, the warning message and the wrongly printed value only happen when I use a variation of the long long specifier.
3.) Why is this happening? I thought long long is large enough to hold the value 256, so why can it not be used to print out the appropriate value?
The Warning Message (for the above source code):
C:\Users\Sam\Documents\Pelles C Projects\Test1\Test.c(7): warning #2234: Argument 3 to 'printf' does not match the format string; expected 'long long int' but found 'int'.
Thanks for spending time reading my question. God bless.
You're passing the Fnum variable to printf, which is typed int, but it's expecting long long. This has very little to do with whether a long long can hold 256, just that the variable you chose is typed int.
If you just want to print 256, you can get a constant that's typed to unsigned long long as follows:
printf("The number %d in long long specifier is %lld\n" ,256 , 256ULL);
or cast:
printf("The number %d in long long specifier is %lld\n" , Fnum , (long long int)Fnum);
There are three things going on here.
printf takes a variable number of arguments. That means the compiler doesn't know what type the arguments (beyond the format string) are supposed to be. So it can't convert them to an appropriate type.
For historical reasons, however, integer types smaller than int are "promoted" to int when passed in a variable argument list.
You appear to be using Windows. On Windows, int and long are the same size, even when pointers are 64 bits wide (this is a willful violation of C89 on Microsoft's part - they actually forced the standard to be changed in C99 to make it "okay").
The upshot of all this is: The compiler is not allowed to convert your int to a long long just because you used %lld in the argument list. (It is allowed to warn you that you forgot the cast, because warnings are outside standard behavior.) With %lld, therefore, your program doesn't work. But if you use any other size specifier, printf winds up looking for an argument the same size as int and it works.
When dealing with a variadic function, the caller and callee need some way of agreeing the types of the variable arguments. In the case of printf, this is done via the format string. GCC is clever enough to read the format string itself and work out whether printf will interpret the arguments in the same way as they have been actually provided.
You can get away with slightly different types of arguments in some cases. For example, if you pass a short then it gets implicitly converted to an int. And when sizeof(int) == sizeof(long int) then there is also no distinction. But sizeof(int) != sizeof(long long int) so the parameter fails to match the format string in that case.
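If you want to see how those sizes compare on your own system, a quick check:
#include <stdio.h>

int main(void)
{
    /* On 64-bit Windows this typically prints 4, 4 and 8; on 64-bit
       Linux, 4, 8 and 8. Whether %ld "happens to work" with an int
       argument follows directly from these sizes, though it is still
       undefined behaviour either way. */
    printf("sizeof(int)           = %zu\n", sizeof(int));
    printf("sizeof(long int)      = %zu\n", sizeof(long int));
    printf("sizeof(long long int) = %zu\n", sizeof(long long int));
    return 0;
}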
This is due to the way varargs work in C. Unlike a normal function, printf() can take any number of arguments. It is up to the programmer to tell printf() what to expect by providing a correct format string.
Internally, printf() uses the format specifiers to access the raw memory that corresponds to the input arguments. If you specify %lld, it will try to access a 64-bit chunk of memory (on Windows) and interpret what it finds as a long long int. However, you've only provided a 32-bit argument, so the result would be undefined (it will combine your 32-bit int with whatever random garbage happens to appear next on the stack).
