Sizeof with different format specifiers [duplicate] - c

I want to know why the result of sizeof doesn't work with other format specifiers.
I know that sizeof is usually printed with the %zu format specifier, but for my own knowledge I want to understand what happens behind the scenes, and why it prints nan when I use it with %f, or a large number when I use %lf.
int a = 0;
long long d = 1000000000000;
int s = a + d;
printf("%d\n", sizeof(a + d)); // prints normal size of expression
printf("%lf\n", sizeof(s)); // prints big number
printf("%f", sizeof(d)); // prints nan

sizeof evaluates to a value of type size_t. The proper conversion specifier for size_t in C99 is %zu. You can use %u on systems where size_t and unsigned int are the same type, or at least have the same size and representation. On 64-bit systems, size_t values have 64 bits and are therefore wider than 32-bit ints. On 64-bit Linux and OS X this type is defined as unsigned long, and on 64-bit Windows as unsigned long long, so using %lu or %llu respectively on those systems is fine too.
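For example, here is a minimal sketch of the C99 approach, reusing the variables from the question and printing every result with %zu:

#include <stdio.h>

int main(void)
{
    int a = 0;
    long long d = 1000000000000;
    int s = a + d;   /* the long long result is converted back to int */

    printf("%zu\n", sizeof(a + d)); /* size of the long long expression, typically 8 */
    printf("%zu\n", sizeof(s));     /* size of int, typically 4 */
    printf("%zu\n", sizeof(d));     /* size of long long, typically 8 */
    return 0;
}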
Passing a size_t for an incompatible conversion specification has undefined behavior:
the program could crash (and it probably will if you use %s)
the program could display the expected value (as it might for %d)
the program could produce weird output such as nan for %f or something else...
The reason for this is that integers and floating point values are passed to printf in different ways and have different representations. Passing an integer where printf expects a double makes printf read a floating point value from a register or memory location with unrelated contents. In your case, that floating point register just happened to contain a nan, but it might contain a different value elsewhere in the program or at a later time; nothing can be relied upon, the behavior is undefined.
Some legacy systems do not support %zu, notably C runtimes by Microsoft. On these systems, you can use %u or %lu and use a cast to convert the size_t to an unsigned or an unsigned long:
int a = 0;
long long d = 1000000000000;
int s = a + d;
printf("%u\n", (unsigned)sizeof(a + d)); // should print 8
printf("%lu\n", (unsigned long)sizeof(s)); // should print 4
printf("%llu\n", (unsigned long long)sizeof(d)); // prints 4 or 8 depending on the system

I want to know for my own knowledge what happens behind the scenes and why it prints nan when I use it with %f or a large number when I use %lf
Several reasons.
First of all, printf doesn't know the types of the additional arguments you actually pass to it. It's relying on the format string to tell it the number and types of additional arguments to expect. If you pass a size_t as an additional argument, but tell printf to expect a float, then printf will interpret the bit pattern of the additional argument as a float, not a size_t. Integer and floating point types have radically different representations, so you'll get values you don't expect (including NaN).
Secondly, different types have different sizes. If you pass a 16-bit short as an argument, but tell printf to expect a 64-bit double with %f, then printf is going to look at the extra bytes immediately following that argument. It's not guaranteed that size_t and double have the same sizes, so printf may either be ignoring part of the actual value, or using bytes from memory that isn't part of the value.
Finally, it depends on how arguments are being passed. Some architectures use registers to pass arguments (at least for the first few arguments) rather than the stack, and different registers are used for floats vs. integers, so if you pass an integer and tell it to expect a double with %f, printf may look in the wrong place altogether and print something completely random.
printf is not smart. It relies on you to use the correct conversion specifier for the type of the argument you want to pass.
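As an aside, modern compilers can catch these mismatches for you. A minimal sketch (the exact warning wording varies by compiler and platform):

#include <stdio.h>

int main(void)
{
    long long d = 1000000000000;

    /* With gcc or clang and -Wall (which enables -Wformat), the next line
       typically draws a warning along the lines of "format '%f' expects
       argument of type 'double', but argument 2 has type 'size_t'". */
    printf("%f\n", sizeof(d));  /* mismatched specifier: undefined behavior */
    printf("%zu\n", sizeof(d)); /* matched specifier: well defined */
    return 0;
}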

Related

I'm curious about the c language printf format output value

I have a question while studying C language.
printf("%d, %f \n", 120, 125.12);
The output value in this code is:
120, 125.120000
That's it. And
printf("%d, %f \n", 120.1, 125.12);
The output value of this code is:
1717986918, 0.000000
That's it.
Of course, I understand that the value printed for %d is strange, because I passed the wrong type for the integer conversion. But I don't understand why the %f after it prints 0.000000. Does the earlier argument affect the value printed by the later specifier?
I'm asking because it keeps coming out like this even if I pass in different values.
printf is a variadic function, meaning its argument types and counts are variable. As such, they must be extracted according to their type. Extraction of a given argument is dependent on the proper extraction of the arguments that precede it. Different argument types have different sizes, alignments, etc.
If you use an incorrect format specifier for a printf argument, then it throws printf out of sync. The arguments after the one that was incorrectly extracted will not, in general, be extracted correctly.
In your example, it probably extracted only a portion of the first double argument that was passed, since it was expecting an int argument. After that it was out of sync and improperly extracted the second double argument. This is speculative though, and varies from one architecture to another. The behavior is undefined.
why %f is 0.000000 after that.
According to the C standard, because %d expects an int and you passed a double, the behavior is undefined and anything is allowed to happen.
Anyway:
I used http://www.binaryconvert.com/convert_double.html to convert doubles to bits:
120.1 === 0x405E066666666666
125.12 === 0x405F47AE147AE148
I guess that your platform is x86-ish, little endian, with ILP32 data model.
The %d printf format specifier is for int, and sizeof(int) is 4 bytes or 32 bits on this platform. Because the architecture is little endian (and so is the stack layout), the first 4 bytes read from the stack are 0x66666666, and their decimal value, 1717986918, is printed.
On the stack are left three 32-bit little endian words, let's split them 0x405E0666 0x147AE148 0x405F47AE.
The %f printf format specifier is for double, and sizeof(double) on your platform is 8 bytes or 64 bits. We already read 4 bytes from the stack, so now we read the next 8 bytes. Remembering the endianness, we read 0x147AE148405E0666, which is then interpreted as a double. Again using that site to convert the hex to a double, the result is 5.11013578783948654276686949883E-210. E-210 means this is a very, very small number. Because %f prints with 6 digits after the decimal point by default, and this value is far smaller than 0.000001, only zeros are printed. If you used the %g printf format specifier instead, the number would be printed as 5.11014e-210.
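If you want to verify these bit patterns yourself, here is a small sketch. It only inspects the in-memory representation of the doubles (assuming IEEE 754 binary64); it does not reproduce the out-of-sync printf call, which is undefined behavior:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    double a = 120.1, b = 125.12;
    uint64_t bits_a, bits_b;

    /* Copy the object representation of each double into a 64-bit integer. */
    memcpy(&bits_a, &a, sizeof bits_a);
    memcpy(&bits_b, &b, sizeof bits_b);

    printf("120.1  = 0x%016llX\n", (unsigned long long)bits_a); /* 0x405E066666666666 */
    printf("125.12 = 0x%016llX\n", (unsigned long long)bits_b); /* 0x405F47AE147AE148 */

    /* The low-order 32 bits of 120.1 are 0x66666666; on a little-endian machine
       these are the first 4 bytes in memory, and read as an int they are 1717986918. */
    printf("low word of 120.1 as int: %d\n", (int)(uint32_t)(bits_a & 0xFFFFFFFFu));
    return 0;
}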
This can be attributed to the fact that the printf function is not type-safe, i.e. there is no connection between the specifiers in the format string and the passed arguments.
The variadic arguments undergo the default argument promotions: integer types narrower than int are converted to int (4 bytes) and float is converted to double (8 bytes). As the types are guessed from the format string, a shift in the data occurs. The 0.000000 likely appears because the bits that end up being read correspond to a floating-point value so close to zero (a very small biased exponent) that it prints as zero at 6 digits of precision.
Try with 1.0, 1.e34.

Why is this code printing 0?

void main()
{
    clrscr();
    float f = 3.3;
    /* In printf() I intentionally put the %d format specifier to see
       what type of output I may get */
    printf("value of variable a is: %d", f);
    getch();
}
In effect, %d tells printf to look in a certain place for an integer argument. But you passed a float argument, which is put in a different place. The C standard does not specify what happens when you do this. In this case, it may be there was a zero in the place printf looked for an integer argument, so it printed “0”. In other circumstances, something different may happen.
Using an invalid format specifier to printf invokes undefined behavior. This is specified in section 7.21.6.1p9 of the C standard:
If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
What this means is that you can't reliably predict what the output of the program will be. For example, the same code on my system prints -1554224520 as the value.
As to what's most likely happening, the %d format specifier is looking for an int as a parameter. Assuming that an int is passed on the stack and that an int is 4 bytes long, the printf function looks at the next 4 bytes on the stack for the value given. Many implementations don't pass floating point values on the stack but in registers instead, so it instead sees whatever garbage values happen to be there. Even if a float is passed on the stack, a float and an int have very different representations, so printing the bytes of a float as an int will most likely not give you the same value.
Let's look at a different example for a moment. Suppose I write
#include <string.h>
char buf[10];
float f = 3.3;
memset(buf, 'x', f);
The third argument to memset is supposed to be an integer (actually a value of type size_t) telling memset how many characters of buf to set to 'x'. But I passed a float value instead. What happens? Well, the compiler knows that the third argument is supposed to be an integer, so it automatically performs the appropriate conversion, and the code ends up setting the first three (three point zero) characters of buf to 'x'.
(Significantly, the way the compiler knew that the third argument of memset was supposed to be an integer was based on the prototype function declaration for memset which is part of the header <string.h>.)
Now, when you called
printf("value of variable f is: %d", f);
you might think the same thing happens. You passed a float, but %d expects an int, so an automatic conversion will happen, right?
Wrong. Let me say that again: Wrong.
The perhaps surprising fact is, printf is different. printf is special. The compiler can't necessarily know what the right types of the arguments passed to printf are supposed to be, because it depends on the details of the %-specifiers buried in the format string. So there are no automatic conversions to just the right type. It's your job to make sure that the types of the arguments you actually pass are exactly right for the format specifiers. If they don't match, the compiler does not automatically perform corresponding conversions. If they don't match, what happens is that you get crazy, wrong results.
(What does the prototype function declaration for printf look like? It literally looks like this: extern int printf(const char *, ...);. Those three dots ... indicate a variable-length argument list or "varargs", they tell the compiler it can't know how many more arguments there are, or what their types are supposed to be. So the compiler performs a few basic conversions -- such as upconverting types char and short int to int, and float to double -- and leaves it at that.)
I said "The compiler can't necessarily know what the right types of the arguments passed to printf are supposed to be", but these days, good compilers go the extra mile and try to figure it out anyway, if they can. They still won't perform automatic conversions (they're not really allowed to, by the rules of the language), but they can at least warn you. For example, I tried your code under two different compilers. Both said something along the lines of warning: format specifies type 'int' but the argument has type 'float'. If your compiler isn't giving you warnings like these, I encourage you to find out if those warnings can be enabled, or consider switching to a better compiler.
Try
printf("... %f",f);
That's how you print float numbers.
Maybe you only want to print x digits of f, e.g.:
printf("... %.3f", f);
That will print your float number with 3 digits after the dot.
Please read through this list:
%c - Character
%d or %i - Signed decimal integer
%e - Scientific notation (mantissa/exponent) using e character
%E - Scientific notation (mantissa/exponent) using E character
%f - Decimal floating point
%g - Uses the shorter of %e or %f
%G - Uses the shorter of %E or %f
%o - Unsigned octal
%s - String of characters
%u - Unsigned decimal integer
%x - Unsigned hexadecimal integer
%X - Unsigned hexadecimal integer (capital letters)
%p - Pointer address
%n - Nothing printed; stores the number of characters written so far into the int pointed to by the argument
The code is printing a 0, because you are using the format tag %d, which represents Signed decimal integer (http://devdocs.io).
Could you please try
void main() {
    clrscr();
    float f = 3.3;
    /* This time using the %f format specifier, which matches the float argument */
    printf("value of variable a is: %f", f);
    getch();
}

Size of a float variable and compilation

I'm struggling to understand the behavior of gcc here. The size of a float is 4 bytes on my architecture. But I can still store an 8-byte real value in a float, and my compiler says nothing about it.
For example I have :
#include <stdio.h>

int main(int argc, char** argv){
    float someFloatNumb = 0xFFFFFFFFFFFF;
    printf("%i\n", sizeof(someFloatNumb));
    printf("%f\n", someFloatNumb);
    printf("%i\n", sizeof(281474976710656));
    return 0;
}
I expected the compiler to insult me, or to display a warning of some sort, because I shouldn't be able to do something like that; at least, I think it's kind of twisted wizardry.
The program simply runs:
4
281474976710656.000000
8
So, if I print the size of someFloatNumb, I get 4 bytes, which is expected. But the assigned value isn't what I expected, as seen in the output above.
So I have a few questions:
Does sizeof(variable) simply get the variable type and return sizeof(type), which in this case would explain the result?
Does/Can gcc grow the capacity of a type? (managing multiple variables behind the curtains to allow us that sort of things)
1)
Does sizeof(variable) simply get the variable type and return sizeof(type), which in this case would explain the result ?
Except for variable-length arrays, sizeof doesn't evaluate its operand. So yes, all it cares about is the type. So sizeof(someFloatNumb) is 4, which is equivalent to sizeof(float). This explains printf("%i\n", sizeof(someFloatNumb));.
2)
[..] But I can still store a 8 bytes real value in a float, and my compiler says nothing about it.
Does/Can gcc grow the capacity of a type ? (managing multiple variables behind the curtains to allow us that sort of things)
No. Capacity doesn't grow. You simply misunderstood how floats are represented/stored. sizeof(float) being 4 doesn't mean it can't represent values larger than 2^32 (assuming 1 byte == 8 bits). See Floating point representation.
What the maximum value of a float can represent is defined by the constant FLT_MAX (see <float.h>). sizeof(someFloatNumb) simply yields how many bytes the object (someFloatNumb) takes up in memory which isn't necessarily equal to the range of values it can represent.
This explains why printf("%f\n", someFloatNumb); prints the value as expected (and there's no automatic "capacity growth").
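A quick sketch to make the size/range distinction concrete (values shown in the comments assume the common IEEE 754 single-precision float):

#include <stdio.h>
#include <float.h>

int main(void)
{
    float someFloatNumb = 0xFFFFFFFFFFFF; /* converted to float, approximately 2^48 */

    printf("sizeof(float) = %zu bytes\n", sizeof(float)); /* typically 4 */
    printf("FLT_MAX       = %e\n", FLT_MAX);              /* about 3.4e+38 */
    printf("value         = %f\n", someFloatNumb);        /* 281474976710656.000000 */
    return 0;
}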
3)
printf("%i\n", sizeof(281474976710656));
This is slightly more involved. As said before in (1), sizeof only cares about the type here. But the type of 281474976710656 is not necessarily int.
The C standard defines the type of integer constants according to the smallest type that can represent the value. See https://stackoverflow.com/a/42115024/1275169 for an explanation.
On my system 281474976710656 can't be represented in an int, so it's stored in a long int, which is likely to be the case on your system as well. So what you see is essentially equivalent to sizeof(long).
There's no portable way to determine the type of integer constants. But since you are using gcc, you could use a little trick with typeof:
typeof(281474976710656) x;
printf("%s", x); /* deliberately using '%s' to generate warning from gcc. */
generates:
warning: format ‘%s’ expects argument of type ‘char *’, but argument 2
has type ‘long int’ [-Wformat=]
printf("%s", x);
P.S.: sizeof yields a size_t, for which the correct format specifier is %zu. So that's what you should be using in your 1st and 3rd printf statements.
This doesn't store "8 bytes" of data, that value gets converted to an integer by the compiler, then converted to a float for assignment:
float someFloatNumb = 0xFFFFFFFFFFFF; // 6 bytes of data
Since float can represent large values, this isn't a big deal, but you will lose a lot of precision if you're only using 32-bit floats. Notice there's a slight but important difference here:
float value = 281474976710656.000000;
long long value = 281474976710655;
This is because float becomes an approximation when it runs out of precision.
Capacities don't "grow" for standard C types. You'll have to use a "bignum" library for that.
But I can still store a 8 bytes real value in a float, and my compiler
says nothing about it.
That's not what's happening.
float someFloatNumb = 0xFFFFFFFFFFFF;
0xFFFFFFFFFFFF is an integer constant. Its value, expressed in decimal, is 281474976710655, and its type is probably either long or long long. (Incidentally, that value can be stored in 48 bits, but most systems don't have a 48-bit integer type, so it will probably be stored in 64 bits, of which the high-order 16 bits will be zero.)
When you use an expression of one numeric type to initialize an object of a different numeric type, the value is converted. This conversion doesn't depend on the size of the source expression, only on its numeric value. For an integer-to-float conversion, the result is the closest representation to the integer value. There may be some loss of precision (and in this case, there is). Some compilers may have options to warn about loss of precision, but the conversion is perfectly valid so you probably won't get a warning by default.
Here's a small program to illustrate what's going on:
#include <stdio.h>

int main(void) {
    long long ll = 0xFFFFFFFFFFFF;
    float f = 0xFFFFFFFFFFFF;
    printf("ll = %lld\n", ll);
    printf("f = %f\n", f);
}
The output on my system is:
ll = 281474976710655
f = 281474976710656.000000
As you can see, the conversion has lost some precision. 281474976710656 is an exact power of two, and floating-point types generally can represent those exactly. There's a very small difference between the two values because you chose an integer value that's very close to one that can be represented exactly. If I change the value:
#include <stdio.h>

int main(void) {
    long long ll = 0xEEEEEEEEEEEE;
    float f = 0xEEEEEEEEEEEE;
    printf("ll = %lld\n", ll);
    printf("f = %f\n", f);
}
the apparent loss of precision is much larger:
ll = 262709978263278
f = 262709979381760.000000
0xFFFFFFFFFFFF == 281474976710655
If you initialize a float with that value, it will end up being
0xFFFFFFFFFFFF + 1 == 0x1000000000000 == 281474976710656 == 1<<48
That fits easily in a 4-byte float: simple mantissa, small exponent.
It does however NOT store the correct value (one lower), because that IS hard to store in a float.
Note that the "+1" does not imply incrementation. It ends up one higher because the representation can only get as close as off-by-one to the attempted value. You may think of it as "rounding up to the next power of 2, multiplied by whatever the mantissa can store". The mantissa, by the way, is usually interpreted as a fraction between 0 and 1.
Getting closer would indeed require the 48 bits of your initialisation in the mantissa, plus whatever number of bits is used to store the exponent, and maybe a few more for other details.
Look at the value printed... 0xFFFF...FFFF is an odd value, but the value printed in your example is even. You are feeding the float variable an integer value that is converted to float. The conversion loses precision, as expected, because the value doesn't fit in the 23 bits reserved for the target variable's mantissa. So you end up with an approximation, which is the value 0x1000000000000 (the next value, and the closest representable one to the value you used, as posted by @Yunnosch in his answer).
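A small sketch that shows the rounding directly, assuming the common IEEE 754 single-precision float (24-bit significand):

#include <stdio.h>
#include <float.h>

int main(void)
{
    long long exact = 0xFFFFFFFFFFFF; /* 281474976710655, needs 48 bits */
    float f = 0xFFFFFFFFFFFF;         /* rounded during the integer-to-float conversion */

    printf("FLT_MANT_DIG = %d\n", FLT_MANT_DIG); /* 24 significand bits, fewer than 48 */
    printf("exact        = %lld\n", exact);
    printf("rounded      = %.1f\n", f);          /* 281474976710656.0, i.e. 2^48 */
    return 0;
}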

C Control Strings?

I know that in other languages such as Java when I use System.out.println(); and I put a variable inside of it like an int that holds the number 22 it will print 22 to the console.
In C if I do the same thing with printf(); I need to specify the type in the string such as printf("%d", n); I also know that Java has its own printf function.
What I am trying to get at here is how the C control String works compared to other languages such as Java where you don't have to provide the type identifier in the System.out.println(); and it automatically recognizes the variable is an int.
Is this part of C's focus on efficiency? Does it not actually check the type, relying instead on the programmer to understand the type they are providing?
In fact, printf is neither an efficient nor a type-safe way of writing data.
There is a performance penalty because the format string is parsed and the type-specific actions are chosen at run time, whereas in C++ and Java this can be done at compile time. Moreover, variadic arguments are forced to be passed on the stack, which is less efficient than passing them in registers.
And what is even more important is that printf is not type-safe. It is possible to pass any number of parameters of any types to printf, ignoring format string prescriptions. Of course, it can easily trigger undefined behavior.
The only reason for such behavior is that there is no function overloading in C.
On the other hand, it's not that bad. First, most compilers parse the format string and issue a warning if it isn't consistent with the parameters passed to printf. Second, the aforementioned performance penalty is in fact negligible compared to the cost of formatting and printing the text after the types have been deduced.
Both C's predefined data types (integers, characters, etc.) as well as user defined types carry no type information with them at run-time. Thus if you use a 4 byte integer in your code, it occupies only 4 bytes in memory (disregarding any padding needed for alignment). This is great for efficiency reasons, but it means that functions that handle multiple data types (like printf) need to be told via the format/control string what the types of the arguments are.
So when printf receives a format string of "%d %f" it knows that the type of the first non-format argument is integer and the second argument is of type float.
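To illustrate how the format string drives argument extraction, here is a toy variadic function built on <stdarg.h>. It's a simplified sketch of the mechanism, not how the real printf is implemented:

#include <stdarg.h>
#include <stdio.h>

/* Prints one value per format character: 'd' for int, 'f' for double, 's' for a string. */
static void tiny_print(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        switch (*fmt) {
        case 'd': printf("%d\n", va_arg(ap, int));          break;
        case 'f': printf("%f\n", va_arg(ap, double));       break;
        case 's': printf("%s\n", va_arg(ap, const char *)); break;
        }
    }
    va_end(ap);
}

int main(void)
{
    /* The format characters must match the argument types, exactly as with printf:
       va_arg has no way of checking what was really passed. */
    tiny_print("dfs", 42, 3.14, "hello");
    return 0;
}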
The conversion specifiers we pass in the printf() format string tell the library how to interpret each argument and what kind of data we want to print.
Control strings are also known as format strings, and the % sequences inside them are format specifiers. They are used in formatted input and output of data, for example in printf() and scanf().
In scanf() the control string describes how to read data into memory in a formatted way, whereas in printf() it describes how to write data to the output device, e.g. the monitor screen, in a formatted way.
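For example, the same conversion specifier drives both directions (a minimal sketch; %d assumes the variable really is an int):

#include <stdio.h>

int main(void)
{
    int n;

    /* scanf uses %d to convert the characters typed by the user into an int in memory;
       printf uses %d to convert that int back into characters on the screen. */
    if (scanf("%d", &n) == 1)
        printf("You entered %d\n", n);
    return 0;
}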
Some common format specifiers are listed below:
________________________________________________________________
FORMAT SPECIFIER :: DESCRIPTION
________________________________________________________________
%c                   Character
%d or %i             Signed decimal integer
%e or %E             Scientific notation (mantissa/exponent)
%f                   Decimal floating point
%g or %G             Uses the shorter of %e/%E or %f
%hi                  Signed short integer
%hu                  Unsigned short integer
%l, %ld or %li       Signed long integer
%lf                  Double floating point
%Lf                  Long double floating point
%lu                  Unsigned long integer
%lli or %lld         Signed long long integer
%llu                 Unsigned long long integer
%o                   Unsigned octal integer
%p                   Pointer address (void *)
%s                   String (char *)
%u                   Unsigned decimal integer
%x or %X             Unsigned hexadecimal integer
%n                   Prints nothing; stores the count of characters written so far
%%                   Prints a % character
________________________________________________________________

C int datatype and its variations

Greetings, and again today, while experimenting with the C language under the C99 standard, I came across a problem which I cannot comprehend and need an expert's help with.
The Code:
#include <stdio.h>

int main(void)
{
    int Fnum = 256; /* The first number to be printed out */
    printf("The number %d in long long specifier is %lld\n", Fnum, Fnum);
    return 0;
}
The Question:
1.) This code prompted a warning message when I tried to compile and run it.
2.) But the strange thing is, when I change the specifier from %lld to %hd or %ld, the warning message is not shown and the value printed on the console is the correct number, 256. Everything also seems normal if I try %u, %hu and %lu. In short, the warning and the wrong printed value only happen when I use a variation of the long long specifier.
3.) Why is this happening? I thought the memory size for long long is large enough to hold the value 256, so why can't it be used to print out the appropriate value?
The warning message (for the above source code):
C:\Users\Sam\Documents\Pelles C Projects\Test1\Test.c(7): warning #2234: Argument 3 to 'printf' does not match the format string; expected 'long long int' but found 'int'.
Thanks for spending time reading my question. God bless.
You're passing the Fnum variable to printf, which is typed int, but it's expecting long long. This has very little to do with whether a long long can hold 256, just that the variable you chose is typed int.
If you just want to print 256, you can get a constant that's typed to unsigned long long as follows:
printf("The number %d in long long specifier is %lld\n" ,256 , 256ULL);
or cast:
printf("The number %d in long long specifier is %lld\n" , Fnum , (long long int)Fnum);
There are three things going on here.
printf takes a variable number of arguments. That means the compiler doesn't know what type the arguments (beyond the format string) are supposed to be. So it can't convert them to an appropriate type.
For historical reasons, however, integer types smaller than int are "promoted" to int when passed in a variable argument list.
You appear to be using Windows. On Windows, int and long are the same size, even when pointers are 64 bits wide (this is a willful violation of C89 on Microsoft's part - they actually forced the standard to be changed in C99 to make it "okay").
The upshot of all this is: The compiler is not allowed to convert your int to a long long just because you used %lld in the argument list. (It is allowed to warn you that you forgot the cast, because warnings are outside standard behavior.) With %lld, therefore, your program doesn't work. But if you use any other size specifier, printf winds up looking for an argument the same size as int and it works.
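A small sketch of those promotion rules in a variable argument list (assuming a typical platform where short is 16 bits, int and long are 32 bits, and long long is 64 bits):

#include <stdio.h>

int main(void)
{
    short h = 256;
    int   i = 256;

    /* h is promoted to int when passed through "...", so both %hd and %d work. */
    printf("%hd %d\n", h, h);

    /* i is NOT promoted to long long; the conversion has to be written explicitly. */
    printf("%lld\n", (long long)i);
    return 0;
}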
When dealing with a variadic function, the caller and callee need some way of agreeing the types of the variable arguments. In the case of printf, this is done via the format string. GCC is clever enough to read the format string itself and work out whether printf will interpret the arguments in the same way as they have been actually provided.
You can get away with slightly different types of arguments in some cases. For example, if you pass a short then it gets implicitly converted to an int. And when sizeof(int) == sizeof(long int) then there is also no distinction. But sizeof(int) != sizeof(long long int) so the parameter fails to match the format string in that case.
This is due to the way varargs work in C. Unlike a normal function, printf() can take any number of arguments. It is up to the programmer to tell printf() what to expect by providing a correct format string.
Internally, printf() uses the format specifiers to access the raw memory that corresponds to the input arguments. If you specify %lld, it will try to access a 64-bit chunk of memory (on Windows) and interpret what it finds as a long long int. However, you've only provided a 32-bit argument, so the result would be undefined (it will combine your 32-bit int with whatever random garbage happens to appear next on the stack).
