Clarification on various format specifiers in C program - c

I don't understand the major difference between %p, %u, %x and %d, except that %x shows hexadecimal, %u is used for unsigned integers and %d is for any integer. I got very confused after I took an integer variable and printed its address and its value (a positive integer) separately: whichever format specifier I used, the output was printed correctly (apart from the difference between the hexadecimal and decimal number systems). So what is the major difference?
And if there is not much difference then which format specifiers are preferable for printing what type of variables?
Another doubt: do pointers of every level of indirection (I mean int *p; int **p; int ***p; etc.) occupy the same size (the size needed to store a valid address on the machine)? If not, what is the size of these pointers?
Thanks for your help.

The %u, %x, %d, and %p format specifiers are used as follows:
%u: expects an unsigned int as a parameter and prints it in decimal format.
%x: expects an unsigned int as a parameter and prints it in hexadecimal format.
%d: expects an int as a parameter and prints it in decimal format.
%p: expects a void * as a parameter and prints it in an implementation defined way (typically as a hexadecimal number)
Additionally, %u, %x, %d can be prefixed with a length modifier:
l: denotes a long int or unsigned long int
ll: denotes a long long int or unsigned long long int
h: denotes a short int or unsigned short int
hh: denotes a signed char or unsigned char
Regarding pointer sizes, int *, int **, int ***, etc. are not required to be the same size, although on most implementations they will be.

With format specifiers, you tell the computer how to interpret the given variable/data.
A quick demo:
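#include <stdio.h>
int main(void)
{
    int x = -5;
    printf("x value as int: [%d]\n", x);   /* matching specifier */
    /* The next three calls pass an int where another type is expected. */
    /* That is undefined behavior; the output below is merely typical. */
    printf("x value as unsigned int: [%u]\n", x);
    printf("x value as hexadecimal: [%x]\n", x);
    printf("x value as pointer: [%p]\n", x);
    return 0;
}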
Output:
x value as int: [-5]
x value as unsigned int: [4294967291]
x value as hexadecimal: [fffffffb]
x value as pointer: [0xfffffffb]
It's the same value given every time, i.e. x = -5.
We see the exact representation when given the right format specifier (the first case).
In the second case we see a very large number. Explaining why in full would take too long here, but look up how negative integers are represented in two's complement: the 32 bits that encode -5 read as 4294967291 when interpreted as an unsigned int.
In the third case we see the hexadecimal representation of that same number, 4294967291. Hexadecimal numbers are usually written with a leading 0x, but %x does not add one (%#x does).
The last one shows how x would look if it were a memory address, again in hexadecimal. Strictly speaking, passing an int where %p expects a void * is undefined behavior, so this output is not guaranteed.

Related

Why does unsigned short (0xffff) print 65,535 and unsigned int (0xffffffff) print -1 in C?

I think the title explains pretty well what I'm asking so here is my code.
#include <stdio.h>
unsigned short u_short = 0xffff;
unsigned int u_int = 0xffffffff;
int main(){
printf("unsigned short = %d\n", u_short);
printf("unsigned int = %d\n", u_int);
return 0;
}
Here is my printout.
[Image: printout showing "unsigned short = 65535" and "unsigned int = -1"]
printf("unsigned int = %d\n", u_int); is undefined behavior (UB) when u_int is outside the non-negative int range. Do not use "%d" to print an unsigned int.
Use printf("unsigned int = %u\n", u_int);
This is likely what happened in your C implementation:
In printf("unsigned short = %d\n", u_short);, the unsigned short value 65,535 is automatically converted to an int with the same value.[1][2]
The int value 65,535 is passed to printf, which formats it as “65535” due to the %d conversion specification.
In printf("unsigned int = %d\n", u_int);, the unsigned int value 4,294,967,295 is passed to printf; it is not converted to an int. As an unsigned int, 4,294,967,295 is represented with 32 one bits.
Because of the %d conversion specification, printf seeks an int value that was passed as an argument. For this, it finds the bits passed for your unsigned int, because an unsigned int and an int are passed in the same place in your C implementation, so the printf looking for an int finds the bits in the same place the calling routine put the unsigned int bits.[3]
When interpreted as an int type, these bits, 32 ones, represent the value −1.[3] Given the −1 value, printf formats it as “-1” due to the %d conversion specification.
Footnotes
[1] In many places in expressions, including arguments corresponding to ... of a function declaration, values of types narrower than int are automatically promoted to int, as part of the integer promotions.
[2] A C implementation could have an unsigned short as wide as an int, in which case this conversion would not occur. That is rare these days.
[3] This is a description of what likely happened in your C implementation. The behavior is not defined by the C standard and may vary in other C implementations or even in different programs in your C implementation.
printf has some anomalies due to the default argument promotions. In particular, arguments of type char and short are promoted to int when passing them to printf. Usually this is fine, but sometimes it results in surprises like these. What you get when you promote an unsigned 16-bit 0xffff to 32 bits is not 0xffffffff.
printf has some relatively little-known and relatively rarely-used modifiers to, in effect, undo those promotions and print char and short arguments as what they "really were". So you'll see more-consistent results if you tell printf that you were actually passing a short, like this:
printf("unsigned short = %hd\n", u_short);
printf("unsigned int = %d\n", u_int);
Now printf knows that the argument in the first call was really a short, so it treats it as such. On my machine, this now prints
unsigned short = -1
unsigned int = -1
(Now, with that said, it's arguably a bad idea to print unsigned integers with %d, as the other answers and comments have explained.)

How does printf know whether a variable passed is signed or unsigned

Given the following code snippet:
signed char x = 150;
unsigned char y = 150;
printf("%d %d\n", x, y);
The output is:
-106 150
However, I'm using the same format specifier for variables that are represented in memory in the same way. How does printf know whether it's signed or unsigned?
Memory representation in both cases is:
10010110
signed char x = 150; incurs implementation-defined behavior, as 150 is not in the range of OP's signed char.
The 150 is an int and, not fitting in the signed char range, undergoes this conversion:
the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised. C17dr § 6.3.1.3 3
In this case, x took on the value of 150 - 256.
Good code would not rely on this result of -106, and would instead avoid assigning values outside its range to a signed char.
Then ...
Commonly, both signed char x and unsigned char y are promoted to int before being passed as arguments to a ... function, due to the default argument promotions (types whose values all fit in the range of int are promoted to int).
Thus printf("%d %d\n", x, y); is not a problem. printf() receives 2 ints, and that matches the "%d" specifiers.
Let's first recognize this issue:
char x = 150;
x never had the value 150 to begin with. That 150 is implicitly converted to the variable's type, which is signed here (signed char in the question; plain char is also signed on many platforms). Hence x immediately assumes the value -106 on typical implementations, since 150 can't be represented in a signed 8-bit value. You might as well have said:
char x = (signed char)150; // same as -106, which is 0x96 in hex
Second, char and short values passed as variadic arguments are automatically promoted to int before the call. For signed types this includes sign extension.
So when you invoke printf("%d %d\n", x, y);, the compiler effectively treats it as this:
printf("%d %d\n", (int)x, (int)y);
The following gets passed (on the stack or in registers, shown for a 32-bit int):
pointer to "%d %d\n"
0xffffff96 (int)x
0x00000096 (int)y
When printf runs, it parses the format string (%d %d\n) and sees that it needs to interpret the next two arguments as signed integers. It reads 0xffffff96 and 0x00000096 respectively and renders both to the console in decimal form.
How does printf know if the variable passed is signed or unsigned?
The printf function doesn't "know".
You effectively tell it by using either a signed conversion specifier (d or i) or an unsigned conversion specifier (o, u, x or X).
And if you print a signed integer as unsigned or vice versa, printf just does what you told it to do.
I used the same specifier "%d", and it printed different values (the positive one and the negative one)
In your example, you are printing signed and unsigned char values.
signed char x = 150;
The value in x is -106 (8 bits signed) because 150 is greater than the largest value for signed char, and the out-of-range conversion typically wraps. (The signed char type's range is -128 to +127 with any hardware / C compiler that you are likely to encounter.)
unsigned char y = 150;
The value in y is 150 (8 bits unsigned) as expected.
At the call site, the signed char value -106 is sign-extended to a larger integer type (int). The unsigned char value 150 is converted without sign extension.
By the time printf is called, the values that have been passed to it have different representations.

C: Correct way to print "unsigned long" in hex

I have a function that gets an unsigned long variable as parameter and I want to print it in Hex.
What is the correct way to do it?
Currently, I use printf with "%lx"
void printAddress(unsigned long address) {
printf("%lx\n", address);
}
Should I look for a printf pattern for unsigned long hex? (and not just "long hex" as mentioned above)
Or does printf convert numbers to hex only using the bits? - so I should not care about the sign anyway?
Edit/Clarification
This question was rooted in a confusion: hex is just another way to express bits, which means that signed/unsigned number is just an interpretation. The fact that the type is unsigned long therefore doesn't change the hex digits. Unsigned just tells you how to interpret those same bits in your computer program.
You're doing it right.
From the manual page:
o, u, x, X
The unsigned int argument is converted to unsigned octal (o), unsigned decimal (u), or unsigned hexadecimal (x and X) notation.
So the value for x should always be unsigned. To make it long in size, use:
l
(ell) A following integer conversion corresponds to a long int or unsigned long int argument [...]
So %lx is unsigned long. An address (pointer value), however, should be printed with %p and cast to void *.
I think the following format specifier should work; give it a try. The # flag adds a 0x prefix to nonzero values:
printf("%#lx\n",address);

Sign extension query in case of short

Given,
unsigned short y = 0xFFFF;
When I print
printf("%x", y);
I get : 0xFFFF;
But when I print
printf("%x", (signed short)y);
I get : 0xFFFFFFFF
Whole program below:
#include <stdio.h>
int main() {
unsigned short y = 0xFFFF;
unsigned short z = 0x7FFF;
printf("%x %x\n", y,z);
printf("%x %x", (signed short)y, (signed short)z);
return 0;
}
Sign extension happens when we typecast lower to higher byte data type, but here we are typecasting short to signed short.
In both cases sizeof((signed short)y) or sizeof((signed short)z) prints 2 bytes. Short remains of 2 bytes, if sign bit is zero as in case of 0x7fff.
Any help is very much appreciated!
Output of the first printf is as expected. The second printf produces undefined behavior.
In the C language, when you pass a value of a type narrower than int as a variadic argument, that value is always implicitly converted to type int. It is not possible to physically pass a short or char variadic argument. That implicit conversion to int is where your "sign extension" takes place.
For this reason, your printf("%x", y); is equivalent to printf("%x", (int) y);. The value that is passed to printf is 0xFFFF of type int. Technically, %x format requires an unsigned int argument, but a non-negative int value is also OK (unless I'm missing some technicality). The output is 0xFFFF.
Conversion to int happens in the second case as well. I.e. your printf("%x", (signed short) y); is equivalent to printf("%x", (int) (signed short) y);. The conversion of 0xFFFF to (signed short) is implementation-defined, because 0xFFFF is apparently out of range of signed short on your platform. But most likely it produces a negative value (-1). When converted to int it produces the same negative value of type int (again, -1 represented as 0xFFFFFFFF for a 32-bit int). The further behavior is undefined, since you are passing a negative int value for format specifier %x, which requires unsigned int argument. It is illegal to use %x with negative int values.
In other words, formally your second printf prints unpredictable garbage. But practically the above explains where that 0xFFFFFFFF came from.
Let's break it down and into smaller pieces:
Given,
unsigned short y = 0xFFFF;
Assuming a two-byte unsigned short, its maximum value is 2^16-1, which is indeed 0xFFFF.
When I print
printf("%x", y);
Due to the default argument promotions (as printf() is a variadic function), the value of y is implicitly promoted to type int. With the %x format specifier it is treated as unsigned int. Assuming the common two's complement representation and a four-byte int type, the most significant bit is zero, so the bit patterns of int and unsigned int are simply the same.
But when I print
printf("%x", (signed short)y);
What you have done is cast to a signed type that cannot represent the value 0xFFFF. Such a conversion, as the standard says, is implementation-defined, so the result depends on your implementation. After the implicit conversion to int you apparently have a bit pattern of 32 ones, which is printed as 0xFFFFFFFF.

What is the difference between %lx and %ld when printing an address from pointer?

I have tried googling the first one for %lx, but I have no good results, BUT I have successfully searched up %ld which is just long int. Necessary for printing addresses I guess, but what is %lx for?
This is where I am confused:
int main()
{
int value = 25;
int *pointer = &value;
printf("%ld\n", pointer); // prints out the address of variable value( I hope)
printf("0x%lx\n", pointer); // Completely confused here, is this perhaps address in hex?
}
Would be awesome if someone can clear this confusion I am having!
I have run this code, and I have the results, but I am still not sure what the lx does. I have seriously tried googling "%lx", but found no results explaining it.
Edit: if I use 'p' to print address then have I been wrong in thinking %ld prints address? Confused.
They're both undefined behavior.
To print a pointer with printf, you should cast the pointer to void * and use "%p".
That being said:
We can talk about the difference between "%ld" and "%lx" when trying to print integers. %ld expects a variable of type long int, and %lx expects a variable of type long unsigned int.
More or less, though, the differences between x, o, d and u are about how the numbers are going to be printed.
x prints an unsigned number in hexadecimal.
o prints an unsigned number in octal.
u prints an unsigned number in decimal.
d prints a signed number in decimal.
i prints a signed number in decimal.
We can then attach l to the format string for formats like %lx to specify that instead of an int, we're using a long int (That is, an unsigned long int, or long int).
There is a table at cppreference that has additional information: http://en.cppreference.com/w/c/io/fprintf
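%p and %lx print the address in hexadecimal while %ld prints it in decimal. Of the three, only %p (with the pointer cast to void *) is defined for pointers; its output is implementation-defined but typically hexadecimal.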
