formatter size in printf() - c

Regarding question Why do I have to specify data type each time in C? and my earlier question how to read memory bytes one by one in hex(so without any format) with printf()
Is it possible to clarify the below question for me?
int32_t a[3]={21,3,1000031};
char* p1=&a[0]; /* char is 1-byte and &a[0] is 0x0004 for example */
printf("p1 in hex=%x\n",*p1); /* 4 bytes starting from word-aligned address p1 */
printf("(p1+3)=%d",(p1+3)); /* 4 bytes starting from a NON word-aligned address? (line 2 printf) */
printf("p1+3=%p",p1+3); /* line 3 printf */
%x and %d ALWAYS tell printf to use int format which in my pc is a 4-byte? am i right?
(p1+3) is a non-word-aligned address, 0x004+3=0x007, so what does printf() show in this case? In other words, which bytes are concerned by the line 2 printf?
also, %p formatter(void *) does it need 1 byte to read(because of char)or since we talk about pointers and they always take 4 bytes(one-word)?
to sum up my questions, %d %x %p,.. do they read a constant size(depending on pc) from memory or it depends on what are the size of their corresponding arguments?

Variadic parameters, just like all other parameter passing in C, is pass-by-value only. There's no address of anything going on in your case. In some cases you're printing the value of a pointer variable, though. I'll try to explain in order:
First:
printf("p1 in hex=%x\n",*p1);
Prints whatever *p1 is as a hex number. That's either 15 (hex 15 is 21 in decimal, the low byte of a[0]) or 0, depending on whether you have a little- or big-endian machine, respectively.
Next:
printf("(p1+3)=%d",(p1+3));
Will try to print whatever p1 + 3 is as a decimal number. Since p1 is a pointer, that's not really a sane thing to do, and technically this statement causes undefined behaviour. You should be using %p to print a pointer. Assuming pointers and int are the same size on your machine, you'll probably get some number, but probably not a really meaningful one.
Last:
printf("p1+3=%p",p1+3);
%p prints a pointer type, so this line is correct. You'll (probably) get the same value as in #2, except in hexadecimal format. That's all machine/implementation-specific, though.
As to your other questions:
%x and %d ALWAYS tell printf to use int format which in my pc is a 4-byte? am i right?
%x is for unsigned int and %d is for int. The %x will give you hexadecimal output, and the %d decimal output. If int is a four-byte type on your machine, they'll both print the corresponding 4 bytes-worth of arguments that you passed.
(p1+3) is a non-word-aligned address, 0x004+3=0x007, so what does printf() show in this case? In other words, which bytes are concerned by the line 2 printf?
Since you're printing the pointer value itself, the alignment is meaningless. You shouldn't be using %d to do it though (as mentioned above). The address in question is probably not 7, either... I'm not really sure where you got that from.
also, %p formatter(void *) does it need 1 byte to read(because of char)or since we talk about pointers and they always take 4 bytes(one-word)?
%p must be paired with a void * argument, as you say. It will print the appropriate size for the pointer type on your machine (sounds like 4 bytes in your case).
to sum up my questions, %d %x %p,.. do they read a constant size(depending on pc) from memory or it depends on what are the size of their corresponding arguments?
They don't necessarily read anything from memory - that depends on how your ABI works and what the calling convention is for variadic functions on your machine. You have to match types between the format specifiers and the corresponding arguments, or else you'll cause undefined behaviour.

I don't think your code shows clearly what you want to ask, but to answer your sum-up question:
%d %x %p,.. do they read a constant size(depending on pc) from memory or it depends on what are the size of their corresponding arguments?
The size they read depends on the size of the specific type on the machine. For example, on a 32-bit machine, %d will read 4 bytes, because it assumes the argument is an int.
I think this piece of code shows the general idea:
int a = 1089;
printf("%c\n", a); // prints "A": 1089 is 0x441, and %c converts the int to unsigned char (0x41)
printf("%d\n", a); // prints "1089"

Related

Why can't I use %d to print the address of a variable

#include <stdio.h>
int main()
{
int a = 10;
printf("address = %d\n ", &a);
return 0;
}
test.c:11:29: warning: format specifies type 'int' but the argument has type 'int *' [-Wformat]
printf("address = %d\n ", &a);
output if I use %d
address = -376933160
output if I use %p, this is also weird, I think I should get a positive integer instead of this?
address = 0x7ffee07004d8
I know the correct way to do this should be %p, but I see in YouTube video, people can just use %d and don't get any problems. I wonder if my setting is wrong, I am using VS code to run the program.
video example
Update
Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution ?
Format string identifiers have specific purposes. The %d identifier is for int values (smaller types such as short are promoted to int when passed to printf). long uses %ld, and there are further variants such as %lld for long long.
You have a pointer, that holds an address. Although it is a numerical value, it's special in its purpose and should be formatted as %p, the proper way to print pointers.
Also, the size of pointers may change by architecture, so it may not be the same size as the %d identifier expects.
Regarding the different values that were printed to the screen: If you were to print the address of a variable in these two formats in the same execution, you may get again one 'positive' hex value and one 'negative' integer value. But, these values are actually the same. Almost.
The integer value representation of the variable is only the lower 32 bits of the 64 bit value, and it's negative because it is signed, and as a pointer representation (since it's an address and sign doesn't matter) it is unsigned hex value and looks positive, though both are the same value in memory (At least the 32 bits that are equal). This is happening because of different width of variable and something called "Two's complement". You can further read about it here
Note: The two values you mentioned are not the same, since you got them in different executions of the program and ASLR was on, the actual address value of a has changed between executions.
It is important to mention, even though I refer to pointers in this answer as numerical values, it is not correct to regard them as so, as they hold addresses which are their own type category (Thanks @JohnBullinger for clarifying).
Use the correct format identifier to avoid this warning, as it is informing you that you may have mistyped or used the wrong variable, since it is not a regular numerical value you're trying to print, but an address.
Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution ?
Because it is Undefined Behaviour. It does what it does. No rules.
Using the wrong format specifier is cheating the compiler. Imagine that I want to buy your old car. The price is $5000. I pay you but not in US$. If I pay £5000 you are the winner. But if I pay in drachmas 5000GDR you will not be very happy. And your behaviour will be undefined. Maybe you will chase me with the baseball bat in your hand or maybe you simply give up and accept your losses.
Same happens with the compiler and the printf function.
For the same reason you can't use %f, or %c, or anything other than %p.
%d expects its corresponding argument to have type int; if the argument doesn't have type int, then the behavior is undefined. You may get reasonable-looking output, or you may not.
On most 64-bit machines sizeof (int *) > sizeof (int). On x86-64 pointers are 64 bits wide while ints are 32 bits wide. If you pass a pointer as the argument for %d, printf will only pull half of the pointer value - the output is gibberish.
Now I am aware -376933160 is an incorrect value, but still I wonder why it just outputs a random number instead of stopping the execution ?
Again, you're likely only seeing the lower 32 bits of a 64-bit pointer value.
Undefined behavior does not mean "stop execution" or "print an error" or anything else - it just means that neither the compiler nor the runtime environment are required to handle the situation in any particular way.
The loss of bits and the appearance of the minus sign provoke a warning when you do it directly:
adr.c:7:13: warning: overflow in conversion from 'long int' to 'int' changes value from '140732663858392' to '-529529640' [-Woverflow]
7 | int adr = 0x7ffee07004d8;
e07004d8 (the lower 32 bits) is over 2^31 - 1, the largest value a 32-bit int can hold, so it comes out negative. With 0x7ffe0000000a it converts to '10'.
Not a random number, and no reason to "stop execution".
Such a "cast" rarely makes sense, especially not on pointers. (Unless you take a pointer to convert it to a long just to play with it).
The compiler will generally issue a warning if there is a conversion between pointer and integer types without an explicit cast as a protection to the programmer. You can eliminate the warning by casting &a to long long int.
Depending on the system you may be able to print the decimal value if you cast it with %lld:
printf("address = %lld\n ", (long long int)&a);
However, not all systems may support ll as a valid length.

Difference b/w getting an address of variable using %p and %d

Here is an example
#include <stdio.h>
int main()
{
int a;
printf("%d\n",&a);
printf("%p\n",&a);
return 0;
}
======Output=======
-2054871028
0x7ffd8585280c
Do these two address point to same address in RAM ?
And how can i get the value by using each one of them, especially the second one.
%d format specifier is used to output a signed decimal integer.
From C Standard#7.21.6.1p8
d,i
The int argument is converted to signed decimal in the style [-]dddd. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it is expanded with leading zeros. The default precision is 1. The result of converting a zero value with a precision of zero is no characters.
%p prints the pointer.
From C Standard#7.21.6.1p8
p
The argument shall be a pointer to void. The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner. [emphasis mine]
This statement
printf("%d\n",&a);
leads to undefined behavior because %d is not valid for printing a pointer.
From C Standard#7.21.6.1p9
If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
When you take the address of the variable a by writing &a, what you're really doing is generating a pointer to a.
%p is designed for printing pointers. You should use %p to print pointers.
%d is not designed for printing pointers. It tries to print their values as signed decimal, which can be confusing (as you've seen), and it may not print the entire value, on a machine where pointers are bigger than integers. (For example, if you try to print pointers with %d in most "64 bit" environments, you can get even stranger results -- and that might be part of what happened here.)
This is an easy mistake to make. Good compilers should warn you about it. Mine says "warning: format specifies type 'int' but the argument has type 'int *'".
But yes, both 0x7ffd8585280c and -2054871028 do "point to the same address in RAM", because they're both the same number, the same address. (Well, they're trying to be the same address. See footnote below.)
I'm not sure what you mean by "And how can I get the value". Are you trying to get the value of the pointer, or the value of what the pointer points to?
You've already got the value of the pointer -- it's the address 0x7ffd8585280c. And since we know it points to the variable a, we know the value it points to, too. Things will be a bit more clear if we do it like this:
int a = 5;
int *ip = &a;
printf("value of pointer: %p\n", ip);
printf("pointed-to value: %d\n", *ip);
Without the explicit pointer variable ip, we could write
int a = 5;
printf("value of pointer: %p\n", &a);
printf("pointed-to value: %d\n", *&a);
But that's pretty silly, because the last line is equivalent to the much more straightforward
printf("pointed-to value: %d\n", a);
(Taking the address of a variable with & and then grabbing the contents of the pointer using * is a no-op: it's a lot like writing a + 1 - 1.)
Footnote: I said that 0x7ffd8585280c and -2054871028 were the same number, but they're not, they're just trying to be. 0x7ffd8585280c is really 140726843549708 in decimal, and -2054871028 is really 0x8585280c, which is the lower-order 8 digits of 0x7ffd8585280c. It looks like %p on your machine is printing pointers as 48-bit values by default. I was about to be surprised by that, but then I realized my Mac does the same thing. Somehow I'd never noticed that.

Why does %d,and %p give different values? [duplicate]

As you can see in the program, The first output is 6356744 and the second output is 0060FF08, why is it different? Is the %d typecasting it into an integer, if so, how?
#include<stdio.h>
int main()
{
int *a;
int b = 7;
a = &b;
printf(" The value of a = %d",a);
printf("\n The value of a= %p",a);
}
Printing a pointer with %d is formally undefined behavior, meaning anything can happen, including a program crash. Your program will for example likely break when you compile it as a 64 bit application, where int is 32 bits but a pointer is likely 64 bits. Therefore, always use %p and never anything else when printing a pointer.
There is no implicit conversion taking place - the printf family of functions doesn't have that kind of intelligence - it doesn't know the type passed. With the format specifier, you tell the function which type it is getting. And if you lie to printf and say "I'll give you an int" and then give it a pointer, you unleash bugs. This makes the printf family of functions very dangerous in general.
(The only implicit conversion that takes place in printf is when you pass small integer types or float, in which case the "default argument promotions" take place and promote the argument either to int or double. This is not the case here, however.)
In this specific case, you happened to get the decimal representation of 0x0060FF08, which is by no means guaranteed.
Pedantically, you should also cast the pointer to type (void*) since this is what %p expects.
%p prints a pointer and it's not necessarily hexadecimal, or even a number
If format specifiers do not match the datatype of the provided argument, you get undefined behaviour. %d expects an int value, so when you pass a pointer value, you get undefined behaviour (cf., for example, cppreference.com-printf):
...If any argument after default argument promotions is not the type
expected by the corresponding conversion specifier, or if there are
fewer arguments than required by format, the behavior is undefined.
The (only) correct format specifier for printing pointer values is %p, usually printing the address in hex format.
One possible outcome of this undefined behaviour is that %d takes the pointer value as a 32- or 64-bit integral value and prints it in decimal; that number then corresponds to the address that the correct %p format would show in hex.
%d prints an int, in decimal.
%p prints a pointer (strictly speaking, a void * pointer) in an implementation-defined way, typically in hexadecimal.
On a machine where ints and pointers have different sizes, trying to print a pointer using %d will typically give a meaningless result. For example, on a lot of machines these days, ints are 32 bits while pointers are 64 bits. So if you try to print a pointer using %d, what you might get is half the pointer value, in decimal.
(Strictly speaking, trying to print a pointer using %d is undefined, no matter what the relative sizes are.)
Bottom line, use the right printf specifier for the job. Use %d to print ints. Use %p to print pointers. Don't try to mix 'n' match.

How can a variable have two different addresses at same point of time? [duplicate]

I tried to play around a little with pointers for some assigned value of 'i', and what I found is that two different addresses are shown for the specifiers %u and %lu, %llu. How is it possible that a variable can have two different addresses at the same instant of execution?
#include <stdio.h>
int main(void)
{
int i;
float f;
printf("\nEnter an integer:\t");
scanf("%d",&i);
printf("\nValue of address of i=%u",&i);
printf("\nvalue of address of i=%d",&i);
printf("\nValue of address of i=%lu",&i);
printf("\nValue of address of i=%llu",&i);
printf("\nvalue of i=%d",i);
printf("\nvalue of i=%u",i);
printf("\nvalue of i=%lu",i);
printf("\nvalue of i=%llu\n",i);
}
This is the output -
aalpanigrahi@aalpanigrahi-HP-Pavilion-g4-Notebook-PC:~/Desktop/Daily programs/pointers$ ./pointer001
Enter an integer: 12
Value of address of i=1193639268
value of address of i=1193639268
Value of address of i=140725797092708
Value of address of i=140725797092708
value of i=12
value of i=12
value of i=12
value of i=12
Here we can clearly see that for %u and %d the address is 1193639268 (though the output of %d and %u might not be equal in all cases) and the output of %lu and %llu is 140725797092708. What is its physical significance?
The proper format specifier for printing a pointer is %p.
Using the wrong format specifier, such as %d, %u, %lu, or %llu invokes undefined behavior.
As for the specific behavior you're seeing, a pointer on your particular implementation is probably an 8 byte value, while an int or unsigned int is probably a 4 byte value. As a result, using %d or %u only reads the first 4 bytes of the 8 byte value passed in to the function and prints that value. When you then use %lu or %llu, all 8 bytes are read and printed as such.
Again, because you're invoking undefined behavior, you can't depend on this particular output to be consistent. For example, compiling in 32-bit mode vs 64-bit mode will likely give different results. Best to use %p, and also to cast the pointer to void *, as that is the specific pointer type expected by %p.
There is only one address. You're using integer printing codes of varying widths, so sometimes you see 32 bits of address, and sometimes you see 64 bits of address. How much is valid depends on your system; on a 32 bit system, printing 64 bits means 32 bits of the value are garbage; on a 64 bit system, printing 32 bits means you're omitting half the pointer.
There is a dedicated format code, %p that's intended for printing pointers, use that, not integer printing codes.
You're using the wrong format specifier, which results in undefined behavior. In this specific case the behavior is that 1193639268 (0x47257D64) is the lower part of 140725797092708 (0x7FFD47257D64)
The correct format specifier for addresses is %p
It doesn't. It's just that you are using incorrect format specifiers for pointer types.
The behaviour on doing that is undefined. The output you observe is a manifestation of that undefined behaviour.
Use %p for a pointer, and cast the pointer argument to a const void* type (many compilers are lax on this last point but it's good practice nonetheless).

difference between printing a memory address using %u and %d in C?

I am reading a C book. To print out a memory address of a variable, sometimes the book uses:
printf("%u\n",&n);
Sometimes, the author wrote:
printf("%d\n",&n);
The result is always the same, but I do not understand the differences between the two (I know %u for unsigned).
Can anyone elaborate on this, please?
Thanks a lot.
%u treats the integer as unsigned, whereas %d treats the integer as signed. If the integer is between 0 and INT_MAX (which is 2^31-1 on systems where int is 32 bits), then the output is identical for both cases.
It only makes a difference if the integer is negative (for signed inputs) or between INT_MAX+1 and UINT_MAX (e.g. between 2^31 and 2^32-1). In that case, if you use the %d specifier, you'll get a negative number, whereas if you use %u, you'll get a large positive number.
Addresses only make sense as unsigned numbers, so there's never any reason to print them out as signed numbers. Furthermore, when they are printed out, they're usually printed in hexadecimal (with the %x format specifier), not decimal.
You should really just use the %p format specifier for addresses, though—it's guaranteed to work for all valid pointers. If you're on a system with 32-bit integers but 64-bit pointers, if you attempt to print a pointer with any of %d, %u, or %x without the ll length modifier, you'll get the wrong result for that and anything else that gets printed later (because printf only read 4 of the 8 bytes of the pointer argument); if you do add the ll length modifier, then you won't be portable to 32-bit systems.
Bottom line: always use %p for printing out pointers/addresses:
printf("The address of n is: %p\n", &n);
// Output (32-bit system): "The address of n is: 0xbffff9ec"
// Output (64-bit system): "The address of n is: 0x7fff5fbff96c"
The exact output format is implementation-defined (C99 §7.19.6.1/8), but it will almost always be printed as an unsigned hexadecimal number, usually with a leading 0x.
%d and %u will print the same results when the most significant bit is not set. However, this isn't portable code at all, and is not good style. I hope your book is better than it seems from this example.
What value did you try? The difference unsigned vs. signed, just as you said you know. So what did it do and what did you expect?
Positive signed values look the same as unsigned so can I assume you used a smaller value to test? What about a negative value?
Finally, if you are trying to print the variable's address (as it appears you are), use %p instead.
All addresses are unsigned 32-bit or 64-bit values depending on the machine (you can't write to a negative address). Using %d isn't appropriate, although it will often appear to work. %u or %lu (note: %lu, not %ul) at least treats the value as unsigned, but %p remains the correct specifier for a pointer.
There is no real difference here; just don't get confused if you have just started learning pointers.
%u is for unsigned values, and %d is for signed ones.

Resources