// initialize a char variable, print its address and the adjacent addresses
char charvar = '\0';
printf("address of charvar = %p\n", (void *)(&charvar));
printf("address of charvar - 1 = %p\n", (void *)(&charvar - 1));
printf("address of charvar + 1 = %p\n", (void *)(&charvar + 1));
// initialize an int variable, print its address and the adjacent addresses
int intvar = 1;
printf("address of intvar = %p\n", (void *)(&intvar));
printf("address of intvar - 1 = %p\n", (void *)(&intvar - 1));
printf("address of intvar + 1 = %p\n", (void *)(&intvar + 1));
This is code I found online, and here is the relevant output:
address of charvar = 0x7fff9575c05f
address of charvar - 1 = 0x7fff9575c05e
address of charvar + 1 = 0x7fff9575c060
address of intvar = 0x7fff9575c058
address of intvar - 1 = 0x7fff9575c054
address of intvar + 1 = 0x7fff9575c05c
My question is: why is a memory address stored in hexadecimal format? We know that a char is 8 bits, or 1 byte. What does 1 byte mean in terms of addresses? If the address of the first bit of charvar is 0x7fff9575c05f, shouldn't the address of charvar + 1 be 0x7fff9575c05f + 8 bits = 0x7fff9575c067? Instead, it seems that each memory location holds 8 bits, or 1 byte. Am I correct? If so, why?
Memory is organized in bytes, and a pointer points to a specific byte, not to a single bit. The reason is probably that early computers had 8-bit registers and usually processed whole bytes at once. Since the computer was operating on whole bytes, addressing bytes instead of single bits made more sense. It also saves address space: with one address per byte instead of one per bit, the same pointer size can address eight times as much memory.
Also the memory addresses are not really stored in hexadecimal format, they are just formatted that way when printed out. Internally in memory they are binary numbers just like all the other numbers a computer works with.
The smallest part of memory you can easily access is a byte, so there would be no use making an address for every bit.
Memory is addressed in bytes rather than bits; you have to accept this as a matter of fact. Once you have the value of a certain byte in memory, you can work with its bits using bitwise operators such as &, |, ^, ~ etc. See http://www.cprogramming.com/tutorial/bitwise_operators.html
Moreover, the address is not stored in hexadecimal format; hexadecimal is only the format of the number printed to the output. To print the address in decimal instead, convert the pointer to uintptr_t and print it with the PRIuPTR format macro from <inttypes.h> (passing a pointer straight to %d is undefined behavior, although it often appears to work). See http://www.cplusplus.com/reference/cstdio/printf/
I am a little confused about the usage of memcpy. I thought memcpy could be used to copy chunks of binary data to any address we desire. I was trying to implement some small logic to directly convert 2 bytes of hex to a 16-bit signed integer without using a union.
#include <stdio.h>
#include <stdint.h>
#include <string.h>
int main()
{ uint8_t message[2] = {0xfd,0x58};
// int16_t roll = message[0]<<8;
// roll|=message[1];
int16_t roll = 0;
memcpy((void *)&roll,(void *)&message,2);
printf("%x",roll);
return 0;
}
This returns 58fd instead of fd58.
No, memcpy did not reverse the bytes as it copied them. That would be a strange and wrong thing for memcpy to do.
The reason the bytes seem to be in the "wrong" order in the program you wrote is that that's the order they're actually in! There's probably a canonical answer on this somewhere, but here's what you need to understand about byte order, or "endianness".
When you declare a string, it's laid out in memory just about exactly as you expect. Suppose I write this little code fragment:
#include <stdio.h>
char string[] = "Hello";
printf("address of string: %p\n", (void *)&string);
printf("address of 1st char: %p\n", (void *)&string[0]);
printf("address of 5th char: %p\n", (void *)&string[4]);
If I compile and run it, I get something like this:
address of string: 0xe90a49c2
address of 1st char: 0xe90a49c2
address of 5th char: 0xe90a49c6
This tells me that the bytes of the string are laid out in memory like this:
0xe90a49c2 H
0xe90a49c3 e
0xe90a49c4 l
0xe90a49c5 l
0xe90a49c6 o
0xe90a49c7 \0
Here I've shown the string vertically, but if we laid it out horizontally, with addresses increasing from left to right, we would see the characters of the string "Hello" laid out from left to right also, just as we would expect.
But that's for strings, which are arrays of char. But integers of various sizes are not really built out of characters, and it turns out that the individual bytes of an integer are not necessarily laid out in memory in "left-to-right" order as we might expect. In fact, on the vast majority of machines today, the bytes within an integer are laid out in the opposite order. Let's take a closer look at how that works.
Suppose I write this code:
int16_t i2 = 0x1234;
printf("address of short: %p\n", (void *)&i2);
unsigned char *p = (unsigned char *)&i2;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);
This initializes a 16-bit (or "short") integer to the hex value 0x1234, and then uses a pointer to print the two bytes of the integer in "left-to-right" order, that is, with the lower-addressed byte first, followed by the higher-addressed byte.
On my machine, the result is something like:
address of short: 0xe68c99c8
0xe68c99c8: 34
0xe68c99c9: 12
You can clearly see that the byte that's stored at the "front" of the two-byte region in memory is 34, followed by 12. The least-significant byte is stored first. This is referred to as "little endian" byte order, because the "little end" of the integer — its least-significant byte, or LSB — comes first.
Larger integers work the same way:
int32_t i4 = 0x5678abcd;
printf("address of long: %p\n", (void *)&i4);
p = (unsigned char *)&i4;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);
p++;
printf("%p: %02x\n", (void *)p, *p);
This prints:
address of long: 0xe68c99bc
0xe68c99bc: cd
0xe68c99bd: ab
0xe68c99be: 78
0xe68c99bf: 56
There are machines that lay the bytes out in the other order, with the most-significant byte (MSB) first. Those are called "big endian" machines, but for reasons I won't go into they're not as popular.
How do you construct an integer value out of individual bytes if you don't know your machine's byte order? The best way is to do it "mathematically", based on the properties of the numbers. For example, let's go back to your original array of bytes:
uint8_t message[2] = {0xfd, 0x58};
Now, you know, because you wrote it, that 0xfd is supposed to be the MSB and 0x58 is supposed to be the LSB. So one good way of combining them together into an integer is like this:
int16_t roll = message[0] << 8; /* MSB */
roll |= message[1]; /* LSB */
The nice thing about this code is that it works correctly on machines of either endianness. I called this technique "mathematical" because it's equivalent to doing it this other way:
int16_t roll = message[0] * 256; /* MSB */
roll += message[1]; /* LSB */
And, in fact, this suggestion of mine involving roll = message[0] << 8 is very close to something you already tried, but had commented out in the code you posted. The difference is that you don't want to think about it in terms of two bytes next to each other in memory; you want to think about it in terms of the most- and least-significant byte. When you say << 8, you're obviously thinking about the most-significant byte, so that should be message[0].
Does memcpy copy bytes in reverse order?
memcpy does not reverse the order of the bytes.
This returns 58fd instead of fd58
Yes, your computer is little endian, so bytes 0xfd,0x58 in order are interpreted by your computer as the value 0x58fd.
Basically I have a hard-coded address as a decimal value, and I would like to convert it to a pointer. I have been following this link
But I cannot get it to work, as I believe my address is being truncated, i.e. the 0's in the address are being removed.
Is there any way I can keep the 0's, or is there a way to cast the address stored in buff to a pointer?
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[]) {
int address = 200000000;
char buff[80];
sprintf(buff, "0x%012x", address);
printf("%s\n", buff);
uint32_t * const Value = (uint32_t *)(uintptr_t)buff;
// *Value = 10;
printf("%p\n", Value); // Value is now storing the value of the variable buff, I dont want this
uint32_t *const Value2 = (uint32_t *)(uintptr_t)0x00000bebc200;
printf("%p\n", Value2); // my address gets truncated, dont want the address to be truncated
}
If %p presents only 8 hex digits for the address, that is because a pointer on your platform is only 32 bits wide. In that case the leading zeros have no meaning, as there are no address lines above A31 to set to zero. The bits are not "truncated"; they were not there in the first place.
If for some odd reason you wish to present the address as 48 bits (12 hex digits) on a platform where 32 bits is sufficient, then:
uintptr_t address = 200000000u ;
uint32_t* const Value = (uint32_t *)address ;
printf( "0x%012" PRIXPTR "\n", (uintptr_t)Value ) ; /* PRIXPTR comes from <inttypes.h> */
Outputs:
0x00000BEBC200
But that is only a matter of presentation, the value in address and Value are unchanged and remain 32 bits.
It is not necessary to prevent the truncation of your pointer.
When compiling for 64-bit, your pointer will be 64 bits wide.
This means it holds a number like 0x0123456789ABCDEF.
However, %p (whose exact output format is implementation-defined) typically drops any leading zeros, as they do not change the value or the behaviour of your program. It is like comparing 0x42==0x0042.
You do not need to convert your address to hex in order to use it as a pointer.
A computer saves your address in binary format. In memory, your address 200000000 will be saved as 0b1011111010111100001000000000.
The output format of decimal and hexadecimal is only used to make it more comfortable for humans to read the output.
The computer does not care, if you supply decimal, hexadecimal or binary numbers, in-memory it will always work with binary representation.
This means that you can directly follow the advice of your linked answer
#include <inttypes.h> // defines PRIxPTR, see comments of @chqrlie and @JonathanLeffler
uintptr_t address= 200000000; // compiler makes sure to convert this to binary for the pc
uint32_t *Pointer = (uint32_t*) address;
printf("0x%" PRIxPTR " address\n", address); // if the ptr size is known, e.g. %lx can be used
printf("%p pointer\n", Pointer);
sprintf converts your number into an ASCII string and saves that to buff. That means you cannot cast the content of buff to get the number back. You would first need to convert the string back to an integer, for example with strtoul.
Edit:
You can test the conversion of your compiler by printing the following compare statements
printf("%d\n", address == 200000000); // prints 1 (true)
printf("%d\n", address == 0xbebc200); // prints 1 (true)
printf("%d\n", address == 0x00000bebc200); // prints 1 (true)
printf("%d\n", address == 0b1011111010111100001000000000); // prints 1 (true); 0b literals need C23 or a compiler extension
#include<stdio.h>
main(){
int b=90;
int* a;
a=&b;
//pointer arith metic
printf("Address a is %d\n",a);
printf("size of integer is %d bytes\n",sizeof(int));
printf("Address a is %d\n",*a);
printf("Address a+1 is %d\n",a+1);
printf("value of a+1 is %d\n",*(a+1));
char *ip;
ip=(char*)&b;
printf("Address ip is %d\n",ip);
}
Output of the Program :
Address a is 1495857868
size of integer is 4 bytes
Address a is 90
Address a+1 is 1495857872
value of a+1 is 1495857868
Address ip is 1495857868
Address ip is 90
1. There is always a 4-byte gap between the address a and the address a+1.
2. The value printed for *(a+1) is equal to the address of variable b, which is also what is printed when the pointer is converted to char *.
3. Though the pointer is converted to char *, it still shows the full value of the variable.
ip=(char*)&b;
printf("Address ip is %d\n",*ip);
the output:Address ip is 90
This:
a + 1
when a has type int *, evaluates to the address incremented by the size of one int. This is known as pointer arithmetic and is one of C's core features when it comes to pointers.
1.there is always 4 byte gap between the address of the a+1 position
The gap is determined by the pointer's type: for a char pointer the gap is 1 byte, for an int pointer it is sizeof(int) bytes (4 here), and so on.
In this case, a is an int pointer, so the gap is 4 bytes.
So the difference between the addresses (a+1) and (a) is 4 bytes (1495857872 - 1495857868 = 4).
The output for the value at *(a+1) and the address of variable b when the pointer converts to char becomes equal
The value at address (a+1) cannot be predicted; it depends on your system (strictly speaking, reading it is undefined behavior).
I run in my PC, result is:
Address b is: 2665524
value of a+1 is: 2665720
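If you change your code a little by adding *(a+1) = 5; before the //pointer arith metic line (writing there is just as undefined as reading there, so treat this only as an experiment)
and run it again, the result becomes: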
Address b is: 2665524
value of a+1 is: 5
3.Though the pointer value converts into char it shows full value of the variable
It shows the full value of the variable because the value of b is 90, which needs only 1 byte of storage. So when you read it through a char pointer (1 byte of memory), the value you see equals the int value (on a little-endian machine, where the low byte comes first).
If you assign b a value greater than 255, e.g. int b=290;
then the result becomes:
Value a is: 290
value ip is: 34
You may change :
printf("Address a is %d\n",a);
to
printf("Address a is %p\n",(void *)a); //%p is specifically designed for pointers.
Up to
printf("Address a+1 is %d\n",a+1);
your code should be fine. But the next statement is problematic:
printf("value of a+1 is %d\n",*(a+1));
As a is a pointer, a+1 points past a by the number of bytes an int occupies on your system. Then, by doing *(a+1), you're trying to access a memory location that does not belong to b. (Remember that a only stores the address of one integer after a=&b;.)
When you try to dereference a memory location that you have no right to, the behavior is undefined according to the C standard.
3.Though the pointer value converts into char it shows full value of the variable
ip=(char*)&b;
The statement "Though the pointer value converts into char" is wrong. You are just treating the contents at address &b as characters. Then you have:
printf("Address ip is %d\n",*ip);
// Remember you're printing the value not address. So change the print message
Remember that all data is stored as bits, i.e. zeroes and ones, and it is the format specifier you choose that determines how it is displayed. Here you are printing the contents at &b as an integer, because you used %d in printf, so you get 90 as the result. Fortunately for you, the number 90 is small enough to fit in one byte.
Suppose you changed the printf to :
printf("Value pointed to by ip is %c\n",*ip); // %d to %c
You would have got
Value pointed to by ip is Z
Here 90 is interpreted as an ASCII code which corresponds to letter Z.
I am trying to verify that, for a given function, memory is allocated contiguously on the stack. So I wrote the code below and got the output below.
For the int variables the memory addresses come out as expected, but not for the character array. After address 0xbff1599c I was expecting the next address to be 0xbff159a0 and not 0xbff159a3. Also, since char is 1 byte and I am using 4 bytes, after 0xbff159a3 I was expecting 0xbff159a7 and not 0xbff159a8.
All memory locations come out as expected if I remove the char part, but I am not able to get the expected memory locations with the character array.
My base assumption is that memory on the stack will always be contiguous. I hope that is not wrong.
#include <stdio.h>
int main(void)
{
int x = 10;
printf("Value of x is %d\n", x);
printf("Address of x is %p\n", (void *)&x);
printf("Dereferencing address of x gives %d\n", *(&x));
printf("\n");
int y = 20;
printf("Value of y is %d\n", y);
printf("Address of y is %p\n", (void *)&y);
printf("Dereferencing address of y gives %d\n", *(&y));
printf("\n");
char str[] = "abcd";
printf("Value of str is %s\n", str);
printf("Address of str is %p\n", (void *)&str);
printf("Dereferencing address of str gives %s\n", *(&str));
printf("\n");
int z = 30;
printf("Value of z is %d\n", z);
printf("Address of z is %p\n", (void *)&z);
printf("Dereferencing address of z gives %d\n", *(&z));
}
Output:
Value of x is 10
Address of x is 0xbff159ac
Dereferencing address of x gives 10
Value of y is 20
Address of y is 0xbff159a8
Dereferencing address of y gives 20
Value of str is abcd
Address of str is 0xbff159a3
Dereferencing address of str gives abcd
Value of z is 30
Address of z is 0xbff1599c
Dereferencing address of z gives 30
Also, since char is 1 byte and I am using 4 bytes, so after 0xbff159a3 I was expecting 0xbff159a7 and not 0xbff159a8
A char takes up 1 byte, but str is a string, and you did not count the '\0' at the end of the string; thus char str[]="abcd" takes up 5 bytes.
I think this could be because the addresses are aligned to boundaries (e.g. an 8-byte boundary).
On some systems, allocations are always aligned to boundaries and made in chunks.
You can check this using a structure. For example,
struct A
{
char a;
char b;
int c;
};
The size of the struct will not be 6 bytes on a UNIX/Linux platform, because padding is added before c; it is typically 8.
But it might vary from OS to OS.
A similar thing applies to other data types as well.
Moreover, if a string is allocated with malloc, the pointer just
refers to memory on the heap, and the allocation logic might vary
from OS to OS. The following is the output of the same program
on a Linux box.
Value of x is 10
Address of x is 0x7ffffa43a50c
Dereferencing address of x gives 10
Value of y is 20
Address of y is 0x7ffffa43a508
Dereferencing address of y gives 20
Value of str is abcd
Address of str is 0x7ffffa43a500
Dereferencing address of str gives abcd
Value of z is 30
Address of z is 0x7ffffa43a4fc
Dereferencing address of z gives 30
Both answers from @ameyCU and @Umamahesh were good, but neither was self-sufficient, so I am writing my own answer and adding more information so that future visitors can get the most out of it.
I got that result because of a concept called data structure alignment. The compiler will try to place data in memory (whether in the heap segment, stack segment, or data segment; in my case it was the stack segment) in chunks, in such a way that the CPU can read and write it quickly.
When a modern computer reads from or writes to a memory address, it does so in word-sized chunks (e.g. 4-byte chunks on a 32-bit system) or larger. Data alignment means putting the data at a memory address equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory.
On a 32-bit architecture the word size is 4 bytes, so the compiler will try to place data at addresses that are multiples of 4, so that it can be read and written quickly in blocks of 4 bytes. When an object needs fewer bytes than that, padding bytes are added either before or after it.
In my case, suppose I use char str[] = "abc"; then, including the terminating null character '\0', I need 4 bytes, so there will be no padding. But when I use char str[] = "abcd"; then, including the terminating '\0', I need 5 bytes. The compiler wants to allocate in blocks of 4, so it adds 3 bytes of padding (either before or after), and hence the char array occupies a span of 8 bytes in memory.
Since the sizes of int and long are already multiples of 4, they cause no issue; it gets tricky with char or short objects whose sizes are not multiples of 4. This explains what I reported: all memory locations come out as expected if I remove the char part, but not with the character array.
As a rule of thumb, if your memory requirement is not a multiple of 4 (for example, 1 short, or a char array of size 2), padding will be added before the memory is allocated, so that the computer can read and write quickly.
Below is a nice excerpt from an answer that explains data structure alignment.
Suppose that you have the structure.
struct S {
short a;
int b;
char c, d;
};
Without alignment, it would be laid out in memory like this (assuming a 32-bit architecture):
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words
The problem is that on some CPU architectures, the instruction to load a 4-byte integer from memory only works on word boundaries. So your program would have to fetch each half of b with separate instructions.
But if the memory was laid out as:
 0 1 2 3 4 5 6 7 8 9 A B
|a|a| | |b|b|b|b|c|d| | |
|       |       |       |
Then access to b becomes straightforward. (The disadvantage is that more memory is required, because of the padding bytes.)
Assume the machine is 32-bit and little-endian, and sizeof(int) is 4 bytes.
Given the following program:
line1: #include<stdio.h>
line2: int main(void) {
line3: int arr[3]={2,3,4};
line4: char *p;
line5: p=(char*)arr;
line6: printf("%d",*p);
line7: p=p+1;
line8: printf("%d\n",*p);
line9: return 0;
}
What is the expected output?
A: 2 3
B: 2 0
C: 1 0
D: garbage value
One thing that is bothering me is the cast from an integer pointer to a character pointer.
How important is the cast?
What is the compiler doing at line 5? (p = (char *) arr;)
What is happening at line 7? (p = p + 1)
If the output is 20, how is the 0 being printed?
(E) none of the above
However, provided that (a) you are on a little-endian machine (e.g. x86), and (b) sizeof(int) >= 2, this should print "20" (no space is printed between the two).
a) the casting is "necessary" to read the array one byte at a time instead of as a series of ints
b) this is just coercing the address of the first int into a pointer to char
c) increment the address stored in p by sizeof(char) (which is 1)
d) the second byte of the machine representation of the int is printed by line 8
(D), or compiler specific, as sizeof(int) (as well as endianness) is platform-dependent.
How important the casting is?
Casting, as a whole is an integral (pun unintended) part of the C language.
and what the compilar would do in line number5?
It takes the address of the first element of arr and puts it in p.
and after line number 5 whats going on line number7?
It increments the pointer so it points to the next char from that memory address.
and if the output is 2 0 then how the 0 is being printed by the compiler?
This is a combination of endianness and sizeof(int). Without the specs of your machine, there isn't much else I can do to explain.
However, assuming little endian and sizeof(int) == 4, we can see the following:
// lets mark these memory regions: |A|B|C|D|
int i = 2; // value 0x00000002, stored in memory as bytes 02 00 00 00 (A B C D)
char *ptr = (char *) &i; // now ptr points to 0x02 (A)
printf("%d\n", *ptr); // prints '2', because ptr points to 0x02 (A)
ptr++; // increment ptr, ptr now points to 0x00 (B)
printf("%d\n", *ptr); // prints '0', because ptr points to 0x00 (B)
1. Importance of casting:
char *p;
This line declares a pointer to a character. That means it dereferences only one byte at a time, and pointer arithmetic on it moves one byte at a time.
p=(char*)arr;
2. The cast to char * makes the conversion from int * explicit. Without it the compiler will at least warn about incompatible pointer types, although most compilers then generate the same code.
3. Because p is a pointer to a character, as described above, p=p+1 points to the next byte.
printf("%d\n",*p);
%d formats the value as a decimal integer, and *p dereferences only one byte, so now memory organisation comes into the picture:
whether your machine is little endian (LSB first) or big endian (MSB first).
According to your answer, your machine is little endian, so the first value printed is 2.
The next byte must be zero, so the second value printed is 0.
In terms of bytes:
2 is represented as 00-00-00-02 (byte-wise representation, most significant byte first)
but in memory it is stored as the four bytes
02-00-00-00
with 02 in the first memory byte
and 00 in the second memory byte.