Understanding the stack address manipulation

Consider the following piece of code (on a 32-bit Intel/Linux system):
#include <stdio.h>
struct mystruct {
char a;
int b;
short c;
char d;
};
int main()
{
struct mystruct foo[3];
char ch;
int i;
printf("%p %p %p %p %p\n", &foo[0], &foo[1], &foo[2], &ch, &i);
return 0;
}
Which of the following would be the most likely candidate for its output?
0xfeeece90 0xfeeece9c 0xfeeecea8 0xfeeece8f 0xfeeece88
0xfeeece90 0xfeeece9c 0xfeeecea8 0xfeeeceb4 0xfeeeceb8
0xfeeece90 0xfeeece9c 0xfeeecea8 0xfeeeceb4 0xfeeeceb5
0xfeeece90 0xfeeece9c 0xfeeecea8 0xfeeece8c 0xfeeece88
The stated answer is option (1), but I could not understand why it is the right answer.
The stack grows from higher addresses to lower addresses. So I expected the layout foo[2], foo[1], foo[0], ch, i, with i at the lowest stack address and foo[2] at the highest. I computed sizeof(struct mystruct) as 16, so I expected foo[2], foo[1], foo[0] and ch to be laid out 16 bytes apart.
But I cannot find an address difference of 16 in these addresses:
0xfeeece90 0xfeeece9c 0xfeeecea8 0xfeeece8f 0xfeeece88

The options you were given assume a sizeof(struct mystruct) of 12 bytes, which is plausible, since you'll have:
a at offset 0
3 bytes of padding to keep b aligned to 4 bytes
b at offset 4
c at offset 8 (no padding required, we are already at a multiple of 2)
d at offset 10 (no padding required, char does not need any particular alignment)
1 byte of padding so that the next array element starts on a 4-byte boundary (keeping the b of every element aligned to 4 bytes).
So all the options are plausible in this respect. Next comes ch: assuming the compiler keeps the declaration order of the variables (which is definitely not guaranteed in optimized builds), it will sit immediately below foo in memory. foo starts at 0xfeeece90 and ch has no alignment requirement, so ch will be at 0xfeeece8f. Finally, i needs 4 bytes on a 4-byte boundary below ch; 0xfeeece8c would overlap ch at 0xfeeece8f, so the nearest usable address is indeed 0xfeeece88.
Again: this discussion is realistic as far as the struct is concerned, but not for the locals. In an optimized build the order of declaration does not matter; the compiler will probably reorder the locals to avoid wasting space (and, if you do not take their addresses, will try to keep them in registers).

Related

Weird alignment behaviour in C

#include <stdio.h>
int main(void)
{
int a = 0x4565;
long ch1;
int ch2;
printf("%p %p %p\n", (void *)&ch2, (void *)&ch1, (void *)&a);
printf("%zu %zu %zu\n", sizeof(long), sizeof(int), _Alignof(int));
return 0;
}
Output:
0x7fffebb487dc 0x7fffebb487e0 0x7fffebb487ec
8 4 4
If the alignment of int is 4, why was the int variable not placed at 0x7fffebb487e8?
Why does the compiler leave an extra 4 bytes of padding?
This happens only when an int is allocated after a variable of size 8 (such as a pointer, long, or long long).
If the preceding variable is an int, i.e. has a size and alignment of 4, the compiler adds no padding.
I am confused. Please help me.
Thank you.

Without optimization on, the compiler is assigning space naïvely and is working with the stack from high addresses to low, which is the direction the stack grows in.
Starting from an aligned address which ends in 0 (hex), it assigns four bytes for a, putting it at an address that ends in C. Then, for the long ch1, subtracting eight bytes would give an address ending in 4, so it has to skip four more bytes to reach the eight-byte-aligned address ending in 0. Finally, for ch2, it merely subtracts four bytes.
When you turn on optimization, a smarter algorithm will be used.

Question regarding disposition of C-struct members in memory

My question is based on the third case presented in this page:
https://www.geeksforgeeks.org/is-sizeof-for-a-struct-equal-to-the-sum-of-sizeof-of-each-member/
// C program to illustrate
// size of struct
#include <stdio.h>
int main()
{
struct C {
// sizeof(double) = 8
double z;
// sizeof(short int) = 2
short int y;
// Padding of 2 bytes
// sizeof(int) = 4
int x;
};
printf("Size of struct: %zu", sizeof(struct C));
return 0;
}
Why does it require a padding after y, instead of having a padding at the end (after x)?
I can see why it's needed on the cases 1st and 2nd, but I fail to see it on the 3rd.

In some computer architectures, instructions that access values in memory will only accept a subset of all addresses due to alignment restrictions. For example, an instruction that copies a 32-bit value from memory into a register might require the value to be at an address that's divisible by 4. (You might still be able to obtain the value byte by byte, but that would be far slower as it would require multiple instructions.) Other architectures might merely perform better if the values are aligned properly. And in yet other architectures, it might not matter at all.
As such, the C standard allows for implementation-specific padding to be used in structures. By adding padding, the compiler can assure that each member will be properly aligned (since it can enforce an alignment on the structure itself). This allows us to declare the following and let the compiler figure out the exact size and offsets:
struct A {
int x;
double z;
short y;
};
Let's look at what a compiler might do.
Let's say your system uses 2 bytes for short values, 4 bytes for int values and 8 bytes for double values. And let's say values of size N are required to be placed at an address evenly divisible by N.
struct A {
int x; // 4 bytes, address must be divisible by 4.
double z; // 8 bytes, address must be divisible by 8.
short y; // 2 bytes, address must be divisible by 2.
};
If we just put the members end to end, z would be found at offset 4, which isn't divisible by 8, so the computer would be unable to access this field efficiently. The compiler might therefore utilize padding.
struct A {
int x; // 4 bytes, address must be divisible by 4. // At offset 0.
// 4 bytes of padding. // At offset 4.
double z; // 8 bytes, address must be divisible by 8. // At offset 8.
short y; // 2 bytes, address must be divisible by 2. // At offset 16.
};
Now, z is at offset 8, which is divisible by 8.
But that's not quite it.
The alignment restrictions are imposed on the absolute address of the members, not merely their offset. So the members of struct A are only properly aligned if the structure itself is at an address evenly divisible by 8. The compiler can take care of that when you do
struct A a;
But what if you do
struct A *array = malloc(sizeof(struct A) * n);
malloc will return a pointer that meets all possible alignment restrictions, so array[0] will be properly aligned, but what about array[1]? For that to be properly aligned, sizeof(struct A) needs to be a multiple of 8! So padding will be added to the end to make the size of the structure a multiple of 8, and we end up with this:
// Address must be divisible by 8, so sizeof(struct A) must be divisible by 8.
struct A {
int x; // 4 bytes, address must be divisible by 4. // At offset 0.
// 4 bytes of padding. // At offset 4.
double z; // 8 bytes, address must be divisible by 8. // At offset 8.
short y; // 2 bytes, address must be divisible by 2. // At offset 16.
// 2 bytes of padding. // At offset 18.
};
Finally, you asked about struct C. Applying the above, we get:
// Address must be divisible by 8, so sizeof(struct C) must be divisible by 8.
struct C {
double z; // 8 bytes, address must be divisible by 8. // At offset 0.
short y; // 2 bytes, address must be divisible by 2. // At offset 8.
// 2 bytes of padding. // At offset 10.
int x; // 4 bytes, address must be divisible by 4. // At offset 12.
// 0 bytes of padding. // At offset 16.
};

As described by the site:
"C language doesn’t allow the compilers to reorder the struct members to reduce the amount of padding. In order to minimize the amount of padding, the struct members must be sorted in a descending order (similar to the case 2)."
This means the members are laid out in declaration order, and padding is inserted wherever the next member needs it. The compiler first reserves 8 bytes for double z, then 2 bytes for short int y. The next member, int x, must start at an offset divisible by 4, and y ends at offset 10, so 2 bytes of padding are inserted before x can be placed at offset 12. The padding cannot be moved to the end, because that would leave x at the misaligned offset 10.
Edit: Sorry if my response was a little confusing or didn't answer the question. The padding isn't at the end because it is needed in front of x: with x at offset 12 the struct is already 16 bytes, a multiple of 8, so no further padding is needed after x.
-mihirm

Unexpected memory allocation in stack segment

I am trying to verify that, for a given function, memory on the stack is allocated contiguously. So I wrote the code below and got the output below.
For the int variables the addresses come out as expected, but not for the character array. After address 0xbff1599c I was expecting the next address to be 0xbff159a0, not 0xbff159a3. Also, since char is 1 byte and I am using 4 bytes, so after 0xbff159a3 I was expecting 0xbff159a7 and not 0xbff159a8.
All memory locations comes as expected if I remove char part but I am not able to get expected memory locations with character array.
My base assumption is that on stack segment, memory will always be contiguous. I hope that is not wrong.
#include <stdio.h>
int main(void)
{
int x = 10;
printf("Value of x is %d\n", x);
printf("Address of x is %p\n", &x);
printf("Dereferencing address of x gives %d\n", *(&x));
printf("\n");
int y = 20;
printf("Value of y is %d\n", y);
printf("Address of y is %p\n", &y);
printf("Dereferencing address of y gives %d\n", *(&y));
printf("\n");
char str[] = "abcd";
printf("Value of str is %s\n", str);
printf("Address of str is %p\n", &str);
printf("Dereferencing address of str gives %s\n", *(&str));
printf("\n");
int z = 30;
printf("Value of z is %d\n", z);
printf("Address of z is %p\n", &z);
printf("Dereferencing address of z gives %d\n", *(&z));
}
Output:
Value of x is 10
Address of x is 0xbff159ac
Dereferencing address of x gives 10
Value of y is 20
Address of y is 0xbff159a8
Dereferencing address of y gives 20
Value of str is abcd
Address of str is 0xbff159a3
Dereferencing address of str gives abcd
Value of z is 30
Address of z is 0xbff1599c
Dereferencing address of z gives 30

Also, since char is 1 byte and I am using 4 bytes, so after 0xbff159a3 I was expecting 0xbff159a7 and not 0xbff159a8
char takes up 1 byte, but str is a string and you did not count the '\0' at the end of it, so char str[] = "abcd" takes up 5 bytes.

I think this could be because the addresses are aligned to boundaries (e.g. an 8-byte boundary).
On some systems, allocations are always aligned to boundaries and made in chunks.
You can check this using a structure. For example:
struct A
{
char a;
char b;
int c;
};
The size of this struct will not be 6 bytes on a UNIX/Linux platform, although the exact value may vary from platform to platform.
The same applies to other data types.
Moreover, a string pointer just holds an address allocated on the heap if malloc is used, and the allocation logic may vary from OS to OS. The following is the output of the same program on a Linux box:
Value of x is 10
Address of x is 0x7ffffa43a50c
Dereferencing address of x gives 10
Value of y is 20
Address of y is 0x7ffffa43a508
Dereferencing address of y gives 20
Value of str is abcd
Address of str is 0x7ffffa43a500
Dereferencing address of str gives abcd
Value of z is 30
Address of z is 0x7ffffa43a4fc
Dereferencing address of z gives 30

Both answers from @ameyCU and @Umamahesh were good, but neither was self-sufficient, so I am writing my own answer with more information for future visitors.
I got that result because of a concept called data structure alignment. The compiler tries to place data (whether in the heap, the stack or the data segment; in my case it was the stack) so that it can be read and written quickly.
When a modern computer reads from or writes to a memory address, it does so in word-sized chunks (e.g. 4-byte chunks on a 32-bit system) or larger. Data alignment means putting the data at a memory address equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory.
On a 32-bit architecture the word size is 4 bytes, so the compiler generally tries to place objects at addresses that are multiples of 4, so that they can be read and written in blocks of 4 bytes. When an object needs fewer bytes, the compiler pads it with unused bytes either before or after it.
In my case, with char str[] = "abc"; I need 4 bytes including the terminating null character '\0', so there is no padding. But with char str[] = "abcd"; I need 5 bytes including the terminating '\0'. The compiler allocates in blocks of 4, so it adds 3 bytes of padding (either before or after), and the char array ends up occupying 8 bytes in memory.
Since the sizes of int and long are already multiples of 4, they cause no issue; it gets tricky with char or short objects whose sizes are not multiples of 4. This explains what I reported: "All memory locations comes as expected if I remove char part but I am not able to get expected memory locations with character array."
The rule of thumb is that if a memory requirement is not a multiple of 4 (for example, one short, or a char array of size 2), extra padding may be added so that the computer can read and write quickly.
Below is a nice excerpt from this answer which explains data structure alignment.
Suppose that you have the structure.
struct S {
short a;
int b;
char c, d;
};
Without alignment, it would be laid out in memory like this (assuming a 32-bit architecture):
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words
The problem is that on some CPU architectures, the instruction to load a 4-byte integer from memory only works on word boundaries. So your program would have to fetch each half of b with separate instructions.
But if the memory was laid out as:
 0 1 2 3 4 5 6 7 8 9 A B
|a|a| | |b|b|b|b|c|d| | |  bytes
|       |       |       |  words
Then access to b becomes straightforward. (The disadvantage is that more memory is required, because of the padding bytes.)

memory allocation for structures elements

Hi, I am having difficulty understanding how memory is allocated to structure members.
For example, suppose I have the structure below, where the size of char is 1 byte and int is 4 bytes.
struct temp
{
char a;
int b;
};
I am aware that the size of the structure would be 8. Because there will be a padding of 3 bytes after the char, and the next element should be placed in multiple of 4 so the size will be 8.
Now consider the below structure.
struct temp
{
int a; // size is 4
double b; // size is 8
char c; // size is 1
double d; // size is 8
int e; // size is 4
};
This is the output I got for the above structure:
size of node is 40
the address of node is 3392515152 ( =: base)
the address of a in node is 3392515152 (base + 0)
the address of b in node is 3392515160 (base + 8)
the address of c in node is 3392515168 (base + 16)
the address of d in node is 3392515176 (base + 24)
the address of e in node is 3392515184 (base + 32)
The last member ends at base + 36, so why does the size show as 40 bytes?
If we create an array of such structure also the first element of the next array element can be place in 3392515188 (base + 36) as it is a multiple of 4, but why is it not happening this way?
Can anyone please clear up my doubt?
Thanks in advance,
Saravana

It seems that on your system, double has to have the alignment of 8.
struct temp {
int a; // size is 4
// padding 4 bytes
double b; // size is 8
char c; // size is 1
// padding 7 bytes
double d; // size is 8
int e; // size is 4
// padding 4 bytes
};
// Total 4+4+8+1+7+8+4+4 = 40 bytes
Compiler adds an extra 4 bytes to the end of struct to make sure that array[1].b will be properly aligned.
Without end padding (assuming array is at address 0):
&array[0] == 0
&array[1] == 36
&array[1].b == 36 + 8 == 44
44 % 8 == 4 -> ERROR, not aligned!
With end padding (assuming array is at address 0):
&array[0] == 0
&array[1] == 40
&array[1].b == 40 + 8 == 48
48 % 8 == 0 -> OK!
Note that sizes, alignments, and paddings depend on target system and compiler in use.

In your calculation, you ignore the fact that e is subject to be padded as well:
The struct looks like
0 8 16 24 32
AAAAaaaaBBBBBBBBCcccccccDDDDDDDDEEEEeeee
where uppercase is the variable itself, and lowercase is the padding applied to it.
As you can see (from the addresses as well), each field is padded to 8 bytes, which is the size of the largest field in the structure.
As the structure might be used in an array, and all array elements should be well-aligned as well, the padding to e is necessary.

It's heavily dependent on both your processor architecture and compiler. Modern machines and compilers may choose larger or smaller padding to reduce the access cost to data.
Four-byte alignment means that two address lines are unused. Eight, three. A chip can use that to address more memory (coarser grain) with the same amount of hardware.
A compiler might use a similar trick for various reasons, but no compiler is required to do anything but be no less fine-grained than the processor. Often, they'll just take the biggest-size value and use it exclusively for that block. In your case, that's a double, which is eight bytes.

This is compiler-dependent behavior.
Some compilers require a double to be stored at an offset that is a multiple of 8 bytes.
If you modify the structure as below, you will get a different result.
struct temp
{
double b; // size is 8
int a; // size is 4
int e; // size is 4
double d; // size is 8
char c; // size is 1
};
Every programmer should know what padding their compiler applies.
E.g. if you are working on an ARM platform and you configure the compiler not to pad structure elements, then accessing structure elements through pointers may produce an 'odd' address, for which the processor generates an exception.

Every structure will also have alignment requirements
for example :
typedef struct structc_tag
{
char c;
double d;
int s;
} structc_t;
Applying same analysis, structc_t needs sizeof(char) + 7 byte padding + sizeof(double) + sizeof(int) = 1 + 7 + 8 + 4 = 20 bytes. However, the sizeof(structc_t) will be 24 bytes. It is because, along with structure members, structure type variables will also have natural alignment. Let us understand it by an example. Say, we declared an array of structc_t as shown below structc_t structc_array[3];
Assume the base address of structc_array is 0x0000 for easy calculations. If structc_t occupies 20 (0x14) bytes as we calculated, the second structc_t array element (index 1) will be at 0x0000 + 0x0014 = 0x0014. That is the start address of the element at index 1. The double member of this structc_t will be allocated at 0x0014 + 0x1 + 0x7 = 0x001C (decimal 28), which is not a multiple of 8 and conflicts with the alignment requirement of double. As we mentioned at the top, the alignment requirement of double is 8 bytes. In order to avoid such misalignment, the compiler introduces an alignment requirement for every structure: that of its most strictly aligned member. In our case the alignment of structa_t is 2, structb_t is 4 and structc_t is 8. With nested structures, the alignment of the most strictly aligned inner structure carries over to the enclosing structure.
In structc_t of the above program, there will be padding of 4 bytes after int member to make the structure size multiple of its alignment. Thus the sizeof (structc_t) is 24 bytes. It guarantees correct alignment even in arrays. You can cross check

To avoid structure padding:
The #pragma pack(1) directive can be used to place each structure member immediately after the end of the previous one.
#pragma pack(1)
struct temp
{
int a; // size is 4
int b; // size is 4
double s; // size is 8
char ch; //size is 1
};
The size of the structure would be 17.

If we create an array of such structure also the first element of the next array element can be place in 3392515188 (base + 36) as it is a multiple of 4, but why is it not happening this way?
It can't because of the double elements in there.
It's clear that the compiler and architecture you are using require a double to be eight-byte aligned. This is obvious because there are seven bytes of padding after the char c.
This requirement also means that the entire struct must be eight byte aligned. There's no point in carefully making all the doubles aligned to eight bytes relative to the start of the struct if the struct itself is only four byte aligned. Hence the padding after the final int to make sizeof(temp) a multiple of eight.
Note that this alignment requirement need not be a hard requirement. The compiler could choose to do the alignment even if doubles can be four byte aligned on the grounds that it might take more memory cycles to access the double if it's only four byte aligned.

Unexpected Behaviour of bit field operator in C

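#include <stdio.h>
int main(void)
{
struct bitfield
{
unsigned a:5;
unsigned c:5;
unsigned b:6;
}bit;
char *p;
struct bitfield *ptr,bit1={1,3,3};
p=(char *)&bit1; /* view the struct as raw bytes */
p++; /* step to the second byte */
printf("%d",*p);
return 0;
}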
Explanation:
Binary value of a=1 is 00001 (in 5 bits)
Binary value of c=3 is 00011 (in 5 bits)
Binary value of b=3 is 000011 (in 6 bits)
My question is: how will it be represented in memory?
When I compile and run it, the output is 12, and I am not able to figure out why. In my view, the memory representation would be in the format below:
00001 000011 00011
| |
501 500 (Let Say starting address)
Please correct me if I am wrong here.

The actual representation is like:
000011 00011 00001
b c a
When aligned as bytes:
00001100 01100001
| |
p+1 p
At address (p+1) the byte is 00001100, which gives 12.

The C standard does not completely specify how bit-fields are packed into bytes. The details depend on each C implementation.
From C 2011 6.7.2.1:
11 An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

From the C11 standard (6.7.2.1):
The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
I know for a fact that GCC and other compilers on unix-like systems order bit fields in the host byte order which can be evidenced from the definition of an IP header from an operating system source I had handy:
struct ip {
#if _BYTE_ORDER == _LITTLE_ENDIAN
u_int ip_hl:4, /* header length */
ip_v:4; /* version */
#endif
#if _BYTE_ORDER == _BIG_ENDIAN
u_int ip_v:4, /* version */
ip_hl:4; /* header length */
#endif
Other compilers might do the same. Since you're most likely on a little endian machine, your bit field will be backwards from what you're expecting (in addition to the words being backwards already). Most likely it looks like this in memory (notice that the order of your fields in the struct in your question is "a, c, b", not "a, b, c", just to make this all more confusing):
01100001 00001100
| |
byte 0 byte 1
| | | |
x a b c
So, all three bit fields can be stuffed into an int. Padding is added automatically and ends up in bytes 2 and 3. In byte 1 (written most-significant bit first), the upper six bits hold b (000011) and the lower two bits hold the two highest bits of c, which are 0. c then continues in the upper three bits of byte 0 (011, the x in my picture above), and the remaining five bits of byte 0 (00001) are a.
Notice that the picture puts the lowest byte address on the left, growing to the right, with each byte written most-significant bit first (this is pretty much standard in the literature; your picture had the bits in one direction and the bytes in another, which makes everything more confusing, especially with your unusual ordering of the fields "a, c, b").
If none of the above made any sense, run this program and then read up on byte ordering:
#include <stdio.h>
int
main(int argc, char **argv)
{
unsigned int i = 0x01020304;
unsigned char *p;
p = (unsigned char *)&i;
printf("0x%x 0x%x 0x%x 0x%x\n", (unsigned int)p[0], (unsigned int)p[1], (unsigned int)p[2], (unsigned int)p[3]);
return 0;
}
Then when you understand what little-endian does to the ordering of bytes in an int, map your bit-field on top of that, but with the fields backwards. Then it might start making sense (I've been doing this for years and it's still confusing as hell).
Another example to show how the bit fields are backwards twice: once because the compiler decides to put them backwards on a little-endian machine, and once again because of the byte order of ints:
#include <stdio.h>
int
main(int argc, char **argv)
{
struct bf {
unsigned a:4,b:4,c:4,d:4,e:4,f:4,g:4,h:4;
} bf = { 1, 2, 3, 4, 5, 6, 7, 8 };
unsigned int *i;
unsigned char *p;
p = (unsigned char *)&bf;
i = (unsigned int *)&bf;
printf("0x%x 0x%x 0x%x 0x%x\n", (unsigned int)p[0], (unsigned int)p[1], (unsigned int)p[2], (unsigned int)p[3]);
printf("0x%x\n", *i);
return 0;
}