How much memory is allocated for a node in C?

typedef struct Node {
int data;
struct Node *next;
} node;
pointer->next = (node*)malloc(sizeof(node));
How many bytes of memory are dynamically given to pointer->next in the above code? For (int*)malloc(sizeof(int)), 2 bytes are given. Likewise, how many for node?

malloc will dynamically allocate sizeof(node) bytes.
node is a struct, and the size of a struct depends on the size of every member inside it.
In this case, the size of node will be at least: size of int + size of struct Node *
(The compiler may add padding so that each member is suitably aligned for the architecture.)
Since you get 2 bytes for an int, your target appears to be a 16-bit architecture, where struct sizes are typically multiples of 2: 2, 4, 6, 8, etc.
The size of int depends on the target you are working on. On your 16-bit target, the size of int is 2 bytes.
As for the size of struct Node *: on most platforms all object pointers have the same size, regardless of the type they point to, although the C standard does not guarantee this. That size also depends on the architecture. On a typical 16-bit target, struct Node * is 2 bytes (some 16-bit memory models use larger pointers).
size of int = 2
size of struct Node * = 2
Total memory requested from malloc = 2 + 2 = 4

First, a suggestion: rewrite
pointer->next=(node*)malloc(sizeof(node));
as
pointer->next = malloc( sizeof *pointer->next );
You don't need the cast (unless you're working on a pre-ANSI implementation, in which case God help you), and using the dereferenced target as the operand of sizeof means you don't have to specify the type, potentially saving you some maintenance heartburn.
Also, a little whitespace goes a long way (although you don't need to put whitespace around the function arguments - that's my style, some people don't like it, but it makes things easier for me to read).
How many bytes of memory are dynamically given to pointer->next
It will be at least as big as sizeof (int) plus sizeof (struct Node *), and potentially may be bigger; depending on your platform, it could be as small as 4 bytes or as large as 16. C allows for "padding" bytes between struct members to satisfy alignment requirements for the underlying architecture. For example, a particular architecture may require that all multi-byte objects be aligned on addresses that are multiples of 4; if your data member is only 2 bytes wide, then there will be 2 unused bytes between it and the next member.

Without knowing a lot about your system, we just can't tell you. You can take that same code and try it on multiple compilers, and you'll get different answers. You have to check yourself, using sizeof(node) or sizeof(struct Node) (I think either syntax works, but just in case).
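A minimal sketch of how you might check this on your own system. The helper names node_members_size and print_node_sizes are made up for this illustration, and the numbers printed depend entirely on your compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct Node {
    int data;
    struct Node *next;
} node;

/* Smallest size the struct could have: the sum of its members, no padding. */
size_t node_members_size(void)
{
    return sizeof(int) + sizeof(struct Node *);
}

/* Hypothetical helper: prints the sizes for the platform it is compiled on. */
void print_node_sizes(void)
{
    /* A struct can never be smaller than the sum of its members. */
    assert(sizeof(node) >= node_members_size());

    printf("sizeof(int)           = %zu\n", sizeof(int));
    printf("sizeof(struct Node *) = %zu\n", sizeof(struct Node *));
    printf("sizeof(node)          = %zu\n", sizeof(node));
    printf("padding bytes         = %zu\n", sizeof(node) - node_members_size());
}
```

Calling print_node_sizes() from main shows how many bytes malloc(sizeof(node)) requests on that particular build, and how many of them are padding.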

Related

How do you access an array of structures

I'm confused about how to access an array of structs.
simple case:
typedef struct node
{
int number;
struct node *left;
struct node *right;
} node;
node *nodeArray = malloc(sizeof(node));
nodeArray->number = 5;
So, that all makes sense. But the following doesn't work:
typedef struct node
{
int number;
struct node *left;
struct node *right;
} node;
node *nodeArray = malloc(511 * sizeof(node));
for(int i = 0; i < 511; i++)
{
nodeArray[i]->number = i;
}
However, nodeArray[i].number = i does seem to work. Can someone explain what's going on, and also what the difference is between node *nodeArray = malloc(511 * sizeof(node)); and node (*nodeArray) = malloc(511 * sizeof(node));?
In the first snippet, the following are all equivalent:
nodeArray->number = 5; // preferred
nodeArray[0].number = 5;
(*nodeArray).number = 5;
In the second snippet, the following are all equivalent:
(nodeArray + i)->number = i;
nodeArray[i].number = i; // preferred
(*(nodeArray + i)).number = i;
So, as you can see, there is a choice of three different syntaxes that all do the same thing. The arrow syntax (nodeArray->number) is preferred when dealing with a pointer to a single instance of the struct. The array indexing with dot notation (nodeArray[i].number) is preferred when dealing with a pointer to an array of structs. The third syntax (dereferencing the pointer and dot notation) is avoided by sensible programmers.
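The equivalence of the three syntaxes can be checked with a short sketch like the one below. The helper name make_numbered_nodes is an assumption for this illustration, not something from the question.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int number;
    struct node *left;
    struct node *right;
} node;

/* Hypothetical helper: allocates `count` nodes and numbers them 0..count-1. */
node *make_numbered_nodes(size_t count)
{
    node *nodeArray = malloc(count * sizeof *nodeArray);
    if (nodeArray == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        nodeArray[i].number = (int)i;   /* index, then dot: preferred for arrays */
        nodeArray[i].left = NULL;
        nodeArray[i].right = NULL;

        /* The other two spellings read back the same member. */
        assert((nodeArray + i)->number == (int)i);
        assert((*(nodeArray + i)).number == (int)i);
    }
    return nodeArray;
}
```

The caller owns the returned block and releases it with a single free(), since the whole array came from one malloc call.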
When you allocate an array like this
node* nodeArray = malloc(511*sizeof(node));
nodeArray is a pointer. To get a pointer to an individual struct node, you just add an integer:
nodeArray + 1 gives a pointer to the second node
nodeArray + 1 can also be written as &nodeArray[1]
So, to dereference the pointer, write
(*(nodeArray + 1)).number or nodeArray[1].number
The parentheses are needed because . binds more tightly than *.
Maybe the problem is caused by alignment:
Your node structure contains an integer and two pointers, its minimum storage size could be 12 bytes (on most 32-bit architectures) or 24 bytes (64-bit architectures) but the alignment constraints of the architecture may force each node to be aligned using another maximum storage size (with extra padding, which needs to be allocated too).
sizeof(type) just returns a minimum storage size (the extra allocated padding should not be accessible, even if this is not checked at runtime or by the compiler).
Solution: use calloc() which will also take into consideration the alignment constraints for each item in your array!
Replace:
node *nodeArray = malloc(511 * sizeof(node));
by:
node *nodeArray = calloc(511, sizeof(node));
and now your code is normally safe, the actually allocated size will include the necessary additional padding required by the underlying architecture.
Otherwise your code is not portable.
Note that some C/C++ compilers also provide alignof(type) to get the correct alignment for the datatype (and it should be used for implementing void *calloc(size_t nitems, size_t size) in the C/C++ libraries).
Your sample code above may suffer from buffer overflows because you did not allocate enough space for the array before writing items in the loop.
You don't see the difference when you use simple types (you don't care about their alignment or where they are individually allocated; there's possibly extra padding allocated on the stack or in structures using them, which is not accessible, even if no padding is necessary when their storage is allocated inside physical registers; but even with "auto" or "register" allocation, the compiler may still allocate space on the stack for it, as a backing store that could be used to save the register when it is needed for something else or before performing an external function call, or method call in C++ when the function body is not inlined).
See the documentation of alignof and alignas in C++11. There are many resources about them; for example:
https://en.cppreference.com/w/cpp/language/alignas
See as well the documentation of calloc()
(And don't be confused by the simplified 32-bit or 64-bit memory models used in Linux; even Linux uses now more precise memory models, taking into account alignment problems, as well as accessibility and performance problems, sometimes enforced by the underlying platform for good security reasons in order to reduce a surface of attacks that exists in the single/unified "flat" memory model for everything: segmented architectures are coming back in the computing industry, and C/C++ compilers had to adapt: C++11 replies to this problematic that otherwise would require costlier or inefficient solutions in the compiled code, severely limiting some optimizations such as cache management, efficiency of TLB stores, paging and virtualized memory, enforced security scopes for users/process/threads and so on).
Remember that each datatype has its own size and alignment and they are independent. The assumption that there's a single "size" to allocate for a datatype in an array is wrong (as well extra padding at end of the allocated array, after its last item, may not be allocated, and read/write access to padding areas may be restricted/enforced by the compiler or at runtime).
Now consider also the case of bitfields (datatypes declared as members of structures with an extra precision/size parameter): their sizeof() is not the true minimum as they can be packed more tightly (including arrays of booleans: sizeof() returns the minimum size of the datatype once it has been promoted to an integer and so when it has possibly been enlarged with extra padding or extension of the sign bit; usually the compiler enforces these invalid accesses to padding bits by using bitmasking, shifts or rotations; but a processor may provide more convenient instructions to handle bits inside a word unit in memory or even in a register, so that your bitfields won't overflow and modify other surrounding bitfields or padding bits because of an arithmetic operation on their value).
Also, nodeArray[i] yields a node object, not a pointer, so nodeArray[i]->anything is invalid: you need to replace the -> with a . (dot).

extra padding between structs in C

I have struct in C:
typedef struct Node {
int data; // 4 bytes int + 4 bytes for alignment
struct Node* prev; // 8 bytes pointer
struct Node* next; // 8 bytes pointer
} Node;
The size of this struct is 24 bytes (8 + 8 + 8). When I use the sizeof(Node), the compiler also shows 24 bytes.
However, when I create two or more structs on the heap (one after another) and look at their memory location, there are 8 byte gaps between each Node struct.
For example:
11121344 (the 1st Node address)
11121376 (the 2nd Node address) // 376-344 = 32-24 = 8 extra bytes
11121408 (the 3rd Node address) // 408-376 = 32-24 = 8 extra bytes
Can you explain why compiler separates Node structs by adding 8 bytes between Nodes?
There are 2 possible reasons for your observation:
1) The C standard requires that malloc always returns memory chunks with maximum alignment to prevent alignment issues no matter what you allocate for.
2) malloc manages memory chunks internally by using some sort of data structures. Depending on the implementation, it would add additional information to each memory chunk for internal usage. For instance, malloc could manage memory chunks in a linked list, then it would require each chunk to hold an additional pointer that points to the next chunk.
The maximum alignment depends on the architecture and the compiler / malloc - implementation used.
For your case and assuming glibc, taken straight out of the docs of glibc/malloc.c :
Alignment: 2 * sizeof(size_t) (default)
(i.e., 8 byte alignment with 4byte size_t). This suffices for
nearly all current machines and C compilers. However, you can
define MALLOC_ALIGNMENT to be wider than this if necessary.
Minimum overhead per allocated chunk: 4 or 8 bytes
Each malloced chunk has a hidden word of overhead holding size
and status information.
Minimum allocated size: 4-byte ptrs: 16 bytes (including 4 overhead)
8-byte ptrs: 24/32 bytes (including, 4/8 overhead)
Thus malloc in your case will align to 2 * sizeof(size_t) = 16 bytes.
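Also note the 'hidden overhead' mentioned. This overhead is used to store additional internal information for memory management. With a 24-byte Node, 8 bytes of overhead, and 16-byte alignment, each chunk would take 32 bytes, which matches the spacing you observed.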
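The following sketch shows one way to observe this yourself. It assumes a C11 compiler (for max_align_t and _Alignof) and a conforming malloc; the function names is_max_aligned and show_malloc_spacing are hypothetical, and the addresses printed will differ between systems, allocators, and runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *prev;
    struct Node *next;
} Node;

/* Returns 1 if p is aligned for any ordinary object type (C11 max_align_t). */
int is_max_aligned(const void *p)
{
    return (uintptr_t)p % _Alignof(max_align_t) == 0;
}

/* Hypothetical demo: allocates two nodes separately and prints where they landed. */
void show_malloc_spacing(void)
{
    Node *a = malloc(sizeof *a);
    Node *b = malloc(sizeof *b);

    if (a != NULL && b != NULL) {
        printf("sizeof(Node) = %zu\n", sizeof(Node));
        printf("a = %p, b = %p\n", (void *)a, (void *)b);
        printf("a max-aligned: %d, b max-aligned: %d\n",
               is_max_aligned(a), is_max_aligned(b));
    }
    free(a);   /* free(NULL) is a no-op, so this is safe on failure too */
    free(b);
}
```

Any gap between a and b beyond sizeof(Node) comes from the allocator, not from the compiler's layout of the struct.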
Can you explain why compiler separates Node structs by adding 8 bytes between Nodes?
It's a coincidence. There is no rule about how to lay out memory for any sequence of malloc() calls.
The addresses can be ascending with a fixed interval, descending with varying intervals, (seemingly) random, and so on.
If you want fixed relative addresses, use an array:
struct Node arr[3];
ptrdiff_t delta10 = &arr[1] - &arr[0];
ptrdiff_t delta20 = &arr[2] - &arr[0];
ptrdiff_t delta21 = &arr[2] - &arr[1];
if (delta10 != delta21) /* cannot happen */;
or allocate a group of elements (maybe with realloc()) at the same time
struct Node *elements = malloc(3 * sizeof *elements);
ptrdiff_t delta10 = &elements[1] - &elements[0];
ptrdiff_t delta20 = &elements[2] - &elements[0];
ptrdiff_t delta21 = &elements[2] - &elements[1];
if (delta10 != delta21) /* cannot happen */;
free(elements);

Why does node* root = malloc(sizeof(int)) allocate 16 bytes of memory instead of 4?

I'm messing around with Linked List type data structures to get better with pointers and structs in C, and I don't understand this.
I thought that malloc returned the address of the first block of memory of size sizeof to the pointer.
In this case, my node struct looks like this and is 16 bytes:
typedef struct node{
int index;
struct node* next;
}node;
I would expect that if I try to do this: node* root = malloc(sizeof(int))
malloc would allocate only a block of 4 bytes and return the address of that block to the pointer node.
However, I'm still able to assign a value to index and get root to point to a next node, as such:
root->index = 0;
root->next = malloc(sizeof(node));
And the weirdest part is that if I try to run: printf("size of pointer root: %lu \n", sizeof(*root));
I get size of pointer root: 16, when I clearly expected to see 4.
What's going on?
EDIT: I just tried malloc(sizeof(char)) and it still tells me that *root is 16 bytes.
There are a few things going on here, plus one more that probably isn't a problem in this example but is a problem in general.
1) int isn't guaranteed to be 4 bytes, although in most C compiler implementations it is. I would double check sizeof(int) to see what you get.
2) node* root = malloc(sizeof(int)) is likely to cause all sorts of problems, because sizeof(struct node) is not the same as sizeof(int). As soon as you try to access root->next, you have undefined behavior.
3) sizeof(struct node) is not just an int, it is an int and a pointer. Pointers are, in practice, the same size throughout a program, depending on how it was compiled (32-bit vs 64-bit, for example), although the standard does not require all pointer types to have the same size. You can easily check this on your compiler with sizeof(void*). On mainstream platforms it will be the same as sizeof(int*) or sizeof(double*) or any other object pointer type.
4) Your struct should be sizeof(int) + sizeof(node*), but isn't guaranteed to be. For example, say I have this struct:
struct Example
{
char c;
int i;
double d;
};
You'd expect its size to be sizeof(char) + sizeof(int) + sizeof(double), which is 1 + 4 + 8 = 13 on my compiler, but in practice it won't be. Compilers can "align" members internally to match the underlying instruction architecture, which generally will increase the struct's size. The tradeoff is that they can access data more quickly. This is not standardized and varies from one compiler to another, or even between different versions of the same compiler with different settings.
5) Your line printf("size of pointer root: %lu \n", sizeof(*root)) does not print the size of the pointer root, it prints the size of the struct that root points to. This leads me to believe that you are compiling this as 64-bit code, so sizeof(int) is 4, and sizeof(void*) is 8, and they are being aligned to match the system word (8 bytes), although I can't be positive without seeing your compiler, system, and settings. If you want to know the size of the pointer itself, you need to do sizeof(node*) or sizeof(root). You dereference the pointer in your version, so it is the equivalent of saying sizeof(node).
The bottom line is that the weirdness you are experiencing is undefined behavior. You aren't going to find a concrete answer, and just because you think you find a pattern in the behavior doesn't mean you should use it (unless you want impossible to find bugs later that make you miserable).
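The padding described in point 4 can be made visible with offsetof from <stddef.h>. This is only a sketch: the helper names example_members_size and print_example_layout are made up here, and the offsets printed depend on the compiler and its settings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct Example {
    char c;
    int i;
    double d;
};

/* Sum of the member sizes, ignoring any padding. */
size_t example_members_size(void)
{
    return sizeof(char) + sizeof(int) + sizeof(double);
}

/* Prints where each member starts and how much of the struct is padding. */
void print_example_layout(void)
{
    /* Padding can only ever make the struct larger, never smaller. */
    assert(sizeof(struct Example) >= example_members_size());

    printf("offset of c = %zu\n", offsetof(struct Example, c));
    printf("offset of i = %zu\n", offsetof(struct Example, i));
    printf("offset of d = %zu\n", offsetof(struct Example, d));
    printf("sizeof(struct Example) = %zu\n", sizeof(struct Example));
    printf("padding bytes = %zu\n",
           sizeof(struct Example) - example_members_size());
}
```

If the offset of i is larger than sizeof(char), the difference is padding the compiler inserted after c.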
You didn't mention what system you are on (Windows or Linux, 32-bit or 64-bit), but your assumptions about memory allocation are wrong. Memory allocations are aligned to some specified boundary to guarantee all allocations for supported types are properly aligned - typically it is 16 bytes in 64-bit mode.
Check this - libc manual:
http://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
The address of a block returned by malloc or realloc in GNU systems is
always a multiple of eight (or sixteen on 64-bit systems). If you need
a block whose address is a multiple of a higher power of two than
that, use aligned_alloc or posix_memalign. aligned_alloc and
posix_memalign are declared in stdlib.h.
There are a few things happening here. First, C has no bounds checking. C doesn't track how much memory you allocated to a variable, either. You didn't allocate enough memory for a node, but C doesn't check that. The following "works", but really it doesn't.
node* root = malloc(sizeof(int));
root->index = 0;
root->next = malloc(sizeof(node));
Since there wasn't enough memory allocated for the struct, someone else's memory has been overwritten. You can see this by printing out the pointers.
printf("sizeof(int): %zu\n", sizeof(int));
printf("root: %p\n", (void *)root);               /* %p expects a void * */
printf("&root->index: %p\n", (void *)&root->index);
printf("&root->next: %p\n", (void *)&root->next);
sizeof(int): 4
root: 0x7fbde5601560
&root->index: 0x7fbde5601560
&root->next: 0x7fbde5601568
I've only allocated 4 bytes, so I'm only good from 0x7fbde5601560 to 0x7fbde5601563. root->index is fine, but root->next is writing to someone else's memory. It might be unallocated, in which case it might get allocated to some other variable and then you'll see weird things happening. Or it might be memory for some existing variable, in which case it will overwrite that memory and cause very difficult to debug memory problems.
But it didn't go so far out of bounds so as to walk out of the memory allocated to the whole process, so it didn't trigger your operating system's memory protection. That's usually a segfault.
Note root->next is 8 bytes after root->index because this is a 64-bit machine, so the 8-byte pointer member is aligned on an 8-byte boundary. If you were to put another integer into the struct after index, next would still be 8 bytes off.
There's another possibility: even though you only asked for sizeof(int) memory, malloc probably allocated more. Most memory allocators do their work in chunks. But this is all implementation defined, so your code still has undefined behavior.
And the weirdest part is that if I try to run: printf("size of pointer root: %lu \n", sizeof(*root)); I get size of pointer root: 16, when I clearly expected to see 4.
root is a pointer to a struct, and you'd expect sizeof(root) to be pointer sized, 8 bytes on a 64 bit machine to address 64 bits of memory.
*root dereferences that pointer, sizeof(*root) is the actual size of the struct. That's 16 bytes. (4 for the integer, 4 for padding, 8 for the struct pointer). Again, C doesn't track how much memory you allocated, it only tracks what the size of the variable is supposed to be.
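For comparison, here is a sketch of the allocation done with the size of the pointed-to struct rather than sizeof(int). The helper names node_new and node_free_all are hypothetical and not part of the question's code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int index;
    struct node *next;
} node;

/* Hypothetical helper: allocates one fully sized node, or returns NULL. */
node *node_new(int index)
{
    node *n = malloc(sizeof *n);   /* sizeof *n is sizeof(node), not sizeof(int) */
    if (n == NULL)
        return NULL;
    n->index = index;
    n->next = NULL;
    return n;
}

/* Frees every node reachable from head. */
void node_free_all(node *head)
{
    while (head != NULL) {
        node *next = head->next;   /* read the link before freeing the node */
        free(head);
        head = next;
    }
}
```

Writing malloc(sizeof *n) keeps the requested size tied to the pointer's type, so both root->index and root->next stay inside the allocated block.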

Array of pointers confusion with skip list

I think I have a basic understanding of how skip lists work, but being new to them in addition to being a C-beginner has me confused on a few points, especially the initialization of the list. Here's the code I'm trying to follow:
#define MAXSKIPLEVEL 5
typedef struct Node {
int data;
struct Node *next[1];
} Node;
typedef struct SkipList {
Node *header;
int level;
} SkipList;
// Initialize skip list
SkipList* initList() {
SkipList *list = calloc(1, sizeof(SkipList));
if ((list->header = calloc(1, sizeof(Node) + MAXSKIPLEVEL*sizeof(Node*))) == 0) {
printf("Memory Error\n");
exit(1);
}
for (int i = 0; i < MAXSKIPLEVEL; i++)
list->header->next[i] = list->header;
return list;
}
I haven't done anything with arrays of pointers yet in C, so I think I'm getting a bit caught up with how they work. I have a few questions if someone would be kind enough to help me out.
First, I did sizeof(int) and got 4, sizeof(Node*) and got 8, so I expected sizeof(Node) to equal 12, but it ended up being 16, why is this? Same confusion with the size of SkipList compared to the sizes of its contents. I took the typedef and [1] out to see if either of them was the cause, but the size was still 16.
Second, why is the [1] in struct Node *next[1]? Is it needed for the list->header->next[i] later on? Is it okay that next[i] will go higher than 1? Is it just because the number of pointers for each node is variable, so you make it an array then increase it individually later on?
Third, why does list->header->next[i] = list->header initially instead of NULL?
Any advice/comments are greatly appreciated, thanks.
For your first question - why isn't the size of the struct the size of its members? - this is due to struct padding, where the compiler, usually for alignment reasons, may add in extra blank space between or after members of a struct in order to get the size up to a nice multiple of some fundamental size (often 8 or 16). There's no portable way to force the size of the struct to be exactly the size of its members, though most compilers have some custom switches you can flip to do this.
For your second question - why the [1]? - the idea here is that when you actually allocate one of the node structs, you'll overallocate the space so that the memory at the end of the struct can be used for the pointers. By creating an array of length one and then overallocating the space, you make it syntactically convenient to access this overallocated space as though it were a part of the struct all along. Newer versions of C have a concept called flexible array members that have supplanted this technique; I'd recommend Googling it and seeing if that helps out.
For your final question - why does list->header->next[i] initially point to list->header rather than NULL? - without seeing more of the code it's hard to say. Many implementations of linked list structures use some sort of trick like this to avoid having to special-case on NULL in the implementation, and it's entirely possible that this sort of trick is getting used here as well.
The sizeof number is 16 because of structure padding.
Many architectures either require or strongly prefer their pointers to be aligned on a certain boundary (e.g., 4-byte boundary, 8-byte boundary, etc.) They will either fail, or they will perform slowly, if pointers are "misaligned". Your C compiler is probably inserting 4 unused bytes in the middle of your structure so that your 8-byte pointer is aligned on an 8-byte boundary, which causes the structure size to increase by 4 bytes.
There is more explanation available from the C FAQ.
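As a rough illustration of the flexible array member mentioned above, the node could be declared and allocated as sketched below. This assumes C99 or later; the type name FlexNode, the levels field, and the helper flexnode_new are inventions for this example and do not come from the question's code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct FlexNode {
    int data;
    int levels;                 /* hypothetical field: entries in next[] */
    struct FlexNode *next[];    /* flexible array member: must be last */
} FlexNode;

/* Hypothetical helper: allocates a node with room for `levels` forward pointers. */
FlexNode *flexnode_new(int data, int levels)
{
    assert(levels > 0);

    /* sizeof *n covers the fixed part; the pointers are added on top. */
    FlexNode *n = calloc(1, sizeof *n + (size_t)levels * sizeof n->next[0]);
    if (n == NULL)
        return NULL;

    n->data = data;
    n->levels = levels;
    /* As with the question's header node, each level starts out
       pointing back at the node itself. */
    for (int i = 0; i < levels; i++)
        n->next[i] = n;
    return n;
}
```

Because the array has no declared length, indexing next[i] for any i below levels stays within the object that was allocated, which is the guarantee the [1] version only gets by convention.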

malloc and heap in c

Consider the code below:
#include "list.h"
struct List
{
int size;
int* data;
};
List *list_create()
{
List *list;
printf("%d %d",sizeof(list),sizeof(List));
list = malloc(sizeof(list));
assert(list != NULL);
if (list != NULL) {
list->size = 0;
}
return list;
}
The numbers printed are "4 8". I assume the 4 is the 4 bytes taken by "int size" in the List object, and the size of "int* data" is 0 because nothing has been assigned to data?
Or is the size of an int pointer also 4 bytes, so the type List takes 8 bytes in total? Or is there something else going on? Can someone help me understand all this in detail?
Then malloc() gets 4 bytes from the heap and assigns the address to the pointer list? Later, in main, if I do "list->data[i]=1;" this gives me a run-time error. Why? Is it because I can't change contents in the heap? But if I do "list->size++" this works.
Isn't the whole list object in the heap?
I really need some help here.
Thanks in advance.
sizeof(List*) is the size of a pointer to a List struct.
sizeof(list) in your case, since variable list is of type List* is the same as sizeof(List*).
sizeof(List), on the other hand, is the size of the struct List. It contains two 32-bit members, an integer and a pointer (I assume you are using a 32-bit compiler), and your compiler decided that the right size for your struct is 8 bytes.
Pointers are usually 4 bytes with 32-bit compilers and 8 bytes with 64-bit compilers.
As a side note, reading your code, I see that you never initialize list->data; you should initialize it somewhere before using it.
Using List without the struct keyword is C++, however; in C you should write
typedef struct { ... } List; // This is C.
The sizeof operator is evaluated at compile time, not at run time (variable-length arrays aside); it only gives information about the size of a type.
You cannot, for example, find out how many elements are in a dynamic array with sizeof. If that is what you were trying to accomplish, note that sizeof(pointer) gives you the size in bytes of the pointer type.
As something to read about what is a pointer and what is an array i would suggest you to read http://www.lysator.liu.se/c/c-faq/c-2.html or http://pw1.netcom.com/~tjensen/ptr/pointers.htm
Technically your code has an error in it.
The code should read: sizeof(struct List) or have typedef struct List List; somewhere.
But yes, sizeof(list) is the size of the variable list. Since list is a pointer, it is typically the same as sizeof(void*), which on your system/compiler is 4.
sizeof(struct List) is the size of the struct, which is sizeof(int) + sizeof(int*) plus any padding added for alignment. The alignment part is often forgotten but is very important, as it can change the size of the struct in unexpected ways.
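Putting the points above together, a corrected list_create might look like the sketch below. The capacity parameter, the capacity field, and list_destroy are assumptions added for this illustration; the question's code does not say how data is meant to be sized.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct List {
    int size;       /* number of elements in use */
    int capacity;   /* hypothetical field: elements that data can hold */
    int *data;
} List;

/* Allocates the struct itself and a zeroed data array of `capacity` ints. */
List *list_create(int capacity)
{
    assert(capacity > 0);

    List *list = malloc(sizeof *list);   /* size of the struct, not of the pointer */
    if (list == NULL)
        return NULL;

    list->data = calloc((size_t)capacity, sizeof *list->data);
    if (list->data == NULL) {
        free(list);
        return NULL;
    }
    list->size = 0;
    list->capacity = capacity;
    return list;
}

/* Releases the data array and then the struct. */
void list_destroy(List *list)
{
    if (list != NULL) {
        free(list->data);
        free(list);
    }
}
```

With data pointing at real storage, an assignment such as list->data[i] = 1 is valid for any i from 0 to capacity - 1.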
