Heap memory: Gap of 16 bytes for 8 byte struct - c

I'm using the following code to create and insert a new node into a linked list, subsequently freeing them.
// the node
struct node
{
    int data;
    struct node *next;
};
// returns a pointer to a new node with data
struct node *new_node(int data)
{
    struct node *node;
    node = malloc(sizeof(struct node));
    node->data = data;
    node->next = NULL;
    return node;
}
// insert node at front of list
void push(struct node **head, int data)
{
    struct node *node = new_node(data);
    node->next = *head;
    *head = node;
}
// free the list, deallocate each node's memory
void delete_list(struct node** head)
{
    // loop on *head so that a single-node list is freed as well
    while(*head != NULL)
    {
        struct node *next = (*head)->next;
        // print address of the memory released
        printf("Freeing %d\n", (int)*head);
        free(*head);
        *head = next;
    }
}
Now, the struct is 8 bytes on my machine (4-byte int and 4-byte pointer). I'm a bit unsure about the following, so please help me out:
1. When I call push() in sequence, is the memory allocated contiguously? Is that always the case? I guess it cannot be, since memory in the heap can be fragmented.
2. Say the memory allocated was contiguous; would the nodes then be spaced 8 bytes apart, since the struct's size is 8 bytes? On my machine, when I printed the address of the memory being freed, the addresses printed are 16 bytes apart, on every execution. Why?
Freeing 148025480
Freeing 148025464
Freeing 148025448
Freeing 148025432
Freeing 148025416
Freeing 148025400
Freeing 148025384
Freeing 148025368
Freeing 148025352
<empty list>
Now, if the memory was NOT allocated contiguously for an integer array (the heap was very fragmented and the memory requirement quite large), and we used pointer arithmetic to process each element of the array by incrementing the address by 4 each time (or whatever the size of int is), shouldn't we run into some memory not reserved by our program, throwing off the program? Or is the runtime environment smart enough to take care of this, since the compiler cannot, as it doesn't know how the memory will be allocated? Does the OS take care of this?

Each time you call new_node, it calls malloc().
You cannot (or should not) predict where malloc() will find you memory. It is OS and runtime dependent.
Running on a particular OS, under certain circumstances, you might observe that allocations from consecutive calls to malloc() are contiguous. However that behaviour may change under load, or with a kernel update, a change in the implementation of libc, or under all kinds of other conditions.
You can assume that the chunk of memory allocated by a single call to malloc() is contiguous (at least, in terms of the pointers your program sees). Your program should not assume anything else about contiguity.
If this really bothers you, you can take charge of more of the memory management in your own code -- instead of calling malloc() for each node, call it at the start and get a larger chunk of memory. Subsequent calls to new_node can use part of this chunk. If you run out of space in that chunk, you can either malloc() another chunk (which probably won't be contiguous with the first) or realloc() to extend (and probably move) it.
You'll probably find that all this makes your code more complicated -- and it's up to you whether there are benefits to counter that. The authors of the Hotspot Java VM essentially do this: they malloc() a big block of memory at the start of execution, and then, rather than calling malloc() and free() whenever the Java program wants memory, the VM uses its own routines to allocate parts of that block.

Regarding #2, one reason for the results of calls to malloc not being exactly contiguous can be metadata: when you ask for x bytes, malloc implementations may allocate a little extra memory for internal bookkeeping processes (tagging the size of the block, pointers to other free blocks, etc). Thus a request for 8 bytes may actually cause 16 bytes to be allocated internally, hence the 16-byte spacing between successive allocations.

If you allocate memory using malloc, the memory returned will always be contiguous.
If you call malloc twice, there's no guarantee that the two allocations will be placed next to each other, not even if the two malloc calls are right next to each other.

When allocating memory using malloc() you will often see a gap between consecutive allocations that is larger than the size you requested; in the question, each 8-byte request occupies 16 bytes. This is normal. The allocator typically stores bookkeeping information (such as the block size) next to each block so that free() can later tell how much memory to release, and it rounds block sizes up for alignment. The exact overhead depends on the implementation.
Let's say we have the following 2D array, with each row allocated by a separate call to malloc():
1 2 3
4 5 6
7 8 9
And the starting address of array is 448 (also the memory address of '1')
Then the memory address for '1', '4', '7' would be 448, 480, 512 respectively which could be calculated as follows:
448 + (3 x 4) + 4 + 16 = 480
Here, we have:
448 - Address of the previous block
3 - Number of columns or number of items in the array
4 - Size of int (I'm assuming the array as integer array)
4 - Extra four bytes because size of int is 4 bytes. So the last int address + 4
16 - As left by the malloc for free() function
I hope it clarifies a little.

Related

How C knows size of memory it need to free?

I have pointer to buffer that was initialised with calloc:
notEncryptBuf = (unsigned char*) calloc(1024, notEncryptBuf_len);
Later I moved pointer to another position:
notEncryptBuf+=20;
And finally I free buffer:
free(notEncryptBuf);
Will it free the whole allocated size? How does C know the size of memory it needs to free?
The behavior of free is defined only if it is passed an address that was previously returned by malloc or a related routine, or a null pointer (in which case it does nothing). If you pass an address modified from an original allocation, as by notEncryptBuf += 20;, the behavior of free is undefined.
C implementations commonly know how much space is in an allocation because they store it in some bytes immediately preceding the allocation. For example, if you ask for 1,024 bytes, it may allocate 1,040, record information about the allocation in the first 16 bytes, and return to you the address 16 bytes after the start of the whole allocation. Then, when you pass that address to free, it looks in the 16 bytes before that address to see the amount of space.
Other implementations are theoretically possible. For example, a memory manager could designate one zone of memory for common fixed-size allocations, such as 32 bytes, and then use a bitmap to indicate whether each 32-byte block in that zone is free or allocated. Or it could keep a database of allocations, using a hash table or trees or other data structures. When free is called, it would look up the address in the database.
How C knows size of memory it need to free?
"C" does not know about memory allocation or freeing. It relies on the underlying memory manager to keep track of the allocated memories and free them up.
That said, if you pass a pointer to free() which was not returned by any allocator function, it invokes undefined behaviour. So, you cannot pass the pointer which you have shifted to free(). You need to pass the pointer which was returned by calloc().
A good way to answer these kinds of questions is to ask yourself, “how would I myself write malloc() and free()?”
Suppose you have a “memory pool” — fancy words for just an array of bytes:
unsigned char memory[10000];
Now the user wants eight bytes of that. The user calls ptr = my_malloc(8). You know full well that you can’t just give the user any random spot in your memory array — you can only give away stuff that hasn’t already been given away.
In other words, you somehow need to keep track of what pieces of memory have been given away.
Linked-lists → Variable-sized elements in an array
One way we know of to manage dynamic memory is through linked-lists. A linked list is a block of memory that you organize with a struct:
struct node
{
    SOME_TYPE data;     // the data to store
    struct node * next; // a pointer to the next node
};
However, since we are the memory manager, we don’t have some magic pool to allocate our node. We have to use our own memory[] to create space for the node.
Let’s make a simple modification. Instead of a pointer, we will keep track of how big a piece is. We can do this with a structure:
struct piece_of_memory
{
    int size;               // how many bytes of memory follow
    unsigned char memory[]; // 'size' bytes (a flexible array member)
};
Memory, then, is just an array of those things, where all the sizes add up to our available memory pool size:
piece_of_memory memory[...];
So now our initial pool of memory looks like this:
int size = 9992; // 10000 minus the size field, taken to be eight bytes in this walkthrough (sizeof(int) is often 4, even on 64-bit systems)
unsigned char memory[9992];
Graphically, that’s something like
[----,------------------------------------------------------------]
  ↑    ↑
  9992 memory
If I give away eight bytes, that gets reordered:
[----,---][----,--------------------------------------------------]
  ↑    ↑    ↑    ↑
  ↑    ↑    ↑    memory
  ↑    ↑    new size = 9992 - 8 - sizeof(int) = 9976
  ↑    ↑
  8    returned from malloc
That is two of those structs in a row
int size = 8
unsigned char memory[8]
int size = 9976
unsigned char memory[9976]
We can verify that the pieces all use exactly 10000 bytes:
{(size) 8 + (8 bytes) 8} + {(size) 8 + (9976 bytes) 9976}
= 16 + 9984
= 10000
So when the user asks us to ptr = my_malloc(8), we find a piece of memory with at least eight available bytes, rearrange things, then return a pointer to the ‘memory’ part (not the ‘size’!).
Freeing allocated memory
Suppose our user is now finished with the eight bytes and calls my_free(ptr).
[----,---][----,--------------------------------------------------]
  8    ↑    9976
       ↑
       free me!
We can find our struct piece_of_memory (it is sizeof(int) bytes before the address returned to us), and we can recombine the free pieces of memory into a whole free block:
[----,------------------------------------------------------------]
  9992
Notice how this only works if the user gives us an address we gave it earlier, right? What would happen if I returned a wrong ptr value?
More to think about
Naturally we must also be able to keep track of which blocks are available to return and which ones are in use. This makes our struct piece_of_memory a bit more complicated. We could do something like:
struct piece_of_memory
{
    int size;
    bool is_used;
    unsigned char memory[];
};
We also need a way for the memory manager to search through the memory blocks for a piece that is big enough for the requested size. If we want to be smart about it, we might take some time to find the smallest available block that is big enough for the requested size.
We don’t actually have to keep the (‘size’ and ‘is_used’) with the ‘memory’ pieces, either. We could split up our struct to simply have an array of (‘size’ + ‘is_used’) structures at one end of our memory[] array and all the pieces of returned memory at the other end.
Finally, we must waste a little memory when we divide it up in order to make sure that we always return a pointer that is aligned for the worst-case alignment needs our user might put it to. For example, if user wants to get dynamic memory for an array of double, we don’t want to return something that is byte-aligned.
This isn’t the only way to do it!
This is just one simple way. More advanced structures could certainly be used as well.
Conclusions
Hopefully you can answer your own questions now:
How does the memory manager know how much memory to free?
(Because it keeps track of it.)
Can I return a pointer that was not given to me by the memory manager?
(No, because it would break things.)
Obviously the memory manager can be written to prevent things from breaking if you try to free a pointer it did not give you, but the C specification does not require it to. It requires (expects) the user to not give it bad input.

C: Is my understanding about the specifics of heap and stack allocation correct?

I have a sort of linked list implemented (code at bottom) in C (which has obvious issues, but I'm not asking about those or about linked lists; I'm aware for instance that there are no calls to free() the allocated memory) given below which does what I expect (so far as I've checked). My question is about the first couple of lines of the addnodeto() function and what it does to the heap/stack.
My understanding is that calling malloc() sets aside some memory on the heap, and then returns the address of that memory (pointing to the beginning) which is assigned to struct node *newnode which is itself on the stack. When the function is first called, *nodetoaddto is a pointer to struct node first, both of which are on the stack. Thus the (*nodeaddto)->next = newnode sets first.next equal to the value of newnode which is the address of the newly allocated memory.
When we leave this function, and continue executing the main() function, is *newnode removed from the stack (not sure if 'deallocated' is the correct word), leaving only struct node first pointing to the 'next' node struct on the heap? If so, does this 'next' struct node have a variable name also on the stack or heap, or it is merely some memory pointed too? Moreover, is it true to say that struct node first is on the stack, whilst all subsequent nodes will be on the heap, and that just before main() returns 0 there are no structs/variables on the stack other than struct node first? Or is/are there 1/more than 1 *newnode still on the stack?
I did try using GDB which showed that struct node *newnode was located at the same memory address both times addnodeto() was called (so was it removed and then happened to be re-defined/allocated in to the same location, or was perhaps the compiler being smart and left it there even once the function was exited the first time, or other?), but I couldn't work anything else out concretely. Thank you.
The code:
#include <stdio.h>
#include <stdlib.h>
#define STR_LEN 5
struct node {
    char message[STR_LEN];
    struct node *next;
};
void addnodeto(struct node **nodeaddto, char letter, int *num_of_nodes){
    struct node *newnode = malloc(sizeof(struct node));
    (*nodeaddto)->next = newnode;
    newnode->message[0] = letter;
    (*nodeaddto) = newnode;
    *num_of_nodes += 1;
}
int main(void){
    struct node first = {"F", NULL};
    struct node *last = &first;
    int num_nodes = 1;
    addnodeto(&last, 'S', &num_nodes);
    addnodeto(&last, 'T', &num_nodes);
    addnodeto(&last, 'I', &num_nodes);
    printf("Node: %d holds the char: %c\n", num_nodes-3, first.message[0]);
    printf("Node: %d holds the char: %c\n", num_nodes-2, (first.next)->message[0]);
    printf("Node: %d holds the char: %c\n", num_nodes-1, ((first.next)->next)->message[0]);
    printf("Node: %d holds the char: %c\n", num_nodes, (last)->message[0]);
    return 0;
}
Which when run outputs:
Node: 1 holds the char: F
Node: 2 holds the char: S
Node: 3 holds the char: T
Node: 4 holds the char: I
As expected.
My understanding is that calling malloc() sets aside some memory on the heap, and then returns the address of that memory (pointing to the beginning)…
Yes, but people who call it “the heap” are being sloppy with terminology. A heap is a kind of data structure, like a linked list, a binary tree, or a hash table. Heaps can be used for things other than tracking available memory, and available memory can be tracked using data structures other than a heap.
I do not actually know of a specific term for the memory that the memory management routines manage. There are actually several different sets of memory we might want terms for:
all the memory they have acquired from the operating system so far and are managing, including both memory that is currently allocated to clients and memory that has been freed (and not yet returned to the operating system) and is available for reuse;
the memory that is currently allocated to clients;
the memory that is currently available for reuse; and
the entire range of memory that is being managed, including portions of the virtual address space reserved for future mapping when necessary to request more memory from the operating system.
I have seen “pool” used to describe such memory but have not seen a specific definition of it.
… which is assigned to struct node *newnode which is itself on the stack.
struct node *newnode is indeed nominally on the stack in common C implementations. However, the C standard only classifies it as automatic storage duration, meaning its memory is automatically managed by the C implementation. The stack is the most common way to implement that, but specialized C implementations may do it in other ways. Also, once the compiler optimizes the program, newnode might not be on the stack; the compiler might generate code that just keeps it in a register, and there are other possibilities too.
A complication here is when we are talking about memory use in a C program, we can talk about the memory use in a model computer the C standard uses to describe the semantics of programs or the memory use in actual practice. For example, as the C standard describes it, every object has some memory reserved for it during its lifetime. However, when a program is compiled, the compiler can produce any code it wants that gets the same results as required by the C standard. (The output of the program has to be the same, and certain other interactions have to behave the same.) So a compiler might not use memory for an object at all. After optimization, an object might be in memory at one time and in registers at another, or it might always be in a register and never in memory, and it might be in different registers at different times, and it might not be any particular place because it might have been incorporated into other things. For example, in int x = 3; printf("%d\n", 4*x+2);, the compiler might eliminate x completely and just print “14”. So, when asking about where things are in memory, you should be clear about whether you want to discuss the semantics in the model computer that the C standard uses or the actual practice in optimized programs.
When the function is first called, *nodetoaddto is a pointer to struct node first, both of which are on the stack.
nodetoaddto may be on the stack, per above, but it also may be in a register. It is common that function arguments are passed in registers.
It points to a struct node. By itself, struct node is a type, so it is just a concept, not an object to point to. In contrast, “a struct node” is an object of that type. That object might or might not be on the stack; addnodeto would not care; it could link to it regardless of where it is in memory. Your main routine does create its first and last nodes with automatic storage duration, but it could use static just as well, and then the nodes would likely be located in a different part of memory rather than the stack, and addnodeto would not care.
Thus the (*nodeaddto)->next = newnode sets first.next equal to the value of newnode which is the address of the newly allocated memory.
Yes: In main, last is initialized to point to first. Then &last is passed to addnodeto, so nodeaddto is a pointer to last. So *nodeaddto is a pointer to first. So (*nodeaddto)->next is the next member in first.
When we leave this function, and continue executing the main() function, is *newnode removed from the stack (not sure if 'deallocated' is the correct word), leaving only struct node first pointing to the 'next' node struct on the heap?
newnode is an object with automatic storage duration inside addnodeto, so its memory is automatically released when addnodeto ends.
*newnode is a struct node with allocated storage duration, so its memory is not released when a function ends. Its memory is released when free is called, or possibly some other routine that may release memory, like realloc.
If so, does this 'next' struct node have a variable name also on the stack or heap, or it is merely some memory pointed [to]?
There are no variable names in the stack or in the heap. Variable names exist only in source code (and in the compiler while compiling and in debugging information associated with the compiled program, but that debugging information is generally separate from the normal execution of the program). When we work with allocated memory, we generally work with it only by pointers to it.
Moreover, is it true to say that struct node first is on the stack, whilst all subsequent nodes will be on the heap,…
Yes, subject to the caveats about stack and “heap” above.
… and that just before main() returns 0 there are no structs/variables on the stack other than struct node first?
All of the automatic objects in main are on the stack (or otherwise automatically managed): first, last, and num_nodes.
Or is/are there 1/more than 1 *newnode still on the stack?
No.

Why does node* root = malloc(sizeof(int)) allocate 16 bytes of memory instead of 4?

I'm messing around with Linked List type data structures to get better with pointers and structs in C, and I don't understand this.
I thought that malloc returned the address of the first block of memory of size sizeof to the pointer.
In this case, my node struct looks like this and is 16 bytes:
typedef struct node{
    int index;
    struct node* next;
}node;
I would expect that if I try to do this: node* root = malloc(sizeof(int))
malloc would allocate only a block of 4 bytes and return the address of that block to the pointer node.
However, I'm still able to assign a value to index and get root to point to a next node, as such:
root->index = 0;
root->next = malloc(sizeof(node));
And the weirdest part is that if I try to run: printf("size of pointer root: %lu \n", sizeof(*root));
I get size of pointer root: 16, when I clearly expected to see 4.
What's going on?
EDIT: I just tried malloc(sizeof(char)) and it still tells me that *root is 16 bytes.
There are a few things going on here, plus one more that probably isn't a problem in this example but is a problem in general.
1) int isn't guaranteed to be 4 bytes, although in most C implementations it is. I would double check sizeof(int) to see what you get.
2) node* root = malloc(sizeof(int)) is likely to cause all sorts of problems, because sizeof(struct node) is not the same as an int. As soon as you try to access root->next, you have undefined behavior.
3) sizeof(struct node) is not just an int, it is an int and a pointer. The standard does not require all pointer types to have the same size, but on mainstream platforms every object pointer has the same size, which depends on how the program was compiled (32-bit vs 64-bit, for example). You can easily check this on your compiler with sizeof(void*). On such platforms it will be the same as sizeof(int*) or sizeof(double*) or any other object pointer type.
4) You might expect your struct's size to be sizeof(int) + sizeof(node*), but that isn't guaranteed. For example, say I have this struct:
struct Example
{
    char c;
    int i;
    double d;
};
You'd expect its size to be sizeof(char) + sizeof(int) + sizeof(double), which is 1 + 4 + 8 = 13 on my compiler, but in practice it won't be. Compilers can insert padding to "align" members to match the underlying architecture, which generally increases the struct's size. The tradeoff is that they can access data more quickly. The exact layout is not standardized and varies from one compiler to another, or even between versions of the same compiler with different settings. You can learn more about it here.
5) Your line printf("size of pointer root: %lu \n", sizeof(*root)) does not print the size of the pointer root; it prints the size of the struct that root points to. This leads me to believe that you are compiling this as 64-bit code, so sizeof(int) is 4, and sizeof(void*) is 8, and they are being aligned to match the system word (8 bytes), although I can't be positive without seeing your compiler, system, and settings. If you want to know the size of the pointer, you need to do sizeof(node*) or sizeof(root). You dereference the pointer in your version, so it is the equivalent of saying sizeof(node).
The bottom line is that the weirdness you are experiencing is undefined behavior. You aren't going to find a concrete answer, and just because you think you find a pattern in the behavior doesn't mean you should use it (unless you want impossible-to-find bugs later that make you miserable).
You didn't mention what system (M$ or linux, 32bit or 64bit) but your assumptions about memory allocation are wrong. Memory allocations are aligned to some specified boundary to guarantee all allocations for supported types are properly aligned - typically it is 16 bytes for 64bit mode.
Check this - libc manual:
http://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
The address of a block returned by malloc or realloc in GNU systems is
always a multiple of eight (or sixteen on 64-bit systems). If you need
a block whose address is a multiple of a higher power of two than
that, use aligned_alloc or posix_memalign. aligned_alloc and
posix_memalign are declared in stdlib.h.
There's a few things happening here. First, C has no bounds checking. C doesn't track how much memory you allocated to a variable, either. You didn't allocate enough memory for a node, but C doesn't check that. The following "works", but really it doesn't.
node* root = malloc(sizeof(int));
root->index = 0;
root->next = malloc(sizeof(node));
Since there wasn't enough memory allocated for the struct, someone else's memory has been overwritten. You can see this by printing out the pointers.
printf("sizeof(int): %zu\n", sizeof(int));
printf("root: %p\n", (void *)root);
printf("&root->index: %p\n", (void *)&root->index);
printf("&root->next: %p\n", (void *)&root->next);
sizeof(int): 4
root: 0x7fbde5601560
&root->index: 0x7fbde5601560
&root->next: 0x7fbde5601568
I've only allocated 4 bytes, so I'm only good from 0x7fbde5601560 up to, but not including, 0x7fbde5601564. root->index is fine, but root->next is writing to someone else's memory. It might be unallocated, in which case it might get allocated to some other variable and then you'll see weird things happening. Or it might be memory for some existing variable, in which case it will overwrite that memory and cause very difficult to debug memory problems.
But it didn't go so far out of bounds so as to walk out of the memory allocated to the whole process, so it didn't trigger your operating system's memory protection. That's usually a segfault.
Note root->next is 8 bytes after root->index because this is a 64-bit machine, where the 8-byte pointer member must sit on an 8-byte boundary, so 4 bytes of padding follow index. If you were to put another integer into the struct after index, it would occupy that padding and next would still be 8 bytes off.
There's another possibility: even though you only asked for sizeof(int) memory, malloc probably allocated more. Most memory allocators do their work in chunks. But this is all implementation defined, so your code still has undefined behavior.
And the weirdest part is that if I try to run: printf("size of pointer root: %lu \n", sizeof(*root)); I get size of pointer root: 16, when I clearly expected to see 4.
root is a pointer to a struct, and you'd expect sizeof(root) to be pointer sized, 8 bytes on a 64 bit machine to address 64 bits of memory.
*root dereferences that pointer, sizeof(*root) is the actual size of the struct. That's 16 bytes. (4 for the integer, 4 for padding, 8 for the struct pointer). Again, C doesn't track how much memory you allocated, it only tracks what the size of the variable is supposed to be.

Malloc for character array and Structures

In this question the usage of malloc for character arrays is explained in a detailed way.
When should I use malloc in C and when don't I?
Is this same for structures in C?
For example consider the following definition:
struct node
{
    int x;
    struct node * link;
};
typedef struct node * NODE;
Consider the following two usage of the above structure:
1)
NODE temp = (NODE) malloc(sizeof(struct node));
temp->x =5;
temp->link = NULL;
2)
struct node node1, *temp;
node1.x = 5;
node1.link = NULL;
temp = &node1;
Can I use the declaration of temp from the second example and modify the node1.link point to another structure struct node node2 by using temp->link = &node2 (pointer to node2 structure)?
Here, this implementation is used for creating a tree data structure.
Will the structures also follow the same rules as like arrays as stated in the above link?
Because many implementations I have seen followed the first usage. Is there any specific reason for using malloc ?
You can do what you describe in #2, but remember that it will only work as long as node1 is in scope. If node1 is a local variable in a function, then when the function returns, the memory location it refers to will be reclaimed and used for something else. Any other pointers in your program which still point to &node1 will no longer be valid. One of the advantages of using malloc to allocate memory dynamically is that it remains valid until you call free to dispose of it explicitly.
The first usage uses heap memory, which can be almost as large as your physical memory, but the second usage uses stack memory, which is quite scarce (often around 8 MB by default).
Yes. You can do "temp->link = &node2;", because temp points to node1.
Yes, this rule is very general in C, so it can work for arrays.
When malloc is used, the space is allocated in the heap rather than the stack. This allows you to allocate more space for large data structures. The downside is that you also need to "free" the memory after use. Otherwise, a memory leak will occur.
If you allocate memory dynamically, you must deallocate that memory programmatically by calling free().
In the first usage you allocate memory dynamically, so it comes from the heap. Data stored in heap memory stays there until you call free(); otherwise it is released only when the program terminates.
The second usage stores the value on the stack, so its scope is local to the function.
Your example suggests you are working on a linked list, so you may need to add or delete nodes at any time; dynamic allocation is the best option for that.

How to resolve the situation that malloc() corrupts the pointer?

Here's a simplified version of a function I wrote just now:
int foobar(char * foo) {
    puts(type);
    struct node * ptr = (struct node *) malloc (sizeof(struct node));
    puts(type);
    memset(ptr, 0, sizeof(ptr));
    ptr=head;
    return head->id;
}
Here node is just a struct declared as a node in the linklist, which contains a char * and a pointer to the next node. However, I realize that the malloc() here is corrupting my input char * foo.
Why would malloc() corrupt my input char pointer? Also, how could I resolve the issue here? Now I am just copying the content of that pointer to a local array, but this is too hacky, even for my taste (which isn't the best).
Thanks for any inputs!
EDIT: Well, here's more real code:
void foobar(char * type) {
    puts(type); <-- here it's a long string about 30 char
    struct node * ptr = (struct node *) malloc (sizeof(struct node));
    puts(type); <- chopped off, 10 left with some random thing at the end
}
Hope the problem is clear now! Thanks!
EDIT:
Here's how type got initialized:
type = strdup("Some ");
tempType = strdup("things");
sprintf(type + strlen(type), "%s", tempType);
Thanks!
The apparent corruption would happen because type or foo points at already freed memory which malloc() gives back to you for a different use.
Once you've released the memory, you cannot continue to use it.
You also have a problem because you allocate to ptr, then wipe ptr out with:
ptr = head;
You might have meant:
head = ptr;
but you would probably need to set ptr->next = head; before that. Of course, this is speculation since you've not shown the type definitions.
It also isn't obvious why you return head->id instead of either head or ptr. Unfortunately, we don't have enough information to say "that is wrong"; it is just not usual.
Commentary on 2nd Edit
Here's how type got initialized:
type = strdup("Some ");
tempType = strdup("things");
sprintf(type + strlen(type), "%s", tempType);
There is some of the trouble. You've gone trampling on memory that you have no business trampling on.
The first two lines are fine; you duplicate a string. Note, though, that type is a pointer to 6 bytes of memory, and tempType is a pointer to 7 bytes of memory.
The disaster strikes in the third line.
type + strlen(type) is pointing to the null byte at the end of the type string. You then write 1 byte of tempType over it more or less legitimately; you don't have a null terminated string any more, but the first byte is within bounds. The second and subsequent bytes are written in space that is not allocated to you, and that probably contains control information about memory allocation.
Writing out of the bounds of the allocated memory leads to 'undefined behaviour'. Anything can happen. On some machines, particularly with a 64-bit compilation, you might get away with it altogether. On most machines, and especially 32-bit compilations, you've wrecked your heap memory and something somewhere (typically somewhat distant from this spot) will run into trouble because of it. That is the nature of memory abuse; the place where it occurs often appears to work and it is some other innocent piece of code that suffers from the problems caused elsewhere.
So, if you want to concatenate those strings, you need to do something like:
char *type = strdup("Some ");
char *tempType = strdup("things");
char *concat = malloc(strlen(type) + strlen(tempType) + 1);
sprintf(concat, "%s%s", type, tempType);
I've omitted error checking. You should check the allocations from strdup() and malloc() to ensure that you got memory allocated. Some might argue that you should use snprintf(); it was a conscious decision not to do so since I've calculated the necessary space in the previous line and allocated sufficient space. But you should at least consider it. If you have not ensured that there is enough space available, then you should use snprintf() to avoid buffer overflows. You would also check its return value so you know whether the information was all formatted or not. (Also note that you have 3 pointers to free, or pass off to some other code so that the allocated memory is freed at an appropriate time.)
Note that on Windows, snprintf() (or _snprintf()) in older Microsoft C runtimes does not behave the way specified by the C99 standard. Frankly, that's not helpful.
I'm not sure what you're trying to do, but the comments indicate what's happening:
// This allocates enough memory for a struct node and assigns it to ptr.
struct node * ptr = (struct node *) malloc (sizeof(struct node));
// This displays the data in the (unspecified) string type,
// which must be null terminated.
puts(type);
// This zeroes only sizeof(ptr) bytes of the new block, i.e. the size of the
// pointer itself (4 bytes if pointers are 4 bytes), not the whole struct.
// You probably want memset(ptr, 0, sizeof(struct node));
memset(ptr, 0, sizeof(ptr));
// This makes ptr point to the address of head, orphaning the memory
// that was just malloc'ed to ptr.
ptr=head;

Resources