"default value" of allocated struct pointer in C

I am storing input data that includes an explicit order, so I chose to use an array to keep the items sorted:
struct Node** array = (struct Node**)malloc(sizeof(Node**) * DEFAULT_SIZE);
int i;
int size = DEFAULT_SIZE;
while(/* reading input */) {
// do something
int index = token; // token is part of an input line, which specifies the order
struct Node* node = (struct Node*)malloc(sizeof(struct Node));
*node = (struct Node){value, index};
// do something
if (index >= size) {
array = realloc(array, index + 1);
size = index + 1;
}
array[index] = node;
}
I am trying to loop through the array and do something when the node exists at the index
int i;
for (i = 0; i < size; i++) {
if (/* node at array[i] exists */) {
// do something
}
}
How can I check whether a node exists at a specific index of the array? (Or what is the "default value" of a struct node pointer after I allocate the array's memory?) I only know it is not NULL...
Should I use calloc and test if ((int)array[index] != 0)? Or is there a better data structure I could use?

When you realloc (or malloc) your list of pointers, the system resizes or moves the array, copying your existing data if needed, and reserves more space after it without initializing that space, so the new part contains whatever happened to be in memory before. You cannot rely on those values.
Only calloc does a zero init, but you cannot calloc when you realloc.
For starters you should probably use calloc:
struct Node** array = calloc(DEFAULT_SIZE,sizeof(*array));
In your loop, just use realloc and set the new memory to NULL so you can test for null pointers.
Note that your realloc size is incorrect: you have to multiply by the size of the element. Also update the size after reallocation, or this won't work more than once.
Note the tricky memset, which zeroes only the newly added slots without changing the valid pointer data. array+size computes the proper address thanks to pointer arithmetic, but the size parameter of memset is in bytes, so you have to multiply by sizeof(*array) (the size of the element):
if (index >= size)
{
array = realloc(array, (index + 1)*sizeof(*array)); // fixed size
memset(array+size,0,(index+1-size) * sizeof(*array)); // zero the rest of elements
size = index+1; // update size
}
Asides:
Calling realloc for each element is inefficient; you should realloc in chunks to avoid too many system calls and copies.
I have simplified the malloc calls: there is no need to cast the return value of malloc, and it is better to pass sizeof(*array) instead of sizeof(Node **). If the type of array changes you're covered (it also protects you from being off by one in the number of stars in the type).
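The advice above (grow in chunks, set new slots to NULL, test for NULL later) could be sketched roughly as follows. This is only an illustration, not the answerer's code: the struct NodeArray wrapper, the function names, the starting size of 16 and the layout of struct Node are all assumptions made for the example, and it assigns NULL in a loop instead of using memset.

```c
#include <stdlib.h>

struct Node { int value; int index; };   /* hypothetical layout */

struct NodeArray {
    struct Node **slots;   /* slots[i] is NULL when no node is stored at i */
    size_t size;           /* number of slots currently allocated */
};

/* Make sure slots[index] exists; every new slot is set to NULL.
   Returns 0 on success, -1 if realloc fails (the old array stays valid). */
int node_array_reserve(struct NodeArray *a, size_t index)
{
    if (index < a->size)
        return 0;

    /* Grow geometrically so realloc is not called for every element. */
    size_t new_size = a->size ? a->size : 16;
    while (new_size <= index)
        new_size *= 2;

    struct Node **p = realloc(a->slots, new_size * sizeof *p);
    if (p == NULL)
        return -1;

    /* Only the newly added slots are touched; old pointers are kept. */
    for (size_t i = a->size; i < new_size; i++)
        p[i] = NULL;

    a->slots = p;
    a->size = new_size;
    return 0;
}

/* Store node at index, growing the array first if needed. */
int node_array_set(struct NodeArray *a, size_t index, struct Node *node)
{
    if (node_array_reserve(a, index) != 0)
        return -1;
    a->slots[index] = node;
    return 0;
}
```

Starting from struct NodeArray a = {NULL, 0};, the later scan becomes a plain if (a.slots[i] != NULL) test for each i below a.size.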

The newly-allocated memory contains garbage and reading a pointer from uninitialized memory is a bug.
If you allocated using calloc( DEFAULT_SIZE, sizeof(Node*) ) instead, the contents of the array would be defined: all bits would be set to zero. On many implementations, this is a NULL pointer, although the standard does not guarantee it. Technically, there could be a standard-conforming compiler that makes the program crash if you attempt to read a pointer with all bits set to zero.
(Only language lawyers need to worry about that, though. In practice, even the fifty-year-old mainframes people bring up as examples of machines where NULL was not binary 0 had their C compilers updated to recognize 0 as a NULL pointer, because the alternative broke too much code.)
The safe, portable way to do what you want is to initialize every pointer in the array to NULL:
struct Node** const array = malloc(sizeof(*array) * DEFAULT_SIZE);
// Check for out-of-memory error if you really want to.
for ( ptrdiff_t i = 0; i < DEFAULT_SIZE; ++i )
array[i] = NULL;
After the loop executes, every pointer in the array is equal to NULL, and the ! operator returns 1 for it, until it is set to something else.
The realloc() call is erroneous. If you do want to do it that way, the size argument should be the new number of elements times the element size. That code will happily make it a quarter or an eighth the desired size. Even without that memory-corruption bug, you’ll find yourself doing reallocations far too often, which might require copying the entire array to a new location in memory.
The classic solution to that is to create a linked list of array pages, but if you’re going to realloc(), it would be better to multiply the array size by a constant each time.
Similarly, when you create each Node, you’d want to initialize its pointer fields, if you care about portability. No compiler this century will generate less-efficient code if you do.
If you only allocate nodes in sequential order, an alternative is to create an array of Node rather than Node*, and maintain a counter of how many nodes are in use. A modern desktop OS will only map in as many pages of physical memory for the array as your process writes to, so simply allocating and not initializing a large dynamic array does not waste real resources in most environments.
One other mistake that’s probably benign: the elements of your array have type struct Node*, but you allocate sizeof(Node**) rather than sizeof(Node*) bytes for each. However, the compiler does not type-check this, and I am unaware of any compiler where the sizes of these two kinds of object pointer could be different.
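The alternative mentioned above, an array of Node values plus a counter of how many are in use, might look like the sketch below. The NodePool name, the fields of struct Node and the doubling policy are assumptions for illustration; it only fits the case where nodes are added sequentially.

```c
#include <stdlib.h>

struct Node { int value; int index; };   /* hypothetical layout */

struct NodePool {
    struct Node *nodes;   /* nodes[0 .. count-1] are in use */
    size_t count;         /* number of nodes in use */
    size_t capacity;      /* number of nodes allocated */
};

/* Append a node by value. Returns a pointer to the stored copy,
   or NULL if the array could not be grown. */
struct Node *node_pool_push(struct NodePool *p, int value, int index)
{
    if (p->count == p->capacity) {
        /* Multiply the capacity by a constant instead of adding one slot. */
        size_t new_cap = p->capacity ? p->capacity * 2 : 16;
        struct Node *q = realloc(p->nodes, new_cap * sizeof *q);
        if (q == NULL)
            return NULL;
        p->nodes = q;
        p->capacity = new_cap;
    }
    p->nodes[p->count] = (struct Node){ value, index };
    return &p->nodes[p->count++];
}
```

With this layout no "does the node exist" test is needed: every element below count exists. Note that pointers returned earlier may become invalid when realloc moves the array.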

You might need something like this
unsigned long i;
for (i = 0; i < size; i++) {
if (array[i] != NULL && array[i]->someValidationMember==yourIntValue) {
// do something
}
}
Edit.
The allocated memory must be zeroed first (for example with calloc). If an item is deleted, simply set that Node member to zero or to any other sentinel value of your choice.

Related

How to import a large quantity of numerical data

I'm thinking what is the best technique for importing a large amount of data, whether integer or floating point type, from a file into an array to be processed later.
Considering that the number of data can vary (not all import files are of equal size), therefore in one file there can be 100 numbers, in another file 1 million numbers and they are in ASCII format, I thought that before sizing the array to hold the data i should know how much data will fill it.
I can't size the array upfront if I don't know how much data will go into that array. So I could read the data from the file and as they are read, use the realloc instruction to resize the array every time (in doing so, however, it seems to me to waste system resources since if the file consists of a million numbers, it is forced to resize the array 1 million times).
Or (but I think this would be fine if it were in binary format), understand the file size, know which separator there is between the numbers and then calculate, based on this, the size of the array.
Or again, if the file as I said is in ASCII format, first read the number of separators (for example, they can be spaces or commas), and based on this understand the quantity of elements and size the array accordingly.
I don't know which technique would be the best.

Here's an example of the realloc dynamic resizing approach [as Bodo mentioned] from some code I've had lying around. Note the ary_grow can be set to whatever you want.
// qwklib/ary.c -- quick dynamic array control
#include <string.h>
#include <stdlib.h>
typedef void (*aryinit_p)(void *);
typedef struct {
void *ary_base; // base address
int ary_siz; // size of elements
int ary_cnt; // current count
int ary_max; // maximum count
int ary_grow; // amount to grow
aryinit_p ary_init; // initialization
} ary_t;
typedef ary_t *ary_p;
// aryinit -- initialize the array
ary_p
aryinit(ary_p ary,int siz,int grow)
{
memset(ary,0,sizeof(ary_t));
ary->ary_siz = siz;
ary->ary_grow = grow;
return ary;
}
static inline void *
aryloc(ary_p ary,int idx)
{
void *ptr;
ptr = ary->ary_base;
ptr += (ary->ary_siz * idx);
return ptr;
}
// arypush -- add to dynamic array
void *
arypush(ary_p ary)
{
aryinit_p init;
int cnt;
void *ptr;
do {
// got enough space already
if (ary->ary_cnt < ary->ary_max)
break;
if (ary->ary_siz == 0)
ary->ary_siz = 1;
// get number of elements to grow by
if (ary->ary_grow == 0)
ary->ary_grow = 10;
// add to allocated space
ary->ary_max += ary->ary_grow;
ptr = realloc(ary->ary_base,ary->ary_max * ary->ary_siz);
ary->ary_base = ptr;
ptr += ary->ary_cnt * ary->ary_siz; // byte offset of the first new element
cnt = ary->ary_max - ary->ary_cnt;
memset(ptr,0,ary->ary_siz * cnt);
init = ary->ary_init;
if (init == NULL)
break;
for (; cnt > 0; --cnt, ptr += ary->ary_siz)
init(ptr);
} while (0);
// get pointer to first available slot
ptr = aryloc(ary,ary->ary_cnt);
// advance count for next time
ary->ary_cnt += 1;
return ptr;
}
// arytrim -- trim allocated array size to in-use size
void
arytrim(ary_p ary)
{
void *ptr;
ary->ary_max = ary->ary_cnt;
ptr = realloc(ary->ary_base,ary->ary_max * ary->ary_siz);
ary->ary_base = ptr;
}
// aryclean -- free up storage
void
aryclean(ary_p ary)
{
free(ary->ary_base);
}
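Note that, for completeness, you may wish to use size_t instead of int for some variables if your array indexes could overflow a 32-bit number, and add proper error checking for realloc. Also note that arithmetic on void * (as in ptr += ...) is a GNU C extension; standard C requires a char * for byte-wise pointer arithmetic.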

One thing you could do is not store the data in an array, but rather in a linked list storing one piece of data per list node. That way, you could add elements to the linked list at will, without ever having to resize anything. However, this has the following disadvantages:
Dynamic memory allocation is rather slow.
Linked lists aren't cached as well as arrays, which is bad for performance.
It is not very space efficient. For example, on a 64-bit system, pointers are normally 8 bytes long. So, if every node contains a 32-bit int as data, you will have 4 bytes of data per node and 8 bytes of overhead from the pointer (16 bytes if the linked list is doubly-linked). This means that more than half of the space is being wasted. In addition, the memory allocator itself likely has a few bytes of internal overhead for every memory allocation, so even more space is wasted.
For this reason, it would be more efficient to allocate an array of several kilobytes of memory at once using malloc and, if it later turns out that you need more memory, allocate another array of the same size (or maybe a larger size) using malloc. These individual arrays could be linked with each other using a linked list, so the number of new arrays you can allocate would only be limited by your available memory.
However, this efficient solution is also more complicated. Therefore, if the disadvantages mentioned above are acceptable to you, then a simple linked list storing one piece of data per list node would probably be the easiest and most flexible solution.
An alternative would be to allocate one single array and expand it as necessary using realloc in large steps of several kilobytes (instead of once for every new element). This would be significantly faster than calling realloc once for every new element. However, when compared to the linked list solution, it has the following two disadvantages:
If there is not enough room to expand the array, the entire array must be copied to a new location with more room. Even if this is handled internally by realloc (so you don't have to program it yourself), it can be bad for performance.
If the memory is too fragmented, the allocator may not be able to find any room anywhere for a large enough array to store all elements.
When deciding whether to use arrays or linked lists, it is also worth taking into consideration that certain operations are better suited for linked lists (such as insert operations), whereas other operations (such as random access) are better suited for arrays.
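As a rough sketch of the single-array alternative described above (one array, expanded with realloc in large steps), the function below reads numbers into a buffer whose capacity doubles whenever it fills up. The name read_numbers, the initial capacity of 1024 and the assumption that the numbers are separated by whitespace only are choices made for this example; comma-separated input would need a different format string.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read whitespace-separated numbers from fp into a malloc'ed array.
   On success, stores the element count in *count and returns the array
   (the caller frees it). Returns NULL if memory runs out. */
double *read_numbers(FILE *fp, size_t *count)
{
    size_t len = 0, cap = 1024;
    double *data = malloc(cap * sizeof *data);
    double x;

    if (data == NULL)
        return NULL;

    while (fscanf(fp, "%lf", &x) == 1) {
        if (len == cap) {
            /* Double the capacity: a million numbers need about ten
               reallocations instead of a million. */
            double *tmp = realloc(data, 2 * cap * sizeof *tmp);
            if (tmp == NULL) {
                free(data);
                return NULL;
            }
            data = tmp;
            cap *= 2;
        }
        data[len++] = x;
    }

    *count = len;
    return data;
}
```

If memory is tight, the buffer can be shrunk once at the end with realloc(data, len * sizeof *data), which avoids counting separators in a first pass over the file.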

Does a pointer in an array of pointer to struct, if set to NULL, allocate memory?

I'm new to C and this is my first question:
For this structure:
typedef struct Branch
{
Tree * thisTree;
struct Branch * nodes[];
} Branch;
it seems to work fine if I do the following:
Branch branch1;
branch1->nodes[0] = NULL;
even if I do not allocate memory for the pointer nr 0 in the array this way:
branch->node[0] = (Branch *) malloc(sizeof(Branch *));
if i check with this code:
if ( branch1->nodes[0] == NULL)
printf("is NULL");
it prints to the output: is NULL
So my question is:
has there been allocated memory for the pointer?
branch1->nodes[0]
because I have a lot of structures and if I initialise each branch with a fixed number of pointers I get a lot of allocated data (if I check with the sizeof function).
Is this way: setting to NULL (above) a wrong way of thinking ?
My problem is that the allocation of memory for a pointer is 4 bytes. So not having a declared number of pointers in the array, when does it allocate memory for it ?
Sorry
I tried to keep the question simple but I need to reach a string through the structure pointer in the next branch
this means that the struct I use is
typedef struct Branch
{
Tree * thisTree;
char *string;
struct Branch * nodes[];
} Branch;
So if I do not
branch->node[0] = (Branch *) malloc(sizeof(Branch *));
and then
branch->node[0]->String = strdup("text");
I get a compiler error.

No. The null pointer does not allocate any memory to store it on the heap because there is nothing to allocate (hence the "null").
Your nodes array itself does need memory to hold its pointers, null or not, but each slot takes only the size of one pointer, no more than a pointer to an integer, a float, a struct, you name it.

Essentially you are allocating memory for the pointer itself and placing the value NULL in it, which in C is normally 0. That value has to live in allocated memory, in this case the memory of the pointer, so yes, memory is still allocated for the pointer; you just set its value to zero.
You would, however, lose access to allocated memory if you had written branch->node = NULL; while it pointed to a malloc'ed block. That is not the right way to remove it, as the memory is still allocated but unreachable; instead you should use free(branch->node); to deallocate the memory, if you want to do that.
Concerning the number of allocated positions in memory, that would be defined by the times you multiply the sizeof() function:
int number = 2; //assume you want two elements
branch->node[0] = (Branch *) malloc(number*sizeof(Branch *));
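Because nodes[] is a flexible array member, it takes no space until the enclosing struct is allocated with extra room for it. A minimal sketch of that, assuming a hypothetical count field and a hypothetical helper named branch_new (neither is in the question), and treating Tree as an opaque type:

```c
#include <stdlib.h>

typedef struct Tree Tree;     /* opaque here; the real definition is the asker's */

typedef struct Branch {
    Tree *thisTree;
    char *string;
    size_t count;             /* hypothetical: number of slots in nodes[] */
    struct Branch *nodes[];   /* flexible array member: no storage by itself */
} Branch;

/* Allocate one Branch with room for n child pointers, all set to NULL.
   Returns NULL if malloc fails. */
Branch *branch_new(size_t n)
{
    /* Struct header plus n pointer slots in a single allocation. */
    Branch *b = malloc(sizeof *b + n * sizeof b->nodes[0]);
    if (b == NULL)
        return NULL;

    b->thisTree = NULL;
    b->string = NULL;
    b->count = n;
    for (size_t i = 0; i < n; i++)
        b->nodes[i] = NULL;   /* a NULL slot costs one pointer, nothing more */
    return b;
}
```

A child is then created with b->nodes[0] = branch_new(0);, which allocates a whole Branch (sizeof(Branch), not sizeof(Branch *)), so that b->nodes[0]->string can be assigned safely.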

C, Trouble getting size of array of structs

I'm trying to get the number of elements in an array of structs so I can pass it into another function. Struct:
struct info{
char string1[30];
float float1;
int int1;
char string2[30];
};
Section I'm trying to run:
void function1(){
struct info* temp = build();
printf("flag: %lu %lu %lu\n", sizeof(temp), sizeof(temp[0]), sizeof(temp)/sizeof(temp[0]));
sortFloat(temp, sizeof(temp)/sizeof(temp[0]), 1);
free(temp);
}
build() returns an array of structs after reading in data from a file where each line will be a struct in the array. I'm having trouble passing the size of the array into sortFloat(). The print line returns
flag: 8 72 0
when there are only two lines in the data file. Hard coding that argument as 2 makes the whole program work correctly. Why is this method of counting the elements of the array of structs not?

sizeof(temp)
will not evaluate to the number of elements in the array. It will evaluate to just the size of the pointer.
If you need the size of the array, you can do this:
Change the signature of build to:
struct info* build(int* sizePtr);
Make sure that sizePtr is appropriately set in the implementation.
Then, call it using:
int size;
struct info* temp = build(&size);

What you are thinking of as an array is more precisely just a pointer to the first element. How many elements follow the first element cannot be known to function1(). Only the function build() knows how many elements were read and how much memory was dynamically allocated to store those elements.
The only solution is to get the build() function to also pass back the number of elements read from the file. One way to do this is to send the address of an int variable to build(int *countp) and have build(int *countp) store the count into this int using either ++*countp; as it reads each element or *countp = n; where n is a different variable in which build(...) maintains the count.

Because temp is not an array. It is a pointer. So calling sizeof on it will return the size of the pointer.
It might be hard to understand, but in C, arrays (real arrays, not pointers) are best regarded as value types that are quite tricky to pass around by value. Arrays are not pointers and pointers are not arrays.
sizeof is an operator for determining the size of memory occupied by a value. That is why, when applied to an array, it returns the size of the array, and when applied to a pointer - just the size of the pointer.
What you're creating and returning from build() is most likely a dynamically-allocated (via malloc or friends) buffer of memory. In this case, it was never a real array to begin with! It is just a pointer to a chunk of memory on the heap. This chunk of memory is not one value, and its size cannot be determined using sizeof like that. So you've got no choice but to count the number of allocated structs and get that information to the caller.
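The answers above all come down to having build() report the element count through a pointer parameter. A sketch of that pattern follows; the body of build() is a stand-in that fabricates two records instead of reading the file, since the real file format is not shown in the question.

```c
#include <stdlib.h>
#include <string.h>

struct info {
    char string1[30];
    float float1;
    int int1;
    char string2[30];
};

/* Returns a malloc'ed array and stores its element count in *countp.
   Stand-in body: the real function would count the lines it reads. */
struct info *build(size_t *countp)
{
    size_t n = 2;   /* placeholder for "number of lines in the file" */
    struct info *arr = calloc(n, sizeof *arr);

    if (arr == NULL) {
        *countp = 0;
        return NULL;
    }
    strcpy(arr[0].string1, "first");
    arr[0].float1 = 2.5f;
    strcpy(arr[1].string1, "second");
    arr[1].float1 = 1.5f;

    *countp = n;    /* the caller cannot recover this from the pointer */
    return arr;
}
```

The caller then writes size_t count; struct info *temp = build(&count); and passes count to sortFloat() in place of sizeof(temp)/sizeof(temp[0]).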

Is there a way to initialize an array without defining the size

Is there a way to initialize an array without defining the size.
The size of the array should increase on its own: as the loop runs, it reallocates the array.

There is no such thing out of the box. You will have to create your own array-like data structure that does this. It shouldn't be very hard to implement, if you're careful.
What you're looking for is, roughly, a data structure that, when created, allocates (using malloc, for instance) a predefined size and starts using the consecutive space inside it as slots of an array. Then, as more items are added, it reallocates (say, using realloc) that space.
Of course, you won't be able to use the indexer syntax you're used to with simple arrays. Instead, your data structure will have to provide its own pair of set/get functions that take care of the above, under the hood. Therefore, the set function will check the index specified in its arguments and, if that index is greater than the current size of the array, perform a reallocation. Then, in any case, set the value provided to the specified index.
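The set/get pair described above could be sketched as below. The IntArray type, the function names, the initial size of 8 and the choice to zero-fill new slots and to read out-of-range indexes as 0 are all assumptions made for this example.

```c
#include <stdlib.h>

typedef struct {
    int *data;
    size_t size;   /* number of valid slots */
} IntArray;

/* Store value at index, growing (and zero-filling) the array as needed.
   Returns 0 on success, -1 if the reallocation fails. */
int int_array_set(IntArray *a, size_t index, int value)
{
    if (index >= a->size) {
        size_t new_size = a->size ? a->size : 8;
        while (new_size <= index)
            new_size *= 2;

        int *p = realloc(a->data, new_size * sizeof *p);
        if (p == NULL)
            return -1;

        for (size_t i = a->size; i < new_size; i++)
            p[i] = 0;          /* new slots start out as 0 */
        a->data = p;
        a->size = new_size;
    }
    a->data[index] = value;
    return 0;
}

/* Read the value at index; indexes past the end read as 0. */
int int_array_get(const IntArray *a, size_t index)
{
    return index < a->size ? a->data[index] : 0;
}
```

An array starts as IntArray a = {NULL, 0}; and is released with free(a.data);.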

You can initialize an array without specifying the size but it would not be useful unless you allocated space for it before you used it. Normally when you declare a variable in C, the compiler reserves a specific amount of memory for that variable on the "stack". If you want an array to be able to grow throughout the program, however, this is not what you are looking for because the amount of space allocated for a variable on the "stack" is static.
The solution, therefore, is to have the program decide how much memory to allocate to your variable at run-time, instead of compile-time. This way, while the program is running, you will be able to decide how much space your variable needs to have reserved.
In practice, this is called dynamic memory allocation and it is accomplished in C using the functions malloc() and realloc(). I would suggest reading up on these functions, I think they will be very useful to you.
If you have follow up questions feel free to ask.
One last thing!
Whenever you use malloc() to allocate memory for a variable, you should remember to call the function free() on that variable at the end of the program or whenever you are done using the variable.

Here's a simple implementation of such a datastructure (for ints, but you can replace the int with whatever type you need). I've omitted error-handling for clarity.
#include <stdlib.h>
typedef struct array_s {
int len, cap;
int *a;
} array_s, *array_t;
/* Create a new array with 0 length, and the given capacity. */
array_t array_new(int cap) {
array_t result = malloc(sizeof(array_s));
array_s a = {0, cap, malloc(sizeof(int) * cap)};
*result = a;
return result;
}
/* Destroy an array. */
void array_free(array_t a) {
free(a->a);
free(a);
}
/* Change the size of an array, truncating if necessary. */
void array_resize(array_t a, int new_cap) {
a->cap = new_cap;
a->a = realloc(a->a, new_cap * sizeof(int));
if (a->len > a->cap) {
a->len = a->cap;
}
}
/* Add a new element to the end of the array, resizing if necessary. */
void array_append(array_t a, int x) {
if (a->len == a->cap) {
// double the capacity, using at least 4 in case cap is 0.
array_resize(a, a->cap * 2 > 4 ? a->cap * 2 : 4);
}
a->a[a->len++] = x;
}
By storing len (the current length of the array), and cap (the amount of space you've reserved for the array), you can extend the array in O(1) up to the point when len is cap, then resize the array (eg: using realloc), perhaps by multiplying the existing cap by 2 or 1.5 or something. This is what most vector or list types do in languages that support resizable arrays. I've coded this in array_append as an example.

Setting the first two bytes of a block of memory as a pointer or NULL while still accessing the rest of the block

Suppose I have a block of memory as such:
void *block = malloc(sizeof(void *) + size);
How do I set a pointer to the beginning of the block while still being able to access the rest of the reserved space? For this reason, I do not want to simply assign 'block' to another pointer or NULL.

How do I set the first two bytes of the block as NULL or have it point somewhere?
This doesn't make any sense unless you're running on a 16-bit machine.
Based on the way that you're calling malloc(), you're planning to have the first N bytes be a pointer to something else (where N may be 2, 4, or 8 depending on whether you're running on a 16-, 32-, or 64-bit architecture). Is this what you really want to do?
If it is, then you can use a pointer-to-a-pointer approach (recognizing that you can't actually use a void* to change anything, but I don't want to confuse matters by introducing a real type):
void** ptr = block;
However, it would be far more elegant to define your block with a struct (this may contain syntax errors; I haven't run it through a compiler):
typedef struct {
void* ptr; /* replace void* with whatever your pointer type really is */
char data[1]; /* first byte of the variable-length data area */
} MY_STRUCT;
MY_STRUCT* block = malloc(sizeof(MY_STRUCT) + additional);
block->ptr = NULL; /* or point it at something */

memset(block, 0, 2);
memset can be found in string.h

Setting the first two bytes of the allocated memory block to 0 is easy. There are many ways to do it, for example:
((char*)block)[0] = 0;
((char*)block)[1] = 0;
Now, the way the question is asked shows some misunderstanding.
You can put anything in the first two bytes of your allocated block; it doesn't change anything for accessing the following bytes. The only difference is that the C string functions use the convention that strings end with a 0 byte. So if you do something like strcpy(target, (char*)block), it will stop copying immediately if the first byte is zero. But you can still do strcpy(target, (char*)block+2).
Now, suppose you want to store a pointer at the beginning of the block (and usually a pointer is not 2 bytes).
You can do the same thing as above but using void* instead of char.
((void**)block)[0] = your_pointer;
You access the rest of the block as you like; just get its address and go on. You could do it, for example, with:
void * pointer_to_rest = &((void**)block)[1];
PS: I do not recommend such pointer games. They are very error-prone. Your best move would probably be to follow the struct method proposed by @Anon.

void *block = malloc(sizeof(void *) + size); // allocate block
void *ptr = NULL; // some pointer
memcpy(block, &ptr, sizeof(void *)); // copy pointer to start of block

I have a guess at what you're trying to ask, but your wording is so confusing that I could be totally wrong. I am assuming that you want a pointer that points to the "first 2 bytes" of the block you allocated, and then another pointer that points to the rest of the block.
Pointers carry no information about the size of the memory block that they point to, so you can do this:
void *block = malloc(sizeof(void *) + size);
void *first_two_bytes = block;
void *rest_of_block = ((char*)block)+2;
Now, first_two_bytes points to the beginning of the block that you allocated, and you should just treat it as if it pointed to a memory area 2 bytes long.
And rest_of_block points to the portion of the block starting at the third byte (offset 2), and you should treat it as if it pointed to a memory area 2 bytes smaller than what you allocated.
Note, however, that this is still only a single allocation, and you should only free the block pointer. If you free all three pointers, you will corrupt the heap, since you will be calling free more than once on the same block.

While implementing a map interface using a hash table I faced a similar issue, where each key-value pair (both of which are not statically sized, omitting the option of defining a compile-time struct) had to be stored in block of heap memory that also included a pointer to the next element in a linked list (should the blocks be chained in the event that more than one is hashed to the same index in the hash table array). Leaving space for the pointer at the beginning of the block, I found that the solution mentioned by kriss:
((void**)block)[0] = your_pointer;
where you cast the pointer to the block as an array, and then use the bracket syntax to handle pointer arithmetic and dereferencing, was the cleanest solution for copying a new value into this pointer "field" of the block.
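For comparison, here is a small sketch of the same layout (one pointer at the start of the block, payload after it) that uses memcpy for the pointer field, as in the earlier memcpy answer, and char * arithmetic for the payload. The helper names block_new, block_next and block_data are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a block holding one pointer followed by size payload bytes.
   The pointer field is initialized to next. Returns NULL on failure. */
void *block_new(size_t size, void *next)
{
    void *block = malloc(sizeof(void *) + size);
    if (block != NULL)
        memcpy(block, &next, sizeof next);   /* write the header pointer */
    return block;
}

/* Read the pointer stored at the start of the block. */
void *block_next(const void *block)
{
    void *next;
    memcpy(&next, block, sizeof next);
    return next;
}

/* Address of the payload that follows the header pointer. */
void *block_data(void *block)
{
    return (char *)block + sizeof(void *);
}
```

Only the pointer returned by block_new is ever passed to free. The payload starts sizeof(void *) bytes into the block, which may not be aligned strictly enough for every type, so this sketch is best suited to byte-oriented data such as strings.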