How to store custom objects (struct) in C?

I want to know how to store custom objects (not their pointers) in C. I have created a custom structure called Node
#define MAXQ 100
typedef struct {
int state[MAXQ];
int height;
} Node;
(which works) and I want to store a few of these Nodes in a container (without using pointers, since they are not stored elsewhere) so I can access them later.
The internet seems to suggest something like calloc() so my last attempt was to make a container Neighbors following this example, with numNeighbors being just an integer:
Node Neighbors = (Node*)calloc(numNeighbors, sizeof(Node));
At compilation, I got an error from this line saying
initializing 'Node' with an expression of incompatible type 'void *'
and in places where I referred to this container (as in Neighbors[i]) I got the error
subscripted value is not an array, pointer, or vector
Since I'm spoiled by Python, I have no idea if I've got my syntax all wrong (it should tell you something that I'm still not there after scouring a ton of tutorials, docs, and stackoverflows on malloc(), calloc() and the like), or if I am on a completely wrong approach to storing custom objects (searching "store custom objects in C" on the internet gives irrelevant results dealing with iOS and C# so I would really appreciate some help).
EDIT: Thanks for the tips everyone, it finally compiled without errors!

You can create a regular array using your custom struct:
Node Neighbors[10];
You can then reference them like any other array, for example:
Neighbors[3].height = 10;

If your C implementation supports C99-style variable length arrays (VLAs), simply define your array:
Node Neighbors[numNeighbors];
(Note that a VLA has no error-reporting mechanism. A failed allocation results in undefined behavior, which will probably show up as a crash.)
Otherwise, you will need dynamic allocation. calloc is suitable, but it returns a pointer representing the contiguous allocation.
Node *Neighbors = calloc(numNeighbors, sizeof(*Neighbors));
Note, do not cast the result of malloc/calloc/realloc when programming in C. It is not required, and in the worst case, can mask a fatal error.
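A minimal sketch of the dynamic approach, assuming the Node type from the question. The helper name make_neighbors is hypothetical and not part of any library. It checks the result of calloc, which returns NULL on failure, and leaves the free() to the caller.

```c
#include <stdlib.h>

#define MAXQ 100

typedef struct {
    int state[MAXQ];
    int height;
} Node;

/* Hypothetical helper: allocates numNeighbors zero-initialized Nodes.
   Returns NULL if the allocation fails; the caller must free() the result. */
Node *make_neighbors(size_t numNeighbors)
{
    Node *neighbors = calloc(numNeighbors, sizeof *neighbors);
    if (neighbors == NULL) {
        return NULL;
    }
    /* The Nodes themselves live in the block, so index it like an array. */
    for (size_t i = 0; i < numNeighbors; i++) {
        neighbors[i].height = (int)i;
    }
    return neighbors;
}
```

A caller would write something like Node *Neighbors = make_neighbors(numNeighbors); use Neighbors[i] as usual, and call free(Neighbors) when done.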

I want to store a few of these Nodes in a container (without using pointers, since they are not stored elsewhere) so I can access them later.
If you know how many there are at compile time (or at the very least a reasonable maximum), then you can create an array of stack-allocated objects. For instance, say you are OK with a maximum of 10 objects:
#define MAX_NODES 10
Node nodes[MAX_NODES];
int number_nodes = 0;
Then, when you add an object, you keep number_nodes in sync (so that you know where to put the next one). Technically, you will always have 10, but you only use the ones you want/need. Removing objects is similar, although more involved if you want to take out some in the middle.
However, if you don't know how many you will have (nor a maximum), or if you know but there are too many to fit on the stack, then you are forced to use the heap (typically with malloc() and free()):
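The bookkeeping could be sketched roughly as follows. The helper names add_node and remove_node are made up for illustration, and the array is placed at file scope instead of on the stack to keep the example short. Removal from the middle is done by moving the last node into the hole, which does not preserve order.

```c
#include <stdbool.h>

#define MAXQ 100
#define MAX_NODES 10

typedef struct {
    int state[MAXQ];
    int height;
} Node;

static Node nodes[MAX_NODES];
static int number_nodes = 0;

/* Hypothetical helper: copies *n into the next free slot; false if full. */
bool add_node(const Node *n)
{
    if (number_nodes == MAX_NODES) {
        return false;
    }
    nodes[number_nodes++] = *n;   /* struct assignment copies the whole Node */
    return true;
}

/* Hypothetical helper: removes the node at index by moving the last node
   into its place. Order is not preserved. */
bool remove_node(int index)
{
    if (index < 0 || index >= number_nodes) {
        return false;
    }
    nodes[index] = nodes[--number_nodes];
    return true;
}
```

If order matters, the removal would instead shift every later element down by one, which is the "more involved" case.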
int number_nodes; // unknown until runtime or too big
Node * nodes = malloc(sizeof(Node) * number_nodes);
...
free(nodes);
In any case, you will be using pointers in the dynamically allocated memory case, and most probably in the stack case as well.
Python is hiding and doing all this dance for you behind the scenes -- which is quite useful and time saving as you have probably already realized, as long as you do not need precise control over it (read: performance).

malloc and calloc are for dynamic allocation, and they need pointer variables. I don't see any reason for you to use dynamic allocation. Just define a regular array until you have a reason not to.
#define MAXQ 100
#define NUM_NEIGHBORS 50
typedef struct {
int state[MAXQ];
int height;
} Node;
int main(void)
{
Node Neighbors[NUM_NEIGHBORS];
Neighbors[0].state[0] = 0;
Neighbors[0].height = 1;
}
Here NUM_NEIGHBORS needs to be a constant, so the array size is fixed at compile time. If you want the size to vary at run time, then you need dynamic allocation, and inevitably pointers:
#include <stdlib.h>
#define MAXQ 100
typedef struct {
int state[MAXQ];
int height;
} Node;
int main(void)
{
int numNeighbors = 50;
Node *Neighbors;
Neighbors = calloc(numNeighbors, sizeof(Node));
if (Neighbors == NULL)
return 1; /* allocation failed */
Neighbors[0].state[0] = 0;
Neighbors[0].height = 1;
free(Neighbors);
return 0;
}

Related

Is it ok to create a large array in the heap when you aren't necessarily using all of it?

So I'm looking at a solution to some coding interview type questions, and there's an array inside a struct
#define MAX_SIZE 1000000
typedef struct _heap {
int data[MAX_SIZE];
int heap_size;
}heap;
heap* init(heap* h) {
h = (heap*)malloc(sizeof(heap));
h->heap_size = 0;
return h;
}
This heap struct is later created like so
heap* max_heap = NULL;
max_heap = init(max_heap);
First of all, I wish this were written in C++ style rather than C, but secondly, if I'm just concerned about the array, I assume it is equivalent to analyze only the array portion by changing the code like this
int* data = NULL;
data = (int*)malloc(1000000 * sizeof(int));
Now in that case, are there any problems with declaring the array with the max size if you are probably just using a little bit of it?
I guess this boils down to the question of when an array is created in the heap, how does the system block out that portion of the memory? In which case does the system prevent you from accessing memory that is part of the array? I wouldn't want a giant array holding up space if I'm not using much of it.
Are there any problems with declaring the array with the max size if you are probably just using a little bit of it?
Yes. The larger the allocation size, the greater the risk of an out-of-memory error. If not here, then elsewhere in the code.
Yet some memory allocation systems handle this well, because the real memory allocation does not occur immediately but later, when the memory is actually needed.
I guess this boils down to the question of when an array is created in the heap, how does the system block out that portion of the memory?
That depends on the implementation and is not defined by C. The physical allocation might happen immediately or be deferred.
For maximum portability, code would take a more conservative approach and allocate large memory chunks only as needed, rather than rely on physical allocation occurring in a delayed fashion.
Alternative
In C, consider a struct with a flexible array member.
typedef struct _heap {
size_t heap_size;
int data[];
} heap;
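A struct like this is allocated with extra room for the array, so the size can be chosen at run time. A rough sketch, assuming C99 or later: the capacity field and the helper name heap_create are additions for illustration, and the tag is spelled heap here only to avoid a leading underscore.

```c
#include <stdlib.h>

typedef struct heap {
    size_t heap_size;   /* number of elements currently in use */
    size_t capacity;    /* number of elements allocated (added for this sketch) */
    int data[];         /* flexible array member: must be the last member */
} heap;

/* Hypothetical helper: allocates a heap with room for capacity ints.
   Returns NULL on failure; release the result with free(). */
heap *heap_create(size_t capacity)
{
    heap *h = malloc(sizeof *h + capacity * sizeof h->data[0]);
    if (h == NULL) {
        return NULL;
    }
    h->heap_size = 0;
    h->capacity = capacity;
    return h;
}
```

With this, heap_create(1000) asks for space for a thousand ints instead of always reserving MAX_SIZE of them.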

Array of pointers confusion with skip list

I think I have a basic understanding of how skip lists work, but being new to them in addition to being a C-beginner has me confused on a few points, especially the initialization of the list. Here's the code I'm trying to follow:
#define MAXSKIPLEVEL 5
typedef struct Node {
int data;
struct Node *next[1];
} Node;
typedef struct SkipList {
Node *header;
int level;
} SkipList;
// Initialize skip list
SkipList* initList() {
SkipList *list = calloc(1, sizeof(SkipList));
if ((list->header = calloc(1, sizeof(Node) + MAXSKIPLEVEL*sizeof(Node*))) == 0) {
printf("Memory Error\n");
exit(1);
}
for (int i = 0; i < MAXSKIPLEVEL; i++)
list->header->next[i] = list->header;
return list;
}
I haven't done anything with arrays of pointers yet in C, so I think I'm getting a bit caught up with how they work. I have a few questions if someone would be kind enough to help me out.
First, I did sizeof(int) and got 4, sizeof(Node*) and got 8, so I expected sizeof(Node) to equal 12, but it ended up being 16, why is this? Same confusion with the size of SkipList compared to the sizes of its contents. I took the typedef and [1] out to see if either of them was the cause, but the size was still 16.
Second, why is the [1] in struct Node *next[1]? Is it needed for the list->header->next[i] later on? Is it okay that next[i] will go higher than 1? Is it just because the number of pointers for each node is variable, so you make it an array then increase it individually later on?
Third, why does list->header->next[i] = list->header initially instead of NULL?
Any advice/comments are greatly appreciated, thanks.
For your first question - why isn't the size of the struct the size of its members? - this is due to struct padding, where the compiler, usually for alignment reasons, may add in extra blank space between or after members of a struct in order to get the size up to a nice multiple of some fundamental size (often 8 or 16). There's no portable way to force the size of the struct to be exactly the size of its members, though most compilers have some custom switches you can flip to do this.
For your second question - why the [1]? - the idea here is that when you actually allocate one of the node structs, you'll overallocate the space so that the memory at the end of the struct can be used for the pointers. By creating an array of length one and then overallocating the space, you make it syntactically convenient to access this overallocated space as though it were a part of the struct all along. Newer versions of C have a concept called flexible array members that have supplanted this technique; I'd recommend Googling it and seeing if that helps out.
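For reference, the flexible array member version of the node might look roughly like this. It is a sketch, not the code from the question: the level field and the helper name node_create are assumptions added for illustration.

```c
#include <stdlib.h>

typedef struct Node {
    int data;
    int level;            /* number of entries in next[] (added for this sketch) */
    struct Node *next[];  /* flexible array member instead of next[1] */
} Node;

/* Hypothetical helper: allocates a node with `level` forward pointers,
   all set to NULL. Returns NULL on failure or if level < 1. */
Node *node_create(int data, int level)
{
    if (level < 1) {
        return NULL;
    }
    Node *n = malloc(sizeof *n + (size_t)level * sizeof n->next[0]);
    if (n == NULL) {
        return NULL;
    }
    n->data = data;
    n->level = level;
    for (int i = 0; i < level; i++) {
        n->next[i] = NULL;
    }
    return n;
}
```

Because sizeof *n does not count the flexible array, the allocation adds exactly level pointers, not level plus the one that next[1] would already include.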
For your final question - why does list->header->next[i] initially point to list->header rather than NULL? - without seeing more of the code it's hard to say. Many implementations of linked list structures use some sort of trick like this to avoid having to special-case on NULL in the implementation, and it's entirely possible that this sort of trick is getting used here as well.
The sizeof number is 16 because of structure padding.
Many architectures either require or strongly prefer their pointers to be aligned on a certain boundary (e.g., 4-byte boundary, 8-byte boundary, etc.) They will either fail, or they will perform slowly, if pointers are "misaligned". Your C compiler is probably inserting 4 unused bytes in the middle of your structure so that your 8-byte pointer is aligned on an 8-byte boundary, which causes the structure size to increase by 4 bytes.
There is more explanation available from the C FAQ.
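If you want to see where the padding sits on your own platform, offsetof from <stddef.h> can report it. A small sketch using the struct from the question; the two function names are made up for illustration. On a platform with a 4-byte int and 8-byte pointers one would expect 4 bytes before next and none after, but other platforms may differ.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next[1];
};

/* Bytes of padding the compiler inserted between data and next. */
size_t padding_before_next(void)
{
    return offsetof(struct Node, next) - sizeof(int);
}

/* Bytes of padding after the last member. */
size_t padding_after_next(void)
{
    return sizeof(struct Node) - offsetof(struct Node, next)
           - sizeof(struct Node *);
}
```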

How to include a variable-sized array as struct member in C?

I must say, I have quite a conundrum in a seemingly elementary problem. I have a structure, in which I would like to store an array as a field. I'd like to reuse this structure in different contexts, and sometimes I need a bigger array, sometimes a smaller one. C prohibits the use of variable-sized buffer. So the natural approach would be declaring a pointer to this array as struct member:
struct my {
struct other* array;
};
The problem with this approach however, is that I have to obey the rules of MISRA-C, which prohibits dynamic memory allocation. So then if I'd like to allocate memory and initialize the array, I'm forced to do:
var.array = malloc(n * sizeof(...));
which is forbidden by MISRA standards. How else can I do this?
Since you are following MISRA-C, I would guess that the software is somehow mission-critical, in which case all memory allocation must be deterministic. Heap allocation is banned by every safety standard out there, not just by MISRA-C but by the more general safety standards as well (IEC 61508, ISO 26262, DO-178 and so on).
In such systems, you must always design for the worst-case scenario, which will consume the most memory. You need to allocate exactly that much space, no more, no less. Everything else does not make sense in such a system.
Given those pre-requisites, you must allocate a static buffer of size LARGE_ENOUGH_FOR_WORST_CASE. Once you have realized this, you simply need to find a way to keep track of what kind of data you have stored in this buffer, by using an enum and maybe a "size used" counter.
Please note that not just malloc/calloc, but also VLAs and flexible array members are banned by MISRA-C:2012. And if you are using C90/MISRA-C:2004, there are no VLAs, nor is there any well-defined use of flexible array members: they invoked undefined behavior until C99.
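A worst-case static buffer with an enum and a "size used" counter could be sketched like this. Everything named here is illustrative: WORST_CASE_ELEMENTS, the enum tags, the placeholder contents of struct other, and the helper my_push are assumptions, and the sketch has not been checked against any particular MISRA rule.

```c
#include <stddef.h>
#include <stdbool.h>

#define WORST_CASE_ELEMENTS 64   /* assumed worst case, for illustration */

struct other {
    int value;   /* placeholder member; the real struct other is not shown */
};

enum buffer_kind { KIND_NONE, KIND_SMALL, KIND_LARGE };  /* hypothetical tags */

struct my {
    enum buffer_kind kind;                    /* what the buffer currently holds */
    size_t used;                              /* elements in use */
    struct other array[WORST_CASE_ELEMENTS];  /* always sized for the worst case */
};

/* Hypothetical helper: appends one element; false if the buffer is full. */
bool my_push(struct my *m, struct other item)
{
    if (m->used >= WORST_CASE_ELEMENTS) {
        return false;
    }
    m->array[m->used++] = item;
    return true;
}
```

Every struct my then costs the same amount of memory whether it holds one element or sixty-four, which is the point: the memory use is known at build time.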
Edit: This solution does not conform to MISRA-C rules.
You can kind of include VLAs in a struct definition, but only when it's inside a function. A way to get around this is to use a "flexible array member" at the end of your main struct, like so:
#include <stdio.h>
struct my {
int len;
int array[];
};
You can create functions that operate on this struct.
void print_my(struct my *my) {
int i;
for (i = 0; i < my->len; i++) {
printf("%d\n", my->array[i]);
}
}
Then, to create variable length versions of this struct, you can create a new type of struct in your function body, containing your my struct, but also defining a length for that buffer. This can be done with a varying size parameter. Then, for all the functions you call, you can just pass around a pointer to the contained struct my value, and they will work correctly.
void create_and_use_my(int nelements) {
int i;
// Declare the containing struct with variable number of elements.
struct {
struct my my;
int array[nelements];
} my_wrapper;
// Initialize the values in the struct.
my_wrapper.my.len = nelements;
for (i = 0; i < nelements; i++) {
my_wrapper.my.array[i] = i;
}
// Print the struct using the generic function above.
print_my(&my_wrapper.my);
}
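You can call this function with any value of nelements and it will work fine with GCC. Note that this goes beyond C99: ISO C does not allow a struct member to have a variable length array type, nor a struct with a flexible array member to be a member of another struct, so the wrapper relies on GCC extensions.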
Important: If you pass the struct my to another function, and not a pointer to it, I can pretty much guarantee you it will cause all sorts of errors, since it won't copy the variable length array with it.
Here's a thought that may be totally inappropriate for your situation, but given your constraints I'm not sure how else to deal with it.
Create a large static array and use this as your "heap":
static struct other heap[SOME_BIG_NUMBER];
You'll then "allocate" memory from this "heap" like so:
var.array = &heap[start_point];
You'll have to do some bookkeeping to keep track of what parts of your "heap" have been allocated. This assumes that you don't have any major constraints on the size of your executable.
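The simplest possible bookkeeping is a single index that only moves forward. A sketch under that assumption: SOME_BIG_NUMBER is given an arbitrary value, struct other has a placeholder member, and pool_alloc is a hypothetical helper. Nothing is ever handed back to the pool in this version.

```c
#include <stddef.h>

#define SOME_BIG_NUMBER 1024   /* assumed pool size, for illustration */

struct other {
    int value;   /* placeholder member */
};

static struct other heap[SOME_BIG_NUMBER];
static size_t next_free = 0;   /* index of the first unused element */

/* Hypothetical helper: hands out n consecutive elements, or NULL if the
   pool does not have n elements left. */
struct other *pool_alloc(size_t n)
{
    if (n > SOME_BIG_NUMBER - next_free) {
        return NULL;
    }
    struct other *block = &heap[next_free];
    next_free += n;
    return block;
}
```

Usage would then be var.array = pool_alloc(n); followed by a NULL check. Supporting release of individual blocks needs more elaborate bookkeeping, such as a free list.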

Accessing array as a struct *

This is one of those "I think this should work, but it's best to check" questions. It compiles and works fine on my machine.
Is this guaranteed to do what I expect (i.e. allow me to access the first few elements of the array with a guarantee that the layout, alignment, padding etc of the struct is the same as the array)?
struct thingStruct
{
int a;
int b;
int c;
};
void f()
{
int thingsArray[5];
struct thingStruct *thingsStruct = (struct thingStruct *)&thingsArray[0];
thingsArray[0] = 100;
thingsArray[1] = 200;
thingsArray[2] = 300;
printf("%d", thingsStruct->a);
printf("%d", thingsStruct->b);
printf("%d", thingsStruct->c);
}
EDIT: Why on earth would I want to do something like this? I have an array which I'm mmapping to a file. I'm treating the first part of the array as a 'header', which stores various pieces of information about the array, and the rest of it I'm treating as a normal array. If I point the struct to the start of the array I can access the pieces of header data as struct members, which is more readable. All the members in the struct would be of the same type as the array.
While I have seen this done frequently, you cannot (meaning it is not legal, standard C) make assumptions about the binary layout of a structure, as it may have padding between fields.
This is explained in the comp.lang.c faq: http://c-faq.com/struct/padding.html
Although it's likely to work in most places, it's still a bit iffy. If you want to give symbolic names to parts of the header, why not just do:
enum { HEADER_A, HEADER_B, HEADER_C };
/* ... */
printf("%d", thingsArray[HEADER_A]);
printf("%d", thingsArray[HEADER_B]);
printf("%d", thingsArray[HEADER_C]);
As Evan commented on the question, this will probably work in most cases (again, probably best if you use #pragma pack to ensure there is no padding), assuming all the members of your struct have the same type as your array elements. Given the rules of C, this is legal.
My question to you is "why?" This isn't a particularly safe thing to do. If a float gets thrown into the middle of the struct, this all falls apart. Why not just use the struct directly? This really isn't a technique that I'd recommend in most cases.
Another solution for representing a header and the rest of file data is using a structure like this:
struct header {
long headerData1;
int headerData2;
int headerData3;
int fileData[ 1 ]; // <- data begin here
};
Then you allocate the memory block with a file contents and cast it as struct header *myFileHeader (or map the memory block on a file) and access all your file data with
myFileHeader->fileData[ position ]
for an arbitrarily large position. The language imposes no restriction on the index value, so it is solely your responsibility to keep position within the actual size of the memory block you allocated (or the mapped file's size).
One more important note: apart from switching off struct member padding, which has already been described by others, you should carefully choose data types for the header members so that they match the actual file data layout regardless of the compiler you use (so that, say, a member does not change from 32 to 64 bits).
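One way to pin the member sizes down is to use the fixed-width types from <stdint.h>. The sketch below assumes a file format made of 32-bit values throughout, which is a guess for illustration and not something the question states. It also uses a flexible array member in place of fileData[ 1 ], a C11 _Static_assert to catch unexpected padding, and a hypothetical helper header_data_count.

```c
#include <stdint.h>
#include <stddef.h>

struct header {
    int32_t headerData1;
    int32_t headerData2;
    int32_t headerData3;
    int32_t fileData[];   /* flexible array member: data begin here */
};

/* Fails to compile if the compiler inserted padding between the members. */
_Static_assert(sizeof(struct header) == 3 * sizeof(int32_t),
               "unexpected padding in struct header");

/* Hypothetical helper: number of data elements that fit in a mapped or
   allocated block of block_size bytes. */
size_t header_data_count(size_t block_size)
{
    if (block_size < offsetof(struct header, fileData)) {
        return 0;
    }
    return (block_size - offsetof(struct header, fileData)) / sizeof(int32_t);
}
```

The count gives an upper bound to check position against before indexing fileData. Byte order is a separate concern that this sketch does not address.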

how to create a linked list of tables, a table represents a group of elements

what I am trying to do is represented in my other question with code.
Basically I need to keep in memory a table of elements (structs), there's no fixed number of elements that can exist, but it is small, but I still can't use an array.
And I don't want to use a linked list of elements because I don't want to keep adding and deleting elements every time I need to change anything.
Instead what I want to do is allocate a chunk of memory with a single malloc, that chunk of memory will be large enough for say, 100 elements, and if in the rare case that I need more, I can allocate another chunk of 100 elements and link it to the original....
Is this a good idea? Is there a name for this kind of structure? It's kind of like a dynamically expanding array. Do people actually use this, or am I just on crack? If this is a bad idea, what do you recommend using instead?
Thanks
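#include <stdlib.h>
typedef struct Pt {
int x;
int y;
} POINT;
typedef struct Tb {
POINT points[100]; /* one chunk of 100 elements */
struct Tb *next; /* next chunk, or NULL */
} TABLE;
int main(void)
{
int a = 10;
int b = 1000;
POINT *mypoints = malloc(100 * sizeof(POINT));
if (mypoints == NULL)
return 1;
/* Index the block instead of advancing mypoints, so it can still be freed. */
for (int i = 0; i < 100; i++) {
mypoints[i].x = a++;
mypoints[i].y = b++;
}
free(mypoints);
return 0;
}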
Such allocation schemes have been used everywhere from the early Unix file system to Python's internal list allocation.
Code on!
This is a common data structure, that I've seen in some places named as "linked list of tables".
I'm assuming you are looking for a C++ answer since your code is in C++. The C++ standard does not impose a specific data structure for its containers. However, the specification effectively forces implementers to use a specific data structure, since it is the most appropriate one to fulfill the requirements.
In C++, this is the case for the std::deque, which typically uses the data structure you describe above. To quote the documentation on the subject : "As opposed to std::vector, the elements of a deque are not stored contiguously: typical implementations use a sequence of individually allocated fixed-size arrays". See : https://en.cppreference.com/w/cpp/container/deque
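In plain C, the same idea can be sketched by hand. This is only an illustration of a linked list of fixed-size tables, not how any particular std::deque is implemented. CHUNK_CAPACITY, the used field, and the helpers table_append and table_free are names invented for the example.

```c
#include <stdlib.h>

#define CHUNK_CAPACITY 100

typedef struct Pt {
    int x;
    int y;
} POINT;

typedef struct Tb {
    POINT points[CHUNK_CAPACITY];  /* one table of elements */
    size_t used;                   /* slots filled in this table */
    struct Tb *next;               /* next table, or NULL */
} TABLE;

/* Hypothetical helper: appends p, allocating a new table when the last
   one is full. Returns 0 on success, -1 if malloc fails. */
int table_append(TABLE **head, TABLE **tail, POINT p)
{
    if (*tail == NULL || (*tail)->used == CHUNK_CAPACITY) {
        TABLE *t = malloc(sizeof *t);
        if (t == NULL) {
            return -1;
        }
        t->used = 0;
        t->next = NULL;
        if (*tail != NULL) {
            (*tail)->next = t;
        } else {
            *head = t;
        }
        *tail = t;
    }
    (*tail)->points[(*tail)->used++] = p;
    return 0;
}

/* Hypothetical helper: frees every table in the list. */
void table_free(TABLE *head)
{
    while (head != NULL) {
        TABLE *next = head->next;
        free(head);
        head = next;
    }
}
```

Start with both head and tail set to NULL. One malloc then covers 100 elements, and existing elements never move when the structure grows, unlike with realloc.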
