Accessing the -1 element of an array in C

I have an array of structs, which is dynamically allocated. A pointer to this array is passed around to other functions.
struct body{
char* name;
double mass;
// ... some more stuff
};
struct body *bodies = malloc(Number_of_bodies * sizeof(struct body));
I need to know the size of the array, so I'm storing the size in one of the structs, which is in the 0th element of the array (the first struct).
bodies[0].mass = (double)Number_of_bodies;
I then return from the function a pointer to element 1 of the array, i.e. &bodies[1]:
return (bodies+1);
Now, when I use this pointer in other functions, the data should start at the 0th element.
struct body *new_bodies = (bodies+1); // Just trying to show what effectively happens when I pass it to another function
new_bodies[0] = *(bodies+1); //I Think
If I want to see the initial struct, which was at bodies[0], does that mean in other functions I have to access new_bodies[-1] ?
Is this something I can do?
How can I access the initial struct?

Yes, you can use new_bodies[-1] to access the initial element of the array. This is perfectly legal, because new_bodies-1 still points inside the same allocated array.
The reason behind this is pointer arithmetic: the subscript operator is defined in terms of +, so when you write new_bodies[-1] it is the same as *(new_bodies-1).
Since new_bodies has been obtained as bodies+1, new_bodies-1 is (bodies+1)-1, or bodies, making new_bodies[-1] identical to bodies[0].
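The equivalence above can be sketched as a small set of helpers. This is only an illustration under assumptions: the names bodies_create, bodies_count and bodies_destroy are hypothetical, and here n counts only the usable elements, so one extra header element is allocated in front of them.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct body {
    char  *name;
    double mass;
} body;

/* Allocate n usable bodies plus one hidden header element in front.
   Returns a pointer to the first usable element, or NULL on failure. */
body *bodies_create(size_t n)
{
    body *block = malloc((n + 1) * sizeof *block);
    if (block == NULL)
        return NULL;
    block[0].name = NULL;
    block[0].mass = (double)n;   /* header: count stored in mass, as in the question */
    return block + 1;            /* the caller sees element 1 as index 0 */
}

/* p must come from bodies_create(); p[-1] is then the header element. */
size_t bodies_count(const body *p)
{
    assert(p != NULL);
    return (size_t)p[-1].mass;   /* same as (*(p - 1)).mass */
}

/* free() needs the address malloc() returned, which is p - 1. */
void bodies_destroy(body *p)
{
    if (p != NULL)
        free(p - 1);
}
```

Note that the shifted pointer must never be handed to free() directly; that is one more reason the approach is fragile.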
Note: It looks like you are trying to shoehorn the number of elements into the initial element of the array of your structs, re-purposing the mass field for it. This will work, but it is suboptimal, both in terms of memory allocation (the name pointer of that element remains unused) and, more importantly, in terms of readability. You would be a lot better off using a flexible array member in a struct that stores the number of entries explicitly:
struct body {
    char *name;
    double mass;
    // ... some more stuff
};
struct bodies {
    size_t count;
    struct body bodies[]; // <<== Flexible array member
};
...
struct bodies *bb = malloc(sizeof(struct bodies) + Number_of_bodies * sizeof(struct body));
bb->count = Number_of_bodies;
Here is a link to another Q&A with an example of working with flexible array members.
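As a rough sketch of how the flexible array member version might be used, the helpers below allocate the list in one block and let other functions read the count from the struct itself. The names body_list, body_list_create and body_list_total_mass are made up for this example and are not part of the original answer.

```c
#include <assert.h>
#include <stdlib.h>

struct body {
    char  *name;
    double mass;
};

struct body_list {
    size_t      count;
    struct body items[];   /* flexible array member, must be last */
};

/* One allocation holds the header and all n elements. */
struct body_list *body_list_create(size_t n)
{
    struct body_list *bl = malloc(sizeof *bl + n * sizeof bl->items[0]);
    if (bl == NULL)
        return NULL;
    bl->count = n;
    for (size_t i = 0; i < n; i++) {
        bl->items[i].name = NULL;
        bl->items[i].mass = 0.0;
    }
    return bl;
}

/* Functions receive the whole list, so the count travels with the data. */
double body_list_total_mass(const struct body_list *bl)
{
    assert(bl != NULL);
    double total = 0.0;
    for (size_t i = 0; i < bl->count; i++)
        total += bl->items[i].mass;
    return total;
}
```

A single free(bl) releases both the count and the elements, and no negative indexing is needed.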

Related

Flexible array member and pointer member : pros and cons?

What is the difference between using a flexible array member (FAM) and a pointer member? In both cases, a malloc and an element-by-element assignment must be done. But with a FAM, memory is allocated for the whole structure, whereas with a pointer member, memory is allocated only for the array the pointer refers to (see code). What are the pros and cons of these two methods?
#include <stdio.h>
#include <stdlib.h>
typedef struct farr_mb {
    int lg;
    int arr[];
} Farr_mb;
typedef struct ptr_mb {
    int lg;
    int *ptr;
} Ptr_mb;
int main(void) {
    int lg = 5;
    Farr_mb *a = malloc(sizeof(Farr_mb) + lg * sizeof(int)); // one block: header + array
    Ptr_mb b;
    b.ptr = malloc(lg * sizeof(int)); // separate block for the array only
    if (a == NULL || b.ptr == NULL) return 1;
    a->lg = b.lg = lg;
    for (int i = 0; i < lg; i++) a->arr[i] = i;
    for (int i = 0; i < lg; i++) b.ptr[i] = i;
    for (int i = 0; i < lg; i++) printf("%d \t", a->arr[i]);
    printf("\n");
    for (int i = 0; i < lg; i++) printf("%d \t", b.ptr[i]);
    printf("\n");
    free(a);
    free(b.ptr);
    return 0;
}
Before we get to the pros and cons, let's look at some real-world examples.
Let's say we wish to implement a hash table, where each entry is a dynamically managed array of elements:
struct hash_entry {
size_t allocated;
size_t used;
element array[];
};
struct hash_table {
size_t size;
struct hash_entry **entry;
};
#define HASH_TABLE_INITIALIZER { 0, NULL }
This in fact uses both. The hash table itself is a structure with two members. The size member indicates the size of the hash table, and the entry member is a pointer to an array of hash table entry pointers. This way, each unused entry is just a NULL pointer. When adding elements to a hash table entry, the entire struct hash_entry can be reallocated (to sizeof (struct hash_entry) + allocated * sizeof (element) bytes) or freed, as long as the corresponding pointer in the entry member in the struct hash_table is updated accordingly.
If we used element *array instead, we would need to use struct hash_entry *entry; in the struct hash_table; or allocate the struct hash_entry separately from the array; or allocate both struct hash_entry and array in a single chunk, with the array pointer pointing just after the same struct hash_entry.
The cost of that would be two extra size_ts worth of memory used for each unused hash table slot, as well as an extra pointer dereference when accessing elements. (Or, to get the address of the array, two consecutive pointer dereferences, instead of one pointer dereference plus offset.) If this is a key structure heavily used in an implementation, that cost can be visible in profiling, and negatively affect cache performance. For random accesses, the larger the element array is, the less difference there is, however; the cost is largest when the arrays are small, and fit within the same cacheline (or a few cachelines) as the allocated and used members.
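To make the reallocation step concrete, here is a minimal sketch of appending to one slot of such a table. It is not taken from any real implementation: hash_table_append is a hypothetical name, element is assumed to be int because the definitions above leave it open, and the growth policy (start at 4, then double) is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

typedef int element;   /* assumption: the text leaves 'element' undefined */

struct hash_entry {
    size_t  allocated;
    size_t  used;
    element array[];
};

struct hash_table {
    size_t              size;
    struct hash_entry **entry;
};

/* Append value to slot i, growing the entry when it is full.
   Returns 0 on success, -1 on allocation failure (table unchanged). */
int hash_table_append(struct hash_table *table, size_t i, element value)
{
    assert(table != NULL && i < table->size);
    struct hash_entry *e = table->entry[i];   /* NULL means "unused slot" */

    if (e == NULL || e->used == e->allocated) {
        size_t want = (e == NULL) ? 4 : 2 * e->allocated;
        struct hash_entry *grown =
            realloc(e, sizeof *grown + want * sizeof grown->array[0]);
        if (grown == NULL)
            return -1;
        if (e == NULL)
            grown->used = 0;      /* fresh entry: realloc(NULL, n) acts like malloc(n) */
        grown->allocated = want;
        table->entry[i] = grown;  /* the block may have moved: update the slot pointer */
        e = grown;
    }
    e->array[e->used++] = value;
    return 0;
}
```

The only pointer that has to be fixed up after realloc() is the one stored in table->entry[i], which is exactly the condition stated above.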
We do not usually want to make the entry member in the struct hash_table a flexible array member, because that would mean you no longer can declare a hash table statically, using struct hash_table my_table = HASH_TABLE_INITIALIZER;; you would need to use a pointer to a table, and an initializer function: struct hash_table *my_table; my_table = hash_table_init(); or similar.
I do have another example of related data structures using both pointer members and flexible array members. It allows one to use variables of type matrix to represent any 2D matrix with double entries, even when a matrix is a view to another (say, a transpose, a block, a row or column vector, or even a diagonal vector); these views are all equal (unlike in e.g. GNU Scientific Library, where matrix views are represented by a separate data type). This matrix representation approach makes writing robust numerical linear algebra code easy, and the ensuing code is much more readable than when using GSL or BLAS+LAPACK. In my opinion, that is.
So, let's look at the pros and cons, from the point of view of how to choose which approach to use. (For that reason, I will not designate any feature as "pro" or "con", as the determination depends on the context, on each particular use case.)
Structures with flexible array members cannot be initialized statically. You can only refer to them via pointers.
You can declare and initialize structures with pointer members. As shown in the example above, using a preprocessor initializer macro can mean you do not need an initializer function. For example, a function accepting a struct hash_table *table parameter can always resize the array of pointers using realloc(table->entry, newsize * sizeof table->entry[0]), even when table->entry is NULL. This reduces the number of functions needed, and simplifies their implementation.
Accessing an array via a pointer member can require an extra pointer dereference.
If we compare the accesses to arrays in statically initialized structures with pointer to the array, to a structure with a flexible array member referred via a static pointer, the same number of dereferences are made.
If we have a function that gets the address of a structure as a parameter, then accessing an array element via a pointer member requires two pointer dereferences, whereas accessing a flexible array element requires only one pointer dereference and one offset. If the array elements are small enough and the array index small enough, so that the accessed array element is in the same cacheline, the flexible array member access is often significantly faster. For larger arrays, the difference in performance tends to be insignificant. This does vary between hardware architectures, however.
Reallocating an array via a pointer member hides the complexity from those using the structure as an opaque variable.
This means that if we have a function that receives a pointer to a structure as a parameter, and that structure has a pointer to a dynamically allocated array, the function can reallocate that array without the caller seeing any change in the structure address itself (only structure contents change).
However, if we have a function that receives a pointer to a structure with a flexible array member, reallocating the array means reallocating the entire structure. That potentially modifies the address of the structure. Because the pointer is passed by value, the modification is not visible to the caller. Thus, a function that may resize a flexible array member, must receive a pointer to a pointer to the structure with a flexible array member.
If the function only examines the contents of a structure with a flexible array member, say counts the number of elements that fulfill some criteria, then a pointer to the structure suffices; and both the pointer and the pointed-to data can be marked const. This might help the compiler produce better code. Furthermore, all the data accessed is linear in memory, which helps more complex processors manage caching more efficiently. (To do the same with an array having a pointer member, one would need to pass the pointer to the array, as well as the size field at least, as parameters to the counting function, instead of a pointer to the structure containing those values.)
An unused/empty structure with a flexible array member can be represented by a NULL pointer (to such structure). This can be important when you have an array of arrays.
With structures with flexible array members, the outer array is just an array of pointers. With structures with pointer members, the outer array can be either an array of structures, or an array of pointers to structures.
Both can support different types of sub-arrays, if the structures have a common type tag as the first member, and you use a union of those structures. (What 'use' means in this context is unfortunately debatable. Some claim you need to access the array via the union; I claim the visibility of such a union is sufficient, because anything else will break a huge amount of existing POSIX C code; basically all server-side C code using sockets.)
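The reallocation point in the list above can be illustrated by comparing two push functions. This is a sketch with hypothetical names (ptr_vec, fam_vec and their push functions); it grows by one element per call purely for brevity.

```c
#include <assert.h>
#include <stdlib.h>

struct ptr_vec {            /* pointer member: the array lives in its own block */
    size_t len;
    int   *data;
};

struct fam_vec {            /* flexible array member: one block for everything */
    size_t len;
    int    data[];
};

/* The struct stays where it is; only v->data may change. */
int ptr_vec_push(struct ptr_vec *v, int x)
{
    assert(v != NULL);
    int *grown = realloc(v->data, (v->len + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    v->data = grown;
    v->data[v->len++] = x;
    return 0;
}

/* The whole struct may move, so the caller's pointer is passed by address.
   *pv may be NULL, which stands for an empty vector. */
int fam_vec_push(struct fam_vec **pv, int x)
{
    assert(pv != NULL);
    size_t len = (*pv != NULL) ? (*pv)->len : 0;
    struct fam_vec *grown =
        realloc(*pv, sizeof *grown + (len + 1) * sizeof grown->data[0]);
    if (grown == NULL)
        return -1;
    grown->data[len] = x;
    grown->len = len + 1;
    *pv = grown;            /* publish the (possibly new) address to the caller */
    return 0;
}
```

A caller would write ptr_vec_push(&a, x) for a struct ptr_vec a, but fam_vec_push(&b, x) for a struct fam_vec *b; the extra level of indirection in the second signature is the cost described above.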
Those are the major ones I can think of right now. Both forms are ubiquitous in my own code, and I have had no issues with either. (In particular, I prefer using a structure free helper function that poisons the structure to help detect use-after-free bugs in early testing; and my programs do not often have any memory-related issues.)
I will edit the above list, if I find I've missed important facets. Therefore, if you have a suggestion or think I've overlooked something above, please let me know in a comment, so I can verify and edit as appropriate.

Significance of array with only one element

I have come across an array with only one element. This array is defined inside a structure. Which goes like this:
typedef struct abc
{
int variable1;
char variable2;
float array[1];
};
I don't understand why this array is required. Why can't we just define a variable, or a pointer (considering array properties)?
I want to use it. How do I use this variable? abc.array[0] seems correct, doesn't it?
Addition: I am not using any dynamic memory allocation, so what is its significance?
It's probably what is called the "struct hack". By allocating a large block of memory, the array becomes dynamic. The one element is just a placeholder to make it compile, in fact there will be many floats.
The dynamic array has to be the last element.
Use like this:
struct abc *ptr = malloc(sizeof(struct abc) + (N-1) * sizeof(float));
ptr->variable1 = N; /* usually store length somewhere in struct*/
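For comparison, here is a sketch of the same idea written with a C99 flexible array member, which is the standardized form of the struct hack. The helper name abc_create is hypothetical. Because sizeof does not count the flexible member, the (N-1) adjustment disappears.

```c
#include <assert.h>
#include <stdlib.h>

struct abc {
    int   variable1;   /* holds the element count, as in the answer above */
    char  variable2;
    float array[];     /* C99 flexible array member replaces array[1] */
};

/* Allocate a struct abc with room for n floats, all set to 0. */
struct abc *abc_create(int n)
{
    assert(n >= 0);
    struct abc *p = malloc(sizeof *p + (size_t)n * sizeof p->array[0]);
    if (p == NULL)
        return NULL;
    p->variable1 = n;
    p->variable2 = '\0';
    for (int i = 0; i < n; i++)
        p->array[i] = 0.0f;
    return p;
}
```

Either way, the struct is only useful when allocated dynamically; declared as an ordinary variable, the array[1] version simply holds one float.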

C structs sharing common pointer?

I'm currently having an issue with the following struct:
typedef struct __attribute__((__packed__)) rungInput{
operation inputOperation;
inputType type;
char* name;
char numeroInput;
u8 is_not;
} rungInput;
I create multiple structs like above inside a for loop, and then fill in their fields according to my program logic:
while (a < 5){
rungInput input;
(...)
Then when I'm done filling the struct's fields appropriately, I then attempt to copy the completed struct to an array as such:
rungArray[a] = input; //memcpy here instead?
And then I iterate again through my loop. I'm having a problem where my structs seem to all have their name value be the same, despite clearly having gone through different segments of code and assigning different values to that field for every loop iteration.
For example, if I have three structs with the following names: "SW1", "SW2", "SW3", after I am done adding them to my array I seem to have all three structs point me to the value "SW3" instead. Does this mean I should call malloc() to allocate manually each pointer inside each struct to ensure that I do not have multiple structs that point to the same value, or am I doing something else wrong?
When you write rungArray[i] = input;, you are copying the pointer that is in the input structure into the rungArray[i] structure. If you subsequently overwrite the data that the input structure is pointing at, then you also overwrite the data that the rungArray[i] structure is pointing at. Using memcpy() instead of assignment won't change this at all.
There are a variety of ways around this. The simplest is to change the structure so that you allocate a big enough array in the structure to hold the name:
enum { MAX_NAME_SIZE = 32 };
…
char name[MAX_NAME_SIZE];
…
However, if the extreme size of a name is large but the average size is small, then this may waste too much space. In that case, you continue using a char *, but you do indeed have to modify the copying process to duplicate the string with dynamically allocated memory:
rungArray[i] = input;
rungArray[i].name = strdup(input.name);
Remember to free the memory when you discard the rungArray. Yes, this code copies the pointer and then overwrites it, but it is more resilient to change because all the fields are copied, even if you add some extra (non-pointer) fields, and then the pointer fields are handled specially. If you write the assignments to each member in turn, you have to remember to track all the places where you do this (that would be a single assignment function, wouldn't it?) and add the new assignments there. With the code shown, that mostly happens automatically.
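The copy-then-fix-up pattern could be wrapped in a single function along these lines. This is a sketch, not the asker's code: the struct is trimmed to two fields, rung_input_copy is a hypothetical name, and copy_string is a small portable stand-in for strdup(), which is POSIX rather than part of older ISO C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct rungInput {
    char *name;
    char  numeroInput;
} rungInput;   /* trimmed to the two fields the example needs */

/* Portable stand-in for strdup(): returns a malloc'ed copy of s, or NULL. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Copy *src into *dst, giving dst its own copy of the name.
   Returns 0 on success, -1 if the name could not be allocated. */
int rung_input_copy(rungInput *dst, const rungInput *src)
{
    assert(dst != NULL && src != NULL && src->name != NULL);
    char *name = copy_string(src->name);
    if (name == NULL)
        return -1;
    *dst = *src;        /* copies every field, including the shared pointer */
    dst->name = name;   /* then replace the pointer with the private copy */
    return 0;
}
```

In the loop this would be called as rung_input_copy(&rungArray[a], &input), and each rungArray[a].name must later be passed to free().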
You should malloc memory for your struct and then store the pointers to the structs inside your array. You could also turn your structs into a linked list by adding a pointer to each struct that points to the next instance of your struct.
http://www.cprogramming.com/tutorial/c/lesson15.html

Better way of declaring an array?

I'm writing in C and compiling with GCC.
Is there a better way of declaring points? I was surprised to see that points was an array. Is there some way of declaring points so it looks more like an array?
typedef struct Span
{
unsigned long lo;
unsigned long hi;
} Span;
typedef struct Series
{
unsigned long *points;
unsigned long count;
unsigned long limit;
} Series;
void SetSpanSeries(Series *self, const Span *src)
{
unsigned long *points;
if (src->lo < src->hi )
{
// Overlays second item in series.
points = self->points; // a pointer in self structure
points[0] = src->lo;
points[1] = src->hi;
self->count = 1;
}
}
Now let's say that points points to an array of structures.
typedef struct Span
{
unsigned long lo;
unsigned long hi;
} Span;
span *points[4];
Now how do I write these lines of code? Did I get this right?
points = self->points; // a pointer in self structure
points[0].lo = src->lo;
points[0].hi = src->hi;
With the declaration unsigned long *points, points is a pointer. It points to the beginning of an array. arr[x] is the same as *(arr + x), so whether arr is an array (in which case, it takes the address of the array, adds x, and dereferences the 'pointer') or a pointer (in which case, it takes the pointer value, adds x, and dereferences the pointer), arr[0] still gets the same array access.
In this case, you can't declare points as an array because you're not using it as an array - you're using it as a pointer, which points to an array. Copying a pointer is a shallow copy - if you change the data pointed to by the pointer, it changes the original data. To use a regular local array, you'd need to do a deep copy, and then your changes would no longer affect the array in self, which is ultimately what you want to modify.
In fact, you could rewrite the whole thing without points:
void SetSpanSeries(Series *self, const Span *src)
{
if (src->lo < src->hi )
{
self->points[0] = src->lo;
self->points[1] = src->hi;
self->count = 1;
}
}
As to your second example, yes, points[0].lo is correct. points->lo would also be correct, so long as you're only accessing points[0]. (Or self->points[0].lo if you take out points entirely.)
The ability to treat a pointer as an array definitely confuses most C beginners. Arrays even decay to pointers when passed as arguments to functions, giving the impression that arrays and pointers are completely interchangeable -- they aren't. An excellent description is in Expert C Programming: Deep C Secrets. (This is one of my favorite books; it's strongly recommended if you intend to understand C.)
Anyway, writing pointer[2] is the same as *(pointer+2) -- the array syntax is far easier for most people to read (and write).
Since you are using this *points variable to provide easier access to another block of memory (the pointer points in the struct Series), you cannot use an array for your local variable because you cannot re-assign the base of an array to something else. Consider the following illegal code:
int foo[10];
int *bar;
int wrong[10];
bar = foo; /* fine */
wrong = foo; /* compile error -- cannot assign to the array 'wrong' */
Another option for re-writing this code is to remove the temporary variable:
if (src->lo < src->hi) {
self->points[0] = src->lo;
self->points[1] = src->hi;
self->count = 1;
}
I'm not sure the temporary variable helps with legibility -- it just saved typing a few characters at the expense of adding a lot of characters. (And a confusing variable, too.)
In the middle section you say points is an array 4 of pointer to struct span. In the third section you are assigning points from self->points (meaning the previous value of points, that array, has been lost). You then dereference points as if it were an array of struct Span and not an array of pointers to struct Span.
In other words, this cannot compile because you are mixing types, and even if you were not, you are overwriting the memory allocated by your definition of the points variable.
Providing the definition of Series might help explain what is going on.
But certainly in the first example, points should probably be a Span *points but without seeing Series we cannot tell for sure.
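If points were indeed meant to be a Span *, the function might look like the sketch below. SpanSeries is a hypothetical variant of Series introduced only for this illustration, and the limit check is an added assumption.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Span {
    unsigned long lo;
    unsigned long hi;
} Span;

/* Hypothetical variant of Series whose storage is an array of Span. */
typedef struct SpanSeries {
    Span         *points;   /* points to caller-provided storage */
    unsigned long count;    /* spans currently in use */
    unsigned long limit;    /* spans that fit in the storage */
} SpanSeries;

/* Reset the series to hold the single span *src; ignore empty or inverted spans. */
void SetSpanSeries(SpanSeries *self, const Span *src)
{
    assert(self != NULL && src != NULL);
    if (src->lo < src->hi && self->limit >= 1) {
        self->points[0] = *src;   /* struct assignment copies lo and hi together */
        self->count = 1;
    }
}
```

With this layout, points[0].lo and points->lo both name the same member, and no temporary pointer is needed.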

What is the cause of flexible array member not at end of struct error?

I am wondering why I keep getting error: flexible array member not at end of struct error when I call malloc. I have a struct with a variable length array, and I keep getting this error.
The struct is,
typedef struct {
size_t N;
double data[];
int label[];
} s_col;
and the call to malloc is,
col = malloc(sizeof(s_col) + lc * (sizeof(double) + sizeof(int)));
Is this the correct call to malloc?
You can only have one flexible array member in a struct, and it must always be the last member of the struct. In other words, in this case you've gone wrong before you call malloc, to the point that there's really no way to call malloc correctly for this struct.
To do what you seem to want (arrays of the same number of data and label members), you could consider something like:
struct my_pair {
double data;
int label;
};
typedef struct {
    size_t N;
    struct my_pair data_label[];
} s_col;
Note that this is somewhat different though: instead of an array of doubles followed by an array of ints, it gives you an array of one double followed by one int, then the next double, next int, and so on. Whether this is close enough to the same or not will depend on how you're using the data (e.g., for passing to an external function that expects a contiguous array, you'll probably have to do things differently).
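A possible allocation for the pair-based layout is sketched below; s_col_create is a hypothetical helper name, and zero-initializing the pairs is just a choice made for the example.

```c
#include <assert.h>
#include <stdlib.h>

struct my_pair {
    double data;
    int    label;
};

typedef struct {
    size_t         N;
    struct my_pair data_label[];   /* the only flexible array member, and last */
} s_col;

/* Allocate an s_col with room for n (data, label) pairs, zero-initialized. */
s_col *s_col_create(size_t n)
{
    s_col *col = malloc(sizeof *col + n * sizeof col->data_label[0]);
    if (col == NULL)
        return NULL;
    col->N = n;
    for (size_t i = 0; i < n; i++) {
        col->data_label[i].data  = 0.0;
        col->data_label[i].label = 0;
    }
    return col;
}
```

Using sizeof col->data_label[0] rather than sizeof(double) + sizeof(int) matters here, because the pair struct usually contains padding.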
Given a struct definition and a pointer to the start of a struct, it is necessary that the C compiler be able to access any member of the struct without having to access anything else. Since the location of each item within the structure is determined by the number and types of items preceding it, accessing any item requires that the number and types of all preceding items be known. In the particular case where the last item is an array, this poses no particular difficulty since accessing an item in an array requires knowing where it starts (which requires knowing the number and type of preceding items, rather than the number of items in the array itself), and the item index (which the compiler may assume to be smaller than the number of items for which space exists, without having to know anything about the array size). If a Flexible Array Member appeared anywhere other than at the end of a struct, though, the location of any items which followed it would depend upon the number of items in the array--something the compiler isn't going to know.
typedef struct {
size_t N;
double data[];
int label[];
} s_col;
You can't have a flexible array member (double data[]) in the middle of a struct. Consider a hardcoded array size or double *data instead.
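If two separate contiguous arrays are needed, one way to keep a single malloc is to make only data the flexible member and point label into the tail of the same block. This is a sketch under assumptions: s_col2 and s_col2_create are hypothetical names, and it relies on int needing no stricter alignment than double, which holds on common platforms.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    size_t  N;
    int    *label;    /* points into the same allocation, after data[N] */
    double  data[];   /* flexible array member stays last */
} s_col2;

s_col2 *s_col2_create(size_t lc)
{
    /* Layout: header | lc doubles | lc ints. The ints start at an offset that
       is a multiple of sizeof(double), assumed to satisfy int alignment. */
    s_col2 *col = malloc(sizeof *col + lc * (sizeof(double) + sizeof(int)));
    if (col == NULL)
        return NULL;
    col->N = lc;
    col->label = (int *)(col->data + lc);
    return col;
}
```

Both col->data and col->label can then be passed to functions expecting plain contiguous arrays, and one free(col) releases everything.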
