I would like to compose a structure that contains a contiguous flexible array whose elements are themselves small aggregates of a few data fields. With this pattern I want to: a) store several related data fields, b) have an array of such instances, and c) destroy them all with a single free(). The code below seems to do the job:
struct s_symm_root{
unsigned int size_c;
struct {
unsigned int symm_color;
t_symm_gnrtr symm_gnrtr;
} _[];
} *symm_root;
unsigned int SIZE_C = 10;
symm_root = (struct s_symm_root*)malloc( sizeof(struct s_symm_root) + sizeof(*((struct s_symm_root*)NULL)->_) * SIZE_C );
symm_root->size_c = 1;
symm_root->_[0].symm_color = 2;
printf(" %u, %u, %zu.\n", symm_root->size_c, symm_root->_[0].symm_color, sizeof(*((struct s_symm_root*)NULL)->_));
free(symm_root);
My question is whether I can improve it a bit, especially by getting rid of that ugly '_' somehow. I thought about anonymous structures, but I don't know how to implement them...
Thanks for suggestions!
For the problem, consider below two structures.
struct type1_A
{
int a;
char b;
char rsvd[10];
char c;
int d;
};
struct type1_B
{
int d;
char rsvd[12];
char c;
char b;
int a;
};
I need to read fields a, b, c and d from the structs. I will have a buffer address, and that buffer will hold one of the two structs. A flag tells which kind of struct it is.
if (flag == TYPE1_A) {
a = ((struct type1_A*) (buffer))->a;
}
else if (flag == TYPE1_B) {
a = ((struct type1_B*) (buffer))->a;
}
But when there are many such reads, I don't want to keep repeating an if-else like the one above. Is there some way (or hack) to do this without the if-else? The field names are the same, but they sit at different offsets.
You can do the pointer arithmetic manually and store the offsets in a table indexed by type. That'd replace the if-else ladder with a table lookup:
if(flag_is_within_type_range(flag))
a=*(int*) ((char*)buffer+offset_table_indexed_by_type[flag]);
Essentially, the resulting assembly has to perform a lookup from the dynamic type number to an offset, and the fact that the same member name (a) is used for those different offsets doesn't help. There's hardly any good way to exploit it.
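A minimal sketch of the offset-table idea, assuming the two structs from the question. The flag values (TYPE1_A = 0, TYPE1_B = 1) and the helper name read_a are made up for illustration, since the question does not give them. offsetof fills the table at compile time, and memcpy is used so that nothing is assumed about the buffer's alignment:

```c
#include <stddef.h>
#include <string.h>

struct type1_A { int a; char b; char rsvd[10]; char c; int d; };
struct type1_B { int d; char rsvd[12]; char c; char b; int a; };

/* Hypothetical flag values; the question does not say what they are. */
enum { TYPE1_A = 0, TYPE1_B = 1, TYPE_COUNT };

/* Offset of member `a` for each struct type, indexed by flag. */
static const size_t a_offset[TYPE_COUNT] = {
    [TYPE1_A] = offsetof(struct type1_A, a),
    [TYPE1_B] = offsetof(struct type1_B, a),
};

/* Reads member `a` from buffer; returns 0 on success, -1 on a bad flag. */
static int read_a(const void *buffer, int flag, int *out)
{
    if (flag < 0 || flag >= TYPE_COUNT)
        return -1;
    /* memcpy makes no assumption about the buffer's alignment. */
    memcpy(out, (const char *)buffer + a_offset[flag], sizeof *out);
    return 0;
}
```

One such table per field (b, c, d) would be needed, which is why the shared member name buys nothing by itself.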
I would use macro like this:
#define TYPE_1 1
#define GETMEMBER(type, buff, field) ((type) == TYPE_1 ? ((struct type1_A *)(buff)) -> field : ((struct type1_B *)(buff)) -> field)
void foo(void *buff)
{
int type1a = GETMEMBER(TYPE_1, buff, a);
int type2a = GETMEMBER(0, buff, a);
printf("Type1: %d, type2:%d\n", type1a, type2a);
}
If you use a constant expression as type the compiler will optimize out the comparison leaving only the assignment:
https://godbolt.org/z/a73YsorMe
I have a list of variables char [][20] ls = {"var_1", "var_2", ... , ""}
which are the names of the fields of a struct struct {char var1[10], ...} my_struct;
The variables inside the struct are all char[] with changing lengths.
The list itself is const and should not change mid-run-time.
I want to access those variables in a loop in a somewhat generic way. Instead of calling myfunc(my_struct.var1); myfunc(my_struct.var2); and so on, I would much rather have:
for (char * p = ls[0]; *p; p += sizeof(ls[0]))
{
myfunc(my_struct.{some magic that would put var_1 / var_2 here});
}
But I guess this is impossible due to fact that the loop is executed in run-time, and the variable name needs to be available in compile-time.
Am I correct, or is there something that can be done here? (It doesn't have to be done this way; I just want to know whether I can pack this routine into a nice loop.)
Since all members are arrays of the same type, you can create an array of addresses to each member and loop through that:
char *my_struct_addrs[] = { my_struct.var1, my_struct.var2, ... };
int i;
for (i=0; i < sizeof(my_struct_addrs) / sizeof(my_struct_addrs[0]); i++) {
myfunc(my_struct_addrs[i]);
}
However, since each of these arrays has a different size, you'll need to take care not to run past the bounds of each one. You can address this by keeping track of the size of each field and passing that to the function as well:
struct addr_list {
char *addr;
int len;
};
struct addr_list my_struct_addrs[] = {
{ my_struct.var1, sizeof(my_struct.var1) },
{ my_struct.var2, sizeof(my_struct.var2) },
...
};
int i;
for (i=0; i < sizeof(my_struct_addrs) / sizeof(my_struct_addrs[0]); i++) {
myfunc(my_struct_addrs[i].addr, my_struct_addrs[i].len);
}
Assuming you have something like
const char* ls[] = {"var_1", "var_2", ""};
where this list is not tightly coupled to the struct data (if it is, you can use the answer by dbush), but is a separate item for whatever reason.
Then the slightly hacky, but well-defined version would be to use look-up tables. Create two lookup tables, one with strings, one with offsets:
#include <stddef.h>
typedef struct
{
int var_1;
int var_2;
} my_struct_t;
static const char* VAR_STRINGS[] =
{
"var_1",
"var_2",
""
};
static const size_t VAR_OFFSET[] =
{
offsetof(my_struct_t, var_1),
offsetof(my_struct_t, var_2),
};
Then do something like index = search_in_VAR_STRINGS_for(ls[i]); to get an index. (Loop through all items, or use binary search etc). The following code is then actually legal and well-defined:
unsigned char* ptr = (unsigned char*)&my_struct;
ptr += VAR_OFFSET[index];
int var_1 = *(int*)ptr;
This takes padding into account, and the pointer arithmetic is guaranteed to be OK by C11 6.3.2.3/7:
When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
And since what's really stored at that address (effective type) is indeed an int, the variable access is guaranteed to be OK by C11 6.5/7 ("strict aliasing"):
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
But various error handling obviously needs to be in place to check that something doesn't go out of bounds.
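Putting the pieces together, here is a rough sketch of the name-to-offset lookup with the bounds check folded in. It assumes every member is an int, as in the tables above; find_member and VAR_COUNT are illustrative names, not part of the original answer, and a plain linear search stands in for whatever search you prefer:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    int var_1;
    int var_2;
} my_struct_t;

static const char *const VAR_STRINGS[] = { "var_1", "var_2" };
static const size_t VAR_OFFSET[] = {
    offsetof(my_struct_t, var_1),
    offsetof(my_struct_t, var_2),
};
#define VAR_COUNT (sizeof VAR_STRINGS / sizeof VAR_STRINGS[0])

/* Linear search; returns a pointer to the named int member, or NULL. */
static int *find_member(my_struct_t *s, const char *name)
{
    for (size_t i = 0; i < VAR_COUNT; i++) {
        if (strcmp(VAR_STRINGS[i], name) == 0)
            return (int *)((unsigned char *)s + VAR_OFFSET[i]);
    }
    return NULL; /* unknown name: the caller must check */
}
```

Because the index never leaves the loop, it cannot run past either table; the two tables still have to be kept in the same order by hand.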
I have a struct:
struct A
{
unsigned int a, b, c, d, ...
};
I want to make a function:
unsigned int A_hash(const struct A* const var)
{
return ...
}
The number returned needs to be very very large as modulus for HashTable insertion will not work properly if A_hash(var) < myHashTable.capacity.
I've seen questions like this before, such as "Hash function that takes in two integers", "hash function that takes in five integers", etc., but what about n integers? I'm looking for a more general algorithm for decent hashing. It doesn't need to be enterprise-level.
I was thinking perhaps start with a massive number like
return (0x7FFFFFFFF & a) + (0x7FFFFFFFF & b) + ...
but I don't think this will be good enough. I also don't know how to stop the A_hash function from overflowing, but that may be another problem altogether.
I think you are implicitly asking how to treat the entire object as one long byte stream, as @bruceg explained. If I'm wrong, you might as well ignore this answer, because that is what I will address. Note that this solution applies not only to hashing but to anything that requires you to treat data as bytes (such as copying from/writing to memory or files).
I think what you are looking for is simply reading the object byte by byte. For this you can take inspiration from std::ostream::write (which is a C++ method, though). For example, you could write A_hash so that it is invoked like this:
unsigned int hash = A_hash((char*)&a, sizeof(a)); // where 'a' is of type 'struct A'.
You could write A_hash, for example, like this:
unsigned int A_hash(char* data, unsigned int dataSize)
{
unsigned int hash = someValue;
for (unsigned int i = 0; i < dataSize; ++i)
{
unsigned char byte = (unsigned char)data[i];
hash = doSomethingWith(hash, byte); /* placeholder for the mixing step */
}
return hash;
}
The great advantage of this method is that you don't need to rewrite the function if you add or remove fields in your struct; sizeof(A) grows or shrinks at compile time. The other great advantage is that it works for any value, so you can reuse the function with any type you want, including int, another struct, an enum, a pointer, ...
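As one possible way to fill in the placeholders, here is a sketch that uses 32-bit FNV-1a as the mixing step. FNV-1a is my choice for illustration, not something the answer prescribes, and bytes_hash is a made-up name. The arithmetic is on an unsigned type, so it simply wraps modulo 2^32 and overflow is not a concern:

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over an arbitrary byte range. */
static uint32_t bytes_hash(const void *data, size_t size)
{
    const unsigned char *p = data;
    uint32_t hash = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < size; ++i) {
        hash ^= p[i];                 /* mix in one byte */
        hash *= 16777619u;            /* FNV prime; wraps modulo 2^32 */
    }
    return hash;
}
```

It would be called as bytes_hash(&a, sizeof a). One caveat: if the struct contains padding bytes, their contents are unspecified, so two equal structs can hash differently unless the object was cleared with memset before being filled. A struct made only of unsigned int members normally has no padding.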
With the following (simplified) data definition :
#define DIM0 10
#define DIM1 15
typedef struct {
uint32_t var1;
...
int8_t arrayVar1[DIM0];
} dataClass0;
typedef struct {
uint32_t var1;
...
int8_t arrayVar1[DIM1];
} dataClass1;
At one given point I must create an array of these structures and process them.
The processing is exactly the same except for the arrays (different length). Right now it's something like:
dataClass0 *data;
data = (dataClass0 *) malloc(dimension * sizeof (dataClass0));
// Processing and filling structure
data[i].var1 = <value>
...
Right now I have the same function duplicated for each data class. Is there a way around duplicating code when using these data structures?
Notes:
Only pure C, no C++;
I cannot change the data definition (i.e. cannot use int8_t *arrayVar1 in the struct).
When processing I receive the type of data to process (0 for class0, 1 for class1, ...).
typedef struct {
uint32_t var1;
...
int8_t arrayVar[]; /* Declare as flexible array, allowed since C99 */
} dataClass;
Allocate with something like this:
data1 = malloc(sizeof (dataClass) + DIM1*sizeof ((dataClass*)NULL)->arrayVar[0]);
data2 = malloc(sizeof (dataClass) + DIM2*sizeof ((dataClass*)NULL)->arrayVar[0]);
or define
#define ALLOCDATA(dim) malloc(sizeof (dataClass) + (dim)*sizeof ((dataClass*)NULL)->arrayVar[0])
define
#define ELEMENT1(data, i) ((dataClass*)(((char*)(data))+(i)*(DIM1+sizeof (dataClass))))
#define ELEMENT2(data, i) ((dataClass*)(((char*)(data))+(i)*(DIM2+sizeof (dataClass))))
or if you parametrize the DIM
#define ELEMENT(data, i, dim) ((dataClass*)(((char*)(data))+(i)*((dim)+sizeof (dataClass))))
enjoy
ELEMENT1(data1, i)->var1 = 1;
ELEMENT1(data1, i)->arrayVar[9] = 4;
ELEMENT2(data2, i)->arrayVar[14] = 4;
or
ELEMENT(data1, i, DIM1)->var1 = 1;
ELEMENT(data1, i, DIM1)->arrayVar[9] = 4;
ELEMENT(data2, i, DIM2)->arrayVar[14] = 4;
Not perfect, but not too weird a construct to not be usable.
EDIT:
The ELEMENT defines should be changed to
#define ELEMENT1(data, i) ((dataClass*)(((char*)(data))+(i)*(DIM1*sizeof ((dataClass*)NULL)->arrayVar[0]+sizeof (dataClass))))
#define ELEMENT2(data, i) ((dataClass*)(((char*)(data))+(i)*(DIM2*sizeof ((dataClass*)NULL)->arrayVar[0]+sizeof (dataClass))))
#define ELEMENT(data, i, dim) ((dataClass*)(((char*)(data))+(i)*((dim)*sizeof ((dataClass*)NULL)->arrayVar[0]+sizeof (dataClass))))
With this change, your arrayVar field can be of any type and is not limited to elements of size 1.
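For comparison, here is a sketch of the same idea written as functions rather than macros. The names element_stride, alloc_elements and element_at are invented for this example, and _Alignof assumes a C11 compiler. It also differs from the macros above in one respect: the per-element stride is rounded up to the struct's alignment, because a stride such as 15 + sizeof(dataClass) can leave var1 misaligned in every element after the first:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t var1;
    int8_t   arrayVar[];   /* flexible array member (C99) */
} dataClass;

/* Bytes occupied by one element holding `dim` array entries, rounded up
   so that every element in the block stays aligned for dataClass. */
static size_t element_stride(size_t dim)
{
    size_t raw   = offsetof(dataClass, arrayVar) + dim * sizeof(int8_t);
    size_t align = _Alignof(dataClass);   /* C11 */
    if (raw < sizeof(dataClass))
        raw = sizeof(dataClass);
    return (raw + align - 1) / align * align;
}

/* Allocates a zeroed block of `count` elements of `dim` entries each. */
static void *alloc_elements(size_t count, size_t dim)
{
    return calloc(count, element_stride(dim));
}

/* Returns element i of a block created with the same `dim`. */
static dataClass *element_at(void *block, size_t i, size_t dim)
{
    return (dataClass *)((char *)block + i * element_stride(dim));
}
```

A single processing function can then take dim as a parameter and call element_at(block, i, dim)->var1 or ->arrayVar[k] for either data class. Note that the resulting layout is not the same as an array of the original dataClass0 or dataClass1, so this only helps where the layout is yours to choose.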
Can't you just make the array dynamic? That is, create your structures with malloc() and then initialize some member to hold the size (and make sure the struct ends with a uint8_t * instead of an actual array, or use VLAs)?
It depends whether or not you want to fill the arrays with different values initially. Otherwise you can just have a macro that initializes both types of structures
#define STRUCTURE_INITIALIZER(VAR1, VAR2) { .var1 = (VAR1), .var2 = (VAR2) }
and use that as
dataClass0 data = STRUCTURE_INITIALIZER(31, 42);
Your array components would then always be zero-initialized, regardless of their size.
To initialize a malloc'ed array of your stuff:
dataClass0 *data = malloc(dimension * sizeof (dataClass0));
// Processing and filling structure
for (size_t i = 0; i < dimension; ++i)
data[i]= (dataClass0)STRUCTURE_INITIALIZER(43, i);
BTW, prefer to initialize variables properly and don't cast the return of malloc.
This is just like the struct hack.
Is it valid according to standard C?
// error check omitted!
typedef struct foo {
void *data;
char *comment;
size_t num_foo;
}foo;
foo *new_Foo(size_t num, blah blah)
{
foo *f;
f = malloc(num + sizeof(foo) + MAX_COMMENT_SIZE );
f->data = f + 1; // is this OK?
f->comment = f + 1 + num;
f->num_foo = num;
...
return f;
}
Yes, it's completely valid. And I would strongly encourage doing this when it allows you to avoid unnecessary additional allocations (and the error handling and memory fragmentation they entail). Others may have different opinions.
By the way, if your data isn't void * but something you can access directly, it's even easier (and more efficient because it saves space and avoids the extra indirection) to declare your structure as:
struct foo {
size_t num_foo;
type data[];
};
and allocate space for the amount of data you need. The [] syntax is only valid from C99 onward, so for C89 compatibility you should use [1] instead, but this may waste a few bytes.
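A short sketch of what that allocation could look like, assuming int as the element type (the answer leaves type open) and a made-up helper name foo_new:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct foo {
    size_t num_foo;
    int    data[];          /* flexible array member, must be last */
};

/* Allocates a foo with room for n ints; returns NULL on failure. */
static struct foo *foo_new(size_t n)
{
    /* Guard against overflow in the size computation. */
    if (n > (SIZE_MAX - sizeof(struct foo)) / sizeof(int))
        return NULL;
    struct foo *f = malloc(sizeof *f + n * sizeof f->data[0]);
    if (f != NULL)
        f->num_foo = n;
    return f;
}
```

The whole object, header and data alike, is released with a single free(f).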
The line you question is valid - as others have said.
Interestingly, the next line, which you did not query, is syntactically valid but is not giving you the answer you want (except in the case where num == 0).
typedef struct foo
{
void *data;
char *comment;
size_t num_foo;
} foo;
foo *new_Foo(size_t num, blah blah)
{
foo *f;
f = malloc(num + sizeof(foo) + MAX_COMMENT_SIZE );
f->data = f + 1; // This is OK
f->comment = f + 1 + num; // This is !!BAD!!
f->num_foo = num;
...
return f;
}
The value of f + 1 is a foo * (implicitly coerced into a void * by the assignment).
The value of f + 1 + num is also a foo *; it points to the num+1th foo.
What you probably had in mind was:
f->comment = (char *)f->data + num;
Or:
f->comment = (char *)(f + 1) + num;
Note that while GCC will allow you to add num to a void pointer, and it will treat it as if sizeof(void) == 1, the C Standard does not give you that permission.
That is an old game, though the usual form is like
struct foo {
size_t size;
char data[1];
};
and then allocate as much space as you want and use the array as if it had the desired size.
It is valid, but I would encourage you to find another way if possible: there are plenty of chances to screw this up.
Yes, the general idea of the hack is valid, but at least as I read it, you haven't implemented it quite correctly. This much you've done right:
f = malloc(num + sizeof(foo) + MAX_COMMENT_SIZE );
f->data = f + 1; // is this OK?
But this is wrong:
f->comment = f + 1 + num;
Since f is foo *, the f+1+num is computed in terms of sizeof(foo) -- i.e., it's equivalent to saying f[1+num] -- it (attempts to) index to the 1+numth foo in an array. I'm pretty sure that's not what you want. When you allocate the data, you're passing sizeof(foo)+num+MAX_COMMENT_SIZE, so what you're allocating space for is num chars, and what you (presumably) want is to point f->comment to a spot in memory that's num chars after f->data, which would be more like this:
f->comment = (char *)f + sizeof(foo) + num;
Casting f to a char * forces the math to be done in terms of chars instead of foos.
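For reference, a sketch of the whole constructor with the pointer arithmetic done in chars. MAX_COMMENT_SIZE is given an arbitrary value here because the question does not define it, the elided parameters are dropped, and the NULL check is added for the sake of a complete example:

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_COMMENT_SIZE 64   /* assumed value; not given in the question */

typedef struct foo {
    void  *data;
    char  *comment;
    size_t num_foo;
} foo;

/* One malloc holds the header, num data bytes, then the comment buffer. */
static foo *new_Foo(size_t num)
{
    foo *f = malloc(sizeof *f + num + MAX_COMMENT_SIZE);
    if (f == NULL)
        return NULL;
    char *bytes = (char *)(f + 1);   /* first byte after the header */
    f->data     = bytes;
    f->comment  = bytes + num;       /* char arithmetic: num bytes further */
    f->comment[0] = '\0';            /* start with an empty comment */
    f->num_foo  = num;
    return f;
}
```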
OTOH, since you're always allocating MAX_COMMENT_SIZE for comment, I'd probably simplify things (quite) a bit, and use something like this:
typedef struct foo {
char comment[MAX_COMMENT_SIZE];
size_t num_foo;
char data[1];
}foo;
And then allocate it like:
foo *f = malloc(sizeof(foo) + num-1);
f->num_foo = num;
and it'll work without any pointer manipulation at all. If you have a C99 compiler, you can modify this slightly:
typedef struct foo {
char comment[MAX_COMMENT_SIZE];
size_t num_foo;
char data[];
}foo;
and allocate:
foo *f = malloc(sizeof(foo) + num);
f->num_foo = num;
This has the additional advantage that the standard actually blesses it, though in this case the advantage is pretty minor (I believe the version with data[1] will work with every C89/90 compiler in existence).
Another possible problem might be alignment.
If you simply malloc your f->data, then you can safely e.g. convert your void* to double* and use it to read/write a double (provided that num is sufficiently large). However, in your example you can no longer do that, as f->data might not be properly aligned. For example, to store a double in f->data, you will need to use something like memcpy instead of a simple typecast.
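The memcpy approach could look roughly like this; store_double and load_double are hypothetical helper names. Copying through memcpy never dereferences a double pointer, so the address does not have to be suitably aligned:

```c
#include <string.h>

/* Stores a double at an arbitrary byte offset without an aligned access. */
static void store_double(void *buf, size_t offset, double value)
{
    memcpy((char *)buf + offset, &value, sizeof value);
}

/* Reads the double back from the same, possibly unaligned, location. */
static double load_double(const void *buf, size_t offset)
{
    double value;
    memcpy(&value, (const char *)buf + offset, sizeof value);
    return value;
}
```

With the struct from the question this would be used as store_double(f->data, 0, 3.14), provided num is at least sizeof(double).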
I'd rather use some function to allocate the data dynamically and free it correctly instead.
Using this trick only saves you the trouble of initializing the data structure, and can lead to very bad problems (see Jerry's comment).
I'd do something like this:
typedef struct foo {
void *data;
char *comment;
size_t num_foo;
}foo;
foo *alloc_foo( void * data, size_t data_size, const char *comment)
{
foo *elem = calloc(1,sizeof(foo));
void *elem_data = calloc(data_size, sizeof(char));
char *elem_comment = calloc(strlen(comment)+1, sizeof(char));
elem->data = elem_data;
elem->comment = elem_comment;
memcpy(elem_data, data, data_size);
memcpy(elem_comment, comment, strlen(comment)+1);
elem->num_foo = data_size + strlen(comment) + 1;
return elem;
}
void free_foo(foo *f)
{
if(f->data)
free(f->data);
if(f->comment)
free(f->comment);
free(f);
}
Note that I did not check the validity of the data, and my alloc can be optimized (replacing the strlen() calls with a stored length value).
It seems to me that this approach is safer... at the price of scattering the data across several allocations, perhaps.