Get the index of an element declared in a struct - c

I'd like to associate each element of my struct (or of something else, if it's not possible with a struct) with an incrementing index, kind of like an enum, but associated with value fields.
Let's say that I have this data struct :
typedef struct
{
int8_t value_at_index0; // start with index 0 for this field
int8_t value_at_index1; // then index = 1, etc
int8_t value_at_index2;
int8_t value_at_index3;
int8_t value_at_index4;
} data;
I want to get the index of a field, and the index should mirror the position where the field was declared in the struct.
EDIT: My problem is that I want to write values to an external memory, but I don't want to work out by hand which index each value lives at. For example, I'd like to do this: write(index.value_at_index2, data.value_at_index2)
Thanks

OK, I'm assuming either that you have badly written code you don't want to fix, or that this is a very specific problem.
I would just use an array or allocate some memory for it. That said, struct members are laid out in declaration order inside one contiguous block (they aren't scattered the way separate allocations are). Walking a struct with a pointer is not the proper way to work with one, but I ran a test and it worked.
Use __attribute__((packed, aligned(1))), a GCC/Clang extension. This sets the alignment to 1 so that no padding bytes are added; for example, a single-byte member is not followed by 3 empty bytes to fill the gap.
This does not shrink every member to one byte, and the trick stops working as soon as members have different sizes: adding 1 to the pointer while it is inside a 4-byte integer causes unexpected behaviour.
#include <stdint.h>
#include <stdio.h>
typedef struct __attribute__((packed, aligned(1))) // like so
{
    int8_t value_at_index0; // start with index 0 for this field
    int8_t value_at_index1; // then index = 1, etc
    int8_t value_at_index2;
    int8_t value_at_index3;
    int8_t value_at_index4;
} data;
int main(void)
{
    data d;
    int8_t *hey = NULL;
    size_t i;
    d.value_at_index0 = 5;
    d.value_at_index1 = 3;
    d.value_at_index2 = 2;
    d.value_at_index3 = 5;
    d.value_at_index4 = 10;
    hey = (int8_t *)&d; // the cast is required: &d is a data *, not an int8_t *
    for (i = 0; i < 5; i++) {
        printf("hey its working %d\n", *hey);
        hey += 1; // step to the next one-byte member
    }
    return 0;
}
Again, this is not correct behavior. Going by your question, I would advise you to stick with memory allocation, since it allows for dynamic sizes; and if you want a static size, you can just use an array.
If you want an even sketchier version of the above, recent C compilers allow a few tricks to make it dynamic, although I warn you that you will end up with memory leaks, core dumps, and unexpected behavior.
TL;DR: this is not correct behavior, and I'm only posting it in case you can't change the code and must use structs.
Now for the clean version. I read in the question that you are going to write the struct out. Assuming the struct is basic and you just want to write every member in it, you can simply do:
FILE *fp;
if ((fp = fopen("lets.txt", "w")) == NULL){
return 1;
}
fwrite(&data, sizeof(data), 1, fp);
if (fclose(fp) != 0){
return 1;
}
When you pass the address of a struct as the first parameter and its size as the second, fwrite writes the struct's bytes (every member, plus any padding) in one go. The same works for a pointer to an array: give the size of one element in the size parameter and the number of elements in the next parameter.
Again, this assumes all members are plain fixed-size values. If the struct holds pointers, only the pointer values are written, not the data they point to.

You should consider using arrays/pointers for this purpose. To make this work with a struct, you need to make sure that all members have the same type and temporarily remove padding for that struct. The following code illustrates how it is done.
#include <stdint.h>
/* Remove struct padding */
#pragma pack(push, 1)
typedef struct
{
    int8_t a;
    int8_t b;
    int8_t c;
    int8_t d;
} data;
/* Restore default padding; structs from other libraries can malfunction if packing is left on. */
#pragma pack(pop)
int main(void)
{
    /* Define a zeroed data structure */
    data tmp = { 0x00, 0x00, 0x00, 0x00 };
    ((int8_t *)&tmp)[0] = 0x01; /* tmp.a = 0x01 */
    ((int8_t *)&tmp)[1] = 0x03; /* tmp.b = 0x03 */
    ((int8_t *)&tmp)[2] = 0x05; /* tmp.c = 0x05 */
    ((int8_t *)&tmp)[3] = 0x07; /* tmp.d = 0x07 */
    return 0;
}
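Another way to get the index the question asks for, without casting the struct to a byte pointer, is the offsetof macro from <stddef.h>. The sketch below assumes every field is an int8_t and that the compiler inserts no padding between them, so the byte offset equals the field's position. FIELD_INDEX, WRITE_FIELD, ext_mem and ext_write are hypothetical names; ext_write only stands in for whatever the real external-memory write function is.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    int8_t value_at_index0;
    int8_t value_at_index1;
    int8_t value_at_index2;
    int8_t value_at_index3;
    int8_t value_at_index4;
} data;

/* Byte offset of a field; equals its index because every field is one byte
   (assumption: no padding between the int8_t members). */
#define FIELD_INDEX(field) (offsetof(data, field) / sizeof(int8_t))

/* Hypothetical stand-in for the external memory. */
static int8_t ext_mem[sizeof(data)];

/* Hypothetical stand-in for the real write routine. */
static void ext_write(size_t index, int8_t value)
{
    ext_mem[index] = value;
}

/* Writes one field to the slot matching its position in the struct. */
#define WRITE_FIELD(d, field) ext_write(FIELD_INDEX(field), (d).field)
```

With a variable d of type data, WRITE_FIELD(d, value_at_index2) stores d.value_at_index2 at index 2, and the index follows automatically if fields are added or reordered.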

Related

May I assume that struct fields are placed in order and with no padding?

Frankly, is code like this valid, or does it invoke undefined behavior?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
struct two_values
{
int some;
char value;
};
int main(void) {
int some = 5;
char value = 'a';
unsigned char *data = malloc(sizeof(struct two_values));
memcpy(data, &some, sizeof(int));
memcpy(data+sizeof(int), &value, sizeof(char));
struct two_values dest;
memcpy(&dest, data, sizeof(struct two_values));
printf("some = %d, value = %c\n", dest.some, dest.value);
return 0;
}
http://ideone.com/4JbrP9
Can I just put the binary representation of two struct field together and reinterpret this as the whole struct?
You had better not depend on the compiler's internal layout decisions in your code, as that leads to incorrect code and undefined behaviour. You could switch compilers, or just update your favourite one, and run into trouble.
The best way to store your two variables in the struct fields is to use the types C provides and a pointer of the proper type. If you use
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
struct two_values
{
int some;
char value;
};
int main(void) {
int some = 5;
char value = 'a';
/* next instead of unsigned char *data = malloc(sizeof(struct two_values)); */
struct two_values *data = malloc(sizeof(struct two_values));
/* next instead of memcpy(data, &some, sizeof(int)); */
data->some = some;
/* next instead of memcpy(data+sizeof(int), &value, sizeof(char)); */
data->value = value;
struct two_values dest;
/* next instead of memcpy(&dest, data, sizeof(struct two_values)); */
dest = *data;
printf("some = %d, value = %c\n", dest.some, dest.value);
return 0;
}
You'll avoid all compiler alignment issues. It is always possible to do this with the language operators & (address of), * (dereference) and -> (member of the struct pointed to).
Anyway, if you prefer the memcpy approach (I don't see why, but it's your call), you can replace:
data->some = some;
...
data->value = value;
...
dest = *data;
by
memcpy(&data->some, &some, sizeof data->some);
...
memcpy(&data->value, &value, sizeof data->value);
...
memcpy(&dest, data, sizeof dest);
And that way the compiler itself takes care of whatever alignment it has chosen.
All compilers define some pragma or keyword to control alignment and packing. These are nonportable: if you switch compilers, you may have to change the way you expressed things. C11 added _Alignas and _Alignof as standard means to control alignment, but the standard still has no way to request a packed struct. Packing is mainly used when you have to serialize some structure and don't want to deal with holes in it.
Serializing structs is not completely solved by just packing them, as you normally also have to deal with the serialized representations of integer, floating-point or char data (which may or may not coincide with the internal representation used by the compiler). So you again face the problem of being compiler-agnostic, and have to think twice before using the internal representation of data externally.
My recommendation, anyway, is to never trust how the compiler stores data internally.
The padding is determined by the compiler. The order is guaranteed. If you need something similar to your code above, I would recommend the offsetof-macro in <stddef.h>.
memcpy(data + offsetof(struct two_values, value), &value, sizeof(char));
Or without explicitly adding the offset at all:
memcpy(&data->value, &value, sizeof(char));
It depends on how your structure is aligned. You can check with sizeof(struct two_values): if it comes out as 5 (assuming sizeof(int) is 4), you are probably OK.
If it is more than that, filler bytes have been inserted into your structure to align each member at the correct byte boundary.
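The size check and the offsetof suggestion above can be combined into a small layout probe. This is only a sketch: padding_bytes and value_follows_some are hypothetical helper names, and the results depend on the compiler and target.

```c
#include <stddef.h>

struct two_values
{
    int some;
    char value;
};

/* Number of padding bytes in struct two_values: total size minus the
   sizes of its members. Zero means the members are tightly packed. */
static size_t padding_bytes(void)
{
    return sizeof(struct two_values) - (sizeof(int) + sizeof(char));
}

/* Nonzero when `value` sits directly after `some`, which is what
   memcpy(data + sizeof(int), &value, sizeof(char)) in the question relies on. */
static int value_follows_some(void)
{
    return offsetof(struct two_values, value) == sizeof(int);
}
```

Note that the two checks answer different questions: padding at the end of the struct makes padding_bytes() nonzero without moving value, so value_follows_some() is the one that matters for the question's memcpy calls.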
May I assume that struct fields are placed in order
Yes, this is guaranteed by the standard. C11 6.7.2.1/6:
a structure is a type consisting of a sequence of members, whose
storage is allocated in an ordered sequence
and with no padding?
No, you cannot assume this. C11 6.7.2.1/15:
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. /--/
There may be unnamed padding within a structure object, but not at its beginning.
Padding and alignment are implementation-defined behavior.
You are however guaranteed that two structs of the same type have the same padding. Copying from a struct to another struct of same type, as in your example, is safe and well-defined.

Stringize loopcounter to reference members of C struct

I have two structs that look like this:
typedef unsigned char byte;
struct my_struct_2 {
    int type;
    int length; //will be 2 in this case
    byte value1; //MSB
    byte value2; //LSB
};
struct my_struct_4 {
    int type;
    int length; //will be 4 in this case
    byte value1; //MSB
    byte value2;
    byte value3;
    byte value4; //LSB
};
I want to loop through the "value" members in the struct based on "length" to concatenate the byte values into one number. I am wondering if it is possible to use some sort of stringizing so that I can construct a for loop with a structure similar to this:
int value = 0;
for( int i = 1; i <= length; i++)
{
    value = value << 8;
    value = value | my_struct.value#i#;
}
I want to take advantage of the fact that the structs members are sequentially named.
I know that I can accomplish this task by using a pointer to the first value and incrementing the pointer. I am trying to familiarize myself more with stringizing so please refrain from a pointer like solution. Thanks!
There is no way to loop through struct fields by constructing names like that. Once a C program has been compiled, any information about names is gone (ignoring information for the debugger); it's all offsets hard-coded into the machine code.
With some C preprocessor abuse you can get something a bit like looping on names, but it's not pretty and definitely not recommended unless you're just doing it for the challenge of doing it.
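For the curious, the kind of preprocessor trick meant here can be sketched with token pasting (##). The "loop" is unrolled at compile time, so the count has to be fixed in the source and cannot come from the runtime length field. The names ADD_BYTE, CONCAT_2, CONCAT_4 and concat4 are made up for this illustration, and the sketch assumes unsigned is at least 32 bits wide.

```c
typedef unsigned char byte;

struct my_struct_4 {
    int type;
    int length;  /* will be 4 in this case */
    byte value1; /* MSB */
    byte value2;
    byte value3;
    byte value4; /* LSB */
};

/* Pastes the digit n onto the member name: ADD_BYTE(acc, s, 3) reads s.value3
   and appends it below the bytes already collected in acc. */
#define ADD_BYTE(acc, s, n) ((unsigned)(acc) << 8 | (s).value##n)

/* Unrolled by hand; the preprocessor cannot iterate on a runtime count. */
#define CONCAT_2(s) ADD_BYTE(ADD_BYTE(0u, s, 1), s, 2)
#define CONCAT_4(s) ADD_BYTE(ADD_BYTE(CONCAT_2(s), s, 3), s, 4)

/* Concatenates value1..value4 into one number, MSB first. */
static unsigned concat4(const struct my_struct_4 *p)
{
    return CONCAT_4(*p);
}
```

Each struct size needs its own CONCAT_n macro, which is a large part of why this approach is not recommended over a plain byte array.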

How to output a binary file in C without padding bits

I'd like to output a struct's data to a binary file, but without any padding bits between each variable's information. For example:
struct s {
int i1;
short s1;
char c1;
};
struct s example[2];
If I use fwrite(&example, sizeof(struct s), 2, file), the binary file still has the padding bits between, for example, s1 and c1, and also from c1 to i1 (of the 2nd struct).
What would be a good approach to remove those padding bits from the output file ?
Thanks! Any help is appreciated
I would just suggest manually reading/writing the members of the struct individually. Packing using your compiler directives can cause inefficiency and portability issues with unaligned data access. And if you have to deal with endianness, it's easy to support that later when your read operations break down to field members rather than whole structs.
Another thing, and this relates more to future maintenance, is that you don't want your serialization code, or the files people have saved so far, to break if you change the structure a bit (adding new members, or reordering them as a cache-line optimization, for example). You'll potentially run into a lot less pain with code that leaves more breathing room than dumping the memory contents of the struct directly into a file, and it is often worth the effort to serialize your members individually.
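To illustrate the endianness point, here is a minimal sketch that writes each member of struct s in a fixed big-endian byte order, so the file looks the same whatever the host byte order or padding. It assumes int fits in 32 bits and short in 16 bits; write_u32_be, write_u16_be and write_s are hypothetical helper names.

```c
#include <stdint.h>
#include <stdio.h>

struct s {
    int i1;
    short s1;
    char c1;
};

/* Writes v as four bytes, most significant first. Returns 1 on success. */
static int write_u32_be(FILE *f, uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8),  (unsigned char)v
    };
    return fwrite(b, 1, sizeof b, f) == sizeof b;
}

/* Writes v as two bytes, most significant first. Returns 1 on success. */
static int write_u16_be(FILE *f, uint16_t v)
{
    unsigned char b[2] = { (unsigned char)(v >> 8), (unsigned char)v };
    return fwrite(b, 1, sizeof b, f) == sizeof b;
}

/* Writes one struct s as exactly 7 bytes (4 + 2 + 1), with no padding. */
static int write_s(FILE *f, const struct s *p)
{
    return write_u32_be(f, (uint32_t)p->i1)
        && write_u16_be(f, (uint16_t)p->s1)
        && fputc((unsigned char)p->c1, f) != EOF;
}
```

A matching reader would rebuild each value from its bytes in the same order, which keeps the file format independent of the struct's in-memory layout.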
If you want to generalize a pattern and reduce the amount of boilerplate you write, you can do something like this as a basic example to start and build upon:
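#include <assert.h>
#include <stdio.h>
enum { max_fields = 16 }; /* capacity chosen for illustration */
struct Fields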
{
int num;
void* ptrs[max_fields];
int sizes[max_fields];
};
void field_push(struct Fields* fields, void* ptr, int size)
{
assert(fields->num < max_fields);
fields->ptrs[fields->num] = ptr;
fields->sizes[fields->num] = size;
++fields->num;
}
struct Fields s_fields(struct s* inst)
{
struct Fields new_fields;
new_fields.num = 0;
field_push(&new_fields, &inst->i1, sizeof inst->i1);
field_push(&new_fields, &inst->s1, sizeof inst->s1);
field_push(&new_fields, &inst->c1, sizeof inst->c1);
return new_fields;
}
Now you can use this Fields structure with general-purpose functions to read and write members of any struct, like so:
void write_fields(FILE* file, struct Fields* fields)
{
int j=0;
for (; j < fields->num; ++j)
fwrite(fields->ptrs[j], fields->sizes[j], 1, file);
}
This is generally a bit easier to work with than some functional for_each_field kind of approach accepting a callback.
Now all you have to worry about when you create some new struct, S, is to define a single function to output struct Fields from an instance to then enable all those general functions you wrote that work with struct Fields to now work with this new S type automatically.
Many compilers accept a command line parameter which means "pack structures". In addition, many accept a pragma:
#pragma pack(1)
where 1 means byte alignment, 2 means 16-bit word alignment, 4 means 32-bit word alignment, etc.
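The effect of the pragma can be seen by comparing sizeof for the same members with and without packing. This is a sketch only: #pragma pack(push, 1) and #pragma pack(pop) are accepted by GCC, Clang and MSVC but are not standard C, and s_default, s_packed, default_size and packed_size are names invented for the comparison.

```c
#include <stddef.h>

/* Default layout: the compiler may add padding after c1. */
struct s_default {
    int i1;
    short s1;
    char c1;
};

/* Packed layout: members are placed back to back. */
#pragma pack(push, 1)
struct s_packed {
    int i1;
    short s1;
    char c1;
};
#pragma pack(pop)

static size_t default_size(void)
{
    return sizeof(struct s_default);
}

static size_t packed_size(void)
{
    return sizeof(struct s_packed);
}
```

On a compiler that honours the pragma, packed_size() is the plain sum of the member sizes, while default_size() is typically rounded up to a multiple of the alignment of int.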
To make your solution platform independent, you can create a function that writes each field of the struct one at a time, and then call the function to write as many of the structs as needed.
size_t writeStruct(const struct s* obj, size_t count, FILE* file)
{
    size_t i = 0;
    for ( ; i < count; ++i )
    {
        // Stop at the first failed write so the return value stays accurate.
        if (fwrite(&(obj[i].i1), sizeof(obj[i].i1), 1, file) != 1 ||
            fwrite(&(obj[i].s1), sizeof(obj[i].s1), 1, file) != 1 ||
            fwrite(&(obj[i].c1), sizeof(obj[i].c1), 1, file) != 1)
            break;
    }
    // Return the number of structs written to file successfully.
    return i;
}
Usage:
struct s example[2];
writeStruct(example, 2, file);

If only using the first element, do I have to allocate mem for the whole struct?

I have a structure where the first element is tested and dependent on its value the rest of the structure will or will not be read. In the cases where the first element's value dictates that the rest of the structure will not be read, do I have to allocate enough memory for the entire structure or just the first element?
struct element
{
int x;
int y;
};
int foo(struct element* e)
{
if(e->x > 3)
return e->y;
return e->x;
}
in main:
int i = 0;
int z = foo((struct element*)&i);
I assume that if only allocating for the first element is valid, then I will have to be wary of anything that may attempt to copy the structure. i.e. passing the struct to a function.
Don't force your information into structs where it's not needed: don't use the struct as the parameter of your function.
Either pass the member of your struct to the function, or emulate inheritance by embedding one struct in another:
typedef struct {
int foo;
} BaseA;
typedef struct {
int bar;
} BaseB;
typedef struct {
BaseA a;
BaseB b;
} Derived;
void foo(BaseB* info) { ... }
...
Derived d;
foo(&d.b);
BaseB b;
foo(&b);
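Applied to the question's struct element, the "pass the member" advice could look like the sketch below. foo_split is a hypothetical name, and returning *x when no y is supplied is an assumption made for the example, not something the question specifies.

```c
#include <stddef.h>

struct element
{
    int x;
    int y;
};

/* Takes the first member on its own; y may be NULL when the caller
   has nothing beyond x. */
static int foo_split(const int *x, const int *y)
{
    if (*x > 3 && y != NULL)
        return *y;
    return *x;
}

/* Original interface; the caller must pass a complete struct element. */
static int foo(const struct element *e)
{
    return foo_split(&e->x, &e->y);
}
```

A caller holding only an int i can then write foo_split(&i, NULL) without casting &i to a struct pointer.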
If you're just curious (and seriously, don't use this): you can.
typedef struct {
int foo, goo, hoo, joo;
} A;
typedef struct {
int unused, goo;
} B;
int foo(A* a) { return a->goo; }
...
B b;
int goo = foo((A*)&b);
In general you'll have to allocate a block of memory with at least as many bytes as are required to fully read the accessed member with the largest offset in your structure. In addition, when writing to this block you have to make sure to use the same member offsets as in the original structure.
The point being, a structure is only a block of memory with different areas assigned different interpretations (int, char, other structs, etc.), and accessing a member of a struct (once padding and alignment are accounted for) boils down to simply reading from or writing to a bit of memory.
I do not think the code as given is legitimate. To understand why, consider:
struct CHAR_AND_INT { unsigned char c; int i; };
struct CHAR_AND_INT *p;
A compiler would be entitled to assume that p->c will be word-aligned and have whatever padding would be necessary for p->i to also be word-aligned. On some processors, writing a byte may be slower than writing a word. For example, a byte-store instruction may require the processor to read a word from memory, update one byte within it, and write the whole thing back, while a word-store instruction could simply store the new data without having to read anything first. A compiler that knew that p->c would be word-aligned and padded could implement p->c = 12; by using a word store to write the value 12. Such behavior wouldn't yield desired results, however, if the byte following p->c wasn't padding but instead held useful data.
While I would not expect a compiler to impose "special" alignment or padding requirements on any part of the structure shown in the original question (beyond those which apply to int) I don't think anything in the standard would forbid a compiler from doing so.
You only need to check that the structure itself is allocated, not its members (in this case at least).
int foo(struct element* e)
{
if ( e != 0) // check that the e pointer is valid
{
if(e->x != 0) // here you only check to see if x is different than zero (values, not pointers)
return e->y;
}
return 0;
}
As for the code in your edit, I think this is poor coding:
int i = 0;
int z = foo((struct element*)&i);
In that case, i is allocated on the stack, so its address is valid and remains valid inside foo; but since you cast it to a different type, the other members will be garbage (at best).
Why do you want to cast an int into a structure?
What is your intent?

Why does internal Lua strings store the way they do?

I wanted a simple string table that will store a bunch of constants, and I thought "Hey! Lua does that, let me use some of their functions!"
This is mainly in the lstring.h/lstring.c files (I am using 5.2).
I will show the code I am curious about first. It's from lobject.h:
/*
** Header for string value; string bytes follow the end of this structure
*/
typedef union TString {
L_Umaxalign dummy; /* ensures maximum alignment for strings */
struct {
CommonHeader;
lu_byte reserved;
unsigned int hash;
size_t len; /* number of characters in string */
} tsv;
} TString;
/* get the actual string (array of bytes) from a TString */
#define getstr(ts) cast(const char *, (ts) + 1)
/* get the actual string (array of bytes) from a Lua value */
#define svalue(o) getstr(rawtsvalue(o))
As you can see, the data is stored outside of the structure. To get the byte stream, you take the TString pointer and add 1, which skips over the whole header, and you've got the char* pointer.
Isn't this bad coding, though? It's been DRILLED into me in my C classes to make clearly defined structures. I know I might be stirring the nest here, but do you really lose that much speed/space defining a structure as a header for the data rather than defining a pointer value for that data?
The idea is probably that you allocate the header and the data in one big chunk of data instead of two:
TString *str = (TString*)malloc(sizeof(TString) + <length_of_string>);
In addition to having just one call to malloc/free, you also reduce memory fragmentation and improve memory locality.
But to answer your question: yes, this kind of hack is usually bad practice and should be done with extreme care. And if you do it, you'll probably want to hide it under a layer of macros/inline functions.
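A stripped-down sketch of the single-allocation idea, with the pointer arithmetic hidden behind a macro as suggested. StrHeader, hdr_chars and str_new are hypothetical names and the header is far simpler than Lua's TString; only the (h) + 1 step mirrors getstr.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical header, much simpler than Lua's TString. */
typedef struct StrHeader {
    size_t len; /* number of characters, excluding the terminator */
} StrHeader;

/* The characters live directly after the header: (h) + 1 advances the
   pointer by sizeof(StrHeader) bytes, as getstr() does for TString. */
#define hdr_chars(h) ((char *)((h) + 1))

/* One malloc for header and characters; release with a single free(). */
static StrHeader *str_new(const char *src)
{
    size_t len = strlen(src);
    StrHeader *h = malloc(sizeof *h + len + 1);
    if (h == NULL)
        return NULL;
    h->len = len;
    memcpy(hdr_chars(h), src, len + 1); /* copies the terminator too */
    return h;
}
```

Because the header and the characters share one block, a single free(h) releases both.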
As rodrigo says, the idea is to allocate the header and string data as a single chunk of memory. It's worth pointing out that you also see the non-standard hack
struct lenstring {
unsigned length;
char data[0];
};
but C99 added flexible array members so it can be done in a standard compliant way as
struct lenstring {
unsigned length;
char data[];
};
If Lua's string were done in this way it'd be something like
typedef union TString {
L_Umaxalign dummy;
struct {
CommonHeader;
lu_byte reserved;
unsigned int hash;
size_t len;
const char data[];
} tsv;
} TString;
#define getstr(ts) ((ts)->tsv.data)
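For the simpler lenstring struct above, allocating and filling a flexible array member might look like this. It is a sketch only; lenstring_new is a hypothetical helper, not part of Lua.

```c
#include <stdlib.h>
#include <string.h>

struct lenstring {
    unsigned length;
    char data[]; /* flexible array member (C99) */
};

/* sizeof(struct lenstring) does not count data[], so room for the
   characters and the terminator is added explicitly. */
static struct lenstring *lenstring_new(const char *src)
{
    size_t n = strlen(src);
    struct lenstring *s = malloc(sizeof *s + n + 1);
    if (s == NULL)
        return NULL;
    s->length = (unsigned)n;
    memcpy(s->data, src, n + 1); /* copies the terminator too */
    return s;
}
```

Compared with the ((ts) + 1) cast, the member name data makes the access self-documenting, and the compiler takes care of any alignment between the header fields and the array.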
It relates to the complications arising from the more limited C language. In C++, you would just define a base class called GCObject which contains the garbage collection variables; TString would then be a subclass, and by using a virtual destructor, both the TString and its accompanying const char * blocks would be freed properly.
When it comes to writing the same kind of functionality in C, it's a bit more difficult as classes and virtual inheritance do not exist.
What Lua is doing is implementing garbage collection by inserting the header required to manage the garbage collection status of the part of memory following it. Remember that free(void *) does not need to know anything other than the address of the memory block.
#define CommonHeader GCObject *next; lu_byte tt; lu_byte marked
Lua keeps a linked list of these "collectable" blocks of memory, in this case an array of characters, so that it can then free the memory efficiently without knowing the type of object it is pointing to.
If your TString pointed to another block of memory holding the character array, the garbage collector would have to determine the object's type and then delve into its structure to free the string buffer as well.
The pseudo code for this kind of garbage collection would be something like this:
GCHeader *next, *prev = NULL;
GCHeader *current = firstObject;
while (current)
{
    next = current->next;
    if (/* current is ready for deletion */)
    {
        // relink around the deleted object (singly-linked list)
        if (prev)
            prev->next = next;
        else
            firstObject = next; // the head itself was deleted
        free(current);
    }
    else
        prev = current; // store previous undeleted object
    current = next;
}
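A compilable variant of the same sweep, under simplifying assumptions: the GCHeader layout, the marked flag and the gc_alloc helper are invented for this sketch and are not Lua's actual definitions. It uses a pointer to the previous link instead of a prev pointer, so deleting the head needs no special case.

```c
#include <stdlib.h>

/* Hypothetical header placed in front of every collectable block. */
typedef struct GCHeader {
    struct GCHeader *next;
    unsigned char marked; /* nonzero: still in use */
} GCHeader;

/* Allocates header plus payload in one block and pushes it on the list. */
static GCHeader *gc_alloc(GCHeader **head, size_t payload, int marked)
{
    GCHeader *h = malloc(sizeof *h + payload);
    if (h == NULL)
        return NULL;
    h->next = *head;
    h->marked = (unsigned char)(marked != 0);
    *head = h;
    return h;
}

/* Frees every unmarked block and returns the new list head. The sweep
   never needs to know what kind of payload follows each header. */
static GCHeader *sweep(GCHeader *head)
{
    GCHeader **link = &head;
    while (*link != NULL) {
        GCHeader *current = *link;
        if (!current->marked) {
            *link = current->next; /* unlink, then release */
            free(current);
        } else {
            link = &current->next;
        }
    }
    return head;
}
```

Since the payload sits in the same allocation as the header, free(current) releases the string bytes along with it, which is the point made above.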
