Manually zeroing variables VS copying struct - c

I have a long C (not C++) struct. It is used to control entities in a game, with position, some behavior data, nothing flashy, except for two strings. The struct is global.
Right now, whenever an object is initialized, I set every value to its default, one by one:
myobjects[index].angle = 0; myobjects[index].speed = 0;
and so on. It doesn't really feel slow, but I am wondering whether copying a "template" struct with all values set to the defaults would be faster or more convenient.
So, to sum up into two proper questions: Is it faster to copy a struct instead of manually setting all the data?
What should I keep in mind about the malloc-ed memory for the strings?

"More convenient" is likely the more important part.
struct s vs[...];
...
// initialize every member to its "default" (zero)
// this initializer syntax can't be used in a normal assignment
struct s newv = {0};
vs[...] = newv;
Or hide the "initialization details" behind an init-function (or a macro, if you dislike maintainable code :-) ):
struct s* init_s (struct s* v, ...) { /* and life goes on */ }
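To make the "template struct plus init function" idea concrete, here is a minimal sketch. The entity type, its fields, and the entity_init / entity_destroy names are all hypothetical; the question does not show the real struct. It assumes C99 (designated initializers) and that each entity owns its malloc-ed string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entity type; field names are illustrative only. */
struct entity {
    float angle;
    float speed;
    int   health;
    char *name;   /* heap-allocated, owned by the entity */
};

/* One read-only template holding every default value. */
static const struct entity entity_defaults = {
    .angle = 0.0f, .speed = 0.0f, .health = 100, .name = NULL
};

/* Reset *e to the defaults, then give it a private copy of name.
 * Returns e on success, NULL if the allocation failed. */
struct entity *entity_init(struct entity *e, const char *name)
{
    *e = entity_defaults;              /* plain struct assignment */
    if (name != NULL) {
        size_t n = strlen(name) + 1;
        e->name = malloc(n);
        if (e->name == NULL)
            return NULL;
        memcpy(e->name, name, n);
    }
    return e;
}

/* Release the string before the slot is reused. */
void entity_destroy(struct entity *e)
{
    free(e->name);
    e->name = NULL;
}
```

The point about the strings: assigning the template overwrites the pointer member, so any string the slot already owns should be freed (here via entity_destroy) before the slot is re-initialized, otherwise it leaks.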

You may use this sequence:
memset(myobjects+index,0,sizeof(myobjects[0]));
if all you need is to set all members to zero
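Beware: this sets every byte to zero. On common platforms that turns pointer members into NULL and floating-point members into 0.0, but the C standard does not guarantee either.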
Nelu Cozac

Related

Can I access a member of a struct with a pointer to a struct which contains a pointer to that struct

I am trying to access the members in the struct tCAN_MESSAGE. What I think would work is like the first example in main, i.e. some_ptr->canMessage_ptr->value = 10;. But I have some code that someone else has written, and what I can see is that that person has used some_ptr->canMessage_ptr[i].value;.
Is it possible to do it the first way? We are using pointers to structs which contain a pointer to another struct (like the example below) quite often, but I never see the use of ptr1->ptr2->value.
typedef struct
{
int value1;
int value2;
int value3;
float value4;
}tCAN_MESSAGE;
typedef struct
{
tCAN_MESSAGE *canMessage_ptr;
}tSOMETHING;
int main(void)
{
tCAN_MESSAGE var_canMessage;
tSOMETHING var_something;
tSOMETHING *some_ptr = &var_something;
int i = 0; //index used by the loop in the other code
some_ptr->canMessage_ptr = &var_canMessage;
some_ptr->canMessage_ptr->value1 = 10; //is this valid?
//I have some code that is doing this, and iterating through it with a for:
some_ptr->canMessage_ptr[i].value1; //Is this valid?
return 0;
}
It's very simple: every pointer has to be set to point at a valid memory location before use. If it isn't, you can't use it. You cannot "store data inside pointers". See this:
Crash or "segmentation fault" when data is copied/scanned/read to an uninitialized pointer
None of your code is valid. some_ptr isn't set to point anywhere, so it cannot be accessed, nor can its members. Similarly, some_ptr->canMessage_ptr isn't set to point anywhere either.
I am trying to access the members in the struct tCAN_MESSAGE. What I
think would work is like the first example in main, i.e.
some_ptr->canMessage_ptr->value = 10;. But I have some code that
someone else have written and what I can see is that that person have
used some_ptr->canMessage_ptr[i].value;. Is it possible to do it the
first way?
The expression
some_ptr->canMessage_ptr[i].value
is 100% equivalent to
(*(some_ptr->canMessage_ptr + i)).value
which in turn is 100% equivalent to
(some_ptr->canMessage_ptr + i)->value
When i is 0, that is of course equivalent to
some_ptr->canMessage_ptr->value
So yes, it is possible to use some_ptr->canMessage_ptr->value as long as the index in question is 0. If the index is always 0 then chaining arrow operators as you suggest is good style. Otherwise, the mixture of arrow and indexing operators that you see in practice would be my style recommendation.
We are using pointers to structs which contains pointer to
another struct (like the example below) quite often, but I never see
the use of ptr1->ptr2->value ?
I'm inclined to suspect that you do not fully understand what you're working with. Usage of the form some_ptr->canMessage_ptr[i].value suggests that your tSOMETHING type contains a pointer to the first element of an array of possibly many tCAN_MESSAGEs, which is a subtle but important distinction to make. In that case, yes, as shown above, you can chain arrow operators to access the first element of such an array (at index 0). However, the cleanest syntax for accessing other elements of that array is to use the indexing operator, and it pays to be consistent.
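The two access styles can be put side by side in a short sketch. The count field and both function names are hypothetical additions for illustration; the original tSOMETHING only holds the pointer.

```c
#include <stddef.h>

typedef struct {
    int   value1;
    int   value2;
    int   value3;
    float value4;
} tCAN_MESSAGE;

typedef struct {
    tCAN_MESSAGE *canMessage_ptr;  /* points at the first of `count` messages */
    size_t        count;           /* hypothetical field, not in the original */
} tSOMETHING;

/* Sum value1 over all messages using the index form. */
int sum_value1(const tSOMETHING *s)
{
    int total = 0;
    for (size_t i = 0; i < s->count; i++)
        total += s->canMessage_ptr[i].value1;
    return total;
}

/* The first element, reached by chaining arrow operators. */
int first_value1(const tSOMETHING *s)
{
    return s->canMessage_ptr->value1;   /* same as canMessage_ptr[0].value1 */
}
```

Both functions read through the same pointer; the chained arrow form simply cannot reach anything but element 0, which is probably why the existing code base sticks to indexing.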

Why use address of first element of struct, rather than struct itself?

I've just come upon yet another code base at work where developers consistently use the address of the first element of structs when copying/comparing/setting, rather than the struct itself. Here's a simple example.
First there's a struct type:
typedef struct {
int a;
int b;
} foo_t;
Then there's a function that makes a copy of such a struct:
void bar(foo_t *inp)
{
foo_t l;
...
memcpy(&l.a, &inp->a, sizeof(foo_t));
...
}
I wouldn't myself write a call to memcpy in that way and I started out with suspecting that the original developers simply didn't quite grasp pointers and structs in C. However, now I've seen this in two unrelated code bases, with no common developers so I'm starting to doubt myself.
Why would one want to use this style?
Nobody should do that.
If you rearrange struct members you are in trouble.
Instead of that:
memcpy(&l.a, &inp->a, sizeof(foo_t));
you can do that:
memcpy(&l, inp, sizeof(foo_t));
While it can be dangerous and misleading, both statements actually do the same thing here as C guarantees there is no padding before the first structure member.
But the best is just to copy the structure objects using a simple assignment operator:
l = *inp;
Why would one want to use this style?
My guess: ignorance or bad discipline.
One wouldn't. If you ever moved a in the struct or you inserted member(s) before it, you would introduce a memory smashing bug.
This code is unsafe because rearranging the members of the struct can result in the memcpy accessing beyond the bounds of the struct if member a is no longer the first member.
However, it's conceivable that members are intentionally ordered within the struct and programmer only wants to copy a subset of them, beginning with member a and running until the end of the struct. If that's the case then the code can be made safe with the following change:
memcpy(&l.a, &inp->a, sizeof(foo_t) - offsetof(foo_t, a));
Now the struct members may be rearranged into any order and this memcpy will never go out of bounds.
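A small sketch of that partial copy, assuming a hypothetical layout in which an id member precedes a (the original foo_t has no such member) and a hypothetical helper name copy_from_a. It uses byte pointers derived from the whole struct, which is the same size calculation as above written a little more defensively.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: `id` is kept, everything from `a` onward is copied. */
typedef struct {
    int id;
    int a;
    int b;
} foo_t;

/* Copy the tail of *src, starting at member a, into *dst. */
void copy_from_a(foo_t *dst, const foo_t *src)
{
    size_t off = offsetof(foo_t, a);   /* bytes that come before member a */
    memcpy((char *)dst + off, (const char *)src + off, sizeof(foo_t) - off);
}
```

After the call, dst keeps its own id while a and b come from src, and moving a elsewhere in the struct changes what gets copied but never the bounds.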
Actually, there is one legitimate use case for this: constructing a class hierarchy.
When treating structs as class instances, the first member (i.e. offset 0) will typically be the supertype instance... if a supertype exists. This allows a simple cast to move between using the subtype vs. the supertype. Very useful.
On Darren Stone's note about intention, this is expected when executing OO in the C language.
In any other case, I would suggest avoiding this pattern and accessing the member directly instead, for reasons already cited.
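For readers who have not seen the class-hierarchy idiom, a minimal sketch follows. The shape and circle types and both function names are made up for illustration; the only thing relied on is the guarantee that a pointer to a struct, suitably converted, points to its first member.

```c
/* Hypothetical "base class". */
struct shape {
    int x;
    int y;
};

/* Hypothetical "subclass": the base must be the first member. */
struct circle {
    struct shape base;
    int radius;
};

/* Works on any object whose first member is a struct shape. */
void shape_move(struct shape *s, int dx, int dy)
{
    s->x += dx;
    s->y += dy;
}

/* Upcast: a pointer to the struct converts to a pointer to its first member. */
struct shape *circle_as_shape(struct circle *c)
{
    return (struct shape *)c;     /* equivalent to &c->base */
}
```

Writing &c->base instead of the cast gives the same address and survives a reordering of the members, so the cast is only really needed for the downcast direction.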
It's a really bad habit. The struct might have another member prepended, for example. This is an insanely careless habit and I am surprised to read that anyone would do this.
Others have already noted these; the one that bugs me is this:
struct Foo rgFoo [3];
struct Foo *pfoo = &rgFoo [0];
instead of
struct Foo *pfoo = rgFoo;
Why deref the array by index and then take the address again? The array name already converts to the address of its first element; the only difference of note is that the array name cannot be reassigned, so it behaves like
struct Foo *const,
not
struct Foo *.
Yet I used to see the first one all the time.

Copying one structure to another

I know that I can copy the structure member by member, instead of that can I do a memcpy on structures?
Is it advisable to do so?
In my structure, I have a string also as member which I have to copy to another structure having the same member. How do I do that?
Copying by plain assignment is best, since it's shorter, easier to read, and has a higher level of abstraction. Instead of saying (to the human reader of the code) "copy these bits from here to there", and requiring the reader to think about the size argument to the copy, you're just doing a plain assignment ("copy this value from here to here"). There can be no hesitation about whether or not the size is correct.
Also, if the structure is heavily padded, assignment might let the compiler emit something more efficient, since it knows where the padding is and doesn't have to copy it, whereas memcpy() will always copy the exact number of bytes you tell it to copy.
If your string is an actual array, i.e.:
struct {
char string[32];
size_t len;
} a, b;
strcpy(a.string, "hello");
a.len = strlen(a.string);
Then you can still use plain assignment:
b = a;
This gives a complete copy. For variable-length data modelled like this, though, it is not the most efficient way to copy, since the entire array is always copied.
Beware though, that copying structs that contain pointers to heap-allocated memory can be a bit dangerous, since by doing so you're aliasing the pointer, and typically making it ambiguous who owns the pointer after the copying operation.
For these situations a "deep copy" is really the only choice, and that needs to go in a function.
Since C90, you can simply use:
dest_struct = source_struct;
as long as the string is stored inside an array:
struct xxx {
char theString[100];
};
Otherwise, if it's a pointer, you'll need to copy it by hand.
struct xxx {
char* theString;
};
dest_struct = source_struct;
dest_struct.theString = malloc(strlen(source_struct.theString) + 1);
strcpy(dest_struct.theString, source_struct.theString);
If the structures are of compatible types, yes, you can, with something like:
memcpy (dest_struct, source_struct, sizeof (*dest_struct));
The only thing you need to be aware of is that this is a shallow copy. In other words, if you have a char * pointing to a specific string, both structures will point to the same string.
And changing the contents of one of those string fields (the data that the char * points to, not the char * itself) will change the other as well.
If you want a easy copy without having to manually do each field but with the added bonus of non-shallow string copies, use strdup:
memcpy (dest_struct, source_struct, sizeof (*dest_struct));
dest_struct->strptr = strdup (source_struct->strptr);
This will copy the entire contents of the structure, then deep-copy the string, effectively giving a separate string to each structure.
And, if your C implementation doesn't have a strdup (it comes from POSIX and only joined the ISO C standard with C23), it is easy to write your own with malloc and strcpy.
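Putting the shallow copy and the string duplication together, a deep copy might look like the sketch below. The record type, dup_string, and record_copy are hypothetical names; dup_string is a stand-in for strdup so that the example does not depend on POSIX.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with one owned string. */
struct record {
    int   id;
    char *strptr;
};

/* Portable stand-in for strdup(); returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Deep copy: returns 0 on success, -1 if the string could not be allocated. */
int record_copy(struct record *dst, const struct record *src)
{
    *dst = *src;                         /* shallow copy of every member */
    if (src->strptr != NULL) {
        dst->strptr = dup_string(src->strptr);
        if (dst->strptr == NULL)
            return -1;
    }
    return 0;
}
```

Each copy then owns its string and must free it separately; on failure the destination's strptr is NULL rather than an alias of the source's string.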
You can memcpy structs, or you can just assign them like any other value.
struct {int a, b;} c, d;
c.a = c.b = 10;
d = c;
In C, memcpy is only foolishly risky. As long as you get all three parameters exactly right, none of the struct members are pointers (or, you explicitly intend to do a shallow copy) and there aren't large alignment gaps in the struct that memcpy is going to waste time looping through (or performance never matters), then by all means, memcpy. You gain nothing except code that is harder to read, fragile to future changes and has to be hand-verified in code reviews (because the compiler can't), but hey yeah sure why not.
In C++, we advance to the ludicrously risky. You may have members of types which are not safely memcpyable, like std::string, which will cause your receiving struct to become a dangerous weapon, randomly corrupting memory whenever used. You may get surprises involving virtual functions when emulating slice-copies. The optimizer, which can do wondrous things for you because it has a guarantee of full type knowledge when it compiles =, can do nothing for your memcpy call.
In C++ there's a rule of thumb - if you see memcpy or memset, something's wrong. There are rare cases when this is not true, but they do not involve structs. You use memcpy when, and only when, you have reason to blindly copy bytes.
Assignment on the other hand is simple to read, checks correctness at compile time and then intelligently moves values at runtime. There is no downside.
You can use the following solution to accomplish your goal:
#include <stdio.h>
struct student
{
char name[20];
char country[20];
};
int main(void)
{
struct student S={"Wolverine","America"};
struct student X;
X=S; /* plain assignment copies both arrays */
printf("%s %s\n",X.name,X.country);
return 0;
}
1. You can use a struct to read from and write to a file.
You do not need to cast it to char*.
The struct size will also be preserved.
(This point is not the closest to the topic, but note that
working with on-disk storage is often similar to working with RAM.)
2. To move (to and from) a single string field you must use strncpy
and a transient string buffer terminated with '\0'.
Somewhere you must remember the length of the record's string field.
3. To move other fields you can use member access notation, e.g.:
NodeB->one=intvar;
floatvar2=(NodeA->insidebisnode_subvar).myfl;
struct mynode {
int one;
int two;
char txt3[3];
struct{char txt2[6];}txt2fi;
struct insidenode{
char txt[8];
long int myl;
void * mypointer;
size_t myst;
long long myll;
} insidenode_subvar;
struct insidebisnode{
float myfl;
} insidebisnode_subvar;
} mynode_subvar;
typedef struct mynode* Node;
...(main)
Node NodeA=malloc...
Node NodeB=malloc...
4. You can embed each string into a struct that fits it,
to avoid point 2 and behave like Cobol:
NodeB->txt2fi=NodeA->txt2fi
...but you will still need a transient string
plus one strncpy, as mentioned in point 2, for scanf and printf;
otherwise a longer operator input would not be truncated
(and a shorter one would not be padded with spaces).
5. (NodeB->insidenode_subvar).mypointer=(NodeA->insidenode_subvar).mypointer
will create a pointer alias.
6. NodeB->txt3=NodeA->txt3
causes the compiler to reject the code:
error: incompatible types when assigning to type ‘char[3]’ from type ‘char *’
7. Point 4 works only because NodeB->txt2fi and NodeA->txt2fi have the same struct type!
8. A correct and simple answer to this topic I found at
In C, why can't I assign a string to a char array after it's declared?
"Arrays (also of chars) are second-class citizens in C"!!!

Accessing array as a struct *

This is one of those "I think this should work, but it's best to check" questions. It compiles and works fine on my machine.
Is this guaranteed to do what I expect (i.e. allow me to access the first few elements of the array with a guarantee that the layout, alignment, padding etc of the struct is the same as the array)?
struct thingStruct
{
int a;
int b;
int c;
};
void f()
{
int thingsArray[5];
struct thingStruct *thingsStruct = (struct thingStruct *)&thingsArray[0];
thingsArray[0] = 100;
thingsArray[1] = 200;
thingsArray[2] = 300;
printf("%d", thingsStruct->a);
printf("%d", thingsStruct->b);
printf("%d", thingsStruct->c);
}
EDIT: Why on earth would I want to do something like this? I have an array which I'm mmapping to a file. I'm treating the first part of the array as a 'header', which stores various pieces of information about the array, and the rest of it I'm treating as a normal array. If I point the struct to the start of the array I can access the pieces of header data as struct members, which is more readable. All the members in the struct would be of the same type as the array.
While I have seen this done frequently, you cannot (meaning it is not legal, standard C) make assumptions about the binary layout of a structure, as it may have padding between fields.
This is explained in the comp.lang.c FAQ: http://c-faq.com/struct/padding.html
Although it's likely to work in most places, it's still a bit iffy. If you want to give symbolic names to parts of the header, why not just do:
enum { HEADER_A, HEADER_B, HEADER_C };
/* ... */
printf("%d", thingsArray[HEADER_A]);
printf("%d", thingsArray[HEADER_B]);
printf("%d", thingsArray[HEADER_C]);
As Evan commented on the question, this will probably work in most cases (again, probably best if you use #pragma pack to ensure there is no padding), assuming all the types in your struct are the same type as your array. Given the rules of C, this is legal.
My question to you is "why?" This isn't a particularly safe thing to do. If a float gets thrown into the middle of the struct, this all falls apart. Why not just use the struct directly? This really isn't a technique that I'd recommend in most cases.
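If the struct view is still wanted, one way to catch a layout surprise at build time is a set of static assertions, combined with copying the header out instead of casting the array pointer. This is only a sketch: it assumes a C11 compiler for _Static_assert, and read_header is a hypothetical helper, not something from the question.

```c
#include <stddef.h>
#include <string.h>

struct thingStruct {
    int a;
    int b;
    int c;
};

/* Compile-time checks (C11): the build fails if the compiler pads the struct. */
_Static_assert(offsetof(struct thingStruct, b) == 1 * sizeof(int), "padding before b");
_Static_assert(offsetof(struct thingStruct, c) == 2 * sizeof(int), "padding before c");
_Static_assert(sizeof(struct thingStruct) == 3 * sizeof(int), "trailing padding");

/* Copy the first three ints of the array into a struct. */
struct thingStruct read_header(const int *array)
{
    struct thingStruct h;
    memcpy(&h, array, sizeof h);
    return h;
}
```

The copy is a snapshot, so writes to the struct do not reach the mapped array; writing the header back would need a matching memcpy in the other direction.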
Another solution for representing a header and the rest of file data is using a structure like this:
struct header {
long headerData1;
int headerData2;
int headerData3;
int fileData[ 1 ]; // <- data begin here
};
Then you allocate a memory block holding the file contents and cast it to struct header *myFileHeader (or map the memory block onto a file) and access all your file data with
myFileHeader->fileData[ position ]
for an arbitrarily large position. The compiler does not check the index value, so it is your responsibility to keep position within the actual size of the memory block you allocated (or the mapped file's size).
One more important note: apart from switching off struct member padding, which others have already described, you should carefully choose the data types of the header members so that they match the actual file layout regardless of the compiler you use (so that, say, an int does not change from 32 to 64 bits).
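Since C99 the same layout can be written with a flexible array member instead of a one-element array, which makes indexing past the declared size well defined as long as the memory was actually allocated. The sketch below is an assumption-laden variant of the header above: the fixed-width types, the count field, and both function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header; fixed-width types keep the layout stable across compilers. */
struct file_header {
    int32_t headerData1;
    int32_t headerData2;
    int32_t count;        /* number of elements in fileData (assumed field) */
    int32_t fileData[];   /* C99 flexible array member: data begins here */
};

/* Allocate a zeroed header followed by room for `count` data elements. */
struct file_header *file_header_alloc(int32_t count)
{
    struct file_header *h;
    if (count < 0)
        return NULL;
    h = calloc(1, sizeof *h + (size_t)count * sizeof h->fileData[0]);
    if (h != NULL)
        h->count = count;
    return h;
}

/* Bounds-checked read; returns 0 on success, -1 if position is out of range. */
int file_header_get(const struct file_header *h, int32_t position, int32_t *out)
{
    if (position < 0 || position >= h->count)
        return -1;
    *out = h->fileData[position];
    return 0;
}
```

Storing the element count in the header is what lets the accessor refuse an out-of-range position instead of leaving that entirely to the caller.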

How to dynamically create and read structs in C?

How can I do something like that (just an example):
any_struct *my_struct = create_struct();
add_struct_member(my_struct, "a", int_member);
add_struct_member(my_struct, "b", float_member);
So that I could load and use a struct instance "from the outside" (at the address addressOfMyStruct) with the given structure here?
any_struct_instance *instance = instance(my_struct, addressOfMyStruct);
int a = instance_get_member(instance, "a");
float b = instance_get_member(instance, "b");
I would also like to be able to create struct instances dynamically this way.
I hope it's clear what I want to do. I know that C/Invoke is able to do it, but is there a separate library to do that?
Actually demonstrating the code to make this work in C is a bit too involved for an SO post. But explaining the basic concept is doable.
What you're really creating here is a templated property bag system. The one thing you'll need a lot of to keep this going is some associative structure like a hash table. I'd say go with std::map, but you mentioned this was a C-only solution. For the sake of discussion I'm just going to assume you have some sort of hash table available.
The "create_struct" call will need to return a structure which contains a pointer to a hash table which maps a const char* to, essentially, a size_t. This map defines what you need in order to create a new instance of the struct.
The "instance" method will essentially create a new hash table with the same number of members as the template hash table. Let's throw lazy evaluation out the window for a second and assume you create all members up front. The method will need to loop over the template hash table, adding a member for every entry and malloc'ing a memory chunk of the specified size.
The implementation of instance_get_member will simply do a lookup in the map by name. The signature and usage pattern will need to change, though. C does not support templates, so you must choose a common return type that can represent all data. In this case you'll need to choose void*, since that's how the memory will need to be stored.
void* instance_get_member(any_struct_instance* inst, const char* name);
You can make this a bit better by adding an evil macro to simulate templates:
#define instance_get_member2(inst, name, type) \
*((type*)instance_get_member((inst),(name)))
...
int i = instance_get_member2(pInst,"a", int);
You've gone so far defining the problem that all that's left is a bit of (slightly tricky in some parts) implementation. You just need to keep track of the information:
typedef struct {
fieldType type;
char name[NAMEMAX];
/* anything else */
} meta_struct_field;
typedef struct {
unsigned num_fields;
meta_struct_field *fields;
/* anything else */
} meta_struct;
Then create_struct() allocates memory for a meta_struct and initializes it to 0, and add_struct_member() does a malloc()/realloc() on my_struct->fields and increments my_struct->num_fields. The rest follows in the same vein.
You'll also want a union in meta_struct_field to hold actual values in instances.
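A rough sketch of how those two functions and a by-name lookup could fit together is below. Much of it is assumption: the fieldType values, the offset and size bookkeeping, the linear search in place of a hash table, the rule that a field's alignment equals its size, and the meta_get_member name and signature are all mine, chosen to keep the example short.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAMEMAX 32

typedef enum { FIELD_INT, FIELD_FLOAT } fieldType;

typedef struct {
    fieldType type;
    char      name[NAMEMAX];
    size_t    offset;            /* byte offset inside an instance (assumed) */
} meta_struct_field;

typedef struct {
    unsigned           num_fields;
    meta_struct_field *fields;
    size_t             size;     /* total instance size so far (assumed) */
} meta_struct;

static size_t field_size(fieldType t)
{
    return t == FIELD_INT ? sizeof(int) : sizeof(float);
}

/* Allocate an empty, zero-initialized description. */
meta_struct *create_struct(void)
{
    return calloc(1, sizeof(meta_struct));
}

/* Append a field; returns 0 on success, -1 on failure. */
int add_struct_member(meta_struct *ms, const char *name, fieldType type)
{
    size_t sz = field_size(type);
    meta_struct_field *grown;

    if (strlen(name) >= NAMEMAX)
        return -1;
    grown = realloc(ms->fields, (ms->num_fields + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    ms->fields = grown;

    /* Round the offset up to the field size, assumed equal to its alignment. */
    ms->size = (ms->size + sz - 1) / sz * sz;
    grown[ms->num_fields].type = type;
    grown[ms->num_fields].offset = ms->size;
    strcpy(grown[ms->num_fields].name, name);
    ms->size += sz;
    ms->num_fields++;
    return 0;
}

/* Look up a member by name inside the instance at base; NULL if not found. */
void *meta_get_member(const meta_struct *ms, void *base, const char *name)
{
    for (unsigned i = 0; i < ms->num_fields; i++)
        if (strcmp(ms->fields[i].name, name) == 0)
            return (char *)base + ms->fields[i].offset;
    return NULL;
}
```

The caller still has to cast the returned void* to the right type, so the type tag stored in each field is there to let a real implementation check that cast; matching an externally defined struct also depends on the offset rule agreeing with the compiler that produced it.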
I did some of this a long time ago.
The way I did it was to generate code containing the struct definition, plus all routines for accessing it and then compile and link it into a DLL "on the fly", then load that DLL dynamically.
