I know that I can copy the structure member by member, instead of that can I do a memcpy on structures?
Is it advisable to do so?
In my structure, I have a string also as member which I have to copy to another structure having the same member. How do I do that?
Copying by plain assignment is best, since it's shorter, easier to read, and has a higher level of abstraction. Instead of saying (to the human reader of the code) "copy these bits from here to there", and requiring the reader to think about the size argument to the copy, you're just doing a plain assignment ("copy this value from here to here"). There can be no hesitation about whether or not the size is correct.
Also, if the structure is heavily padded, assignment might let the compiler emit something more efficient, since it knows where the padding is and does not have to copy it, whereas memcpy() has no such knowledge and will always copy exactly the number of bytes you tell it to copy.
If your string is an actual array, i.e.:
struct {
char string[32];
size_t len;
} a, b;
strcpy(a.string, "hello");
a.len = strlen(a.string);
Then you can still use plain assignment:
b = a;
To get a complete copy. For variable-length data modelled like this though, this is not the most efficient way to do the copy since the entire array will always be copied.
Beware though, that copying structs that contain pointers to heap-allocated memory can be a bit dangerous, since by doing so you're aliasing the pointer, and typically making it ambiguous who owns the pointer after the copying operation.
For these situations a "deep copy" is really the only choice, and that needs to go in a function.
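A minimal sketch of such a deep-copy function, assuming a hypothetical struct record that owns one heap-allocated string. The names record, record_copy and record_free are illustrative only and do not come from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct that owns a heap-allocated string. */
struct record {
    char *name;   /* owned by the struct; released by record_free() */
    int id;
};

/* Deep-copy *src into *dst. Returns 0 on success, -1 if allocation fails. */
int record_copy(struct record *dst, const struct record *src)
{
    assert(dst != NULL && src != NULL);

    char *name = NULL;
    if (src->name != NULL) {
        size_t n = strlen(src->name) + 1;   /* include the '\0' */
        name = malloc(n);
        if (name == NULL)
            return -1;
        memcpy(name, src->name, n);
    }
    *dst = *src;        /* plain assignment copies the non-pointer members */
    dst->name = name;   /* replace the aliased pointer with a private copy */
    return 0;
}

/* Release the string owned by *r. */
void record_free(struct record *r)
{
    free(r->name);
    r->name = NULL;
}
```

With this split, each struct owns its own string, so it is unambiguous who must free what.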
Since C90, you can simply use:
dest_struct = source_struct;
as long as the string is stored inside an array:
struct xxx {
char theString[100];
};
Otherwise, if it's a pointer, you'll need to copy it by hand.
struct xxx {
char* theString;
};
dest_struct = source_struct;
dest_struct.theString = malloc(strlen(source_struct.theString) + 1);
strcpy(dest_struct.theString, source_struct.theString);
If the structures are of compatible types, yes, you can, with something like:
memcpy (dest_struct, source_struct, sizeof (*dest_struct));
The only thing you need to be aware of is that this is a shallow copy. In other words, if you have a char * pointing to a specific string, both structures will point to the same string.
And changing the contents of one of those string fields (the data that the char * points to, not the char * itself) will change the other as well.
If you want an easy copy without having to manually do each field but with the added bonus of non-shallow string copies, use strdup:
memcpy (dest_struct, source_struct, sizeof (*dest_struct));
dest_struct->strptr = strdup (source_struct->strptr);
This will copy the entire contents of the structure, then deep-copy the string, effectively giving a separate string to each structure.
And, if your C implementation doesn't have a strdup (it's not part of the ISO standard), get one from here.
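If no strdup is available, a replacement is short enough to write by hand. The following is only a sketch; dup_string is a hypothetical name chosen to avoid clashing with a library strdup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(); the caller must free() the result. */
char *dup_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;                /* NULL if the allocation failed */
}
```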
You can memcpy structs, or you can just assign them like any other value.
struct {int a, b;} c, d;
c.a = c.b = 10;
d = c;
In C, memcpy is only foolishly risky. As long as you get all three parameters exactly right, none of the struct members are pointers (or, you explicitly intend to do a shallow copy) and there aren't large alignment gaps in the struct that memcpy is going to waste time looping through (or performance never matters), then by all means, memcpy. You gain nothing except code that is harder to read, fragile to future changes and has to be hand-verified in code reviews (because the compiler can't), but hey yeah sure why not.
In C++, we advance to the ludicrously risky. You may have members of types which are not safely memcpyable, like std::string, which will cause your receiving struct to become a dangerous weapon, randomly corrupting memory whenever used. You may get surprises involving virtual functions when emulating slice-copies. The optimizer, which can do wondrous things for you because it has a guarantee of full type knowledge when it compiles =, can do nothing for your memcpy call.
In C++ there's a rule of thumb - if you see memcpy or memset, something's wrong. There are rare cases when this is not true, but they do not involve structs. You use memcpy when, and only when, you have reason to blindly copy bytes.
Assignment on the other hand is simple to read, checks correctness at compile time and then intelligently moves values at runtime. There is no downside.
You can use the following solution to accomplish your goal:
#include <stdio.h>
struct student
{
    char name[20];
    char country[20];
};
int main(void)
{
    struct student S={"Wolverine","America"};
    struct student X;
    X=S; /* copies both arrays */
    printf("%s %s\n",X.name,X.country);
    return 0;
}
You can use a struct to read and write records in a file.
You do not need to cast it to char *.
Struct size will also be preserved.
(This point is not closely related to the topic, but note that
storage on disk often behaves much like storage in RAM.)
To move a single string field (to or from the struct) you must use strncpy
and a temporary string buffer terminated with '\0'.
Somewhere you must keep track of the length of the record's string field.
To move other fields you can use the dot notation, ex.:
NodeB->one=intvar;
floatvar2=(NodeA->insidebisnode_subvar).myfl;
struct mynode {
int one;
int two;
char txt3[3];
struct{char txt2[6];}txt2fi;
struct insidenode{
char txt[8];
long int myl;
void * mypointer;
size_t myst;
long long myll;
} insidenode_subvar;
struct insidebisnode{
float myfl;
} insidebisnode_subvar;
} mynode_subvar;
typedef struct mynode* Node;
...(main)
Node NodeA=malloc...
Node NodeB=malloc...
You can embed each string in a struct of its own,
to avoid the strncpy step described above and behave like COBOL:
NodeB->txt2fi=NodeA->txt2fi;
...but you will still need a temporary string
plus one strncpy, as mentioned above, for scanf and printf;
otherwise operator input that is longer (or shorter) than the field
would not be truncated (or padded with spaces).
(NodeB->insidenode_subvar).mypointer=(NodeA->insidenode_subvar).mypointer
will create a pointer alias.
NodeB->txt3=NodeA->txt3
causes the compiler to reject:
error: incompatible types when assigning to type ‘char[3]’ from type ‘char *’
The txt2fi assignment above works only because NodeB->txt2fi and NodeA->txt2fi have the same struct type!
A correct and simple answer to this topic I found at
In C, why can't I assign a string to a char array after it's declared?
"Arrays (also of chars) are second-class citizens in C"!!!
So I know if you have two strings you need strcpy to assign one to the other, you can't use =.
But say you have a structure like
typedef struct{
char name[15];
int age;}person;
and you have
person q,f;
q=f;
this will assign all the fields of f to q. But there's a string in there f.name. How does that work? Shouldn't the string in there cause a problem. I'm new to coding and that's confusing me a bit.
Each field is populated with a value equal to the corresponding field in the source structure.
For example, it could possibly be implemented using
memcpy(&q, &f, sizeof q);
This means the following is perfectly legit:
#include <stdio.h>
typedef struct {
char name[15];
int age;
} Person;
int main(void) {
Person f = { "abc", 30 };
Person q;
q = f;
printf("%s\n", q.name); // abc
printf("%d\n", q.age); // 30
}
(I find it useful to use uppercase for types. Then, you can do Person person;.)
(Do note that I had to initialize f before performing the assignment to avoid undefined behaviour.)
In the above, changing f.name or q.name (after the assignment) has no effect on the other. However, consider what happens if you had the following instead:
typedef struct {
char *name;
int age;
} Person2;
Again, (only) a field-for-field copy is made, you'll end up with the same pointer (pointing to the same object) in both structures.
There is no technical reason a modern compiler could not copy an array, making an assignment like a = b; work if a and b were suitable arrays. That includes copying the characters of a string, as with a = "abc";.
However, when C was being developed, it was very much a give-the-computer-each-instruction-yourself kind of language, and copying a “big” thing like an array was not suitable for the hardware and the programming situations of the time. You also could not copy structures in early versions of C.
Instead, to help programmers deal with arrays, features were added to automatically convert arrays to pointers, so programmers could easily write their own code to copy arrays or to manipulate their contents, as in:
for (char *p = array; *p; ++p)
*d++ = *p;
Over the years, hardware became more powerful, and C grew to be used more and more. The ability to copy structures using assignments was added. However, by then, the convert-array-to-pointer feature was built into the language and could not be changed. Thus, we cannot make a = b; copy arrays because it does not mean “Assign the value of array b to array a”; it would mean “Assign the address of the first element of b to the address of the first element of a.”
Aside from using memcpy, other workarounds could be devised, but it has simply not proven valuable enough to add any special provision for assigning arrays to the language.
So the presence of an array of characters inside a structure is not a problem: There is no rule in C that you cannot copy arrays. It is simply that there is no way to express an assignment that copies an array. Assignments that copy structures are easy to express, and they simply copy the contents of the structures.
I am working on refactoring some old code and have found a few structs containing zero-length arrays (below). The warnings are suppressed by a pragma, of course, but I have failed to create, with "new", structures that contain such structures (error 2233). The array 'byData' is used as a pointer, but why not use a pointer instead? Or an array of length 1? And of course, no comments were added to make me enjoy the process...
Are there any reasons to use such a thing? Any advice on refactoring those?
struct someData
{
int nData;
BYTE byData[0];
};
NB It's C++, Windows XP, VS 2003
Yes this is a C-Hack.
To create an array of any length:
struct someData* mallocSomeData(int size)
{
    struct someData* result = (struct someData*)malloc(sizeof(struct someData) + size * sizeof(BYTE));
    if (result)
    {
        result->nData = size;
    }
    return result;
}
Now you have an object of someData with an array of a specified length.
There are, unfortunately, several reasons why you would declare a zero length array at the end of a structure. It essentially gives you the ability to have a variable length structure returned from an API.
Raymond Chen did an excellent blog post on the subject. I suggest you take a look at this post because it likely contains the answer you want.
Note in his post, it deals with arrays of size 1 instead of 0. This is the case because zero length arrays are a more recent entry into the standards. His post should still apply to your problem.
http://blogs.msdn.com/oldnewthing/archive/2004/08/26/220873.aspx
EDIT
Note: Even though Raymond's post says 0 length arrays are legal in C99 they are in fact still not legal in C99. Instead of a 0 length array here you should be using a length 1 array
This is an old C hack to allow flexible-sized arrays.
In the C99 standard this is not necessary, as it supports the arr[] syntax.
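A sketch of what the C99 form could look like. The names some_data and some_data_new are hypothetical, and note that flexible array members are a C99 feature, not part of standard C++, so this does not directly apply to the C++ code in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct some_data {
    size_t n_data;
    unsigned char by_data[];   /* C99 flexible array member; must be last */
};

/* Allocates a some_data with room for n zeroed bytes; NULL on failure. */
struct some_data *some_data_new(size_t n)
{
    assert(n <= (size_t)-1 - sizeof(struct some_data));   /* no wrap-around */
    /* sizeof does not count the flexible array, so add room for n bytes */
    struct some_data *p = malloc(sizeof *p + n);
    if (p != NULL) {
        p->n_data = n;
        memset(p->by_data, 0, n);
    }
    return p;
}
```

A single free(p) releases the header and the array together, because they live in one allocation.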
Your intuition about "why not use an array of size 1" is spot on.
The code is doing the "C struct hack" wrong, because declarations of zero length arrays are a constraint violation. This means that a compiler can reject your hack right off the bat at compile time with a diagnostic message that stops the translation.
If we want to perpetrate a hack, we must sneak it past the compiler.
The right way to do the "C struct hack" (which is compatible with C dialects going back to 1989 ANSI C, and probably much earlier) is to use a perfectly valid array of size 1:
struct someData
{
int nData;
unsigned char byData[1];
};
Moreover, instead of sizeof (struct someData), the size of the part before byData is calculated using:
offsetof(struct someData, byData);
To allocate a struct someData with space for 42 bytes in byData, we would then use:
struct someData *psd = (struct someData *) malloc(offsetof(struct someData, byData) + 42);
Note that this offsetof calculation is in fact the correct calculation even in the case of the array size being zero. You see, sizeof the whole structure can include padding. For instance, if we have something like this:
struct hack {
unsigned long ul;
char c;
char foo[0]; /* assuming our compiler accepts this nonsense */
};
The size of struct hack is quite possibly padded for alignment because of the ul member. If unsigned long is four bytes wide, then quite possibly sizeof (struct hack) is 8, whereas offsetof(struct hack, foo) is almost certainly 5. The offsetof method is the way to get the accurate size of the preceding part of the struct just before the array.
So that would be the way to refactor the code: make it conform to the classic, highly portable struct hack.
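To make the difference between the two calculations concrete, here is a small sketch using the valid size-1 variant of the struct above. hack_size is a hypothetical helper name, and the exact numbers depend on the platform's alignment rules.

```c
#include <assert.h>
#include <stddef.h>

struct hack {
    unsigned long ul;
    char c;
    char foo[1];   /* size-1 array: the portable form of the struct hack */
};

/* Bytes needed for a struct hack whose trailing array holds n chars. */
size_t hack_size(size_t n)
{
    /* offsetof stops exactly where foo begins, so the trailing padding
       that sizeof (struct hack) may include is not counted */
    size_t size = offsetof(struct hack, foo) + n;
    assert(size >= n);   /* guard against size_t wrap-around */
    return size;
}
```

Comparing hack_size(0) with sizeof (struct hack) on a given platform shows how many bytes the sizeof-based calculation would over-allocate.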
Why not use a pointer? Because a pointer occupies extra space and has to be initialized.
There are other good reasons not to use a pointer, namely that a pointer requires an address space in order to be meaningful. The struct hack is externalizeable: that is to say, there are situations in which such a layout conforms to external storage such as areas of files, packets or shared memory, in which you do not want pointers because they are not meaningful.
Several years ago, I used the struct hack in a shared memory message passing interface between kernel and user space. I didn't want pointers there, because they would have been meaningful only to the original address space of the process generating a message. The kernel part of the software had a view to the memory using its own mapping at a different address, and so everything was based on offset calculations.
It's worth pointing out IMO the best way to do the size calculation, which is used in the Raymond Chen article linked above.
struct foo
{
size_t count;
int data[1];
};
size_t foo_size_from_count(size_t count)
{
    return offsetof(struct foo, data[count]);
}
The offset of the first entry off the end of desired allocation, is also the size of the desired allocation. IMO it's an extremely elegant way of doing the size calculation. It does not matter what the element type of the variable size array is. The offsetof (or FIELD_OFFSET or UFIELD_OFFSET in Windows) is always written the same way. No sizeof() expressions to accidentally mess up.
I know a pointer to one type may be converted to a pointer of another type. I have three questions:
What should be kept in mind while typecasting pointers?
What exceptions/errors may arise from the resulting pointer?
What are the best practices to avoid exceptions/errors?
A well-written program usually does not use much pointer typecasting. There could be a need to cast the result of malloc, for instance (it is declared as void *malloc(...)), but that is not even necessary in C (although a few compilers may complain).
int *p = malloc(sizeof(int)); // no need of (int *)malloc(...)
However, in system applications you sometimes want to use a trick to perform binary or other specific operations, and C, a language close to the machine, is convenient for that. For instance, say you want to analyze the binary structure of a double (one that follows the IEEE 754 format) and working with individual bytes is simpler; you may declare
typedef unsigned char byte;
double d = 0.9;
byte *p = (byte *)&d;
int i;
for (i=0 ; i<sizeof(double) ; i++) { ... work with p[i] ... }
You may also use a union for this.
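A sketch of the union variant, under the assumption that reading the bytes of a double through an unsigned char array member is what is wanted. The names double_bytes and double_to_bytes are hypothetical.

```c
#include <assert.h>
#include <string.h>

union double_bytes {
    double value;
    unsigned char bytes[sizeof(double)];
};

/* Writes the sizeof(double) bytes that represent d into out. */
void double_to_bytes(double d, unsigned char *out)
{
    assert(out != NULL);
    union double_bytes u;
    u.value = d;                            /* store as a double ... */
    memcpy(out, u.bytes, sizeof u.bytes);   /* ... read back as bytes */
}
```

The byte order of the result depends on the machine, so the output should not be treated as a portable serialization format.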
A more complex utilisation could be the simulation of the C++ polymorphism, that requires to store the "classes" (structures) hierarchy somewhere to remember what is what, and perform pointer typecasting to have, for instance, a parent "class" pointer variable to point at some time to a derived class (see the C++ link also)
CRectangle rect;
CPolygon *p = (CPolygon *)&rect;
p->whatami = POLY_RECTANGLE; // a way to simulate polymorphism ...
process_poly ( p );
But in this case, maybe it's better to directly use C++!
Pointer typecast is to be used carefully for well determined situations that are part of the program analysis - before development starts.
Pointer typecast potential dangers
using them when it's not necessary - that is error-prone and complicates the program
pointing to an object of different size that may lead to an access overflow, wrong result...
pointer to two different structures like s1 *p = (s1 *)&s2; : relying on their size and alignment may lead to an error
(But to be fair, a skilled C programmer wouldn't commit the above mistakes...)
Best practice
use them only if you do need them, and comment the part well that explains why it is necessary
know what you are doing - a skilled programmer may use tons of pointer typecasts without failure, but do not rely on trial and error: code that works on one system / version / OS may not work on another
In plain C you can cast any pointer type to any other pointer type. If you cast a pointer to or from an incompatible type and then write to the memory incorrectly, you may get a segmentation fault or unexpected results from your application.
Here is a sample code of casting structure pointers:
struct Entity {
    int type;
};
struct DetailedEntity1 {
    int type;
    short val1;
};
struct DetailedEntity2 {
    int type;
    long val;
    long val2;
};
// random code:
struct Entity* ent = (struct Entity*)ptr;
//bad:
struct DetailedEntity1* ent1 = (struct DetailedEntity1*)ent;
int a = ent1->val1; // may be an error here, invalid read
ent1->val1 = 117; // possible invalid write
//OK:
if (ent->type == DETAILED_ENTITY_1) {
((struct DetailedEntity1*)ent)->val1;
} else if (ent->type == DETAILED_ENTITY_2) {
((struct DetailedEntity2*)ent)->val2;
}
As for function pointers - you should always use functions which exactly fit the declaration. Otherwise you may get unexpected results or segfaults.
When casting from pointer to pointer (structure or not) you must ensure that the memory is aligned in the exact same way. When casting entire structures the best way to ensure it is to use the same order of the same variables at the start, and differentiating structures only after the "common header". Also remember, that memory alignment may differ from machine to machine, so you can't just send a struct pointer as a byte array and receive it as byte array. You may experience unexpected behaviour or even segfaults.
When casting smaller to larger variable pointers, you must be very careful. Consider this code:
char* ptr = malloc (16);
ptr++;
uint64_t* uintPtr = (uint64_t*)ptr; // may cause an error, memory is not properly aligned
And also, there is the strict aliasing rule that you should follow.
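One way to sidestep both the alignment problem and the strict aliasing rule is to copy the bytes instead of casting the pointer. A minimal sketch, where load_u64 is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads a uint64_t from any byte address, aligned or not,
   in the host's byte order. */
uint64_t load_u64(const void *src)
{
    assert(src != NULL);
    uint64_t value;
    memcpy(&value, src, sizeof value);   /* byte copy: no alignment needed */
    return value;
}
```

The caller is still responsible for making sure that at least sizeof(uint64_t) readable bytes exist at src.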
You probably need a look at ... the C-faq maintained by Steve Summit (which used to be posted in the newsgroups, which means it was read and updated by a lot of the best programmers at the time, sometimes the designers of the language itself).
There is an abridged version too, which is maybe more palatable and still very, very, very, very useful. Reading the whole abridged is, I believe, mandatory if you use C.
I can't seem to understand the difference between the following two pointer notations; can someone please guide me?
typedef struct some_struct struct_name;
struct_name this;
char buf[50];
this = *((struct some_struct *)(buf));
Now I tried to play around a bit and did the above thing like:
struct some_struct * this;
char buf[50];
this=(struct some_struct *)buf;
As far as I am concerned I think both the implementations should generate the same result, Can someone guide me whether there is a difference between the two and if yes can some one point it out?
Thanks.
In your first snippet, this is not a pointer, it's an instance of some_struct. The assignment you made did a shallow copy (i.e. memcpy()) of what's in buf as if it were an instance of some_struct as well.
In the second snippet, this is a pointer, and it's just pointed to the address of buf.
So, basically to sum up, first snippet this is not a pointer and the struct is copied into it. In the second, it's a pointer and assigned to the same memory as buf (i.e. not a copy).
In the second one, "this" will point to the first memory location of "buf". In the first example, the contents of buf (up to sizeof(struct_name)) will be copied into this, which resides on the stack; assigning structs with = is valid C.
Both approaches have their problems.
alignment: your buf might not be properly aligned for a variable of the structure type. If so, this produces undefined behavior (UB): in the best case it aborts your program, but it may do much worse things than that.
initialization: in the first case you read uninitialized memory. In the best case that gives you unspecified data, that is, some random bytes. In the worst case, char is a signed integer type on your platform and you hit a trap representation for char => UB as above. (Your second case will encounter the same problem, once you try to access the object at the other end of the pointer.)
How to avoid all that:
Always initialize your variables. A simple = { 0 } should do in all cases.
never use char as a generic type for bytes but use unsigned char
never cast a byte buffer of arbitrary alignment to another data type. If needed, do it the other way round, cast a struct object to unsigned char.
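If the bytes really do have to become a struct, copying them avoids the cast altogether. A sketch under stated assumptions: some_struct here is a hypothetical two-member layout, and struct_from_bytes is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, standing in for the struct in the question. */
struct some_struct {
    int id;
    short flags;
};

/* Fills *out from the first sizeof *out bytes of buf.
   Returns 0 on success, -1 if buf is too short. */
int struct_from_bytes(struct some_struct *out,
                      const unsigned char *buf, size_t len)
{
    assert(out != NULL && buf != NULL);
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);   /* works for any alignment of buf */
    return 0;
}
```

The destination is a properly aligned object of the right type, so nothing depends on how buf happens to be aligned.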
This ought to be simple. Say we have a struct from a library that doesn't offer copying facilities. Is there an easy way to copy a variable of that struct type to a new variable of the same type without doing assignments for each of its sub-members? Or does one have to write special copying functions?
Well, struct types are assignable in C:
struct SomeStruct s, d;
...
d = s;
It doesn't matter, where they are defined. And there's no need to copy "each of the sub members". Where did you even get the idea about copying it member by member?
Of course, the assignment will perform shallow copying only. If you need deep copying, you need a library-provided copying routine. If there's none, you will have to implement it yourself. In order to do that you will need full knowledge of the actual deep-memory organization of the structure. If you don't know it (i.e. if it is not documented), you are out of luck - proper deep-copying is impossible.
This sounds like the classic deep copy vs shallow copy topic.
You could copy the variable (shallow copy) with a memcpy operation.
If the struct holds references to other variables, then this might not be enough and you should implement a more sophisticated (deep) copy.
C supports struct assignment natively, so if a per-member copy is safe/effective, you can just use direct assignment:
typedef struct {
int v1, v2;
float f1;
} A;
A a = { 0, 2, 1.5f };
A b;
b = a;
It has already been explained that one may use simple assignment for structs, but perhaps an interesting characteristic of this technique is that it even allows you to copy structs containing arrays despite arrays being unassignable. More importantly, it demonstrates that it is not always possible to copy a struct just by assigning to its members individually.
For example:
typedef struct {
char data[100];
} string;
string a = {"Hello world!"};
string b;
b = a; // works!
puts(b.data); // writes "Hello world!" to standard out.
b.data = a.data; // oops, doesn't work!
Keep in mind that it might not be safe to do a per-member copy of a structure; for instance, it may hold pointers, and just copying those pointers might not be a great idea if you want an actual copy of the data. However, when it is safe to do it, you may use memcpy to save yourself some keystrokes.
struct foo a, b;
// let the third-party lib fill the struct
thirdPartyLibraryCallInvolvingTheFooStruct(&a);
// now we want to copy a to b
memcpy(&b, &a, sizeof b);
Don't forget to read the manpage for more information!
You could use the memcpy function. This copies a contiguous area of memory byte for byte, ignoring whether it's a series of ints, a char array, or whatever.