Deep copy a struct to another - c

I have a struct that contains strings and pointers. Are there any library functions that can deep copy the struct into another? I don't want to copy it field by field, since the struct is quite large.
Does glib have any function that does the trick?

No. A general-purpose function would have no way to know the layout of your struct (i.e. information that is only available at compile time). And even if it did, how would it know what constitutes a "deep copy" in all circumstances?

You can use memcpy or memmove to copy the entire contents of the struct itself. However, as C has no introspection, copying pointed-to objects can't be done by a general purpose function.
Edited to add: As several commenters note, you can simply assign one structure to another in the C dialects in use for the last couple of decades, so memcpy is no longer needed for this.
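To make the difference concrete, here is a minimal sketch of a hand-written deep copy. The struct person, its fields, and the functions person_copy and person_free are made up for illustration and are not taken from the question. Plain assignment copies the scalars and the pointer values; the pointed-to data still has to be duplicated by code that knows what each pointer owns.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct for illustration: one owned string, one owned array. */
struct person {
    char   *name;      /* heap-allocated, NUL-terminated, may be NULL */
    int    *scores;    /* heap-allocated array of n_scores ints, may be NULL */
    size_t  n_scores;
    int     age;
};

/* Deep-copies *src into *dst (dst must not be the same object as src).
   Returns 0 on success, -1 on allocation failure. On failure nothing
   is leaked and dst's pointers are NULL. */
int person_copy(struct person *dst, const struct person *src)
{
    *dst = *src;          /* shallow copy: scalars are done, pointers are shared */
    dst->name = NULL;     /* drop the shared pointers before duplicating */
    dst->scores = NULL;

    if (src->name != NULL) {
        size_t len = strlen(src->name) + 1;
        dst->name = malloc(len);
        if (dst->name == NULL)
            return -1;
        memcpy(dst->name, src->name, len);
    }
    if (src->scores != NULL && src->n_scores > 0) {
        size_t bytes = src->n_scores * sizeof *src->scores;
        dst->scores = malloc(bytes);
        if (dst->scores == NULL) {
            free(dst->name);
            dst->name = NULL;
            return -1;
        }
        memcpy(dst->scores, src->scores, bytes);
    }
    return 0;
}

/* Releases what person_copy allocated. */
void person_free(struct person *p)
{
    free(p->name);
    free(p->scores);
    p->name = NULL;
    p->scores = NULL;
}
```

After a successful person_copy(&b, &a), changing a.name[0] leaves b.name untouched, which would not be the case after a plain b = a.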

Related

What is the best way to store integers with void pointers in C?

Hello, I am trying to learn and build data structures in C, and I want to store integers in a stack one after another.
My struct looks like this:
typedef struct STACK_NODE_s *STACK_NODE;
typedef struct STACK_NODE_s{
    STACK_NODE forward;
    void *storage;
} STACK_NODE_t;
typedef struct L_STACK_s{
    STACK_NODE top;
} L_STACK_t, *L_STACK;
In a while loop I want to read my chars and store them in integer form.
//assume that str is a proper string
//assume that we have a linked stack called LS
int i=0;
int tmp;
while(str[i]!='\0'){
    tmp=str[i]-'0';
    push(LS,(void *)&tmp); //pushes the address of the same variable every time
    i++;
}
I know this won't work properly, as we store the same variable's address over and over again.
Do I need to allocate an auxiliary array in order to store them one by one, or is there a better way to do this?
The answer must address two separate aspects of your question:
How to organize some collection of items, and where to get the memory from to do that.
First code snippet / Linked list format
The first code snippet is good the way it is.
It sets up a linked list, which has its pros and cons, but serves very well if you don't know the number of items in advance, if you want to be able to quickly remove or insert items somewhere in the middle of the list, and if you don't mind that looking up one certain entry inside the list costs you O(N) effort.
For a generic library-like implementation...
... void* is as good as it gets with ANSI C.
In C++, for example, you could make a template that leaves open the type that is stored in the list (or better yet, you would directly reuse the well-known STL implementation in class forward_list<int>).
Sadly, ANSI C doesn't have something comparable.
One solution is the one you picked, create int objects and hook their addresses into your list of void*.
Another solution for a generic library implementation is to use a preprocessor macro for the type, and to define this macro before including a header file that holds the generic implementation. This tries to resemble the clean C++ solution, but the preprocessor only substitutes text and is not typesafe, so this approach is far from beautiful and comes with several risks.
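A rough sketch of the macro idea, so it is clearer what is meant. This is a single-file variant in which one macro generates the whole typed stack, rather than a header that is included after defining a type macro; the names DEFINE_STACK and int_stack are invented for this example.

```c
#include <stdlib.h>

/* Generates a singly linked stack that stores T inline.
   All generated names are prefixed with NAME. */
#define DEFINE_STACK(T, NAME)                                   \
    typedef struct NAME##_node {                                \
        struct NAME##_node *next;                               \
        T value;                                                \
    } NAME##_node;                                              \
                                                                \
    static int NAME##_push(NAME##_node **top, T value)          \
    {                                                           \
        NAME##_node *n = malloc(sizeof *n);                     \
        if (n == NULL)                                          \
            return -1;                                          \
        n->value = value;                                       \
        n->next = *top;                                         \
        *top = n;                                               \
        return 0;                                               \
    }                                                           \
                                                                \
    static int NAME##_pop(NAME##_node **top, T *out)            \
    {                                                           \
        NAME##_node *n = *top;                                  \
        if (n == NULL)                                          \
            return -1;                                          \
        *out = n->value;                                        \
        *top = n->next;                                         \
        free(n);                                                \
        return 0;                                               \
    }

/* One instantiation per element type. */
DEFINE_STACK(int, int_stack)
```

With this, an empty stack is just int_stack_node *top = NULL, and int_stack_push(&top, 7) stores the value itself, so no separate allocation per int is needed. The downsides mentioned above still apply: errors inside the macro body are hard to read, and every instantiation duplicates the code.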
Second code snippet / Memory allocation
Creating the list with void* instead of int (or whatever non-pointer type) requires you to allocate further memory beside the list.
I.e., you have to allocate not only every list item (= variable of type STACK_NODE_t) but also the actual entry value (e.g., *(int*)(LS->top->storage)).
This means you have to allocate and deallocate the data yourself, in a way that keeps it alive for as long as the list entry points to it.
On most systems, you can use malloc/free for that, and you only have to take into account the size of the heap available for malloc and the time de-/allocating takes.
If the list has to meet real-time requirements or run on an embedded system, you may not have malloc or you may not be allowed to use it.
Then you have to allocate and implement your own heap (= memory pool of storage items) for your list.
How to implement such a memory pool with the desired properties is a separate question that would take us too far here.
In any case, you must not use the pointer to a stack variable (like a local variable inside a function) because the memory "behind" that variable will not be reserved for this purpose once the function exits, and the memory may be used for something different in the meantime.
This, however, is apparently what the second code snippet does.
As you noticed yourself, taking this path...
we store the same variable's address over and over again.
Reusing the memory position for another entry of the same list is an extreme case of the risk explained above.
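For the malloc/free route, a minimal sketch could look like the following. It reuses the struct definitions from the question; push_int, pop_int, and push_digits are hypothetical helpers written for this answer and stand in for the push function that the question does not show. Each pushed value gets its own heap allocation, so no two nodes share storage.

```c
#include <stdlib.h>

typedef struct STACK_NODE_s *STACK_NODE;
typedef struct STACK_NODE_s {
    STACK_NODE forward;
    void *storage;
} STACK_NODE_t;
typedef struct L_STACK_s {
    STACK_NODE top;
} L_STACK_t, *L_STACK;

/* Pushes a heap-allocated copy of value. Returns 0 on success, -1 on failure. */
int push_int(L_STACK ls, int value)
{
    STACK_NODE node = malloc(sizeof *node);
    int *copy = malloc(sizeof *copy);
    if (node == NULL || copy == NULL) {
        free(node);          /* free(NULL) is a no-op */
        free(copy);
        return -1;
    }
    *copy = value;
    node->storage = copy;    /* this node now owns its own int */
    node->forward = ls->top;
    ls->top = node;
    return 0;
}

/* Pops the top value into *out. Returns 0 on success, -1 if the stack is empty. */
int pop_int(L_STACK ls, int *out)
{
    STACK_NODE node = ls->top;
    if (node == NULL)
        return -1;
    *out = *(int *)node->storage;
    ls->top = node->forward;
    free(node->storage);
    free(node);
    return 0;
}

/* Pushes the numeric value of every digit character in str. */
int push_digits(L_STACK ls, const char *str)
{
    for (size_t i = 0; str[i] != '\0'; i++) {
        if (push_int(ls, str[i] - '0') != 0)
            return -1;
    }
    return 0;
}
```

Starting from L_STACK_t s = { NULL }, a call to push_digits(&s, "123") leaves 3 on top, then 2, then 1. The price is one extra allocation per element, which is exactly the cost discussed above.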
I solved the problem using an auxiliary array, as I anticipated. If someone comes up with a better solution, it is more than welcome.

What is the easiest way to do deep copy of a struct in C?

Suppose I have a
struct A{
char *name;
unsigned long *trunks;
bool value;
const struct smap *smap;
...
...
};
This struct contains all kinds of data structures, and my only access to it is through a struct A *, i.e. a pointer to it.
You have to copy every element in the struct and all referenced objects to newly allocated structures the same way (recursively).
If the struct has only a few pointers, you might use memcpy to copy all elements as-is first and then copy all objects referenced through pointers in a second pass. If there are many pointers, it might be more efficient to copy each field by hand.
Referenced objects must be treated identically (by recursion; iteration would be pretty nasty). However, for this, you need to know the structure of these types. Alternatively, there might be copy functions for all these objects in their implementation file, thus keeping them opaque. If neither the structure nor a copy function is available, you are somewhat lost, as there is no way to detect the pointers without that.
A problem will arise if there are circular references. Then things get even more complicated.
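A small sketch of the two-pass approach with recursion, under the assumption that there are no circular references. The struct item and the functions item_deep_copy and item_free are invented for illustration, since the real struct A is not fully shown.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: an owned string plus an owned sub-object of the same type. */
struct item {
    char *label;          /* may be NULL */
    struct item *child;   /* may be NULL; assumed to form no cycle */
    int value;
};

/* Frees an item created by item_deep_copy, including everything it owns. */
void item_free(struct item *it)
{
    if (it == NULL)
        return;
    item_free(it->child);
    free(it->label);
    free(it);
}

/* Returns a newly allocated deep copy of src.
   Returns NULL if src is NULL or if an allocation fails. */
struct item *item_deep_copy(const struct item *src)
{
    if (src == NULL)
        return NULL;

    struct item *dst = malloc(sizeof *dst);
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, sizeof *dst);   /* pass 1: copy every field as-is */
    dst->label = NULL;               /* pass 2: replace each pointer with a copy */
    dst->child = NULL;

    if (src->label != NULL) {
        size_t len = strlen(src->label) + 1;
        dst->label = malloc(len);
        if (dst->label == NULL) {
            item_free(dst);
            return NULL;
        }
        memcpy(dst->label, src->label, len);
    }
    if (src->child != NULL) {
        dst->child = item_deep_copy(src->child);   /* recurse into the referenced object */
        if (dst->child == NULL) {
            item_free(dst);
            return NULL;
        }
    }
    return dst;
}
```

With a cycle (an item that is reachable from its own child), this recursion would never terminate; handling that needs some bookkeeping of already-copied objects, which is the extra complication mentioned above.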

How to perform deep copy at ADTs in C?

I have recently been writing some Abstract Data Types (ADTs) for a queue in C.
But I ran into a problem with ADTs in C:
How can I pass the type of the data in C?
For example, in C++ I can use the template to pass type:
std::queue< struct mySt > myQ;
That template will pass "struct mySt" type to create myQ.
But how to do this in C?
All I know is to create a generic pointer pushing the data of "struct mySt" like below:
void enq(void *dataPtr);
and pop it using casting like below:
struct mySt *a = (struct mySt *) deq();
That seems to work in C, but how can I perform a "deep copy"? I mean allocating new memory for the content that dataPtr points to, rather than just storing the pointer.
Apart from using a macro or a function pointer, is there a better way to solve this?
There isn't a simple way to do it in C. The C++ code relies on constructors (the copy constructor) to implement the copy, which is necessary because you can't tell a priori whether there are pointers in the class or allocated memory that need to be altered when an independent copy of the structure is made.
If you are going to copy the structures in a C ADT, then at a bare minimum you will need to specify the size of the structure to be copied as part of the interface. However, you really need a copy function that knows how to deal with the pointers in a copy of the structure.
Passing pointers around is simpler; it is clear that the object that is pointed at continues to exist unmodified by the fact that its pointer is now stored in a list.
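If you do want the queue to own copies, a sketch of such an interface might look like this. The names queue_init and copy_fn and the changed signatures of enq and deq are assumptions made for this example, not the asker's API. The element size is fixed when the queue is created, and an optional copy function handles elements that contain pointers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-type hook: deep-copies *src into *dst, returns 0 on success. */
typedef int (*copy_fn)(void *dst, const void *src);

struct queue_node {
    struct queue_node *next;
    void *data;                  /* queue-owned copy of one element */
};

struct queue {
    struct queue_node *head;
    struct queue_node *tail;
    size_t elem_size;            /* must be greater than 0 */
    copy_fn copy;                /* NULL means a plain memcpy is enough */
};

void queue_init(struct queue *q, size_t elem_size, copy_fn copy)
{
    q->head = NULL;
    q->tail = NULL;
    q->elem_size = elem_size;
    q->copy = copy;
}

/* Stores a copy of *elem. Returns 0 on success, -1 on failure. */
int enq(struct queue *q, const void *elem)
{
    struct queue_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = malloc(q->elem_size);
    if (n->data == NULL) {
        free(n);
        return -1;
    }
    if (q->copy != NULL) {
        if (q->copy(n->data, elem) != 0) {
            free(n->data);
            free(n);
            return -1;
        }
    } else {
        memcpy(n->data, elem, q->elem_size);
    }
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    return 0;
}

/* Moves the oldest element into *out; the caller then owns whatever it
   references. Returns 0 on success, -1 if the queue is empty. */
int deq(struct queue *q, void *out)
{
    struct queue_node *n = q->head;
    if (n == NULL)
        return -1;
    memcpy(out, n->data, q->elem_size);
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    free(n->data);
    free(n);
    return 0;
}
```

For a struct without pointers, queue_init(&q, sizeof(struct mySt), NULL) is sufficient. For anything that owns memory, the copy function is where the type-specific knowledge has to live, which is the role the copy constructor plays in C++.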

the dynamic array in C

Recently I have found it annoying to deal with arrays in C.
I have to call realloc() frequently to increase the size.
And there is no standard data structure like vector in C++ or ArrayList in Java.
I have learned that the Linux kernel has some data structures for this, such as kfifo,
which can be used through the kfifo_in() and kfifo_out() functions.
But this means the user defines kfifo *pointer; to track the array, and this variable does not carry any information about the type stored in the structure.
The user has to remember the element type whenever he uses the dynamic array through the kfifo pointer.
I think this may be a little confusing.
Is there a better way to deal with the problem? What is the common solution in Linux C programming?
realloc is not that bad, as long as you do not spread it all over your code, and use a reasonable strategy to grow your dynamic array.
Rolling your own dynamic arrays in C is a matter of implementing a handful of easy functions. Numerous short articles walk you through this exercise - here is one for an example. The article defines a struct that represents your dynamic array, along with the currently used and the allocated size. It also provides functions for initializing, growing, and de-allocating the array represented by the structure. There is no explicit initialization function in the library - you initialize by passing NULL as the first parameter. This is a valid approach, but you could also opt for a more traditional separation of init and grow.
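To give an idea of how little code is involved, here is a minimal sketch for an array of int with separate init, push, and free functions and a doubling growth strategy. This is not the code from the article mentioned above; the names int_vec, int_vec_init, int_vec_push, and int_vec_free are made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>

struct int_vec {
    int   *data;
    size_t len;    /* number of elements in use */
    size_t cap;    /* number of elements allocated */
};

void int_vec_init(struct int_vec *v)
{
    v->data = NULL;
    v->len = 0;
    v->cap = 0;
}

/* Appends value, growing the buffer when it is full.
   Returns 0 on success, -1 on failure (the vector is left unchanged). */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        /* Double the capacity so that realloc is called only O(log n) times. */
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        if (new_cap < v->cap || new_cap > SIZE_MAX / sizeof *v->data)
            return -1;                       /* size computation would overflow */
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;                       /* the old buffer is still valid */
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

void int_vec_free(struct int_vec *v)
{
    free(v->data);
    int_vec_init(v);
}
```

Keeping realloc inside int_vec_push is what "do not spread it all over your code" amounts to in practice: callers only ever push, and the growth policy can be changed in one place.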
I'd use GLib arrays. GLib is a very well-known library on Linux and other OSes, used in projects like GNOME.
There is no standard for dynamic arrays in C.
#bluesea
I mean they could define struct array{int len; int capacity; int each_element_size; void *data;} and copy the bytes of element, put at the end of the data. – bluesea Jun 29 at 3:04
This is already taken care of in the library under discussion. See the macros that it comes with and the examples in the main.c file. Depending on the macros being used, you end up with either an array of pointers to the original data or an array of pointers to a copy of the data.
FWIW, I'm the author of the library, and I'll be the first to admit that it comes without airbags, so you have to be sure to use it safely (as with anything else in C).

Is there any case for which returning a structure directly is good practice?

IMO all code that returns a structure directly can be modified to return a pointer to a structure.
When is returning a structure directly a good practice?
Modified how? By returning a pointer to a static instance of the structure within the function, thus making the function non-reentrant? Or by returning a pointer to a heap-allocated structure that the caller has to remember to free, and to free appropriately? I would consider returning a structure to be the good practice in the general case.
The biggest advantage to returning a complete structure instead of a pointer is that you don't have to mess with pointers. By avoiding the risks inherent with pointers, especially if you're allocating and freeing your own memory, both coding and debugging can be significantly simplified.
In many cases, the advantages of passing the structure directly outweigh the downsides (time/memory) of copying the entire structure to the stack. Unless you know that optimization is necessary, there is no reason not to take the easier path.
I see the following cases as the ones where I would most commonly opt for passing structs directly:
"Functional programming" style code. Lots of stuff is passed around and having pointers would complicate the code a lot (and that is not even counting if you need to start using malloc+free)
Small structs, like for example
struct Point{ int x, y; };
aren't worth the trouble of passing stuff around by reference.
And lastly, let's not forget that pass-by-value and pass-by-reference are actually very different, so some classes of programs will be better suited to one style and will end up looking ugly if the other style is used instead.
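As a tiny illustration of the small-struct case, here is a sketch using the Point struct from above. The functions point_add and point_scale are hypothetical names chosen for this example.

```c
struct Point { int x, y; };

/* Takes and returns Points by value: no allocation, no ownership questions. */
struct Point point_add(struct Point a, struct Point b)
{
    struct Point r = { a.x + b.x, a.y + b.y };
    return r;
}

struct Point point_scale(struct Point p, int k)
{
    struct Point r = { p.x * k, p.y * k };
    return r;
}
```

Because nothing is allocated and the arguments are never modified, calls compose directly, as in point_scale(point_add(a, b), 2), and there is nothing for the caller to free afterwards.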
These other answers are good, but I think missingno comes closest to "answering the question" by mentioning small structs. To be more concrete, if the struct itself is only a few machine words long, then both the "space" objection and the "time" objection are overcome. If a pointer is one word, and the struct is two words, how much slower is the struct copy operation vs the pointer copy? On a cached architecture, I suspect the answer is "none at all". And as for space, 2 words on stack < 1 word on stack + 2 words (+overhead) on heap.
But these considerations are only appropriate for specific cases: THIS portion of THIS program on THIS architecture.
For the level of writing C programs, you should use whichever is easier to read.
If you're trying to make your function side-effect free, returning a struct directly would help, because it would effectively be pass-by-value. Is it more efficient? No, passing by reference is quicker. But having no side effects can really simplify working with threads (a notoriously difficult task).
There are a few cases where returning a structure by value is contra-indicated:
1) A library function that returns 'token' data that is to be re-used later in other calls, e.g. a file or socket stream descriptor. Returning a complete structure would break encapsulation of the library.
2) Structs containing data buffers of variable length where the struct has been sized to accommodate the absolute maximum size of the data but where the average data size is much less, e.g. a network buffer struct that has a 'dataLen' int and a 'char data[65536]' at its end.
3) Large structs of any type where the cost of copying the data becomes significant, e.g.:
a) When the struct has to be returned through several function calls, so the same data is copied multiple times.
b) Where the struct is subsequently queued off to other threads. Wide queues mean longer lock times during the copy-in/copy-out and so an increased chance of contention. That, and the size of the struct is inflicted on both producer and consumer thread stacks.
c) Where the struct is often moved around between layers, e.g. in a protocol stack.
4) Where structs of varying definitions are to be stored in any array/list/queue/stack/whateverContainer.
I suspect that I am so corrupted by c++ and other OO languages that I tend to malloc/new almost anything that cannot be stored in a native type
Rgds,
Martin
