Behavior of struct copying in C - c

I have been reading a number of questions on this site about how to duplicate structs in C. I've been playing around with some code, trying to understand the differences between 'shallow' copying (where the new struct is simply assigned a pointer to the memory address of the first struct) and 'deep' copying (where the data is copied member-by-member into a new chunk of memory).
I created the following code, assuming that it would show the 'shallow' copying behavior:
#include <stdio.h>
struct tester
{
int blob;
int glob;
char* doob[10];
};
int main (void)
{
//initializing first structure
struct tester yoob;
yoob.blob = 1;
yoob.glob = 2;
*yoob.doob = "wenises";
//initializing second structure without filling members
struct tester newyoob;
newyoob = yoob;
//assumed this line would simply copy the address pointed to by 'yoob'
//printing values to show that they are the same initially
printf("Before modifying:\n");
printf("yoob blob: %i\n", yoob.blob);
printf("newyoob blob: %i\n", newyoob.blob);
//modifying 'blob' in second structure. Assumed this would be mirrored by first struct
newyoob.blob = 3;
//printing new int values
printf("\nAfter modifying:\n");
printf("yoob blob: %i\n", yoob.blob);
printf("newyoob blob: %i\n", newyoob.blob);
//printing memory addresses
printf("\nStruct memory addresses:\n");
printf("yoob address: %p\n", (void *)&yoob);
printf("newyoob address: %p\n", (void *)&newyoob);
}
Output on running:
Before modifying:
yoob blob: 1
newyoob blob: 1
After modifying:
yoob blob: 1
newyoob blob: 3
Struct memory addresses:
yoob address: 0x7fff3cd98d08
newyoob address: 0x7fff3cd98cb0
Does this code create a deep copy, as it appears, or am I misunderstanding what's happening here?

The shallow vs. deep copy problem is only relevant to pointers. Given a type struct foo:
struct foo a = /* initialize */;
struct foo b = a;
All the values in a are copied to b. They are not the same variable.
However, with pointers:
struct foo *p = calloc(1, sizeof *p);
struct foo *q = p;
q now points to the same memory as p; no copying has taken place (and you run the risk of a dangling pointer once one gets freed). This is a pointer alias. In order to do a shallow copy, one would do:
struct foo *p = calloc(1, sizeof *p);
/* assign to p's fields... */
struct foo *q = calloc(1, sizeof *q);
*q = *p;
Now q has the same field values as p but points to a different block of memory.
A deep copy requires additional effort; any pointers in the structure must be traversed and have their contents copied as well. See this post for a good explanation.
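The difference between a pointer alias and a copy made with *q = *p can be checked with a small sketch. The struct layout and the helper names alias_shares_storage and copy_is_independent are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical struct used only for this illustration. */
struct foo {
    int x;
    int y;
};

/* Returns 1 if a write through the alias q is visible through p. */
int alias_shares_storage(void)
{
    struct foo *p = calloc(1, sizeof *p);
    if (p == NULL)
        return -1;
    struct foo *q = p;          /* alias: no struct is copied */
    assert(q == p);             /* both names refer to one block */
    q->x = 42;
    int shared = (p->x == 42);
    free(p);                    /* free once; q is dangling from here on */
    return shared;
}

/* Returns 1 if a copy made with *q = *p ignores later writes to *p. */
int copy_is_independent(void)
{
    struct foo *p = calloc(1, sizeof *p);
    struct foo *q = calloc(1, sizeof *q);
    if (p == NULL || q == NULL) {
        free(p);                /* free(NULL) is a no-op */
        free(q);
        return -1;
    }
    p->x = 1;
    *q = *p;                    /* copies the fields into q's own block */
    p->x = 99;                  /* does not touch *q */
    int independent = (q->x == 1 && p != q);
    free(p);
    free(q);
    return independent;
}
```

Both functions are expected to return 1 when the allocations succeed: the alias sees the write, the copy does not.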

When you use newyoob = yoob; the compiler creates code to copy the structure for you.
An important note about the copying: it's a shallow copy. That means if you have e.g. a structure containing pointers, it's only the actual pointers that will be copied and not what they point to, so after the copy you will have two pointers pointing to the same memory.
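A minimal sketch of that point, assuming a made-up struct holder with one int and one char pointer: after the assignment the two structs are separate objects, but their pointer members still refer to the same buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct: one plain member and one pointer member. */
struct holder {
    int count;
    char *text;
};

/* Returns 1 if struct assignment copied the pointer but not the buffer. */
int assignment_is_shallow(void)
{
    char buffer[] = "hello";            /* writable storage for the demo */
    struct holder a = { 1, buffer };
    struct holder b = a;                /* member-wise copy of count and text */

    assert(&a != &b);                   /* two distinct struct objects */
    b.count = 2;                        /* changes only b's own int */
    b.text[0] = 'j';                    /* writes into the shared buffer */

    return a.count == 1
        && a.text == b.text
        && strcmp(a.text, "jello") == 0;
}
```

The int members diverge after the copy, while the change made through b.text is visible through a.text.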

Your concept of "shallow copy" is wrong. The code
newyoob = yoob;
is in fact creating a shallow copy of yoob to newyoob. Your variables yoob and newyoob are separate memory allocations.
Now if you did this
struct tester* newyoob = &yoob;
Then newyoob and yoob are "the same" - but again, two variables referencing the same memory region is not considered a copy.

Suppose this
typedef struct tester {
int someInt;
char* someString;
} tester;
Shallow copying
Then you assign one instance to another:
tester a = {1, "hohohahah"};
tester b = a;
The members of a will be copied by value, including the pointer. This means:
a.someString == b.someString // True: comparing addresses and addresses are the same.
b is a shallow copy of a because the pointed members point to the same memory.
Deep copying
A deep copy means that the pointed members are also duplicated. It would go along these lines:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
typedef struct tester {
int someInt;
char* someString;
} tester;
void deepcopy_tester(tester *in, tester *out) {
out->someInt = in->someInt;
out->someString = malloc(strlen(in->someString)+1);
strcpy(out->someString, in->someString);
}
int main() {
tester t1 = {1, "Yo"};
tester t2 = {0, NULL};
deepcopy_tester(&t1, &t2);
printf("%s\n", t2.someString);
free(t2.someString);
}
This code should work, just tested with gcc.

Related

Idiom for aliasing structs

I understand that C is (mostly) call and assign "by value", and that struct assignment a = b creates a copy of b.
This leads to some verbosity when iterating through an array whose members are struct, and accessing member fields (example below). Is there an idiom for aliasing structs in a loop?
#include <stdio.h>
#define N_FOOS 2
struct Foo {
char *bar;
char *baz;
};
int main() {
struct Foo foos[N_FOOS] = {
{"foo", "bar"},
{"baz", "qux"},
};
for (int i = 0; i < N_FOOS; ++i) {
printf("foos[%d].bar = %s\n", i, foos[i].bar);
printf("foos[%d].baz = %s\n", i, foos[i].baz);
}
}
In a higher-level language I would have created an alias within for to point to foos[i] and avoid repeated indexing.
Would the idiomatic way be to create a pointer that references foos[i]?
int main() {
struct Foo *foo;
struct Foo foos[N_FOOS] = {
{"foo", "bar"},
{"baz", "qux"},
};
for (int i = 0; i < N_FOOS; ++i) {
foo = &foos[i];
printf("foos[%d].bar = %s\n", i, foos[i].bar);
printf("foos[%d].baz = %s\n", i, foos[i].baz);
printf("foo->bar = %s\n", foo->bar);
printf("foo->baz = %s\n", foo->baz);
}
}
The downside is having to do manual deallocation, but I guess that's just an inescapable part of the language.
EDIT: fixed code due to @dbush's feedback
What you have will work fine. It can be especially useful if you have a structure several layers deep to abbreviate what you're referring to and make your code more clear.
The one problem you have is a memory leak. You dynamically assign memory to foo, but then you overwrite the address of that allocated memory with the address of another variable:
foo = &foos[i];
Now the allocated memory is lost.
Because you're using the pointer to point to an existing variable, you don't need dynamic allocation at all. Get rid of malloc and free.
Your program is incorrect. Here's how to fix it: Remove the malloc on the RHS, replace it with NULL, and remove the free line.
You're just moving the pointer, so there's no need to do manual allocation, and if you free the last value, that's a bug because you're freeing something that wasn't malloced.
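As a sketch of the alias idiom without any allocation, the pointer can be declared inside the loop so that it only ever refers to an existing array element. The function total_length is a hypothetical example, not part of the original question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct Foo {
    const char *bar;
    const char *baz;
};

/* Sums the lengths of all strings, using a per-iteration alias to foos[i]. */
size_t total_length(const struct Foo *foos, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        const struct Foo *foo = &foos[i];   /* alias only: nothing to malloc or free */
        assert(foo->bar != NULL && foo->baz != NULL);
        total += strlen(foo->bar) + strlen(foo->baz);
    }
    return total;
}
```

Because foo only points at storage owned by the array, there is no deallocation to do when the loop ends.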

Copy one pointer content to another

I thought I read somewhere that, when using pointers, there are two options for copying the content of one to another:
using memcpy or
just assigning them with = ?
However, in the example below I allocated memory for two pointers, assigned the first to the second, and then changed the first, but the entries of my second pointer changed as well. What am I doing wrong?
typedef struct {
int a;
int b;
int c;
} my_struct;
int main(int argc, char** argv) {
my_struct* first = malloc(sizeof(my_struct));
first->a = 100; first->b = 101; first->c = 1000;
my_struct* bb = malloc(sizeof(my_struct));
printf("first %d %d %d\n", first->a, first->b, first->c);
bb = first;
printf("second %d %d %d\n", bb->a, first->b, bb->c);
first->a = 55; first->b = 55; first->c = 89;
printf("second %d %d %d\n", bb->a, first->b, bb->c);
}
The moment you do bb = first;, bb and first are pointing to the same location in memory. first->a = 55; first->b = 55; first->c = 89; will change the values of a, b, and c in that location. The memory originally allocated for bb is still lingering, but there is no way to access it anymore.
I think what you may want to do is *bb = *first;.
Your understanding of memcpy is correct, but you cannot copy the contents of the location a pointer points to just by assigning the pointers, as you did in your code.
You are assigning one pointer to the another in the following statement:
bb = first;
Now both these point to the same memory location (think of bb as an alias to first).
If you want to copy the data, assign the objects the pointers point to: *bb = *first;
As has already been pointed out, if you have a pointer first that points to some location in memory, and you make the assignment bb = first, where bb is a compatible pointer type, then bb points to the same address as first. This does not copy the contents of the memory referenced by first to the location originally referenced by bb. It copies the value of the pointer, which is an address, to bb.
If you define an array A, you can't make the assignment B = A to copy the contents of A to B. You must use strcpy() or memcpy() or some such function. But structs are different. You can assign the contents of one struct to a compatible struct.
In your example, bb and first are pointers to structs, and when you write bb = first, now both pointers reference the same address in memory, and you no longer have access to the memory originally referenced by bb-- so now you have a memory leak! But *bb and *first are structs, and when you write *bb = *first, the contents of the struct *first are copied to the struct *bb. So now you have two different structs, at different locations in memory, each with copies of the same three ints.
If your my_struct type contained a pointer to int, then after the assignment *bb = *first they would each contain a copy of a pointer to the same location in memory, but the data referenced by those pointers would not be copied. So, if the structs contained a pointer to an array, only the pointer would be copied, not the contents of the array, which would be shared by the two structs.
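To illustrate that last paragraph, here is a sketch with a hypothetical variant of the struct, my_struct2, that owns an int array. Plain assignment shares the array, while the made-up helper my_struct2_copy gives the destination its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of my_struct with a pointer member. */
typedef struct {
    int a;
    int *data;      /* points to len ints */
    size_t len;
} my_struct2;

/* Deep copy: dst gets its own array. Returns 0 on success, -1 on failure. */
int my_struct2_copy(my_struct2 *dst, const my_struct2 *src)
{
    assert(dst != NULL && src != NULL);
    int *copy = NULL;
    if (src->len > 0) {
        copy = malloc(src->len * sizeof *copy);
        if (copy == NULL)
            return -1;
        memcpy(copy, src->data, src->len * sizeof *copy);
    }
    *dst = *src;        /* copies a, len and, for the moment, the shared pointer */
    dst->data = copy;   /* replace the shared pointer with the private array */
    return 0;
}
```

The caller is responsible for calling free on the data member of a struct produced by my_struct2_copy.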
you need to copy the data pointed by pointers like *p1 = *p2. But Remember, this will not work if you have pointers again inside the structures you are copying.

Save pointer address to dereferenced pointer / own malloc function

I'm stuck, maybe on a very simple question.
At university we have to write our own malloc function in C. My only problem is storing the pointer address at the dereferenced pointer. I'm working on the heap and there is enough memory left.
void *actual_pointer = sbrk(sizeof(Node));
*(char*)actual_pointer = 'O';
actual_pointer = actual_pointer+sizeof(char);
*(char*)actual_pointer = 'K';
actual_pointer = actual_pointer+sizeof(unsigned int);
*(unsigned int*)actual_pointer = size;
actual_pointer = actual_pointer+sizeof(unsigned int);
*(unsigned int*)actual_pointer = 0;
actual_pointer = actual_pointer+sizeof(unsigned int);
*actual_pointer = actual_pointer;
The last line doesn't work. I tried everything. Isn't it possible to store some pointer Address to the dereferenced pointer?
typedef struct _Node_
{
char checkCorruption_[2];
unsigned int size_;
unsigned int status_;
char *location_;
struct Node *next_;
struct Node *prev_;
} Node;
This is the structure of my doubly linked list representing the memory structure.
My idea was the following:
We need to make a simple malloc function. From the main function, for example, data[1] = malloc(100 * sizeof(int)) is called. Then, in the malloc function, I create one Node and store the "checkCorruption" value 'OK' in it, followed by the size, in my example "100 * sizeof(int)". After this I store a 0 for used or a 1 for free. Then I store the location that is returned to data[0]; the storage gets reserved with sbrk(100*sizeof(int)) and begins at that location. Then I store the pointers to the next and the previous Node.
I always check the OK value to see whether some other malloc had an overflow and overwrote it; if so, I exit with an error.
Is my Idea totally bullshit or is it ok?
Edit2:
When I now use Node instead of void, I can also store my location pointer in the node.
Node *actual_pointer = sbrk(sizeof(Node));
actual_pointer->checkCorruption_[1] = 'O';
printf("actual_pointer: %p\n", actual_pointer);
printf("actual_O: %c\n", actual_pointer->checkCorruption_[1]);
printf("actual_pointer_before: %p\n", actual_pointer);
actual_pointer = actual_pointer+sizeof(char);
printf("actual_pointer_after: %p\n", actual_pointer);
Output:
actual_pointer: 0x1ad4000
actual_O: O
actual_pointer_before: 0x1ad4000
actual_pointer_after: 0x1ad4028
But now I have a problem with actual_pointer = actual_pointer+sizeof(char);. This statement should add the size of a char to actual_pointer, but it increases the pointer by 40 bytes. I don't understand this.
Thanks in Advance,
Philipp
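It is impossible to store a value through a void pointer: dereferencing a void * gives an incomplete type, so you have to cast to a pointer-to-object type first.
Try replacing the last line with the following (this assumes actual_pointer is suitably aligned for a pointer at that point):
*(void **)actual_pointer = actual_pointer;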
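Two points from the question can be sketched in a few lines. First, arithmetic on a Node * is scaled by sizeof(Node), so adding sizeof(char), which is 1, advances by one whole node; that matches the 40-byte (0x28) jump in the output. Second, a pointer value can be written at an arbitrary byte address with memcpy, which sidesteps alignment concerns. The simplified Node type and the helper names below are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node type used only for this illustration. */
typedef struct Node {
    char tag[2];
    unsigned int size;
    struct Node *next;
} Node;

/* Byte distance between p and p + 1: this is sizeof(Node), not 1. */
size_t node_step_in_bytes(Node *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Stores the pointer value 'value' in the bytes starting at 'where'. */
void store_pointer(void *where, void *value)
{
    assert(where != NULL);
    memcpy(where, &value, sizeof value);    /* works even if 'where' is unaligned */
}

/* Reads back a pointer previously written by store_pointer. */
void *load_pointer(const void *where)
{
    void *value;
    memcpy(&value, where, sizeof value);
    return value;
}
```

To step through a block byte by byte, the pointer would be converted to char * (or unsigned char *) before adding offsets.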

Changing values in elements of an array of structs

I am working on an assignment and ran into a challenging problem. As far as I'm concerned, and from what I've learnt, the code that follows should be correct; however, it does not work. Basically, what I am trying to do is copy a string value into the variable member of a structure that is part of an array passed into a function as a pointer. What am I missing?
typedef struct
{
char * name; //variable in struct I am trying to access
} Struct;
void foo(Struct * arr) //array of Structs passed into function as a pointer
{
int i = 0;
while(i++ < 2)
{
arr[i].name = malloc(sizeof(char *)); //assigning memory to variable in each Struct
arr[i].name = strdup("name"); //copying "name" to variable in each Struct
printf("C - %s\n", arr[i].name); //printing out name variable in each Struct
}
}
main()
{
Struct * arr; //defining pointer
arr = calloc(2, sizeof(Struct)); //allocating memory so pointer can hold 2 Structs
foo(arr); //calling function foo passing pointer into function
return 0;
}
This code compiles and runs however it does not do what it is designed to do. Forgive me if it is something trivial. I am new to the language C
Two issues:
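1. while(i++ < 2): i is incremented immediately after the comparison, so inside the loop body i is one larger than the value that was checked. The body runs with i equal to 1 and 2, which skips arr[0] and writes to arr[2], past the end of the two-element allocation.
2. arr[i].name = strdup("name"); overwrites the value of the .name pointer, causing a memory leak of the memory you malloc()'ed earlier.
Expanding on point 2, which was already pointed out correctly: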
arr[i].name = strdup("name");
Even if you use the following instead of the above,
strcpy(arr[i].name, "name");
you haven't allocated enough bytes to store the string, i.e. this is wrong:
arr[i].name = malloc(sizeof(char *));
// even if pointer is 8 byte here, concept isn't right
Should be something like
arr[i].name = malloc(strlen("name")+1);
// or MAX_SIZE where it is greater than the possible "name".
Or better yet, remove the malloc altogether; strdup takes care of the allocation itself.
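Putting both fixes together, a corrected version of the function might look like the sketch below. The names fill_names, free_names and dup_string are hypothetical; dup_string stands in for POSIX strdup so that the example needs only ISO C headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *name;
} Struct;

/* Hypothetical stand-in for POSIX strdup. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;           /* +1 for the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Gives each of the count elements its own copy of text.
   Returns 0 on success, -1 if an allocation fails (names filled
   so far stay allocated and can be released with free_names). */
int fill_names(Struct *arr, size_t count, const char *text)
{
    assert(arr != NULL && text != NULL);
    for (size_t i = 0; i < count; ++i) {    /* body sees i = 0 .. count-1 */
        arr[i].name = dup_string(text);     /* one allocation, no leak */
        if (arr[i].name == NULL)
            return -1;
    }
    return 0;
}

/* Releases every name and resets the pointers. */
void free_names(Struct *arr, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        free(arr[i].name);
        arr[i].name = NULL;
    }
}
```

Allocating the array with calloc, as the question does, keeps the unused name pointers NULL, so free_names is safe to call even after a partial failure.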
This is not answering your question directly, but addresses an issue too big to put into a comment...
Additional issue: You probably did not intend to allocate only a (char *) worth of memory to a variable intended to hold at least "name". Change:
arr[i].name = malloc(sizeof(char *));
to:
arr[i].name = malloc(sizeof(char)*strlen("name")+1); //+1 for '\0'
or better yet, use char *name="name";, then:
arr[i].name = malloc(sizeof(char)*strlen(name)+1);
Even more general (and better):
char *name;
name = malloc(strlen(someInputString)+1);
//do stuff with name...
free(name);
Now, you can allocate name to any length needed based on the length of someInputString.
[EDIT]
Etienz, I wanted to address one more thing, alluded to by @H2CO3 above but not really explained, that I think might be useful to you:
Regarding your desire to have room for two structs: because you typedef'd your struct, you can simply do something like the following (I am going to change the name you used from Struct to NAME :). The whole point is that when structs are declared as an array, you do not need calloc or malloc to create space for them; it is done as shown below...
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct{
char *name;
}NAME;
//use new variable type NAME to create global variables:
NAME n[2], *pN; //2 copies AND pointer created here
//prototype func
int func(NAME *a);
int main()
{
pN = &n[0]; //pointer initialized here
func(pN); //pointer used here (no malloc or calloc)
printf("name1 is %s\nname 2 is %s", pN[0].name, pN[1].name);
return 0;
}
int func(NAME *a)
{
char namme1[]="andrew";
char namme2[]="billebong";
//You DO have to allocate the members though
a[0].name = malloc(strlen(namme1)+1);
a[1].name = malloc(strlen(namme2)+1);
strcpy(a[0].name, namme1);
strcpy(a[1].name, namme2);
return 0;
}

Copy two structs in C that contain char pointers

What is the standard way to copy two structs that contain char arrays?
Here is some code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct {
char* name;
char* surname;
} person;
int main(void){
person p1;
person p2;
p1.name = (char*)malloc(5);
p1.surname = (char*)malloc(5);
strcpy(p1.name, "AAAA");
strcpy(p1.surname, "BBBB");
memcpy(&p2, &p1, sizeof(person));
free(p1.name);
printf("%s\n", p2.name);
return 0;
}
The line printf("%s\n", p2.name); does not print something, because I freed the buffer.
The problem with my structs is that they are bigger than struct person. They contain hundreds of char pointers, and I have to copy every member one by one.
Is there another way to copy two structs that contain char arrays without using malloc and strcpy for every member?
You have no choice but to provide a copy function yourself:
void copy_person(person *dst, const person *src)
{
dst->name = malloc(strlen(src->name) + 1);
dst->surname = malloc(strlen(src->surname) + 1);
strcpy(dst->name, src->name);
strcpy(dst->surname, src->surname);
}
which may be more elaborate than that: checking for errors, factoring the strlen + strcpy into an auxiliary function, etc.
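A more elaborate version along those lines could look like this sketch. The helper clone_string and the companion free_person are hypothetical additions, and the error-handling policy (return -1 and leave dst empty) is just one possible choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *name;
    char *surname;
} person;

/* Hypothetical helper: strlen + malloc + copy in one place. */
static char *clone_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Releases the strings owned by p and clears the pointers. */
void free_person(person *p)
{
    free(p->name);
    free(p->surname);
    p->name = NULL;
    p->surname = NULL;
}

/* Deep copy. Returns 0 on success; on failure returns -1 and leaves
   dst holding NULL pointers, with nothing leaked. */
int copy_person(person *dst, const person *src)
{
    assert(dst != NULL && src != NULL);
    dst->name = clone_string(src->name);
    dst->surname = clone_string(src->surname);
    if (dst->name == NULL || dst->surname == NULL) {
        free_person(dst);   /* free(NULL) is a no-op, so this is safe */
        return -1;
    }
    return 0;
}
```

With a struct that has hundreds of string members, the same helper is simply called once per member, so each member costs one line in the copy function and one in the free function.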
That's what copy constructors in C++ are for.
Yes, copying structs that contain char arrays will work without any problem, but structs with char pointers (or any type of pointer, for that matter) you will have to copy manually.
Also note that the cast of malloc's return type is not needed in C (it is in C++) and can hide a missing prototype for malloc.
To elaborate on the answer of Alexandre C. you might want to do the malloc() as a single operation so that a free() is also simple.
This approach provides a degree of protection in that the single malloc() will either succeed or fail so that you would not have a problem of malloc() failing midway through constructing a copy. With this approach you would mix person with pointers to person that have been malloced so you might want to have two different data types something along the lines of the following in order to better mark which is which.
I have provided two alternatives for the copying with one using C Standard library functions strcpy() and strlen() and the other using a simple function that does a straight copy and returns a pointer to where it left off in the destination buffer.
I have not tried to compile this example so there may be problems with it.
There is one possible concern with this approach. Since the individual strings are not malloced you may run into a problem if you are moving the individual strings around using their pointers with the idea that each of the individual strings is its own malloced area of memory. This approach assumes the entire object is wanted or none of it is wanted.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct {
char* name;
char* surname;
char* address1;
} person, *personptr;
// copy a string to destination string return pointer after end of destination string
char * StrCpyRetEnd (char *pDest, char *pSrc)
{
while (*pDest++ = *pSrc++);
return pDest;
}
personptr DeepCopyPerson (person *pSrc)
{
personptr pDest = 0;
size_t iTotalSize = sizeof(person);
iTotalSize += (strlen(pSrc->name) + 1) * sizeof(char);
iTotalSize += (strlen(pSrc->surname) + 1) * sizeof(char);
iTotalSize += (strlen(pSrc->address1) + 1) * sizeof(char);
pDest = malloc(iTotalSize);
if (pDest) {
#if 1
// alternative one without a helper function
pDest->name = (char *)(pDest + 1); strcpy (pDest->name, pSrc->name);
pDest->surname = pDest->name + strlen(pDest->name) + 1; strcpy (pDest->surname, pSrc->surname);
pDest->address1 = pDest->surname + strlen(pDest->surname) + 1; strcpy (pDest->address1, pSrc->address1);
#else
// alternative two using StrCpyRetEnd () function
pDest->name = (char *)(pDest + 1);
pDest->surname = StrCpyRetEnd (pDest->name, pSrc->name);
pDest->address1 = StrCpyRetEnd (pDest->surname, pSrc->surname);
strcpy (pDest->address1, pSrc->address1);
#endif
}
return pDest;
}
int main(void){
person p1; // programmer managed person with separate mallocs
personptr p2; // created using DeepCopyPerson()
p1.name = malloc(5);
p1.surname = malloc(5);
p1.address1 = malloc(10);
strcpy(p1.name,"AAAA");
strcpy(p1.surname,"BBBB");
strcpy(p1.address1,"address1");
p2 = DeepCopyPerson (&p1);
free(p1.name);
printf("%s\n", p2->name);
free (p2); // frees p2 and all of the memory used by p2
return 0;
}
You have to allocate memory to any pointer if you want to do a copy. However you can always make a pointer point to already allocated memory. For example, you can do the following:
p2.name = p1.name (p1.name is already allocated memory)
This is dangerous, as there is more than one reference to the same memory location. If you free either p1.name or p2.name, the other one becomes a dangling pointer.
In order to copy the entire content you have to allocate memory to the pointers of the struct p2.
p2.name = <allocate memory>
Copy individual struct members instead of a memcpy of the entire struct
This is because the struct and the strings its members point to are not allocated contiguously. Also, sizeof(struct) gives you the size of the struct's own members (the pointers), not the size of the memory they point to.
For example, sizeof(p2) = sizeof(p1) = sizeof(person), which is 8 with 4-byte pointers (16 with 8-byte pointers), even after allocating memory to the members of p1.
It would be a different case had the members been char arrays.
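The char-array case can be sketched as follows, assuming a hypothetical person_arr type with fixed 16-byte buffers: the strings live inside the struct, so plain assignment (or memcpy of sizeof(person_arr) bytes) copies them as well.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant of person whose strings live inside the struct. */
typedef struct {
    char name[16];
    char surname[16];
} person_arr;

/* Returns 1 if a plain assignment produced a fully independent copy. */
int array_members_copy_by_value(void)
{
    person_arr p1 = { "AAAA", "BBBB" };
    person_arr p2 = p1;             /* copies the arrays' bytes too */

    assert(sizeof p2 == sizeof(person_arr));
    strcpy(p1.name, "ZZZZ");        /* later changes to p1 do not reach p2 */

    return strcmp(p2.name, "AAAA") == 0
        && strcmp(p2.surname, "BBBB") == 0;
}
```

The trade-off is a fixed maximum length per string and a larger struct, in exchange for copies that need no malloc or free.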
A bit out-of-the-box thinking:
Since the structure of your struct is static, you could write a small utility program or script to generate the copy code for you.
Take the source-code of your struct definition as input, and then devise a set of rules to generate the copying code.
This is a quick shot, and I don't know whether it would be faster to just write the copy code manually, but at least it is a more interesting problem.

Resources