Why does a struct shallow copy not work?

I'm testing a shallow copy of a struct with this code:
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
int main() {
struct str {
char * name;
int value;
};
struct str str_1 = {"go", 255};
struct str str_2;
str_2 = str_1;
str_1.name = "back";
printf("%s\n",str_1.name);
printf("%s\n",str_2.name);
printf("\n");
system("pause");
return 0;
}
I expected the result to be:
back
back
But it was:
back
go
Edit: I expected that because, with a shallow copy, str_1.name and str_2.name should always point to the same place.
Edit: And with dynamic allocation, I got what I expected:
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <string.h>
int main() {
struct str {
char * name;
int value;
};
struct str str_1;
struct str str_2;
str_1.name = (char*) malloc(5);
strcpy(str_1.name,"go");
str_2 = str_1;
strcpy(str_1.name,"back");
printf("%s\n",str_1.name);
printf("%s\n",str_2.name);
printf("\n");
system("pause");
return 0;
}
The result is:
back
back
What am I misunderstanding here?

Take a piece of paper and draw what you think is happening at each step slowly, and this should become clear.
Let's draw struct str as so:
---------------------
| const char * name |
---------------------
| int value |
---------------------
And let's denote a string at address 0xabcdef10 as so:
--0xabcdef10---------------------
| "string\0" |
---------------------------------
So, when we initialise str_1, we need some memory location which will hold "go"; let's call that 0xaaaaaa00:
--0xaaaaaa00---------------------
| "go\0" |
---------------------------------
Then we initialise str_1 with a pointer to that:
struct str str_1 = {"go", 255};
--str_1---------------------------
| const char * name = 0xaaaaaa00 |
----------------------------------
| int value = 255 |
----------------------------------
Now we take a shallow copy of str_1, and call it str_2:
--str_2---------------------------
| const char * name = 0xaaaaaa00 |
----------------------------------
| int value = 255 |
----------------------------------
Next we execute str_1.name = "back";. As before, we first want to create the new string. Let's put that at 0xbbbbbb00:
--0xbbbbbb00---------------------
| "back\0" |
---------------------------------
Then we assign it to str_1.name, so str_1 now looks like:
--str_1---------------------------
| const char * name = 0xbbbbbb00 |
----------------------------------
| int value = 255 |
----------------------------------
Note that we haven't changed str_2.
So, when we look at our final "memory", we see:
--0xaaaaaa00---------------------
| "go\0" |
---------------------------------
....
--0xbbbbbb00---------------------
| "back\0" |
---------------------------------
....
--str_1---------------------------
| const char * name = 0xbbbbbb00 |
----------------------------------
| int value = 255 |
----------------------------------
--str_2---------------------------
| const char * name = 0xaaaaaa00 |
----------------------------------
| int value = 255 |
----------------------------------
So str_1 points at the new string, and str_2 points at the old string.
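In the case you describe as dynamic, you never update the pointers in the structs; both name members keep pointing at the same allocated buffer, and strcpy changes its contents. You could go through the same exercise to draw out what happens to memory in that case.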

str_2 = str_1; did take a shallow copy.
But that doesn't mean that any subsequent change to where str_1.name points will automatically be reflected in str_2.
(Really, you should use const char* as the member type, since you are assigning read-only string literals.)

str_2 = str_1; makes a hard copy of the struct itself. For example the value member will be unique for every struct.
But you got a soft copy of any pointer members, since the copy does not affect the pointed-at data of any pointer. Meaning that after the copy, the name pointer of both str_1 and str_2 points at the literal "go".
And then str_1.name = "back"; only changes where str_1.name points at. It doesn't change where str_2.name points at.
When you used malloc and strcpy, both name pointers still pointed at the same allocated buffer. strcpy changed the pointed-at data, not the pointers, so the change was visible through both structs.

Related

Creating a singly-linked list

I have a function to join two structs to create a linked list.
Here, is the code:
struct point{
int x;
int y;
struct point *next;
};
void printPoints(struct point *);
void printPoint(struct point *);
struct point * append(struct point *, struct point *);
void main(){
struct point pt1={1,-1,NULL};
struct point pt2={2,-2,NULL};
struct point pt3={3,-3,NULL};
struct point *start, *end;
start=end=&pt1;
end=append(end,&pt2);
end=append(end,&pt3);
printPoints(start);
}
void printPoint(struct point *ptr){
printf("(%d, %d)\n", ptr->x, ptr->y);
}
struct point * append(struct point *end, struct point *newpt){
end->next=newpt;
return newpt;
}
void printPoints(struct point *start){
while(start!=NULL){
printPoint(start);
start=start->next;
}
}
Here, the append function's task involves changing the end pointer.
Both the arguments of append function are pointers; in 1st case, 1st argument is &pt1 and 2nd argument is &pt2.
The function makes a copy of the end pointer which has the type struct point.
Since &pt1 is passed then this duplicate end pointer has x component as 1 and y component as -1 and next component as NULL.
Now we change this copy's next component to newpt pointer and return the newpt pointer.
Back to the main function, the original end pointer now has the value of &pt2.
end->next = newpt; shouldn't produce any change in the original end pointer in main because only the local pointer was changed.
So then why do I get a linked list.
What I get:
(1, -1)
(2, -2)
(3, -3)
What I think I should get:
(1, -1)
end->next = newpt; shouldn't produce any change in the original end pointer in main. Because, only the local pointer was changed
Not quite correct. It is true that when you call append, a copy of end is made. However, the -> operator dereferences the pointer and accesses what it points to; you would get the same behavior with (*end).next. Since the end in main holds the same address as the end in append, they both point to the same thing. You could have 100 copies of a pointer, all pointing to the same thing. If you choose one, follow what it points to and change that, then you've changed the same thing that all other 99 pointers point to. Furthermore, you reassign end in main by returning newpt, so each call to append results in an updated end. The output you observe is correct. Consider the condensed stack frames:
In main, at first call to append:
____main____
|___pt1____| <----+ <-+
|x=1 y=-1 | | |
|_next=NULL| | |
|___pt2____| | |
|___pt3____| | |
|__start___|------+ |
|___end____|-----------+ // cell sizes NOT drawn to scale
// start and end both point to pt1
Now, on the first call to append, the main stack frame stays the same, and a new one is created for append, where end and the address to pt2 are passed in.
|___main____
|___pt1____| <----+ <-+
|_next=NULL| | | // x and y omitted for brevity
|___pt2____| | |
|___pt3____| | |
|___end____|------+ |
|
___append__ |
|___&pt2___| |
|___end____|-----------+ // also points to pt1 back in main
When you use the -> operator, you dereference what that pointer points to. In this case, pt1, so both end in main and end in append point to pt1. In append, you do
end->next = newpt;
which is the address of pt2. So now your stack frames look like this:
|___main____
|___pt1____| <-----------+ <-+
|_next=&pt2|------+ | | // x and y omitted for brevity
| | | | | // (made pt1 cell bigger just to make the picture clearer, no other reason)
|__________| | | |
|___pt2____| <----+ | |
|___pt3____| | |
|___end____|-------------+ |
|
___append__ |
|___&pt2___| |
|___end____|------------------+ // also points to pt1 back in main
Finally, when you return from append, you return the address of pt2 and assign it to end, so your stack in main looks like this before the 2nd call to append (again, some cells made larger for picture clarity, this does not suggest anything grew in size):
____main____
|___pt1____| <-----------+
|_next=&pt2|---+ |
| | | |
|__________| | |
|___pt2____| <-+ <-+ |
|___pt3____| | |
|___end____|-------+ |
|___start__|-------------+ // flipped start and end position to make diagram cleaner, they don't really change positions on the stack
And you do it all again with your next call to append, passing in end (which now points to pt2) and the address of pt3. After all the calls to append, start points to pt1, pt1.next points to pt2, and pt2.next points to pt3, just as you see in the output.
One final note: main has an incorrect function signature; it should be int main(void).
As the illustration shows, start still points to pt1 and end points to pt3.
void main(){
struct point pt1={1,-1,NULL};
struct point pt2={2,-2,NULL};
struct point pt3={3,-3,NULL};
struct point *start, *end;
start=end=&pt1;
end=append(end,&pt2);
end=append(end,&pt3);
printPoints(start);
}
In main, you make start and end both point at pt1.
That's why any change made through end is also seen from start.
struct point * append(struct point *end, struct point *newpt){
end->next=newpt;
return newpt;
}
In the append function, end->next=newpt sets the next member of the node that end points to. In the first call, when end points to pt1, pt1's next is set to point at pt2. This change in the list is also seen from start.
Hence, the output you are getting is correct.
Changing Pointers
When you pass a pointer to a function, the value of the pointer (that is, the address) is copied into the function, not the value it points at.
So, when you dereference the pointer and change what it points to, the change is also seen through any pointer which contains the same address.
Remember that p->next is the same as (*p).next.
void change_the_pointer_reference(int* ip)
{
int i = 1;
*ip = i;
printf("%d\n", *ip); // outputs 1
}
int main()
{
int i = 0;
change_the_pointer_reference(&i);
printf("%d\n", i); // outputs 1
}
But as the value of the pointer is copied, if you assign to the pointer, this change is only seen locally.
void change_the_pointer(int* ip)
{
int i = 1;
ip = &i;
printf("%d\n", *ip); // outputs 1
}
int main()
{
int i = 0;
change_the_pointer(&i);
printf("%d\n", i); // outputs 0
}
One last note: the signature of main is incorrect; it should return int.

When passing a pointer as an argument is the address & symbol not required / automatically added by compiler?

In the code below, a pointer is passed to a function.
The address-of operator & is omitted.
The functions can still modify the value in the passed pointer.
Does the compiler auto-add the &?
char * input();
void get_command() {
char * command_pointer = input();
char * second = second_arg(command_pointer); //second needs to be before first
char * first = first_arg(command_pointer); //because first adds a '\0'
choose(first, second);
free(command_pointer);
}
char * first_arg(char * arg) {
char * j = strchr(arg, '-');
if(j==NULL) {
return arg;
}
*j = '\0';
return arg;
}
char * second_arg(char * arg) {
char * j = strchr(arg, '-');
if( j==NULL ) {
return NULL;
}
return ++j;
}
Does the compiler auto-add the &?
No. & gets the address of a variable, and the address of command_pointer isn't passed. Only its value (the address of the buffer) is passed.
The functions can still modify the value in the passed pointer.
No, they can't. They can't modify (the value of) the pointer. They can only modify that to which the pointer points.
first_arg and second_arg can't change command_pointer (the variable in get_command) because they don't know anything about it and they don't have a pointer to it.
e.g. arg = NULL; would have no effect on command_pointer.
However, they can modify the data to which command_pointer points (the chars), because they have a copy of the same pointer.
e.g. *arg = 0; (aka arg[0] = 0;) would affect *command_pointer (aka command_pointer[0]).
Let's illustrate some of the variables of get_command and first_arg.
input()'s rv Buffer allocated by input
+----------+ +---+---+---+---+---+---+---
| 0x1000 -------+--->| | | | | | | ...
+----------+ | +---+---+---+---+---+---+---
| ^
command_pointer | |
+----------+ | |
| 0x1000 -------+ |
+----------+ | |
| |
arg | j |
+----------+ | +----------+ |
| 0x1000 -------+ | 0x1004 ---------+
+----------+ +----------+
input() returns the address of a buffer it allocated (0x1000 in the diagram).
The value returned by input() is copied into command_pointer.
The value of command_pointer is copied into arg.
A value based on arg is stored in j.
get_command can modify command_pointer and the buffer.
first_arg can modify arg, j and the buffer, but not command_pointer.
The pointed-to value changes only if you apply * to the pointer. Without it, the assignment changes the address stored in the pointer, not the value. Consider these functions:
char * first_arg(char * arg) {
arg = 'a';
// some other code //
}
char * second_arg(char * arg) {
*arg = 'c';
// some other code //
}
The first function changes the address stored in arg, which is only the local copy of the pointer. It does not change the value that arg originally pointed at (unless you hard-code the exact address the pointer already held), and the caller's pointer is unaffected. Dereferencing arg afterwards would be dangerous, because we don't know what is stored at the address 'a'; writing there could change the behavior of other scopes in unexpected ways.
The second function changes the value at the address arg points to. This is the well-defined behavior we are looking for: it changes only the memory that arg points at.

Shared memory with structure and int

I want to put one structure and one int into my shared memory.
I want the int in the first position of the shared memory (since I'll need this int in other programs), followed by the structure.
This is my code:
int id = shmget( 0x82488, (sizeof(Student)) + sizeof(int) ,IPC_CREAT | 0666 );
exit_on_error (id, "Error");
int *p = shmat(id,0,0);
exit_on_null(p,"Error on attach");
Student *s = shmat(id,0,0);
exit_on_null (s,"Error");
Now comes my question: since I have two pointers, how can I make the int come first and then the structure? Should I just do
p[0]=100 s[1] = (new Student)
I would just do
int *p = shmat(id,0,0);
exit_on_null(p,"Error on attach");
Student *s = (Student*)(void*)(p + 1);
so that s points to where the next int would be if the buffer held an array of int.
It is a bit tricky, but it sidesteps interoperability issues caused by padding bytes that a wrapper struct might contain. It does assume that a Student may be placed at an int-aligned address.
Example:
+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
+---+---+---+---+---+---+---+---+---+---+
In this case, p points to location 0 (relative to the start of the buffer), and thus p + 1 points to position 4 (if an int has 32 bits). Casting p + 1 the way I do makes s point to this place, but with type Student *.
And if you want to add a structure struct extension, you do the same:
struct extension *x = (struct extension*)(void*)(s + 1);
This points immediately after the Student and, again, has the correct pointer type.

I have three loops over an array of (char*) elements in C. Why does the third fail?

While experimenting with methods for stepping through an array of strings in C, I developed the following small program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef char* string;
int main() {
char *family1[4] = {"father", "mother", "son", NULL};
string family2[4] = {"father", "mother", "son", NULL};
/* Loop #1: Using a simple pointer to step through "family1". */
for (char **p = family1; *p != NULL; p++) {
printf("%s\n", *p);
}
putchar('\n');
/* Loop #2: Using the typedef for clarity and stepping through
* family2. */
for (string *s = family2; *s != NULL; s++) {
printf("%s\n", *s);
}
putchar('\n');
/* Loop #3: Again, we use the pointer, but with a unique increment
* step in our for loop. This fails to work. Why? */
for (string s = family2[0]; s != NULL; s = *(&s + 1)) {
printf("%s\n", s);
}
}
My specific question involves the failure of Loop #3. When run through the debugger, Loops #1 and #2 complete successfully, but the last loop fails for an unknown reason. I would not have asked this here, except that it shows me I have some critical misunderstanding regarding the "&" operator.
My question (and current understanding) is this: family2 is an array-of-pointer-to-char. Thus, when s is set to family2[0] we have a (char*) pointing to "father". Therefore, taking &s should give us the equivalent of family2, pointing to the first element of family2 after the expected pointer decay. Why doesn't, then,
*(&s + 1) point to the next element, as expected?
Many thanks,
lifecrisis
EDIT -- Update and Lessons Learned:
The following list is a summary of all of the relevant facts and interpretations that explain why the third loop does not work like the first two.
s is a separate variable holding a copy of the value (a pointer-to-char) from the variable family2[0]. I.e., these two equivalent values are positioned at SEPARATE locations in memory.
family2[0] up to family2[3] are contiguous elements of memory, and s has no presence in this space, though it does contain the same value that is stored in family2[0] at the start of our loop.
These first two facts mean that &s and &family2[0] are NOT equal. Thus, adding one to &s will return a pointer to unknown/undefined data, whereas adding one to &family2[0] will give you &family2[1], as desired.
In addition, the update step in the third for loop doesn't actually result in s stepping forward in memory on each iteration. This is because &s is constant throughout all iterations of our loop. This is the cause of the observed infinite loop.
Thanks to EVERYONE for their help!
lifecrisis
When you do s = *(&s + 1) the variable s is a local variable in an implicit scope that only contains the loop. When you do &s you get the address of that local variable, which is unrelated to any of the arrays.
The difference from the previous loop is that there s is a pointer to the first element in the array.
To explain it a little more "graphically" what you have in the last loop is something like
+----+ +---+ +------------+
| &s | ---> | s | ---> | family2[0] |
+----+ +---+ +------------+
That is, &s is pointing to s, and s is pointing to family2[0].
When you do &s + 1 you effectively have something like
+------------+
| family2[0] |
+------------+
^
|
+---+----
| s | ...
+---+----
^ ^
| |
&s &s + 1
Pictures help a lot:
+----------+
| "father" |
+----------+ +----------+ +-------+ NULL
/-----------→1000 | "mother" | | "son" | ↑
+-----+ ↑ +----------+ +-------+ |
| s | ? | 2000 2500 |
+-----+ | ↑ ↑ |
6000 6008 +----------------+----------------+--------------+--------------+
| family2[0] | family2[1] | family2[2] | family2[3] |
+----------------+----------------+--------------+--------------+
5000 5008 5016 5024
( &s refers to 6000 )
( &s+1 refers to 6008 but )
( *(&s+1) invokes UB )
Addresses chosen as random integers for simplicity
The thing here is that, although both s and family2[0] point to the same base address of the string literal "father", the two pointers aren't related to each other; each is stored at its own memory location. *(&s+1) != family2[1].
You hit UB when you do *(&s + 1) because &s + 1 is a memory location you're not supposed to tamper with, i.e, it doesn't belong to any object you created. You never know what's stored in there => Undefined Behavior.
Thanks @2501 for pointing out several mistakes!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef char* string;
int main() {
char *family1[4] = { "father", "mother", "son", NULL };
string family2[4] = { "father", "mother", "son", NULL };
/* Loop #1: Using a simple pointer to step through "family1". */
for (char **p = family1; *p != NULL; p++) {
printf("%s\n", *p);
}
putchar('\n');
/* Loop #2: Using the typedef for clarity and stepping through
* family2. */
for (string *s = family2; *s != NULL; s++) {
printf("%s\n", *s);
}
putchar('\n');
/* Loop #3: Again, we use the pointer, but with a unique increment
* step in our for loop. This fails to work. Why? */
/*for (string s = family2[0]; s != NULL; s = *(&s + 1)) {
printf("%s\n", s);
}
*/
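for (int j = 0; j < 3; j++)
{
printf("%p ", (void *)family2[j]);
printf("%zu\n", strlen(family2[j]));
}
printf("\n");
int i = 0;
/* Non-portable: assumes the compiler stores the literals back to back
 * with one byte of padding; the standard guarantees no such layout. */
for (string s = family2[i]; i != 3; s = (s + strlen(family2[i]) + 2),i++) {
printf("%p ", (void *)s);
printf("%s\n", s);
}
system("pause");
}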
}
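This example is revised from your code. If you run it, you will see how the addresses held by the pointer s and by the elements of family2 change, which should clarify what loop #3 does. Note that the last loop relies on how the compiler happens to lay out the string literals, so it is not portable.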

Assigning same values to 2 pointers, not equalising their references (pointers in C)

printf("%p\n\n", element); //0x1000020c0
table->head->element = element;
printf("%p\n\n", table->head->element); //0x1000020c0
I have a pointer to a struct which points to another struct in which a char* variable is stored. The problem is that the pointer (char * element) passed to this method is modified somewhere else, and I don't want those modifications to affect table->head->element. Simply put, I want to make their values equal, not the references.
I know that we can assign the same value through two pointers like this: *p1=*p2. However, I am not sure how to do that with structs. I tried:
*(table->head->element) = *element;
But it did not work.
I hope I could clarify my question.
If you want the pointed-to string to be a copy, not simply a reference, you'll need to strcpy() it into some new memory, e.g.
size_t len = strlen(element);
table->head->element = malloc(len+1); // +1 for string-terminating null
strcpy(table->head->element, element);
(or use strdup() for a one-line solution, as pointed out in R Sahu's reply).
Responding to OP's comments on the answer by @GrahamPerks.
That will work, but I don't want to use additional copying, because we can do it with a pointer
Let's say the memory used by element looks like below:
element +---------+ +---+---+---+---+---+---+---+---+------+
| ptr1 | -> | a | | s | t | r | i | n | g | '\0' |
+---------+ +---+---+---+---+---+---+---+---+------+
If you use:
table->head->element = element;
the value of table->head->element is ptr1.
If, some time later, you changed the contents of the memory ptr1 points to through element, such as by using:
fscanf(file, "%s", element);
You could end up with:
element +---------+ +---+---+---+---+---+---+---+---+---+---+------+
| ptr1 | -> | n | e | w | | s | t | r | i | n | g | '\0' |
+---------+ +---+---+---+---+---+---+---+---+---+---+------+
At that point, the string that you see from table->head->element is "new string", not "a string".
This is what happens when you don't copy the contents of element but just copy the pointer value of element.
If you want the value of table->head->element to remain "a string" while the value of element changes, you have to copy the contents of element using
size_t len = strlen(element);
table->head->element = malloc(len+1);
strcpy(table->head->element, element);
or, if your compiler supports strdup, by using
table->head->element = strdup(element);
