Understanding strings and arrays

Consider this code.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *s, *t;
s = malloc(4 * sizeof(char));
strcpy(s, "foo");
t = s;
printf("%s %s\n", s, t); // Output --> foo foo
strcpy(s, "bar"); // the block s points to now holds "bar"
printf("%s %s\n", s, t); // Output --> bar bar
}
There are two strings, s and t. First I set s to "foo" and then make t point to s. When I print the strings, I get foo foo.
Then I copy "bar" to s and print again, and I get bar bar.
Why does the value of t change in this case? (I copied "bar" to s, so why did t change?)
Now when I change strcpy(s, "bar") to s = "bar":
int main()
{
char *s, *t;
s = malloc(4 * sizeof(char));
strcpy(s, "foo");
t = s;
printf("%s %s\n", s, t); // Output --> foo foo
s = "bar";
printf("%s %s\n", s, t); // Output --> bar foo
}
This code gives me foo foo and bar foo.
Why didn't it change in this case?

This is undefined behaviour, which means anything can happen:
char *s, *t;
strcpy(s, "foo");
as strcpy() is writing to a random location in memory because s is an uninitialised pointer.
(after edit that corrected undefined behaviour)
Question 1 - Why does the value of t change in this case? (I copied "bar" to s, so why did t change?)
This is a pointer assignment:
t = s;
and results in both t and s pointing to the same memory, the block that was allocated by malloc() and assigned to s earlier. Any change to that memory is visible via both t and s.
Question 2 - Why isn't t changing in the second case?
This assigns the address of the string literal "bar" to s:
s = "bar";
so now t and s do not point to the same memory location. t still points to the memory that was allocated by malloc() and assigned to s earlier (because of the t = s; pointer assignment).
strcpy() and = are very different:
strcpy() copies characters to the memory address specified by its first argument
assignment, =, changes the address which a pointer holds

strcpy(s, "foo");
Copies foo to the memory location pointed to by s.
t = s;
Now t and s both point to the same location, hence the same output.
Now you copy bar to s. Since both t and s point to the same location, you get the same output again.
Up to this line, both versions are the same.
s = "bar";
This creates a string constant "bar" and assigns its address to s. s is a pointer, so it can point to any memory location, not necessarily the original one.
Now s points to "bar" while t still points to the location it pointed to earlier, hence the output.

A simplistic way to understand it is as follows:
s = malloc(4 * sizeof(char)); // this can be pictured as follows
s = 123 ------------>|   |   |   |   | // garbage values at these locations
                     123 124 125 126   // some arbitrary address
strcpy(s, "foo"); // copies to the address held by s, i.e.
|'f'|'o'|'o'|'\0'|
123 124 125 126
t = s; // t now holds the address 123 as well
strcpy(s, "bar"); // copies to the same address, overwriting the previous contents, i.e.
|'b'|'a'|'r'|'\0'|
123 124 125 126
t still points to address 123, which is why both t and s print bar.
s = "bar"; // assigns a new address to s, the base address of the string literal "bar", i.e.
|'b'|'a'|'r'|'\0'|
321 322 323 324
Now s contains the address 321 while t still holds 123, which is why s and t print different values.

Related

Why is the pointer no longer referencing the memory it was assigned?

I am new to programming and I am watching a tutorial on pointers to understand how they work. In the tutorial, the instructor stated the following:
The for-loop assigns values to the address of pointer p. Then the address stored in pointer p is incremented at line 21 to the next integer chunk. This approach works but what happens to pointer p? Its reference is gone. After the for-loop, pointer p no longer references the allocated chunk of memory.
The instructor proceeds to the next lesson without explaining how the reference is gone. Can someone explain how the pointer is no longer referencing the memory it was assigned?
P.S. This is my first time using StackOverflow. Tips on how to correctly ask questions would be appreciated if I posted this one incorrectly.
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *p,x;
/* allocate storage */
p = (int *)malloc( sizeof(int) * 10 );
if( p==NULL )
{
fprintf(stderr,"Allocation failure\n");
exit(1);
}
/* fill storage */
for( x=0; x<10; x++ )
{
*p = x * 100;
/* reference the next integer location */
p++;
}
puts("Memory allocated and filled");
return(0);
}
Disclaimer: This code does not belong to me. I do not take credit for it.

After it calls malloc(), p points to the beginning of the int array that was allocated.
During the loop, it uses p++ to make it point to the next element of the array. At the end of the loop, it points to the address just past the last value in the array. So it no longer points to the beginning of the array.
If you want to do anything more with the array, such as print the contents, return it to a caller, or free it, you need the original pointer to the beginning of the allocated memory. So you need another variable that isn't updated during the loop.
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *p,x,*original_p;
/* allocate storage */
p = (int *)malloc( sizeof(int) * 10 );
original_p = p;
if( p==NULL )
{
fprintf(stderr,"Allocation failure\n");
exit(1);
}
/* fill storage */
for( x=0; x<10; x++ )
{
*p = x * 100;
/* reference the next integer location */
p++;
}
puts("Memory allocated and filled");
free(original_p);
return(0);
}
Alternatively, you can use array indexing rather than incrementing the pointer:
for( x=0; x<10; x++ )
{
p[x] = x * 100;
}

At the start of each for loop iteration, p points to somewhere in memory; that location gets assigned a value, and then p is changed so it no longer points to that location.

In the for loop the pointer p is incremented 10 times.
for( x=0; x<10; x++ )
{
*p = x * 100;
/* reference the next integer location */
p++; // <===
}
So after the for loop it no longer points into the allocated memory, because the memory was allocated for exactly 10 elements. That is, the pointer p now points just beyond the end of the allocated array.

What he's talking about is that once the pointer is incremented, the allocated block of memory can't be freed through it, because it no longer points to p[0]. Keep in mind that p (as returned by malloc) and &p[0] are the same address in this example. So he's essentially asking "how do we free the block now that we've lost track of where it starts?". Too bad he didn't expand on that before he moved on because it would have made more sense to you.
The simplest thing to do would be to make another pointer that you can use to move around in the array, then go back to the original pointer when you are done. Like this:
int *p = malloc(sizeof(int) * 10);
int *cp = p; /* current location */
Then move cp around as you see fit, incrementing, decrementing, etc. When you are done with cp, you call free(p).

Let's think about creating a string, by hand. As you may know, a string in C is just an array of characters, terminated by a special "null" character. Here's a very simple example, which I encourage you to experiment with:
char string1[10];
char *p = string1;
*p++ = 'a';
*p++ = 'b';
*p++ = 'c';
*p = '\0'; /* this is the special "null character" terminator */
printf("%s\n", string1); /* prints "abc" */
In this example, the pointer p steps along the first few cells of the array string1, filling in characters. Then, we print out the string we've just constructed.
Here is a second example of almost the same thing, except that we call malloc to obtain some dynamically-allocated memory to construct the string, rather than using an array:
char *string2 = malloc(10);
if(string2 == NULL) {fprintf(stderr, "out of memory!\n"); exit(1); }
char *p = string2;
*p++ = 'a';
*p++ = 'b';
*p++ = 'c';
*p = '\0';
printf("%s\n", string2);
This second example works almost exactly the same way the first one does, and it also prints "abc".
Now, finally, here is a third example. Pay attention to the differences.
char *p = malloc(10);
if(p == NULL) {fprintf(stderr, "out of memory!\n"); exit(1); }
*p++ = 'a';
*p++ = 'b';
*p++ = 'c';
*p = '\0';
/* but now how can we print the string? */
In the second example, the pointer string2 still pointed at the beginning of the allocated region, which was the beginning of the string we constructed, so it was possible to print it. In this third example, we have no record of that pointer, so we have no direct way to print the string.
Theoretically, since we know how many characters we placed in the string, we could cheat, and print it like this:
printf("%s\n", p - 3); /* DANGEROUS */
But this is a silly and dangerous thing to do. Normally, the thing to do is keep one pointer pointing to the beginning of the string, and use another to step along it -- that is, as we did in the second example, with string2.

The variable p itself is only a pointer to some memory. It can't point to ten things at once, only one single thing.
And what it is pointing at (initially!) is the first element of the "array" that you've allocated with malloc. After doing p++ once, p is pointing to the second element, and so on.
Somewhat graphically it's like this:
+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
+---+---+---+---+---+---+---+---+---+---+
  ^
  |
+---+
| p |
+---+
This is directly after the call to malloc. The numbers are the indexes of the elements.
Now if we do p++ once it will look like this instead:
+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
+---+---+---+---+---+---+---+---+---+---+
      ^
      |
    +---+
    | p |
    +---+
Then after a second p++:
+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
+---+---+---+---+---+---+---+---+---+---+
          ^
          |
        +---+
        | p |
        +---+
And so on...

Why is the compiler giving a warning that a pointer may be uninitialized when it's going to be initialized and won't the pointer update?

char *s, *p = s;
size_t len = 0;
while (str[len++]);
s = malloc(sizeof(*s) * (len + 1));
Why does char *s, *p = s; give a warning here, when s is going to be initialized with malloc later?
chl/string.c:9:15: warning: ‘s’ may be used uninitialized in this function [-Wmaybe-uninitialized]
9 | char *s, *p = s;
^
Since p is a pointer pointing to s, won't p be updated as well once memory is allocated for s?
Why would I have to do this instead:
char *s, *p;
size_t len = 0;
while (str[len++]);
s = malloc(sizeof(*s) * (len + 1));
p = s;
I thought pointers can change what they point to, so why isn't p being updated as a pointer? Or if I'm seeing this wrong, why can't I just do *p = s, because s is soon going to be initialized, and p will point to s, so won't p update too?

Let's break this down a little.
What you have essentially is this:
char *s;
char *p = s;
s = malloc(...);
You're proposing that when s gets initialized (by malloc's return value), the value of p should also update.
But, as you've discovered this is not the case. Initially, when you do char *s, s can point to anything. It is not yet initialized.
Subsequently, when you do char *p = s;, you are assigning the current value of s to p -- which could be anything.
If you change the value of s, that doesn't automatically change the value of p. They are distinct variables. They are both pointers - but that doesn't mean they should point to the same thing just because one was initialized from the other.
There is no intrinsic link between these two pointers, even if you assign one to the other. The point is, even if they do point to the same thing at one point in time, you can change what one points to in the future without affecting the other.
It's actually no different from assigning to a non-pointer variable and asserting that it should be updated automatically, e.g.
int i;
int j;
i = j;
j = 5;
printf("%d\n", i); // Prints rubbish
printf("%d\n", j); // Prints 5
Here, j is eventually assigned 5 and its printf is as expected. Meanwhile, i was assigned j's rubbish value, the value that happened to be lying in memory at j's location (and that could be anything; strictly speaking, reading an uninitialized variable like this is undefined behaviour in C). Yet, I doubt anyone would suggest that i should "automatically" update in this case.
UPDATE:
The following update is in response to this followup comment made:
Here's why I thought it would update.. char *s = malloc(100); char *p = s; see this, right? p[0] = 'e' for example will also change s[0], so I thought that since if assigning the element of p by index would also change the element of s by index, there would be change/update, right? How come p[0] = 'e' changes the element of both s and p, even though p just assigned the current value of malloc? They are different pointers but point to the same memory block, that's why! Am I right?
In this example, p and s again point to the same memory. When you do the assignment p[0] = 'e', you are NOT changing p or s -- you are in fact changing the value pointed to by p. And, since p and s point to the same memory, the change you've made will be visible through both p and s -- when you dereference either. Below is an in-depth example - I recommend compiling it and running it to see what gets printed, and read the comments which explain what is happening at each step.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
// this initializes s to point to some block of memory, e.g. address 0x560890f49260 when I run it locally
// it can store 100 bytes (chars) of data
char *s = malloc(100);
// this initializes p to point to the same block of memory as s => 0x560890f49260
char *p = s;
// this prints out the value of p and s
// they are of type 'pointer', so use %p
// this shows their address as being the same
printf("This is where p points to: %p\n", p);
printf("This is where s points to: %p\n", s);
// this sets the 1st byte at the location pointed to by p
// the thing we're changing is at address 0x560890f49260
// this "array" notation is just syntactic sugar for dereferencing a pointer - see below
p[0] = 'e';
// but p and s are unchanged
printf("This is where p points to: %p\n", p);
printf("This is where s points to: %p\n", s);
// this also changes the 1st byte (same as *p = 'e' and p[0] = 'e')
// here we're using the dereferencing syntax explicitly
*(p + 0) = 'e';
// and p and s are still the same
printf("This is where p points to: %p\n", p);
printf("This is where s points to: %p\n", s);
// this changes the 2nd byte (same as p[1] = 'f')
// the thing we're changing is at address 0x560890f49261 - i.e. the next byte
*(p + 1) = 'f';
// and p and s still haven't changed
printf("This is where p points to: %p\n", p);
printf("This is where s points to: %p\n", s);
// this prints the 1st and 2nd byte pointed to by p and s
// they show the same thing in both cases - since p and s point to the same thing
printf("First byte pointed to by p: %c\n", p[0]);
printf("First byte pointed to by s: %c\n", s[0]);
printf("Second byte pointed to by p: %c\n", p[1]);
printf("Second byte pointed to by s: %c\n", s[1]);
// now p is pointing to something new, e.g. address 0x5617ba3ef6e0 when I run it locally
p = malloc(100);
// we see that p **HAS** changed, but s has **NOT** changed
// they are now pointing to different things
printf("This is where p points to: %p (new location!)\n", p);
printf("This is where s points to: %p (old location!)\n", s);
// this sets the 1st byte pointed to by p to be 'g'
p[0] = 'g';
// we can see that the 1st byte pointed to by p is 'g'
printf("First byte pointed to by p: %c\n", p[0]);
// while the first byte pointed to by s is unaffected
// since p and s point to different things
printf("First byte pointed to by s: %c\n", s[0]);
// always free your memory
free(p);
free(s);
return 0;
}

Oh no, p is not pointing to s.
Your statement:
char *p = s;
It's saying "copy the value of pointer s into pointer p", so whatever address happens to be stored in s (which is not initialized) is what gets stored in p.
Once s is assigned the return value of malloc, p will keep its initial value and s will be different.

Correct way to pass the char pointer to helper function and assign the value in C

I have some questions regarding assigning a value to a char* in a helper function. If I have the following code:
int main() {
char *p = malloc(6* sizeof(char));
changeValue(p);
printf("value of p=%s\n", p);
return 0;
}
If I define the following function, it doesn't work:
void changeValue(char* input){
input = "hello";
}
My first question is: what is the reason we can't assign a value directly to a pointer?
My previous understanding is that the space for "hello" is created only within the scope of changeValue and is destroyed once changeValue returns. However, if I use a pointer to a pointer to assign the value, it works. It seems the "hello" space is not destroyed:
void changeValue(char ** input){
*input = "hello";
}
In main I need to change it to:
char **p2 = malloc(sizeof(char*));
changeValue(p2);
printf("value of p2=%s\n", *p2);
And it works properly. My second question is: what happens in the second function to make it work, and which part of my previous logic is wrong?
I also found the following way to assign the value.
The changeValue function stays the same:
void changeValue(char ** input){
*input = "hello";
}
In main I did the following:
char *p = malloc(6* sizeof(char));
changeValue(&p);
printf("value of p=%s\n", p);
This also seems to work, but it doesn't make sense to me. My third question: the input is the address of p in memory, so how does dereferencing that address and assigning a value to it work?
And what is the correct approach to assigning a char* value in a helper function?
Thanks

In the first version, 'p' points to malloc'ed memory (6 bytes). In the 2nd version, p2 points to memory that will contain an address.
In the 1st version of your helper function:
void changeValue(char* input){
input = "hello";
}
Outside of that function, 'p' is still pointing to the malloc'ed memory. When the program enters the changeValue() function, the value of 'p' is copied (typically onto the stack) into a new variable called 'input' that exists only for the duration of the call. So by assigning it the literal "hello", you have replaced the copy held in 'input' with the address of the string literal "hello"; 'p' itself is untouched.
Meanwhile, the location of 'p' is NOT the same as 'input'. Once the function returns, the memory temporarily assigned to 'input' has been popped and is no longer relevant.
Maybe a diagram can help:
At first:
char *p = malloc(6* sizeof(char));
(Stack) (Heap)
+-------+ +------------------+
| p +--------> | 6 * sizeof(char) |
+-------+ +------------------+
Next, the call to:
changeValue(p); // first version
affects the following:
(Stack) (Heap)
+-------+
| input +----+
+-------+ | +------------------+
| p +----+---> | 6 * sizeof(char) |
+-------+ +------------------+
and then:
input = "hello";
(Stack) (Heap) (DataSegment)
+-------+ +-------+
| input +----------------------------------->|"hello"|
+-------+ +------------------+ +-------+
| p +-----> | 6 * sizeof(char) |
+-------+ +------------------+
and upon exit from 'changeValue', the stack is unwound and 'input' is no longer relevant.
(Stack) (Heap)
+-------+ +------------------+
| p +--------> | 6 * sizeof(char) |
+-------+ +------------------+
and in the end of the main function, you now have a memory leak (the malloc'ed memory has not been freed).
One correct way to use the helper function is:
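#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void changeValue(char * input, size_t maxSize); // declare before use in main
int main() {
char *p = malloc(6* sizeof(char));
if( p == NULL ) return 1; // allocation failed
changeValue(p, 6* sizeof(char) );
printf("value of p=%s\n", p);
free( p ) ; // <<<<<< avoid memory leak
return 0;
}
void changeValue(char * input, size_t maxSize){
// ... Copy "hello" into the space pointed to by input, taking
// care not to overrun the memory
if( maxSize == 0 ) return;
strncpy( input, "hello", maxSize ) ;
input[maxSize - 1] = '\0'; // strncpy does not terminate if the source fills the buffer
}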

As to this question, I think the memory that stores the pointer s (whose size depends on the system) and the memory that the pointer s points to (the 6 bytes that were allocated) are two different concepts.
“The third bit of code (char *p = malloc(6 * sizeof(char)); changeValue(&p);) is wrong because the memory pointed to by p might not be large enough to hold a pointer. On systems with 64-bit pointers, sizeof(char *) is typically 8, which is greater than the 6 * sizeof(char) you allocated. By the way, sizeof(char) is 1 by definition, so malloc(6 * sizeof(char)) is the same as malloc(6). ”

Dynamically allocated char pointer in main doesn't return string when using call by reference

I am trying to pass the address of a char pointer to a function using call by reference. When I check the address held by the char pointer in main and in the function, the two are different. Why? Another thing I can't understand: with call by reference, the string updated in the function should be reflected in main as well.
void fun(char *str){
str = "hello";
printf(" str address in fun is = %p\n",str);
printf("In fun str is = %s\n",str);
}
int main(){
char *str = (char*) malloc(sizeof(10));
fun(str);
printf(" str address is = %p\n",str);
printf("In main str is = %s\n",str);
}
Output of the program is as follows:
str address in fun is = 0x804859b
In fun str is = hello
str address is = 0x839e008
In main str is =
I am not able to understand why this is happening. Can anyone explain what is actually happening in this code? Why am I not able to get the string updated in main from the function?
[Note: When I try the same code using int pointer this works fine. I am trying to understand the heap memory role in char pointer scenarios.]

I am not able to understand why this is happening.
Since pointers are also passed by value, this line
str = "hello";
overwrites the value passed from main to another value, namely, that of the location of "hello" string literal, for the duration of fun function call. That is why you see correct printouts inside fun. However, this re-assignment is not visible in main.
Replacing this line with
strcpy(str, "hello");
will fix the problem. Now both addresses and both string values will be the same.
Note: The allocation char *str = (char*) malloc(sizeof(10)); is incorrect: you don't need sizeof in there. sizeof(10) is the size of an int (typically 4 bytes), not 10. It should be char *str = malloc(10);

Assuming the working program with int looks something like this (simplified):
void fun(int *ptr)
{
*ptr = 10;
}
int main(void)
{
int var;
fun(&var);
}
This is emulating pass by reference by passing a pointer to the variable and using the dereference operator in the function to access what the pointer ptr is pointing to.
In fun you have something like this
+-----+ +--------------------------+
| ptr | ---> | var in the main function |
+-----+ +--------------------------+
That is, ptr is pointing to the var variable in the main function. By using the dereference on ptr we can modify what ptr is pointing to, which as said is the var variable.
Now let's take a simplified version of the program you have problems with:
void fun(char *ptr)
{
// 1
ptr = "hello";
// 2
}
int main(void)
{
char str[10];
fun(str);
}
At point 1 in the function above, ptr is pointing to the first element in the array str from the main function.
At point 2 (after the assignment) ptr is pointing to the string literal "hello" instead. And as ptr is a local variable that goes out of scope once the function returns, the assignment is lost.
To fix this, you can either use strcpy as answered by dasblinkenlight, or emulate pass by reference by passing a pointer to the variable:
#include <stdio.h>
void fun(char **ptr_to_ptr)
{
// Note the use of the dereference operator here
*ptr_to_ptr = "hello";
}
int main(void)
{
char str[10] = "foobar";
char *ptr = str; // Make ptr point to the first element of str
// Note the use of the address-of operator
fun(&ptr);
printf("%s\n", ptr); // Will print "hello"
printf("%s\n", str); // Will print "foobar"
}
What the above program is doing is to change what ptr in the main function is pointing at.
Important note: If you use dynamic allocation for ptr (like you do in your program) then fun as shown above will lead to a memory leak, because you no longer have the original pointer returned by malloc.

How to edit values of a char* string in C?

I just started learning C and I'm quite unsure of how to "correctly" access and edit values of a character pointer.
For example:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char* text = malloc(20);
char* othertext = "Teststring";
do
{
*text++ = *othertext++;
} while (*othertext != '\0');
printf("%s\n", text + 3);
free(text);
return 0;
}
Firstly, why does the do-while loop not work? The content of "othertext" doesn't get copied to the "text" pointer. Furthermore, the program crashes when free(text) is being executed!
We know that this code works if we add a second pointer:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char* text = malloc(20);
char* othertext = "Teststring";
char *ptr1 = text;
char *ptr2 = othertext;
do
{
*ptr1++ = *ptr2++;
} while (*ptr2 != '\0');
*ptr1 = '\0'; // the loop stops before copying the terminator, so add it
printf("%s\n", text + 3);
free(text);
return 0;
}
But both pointers have basically the same address! They have the same values in the debugger, so how does a second pointer make any difference?
As a final note: We are not allowed to use string.h. And we do know there's a subtle difference between arrays and pointers. But we need to specifically understand how char* works!

You should pass to free() the pointer returned by malloc() (the same address). You are passing the incremented pointer: text no longer holds the address returned by malloc() but an address just past the last character you copied. Use another pointer to copy the data, or better, an index:
size_t i;
for (i = 0 ; othertext[i] != '\0' ; ++i)
text[i] = othertext[i];
text[i] = '\0';
You say
But both pointers have basically the same address!
This is not true, try this
printf("%p -- %p\n", (void *) text, (void *) ptr1);

You modify the value of the text pointer in your loop, such that at the end it's not pointing to the beginning of the memory region you allocated; that's (part of) why you don't see the text, and why the free crashes.
You need to preserve that original pointer value (which you do in your second snippet).

You are actually changing the address that text is pointing to.
Consider a very simple memory in which text points to address 5 and othertext points to address 42, where the T of Teststring has been placed.
You now copy the character found at address 42 to address 5, so that address 5 also contains a T. However, you now increment the address that text is pointing to. In other words, text now points at address 6. You also increment othertext, which now points to address 43.
In the next round you copy the e found at address 43 to address 6 and increment both again. This is all fine.
However, after you are done copying, text will point to 5 + 10 = 15. At address 15 there is nothing meaningful to print, and you cannot free the block from there.
In your second piece of code there are no problems because text keeps pointing to address 5.

Do this printf inside the loop, after the assignment line. I think you will see why it's not working ;)
printf("%s\n", text);
