Differences between pointer initializations - c

I am speaking in Standard, K&R C.
Given:
const char a[] = {1, 2, 3};
const char *p = NULL;
Are these two statements equivalent:
*p = a;
p = a;
Each of them would be on the third line of the snippet.
1 and 2 certainly don't look the same.
What's the difference between the two then?

No.
p = a makes the pointer point to something else (usually it copies another pointer, or you point it at a variable's address, a la p = &x).
*p = a assigns to what p refers to. You are "dereferencing" (looking at) what p points to. If p is NULL as in your example, the behavior is undefined; in practice you will usually crash (this is good! you do not want to accidentally access something and mess your program up).
In this case, p = a will point to the first of the array a[], and *p = a will attempt to change the first of the array (it won't work; you have it declared const).
Here is a small example program in C++, with almost identical syntax to C.
#include <iostream>
int main()
{
char arr[5] { 'a', 'b', 'c' }; // arr[3] and arr[4] are set to 0
char *ptr = arr; //point to 'a'
for (int i = 0; i != 5; i++)
{
*ptr = 'f'; //this changes the array
ptr++; //this changes what the pointer points to; moves it to next in array
}
ptr = arr; //point back to the start; the first loop left ptr one past the end
for (int i = 0; i != 5; i++)
{
std::cout << ptr[i] << " ";
}
//outputs f f f f f
}
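Since the question is about C, here is a rough C counterpart of the loop above. It is only a sketch: the function name fill_with_f and its parameters are made up for illustration.

```c
#include <stddef.h>

/* Fills the first n chars of buf with 'f' by walking a pointer over it. */
void fill_with_f(char *buf, size_t n)
{
    char *ptr = buf;            /* assigning to ptr changes where it points */
    for (size_t i = 0; i != n; i++) {
        *ptr = 'f';             /* assigning to *ptr changes the char it points at */
        ptr++;                  /* move ptr to the next element */
    }
    /* ptr is now one past the last char written; buf itself never moved */
}
```

Calling fill_with_f(arr, 5) on a char arr[5] should leave every element equal to 'f'.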

The * operator is what we call the dereference operator. To understand what it does, you must understand exactly what a pointer is.
When you do
char *p;
the "variable" p does not use the same amount of memory as a normal char, it uses more memory: it uses the amount of memory needed to correctly identify a memory position in your computer. So, let's say you use a 32-bit architecture, the variable p occupies 4 bytes (not the 1 byte you would expect from a char).
So, when you do
p = a;
you see clearly that you are changing the contents of the variable p, that is, you are putting another 32-bit number inside it: you are changing the address it is pointing to.
After that line executes, the value of p is the memory address of the character array a.
Now for the dereference operator. When you do
*p = 'Z';
you are telling the compiler that you want to store the value 'Z' ON THE ADDRESS pointed by p. So, the value of p remains the same after this line: it continues to point to the same address. It's the value of this address that has changed, and now contains 'Z'.
So, the final effect of
char a[] = {'a', 'b', 'c'};
char *p = a;
*p = 'Z';
is the same as changing the first position of the array a to 'Z', that is:
char a[] = {'a', 'b', 'c'};
a[0] = 'Z';
NOTE: arrays are special when you make a pointer point to one: in most expressions the array name is converted to the address of its first element, so a is the same as "the starting address of the array".
Usually you will see the & operator. It is an operator used to obtain the memory address of a variable. For example:
int number = 42;
int *pointer = &number;
printf("%d", *pointer);
Here we have them all. The first line creates an integer variable and stores 42 inside it.
The second line creates a pointer to an integer, and stores the address of the variable number inside it.
The third line reads the value at the address pointed to by the pointer.
So, the trick is to read *x as on the address pointed by x and &x as the address of x.
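To tie the two operators together, here is a minimal sketch with a non-const array so the store is legal. The function names are hypothetical and only serve the illustration.

```c
#include <stddef.h>

/* Returns a[0] after storing through a pointer that was set with p = a. */
char store_through_pointer(void)
{
    char a[] = {'a', 'b', 'c'};
    char *p = NULL;

    p = a;      /* changes p itself: it now holds the address of a[0] */
    *p = 'Z';   /* changes what p points at: a[0] becomes 'Z' */

    return a[0];
}

/* Reads a value back through a pointer obtained with &. */
int read_through_pointer(void)
{
    int number = 42;
    int *pointer = &number;   /* "the address of number" */
    return *pointer;          /* "the value at the address held in pointer" */
}
```

The first function should return 'Z' and the second 42, matching the description above.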

The first dereferences a null pointer and tries to store the address of the array in the char it points to. This will be a compiler error, because *p is a const char while a converts to a const char *. If it weren't, it would likely crash.
The second sets p to point to the array.

I think you are mistaking:
char a[8];
char *p=a;
which is legal and does the same as:
char a[8];
char *p=NULL;
p=a;
with:
char a[8];
char *p=NULL;
*p=a;
which as others said would generate a compile error or a segmentation fault.
In the left side of declarations you should read *x as pointer(x) while in
statements it must be read as value_pointed_by(x). &x on the other hand
would be pointer_to(x)

Here's a trick I used when I learned C (and still use today).
Whenever you see the * in front of a variable in your code, automatically read it as "what is pointed to by".
So you should be able to easily see that setting "p" to "a" is very different from setting "what is pointed to by p" to "a".
Also, since p is supposed to be pointing at a char, setting that char p is pointing at (currently the "char" at memory location 0 assuming null is 0) to a char pointer (a) is probably going to fail at compile time if you are lucky (depending on your compiler and lint settings it may actually succeed.)
from comment: In a function declaration like f(char *c), I usually try to separate out the variable name from the rest of it, so it would be f( (char *) c), so c is a char*. Exactly like a variable definition.
Also & usually reads as "The address of", but that gets even more iffy. A few examples of how I read things to myself. May or may not help you.
int a[] = {1,2,3}; // I mentally parse this as (int[]) a, so a is an int array.
int *p; // p is a pointer to "integers"
int i;
p=a; // p acts exactly as a does now.
i=*p; // i is "What is pointed to by" p (1)
i=p; // i is some memory address
i=*a; // i is what is pointed to by a (1)
i=p[1]; // Don't forget that * and [] syntax are generally interchangeable.
i=*(a+1); // Same as above (2).
p=&i; // p is the address of i (it can because it's a pointer)
// remember from hs algebra that = generally reads as "is", still works!
*p=7; // what is pointed to by p (i) is 7;
a=*i; // whoops, can't assign an array. This is the only difference between
// arrays and pointers that you will have to deal with often, so feel
// free to use which ever one you are more comfortable with.
char c='a';
char * d = &c;// d is a char pointer, and it is the address of c
char ** e ; // e is a pointer to a memory location containing
// a pointer to a char!
e=&d; // gets d's address. a pointer to a pointer gets
// the address of a pointer. Messy but gets the job done
**e=5; // what is pointed to by what is pointed to by e is 5.
*e="f"; // what is pointed to by e (which is a char * itself, and is still d!)
// is set to the address of a string literal holding 'f' (you cannot
// write &'f': a character constant has no address).
// does not change c or e, just d!
I haven't touched C in 10 years, so some of this may be a bit wrong, but it helps me to read it out loud that way.
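The pointer-to-pointer part of the walkthrough can be checked with a small sketch like the one below. The function name and the extra variable other are assumptions added for illustration.

```c
/* Returns 1 if every step behaved as the comments describe, 0 otherwise. */
int pointer_levels_demo(void)
{
    char c = 'a';
    char other = 'f';
    char *d = &c;     /* d holds the address of c */
    char **e = &d;    /* e holds the address of d */

    **e = 'z';        /* follows e to d, then d to c: c becomes 'z' */
    if (c != 'z') return 0;

    *e = &other;      /* follows e to d only: d now points at other */
    if (d != &other) return 0;

    **e = 'g';        /* now reaches other, not c */
    return c == 'z' && other == 'g';
}
```

Reading each line aloud as "what is pointed to by" should predict the result: the function is expected to return 1.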

No, they are not equivalent
If p = NULL, then doing *p = a will give you a segmentation fault.

Because "*p" dereferences the pointer wouldnt this make "p" a "char**" ?
This would point "p" to the first array as expected.
I guess they are not the same.

Related

using array of chars and strdup, getting segmentation fault

Suppose i write,
char **p;
p[0] = strdup("hello");
strdup creates a duplicate of the string on the heap, including the terminating '\0'. As p is a pointer to a pointer to char, p[0] = strdup("hello") seems perfectly fine to me. So why am I getting a segmentation fault?
Let's look at a simpler example. Suppose you say
int *ip;
ip[0] = 5;
ip is a pointer to one or more ints -- but it's not initialized, so it points nowhere, so ip[0] isn't a valid memory location, so we can't store the value 5 there.
In the same way, when you said
char **p;
p is a pointer that points nowhere. If it did point somewhere, it would point to another pointer. But it doesn't point anywhere, so
p[0] = strdup("hello");
blows up.
To fix this, you need to make p point somewhere, and specifically to memory allocated to hold one or more pointers. There are many ways to do this:
char *q;
p = &q; /* way 1 */
char *a[10];
p = a; /* way 2 */
p = malloc(10 * sizeof(char *)); /* way 3 */
or instead of using a pointer, use an array to start with:
char *p[10]; /* way 4 */
After any of those, p[0] = strdup("hello") should work.
For way 3, we would also need to check that malloc succeeded (that it did not return a null pointer).
For ways 2 through 4, we could also set p[1] through p[9]. But for way 1, only p[0] is valid.
See also this answer to a different question for more discussion about trying to use uninitialized pointers.
This declares an uninitialized pointer that has an indeterminate value.
char **p;
so dereferencing the pointer in this expression p[0] (that is equivalent to the expression *p) used in this statement
p[0] = strdup("hello");
invokes undefined behavior, because it attempts to write to memory through the indeterminate pointer value p.
You could write either for example
char *s;
char **p = &s;
p[0] = strdup("hello");
Or
char **p = malloc( sizeof( char * ) );
p[0] = strdup("hello");
That is, the pointer to pointer p must point to a valid object. Then dereferencing the pointer yields a valid object of type char *, which is assigned the value returned by the call to strdup.
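A sketch of the malloc route with the error checks filled in might look like this. The helper names are hypothetical, and dup_string is a hand-written stand-in for strdup so that the example relies only on standard C library calls.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies s into freshly allocated memory, or returns NULL. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;          /* +1 for the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: returns a one-slot pointer array holding a copy of s. */
char **make_one_slot(const char *s)
{
    char **p = malloc(sizeof *p);      /* p now points at room for one char * */
    if (p == NULL)
        return NULL;
    p[0] = dup_string(s);              /* valid: p[0] is allocated storage */
    if (p[0] == NULL) {
        free(p);
        return NULL;
    }
    return p;
}

/* Releases what make_one_slot allocated. */
void free_one_slot(char **p)
{
    if (p != NULL) {
        free(p[0]);
        free(p);
    }
}
```

The point is the order of events: p gets valid storage first, and only then is p[0] assigned. Both allocations need a matching free.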

learning c having trouble understanding pointers

void
bar (char *arg, char *targ, int len)
{
int i;
for (i = 0; i < len; i++)
{
*targ++ = *arg++;
}
}
learning c right now, a friend sent me this snippet and I can't understand what it does. An explanation of the pointer portion would be helpful. From my understanding, it seems to be copying the value of arg into targ for i chars?
A pointer is a variable that stores an address. This address can be the address
of another variable:
int a = 18;
int *pa = &a;
or it can be the start of a dynamically allocated memory block:
int *p = malloc(sizeof *p);
The important thing is that pointers allow you to access the values behind an
address. You do that by dereferencing the pointer using the *-operator:
int a = 18;
int *pa = &a;
*pa = 10;
printf("a=%d\n", a); // will print 10
For these kinds of examples, this might not seem like a big deal, but it is,
because you can pass pointers to functions and those functions can then interact
with the memory pointed to by the pointer and, depending on the memory block, even
modify it.
Pointers can also point to the start of a sequence of objects, for example to the
start of an array:
int arr[] = { 1, 3, 5 };
int *p = arr;
Note p[0] is 1, p[1] is 3 and p[2] is 5. It is also possible to change the
values by doing p[1] = -14;. This is also dereferencing, but you also can use the
*-operator:
p[1] = 12;
// is equivalent to
*(p + 1) = 12;
And that's what your snippet is using. Forget for a second the loop. Take a look
at this line:
*targ++ = *arg++;
This can be rewritten as:
targ[0] = arg[0];
targ = targ + 1; // or targ = &(targ[1])
arg = arg + 1; // or arg = &(arg[1])
Now it's more clear what it is doing. It copies the value of the first character
pointed to by arg to the position where targ is pointing to. After that both
arg and targ are incremented to advance to the next element in the
sequence (see footnote 1).
So what the loop is doing is copying len objects pointed to by arg to
targ. This could be used to copy a string into another char array. But it is
not safe, as it is not clear whether the '\0'-terminating byte is copied
and it is not clear whether the buffers are large enough (meaning at least
len elements). If they are not strings but sequences of bytes, then this function would
be OK.
In C a string is just a sequence of characters that ends with the '\0'-terminating byte.
For that reason they are stored using char arrays and are passed to functions
as pointers of char, that point to the start of the string. We could rewrite
this function in a safer way like this:
int safe_copy_string(char *dest, char *source, size_t dest_size)
{
if(dest == NULL || source == NULL)
return 0;
if(dest_size == 0)
return 1; // no space to copy anything
// copying at most one element less than dest_size,
// leaving room for the terminating \0
for(size_t i = 0; i < dest_size - 1; ++i)
{
*dest++ = *source++;
if(*(source - 1) == '\0')
break; // copied source's 0-terminating byte
}
*dest = '\0'; // dest has advanced, so terminate at its current position
return 1;
}
Footnotes
1. It's worth mentioning the ++-operator here. This is the post-increment
operator which is used to add 1 to the operand (for integers), in case of a pointer
to advance the pointer by 1 thus making it point to the next object.
When you do:
int a = 6;
int b = a++;
// a == 7, b == 6
a is initialized with 6. When initializing b, the compiler will use the
current value of a for the initialization, however the post-increment
operator has the side effect that it will increment the value of a by 1. When
exactly this happens is defined by the rules of sequence points. What
matters is that in the initialization of b, the current value of a is used
and after the assignment a will have the new value.
A pointer points to data, and usually contains the memory address of that data. The pointer itself is a normal C variable.
The '*' operator, applied when the pointer is used, tells the compiler to access the data at the location denoted by the pointer.
The '++' operator, applied to the pointer, increments its value so that it points to the next data element, adjacent to the previous one. So, for 'char*' pointers, it increments the address by '1' to point to the next char in the string.
In your case *targ++ means: access data referenced by the pointer 'targ' and then increment the value of the pointer.
*targ++ = *arg++;
In the above expression the program takes char pointed by 'arg' and assigns it to the char location referenced by 'targ'. Then it increments values of the pointers 'arg' and 'targ'.
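As a sketch of that idiom in isolation, the function below (a made-up name, with const added to the source as an assumption) copies len chars and reports how far the local destination pointer advanced.

```c
#include <stddef.h>

/* Copies len chars from src to dst with the *dst++ = *src++ idiom.
   Returns how far dst advanced, to show that only the local copies moved. */
size_t copy_chars(const char *src, char *dst, size_t len)
{
    char *start = dst;
    for (size_t i = 0; i < len; i++) {
        *dst++ = *src++;    /* copy one char, then advance both pointers */
    }
    return (size_t)(dst - start);
}
```

The caller's own pointers are unaffected, because the function only increments its copies of them. Both buffers must hold at least len elements.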
Here is the function you provided, exploded with a line-by-line explanation of the function. At the end are revisions to make the function slightly safer.
This line declares the return type of the function. The type 'void' indicates that this function returns nothing (a procedure as opposed to a function).
void
This line declares the name of the function, 'bar', and then presents a list of three arguments, 'arg', 'targ', and 'len'. The types of these arguments are 'arg' is a pointer to a character, and is how you pass a string in C; 'targ' is also a pointer to a character, and again is how you pass a string in C; and 'len' is an int(eger).
bar (char *arg, char *targ, int len)
This symbol '{' indicates that the body of the function definition follows.
{
This line declares that 'i' is a variable of type 'int'(eger), and this variable has space reserved on the stack for the duration of the function.
int i;
This line declares a repetition loop using the 'for' keyword. Between the parentheses appear three sub-parts: the first part contains initialization prior to loop start, and the variable 'i' is set to the value 0; the second part contains an expression that is evaluated to determine whether to continue the loop, in this case the expression 'i < len' is checked every pass through the loop, so the loop terminates when the value of i becomes greater than or equal to the value of 'len'; the third part contains an expression that is performed at the end of each loop pass.
for (i = 0; i < len; i++)
This line '{' opens a statement body for the for loop,
{
This line is where the interesting work of the function is performed. As explained above, the for loop will be executed once for each value of 'i' in the list (0,1,2,3, ... ,len-1), seriatim. Note that the loop ends when 'i' has the value 'len'. Refer to the function declaration line above, how the variables 'arg' and 'targ' are pointers to character? These lines de-reference these variables to access the pointer location(s).
Suppose the line were '*targ = *arg', then the R-value '*arg' would be the contents at the current location pointed at by 'arg', and the L-value '*targ' would be the current location pointed at by 'targ'. Were the character 'x' stored at '*arg', then 'x' would be copied to '*targ', overwriting the previous contents.
*targ++ = *arg++;
The '++' symbols after the pointers do not affect the characters pointed at by the pointers, but increment the pointers (due to precedence of the '*' and '++' operators). That one line could be rewritten as,
*targ = *arg;
arg++;
targ++;
This line '}' closes the statement body for the for loop,
}
This symbol '}' matches and closes the body of the function definition.
}
You should always check for valid arguments to functions. And you should check that values do not overrun their buffers. I have added a few statements to the function to make it more robust. As presented, the function behaves like 'memcpy', and does not care about terminated 'strings'; I have added 'endc' to detect string end.
char* //return the destination, for further processing, e.g. strlen()
bar (char *arg, char *targ, int len)
{
int ndx; //try searching for 'i' in a big source file sometime...
char endc; //temporary to detect null character, more like 'strncpy'
char *dest = targ; //remember the start; targ is advanced by the loop
if( (!arg) || (!targ) || (len<1) ) return targ;
for (ndx = 0; ndx < len; ndx++)
{
endc = *targ++ = *arg++;
if( !endc ) break; //comment this out for 'memcpy' behavior
}
return dest;
}
Some would say checking the length is not needed. That is true for simple functions, but complex functions may reveal that this habit is rewarding.

This program is giving output "abc" for both p and c, but how?

Why does it print abc when pc is assigned "cdefg"? When execution goes into fun, it assigns pc = "cdefg".
void fun(char *pc)
{
pc = malloc(5);
pc = "cdefg";
}
int main()
{
char *p = "abc";
char *c = p;
fun(p);
printf("%s %s\n",p,c);
}
The reason your program does what it does is that the assignment of pc in fun has nothing to do with assigning p in main. The pointer is passed by value; any changes made by fun get discarded.
If you would like to assign a new value inside a function, do one of three things:
Pass a pointer to pointer, or
Allocate a buffer in the caller, and pass it to the function, along with buffer's length, or
Return the pointer from the function, and assign in the caller.
First approach:
void fun(char **ppc) {
*ppc = "cdefg";
}
...
fun(&p); // in main
Second approach:
void fun(char *pc, size_t len) {
if (len >= 6) {
strcpy(pc, "cdefg");
}
}
...
char p[20]; // in main
fun(p, 20);
Third approach:
char *fun() {
return "cdefg";
}
...
char *p = fun(); // in main
Your program has other issues - for example, malloc-ed memory gets leaked by the assignment that follows.
Try this instead. It actually updates the original pointer, rather than assigning to a copy which is then left dangling:
void fun(char **pc)
{
*pc = malloc(6);
strcpy(*pc, "cdefg");
}
int main()
{
char *p = "abc";
char *c = p;
fun(&p);
printf("%s %s\n",p,c);
}
It also fixed 2 other problems. The buffer of size 5 isn't big enough for the string plus the string terminator character, and you also need to copy the string into the buffer - assignment won't work.
When the function fun is called, the value of the pointer p is copied. Thus, only the local pointer pc in fun is changed. If you want to change the value of a pointer, you should take a double pointer as argument.
By the way, you do not have to call malloc(3) because the string "cdefg" is already present in memory (in rodata). The instruction pc = "cdefg"; puts the address of "cdefg" into pc. You will lose the address of the memory allocated by malloc(3), which is a memory leak.
When you assign the pointer again in the called function, only the local copy of the pointer variable changes. In order to get this new value back to the calling function, you have to pass the address of the pointer, i.e. pass the pointer by reference.
There are two things at play here, passing by value and reassigning instead of copying.
If we start with the simple reassignment, take a closer look at these two lines:
pc = malloc(5);
pc = "cdefg";
The first line assigns to pc, making pc point to whatever memory malloc returned. The second line reassigns pc to point somewhere else. These two lines are basically the same as having an int variable i and doing
i = 1;
i = 2;
The first assignment you do is lost because you immediately make another assignment. To make the memory returned by malloc contain the string "cdefg" there are two things you need to do: The first is that you need to allocate six characters, to fit the string terminator, and the second thing you need to do is to copy the string into the memory:
pc = malloc(strlen("cdefg") + 1);
strcpy(pc, "cdefg");
The second issue is more complex, and has to do with how arguments are passed in C. In C the arguments are passed by value, which means they are copied and the function only has a local copy of the data in those arguments.
When you pass a pointer, like in your code, then the pointer is copied into the variable pc, and when the function returns the variable goes out of scope and all changes you made to the variable (like reassigning it to point somewhere else) are simply lost.
The solution is to pass arguments by reference. This is unfortunately not possible in C, but it can be emulated using pointers, or rather using pointers to variables. To pass a pointer to a variable that is a pointer, the type is a pointer to a pointer to some other type, so the function should take a pointer to a pointer to char:
void fun(char **ppc) { ... }
The variable ppc points to the variable p from the main function.
Now since ppc is pointing to a variable, you need to use the dereference operator to access the original variable:
*ppc = malloc(strlen("cdefg") + 1);
strcpy(*ppc, "cdefg");
To call the function you use the address-of operator & to create a pointer to the variable:
char *p = "abc";
...
fun(&p);
Because
char *p - in main function
and
char *pc - in fun function
are different variables.
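The difference between the two variables can be seen in a small sketch. The function names are hypothetical, and the pointers are declared const char * here (an assumption) because they point at string constants.

```c
/* Static storage, so pointers to this string stay valid after the calls. */
static const char replacement[] = "cdefg";

/* Receives a copy of the caller's pointer; the assignment is local only. */
void set_by_value(const char *pc)
{
    pc = replacement;
    (void)pc;   /* silence "set but not used" warnings */
}

/* Receives the address of the caller's pointer and writes through it. */
void set_through_pointer(const char **ppc)
{
    *ppc = replacement;
}
```

With const char *p = "abc"; in the caller, set_by_value(p) should leave p pointing at "abc", while set_through_pointer(&p) should make it point at "cdefg".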

C: pointer of char & segmentation fault

In the next code:
char i,*p;
i = 65;
p = &i;
p = (char *) 66;
(*p)++;
printf("%d",p);
I got a segmentation fault and didn't understand why. I have a pointer to a char (in this case char 66, which is 'B'), and then I change its value, which is also 66, to 67. Are the values of char "protected" from this change? Does this also happen with other types, not only char?
I tried to understand the idea that stand behind this thing (and not only fix it). Thanks.
Here is the problem:
p = (char *) 66;
It should be:
*p = 66;
p is a pointer to a char, so you cannot assign values like 66 to it. You can dereference p in order to assign values to where the pointer "looks".
If you want to print the value where p points to, you must use again the dereference operator (*) like this:
printf("%d", *p); // prints the value where p points to
If you want to print the pointer address you can do this:
printf("%p", (void *)p); // prints the address where p points
A character pointer doesn't store a character, it stores an address where a character can be found. So
p = (char *)66;
says that p points to address number 66, where a character can be found. Odds are that address isn't even accessible by your program, much less that it stores a character.
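For contrast, here is a sketch of what the question probably intended: keep p pointing at i and change the char through it. The function name is made up for illustration.

```c
/* Returns the value of i after incrementing it through a pointer. */
char increment_through_pointer(void)
{
    char i = 65;
    char *p = &i;   /* p holds the address of i, not the number 65 */

    (*p)++;         /* increments the char at that address: i becomes 66 */

    return i;
}
```

Here p never holds a made-up address like 66, so the dereference is valid and the function should return 66.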

Can anybody explain how its printing output as "ink"

I am new to pointers in C. I know the basic concepts. In the below code, why is it printing the "ink" as its output?
#include<stdio.h>
main()
{
static char *s[]={"black","white","pink","violet"};
char **ptr[]={s+3,s+2,s+1,s},***p;
p=ptr;
++p;
printf("%s",**p+1);
}
Thanks
Let's trace it:
ptr = {pointer to "violet", pointer to "pink", pointer to "white", pointer to "black"}
p = ptr --> *p = pointer to "violet"
++p --> *p = pointer to "pink"
This implies that:
**p = pointer to {'p','i','n','k','\0'}
Which means:
***p = 'p'
*(**p + 1) = 'i'
so **p + 1 is a pointer to this string: {'i', 'n', 'k', '\0'}, which is simply "ink".
s is an array of char * (which represent strings).
ptr is an array of pointers to pointers (pointing to the values of s, which are pointers to strings)
p is set to point to ptr[0] (which is a pointer to s[3] or "violet")
p is incremented to point to ptr[1], which points to s[2] or "pink"
In the printf statement p is dereferenced twice. The first deref is a pointer to s[2], the second deref gets you the value of s[2] - "pink". The +1 shifts the pointer to the start of "pink" on by one char, so printing from here to the end of the string will give you "ink".
I'd advise that you work backwards from what you know was printed (the 'ink' in 'pink') and see where the different variables must be pointing for that to happen.
Remember that an array can be viewed as a pointer to its first element and that a string can similarly be viewed as a pointer to its first character.
static char
*s[]={"black","white","pink","violet"};
In the above statement you initialize an array of four strings.
char **ptr[]={s+3,s+2,s+1,s}
In the above statement you assign pointers into s to the elements of ptr, and declare a triple pointer p.
p=ptr;
In the above statement you assign the address of the double-pointer array to the triple pointer.
++p;
In the above statement you increment the triple pointer, so it now leads to "pink".
But you print **p+1, so it prints only "ink".
If you print **(p+1), it will print "white", because the next element of ptr was initialized with s+1, which points to "white".
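Both expressions can be checked with a short sketch that reuses the arrays from the question. The function names are hypothetical, and the results are returned rather than printed.

```c
static char *s[] = {"black", "white", "pink", "violet"};
static char **ptr[] = {s + 3, s + 2, s + 1, s};

/* Mirrors the question: after ++p, **p is s[2] and **p + 1 skips its first char. */
const char *after_increment(void)
{
    char ***p = ptr;
    ++p;                 /* p now points at ptr[1], which is s + 2 */
    return **p + 1;      /* s[2] is "pink"; one char further on is "ink" */
}

/* The parenthesised variant: p + 1 is ptr[2], which is s + 1. */
const char *one_element_further(void)
{
    char ***p = ptr;
    ++p;
    return **(p + 1);    /* s[1], i.e. "white" */
}
```

The only difference between the two is where the + 1 is applied: to the char * obtained after two dereferences, or to p itself before any dereference.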
