I'm struggling to understand how pointers work.
The way I got it is that, when I declare a pointer to, say, int, I create both a variable that'll contain an address (that must be initialized to even operate on the int) and an int variable. Visually, I'd represent this this way (address;int). For example, if I declared
int* number;
I'd have "number" being the address variable and "*number" being the int variable.
Likewise, declaring something such as int** d should mean to create a pointer to (address;int). That'd be [address;(address;int)].
With this in mind, I was trying to modify the int value of **d by using an external function, incrementer_3, and this so-called pass by reference, but I get an error at runtime. So, I was wondering what I'm missing.
#include <stdio.h>
#include <stdlib.h> /* needed for malloc and system */
void incrementer(int* a) {
(*a)++;
}
void incrementer_2(int** a) {
(**a)++;
}
void incrementer_3(int*** a) {
(***a)++;
}
int main() {
int b = 7;
incrementer(&b);
printf("%d\n", b);
int* c = (int*)malloc(sizeof(int));
*c = 4;
incrementer_2(&c);
printf("%d\n", *c);
int** d = (int**)malloc(sizeof(int*));
**d = 6;
incrementer_3(&d);
printf("%d\n", **d);
system("pause");
}
FYI the part when I increase b and c works fine.
On a side note, I was also wondering if it's possible to modify the value of *c by using the function "incrementer" and not "incrementer_2". In fact I was just thinking that I could have simply written from main
incrementer(&(*c));
or, in a simpler way
incrementer(c);
but neither of them works at runtime.
You need to keep in mind that a pointer need not actually refer to anything, and even if it does refer to something that something need not be valid. Keeping track of those things is your job as a programmer.
By convention an invalid pointer will be given the value 0 (which is what NULL eventually comes to), but that is only a convention; other values might be used in some circumstances.
So, with "int* number;" you have declared a pointer-to-int, but because it is not initialized you have absolutely no idea what value it contains. Dereferencing it at this point is undefined behavior, meaning that almost anything could happen if you tried; in practice it will likely just crash your program.
The problem with:
int** d = (int**)malloc(sizeof(int*));
**d = 6;
is that while d is initialized *d is not. You could do:
*d = malloc(sizeof(int));
or
*d = c;
but *d needs to be pointed at something before you can use **d.
int b
b is an int. We can refer to it by writing b.
b = 7;
Here we assign a number to b.
int* c
c is a pointer that should point to an int. We can refer to that int by writing *c.
c = (int*)malloc(sizeof(int));
We have found a piece of memory that can hold an int, and made a pointer that points to that piece, and assigned it to c. All is well.
*c = 4;
Here we assign a number to *c. See? *c behaves just like b. But that's only because we have initialised it with a valid pointer! Without that, *c = 4; would be invalid.
int** d
d is a pointer that should point to a thing of type int*, which we can refer to by writing *d. That int* thing, in turn, should point to an int, which we can refer to by writing **d.
d = (int**)malloc(sizeof(int*));
We have found a piece of memory that can hold an int*, and made a pointer that points to that piece, and assigned it to d. All is well. Now that int* we call *d, what does it point to?
Nothing. In order to point it to something, we could have found a piece of memory that can hold an int, and made a pointer that points to that piece, and assigned it to our *d, just as we have done earlier with c. See? *d behaves just like c. In order to use *c we had to initialise c with a valid pointer first. In order to use **d we need to initialise *d with a valid pointer first.
*d = (int*)malloc(sizeof(int));
The problem is that you allocate memory for the int* but you don't allocate any memory for the int, nor do you point the int* at an existing int.
Should be:
int** d = (int**)malloc(sizeof(int*));
*d = (int*)malloc(sizeof(int));
**d=6;
The way I got it is that, when I declare a pointer to, say, int, I create both a variable that'll contain an address (that must be initialized to even operate on the int) and an int variable.
No, when you declare a pointer you create a variable that knows how to contain an address. When you use malloc() you allocate memory. malloc() returns an address that you may assign to your pointer.
P.S. - incrementer(c) should work just fine
The following two blocks of code do the same thing:
void foo(int *num) {...}
int main(void)
{
int num = 5;
foo(&num);
...
}
void foo(int *num) {...}
int main(void)
{
int *num;
*num = 5;
foo(num);
...
}
In both cases foo expects a pointer-to-int argument. In the first I declare an int and use the address operator (&) to pass it into foo. In the second I declare a pointer variable and directly pass it into foo.
Is there a reason to use either one, other than a stylistic preference? I prefer to declare all of my variables as pointers, as in the second, but I see the first a lot more frequently.
No, those two snippets do very different things. The first snippet is well-formed code. The second snippet is Undefined Behavior due to dereferencing an uninitialized pointer.
The second version is likely to crash or do some other undesired thing when executed. Also, compiled with warnings on any modern compiler, it would exhibit a warning, for example:
warning: variable 'num' is uninitialized when used here
[-Wuninitialized]
While the first code snippet gets the address of a variable and passes that to foo, the second dereferences an uninitialized pointer. What you instead could have done was:
int *num = malloc(sizeof(int));
*num = 5;
foo(num);
free(num);
Or:
int num = 5;
int *p = &num;
foo(p);
The main thing to take away from this is that the first form is the simplest. A valid pointer variable must be pointing to memory within your program's control, and that requires more work than just int *num; *num = 5;
The following two blocks of code do the same thing:
No, they don't. Only the first has defined behavior at all. The second dereferences a wild pointer when it performs *num = 5 without having assigned a valid pointer to num.
In both cases foo expects a pointer-to-int argument. In the first I declare an int and use the address operator (&) to pass it into foo. In the second I declare a pointer variable and directly pass it into foo.
You seem to be suffering from the distressingly common failure to distinguish between a pointer and the thing to which it points. They are different and largely orthogonal. Declaring int *num does not declare or reserve any space for an int. It declares a pointer that can point to an int, but the value of that pointer is invalid until you assign one that does point to an int.
Is there a reason to use either one, other than a stylistic
preference?
Yes. The first is valid, and the second is not.
I prefer to declare all of my variables as pointers, as in
the second, but I see the first a lot more frequently.
I'm sorry to hear that you see the second at all.
Of course, as others have pointed out, in the second you need to initialize the pointer to valid allocated memory. So that's one point in favor of the first code.
For John Kugelman's example
int num;
int *p = &num;
*p = 5;
foo(p);
there doesn't seem to be any advantage over the first example. So if you fix the second example, you might just have more voluminous code, without any additional clarity.
Finally, and I think as a fair overall summary, the second example is really not getting the value of "declare all of my variables as pointers". You have to have actual memory somewhere, and the first example makes that clear. It is passed as a pointer, so except for the code which declared the memory, it is a pointer.
one
void one(void) {
int n = 42; // initialization
foo(&n);
putchar(n);
}
two
void two(void) {
int *n;
n = malloc(sizeof *n);
*n = 42; // no initialization
foo(n);
putchar(*n);
free(n);
}
three
void three(void) {
int n[1] = { 42 }; // initialization
foo(n);
putchar(*n); //putchar(n[0]);
}
I prefer to declare all of my variables as pointers, as in the second, but I see the first a lot more frequently.
This statement doesn't make much sense - surely not everything you work with is a pointer.
The code
int *num;
*num = 5;
foo( num );
is not well-formed, because num isn't pointing to anything (its value is indeterminate). You're attempting to assign 5 to some random memory address which may result in a segfault, or corrupt other data. It may work as intended with no apparent problems, but that's by accident, not design.
Just because a function takes a pointer as a parameter doesn't mean you have to pass a pointer object as that parameter. Yes, you could write something like
int num;
int *numptr = &num;
*numptr = 5;
foo( numptr );
but that doesn't buy you anything over simply writing
int num = 5;
foo( &num );
C requires us to use pointers in 2 cases:
When a function needs to update one of its input parameters;
When we need to track dynamically-allocated memory;
They also come in handy for building dynamic data structures like lists, queues, trees, etc. But you shouldn't use them for everything.
This may be a very basic question, but the idea of having pointers in C seems confusing to me, or maybe I don't know their exact purpose. I will provide some examples to demonstrate what my concerns are:
1st point:
Definition says something like this:
A pointer is a variable containing the address of another variable.
So, if one program goes like this:
int i = 23;
printf("%d", i);
printf("%d", &i);
And, another program goes like this:
int i = 23;
int *ptr;
ptr = &i;
printf("%d", *ptr);
printf("%d", ptr);
Both the programs above can output same thing.
If pointer also keeps the variable's address in it and at the same time we can get the variable's address using & sign, can't we do the same task pointer does by deriving the address of any variable? I mean if I don't declare it as pointer and use it as int ptr = &i; in 2nd code snippet and use it as normal variable, what would be the differences?
2nd point:
I found somewhere here that:
C does not have array variables....but this is really just working
with pointers with an alternative syntax.
Is that statement correct? As I am still a beginner, I can't validate any statement regarding this, but it was somewhat confusing to me. If the statement is correct, what is actually going on? Is it pointers that work in the back end, with the compiler/IDE just presenting them to us as arrays (presumably for simplicity)?
Answering questions in reverse order:
C does not have array variables....but this is really just working with pointers with an alternative syntax.
This is incorrect, and you need to toss that bookmark in the trash. It's a common misconception that arrays and pointers are the same thing, but they are not. An array expression will be converted to a pointer expression under most circumstances, and array subscripting is accomplished through pointer arithmetic, but an array object is an actual array, not a pointer.
If pointer also keeps the variable's address in it and at the same time we can get the variable's address using & sign, can't we do the same task pointer does by deriving the address of any variable? I mean if I don't declare it as pointer and use it as int ptr = &i; in 2nd code snippet and use it as normal variable, what would be the differences?
That code doesn't illustrate why pointers exist, or why they are useful.
C actually requires us to use pointers in the following cases:
To write to a function's parameters;
To track dynamically allocated memory;
Pointers also make dynamic data structures like trees and lists easy to implement, but they aren't required for it (unless you're using dynamic memory allocation in those structures).
Writing to a function's parameters
C passes all function arguments by value; the formal parameter in the function definition is a separate object in memory from the actual parameter in the function call, so any change to the formal parameter is not reflected in the actual parameter. For example, assume the following swap function:
void swap( int a, int b ) { int t = a; a = b; b = t; }
This function exchanges the values in a and b. However, when we call the function as
int x = 4, y = 5;
swap( x, y );
the values of x and y won't be updated, because they are different objects than a and b. If we want to update x and y, we have to pass pointers to them:
swap( &x, &y );
and update the function definition as follows:
void swap( int *a, int *b ) { int t = *a; *a = *b; *b = t; }
Instead of swapping the contents of a and b, we swap the contents of the objects that a and b point to. This crops up all the time - think about the scanf function, and how you have to use the & operator on scalar arguments.
Tracking dynamically allocated memory
The dynamic memory allocation functions malloc, calloc, and realloc all return pointers to dynamic memory buffers; there's no variable associated with that memory as such.
char *buffer = malloc( sizeof *buffer * some_length );
A pointer is the only way to track that memory.
My understanding is that when you declare a pointer, say int *a = 5, a is the pointer, and *a is the int pointed to - so the * indicates you're accessing the pointer data. (And the & is accessing the address). Hopefully this is correct?
How come when I'm doing printf it doesn't seem to work the way I want?
int main()
{
int *a = 5;
printf("%d\n",a);
return 0;
}
This gives me the correct result, which I didn't expect. When I used *a instead of a in the printf, it failed, which confuses me.
No, int *a = 5; does not store an int value of 5 in the memory location pointed to by a; rather, the address held in a is itself 5 (which is almost certainly not a valid address). This is an initialization, which sets the variable a, of type int * (a pointer), to 5.
For ease of understanding, consider the following valid case
int var = 10;
int *ptrVar = &var;
here, ptrVar is assigned the value of &var, the pointer. So, in other words, ptrVar points to a memory location which holds an int and upon dereferencing ptrVar, we'll get that int value.
That said, in general,
printf("%d\n",a);
is an invite to undefined behavior, as you're passing a pointer type as the argument to %d format specifier.
The declaration int *a does declare a to be a pointer. Thus, the declaration
int *a = 5;
initializes a with the value 5. Just like how
int i = 5;
would initialize i with the value 5.
There are very few situations where you would want to initialize a pointer variable with a literal value (other than 0 or NULL). Those would likely be embedded (or otherwise esoteric) applications where certain addresses have a defined meaning on a particular platform.
Given pointers to char, one can do the following:
char *s = "data";
As far as I understand, a pointer variable is declared here, memory is allocated for both variable and data, the latter is filled with data\0 and the variable in question is set to point to the first byte of it (i. e. variable contains an address that can be dereferenced). That's short and compact.
Given pointers to int, for example, one can do this:
int *i;
*i = 42;
or that:
int i = 42;
foo(&i); // prefix every time to get a pointer
bar(&i);
baz(&i);
or that:
int i = 42;
int *p = &i;
That's somewhat tautological. It's small and tolerable with one usage of a single variable. It's not with multiple uses of several variables, though, producing code clutter.
Are there any ways to write the same thing dry and concisely? What are they?
Are there any broader-scope approaches to programming, that allow to avoid the issue entirely? May be I should not use pointers at all (joke) or something?
String literals are a corner case: they trigger the creation of the literal in static memory, and its access as a char array. Note that the following doesn't compile, despite 42 being an int literal, because it is not implicitly allocated:
int *p = &42;
In all other cases, you are responsible for allocating the pointed-to object, be it in automatic or dynamic memory.
int i = 42;
int *p = &i;
Here i is an automatic variable, and p points to it.
int * i;
*i = 42;
You just invoked Undefined Behaviour. i has not been initialized, and is therefore pointing somewhere at random in memory. Then you assigned 42 to this random location, with unpredictable consequences. Bad.
int *i = malloc(sizeof *i);
Here i is initialized to point to a dynamically-allocated block of memory. Don't forget to free(i) once you're done with it.
int i = 42, *p = &i;
And here is how you create an automatic variable and a pointer to it as a one-liner. i is the variable, p points to it.
Edit : seems like you really want that variable to be implicitly and anonymously allocated. Well, here's how you can do it :
int *p = &(int){42};
This thingy is a compound literal. They are anonymous instances with automatic storage duration (or static at file scope), and only exist in C99 and later (but not C++!). As opposed to string literals, compound literals are mutable, i.e. you can modify *p.
Edit 2 : Adding this solution inspired from another answer (which unfortunately provided a wrong explanation) for completeness :
int i[] = {42};
This will allocate a one-element mutable array with automatic storage duration. The name of the array, while not a pointer itself, will decay to a pointer as needed.
Note however that sizeof i will return the "wrong" result, that is the actual size of the array (1 * sizeof(int)) instead of the size of a pointer (sizeof(int*)). That should however rarely be an issue.
int i=42;
int *ptr = &i;
this is equivalent to writing
int i=42;
int *ptr;
ptr=&i;
Though this is definitely confusing, it's quite useful during function calls:
void function2(int *ptr) // defined first so function1 can call it
{
printf("%d",*ptr); //outputs 42
}
void function1(void)
{
int i=42;
function2(&i);
}
Here, the parameter ptr is declared and initialized in one step by the function call. We don't need to declare a pointer globally and then initialize it before each call; the call does both at the same time.
int *ptr; //declares the pointer but does not initialize it
//so, ptr points to some random memory location
*ptr=42; //you gave a value to this random memory location
Although this will compile, it invokes undefined behaviour because you never actually initialized the pointer.
Also,
char *ptr;
char str[6]="hello";
ptr=str;
EDIT: as pointed in the comments, these two cases are not equivalent.
But pointer points to "hello" in both cases. This example is written just to show that we can initialize pointers in both these ways (to point to hello), but definitely both are different in many aspects.
char *ptr;
ptr="hello";
This works because the array name str is converted to a pointer to the 0th element of the string, i.e. 'h'.
The same goes for any array arr[]: in most expressions, arr yields the address of its 0th element.
You can also think of it as an array, int i[1]={42}, where i converts to a pointer to int.
int * i;
*i = 42;
will invoke undefined behavior. You are modifying an unknown memory location. You need to initialize pointer i first.
int i = 42;
int *p = &i;
is the correct way. Now p is pointing to i and you can modify the variable pointed to by p.
Are there any ways to write the same thing dry and concisely?
No. As there is no pass by reference in C you have to use pointers when you want to modify the passed variable in a function.
Are there any broader-scope approaches to programming, that allow to avoid the issue entirely? May be I should not use pointers at all (joke) or something?
If you are learning C then you can't avoid pointers and you should learn to use it properly.
As far as I know, when a pointer is passed into a function, the function receives merely a copy of the real pointer. Now, I want the real pointer to be changed without having to return a pointer from a function. For example:
int *ptr;
void allocateMemory(int *pointer)
{
pointer = malloc(sizeof(int));
}
allocateMemory(ptr);
Another thing, which is, how can I allocate memory to 2 or more dimensional arrays? Not by subscript, but by pointer arithmetic. Is this:
int array[2][3];
array[2][1] = 10;
the same as:
int **array;
*(*(array+2)+1) = 10
Also, why do I have to pass in the memory address of a pointer to a function, not the actual pointer itself. For example:
int *a;
why not:
allocateMemory(*a)
but
allocateMemory(a)
I know I always have to do this, but I really don't understand why. Please explain to me.
The last thing is, in a pointer like this:
int *a;
Is a the address of the memory containing the actual value, or the memory address of the pointer? I always think a is the memory address of the actual value it is pointing, but I am not sure about this. By the way, when printing such pointer like this:
printf("Is this address of integer it is pointing to?%p\n",a);
printf("Is this address of the pointer itself?%p\n",&a);
I'll try to tackle these one at a time:
Now, I want the real pointer to be changed without having to return a pointer from a function.
You need to use one more layer of indirection:
int *ptr;
void allocateMemory(int **pointer)
{
*pointer = malloc(sizeof(int));
}
allocateMemory(&ptr);
Here is a good explanation from the comp.lang.c FAQ.
Another thing, which is, how can I allocate memory to 2 or more dimensional arrays?
One allocation for the first dimension, and then a loop of allocations for the other dimension:
int **x = malloc(sizeof(int *) * 2);
for (int i = 0; i < 2; i++)
x[i] = malloc(sizeof(int) * 3);
Again, here is link to this exact question from the comp.lang.c FAQ.
Is this:
int array[2][3];
array[2][1] = 10;
the same as:
int **array;
*(*(array+2)+1) = 10
ABSOLUTELY NOT. Pointers and arrays are different. You can sometimes use them interchangeably, however. Check out these questions from the comp.lang.c FAQ.
Also, why do I have to pass in the memory address of a pointer to a function, not the actual pointer itself?
why not:
allocateMemory(*a)
It's two things - C doesn't have pass-by-reference, except where you implement it yourself by passing pointers, and in this case also because a isn't initialized yet - if you were to dereference it, you would cause undefined behaviour. This problem is a similar case to this one, found in the comp.lang.c FAQ.
int *a;
Is a the address of the memory containing the actual value, or the memory address of the pointer?
That question doesn't really make sense to me, but I'll try to explain. a (when correctly initialized - your example here is not) is an address (the pointer itself). *a is the object being pointed to - in this case that would be an int.
By the way, when printing such pointer like this:
printf("Is this address of integer it is pointing to?%p\n",a);
printf("Is this address of the pointer itself?%p\n",&a);
Correct in both cases.
To answer your first question, you need to pass a pointer to a pointer. (int**)
To answer your second question, you can use that syntax to access a location in an existing array.
However, a nested array (int[][]) is not the same as a pointer to a pointer (int**)
To answer your third question:
Writing a passes the value of the variable a, which is a memory address.
Writing *a passes the value pointed to by the variable, which is an actual value, not a memory address.
If the function takes a pointer, that means it wants an address, not a value.
Therefore, you need to pass a, not *a.
Had a been a pointer to a pointer (int**), you would pass *a, not **a.
Your first question:
you could pass a pointer's address:
void allocateMemory(int **pointer) {
*pointer = malloc(sizeof(int));
}
int *ptr;
allocateMemory(&ptr);
or you can return a pointer value:
int *allocateMemory(void) {
return malloc(sizeof(int));
}
int *ptr = allocateMemory();
I think you're a little confused about what a pointer actually is.
A pointer is just a variable whose value represents an address in memory. So when we say that int *p is a pointer to an integer, that just means p is a variable that holds a number that is the memory address of an int.
If you want a function to allocate a buffer of integers and change the value in the variable p, that function needs to know where in memory p is stored. So you have to give it a pointer to p (i.e., the memory address of p), which itself is a pointer to an integer, so what the function needs is a pointer to a pointer to an integer (i.e., a memory address where the function should store a number, which in turn is the memory address of the integers the function allocated), so
void allocateIntBuffer(int **pp)
{
// by doing "*pp = whatever" you're telling the compiler to store
// "whatever" not in the pp variable but in the memory address that
// the pp variable is holding.
*pp = malloc(...);
}
// call it like
int *p;
allocateIntBuffer(&p);
I think the key to your questions is to understand that there is nothing special about pointer variables. A pointer is a variable like any other, only that the value stored in that variable is used to represent a position in memory.
Note that returning a pointer or forcing the caller to move the pointer in and out of a void * temp variable is the only way you can make use of the void * type to allow your function to work with different pointer types. char **, int **, etc. are not convertible to void **. As such, I would advise against what you're trying to do, and instead use the return value for functions that need to update a pointer, unless your function by design only works with a specific type. In particular, simple malloc wrappers that try to change the interface to pass pointer-to-pointer types are inherently broken.