Misunderstanding in a particular use case of pointers and double pointers - arrays

I'm dealing with pointers, double pointers and arrays, and I think I'm getting a bit confused. I've been reading about them, but my particular use case is confusing me, and I'd appreciate it if someone could clear things up. This is a small piece of code I've built to show my misunderstanding:
#include <stdio.h>
#include <stdint.h>
void fnFindValue_vo(uint8_t *vF_pu8Msg, uint8_t vF_u8Length, uint8_t **vF_ppu8Match, uint8_t vF_u8Value)
{
    for(int i=0; i<vF_u8Length; i++)
    {
        if(vF_u8Value == vF_pu8Msg[i])
        {
            *vF_ppu8Match = &vF_pu8Msg[i];
            break;
        }
    }
}
int main()
{
    uint8_t u8Array[]={0,0,0,1,2,4,8,16,32,64};
    uint8_t *pu8Reference = &u8Array[3];
    /*
     * Purpose: Find the index of a value in u8Array from a reference
     * Reference: First non-zero value
     * Condition: using the function with those input arguments
     */
    // WAY 1
    uint8_t *pu8P2 = &u8Array[0];
    uint8_t **ppu8P2 = &pu8P2;
    fnFindValue_vo(u8Array,10,ppu8P2,16); // Should be diff=4
    uint8_t u8Diff1 = *ppu8P2 - pu8Reference;
    printf("Diff1: %u\n", u8Diff1);
    // WAY 2
    uint8_t* ppu8Pos; // Why this does not need to be initialized and ppu8P2 yes
    fnFindValue_vo(u8Array,10,&ppu8Pos,64); // Should be diff=6
    uint8_t u8Diff2 = ppu8Pos - pu8Reference;
    printf("Diff2: %u\n", u8Diff2);
}
Suppose the function fnFindValue_vo and its arguments cannot be changed. My goal is to find the relative index of a value in the array, taking the first non-zero value as the reference (there is no need to find it; it can be hard-coded).
In the first way, I followed my own logic and understanding of pointers. I have pu8P2, which contains the address of the first element of u8Array, and ppu8P2, which contains the address of pu8P2. After calling the function, I just need to subtract the pointers into u8Array to get the relative index.
Then I tried another method. I created a pointer and passed its address to the function without initializing the pointer. Afterwards I subtract the two pointers and also get the relative index.
My confusion comes with this second method.
Why doesn't ppu8Pos have to be initialized, while ppu8P2 does? I.e., why couldn't I declare it as uint8_t **ppu8P2;? (That gives me a segmentation fault.)
Which of the two methods is more practical or better coding practice?
Why is it possible to pass the address of a pointer when the function's parameter is a double pointer?

Why doesn't ppu8Pos have to be initialized, while ppu8P2 does?
You are not using the value of ppu8Pos right away. Instead, you pass its address to the function, which assigns it by reference. ppu8P2, on the other hand, is itself the value you pass to the function (it plays the role that &ppu8Pos plays in the second way), and the function dereferences it, so it must already hold the address of a real pointer.
Which of the two methods is more practical or better coding practice?
They are identical for all intents and purposes, for exactly the same reason these two fragments are identical:
// 1
double t = sin(x)/cos(x);
// 2
double s = sin(x), c = cos(x);
double t = s/c;
In one case, you use a variable initialised to a value. In the other case, you use a value directly. The type of the value doesn't really matter. It could be a double, or a pointer, or a pointer to a pointer.
Why is it possible to pass the address of a pointer when the function's parameter is a double pointer?
These two things you mention, the address of a pointer and a double pointer, are one and the same thing. They are not two very similar things, or virtually indistinguishable, or any weak formulation like that. No, the two wordings mean exactly the same, to all digits after the decimal point.

The address of a pointer (like e.g. &pu8P2) is a pointer to a pointer.
The result of &pu8P2 is a pointer to the variable pu8P2.
And since pu8P2 is of the type uint8_t * then a pointer to such a type must be uint8_t **.
Regarding ppu8Pos, it doesn't need to be initialized, because that happens in the fnFindValue_vo function with the assignment *vF_ppu8Match = &vF_pu8Msg[i].
But there is a trap here: If the condition vF_u8Value == vF_pu8Msg[i] is never true then the assignment never happens and ppu8Pos will remain uninitialized. So that initialization of ppu8Pos is really needed after all.
The "practicality" of each solution is more an issue of personal opinion I believe, so I leave that unanswered.

For starters, the function fnFindValue_vo can lead to undefined behavior because it does not set the pointer *vF_ppu8Match when the target value is not found in the array.
Also, it is very strange that the size of the array is passed as an object of type uint8_t. That does not make sense; size_t is the natural type for an object size.
The function should be declared at least the following way:
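#include <stddef.h> /* size_t */
#include <stdint.h> /* uint8_t */
void fnFindValue_vo( const uint8_t *vF_pu8Msg, size_t vF_u8Length, uint8_t **vF_ppu8Match, uint8_t vF_u8Value )
{
    const uint8_t *p = vF_pu8Msg;
    while ( p != vF_pu8Msg + vF_u8Length && *p != vF_u8Value ) ++p;
    *vF_ppu8Match = ( uint8_t * )p; /* one past the last element if the value is not found */
}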
The difference between the two approaches in your question is that in the first code snippet, if the target element is not found, the pointer still points to the first element of the array
uint8_t *pu8P2 = &u8Array[0];
and this expression
uint8_t u8Diff1 = *ppu8P2 - pu8Reference;
will yield a confusing positive value (due to the type uint8_t) because the difference *ppu8P2 - pu8Reference is negative.
In the second code snippet, in that case you get undefined behavior from this statement
uint8_t u8Diff2 = ppu8Pos - pu8Reference;
because the pointer ppu8Pos was never initialized.

Honestly, I have not tried to understand your code completely, but my advice is: do not overcomplicate it.
I would start with one fact that helped me untangle this:
if you have int a[10]; then a is an array, but in most expressions it is converted to a pointer to its first element. In fact,
int x = a[2] is exactly the same as int x = *(a+2) - you can try it.
So let's have
int a[10]; //this is an array
//in expressions, a converts to a pointer to the beginning of the array
a[2] is of type int; it is the third value in the array, stored at memory location a plus the size of two ints;
&a[2] is a pointer to that third value
*(a) is the first value in the array a
*(a+1) is the same as a[1] and it is the second int value in array a
and finally
**a is the same as *(*a): *a takes the first int value in the array a (the same as above), and the second asterisk would mean "take that int, pretend it is a pointer, and read the value at that location". For an int array the compiler rejects this, because an int cannot be dereferenced; forcing it with a cast would most likely read garbage.
https://stackoverflow.com/questions/42118190/dereferencing-a-double-pointer
Only when you have int a[5][5]; does **a make sense: a[0] is the first row and a[1] is the second row (each converts to a pointer to its first element), so **(a) is the same as a[0][0].
https://beginnersbook.com/2014/01/2d-arrays-in-c-example/
Drawing it on paper, as suggested in the comments, helps, but what helped me a lot was learning to use a debugger and breakpoints. Put a breakpoint at the first line and then go through the program step by step. In the "watches", put all variants like
pu8P2,&pu8P2,*pu8P2,**pu8P2 and see what is going on.

Related

Does C always have to use pointers to handle addresses?

As I understand it, every case where C handles an address involves a pointer. For example, the & operator creates a pointer to its operand instead of giving the bare address as data (i.e. it never gives the address without going through a pointer first):
scanf("%d", &foo);
Or when using the & operator:
int i; //a variable
int *p; //a variable that stores an address
p = &i; //The & operator returns a pointer to its operand, and p is set to that pointer.
My question is: Is there a reason why C programs always have to use a pointer to manage addresses? Is there a case where C can handle a bare address (the numerical value of the address) on its own or with another method? Or is that completely impossible (because of system architecture, memory allocation changing during and between runs, etc.)? And finally, would that even be useful, given that addresses change because of memory management? If not, that would be a reason why pointers are always needed.
I'm trying to figure out whether the use of pointers is a must in standard C. Not because I want to use something else, but because I want to know for sure that the only way to use addresses is with pointers, and just forget about everything else.
Edit: Since part of the question was answered in the comments by Eric Postpischil, Michał Marszałek, user3386109, Mike Holt and Gecko, I'll group those bits here: Yes, bare addresses are of little to no use because of several factors (pointers allow a number of operations, addresses may change each time the program is run, etc.). As Michał Marszałek pointed out (no pun intended), scanf() uses a pointer because C passes arguments as copies, so a pointer is needed to change the caller's variable, i.e.
int foo;
scanf("%d", foo); //Wrong: passes a copy of foo's value, so foo can't be changed (undefined behavior)
scanf("%d", &foo); //Now foo can be changed, since we pass its address.
Finally, as Gecko mentioned, pointers are there to represent indirection, so that the compiler can tell the difference between data and address.
John Bode covers most of those topics in their answer, so I'll mark that one.

A pointer is an address (or, more properly, it’s an abstraction of an address). Pointers are how we deal with address values in C.
Outside of a few domains, a “bare address” value simply isn’t useful on its own. We’re less interested in the address than the object at that address. C requires us to use pointers in two situations:
When we want a function to write to a parameter
When we need to track dynamically allocated memory
In these cases, we don’t really care what the address value actually is; we just need it to access the object we’re interested in.
Yes, in the embedded world specific address values are meaningful. But you still use pointers to access those locations. Like I said above, a pointer is an address for our purposes.

C allows you to convert pointers to integers. The <stdint.h> header provides a uintptr_t type with the property that any pointer to void can be converted to uintptr_t and back, and the result will compare equal to the original pointer.
Per C 2018 6.3.2.3 6, the result of converting a pointer to an integer is implementation-defined. Non-normative note 69 says “The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.”
Thus, on a machine where addresses are a simple numbering scheme, converting a pointer to a uintptr_t ought to give you the natural machine address, even though the standard does not require it. There are, however, environments where addresses are more complicated, and the result of converting a pointer to an integer may not be straightforward.

int i; //a variable
int *p; //a variable that stores an address
i = 10; //now i is set to 10
p = &i; //now p is set to the address of i
*p = 20; //we store 20 at the given address, so i is now 20
int tab[10]; // a table (array)
p = tab; //set p to the address of tab[0]
p++; //operate on the address and move it to the next element, tab[1]
With pointers we can operate on addresses: move them forwards or backwards, and write to or read from the given address.
In C, if we want to get several values back from a function, we must use pointers. We can also use the function's return value, but that way we get only one value.
C doesn't have references, therefore we must use pointers.
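#include <stdio.h>
void fun(int j){
    j = 10; // changes only the local copy
}
void fun2(int *j){
    *j = 10; // changes the caller's variable
}
int main(void){
    int i;
    i = 5; // now i is set to 5
    fun(i);
    printf("%d\n", i); // prints 5
    fun2(&i);
    printf("%d\n", i); // prints 10
    return 0;
}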

C -- Modify const through aliased non-const pointer

Is it allowed in standard C for a function to modify an int given as const int * using an aliased int *? To put it another way, is the following code guaranteed to always print 42 and 1 in standard C?
#include <stdio.h>
void foo(const int *a, int *b)
{
    printf("%d\n", *a);
    *b = 1;
    printf("%d\n", *a);
}
int main(void)
{
    int a = 42;
    foo(&a, &a);
    return 0;
}

In your example code, you have an integer. You take a pointer-to-const to it and a pointer-to-non-const to it. Modifying the integer via the non-const pointer is legal and well-defined, of course.
Since both pointers point to integers, and a pointer-to-const need not point to a const object, the compiler must expect that the value read through the pointer-to-const could have changed, and is required to reload it.
Note that this would not be the case if you had used the restrict keyword, because it promises that the object accessed through one pointer argument is not accessed through any other, so the compiler could optimise the reload away (and calling foo(&a, &a) would then be undefined behavior).

Yes and yes. Your program is defined.
The fact that you point to a non-const int variable with a pointer to const int doesn't make that variable const; it may still be modified through a pointer to int or by using the original variable name.

Yes, you can do this (if you know you can get away with it).
One reason you may not be able to get away with it is that the destination memory you are writing to may be in a read-only protected area (such as constant data), in which case you will get an access violation. For example, any consts known at compile time may end up in read-only data sections of the executable. Most platforms support protecting those sections from being written to at runtime.
Basically, don't do it.
There are other issues with your example that probably don't make it the best demonstration, such as needing a reload of *a in the 2nd printf: the compiler may optimize it out! (It knows 'a' did not change, and it knows 'a' points to a const; therefore it does not need to perform a memory load for the 2nd '*a' expression and can reuse the value it probably has in a register from the 1st time it loaded '*a'.) Now if you add a memory barrier in between, then your example has a chance of working better.
https://en.wikipedia.org/wiki/Memory_barrier
GCC ? asm volatile ("" : : : "memory"); // might work before 2nd printf
But as for the principle behind the actual question you asked: yes, you can do it if you know what you are doing about other things like that.

Yes, it is guaranteed to always print 42 and 1.
const int *a means the value pointed to is constant as far as pointer a is concerned.
Try assigning through a (*a = 10;) in the function and you will get an error.
The pointer a itself, however, is not constant. You can do a = b, for example.
b can point to the same address as a and modify the value, as you did in your example.
If you declared b as a pointer to const as well (const int *b), the assignment *b = 1 would produce an error.
I try to memorize like this:
const int *a - a points to an object of type int, which it is not allowed to modify (any other pointer to that object can do what it wants, depends on its declaration/definition).

Pointer to 2D arrays in C

I know there are several questions about this with good (and working) solutions, but none, IMHO, that clearly says what the best way to achieve this is.
So, suppose we have some 2D array:
int tab1[100][280];
We want to make a pointer that points to this 2D array.
To achieve this, we can do:
int (*pointer)[280]; // pointer creation
pointer = tab1; // assignment
pointer[5][12] = 517; // use
int myint = pointer[5][12]; // use
or, alternatively:
int (*pointer)[100][280]; // pointer creation
pointer = &tab1; // assignment
(*pointer)[5][12] = 517; // use
int myint = (*pointer)[5][12]; // use
OK, both seem to work well. Now I would like to know:
which is the best way, the 1st or the 2nd?
are both equal for the compiler? (speed, perf...)
does one of these solutions use more memory than the other?
which is more frequently used by developers?

//defines an array of 280 pointers (1120 or 2240 bytes)
int *pointer1 [280];
//defines a pointer (4 or 8 bytes depending on 32/64 bits platform)
int (*pointer2)[280]; //pointer to an array of 280 integers
int (*pointer3)[100][280]; //pointer to an 2D array of 100*280 integers
Using pointer2 or pointer3 produces the same binary, except for pointer arithmetic such as ++pointer2, as pointed out by WhozCraig.
I recommend using typedef (producing the same binary code as pointer3 above):
typedef int myType[100][280];
myType *pointer3;
Note: Since C++11, you can also use the keyword using instead of typedef (C++ only, not C):
using myType = int[100][280];
myType *pointer3;
In your example:
myType *pointer; // pointer creation
pointer = &tab1; // assignment
(*pointer)[5][12] = 517; // set (write)
int myint = (*pointer)[5][12]; // get (read)
Note: If the array tab1 is defined within a function body, it is placed on the call stack. The stack size is limited, and an array bigger than the free stack memory produces a stack overflow crash.
The full snippet can be compiled online at gcc.godbolt.org:
#include <assert.h> // static_assert (needed in C11; harmless in C++)
int main()
{
    //defines an array of 280 pointers (1120 or 2240 bytes)
    int *pointer1 [280];
    static_assert( sizeof(pointer1) == 2240, "" ); // assumes 8-byte pointers
    //defines a pointer (4 or 8 bytes depending on 32/64 bits platform)
    int (*pointer2)[280]; //pointer to an array of 280 integers
    int (*pointer3)[100][280]; //pointer to an 2D array of 100*280 integers
    static_assert( sizeof(pointer2) == 8, "" ); // assumes 8-byte pointers
    static_assert( sizeof(pointer3) == 8, "" );
    // Use 'typedef' (or 'using' if you use a modern C++ compiler)
    typedef int myType[100][280];
    //using myType = int[100][280];
    int tab1[100][280];
    myType *pointer; // pointer creation
    pointer = &tab1; // assignment
    (*pointer)[5][12] = 517; // set (write)
    int myint = (*pointer)[5][12]; // get (read)
    return myint;
}

Both your examples are equivalent. However, the first one is less obvious and more "hacky", while the second one clearly states your intention.
int (*pointer)[280];
pointer = tab1;
pointer points to a 1D array of 280 integers. In your assignment, you actually assign the address of the first row of tab1. This works because an array is implicitly converted to a pointer to its first element.
When you use pointer[5][12], C treats pointer as an array of arrays (pointer[5] is of type int[280]), so there is another implicit conversion here (at least semantically).
In your second example, you explicitly create a pointer to a 2D array:
int (*pointer)[100][280];
pointer = &tab1;
The semantics are clearer here: *pointer is a 2D array, so you need to access it using (*pointer)[i][j].
Both solutions use the same amount of memory (1 pointer) and will most likely run equally fast. Under the hood, both pointers will even point to the same memory location (the first element of the tab1 array), and it is possible that your compiler will even generate the same code.
The first solution is "more advanced" since one needs quite a deep understanding on how arrays and pointers work in C to understand what is going on. The second one is more explicit.

int *pointer[280]; // Creates an array of 280 pointers to int.
On a 32-bit OS, each pointer takes 4 bytes, so 4 * 280 = 1120 bytes.
int (*pointer)[100][280]; // Creates only one pointer, which points to an array of [100][280] ints.
Here only 4 bytes.
Coming to your question, int (*pointer)[280]; and int (*pointer)[100][280]; are different even though both point to the same 2D array of [100][280].
If int (*pointer)[280]; is incremented, it points to the next 1D array (the next row), whereas int (*pointer)[100][280]; skips the whole 2D array and points to the first byte after it. Accessing that byte may cause problems if that memory doesn't belong to your process.

OK, this is actually four different questions. I'll address them one by one:
are both equal for the compiler? (speed, perf...)
Yes. The pointer dereference and the decay from type int (*)[100][280] to int (*)[280] is always a no-op for your CPU. I wouldn't put it past a bad compiler to generate bogus code anyway, but a good optimizing compiler should compile both examples to the exact same code.
does one of these solutions use more memory than the other?
As a corollary to my first answer, no.
which is more frequently used by developers?
Definitely the variant without the extra (*pointer) dereference. For C programmers it is second nature to assume that any pointer may actually be a pointer to the first element of an array.
which is the best way, the 1st or the 2nd?
That depends on what you optimize for:
Idiomatic code uses variant 1. The declaration is missing the outer dimension, but all uses are exactly as a C programmer expects them to be.
If you want to make it explicit that you are pointing to an array, you can use variant 2. However, many seasoned C programmers will think that there's a third dimension hidden behind the innermost *. Having no array dimension there will feel weird to most programmers.

Better way of declaring an array?

I'm writing in C and compiling with GCC.
Is there a better way of declaring points? I was surprised to see that points was an array. Is there some way of declaring points so it looks more like an array?
typedef struct Span
{
    unsigned long lo;
    unsigned long hi;
} Span;
typedef struct Series
{
    unsigned long *points;
    unsigned long count;
    unsigned long limit;
} Series;
void SetSpanSeries(Series *self, const Span *src)
{
    unsigned long *points;
    if (src->lo < src->hi )
    {
        // Overlays second item in series.
        points = self->points; // a pointer in self structure
        points[0] = src->lo;
        points[1] = src->hi;
        self->count = 1;
    }
}
Now let's say that points points to an array of structures.
typedef struct Span
{
unsigned long lo;
unsigned long hi;
} Span;
span *points[4];
Now how do I write these lines of code? Did I get this right?
points = self->points; // a pointer in self structure
points[0].lo = src->lo;
points[0].hi = src->hi;

With the declaration unsigned long *points, points is a pointer. It points to the beginning of an array. arr[x] is the same as *(arr + x), so whether arr is an array (in which case, it takes the address of the array, adds x, and dereferences the 'pointer') or a pointer (in which case, it takes the pointer value, adds x, and dereferences the pointer), arr[0] still gets the same array access.
In this case, you can't declare points as an array because you're not using it as an array - you're using it as a pointer that points to an array. A pointer is a shallow copy: if you change the data through the pointer, you change the original data. A regular array would need a deep copy, and then your changes would no longer reach the array in self, which is ultimately what you want to modify.
In fact, you could rewrite the whole thing without points:
void SetSpanSeries(Series *self, const Span *src)
{
    if (src->lo < src->hi )
    {
        self->points[0] = src->lo;
        self->points[1] = src->hi;
        self->count = 1;
    }
}
As to your second example, yes, points[0].lo is correct. points->lo would also be correct, so long as you're only accessing points[0]. (Or self->points[0].lo if you take out points entirely.)

The ability to treat a pointer as an array definitely confuses most C beginners. Arrays even decay to pointers when passed as arguments to functions, giving the impression that arrays and pointers are completely interchangeable -- they aren't. An excellent description is in Expert C Programming: Deep C Secrets. (This is one of my favorite books; it's strongly recommended if you intend to understand C.)
Anyway, writing pointer[2] is the same as *(pointer+2) -- the array syntax is far easier for most people to read (and write).
Since you are using this *points variable to provide easier access to another block of memory (the pointer points in the struct Series), you cannot use an array for your local variable because you cannot re-assign the base of an array to something else. Consider the following illegal code:
int foo[10];
int *bar;
int wrong[10];
bar = foo; /* fine */
wrong = foo; /* compile error -- cannot assign to the array 'wrong' */
Another option for re-writing this code is to remove the temporary variable:
if (src->lo < src->hi) {
    self->points[0] = src->lo;
    self->points[1] = src->hi;
    self->count = 1;
}
I'm not sure the temporary variable helps with legibility -- it just saved typing a few characters at the expense of adding a lot of characters. (And a confusing variable, too.)

In the middle section you say points is an array 4 of pointer to struct span. In the third section you are assigning points from self->points (meaning the previous value of points, that array, has been lost). You then dereference points as if it were an array of struct Span and not an array of pointers to struct Span.
In other words, this cannot compile because you are mixing types and, even if you were not, you would be overwriting the memory allocated by your definition of the points variable.
Providing the definition of Series might help explain what is going on.
But certainly in the first example, points should probably be a Span *points but without seeing Series we cannot tell for sure.

Can we differentiate a variable against a pointer variable

Yesterday, while I was coding in C, my friend pointed to a variable and asked me: is it a pointer or a variable? I was stuck for a while. I couldn't find an answer, so I had to go back, search for it and tell him. But I was wondering whether there is any function to differentiate them.
Can we differentiate a variable against a pointer variable
int a;
sizeof(a); // gives 2 bytes
int *b;
sizeof(b); // gives 2 bytes
// if we use sizeof() we get same answer and we cant say which is pointer
// and which is a variable
Is there a way to find out whether a variable is a normal variable or a pointer? I mean, can someone tell whether it is a pointer or a variable when the variable is declared at the beginning and used 1000 lines further down in the code?
After the comment
I wanted to say explicitly that it's a 16-bit system architecture.

First, the question "Is it a pointer or a variable" doesn't make much sense. A pointer variable is a variable, just as an integer variable, or an array variable, is a variable.
So the real question is whether something is a pointer or not.
No, there's no function that can tell you whether something is a pointer or not. And if you think about it, in a statically typed language like C, there can't be. Functions take arguments of certain specified types. You can't pass a variable to a function unless the type (pointer or otherwise) is correct in the first place.

You mean differentiate them at run time without seeing the code? No, you can't. Pointers are variables that hold a memory address, and nothing at run time marks them as such. That means there is no function isPointer(n) that returns true/false based on the parameter n.

You can deduce the type from the use.
For example:
char* c;
...
c[0] = 'a';
*c = 'a';
Indexing and dereferencing would let you know it's a pointer to something (or it's an array if defined as char c[SOME_POSITIVE_NUMBER];).
Also, things like memset(c,...), memcpy(c,...) will suggest that c is a pointer (array).
OTOH, you normally can't do most arithmetic with pointers, so if you see something like
x = c * 2;
y = 3 / c;
z = c << 1;
w = 1 & c;
then c is not a pointer (array).

Three things:
What platform are you using where sizeof(int) returns 2? Seriously
Pointers are types. A pointer to an int is a type, just like an int is. The sizes of a type and a pointer to that type are sometimes equal but not directly related; for instance, a pointer to a double (on my machine, at least) has size 4 bytes while a double has size 8 bytes. sizeof() would be a very poor test, even if there was a situation where such a test would be appropriate (there isn't).
C is a strictly typed language, and your question doesn't really make sense in that context. As the programmer, you know exactly what a is and you will use it as such.

If you'd like to be able to tell whether a variable is a pointer or not when you see it in the source code, but without going back to look at the declaration, a common approach is to indicate it in the way you name your variables. For example, you might put a 'p' at the beginning of the names of pointers:
int *pValue; /* starts with 'p' for 'pointer' */
int iOther; /* 'i' for 'integer' */
...or even:
int *piSomething; /* 'pi' for 'Pointer to Integer' */
This makes it easy to tell the types when you see the variable in your code. Some people use quite a range of prefixes, to distinguish quite a range of types.
Try looking up "Hungarian notation" for examples.

No, you can't.
And what would the use be, given that the pointer address will be different each time you run the code? However, you can subtract two pointers, and you can also get the memory address value of any pointer.
