Character Array Pointers and Casts from Integers (Memory Address)

Okay, currently I am writing a kernel for the sake of my resume. While writing my memory management unit I have hit a brick wall.
int address = (int)malloc(sizeof(Test));
consoleWriteString("Variable Address:\n");
consoleWriteInteger(address);
char* f = (void*)address;
consoleWriteString("\nVariable Address:\n");
consoleWriteInteger((int)&f); // Should print off the same as above
Logically, the output should be the same for both. Somewhere something has gone wrong, though, as my output is the following:
Variable Address: 47167
Variable Address: 1065908
After a long period of testing and debugging I finally gave in and decided to ask Stack Overflow. Also, if you spot any syntax errors, ignore them. By the way, this is all in C, and all functions are custom, including malloc, but I have determined that the error does not lie in that function, or in any other for that matter. I believe this is just me being stupid about pointers and casting, but don't laugh at me when it turns out to be something super simple that I missed.
Thanks yall

&f is the address of f, not the address contained in it! f is on the stack. Its value (the address you first printed) is pointing to the allocated memory.
Think of it this way: You allocate room in memory for some stuff. This memory region has an address. You put the address in a pointer (f), so that it points to that region. But f itself needs to be somewhere in memory in order to hold the value of that address. In this case, f is on the stack, and &f gets the address of f (the container of the original address), not the address that f contains.
As an aside, be very careful casting addresses to int (and back!), since int might not be large enough to hold an address (e.g. on x86-64, depending on your compiler). I believe the correct type to use when you want to use an address as an integer is uintptr_t in stdint.h.

The value of f (which is a pointer) is the same as the value of address (which holds the same address, but as an int). That is what you set up in the line
char* f = (void*)address;
But then you print the address of f:
consoleWriteInteger((int)&f);
And that is not the same thing as the value of f... change that line to
consoleWriteInteger((int)f);
and you should be all set.

The first printout is an int, although malloc returns an address. The second is the address of f cast to an int. Print f instead: & gets you the address of a variable, while * dereferences a pointer, giving you the value of what is being pointed to.

Related

Following a pointer to a pointer

sqlite3_open takes a pointer to a pointer. I'd like to trace the address of the second pointer.
E.g: p1(p2(obj))
https://www.sqlite.org/c3ref/open.html
int sqlite3_open(
const char *filename, /* Database filename (UTF-8) */
sqlite3 **ppDb /* OUT: SQLite db handle */
);
What is the syntax to get the address of that pointer in DTrace?
I'm using the pid$target::sqlite3_open:return probe to read from the arg1 that was set from the entry probe.
I'm currently using:
// Copy pointer bytes from arg1 to kernel, cast to pointer.
(uintptr_t *)copyin(arg1, sizeof(uintptr_t))
Which results in: invalid kernel access in action.
I'm on macOS with SIP enabled. Is this the issue?

I may be misunderstanding your question, but what I suspect is that you've misunderstood how sqlite3_open works.
To call sqlite3_open you should have code that looks like this:
sqlite3 * pDB = NULL;
/* ... */
int result = sqlite3_open("file:database.db", &pDB);
As you see, there's no "pointer to pointer" variable in my code. Instead, sqlite3_open takes the address of a pointer variable I allocated on the stack.
To copy that pointer is as simple as:
sqlite3 * pDB2 = pDB;
The reason for this is simple:
The sqlite3_open function wants to return two values, which is impossible in C.
Instead of returning two values, sqlite3_open returns one value directly and returns the second value indirectly.
In order to return the second, it takes a pointer to a variable of the same type it wants to return. Then, by dereferencing that pointer and filling in the value, it provides you with the second value.
However, the second value sqlite3_open returns is itself a pointer. This is why, in order to return a pointer as a second value, sqlite3_open requires a pointer to a pointer variable.
Reading the address
In the example above, the pDB variable holds the address for the sqlite3 object (the one allocated by sqlite3_open).
The address, as you know, is simply a number representing a location in the memory. To read the pointer value as a number, simply cast the pointer to a uintptr_t. i.e.:
uintptr_t db_mem_addr_value = (uintptr_t)pDB;
Of course, numbers (and memory addresses) can't be printed as hex strings directly; they need a function that converts them into hex notation.
In C you would print the memory address in hex notation by using printf, i.e.:
fprintf(stderr, "%p\n", (void *)pDB);
Using dtrace would be the same. You might want to convert the pointer address to a number, for example, using the lltostr dtrace function:
lltostr((uintptr_t)*(void**)arg1, 16)

Not a dtrace pro, but here are some observations.
uintptr_t is defined to be large enough to hold any pointer converted to an integer. Note that this does not imply that sizeof(uintptr_t) == sizeof(void*). It is perfectly valid (and on some platforms, necessary) for uintptr_t to be strictly larger than a pointer. That means your copyin call might be copying more bytes than are actually there. Try using a size of sizeof(sqlite3 **) instead.
Also, it's possible that some of OSX's internal protection mechanisms are causing you problems. See the answer on this related question for a good explanation.

Does C always have to use pointers to handle addresses?

As I understand it, all of the cases where C has to handle an address involve the use of a pointer. For example, the & operator creates a pointer to its operand, instead of just giving the bare address as data (i.e., it never gives the address without going through a pointer first):
scanf("%d", &foo);
Or when using the & operator to set a pointer:
int i; //a variable
int *p; //a variable that stores an address
p = &i; //The & operator returns a pointer to its operand, and p is set to that pointer.
My question is: Is there a reason why C programs always have to use a pointer to manage addresses? Is there a case where C can handle a bare address (the numerical value of the address) on its own or with another method? Or is that completely impossible? (Being because of system architecture, memory allocation changing during and in each runtime, etc). And finally, would that be useful being that addresses change because of memory management? If that was the case, it would be a reason why pointers are always needed.
I'm trying to figure out if the use of pointers is a must in standard C. Not because I want to use something else, but because I want to know for sure that the only way to use addresses is with pointers, and just forget about everything else.
Edit: Since part of the question was answered in the comments by Eric Postpischil, Michał Marszałek, user3386109, Mike Holt and Gecko, I'll group those bits here: Yes, bare addresses are of little to no use because of different factors (pointers allow a number of operations, addresses may change each time the program is run, etc.). As Michał Marszałek pointed out (no pun intended), scanf() uses a pointer because C passes arguments by copy, so a pointer is needed to change the caller's variable, i.e.:
int foo;
scanf("%d", foo); //Wrong: scanf gets a copy of foo's value, so foo can't be changed
scanf("%d", &foo); //Now foo can be changed, since we pass its address.
Finally, as Gecko mentioned, pointers are there to represent indirection, so that the compiler can tell the difference between data and address.
John Bode covers most of those topics in his answer, so I'll mark that one.

A pointer is an address (or, more properly, it’s an abstraction of an address). Pointers are how we deal with address values in C.
Outside of a few domains, a “bare address” value simply isn’t useful on its own. We’re less interested in the address than the object at that address. C requires us to use pointers in two situations:
When we want a function to write to a parameter
When we need to track dynamically allocated memory
In these cases, we don’t really care what the address value actually is; we just need it to access the object we’re interested in.
Yes, in the embedded world specific address values are meaningful. But you still use pointers to access those locations. Like I said above, a pointer is an address for our purposes.

C allows you to convert pointers to integers. The <stdint.h> header provides a uintptr_t type with the property that any pointer to void can be converted to uintptr_t and back, and the result will compare equal to the original pointer.
Per C 2018 6.3.2.3 6, the result of converting a pointer to an integer is implementation-defined. Non-normative note 69 says “The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.”
Thus, on a machine where addresses are a simple numbering scheme, converting a pointer to a uintptr_t ought to give you the natural machine address, even though the standard does not require it. There are, however, environments where addresses are more complicated, and the result of converting a pointer to an integer may not be straightforward.

int i; //a variable
int *p; //a variable that stores an address
i = 10; //now i is set to 10
p = &i; //now p holds the address of i
*p = 20; //we store 20 at that address, so i is now 20
int tab[10]; // a table
p = tab; //set address
p++; //operate on address and move it to next element tab[1]
Through pointers we can operate on addresses: move forward or backward, and read or write the value at a given address.
In C, if we want to get several values back from a function, we must use pointers. We can use the function's return value instead, but that way we only get one value.
C has no references, so we must use pointers.
void fun(int j){
j = 10;
}
void fun2(int *j){
*j = 10;
}
int i;
i = 5; // now i is set to 5
fun(i);
//printing i now gives 5: fun only changed its own copy
fun2(&i);
//printing i now gives 10: fun2 wrote through the pointer

Extern arrays usage causing access violation

I have a
LS_Led* LS_vol_leds[10];
declared in one C module, and the proper externs in the other modules that access it.
In func1() I have this line:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
And it does not cause an exception. Then
I call func2() in another C module (right after above line), and do the same line, namely:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
first thing, and an exception is thrown!!!
I don't think I have the powers to debug this one on my own.
Before anything LS_vol_leds is initialized in func1() with:
LS_vol_leds[0] = &led3;
LS_vol_leds[1] = &led4;
LS_vol_leds[2] = &led5;
LS_vol_leds[3] = &led6;
LS_vol_leds[4] = &led7;
LS_vol_leds[5] = &led8;
LS_vol_leds[6] = &led9;
LS_vol_leds[7] = &led10;
LS_vol_leds[8] = &led11;
LS_vol_leds[9] = &led12;
My externs look like
extern LS_Led** LS_vol_leds;
So does that lead to disaster, and how do I prevent disaster?
Thanks.

This leads to disaster:
extern LS_Led** LS_vol_leds;
You should try this instead:
extern LS_Led *LS_vol_leds[];
If you really want to know why, you should read Expert C Programming - Deep C Secrets, by Peter Van Der Linden (amazing book!), especially chapter 4, but the quick answer is that this is one of those corner cases where pointers and arrays are not interchangeable: a pointer is a variable which holds the address of another one, whereas an array name is an address. extern LS_Led** LS_vol_leds; is lying to the compiler and generating the wrong code to access LS_vol_leds[i].
With this:
extern LS_Led** LS_vol_leds;
The compiler will believe that LS_vol_leds is a pointer, and thus LS_vol_leds[i] involves reading the value stored in the memory location associated with LS_vol_leds, using that as an address, and then scaling i accordingly to get the offset.
However, since LS_vol_leds is an array and not a pointer, the compiler should instead pick the address of LS_vol_leds directly. In other words: what is happening is that your original extern causes the compiler to dereference LS_vol_leds[0] because it believes that LS_vol_leds[0] holds the address of the pointed-to object.
UPDATE: Fun fact - the back cover of the book talks about this specific case:
So that's why extern char *cp isn't the same as extern char cp[]. I
knew that it didn't work despite their superficial equivalence, but I
didn't know why. [...]
UPDATE2: Ok, since you asked, let's dig deeper. Consider a program split into two files, file1.c and file2.c. Its contents are:
file1.c
#define BUFFER_SIZE 1024
char cp[BUFFER_SIZE];
/* Lots of code using cp[i] */
file2.c
extern char *cp;
/* Code using cp[i] */
The moment you try to assign to cp[i] or use cp[i] in file2.c, your code will most likely crash. This is deeply tied to the mechanics of C and the code that the compiler generates for array-based accesses and pointer-based accesses.
When you have a pointer, you must think of it as a variable. A pointer is a variable like an int, float or something similar, but instead of storing an integer or a float, it stores a memory address - the address of another object.
Note that variables have addresses. When you have something like:
int a;
Then you know that a is the name for an integer object. When you assign to a, the compiler emits code that writes into whatever address is associated with a.
Now consider you have:
char *p;
What happens when you access *p? Remember - a pointer is a variable. This means that the memory address associated with p holds an address - namely, an address holding a character. When you assign to p (i.e., make it point to somewhere else), then the compiler grabs the address of p and writes a new address (the one you provide it) into that location.
For example, if p lives at 0x27, it means that reading memory location 0x27 yields the address of the object pointed to by p. So, if you use *p in the right hand side of an assignment, the steps to get the value of *p are:
Read the contents of 0x27 - say it's 0x80 - this is the value of the pointer, or, equivalently, the address of the pointed-to object
Read the contents of 0x80 - this finally gives you *p.
What if p is an array? If p is an array, then the variable p itself represents the array. By convention, the address representing an array is the address of its first element. If the compiler chooses to store the array in address 0x59, it means that the first element of p lives at 0x59. So when you read p[0] (or *p), the generated code is simpler: the compiler knows that the variable p is an array, and the address of an array is the address of the first element, so p[0] is the same as reading 0x59. Compare this to the case for which p is a pointer.
If you lie to the compiler, and tell it you have a pointer instead of an array, the compiler will (wrongly) generate code that does what I showed for the pointer case. You're basically telling it that 0x59 is not the address of an array, it's the address of a pointer. So, reading p[i] will cause it to use the pointer version:
Read the contents of 0x59 - note that, in reality, this is p[0]
Use that as an address, and read its contents.
So, what happens is that the compiler thinks that p[0] is an address, and will try to use it as such.
Why is this a corner case? Why don't I have to worry about this when passing arrays to functions?
Because what is really happening is that the compiler manages it for you. Yes, when you pass an array to a function, a pointer to the first element is passed, and inside the called function you have no way to know if it is a "real" array or a pointer. However, the address passed into the function is different depending on whether you're passing a real array or a pointer. If you're passing a real array, the pointer you get is the address of the first element of the array (in other words: the compiler immediately grabs the address associated to the array variable from the symbol table). If you're passing a pointer, the compiler passes the address that is stored in the address associated with that variable (and that variable happens to be the pointer), that is, it does exactly those 2 steps mentioned before for pointer-based access. Again, note that we're discussing the value of the pointer here. You must keep this separated from the address of the pointer itself (the address where the address of the pointed-to object is stored).
That's why you don't see a difference. In most situations, arrays are passed around as function arguments, and this rarely raises problems. But sometimes, with some corner cases (like yours), if you don't really know what is happening down there, well.. then it will be a wild ride.
Personal advice: read the book, it's totally worth it.

Doubts about pointer and memory access

I have just started learning pointers in C and have the following few doubts. Answers to the questions below would really help me understand the concept of pointers in C. Thanks in advance.
i)
char *cptr;
int value = 2345;
cptr = (char *)value;
What's the use of (char *), and what does it mean in the above code snippet?
ii)
char *cptr;
int value = 2345;
cptr = value;
This also compiles without any error. Then what's the difference between code snippets i and ii?
iii) &value returns the address of the variable. Is it a virtual memory address in RAM? Suppose another C program is running in parallel: can that program have the same memory address as &value? Can each process have the same memory addresses as another process, independently of each other?
iv)
#define MY_REGISTER (*(volatile unsigned char*)0x1234)
void main()
{
MY_REGISTER=12;
printf("value in the address tamil is %d",(MY_REGISTER));
}
The above snippet compiled successfully. But it outputs segmentation fault error. I don't know what's the mistake I am doing. I want to know how to access the value of random address, using pointers. Is there any way? Will program have the address 0x1234 for real?
v) printf("value at the address %d",*(236632));//consider the address 236632 available in
//stack
Why does the above printf statement show an error?

1. That's a type cast; it tells the compiler to treat one type as some other (possibly unrelated) type. As for the result, see point 2 below.
2. That makes cptr point to the address 2345.
3. Modern operating systems isolate the processes. The address of one variable in one process is not valid in another process, even if started with the same program. In fact, the second process may have a completely different memory map due to Address Space Layout Randomisation (ASLR).
4. It's because you try to write to address 0x1234, which might be a valid address on some systems, but not on most, and almost never on a PC running e.g. Windows or Linux.

i)
(char *) means that you cast the data stored in value to a pointer ptr, which points to a char. That means ptr points to the memory location 2345. In your code snippet ptr is undefined, though. I guess there is more in that program.
ii)
The difference is that you now write to cptr, which is (as you defined) a pointer pointing to a char. There is not much of a difference from i), except that you write to a different variable and that you use an implicit conversion, which gets resolved by the compiler. Again, cptr now points to the location 2345 and expects there to be a char.
iii)
Yes you can say it is a virtual address. Also segmentation plays some parts in this game, but at your stage you don't need to worry about it at all. The OS will resolve that for you and makes sure, that you only overwrite variables in the memory space dedicated to your program. So if you run a program twice at the same time, and you print a pointer, it is most likely the same value, but they won't point at the same value in memory.
iv)
Didn't see the write instruction at first. You can't just write anywhere into memory, as you could overwrite another program's value.
v)
Similar issue as above. You cannot just dereference any number you want to; you first need to cast it to a pointer, otherwise neither the compiler, your OS, nor your CPU will have a clue what exactly it is pointing to.
Hope I could help you, but I recommend, that you dive again in some books about pointers in C.

i.) Type cast: you cast the integer to a char pointer.
ii.) You point to the address of 2345.
iii.) Refer to answer from Joachim Pileborg. ^ ASLR
iv.) You can't directly write into an address without knowing if there's already something in / if it even exists.
v.) Because you're actually using a pointer to print a normal integer out, which should throw the error C2100: illegal indirection.

You may think of pointers like numbers on mailboxes. When you set a value to a pointer, e.g. cptr = 2345, it is like moving in front of mailbox 2345. That's OK: there is no actual interaction with the memory, hence no crash. When you write something like *cptr, this refers to the actual "content of the mailbox". Setting a value for *cptr is like trying to put something in the mailbox in front of you (the memory location). If you don't know who it belongs to (how the application uses that memory), it's probably a bad idea. You could use "malloc" to initialize a pointer / allocate memory, and "free" to clean up after you finish the job.

Dereferencing in C

I've just started to learn C so please be kind.
From what I've read so far regarding pointers:
int * test1; //this is a pointer which is basically an address to the process
//memory and usually has the size of 2 bytes (not necessarily, I know)
float test2; //this is an actual value and usually has the size of 4 bytes,
//being of float type
test2 = 3.0; //this assigns 3 to `test2`
Now, what I don't completely understand:
*test1 = 3; //does this assign 3 at the address
//specified by `test1`?
test1 = 3; //this says that the pointer is basically pointing
//at the 3rd byte in process memory,
//which is somehow useless, since anything could be there
&test1; //this I really don't get,
//is it the pointer to the pointer?
//Meaning, the address at which the pointer address is kept?
//Is it of any use?
Similarly:
*test2; //does this make any sense?
&test2; //is this the address at which the 'test2' value is found?
//If so, it's a pointer, which means that you can have pointers pointing
//both to the heap address space and stack address space.
//I ask because I've always been confused by people who speak about
//pointers only in the heap context.

Great question.
Your first block is correct. A pointer is a variable that holds the address of some data. The type of that pointer tells the code how to interpret the contents of the address being held by that pointer.
The construct:
*test1 = 3
Is called dereferencing a pointer. That means you can access the object at the address that the pointer holds, and read and write it like a normal variable. Note:
int *test;
/*
* test is a pointer to an int - (int *)
* *test behaves like an int - (int)
*
* So you can think of (*test) as a pseudo-variable which has the type 'int'
*/
The above is just a mnemonic device that I use.
It is rare that you ever assign a numeric value to a pointer... maybe if you're developing for a specific environment which has some 'well-known' memory addresses, but at your level, I wouldn't worry too much about that.
Using
*test2
would result in an error. You'd be trying to dereference something that is not a pointer, so the compiler should reject it; test2 is a float and does not point anywhere.
&test1 and &test2 are, indeed, pointers to test1 and test2.
Pointers to pointers are very useful and a search of pointer to a pointer will lead you to some resources that are way better than I am.

It looks like you've got the first part right.
An incidental thought: there are various conventions about where to put that * sign. I prefer mine nestled with the variable name, as in int *test1 while others prefer int* test1. I'm not sure how common it is to have it floating in the middle.
Another incidental thought: test2 = 3.0 assigns a floating-point 3 to test2. The same end could be achieved with test2=3, in which case the 3 is implicitly converted from an integer to a floating point number. The convention you have chosen is probably safer in terms of clarity, but is not strictly necessary.
Non-incidentals
*test1=3 does assign 3 to the address specified by test1.
test1=3 is a line that has meaning, but which I consider meaningless. We do not know what is at memory location 3, if it is safe to touch it, or even if we are allowed to touch it.
That's why it's handy to use something like
int var=3;
int *pointy=&var;
*pointy=4;
//Now var==4.
The expression &var yields the memory location of var, which we store in pointy so that we can later access var with *pointy.
But I could also do something like this:
int var[]={1,2,3};
int *pointy=var;
int offset=2;
*(pointy+offset)=4;
//Now var[2]==4.
Note that the offset is a plain integer, not a pointer: adding an integer to a pointer moves it forward by that many elements. An assignment like test1=3, by contrast, stores an address rather than an offset, which is why it is so rarely what you want.
&test1 is a pointer to a pointer, but that sounds kind of confusing to me. It's really the address in memory where the value of test1 is stored. And test1 just happens to store as its value the address of another variable. Once you start thinking of pointers in this way (address in memory, value stored there), they become easier to work with... or at least I think so.
I don't know if *test2 has "meaning", per se. In principle, it could have a use in that we might imagine that the * operator will take the value of test2 to be some location in memory, and it will return the value it finds there. But since you define test2 as a float, it is difficult to predict where in memory we would end up: setting test2=3 will not move us to the third spot of anything (look up the IEEE 754 specification to see why). But I would be surprised if a compiler would allow such a thing.
Let's look at another quick example:
int var=3;
int *pointy1=&var;
int **pointy2=&pointy1;
*pointy1=4; //Now var==4
**pointy2=5; //Now var==5
So you see that you can chain pointers together like this, as many in a row as you'd like. This might show up if you had an array of pointers which was filled with the addresses of many structures you'd created from dynamic memory, and those structures contained pointers to dynamically allocated things themselves. When the time comes to use a pointer to a pointer, you'll probably know it. For now, don't worry too much about them.

First let's add some confusion: the word "pointer" can refer to either a variable (or object) with a pointer type, or an expression with the pointer type. In most cases, when people talk about "pointers" they mean pointer variables.
A pointer can (must) point to a thing (An "object" in standards parlance). It can only point to the right kind of thing; a pointer to int is not supposed to point to a float object. A pointer can also be NULL; in that case there is no thing to point to.
A pointer type is also a type, and a pointer object is also an object. So it is allowable to construct a pointer to pointer: the pointer-to-pointer just stores the address of the pointer object.
What a pointer can not be:
It cannot point to a value: p = &4; is impossible. 4 is a literal value, which is not stored in an object, and thus has no address.
The same goes for expressions: p = &(1+4); is impossible, because the expression "1+4" does not have a location.
The same goes for return values: p = &sin(pi); is impossible; the return value is not an object and thus has no address.
Variables marked as "register" (almost extinct now) cannot have their address taken.
You cannot take the address of a bitfield, basically because these can be smaller than a character (or have a finer granularity), hence it would be possible for different bitfields to have the same address.
There are some "exceptions" to the above skeleton (void pointers, casting, pointing one element beyond an array object), but for clarity these should be seen as refinements/amendments, IMHO.