Pointer array of pointers with C? - c

I want an array of pointers and I want to set byte values in the memory addresses where the pointers (of the array) are pointing.
Would this work:
unsigned int *pointer[4] = {(unsigned int *) 0xFF200020, (unsigned int *) 0xFF20001C, (unsigned int *) 0xFF200018, (unsigned int *) 0xFF200014};
*pointer[0] = 0b0111111; // the value is correct for the address
Or is the syntax somehow different?
EDIT:
I'm coding for an SoC board and these are memory addresses that contain the state of some UI elements.
unsigned int *element1 = (unsigned int *) 0xFF200020;
*element1 = 0b0111111;
works, so I'm just interested in the C syntax of this.
EDIT2: There was one 0 too much in ... = 0b0...

Short answer:
Everything you've written is fine.
Thoughts:
I'm a big fan of using the types from stdint.h. This would let you write uint32_t which is more clearly a 32 bit unsigned number than unsigned long.
You'll often see people write macros to refer to these registers:
#define REG_IRQ (*(volatile uint32_t *)(0xFF200020))
REG_IRQ = 0x42;
It's possible that you actually want these to be pointers to volatile integers. You want volatile if the value can change outside of the execution of your program, that is, if that memory location doesn't act strictly like a piece of memory. (For example, it's a register that stores the interrupt flags.)
With most compilers I've used on embedded platforms, you'll have problems from ignoring volatile once optimizations have been enabled.
0b00111111 is, sadly, non-standard. You can use octal, decimal, or hexadecimal.
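Putting those thoughts together, a minimal sketch of the array from the question might look like this (the addresses are the ones from the question; what each register controls is board-specific, and 0x3F is just the question's binary value written in hex):
#include <stdint.h>

int main(void)
{
    /* Addresses taken from the question; meaningful only on the SoC board itself. */
    volatile uint32_t *reg[4] = {
        (volatile uint32_t *) 0xFF200020,
        (volatile uint32_t *) 0xFF20001C,
        (volatile uint32_t *) 0xFF200018,
        (volatile uint32_t *) 0xFF200014
    };

    *reg[0] = 0x3F;   /* same bit pattern as 0b0111111, in hex because binary literals are non-standard */
    return 0;
}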

Sure, this should work, provided you can find addresses in your own address space.
Most probably, though, you'll get a segmentation fault when running this code, because 0xFF200020 has very little chance of being in your program's address space.

This will not throw any error and will work fine, but hard-coding the memory address a pointer points to is generally not a good idea. Dereferencing an unknown or non-existent memory location will cause a segmentation fault, but if you are sure about the memory locations, hard-coding values to them as done here is totally fine.

Related

What does this complex expression using volatile, pointers and memory allocation do under the hood?

I know that people have already answered what volatile is and that (int *) is just casting but I cannot understand what really goes under the hood in this expression: volatile int *p = (int *)0x0
So we have a pointer p to an int that can obviously have its value changed unexpectedly. We assign it another pointer that points to the memory address 0? So is it a pointer to a pointer or am I making it more complex than it should be? I would really appreciate it if you could provide a simple sketch like this as it helps me understand.
A volatile NULL pointer does not make much sense, as no one is going to dereference it.
But if I make this expression to have more sense ( this is STM32 microcontroller specific example)
volatile uint32_t *GPIOA_CRL = (volatile uint32_t *)0x40010800UL;
I declare GPIOA_CRL as a pointer to a volatile uint32_t object. The assignment casts the address of this hardware register, given by the unsigned long constant, to that pointer type.
0x40010800UL - address of the hardware register
(volatile uint32_t *) - converts the unsigned long address to the pointer
Then I can set or read the value of this register.
Why have I defined this pointer as volatile? Because the register can be changed by the hardware, and the compiler will not know about it. So volatile forces the compiler to read the value stored at this address before every use (i.e. whenever I dereference the pointer) and to store it after every change. Otherwise the compiler might optimize those reads and writes away, since it sees no effect of them in the normal program execution path.
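As a small sketch of such a read-modify-write (the address is the one from this answer; the 4-bit field and the value written into it are assumptions for illustration, not taken from the reference manual):
#include <stdint.h>

int main(void)
{
    volatile uint32_t *GPIOA_CRL = (volatile uint32_t *)0x40010800UL;

    uint32_t cfg = *GPIOA_CRL;   /* volatile forces an actual read of the register         */
    cfg &= ~0xFUL;               /* clear the lowest 4 configuration bits (assumed field)   */
    cfg |=  0x2UL;               /* write an example value into that field (assumed value)  */
    *GPIOA_CRL = cfg;            /* volatile forces an actual write back                    */

    return 0;
}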
Let's analyze this line.
volatile int * p; declares a pointer to a volatile int. In terms of storage and semantics, a volatile int is a normal int, but volatile instructs the compiler that its value could change at any time or that accesses to it have side effects.
On the right side of the assignment you have (int *) 0x0. Here you tell the compiler, that it should assume a pointer to int which points to the address 0x0.
The full assignment volatile int * p = (int *) 0x0; assigns p the value 0x0. So in total you have told the compiler that at address 0x0 there is an integer value whose accesses have side effects. It can be accessed via *p.
volatile
You seem to be unclear what volatile means. Take a look at this code:
int * ptr = (int *) 0x0BADC0DE;
*ptr = 0;
*ptr = 1;
An optimizing compiler would take a short look at this code and say: setting *ptr to 0 has no effect, so we will skip that line and only assign 1 immediately, to save some execution time and shrink the binary at the same time.
So effectively the compiler would only compile the following code:
int * ptr = (int *) 0x0BADC0DE;
*ptr = 1;
Normally this is OK, but when we are talking about memory mapped IOs, things get different.
A memory mapped IO is some hardware that can be manipulated by accessing special RAM addresses. A very simple example would be an output -- a simple wire coming out of your processor.
So let's assume, that at the address 0x0BADC0DE is a memory mapped output. And when we write 1 to it, this output is set to high and when we write 0 the output is set to low.
When the compiler skips the first assignment, the output is not changed. But suppose we want to signal something to another component with a rising edge. In this case we need to get rid of the optimization, and one common way to do this is to use the volatile keyword. It instructs the compiler that it must not optimize away any read or write access to this location.
volatile int * ptr = (int *) 0x0BADC0DE;
*ptr = 0;
*ptr = 1;
Now the compiler will not optimize the accesses to *ptr and we are able to signal a rising edge to an outside component.
Memory-mapped IO is important in programming embedded systems, operating systems and drivers. Another use case for the volatile keyword is shared variables in parallel computing (which, in addition to the volatile keyword, would need some kind of mutex or semaphore to restrict simultaneous access to these variables).

How many pointer value combinations can exist in c?

I have a C program that does the following two operations:
struct element *e = (struct element*) malloc(sizeof(struct element));
long unsigned addr = (long unsigned) e;
From this, addr has the decimal value of the pointer. I can convert addr back to an element pointer and use that to get the element from memory.
I am wondering how many possible values addr can be. I know the maximum value of long unsigned is about 4.3 billion, but can I really have an addr value of 1? Is there a certain range of numbers that I can get, and what is that range dependent on, if anything?
Some addresses are reserved for the Operating System (OS) and are usually located in the low memory addresses. However, you shouldn't be too interested in the addresses* your data will have, since they are determined by the OS and are thus OS-dependent, and they also depend heavily on the current state of the OS (an OS with many programs running will behave differently from one with few, and may give your program different addresses in those two cases).
Read more in C : Memory layout of C program execution.
Use intptr_t (<stdint.h>) for the address, as BLUEPIXY said (more).
I have a C program that does struct element *e = (struct element*) malloc(sizeof(struct element));
Do I cast the result of malloc? No!
*Except that if you are writing OS code, or embedded code with no OS, you might very well be interested in what addresses your data will have, as Carey Gregory said, or in other special cases.
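As a sketch of that suggestion (struct element here is only a placeholder, since the real definition isn't shown in the question), the round trip through intptr_t looks like this:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct element { int dummy; };   /* placeholder; the real struct isn't shown in the question */

int main(void)
{
    struct element *e = malloc(sizeof *e);   /* no cast of malloc needed in C */
    if (e == NULL)
        return 1;

    intptr_t addr = (intptr_t)(void *)e;                      /* pointer -> integer */
    struct element *back = (struct element *)(void *)addr;    /* integer -> pointer */

    printf("%p %p\n", (void *)e, (void *)back);   /* both print the same address */
    free(e);
    return 0;
}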

Memory addressing and pointers in C

This is taken from C, and is based on that.
Let's imagine we have a 32 bit pointer
char* charPointer;
It points into some place in memory that contains some data. It knows that increments of this pointer are in 1 byte, etc.
On the other hand,
int* intPointer;
also points to some place in memory, and it knows that it should go up by 4 bytes if we add 1 to it.
The question is: how are we able to address the full 32 bits of addressable space (2^32 bytes, i.e. 4 gigabytes) with those pointers, if they obviously contain some information that distinguishes one type from another, for example char* from int*? That would leave not 32 bits for the address, but fewer.
While typing this question I started to think: maybe it is all syntactic sugar and really just for the compiler? Maybe a raw pointer is just 32 bits and it doesn't care about the type? Is that the case?
You might be confused by compile time versus run time.
During compilation, gcc (or any C compiler) knows the type of a pointer, in particular the type of the data pointed to by that pointer variable. So gcc can emit the right machine code: an increment of an int * variable (on a 32-bit machine with 32-bit int) is translated into an increment of 4 (bytes), while an increment of a char * variable is translated into an increment of 1.
During runtime, the compiled executable (which does not care about or need gcc) is only dealing with machine pointers, usually addresses of bytes (or of the start of some word).
Types (in C programs) are not known during runtime.
Some other languages (Lisp, Python, Javascript, ....) require the types to be known at runtime. In recent C++ (but not C) some objects (those having virtual functions) may have RTTI.
It is indeed syntactic sugar. Consider the following code fragment:
int t[2];
int a = t[1];
The second line is equivalent to:
int a = *(t + 1); // pointer addition
which itself is equivalent to:
int a = *(int*)((char*)t + 1 * sizeof(int)); // integer addition
After the compiler has checked the types it drops the casts and works only with addresses, lengths and integer addition.
Yes. A raw pointer is 32 bits of data (or 16 or 64 bits, depending on the architecture) and does not contain anything else. Whether it's int *, char * or struct sockaddr_in * is just information for the compiler, so it knows what number to actually add when incrementing the pointer and what type the result will have when you dereference it.
Your hypothesis is correct: to see how different kinds of pointer are handled, try running this program:
#include <stdio.h>

int main()
{
    char * pc = 0;
    int * pi = 0;
    printf("%p\n", (void *)(pc + 1));   /* %p expects a void pointer */
    printf("%p\n", (void *)(pi + 1));
    return 0;
}
You will note that adding one to a char* increased its numeric value by 1, while doing the same to the int* increased by 4 (which is the size of an int on my machine).
It's exactly as you say in the end - types in C are just a compile-time concept that tells to the compiler how to generate the code for the various operations you can perform on variables.
In the end pointers just boil down to the address they point to, the semantic information doesn't exist anymore once the code is compiled.
Incrementing an int* pointer is different from incrementing a char* solely because the pointer variable is declared as int*. You can cast an int* to char* and then it will increment by 1 byte.
So, yes, it is all just syntactic sugar. It makes some kinds of array processing easier and confuses void* users.
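A tiny sketch of that last point (the printed addresses will vary, but the distance from the original address differs by type):
#include <stdio.h>

int main(void)
{
    int x = 0;
    int  *ip = &x;
    char *cp = (char *)ip;   /* same address, viewed through a char pointer */

    printf("%p\n", (void *)(ip + 1));   /* advances by sizeof(int) bytes */
    printf("%p\n", (void *)(cp + 1));   /* advances by exactly 1 byte    */
    return 0;
}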

Does casting remove endian dependency in C/C++?

i.e. if we cast a C or C++ unsigned char array named arr as (unsigned short*)arr and then assign to it, is the result the same independent of machine endianness?
Side note - I saw the discussion on IBM and elsewhere on SO with example:
unsigned char endian[2] = {1, 0};
short x;
x = *(short *) endian;
...stating that the value of x will depend on the layout of endian, and hence the endianness of the machine. That means dereferencing an array is endian-dependent, but what about assigning to it?
*(short*) endian = 1;
Are all future short-casted dereferences then guaranteed to return 1, regardless of endianness?
After reading the responses, I wanted to post some context:
In this struct
struct pix {
unsigned char r;
unsigned char g;
unsigned char b;
unsigned char a;
unsigned char y[2];
};
replacing unsigned char y[2] with unsigned short y makes no difference for the individual struct, but when I make an array of these structs and put that in another struct, I've noticed that the size of the containing struct tends to be larger for the unsigned short version. Since I intend to make a large array, I went with unsigned char[2] to save space overhead. I'm not sure why, but I imagine the uchar[2] is easier to align in memory.
Because I need to do a ton of math with that variable y, which is meant to be a single short-length numerical value, I find myself casting to short a lot just to avoid individually accessing the uchar bytes... sort of a fast way to avoid ugly byte-specific math, but then I thought about endianness and whether my math would still be correct if I just cast everything like
*(unsigned short*)this->operator()(x0, y0).y = (ySum >> 2) & 0xFFFF;
...which is a line from a program that averages 4-adjacent-neighbors in a 2-D array, but the point is that I have a bunch of these operations that need to act on the uchar[2] field as a single short, and I'm trying to find the lightest (i.e. without an endian-based if-else statement every time I need to access or assign), endian-independent way of working with the short.
Because of the strict aliasing rule it's undefined behaviour, so it might be anything. If you did the same with a union, however, the answer is no: the result depends on machine endianness.
Each possible value of short has a so-called "object representation"[*], which is a sequence of byte values. When an object of type short holds that value, the bytes of the object hold that sequence of values.
You can think of endianness as just being one of the ways in which the object representation is implementation-dependent: does the byte with the lowest address hold the most significant bits of the value, or the least significant?
Hopefully this answers your question. Provided you've safely written a valid object representation of 1 as a short into some memory, when you read it back from the same memory you'll get the same value again, regardless of what the object representation of 1 actually is in that implementation. And in particular regardless of endianness. But as the others say, you do have to avoid undefined behavior.
[*] Or possibly there's more than one object representation for the same value, on exotic architectures.
Yes, all future dereferences will return 1 as well: As 1 is in range of type short, it will end up in memory unmodified and won't change behind your back once it's there.
However, the code itself violates effective typing: It's illegal to access an unsigned char[2] as a short, and may raise a SIGBUS if your architecture doesn't support unaligned access and you're particularly unlucky.
However, character-wise access of any object is always legal, and a portable version of your code looks like this:
short value = 1;
unsigned char *bytes = (unsigned char *)&value;
How value is stored in memory is of course still implementation-defined, i.e. you can't know what the following will print without further knowledge of the architecture:
assert(sizeof value == 2); // check for size 2 shorts
printf("%i %i\n", bytes[0], bytes[1]);

Casting an int pointer to a char ptr and vice versa

The problem is simple. As I understand, GCC maintains that chars will be byte-aligned and ints 4-byte-aligned in a 32-bit environment. I am also aware of C99 standard 6.3.2.3 which says that casting between misaligned pointer-types results in undefined operations. What do the other standards of C say about this? There are also many experienced coders here - any view on this will be appreciated.
int *iptr1, *iptr2;
char *cptr1, *cptr2;
iptr1 = (int *) cptr1;
cptr2 = (char *) iptr2;
There is only one standard for C (the one by ISO), with two versions (1989 and 1999), plus some pretty minor revisions. All versions and revisions agree on the following:
all data memory is byte-addressable, and chars are bytes
thus a char* will be able to address any data
void* is the same as char* except conversions to and from it do not require type casts
converting from int* to char* always works, as does converting back to int*
converting an arbitrary char* to int* is not guaranteed to work
The reason char pointers are guaranteed to work like this is so that you can, for example, copy integers from anywhere in memory to elsewhere in memory or to disk, and back, which turns out to be a pretty useful thing to do in low-level programming, e.g. graphics libraries.
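A small sketch of that idea, using nothing beyond standard C: copy an int byte by byte through char pointers (memcpy does essentially the same thing):
#include <stdio.h>

int main(void)
{
    int src = 42;
    int dst = 0;

    char *from = (char *)&src;   /* a char pointer may address any object's bytes */
    char *to   = (char *)&dst;

    for (unsigned i = 0; i < sizeof(int); i++)
        to[i] = from[i];         /* copy the object representation byte by byte */

    printf("%d\n", dst);         /* prints 42 */
    return 0;
}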
CPUs can be big-endian or little-endian, so the result is machine-dependent.
For example, after casting a pointer to the int value 0x01234567 to a char pointer, the dereferenced value could be 0x01 or 0x67.
You can try doing:
iptr1 = atoi(cptr1); // val now = pointed by cptr1
cptr2 = atoi(iptr2); // val now = pointed by iptr2
This worked for me in DevCpp!
