I have a struct declared and set in memory, and a global constant pointer to it. Throughout my program I dereference this pointer to access the different parts of the struct. However, there are times when the resulting address changes when the pointer is dereferenced from a specific function.
My struct
typedef struct configData_t
{
uint8_t version[4];
inputConfig_t inputModuleConfig [MAX_INPUT];
outputConfig_t outputModuleConfig [MAX_OUTPUT];
notificationConfig_t notificationConfig [MAX_NOTIFICATIONS];
functionConfig_t autoFunctionConfig [MAX_FUNCTIONS];
uint16_t Crc16;
} configData_t;
The constant pointer is declared by setting the memory address of the data (externally loaded and outside of the application's memory):
//Pointer points to memory location on uC (data already in memory)
const configData_t* theConfigData = (configData_t*)0x0460000;
To get a notification from the 'notificationConfig' array I dereference 'theConfigData' by [1]:
const notificationConfig_t *pNotificationConfig = theConfigData->notificationConfig + notificationID;
The following occurs when stepping through the code on the uC:
In function A, get notification from struct by using [1], pointer address is 0x463e18
In function A call function B, dereference the struct using [1] the address changes to 0x463e2a (This is the wrong memory address, 0x12 difference)
Function B finishes and returns to A, dereferencing theConfigData again using [1] gives 0x463e18
Every other function in the program that uses [1] always returns the correct (0x463e18) address.
Function B does not alter 'theConfigData' in any way. In the debugger's memory view, the data from 0x0460000 through 0x0460000 + sizeof(configData_t) is not altered in any way.
How is the 'pNotificationConfig' pointer changing address when going from function A to B?
You need to make sure that :
the definition of configData_t is exactly the same in both the function A and function B compilation units
the struct padding of configData_t is exactly the same for both the function A and function B compilation units
Red flags of the above for your specific issue would be eg. :
sizeof(configData_t) is different
offsetof(configData_t, notificationConfig) is different
sizeof(notificationConfig_t) is different
If one or more of these red flags are raised (and in a comment, you confirm that), you need to determine which of the two earlier options causes it :
a difference in definition can be caught by verifying the source code :
make sure the same struct definitions are used throughout the code (typically with the use of an include file)
make sure supporting compile time values are the same (eg. array dimensions MAX_INPUT, MAX_OUTPUT, ... in your case)
a difference in padding can be caused by the use of different compilers and/or different compiler flags - refer to your compiler's documentation for details (specifically wrt. struct padding/packing)
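One way to check the red flags above is to compare the layout numbers directly. The sketch below is only an illustration: the element types, their 18-byte size, and the two array dimensions are hypothetical, chosen so that the mismatch produces a 0x12 shift similar to the one in the question. It models two compilation units that see different values of MAX_INPUT as two struct types in a single file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element types; the real ones are not shown in the question. */
typedef struct { uint8_t raw[18]; } inputConfig_t;
typedef struct { uint8_t raw[6]; } notificationConfig_t;

/* Layout as seen by a unit compiled with MAX_INPUT == 4 (assumed value). */
typedef struct {
    uint8_t version[4];
    inputConfig_t inputModuleConfig[4];
    notificationConfig_t notificationConfig[2];
} configSeenByA_t;

/* Layout as seen by a unit compiled with MAX_INPUT == 5 (assumed value). */
typedef struct {
    uint8_t version[4];
    inputConfig_t inputModuleConfig[5];
    notificationConfig_t notificationConfig[2];
} configSeenByB_t;

/* Byte offset of notificationConfig in each view of the struct. */
size_t offset_seen_by_a(void)
{
    return offsetof(configSeenByA_t, notificationConfig);
}

size_t offset_seen_by_b(void)
{
    return offsetof(configSeenByB_t, notificationConfig);
}
```

In the real project, printing sizeof(configData_t), offsetof(configData_t, notificationConfig) and sizeof(notificationConfig_t) from the source file of function A and from the source file of function B, then comparing the numbers, would show whether the two files disagree about the layout.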
The problem is that local pointer variables to the struct appear to have the same address, although they are defined locally within each function.
The question is why they are appearing to share a memory location (attached debugger screenshots), and how can they have their own unique memory allocation.
typedef struct Map Map;
struct Map {
const char *key;
const void *value;
Map *next;
};
void testPut() {
struct Map *map;
Test test1 = {"name1", 1};
Test test2 = {"name2", 2};
Test test3 = {"name3", 3};
}
void testGet() {
struct Map *map; //this seems to have same address as map in testPut
Test test1 = {"name1", 1};
Test test2 = {"name2", 2};
Test test3 = {"name3", 3};
}
[Debugger screenshots: testPut, testGet]
The pointers are located on the stack. Both functions seem to have the same stack size. If you call them one after another the first variable of the second function will probably have the same address as the first variable of the first call.
So when you jump into testPut(), the stack grows by some amount (I do not know exactly, but assume 8 bytes).
When the function exits, the stack shrinks by those 8 bytes again.
And when you then enter testGet(), the stack frame for that function is created with the same size as the one for testPut(). The addresses in testPut() and testGet() are then the same.
The address should change if you try to:
add a simple int foo; in front of struct Map *map; in testGet() or
as stated in the comments put a static in front of the struct
When an object is defined inside a function without static, extern, or _Thread_local, it has automatic storage duration. This means memory is reserved for it only until execution of its associated block ends. When the function returns, the memory may be reused for other purposes.
You have examined calls to testGet and testPut at different times. The memory is simply being used for one purpose at one time and a different purpose at another time.
The local variables in testPut and testGet appear to have the same address in memory because they have automatic storage, hence are created on the stack upon entering each function (by code at the beginning of the function) and both functions happen to be called with the same stack depth and define the same local variables.
This is completely circumstantial, these addresses may change:
if you call testPut() or testGet() from another place in your program
if you run the program another time on modern OSes that implement address space randomisation
if you compile with a different compiler or just different flags.
if you change something in the rest of these functions
if you change something elsewhere in the program
if anything else happens or does not happen...
It is educational to try and understand how objects and statements are implemented in memory and code, and it will help you grasp the concept behind pointers, but remember that none of this is defined by the C Standard, only the semantics of the abstract machine.
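To see the difference between automatic and static storage in practice, a small sketch like the following can help. The function names are hypothetical and not part of the question's code. Each function converts the address of a local object to an integer while the object is still alive, so the caller can print or compare the numbers.

```c
#include <stdint.h>

/* Address of an automatic local, taken while the object still exists.
   The storage may be reused for something else after the function returns. */
uintptr_t automatic_address(void)
{
    int local = 0;
    return (uintptr_t)(void *)&local;
}

/* Address of a static local: the object lives for the whole program run,
   so every call reports the same location. */
uintptr_t static_address(void)
{
    static int persistent = 0;
    return (uintptr_t)(void *)&persistent;
}
```

Two back-to-back calls to automatic_address() from the same caller will often return the same number, but nothing guarantees it. static_address() returns the same value on every call within one run of the program.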
Take this arguably questionable code.
struct X {
int arr[1];
float something_else;
};
struct X get_x(int first)
{
struct X ret = { .arr = { first } };
return ret;
}
int main(int argc, char **argv) {
int *p = get_x(argc+50).arr;
return *p;
}
get_x returns a struct X.
I'm only interested in its member arr. Why would I make a local variable for the entire struct if I only want arr...
But.. is that code correct?
In the shown example, does the C standard know to keep the return value of get_x on the stack until the end of the calling stack frame because I'm peeking inside it with a pointer?
What you're doing is not allowed by the standard.
The struct returned from the function has temporary lifetime which ends outside of the expression it is used in. So right after p is initialized, it points to an object whose lifetime has ended and its value becomes indeterminate. Then attempting to dereference p (which is now indeterminate) in the following statement triggers undefined behavior.
This is documented in section 6.2.4p8 of the C standard:
A non-lvalue expression with structure or union type, where
the structure or union contains a member with array type
(including, recursively, members of all contained structures and
unions) refers to an object with automatic storage duration and
temporary lifetime. Its lifetime begins when the expression is evaluated and its initial value is the value of the expression.
Its lifetime ends when the evaluation of the containing full
expression or full declarator ends. Any attempt to modify an
object with temporary lifetime results in undefined behavior.
Where the lifetime of an object and what happens to a pointer to an object when its lifetime ends is specified in section 6.2.4p2:
The lifetime of an object is the portion of program
execution during which storage is guaranteed to be reserved
for it. An object exists, has a constant address, and retains
its last-stored value throughout its lifetime. If an object
is referred to outside of its lifetime, the behavior is
undefined. The value of a pointer becomes indeterminate when the
object it points to (or just past) reaches the end of its lifetime
If you were to assign the return value of the function to an instance of struct X, then you can safely access the arr member of that instance.
In the shown example, does the C standard know to keep the return value of get_x on the stack until the end of the calling stack frame because I'm peeking inside it with a pointer?
No, it cannot ever do this, even if it "knew" to do so. Things are popped off the stack when a function returns, and the contents of anything "above" that point become undefined.
Even so,
But.. is that code correct?
That part is! This is because you are not creating a pointer to the struct that was in the callee's stack frame. You are creating a pointer to a copy, which was implicitly created when you returned a struct by value.
Conceptually, the code will copy this struct into space reserved in the caller's stack frame (because you're specifically calling a function that returns a struct, in the general case the value can't be returned in a register). In practice, an optimizing compiler might return it in a register (if your machine's registers can fit a struct containing an int and a float), construct it directly in place in the caller's stack frame (the right location can easily be found as an offset from the base of the callee's stack frame), shuffle memory around (a destructive overlapping-move operation is acceptable exactly because of the "memory contents are now undefined" thing), etc.
... But only that part, as pointed out by @dbush. To create a copy properly (i.e., with a long enough lifetime to use this way), the return value from the function would need to be an lvalue. Conceptually, the compiler is allowed to pop that copy off the stack once it's done retrieving the .arr member. In practice, the stack pointer wouldn't get adjusted, but an optimizing compiler would consider that part of the stack free to use for other local variables.
Does the C standard keep the return struct of a function on the stack if I keep a pointer to a value inside it?
No, not if the struct is a NON l-value, meaning you have not stored it into a variable after it was returned from the function.
In the shown example, does the C standard know to keep the return value of get_x on the stack until the end of the calling stack frame because I'm peeking inside it with a pointer?
No. Read the C standard reference in @dbush's answer.
The problem isn't the get_x() function; that's all fine. Rather, in the erroneous code in the original question and in Example 1 below, the problem is simply that the returned-by-value struct X (returned by get_x()) is NOT an l-value (it is never assigned to a variable), so it is ephemeral: its lifetime ends once the int *p = get_x(argc+50).arr; line is evaluated. Therefore, the *p in return *p is undefined behavior, since it accesses memory for a struct X which was never stored into an l-value and therefore no longer exists. Examples 2 and 3 below, however, solve this problem, exhibit no undefined behavior, and are valid.
Example 1 (from the question; is undefined behavior):
Therefore, this is NOT legal:
int *p = get_x(argc+50).arr;
return *p;
See these warnings output by the clang 11.0.1 LLVM C++ compiler: https://godbolt.org/z/PajThdsxz :
<source>:15:14: warning: temporary whose address is used as
value of local variable 'p' will be destroyed at the end of
the full-expression [-Wdangling]
int *p = get_x(argc+50).arr;
^~~~~~~~~~~~~~
1 warning generated.
ASM generation compiler returned: 0
<source>:15:14: warning: temporary whose address is used as
value of local variable 'p' will be destroyed at the end of
the full-expression [-Wdangling]
int *p = get_x(argc+50).arr;
^~~~~~~~~~~~~~
1 warning generated.
Execution build compiler returned: 0
Program returned: 51
When using the clang 11.0.1 C compiler, however, no such warnings exist: https://godbolt.org/z/Y3zdszMvG. I don't know why.
Example 2 (ok):
But this is fine:
int p = get_x(argc+50).arr[0];
return p;
Example 3 (ok):
...and this is fine too:
struct X x = get_x(argc+50);
int *p = x.arr;
return *p;
Interestingly enough though, the compiled assembly generated by all 3 versions above is exactly identical (only when compiled in C++), indicating that while the first may be undefined, it works just as well as the other two for this particular compiler when compiled in C++. Here is the C++ assembly output:
get_x(int): # @get_x(int)
mov eax, edi
ret
main: # @main
push rax
add edi, 50
call get_x(int)
pop rcx
ret
However, the C-compiler-generated assembly is different for all 3 cases, and significantly longer than the C++-compiler-generated assembly. See the last godbolt link just above to see for yourself.
It looks like the clang C++ compiler is significantly smarter than the clang C compiler.
OP goals differ from code.
I'm only interested in its member arr. Why would I make a local variable for the entire struct if I only want arr... (?)
Member .arr is an array. So int *p = get_x(argc+50).arr; does not copy the array .arr, but copies the address of the .arr[0] into p. Copying the address does not fulfill "only interested in its member arr". If you want the data in .arr, copy the data.
To make a copy of only array .arr and not the entire struct X, use memcpy().
int my_copy[sizeof (struct X){0}.arr / sizeof *(struct X){0}.arr];
memcpy(my_copy, get_x(argc+50).arr, sizeof my_copy);
return my_copy[0];
I was trying to pass a set of parameters to a task. So I created a struct and passed it to my task, like this:
my_type_t parameters_set;
//... assigning values to parameters_set
xTaskCreate( vMyTask, "MyTask", STACK_SIZE, (void*)parameters_set, 2,
&xHandle);
Inside my task I tried to retrieve the struct values doing the following:
my_type_t received_parameters = (my_type_t) pvParameters;
The task creation line triggers the following error while compiling: "cannot convert to a pointer type" and the struct retrieving line triggers the "Conversion to non-scalar type requested" error. I know if I use a pointer instead of the variable itself it will compile, but I can't do this because the function that creates the task will die and the task will have a reference to a variable that does not exist anymore. In the end I'll use a global struct variable.
But what I really would like to understand is why do tasks accept int values (not by reference), and don't accept a typedef struct? For example, the following snippet builds and works without problem:
//Inside the function that creates the task
int x = 0;
xTaskCreate( vMyTask, "MyTask", STACK_SIZE, (void*)x, 2,
&xHandle);
//Inside the Task
int received_parameter = (int) pvParameters;
Thanks in advance!
The code you show is not taking an int value as a parameter. You pass the parameter using (void*)x, which is an expression that converts the int x to a pointer.
The function xTaskCreate accepts a pointer. You have given it a pointer, so the compiler does not complain. However, the pointer you have given it is not a pointer to x (which would be written as &x) but is a pointer converted from the value of x. Although the compiler does not complain, this is generally not the right thing to pass to xTaskCreate.
The reason this is not working with a structure is that (void*)parameters_set is not a proper expression when parameters_set is a structure. This is because C provides for integers to be converted to pointers but not for structures to be converted to pointers. Generally, there is a natural correspondence between integers and pointers: In a “flat” address space, every byte in memory has an address, and those addresses are essentially counts of bytes from the “start” of the memory address space. So an integer can serve as an address and vice-versa. (In other address spaces, the correspondence may be more complicated, but C still allows the conversions, with some rules.) Thus, when you write (void*)x, the compiler converts the integer value in x to a pointer.
There is no such correspondence between structures and pointers, so the C standard does not define any conversion from structures to pointers, and the compiler complains when you write (void*)parameters_set.
What you should be passing for this parameter is the address of some data to be given to the task. If you want the task to have the data in x, you should pass &x. If you want the task to have the data in parameters_set, you should pass &parameters_set. The & operator takes the address of its operand, which is different from using the (void*) cast, which attempts to convert its operand to a pointer.
According to the RTOS documentation for xTaskCreate, you should not pass the address of a “stack variable,” by which it effectively means, for C, an object with automatic storage duration. In other words, you should not pass an x or parameters_set that is defined within a function without static or (possibly, depending on arrangements made by RTOS and the C implementation; I am unfamiliar with RTOS) _Thread_local. Alternatively, you can allocate space for the parameters using malloc (or, per comment below, pvPortMalloc for RTOS) and pass that to xTaskCreate.
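The pointer round trip described above can be sketched without any RTOS at all. In the example below, fake_task_create is a hypothetical stand-in that simply calls the task function, the fields of my_type_t are invented for illustration, and the seen_* variables exist only so the result can be inspected. The point is the pattern: pass the address of an object with static storage duration, and convert the void * back to a struct pointer inside the task.

```c
/* Hypothetical parameter struct; the real fields are not shown in the question. */
typedef struct {
    int channel;
    int period_ms;
} my_type_t;

/* Last values seen by the task, kept only so the example can be checked. */
int seen_channel;
int seen_period_ms;

/* Task body: the void * is converted back to a pointer to the struct. */
void vMyTask(void *pvParameters)
{
    const my_type_t *params = pvParameters;   /* no cast needed from void * in C */
    seen_channel = params->channel;
    seen_period_ms = params->period_ms;
}

/* Hypothetical stand-in for a task-creation call: it just runs the function. */
void fake_task_create(void (*task)(void *), void *pvParameters)
{
    task(pvParameters);
}

/* Static storage duration: the struct outlives the function that starts the task. */
static my_type_t parameters_set = { .channel = 3, .period_ms = 250 };

void start_task(void)
{
    /* Pass the address of the struct, not the struct converted to a pointer. */
    fake_task_create(vMyTask, &parameters_set);
}
```

With the call from the question, the same idea would presumably read xTaskCreate( vMyTask, "MyTask", STACK_SIZE, &parameters_set, 2, &xHandle); with parameters_set declared static or at file scope.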
I have a
LS_Led* LS_vol_leds[10];
declared in one C module, and the proper externs in the other modules that access it.
In func1() I have this line:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
And it does not cause an exception. Then
I call func2() in another C module (right after above line), and do the same line, namely:
/* Debug */
LS_Led led = *(LS_vol_leds[0]);
first thing, and exception thrown!!!
I don't think I have the powers to debug this one on my own.
Before anything LS_vol_leds is initialized in func1() with:
LS_vol_leds[0] = &led3;
LS_vol_leds[1] = &led4;
LS_vol_leds[2] = &led5;
LS_vol_leds[3] = &led6;
LS_vol_leds[4] = &led7;
LS_vol_leds[5] = &led8;
LS_vol_leds[6] = &led9;
LS_vol_leds[7] = &led10;
LS_vol_leds[8] = &led11;
LS_vol_leds[9] = &led12;
My externs look like
extern LS_Led** LS_vol_leds;
So does that lead to disaster, and how do I prevent disaster?
Thanks.
This leads to disaster:
extern LS_Led** LS_vol_leds;
You should try this instead:
extern LS_Led *LS_vol_leds[];
If you really want to know why, you should read Expert C Programming - Deep C Secrets, by Peter Van Der Linden (amazing book!), especially chapter 4, but the quick answer is that this is one of those corner cases where pointers and arrays are not interchangeable: a pointer is a variable which holds the address of another one, whereas an array name is an address. extern LS_Led** LS_vol_leds; is lying to the compiler and generating the wrong code to access LS_vol_leds[i].
With this:
extern LS_Led** LS_vol_leds;
The compiler will believe that LS_vol_leds is a pointer, and thus, LS_vol_leds[i] involves reading the value stored in the memory location associated with LS_vol_leds, using that as an address, and then scaling i accordingly to get the offset.
However, since LS_vol_leds is an array and not a pointer, the compiler should instead pick the address of LS_vol_leds directly. In other words: what is happening is that your original extern causes the compiler to dereference LS_vol_leds[0] because it believes that LS_vol_leds[0] holds the address of the pointed-to object.
UPDATE: Fun fact - the back cover of the book talks about this specific case:
So that's why extern char *cp isn't the same as extern char cp[]. I
knew that it didn't work despite their superficial equivalence, but I
didn't know why. [...]
UPDATE2: Ok, since you asked, let's dig deeper. Consider a program split into two files, file1.c and file2.c. Its contents are:
file1.c
#define BUFFER_SIZE 1024
char cp[BUFFER_SIZE];
/* Lots of code using cp[i] */
file2.c
extern char *cp;
/* Code using cp[i] */
The moment you try to assign to cp[i] or use cp[i] in file2.c, your code will most likely crash. This is deeply tied to the mechanics of C and the code that the compiler generates for array-based accesses and pointer-based accesses.
When you have a pointer, you must think of it as a variable. A pointer is a variable like an int, float or something similar, but instead of storing an integer or a float, it stores a memory address - the address of another object.
Note that variables have addresses. When you have something like:
int a;
Then you know that a is the name for an integer object. When you assign to a, the compiler emits code that writes into whatever address is associated with a.
Now consider you have:
char *p;
What happens when you access *p? Remember - a pointer is a variable. This means that the memory address associated with p holds an address - namely, an address holding a character. When you assign to p (i.e., make it point to somewhere else), then the compiler grabs the address of p and writes a new address (the one you provide it) into that location.
For example, if p lives at 0x27, it means that reading memory location 0x27 yields the address of the object pointed to by p. So, if you use *p in the right hand side of an assignment, the steps to get the value of *p are:
Read the contents of 0x27 - say it's 0x80 - this is the value of the pointer, or, equivalently, the address of the pointed-to object
Read the contents of 0x80 - this finally gives you *p.
What if p is an array? If p is an array, then the variable p itself represents the array. By convention, the address representing an array is the address of its first element. If the compiler chooses to store the array in address 0x59, it means that the first element of p lives at 0x59. So when you read p[0] (or *p), the generated code is simpler: the compiler knows that the variable p is an array, and the address of an array is the address of the first element, so p[0] is the same as reading 0x59. Compare this to the case for which p is a pointer.
If you lie to the compiler, and tell it you have a pointer instead of an array, the compiler will (wrongly) generate code that does what I showed for the pointer case. You're basically telling it that 0x59 is not the address of an array, it's the address of a pointer. So, reading p[i] will cause it to use the pointer version:
Read the contents of 0x59 - note that, in reality, this is p[0]
Use that as an address, and read its contents.
So, what happens is that the compiler thinks that p[0] is an address, and will try to use it as such.
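The mismatch can be made visible in a single file without actually lying to the compiler. The sketch below is an illustration under one assumption: that uintptr_t has the same size as char *, which is common but not guaranteed. The function names are hypothetical. It copies the first bytes of the array into an integer, which is what code compiled against extern char *cp; would load and treat as the pointer value, and compares that with the array's real address.

```c
#include <stdint.h>
#include <string.h>

/* The real object is an array; its bytes are character data. */
char cp[32] = "ABCDEFGHIJKLMNOP";

/* What code compiled against `extern char *cp;` would load as "the pointer":
   the first bytes stored at cp, i.e. the characters themselves. */
uintptr_t bogus_pointer_bits(void)
{
    uintptr_t bits = 0;
    memcpy(&bits, cp, sizeof bits);
    return bits;
}

/* What code compiled against `extern char cp[];` uses: the array's own address. */
uintptr_t real_array_address(void)
{
    return (uintptr_t)(void *)cp;
}
```

Printing both values shows the problem: the first is just the text "ABCD..." reinterpreted as a number, and using it as an address is what leads to the crash.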
Why is this a corner case? Why don't I have to worry about this when passing arrays to functions?
Because what is really happening is that the compiler manages it for you. Yes, when you pass an array to a function, a pointer to the first element is passed, and inside the called function you have no way to know if it is a "real" array or a pointer. However, the address passed into the function is different depending on whether you're passing a real array or a pointer. If you're passing a real array, the pointer you get is the address of the first element of the array (in other words: the compiler immediately grabs the address associated to the array variable from the symbol table). If you're passing a pointer, the compiler passes the address that is stored in the address associated with that variable (and that variable happens to be the pointer), that is, it does exactly those 2 steps mentioned before for pointer-based access. Again, note that we're discussing the value of the pointer here. You must keep this separated from the address of the pointer itself (the address where the address of the pointed-to object is stored).
That's why you don't see a difference. In most situations, arrays are passed around as function arguments, and this rarely raises problems. But sometimes, with some corner cases (like yours), if you don't really know what is happening down there, well.. then it will be a wild ride.
Personal advice: read the book, it's totally worth it.
I am learning C, mainly by K&R, but now I have found an Object Oriented C pdf tutorial and am fascinated. I'm going through it, but my C skills/knowledge may not be up to the task.
This is the tutorial: http://www.planetpdf.com/codecuts/pdfs/ooc.pdf
My question comes from looking at many different functions in the first couple of chapters of the pdf. Below is one of them. (page 14 of pdf)
void delete(void * self){
const struct Class ** cp = self;
if (self&&*cp&&(*cp)->dtor)
self = (*cp)->dtor(self);
free(self);
}
dtor is a destructor function pointer. But knowledge of this isn't really necessary for my questions.
My first question is, why is **cp constant? Is it necessary or just being thorough so the code writer doesn't do anything damaging by accident?
Secondly, why is cp a pointer-to-a-pointer (double asterisk?). The struct class was defined on page 12 of the pdf. I don't understand why it can't be a single pointer, since we are casting the self pointer to a Class pointer, it seems.
Thirdly, how is a void pointer being changed to a Class pointer (or pointer-to-a-Class-pointer)? I think this question most shows my lack of understanding of C. What I imagine in my head is a void pointer taking up a set amount of memory, but it must be less than Class pointer, because a Class has a lot of "stuff" in it. I know a void pointer can be "cast" to another type of pointer, but I don't understand how, since there may not be enough memory to perform this.
Thanks in advance
Interesting pdf.
My first question is, why is **cp constant? Is it necessary or just
being thorough so the code writer doesn't do anything damaging by
accident?
It's necessary so the writer doesn't do anything by accident, yes, and to communicate something about the nature of the pointer and its use to the reader of the code.
Secondly, why is cp a pointer-to-a-pointer (double asterisk?). The
struct class was defined on page 12 of the pdf. I don't understand why
it can't be a single pointer, since we are casting the self pointer to
a Class pointer, it seems.
Take a look at the definition of new() (pg 13) where the pointer p is created (the same pointer that's passed as self to delete()):
void * new (const void * _class, ...)
{
const struct Class * class = _class;
void * p = calloc(1, class -> size);
* (const struct Class **) p = class;
So, 'p' is allocated space, then dereferenced and assigned a pointer value (the address in class; this is like dereferencing and assigning to an int pointer, but instead of an int, we're assigning an address). This means the first thing in p is a pointer to its class definition. However, p was allocated space for more than just that (it will also hold the object's instance data). Now consider delete() again:
const struct Class ** cp = self;
if (self&&*cp&&(*cp)->dtor)
When cp is dereferenced, since it was a pointer to a pointer, it's now a pointer. What does a pointer contain? An address. What address? The pointer to the class definition that's at the beginning of the block pointed to by p.
This is sort of clever, because p's not really a pointer to a pointer -- it has a larger chunk of memory allocated which contains the specific object data. However, at the very beginning of that block is an address (the address of the class definition), so if p is dereferenced into a pointer (via casting or cp), you have access to that definition. So, the class definition exists only in one place, but each instance of that class contains a reference to the definition. Make sense? It would be clearer if p were typed as a struct like this:
struct object {
struct class *class;
[...]
};
Then you could just use something like p->class->dtor() instead of the existing code in delete(). However, this would mess up and complicate the larger picture.
Thirdly, how is a void pointer being changed to a Class pointer (or
pointer-to-a-Class-pointer)? I think this question most shows my lack
of understanding of C. What I imagine in my head is a void pointer
taking up a set amount of memory, but it must be less than Class
pointer, because a Class has a lot of "stuff" in it.
A pointer is like an int -- it has a small, set size for holding a value. That value is a memory address. When you dereference a pointer (via * or ->) what you are accessing is the memory at that address. But since memory addresses are all the same length (eg, 8 bytes on a 64-bit system) pointers themselves are all the same size regardless of type. This is how the magic of the object pointer 'p' worked. To re-iterate: the first thing in the block of memory p points to is an address, which allows it to function as a pointer to a pointer, and when that is dereferenced, you get the block of memory containing the class definition, which is separate from the instance data in p.
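The layout described above can be sketched in a few lines. This is a simplified illustration, not the tutorial's actual code: struct Point, new_object, delete_object, dtor_calls and the member name cls are all hypothetical, and the variadic constructor arguments of the original new() are left out.

```c
#include <stdlib.h>

struct Class {
    size_t size;                    /* bytes to allocate for one instance */
    void *(*dtor)(void *self);      /* optional destructor */
};

/* Every object starts with a pointer to its class description. */
struct Point {
    const struct Class *cls;
    int x, y;
};

int dtor_calls;                     /* counts destructor runs, for illustration */

void *Point_dtor(void *self)
{
    dtor_calls++;
    return self;                    /* the returned pointer is what gets freed */
}

const struct Class PointClass = { sizeof(struct Point), Point_dtor };

void *new_object(const struct Class *cls)
{
    void *p = calloc(1, cls->size);
    if (p)
        *(const struct Class **)p = cls;     /* store the class pointer at offset 0 */
    return p;
}

void delete_object(void *self)
{
    const struct Class **cp = self;          /* view the block as "pointer to class pointer" */
    if (self && *cp && (*cp)->dtor)
        self = (*cp)->dtor(self);
    free(self);
}
```

Because the class pointer is the first member, reading the object through a const struct Class ** gives the same value as reading the cls member through a struct Point *, which is why delete_object can find the destructor without knowing the concrete type.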
In this case, that's just a precaution. The function shouldn't be modifying the class (in fact, nothing should probably), so casting to const struct Class * makes sure that the class is more difficult to inadvertently change.
I'm not super-familiar with the Object-Oriented C library being used here, but I suspect this is a nasty trick. The first pointer in self is probably a reference to the class, so dereferencing self will give a pointer to the class. In effect, self can always be treated as a struct Class **.
A diagram may help here:
+--------+
self -> | *class | -> [Class]
| .... |
| .... |
+--------+
Remember that all pointers are just addresses.* The type of a pointer has no bearing on the size of the pointer; they're all 32 or 64 bits wide, depending on your system, so you can convert from one type to another at any time. The compiler will warn you if you try to convert between types of pointer without a cast, but void * pointers can always be converted to anything without a cast, as they're used throughout C to indicate a "generic" pointer.
*: There are some odd platforms where this isn't true, and different types of pointers are in fact sometimes different sizes. If you're using one of them, though, you'd know about it. In all probability, you aren't.
const is used to cause a compilation error if the code attempts to change anything within the object pointed to. This is a safety feature when the programmer intends only to read the object and does not intend to change it.
** is used because that must be what was passed to the function. It would be a grave programming error to re-declare it as something it is not.
A pointer is simply an address. On almost all modern CPUs, all addresses are the same size (32 bit or 64 bit). Changing a pointer from one type to another doesn't actually change the value. It says to regard what is at that address as a different layout of data.
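A short sketch can make the last point concrete. The type struct Big and the function names are hypothetical. The example shows that converting a pointer to void * and back yields a pointer equal to the original, and that the size of a pointer is unrelated to the size of the object it points to.

```c
#include <stddef.h>

/* A deliberately large type, so pointer size and object size clearly differ. */
struct Big { char payload[4096]; };

/* Converting to void * and back yields a pointer equal to the original. */
struct Big *round_trip(struct Big *original)
{
    void *generic = original;      /* implicit conversion to void * */
    return generic;                /* implicit conversion back */
}

/* A pointer's size does not depend on the size of what it points to. */
size_t pointer_size(void) { return sizeof(struct Big *); }
size_t pointee_size(void) { return sizeof(struct Big); }
```

On a typical 64-bit system pointer_size() would be 8 while pointee_size() is at least 4096, so no extra memory is needed to "fit" the struct into the pointer.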