24-bit const pointers on XC8 PIC18 not working

I came across this problem twice in my project and the last time I used a kind of dirty solution.
Platform: PIC18F87J60, XC8 v1.12
I'm trying to use function pointers to point to functions that possibly reside in the upper half of my ROM (>= 0x10000). This means that the pointer itself needs to be 17 bits or wider (up to 20) to be able to address such a function.
This is the relevant code snippet (simplified):
void test(void) @ 0x1C000
{
printf("function pointer called!\r\n");
}
void main(void) {
void (*testPointer) (void) = &test;
//Now testPointer contains 0x0C000
(*testPointer)(); //Doesn't call test. Instead it jumps to 0x0C000
}
What happens is that test never actually gets called. When I use the debugger (PICkit 3) I can see that the value in testPointer is 0x0C000. It seems that the address in the pointer is always truncated to 16 bits. But when I place test() somewhere below 0x10000, everything works fine, because then the pointer needs at most 16 bits.
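The truncation described above can be reproduced on any host with plain integer arithmetic. This is only a sketch of the symptom, not of what XC8 does internally; the function name truncate_address is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low `bits` bits of a program-memory address,
   mimicking a pointer type that is too narrow for the target.
   Hypothetical helper, not part of any compiler library. */
uint32_t truncate_address(uint32_t addr, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t mask = (bits == 32) ? UINT32_MAX
                                 : (((uint32_t)1 << bits) - 1u);
    return addr & mask;
}
```

With a 16-bit pointer, truncate_address(0x1C000, 16) gives 0x0C000, which matches the value seen in the debugger. With 17 bits or more the address survives intact.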
When I read back the program from the device test() really is placed at 0x1C000 so that is not the problem, the code is there.
The last time, I solved the situation by casting a literal long to a pointer. That worked, but it's dirty and now I want to avoid it.
Does anyone recognize the problem? Is this a compiler bug? If so, does Microchip already know about this? Any clean work-arounds? Does the XC8 compiler support 20-bit const pointers at all?
Edit: fixed typo in code above: &testPointer(); --> (*testPointer)(); (no, this was not causing my problem)

The MPLAB C18 Compiler User's Guide lists a few extra storage qualifiers that appear to be relevant to your use-case:
near/far Program Memory Objects
The far qualifier is used to denote that a variable that is located in program memory can be found anywhere in program memory, or, if a pointer, that it can access up to and beyond 64K of program memory space.
ram/rom Qualifiers
The rom qualifier denotes that the object is located in program memory, whereas the ram qualifier denotes that the object is located in data memory.
Later on, the manual shows an example of creating "a function pointer that can access up to and beyond 64K of program memory space":
far rom void (*fp) (void);
The XC8 manual is less clear about the function of the far qualifier, but still lists it, which strongly suggests that it still is recognized by the newer compilers.

Related

How are const char and char pointers represented in memory in STM32

How does the MCU know whether the string a variable is pointing to is in data memory or in program memory?
What does compiler do when I'm casting a const char * to char * (e.g. when calling strlen function)?
Can char * be used as a char * and const char * without any performance loss?
The STM32s use a flat 32-bit address space, so RAM and program memory (flash) are in the same logical space.
The Cortex core of course knows which type of memory is where, probably through hardware address decoders that are triggered by the address being accessed. This is of course way outside the scope of what C cares about, though.
Dropping const is not a run-time operation, so there should be no performance overhead. Of course dropping const is bad, since somewhere you risk someone actually believing that a const pointer means data there won't be written to, and going back on that promise can make badness happen.
Taking an STM32F4 with 1MB of flash/ROM and 192KB of RAM (128KB SRAM + 64KB CCM) as an example, the memory map looks something like this:
Flash/ROM - 0x08000000 to 0x080FFFFF (1MB)
RAM - 0x20000000 to 0x2001FFFF (128KB)
There are more areas with separate address ranges that I won't cover here, to keep the explanation simple. Such memories include Backup SRAM and CCM RAM, just to name two. In addition, each area may be further divided into sections, such as RAM being divided into bss, stack and heap.
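To make the map above concrete, here is a minimal sketch that classifies an address using only the two ranges listed. The enum and function names are invented for this example, and real parts have more regions than these two.

```c
#include <assert.h>
#include <stdint.h>

/* Regions taken from the two ranges quoted above; names are hypothetical. */
typedef enum { REGION_FLASH, REGION_SRAM, REGION_OTHER } mem_region;

mem_region classify_address(uint32_t addr)
{
    if (addr >= 0x08000000u && addr <= 0x080FFFFFu)
        return REGION_FLASH;   /* 1MB flash/ROM */
    if (addr >= 0x20000000u && addr <= 0x2001FFFFu)
        return REGION_SRAM;    /* 128KB RAM */
    return REGION_OTHER;       /* CCM, backup SRAM, peripherals, ... */
}

/* Small usage example. */
void classify_examples(void)
{
    assert(classify_address(0x08001234u) == REGION_FLASH);
    assert(classify_address(0x20000000u) == REGION_SRAM);
}
```

This is roughly the decision the address decoding makes in hardware: the pointer value alone tells you which memory is being accessed.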
Now onto your question about strings and their locations - constant strings, such as:
const char *str = "This is a string in ROM";
are placed in flash memory. During compilation, the compiler places a temporary symbol that references such string. Later during linking phase, the linker (which knows about concrete values for each memory section) lays down all of your data (program, constant data etc.) in each section one after another and - once it knows concrete values of each such object - replaces those symbols placed by the compiler with concrete values which then appear in your binary. Because of this, later on during runtime when the assignment above is done, your str variable is simply assigned a constant value deduced by the linker (such as 0x08001234) which points directly to the first byte of the string.
When it comes to dynamically allocated values - whenever you call malloc or new, a similar task is done. Assuming sufficient memory is available, you are given the address of the requested chunk of memory in RAM, and those calculations are done at runtime.
As for the question regarding the const qualifier - there is no meaning to it once the code is executed. For example, at runtime the strlen function will simply go over memory byte by byte, starting at the passed location and ending once a binary 0 is encountered. It doesn't matter what "type" of bytes are being analyzed, because this information is lost once your code is converted to machine code. Regarding const in your context - a const qualifier appearing on a function parameter denotes that the function will not modify the contents of the string. If it attempted to, a compilation error would be raised, unless it explicitly casts to a non-const type. You may, of course, pass a non-const variable as a const parameter of a function. The other way, however - that is, passing a const pointer to a function that takes a non-const parameter - will be diagnosed by the compiler (typically a warning in C, an error in C++), as this function may potentially modify the contents of the memory you point to, which you specified to be non-modifiable by making it const.
So to summarize and answer your question: you can do casts as much as you want and this will not be reflected at runtime. A cast is simply an instruction to the compiler to treat the given variable differently from the original during its type checks. When you cast away const, you should however be aware that the cast may be unsafe.
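A small sketch of the points above: a writable buffer can be passed where a const char * is expected, and casting const away leaves the address untouched. The function names are made up for this illustration, and count_chars is only a stand-in for strlen.

```c
#include <assert.h>
#include <stddef.h>

/* const here is a compile-time promise that this function will not
   write through s; at run time it is an ordinary byte scan. */
size_t count_chars(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Returns nonzero if casting away const leaves the address unchanged. */
int cast_keeps_address(const char *s)
{
    char *writable_view = (char *)s;  /* explicit cast, no run-time work */
    return (const char *)writable_view == s;
}
```

Passing a char array to count_chars needs no cast at all. Writing through writable_view would still be undefined if s pointed to a string literal, which is why the cast is unsafe even though it costs nothing.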
Whether you use const, assuming your string is truly read-only, changes whether it lands in .data or in .rodata or some other read-only section (.text, etc.). Basically, it determines whether the string ends up in flash or in RAM.
The flash on these parts, if I remember right, has at best an extra wait state, or is basically half the speed of RAM (fairly common for MCUs in general, though there are exceptions). If you are running in the slower range of clocks and you boost the clock, the RAM performance relative to flash will improve. So having the string in flash (const) rather than SRAM is going to be slower for any code that parses through it.
This assumes your linker script and bootstrap are such that .data is actually copied to ram on boot...

What is the trick behind strcpy()/uninitialized char pointer this code?

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void main ()
{
char *imsi;
unsigned int i;
int val;
char *dest;
imsi = "405750111";
strncpy(dest,imsi,5);
printf("%s",dest);
/* i = 10; */
}
In the above code, with the i = 10 assignment commented out as shown, the code runs without error. When the assignment is compiled in, a segmentation fault occurs at strncpy(dest,imsi,5);.
If I prevent optimization of the variable i (i.e., volatile int i;), the error goes away even with the assignment (i = 10) included.
In your code, by saying
strncpy(dest,imsi,5);
you're trying to write through an uninitialized pointer, dest. It can (and most probably will) point to memory that is not accessible from your program (invalid memory). Doing so invokes undefined behavior.
There is nothing that can be guaranteed about a program having UB. It can work as expected (depends on what you're expecting, actually) or it may crash or open your bank account and transfer all money to some potential terrorist organization.
N.B - I hope by reading last line you got scared, so the bottom line is
Don't try to write into any uninitialized pointer (memory area). Period.
The behaviour of this code is unpredictable because the pointer dest is used before it is initialised. The difference in observed behaviour is only indirectly related to the root-cause bug, which is the uninitialised variable. In C it is the programmer's responsibility to allocate storage for the output of the strncpy() function, and you haven't done that.
The simplest fix is to define an output buffer like this:
char dest[10];
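One more detail matters once the buffer exists: strncpy does not write a terminating NUL when the source is longer than the count, so printing dest with %s would still read past the copied bytes. A minimal sketch of a safer copy follows; the helper name copy_prefix is invented for this example.

```c
#include <assert.h>
#include <string.h>

/* Copy at most n characters of src into dest and always NUL-terminate.
   dest_size is the total size of the destination buffer in bytes.
   Hypothetical helper, shown only to illustrate the fix. */
void copy_prefix(char *dest, size_t dest_size, const char *src, size_t n)
{
    assert(dest != NULL && src != NULL && dest_size > 0);
    if (n > dest_size - 1)
        n = dest_size - 1;      /* leave room for the terminator */
    strncpy(dest, src, n);
    dest[n] = '\0';             /* strncpy alone may not terminate */
}
```

Called as copy_prefix(dest, sizeof dest, imsi, 5) with char dest[10], this leaves "40575" in dest.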
Assuming you compiled this C source code into machine code for some "normal" architecture and then ran it, the possible effects of read-undefined UB basically boil down to what value floating around in registers or memory ends up getting used.
If the compiler happens to use the same value both times, and that value happened to point to a writeable memory address (and didn't overwrite anything that would break printf), it could certainly happen to work. UB doesn't guarantee a crash. It doesn't guarantee anything. Part of the point of UB is to let the compiler make assumptions and optimize based on them.
Any change to the surrounding code will affect code-gen for that function, and thus can affect what's in the relevant register when the call happens, or which stack slot is used for dest. Reading from a different stack address will give dest a different value.
Before main, calls to dynamic-linker functions might have dirtied some memory, leaving some pointers floating around in there, maybe including apparently some to writeable memory.
Or main's caller might have a pointer to some writeable memory in a register, e.g. a stack address.
Although that's less likely; if a compiler was going to just not even set a register before making a call, strncpy would probably get main's first arg, an integer argc, unless the compiler used that register as a temporary first. But string literals normally go in read-only memory so that's an unlikely explanation in this case. (Even on an ISA / calling convention like ARM where gcc's favourite register for temporaries is R0, the return-value register but also the first arg-passing register. If optimization is disabled so statements compile separately, most expressions will use R0.)

Void * parameter address shift

I am using Codewarrior 8.3 (IDE version 5.9) to program a 56f8367 DSC.
I am using respected third party software, so I imagine that they know what they are doing and don't want to mess with their code too much, but they are playing around with passing void * parameters and it is something I am not totally familiar with.
So I have this function:
static void T_CALLBACK _transmit(
void *pContext,
TX_DATA *pTxDescriptor)
{
CONTEXT *pLinkContext = (CONTEXT *)pContext;
...
}
which is being called through a function pointer. When I stop the processor just before this function call, I can see that the address pointed to by pContext is 0x1000, but after the cast here, the address pointed to by pLinkContext is 0x0800. This obviously causes problems, because we start writing to and reading from a different part of memory.
There is something weird going on with the byte addressing/alignment where it is getting "shifted" over 1 bit. I see what is going wrong, I just don't understand why or, more importantly, how to solve the problem.
What should I be looking for?
(Editing to add the call per comment request) - although, I'm not sure how much it will help considering everything is buried in structures and is being called through a function pointer. I can say that "pTprtContext->tmw.pChannel->pLinkContext" is of a different type than CONTEXT, pLinkContext does match up with the beginning of CONTEXT, so I think they are just trying to insert it in there.
static void T_LOCAL _transmitNextFrame(
D_CONTEXT *pTprtContext)
{
...
/* Transmit frame */
pTprtContext->t.pChannel->pLink->pLinkTransmit(
pTprtContext->t.pChannel->pLinkContext, &pTprtContext->linkTxDescriptor);
}
You say "shifted over by 1 byte," but it is actually only one bit, that is, the number is divided by 2.
This is usually the result of using a byte address in one context and a (2-byte) word address in another context. They probably refer to the same address.
Does this help you decipher it?
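The byte-address versus word-address relationship described in this answer can be sketched as follows, assuming 16-bit (2-byte) words. The function names are invented for illustration and are not CodeWarrior APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a byte address to the address of the 16-bit word holding it.
   Assumes 2-byte words and a word-aligned byte address. */
uint32_t byte_to_word_address(uint32_t byte_addr)
{
    assert((byte_addr & 1u) == 0u);
    return byte_addr >> 1;
}

/* Convert a word address back to the byte address of its first byte. */
uint32_t word_to_byte_address(uint32_t word_addr)
{
    return word_addr << 1;
}
```

Under that assumption, byte address 0x1000 and word address 0x0800 name the same location, which matches the two values seen in the debugger.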
I use the CodeWarrior compiler for a 16-bit microcontroller of the HC12 family. With this compiler, I can choose among a few memory models, which change (among several other things) how many bytes a pointer occupies. More specifically, the small memory model uses __near 16-bit pointers, whereas the large model uses __far 24-bit pointers.
If your code is compiled with a different memory model than your third-party software and the compiler does not warn you, I guess you may get weird results.

How to store a variable at a specific memory location?

As I am relatively new to C, I need the following for one of my projects:
I must declare some global variables which have to be stored at the same memory address every time the program runs.
I did some reading and found that if I declare a variable "static", it will be stored at the same memory location.
But my question is: can I tell the program where to store that variable or not?
For example: int a to be stored at 0xff520000. Can this be done or not? I have searched here but did not find any relevant example. If there is an old post about this, please be so kind as to share the link.
Thank you all in advance.
Laurentiu
Update: I am using a 32-bit uC
In your IDE there will be a memory map available through some linker file. It will contain all addresses in the program. Read the MCU manual to see at which addresses there is valid memory for your purpose, then reserve some of that memory for your variable. You have to read the documentation of your specific development platform.
Next, please note that it doesn't make much sense to map variables at specific addresses unless they are either hardware registers or non-volatile variables residing in flash or EEPROM.
If the contents of such a memory location will change during execution, because it is a register, or because your program contains a bootloader/NVM programming algorithm changing NVM memory cells, then the variables must be declared as volatile. Otherwise the compiler will break your code completely upon optimization.
The particular compiler most likely has a non-standard way to allocate variables at specific addresses, such as a #pragma or sometimes the weird, non-standard @ operator. The only sensible way to access a variable at a fixed location in standard C is this:
#define MY_REGISTER (*(volatile uint8_t*)0x12345678u)
where 0x12345678 is the address at which the 1-byte register is located (uint8_t comes from <stdint.h>). Once you have a macro declaration like this, you can use it as if it were a variable:
void func (void)
{
MY_REGISTER = 1; // write
int var = MY_REGISTER; // read
}
Most often you want these kinds of variables to reside in the global namespace, hence the macro. But if for some reason you want the scope of the variable to be reduced, then skip the macro and access the address manually inside the code:
void func (void)
{
*(volatile uint8_t*)0x12345678u = 1; // write
int var = *(volatile uint8_t*)0x12345678u; // read
}
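The fixed addresses above only make sense on the target hardware. To try the same access pattern on a desktop machine, one option is to point the macro at an ordinary variable instead. This is a sketch under that assumption: simulated_register_storage and the two accessor functions are invented stand-ins, not part of any vendor header.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a hardware register so the example runs on a host.
   On a real target the macro would use a fixed address instead. */
static uint8_t simulated_register_storage;
#define MY_REGISTER (*(volatile uint8_t *)&simulated_register_storage)

/* Every access goes through a volatile lvalue, so the compiler may
   not cache or drop the reads and writes. */
void register_write(uint8_t value) { MY_REGISTER = value; }
uint8_t register_read(void)        { return MY_REGISTER; }

/* Small usage example. */
void register_demo(void)
{
    register_write(1);               /* write */
    assert(register_read() == 1);    /* read  */
}
```

The macro is used exactly like the one above; only the address expression differs.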
You can do this kind of thing with linker scripts, which is quite common in embedded programming.
On a Linux system you might never get the same virtual address due to address space randomization (a security feature to avoid exploits that would rely on knowing the exact location of a variable like you describe).
If it's just a repeatable pointer you want, you may be able to map a specific address with mmap, but that's not guaranteed.
Like was mentioned in other answers - you can't.
But, you can have a workaround. If it's ok for the globals to be initialized in the main(), you can do something of this kind:
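#include <stdint.h>
uintptr_t addr = 0xff520000u; /* example address; must be valid on your target */
int main(void)
{
*((volatile int *)addr) = 42; /* volatile so the store is not optimized away */
...
return 0;
}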
Note, however, that this is very dependent on your system and if running in protected environment, you'll most likely get a runtime crash. If you're in embedded/non-protected environment, this can work.
No, you cannot tell it explicitly where to store a variable in memory, mostly because on modern systems many things regarding memory are done by the system and are out of your control. Address space layout randomization (ASLR) is one thing that comes to mind that would make this very hard.
That depends on your compiler. If you use the XC8 compiler, you can simply write:
int x @ 0x12;
This line places x at memory location 0x12.
Not at the C level. If you work with assembly language, you can directly control the memory layout. But the C compiler does this for you. You can't really mess with it.
Even with assembly, this only controls the relative layout. Virtual memory may place this at any (in)convenient physical location.
You can do this with some compiler extensions, but it's probably not what you want to do. The operating system handles your memory and will put things where it wants. How do you even know that the memory address you want will be mapped in your program? Ignore everything in this paragraph if you're on an embedded platform, then you should read the manual for that platform/compiler or at least mention it here so that people can give a more specific answer.
Also, static variables don't necessarily have the same address when the program runs. Many operating systems use position independent executables and randomize the address space on every execution.
You can declare a pointer to a specific memory address, and use the contents of that pointer as a variable I suppose:
int *myIntPointer = (int *)0xff520000;

C - calling a function via func_ptr, why doesn't it work?

I have the following code:
void print(const char* str){
system_call(4,1,str,strlen(str)); }
void foo2(void){ print("goo \n");}
void buz(void){ ...}
int main(){
char buf[256];
void (*func_ptr)(void)=(void(*)(void))buf;
memcpy(buf,foo2, ((void*)buz)-((void*)foo2));
func_ptr();
return 0;
}
The question is: why will this code fail?
The answer was something about calls made directly (not via a pointer) using a relative address, but I haven't been able to figure out what's wrong here. Which line is the problematic one?
Thank you for your help.
Well to begin with, there is nothing which says that foo2() and buz() must be next to each other in memory. And for another, as you guess, the code must be relative for stunts like that to work. But most of all, it is not allowed by the standard.
As Chris Luts referred to, stack (auto) variables are not executable on many operating systems, to protect from attacks.
The first two lines in your main() function are problematic.
Line 1. (void(*)(void))buf
converting buf to a function pointer is undefined
Line 2. ((void*)buz)-((void*)foo2)
subtraction of pointers is undefined unless the pointers point within the same array.
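For contrast with the second point, here is a sketch of pointer subtraction that is well defined because both pointers refer to elements of the same array. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements between two pointers.
   Defined only when both point into (or one past) the same array. */
ptrdiff_t element_distance(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return last - first;
}

/* The same distance in bytes, computed through character pointers. */
size_t byte_distance(const int *first, const int *last)
{
    return (size_t)((const char *)last - (const char *)first);
}
```

Nothing comparable exists for foo2 and buz: they are separate functions, not elements of one array, so the subtraction in the question has no defined result.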
Also, Section 5.8 Functions of H&S says "Although a pointer to a function is often assumed to be the address of the function's code in memory, on some computers a function pointer actually points to a block of information needed to invoke the function."
First and foremost, C function pointers mechanism is for equal-signature function calling abstraction. This is powerful and error-prone enough without these stunts.
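As a sketch of what that abstraction looks like in its intended use: several functions share one signature and are called through a single pointer type. All names here are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* One pointer type for every function with this signature. */
typedef int (*binary_op)(int, int);

int add(int a, int b)      { return a + b; }
int multiply(int a, int b) { return a * b; }

/* The caller chooses the operation; apply only knows the signature. */
int apply(binary_op op, int a, int b)
{
    assert(op != NULL);
    return op(a, b);
}
```

Here the pointer always holds the address of a real function produced by the compiler, so none of the problems with copied code arise.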
I can't see an advantage or any sense in trying to copy code from one place to another. As some have commented, it's not easy to tell how much relative or relocatable code there is within a C function.
You tried copying the code of a function into a data memory region. Some microcontrollers would just tell you "Buzz off!". On machine architectures that have separate data and program memories, given a very understanding compiler (or one that recognizes data/code modifiers or attributes), it would compile to the specific code-to-data move instructions. It seems it would work... However, even on architectures with separate data and code memories, executing instructions from data memory is not possible.
On the other hand, on "normal" PCs with shared data/code memory, it would likely also not work, because data and code segments are declared (by the loader) in the MMU of the processor. Depending on the processor and OS, an attempt to run code in a data segment results in a segmentation fault.
