I am trying to work with flash memory on MPC5748G - a microcontroller from NXP running FreeRTOS 10.0.1, and I get some behaviour that I can't understand.
I am allocating memory manually, and the assignment seems not to work. However, I can reach the value at the address when using 'printf' - but only from the same function.
(I'm using a copy of the pointer to make sure that some sort of compiler optimisation doesn't take place.)
void vFlashTask(void* pvParameters){
vTaskDelay(1000);
FLASH_DRV_Init();
uint32_t* val_ptr;
uint32_t* val_ptr_cpy;
val_ptr = (uint32_t *)0xFB8000;
val_ptr_cpy = val_ptr;
*val_ptr = 444;
DBGPRINTF("Task| value at xFB8000:%d", *val_ptr_cpy);
getValTest();
vTaskDelay(1500);
vTaskDelete(NULL);
}
void getValTest(){
uint32_t* val_ptr;
val_ptr =(uint32_t *)0xfb8000;
DBGPRINTF("getValTest| value at xFB8000:%d", *val_ptr);
}
Gives back (in UART Terminal):
[../include/flash.c:26]: Task| value at xFB8000:444
[../include/flash.c:37]: getValTest| value at xFB8000:-1
I am also attaching a screenshot from the debugger, which clearly shows that the memory at 0xFB8000 is uninitialized (it has the value 0xffffffff), yet the printf function still manages to print the expected value(?).
My DBGPRINTF macro:
#define DBGPRINTF(f, ...) dbgPrintf("[%s:%d]: " f "\n", __FILE__, __LINE__, __VA_ARGS__)
void dbgPrintf(const char *format, ...){
va_list args;
va_start(args, format);
int len = vsnprintf((char*) uart_buffer, UART_BUFFER_SIZE - 1, format, args);
UART_SendDataBlocking(&uart_pal1_instance, (const char *)uart_buffer, len, UART_TIMEOUT);
va_end(args);
}
I would really appreciate any help or suggestions.
My compiler flags:
\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/rtos/FreeRTOS_PA/Source/portable/GCC/PowerPC" -I"C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/middleware/tcpip/tcpip_stack/ports/OS" -I"C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/middleware/tcpip/tcpip_stack/ports/platform/generic/gcc/setting" -I"C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/middleware/tcpip/wolfssl/wolfssl" -I"C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/middleware/tcpip/wolfssl" -I"C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/rtos/FreeRTOS_PA/Source" -I"C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/platform/pal/inc" -I"C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/software/S32_SDK_S32PA_RTM_3.0.3/platform/drivers/src/flash_c55" -O1 -g3 -Wall -c -fmessage-length=0 -msdata=eabi -mlra -funsigned-bitfields -ffunction-sections -fdata-sections -fno-common -Wno-address -mcpu=e200z4 -specs=nosys.specs -mbig -mvle -mregnames -mhard-float --sysroot="C:\NXP\S32DS_Power_v2.1\eclipse\../S32DS/build_tools/powerpc-eabivle-4_9/powerpc-eabivle/newlib"
The problem was writing to flash memory: the flash hadn't been correctly initialized.
The proper way to write to flash on the MPC5748G using SDK 3.0.3 is as follows:
save flash controller cache
initialise flash
check and protect UT block
unblock an address space
erase a block in this space
check if the space block is blank
program the block
verify if the block is programmed correctly
check sum of the programmed data
restore flash controller cache
The strange behaviour of printf and the pointer was due to compiler optimization. After changing the compiler flags to -O0 (no optimization), the error was consistent.
The same consistent error can be achieved by marking the pointers as 'volatile'.
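A minimal host-side sketch of the volatile point, under the assumption that an ordinary static variable (fake_cell, a hypothetical stand-in) replaces the address 0xFB8000. On RAM both variants return the stored value; the difference is only in what the compiler is allowed to do. Without volatile it may hand the stored value straight to the read, which is how the task could print 444 although the flash cell never changed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the memory-mapped location. On the real
 * target this would be the flash address, which a plain store cannot
 * change. */
static uint32_t fake_cell = 0xFFFFFFFFu;

/* Without volatile the optimizer may forward the stored value to the
 * read and never load from memory again. */
uint32_t write_then_read_plain(uint32_t value)
{
    uint32_t *p = &fake_cell;
    *p = value;
    return *p;              /* may be folded into "return value" */
}

/* With volatile every access has to be performed, so the read reports
 * what the memory location really holds. */
uint32_t write_then_read_volatile(uint32_t value)
{
    volatile uint32_t *p = &fake_cell;
    *p = value;
    return *p;              /* a real load from memory */
}

/* On plain RAM both variants agree. */
void check_on_ram(void)
{
    assert(write_then_read_plain(444) == 444);
    assert(write_then_read_volatile(444) == 444);
}
```

On the target, only the volatile variant is guaranteed to show the 0xffffffff that the debugger displays.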
Related
I'm writing my own test-runner for my current project. One feature (that's probably quite common with test-runners) is that every testcase is executed in a child process, so the test-runner can properly detect and report a crashing testcase.
I want to also test the test-runner itself, therefore one testcase has to force a crash. I know "crashing" is not covered by the C standard and just might happen as a result of undefined behavior. So this question is more about the behavior of real-world implementations.
My first attempt was to just dereference a null-pointer:
int c = *((int *)0);
This worked in a debug build on GNU/Linux and Windows, but failed to crash in a release build because the unused variable c was optimized out, so I added
printf("%d", c); // to prevent optimizing away the crash
and thought I was settled. However, trying my code with clang instead of gcc revealed a surprise during compilation:
[CC] obj/x86_64-pc-linux-gnu/release/src/test/test/test_s.o
src/test/test/test.c:34:13: warning: indirection of non-volatile null pointer
will be deleted, not trap [-Wnull-dereference]
int c = *((int *)0);
^~~~~~~~~~~
src/test/test/test.c:34:13: note: consider using __builtin_trap() or qualifying
pointer with 'volatile'
1 warning generated.
And indeed, the clang-compiled testcase didn't crash.
So, I followed the advice of the warning and now my testcase looks like this:
PT_TESTMETHOD(test_expected_crash)
{
PT_Test_expectCrash();
// crash intentionally
int *volatile nptr = 0;
int c = *nptr;
printf("%d", c); // to prevent optimizing away the crash
}
This solved my immediate problem, the testcase "works" (aka crashes) with both gcc and clang.
I guess because dereferencing the null pointer is undefined behavior, clang is free to compile my first code into something that doesn't crash. The volatile qualifier removes the ability to be sure at compile time that this really will dereference null.
Now my questions are:
Does this final code guarantee the null dereference actually happens at runtime?
Is dereferencing null indeed a fairly portable way for crashing on most platforms?
I wouldn't rely on that method as being robust if I were you.
Can't you use abort(), which is part of the C standard and is guaranteed to cause an abnormal program termination event?
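A rough sketch of how a runner could observe this on a POSIX system, assuming fork() and waitpid() are available (so it does not cover the Windows build). The names run_in_child, crashing_testcase and passing_testcase are made up for illustration and are not part of the asker's test-runner.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs testcase() in a child process. Returns the number of the signal
 * that killed the child, 0 if it exited normally, -1 on fork/wait errors. */
int run_in_child(void (*testcase)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        testcase();
        _exit(0);           /* only reached if the testcase did not crash */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Hypothetical testcases for the self-check below. */
void crashing_testcase(void) { abort(); }
void passing_testcase(void) { }

void check_runner(void)
{
    assert(run_in_child(crashing_testcase) == SIGABRT);
    assert(run_in_child(passing_testcase) == 0);
}
```

The parent sees abort() as termination by SIGABRT, so a runner can tell a forced crash apart from a normal exit without parsing any output.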
The answer referring to abort() was great; I really didn't think of that, and it's indeed a perfectly portable way of forcing an abnormal program termination.
Trying it with my code, I found that msvcrt (Microsoft's C runtime) implements abort() in a particularly chatty way. It outputs the following to stderr:
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
That's not so nice, at least it unnecessarily clutters the output of a complete test run. So I had a look at __builtin_trap() that's also referenced in clang's warning. It turns out this gives me exactly what I was looking for:
LLVM code generator translates __builtin_trap() to a trap instruction if it is supported by the target ISA. Otherwise, the builtin is translated into a call to abort.
It's also available in gcc starting with version 4.2.4:
This function causes the program to exit abnormally. GCC implements this function by using a target-dependent mechanism (such as intentionally executing an illegal instruction) or by calling abort.
As this does something similar to a real crash, I prefer it over a simple abort(). For the fallback, you can still try your own illegal operation such as the null pointer dereference, but add a call to abort() in case the program somehow gets past it without crashing.
So, all in all, the solution looks like this, testing for a minimum GCC version and using the much more handy __has_builtin() macro provided by clang:
#undef HAVE_BUILTIN_TRAP
/* Check __has_builtin first: clang also defines __GNUC__ (it reports
 * itself as GCC 4.2.1), so the version test alone would miss it. */
#ifdef __has_builtin
#  if __has_builtin(__builtin_trap)
#    define HAVE_BUILTIN_TRAP
#  endif
#endif
#if !defined(HAVE_BUILTIN_TRAP) && defined(__GNUC__)
#  define GCC_VERSION (__GNUC__ * 10000 \
    + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#  if GCC_VERSION > 40203
#    define HAVE_BUILTIN_TRAP
#  endif
#endif
#ifdef HAVE_BUILTIN_TRAP
#  define crashMe() __builtin_trap()
#else
#  include <stdio.h>
#  include <stdlib.h> /* for abort() */
#  define crashMe() do { \
    int *volatile iptr = 0; \
    int i = *iptr; \
    printf("%d", i); \
    abort(); } while (0)
#endif
// [...]
PT_TESTMETHOD(test_expected_crash)
{
PT_Test_expectCrash();
// crash intentionally
crashMe();
}
You can write to memory instead of reading it.
*((int *)0) = 0;
No, dereferencing a NULL pointer is not a portable way of crashing a program. It is undefined behavior, which means just that, you have no guarantees what will happen.
As it happens, on any of the three main OSes used on desktop computers today (MacOS, Linux and Windows NT (*)), dereferencing a NULL pointer will, for the most part, immediately crash your program.
That said: "The worst possible result of undefined behavior is for it to do what you were expecting."
I purposely put a star beside Windows NT, because under Windows 95/98/ME, I can craft a program that has the following source:
int main()
{
int *pointer = NULL;
int i = *pointer;
return 0;
}
that will run without crashing. Compile it as a TINY-model .COM file under 16-bit DOS, and you'll be just fine.
Ditto running the same source with just about any C compiler under CP/M.
Ditto running that on some embedded systems. I've not tested it on an Arduino, but I would not want to bet either way on the outcome. I do know for certain that were a C compiler available for the 8051 systems I cut my teeth on, that program would run fine on those.
The program below should work. It might cause some collateral damage, though.
#include <string.h>
void crashme( char *str)
{
char *omg;
for(omg=strtok(str, "" ); omg ; omg=strtok(NULL, "") ) {
strcat(omg , "wtf");
}
*omg =0; // always NUL-terminate a NULL string !!!
}
int main(void)
{
char buff[20];
// crashme( "WTF" ); // works!
// crashme( NULL ); // works, too
crashme( buff ); // Maybe a bit too slow ...
return 0;
}
I'm experimenting with function pointers on Linux and trying to execute this C program:
#include <stdio.h>
#include <string.h>
int myfun()
{
return 42;
}
int main()
{
char data[500];
memcpy(data, myfun, sizeof(data));
int (*fun_pointer)() = (void*)data;
printf("%d\n", fun_pointer());
return 0;
}
Unfortunately it segfaults on the fun_pointer() call. I suspect that it is connected with some memory flags, but I haven't found any information about it.
Could you explain why this code segfaults? Ignore the fixed data array size; it is fine, and copying without calling the function succeeds.
UPD: I finally found that the memory segment has to be marked as executable using the mprotect system call with the PROT_EXEC flag. Moreover, the memory segment should be one returned by mmap, as stated in the POSIX specification.
Here is the same code using memory allocated by mmap with the PROT_EXEC flag (and it works):
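#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
int myfun()
{
    return 42;
}
int main()
{
    /* Guess the code size from the distance to the next function;
       this relies on the compiler keeping the source order. */
    size_t size = (char*)main - (char*)myfun;
    /* Anonymous mappings should pass -1 as the file descriptor. */
    char *data = mmap(NULL, size, PROT_EXEC | PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (data == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memcpy(data, myfun, size);
    int (*fun_pointer)() = (void*)data;
    printf("%d\n", fun_pointer());
    munmap(data, size);
    return 0;
}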
This example should be compiled with the -fPIC gcc option to ensure that the code in the functions is position-independent.
Several problems there:
Your data array lives in non-executable data memory (here, the stack), not in the code segment.
The address relocation is not handled.
The code size is not known, just guessed.
In addition to Diask's answer you probably want to use some JIT compilation techniques (to generate executable code in memory), and you should be sure that the memory zone containing the code is executable (see mprotect(2) and the NX bit; often the call stack is not executable for security reasons). You could use GNU lightning (quickly emitting slow machine code), asmjit, libjit, LLVM, GCCJIT (able to slowly emit fast optimized machine code). You could also emit some C code in some temporary file /tmp/emittedcode.c, fork a compilation command gcc -Wall -O -fPIC -shared /tmp/emittedcode.c -o /tmp/emittedcode.so then dlopen(3) that shared object /tmp/emittedcode.so and use dlsym(3) to find function pointers by their name there.
Read about trampoline code, closures, and continuations & CPS.
Of course, copying code from one zone to another usually doesn't work (it has to be position-independent code to make that work, or you need your own relocation machinery, a bit like a linker does).
It's because this line is wrong:
memcpy(data, myfun, sizeof(data));
You are copying the code (compiled) of the function instead of the address of the function.
myfun and &myfun will have the same address, so to do your memcpy operation, you will have to use a function pointer and then copy from its address.
Example:
int (*p)();
p = myfun;
memcpy(data, &p, sizeof(p)); /* copy only the pointer; sizeof(data) would read past p */
I have downloaded and compiled Apple's source and added it to Xcode.app/Contents/Developer/usr/bin/include/c++/v1. Now how do I go about implementing it in C? The code I am working with is from this post about Hackaday's shellcode executer. My code currently looks like this:
#include <stdio.h>
#include <stdlib.h>
unsigned char shellcode[] = "\x31\xFA......";
int main()
{
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)shellcode;
printf("2\n");
}
I have compiled with both:
gcc -fno-stack-protector shell.c
clang -fno-stack-protector shell.c
I guess my final question is, how do I tell the compiler to implement "__enable_execute_stack"?
The stack protector is different from an executable stack: it introduces canaries to detect when the stack has been corrupted.
To get an executable stack, you have to link saying to use an executable stack. It goes without saying that this is a bad idea as it makes attacks easier.
The option for the linker is -allow_stack_execute, which turns into the gcc/clang command line:
clang -Wl,-allow_stack_execute -fno-stack-protector shell.c
Your code, however, does not try to execute code on the stack; it attempts to change a small amount of the stack content to force a return into the shellcode, which is one of the most common ROP attacks.
On a typically compiled OSX 32bit environment this would be attempting to overwrite what is called the linkage area (this is the address of the next instruction that will be called upon function return). This assumes that the code was not compiled with -fomit-frame-pointer. If it's compiled with this option, then you're actually moving one extra address up.
On OSX 64bit it uses the 64bit ABI, the registers are 64bit, and all the values would need to be referenced by long rather than by int, however the manner is similar.
The shellcode you've got there, though, is actually in the data segment of your program (because it's a char [], it is readable/writable, not readable/executable). You would need to either mmap it (like nneonneo's answer) or copy it into the now-executable stack, get its address and call it that way.
However, if you're just trying to get code to run, then nneonneo's answer makes it pretty easy, but if you're trying to experiment with exploit-y code, then you're going to have to do a little more work. Because of the non-executable stack, the new kids use return-to-library mechanisms, trying to get the return to call, say, one of the exec/system calls with data from the stack.
With modern execution protections in place, it's a bit tricky to get shellcode to run like this. Note that your code is not attempting to execute code on the stack; rather, it is storing the address of the shellcode on the stack, and the actual code is in the program's data segment.
You've got a couple options to make it work:
Put the shellcode in an actual executable section, so it is executable code. You can do this with __attribute__((section("name"))) with GCC and Clang. On OS X:
const char code[] __attribute__((section("__TEXT,__text"))) = "...";
followed by a
((void (*)(void))code)();
works great. On Linux, use the section name ".text" instead.
Use mmap to create a read-write section of memory, copy your shellcode, then mprotect it so it has read-execute permissions, then execute it. This is how modern JITs execute dynamically-generated code. An example:
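#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
void execute_code(const void *code, size_t codesize) {
    /* Round the size up to a whole number of pages; sysconf() is used
       here because PAGE_SIZE is not declared by <sys/mman.h>. */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t pagesize = (codesize + page - 1) & ~(page - 1);
    void *chunk = mmap(NULL, pagesize, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
    if(chunk == MAP_FAILED) return;
    memcpy(chunk, code, codesize);
    /* Drop write permission and add execute permission before the call. */
    if(mprotect(chunk, pagesize, PROT_READ|PROT_EXEC) == 0)
        ((void (*)(void))chunk)();
    munmap(chunk, pagesize);
}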
Neither of these methods requires any special compiler flags to work properly, and neither requires fiddling with the saved EIP on the stack.
I am developing software for an AVR microcontroller. To say it up front: for now I only have LEDs and pushbuttons for debugging. The problem is that if I pass a string literal into the following function:
void test_char(const char *str) {
if (str[0] == -1)
LED_PORT ^= 1 << 7; /* Test */
}
Somewhere in main()
test_char("AAAAA");
And now the LED changes state. On my x86_64 machine I wrote the same function to compare (not LED, of course), but it turns out that str[0] equals to 'A'. Why is this happening?
Update:
Not sure whether this is related, but I have a struct called button, like this:
typedef struct {
int8_t seq[BTN_SEQ_COUNT]; /* The sequence of button */
int8_t seq_count; /* The number of buttons registered */
int8_t detected; /* The detected button */
uint8_t released; /* Whether the button is released
after a hold */
} button;
button btn = {
.seq = {-1, -1, -1},
.detected = -1,
.seq_count = 0,
.released = 0
};
But it turned out that btn.seq_count starts out as -1 even though I defined it as 0.
Update2
For the latter problem, I solved it by initializing the values in a function. However, that does not explain why seq_count was set to -1 in the previous case, nor does it explain why the character in the string literal equals -1.
Update3
Back to the original problem, I added a complete mini example here, and same occurs:
void LED_on() {
PORTA = 0x00;
}
void LED_off() {
PORTA = 0xFF;
}
void port_init() {
PORTA = 0xFF;
DDRA |= 0xFF;
}
void test_char(const char* str) {
if (str[0] == -1) {
LED_on();
}
}
void main() {
port_init();
test_char("AAAAA");
while(1) {
}
}
Update 4
I am trying to follow Nominal Animal's advice, but not quite successful. Here is the code I have changed:
void test_char(const char* str) {
switch(pgm_read_byte(str++)) {
case '\0': return;
case 'A': LED_on(); break;
case 'B': LED_off(); break;
}
}
void main() {
const char* test = "ABABA";
port_init();
test_char(test);
while(1) {
}
}
I am using gcc 4.6.4,
avr-gcc -v
Using built-in specs.
COLLECT_GCC=avr-gcc
COLLECT_LTO_WRAPPER=/home/carl/Softwares/AVR/libexec/gcc/avr/4.6.4/lto-wrapper
Target: avr
Configured with: ../configure --prefix=/home/carl/Softwares/AVR --target=avr --enable-languages=c,c++ --disable-nls --disable-libssp --with-dwarf2
Thread model: single
gcc version 4.6.4 (GCC)
Rewritten from scratch, to hopefully clear up some of the confusion.
First, some important background:
AVR microcontrollers have separate address spaces for RAM and ROM/Flash ("program memory").
GCC generates code that assumes all data is always in RAM. (Older versions used to have special types, such as prog_char, that referred to data in the ROM address space, but newer versions of GCC do not and cannot support such data types.)
When linking against avr-libc, the linker adds code (__do_copy_data) to copy all initialized data from program memory to RAM. If you have both avr-gcc and avr-libc packages installed, and you use something like avr-gcc -Wall -O2 -fomit-frame-pointer -mmcu=AVRTYPE source.c -o binary.elf to compile your source file into a program binary, then use avr-objcopy to convert the elf file into the format your firmware utilities support, you are linking against avr-libc.
If you use avr-gcc to only produce an object file source.o, and some other utilities to link and upload your program to your microcontroller, this copying from program memory to RAM may not happen. It depends on what linker and libraries you use.
As most AVRs have only a few dozen to a few hundred bytes of RAM available, it is very, very easy to run out of RAM. I'm not certain if avr-gcc and avr-libc reliably detect when you have more initialized data than you have RAM available. If you specify any arrays containing strings, it is very likely you've already overrun your RAM, causing all sorts of interesting bugs to appear.
The avr/pgmspace.h header file is part of avr-libc, and defines a macro, PROGMEM, that can be used to specify data that will only be referred to by functions that take program memory addresses (pointers), such as pgm_read_byte() or strcmp_P() defined in the same header file. The linker will not copy such variables to RAM -- but neither will the compiler tell you if you're using them wrong.
If you use both avr-gcc and avr-libc, I recommend using the following approach for all read-only data:
#include <avr/pgmspace.h>
/*
* Define LED_init(), LED_on(), and LED_off() functions.
*/
void blinky(const char *str)
{
while (1) {
switch (pgm_read_byte(str++)) {
case '\0': return;
case 'A': LED_on(); break;
case 'B': LED_off(); break;
}
/* Add a sleep or delay here,
* or you won't be able to see the LED flicker. */
}
}
static const char example1[] PROGMEM = "AB";
const char example2[] PROGMEM = "AAAA";
int main(void)
{
static const char example3[] PROGMEM = "ABABB";
LED_init();
while (1) {
blinky(example1);
blinky(example2);
blinky(example3);
}
}
Because of changes (new limitations) in GCC internals, the PROGMEM attribute can only be used with a variable; if it refers to a type, it does nothing. Therefore, you need to specify strings as character arrays, using one of the forms above. (example1 is visible within this compilation unit only, example2 can be referred to from other compilation units too, and example3 is visible only in the function it is defined in. Here, visible refers to where you can refer to the variable; it has nothing to do with the contents.)
The PROGMEM attribute does not actually change the code GCC generates. All it does is put the contents to .progmem.data section, iff without it they'd be in .rodata. All of the magic is really in the linking, and in linked library code.
If you do not use avr-libc, then you need to be very specific with your const attributes, as they determine which section the contents will end up in. Mutable (non-const) data should end up in the .data section, while immutable (const) data ends up in .rodata section(s). Remember to read the specifiers from right to left, starting at the variable itself, separated by '*': the leftmost refers to the content, whereas the rightmost refers to the variable. In other words,
const char *s = p;
defines s so that the value of the variable can be changed, but the content it points to is immutable (unchangeable/const); whereas
char *const s = p;
defines s so that you cannot modify the variable itself, but you can the content -- the content s points to is mutable, modifiable. Furthermore,
const char *s = "literal";
defines s to point to a literal string (and you can modify s, ie. make it point to some other literal string for example), but you cannot modify the contents; and
char s[] = "string";
defines s to be a character array (of length 7; string length 6 + 1 for the end-of-string char), that happens to be initialized to { 's', 't', 'r', 'i', 'n', 'g', '\0' }.
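The declarations above can be checked on any hosted compiler with a small sketch like the following. The function names are made up for illustration; the commented-out lines are the ones a compiler should reject.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof on a character array counts the terminating '\0' as well. */
size_t array_size(void)
{
    char s[] = "string";            /* 6 characters + '\0' */
    return sizeof s;
}

/* const char *s: the pointer may change, the content may not. */
char retarget_pointer(void)
{
    static const char first[]  = "first";
    static const char second[] = "second";
    const char *s = first;
    s = second;                     /* allowed: s itself is mutable */
    /* s[0] = 'x'; */               /* rejected: content is const */
    return s[0];
}

/* char *const t: the content may change, the pointer may not. */
char modify_through_const_pointer(void)
{
    char buffer[] = "abc";
    char *const t = buffer;
    t[0] = 'x';                     /* allowed: content is mutable */
    /* t = buffer + 1; */           /* rejected: t itself is const */
    assert(t == buffer);
    return buffer[0];
}
```

Here array_size() yields 7, retarget_pointer() yields 's', and modify_through_const_pointer() yields 'x'.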
All linker tools that work on object files use the sections to determine what to do with the contents. (Indeed, avr-libc copies the contents of .rodata sections to RAM, and only leaves .progmem.data in program memory.)
Carl Dong, there are several cases where you may observe weird behaviour, even reproducible weird behaviour. I'm no longer certain which one is the root cause of your problem, so I'll just list the ones I think are likely:
If linking against avr-libc, running out of RAM
AVRs have very little RAM, and copying even string literals to RAM easily eats it all up. If this happens, any kind of weird behaviour is possible.
Failing to link against avr-libc
If you think you use avr-libc, but are not certain, then use avr-objdump -d binary.elf | grep -e '^[0-9a-f]* <_' to see if the ELF binary contains any library code. You should expect to see at least <__do_clear_bss>:, <_exit>:, and <__stop_program>: in that list, I believe.
Linking against some other C library, but expecting avr-libc behaviour
Other libraries you link against may have different rules. In particular, if they're designed to work with some other C compiler -- especially one that supports multiple address spaces, and therefore can deduce when to use ld and when lpm based on types --, it might be impossible to use avr-gcc with that library, even if all the tools talk to each other nicely.
Using a custom linker script and a freestanding environment (no C library at all)
Personally, I can live with immutable data (.rodata sections) being in program memory, with myself having to explicitly copy any immutable data to RAM whenever needed. This way I can use a simple microcontroller-specific linker script and GCC in freestanding mode (no C library at all used), and get complete control over the microcontroller. On the other hand, you lose all the nice predefined macros and functions avr-libc and other C libraries provide.
In this case, you need to understand the AVR architecture to have any hope of getting sensible results. You'll need to set up the interrupt vectors and all kinds of other stuff to get even a minimal do-nothing loop to actually run; personally, I read all the assembly code GCC produces (from my own C source) simply to see if it makes sense, and to try to make sure it all gets processed correctly.
Questions?
I faced a similar problem (inline strings were equal to 0xff,0xff,...) and solved it by just changing a line in my Makefile
from:
.out.hex:
	$(OBJCOPY) -j .text \
	-j .data \
	-O $(HEXFORMAT) $< $@
to:
.out.hex:
	$(OBJCOPY) -j .text \
	-j .data \
	-j .rodata \
	-O $(HEXFORMAT) $< $@
or, which seems better:
.out.hex:
	$(OBJCOPY) -R .fuse \
	-R .lock \
	-R .eeprom \
	-O $(HEXFORMAT) $< $@
You can see full problem and answer here : https://www.avrfreaks.net/comment/2943846#comment-2943846
I've got some C code I'm targeting for an AVR. The code is being compiled with avr-gcc, basically the gnu compiler with the right backend.
What I'm trying to do is create a callback mechanism in one of my event/interrupt driven libraries, but I seem to be having some trouble keeping the value of the function pointer.
To start, I have a static library. It has a header file (twi_master_driver.h) that looks like this:
#ifndef TWI_MASTER_DRIVER_H_
#define TWI_MASTER_DRIVER_H_
#define TWI_INPUT_QUEUE_SIZE 256
// define callback function pointer signature
typedef void (*twi_slave_callback_t)(uint8_t*, uint16_t);
typedef struct {
uint8_t buffer[TWI_INPUT_QUEUE_SIZE];
volatile uint16_t length; // currently used bytes in the buffer
twi_slave_callback_t slave_callback;
} twi_global_slave_t;
typedef struct {
uint8_t slave_address;
volatile twi_global_slave_t slave;
} twi_global_t;
void twi_init(uint8_t slave_address, twi_global_t *twi, twi_slave_callback_t slave_callback);
#endif
Now the C file (twi_driver.c):
#include <stdint.h>
#include "twi_master_driver.h"
void twi_init(uint8_t slave_address, twi_global_t *twi, twi_slave_callback_t slave_callback)
{
twi->slave.length = 0;
twi->slave.slave_callback = slave_callback;
twi->slave_address = slave_address;
// temporary workaround <- why does this work??
twi->slave.slave_callback = twi->slave.slave_callback;
}
void twi_slave_interrupt_handler(twi_global_t *twi)
{
(twi->slave.slave_callback)(twi->slave.buffer, twi->slave.length);
// some other stuff (nothing touches twi->slave.slave_callback)
}
Then I build those two files into a static library (.a) and construct my main program (main.c)
#include
#include
#include
#include
#include "twi_master_driver.h"
// ...define microcontroller safe way for mystdout ...
twi_global_t bus_a;
ISR(TWIC_TWIS_vect, ISR_NOBLOCK)
{
twi_slave_interrupt_handler(&bus_a);
}
void my_callback(uint8_t *buf, uint16_t len)
{
uint8_t i;
fprintf(&mystdout, "C: ");
for(i = 0; i < len; i++)
{
fprintf(&mystdout, "%d,", buf[i]);
}
fprintf(&mystdout, "\n");
}
int main(int argc, char **argv)
{
twi_init(2, &bus_a, &my_callback);
// ...PMIC setup...
// enable interrupts.
sei();
// (code that causes interrupt to fire)
// spin while the rest of the application runs...
while(1){
_delay_ms(1000);
}
return 0;
}
I carefully trigger the events that cause the interrupt to fire and call the appropriate handler. Using some fprintfs I'm able to tell that the location assigned to twi->slave.slave_callback in the twi_init function is different than the one in the twi_slave_interrupt_handler function.
Though the numbers are meaningless, in twi_init the value is 0x13b, and in twi_slave_interrupt_handler when printed the value is 0x100.
By adding the commented workaround line in twi_driver.c:
twi->slave.slave_callback = twi->slave.slave_callback;
The problem goes away, but this is clearly a magic and undesirable solution. What am I doing wrong?
As far as I can tell, I've marked appropriate variables volatile, and I've tried marking other portions volatile and removing the volatile markings. I came up with the workaround when I noticed removing fprintf statements after the assignment in twi_init caused the value to be read differently later on.
The problem seems to be with how I'm passing around the function pointer -- and notably the portion of the program that is accessing the value of the pointer (the function itself?) is technically in a different thread.
Any ideas?
Edits:
resolved typos in code.
links to actual files: http://straymark.com/code/ [test.c|twi_driver.c|twi_driver.h]
fwiw: compiler options: -Wall -Os -fpack-struct -fshort-enums -funsigned-char -funsigned-bitfields -mmcu=atxmega128a1 -DF_CPU=2000000UL
I've tried the same code included directly (rather than via a library) and I've got the same issue.
Edits (round 2):
I removed all the optimizations, without my "workaround" the code works as expected. Adding back -Os causes an error. Why is -Os corrupting my code?
Just a hunch, but what happens if you switch these two lines around:
twi->slave.slave_callback = slave_callback;
twi->slave.length = 0;
Does removing the -fpack-struct gcc flag fix the problem? I wonder if you haven't stumbled upon a bug where writing that length field is overwriting part of the callback value.
It looks to me like with the -Os optimisations on (you could try combinations of the individual optimisations enabled by -Os to see exactly which one is causing it), the compiler isn't emitting the right code to manipulate the uint16_t length field when it's not aligned on a 2-byte boundary. This happens when you include a twi_global_slave_t inside a twi_global_t that is packed, because the initial uint8_t member of twi_global_t causes the twi_global_slave_t struct to be placed at an odd address.
If you make that initial field of twi_global_t a uint16_t it will probably fix it (or you could turn off struct packing). Try the latest gcc build and see if it still happens - if it does, you should be able to create a minimal test case that shows the problem, so you can submit a bug report to the gcc project.
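The odd offset can be made visible with offsetof. This is only a sketch for a host GCC or clang: it uses the packed attribute on copies of the question's structs (renamed packed_slave_t and packed_twi_t here) instead of the -fpack-struct flag, and the layout on the actual AVR toolchain may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TWI_INPUT_QUEUE_SIZE 256

typedef void (*twi_slave_callback_t)(uint8_t *, uint16_t);

/* Packed copies of the structs from the question (GCC/clang extension). */
typedef struct __attribute__((packed)) {
    uint8_t buffer[TWI_INPUT_QUEUE_SIZE];
    volatile uint16_t length;
    twi_slave_callback_t slave_callback;
} packed_slave_t;

typedef struct __attribute__((packed)) {
    uint8_t slave_address;
    volatile packed_slave_t slave;
} packed_twi_t;

/* Byte offset of slave.length from the start of the outer struct:
 * 1 byte for slave_address plus 256 bytes of buffer. */
size_t packed_length_offset(void)
{
    return offsetof(packed_twi_t, slave) + offsetof(packed_slave_t, length);
}

void check_layout(void)
{
    assert(packed_length_offset() % 2 == 1);   /* odd, hence unaligned */
}
```

Changing slave_address to a uint16_t in this sketch moves the field to an even offset, which matches the suggestion above.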
This really sounds like a stack/memory corruption issue. If you run avr-size on your elf file, what do you get? Make sure (data + bss) < the RAM you have on the part. These types of issues are very difficult to track down. The fact that removing/moving unrelated code changes the behavior is a big red flag.
Replace "&my_callback" with "my_callback" in function main().
Because different threads access the callback address, try protecting it with a mutex or read-write lock.
If the callback function pointer isn't accessed by a signal handler, then the "volatile" qualifier is unnecessary.