AVR32 exception: Bus Data Error - c

Recently I have been seeing behavior in my embedded software that looks strange to me.
What I have: A 32-bit AVR32 controller that runs the program from external SDRAM, because the image is too big to run directly from the microcontroller's flash. Due to the physical memory map, the memory areas are split between:
stack (start at 0x1000, length of 0xF000) ( < 0x1000 is protected by the MPU)
EBI SDRAM (start at 0xD0000000, length of 0x00400000).
What happens: Unfortunately I get an exception that I cannot reproduce reliably. According to my stack trace, the following event occurs at irregular intervals:
Name: Bus error data fetch - Event source: Data bus - Stored Return Address: First non-completed instruction
Additionally, the stack pointer has a valid value, whereas the address where the exception occurs (the last entry point for fetching instructions) points into memory nirvana (e.g. 0x496e6372, or something around 0x5..., 0x6....). I guess this is the "First non-completed instruction" the manual is talking about. However, the line in my source code is always the same: calling a member function through a pointer stored in a data array.
if(mSomeArray[i])
{
mSomeArray[i]->someFunction(); <-- Crash
}
The thing is: adding or deleting other source code makes the event disappear and return again.
What I thought about: Something is corrupting my memory (mapping). What kinds of errors are possible for this?
A buffer overflow?
The SDRAM controller could be turned off, so that it loses some data. That is not impossible, but rather improbable.
The stack is big enough, I already checked this with a watermark
The Data Bus Rate and AVR clock are set correctly
How to solve this: More asserts? Unfortunately I cannot debug this with AVRStudio. Does anyone have a hint or an idea? Or am I missing something obvious?
Edit:
Mentioned approaches from users:
Check for addresses of function pointer and array entries
Overwrite of stack array
Not properly written interrupts
Not initialized pointers
Check for array access via i at crash case
use exception handler address for illegal memory access
use snprintf instead of sprintf
Late appendix to the thread: the issue was a wrong array access (a wrong index was set) in an old software module that had nothing to do with my modules. I found this by accident; it is curious that it didn't show up earlier, and it took me quite a while to find the line of code. I am marking the only given answer as the correct solution.
Thank you all for your input.
Take care (of your software ;))

Here are some ideas:
Check 'i' to make sure it is within the array bounds.
Check the address of the function pointer that is about to be called. It should have an address within the SDRAM.
See if the chip has an exception handler address it will jump to when it accesses illegal memory. Once you are there, output some debug data
If your debugger allows, set a data breakpoint (watchpoint) on the memory that holds the pointer used to call someFunction(), so that it triggers when that memory is written. This would catch whatever other function overwrites the pointer.
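The first two checks could look roughly like the C sketch below. It is an illustration under assumptions, not the asker's code: the table `handlers`, its size `NUM_HANDLERS`, and the helper names are hypothetical, and only the SDRAM window (0xD0000000, length 0x00400000) is taken from the question. It also assumes a function pointer can be converted to `uintptr_t`, which holds on flat-address targets like this one but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SDRAM window from the question; all other names are illustrative. */
#define SDRAM_START  0xD0000000u
#define SDRAM_LEN    0x00400000u
#define NUM_HANDLERS 8u               /* hypothetical table size */

typedef void (*handler_fn)(void);

handler_fn handlers[NUM_HANDLERS];    /* hypothetical stand-in for mSomeArray */

/* Returns 1 if addr lies inside the SDRAM window, else 0. */
int addr_in_sdram(uintptr_t addr)
{
    return addr >= SDRAM_START && (addr - SDRAM_START) < SDRAM_LEN;
}

/* Returns 1 if slot i is in bounds, non-NULL and points into SDRAM. */
int handler_is_sane(size_t i)
{
    if (i >= NUM_HANDLERS)
        return 0;
    if (handlers[i] == NULL)
        return 0;
    return addr_in_sdram((uintptr_t)handlers[i]);
}

/* Calls handlers[i] only if it passes the checks; returns 0 on success and
   -1 otherwise. The assert traps at the call site in debug builds. */
int call_checked(size_t i)
{
    int ok = handler_is_sane(i);
    assert(ok);
    if (!ok)
        return -1;
    handlers[i]();
    return 0;
}
```

A failed check is the place to log `i` and the pointer value. A value such as 0x496e6372 would be rejected here before the CPU ever tries to fetch from it.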

Related

Running own code with a buffer overflow exploit

I am trying to understand the buffer overflow exploit and more specifically, how it can be used to run own code - e.g. by starting our own malicious application or anything similar.
While I do understand the idea of the buffer overflow exploit using the gets() function (overwriting the return address with a long enough string and then jumping to the said address), there are a few things I am struggling to understand in real application, those being:
Do I put my own code into the string just behind the return address? If so, how do I know the address to jump to? And if not, where do I jump and where is the actual code located?
Is the actual payload that runs the code my own software that's running and the other program just jumps into it or are all the instructions provided in the payload? Or more specifically, what does the buffer overflow exploit implementation actually look like?
What can I do when the address (or any instruction) contains 0? gets() function stops reading when it reads 0 so how is it possible to get around this problem?
As homework, I am trying to exploit a very simple program that just asks for input with gets() (ASLR turned off) and then prints it. While I can find the memory address of the function that calls it and of the return, I just can't figure out how to actually implement the exploit.
You understand how changing the return address lets you jump to an arbitrary location.
But as you have correctly identified, you don't know where the code you want to execute has been loaded. You just copied it into a local buffer (which is most likely somewhere on the stack).
But there is something that always points to this stack: the stack pointer register. (Let's assume x64, where it is %rsp.)
Assume your custom code is at the top of the stack. (It could be at an offset, but that can be managed similarly.)
Now we need an instruction that
1. Allows us to jump to the address in the stack pointer (%rsp)
2. Is located at a fixed address.
Most binaries use some kind of shared library. On Windows you have kernel32.dll. In every program that loads this library it is mapped at the same address (as long as ASLR is off, as in your case), so you know the exact location of every instruction in it.
All you have to do is disassemble one such library and find an instruction like
jmp *%rsp // or a sequence of instructions that lets you jump to an offset
Then the address of this instruction is what you will place where the return address is supposed to be.
The function will then return and jump to the stack (of course you need an executable stack for this). Then it will execute your arbitrary code.
Hope that clears some confusion on how to get the exploit running.
To answer your other questions -
Yes you can place your code in the buffer directly. Or if you can find the exact code you want to execute (again in a shared library), you can simply jump to that.
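Yes, gets stops at \n (and at end of input). A 0 byte on its own does not stop gets, but it does stop string functions such as strcpy if the input later passes through them. Usually you can get away with changing your instructions a bit so that the assembled code doesn't contain these bytes at all.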
You try different instructions and check the assembled bytes.
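Checking the assembled bytes by hand gets tedious, so the check can be automated. The sketch below is only an illustration: `first_bad_byte` is a hypothetical helper, and the set of bytes to avoid is passed in by the caller because it depends on which input routine the target uses.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first byte in buf[0..len) that also appears in
   bad[0..nbad), or -1 if buf contains none of the bytes to avoid. */
long first_bad_byte(const unsigned char *buf, size_t len,
                    const unsigned char *bad, size_t nbad)
{
    assert(buf != NULL && bad != NULL);
    for (size_t i = 0; i < len; i++) {
        for (size_t j = 0; j < nbad; j++) {
            if (buf[i] == bad[j])
                return (long)i;
        }
    }
    return -1;
}
```

If the function reports an index, the instruction that produced that byte is the one to replace with an equivalent encoding.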

How to locate bug only by a memory address?

I got a segmentation fault in a function like this:
http_client_reset(struct http_client *client) {
if (client->last_req) {
/* @client should never be NULL, but whether it is
a valid object, I don't know */
...
}
}
When I debug the core dump file in GDB, the memory address of client is 0x40a651c0. I have tried several times, and the address is always the same.
Then I tried the bt command in GDB:
(gdb) bt
#0 0x0804c80e in http_client_reset (
c=<error reading variable: Cannot access memory at address 0x40a651c0>,
c@entry=<error reading variable: Cannot access memory at address 0x40a651bc>)
at http/client.c:170
Cannot access memory at address 0x40a651bc
There is no backtrace beyond this frame. I have grepped my source code, and there is only one call to http_client_reset.
How can I debug such a bug with only a memory address to go on?
Is there a way to judge whether an object is valid before accessing its fields (other than obj == NULL)?
Debugging a core dump is never a 'black and white' matter, so you will not get an exact answer to questions about it. However, most core dumps are due to programming errors that fall into a few broad classes. I will list some of these classes and some debugging mechanisms that might help you.
Classes of programming errors leading to a crash
Multi-threaded code - Check for missing critical sections around access to shared data. This can corrupt the data and lead to such a crash. In your case, check every place where the http_client pointer is created, read, updated, or deleted (CRUD).
Heap corruption - In most cases the pointer was valid, and incorrect heap handling in another section of code caused it to be overwritten. Think of an array located near the pointer: an ABW (array bounds write) or a similar issue would easily cause this problem.
Stack corruption - This is less likely, but hard to find. If you overwrite stack data, as with the array in the example above but on the stack, the same issue occurs.
Ways to unearth the root cause of the core dump
You need to understand that, technically, a core dump results from an illegal operation that raises an unhandled exception and crashes the program. Since most of these are related to memory handling, a static-analysis tool such as kloc/PCLint would capture almost 80% of the issues. Next I would run the program under valgrind/purify, which would most probably uncover the rest. Very few issues escape both; those tend to be sequencing- or timing-related and can be found with code review.
HTH!

Use of keep(int,int) function in TSR programming using dos.h

While studying TSR programming I have seen some code which I cannot understand.
The example code is (in C):
(please explain the lines marked with <<<<<)
#include "dos.h"
#include"stdio.h"
void interrupt our();
void interrupt (*prev)();
char far *scr=(char far*)0xB8000000L;
int ticks;
unsigned char color;
main()
{
prev=getvect(8); // <<<<<
setvect(8,our); // <<<<<
keep(0,10000); // <<<<<
}
You would partially understand this code if you read the answer I posted to your similar question on TSRs
How to write a TSR which changes case of characters
The most important things here are
Far pointer: Since 16-bit DOS used a segment:offset addressing scheme, a normal near pointer could not access memory beyond the 64K of its allocated segment. You have to read up on the details to understand it.
Video memory address: B8000000 is the address for which you need the far pointer. What is special about this address is that bytes written starting at this location (screen resolution * 2 of them) go directly into video memory.
So if you assign a character through the pointer (after indirection), it is printed on the screen.
Something like
char far *p = (char far *)0xB8000000L;
*p = 'a'; // this would actually print a on screen at left top
Loop forward to get to the rest of the screen.
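Looping over the screen comes down to computing a byte offset for each character cell. The sketch below assumes the standard 80x25 color text mode, where each cell takes two bytes (the character, then its color attribute). It works on an ordinary buffer so that it compiles with a modern compiler; in the DOS program the buffer would be the far pointer. The names `cell_offset` and `put_cell` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define TEXT_COLS 80u   /* assumption: 80x25 color text mode */
#define TEXT_ROWS 25u

/* Byte offset of the character at (row, col); its attribute byte follows. */
size_t cell_offset(unsigned row, unsigned col)
{
    assert(row < TEXT_ROWS && col < TEXT_COLS);
    return ((size_t)row * TEXT_COLS + col) * 2;
}

/* Writes one character and its color attribute into a text-mode buffer. */
void put_cell(unsigned char *vram, unsigned row, unsigned col,
              char ch, unsigned char attr)
{
    size_t off = cell_offset(row, col);
    vram[off] = (unsigned char)ch;
    vram[off + 1] = attr;
}
```

With these assumptions, the top-left cell is at offset 0 and each new row starts 160 bytes after the previous one.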
There was a C book by Yashwant Kanetkar which had a good deal of reference material on this. I remember using it as an undergrad many years ago.
The rest of them are just API calls from dos.h. Why don't you go through their descriptions and come back here if there are any you don't understand?
This program installs an interrupt handler. It uses interrupt number 8, the system timer interrupt. Using this interrupt to "continuously" do stuff was common practice on machines running DOS.
prev=getvect(8);
This line gets the interrupt vector, that is, a pointer to a function that the system calls 18 times per second.
setvect(8,our);
This line sets the interrupt vector, that is, tells the system to call this function, instead of the old function, 18 times per second. Note that to avoid a crash, the new function must call the old function, in addition to its main purpose (which seems to be changing the case of characters).
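The save-then-chain pattern described here can be tried out without DOS. The sketch below simulates the vector table with a plain array; `sim_getvect`, `sim_setvect`, `fire`, `old_timer` and `our_timer` are hypothetical stand-ins for getvect, setvect, the hardware tick, the original handler and `our`.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*isr_fn)(void);

static isr_fn vector_table[256];  /* simulated interrupt vector table */
static isr_fn prev_handler;       /* saved old handler, like prev */
int ticks;                        /* how often our handler ran */
int old_calls;                    /* how often the old handler ran */

isr_fn sim_getvect(int n)
{
    assert(n >= 0 && n < 256);
    return vector_table[n];
}

void sim_setvect(int n, isr_fn f)
{
    assert(n >= 0 && n < 256);
    vector_table[n] = f;
}

/* Stands in for the handler that was installed before ours. */
void old_timer(void) { old_calls++; }

/* Our handler: do our own work, then chain to the previous handler. */
void our_timer(void)
{
    ticks++;
    if (prev_handler != NULL)
        prev_handler();
}

/* Same order as in main(): save the old vector, then install ours. */
void install(void)
{
    prev_handler = sim_getvect(8);
    sim_setvect(8, our_timer);
}

/* Simulates the hardware raising interrupt n. */
void fire(int n)
{
    isr_fn f = sim_getvect(n);
    if (f != NULL)
        f();
}
```

After `install()`, every simulated tick runs both handlers, which is the behavior the original handler's clients depend on.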
keep(0,10000);
This line makes the program exit with exit code 0 (a conventional value for success) and tells DOS to leave 10000 units of memory resident in RAM. Int 21/AH=31h, the DOS call behind keep, takes this size in 16-byte paragraphs rather than bytes. This is unlike normal completion of a program (exit(0)), where DOS marks all RAM previously occupied by the program as free.
A common cause of crashes in TSR programs is the absence of keep at the end. DOS releases the memory occupied by the code of the function our, and in the next 1/18 of a second a random piece of code is executed.
See Int 21/AH=31h for more information.
Please note also that the parameter to keep should be calculated by manipulating some addresses, so that you don't take too much memory, and on the other hand, take enough memory to contain the code of the function our, which performs the stuff you need. The value 10000 is just an example.
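Int 21/AH=31h takes the resident size in DX as a count of 16-byte paragraphs, so a byte count has to be rounded up before it is passed on. A minimal sketch of that rounding follows; `bytes_to_paragraphs` is a hypothetical helper, and how you obtain the resident byte count depends on the compiler and memory model, which is not shown here.

```c
#include <assert.h>

#define PARAGRAPH_BYTES 16ul

/* Rounds a byte count up to whole 16-byte paragraphs. */
unsigned bytes_to_paragraphs(unsigned long bytes)
{
    unsigned long paras = (bytes + PARAGRAPH_BYTES - 1) / PARAGRAPH_BYTES;
    assert(paras <= 0xFFFFul);   /* the size is passed in a 16-bit register */
    return (unsigned)paras;
}
```

Rounding up matters: truncating instead would leave the last few bytes of the resident code in memory that DOS considers free.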

What can cause Program Counter to have an invalid address?

I am getting an "Invalid Program Counter Address" exception on VxWorks + PPC 603.
The application links against multiple 'C' libraries. I am not able to pin down what could cause this problem.
Is there a possibility that incorrect compilation options could be causing this?
Any directions or pointers will be helpful.
Thanks
UPDATE:
I have a structure whose members are function pointers. The structure itself is static, its address is passed around, and different functions are invoked through it.
During one of the test rounds, I found that in the function pointer, the function address value is reduced by 1. If the function address is 0x009a3730, the PC is having 0x00913729.
Also, if I change the compiler options, the place of crash or the number of runs after which the crash happens changes.
Any case where you're working with function pointers can easily lead to this, if the pointer value gets corrupted and is called later. Check signal handlers, if any, and any other APIs that deal with callbacks.
"If the function address is 0x009a3730, the PC is having 0x00913729". The difference here is not 1 :) However PC will always point to the address of the next instruction it has to execute AFAIK.
Maybe you could load the core dump in a debugger and print out:
Back trace
'disassemble' code around the region of the crash
info registers -> register values at the time of the crash
info locals --> local variables of the function inside which it crashed
@All, thanks for your suggestions.
It turned out that the location containing the address was incorrectly pointed to by a reference member of another structure, and that reference member was decremented by one in each call to free that structure.
The memory for that structure should have been allocated by a call to one of our functions. Instead it was left referring to some garbage memory without any initialization or memory allocation, and it ended up referring to the static memory where the global structure is stored. This corrupted the static structure, which in turn led to the crash.
A thorough line-by-line analysis of our logs helped in putting all pieces together.

Is there a way to test for an invalid memory location?

In a language like C, for example, if a routine receives a pointer, is there any system call or other test that can be applied to the pointer to check that it is a valid memory location, other than catching SIGSEGV or equivalent?
No, you can't check for sure whether the address is invalid. Even if you used some operating system function to test whether the address is mapped into the address space, you still can't be sure that the address doesn't belong to some service data that you should not read or modify.
One good example: if your program uses Microsoft RPC to accept calls from another program, you have to implement a set of callback functions to serve the requests. Those callback functions will run on separate threads started by RPC. You don't know when those threads start or what their stack size is, so you can't detect whether a buffer overrun occurs if you write through an address that is meant to be that of a stack variable but accidentally points into the stack of another thread.
Well, if you knew where the memory being pointed to was being stored (on the stack, for instance), you could check to see if it's in a certain 'range' that is the approximate address range of the stack. That could also work for something on the heap, if you have an idea of how big your heap "should" be. It's definitely not a fail-safe approach, but I'm unaware of any sure-fire methods for checking the 'validity' of a pointer.
If you mean purely within your own application, you can establish a convention that any memory allocated by your code is initialized in a way you can recognize. E.g. in one project I saw, they wrote an eyecatcher in the first few bytes. In some products I know, they write a unique id at the start and at the end, and each time the memory is accessed they check that the 2 ids still match, to show it has not been corrupted. E.g. CICS on z/Series does the latter.
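The two-marker convention might be sketched in C as follows. The struct layout, the marker value and the function names are all invented for this example; real products will differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EYECATCHER 0x4F424A21u   /* arbitrary marker value for this sketch */

struct guarded {
    uint32_t head;          /* marker in front of the payload */
    char     payload[32];
    uint32_t tail;          /* marker behind the payload */
};

/* Sets both markers and clears the payload. */
void guarded_init(struct guarded *g)
{
    assert(g != NULL);
    g->head = EYECATCHER;
    memset(g->payload, 0, sizeof g->payload);
    g->tail = EYECATCHER;
}

/* Returns 1 if both markers are intact, 0 if either was overwritten. */
int guarded_ok(const struct guarded *g)
{
    return g != NULL && g->head == EYECATCHER && g->tail == EYECATCHER;
}
```

Note the limit of this approach: it detects that an object you can already read has been corrupted, but it cannot tell you whether an arbitrary pointer is safe to dereference in the first place.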
