SPARC assembly jmp \boot - c

I'll explain the problem briefly. I have a Leon3 board (gr-ut-g99). Using GRMON2 I can load executables at the desired address in the board.
I have two programs, A and B. I loaded each of them into memory, and individually they work.
What I would like to do now is make program A call program B.
Both programs are written in C using a variant of the gcc compiler (the Gaisler Sparc GCC).
To do the jump, I wrote a tiny piece of inline assembly in program A that jumps to the memory address where I loaded program B.
Below is a snippet of program A:
unsigned int return_address;
unsigned int * const RAM_pointer = (unsigned int *) RAM_ADDRESS;
printf("RAM pointer set to: 0x%08x \n",(unsigned int)RAM_pointer);
printf("jumping...\n");
__asm__(" nop;" //clean the pipeline
"jmp %1;" // jmp to programB
:"=r" (return_address)
:"r" (RAM_pointer)
);
RAM_ADDRESS is a #define
#define RAM_ADDRESS 0x60000000
Program B is a simple hello world, loaded at address 0x60000000. If I run it directly, it works!
int main()
{
printf ("HELLO! I'M BOOTED! \n");
fflush(stdout);
return 0;
}
What I expect when I run program A is to see the "jumping..." message on the console, followed by "HELLO! I'M BOOTED!" from program B.
What happens instead is an IU exception.
Below are the messages shown by the grmon2 monitor. I have also included the "inst" report, which should show the last operations performed before the exception.
grmon2> run
IU exception (tt = 0x07, mem address not aligned)
0x60004824: 9fc04000 call %g1
grmon2> inst
TIME ADDRESS INSTRUCTION RESULT SYMBOL
407085 600047FC mov %i3, %o2 [600063B8] -
407086 60004800 cmp %i4 [00000013] -
407089 60004804 be 0x60004970 [00000000] -
407090 60004808 mov %i0, %o0 [6000646C] -
407091 6000480C mov %i4, %o3 [00000013] -
407092 60004810 cmp %i4, %l0 [80000413] -
407108 60004814 bleu 0x60004820 [00000000] -
407144 60004818 ld [%i1 + 0x20], %o1 [FFFFFFFF] -
407179 60004820 ld [%i1 + 0x28], %g1 [FFFFFFFF] -
407186 60004824 call %g1 [ TRAP ] -
I also tried replacing the "jmp" with a "jmpl" or a "call", but it did not work.
I'm quite confused.
I do not know how best to approach the problem, and therefore I do not know what other information I should provide.
I can say that program B is loaded at 0x60000000 and its entry point is, of course, 0x60000000. Running program B directly from that entry point works fine!
Thanks in advance for your help!

Looks to me like you did execute the jump, and it got to program B, as evidenced by the addresses of the instructions in the trace buffer. But where you crashed was in stdio trying to print stuff. Stdio makes extensive use of function pointers, and the sequence clearly shows a call instruction with the target address in a register, which indicates use of a function pointer.
I suggest putting fflush(stdout) in program A just before the jump, and this will allow you to see the messages before doing the jump. Then, in program B, instead of using printf, just put some known value in memory that you can examine later via the monitor to verify that it got there.
My guess is that the stdio library has some data or parameter that needs to be set up at the start of the program, and that's not being done or not done properly. Not sure about the platform you are running on, but do you have some sort of debugging or single stepping ability, like in a debugger? If so, just single step through the jump and follow where the program goes.
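A minimal sketch of the two suggestions above, written in portable C. The names jump_to, leave_marker, demo_target and the marker value are hypothetical and only serve as illustration. Calling through a function pointer lets the compiler emit the call and its delay slot, in place of hand-written inline assembly. It does not set up program B's C runtime, so on its own it is not expected to cure the trap; it only makes the jump and the "did we get there" check easier to observe.

```c
#include <stdint.h>

/* Type of a bare entry point: no arguments, no return value. */
typedef void (*entry_fn)(void);

/* Hypothetical helper for program A: call the code located at 'address'. */
static void jump_to(uintptr_t address)
{
    entry_fn entry = (entry_fn)address;
    entry();
}

/* Hypothetical helper for program B: store a known value in memory so it
 * can be inspected from the monitor afterwards, instead of using printf. */
static void leave_marker(volatile uint32_t *where, uint32_t value)
{
    *where = value;
}

/* Stand-in for program B's entry point, used only to exercise the helpers. */
static volatile uint32_t demo_flag;

static void demo_target(void)
{
    leave_marker(&demo_flag, 0xB007ED00u);
}
```

On the board, program A would call fflush(stdout) and then jump_to(RAM_ADDRESS), and program B would write its marker to an address chosen so that it can be examined from GRMON. Here demo_target stands in for program B so the helpers can be tried on any host with jump_to((uintptr_t)demo_target).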

Related

Is Ghidra misinterpreting a function call?

When analyzing the assembly listing in Ghidra, I stumbled upon this instruction:
CALL dword ptr [EBX*0x4 + 0x402ac0]=>DAT_00402abc
I assumed that the program was calling a function whose address was stored in DAT_00402abc, which I initially thought was a dword variable. Indeed, when I tried to create a function at the location of DAT_00402abc, Ghidra wouldn't let me do it.
The decompiler shows me this line of code as the translation of that instruction:
(*(code *)(&int2)[iVar2])();
So I was wondering, what does it mean and what's the program supposed to do with this call? Is there a possibility that Ghidra totally messed up? And if so, how should I interpret that instruction?
I'm not at all familiar with Ghidra, but I can tell you how to interpret the machine instruction...
CALL dword ptr [EBX*0x4 + 0x402ac0]
There is a table of function addresses at 0x402ac0; the EBX'th entry in that table is being called. I have no idea what DAT_00402abc means, but if you inspect memory in dword-sized chunks at address 0x0402ac0 you should find plausible function addresses. [EDIT: 0x0040_2abc = 0x0040_2ac0 - 4. I suspect this means Ghidra thinks EBX has value -1 when control reaches this point. It may be wrong, or maybe the program has a bug. One would expect EBX to have a nonnegative value when control reaches this point.]
The natural C source code corresponding to this instruction would be something like
extern void do_thing_zero(void);
extern void do_thing_one(void);
extern void do_thing_two(void);
extern void do_thing_three(void);
typedef void (*do_thing_ptr)(void);
const do_thing_ptr do_thing_table[4] = {
do_thing_zero, do_thing_one, do_thing_two, do_thing_three
};
// ...
void do_thing_n(unsigned int n)
{
if (n >= 4) abort();
do_thing_table[n]();
}
If the functions in the table take arguments or return values, you'll see argument-handling code before and after the CALL instruction you quoted, but the CALL instruction itself will not change.
You would be seeing something different and much more complicated if the functions didn't all take the same set of arguments.

SPARC LEON error: IU exception (tt = 0x2B, data store error)

Good morning, I need some help because I'm stuck and I cannot find a solution in the manuals.
I want to use EDAC on a Leon3. I'm programming in C using the BCC compiler, on a GR-UT699 board, and I'm using GRMON to load my ELF file into RAM. My program is a short test in which I want to use the EDAC. To enable the EDAC I simply set the register bits directly, like this (I checked the registers and they are written correctly):
#define MCFG2_RMW_bit_set 0x00000040 //enable read-modify-write cycles on sub-word writes to 16 and 32bit areas with common write strobe
#define MCFG2_DE_bit_set 0x00004000 //SDRAM controller (1 en, 0 dis)
#define MCFG3_R_bit_set 0x00000200 //enable EDAC checking of the SDRAM or SRAM (1 en, 0 dis)
#define MCFG1_IE_bit_set 0x00080000 //enable access to mapped I/O memory.
...
edac->MCFG1 = edac->MCFG1 | MCFG1_IE_bit_set;
edac->MCFG2 = edac->MCFG2 | MCFG2_RMW_bit_set | MCFG2_DE_bit_set;
edac->MCFG3 = edac->MCFG3 | MCFG3_R_bit_set;
...
return 0;
}
These instructions are executed inside an init function which returns 0. I just set the bits that you can see in the defines above.
When the function returns, I just want to call printf() to show a message. The printf output is never shown, so the program crashes after setting the registers and before the printf. I think it crashes during the return from the init function.
This is the GRMON console output:
grmon2> run
IU exception (tt = 0x2B, data store error)
0x40009acc: 81c3e008 retl <memmove+484>
grmon2> inst
TIME ADDRESS INSTRUCTION RESULT SYMBOL
2608062 40009978 andcc %g1, %g3, %g0 [00000000] memmove+0x90
2608065 4000997C be 0x40009AB0 [00000000] memmove+0x94
2608066 40009980 or %g2, %o1, %g1 [40013FA0] memmove+0x98
2608067 40009AB0 mov 0, %g1 [00000000] memmove+0x1c8
2608068 40009AB4 ldub [%o1 + %g1], %g3 [0000002E] memmove+0x1cc
2608070 40009AB8 stb %g3, [%g2 + %g1] [40012EA0 2E2E2E2E] memmove+0x1d0
2608072 40009ABC add %g1, 1, %g1 [00000001] memmove+0x1d4
2608073 40009AC0 cmp %g1, %o2 [00000000] memmove+0x1d8
2608076 40009AC4 bne,a 0x40009AB8 [00000000] memmove+0x1dc
2608078 40009ACC retl [ TRAP ] memmove+0x1e4
I saw that I needed to set the IE bit in the MCFG1 reg, and so I did. But the program still crashes. What is wrong here?
thanks in advance for your patience.
-Lorenzo
I found at least one solution which does not crash the program.
If you want to use EDAC you have to initialize the memory controller registers (from GRMON using "mcfgx 0xvalue etc" OR using the -edac option when starting GRMON).
Then a wash of the RAM must be performed (using the wash command from GRMON).
It is important to run the wash command (or, more generally, to wash the memory from firmware) after the EDAC has been enabled. If you wash the memory after the EDAC has been enabled, the checkbits are generated; otherwise you only perform a simple memory clear.
Then you can finally load a program into RAM (from GRMON using "load").
It is important to note that the IU/FPU registers must also be cleared at reset; this can be done from MKPROM (if necessary).
This solution works for programs that are loaded into RAM through GRMON.
If it is necessary to write the programs into the flash ROM, a similar procedure must be performed by means of MKPROM. I have not done this yet, but I hope it is something really similar.
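For the "wash the memory from firmware" case, a minimal sketch could look like the following. The function name wash_region is hypothetical, and the sketch assumes that a full 32-bit write with EDAC already enabled is what makes the memory controller store fresh checkbits, as described above. The region passed in must not overlap the code, data or stack of the program doing the wash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical firmware-side wash: write every 32-bit word of the region
 * once. With EDAC already enabled, each write is assumed to store matching
 * checkbits alongside the data word. */
static void wash_region(volatile uint32_t *start, size_t word_count)
{
    for (size_t i = 0; i < word_count; i++) {
        start[i] = 0u;
    }
}
```

The pointer is volatile so that the compiler does not merge or drop the stores.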
Lorenzo.

Call stack backtrace in C

I am trying to get a call stack backtrace in my assert/exception handler. I can't include "execinfo.h", and therefore can't use int backtrace(void **buffer, int size);.
I also tried to use __builtin_return_address(), but according to http://codingrelic.geekhold.com/2009/05/pre-mortem-backtracing.html:
... on some architectures, including my beloved MIPS, only __builtin_return_address(0) works. MIPS has no frame pointer, making it difficult to walk back up the stack. Frame 0 can use the return address register directly.
How can I reproduce a full call stack backtrace?
I have successfully used the method described here, to get a call trace from stack on MIPS32.
You can then print out the call stack:
void *retaddrs[16];
int n, i;
n = get_call_stack_no_fp (retaddrs, 16);
printf ("CALL STACK: ");
for (i = 0; i < n; i++) {
printf ("0x%08lX ", (unsigned long)(uintptr_t)retaddrs[i]);
}
printf ("\r\n");
... and if you have the ELF file, then use the addr2line to convert the return addresses to function names:
addr2line -a -f -p -e xxxxxxx.elf addr addr ...
There are of course many gotchas, when using a method like this, including interrupts and exception handlers or results of code optimization. But nevertheless, it might be helpful sometimes.
I have successfully used the method suggested by @Erki A and described here.
Here is a short summary of the method:
The problem:
get a call stack without a frame pointer.
Solution main idea:
conclude from the assembly code what the debugger understood from debug info.
The information we need:
1. Where the return address is kept.
2. What amount the stack pointer is decremented.
To reproduce the whole stack trace one needs to:
1. Get the current $sp and $ra
2. Scan towards the beginning of the function and look for the "addiu sp,sp,spofft" instruction (spofft<0)
3. Reproduce the previous $sp (sp - spofft)
4. Scan forward and look for "sw r31,raofft(sp)"
5. The previous return address is stored at [sp + raofft]
Above I described one iteration. You stop when the $ra is 0.
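One iteration of the scan can be sketched in C as below. This is a simplified illustration for classic 32-bit MIPS32 encodings only (not microMIPS), and the helper names are hypothetical. It looks through a window of instruction words for "addiu sp,sp,imm" (upper halfword 0x27BD) with a negative immediate and for "sw ra,imm(sp)" (upper halfword 0xAFBF), and reports the two offsets.

```c
#include <stddef.h>
#include <stdint.h>

#define ADDIU_SP_SP 0x27BD0000u /* addiu sp, sp, imm16 */
#define SW_RA_SP    0xAFBF0000u /* sw    ra, imm16(sp) */

/* Sign-extend the low 16 bits of an instruction word. */
static int32_t imm16(uint32_t insn)
{
    uint32_t low = insn & 0xFFFFu;
    return (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* Hypothetical helper: scan 'count' instruction words and extract the stack
 * adjustment (spofft, negative) and the offset at which ra is saved (raofft).
 * Returns 1 if both instructions were found, 0 otherwise. */
static int find_frame_info(const uint32_t *code, size_t count,
                           int32_t *sp_offset, int32_t *ra_offset)
{
    int have_sp = 0;
    int have_ra = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t top = code[i] & 0xFFFF0000u;

        if (!have_sp && top == ADDIU_SP_SP && imm16(code[i]) < 0) {
            *sp_offset = imm16(code[i]);
            have_sp = 1;
        } else if (!have_ra && top == SW_RA_SP) {
            *ra_offset = imm16(code[i]);
            have_ra = 1;
        }
    }
    return have_sp && have_ra;
}
```

With the two offsets in hand, the previous return address is read from sp + raofft and the previous stack pointer is sp - spofft, after which the same scan is repeated for the caller.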
How to get the first $ra?
__builtin_return_address(0)
How to get the first $sp?
register unsigned sp asm("29");
asm("" : "=r" (sp));
***Since most of my files are compiled with the microMIPS optimisation, I had to deal with the microMIPS ISA.
A lot of issues arose when I tried to analyse code compiled for microMIPS (remember that the goal at each step is to reproduce the previous ra and the previous sp). It makes things a bit more complicated:
1. The ra ($31) register contains an unaligned return address. You may find more information at Linked questions. The unaligned ra tells you that you are running over a different ISA (microMIPS).
2. There are functions that do not move the sp. You can find more information [here][3]. (If a "leaf" function only modifies the temporary registers and returns to a return statement in its caller's code, then there is no need for $ra to be changed, and there is no need for a stack frame for that function.)
3. There are functions that do not store the ra.
4. microMIPS instructions can be either 16-bit or 32-bit: step over the instructions using unsigned short*.
5. There are functions that perform "addiu sp, sp, spofft" more than once.
6. The microMIPS ISA has a couple of variations of the same instruction, for example addiu and addiusp.
I have decided to ignore some of these issues, which is why it works for 95% of the cases.

Is there any operation in C analogous to this assembly code?

Today, I played around with incrementing function pointers in assembly code to create alternate entry points to a function:
.386
.MODEL FLAT, C
.DATA
INCLUDELIB MSVCRT
EXTRN puts:PROC
HLO DB "Hello!", 0
WLD DB "World!", 0
.CODE
dentry PROC
push offset HLO
call puts
add esp, 4
push offset WLD
call puts
add esp, 4
ret
dentry ENDP
main PROC
lea edx, offset dentry
call edx
lea edx, offset dentry
add edx, 13
call edx
ret
main ENDP
END
(I know, technically this code is invalid since it calls puts without the CRT being initialized, but it works without any assembly or runtime errors, at least on MSVC 2010 SP1.)
Note that in the second call to dentry I took the address of the function in the edx register, as before, but this time I incremented it by 13 bytes before calling the function.
The output of this program is therefore:
C:\Temp>dblentry
Hello!
World!
World!
C:\Temp>
The first output of "Hello!\nWorld!" is from the call to the very beginning of the function, whereas the second output is from the call starting at the "push offset WLD" instruction.
I'm wondering if this kind of thing exists in languages that are meant to be a step up from assembler like C, Pascal or FORTRAN. I know C doesn't let you increment function pointers but is there some other way to achieve this kind of thing?
AFAIK you can only write functions with multiple entry-points in asm.
You can put labels on all the entry points, so you can use normal direct calls instead of hard-coding the offsets from the first function-name.
This makes it easy to call them normally from C or any other language.
The earlier entry points work like functions that fall-through into the body of another function, if you're worried about confusing tools (or humans) that don't allow function bodies to overlap.
You might do this if the early entry-points do a tiny bit of extra stuff, and then fall through into the main function. It's mainly going to be a code-size saving technique (which might improve I-cache / uop-cache hit rate).
Compilers tend to duplicate code between functions instead of sharing large chunks of common implementation between slightly different functions.
However, you can probably accomplish it with only one extra jmp with something like:
int foo(int a) { return bigfunc(a + 1); }
int bar(int a) { return bigfunc(a + 2); }
int bigfunc(int x) { /* a lot of code */ }
See a real example on the Godbolt compiler explorer
foo and bar tailcall bigfunc, which is slightly worse than having bar fall-through into bigfunc. (Having foo jump over bar into bigfunc is still good, esp. if bar isn't that trivial.)
Jumping into the middle of a function isn't safe in general, because non-trivial functions usually need to save/restore some registers. So the prologue pushes them, and the epilogue pops them. If you jump into the middle, skipping the prologue, then the pops in the epilogue will unbalance the stack (i.e. pop the return address into a register, and return to a garbage address).
See also Does a function with instructions before the entry-point label cause problems for anything (linking)?
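As a rough C-level approximation of the two entry points in the question, a selector argument and a switch with fall-through can be used. This is only a sketch: the names are hypothetical, and the "entry point" is chosen at run time by a parameter, not by a second address inside the function as in the assembly version.

```c
#include <stdio.h>

/* Hypothetical selector for where execution should start inside dentry. */
enum entry_point { ENTRY_HELLO = 0, ENTRY_WORLD = 1 };

/* Prints from the selected point onwards and returns the number of lines
 * printed, mirroring the "call dentry" and "call dentry+13" cases. */
static int dentry(enum entry_point entry)
{
    int lines = 0;

    switch (entry) {
    case ENTRY_HELLO:
        puts("Hello!");
        lines++;
        /* fall through */
    case ENTRY_WORLD:
        puts("World!");
        lines++;
        break;
    }
    return lines;
}
```

Calling dentry(ENTRY_HELLO) runs both puts calls, while dentry(ENTRY_WORLD) skips the first one, at the cost of a run-time dispatch that the assembly version does not need.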
You can use the longjmp function: http://www.cplusplus.com/reference/csetjmp/longjmp/
It's a fairly horrible function, but it'll do what you seek.

C - How to create a pattern in code segment to recognize it in memory dump?

I dump my RAM (a piece of it, the code segment only) in order to find where each C function has been placed. I have no map file and I don't know exactly what the boot/init routines do.
I load my program into RAM, but when I then dump the RAM it is very hard to find exactly where each function is. I'd like to build distinct patterns into the C source so that I can recognize them in the memory dump.
I've tried starting every function with a distinct first variable containing the name of the function, like:
char this_function_name[]="main";
but it doesn't work, because the string is placed in the data segment.
I have a simple 16-bit RISC CPU and an experimental proprietary compiler (not GCC or any other well-known one). The system has 16Mb of RAM, shared with other applications (bootloader, downloader). It is almost impossible to find, say, a unique sequence of N NOPs or something like 0xABCD. I would like to find all functions in RAM, so I need unique identifiers for the functions that are visible in the RAM dump.
What would be the best pattern for code segment?
If it were me, I'd use the symbol table, e.g. "nm a.out | grep main". Get the real address of any function you want.
If you really have no symbol table, make your own.
struct tab {
void *addr;
char name[100]; // For ease of searching, use an array.
} symtab[] = {
{ (void*)main, "main" },
{ (void*)otherfunc, "otherfunc" },
};
Search for the name, and the address will immediately precede it. Go to that address. ;-)
If your compiler has inline asm you can use it to create a pattern. Write some NOP instructions which you can easily recognize by opcodes in memory dump:
MOV r0,r0
MOV r0,r0
MOV r0,r0
MOV r0,r0
How about a completely different approach to your real problem, which is finding a particular block of code: Use diff.
Compile the code once with the function in question included, and once with it commented out. Produce RAM dumps of both. Then, diff the two dumps to see what's changed -- and that will be the new code block. (You may have to do some sort of processing of the dumps to remove memory addresses in order to get a clean diff, but the order of instructions ought to be the same in either case.)
Numeric constants are placed in the code segment, encoded in the function's instructions. So you could try to use magic numbers like 0xDEADBEEF and so on.
I.e. here's the disassembly view of a simple C function with Visual C++:
void foo(void)
{
00411380 push ebp
00411381 mov ebp,esp
00411383 sub esp,0CCh
00411389 push ebx
0041138A push esi
0041138B push edi
0041138C lea edi,[ebp-0CCh]
00411392 mov ecx,33h
00411397 mov eax,0CCCCCCCCh
0041139C rep stos dword ptr es:[edi]
unsigned id = 0xDEADBEEF;
0041139E mov dword ptr [id],0DEADBEEFh
You can see the 0xDEADBEEF making it into the function's machine code. Note that what you actually see in the executable depends on the endianness of the CPU (tx. Richard).
This is a x86 example. But RISC CPUs (MIPS, etc) have instructions moving immediates into registers - these immediates can have special recognizable values as well (although only 16-bit for MIPS, IIRC).
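To locate such a magic number in a dump, the search has to use the byte order of the target CPU. A small sketch follows, with the hypothetical helper find_magic. It assumes the 32-bit constant appears as four contiguous bytes, which holds for the x86 example above but not necessarily on a CPU that splits the immediate across two instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first occurrence of 'magic'
 * in the dump, stored in the given byte order, or -1 if it is not found. */
static long find_magic(const uint8_t *dump, size_t len,
                       uint32_t magic, int big_endian)
{
    uint8_t pattern[4];

    /* Lay the constant out the way the target stores it in memory. */
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 24 - 8 * i : 8 * i;
        pattern[i] = (uint8_t)(magic >> shift);
    }

    for (size_t pos = 0; pos + 4 <= len; pos++) {
        if (dump[pos] == pattern[0] && dump[pos + 1] == pattern[1] &&
            dump[pos + 2] == pattern[2] && dump[pos + 3] == pattern[3]) {
            return (long)pos;
        }
    }
    return -1;
}
```

For a target with 16-bit immediates, the same loop could be adapted to look for two distinct 16-bit markers close to each other.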
Psihodelia - it's getting harder and harder to catch your intention. Is it just a single function you want to find? Then can't you just place 5 NOPs one after another and look for them? Do you control the compiler/assembler/linker/loader? What tools are at your disposal?
As you noted, this:
char this_function_name[]="main";
... will end up setting a pointer in your stack to a data segment containing the string. However, this:
char this_function_name[]= { 'm', 'a', 'i', 'n' };
... will likely put all these bytes in your stack so you will be able to recognize the string in your code (I just tried it on my platform).
Hope this helps
Why not get each function to dump its own address. Something like this:
void* fnaddr( const char* fname, void* addr )
{
printf( "%s\t0x%p\n", fname, addr ) ;
return addr ;
}
void test( void )
{
static void* fnaddr_dummy = fnaddr( __FUNCTION__, test ) ;
}
int main (int argc, const char * argv[])
{
static void* fnaddr_dummy = fnaddr( __FUNCTION__, main ) ;
test() ;
test() ;
}
By making fnaddr_dummy static, the dump is done once per function. Note that initialising a static local with a function call is only valid in C++; standard C requires a constant initialiser there. Obviously you would need to adapt fnaddr() to support whatever output or logging means you have on your system. Unfortunately, because the initialisation only happens the first time a function runs, you'll only get the addresses of the functions that are actually called (which may be good enough).
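In plain C the same once-per-function effect can be had with a static flag that is checked on entry. The sketch below uses hypothetical names (log_fnaddr_once, log_count, demo), and printf stands in for whatever logging the target actually offers. The address is printed through uintptr_t, which assumes that function addresses fit in that type on the platform.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical counter, present only to make the behaviour observable. */
static int log_count;

/* Log the function's name and address the first time it is reached. */
static void log_fnaddr_once(int *done, const char *name, void (*fn)(void))
{
    if (!*done) {
        *done = 1;
        log_count++;
        printf("%s\t0x%lx\n", name, (unsigned long)(uintptr_t)fn);
    }
}

static void demo(void)
{
    static int logged; /* zero-initialised; set on the first call */
    log_fnaddr_once(&logged, __func__, demo);
}
```

Calling demo() several times produces a single log line, since the flag stays set after the first call.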
You could start each function with a call to the same dummy function like:
void identifyFunction( unsigned int identifier)
{
}
Each of your functions would call the identifyFunction function with a different parameter (1, 2, 3, ...). This will not give you a magic map file, but when you inspect the code dump you should be able to quickly find out where identifyFunction is, because there will be lots of jumps to that address. Next, scan for those jumps and check the code before each jump to see what parameter is passed. Then you can make your own map file. With some scripting this should be fairly automatic.
