How to create an interrupt stack? - c

I want my interrupt service routine to use a different stack (possibly one of its own) and not the stack of the thread it interrupted.
thread_entry (){
do_something();
--> Interrupt occurs
do_otherstuff();
}
void interrupt_routine ()
{
uint8_t read_byte; // I don't want this to be part of caller thread's stack
read_byte= hw_read();
}
Is this possible, and if so, how can I achieve it?

The stacks required for the OS and for interrupt handlers are set up at initialization, in architecture-specific code. ARM processors, for example, have a distinct (banked) R13 that is used while the processor is in interrupt mode, and this register is also initialized at boot. What problem do you want to address with this design?

The GNU C library for Linux has methods to control the stack in which the signal executes. Refer to the documentation for full details.
The basic idea is that you allocate memory for the stack and then call the function
sigaltstack()
to specify that this stack is available for signal handling. You then use the
sigaction()
function to register a handler for a particular signal and specify the flag value
SA_ONSTACK
so that this handler runs on the alternate stack.
Here is a code snippet showing the pattern, "borrowed" from the Linux Programming Interface examples:
sigstack.ss_sp = malloc(SIGSTKSZ);
if (sigstack.ss_sp == NULL)
errExit("malloc");
sigstack.ss_size = SIGSTKSZ;
sigstack.ss_flags = 0;
if (sigaltstack(&sigstack, NULL) == -1)
errExit("sigaltstack");
printf("Alternate stack is at %10p-%p\n",
sigstack.ss_sp, (char *) sbrk(0) - 1);
sa.sa_handler = sigsegvHandler; /* Establish handler for SIGSEGV */
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_ONSTACK; /* Handler uses alternate stack */
if (sigaction(SIGSEGV, &sa, NULL) == -1)
errExit("sigaction");

Here's a simple x86 inline assembly implementation. You have a wrapper function which changes the stack, and calls your real routine.
enum { interrupt_stack_size = 4096 }; // a constant expression, so it is valid as an array size in C
uint8_t interrupt_stack[interrupt_stack_size];
void interrupt_routine_wrap()
{
static int thread_esp;
// Stack grows towards lower addresses, so start at the highest address (the end of the array)
static int irq_esp = (int) interrupt_stack + interrupt_stack_size;
// Store the old esp
asm mov dword ptr thread_esp, esp;
// Set the new esp
asm mov esp, dword ptr irq_esp;
// Execute the real interrupt routine
interrupt_routine();
// Restore old esp
asm mov esp, dword ptr thread_esp;
}
I'm completely ignoring the segment register here (ss), but different memory models may need to store that along with sp.
You can get rid of the inline assembly by using setjmp/longjmp to read/write all registers. That's a more portable way to do it.
Also note that I'm not preserving any registers here, and inline assembly may confuse the compiler. It may be worth adding a pusha/popa pair around the wrapper routine. The compiler may do this for you if you declare the function as an interrupt handler. Check the resulting binary to be certain.
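The "start at the end of the array" computation from the wrapper can be sketched in portable C. This is only an illustration under assumptions: the function name initial_stack_pointer is hypothetical, uintptr_t is used instead of int so the address is not truncated on 64-bit targets, and the required alignment (for example 16 bytes under the x86-64 System V ABI) depends on the platform and is passed in by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the initial stack pointer for a downward-growing stack that
 * occupies [base, base + size): the end address rounded down to `align`,
 * which must be a power of two. */
uintptr_t initial_stack_pointer(void *base, size_t size, uintptr_t align)
{
    uintptr_t end = (uintptr_t)base + size;  /* one past the last byte */
    return end & ~(align - 1);               /* round down to the alignment */
}
```

The value returned is what the wrapper would load into the stack pointer before calling the real routine; the first push then writes just below it, inside the array.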

Related

How do I properly implement threads in a Windows kernel driver?

I am trying to learn how to code Windows kernel drivers.
In my driver I have two threads, which are created at some point with PsCreateSystemThread.
I have a global variable called Kill which signals the threads to terminate, like this:
VOID AThread(IN PVOID Context)
{
for (;;)
{
if(Kill == TRUE)
break;
KmWriteProcessMemory(rProcess, &NewValue, dwAAddr, sizeof(NewValue));
}
PsTerminateSystemThread(STATUS_SUCCESS);
}
In my unload function I set Kill = TRUE:
VOID f_DriverUnload(PDRIVER_OBJECT pDriverObject)
{
Kill = TRUE;
IoDeleteSymbolicLink(&SymLinkName);
IoDeleteDevice(pDeviceObject);
DbgPrint("Driver Unloaded successfully..\r\n");
}
Most of the time there's no problem, but sometimes the machine crashes when I try to unload the driver. It happens more frequently when the threads use some kind of sleep function, so I'm assuming it crashes because the threads have not yet terminated when the driver is unloaded.
I'm not too sure how to use synchronisation and such, and I can't find much clear information out there. So how do I properly implement threads and ensure they have terminated before the driver is unloaded?
Once the thread is created, PsCreateSystemThread gives you a HANDLE (threadHandle). You then need to convert this handle to a PETHREAD (ThreadObject):
ObReferenceObjectByHandle(threadHandle,
THREAD_ALL_ACCESS,
NULL,
KernelMode,
&ThreadObject,
NULL );
and close threadHandle:
ZwClose(threadHandle);
When you want to stop the thread, set the flag and wait for thread completion:
Kill = TRUE;
KeWaitForSingleObject(ThreadObject,
Executive,
KernelMode,
FALSE,
NULL );
ObDereferenceObject(ThreadObject);
After that, the f_DriverUnload function may return.
You can see all of this here: https://github.com/Microsoft/Windows-driver-samples/tree/master/general/cancel/sys
See the cancel.h and cancel.c files. Note that this sample uses a semaphore instead of a global flag to stop the thread.
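The "set the flag, then wait for the thread to finish, then tear down" order can be tried out in user mode. The following is only an analogy using POSIX threads and C11 atomics, not driver code: pthread_join stands in for KeWaitForSingleObject on the thread object, an atomic flag stands in for Kill, and the names stop_requested, worker and start_and_stop_worker are made up for this sketch.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_requested;  /* plays the role of the Kill flag */
static atomic_long iterations;      /* stand-in for the thread's real work */

static void *worker(void *context)
{
    (void)context;
    while (!atomic_load(&stop_requested))
        atomic_fetch_add(&iterations, 1);
    return NULL;                    /* returning ends the thread */
}

/* Starts the worker, asks it to stop, and waits until it has exited.
 * Returns 0 on success; teardown is safe only after this returns. */
int start_and_stop_worker(void)
{
    pthread_t thread;

    atomic_store(&stop_requested, false);
    if (pthread_create(&thread, NULL, worker, NULL) != 0)
        return -1;

    atomic_store(&stop_requested, true);  /* like Kill = TRUE */
    if (pthread_join(thread, NULL) != 0)  /* like waiting on ThreadObject */
        return -1;
    return 0;
}
```

The point of the sketch is the ordering: resources the worker uses are released only after the join returns, just as IoDeleteDevice and the end of f_DriverUnload should come only after the wait on the thread object.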
When you create a thread that runs code from your driver, the driver must of course not be unloaded until that thread has exited. To guarantee this, call ObfReferenceObject on your driver object before creating the thread. If thread creation fails, call ObfDereferenceObject; otherwise ObfDereferenceObject must be called when the thread exits. The problem is how, and from where, to make that call. Calling ObfDereferenceObject at the end of the thread routine makes no sense: the driver can be unloaded inside ObfDereferenceObject, and the call would then return to memory that no longer exists. Ideally external code (Windows itself) would make this call just after the thread routine returns.
IoAllocateWorkItem is a good example. A work item is like a thread, and the driver must not be unloaded until WorkerRoutine returns. Here the system takes care of it: we pass a DeviceObject to IoAllocateWorkItem ("Pointer to the caller's driver object or to one of the caller's device objects"), the system references this object (device or driver) when we call IoQueueWorkItem, and this guarantees that the driver will not be unloaded while WorkerRoutine executes. When the routine returns, Windows calls ObfDereferenceObject for the passed device or driver object. That is safe, because control then returns to system kernel code (not to the driver). Unfortunately, PsCreateSystemThread does not take a pointer to a driver object and does not implement this functionality.
Another good example is FreeLibraryAndExitThread. A driver is in fact a kernel-mode DLL, which can be loaded and unloaded, and FreeLibraryAndExitThread implements exactly the functionality we need, but only for user-mode DLLs. Again, there is no such API in kernel mode.
A solution is possible anyway: jump (not call) to ObfDereferenceObject yourself at the end of thread execution. This requires assembler code; the trick cannot be done in C/C++.
First, declare a pointer to the driver object as a global variable; we initialize it to a valid value in the driver entry point.
extern "C" PVOID g_DriverObject;
Then add some macros for obtaining the mangled C++ names, which are needed in the asm file:
#if 0
#define __ASM_FUNCTION __pragma(message(__FUNCDNAME__" proc\r\n" __FUNCDNAME__ " endp"))
#define _ASM_FUNCTION {__ASM_FUNCTION;}
#define ASM_FUNCTION {__ASM_FUNCTION;return 0;}
#define CPP_FUNCTION __pragma(message("extern " __FUNCDNAME__ " : PROC ; " __FUNCSIG__))
#else
#define _ASM_FUNCTION
#define ASM_FUNCTION
#define CPP_FUNCTION
#endif
In C++ we declare two functions for the thread:
VOID _AThread(IN PVOID Context)_ASM_FUNCTION;
VOID __fastcall AThread(IN PVOID Context)
{
CPP_FUNCTION;
// some code here
// but not call PsTerminateSystemThread !!
}
(Don't forget __fastcall on AThread; it is needed on x86.)
Now we create the thread with the following code:
ObfReferenceObject(g_DriverObject);
HANDLE hThread;
if (0 > PsCreateSystemThread(&hThread, 0, 0, 0, 0, _AThread, ctx))
{
ObfDereferenceObject(g_DriverObject);
}
else
{
NtClose(hThread);
}
So the thread entry point is _AThread, which is implemented in the asm file. Before creating the thread you call ObfReferenceObject(g_DriverObject). _AThread calls your actual thread implementation, AThread, in C++. AThread finally returns to _AThread, which is why it must not call PsTerminateSystemThread (calling this API is optional anyway: when the thread routine returns control to the system, it is called automatically). At the end, _AThread dereferences g_DriverObject and returns to the system.
So the main trick is in the asm files. Here are the two versions, for x86 and x64:
x86:
.686p
extern _g_DriverObject:DWORD
extern __imp_@ObfDereferenceObject@4:DWORD
extern ?AThread@@YIXPAX@Z : PROC ; void __fastcall AThread(void *)
_TEXT segment
?_AThread@@YGXPAX@Z proc
pop ecx
xchg ecx,[esp]
call ?AThread@@YIXPAX@Z
mov ecx,_g_DriverObject
jmp __imp_@ObfDereferenceObject@4
?_AThread@@YGXPAX@Z endp
_TEXT ends
END
x64:
extern g_DriverObject:QWORD
extern __imp_ObfDereferenceObject:QWORD
extern ?AThread@@YAXPEAX@Z : PROC ; void __cdecl AThread(void *)
_TEXT segment 'CODE'
?_AThread@@YAXPEAX@Z proc
sub rsp,28h
call ?AThread@@YAXPEAX@Z
add rsp,28h
mov rcx,g_DriverObject
jmp __imp_ObfDereferenceObject
?_AThread@@YAXPEAX@Z endp
_TEXT ENDS
END

How to save the context execution of a newly created user thread, Linux 64 to structure in C?

I am trying to implement a new user thread management library similar to the original pthread but only in C. Only the context switch should be assembler.
Looks like I am missing something fundamentally.
I have the following structure for the context execution:
struct exec_ctx {
uint64_t rbp;
uint64_t r15;
uint64_t r14;
uint64_t r13;
uint64_t r12;
uint64_t r11;
uint64_t r10;
uint64_t r9;
uint64_t r8;
uint64_t rsi;
uint64_t rdi;
uint64_t rdx;
uint64_t rcx;
uint64_t rbx;
uint64_t rip;
}__attribute__((packed));
I create a new thread structure and need to put the registers into the fields of this execution-context structure. How can I do that in C? Everything I find talks only about setcontext and getcontext, but those are not an option here.
Also, the only hint I received is that I need some kind of "dump stack" function in the create function, and I am not sure how to do that. Please advise how to do it or where I can read further.
Thanks in advance!
I started with:
char *stack;
stack = malloc(StackSize);
if (!stack)
return -1;
*(uint64_t *)&stack[StackSize - 8] = (uint64_t)stop;
*(uint64_t *)&stack[StackSize - 16] = (uint64_t)f;
pet_thread->ctx.rip = (uint64_t)&stack[StackSize - 16];
pet_thread->thread_state = Ready;
This is how I put a pointer to the thread function on the top of the stack in order to call the thread more easily.
First of all, you do not need to save all the registers. Since your context switch is implemented as a function, any register that the ABI defines as "caller saved" or "clobbered" you can safely leave out. The code generated by the C compiler will assume it might change.
Since this is a school assignment I will not give you the code to do this. I will give you the outline.
Your function needs both to save the registers to the struct of the outgoing micro-thread and to load the registers of the incoming micro-thread. The reason is that logically you always have exactly one register set "in effect". So your function needs two arguments: the struct for the outgoing micro-thread and the one for the incoming.
Those two arguments are passed in two registers, which you do not need to save. So your code should have the following structure (assuming your structure, which, as I said, is more complete than it needs to be):
# save context (Intel operand order: destination first)
mov [rdi], rbp
add rdi, 8
...
# load context
mov rbp, [rsi]
add rsi, 8
...
If you place that in a separate .S file, you'll make sure that the C compiler will not add anything or optimize anything.
This is not the cleanest or most efficient solution, but it is the simplest.
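To make the "you do not need to save all the registers" point concrete, here is a sketch of a reduced context structure. It assumes the System V AMD64 ABI used on 64-bit Linux, where rbx, rbp, r12 to r15 and rsp are the registers a called function must preserve. The names min_ctx, ctx_switch and the CTX_ constants are hypothetical, and the sketch additionally assumes the switch routine is an ordinary function, so the resume address sits on the saved stack as its return address and needs no rip field.

```c
#include <stddef.h>
#include <stdint.h>

/* Callee-saved state under the System V AMD64 ABI; every other register may
 * be clobbered by any function call, including the switch routine itself. */
struct min_ctx {
    uint64_t rsp;   /* stack pointer of the suspended thread */
    uint64_t rbp;
    uint64_t rbx;
    uint64_t r12;
    uint64_t r13;
    uint64_t r14;
    uint64_t r15;
};

/* Offsets the assembly routine would hard-code. */
enum {
    CTX_RSP = offsetof(struct min_ctx, rsp),
    CTX_RBX = offsetof(struct min_ctx, rbx),
    CTX_R15 = offsetof(struct min_ctx, r15),
    CTX_SIZE = sizeof(struct min_ctx)
};

/* Hypothetical prototype for the routine implemented in the .S file:
 * saves into `out`, loads from `in`, and returns on the new stack. */
void ctx_switch(struct min_ctx *out, const struct min_ctx *in);
```

Keeping the offsets next to the struct definition makes it harder for the C layout and the assembly to drift apart.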

Hooking a function I don't know the parameters to

Let's say there is a DLL A.DLL with a known entry point DoStuff that I have in some way hooked with my own DLL fakeA.dll, such that the system calls my DoStuff instead. How do I write this function so that it can then call the same entry point of the hooked DLL (A.DLL) without knowing the arguments of the function? I.e. my function in fakeA.DLL would look something like
LONG DoStuff(
// don't know what to put here
)
{
FARPROC pfnHooked;
HINSTANCE hHooked;
LONG lRet;
// get hooked library and desired function
hHooked = LoadLibrary("A.DLL");
pfnHooked = GetProcAddress(hHooked, "DoStuff");
// how do I call the desired function without knowing the parameters?
lRet = pfnHooked( ??? );
return lRet;
}
My current thinking is that the arguments are on the stack so I'm guessing I would have to have a sufficiently large stack variable (a big ass struct for example) to capture whatever the arguments are and then just pass it along to pfnHooked? I.e.
// actual arg stack limit is >1MB but we'll assume 1024 bytes is sufficient
typedef struct { char unknownData[1024]; } ARBITARY_ARG;
ARBITARY_ARG DoStuff(ARBITARY_ARG args){
ARBITARY_ARG aRet;
...
aRet = pfnHooked(args);
return aRet;
}
Would this work? If so, is there a better way?
UPDATE: After some rudimentary (and non-conclusive) testing passing in the arbitrary block as arguments DOES work (which is not surprising, as the program will just read what it needs off the stack). However collecting the return value is harder as if it's too large it can cause an access violation. Setting the arbitrary return size to 8 bytes (or maybe 4 for x86) may be a solution to most cases (including void returns) however that's still guesswork. If I had some way of knowing the return type from the DLL (not necessarily at runtime) that would be grand.
This should be a comment, but the meta answer is yes: you can hook the function without knowing the calling convention and arguments, on an x64/x86 platform. Can it be done purely in C? No, it also needs a good understanding of the various calling conventions and of assembly programming. The hooking framework will have some of its bits written in assembly.
Most hooking frameworks do this by creating a trampoline that redirects the execution flow from the called function's preamble to stub code that is generally independent of the function being hooked. In user mode you are guaranteed that a stack is always present, so you can push your own local variables onto the same stack as long as you pop them and restore the stack to its original state.
You don't really need to copy the existing arguments to your own stack variable; you can just inspect the stack. Definitely read up on calling conventions, and on how stacks are constructed on different architectures for the various types of invocation in assembly, before you attempt anything.
Yes, it is possible to do generic hooking 100% correctly: one common hook for multiple functions with different argument counts and calling conventions, on both x86 and x64 (amd64) platforms.
It does require small asm stubs, which of course differ between x86 and x64, but they are very small, only several lines of code: two stub procedures, one to filter the pre-call and one for the post-call. Most of the implementation (95%+) is platform independent and written in C++ (this can of course also be done in C, but compared with C++ the C source will be larger, uglier and harder to implement).
In my solution we allocate a small executable block of code for every hooked API (one block per hooked API). The block stores the function name, the original address (or the address to transfer control to after the pre-call; this depends on the hooking method) and one relative call instruction to the common asm pre-call stub. The magic of this call is not only that it transfers control to the common stub, but that the return address on the stack points to the block itself (with some offset, but if we use C++ and inheritance it points exactly to a base class from which we derive our executable block class). As a result, the common pre-call stub knows which API call is being hooked and passes this information to the common C++ handler.
One note: because on x64 a relative call can only reach the range [rip-0x80000000, rip+0x7fffffff], these code blocks need to be declared (allocated) inside our PE in a separate bss section, and that section must be marked RWE. We cannot simply use VirtualAlloc to allocate the storage, because the returned address can be too far from our common pre-call stub.
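The range restriction can be checked with a few lines of arithmetic. The sketch below (the function name call_rel32 is made up for illustration) computes the 32-bit operand of a 5-byte call rel32 instruction (opcode 0xE8), which is measured from the address of the instruction that follows the call, and reports when the target is out of reach:

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes the rel32 operand for a 5-byte `call rel32` (opcode 0xE8) located
 * at `call_addr` and targeting `target`. Returns false if out of range. */
bool call_rel32(uint64_t call_addr, uint64_t target, int32_t *rel)
{
    /* The displacement is relative to the address of the next instruction. */
    int64_t diff = (int64_t)(target - (call_addr + 5));
    if (diff < INT32_MIN || diff > INT32_MAX)
        return false;
    *rel = (int32_t)diff;
    return true;
}
```

For example, a call at 0x1000 to a stub at 0x2000 gets the operand 0xFFB, while a target 4 GB away is rejected, which is the situation that can arise when the block comes from an arbitrary VirtualAlloc address.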
The common asm pre-call stub must save the rcx, rdx, r8 and r9 registers on x64 (this is absolutely mandatory) and the ecx and edx registers on x86. The x86 registers matter when the function uses the __fastcall calling convention. The Windows API almost never uses __fastcall, however: only a few __fastcall functions exist among thousands of Win32 APIs (to verify this and find those functions, go to the LIB folder and search for the string __imp_@, which is the common __fastcall prefix). The stub then calls the common C++ handler, which must return to the stub the address of the original function (where control is to be transferred). The stub restores the rcx, rdx, r8, r9 (or ecx, edx) registers and jumps (does not call!) to this address.
If we only want to filter the pre-call, this is all we need. In most cases, however, we also need to filter (hook) the post-call, to view or modify the function's return value and out parameters. This is also possible, but needs a little more coding.
To hook the post-call we obviously must replace the return address of the hooked API. But what do we change the return address to, and where do we save the original return address? We cannot use a global variable for this. We cannot even use a thread-local (__declspec(thread) or thread_local), because the call can be recursive. We cannot use a volatile register (it changes during the API call), and we cannot use a non-volatile register, because we would have to save its old value in order to restore it later, which raises the same question: where?
There is only one (and nice) solution: allocate a small block of executable memory (RWE) containing one relative call instruction to the common post-call asm stub, plus some data: the saved original return address, the function parameters (for checking out parameters in the post handler) and the function name.
Here again there is an issue on x64: this block must not be too far from the common post-call stub (+/- 2GB), so it is best to allocate these stubs in the same separate .bss section as the pre-call stubs.
How many of these ret-stubs are needed? One per API call (if we want to control the post-call), so no more than the number of API calls active at any one time. Usually, say, 256 pre-allocated blocks are more than enough. Even if we fail to allocate a block in the pre-call, we merely lose control of that post-call; nothing crashes. And we may want to control the post-call only for some hooked APIs, not all of them.
To allocate and free these blocks very fast and in an interlocked way, build stack semantics over them: allocate with an interlocked pop and free with an interlocked push. Pre-initialize the blocks (the call instruction) once at the beginning, while pushing them all onto the stack, so they do not have to be reinitialized on every pre-call.
The common post-call stub in asm is very simple; here we do not need to save any registers. We simply call the C++ post handler with the address of the block (we pop it from the stack, where the call instruction in the block left it) and with the original return value (rax or eax). Strictly speaking, an API function can return a pair rax+rdx or eax+edx, but 99.9%+ of Windows APIs return their value in a single register, and I assume we will hook only such APIs. If desired, the code can be adjusted slightly to handle this too (in most cases it is simply not needed).
The C++ post-call handler restores the original return address (using _AddressOfReturnAddress()), can log the call and/or modify out parameters, and finally returns to the original caller of the API. Whatever our handler returns becomes the final return value of the API call. Usually we must return the original value.
C++ code:
#if 0
#define __ASM_FUNCTION __pragma(message(__FUNCDNAME__" proc\r\n" __FUNCDNAME__ " endp"))
#define _ASM_FUNCTION {__ASM_FUNCTION;}
#define ASM_FUNCTION {__ASM_FUNCTION;return 0;}
#define CPP_FUNCTION __pragma(message("extern " __FUNCDNAME__ " : PROC ; " __FUNCTION__))
#else
#define _ASM_FUNCTION
#define ASM_FUNCTION
#define CPP_FUNCTION
#endif
class CODE_STUB
{
#ifdef _WIN64
PVOID pad;
#endif
union
{
DWORD code;
struct
{
BYTE cc[3];
BYTE call;
};
};
int offset;
public:
void Init(PVOID stub)
{
// int3; int3; int3; call stub
code = 0xe8cccccc;
offset = RtlPointerToOffset(&offset + 1, stub);
C_ASSERT(sizeof(CODE_STUB) == RTL_SIZEOF_THROUGH_FIELD(CODE_STUB, offset));
}
PVOID Function()
{
return &call;
}
// implemented in .asm
static void __cdecl retstub() _ASM_FUNCTION;
static void __cdecl callstub() _ASM_FUNCTION;
};
struct FUNC_INFO
{
PVOID OriginalFunc;
PCSTR Name;
void* __fastcall OnCall(void** stack);
};
struct CALL_FUNC : CODE_STUB, FUNC_INFO
{
};
C_ASSERT(FIELD_OFFSET(CALL_FUNC,OriginalFunc) == sizeof(CODE_STUB));
struct RET_INFO
{
union
{
struct
{
PCSTR Name;
PVOID params[7];
};
SLIST_ENTRY Entry;
};
INT_PTR __fastcall OnCall(INT_PTR r);
};
struct RET_FUNC : CODE_STUB, RET_INFO
{
};
C_ASSERT(FIELD_OFFSET(RET_FUNC, Entry) == sizeof(CODE_STUB));
#pragma bss_seg(".HOOKS")
RET_FUNC g_rf[1024];//max call count
CALL_FUNC g_cf[16];//max hooks count
#pragma bss_seg()
#pragma comment(linker, "/SECTION:.HOOKS,RWE")
class RET_FUNC_Manager
{
SLIST_HEADER _head;
public:
RET_FUNC_Manager()
{
PSLIST_HEADER head = &_head;
InitializeSListHead(head);
RET_FUNC* p = g_rf;
DWORD n = RTL_NUMBER_OF(g_rf);
do
{
p->Init(CODE_STUB::retstub);
InterlockedPushEntrySList(head, &p++->Entry);
} while (--n);
}
RET_FUNC* alloc()
{
return static_cast<RET_FUNC*>(CONTAINING_RECORD(InterlockedPopEntrySList(&_head), RET_INFO, Entry));
}
void free(RET_INFO* p)
{
InterlockedPushEntrySList(&_head, &p->Entry);
}
} g_rfm;
void* __fastcall FUNC_INFO::OnCall(void** stack)
{
CPP_FUNCTION;
// in case __fastcall function in x86 - param#1 at stack[-1] and param#2 at stack[-2]
// this need for filter post call only
if (RET_FUNC* p = g_rfm.alloc())
{
p->Name = Name;
memcpy(p->params, stack, sizeof(p->params));
*stack = p->Function();
}
return OriginalFunc;
}
INT_PTR __fastcall RET_INFO::OnCall(INT_PTR r)
{
CPP_FUNCTION;
*(void**)_AddressOfReturnAddress() = *params;
PCSTR name = Name;
char buf[8];
if (IS_INTRESOURCE(name))
{
sprintf(buf, "#%04x", (ULONG)(ULONG_PTR)name), name = buf;
}
DbgPrint("%p %s(%p, %p, %p ..)=%p\r\n", *params, name, params[1], params[2], params[3], r);
g_rfm.free(this);
return r;
}
struct DLL_TO_HOOK
{
PCWSTR szDllName;
PCSTR szFuncNames[];
};
void DoHook(DLL_TO_HOOK** pp)
{
PCSTR* ppsz, psz;
DLL_TO_HOOK *p;
ULONG n = RTL_NUMBER_OF(g_cf);
CALL_FUNC* pcf = g_cf;
while (p = *pp++)
{
if (HMODULE hmod = LoadLibraryW(p->szDllName))
{
ppsz = p->szFuncNames;
while (psz = *ppsz++)
{
if (pcf->OriginalFunc = GetProcAddress(hmod, psz))
{
pcf->Name = psz;
pcf->Init(CODE_STUB::callstub);
// do hook: pcf->OriginalFunc -> pcf->Function() - code for this skipped
DbgPrint("hook: (%p) <- (%p)%s\n", pcf->Function(), pcf->OriginalFunc, psz);
if (!--n)
{
return;
}
pcf++;
}
}
}
}
}
asm x64 code:
extern ?OnCall@FUNC_INFO@@QEAAPEAXPEAPEAX@Z : PROC ; FUNC_INFO::OnCall
extern ?OnCall@RET_INFO@@QEAA_J_J@Z : PROC ; RET_INFO::OnCall
?retstub@CODE_STUB@@SAXXZ proc
pop rcx
mov rdx,rax
call ?OnCall@RET_INFO@@QEAA_J_J@Z
?retstub@CODE_STUB@@SAXXZ endp
?callstub@CODE_STUB@@SAXXZ proc
mov [rsp+10h],rcx
mov [rsp+18h],rdx
mov [rsp+20h],r8
mov [rsp+28h],r9
pop rcx
mov rdx,rsp
sub rsp,18h
call ?OnCall@FUNC_INFO@@QEAAPEAXPEAPEAX@Z
add rsp,18h
mov rcx,[rsp+8]
mov rdx,[rsp+10h]
mov r8,[rsp+18h]
mov r9,[rsp+20h]
jmp rax
?callstub@CODE_STUB@@SAXXZ endp
asm x86 code
extern ?OnCall@FUNC_INFO@@QAIPAXPAPAX@Z : PROC ; FUNC_INFO::OnCall
extern ?OnCall@RET_INFO@@QAIHH@Z : PROC ; RET_INFO::OnCall
?retstub@CODE_STUB@@SAXXZ proc
pop ecx
mov edx,eax
call ?OnCall@RET_INFO@@QAIHH@Z
?retstub@CODE_STUB@@SAXXZ endp
?callstub@CODE_STUB@@SAXXZ proc
xchg [esp],ecx
push edx
lea edx,[esp + 8]
call ?OnCall@FUNC_INFO@@QAIPAXPAPAX@Z
pop edx
pop ecx
jmp eax
?callstub@CODE_STUB@@SAXXZ endp
You may ask how I know decorated names like ?OnCall@FUNC_INFO@@QAIPAXPAPAX@Z. Look at the several macros at the very beginning of the C++ code: compile once with #if 1 and look in the output window. (You will probably need to use the names from your own build rather than mine, because the decoration can differ.)
And how do you call void DoHook(DLL_TO_HOOK** pp)? Like this:
DLL_TO_HOOK dth_kernel32 = { L"kernel32", { "VirtualAlloc", "VirtualFree", "HeapAlloc", 0 } };
DLL_TO_HOOK dth_ntdll = { L"ntdll", { "NtOpenEvent", 0 } };
DLL_TO_HOOK* ghd[] = { &dth_ntdll, &dth_kernel32, 0 };
DoHook(ghd);
Lets say there is a DLL A.DLL with a known entry point DoStuff
If the entry point DoStuff is known it ought to be documented somewhere, at the very least in some C header file. So a possible approach might be to parse that header to get its signature (i.e. the C declaration of DoStuff). Maybe you could fill some database with the signature of all functions declared in all system header files, etc... Or perhaps use debug information if you have it.
If you call some function (in C) and don't give all the required parameters, the calling convention & ABI will still be used, and these (missing) parameters get garbage values (if the calling convention defines that parameter to be passed in a register, the garbage inside that register; if the convention defines that parameter to be passed on the call stack, the garbage inside that particular call stack slot). So you are likely to crash and surely have some undefined behavior (which is scary, since your program might seem to work but still be very wrong).
However, look also into libffi. Once you know (at runtime) what to pass to some arbitrary function, you can construct a call to it passing the right number and types of arguments.
My current thinking is that the arguments are on the stack
I think it is wrong (at least on many x86-64 systems). Some arguments are passed thru registers. Read about x86 calling conventions.
Would this work?
No, it won't work because some arguments are passed thru registers, and because the calling convention depends upon the signature of the called function (floating point values might be passed in different registers, or always on the stack; variadic functions have specific calling conventions; etc....)
BTW, some recent C optimizing compilers are able to do tail call optimizations, which might complicate things.
There is no standard way of doing this, because a lot of things, such as calling conventions and pointer sizes, matter when passing arguments. You will have to read the ABI for your platform and write an implementation, which I fear again won't be possible in C. You will need some inline assembly.
One simple way to do it would be (for a platform like X86_64) -
MyDoStuff:
jmpq *__real_DoStuff
This hook does nothing but jump to the real function. If you want to do anything useful while hooking, you will have to save and restore some registers around the call (again, what to save depends on the ABI).

Register pointer in creating threads in xv6

I want to create a thread in xv6 by using a system call "clone()", but I am confused about the stack creation: to create a thread I need to set up the corresponding register pointers, such as ebp, esp and eip, and I don't know how to set their values.
Here is the code of clone() in xv6. I don't know why the register pointers need to be set like this:
int clone(void(*fcn)(void*), void *arg, void*stack){
int i, pid;
struct proc *np;
int *ustack = stack + PGSIZE - sizeof(void*);
//allocate process.
if((np=allocproc()) == 0)
return -1;
//copy process state from p
np->pgdir = proc->pgdir;
np->sz = proc->sz;
np->parent = 0;
np->pthread = proc;
*np->tf = *proc->tf;
np->ustack = stack;
//initialize stack variables
//void *stackArg, *stackRet;
//stackRet = stack + PGSIZE -2*sizeof(void*);
//*(uint *)stackRet = 0xffffffff;
//stackArg = stack + PGSIZE -sizeof(void*);
//*(uint *)stackArg = (uint)arg;
*ustack = (int) arg;
*(ustack - 1) = 0xffffffff;
*(ustack - 2) = 0xffffffff;
//Set stack pointer register
np->tf->eax = 0;
np->tf->esp = (int) ustack - sizeof(void*);
np->tf->ebp = np->tf->esp;
np->tf->eip = (int)fcn;
for(i = 0; i < NOFILE; i++) {
if(proc->ofile[i])
np->ofile[i] = filedup(proc->ofile[i]);
}
np->cwd = idup(proc->cwd);
np->state = RUNNABLE;
safestrcpy(np->name, proc->name, sizeof(proc->name));
pid = np->pid;
return pid;
}
You don't set these registers -- clone sets them for you. You need to provide a function (which clone uses to initialize ip) and a stack (which clone uses to initialize sp).
The function pointer is pretty straightforward (it's just a C function pointer), but the stack is trickier. For the clone implementation you show, you need to allocate some memory and provide a pointer PGSIZE below the end of that block. Linux's clone call is similar, but slightly different (you need to provide a pointer to the end of the block). If you want to catch stack overflows, you'll need to do more work (probably allocating a read/write-protected guard page below the stack).
Of all the register values you set, the only useful ones are:
eip - Tells the thread where to start executing from when it returns to userspace
esp - This points to the top of the stack. This means that, if you did this right, the 4 bytes stored at the top of the stack should contain your return address
eax is not really useful here seeing as the thread jumps to a new context and not the one where it was created. Otherwise, eax will store the return value of the last system call. See the implementation of fork if you are still confused about this one.
ebp is not manipulated by you, but rather by the x86 function call conventions and is usually set to the value of esp when the function is called. As such you will usually see this sort of thing in the disassembly of most function calls
push ebp ; Preserve current frame pointer
mov ebp, esp ; Create new frame pointer pointing to current stack top
ebp is also useful for stack tracing because it stores the top of the previous function's stack before it is then changed to point to the current stack top
You don't need this line: *(ustack - 2) = 0xffffffff;
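The layout that clone builds can be modelled on any machine with an array of 32-bit words. This is a sketch for illustration only (build_initial_stack and FAKE_RETURN are made-up names, and the array stands in for the top of the user stack page): under the 32-bit x86 C calling convention a function expects its return address at [esp] and its first argument at [esp + 4] on entry, so only two words need to be written.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_RETURN 0xffffffffu  /* bogus return address, as in the question */

/* Models the top of a 32-bit user stack as an array of words and returns the
 * index that esp should point at when the thread starts in fcn(arg). */
size_t build_initial_stack(uint32_t *stack, size_t nwords, uint32_t arg)
{
    stack[nwords - 1] = arg;          /* [esp + 4]: first argument of fcn */
    stack[nwords - 2] = FAKE_RETURN;  /* [esp]: return address seen by fcn */
    return nwords - 2;
}
```

In the question's code, ustack corresponds to the last word and esp to the word just below it, which is why the write to *(ustack - 2) lands below esp and has no effect on the new thread.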

Address Error ISR

I am trying to run and debug a C program on a dsPIC30f3011 microcontroller. When I run my code in MPLAB, the code always tends to stop at this ISR and I am stuck with absolutely no output for any variables, with my code not even executing. It seems to be some kind of "trap" program that I assume is for catching simple mistakes (i.e. oscillator failures, etc.) I am using MPLabIDE v8.5, with an MPLab ICD3 in debug mode. It's worth mentioning that MPLAB shows that I am connected to both the target(dsPIC) and the ICD3. Can someone please give me a reason as to why this problem is occurring?
Here is the ISR:
void _ISR __attribute__((no_auto_psv))_AddressError(void)
{
INTCON1bits.ADDRERR = 0;
while(1);
}
Here is my code with initializations first, then PID use, then the DSP functions,
then the actual DSP header file where the syntax/algorithm is derived. There is also some sort of problem where I define DutyCycle.
///////////////////////////////Initializations/////////////////////////////////////////////
#include "dsp.h" //see bottom of program
tPID SPR4535_PID; // Declare a PID Data Structure named, SPR4535_PID, initialize the PID object
/* The SPR4535_PID data structure contains a pointer to derived coefficients in X-space and */
/* pointer to controller state (history) samples in Y-space. So declare variables for the */
/* derived coefficients and the controller history samples */
fractional abcCoefficient[3] __attribute__ ((space(xmemory))); // ABC Coefficients loaded from X memory
fractional controlHistory[3] __attribute__ ((space(ymemory))); // Control History loaded from Y memory
/* The abcCoefficients referenced by the SPR4535_PID data structure */
/* are derived from the gain coefficients, Kp, Ki and Kd */
/* So, declare Kp, Ki and Kd in an array */
fractional kCoeffs[] = {0,0,0};
//////////////////////////////////PID variable use///////////////////////////////
void ControlSpeed(void)
{
LimitSlew();
PID_CHANGE_SPEED(SpeedCMD);
if (timer3avg > 0)
ActualSpeed = SPEEDMULT/timer3avg;
else
ActualSpeed = 0;
max=2*(PTPER+1);
DutyCycle=Fract2Float(PID_COMPUTE(ActualSpeed))*max;
// Just make sure the speed that will be written to the PDC1 register is not greater than the PTPER register
if(DutyCycle>max)
DutyCycle=max;
else if (DutyCycle<0)
DutyCycle=0;
}
//////////////////////////////////PID functions//////////////////////////////////
void INIT_PID(int DESIRED_SPEED)
{
SPR4535_PID.abcCoefficients = &abcCoefficient[0]; //Set up pointer to derived coefficients
SPR4535_PID.controlHistory = &controlHistory[0]; //Set up pointer to controller history samples
PIDInit(&SPR4535_PID); //Clear the controller history and the controller output
kCoeffs[0] = KP; // Sets the K[0] coefficient to the KP
kCoeffs[1] = KI; // Sets the K[1] coefficient to the KI
kCoeffs[2] = KD; // Sets the K[2] coefficient to the Kd
PIDCoeffCalc(&kCoeffs[0], &SPR4535_PID); //Derive the a,b, & c coefficients from the Kp, Ki & Kd
SPR4535_PID.controlReference = DESIRED_SPEED; //Set the Reference Input for your controller
}
int PID_COMPUTE(int MEASURED_OUTPUT)
{
SPR4535_PID.measuredOutput = MEASURED_OUTPUT; // Records the measured output
PID(&SPR4535_PID); // Computes the control output
return SPR4535_PID.controlOutput; // Returns the computed control output
}
void PID_CHANGE_SPEED (int NEW_SPEED)
{
SPR4535_PID.controlReference = NEW_SPEED; // Changes the control reference to change the desired speed
}
/////////////////////////////////////dsp.h/////////////////////////////////////////////////
typedef struct {
fractional* abcCoefficients; /* Pointer to A, B & C coefficients located in X-space */
/* These coefficients are derived from */
/* the PID gain values - Kp, Ki and Kd */
fractional* controlHistory; /* Pointer to 3 delay-line samples located in Y-space */
/* with the first sample being the most recent */
fractional controlOutput; /* PID Controller Output */
fractional measuredOutput; /* Measured Output sample */
fractional controlReference; /* Reference Input sample */
} tPID;
/*...........................................................................*/
extern void PIDCoeffCalc( /* Derive A, B and C coefficients using PID gain values-Kp, Ki & Kd*/
fractional* kCoeffs, /* pointer to array containing Kp, Ki & Kd in sequence */
tPID* controller /* pointer to PID data structure */
);
/*...........................................................................*/
extern void PIDInit ( /* Clear the PID state variables and output sample*/
tPID* controller /* pointer to PID data structure */
);
/*...........................................................................*/
extern fractional* PID ( /* PID Controller Function */
tPID* controller /* Pointer to PID controller data structure */
);
The dsPIC traps don't offer much information free of charge, so I tend to augment the ISRs with a little assembly language pre-prologue. (Note that the Stack Error trap is a little ropey, as it uses RCALL and RETURN instructions when the stack is already out of order.)
/**
* \file trap.s
* \brief Used to provide a little more information during development.
*
* The trapPreprologue function is called on entry to each of the routines
* defined in traps.c. It looks up the stack to find the value of the IP
* when the trap occurred and stores it in the _errAddress memory location.
*/
.global __errAddress
.global __intCon1
.global _trapPreprologue
.section .bss
__errAddress: .space 4
__intCon1: .space 2
.section .text
_trapPreprologue:
; Disable maskable interrupts and save primary regs to shadow regs
bclr INTCON2, #15 ;global interrupt disable
push.s ;Switch to shadow registers
; Retrieve the ISR return address from the stack into w0:w1
sub w15, #4, w2 ;set W2 to the ISR.PC (SP = ToS-4)
mov [--w2], w0 ;get the ISR return address LSW (ToS-6) in w0
bclr w0, #0x0 ;mask out SFA bit (w0<0>)
mov [--w2], w1 ;get the ISR return address MSW (ToS-8) in w1
bclr w1, #0x7 ;mask out IPL<3> bit (w1<7>)
ze w1, w1 ;mask out SR<7:0> bits (w1<15..8>)
; Save it
mov #__errAddress, w2 ;Move address of __errAddress into w2
mov.d w0, [w2] ;save the ISR return address to __errAddress
; Copy the content of the INTCON1 SFR into memory
mov #__intCon1, w2 ;Move address of __intCon1 into w2
mov INTCON1, WREG ;Read the trap flags into w0 (WREG)
mov w0, [w2] ;save the trap flags to __intCon1
; Return to the 'C' handler
pop.s ;Switch back to primary registers
return
Then I keep all the trap ISRs in a single traps.c file that uses the pre-prologue in trap.s. Note that the actual traps may be different for your microcontroller - check the data sheet to see which are implemented.
/**
* \file traps.c
* \brief Micro-controller exception interrupt vectors.
*/
#include <stdint.h>
#include "traps.h" // Internal interface to the micro trap handling.
/* Access to immediate call stack. Implementation in trap.s */
extern volatile unsigned long _errAddress;
extern volatile unsigned int _intCon1;
extern void trapPreprologue(void);
/* Trap information, set by the traps that use them. */
static unsigned int _intCon2;
static unsigned int _intCon3;
static unsigned int _intCon4;
/* Protected functions exposed by traps.h */
void trapsInitialise(void)
{
_errAddress = 0;
_intCon1 = 0;
_intCon2 = 0;
_intCon3 = 0;
_intCon4 = 0;
}
/* Trap Handling */
// The trap routines call the _trapPreprologue assembly routine in trap.s
// to obtain the value of the PC when the trap occurred and store it in
// the _errAddress variable. They reset the interrupt source in the CPU's
// INTCON SFR and invoke the (#defined) vThrow macro to report the fault.
void __attribute__((interrupt(preprologue("rcall _trapPreprologue")),no_auto_psv)) _OscillatorFail(void)
{
INTCON1bits.OSCFAIL = 0; /* Clear the trap flag */
vThrow(_intCon1, _intCon2, _intCon3, _intCon4, _errAddress);
}
void __attribute__((interrupt(preprologue("rcall _trapPreprologue")),no_auto_psv)) _StackError(void)
{
INTCON1bits.STKERR = 0; /* Clear the trap flag */
vThrow(_intCon1, _intCon2, _intCon3, _intCon4, _errAddress);
}
void __attribute__((interrupt(preprologue("rcall _trapPreprologue")),no_auto_psv)) _AddressError(void)
{
INTCON1bits.ADDRERR = 0; /* Clear the trap flag */
vThrow(_intCon1, _intCon2, _intCon3, _intCon4, _errAddress);
}
void __attribute__((interrupt(preprologue("rcall _trapPreprologue")),no_auto_psv)) _MathError(void)
{
INTCON1bits.MATHERR = 0; /* Clear the trap flag */
vThrow(_intCon1, _intCon2, _intCon3, _intCon4, _errAddress);
}
void __attribute__((interrupt(preprologue("rcall _trapPreprologue")),no_auto_psv)) _DMACError(void)
{
INTCON1bits.DMACERR = 0; /* Clear the trap flag */
vThrow(_intCon1, _intCon2, _intCon3, _intCon4, _errAddress);
}
void __attribute__((interrupt(preprologue("rcall _trapPreprologue")),no_auto_psv)) _HardTrapError(void)
{
_intCon4 = INTCON4;
INTCON4 = 0; // Clear the hard trap register
_intCon2 = INTCON2;
INTCON2bits.SWTRAP = 0; // Make sure the software hard trap bit is clear
vThrow(_intCon1, _intCon2, _intCon3, _intCon4, _errAddress);
}
void __attribute__((interrupt(preprologue("rcall _trapPreprologue")),no_auto_psv)) _SoftTrapError(void)
{
_intCon3 = INTCON3;
INTCON3 = 0; // Clear the soft trap register
vThrow(_intCon1, _intCon2, _intCon3, _intCon4, _errAddress);
}
Implementation of the vThrow macro is up to you. However, it should not use the stack, as this may be unavailable (so no puts() debug calls!). During development, it would be reasonable to use a simple endless loop containing a NOP instruction that you can set a breakpoint on.
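A minimal development-time sketch of such a macro might look like the following. The capture variables `trapFlags`, `trapAddress` and `trapHold`, and the `TRAP_NOP()` placeholder, are hypothetical names introduced here for illustration; they are not part of the code above. On the target you would replace `TRAP_NOP()` with a real NOP (the Microchip device headers usually provide a `Nop()` macro for this). The macro makes no function calls, so it does not depend on a working stack.

```c
#include <stdint.h>

/* Hypothetical capture area, written once when a trap fires. */
static volatile uint16_t trapFlags[4];
static volatile uint32_t trapAddress;

/* Set to 0 from the debugger's watch window to step out of the loop. */
static volatile int trapHold = 1;

/* Assumption: replace with a real NOP on the target, e.g. Nop(). */
#define TRAP_NOP() ((void)0)

#define vThrow(ic1, ic2, ic3, ic4, addr)                  \
    do {                                                  \
        trapFlags[0] = (uint16_t)(ic1);                   \
        trapFlags[1] = (uint16_t)(ic2);                   \
        trapFlags[2] = (uint16_t)(ic3);                   \
        trapFlags[3] = (uint16_t)(ic4);                   \
        trapAddress  = (uint32_t)(addr);                  \
        while (trapHold) {                                \
            TRAP_NOP(); /* put the breakpoint here */     \
        }                                                 \
    } while (0)
```

Copying the arguments into statics first means the values are still visible in a watch window even if the debugger cannot unwind the stack.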
(In a production build, my vThrow macro logs the parameters into a reserved area of RAM that is excluded from being zeroed at start-up by the linker script, and resets the microcontroller. During start-up the program inspects the reserved area and if it is non-zero records the error event for diagnostics.)
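The production scheme described above could be sketched roughly as follows. This is only an illustration under assumptions: `TrapRecord`, `trapRecordStore`, `trapRecordFetch` and the magic value are hypothetical names, and the record must be placed in RAM that the start-up code does not clear (via the linker script as described, or a compiler attribute such as XC16's `persistent`). A magic marker is used here instead of a plain non-zero test so that power-up garbage is less likely to be mistaken for a record. The stores are written as a function for readability; in a real trap path you would probably inline them in the macro to avoid the call.

```c
#include <stdint.h>

#define TRAP_RECORD_MAGIC 0xA55Au /* arbitrary marker meaning "record valid" */

typedef struct {
    uint16_t magic;
    uint16_t intCon[4];
    uint32_t errAddress;
} TrapRecord;

/* Assumption: on the target this object lives in RAM that start-up code
   leaves untouched. On a host build it is an ordinary zeroed static. */
static volatile TrapRecord trapRecord;

/* Called from the trap path just before resetting the device. */
void trapRecordStore(uint16_t ic1, uint16_t ic2, uint16_t ic3,
                     uint16_t ic4, uint32_t errAddress)
{
    trapRecord.intCon[0]  = ic1;
    trapRecord.intCon[1]  = ic2;
    trapRecord.intCon[2]  = ic3;
    trapRecord.intCon[3]  = ic4;
    trapRecord.errAddress = errAddress;
    trapRecord.magic      = TRAP_RECORD_MAGIC; /* written last: record complete */
}

/* Called early in main(). Returns 1 and copies the record out if a trap was
   logged before the last reset, then clears it; returns 0 otherwise. */
int trapRecordFetch(TrapRecord *out)
{
    int i;

    if (trapRecord.magic != TRAP_RECORD_MAGIC) {
        return 0;
    }
    out->magic = trapRecord.magic;
    for (i = 0; i < 4; i++) {
        out->intCon[i] = trapRecord.intCon[i];
    }
    out->errAddress = trapRecord.errAddress;
    trapRecord.magic = 0; /* consume the record so it is reported once */
    return 1;
}
```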
Once you get a trap, inspecting the content of the _errAddress variable will give you the return address for the ISR, which is the address immediately following the instruction that generated the interrupt. You can then inspect your MAP files to find the routine and, if you're really keen, inspect the disassembly to find the specific instruction. After that, debugging it is up to you.
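For reference, the masking that turns the two stacked words into an address can be sketched in C as below. The function names are hypothetical. The bit layout (bit 0 of the low word reserved for SFA, PC<22:16> in the low seven bits of the high word, IPL<3> in bit 7, the SR low byte above that) follows the comments in the assembly listing. The second helper assumes the trapping instruction is a single program word, i.e. two address units before the return address; for two-word instructions, or if the trap is not raised on the exact instruction, the real culprit may be a little further back.

```c
#include <stdint.h>

/* Rebuild the program counter from the two words of an interrupt stack
   frame. lowWord holds PC<15:0> (bit 0 may carry the SFA flag);
   highWord holds SR<7:0>, IPL<3> and PC<22:16>. */
uint32_t trapReturnAddress(uint16_t lowWord, uint16_t highWord)
{
    uint32_t pcLow  = lowWord & 0xFFFEu;  /* drop bit 0 (SFA) */
    uint32_t pcHigh = highWord & 0x007Fu; /* drop IPL<3> and the SR byte */

    return (pcHigh << 16) | pcLow;
}

/* Assumption: the trapping instruction is one program word long, so it
   sits two address units before the return address. */
uint32_t trapCandidateAddress(uint32_t returnAddress)
{
    return returnAddress - 2u;
}
```

The candidate address is the value to look up in the MAP file or the disassembly listing.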
As suggested in the comments, the while(1) statement is where your code is getting hung. Note, however, that your code is executing - you're just in an infinite loop. That's also why you can't view your variables or current program counter. Generally, when you're attached to a microcontroller from a host PC, you can't view state information while the microcontroller is executing. Everything is running too fast, even on a slow one, to constantly update your screen.
To try to identify the cause, you can set a breakpoint in the ISR and reset the controller. When the breakpoint is hit, execution will halt, and you may be able to investigate your stack frames to see the last line of code executed before the ISR was triggered. This is not guaranteed though - depending on how your particular microcontroller handles interrupts, the call stack may not be continuous between normal program execution and the interrupt context.
If that doesn't work, set a breakpoint in your code before the ISR is invoked, and step through your code until it is. The last line of code you executed before the ISR will be the cause. Keep in mind that this may take some time, especially if the offending line is in a loop and only trips the ISR after a certain number of iterations.
EDIT
After posting this answer, I noticed your last comment about the linker script warning. This is a perfect example of why you should, with very few exceptions, work just as hard to resolve warnings as you do to resolve compiler errors, especially if you don't understand what the warning means or what caused it.
A PID algorithm involves multiplication. On a dsPIC, this is done via the built-in hardware multiplier. This multiplier has one register which must point to xmemory space and another which must point to ymemory space. The DSP core then multiplies the two operands, and the result can be found in the accumulator (there are two of them).
An address error trap will be triggered if an xmemory address is loaded into the ymemory register or vice versa. You can check this by single-stepping through the code in the disassembly.
This is not the only situation in which the trap is triggered. Silicon bugs can also cause it, so check the errata.
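One way to catch a misplaced buffer before the DSP code runs is a start-up sanity check on the two addresses. The sketch below is hypothetical: `DataSpaceMap` and the helper names are made up for illustration, and the boundary values must come from the data memory map in your device's data sheet (none are assumed here).

```c
#include <stdint.h>
#include <stdbool.h>

/* Device-specific boundaries; fill in from the data sheet. */
typedef struct {
    uint16_t xStart; /* first address of X data RAM */
    uint16_t yStart; /* first address of Y data RAM (X ends just below) */
    uint16_t yEnd;   /* last address of Y data RAM */
} DataSpaceMap;

/* True if [addr, addr + bytes) lies entirely in X data RAM. */
static bool inXSpace(const DataSpaceMap *m, uint16_t addr, uint16_t bytes)
{
    return addr >= m->xStart && (uint32_t)addr + bytes <= m->yStart;
}

/* True if [addr, addr + bytes) lies entirely in Y data RAM. */
static bool inYSpace(const DataSpaceMap *m, uint16_t addr, uint16_t bytes)
{
    return addr >= m->yStart && (uint32_t)addr + bytes - 1u <= m->yEnd;
}

/* Coefficients must be in X space and the history buffer in Y space. */
bool pidBuffersPlacedCorrectly(const DataSpaceMap *m, uint16_t coeffAddr,
                               uint16_t historyAddr, uint16_t bytesEach)
{
    return inXSpace(m, coeffAddr, bytesEach) &&
           inYSpace(m, historyAddr, bytesEach);
}
```

On the target, the addresses would be something like `(uint16_t)&abcCoefficient[0]` and `(uint16_t)&controlHistory[0]`, with `bytesEach` set to `sizeof abcCoefficient`; the MAP file gives the same information without running any code.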

Resources