Intel Pin: how to replace/skip syscall - c

I'd like to replace/emulate syscalls in a statically linked binary running on 64-bit Linux 4.4.0-33, preferably using Intel Pin.
The documentation describes PIN_AddSyscallEntryFunction(...):
https://software.intel.com/sites/landingpage/pintool/docs/98484/Pin/html/group__SYSCALL.html
but it seems unable to skip the real syscall. Am I missing something? Google didn't do the trick :(
I could replace the syscall number with an invalid id inside the syscall entry callback and patch the return value inside the syscall exit callback, but I'd rather not do that.
There also seem to be lower-level functions (e.g. https://software.intel.com/sites/landingpage/pintool/docs/98484/Pin/html/group__INS__REF.html), but I'd like to try the higher-level functions first, for readability and also to exploit the full potential of Pin and get familiar with the tool.
-- BACKGROUND --
I've implemented a virtual file system (for HDFS) using LD_PRELOAD, which lets any unmodified program access HDFS through a special path /hdfs/..., but it doesn't work for statically linked binaries, and it has too many interception points (open and also open64, seek and also fseek, fputs, etc.).
Here are the methods that I've considered; please suggest a better way if there is one:
LD_PRELOAD replacing open/read/write/... // doesn't work for statically linked binaries, so I'm trying Pin here
ptrace/SYSEMU // seems too complex and likely to have performance issues
nfs/fuse/... // too complex, needs to adapt to too many protocols, and can only be used for a VFS; can't be extended to support other tech (e.g. hooking socket operations) later when needed
replace sysenter/syscall with int3 // should be the same as Pin? and SIGTRAP would be slower
Are there any alternatives to Pin? https://github.com/pmem/syscall_intercept also relies on LD_PRELOAD, so no luck there.

Pin lets you insert a call to your own function before an instruction using INS_InsertCall(). You can insert such a call before each syscall instruction; the function then checks the syscall number and arguments and emulates the system call when necessary. System call arguments are passed only in registers, so the function needs the CONTEXT object, which holds the processor state and gives access to the register values. The same object can then be passed to PIN_ExecuteAt() to skip the syscall instruction:
#include "pin.H"
#include <sys/syscall.h>
#include <cstdlib>
VOID syscall_handler(CONTEXT* ctx) {
    bool skip_orig_syscall = true;
    // On x86-64 Linux the syscall number is passed in RAX.
    switch (PIN_GetContextReg(ctx, REG_RAX)) {
        case SYS_write:
            // emulate the syscall here, or
            // just notify the app that something went wrong with the write() call
            // (a raw return value of -1 is read by libc as errno EPERM)
            PIN_SetContextReg(ctx, REG_RAX, static_cast<ADDRINT>(-1));
            break;
        default:
            skip_orig_syscall = false;
            break;
    }
    if (skip_orig_syscall) {
        const ADDRINT syscall_ins_size = 2; // the syscall instruction is 2 bytes long
        const ADDRINT cur_ip = PIN_GetContextReg(ctx, REG_RIP);
        PIN_SetContextReg(ctx, REG_RIP, cur_ip + syscall_ins_size);
        PIN_ExecuteAt(ctx); // continue execution after the syscall instruction; does not return
    }
}
VOID image_load(IMG img, VOID* v) {
    // Only instrument the (static) main executable.
    if (!IMG_IsMainExecutable(img)) {
        return;
    }
    for (SEC sec = IMG_SecHead(img); SEC_Valid(sec); sec = SEC_Next(sec)) {
        for (RTN rtn = SEC_RtnHead(sec); RTN_Valid(rtn); rtn = RTN_Next(rtn)) {
            RTN_Open(rtn);
            for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins)) {
                if (INS_IsSyscall(ins)) {
                    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)syscall_handler,
                                   IARG_CONTEXT, IARG_END);
                }
            }
            RTN_Close(rtn);
        }
    }
}
int main(int argc, char* argv[]) {
    if (PIN_Init(argc, argv)) {
        PIN_ERROR("Cannot initialize Pin");
        return EXIT_FAILURE;
    }
    PIN_InitSymbols();
    IMG_AddInstrumentFunction(image_load, 0);
    PIN_StartProgram(); // never returns
    return EXIT_SUCCESS;
}

Related

Error handling for multi-layer implementation in C

I am writing firmware for a device in C. The software allows the PC to communicate with this device over the serial interface (UART). The firmware consists of the following layers:
Communication layer to send and receive data over UART.
Block layer: This layer enables/disables certain blocks on the device by writing data to the device over UART.
API layer: This contains a sequence of calls to the routines in the block layer. It is used to enable or disable a set of blocks on the device.
My problem is the error handling since in C there are no exceptions. Here is how I've implemented my firmware and I am trying to see if there is a more efficient and compact way to build it while still handling errors efficiently. I want to avoid checking at each layer the status of the call of the lower layer.
The code below is heavily condensed; in reality, the block layer contains a long sequence of send_uart_command calls.
// Communication layer
operation_status_t send_uart_command(command_id_t id, command_value_t value)
{
// Send data over UART
// Return success if the operation is successful; otherwise failure
}
// Block layer
operation_status_t enable_block1(void)
{
if (send_uart_command(BLOCK1_COMMAND_1, 10) != operation_success)
return operation_failure;
if (send_uart_command(BLOCK1_COMMAND_2, 20) != operation_success)
return operation_failure;
// A list of sequences
if (send_uart_command(BLOCK1_COMMAND_N, 15) != operation_success)
return operation_failure;
return operation_success;
}
operation_status_t enable_block2(void)
{
if (send_uart_command(BLOCK2_COMMAND_1, 1) != operation_success)
return operation_failure;
if (send_uart_command(BLOCK2_COMMAND_2, 8) != operation_success)
return operation_failure;
return operation_success;
}
// API layer
operation_status_t initialize(void)
{
if (enable_block1() != operation_success)
return operation_failure;
if (enable_block2() != operation_success)
return operation_failure;
// A list of calls to the functions in the block layer
return operation_success;
}
One of the many big problems with exception handling like in C++ is that exceptions can crash through all layers like a cannonball. So when you are writing some completely unrelated code, you suddenly get that cannonball in your face: "UART framing error!" When you haven't even touched the UART code...
Therefore "I want to avoid checking at each layer the status of the call of the lower layer" is the wrong premise. Instead, you should do the following:
Check errors at each layer.
Deal with errors as close to the error source as possible.
Only forward errors to the caller in case they are actually meaningful to them.
You can rename/change type of errors to suit the caller along the way.
For example: "UART framing error" might be useful to the code calling the UART driver, but useless to the higher-layer application. "Possible incorrect baudrate settings" might be a more relevant error description to pass along. In some cases, though, you want detailed errors available even at the higher layers.
One reason you might want that is that it is common, and often good design, to have a centralized error handler in the top layer, which can decide on state changes, print/log errors, etc. from one single place in the code instead of from all over the place. The top layer of a microcontroller application often looks something like this:
void main (void)
{
/* init & setup code called */
for(;;)
{
kick_watchdog(); // the only place in the program where you do this
result = state_machine[state]();
if(result != OK)
{
state = error_handler(result);
}
}
}
As for your specific code, it looks just fine and mostly doesn't contradict anything of what I've written above. It is always good to return an error code upon error - less confusing than goto, or even worse: massively nested statements and/or loops with error condition flags.
Your code is fine. Actually, avoiding explicit error checking in C is bad practice. But if you really want to, you could use longjmp, though you should use it with great care.
This feature lets you jump up the stack, skipping an arbitrary number of nested calls.
Below is an example with a mocked-up send_uart_command().
#include <setjmp.h>
#include <stdio.h>
jmp_buf env;
void send_uart_command_impl(const char *cmd, int val) {
static int left = 3;
if (left-- == 0) {
printf("%s(%d): failed\n", cmd, val);
longjmp(env, 1);
}
printf("%s(%d): success\n", cmd, val);
}
#define send_uart_command(name, val) send_uart_command_impl(#name, val)
void enable_block1(void) {
send_uart_command(BLOCK1_COMMAND_1, 10);
send_uart_command(BLOCK1_COMMAND_2, 20);
send_uart_command(BLOCK1_COMMAND_N, 15);
}
void enable_block2(void) {
send_uart_command(BLOCK2_COMMAND_1, 1);
send_uart_command(BLOCK2_COMMAND_2, 8);
}
int initialize(void) {
if (setjmp(env)) return -1;
enable_block1();
enable_block2();
return 0;
}
int main() {
if (initialize() != 0)
puts("initialize failed");
else
puts("initialize success");
}
The program is constructed to fail on the 4th invocation of send_uart_command(). Adjust the initial value of the variable left to make a different invocation fail.
The program logic is very streamlined and it prints the expected output:
BLOCK1_COMMAND_1(10): success
BLOCK1_COMMAND_2(20): success
BLOCK1_COMMAND_N(15): success
BLOCK2_COMMAND_1(1): failed
initialize failed

Linux Kernel - How to match a jprobe to kretprobe?

I am writing a kernel module to monitor a few syscalls, and I want to return the function arguments to userland (via a netlink socket) if the call was successful.
jprobe.kp.symbol_name = "rename";
jprobe.entry = _rename_handler;
kretprobe.kp.symbol_name = "rename";
kretprobe.handler = _rename_ret_handler;
static rename_obj_t _g_cur_rename = NULL;
static void _rename_handler(const char *oldpath, const char *newpath)
{
    _g_cur_rename = create_rename(oldpath, newpath);
    jprobe_return();
}
/* kretprobe handlers return int */
static int _rename_ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
    /* Send only if successful */
    if (regs_return_value(regs) == 0) {
        add_send_queue(_g_cur_rename);
    }
    return 0;
}
I worry that another rename syscall may preempt[1] the current one after the jprobe, so that I send incorrect return codes and arguments:
jprobe: rename(a, b)
jprobe rename(c, d)
kretprobe
kretprobe
Edit: This article[2] states that interrupts are disabled during a kprobe handler. But does that mean that interrupts are disabled throughout the whole chain (jprobe -> kprobe -> kretprobe) or just for that single kprobe?
[1] https://unix.stackexchange.com/questions/186355/few-questions-about-system-calls-and-kernel-modules-kernel-services-in-parallel
[2] https://lwn.net/Articles/132196/
Interrupts are disabled for each jprobe call, not for the entire sequence.
How many calls are you expecting in the time it will take the application to process them? There are different approaches depending on how fast you expect the calls to come in. The simplest method, if you are only expecting maybe a few hundred calls before you can process them and you are willing to dedicate the static memory to the purpose, is to implement a static array of rename_obj_t objects in memory and then use an atomic increment from the kernel's atomic_t API (atomic_add_return rather than plain atomic_add, since you need the resulting value) to pick the next entry (modulo the size of your array).
This way you are returning a unique static reference each time, so long as the counter doesn't wrap around before you process the returned values. The atomic read-modify-write guarantees that two concurrent calls never obtain the same index, and the value-returning atomic operations are fully ordered, so you don't have to add memory barriers around the counter update yourself.

How can I catch a timeout in a third-party DLL function, using C on Windows?

I want to catch a timeout while calling a function in a third-party DLL. The function can take a long time, and I need a return value within a limited time; if it doesn't return in time, I will use a default value.
I have searched for a lot of information, but nothing I found works.
I have found two approaches:
1. Use the alarm function, but it only works on Linux; I can't use it on Windows, even with the MinGW standard GCC compiler.
2. Use the timeSetEvent function together with setjmp/longjmp. These three functions seemed close to making it work, but using them crashes the program: Windows pops up a dialog saying something went wrong.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
#include <setjmp.h>
jmp_buf j;
/**
* Timer interrupt function
*/
void PASCAL OneMilliSecondProc(UINT wTimerID, UINT msg, DWORD dwUser, DWORD dwl, DWORD dw2) {
printf("Timout!\n");
longjmp(j,1);
}
int longTimeFunction(){
while (1) {
printf("operating...\n");
Sleep(1000);
}
return 0;
}
int main(){
HANDLE hHandle;
UINT wTimerRes_1ms; // timer interval
UINT wAccuracy; // timer resolution
UINT TimerID_1ms; // timer handle
wTimerRes_1ms = 5000;
if((TimerID_1ms = timeSetEvent(
wTimerRes_1ms,
wAccuracy,
(LPTIMECALLBACK)OneMilliSecondProc, // callback function
(DWORD)(1), // user data passed to the callback
TIME_PERIODIC // call the timer handler periodically
)) == 0) {
printf("start!!!!!!!!!!!\n");
} else {
printf("end!!!!!!!!!!!\n");
}
int temp = 0;
if(setjmp(j) == 0){
temp = longTimeFunction();
}else{
printf("xxxxxx...\n");
temp = -1;
}
printf("%d\n", temp);
return 0;
}
Unlike UNIX signals, timeSetEvent doesn't interrupt a thread; the callback runs in parallel on another thread, and calling longjmp across threads is undefined behavior.
Concerning your actual question, this is a bad idea: aborting the call this way could leave the library in an inconsistent state.
Instead, try to get the library vendor to offer an API that accepts a timeout, or use another library that already supports it.

Alternative to blocking code

I am attempting to use the mbed OS scheduler for a small project.
Because mbed OS is asynchronous, I need to avoid blocking code.
However, the library for my wireless receiver uses this blocking line:
while (!(wireless.isRxData()));
Is there an alternative way to do this that won't block all the code until a message is received?
static void listen(void) {
wireless.quickRxSetup(channel, addr1);
sprintf(ackData,"Ack data \r\n");
wireless.acknowledgeData(ackData, strlen(ackData), 1);
while (!(wireless.isRxData()));
len = wireless.getRxData(msg);
}
static void motor(void) {
pc.printf("Motor\n");
m.speed(1);
n.speed(1);
led1 = 1;
wait(0.5);
m.speed(0);
n.speed(0);
}
static void sendData() {
wireless.quickTxSetup(channel, addr1);
strcpy(accelData, "Robot");
wireless.transmitData(accelData ,strlen(accelData));
}
void app_start(int, char**) {
minar::Scheduler::postCallback(listen).period(minar::milliseconds(500)).tolerance(minar::milliseconds(1000));
minar::Scheduler::postCallback(motor).period(minar::milliseconds(500));
minar::Scheduler::postCallback(sendData).period(minar::milliseconds(500)).delay(minar::milliseconds(3000));
}
You should remove the while (!(wireless.isRxData())); loop in your listen function. Replace it with:
if (wireless.isRxData()) {
len = wireless.getRxData(msg);
// Process data
}
Then, you can process your data in that if statement, or you can call postCallback on another function that will do your processing.
Instead of looping until data is available, you'll want to poll for data. If RX data is not available, exit the function and set a timer to go off after a short interval. When the timer goes off, check for data again. Repeat until data is available. I'm not familiar with your OS so I can't offer any specific code. This may be as simple as adding a short "sleep" call inside the while loop, or may involve creating another callback from the scheduler.

exchange of control between functions in C language

Considering the following scenario:
fn(1) calls fn(2) , then
fn(2) calls fn(3), and now
fn(3) should pass the control to fn(1) instead of fn(2) and control must not come back again.
I have tried goto for this, but goto does not work between functions; it's only a local jump.
I wanted to check whether there is any other method I could use to pass control to another function.
NOTE: As far as I have explored, neither global variables nor pointers to functions will work in this case.
Well, the typical way of doing this would be:
int fn3() {
return 1;
}
void fn2() {
if (fn3())
return;
...
}
Not sure if you're looking for something more esoteric, such as setjmp/longjmp
You can use longjmp as a "long range goto" if you absolutely must do this.
int fn1(void) {
printf("in fn1 before calling fn2\n");
fn2();
printf("in fn1 after calling fn2\n");
return 0;
}
int fn2(void) {
printf("in fn2 before calling fn3\n");
if (1) {
return fn3();
}
printf("in fn2 after calling fn3\n");
return 0;
}
int fn3(void) {
printf("in fn3\n");
return 0;
}
You can use setjmp and longjmp to do this -- but it's almost certainly a really bad idea to actually do so. Former Fortran programmers (among others) still sometimes have nightmares about the kind of mess you seem intent on creating. Given a time when a mainframe that served 300+ simultaneous users ran at 20 MHz or so, there was some excuse at the time, even if keeping track of things was a mess. Given current computers, I question not only the utility but the very sanity of having a function call that doesn't return (especially since CPUs are now optimized for that case, so what you're asking for will be slower than normal returns).
What you are trying to implement are so-called coroutines. While C doesn't support them directly, they can be implemented with ingenious hacks such as Duff's Device.
Simon Tatham wrote an excellent article about Coroutines in C: http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
I think you should use setjmp() and longjmp(); see their man page for details.
The following example shows you how to use it (from http://en.wikipedia.org/wiki/Setjmp.h#Example_usage ):
#include <stdio.h>
#include <setjmp.h>
static jmp_buf buf;
void second(void) {
printf("second\n"); // prints
longjmp(buf,1); // jumps back to where setjmp was called - making setjmp now return 1
}
void first(void) {
second();
printf("first\n"); // does not print
}
int main() {
if ( ! setjmp(buf) ) {
first(); // when executed, setjmp returns 0
} else { // when longjmp jumps back, setjmp returns 1
printf("main\n"); // prints
}
return 0;
}
Output :
second
main
