How to play with ptrace on x86-64?

I'm following the tutorial here, and modified it a little for x86-64 (basically replacing eax with rax, etc.) so that it compiles:
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <sys/reg.h>
#include <stdio.h>
int main()
{ pid_t child;
long orig_eax;
child = fork();
if(child == 0) {
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
execl("/bin/ls", "ls", NULL);
}
else {
wait(NULL);
orig_eax = ptrace(PTRACE_PEEKUSER,
child, 4 * ORIG_RAX,
NULL);
printf("The child made a "
"system call %ld\n", orig_eax);
ptrace(PTRACE_CONT, child, NULL, NULL);
}
return 0;
}
But it doesn't work as expected; it always prints:
The child made a system call -1
What's wrong in the code?

ptrace returns -1 with errno set to EIO because the offset you're trying to read is not correctly aligned. From the ptrace man page:
PTRACE_PEEKUSER
Reads a word at offset addr in the child's USER area, which
holds the registers and other information about the process (see
<sys/user.h>). The word is returned as the result of the
ptrace() call. Typically the offset must be word-aligned,
though this might vary by architecture. See NOTES. (data is
ignored.)
On my 64-bit system, 4 * ORIG_RAX is not 8-byte aligned. Try offsets such as 0 or 8 and it should work.

On 64-bit the offset is 8 * ORIG_RAX, where 8 = sizeof(long).
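A minimal sketch of the corrected read, assuming x86-64 Linux with glibc, where `<sys/reg.h>` defines `ORIG_RAX` as a slot index rather than a byte offset. The helper names `orig_rax_offset` and `first_syscall` are made up for illustration. Because -1 can also be a legitimate register value, the sketch clears `errno` before the call and checks it afterwards.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/reg.h>    /* ORIG_RAX (slot index, x86-64 only) */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: byte offset of orig_rax in the USER area. */
long orig_rax_offset(void)
{
    return (long)sizeof(long) * ORIG_RAX;
}

/* Hypothetical helper: run `path` under ptrace and return the syscall
 * number visible at the first stop, or -1 on failure. */
long first_syscall(const char *path)
{
    int status;
    long nr;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl(path, path, (char *)NULL);
        _exit(127);                      /* only reached if exec failed */
    }
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    errno = 0;                           /* -1 can be a legitimate value */
    nr = ptrace(PTRACE_PEEKUSER, child, (void *)orig_rax_offset(), NULL);
    if (nr == -1 && errno != 0)
        perror("ptrace(PTRACE_PEEKUSER)");

    ptrace(PTRACE_CONT, child, NULL, NULL);
    waitpid(child, &status, 0);          /* reap the child */
    return nr;
}
```

Calling `first_syscall("/bin/ls")` from `main` should then report the execve syscall number instead of -1, provided the process is allowed to use ptrace.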

Related

Recoding strace: why can't I catch the "write" syscall?

I am currently reimplementing the strace command.
I understand the goal of this command, and I can catch some syscalls from an executable file.
My question is: why don't I catch the "write" syscall?
This is my code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <wait.h>
int main(int argc, char* argv[]) {
int status;
pid_t pid;
struct user_regs_struct regs;
int counter = 0;
int in_call =0;
switch(pid = fork()) {
case -1:
perror("fork");
exit(1);
case 0:
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
execvp(argv[1], argv + 1);
break;
default:
wait(&status);
while (status == 1407) {
ptrace(PTRACE_GETREGS, pid, NULL, &regs);
if(!in_call) {
printf("SystemCall %lld called with %lld, %lld, %lld\n",regs.orig_rax,
regs.rbx, regs.rcx, regs.rdx);
in_call=1;
counter ++;
}
else
in_call = 0;
ptrace(PTRACE_SYSEMU, pid, NULL, NULL);
wait(&status);
}
}
printf("Total Number of System Calls = %d\n", counter);
return 0;
}
This is the output using my program :
./strace ./my_program
SystemCall 59 called with 0, 0, 0
SystemCall 60 called with 0, 4198437, 5
Total Number of System Calls = 2
59 represents the execve syscall.
60 represents the exit syscall.
This is the output using the real strace :
strace ./my_program
execve("./my_program", ["./bin_asm_write"], 0x7ffd2929ae70 /* 67 vars */) = 0
write(1, "Toto\n", 5Toto
) = 5
exit(0) = ?
+++ exited with 0 +++
As you can see, my program doesn't catch the write syscall.
I don't understand why. Do you have any idea?
Thank you for your answer.
Your while loop is set up rather strangely -- you have this in_call flag that you toggle back and forth between 0 and 1, and you only print the system call when it is 0. The net result is that while you catch every system call, you only print every other system call. So when you catch the write call, the flag is 1 and you don't print anything.
Another oddity is that you're using PTRACE_SYSEMU rather than PTRACE_SYSCALL. SYSEMU is intended for emulating system calls, so the system call won't actually run at all (it will be skipped). Normally your tracing program would do whatever the system call is supposed to do itself, then call PTRACE_SETREGS to set the tracee's registers to the appropriate return values before calling PTRACE_SYSEMU again to run to the next system call.
Your in_call flagging would make more sense if you were actually using PTRACE_SYSCALL, as that will stop twice for each syscall -- once on entry to the syscall and a second time when the call returns. However, it will also stop for signals, so you need to be decoding the status to see if a signal has occurred or not.
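The loop described above could be sketched as follows, assuming x86-64 Linux. The status value 1407 tested in the question is 0x57f, which decodes to "stopped by SIGTRAP"; the sketch uses the `W*` macros instead and sets `PTRACE_O_TRACESYSGOOD` so that syscall stops report `SIGTRAP | 0x80` and can be told apart from signal stops. The names `is_syscall_stop` and `trace_syscalls` are hypothetical. On x86-64 the first three syscall arguments are in rdi, rsi and rdx rather than rbx, rcx and rdx.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* With PTRACE_O_TRACESYSGOOD set, syscall stops report SIGTRAP | 0x80,
 * which tells them apart from ordinary signal-delivery stops. */
int is_syscall_stop(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}

/* Hypothetical helper: trace argv[0] and return how many syscalls it
 * entered after the exec, or -1 on failure. */
long trace_syscalls(char *const argv[])
{
    struct user_regs_struct regs;
    int status, in_call = 0, sig = 0;
    long count = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execvp(argv[0], argv);
        _exit(127);                       /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;                        /* expected: stop after exec */
    ptrace(PTRACE_SETOPTIONS, pid, NULL, (void *)(long)PTRACE_O_TRACESYSGOOD);

    for (;;) {
        /* Resume until the next syscall entry or exit, passing on any signal. */
        ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig);
        sig = 0;
        if (waitpid(pid, &status, 0) < 0 ||
            WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (is_syscall_stop(status)) {
            in_call = !in_call;           /* entry and exit stops alternate */
            if (in_call && ptrace(PTRACE_GETREGS, pid, NULL, &regs) == 0) {
                /* x86-64: first three arguments are in rdi, rsi, rdx */
                printf("syscall %lld(%lld, %lld, %lld)\n",
                       (long long)regs.orig_rax, (long long)regs.rdi,
                       (long long)regs.rsi, (long long)regs.rdx);
                count++;
            }
        } else if (WIFSTOPPED(status)) {
            sig = WSTOPSIG(status);       /* ordinary signal: deliver it */
        }
    }
    return count;
}
```

Because the tracer resumes the child before waiting, the execve that started the program is not counted; only syscalls made after the exec are printed.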

C - Retrieving a child's exit status that is larger than 8 bits

Note: For simplicity, I don't include much error checking and my sample code doesn't really serve any practical purpose.
What I want:
I want a program that fork()s a child process and has it invoke a process using execl(). My parent then retrieves the exit code for that process. This is fairly trivial.
What I tried:
int main(int argc, char** argv) {
int ch = fork();
if(ch == -1) {
perror(NULL);
}else if(ch == 0) {
// In child, invoke process
execl("/path/process", "process", (char *)NULL);
}else {
// In parent, retrieve child exit code
int status = 0;
wait(&status);
// print exit status
if(WIFEXITED(status)) printf("%d\n", WEXITSTATUS(status));
}
}
My issue:
WEXITSTATUS() only retrieves the lower 8 bits of the exit value while I need all the bits from the int value. Specifically, process performs a calculation and the result may be larger than 8 bits. It may even be negative, in which case, the highest bit will be needed to represent the correct value.
What else I tried:
Also, while looking around, I found the pipe() function. However, I'm not sure how to use it in this situation since, after calling execl(), I can't write to a file descriptor from within the child.
So how can I go about retrieving a child's exit status that is larger than 8 bits?
I don't think what you are trying to accomplish is possible, because on Linux (actually I think this is UNIX-wide), a process exit code is an 8-bit number with 256 possible values, 0-255 (by convention, 0 means success and anything else means an error), and lots of things rely on this fact (including the macros you used). Take the following piece of code:
// a.c
int main() {
return 257;
}
If you compile it (gcc a.c), run the resulting executable (a.out), and check its exit code (echo $?), it will output 1: the exit code is truncated to its low 8 bits by the OS rather than by the shell, so 257 % 256 = 1 (wrap-around arithmetic).
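The truncation can be observed without a shell involved at all. This is a small sketch, with the hypothetical helper name `observed_exit_status`, that forks a child, lets it exit with an arbitrary int, and returns what the parent sees through WEXITSTATUS:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` and return
 * what the parent sees through WEXITSTATUS, or -1 on failure. */
int observed_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: exit immediately */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);      /* only the low 8 bits survive */
}
```

With this helper, an exit code of 257 is seen as 1 and an exit code of -1 is seen as 255, so negative results cannot be recovered from the exit status.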
As an alternative as you mentioned, you could use pipe (this post is pretty descriptive) or sockets (AF_UNIX type).
This code is from: How to send a simple string between two programs using pipes?
writer.c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
int fd;
char * myfifo = "/tmp/myfifo";
/* create the FIFO (named pipe) */
mkfifo(myfifo, 0666);
/* write "Hi" to the FIFO */
fd = open(myfifo, O_WRONLY);
write(fd, "Hi", sizeof("Hi"));
close(fd);
/* remove the FIFO */
unlink(myfifo);
return 0;
}
reader.c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>
#define MAX_BUF 1024
int main()
{
int fd;
char * myfifo = "/tmp/myfifo";
char buf[MAX_BUF];
/* open, read, and display the message from the FIFO */
fd = open(myfifo, O_RDONLY);
read(fd, buf, MAX_BUF);
printf("Received: %s\n", buf);
close(fd);
return 0;
}
Some part of the code, probably the parent/reader, should delete the FIFO node once it is no longer needed, for example by calling unlink() or running rm.
Otherwise the FIFO entry is left in existence, even across a reboot, just like any other file.
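For the fork()/execl() case in the question, an anonymous pipe created before the fork may be simpler than a FIFO, since open file descriptors are inherited across exec. The sketch below assumes the child program prints its result as a decimal integer on standard output, which is an assumption about the child and not something the question states; the helper name `read_child_result` is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0], capture its stdout through a pipe,
 * and parse the first integer it prints. Returns 0 on success. */
int read_child_result(char *const argv[], long *result)
{
    int fds[2], status;
    char buf[64], *end;
    size_t used = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);  /* child's stdout now feeds the pipe */
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                   /* only reached if exec failed */
    }
    close(fds[1]);                    /* parent only reads */
    while (used < sizeof buf - 1 &&
           (n = read(fds[0], buf + used, sizeof buf - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    errno = 0;
    *result = strtol(buf, &end, 10);  /* full long range, sign included */
    return (end != buf && errno == 0) ? 0 : -1;
}
```

The exit status then stays free for its conventional job of signalling success or failure, while the computed value travels through the pipe.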

Process fork isn't executing desired code

So I'm trying to execute this code given to me by my professor. It's dead simple. It forks, checks to see if the forking works properly, then executes another bit of code in a separate file.
For some reason, on my OS X 10.9.5 machine, it's failing to execute the second bit of code. Here are both of the programs:
exercise.c
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
int main() {
pid_t child = fork();
if ((int)child < 0) {
fprintf(stderr, "fork error!\n");
exit(0);
} else if ((int)child > 0) {
int status;
(void) waitpid(child, &status, 0);
if (WIFEXITED(status)) {
printf("child %d exited normally and returned %d\n",
child, WEXITSTATUS(status));
} else if (WIFSIGNALED(status)) {
printf("\nchild %d was killed by signal number %d and %s leave a core dump\n",
child, WTERMSIG(status), (WCOREDUMP(status) ? "did" : "didn't"));
} else {
printf("child %d is dead and I don't know why!\n", child);
}
} else {
char *argv[] = { "./getcode" };
execve(argv[0], argv, NULL);
}
return 0;
}
And getcode.c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
int main() {
int rc = 256;
printf("I am child process %d\n", getpid());
while ((rc > 255) || (rc < 0)) {
printf("please type an integer from 0 to 255: ");
scanf("%d", &rc);
}
exit(rc);
}
I compile both with the commands:
gcc -Wall -pedantic-errors exercise.c -o exercise
and
gcc -Wall -pedantic-errors getcode.c -o getcode
Unfortunately, the only thing I get back from the child process is a return code of 0
./exercise
child 903 exited normally and returned 0
I'm baffled. Can anyone help?
EDIT: Okay, so I included perror("execve") as requested, and it returns execve: Bad address. So how can I fix that?
EDIT2: All right. I fixed it. I've changed the bit of the above code to include this:
char *argv[] = { "./getcode",NULL };
execve(argv[0], argv, NULL);
Null termination fixes the argv issues.
You need to terminate argv with a NULL element. From the execve man page:
Both argv and envp must be terminated by a NULL pointer.
Also it is not clear that NULL is valid for the envp argument. The Linux man page says
On Linux, argv can be specified as NULL, which has the same effect as specifying this argument as a pointer to a list containing a single NULL pointer. Do not take advantage of this misfeature! It is nonstandard and nonportable: on most other UNIX systems doing this will result in an error (EFAULT).
Possibly specifying envp as NULL is similarly nonstandard. Use execv not execve if you don't need to specify an environment.
You should check the return value of execve and use errno to determine the cause of a failure, e.g. with perror("execve"); it may be complaining.
You're not checking the result of the execve call, so I suspect it's failing, and the child process is reaching the return 0 at the end of main.
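Putting these points together, a minimal sketch might use execv (no environment argument), require a NULL-terminated argv, and report a failed exec instead of falling through to the end of main. The helper name `spawn_and_wait` and the exit code 127 for a failed exec are choices made for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] and return its exit status,
 * 127 if the exec failed, or -1 on other errors. */
int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t child = fork();

    if (child < 0) {
        perror("fork");
        return -1;
    }
    if (child == 0) {
        execv(argv[0], argv);        /* argv must end with a NULL pointer */
        perror("execv");             /* only reached when execv fails */
        _exit(127);
    }
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For the exercise above, the argument vector would be `char *argv[] = { "./getcode", NULL };`.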

(ORIG_EAX*4) in ptrace calls

I was going through an article here and was trying out the code snippet I have copied out below :-
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/user.h> /* For constants
ORIG_EAX etc */
int main()
{ pid_t child;
long orig_eax;
child = fork();
if(child == 0) {
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
execl("/bin/ls", "ls", NULL);
}
else {
wait(NULL);
orig_eax = ptrace(PTRACE_PEEKUSER,
child, 4 * ORIG_EAX,
NULL);
printf("The child made a "
"system call %ld\n", orig_eax);
ptrace(PTRACE_CONT, child, NULL, NULL);
}
return 0;
}
I have a doubt regarding what ORIG_EAX is exactly and why 4*ORIG_EAX is passed onto the ptrace call. I initially assumed that ORIG_EAX, EBX, ECX etc would be the offsets into a particular structure where the values of the registers would be stored.
So I decided to print the value of ORIG_EAX just after the wait by using printf("origeax = %ld\n", ORIG_EAX);. The value was 11. So, my earlier assumption regarding the offsets was wrong.
I understand that the wait call is terminated when the child has a state change(in this case, issues a system call) and that ORIG_EAX would contain the system call number.
However, why is ORIG_EAX * 4 passed onto the ptrace call?
The parameter is a byte offset into the user_regs_struct. Each member is an unsigned long (4 bytes on 32-bit x86), so to reach the entry at index 11 (orig_eax) the offset in bytes is 11 * 4 = 44 (provided you're on a 32-bit x86 machine, of course).
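One way to check this relationship is to compare the slot index times the word size against `offsetof` on the struct itself. The sketch assumes glibc headers on x86 or x86-64 Linux; the macros `SYSCALL_SLOT` and `SYSCALL_FIELD` and both function names are introduced only for this example.

```c
#include <stddef.h>
#include <sys/reg.h>    /* ORIG_EAX or ORIG_RAX, as a slot index */
#include <sys/user.h>   /* struct user_regs_struct */

#if defined(__x86_64__)
#define SYSCALL_SLOT  ORIG_RAX
#define SYSCALL_FIELD orig_rax
#elif defined(__i386__)
#define SYSCALL_SLOT  ORIG_EAX
#define SYSCALL_FIELD orig_eax
#else
#error "this sketch only covers x86 and x86-64"
#endif

/* Offset computed the way the ptrace call does it: index * word size. */
size_t offset_from_slot(void)
{
    return (size_t)SYSCALL_SLOT * sizeof(long);
}

/* Offset of the same register taken directly from the struct layout. */
size_t offset_from_struct(void)
{
    return offsetof(struct user_regs_struct, SYSCALL_FIELD);
}
```

Both functions should return the same value: 44 when built for 32-bit x86 and 120 when built for x86-64.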

How to create a customized file descriptor on linux

I would like to create a file whose descriptor would have some customizable behavior. In particular, I'd like to create a file descriptor, which, when written to, would prefix every line, with name of the process and pid (and maybe time), but I can imagine it can be useful to do other things.
I don't want to alter the writing program - for one thing, I want it to work for all programs on my system, even shell/perl/etc. scripts, and it would be impractical if not impossible to change the source code of everything.
Note that pipes wouldn't do in this case, because when the writing process fork()s, the newly created child shares the fd and is indistinguishable from its parent by the reading end of the pipe.
There are approaches which would do, but I think they are rather clumsy:
Create a kernel module that will create such fds. For example, you could open some /dev/customfd and then instruct the module to do some transformation etc. or send data to userspace or socket etc.
Use LD_PRELOAD that will override the fd manipulation functions and do these kinds of things on the "special" fd.
However, both of these approaches are quite laborious, so I would like to know if there is a better way, or any infrastructure (like off-the-shelf libraries) that would help.
I'd prefer a solution which doesn't involve kernel changes, but I'm ready to accept them if necessary.
Just an idea: would FUSE be an answer?
You have a lot of options. As you mentioned, using LD_PRELOAD to wrap the write()/read() functions is a good approach.
I recommend using ptrace(2) to catch the desired system call and pass its arguments to your own function.
Example:
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/reg.h> /* For ORIG_EAX, EBX etc. (32-bit x86) */
#include <sys/syscall.h> /* For SYS_write etc */
int main()
{ pid_t child;
long orig_eax, eax;
long params[3];
int status;
int insyscall = 0;
child = fork();
if(child == 0) {
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
execl("/bin/ls", "ls", NULL);
}
else {
while(1) {
wait(&status);
if(WIFEXITED(status) || WIFSIGNALED(status))
break;
orig_eax = ptrace(PTRACE_PEEKUSER,
child, 4 * ORIG_EAX, NULL);
if(orig_eax == SYS_write) {
if(insyscall == 0) {
/* Syscall entry */
insyscall = 1;
params[0] = ptrace(PTRACE_PEEKUSER,
child, 4 * EBX,
NULL);
params[1] = ptrace(PTRACE_PEEKUSER,
child, 4 * ECX,
NULL);
params[2] = ptrace(PTRACE_PEEKUSER,
child, 4 * EDX,
NULL);
printf("Write called with "
"%ld, %ld, %ld\n",
params[0], params[1],
params[2]);
}
else { /* Syscall exit */
eax = ptrace(PTRACE_PEEKUSER,
child, 4 * EAX, NULL);
printf("Write returned "
"with %ld\n", eax);
insyscall = 0;
}
}
ptrace(PTRACE_SYSCALL,
child, NULL, NULL);
}
}
return 0;
}
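The example above targets 32-bit x86. On x86-64 the same three write() arguments would instead be found in rdi, rsi and rdx, for instance in a `struct user_regs_struct` filled by PTRACE_GETREGS. The following is only a sketch of that mapping, with the hypothetical helper name `write_args`; reading the buffer contents themselves would additionally need something like PTRACE_PEEKDATA on the tracee's memory.

```c
#include <sys/syscall.h>   /* SYS_write */
#include <sys/user.h>      /* struct user_regs_struct */

/* Hypothetical helper for x86-64: if regs describe a write() call,
 * copy fd, buffer address and length into out[] and return 1;
 * otherwise leave out[] untouched and return 0. */
int write_args(const struct user_regs_struct *regs,
               unsigned long long out[3])
{
    if (regs->orig_rax != (unsigned long long)SYS_write)
        return 0;
    out[0] = regs->rdi;    /* file descriptor */
    out[1] = regs->rsi;    /* address of the buffer in the tracee */
    out[2] = regs->rdx;    /* number of bytes */
    return 1;
}
```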
