How does ptrace work in Linux? - c

The ptrace system call allows the parent process to inspect the attached child. For example, in Linux, strace (which is implemented with the ptrace system call) can inspect the system calls invoked by the child process.
When the attached child process invokes a system call, the ptracing parent process can be notified. But how exactly does that happen? I want to know the technical details behind this mechanism.
Thank you in advance.

When the attached child process invokes a system call, the ptracing parent process can be notified. But how exactly does that happen?
Parent process calls ptrace with PTRACE_ATTACH, and his child calls ptrace with PTRACE_TRACEME option. This pair will connect two processes by filling some fields inside their task_struct (kernel/ptrace.c: sys_ptrace, child will have PT_PTRACED flag in ptrace field of struct task_struct, and pid of ptracer process as parent and in ptrace_entry list - __ptrace_link; parent will record child's pid in ptraced list).
Then strace will call ptrace with PTRACE_SYSCALL flag to register itself as syscall debugger, setting thread_flag TIF_SYSCALL_TRACE in child process's struct thread_info (by something like set_tsk_thread_flag(child, TIF_SYSCALL_TRACE);). arch/x86/include/asm/thread_info.h:
67 /*
68 * thread information flags
69 * - these are process state flags that various assembly files
70 * may need to access ...*/
75 #define TIF_SYSCALL_TRACE 0 /* syscall trace active */
99 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
On every syscall entry or exit, architecture-specific syscall entry code will check this _TIF_SYSCALL_TRACE flag (directly in assembler implementation of syscall, for example x86 arch/x86/kernel/entry_32.S: jnz syscall_trace_entry in ENTRY(system_call) and similar code in syscall_exit_work), and if it is set, ptracer will be notified with signal (SIGTRAP) and child will be temporary stopped. This is done usually in syscall_trace_enter and syscall_trace_leave :
1457 long syscall_trace_enter(struct pt_regs *regs)
1483 if ((ret || test_thread_flag(TIF_SYSCALL_TRACE)) &&
1484 tracehook_report_syscall_entry(regs))
1485 ret = -1L;
1507 void syscall_trace_leave(struct pt_regs *regs)
1531 if (step || test_thread_flag(TIF_SYSCALL_TRACE))
1532 tracehook_report_syscall_exit(regs, step);
The tracehook_report_syscall_* are actual workers here, they will call ptrace_report_syscall. include/linux/tracehook.h:
80 /**
81 * tracehook_report_syscall_entry - task is about to attempt a system call
82 * #regs: user register state of current task
83 *
84 * This will be called if %TIF_SYSCALL_TRACE has been set, when the
85 * current task has just entered the kernel for a system call.
86 * Full user register state is available here. Changing the values
87 * in #regs can affect the system call number and arguments to be tried.
88 * It is safe to block here, preventing the system call from beginning.
89 *
90 * Returns zero normally, or nonzero if the calling arch code should abort
91 * the system call. That must prevent normal entry so no system call is
92 * made. If #task ever returns to user mode after this, its register state
93 * is unspecified, but should be something harmless like an %ENOSYS error
94 * return. It should preserve enough information so that syscall_rollback()
95 * can work (see asm-generic/syscall.h).
96 *
97 * Called without locks, just after entering kernel mode.
98 */
99 static inline __must_check int tracehook_report_syscall_entry(
100 struct pt_regs *regs)
101 {
102 return ptrace_report_syscall(regs);
103 }
104
105 /**
106 * tracehook_report_syscall_exit - task has just finished a system call
107 * #regs: user register state of current task
108 * #step: nonzero if simulating single-step or block-step
109 *
110 * This will be called if %TIF_SYSCALL_TRACE has been set, when the
111 * current task has just finished an attempted system call. Full
112 * user register state is available here. It is safe to block here,
113 * preventing signals from being processed.
114 *
115 * If #step is nonzero, this report is also in lieu of the normal
116 * trap that would follow the system call instruction because
117 * user_enable_block_step() or user_enable_single_step() was used.
118 * In this case, %TIF_SYSCALL_TRACE might not be set.
119 *
120 * Called without locks, just before checking for pending signals.
121 */
122 static inline void tracehook_report_syscall_exit(struct pt_regs *regs, int step)
123 {
...
130
131 ptrace_report_syscall(regs);
132 }
And ptrace_report_syscall generates SIGTRAP for debugger or strace via ptrace_notify/ptrace_do_notify:
55 /*
56 * ptrace report for syscall entry and exit looks identical.
57 */
58 static inline int ptrace_report_syscall(struct pt_regs *regs)
59 {
60 int ptrace = current->ptrace;
61
62 if (!(ptrace & PT_PTRACED))
63 return 0;
64
65 ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
66
67 /*
68 * this isn't the same as continuing with a signal, but it will do
69 * for normal use. strace only continues with a signal if the
70 * stopping signal is not SIGTRAP. -brl
71 */
72 if (current->exit_code) {
73 send_sig(current->exit_code, current, 1);
74 current->exit_code = 0;
75 }
76
77 return fatal_signal_pending(current);
78 }
ptrace_notify is implemented in kernel/signal.c, it stops the child and pass sig_info to ptracer:
1961 static void ptrace_do_notify(int signr, int exit_code, int why)
1962 {
1963 siginfo_t info;
1964
1965 memset(&info, 0, sizeof info);
1966 info.si_signo = signr;
1967 info.si_code = exit_code;
1968 info.si_pid = task_pid_vnr(current);
1969 info.si_uid = from_kuid_munged(current_user_ns(), current_uid());
1970
1971 /* Let the debugger run. */
1972 ptrace_stop(exit_code, why, 1, &info);
1973 }
1974
1975 void ptrace_notify(int exit_code)
1976 {
1977 BUG_ON((exit_code & (0x7f | ~0xffff)) != SIGTRAP);
1978 if (unlikely(current->task_works))
1979 task_work_run();
1980
1981 spin_lock_irq(&current->sighand->siglock);
1982 ptrace_do_notify(SIGTRAP, exit_code, CLD_TRAPPED);
1983 spin_unlock_irq(&current->sighand->siglock);
1984 }
ptrace_stop is in the same signal.c file, line 1839 for 3.13.

Related

Context switching - ucontext_t and makecontext()

I am studying context switching in C programming and have found the following example code on the Internet. I am trying to figure out whether only makecontext() function can trigger a function that does something. The other functions such as setcontext(), getcontext(), and swapcontext() are used for setting the context.
The makecontext() attaches a function and its parameter(s) to a context, does the function stick to the context all the time until a change is committed to it?
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <ucontext.h>
4 #define MEM 64000
5
6 ucontext_t T1, T2, Main;
7 ucontext_t a;
8
9 int fn1()
10 {
11 printf("this is from 1\n");
12 setcontext(&Main);
13 }
14
15 void fn2()
16 {
17 printf("this is from 2\n");
18 setcontext(&a);
19 printf("finished 1\n");
20 }
21
22 void start()
23 {
24 getcontext(&a);
25 a.uc_link=0;
26 a.uc_stack.ss_sp=malloc(MEM);
27 a.uc_stack.ss_size=MEM;
28 a.uc_stack.ss_flags=0;
29 makecontext(&a, (void*)&fn1, 0);
30 }
31
32 int main(int argc, char *argv[])
33 {
34 start();
35 getcontext(&Main);
36 getcontext(&T1);
37 T1.uc_link=0;
38 T1.uc_stack.ss_sp=malloc(MEM);
39 T1.uc_stack.ss_size=MEM;
40 makecontext(&T1, (void*)&fn1, 0);
41 swapcontext(&Main, &T1);
42 getcontext(&T2);
43 T2.uc_link=0;
44 T2.uc_stack.ss_sp=malloc(MEM);
45 T2.uc_stack.ss_size=MEM;
46 T2.uc_stack.ss_flags=0;
47 makecontext(&T2, (void*)&fn2, 0);
48 swapcontext(&Main, &T2);
49 printf("completed\n");
50 exit(0);
51 }
makecontext writes the function info into a context, and it will remain there until it is overwritten by something else. getcontext overwites the entire context, so would overwrite any function written there by a previous call to makecontext. Likewise, swapcontext completely overwrites the context pointed at by its first argument.
The basic idea is that a u_context contains a snapshot of part of the process context at some time. It contains all the machine registers, stack info, and signal masks. It does NOT include any memory map or file descriptor state. The state in the u_context is precisely all the state you need to manipulate in order to implement threads or coroutines.
edit
swapcontext(&current, &another) saves the current context in current and switches to another. At some later point, the code (runnning from another context) may switch back to current (with another call to swapcontext) or it might switch to some third context.
When a context ends (the function set into it with makecontext returns), if some context is pointed at by its uc_link field, it will switch to that context. But if uc_link is NULL, then the thread (and process if there's only one thread) will exit -- other contexts that are not running will just be abandoned.

A second getpwuid call appears to overwrite old value

Here's a small C program that prints (well, supposed to print) the real and effective IDs of a process when the file has the setuid flag set. In this program, when I call getpwuid a second time (L.No 38), it tends to overwrite the value of the variable realUserName that was obtained in L.No 24. I'm unable to explain this behavior. Is this the expected behavior and why? I'm trying this in a Linux box (RHEL 2.6.18-371.1.2.el5).
1 /* Filename: test.c
2 * Notes:
3 * 1] ./test owned by user cadmn (userID: 3585)
4 * 2] ./test run by user pmn (4471)
5 * 3] ./test has the setuid bit switched-on.
6 */
7 #include <stdio.h>
8 #include <pwd.h>
9 #include <sys/types.h>
10 #include <unistd.h>
11 int main()
12 {
13
14 uid_t realId, effectiveId;
15 struct passwd *realUser, *effUser;
16
17 realId = getuid(); // realId = 4471
18 effectiveId = geteuid(); //effectiveId = 3585
19
20 printf("Real ID is %i and Effective ID is %i\n", (int)realId, (int)effectiveId);
21 //prints 4472 and 3585, respectively
22
23 realUser = getpwuid(realId);
24 char *realUserName = realUser->pw_name; //realUserName = pmn
25
26 printf("Real ID (name) at this point is %s\n", realUserName);
27 // prints pmn.
28
29 /*
30 *********************************************************
31 * *
32 * everything works as expected up to this point *
33 * *
34 *********************************************************
35 */
36
37 // The value obtained from this call is not used anywhere in this program
38 effUser = getpwuid(effectiveId);
39 printf("\nCalled getpwuid with the effectiveId\n\n");
40
41 printf("Real ID is %i and Effective ID is %i\n", (int)realId, (int)effectiveId);
42 //prints 4472 and 3585, respectively
43
44 printf("Real ID (name) at this point is %s.\n", realUserName);
45 // Expect to still see 'pmn' printed; though see 'cadmn' as the output!
46 // Why does this happen?
47
48 return 0;
49 }
50
Output:
pmn#rhel /tmp/temp > id pmn
uid=4471(pmn) gid=1000(nusers) groups=1000(nusers)
pmn#rhel /tmp/temp >
pmn#rhel /tmp/temp > id cadmn
uid=3585(cadmn) gid=401(cusers) groups=401(cusers)
pmn#rhel /tmp/temp >
pmn#rhel /tmp/temp > ls -l ./test
-r-sr-xr-x 1 cadmn cusers 9377 Dec 24 19:48 ./test
pmn#rhel /tmp/temp >
pmn#rhel /tmp/temp > ./test
Real ID is 4471 and Effective ID is 3585
Real ID (name) at this point is pmn
Called getpwuid with the effectiveId
Real ID is 4471 and Effective ID is 3585
Real ID (name) at this point is cadmn.
pmn#rhel /tmp/temp >
The behaviour you observe is the expected one.
The structure referenced by the return value of getpwuid() is defined statically internal to the latter, so it is expected to be filled (and with this overwritten) for each call to getpwuid().
This line
char * realUserName = realUser->pw_name;
just stores a reference to a value held by this statically internal structure. This value is also overwritten if the statically internal structure is overwritten by the next call to getpwuid().
To get around this there are two possibilities:
Use te reentrant version of getpwuid() which is getpwuid_r(). To be able to use it, add
#define _POSIX_SOURCE
before the very 1st #include statement in your program's sources.
Create copy of the members you need, pw_namein this case. The can be achieved by for example doing:
char * realUserName = strdup(realUser->pw_name);
Be awre that realUserName is now pointing to dynamically allocated memory, which needs to be free()ed by the program itself if not need anymore. To do so call
free(realUserName);

significance of read_un/lock(&tasklist_lock)

While following path for wait system call, noticed that before calling
do_wait_thread we get hold of tasklist_lock. I am trying to understand the significance
of tasklist_lock & where its appropriate to use.
716 read_lock(&tasklist_lock);
1717 tsk = current;
1718 do {
1719 retval = do_wait_thread(wo, tsk);
1720 if (retval)
1721 goto end;
1722
1723 retval = ptrace_do_wait(wo, tsk);
1724 if (retval)
1725 goto end;
1726
1727 if (wo->wo_flags & __WNOTHREAD)
1728 break;
1729 } while_each_thread(current, tsk);
1730 read_unlock(&tasklist_lock);
I looked at the declaration of tasklist_lock, It is as follows.
/*
251 * This serializes "schedule()" and also protects
252 * the run-queue from deletions/modifications (but
253 * _adding_ to the beginning of the run-queue has
254 * a separate lock).
255 */
256 extern rwlock_t tasklist_lock;
257 extern spinlock_t mmlist_lock;
I am not able to understand where we should use this. Can you please let me know about it.
Appreciate your help.
The loop iterates over each thread of the current task. Holding the tasklist lock ensures that none of those threads disappear while the loop is running.

How fast is mprotect

My question is how fast is mprotect. What will be the difference between mprotecting say 1 MB of contiguous memory as compared to 1 GB of contiguous memory? Of course I can measure the time, but I want to know what goes under the hood.
A quick check on the source seems to indicate that it iterates over the process mappings in the selected region and change their flags. If you protect less than a whole mapping it'll split it into two or three.
So in short it's O(n) where n is the number of times you've called mmap.
You can see all the current maps in /proc/pid/maps
It is O(n) on page count in the region too, because it should change a access bits on all PTEs (Page translation Entries, which describes virtual->physical page mappings in PageTable). Calltree:
mprotect
->
mprotect_fixup
->
change_pte_range
http://lxr.free-electrons.com/source/mm/mprotect.c#L32
47 do {
48 oldpte = *pte;
49 if (pte_present(oldpte)) {
50 pte_t ptent;
51
52 ptent = ptep_modify_prot_start(mm, addr, pte);
53 ptent = pte_modify(ptent, newprot);
54
55 /*
56 * Avoid taking write faults for pages we know to be
57 * dirty.
58 */
59 if (dirty_accountable && pte_dirty(ptent))
60 ptent = pte_mkwrite(ptent);
61
62 ptep_modify_prot_commit(mm, addr, pte, ptent);
63 } else if (PAGE_MIGRATION && !pte_file(oldpte)) {
64 swp_entry_t entry = pte_to_swp_entry(oldpte);
65
66 if (is_write_migration_entry(entry)) {
67 /*
68 * A protection check is difficult so
69 * just be safe and disable write
70 */
71 make_migration_entry_read(&entry);
72 set_pte_at(mm, addr, pte,
73 swp_entry_to_pte(entry));
74 }
75 }
76 } while (pte++, addr += PAGE_SIZE, addr != end);
Note the increment: addr += PAGE_SIZE, addr != end);

Why can't access cocoa control in a pthread?

When I use pthread in cocoa, and want to access cocoa control in pthread function(setBtnState), it doesn't work. What's the problem?
The following is the source code:
AppController.h
1 //
2 // AppController.h
3 // PThreadTest
4 //
5 // Created by zhu on 10-9-5.
6 // Copyright 2010 __MyCompanyName__. All rights reserved.
7 //
8
9 #import <Cocoa/Cocoa.h>
10
11
12 #interface AppController : NSObject {
13 IBOutlet NSButton *btnNew;
14 IBOutlet NSButton *btnEnd;
15 }
16
17 -(IBAction)newThread:(id)sender;
18 -(IBAction)endThread:(id)sender;
19
20 #end
AppController.m
1 //
2 // AppController.m
3 // PThreadTest
4 //
5 // Created by zhu on 10-9-5.
6 // Copyright 2010 __MyCompanyName__. All rights reserved.
7 //
8
9 #import "AppController.h"
10 #import <pthread.h>
11
12
13 #implementation AppController
14
15 struct mydata {
16 pthread_mutex_t mutex;
17 pthread_cond_t cond;
18 int stop;
19 NSButton *btnNew;
20 NSButton *btnEnd;
21 };
22
23 struct mydata adata;
24 struct mydata *ptr;
25
26 void setBtnState(struct mydata *p) {
27 BOOL stop = NO;
28 if (p->stop) {
29 stop = YES;
30 }
31 [p->btnNew setEnabled:stop];
32 [p->btnEnd setEnabled:!stop];
33 }
34
35 void* mythread(void* arg) {
36 NSLog(#"new thread start...");
37 ptr->stop = 0;
38 setBtnState(ptr);
39 pthread_mutex_lock(&ptr->mutex);
40 while (!ptr->stop) {
41 pthread_cond_wait(&ptr->cond, &ptr->mutex);
42 }
43 pthread_mutex_unlock(&ptr->mutex);
44 setBtnState(ptr);
45 NSLog(#"current thread end...");
46 }
47
48 -(id)init {
49 self = [super init];
50 ptr = &adata;
51 pthread_mutex_init(&ptr->mutex, NULL);
52 pthread_cond_init(&ptr->cond, NULL);
53 ptr->stop = 0;
54 ptr->btnNew = btnNew;
55 ptr->btnEnd = btnEnd;
56 return self;
57 }
58
59 -(IBAction)newThread:(id)sender {
60 pthread_t pid;
61 pthread_create(&pid, NULL, mythread, NULL);
62 }
63
64 -(IBAction)endThread:(id)sender {
65 pthread_mutex_lock(&ptr->mutex);
66 ptr->stop = 1;
67 pthread_mutex_unlock(&ptr->mutex);
68 pthread_cond_signal(&ptr->cond);
69 }
70
71 #end
Thanks to Chris. In the backgroud thread in order to update control's state I use performSelectorOnMainThread to communicate with the main UI thread.
But when btnEnd is pressed, the debugger console show the following infomation:
2010-09-12 23:36:29.255 PThreadTest[1888:a0f] -[AppController setBtnState]: unrecognized selector sent to instance 0x100133030
Why it doesn't work after I updated AppController.m as the following:
1 //
2 // AppController.m
3 // PThreadTest
4 //
5 // Created by zhu on 10-9-5.
6 // Copyright 2010 __MyCompanyName__. All rights reserved.
7 //
8
9 #import "AppController.h"
10 #import <pthread.h>
11
12
13 #implementation AppController
14
15 struct mydata {
16 pthread_mutex_t mutex;
17 pthread_cond_t cond;
18 int stop;
19 NSButton *btnNew;
20 NSButton *btnEnd;
21 id obj;
22 };
23
24 struct mydata adata;
25 struct mydata *ptr;
26
27 void* mythread(void* arg) {
28 NSLog(#"new thread start...");
29 ptr->stop = 0;
30 pthread_mutex_lock(&ptr->mutex);
31 while (!ptr->stop) {
32 pthread_cond_wait(&ptr->cond, &ptr->mutex);
33 }
34 pthread_mutex_unlock(&ptr->mutex);
35 [ptr->obj performSelectorOnMainThread:#selector(setBtnState) withObject:#"YES" waitUntilDone:NO];
36 NSLog(#"current thread end...");
37 }
38
39 -(void)setBtnState:(id)aobj {
40 BOOL stop = NO;
41 if ([aobj isEqualToString:#"YES"]) {
42 stop = YES;
43 }
44 [btnNew setEnabled:stop];
45 [btnEnd setEnabled:!stop];
46 }
47
48 -(id)init {
49 self = [super init];
50 ptr = &adata;
51 pthread_mutex_init(&ptr->mutex, NULL);
52 pthread_cond_init(&ptr->cond, NULL);
53 ptr->stop = 0;
54 ptr->obj = self;
55 // ptr->btnNew = btnNew;
56 // ptr->btnEnd = btnEnd;
57 return self;
58 }
59
60 - (void)awakeFromNib {
61 ptr->btnNew = btnNew;
62 ptr->btnEnd = btnEnd;
63 }
64
65 -(IBAction)newThread:(id)sender {
66 [self setBtnState:#"NO"];
67 pthread_t pid;
68 pthread_create(&pid, NULL, mythread, NULL);
69 }
70
71 -(IBAction)endThread:(id)sender {
72 pthread_mutex_lock(&ptr->mutex);
73 ptr->stop = 1;
74 pthread_mutex_unlock(&ptr->mutex);
75 pthread_cond_signal(&ptr->cond);
76 }
77
78 #end
79
You should only interact with your UI from the main thread, not from background threads.
This isn't just a matter of locking around interaction with your own controls; your controls may interact with other objects in the UI behind your back. (For example, your button may interact with your window.) This can lead to race conditions, deadlocks, invalid/mixed state, and other concurrency problems.
When you're doing some processing work on a background thread, and it needs to communicate results (whether intermediate or final) to the user, it will need to push that communication through the main thread. There are a few mechanisms in Cocoa with which to do this:
The -performSelectorOnMainThread:withObject:waitUntilDone: method lets you say "run this other method on the main thread," optionally waiting until it's done. You should almost never pass YES for the waitUntilDone: argument though, as that's a recipe for deadlock.
Starting in Mac OS X 10.6 and iOS 4.0, there is +[NSOperationQueue mainQueue], which returns an instance of NSOperationQueue associated with the main thread: You can place operations on this queue and they will run on the main thread.
This is useful if you're using several operations on a background queue and you have some "finishing" operation to perform on the main thread that depends on all of them. You can just use NSOperation's dependency mechanism to set up dependencies between them even though they're on different queues.
You can either subclass NSOperation or use Objective-C blocks for the bodies of your operations (via +[NSBlockOperation blockOperationWithBock:]).
Also starting in Mac OS X 10.6 and iOS 4.0, there is Grand Central Dispatch which lets you perform a block on a main queue associated with the main thread. It's a slightly less verbose API than NSOperation, but at the cost of not having direct support for dependencies and some of the other features of NSOperationQueue, and of being built in plain C rather than in Objective-C.
One thing to remember when using any of these mechanisms is that you must not interact with the same data from multiple threads at the same time without appropriate concurrency control (such as locking, or the use of specialized lock-free data structures and primitives). You can't get away with "Oh, I'm just reading, so I don't need to take a lock," or "Oh, I just need to enable a button, I don't really need to push that to the main thread."
A good way to avoid this issue is to do as much work as possible by batching it up into discrete units, processing those units in the background, and then relaying the results to the main thread.
So your threaded code is not written like this:
spin off a thread to:
lock the document
get some data from the document
unlock the document
process the data
tell the main thread whether to enable or disable the document's Foo button
Instead your threaded code is written like this:
get some data from the document
spin off a thread to:
process the data
then tell the main thread the result of the processing
on the main thread:
determine from the result of the processing whether to enable or disable the document's Foo button
The difference is that the latter is written in terms of the units of work being done, rather than in terms of the user interface, and will be both easier to understand and more robust in the face of your application's life (such as adding features) - it's essentially "MVC applied to threads."
You want to move:
ptr->btnNew = btnNew;
ptr->btnEnd = btnEnd;
... to awakeFromNib if you are using outlets and loading from a nib file. There is no guarantee that the outlets will be resolved until awakeFromNib is called and init is called before awakeFromNib.
I'm not certain you can use pthreads to communicate with the UI. I'm pretty sure you have to use NSThread and notifications to talk to controls on the main thread.

Resources