I'm trying to write a clone of ghostscript and I can't figure out how they make it so you can type into the graphics window and have the keystrokes show up in the terminal window as though you'd typed them there to begin with. So having received the KeyRelease Event, can I stuff the char into stdin somehow, to be read with normal file-reading code? Or do I have to make my own internal buffer in front of stdin so I can hack new chars into it? Or is there some simple way to map keyboard events from my application window to Xterm?
I'm willing to do the work, but I don't even know what I'm looking for here. Help?!!
I don't think gs does this (at least on linux).
I tried it by running gs on my Linux box over an SSH session and switching focus to the X11 window that pops up with the rendered image (tiger); the keys I pressed there did NOT go to the application on the remote host.
The end of the strace output shows GS waiting on stdin (the read on fd 0):
read(3, " } if\n psuserparams readonly p"..., 4096) = 3258
brk(0x1124000) = 0x1124000
read(3, "", 4096) = 0
close(3) = 0
munmap(0x7f8ccaee5000, 4096) = 0
poll([{fd=4, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=4, revents=POLLOUT}])
writev(4, [{"+\2\1\0", 4}, {NULL, 0}, {"", 0}], 3) = 4
poll([{fd=4, events=POLLIN}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
read(4, "\1\1'\0\0\0\0\0\1\0\200\0\0\0\0\0\1\0\0\0\264\2\0\0008\0A\2\4\0\0\0", 4096) = 32
read(4, 0xc9bd54, 4096) = -1 EAGAIN (Resource temporarily unavailable)
fstat(1, {st_mode=S_IFREG|0644, st_size=143204, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8ccaee5000
write(1, "GS>", 3GS>) = 3
read(0, ...unfinished ...
I had to switch focus back to the ssh window in order for the key press (Ctrl-C) to take effect. I had tried "quit" as well as ctrl-C when the focus was the image X11 window.
I don't know the answer, but I know the way to find it. Run ghostscript under strace and watch what it's doing. This is usually a lot easier and more informative than trying to read source.
Eureka!
in the file gdevxini.c
wm_hints.flags = InputHint;
wm_hints.input = False;
XSetWMHints(xdev->dpy, xdev->win, &wm_hints); /* avoid input focus */
Edit: Now that I know what it looks like, I was able to find some documentation:
The input member is used to communicate to the window manager the input focus model used by the application.... Applications that never expect any keyboard input ... should set this member to False.
--X Window System: C Library and Protocol Reference, p.282
I am trying to compile some stuff for an embedded system that I'm trying to build upon. However, when running binaries I've built for the system I am met with failed to map segment from shared object: Invalid argument. This happens to anything that does more than simple loops and prints.
In the name of debugging this, I've built a binary (testbin) that simply lists the contents of the current directory:
#include<stdio.h>
#include<dirent.h>
int main(void)
{
DIR *d;
struct dirent *dir;
d = opendir(".");
if (d)
{
while ((dir = readdir(d)) != NULL)
{
printf("%s\n", dir->d_name);
}
closedir(d);
}
return(0);
}
As the system is somewhat uncooperative in terms of libraries, I have to override it all, so I run programs this way: ./ld-linux.so.3 ./testbin which produces:
./testbin: error while loading shared libraries: ./shitls: failed to map segment from shared object: Invalid argument
...as suspected, testbin also struggles as soon as external libraries are needed. I managed to statically compile strace for the system, which somehow runs just fine. So I'm fairly sure that the culprit is whenever libc.so is needed.
When running strace on my testbin via ld-linux.so.3 like this: ./strace ./ld-linux.so.3 ./testbin
..I am presented with something to work with:
execve("./ld-linux.so.3", ["./ld-linux.so.3", "./testbin"], 0xbee2b554 /* 74 vars */) = 0
brk(NULL) = 0xb873c000
open("./testbin", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\2\0(\0\1\0\0\0\311\203\0\0004\0\0\0"..., 512) = 512
lseek(3, 1920, SEEK_SET) = 1920
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1240) = 1240
lseek(3, 1601, SEEK_SET) = 1601
read(3, "A2\0\0\0aeabi\0\1(\0\0\0\0057-A\0\6\n\7A\10\1\t\2\n\4\22"..., 51) = 51
fstat64(3, {st_mode=S_IFREG|0777, st_size=5379, ...}) = 0
getcwd("/mnt/output", 128) = 22
mmap2(0x5241ae00, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = -1 EINVAL (Invalid argument)
close(3) = 0
writev(2, [{iov_base="./testbin", iov_len=9}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="./testbin", iov_len=9}, {iov_base=": ", iov_len=2}, {iov_base="failed to map segment from share"..., iov_len=40}, {iov_base=": ", iov_len=2}, {iov_base="Invalid argument", iov_len=16}, {iov_base="\n", iov_len=1}], 10./testbin: error while loading shared libraries: ./testbin: failed to map segment from shared object: Invalid argument
) = 119
exit_group(127) = ?
+++ exited with 127 +++
...but I'm not fluent in reading strace output. Any pointer as to where the running of the binary croaks?
It looks like your development sysroot does not match the sysroot libs on the target. One option is to locate the proper cross-development tools for your chip. Another option is to pull the rootfs from the target and try to build against it inside your qemu setup. For example:
LDFLAGS = -L/rootfs-mipsel/mipsel-linux-gnu -lusb-1.0 -ludev
Also, the cross-compiler can take the desired rootfs as argument: --sysroot=/opt/rootfs-mipsel.
Your current development tools inside qemu may use a different ld.so version than that of the target. After compilation it can be edited with patchelf:
patchelf --set-interpreter /lib/ld.so.1 a.out
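Before patching anything, it can help to check which interpreter a binary actually requests. The following is a minimal sketch, not part of patchelf: `read_elf_interp` is a hypothetical helper name, and it assumes the file has the same ELF class (32-bit or 64-bit) and byte order as the machine running the check, for example when it is run on the target itself.

```c
#include <elf.h>
#include <link.h>    /* ElfW() selects Elf32_* or Elf64_* for the native word size */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the PT_INTERP path of an ELF file into buf.
   Returns 0 on success, -1 on any error or if the file has no PT_INTERP
   segment (for example, a statically linked binary). */
int read_elf_interp(const char *path, char *buf, size_t buflen)
{
    FILE *f = fopen(path, "rb");
    int rc = -1;
    ElfW(Ehdr) eh;

    if (f == NULL)
        return -1;
    if (fread(&eh, sizeof eh, 1, f) != 1)
        goto out;
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        goto out;                               /* not an ELF file */
    if (eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32))
        goto out;                               /* assumption: native class only */
    if (eh.e_phentsize != sizeof(ElfW(Phdr)))
        goto out;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        ElfW(Phdr) ph;
        long off = (long)(eh.e_phoff + (size_t)i * eh.e_phentsize);

        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1)
            goto out;
        if (ph.p_type != PT_INTERP)
            continue;
        if (ph.p_filesz == 0 || ph.p_filesz > buflen)
            goto out;
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
            fread(buf, 1, ph.p_filesz, f) != ph.p_filesz)
            goto out;
        buf[ph.p_filesz - 1] = '\0';            /* the stored path is NUL-terminated */
        rc = 0;
        break;
    }
out:
    fclose(f);
    return rc;
}
```

Calling `read_elf_interp("./testbin", buf, sizeof buf)` and comparing the result with the loader path that exists on the target shows whether the `patchelf --set-interpreter` step is needed at all.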
According to tutorial for libuv, making subsequent calls to uv_write should not cause one write to block another write (my understanding was that they were supposed to occur on separate threads).
However, I've run the example code under strace and it seems that this isn't the case. Having run similar examples using uv_fs_write, I can see that the calls to write occur on separate threads and don't block each other.
Can someone explain what the expected behaviour is for uv_write and if it is supposed to be different from uv_fs_write when the underlying stream is a file handle?
cat Makefile | strace ./uvtee/uvtee ~/out.txt
open("/home/james/out.txt", O_RDWR|O_CREAT|O_CLOEXEC, 0644) = 11
ioctl(11, FIONBIO, [1]) = 0
epoll_ctl(6, EPOLL_CTL_ADD, 7, {EPOLLIN, {u32=7, u64=7}}) = 0
epoll_ctl(6, EPOLL_CTL_ADD, 9, {EPOLLIN, {u32=9, u64=9}}) = 0
epoll_ctl(6, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(6, [{EPOLLIN|EPOLLHUP, {u32=0, u64=0}}], 1024, -1) = 1
brk(0xb3e000) = 0xb3e000
read(0, "examples=\\\n\thelloworld\\\n\tidle-ba"..., 65536) = 1965
write(1, "examples=\\\n\thelloworld\\\n\tidle-ba"..., 1965) = 1965
write(11, "examples=\\\n\thelloworld\\\n\tidle-ba"..., 1965) = 1965
Full code can be found here.
The fs operations run on a thread pool. This is because there is no good and portable way to do non-blocking IO for files. Because we use a thread pool, write operations can indeed run in parallel. That's why uv_fs_write takes an offset parameter, so multiple threads can write without stepping on top of each other.
A notable exception to this is macOS, where a global lock is used to serialize uv_fs_write operations.
Now, network IO is totally different. We use an event loop (as you know) and write operations are queued, so they will be written in the order they were sent and when the underlying socket is writable.
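To illustrate why an explicit offset makes parallel file writes safe, here is a minimal sketch using plain POSIX pwrite rather than libuv itself. The helper name `write_at` is made up for this example; the point is only that each write carries its own position, so the order in which the writes complete does not matter.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes of data at an explicit file offset.
   The shared file position is never used, so calls from different threads
   (or calls issued in any order) cannot overwrite each other as long as
   their offset ranges do not overlap.
   Returns 0 on success, -1 on error. */
int write_at(int fd, const char *data, size_t len, off_t offset)
{
    while (len > 0) {
        ssize_t n = pwrite(fd, data, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1;             /* no progress; give up instead of looping */
        data += n;                 /* short write: advance and keep going */
        len -= (size_t)n;
        offset += n;
    }
    return 0;
}
```

For instance, `write_at(fd, "world", 5, 6)` followed by `write_at(fd, "hello ", 6, 0)` leaves "hello world" in the file even though the second half was written first.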
I work with C and I make apache modules and I work with strace as my main tool for debugging timings. Here's code I threw together. My apologies if variable names do not meet standards.
#include <stdio.h>
int main(){
long ct2,ct; //counters
int a=0; //dummy value
FILE *f0=fopen("/","r"); //measuring point
ct2=10;
while (--ct2>0){
ct=5000000;
while (--ct>0){
if (!!a){
printf("%d",a);
}
}
}
FILE *f=fopen("/","r"); //measuring point
ct2=10;
while (--ct2>0){
ct=5000000;
while (--ct>0){
if (a){
printf("%d",a);
}
}
}
FILE *f2=fopen("/","r"); //measuring point
return 0;
}
This code does compile. I then run it through strace (by typing in a terminal: strace -r -ttt ./a.out) and I see:
0.000000 execve("./a.out", ["./a.out"], [/* 47 vars */]) = 0
0.000315 brk(0) = 0x804a000
0.000124 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
0.000144 open("/etc/ld.so.cache", O_RDONLY) = 3
0.000116 fstat64(3, {st_mode=S_IFREG|0644, st_size=139721, ...}) = 0
0.000138 mmap2(NULL, 139721, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7ece000
0.000114 close(3) = 0
0.000109 open("/lib/libc.so.6", O_RDONLY) = 3
0.000113 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360d\1"..., 512) = 512
0.000130 fstat64(3, {st_mode=S_IFREG|0755, st_size=1575187, ...}) = 0
0.000131 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7ecd000
0.000122 mmap2(NULL, 1357360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7d81000
0.000119 mmap2(0xb7ec7000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x146) = 0xb7ec7000
0.000146 mmap2(0xb7eca000, 9776, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7eca000
0.000139 close(3) = 0
0.000112 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d80000
0.000119 set_thread_area({entry_number:-1 -> 6, base_addr:0xb7d806c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
0.000217 mprotect(0xb7ec7000, 4096, PROT_READ) = 0
0.000108 munmap(0xb7ece000, 139721) = 0
0.000174 brk(0) = 0x804a000
0.000099 brk(0x806b000) = 0x806b000
0.000110 open("/", O_RDONLY) = 3
0.203487 open("/", O_RDONLY) = 4
0.202225 open("/", O_RDONLY) = 5
0.000133 exit_group(0) = ?
I can tell right off at the end that:
0.000110 open("/", O_RDONLY) = 3
0.203487 open("/", O_RDONLY) = 4
0.202225 open("/", O_RDONLY) = 5
correspond to the three measuring points I set up.
I want to be able to adjust the measuring point lines in my code so that when I run strace I can find my measuring points like I do now, but where the system makes less intensive operations. I don't see anything else from strace related to my program other than the file calls.
I'm thinking maybe if there was such a thing as a built-in MeasureMe function in C that I would use that in place of the measuring point lines in my code, then strace could output:
0.000110 MeasureMe called in code
0.203487 MeasureMe called in code
0.202225 MeasureMe called in code
Is there any way I can go about this with Strace?
The reason why I'm asking about strace instead of gdb is because I use it to debug requests to my apache server like the person in this video does it, and I'll be able to see apache modules in action:
https://www.youtube.com/watch?v=eF-p--AH37E
Any idea how I can solve this? or will I have to continue to make failed attempts at opening non-existing files?
I gather what you are currently using is open("/",O_RDONLY) [or open("/i_do_not_exist",O_RDONLY)] for a "tracepoint". Unfortunately, because you're using strace, you're constrained to using syscalls. But, there is a way to achieve the effect you want.
What you need/want for a tracepoint that you're manually inserting at various points in your source code is:
Any unique syscall that doesn't harm anything
Is easily distinguishable from real code [even code that may return errors such as opening a file or checking for existence with access]
Minimal overhead / fastest execution
Actually, dup on a bad fildes fills the bill nicely:
dup(-10000);
It will return EBADF. It is easily distinguishable as a tracepoint because most real dup calls that are "bad" will be dup(-1)
You can have as many of these as you want. The actual argument becomes the "tracepoint number":
dup(-10001); // tracepoint 1
...
dup(-10002); // tracepoint 2
...
dup(-10003); // tracepoint 3
The output will look like:
0.000044 dup(-10001) = -1 EBADF (Bad file descriptor)
0.000022 dup(-10002) = -1 EBADF (Bad file descriptor)
0.000019 dup(-10003) = -1 EBADF (Bad file descriptor)
I usually encapsulate this in a macro:
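#include <unistd.h>   /* dup() */

#ifdef DEBUG
#define TRACEPOINT(_tno) tracepoint(_tno)
#else
#define TRACEPOINT(_tno) /**/
#endif

/* Issue a deliberately failing dup() so the call shows up in strace output. */
void
tracepoint(int tno)
{
    int rc = dup(-10000 - tno);   /* always fails with EBADF */
    (void) rc;                    /* the result is intentionally ignored */
}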
Then, I add something like:
TRACEPOINT(1); // initialization phase
...
TRACEPOINT(2); // execution phase
...
TRACEPOINT(3); // cleanup/shutdown
Now, I'll write a perl or python script to read in the source files, extracting the comments for the given tracepoints, and append them to the matching lines in the strace output file:
0.000044 TRACEPOINT(1) initialization phase
0.000022 TRACEPOINT(2) execution phase
0.000019 TRACEPOINT(3) cleanup/shutdown
A more sophisticated version of the post-processing script can do all sorts of things:
keep track of timestamps and append a time difference between one tracepoint and the previous one to the trace line
add file name and line number information to the tracepoint lines
keep track of the number of times a given tracepoint is hit [similar to gdb and breakpoints]
generate summary reports relating to tracepoints
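The core of such a post-processing script is recognizing a tracepoint line in the strace output. I would normally do this in perl or python as mentioned, but here is a minimal sketch of that matching step in C. The function name `tracepoint_from_line` and the `TRACEPOINT_BASE` constant are made up for this example and assume the `dup(-10000 - tno)` convention used above.

```c
#include <stdio.h>
#include <string.h>

#define TRACEPOINT_BASE 10000   /* matches dup(-10000 - tno) in tracepoint() */

/* Hypothetical helper: return the tracepoint number encoded in one line of
   strace output, or -1 if the line is not a tracepoint.
   An ordinary failing dup(-1) is below the base and is therefore ignored. */
int tracepoint_from_line(const char *line)
{
    const char *p = strstr(line, "dup(-");
    long arg;

    if (p == NULL || sscanf(p, "dup(-%ld)", &arg) != 1)
        return -1;
    if (arg < TRACEPOINT_BASE)
        return -1;
    return (int)(arg - TRACEPOINT_BASE);
}
```

A script built around this would read the strace log line by line, call the function on each line, and replace the matching lines with the comment text collected from the source files.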
How do I exit with EXIT_SUCCESS after strict-mode seccomp is set? Is it correct practice to call syscall(SYS_exit, EXIT_SUCCESS); at the end of main?
#include <stdlib.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>
int main(int argc, char **argv) {
prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
//return EXIT_SUCCESS; // does not work
//_exit(EXIT_SUCCESS); // does not work
// syscall(__NR_exit, EXIT_SUCCESS); // (EDIT) This works! Is this the ultimate answer and the right way to exit success from seccomp-ed programs?
syscall(SYS_exit, EXIT_SUCCESS); // (EDIT) works; SYS_exit equals __NR_exit
}
// gcc seccomp.c -o seccomp && ./seccomp; echo "${?}" # I want 0
As explained in eigenstate.org and in SECCOMP (2):
The only system calls that the calling thread is permitted to
make are read(2), write(2), _exit(2) (but not exit_group(2)),
and sigreturn(2). Other system calls result in the delivery
of a SIGKILL signal.
As a result, one would expect _exit() to work, but it's a wrapper function that invokes exit_group(2) which is not allowed in strict mode ([1], [2]), thus the process gets killed.
It's even reported in exit(2) - Linux man page:
In glibc up to version 2.3, the _exit() wrapper function invoked the kernel system call of the same name. Since glibc 2.3, the wrapper function invokes exit_group(2), in order to terminate all of the threads in a process.
The same happens with the return statement: returning from main ends up calling exit_group() as well, so the process is killed in much the same way as with _exit().
Stracing the process will provide further confirmation (to allow this to show up, you have to not set PR_SET_SECCOMP; just comment prctl()) and I got similar output for both non-working cases:
linux12:/home/users/grad1459>gcc seccomp.c -o seccomp
linux12:/home/users/grad1459>strace ./seccomp
execve("./seccomp", ["./seccomp"], [/* 24 vars */]) = 0
brk(0) = 0x8784000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775f000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=97472, ...}) = 0
mmap2(NULL, 97472, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7747000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\226\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
mmap2(NULL, 1739484, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xdd0000
mmap2(0xf73000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xf73000
mmap2(0xf76000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf76000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7746000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7746900, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xf73000, 8192, PROT_READ) = 0
mprotect(0x8049000, 4096, PROT_READ) = 0
mprotect(0x16e000, 4096, PROT_READ) = 0
munmap(0xb7747000, 97472) = 0
exit_group(0) = ?
linux12:/home/users/grad1459>
As you can see, exit_group() is called, explaining everything!
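The difference can also be observed from a parent process without strace. The following is a minimal sketch; the function name `strict_seccomp_exit_status` is made up for this example, and it assumes a Linux system with a C library whose _exit() wrapper calls exit_group, as described above. A child exit status of 2 is used to mean that strict mode could not be enabled in the current environment.

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that enters strict seccomp and then
   terminates either through the raw exit syscall (use_raw_exit != 0) or
   through the C library's _exit() wrapper.
   Returns the child's wait status, or -1 if fork/waitpid failed. */
int strict_seccomp_exit_status(int use_raw_exit)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(2);                 /* strict mode not available here */
        if (use_raw_exit)
            syscall(SYS_exit, 0);     /* exit(2) is allowed in strict mode */
        _exit(0);                     /* wrapper uses exit_group(2): SIGKILL */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

With the raw syscall the parent should see `WIFEXITED(status)` with exit code 0, while the `_exit()` variant should show `WIFSIGNALED(status)` with `WTERMSIG(status) == SIGKILL`.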
Now as you correctly stated, "SYS_exit equals __NR_exit"; for example it's defined in mit.syscall.h:
#define SYS_exit __NR_exit
so the last two calls are equivalent, i.e. you can use the one you like, and the output should be this:
linux12:/home/users/grad1459>gcc seccomp.c -o seccomp && ./seccomp ; echo "${?}"
0
PS
You could of course define a filter yourself and use:
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, filter);
as explained in the eigenstate link, to allow _exit() (or, strictly speaking, exit_group(2)), but do that only if you really need to and know what you are doing.
The problem occurs because the GNU C library's _exit() function uses the exit_group syscall on Linux, when it is available, instead of exit (see sysdeps/unix/sysv/linux/_exit.c for verification), and, as documented in man 2 prctl, the exit_group syscall is not allowed by the strict seccomp filter.
Because the _exit() function call occurs inside the C library, we cannot interpose it with our own version (that would just do the exit syscall). (The normal process cleanup is done elsewhere; in Linux, the _exit() function only does the final syscall that terminates the process.)
We could ask the GNU C library developers to use the exit_group syscall in Linux only when there is more than one thread in the current process, but unfortunately that would not be easy, and even if it were added right now, it would take quite some time for the feature to be available on most Linux distributions.
Fortunately, we can ditch the default strict filter, and instead define our own. There is a small difference in behaviour: the apparent signal that kills the process will change from SIGKILL to SIGSYS. (The signal is not actually delivered, as the kernel does kill the process; only the apparent signal number that caused the process to die changes.)
Furthermore, this is not even that difficult. I did waste a bit of time looking into some GCC macro trickery that would make it trivial to manage the allowed syscalls' list, but I decided it would not be a good approach: the list of allowed syscalls should be carefully considered -- we only add exit_group() compared to the strict filter, here! -- so making it a bit difficult is okay.
The following code, say example.c, has been verified to work on a 4.4 kernel (should work on kernels 3.5 or later) on x86-64 (for both x86 and x86-64, i.e. 32-bit and 64-bit binaries). It should work on all Linux architectures, however, and it does not require or use the libseccomp library.
#define _GNU_SOURCE
#include <stdlib.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/seccomp.h>
#include <linux/filter.h>
#include <stdio.h>
static const struct sock_filter strict_filter[] = {
BPF_STMT(BPF_LD | BPF_W | BPF_ABS, (offsetof (struct seccomp_data, nr))),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_rt_sigreturn, 5, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_read, 4, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_write, 3, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_exit, 2, 0),
BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_exit_group, 1, 0),
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)
};
static const struct sock_fprog strict = {
.len = (unsigned short)( sizeof strict_filter / sizeof strict_filter[0] ),
.filter = (struct sock_filter *)strict_filter
};
int main(void)
{
/* To be able to set a custom filter, we need to set the "no new privs" flag.
The Documentation/prctl/no_new_privs.txt file in the Linux kernel
recommends this exact form: */
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
fprintf(stderr, "Cannot set no_new_privs: %m.\n");
return EXIT_FAILURE;
}
if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &strict)) {
fprintf(stderr, "Cannot install seccomp filter: %m.\n");
return EXIT_FAILURE;
}
/* The seccomp filter is now active.
It differs from SECCOMP_SET_MODE_STRICT in two ways:
1. exit_group syscall is allowed; it just terminates the
process
2. Parent/reaper sees SIGSYS as the killing signal instead of
SIGKILL, if the process tries to do a syscall not in the
explicitly allowed list
*/
return EXIT_SUCCESS;
}
Compile using e.g.
gcc -Wall -O2 example.c -o example
and run using
./example
or under strace to see the syscalls made:
strace ./example
The strict_filter BPF program is really trivial. The first opcode loads the syscall number into the accumulator. The next five opcodes compare it to an acceptable syscall number, and if found, jump to the final opcode that allows the syscall. Otherwise the second-to-last opcode kills the process.
Note that although the documentation refers to sigreturn being the allowed syscall, the actual name of the syscall in Linux is rt_sigreturn. (sigreturn was deprecated in favour of rt_sigreturn ages ago.)
Furthermore, when the filter is installed, the opcodes are copied to kernel memory (see kernel/seccomp.c in the Linux kernel sources), so it does not affect the filter in any way if the data is modified later. Having the structures static const has zero security impact, in other words.
I used static since there is no need for the symbols to be visible outside this compilation unit (or in a stripped binary), and const to put the data into the read-only data section of the ELF binary.
The form of a BPF_JUMP(BPF_JMP | BPF_JEQ, nr, equals, differs) is simple: the accumulator (the syscall number) is compared to nr. If they are equal, then the next equals opcodes are skipped. Otherwise, the next differs opcodes are skipped.
Since the equals cases jump to the very final opcode, you can add new opcodes at the top (that is, just after the initial opcode), incrementing the equals skip count for each one.
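As an illustration of that rule, here is a sketch of the same filter with one extra allowed syscall added at the top. The choice of getpid, the names `extended_filter` and `run_under_extended_filter`, and the use of exit status 2 to mean "the filter could not be installed" are all just for this example. The filter is installed in a forked child so the calling process stays unrestricted.

```c
#define _GNU_SOURCE
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* New entry goes right after the load; its "equals" skip count (6) is one
   more than that of the entry below it (5), so it still lands on ALLOW. */
static const struct sock_filter extended_filter[] = {
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_getpid,       6, 0),   /* added */
    BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_rt_sigreturn, 5, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_read,         4, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_write,        3, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_exit,         2, 0),
    BPF_JUMP(BPF_JMP | BPF_JEQ, SYS_exit_group,   1, 0),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW)
};

static const struct sock_fprog extended = {
    .len = (unsigned short)(sizeof extended_filter / sizeof extended_filter[0]),
    .filter = (struct sock_filter *)extended_filter
};

/* Hypothetical helper: run a child under the filter. The child always calls
   getpid (allowed) and, if call_disallowed is nonzero, getppid (not allowed).
   Returns the child's wait status, or -1 if fork/waitpid failed. */
int run_under_extended_filter(int call_disallowed)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0 ||
            prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &extended) != 0)
            _exit(2);                  /* filter could not be installed */
        syscall(SYS_getpid);           /* allowed by the added entry */
        if (call_disallowed)
            syscall(SYS_getppid);      /* not in the list: killed, SIGSYS */
        _exit(0);                      /* exit_group is allowed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

Calling `run_under_extended_filter(0)` should report a normal exit with code 0, while `run_under_extended_filter(1)` should report termination by SIGSYS.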
Note that printf() will not work after the seccomp filter is installed, because internally, the C library wants to do a fstat syscall (on standard output), and a brk syscall to allocate some memory for a buffer.
I tried to compile and run C code that has the following lines:
FILE *preproc_producer = NULL;
preproc_producer = tmpfile();
// preproc_producer is not NULL here
preproc_producer = freopen(NULL, "r+", preproc_producer);
// preproc_producer is NULL here
However, when running the code, preproc_producer ends up NULL, and the error is "Stale NFS file handle".
What is the issue with the above code?
What is the purpose of the freopen call here? I commented out the freopen line and the rest of the program seems to be working.
I'm using GCC 4.7.2, running Ubuntu 64 12.04 inside a Docker 0.6.7 Linux container. The above code seems to work outside the Docker container.
Update: strace dump:
stat("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
gettimeofday({1385247432, 199732}, NULL) = 0
getpid() = 127
open("/tmp/tmpf9l14HD", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
unlink("/tmp/tmpf9l14HD") = 0
fcntl(3, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
brk(0) = 0xc94000
brk(0xcb5000) = 0xcb5000
fstat(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7afb9d0000
lseek(3, 0, SEEK_CUR) = 0
lstat("/proc/self/fd/3", {st_mode=S_IFLNK|0700, st_size=64, ...}) = 0
munmap(0x7f7afb9d0000, 4096) = 0
open("/proc/self/fd/3", O_RDWR) = -1 ESTALE (Stale NFS file handle)
From the C99 standard:
The freopen function opens the file whose name is the string pointed to by filename
and associates the stream pointed to by stream with it. The mode argument is used just
as in the fopen function.
If filename is a null pointer, the freopen function attempts to change the mode of
the stream to that specified by mode, as if the name of the file currently associated with
the stream had been used. It is implementation-defined which changes of mode are
permitted (if any), and under what circumstances.
So whoever wrote this code probably meant to change the temporary file's open mode from w+b to r+ (which mostly boils down to switching the stream to text mode). Unfortunately, it seems that in your environment it's not possible to change the open mode of a temporary file in that way.
I suppose this may come from the fact that closing a temporary file also deletes it, but it may also be that the glibc implementation of freopen doesn't support this mode change (the manpage doesn't even mention the possibility of passing NULL as the first argument). Your strace dump shows the attempt being made by reopening /proc/self/fd/3, and it is that open call that fails with ESTALE.
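If the only goal is to read back what was just written to the temporary file, the freopen call is not needed: a stream from tmpfile() is already open for update, and the C standard only requires a flush or a repositioning call between writing and reading. Here is a minimal sketch under that assumption; the function name `tmpfile_roundtrip` is made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: write text to a temporary file and read it back
   through the same stream, without freopen().
   Returns 0 on success, -1 on error. out is always NUL-terminated on success. */
int tmpfile_roundtrip(const char *text, char *out, size_t outlen)
{
    FILE *fp;
    size_t n;

    if (outlen == 0)
        return -1;
    fp = tmpfile();                 /* opened for update ("w+b") */
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);                     /* repositioning is what allows the switch
                                       from writing to reading */
    n = fread(out, 1, outlen - 1, fp);
    out[n] = '\0';
    fclose(fp);                     /* this also deletes the temporary file */
    return 0;
}
```

This does not change the stream from binary to text mode, but on Linux the two modes behave the same, which would explain why the program still works with the freopen line commented out.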