Why does wrapping printf with ld fail when there's a newline? - c

I'm trying to intercept calls to printf using ld's -wrap option. I have the two files:
main.c:
#include <stdio.h>
int main(void) {
printf("printing\n");
printf("printing");
}
printf_wrapper.c:
int __real_printf(const char *format, ...);
int __wrap_printf(const char *format, ...) {
(void)format;
return __real_printf("WRAPPED\n");
}
And I compile with the following command:
gcc -Wl,-wrap,printf *.c
When I run the resulting a.out binary, I get this output:
printing
WRAPPED
Why does the wrapping fail if there's a newline in the string? I checked my system's stdio.h and printf isn't a macro. This is with gcc 5.3.0

Use the -fno-builtin option to tell gcc not to mess around with some specified functions. So, if you added -fno-builtin-printf it should work. In general it may cause some problems that would have been caught by the compiler to be missed. See the gcc docs for details, e.g. https://gcc.gnu.org/onlinedocs/gcc-4.2.2/gcc/C-Dialect-Options.html

Related

No output from split up source, but no warnings either, when omitting an included file

I ran into an issue invoking gcc where if I omit a library .c file, I got no output from the binary (unexpected behavior change) but since this is a missing dependency, I kind of expected the compile to fail (or at least warn)...
Example for this issue is from Head First C page 185 (but is not errata, see my compile mis-step below):
encrypt.h:
void encrypt(char *message);
encrypt.c:
#include "encrypt.h"
void encrypt(char *message)
{
// char c; errata
while (*message) {
*message = *message ^ 31;
message++;
}
}
message_hider.c:
#include <stdio.h>
#include "encrypt.h"
int main() {
char msg[80];
while (fgets(msg, 80, stdin)) {
encrypt(msg);
printf("%s", msg);
}
}
NOW, everything works fine IF I faithfully compile as per exercise instruction:
gcc message_hider.c encrypt.c -o message_hider
... but bad fortune led me to compile only the main .c file, like so:
$ gcc message_hider.c -o message_hider
This surprisingly successfully builds, even if I added -Wall -Wextra -Wshadow -g.
Also surprisingly, it silently fails, with no output from encrypt() function:
$ ./message_hider < ./encrypt.h
$
my gcc is:
$ /usr/bin/gcc --version
Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Target: x86_64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Mindful that even with a Makefile, I could "still" end up with a missing .c file due to a mistake in the recipe.
Q: Is it possible to force a hard error if I forget to tell gcc about a .c file?
As I noted in a (misspelled) comment:
There is probably a function encrypt() in the system library.
On a Mac, man -s 3 encrypt shows:
CRYPT(3) BSD Library Functions Manual CRYPT(3)
NAME
crypt, encrypt, setkey -- DES encryption
SYNOPSIS
#include <unistd.h>
char *
crypt(const char *key, const char *salt);
void
encrypt(char *block, int edflag);
#include <stdlib.h>
void
setkey(const char *key);
…
The encrypt() and setkey() functions are part of POSIX, so they'll be available on most POSIX-like systems. Curiously, as shown in the manual page extract, the functions are declared in separate headers — <unistd.h> for encrypt() and
<stdlib.h> for setkey(). There's probably a good (enough) historical reason for the disconnect.
You should have received a compiler warning about the function being undeclared — if you didn't, you are presumably compiling using the C90 standard. That is very old and should not still be being taught; you need to be learning C11 or C18 (almost the same).
Since C99, the C standard requires functions to be declared before use — you can define a static function without pre-declaring it, but all other functions (except main()) should be declared before they are used or defined. You can use GCC compiler warning options such as -Wmissing-prototypes -Wstrict-prototypes (along with -Wold-style-declaration and -Wold-style-definition) to trigger warnings. Of these, -Wold-style-declaration is enabled by -Wextra (and none by -Wall). Be aware: as noted in the comments, clang does not support -Wold-style-declaration though true GCC (not Apple's clang masquerading as gcc) does support it.

No parameter checking in GCC

I have written a C program, which consists below given three files in same directory
main.c
#include<stdio.h>
#include "test.h"
int main()
{
int b=0;
b = test_add(3,2);
printf("Added: b=%d\n\n",b);
return 0;
}
test.h
int test_add(int a, int b);
test.c
int test_add(int a, int b, int c)
{
return a+b+c;
}
I am compiling the program using below command:
$gcc -Wall -Wextra main.c test.c
It compiles successfully. I can see there is mismatch in number of arguments of calling function and its actual definition. Compiler doesn't give any warning/error for such problem. How can this type of errors be reported by compiler?
This shows one of the oddities of the C standard. It allows entities such as functions to be undefined.
The actual error is that you did not
#include "test.h"
in you test.c file.
That means that the main file only sees the version of the function with three parameters. When it reaches the function call, it implicitly declares the function with two parameters.
When you run it, you get bogus values for b. I am guessing the superuser's password could somehow be in there ;)
If you add the include directive, you get an error at compile time.
What worries me, that there is no warning, not even with -Wall -Wextra -pedantic.

Detect -nostdlib or just detect whether stdlib is available or not

I have a homework assignment that requires us to open, read and write to file using system calls rather than standard libraries. To debug it, I want to use std libraries when test-compiling the project. I did this:
#ifdef HOME
//Home debug prinf function
#include <stdio.h>
#else
//Dummy prinf function
int printf(const char* ff, ...) {
return 0;
}
#endif
And I compile it like this: gcc -DHOME -m32 -static -O2 -o main.exe main.c
Problem is that I with -nostdlib argument, the standard entry point is void _start but without the argument, the entry point is int main(const char** args). You'd probably do this:
//Normal entry point
int main(const char** args) {
_start();
}
//-nostdlib entry point
void _start() {
//actual code
}
In that case, this is what you get when you compile without -nostdlib:
/tmp/ccZmQ4cB.o: In function `_start':
main.c:(.text+0x20): multiple definition of `_start'
/usr/lib/gcc/i486-linux-gnu/4.7/../../../i386-linux-gnu/crt1.o:(.text+0x0): first defined here
Therefore I need to detect whether stdlib is included and do not define _start in that case.
The low-level entry point is always _start for your system. With -nostdlib, its definition is omitted from linking so you have to provide one. Without -nostdlib, you must not attempt to define it; even if this didn't get a link error from duplicate definition, it would horribly break the startup of the standard library runtime.
Instead, try doing it the other way around:
int main() {
/* your code here */
}
#ifdef NOSTDLIB_BUILD /* you need to define this with -D */
void _start() {
main();
}
#endif
You could optionally add fake arguments to main. It's impossible to get the real ones from a _start written in C though. You'd need to write _start in asm for that.
Note that -nostdlib is a linker option, not compile-time, so there's no way to automatically determine at compile-time that that -nostdlib is going to be used. Instead just make your own macro and pass it on the command line as -DNOSTDLIB_BUILD or similar.

How can I keep gcc -O2 from optimizing putchar out?

I have an application that uses a custom putchar(); which until now has been working fine.
I bumped up the optimization level of the application to -O2, and now my putchar isn't used.
I already use -fno-builtin, and based on some googling I added -fno-builtin-putchar to my CFLAGS, but that didn't matter.
Is there a "correct" way to get around this or do I have to go into my code and add something like
#define putchar myputchar
to be able to use -O2 and still pull in my own putchar() function?
edit--
Since my original post of this question, I stumbled on -fno-builtin-functions=putchar, as yet another gcc commandline option. Both this and the one above are accepted by gcc, but don't seem to have any noticeable effect.
edit more--
Experimenting further I see that gcc swallows -fno-builtin-yadayada also, so apparently the options parsing at the gcc front end is just passing the text after the second dash to some lower level which ignores it.
more detail:
Three files try1.c, try2.c and makefile...
try1.c:
#include <stdio.h>
int
main(int argc, char *argv[])
{
putchar('a');
printf("hello\n");
return(0);
}
try2.c:
#include <stdio.h>
int
putchar(int c)
{
printf("PUTCHAR: %c\n",c);
return(1);
}
makefile:
OPT=
try: try1.o try2.o
gcc -o try try1.o try2.o
try1.o: try1.c
gcc -o try1.o $(OPT) -c try1.c
try2.o: try2.c
gcc -o try2.o $(OPT) -c try2.c
clean:
rm -f try1.o try2.o try
Here's the output:
Notice that without optimization it uses the putchar I provided; but with -O2 it gets it from some other "magic" place...
els:make clean
rm -f try1.o try2.o try
els:make
gcc -o try1.o -c try1.c
gcc -o try2.o -c try2.c
gcc -o try try1.o try2.o
els:./try
PUTCHAR: a
hello
els:
els:
els:
els:make clean
rm -f try1.o try2.o try
els:make OPT=-O2
gcc -o try1.o -O2 -c try1.c
gcc -o try2.o -O2 -c try2.c
gcc -o try try1.o try2.o
els:./try
ahello
els:
Ideally, you should produce an MCVE (Minimal, Complete, Verifiable Example) or
SSCCE (Short, Self-Contained, Correct Example) — two names (and links) for the same basic idea.
When I attempt to reproduce the problem, I created:
#include <stdio.h>
#undef putchar
int putchar(int c)
{
fprintf(stderr, "%s: 0x%.2X\n", __func__, (unsigned char)c);
return fputc(c, stdout);
}
int main(void)
{
int c;
while ((c = getchar()) != EOF)
putchar(c);
return 0;
}
When compiled with GCC 4.9.1 on Mac OS X 10.9.4 under both -O2 and -O3, my putchar function was called:
$ gcc -g -O2 -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Werror pc.c -o pc
$ ./pc <<< "abc"
putchar: 0x61
putchar: 0x62
putchar: 0x63
putchar: 0x0A
abc
$
The only thing in the code that might be relevant to you is the #undef putchar which removes the macro override for the function.
Why try1.c doesn't call your putchar() function
#include <stdio.h>
int
main(int argc, char *argv[])
{
putchar('a');
printf("hello\n");
return(0);
}
The function putchar() may be overridden by a macro in <stdio.h>. If you wish to be sure to call a function, you must undefine the macro.
If you don't undefine the macro, that will override anything you do. Hence, it is crucial that you write the #undef putchar (the other changes are recommended, but not actually mandatory):
#include <stdio.h>
#undef putchar
int main(void)
{
putchar('a');
printf("hello\n");
return(0);
}
Note that putchar() is a reserved symbol. Although in practice you will get away with using it as a function, you have no grounds for complaint if you manage to find an implementation where it does not work. This applies to all the symbols in the standard C library. Officially, therefore, you should use something like:
#include <stdio.h>
#undef putchar
extern int put_char(int c); // Should be in a local header
#define putchar(c) put_char(c) // Should be in the same header
int main(void)
{
putchar('a');
printf("hello\n");
return(0);
}
This allows you to leave your 'using' source code unchanged (apart from including a local header — but you probably already have one to use). You just need to change the implementation to use the correct local name. (I'm not convinced that put_char() is a good choice of name, but I dislike the my_ prefix, for all it is a common convention in answers.)
ISO/IEC 9899:2011 §7.1.4 Use of library functions
Each of the following statements applies unless explicitly stated otherwise in the detailed
descriptions that follow: …
Any function
declared in a header may be additionally implemented as a function-like macro defined in
the header, so if a library function is declared explicitly when its header is included, one
of the techniques shown below can be used to ensure the declaration is not affected by
such a macro. Any macro definition of a function can be suppressed locally by enclosing
the name of the function in parentheses, because the name is then not followed by the left
parenthesis that indicates expansion of a macro function name. For the same syntactic
reason, it is permitted to take the address of a library function even if it is also defined as
a macro.185) The use of #undef to remove any macro definition will also ensure that an
actual function is referred to. Any inv ocation of a library function that is implemented as
a macro shall expand to code that evaluates each of its arguments exactly once, fully
protected by parentheses where necessary, so it is generally safe to use arbitrary
expressions as arguments.186) Likewise, those function-like macros described in the
following subclauses may be invoked in an expression anywhere a function with a
compatible return type could be called.187)
185) This means that an implementation shall provide an actual function for each library function, even if it
also provides a macro for that function.
186) Such macros might not contain the sequence points that the corresponding function calls do.
187) Because external identifiers and some macro names beginning with an underscore are reserved,
implementations may provide special semantics for such names. For example, the identifier
_BUILTIN_abs could be used to indicate generation of in-line code for the abs function. Thus, the
appropriate header could specify
#define abs(x) _BUILTIN_abs(x)
for a compiler whose code generator will accept it.
In this manner, a user desiring to guarantee that a given library function such as abs will be a genuine
function may write
#undef abs
whether the implementation’s header provides a macro implementation of abs or a built-in
implementation. The prototype for the function, which precedes and is hidden by any macro
definition, is thereby revealed also.
Judging from what you observe, in one set of headers, putchar() is not defined as a macro (it does not have to be, but it may be). And switching compilers/libraries means that now that putchar() is defined as a macro, the missing #undef putchar means that things no longer work as before.

Embedding binary blobs using gcc mingw

I am trying to embed binary blobs into an exe file. I am using mingw gcc.
I make the object file like this:
ld -r -b binary -o binary.o input.txt
I then look objdump output to get the symbols:
objdump -x binary.o
And it gives symbols named:
_binary_input_txt_start
_binary_input_txt_end
_binary_input_txt_size
I then try and access them in my C program:
#include <stdlib.h>
#include <stdio.h>
extern char _binary_input_txt_start[];
int main (int argc, char *argv[])
{
char *p;
p = _binary_input_txt_start;
return 0;
}
Then I compile like this:
gcc -o test.exe test.c binary.o
But I always get:
undefined reference to _binary_input_txt_start
Does anyone know what I am doing wrong?
In your C program remove the leading underscore:
#include <stdlib.h>
#include <stdio.h>
extern char binary_input_txt_start[];
int main (int argc, char *argv[])
{
char *p;
p = binary_input_txt_start;
return 0;
}
C compilers often (always?) seem to prepend an underscore to extern names. I'm not entirely sure why that is - I assume that there's some truth to this wikipedia article's claim that
It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support
But it strikes me that if underscores were prepended to all externs, then you're not really partitioning the namespace very much. Anyway, that's a question for another day, and the fact is that the underscores do get added.
From ld man page:
--leading-underscore
--no-leading-underscore
For most targets default symbol-prefix is an underscore and is defined in target's description. By this option it is possible to disable/enable the default underscore symbol-prefix.
so
ld -r -b binary -o binary.o input.txt --leading-underscore
should be solution.
I tested it in Linux (Ubuntu 10.10).
Resouce file:
input.txt
gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 [generates ELF executable, for Linux]
Generates symbol _binary__input_txt_start.
Accepts symbol _binary__input_txt_start (with underline).
i586-mingw32msvc-gcc (GCC) 4.2.1-sjlj (mingw32-2) [generates PE executable, for Windows]
Generates symbol _binary__input_txt_start.
Accepts symbol binary__input_txt_start (without underline).
Apparently this feature is not present in OSX's ld, so you have to do it totally differently with a custom gcc flag that they added, and you can't reference the data directly, but must do some runtime initialization to get the address.
So it might be more portable to make yourself an assembler source file which includes the binary at build time, a la this answer.

Resources