Undefined reference to strlcpy and strlcat - c

I am using strlcpy and strlcat in place of strncpy/strncat; however, whenever I compile with
gcc -o Project Project.c it keeps throwing errors saying:
Undefined reference to strlcpy; Undefined reference to 'strlcat'
My code:
#include <bsd/string.h>
strlcpy(path, ARTICLEPATH, sizeof(ARTICLEPATH));
strlcat(path, ARTICLEPATH, sizeof(path));
I have added the library to my file, but it seems to continue to throw me the error:
#include <bsd/string.h>
Is there something else I need to do? Or is there another alternative to strncpy that guarantees null byte termination?
EDIT: As background, I am on ubuntu 20.04

As is (sort of) explained in the libbsd man page, you need to link the libbsd library as well as including the header. So add -lbsd to your command line when linking. For a simple program, you might do
gcc -o prog prog.c -lbsd
Note that the ordering of options is important here, see Why does the order in which libraries are linked sometimes cause errors in GCC? If you put -lbsd before your source file on the command line, it will probably not work.
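On the second part of the question (an alternative that always null-terminates): if linking libbsd is not an option, standard snprintf can do bounded copies. The following is only a minimal sketch, not part of libbsd; the helper names copy_bounded and append_bounded are made up for illustration. Note that in both this sketch and in strlcpy/strlcat, the size argument is the capacity of the destination buffer (e.g. sizeof(path)), not the size of the source.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes).
   dst is always NUL-terminated when dst_size > 0.
   Returns 0 on success, -1 if the text was truncated or dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return -1;
    int n = snprintf(dst, dst_size, "%s", src);
    return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
}

/* Hypothetical helper: append src to the string already stored in dst.
   Returns 0 on success, -1 on truncation or if dst holds no terminator. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL)
        return -1;                      /* dst is not a terminated string */
    size_t used = (size_t)(end - dst);
    return copy_bounded(dst + used, dst_size - used, src);
}
```

With the variables from the question, this would be called as copy_bounded(path, sizeof(path), ARTICLEPATH) followed by append_bounded(path, sizeof(path), ARTICLEPATH).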
The way you asked the question suggests you might have some confusion about the difference between a header and a library, and the roles that each one plays. You may want to read What's the difference between a header file and a library?
(This is almost a duplicate of when I use strlcpy function in c the compilor give me an error, but that question is more generic and some of the answers aren't applicable to Ubuntu specifically, so I thought a separate answer would be useful.)


Prevent my C code from printing (seriously slows down the execution)

I have an issue.
I finally found a way to use an external library to solve my numerical systems. This library automatically prints the matrices. It is fine for dim=5, but for dim=1.000.000, you understand the problem...
Those stray printf calls slow execution down considerably, and I would like to get rid of them. The problem is that I don't know where they are! I looked in every ".h" and ".c" file in my library: they are nowhere to be found.
I suspect they are already compiled into the library itself (superlu.so), so I can't access them.
How can I prevent my C code from printing anything during execution?
Here is my Makefile. I use the libsuperlu-dev library, directly downloaded from Ubuntu. The .so file was already there.
LIB = libsuperlu.so
main: superlu.o read_file.o main.o sample_arrays.o super_csr.o
	cc $^ -o $@ $(LIB)
clean:
	rm *.o
	rm main
Just to explain the LD_PRELOAD method that was mentioned, that I use sometimes precisely for that usage (or, on the contrary to add some printf, for example, when I want to pipe the output of a GUI), here is how you can do a rudimentary version of it
myprint.c:
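/* Replacement printf: ignores its arguments and prints nothing.
   Parameter names are required in definitions before C23. */
int printf(const char *fmt, ...){
    (void)fmt;
    return 0;
}
/* Replacement putchar: swallows the character. */
int putchar(int c){
    (void)c;
    return 0;
}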
Then
gcc -shared -fPIC -std=gnu99 -o myprint.so myprint.c
Then
LD_PRELOAD=./myprint.so ./main
This forces your printf and putchar symbols to be loaded before any other library gets the chance to supply its own. So, no printing occurs. At least none with printf or putchar. You may have to add other functions to the list, such as fprintf, fputc, fputs, puts, ... (GCC often compiles a printf call with a constant string ending in a newline into a call to puts.)
And of course, another problem with overriding the f* functions (and possibly the others too) is that you might also prevent some wanted behavior, such as writing files or interacting with devices.
It may be even worse if the printing is done with the low-level write function. That one you very likely can't afford to override, unless you override it with a function that calls the real write (looked up manually with dlopen/dlsym) and filters out only the calls you want to suppress, based on the target file descriptor (1) or on the content of the written data.
Note: if you want to verify whether libsuperlu.so is responsible for the printing, you can check with nm libsuperlu.so whether it refers to well-known printing functions, such as printf.
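For the low-level write case, a filtering wrapper might look like the sketch below. It is a rough illustration under stated assumptions, not a drop-in solution: it assumes Linux, and it reaches the real write through the raw system call (syscall(SYS_write, ...)) instead of looking it up with dlsym, which is a substitution made here to avoid extra link flags. It drops everything sent to file descriptor 1 and forwards the rest. It only affects code that calls write directly.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Interposed write(): silently discards data aimed at stdout (fd 1)
   and forwards every other descriptor to the kernel unchanged.
   Assumption: Linux, where SYS_write is the write system call number. */
ssize_t write(int fd, const void *buf, size_t count)
{
    if (fd == STDOUT_FILENO)
        return (ssize_t)count;          /* pretend everything was written */
    return (ssize_t)syscall(SYS_write, fd, buf, count);
}
```

Saved as, say, mywrite.c (a hypothetical name), it would be built and preloaded the same way as myprint.so above.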

Where does GCC find printf ? My code worked without any #include

I am a C beginner so I tried to hack around the stuff.
I read stdio.h and I found this line:
extern int printf (const char *__restrict __format, ...);
So I wrote this code and I have no idea why it works.
code:
extern int printf (const char *__restrict __format, ...);
main()
{
printf("Hello, World!\n");
}
output:
sh-5.1$ ./a.out
Hello, World!
sh-5.1$
Where did GCC find the function printf? It also works with other compilers.
I am a beginner in C and I find this very strange.
gcc will link your program, by default, with the c library libc which implements printf:
$ ldd ./a.out
linux-vdso.so.1 (0x00007ffd5d7d3000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdf2d307000)
/lib64/ld-linux-x86-64.so.2 (0x00007fdf2d4f0000)
$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep ' printf' | head -1
0000000000056cf0 T printf@@GLIBC_2.2.5
If you build your program with -nolibc you have to satisfy a few symbols on your own (see Compiling without libc):
$ gcc -nolibc ./1.c
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/10/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x12): undefined reference to `__libc_csu_fini'
/usr/bin/ld: (.text+0x19): undefined reference to `__libc_csu_init'
/usr/bin/ld: (.text+0x26): undefined reference to `__libc_start_main'
/usr/bin/ld: /tmp/user/1000/ccCFGFhf.o: in function `main':
1.c:(.text+0xc): undefined reference to `puts'
collect2: error: ld returned 1 exit status
You need to understand the difference between the compile and link phases of program compilation.
In the compilation phase you describe to the compiler the various things you intend to call that may be in this file, in other files or in libraries. This is done using function declarations.
int woodle(char*);
for example. This is what header files are full of.
If the function is in the same file then the compiler will work out how to call it while it compiles that file. But for other functions it leaves a note in the generated code that says
please wire up the woodle function here so I can call it.
This note is usually called an import, and there are tools you can use to look at the imports in an object file; the name depends on the platform and toolset (nm or objdump on Linux, for example).
The linker's job is to find those imports and resolve them. It will look at object files passed on the command line, at libraries included on the command line, and also at the standard libraries that the C standard says should be available to all programs.
In your printf case the linker found printf in the C standard library, which the linker includes automatically.
BTW, the linker looks for 'exports' from objects and libraries; there are tools to look at those too. The linker's job is to match each 'import' to an 'export'.
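To make the import/export idea concrete, here is a small sketch built around the woodle declaration above. The file names, the call_woodle function and the body of woodle are all made up for illustration. If the two parts are compiled as separate files, nm caller.o lists woodle as U (undefined, an import) and nm woodle.o lists it as T (defined in the text section, an export); the linker connects the two. Pasted into a single file, as shown, the compiler resolves the call itself.

```c
/* --- caller.c: only a declaration, so the object file imports woodle --- */
int woodle(char *s);            /* declaration: promises the function exists */

int call_woodle(char *s)
{
    return woodle(s);           /* the compiler leaves a reference to resolve */
}

/* --- woodle.c: the definition, i.e. the export the linker matches --- */
int woodle(char *s)
{
    int n = 0;
    while (s[n] != '\0')        /* count characters up to the terminator */
        n++;
    return n;
}
```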
First, realize what the gcc program is. Technically, it is not a compiler, but a compiler driver. A compiler driver is responsible for driving the various other tools which perform compilation-related tasks. Some of the tools are found in PATH, whereas others are in internal compiler directories.
There are various ways to check what the driver is doing. I won't go into much detail about how I made the rest of this post, but briefly:
strace -f -e %process gcc is a Linux-specific way of showing all the programs executed (elsewhere in this answer, I assume Linux when specifying details but it doesn't matter)
gcc -v will dump out various information, but you have to learn what parts actually matter for whatever you are doing.
there exists a "specs" file that controls some of the argument-related stuff the driver does
Now for the actual data:
Here's the tree of processes that gcc might execute:
gcc, the "driver" (input various, output various. Some arguments are handled by the driver itself, but most are passed to the various subprocesses)
(these are repeated for every input file. If -pipe is passed, temporary files are omitted and processes are run in parallel; if --save-temps is passed, intermediate files are preserved):
cc1 -E -lang-asm, the "preprocessor" for assembly code (input .S, output .s - yes, case matters. Only relevant if you're trying to compile separate ASM files that need preprocessing)
cc1 -E, the "preprocessor" for C code (input .c; output .i. Only a separate process if -fno-integrated-cpp is passed, which is rare. Note that the cpp program in PATH is never called, even though it is provided by GCC - rather, it calls this. If -E is passed, the driver stops after this)
cc1, the "compiler" proper (input (usually) .c or (rarely) .i; output .s. If -S is passed, the driver stops after this; if -fsyntax-only is passed, this stage doesn't even complete)
(For other languages, replace cc1 with cc1plus, cc1d, cc1obj, f951, gnat1, etc. Note that the different drivers like g++, gdc, etc. only affect what extra libraries are linked by default)
as, the "assembler" (input .s; output .o. This is looked up in PATH; it is shipped as part of Binutils, not GCC. If -c is passed, the driver stops here)
collect2, the "linker" wrapper (supposedly this has something to do with constructors, and potentially calls ld twice, but in practice I've never seen it. Just think of it as forwarding all its arguments to ld, even if you have constructors normally)
ld, the "linker" proper (input .o or others (assumed to be libraries); output executable or shared library. Like as, this is actually part of Binutils, not GCC, so it is looked up in PATH)
The driver has a lot of logic, so it is important that you use it. Notably, you should never call as or ld yourself, since that will omit arguments that rely on the driver's sense of "exact current platform".
Now, getting to your specific question:
Ignoring irrelevant arguments and simplifying paths, the ld call ends up looking like:
ld -o foo Scrt1.o crti.o crtbeginS.o foo.o -lgcc -lgcc_s -lc -lgcc -lgcc_s crtendS.o crtn.o
The various "crt" loose object files are a mixture of parts of GLIBC and GCC, needed to support the C runtime (note that there are others as well; which are linked depends on arguments). The gcc and gcc_s libraries are needed to run code on the platform at all; they are repeated because they rely on the c library which also relies on them.
Since -lc is passed by default (regardless of language), the printf symbol can be resolved. Notably, -lm, -lrt, -lpthread and others are not passed by default, so symbols from those other parts of the C library will not be resolved unless you pass the options manually.
All of this is completely independent of what headers are included.
That your program compiles without a header present means that the compiler settings were lenient. You should still get a warning though. The reason that your program links is that the C standard library, which contains the code of the function printf, is linked automatically. Almost every C program needs it because input and output, or generally interaction with peripherals, which that library handles, are the general means of generating a "side effect", an effect outside the program. The opposite is so uncommon that one must make the wish to not link with it explicit.
So why does your compiler accept a call to a function which has not been declared?
C emerged at a time when programs were much smaller and software development as an engineering discipline didn't formally exist:
Four years later [i.e., in 1978], as a still-junior faculty member, I tried to get my colleagues [...] to create an undergraduate computer-science degree. A senior mechanical engineer of forbidding mien snorted surely not: Harvard had never offered a degree in automotive science, why would we create one in computer science? I waited until I had tenure before trying again (and succeeding) in 1982. -Harry R. Lewis
That was about 10 years after Dennis Ritchie had started to develop this versatile new programming language, the successor to B. The problems involved in creating and maintaining large programs back then were simply not as pressing and not as well-understood as they are, perhaps, today.
Among the many things that help us today, at least in most compiled languages, is strong typing. Every identifier we use is declared with a static type. But the importance and benefits of that were not that obvious in the 1970s, and early C permitted mixing and matching integers and pointers at will. It's all numbers, right? And a function is just a name for a jump address, right? The user will know what to put on the stack, and the function will read it off the stack — I really don't see a problem here ;-). This attitude brought us functions like printf().
After this stage-setting we are slowly getting to the point. Because a function is just a jump address, no function declaration needed to be present in order to call one. The assumed parameters were whatever you passed, and the presumed return type defaulted to int, which was often correct or at least didn't hurt. And for a long time C kept this backward compatibility. I think the C99 standard forbade the use of undeclared identifiers, and the standard drafts for C11 and C21 both say:
An identifier is a primary expression, provided it has been declared as designating an object (in which case it is an lvalue) or a function (in which case it is a function designator)91
Footnote 91 says "Thus, an undeclared identifier is a violation of the syntax." (All emphasis by me.)
All compilers I tried compile it anyway (with a warning), perhaps because some ancient code that still gets compiled frequently depends on it.

GCC: compiling an application without linking any library

I know how to compile a C application without linking any library using GCC for a bare-metal embedded application, just by setting up the startup function(s) and, if needed, the assembly startup.s file.
However, I am not able to do the same thing on Windows (I am using MinGW32 GCC). It seems that linking with -nostdlib also removes everything that needs to run before main, so I would have to write a specific startup, but I did not find any documentation about that.
The reason I need to compile without the C standard library is that I am writing a reduced C standard library for small 32-bit microcontrollers, and I would like to test and unit test this library using GCC under Windows. So if there is a simpler alternative, that is fine with me.
Thanks.
I found the solution: pass -nostdlib and -lgcc together to ld (or to gcc used as the linker). This way the C standard library is not automatically linked into the application, but everything needed to start the application is.
I also found that the order of these switches matters: depending on the order and position of the options, the link may not work at all, may complain about a missing atexit() function, or may succeed without any error or warning.
I discovered another small complication with Eclipse-based IDEs: the Settings menu offers several different places for options, so to get the options in the right order I had to set them in different places.
After that I had a new problem: I had not considered that unit test libraries require at least a function able to write to stdout or to a file.
I found that using "" and <> forces the compiler and linker to use the modules I want from my library and from the C standard library.
So, for instance:
#include "string.h" // points to my library include
#include <stdio.h> // points to C stdlib include
This lets me test all of my library's string functions using the C standard library's stdout functions.
It works with both GCC and GCC cross compilers.
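A host-side check along these lines could be sketched as follows. This is only an illustration under assumptions: mini_strlen is a hypothetical stand-in for a function from the reduced library, and it carries a prefix so that it can coexist with the host's strlen in one program, which is a different arrangement from the same-name header setup described above. The host's stdio is used only for reporting.

```c
#include <stdio.h>   /* host C library: used only for reporting */
#include <string.h>  /* host C library: reference implementation */

/* Hypothetical function under test, standing in for the reduced
   library's own strlen. */
size_t mini_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Compare the function under test against the host strlen and report
   the result through the host's stdio. Returns 1 on a match, else 0. */
int check_strlen(const char *s)
{
    unsigned long got = (unsigned long)mini_strlen(s);
    unsigned long want = (unsigned long)strlen(s);
    printf("\"%s\": got=%lu want=%lu %s\n", s, got, want,
           got == want ? "ok" : "FAIL");
    return got == want;
}
```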

Why is stddef.h not in /usr/include?

I have compiled the gnu standard library and installed it in $GLIBC_INST.
Now I am trying to compile a very simple program (using only one #include: #include <stdio.h>):
gcc --nostdinc -I$GLIBC_INST/include foo.c
The compiler (preprocessor?) tells me that it can't find stddef.h.
And indeed, there is none in $GLIBC_INST/include (nor is there one in /usr/include). However, I found a stddef.h in /usr/lib/gcc/x86_64-unknown-linux-gnu/5.3.0/include.
Why is that file not under /usr/include? I thought it belonged to the standard c library and should be installed in $GLIBC_INST/include.
How can I compile my foo.c with the newly installed standard library when it doesn't seem to come with a stddef.h?
Edit: Clarification
I feel that the title of this question is not optimal. As has been pointed out by some answers, there is not a requirement for stddef.h to be in /usr/include (or $GLIBC_INST/include, for that matter). I do understand that.
But I am wondering how I can proceed when I want to use $GLIBC_INST. It seems obvious to me (although I might be wrong here) that I need to invoke gcc with --nostdinc in order to not use the system installed header files.
This entails that I use -I$GLIBC_INST/include. This is clear to me.
Yet, what remains unclear to me is: when I also add -I/usr/lib/gcc/x86..../include, how can I be sure that I do have in fact the newest header files for the freshly compiled glibc?
That's because the files under /usr/include are common headers provided by the C library (glibc, for example), while the files under /usr/lib/gcc are specific to that particular compiler. It is common for each compiler to have its own implementation of stddef.h, while they all use the same stdio.h when linking against the installed C library.
When you say #include <stddef.h> it does not require that /usr/include/stddef.h exists as a file on disk at all. All that is required of an implementation is that #include <stddef.h> works, and that it gives you the features that header is meant to give you.
In your case, the implementation put some of its files in another search path. That's pretty typical.
Why is that file not under /usr/include?
Because there's absolutely no requirement for standard headers to be located at /usr/include/.
The implementation could place them anywhere. The only guarantee is that when you do #include <stddef.h>, the compiler/preprocessor correctly locates and includes it. Since you disabled that with gcc's -nostdinc option, you are on your own (to correctly give the location of that header).

How does creating a replacement memset trigger duplicate symbol errors in specific circumstances?

I recently wrote a few replacements for string routines (memcpy, memset, and memmove). It is my understanding that if the library containing these routines is specified on the compile / link line, these will take precedence over system standard library routines of the same name. If I'm wrong already, please let me know!
This works correctly in all testing I did (verified by disassembly that the correct routines are there and glibc routines don't exist), but further testing discovered an odd break caused by this:
1) build another file in the same library with -g (I had been building -O2)
2) this file has an explicit call to memset
3a) if the compile time options work in such a way that this memset is inlined by gcc, everything is OK
3b) if, however, the options disable the inlining of a memset call which would have been inlined otherwise, the library will build but using the library to statically link an application causes a duplicate symbol linker error - the other instance of the symbol is the system library's memset.
Basically I can build two versions of my library (100's of source files), and by changing the make CFLAGS in one directory from -O1 -g to just -g I can trigger the linker error when this library is used.
I can take the working version, run it through nm, and see that it has many undefined references to memset including in routines which are linked into my test case - so I know it should be trying to resolve memset in the working case. When I diff this against the nm output for the broken library, all I see is a few extra undefined memcpy and memset references. If memset resolved in the first case (to my routine), it should in the second.
I have also looked at the verbose compiler output and verified that the link lines are exactly the same in both cases, except for the path to this one library.
There are two super puzzling things here (among a myriad of other issues):
1) Why would a file in a library built -O1 -g link any differently than -g
2) Why would a replacement memset, in a user library, conflict with the system memset
And for the grand prize, how does 1) cause 2)
It took a long time to come up with this solution, but it makes sense now:
1) Higher optimization enabled gcc to inline bzero, which had no other references to it at link time. The memset calls it was inlining / not inlining here were red herrings.
2) bzero is in the same file as memset in libc.a : memset.o. When ld tried to pull in memset.o to satisfy the bzero request it got the duplicate memset symbol.
(1) causes (2).
The solution was to provide my own bzero routine in my library, stopping libc's memset.o from ever being needed.
GCC provides a large number of built-in versions of standard library functions. These are provided for optimization purposes.
Many of these functions are optimized in only certain cases. If they are not optimized in a certain case, a call to the library function is emitted.
Hence a library built -O1 -g would link differently than -g.
