Using GDB without debug symbols - c

Assume the following code:
#include <iostream>
void test(){
//
}
int main(){
return 0;
}
Compiling without -g, I'm still able to set a breakpoint on main and test using GDB.
How is that possible? Is it related to symbol tables?
(gdb) b test
Breakpoint 1 at 0x400512

Here's what you're missing.
C++ is built around the concept of compiling and then linking. As such, during the compilation stage, the compiler assumes that the current file is just one file in a more complex program that will be eventually linked together.
When you write:
void test(){
//
}
The compiler has no choice but to assume that test is going to be called by code from another source file, and that will be compiled into a separate .o file. As such, it exports test's symbol despite the fact that no debug symbols are defined.
To see this effect in action, try the following. First, mark test as static. If you compile with optimization, you will see that test is no longer visible to gdb. In fact, it is no longer even defined. The compiler inlines it away.
Another way of making this happen is by passing g++ the -fwhole-program option. This option tells gcc to assume that the file being compiled is the whole program and that no other compilation unit will exist. This allows it, effectively, to treat all function and global definitions as static. Again, once you turn on optimizations, you will see that test is no longer visible to gdb.
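The difference between the two kinds of linkage can be sketched in plain C (the question's code is C++, but the same linkage rules are at play). The file and function names below are hypothetical and not taken from the question; this is a minimal sketch, assuming a GNU-style toolchain where nm is available.

```c
/* linkage_demo.c (hypothetical file name) */

/* External linkage: the compiler must assume another translation unit
 * may call this, so the symbol is exported even without -g. */
int exported_double(int x)
{
    return 2 * x;
}

/* Internal linkage: only code in this file can call it, so an
 * optimizing compiler is free to inline it and drop the symbol. */
static int hidden_increment(int x)
{
    return x + 1;
}

/* Calls hidden_increment so the static function is actually used. */
int use_hidden(int x)
{
    return hidden_increment(x);
}
```

After gcc -c linkage_demo.c, running nm linkage_demo.o should list exported_double and use_hidden as global text symbols (T). hidden_increment appears as a local symbol (t) at most, and typically disappears once optimization is turned on.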

Related

Resolve undefined reference by stripping unused code

Assume we have the following C code:
void undefined_reference(void);
void bad(void) {
undefined_reference();
}
int main(void) {}
In function bad we fall into the linker error undefined reference to 'undefined_reference', as expected. This function is not actually used anywhere in the code, though, and as such, for the execution of the program, this undefined reference doesn't matter.
Is it possible to compile this code successfully, such that bad simply gets removed as it is never called (similar to tree-shaking in JavaScript)?
This function is not actually used anywhere in the code!
You know that, I know that, but the compiler doesn't. It deals with one translation unit at a time. It cannot divine out that there are no other translation units.
But main doesn't call anything, so there cannot be other translation units!
There can be code that runs before and after main (in an implementation-defined manner).
OK what about the linker? It sees the whole program!
Not really. Code can be loaded dynamically at run time (also by code that the linker cannot see).
So neither the compiler nor the linker even tries to find unused functions by default.
On some systems it is possible to instruct the compiler and the linker to try and garbage-collect unused code (and assume a whole-program view when doing so), but this is not usually the default mode of operation.
With gcc and gnu ld, you can use these options:
gcc -ffunction-sections -Wl,--gc-sections main.c -o main
Other systems may have different ways of doing this.
Many compilers (for example gcc) will compile and link it correctly if you:
enable optimizations
make function bad static (otherwise, it will have external linkage)
https://godbolt.org/z/KrvfrYYdn
Another way is to add a stub version of this function (with a pragma that displays a warning).
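A stub along those lines might look like the sketch below. It swaps the compile-time pragma for a runtime message on stderr, and the counter and stub_call_count helper are hypothetical additions used only to make the stub observable.

```c
#include <stdio.h>

static int stub_calls = 0;  /* hypothetical counter, for illustration only */

/* Stub definition: satisfies the linker until a real implementation exists. */
void undefined_reference(void)
{
    stub_calls++;
    fprintf(stderr, "warning: undefined_reference() is a stub\n");
}

/* Same function as in the question; it now links because the stub exists. */
void bad(void)
{
    undefined_reference();
}

/* Hypothetical helper reporting how often the stub was reached. */
int stub_call_count(void)
{
    return stub_calls;
}
```

The trade-off is that the program now links even if bad is called by mistake, so the warning is the only signal that the real implementation is still missing.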

Lua crashes when loading dll extension

I'm trying to create a Lua dll extension on Windows. I'm using Lua 5.3. My compiler is from MinGW and is gcc 4.9.3.
My C code for the dll extension is something like this:
#include <stdio.h>
#include <lua.h>
static int dub(lua_State *L) {
const double a = lua_tonumber(L, 1);
lua_pushnumber(L, a*2);
return 1;
}
__declspec(dllexport) int __cdecl luaopen_mylib(lua_State *L){
printf("One\n");
lua_pushcfunction(L, dub);
printf("Two\n");
lua_setglobal(L, "dub");
printf("Three\n");
return 1;
}
I'm compiling my dll like this:
gcc mylib.c -shared -o mylib.dll -llua
The idea being that I can load it from Lua and use it like this:
require "mylib"
print (dub(5)) --should print 10
However, when I actually try to run the Lua code, it crashes on the require "mylib" line. The DLL is able to print "One" and "Two", but it does not get to print "Three" before it crashes. This tells me the problem may be with the 'lua_setglobal' call.
What's going wrong? How do I debug it further or fix it?
As a bonus question: what should the return value of luaopen_mylib be?
Thanks!
I compiled your code and it works fine. So your problem isn't from your code. My guess is that you're using different versions of Lua, or the linking is somehow going wrong.
There are three places a version mismatch could occur.
In your #include <lua.h> line
In your build line with -llua
When you run the lua executable
It is vital that all three of these be the same lua version. If one is mismatched, that could cause the problem you're having. I would guess that your lua executable version doesn't match the others.
One other thing to try, is that your dll should link against the lua dll. There should be a file on your system called lua53.dll. Copy it to your build directory. When you compile with gcc, try this instead:
gcc mylib.c -shared -o mylib.dll lua53.dll
This means that your dll extension calls the exact same lua code that the lua executable is using. The way you have it now, with the -llua option, it appears that you are linking in lua statically. This is almost never what you want for a dll library, because the lua executable will be calling code in lua53.dll while your dll is calling separate code in the static library. I don't think this alone is causing your crash, but it's not good practice. If lua used global state (it doesn't), this linking issue could certainly cause your crash. Also, if you compile against lua53.dll you'll find that mylib.dll is much smaller.
In summary, I think you have a lua version mismatch somewhere that is causing your crash. You should also link against the lua dll instead of the static library for good form.
As a bonus question: what should the return value of luaopen_mylib be?
In your code it should actually be 0 instead of 1 as you have. The return value is the number of things left on the stack that you want the require() call to return. Your library is just shoving itself into the global state and not returning any lua values. An alternative way to do things is to return a table with the library's functions in it. That way you don't pollute global state. Since your library has only one function, you could return it directly like this:
__declspec(dllexport) int __cdecl luaopen_mylib(lua_State *L){
lua_pushcfunction(L, dub);
return 1;
}
Then you use it from lua as:
local dub = require("mylib")
print (dub(5)) --should print 10
I prefer this way since the caller can decide what to name the imports and it doesn't pollute global space.

How to link two files in C

I am currently working on a class assignment. The assignment is to create a linked list in C. But because it's a class assignment we have some constraints:
We have a header file that we cannot modify.
We have a c file that is the linkedlist
We have a c file that is just a main method just to test the linkedlist
The header file has a main method defined, so when I attempt to build the linkedlist it fails because there is no main method. What should I do to resolve the issue? Import the test file (this causes another error)?
I'm assuming your three files are called header.h, main.c, and linkedlist.c
gcc main.c linkedlist.c -o executable
This will create an executable binary called "executable"
Note this also assumes you're using gcc as a compiler.
Like most languages, C supports modules. What I assume your assignment requires is compiling a module. Modules, unlike full programs, lack entry points. Roughly speaking, they are collections of functions, in the manner of a library. When compiling a module, no linking is made.
You would compile a module like this: gcc -c linkedlist.c. This produces linkedlist.o, which is a module. Try executing this linkedlist.o (after changing its mode to executable, since it won't be so by default). It fails to run partly because it is not in the proper format to be executed: among other things, it lacks an entry point (what we know as 'main') and it has not been linked.
Your assignment seems to provide a test 'main.c'. To use it, you only have to link main.c (actually compiled into main.o) with linkedlist.o. To do that, simply type gcc -o name_of_your_program main.c linkedlist.o. What happens here is that the compiler first compiles main.c into a main.o module, then links the two modules together under the name you gave with the -o option. The compiler driver is smart enough to work out these steps without being told explicitly.
If you want to know more about this, read up on how compilers and linkers do what they do. Google can help you with that more than I ever could. Good luck.
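To make the idea of a module without an entry point concrete, here is a minimal sketch of what such a linkedlist.c could contain. The struct and function names are hypothetical, since the assignment's header is not shown; the point is only that the file defines functions and no main.

```c
/* linkedlist.c: a module with no main(); build with gcc -c linkedlist.c */
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepend a value; returns the new head, or the old head if malloc fails. */
struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Count the nodes reachable from head. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Release every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Compiling this file with -c succeeds on its own. Trying to link it alone into an executable fails because no main exists, which is exactly what the separate test file supplies.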

About C compilation process and the linking of libraries

Are C libraries linked with object code, or first with source code and only later with object code? I mean, look at the image found at Cardiff School of Computer Science & Informatics's website.
It's "strange" that after generating object-code the libraries are being linked. I mean, we use the source code while putting the includes!
So, how does this actually work? Thanks!
That diagram is correct.
When you #include a header, it essentially copies that header into your file. A header is a list of types and function declarations and constants, etc., but doesn't contain any actual code (C++ and inline functions notwithstanding).
Let's have an example: library.h
int foo(int num);
library.c
int foo(int num)
{
return num * 2;
}
yourcode.c
#include <stdio.h>
#include "library.h"
int main(void)
{
printf("%d\n", foo(100));
return 0;
}
When you #include library.h, you get the declaration of foo(). The compiler, at this point, knows nothing else about foo() or what it does. The compiler can freely insert calls to foo() despite this. The linker, seeing a call to foo() in yourcode.c, and seeing the code in library.c, knows that any calls to foo() should go to that code.
In other words, the compiler tells the linker what function to call, and the linker (given all the object code) knows where that function actually is.
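The same division of labour can be seen inside a single file. In this sketch the caller call_foo is a hypothetical name added for illustration; foo is the function from the example above.

```c
/* What library.h contributes after preprocessing: a declaration only. */
int foo(int num);

/* The compiler can emit this call knowing only foo's signature. */
int call_foo(void)
{
    return foo(100);
}

/* What library.c contributes: the definition the call is resolved to. */
int foo(int num)
{
    return num * 2;
}
```

When the definition lives in a different file, the only change is that resolving the call is deferred from the compiler to the linker.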
(Thanks to Fiddling Bits for corrections.)
Includes from libraries normally contain only the library interface, so in the simplest case the .h file provided with the library contains function declarations, and the compiled functions are in the library file. So you compile your sources against the function declarations from the library headers, and then the linker adds the compiled library functions to your executable.
It might be instructive to look at what each piece in the tool-chain does, so using the boxes in your image.
pre-processor
This is really a text-editor doing a bunch of substitutions (ok, really really oversimplified). Some of the things that the pre-processor does is:
performs simple textual substitution on #defines. So if we have #define PI 3.1415 in our file and then later on we have a line such as angle = angle * PI / 180; the pre-processor will convert this line into angle = angle * 3.1415 / 180;
anytime we encounter an #include, we can imagine that the pre-processor goes and gets the entire contents of that file and pastes them in where the #include is (and then we go back and perform the substitutions).
we can also pass options to the compiler with the #pragma directive.
Finally, we can see the results of running the pre-processor by using the -E option to gcc.
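As a small illustration of the textual substitution, the sketch below wraps the PI example in a function. The name to_radians is hypothetical and not part of the original answer.

```c
#define PI 3.1415

/* After preprocessing (gcc -E), the body reads: return angle * 3.1415 / 180; */
double to_radians(double angle)
{
    return angle * PI / 180;
}
```

Running gcc -E on a file containing this should show the function with PI already replaced by its literal value, and no trace of the #define line itself.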
compiler
The output of the pre-processor is still text, and it now contains everything that the compiler needs to be able to process the file. Now the compiler does a lot of things (and I normally break the box up when I describe this process). The compiler will process the text, do a lexical analysis of it, pass it to the parser that verifies that the program satisfies the grammar of the language, output an intermediate representation of the language, perform optimization and produce assembly code.
We can see the results of running up to the assembler by using the -S option to gcc.
assembler
The output of the compiler is an assembly listing, which is then passed to an assembler (most commonly `gas' (GNU assembler) on Linux), that converts the assembly code into machine code. In addition, one task of the assembler is to build a list of undefined references (i.e. a library function or a function that you wrote that is implemented in another source file).
We can see the results of getting the output of the assembler by using the -c option to gcc.
linker
The input to the linker will be the output from the assembler (typically called object files, with the extension '.o'), as well as various libraries. Conceptually, the linker is responsible for hooking everything together, including fixing up the calls to functions that are found in libraries. Normally, the program that performs the linking in Linux is ld, and we can see the results of linking just by running gcc without any special command line options.
I have simplified the discussion of the linker, I hope I gave you a flavor of what the linker does.
The only issue that I have with the image you referenced is that I would have moved the phrase "Object Code" to immediately below the assembler box, and at the same time I would move the arrow labeled "Libraries" down. I feel that this would indicate that the object code from the assembler and the libraries are combined by the linker to make an executable.

gcc - 2 versions, different treatment of inline functions

Recently I've come across a problem in my project. I normally compile it in gcc-4, but after trying to compile in gcc-3, I noticed a different treatment of inline functions. To illustrate this I've created a simple example:
main.c:
#include "header.h"
#include <stdio.h>
int main()
{
printf("f() %i\n", f());
return 0;
}
file.c:
#include "header.h"
int some_function()
{
return f();
}
header.h
inline int f()
{
return 2;
}
When I compile the code in gcc-3.4.6 with:
gcc main.c file.c -std=c99 -O2
I get a linker error (multiple definition of f), and the same happens if I remove the -O2 flag. I know the compiler does not have to inline anything if it doesn't want to, so I assumed it placed f in the object file instead of inlining it for both main.c and file.c, hence the multiple definition error. Obviously I could fix this by making f static, then, in the worst case, having a few f's in the binary.
But I tried compiling this code in gcc-4.3.5 with:
gcc main.c file.c -std=c99 -O2
And everything worked fine, so I assumed the newer gcc inlined f in both cases and there was no function f in the binary at all (checked in gdb and I was right).
However, when I removed the -O2 flag, I got two undefined references to int f().
And here, I really don't understand what is happening. It seems like gcc assumed f would be inlined, so it didn't add it to the object file, but later (because there was no -O2) it decided to generate calls to these functions instead of inlining and that's where the linker error came from.
Now comes the question: how should I define and declare simple and small functions, which I want inline, so that they can be used throughout the project without the fear of problems in various compilers? And is making all of them static the right thing to do? Or maybe gcc-4 is broken and I should never have multiple definitions of inline functions in a few translation units unless they're static?
Yes, the behavior has been changed from gcc-4.3 onwards. The gcc inline doc (http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Inline.html) details this.
Short story: plain inline only serves to tell gcc (in the old version anyway) to
inline calls to the function made from the same file scope. However, it does not
tell gcc that all callers will be in that file scope, so gcc also keeps a linkable
version of f() around, which explains your duplicate symbols error above.
Gcc 4.3 changed this behavior to be compatible with c99.
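Under the C99 rules, a plain inline definition in a header does not by itself produce a linkable copy of the function, which is why the unoptimized gcc-4.3 build reports undefined references. One way to supply that copy, sketched here in a single file for brevity, is to add an extern inline declaration in exactly one .c file. This assumes a compiler running in C99 (or later) mode.

```c
/* As it would appear in header.h: an inline definition only. */
inline int f(void)
{
    return 2;
}

/* In exactly one .c file: makes this translation unit emit the
 * external definition that non-inlined calls link against. */
extern inline int f(void);

int some_function(void)
{
    return f();
}
```

This keeps a single shared copy of f in the binary, but it depends on C99 inline semantics, so it does not behave the same way under the older gcc-3 model described above.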
And, to answer your specific question:
Now comes the question: how should I define and declare simple and small functions, which I want inline, so that they can be used throughout the project without the fear of problems in various compilers? And is making all of them static the right thing to do? Or maybe gcc-4 is broken and I should never have multiple definitions of inline functions in a few translation units unless they're static?
If you want portability across gcc versions use static inline.
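Applied to the example from the question, that might look like the sketch below, with the header and one caller shown together in a single block for brevity.

```c
/* header.h: each translation unit that includes this gets a private copy. */
static inline int f(void)
{
    return 2;
}

/* file.c: calls f() whether or not the compiler decides to inline it. */
int some_function(void)
{
    return f();
}
```

Because f has internal linkage, no symbol is shared between object files, so neither the multiple definition error nor the undefined reference can occur. The cost is at most one non-inlined copy of f per translation unit.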

Resources