How to dump path to source from object-file


Assume I have a C object-file app.o compiled with gcc. How can I dump the file path to the original app.c from which app.o was compiled? My goal is to create a listing of all symbols plus their respective source file paths using the binutils and gcc tool suite.
By no means am I expecting an all-in-one solution. So I tried playing with multiple tools to gather the information I need.
Inspecting the object-file with a text-editor reveals that (apart from a lot of unreadable binary gibberish) the file does contain a reference to app.c as a string embedded into the object-file format. However, I did not find a way to extract that string using objdump or nm.
I was hoping objdump would have some flag that could extract this source file string, but after trying virtually all options documented in the man page I still couldn't find it.
With the path of the source file I was hoping I could run gcc -M <path-to-source>. This would allow me to look through all the headers included by app.c and find the in-source declarations.
Suppose a simple app.c like this:
void foo(void) {
}
Compile it via gcc -c app.c -o app.o.
Running objdump -t app.o dumps the symbol table, but does not refer anywhere to the original app.c.
Running cat app.o does show that the object-file contains the file path to app.c (relative to pwd at compile-time). But I wasn't exactly planning on writing my own object-file parser just to get to that string.

To answer my own question minutes after posting it (duh!):
readelf -s app.o prints a symbol table that includes the name of the source file (app.c) as a symbol of type FILE. With that I am able to run gcc -M app.c and then parse through all header files to gather the symbol declarations.
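For anyone who wants the same lookup without parsing readelf output, here is a minimal sketch of what that FILE entry corresponds to inside the object: a symbol-table entry of type STT_FILE whose name lives in the linked string table. It assumes a 64-bit ELF object in host byte order and a Linux system that provides <elf.h>. The function name first_file_symbol is made up for this example, and the name it returns is whatever the compiler recorded, which may or may not include a directory.

```c
#include <elf.h>      /* ELF structure definitions (Linux/glibc) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if the byte range [off, off + size) lies inside an image of len bytes. */
static int in_bounds(uint64_t off, uint64_t size, size_t len)
{
    return off <= len && size <= len - off;
}

/* Hypothetical helper: return the name of the first STT_FILE symbol in an
   ELF64 image held in memory, or NULL if there is none or the image looks
   malformed. The returned pointer points into img. */
const char *first_file_symbol(const unsigned char *img, size_t len)
{
    Elf64_Ehdr eh;
    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return NULL;
    if (!in_bounds(eh.e_shoff, (uint64_t)eh.e_shnum * sizeof(Elf64_Shdr), len))
        return NULL;

    for (size_t i = 0; i < eh.e_shnum; i++) {
        Elf64_Shdr symtab, strtab;
        memcpy(&symtab, img + eh.e_shoff + i * sizeof symtab, sizeof symtab);
        if (symtab.sh_type != SHT_SYMTAB || symtab.sh_link >= eh.e_shnum)
            continue;
        /* sh_link of a symbol table is the index of its string table. */
        memcpy(&strtab, img + eh.e_shoff + symtab.sh_link * sizeof strtab,
               sizeof strtab);
        if (!in_bounds(symtab.sh_offset, symtab.sh_size, len) ||
            !in_bounds(strtab.sh_offset, strtab.sh_size, len) ||
            strtab.sh_size == 0 ||
            img[strtab.sh_offset + strtab.sh_size - 1] != '\0')
            continue;
        for (uint64_t j = 0; j < symtab.sh_size / sizeof(Elf64_Sym); j++) {
            Elf64_Sym sym;
            memcpy(&sym, img + symtab.sh_offset + j * sizeof sym, sizeof sym);
            if (ELF64_ST_TYPE(sym.st_info) == STT_FILE &&
                sym.st_name < strtab.sh_size)
                return (const char *)img + strtab.sh_offset + sym.st_name;
        }
    }
    return NULL;
}
```

To use it, read app.o into a buffer (for example with fread) and pass the buffer and its length. 32-bit objects and objects for a different byte order would need the Elf32_* types or byte swapping, which this sketch leaves out.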

Related

How to load C library functions into assembly, and use assembly library functions in another C or assembly project

Currently I am building a foo.h and foo.c with:
$ clang -I . -dynamiclib \
-undefined dynamic_lookup \
-o foo.dylib foo.c
I am able to use this in other C libraries like this:
clang -I . -dynamiclib \
-undefined dynamic_lookup \
-o bar.dylib bar.c foo.dylib
I would like to use this library in an assembly project.
$ nasm -f macho64 test.asm \
&& ld -e start -macosx_version_min 10.13.0 -static -o test test.o foo.dylib
$ ./test
ld: warning: foo.dylib, ignoring unexpected dylib file
Wondering how I link together the C -> asm system to get the C functions working in asm. Then I would like to go further and use that compiled asm to use in either a C or asm project, so wondering how to do that.
When using the assembly in C, I would like for you to basically get functions and import #include "myassembly.h" or something like that, so it feels like a real library. Then you have a function like myfunc which is defined in assembly, but you can use it in c as myfunc(1, 2, 3); sort of thing.
If I change it from static to dynamic linking with the -lSystem flag (and removing -static), I get this:
dyld: Library not loaded: foo.dylib
Referenced from: ./test
Reason: image not found
make: *** [...] Abort trap: 6

You're specifying -static which means:
-static Produces a mach-o file that does not use the dyld. Only used
building the kernel.
dyld is the dynamic loader. If you're not using the dynamic loader, you can't use dynamic libraries.
Update for edited question:
When a dylib is created, it gets an "install name". When an executable is linked to that dylib, the executable stores the install name of the dylib in its reference to it. (Note, it does not store the link-time path of the dylib file it linked against.) When the executable is loaded, the dynamic loader looks for the dylib using the install name it recorded, by default.
You can specify the install name using the -install_name <name> option to the linker. It could be the absolute path to where you expect the library to be installed (e.g. /usr/local/lib/foo.dylib), if you expect it to be installed in a fixed location. Often, though, that's not useful. You want a more flexible means for the dynamic loader to find the dylib.
The dynamic loader understands certain special path prefixes on install names to support such flexibility. See the dyld(1) man page. For example, if you specify an install name of @executable_path/foo.dylib then, at load time, the loader will look next to the executable for the library.
You can see the install name of a dylib by using otool -D foo.dylib. Your dylib may not have an install name, in which case its effective install name is just its file name with no path.
If the loader doesn't find the library by using its install name, it has a search strategy. By default, it looks in ~/lib:/usr/local/lib:/lib:/usr/lib. You can use some environment variables to alter the search strategy. For example, you can set DYLD_FALLBACK_LIBRARY_PATH to a colon-delimited list of directories to search, instead. These environment variables are also listed in the dyld(1) man page.
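To make the prefix idea concrete, here is a rough model in C of how an install name starting with @executable_path/ turns into a real path. This is only an illustration of the substitution, not dyld's actual code; the function name expand_install_name and the example directories are made up, and the other prefixes and the fallback search are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Simplified model (not dyld itself): replace a leading "@executable_path/"
   in install_name with exe_dir. Any other install name is copied unchanged.
   Returns 0 on success, -1 if the result does not fit in out. */
int expand_install_name(const char *install_name, const char *exe_dir,
                        char *out, size_t out_len)
{
    static const char prefix[] = "@executable_path/";
    const size_t prefix_len = sizeof prefix - 1;
    int n;

    if (strncmp(install_name, prefix, prefix_len) == 0)
        n = snprintf(out, out_len, "%s/%s", exe_dir, install_name + prefix_len);
    else
        n = snprintf(out, out_len, "%s", install_name);

    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

With exe_dir set to a hypothetical /opt/app, the install name @executable_path/foo.dylib becomes /opt/app/foo.dylib, while an absolute install name such as /usr/local/lib/foo.dylib is left alone.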

Name an executable file c

I know this seems like a stupid question, but how do I name an executable file when using flags like -Wall and -pedantic in C?
I have a file named test.c and another one named function.c where I wrote the functions I need for my program test.c .
I use this command to compile: gcc -Wall -pedantic test.c
Where should I put the name of the executable file? I tried every place but it doesn't seem to work. Is my compiler lacking something or what?

You need to use the -o option, like this
gcc -Wall -pedantic -o MY_EXECUTABLE_NAME test.c
# ^ here (output file name option)
You know, you can always do gcc --help
Usage: gcc [options] file...
Options:
-pass-exit-codes Exit with highest error code from a phase
--help Display this information
--target-help Display target specific command line options
--help={common|optimizers|params|target|warnings|[^]{joined|separate|undocumented}}[,...]
Display specific types of command line options
(Use '-v --help' to display command line options of sub-processes)
--version Display compiler version information
-dumpspecs Display all of the built in spec strings
-dumpversion Display the version of the compiler
-dumpmachine Display the compiler's target processor
-print-search-dirs Display the directories in the compiler's search path
-print-libgcc-file-name Display the name of the compiler's companion library
-print-file-name=<lib> Display the full path to library <lib>
-print-prog-name=<prog> Display the full path to compiler component <prog>
-print-multiarch Display the target's normalized GNU triplet, used as
a component in the library path
-print-multi-directory Display the root directory for versions of libgcc
-print-multi-lib Display the mapping between command line options and
multiple library search directories
-print-multi-os-directory Display the relative path to OS libraries
-print-sysroot Display the target libraries directory
-print-sysroot-headers-suffix Display the sysroot suffix used to find headers
-Wa,<options> Pass comma-separated <options> on to the assembler
-Wp,<options> Pass comma-separated <options> on to the preprocessor
-Wl,<options> Pass comma-separated <options> on to the linker
-Xassembler <arg> Pass <arg> on to the assembler
-Xpreprocessor <arg> Pass <arg> on to the preprocessor
-Xlinker <arg> Pass <arg> on to the linker
-save-temps Do not delete intermediate files
-save-temps=<arg> Do not delete intermediate files
-no-canonical-prefixes Do not canonicalize paths when building relative
prefixes to other gcc components
-pipe Use pipes rather than intermediate files
-time Time the execution of each subprocess
-specs=<file> Override built-in specs with the contents of <file>
-std=<standard> Assume that the input sources are for <standard>
--sysroot=<directory> Use <directory> as the root directory for headers
and libraries
-B <directory> Add <directory> to the compiler's search paths
-v Display the programs invoked by the compiler
-### Like -v but options quoted and commands not executed
-E Preprocess only; do not compile, assemble or link
-S Compile only; do not assemble or link
-c Compile and assemble, but do not link
-o <file> Place the output into <file>
-pie Create a position independent executable
-shared Create a shared library
-x <language> Specify the language of the following input files
Permissible languages include: c c++ assembler none
'none' means revert to the default behavior of
guessing the language based on the file's extension
Options starting with -g, -f, -m, -O, -W, or --param are automatically
passed on to the various sub-processes invoked by gcc. In order to pass
other options on to these processes the -W<letter> options must be used.
For bug reporting instructions, please see:
<http://bugzilla.redhat.com/bugzilla>.

gcc -o output_name -Wall -pedantic file.c

You should put -o immediately before the output file name, because -o tells gcc where to write what it builds from your source file (here, the executable).

Include an external library in C

I'm attempting to use a C library for an opencourseware course from Harvard. The instructor's instructions for setting up the external lib can be found here.
I am following the instructions specific to ubuntu as I am trying to use this lib on my ubuntu box. I followed the instructions on the page to set it up, but when I run a simple helloWorld.c program using a cs50 library function, gcc doesn't want to play along.
Example:
helloWorld.c
#include <stdio.h>
#include <cs50.h>
int
main(void){
printf("What do you want to say to the world?\n");
string message = GetString();
printf("%s!\n\n", message);
}
$ gcc helloWorld.c
/tmp/ccYilBgA.o: In function `main':
helloWorld.c:(.text+0x16): undefined reference to `GetString'
collect2: ld returned 1 exit status
I followed the instructions to the letter, but they didn't work for me. I'm running Ubuntu 12.04. Please let me know if I can clarify my problem further.

First, as a beginner, you should always ask GCC to compile with all warnings and debugging information enabled, i.e. gcc -Wall -g. But at some time read How to invoke gcc. Use a good source code editor (such as GNU emacs or vim or gedit, etc...) to edit your C source code, but be able to compile your program on the command line (so don't always use a sophisticated IDE hiding important compilation details from you).
Then you are probably missing some Harvard-specific library and some options: -L followed by a library directory, then -l glued to the library name. So you might need gcc -Wall -g helloWorld.c -lcs50 (replace cs50 by the appropriate name), and you might need some -Lsome-dir.
Notice that the order of program arguments to gcc is significant. As a general rule, if a depends upon b you should put a before b; more specifically I suggest
Start with the gcc program name; add the C standard level eg -std=c99 if wanted
Put compiler warning, debugging (or optimizing) options, eg -Wall -g (you may even want to add -Wextra to get even more warnings).
Put the preprocessor's defines and include directory e.g. -DONE=1 and -Imy-include-dir/
Put your C source file hello.c
Put any object files with which you are linking i.e. bar.o
Put the library directories -Lmy-lib-dir/ if relevant
Put the library names -laa and -lbb (in that order when libaa.so depends upon libbb.so)
End with -o your-program-name to give the name of the produced binary. Don't use the default name a.out
Directory giving options -I (for preprocessor includes) and -L for libraries can be given several times, order is significant (search order).
Very quickly you'll want to use build automation tools like GNU make (perhaps with the help of remake on Linux)
Learn also to use the debugger gdb.
Get into the habit of always asking the compiler for warnings, and always improve your program until you get no warnings: the compiler is your friend, it is helping you!
Read also How to debug small programs and the famous SICP (which teaches very important concepts; you might want to use guile on Linux while reading it, see http://norvig.com/21-days.html for more). Be also aware of tools like valgrind
Have fun.

I take this course and sometimes I need to practice offline while I am traveling or commuting. Under Windows, using MinGW and Notepad++ as an IDE (because I love it and usually use it while coding in Python), I finally found a solution and some time to write it down.
Starting from scratch. Steps for setting up gcc C compiler, if already set please skip to 5
Download Git and install it. It includes Git Bash, which is a MINGW64 Linux-style terminal. I prefer to use Git as I need Linux tools such as sed, awk, pull, push on my Windows, and it can replace GitHub's terminal.
Once Git installed make sure that gcc packages are installed. You can use my configuration for reference...
Make sure your compiler works. Throw it this simple code,
by saving it in your working directory Documents/Harvard_CS50/Week2/
helloworld.c
#include <stdio.h>
int main(void)
{
printf("Hello StackOverflow\n");
}
start Git Bash -> navigate to working directory
cd Documents/Harvard_CS50/Week2/
compile it in bash terminal
gcc helloworld.c -o helloworld.exe
execute it using bash terminal
./helloworld.exe
Hello StackOverflow
If you see Hello StackOverflow, your compiler works and you can write C code.
Now to the important bit, installing CS50 library locally and using it offline. This should be applicable for any other libraries introduced later in the course.
Download latest source code file cs50.c and header file cs50.h from https://github.com/cs50/libcs50/tree/develop/src and save them in Documents/Harvard_CS50/src
Navigate into src directory and list the files to make sure you are on the right location using
ls
cs50.c cs50.h
Cool, we are here. Now we need to compile object file for the library using
gcc -c -ggdb -std=c99 cs50.c -o cs50.o
Now using the generated cs50.o object file we can create our cs50 library archive file.
ar rcs libcs50.a cs50.o
After all these steps we end up with 2 additional files next to our original ones. Of the 4 files, we are interested in only 2: cs50.h and libcs50.a
ls
cs50.c cs50.h cs50.o libcs50.a
Copy Library and header files to their target locations. My MinGW is installed in C:\ so I copy them there
cs50.h --> C:\MinGW\include
libcs50.a --> C:\MinGW\lib
Testing the cs50 Library
To make sure our library works, we can take one of the example programs from the lecture and see if we can compile it using the cs50.h header file for the get_string() function.
#include <stdio.h>
#include <cs50.h>
int main(void)
{
printf("Please input a string to count how long it is: ");
string s = get_string();
int n = 0;
while (s[n] != '\0')
{
n++;
}
printf("Your string is %i chars long\n", n);
}
Compile cs50 code using gcc and cs50 library. I want to be explicit and use:
gcc -ggdb -std=c99 -Wall -Werror test.c -lcs50 -o test.exe
But you can simply point the source, output filename and cs50 library
gcc test.c -o test.exe -lcs50
Here we go, program is compiled using header and methods can be used within.
If you want Notepad++ as an IDE you can follow this tip to set it up with gcc as a compiler and run your code from there.
Just make sure your nppexec script includes the cs50 library
npp_save
gcc -ggdb -std=c99 -Wall -Werror "$(FULL_CURRENT_PATH)" -lcs50 -o "$(CURRENT_DIRECTORY)\$(NAME_PART).exe"
cmd /c "$(CURRENT_DIRECTORY)\$(NAME_PART).exe"

Download the cs50 from: http://mirror.cs50.net/library50/c/library50-c-5.zip
Extract it. (You will get two files cs50.c and cs50.h)
Now copy both the files to your default library folder. (which includes your stdio.h file)
Now while writing your program use: #include <cs50.c>
You can also copy the files to the folder containing your helloWorld.c file.
In that case you have to use: #include "cs50.c".
OR
Open the cs50.c and cs50.h files in a text editor.
In cs50.h, just below #include <stdlib.h>, add #include <stdio.h> and #include <string.h>, each on a new line.
Now open the cs50.c file, copy everything (from: /**Reads a line of text from standard input and returns the equivalent {from line 47 to last}), paste it in cs50.h just above the #endif, and save the files.
Now you can copy the file cs50.h to either your default library folder or to your current working folder.
If you copied the file to the default folder then use: #include <cs50.h>, and if you copied the file to the current working folder then use: #include "cs50.h".
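One caveat with pasting function definitions into a header: if two .c files include that header, the linker will normally complain about duplicate symbols unless the functions are marked static (or static inline). Here is a minimal sketch of the header-only pattern, using a made-up header name mini_input.h and a made-up function read_line rather than the real cs50 code:

```c
#ifndef MINI_INPUT_H   /* include guard; mini_input.h is a hypothetical name */
#define MINI_INPUT_H

#include <stdio.h>
#include <stdlib.h>

/* Read one line from `in`, without the trailing newline.
   Returns a malloc'ed string the caller must free, or NULL on EOF or error.
   `static inline` gives each including .c file its own private copy, so
   several files can include this header without duplicate-symbol errors. */
static inline char *read_line(FILE *in)
{
    size_t cap = 32, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {            /* keep one byte free for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {          /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

#endif /* MINI_INPUT_H */
```

A program would then call read_line(stdin) and free the result. Building a proper library and linking with -l, as in the other answers, avoids this issue entirely.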

You need to link against the library during compilation. The library should end in .a or .so if you are on Ubuntu. To link against a library:
gcc -o myProgram myProgram.c -l(library name goes here but no parentheses)

You have to link against the library; otherwise, how would GCC know which library you want to use?
gcc helloWorld.c -lcs50

Research Sources:
building on the answers above given by Basile Starynkevitch, and Gunay Anach
combined with instructions from some videos on YouTube
Approach:
covering the minimum things to do, and sharing the "norms" separately
avoiding any modification to anywhere else on the system
including the basic breakdown of the commands used
not including all the fine details, covering only the requirements absolute to task or for effective communication of instructions. leaving the other mundane details to the reader
assuming that the other stuff like compiler, environment variable etc is already setup, and familiarity with shell's file navigation commands is there
My Environment:
compiler: gcc via msys2
shell: bash via msys2
IDE: doesnt matter here
Plan:
getting the source files
building the required files: *.o (object) and *.a (archive)
telling the compiler to use it
Action:
Let's say, current directory = "desktop/cs50"
It contains all the *.c files like test-file.c which I will be creating for assignments/problem sets/practise etc.
Get the *.h and *.c files
Source in this particular case: https://github.com/cs50/libcs50/tree/main/src
Go over each file individually
Copy all the content of it
Say using "Copy raw contents" icon of individual files
Create the corresponding file locally in the computer
Do it in a separate folder just to keep things clean, let's say in "desktop/cs50/src" aka ./src
Build the required files using in the terminal after changing your current directory to "desktop/cs50/src" :
gcc -c cs50.c to create the "cs50.o" object file from "cs50.c" using "gcc"
ar cr libcs50.a cs50.o to create "libcs50.a" archive file which'll be containing "cs50.o" object file
Here, "libcs50" = "lib" prefix + "cs50" name (same as the header file's name)
This is the norm/standard way where the prefix "lib" is significant as well for a later step
However, prefix can be skipped, and it's not compulsory for name to match the header file's name either. Though, Skipping prefix is not recommended. And I can't say for sure about the name part
To tell the compiler to be able to use this infrastructure, the commands will be in following syntax after going to the parent directory (i.e. to "desktop/cs50"):
gcc test-file.c -Isrc -Lsrc -lcs50 if you used "lib" prefix in step 2.2 above
here, the -I flag is for specifying the directory of the *.h header file included in your test-file.c
and -L flag is for specifying the directory to be used for -l
and -l is for the name of the *.a file. Here the "lib" prefix talked about earlier, and ".a" extension is not mentioned
the order of these flags matters: keep the -I -L -l flags after "test-file.c"
Some more notes:
don't forget to use the additional common flags (like those suggested above for errors etc)
if you skipped the "lib" prefix, then you can't use -L -l flags
so, syntax for command will become: gcc test-file.c -Isrc src/libcs50.a
say i created my test-file.c file in "desktop/cs50/psets", so, it can be handled in 2 notable ways (current dir = "desktop/cs50/") :
cd psets then changing the relative address correspondingly in -I -L, so result:
gcc test-file.c -I../src -L../src -lcs50
keeping current directory same, but then changing the file's relative address correspondingly, so result:
gcc psets/test-file.c -Isrc -Lsrc -lcs50
or use absolute addresses 😜
as it can be seen that this becomes quite long, that's when build automation tools such as make kick in (though i am accomplishing that using a shell script 😜)

How do I emulate objdump --dwarf=decodedline in .bundle files?

I've been successfully using objdump --dwarf=decodedline to find the source location of each offset in a .so file on Linux.
Unfortunately, on Mac OS X it seems that .bundle files (used as shared libraries) cannot be queried in this manner.
I'm optimistic that there's something I can do, because gdb is able to correctly debug and step through code in these bundles — does anyone know what it's doing?
Further information:
The dwarfdump utility claims that the .bundle file contains no DWARF data, but that it does contain STABS data; however objdump --stabs cannot find any stabs data either.
(If it makes the question easier to answer, I don't actually need all of the offsets; being able to query the source location of any given offset would be good enough).
The bundle file I've been testing this on was generated using:
cc -dynamic -bundle -undefined suppress -flat_namespace -g -o c_location.bundle c_location.o -L. -L/Users/User/.rvm/rubies/ruby-1.8.7-p357/lib -L. -lruby -ldl -lobjc
The original c_location.o file does contain the necessary information for objdump --dwarf=decodedline to work.

So it turns out that one way to do this is to use Apple's nm -pa *.bundle to find the symbol name and the original object file for a given offset.
Once you have that, you can first use objdump -tT to find the offset of the symbol name in the original object file; and then use objdump --dwarf=decodedline as before.
Each step requires a little bit of simplistic output parsing, but it does seem to work™. I'd be interested if there are more robust approaches.

Generate assembler code from C file in linux

I would like to know how to generate assembler code from a C program using Unix.
I tried the gcc: gcc -c file.c
I also tried running cpp first and then as, but I'm getting errors.
I'm trying to build an assembler program from 3 different programs
prog1.c prog2.c prog.h
Is it correct to do gcc -S prog1.c prog2.c prog.h?
Seems that is not correct. I don't know if I have to generate the assembler from each of them and then link them
Thanks

According to the manual:
`-S'
Stop after the stage of compilation proper; do not assemble. The
output is in the form of an assembler code file for each
non-assembler input file specified.
By default, the assembler file name for a source file is made by
replacing the suffix `.c', `.i', etc., with `.s'.
Input files that don't require compilation are ignored.
so try gcc -S file.c.

From man gcc:
-S Stop after the stage of compilation proper; do not
assemble. The output is an assembler code file for
each non-assembler input file specified.
By default, GCC makes the assembler file name for a
source file by replacing the suffix `.c', `.i',
etc., with `.s'. Use -o to select another name.
GCC ignores any input files that don't require compilation.

If you're using gcc (as it seems) it's gcc -S.
Don't forget to specify the include paths with -I if needed.
gcc -I ../my_includes -S my_file.c
and you'll get my_file.s with the Assembler instructions.
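If you want something small to try this on, here is a sketch of what my_file.c could contain (the function names are made up for the example). Without optimization, the generated my_file.s should contain a label for each function and a call from one to the other; with optimization enabled the call may be inlined away.

```c
/* my_file.c: a hypothetical input for gcc -S my_file.c */

/* Leaf function: compiles to a few arithmetic instructions. */
int square(int x)
{
    return x * x;
}

/* Calls square twice, so the assembly shows call instructions
   (unless the optimizer inlines them). */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

Note that the file has no main, which is fine for -S and -c because nothing is linked at that stage.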

objdump -d also works very nicely, and will give you the assembly listing for the whole binary (exe or shared lib).
This can be a lot clearer than using the compiler generated asm since calls to functions within the same source file can show up not yet resolved to their final locations.
Build your code with -g and you can also add --line and/or --source to the objdump flags.