Access #ident information in an executable? - c

gcc (and other compilers) support the #ident preprocessor directive:
The ‘#ident’ directive takes one argument, a string constant. On some
systems, that string constant is copied into a special segment of the
object file. On other systems, the directive is ignored. The ‘#sccs’
directive is a synonym for ‘#ident’.
And I see (with a hex dump) that by e.g. adding this to a source file:
#ident "Hello there !"
This string gets embedded in the executable.
Now, are there any tools (readelf, objdump, gdb, or others) that can extract or view these strings?

If you have RCS installed, I think the ident command will display them. This assumes you format them in the conventional way: $keyword: value $.
Without this keyword, the traditional way to get ident strings into binaries is by putting them in static variables, e.g.
static char const rcsid[] =
"$Id: f.c,v 5.4 1993/11/09 17:40:15 eggert Exp $";
The problem with this is that you get warnings about unused variables, and compilers might optimize them away. So you have to put bogus uses of the variables in your code to prevent this. Also, if ident strings are put in header files, they have to follow naming conventions to avoid conflicts.
I can't find a specification for where #ident puts them. I suspect they're just stuck somewhere in the pure data section, so that they can be found just like the above string.

strings -a will find it -- along with a lot of other stuff. (Experiment shows that strings without -a doesn't find it; apparently it's not stored in the data section.)
As the name #sccs implies, it was originally intended for use with the old SCCS version control system, which can expand keywords in files as they're checked out. You can do something similar with the slightly more modern RCS and CVS systems, which have a different keyword syntax. I don't remember the details of SCCS, but for RCS or CVS you can have something like:
#ident "$Header:$"
which will be expanded on RCS or CVS checkout to, for example,
#ident "$Header: /path/to/foo.txt,v 1.6 2013/04/02 20:21:33 yourname Exp $"
You can then use the ident command to find strings like that, even in binaries. (ident is part of RCS, not CVS.)
Even if you're not using RCS or CVS, you can still use the same keyword syntax:
#ident "$Header: anything you like here$"
(But your text can be clobbered if you check the file into an RCS or CVS repository.)
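If RCS is not installed, the kind of search ident performs can be approximated with a short script. This is only a rough sketch: the function name find_ident_strings and the regular expression are illustrative assumptions modelled on the $keyword: value $ form described above, and the real ident command may apply different matching rules.

```python
import re
import sys

# Approximation of the "$Keyword: value $" form: a keyword made of letters,
# a colon and a space, the value, then a space before the closing "$".
IDENT_PATTERN = re.compile(rb"\$[A-Za-z]+: [^$\n]* \$")


def find_ident_strings(data):
    """Return every "$Keyword: value $" marker found in raw file contents."""
    return [m.group(0).decode("latin-1") for m in IDENT_PATTERN.finditer(data)]


if __name__ == "__main__":
    # Scan each file named on the command line, reading it as raw bytes
    # so that binaries work as well as source files.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for marker in find_ident_strings(f.read()):
                print("%s: %s" % (path, marker))
```

Because the file is read as bytes, the same search works on source files, object files, and executables. A plain string such as "Hello there !" is not reported, since it does not follow the keyword form.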

Related

How to detect the system locale at compile time using gcc?

I'm using gcc to compile a C program (that is intended to be as portable as possible), and I have a separate module in which I define all the string constants I want to use. These are then defined as external constants in an include file used by the rest of the code.
I would like to use a different set of string constants based upon the user's locale, and my current thinking is to simply define a symbol with the desired language settings and use the pre-processor to conditionally compile the appropriate values. (Yes this does assume the user will be compiling the program themselves).
If I go down this route I can get the current language settings from the command line using
$ echo $LANG | cut -f 1 -d '.'
en_GB
but I have not managed to use this to define the symbol 'en_GB' (using -D) in a make file.
There may of course be better ways to solve the problem!
It might be better to select the correct string constants at runtime, but I don't want to have to update any of the source code other than the module the string constants are defined in.
Thanks
How to detect the system locale at compile time using gcc?
That is not possible in the way you want to do it. gcc compiles code; it does not detect the system locale.
Pass the locale with some prefix.
-DLANG_$(echo $LANG | cut -f 1 -d '.')=1
Then just check with preprocessor if the macro is defined
#if LANG_en_US
stuff
#endif
To run a shell command inside a Makefile, you have to use the shell function (note the doubled $$ so that make passes $LANG through to the shell).
CFLAGS += -DLANG_$(shell echo $$LANG | cut -f 1 -d '.')=1
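One caveat with the pipeline above: if LANG is unset, is "C" or "POSIX", or carries a modifier such as @euro, the resulting macro name is empty, unhelpful, or invalid. A small helper script can normalise the value first. The sketch below is an assumption-laden illustration, not part of the original answer: the function name lang_define and the fallback locale en_US are hypothetical choices.

```python
import os
import re


def lang_define(lang, default="en_US"):
    """Turn a LANG value such as 'en_GB.UTF-8' into '-DLANG_en_GB=1'.

    Anything that does not look like language_TERRITORY (unset, "C",
    "POSIX", ...) falls back to the assumed default locale.
    """
    # Drop the ".codeset" and "@modifier" parts, if present.
    name = re.split(r"[.@]", lang or "", maxsplit=1)[0]
    if not re.fullmatch(r"[A-Za-z]{2,3}_[A-Za-z]{2}", name):
        name = default
    return "-DLANG_%s=1" % name


if __name__ == "__main__":
    print(lang_define(os.environ.get("LANG")))
```

Saved under a name of your choosing, say lang_define.py, it could be called from the makefile as CFLAGS += $(shell python3 lang_define.py). This assumes a Python 3 interpreter is available on the build machine.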

Reading a set of variables into a Makefile from a C source file?

I have an AVR8 GCC application that can be built with a standard makefile. Because some folks who want to build the application don't want to set up make and such (or have trouble doing so), I also have figured out how to set the project up so it can be compiled from the Arduino IDE as well.
All is working.
But I normally set some items in the makefile, like the version number, by creating the VERSION string in the makefile and passing it as a define into each source file compilation. When the build runs from the Arduino IDE, that step obviously does not occur, so I have to create a second #define in the Arduino sketch stub to recreate the define.
This means when I update the version, I need to do so in 2 places, in the makefile and in the source file.
The easy option is to simply move the VERSION creation to the source file, where both can use it. And, I'm OK doing that, but
The makefile actually needs the version information, both to create the right filename (think app_v1.2.3.4.bin) and embed the version number into the bin file since it is used by the boot-loader (if requested) to ensure the version the boot-loader flashes is newer than the one already in FLASH. So, if I move the VERSION, RELEASE, MODIFICATION, etc. defines into the C code, I need to find a way to pull them back into the makefile.
I tried using the file read operations in the makefile, but they seem to ignore:
#define VERSION 0
with the prefaced '#' char.
I see there's some options to run sed/awk/etc, in bash, but I don't want to make too many assumptions on the environment, and the makefile currently runs on Windows as well as Unix/Linux without any differences.
I tried a few stack overflow examples, but nothing seems to yield those 4 numbers from any file, .h or otherwise.
I'm OK with creating version.h with just:
#define VERSION 0
#define RELEASE 1
#define MODIFICATION 2
#define FIX 4
If I can read it into the makefile and create the variables I need.
Jim
You may take a look at gmtt, which was designed with exactly your use case in mind. In gmtt, the following should read and analyze your header file:
include gmtt.mk
# create a 3-column table from the header file. The first column is just the "#define"
VNR_TABLE := 3 $(file < version.h)
# Extract the values from the table: select column 3 from VNR_TABLE where column 2 equals a string constant.
# Be careful not to introduce spaces in the compare!
VER := $(call select,3,$(VNR_TABLE),$$(call str-eq,$$2,VERSION))
REL := $(call select,3,$(VNR_TABLE),$$(call str-eq,$$2,RELEASE))
MODF := $(call select,3,$(VNR_TABLE),$$(call str-eq,$$2,MODIFICATION))
FIX := $(call select,3,$(VNR_TABLE),$$(call str-eq,$$2,FIX))
I couldn't test it but I think you get the idea.
PS: using a GNUmake library just means placing the included file alongside the makefile.
I think in this case you can use the ‘file’ function of makefiles.
It allows you to write (with > specifier) or read (with < specifier) to/from files. Then you can trim (with filter-out) your variables inside your makefile.
Source: https://www.gnu.org/software/make/manual/html_node/File-Function.html#File-Function
You can use GNU make's $(shell ...) function to extract the macro expansions. Assuming VERSION is defined in src.c and tokens are delimited by spaces (not tabs):
VERSION := $(shell sed -n -e "s/^\#define VERSION *\(.*\)/\1/p" src.c)
.PHONY: all
all:
	@echo VERSION=$(VERSION)
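If depending on sed is a concern but a Python interpreter is acceptable on every build machine (an assumption the question does not confirm), the same extraction can be sketched as a small script. The names read_version_defines and version_string are hypothetical, and the four macro names are taken from the version.h shown in the question.

```python
import re
import sys

# Matches lines of the form "#define NAME <integer>", allowing optional
# spaces or tabs around the tokens and a trailing carriage return.
DEFINE_RE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+(\w+)[ \t]+(\d+)[ \t]*\r?$", re.MULTILINE
)


def read_version_defines(header_text):
    """Map each integer-valued #define in the header text to {NAME: int}."""
    return {name: int(value) for name, value in DEFINE_RE.findall(header_text)}


def version_string(defines):
    """Join the four version components into a string such as '0.1.2.4'."""
    keys = ("VERSION", "RELEASE", "MODIFICATION", "FIX")
    return ".".join(str(defines[key]) for key in keys)


if __name__ == "__main__":
    # Print one dotted version per header file named on the command line.
    for path in sys.argv[1:]:
        with open(path) as f:
            print(version_string(read_version_defines(f.read())))
```

In the makefile this might be used as VERSION_STR := $(shell python3 read_version.py version.h), where read_version.py is whatever name the script is saved under.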

How can we discard the symbols from an object file and re-use them when examining core dumps?

The requirement is to use GNU strip to discard the symbols from an object file, and save the symbols. Later on, if there are any core dumps of this object file, we have to include the symbols to check the core dumps. How can we accomplish this task? Thanks in advance!
The simple solution would be to keep a copy of the binary before you strip it. Then you can use the unstripped version to investigate the core dump as these binaries will be identical except for the stripped symbol information.
So, if you have a FOO binary, do something like:
cp FOO FOO.unstripped
strip FOO
then run FOO and when you investigate the coredump, use FOO.unstripped.
Different ways to get a "symbol file" are discussed here:
How to generate gcc debug symbol outside the build target?
Actually, the linked question is very similar to this one.

Retrieving Global Variable Values from Command Line

In one particular project, we're trying to embed version information into shared object files. We'd like to be able to use some standard linux tool to parse the shared object to determine the version for automated testing.
Currently I have "const int plugin_version = 14;". I can use 'nm' and 'objdump' and verify that it's there:
00000000000dcfbc r plugin_version
I can't, however, seem to be able to get the value of that variable easily from command line. I figured there'd be a POSIX tool for showing the initialized values for globals. I have contemplated using a format for the variable as the information itself, ie, plugin_version_14, but that seems like a huge hack. Embedding the information in the filename unfortunately is NOT an option. Any other suggestions welcome.
You could embed it as a string
"MAGIC MARKER STRING VERSION: 4.56 END OF MAGIC" then just look for "MAGIC MARKER STRING" in the file and extract the version information that comes after it.
If you make it a standard, you could easily make a command-line tool to find these embedded strings in all your software.
If you also require it to be an int, a little macro magic will construct both the int and the magic string, so that they are never out of sync.
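The command-line tool mentioned above could be as small as the following sketch. It assumes the exact marker text from this answer and assumes the version consists only of digits and dots; the function name find_marker_versions is hypothetical.

```python
import re
import sys

# The marker format from the answer above; the version is assumed to be
# made of digits and dots only.
MARKER_RE = re.compile(rb"MAGIC MARKER STRING VERSION: ([0-9.]+) END OF MAGIC")


def find_marker_versions(data):
    """Return every version found between the magic markers in raw bytes."""
    return [version.decode("ascii") for version in MARKER_RE.findall(data)]


if __name__ == "__main__":
    # Report the embedded versions of each file named on the command line.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for version in find_marker_versions(f.read()):
                print("%s: %s" % (path, version))
```

Since the search runs over the raw bytes of the file, it does not depend on symbols being present and also works on stripped binaries, provided the string itself was not optimized away.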
There's a couple of options I think.
My first instinct is to make sure the version information lives in its own section in the ELF file. You can then dump that section with objdump -s -j <section name> /bin/whatever.
This rather relies on objdump being available of course.
Alternatively you can do what Keith suggested, and just use 'strings', along with a magical marker string. This feels a little hackish, but should work quite well.
Finally, why don't you just add a --version command line option? You can then store the version information however you like, and trivially retrieve it using the one tool which is certain to be installed on any system which has your software.
A terrible hack that I've used in the past is to embed the version information in a variable name, so nm will show:
00000000000dcfbc r plugin_version_14
Why not write your own tool in C/C++ to get that version? You could use dlopen, then dlsym to get the symbol, and print its value to standard output. This way you also verify that the symbol is actually there. It looks like 20 to 30 lines of code to me and about 20 minutes of your life :)
I know that the question is about command line, but writing such a tool yourself should be easy (especially if such a command line tool does not exist).
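The same dlopen/dlsym idea can be tried out quickly from Python before writing the C tool, because ctypes.CDLL wraps dlopen and in_dll performs the symbol lookup. This is a minimal sketch under two assumptions: the symbol is a C int, and it is exported from the shared object. A lowercase type letter in nm output, such as the r shown in the question, marks a local symbol, which this kind of lookup cannot see. The function name read_int_symbol is hypothetical.

```python
import ctypes
import sys


def read_int_symbol(library_path, symbol_name):
    """Load a shared object and return the value of an exported C int.

    Raises OSError if the library cannot be loaded and ValueError if the
    symbol is not found. Assumes the symbol really is a C int.
    """
    library = ctypes.CDLL(library_path)  # dlopen() under the hood
    return ctypes.c_int.in_dll(library, symbol_name).value


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python read_int_symbol.py ./plugin.so plugin_version
    print(read_int_symbol(sys.argv[1], sys.argv[2]))
```

Keep in mind that loading the library runs its initialisation code and requires all of its dependencies to resolve, so this suits trusted plugins in a test environment rather than arbitrary binaries.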
If the binary is not stripped, you could use gdb to print the variable. (I just tried to script gdb, but it seems to refuse to work if stdin is not a tty; maybe expect will do the job?)
If you can accept using python, this might help:
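import struct
import subprocess
import sys

if __name__ == '__main__':
    so = sys.argv[1]
    sym = sys.argv[2]
    # nm prints "<address> <type> <name>"; take the address of the first match.
    addr = subprocess.check_output('nm %s | grep %s' % (so, sym), shell=True)
    addr = int(addr.split()[0], 16)
    # The address is used as a file offset, which only holds when the
    # symbol's section is loaded at an address equal to its file offset.
    with open(so, 'rb') as so_file:
        so_file.seek(addr)
        data = so_file.read(4)
    # '@i' unpacks one int in native byte order and size.
    print(struct.unpack('@i', data)[0])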
Disclaimer: This script doesn't do any error checking (if you like it I'm sure you can come up with some ;)). It also assumes you're reading a 4-byte native int value.
$ cat global.c
const int plugin_version = 14;
$ python readsym.py global.so plugin_version
14

Environment variable in Macro path

I need to define some path to files with macros. How can I use the $HOME environment variable?
I can't find it on the GNU manual, and this doesn't work:
#define LOGMMBOXMAN "$HOME/mmbox/LOGmmboxman"
No it shouldn't and you probably don't want constant-defined settings like that in any case. If you did that and it worked as you're intending to use it, your home directory would be built in as whatever $HOME is for whoever's doing the building. The executable then depends on that specific home directory existing. If that's OK, just #define your own home. I suspect it isn't though, so you need to deduce it at runtime.
For run-time deduction, what you want is getenv:
const char* home_dir = getenv("HOME");
If there is no $HOME defined, you get NULL returned so be sure to test for this.
You can then build your string based on that. You'll need #include <stdlib.h>.
Sounds like you are really asking "how can I set some cpp macro from my environment?"
If so, then you should just be able to add it to CPPFLAGS.
export CPPFLAGS="$CPPFLAGS -D LOGMMBOXMAN=\"$HOME/mmbox/LOGmmboxman\""
Then in your code
#ifndef LOGMMBOXMAN
#error LOGMMBOXMAN not defined
#endif
Then make sure your source is built using the CPPFLAGS in the command line to gcc:
$ gcc -c file.c $CPPFLAGS
You can't. You need to use your build system to define a macro with the $HOME value (or equivalent on a non-unix system), i.e. something like this:
gcc -DHOME='"/home/username"' file.c
Or "/Users/username" for Mac OS X, or "C:\Users\username" (or something) for Windows. Basically, GCC provides the -D flag to define a macro on the command line. You can set up a script (or your build system) to take care of this macro definition for you, or perhaps make a system-dependent include file to define the HOME macro properly.
Then, in your C header, you can do:
#define LOGMMBOXMAN HOME "/mmbox/LOGmmboxman"
Note that, in C, consecutive string literals are concatenated. So this macro expands to:
"/home/username" "/mmbox/LOGmmboxman"
Which C interprets as
"/home/username/mmbox/LOGmmboxman"
EDIT: All that thinking, and I didn't think! D'oh!
As others have pointed out, you probably don't want to do this. This will hard-code your program to work for one specific user's home directory. This will likely cause problems if you want each user to use your program, but for each to keep his (or her) own separate files.
Ninefingers' answer is what you're most likely looking for. In the event that you ever find yourself in need of the above technique (i.e. storing application files in a system-specific place) I will leave my answer unchanged, but I expect it won't help you here.
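If you do end up generating the -D flag from a script, the quoting is the part that tends to go wrong: the macro only expands to a string literal if the double quotes survive the shell. The sketch below builds such a flag for a POSIX shell. The function name string_macro_flag is hypothetical, and the escaping covers only backslashes and double quotes.

```python
import shlex


def string_macro_flag(name, value):
    """Build a -D flag whose macro expands to a C string literal.

    The value is escaped for C (backslashes and double quotes), wrapped in
    double quotes, and then quoted for a POSIX shell with shlex.quote().
    """
    c_literal = '"%s"' % value.replace("\\", "\\\\").replace('"', '\\"')
    return "-D%s=%s" % (name, shlex.quote(c_literal))


if __name__ == "__main__":
    print(string_macro_flag("HOME", "/home/username"))
```

Run as a script, this prints -DHOME='"/home/username"', which can be pasted into a gcc command line. Inside the C source, HOME then concatenates with the adjacent literal as described above.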

Resources