What is the easiest way to look up function names of a C binary in a cross-platform manner? - c

I want to write a small utility to call arbitrary functions from a C shared library. The user should be able to list all the exported functions, similar to what objdump or nm does. I checked these utilities' sources, but they are intimidating. I couldn't find enough information on Google about whether the dl library has this functionality either.
(Clarification edit: I don't just want to call a function that is known beforehand. I would appreciate an example fragment along with your answer.)

This might be near to what you're looking for:
http://python.net/crew/theller/ctypes/

Well, I'll speak a little bit about Windows. The C functions exported from DLLs do not contain information about the types, names, or number of arguments -- nor do I believe you can determine what the calling convention is for a given function.
For comparison, take a look at National Instruments' LabVIEW programming environment. You can import functions from DLLs, but you have to manually type in the types and names of the arguments before you use a given function. If this limitation is OK, please edit your question to reflect that.
I don't know what is possible with *nix environments.
EDIT: Regarding your clarification: if you don't know what the function is ahead of time, you're pretty much screwed on Windows, because in general you won't be able to determine the number and types of arguments the function takes.
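That said, the export names themselves are stored in the DLL's export directory, so listing them on Windows is doable even though the signatures are not recorded. A rough sketch (minimal error handling; note that LoadLibrary runs the DLL's initialization code):

/* List the export names of a DLL by walking its export directory.
   Rough sketch: assumes the DLL can be loaded into this process. */
#include <windows.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;

    HMODULE mod = LoadLibraryA(argv[1]);      /* e.g. "user32.dll" */
    if (!mod) return 1;

    BYTE *base = (BYTE *)mod;
    IMAGE_DOS_HEADER *dos = (IMAGE_DOS_HEADER *)base;
    IMAGE_NT_HEADERS *nt  = (IMAGE_NT_HEADERS *)(base + dos->e_lfanew);
    IMAGE_DATA_DIRECTORY dir =
        nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    if (dir.VirtualAddress == 0) return 1;    /* no export table */

    IMAGE_EXPORT_DIRECTORY *exp =
        (IMAGE_EXPORT_DIRECTORY *)(base + dir.VirtualAddress);
    DWORD *names = (DWORD *)(base + exp->AddressOfNames);

    for (DWORD i = 0; i < exp->NumberOfNames; i++)
        puts((const char *)(base + names[i]));  /* entries are RVAs to name strings */

    FreeLibrary(mod);
    return 0;
}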

You could try ParaDyn's SymtabAPI. It lets you grab all the symbols in a shared library (or executable) and look at their types, offset, etc. It's all wrapped up in a reasonably nice C++ interface and runs on a lot of platforms. It also provides support for binary rewriting, which you could potentially use to do what you're talking about at runtime.
Webpage is here:
http://www.paradyn.org/html/symtab2.1-features.html
Documentation is here:
http://ftp.cs.wisc.edu/paradyn/releases/release5.2/doc/symtabProgGuide.21.pdf

A standard-ish API is the dlopen/dlsym API; AFAIK it's implemented by GNU libc on Linux and Mac OS X's standard C library (libSystem), and it might be implemented on Windows by MinGW or other compatibility packages.
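A minimal sketch of the dlopen/dlsym side, for what it's worth. Note that dlsym only looks up a name you already know, it does not enumerate symbols, and it cannot tell you the signature, so this sketch simply assumes the chosen symbol is a function taking no arguments and returning int. Link with -ldl on glibc.

#include <dlfcn.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s library.so symbol\n", argv[0]);
        return 1;
    }

    void *lib = dlopen(argv[1], RTLD_LAZY);
    if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Assume the symbol is an int (*)(void) -- the library cannot tell us. */
    int (*fn)(void) = (int (*)(void))dlsym(lib, argv[2]);
    if (!fn) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    printf("returned %d\n", fn());
    dlclose(lib);
    return 0;
}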

The only sensible solution (without reinventing the wheel) seems to be to use libbfd. The downsides are that its documentation is scarce and it is a bit bloated for my purposes.
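For the record, symbol listing with libbfd looks roughly like the sketch below (from memory, so check it against your bfd.h; link with -lbfd, and for the exported functions of a .so you may want bfd_get_dynamic_symtab_upper_bound / bfd_canonicalize_dynamic_symtab instead of the plain symtab calls):

/* Some recent versions of bfd.h refuse to compile unless these are defined. */
#define PACKAGE "listsyms"
#define PACKAGE_VERSION "1.0"
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    bfd_init();

    bfd *abfd = bfd_openr(argv[1], NULL);
    if (!abfd || !bfd_check_format(abfd, bfd_object)) return 1;

    long size = bfd_get_symtab_upper_bound(abfd);
    if (size <= 0) return 1;

    asymbol **syms = malloc(size);
    long count = bfd_canonicalize_symtab(abfd, syms);

    for (long i = 0; i < count; i++)
        puts(bfd_asymbol_name(syms[i]));

    free(syms);
    bfd_close(abfd);
    return 0;
}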

The source code for nm and objdump is available. If you want to start from a specification, then ELF is what you want to look into.
/Allan

I've written something like this in Perl. On Win32 it runs dumpbin /exports, on POSIX it runs nm -gP. Then, since it's Perl, the results are interpreted using regular expressions: / _(\S+)@\d+/ for Win32 (stdcall functions) and /^(\S+) T/ for POSIX.
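The same trick is easy enough from C on POSIX if you don't mind shelling out; a small sketch (assumes nm is on PATH and that -gP prints "name type value size" per line):

#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;

    char cmd[1024];
    snprintf(cmd, sizeof cmd, "nm -gP %s", argv[1]);

    FILE *p = popen(cmd, "r");
    if (!p) return 1;

    char line[1024], name[512], type[8];
    while (fgets(line, sizeof line, p)) {
        /* keep global text (T) symbols, i.e. exported functions */
        if (sscanf(line, "%511s %7s", name, type) == 2 && type[0] == 'T')
            puts(name);
    }
    pclose(p);
    return 0;
}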

Eek! You've touched on one of the most platform-dependent topics in programming. On Windows you have DLLs; on Linux you have ld.so and ld-linux.so; and Mac OS X has dyld.

Related

C Header Files and ABI

I'd like to know how C header files and ABIs relate. The sizes of various types are architecture- and even compiler-dependent. How, then, can one reliably link to a C library?
For a more specific problem: when using Haskell's FFI, one uses only Haskell types like CDouble to define (duplicate the definition of) the C library interface. I don't know where the binary type-size information comes from. What is the trick that makes the linking work?
Please see this link https://code.google.com/p/tabi
It may help you to avoid difficulties with possible ABI differences between Haskell and C.
The library type information comes from magic macros that are run to insert information grabbed from the C compiler by autoconf.
For example, see the definition of CDouble here: https://hackage.haskell.org/package/base-4.8.2.0/docs/src/Foreign.C.Types.html#CDouble
and then see where the HTYPE_DOUBLE size comes from in this autoconf input here: https://hackage.haskell.org/package/base-4.8.2.0/src/include/HsBaseConfig.h.in
Since GHC compiles against the compiler/arch it is compiled with (except in the special cross-compiler modes, which are new and different in ways I'm not fully cognizant of), this makes everything tie out with the ABI properly.
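To see concretely what those autoconf checks are measuring, this is essentially all they boil down to (illustrative only): ask the platform's C compiler and record the answers.

#include <stdio.h>

int main(void)
{
    /* These are the kind of platform facts that end up in the generated HsBaseConfig.h */
    printf("sizeof(double) = %zu\n", sizeof(double));
    printf("sizeof(long)   = %zu\n", sizeof(long));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    return 0;
}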

Using parse_datetime from gnu c

I am developing a program for analyzing time series under GNU/Linux. To analyze a time window, I want to be able to specify start/end times on the command line. Parsing dates using strptime is simple enough; however, I would like to use the flexible 'natural language' format used by the Unix 'date' command. There, this is done using the parse_datetime function.
I have the source of the coreutils, but would like to avoid copying over the code and all attached header files.
My question is: is there a standard library under Unix/Linux which gives access to the full power of parse_datetime()?
The function you refer to is not part of any standard, nor any stock utility library. However, it is available as a semi-standalone component as part of gnulib, namely the parse-datetime module. You will need to take it and incorporate it into your program; the gnulib distribution has tools for that. Be aware that if you do this you have to GPL your entire program (this is not a big deal if the program is only for your personal use -- the GPL's requirements only kick in when you start giving the compiled program to other people).
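If you do pull in the module, usage is roughly like the sketch below. The header name and signature are from the gnulib parse-datetime module as I remember it, so double-check against the version you import; passing NULL as the third argument means "relative to now".

#include <stdio.h>
#include <stdbool.h>
#include <time.h>
#include "parse-datetime.h"   /* provided by the gnulib module */

int main(int argc, char **argv)
{
    if (argc < 2) return 1;

    struct timespec result;
    if (!parse_datetime(&result, argv[1], NULL)) {
        fprintf(stderr, "could not parse '%s'\n", argv[1]);
        return 1;
    }
    printf("%lld\n", (long long)result.tv_sec);   /* seconds since the epoch */
    return 0;
}

That gets you the same "next tuesday", "2 days ago" style of input that date -d accepts.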
A possible alternative is g_date_set_parse from GLib, but I can't speak to how clever it is.
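A hedged sketch of the GLib route, for comparison; note that g_date_set_parse handles calendar dates only, not times of day, so it may not be enough for a time window. Build with pkg-config --cflags --libs glib-2.0.

#include <glib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;

    GDate date;
    g_date_clear(&date, 1);
    g_date_set_parse(&date, argv[1]);   /* e.g. "2009-03-01" or "1 Mar 2009" */

    if (!g_date_valid(&date)) {
        fprintf(stderr, "could not parse '%s'\n", argv[1]);
        return 1;
    }
    printf("%04u-%02u-%02u\n",
           (unsigned)g_date_get_year(&date),
           (unsigned)g_date_get_month(&date),
           (unsigned)g_date_get_day(&date));
    return 0;
}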

What is GLIBC? What is it used for?

I was searching for the source code of the C standard library. What I mean is, for example, how are cos, abs, printf, scanf, fopen, and all the other standard C functions written? I want to see their source code.
So while searching for this, I came across GLIBC, but I don't know what it actually is. It is the GNU C Library, and it contains some source code, but what is that code, actually? Is it the source code of the standard functions, or is it something else? And what is it used for?
It's the implementation of the standard C library described in the C standards, plus some extra useful stuff which is not strictly standard but is used frequently.
Its main contents are:
1) The C library described in the ANSI, C99, and C11 standards. It includes macros, symbols, function implementations, etc. (printf(), malloc(), etc.)
2) The POSIX standard library: the "userland" glue for system calls (open(), read(), etc.). Strictly speaking, glibc does not "implement" system calls; the kernel does. But glibc provides the userland interface to the services provided by the kernel, so that a user application can use a system call just like an ordinary function.
3) Also some nonstandard but useful stuff.
"use the force, read the source "
$git clone git://sourceware.org/git/glibc.git
(I was recently pretty enlightened when i looked through malloc.c in glibc)
There are several implementations of the standard. Glibc is the implementation that most Linuxes use, but there are others. Glibc also contains (as Aftnix states) the glue functions which set up the scene for jumps into the kernel (also known as system calls). So many of glibc's 'functions' don't do the actual work but only delegate to the kernel.
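To make that concrete, here is a small Linux-only sketch contrasting the wrapper with the raw system call (syscall() is itself just a generic glibc helper that traps into the kernel):

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    /* The usual way: glibc's write() wrapper. */
    const char a[] = "via the glibc wrapper\n";
    write(STDOUT_FILENO, a, sizeof a - 1);

    /* The same kernel service, invoked without the wrapper. */
    const char b[] = "via a raw system call\n";
    syscall(SYS_write, STDOUT_FILENO, b, sizeof b - 1);

    return 0;
}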
To read the source of Glibc, just google for it. There are myriad sites which carry it, and also several variations.
Windows uses Microsoft's own implementation, which I believe is called MSVCRT.DLL. I doubt that you will find the source code to that library anywhere. Also note that some functions which a Linux hacker might think of as 'standard' simply don't exist on Windows (notably fork). The reverse is also true.
Other systems will have their own libc.
The glibc package contains standard libraries which are used by multiple programs on the system. In order to save disk space and memory, as well as to make upgrading easier, common system code is kept in one place and shared between programs. This particular package contains the most important sets of shared libraries: the standard C library and the standard math library. Without these two libraries, a Linux system will not function. The glibc package also contains national language (locale) support.
Yes, it's the implementation of the standard library functions.
More specifically, it is the implementation used on all GNU systems and on almost all *NIX systems that use the Linux kernel.
Here are a few "hands-on" points of view:
it implements the POSIX C API on top of the Linux kernel: What is the meaning of "POSIX"?
it contains hand-optimized assembly versions of ANSI C functions for several different architectures, e.g. strlen:
sysdeps/x86_64/strlen.S
sysdeps/aarch64/strlen.S
how to modify its source, recompile, and use it to understand it better: How to compile my own glibc C standard library from source and use it?
how to GDB step debug it with QEMU and Buildroot: https://github.com/cirosantilli/linux-kernel-module-cheat/tree/9693c23fe6b2ae1409010a1a29ff0c1b7bd4b39e#gdbserver-libc

How do I use C libraries in assembler?

I want to know how to write a text editor in assembler. But modern operating systems require C libraries, particularly for their windowing systems. I found this page, which has helped me a lot.
But I wonder if there are details I should know. I know enough assembler to write programs that open windows on Linux using GTK+, but I want to understand what I have to pass to a function for it to be valid input, so that it will be easier to make use of any C library. For interfacing between C and x86 assembler, I know what can be learned from this page, and little else.
One of the most instructive ways to learn how to call C from assembler is to:
Write a C program that calls the C function of interest
Compile it, and look at the assembly listing (gcc -S)
This approach makes it easy to experiment by starting with something that is already known to work. You can change the C source and see how the generated code changes, and you can start with the generated code and modify it yourself.
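As a concrete starting point, something as small as this is enough to study (names here are just an example):

/* Compile with: gcc -S -O0 call_puts.c
   Then read call_puts.s to see how the argument is passed and the call made. */
#include <stdio.h>

int main(void)
{
    puts("hello from C");
    return 0;
}

On x86-64 Linux you will typically see the string's address loaded into %rdi before the call to puts; on 32-bit x86 it is pushed on the stack instead.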
In general, the steps are:
push the parameters on the stack
call the function
clear the stack
The links you have in your question show all these steps.
The OS may define the calling standard (it pretty well must define the standard for invoking system calls), in which case you need only find where that is documented and read it closely.

Delphi dcu to obj

Is there a way to convert a Delphi .dcu file to an .obj file so that it can be linked using a compiler like GCC? I've not used Delphi for a couple of years but would like to use if for a project again if this is possible.
Delphi can output .obj files, but they are in a 32-bit variant of Intel OMF. GCC, on the other hand, works with ELF (Linux, most Unixes), COFF (on Windows) or Mach-O (Mac).
But that alone is not enough. It's hard to write much code without using the runtime library, and the implementation of the runtime library will be dependent on low-level details of the compiler and linker architecture, for things like correct order of initialization.
Moreover, there's more to compatibility than just the object file format; code on Linux, in particular, needs to be position-independent, which means it can't use absolute values to reference global symbols, but rather must index all its global data from a register or relative to the instruction pointer, so that the code can be relocated in memory without rewriting references.
DCU files are a serialization of the Delphi symbol tables and code generated for each proc, and are thus highly dependent on the implementation details of the compiler, which changes from one version to the next.
All this is to say that it's unlikely that you'd be able to get much Delphi (dcc32) code linking into a GNU environment, unless you restricted yourself to the absolute minimum of non-managed data types (no strings, no interfaces) and procedural code (no classes, no initialization section, no data that needs initialization, etc.)
(answer to various FPC remarks, but I need more room)
For a good understanding, you have to know that a Delphi .dcu translates to two different FPC files: a .ppu file with the mentioned symtable stuff, which includes non-linkable code like inline functions and generic definitions, and a .o which is MinGW-compatible (COFF) on Windows. Cygwin is MinGW-compatible too at the linking level (but the runtime is different and scary). Anyway, mingw32/64 is our reference gcc on Windows.
The PPU has a similar versioning problem to Delphi's DCU, probably for the same reasons. The PPU format is different in nearly every major release (so 2.0, 2.2, 2.4), and typically changes 2-3 times a year in trunk.
So while FPC on Windows uses its own assemblers and linkers, the .o files it generates are still compatible with mingw32. In general FPC's output is very gcc-compatible, and it is often possible to link in gcc static libs directly, allowing e.g. mysql and postgres link libraries to be linked into apps with a suitable license (like e.g. the GPL). On 64-bit they should be compatible too, but this is probably less tested than win32.
The textmode IDE even links in the entire GDB debugger in library form. GDB is one of the main reasons for gcc compatibility on Windows.
While Barry's points about the runtime in general hold for FPC too, it might be slightly easier to work around this. It might only require calling certain functions to initialize the FPC RTL from your startup code, and similarly for the finalization. Compile a minimal FPC program with -al and see the resulting assembler (in the .s file, most notably initializeunits and finalizeunits). Moreover, the RTL is more flexible and probably more easily cut down to a minimum.
Of course, as soon as you also require exceptions to work across gcc<->FPC boundaries, you are out of luck. FPC does not use SEH, or any scheme compatible with anything else ATM (contrary to Delphi, which uses SEH, which at least in theory should give you an advantage there, Barry?). OTOH, gcc might use its own libunwind instead of SEH.
Note that the default calling convention of FPC on x86 is Delphi-compatible register, so you might need to insert proper cdecl modifiers (which should be gcc-compatible), or you can even set it for entire units at a time using {$calling cdecl}.
On *nix this is bog standard (e.g. Apache modules); I don't know many people who do this on win32, though.
About compatibility: FPC can compile packages like Indy, TeeChart, Zeos, ICS, Synapse, VST, and reams more with little or no mods. The dialect levels of released versions are a mix of D7 and up, with the focus on D7. The dialect level is slowly creeping to the D2006 level in trunk versions (with for-in, class abstract, etc.).
Yes. Have a look at the Project Options dialog box:
As far as I am aware, Delphi only supports the OMF object file format. You may want to try an object format converter such as Agner Fog's objconv.
Since the DCU format is proprietary and has a tendency of changing from one version of Delphi to the next, there's probably no reliable way to convert a DCU to an OBJ. Your best bet is to build them in OBJ format in the first place, as per Andreas's answer.
