How to get function address by name? - c

I'd like to get function's address by name.
For example, currently I am using dlsym:
unsigned long get_func_addr(const char *func_name)
{
    return (unsigned long)dlsym(NULL, func_name);
}
However, dlsym only works for extern functions. It won't work for static functions. I know there could be multiple static functions with the same name in different files, but I need to get the address of at least one static function with that name. Sometimes a static function will be inlined, but that's OK if the C file is compiled with debug information. I think that with -g the symbol table entries for static functions are present, but how can I access them?
I don't want to create a table mapping strings to function addresses. I need to find a way to do it dynamically.

This isn't really possible without creating some external file that can be used for a look-up. As you mentioned, a symbol table of static functions is present, but it is generated at compile/link time; it is not something accessible from a non-compiled code module.
So basically you could generate and export the symbol table as an external file from your compiled and linked executable, and then have a function that looks up the function name in that file at run time. The entry it finds provides the information needed to get the address at which the compiler and linker placed the function.
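As a rough illustration of the external-file approach, here is a minimal sketch. It assumes the symbol table was dumped to text with a tool such as nm, one symbol per line in the form "address type name", and that the text has already been read into memory. The function name lookup_in_nm_text is made up for this example. For a position-independent executable the values in such a dump are offsets, so the load base would still have to be added at run time; that step is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Search text formatted like nm output ("<hex value> <type> <name>" per line)
 * for `name`. Returns 1 and stores the value in *addr on success, 0 otherwise.
 * Lines without an address (undefined symbols) fail to parse and are skipped. */
int lookup_in_nm_text(const char *nm_text, const char *name, unsigned long *addr)
{
    assert(nm_text != NULL && name != NULL && addr != NULL);

    const char *line = nm_text;
    while (*line != '\0') {
        unsigned long value;
        char type;
        char sym[256];

        if (sscanf(line, "%lx %c %255s", &value, &type, sym) == 3 &&
            strcmp(sym, name) == 0) {
            *addr = value;
            return 1;
        }

        /* Advance to the start of the next line, if there is one. */
        const char *next = strchr(line, '\n');
        if (next == NULL)
            break;
        line = next + 1;
    }
    return 0;
}
```

In nm output a lowercase t in the type column marks a local symbol in the text section, which is how a static function that survived compilation typically shows up.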

A static function need not even exist in the binary, so there's no way to get its address. Even if it does exist, it might have been modified by the compiler based on the knowledge that certain arguments can only take particular values, or it might have had the calling convention adjusted such that it's not externally callable, etc. The only way you can be sure a "real" version of a static function exists is if its address is made visible to other modules via a function pointer.

If the function you want to look up is in a DLL, you could use the Windows API function GetProcAddress(), which takes a handle to the loaded DLL (obtained, for example, from LoadLibrary() or GetModuleHandle()) and the name of the function as parameters.
If you want to find user defined functions, I would recommend using a lookup table as the names of those functions are not stored.
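For user-defined functions, you could have every function register its own name and address with a registry function when it starts, i.e.:
void my_func(void)
{
    /* the address and the name; the helper cannot be called "register", which is a reserved keyword in C */
    register_func(my_func, "my_func");
    /* ... */
}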
Thus you could lookup the function later by name.
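A minimal sketch of what such a registry could look like. The names register_func, lookup_func, func_ptr and MAX_FUNCS are all hypothetical (register itself is a reserved keyword in C, so the helper needs another name), and the fixed-size array is just the simplest thing that works.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*func_ptr)(void);

struct func_entry {
    const char *name;
    func_ptr    func;
};

#define MAX_FUNCS 64                      /* arbitrary capacity for this sketch */
static struct func_entry registry[MAX_FUNCS];
static size_t registry_count;

/* Record a name/address pair. Returns 0 on success (or if the name is
 * already registered) and -1 if the table is full. */
int register_func(func_ptr func, const char *name)
{
    assert(func != NULL && name != NULL);
    for (size_t i = 0; i < registry_count; i++)
        if (strcmp(registry[i].name, name) == 0)
            return 0;
    if (registry_count == MAX_FUNCS)
        return -1;
    registry[registry_count].name = name;
    registry[registry_count].func = func;
    registry_count++;
    return 0;
}

/* Return the registered address for `name`, or NULL if it is unknown. */
func_ptr lookup_func(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < registry_count; i++)
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].func;
    return NULL;
}

void my_func(void)
{
    register_func(my_func, "my_func");    /* the address and the name */
    /* ... */
}
```

The obvious drawback is that a function can only be found after it has run at least once, since that is when it registers itself.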

Related

How to tell apart imported function vs imported global variable in a DLL's PE header?

I'm writing a small tool that should be able to inspect an arbitrary process of interest and check if any of its statically linked functions were trampolined. (An example of a trampoline could be what Microsoft Detours does to a process.)
For that I parse the PE header of the target process and retrieve all of its imported DLLs with all imported functions in them. Then I can compare the following between DLLs on disk and the DLLs loaded in the target process memory:
A. Entries in the Import Address Table for each imported function.
B. First N bytes of each function's machine code.
And if any of the above do not match, this will most certainly mean that a trampoline was applied to a particular function (or WinAPI.)
This works well, except in one situation: a target process can import a global variable instead of a function. For example, _acmdln is such a global variable. You can still find it in msvcrt.dll and use it as such:
//I'm not sure why you'd want to do it this way,
//but it will give you the current command line.
//So just to prove the concept ...
HMODULE hMod = ::GetModuleHandle(L"msvcrt.dll");
char* pVar = (char*)::GetProcAddress(hMod, "_acmdln");
char* pCmdLine = pVar ? *(char**)pVar : NULL;
So, what this means for my trampoline checking tool is that I need to differentiate between an imported function (WinAPI) and a global variable. Any idea how?
PS. If I don't do that, my algorithm that I described above will compare a global variable's "code bytes" as if it was a function, which is just a pointer to a command line that will most certainly be different, and then flag it as a trampolined function.
PS2. Not exactly my code, but a similar way to parse PE header can be found here. (Search for DumpImports function for extracting DLL imports.)
Global variables will be in a data section such as .data, not in the .text section. In addition, a section that holds data rather than code will not have execute permission. You can therefore use both of these characteristics to filter.
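A minimal sketch of that filter, assuming the section headers have already been parsed out of the PE file. The struct section_info is a simplified stand-in for IMAGE_SECTION_HEADER and rva_is_executable is a hypothetical helper; only the flag value 0x20000000 (IMAGE_SCN_MEM_EXECUTE) is taken from the PE format itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of IMAGE_SCN_MEM_EXECUTE in the PE/COFF section characteristics. */
#define SCN_MEM_EXECUTE 0x20000000u

/* Simplified stand-in for IMAGE_SECTION_HEADER: only the fields used here. */
struct section_info {
    const char *name;
    uint32_t    virtual_address;   /* RVA at which the section starts */
    uint32_t    virtual_size;      /* size of the section in memory */
    uint32_t    characteristics;   /* section flags */
};

/* Returns 1 if `rva` lies in an executable section, 0 if it lies in a
 * non-executable section, and -1 if no section contains it. */
int rva_is_executable(const struct section_info *sections, size_t count,
                      uint32_t rva)
{
    assert(sections != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        uint32_t start = sections[i].virtual_address;
        if (rva >= start && rva - start < sections[i].virtual_size)
            return (sections[i].characteristics & SCN_MEM_EXECUTE) != 0;
    }
    return -1;
}
```

An import whose target address resolves to a section without the execute flag is then treated as a variable and skipped by the byte comparison.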

How does C limit a static function's use to only its file?

I understand that a static function in C allows that particular function to only be called within the confines of that file. What I am interested in is how this occurs. Is it being placed into a specific part of memory, or is the compiler applying a specific operation to that function? Can this same process be applied to a function call in assembly?
Declaring a function static doesn't really prevent it from being called from other translation units.
What static does is prevent the function from being referred to (linked) from other translation units by name. That eliminates the possibility of direct calls to that function, i.e. calls "by name". To achieve that, the compiler simply excludes the function name from the table of external names exported from the translation unit. Other than that, there's absolutely nothing special about static functions.
You still can call that function from other translation units by other means. For example, if you somehow obtained a pointer to static function in other translation unit, you can call it through that pointer.
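To make that concrete, here is a small sketch. It is squeezed into one file for brevity; the comments mark which parts would live in which translation unit. All names (triple, get_triple, call_through_pointer) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*int_fn)(int);

/* --- would live in foo.c --- */

/* Internal linkage: the name "triple" is not visible to other files. */
static int triple(int x)
{
    return 3 * x;
}

/* The only exported name. It hands out the address of the static function. */
int_fn get_triple(void)
{
    return triple;
}

/* --- would live in bar.c, which only sees the declaration of get_triple --- */

int call_through_pointer(int x)
{
    int_fn fn = get_triple();
    assert(fn != NULL);
    return fn(x);          /* calls triple() without ever naming it */
}
```

Writing triple(7) directly in bar.c would fail at link time with an undefined reference, while the call through the pointer works fine.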
It doesn't make it into the object's name table which prevents it from being linked into other stuff.
Functions and other names are exported as symbols in the object file. The linker uses these symbols to resolve all sorts of dangling references at link time (e.g. a call to a function defined in another file). When you declare a function static, it simply won't be exported as a symbol, so it won't be picked up by any other file. You could still call it from another file if you had a function pointer to it.
It's in fact the opposite. When a function is not static, its name is written somewhere in the object file, which the linker can then use to link other object files using this function, to the address of that function.
When the function is declared static, the compiler simply doesn't put the name there.

How to create modules in C

I have an interface with which I want to be able to statically link modules. For example, I want to be able to call all functions (albeit in separate files) that are called FOO or that match a certain prototype, and ultimately call into a function in one file without including a header for it in the other files. Don't say that it is impossible, since I found a hack that can do it, but I want a non-hacked method. (The hack is to use nm to get the functions and their prototypes; then I can call the functions dynamically.) Also, I know you can do this with dynamic linking; however, I want to statically link the files. Any ideas?
Put a table of all functions into each translation unit:
struct functions MOD1FUNCS[]={
    {"FOO", foo},
    {"BAR", bar},
    {0, 0}
};
Then put a table into the main program listing all these tables:
struct functions* ALLFUNCS[]={
    MOD1FUNCS,
    MOD2FUNCS,
    0
};
Then, at run time, search through the tables, and lookup the corresponding function pointer.
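The run-time search could be sketched like this. struct functions is not defined in the answer, so the field names name and fn, the int (*)(void) prototype, and the helper find_function are assumptions. Both module tables sit in one file here for brevity; in a real layout each table would live in its own translation unit and the main program would declare them extern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*module_fn)(void);

struct functions {
    const char *name;   /* lookup key, NULL terminates a table */
    module_fn   fn;
};

/* Module functions can stay static; only the tables are exported. */
static int foo(void) { return 1; }
static int bar(void) { return 2; }
static int baz(void) { return 3; }

struct functions MOD1FUNCS[] = { {"FOO", foo}, {"BAR", bar}, {0, 0} };
struct functions MOD2FUNCS[] = { {"BAZ", baz}, {0, 0} };

struct functions *ALLFUNCS[] = { MOD1FUNCS, MOD2FUNCS, 0 };

/* Walk every module table and return the first function whose name matches,
 * or NULL if no table contains it. */
module_fn find_function(const char *name)
{
    assert(name != NULL);
    for (size_t m = 0; ALLFUNCS[m] != NULL; m++)
        for (const struct functions *f = ALLFUNCS[m]; f->name != NULL; f++)
            if (strcmp(f->name, name) == 0)
                return f->fn;
    return NULL;
}
```

A caller would check the result for NULL before calling it, e.g. find_function("FOO") returns a pointer to foo and find_function("NOPE") returns NULL.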
This is somewhat common in writing test code, e.g., you want to call all functions that start with test_. So you have a shell script that greps through all your .c files and pulls out the function names that match test_.*. Then that script generates a test.c file that contains a function that calls all the test functions.
e.g., generated program would look like:
int main() {
    initTestCode();
    testA();
    testB();
    testC();
}
Another way to do it would be to use some linker tricks. This is what the Linux kernel does for its initialization. Functions that are init code are marked with the qualifier __init. This is defined in linux/init.h as follows:
#define __init __section(.init.text) __cold notrace
This causes the linker to put that function in the section .init.text. The kernel will reclaim memory from that section after the system boots.
For calling the functions, each module will declare an initcall function with some other macros core_initcall(func), arch_initcall(func), et cetera (also defined in linux/init.h). These macros put a pointer to the function into a linker section called .initcall.
At boot-time, the kernel will "walk" through the .initcall section calling all of the pointers there. The code that walks through looks like this:
extern initcall_t __initcall_start[], __initcall_end[], __early_initcall_end[];
static void __init do_initcalls(void)
{
    initcall_t *fn;
    for (fn = __early_initcall_end; fn < __initcall_end; fn++)
        do_one_initcall(*fn);
    /* Make sure there is no pending stuff from the initcall sequence */
    flush_scheduled_work();
}
The symbols __initcall_start, __initcall_end, etc. get defined in the linker script.
In general, the Linux kernel does some of the cleverest tricks with the GCC pre-processor, compiler and linker that are possible. It's always been a great reference for C tricks.
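The same idea can be tried in user space without a custom linker script. This is a sketch under a few assumptions: GCC or Clang targeting ELF, linked with GNU ld or a compatible linker, which define __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The section name my_initcalls, the macro MY_INITCALL and the functions below are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*initcall_t)(void);

/* Store a pointer to `fn` in the section "my_initcalls". The `used`
 * attribute keeps the otherwise unreferenced variable from being dropped. */
#define MY_INITCALL(fn) \
    static initcall_t initcall_ptr_##fn \
    __attribute__((used, section("my_initcalls"))) = fn

/* Section bounds, provided by the linker (assumption: GNU ld or compatible). */
extern initcall_t __start_my_initcalls[];
extern initcall_t __stop_my_initcalls[];

int calls_made;   /* just so the effect of the calls can be observed */

static void init_a(void) { calls_made += 1; }
static void init_b(void) { calls_made += 10; }

MY_INITCALL(init_a);
MY_INITCALL(init_b);

/* Call every pointer stored in the section; returns how many were called.
 * The order of entries is up to the linker, so callers should not rely on it. */
size_t run_initcalls(void)
{
    size_t n = 0;
    for (initcall_t *fn = __start_my_initcalls; fn < __stop_my_initcalls; fn++) {
        assert(*fn != NULL);
        (*fn)();
        n++;
    }
    return n;
}
```

Each module only needs the macro; nothing has to list the functions centrally, and the functions themselves can stay static.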
You really need static linking and, at the same time, to select all matching functions at runtime, right? Because the latter is a typical case for dynamic linking, I'd say.
You obviously need some mechanism to register the available functions. Dynamic linking would provide just this.
I really don't think you can do it. C isn't exactly capable of late-binding or the sort of introspection you seem to be requiring.
Although I don't really understand your question. Do you want the features of dynamically linked libraries while statically linking? Because that doesn't make sense to me... to static link, you need to already have the binary in hand, which would make dynamic loading of functions a waste of time, even if you could easily do it.

How is the static function/variable protected

I want to know how a static variable or function is protected so that it can be used only in the file it is defined in. I know that such variables and functions are declared in the data section (heap area to be precise), but is it tagged with the file name? Suppose I make a fool of the compiler by assigning such a static function (defined in foo.c) to a global function pointer, and call that function pointer in some other file (bar.c). Obviously my code won't give any compilation warning, but incidentally, it gives a segmentation fault. Obviously, it is a protection fault, but I am interested in knowing how it is implemented inside the system.
Thanks. MS
The linker takes care of restricting the scope of mapping the function name to the function.
There is no protection for static functions called by function pointer - it's not that uncommon an idiom. For example, the recommended way of implementing GObject methods is to expose a pointer to a static function (see the virtual public methods section in this GObject how-to)
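This is not GObject code, just a bare-bones sketch of the same idiom: a struct of function pointers is the public interface, and the functions it points to stay static. The names shape_ops, rect_ops and compute_area are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Public "method table": other files see only this type and rect_ops. */
struct shape_ops {
    int (*area)(int width, int height);
};

/* The implementation has internal linkage and cannot be named elsewhere. */
static int rect_area(int width, int height)
{
    return width * height;
}

/* Exported object whose member points at the static function. */
const struct shape_ops rect_ops = { rect_area };

/* Code in any other file can dispatch through the table. */
int compute_area(const struct shape_ops *ops, int width, int height)
{
    assert(ops != NULL && ops->area != NULL);
    return ops->area(width, height);
}
```

Calling compute_area(&rect_ops, 3, 4) from another file ends up in rect_area even though that name was never exported.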
It is 'protected' simply by not having its symbol/location made known to the linker. So you cannot write code in another module that explicitly references the static object by its symbol name, because the linker has no such symbol. There is no run-time protection.
If you pass the address of a static object to some other module at runtime, then you will be able to access it through such a pointer. That is not "making a fool of the compiler" (or the linker, in fact); such an action may be entirely legitimate.
The fact that you got a seg-fault is probably for an entirely different reason (an invalid pointer, for example). The compiler may choose to in-line the code, in which case a pointer to it would not be possible, but if you explicitly take the address of an object, the compiler should instantiate it, so this seems unlikely.
The purpose of static is not to 'protect' the variable/function but to protect the namespace and protect the rest of your program from having its behavior messed up by symbols with conflicting names. It also allows a good bit more optimization in that the compiler knows it doesn't have to facilitate access to the symbol name by outside modules.
You "may" get a problem if foo.c and bar.c are compiled into different dynamically loaded libraries.

How to deal with duplicated function name within C?

I have a little project in which I gave two functions the same name in two different source files, but when I build the project, the build fails with 'func_name already defined in filename.obj'.
Why can't I have two functions with the same name in two different source files? I thought a function was local to its source file, and became global only when we declared it in a header file.
And apart from renaming one of them, is there any other elegant solution to duplicated function names in the C programming language?
In C, a function has global scope by default. To restrict its scope, use the static keyword to make it private to the module.
The role of the header file is just to publicize the function along with its signature to other modules.
All global names must (with some caveats) be unique. This makes sense because that name is what is used by the linker to connect a function call to implementation of the function itself.
Names with static and local scope need only be unique within their scope.
Whether something is declared in a header file or in a source file makes absolutely no difference to the compiler. In fact, the compiler proper knows absolutely nothing about any "header files", since header files are embedded into source files by the so-called preprocessor, which does its work before the compiler proper. By the time the source files (with embedded header files) get to the actual compiler, there's no way to tell what was there originally and what was inserted from header files. The source file with all the header files embedded into it is called a translation unit. I.e. the compiler proper works with translation units, not with some "source" or "header" files.
In C language all objects and functions declared at file scope have external linkage by default, which means that they are global, unique for the entire program. So, you thought incorrectly. Functions are not local to one source file only.
If you want to make a function (or an object) local to a single translation unit, you have to take some explicit steps. You have to declare it as static. Declaring it as static will give it internal linkage, which essentially means that it becomes internal to its translation unit.
Declaring your functions static will only work if both of them really have to be local to their own translation units. If this is not the case, i.e. if at least one of the functions should be a globally accessible (linkable) function, then you have no other choice but to rename one of the functions.
Why can't I have two functions with the same name in two different source files?
Because the linker needs to know which is meant when you reference it.
Imagine that a.c and b.c both define my_function(). The compiler generates code for both. Now, imagine that c.c calls my_function() - how does the linker know which of the two versions of the function should be called?
Declare the function static to make it local to the file. In C, every identifier with external linkage must be unique across the whole program.
The elegant solution is namespaces, which were introduced in C++. In C, if there are only a few calls to func_name, the solution is to pick one of the functions, rename it, and recompile.
A hacky but quick solution might be this:
//In one of the two source files and any file that calls it
//if your function is something like this
//void func_name(int) { ... }
//Add the following line
#define func_name SOME_UNIQUE_FUNC_NAME

Resources