Say I want to write an API (for C/Linux) that offers a customized output stream, like stdout but mine should be called not_stdout. So I could demand that people using my API always begin their main program by calling a function init_the_stream() that initializes extern FILE* not_stdout.
But what I'd really like is for my stream to be initialized prior to main(), so that it works just like stdout.
I would guess that this is somewhat hard to do in a portable way, since the C standard wants prior-to-main initialized variables to be constants or string literals, and that stdout gets special compiler treatment. But I'm not sure, so I want to ask:
Is it possible to write a C library such that stuff like extern FILE* not_stdout is initialized before the first line of main() whenever the library is included?
On gcc and clang, you can use __attribute__((constructor)) (not standard C).
Example:
#include <stdio.h>
__attribute__((constructor))
static void not_stdout__init(void)
{
puts("initializing not_stdout");
}
int main()
{
puts("main");
}
It works well with dynamically linked (.so) or loaded (dlopen) ELF libraries – if a library provides such hooks, they will get invoked when the library gets linked in.
If you want to be portable, you could leave the initializer externally visible (no static) and conditionally add the constructor attribute only if it's supported. That way, your users could invoke the initializer from main manually if no mechanism exists to make their platform do it for them.
Related
I have heard that, when you have just 1 (main.c) file (or use a "unity build"), there are benefits to be had if you make all your functions static.
I am kind of confused why this (allegedly) isn't optimized by default, since it's not probable that you will include main.c into another file where you will use one of its functions.
I would like to know the benefits and dangers of doing this before implementing it.
Example:
main.c
static int my_func(void){ /*stuff*/ }
int main(void) {
my_func();
return 0;
}
You have various chunks of wisdom in the comments, assembled here into a Community Wiki answer.
Jonathan Leffler noted:
The primary benefit of static functions is that the compiler can (and will) aggressively inline them when it knows there is no other code that can call the function. I've had error messages from four levels of inlined function calls (three qualifying “inlined from” lines) on occasion. It's staggering what a compiler will do!
and:
FWIW: my rule of thumb is that every function should be static until it is known that it will be called from code in another file. When it is known that it will be used elsewhere, it should be declared in a header file that is included both where the function is defined and where it is used. (Similar rules apply to file scope variables — aka 'global variables'; they should be static until there's a proven need for them elsewhere, and then they should be declared in a header too.)
The main() function is always called from the startup code, so it is never static. Any function defined in the same file as an unconditionally compiled main() function cannot be reused by other programs. (Library code might contain a conditionally compiled test program for the library function(s) defined in the source file — most of my library code has #ifdef TEST / …test program… / #endif at the end.)
Eirc Postpischil generalized on that:
General rule: Anytime you can write code that says the use of something is limited, do it. Value will not be modified? Make it const. Name only needs to be used in a certain section? Declare it in the innermost enclosing scope. Name does not need to be linked externally? Make it static. Every limitation both shrinks the window for a bug to be created and may remove complications that interfere with optimization.
In my project we are heavily using a C header which provides an API to comunicate to an external software. Long story short, in our project's bugs show up more often on the calling of the functions defined in those headers (it is an old and ugly legacy code).
I would like to implement an indirection on the calling of those functions, so I could include some profiling before calling the actual implementation.
Because I'm not the only person working on this project, I would like to make those wrappers in a such way that if someone uses the original implementations directly it should cause a compile error.
If those headers were C++ sources, I would be able to simply make a namespace, wrap the included files in it, and implement my functions using it (the other developers would be able to use the original implementation using the :: operator, but just not being able to call it directly is enough encapsulation to me). However the headers are C sources (which I have to include with extern "C" directive to include), so namespaces won't help me AFAIK.
I tried to play around with defines, but with no luck, like this:
#define my_func api_func
#define api_func NULL
What I wanted with the above code is to make my_func to be translated to api_func during the preprocessing, while making a direct call to api_func give a compile error, but that won't work because it will actually make my_func to be translated to NULL too.
So, basically, I would like to make a wrapper, and make sure the only way to access the API is through this wrapper (unless the other developers make some workaround, but this is inevitable).
Please note that I need to wrap hundreds of functions, which show up spread in the whole code several times.
My wrapper necessarily will have to include those C headers, but I would like to make them leave scope outside the file of my wrapper, and make them to be unavailable to every other file who includes my wrapper, but I guess this is not possible in C/C++.
You have several options, none of them wonderful.
if you have the sources of the legacy software, so that you can recompile it, you can just change the names of the API functions to make room for the wrapper functions. If you additionally make the original functions static and put the wrappers in the same source files, then you can ensure that the originals are called only via the wrappers. Example:
static int api_func_real(int arg);
int api_func(int arg) {
// ... instrumentation ...
int result = api_func_real(arg);
// ... instrumentation ...
return result;
}
static int api_func_real(int arg) {
// ...
}
The preprocessor can help you with that, but I hesitate to recommend specifics without any details to work with.
if you do not have sources for the legacy software, or if otherwise you are unwilling to modify it, then you need to make all the callers call your wrappers instead of the original functions. In this case you can modify the headers or include an additional header before that uses #define to change each of the original function names. That header must not be included in the source files containing the API function implementations, nor in those providing the wrapper function implementations. Each define would be of the form:
#define api_func api_func_wrapper
You would then implement the various api_func_wrapper() functions.
Among the ways those cases differ is that if you change the legacy function names, then internal calls among those functions will go through the wrappers bearing the original names (unless you change the calls, too), but if you implement wrappers with new names then they will be used only when called explicitly, which will not happen for internal calls within the legacy code (unless, again, you modify those calls).
You can do something like
[your wrapper's include file]
int origFunc1 (int x);
int origFunc2 (int x, int y);
#ifndef WRAPPER_IMPL
#define origFunc1 wrappedFunc1
#define origFunc2 wrappedFunc2
#else
int wrappedFunc1(int x);
int wrappedFunc2(int x, int y);
#endif
[your wrapper implementation]
#define WRAPPER_IMPL
#include "wrapper.h"
int wrapperFunc1 (...) {
printf("Wrapper1 called\n");
origFunc1(...);
}
Your wrapper's C file obviously needs to #define WRAPPER_IMPL before including the header.
That is neither nice nor clean (and if someone wants to cheat, he could simply define WRAPPER_IMPL), but at least some way to go.
There are two ways to wrap or override C functions in Linux:
Using LD_PRELOAD:
There is a shell environment variable in Linux called LD_PRELOAD,
which can be set to a path of a shared library,
and that library will be loaded before any other library (including glibc).
Using ‘ld --wrap=symbol‘:
This can be used to use a wrapper function for symbol.
Any further reference to symbol will be resolved to the wrapper function.
a complete writeup can be found at:
http://samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/
I have an interface with which I want to be able to statically link modules. For example, I want to be able to call all functions (albeit in seperate files) called FOO or that match a certain prototype, ultimately make a call into a function in the file without a header in the other files. Dont say that it is impossible since I found a hack that can do it, but I want a non hacked method. (The hack is to use nm to get functions and their prototypes then I can dynamically call the function). Also, I know you can do this with dynamic linking, however, I want to statically link the files. Any ideas?
Put a table of all functions into each translation unit:
struct functions MOD1FUNCS[]={
{"FOO", foo},
{"BAR", bar},
{0, 0}
};
Then put a table into the main program listing all these tables:
struct functions* ALLFUNCS[]={
MOD1FUNCS,
MOD2FUNCS,
0
};
Then, at run time, search through the tables, and lookup the corresponding function pointer.
This is somewhat common in writing test code. e.g., you want to call all functions that start with test_. So you have a shell script that grep's through all your .C files and pulls out the function names that match test_.*. Then that script generates a test.c file that contains a function that calls all the test functions.
e.g., generated program would look like:
int main() {
initTestCode();
testA();
testB();
testC();
}
Another way to do it would be to use some linker tricks. This is what the Linux kernel does for its initialization. Functions that are init code are marked with the qualifier __init. This is defined in linux/init.h as follows:
#define __init __section(.init.text) __cold notrace
This causes the linker to put that function in the section .init.text. The kernel will reclaim memory from that section after the system boots.
For calling the functions, each module will declare an initcall function with some other macros core_initcall(func), arch_initcall(func), et cetera (also defined in linux/init.h). These macros put a pointer to the function into a linker section called .initcall.
At boot-time, the kernel will "walk" through the .initcall section calling all of the pointers there. The code that walks through looks like this:
extern initcall_t __initcall_start[], __initcall_end[], __early_initcall_end[];
static void __init do_initcalls(void)
{
initcall_t *fn;
for (fn = __early_initcall_end; fn < __initcall_end; fn++)
do_one_initcall(*fn);
/* Make sure there is no pending stuff from the initcall sequence */
flush_scheduled_work();
}
The symbols __initcall_start, __initcall_end, etc. get defined in the linker script.
In general, the Linux kernel does some of the cleverest tricks with the GCC pre-processor, compiler and linker that are possible. It's always been a great reference for C tricks.
You really need static linking and, at the same time, to select all matching functions at runtime, right? Because the latter is a typical case for dynamic linking, i'd say.
You obviusly need some mechanism to register the available functions. Dynamic linking would provide just this.
I really don't think you can do it. C isn't exactly capable of late-binding or the sort of introspection you seem to be requiring.
Although I don't really understand your question. Do you want the features of dynamically linked libraries while statically linking? Because that doesn't make sense to me... to static link, you need to already have the binary in hand, which would make dynamic loading of functions a waste of time, even if you could easily do it.
This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
main() in C, C++, Java, C#
I'm new to programming in general, and C in particular. Every example I've looked at has a "main" function - is this pre-defined in some way, such that the name takes on a special meaning to the compiler or runtime... or is it merely a common idiom among C programmers (like using "foo" and "bar" for arbitrary variable names).
No, you need to define main in your program. Since it's called from the run-time, however, the interface your main must provide is pre-defined (must return an int, must take either zero arguments or two, the first an int, and the second a char ** or, equivalently, char *[]). The C and C++ standards do specify that a function with external linkage named main acts as the entry point for a program1.
At least as the term is normally used, a predefined function would be one such as sin or printf that's in the standard library so you can use it without having to write it yourself.
1If you want to get technical, that's only true for a "hosted" implementation -- i.e., the kind most of us use most of the time that produces programs that run on an operating system. A "free-standing" implementation (one produces program that run directly on the "bare metal" with no operating system under it) is free to define the entry point(s) as it sees fit. A free-standing implementation can also leave out most of the normal run-time library, providing only a handful of headers (e.g., <stddef.h>) and virtually no standard library functions.
Yes, main is a predefined function in the general sense of the the word "defined". In other words, the C language standard specifies that the function called at program startup shall be named main. It is not merely a convention used by programmers as we have with foo or bar.
The fine print: from the perspective of the technical meaning of the word "defined" in the context of C programming, no the main function is not "predefined" -- the compiler or C library do not supply a predefined function named main. You need to define your own implementation of the main function (and, obviously, you should name it main).
There is typically a piece of code that normal C programs are linked to which does:
extern int main(int argc, char * argv[], char * envp[]);
FILE * stdin;
FILE * stdout;
FILE * stderr;
/* ** setup argv ** */
...
/* ** setup envp ** */
...
/* ** setup stdio ** */
stdin = fdopen(0, "r");
stdout = fdopen(1, "w");
stderr = fdopen(2, "w");
int rc;
rc = main(argc, argv, envp); // envp may not be present here with some systems
exit(rc);
Note that this code is C, not C++, so it expects main to be a C function.
Also note that my code does no error checking and leaves out a lot of other system dependent stuff that probably happens. It also ignores some things that happen with C++, objective C, and various other languages that may be linked to it (notably constructor and destructor calling, and possibly having main be within a C++ try/catch block).
Anyway, this code knows that main is a function and takes arguments. If your main looks like:
int main(void) {
Then it still gets passed arguments, but they are ignored.
This code specially linked so that it is called when the program starts up.
You are completely free to write your own version of this code on many architectures, but it relies on intimate knowledge of how the operating system starts a new program as well as the C (and C++ and possibly Objective C) run time. It is likely to require some assembly programming and or use of compiler extensions to properly build.
The C compiler driver (the command you usually call when you call the compiler) passes the object file containing all of this (often called crt0.0, for C Run Time ...) along with the rest of your program to the linker, unless told not to.
When building an operating system kernel or an embedded program you often do not want to use the standard crt*.o file. You also may not want to use it if you are building a normal application in another programming language, or have some other non-standard requirements.
No, or you couldn't define one.
It's not predefined, but its meaning as an entry point, if it is present, is defined.
Can the main() function be declared static in a C program? If so then what is the use of it?
Is it possible if I use assembly code and call the static main() function myself (consider embedded programs)?
No. The C spec actually says somewhere in it (I read the spec, believe it or not) that the main function cannot be static.
The reason for this is that static means "don't let anything outside this source file use this object". The benefit is that it protects against name collisions in C when you go to link (it would be bad bad bad if you had two globals both named "is_initialized" in different files... they'd get silently merged, unless you made them static). It also allows the compiler to perform certain optimizations that it wouldn't be able to otherwise. These two reasons are why static is a nice thing to have.
Since you can't access static functions from outside the file, how would the OS be able to access the main function to start your program? That's why main can't be static.
Some compilers treat "main" specially and might silently ignore you when you declare it static.
Edit: Looks like I was wrong about that the spec says main can't be static, but it does say it can't be inline in a hosted environment (if you have to ask what "hosted environment" means, then you're in one). But on OS X and Linux, if you declare main static, then you'll get a link error because the linker can't find the definition of "main".
You could have a static function called main() in a source file, and it would probably compile, but it would not be the main() function because it would be invisible to the linker when the start-up code (crt0.o on many (older) Unix systems) calls main().
Given the code:
static int main(int argc, char **argv)
{
return(argv + argc);
}
extern int x(int argc, char **argv)
{
return(main(argc, argv));
}
GCC with -Wall helpfully says:
warning: 'main' is normally a non-static function
Yes, it can be done. No, it is normally a mistake - and it is not the main() function.
No you cannot do it. If you will do it you will be unable to compile your program. Because static function is only visible within the same file, so the linker will no be able to find it and make a call of it.
As others have said, no it can't. And that goes double if you ever intend to port your code to C++, as the C++ Standard specifies that main() need not actually be a function.
C has two meanings for 'static'...
static for a local variable means it can be used globally.
static for a global variable means is can only be used in the current file.
static for functions has the exact same impact as denoting a global variable as static ... the static function IS ONLY VISIBLE IN THE CURRENT FILE ...
Thus main can NEVER be static, because it would not be able to serve as the primary entry point for the program.