Avoiding the main (entry point) in a C program

Avoiding the main (entry point) in a C program - c

Is it possible to avoid the entry point (main) in a C program. In the below code, is it possible to invoke the func() call without calling via main() in the below program ? If Yes, how to do it and when would it be required and why is such a provision given ?
int func(void)
{
printf("This is func \n");
return 0;
}
int main(void)
{
printf("This is main \n");
return 0;
}

If you're using gcc, I found a thread that said you can use the -e command-line parameter to specify a different entry point; so you could use func as your entry point, which would leave main unused.
Note that this doesn't actually let you call another routine instead of main. Instead, it lets you call another routine instead of _start, which is the libc startup routine -- it does some setup and then it calls main. So if you do this, you'll lose some of the initialization code that's built into your runtime library, which might include things like parsing command-line arguments. Read up on this parameter before using it.
If you're using another compiler, there may or may not be a parameter for this.

When building embedded systems firmware to run directly from ROM, I often will avoid naming the entry point main() to emphasize to a code reviewer the special nature of the code. In these cases, I am supplying a customized version of the C runtime startup module, so it is easy to replace its call to main() with another name such as BootLoader().
I (or my vendor) almost always have to customize the C runtime startup in these systems because it isn't unusual for the RAM to require initialization code for it to begin operating correctly. For instance, typical DRAM chips require a surprising amount of configuration of their controlling hardware, and often require a substantial (thousands of bus clock cycles) delay before they are useful. Until that is complete, there may not even be a place to put the call stack so the startup code may not be able to call any functions. Even if the RAM devices are operational at power on, there is almost always some amount of chip select hardware or an FPGA or two that requires initialization before it is safe to let the C runtime start its initialization.
When a program written in C loads and starts, some component is responsible for making the environment in which main() is called exist. In Unix, linux, Windows, and other interactive environments, much of that effort is a natural consequence of the OS component that loads the program. However, even in these environments there is some amount of initialization work to do before main() can be called. If the code is really C++, then there can be a substantial amount of work that includes calling the constructors for all global object instances.
The details of all of this are handled by the linker and its configuration and control files. The linker ld(1) has a very elaborate control file that tells it exactly what segments to include in the output, at what addresses, and in what order. Finding the linker control file you are implicitly using for your toolchain and reading it can be instructive, as can the reference manual for the linker itself and the ABI standard your executables must follow in order to run.
Edit: To more directly answer the question as asked in a more common context: "Can you call foo instead of main?" The answer is "Maybe, but but only by being tricky".
On Windows, an executable and a DLL are very nearly the same format of file. It is possible to write a program that loads an arbitrary DLL named at runtime, and locates an arbitrary function within it, and calls it. One such program actually ships as part of a standard Windows distribution: rundll32.exe.
Since a .EXE file can be loaded and inspected by the same APIs that handle .DLL files, in principle if the .EXE has an EXPORTS section that names the function foo, then a similar utility could be written to load and invoke it. You don't need to do anything special with main, of course, since that will be the natural entry point. Of course, the C runtime that was initialized in your utility might not be the same C runtime that was linked with your executable. (Google for "DLL Hell" for hint.) In that case, your utility might need to be smarter. For instance, it could act as a debugger, load the EXE with a break point at main, run to that break point, then change the PC to point at or into foo and continue from there.
Some kind of similar trickery might be possible on Linux since .so files are also similar in some respects to true executables. Certainly, the approach of acting like a debugger could be made to work.

A rule of thumb would be that the loader supplied by the system would always run main. With sufficient authority and competence you could theoretically write a different loader that did something else.

Rename main to be func and func to be main and call func from name.
If you have access to the source, you can do this and it's easy.

If you are using an open source compiler such as GCC or a compiler targeted at embedded systems you can modify the C runtime startup (CRT) to start at any entry point you need. In GCC this code is in crt0.s. Generally this code is partially or wholly in assembler, for most embedded systems compilers example or default start-up code will be provided.
However a simpler approach is to simply 'hide' main() in a static library that you link to your code. If that implementation of main() looks like:
int main(void)
{
func() ;
}
Then it will look to all intents and purposes as if the user entry point is func(). This is how many application frameworks with entry points other than main() work. Note that because it is in a static library, any user definition of main() will override that static library version.

The solution depends on the compiler and linker which you use. Always is that not main is the real entry point of the application. The real entry point makes some initializations and call for example main. If you write programs for Windows using Visual Studio, you can use /ENTRY switch of the linker to overwrite the default entry point mainCRTStartup and call func() instead of main():
#ifdef NDEBUG
void mainCRTStartup()
{
ExitProcess (func());
}
#endif
If is a standard practice if you write the most small application. In the case you will receive restrictions in the usage of C-Runtime functions. You should use Windows API function instead of C-Runtime function. For example instead of printf("This is func \n") you should use OutputString(TEXT("This is func \n")) where OutputString are implemented only with respect of WriteFile or WriteConsole:
static HANDLE g_hStdOutput = INVALID_HANDLE_VALUE;
static BOOL g_bConsoleOutput = TRUE;
BOOL InitializeStdOutput()
{
g_hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
if (g_hStdOutput == INVALID_HANDLE_VALUE)
return FALSE;
g_bConsoleOutput = (GetFileType (g_hStdOutput) & ~FILE_TYPE_REMOTE) != FILE_TYPE_DISK;
#ifdef UNICODE
if (!g_bConsoleOutput && GetFileSize (g_hStdOutput, NULL) == 0) {
DWORD n;
WriteFile (g_hStdOutput, "\xFF\xFE", 2, &n, NULL);
}
#endif
return TRUE;
}
void Output (LPCTSTR pszString, UINT uStringLength)
{
DWORD n;
if (g_bConsoleOutput) {
#ifdef UNICODE
WriteConsole (g_hStdOutput, pszString, uStringLength, &n, NULL);
#else
CHAR szOemString[MAX_PATH];
CharToOem (pszString, szOemString);
WriteConsole (g_hStdOutput, szOemString, uStringLength, &n, NULL);
#endif
}
else
#ifdef UNICODE
WriteFile (g_hStdOutput, pszString, uStringLength * sizeof (TCHAR), &n, NULL);
#else
{
//PSTR pszOemString = _alloca ((uStringLength + sizeof(DWORD)));
CHAR szOemString[MAX_PATH];
CharToOem (pszString, szOemString);
WriteFile (g_hStdOutput, szOemString, uStringLength, &n, NULL);
}
#endif
}
void OutputString (LPCTSTR pszString)
{
Output (pszString, lstrlen (pszString));
}

This really depends how you are invoking the binary, and is going to be reasonably platform and environment specific. The most obvious answer is to simply rename the "main" symbol to something else and call "func" "main", but I suspect that's not what you are trying to do.

Related

Initializing custom output stream before main

Say I want to write an API (for C/Linux) that offers a customized output stream, like stdout but mine should be called not_stdout. So I could demand that people using my API always begin their main program by calling a function init_the_stream() that initializes extern FILE* not_stdout.
But what I'd really like is for my stream to be initialized prior to main(), so that it works just like stdout.
I would guess that this is somewhat hard to do in a portable way, since the C standard wants prior-to-main initialized variables to be constants or string literals, and that stdout gets special compiler treatment. But I'm not sure, so I want to ask:
Is it possible to write a C library such that stuff like extern FILE* not_stdout is initialized before the first line of main() whenever the library is included?

On gcc and clang, you can use __attribute__((constructor)) (not standard C).
Example:
#include <stdio.h>
__attribute__((constructor))
static void not_stdout__init(void)
{
puts("initializing not_stdout");
}
int main()
{
puts("main");
}
It works well with dynamically linked (.so) or loaded (dlopen) ELF libraries – if a library provides such hooks, they will get invoked when the library gets linked in.
If you want to be portable, you could leave the initializer externally visible (no static) and conditionally add the constructor attribute only if it's supported. That way, your users could invoke the initializer from main manually if no mechanism exists to make their platform do it for them.

Need C compiler options to create to easy-to-reverse executable to teach reversing

I've written a simple C program with the goal to have students use OllyDbg reverse it and patch it to change how it executes. They only have to find a 3 and change it to a 5. The compiled executable is filled with difficult-to-follow startup calls that will thwart efforts for newbies to reverse it in Olly. I'd like to tell the compiler to remove all/most of that.
I'm using Visual Studio 2015 Pro, but I'm not a VS whiz, I don't know where to start.

As long as your main function is not accessing the standard C library and you only use Windows API calls, you can remove the c-runtime startup code with the following two linker switches:
/entry:<your_main_function> /nodefaultlib
In a full example using a simple console application (which I've assumed you are using) and a main function of "main", you could compile and link your code without the c-runtime library as follows:
cl /c main.c
link /entry:main /nodefaultlib /libpath:<path_to_libs> /subsystem:console main.obj kernel32.lib user32.lib
Visual Studio has the equivalent options buried in your project dialogs if you don't want to manually run the command line tools.
This should produce a small, yet simple EXE. When your debugging session begins, you should start right inside your main function.
NOTE: The lack of standard IO may make outputting a number (which I assume you are doing) more difficult as you might need to write some routines yourself, such as getting the string length. However the only API calls you'll need to output to the console are the kernel32 GetStdHandle() and WriteConsole() functions. Keep in mind there is a wsprintf() API that is accessible in user32.lib which I'd also suggest using to keep your sample bare-bones but still allow you to build c-style formatted strings. Use wsprintf() to convert your number into an output prompt/string and use WriteConsole to send it to the console.
UPDATE: Below is a sample C program that outputs the results of adding two numbers without needing the standard-C library and thus the CRT startup code. This is a good reverse engineering sample to change one of the numbers to get a different result on the command line.
DISCLAIMER: I agree that wsprintf() is indeed an unsafe function due to its lack of checking buffer size and when mis-used can contribute to a buffer overflow. It is used here ONLY because it is built in to the Win32 API (user32.lib) and is a quick way to convert a number to a string just for this example.
As far as I know, the only C-style formatting functions built in to windows are the ANSI/WIDE versions of this function. The strsafe.lib functions seem to have a dependency on the standard library otherwise I would have used them. If you know any different, please let me know! For that reason, I recommend writing your own number to string conversion function or using someone else's if you do not use the standard-C library functions.
#include <Windows.h>
//quick-and-dirty string-length function
DWORD getStringLen(const char* pszStr)
{
const char* p = pszStr;
while(*p) ++p;
return(p - pszStr);
}
//program entry point - note that int return value does propagate back to
// ExitProcess() despite main() being the program's actual entry point;
// the ret at the end of this function returns back to the loader, which
// in turn calls ExitThread() with EAX; the return value can be
// checked on the comamnd-line via: echo %ERRORLEVEL%
int main(void)
{
DWORD dwNum1 = 5;
DWORD dwNum2 = 3;
DWORD dwResult = dwNum1+dwNum2;
//build formatted output string
//
// NOTE: wsprintfA() is an unsafe function and is only used as an example
// DO NOT USE in production code!
//
#pragma warning(disable : 4995) //wsprintf is a depreciated function
char szTemp[100];
int iRet = wsprintfA(szTemp,"%u + %u = %u\n",dwNum1,dwNum2,dwResult);
//get console stdandard output handle
HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE);
if (INVALID_HANDLE_VALUE == hConsole)
{
iRet = -1;
}
else
{
//output the string
WriteConsoleA(hConsole,szTemp,getStringLen(szTemp),&dwNum1,NULL);
}
return(iRet);
}

Pointer to end of a function code

I understand that a function pointer points to the starting address of the code for a function. But is there any way to be able to point to the end of the code of a function as well?
Edit: Specifically on an embedded system with a single processor and no virtual memory. No optimisation too. A gcc compiler for our custom processor.
I wish to know the complete address range of my function.

If you put the function within its own special linker section, then your toolchain might provide a pointer to the end (and the beginning) of the linker section. For example, with Green Hills Software (GHS) MULTI compiler I believe you can do something like this:
#pragma ghs section text=".mysection"
void MyFunction(void) { }
#pragma ghs section
That will tell the linker to locate the code for MyFunction in .mysection. Then in your code you can declare the following pointers, which point to the beginning and end of the section. The GHS linker provides the definitions automatically.
extern char __ghsbegin_mysection[];
extern char __ghsend_mysection[];
I don't know whether GCC supports similar functionality.

You didn't say why you need this information, but on some embedded system it's required to copy a single function from flash to ram in order to (re)program the flash.
Normally you are placing this functions into a new unique section and depending of your linker you can copy this section with pure C or with assembler to the new (RAM) location.
You also need to tell the linker that the code will run from another address than that it is placed in flash.
In a project the flash.c could look like
#pragma define_section pram_code "pram_code.text" RWX
#pragma section pram_code begin
uint16_t flash_command(uint16_t cmd, uint16_t *addr, uint16_t *data, uint16_t cnt)
{
...
}
#pragma section pram_code end
The linker command file looks like
.prog_in_p_flash_ROM : AT(Fpflash_mirror) {
Fpram_start = .;
# OBJECT(Fflash_command,flash.c)
* (pram_code.text)
. = ALIGN(2);
# save data end and calculate data block size
Fpram_end = .;
Fpram_size = Fpram_end - Fpram_start;
} > .pRAM
But as others said, this is very toolchain specific

There is no way with C to point to the end of a function. A C compiler has a lot of latitude as to how it arranges the machine code it emits during compilation. With various optimization settings, a C compiler may actually merge machine code intermingling the machine code of the various functions.
Since along with what ever the C compiler does there is also what is done by the linker as well as the loader as a part of linking the various compiled pieces of object code together and then loading the application which may also be using various kinds of shared libraries.
In the complex running environment of modern operating systems and modern development tool chains, unless the language provides a specific mechanism for doing something, it is prudent to not try to get fancy leaving yourself open to an application which suddenly stops working due to changes in the operating environment.
In most cases if you use a non-optimizing setting of the compiler with static linked libraries, the symbol map that most linkers provide will give you a good idea as to where functions begin and end. However the only thing you can really depend on is knowing the address of the function entry points.

In some implementations (including gcc) you could do something like this (but its not guaranteed and lots of implementation details could affect it):
int foo() {
printf("testing\n");
return 7;
}
void foo_end() { }
int sizeof_foo() {
// assumes no code size optimizations across functions
// function could be smaller than reported
// reports size, not range
return (int (*)())foo_end - foo;
}

How to include C backtrace in a kernel module code?

So I am trying to find out what kernel processes are calling some functions in a block driver. I thought including backtrace() in the C library would make it easy. But I am having trouble to load the backtrace.
I copied this example function to show the backtrace:
http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/063/6391/6391l1.html
All attempts to compile have error in one place or another that a file cannot be found or that the functions are not defined.
Here is what comes closest.
In the Makefile I put the compiler directives:
-rdynamic -I/usr/include
If I leave out the second one, -I/usr/include, then the compiler reports it cannot find the required header execinfo.h.
Next, in the code where I want to do the backtrace I have copied the function from the example:
//trying to include the c backtrace capability
#include <execinfo.h>
void show_stackframe() {
void *trace[16];
char **messages = (char **)NULL;
int i, trace_size = 0;
trace_size = backtrace(trace, 16);
messages = backtrace_symbols(trace, trace_size);
printk(KERN_ERR "[bt] Execution path:\n");
for (i=0; i<trace_size; ++i)
printk(KERN_ERR "[bt] %s\n", messages[i]);
}
//backtrace function
I have put the call to this function later on, in a block driver function where the first sign of the error happens. Simply:
show_stackframe();
So when I compile it, the following errors:
user#slinux:~/2.6-32$ make -s
Invoking make againt the kernel at /lib/modules/2.6.32-5-686/build
In file included from /usr/include/features.h:346,
from /usr/include/execinfo.h:22,
from /home/linux/2.6-32/block/block26.c:49:
/usr/include/sys/cdefs.h:287:1: warning: "__always_inline" redefined
In file included from /usr/src/linux-headers-2.6.32-5-common/include/linux/compiler-gcc.h:86,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/compiler.h:40,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/stddef.h:4,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/list.h:4,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/module.h:9,
from /home/linux/2.6-32/inc/linux_ver.h:40,
from /home/linux/2.6-32/block/block26.c:32:
/usr/src/linux-headers-2.6.32-5-common/include/linux/compiler-gcc4.h:15:1: warning: this is the location of the previous definition
/home/linux/2.6-32/block/block26.c:50: warning: function declaration isn’t a prototype
WARNING: "backtrace" [/home/linux/2.6-32/ndas_block.ko] undefined!
WARNING: "backtrace_symbols" [/home/linux/2.6-32/ndas_block.ko] undefined!
Note: block26.c is the file I am hoping to get the backtrace from.
Is there an obvious reason why the backtrace and backtrace_symbols remain undefined when it is compiled into the .ko modules?
I am guessing it because I use the compiler include execinfo.h which is residing on the computer and not being loaded to the module.
It is my uneducated guess to say the least.
Can anyone offer a help to get the backtrace functions loading up in the module?
Thanks for looking at this inquiry.
I am working on debian. When I take out the function and such, the module compiles fine and almost works perfectly.
From ndasusers

To print the stack contents and a backtrace to the kernel log, use the dump_stack() function in your kernel module. It's declared in linux/kernel.h in the include folder in the kernel source directory.

If you need to save the stack trace and process its elements somehow, save_stack_trace() or dump_trace() might be also an option. These functions are declared in <linux/stacktrace.h> and <asm/stacktrace.h>, respectively.
It is not as easy to use these as dump_stack() but if you need more flexibility, they may be helpful.
Here is how save_stack_trace() can be used (replace HOW_MANY_ENTRIES_TO_STORE with the value that suits your needs, 16-32 is usually more than enough):
unsigned long stack_entries[HOW_MANY_ENTRIES_TO_STORE];
struct stack_trace trace = {
.nr_entries = 0,
.entries = &stack_entries[0],
.max_entries = HOW_MANY_ENTRIES_TO_STORE,
/* How many "lower entries" to skip. */
.skip = 0
};
save_stack_trace(&trace);
Now stack_entries array contains the appropriate call addresses. The number of elements filled is nr_entries.
One more thing to point out. If it is desirable not to output the stack entries that belong to the implementation of save_stack_trace(), dump_trace() or dump_stack() themselves (on different systems, the number of such entries may vary), the following trick can be applied if you use save_stack_trace(). You can use __builtin_return_address(0) as an "anchor" entry and process only the entries "not lower" than that.

I know this question is about Linux, but since it's the first result for "backtrace kernel", here's a few more solutions:
DragonFly BSD
It's print_backtrace(int count) from /sys/sys/systm.h. It's implemented in
/sys/kern/kern_debug.c and/or /sys/platform/pc64/x86_64/db_trace.c. It can be found by searching for panic, which is implemented in /sys/kern/kern_shutdown.c, and calls print_backtrace(6) if DDB is defined and trace_on_panic is set, which are both defaults.
FreeBSD
It's kdb_backtrace(void) from /sys/sys/kdb.h. Likewise, it's easy to find by looking into what the panic implementation calls when trace_on_panic is true.
OpenBSD
Going the panic route, it appears to be db_stack_dump(), implemented in /sys/ddb/db_output.c. The only header mention is /sys/ddb/db_output.h.

dump_stack() is function can be used to print your stack and thus can be used to backtrack . while using it be carefull that don't put it in repetitive path like loops or packet receive function it can fill your dmesg buffer can cause crash in embedded device (having less memory and cpu).
This function is declared in linux/kernel.h .

Is main() a pre-defined function in C? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
main() in C, C++, Java, C#
I'm new to programming in general, and C in particular. Every example I've looked at has a "main" function - is this pre-defined in some way, such that the name takes on a special meaning to the compiler or runtime... or is it merely a common idiom among C programmers (like using "foo" and "bar" for arbitrary variable names).

No, you need to define main in your program. Since it's called from the run-time, however, the interface your main must provide is pre-defined (must return an int, must take either zero arguments or two, the first an int, and the second a char ** or, equivalently, char *[]). The C and C++ standards do specify that a function with external linkage named main acts as the entry point for a program1.
At least as the term is normally used, a predefined function would be one such as sin or printf that's in the standard library so you can use it without having to write it yourself.
1If you want to get technical, that's only true for a "hosted" implementation -- i.e., the kind most of us use most of the time that produces programs that run on an operating system. A "free-standing" implementation (one produces program that run directly on the "bare metal" with no operating system under it) is free to define the entry point(s) as it sees fit. A free-standing implementation can also leave out most of the normal run-time library, providing only a handful of headers (e.g., <stddef.h>) and virtually no standard library functions.

Yes, main is a predefined function in the general sense of the the word "defined". In other words, the C language standard specifies that the function called at program startup shall be named main. It is not merely a convention used by programmers as we have with foo or bar.
The fine print: from the perspective of the technical meaning of the word "defined" in the context of C programming, no the main function is not "predefined" -- the compiler or C library do not supply a predefined function named main. You need to define your own implementation of the main function (and, obviously, you should name it main).

There is typically a piece of code that normal C programs are linked to which does:
extern int main(int argc, char * argv[], char * envp[]);
FILE * stdin;
FILE * stdout;
FILE * stderr;
/* ** setup argv ** */
...
/* ** setup envp ** */
...
/* ** setup stdio ** */
stdin = fdopen(0, "r");
stdout = fdopen(1, "w");
stderr = fdopen(2, "w");
int rc;
rc = main(argc, argv, envp); // envp may not be present here with some systems
exit(rc);
Note that this code is C, not C++, so it expects main to be a C function.
Also note that my code does no error checking and leaves out a lot of other system dependent stuff that probably happens. It also ignores some things that happen with C++, objective C, and various other languages that may be linked to it (notably constructor and destructor calling, and possibly having main be within a C++ try/catch block).
Anyway, this code knows that main is a function and takes arguments. If your main looks like:
int main(void) {
Then it still gets passed arguments, but they are ignored.
This code specially linked so that it is called when the program starts up.
You are completely free to write your own version of this code on many architectures, but it relies on intimate knowledge of how the operating system starts a new program as well as the C (and C++ and possibly Objective C) run time. It is likely to require some assembly programming and or use of compiler extensions to properly build.
The C compiler driver (the command you usually call when you call the compiler) passes the object file containing all of this (often called crt0.0, for C Run Time ...) along with the rest of your program to the linker, unless told not to.
When building an operating system kernel or an embedded program you often do not want to use the standard crt*.o file. You also may not want to use it if you are building a normal application in another programming language, or have some other non-standard requirements.

No, or you couldn't define one.
It's not predefined, but its meaning as an entry point, if it is present, is defined.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight