How to implement an analogue of the exit() function? -std=c99 - c

I'm writing a university project in standard C99. One of the requirements is that exit() must not be used. Is it possible to implement a similar function myself?
I tried writing a function that calls main with a negative argc to signal an exit. It was a stupid attempt, because the first main continues to run.
The project description only says that marks will be deducted for terminating the program with exit(). I understand that it wants me to pass pointers around and report errors through function return values. I'm more interested in the practical side, just for myself.

I think you misunderstood the requirement: They probably said something like do not use exit(). This does not mean you are supposed to implement your own exit(), quite to the contrary: they probably mean that the only exit-point of your program shall be the end of your main-function (or a return-statement within the main function) which is considered good programming style.
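A minimal sketch of that style, assuming two hypothetical helpers, parse_positive() and run() (neither comes from the assignment): every function reports failure through its return value, and the status travels up the call chain until main returns it.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parses a positive int; returns 0 on success, -1 on error. */
static int parse_positive(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0' || value <= 0 || value > 1000000)
        return -1;
    *out = (int)value;
    return 0;
}

/* Hypothetical program body: returns the status that main should return. */
int run(const char *arg)
{
    int n;

    if (parse_positive(arg, &n) != 0) {
        fprintf(stderr, "invalid argument: %s\n", arg);
        return EXIT_FAILURE;   /* no exit(): just hand the error upward */
    }
    printf("got %d\n", n);
    return EXIT_SUCCESS;
}
```

main would then check argc and end with something like return run(argv[1]); so that the only exit point of the program is main's own return statement.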

exit() is a system-level facility that you can't implement on your own without knowing how the operating system (Linux? Windows? an embedded system?) implements it. As Daniel Fischer mentioned, you could call abort(), which also terminates the program, although it does so abnormally: atexit() handlers are not run and open streams are not necessarily flushed.
There are other "hacks" to get your program to terminate without calling exit() explicitly, but these are just hacks and should not be used in production code.
Create a C++ function with C linkage and throw an exception
extern "C" void MyExit() { throw std::exception(); }
Send the process a fatal signal, e.g. raise(SIGKILL) on POSIX systems (signal() only installs a handler; raise() or kill() delivers the signal)
Call abort()
Write some assembly code to unwind the call stack until it gets to the function that called main, insert the return value into the proper return register, and go from there. I don't think you can do this in pure C, as the ABI is not accessible directly. But at least this would be the only method that doesn't involve the operating system (just the ABI).
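Standard C99 also has setjmp()/longjmp(), which can jump back to a point set up near main without any assembly. The following is only a sketch under that assumption: my_exit, deep_work and run_program are made-up names, and unlike the real exit() nothing here flushes streams or runs atexit() handlers by itself (that still happens once main returns).

```c
#include <setjmp.h>

static jmp_buf exit_point;   /* hypothetical: where my_exit() jumps back to */
static int exit_code;        /* status handed over by my_exit() */

/* Hypothetical exit() stand-in: unwinds to the setjmp() in run_program(). */
static void my_exit(int code)
{
    exit_code = code;
    longjmp(exit_point, 1);  /* never returns to the caller */
}

/* Some deeply nested work that decides to stop the program. */
static void deep_work(int depth)
{
    if (depth == 0)
        my_exit(3);
    deep_work(depth - 1);
}

/* Hypothetical program body; main would do: return run_program(); */
int run_program(void)
{
    if (setjmp(exit_point) != 0)
        return exit_code;    /* reached via longjmp() from my_exit() */

    deep_work(5);
    return 0;                /* reached only if nothing called my_exit() */
}
```

Because the jump skips every intermediate function, any memory or file those functions owned is not released, which is one reason the assignment's return-value style is usually preferred.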

Related

Writing my own longjmperror() in C

I was looking at the manual for longjmp and in the Errors part it says this:
ERRORS
If the contents of the env are corrupted, or correspond to an environment that has already returned, the longjmp() routine calls the routine longjmperror(3). If longjmperror() returns, the program is aborted (see abort(3)). The default version of longjmperror() prints the message ``longjmp botch'' to standard error and returns. User programs wishing to exit more gracefully should write their own versions of longjmperror().
How would I write my own version of longjmperror()? From what I know, in C you can't override functions, and I really need the long jump to exit in a specific way when it doesn't find the point to jump to.
On Mac OS X (10.9.2, Mavericks) at any rate, the prototype for longjmperror() is:
void longjmperror(void);
You write a function with that signature. It must not return (or, rather, if it does, the program will be abort()ed). What you do in that function is your business, but bear in mind that things have gone moderately catastrophically wrong for the function to be called at all. It might log an error to your log file, or just write a more meaningful message before exiting (instead of aborting and perhaps core dumping).
You link the object file containing the function ahead of the system library. You are normally not expected to replace system functions, but this is one you are intended to override.
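A sketch of such a replacement, assuming the void longjmperror(void) prototype quoted above; report_botch is a hypothetical helper and the message text is made up. On systems whose C library never calls longjmperror() (glibc, for instance) this compiles as an ordinary, unused function.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes the diagnostic; returns 0 on success, -1 on error. */
int report_botch(FILE *log)
{
    if (fputs("fatal: longjmp() called with an invalid jmp_buf\n", log) == EOF)
        return -1;
    return fflush(log) == EOF ? -1 : 0;
}

/* Replacement for the library's default longjmperror(); it must not return. */
void longjmperror(void)
{
    report_botch(stderr);
    exit(EXIT_FAILURE);   /* exit with a status instead of abort() and a core dump */
}
```

If the atexit() handlers cannot be trusted at that point, _Exit(EXIT_FAILURE) is a more conservative choice than exit().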

What is the need for C startup routine?

Quoting from one of the unix programming books,
When a C program is executed by the kernel, by one of the exec functions, a special start-up routine is called before the main function is called. The executable program file specifies this routine as the starting address for the program; this is set up by the link editor when it is invoked by the C compiler. This start-up routine takes values from the kernel (the command-line arguments and the environment) and sets things up so that the main function is called as shown earlier.
Why do we need a middleman start-up routine? The exec function could have called the main function straight away, and the kernel could have passed the command-line arguments and environment directly to main. Why do we need the start-up routine in between?
Because C has no concept of "plug in". So if you want to use, say, malloc() someone has to initialize the necessary data structures. The C programmers were lazy and didn't want to have to write code like this all the time:
main() {
initialize_malloc();
initialize_stdio();
initialize_...();
initialize_...();
initialize_...();
initialize_...();
initialize_...();
... oh wow, can we start already? ...
}
So the C compiler figures out what needs to be done, generates the necessary code and sets up everything so you can start with your code right away.
The start-up routine initializes the CRT (i.e. creates the CRT heap so that malloc/free work, initializes standard I/O streams, etc.); in case of C++ it also calls the globals' constructors. There may be other system-specific setup, you should check the sources of your run-time library for more details.
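As a toy model only (real start-up code is platform-specific and usually partly written in assembly), the order of operations can be sketched like this. Every name here (init_runtime, program_main, model_start, the two flags) is hypothetical.

```c
#include <stddef.h>

static int heap_ready;    /* stands in for "malloc/free are usable" */
static int stdio_ready;   /* stands in for "stdin/stdout/stderr are set up" */

/* Stand-in for the run-time library's initialization. */
static void init_runtime(void)
{
    heap_ready = 1;
    stdio_ready = 1;
}

/* Stand-in for the user's main(): it relies on the runtime being ready. */
static int program_main(int argc, char **argv)
{
    (void)argv;
    return (heap_ready && stdio_ready) ? argc : -1;
}

/* Toy start-up routine: initialize first, then call "main". */
int model_start(int argc, char **argv)
{
    init_runtime();
    /* A real start-up routine passes this value to exit() instead of returning. */
    return program_main(argc, argv);
}
```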
Calling main() is a C thing, while calling _start() is a kernel thing, indicated by the entry point in the binary format header. (for clarity: the kernel doesn't want or need to know that we call it _start)
If you would have a non-C binary, you might not have a main() function, you might not even have the concept of a "function" at all.
So the actual question would be: why doesn't a compiler give the address of main() as a starting point? That's because typical libc implementations want to do some initializations before really starting the program, see the other answers for that.
Edit: as an example, you can change the entry point like this:
$ cat entrypoint.c
#include <stdio.h>
#include <stdlib.h>
int blabla() { printf("Yes it works!\n"); exit(0); }
int main() { printf("not called\n"); }
$ gcc entrypoint.c -e blabla
$ ./a.out
Yes it works!
It is also important to know that an application program runs in user mode; a system call switches the processor into privileged (kernel) mode for the duration of the call. This improves OS security by preventing user code from executing privileged operations directly, and avoids a myriad of other complications. So when printf eventually needs to write its output, the underlying system call (write() on Unix-like systems) traps into the kernel, runs in kernel mode, then switches back to user mode and returns to your application.
The CRT is what allows you to use the languages you want on Windows and Linux. It provides some very fundamental bootstrapping into the OS to give you the feature sets you need for development.

Why does avr-gcc bother to save the register state when calling main()?

The main() function in an avr-gcc program saves the register state on the stack, but when the runtime calls it I understand on a microcontroller there isn't anything to return to. Is this a waste of RAM? How can this state saving be prevented?
How can the compiler be sure that you aren't going to recursively call main()?
It's all about the C-standard.
Nothing forbids you from exiting main at some time. You may not do it in your program, but others may do it.
Furthermore you can register cleanup-handlers via the atexit runtime function. These functions need a defined register state to execute properly, and the only way to guarantee this is to save and restore the registers around main.
It could even be useful to do this:
I don't know about the AVR, but other microcontrollers can go into a low-power state when they're done with their job and are waiting for a reset. Doing this from a cleanup handler may be a good idea, because the handler gets called if you exit main the normal way and (as far as I know) if your program gets interrupted via a kill signal.
Most likely main is just compiled in the same way as a standard function. In C it pretty much needs to be, because you might call it from somewhere.
Note that in C++ it's illegal to call main recursively, so a C++ compiler might be able to optimize this more. But in C, as your question stated, it's legal (if a bad idea) to call main recursively, so it needs to be compiled in the same way as any other function.
How can this state saving be prevented?
The only thing you can do is to write your own C start-up routine. That means messing with assembler, but you can then JUMP to your main() instead of just CALLing it.
In my tests with avr-gcc 4.3.5, it only saves registers if not optimizing much. Normal levels (-Os or -O2) cause the push instructions to be optimized away.
One can further specify in a function declaration that it will not return with __attribute__((noreturn)). It is also useful to do full program optimization with -fwhole-program.
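A sketch of the attribute syntax (a GCC and Clang extension, not standard C99); main_loop, poll_once and ticks are hypothetical names. Marking a function that never returns tells the compiler that code after a call to it is unreachable. Whether that actually removes the register pushes with a given avr-gcc version is something to confirm in the generated assembly.

```c
/* Declaration carrying the attribute: this function never returns. */
void main_loop(void) __attribute__((noreturn));

static volatile unsigned char ticks;   /* hypothetical work counter */

/* One iteration of the (hypothetical) application work. */
static void poll_once(void)
{
    ticks++;
}

void main_loop(void)
{
    for (;;)            /* endless loop, so the noreturn promise holds */
        poll_once();
}
```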
The initial code in avr-libc does use call to jump to main, because it is specified that main may return, and then jumps to exit (which is declared noreturn and thus generates no call). You could link your own variant if you think that is too much. exit() in turn simply disables interrupts and enters an infinite loop, effectively stopping your program, but not saving any power. That's four instructions and two bytes of stack memory overhead if your main() never returns or calls exit().

Learning C coming from managed OO languages

I am fairly comfortable coding in languages like Java and C#, but I need to use C for a project (because of low level OS API calls) and I am having some difficulty dealing with pointers and memory management (as seen here)
Right now I am basically typing up code and feeding it to the compiler to see if it works. That just doesn't feel right for me. Can anyone point me to good resources for me to understand pointers and memory management, coming from managed languages?
k&r - http://en.wikipedia.org/wiki/The_C_Programming_Language_(book)
nuff said
One of the good resources you found already, SO.
Of course you are compiling with all warnings on, aren't you?
Learning by doing largely depends on the quality of your compiler and the warnings and errors it feeds you. The best in that respect that I have found in the Linux / POSIX world is clang. It traces the origin of errors nicely and tells you about missing header files quite well.
Some tips:
By default, local variables are stored on the stack.
Variables are passed into functions by value.
Stick to the same process for allocating and freeing memory, e.g. allocate and free in the same function.
C's equivalent of
Integer i = new Integer();
i=5;
is
int *p;
p=malloc(sizeof(int));
*p=5;
Memory allocation (malloc) can fail, so check the pointer for NULL before you use it.
OS functions can fail and this can be detected by the return values.
Learn to use gdb to step through your code and print variable values (compile with -g to enable debugging symbols).
Use valgrind to check for memory leaks and other related problems (like heap corruption).
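Several of these tips combined in one small sketch; boxed_five is a hypothetical function name. It allocates an int, checks the result of malloc, reports failure through its return value, and frees the memory in the same function that allocated it.

```c
#include <stdlib.h>

/* Stores 5 in a heap-allocated int, copies it to *result, then frees it.
 * Returns 0 on success, -1 if the allocation failed. */
int boxed_five(int *result)
{
    int *p = malloc(sizeof *p);   /* sizeof *p follows the pointer's type */

    if (p == NULL)
        return -1;                /* malloc can fail: report it to the caller */

    *p = 5;
    *result = *p;                 /* result is passed by pointer so the caller sees it */
    free(p);                      /* allocated and freed in the same function */
    return 0;
}
```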
The C language doesn't do anything you don't explicitly tell it to do.
There are no destructors automatically called for you, which is both good and bad (since bugs in destructors can be a pain).
A simple way to get somewhat automatic destructor behavior is to use scoping to construct and destruct things. This can get ugly since nested scopes move things further and further to the right.
if ((var = malloc(SIZE)) != NULL) { // try to keep this line
    use_var(var);
    free(var); // and this line close and with easy to comprehend code between them
} else {
    error_action();
}
return; // try to limit the number of return statements so that you can ensure resources
        // are freed for all code paths
Trying to make your code look like this as much as possible will help, though it's not always possible.
Making a set of macros or inline functions that initialize your objects is a good idea. Also make another set of functions that allocate your objects' memory and pass that to your initializer functions. This allows for both local and dynamically allocated objects to easily be initialized. Similar operations for destructor-like functions is also a good idea.
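One way this split could look, as a sketch with a hypothetical Buffer type: buffer_init and buffer_fini work on any Buffer, whether local or dynamically allocated, while buffer_new and buffer_delete add the allocation of the object itself on top of them.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *data;
    size_t size;
} Buffer;

/* Initializer: works on any Buffer, local or heap-allocated. Returns 0 on success. */
int buffer_init(Buffer *b, size_t size)
{
    b->data = malloc(size);
    if (b->data == NULL) {
        b->size = 0;
        return -1;
    }
    memset(b->data, 0, size);
    b->size = size;
    return 0;
}

/* Destructor-like counterpart of buffer_init. */
void buffer_fini(Buffer *b)
{
    free(b->data);
    b->data = NULL;
    b->size = 0;
}

/* Allocator: gets memory for the object, then hands it to the initializer. */
Buffer *buffer_new(size_t size)
{
    Buffer *b = malloc(sizeof *b);

    if (b != NULL && buffer_init(b, size) != 0) {
        free(b);
        b = NULL;
    }
    return b;
}

/* Counterpart of buffer_new. */
void buffer_delete(Buffer *b)
{
    if (b != NULL) {
        buffer_fini(b);
        free(b);
    }
}
```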
Using OO techniques is good practice in many instances, and doing so in C just requires a little bit more typing (but allows for more control). Setters, getters, and other helper functions can help keep objects in consistent states and reduce the changes you have to make when you find an error, if you can keep the interface the same.
You should also look into the perror function and the errno "variable".
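A short sketch of how the two fit together; try_open is a hypothetical helper. Note that ISO C itself does not promise that fopen sets errno (POSIX does), so the sketch falls back to -1 when errno is still 0.

```c
#include <errno.h>
#include <stdio.h>

/* Returns 0 if the file could be opened, otherwise a nonzero error code. */
int try_open(const char *path)
{
    FILE *f;
    int saved;

    errno = 0;
    f = fopen(path, "r");
    if (f == NULL) {
        saved = errno;    /* copy errno before another call can overwrite it */
        perror(path);     /* prints the path and a description of errno to stderr */
        return saved != 0 ? saved : -1;
    }
    fclose(f);
    return 0;
}
```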
Usually you will want to avoid using anything like exceptions in C. I generally try to avoid them in C++ as well, and only use them for really bad errors -- ones that aren't supposed to happen. One of the main reasons for avoiding them is that there are no destructor calls magically made in C, so non-local GOTOs will often leak (or otherwise screw up) some type of resource. That being said, there are things in C which provide a similar functionality.
The main exception-like mechanism in C is the pair of functions setjmp and longjmp. setjmp is called from one location in the code and is passed an opaque variable (a jmp_buf), which can later be passed to longjmp. When longjmp is called, control does not return to longjmp's caller; instead, execution resumes as if the earlier setjmp call with that jmp_buf had just returned a second time. This time setjmp returns the value passed to longjmp (or 1 if that value is 0). Direct calls to setjmp return 0.
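A minimal try/catch-like sketch of that control flow; on_error, risky and attempt are hypothetical names, and 42 is an arbitrary error code.

```c
#include <setjmp.h>

static jmp_buf on_error;   /* filled in by setjmp(), consumed by longjmp() */

/* "Throws" by jumping back to the setjmp() call in attempt(). */
static void risky(int fail)
{
    if (fail)
        longjmp(on_error, 42);   /* 42 becomes setjmp()'s return value */
}

int attempt(int fail)
{
    switch (setjmp(on_error)) {
    case 0:                      /* direct call to setjmp: the "try" part */
        risky(fail);
        return 0;
    case 42:                     /* arrived here through longjmp: the "catch" part */
        return 42;
    default:
        return -1;
    }
}
```

The jmp_buf is only valid while the function that called setjmp is still running, so longjmp must not be called after attempt() has returned.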
Other exception like functionality is more platform specific, but includes signals (which have their own gotchas).
Other things to look into are:
The assert macro, which can be used to cause program exit when the parameter (a logical test of some sort) fails. Calls to assert go away when you #define NDEBUG before you #include <assert.h>, so after testing you can easily remove the assertions. This is really good for testing for NULL pointers before dereferencing them, as well as several other conditions. If a condition fails assert attempts to print the source file name and line number of the failed test.
The abort function causes the program to exit with failure without doing all of the clean up that calling exit does. This may be done with a signal on some platforms. assert calls abort.
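A small sketch of the NULL-pointer use case mentioned above; first_element is a hypothetical function. Compiling with -DNDEBUG (or defining NDEBUG before the #include) turns both checks into nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the first element of a non-empty array. */
int first_element(const int *values, size_t count)
{
    assert(values != NULL);   /* on failure: prints file and line, then calls abort() */
    assert(count > 0);
    return values[0];
}
```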

Is there a difference between the on_exit() and atexit() functions?

Is there any difference between
int on_exit(void (*function)(int , void *), void *arg);
and
int atexit(void (*function)(void));
other than the fact that the function used by on_exit gets the exit status?
That is, if I don't care about the exit status, is there any reason to use one or the other?
Edit: Many of the answers warned against on_exit because it's non-standard. If I'm developing an app that is for internal corporate use and guaranteed to run on specific configurations, should I worry about this?
You should use atexit() if possible. on_exit() is nonstandard and less common. For example, it's not available on OS X.
Kernel.org - on_exit():
This function comes from SunOS 4, but is also present in libc4, libc5 and
glibc. It no longer occurs in Solaris (SunOS 5). Avoid this function, and
use the standard atexit(3) instead.
According to this link I found, it seems there are a few differences. on_exit will let you pass in an argument that is passed in to the on_exit function when it is called... which might let you set up some pointers to do some cleanup work on when it is time to exit.
Furthermore, it appears that on_exit was a SunOS specific function that may not be compatible on all platforms... so you may want to stick with atexit, despite it being more restrictive.
The difference is that atexit is C and on_exit is some weird extension available on GNU and who-knows-what-other Unixy systems (but NOT part of POSIX).
@Nathan, I can't find any function that will return the exit code for the current running process. I expect that it isn't set yet at the point when atexit() is called, anyway. By this I mean that the runtime knows what it is, but probably hasn't reported it to the OS. This is pretty much just conjecture, though.
It looks like you will either need to use on_exit() or structure your program so that the exit code doesn't matter. It would not be unreasonable to have the last statement in your main function flip a global exited_cleanly variable to true. In the function you register with atexit(), you could check this variable to determine how the program exited. This will only give you two states, but I expect that would be sufficient for most needs. You could also expand this type of scheme to support more exit states if necessary.
@Nathan
First, see if there is another API call to determine exit status... a quick glance and I don't see one, but I am not well versed in the standard C API.
An easy alternative is to have a global variable that stores the exit status... the default being an unknown error cause (for if the program terminates abnormally). Then, when you call exit, you can store the exit status in the global and retrieve it from any atexit functions. This requires storing the exit status diligently before every exit call, and clearly is not ideal, but if there is no API and you don't want to risk on_exit not being on the platform... it might be the only option.
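The global-variable approach could be sketched as follows; saved_status, on_program_exit, install_exit_reporter and exit_with are all hypothetical names. The handler only sees a meaningful status if every exit goes through the wrapper.

```c
#include <stdio.h>
#include <stdlib.h>

static int saved_status = -1;   /* -1 means "unknown": exit_with() was never called */

/* Handler registered with atexit(): it takes no arguments, so it reads the global. */
static void on_program_exit(void)
{
    fprintf(stderr, "exiting with status %d\n", saved_status);
}

/* Registers the handler; returns 0 on success, like atexit() itself. */
int install_exit_reporter(void)
{
    return atexit(on_program_exit);
}

/* Wrapper to call instead of exit(): records the status first. */
void exit_with(int status)
{
    saved_status = status;
    exit(status);
}
```

A return from main bypasses the wrapper, so main would need to set saved_status itself or end with exit_with(...).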

Resources