Is there any way to access the command line arguments, without using the argument to main? I need to access it in another function, and I would prefer not passing it in.
I need a solution that only necessarily works on Mac OS and Linux with GCC.
I don't know how to do it on MacOS, but I suspect the trick I will describe here can be ported to MacOS with a bit of cross-reading.
On linux you can use the so called ".init_array" section of the ELF binary, to register a function which gets called during program initilization (before main() is called). This function has the same signature as the normal main() function, execept it returns "void".
Thus, you can use this function to remember or process argc, argv[] and evp[].
Here is some code you can use:
static void my_cool_main(int argc, char* argv[], char* envp[])
{
// your code goes here
}
__attribute__((section(".init_array"))) void (* p_my_cool_main)(int,char*[],char*[]) = &my_cool_main;
PS: This code can also be put in a library, so it should fit your case.
It even works, when your prgram is run with valgrind - valgrind does not fork a new process, and this results in /proc/self/cmdline showing the original valgrind command-line.
PPS: Keep in mind that during this very early program execution many subsystem are not yet fully initialized - I tried libc I/O routines, they seem to work, but don't rely on it - even gloval variables might not yet be constructed, etc...
In Linux, you can open /proc/self/cmdline (assuming that /proc is present) and parse manually (this is only required if you need argc/argv before main() - e.g. in a global constructor - as otherwise it's better to pass them via global vars).
More solutions are available here: http://blog.linuxgamepublishing.com/2009/10/12/argv-and-argc-and-just-how-to-get-them/
Yeah, it's gross and unportable, but if you are solving practical problems you may not care.
You can copy them into global variables if you want.
I do not think you should do it as the C runtime will prepare the arguments and pass it into the main via int argc, char **argv, do not attempt to manipulate the behaviour by hacking it up as it would largely be unportable or possibly undefined behaviour!! Stick to the rules and you will have portability...no other way of doing it other than breaking it...
You can. Most platforms provide global variables __argc and __argv. But again, I support zneak's comment.
P.S. Use boost::program_options to parse them. Please do not do it any other way in C++.
Is there some reason why passing a pointer to space that is already consumed is so bad? You won't be getting any real savings out of eliminating the argument to the function in question and you could set off an interesting display of fireworks. Skirting around main()'s call stack with creative hackery usually ends up in undefined behavior, or reliance on compiler specific behavior. Both are bad for functionality and portability respectively.
Keep in mind the arguments in question are pointers to arguments, they are going to consume space no matter what you do. The convenience of an index of them is as cheap as sizeof(int), I don't see any reason not to use it.
It sounds like you are optimizing rather aggressively and prematurely, or you are stuck with having to add features into code that you really don't want to mess with. In either case, doing things conventionally will save both time and trouble.
Related
I often see programs where people put argc and argv in main, but never make any use of these parameters.
int main(int argc, char *argv[]) {
// never touches any of the parameters
}
Should I do so too? What is the reason for that?
The arguments to the main function can be omitted if you do not need to use them. You can define main this way:
int main(void) {
// never touches any of the parameters
}
Or simply:
int main() {
// never touches any of the parameters
}
Regarding why some programmers do that, it could be to conform to local style guides, because they are used to it, or simply a side effect of their IDE's source template system.
When you have a function, it's obviously important that the arguments passed by the caller always match up properly with the arguments expected by the function.
When you define and call one of your own functions, you can pick whatever arguments make sense to you for the function to accept, and then it's your job to call your function with the arguments you've decided on.
When you call a function that somebody else defined — like a standard library function — somebody else picked the arguments that function would accept, and it's your job to pass them correctly. For example, if you call the standard library function strcpy, you just have to pass it a destination string, and a source string, in that order. If you think it would make sense to pass three arguments, like the destination string, and the size of the destination string, and the source string, it won't work. You don't get to make up the way you'll call the function, because it's already defined.
And then there are a few cases where somebody else is going to call a function that you defined, and the way they're going to call it is fixed, such that you don't have any choice in the way you define it. The best example of this (except it turns out it's not such a good example after all, as we'll see) is main(). It's your job to define this function. It's not a standard library function that somebody else is going to define. But, it is a function that somebody else — namely, the C start-up code — is going to call. That code was written a while ago, by somebody else, and you have no control over it. It's going to call your main function in a certain way. So you're constrained to write your main function in a way that's compatible with the way it's going to be called. You can put whatever you want in the body of your main function, but you don't get to pick your own arguments: there are supposed to be two of those, an int and a char **, in that order.
Now, it also turns out that there's a very special exception for main. Even though the caller is going to be calling it with those two predefined arguments, if you're not interested in them, and if you define main with no arguments, instead, like this:
int main()
{
/* ... */
}
your C implementation is required to set things up so that nothing will go wrong, no problems will be caused by the caller passing those two arguments that your main function doesn't accept.
So, in answer to your question, many programs are written to accept int argc and char **argv because they're complying with the simple rule: those are the arguments the caller is accepting, so those are the arguments they believe their main function should be defined as accepting, even if it doesn't actually use them.
Programmers who define main functions that accept argc and argv without using them either haven't heard of, or choose not to make use of, the special exception that says they don't have to. Personally, I don't blame them: that special exception for main is a strange one, which didn't always exist, so since it's not wrong to define main as taking two required arguments but not using them, that could be considered "better style".
(Yes, if you define a function that fails to actually use the arguments it defines, your compiler might warn you about this, but that's a separate question.)
Can a function tell what's calling it, through the use of memory addresses maybe? For example, function foo(); gets data on whether it is being called in main(); rather than some other function?
If so, is it possible to change the content of foo(); based on what is calling it?
Example:
int foo()
{
if (being called from main())
printf("Hello\n");
if (being called from some other function)
printf("Goodbye\n");
}
This question might be kind of out there, but is there some sort of C trickery that can make this possible?
For highly optimized C it doesn't really make sense. The harder the compiler tries to optimize the less the final executable resembles the source code (especially for link-time code generation where the old "separate compilation units" problem no longer prevents lots of optimizations). At least in theory (but often in practice for some compilers) functions that existed in the source code may not exist in the final executable (e.g. may have been inlined into their caller); functions that didn't exist in the source code may be generated (e.g. compiler detects common sequences in many functions and "out-lines" them into a new function to avoid code duplication); and functions may be replaced by data (e.g. an "int abcd(uint8_t a, uint8_t b)" replaced by a abcd_table[a][b] lookup table).
For strict C (no extensions or hacks), no. It simply can't support anything like this because it can't expect that (for any compiler including future compilers that don't exist yet) the final output/executable resembles the source code.
An implementation defined extension, or even just a hack involving inline assembly, may be "technically possible" (especially if the compiler doesn't optimize the code well). The most likely approach would be to (ab)use debugging information to determine the caller from "what the function should return to when it returns".
A better way for a compiler to support a hypothetical extension like this may be for the compiler to use some of the optimizations I mentioned - specifically, split the original foo() into 2 separate versions where one version is only ever called from main() and the other version is used for other callers. This has the bonus of letting the compiler optimize out the branches too - it could become like int foo_when_called_from_main() { printf("Hello\n"); }, which could be inlined directly into the caller, so that neither version of foo exists in the final executable. Of course if foo() had other code that's used by all callers then that common code could be lifted out into a new function rather than duplicating it (e.g. so it might become like int foo_when_called_from_main() { printf("Hello\n"); foo_common_code(); }).
There probably isn't any hypothetical compiler that works like that, but there's no real reason you can't do these same optimizations yourself (and have it work on all compilers).
Note: Yes, this was just a crafty way of suggesting that you can/should refactor the code so that it doesn't need to know which function is calling it.
Knowing who called a specific function is essentially what a stack trace is visualizing. There are no general standard way of extracting that though. In theory one could write code that targeted each system type the software would run on, and implement a stack trace function for each of them. In that case you could examine the stack and see what is before the current function.
But with all that said and done, the question you should probably ask is why? Writing a function that functions in a specific way when called from a specific function is not well isolated logic. Instead you could consider passing in a parameter to the function that caused the change in logic. That would also make the result more testable and reliable.
How to actually extract a stack trace has already received many answers here: How can one grab a stack trace in C?
I think if loop in C cannot have a condition as you have mentioned.
If you want to check whether this function is called from main(), you have to do the printf statement in the main() and also at the other function.
I don't really know what you are trying to achieve but according to what I understood, what you can do is each function will pass an additional argument that would uniquely identify that function in form of a character array, integer or enumeration.
for example:
enum function{main, add, sub, div, mul};
and call functions like:
add(3,5,main);//adds 3 and 5. called from main
changes to the code would be typical like if you are adding more functions. but it's an easier way to do it.
No. The C language does not support obtaining the name or other information of who called a function.
As all other answers show, this can only be obtained using external tools, for example that use stack traces and compiler/linker emitted symbol tables.
I am working on some legacy C code. The original code was written in the mid-90s, targeting Solaris and Sun's C compiler of that era. The current version compiles under GCC 4 (albeit with many warnings), and it seems to work, but I'm trying to tidy it up -- I want to squeeze out as many latent bugs as possible as I determine what may be necessary to adapt it to 64-bit platforms, and to compilers other than the one it was built for.
One of my main activities in this regard has been to ensure that all functions have full prototypes (which many did not have), and in that context I discovered some code that calls a function (previously un-prototyped) with fewer arguments than the function definition declares. The function implementation does use the value of the missing argument.
Example:
impl.c:
int foo(int one, int two) {
if (two) {
return one;
} else {
return one + 1;
}
}
client1.c:
extern foo();
int bar() {
/* only one argument(!): */
return foo(42);
}
client2.c:
extern int foo();
int (*foop)() = foo;
int baz() {
/* calls the same function as does bar(), but with two arguments: */
return (*foop)(17, 23);
}
Questions: is the result of a function call with missing arguments defined? If so, what value will the function receive for the unspecified argument? Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior that I can emulate by adding a particular argument value to the affected calls?
EDIT: I found a stack thread C function with no parameters behavior which gives a very succinct and specific, accurate answer. PMG's comment at the end of the answer taks about UB. Below were my original thoughts, which I think are along the same lines and explain why the behaviour is UB..
Questions: is the result of a function call with missing arguments defined?
I would say no... The reason being is that I think the function will operate as-if it had the second parameter, but as explained below, that second parameter could just be junk.
If so, what value will the function receive for the unspecified argument?
I think the values received are undefined. This is why you could have UB.
There are two general ways of parameter passing that I'm aware of... (Wikipedia has a good page on calling conventions)
Pass by register. I.e., the ABI (Application Binary Interface) for the plat form will say that registers x & y for example are for passing in parameters, and any more above that get passed via stack...
Everything gets passed via stack...
Thus when you give one module a definition of the function with "...unspecified (but not variable) number of parameters..." (the extern def), it will not place as many parameters as you give it (in this case 1) in either the registers or stack location that the real function will look in to get the parameter values. Therefore the second area for the second parameter, which is missed out, essentially contains random junk.
EDIT: Based on the other stack thread I found, I would ammended the above to say that the extern declared a function with no parameters to a declared a function with "unspecified (but not variable) number of parameters".
When the program jumps to the function, that function assumes the parameter passing mechanism has been correctly obeyed, so either looks in registers or the stack and uses whatever values it finds... asumming them to be correct.
Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a >> predictable implementation-specific behavior
You'd have to check your compiler documentation. I doubt it... the extern definition would be trusted completely so I doubt the registers or stack, depending on parameter passing mechanism, would get correctly initialised...
If the number or the types of arguments (after default argument promotions) do not match the ones used in the actual function definition, the behavior is undefined.
What will happen in practice depends on the implementation. The values of missing parameters will not be meaningfully defined (assuming the attempt to access missing arguments will not segfault), i.e. they will hold unpredictable and possibly unstable values.
Whether the program will survive such incorrect calls will also depend on the calling convention. A "classic" C calling convention, in which the caller is responsible for placing the parameters into the stack and removing them from there, will be less crash-prone in presence of such errors. The same can be said about calls that use CPU registers to pass arguments. Meanwhile, a calling convention in which the function itself is responsible for cleaning the stack will crash almost immediately.
It is very unlikely the bar function ever in the past would give consistent results. The only thing I can imagine is that it is always called on fresh stack space and the stack space was cleared upon startup of the process, in which case the second parameter would be 0. Or the difference between between returning one and one+1 didn't make a big difference in the bigger scope of the application.
If it really is like you depict in your example, then you are looking at a big fat bug. In the distant past there was a coding style where vararg functions were implemented by specifying more parameters than passed, but just as with modern varargs you should not access any parameters not actually passed.
I assume that this code was compiled and run on the Sun SPARC architecture. According to this ancient SPARC web page: "registers %o0-%o5 are used for the first six parameters passed to a procedure."
In your example with a function expecting two parameters, with the second parameter not specified at the call site, it is likely that register %01 always happened to have a sensible value when the call was made.
If you have access to the original executable and can disassemble the code around the incorrect call site, you might be able to deduce what value %o1 had when the call was made. Or you might try running the original executable on a SPARC emulator, like QEMU. In any case this won't be a trivial task!
I saw a C code like this:
#include <stdio.h>
void main ()
{
static int ivar = 5;
printf ("%d", ivar--);
if (ivar)
main ();
}
which outputs :
54321
I'm a novice in C and I guess until the condition fails the main method is called again and again. Since I'm a novice in C, is it good practice to call the main function more than once as in the above case? Are there any real world cases, where this kind of code is very useful?
Thanks in advance.
In your example, it doesn't matter because it's a very small piece of code. But in the general case I think calling main is a bad idea for the following reasons:
Readability. When examining a program no one would assume that main will be called at one point. When you see that it is, you have to back track and reread the whole thing. Besides, main is not a meaningful name in the sense that it's not clear what the intent of the recursion is. So I'd write another function with a meaningful name to reflect this.
Reusability. It's quite probable that a new function with a meaningful name will be useful in more than one place in a complex program.
Command line arguments. At one point you may need command line arguments in your program. Even GUI programs need them (for file associations etc.). And you will need to refactor all your calls to main to take that into account.
C++ compatibility. It's illegal in C++.
I would say it's rarely, if ever, a good idea to call the main function. If you're going to recurse, create a function to do it.
A while-loop would be more suitable. Recursion makes sense when, with each recursion, you are doing a different job -- typically a smaller one.
What this code is really doing is demonstrating function-local static variables: the ivar is only initialised in the first call of main. Each time you recurse, it is decremented despite the ivar=5 statement.
main has a special meaning. Idiomatically, it should initialise the environment and then call some other function which drives the application logic.
An optimising compiler might well transform that code into an iterative version anyway.
This is not often seen (I've never seen it done before), very confusing, since main is supposed to be called once when the program starts and end when the program ends, and largely impractical in most real programs, since you would need to stop the subsequent calls to main() from parsing the command line again.
It's far more reasonable to just write a separate recursive function and call it from main and using ordinary function arguments instead of static variables.
It's called recursion and it can be very useful. For instance traversing a tree. Some math calculations also use recursion.
I'm trying to make a call to a DLL function (via GetProcAddress etc) from C, using lcc compiler. The function gets called and everything goes well, but it looks like the top of the stack gets corrupted. I've tried to play with calling conventions (__stdcall / __cdecl), but that didn't help.
Unfortunately I don't have access to the dll code, and have to use the lcc compiler.
I found that this simple hack avoids stack corruption:
void foo(params)
{
int dummy;
dll_foo(params);
}
Here dll_foo is the pointer returned by GetProcAddress, and the stack is kind of protected by the dummy variable. So it's not the stack pointer that gets corrupted, but the data at the top of the stack. It works like this, but I'd like to know the reason of the corruption.
Any ideas?
UPD:
As asked in the comments, here are the actual function types:
typedef unsigned char (CALLBACK Tfr)(unsigned char);
typedef void (CALLBACK Tfw)(unsigned char,unsigned char);
typedef int (CALLBACK Tfs)(int);
typedef void (CALLBACK Tfwf)(int*,int);
All they show a similar behavior.
Unfortunately, it is not so straightforward to attach a debugger, as the code is compiled and launched by Matlab, using the LCC compiler, and there is no debugging support. Probably I will have to reproduce this problem in a standalone configuration, but it is not that easy to make it.
Sounds like you use MSVC, Debug + Windows + Registers. Look at the value of ESP before and after the call. If it doesn't match then first change the calling convention in the function pointer declaration (did you do that right?) If it still doesn't match then it is __stdcall and you haven't guessed the arguments you need to pass correctly.
Or the function could just clobbers the stack frame, it isn't impossible.
Posting your function pointer declaration that shows the real arguments would probably help diagnose this better.
It sounds to me like you were on the right track with looking at the calling convention. The main thing you need to do is ensure that the caller and callee are both using the same convention. Typically for a DLL, you want to use __stdcall for both, but if (as you say) you have no control over the DLL, then you need to modify your code to match what it's doing. Unfortunately, it's almost impossible to guess what that is -- I'm pretty sure lcc (like most C and C++ compilers) can produce code to use a variety of conventions.
Based on your hack working by putting an extra dword on the stack, it sounds like you currently have a mismatch where both the caller and the callee are trying to clear arguments off the stack (e.g., the caller using __cdecl and the callee using __stdcall.
You could try to "follow" the call to dll_foo() i assembler using a debugger, at check out exactly what the routine does stack-wise.