I have a program which forks off other processes. The arguments to my program include the process name of the process to be forked, along with any arguments.
This means, when I make the call to exec(), I need to be able to handle however many arguments were supplied.
Any ideas?
Thanks.
The execv function takes a pointer to an array of arguments.
Just like in main, the last element in the array needs to be a null pointer.
Alternatively, execl() takes a variable number of arguments, with a null pointer, written as (char *)NULL, at the end of the list. Since the number of arguments in your case is only known at run time, you should use execv(); it is also much cleaner, as varargs in C can only be considered an ugly hack (take a look at (the files pointed to by) /usr/include/varargs.h sometime, if you dare!).
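For what it's worth, here is a minimal sketch of the execv approach, assuming a POSIX system. The helper name run_child is made up for illustration, and I've used execvp (the PATH-searching variant of execv) so that a bare program name works:

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec the program named by child_argv[0], and wait for it.
 * child_argv must end with a null pointer, exactly like main's argv.
 * Returns the child's exit status, or -1 on failure.
 * (run_child is a hypothetical helper name, not a library function.) */
int run_child(char *const child_argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execvp searches PATH and only returns on error. */
        execvp(child_argv[0], child_argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since argv[argc] is guaranteed to be a null pointer, your own main can simply call run_child(&argv[1]) once it has checked that argc is at least 2; no copying of the argument list is needed, however many arguments were supplied.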
I often see programs where people put argc and argv in main, but never make any use of these parameters.
int main(int argc, char *argv[]) {
// never touches any of the parameters
}
Should I do so too? What is the reason for that?
The arguments to the main function can be omitted if you do not need to use them. You can define main this way:
int main(void) {
// no parameters are declared, so there is nothing to ignore
}
Or simply:
int main() {
// no parameters are declared, so there is nothing to ignore
}
Regarding why some programmers do that, it could be to conform to local style guides, because they are used to it, or simply a side effect of their IDE's source template system.
When you have a function, it's obviously important that the arguments passed by the caller always match up properly with the arguments expected by the function.
When you define and call one of your own functions, you can pick whatever arguments make sense to you for the function to accept, and then it's your job to call your function with the arguments you've decided on.
When you call a function that somebody else defined — like a standard library function — somebody else picked the arguments that function would accept, and it's your job to pass them correctly. For example, if you call the standard library function strcpy, you just have to pass it a destination string, and a source string, in that order. If you think it would make sense to pass three arguments, like the destination string, and the size of the destination string, and the source string, it won't work. You don't get to make up the way you'll call the function, because it's already defined.
And then there are a few cases where somebody else is going to call a function that you defined, and the way they're going to call it is fixed, such that you don't have any choice in the way you define it. The best example of this (except it turns out it's not such a good example after all, as we'll see) is main(). It's your job to define this function. It's not a standard library function that somebody else is going to define. But, it is a function that somebody else — namely, the C start-up code — is going to call. That code was written a while ago, by somebody else, and you have no control over it. It's going to call your main function in a certain way. So you're constrained to write your main function in a way that's compatible with the way it's going to be called. You can put whatever you want in the body of your main function, but you don't get to pick your own arguments: there are supposed to be two of those, an int and a char **, in that order.
Now, it also turns out that there's a very special exception for main. Even though the caller is going to be calling it with those two predefined arguments, if you're not interested in them, and if you define main with no arguments, instead, like this:
int main()
{
/* ... */
}
your C implementation is required to set things up so that nothing will go wrong, no problems will be caused by the caller passing those two arguments that your main function doesn't accept.
So, in answer to your question, many programs are written to accept int argc and char **argv because they're complying with the simple rule: those are the arguments the caller is passing, so those are the arguments they believe their main function should be defined as accepting, even if it doesn't actually use them.
Programmers who define main functions that accept argc and argv without using them either haven't heard of, or choose not to make use of, the special exception that says they don't have to. Personally, I don't blame them: that special exception for main is a strange one, which didn't always exist, so since it's not wrong to define main as taking two required arguments but not using them, that could be considered "better style".
(Yes, if you define a function that fails to actually use the arguments it defines, your compiler might warn you about this, but that's a separate question.)
I'm curious as to what C does exactly to parse command line arguments. For example, assume I have a program named myProgram that takes in two arguments like this
./myProgram arg1 arg2
If I were to call
./myProgram arg1$'\0otherstuff' arg2
arg1 and arg2 would still print if we were to print argv[1] and argv[2], ignoring $'\0otherstuff', but where does it go? Is it stored in memory behind arg1? Could it potentially overwrite any buffer? How is arg2 read if there's a null character before it?
Converting ./myProgram arg1 arg2 into the C-style int argc, char *argv[] is done by the operating system or by the shell (it depends). C does not parse the arguments; you parse the arguments in C. C is a programming language, not an entity. The form int argc, char *argv[] is what the C programming language uses for the arguments passed to the main function, but other programming languages may use a different form; for C, see main_function.
On Linux, one may use the execve system call to specify the arguments passed to a program. Parsing from the form ./myProgram arg1 arg2 to execve arguments is done by the shell (e.g. bash), which constructs the argv array and passes it to the execve call.
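To make the shell's part concrete, here is a heavily simplified sketch of turning a command line into an argv-style array. A real shell also handles quoting, escapes, expansions and much more; split_words is a hypothetical helper that only splits on whitespace:

```c
#include <stddef.h>
#include <string.h>

/* Split line in place on whitespace. Stores at most max - 1 word
 * pointers in argv_out, followed by a null pointer, and returns the
 * number of words stored. The buffer line is modified by strtok.
 * (split_words is a made-up name; real shells do far more than this.) */
size_t split_words(char *line, char *argv_out[], size_t max)
{
    size_t count = 0;
    if (max == 0)
        return 0;

    for (char *word = strtok(line, " \t\n");
         word != NULL && count + 1 < max;
         word = strtok(NULL, " \t\n")) {
        argv_out[count++] = word;
    }
    argv_out[count] = NULL;   /* same terminator that main's argv has */
    return count;
}
```

The resulting array has the same shape as what main receives: a count, pointers to separate NUL-terminated strings, and a null pointer after the last one. That array is what would be handed to execve.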
Your shell is probably ignoring the $'\0otherstuff' part, because under POSIX a filename cannot contain the NUL character (assuming your shell is POSIX compatible).
When an executable is invoked, the OS kernel takes the arguments (as plain text) and copies them into the new program's memory.
Before the main function is called, a small piece of start-up code runs and passes those arguments to the actual main function in C.
Experimenting with bash (version 3.2.57(1)-release (x86_64-apple-darwin17)) suggests that the “otherstuff” in your example is not passed to the program. When a program is called with the command line you show, the memory pointed to by argv[1] contains “arg1”, then a null character, then “arg2”. Thus, the null and “otherstuff” in your command line have not been passed to the program.
(Hypothetically: If the shell were to pass it to the program, I would expect it would pass it in the memory continuing from that pointed to by argv[1], and there would be no danger of it overwriting any buffer. If the shell were designed to tolerate an embedded null character in an argument, I expect (based on how we design things) that it would treat the argument as a complete string and provide the necessary space to hold it.)
The fact that the argument prior to “arg2” contains a null character is irrelevant to the handling of “arg2”. After initial processing of the command line, the shell does not treat the line as one string. It has divided it into words or other units and handles them with its own data structures. So the presence of null characters in prior arguments has no effect on later arguments.
Additionally, it may not be possible for the shell to pass an argument containing an embedded null character. The routines typically used to execute a program, such as execl, accept the arguments as null-terminated strings. So the embedded null terminates the string, and the execl routine never passes anything beyond the null character.
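The truncation described above can be seen without involving a shell at all. In this small sketch (raw_arg and visible_length are names invented for the illustration), the array really holds all the bytes, but anything that treats it as a C string stops at the first null character:

```c
#include <stddef.h>
#include <string.h>

/* The bytes a shell would hold if it kept the embedded NUL:
 * "arg1", NUL, "otherstuff", and the terminating NUL (16 bytes). */
static const char raw_arg[] = "arg1\0otherstuff";

/* Length as seen by code that treats the argument as a C string,
 * which is how the exec family and the program's argv[1] see it. */
size_t visible_length(const char *arg)
{
    return strlen(arg);
}
```

Here sizeof raw_arg is 16, but visible_length(raw_arg) is 4, so only "arg1" would survive a trip through an interface built on null-terminated strings.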
I am reading http://www.cs.utexas.edu/users/lavender/courses/cs345/lectures/CS345-Lecture-07.pdf to try to understand how the stack activation frame works for functions with variable arguments.
Specifically, how can the called function know how many arguments are being passed?
The slide said:
The va_start procedure computes the fp+offset value following the argument past the last known argument (e.g., const char *format). The rest of the arguments are then computed by calling va_arg, where the ‘ap’ argument to va_arg is some fp+offset value.
My question is: what is fp (the frame pointer)? How does va_start compute the 'fp+offset' value?
And how does va_arg get 'some fp+offset value'? And what is va_end supposed to do with the stack?
The function doesn't know how many arguments are passed. At least not in any way that matters, i.e. in C you cannot query for the number of arguments.
That's why all varargs functions must either:
Use a non-varying argument that contains information about the number and types of all variable arguments, like printf() does; or
Use a sentinel value on the end of the variable argument list. I'm not aware of a function in the standard library that does this, but see for instance GTK+'s gtk_list_store_set() function.
Both mechanisms are risky; if your printf() format string doesn't match the arguments, you're going to get undefined behavior. If there was a way for printf() to know the number of passed arguments, it would of course protect against this, but there isn't. So it can't.
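As a sketch of the sentinel convention, here is a made-up function, total_length, that adds up the lengths of however many strings it is given, stopping at a null pointer. The names are assumptions for illustration only:

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Sum the lengths of all strings passed, stopping at a null pointer.
 * The caller must end the list with (const char *)NULL; forgetting
 * the sentinel is undefined behavior, and nothing here can detect it. */
size_t total_length(const char *first, ...)
{
    size_t total = 0;
    va_list ap;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        total += strlen(s);
    va_end(ap);

    return total;
}
```

A call looks like total_length("ab", "cde", (const char *)NULL). The cast matters: a bare NULL may be an integer constant, which need not have the size or representation of a pointer when passed through "...".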
The va_start() macro takes as an argument the last non-varying argument, so it can somehow (this is compiler internals, there's no single correct or standard answer, all we can do from this side of the interface is reason from the available data) use that to know where the first varying argument is located on the stack.
The va_arg() macro gets the type as an "argument", which makes it possible to use that to compute the offset, and probably increment some state in the va_list object to point at the next varying argument.
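From the caller's side of the interface, the three macros fit together roughly like this. This is a minimal sketch using the count-in-a-fixed-argument convention; sum_ints is a hypothetical name:

```c
#include <stdarg.h>

/* Add up count int arguments. The function trusts count completely:
 * it has no other way to learn how many arguments were passed. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);              /* position ap after the last named argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);     /* fetch one int and advance ap */
    va_end(ap);                       /* required before returning */

    return total;
}
```

As for va_end: the standard requires it to be called before the function returns, but says nothing about what it does. On many implementations it does nothing at all, so I would not assume it touches the stack.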
I've just discovered variadic functions in C and have defined one as a general notification function type (a typedef) that, as well as a pointer to a text string, can optionally take whatever arguments are sent along with it. This is useful as a generic debug function, for instance, where I want all the output string manipulation in one place.
Since I want my C files to be as generic as possible I have static variables that contain pointers to possible callbacks in higher code, populated in an init call. Since the pointers may be null if higher code isn't interested, I'd normally have a local wrapper that only calls through the pointer if it's not null. But I'm having trouble figuring out how to forward this fuzzy thing represented by '...' and simply calling the function with '...' in the argument list gives a syntax error.
Is there any way to do this, or am I stuck with having a dummy local handler and having init set null callbacks to a pointer to that?
You can't pass on the variadic arguments. You have to fetch them into a va_list and pass this to the inner function.
Take a look at this question in the C FAQ. It defines a variadic error function that wants to forward to printf. This is just your use case.
The same FAQ generally recommends providing a version that takes a va_list for every (or most) variadic functions.
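Applied to the nullable-callback situation from the question, a sketch might look like the following. All the names here (notify_fn, notify_cb, set_notify_callback, capture_message, notify) are invented for the example; the key point is that the callback type takes a va_list rather than "...":

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Callbacks take a va_list, because "..." itself cannot be forwarded. */
typedef void (*notify_fn)(const char *fmt, va_list ap);

static notify_fn notify_cb = NULL;  /* set by higher-level code; may stay NULL */
static char last_message[128];      /* filled in by the sample callback */

void set_notify_callback(notify_fn cb)
{
    notify_cb = cb;
}

/* Sample callback: formats the message into last_message. */
void capture_message(const char *fmt, va_list ap)
{
    vsnprintf(last_message, sizeof last_message, fmt, ap);
}

/* Local variadic wrapper: calls through the pointer only if it is set.
 * Returns 1 if a callback was invoked, 0 otherwise. */
int notify(const char *fmt, ...)
{
    if (notify_cb == NULL)
        return 0;

    va_list ap;
    va_start(ap, fmt);
    notify_cb(fmt, ap);
    va_end(ap);
    return 1;
}
```

With this arrangement there is no need for a dummy handler: the wrapper does the null check, and the higher-level code registers a va_list-taking function, typically one that ends up in vprintf, vfprintf or vsnprintf.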
I'm trying to plug a hole in my knowledge. Why do variadic functions require at least two arguments? Is it mostly because C's main function has argc as the argument count and then argv as an array of arrays of chars? Also, Objective-C's Cocoa has NSString methods that require a format as the first argument and afterwards a list of arguments ([NSString stringWithFormat:@"%@", foo]). Why is it impossible to create a variadic function accepting only a list of arguments?
argc/argv stuff is not really variadic.
Variadic functions (such as printf()) receive their extra arguments on the stack or in registers, depending on the platform's calling convention, and require at least one named argument, not two.
You have void foo(char const * fmt, ...) and usually fmt gives a clue about the number of arguments.
That's minimum 1 argument (fmt).
C has very limited reflection abilities, so you must have some way to indicate what the variable arguments contain, either specifying the number of arguments or their types (or both), and that is the logic behind having one more parameter. The named parameter is required by the ISO C standard up to and including C17, so there you can't omit it (C23 dropped the requirement). If you feel you don't need any extra parameters because the number and type of the arguments is always constant, then there is no need for variable arguments in the first place.
You could of course design other ways to encode the number / type information inside the variable arguments such as a sentinel value. If you want to do this, you can just supply a dummy value for the first argument and not use it in the method body.
And just to be pedantic about your title, variadic functions only require one argument (not two). It's perfectly valid to make a call to a variadic function without providing any optional arguments:
printf("Hello world");
I think the reason is the following:
In the macro va_start(list, param); you specify the last fixed argument. It is needed to determine the address at which the variable argument list begins on the stack.
How would you then know if the user provided any arguments?
There has to be some information to indicate this, and C in general wasn't designed to do behind-your-back data manipulation. So anything you need, it makes you pass explicitly.
I'm sure if you really wanted to you could try to enforce some scheme whereby the variadic function takes only a certain type of parameter (a list of ints for example) - and then you fill some global variable indicating how many ints you had passed.
Your two examples are not variadic functions. They are functions with two arguments, but they also highlight a similar issue. How can you know the size of a C array without additional information? You can either pass the size of the array, or you define a scheme with some sentinel value demarcating the end of the array (e.g. '\0' for a C string).
In both the variadic case and the array case you have the same problem, how can you know how much data you have legitimate access to? If you don't know this with the array case, you will go out of bounds. If you don't know this with the variadic case you will call va_arg too many times, or with the wrong type.
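The two array conventions can be put side by side in a short sketch. The function names are made up; the first takes an explicit length and the second relies on a sentinel value chosen by the caller:

```c
#include <stddef.h>

/* Explicit length: the caller states how many elements are valid. */
int sum_counted(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Sentinel: the data itself marks the end, as '\0' does for a string.
 * If the sentinel is missing, this reads out of bounds. */
int sum_until(const int *values, int sentinel)
{
    int total = 0;
    for (size_t i = 0; values[i] != sentinel; i++)
        total += values[i];
    return total;
}
```

Either way the extra information has to come from somewhere: a count passed alongside the data, or a value inside the data that both sides agree means "stop". Variadic functions face exactly the same choice.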
To turn the question around, how would you be able to implement a function taking a variable number of arguments without passing the extra information?