I'm curious as to what C does exactly to parse command line arguments. For example, assume I have a program named myProgram that takes in two arguments like this
./myProgram arg1 arg2
If I were to call
./myProgram arg1$'\0otherstuff' arg2
arg1 and arg2 would still print if we were to print argv[1] and argv[2], ignoring $'\0otherstuff', but where does it go? Is it stored in memory behind arg1? Could it potentially overwrite any buffer? How is arg2 read if there's a null character before it?
Converting ./myProgram arg1 arg2 into the C-style int argc, char *argv[] is done by the operating system or by the shell (it depends). C does not parse the arguments; you parse the arguments in C. C is a programming language, not an entity. The form int argc, char *argv[] is what the C programming language uses for the arguments passed to the main function, but other programming languages may use a different form; for C, see main_function.
On Linux, one may use the execve system call to specify the arguments passed to a program. Parsing from the form ./myProgram arg1 arg2 to execve arguments is done by the shell (e.g. bash), which constructs the argv array and passes it to the execve call.
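Your shell is probably ignoring the $'\0otherstuff' part, because under POSIX a filename cannot contain the NUL character (assuming your shell is POSIX compatible). More generally, each argument is handed to execve as a NUL-terminated string, so an argument cannot carry an embedded NUL.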
When an executable is called, the OS kernel takes the additional arguments (as plain text) and copies them into the new program's memory.
Before the main function is called, a small piece of startup code runs and passes the given arguments to the actual main function in C.
Experimenting with bash (version 3.2.57(1)-release (x86_64-apple-darwin17)) suggests that the “otherstuff” in your example is not passed to the program. When a program is called with the command line you show, the memory pointed to by argv[1] contains “arg1”, then a null character, then “arg2”. Thus, the null and “otherstuff” in your command line have not been passed to the program.
(Hypothetically: If the shell were to pass it to the program, I would expect it would pass it in the memory continuing from that pointed to by argv[1], and there would be no danger of it overwriting any buffer. If the shell were designed to tolerate an embedded null character in an argument, I expect (based on how we design things) that it would treat the argument as a complete string and provide the necessary space to hold it.)
The fact that the argument prior to “arg2” contains a null character is irrelevant to the handling of “arg2”. After initial processing of the command line, the shell does not treat the line as one string. It has divided it into words or other units and handles them with its own data structures. So the presence of null characters in prior arguments has no effect on later arguments.
Additionally, it may not be possible for the shell to pass an argument containing an embedded null character. The routines typically used to execute a program, such as execl, accept the arguments as null-terminated strings. So the embedded null terminates the string, and the execl routine never passes anything beyond the null character.
I want to compare the different elements of a command-line argument. It would be entered all together resulting in the string being found at argv[1]. However, I am not sure how to compare the elements and individual characters as I am looking for repetitions.
If I compared [2] to [3] in the string, there would be nothing there as only 1 string is entered in the command line argument and I need to compare the characters found within that string argv[1]. I am unable to include spaces so I wouldn't be able to compare argv[2] to argv[1].
Read Modern C and see this C reference.
If you code for Linux, read also documentation of GNU libc, in particular the section on parsing program arguments.
You could also use strcmp(3) to compare your program argument to some fixed constant string.
The program arguments are given to your main function, traditionally defined as int main(int argc, char **argv). They are always strings (at different addresses), and on POSIX you are practically certain that argc > 0, that argv[0] is a non-empty string (somehow the name of the program), that all argv[i] with i >= 0 and i < argc are non-null, and that argv[argc] is NULL.
You could code inside your main something like
if (argc > 1 && !strcmp(argv[1], "foo")) {
    /* handle `foo` as first program argument */
}
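For inspiration, I recommend studying the source code of GNU findutils or of GNU make.
I don't recommend modifying program arguments with code like strcpy(argv[1], "bar"). The C standard does allow a program to modify the argv strings, but writing past the end of the original argument is undefined behavior (here, whenever argv[1] is shorter than "bar"), and such code is unreadable and brittle.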
If you are on Linux, see also proc(5) about /proc/self/cmdline
I'm making a university project which is supposed to read a table from stdin, apply some changes to it and print to stdout. Here's how the program should be run:
./main [delimiter] [function] <file1.txt >file2.txt
[delimiter] is the character that will divide the cells in the resulting table, defined in the body;
[function] is the function that will be performed to modify rows or columns, defined in the body.
So my question is, how can I read the [delimiter] and [function] from the terminal so that I can use them accordingly in the body of the program?
A C program generally has a main function with a signature like:
int main (int argc, char *argv[])
where argc is an integer that tells you how many things are in the array, argv, and argv is an array of arguments starting with the name of the program (at index 0) and including all the options and parameters that you specify when you invoke the program. Since parsing arguments is something that so many programs have to do, there are various libraries that simplify the task. You can find a number of them listed in the question Parsing command-line arguments in C?.
Parsing the arguments yourself really isn't difficult, though, especially if your program expects the arguments in a particular sequence. Just loop over the entries in argv and read the strings.
I have a program that parses command-line arguments using a while loop. Simply put, while iterating up to argc, if an argument matches a flag then the next argument is taken as its value. Now, in my assignment, we are asked to do this in a way that makes spaces between flags and integer arguments optional.
For example, if I input -k1 it is the same as -k 1, and 1 is the value stored.
I can't find anything that allows this. The only thing I can think of is that if argc is an even number, it means there are no spaces between a flag and its argument, and I could use scanf("-k%d",key).
Any helpful pointers for me?
On a POSIX-compatible OS you can use a standard API for that: getopt (see man 3 getopt). It does all the dirty work of parsing the parameters and gives you a convenient interface to work with.
Here is a good example for it: http://www.gnu.org/software/libc/manual/html_node/Example-of-Getopt.html#Example-of-Getopt
If the following array contained shell code in a C program on a Linux machine
char buf[100];
then how does the following execute this shell code:
((void(*)())buf)();
Simple. It casts buf to a pointer-to-function taking no arguments and returning void, and then invokes that function.
However, that probably won't work since the page containing buf is highly unlikely to be marked as executable.
I have a program which forks off other processes. The arguments to my program include the process name of the process to be forked, along with any arguments.
This means, when I make the call to exec(), I need to be able to handle however many arguments were supplied.
Any ideas?
Thanks.
The execv function takes a pointer to an array of arguments.
Just like in main, the last element in the array needs to be a null pointer.
Alternatively, execl() takes a variable number of arguments, terminated by a null pointer (written as (char *)NULL so that it has the right type in a variadic call). You should probably use execv(), however, as it's much cleaner; varargs in C can only be considered an ugly hack (take a look at (the files pointed to by) /usr/include/varargs.h sometime, if you dare!).