Separating options from non-option arguments in a command-line program - c

I am trying to write a limited version of ls w/ some options.
However, I am stuck on the problem of parsing out my options from my arguments in a clean manner.
For example:
$ ls -l -t somefile anotherFile
$ ls somefile -lt anotherFile
have the same behavior.
This poses two problems for me:
It makes using argc a bit more difficult. For example, I would consider ls -lt and ls to both have 0 arguments (other than the name of the command); however, argc counts -lt as an argument.
Therefore the naive implementation of :
if( argc == 1) {list all the contents of cwd}
does not work.
Is there a built-in way to get the options as well as the option count, or do I have to roll my own function?
I have to consider all the different ways options can be arranged and be careful not to mistake an option for a file name or directory name. It seems like the cleanest solution is to separate the options from the file arguments from the start. Is there an idiomatic way to do this, or are there standard library calls that do this?

There is no built-in argument parsing help, but getopt is the "standard" method for argument parsing.
For simple apps, I sometimes roll my own with something like:
int pos=0;
argc--;argv++;
while (argc > 0) {
if (*argv[0]=='-') {
switch ((*argv)[1]) {
case 'l': //-l argument
save_option_l(++argv);
argc--; //we consumed one name
break;
//... other -options here ...
default:
usage("unrecognized option %s", *argv);
}
}
else {
save_positional_argument(argv,pos++);
}
argv++;
argc--;
}
In this case, I require the modifiers to directly follow the flags. Don't support variable usage like your first example, unless there are very strong reasons to do so.

If you have Gnu's implementation of getopt, it will do all that for you.
Posix standard getopt terminates option processing when it hits the first non-option argument. That conforms to Posix guidelines for utility argument parsing, and many of us prefer this behaviour. But others like the ability to intermingle options and non-options, and that's the norm for Gnu utilities unless you set an environment variable with the ungainly name POSIXLY_CORRECT.
Consistent with that preference, Gnu getopt parses arguments:
The default is to permute the contents of argv while scanning it so that eventually all the non-options are at the end. This allows options to be given in any order, even with programs that were not written to expect this.
Note the wording about permuting arguments. This means that if you start with
ls somefile -lt anotherFile
Gnu getopt will:
Report a l
Report a t
Report end of options (-1), leaving optind with the value 2 and argv now looking like:
ls -lt somefile anotherFile
So now you can process your non-option arguments with:
for (int argno = optind; argno < argc; ++argno) {
/* Do something with argv[argno] */
}
Also, you can tell how many non-option arguments you received with argc-optind, and if argc == optind, you know there weren't any.
Unbundling -lt into two options is standard Posix getopt behaviour. You can combine options like that as long as none of them except the last takes an argument.

Related

How do you use optional and non optional arguments?

I understand how to use getopt to accept command-line arguments like
./program -a yes -b no
What I am currently trying to do is accept command-line arguments where some are optional and some are not.
For example:
./program argv[1] argv[2] -a yes -b no
Putting options after multiple optional arguments is probably a bad idea; don't design the command line syntax that way, if you can help it.
That said, you can parse the arguments yourself outside of getopt until you see something which looks like an option (while incrementing argv and decrementing argc). Then use getopt for the remainder of the command line from that point on.
Pseudo-code:
for (; *argv; argc--, argv++) {
if (argv looks like an option)
break;
process *argv somehow
}
now process with getopt(argc, argv, ...)

Command line arguments with spaces to a C program through shell-wrapper script

what does it take to make my program accept command-line arguments with spaces?
Yet-another EDIT: I have just recognized that the program is started from a shell-script that sets up the environment for the execution of the program. As there are some external libraries, LD_LIBRARY_PATH is set to the current working directory:
#!/bin/sh
ARCH=`uname -m`
export LD_LIBRARY_PATH=".:lib/magic/linux-${ARCH}"
./data_sniffer.bin $*
exit $?
The issue is definitely related to $*. Is there any solution to correctly forward the command-line parameters?
Here is a code-snippet from main():
if (stat(argv[1], &lStat) != 0)
{
fprintf(stderr, "Cannot stat(2) given file: %s. \n", argv[1]);
return EXIT_FAILURE;
}
I am starting my program with the following parameters:
./data_sniffer /mnt/pod/movies/some\ test\ movie\ file.avi
The resulting error message looks like this:
Cannot stat(2) given file: /mnt/pod/movies/some.
Anyone an idea what's wrong here? I think that I am not the first one with this problem (though, I could not find a related question here).
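Replace the use of ./data_sniffer.bin $* with ./data_sniffer.bin "$@" in your wrapper script and the arguments should be forwarded correctly.
Unquoted $* and $@ are split into words again by the shell, so an argument containing spaces arrives as several arguments, whereas "$@" expands to each original argument as a separate, intact word.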
The author of this question changed it completely and came forward with more information regarding the matter. I will let everything already written stand, but please remember that it was written before the last edit.
Regarding argv..
It doesn't require much from you as a developer, I'm tempted (and it's much more truthful) to say that it requires nothing of you.
Arguments passed to the application are handled by the shell executing the binary. Doing the following should definitely work the way you want it to (even though I find it odd that you are claiming that the current shell doesn't handle '\ ' correctly):
./data_sniffer '/mnt/pod/movies/some test movie file.avi'
./data_sniffer /mnt/pod/movies/some\ test\ movie\ file.avi
# there should be no difference between the two
My recommendation(s)
Have you tried doing printf ("argv[1]: %s\n", argv[1]); in the beginning of main to validate the contents of it?
Are you sure that you are invoking the correct binary with the correct command-line arguments?
Sadly the only reasonable thing to write is that you are doing something wrong. We are however unable to answer what without further information regarding the issue.
I find it very hard to believe that there is a bug in your shell, even though that is of course possible - I doubt it.
Parsing the command line into argv isn't something that you as a developer should worry about; there is no functionality that you have to implement for the binary itself to handle spaces in its argument(s).
In my CentOS 5.3, I made a test.
#include <stdio.h>
int main(int argc, char** argv)
{
printf("argc=%d\n", argc);
int i = 0;
for(i = 0; i<argc; i++)
{
printf("argv[%d]=%s\n", i, argv[i]);
}
return 0;
}
And then run it with different parameters:
[root@dw examples]# ./a.out a b c
argc=4
argv[0]=./a.out
argv[1]=a
argv[2]=b
argv[3]=c
[root@dw examples]# ./a.out 'a b c'
argc=2
argv[0]=./a.out
argv[1]=a b c
[root@dw examples]# ./a.out a\ b\ c
argc=2
argv[0]=./a.out
argv[1]=a b c
[root@dw examples]# ./a.out "a b c"
argc=2
argv[0]=./a.out
argv[1]=a b c
So looks like (1) 'a b c' ; (2) a\ b\ c; (3) "a b c" all are working.

Getopt - filename as argument

Let's say I made a C program that is called like this:
./something -d dopt filename
So -d is a command, dopt is an optional argument to -d and filename is an argument to ./something, because I can also call ./something filename.
What is the getopt form to get the filename?
Use optstring "d:"
Capture -d dopt with optarg in the usual way. Then look at optind (compare it with argc), which tells you whether there are any non-option arguments left. If so, your filename is the first of these.
getopt doesn't specifically tell you what the non-option arguments are or check the number. It just tells you where they start (having first moved them to the end of the argument array, if you're in GNU's non-strict-POSIX mode)
Check-out how grep does it. At the end of main() you'll find:
if (optind < argc)
{
do
{
char *file = argv[optind];
// do something with file
}
while ( ++optind < argc);
}
optind is the index in argv of the next element to be processed; once getopt has returned -1, that is the first non-option argument. So this conditional/loop construct can handle all of the files listed by the user.

C getopt multiple value

My argument is like this
./a.out -i file1 file2 file3
How can I utilize getopt() to get 3 (or more) input files?
I'm doing something like this:
while ((opt = getopt(argc, argv, "i:xyz..")) != -1) {
switch (opt) {
case 'i':
input = optarg;
break;
...
}
}
I get just the file1; how to get file2, file3?
I know this is quite old but I came across this in my search for a solution.
while((command = getopt(argc, argv, "a:")) != -1){
switch(command){
case 'a':
(...)
optind--;
for( ;optind < argc && *argv[optind] != '-'; optind++){
DoSomething( argv[optind] );
}
break;
} /* end of switch */
} /* end of while */
I found that optind (an extern int used by getopt()) points to the position just after the argv element that getopt() has selected as the current option's argument.
That's why I decrement it at the beginning: the loop then starts at the first value, the one getopt() stored in optarg.
The first part of the for condition checks that the current index is within the bounds of argv (argc is the length of the array, so the last position in argv is argc-1).
The second part of the && checks whether the argument's first character is '-'. If it is, we have run out of values for the current option; otherwise argv[optind] is our next value. This continues until argv is exhausted or the option runs out of values.
Each iteration increments optind to move on to the next element of argv.
Note that because 'optind < argc' is checked first, the second part of the condition is not evaluated unless the first part is true, so there is no risk of reading outside the array bounds.
PS I am quite a new C programmer; if someone has improvements or a critique, please share.
If you must, you could start at argv[optind] and increment optind yourself. However, I would recommend against this since I consider that syntax to be poor form. (How would you know when you've reached the end of the list? What if someone has a file named with a - as the first character?)
I think that it would be better yet to change your syntax to either:
./a.out -i file1 -i file2 -i file3
Or to treat the list of files as positional parameters:
./a.out file1 file2 file3
Note that glibc's nonconformant argument permutation extension will break any attempt to use multiple arguments to -i in this manner. And on non-GNU systems, the "second argument to -i" will be interpreted as the first non-option argument, halting any further option parsing. With these issues in mind, I would drop getopt and write your own command line parser if you want to use this syntax, since it's not a syntax supported by getopt.
I looked and tried the code above, but I found my solution a little easier and worked better for me:
The handling I wanted was:
-m mux_i2c_group mux_i2c_out
(2 arguments required).
Here's how it panned out for me:
case 'm':
mux_i2c_group = strtol(optarg, &ch_p, 0);
if (optind < argc && *argv[optind] != '-'){
mux_i2c_out = strtol(argv[optind], NULL, 0);
optind++;
} else {
fprintf(stderr, "\n-m option require TWO arguments <mux_group> "
"<mux_out>\n\n");
usage();
}
use_mux_flag = 1;
break;
This grabbed the first value for me as normal and then just looked for the second, REQUIRED value.
The solution by GoTTimw has proven very useful to me. However, I would like to mention one more idea, that has not been suggested here yet.
Pass arguments as one string in this way.
./a.out -i "file1 file2 file3"
Then you get one string as a single argument and you only need to split it by space.
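A possible way to do the splitting is strtok, sketched below. split_on_spaces is a made-up helper; it modifies the string in place (the strings argv points to may be modified, so passing optarg is fine), and it cannot cope with file names that themselves contain spaces.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split list on spaces, storing pointers to the pieces in items[].
 * The string is modified in place (strtok writes '\0' over separators).
 * Returns the number of items stored, at most max_items.
 */
int split_on_spaces(char *list, char *items[], int max_items)
{
    int count = 0;
    char *tok = strtok(list, " ");

    while (tok != NULL && count < max_items) {
        items[count++] = tok;
        tok = strtok(NULL, " ");
    }
    return count;
}
```

Inside the option loop this would be called as split_on_spaces(optarg, files, MAX_FILES) in the case 'i' branch, with files and MAX_FILES being whatever array and limit your program uses.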

command line processing library - getopt

Can someone help me with the getopt function?
When I do the following in main:
char *argv1[] = {"testexec","-?"};
char *argv2[] = {"testexec","-m","arg1"};
int cOption;
/* test for -? */
setvbuf(stdout,(char*)NULL,_IONBF,0);
printf("\n argv1 ");
while (( cOption = getopt (2, argv1, "m:t:n:fs?")) != -1) {
switch(cOption){
case 'm':
printf("\n -m Arg : %s \n",optarg);
break;
case '?':
printf("\n -? Arg ");
break;
case 'n':
printf("\n -n Arg : %s \n",optarg);
break;
}
}
printf("\n argv2 ");
while (( cOption = getopt (3, argv2, "m:t:n:fs?")) != -1) {
switch(cOption){
case 'm':
printf("\n -m Arg : %s \n",optarg);
break;
case '?':
printf("\n -? Arg : %s \n",optarg);
break;
case 'n':
printf("\n -n Arg : %s \n",optarg);
break;
}
}
I'm running this code on rhel3 which uses old libc version. I don't know which one to be exact.
Now the problem is getopt doesn't work the second time with argv2.
But if I comment out the first getopt call with argv1 , it works.
Can someone tell me what am I doing wrong here?
argv1 and argv2 must end in 0 (a null pointer):
char* argv1[] = {"par1", "par2", 0};
Edit: OK, I read the getopt man page and I found this:
The variable optind is the index of the next element to be processed in argv. The system initializes this value
to 1. The caller can reset it to 1 to restart scanning of the same argv, or when scanning a new argument vector.
So, making optind=1 between the two calls at getopt makes it work as expected.
The getopt() function uses some global variables, like optind and optarg, to store state information between calls. After you finish processing one set of options, there is data left in those variables that is causing problems with the next set of options. You could potentially try to reset getopt's state between calls by clearing the variables, but I'm not sure that would work since the function might use other variables which aren't documented and you'd never know if you'd gotten them all; besides, it would be absolutely nonportable (i.e. if the implementation of getopt() changes, your code breaks). See the man page for details. Best not to use getopt() for more than one set of arguments in a given program if you can help it.
I'm not sure if there is an actual function to reset getopt's state (or perhaps a reentrant version of the function, which lets you store the state in your own variables)... I seem to remember seeing something like that once, but I can't find it now that I look :-/
As stated in the man page:
"A program that scans multiple argument vectors, or rescans the same vector more than once, and wants to make use of GNU extensions such as '+' and '-' at the start of optstring, or changes the value of POSIXLY_CORRECT between scans, must reinitialize getopt() by resetting optind to 0, rather than the traditional value of 1. (Resetting to 0 forces the invocation of an internal initialization routine that rechecks POSIXLY_CORRECT and checks for GNU extensions in optstring.)"
Is there any reason why you are not using getopt_long() instead? On most platforms, getopt() just calls _getopt_long() with a switch to disable long arguments. That's the case with almost every platform that I know of (still in use), including Linux, BSD and even emerging OSes like HelenOS. I know, because I was the one who ported getopt to its libc :)
It is much easier on ANYONE using your program to have long options at least until they get used to using it.
getopt_long() will allow you to use two (or more) option indexes that can stay 'live' after they are done processing arguments, only the internal (global, non-reentrant) one would have to be re-set which is no big deal.
This lets you easily compare the argument count to the number of options actually passed in both invocations with many other benefits .. please consider not using the antiquated interface.
Look at getopt.h, you'll see what I mean.
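For reference, a small getopt_long sketch follows. It reuses the -m and -f options from the question; the long names --mode and --force, struct test_options and parse_long_options are all invented for the illustration. getopt_long itself is an extension declared in getopt.h on GNU and BSD systems rather than part of POSIX.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical result record for this example. */
struct test_options {
    const char *m_arg;  /* argument of -m / --mode, or NULL */
    int force;          /* set by -f / --force */
};

/*
 * Returns the index of the first non-option argument, or -1 on error.
 */
int parse_long_options(int argc, char *argv[], struct test_options *out)
{
    /* Each long option maps onto the same value as its short form. */
    static const struct option long_opts[] = {
        {"mode",  required_argument, NULL, 'm'},
        {"force", no_argument,       NULL, 'f'},
        {NULL,    0,                 NULL, 0}
    };
    int opt;

    out->m_arg = NULL;
    out->force = 0;
    optind = 1;

    while ((opt = getopt_long(argc, argv, "m:f", long_opts, NULL)) != -1) {
        switch (opt) {
        case 'm': out->m_arg = optarg; break;
        case 'f': out->force = 1;      break;
        default:  return -1;  /* diagnostic already printed */
        }
    }
    return optind;
}
```

With this, testexec --mode arg1 -f and testexec -m arg1 --force are handled by the same switch.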
