Proper getopt usage - c

Here there's an example on how to use GNU getopt. I understand most parts of the code but I have some questions:
Why is ctype lib included?
I mean unistd.h is needed for getopt, stdlib.h is needed for abort and stdio.h is the standard lib for io.
On the default case why do we use abort? cant' we just use return 1;?
I would like someone to share a link with more details on optopt, optind and optarg if possible.

ctype is needed for isprint()
not sure - it might be slightly useful if you're not doing this in the main function, but printing an error then calling exit(1) is probably better; I guess this is just a minimal example, and one line was shorter than two
Here are several links you might find useful:
FreeBSD man page for getopt(3)
Open Group UNIX Specification for getopt(3)
In brief:
optopt, optind, and optarg are external symbols. They're global, and are declared in unistd.h.
optopt (opt = option) isn't usually needed. It should be the same as the value returned by calling getopt.
optarg (arg = argument) is easy. It's for the argument to a flag. e.g. if -f is an option that requires a filename argument (optstring contains f:), you can do something like
case 'f':
filename = optarg;
break;
optind (ind means index) tells you where option process finished after the end of your while (flag = getopt...) block.
e.g. before you add option handling, your script might look like this
// print command line arguments, start at 1 to skip the program name
for (int i = 1; i < argc; i++) {
printf("arg[%d]=%s\n", i, argv[i]);
}
after adding your getopt block to handle options, you can do
// print command line arguments remaining after option processing
for (int i = optind; i < argc; i++) {
printf("arg[%d]=%s\n", i, argv[i]);
}
or
// skip command line options
argc -= optind; argv += optind;
// print command line arguments
for (int i = 0; i < argc; i++) {
printf("arg[%d]=%s\n", i, argv[i]);
}
If you don't have any required command line arguments, then you don't have to worry about optind.

Related

How to get the arguments from getopt to work together

I want to be able to use every argument (-S, -s, -f) and them be able to be used together. -S prints the files in the folder and their size... -s prints the files if they are >= the file size provided by the argument -f finds all the files with the given substring.
How would I get these to work together? Right now, my code does all of this separately.
while((c = getopt(argc, argv, "Ss:f:")) != -1){
switch(c){
case 'S':
// No clue how to make them work together.
printf("Case: S\n");
printf("option -%c with argument '%s'\n", c, argv[optind]);
printDIR(cwd, case_S);
break;
case 's':
printf("Case: s\n");
printf("option -%c with argument '%s'\n", c, optarg);
printDIR(cwd, case_s);
break;
case 'f':
printf("Case: f\n");
printf("option -%c with argument '%s'\n", c, optarg);
printDIR(cwd, case_f);
break;
default:
printf("...");
}
}
printDIR is a pointer function which is why I have cwd(which is the directory) and case_S and so on.
I want to be able to say... './search -S -s 1024 -f tar'. This should recursively search the current directory and print the size of the file if it is >= 1024 and if the file has the substring 'tar' in it. But I also want it to work even if I don't provide all arguments.
This is my first time trying anything like this so I'm new to trying to make UNIX commands and using getopt args.
Converting parts of some of my comments into an answer.
You should process the options without doing any actions. Only when you've finished processing the options, with no errors, do you think about doing anything like calling printDIR(). You'll probably need more arguments to the function, or use global variables.
You'd have a flag such as:
int recursive = 0;
which you would set to 1 if the search was to be recursive. And int minimum_size = 0; and modify it with -s 1024. And const char *filter = ""; and then modify that with -s tar. Etc. Often, these are global variables — but if you can avoid that by passing them to the function, that is better.
Your function might then become:
int printDIR(const char *cwd, int recursive, int minimum, const char *filter);
and you'd call it with the appropriately set local variables. Note that you should check the conversion from string to integer before calling printDIR().
If there are non-option arguments, you'd process them after the option handling with:
for (int i = optind; i < argc; i++)
printDIR(argv[i], recursive, minimum_size, filter);

Seperating options from non-option arguments in a command-line program

I am trying to write a limited version of ls w/ some options.
However, I am stuck on the problem of parsing out my options from my arguments in a clean manner.
For example:
$ ls -l -t somefile anotherFile
$ ls somefile -lt anotherFile
have the same behavior.
This poses two problems for me:
It makes using argc a bit more difficult. For example I would consider the arguments ls -lt and ls to both have 0 arguments (other than the name of the command) however argc counts -l as an argument.
Therefore the naive implementation of :
if( argc == 1) {list all the contents of cwd}
does not work.
Is there a built-in way to get the options as well as the option count, or do I have to roll my own function?
I have to consider all the different ways options can be arranged and be careful not to get an option mixed up as a file name or directory name. It seems like the cleanest solution is to separate the options from the file arguments from the start. Is there an idiomatic way to do this / is there standard library calls that do this?
There is no built-in argument parsing help, but getopt is the "standard" method for argument parsing.
For simple apps, I sometimes roll my own with something like:
int pos=0;
argc--;argv++;
while (argc > 0) {
if (*argv[0]=='-') {
switch ((*argv)[1]) {
case 'l': //-l argument
save_option_l(++argv);
argc--; //we consumed one name
break;
//... other -options here ...
default:
usage("unrecognized option %s", *argv);
}
}
else {
save_positional_argument(argv,pos++);
}
argv++;
argc--;
}
In this case, I require the modifiers to directly follow the flags. Don't support variable usage like your first example, unless there are very strong reasons to do so.
If you have Gnu's implementation of getopt, it will do all that for you.
Posix standard getopt terminates option processing when it hits the first non-option argument. That conforms to Posix guidelines for utility argument parsing, and many of us prefer this behaviour. But others like the ability to intermingle options and non-options, and that's the norm for Gnu utilities unless you set an environment variable with the ungainly name POSIXLY_CORRECT.
Consistent with that preference, Gnu getopt parses arguments:
The default is to permute the contents of argv while scanning it so that eventually all the non-options are at the end. This allows options to be given in any order, even with programs that were not written to expect this.
Note the wording about permuting arguments. This means that if you start with
ls somefile -lt anotherFile
Gnu getopt will:
Report a l
Report a t
Report end of options (-1), leaving optind with the value 2 and argv now looking like:
ls -lt somefile anotherFile
So now you can process your non-option arguments with:
for (int argno = optind; argno < argc; ++argno) {
/* Do something with argv[argno] */
}
Also, you can tell how many non-option arguments you received with argc-optind, and if argc == optind, you know there weren't any.
Unbundling -lt into two options is standard Posix getopt behaviour. You can combine options lime that as long as the first one doesn't take an argument.

How does "optind" get assigned in C?

I am creating this question because there is not much about how this optind gets assigned for each loop.
Man page says :
The variable optind is the index of the next element to be processed in argv. The system initializes this value to 1.
Below, I have a simple code I got from Head First C and in the code we subtract "optind" from "argc" and we get the number of leftover arguments, which will we use then to print leftover arguments as "Ingredients".
#include <unistd.h>
#include <stdio.h>
int main(int argc, char* argv[]) {
char* delivery = "";
int thick = 0 ;
int count = 0;
char ch;,
for(int i = 0; i < argc;i++){
//This is , to show the whole array and their indexes.
printf("Argv[%i] = %s\n", i, argv[i]);
}
while((ch = getopt(argc, argv, "d:t")) != -1 ){
switch(ch) {
case 'd':
printf("Optind in case 'd' : %i\n",optind);
delivery = optarg;
break;
case 't':
printf("Optind in case 't' : %i\n",optind);
thick = 1;
break;
default:
fprintf(stderr,"Unknown option: '%s'\n", optarg); // optional argument.
return 1;
}
}
argc -= optind;
argv += optind;
printf("Optind : %i and Argc after the subctraction : %i\n",optind,argc);
if(thick)
puts("Thick crust");
if(delivery[0]){
printf("To be delivered %s\n", delivery);
}
puts("Ingredients:");
for(count = 0; count < argc ; count ++){
puts(argv[count]);
}
return 0;
}
So at the beginning of the code the for loop writes all the array and its indexes to see the difference.
Then I run the code with :
./pizzaCode -d now Anchovies Pineapple -t //-t is intentionally at the end
I was told that if the flag was at the end it wouldn't get in the 't' case but somehow it works on my ubuntu. That is another thing I wonder but not the main question.
So the output is as follows :
Argv[0] = ./pizzaCode
Argv[1] = -d
Argv[2] = now
Argv[3] = Anchovies
Argv[4] = Pineapple
Argv[5] = -t
Optind in case 'd' : 3
Optind in case 't' : 6
Optind : 4 and Argc after the subctraction : 2
Thick crust
To be delivered now
Ingredients:
Anchovies
Pineapple
1- Everything is fine so far, the problem is how come argv[0] and argv1 became Anchovies and Pineapple ?
2- And another question is how did optind become 3 in case 'd'? Since 'd's index is 1 and the next index is 2.
3- How did optind become 4 after the loop ? It was 6 in the case 't'.
I hope my question is clear for you all, I am just trying to understand the logic instead of having to memorize it.
Thank you in advance!
The manpage for Gnu getopt documents this non-standard implementation:
By default, getopt() permutes the contents of argv as it scans, so that eventually all the nonoptions are at the end.
This is actually not quite true; the permutation occurs after the last option is scanned, as you have seen in your example. But the effect is the same; argv is permuted so that nonoptions are at the end, and optind is modified to index the first nonoption.
If you want to avoid the permutation, so that getopt behaves as per Posix:
If the first character of optstring is '+' or the environment variable POSIXLY_CORRECT is set, then option processing stops as soon as a nonoption argument is encountered.
In this case, no permuting is done and optind's value is preserved.
Setting POSIXLY_CORRECT has other consequences, documented here and there in the manpages for various Gnu utilities. My habit is to use + as the first character of the option string (unless I actually want non-Posix behaviour), but arguably setting the environment variable is more portable.
For your specific questions:
Why are the non-option arguments at argv[0] and argv[1]?
Because you modified argv: argv += optind;
Why is optind 3 in the loop processing option -d?
Because that option takes an argument. So the next argument is the one following the now argument, which has already been processed (by placing a pointer to it into optarg).
How did optind become 4?
As above, it was modified after the argv vector was permuted, in order for optind to be the index of the first "unprocessed" non-option argument.

getopt with repeated and optional arguments

For a simple C project of a filesystem in a file, I have to make a command for writing the partitions table. It just contains the number of partitions and the size of them, pretty simple.
It should work like mk_part -s size [-s size ...] [name].
[name] is the filename of the disk, it's optionnal because there is a default one provided.
I don't know much getopt_long (and getopt) but all I read is that I while get options in a while so the two way of processing for me would be :
store all the sizes in an array and then write them in the table.
write size directly during parsing
For the first choice the difficulty is that I don't know the number of partitions. But I still could majorate this number by argc or better by (argc-1)/2 and it would work.
For the second choice I don't know which file to write.
So what is the best alternative to get all those arguments and how can I get this optionnal name ?
getopt can handle both repeated and optional args just fine. For repeated args each invocation of getopt will give you the next arg. getopt doesn't care that it is repeated. For the arg at the end, just need to check for its presence once all the options are parsed. Below is code modified from the example in the getopt man page to handle your scenario:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int
main(int argc, char *argv[])
{
int opt;
while ((opt = getopt(argc, argv, "s:")) != -1) {
switch (opt) {
case 's':
printf("size=%d\n", atoi(optarg));
break;
default: /* '?' */
exit(EXIT_FAILURE);
}
}
if (optind < argc) {
printf("name=%s\n", argv[optind]);
} else {
printf("optional name arg not present\n");
}
exit(EXIT_SUCCESS);
}
And here is some sample runs of the program showing it handling repeated options and the arg at the end.
$ ./a.out -s 10 -s 20 -s 30
size=10
size=20
size=30
optional name arg not present
$ ./a.out -s 1 my_name
size=1
name=my_name
I think you're overthinking this. I know it's always tempting to try to avoid malloc's, but the efficiency of option parsing is never (*) important. You only parse options once, and in the cost of initializing a new process, finding the executable, linking and loading it, and all the rest of the process of starting up a command, the time it takes to parse options is probably not even noise.
So just do it in the simplest way possible. Here's one possible outline:
int main(int argc, char* argv) {
/* These variables describe the options */
int nparts = 0; // Number of partitions
unsigned long* parts = NULL; // Array of partitions (of size nparts)
const char* diskname="/the/default/name"; // Disk's filename
for (;;) {
switch (getopt(argc, argv, "s:")) {
case '?':
/* Print usage message */
exit(1);
case 's':
/* Some error checking missing */
parts = realloc(parts, ++nparts * sizeof *parts);
parts[nparts - 1] = strtoul(optarg, NULL, 0);
continue;
case -1:
break;
}
break;
}
if (optind < argc) diskname = argv[optind++];
if (optind != argc) {
/* print error message */
exit(1);
}
return do_partitions(diskname, parts, nparts);
}
The above code is missing a lot of error checking and other niceties, but it's short and to the point. It simply reallocs the partition array every time a new size is found. (That's probably not as awful as you think it is, because realloc itself is probably clever enough to increase the allocation size exponentially. But even if it were awful, it's not going to happen often enough to even notice.)
The trick with continue and break is a common way of nesting a switch inside a for. In the switch, continue will continue the for loop, while break will break out of the switch; since all the switch actions which do not terminate the for loop continue, whatever follows the switch block is only executed for a switch action which explicitly breaks. So the break following the switch block breaks the for loop in precisely those cases where the switch action did a break.
You might want to check that there was at least one partition size defined before call the function which does the repartitioning.

how to parse command line arguments in C?

I am trying to parse command line arguments in C. Currently, I am using getopt do the parse. I have something like this:
#include <unistd.h>
int main(int argc, char ** argv)
{
while((c=getopt(argc, argv, "abf:")) != -1)
{
switch(c)
{
case 'a':
break;
case 'b':
break;
case 'f':
puts(optarg);
break;
case ':':
puts("oops");
break;
case '?'
puts("wrong command");
break;
}
}
}
then need to use ./a.out -fto run the program, and -f is the command element, but looks like -f must start with a '-', if I do not want the command element starts with '-', i.e, using ./a.out f instead of ./a.out -f, how to achieve it?
if getopt does not support parsing a command line in this way, are there any other library to use in C?
The argc and argv variables give you access to what you're looking for. argc is "argument count" and argv is "argument vector" (array of strings).
getopt is a very useful and powerful tool, but if you must not start with a dash, you can just access the argument array directly:
int main( int argc, char** argv) {
if( argc != 1) { /* problem! */ }
char * argument = argv[1]; // a.out f ... argv[1] will be "f"
}
You could use (on Linux with GNU libc) for parsing program arguments:
getopt with getopt_long; you might skip some arguments using tricks around optind
argp which is quite powerful
and of course you could parse program arguments manually, since they are given thru main(int argc, char**argv) on Linux (with the guarantee that argc>0, that argv[0] is "the program name" -e.g. to find it in your $PATH when it contains no / ..., that argv[argc] is the NULL pointer, and that before that every argv[i] with i<argc and i>0 is a zero-terminated string. See execve(2) for more.
GNU coding standards: command line interfaces document quite clearly some conventions. Please, obey at least the --help and --version conventions!
You might also be concerned by customizing the shell auto-completion facilities. GNU bash has programmable completion. zsh has a sophisticated completion system.
Remember that on Posix and Linux the globbing of command words is done by the shell before starting your program. See glob(7).
The getopt library will stop parsing at the first non-option argument. For command-based programs, this will be at the command name. You can then set optind to the index to start at and run getopt again with the command-specific arguments.
For example:
// general getopts
if (optind >= argc) return 0; // error -- no command
if (strcmp(argv[optind], "command") == 0)
{
++optind; // move over the command name
// 'command'-specific getopts
if (optind >= argc) return 0; // error -- no input
}
This should allow git-like command-line parsing.

Resources