Parsing optional command line arguments in C

I have a program that takes in optional arguments. The necessary arguments are a file and integers (1 or more). The optional arguments are a mix of strings and integers.
So a correct input on the command line could be:
./main trace_file 8 12 # (only necessary arguments)
./main -n 3000000 -p page.txt trace_file 8 7 4 # (with optional arguments)
I need to get the integers after trace_file into an array. I'm having trouble figuring out how to do this when the optional arguments are enabled, because another integer is on the command line. A push in the right direction would be greatly appreciated, because I cannot figure out how to do this.
EDIT:
so far, all I have for parsing the arguments is this:
for(j=2, k=0; j<argc; j++, k++) {
    shift += atoi(argv[j]);
    shiftArr[k] = 32 - shift;
    bitMaskArr[k] = (int)(pow(2, atoi(argv[j])) - 1) << (shiftArr[k]);
    entryCnt[k] = (int)pow(2, atoi(argv[j]));
}
But this will only work when no optional arguments are entered.

I don't see any major problems if you use a reasonably POSIX-compliant version of getopt().
Source code (goo.c)
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
/*
    ./main trace_file 8 12                          # (only necessary arguments)
    ./main -n 3000000 -p page.txt trace_file 8 7 4  # (with optional arguments)
*/
static void usage(const char *argv0)
{
    fprintf(stderr, "Usage: %s [-n number][-p pagefile] trace n1 n2 ...\n", argv0);
    exit(EXIT_FAILURE);
}
int main(int argc, char **argv)
{
    int number = 0;
    char *pagefile = "default.txt";
    char *tracefile;
    int opt;
    /* "n:p:" means both -n and -p take an argument, delivered in optarg */
    while ((opt = getopt(argc, argv, "n:p:")) != -1)
    {
        switch (opt)
        {
        case 'p':
            pagefile = optarg;
            break;
        case 'n':
            number = atoi(optarg);
            break;
        default:
            usage(argv[0]);
        }
    }
    /* After the loop, optind indexes the first non-option argument.
       The trace file plus at least one integer are required. */
    if (argc - optind < 2)
    {
        fprintf(stderr, "%s: too few arguments\n", argv[0]);
        usage(argv[0]);
    }
    tracefile = argv[optind++];
    printf("Trace file: %s\n", tracefile);
    printf("Page file: %s\n", pagefile);
    printf("Multiplier: %d\n", number);
    for (int i = optind; i < argc; i++)
        printf("Processing number: %d (%s)\n", atoi(argv[i]), argv[i]);
    return 0;
}
Compilation
$ gcc -O3 -g -std=c11 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
> -Wold-style-definition -Werror goo.c -o goo
Example runs
$ ./goo trace_file 8 12
Trace file: trace_file
Page file: default.txt
Multiplier: 0
Processing number: 8 (8)
Processing number: 12 (12)
$ ./goo -n 3000000 -p page.txt trace_file 8 7 4
Trace file: trace_file
Page file: page.txt
Multiplier: 3000000
Processing number: 8 (8)
Processing number: 7 (7)
Processing number: 4 (4)
$
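The question asked for the trailing integers to end up in an array. A minimal sketch of that step follows, assuming the caller passes optind (after the trace file has been consumed) as the starting index. The helper name collect_ints and its parameters are made up for illustration and are not part of the answer above; it uses strtol() rather than atoi() so that malformed numbers are rejected.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert argv[first] .. argv[argc-1] to ints and
 * store them in out[]. Returns the number of values stored, or -1 if an
 * argument is not a valid int or more than max values are supplied. */
static int collect_ints(int argc, char **argv, int first, int *out, int max)
{
    assert(argv != NULL && out != NULL);
    int count = 0;
    for (int i = first; i < argc; i++)
    {
        char *end;
        errno = 0;
        long value = strtol(argv[i], &end, 10);
        /* Reject empty strings, trailing junk, overflow, and too many values. */
        if (end == argv[i] || *end != '\0' || errno == ERANGE ||
            value < INT_MIN || value > INT_MAX || count >= max)
            return -1;
        out[count++] = (int)value;
    }
    return count;
}
```

In goo.c this could be called as collect_ints(argc, argv, optind, array, array_size) right after tracefile = argv[optind++].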

I've given this a bit of thought and there is no simple way for getopt to return multiple arguments for one option. I did wonder about defining "y::", but have dismissed this. The options are:
1) Have two options, y and Y, one for each int, plus a boolean flag to catch the case where y is given but Y isn't.
2) Leave complex options like y out of the getopt cycle. Once getopt has processed the options and shuffled the arguments, pre-process the remaining arguments to capture the -y operands, and have the code that processes the arguments skip -y and its operands.
3) It's more usual across *nix commands to provide multiple values as a single argument which itself is a comma-separated list of values. You achieve this by adding "y:" to your getopt processing. The string that optarg points to then needs to be parsed into the two tokens A and B from the string A,B.
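Option 3 can be sketched as follows. This is only an illustration under the assumption that both values are integers; the helper name parse_pair is hypothetical, and the string passed in would be optarg from the "y:" case.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: split a string of the form "A,B" into two longs.
 * Returns 0 on success, -1 if the string is not exactly two
 * comma-separated integers. */
static int parse_pair(const char *text, long *a, long *b)
{
    assert(text != NULL && a != NULL && b != NULL);
    char *end;
    long first = strtol(text, &end, 10);
    if (end == text || *end != ',')
        return -1;                 /* no first number, or no comma after it */
    const char *rest = end + 1;
    long second = strtol(rest, &end, 10);
    if (end == rest || *end != '\0')
        return -1;                 /* no second number, or trailing junk */
    *a = first;
    *b = second;
    return 0;
}
```

Inside the getopt loop, the 'y' case would then call parse_pair(optarg, &a, &b) and print the usage message if it returns -1.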

If you can't use getopt() or some other function that does the hard work for you, then a possible strategy would be:
Create new variables myargc, myargv, and copy argc and argv into them.
Write functions that deal with paired arguments (like "-n 300000" or "-p page.txt"). Pass myargc and myargv into these functions by reference. Each such function returns the value associated with the argument (e.g., 300000 or page.txt) if it was found, or an invalid value (e.g., -1 or "") if it wasn't. If the pair was found, the function also removes both items from myargv and decrements myargc by 2.
If you also have arguments that are just individual flags, write functions for those that return a boolean indicating whether the flag was found, remove the flag from myargv, and decrement myargc by 1. (You could deal with trace_file this way, even if it's not optional. I'm assuming that trace_file is just a flag, and not related to the 8 that follows it.)
Call the functions for each of your optional arguments first. The stuff that's left in myargc and myargv after you've called them all should just be your required arguments, and you can process them as you normally would.
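A minimal sketch of one such paired-argument function is shown here. The name take_option is hypothetical, and the code assumes the array is NULL-terminated the way argv is, so that the terminator can be shifted down along with the remaining arguments. It returns NULL, rather than "" or -1, when the option is absent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look for `name` followed by a value in
 * args[1] .. args[*count - 1]. If found, remove both entries and return
 * the value; otherwise return NULL and leave the array untouched.
 * Assumes args[*count] is NULL, as it is for argv. */
static char *take_option(int *count, char **args, const char *name)
{
    assert(count != NULL && args != NULL && name != NULL);
    for (int i = 1; i + 1 < *count; i++)
    {
        if (strcmp(args[i], name) == 0)
        {
            char *value = args[i + 1];
            /* Shift the tail down by two slots, NULL terminator included. */
            memmove(&args[i], &args[i + 2],
                    (size_t)(*count - i - 1) * sizeof(args[0]));
            *count -= 2;
            return value;
        }
    }
    return NULL;
}
```

After calling it once for "-n" and once for "-p", whatever remains from index 1 onward is the trace file followed by the integers.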

Related

argc/argv in c linked list or binary tree

So I am trying to make a linked list/binary tree and:
The user should be able to choose the data structure directly from the command line when it starts the program. This should use the argc or argv arguments to main()
How would I do this? I don't get why we shouldn't just use a switch/case statement that asks the user:
option 1: linked list
option 2: binary tree?
We didn't really cover argc and argv properly. Can anyone help?
Apparently it's a duplicate ... hmm. Well, I am asking specifically about a binary tree/linked list: how would the user tell the program which data structure to choose?

Experiment with the following skeleton program, and find out.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "Usage: %s COMMAND\n", argv[0]);
        return EXIT_FAILURE;
    }
    if (!strcmp(argv[1], "foo")) {
        printf("Doing foo.\n");
    } else
    if (!strcmp(argv[1], "bar")) {
        printf("Doing bar.\n");
    } else {
        fprintf(stderr, "Unknown command line parameter '%s'.\n", argv[1]);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
The most common way to inform the utility user as to what to do, is to run the utility without parameters, or with -h or --help as the only parameter. (Windows command-line utilities might use /? or similar.)
Let's say the user can run the compiled program, program in the following ways:
./program list
./program tree
./program -h
./program --help
./program
where the first form tells the program to use a linked list; the second form tells the program to use a tree; and the other forms just output usage, information on how to call the program:
Usage: ./program [ -h | --help ]
./program MODE
Where MODE is one of:
list Linked-list mode
tree Tree mode
Further details on what the program actually does...
You achieve this with very little code:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
enum {
    NO_MODE = 0,
    LIST_MODE,
    TREE_MODE
};
int main(int argc, char *argv[])
{
    int mode = NO_MODE;
    if (argc != 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        printf("Usage: %s [ -h | --help ]\n", argv[0]);
        printf("       %s MODE\n", argv[0]);
        printf("\n");
        printf("Where MODE is one of\n");
        printf("       list for linked list mode\n");
        printf("       tree for tree mode\n");
        printf("\n");
        printf("Further details on what the program actually does...\n");
        printf("\n");
        return EXIT_SUCCESS;
    }
    if (!strcmp(argv[1], "list"))
        mode = LIST_MODE;
    else
    if (!strcmp(argv[1], "tree"))
        mode = TREE_MODE;
    else {
        fprintf(stderr, "%s: Unknown MODE.\n", argv[1]);
        return EXIT_FAILURE;
    }
    /* mode == LIST_MODE or TREE_MODE here,
       depending on the first command line parameter.
    */
    return EXIT_SUCCESS;
}
Note that the || operator is short-circuited in C: if the left side is true, the right side is not evaluated at all. So, above, the first strcmp() check is only done when argc == 2, and the second only when argc == 2 and the first strcmp() returned nonzero (no match).
In other words, the body of the usage section is only run when argc != 2 (there are fewer than two, or more than two, command-line items, counting the program name as one), or if the sole command-line parameter matches either -h or --help.
! is the not operator in C. !x evaluates to 1 if and only if x is zero or NULL; and to 0 otherwise.
(You can confuse people by using !!x. It evaluates to zero if x is zero, and to one if x is not zero. Which is logical. It's often called the not-not operation.)
The enum is just there to remind you that magic constants are bad; it is better to use either enums or preprocessor macros (#define NO_MODE 0 and so on). It would be terribly easy to use 1 in one place to indicate tree mode, and 2 in another; such bugs are horrible to debug and demand far too much concentration from the person reading the code. So don't use magic constants; use enums or macros instead.
Above, I decided that NO_MODE has value zero, and let the compiler assign (increasing) values to LIST_MODE and TREE_MODE; consider them compile-time integer constants. (Meaning, you can use them in case labels in a switch statement.)
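As a small sketch of the point about case labels, the mode constants can drive a switch statement. The enum is given the tag mode here, and the helper names parse_mode and mode_name are made up for illustration; none of that appears in the program above, which uses an anonymous enum and a plain int.

```c
#include <assert.h>
#include <string.h>

enum mode { NO_MODE = 0, LIST_MODE, TREE_MODE };

/* Hypothetical helper: map a command-line word to a mode.
 * NO_MODE means the word was not recognised. */
static enum mode parse_mode(const char *word)
{
    assert(word != NULL);
    if (strcmp(word, "list") == 0)
        return LIST_MODE;
    if (strcmp(word, "tree") == 0)
        return TREE_MODE;
    return NO_MODE;
}

/* The enumerators are integer constants, so they work as case labels. */
static const char *mode_name(enum mode m)
{
    switch (m)
    {
    case LIST_MODE:
        return "linked list";
    case TREE_MODE:
        return "binary tree";
    default:
        return "unknown";
    }
}
```

With this split, main() would call parse_mode(argv[1]) once and the rest of the program would only ever see the named constants.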
Because strcmp() returns zero if the two strings match, !strcmp(argv[1], "baz") is true (nonzero) if and only if argv[1] is exactly the string baz. You see it all the time in real-world code when strings are compared.
If you look at my answers here, you'll very often see an if (argc ...) "usage" block in my example code. This is because even I myself will forget, often within days, exactly what the purpose of the program is. I typically have several dozen example programs on my machines I've written, and rather than looking at the sources to see if something jogs my memory, I simply run the example snippets without command-line parameters (or actually, with -h parameter, since some are filters), to see what they do. It's faster, less reading, and I'll find the relevant snippet faster.
In summary, write a usage output block in all your programs, especially when it is just a test program you won't publish anywhere. They are useful, especially when you have a library full of various code snippets (each in its own directory; I use a four-digit number and a short descriptive name) that implement interesting or useful things. It saves time and effort in the long run, and anything that lets me be efficient and lazy is good in my book.

argc = argument count, argv = array of arguments. argv[0] is the executing program. argv[1..n] are the arguments passed to the executable.
Example: I call the executable foo with 2 arguments, bar and bas:
foo bar bas
argc = 3, argv = [foo, bar, bas]

How does "optind" get assigned in C?

I am creating this question because there is not much about how this optind gets assigned for each loop.
Man page says :
The variable optind is the index of the next element to be processed in argv. The system initializes this value to 1.
Below, I have a simple code I got from Head First C and in the code we subtract "optind" from "argc" and we get the number of leftover arguments, which will we use then to print leftover arguments as "Ingredients".
#include <unistd.h>
#include <stdio.h>
int main(int argc, char* argv[]) {
    char* delivery = "";
    int thick = 0;
    int count = 0;
    int ch; /* getopt() returns int; a plain char may never compare equal to -1 */
    for(int i = 0; i < argc; i++){
        // This is to show the whole array and their indexes.
        printf("Argv[%i] = %s\n", i, argv[i]);
    }
    while((ch = getopt(argc, argv, "d:t")) != -1 ){
        switch(ch) {
        case 'd':
            printf("Optind in case 'd' : %i\n",optind);
            delivery = optarg;
            break;
        case 't':
            printf("Optind in case 't' : %i\n",optind);
            thick = 1;
            break;
        default:
            // optarg is not set for an unknown option; optopt holds the offending character.
            fprintf(stderr,"Unknown option: '%c'\n", optopt);
            return 1;
        }
    }
    argc -= optind;
    argv += optind;
    printf("Optind : %i and Argc after the subctraction : %i\n",optind,argc);
    if(thick)
        puts("Thick crust");
    if(delivery[0]){
        printf("To be delivered %s\n", delivery);
    }
    puts("Ingredients:");
    for(count = 0; count < argc ; count ++){
        puts(argv[count]);
    }
    return 0;
}
So at the beginning of the code the for loop writes all the array and its indexes to see the difference.
Then I run the code with :
./pizzaCode -d now Anchovies Pineapple -t //-t is intentionally at the end
I was told that if the flag was at the end it wouldn't get in the 't' case but somehow it works on my ubuntu. That is another thing I wonder but not the main question.
So the output is as follows :
Argv[0] = ./pizzaCode
Argv[1] = -d
Argv[2] = now
Argv[3] = Anchovies
Argv[4] = Pineapple
Argv[5] = -t
Optind in case 'd' : 3
Optind in case 't' : 6
Optind : 4 and Argc after the subctraction : 2
Thick crust
To be delivered now
Ingredients:
Anchovies
Pineapple
1- Everything is fine so far, the problem is how come argv[0] and argv[1] became Anchovies and Pineapple ?
2- And another question is how did optind become 3 in case 'd'? Since 'd's index is 1 and the next index is 2.
3- How did optind become 4 after the loop ? It was 6 in the case 't'.
I hope my question is clear for you all, I am just trying to understand the logic instead of having to memorize it.
Thank you in advance!

The manpage for GNU getopt documents this non-standard implementation:
By default, getopt() permutes the contents of argv as it scans, so that eventually all the nonoptions are at the end.
This is actually not quite true; the permutation occurs after the last option is scanned, as you have seen in your example. But the effect is the same; argv is permuted so that nonoptions are at the end, and optind is modified to index the first nonoption.
If you want to avoid the permutation, so that getopt behaves as per POSIX:
If the first character of optstring is '+' or the environment variable POSIXLY_CORRECT is set, then option processing stops as soon as a nonoption argument is encountered.
In this case, no permuting is done and optind's value is preserved.
Setting POSIXLY_CORRECT has other consequences, documented here and there in the manpages for various GNU utilities. My habit is to use + as the first character of the option string (unless I actually want non-POSIX behaviour), but arguably setting the environment variable is more portable.
For your specific questions:
Why are the non-option arguments at argv[0] and argv[1]?
Because you modified argv: argv += optind;
Why is optind 3 in the loop processing option -d?
Because that option takes an argument. So the next argument is the one following the now argument, which has already been processed (by placing a pointer to it into optarg).
How did optind become 4?
As above, it was modified after the argv vector was permuted, in order for optind to be the index of the first "unprocessed" non-option argument.

How to use command line options in a c command line tool?

I am trying to understand how to use command line options with a command line C tool and I came across this example. Can someone explain how the code flow works? I am not able to understand it. I do understand that it uses the built-in getopt() function.
The exe called is rocket_to and it has two command line options, e and a. The e option takes 4 as an argument and the a option takes Brasalia,Tokyo,London as argument.
Can someone explain how the code works?
This is the actual code:
command line:
rocket_to -e 4 -a Brasalia Tokyo London
code:
#include<unistd.h>
..
while((ch=getopt(argc,argv,"ae:"))!=EOF)
switch(ch){
..
case 'e':
engine_count=optarg;
..
}
argc -=optind;
argv +=optind;

There are many manual pages for getopt() including the POSIX specification. They describe what the getopt() function does. You can also read the POSIX Utility Conventions which describes how arguments are handled by most programs (but there are plenty of exceptions to the rules, usually because of historical, pre-POSIX precedent).
In the example outline code, the -e option takes an argument, and that is the 4 in the example command line. You can tell because of the e: in the third argument to getopt() (the colon following the letter indicates that the option takes an argument). The -a option takes no argument; you can tell because it is not followed by a colon in the third argument to getopt(). The names Brasilia, Tokyo, London are non-option arguments after the option processing is complete. They're the values in argv[0] .. argv[argc-1] after the two compound assignments outside the loop.
The use of EOF is incorrect; getopt() returns -1 when there are no more options for it to process. You don't have to include <stdio.h> to be able to use getopt().
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char **argv)
{
    int ch;
    int aflag = 0;
    char *engine_count = "0";
    while ((ch = getopt(argc, argv, "ae:")) != -1)
    {
        switch (ch)
        {
        case 'a':
            aflag = 1;
            break;
        case 'e':
            engine_count = optarg;
            break;
        default:
            fprintf(stderr, "Usage: %s [-a][-e engine] [name ...]\n", argv[0]);
            exit(EXIT_FAILURE);
        }
    }
    argc -= optind;
    argv += optind;
    printf("A flag = %d\n", aflag);
    printf("Engine = %s\n", engine_count);
    for (int i = 0; i < argc; i++)
        printf("argv[%d] = %s\n", i, argv[i]);
    return 0;
}
That is working code which, if compiled to create a program rocket_to, produces:
$ ./rocket_to -e 4 -a Brasilia Tokyo London
A flag = 1
Engine = 4
argv[0] = Brasilia
argv[1] = Tokyo
argv[2] = London
$ ./rocket_to -a -e 4 Brasilia Tokyo London
A flag = 1
Engine = 4
argv[0] = Brasilia
argv[1] = Tokyo
argv[2] = London
$ ./rocket_to -e -a 4 Brasilia Tokyo London
A flag = 0
Engine = -a
argv[0] = 4
argv[1] = Brasilia
argv[2] = Tokyo
argv[3] = London
$

From the getopt man page:
The getopt() function parses the command-line arguments. Its arguments argc and argv are the argument count and array as passed to the main() function on program invocation. An element of argv that starts with '-' (and is not exactly "-" or "--") is an option element. The characters of this element (aside from the initial '-') are option characters. If getopt() is called repeatedly, it returns successively each of the option characters from each of the option elements.
The third argument to getopt() lists the valid options. If an option character is followed by a colon, it requires an argument, which can be accessed through the optarg variable. So in your example you have two options: 'a', which takes no argument, and 'e', which takes an argument.
If getopt() finds an option, it returns the option character. When all options have been parsed it returns -1, and if it finds an unknown option it returns '?'.
So your code loops through all options and processes them in a switch statement.
Next time when you have trouble understanding something like this try to run man <unknown function> first.

How to count an argument with a space and an input as one argument in C?

I enter the following command line:
./file -a 1 -b2 -a5 -b 55 -b4
The output I get is:
a: 1
argv[1]: -a
b: 2
argv[2]: 1
a: 5
argv[3]: -b2
b: 55
argv[4]: -a5
b: 4
argv[5]: -b
Counter: 5
The output I wish to get should be:
a: 1
argv[1]: -a 1
b: 2
argv[2]: -b2
a: 5
argv[3]: -a5
b: 55
argv[4]: -b 55
b: 4
argv[5]: -b4
Counter: 5
Arguments with a space are currently counted as 2 arguments. I would like my program to count it as 1 argument only (I want it sees "-a 1". Not "-a" and "1" separately).
This is the source code that produces the output above:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getopt.h>
int main(int argc, char *argv[])
{
    int opt = 0;
    int quantum1 = 0, quantum2 = 0;
    int counter = 0;
    while ((opt = getopt(argc, argv,"a:b:")) != -1)
    {
        switch (opt)
        {
        case 'a' :
            quantum1 = atoi(optarg);
            printf("a: %d\n", quantum1);
            break;
        case 'b' :
            quantum2 = atoi(optarg);
            printf("b: %d\n", quantum2);
            break;
        default:
            printf("Error\n");
            return 1;
            break;
        }
        counter++;
        printf("argv[%d]: %s\n", counter, argv[counter]);
    }
    printf("Counter: %d\n", counter);
    return 0;
}
Note: The quotation marks suggested work, but I'm not allowed to use quotation marks or any other symbols.

Use
./file "-a 1" -b2 -a5 "-b 55" -b4
to make "-a 1" as the first argument and "-b 55" as the fourth argument.
If you are not allowed to use quotation marks, you can escape the space in Linux using:
./file -a\ 1 -b2 -a5 -b\ 55 -b4

The problem is not in the way the arguments are being parsed. The problem is the way the code displays the arguments. Specifically, when a space exists between the option and the argument, the optind will advance by two. optind is an external variable that getopt uses to keep track of the next index into the argv array.
So if you simply eliminate the counter and the lines shown below, you'll find that your code is already working correctly.
counter++;
printf("argv[%d]: %s\n", counter, argv[counter]);
printf("Counter: %d\n", counter);
If you absolutely must have a count of the number of arguments found, then simply update the counter in each case statement, e.g.
case 'a' :
counter++;
...
case 'b' :
counter++;
...
and then print the counter at the end. It's the printf("argv[%d]: %s\n", counter, argv[counter]); line that's getting you confused. That line serves no useful purpose and should be removed.
For a command line of ./test -a1 -b 2, the strings in argv will be
argv[0]: "./test"
argv[1]: "-a1"
argv[2]: "-b"
argv[3]: "2"
When you call getopt the first time, it will read argv[1] and will recognize the -a as one of the options that you've specified, so it splits the string into two parts and returns the 'a' while setting the optarg to "1".
When you call getopt the second time, it will read argv[2] and will recognize the -b option. Since you've specified that -b takes an argument, but argv[2] doesn't contain the argument, getopt will take argv[3] as the argument. Hence it returns the 'b' while setting the optarg to "2".
The bottom line is that getopt is designed to ignore the white space between the option and its argument. It does that by processing two of the argv strings if the user put white space between the option and its argument.

Handle segmentation fault accessing non-existent command line argument

I'm making a program in C in linux environment. Now, program runs with arguments which I supply in the command line.
For example:
./programName -a 45 -b 64
I wanted to handle the case when my command line parameters are wrongly supplied. Say, only 'a' and 'b' are valid parameters and character other than that is wrong. I handled this case. But suppose if my command line parameter is like this:
./programName -a 45 -b
It gives segmentation fault(core dumped). I know why it gives because there is no arguments after b. But how can I handle this situation such that when this condition arrives, I can print an error message on screen and exit my program.

As per the main function wiki page:
The parameters argc, argument count, and argv, argument vector, respectively
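So you can use your argc parameter to check whether you have the right number of arguments. For ./programName -a 45 -b 64, argc is 5 (the program name plus four arguments). If it is not, report the error and exit instead of reading argv elements that are not there.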

You can, and quite probably should, use getopt() or its GNU brethren getopt_long().
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char **argv)
{
    int b = 0;
    int a = 0;
    int opt;
    while ((opt = getopt(argc, argv, "a:b:")) != -1)
    {
        switch (opt)
        {
        case 'a':
            a = atoi(optarg);
            break;
        case 'b':
            b = atoi(optarg);
            break;
        default:
            fprintf(stderr, "Usage: %s -a num -b num\n", argv[0]);
            exit(1);
        }
    }
    if (a == 0 || b == 0)
    {
        fprintf(stderr, "%s: you did not provide non-zero values for both -a and -b options\n", argv[0]);
        exit(1);
    }
    printf("a = %d, b = %d, sum = %d\n", a, b, a + b);
    return(0);
}
You can make the error detection more clever as you wish, not allowing repeats, spotting extra arguments, allowing zeros through, etc. But the key point is that getopt() will outlaw your problematic invocation.
We can't see what went wrong with your code because you didn't show it, but if you go accessing a non-existent argument (like argv[4] when you run ./programName -a 42 -b), then you get core dumps. There are those who write out option parsing code by hand; such code is more vulnerable to such problems than code using getopt() or an equivalent option parsing function.
