getopt with repeated and optional arguments - c

For a simple C project of a filesystem in a file, I have to make a command for writing the partitions table. It just contains the number of partitions and the size of them, pretty simple.
It should work like mk_part -s size [-s size ...] [name].
[name] is the filename of the disk, it's optionnal because there is a default one provided.
I don't know much getopt_long (and getopt) but all I read is that I while get options in a while so the two way of processing for me would be :
store all the sizes in an array and then write them in the table.
write size directly during parsing
For the first choice the difficulty is that I don't know the number of partitions. But I still could majorate this number by argc or better by (argc-1)/2 and it would work.
For the second choice I don't know which file to write.
So what is the best alternative to get all those arguments and how can I get this optionnal name ?

getopt can handle both repeated and optional args just fine. For repeated args each invocation of getopt will give you the next arg. getopt doesn't care that it is repeated. For the arg at the end, just need to check for its presence once all the options are parsed. Below is code modified from the example in the getopt man page to handle your scenario:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int
main(int argc, char *argv[])
{
int opt;
while ((opt = getopt(argc, argv, "s:")) != -1) {
switch (opt) {
case 's':
printf("size=%d\n", atoi(optarg));
break;
default: /* '?' */
exit(EXIT_FAILURE);
}
}
if (optind < argc) {
printf("name=%s\n", argv[optind]);
} else {
printf("optional name arg not present\n");
}
exit(EXIT_SUCCESS);
}
And here is some sample runs of the program showing it handling repeated options and the arg at the end.
$ ./a.out -s 10 -s 20 -s 30
size=10
size=20
size=30
optional name arg not present
$ ./a.out -s 1 my_name
size=1
name=my_name

I think you're overthinking this. I know it's always tempting to try to avoid malloc's, but the efficiency of option parsing is never (*) important. You only parse options once, and in the cost of initializing a new process, finding the executable, linking and loading it, and all the rest of the process of starting up a command, the time it takes to parse options is probably not even noise.
So just do it in the simplest way possible. Here's one possible outline:
int main(int argc, char* argv) {
/* These variables describe the options */
int nparts = 0; // Number of partitions
unsigned long* parts = NULL; // Array of partitions (of size nparts)
const char* diskname="/the/default/name"; // Disk's filename
for (;;) {
switch (getopt(argc, argv, "s:")) {
case '?':
/* Print usage message */
exit(1);
case 's':
/* Some error checking missing */
parts = realloc(parts, ++nparts * sizeof *parts);
parts[nparts - 1] = strtoul(optarg, NULL, 0);
continue;
case -1:
break;
}
break;
}
if (optind < argc) diskname = argv[optind++];
if (optind != argc) {
/* print error message */
exit(1);
}
return do_partitions(diskname, parts, nparts);
}
The above code is missing a lot of error checking and other niceties, but it's short and to the point. It simply reallocs the partition array every time a new size is found. (That's probably not as awful as you think it is, because realloc itself is probably clever enough to increase the allocation size exponentially. But even if it were awful, it's not going to happen often enough to even notice.)
The trick with continue and break is a common way of nesting a switch inside a for. In the switch, continue will continue the for loop, while break will break out of the switch; since all the switch actions which do not terminate the for loop continue, whatever follows the switch block is only executed for a switch action which explicitly breaks. So the break following the switch block breaks the for loop in precisely those cases where the switch action did a break.
You might want to check that there was at least one partition size defined before call the function which does the repartitioning.

Related

How to get the arguments from getopt to work together

I want to be able to use every argument (-S, -s, -f) and them be able to be used together. -S prints the files in the folder and their size... -s prints the files if they are >= the file size provided by the argument -f finds all the files with the given substring.
How would I get these to work together? Right now, my code does all of this separately.
while((c = getopt(argc, argv, "Ss:f:")) != -1){
switch(c){
case 'S':
// No clue how to make them work together.
printf("Case: S\n");
printf("option -%c with argument '%s'\n", c, argv[optind]);
printDIR(cwd, case_S);
break;
case 's':
printf("Case: s\n");
printf("option -%c with argument '%s'\n", c, optarg);
printDIR(cwd, case_s);
break;
case 'f':
printf("Case: f\n");
printf("option -%c with argument '%s'\n", c, optarg);
printDIR(cwd, case_f);
break;
default:
printf("...");
}
}
printDIR is a pointer function which is why I have cwd(which is the directory) and case_S and so on.
I want to be able to say... './search -S -s 1024 -f tar'. This should recursively search the current directory and print the size of the file if it is >= 1024 and if the file has the substring 'tar' in it. But I also want it to work even if I don't provide all arguments.
This is my first time trying anything like this so I'm new to trying to make UNIX commands and using getopt args.
Converting parts of some of my comments into an answer.
You should process the options without doing any actions. Only when you've finished processing the options, with no errors, do you think about doing anything like calling printDIR(). You'll probably need more arguments to the function, or use global variables.
You'd have a flag such as:
int recursive = 0;
which you would set to 1 if the search was to be recursive. And int minimum_size = 0; and modify it with -s 1024. And const char *filter = ""; and then modify that with -s tar. Etc. Often, these are global variables — but if you can avoid that by passing them to the function, that is better.
Your function might then become:
int printDIR(const char *cwd, int recursive, int minimum, const char *filter);
and you'd call it with the appropriately set local variables. Note that you should check the conversion from string to integer before calling printDIR().
If there are non-option arguments, you'd process them after the option handling with:
for (int i = optind; i < argc; i++)
printDIR(argv[i], recursive, minimum_size, filter);

argc/argv in c linked list or binary tree

So I am trying to make a linked list/binary tree and:
The user should be able to choose the data structure directly from the command line when it starts the program. This should use the argc or argv arguments to main()
how would I do this? I don’t get it why not just use switch case statement asking the student.
option 1: linked list
option 2: binary tree?
we didn’t really cover argc argv properly can anyone help?
Apparently its a duplicate ... hmm.. well i am asking specically about binary tree/linked list how would the user tell it to choose which data structure?
Experiment with the following skeleton program, and find out.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: %s COMMAND\n", argv[0]);
return EXIT_FAILURE;
}
if (!strcmp(argv[1], "foo")) {
printf("Doing foo.\n");
} else
if (!strcmp(argv[1], "bar")) {
printf("Doing bar.\n");
} else {
fprintf(stderr, "Unknown command line parameter '%s'.\n", argv[1]);
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
The most common way to inform the utility user as to what to do, is to run the utility without parameters, or with -h or --help as the only parameter. (Windows command-line utilities might use /? or similar.)
Let's say the user can run the compiled program, program in the following ways:
./program list
./program tree
./program -h
./program --help
./program
where the first form tells the program to use a linked list; the second form tells the program to use a tree; and the other forms just output usage, information on how to call the program:
Usage: ./program [ -h | --help ]
./program MODE
Where MODE is one of:
list Linked-list mode
tree Tree mode
Further details on what the program actually does...
You achieve this with very little code:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
enum {
NO_MODE = 0,
LIST_MODE,
TREE_MODE
};
int main(int argc, char *argv[])
{
int mode = NO_MODE;
if (argc != 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
printf("Usage: %s [ -h | --help ]\n", argv[0]);
printf(" %s MODE\n", argv[0]);
printf("\n");
printf("Where MODE is one of\n");
printf(" list for linked list mode\n");
printf(" tree for tree mode\n");
printf("\n");
printf("Further details on what the program actually does...\n");
printf("\n");
return EXIT_SUCCESS;
}
if (!strcmp(argv[1], "list"))
mode = LIST_MODE;
else
if (!strcmp(argv[1], "tree"))
mode = TREE_MODE;
else {
fprintf(stderr, "%s: Unknown MODE.\n", argv[1]);
return EXIT_FAILURE;
}
/* mode == LIST_MODE or TREE_MODE here,
depending on the first command line parameter.
*/
return EXIT_SUCCESS;
}
Note that || operator is short-circuited in C: if the left side is false, the right side is not evaluated at all. So, above, the first strcmp() check is only done when argv == 2, and the second when argv == 2 and the first strcmp() returned nonzero (no match).
In other words, the body of the usage section is only run when argv != 2 (there is less than two, or more than two command line items, counting the program name as one); or if the sole command-line parameter matches either -h or --help.
! is the not operator in C. !x evaluates to 1 if and only if x is zero or NULL; and to 0 otherwise.
(You can confuse people by using !!x. It evaluates to zero if x is zero, and to one if x is not zero. Which is logical. It's often called the not-not operation.)
The enum is just there to remind you that magic constants are bad; it is better to use either enums, or preprocessor macros (#define NO_MODE 0 and so on). It would be terribly easy to use 1 in one place to indicate tree mode, and 2 in another; such bugs are horrible to debug, needs way too much concentration from the human reading the code, to find such bugs. So don't use magic constants, use enums or macros instead.
Above, I decided that NO_MODE has value zero, and let the compiler assign (increasing) values to LIST_MODE and TREE_MODE; consider them compile-time integer constants. (Meaning, you can use them in case labels in a switch statement.)
Because strcmp() returns zero if the two strings match, !strcmp(argv[1], "baz")) is true (nonzero) if and only if argv[1] contains string baz. You see it all the time in real-world code when strings are compared.
If you look at my answers here, you'll very often see an if (argc ...) "usage" block in my example code. This is because even I myself will forget, often within days, exactly what the purpose of the program is. I typically have several dozen example programs on my machines I've written, and rather than looking at the sources to see if something jogs my memory, I simply run the example snippets without command-line parameters (or actually, with -h parameter, since some are filters), to see what they do. It's faster, less reading, and I'll find the relevant snippet faster.
In summary, write an usage output block in all your programs, especially when it is just a test program you won't publish anywhere. They are useful, especially when you have a library full of them, of various code snippets (each in their own directory; I use a four-digit number and a short descriptive name) that implement interesting or useful things. It saves time and effort in the long run, and anything that lets me be efficient and lazy is good in my book.
argc = argument count, argv = array of arguments. argv[0] is the executing program. argv[1..n] are the arguments passed to the executable.
Example: I call the executable foo with 2 arguments, bar and bas:
foo bar bas
argc = 3, argv = [foo, bar, bas]

Handle segmentation fault accessing non-existent command line argument

I'm making a program in C in linux environment. Now, program runs with arguments which I supply in the command line.
For example:
./programName -a 45 -b 64
I wanted to handle the case when my command line parameters are wrongly supplied. Say, only 'a' and 'b' are valid parameters and character other than that is wrong. I handled this case. But suppose if my command line parameter is like this:
./programName -a 45 -b
It gives segmentation fault(core dumped). I know why it gives because there is no arguments after b. But how can I handle this situation such that when this condition arrives, I can print an error message on screen and exit my program.
As per the main function wiki page:
The parameters argc, argument count, and argv, argument vector, respectively
So you can use your argc parameter to check whether or not you have the right number of arguments. If you don't have 4, handle it and proceed without segfault.
You can, and quite probably should, use getopt() or its GNU brethren getopt_long().
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char **argv)
{
int b = 0;
int a = 0;
int opt;
while ((opt = getopt(argc, argv, "a:b:")) != -1)
{
switch (opt)
{
case 'a':
a = atoi(optarg);
break;
case 'b':
b = atoi(optarg);
break;
default:
fprintf(stderr, "Usage: %s -a num -b num\n", argv[0]);
exit(1);
}
}
if (a == 0 || b == 0)
{
fprintf(stderr, "%s: you did not provide non-zero values for both -a and -b options\n", argv[0]);
exit(1);
}
printf("a = %d, b = %d, sum = %d\n", a, b, a + b);
return(0);
}
You can make the error detection more clever as you wish, not allowing repeats, spotting extra arguments, allowing zeros through, etc. But the key point is that getopt() will outlaw your problematic invocation.
We can't see what went wrong with your code because you didn't show it, but if you go accessing a non-existent argument (like argv[4] when you run ./programName -a 42 -b), then you get core dumps. There are those who write out option parsing code by hand; such code is more vulnerable to such problems than code using getopt() or an equivalent option parsing function.

Forking with command line arguments

I am building a Linux Shell, and my current headache is passing command line arguments to forked/exec'ed programs and system functions.
Currently all input is tokenized on spaces and new lines, in a global variable char * parsed_arguments. For example, the input dir /usa/folderb would be tokenized as:
parsed_arguments[0] = dir
parsed_arguments[1] = /usa/folderb
parsed_arguments tokenizes everything perfectly; My issue now is that i wish to only take a subset of parsed_arguments, which excludes the command/ first argument/path to executable to run in the shell, and store them in a new array, called passed_arguments.
so in the previous example dir /usa/folderb
parsed_arguments[0] = dir
parsed_arguments[1] = /usa/folderb
passed_arguments[0] = /usa/folderb
passed_arguments[1] = etc....
Currently I am not having any luck with this so I'm hoping someone could help me with this. Here is some code of what I have working so far:
How I'm trying to copy arguments:
void command_Line()
{
int i = 1;
for(i;parsed_arguments[i]!=NULL;i++)
printf("%s",parsed_arguments[i]);
}
Function to read commands:
void readCommand(char newcommand[]){
printf("readCommand: %s\n", newcommand);
//parsed_arguments = (char* malloc(MAX_ARGS));
// strcpy(newcommand,inputstring);
parsed = parsed_arguments;
*parsed++ = strtok(newcommand,SEPARATORS); // tokenize input
while ((*parsed++ = strtok(NULL,SEPARATORS)))
//printf("test1\n"); // last entry will be NULL
//passed_arguments=parsed_arguments[1];
if(parsed[0]){
char *initial_command =parsed[0];
parsed= parsed_arguments;
while (*parsed) fprintf(stdout,"%s\n ",*parsed++);
// free (parsed);
// free(parsed_arguments);
}//end of if
command_Line();
}//end of ReadCommand
Forking function:
else if(strstr(parsed_arguments[0],"./")!=NULL)
{
int pid;
switch(pid=fork()){
case -1:
printf("Fork error, aborting\n");
abort();
case 0:
execv(parsed_arguments[0],passed_arguments);
}
}
This is what my shell currently outputs. The first time I run it, it outputs something close to what I want, but every subsequent call breaks the program. In addition, each additional call appends the parsed arguments to the output.
This is what the original shell produces. Again it's close to what I want, but not quite. I want to omit the command (i.e. "./testline").
Your testline program is a sensible one to have in your toolbox; I have a similar program that I call al (for Argument List) that prints its arguments, one per line. It doesn't print argv[0] though (I know it is called al). You can easily arrange for your testline to skip argv[0] too. Note that Unix convention is that argv[0] is the name of the program; you should not try to change that (you'll be fighting against the entire system).
#include <stdio.h>
int main(int argc, char **argv)
{
while (*++argv != 0)
puts(*argv);
return 0;
}
Your function command_line() is also reasonable except that it relies unnecessarily on global variables. Think of global variables as a nasty smell (H2S, for example); avoid them when you can. It should be more like:
void command_Line(char *argv[])
{
for (int i = 1; argv[i] != NULL; i++)
printf("<<%s>>\n", argv[i]);
}
If you're stuck with C89, you'll need to declare int i; outside the loop and use just for (i = 1; ...) in the loop control. Note that the printing here separates each argument on a line on its own, and encloses it in marker characters (<< and >> — change to suit your whims and prejudices). It would be fine to skip the newline in the loop (maybe use a space instead), and then add a newline after the loop (putchar('\n');). This makes a better, more nearly general purpose debug routine. (When I write a 'dump' function, I usually use void dump_argv(FILE *fp, const char *tag, char *argv[]) so that I can print to standard error or standard output, and include a tag string to identify where the dump is written.)
Unfortunately, given the fragmentary nature of your readCommand() function, it is not possible to coherently critique it. The commented out lines are enough to elicit concern, but without the actual code you're running, we can't guess what problems or mistakes you're making. As shown, it is equivalent to:
void readCommand(char newcommand[])
{
printf("readCommand: %s\n", newcommand);
parsed = parsed_arguments;
*parsed++ = strtok(newcommand, SEPARATORS);
while ((*parsed++ = strtok(NULL, SEPARATORS)) != 0)
{
if (parsed[0])
{
char *initial_command = parsed[0];
parsed = parsed_arguments;
while (*parsed)
fprintf(stdout, "%s\n ", *parsed++);
}
}
command_Line();
}
The variables parsed and parsed_arguments are both globals and the variable initial_command is set but not used (aka 'pointless'). The if (parsed[0]) test is not safe; you incremented the pointer in the previous line, so it is pointing at indeterminate memory.
Superficially, judging from the screen shots, you are not resetting the parsed_arguments[] and/or passed_arguments[] arrays correctly on the second use; it might be an index that is not being set to zero. Without knowing how the data is allocated, it is hard to know what you might be doing wrong.
I recommend closing this question, going back to your system and producing a minimal SSCCE. It should be under about 100 lines; it need not do the execv() (or fork()), but should print the commands to be executed using a variant of the command_Line() function above. If this answer prevents you deleting (closing) this question, then edit it with your SSCCE code, and notify me with a comment to this answer so I get to see you've done that.

Where does getopt_long store an unrecognized option?

When getopt or getopt_long encounters an illegal option, it stores the offending option character in optopt. When the illegal option is a long option, where can I find out what the option was? And does anything meaningful get stored in optopt then?
I've set opterr = 0 to suppress the automatically printed error message. I want to create my own message that I can print or log where I'd like, but I want to include the name of the unrecognized option.
The closest I can find is that if you get a BADCH return the argv item that caused it is in argv[optind-1]. Seems like there should be a better way to find the problem argument.
You're quite right that the man page glosses right over these details, but enough hints can be gleaned from the source code, e.g., glibc's implementation in glibc-x.y.z/posix/getopt.c's _getopt_internal_r. (Perhaps that's the only interesting implementation of this GNU extension function?)
That code sets optopt to 0 when it encounters an erroneous long option, which I guess is useful to distinguish this case from an erroneous short option, when optopt will surely be non-NUL.
The error messages produced when opterr != 0 mostly print out the erroneous long option as argv[optind], and later code (always or -- conservatively -- at least mostly) later increments optind before returning.
Hence consider this program:
#include <getopt.h>
#include <stdio.h>
int main(int argc, char **argv) {
struct option longopts[] = {
{ "foo", no_argument, NULL, 'F' },
{ NULL, 0, NULL, 0 }
};
int c;
do {
int curind = optind;
c = getopt_long(argc, argv, "f", longopts, NULL);
switch (c) {
case 'f': printf("-f\n"); break;
case 'F': printf("--foo\n"); break;
case '?':
if (optopt) printf("bad short opt '%c'\n", optopt);
else printf("bad long opt \"%s\"\n", argv[curind]);
break;
case -1:
break;
default:
printf("returned %d\n", c);
break;
}
} while (c != -1);
return 0;
}
$ ./longopt -f -x --bar --foo
-f
./longopt: invalid option -- 'x'
bad short opt 'x'
./longopt: unrecognized option '--bar'
bad long opt "--bar"
--foo
Thus in these cases, by caching the pre-getopt_long value of optind, we're easily able to print out the same bad options as the opterr messages.
This may not be quite right in all cases, as the glibc implementation's use of its own __nextchar rather than argv[optind] (in the "unrecognized option" case) deserves study, but it should be enough to get you started.
If you think carefully about the relationship between optind and the repeated invocations of getopt_long, I think printing out argv[cached_optind] is going to be pretty safe. optopt exists because for short options you need to know just which character within the word is the problem, but for long options the problem is the whole current word (modulo stripping off option arguments of the form =param). And the current word is the one that getopt_long is looking at with the (incoming) optind value.
In the absence of a guarantee written in the documentation, I would be somewhat less sanguine about taking advantage of the optopt = 0 behaviour though.

Resources