less: missing filename using execvp() - c

I am trying to use execvp() to run less, but I keep running into the same error:
Missing filename ("less --help" for help)
I'm assuming I am trying to input the file completely wrong. Could anyone give me some guidance? Here is my code line trying to implement it:
// args[0] == "tempFile" which is in my directory
execvp("less", args)

argv[0] is meant to be the name you give to the command. The command you execute uses that to know how it was invoked. Typically, you'd want to use something like less here:
argv[0] = "less";
argv[1] = "filename";
argv[2] = NULL;
execvp("less", argv);

Argument 0, taken from argv[0], is conventionally the command name. Typically it's the same string that you pass as the first argument to execvp (that's what shells do).
Argument 1, from argv[1], is the first “real” argument. So pass a 3-element array containing: char *args[] = {"less", "tempFile", NULL}
Most languages follow the same argument numbering. For example, if you invoke a shell script, it sees what you pass as argv[0] as its $0, what you pass as argv[1] as $1, etc. Perl is a notable exception: in a Perl script, argv[0] is $0, argv[1] is $ARGV[0], argv[2] is $ARGV[1], etc.

Just as when you execute a shell script on the command line with
sh -c 'script body here' arg0 arg1 arg2
the arg0 argument gets placed in $0 (this is usually the name of the process) and is not really counted as one of the command line arguments of the script itself (it's not part of $@, and $# will not count it). The first command line argument, available in $1, is arg1.
In your case, use args[0] = "less" and args[1] = "tempFile" in your C code. args[2] should be a null pointer.

Related

pathname vs arguments for execve parameters

I am trying to implement a simple shell program that runs user input commands. I want the user to enter "ls" or "dir" and have the shell run /bin/ls or /bin/dir . For the execve argument what would be correct:
char *args[] ={"/bin/", "ls", NULL};
//Fork, pid, etc...then
execve(args[0], args+1, NULL);
Or would it be something different? I've seen some people using /bin/ls as the pathname and then no arguments, or ls as the pathname and /bin for the environment. I tried what I had above and it didn't work, so I want to know whether the problem is the arguments I am sending or whether I should be looking elsewhere in my code. Note that I am not interested in other variations of execve such as execvp. I am on a Linux system. Thanks
PATHNAME
The pathname in execve() must be the full path to the executable, such as /bin/ls. If using execvpe(), you could use ls alone as the pathname, but as you already specified, you don’t want to use that.
ARGUMENTS
The arguments should be an array of strings, one for each space-separated argument specified on the command line. The last one should be NULL. The first argument should be the pathname itself. For example:
char* args[] = {"/bin/ls", "-la", "foo/bar", NULL};
ENVIRONMENT
The environment variables cannot be omitted when using execve(). In some implementations, NULL can be passed as the last argument to execve(), but this is not standard. Instead, you should pass a pointer to a null pointer; essentially an empty array of environment variables.
Putting it together
char *args[] ={"/bin/ls", "-foo", "bar", NULL};
//Fork, pid, etc...then
char* nullstr = NULL;
execve(args[0], args, &nullstr);
From execve [emphasis added]:
int execve(const char *pathname, char *const argv[],
char *const envp[]);
execve() executes the program referred to by pathname. This
causes the program that is currently being run by the calling
process to be replaced with a new program, with newly initialized
stack, heap, and (initialized and uninitialized) data segments.
pathname must be either a binary executable, or a script starting
with a line of the form:
#!interpreter [optional-arg]
For details of the latter case, see "Interpreter scripts" below.
argv is an array of pointers to strings passed to the new program
as its command-line arguments. By convention, the first of these
strings (i.e., argv[0]) should contain the filename associated
with the file being executed. The argv array must be terminated
by a NULL pointer. (Thus, in the new program, argv[argc] will be
NULL.)
In your case, the pathname should be "/bin/ls" and not "/bin/". If you want to pass command-line arguments to the command, put the first argument at index 1 of the argv vector, the second at index 2, and so on, and terminate the argument vector with NULL.
A sample program which replaces the current executable image with the /bin/ls program and runs /bin/ls testfile:
#include <stdio.h>
#include <unistd.h>
int main (void) {
char *args[] = {"/bin/ls", "testfile", NULL};
// here you can call fork and then call execve in child process
execve(args[0], args, NULL);
return 0;
}
Output:
# ./a.out
testfile

How to accept file argument in C?

I know that, if run in bash, my program is supposed to be able to handle arguments like this (where a.out is the name of the file):
$ a.out <inputFile
Does this mean that inputFile is argv[1]? If so, what is the data type of argv[1] in case I need to pass it in to some other function? Would I read it using something like:
FILE *fopen( const char * filename, const char * mode );
OR
Does that mean I have to accept input from user getchar() or something?
How do I deal with such situations?
There are two ways to receive input:
Via STDIN, which is a pre-defined filehandle (fd 0 or STDIN_FILENO) you can read from at any time.
Via command-line arguments passed by argv
The shell interprets redirection operators to adjust what STDIN actually is, so by the time the program runs the only arguments left are:
"a.out"
The redirection is gone. It's just "piped" into STDIN.
Shell operators like <, > and | are interpreted by the shell before your program is run. The same goes for interpolation like $ variables and other shell-specific functions.
The command
./a.out < inputFile
isn't passing arguments to the program; instead it does redirection.
That means the shell will set up standard input (stdin, which e.g. scanf reads from) in your program to read from the redirected file.
To pass an actual argument to the program you need to run it as:
./a.out inputFile
In this case argc will be equal to 2, and argv[1] will be the string "inputFile", which you can then pass on to e.g. fopen.
You need to pass a path to the c program then use fopen()
Using "<" means you are redirecting the standard input of your process to be read from inputFile. So all the standard read routines (getchar(), cin >> ...) can be used.
If you omit "<", then inputFile becomes command argument and it is passed as part of char* argv[] to your main. After checking argc, validating file exists etc., you can use file reading routines, like fopen.

Why do we pass the command name twice to execve, as a path and in the argument list?

I have a program written by my professor that prints the working directory (pwd) by using execve(), but I don't understand the parameters.
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
pid_t pid = fork();
if(pid <0)
perror(NULL);
else if(pid == 0)
{
char*argv[] = {"pwd",NULL};
execve("/bin/pwd",argv,NULL);
perror(NULL);
}
else
printf("Im the parent!");
return 0;
}
"/bin/pwd" gives the path to the executable that will be executed.
This means that it will call the pwd function, doesn't it?
Then why do I need to have the parameter pwd?
Couldn't the program run without that parameter?
By convention, the first argument passed to a program is the file name of the executable. However, it doesn't necessarily have to be.
As an example, take the following program:
#include <stdio.h>
int main(int argc, char *argv[])
{
int i;
printf("number of arguments: %d\n", argc);
printf("program name: %s\n", argv[0]);
for (i=1; i<argc; i++) {
printf("arg %d: %s\n", i, argv[i]);
}
return 0;
}
If you run this program from another like this:
char*argv[] = {"myprog", "A", "B", NULL};
execve("/home/dbush/myprog",argv,NULL);
The above will output:
number of arguments: 3
program name: myprog
arg 1: A
arg 2: B
But you could also run it like this
char*argv[] = {"myotherprog", "A", "B", NULL};
execve("/home/dbush/myprog",argv,NULL);
And it will output:
number of arguments: 3
program name: myotherprog
arg 1: A
arg 2: B
You can use the value of argv[0] as a way to know how your program was called and perhaps expose different functionality based on that.
The popular busybox tool does just this. A single executable is linked with different file names. Depending on which link a user used to run the executable, it can read argv[0] to know whether it was called as ls, ps, pwd, etc.
The execve man page has some mention of this. The emphasis is mine.
By convention, the first of these strings should contain the filename associated with the file being executed.
That is, it is not actually mandatory for the first element of argv to be the filename. In fact, one can test that by changing argv[0] to any string in the example code; the result will still be correct.
So it really is just a convention. Many programs will use argv[0] and expect it to be the filename. But many programs also do not care about argv[0] (like pwd). So whether argv[0] actually needs to be set to the filename depends on what program is being executed. Having said that, it would be wise to always follow the convention to play nicely with almost everyone's long held expectations.
From execve man page: http://man7.org/linux/man-pages/man2/execve.2.html
argv is an array of argument strings passed to the new program. By
convention, the first of these strings (i.e., argv[0]) should contain
the filename associated with the file being executed. envp is an
array of strings, conventionally of the form key=value, which are
passed as environment to the new program. The argv and envp arrays
must each include a null pointer at the end of the array.
So argv is treated as the command-line arguments the new program executes with.
By default, a Linux binary invoked with arguments accesses them through argc/argv, where argv[0] holds the program name.
I think execve takes the full argv, including argv[0], to keep parity with that default case (a program invoked with arguments).
From the source:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/exec.c#l1376
The argv passed to execve is used to construct argv for the about to be launched binary.

How do you use optional and non optional arguments?

I understand how to use getopt to accept command-line arguments like
./program -a yes -b no
What I am currently trying to do is accept command-line arguments where some are optional and some are not.
For example:
./program argv[1] argv[2] -a yes -b no
Putting options after multiple non-option arguments is probably a bad idea; don't design the command-line syntax that way if you can help it.
That said, you can parse the arguments yourself outside of getopt until you see something which looks like an option (while incrementing argv and decrementing argc). Then use getopt for the remainder of the command line from that point on.
Pseudo-code:
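for (argc--, argv++; *argv; argc--, argv++) {   /* skip the program name first */
if (*argv looks like an option)
break;
process *argv somehow
}
argc++, argv--;   /* step back one: getopt skips argv[0] */
now process with getopt(argc, argv, ...)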

C: execvp() and command line arguments

So I'm writing a program where the arguments are as follows:
program start emacs file.c
or even
program wait
In essence, the first argument (argv[0]) is the program name, followed by user inputs.
Inside my code, I invoke execvp. Thing is, I'm not entirely sure I'm invoking the right arguments.
if (pid == 0) {
execvp(argv[1], argv); //Line of interest
exit(1);
}
are argv[1] and argv the correct arguments to pass for the functionality described above? I looked at the man page and they make sense but might not be correct for this case.
Thank you!
In your main, argv will be like this in the first example:
argv[0] = "program";
argv[1] = "start";
argv[2] = "emacs";
argv[3] = "file.c";
argv[4] = NULL;
With execvp you want to execute the program "start" with the arguments "emacs file.c", right? Then the first parameter should be argv[1] ("start") and the second one an array with these strings: {"start", "emacs", "file.c", NULL}. If you pass argv as-is, you also include the "program" string as argv[0].
You can create a new array and copy these parameters or use the address of argv[1] like this:
execvp(argv[1], &argv[1]); //Line of interest
The only thing that might be an issue is that argv[0] in argv passed to execvp won't match argv[1] (the first argument). Otherwise, it looks okay.
Imagine calling program cat file.txt. In your program, argv will be {"program", "cat", "file.txt", NULL}. Then, in cat, even though the binary called will be cat, argv will still be {"program", "cat", "file.txt", NULL}.
Since cat tries to open and read each argument as a file, the first file it'll try to open is cat (argv[1]), which isn't the desired behavior.
The simple solution is to use execvp(argv[1], argv+1) - this essentially shifts the argument array to the left by one element.
My understanding is that you want to take a specific action based on the second command-line argument (argv[1]). If the second argument is 'start', your program should start the executable named argv[2] with the arguments provided thereafter (right?). In this case, you should provide execvp with the executable name (argv[2]) [1] and a list of arguments, which by convention starts with the name of the executable (argv[2]).
execvp(argv[2], &argv[2]) would implement what we have described in the last paragraph (assuming this is what you intended to do).
[1] execvp expects 2 arguments as you know. The first is a filename; if the specified filename does not contain a slash character (/), execvp will do a lookup in the PATH environment variable (which contains a list of directories where executable files reside) to find the executable's fully-qualified name. The second argument is a list of command-line arguments that will be available to the program when it starts.

Resources