ls piped into the command line - c

I have been trying to pipe in the results from ls into the command line for a C program I am writing (in Unix). I want to be able to have an index of the files and so I was planning on using argv. This is how I thought it should work:
./foo &(ls ~/path)
It doesn't work — what's the correct way to pass the output of ls as arguments to the command?

Your syntax is a bit off...
./foo $(ls ~/path)
Do note that this will choke on files with certain characters in them. Use an array instead to fix this.
pushd ~/path
files=(*)
popd
./foo "${files[@]}"

The notation you specified does two things:
./foo &
runs the program foo in the background (with no arguments other than its command name). Then:
(ls ~/path)
runs the ls command in a sub-shell (which, in this context, is the same as running it in the main shell). The problem is you intended (or need) to use $ in place of &.
./foo $(ls ~/path)
This runs the command ls ~/path and captures the output, which is split into words (using the separators listed in the $IFS variable). Each word is then supplied as an argument to the command ./foo, as you required.
We can then debate the wisdom of using the output of ls like that, but unless you have file names containing whitespace (spaces, tabs, newlines) or glob characters such as *, you will be OK.

You know how Unix tools accept glob patterns, so you can do cat *.txt or rm ~/Pictures/Vacation*.jpg, without having to pipe/expand ls?
That's an ability your shell gives your program for free!
Just use ./foo ~/path/* and argv[1] will contain /home/you/path/fileone, argv[2] will contain /home/you/path/filetwo, and so forth.
These filenames may be relative or absolute, but can always be passed directly to open/fopen/execve or whichever function you want to use.
Using ls as you describe will only give you the last part of the filename with no directory, so you won't know where the files are to do anything with them (though if that's what you want, just use basename(argv[1])).

Related

How to pass a filename when executing a C program

I am trying to not hardcode the name of the input file in my C program. I have all of the other components working when I hardcode the filename, but I would like to be able to pass it a filename string.
I am trying to compile a file called Matrix.c and name its executable matrix.
So, in the terminal, when I get to my working directory, I run:
gcc -g Matrix.c -o matrix
then when I run
./matrix
It doesn't have a filename passed to it so I am gonna check for that and have the user input a filename to load.
However, when someone passes the filename, should it be passed as:
./matrix filename.txt
or
./matrix < filename.txt
With the latter option, I can't seem to get the name of the argument passed to the function from argv[1] — it's just "(Null)".
I know this is a very simplistic question. But am I just completely off my rocker? Is it something to do with me running on OS X El Capitan? I know I've used the '<' convention before.
The issue is how the shell works, mainly. When you use:
./matrix filename.txt
then the program is given two arguments — the program name and the file name. When you use:
./matrix < filename.txt
then the program is given just one argument — the program name — and the shell arranges for its standard input to come from the file (and the file name is not passed to your program).
Either can be made to work; you just have to decide which you want to support. What should happen if the user types ./matrix file1.txt file2.txt file3.txt? One version of conventional behaviour would be to process each file in turn, writing each set of results to standard output. There are plenty of alternative behaviours — most of them have been used by someone at some time or another. Reading from standard input when there is no file name specified is a common mode of operation (think cat and grep and …).
Arguments to a command are in argv[1 .. argc-1].
The redirect from '<' sends the contents of the file to the program's stdin.
A third way to get the filename would be to print "Enter filename: " and then read the string typed by the user.

When is a file created when using output redirection?

I have a script that runs in AIX ksh that looks like this:
wc dir1/* dir2/* | {awk command to rearrange output} | {grep command to filter more} > dir2/output.txt
It is a precondition to this line that dir2/output.txt does not exist.
The issue is that dir2/output.txt has contained itself in the output (it's happened a handful of times out of hundreds of times with no problem). dir1 and dir2 are NFS-mounted.
Is it related to the implementation of wc -- what if the first parameter takes a long time? I think not, as I've tried the following:
wc `sleep 5` *.txt > out.txt
Even in this case out.txt does not list itself.
As a last note, wildcards are used in this example where they are used in the actual script. So if the expansion happens first, why does this problem occur?
At what point is dir2/output.txt actually created?
Redirections are done by the shell, as are globs. Your problem is that, in the case of a pipeline, each pipeline stage is a separate subprocess; whether the shell subprocess that does the final redirection runs before the one that builds the glob of input files for wc will depend on details of the scheduler and system load, among other things, and should be considered indeterminate.
In short, you should assume that this will happen and either exclude dir2/output.txt (take a look at ksh extended glob patterns; in particular, something along the lines of dir2/!(output.txt) may be useful) or create the output somewhere else and mv it to its final location afterward.

Preparing an input for a shell

I'm writing a simple shell in C and I want to implement the user input the same way the other shells do, or at least how bash does (never used others). If you enter a command with random whitespace then it can still run the command:
ex.
   ls      -1
Obviously strtok() won't work on this when separating the command and args...
How does bash do this? Should I search through the thousands of lines of the source code?
You can skip spaces while you're parsing your command:
while(*p==' '||*p=='\t') ++p;
Whitespace isn't a problem: you can drop any whitespace that isn't needed.
But I'm not sure that a shell does it this way. I think a shell may look for all the options in the input string. For example,
ls -a -l and ls -l -a
are the same. It may be better to search for every possible option, or to interpret the input character by character: for example, first find each "-", then interpret the option that follows it.

command injection in C programming

I was implementing an echo command using the system() function. The argument for the echo command comes from a command line argument. But when I use ';' in the argument, it shows the directory listing.
What should I do to avoid it? Is it because of command injection in my program?
update: code added from comment
#include<string.h>
#include<stdio.h>
#include<stdlib.h>
int main(int argc, char **argv) {
char cmd[50] = "echo ";
strcat(cmd,argv[1]);
system(cmd);
}
I could compile the code, but while executing, if I give the command line argument as e.g. './a.out hello;ls ' then the directory listing happens.
Why are you trying to use shell access (which is exactly what system() gives you), and then attempting to restrict it?
If you need for some reason to use 'echo', please build your own execve() parameters and launch /bin/echo directly. This way you restrict the damage to the tasks 'echo' can do.
When attempting to run your program with the command ./a.out hello;ls, you are actually providing the shell with two separate commands that it executes in sequence. First the shell runs a.out with the command line parameter "hello" in argv[1], which prints it out using echo. Then your program exits, and the shell runs the next command, ls, and displays the directory listing.
If you want to pass that string to the program as a command line parameter, you need to escape the special shell character ;, so the shell does not parse it before giving it to your program. To escape a character, precede it with a \.
Try running the command with ./a.out hello\;ls, and then using printf instead of echo.
[can't respond to other answers yet, so reposting the question]
"Is possible to get the argument with ';', without using '\' in the command line argument. Is possible for me to include a '\' from my program after getting argv?"
No, it is not possible. The interpretation of ";" is done by the shell before getting to your program, so unless you escape at the call, your program will never be aware of the ";". i.e.
PROG1 parms ; PROG2
will cause the shell (which is interpreting what you type) to do the following:
start PROG1 and pass it parms.
once PROG1 is done, start PROG2
There are a number of special characters which the shell will take over by default and your program will never see: * for wildcards, | for pipes, & for parallel execution, etc... None of these will be seen by the program being run, they just tell the shell to do special things.
Alternatively to using the "\", you can enclose your parameter in single or double quotes (which are different, but for your example will both work). i.e.:
./a.out "hello;ls"
./a.out 'hello;ls'
Note that these will work for the printf option. If you call system(), you are in effect telling C to start a shell to run what you pass in, so the input will once again be subject to shell interpretation.
system() is very difficult to use in a secure manner. It's much easier to just use one of the exec* functions.

Command line arguments with datafiles

If I want to pass data files to a program, how can I distinguish the fact that they are data files, not just strings of the file names? Basically I want to file redirect, but use command line arguments so I can be sure the input is correct.
I have been using:
./theapp < datafile1 < datafile2 arg1 arg2 arg3 > outputfile
but I am wondering whether it is possible for it to look like this:
./theapp datafile1 datafile2 arg1 arg2 arg3 > outputfile
Allowing the use of command line arguments.
It's a little hard to combine two files into standard input like that. Better would be:
cat datafile1 datafile2 | ./theapp arg1 arg2 arg3 >outputfile
With bash (at least), the second input redirection overrides the first, it does not augment it. You can see that with the two commands:
cat <realfile.txt </dev/null # no output.
cat </dev/null <realfile.txt # outputs realfile.txt.
When you use redirection, your application never even sees >outputfile (for example). It is evaluated by the shell which opens it up and connects it to the standard output of the process you're trying to run. All your program will generally see will be:
./theapp arg1 arg2 arg3
Same with standard input, it's taken care of by the shell.
The only possible problem with that first command above is that it combines the two files into one stream so that your program doesn't know where the first ends and second begins (unless it can somehow deduce this from the content of the files).
If you want to process multiple files and know which they are, there's a time-honoured tradition of doing something like:
./theapp arg1 arg2 arg3 @datafile1 @datafile2 >outputfile
and then having your application open and process the files itself. This is more work than letting the shell do it though.
From the perspective of your program, all command line arguments are strings, and you have to decide whether they represent file names or not yourself. There are only two bytes that cannot appear in a file name on Unix: 0x00 and 0x2F (NUL and /). [I really mean bytes. Except for HFS+, Unix file systems are completely oblivious to character encoding, although sensible people use UTF-8, of course.]
Shell redirections don't appear in argv at all.
There is a convention, though: treat each element of argv (except argv[0] of course) that does not begin with a dash as the name of a file to process, in the order that they appear. You do NOT have to do any unquoting operations; just pass them to fopen (or open) as is. If the string "-" appears as an element of argv, process standard input at that point until exhausted, then continue looping over argv. And if the string "--" appears in argv, treat everything after that point as a file name, whether or not it begins with a dash. (Including subsequent appearances of "-" or "--").
There may be a handy library module or even a language primitive to deal with this stuff for you, depending on what language you're using. For instance, in Perl, you just write
for (<>) {
... do stuff with $_ ...
}
and you get everything I said in the "There is a convention..." paragraph for free. (But you said C, so, um, you gotta do most of it yourself. I'm not aware of an argument-processing library for plain C that's worth the space it takes on disk. :-( )

Resources