I'm writing a unix minishell in C, and am at the point where I'm adding command expansion. What I mean by this is that I can nest commands in other commands, for example:
$> echo hello $(echo world! ... $(echo and stuff))
hello world! ... and stuff
I think I have it working mostly, however it isn't marking the end of the expanded string correctly, for example if I do:
$> echo a $(echo b $(echo c))
a b c
$> echo d $(echo e)
d e c
See it prints the c, even though I didn't ask it to. Here is my code:
msh.c - http://pastebin.com/sd6DZYwB
expand.c - http://pastebin.com/uLqvFGPw
I have more code, but there's a lot of it, and these are the parts that I'm having trouble with at the moment. I'll try to explain the basic way I'm doing this.
Main is in msh.c, here it gets a line of input from either the commandline or a shellfile, and then calls processline (char *line, int outFD, int waitFlag), where line is the line we just got, outFD is the file descriptor of the output file, and waitFlag tells us whether or not we should wait if we fork. When we call this from main we do it like this:
processline (buffer, 1, 1);
In processline, we allocate a new line:
char expanded_line[EXPANDEDLEN];
We then call expand, in expand.c:
expand(line, expanded_line, EXPANDEDLEN);
In expand, we copy the characters literally from line to expanded_line until we find a $(, which then calls:
static int expCmdOutput(char *orig, char *new, int *oldl_ind, int *newl_ind)
orig is line, and new is expanded_line. oldl_ind and newl_ind are the current positions in the line and the expanded line, respectively. Then we pipe, and recursively call processline, passing it the nested command (for example, if we had "echo a $(echo b)", we would pass processline "echo b").
This is where I get confused: each time expand is called, is it allocating a new chunk of memory EXPANDEDLEN long? If so, this is bad because I'll run out of stack room really quickly (in the case of a hugely nested command line). In expand I insert a null character at the end of the expanded string, so why is it printing past it?
If you guys need any more code, or explanations, just ask. Secondly, I put the code in pastebin because there's a ton of it, and in my experience people don't like it when I fill up several pages with code.
Your problem lies in expCmdOutput. As you already noticed, you do not get NUL terminated strings when reading the output of your child process using read. What you want to do is terminate the string manually, by adding something like
buf[bytes_read] = '\0';
after your call to read in line 29 (expand.c). Since you need space for the NUL, you can only read up to BUF_SIZE - 1 bytes then, of course.
You should probably rethink the whole loop you do afterwards, though:
/* READ OUTPUT OF COMMAND FROM READ END OF PIPE, THEN CLOSE READ END */
bytes_read = read(fd[0],buf,BUF_SIZE);
while(bytes_read > 0)
{
bytes_read = read(fd[0], buf, BUF_SIZE);
if (bytes_read == -1) perror("read");
}
close(fd[0]);
If the output of your command is longer than BUF_SIZE, you simply read again into buf, overwriting the output you just read. What you really want here is to allocate memory and append each chunk to the end, either with strcat (which requires every chunk to be NUL-terminated first) or, more efficiently, by keeping a pointer to the end of your string.
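As a minimal sketch of the "allocate and append" idea, the helper below reads from a descriptor until EOF, grows a heap buffer as needed, and NUL-terminates the result. The name read_all and the initial capacity of 256 are assumptions made for illustration; they do not come from the pastebin code.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read everything from fd until EOF into a heap
 * buffer that grows as needed. The result is always NUL-terminated.
 * Returns NULL on error; the caller must free() the buffer. */
char *read_all(int fd, size_t *len_out)
{
    size_t cap = 256, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always keep one byte spare for the terminating NUL. */
        if (len + 1 >= cap) {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        ssize_t n = read(fd, buf + len, cap - len - 1);
        if (n > 0) {
            len += (size_t)n;       /* append, do not overwrite */
        } else if (n == 0) {
            break;                  /* EOF: the writer closed its end */
        } else if (errno != EINTR) {
            free(buf);              /* real error */
            return NULL;
        }
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

In expCmdOutput this could replace the read loop, for example `char *out = read_all(fd[0], &n);` followed by copying out into the expanded line and calling free(out). The write end of the pipe must be closed in the parent first, otherwise read never reports EOF.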
Related
I discovered the function read(), but I don't understand everything.
Here is my code:
#include <unistd.h>
#include <stdio.h>
int main(void)
{
char array[10];
int ret;
printf("read : ");
fflush(stdout);
array[sizeof(array) - 1] = '\0';
ret = read(STDIN_FILENO, array, sizeof(array) - 1);
printf("array = %s\n", array);
printf("characters read = %d\n", ret);
//getchar();
return (0);
}
Here is an example of the running program :
$> ./a.out
read : hi guys how are you
array = hi guys h
characters read = 9
$> ow are you
zsh: command not found: ow
$>
Why is it launching a shell command after the end of the program?
I noticed that if I uncomment the getchar() line, this strange behavior disappears. I'd like to understand what is going on, if someone has an idea :)
Your call to read is reading in the first 9 characters of what you've typed. Anything else will be left in the input buffer so that when your program exits, your shell will read it instead.
You should check the return value of read so you know how much has been read as it's not guaranteed that it'll be the amount you ask for and also the value returned is used to indicate an error.
The string read in won't be null-terminated either, so you also should use the return value (if positive) to put the NUL character in so that your string is valid.
If you want to read in the whole line, you'll need to put in a loop and identify when there is an end of line character (most likely '\n').
You typed about 20 characters, but you only read 9 characters with read(). Everything after that was left in the terminal driver's input buffer. So when the shell called read() after the program exited, it got the rest of the line, and tried to execute it as a command.
To prevent this, you should keep reading until you get to the end of the line.
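One way to "keep reading until the end of the line" with plain read() is sketched below. The helper name read_line_fd is hypothetical. It stores as much of the line as fits, NUL-terminates it using the count it tracked, and reads and discards the remainder of the line so nothing is left over for the shell. Reading one byte per call is slow but keeps the sketch simple.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read one line from fd using read().
 * Stores at most cap - 1 bytes in buf (without the '\n') and always
 * NUL-terminates. Bytes of the line that do not fit are read and
 * discarded, so they are not left behind for the next reader.
 * Returns the number of bytes stored, or -1 on error or if EOF is
 * reached before any byte was read. */
ssize_t read_line_fd(int fd, char *buf, size_t cap)
{
    size_t len = 0;
    int got_any = 0;
    char c;

    if (cap == 0)
        return -1;

    for (;;) {
        ssize_t n = read(fd, &c, 1);
        if (n == 0)
            break;                  /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            return -1;
        }
        got_any = 1;
        if (c == '\n')
            break;                  /* end of line reached */
        if (len + 1 < cap)
            buf[len++] = c;         /* keep it only if there is room */
    }
    buf[len] = '\0';
    return got_any ? (ssize_t)len : -1;
}
```

With `char array[10];`, a call such as `read_line_fd(STDIN_FILENO, array, sizeof array)` would still store only "hi guys h", but the rest of the typed line would be consumed instead of being executed by the shell afterwards.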
I have a simple C program with the read function and I don't understand the output.
//code1.c
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
int main()
{
int r;
char c; // In C, char values are stored in 1 byte
r = read ( 0, &c, 1);
// DOC:
//ssize_t read (int filedes, void *buffer, size_t size)
//The read function reads up to size bytes from the file with descriptor filedes, storing the results in the buffer.
//The return value is the number of bytes actually read.
// Here:
// filedes is 0, which is stdin from <stdio.h>
// *buffer is &c : address in memory of char c
// size is 1 meaning it will read only 1 byte
printf ("r = %d\n", r);
return 0;
}
And here is a screenshot of the result:
I ran this program 2 times as shown above and typed "a" for the first try and "aecho hi" for the second try.
How I try to explain the results:
When read is called it sees that stdin is closed and opens it (from my point of view, why? It should just read it. I don't know why it opens it).
I type "aecho hi" in the bash and press enter.
read has priority to process stdin and reads the first byte of "aecho hi" : "a".
I get the confirmation that read has processed 1 byte with the printf.
a.out has finished and is terminated.
Somehow the remaining data in stdin is processed in bash (the father of my program) and goes to stdout which executes it and for some reason the first byte has been deleted by read.
This is all hypothetical and very blurry. Any help understanding what is happening would be very welcome.
When you type at your terminal emulator, it writes your keystrokes to a "file", in this case an in-memory buffer that, thanks to the file system, looks just like any other file that might be on disk.
Every process inherits 3 open file handles from its parent. We are interested in one of them here, standard input. The program executed by the terminal emulator (here, bash), is given as its standard input the in-memory buffer described in the first paragraph.
a.out, when run by bash, also receives this same file as its standard input. Keep this in mind: bash and a.out are reading from the same, already-opened file.
After you run a.out, its read blocks, because its standard input is empty. When you type aecho hi<enter>, the terminal writes these characters to the buffer (<enter> becoming a single linefeed character). a.out only requests one character, so it gets a and leaves the rest of the characters in the file. (Or more precisely, the file pointer is still pointing at the e after a is read.)
After a.out completes, bash tries to read from the same file. Normally, the file is empty (i.e., the file pointer is at the end of the file), so bash blocks waiting for another command. In this case, though, there is input available already: echo hi\n. bash reads this now the same as if you had typed it after a.out completed.
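The sharing described above can be imitated without a terminal. In the sketch below a pipe stands in for the terminal's input buffer, the child plays the role of a.out and the parent plays the role of bash; all of this is an assumption made for illustration, as are the name leftover_demo and the trick of returning the child's byte through its exit status (which only works for byte values other than 255, the value reserved here for failure).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: "typed" is placed in a pipe (it must be short
 * enough to fit in the pipe's buffer). A child reads exactly one byte
 * from the inherited read end and exits; the parent then reads whatever
 * is left from the very same descriptor.
 * Returns the number of leftover bytes stored in rest, or -1 on error. */
ssize_t leftover_demo(const char *typed, size_t typed_len,
                      char *first_out, char *rest, size_t cap)
{
    int fd[2];
    if (cap == 0 || pipe(fd) == -1)
        return -1;

    /* Simulate the user typing a line and pressing enter. */
    if (write(fd[1], typed, typed_len) != (ssize_t)typed_len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        return -1;
    }
    if (pid == 0) {
        /* Child ("a.out"): consume a single byte, like read(0, &c, 1). */
        char c = 0;
        ssize_t n = read(fd[0], &c, 1);
        _exit(n == 1 ? (unsigned char)c : 255);
    }

    /* Parent ("bash"): wait for the child, then read from the same input. */
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) {
        close(fd[0]);
        return -1;
    }
    *first_out = (char)WEXITSTATUS(status);

    ssize_t n = read(fd[0], rest, cap - 1);
    close(fd[0]);
    if (n < 0)
        return -1;
    rest[n] = '\0';
    return n;
}
```

Called with "aecho hi\n", the child receives 'a' and the parent is left with "echo hi\n", which is the line bash went on to execute in the question.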
Check this. As alk suggests stdin and stdout are already open with the program. Now you have to understand, once you type:
aecho hi
and hit return the stdin buffer is filled with all those letters (and space) - and will continue to be as long as you don't flush it. When the program exits, the stdin buffer is still full, and your terminal automatically handles a write into stdin by echoing it to stdout - this is what you're seeing at the end - your shell reading stdin.
Now as you point out, your code "presses return" for you so to speak - in the first execution adding an empty shell line, and in the second executing echo hi. But you must remember, you pressed return, so "\n" is in the buffer! To be explicit, you in fact typed:
aecho hi\n
Once your program exits the shell reads the remaining characters in the buffer, including the return, and that's what you see!
EDIT: GDB was not the issue. Bugs in my code created the behaviour.
I am wondering how GDB's input works.
For example I created the following small c program:
#include <stdlib.h>
#include <stdio.h>
int main(){
setbuf(stdout,NULL);
printf("first:\n");
char *inp;
size_t k = 0;
getline(&inp, &k, stdin);
printf("%s",inp);
free(inp);
// read buffer overflow
printf("second:\n");
char buf[0x101];
read(fileno(stdin),buf,0x100);
printf("%s",buf);
printf("finished\n");
}
It reads two times a string from stdin and prints the echo of it.
To automate this reading I created following python code:
python3 -c 'import sys,time; l1 = b"aaaa\n"; l2 = b"bbbb\n"; sys.stdout.buffer.write(l1); sys.stdout.buffer.flush(); time.sleep(1); sys.stdout.buffer.write(l2); sys.stdout.buffer.flush();'
Running the C program on its own works fine. Running it with the Python snippet as input works fine, too:
python-snippet-above | ./c-program
Running gdb without an input file, typing the strings when requested, seems also fine.
But when it comes to using an inputfile in gdb, I am afraid I am using the debugger wrongly.
Through tutorials and stackoverflow posts I know that gdb can take input via file.
So I tried:
$ python-snippet > in
$ gdb ./c-program
run < in
I expected that gdb would use for the first read the first line of the file in and for the second read the second line of in.
in looks like (due to the python code):
aaaa
bbbb
But instead gdb prints:
(gdb) r < in
Starting program: /home/user/tmp/stackoverflow/test < in
first:
aaaa
second:
finished
[Inferior 1 (process 24635) exited with code 011]
Observing the variable buf after read(fileno(stdin),buf,0x100) shows me:
(gdb) print buf
$1 = 0x0
So I assume that my second input (bbbb) gets lost. How can I use multiple inputs inside gdb?
Thanks for reading :)
I am wondering how GDB's input works.
Your problem doesn't appear to have anything to do with GDB, and everything to do with bugs in your program itself.
First, if you run the program outside of GDB in the same way, namely:
./a.out < in
you should see the same behavior that you see in GDB. Here is what I see:
./a.out < in
first:
aaaa
second:
p ��finished
So what are the bugs?
The first one: from "man getline"
getline() reads an entire line from stream, storing the address
of the buffer containing the text into *lineptr.
If *lineptr is NULL, then getline() will allocate a buffer
for storing the line, which should be freed by the user program.
You did not set inp to NULL, nor to an allocated buffer. If inp didn't happen to be NULL, you would have gotten heap corruption.
Second bug: you don't check return value from read. If you did, you'd discover that it returns 0, and therefore your printf("%s",buf); prints uninitialized values (which are visible in my terminal as ��).
Third bug: you are expecting read to return the second line. But you used getline on stdin before, and when reading from a file, stdin will use full buffering. Since your input is small, the first getline tries to read BUFSIZ worth of data, and reads (buffers) all of it. A subsequent read (naturally) returns 0 since you've already reached end of file.
You have setbuf(stdout,NULL);. Did you mean to disable buffering on stdin instead?
Fourth bug: read does not NUL-terminate the string, you have to do that yourself, before you can call printf("%s", ...) on it.
With the bugs corrected, I get the expected output:
first:
aaaa
second:
bbbb
finished
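The answer does not show the corrected program, so the following is only a sketch of one way the four fixes could look, not the answerer's actual code. The function name read_line_then_chunk is hypothetical. It assumes that an unbuffered stream makes getline() consume no more than the first line from the descriptor, which holds for common C libraries but is not something the C standard promises; staying with getline() for both lines avoids that assumption entirely.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the first line with getline() and the next
 * chunk with read(), as the program in the question does.
 * On success returns 0; *line1 is heap-allocated (caller frees) and
 * chunk is NUL-terminated. Returns -1 on failure. */
int read_line_then_chunk(FILE *in, char **line1, char *chunk, size_t cap)
{
    if (cap == 0)
        return -1;

    /* Bug 3: disable stdio buffering on the INPUT stream, before any
     * other operation on it, so getline() does not read ahead. */
    if (setvbuf(in, NULL, _IONBF, 0) != 0)
        return -1;

    /* Bug 1: getline() needs *lineptr == NULL (or a malloc'd buffer). */
    *line1 = NULL;
    size_t n = 0;
    if (getline(line1, &n, in) == -1) {
        free(*line1);
        *line1 = NULL;
        return -1;
    }

    /* Bug 2: check the return value of read(). */
    ssize_t got = read(fileno(in), chunk, cap - 1);
    if (got < 0) {
        free(*line1);
        *line1 = NULL;
        return -1;
    }

    /* Bug 4: read() does not NUL-terminate, so do it by hand. */
    chunk[got] = '\0';
    return 0;
}
```

In main this would be called as `read_line_then_chunk(stdin, &inp, buf, sizeof buf)` before printing inp and buf.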
I need to write a C program (myprogram) which checks output of other programs. It should basically work like this:
./otherprogram | ./myprogram
But I could not find how to read line-by-line from stdout (or the pipe), and then write all this to stdout.
One program's stdout becomes the next program's stdin. Just read from stdin and you will be fine.
The shell, when it runs myprogram, will connect everything for you.
BTW, here is the bash code responsible:
http://git.savannah.gnu.org/cgit/bash.git/tree/execute_cmd.c
Look for execute_pipeline. No, the code is not easy to follow, but it fully explains it.
Create an executable using:
#include <stdio.h>
int main()
{
char line[BUFSIZ];
while ( fgets(line, BUFSIZ, stdin) != NULL )
{
// Do something with the line of text
}
}
Then you can pipe the output of any program to it, read the contents line by line, do something with each line of text.
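Since the question also asks to write everything back to stdout, here is a small sketch of the same fgets loop with the output step filled in. The function name copy_lines and the choice to count only newline-terminated lines are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy `in` to `out` line by line.
 * Returns the number of newline-terminated lines copied, or -1 on an
 * I/O error. A final line without '\n' is copied but not counted. */
long copy_lines(FILE *in, FILE *out)
{
    char line[BUFSIZ];
    long count = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        /* Inspect or check the line here before passing it on. */
        if (fputs(line, out) == EOF)
            return -1;

        /* A line longer than the buffer arrives in several pieces;
         * count it once, when the piece ending in '\n' is seen. */
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n')
            count++;
    }
    return ferror(in) ? -1 : count;
}
```

A main that just calls `copy_lines(stdin, stdout)` can then be used as `./otherprogram | ./myprogram`.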
I am trying to pass File1.txt ">" File2.txt as terminal arguments to my program in order to override the cat command. But for some reason, the program is not working. Although argc should be 4 in the case described above, the corresponding condition in the program never becomes true. Here is the code:
int main(int argc, char *argv[])
{
int readbytes,fp;
char buf[1024];
if(argc==2)
{
fp=open(argv[1],O_RDONLY);
dup2(0,fp);
close(fp);
readbytes=read(STDIN_FILENO,buf,1024);
write(STDOUT_FILENO,buf,readbytes);
}
if(argc==4)
{
printf("inside4");
fp=open(argv[1],O_RDONLY);
dup2(fp,0);
close(fp);
fp=open(argv[3],O_WRONLY|O_CREAT|O_TRUNC,S_IRWXU);
dup2(fp,1);
close(fp);
readbytes=read(STDIN_FILENO,buf,1024);
//printf("%c",buf);
write(STDOUT_FILENO,buf,readbytes);
}
return 0;
}
I couldn't find a solution to this issue, so I leave it to the experts now. What is the reason for this problem?
NOTE:
For some reason when I send ./prog File1.txt > File2.txt to program, argc==2 condition is selected, however argc is 4. Why is that?
Regards
This is likely being caused by how you are running your program. Typing
./myProg foo > bar
will instruct most shells to run myProg with argument foo and save whatever is printed to stdout in a file named bar. To pass foo, >, and bar as command line arguments, use
./myProg foo \> bar
or
./myProg 'foo' '>' 'bar'
Side note: Because piping output into a file using > is part of the shell, not a program like cat itself, you likely shouldn't have to worry about it. Just write to stdout and the shell will handle the rest.
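A quick way to see what the shell actually hands to the program is to print the argument vector. The helper below is a hypothetical debugging aid, not part of the original program.

```c
#include <stdio.h>

/* Hypothetical debugging aid: write argc and every argv entry to `out`,
 * one per line. */
void dump_args(FILE *out, int argc, char *argv[])
{
    fprintf(out, "argc = %d\n", argc);
    for (int i = 0; i < argc; i++)
        fprintf(out, "argv[%d] = \"%s\"\n", i, argv[i]);
}
```

Calling `dump_args(stderr, argc, argv)` at the top of main keeps the report visible on the terminal even when stdout is redirected. With `./myProg foo > bar` the program is started with only `./myProg` and `foo`, so argc is 2; with `./myProg foo '>' bar` it receives all four strings.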
What do you mean by "the condition in the program is not getting true"? Are you saying that you don't see "inside4" printed to the terminal? There are a few things to consider. First, you do no error checking, so we have to assume that all of your open and dup2 calls succeed. Second, I would expect that "inside4" is getting printed at the end of the output file. The reason is simply that printf does not write anything immediately. It just stores the string "inside4" in a buffer, and that buffer is not written out until your program exits, by which time the underlying file descriptor has been changed to refer to the output file.
The simplest fix is to append a newline to the output and write printf( "inside4\n" ); When stdout is attached to a terminal it is line-buffered, so printing a newline causes the internal buffer to be flushed. You can also explicitly flush the buffer after calling printf by calling fflush(stdout).
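To make the ordering issue concrete, here is a sketch of a redirection helper that flushes stdio's buffer before swapping the descriptor, so earlier printf output still reaches the old destination. The name redirect_stdout and the 0644 permission bits are assumptions; the original program uses S_IRWXU.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: make file descriptor 1 refer to `path`.
 * Pending stdio output is flushed first, so text printed before the
 * call goes to the old destination rather than into the file.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int redirect_stdout(const char *path)
{
    /* printf only fills a buffer; push it out before fd 1 changes. */
    if (fflush(stdout) == EOF)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    if (dup2(fd, STDOUT_FILENO) == -1) {
        close(fd);
        return -1;
    }
    close(fd);  /* fd 1 now refers to the file; the extra copy is not needed */
    return 0;
}
```

In the argc==4 branch, `printf("inside4"); redirect_stdout(argv[3]);` would then show "inside4" on the terminal, and checking the return value also adds the error checking the program currently lacks.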