I am running an external process via exec.Command() and I want the stdout from the command to be printed as well as written to a file, in real time (similar to using tee on the command line).
I can achieve this with a scanner and a writer:
cmd := exec.Command("mycmd")
cmdStdOut, _ := cmd.StdoutPipe()
s := bufio.NewScanner(cmdStdOut)
f, _ := os.Create("stdout.log")
w := bufio.NewWriter(f)
go func() {
    for s.Scan() {
        t := s.Text()
        fmt.Println(t)
        fmt.Fprintln(w, t) // Scan strips the trailing newline, so write it back
        w.Flush()
    }
}()
Is there a more idiomatic way to do this that avoids the explicit Scan and Flush loop?
Assign an io.MultiWriter to the command's stdout that writes to a file and to a pipe. You can then use the pipe's read end to follow the output.
This example behaves similarly to the tee tool:
package main

import (
    "io"
    "os"
    "os/exec"
)

func main() {
    f, _ := os.Create("stdout.log") // error handling omitted for brevity

    r, w := io.Pipe()

    cmd := exec.Command("mycmd")
    cmd.Stdout = io.MultiWriter(w, f)

    // do something with the output while cmd is running by reading from r
    done := make(chan struct{})
    go func() {
        io.Copy(os.Stdout, r)
        close(done)
    }()

    cmd.Run()
    w.Close() // signal EOF to the reader
    <-done    // wait until the reader has drained the pipe
    f.Close()
}
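If you don't need to hook into the output from your own code at all, the pipe can be dropped entirely; a minimal sketch of the same idea, writing straight to both destinations (assuming f is an open *os.File as above):

cmd := exec.Command("mycmd")
cmd.Stdout = io.MultiWriter(os.Stdout, f)
cmd.Run()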
Alternative with StdoutPipe:
package main

import (
    "io"
    "os"
    "os/exec"
)

func main() {
    f, _ := os.Create("stdout.log") // error handling omitted for brevity
    defer f.Close()

    cmd := exec.Command("date")
    stdout, _ := cmd.StdoutPipe()
    // Note the caveat below: Wait (called by Run) closes the pipe as soon
    // as the command exits, so the copy may be cut short.
    go io.Copy(io.MultiWriter(f, os.Stdout), stdout)
    cmd.Run()
}
Errors are ignored for brevity. As stated in other answers, you can use an io.MultiWriter in an io.Copy, but when you are dealing with the stdout of an exec.Cmd, you need to be aware of Wait closing the pipes as soon as the command terminates, as stated by the documentation (https://golang.org/pkg/os/exec/#Cmd.StdoutPipe):
Wait will close the pipe after seeing the command exit, so most callers need not close the pipe themselves. It is thus incorrect to call Wait before all reads from the pipe have completed.
Ignoring this can lead to portions of the output not being read, and therefore lost. Instead of Run, use Start and Wait, e.g.:
package main

import (
    "io"
    "os"
    "os/exec"
)

func main() {
    cmd := exec.Command("date")
    stdout, _ := cmd.StdoutPipe()
    f, _ := os.Create("stdout.log")

    cmd.Start()
    // Read everything from the pipe before calling Wait.
    io.Copy(io.MultiWriter(f, os.Stdout), stdout)
    cmd.Wait()
}
This ensures that everything is read from stdout before the pipes are closed.
Related
I wrote a test program to analyse the behaviour of a piece of code that caused a bug today, hoping to understand it better. The opposite happened.
This is the test program. It should execute a test command and stream the command's output to stdout.
package main

import (
    "bufio"
    "fmt"
    "io"
    "os/exec"
)

func main() {
    cmd1 := exec.Command("./testcommands/testcommand.sh")
    execCmd(cmd1)
    cmd2 := exec.Command("./testcommands/testcommand")
    execCmd(cmd2)
}

func execCmd(cmd *exec.Cmd) {
    stderr, _ := cmd.StderrPipe()
    stdout, _ := cmd.StdoutPipe()
    multi := io.MultiReader(stdout, stderr)
    scanner := bufio.NewScanner(multi)
    cmd.Start()
    for scanner.Scan() {
        m := scanner.Text()
        fmt.Println(m)
    }
    cmd.Wait()
}
The two test commands do basically the same thing. One is implemented in bash:
#!/bin/bash
for i in `seq 1 10` ; do
    echo "run $i"
    sleep 1
done
The other one in C:
#include <stdio.h>
#include <unistd.h>

int main() {
    int i;
    for (i = 1; i <= 10; i++) {
        printf("run %d\n", i);
        sleep(1);
    }
    return 0;
}
The output of the shell script is streamed (one line per second); however, the output of the C program only arrives after the program has finished completely (10 lines at once after 10 seconds).
This goes way over my head. I'm not even sure if this is working as intended and I'm just missing something, or if I should open a bug report; and if so, I'm not even sure whether it would be against bash, Go or C. Or maybe it's some Linux thing I don't know about.
When stdout (to which printf writes) is connected to a terminal, it is line-buffered: output is flushed (actually written) on each newline.
But when stdout is not connected to a terminal, for example because it is redirected to a file or connected to a pipe, it is fully buffered: output is only written when the buffer becomes full (unlikely in your small example) or when it is explicitly flushed (for example with fflush(stdout)).
The Go exec functionality creates pipes for the program's input and output (that is exactly what StdoutPipe does), so the C program falls into the fully buffered case.
To solve your problem, your C program needs to flush stdout after each printf call; the buffering happens inside the child process's C library, so it cannot be changed from the Go side:

printf("run %d\n", i);
fflush(stdout);

Alternatively, the C program can switch stdout to line buffering once at startup with setvbuf(stdout, NULL, _IOLBF, 0).
This program successfully runs even though it's writing to a deleted file. Why does this work?
package main

import (
    "fmt"
    "os"
)

func main() {
    const path = "test.txt"

    f, err := os.Create(path) // Create file
    if err != nil {
        panic(err)
    }

    err = os.Remove(path) // Delete file
    if err != nil {
        panic(err)
    }

    _, err = f.WriteString("test") // Write to deleted file
    if err != nil {
        panic(err)
    }

    err = f.Close()
    if err != nil {
        panic(err)
    }

    fmt.Printf("No errors occurred") // test.txt doesn't exist anymore
}
On Unix-like systems, when a process opens a file it gets a file descriptor, which points to an entry in the process's file table, which in turn refers to an inode structure on disk. The inode holds the file's metadata, including the location of its data.
The contents of a directory are just pairs of inode numbers and names.
If you delete a file, you merely remove the link to the inode from the directory. The inode itself continues to exist as long as something still references it (including an open file descriptor in a process), and its data can still be read and written.
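This is easy to observe in Go itself; a minimal sketch (the Seek and read-back are additions for illustration), showing that the data of an unlinked file remains accessible through the open descriptor:

package main

import (
    "fmt"
    "io"
    "os"
)

func main() {
    f, _ := os.Create("test.txt") // error handling omitted
    os.Remove("test.txt")         // removes the directory entry; the inode lives on

    f.WriteString("test") // write to the unlinked file

    f.Seek(0, io.SeekStart)  // rewind the same descriptor
    data, _ := io.ReadAll(f) // and read the data back
    fmt.Printf("read back: %q\n", data)

    f.Close() // last reference gone; the kernel now frees the inode and its data
}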
On Windows this code fails, since Windows does not allow an open file to be deleted:
panic: remove test.txt: The process cannot access the file because it is being used by another process.
goroutine 1 [running]:
main.main()
D:/tmp/main.go:18 +0x1d1
exit status 2
I have the following C program:
#include <stdio.h>
#include <unistd.h>

void readAndEchoAll(void) {
    for (;;) {
        char buf[100];
        ssize_t size = read(STDIN_FILENO, buf, sizeof(buf));
        if (size <= 0) {
            return;
        }
        fwrite(buf, 1, size, stdout);
    }
}

int main(void) {
    puts("Reading and echoing STDIN until first EOF...");
    readAndEchoAll();
    puts("Got first EOF. Now reading and echoing STDIN until second EOF...");
    readAndEchoAll();
    puts("Got second EOF.");
    return 0;
}
When I run it, it works the way I want it to. Here's what it does:
Reading and echoing STDIN until first EOF...
asdf
^Dasdf
Got first EOF. Now reading and echoing STDIN until second EOF...
fdsa
^Dfdsa
Got second EOF.
I'm trying to create an equivalent Haskell program. Here's my attempt:
readAndEchoAll :: IO ()
readAndEchoAll = do
    buf <- getContents
    putStr buf

main :: IO ()
main = do
    putStrLn "Reading and echoing STDIN until first EOF..."
    readAndEchoAll
    putStrLn "Got first EOF. Now reading and echoing STDIN until second EOF..."
    -- ???
    readAndEchoAll
    putStrLn "Got second EOF."
This doesn't work. Here's what it does:
Reading and echoing STDIN until first EOF...
asdf
^Dasdf
Got first EOF. Now reading and echoing STDIN until second EOF...
readtwice.hs: <stdin>: hGetContents: illegal operation (handle is closed)
How do I make this work like the C program? I assume that I need to put some equivalent of clearerr(stdin); where I have -- ???, but I'm not sure what that equivalent is.
Update: Turns out clearerr is a bit of a red herring, as it's exclusive to the standard C API. When using the POSIX API, you can just read again without needing to do anything equivalent to it. So rather than make Haskell do anything extra, I need to make it not do something: not prevent further reads once it sees EOF.
You can't use getContents, because hGetContents (semi-)closes the handle it's passed and getContents calls hGetContents. But there's no problem with reading from a handle again after EOF with most of the other functions from the standard library. Here's a simple but inefficient example of reading all the characters without using getContents:
import Control.Exception
import System.IO.Error

readAll = go [] where
  handler cs err = if isEOFError err
    then return (reverse cs)
    else throwIO err
  go cs = catch (do
    c <- getChar
    go (c:cs))
    (handler cs)

main = do
  all <- readAll
  putStrLn $ "got: " ++ all
  putStrLn "go again, mate"
  all <- readAll
  putStrLn $ "got: " ++ all
If you want better efficiency, there are various functions available for reading lines-at-a-time or other large chunks in the standard library, rather than one character at a time.
A quick search of the GHC source code shows that clearerr() is not used at all there. However, you can open /dev/stdin again, since it looks like you're using Linux or similar. Try this:
stdin2 <- openFile "/dev/stdin" ReadMode
You can also use hDuplicate. See here: Portably opening a handle to stdin many times in a single session
If given a path, I would use this to get the file size:
file, _ := os.Open(path)
fi, _ := file.Stat()
fsize := fi.Size()
But if I'm only given a file descriptor, how can I get the file size?
Is there a way in Go to do the equivalent of this C call:
lseek(fd, 0, SEEK_END)
You create a new *os.File from a file descriptor using the os.NewFile function.
You can do it exactly the same way as in C, using Seek:

offset, err := f.Seek(0, io.SeekEnd) // io.SeekEnd replaces the deprecated os.SEEK_END
But since you have the *os.File already, you can call Stat even if it was derived directly from the file descriptor.
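Putting the two together; a minimal sketch, assuming fd is a valid open file descriptor received from elsewhere (the name passed to os.NewFile is only used in error messages):

f := os.NewFile(uintptr(fd), "unknown")
if f == nil {
    // fd was not a valid file descriptor
}
fi, err := f.Stat()
if err != nil {
    // handle error
}
fmt.Println("size:", fi.Size())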
Try to get the file's Stat:

fileInfo, err := file.Stat()
if err != nil {
    // handle error
}
fmt.Println(fileInfo.Size())
I'm trying to find a good way of reading the first two bytes from a file using Go.
I have some .zip files in my current directory, mixed in with other files.
I would like to loop through all the files in the directory and check if the first two bytes contain the right .zip identifier, namely 50 4B.
What would be a good way to accomplish this using the standard library without having to read the entire file?
Going through the available functions in the io package I managed to find:
func LimitReader(r Reader, n int64) Reader
It seems to fit my description: it reads from a Reader (how do I get a Reader?) but stops after n bytes. Since I'm rather new to Go, I'm not sure how to go about it.
You get the initial reader by opening the file. For 2 bytes, though, I wouldn't use a LimitReader; just reading 2 bytes with io.ReadFull is easier:
r, err := os.Open(file)
if err != nil {
    return err
}
defer r.Close()

var header [2]byte
if _, err := io.ReadFull(r, header[:]); err != nil {
    return err
}
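A complete sketch that loops over the current directory and checks each file for the 50 4B magic (the directory path and output format are illustrative assumptions):

package main

import (
    "fmt"
    "io"
    "os"
)

// looksLikeZip reports whether the file's first two bytes are the ZIP magic 0x50 0x4B ("PK").
func looksLikeZip(path string) (bool, error) {
    f, err := os.Open(path)
    if err != nil {
        return false, err
    }
    defer f.Close()

    var header [2]byte
    if _, err := io.ReadFull(f, header[:]); err != nil {
        // Files shorter than 2 bytes cannot be ZIP archives.
        if err == io.EOF || err == io.ErrUnexpectedEOF {
            return false, nil
        }
        return false, err
    }
    return header[0] == 0x50 && header[1] == 0x4B, nil
}

func main() {
    entries, err := os.ReadDir(".")
    if err != nil {
        panic(err)
    }
    for _, e := range entries {
        if e.IsDir() {
            continue
        }
        if ok, err := looksLikeZip(e.Name()); err == nil && ok {
            fmt.Println(e.Name(), "has the ZIP magic")
        }
    }
}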