I want to delete the last N bytes from a file in Go.
Actually, this is already implemented in the os.Truncate() function. But this function takes the new size, so to use it you first have to get the size of the file. For that, you may use os.Stat().
Wrapping it into a function:
func truncateFile(name string, bytesToRemove int64) error {
fi, err := os.Stat(name)
if err != nil {
return err
}
return os.Truncate(name, fi.Size()-bytesToRemove)
}
Using it to remove the last 5000 bytes:
if err := truncateFile("C:\\Test.zip", 5000); err != nil {
fmt.Println("Error:", err)
}
Another option is the File.Truncate() method. If we have an os.File, we can also use File.Stat() to get its size.
This is what it would look like:
func truncateFile(name string, bytesToRemove int64) error {
f, err := os.OpenFile(name, os.O_RDWR, 0644)
if err != nil {
return err
}
defer f.Close()
fi, err := f.Stat()
if err != nil {
return err
}
return f.Truncate(fi.Size() - bytesToRemove)
}
Using it is the same. This may be preferable if we're already working on the file (we have it open) and we have to truncate it. But in that case you'd want to pass the *os.File instead of its name to truncateFile().
Note: if you try to remove more bytes than the file currently has, truncateFile() will return an error.
I'm having trouble overwriting a file's content with zeros. The problem is that the very last byte of the original file remains, even when I exceed its size by 100 bytes. Does anyone have an idea what I'm missing?
func (h PostKey) ServeHTTP(w http.ResponseWriter, r *http.Request) {
f, err := os.Create("received.dat")
if err != nil {
w.WriteHeader(http.StatusInternalServerError)
return
}
defer f.Close()
_, err = io.Copy(f, r.Body)
if err != nil {
w.WriteHeader(http.StatusInternalServerError)
return
}
// Retrieve filesize
size, _ := f.Seek(0, 1)
zeroFilled := make([]byte, size + 100)
n, err := f.WriteAt(zeroFilled, 0)
if err != nil {
return
}
fmt.Printf("Size: %d\n", size) // prints 13
fmt.Printf("Bytes written: %d\n", n) // prints 113
}
The problem may occur because the data is written to the same file (a shared resource) inside an HTTP handler, and the handler itself may be executed concurrently. You need to lock access to the file while the data is being written and overwritten. A quick solution would be:
import (
"sync"
//... other packages
)
var muFile sync.Mutex
func (h PostKey) ServeHTTP(w http.ResponseWriter, r *http.Request) {
muFile.Lock()
defer muFile.Unlock()
f, err := os.Create("received.dat")
//other statements
//...
}
If your server load is low, the above solution will be fine. But if your server needs to handle a lot of requests concurrently, you need to use a different approach (although the rule is the same: lock access to any shared resource).
I was writing to the file and trying to overwrite it in the same context, so parts of the first write operation were still in memory and not yet written to disk. By calling f.Sync() to flush everything after copying the body's content, I was able to fix the issue.
I am trying to serialize structured data to a file. I looked through some examples and came up with this construction:
func (order Order) Serialize(folder string) {
b := bytes.Buffer{}
e := gob.NewEncoder(&b)
err := e.Encode(order)
if err != nil { panic(err) }
os.MkdirAll(folder, 0777)
file, err := os.Create(folder + order.Id)
if err != nil { panic(err) }
defer file.Close()
writer := bufio.NewWriter(file)
n, err := writer.Write(b.Bytes())
fmt.Println(n)
if err != nil {
panic(err)
}
}
Serialize is a method that serializes its object to a file named after its Id property. I looked through the debugger: the byte buffer contains data before writing, so the object is fully initialized. Even the n variable, which holds the number of bytes written, is more than a thousand, so the file shouldn't be empty at all. The file is created, but it is completely empty. What's wrong?
bufio.Writer (as the package name hints) uses a buffer to cache writes. If you ever use it, you must call Writer.Flush() when you're done writing to it to ensure the buffered data gets written to the underlying io.Writer.
Also note that you can directly write to an os.File, no need to create a buffered writer "around" it. (*os.File implements io.Writer).
Also note that you can create the gob.Encoder directly on top of the os.File, so even the bytes.Buffer is unnecessary.
Also os.MkdirAll() may fail, check its return value.
Also it's better to "concatenate" parts of a file path using filepath.Join() which takes care of extra / missing slashes at the end of folder names.
And last, it would be better to signal the failure of Serialize(), e.g. with an error return value, so the caller has the chance to check whether the operation succeeded and act accordingly.
So Order.Serialize() should look like this:
func (order Order) Serialize(folder string) error {
if err := os.MkdirAll(folder, 0777); err != nil {
return err
}
file, err := os.Create(filepath.Join(folder, order.Id))
if err != nil {
return err
}
defer file.Close()
if err := gob.NewEncoder(file).Encode(order); err != nil {
return err
}
return nil
}
I am trying to read a file from client and then send it to server.
It goes like this: you input send <fileName> in the client program, and then <fileName> is sent to the server. The server reads two things from the client over the TCP connection: first the command send <fileName>, and second the content of the file.
However, sometimes my program randomly includes the file content in the <fileName> string. For example, say I have a text file called xyz.txt whose content is "Hellow world". The server sometimes receives send xyz.txtHellow world. Sometimes it doesn't, and it works just fine.
I think the problem is synchronization or not flushing the reader/writer buffer, but I am not quite sure.
Thanks in advance!
Client code:
func sendFileToServer(fileName string, connection net.Conn) {
fileBuffer := make([]byte, BUFFER_SIZE)
var err error
file, err := os.Open(fileName) // For read access.
lock := make(chan int)
w := bufio.NewWriter(connection)
go func(){
w.Write([]byte("send " + fileName))
w.Flush()
lock <- 1
}()
<-lock
// make a read buffer
r := bufio.NewReader(file)
//read file until there is an error
for err == nil || err != io.EOF {
//read a chunk
n, err := r.Read(fileBuffer)
if err != nil && err != io.EOF {
panic(err)
}
if n == 0 {
break
}
// write a chunk
if _, err := w.Write(fileBuffer[:n]); err != nil {
panic(err)
}
}
file.Close()
connection.Close()
fmt.Println("Finished sending.")
}
Server code: (connectionHandler is a goroutine that is invoked for every TCP connection request from client)
func connectionHandler(connection net.Conn, bufferChan chan []byte, stringChan chan string) {
buffer := make([]byte, 1024)
_, error := connection.Read(buffer)
if error != nil {
fmt.Println("There is an error reading from connection", error.Error())
stringChan<-"failed"
return
}
fmt.Println("command recieved: " + string(buffer))
if("-1"==strings.Trim(string(buffer), "\x00")){
stringChan<-"failed"
return
}
arrayOfCommands := strings.Split(string(buffer)," ")
arrayOfCommands[1] = strings.Replace(arrayOfCommands[1],"\n","",-1)
fileName := strings.Trim(arrayOfCommands[1], "\x00")
if arrayOfCommands[0] == "get" {
fmt.Println("Sending a file " + arrayOfCommands[1])
sendFileToClient(fileName, connection, bufferChan, stringChan)
} else if arrayOfCommands[0] == "send" {
fmt.Println("Getting a file " + arrayOfCommands[1])
getFileFromClient(fileName, connection, bufferChan, stringChan)
} else {
_, error = connection.Write([]byte("bad command"))
}
fmt.Println("connectionHandler finished")
}
func getFileFromClient(fileName string, connection net.Conn,bufferChan chan []byte, stringChan chan string) { //put the file in memory
stringChan<-"send"
fileBuffer := make([]byte, BUFFER_SIZE)
var err error
r := bufio.NewReader(connection)
for err == nil || err != io.EOF {
//read a chunk
n, err := r.Read(fileBuffer)
if err != nil && err != io.EOF {
panic(err)
}
if n == 0 {
break
}
bufferChan<-fileBuffer[:n]
stringChan<-fileName
}
connection.Close()
return
}
TCP is a stream protocol. It doesn't have messages. The network is (within some limits we don't need to concern ourselves with) free to send your data one byte at a time or everything at once. And even if you get lucky and the network sends your data in packets the way you want, there's nothing that prevents the receiving side from concatenating the packets into one buffer.
In other words: nothing makes each Read call return exactly as many bytes as you wrote with a specific Write call. Sometimes you get lucky; sometimes, as you noticed, you don't. The only guarantee you have is that, if there are no errors, the reads you do from the stream will together return all the bytes you wrote, in order.
You need to define a proper protocol, for example one that sends the length of each message before the message itself, or one that ends each message with a delimiter.
This is not related to Go. Every programming language will behave this way.
I am new to Go and I am trying to write a simple script that reads a file line by line. I also want to save the progress (i.e. the last line number that was read) on the filesystem somewhere so that if the same file was given as the input to the script again, it starts reading the file from the line where it left off. Following is what I have started off with.
package main
// Package Imports
import (
"bufio"
"flag"
"fmt"
"log"
"os"
)
// Variable Declaration
var (
ConfigFile = flag.String("configfile", "../config.json", "Path to json configuration file.")
)
// The main function that reads the file and parses the log entries
func main() {
flag.Parse()
settings := NewConfig(*ConfigFile)
inputFile, err := os.Open(settings.Source)
if err != nil {
log.Fatal(err)
}
defer inputFile.Close()
scanner := bufio.NewScanner(inputFile)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
}
// Saves the current progress
func SaveProgress() {
}
// Get the line count from the progress to make sure
func GetCounter() {
}
I could not find any methods that deal with line numbers in the scanner package. I know I can declare an integer, say counter := 0, and increment it each time a line is read with counter++. But the next time, how do I tell the scanner to start from a specific line? For example, if I read up to line 30, how can I make the scanner start reading from line 31 the next time I run the script with the same input file?
Update
One solution I can think of here is to use the counter as I stated above and use an if condition like the following.
scanner := bufio.NewScanner(inputFile)
for scanner.Scan() {
if counter > progress {
fmt.Println(scanner.Text())
}
}
I am pretty sure something like this would work, but it is still going to loop over the lines that we have already read. Please suggest a better way.
If you don't want to read but just skip the lines you read previously, you need to acquire the position where you left off.
The different solutions are presented in the form of a function which takes the input to read from and the start position (byte position) to start reading lines from, e.g.:
func solution(input io.ReadSeeker, start int64) error
A special io.Reader input is used which also implements io.Seeker, the common interface which allows skipping data without having to read them. *os.File implements this, so you are allowed to pass a *File to these functions. Good. The "merged" interface of both io.Reader and io.Seeker is io.ReadSeeker.
If you want a clean start (to start reading from the beginning of the file), simply pass start = 0. If you want to resume a previous processing, pass the byte position where the last processing was stopped/aborted. This position is the value of the pos local variable in the functions (solutions) below.
All the examples below with their testing code can be found on the Go Playground.
1. With bufio.Scanner
bufio.Scanner does not maintain the position, but we can very easily extend it to maintain the position (the read bytes), so when we want to restart next, we can seek to this position.
In order to do this with minimal effort, we can use a new split function which splits the input into tokens (lines). We can use Scanner.Split() to set the splitter function (the logic that decides where the boundaries of tokens/lines are). The default split function is bufio.ScanLines().
Let's take a look at the split function declaration: bufio.SplitFunc
type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)
It returns the number of bytes to advance: advance. Exactly what we need to maintain the file position. So we can create a new split function using the builtin bufio.ScanLines(), so we don't even have to implement its logic, just use the advance return value to maintain position:
func withScanner(input io.ReadSeeker, start int64) error {
fmt.Println("--SCANNER, start:", start)
if _, err := input.Seek(start, io.SeekStart); err != nil {
return err
}
scanner := bufio.NewScanner(input)
pos := start
scanLines := func(data []byte, atEOF bool) (advance int, token []byte, err error) {
advance, token, err = bufio.ScanLines(data, atEOF)
pos += int64(advance)
return
}
scanner.Split(scanLines)
for scanner.Scan() {
fmt.Printf("Pos: %d, Scanned: %s\n", pos, scanner.Text())
}
return scanner.Err()
}
2. With bufio.Reader
In this solution we use the bufio.Reader type instead of the Scanner. bufio.Reader already has a ReadBytes() method which is very similar to the "read a line" functionality if we pass the '\n' byte as the delimiter.
This solution is similar to JimB's, with the addition of handling all valid line terminator sequences and also stripping them off from the read line (it is very rare they are needed); in regular expression notation, it is \r?\n.
func withReader(input io.ReadSeeker, start int64) error {
fmt.Println("--READER, start:", start)
if _, err := input.Seek(start, io.SeekStart); err != nil {
return err
}
r := bufio.NewReader(input)
pos := start
for {
data, err := r.ReadBytes('\n')
pos += int64(len(data))
if err == nil || err == io.EOF {
if len(data) > 0 && data[len(data)-1] == '\n' {
data = data[:len(data)-1]
}
if len(data) > 0 && data[len(data)-1] == '\r' {
data = data[:len(data)-1]
}
fmt.Printf("Pos: %d, Read: %s\n", pos, data)
}
if err != nil {
if err != io.EOF {
return err
}
break
}
}
return nil
}
Note: If the content ends with an empty line (line terminator), this solution will process an empty line. If you don't want this, you can simply check it like this:
if len(data) != 0 {
fmt.Printf("Pos: %d, Read: %s\n", pos, data)
} else {
// Last line is empty, omit it
}
Testing the solutions:
The testing code simply uses the content "first\r\nsecond\nthird\nfourth", which contains multiple lines with varying line terminators. We will use strings.NewReader() to obtain an io.ReadSeeker whose source is a string.
The test code first calls withScanner() and withReader() with a start position of 0: a clean start. In the next round we pass a start position of start = 14, which is the position of the third line, so we won't see the first two lines processed (printed): a simulated resume.
func main() {
const content = "first\r\nsecond\nthird\nfourth"
if err := withScanner(strings.NewReader(content), 0); err != nil {
fmt.Println("Scanner error:", err)
}
if err := withReader(strings.NewReader(content), 0); err != nil {
fmt.Println("Reader error:", err)
}
if err := withScanner(strings.NewReader(content), 14); err != nil {
fmt.Println("Scanner error:", err)
}
if err := withReader(strings.NewReader(content), 14); err != nil {
fmt.Println("Reader error:", err)
}
}
Output:
--SCANNER, start: 0
Pos: 7, Scanned: first
Pos: 14, Scanned: second
Pos: 20, Scanned: third
Pos: 26, Scanned: fourth
--READER, start: 0
Pos: 7, Read: first
Pos: 14, Read: second
Pos: 20, Read: third
Pos: 26, Read: fourth
--SCANNER, start: 14
Pos: 20, Scanned: third
Pos: 26, Scanned: fourth
--READER, start: 14
Pos: 20, Read: third
Pos: 26, Read: fourth
Try the solutions and testing code on the Go Playground.
Instead of using a Scanner, use a bufio.Reader, specifically the ReadBytes or ReadString methods. This way you can read up to each line termination, and still receive the full line with line endings.
r := bufio.NewReader(inputFile)
var line []byte
var fPos int64 // or the saved position
for i := 1; ; i++ {
    line, err = r.ReadBytes('\n')
    fmt.Printf("[line:%d pos:%d] %q\n", i, fPos, line)
    if err != nil {
        break
    }
    // Seek expects an int64 offset, so track the position as int64.
    fPos += int64(len(line))
}
if err != io.EOF {
    log.Fatal(err)
}
You can store the combination of file position and line number however you choose, and the next time you start, you use inputFile.Seek(fPos, io.SeekStart) to move to where you left off.
If you want to use Scanner, you have to go through the beginning of the file until you have passed GetCounter() end-of-line symbols.
scanner := bufio.NewScanner(inputFile)
// skip first GetCounter() lines
for i := 0; i < GetCounter(); i++ {
scanner.Scan()
}
for scanner.Scan() {
fmt.Println(scanner.Text())
}
Alternatively, you could store the offset instead of the line number in the counter. But remember that the termination token is stripped when using Scanner, and for a new line the token is \r?\n (regexp notation), so it isn't clear whether you should add 1 or 2 to the text length:
// Not clear how to store offset unless custom SplitFunc provided
inputFile.Seek(GetCounter(), 0)
scanner := bufio.NewScanner(inputFile)
So it is better to use the previous solution or not to use Scanner at all.
There are a lot of words in the other answers, and they're not really reusable code, so here's a reusable function that seeks to the given line number and returns that line and the offset where it starts.
func SeekToLine(r io.Reader, lineNo int) (line []byte, offset int, err error) {
s := bufio.NewScanner(r)
var pos int
s.Split(func(data []byte, atEof bool) (advance int, token []byte, err error) {
advance, token, err = bufio.ScanLines(data, atEof)
pos += advance
return advance, token, err
})
for i := 0; i < lineNo; i++ {
offset = pos
if !s.Scan() {
return nil, 0, io.EOF
}
}
// Return the offset recorded before the last Scan: the start of the line.
return s.Bytes(), offset, nil
}