Upload file chunks into MongoDB using mgo/golang

In my case I have logic that should upload large files in chunks. For example, if I have a 10 MB file, I need to send a PUT request with a 1 MB chunk ten times, but mgo (mgo.v2) does not allow opening a GridFS file for writing:
func UploadFileChunk(rw http.ResponseWriter, rq *http.Request) {
	fileid := mux.Vars(rq)["fileid"]
	rq.ParseMultipartForm(10000)
	formFile := rq.MultipartForm.File["file"]

	content, err := formFile[0].Open()
	if err != nil {
		http.Error(rw, err.Error(), http.StatusInternalServerError)
		return
	}
	defer content.Close()

	file, err := db.GridFS("fs").OpenId(bson.ObjectIdHex(fileid))
	if err != nil {
		http.Error(rw, err.Error(), http.StatusInternalServerError)
		return
	}

	data, err := ioutil.ReadAll(content)
	if err != nil {
		http.Error(rw, err.Error(), http.StatusInternalServerError)
		return
	}
	n, _ := file.Write(data)
	file.Close()

	// Write a log type message
	fmt.Printf("%d bytes written to the MongoDB instance\n", n)
}
So I want to write a new chunk each time, but 1) mgo does not allow opening a GridFS file for writing, and 2) I am not sure whether this approach is a good one.
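(The question is left open here, but for reference, a minimal sketch of one possible workaround: GridFS files in mgo can only be written while they are being created, not reopened with OpenId, so one option is to Create the file on the first chunk, keep the open *mgo.GridFile in memory between requests, Write each chunk to it, and Close it after the last chunk. The uploadID key and the "last chunk" flag below are assumptions for illustration, not part of the question.)

// Sketch only: assumes a single server process. The open *mgo.GridFile is
// kept in memory between chunk requests because GridFS files in mgo are
// write-once (writable only via Create, not via OpenId).
import (
	"sync"

	mgo "gopkg.in/mgo.v2"
)

var (
	mu      sync.Mutex
	uploads = map[string]*mgo.GridFile{} // hypothetical uploadID -> open file
)

func writeChunk(db *mgo.Database, uploadID, name string, chunk []byte, last bool) error {
	mu.Lock()
	defer mu.Unlock()

	gf, ok := uploads[uploadID]
	if !ok {
		var err error
		gf, err = db.GridFS("fs").Create(name) // first chunk: create the file
		if err != nil {
			return err
		}
		uploads[uploadID] = gf
	}
	if _, err := gf.Write(chunk); err != nil {
		return err
	}
	if last {
		delete(uploads, uploadID)
		return gf.Close() // finalizes the GridFS file document
	}
	return nil
}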

Related

How to use NaCl to sign a large file?

Given the sign capability from the Go NaCl library (https://github.com/golang/crypto/tree/master/nacl/sign), how do you sign a file, especially a very large file of more than 1 GB? Most internet search results are about signing a slice or a small array of bytes.
I can think of 2 ways:
Loop through the file and stream it in a block manner (e.g. 16k each time), then feed each block into the sign function. The streamed outputs are concatenated into a signature certificate. Verification is done in reverse.
Use SHA(X) to generate the shasum of the file and then sign the shasum output.
For signing very large files (multiple gigabytes and up), the problem with a standard signing function is usually runtime and fragility: for very large files (or just slow disks) it could take hours or more just to read the full file serially from start to end.
In such cases, you want a way to process the file in parallel. A common way to do this that is suitable for cryptographic signatures is a Merkle tree hash. It allows you to split the large file into smaller chunks, hash them in parallel (producing "leaf hashes"), and then further hash those hashes in a tree structure to produce a root hash which represents the full file.
Once you have calculated this Merkle tree root hash, you can sign this root hash. It then becomes possible to use the signed Merkle tree root hash to verify all of the file chunks in parallel, as well as verifying their order (based on the positions of the leaf hashes in the tree structure).
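For illustration only, here is a minimal sketch of computing such a Merkle root over fixed-size chunks with SHA-256. The chunk size and the handling of an odd leaf are arbitrary choices, not part of any standard, and the leaves are hashed serially here for brevity; each leaf could just as well be hashed in its own goroutine.

package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"log"
	"os"
)

const chunkSize = 1 << 20 // 1 MiB leaves (an arbitrary choice)

func merkleRoot(r io.Reader) ([32]byte, error) {
	var leaves [][32]byte
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			leaves = append(leaves, sha256.Sum256(buf[:n])) // leaf hash
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			break
		}
		if err != nil {
			return [32]byte{}, err
		}
	}
	if len(leaves) == 0 {
		return sha256.Sum256(nil), nil
	}
	// Hash pairs of nodes level by level until a single root remains.
	for len(leaves) > 1 {
		var next [][32]byte
		for i := 0; i < len(leaves); i += 2 {
			if i+1 == len(leaves) {
				next = append(next, leaves[i]) // odd node carried up unchanged
				continue
			}
			pair := make([]byte, 0, 64)
			pair = append(pair, leaves[i][:]...)
			pair = append(pair, leaves[i+1][:]...)
			next = append(next, sha256.Sum256(pair))
		}
		leaves = next
	}
	return leaves[0], nil
}

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: merkle <file>")
	}
	f, err := os.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	root, err := merkleRoot(f)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%x\n", root) // this root hash is what would then be signed
}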
The problem with NaCl is that you need to put the whole message into RAM, as per godoc:
Messages should be small because:
1. The whole message needs to be held in memory to be processed.
2. Using large messages pressures implementations on small machines to process plaintext without verifying the signature. This is very dangerous, and this API discourages it, but a protocol that uses excessive message sizes might present some implementations with no other choice.
3. Performance may be improved by working with messages that fit into data caches.
Thus large amounts of data should be chunked so that each message is small.
However, there are various other methods. Most of them essentially do what you described as your first way: you copy the file contents into an io.Writer that calculates the hash sum as the data streams through - this is the most efficient approach.
The code below is pretty hacked together, but you should get the picture.
I achieved an average throughput of 315 MB/s with it.
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"flag"
	"fmt"
	"io"
	"log"
	"math/big"
	"os"
	"time"
)

var filename = flag.String("file", "", "file to sign")

func main() {
	flag.Parse()
	if *filename == "" {
		log.Fatal("file can not be empty")
	}

	f, err := os.Open(*filename)
	if err != nil {
		log.Fatalf("Error opening '%s': %s", *filename, err)
	}
	defer f.Close()

	start := time.Now()
	sum, n, err := hash(f)
	if err != nil {
		log.Fatalf("Error hashing '%s': %s", *filename, err)
	}
	duration := time.Since(start)

	log.Printf("Hashed %s (%d bytes) in %s to %x", *filename, n, duration, sum)
	log.Printf("Average: %.2f MB/s", (float64(n)/1000000)/duration.Seconds())

	r, s, err := sign(sum)
	if err != nil {
		log.Fatalf("Error creating signature: %s", err)
	}
	log.Printf("Signature: (0x%x,0x%x)\n", r, s)
}

func sign(sum []byte) (*big.Int, *big.Int, error) {
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, fmt.Errorf("Error creating private key: %s", err)
	}
	return ecdsa.Sign(rand.Reader, priv, sum)
}

func hash(f *os.File) ([]byte, int64, error) {
	h := sha256.New()

	// This is where the magic happens.
	// We use the efficient io.Copy to feed the contents
	// of the file into the hash function.
	n, err := io.Copy(h, f)
	if err != nil {
		return nil, n, fmt.Errorf("Error creating hash: %s", err)
	}
	return h.Sum(nil), n, nil
}

Trouble overwriting file content

I'm having trouble overwriting a file's content with zeros. The problem is that the very last byte of the original file remains, even when I exceed its size by 100 bytes. Does anyone have an idea what I'm missing?
func (h PostKey) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	f, err := os.Create("received.dat")
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		return
	}
	defer f.Close()

	_, err = io.Copy(f, r.Body)
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		return
	}

	// Retrieve filesize
	size, _ := f.Seek(0, 1)
	zeroFilled := make([]byte, size+100)
	n, err := f.WriteAt(zeroFilled, 0)
	if err != nil {
		return
	}

	fmt.Printf("Size: %d\n", size)       // prints 13
	fmt.Printf("Bytes written: %d\n", n) // prints 113
}
The problem may have occurred because the data is written to the same file (a shared resource) inside an HTTP handler, and the handler itself may be executed concurrently. You need to lock access to the file during data serialization (the overwriting process). A quick solution would be:
import (
	"sync"
	//... other packages
)

var muFile sync.Mutex

func (h PostKey) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	muFile.Lock()
	defer muFile.Unlock()

	f, err := os.Create("received.dat")
	//other statements
	//...
}
If your server load is low, the above solution will be fine. But if your server needs to handle a lot of requests concurrently, you need to use a different approach (although the rule is the same: lock access to any shared resource).
I was writing to the file and trying to overwrite it in the same context, and so parts of the first write operation were still in memory and not yet written to the disk. By using f.Sync() to flush everything after copying the body's content, I was able to fix the issue.
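For reference, a minimal sketch of that fix: the relevant part of the handler body from the question, with a Sync call added between the Copy and the WriteAt.

// Same handler as above; Sync flushes the copied data before the file
// is measured and overwritten.
_, err = io.Copy(f, r.Body)
if err != nil {
	w.WriteHeader(http.StatusInternalServerError)
	return
}
if err = f.Sync(); err != nil { // flush before measuring and overwriting
	w.WriteHeader(http.StatusInternalServerError)
	return
}

size, _ := f.Seek(0, io.SeekCurrent) // bytes written so far
zeroFilled := make([]byte, size+100)
if _, err = f.WriteAt(zeroFilled, 0); err != nil {
	return
}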

Go. Writing []byte to file results in zero byte file

I am trying to serialize structured data to a file. I looked through some examples and came up with this construction:
func (order Order) Serialize(folder string) {
	b := bytes.Buffer{}
	e := gob.NewEncoder(&b)
	err := e.Encode(order)
	if err != nil { panic(err) }

	os.MkdirAll(folder, 0777)
	file, err := os.Create(folder + order.Id)
	if err != nil { panic(err) }
	defer file.Close()

	writer := bufio.NewWriter(file)
	n, err := writer.Write(b.Bytes())
	fmt.Println(n)
	if err != nil {
		panic(err)
	}
}
Serialize is a method that serializes its object to a file named after its Id property. I looked through the debugger - the byte buffer contains data before writing, i.e. the object is fully initialized. Even the n variable representing the number of written bytes is more than a thousand, so the file shouldn't be empty at all. The file is created, but it is totally empty. What's wrong?
bufio.Writer (as the package name hints) uses a buffer to cache writes. If you ever use it, you must call Writer.Flush() when you're done writing to it to ensure the buffered data gets written to the underlying io.Writer.
Also note that you can directly write to an os.File, no need to create a buffered writer "around" it. (*os.File implements io.Writer).
Also note that you can create the gob.Encoder directed directly at the os.File, so even the bytes.Buffer is unnecessary.
Also os.MkdirAll() may fail, check its return value.
Also it's better to "concatenate" parts of a file path using filepath.Join() which takes care of extra / missing slashes at the end of folder names.
And last, it would be better to signal the failure of Serialize(), e.g. with an error return value, so the caller has a chance to examine whether the operation succeeded and act accordingly.
So Order.Serialize() should look like this:
func (order Order) Serialize(folder string) error {
	if err := os.MkdirAll(folder, 0777); err != nil {
		return err
	}
	file, err := os.Create(filepath.Join(folder, order.Id))
	if err != nil {
		return err
	}
	defer file.Close()
	if err := gob.NewEncoder(file).Encode(order); err != nil {
		return err
	}
	return nil
}
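A hypothetical caller would then check the returned error rather than ignoring it, for example:

// Hypothetical caller: surface the failure instead of silently losing data.
if err := order.Serialize("orders"); err != nil {
	log.Printf("could not serialize order %s: %v", order.Id, err)
}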

How to verify if file has contents to marshal from ioutil.ReadFile in Go

I am trying to use a file instead of a DB to get a prototype up and running. I have a program that (1) reads existing content from the file to a map, (2) takes JSON POSTs that add content to the map, (3) on exit, writes to the file.
First, the file is not being created. Then I created an empty file. It is not being written to.
I am trying to read the file, determine if there is existing content. If there is not existing content, create a blank map. If there is existing content, unmarshal it into a new map.
func writeDB() {
	eventDBJSON, err := json.Marshal(eventDB)
	if err != nil {
		panic(err)
	}
	err2 := ioutil.WriteFile("/Users/sarah/go/dat.txt", eventDBJSON, 0777)
	if err2 != nil {
		panic(err2)
	}
}

func main() {
	dat, err := ioutil.ReadFile("/Users/sarah/go/dat.txt")
	if err != nil {
		panic(err)
	}
	if dat == nil {
		eventDB = DB{
			events: map[string]event{},
		}
	} else {
		if err2 := json.Unmarshal(dat, &eventDB); err2 != nil {
			panic(err2)
		}
	}

	router := httprouter.New()
	router.POST("/join", JoinEvent)
	router.POST("/create", CreateEvent)

	log.Fatal(http.ListenAndServe(":8080", router))
	defer writeDB()
}
There is no way for the server to ever reach defer writeDB().
http.ListenAndServe blocks, and if it ever does return something, you log.Fatal the result, which exits your app at that point.
You can't intercept all ways an app can exit, getting SIGKILL, machine loss of power, etc.
I'm assuming you really just want to write some code, bounce the server, repeat
If that's the case, then Ctrl-C is good enough.
If you want to write your file on Ctrl-C, look at the signal package.
Also, defer on the last line of a function really has no purpose as defer basically means "do this last".
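For reference, a minimal sketch of the signal-package approach, placed in main before the ListenAndServe call (it assumes the writeDB and router from the question and needs the os and os/signal imports):

// Sketch: flush the map to disk on Ctrl-C (SIGINT), since the deferred
// writeDB() after ListenAndServe can never run.
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt)
go func() {
	<-c         // wait for Ctrl-C
	writeDB()   // persist eventDB to dat.txt
	os.Exit(0)
}()

log.Fatal(http.ListenAndServe(":8080", router))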
You can use (*os.File).Stat() to get a file's FileInfo, which contains its size:
file, err := os.Open(filepath)
if err != nil {
	// handle error
}

fi, err := file.Stat()
if err != nil {
	// handle error
}

s := fi.Size()

How to transfer multiple files using go

I am trying to write a program in Go which has two parts. One part is the client, which tries to upload multiple pictures to the other part, the server.
The server side should do the following:
1. Get the number of files that will be sent
2. Loop for every file:
3. Get the filename
4. Get the file and save it
5. Go to 3
So far the server side is doing the following:
func getFileFromClient(connection net.Conn) {
	var numberOfPics int
	var err error
	var receivedBytes int64
	var fileName string

	r := bufio.NewReader(connection)

	strNumberOfPics, err := r.ReadString('\n')
	if err != nil {
		fmt.Printf("Error reading: %s\n", err)
		return
	}
	fmt.Printf("Read: %s\n", strNumberOfPics)
	strNumberOfPics = strings.Trim(strNumberOfPics, "\n")
	numberOfPics, err = strconv.Atoi(strNumberOfPics)
	if err != nil {
		fmt.Printf("Error Atoi: %s\n", err)
		panic("Atoi")
	}
	fmt.Printf("Receiving %d pics:\n", numberOfPics)

	for i := 0; i < numberOfPics; i++ {
		// Getting the file name:
		fileName, err = r.ReadString('\n')
		if err != nil {
			fmt.Printf("Error receiving: %s\n", err)
		}
		fmt.Printf("Filename: %s\n", fileName)
		fileName = strings.Trim(fileName, "\n")

		f, err := os.Create(fileName)
		defer f.Close()
		if err != nil {
			fmt.Println("Error creating file")
		}

		receivedBytes, err = io.Copy(f, connection)
		if err != nil {
			panic("Transmission error")
		}
		fmt.Printf("Transmission finished. Received: %d \n", receivedBytes)
	}
}
io.Copy works for just one file and nothing more (because it does not empty the queue, I think). I do not want to reconnect for every file if I do not have to, but I am not sure what I can actually do about that.
Does anyone have any suggestions for an existing package or method that could help? Or example code? Or am I just plain wrong, and is it a bad idea to even try this with Go?
I think it might be enough if the server were able to flush the connection buffer after every read so that no additional data is read and/or copied.
Really looking forward to your help, thanks in advance.
EDIT: Updated code, still not working. I think it might be the bufio.Reader.
func getFileFromClient(connection net.Conn) {
	var numberOfPics int
	var err error
	var receivedBytes int64
	var fileName string

	r := bufio.NewReader(connection)

	strNumberOfPics, err := r.ReadString('\n')
	if err != nil {
		fmt.Printf("Error reading: %s\n", err)
		return
	}
	strNumberOfPics = strings.Trim(strNumberOfPics, "\n")
	numberOfPics, err = strconv.Atoi(strNumberOfPics)
	if err != nil {
		fmt.Printf("Error Atoi: %s\n", err)
		panic("Atoi")
	}
	fmt.Printf("Receiving %d pics:\n", numberOfPics)

	for i := 0; i < numberOfPics; i++ {
		// Getting the file name:
		fileName, err = r.ReadString('\n')
		if err != nil {
			fmt.Printf("Error receiving: %s\n", err)
		}
		fileName = strings.Trim(fileName, "\n")
		fmt.Printf("Filename: %s\n", fileName)

		f, err := os.Create(fileName)
		defer f.Close()
		if err != nil {
			fmt.Println("Error creating file")
		}

		// Get the file size
		strFileSize, err := r.ReadString('\n')
		if err != nil {
			fmt.Printf("Read size error %s\n", err)
			panic("Read size")
		}
		strFileSize = strings.Trim(strFileSize, "\n")
		fileSize, err := strconv.Atoi(strFileSize)
		if err != nil {
			fmt.Printf("Error size Atoi: %s\n", err)
			panic("size Atoi")
		}
		fmt.Printf("Size of pic: %d\n", fileSize)

		receivedBytes, err = io.CopyN(f, connection, int64(fileSize))
		if err != nil {
			fmt.Printf("Transmission error: %s\n", err)
			panic("Transmission error")
		}
		fmt.Printf("Transmission finished. Received: %d \n", receivedBytes)
	}
}
EDIT 2: I did not get this solution to work. I am pretty sure it is because I used bufio. I did, however, get it to work by transmitting a single zip file with io.Copy. Another solution that worked was to transmit a zip file over HTTP. If you are stuck trying something similar and need help, feel free to send me a message. Thanks to all of you for your help.
Keeping your implementation so far, the thing you're missing is that io.Copy() reads from source until it finds an EOF, so it will read all the remaining images in one go.
Also, the client must send, for each image, its size in bytes (you could do that after sending the name).
In the server, just read the size and then use io.CopyN() to read that exact number of bytes.
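For illustration, a sketch of one iteration of the receive loop, assuming the surrounding function returns an error. One detail worth noting: since the name and size are read through the bufio.Reader r, the file bytes should also be copied from r rather than from the raw connection, otherwise any data r has already buffered is skipped.

// One iteration of the receive loop: read the name and size as lines, then
// copy exactly size bytes from the same buffered reader.
fileName, err := r.ReadString('\n')
if err != nil {
	return err
}
fileName = strings.TrimSuffix(fileName, "\n")

strSize, err := r.ReadString('\n')
if err != nil {
	return err
}
size, err := strconv.ParseInt(strings.TrimSuffix(strSize, "\n"), 10, 64)
if err != nil {
	return err
}

f, err := os.Create(fileName)
if err != nil {
	return err
}
if _, err := io.CopyN(f, r, size); err != nil { // read from r, not from connection
	f.Close()
	return err
}
f.Close()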
EDIT: as a matter of fact, you could also do things the way you were doing them and send the images in parallel instead of serially; that would mean opening a new connection for each file transfer and then reading the whole file without needing to send the number of images or their sizes.
In case you want an alternative, a good option would be using good ol' HTTP and multipart requests. There's the built-in package mime/multipart that allows you to do file transfers over HTTP. Of course, that would mean you'd have to rewrite your program.
My suggestion is to zip all the images you want to transfer and then send them as a single multipart POST request. That way you have a standard way of meeting all your acceptance criteria.
You can easily zip multiple files using https://golang.org/pkg/archive/zip/
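A minimal sketch of zipping several files into one in-memory archive with archive/zip; the resulting buffer could then be sent as the body of a single multipart POST. The zipFiles name and the command-line usage are just for illustration.

package main

import (
	"archive/zip"
	"bytes"
	"io"
	"log"
	"os"
	"path/filepath"
)

// zipFiles writes the named files into a single in-memory zip archive.
func zipFiles(paths []string) (*bytes.Buffer, error) {
	buf := &bytes.Buffer{}
	zw := zip.NewWriter(buf)
	for _, p := range paths {
		f, err := os.Open(p)
		if err != nil {
			return nil, err
		}
		w, err := zw.Create(filepath.Base(p)) // one entry per file
		if err != nil {
			f.Close()
			return nil, err
		}
		if _, err := io.Copy(w, f); err != nil {
			f.Close()
			return nil, err
		}
		f.Close()
	}
	if err := zw.Close(); err != nil { // finalizes the central directory
		return nil, err
	}
	return buf, nil
}

func main() {
	buf, err := zipFiles(os.Args[1:])
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("zipped %d bytes", buf.Len())
}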
