[]byte{10} or []byte("\n") vs []byte{92, 110} - arrays

I'm working with github.com/tarm/serial to interface with some serial instrumentation. In development, I'm working with the /dev/ttyp0 and /dev/ptyp0 pair, where my Go process connects to one and I use screen to connect to the other. I've written a function that, combined with serial.Config.ReadTimeout, reads until either ReadTimeout elapses or a given byte sequence is received. That function is:
func readToTermination(s serial.Port, termination []byte, rate time.Duration) []byte {
    var out []byte
    lterm := len(termination)
    for {
        buf := make([]byte, 128)
        n, _ := s.Read(buf)
        out = append(out, buf[:n]...)
        l := len(out)
        if l >= lterm {
            if bytes.Compare(out[l-lterm:], termination) == 0 {
                break
            }
        }
        time.Sleep(rate)
    }
    return out
}
This avoids burning CPU cycles doing nothing, by way of a debounce. When I test with termination = []byte("\n") and type into screen, the break never triggers, because the \n arrives as []byte{92, 110} (two distinct elements, something like screen flushing after each keystroke). On the other hand, if I do something like echo "foo" > /dev/ptyp0, the break triggers correctly. echo implicitly seems to be sending a \n, or the closing of the connection does so. I see \r\n for echo foo and \r\n\r\n for echo "foo\n".
So my question is:
(1) why is there a difference in behavior here?
(2) how do I get the behavior I really am after with a carriage return for the termination? Perhaps I should use EOT instead? A human will never be typing into this directly.
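For concreteness, the two candidates I see would be called like this (assuming I can configure whatever is on the other end to send the chosen terminator; port is an already-open port as above):
crlf := []byte("\r\n") // what echo foo > /dev/ptyp0 was observed to produce
eot := []byte{0x04}    // ASCII EOT, an option since no human types into this

resp := readToTermination(port, crlf, 10*time.Millisecond)
// or
resp = readToTermination(port, eot, 10*time.Millisecond)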


Expanding a temporary slice if more bytes are needed

I'm generating random files programmatically in a directory, at least temporaryFilesTotalSize worth of random data (a bit more, who cares).
Here's my code:
var files []string

for size := int64(0); size < temporaryFilesTotalSize; {
    fileName := random.HexString(12)
    filePath := dir + "/" + fileName
    file, err := os.Create(filePath)
    if err != nil {
        return nil, err
    }
    size += rand.Int63n(1 << 32) // random dimension up to 4GB
    raw := make([]byte, size)
    _, err = rand.Read(raw)
    if err != nil {
        panic(err)
    }
    file.Write(raw)
    file.Close()
    files = append(files, filePath)
}
Is there any way I can avoid that raw := make([]byte, size) allocation in the for loop?
Ideally I'd like to keep a slice on the heap and only grow if a bigger size is required. Any way to do this efficiently?
First of all, you should know that generating random data and writing it to disk is at least an order of magnitude slower than allocating contiguous memory for a buffer. This definitely falls under the "premature optimization" category: eliminating the creation of the buffer inside the loop will not make your code noticeably faster.
Reusing the buffer
But to reuse the buffer, move it outside of the loop, create the biggest needed buffer, and slice it in each iteration to the needed size. It's OK to do this, because we'll overwrite the whole part we need with random data.
Note that I changed the size generation somewhat (this is likely an error in your code: you keep increasing the size of the generated temporary files, because you use the accumulated size when sizing new ones).
Also note that writing a file with contents prepared in a []byte is easiest done using a single call to os.WriteFile().
Something like this:
bigRaw := make([]byte, 1<<32)

for totalSize := int64(0); ; {
    size := rand.Int63n(1 << 32) // random dimension up to 4GB
    totalSize += size
    if totalSize >= temporaryFilesTotalSize {
        break
    }

    raw := bigRaw[:size]
    rand.Read(raw) // It's documented that rand.Read() always returns nil error

    filePath := filepath.Join(dir, random.HexString(12))
    if err := os.WriteFile(filePath, raw, 0666); err != nil {
        panic(err)
    }
    files = append(files, filePath)
}
Solving the task without an intermediate buffer
Since you are writing big files (GBs), allocating that big buffer is not a good idea: running the app will require GBs of RAM! We could improve it with an inner loop to use smaller buffers until we write the expected size, which solves the big memory issue, but increases complexity. Luckily for us, we can solve the task without any buffers, and even with decreased complexity!
We should somehow "channel" the random data from a rand.Rand to the file directly, similar to what io.Copy() does. Note that rand.Rand implements io.Reader, and os.File implements io.ReaderFrom, which suggests we can simply pass a rand.Rand to file.ReadFrom(), and the file itself will read the data to be written directly from the rand.Rand.
This sounds good, but ReadFrom() reads data from the given reader until EOF or an error, and neither will ever happen if we pass a rand.Rand. We do, however, know how many bytes we want read and written: size.
To our "rescue" comes io.LimitReader(): we pass an io.Reader and a size to it, and the returned reader will supply no more than the given number of bytes, and after that will report EOF.
Note that creating our own rand.Rand will also be faster, because the source we pass to it is created with rand.NewSource(), which returns an "unsynchronized" source (not safe for concurrent use) that is in turn faster. The source used by the default/global rand.Rand is synchronized (and so safe for concurrent use, but slower).
Perfect! Let's see this in action:
r := rand.New(rand.NewSource(time.Now().Unix()))

for totalSize := int64(0); ; {
    size := r.Int63n(1 << 32)
    totalSize += size
    if totalSize >= temporaryFilesTotalSize {
        break
    }

    filePath := filepath.Join(dir, random.HexString(12))
    file, err := os.Create(filePath)
    if err != nil {
        return nil, err
    }
    if _, err := file.ReadFrom(io.LimitReader(r, size)); err != nil {
        panic(err)
    }
    if err = file.Close(); err != nil {
        panic(err)
    }
    files = append(files, filePath)
}
Note that if os.File would not implement io.ReaderFrom, we could still use io.Copy(), providing the file as the destination, and a limited reader (used above) as the source.
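For illustration, that fallback would be a one-line change in the loop above (same r, size, and file as in the snippet):
// io.Copy() falls back to a plain read/write loop (with its own small
// internal buffer) when the destination does not implement io.ReaderFrom.
if _, err := io.Copy(file, io.LimitReader(r, size)); err != nil {
    panic(err)
}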
Final note: closing the file (or any resource) is best done using defer, so it'll get called no matter what. Using defer in a loop is a bit tricky though, as deferred functions run at the end of the enclosing function, and not at the end of the loop's iteration. So you may wrap it in a function. For details, see `defer` in the loop - what will be better?
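Sticking with the loop above, the wrapping could look like this (same r, dir, files, and temporaryFilesTotalSize as before; this is only a sketch of the defer-in-a-closure idea):
for totalSize := int64(0); ; {
    size := r.Int63n(1 << 32)
    totalSize += size
    if totalSize >= temporaryFilesTotalSize {
        break
    }
    func() {
        filePath := filepath.Join(dir, random.HexString(12))
        file, err := os.Create(filePath)
        if err != nil {
            panic(err)
        }
        // The deferred Close now runs when this anonymous function
        // returns, i.e. at the end of every loop iteration.
        defer file.Close()
        if _, err := file.ReadFrom(io.LimitReader(r, size)); err != nil {
            panic(err)
        }
        files = append(files, filePath)
    }()
}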

Golang best practices: empty array response or error?

What are best practices, in terms of error handling, for a function that accepts a slice of objects and returns another slice of objects (ideally of the same length as the input) along with an error, as follows:
func ([]interface{}) ([]interface{}, error)
One way is to return an error whenever processing of one of the objects in the slice fails, but then, at the receiving end, the error is of little use unless you discard all the slice elements: it merely tells you that processing of one (or all) of the elements failed. Another way is to return an error only when none of the elements could be processed, but again, I feel this is of little use. One more way is to not include an error return at all and instead give every slice element struct its own error field, so you can report element-wise errors in the output.
The best way obviously depends on the particular scenario; however, I want to know whether there are any best practices people follow or any design patterns around this problem.
PS: This was one of the closest existing questions, but since it accepts a single object as input, it's not very relevant:
Return empty array or error
... a function that accepts [slice of interface representing an] array of objects and returns another [slice of interface representing an] array of objects along with error ...
You have not told us enough to go on.
Does the returned slice actually have anything to do with the parameter slice?
If so, what relationship do they have? For instance, perhaps the returned slice should be half the size of the input slice, and an error occurs if and only if the number of input objects is odd, in which case the last input object has been ignored.
Must inputs be processed in order, or will they be processed in parallel?
One more way is to not include an error return at all and instead give every slice element struct its own error field, so you can report element-wise errors in the output.
This is probably a wise approach if the outputs are one-to-one with the inputs and you intend to handle them in parallel and/or continue processing the remaining inputs upon reaching one bad one. Equivalently, you can have the output slice include an error.
It's really very problem-dependent.
Edit: consider, e.g., the following (which I don't claim is good, mind you):
const maxWorkers = 10 // tunable

// Process a slice of T's in parallel. The results are either an
// R for each T, or an error. Caller provides the actual function
// f(T), which returns R + error (an empty/zero R for error).
func ProcessInParallel(input []T, f func(T) (R, error)) ([]interface{}, error) {
    // Make one channel for sending work to workers,
    // and one for receiving results from workers.
    type Todo struct {
        i    int // the index of the item
        item T   // the item to work on
    }
    workChan := make(chan Todo)

    type Done struct {
        i int   // the index of the item worked on
        r R     // result, if we have one
        e error // error, if we have one
    }
    doneChan := make(chan Done)

    // Spin off workers: maxWorkers or len(input),
    // whichever is smaller.
    n := len(input)
    if n > maxWorkers {
        n = maxWorkers
    }
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func(i int) {
            for todo := range workChan {
                i := todo.i
                r, err := f(input[i])
                doneChan <- Done{i, r, err}
            }
            wg.Done()
        }(i)
    }

    // Close doneChan when all workers finish.
    go func() {
        wg.Wait()
        close(doneChan)
    }()

    // Hand out work to workers (then close work channel).
    go func() {
        for i := range input {
            workChan <- Todo{i, input[i]}
        }
        close(workChan)
    }()

    // Collect results.
    var anyErr error
    ret := make([]interface{}, len(input))
    for done := range doneChan {
        i := done.i
        r, err := done.r, done.e
        if err != nil {
            anyErr = err
            ret[i] = err
        } else {
            ret[i] = r
        }
    }
    return ret, anyErr
}
This has an overall error return, and it returns a slice of interface{}. This means you can immediately tell if everything worked. However, it's kind of annoying to use:
ret, err := ProcessInParallel(arg, f)
if err != nil {
    fmt.Println("some inputs failed")
    for i := range ret {
        if e, ok := ret[i].(error); ok {
            fmt.Printf("%d: failed: %v\n", i, e)
        } else {
            fmt.Printf("%d: %s\n", i, ret[i].(R))
        }
    }
} else {
    fmt.Println("all inputs were good")
    for i := range ret {
        fmt.Printf("%d: %s\n", i, ret[i].(R))
    }
}
Why bother with the all-error summary?
Instead, we could have ProcessInParallel return []R, []error, for instance, or—probably better—use a simple error interface return value to store a MultiError as Cerise Limón suggested in a comment:
ret, err := ProcessInParallel(arg, f)
if err != nil {
    if merr, ok := err.(datastore.MultiError); ok {
        // merr[i] indicates the various failed items;
        // any ret[i] for which merr[i] is nil is OK
    }
} else {
    // all ret[i] are ok
}
A working example that doesn't use MultiError is here.
A working example that does use MultiError is here.
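For a rough idea of the shape such an error type takes, here is a homegrown stand-in (not the actual datastore.MultiError, just an illustration):
// MultiError holds one slot per input; a nil entry means that
// input was processed successfully. This mirrors, but is not,
// the datastore.MultiError type.
type MultiError []error

func (m MultiError) Error() string {
    failed := 0
    for _, e := range m {
        if e != nil {
            failed++
        }
    }
    return fmt.Sprintf("%d of %d inputs failed", failed, len(m))
}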
While Go supports multiple return values, when one of the return types is an error, the convention is to process either the error or the other return values, not both. That means that when the error is not nil, the other return values have no specific meaning and should not be used.
In your case, I'd personally prefer to use an iterator pattern, similar to what is implemented for database/sql.Rows, such that:
func X(values []interface{}) *Result
The Result would hold all processed slice elements associated with their errors. Somewhere in the code I would write something like this:
result := X(values)
for result.Next() {
    if err := result.Err(); err != nil {
        // Handle the err for this specific element.
        // Either continue or fail the whole process.
    }
    v := result.Cur()
    // Process current element.
}
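For concreteness, here is one possible shape of that Result type. The Next/Err/Cur names come from the usage above; the struct layout, the X constructor, and the process helper are purely illustrative assumptions:
// Result iterates over processed elements and their per-element errors.
// This is only a sketch of what such a type could look like.
type Result struct {
    values []interface{} // processed elements
    errs   []error       // errs[i] is non-nil if element i failed
    pos    int           // 1-based position; 0 means iteration has not started
}

// X processes all values up front and returns an iterator over the results.
// process() stands in for whatever per-element work is actually done.
func X(values []interface{}) *Result {
    r := &Result{
        values: make([]interface{}, len(values)),
        errs:   make([]error, len(values)),
    }
    for i, v := range values {
        r.values[i], r.errs[i] = process(v)
    }
    return r
}

// Next advances to the next element and reports whether one exists.
func (r *Result) Next() bool {
    r.pos++
    return r.pos <= len(r.values)
}

// Err returns the error recorded for the current element, if any.
func (r *Result) Err() error { return r.errs[r.pos-1] }

// Cur returns the current processed element (zero-valued if it failed).
func (r *Result) Cur() interface{} { return r.values[r.pos-1] }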

Golang: dynamically sizing a slice when reading a file using bufio.Read

I have a problem where I need to use bufio to read a TSV file line by line, and I need to record how many bytes each line I've read is.
The problem is, it seems like I can't just initialize an empty slice, pass it into the reader's Read method, and expect the slice to contain the entire line of the file.
file, _ := os.Open("file.tsv")
reader := bufio.NewReader(file)
b := make([]byte, 10)
for {
    bytesRead, err := reader.Read(b)
    fmt.Println(bytesRead, b)
    if err != nil {
        break
    }
}
So, for this example, since I specified the slice to be 10 bytes, the reader will read at most 10 bytes even if the line is bigger than 10 bytes.
However:
file, _ := os.Open("file.tsv")
reader := bufio.NewReader(file)
b := []byte{} // or var b []byte
for {
    bytesRead, err := reader.Read(b)
    fmt.Println(bytesRead, b)
    if err != nil {
        break
    }
}
This will always read 0 bytes, and I assume it's because the buffer has length 0 (and capacity 0).
How do I read a file line by line, save the entire line in a variable or buffer, and know exactly how many bytes I've read?
Thanks!
If you want to read line by line, and you're using a buffered reader, use the buffered reader's ReadBytes method.
line, err := reader.ReadBytes('\n')
This will give you a full line, one line at a time, regardless of byte length.
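Putting that together with the byte counting you're after, it could look something like this (the file name is just illustrative):
package main

import (
    "bufio"
    "fmt"
    "io"
    "log"
    "os"
)

func main() {
    file, err := os.Open("file.tsv")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    reader := bufio.NewReader(file)
    total := 0
    for {
        line, err := reader.ReadBytes('\n')
        // len(line) is exactly how many bytes this line occupies,
        // including the trailing '\n' when one was found.
        total += len(line)
        if len(line) > 0 {
            fmt.Printf("%d bytes: %q\n", len(line), line)
        }
        if err != nil {
            if err != io.EOF {
                log.Fatal(err)
            }
            break
        }
    }
    fmt.Println("total bytes read:", total)
}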

What are the sign extension rules for calling Windows API functions (stdcall)? This is needed to call WinAPI from Go, which is strict about int types

Oops, there was one thing I forgot when I made this answer, and it's something I'm both not quite sure of myself and can't seem to find information about on MSDN, Google, or the Stack Overflow search.
There are a number of places in the Windows API where you use a negative number, or a number too large to fit in a signed integer; for instance, CW_USEDEFAULT, INVALID_HANDLE_VALUE, GWLP_USERDATA, and so on. In the world of C, everything is all fine and dandy: the language's integer promotion rules come to the rescue.
But in Go, I have to pass all my arguments to functions as uintptr (which is equivalent to C's uintptr_t). The return value from the function is also returned this way, and I then need to compare it against these constants. Go doesn't allow integer promotion, and it doesn't allow you to convert a signed constant expression into an unsigned one at compile time.
Right now, I have a bit of a jerry-rig set up for handling these constants in my UI library. (Here's an example of what this solution looks like in action.) However, I'm not quite satisfied with this solution; it feels to me like it's assuming things about the ABI, and I want to be absolutely sure of what I'm doing.
So my question is: how are signed values handled when passing them to Windows API functions and how are they handled when returning?
All my constants are autogenerated (example output). The autogenerator uses a C ffi, which I'd rather not use for the main project since I can call the DLLs directly (this also makes cross-compilation easier at least for the rest of the year). If I could somehow leverage that, for instance by making everything into a C-side variable of the form
uintptr_t x_CONST_NAME = (uintptr_t) (CONST_NAME);
that would be helpful. But I can't do that without this answer.
Thanks!
Update
Someone on IRC put it differently (reformatted to avoid horizontal scrolling):
[19:13] <FraGag> basically, you're asking whether an int with a value of -1
will be returned as 0x00000000FFFFFFFF or as 0xFFFFFFFFFFFFFFFF
if an int is 4 bytes and an uintptr is 8 bytes
Basically this, but specifically for Windows API interop, for parameters passed in, and regardless of uintptr size.
#twotwotwo's comments to my question pointed me in the right direction. If Stack Overflow allowed marking comments as answers and having multiple answers marked, I'd do that.
tl;dr version: what I have now is correct after all.
I wrote a program (below) that simply dumped all the constants from package syscall and looked for constants that were negative, but not == -1 (as that would just be ^0). The standard file handles (STD_ERROR_HANDLE, STD_INPUT_HANDLE, and STD_OUTPUT_HANDLE) are (-12, -10, and -11, respectively). The code in package syscall passes these constants as the sole argument of getStdHandle(h int), which produces the required file handle for package os. getStdHandle() passes this int to an autogenerated function GetStdHandle(stdhandle int) that wraps a call to the GetStdHandle() system call. GetStdHandle() takes the int and merely converts it to uintptr for passing into syscall.Syscall(). Though no explanation is given in the autogenerator's source (mksyscall_windows.go), if this didn't work, neither would fmt.Println() =P
All of the above is identical on both windows/386 and windows/amd64; the only thing in a processor-specific file is GetStdHandle(), but the relevant code is identical.
My negConst() function is already doing the same thing, just more directly. As such, I can safely assume that it is correct.
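On the Go side, the conversion half of this is easy to check directly. A tiny demonstration of the signed-to-uintptr conversion rule (it only shows what Go does, not what the Windows ABI then does with the bits):
package main

import "fmt"

func main() {
    var h int32 = -12 // e.g. STD_ERROR_HANDLE
    // Signed-to-unsigned conversion sign-extends to the width of the
    // target type: 0xfffffff4 on 32-bit, 0xfffffffffffffff4 on 64-bit.
    fmt.Printf("%#x\n", uintptr(h))
}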
Thanks!
// 4 june 2014
// based on code from 24 may 2014
package main

import (
    "fmt"
    "os"
    "strings"
    "go/token"
    "go/ast"
    "go/parser"
    "code.google.com/p/go.tools/go/types"
    _ "code.google.com/p/go.tools/go/gcimporter"
)

var arch string

func getPackage(path string) (typespkg *types.Package, pkginfo types.Info) {
    var pkg *ast.Package
    fileset := token.NewFileSet() // parser.ParseDir() actually writes to this; not sure why it doesn't return one instead
    filter := func(i os.FileInfo) bool {
        if strings.Contains(i.Name(), "_windows") &&
            strings.Contains(i.Name(), "_"+arch) &&
            strings.HasSuffix(i.Name(), ".go") {
            return true
        }
        if i.Name() == "race.go" || // skip these
            i.Name() == "flock.go" {
            return false
        }
        return strings.HasSuffix(i.Name(), "_windows.go") ||
            (!strings.Contains(i.Name(), "_"))
    }
    pkgs, err := parser.ParseDir(fileset, path, filter, parser.AllErrors)
    if err != nil {
        panic(err)
    }
    for k, _ := range pkgs { // get the sole key
        if pkgs[k].Name == "syscall" {
            pkg = pkgs[k]
            break
        }
    }
    if pkg == nil {
        panic("package syscall not found")
    }
    // we can't pass pkg.Files directly to types.Check() because the former is a map and the latter is a slice
    ff := make([]*ast.File, 0, len(pkg.Files))
    for _, v := range pkg.Files {
        ff = append(ff, v)
    }
    // if we don't make() each map, package types won't fill the structure
    pkginfo.Defs = make(map[*ast.Ident]types.Object)
    pkginfo.Scopes = make(map[ast.Node]*types.Scope)
    typespkg, err = new(types.Config).Check(path, fileset, ff, &pkginfo)
    if err != nil {
        panic(err)
    }
    return typespkg, pkginfo
}

func main() {
    pkgpath := "/home/pietro/go/src/pkg/syscall"
    arch = os.Args[1]
    pkg, _ := getPackage(pkgpath)
    scope := pkg.Scope()
    for _, name := range scope.Names() {
        obj := scope.Lookup(name)
        if obj == nil {
            panic(fmt.Errorf("nil object %q from scope %v", name, scope))
        }
        if !obj.Exported() { // exported names only
            continue
        }
        if _, ok := obj.(*types.Const); ok {
            fmt.Printf("egrep -rh '#define[ ]+%s' ~/winshare/Include/ 2>/dev/null\n", obj.Name())
        }
        // otherwise skip
    }
}

How to use Go's time.Tick?

I want to print something at intervals.
But my code doesn't work; it fails with a deadlock error.
Could you please help me with it? http://play.golang.org/p/pyEoXU-6Ee
func main() {
    c := time.Tick(1 * time.Minute)
    for now := range c {
        fmt.Printf("%v \n", now)
    }
}
Play.golang.org has some strict rules to protect it. If you run this locally, it works.
The reason this does not work on https://play.golang.org/ is that time.Tick(...) is a "dirty" function, which is only safe to use in an infinite loop that lasts for the life of the app (or in other use cases where you don't mind leaking memory). As per the Go documentation:
Tick is a convenience wrapper for NewTicker providing access to the ticking channel only. While Tick is useful for clients that have no need to shut down the Ticker, be aware that without a way to shut it down the underlying Ticker cannot be recovered by the garbage collector; it "leaks". Unlike NewTicker, Tick will return nil if d <= 0.
So generally it's better to use time.NewTicker(...) instead. See example at: https://pkg.go.dev/time#example-NewTicker
You can try this instead:
package main

import "time"
import "fmt"

func main() {
    ticker := time.NewTicker(time.Millisecond * 500)
    go func() {
        for t := range ticker.C {
            fmt.Println("Tick at", t)
        }
    }()
    time.Sleep(time.Millisecond * 1500)
    ticker.Stop()
    fmt.Println("Ticker stopped")
}
http://play.golang.org/p/FFDKMuR8_e

Resources