Goroutines sharing an array channel: trying to solve a data race

I am trying to write a complex program with parallel goroutines. It is my first program with channels ;) Each goroutine returns an array and, unfortunately, the result is "random": if I run the program 10 times, I get 10 different results :(
This is an oversimplification of my program. The results are good (maybe because it is too simple), but when I run it with the -race flag, it reports 4 data races.
I tried to add a close() call, but it did not work.
Could you help me find the mistake? Thank you very much in advance!
package main
import "fmt"
import "sync"
import "strconv"
func cat_strings(a int, b string) []string{
var y []string
j := strconv.Itoa(a)
y = append(y, j)
y = append(y, b)
return y
}
func main() {
var slice []string
var wg sync.WaitGroup
var x []string
queue := make(chan []string, 10)
wg.Add(10)
for i := 0; i < 10; i++ {
go func(i int) {
defer wg.Done()
x = cat_strings(i, "var")
queue <- x
}(i)
}
//close(queue)
go func() {
defer wg.Done()
for t := range queue {
slice = append(slice, t...)
}
}()
wg.Wait()
fmt.Println(slice)
}

There are two pieces to this fix: don't share the slices between goroutines, and range over queue synchronously in main.
package main
import (
"fmt"
"strconv"
"sync"
)
func cat_strings(a int, b string) []string {
var y []string
j := strconv.Itoa(a)
y = append(y, j)
y = append(y, b)
return y
}
func main() {
var slice []string
var wg sync.WaitGroup
queue := make(chan []string, 10)
wg.Add(10)
for i := 0; i < 10; i++ {
go func(i int) {
defer wg.Done()
queue <- cat_strings(i, "var")
}(i)
}
go func() {
wg.Wait()
close(queue)
}()
for t := range queue {
slice = append(slice, t...)
}
fmt.Println(slice)
}
There's no reason for the extra x slice you're sharing between goroutines. If each goroutine needs another slice, define a new one for each. Sharing a single slice will always require extra synchronization.
The other race is between the goroutine appending from queue to slice and the final fmt.Println. There's no reason for those to be concurrent, since you don't want to print until all values have been read, so finish the for-range loop entirely before printing the final value.
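If the "random" result also refers to the order of the elements, note that the channel version delivers them in whatever order the goroutines happen to finish. A minimal sketch of one alternative, assuming the number of goroutines is known up front: each goroutine writes only to its own index of a preallocated slice, so no element is shared and the order is fixed. The names catStrings and collectOrdered are made up for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// catStrings mirrors cat_strings from the question.
func catStrings(a int, b string) []string {
	return []string{strconv.Itoa(a), b}
}

// collectOrdered starts n goroutines. Goroutine i writes only to
// results[i], so no two goroutines ever touch the same element.
func collectOrdered(n int) []string {
	results := make([][]string, n)
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(i int) {
			defer wg.Done()
			results[i] = catStrings(i, "var")
		}(i)
	}
	wg.Wait() // every write is complete once Wait returns

	// Flatten in index order, so the output is deterministic.
	var flat []string
	for _, r := range results {
		flat = append(flat, r...)
	}
	return flat
}

func main() {
	fmt.Println(collectOrdered(3)) // [0 var 1 var 2 var]
}
```

This trades the channel for a slice of per-goroutine slots; the channel version stays the better fit when the number of results is not known in advance.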

Related

Using goroutines to iterate through file indefinitely

I'm new to Go so please excuse my ignorance. I'm attempting to iterate through a bunch of wordlists line by line indefinitely with goroutines. But when trying to do so, it does not iterate or stops half way through. How would I go about this in the proper manner without breaking the flow?
package main
import (
"bufio"
"fmt"
"os"
)
var file, _ = os.Open("wordlist.txt")
func start() {
scanner := bufio.NewScanner(file)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
}
func main(){
for t := 0; t < 150; t++ {
go start()
fmt.Scanln()
}
}
Thank you!
You declare file as a global variable. Sharing read/write file state amongst multiple goroutines is a data race and will give you undefined results.
Most likely, reads start where the last read from any of the goroutines left off. If that's end-of-file, it likely continues to be end-of-file. But, since the results are undefined, that's not guaranteed. Your erratic results are due to undefined behavior.
Here's a revised version of your program that declares a local file variable and uses a sync.WaitGroup to synchronize the completion of all the go start() goroutines and the main goroutine. The program checks for errors.
package main
import (
"bufio"
"fmt"
"os"
"sync"
)
func start(filename string, wg *sync.WaitGroup, t int) {
defer wg.Done()
file, err := os.Open(filename)
if err != nil {
fmt.Println(err)
return
}
defer file.Close()
lines := 0
scanner := bufio.NewScanner(file)
for scanner.Scan() {
lines++
}
if err := scanner.Err(); err != nil {
fmt.Println(err)
return
}
fmt.Println(t, lines)
}
func main() {
wg := &sync.WaitGroup{}
filename := "wordlist.txt"
for t := 0; t < 150; t++ {
wg.Add(1)
go start(filename, wg, t)
}
wg.Wait()
}
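If the wordlist fits in memory, another option is to read it once and let every goroutine walk the same slice. This is only a sketch under that assumption: sharing the slice is safe here because the goroutines read it and never write to it. The helper names readLines and countPerWorker are invented for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"sync"
)

// readLines loads every line of the named file into memory.
func readLines(filename string) ([]string, error) {
	file, err := os.Open(filename)
	if err != nil {
		return nil, err
	}
	defer file.Close()

	var lines []string
	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}

// countPerWorker starts the given number of goroutines; each one walks
// the whole slice. The slice is only read, never written.
func countPerWorker(lines []string, workers int) []int {
	counts := make([]int, workers)
	var wg sync.WaitGroup
	for t := 0; t < workers; t++ {
		wg.Add(1)
		go func(t int) {
			defer wg.Done()
			for range lines {
				counts[t]++ // each goroutine writes only its own element
			}
		}(t)
	}
	wg.Wait()
	return counts
}

func main() {
	lines, err := readLines("wordlist.txt")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(countPerWorker(lines, 150))
}
```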

go vet - loop variable i captured by func literal

In the below code:
package main
import "fmt"
func main() {
for i := 0; i <= 9; i++ {
go func() {
fmt.Println(i)
}()
}
}
Output:
code$
code$ go install github.com/myhub/cs61a
code$ bin/cs61a
code$
The above program does not produce any output.
1) Is there a data race on the single memory location i among the 10 goroutines?
2) Why does the above code not print the value of the free variable i?
Is there a data race?
Yes, confirm it by running go run -race example.go. The main goroutine writes i, and the other goroutines read it without any synchronization. (This holds for Go versions before 1.22, where all iterations share one i; since Go 1.22 each iteration has its own i.) See Passing parameters to function closure; Why do these two for loop variations give me different behavior? and Register multiple routes using range for loop slices/map.
Why does the above code not print anything?
Because when the main goroutine ends, your program ends as well. It does not wait for other non-main goroutines to finish. See No output from goroutine.
Fix
Make a copy of the loop variable, and use that in the closures, and use a sync.WaitGroup to wait for the launched goroutines to end:
var wg sync.WaitGroup
for i := 0; i <= 9; i++ {
i2 := i
wg.Add(1)
go func() {
defer wg.Done()
fmt.Println(i2)
}()
}
wg.Wait()
This will output the numbers 0 through 9 in a nondeterministic order, for example (try it on the Go Playground):
9
0
2
1
6
5
3
7
8
4
An alternative is to pass i as a parameter to the launched function:
var wg sync.WaitGroup
for i := 0; i <= 9; i++ {
wg.Add(1)
go func(i int) {
defer wg.Done()
fmt.Println(i)
}(i)
}
wg.Wait()
Try this one on the Go Playground.

Should I pass request object to goroutine in blocking for-select loop coming from channel?

I have the following for-select structure in code:
go func(runCh chan Caller, shutdownSignal chan bool) {
for {
select {
case request := <-runCh:
go func() {
w.Run(&request)
}()
case <-shutdownSignal:
w.Shutdown()
return
}
}
}(runCh, shutdownCh)
Will I have some problems with this part:
case request := <-runCh:
go func() {
w.Run(&request)
}()
?
If yes, then why?
In other words: does the "Using goroutines on loop iterator variables" part of Common Mistakes also apply to my case, and why does it or does it not apply here?
No, it does not apply here: you get a new variable (with its own memory address) on each loop iteration:
case request := <-runCh:
This := creates a new variable distinct from the previous one. Proof:
package main
import (
"fmt"
"time"
)
func main() {
runCh := make(chan int, 2)
runCh <- 1
runCh <- 2
for i := 1; i <= 2; i++ {
select {
case request := <-runCh:
go func() {
fmt.Println(request, &request)
time.Sleep(200 * time.Millisecond)
fmt.Println(request, &request)
}()
}
}
time.Sleep(500 * time.Millisecond)
}
Output (the address of request in each loop iteration is different):
1 0xc0000b8000
2 0xc0000b8008
1 0xc0000b8000
2 0xc0000b8008
See: 0xc0000b8000 != 0xc0000b8008

Most idiomatic way to select elements from an array in Golang?

I have an array of strings, and I'd like to exclude values that start with foo_ OR are longer than 7 characters.
I can loop through each element, run the if statement, and add it to a slice along the way. But I was curious if there was an idiomatic or more golang-like way of accomplishing that.
Just for example, the same thing might be done in Ruby as
my_array.select! { |val| val !~ /^foo_/ && val.length <= 7 }
There is no one-liner as you have it in Ruby, but with a helper function you can make it almost as short.
Here's our helper function that loops over a slice, and selects and returns only the elements that meet a criterion captured by a function value:
func filter(ss []string, test func(string) bool) (ret []string) {
for _, s := range ss {
if test(s) {
ret = append(ret, s)
}
}
return
}
Starting with Go 1.18, we can write it generically so it will work with all types, not just string:
func filter[T any](ss []T, test func(T) bool) (ret []T) {
for _, s := range ss {
if test(s) {
ret = append(ret, s)
}
}
return
}
Using this helper function, your task becomes:
ss := []string{"foo_1", "asdf", "loooooooong", "nfoo_1", "foo_2"}
mytest := func(s string) bool { return !strings.HasPrefix(s, "foo_") && len(s) <= 7 }
s2 := filter(ss, mytest)
fmt.Println(s2)
Output (try it on the Go Playground, or the generic version: Go Playground):
[asdf nfoo_1]
Note:
If it is expected that many elements will be selected, it might be profitable to allocate a "big" ret slice beforehand, and use simple assignment instead of the append(). And before returning, slice the ret to have a length equal to the number of selected elements.
Note #2:
In my example I chose a test() function which tells if an element is to be returned. So I had to invert your "exclusion" condition. Obviously you may write the helper function to expect a tester function which tells what to exclude (and not what to include).
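The preallocation idea from the first note could look roughly like the sketch below: allocate a slice as long as the input, assign kept elements by index, and slice it down before returning. The names filterPrealloc and keep are invented for this example, and whether it actually beats append() is something to measure for your data.

```go
package main

import (
	"fmt"
	"strings"
)

// filterPrealloc allocates room for every element up front, stores the
// kept elements by index, and trims the result to the kept count.
func filterPrealloc(ss []string, test func(string) bool) []string {
	ret := make([]string, len(ss))
	n := 0
	for _, s := range ss {
		if test(s) {
			ret[n] = s
			n++
		}
	}
	return ret[:n]
}

// keep reports whether s lacks the "foo_" prefix and is at most 7 bytes long.
func keep(s string) bool {
	return !strings.HasPrefix(s, "foo_") && len(s) <= 7
}

func main() {
	ss := []string{"foo_1", "asdf", "loooooooong", "nfoo_1", "foo_2"}
	fmt.Println(filterPrealloc(ss, keep)) // [asdf nfoo_1]
}
```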
Have a look at robpike's filter library. This would allow you to do:
package main
import (
"fmt"
"strings"
"robpike.io/filter"
)
// isNoFoo7 reports whether a lacks the "foo_" prefix and has at most 7 characters.
func isNoFoo7(a string) bool {
return !strings.HasPrefix(a, "foo_") && len(a) <= 7
}
func main() {
a := []string{"test", "some_other_test", "foo_etc"}
result := filter.Choose(a, isNoFoo7)
fmt.Println(result) // [test]
}
Interestingly enough, the README.md by Rob says:
I wanted to see how hard it was to implement this sort of thing in Go, with as nice an API as I could manage. It wasn't hard.
Having written it a couple of years ago, I haven't had occasion to use it once. Instead, I just use "for" loops.
You shouldn't use it either.
So the most idiomatic way according to Rob would be something like:
func main() {
a := []string{"test", "some_other_test", "foo_etc"}
nofoos := []string{}
for i := range a {
if !strings.HasPrefix(a[i], "foo_") && len(a[i]) <= 7 {
nofoos = append(nofoos, a[i])
}
}
fmt.Println(nofoos) // [test]
}
This style is very similar, if not identical, to the approach any C-family language takes.
Today, I stumbled on a pretty idiom that surprised me. If you want to filter a slice in place without allocating, use two slices with the same backing array:
s := []T{
// the input
}
s2 := s
s = s[:0]
for _, v := range s2 {
if shouldKeep(v) {
s = append(s, v)
}
}
Here's a specific example of removing duplicate strings:
s := []string{"a", "a", "b", "c", "c"}
s2 := s
s = s[:0]
var last string
for _, v := range s2 {
if len(s) == 0 || v != last {
last = v
s = append(s, v)
}
}
If you need to keep both slices, simply replace s = s[:0] with s = nil or s = make([]T, 0, len(s)), depending on whether you want append() to allocate for you.
There are a couple of nice ways to filter a slice without allocations or new dependencies. Found in the Go wiki on Github:
Filter (in place)
n := 0
for _, x := range a {
if keep(x) {
a[n] = x
n++
}
}
a = a[:n]
And another, more readable, way:
Filtering without allocating
This trick uses the fact that a slice shares the same backing array
and capacity as the original, so the storage is reused for the
filtered slice. Of course, the original contents are modified.
b := a[:0]
for _, x := range a {
if f(x) {
b = append(b, x)
}
}
For elements which must be garbage collected, the following code can
be included afterwards:
for i := len(b); i < len(a); i++ {
a[i] = nil // or the zero value of T
}
One thing I'm not sure about is whether the first method needs clearing (setting to nil) the items in slice a after index n, like they do in the second method.
EDIT: the second way is basically what MicahStetson described in his answer. In my code I use a function similar to the following, which is probably as good as it gets in terms of performance and readability:
func filterSlice(slice []*T, keep func(*T) bool) []*T {
newSlice := slice[:0]
for _, item := range slice {
if keep(item) {
newSlice = append(newSlice, item)
}
}
// make sure discarded items can be garbage collected
for i := len(newSlice); i < len(slice); i++ {
slice[i] = nil
}
return newSlice
}
Note that if items in your slice are not pointers and don't contain pointers you can skip the second for loop.
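On the open question about the first (index-based) method: after a = a[:n], the slots from n onward still sit in the backing array, so pointer elements left there stay reachable just as in the second method. A minimal sketch of the first method with the same clearing step added; the item type and the shortName predicate are placeholders made up for this example.

```go
package main

import "fmt"

type item struct{ name string }

// filterInPlace is the index-based variant plus a clearing loop.
func filterInPlace(a []*item, keep func(*item) bool) []*item {
	n := 0
	for _, x := range a {
		if keep(x) {
			a[n] = x
			n++
		}
	}
	// Slots from n onward still hold pointers in the backing array;
	// nil them so the dropped items can be garbage collected.
	for i := n; i < len(a); i++ {
		a[i] = nil
	}
	return a[:n]
}

// shortName is a placeholder predicate: keep names of at most 4 bytes.
func shortName(it *item) bool { return len(it.name) <= 4 }

func main() {
	a := []*item{{name: "asdf"}, {name: "loooooooong"}, {name: "foo"}}
	kept := filterInPlace(a, shortName)
	fmt.Println(len(kept), kept[0].name, kept[1].name) // 2 asdf foo
	fmt.Println(a[2] == nil)                           // true
}
```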
There isn't an idiomatic way to achieve the same result in Go in a single line as in Ruby, but with a helper function you can get the same expressiveness.
You can call this helper function as:
Filter(strs, func(v string) bool {
return strings.HasPrefix(v, "foo_") // keeps foo_testfor
})
Here is the whole code:
package main
import "strings"
import "fmt"
// Filter returns a new slice containing all strings in the slice
// that satisfy the predicate `f` and are longer than 7 characters.
func Filter(vs []string, f func(string) bool) []string {
vsf := make([]string, 0)
for _, v := range vs {
if f(v) && len(v) > 7 {
vsf = append(vsf, v)
}
}
return vsf
}
func main() {
var strs = []string{"foo1", "foo2", "foo3", "foo3", "foo_testfor", "_foo"}
fmt.Println(Filter(strs, func(v string) bool {
return strings.HasPrefix(v, "foo_") // return foo_testfor
}))
}
And the running example: Playground
You can use the loop as you did and wrap it in a utility function for reuse.
For multiple data types, copy-paste is one choice. Another choice is writing a code generator.
A final option, if you want to use a library: take a look at https://github.com/ledongthuc/goterators#filter, which I created to reuse aggregate and transform functions.
It requires Go 1.18, since it relies on generics to support whatever type you want to use it with.
filteredItems, err := Filter(list, func(item int) bool {
return item % 2 == 0
})
filteredItems, err := Filter(list, func(item string) bool {
return strings.Contains(item, "ValidWord")
})
filteredItems, err := Filter(list, func(item MyStruct) bool {
return item.Valid()
})
It also supports Reduce in case you want to optimize the way you select.
Hope it's useful to you!
"Select Elements from Array" is also commonly called a filter function. There's no such thing in go. There are also no other "Collection Functions" such as map or reduce. For the most idiomatic way to get the desired result, I find https://gobyexample.com/collection-functions a good reference:
[...] in Go it’s common to provide collection functions if and when they are specifically needed for your program and data types.
They provide an implementation example of the filter function for strings:
func Filter(vs []string, f func(string) bool) []string {
vsf := make([]string, 0)
for _, v := range vs {
if f(v) {
vsf = append(vsf, v)
}
}
return vsf
}
However, they also say that it's often OK to just inline the function:
Note that in some cases it may be clearest to just inline the
collection-manipulating code directly, instead of creating and calling
a helper function.
In general, golang tries to only introduce orthogonal concepts, meaning that when you can solve a problem one way, there shouldn't be too many more ways to solve it. This adds simplicity to the language by only having a few core concepts, such that not every developer uses a different subset of the language.
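For the "exclude" case from the question specifically, newer toolchains do offer a standard-library helper: the slices package (Go 1.21 and later) has slices.DeleteFunc, which removes the elements for which a predicate returns true. A small sketch, assuming Go 1.21+; the names exclude and withoutExcluded are made up for the example.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// exclude reports whether s should be removed: it starts with "foo_"
// or is longer than 7 bytes.
func exclude(s string) bool {
	return strings.HasPrefix(s, "foo_") || len(s) > 7
}

// withoutExcluded returns a filtered copy and leaves ss untouched.
func withoutExcluded(ss []string) []string {
	// slices.DeleteFunc works in place, so clone first to keep ss intact.
	return slices.DeleteFunc(slices.Clone(ss), exclude)
}

func main() {
	ss := []string{"foo_1", "asdf", "loooooooong", "nfoo_1", "foo_2"}
	fmt.Println(withoutExcluded(ss)) // [asdf nfoo_1]
}
```

If modifying the input is acceptable, the slices.Clone call can be dropped and the result assigned back to the original variable.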
Take a look at this library: github.com/thoas/go-funk
It provides an implementation of a lot of life-saving idioms in Go (including filtering of elements in array for instance).
r := funk.Filter([]int{1, 2, 3, 4}, func(x int) bool {
return x%2 == 0
})
Here is an elegant example of both Fold and Filter that uses recursion to accomplish filtering. FoldRight is also generally useful. It is not stack safe but could be made so with trampolining. With generics (available since Go 1.18) it can be entirely generalized for any 2 types:
func FoldRightStrings(as, z []string, f func(string, []string) []string) []string {
if len(as) > 1 {//Slice has a head and a tail.
h, t := as[0], as[1:len(as)]
return f(h, FoldRightStrings(t, z, f))
} else if len(as) == 1 {//Slice has a head and an empty tail.
h := as[0]
return f(h, FoldRightStrings([]string{}, z, f))
}
return z
}
func FilterStrings(as []string, p func(string) bool) []string {
var g = func(h string, accum []string) []string {
if p(h) {
// Prepend h: the tail has already been folded, so this keeps the input order.
return append([]string{h}, accum...)
}
return accum
}
return FoldRightStrings(as, []string{}, g)
}
Here is an example of its usage that keeps only the strings with length < 8:
var p = func(s string) bool {
return len(s) < 8
}
FilterStrings([]string{"asd","asdfas","asdfasfsa","asdfasdfsadfsadfad"}, p)
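With type parameters, the same fold can be written once for any element and accumulator type. The following is a sketch of that generalization, not the original author's code; FoldRight, Filter and shorterThan8 are names chosen for this example. It prepends in the fold step so that the result keeps the input order, and like the string version it recurses once per element, so it is not stack safe for very long slices.

```go
package main

import "fmt"

// FoldRight combines the elements of as from right to left, starting from z.
func FoldRight[A, B any](as []A, z B, f func(A, B) B) B {
	if len(as) == 0 {
		return z
	}
	return f(as[0], FoldRight(as[1:], z, f))
}

// Filter keeps the elements for which p returns true, in input order.
func Filter[A any](as []A, p func(A) bool) []A {
	return FoldRight(as, []A{}, func(h A, accum []A) []A {
		if p(h) {
			// The tail is already folded, so h belongs in front of it.
			return append([]A{h}, accum...)
		}
		return accum
	})
}

func shorterThan8(s string) bool { return len(s) < 8 }

func main() {
	in := []string{"asd", "asdfas", "asdfasfsa", "asdfasdfsadfsadfad"}
	fmt.Println(Filter(in, shorterThan8)) // [asd asdfas]
}
```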
I'm developing this library: https://github.com/jose78/go-collection. Please try this example to filter elements:
package main
import (
"fmt"
col "github.com/jose78/go-collection/collections"
)
type user struct {
name string
age int
id int
}
func main() {
newMap := generateMapTest()
if resultMap, err := newMap.FilterAll(filterEmptyName); err != nil {
fmt.Printf("error")
} else {
fmt.Printf("Result: %v\n", resultMap)
result := resultMap.ListValues()
fmt.Printf("Result: %v\n", result)
fmt.Printf("Result: %v\n", result.Reverse())
fmt.Printf("Result: %v\n", result.JoinAsString(" <---> "))
fmt.Printf("Result: %v\n", result.Reverse().JoinAsString(" <---> "))
result.Foreach(simpleLoop)
err := result.Foreach(simpleLoopWithError)
if err != nil {
fmt.Println(err)
}
}
}
func filterEmptyName(key interface{}, value interface{}) bool {
user := value.(user)
return user.name != "empty"
}
func generateMapTest() (container col.MapType) {
container = col.MapType{}
container[1] = user{"Alvaro", 6, 1}
container[2] = user{"Sofia", 3, 2}
container[3] = user{"empty", 0, -1}
return container
}
var simpleLoop col.FnForeachList = func(mapper interface{}, index int) {
fmt.Printf("%d.- item:%v\n", index, mapper)
}
var simpleLoopWithError col.FnForeachList = func(mapper interface{}, index int) {
if index > 0 {
panic(fmt.Sprintf("Error produced with index == %d\n", index))
}
fmt.Printf("%d.- item:%v\n", index, mapper)
}
Result of execution:
Result: map[1:{Alvaro 6 1} 2:{Sofia 3 2}]
Result: [{Sofia 3 2} {Alvaro 6 1}]
Result: [{Alvaro 6 1} {Sofia 3 2}]
Result: {Sofia 3 2} <---> {Alvaro 6 1}
Result: {Alvaro 6 1} <---> {Sofia 3 2}
0.- item:{Sofia 3 2}
1.- item:{Alvaro 6 1}
0.- item:{Sofia 3 2}
Recovered in f Error produced with index == 1
ERROR: Error produced with index == 1
Error produced with index == 1
The docs are currently located in the wiki section of the project. You can try it at this link. I hope you like it.
Regards.

Count similar array value

I'm trying to learn Go (or Golang) and can't seem to get it right. I have two text files, each containing a list of words. I'm trying to count the number of words that are present in both files.
Here is my code so far:
package main
import (
"fmt"
"log"
"net/http"
"bufio"
)
func stringInSlice(str string, list []string) bool {
for _, v := range list {
if v == str {
return true
}
}
return false
}
func main() {
// Texts URL
var list = "https://gist.githubusercontent.com/alexcesaro/c9c47c638252e21bd82c/raw/bd031237a56ae6691145b4df5617c385dffe930d/list.txt"
var url1 = "https://gist.githubusercontent.com/alexcesaro/4ebfa5a9548d053dddb2/raw/abb8525774b63f342e5173d1af89e47a7a39cd2d/file1.txt"
//Create storing arrays
var buffer [2000]string
var bufferUrl1 [40000]string
// Set a sibling counter
var sibling = 0
// Read and store text files
wordList, err := http.Get(list)
if err != nil {
log.Fatalf("Error while getting the url : %v", err)
}
defer wordList.Body.Close()
wordUrl1, err := http.Get(url1)
if err != nil {
log.Fatalf("Error while getting the url : %v", err)
}
defer wordUrl1.Body.Close()
streamList := bufio.NewScanner(wordList.Body)
streamUrl1 := bufio.NewScanner(wordUrl1.Body)
streamList.Split(bufio.ScanLines)
streamUrl1.Split(bufio.ScanLines)
var i = 0;
var j = 0;
//Fill arrays with each lines
for streamList.Scan() {
buffer[i] = streamList.Text()
i++
}
for streamUrl1.Scan() {
bufferUrl1[j] = streamUrl1.Text()
j++
}
//ERROR OCCURRING HERE :
// This code if i'm not wrong is supposed to compare through all the range of bufferUrl1 -> bufferUrl1 values with buffer values, then increment sibling and output FIND
for v := range bufferUrl1{
if stringInSlice(bufferUrl1, buffer) {
sibling++
fmt.Println("FIND")
}
}
// As a testing purpose thoses lines properly paste both array
// fmt.Println(buffer)
// fmt.Println(bufferUrl1)
}
But right now, my build doesn't even succeed. I'm only greeted with this message:
.\hello.go:69: cannot use bufferUrl1 (type [40000]string) as type string in argument to stringInSlice
.\hello.go:69: cannot use buffer (type [2000]string) as type []string in argument to stringInSlice
bufferUrl1 is an array: [40000]string. You meant to use v (each string in bufferUrl1). But in fact, you meant to use the second variable; the first variable is the index, which is ignored in the code below using _.
type [2000]string is different from []string. In Go, arrays and slices are not the same. Read Go Slices: usage and internals. I've changed both variable declarations to use slices with the same initial length using make.
These are changes you need to make to compile.
Declarations:
// Create storing slices
buffer := make([]string, 2000)
bufferUrl1 := make([]string, 40000)
and the loop on Line 69:
for _, s := range bufferUrl1 {
if stringInSlice(s, buffer) {
sibling++
fmt.Println("FIND")
}
}
As a side-note, consider using a map instead of a slice for buffer for more efficient lookup instead of looping through the list in stringInSlice.
https://play.golang.org/p/UcaSVwYcIw has the fix for the comments below (you won't be able to make HTTP requests from the Playground).
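The side note about using a map could be sketched as follows. This is an illustration under assumptions, not the code behind the Playground link: scanLines, buildSet and countCommon are invented names, and two in-memory readers stand in for the HTTP response bodies. Collecting lines with append also avoids the fixed-size buffers, whose unused slots would otherwise hold empty strings that match each other.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// scanLines reads r line by line; append grows the slice as needed,
// so there are no unused trailing entries.
func scanLines(r io.Reader) ([]string, error) {
	var lines []string
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}

// buildSet turns a word list into a set for fast membership checks.
func buildSet(words []string) map[string]struct{} {
	set := make(map[string]struct{}, len(words))
	for _, w := range words {
		set[w] = struct{}{}
	}
	return set
}

// countCommon counts the entries of words that are present in set.
func countCommon(words []string, set map[string]struct{}) int {
	n := 0
	for _, w := range words {
		if _, ok := set[w]; ok {
			n++
		}
	}
	return n
}

func main() {
	// In-memory stand-ins for wordList.Body and wordUrl1.Body.
	list, err := scanLines(strings.NewReader("apple\nbanana\ncherry\n"))
	if err != nil {
		fmt.Println(err)
		return
	}
	file1, err := scanLines(strings.NewReader("banana\nkiwi\ncherry\n"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(countCommon(file1, buildSet(list))) // 2
}
```

Like the original loop, countCommon counts a word once per occurrence in the second list, so duplicates there are counted more than once.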

Resources