I have an array of strings, and I'd like to exclude values that start with foo_ OR are longer than 7 characters.
I can loop through each element, run the if statement, and add it to a slice along the way. But I was curious if there was an idiomatic or more golang-like way of accomplishing that.
Just for example, the same thing might be done in Ruby as
my_array.select! { |val| val !~ /^foo_/ && val.length <= 7 }
There is no one-liner as you have it in Ruby, but with a helper function you can make it almost as short.
Here's our helper function that loops over a slice and returns only the elements that meet a criterion captured by a function value:
func filter(ss []string, test func(string) bool) (ret []string) {
for _, s := range ss {
if test(s) {
ret = append(ret, s)
}
}
return
}
Starting with Go 1.18, we can write it generically so it works with all types, not just string:
func filter[T any](ss []T, test func(T) bool) (ret []T) {
for _, s := range ss {
if test(s) {
ret = append(ret, s)
}
}
return
}
Using this helper function, your task becomes:
ss := []string{"foo_1", "asdf", "loooooooong", "nfoo_1", "foo_2"}
mytest := func(s string) bool { return !strings.HasPrefix(s, "foo_") && len(s) <= 7 }
s2 := filter(ss, mytest)
fmt.Println(s2)
Output (try it on the Go Playground, or the generic version: Go Playground):
[asdf nfoo_1]
Note:
If it is expected that many elements will be selected, it might be profitable to allocate a "big" ret slice beforehand, and use simple assignment instead of the append(). And before returning, slice the ret to have a length equal to the number of selected elements.
Note #2:
In my example I chose a test() function which tells if an element is to be returned. So I had to invert your "exclusion" condition. Obviously you may write the helper function to expect a tester function which tells what to exclude (and not what to include).
Have a look at robpike's filter library. This would allow you to do:
package main
import (
"fmt"
"strings"
"github.com/robpike/filter"
)
func isNoFoo7(a string) bool {
return !strings.HasPrefix(a, "foo_") && len(a) <= 7
}
func main() {
a := []string{"test", "some_other_test", "foo_etc"}
// Choose returns the elements for which isNoFoo7 reports true.
result := filter.Choose(a, isNoFoo7)
fmt.Println(result) // [test]
}
Interestingly enough, Rob's README.md says:
I wanted to see how hard it was to implement this sort of thing in Go, with as nice an API as I could manage. It wasn't hard.
Having written it a couple of years ago, I haven't had occasion to use it once. Instead, I just use "for" loops.
You shouldn't use it either.
So the most idiomatic way according to Rob would be something like:
func main() {
a := []string{"test", "some_other_test", "foo_etc"}
nofoos := []string{}
for i := range a {
if !strings.HasPrefix(a[i], "foo_") && len(a[i]) <= 7 {
nofoos = append(nofoos, a[i])
}
}
fmt.Println(nofoos) // [test]
}
This style is very similar, if not identical, to the approach any C-family language takes.
Today, I stumbled on a pretty idiom that surprised me. If you want to filter a slice in place without allocating, use two slices with the same backing array:
s := []T{
// the input
}
s2 := s
s = s[:0]
for _, v := range s2 {
if shouldKeep(v) {
s = append(s, v)
}
}
Here's a specific example of removing duplicate strings:
s := []string{"a", "a", "b", "c", "c"}
s2 := s
s = s[:0]
var last string
for _, v := range s2 {
if len(s) == 0 || v != last {
last = v
s = append(s, v)
}
}
If you need to keep both slices, simply replace s = s[:0] with s = nil or s = make([]T, 0, len(s)), depending on whether you want append() to allocate for you.
There are a couple of nice ways to filter a slice without allocations or new dependencies. Found in the Go wiki on Github:
Filter (in place)
n := 0
for _, x := range a {
if keep(x) {
a[n] = x
n++
}
}
a = a[:n]
And another, more readable, way:
Filtering without allocating
This trick uses the fact that a slice shares the same backing array
and capacity as the original, so the storage is reused for the
filtered slice. Of course, the original contents are modified.
b := a[:0]
for _, x := range a {
if f(x) {
b = append(b, x)
}
}
For elements which must be garbage collected, the following code can
be included afterwards:
for i := len(b); i < len(a); i++ {
a[i] = nil // or the zero value of T
}
One thing I'm not sure about is whether the first method needs clearing (setting to nil) the items in slice a after index n, like they do in the second method.
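On the open question above: as far as I can tell, the same consideration applies to the first method, because the slots from index n up to the old length still hold references in the shared backing array. A minimal sketch, assuming pointer elements and a made-up item type and filterInPlace name:

```go
package main

import "fmt"

type item struct{ name string }

// filterInPlace is an illustrative helper (not taken from the wiki).
// It applies the index-based in-place filter and then clears the
// leftover tail so the dropped items can be garbage collected.
func filterInPlace(a []*item, keep func(*item) bool) []*item {
	n := 0
	for _, x := range a {
		if keep(x) {
			a[n] = x
			n++
		}
	}
	// Slots n..len(a)-1 still point at old items; drop those references.
	for i := n; i < len(a); i++ {
		a[i] = nil
	}
	return a[:n]
}

func main() {
	a := []*item{{"x"}, {"drop"}, {"y"}}
	full := a // second view of the same backing array, for inspection only
	a = filterInPlace(a, func(it *item) bool { return it.name != "drop" })
	fmt.Println(len(a), a[0].name, a[1].name) // 2 x y
	fmt.Println(full[2] == nil)               // true
}
```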
EDIT: the second way is basically what MicahStetson described in his answer. In my code I use a function similar to the following, which is probably as good as it gets in terms of performance and readability:
func filterSlice(slice []*T, keep func(*T) bool) []*T {
newSlice := slice[:0]
for _, item := range slice {
if keep(item) {
newSlice = append(newSlice, item)
}
}
// make sure discarded items can be garbage collected
for i := len(newSlice); i < len(slice); i++ {
slice[i] = nil
}
return newSlice
}
Note that if items in your slice are not pointers and don't contain pointers you can skip the second for loop.
There isn't an idiomatic way you can achieve the same expected result in Go in one single line as in Ruby, but with a helper function you can obtain the same expressiveness as in Ruby.
You can call this helper function as:
Filter(strs, func(v string) bool {
return strings.HasPrefix(v, "foo_") // keeps foo_testfor
})
Here is the whole code:
package main
import "strings"
import "fmt"
// Filter returns a new slice containing all strings in the
// slice that satisfy the predicate `f` and are longer than 7 characters.
func Filter(vs []string, f func(string) bool) []string {
vsf := make([]string, 0)
for _, v := range vs {
if f(v) && len(v) > 7 {
vsf = append(vsf, v)
}
}
return vsf
}
func main() {
var strs = []string{"foo1", "foo2", "foo3", "foo3", "foo_testfor", "_foo"}
fmt.Println(Filter(strs, func(v string) bool {
return strings.HasPrefix(v, "foo_") // return foo_testfor
}))
}
And the running example: Playground
You can use the loop as you did and wrap it in a utility function for reuse.
To support multiple data types, copy-pasting is one option. Another is to write a code generation tool.
Finally, if you'd rather use a library, take a look at https://github.com/ledongthuc/goterators#filter, which I created to provide reusable aggregate and transform functions.
It requires Go 1.18, since it relies on generics to work with whatever type you want to use.
filteredItems, err := Filter(list, func(item int) bool {
return item % 2 == 0
})
filteredItems, err := Filter(list, func(item string) bool {
return strings.Contains(item, "ValidWord")
})
filteredItems, err := Filter(list, func(item MyStruct) bool {
return item.Valid()
})
It also supports Reduce in case you want to optimize the way you select.
Hope it's useful to you!
"Select Elements from Array" is also commonly called a filter function. There's no such built-in in Go, nor are there other built-in "collection functions" such as map or reduce. For the most idiomatic way to get the desired result, I find https://gobyexample.com/collection-functions a good reference:
[...] in Go it’s common to provide collection functions if and when they are specifically needed for your program and data types.
They provide an implementation example of the filter function for strings:
func Filter(vs []string, f func(string) bool) []string {
vsf := make([]string, 0)
for _, v := range vs {
if f(v) {
vsf = append(vsf, v)
}
}
return vsf
}
However, they also say that it's often OK to just inline the function:
Note that in some cases it may be clearest to just inline the
collection-manipulating code directly, instead of creating and calling
a helper function.
In general, golang tries to only introduce orthogonal concepts, meaning that when you can solve a problem one way, there shouldn't be too many more ways to solve it. This adds simplicity to the language by only having a few core concepts, such that not every developer uses a different subset of the language.
Take a look at this library: github.com/thoas/go-funk
It provides an implementation of a lot of life-saving idioms in Go (including filtering of elements in array for instance).
r := funk.Filter([]int{1, 2, 3, 4}, func(x int) bool {
return x%2 == 0
})
Here is an elegant example of both Fold and Filter that uses recursion to accomplish filtering. FoldRight is also generally useful. It is not stack safe but could be made so with trampolining. Once Golang has generics it can be entirely generalized for any 2 types:
func FoldRightStrings(as, z []string, f func(string, []string) []string) []string {
if len(as) > 1 {//Slice has a head and a tail.
h, t := as[0], as[1:len(as)]
return f(h, FoldRightStrings(t, z, f))
} else if len(as) == 1 {//Slice has a head and an empty tail.
h := as[0]
return f(h, FoldRightStrings([]string{}, z, f))
}
return z
}
func FilterStrings(as []string, p func(string) bool) []string {
var g = func(h string, accum []string) []string {
if p(h) {
return append([]string{h}, accum...) // prepend: accum already holds the filtered tail, so this keeps the input order
} else {
return accum
}
}
return FoldRightStrings(as, []string{}, g)
}
Here is an example of its usage that keeps only the strings with length < 8:
var p = func(s string) bool {
if len(s) < 8 {
return true
} else {
return false
}
}
FilterStrings([]string{"asd","asdfas","asdfasfsa","asdfasdfsadfsadfad"}, p)
I'm developing this library: https://github.com/jose78/go-collection. Please try this example to filter elements:
package main
import (
"fmt"
col "github.com/jose78/go-collection/collections"
)
type user struct {
name string
age int
id int
}
func main() {
newMap := generateMapTest()
if resultMap, err := newMap.FilterAll(filterEmptyName); err != nil {
fmt.Printf("error")
} else {
fmt.Printf("Result: %v\n", resultMap)
result := resultMap.ListValues()
fmt.Printf("Result: %v\n", result)
fmt.Printf("Result: %v\n", result.Reverse())
fmt.Printf("Result: %v\n", result.JoinAsString(" <---> "))
fmt.Printf("Result: %v\n", result.Reverse().JoinAsString(" <---> "))
result.Foreach(simpleLoop)
err := result.Foreach(simpleLoopWithError)
if err != nil {
fmt.Println(err)
}
}
}
func filterEmptyName(key interface{}, value interface{}) bool {
user := value.(user)
return user.name != "empty"
}
func generateMapTest() (container col.MapType) {
container = col.MapType{}
container[1] = user{"Alvaro", 6, 1}
container[2] = user{"Sofia", 3, 2}
container[3] = user{"empty", 0, -1}
return container
}
var simpleLoop col.FnForeachList = func(mapper interface{}, index int) {
fmt.Printf("%d.- item:%v\n", index, mapper)
}
var simpleLoopWithError col.FnForeachList = func(mapper interface{}, index int) {
if index > 0 {
panic(fmt.Sprintf("Error produced with index == %d\n", index))
}
fmt.Printf("%d.- item:%v\n", index, mapper)
}
Result of execution:
Result: map[1:{Alvaro 6 1} 2:{Sofia 3 2}]
Result: [{Sofia 3 2} {Alvaro 6 1}]
Result: [{Alvaro 6 1} {Sofia 3 2}]
Result: {Sofia 3 2} <---> {Alvaro 6 1}
Result: {Alvaro 6 1} <---> {Sofia 3 2}
0.- item:{Sofia 3 2}
1.- item:{Alvaro 6 1}
0.- item:{Sofia 3 2}
Recovered in f Error produced with index == 1
ERROR: Error produced with index == 1
Error produced with index == 1
The documentation is currently located in the wiki section of the project. You can try it in this link. I hope you like it.
Related
I have a []byte which I need to sort, in ascending order.
I get an object with the items and then iterate the array in order to create the object returned:
// unfortunately, for some obscure reason I can't change the data types of the caller and the object from the function call are different, although both are []byte underneath (...)
type ID []byte
// in another package:
type ByteInterface []byte
func (c *Store) GetAll() ByteInterface {
returnObj := make([]ByteInterface,0)
obj, err := GetData()
// err handling
for _, b := range obj.IDs {
returnObj = append(returnObj, ByteInterface(b))
}
return returnObj
}
So I'm asking myself if it is possible to do the append so that returnObj is sorted right away, or if I need to sort obj.ByteData upfront (or sort returnObj afterwards).
On each iteration, do the following:
Grow the target slice (possibly reallocating it):
numElems := len(returnObj)
returnObj = append(returnObj, make([]byte, len(obj))...)
Use the standard approach for insertion to keep the destination sorted by finding a place to put each byte from the source slice, one by one:
for _, b := range obj {
i := sort.Search(numElems, func(i int) bool {
return returnObj[i] >= b
})
if i < numElems {
copy(returnObj[i+1:], returnObj[i:])
}
returnObj[i] = b
numElems++
}
(The call to copy should be optimized by copying less but this is left as an exercise for the reader.)
I have a recursive data structure that can contain a few different type of data:
type Data interface{
// Some methods
}
type Pair struct { // implements Data
fst Data
snd Data
}
type Number float64 // implements Data
Now I want to flatten a chain of Pairs into a []Data. However, the Data in the fst field should not be flattened, only data in snd should be flattened. E.g:
chain := Pair{Number(1.0), Pair{Number(2.0), Pair{Number(3.0), nil}}}
chain2 := Pair{Pair{Number(1.0), Number(4.0)}, Pair{Number(2.0), Pair{Number(3.0), nil}}}
becomes:
data := []Data{Number(1.0), Number(2.0), Number(3.0)}
data2 := []Data{Pair{Number(1.0), Number(4.0)}, Number(2.0), Number(3.0)}
My naive approach would be:
var data []Data
chain := Pair{Number(1.0), Pair{Number(2.0), Pair{Number(3.0), nil}}}
for chain != nil {
data = append(data, chain.fst)
chain = chain.snd
}
Is there a more efficient approach that can flatten a data structure like the one in the variable chain into an []Data array?
You can use a recursive function. On the way down, add up the number of pairs, at the bottom, allocate the array, and on the way back up, fill the array from back to front.
If you need to support arbitrary trees, you can add a size method to Data, and then do another tree traversal to actually fill the array.
Huh, your naive approach doesn't work for Pairs nested inside fst. If you had chain := Pair{Pair{Number(1.0), Number(2.0)}, Number(3.0)}, it would end up as []Data{Pair{Number(1.0), Number(2.0)}, Number(3.0)}. This is an inherently recursive problem, so why not implement it as such?
I suggest adding a flatten() method to your interface. Pairs can just recursively flatten their fields, and Numbers just return their value.
Here's a fully working example with some minimal testing:
package main
import "fmt"
type Data interface {
flatten() []Data
}
type Pair struct {
fst Data
snd Data
}
type Number float64
func (p Pair) flatten() []Data {
res := []Data{}
if p.fst != nil {
res = append(res, p.fst.flatten()...)
}
if p.snd != nil {
res = append(res, p.snd.flatten()...)
}
return res
}
func (n Number) flatten() []Data {
return []Data{n}
}
func main() {
tests := []Data{
Pair{Number(1.0), Pair{Number(2.0), Pair{Number(3.0), nil}}},
Pair{Pair{Number(1.0), Number(2.0)}, Number(3.0)},
Pair{Pair{Pair{Number(1.0), Number(2.0)}, Pair{Number(3.0), Number(4.0)}}, Pair{Pair{Number(5.0), Number(6.0)}, Number(7.0)}},
Number(1.0),
}
for _, t := range tests {
fmt.Printf("Original: %v\n", t)
fmt.Printf("Flattened: %v\n", t.flatten())
}
}
(This assumes that the top-level input Data is never nil).
The code prints:
Original: {1 {2 {3 <nil>}}}
Flattened: [1 2 3]
Original: {{1 2} 3}
Flattened: [1 2 3]
Original: {{{1 2} {3 4}} {{5 6} 7}}
Flattened: [1 2 3 4 5 6 7]
Original: 1
Flattened: [1]
As suggested, writing a recursive function fits best for this problem. But it's also possible to write a non-recursive version (IMHO recursive version would be more clear):
func flatten(d Data) []Data {
var res []Data
stack := []Data{d}
for len(stack) > 0 {
switch x := stack[len(stack)-1].(type) {
case Pair:
stack[len(stack)-1] = x.snd
stack = append(stack, x.fst)
case Number:
res = append(res, x)
stack = stack[:len(stack)-1]
default:
if x == nil {
stack = stack[:len(stack)-1]
} else {
panic("INVALID TYPE")
}
}
}
return res
}
I'm trying to learn Go (or Golang) and can't seem to get it right. I have 2 text files, each containing a list of words. I'm trying to count the number of words that are present in both files.
Here is my code so far :
package main
import (
"fmt"
"log"
"net/http"
"bufio"
)
func stringInSlice(str string, list []string) bool {
for _, v := range list {
if v == str {
return true
}
}
return false
}
func main() {
// Texts URL
var list = "https://gist.githubusercontent.com/alexcesaro/c9c47c638252e21bd82c/raw/bd031237a56ae6691145b4df5617c385dffe930d/list.txt"
var url1 = "https://gist.githubusercontent.com/alexcesaro/4ebfa5a9548d053dddb2/raw/abb8525774b63f342e5173d1af89e47a7a39cd2d/file1.txt"
//Create storing arrays
var buffer [2000]string
var bufferUrl1 [40000]string
// Set a sibling counter
var sibling = 0
// Read and store text files
wordList, err := http.Get(list)
if err != nil {
log.Fatalf("Error while getting the url : %v", err)
}
defer wordList.Body.Close()
wordUrl1, err := http.Get(url1)
if err != nil {
log.Fatalf("Error while getting the url : %v", err)
}
defer wordUrl1.Body.Close()
streamList := bufio.NewScanner(wordList.Body)
streamUrl1 := bufio.NewScanner(wordUrl1.Body)
streamList.Split(bufio.ScanLines)
streamUrl1.Split(bufio.ScanLines)
var i = 0;
var j = 0;
//Fill arrays with each lines
for streamList.Scan() {
buffer[i] = streamList.Text()
i++
}
for streamUrl1.Scan() {
bufferUrl1[j] = streamUrl1.Text()
j++
}
//ERROR OCCURRING HERE :
// This code if i'm not wrong is supposed to compare through all the range of bufferUrl1 -> bufferUrl1 values with buffer values, then increment sibling and output FIND
for v := range bufferUrl1{
if stringInSlice(bufferUrl1, buffer) {
sibling++
fmt.Println("FIND")
}
}
// As a testing purpose thoses lines properly paste both array
// fmt.Println(buffer)
// fmt.Println(bufferUrl1)
}
But right now, my build doesn't even succeed. I'm only greeted with this message:
.\hello.go:69: cannot use bufferUrl1 (type [40000]string) as type string in argument to stringInSlice
.\hello.go:69: cannot use buffer (type [2000]string) as type []string in argument to stringInSlice
bufferUrl1 is an array of type [40000]string, but stringInSlice expects a single string as its first argument. You meant to pass each string in bufferUrl1. Note that the first variable yielded by range is the index; the string itself is the second variable, so the code below ignores the index using _.
type [2000]string is different from []string. In Go, arrays and slices are not the same. Read Go Slices: usage and internals. I've changed both variable declarations to use slices with the same initial length using make.
These are changes you need to make to compile.
Declarations:
// Create storing slices
buffer := make([]string, 2000)
bufferUrl1 := make([]string, 40000)
and the loop on Line 69:
for _, s := range bufferUrl1 {
if stringInSlice(s, buffer) {
sibling++
fmt.Println("FIND")
}
}
As a side-note, consider using a map instead of a slice for buffer for more efficient lookup instead of looping through the list in stringInSlice.
https://play.golang.org/p/UcaSVwYcIw has the fix for the comments below (you won't be able to make HTTP requests from the Playground).
Is it possible to iterate on a golang array/slice without using 'for' statement?
You could use goto statement (not recommended).
package main
import (
"fmt"
)
func main() {
my_slice := []string {"a", "b", "c", "d"}
index := 0
back:
if index < len(my_slice) {
fmt.Println(my_slice[index])
index += 1
goto back
}
}
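As mentioned by @LeoCorrea, you could use a recursive function to iterate over a slice. The version below is written in tail-recursive form, but note that the Go compiler does not guarantee tail-call elimination, so this alone will not prevent the stack growth mentioned by @vutran on very large slices.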
package main
import "fmt"
func num(a []string, i int) {
if i >= len(a) {
return
} else {
fmt.Println(i, a[i]) //0 a 1 b 2 c
i += 1
num(a, i) //tail recursion
}
}
func main() {
a := []string{"a", "b", "c"}
i := 0
num(a, i)
}
A possibly more readable but less pure example could use an anonymous function. See https://play.golang.org/p/Qen6BKviWuE.
You could write a recursive function to iterate over the slice but why would you want to not use a for loop?
Go doesn't have separate loop keywords such as while or do; it has only for, which comes in a few different forms.
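As a quick illustration of those forms, here is the same sum written three ways. The function names are made up for this sketch.

```go
package main

import "fmt"

// sumClassic uses the three-clause form, like a C for loop.
func sumClassic(xs []int) int {
	total := 0
	for i := 0; i < len(xs); i++ {
		total += xs[i]
	}
	return total
}

// sumWhile uses the condition-only form, which plays the role of while.
func sumWhile(xs []int) int {
	total, i := 0, 0
	for i < len(xs) {
		total += xs[i]
		i++
	}
	return total
}

// sumRange uses the range form, which yields index and value.
func sumRange(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	xs := []int{1, 2, 3, 4}
	fmt.Println(sumClassic(xs), sumWhile(xs), sumRange(xs)) // 10 10 10
}
```

There is also a bare for { ... } form, which loops until a break or return.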
I also don't understand why you'd want to do this, but here is a code sample using no for loops.
package main
import "fmt"
type P struct {
Next *P
}
func (p *P) Iterate() *P {
if p.Next != nil {
fmt.Println("Saw another P")
return p.Next.Iterate()
}
return nil
}
func main() {
var z []*P
z = append(z, &P{})
z = append(z, &P{Next: z[len(z)-1]})
z = append(z, &P{Next: z[len(z)-1]})
z = append(z, &P{Next: z[len(z)-1]})
z = append(z, &P{Next: z[len(z)-1]})
z[len(z)-1].Iterate()
}
https://play.golang.org/p/CMSp6M00kR
Please note that, while it contains a slice as requested, the properties of the slice itself go completely unused.
I want to return a structure that looks like this:
{
results: [
["ooid1", 2.0, "Söme text"],
["ooid2", 1.3, "Åther text"],
]
}
That's an array of arrays, each holding a string, a floating point number, and a Unicode string.
If it was Python I'd be able to:
import json
json.dumps({'results': [["ooid1", 2.0, u"Söme text"], ...]})
But in Go you can't have an array (or slice) of mixed types.
I thought of using a struct like this:
type Row struct {
Ooid string
Score float64
Text rune
}
But I don't want each to become a dictionary, I want it to become an array of 3 elements each.
We can customize how an object is serialized by implementing the json.Marshaler interface. For our particular case, we seem to have a slice of Row elements that we want to encode as an array of heterogeneous values. We can do so by defining a MarshalJSON method on our Row type, using an intermediate slice of interface{} to encode the mixed values.
This example demonstrates:
package main
import (
"encoding/json"
"fmt"
)
type Row struct {
Ooid string
Score float64
Text string
}
func (r *Row) MarshalJSON() ([]byte, error) {
arr := []interface{}{r.Ooid, r.Score, r.Text}
return json.Marshal(arr)
}
func main() {
rows := []Row{
{"ooid1", 2.0, "Söme text"},
{"ooid2", 1.3, "Åther text"},
}
marshalled, _ := json.Marshal(rows)
fmt.Println(string(marshalled))
}
Of course, we also might want to go the other way around, from JSON bytes back to structs. So there's a similar json.Unmarshaler interface that we can use.
func (r *Row) UnmarshalJSON(bs []byte) error {
arr := []interface{}{}
json.Unmarshal(bs, &arr)
// TODO: add error handling here.
r.Ooid = arr[0].(string)
r.Score = arr[1].(float64)
r.Text = arr[2].(string)
return nil
}
This uses a similar trick of first using an intermediate slice of interface{}, using the unmarshaler to place values into this generic container, and then plop the values back into our structure.
package main
import (
"encoding/json"
"fmt"
)
type Row struct {
Ooid string
Score float64
Text string
}
func (r *Row) UnmarshalJSON(bs []byte) error {
arr := []interface{}{}
json.Unmarshal(bs, &arr)
// TODO: add error handling here.
r.Ooid = arr[0].(string)
r.Score = arr[1].(float64)
r.Text = arr[2].(string)
return nil
}
func main() {
rows := []Row{}
text := `
[
["ooid4", 3.1415, "pi"],
["ooid5", 2.7182, "euler"]
]
`
json.Unmarshal([]byte(text), &rows)
fmt.Println(rows)
}
You can read a full example here.
Use []interface{}
type Results struct {
Rows []interface{} `json:"results"`
}
You will then have to use type assertion if you want to access the values stored in []interface{}
for _, row := range results.Rows {
switch r := row.(type) {
case string:
fmt.Println("string", r)
case float64:
fmt.Println("float64", r)
case int64:
fmt.Println("int64", r)
default:
fmt.Println("not found")
}
}
It's somewhat clumsy, but you can use:
type result [][]interface{}
type results struct {
Results result
}
Working example https://play.golang.org/p/IXAzZZ3Dg7