Let's say I have a list of student cities whose size could be 100 or 1000, and I want to filter out all duplicate cities.
I want a generic solution that I can use to remove all duplicate strings from any slice.
I am new to Go, so I tried to do it by looping over the slice and checking whether each element already exists using a second loop.
Students' Cities List (Data):
studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
Functions that I created, and it's doing the job:
func contains(s []string, e string) bool {
for _, a := range s {
if a == e {
return true
}
}
return false
}
func removeDuplicates(strList []string) []string {
list := []string{}
for _, item := range strList {
fmt.Println(item)
if contains(list, item) == false {
list = append(list, item)
}
}
return list
}
Testing my solution:
func main() {
studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
uniqueStudentsCities := removeDuplicates(studentsCities)
fmt.Println(uniqueStudentsCities) // Expected output [Mumbai Delhi Ahmedabad Bangalore Kolkata Pune]
}
I believe the solution above is not optimal. What is the fastest way to remove duplicates from a slice?
I searched Stack Overflow and could not find this question asked already, so I didn't find a solution.
I found Burak's and Fazlan's solutions helpful. Based on them, I implemented simple functions that remove duplicate data from slices of strings, integers, or other types using a generic approach.
Here are my three functions: the first is generic, the second is for slices of strings, and the last is for slices of integers. You pass in your data, and each function returns the unique values.
Generic solution (Go 1.18+):
func removeDuplicate[T string | int](sliceList []T) []T {
allKeys := make(map[T]bool)
list := []T{}
for _, item := range sliceList {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
To remove duplicate strings from slice:
func removeDuplicateStr(strSlice []string) []string {
allKeys := make(map[string]bool)
list := []string{}
for _, item := range strSlice {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
To remove duplicate integers from slice:
func removeDuplicateInt(intSlice []int) []int {
allKeys := make(map[int]bool)
list := []int{}
for _, item := range intSlice {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
You can change the slice type, and the same approach will filter out duplicates for slices of any type that can be used as a map key.
Here is the GoPlayground link: https://go.dev/play/p/iyb97KcftMa
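The string | int constraint in the generic version can be widened. Here is a minimal sketch, assuming the built-in comparable constraint (Go 1.18+) suits your element types. The name uniqueOf is hypothetical and not taken from the answer above.

```go
package main

import "fmt"

// uniqueOf returns the distinct elements of in, keeping first-seen order.
// The name is illustrative; any type usable as a map key satisfies comparable.
func uniqueOf[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue // already emitted
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(uniqueOf([]string{"Mumbai", "Delhi", "Mumbai", "Pune"}))
	fmt.Println(uniqueOf([]float64{1.5, 2, 1.5}))
}
```

With comparable, the same function also covers floats, runes, and structs whose fields are all comparable, so the per-type copies are no longer needed.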
Adding this answer, which worked for me. Note that it sorts the slice, so the original order is lost and the input is modified in place.
func removeDuplicateStrings(s []string) []string {
if len(s) < 1 {
return s
}
sort.Strings(s)
prev := 1
for curr := 1; curr < len(s); curr++ {
if s[curr-1] != s[curr] {
s[prev] = s[curr]
prev++
}
}
return s[:prev]
}
For fun, I tried using generics! (Go 1.18+ only)
type SliceType interface {
~string | ~int | ~float64 // add more ordered types as needed (the sort below uses <)
}
func removeDuplicates[T SliceType](s []T) []T {
if len(s) < 1 {
return s
}
// sort
sort.SliceStable(s, func(i, j int) bool {
return s[i] < s[j]
})
prev := 1
for curr := 1; curr < len(s); curr++ {
if s[curr-1] != s[curr] {
s[prev] = s[curr]
prev++
}
}
return s[:prev]
}
Go Playground Link with tests: https://go.dev/play/p/bw1PP1osJJQ
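If Go 1.21 or later is available, the standard library slices package covers the same sort-then-compact idea. The sketch below is an alternative, not the code from the answer above, and the helper name sortedUnique is hypothetical. It clones the input first so the caller's slice keeps its order.

```go
package main

import (
	"fmt"
	"slices"
)

// sortedUnique returns a sorted copy of in with duplicates removed.
func sortedUnique(in []string) []string {
	out := slices.Clone(in)    // copy so the caller's slice is left untouched
	slices.Sort(out)           // equal values become adjacent
	return slices.Compact(out) // drops runs of consecutive equal elements
}

func main() {
	cities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Delhi"}
	fmt.Println(sortedUnique(cities))
	fmt.Println(cities) // still in the original order
}
```

Dropping the slices.Clone call gives the in-place behavior of the answer above, at the cost of reordering the caller's data.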
You can do in-place replacement guided with a map:
processed := map[string]struct{}{}
w := 0
for _, s := range cities {
if _, exists := processed[s]; !exists {
// If this city has not been seen yet, add it to the list
processed[s] = struct{}{}
cities[w] = s
w++
}
}
cities = cities[:w]
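One consequence of the in-place approach is that the result shares its backing array with the input. A small sketch, with the loop above wrapped in a hypothetical function dedupInPlace, makes this visible:

```go
package main

import "fmt"

// dedupInPlace is a hypothetical wrapper around the in-place loop.
// It overwrites the front of cities and returns the shortened slice.
func dedupInPlace(cities []string) []string {
	processed := map[string]struct{}{}
	w := 0 // next write position; never ahead of the read position
	for _, s := range cities {
		if _, exists := processed[s]; !exists {
			processed[s] = struct{}{}
			cities[w] = s
			w++
		}
	}
	return cities[:w]
}

func main() {
	orig := []string{"Mumbai", "Delhi", "Mumbai", "Pune"}
	uniq := dedupInPlace(orig)
	fmt.Println(uniq) // [Mumbai Delhi Pune]
	fmt.Println(orig) // [Mumbai Delhi Pune Pune]: the backing array was rewritten
}
```

No second slice is allocated, but any other reference to the original slice now sees the rewritten contents, so this fits best when the input is no longer needed.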
To reduce memory usage, use an empty struct as the map value:
package main
import (
"fmt"
"reflect"
)
type void struct{}
func main() {
digits := [6]string{"one", "two", "three", "four", "five", "five"}
set := make(map[string]void)
for _, element := range digits {
set[element] = void{}
}
fmt.Println(reflect.ValueOf(set).MapKeys())
}
p.s. playground
Simple to understand.
func RemoveDuplicate(array []string) []string {
m := make(map[string]string)
for _, x := range array {
m[x] = x
}
var ClearedArr []string
for x := range m {
ClearedArr = append(ClearedArr, x)
}
return ClearedArr
}
If you don't want to waste memory allocating another slice to copy the values into, you can remove the duplicates in place, as follows:
package main
import "fmt"
var studentsCities = []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
func contains(s []string, e string) bool {
for _, a := range s {
if a == e {
return true
}
}
return false
}
func main() {
fmt.Printf("Cities before remove: %+v\n", studentsCities)
for i := 0; i < len(studentsCities); i++ {
if contains(studentsCities[i+1:], studentsCities[i]) {
studentsCities = remove(studentsCities, i)
i--
}
}
fmt.Printf("Cities after remove: %+v\n", studentsCities)
}
func remove(slice []string, s int) []string {
return append(slice[:s], slice[s+1:]...)
}
Result:
Cities before remove: [Mumbai Delhi Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]
Cities after remove: [Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]
It can also be done with a set-like map:
ddpStrings := []string{}
m := map[string]struct{}{}
for _, s := range strings {
if _, ok := m[s]; ok {
continue
}
ddpStrings = append(ddpStrings, s)
m[s] = struct{}{}
}
func UniqueNonEmptyElementsOf(s []string) []string {
unique := make(map[string]bool, len(s))
var us []string
for _, elem := range s {
if len(elem) != 0 {
if !unique[elem] {
us = append(us, elem)
unique[elem] = true
}
}
}
return us
}
Pass the slice containing duplicates to the function above; it returns a slice with the unique, non-empty elements.
func main() {
studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
uniqueStudentsCities := UniqueNonEmptyElementsOf(studentsCities)
fmt.Println(uniqueStudentsCities)
}
Here's a mapless, index-based duplicate remover/trimmer for slices. It relies on sorting.
The value n always ends up one lower than the number of unique elements, because the loop compares the current element with the ones after it and nothing follows the last element to trigger a final increment. That is why the slice is cut at n+1, to include the last one.
Note that this snippet doesn't clear the leftover duplicate elements. However, since they start at index n+1, you can loop from that index and clear the rest of the elements.
sort.Strings(strs)
for n, i := 0, 0; ; {
if strs[n] != strs[i] {
if i-n > 1 {
strs[n+1] = strs[i]
}
n++
}
i++
if i == len(strs) {
if n != i {
strs = strs[:n+1]
}
break
}
}
fmt.Println(strs)
Based on Riyaz's solution, you can use generics as of Go 1.18:
func removeDuplicate[T string | int](tSlice []T) []T {
allKeys := make(map[T]bool)
list := []T{}
for _, item := range tSlice {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
Generics minimize code duplication.
Go Playground link : https://go.dev/play/p/Y3fEtHJpP7Q
So far #snassr has given the best answer, as it is the most optimized in terms of memory (no extra memory) and runtime (O(n log n)). One thing I want to emphasize: when deleting elements of a slice by index, loop from the end to the start. If we loop from start to end and delete the element at index n, the element that was at n+1 shifts down to n, and the next iteration moves on to n+1, so we accidentally skip it.
Example Code
func Dedup(strs []string) []string {
sort.Strings(strs)
for i := len(strs) - 1; i > 0; i-- {
if strs[i] == strs[i-1] {
strs = append(strs[:i], strs[i+1:]...)
}
}
return strs // the caller must use the returned slice, since its length has changed
}
Try lo.Uniq from https://github.com/samber/lo#uniq:
names := lo.Uniq[string]([]string{"Samuel", "John", "Samuel"})
// []string{"Samuel", "John"}
According to the documentation:
init(_ s: S) where Element == S.Element, S : Sequence
Creates an array containing the elements of a sequence.
struct Test: IteratorProtocol, Sequence {
let id: Int
init(_ id: Int) {
self.id = id
}
mutating func next() -> Test? {
id < 10 ? Test(id + 1) : nil
}
}
let test = Test(5)
let arr = Array(test)
It compiles. And doesn't even throw any runtime errors.
But instead of getting the array [5, 6, 7, 8, 9] as a result, I get an infinite loop! next() is called infinitely many times.
I thought that nil in next() is a natural indicator of the end of sequence. But apparently it's not.
self.id never changes, so it never reaches 10.
It should be something like this
struct Test: IteratorProtocol, Sequence {
var id: Int
init(_ id: Int) {
self.id = id
}
mutating func next() -> Test? {
defer { id += 1 }
return id < 10 ? self : nil
}
}
print(Array(Test(6)))
Another example
struct Countdown: Sequence, IteratorProtocol {
var count: Int
mutating func next() -> Int? {
if count == 0 {
return nil
} else {
defer { count -= 1 }
return count
}
}
}
let threeToGo = Countdown(count: 3)
for i in threeToGo {
print(i)
}
// Prints "3"
// Prints "2"
// Prints "1"
It turns out there is a built-in function that fits the logic of my original question exactly:
sequence(first:next:)
Returns a sequence formed from first and repeated lazy applications of next.
struct Test {
var id: Int
init(_ id: Int) {
self.id = id
}
}
let seq = sequence(first: Test(5), next: { test in
let id = test.id + 1
return id < 10 ? Test(id) : nil
})
let arr = Array(seq)
The function prints, via print(), all possible combinations of the characters "abc" for a specified length.
I need to count these combinations. So far I have only managed to print them one by one with print(). I left a comment at the relevant place in the code.
func allLexicographicRecur (_ string: [String.Element], _ data: [String], _ last: Int, _ index: Int){
var length = string.count-1
var data = data
for i in 0...length {
data[index] = String(string[i])
if index == last {
print(data.joined()) // Displays a combination. It is necessary to somehow calculate.
}else{
allLexicographicRecur(string, data, last, index+1)
}
}
}
func allLexicographic(_ l: Int) {
var alphabet = "abc"
var data = Array(repeating: "", count: l)
var string = alphabet.sorted()
var counter = 0
allLexicographicRecur(string, data, l-1, 0)
}
allLexicographic(3)
The function must somehow return the number of these combinations.
I would be very grateful for the help!
I managed to count only this way (but most likely it is not the best way to do it):
var count = 0
func allLexicographicRecur (_ string: [String.Element], _ data: [String], _ last: Int, _ index: Int){
var length = string.count-1
var data = data
for i in 0...length {
data[index] = String(string[i])
if index == last {
print(data.joined()) // Displays a combination. It is necessary to somehow calculate.
count += 1
}else{
allLexicographicRecur(string, data, last, index+1)
}
}
}
func allLexicographic(_ l: Int) {
var alphabet = "abc"
var data = Array(repeating: "", count: l)
var string = alphabet.sorted()
var counter = 0
allLexicographicRecur(string, data, l-1, 0)
}
allLexicographic(3)
print(count)
You do not need a global variable. There are at least two other options. You can add an inout parameter to allLexicographicRecur to keep track of the count or you can have allLexicographicRecur return its count.
Here's your code using a return value:
func allLexicographicRecur(_ string: [String.Element], _ data: [String], _ last: Int, _ index: Int) -> Int {
let length = string.count - 1
var data = data
var count = 0
for i in 0...length {
data[index] = String(string[i])
if index == last {
print(data.joined()) // Displays a combination. It is necessary to somehow calculate.
count += 1
} else {
count += allLexicographicRecur(string, data, last, index + 1)
}
}
return count
}
func allLexicographic(_ l: Int) -> Int {
let alphabet = "abc"
let data = Array(repeating: "", count: l)
let string = alphabet.sorted()
return allLexicographicRecur(string, data, l - 1, 0)
}
print(allLexicographic(3))
Here's your code updated to use an inout parameter.
func allLexicographicRecur(_ string: [String.Element], _ data: [String], _ last: Int, _ index: Int, _ count: inout Int){
let length = string.count - 1
var data = data
for i in 0...length {
data[index] = String(string[i])
if index == last {
print(data.joined()) // Displays a combination. It is necessary to somehow calculate.
count += 1
} else {
allLexicographicRecur(string, data, last, index + 1, &count)
}
}
}
func allLexicographic(_ l: Int) -> Int {
let alphabet = "abc"
let data = Array(repeating: "", count: l)
let string = alphabet.sorted()
var counter = 0
allLexicographicRecur(string, data, l - 1, 0, &counter)
return counter
}
print(allLexicographic(3))
You cannot manage the count without a global variable because the function is recursive, so the method you wrote in the question is fine for the output you want.
var bar string
var i int
var a []string
for foo, _ := reader.NextWord(); foo != bar; foo, _ = reader.NextWord() {
bar = foo
fmt.Print(foo)
a[i] = foo
i++
}
Shouldn't this be creating a nil slice and then adding the value to the appropriate place? I keep getting index out of range so I assume it's not adding to a[i]...
Checking length first with
if len(a) > 0 {
a[i] = foo
}
seems to help, but I'm not getting the results I expected. I'll keep playing around.
Update: I did end up using append... I meant to update this thread but thank you both.
package main
import (
"fmt"
"log"
"os"
"strings"
"github.com/steven-ferrer/gonsole"
)
func main() {
file, err := os.Open("test.txt")
if err != nil {
log.Println(err)
}
defer file.Close()
reader := gonsole.NewReader(file)
// cycle through
var bar string
var i int
var quality []string = make([]string, 0)
var tempName []string = make([]string, 0)
var name []string = make([]string, 0)
for foo, _ := reader.NextWord(); foo != bar; foo, _ = reader.NextWord() {
bar = foo
if strings.Contains(foo, "(normal)") {
quality = append(quality, "normal")
for state := 0; state < 1; foo, _ = reader.NextWord() {
if foo == "|" {
state = 1
}
tempName = append(tempName, foo)
}
nameString := strings.Join(tempName, "")
name = append(name, nameString)
} else if strings.Contains(foo, "(unique)") {
quality = append(quality, "unique")
for state := 0; state < 1; foo, _ = reader.NextWord() {
if foo == "|" {
state = 1
}
tempName = append(tempName, foo)
}
nameString := strings.Join(tempName, "")
name = append(name, nameString)
} else if strings.Contains(foo, "(set)") {
quality = append(quality, "set")
for state := 0; state < 1; foo, _ = reader.NextWord() {
if foo == "|" {
state = 1
}
tempName = append(tempName, foo)
}
nameString := strings.Join(tempName, "")
name = append(name, nameString)
}
if tempName != nil {
tempName = nil // clear tempName
}
i++
}
}
Your slice a needs to be allocated utilizing make.
var a []string = make([]string, n)
where n is the size of the slice.
Removing some of the context-specific parts of your code, you should be using the built-in append function with a dynamic-length slice.
package main
import (
"fmt"
"strings"
)
func main() {
book := "Lorem ipsum dolor sit amet"
var words []string
for _, word := range strings.Split(book, " ") {
words = append(words, word)
}
fmt.Printf("%+v\n", words)
}
https://play.golang.org/p/LMejsrmIGb9
If you know the number of values up front, the same can be achieved with a fixed-length slice by using words := make([]string, 5), but I doubt this is what you want in this case.
Your code fails because your slice isn't initialized with any length, so the indexes you assign to don't exist yet. Generally, when adding to a slice, append is what you want.
Conversely, when working with an existing slice (i.e., ranging over a slice), you can set values by index because those elements have already been allocated.
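The difference between the two approaches can be sketched as follows. The helper names collectAppend and collectIndexed are hypothetical and only serve to contrast a nil slice grown with append against a slice pre-sized with make.

```go
package main

import "fmt"

// collectAppend starts from a nil slice and lets append grow it.
func collectAppend(words []string) []string {
	var a []string // nil slice: len 0, so a[0] = ... would panic here
	for _, w := range words {
		a = append(a, w)
	}
	return a
}

// collectIndexed allocates len(words) elements up front, so index
// assignment is valid for every i < len(a).
func collectIndexed(words []string) []string {
	a := make([]string, len(words))
	for i, w := range words {
		a[i] = w
	}
	return a
}

func main() {
	words := []string{"Lorem", "ipsum", "dolor"}
	fmt.Println(collectAppend(words))
	fmt.Println(collectIndexed(words))
}
```

Both functions return the same contents; the indexed form only works because the length is known before the loop starts.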
Is there a way to convert a struct to an array of values in Go?
For example, if I have this kind of struct (not just this one):
type Model struct {
Id bson.ObjectId `bson:"_id,omitempty"`
CreatedAt time.Time `bson:",omitempty"`
UpdatedAt time.Time `bson:",omitempty"`
DeletedAt time.Time `bson:",omitempty"`
CreatedBy bson.ObjectId `bson:",omitempty"`
UpdatedBy bson.ObjectId `bson:",omitempty"`
DeletedBy bson.ObjectId `bson:",omitempty"`
Logs []bson.ObjectId `bson:",omitempty"`
}
type User struct {
Name string `bson:"name"`
Model `bson:",inline"`
}
In my case, I usually send the JSON to the browser in this format:
var iota = -1
var data = {
NAME: ++iota, ID: ++iota, CREATED_AT: ++iota, UPDATED_AT: ++iota, DELETED_AT: ++iota, // and so on
rows: [['kiz',1,'2014-01-01','2014-01-01','2014-01-01'],
['yui',2,'2014-01-01','2014-01-01','2014-01-01'],
['ham',3,'2014-01-01','2014-01-01','2014-01-01'] // and so on
]
};
Instead of:
var data = {
rows: [{NAME:'kiz',ID:1,CreatedAt:'2014-01-01',UpdatedAt:'2014-01-01',DeletedAt:'2014-01-01'},
{NAME:'yui',ID:2,CreatedAt:'2014-01-01',UpdatedAt:'2014-01-01',DeletedAt:'2014-01-01'},
{NAME:'ham',ID:3,CreatedAt:'2014-01-01',UpdatedAt:'2014-01-01',DeletedAt:'2014-01-01'} // and so on
]
}
Here's what I've tried:
import (
"fmt"
"github.com/kr/pretty"
//"gopkg.in/mgo.v2"
"gopkg.in/mgo.v2/bson"
"reflect"
"runtime"
"strings"
"time"
)
// copy the model from above
func Explain(variable interface{}) {
_, file, line, _ := runtime.Caller(1)
//res, _ := json.MarshalIndent(variable, " ", " ")
res := pretty.Formatter(variable)
fmt.Printf("%s:%d: %# v\n", file[len(FILE_PATH):], line, res)
//spew.Dump(variable)
}
func s2a(i interface{}) []interface{} { // taken from https://gist.github.com/tonyhb/5819315
iVal := reflect.ValueOf(i).Elem()
//typ := iVal.Type()
values := make([]interface{}, 0, iVal.NumField())
for i := 0; i < iVal.NumField(); i++ {
f := iVal.Field(i)
//tag := typ.Field(i).Tag.Get("tagname")
//fmt.Println(tag)
// name := typ.Field(i).Name
v := f.Interface()
switch v.(type) {
case int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, float32, float64, string, []byte, time.Time:
// do nothing
// case struct{}: // how to catch any embeeded struct?
case Model: // Model (or any embedded/nameless struct) should also converted to array
//arr := s2a() // invalid type assertion: f.(Model) (non-interface type reflect.Value on left)
//arr := s2a(f.Addr().(&Model)) // invalid type assertion: f.Addr().(&Model) (non-interface type reflect.Value on left)
// umm.. how to convert f back to Model?
//for _, e := range arr {
// values = append(values, e)
//}
default: // struct? but also interface and map T_T
//v = s2a(&v)
}
values = append(values, v)
}
return values
}
func main() {
//sess, err := mgo.Dial("127.0.0.1")
//Check(err, "unable to connect")
//db := sess.DB("test")
//coll := db.C("coll1")
user := User{}
user.Id = bson.NewObjectId()
user.Name = "kis"
//changeInfo, err := coll.UpsertId(user.Id, user)
//Check(err, "failed to insert")
//Explain(changeInfo)
//Explain(s2a(changeInfo))
user.Name = "test"
Explain(user)
Explain(s2a(&user))
//err = coll.FindId(user.Id).One(&user)
//Check(err, "failed to fetch")
//Explain(user)
//Explain(s2a(&user))
user.CreatedAt = time.Now()
//err = coll.UpdateId(user.Id, user)
//Check(err, "failed to update")
//Explain(changeInfo)
Explain(s2a(&user))
user.CreatedAt = user.DeletedAt
//err = coll.FindId(user.Id).One(&user)
//Check(err, "failed to fetch")
Explain(user)
Explain(s2a(&user))
}
Is there an easy/fast way to convert a struct to an array (and, if there is a struct embedded inside it, convert that to an array as well)?
If you are happy to specify a fixed order for the fields in the array representation, you could do this by implementing the json.Marshaler interface to customise its representation. For example:
func (u User) MarshalJSON() ([]byte, error) {
a := []interface{}{
u.Name,
u.Id,
...,
}
return json.Marshal(a)
}
Now when you marshal variables of this type, they will be represented as an array. If you want to also do the reverse (unmarshal an array into this struct), you will also need to implement the json.Unmarshaler interface. This could be done in a similar fashion, using json.Unmarshal to decode into a []interface{} slice and then pull out the values. Make sure UnmarshalJSON is declared to take a pointer receiver though, or your code won't work (you'll end up updating a copy of the struct rather than the struct itself).
Why not use reflect.Kind()? Here's the playground: http://play.golang.org/p/YjbsnB4eln
Use the reflect package.
Here's some playground code that'll work for one record (of any struct type), you can refactor it to work for a slice of records.
EDIT: (copy-pasted for good measure)
package main
import "fmt"
import "strings"
import "reflect"
type X struct {
Y string
Z int
}
func main() {
data := X{"yval",3}
expectedResult := `{"Y": 0, "Z": 1, "rows": [["yval", 3]]}`
fmt.Println(convert(data))
fmt.Println(expectedResult)
}
func convert(data interface{}) string {
v := reflect.ValueOf(data)
n := v.NumField()
st := reflect.TypeOf(data)
headers := make([]string, n)
for i := 0; i < n; i++ {
headers[i] = fmt.Sprintf(`"%s": %d`, st.Field(i).Name, i)
}
rowContents := make([]string, n)
for i := 0; i < n; i++ {
x := v.Field(i)
s := fmt.Sprintf("%v", x.Interface())
if x.Type().String() == "string" {
s = `"` + s + `"`
}
rowContents[i] = s
}
return "{" + strings.Join(headers, ", ") + `, "rows": [[` + strings.Join(rowContents, ", ") + "]]}"
}