How to remove duplicate strings or ints from a slice in Go

Let's say I have a list of student cities, which could contain 100 or 1000 entries, and I want to filter out all duplicate cities.
I want a generic solution that I can use to remove all duplicate strings from any slice.
I am new to Go, so I tried to do it by looping over the slice and checking whether each element already exists using a second loop in a helper function.
Students' cities list (data):
studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
Functions that I created, and it's doing the job:
func contains(s []string, e string) bool {
for _, a := range s {
if a == e {
return true
}
}
return false
}
func removeDuplicates(strList []string) []string {
list := []string{}
for _, item := range strList {
fmt.Println(item)
if contains(list, item) == false {
list = append(list, item)
}
}
return list
}
My solution test
func main() {
studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
uniqueStudentsCities := removeDuplicates(studentsCities)
fmt.Println(uniqueStudentsCities) // Expected output [Mumbai Delhi Ahmedabad Bangalore Kolkata Pune]
}
I believe the solution above is not optimal. What is the fastest way to remove duplicates from a slice?
I searched Stack Overflow and could not find this question already asked, so I did not find a solution there.

I found Burak's and Fazlan's solutions helpful. Based on them, I implemented simple functions that remove duplicate data from slices of strings, integers, or other types using a generic approach.
Here are my three functions: the first is generic, the second is for slices of strings, and the last is for slices of integers. Pass in your data and each returns the unique values.
Generic solution (Go 1.18+):
func removeDuplicate[T string | int](sliceList []T) []T {
allKeys := make(map[T]bool)
list := []T{}
for _, item := range sliceList {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
To remove duplicate strings from slice:
func removeDuplicateStr(strSlice []string) []string {
allKeys := make(map[string]bool)
list := []string{}
for _, item := range strSlice {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
To remove duplicate integers from slice:
func removeDuplicateInt(intSlice []int) []int {
allKeys := make(map[int]bool)
list := []int{}
for _, item := range intSlice {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
You can change the element type to filter duplicates out of other kinds of slices in the same way.
Here is the GoPlayground link: https://go.dev/play/p/iyb97KcftMa

Adding this answer, which worked for me. Note that it sorts the slice, so the original order is not preserved.
func removeDuplicateStrings(s []string) []string {
if len(s) < 1 {
return s
}
sort.Strings(s)
prev := 1
for curr := 1; curr < len(s); curr++ {
if s[curr-1] != s[curr] {
s[prev] = s[curr]
prev++
}
}
return s[:prev]
}
For fun, I tried using generics! (Go 1.18+ only)
type SliceType interface {
~string | ~int | ~float64 // add more *ordered* types as needed (the code below uses <)
}
func removeDuplicates[T SliceType](s []T) []T {
if len(s) < 1 {
return s
}
// sort
sort.SliceStable(s, func(i, j int) bool {
return s[i] < s[j]
})
prev := 1
for curr := 1; curr < len(s); curr++ {
if s[curr-1] != s[curr] {
s[prev] = s[curr]
prev++
}
}
return s[:prev]
}
Go Playground Link with tests: https://go.dev/play/p/bw1PP1osJJQ

You can do in-place replacement guided with a map:
processed := map[string]struct{}{}
w := 0
for _, s := range cities {
if _, exists := processed[s]; !exists {
// If this city has not been seen yet, add it to the list
processed[s] = struct{}{}
cities[w] = s
w++
}
}
cities = cities[:w]

To reduce memory usage, use an empty struct as the map value:
package main
import (
"fmt"
"reflect"
)
type void struct{}
func main() {
digits := [6]string{"one", "two", "three", "four", "five", "five"}
set := make(map[string]void)
for _, element := range digits {
set[element] = void{}
}
fmt.Println(reflect.ValueOf(set).MapKeys())
}

Simple to understand. Note that the result order is not preserved, because Go does not guarantee map iteration order.
func RemoveDuplicate(array []string) []string {
m := make(map[string]string)
for _, x := range array {
m[x] = x
}
var ClearedArr []string
for x := range m {
ClearedArr = append(ClearedArr, x)
}
return ClearedArr
}

If you don't want to waste memory allocating another slice to copy the values into, you can remove the duplicates in place, as follows:
package main
import "fmt"
var studentsCities = []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
func contains(s []string, e string) bool {
for _, a := range s {
if a == e {
return true
}
}
return false
}
func main() {
fmt.Printf("Cities before remove: %+v\n", studentsCities)
for i := 0; i < len(studentsCities); i++ {
if contains(studentsCities[i+1:], studentsCities[i]) {
studentsCities = remove(studentsCities, i)
i--
}
}
fmt.Printf("Cities after remove: %+v\n", studentsCities)
}
func remove(slice []string, s int) []string {
return append(slice[:s], slice[s+1:]...)
}
Result:
Cities before remove: [Mumbai Delhi Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]
Cities after remove: [Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]

It can also be done with a set-like map:
ddpStrings := []string{}
m := map[string]struct{}{}
for _, s := range strings {
if _, ok := m[s]; ok {
continue
}
ddpStrings = append(ddpStrings, s)
m[s] = struct{}{}
}

func UniqueNonEmptyElementsOf(s []string) []string {
unique := make(map[string]bool, len(s))
var us []string
for _, elem := range s {
if len(elem) != 0 {
if !unique[elem] {
us = append(us, elem)
unique[elem] = true
}
}
}
return us
}
Pass the slice containing duplicates to the function above; it returns a slice with only the unique, non-empty elements.
func main() {
studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}
uniqueStudentsCities := UniqueNonEmptyElementsOf(studentsCities)
fmt.Println(uniqueStudentsCities)
}

Here's a mapless, index-based duplicate "remover"/trimmer for slices. It relies on sorting first.
The value n always ends one lower than the number of unique elements, because the loop compares the current element with the next one and nothing follows the last, so the final slice is taken as strs[:n+1] to include it.
Note that this snippet does not clear the leftover elements beyond the deduplicated prefix. Since they start at index n+1, you can loop from that index and set the remaining elements to their zero value.
sort.Strings(strs)
for n, i := 0, 0; ; {
if strs[n] != strs[i] {
if i-n > 1 {
strs[n+1] = strs[i]
}
n++
}
i++
if i == len(strs) {
if n != i {
strs = strs[:n+1]
}
break
}
}
fmt.Println(strs)

Based on Riyaz's solution, you can use generics since Go 1.18
func removeDuplicate[T string | int](tSlice []T) []T {
allKeys := make(map[T]bool)
list := []T{}
for _, item := range tSlice {
if _, value := allKeys[item]; !value {
allKeys[item] = true
list = append(list, item)
}
}
return list
}
Generics minimize code duplication.
Go Playground link : https://go.dev/play/p/Y3fEtHJpP7Q

So far @snassr has given the best answer, as it is the most efficient in terms of memory (no extra allocation) and runtime (O(n log n)). One thing I want to emphasize: when deleting elements from a slice while iterating over it, loop from the end to the start, because it keeps the loop simple. If you loop from start to end and delete index n, the element that was at n+1 shifts down to n, and the next iteration (at index n+1) skips it.
Example Code
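func Dedup(strs []string) []string {
	sort.Strings(strs)
	for i := len(strs) - 1; i > 0; i-- {
		if strs[i] == strs[i-1] {
			strs = append(strs[:i], strs[i+1:]...)
		}
	}
	// Return the shortened slice; the caller's slice header keeps its old length.
	return strs
}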

try: https://github.com/samber/lo#uniq
names := lo.Uniq[string]([]string{"Samuel", "John", "Samuel"})
// []string{"Samuel", "John"}

Related

Remove slice of string from slice

I want to remove a set of elements from a slice, for example remove "A" and "B" from "A" to "Z", and I want to do it efficiently (I don't know how to do this in Go, but in Python we can use a hashmap).
The code below is the closest I can get but there are edge cases I miss:
func removeString(listOri []string, targetDelete []string) []string {
newitems := []string{}
for i := range listOri {
for j := range targetDelete {
if listOri [i] != targetDelete[j] {
newitems = append(newitems, listOri [i])
}
}
}
return newitems
}
listOriginal := []string{"A", "B", "C", "D"}
listDelete := []string{"A", "B"}
listNew := removeString(listOriginal, listDelete)
result = "A","B","C","C","D","D"

It's better (faster) to use a map to represent the items that are to be deleted. If there are N items in the original list and M items in the to-be-deleted list, your code (once the bugs are fixed) would run in O(NM) time, whereas a map-based solution runs in O(N) time.
Here's example code:
package main
import "fmt"
func filter(src []string, del map[string]bool) []string {
var dst []string
for _, s := range src {
if !del[s] {
dst = append(dst, s)
}
}
return dst
}
func main() {
src := []string{"A", "B", "C", "D"}
del := map[string]bool{"A": true, "B": true}
fmt.Println(filter(src, del))
}
If you really need the to-be-deleted things to be a slice, you should convert the slice into a map first. Then the code is O(N+M) time.

What you need to do is check if each item in the original exists in the list of items to delete, and if not then you add it to the result:
func removeString(listOri []string, targetDelete []string) []string {
newitems := []string{}
var found bool
for i := range listOri {
found = false
for j := range targetDelete {
if listOri[i] == targetDelete[j] {
found = true
break
}
}
if !found {
newitems = append(newitems, listOri[i])
}
}
return newitems
}
You might also find Does Go have "if x in" construct similar to Python? informative.

Recursively changing arrays to non-arrays in JSON with sjson in Golang

What I'm trying to do:
Transform all arrays of length 1 in a JSON file to non arrays.
e.g.
Input: {"path": [{"secret/foo": [{"capabilities": ["read"]}]}]}
Output: {"path": {"secret/foo": {"capabilities": "read"}}}
I can't use Structs as the JSON format will vary...
Right now I've managed to at least detect the 1 length slices:
package main
import (
"encoding/json"
"fmt"
)
func findSingletons(value interface{}) {
switch value.(type) {
case []interface{}:
if len(value.([]interface{})) == 1 {
fmt.Println("1 length array found!", value)
}
for _, v := range value.([]interface{}) {
findSingletons(v)
}
case map[string]interface{}:
for _, v := range value.(map[string]interface{}) {
findSingletons(v)
}
}
}
func removeSingletonsFromJSON(input string) {
jsonFromInput := json.RawMessage(input)
jsonMap := make(map[string]interface{})
err := json.Unmarshal([]byte(jsonFromInput), &jsonMap)
if err != nil {
panic(err)
}
findSingletons(jsonMap)
fmt.Printf("JSON value of without singletons:%s\n", jsonMap)
}
func main() {
jsonParsed := []byte(`{"path": [{"secret/foo": [{"capabilities": ["read"]}]}]}`)
removeSingletonsFromJSON(string(jsonParsed))
fmt.Println(`Should have output {"path": {"secret/foo": {"capabilities": "read"}}}`)
}
Which outputs
1 length array found! [map[secret/foo:[map[capabilities:[read]]]]]
1 length array found! [map[capabilities:[read]]]
1 length array found! [read]
JSON value of without singletons:map[path:[map[secret/foo:[map[capabilities:[read]]]]]]
Should have output {"path": {"secret/foo": {"capabilities": "read"}}}
But I'm not sure how I can change them into non-arrays...

The type switch is your friend:
switch t := v.(type) {
case []interface{}:
if len(t) == 1 {
data[k] = t[0]
And you may use recursion to remove inside elements, like so:
func removeOneElementSlice(data map[string]interface{}) {
for k, v := range data {
switch t := v.(type) {
case []interface{}:
if len(t) == 1 {
data[k] = t[0]
if v, ok := data[k].(map[string]interface{}); ok {
removeOneElementSlice(v)
}
}
}
}
}
I would do this to convert
{"path":[{"secret/foo":[{"capabilities":["read"]}]}]}
to
{"path":{"secret/foo":{"capabilities":"read"}}}:
package main
import (
"encoding/json"
"fmt"
"log"
)
func main() {
s := `{"path":[{"secret/foo":[{"capabilities":["read"]}]}]}`
fmt.Println(s)
var data map[string]interface{}
if err := json.Unmarshal([]byte(s), &data); err != nil {
panic(err)
}
removeOneElementSlice(data)
buf, err := json.Marshal(data)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(buf)) // {"path":{"secret/foo":{"capabilities":"read"}}}
}
func removeOneElementSlice(data map[string]interface{}) {
for k, v := range data {
switch t := v.(type) {
case []interface{}:
if len(t) == 1 {
data[k] = t[0]
if v, ok := data[k].(map[string]interface{}); ok {
removeOneElementSlice(v)
}
}
}
}
}
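The function above only unwraps one-element arrays that sit directly under a map key. A more general sketch, which also descends into longer arrays and into maps nested at any depth, is given here. `unwrap` and `convert` are hypothetical names, and the extra `"list"` key in `main` is made up purely to exercise the array case.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unwrap returns v with every one-element array replaced by its sole
// element, recursing through objects and arrays of any length.
func unwrap(v interface{}) interface{} {
	switch t := v.(type) {
	case []interface{}:
		if len(t) == 1 {
			return unwrap(t[0])
		}
		for i := range t {
			t[i] = unwrap(t[i])
		}
		return t
	case map[string]interface{}:
		for k, e := range t {
			t[k] = unwrap(e) // updating existing keys while ranging is safe
		}
		return t
	default:
		return v
	}
}

// convert decodes input, unwraps singleton arrays, and re-encodes it.
func convert(input string) (string, error) {
	var data interface{}
	if err := json.Unmarshal([]byte(input), &data); err != nil {
		return "", err
	}
	out, err := json.Marshal(unwrap(data))
	return string(out), err
}

func main() {
	out, err := convert(`{"path": [{"secret/foo": [{"capabilities": ["read"]}]}], "list": [1, [2]]}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Returning the transformed value, instead of mutating a parent map, is what lets the same function handle arrays inside arrays.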

Index out of range trying to add to slice

var bar string
var i int
var a []string
for foo, _ := reader.NextWord(); foo != bar; foo, _ = reader.NextWord() {
bar = foo
fmt.Print(foo)
a[i] = foo
i++
}
Shouldn't this be creating a nil slice and then adding the value to the appropriate place? I keep getting index out of range so I assume it's not adding to a[i]...
Checking length first with
if len(a) > 0 {
a[i] = foo
}
seems to help, but not getting the results I expected. I'll keep playing around.
Update: I did end up using append... I meant to update this thread but thank you both.
package main
import (
"fmt"
"log"
"os"
"strings"
"github.com/steven-ferrer/gonsole"
)
func main() {
file, err := os.Open("test.txt")
if err != nil {
log.Println(err)
}
defer file.Close()
reader := gonsole.NewReader(file)
// cycle through
var bar string
var i int
var quality []string = make([]string, 0)
var tempName []string = make([]string, 0)
var name []string = make([]string, 0)
for foo, _ := reader.NextWord(); foo != bar; foo, _ = reader.NextWord() {
bar = foo
if strings.Contains(foo, "(normal)") {
quality = append(quality, "normal")
for state := 0; state < 1; foo, _ = reader.NextWord() {
if foo == "|" {
state = 1
}
tempName = append(tempName, foo)
}
nameString := strings.Join(tempName, "")
name = append(name, nameString)
} else if strings.Contains(foo, "(unique)") {
quality = append(quality, "unique")
for state := 0; state < 1; foo, _ = reader.NextWord() {
if foo == "|" {
state = 1
}
tempName = append(tempName, foo)
}
nameString := strings.Join(tempName, "")
name = append(name, nameString)
} else if strings.Contains(foo, "(set)") {
quality = append(quality, "set")
for state := 0; state < 1; foo, _ = reader.NextWord() {
if foo == "|" {
state = 1
}
tempName = append(tempName, foo)
}
nameString := strings.Join(tempName, "")
name = append(name, nameString)
}
if tempName != nil {
tempName = nil // clear tempName
}
i++
}
}

Your slice a needs to be allocated utilizing make.
var a []string = make([]string, n)
where n is the size of the slice.

Removing some of the context-specific parts of your code, you should be using the built-in append function with a dynamic-length slice.
package main
import (
"fmt"
"strings"
)
func main() {
book := "Lorem ipsum dolor sit amet"
var words []string
for _, word := range strings.Split(book, " ") {
words = append(words, word)
}
fmt.Printf("%+v\n", words)
}
https://play.golang.org/p/LMejsrmIGb9
If you know the number of values up front, the same can be achieved with a pre-sized slice by using words := make([]string, 5), but I doubt this is what you want in this case.
The reason your code fails is that your slice has length zero, so the indexes you assign to don't exist yet. Generally, when growing a slice, append is what you want.
Conversely, when working with an existing slice (i.e., ranging over it), you can set values by index because those elements have already been allocated.

How to break out of nested loops in Go?

I have an outer and inner loop, each iterating over a range. I want to exit the outer loop when a condition is satisfied inside the inner loop.
I have a solution which works using two 'break's, one inside the inner loop and one inside the outerloop, just outside the inner loop (a very simplified case for demonstration):
package main
import (
"fmt"
)
func main() {
word := ""
for _, i := range("ABCDE") {
for _,j := range("ABCDE") {
word = string(i) + string(j)
fmt.Println(word)
if word == "DC" {
break
}
}
if word == "DC" {
break
}
}
// More logic here that needs to be executed
}
There is no problem with this solution, but it just looks patched and ugly to me. Is there a better way to do this?
I can try and have another for conditional loop outside the outer loop in the previous solution and have a label and use continue with the label. But as you can see, this approach isn't any more elegant than the solution with break.
package main
import (
"fmt"
)
func main() {
word := ""
Exit:
for word != "DC" {
for _, i := range "ABCDE" {
for _, j := range "ABCDE" {
word = string(i) + string(j)
fmt.Println(word)
if word == "DC" {
continue Exit
}
}
}
}
// More logic here that needs to be executed
}
I have seen similar questions here pertaining to other languages (C, C#, Python etc). But what I am really interested to see is whether there is any trick with Go constructs such as 'for select'.

Use break {label} to break out of any loop, however deeply nested. Just put the label before the for loop you want to break out of. This is fairly similar to code that uses goto {label}, but I think it is a tad more elegant; that is a matter of opinion, I guess.
package main
func main() {
out:
for i := 0; i < 10; i++ {
for j := 0; j < 10; j++ {
if i + j == 20 {
break out
}
}
}
}
More details: https://www.ardanlabs.com/blog/2013/11/label-breaks-in-go.html
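Applied to the two-letter-word loops from the question, a labeled break might look like the following sketch. The function name `wordsUntil` and the label `outer` are hypothetical, and the words are collected into a slice rather than printed so the result is easy to inspect.

```go
package main

import "fmt"

// wordsUntil collects two-letter words until target is produced,
// leaving both loops at once with a labeled break.
func wordsUntil(target string) []string {
	var words []string
outer:
	for _, i := range "ABCDE" {
		for _, j := range "ABCDE" {
			word := string(i) + string(j)
			words = append(words, word)
			if word == target {
				break outer // exits the loop labeled "outer", not just the inner one
			}
		}
	}
	// Execution continues here after the break.
	return words
}

func main() {
	words := wordsUntil("DC")
	fmt.Println(len(words), words[len(words)-1])
}
```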

Use a function and return from it:
package main
import (
"fmt"
)
func getWord() string {
word := ""
for word != "DC" {
for _, i := range "ABCDE" {
for _, j := range "ABCDE" {
word = string(i) + string(j)
fmt.Println(word)
if word == "DC" {
return word
}
}
}
}
return word
}
func main() {
word := getWord()
fmt.Println("found:", word) // use the result; an unused variable does not compile
}
Edit: thanks to @peterSO, who pointed out some mistakes in the details and provided this playground: https://play.golang.org/p/udcJptBW9pQ

How about goto?
package main
import (
"fmt"
)
func main() {
word := ""
for _, i := range "ABCDE" {
for _, j := range "ABCDE" {
word = string(i) + string(j)
fmt.Println(word)
if word == "DC" {
goto Exit
}
}
}
Exit: // More logic here that needs to be executed
}

The most straightforward seems to be something like:
func main() {
word := ""
isDone := false
for _, i := range("ABCDE") {
for _,j := range("ABCDE") {
word = string(i) + string(j)
fmt.Println(word)
isDone = word == "DC"
if isDone {
break
}
}
if isDone {
break
}
}
// other stuff
}
An Alternative using a Generator
However you could also do a generator to create the sequence of words as in:
func makegen () chan string {
c:= make(chan string)
go func () {
for _, i := range ("ABCDE") {
for _, j := range ("ABCDE") {
c <- string(i) + string(j)
}
}
close (c)
}()
return c
}
func main() {
word := ""
for word = range makegen() {
fmt.Println (word)
if word == "DC" {
break
}
}
// other code
}
An improved version of the generator function that will clean up the resource leak identified by a comment below.
func makegen () chan string {
c:= make(chan string)
go func () {
word := ""
for _, i := range ("ABCDE") {
for _, j := range ("ABCDE") {
word = string(i) + string(j)
c <- word
if word == "DC" {
close (c)
return
}
}
}
close (c)
}()
return c
}
func main() {
word := ""
for word = range makegen() {
fmt.Println (word)
}
// other code
}
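Another common way to avoid leaving the producer goroutine blocked is to let the consumer signal that it has stopped reading. The sketch below adds a `done` channel parameter to `makegen`. That parameter and the `firstUntil` helper are assumptions for illustration, not part of the answer above.

```go
package main

import "fmt"

// makegen sends words on the returned channel until the sequence ends
// or done is closed. The done parameter is an addition for this sketch.
func makegen(done <-chan struct{}) <-chan string {
	c := make(chan string)
	go func() {
		defer close(c)
		for _, i := range "ABCDE" {
			for _, j := range "ABCDE" {
				select {
				case c <- string(i) + string(j):
				case <-done:
					return // consumer stopped early; exit instead of blocking forever
				}
			}
		}
	}()
	return c
}

// firstUntil reads words until target is seen, then stops the producer.
func firstUntil(target string) []string {
	done := make(chan struct{})
	defer close(done) // releases the producer goroutine on return
	var seen []string
	for word := range makegen(done) {
		seen = append(seen, word)
		if word == target {
			break
		}
	}
	return seen
}

func main() {
	words := firstUntil("DC")
	fmt.Println(len(words), words[len(words)-1])
}
```

With this shape the stop condition stays in the consumer, so the generator does not need to know about "DC".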

Wrap your for loops in an anonymous self-invoked function, then just return whenever you want to break out
package main
func main() {
func() {
for i:= 0; i < 100; i++ {
for j:= 0; j < 100; j++ {
if (i == 5 && j == 5) {
return
}
}
}
}()
}

Just defer anything you need to do and return as normal.
package main
import (
"fmt"
)
func main() {
defer func() {
// More logic here that needs to be executed
}()
word := ""
for _, i := range "ABCDE" {
for _, j := range "ABCDE" {
word = string(i) + string(j)
fmt.Println(word)
if word == "DC" {
return
}
}
}
}

Iterating over all the keys of a map

Is there a way to get a list of all the keys in a Go language map? The number of elements is given by len(), but if I have a map like:
m := map[string]string{ "key1":"val1", "key2":"val2" };
How do I iterate over all the keys?

https://play.golang.org/p/JGZ7mN0-U-
for k, v := range m {
fmt.Printf("key[%s] value[%s]\n", k, v)
}
or
for k := range m {
fmt.Printf("key[%s] value[%s]\n", k, m[k])
}
The Go language spec for for statements with a range clause states that the first value is the key and the second is the value; the second variable may be omitted.

Here's an easy way to get a slice of the map's keys.
// Return keys of the given map
func Keys(m map[string]interface{}) (keys []string) {
for k := range m {
keys = append(keys, k)
}
return keys
}
// use `Keys` func
func main() {
m := map[string]interface{}{
"foo": 1,
"bar": true,
"baz": "baz",
}
fmt.Println(Keys(m)) // e.g. [foo bar baz]; map iteration order is not guaranteed
}

Is there a way to get a list of all the keys in a Go language map?
ks := reflect.ValueOf(m).MapKeys()
how do I iterate over all the keys?
Use the accepted answer:
for k := range m { ... }

A Type agnostic solution:
v := reflect.ValueOf(yourMap) // MapIndex is a method of reflect.Value, not of the map
for _, key := range v.MapKeys() {
	value := v.MapIndex(key).Interface()
	fmt.Println("Key:", key, "Value:", value)
}

Using Generics:
func Keys[K comparable, V any](m map[K]V) []K {
keys := make([]K, 0, len(m))
for k := range m {
keys = append(keys, k)
}
return keys
}

For sorted keys of map[string]string.
package main
import (
"fmt"
"sort"
)
func main() {
m := map[string]string{"key1": "val1", "key2": "val2"}
sortStringMap(m)
}
// sortStringMap prints the entries of a map[string]string ordered by key
func sortStringMap(m map[string]string) {
var keys []string
for key := range m {
keys = append(keys, key)
}
sort.Strings(keys) // sort the keys
for _, key := range keys {
fmt.Printf("%s\t:%s\n", key, m[key])
}
}
output:
key1 :val1
key2 :val2
