How do slices work in Go? - arrays

a = make([]int, 7, 15)
creates an implicit array of size 15, and the slice (a) creates a shallow copy of the implicit array and points to the first 7 elements of the array.
Consider,
var a []int;
creates a zero length slice that does not point to any implicit array.
a = append(a, 9, 86);
creates a new implicit array of length 2 and appends the values 9 and 86. The slice (a) points to that new implicit array, where
len(a) is 2 and cap(a) >= 2
My question:
is this the correct understanding?

As I mentioned in "Declare slice or make slice?", the zero value of a slice (nil) acts like a zero-length slice.
So you can append to a []int directly.
You would need to make a slice (make([]int, 0) ) only if you wanted to potentially return an empty slice (instead of nil).
If not, no need to allocate memory before starting appending.
See also "Arrays, slices (and strings): The mechanics of 'append': Nil"
a nil slice is functionally equivalent to a zero-length slice, even though it points to nothing. It has length zero and can be appended to, with allocation.
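To make the point above concrete, here is a minimal sketch. The helper name appendToNil is hypothetical and exists only for illustration. It shows that a nil slice has length zero, can be appended to without a prior make, and differs from an empty non-nil slice only when compared with nil.

```go
package main

import "fmt"

// appendToNil (hypothetical helper) appends to a nil slice without calling make.
func appendToNil() []int {
	var a []int          // nil slice: no backing array yet
	a = append(a, 9, 86) // append allocates a backing array as needed
	return a
}

func main() {
	var a []int
	fmt.Println(a == nil, len(a), cap(a)) // true 0 0

	b := make([]int, 0)           // empty but non-nil
	fmt.Println(b == nil, len(b)) // false 0

	c := appendToNil()
	fmt.Println(c, len(c), cap(c) >= 2) // [9 86] 2 true
}
```

The exact capacity after append is an implementation detail, which is why the sketch only checks cap(c) >= 2.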

Related

Does Go slicing operator allocate new underlying array?

I was reading this article which explains how slices in Go are implemented under the hood:
https://medium.com/swlh/golang-tips-why-pointers-to-slices-are-useful-and-how-ignoring-them-can-lead-to-tricky-bugs-cac90f72e77b
At the end of the article is this snippet of Go code:
func main() {
	slice := make([]string, 1, 3)
	func(slice []string) {
		slice = slice[1:3]
		slice[0] = "b"
		slice[1] = "b"
		fmt.Print(len(slice))
		fmt.Print(slice)
	}(slice)
	fmt.Print(len(slice))
	fmt.Print(slice)
}
My first guess was this would print:
2 [b b]3 [ b b]
In fact it prints:
2[b b]1[]
This suggests that when the anonymous function creates a new local slice by slicing the one passed to it as an argument, a new underlying array is allocated for that slice. I tried to confirm this with a modified version of the code:
func main() {
	slice := make([]string, 1, 3)
	hdr := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
	fmt.Printf("address of underlying array in main: %p\n", unsafe.Pointer(hdr.Data))
	func(slice []string) {
		hdr := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
		fmt.Printf("address of underlying array in func before slicing: %p\n", unsafe.Pointer(hdr.Data))
		slice = slice[1:3]
		slice[0] = "b"
		slice[1] = "b"
		hdr = (*reflect.SliceHeader)(unsafe.Pointer(&slice))
		fmt.Printf("address of underlying array in func after slicing: %p\n", unsafe.Pointer(hdr.Data))
		fmt.Print(len(slice))
		fmt.Println(slice)
	}(slice)
	fmt.Print(len(slice))
	fmt.Println(slice)
}
Which prints:
address of underlying array in main: 0xc0000121b0
address of underlying array in func before slicing: 0xc0000121b0
address of underlying array in func after slicing: 0xc0000121c0
2[b b]
1[]
My question is: why is the slicing operation in anonymous function causing a new array to be allocated? My understanding was that if in main we are creating a slice with capacity 3, an underlying array with length 3 is created and when anonymous function is manipulating the slice, it is manipulating the same underlying array.
As mkopriva mentioned, the reslicing does not reallocate anything. Reallocation happens only when appending new values would exceed the slice's capacity (source).
The output you got is because the elements you are "able to see" in a slice (including when you print it) depend on its length:
The number of elements is called the length of the slice and is never negative. [...] The length of a slice s can be discovered by the built-in function len;
The original slice constructed with slice := make([]string, 1, 3), has length=1, so when you print it, the output will be the one element at position 0 of the backing array, which is an empty string.
With this code:
slice = slice[1:3]
slice[0] = "b"
slice[1] = "b"
you are effectively mutating the elements at positions 1 and 2 of the backing array, neither of which is an element of the original slice.
If you reslice the original slice up to capacity — thus extending its length, it will print what you expect:
slice = slice[:cap(slice)]
fmt.Println(len(slice)) // 3
fmt.Println(slice) // [ b b]
// ^ first is empty string
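The claim that reslicing shares the backing array can be checked without unsafe by comparing element addresses. This is a small sketch; the helper sameBacking is a hypothetical name, and it assumes the slice passed in has a capacity of at least 2.

```go
package main

import "fmt"

// sameBacking (hypothetical helper) reports whether index 1 of the
// full-capacity view and index 0 of orig[1:] are the same memory location.
// It assumes cap(orig) >= 2.
func sameBacking(orig []string) bool {
	full := orig[:cap(orig)] // extend the view up to capacity
	sub := orig[1:cap(orig)] // reslice starting at the second element
	return &full[1] == &sub[0]
}

func main() {
	s := make([]string, 1, 3)
	sub := s[1:3]
	sub[0], sub[1] = "b", "b"

	fmt.Println(len(s), s)      // 1 []
	fmt.Println(sameBacking(s)) // true
	fmt.Println(s[:cap(s)])     // [ b b]
}
```

This also explains the addresses printed in the question: after slice[1:3] the data pointer moves forward by exactly one string header (16 bytes on a 64-bit platform), so it still points into the same array.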

Updating golang array while iterating it

When iterating over an array with range, updates to the array do not show up in later iterations of the loop. In the following, the second loop (over the array B) prints "1 2" instead of "1 0":
package main

import (
	"fmt"
)

func main() {
	var A = &[2]int{1, 2}
	for i, v := range A {
		if i == 0 {
			A[1] = 0
		}
		fmt.Print(v, " ")
	}
	fmt.Println()
	var B = [2]int{1, 2}
	for i, v := range B {
		if i == 0 {
			B[1] = 0
		}
		fmt.Print(v, " ")
	}
}
https://play.golang.org/p/0zZY6vjxwut
It looks like the array is copied before it's iterated.
What part of the spec describes this behavior?
See "For statements with range clause" at https://golang.org/ref/spec#For_range
TL;DR: Whatever you range over, a copy is made of it (this is the general "rule", but there is an exception, see below). Arrays are rare in Go; usually slices are used. Slice values (slice headers) contain a pointer to an underlying array, so copying a slice header is fast and efficient, and it does not copy the slice elements, unlike with arrays. Ranging over a pointer to an array is similar to ranging over a slice in this regard.
Spec: For statements:
The range expression x is evaluated once before beginning the loop, with one exception: if at most one iteration variable is present and len(x) is constant, the range expression is not evaluated.
Arrays are values, they do not contain pointers to data located outside of the array's memory (unlike slices). The Go Blog: Go Slices: usage and internals:
Go's arrays are values. An array variable denotes the entire array; it is not a pointer to the first array element (as would be the case in C). This means that when you assign or pass around an array value you will make a copy of its contents. (To avoid the copy you could pass a pointer to the array, but then that's a pointer to an array, not an array.) One way to think about arrays is as a sort of struct but with indexed rather than named fields: a fixed-size composite value.
Evaluating an array yields a copy of the entire array, that is, a copy of all its elements. Spec: Variables:
A variable's value is retrieved by referring to the variable in an expression; it is the most recent value assigned to the variable.
In your first example the range expression is just a pointer to the array, so only this pointer is copied (but not the pointed array), so when you do A[1] = 0 (which is a shorthand for (*A)[1] = 0), you modify the original array, and the iteration variable gets elements from the pointed array.
In your second example the range expression is the array, so the array (with all its elements) is copied, and inside it B[1] = 0 still modifies the original array (B is a variable, not the result of the evaluation of the range expression), but v is an element of the copy (v is populated from the copied array in each iteration).
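The difference between ranging over an array and ranging over a slice can be sketched as follows. The function names rangeArray and rangeSlice are hypothetical; each records the values that v takes during the loop.

```go
package main

import "fmt"

// rangeArray ranges over an array value: the loop iterates over a copy.
func rangeArray() [2]int {
	arr := [2]int{1, 2}
	var seen [2]int
	for i, v := range arr {
		if i == 0 {
			arr[1] = 0 // modifies arr, not the copy being iterated
		}
		seen[i] = v
	}
	return seen
}

// rangeSlice ranges over a slice: only the slice header is copied,
// so the loop reads from the shared backing array.
func rangeSlice() [2]int {
	s := []int{1, 2}
	var seen [2]int
	for i, v := range s {
		if i == 0 {
			s[1] = 0 // visible to the next iteration
		}
		seen[i] = v
	}
	return seen
}

func main() {
	fmt.Println(rangeArray()) // [1 2]
	fmt.Println(rangeSlice()) // [1 0]
}
```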
Under the hood
So how is this "copy" realized? The compiler generates code for the for range that copies (assigns) the result of the range expression to a temporary variable (if needed, because it might not always be needed: "if at most one iteration variable is present and len(x) is constant, the range expression is not evaluated").
This code can be inspected in the cmd/compile/internal/gc/range.go file.
See related article: Go Range Loop Internals
The spec says
The range expression x is evaluated once before beginning the loop, with one exception: if at most one iteration variable is present and len(x) is constant, the range expression is not evaluated.
Function calls on the left are evaluated once per iteration. For each iteration, iteration values are produced as follows if the respective iteration variables are present
The point here is that, because your loop uses more than one iteration variable, the range expression is evaluated exactly once, before the loop begins. With the array B, that evaluation copies the array, so the value assigned to v for index 1 does not change.
In the pointer case (A), you see the modified value because the range expression evaluates to a pointer to the array. The pointer itself is not modified, but the array it points to is, and v is read from that array.

Send array from index 1 to function

I have this function, and I get values from args that I need to use:
Run: func(cmd *cobra.Command, args []string) {
....
myFunc(args)
}
I need to pass to myFunc all the args starting from index 1, not 0.
Of course I can loop and create another array from index 1, but
that duplicates almost all the values except index 0. Is there a way to avoid this in Go?
Yes, simply slice the args slice, and pass that:
myFunc(args[1:])
args is a slice, not an array. You can (re-)slice slices, which will be a contiguous subpart of the original slice. For example:
args[1:4]
The above would be another slice, holding only the following elements from args:
args[1], args[2], args[3]
The upper limit is exclusive. A missing upper index defaults to the length, a missing lower index defaults to 0. These are all detailed in Spec: Slice expressions.
Note that slicing a slice does not copy the elements: it will point to the same underlying array which actually holds the elements. A slice is just a small, struct-like header containing a pointer to the underlying array.
Note that if args is empty, the above would result in a run-time panic. To avoid that, first check its length:
if len(args) == 0 {
myFunc(nil) // or an empty slice: []string{}
} else {
myFunc(args[1:])
}
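The length check can be wrapped in a small helper. This is only a sketch; the name tail is hypothetical and not part of cobra or the standard library. It also demonstrates that the result shares its elements with args rather than copying them.

```go
package main

import "fmt"

// tail (hypothetical helper) returns args without its first element,
// or nil when args is empty. No elements are copied.
func tail(args []string) []string {
	if len(args) == 0 {
		return nil
	}
	return args[1:]
}

func main() {
	args := []string{"cmd", "a", "b"}
	rest := tail(args)
	fmt.Println(rest) // [a b]

	rest[0] = "changed"  // writes through to the shared backing array
	fmt.Println(args[1]) // changed

	fmt.Println(len(tail(nil))) // 0
}
```

Because the elements are shared, a callee that modifies the slice it receives also modifies args. If that is a concern, copy first.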

How do you convert a slice into an array?

I am trying to write an application that reads RPM files. The start of each block has a Magic char of [4]byte.
Here is my struct
type Lead struct {
Magic [4]byte
Major, Minor byte
Type uint16
Arch uint16
Name string
OS uint16
SigType uint16
}
I am trying to do the following:
lead := Lead{}
lead.Magic = buffer[0:4]
I am searching online and not sure how to go from a slice to an array (without copying). I can always make the Magic []byte (or even uint64), but I was more curious on how would I go from type []byte to [4]byte if needed to?
The built-in function copy will only copy a slice to a slice, NOT a slice to an array.
You must trick copy into thinking the array is a slice:
copy(varLead.Magic[:], someSlice[0:4])
Or use a for loop to do the copy:
for index, b := range someSlice[:4] { // limit to 4 elements so index stays within Magic
	varLead.Magic[index] = b
}
Or do as zupa has done using literals. I have added onto their working example.
Go Playground
You have allocated four bytes inside that struct and want to assign a value to that four byte section. There is no conceptual way to do that without copying.
Look at the copy built-in for how to do that.
Try this:
copy(lead.Magic[:], buf[0:4])
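A slightly fuller sketch of the copy approach, with a length check added. The reduced lead struct and the helper readMagic are hypothetical, and the byte values are just sample data.

```go
package main

import "fmt"

// lead is a reduced, hypothetical version of the Lead struct from the question.
type lead struct {
	Magic [4]byte
}

// readMagic copies the first four bytes of buf into a lead.
// It reports false if buf is too short.
func readMagic(buf []byte) (lead, bool) {
	var l lead
	if len(buf) < len(l.Magic) {
		return l, false
	}
	// copy stops at the shorter of the two lengths, here 4 bytes.
	copy(l.Magic[:], buf)
	return l, true
}

func main() {
	buf := []byte{0xed, 0xab, 0xee, 0xdb, 0x03, 0x00}
	l, ok := readMagic(buf)
	fmt.Println(l.Magic, ok) // [237 171 238 219] true

	buf[0] = 0
	fmt.Println(l.Magic[0]) // 237: the array holds its own copy
}
```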
Tapir Liui (author of Go101) tweets:
Go 1.20 (not 1.18 or 1.19, as first announced) will support conversions from slice to array: golang/go issue 46505.
So, since Go 1.20, the slice copy2 implementation could be written as:
*(*[N]T)(d) = [N]T(s)
or, even simpler if the conversion is allowed to present as L-values:
[N]T(d) = [N]T(s)
Without copying, you can convert a slice to an array pointer starting with Go 1.17 (Q3 2021).
This is called "un-slicing", giving you back a pointer to the underlying array of a slice, again, without any copy/allocation needed:
See golang/go issue 395: spec: convert slice x into array pointer, now implemented with CL 216424 and commit 1c26843
Converting a slice to an array pointer yields a pointer to the underlying array of the slice.
If the length of the slice is less than the length of the array,
a run-time panic occurs.
s := make([]byte, 2, 4)
s0 := (*[0]byte)(s) // s0 != nil
s2 := (*[2]byte)(s) // &s2[0] == &s[0]
s4 := (*[4]byte)(s) // panics: len([4]byte) > len(s)
var t []string
t0 := (*[0]string)(t) // t0 == nil
t1 := (*[1]string)(t) // panics: len([1]string) > len(t)
So in your case, provided Magic type is *[4]byte:
lead.Magic = (*[4]byte)(buffer)
Note: a defined (named) array type will work too:
type A [4]int
var s = (*A)([]int{1, 2, 3, 4})
Why convert to an array pointer? As explained in issue 395:
One motivation for doing this is that using an array pointer allows the compiler to range check constant indices at compile time.
A function like this:
func foo(a []int) int {
	return a[0] + a[1] + a[2] + a[3]
}
could be turned into:
func foo(a []int) int {
	b := (*[4]int)(a)
	return b[0] + b[1] + b[2] + b[3]
}
allowing the compiler to check all the bounds once only and give compile-time errors about out of range indices.
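A runnable variant of that idea might look like the sketch below. It requires Go 1.17 or later. The name sum4 and the explicit length check are additions for illustration; without the check, the conversion panics at run time when the slice has fewer than four elements.

```go
package main

import "fmt"

// sum4 (hypothetical helper) adds the first four elements of a.
// It reports false when a has fewer than four elements.
func sum4(a []int) (int, bool) {
	if len(a) < 4 {
		return 0, false
	}
	// The conversion checks the length once; b points at a's backing array.
	b := (*[4]int)(a)
	// An out-of-range constant index such as b[4] would not compile.
	return b[0] + b[1] + b[2] + b[3], true
}

func main() {
	fmt.Println(sum4([]int{1, 2, 3, 4, 5})) // 10 true
	fmt.Println(sum4([]int{1, 2}))          // 0 false
}
```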
Also:
One well-used example is making classes as small as possible for tree nodes or linked list nodes so you can cram as many of them into L1 cache lines as possible.
This is done by each node having a single pointer to a left sub-node, and the right sub-node being accessed by the pointer to the left sub-node + 1.
This saves the 8-bytes for the right-node pointer.
To do this you have to pre-allocate all the nodes in a vector or array so they're laid out in memory sequentially, but it's worth it when you need it for performance.
(This also has the added benefit of the prefetchers being able to help things along performance-wise - at least in the linked list case)
You can almost do this in Go with:
type node struct {
value int
children *[2]node
}
except that there's no way of getting a *[2]node from the underlying slice.
Go 1.20 (Q1 2023): this is addressed with CL 430415, 428938 (type), 430475 (reflect) and 429315 (spec).
Go 1.20
You can convert from a slice to an array directly with the usual conversion syntax T(x). The array's length can't be greater than the slice's length:
func main() {
slice := []int64{10, 20, 30, 40}
array := [4]int64(slice)
fmt.Printf("%T\n", array) // [4]int64
}
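Since the conversion panics when the slice is too short, a guarded version can be sketched like this. It requires Go 1.20 or later, and toArray4 is a hypothetical helper name. The sketch also shows that the resulting array is a copy, independent of the slice.

```go
package main

import "fmt"

// toArray4 (hypothetical helper) converts the first four elements of s
// to an array, reporting false instead of panicking when s is too short.
func toArray4(s []int64) ([4]int64, bool) {
	if len(s) < 4 {
		return [4]int64{}, false
	}
	return [4]int64(s), true // copies the first four elements
}

func main() {
	s := []int64{10, 20, 30, 40}
	arr, ok := toArray4(s)
	arr[0] = 99                // changes only the array copy
	fmt.Println(arr, ok, s[0]) // [99 20 30 40] true 10

	_, ok = toArray4(s[:2])
	fmt.Println(ok) // false
}
```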
Go 1.17
Starting from Go 1.17 you can directly convert a slice to an array pointer. With Go's type conversion syntax T(x) you can do this:
slice := make([]byte, 4)
arrptr := (*[4]byte)(slice)
Keep in mind that the length of the array must not be greater than the length of the slice, otherwise the conversion will panic.
bad := (*[5]byte)(slice) // panics: slice len < array len
This conversion has the advantage of not making any copy, because it simply yields a pointer to the underlying array.
Of course you can dereference the array pointer to obtain a non-pointer array variable, so the following also works:
slice := make([]byte, 4)
var arr [4]byte = *(*[4]byte)(slice)
However dereferencing and assigning will subtly make a copy, since the arr variable is now initialized to the value that results from the conversion expression. To be clear (using ints for simplicity):
v := []int{10,20}
a := (*[2]int)(v)
a[0] = 500
fmt.Println(v) // [500 20] (changed, both point to the same backing array)
w := []int{10,20}
b := *(*[2]int)(w)
b[0] = 500
fmt.Println(w) // [10 20] (unchanged, b holds a copy)
One might wonder why the conversion checks the slice length and not the capacity (I did). Consider the following program:
func main() {
	a := []int{1, 2, 3, 4, 5, 6}
	fmt.Println(cap(a)) // 6
	b := a[:3]
	fmt.Println(cap(b)) // still 6
	c := (*[3]int)(b)
	ptr := uintptr(unsafe.Pointer(&c[0]))
	ptr += 3 * unsafe.Sizeof(int(0))
	i := (*int)(unsafe.Pointer(ptr))
	fmt.Println(*i) // 4
}
The program shows that the conversion might happen after reslicing. The original backing array with six elements is still there, so one might wonder why a runtime panic occurs with (*[6]int)(b) where cap(b) == 6.
This has actually been brought up. It's worth remembering that, unlike slices, an array has a fixed size, therefore it needs no notion of capacity, only length:
a := [4]int{1,2,3,4}
fmt.Println(len(a) == cap(a)) // true
You might be able to do the whole thing with one read, instead of reading individually into each field. If the fields are fixed-length, then you can do:
lead := Lead{}
// make a reader to dispense bytes so you don't have to keep track of where you are in buffer
reader := bytes.NewReader(buffer)
// read into each field in Lead, so Magic becomes buffer[0:4],
// Major becomes buffer[4], Minor is buffer[5], and so on...
binary.Read(reader, binary.LittleEndian, &lead)
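Note that binary.Read only works for fixed-size values, so a struct with a string field (like Name in the question) cannot be filled this way. The sketch below uses a hypothetical, reduced header struct with fixed-size fields only. The byte order is an assumption carried over from the snippet above and depends on the actual file format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// header is a hypothetical struct made of fixed-size fields only.
type header struct {
	Magic        [4]byte
	Major, Minor byte
	Type         uint16
}

// parseHeader fills a header from the start of buf in a single read.
// It returns an error when buf is shorter than the struct.
func parseHeader(buf []byte) (header, error) {
	var h header
	err := binary.Read(bytes.NewReader(buf), binary.LittleEndian, &h)
	return h, err
}

func main() {
	buf := []byte{0xed, 0xab, 0xee, 0xdb, 3, 0, 1, 0}
	h, err := parseHeader(buf)
	fmt.Println(h.Magic, h.Major, h.Minor, h.Type, err) // [237 171 238 219] 3 0 1 <nil>
}
```

Always check the returned error; the original one-liner discards it.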
Don't. A slice is sufficient for every purpose. An array in Go should be regarded as the underlying structure of a slice. In every case, use only slices. You don't have to deal with arrays yourself; you can do everything with slice syntax. Arrays are only for the computer. In most cases a slice is better and clearer in code, and even in the remaining cases a slice is still sufficient to express your idea.

Are args and args[..] the same?

I'm reading optparse.coffee, and confused with the following line:
args = args[..]
What does that line do?
From the fine manual:
Array Slicing and Splicing with Ranges
Ranges can also be used to extract slices of arrays. With two dots (3..6), the range is inclusive (3, 4, 5, 6); with three dots (3...6), the range excludes the end (3, 4, 5). Slices indices have useful defaults. An omitted first index defaults to zero and an omitted second index defaults to the size of the array.
So saying array[..] is shorthand for:
len = array.length
array[0 ... len]
and that just makes a shallow copy of array. That means that args = args[..] makes a local shallow copy of args, so the function can manipulate and change args without altering the original array that was passed in. It also means you can store references to the array without the function's caller being able to accidentally alter your array through the original args reference.
Consider this simplified example:
f = (args) -> args = args[..]
that becomes this JavaScript:
var f;
f = function(args) {
return args = args.slice(0);
};
And Array#slice:
Returns a shallow copy of a portion of an array.
[...]
If end is omitted, slice extracts to the end of the sequence.
So saying array.slice(n) returns a shallow copy of array starting at index n and going to the end of array and since arrays are indexed starting at zero, array.slice(0) makes a shallow copy of the entire array.

Resources