Copying a struct's slice parameter in Go

Let's say I have a struct with a slice parameter like the following, and I create one and fill it with some values:
type s struct {
sl []float32
}
func NewS() *s {
return &s{
sl: make([]float32, 3),
}
}
func main() {
a := NewS()
a.sl[0] = 1
a.sl[1] = 2
a.sl[2] = 3
b := NewS()
// Code here
}
I want b to have an sl with the same values as a.sl, and in particular, I'd like to understand how to do it in different ways:
Where the entire b points to the entire a (so that if there were other parameters in the struct, those would all match)
Where b.sl would point to a.sl
As a deep copy, where there are no pointers (and done not by iterating through the slice, but by a way that's repeatable regardless of type, ideally)
What follows is what I've tried and found so far.
b = a
fmt.Printf("a: %p\n", &a)
fmt.Printf("b: %p\n", &b)
fmt.Printf("a.sl: %p\n", &a.sl)
fmt.Printf("b.sl: %p\n", &b.sl)
Which outputs different memory addresses for the structs, but the same memory addresses for the slices. Why is that? Why wouldn't the struct addresses be the same, too, if a is just pointing to b?
a: 0xc00000e030
b: 0xc00000e038
a.sl: 0xc000054040
b.sl: 0xc000054040
Then I tried:
b.sl = a.sl
// ...
Which outputs all distinct memory addresses. It makes sense of course that the address of a and b are different, but why isn't b.sl pointing to the address of a.sl?
a: 0xc00000e028
b: 0xc00000e030
a.sl: 0xc00000c030
b.sl: 0xc00000c048
And finally I tried:
copy(b.sl, a.sl)
// ...
Which outputs similarly to the above. This result is the only one I expected, since I expected this to make a deep copy.
a: 0xc0000b8018
b: 0xc0000b8020
a.sl: 0xc0000ae018
b.sl: 0xc0000ae030
Could you help me understand what's happening in these 3 cases, and what other ways there are to achieve my desired results and perform the various types of copies?

Here's some background information: A slice contains a pointer to the slice's backing array, the length of the slice and the capacity of the backing array. The (pointer, length, capacity) is referred to as a slice header.
The address of a slice is the address of the slice header, not the pointer to the slice's backing array or the first element of the slice's backing array.
b = a
The variables a and b point to the same value after the statement is executed.
The variables a and b are distinct and therefore have different addresses.
The expression &a.sl is the address of the sl field in the pointed at value. The expression &b.sl equals &a.sl because a and b point at the same value.
This is your #1: a and b point at the same value.
b.sl = a.sl
The statement copies the slice header from a.sl to b.sl. The slices have different addresses because slice headers remain distinct.
This is close to your #2: a.sl and b.sl point at the same backing array. A modification to an element in a.sl is visible through b.sl, but changes to the slice header a.sl (for example, from append or reslicing) are not reflected in b.sl.
copy(b.sl, a.sl)
The statement copies the elements of a.sl into b.sl's backing array (up to the shorter of the two lengths).
This is your #3, a deep copy. Changes through a are not reflected in b.
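The three cases above can be checked by mutating a afterwards and looking at b. This is a minimal sketch, assuming the s type from the question; the helper names newS, aliasWhole, shareBacking and deepCopy are made up for illustration.

```go
package main

import "fmt"

type s struct {
	sl []float32
}

// newS returns a struct whose slice holds 1, 2, 3 (len 3, cap 3).
func newS() *s {
	return &s{sl: []float32{1, 2, 3}}
}

// aliasWhole is case #1: b = a, so both pointers refer to one struct.
func aliasWhole() (a, b *s) {
	a = newS()
	b = a
	return a, b
}

// shareBacking is case #2: b.sl = a.sl copies only the slice header,
// so both headers point at the same backing array.
func shareBacking() (a, b *s) {
	a, b = newS(), newS()
	b.sl = a.sl
	return a, b
}

// deepCopy is case #3: b gets its own backing array with equal elements.
func deepCopy() (a, b *s) {
	a = newS()
	b = &s{sl: make([]float32, len(a.sl))}
	copy(b.sl, a.sl)
	return a, b
}

func main() {
	a, b := aliasWhole()
	a.sl = append(a.sl, 4) // a header change is visible through b: same struct
	fmt.Println(len(b.sl)) // 4

	a, b = shareBacking()
	a.sl[0] = 9                     // element change is visible through b
	a.sl = append(a.sl, 4)          // header change is not
	fmt.Println(b.sl[0], len(b.sl)) // 9 3

	a, b = deepCopy()
	a.sl[0] = 9
	fmt.Println(b.sl[0]) // 1
}
```

Printing &a.sl, as in the question, shows the address of the slice header. To compare backing arrays, print the address of the first element instead, for example &a.sl[0].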

A slice is a triple containing a pointer to the underlying array, length, and capacity. When you assign a slice to another slice, you assign that triple. The underlying array remains the same.
A slice is a view of an array. If you need a deep copy of a slice, you have to create another slice and copy it yourself.
If you assign an array to another, that copies each element of the source to the target.
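To illustrate the difference between array and slice assignment, and one way to copy a slice "regardless of type", here is a small sketch. The helper name cloneSlice is hypothetical, and it assumes Go 1.18 or later for type parameters.

```go
package main

import "fmt"

// cloneSlice returns a new slice with its own backing array holding the
// same elements as src. Hypothetical helper; needs Go 1.18+ generics.
func cloneSlice[T any](src []T) []T {
	dst := make([]T, len(src))
	copy(dst, src)
	return dst
}

func main() {
	arr1 := [3]int{1, 2, 3}
	arr2 := arr1 // array assignment copies every element
	arr2[0] = 99
	fmt.Println(arr1[0], arr2[0]) // 1 99

	s1 := []int{1, 2, 3}
	s2 := s1             // slice assignment copies only the header
	s3 := cloneSlice(s1) // separate backing array
	s1[0] = 99
	fmt.Println(s2[0], s3[0]) // 99 1
}
```

Note that copy duplicates the elements themselves. If the element type is a pointer, or contains pointers or slices, the copied elements still refer to the same pointed-to data, so this is only one level deep.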

Related

Does go garbage collect parts of slices?

If I implement a queue like this...
package main
import(
"fmt"
)
func PopFront(q *[]string) string {
r := (*q)[0]
*q = (*q)[1:len(*q)]
return r
}
func PushBack(q *[]string, a string) {
*q = append(*q, a)
}
func main() {
q := make([]string, 0)
PushBack(&q, "A")
fmt.Println(q)
PushBack(&q, "B")
fmt.Println(q)
PushBack(&q, "C")
fmt.Println(q)
PopFront(&q)
fmt.Println(q)
PopFront(&q)
fmt.Println(q)
}
... I end up with an array ["A", "B", "C"] that has no slices pointing to the first two elements. Since the "start" pointer of a slice can never be decremented (AFAIK), those elements can never be accessed.
Is Go's garbage collector smart enough to free them?

Slices are just descriptors (small struct-like data structures) which if not referenced will be garbage collected properly.
The underlying array for a slice (to which the descriptor points), on the other hand, is shared between all slices that are created by reslicing it. Quoting from the Go Language Specification, section Slice types:
A slice, once initialized, is always associated with an underlying array that holds its elements. A slice therefore shares storage with its array and with other slices of the same array; by contrast, distinct arrays always represent distinct storage.
Therefore, as long as at least one slice referencing the array exists, or a variable holding the array (if the slice was created by slicing an array), the underlying array will not be garbage collected.
Official Statement about this:
The blog post Go Slices: usage and internals by Andrew Gerrand clearly states this behaviour:
As mentioned earlier, re-slicing a slice doesn't make a copy of the underlying array. The full array will be kept in memory until it is no longer referenced. Occasionally this can cause the program to hold all the data in memory when only a small piece of it is needed.
...
Since the slice references the original array, as long as the slice is kept around the garbage collector can't release the array.
Back to your example
While the underlying array will not be freed, note that if you add new elements to the queue, the built-in append function occasionally might allocate a new array and copy the current elements to the new – but copying will only copy the elements of the slice and not the whole underlying array! When such a reallocation and copying occurs, the "old" array may be garbage collected if no other reference exists to it.
Another important point: when an element is popped from the front, the slice is resliced and no longer includes the popped element, but the underlying array still holds that value, so the value itself also remains in memory (not just the array). It is recommended that whenever an element is popped or removed from your queue (slice/array), you zero its slot in the slice first so the value does not remain in memory needlessly. This becomes even more critical if your slice contains pointers to big data structures.
func PopFront(q *[]string) string {
r := (*q)[0]
(*q)[0] = "" // Always zero the removed element!
*q = (*q)[1:len(*q)]
return r
}
This is mentioned on the Slice Tricks wiki page:
Delete without preserving order
a[i] = a[len(a)-1]
a = a[:len(a)-1]
NOTE If the type of the element is a pointer or a struct with pointer fields, which need to be garbage collected, the above implementations of Cut and Delete have a potential memory leak problem: some elements with values are still referenced by slice a and thus can not be collected.
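The effect of zeroing the popped slot is easiest to see with pointer elements. The following is a minimal sketch; the item type and the popFront helper are illustrative assumptions, not part of the question's code.

```go
package main

import "fmt"

// item stands in for some large structure the queue points at.
type item struct {
	id int
}

// popFront removes and returns the first element. It clears the slot
// first, so the backing array no longer references the popped item.
func popFront(q *[]*item) *item {
	r := (*q)[0]
	(*q)[0] = nil
	*q = (*q)[1:]
	return r
}

func main() {
	backing := []*item{{id: 1}, {id: 2}, {id: 3}}
	q := backing // q and backing share one array

	first := popFront(&q)
	// The slot in the shared array is now nil, so once the caller drops
	// first, nothing in the array keeps that item alive.
	fmt.Println(first.id, len(q), backing[0] == nil) // 1 2 true
}
```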

No. At the time of this writing, the Go garbage collector (GC) is not smart enough to collect the beginning of an underlying array in a slice, even if it is inaccessible.
As mentioned by others here, a slice (under the hood) is a struct of exactly three things: a pointer to its underlying array, the length of the slice (values accessible without reslicing), and the capacity of the slice (values accessible by reslicing). On the Go blog, slice internals are discussed at length. Here is another article I like about Go memory layouts.
When you reslice and cut off the tail end of a slice, it is obvious (upon understanding the internals) that the underlying array, the pointer to the underlying array, and the slice's capacity are all left unchanged; only the slice length field is updated. When you re-slice and cut off the beginning of a slice, you are really changing the pointer to the underlying array along with the length and capacity. In this case, it is generally unclear (based on my readings) why the GC does not clean up this inaccessible part of the underlying array because you cannot re-slice the array to access it again. My assumption is that the underlying array is treated as one block of memory from the GC's point of view. If you can point to any part of the underlying array, the entire thing is ineligible for deallocation.
I know what you're thinking... like the true computer scientist you are, you may want some proof. I'll indulge you:
https://goplay.space/#tDBQs1DfE2B
As mentioned by others and as shown in the sample code, using append can cause a reallocation and copy of the underlying array, which allows the old underlying array to be garbage collected.

Simple question, simple answer: No. (But if you keep pushing, the slice will at some point outgrow its underlying array; append then allocates a new array, and the old one, including the unused elements, becomes eligible to be freed.)
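A quick way to see that reallocation, assuming a queue whose backing array is already full. The popThenPush helper is hypothetical and only combines the question's pop and push steps:

```go
package main

import "fmt"

// popThenPush drops the front element and appends v at the back.
func popThenPush(q []string, v string) []string {
	q = q[1:]           // capacity shrinks by one along with the length
	return append(q, v) // if len == cap here, append must allocate a new array
}

func main() {
	old := []string{"A", "B", "C"} // len 3, cap 3
	q := popThenPush(old, "D")

	// q now lives in a fresh array that contains only B, C, D.
	// Once old goes out of use, the original array can be collected.
	fmt.Println(q, &q[0] == &old[1]) // [B C D] false
}
```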

Contrary to what I'm reading, Go certainly seems to garbage collect at least the unused leading sections of slices. The following test case provides evidence.
In the first case the slice is set to slice[1:] in each iteration. In the comparison case, it skips that step.
The second case dwarfs the memory consumed in the first case. But why?
func TestArrayShiftMem(t *testing.T) {
slice := [][1024]byte{}
mem := runtime.MemStats{}
mem2 := runtime.MemStats{}
runtime.GC()
runtime.ReadMemStats(&mem)
for i := 0; i < 1024*1024*1024*1024; i++ {
slice = append(slice, [1024]byte{})
slice = slice[1:]
runtime.GC()
if i%(1024) == 0 {
runtime.ReadMemStats(&mem2)
fmt.Println(mem2.HeapInuse - mem.HeapInuse)
fmt.Println(mem2.StackInuse - mem.StackInuse)
fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
}
}
}
func TestArrayShiftMem3(t *testing.T) {
slice := [][1024]byte{}
mem := runtime.MemStats{}
mem2 := runtime.MemStats{}
runtime.GC()
runtime.ReadMemStats(&mem)
for i := 0; i < 1024*1024*1024*1024; i++ {
slice = append(slice, [1024]byte{})
// slice = slice[1:]
runtime.GC()
if i%(1024) == 0 {
runtime.ReadMemStats(&mem2)
fmt.Println(mem2.HeapInuse - mem.HeapInuse)
fmt.Println(mem2.StackInuse - mem.StackInuse)
fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
}
}
}
Output Test1:
go test -run=.Mem -v .
...
0
393216
21472
^CFAIL github.com/ds0nt/cs-mind-grind/arrays 1.931s
Output Test3:
go test -run=.Mem3 -v .
...
19193856
393216
19213888
^CFAIL github.com/ds0nt/cs-mind-grind/arrays 2.175s
If you disable garbage collection on the first test, indeed memory skyrockets. The resulting code looks like this:
func TestArrayShiftMem2(t *testing.T) {
debug.SetGCPercent(-1)
slice := [][1024]byte{}
mem := runtime.MemStats{}
mem2 := runtime.MemStats{}
runtime.GC()
runtime.ReadMemStats(&mem)
// 1kb per
for i := 0; i < 1024*1024*1024*1024; i++ {
slice = append(slice, [1024]byte{})
slice = slice[1:]
// runtime.GC()
if i%(1024) == 0 {
fmt.Println("len, cap:", len(slice), cap(slice))
runtime.ReadMemStats(&mem2)
fmt.Println(mem2.HeapInuse - mem.HeapInuse)
fmt.Println(mem2.StackInuse - mem.StackInuse)
fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
}
}
}

Proper way of passing array parameters to D functions

1st Question:
Are D array function parameters always passed by reference, or by value?
Also, does the language implement Copy on Write for arrays?
E.g.:
void foo(int[] arr)
{
// is arr a local copy or a ref to an external array?
arr[0] = 42; // How about now?
}
2nd Question:
Suppose I have a large array that will be passed to function foo as a read-only parameter, and copying it should be avoided as much as possible because it is assumed to be a very large object. Which of the following (or none of them) would be the best declaration for function foo:
void foo(const int[] bigArray)
void foo(in int[] bigArray)
void foo(const ref int[] bigArray)

Technically, a dynamic array like int[] is just a pointer and a length. Only the pointer and length get copied onto the stack, not the array contents. An arr[0] = 42; does modify the original array.
On the other side, a static array like int[30] is a plain old data type consisting of 30 consecutive ints in memory. So, a function like void foo(int[30] arr) would copy 120 bytes onto the stack for a start. In such a case, arr[0] = 42; modifies the local copy of the array.
According to the above, each of the ways you listed avoids copying the array contents. So, whether you need the parameter to be const, in, const ref or otherwise depends on what you are trying to achieve besides avoiding the array copy. For example, if you pass a ref int[] arr parameter, you can modify not only its contents but also the pointer and length (for example, create a wholly new array and assign it to arr so that the change is visible from outside the function).
For further information, please refer to the corresponding articles on the DLang site covering arrays and array slices.

How to allocate a non-constant sized array in Go

How do you allocate an array in Go with a run-time size?
The following code is illegal:
n := 1
var a [n]int
You get the message prog.go:12: invalid array bound n (or similar), whereas this works fine:
const n = 1
var a [n]int
The trouble is, I might not know the size of the array I want until run-time.
(By the way, I first looked in the question How to implement resizable arrays in Go for an answer, but that is a different question.)

The answer is you don't allocate an array directly, you get Go to allocate one for you when creating a slice.
The built-in function make([]T, length, capacity) creates a slice and the array behind it, and there is no (silly) compile-time-constant-restriction on the values of length and capacity. As it says in the Go language specification:
A slice created with make always allocates a new, hidden array to which the returned slice value refers.
So we can write:
n := 12
s := make([]int, n, 2*n)
and have an array of size 2*n allocated, with s a slice initialised to be the first half of it.
I'm not sure why Go doesn't allocate the array [n]int directly, given that you can do it indirectly, but the answer is clear: "In Go, use slices rather than arrays (most of the time)."
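As a small illustration of the above, here is a sketch where the size is only known at run time. The helper name newInts is an assumption for the example:

```go
package main

import "fmt"

// newInts allocates a slice of n zeroed ints backed by an array of 2*n.
// n does not have to be a compile-time constant.
func newInts(n int) []int {
	return make([]int, n, 2*n)
}

func main() {
	n := 12 // could just as well come from user input or a file
	s := newInts(n)
	fmt.Println(len(s), cap(s)) // 12 24

	s = s[:cap(s)]      // extend the slice over the whole hidden array
	fmt.Println(len(s)) // 24
}
```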

Access two-dimensional array via pointer to first element

I have this structure:
typedef struct
{char *cells;}
Map;
and cells is supposed to be a pointer to an array of rows (each row holds an integer at every position).
I don't know how to access, for example, the number at the 3rd position in the 2nd row.
I have started with some array[3][3], but I don't know how to connect it with this struct.
I tried
Map nextmap;
nextmap.cells[0] = array[0][0];
But I got only first number, which is clear. How can I get to other positions?
Thanks in advance.
EDIT: renamed the structure.

When you did Map nextmap;, you created an uninitialized Map struct. When you did nextmap.cells[0] = array[0][0]; you dereferenced (i.e. followed) the uninitialized pointer, and stored a value at the random memory it points at.
If you want to initialize the cells pointer, you can do something as simple as nextmap.cells = array[0]; That will cause nextmap.cells to point at the first element of array. Note that it's not copying the contents; just pointing at them. That means that if you change the values through cells, you'll be modifying the values in array.
(Also, using 'new' as a variable name is perfectly acceptable in C, but you're likely to confuse any C++ programmers reading your code, since 'new' is an operator in that language.)
new now changed to nextmap in question
Edited to correct the type mismatch in nextmap.cells assignment.

Given an array char array[][NumberOfColumns] (the first dimension is irrelevant and is omitted here; it would be needed when the array is defined), you can set a pointer to the first element of the array with:
nextmap.cells = &array[0][0];
Then you can access an element in the array, array[i][j], by calculating its position within the array, with either of these two expressions:
*(nextmap.cells + i*NumberOfColumns + j)
nextmap.cells[i*NumberOfColumns + j]
Two-dimensional arrays generally ought to be addressed as two-dimensional arrays. Calculating the position manually is poor practice if done without good reason. If this school assignment did not have a good reason for this, then it is a bad assignment.

First of all new is not a good name for a variable.
new now changed to nextmap in question
Second of all, in your case cells could be a double pointer (this only works if each row is allocated separately, not with array[3][3] directly), like this
char ** cells;
Or a pointer to a 2D array, like
char (*cells)[N][N];
where N is a constant you want to use.

What does it mean for .slice() to be a "shallow clone"?

ActionScript's Array and Vector classes both have a slice() method. If you don't pass any parameters, the new Array or Vector is a duplicate (shallow clone) of the original Array or Vector.
What does it mean to be a "shallow clone"? Specifically, what is the difference between
var newArray:Array = oldArray.slice();
var newVector:Vector.<Foo> = oldVector.slice();
and
var newArray:Array = oldArray;
var newVector:Vector.<Foo> = oldVector;
? Also, what if the Vector's base type isn't Foo, but something simple and immutable like int?
Update:
What is the result of the following?
var one:Vector.<String> = new Vector.<String>()
one.push("something");
one.push("something else");
var two:Vector.<String> = one.slice();
one.push("and another thing");
two.push("and the last thing");
trace(one); // something, something else, and another thing
trace(two); // something, something else, and the last thing
Thanks! ♥

In your context, .slice() simply makes a copy of your array or vector, so newArray refers to a different object from oldArray, although the two objects have identical contents. The same goes for newVector and oldVector.
The second snippet:
var newArray:Array = oldArray;
var newVector:Vector.<Foo> = oldVector;
actually makes newArray a reference to oldArray. That means both variables refer to the same array. Same for newVector and oldVector — both end up referring to the same vector. Think of it as using a rubber stamp to stamp the same seal twice on different pieces of paper: it's the same seal, just represented on two pieces of paper.
On a side note, a shallow copy differs from a deep copy in that a shallow copy duplicates only the container object, so its elements still refer to the same objects as the original's elements, while a deep copy also duplicates every object the elements refer to.
Also, what if the Vector's base type isn't Foo, but something simple and immutable like int?
It's the same, because your variables refer to the Vector objects and not their ints.
What is the result of the following?
Your output is correct:
something, something else, and another thing
something, something else, and the last thing
two = one.slice(), without any arguments, makes a new copy of one with all its current contents and assigns it to two. When you push each third item to one and two, you're appending to distinct Vector objects.