I am trying to write an application that reads RPM files. The start of each block has a Magic char of [4]byte.
Here is my struct
type Lead struct {
Magic [4]byte
Major, Minor byte
Type uint16
Arch uint16
Name string
OS uint16
SigType uint16
}
I am trying to do the following:
lead := Lead{}
lead.Magic = buffer[0:4]
I have searched online and am not sure how to go from a slice to an array (without copying). I can always make Magic a []byte (or even a uint64), but I was more curious how I would go from type []byte to [4]byte if I needed to.
The built-in function copy only copies from a slice to a slice, NOT from a slice to an array.
You must slice the array so that copy sees it as a slice:
copy(varLead.Magic[:], someSlice[0:4])
Or use a for loop to do the copy:
// range over exactly four elements so the index never exceeds the array bounds
for index, b := range someSlice[0:4] {
	varLead.Magic[index] = b
}
Or do as zupa has done using literals. I have added onto their working example.
You have allocated four bytes inside that struct and want to assign a value to that four byte section. There is no conceptual way to do that without copying.
Look at the copy built-in for how to do that.
Try this:
copy(lead.Magic[:], buf[0:4])
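As a minimal sketch of the copy approach above, the helper below also checks that the buffer is long enough. The name setMagic, the trimmed-down Lead type and the sample bytes are illustrative assumptions, not part of the original code.

```go
package main

import "fmt"

// Lead keeps only the Magic field from the question; the rest is omitted.
type Lead struct {
	Magic [4]byte
}

// setMagic (hypothetical helper) copies the first four bytes of buf into
// lead.Magic. It reports false if buf is too short; in that case Magic is
// only partially filled.
func setMagic(lead *Lead, buf []byte) bool {
	// copy returns the number of elements copied: min(len(dst), len(src)).
	return copy(lead.Magic[:], buf) == len(lead.Magic)
}

func main() {
	var lead Lead
	ok := setMagic(&lead, []byte{0xED, 0xAB, 0xEE, 0xDB, 0x03, 0x00})
	fmt.Println(ok, lead.Magic)
}
```

Relying on the return value of copy avoids a separate length check and never panics on a short buffer.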
Tapir Liui (author of Go101) tweets:
Go 1.20 (first announced for 1.18, then 1.19) will support conversions from slice to array: golang/go issue 46505.
So, since Go 1.20, the slice copy2 implementation could be written as:
*(*[N]T)(d) = [N]T(s)
or, even simpler if the conversion is allowed to present as L-values:
[N]T(d) = [N]T(s)
Without copying, Go 1.17 (Q3 2021) lets you convert a slice to an array pointer.
This is called "un-slicing": it gives you back a pointer to the underlying array of a slice, again without any copy or allocation:
See golang/go issue 395: spec: convert slice x into array pointer, now implemented with CL 216424 and commit 1c26843.
Converting a slice to an array pointer yields a pointer to the underlying array of the slice.
If the length of the slice is less than the length of the array,
a run-time panic occurs.
s := make([]byte, 2, 4)
s0 := (*[0]byte)(s) // s0 != nil
s2 := (*[2]byte)(s) // &s2[0] == &s[0]
s4 := (*[4]byte)(s) // panics: len([4]byte) > len(s)
var t []string
t0 := (*[0]string)(t) // t0 == nil
t1 := (*[1]string)(t) // panics: len([1]string) > len(t)
So in your case, provided Magic type is *[4]byte:
lead.Magic = (*[4]byte)(buffer)
Note: a defined array type works too:
type A [4]int
var s = (*A)([]int{1, 2, 3, 4})
Why convert to an array pointer? As explained in issue 395:
One motivation for doing this is that using an array pointer allows the compiler to range check constant indices at compile time.
A function like this:
func foo(a []int) int {
	return a[0] + a[1] + a[2] + a[3]
}
could be turned into:
func foo(a []int) int {
	b := (*[4]int)(a)
	return b[0] + b[1] + b[2] + b[3]
}
allowing the compiler to check all the bounds once only and give compile-time errors about out of range indices.
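A runnable variant of the idea above, as a sketch: the name sum4 and the explicit length guard are additions for illustration. The guard turns the run-time panic of the conversion into an ordinary boolean result.

```go
package main

import "fmt"

// sum4 (hypothetical helper) adds the first four elements of a,
// or reports false if a has fewer than four elements.
func sum4(a []int) (int, bool) {
	if len(a) < 4 {
		return 0, false // the conversion below would panic otherwise
	}
	b := (*[4]int)(a) // Go 1.17+: no copy, b points at a's backing array
	// Constant indices into b are checked against the array length 4.
	return b[0] + b[1] + b[2] + b[3], true
}

func main() {
	fmt.Println(sum4([]int{1, 2, 3, 4, 5}))
	fmt.Println(sum4([]int{1, 2}))
}
```

With the array pointer, writing b[4] would be rejected by the compiler, whereas a[4] on the slice would only fail at run time.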
Also:
One well-used example is making classes as small as possible for tree nodes or linked list nodes so you can cram as many of them into L1 cache lines as possible.
This is done by each node having a single pointer to a left sub-node, and the right sub-node being accessed by the pointer to the left sub-node + 1.
This saves the 8-bytes for the right-node pointer.
To do this you have to pre-allocate all the nodes in a vector or array so they're laid out in memory sequentially, but it's worth it when you need it for performance.
(This also has the added benefit of the prefetchers being able to help things along performance-wise - at least in the linked list case)
You can almost do this in Go with:
type node struct {
value int
children *[2]node
}
except that there's no way of getting a *[2]node from the underlying slice.
Go 1.20 (Q1 2023): this is addressed with CL 430415, 428938 (type), 430475 (reflect) and 429315 (spec).
Go 1.20
You can convert from a slice to an array directly with the usual conversion syntax T(x). The array's length can't be greater than the slice's length:
func main() {
slice := []int64{10, 20, 30, 40}
array := [4]int64(slice)
fmt.Printf("%T\n", array) // [4]int64
}
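Applied to the Magic field from the question, the Go 1.20 conversion might look like the sketch below. The helper name magicFrom and the sample bytes are assumptions made for illustration; the block needs Go 1.20 or later to compile.

```go
package main

import "fmt"

// magicFrom (hypothetical helper) returns the first four bytes of buf as
// an array. The array is a copy, so later writes to buf do not affect it.
func magicFrom(buf []byte) ([4]byte, bool) {
	if len(buf) < 4 {
		return [4]byte{}, false // avoid the run-time panic of the conversion
	}
	return [4]byte(buf[:4]), true // slice-to-array conversion, Go 1.20+
}

func main() {
	buffer := []byte{0xED, 0xAB, 0xEE, 0xDB, 0x03}
	magic, ok := magicFrom(buffer)
	buffer[0] = 0 // does not change magic
	fmt.Println(ok, magic)
}
```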
Go 1.17
Starting from Go 1.17 you can directly convert a slice to an array pointer. With Go's type conversion syntax T(x) you can do this:
slice := make([]byte, 4)
arrptr := (*[4]byte)(slice)
Keep in mind that the length of the array must not be greater than the length of the slice, otherwise the conversion will panic.
bad := (*[5]byte)(slice) // panics: slice len < array len
This conversion has the advantage of not making any copy, because it simply yields a pointer to the underlying array.
Of course you can dereference the array pointer to obtain a non-pointer array variable, so the following also works:
slice := make([]byte, 4)
var arr [4]byte = *(*[4]byte)(slice)
However dereferencing and assigning will subtly make a copy, since the arr variable is now initialized to the value that results from the conversion expression. To be clear (using ints for simplicity):
v := []int{10,20}
a := (*[2]int)(v)
a[0] = 500
fmt.Println(v) // [500 20] (changed, both point to the same backing array)
w := []int{10,20}
b := *(*[2]int)(w)
b[0] = 500
fmt.Println(w) // [10 20] (unchanged, b holds a copy)
One might wonder why the conversion checks the slice length and not the capacity (I did). Consider the following program:
func main() {
a := []int{1,2,3,4,5,6}
fmt.Println(cap(a)) // 6
b := a[:3]
fmt.Println(cap(b)) // still 6
c := (*[3]int)(b)
ptr := uintptr(unsafe.Pointer(&c[0]))
ptr += 3 * unsafe.Sizeof(int(0))
i := (*int)(unsafe.Pointer(ptr))
fmt.Println(*i) // 4
}
The program shows that the conversion might happen after reslicing. The original backing array with six elements is still there, so one might wonder why a runtime panic occurs with (*[6]int)(b) where cap(b) == 6.
This has actually been brought up. It's worth remembering that, unlike a slice, an array has a fixed size, so it needs no notion of capacity, only length:
a := [4]int{1,2,3,4}
fmt.Println(len(a) == cap(a)) // true
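If the whole backing array is what you want, one option is to re-extend the slice up to its capacity before converting. This is a sketch under that assumption; the helper name whole is illustrative.

```go
package main

import "fmt"

// whole (hypothetical helper) returns a pointer to the first six elements
// reachable from b, extending b within its capacity first.
// It reports false if cap(b) < 6.
func whole(b []int) (*[6]int, bool) {
	if cap(b) < 6 {
		return nil, false
	}
	// b[:6] is legal because 6 <= cap(b); its length is now 6,
	// so the conversion no longer panics.
	return (*[6]int)(b[:6]), true
}

func main() {
	a := []int{1, 2, 3, 4, 5, 6}
	b := a[:3]
	p, ok := whole(b)
	fmt.Println(ok, *p) // true [1 2 3 4 5 6]
}
```

This keeps the length rule of the conversion intact: the caller has to opt in explicitly to the elements beyond len(b).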
You might be able to do the whole thing with one read, instead of reading individually into each field. If the fields are fixed-length, then you can do:
lead := Lead{}
// make a reader to dispense bytes so you don't have to keep track of where you are in buffer
reader := bytes.NewReader(buffer)
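// read into each field in Lead in order: Magic gets buffer[0:4],
// Major gets buffer[4], Minor gets buffer[5], and so on.
// binary.Read needs fixed-size fields, so Name cannot be a string here.
if err := binary.Read(reader, binary.LittleEndian, &lead); err != nil {
	// handle the error (for example, a buffer that is too short)
}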
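Here is a self-contained sketch of the single-read approach. The fixedLead type is hypothetical: the 66-byte Name and the 16 reserved bytes follow the commonly documented RPM lead layout, and big-endian byte order is likewise an assumption to verify against your format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// fixedLead is a hypothetical fixed-size variant of the question's Lead.
// Field sizes for Name and Reserved are assumptions, not from the question.
type fixedLead struct {
	Magic        [4]byte
	Major, Minor byte
	Type         uint16
	Arch         uint16
	Name         [66]byte
	OS           uint16
	SigType      uint16
	Reserved     [16]byte
}

// parseLead decodes one fixedLead from the start of buf.
func parseLead(buf []byte) (fixedLead, error) {
	var l fixedLead
	// Byte order is an assumption; swap in binary.LittleEndian if needed.
	err := binary.Read(bytes.NewReader(buf), binary.BigEndian, &l)
	return l, err
}

func main() {
	buf := make([]byte, binary.Size(fixedLead{})) // 96 bytes for this layout
	copy(buf, []byte{0xED, 0xAB, 0xEE, 0xDB, 3, 0, 0, 1})
	l, err := parseLead(buf)
	fmt.Println(l.Magic, l.Major, l.Minor, l.Type, err)
}
```

Because every field has a fixed size, binary.Read fills the array field Magic directly, so no slice-to-array conversion is needed at all.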
Don't. A slice is sufficient for every purpose. An array in Go should be regarded as the underlying structure of a slice. In practically every case, use a slice: you don't need to declare arrays yourself, and you can do everything with slice syntax. Arrays are for the machine. In most cases a slice is better and clearer in code, and even in the remaining cases a slice is still sufficient to express your idea.
A basic question that I'm struggling to find an answer for as there are a lot of answers about how to join two slices using the append function and the spread operator which erroneously use the word 'array'.
I am new to Go and have made the assumption that using sized arrays is good practice where the size is known. However I am struggling to work with arrays as I can't figure out how to do simple operations such as concatenation. Here is some code.
var seven [7]int
five := [5]int{1,2,3,4,5}
two := [2]int{6,7}
//this doesn't work as both the inputs and assignment are the wrong type
seven = append(five,two)
//this doesn't work as the assignment is still the wrong type
seven = append(five[:],two[:])
//this works but I'm not using arrays anymore so may as well use slices everywhere and forget sizing
seven2 := append(five[:],two[:])
As far as I can see I can either just give up on arrays and use slices exclusively or I could write a loop to explicitly construct the new array. Is there a third option?
append() can only be used to append elements to a slice. If you have an array, you can't pass that directly to append().
What you may do is slice the array, so you get a slice (which will use the array as its backing store), and you can use that slice as the target and source of elements.
For example:
s := seven[:0]
s = append(s, five[:]...)
s = append(s, two[:]...)
fmt.Println(seven)
This will print (try it on the Go Playground):
[1 2 3 4 5 6 7]
Also note that since append() returns the resulting slice, it's possible to write all this in one line:
_ = append(append(seven[:0], five[:]...), two[:]...)
(Storing the result is not needed here because we have and want to use only the backing array, but in general that is not the case.)
This outputs the same (try it on the Go Playground), but it isn't very readable, so compacting it into a single line isn't worth it.
When you already have the target array, "appending" arrays is nothing more than copying them into the target at the proper position. For that, you may use the builtin copy() function too. Note that copy() also accepts only slices, so you have to slice the arrays here too.
copy(seven[:], five[:])
copy(seven[len(five):], two[:])
fmt.Println(seven)
This will output the same. Try this one on the Go Playground.
You can use copy
copy(seven[:], five[:])
copy(seven[5:], two[:])
fmt.Printf("%v\n", seven)
> [1 2 3 4 5 6 7]
You can concatenate two arrays in go using copy function
package main
import "fmt"
func main() {
five := [5]int{1, 2, 3, 4, 5}
two := [2]int{6, 7}
var n [len(five) + len(two)]int
copy(n[:], five[:])
copy(n[len(five):], two[:])
fmt.Println(n)
}
https://blog.golang.org/go-slices-usage-and-internals
The Go runtime checks whether an index exceeds the maximum allowed.
For an array, it looks at the array's type (which contains its length and a reference to the element type), because the length is part of the type and is fixed at compile time.
// each distinct array size creates a new type
array := [5]byte{1,2,3,4,5}
For a slice, it looks at the slice header, which looks like:
type slice struct {
	data *byte
	len  int
	cap  int // capacity, the maximum length the slice can be extended to
}
As you can see, a slice is a small structure with data, len and cap fields, whereas an array is just the elements themselves, stored contiguously with no header.
When you convert an array to a slice, Go just creates a slice header and fills in its fields:
slice := array[:]
==
slice := Slice{}
slice.data = array
slice.len = type_of(array).len
slice.cap = type_of(array).len
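The header fields described above can be observed from ordinary Go code. This is a small sketch; the helper name sliceInfo is made up for the example.

```go
package main

import "fmt"

// sliceInfo (hypothetical helper) slices the array and reports the
// resulting len and cap, and whether the slice's first element is the
// array's first element (that is, whether the memory is shared).
func sliceInfo(array *[5]byte) (int, int, bool) {
	slice := array[:] // header: data = &array[0], len = 5, cap = 5
	return len(slice), cap(slice), &slice[0] == &array[0]
}

func main() {
	array := [5]byte{1, 2, 3, 4, 5}
	fmt.Println(sliceInfo(&array)) // 5 5 true

	slice := array[:]
	slice[0] = 42         // writes through to the array
	fmt.Println(array[0]) // 42
}
```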
You can do that simply by converting the arrays into slices:
arr1 := [...]int{1, 2, 3}
arr2 := [...]int{4, 5, 6}
// arr3 := arr1 + arr2 // not allowed
// convert the arrays into slices
slc_arr1, slc_arr2 := arr1[:], arr2[:]
// arr1[:] has no spare capacity, so append allocates a new backing array
slc_arr3 := append(slc_arr1, slc_arr2...)
fmt.Println(slc_arr3) // [1 2 3 4 5 6]
There is a more general way of appending arrays of any type once Go has generics (they arrived in Go 1.18); the solution below is specific to strings, so just change the type as appropriate. The notion of Fold comes from functional programming. Note that I have also included a filter function, which also uses Fold. The solution is not stack safe, but in many cases that does not matter, and it can be made stack safe with trampolining. At the end is an example of its usage.
func FoldRightStrings(as, z []string, f func(string, []string) []string) []string {
if len(as) > 1 { //Slice has a head and a tail.
h, t := as[0], as[1:len(as)]
return f(h, FoldRightStrings(t, z, f))
} else if len(as) == 1 { //Slice has a head and an empty tail.
h := as[0]
return f(h, FoldRightStrings([]string{}, z, f))
}
return z
}
func FilterStrings(as []string, p func(string) bool) []string {
var g = func(h string, accum []string) []string {
if p(h) {
return append(accum, h)
} else {
return accum
}
}
return FoldRightStrings(as, []string{}, g)
}
func AppendStrings(as1, as2 []string) []string {
var g = func(h string, accum []string) []string {
return append(accum, h)
}
return FoldRightStrings(as1, as2, g)
}
func TestAppendStringArrays(t *testing.T) {
strings := []string{"a","b","c"}
bigarray := AppendStrings(AppendStrings(strings, strings),AppendStrings(strings, strings))
if diff := deep.Equal(bigarray, []string{"a","b","c","c","b","a","a","b","c","c","b","a"}); diff != nil {
t.Error(diff)
}
}
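Now that type parameters are available (Go 1.18 and later), a type-independent concatenation can be sketched as follows. This is a plain append-based helper, not a fold; the name Concat is an assumption. Unlike the fold-based AppendStrings, whose test above expects the first argument's elements reversed after the second's, it preserves the order of both inputs.

```go
package main

import "fmt"

// Concat (hypothetical helper) returns a new slice holding the elements
// of a followed by the elements of b. Neither input is modified.
func Concat[T any](a, b []T) []T {
	out := make([]T, 0, len(a)+len(b)) // one allocation of the exact size
	out = append(out, a...)
	return append(out, b...)
}

func main() {
	five := [5]int{1, 2, 3, 4, 5}
	two := [2]int{6, 7}
	fmt.Println(Concat(five[:], two[:]))                   // [1 2 3 4 5 6 7]
	fmt.Println(Concat([]string{"a", "b"}, []string{"c"})) // [a b c]
}
```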
How to check whether two slices are backed up by the same array?
For example:
a := []int{1, 2, 3}
b := a[0:1]
c := a[2:3]
alias(b, c) == true
What should alias look like?
In general you can't tell if the backing array is shared between 2 slices, because using a full slice expression, one might control the capacity of the resulting slice, and then there will be no overlap even when checking the capacity.
As an example, if you have a backing array with 10 elements, a slice may be created that contains only the first 2 elements, with a capacity of 2. Another slice may be created that holds only the last 2 elements, its capacity again being 2.
See this example:
a := [10]int{}
x := a[0:2:2]
y := a[8:10:10]
fmt.Println("len(x) = ", len(x), ", cap(x) = ", cap(x))
fmt.Println("len(y) = ", len(y), ", cap(y) = ", cap(y))
The above will print that both the lengths and the capacities of x and y are 2. They obviously have the same backing array, but you won't have any means to tell that.
Edit: I've misunderstood the question, and the following describes how to tell if (elements of) 2 slices overlap.
There is no language support for this, but since slices have a contiguous section of some backing array, we can check if the address range of their elements overlap.
Unfortunately pointers are not ordered, in the sense that we can't apply the < and > operators to them (there are pointers in Go, but there is no pointer arithmetic). Checking every element address of the first slice against every element address of the second is not feasible either.
But we can obtain a pointer value (an address) as a type of uintptr using the reflect package, more specifically the Value.Pointer() method (or we could also do that using package unsafe, but reflect is "safer"), and uintptr values are integers, they are ordered, so we can compare them.
So what we can do is obtain the addresses of the first and last elements of the slices, and by comparing them, we can tell if they overlap.
Here's a simple implementation:
func overlap(a, b []int) bool {
if len(a) == 0 || len(b) == 0 {
return false
}
amin := reflect.ValueOf(&a[0]).Pointer()
amax := reflect.ValueOf(&a[len(a)-1]).Pointer()
bmin := reflect.ValueOf(&b[0]).Pointer()
bmax := reflect.ValueOf(&b[len(b)-1]).Pointer()
return !(amax < bmin || amin > bmax)
}
Testing it:
a := []int{0, 1, 2, 3}
b := a[0:2]
c := a[2:4]
d := a[0:3]
fmt.Println(overlap(a, b)) // true
fmt.Println(overlap(b, c)) // false
fmt.Println(overlap(c, d)) // true
Try it on the Go Playground.
Found one way of this here. The idea is that while I don't think there's a way of finding the beginning of the backing array, ptr + cap of a slice should[*] point to the end of it. So then one compares the last pointer for equality, like:
func alias(x, y nat) bool {
return cap(x) > 0 && cap(y) > 0 && &x[0:cap(x)][cap(x)-1] == &y[0:cap(y)][cap(y)-1]
}
[*] The code includes the following note:
Note: alias assumes that the capacity of underlying arrays is never changed for nat values; i.e. that there are no 3-operand slice expressions in this code (or worse, reflect-based operations to the same effect).
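Adapted to []int (the adaptation is mine, not from the quoted source), the end-pointer comparison and its documented limitation can be sketched like this:

```go
package main

import "fmt"

// alias reports whether x and y end at the same element when both are
// extended to their full capacity. As the note above says, this is only
// reliable if no 3-operand slice expressions have limited the capacity.
func alias(x, y []int) bool {
	return cap(x) > 0 && cap(y) > 0 &&
		&x[:cap(x)][cap(x)-1] == &y[:cap(y)][cap(y)-1]
}

func main() {
	a := []int{1, 2, 3}
	b := a[0:1]
	c := a[2:3]
	fmt.Println(alias(b, c)) // true

	// Capacity-limited slices of one array defeat the check.
	arr := [10]int{}
	x := arr[0:2:2]
	y := arr[8:10:10]
	fmt.Println(alias(x, y)) // false, although both share arr
}
```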
Say I have the following code:
a := []int{1,2,3}
i := 0
var mu = &sync.Mutex{}
for i < 10 {
go func(a *[]int) {
for _, i := range a {
mu.Lock()
fmt.Println(a[0])
mu.Unlock()
}
}(&a)
i++
}
The array is a shared resource and is being read from in the loop. How do I protect the array in the loop header and do I need to? Also is it necessary to pass the array to the goroutine as a pointer?
First, some Go terminology:
[]int{1, 2, 3} is a slice, not an array. An array would be written as [...]int{1, 2, 3}.
A slice is a triplet of (start, length, capacity) and points to an underlying array (usually heap-allocated, but this is an implementation detail that the language completely hides from you!)
Go's memory model allows any number of readers or (but not and) at most one writer to any given region in memory. The Go memory model (unfortunately) doesn't specifically call out the case of accessing multiple indices into the same slice concurrently, but it appears to be fine to do so (i.e. they are treated as distinct locations in memory, as would be expected).
So if you're just reading from it, it is not necessary to protect it at all.
If you're reading and writing to it, but the goroutines don't read and write to the same places as each other (for example, if goroutine i only reads and writes to position i) then you also don't need synchronization. Moreover, you could either synchronize the entire slice (which means fewer mutexes, but much higher contention) or you could synchronize the individual positions in the slice (which means much lower contention but many more mutexes and locks acquired and released).
Since Go allows functions to capture variables in scope (that is, they are closures), there's really no reason to pass the slice as a pointer at all.
Your code would thus most idiomatically be written as:
a := []int{1, 2, 3}
for i := 0; i < 10; i++ {
	go func() {
		for range a {
			fmt.Println(a[0])
		}
	}()
}
I'm not really sure what the above code is supposed to be for, since it prints a[0] from each of the ten goroutines, which makes it look like it's not even using the slice in a meaningful way.
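The per-index case mentioned earlier (goroutine i reads and writes only position i) can be sketched without any mutex. The function name squares and the use of sync.WaitGroup to wait for completion are choices made for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// squares (hypothetical helper) fills out[i] = i*i using one goroutine
// per index. No mutex is needed because each goroutine writes a distinct
// element of the slice.
func squares(n int) []int {
	out := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) { // pass i so each goroutine has its own copy
			defer wg.Done()
			out[i] = i * i
		}(i)
	}
	wg.Wait() // all writes are complete and visible after Wait returns
	return out
}

func main() {
	fmt.Println(squares(5)) // [0 1 4 9 16]
}
```

Running this with the race detector (go run -race) is a quick way to check that the accesses really are to distinct locations.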
First, you should know that a := []int{1,2,3} is not an array; it is a slice.
A slice literal is like an array literal without the length.
This is an array literal:
[3]bool{true, true, false}
And this creates the same array as above, then builds a slice that
references it:
[]bool{true, true, false}
Types with empty [], such as []int are actually slices, not arrays. In Go, the size of an array is part of the type, so to actually have an array you would need to have something like [16]int, and the pointer to that would be *[16]int.
Q: is it necessary to pass the array to the goroutine as a pointer?
A: No. From https://golang.org/doc/effective_go.html#slices
If a function takes a slice argument, changes it makes to the elements
of the slice will be visible to the caller, analogous to passing a
pointer to the underlying array.
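A short sketch of what the quoted passage means in practice. The function names double and grow are illustrative. Element writes made by the callee are visible to the caller, while the callee's append only changes its own copy of the slice header.

```go
package main

import "fmt"

// double modifies elements in place; the caller sees the change because
// both slice headers point at the same underlying array.
func double(s []int) {
	for i := range s {
		s[i] *= 2
	}
}

// grow appends to its local copy of the slice header; the caller's
// length is unaffected.
func grow(s []int) {
	s = append(s, 99)
}

func main() {
	s := []int{1, 2, 3}
	double(s)
	grow(s)
	fmt.Println(s, len(s)) // [2 4 6] 3
}
```

A pointer to the slice is only needed when the callee must change the caller's length or capacity, and even then returning the new slice is the more common idiom.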
a = make([]int, 7, 15)
creates implicit array of size 15 and slice(a) creates a shallow copy of implicit array and points to first 7 elements in array.
Consider,
var a []int;
creates a zero length slice that does not point to any implicit array.
a = append(a, 9, 86);
creates new implicit array of length 2 and append values 9 and 86. slice(a) points to that new implicit array, where
len(a) is 2 and cap(a) >= 2
My question:
is this the correct understanding?
As I mentioned in "Declare slice or make slice?", the zero value of a slice (nil) acts like a zero-length slice.
So you can append to a []int directly.
You would need to make a slice (make([]int, 0) ) only if you wanted to potentially return an empty slice (instead of nil).
If not, no need to allocate memory before starting appending.
See also "Arrays, slices (and strings): The mechanics of 'append': Nil"
a nil slice is functionally equivalent to a zero-length slice, even though it points to nothing. It has length zero and can be appended to, with allocation.
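The behaviour described above can be checked with a few lines. The helper name appendToNil is made up for this sketch.

```go
package main

import "fmt"

// appendToNil (hypothetical helper) starts from the zero value of []int,
// records whether it was nil, and appends two values to it.
func appendToNil() (wasNil bool, out []int) {
	var a []int // nil slice: no backing array, len 0, cap 0
	wasNil = a == nil
	out = append(a, 9, 86) // append allocates the backing array
	return
}

func main() {
	wasNil, out := appendToNil()
	fmt.Println(wasNil, out, len(out)) // true [9 86] 2

	empty := make([]int, 0) // empty but non-nil
	fmt.Println(empty == nil, len(empty)) // false 0
}
```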
I have two or more dynamic string arrays filled with a huge amount of data, and I want to merge them into one array. I know I can do it with a for loop like this:
var
Arr1, Arr2, MergedArr: Array of string;
I: Integer;
begin
// Arr1:= 5000000 records
// Arr2:= 5000000 records
// Fill MergedArr by Arr1
MergedArr:= Arr1;
// Set length of MergedArr to length of ( Arra1 + Arr2 )+ 2
SetLength(MergedArr, High(Arr1)+ High(Arr2)+2);
// Add Arr2 to MergedArr
for I := Low(Arr2)+1 to High(Arr2)+1 do
MergedArr[High(Arr1)+ i]:= Arr2[i-1];
end;
But it is slow on huge data. Is there a faster way, such as copying the arrays' memory directly?
First of all, string is special, so it should be treated specially: don't try outsmarting the compiler, keep your code unchanged. String is special because it's reference counted. Every time you copy a string from one place to another, its reference count is incremented. When the reference count reaches 0, the string is destroyed. Your code plays nice because it lets the compiler know what you're doing, and in turn the compiler gets the chance to properly increment all reference counts.
Sure, you can play all sorts of tricks as suggested in the comments to gabr's answer, like filling the old arrays with zero's so the reference count in the new array remains valid, but you can't do that if you actually need the old arrays as well. And this is a bit of a hack (albeit one that will probably be valid for the foreseeable future). (and to be noted, I actually like this hack).
Anyway, and this is the important part of my answer, your code is most likely not slow in copying the strings from one array to the other; it's most likely slow somewhere else. Here's a short console application that creates two arrays, each with 5M random strings, then merges the two arrays into a third and displays the time it took to do the merge. Merging only takes about 300 milliseconds on my machine. Filling the arrays takes much longer, but I'm not timing that:
program Project26;
{$APPTYPE CONSOLE}
uses SysUtils, Windows;
var a, b, c: array of string;
i: Integer;
Freq: Int64;
Start, Stop: Int64;
Ticks: Cardinal;
const count = 5000000;
begin
SetLength(a,count);
SetLength(b,count);
for i:=0 to count-1 do
begin
a[i] := IntToStr(Random(1));
b[i] := IntToStr(Random(1));
end;
WriteLn('Moving');
QueryPerformanceFrequency(Freq);
QueryPerformanceCounter(Start);
SetLength(c, Length(a) + Length(b));
for i:=0 to High(a) do
c[i] := a[i];
for i:=0 to High(b) do
c[i+Length(a)] := b[i];
QueryPerformanceCounter(Stop);
WriteLn((Stop - Start) div (Freq div 1000), ' milliseconds');
ReadLn;
end.
You can use built-in Move function which moves a block of memory to another location. Parameters are source and target memory blocks and size of data to be moved.
Because you are copying strings, source arrays must be destroyed after the merging by filling them with zeroes. Otherwise refcounts for strings will be all wrong causing havoc and destruction later in the program.
var
Arr1, Arr2, MergedArr: Array of string;
I: Integer;
begin
SetLength(Arr1, 5000000);
for I := Low(Arr1) to High(Arr1) do
Arr1[I] := IntToStr(I);
SetLength(Arr2, 5000000);
for I := Low(Arr2) to High(Arr2) do
Arr2[I] := IntToStr(I);
// Set length of MergedArr to length of ( Arra1 + Arr2 )+ 2
SetLength(MergedArr, High(Arr1)+ High(Arr2)+2);
// Add Arr1 to MergedArr
Move(Arr1[Low(Arr1)], MergedArr[Low(MergedArr)], Length(Arr1)*SizeOf(Arr1[0]));
// Add Arr2 to MergedArr
Move(Arr2[Low(Arr2)], MergedArr[High(Arr1)+1], Length(Arr2)*SizeOf(Arr2[0]));
// Cleanup Arr1 and Arr2 without touching string refcount.
FillChar(Arr1[Low(Arr1)], Length(Arr1)*SizeOf(Arr1[0]), 0);
FillChar(Arr2[Low(Arr2)], Length(Arr2)*SizeOf(Arr2[0]), 0);
// Test
for I := Low(Arr1) to High(Arr1) do begin
Assert(MergedArr[I] = IntToStr(I));
Assert(MergedArr[I] = MergedArr[Length(Arr1) + I]);
end;
// Clear the array to see if something is wrong with refcounts
for I := Low(MergedArr) to High(MergedArr) do
MergedArr[I] := '';
end;
An excellent maxim is that the fastest code is that which never runs. Since copying is expensive you should look to avoid the cost of copying.
You can do this with a virtual array. Create a class which holds an array of array of string. In your example the outer array would hold two string arrays.
Add a Count property that returns the total number of strings in all of the arrays.
Add a default indexed property that operates by working out which of the outer arrays the index refers to and then returns the appropriate value from the inner array.
For extra points implement an enumerator to make for in work.