unsafe Pointers in Go: function call end kills array - arrays

I'm writing a library and I want to return an array (or write to an array) of an unspecific type to the caller. The type can vary, depending on who calls - I can, however, create as many objects of said type from within my function. One way would be that the caller creates an array and the callee fills that - however, there is no way of telling how long this array is going to be. (Is there a way that the callee makes the caller's array bigger? Remember, the callee only sees x interface{}...)
The other way which I chose because I don't see how above is possible, is that the caller gives me the pointer of his specific type and I redirect it to the array of objects which I created.
Below is my solution. My question: why is the array empty after the function call? They are pointing to the same array after my operation, they should be the same. Am I overlooking something? I thought about GC, but it couldn't be that fast, could it?
http://play.golang.org/p/oVoPx5Nf84
package main
import "unsafe"
import "reflect"
import "log"
func main() {
var x []string
log.Printf("before: %v, %p", x, x)
manipulate(&x)
log.Printf("after: %v, %p", x, x)
}
func manipulate(target interface{}) {
new := make([]string, 0, 10)
new = append(new, "Hello", "World")
log.Printf("new: %v, %p", new, new)
p := unsafe.Pointer(reflect.ValueOf(target).Pointer())
ptr := unsafe.Pointer(reflect.ValueOf(new).Pointer())
*(*unsafe.Pointer)(p) = ptr
}

First of all, unsafe is usually a bad idea. So is reflection, but unsafe is at least an order of magnitude worse.
Here is your example using pure reflection (http://play.golang.org/p/jTJ6Mhg8q9):
package main
import (
"log"
"reflect"
)
func main() {
var x []string
log.Printf("before: %v, %p", x, x)
manipulate(&x)
log.Printf("after: %v, %p", x, x)
}
func manipulate(target interface{}) {
t := reflect.Indirect(reflect.ValueOf(target))
t.Set(reflect.Append(t, reflect.ValueOf("Hello"), reflect.ValueOf("World")))
}
So, why didn't your unsafe way work? Unsafe is extremely tricky and at times requires understanding the internals. First, some misconceptions you have:
You are using arrays: you are not, you are using slices. Slices are a view of an array. They contain within them a pointer to the data, a length, and a capacity. They are structs internally and not pure pointers.
Pointer returns the pointer only if it is a pointer: it can actually return a value for many types like a slice.
From http://golang.org/pkg/reflect/#Value.Pointer:
If v's Kind is Slice, the returned pointer is to the first element of the slice. If the slice is nil the returned value is 0. If the slice is empty but non-nil the return value is non-zero.
Arrays are pointers: in Go, arrays are actually values. That means they are copied when passed to other functions or assigned. It also means the .Pointer method wouldn't work.
You are assigning a pointer to an array to a slice type. By luck, the implementation of slices used internally has the data pointer first so you are actually setting the internal array pointer used by the slice. I must stress that this is effectively pure accident. Even still, you are not setting the length and capacity of the slice so it still prints zero elements.
Unsafe lets you do things at such a low level that the actual results aren't really defined. It is best to stay away from it unless you really know what you are doing. Even then, be aware that things can change and what works today may not work in the next version of Go or another implementation.
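For the part of the question about not knowing the element type, here is a minimal sketch that stays within reflection. The function name fill and its n parameter are assumptions made for this example, not something from the question; reflect.MakeSlice builds a slice of whatever type the caller's pointer refers to.

```go
package main

import (
	"fmt"
	"reflect"
)

// fill is a hypothetical helper. target must be a pointer to a slice;
// the slice it points to is replaced by one holding n zero-valued
// elements of the caller's element type.
func fill(target interface{}, n int) error {
	v := reflect.ValueOf(target)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Slice {
		return fmt.Errorf("fill: want pointer to slice, got %T", target)
	}
	s := v.Elem()
	// Set replaces the whole slice value: data pointer, length and capacity.
	s.Set(reflect.MakeSlice(s.Type(), n, n))
	return nil
}

func main() {
	var xs []string
	if err := fill(&xs, 3); err != nil {
		panic(err)
	}
	fmt.Println(len(xs), cap(xs)) // 3 3
}
```

Because Set assigns the complete slice value, the caller sees the new length and capacity, which is exactly what the unsafe version failed to update.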

Related

How to pass C array to Swift [duplicate]

Newbie to Objective-C and Swift here. I'm creating an NSArray from a C float array with the following code:
float* output_f = output.data_ptr<float>();
NSMutableArray *results = [NSMutableArray arrayWithCapacity: 1360*1060];
for (int i = 0; i < 1360 * 1016; i++) {
[results insertObject:@(output_f[i]) atIndex:i];
}
However since there are over a million samples to be inserted this is slow and is becoming a bottle-neck in my application. Is there a quicker way to create an NSArray from a C array without copying the elements one-by-one?
There's no need to go through Obj-C. Assuming that output_f appears in an include file that's included via your bridging header, Swift will see its type as UnsafeMutablePointer<CFloat> (CFloat is just a typealias for Float, named to clarify that it corresponds to the C type).
Assuming you also make the number of floats in the array available, let's say included somewhere in your bridged header files is:
extern float* output_f;
extern int output_f_count;
Then on the Swift-side, you can use them like this:
let outputFloats = UnsafeMutableBufferPointer<CFloat>(
start: output_f,
count: Int(output_f_count))
The cast of output_f_count to Int is necessary because Swift interprets C's int as CInt (aka Int32).
You can use UnsafeMutableBufferPointer much like an array, but there's no copying. It just aliases the C data in Swift.
If you want to make sure you don't mutate the data, you can create an UnsafeBufferPointer instead, but you'll need to cast the pointer.
let outputFloats = UnsafeBufferPointer<CFloat>(
start: UnsafePointer(output_f),
count: Int(output_f_count))
Since there's no copying, both of those options are very fast. However, they are pointers. If Swift modifies the contents, the C code will see the changed data, and vice-versa. That may or may not be a good thing, depending on your use case, but you definitely want to be aware of it.
If you want to make a copy, you can make a Swift Array very easily like this:
let outputFloatsArray = [CFloat](outputFloats)
Now you have your Swift-side copy in an Array.
As a very closely related thing, if in a C header, output_f were declared as an actual array like this,
extern float output_f[1360*1060];
Then Swift doesn't see a pointer. It sees, believe it or not, a tuple... a great big ugly tuple with a crap-load of CFloat members, which has the benefit of being a value type, but is hard to work with directly because you can't index into it. Fortunately you can work around that:
withUnsafeBytes(of: output_f)
{
let outputFloats = $0.bindMemory(to: CFloat.self)
// Now within the scope of this closure you can use outputFloats
// just as before.
}
Note: You can also use the pointer directly without going through the buffer pointer types, and because you avoid bounds-checking that way, it is a tiny bit faster, but just a very tiny bit, it's more awkward, and well... you lose the error catching benefits of bounds-checking. Plus the buffer pointer types provide all the RandomAccessCollection methods like map, filter, forEach, etc...
Update:
In the comments, the OP said they had tried this approach but got EXC_BAD_ACCESS while dereferencing the pointers. What is missing is the context of what happens between obtaining the pointer from output and its being available to Swift.
Given the clue from earlier that it's actually C++, I think output is probably std::vector<float>, and it's probably going out of scope before Swift does anything with the pointers, so its destructor is being called, which of course deletes its internal data pointer. In that case Swift is accessing memory that is no longer valid.
There are two ways to address this. The first is to make sure that output is not cleaned up until after Swift is done with its data. The other option is to copy the data in C.
const int capacity = 1360*1060;
float* p = output.data_ptr<float>();
// static_cast because the above template syntax indicates
// this is actually C++, not C.
float* output_f = static_cast<float*>(calloc(capacity, sizeof(float)));
memcpy(output_f, p, capacity * sizeof(float));
Now output can be cleaned up before Swift accesses output_f. Also this makes the copy that was originally asked about much faster than using NSArray. Assuming the C code doesn't use output_f after this, Swift can just take ownership of it. In that case, Swift needs to be sure to call free(output_f) when it's done.
If the Swift code doesn't care about it being in an actual array, the Unsafe...BufferPointer types will do the job.
However, if an actual Array is desired, this will be yet another copy, and copying the same data twice just to get it in a Swift Array doesn't make sense if it can be avoided. How to avoid it depends on whether C (or Obj-C) is calling Swift, or Swift is calling Obj-C. I'm going to assume that it's Swift calling C. So let's assume that Swift is calling some C function get_floats() defined like this:
extern "C" float* get_floats()
{
const int capacity = 1360*1060;
float* p = output.data_ptr<float>();
// static_cast because the above template syntax indicates
// this is actually C++, not C.
float* output_f = static_cast<float*>(
calloc(capacity, sizeof(float))
);
memcpy(output_f, p, capacity * sizeof(float));
// Maybe do other work including disposing of `output`
return output_f;
}
You want to change the interface so that a pre-allocated pointer is provided as a parameter, along with its capacity.
extern "C" void get_floats(float *output_f, int capacity)
{
float* p = output.data_ptr<float>();
memcpy(output_f, p, capacity * sizeof(float));
// Maybe do other work including disposing of `output`
// can use return for something else now -- maybe error code?
}
On the Swift side, you could allocate pointers, but since you want it in an Array anyway:
var outputFloats = [CFloat](repeating: 0, count: 1360*1060)
outputFloats.withUnsafeMutableBufferPointer {
get_floats($0.baseAddress, CInt($0.count))
}
// Now the array is populated with the contents of the C array.
One last thing. The above code makes an assumption that output.data_ptr() points to at least capacity number of floats. Are you sure this is true? Assuming output is std::vector, it would be better to change the memcpy call to:
const size_t floatsToCopy = std::min(static_cast<size_t>(capacity), output.size());
memcpy(output_f, p, floatsToCopy * sizeof(float));
That ensures that you're not reading garbage from the end of real data if it's actually less than capacity. Then you can return floatsToCopy; from get_floats.
Then on the Swift side, it looks like this:
var outputFloats = [CFloat](repeating: 0, count: 1360*1060)
let floatsCopied = outputFloats.withUnsafeMutableBufferPointer {
get_floats($0.baseAddress, CInt($0.count))
}
outputFloats.removeLast(
outputFloats.count - Int(floatsCopied),
keepingCapacity: true)
You don't actually have to use the keepingCapacity parameter, but doing so allows you to re-use the array without having to pay for more memory allocations. Just refill out to full capacity before calling get_floats again with the same array. Plus unless your peak memory usage is an issue, keepingCapacity: true is likely faster, and at least no worse, than the default, because without it, Array might choose to reallocate to the smaller size, which internally is an allocation, a copy, and a free, and the whole point was to avoid a copy... but the dynamic memory allocation is the really slow part. Given CPU caches and the way instruction pipelines work, you can do a lot of sequential copying in the time it takes to do a single memory allocation.
According to the comments section your final goal is to read C-array data in Swift. Provided you know the length of the array, you can return it from an Objective-C function as a pointer:
- (float *)cArray {
float *arr = (float *)malloc(sizeof(float) * 4);
for (int i = 0; i < 4; ++i) {
arr[i] = i;
}
return arr;
}
And just read it from an UnsafePointer in Swift:
let ptr = TDWObject().cArray()
(0 ..< 4).forEach {
print(ptr.advanced(by: $0).pointee)
}
Don't forget to deallocate the pointer when you are done with it:
ptr.deallocate()

Getting size of a structure in GAE/Go

I would like to get the size of a structure in GAE/Go.
I read this post and wrote the code as below.
import (
"reflect"
"unsafe"
)
func GetSize(T interface{}) {
size := reflect.TypeOf(T).Size()
return int((*int)(unsafe.Pointer(size)))
}
But this code does not work because GAE does not allow to import unsafe.
How can I do this in GAE/Go?
Your proposed solution is not valid code, it has multiple errors.
For example GetSize() has no result type, so you couldn't return anything.
Next, the expression you return is also a compile-time error: it attempts to convert an *int pointer to int, which is not valid.
You need to dereference the pointer first, so the correct syntax would be:
func GetSize(T interface{}) int {
size := reflect.TypeOf(T).Size()
return int(*(*int)(unsafe.Pointer(size)))
}
But. It makes no sense. reflect.Type.Size() already returns the size (the number of bytes needed to store a value of the given type), so there is no need of that unsafe magic. What may be confusing is that its return type is uintptr, but you may simply use that value after converting it to int for example.
Simply use:
func GetSize(v interface{}) int {
return int(reflect.TypeOf(v).Size())
}
Testing it:
fmt.Println("Size of int:", GetSize(int(0)))
fmt.Println("Size of int64:", GetSize(int64(0)))
fmt.Println("Size of [100]byte:", GetSize([100]byte{}))
Output (try it on the Go Playground):
Size of int: 4
Size of int64: 8
Size of [100]byte: 100
One thing you must not forget: this GetSize() will not recursively examine the size of the passed value. So for example if it's a struct with a pointer field, it will not "count" the size of the pointed value, only the size of the pointer field.
Constructing a GetSize() that recursively counts the total size of a complex data structure is non-trivial due to types like map. For details, see How to get variable memory size of variable in golang?
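To make the "not recursive" point concrete, here is a small sketch. The record type and the shallowSize name are invented for this example; it only relies on reflect.Type.Size.

```go
package main

import (
	"fmt"
	"reflect"
)

// record is a hypothetical struct whose only field is a pointer
// to a 1 KiB array.
type record struct {
	payload *[1024]byte
}

// shallowSize reports the size of v's own type, without following pointers.
func shallowSize(v interface{}) uintptr {
	return reflect.TypeOf(v).Size()
}

func main() {
	r := record{payload: new([1024]byte)}

	// The struct is exactly as large as its single pointer field.
	fmt.Println(shallowSize(r) == shallowSize(r.payload)) // true

	// The pointed-to array is only counted when passed by value.
	fmt.Println(shallowSize(*r.payload)) // 1024
}
```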

How to return UnsafePointer <CChar> and UnsafePointer <CUnsignedInt> with Swift

I'm writing a Swift program that needs to interoperate with a C library.
This C Library consists of a set of functions that use callbacks.
I resolved the problem of how to pass a Swift func to these C functions, but I'm having difficulties to convert Swift native types to the appropriate C types.
Specifically, I have these 2 callbacks (the signature MUST be this in order to be accepted by C library):
func peer_name_handler_swift_test(p: peer_wr) -> UnsafePointer<CChar>
{
return nil;
}
func peer_ver_handler_swift_test(p: peer_wr) -> UnsafePointer<CUnsignedInt>
{
return nil;
}
Well, despite my efforts, I could not return the correct types from String Swift type and from a simple CUnsignedInt array.
The 2 data I would like to return in these callbacks are these one:
var BLZ_SWIFT_TEST_PEER_VER: [CUnsignedInt] = [0,0,1,0];
var BLZ_SWIFT_TEST_PEER_NAME: String = "test_swift_peer";
Can you help me?
Coercing data into the correct pointer types can be done in various ways. The tricky part to this question is deciding how to manage the memory you are giving out a pointer to.
If you are declaring values known statically at compile time, you could do it like so:
// I _think_ you need to manually null-terminate the string here
let BLZ_SWIFT_TEST_PEER_NAME: StaticString = "test_swift_peer\0"
func peer_name_handler_swift_test(p: peer_wr) -> UnsafePointer<CChar>
{
return UnsafePointer(BLZ_SWIFT_TEST_PEER_NAME.utf8Start)
}
// tuples are the closest approximation to C fixed-size arrays
var BLZ_SWIFT_TEST_PEER_VER
= (0,0,1,0) as (CUnsignedInt,CUnsignedInt,CUnsignedInt,CUnsignedInt)
func peer_ver_handler_swift_test(p: peer_wr) -> UnsafePointer<CUnsignedInt>
{
return withUnsafePointer(&BLZ_SWIFT_TEST_PEER_VER) {
UnsafePointer($0)
}
}
The extra UnsafePointer occurrences in both function bodies are pointer conversion. In the string case, because utf8 is UInt8 but CChar is an alias for Int8. In the array case, because you want a pointer to the first CUnsignedInt rather than a pointer to a 4-tuple.
If you want to change the values at runtime, you need to decide who is going to create and free the memory. The caller of your callback may store the pointer, so if the value changes you may need to allocate a new block of memory rather than overwrite the one you already handed out, and then you need a way to track those blocks so you can free them later and avoid leaking. If you want a single static string, but one that you create at runtime, say from a config file, you could do this:
// prior to the callback getting called
let namePtr = strdup(BLZ_SWIFT_TEST_PEER_NAME)
func peer_name_handler_swift_test(p: peer_wr) -> UnsafePointer<CChar>
{
return UnsafePointer(namePtr)
}
// then some time later, if you want to clean up
free(namePtr)

Does go garbage collect parts of slices?

If I implement a queue like this...
package main
import(
"fmt"
)
func PopFront(q *[]string) string {
r := (*q)[0]
*q = (*q)[1:len(*q)]
return r
}
func PushBack(q *[]string, a string) {
*q = append(*q, a)
}
func main() {
q := make([]string, 0)
PushBack(&q, "A")
fmt.Println(q)
PushBack(&q, "B")
fmt.Println(q)
PushBack(&q, "C")
fmt.Println(q)
PopFront(&q)
fmt.Println(q)
PopFront(&q)
fmt.Println(q)
}
... I end up with an array ["A", "B", "C"] that has no slices pointing to the first two elements. Since the "start" pointer of a slice can never be decremented (AFAIK), those elements can never be accessed.
Is Go's garbage collector smart enough to free them?
Slices are just descriptors (small struct-like data structures) which if not referenced will be garbage collected properly.
The underlying array for a slice (to which the descriptor points), on the other hand, is shared between all slices that are created by reslicing it. Quoting from the Go Language Specification: Slice Types:
A slice, once initialized, is always associated with an underlying array that holds its elements. A slice therefore shares storage with its array and with other slices of the same array; by contrast, distinct arrays always represent distinct storage.
Therefore if at least one slice exists, or a variable holding the array (if a slice was created by slicing the array), it will not be garbage collected.
Official Statement about this:
The blog post Go Slices: usage and internals by Andrew Gerrand clearly states this behaviour:
As mentioned earlier, re-slicing a slice doesn't make a copy of the underlying array. The full array will be kept in memory until it is no longer referenced. Occasionally this can cause the program to hold all the data in memory when only a small piece of it is needed.
...
Since the slice references the original array, as long as the slice is kept around the garbage collector can't release the array.
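The usual way around the situation described in the quote is to copy the piece you need into a right-sized slice. A minimal sketch, where the helper name compact is an assumption made for this example:

```go
package main

import "fmt"

// compact is a hypothetical helper that copies s into a freshly
// allocated slice whose capacity equals its length.
func compact(s []string) []string {
	out := make([]string, len(s))
	copy(out, s)
	return out
}

func main() {
	big := make([]string, 1000)
	big[0], big[1] = "a", "b"

	head := big[:2]        // still refers to the 1000-element array
	small := compact(head) // refers to its own 2-element array

	fmt.Println(len(head), cap(head))   // 2 1000
	fmt.Println(len(small), cap(small)) // 2 2
	fmt.Println(&head[0] == &small[0])  // false
}
```

Once nothing refers to big or head any more, the large array becomes eligible for collection even though small is still in use.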
Back to your example
While the underlying array will not be freed, note that if you add new elements to the queue, the built-in append function occasionally might allocate a new array and copy the current elements to the new – but copying will only copy the elements of the slice and not the whole underlying array! When such a reallocation and copying occurs, the "old" array may be garbage collected if no other reference exists to it.
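The reallocation can be observed by comparing the address of the first element before and after append. This is only a sketch; the helper name push is made up, and the exact capacity chosen by append is an implementation detail, so it is not printed here.

```go
package main

import "fmt"

// push is a hypothetical helper: it appends v and reports whether the
// elements now live in a different backing array. An empty input is
// reported as moved because there is no first element to compare.
func push(q []string, v string) (out []string, moved bool) {
	out = append(q, v)
	moved = len(q) == 0 || &out[0] != &q[0]
	return out, moved
}

func main() {
	q := []string{"A", "B", "C"} // len 3, cap 3
	q = q[2:]                    // len 1, cap 1: no room left

	q, moved := push(q, "D")
	fmt.Println(q, moved) // [C D] true

	// With spare capacity, append writes into the existing array.
	r := make([]string, 1, 4)
	_, moved = push(r, "x")
	fmt.Println(moved) // false
}
```

After the move, only "C" and "D" were copied; the old three-element array is no longer referenced by q.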
Also another very important thing is that if an element is popped from the front, the slice will be resliced and not contain a reference to the popped element, but since the underlying array still contains that value, the value will also remain in memory (not just the array). It is recommended that whenever an element is popped or removed from your queue (slice/array), always zero it (its respective element in the slice) so the value will not remain in memory needlessly. This becomes even more critical if your slice contains pointers to big data structures.
func PopFront(q *[]string) string {
r := (*q)[0]
(*q)[0] = "" // Always zero the removed element!
*q = (*q)[1:len(*q)]
return r
}
This is mentioned on the Slice Tricks wiki page:
Delete without preserving order
a[i] = a[len(a)-1]
a = a[:len(a)-1]
NOTE If the type of the element is a pointer or a struct with pointer fields, which need to be garbage collected, the above implementations of Cut and Delete have a potential memory leak problem: some elements with values are still referenced by slice a and thus can not be collected.
No. At the time of this writing, the Go garbage collector (GC) is not smart enough to collect the beginning of an underlying array in a slice, even if it is inaccessible.
As mentioned by others here, a slice (under the hood) is a struct of exactly three things: a pointer to its underlying array, the length of the slice (values accessible without reslicing), and the capacity of the slice (values accessible by reslicing). On the Go blog, slice internals are discussed at length. Here is another article I like about Go memory layouts.
When you reslice and cut off the tail end of a slice, it is obvious (upon understanding the internals) that the underlying array, the pointer to the underlying array, and the slice's capacity are all left unchanged; only the slice length field is updated. When you re-slice and cut off the beginning of a slice, you are really changing the pointer to the underlying array along with the length and capacity. In this case, it is generally unclear (based on my readings) why the GC does not clean up this inaccessible part of the underlying array because you cannot re-slice the array to access it again. My assumption is that the underlying array is treated as one block of memory from the GC's point of view. If you can point to any part of the underlying array, the entire thing is ineligible for deallocation.
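Which fields change in each case can be checked with len, cap and the address of the first element. A small sketch; cutTail and cutHead are names invented for this example:

```go
package main

import "fmt"

// cutTail drops the last n elements; cutHead drops the first n.
// Both are hypothetical helpers that only reslice.
func cutTail(s []int, n int) []int { return s[:len(s)-n] }
func cutHead(s []int, n int) []int { return s[n:] }

func main() {
	base := make([]int, 5, 8)

	tail := cutTail(base, 2)
	fmt.Println(len(tail), cap(tail), &tail[0] == &base[0]) // 3 8 true

	head := cutHead(base, 2)
	fmt.Println(len(head), cap(head), &head[0] == &base[2]) // 3 6 true
}
```

Cutting the tail only shortens the length. Cutting the head moves the data pointer forward and shrinks both length and capacity, yet both results still point into the same underlying array.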
I know what you're thinking... like the true computer scientist you are, you may want some proof. I'll indulge you:
https://goplay.space/#tDBQs1DfE2B
As mentioned by others and as shown in the sample code, using append can cause a reallocation and copy of the underlying array, which allows the old underlying array to be garbage collected.
Simple question, simple answer: No. (But if you keep pushing, the slice will at some point overflow its underlying array, and then the unused elements become available to be freed.)
Contrary to what I'm reading, Golang certainly seems to garbage collect at least the unused starting sections of slices. The following test case provides evidence.
In the first case the slice is set to slice[1:] in each iteration. In the comparison case, it skips that step.
The second case dwarfs the memory consumed in the first case. But why?
func TestArrayShiftMem(t *testing.T) {
slice := [][1024]byte{}
mem := runtime.MemStats{}
mem2 := runtime.MemStats{}
runtime.GC()
runtime.ReadMemStats(&mem)
for i := 0; i < 1024*1024*1024*1024; i++ {
slice = append(slice, [1024]byte{})
slice = slice[1:]
runtime.GC()
if i%(1024) == 0 {
runtime.ReadMemStats(&mem2)
fmt.Println(mem2.HeapInuse - mem.HeapInuse)
fmt.Println(mem2.StackInuse - mem.StackInuse)
fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
}
}
}
func TestArrayShiftMem3(t *testing.T) {
slice := [][1024]byte{}
mem := runtime.MemStats{}
mem2 := runtime.MemStats{}
runtime.GC()
runtime.ReadMemStats(&mem)
for i := 0; i < 1024*1024*1024*1024; i++ {
slice = append(slice, [1024]byte{})
// slice = slice[1:]
runtime.GC()
if i%(1024) == 0 {
runtime.ReadMemStats(&mem2)
fmt.Println(mem2.HeapInuse - mem.HeapInuse)
fmt.Println(mem2.StackInuse - mem.StackInuse)
fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
}
}
}
Output Test1:
go test -run=.Mem -v .
...
0
393216
21472
^CFAIL github.com/ds0nt/cs-mind-grind/arrays 1.931s
Output Test3:
go test -run=.Mem3 -v .
...
19193856
393216
19213888
^CFAIL github.com/ds0nt/cs-mind-grind/arrays 2.175s
If you disable garbage collection on the first test, indeed memory skyrockets. The resulting code looks like this:
func TestArrayShiftMem2(t *testing.T) {
debug.SetGCPercent(-1)
slice := [][1024]byte{}
mem := runtime.MemStats{}
mem2 := runtime.MemStats{}
runtime.GC()
runtime.ReadMemStats(&mem)
// 1kb per
for i := 0; i < 1024*1024*1024*1024; i++ {
slice = append(slice, [1024]byte{})
slice = slice[1:]
// runtime.GC()
if i%(1024) == 0 {
fmt.Println("len, cap:", len(slice), cap(slice))
runtime.ReadMemStats(&mem2)
fmt.Println(mem2.HeapInuse - mem.HeapInuse)
fmt.Println(mem2.StackInuse - mem.StackInuse)
fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
}
}
}

Golang: unsafe dynamic byte array

I am trying to interface with a Windows dll using Go. The dll function I want to use accepts a pointer to a byte array. Therefore I need to give it that byte array.
I am using the syscall library to call the dll, as demonstrated here. My basic requirements are:
I am given the required size for the byte array
I create the byte array
I must get a pointer to the byte array
I then pass the pointer to the Windows dll
I can't figure out how to create a byte array in go, and get a pointer to it. This is obviously an unsafe operation, and the unsafe library can be helpful, but I need to create a dynamic-length byte array in the first place. Creating a slice with "make" doesn't help me, unless I can get a pointer to the slice's backing array.
Has anyone else encountered this or have any ideas?
I think syscall.ComputerName implementation https://golang.org/src/syscall/syscall_windows.go#395 would be a good example. It uses uint16s, not bytes, but otherwise ...
In your case it would be ptr := &myslice[0].
Alex
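Spelled out for a dynamically sized byte buffer, the &myslice[0] suggestion could look like the sketch below. The helper name bufferPointer is an assumption, and the DLL call itself is left out; the guard is there because indexing an empty slice panics.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bufferPointer is a hypothetical helper returning a pointer to the
// first byte of buf, or nil when buf is empty.
func bufferPointer(buf []byte) unsafe.Pointer {
	if len(buf) == 0 {
		return nil
	}
	return unsafe.Pointer(&buf[0])
}

func main() {
	n := 16 // size reported by the API in a real program
	buf := make([]byte, n)

	p := bufferPointer(buf)

	// Writing through the pointer is visible through the slice,
	// so both refer to the same backing array.
	*(*byte)(p) = 0x7f
	fmt.Println(buf[0]) // 127
}
```

Keep buf referenced until the call that uses the pointer has returned, so the garbage collector does not reclaim the backing array in the meantime.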
Well I found one gross solution. Apparently the structure of a slice contains a pointer to the backing array, the length of the slice, and then the capacity of the slice.
I am only interested in a pointer to the byte array, so I only need the first member of the slice's internal data.
Go's unsafe.Pointer will not cast a slice to an unsafe pointer, but it will cast a pointer to a slice as an unsafe pointer. Since I can cast an unsafe pointer to any old type of pointer I want, I can cast it to a pointer-to-a-pointer, which recovers the first member of the slice's internal data.
Here's a working example. I wanted a uintptr but you could cast it to any pointer type.
package main
import (
"fmt"
"unsafe"
)
func main() {
// Arbitrary size
n := 4
// Create a slice of the correct size
m := make([]int, n)
// Use convoluted indirection to cast the first few bytes of the slice
// to an unsafe uintptr
mPtr := *(*uintptr)(unsafe.Pointer(&m))
// Check it worked
m[0] = 987
// (we have to recast the uintptr to a *int to examine it)
fmt.Println(m[0], *(*int)(unsafe.Pointer(mPtr)))
}
If you wanted a *int instead, you could do mPtr := *(**int)(unsafe.Pointer(&m))
This works as long as a slice maintains this internal data structure. I am definitely open to a more robust solution that doesn't depend on the structure of Go's internals.
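On the wish for something that does not depend on the slice layout: assuming Go 1.20 or newer, unsafe.SliceData returns the same pointer as &m[0] without reading the slice header by hand. A minimal sketch, where the wrapper name dataPointer is made up:

```go
package main

import (
	"fmt"
	"unsafe"
)

// dataPointer is a hypothetical wrapper around unsafe.SliceData,
// which is available from Go 1.20 on.
func dataPointer(m []int) *int {
	return unsafe.SliceData(m)
}

func main() {
	m := make([]int, 4)

	p := dataPointer(m)
	*p = 987

	// The pointer refers to the first element of the backing array.
	fmt.Println(m[0], p == &m[0]) // 987 true
}
```

On older Go versions, &m[0] on a non-empty slice gives the same pointer and is also independent of the internal layout.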
