C# -- Create Managed Array from Pointer - arrays

I'm trying to create a Managed Array of doubles from an array of bytes. I have the problem working currently, but I wanted to optimize. Here's some code that I would like to work:
private unsafe static double[] _Get_Doubles(byte[] _raw_data)
{
double[] ret;
fixed (byte* _pd = _raw_data)
{
double* _pret = (double*)_pd;
ret = (double[])*_pret; //FAILURE
}
}
Please let me know how to cope with these problems.
-Aaron

One of the key things to notice about the code you have posted is that there is no way to know how many items are pointed to by the return value, and a managed array needs to know how big it is. You can return a double* or create a new double[XXX] and copy the values or even (if the count is constant) create a struct with a public fixed double _data[2]; member and cast the raw data to that type.

Just now, I thought that stackalloc would be the right way, but it fails. Most importantly, I now know that it was doomed to fail. There is no way to do what I want to do.
This can be seen by restating the question:
How can I create a managed array around an 'unsafe' array?
Since a managed array has header information (because it's a class around a chuck of memory), it requires more space in memory than the array itself. So, the answer is:
Allocate space before (and/or after? depending on the way managed arrays are stored in memory) the array itself and put the managed information (length, (et cetera)) around the 'unsafe' array.
This is not easily possible because to guarantee that there is data enough around the array is shaky at best. In my particular example there may be enough space for it because a managed byte[] is passed in meaning that there is data around the array, but to assert that the same data is appropriate for managed double[] is dubious at best, but most likely erroneous, and to change the data to make it appropriate for managed double[] is nefarious.
[EDIT]
It looks like Marshal.Copy is the way to go here. Create a new array and let Marshal copy them (hoping that he will be quicker than me, or that perhaps at some later date, he will be quicker):
var ret = new double[_raw_data.Length / sizeof(double)];
System.Runtime.InteropServices.Marshal.Copy(new System.IntPtr(_pret), ret, 0, ret.Length);

Related

Is it better to create new variables or using pointers in C? [duplicate]

In Go there are various ways to return a struct value or slice thereof. For individual ones I've seen:
type MyStruct struct {
Val int
}
func myfunc() MyStruct {
return MyStruct{Val: 1}
}
func myfunc() *MyStruct {
return &MyStruct{}
}
func myfunc(s *MyStruct) {
s.Val = 1
}
I understand the differences between these. The first returns a copy of the struct, the second a pointer to the struct value created within the function, the third expects an existing struct to be passed in and overrides the value.
I've seen all of these patterns be used in various contexts, I'm wondering what the best practices are regarding these. When would you use which? For instance, the first one could be ok for small structs (because the overhead is minimal), the second for bigger ones. And the third if you want to be extremely memory efficient, because you can easily reuse a single struct instance between calls. Are there any best practices for when to use which?
Similarly, the same question regarding slices:
func myfunc() []MyStruct {
return []MyStruct{ MyStruct{Val: 1} }
}
func myfunc() []*MyStruct {
return []MyStruct{ &MyStruct{Val: 1} }
}
func myfunc(s *[]MyStruct) {
*s = []MyStruct{ MyStruct{Val: 1} }
}
func myfunc(s *[]*MyStruct) {
*s = []MyStruct{ &MyStruct{Val: 1} }
}
Again: what are best practices here. I know slices are always pointers, so returning a pointer to a slice isn't useful. However, should I return a slice of struct values, a slice of pointers to structs, should I pass in a pointer to a slice as argument (a pattern used in the Go App Engine API)?
tl;dr:
Methods using receiver pointers are common; the rule of thumb for receivers is, "If in doubt, use a pointer."
Slices, maps, channels, strings, function values, and interface values are implemented with pointers internally, and a pointer to them is often redundant.
Elsewhere, use pointers for big structs or structs you'll have to change, and otherwise pass values, because getting things changed by surprise via a pointer is confusing.
One case where you should often use a pointer:
Receivers are pointers more often than other arguments. It's not unusual for methods to modify the thing they're called on, or for named types to be large structs, so the guidance is to default to pointers except in rare cases.
Jeff Hodges' copyfighter tool automatically searches for non-tiny receivers passed by value.
Some situations where you don't need pointers:
Code review guidelines suggest passing small structs like type Point struct { latitude, longitude float64 }, and maybe even things a bit bigger, as values, unless the function you're calling needs to be able to modify them in place.
Value semantics avoid aliasing situations where an assignment over here changes a value over there by surprise.
Passing small structs by value can be more efficient by avoiding cache misses or heap allocations. In any case, when pointers and values perform similarly, the Go-y approach is to choose whatever provides the more natural semantics rather than squeeze out every last bit of speed.
So, Go Wiki's code review comments page suggests passing by value when structs are small and likely to stay that way.
If the "large" cutoff seems vague, it is; arguably many structs are in a range where either a pointer or a value is OK. As a lower bound, the code review comments suggest slices (three machine words) are reasonable to use as value receivers. As something nearer an upper bound, bytes.Replace takes 10 words' worth of args (three slices and an int). You can find situations where copying even large structs turns out a performance win, but the rule of thumb is not to.
For slices, you don't need to pass a pointer to change elements of the array. io.Reader.Read(p []byte) changes the bytes of p, for instance. It's arguably a special case of "treat little structs like values," since internally you're passing around a little structure called a slice header (see Russ Cox (rsc)'s explanation). Similarly, you don't need a pointer to modify a map or communicate on a channel.
For slices you'll reslice (change the start/length/capacity of), built-in functions like append accept a slice value and return a new one. I'd imitate that; it avoids aliasing, returning a new slice helps call attention to the fact that a new array might be allocated, and it's familiar to callers.
It's not always practical follow that pattern. Some tools like database interfaces or serializers need to append to a slice whose type isn't known at compile time. They sometimes accept a pointer to a slice in an interface{} parameter.
Maps, channels, strings, and function and interface values, like slices, are internally references or structures that contain references already, so if you're just trying to avoid getting the underlying data copied, you don't need to pass pointers to them. (rsc wrote a separate post on how interface values are stored).
You still may need to pass pointers in the rarer case that you want to modify the caller's struct: flag.StringVar takes a *string for that reason, for example.
Where you use pointers:
Consider whether your function should be a method on whichever struct you need a pointer to. People expect a lot of methods on x to modify x, so making the modified struct the receiver may help to minimize surprise. There are guidelines on when receivers should be pointers.
Functions that have effects on their non-receiver params should make that clear in the godoc, or better yet, the godoc and the name (like reader.WriteTo(writer)).
You mention accepting a pointer to avoid allocations by allowing reuse; changing APIs for the sake of memory reuse is an optimization I'd delay until it's clear the allocations have a nontrivial cost, and then I'd look for a way that doesn't force the trickier API on all users:
For avoiding allocations, Go's escape analysis is your friend. You can sometimes help it avoid heap allocations by making types that can be initialized with a trivial constructor, a plain literal, or a useful zero value like bytes.Buffer.
Consider a Reset() method to put an object back in a blank state, like some stdlib types offer. Users who don't care or can't save an allocation don't have to call it.
Consider writing modify-in-place methods and create-from-scratch functions as matching pairs, for convenience: existingUser.LoadFromJSON(json []byte) error could be wrapped by NewUserFromJSON(json []byte) (*User, error). Again, it pushes the choice between laziness and pinching allocations to the individual caller.
Callers seeking to recycle memory can let sync.Pool handle some details. If a particular allocation creates a lot of memory pressure, you're confident you know when the alloc is no longer used, and you don't have a better optimization available, sync.Pool can help. (CloudFlare published a useful (pre-sync.Pool) blog post about recycling.)
Finally, on whether your slices should be of pointers: slices of values can be useful, and save you allocations and cache misses. There can be blockers:
The API to create your items might force pointers on you, e.g. you have to call NewFoo() *Foo rather than let Go initialize with the zero value.
The desired lifetimes of the items might not all be the same. The whole slice is freed at once; if 99% of the items are no longer useful but you have pointers to the other 1%, all of the array remains allocated.
Copying or moving the values might cause you performance or correctness problems, making pointers more attractive. Notably, append copies items when it grows the underlying array. Pointers to slice items from before the append may not point to where the item was copied after, copying can be slower for huge structs, and for e.g. sync.Mutex copying isn't allowed. Insert/delete in the middle and sorting also move items around so similar considerations can apply.
Broadly, value slices can make sense if either you get all of your items in place up front and don't move them (e.g., no more appends after initial setup), or if you do keep moving them around but you're confident that's OK (no/careful use of pointers to items, and items are small or you've measured the perf impact). Sometimes it comes down to something more specific to your situation, but that's a rough guide.
If you can (e.g. a non-shared resource that does not need to be passed as reference), use a value. By the following reasons:
Your code will be nicer and more readable, avoiding pointer operators and null checks.
Your code will be safer against Null Pointer panics.
Your code will be often faster: yes, faster! Why?
Reason 1: you will allocate less items in the heap. Allocating/deallocating from stack is immediate, but allocating/deallocating on Heap may be very expensive (allocation time + garbage collection). You can see some basic numbers here: http://www.macias.info/entry/201802102230_go_values_vs_references.md
Reason 2: especially if you store returned values in slices, your memory objects will be more compacted in memory: looping a slice where all the items are contiguous is much faster than iterating a slice where all the items are pointers to other parts of the memory. Not for the indirection step but for the increase of cache misses.
Myth breaker: a typical x86 cache line are 64 bytes. Most structs are smaller than that. The time of copying a cache line in memory is similar to copying a pointer.
Only if a critical part of your code is slow I would try some micro-optimization and check if using pointers improves somewhat the speed, at the cost of less readability and mantainability.
Three main reasons when you would want to use method receivers as pointers:
"First, and most important, does the method need to modify the receiver? If it does, the receiver must be a pointer."
"Second is the consideration of efficiency. If the receiver is large, a big struct for instance, it will be much cheaper to use a pointer receiver."
"Next is consistency. If some of the methods of the type must have pointer receivers, the rest should too, so the method set is consistent regardless of how the type is used"
Reference : https://golang.org/doc/faq#methods_on_values_or_pointers
Edit : Another important thing is to know the actual "type" that you are sending to function. The type can either be a 'value type' or 'reference type'.
Even as slices and maps acts as references, we might want to pass them as pointers in scenarios like changing the length of the slice in the function.
A case where you generally need to return a pointer is when constructing an instance of some stateful or shareable resource. This is often done by functions prefixed with New.
Because they represent a specific instance of something and they may need to coordinate some activity, it doesn't make a lot of sense to generate duplicated/copied structures representing the same resource -- so the returned pointer acts as the handle to the resource itself.
Some examples:
func NewTLSServer(handler http.Handler) *Server -- instantiate a web server for testing
func Open(name string) (*File, error) -- return a file access handle
In other cases, pointers are returned just because the structure may be too large to copy by default:
func NewRGBA(r Rectangle) *RGBA -- allocate an image in memory
Alternatively, returning pointers directly could be avoided by instead returning a copy of a structure that contains the pointer internally, but maybe this isn't considered idiomatic:
No such examples found in the standard libraries...
Related question: Embedding in Go with pointer or with value
Regarding to struct vs. pointer return value, I got confused after reading many highly stared open source projects on github, as there are many examples for both cases, util I found this amazing article:
https://www.ardanlabs.com/blog/2014/12/using-pointers-in-go.html
"In general, share struct type values with a pointer unless the struct type has been implemented to behave like a primitive data value.
If you are still not sure, this is another way to think about. Think of every struct as having a nature. If the nature of the struct is something that should not be changed, like a time, a color or a coordinate, then implement the struct as a primitive data value. If the nature of the struct is something that can be changed, even if it never is in your program, it is not a primitive data value and should be implemented to be shared with a pointer. Don’t create structs that have a duality of nature."
Completedly convinced.

When to use slice instead of an array in GO

I am learning GO. According to documentation, slices are richer than arrays.
However, I am failing to grasp hypothetical use cases for slices.
What would be use case where one would use a slice instead of array?
Thanks!
This is really pretty elementary and probably should already have been covered in whatever documentation you're reading (unless it's just the language spec), but: A Go array always has a fixed size. If you always need 10 things of type T, [10]T is fine. But what if you need a variable number of things n, where n is determined at runtime?
A Go slice—which consists of two parts, a slice header and an underlying backing array—is pretty ideal for holding information needed to access a variable-sized array. Note that just declaring a slice-header variable:
var x []T
doesn't actually allocate any array of T yet: the slice header will be initialized to hold nil (converted to the right type) as the (missing) backing array, 0 as the current size, and 0 as the capacity of this array. As a result of this, the test x == nil will say that yes, x is nil. To get an actual array, you will need either:
an actual array, or
a call to make, or
use of the built-in append or similar (e.g., copy, append hidden behind some function, etc).
Since the call to make happens at runtime, it can make an array of whatever size is needed at this point. A series of calls to append can build up an array. Note that each call to append may have to allocate a new backing array, or may be able to extend the existing array in-place, depending on what's in the capacity. That's why you need x = append(x, elem) or x = append(x, elems...) and not just append(x, elem) or append(x, elems...).
The Go blog entry on slices has a lot more to say on this. I like this page more than the sequence of pages in the Go Tour starting here, but opinions vary.

Updating string array using java stream

I know that for object, we can forEach the collection and update the object as we like but for immutable objects like Strings, how can we update the array with new object without converting it into an array again.
For e.g, I have an array of string. I want to iterate through each string and trim them. I would otherwise have to do something like this:
Arrays.stream(str).map(c -> c.trim()).collect(Collectors.toList())
In the end, I will get a List rather then String[] that I initially gave. Its a whole lot of processing. Is there any way I can do something similar to:
for(int i = 0; i < str.length; i++) {
str[i] = str[i].trim();
}
using java streams?
Streams are not intended for manipulating other data structures, especially not for updating their source. But the Java API consists of more than the Stream API.
As Alexis C. has shown in a comment, you could use Arrays.setAll(arr, i -> arr[i].trim());
There’s even a parallelSetAll that you could use when you have a really large array.
However, it might be easier to use just Arrays.asList(arr).replaceAll(String::trim);.
Keep in mind that the wrapper returned by Arrays.asList allows modifications of the wrapped array through the List interface. Only adding and removing is not supported.
Use toArray :
str = Arrays.stream(str).map(c -> c.trim()).toArray(String[]::new);
The disadvantage here (over your original Java 7 loop) is that a new array is created to store the result.
To update the original array, you can re-write your loop with Streams, though I'm not sure what's the point :
IntStream.range (0, str.length).forEach (i -> {str[i] = str[i].trim();});
It's not that much processing as you might think, the array has a known size and the spliterator from it will be SIZED, thus the resulting collection size will be known before processing and the space for it can be allocated ahead of time, without having to re-size the collection.
It's also always interesting that in the absence of actual tests we almost always assume that this is slow or memory hungry.
of course if you want an array as the result there is a method for that :
.toArray(String[]::new);

LabView: fixed size array

is there a way to create a fixed size array in LabView?
I know that I can do some check on the array size, then discard values when an array size become greater than a specific value. But, I think that is a common problem, so there is some built in function in LabView to have a fixed size array?
As far as I know this is impossible, unless they changed something in one of their latest releases but I doubt it: it would probably require a serious rewrite of the core array code.
The closest you can get is writing your own (possibly polymorphic) array class in which you encapsulate an actual array, that you initialize once with a certain size. For the rest your class only exposes methods to get/set by index. No resize etc.
Or, if you are talking about arrays of controls etc on the front panel, you can probably do this at the UI level by hide the indexing control from it and making sure it cannot be resized graphically. Or probably it's also doable to create a custom control and strip lots of array functionality from it.
If the array size is fixed at design time, then you might consider using a cluster instead. There is even a primitive to convert an array to a cluster of fixed size, provided the length is less then 257. (Array To Cluster function.)
There is also a primitive to go the other way if you need to index the array.
One implementation that you could do is a queue with a fixed size. You can use preview queue and flush queue to implement the functionality you want. However a specific custom class is probably a better idea.
In regular desktop LabVIEW, fixed-sized arrays would be something you'd have to code as per the answers you've already gotten here. However, in LabVIEW FPGA with, say, cRIO, all arrays must be fixed-size.
When calling the Call Library Function Node to a WINAPI DLL, there are times where a structure element may be officially be defined as BYTE[130]. So how do you absolutely, positively make sure your cluster has exactly the space for 130 bytes?
You can't do it with arrays no matter what, because LabVIEW arrays are pointers to a structure (the first element being the length), meaning any array you insert will only allocate enough space for a pointer, 4 bytes.
The work-around I came up with is to insert a cluster that includes sixteen U64 and one U16, pass that through an unflatten to string and you'll find it's exactly 130 bytes long.
When the cluster returns from the call, merely type cast the flattened into string results into a U8 array

Get struct's size passed as void to function

I'm changing some codes in a database library. The way it works I send a void pointer, to get the size of it I call a query and using the query I calculate the size of the structure. Now the problem is I receive the struct as params but the function fails before/in the middle of the first fetch. After that I need to clear the structure, but I dont even have the size.
I know the best way is send the size of the structure as a param, but I have thousands and thousands programs already compiled, the library is from 1996, so I need to find a way to calculate the structure size even if the type is void.
One idea I had was to calculate the position of the next element that is not in the structure
0x000010 0x000042
[int|char[30]|int|int][int]
So the size is 32, because the 0x00042-0x000010 is 32.
Is there a way to know when I got out of the structure.
the prototype of the function is
int getData(char* fields, void* myStruct)
I need to find out the structure size.
Sorry if I missed some information, the code is HUGE and unfortunately I cannot post it here.
No, in general there's no way, given a void *, to figure out what you're after. The only thing you can do is compare it against NULL, which of course doesn't help here.
Note that there's nothing in the void * that even says it points at a struct, it could just as well be pointing into the middle of an array.
If you have some global means of recording the pointers before they're passed to getData(), you might be able to implement a look-up function that simply compares the pointer value against those previously recorded, but that's just using the pointer value as a key.

Resources