I am learning Go. According to the documentation, slices are richer than arrays.
However, I am failing to grasp hypothetical use cases for slices.
What would be a use case where one would use a slice instead of an array?
Thanks!
This is really pretty elementary and probably should already have been covered in whatever documentation you're reading (unless it's just the language spec), but: A Go array always has a fixed size. If you always need 10 things of type T, [10]T is fine. But what if you need a variable number of things n, where n is determined at runtime?
A Go slice—which consists of two parts, a slice header and an underlying backing array—is pretty ideal for holding information needed to access a variable-sized array. Note that just declaring a slice-header variable:
var x []T
doesn't actually allocate any array of T yet: the slice header will be initialized to hold nil (converted to the right type) as the (missing) backing array, 0 as the current size, and 0 as the capacity of this array. As a result of this, the test x == nil will say that yes, x is nil. To get an actual array, you will need either:
an actual array, or
a call to make, or
use of the built-in append or similar (e.g., copy, append hidden behind some function, etc).
Since the call to make happens at runtime, it can make an array of whatever size is needed at this point. A series of calls to append can build up an array. Note that each call to append may have to allocate a new backing array, or may be able to extend the slice in place within the existing array, depending on whether the current capacity has room. That's why you need x = append(x, elem) or x = append(x, elems...) and not just append(x, elem) or append(x, elems...).
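As a rough sketch of the points above (the function name buildSquares and the sizes are illustrative assumptions, not taken from any particular program), the following shows a nil slice, a make call sized at runtime, and why the result of append is assigned back:

```go
package main

import "fmt"

// buildSquares returns the first n squares. n is only known at runtime,
// which is the case a fixed-size array type such as [10]int cannot handle.
func buildSquares(n int) []int {
	var x []int // nil slice: no backing array yet, len 0, cap 0
	for i := 0; i < n; i++ {
		// append may allocate a new backing array, so keep its result.
		x = append(x, i*i)
	}
	return x
}

func main() {
	var x []int
	fmt.Println(x == nil, len(x), cap(x)) // true 0 0

	n := 5 // stands in for a value determined at runtime
	y := make([]int, n)
	fmt.Println(len(y), cap(y)) // 5 5

	fmt.Println(buildSquares(4)) // [0 1 4 9]
}
```

If the final size is known up front, make([]T, 0, n) followed by append is a common way to avoid repeated reallocation.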
The Go blog entry on slices has a lot more to say on this. I like this page more than the sequence of pages in the Go Tour starting here, but opinions vary.
I have some specially-defined arrays in Julia which you can think of as being just a composition of many arrays. For example:
type CompositeArray{T}
x::Vector{T}
y::Vector{T}
end
with an indexing scheme
getindex(c::CompositeArray,i::Int) = i <= length(c.x) ? c.x[i] : c.y[i-length(c.x)]
I do have one caveat: the higher indexing scheme just goes to x itself:
getindex(c::CompositeArray,i::Int...) = c.x[i...]
Now the iterator through these can easily be made as the chain of the iterator on x and then on y. This makes iterating through the values have almost no extra cost. However, can something similar be done for iteration to setindex!?
I was thinking of having a separate dispatch on CartesianIndex{2} just for indexing x vs y and the index, and building an eachindex iterator for that, similar to what CatViews.jl does. However, I'm not certain how that will interact with the i... dispatch, or whether it will be useful in this case.
In addition, will broadcasting automatically use this fast iteration scheme if it's built on eachindex?
Edits:
length(c::CompositeArray) = length(c.x) + length(c.y)
In the real case, x can be any AbstractArray (and thus has a linear index), but since only the linear indexing is used (except for that one user-facing getindex function), the problem really boils down to finding out how to do this with x a Vector.
Making X[CartesianIndex(2,1)] mean something different from X[2,1] is certainly not going to end well. And I would expect similar troubles from the fact that X[100,1] may mean something different from X[100] or if length(X) != prod(size(X)). You're free to break the rules, but you shouldn't be surprised when functions in Base and other packages expect you to follow them.
The safe way to do this would be to make eachindex(::CompositeArray) return a custom iterator over objects that you control entirely. Maybe just throw a wrapper around and forward methods to CartesianRange and CartesianIndex{2} if that data structure is helpful. Then when you get one of these custom index types, you know that SplitIndex(CartesianIndex(1,2)) is indeed intending to refer to the first element in the second array.
I have several arrays like this (please ignore specific names):
static resource_t coap_cmp_res[MAX_CMPS];
e.g. [cmp1,cmp2,cmp3,cmp4,cmp5,0,0,0]
and code that uses these elements. For example, coap_cmp_res[4] (cmp5) is associated with a REST resource, call it Res5.
At a certain point in time, I delete an element in that array at position x like this:
rest_deactivate_resource(&coap_cmp_res[x]);
e.g. for x = 2
[cmp1,cmp2,0,cmp4,cmp5,0,0,0]
What I then would like to do is have a single continuous array again like this
e.g. [cmp1,cmp2,cmp4,cmp5,0,0,0,0]
What I do currently is:
for(UInt8 i = x; i < MAX_CMPS - 1; i++){ // stop one short so [i+1] stays in bounds
coap_cmp_res[i] = coap_cmp_res[i+1];
}
which gives [cmp1,cmp2,cmp4,cmp5,cmp5,0,0,0]
then I manually set the last non-zero element to 0.
e.g. [cmp1,cmp2,cmp4,cmp5,0,0,0,0]
So, this looks good, but the problem is that the Res5 is still associated with coap_cmp_res[4] and thus now the value 0, instead of cmp5, which is not what I desire.
I could deactivate and reactivate every resource after x in the array to have the associations working again, but was wondering if there was a more efficient way to go about this.
Hopefully this makes sense.
As the proverb says: "add a level of indirection". Keep an array of resource_t* whose slots are stable and whose entries point into coap_cmp_res. Then associate Res5 with one of those pointer slots, and go through the indirection to reach a valid entry.
static resource_t coap_cmp_res_data[MAX_CMPS];
static resource_t* coap_cmp_res_ptrs[MAX_CMPS]; // points into coap_cmp_res_data
When you remove an element, you update the entries in coap_cmp_res_ptrs, without moving them, and shrink coap_cmp_res_data. Any resource will still refer to the same position in coap_cmp_res_ptrs, and the indirection will take it to the current location of the resource.
An alternative approach, which may prove better in your case (you'd have to profile), is to use node-based storage, i.e. a linked list.
I'm reading the documentation and I am constantly shaking my head at some of the design decisions of the language. But the thing that really got me puzzled is how arrays are handled.
I rushed to the playground and tried these out. You can try them too. So the first example:
var a = [1, 2, 3]
var b = a
a[1] = 42
a
b
Here a and b are both [1, 42, 3], which I can accept. Arrays are referenced - OK!
Now see this example:
var c = [1, 2, 3]
var d = c
c.append(42)
c
d
c is [1, 2, 3, 42] BUT d is [1, 2, 3]. That is, d saw the change in the last example but doesn't see it in this one. The documentation says that's because the length changed.
Now, how about this one:
var e = [1, 2, 3]
var f = e
e[0..2] = [4, 5]
e
f
e is [4, 5, 3], which is cool. It's nice to have a multi-index replacement, but f STILL doesn't see the change even though the length has not changed.
So to sum it up, common references to an array see changes if you change 1 element, but if you change multiple elements or append items, a copy is made.
This seems like a very poor design to me. Am I right in thinking this? Is there a reason I don't see why arrays should act like this?
EDIT: Arrays have changed and now have value semantics. Much more sane!
Note that array semantics and syntax were changed in the Xcode beta 3 version (blog post), so the question no longer applies. The following answer applied to beta 2:
It's for performance reasons. Basically, they try to avoid copying arrays as long as they can (and claim "C-like performance"). To quote the language book:
For arrays, copying only takes place when you perform an action that has the potential to modify the length of the array. This includes appending, inserting, or removing items, or using a ranged subscript to replace a range of items in the array.
I agree that this is a bit confusing, but at least there is a clear and simple description of how it works.
That section also includes information on how to make sure an array is uniquely referenced, how to force-copy arrays, and how to check whether two arrays share storage.
From the official documentation of the Swift language:
Note that the array is not copied when you set a new value with subscript syntax, because setting a single value with subscript syntax does not have the potential to change the array’s length. However, if you append a new item to array, you do modify the array’s length. This prompts Swift to create a new copy of the array at the point that you append the new value. Henceforth, a is a separate, independent copy of the array.....
Read the whole section Assignment and Copy Behavior for Arrays in this documentation. You will find that when you do replace a range of items in the array then the array takes a copy of itself for all items.
The behavior has changed with Xcode 6 beta 3. Arrays are no longer reference types and have a copy-on-write mechanism, meaning as soon as you change an array's content from one or the other variable, the array will be copied and only the one copy will be changed.
Old answer:
As others have pointed out, Swift tries to avoid copying arrays if possible, including when changing values for single indexes at a time.
If you want to be sure that an array variable (!) is unique, i.e. not shared with another variable, you can call the unshare method. This copies the array unless it already only has one reference. Of course you can also call the copy method, which will always make a copy, but unshare is preferred to make sure no other variable holds on to the same array.
var a = [1, 2, 3]
var b = a
b.unshare()
a[1] = 42
a // [1, 42, 3]
b // [1, 2, 3]
The behavior is extremely similar to the Array.Resize method in .NET. To understand what's going on, it may be helpful to look at the history of the . token in C, C++, Java, C#, and Swift.
In C, a structure is nothing more than an aggregation of variables. Applying the . to a variable of structure type will access a variable stored within the structure. Pointers to objects do not hold aggregations of variables, but identify them. If one has a pointer which identifies a structure, the -> operator may be used to access a variable stored within the structure identified by the pointer.
In C++, structures and classes not only aggregate variables, but can also attach code to them. Using . to invoke a method on a variable will ask that method to act upon the contents of the variable itself; using -> on a variable which identifies an object will ask that method to act upon the object identified by the variable.
In Java, all custom variable types simply identify objects, and invoking a method upon a variable will tell the method what object is identified by the variable. Variables cannot hold any kind of composite data type directly, nor is there any means by which a method can access a variable upon which it is invoked. These restrictions, although semantically limiting, greatly simplify the runtime, and facilitate bytecode validation; such simplifications reduced the resource overhead of Java at a time when the market was sensitive to such issues, and thus helped it gain traction in the marketplace. They also meant that there was no need for a token equivalent to the . used in C or C++. Although Java could have used -> in the same way as C and C++, the creators opted to use single-character . since it was not needed for any other purpose.
In C# and other .NET languages, variables can either identify objects or hold composite data types directly. When used on a variable of a composite data type, . acts upon the contents of the variable; when used on a variable of reference type, . acts upon the object identified by it. For some kinds of operations, the semantic distinction isn't particularly important, but for others it is. The most problematic situations are those in which a method of a composite data type that would modify the variable upon which it is invoked is invoked on a read-only variable. If an attempt is made to invoke a method on a read-only value or variable, compilers will generally copy the variable, let the method act upon the copy, and then discard the copy. This is generally safe with methods that only read the variable, but not safe with methods that write to it. Unfortunately, .NET does not as yet have any means of indicating which methods can safely be used with such substitution and which can't.
In Swift, methods on aggregates can expressly indicate whether they will modify the variable upon which they are invoked, and the compiler will forbid the use of mutating methods upon read-only variables (rather than having them mutate temporary copies of the variable which will then get discarded). Because of this distinction, using the . token to call methods that modify the variables upon which they are invoked is much safer in Swift than in .NET. Unfortunately, the fact that the same . token is used for that purpose as to act upon an external object identified by a variable means the possibility for confusion remains.
If one had a time machine and went back to the creation of C# and/or Swift, one could retroactively avoid much of the confusion surrounding such issues by having languages use the . and -> tokens in a fashion much closer to the C++ usage. Methods of both aggregates and reference types could use . to act upon the variable upon which they were invoked, and -> to act upon a value (for composites) or the thing identified thereby (for reference types). Neither language is designed that way, however.
In C#, the normal practice for a method to modify a variable upon which it is invoked is to pass the variable as a ref parameter to a method. Thus calling Array.Resize(ref someArray, 23); when someArray identifies an array of 20 elements will cause someArray to identify a new array of 23 elements, without affecting the original array. The use of ref makes clear that the method should be expected to modify the variable upon which it is invoked. In many cases, it's advantageous to be able to modify variables without having to use static methods; Swift addresses that need by using . syntax. The disadvantage is that it loses clarity as to what methods act upon variables and what methods act upon values.
To me this makes more sense if you first replace your constants with variables:
a[i] = 42 // (1)
e[i..j] = [4, 5] // (2)
The first line never needs to change the size of a. In particular, it never needs to do any memory allocation. Regardless of the value of i, this is a lightweight operation. If you imagine that under the hood a is a pointer, it can be a constant pointer.
The second line may be much more complicated. Depending on the values of i and j, you may need to do memory management. If you imagine that e is a pointer that points to the contents of the array, you can no longer assume that it is a constant pointer; you may need to allocate a new block of memory, copy data from the old memory block to the new memory block, and change the pointer.
It seems that the language designers have tried to keep (1) as lightweight as possible. As (2) may involve copying anyway, they have resorted to the solution that it always acts as if you did a copy.
This is complicated, but I am happy that they did not make it even more complicated with e.g. special cases such as "if in (2) i and j are compile-time constants and the compiler can infer that the size of e is not going to change, then we do not copy".
Finally, based on my understanding of the design principles of the Swift language, I think the general rules are these:
Use constants (let) always everywhere by default, and there won't be any major surprises.
Use variables (var) only if it is absolutely necessary, and be very careful in those cases, as there will be surprises [here: strange implicit copies of arrays in some but not all situations].
What I've found is: The array will be a mutable copy of the referenced one if and only if the operation has the potential to change the array's length. In your last example, e[0..2] indexes a range of elements, so the operation has the potential to change the array's length (it might be that duplicates are not allowed), and the array gets copied.
var e = [1, 2, 3]
var f = e
e[0..2] = [4, 5]
e // 4,5,3
f // 1,2,3
var e1 = [1, 2, 3]
var f1 = e1
e1[0] = 4
e1[1] = 5
e1 // - 4,5,3
f1 // - 4,5,3
Delphi's strings and arrays had the exact same "feature". When you looked at the implementation, it made sense.
Each variable is a pointer to dynamic memory. That memory contains a reference count followed by the data in the array. So you can easily change a value in the array without copying the whole array or changing any pointers. If you want to resize the array, you have to allocate more memory. In that case the current variable will point to the newly allocated memory. But you can't easily track down all of the other variables that pointed to the original array, so you leave them alone.
Of course, it wouldn't be hard to make a more consistent implementation. If you wanted all variables to see a resize, do this:
Each variable is a pointer to a container stored in dynamic memory. The container holds exactly two things, a reference count and pointer to the actual array data. The array data is stored in a separate block of dynamic memory. Now there is only one pointer to the array data, so you can easily resize that, and all variables will see the change.
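The difference between the two layouts can be sketched in Go, whose slices happen to behave much like the first layout (minus the reference count): a resize through one variable is not seen through another. The Container type below is a hypothetical stand-in for the second layout, and is not Delphi's or Swift's actual implementation.

```go
package main

import "fmt"

// Container is a hypothetical stand-in for the second layout: every variable
// points at the same Container, and only the Container points at the data.
type Container struct {
	data []int
}

// Append resizes the data through the single shared pointer.
func (c *Container) Append(v int) {
	c.data = append(c.data, v)
}

func main() {
	// First layout: each variable refers to the array data directly.
	a := []int{1, 2, 3}
	b := a
	a[1] = 42        // in-place write: visible through both variables
	a = append(a, 4) // resize: a now refers to newly allocated memory
	fmt.Println(a, b) // [1 42 3 4] [1 42 3]

	// Second layout: both variables share one container.
	c := &Container{data: []int{1, 2, 3}}
	d := c
	c.Append(4)
	fmt.Println(len(c.data), len(d.data)) // 4 4
}
```

The price of the second layout is an extra pointer hop on every element access, which is presumably part of why the simpler layout was chosen.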
A lot of Swift early adopters have complained about these error-prone array semantics, and Chris Lattner has written that the array semantics had been revised to provide full value semantics (Apple Developer link for those who have an account). We will have to wait at least for the next beta to see what this exactly means.
I use .copy() for this.
var a = [1, 2, 3]
var b = a.copy()
a[1] = 42
Did anything change in array behavior in later Swift versions? I just ran your example:
var a = [1, 2, 3]
var b = a
a[1] = 42
a
b
And my results are [1, 42, 3] and [1, 2, 3]
Is there a way to create a fixed-size array in LabVIEW?
I know that I can do some check on the array size, then discard values when the array size becomes greater than a specific value. But I think that is a common problem, so is there some built-in function in LabVIEW to have a fixed-size array?
As far as I know this is impossible, unless they changed something in one of their latest releases but I doubt it: it would probably require a serious rewrite of the core array code.
The closest you can get is writing your own (possibly polymorphic) array class in which you encapsulate an actual array, that you initialize once with a certain size. For the rest your class only exposes methods to get/set by index. No resize etc.
Or, if you are talking about arrays of controls etc. on the front panel, you can probably do this at the UI level by hiding the index control and making sure the array cannot be resized graphically. It is probably also doable to create a custom control and strip lots of array functionality from it.
If the array size is fixed at design time, then you might consider using a cluster instead. There is even a primitive to convert an array to a cluster of fixed size, provided the length is less than 257. (Array To Cluster function.)
There is also a primitive to go the other way if you need to index the array.
One implementation that you could do is a queue with a fixed size. You can use preview queue and flush queue to implement the functionality you want. However a specific custom class is probably a better idea.
In regular desktop LabVIEW, fixed-sized arrays would be something you'd have to code as per the answers you've already gotten here. However, in LabVIEW FPGA with, say, cRIO, all arrays must be fixed-size.
When calling the Call Library Function Node to a WINAPI DLL, there are times where a structure element may officially be defined as BYTE[130]. So how do you absolutely, positively make sure your cluster has exactly the space for 130 bytes?
You can't do it with arrays no matter what, because LabVIEW arrays are pointers to a structure (the first element being the length), meaning any array you insert will only allocate enough space for a pointer, 4 bytes.
The work-around I came up with is to insert a cluster that includes sixteen U64 and one U16 (16 × 8 + 2 = 130 bytes); pass that through a flatten to string and you'll find it's exactly 130 bytes long.
When the cluster returns from the call, merely type cast the flattened string result into a U8 array.