Array.isDefinedAt for n-dimensional arrays in scala

Is there an elegant way to express
val a = Array.fill(2,10) {1}
def do_to_elt(i: Int, j: Int) {
  if (a.isDefinedAt(i) && a(i).isDefinedAt(j)) f(a(i)(j))
}
in scala?

I recommend that you not use arrays of arrays for 2D arrays, for three main reasons. First, it allows inconsistency: not all columns (or rows, take your pick) need to be the same size. Second, it is inefficient--you have to follow two pointers instead of one. Third, very few library functions exist that work transparently and usefully on arrays of arrays as 2D arrays.
Given these things, you should either use a library that supports 2D arrays, like scalala, or you should write your own. If you do the latter, among other things, this problem magically goes away.
So in terms of elegance: no, there isn't a way. But beyond that, the path you're starting on contains lots of inelegance; you would probably do best to step off of it quickly.
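By way of illustration, here is a minimal sketch of the "write your own" route (Array2D and doToElt are made-up names, not from any library); bounds checking lives in one place and the nested isDefinedAt calls disappear:
class Array2D[T: scala.reflect.ClassTag](val rows: Int, val cols: Int)(init: => T) {
  private val data = Array.fill(rows * cols)(init)
  def apply(i: Int, j: Int): T = data(i * cols + j)
  def update(i: Int, j: Int, v: T): Unit = data(i * cols + j) = v
  def doToElt[A](i: Int, j: Int)(f: T => A): Unit =
    if (i >= 0 && i < rows && j >= 0 && j < cols) f(apply(i, j))
}
// usage, mirroring the question:
// val a = new Array2D(2, 10)(1)
// a.doToElt(0, 3)(f)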

You just need to check with isDefinedAt whether the array at index i exists before indexing into it:
def do_to_elt(i:Int, j:Int): Unit =
  if (a.isDefinedAt(i) && a(i).isDefinedAt(j)) f(a(i)(j))
EDIT: Missed that part about the elegant solution as I focused on the error in the code before your edit.
Concerning elegance: no, per se there is no more elegant way to express it. Some might tell you to use the pimp-my-library pattern to make it look more elegant, but in this case it really does not help.
If your only use case is to execute a function on an element of a multidimensional array when the indices are valid, then this code does that and you should use it. You could generalize the method by changing its signature to take the function to apply to the element and a fallback value for when the indices are invalid, like this:
def do_to_elt[A](i: Int, j: Int)(f: Int => A, g: => A): A =
  if (a.isDefinedAt(i) && a(i).isDefinedAt(j)) f(a(i)(j)) else g
but I would not change anything beyond this. This also does not look more elegant but widens your use case.
(Also: If you are working with arrays you mostly do that for performance reasons and in that case it might even be better to not use isDefinedAt but perform validity checks based on the length of the arrays.)
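For completeness, a minimal sketch of that length-based variant of do_to_elt, assuming the a and f from the question are in scope:
def do_to_elt(i: Int, j: Int): Unit =
  if (i >= 0 && i < a.length && j >= 0 && j < a(i).length) f(a(i)(j))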

Related

When to use slice instead of an array in GO

I am learning Go. According to the documentation, slices are richer than arrays.
However, I am failing to grasp hypothetical use cases for slices.
What would be a use case where one would use a slice instead of an array?
Thanks!
This is really pretty elementary and probably should already have been covered in whatever documentation you're reading (unless it's just the language spec), but: A Go array always has a fixed size. If you always need 10 things of type T, [10]T is fine. But what if you need a variable number of things n, where n is determined at runtime?
A Go slice—which consists of two parts, a slice header and an underlying backing array—is pretty ideal for holding information needed to access a variable-sized array. Note that just declaring a slice-header variable:
var x []T
doesn't actually allocate any array of T yet: the slice header will be initialized to hold nil (converted to the right type) as the (missing) backing array, 0 as the current size, and 0 as the capacity of this array. As a result of this, the test x == nil will say that yes, x is nil. To get an actual array, you will need either:
an actual array, or
a call to make, or
use of the built-in append or similar (e.g., copy, append hidden behind some function, etc).
Since the call to make happens at runtime, it can make an array of whatever size is needed at that point. A series of calls to append can build up an array. Note that each call to append may have to allocate a new backing array, or may be able to extend the existing array in place, depending on the remaining capacity. That's why you need x = append(x, elem) or x = append(x, elems...) and not just append(x, elem) or append(x, elems...).
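As a small, self-contained sketch of those points (nil slice, make with a capacity, and reassigning the result of append):
package main
import "fmt"
func main() {
	var x []int                 // nil slice: no backing array, len 0, cap 0
	fmt.Println(x == nil)       // true
	x = make([]int, 0, 4)       // allocate a backing array with capacity 4
	for i := 0; i < 10; i++ {
		x = append(x, i)        // may reallocate once len would exceed cap, so keep the result
	}
	fmt.Println(len(x), cap(x)) // 10 and some capacity >= 10; the growth policy is implementation-defined
}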
The Go blog entry on slices has a lot more to say on this. I like this page more than the sequence of pages in the Go Tour starting here, but opinions vary.

Iterating for `setindex!`

I have some specially-defined arrays in Julia which you can think of being just a composition of many arrays. For example:
type CompositeArray{T}
    x::Vector{T}
    y::Vector{T}
end
with an indexing scheme
getindex(c::CompositeArray,i::Int) = i <= length(c.x) ? c.x[i] : c.y[i-length(c.x)]
I do have one caveat: the higher indexing scheme just goes to x itself:
getindex(c::CompositeArray,i::Int...) = c.x[i...]
Now the iterator through these can easily be made as the chain of the iterator on x and then on y. This makes iterating through the values have almost no extra cost. However, can something similar be done for iteration with setindex!?
I was thinking of having a separate dispatch on CartesianIndex{2} just for indexing x vs y and the index, and building an eachindex iterator for that, similar to what CatViews.jl does. However, I'm not certain how that will interact with the i... dispatch, or whether it will be useful in this case.
In addition, will broadcasting automatically use this fast iteration scheme if it's built on eachindex?
Edits:
length(c::CompositeArray) = length(c.x) + length(c.y)
In the real case, x can be any AbstractArray (and thus has a linear index), but since only the linear indexing is used (except for that one user-facing getindex function), the problem really boils down to finding out how to do this with x a Vector.
Making X[CartesianIndex(2,1)] mean something different from X[2,1] is certainly not going to end well. And I would expect similar troubles from the fact that X[100,1] may mean something different from X[100] or if length(X) != prod(size(X)). You're free to break the rules, but you shouldn't be surprised when functions in Base and other packages expect you to follow them.
The safe way to do this would be to make eachindex(::CompositeArray) return a custom iterator over objects that you control entirely. Maybe just throw a wrapper around and forward methods to CartesianRange and CartesianIndex{2} if that data structure is helpful. Then when you get one of these custom index types, you know that SplitIndex(CartesianIndex(1,2)) is indeed intending to refer to the first element in the second array.
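A rough sketch of that idea (using modern struct syntax; SplitIndex, its layout, and the eachindex definition are illustrative, not taken from CatViews.jl):
struct SplitIndex
    I::CartesianIndex{2}  # I[1]: position within the component, I[2]: which component (1 => x, 2 => y)
end
component(c::CompositeArray, s::SplitIndex) = s.I[2] == 1 ? c.x : c.y
Base.getindex(c::CompositeArray, s::SplitIndex) = component(c, s)[s.I[1]]
Base.setindex!(c::CompositeArray, v, s::SplitIndex) = component(c, s)[s.I[1]] = v
Base.eachindex(c::CompositeArray) =
    (SplitIndex(CartesianIndex(i, k)) for k in 1:2 for i in 1:length(k == 1 ? c.x : c.y))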

matlab: structural data and multi-level indexing

I have a simple problem with structures.
Let's create:
x(1).a(:, :) = magic(2);
x(2).a(:, :) = magic(2)*2;
x(3).a(:, :) = magic(2)*3;
How do I list a(1, 1) from all x-es?
I wanted to do it like:
x(1, :).a(1,1)
but there is an error "Scalar index required for this type of multi-level indexing."
How to approach it? I know I can do it with a loop, but that's probably the worst solution :)
Thanks!
This is not the best datastructure to use if this is the sort of query you'd like to make on it, precisely because this sort of indexing cannot be done directly.
However, here is one approach that works:
cellfun(@(X) X(1,1), {x.a})
The syntax {x.a} converts x from a 'struct array' into a cell array. Then we use cellfun to apply a function as a map over the cell array. The anonymous function @(X) X(1,1) takes one argument X and returns X(1,1).
You can also get your data in this way:
B = cat(3,x.a);
out = reshape(B(1,1,:),1,[]);
By the way, loops are not evil. Sometimes they are even faster than vectorized indexing. Try it both ways and see what suits you best in terms of:
Speed - use the profiler to check
Code clarity - depends on the context. Sometimes vectorized code looks better, sometimes the opposite.

Haskell map/sortBy/findIndex etc. for Arrays instead of Lists

I can see that it's possible to write functions like map/sortBy/findIndex and some other List-related functions for Arrays instead (at least for those indexed by integers). Is this done anywhere in the standard library, or would I need to roll my own?
I need to use an array in my program for the in-place update, but there are also several locations I'd like to use some of the above list functions on it. Is converting back and forth between the two the best solution?
(The arrays I've been looking at are from Data.Array.IArray. I'm also happy to use any other array library that implements this functionality.)
I recommend you have a look at the vector and vector-algorithms packages. They contain very efficient implementations of many common operations on Int-indexed arrays, in both mutable and immutable variants.
fmap (from Control.Monad) is sort of like a generic version of map that works on anything that supports the Functor type class. Array supports that, so you should be able to use fmap instead of map for arrays.
As hammar says, the vector and vector-algorithms are probably a better way to approach the problem if you need to consider indexed arrays.
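For a concrete, if simplified, sketch with Data.Array: fmap maps over the elements directly, while elems and listArray give the list round-trip mentioned above for things like sortBy (the function names below are just examples):
import Data.Array (Array, bounds, elems, listArray)
import Data.List (sortBy)
import Data.Ord (comparing)
doubleAll :: Array Int Int -> Array Int Int
doubleAll = fmap (* 2)                      -- no list conversion needed
sortArray :: Array Int Int -> Array Int Int -- convert to a list, sort, rebuild
sortArray a = listArray (bounds a) (sortBy (comparing id) (elems a))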

Why no immutable arrays in scala standard library?

Scala has all sorts of immutable sequences like List, Vector, etc. I have been surprised to find no implementation of an immutable indexed sequence backed by a simple array (Vector seems way too complicated for my needs).
Is there a design reason for this? I could not find a good explanation on the mailing list.
Do you have a recommendation for an immutable indexed sequence that has close to the same performances as an array? I am considering scalaz's ImmutableArray, but it has some issues with scala trunk for example.
Thank you
You could cast your array into a sequence.
val s: Seq[Int] = Array(1,2,3,4)
The array will be implicitly converted to a WrappedArray. And as the type is Seq, update operations will no longer be available.
So, let's first make a distinction between interface and class. The interface is an API design, while the class is the implementation of such an API.
The interfaces in Scala have the same name but different packages to distinguish them with regard to immutability: Seq, immutable.Seq, mutable.Seq.
The classes, on the other hand, usually don't share a name. A List is an immutable sequence, while a ListBuffer is a mutable sequence. There are exceptions, like HashSet, but that's just a coincidence with regards to implementation.
Now, an Array is not part of Scala's collection library, being a Java class, but its wrapper WrappedArray shows clearly where it would show up: as a mutable class.
The interface implemented by WrappedArray is IndexedSeq, which exists as both a mutable and an immutable trait.
The immutable.IndexedSeq has a few implementing classes, including WrappedString. The general-use class implementing it, however, is Vector. That class occupies the same position an Array class would occupy on the mutable side.
Now, there's no more complexity in using a Vector than using an Array, so I don't know why you call it complicated.
Perhaps you think it does too much internally, in which case you'd be wrong. All well designed immutable classes are persistent, because using an immutable collection means creating new copies of it, so they have to be optimized for that, which is exactly what Vector does.
Mostly because there are no arrays whatsoever in Scala. What you're seeing is Java's arrays pimped with a few methods that help them fit into the collection API.
Anything else wouldn't be an array, with its unique property of not suffering type erasure, or its broken variance. It would just be another type with indexes and values. Scala does have that: it's called IndexedSeq, and if you need to pass it as an array to some 3rd-party API then you can just use .toArray.
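For example, a trivial sketch of that workflow (names are illustrative):
val xs: IndexedSeq[Int] = Vector(1, 2, 3, 4)
val forJavaApi: Array[Int] = xs.toArray // copy out only at the boundary that actually needs an Array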
Scala 2.13 has added ArraySeq, which is an immutable sequence backed by an array.
Scala 3 now has IArray, an Immutable Array.
It is implemented as an Opaque Type Alias, with no runtime overhead.
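A quick sketch of both (the commented-out part compiles only on Scala 3):
import scala.collection.immutable.ArraySeq
val xs: ArraySeq[Int] = ArraySeq(1, 2, 3)          // immutable, backed by a plain array
val ys = ArraySeq.unsafeWrapArray(Array(1, 2, 3))  // wrap an existing array without copying (do not mutate it afterwards)
// Scala 3 only:
// val zs: IArray[Int] = IArray(1, 2, 3)           // opaque alias over Array, no runtime wrapper
// val first = zs(0)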
The point of the scala Array class is to provide a mechanism to access the abilities of Java arrays (but without Java's awful design decision of allowing arrays to be covariant within its type system). Java arrays are mutable, hence so are those in the scala standard library.
Suppose there were also another class immutable.Array in the library but that the compiler were also to use a Java array as the underlying structure (for efficiency/speed). The following code would then compile and run:
val i = immutable.Array("Hello")
i.asInstanceOf[Array[String]](0) = "Goodbye"
println( i(0) ) //I thought i was immutable :-(
That is, the array would really be mutable.
The problem with Arrays is that they have a fixed size. There is no operation to add an element to an array, or remove one from it.
You can keep an array that you guess will be long enough as a backing store, "wasting" the memory you're not using, keep track of the last used index, and copy to a larger array if you need the extra space. That copying is O(N) obviously.
Changing a single element is also O(N) as you will need to copy over the entire array. There is no structural sharing, which is the lynchpin of performant functional datastructures.
You could also allocate an extra array for the "overflowing" elements, and somehow keep track of your arrays. At that point you're on your way of re-inventing Vector.
In short, due to their unsuitability for structural sharing, immutable facades for arrays have terrible runtime performance characteristics for most common operations, like adding an element, removing an element, or changing an element.
That only leaves the use case of a fixed-size, fixed-content data carrier, and that use case is relatively rare. Most uses are better served by List, Stream or Vector.
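To make the single-element-update point concrete, a rough comparison using 2.13's ArraySeq as the array-backed immutable sequence (sizes are arbitrary):
import scala.collection.immutable.ArraySeq
val arr = ArraySeq.tabulate(1000000)(identity)
val arr2 = arr.updated(0, -1) // copies the entire million-element backing array: O(n)
val vec = Vector.tabulate(1000000)(identity)
val vec2 = vec.updated(0, -1) // copies only a few small tree nodes: effectively O(log n)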
You can simply use Array[T].toIndexedSeq to convert an Array[T] to an ArraySeq[T], which is of type immutable.IndexedSeq[T].
(after Scala 2.13.0)
scala> val array = Array(0, 1, 2)
array: Array[Int] = Array(0, 1, 2)
scala> array.toIndexedSeq
res0: IndexedSeq[Int] = ArraySeq(0, 1, 2)
