How do I resize an array in Scala - database

I am trying to create a DB management tool in Scala, and I want to be able to draw from this database into Arrays, whose size can shift based on the data being passed to them. I know how to do this in C, PHP, VB, etc. but can't seem to figure out the syntax for Scala.
I'm sure this should be a simple problem, so any help would be appreciated.

Collections by default in Scala tend to be immutable. Operations will create new immutable collections from existing collections (by adding/removing elements etc.). The benefit of this is that collections don't change under iteration and writing multi-threaded applications tends to be easier (lots of caveats/assumptions with how you write standard Java apply here!).
Having said all that, if you need a mutable array, have you looked at ArrayBuffer (a mutable collection with an underlying array implementation)?
e.g.
val a = new scala.collection.mutable.ArrayBuffer[String]()
a += "A"
a += "B"
a(1) // gives you 'B'

You could use System.arraycopy (or Scala's Array.copy) for this task if you really want to use an array, or you could directly use a container that resizes itself automatically, such as ListBuffer or java.util.ArrayList.
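As a minimal sketch of the copy-based approach, assuming an Array[Int] and a hypothetical helper named resized (not part of any library), a "resize" amounts to allocating a new array and copying the old elements over with Array.copy:

```scala
object ResizeDemo {
  // Hypothetical helper: returns a NEW array of length newSize containing the
  // first min(old.length, newSize) elements of old. Extra slots keep the
  // default value 0. The original array is left untouched.
  def resized(old: Array[Int], newSize: Int): Array[Int] = {
    val fresh = new Array[Int](newSize)
    Array.copy(old, 0, fresh, 0, math.min(old.length, newSize))
    fresh
  }

  def main(args: Array[String]): Unit = {
    val grown = resized(Array(1, 2, 3), 5)
    println(grown.mkString(", ")) // 1, 2, 3, 0, 0
  }
}
```

Each call copies every retained element, so doing this per insertion is expensive; a self-resizing buffer amortises that cost.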

Related

Updating string array using java stream

I know that for objects, we can forEach over the collection and update each object as we like, but for immutable objects like Strings, how can we update the array with new objects without converting the result into an array again?
For example, I have an array of strings. I want to iterate through each string and trim it. I would otherwise have to do something like this:
Arrays.stream(str).map(c -> c.trim()).collect(Collectors.toList())
In the end, I will get a List rather than the String[] that I initially gave. It's a whole lot of processing. Is there any way I can do something similar to:
for(int i = 0; i < str.length; i++) {
str[i] = str[i].trim();
}
using java streams?
Streams are not intended for manipulating other data structures, especially not for updating their source. But the Java API consists of more than the Stream API.
As Alexis C. has shown in a comment, you could use Arrays.setAll(arr, i -> arr[i].trim());
There’s even a parallelSetAll that you could use when you have a really large array.
However, it might be easier to use just Arrays.asList(arr).replaceAll(String::trim);.
Keep in mind that the wrapper returned by Arrays.asList allows modifications of the wrapped array through the List interface. Only adding and removing is not supported.
Use toArray :
str = Arrays.stream(str).map(c -> c.trim()).toArray(String[]::new);
The disadvantage here (over your original Java 7 loop) is that a new array is created to store the result.
To update the original array, you can rewrite your loop with Streams, though I'm not sure what the point is:
IntStream.range(0, str.length).forEach(i -> { str[i] = str[i].trim(); });
It's not as much processing as you might think. The array has a known size and the spliterator obtained from it will be SIZED, so the size of the resulting collection is known before processing and the space for it can be allocated ahead of time, without having to resize the collection.
It's also always interesting that in the absence of actual tests we almost always assume that this is slow or memory hungry.
Of course, if you want an array as the result, there is a method for that:
.toArray(String[]::new);

(De)serializing an object as an array in XStream

I'm trying to clean up some old code by replacing some arrays that were being passed around with proper objects to improve readability and to encapsulate some behaviour. I ran into a problem when it turned out the arrays were being run through XStream for persistence.
I need to retain the format of the serialization, and the arrays in question are inside various other objects being (de)serialized through XStream. Is there an easy way to handle this?
I'm hoping there's an Annotation I can apply or a simple XStream Converter I can write for my new classes and be done with it, but from what I can see it would require writing Converters for each of the containing classes instead. I'm not sure, as I'm not familiar with XStream. If there isn't an easy solution I'm just going to have to give up and leave the arrays in place, as I don't have the time budgeted for anything fancy or to learn the finer points of XStream.
Specifically, I have a TileLayer that has a member int[] metaTileFactors and I want to replace that with class MetaTiling which has members final int x and final int y and still have it serialize and deserialize to/from the same XML as before.

Array vs ArraySeq comparison

This is a bit of a general question but I was wondering if anybody could advise me on what would be advantages of working with Array vs ArraySeq. From what I have seen Array is scala's representation of java Array and there are not too many members in its API whereas ArraySeq seems to contain a much richer API.
There are actually four different classes you could choose from to get mutable array-like functionality.
Array + ArrayOps
WrappedArray
ArraySeq
ArrayBuffer
Array is a plain old Java array. It is by far the best way to go for low-level access to arrays of primitives. There's no overhead. Also it can act like the Scala collections thanks to implicit conversion to ArrayOps, which grabs the underlying array, applies the appropriate method, and, if appropriate, returns a new array. But since ArrayOps is not specialized for primitives, it's slow (as slow as boxing/unboxing always is).
WrappedArray is a plain old Java array, but wrapped in all of Scala's collection goodies. The difference between it and ArrayOps is that WrappedArray returns another WrappedArray--so at least you don't have the overhead of having to re-ArrayOps your Java primitive array over and over again for each operation. It's good to use when you are doing a lot of interop with Java and you need to pass in plain old Java arrays, but on the Scala side you need to manipulate them conveniently.
ArraySeq stores its data in a plain old Java array, but it no longer stores arrays of primitives; everything is an array of objects. This means that primitives get boxed on the way in. That's actually convenient if you want to use the primitives many times; since you've got boxed copies stored, you only have to unbox them, not box and unbox them on every generic operation.
ArrayBuffer acts like an array, but you can add and remove elements from it. If you're going to go all the way to ArraySeq, why not have the added flexibility of changing length while you're at it?
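To make the difference between the two ends of that list concrete, here is a small sketch. The object and method names are made up for illustration; only Array, ArrayBuffer, map, remove and += come from the standard library.

```scala
import scala.collection.mutable.ArrayBuffer

object ArrayKindsDemo {
  // Array has a fixed length. Collection methods such as map come from the
  // implicit ArrayOps wrapper and return a new Array.
  def doubled(xs: Array[Int]): Array[Int] = xs.map(_ * 2)

  // ArrayBuffer can change length after creation.
  // Assumes xs is non-empty; remove(0) throws on an empty buffer.
  def dropFirstAppend(xs: ArrayBuffer[Int], v: Int): ArrayBuffer[Int] = {
    xs.remove(0) // removes the first element in place
    xs += v      // appends in place
    xs
  }
}
```

For example, ArrayKindsDemo.dropFirstAppend(ArrayBuffer(1, 2, 3), 4) leaves the buffer holding 2, 3, 4, while doubled never modifies its argument.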
From the scala-lang.org forum:
Array[T] - Benefits: Native, fast - Limitations: Few methods (only apply, update, length), need to know T at compile-time, because Java bytecode represents them differently (char[] different from int[] different from Object[])
ArraySeq[T] (the class formerly known as GenericArray[T]): - Benefits: Still backed by a native Array, don't need to know anything about T at compile-time (new ArraySeq[T] "just works", even if nothing is known about T), full suite of SeqLike methods, subtype of Seq[T] - Limitations: It's backed by an Array[AnyRef], regardless of what T is (if T is primitive, then elements will be boxed/unboxed on their way in or out of the backing Array)
ArraySeq[Any] is much faster than Array[Any] when handling primitives. In any code you have Array[T], where T isn't <: AnyRef, you'll get faster performance out of ArraySeq.
Array is a direct representation of Java's Array, and uses the exact same bytecode on the JVM.
The advantage of Array is that it's the only collection type on the JVM that does not undergo type erasure. Arrays are also able to hold primitives directly without boxing, which can make them very fast under some circumstances.
Plus, you get Java's messed up array covariance behaviour. (If you pass e.g. an Array[String] to some Java class, it can be assigned to a variable of type Object[], which will then throw an ArrayStoreException on trying to store anything that isn't a String.)
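The covariance problem can be reproduced from Scala as well, as a rough sketch. Scala's own Array type is invariant, so an explicit cast is assumed here to stand in for the assignment Java would allow implicitly; CovarianceDemo and storeFails are illustrative names.

```scala
object CovarianceDemo {
  // Returns true if storing a non-String into a String[] that is being
  // viewed as Object[] is rejected at runtime.
  def storeFails(): Boolean = {
    val strings: Array[String] = Array("a", "b")
    // At runtime this is still a String[]; the cast only changes the static type.
    val objects = strings.asInstanceOf[Array[AnyRef]]
    try {
      objects(0) = Integer.valueOf(42) // the JVM checks the element type on store
      false
    } catch {
      case _: ArrayStoreException => true
    }
  }
}
```

The store compiles but fails on the JVM's runtime check, which is the failure mode that generic collections avoid by being checked at compile time.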
ArraySeq is rarely used nowadays; it's more of a historic artifact from older versions of Scala that treated arrays differently. Seeing as you have to deal with boxing anyway, you're almost certain to find that another collection type is a better fit for your requirements.
Otherwise... Arrays have exactly the same API as ArraySeq, thanks to an implicit conversion from Array to ArrayOps.
Unless you have a specific need for the unique properties of arrays, try to avoid them too.
See This Talk at around 19:30 or This Article for an idea of the sort of problems that Arrays can introduce.
After watching that video, it's interesting to note that Scala uses Seq for varargs :)
As you observed correctly, ArraySeq has a richer API as it is derived from IndexedSeq (and so on) whereas Array is a direct representation of Java arrays.
The relation between the both could be roughly compared to the relation of the ArrayList and arrays in Java.
Due to its API, I would recommend using ArraySeq unless there is a specific reason not to do so. Using toArray, you can convert to an Array at any time.

Why no immutable arrays in scala standard library?

Scala has all sorts of immutable sequences like List, Vector, etc. I have been surprised to find no implementation of an immutable indexed sequence backed by a simple array (Vector seems way too complicated for my needs).
Is there a design reason for this? I could not find a good explanation on the mailing list.
Do you have a recommendation for an immutable indexed sequence that has close to the same performance as an array? I am considering scalaz's ImmutableArray, but it has some issues with scala trunk, for example.
Thank you
You could cast your array into a sequence.
val s: Seq[Int] = Array(1,2,3,4)
The array will be implicitly converted to a WrappedArray. And as the type is Seq, update operations will no longer be available.
So, let's first make a distinction between interface and class. The interface is an API design, while the class is the implementation of such API.
The interfaces in Scala have the same name and different package to distinguish with regards to immutability: Seq, immutable.Seq, mutable.Seq.
The classes, on the other hand, usually don't share a name. A List is an immutable sequence, while a ListBuffer is a mutable sequence. There are exceptions, like HashSet, but that's just a coincidence with regards to implementation.
Now, an Array is not part of Scala's collections, being a Java class, but its wrapper WrappedArray shows clearly where it would show up: as a mutable class.
The interface implemented by WrappedArray is IndexedSeq, which exists as both mutable and immutable traits.
The immutable.IndexedSeq has a few implementing classes, including the WrappedString. The general use class implementing it, however, is the Vector. That class occupies the same position an Array class would occupy in the mutable side.
Now, there's no more complexity in using a Vector than using an Array, so I don't know why you call it complicated.
Perhaps you think it does too much internally, in which case you'd be wrong. All well designed immutable classes are persistent, because using an immutable collection means creating new copies of it, so they have to be optimized for that, which is exactly what Vector does.
Mostly because there are no arrays whatsoever in Scala. What you're seeing is java's arrays pimped with a few methods that help them fit into the collection API.
Anything else wouldn't be an array, with its unique property of not suffering type erasure, or the broken variance. It would just be another type with indexes and values. Scala does have that: it's called IndexedSeq, and if you need to pass it as an array to some 3rd party API then you can just use .toArray.
Scala 2.13 has added ArraySeq, which is an immutable sequence backed by an array.
Scala 3 now has IArray, an Immutable Array.
It is implemented as an Opaque Type Alias, with no runtime overhead.
The point of the scala Array class is to provide a mechanism to access the abilities of Java arrays (but without Java's awful design decision of allowing arrays to be covariant within its type system). Java arrays are mutable, hence so are those in the scala standard library.
Suppose there were also another class immutable.Array in the library but that the compiler were also to use a Java array as the underlying structure (for efficiency/speed). The following code would then compile and run:
val i = immutable.Array("Hello")
i.asInstanceOf[Array[String]](0) = "Goodbye"
println( i(0) ) //I thought i was immutable :-(
That is, the array would really be mutable.
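The same leak can be observed with real library types, as a sketch assuming Scala 2.13 or later, where immutable.ArraySeq exists. ArraySeq.unsafeWrapArray shares the backing array, whereas toIndexedSeq copies it; WrapDemo and demo are illustrative names.

```scala
import scala.collection.immutable.ArraySeq

object WrapDemo {
  // Returns the first element seen through (a) a wrapper that shares storage
  // and (b) a defensive copy, after the backing array has been mutated.
  def demo(): (Int, Int) = {
    val backing = Array(1, 2, 3)
    val wrapped: ArraySeq[Int] = ArraySeq.unsafeWrapArray(backing) // no copy
    val copied: IndexedSeq[Int] = backing.toIndexedSeq             // copies
    backing(0) = 99
    (wrapped(0), copied(0)) // (99, 1)
  }
}
```

The "unsafe" in the method name refers to exactly this: the immutable view is only as immutable as the discipline of whoever still holds the original array.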
The problem with Arrays is that they have a fixed size. There is no operation to add an element to an array, or remove one from it.
You can keep an array that you guess will be long enough as a backing store, "wasting" the memory you're not using, keep track of the last used index, and copy to a larger array if you need the extra space. That copying is O(N) obviously.
Changing a single element is also O(N) as you will need to copy over the entire array. There is no structural sharing, which is the lynchpin of performant functional datastructures.
You could also allocate an extra array for the "overflowing" elements, and somehow keep track of your arrays. At that point you're on your way to reinventing Vector.
In short, due to their unsuitability for structural sharing, immutable facades for arrays have terrible runtime performance characteristics for most common operations like adding an element, removing an element, and changing an element.
That only leaves the use-case of a fixed-size, fixed-content data carrier, and that use-case is relatively rare. Most uses are better served by List, Stream or Vector.
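A rough sketch of the cost difference described above, with illustrative names (UpdateDemo, updatedCopy, updatedVector): a non-destructive update on an array has to clone every element, while Vector.updated returns a new Vector that reuses most of the old one internally.

```scala
object UpdateDemo {
  // Non-destructive update on an array: copies all n elements, O(n).
  def updatedCopy(xs: Array[Int], i: Int, v: Int): Array[Int] = {
    val copy = xs.clone()
    copy(i) = v
    copy
  }

  // Non-destructive update on a Vector: only the path to the changed
  // element is rebuilt; the rest of the structure is shared.
  def updatedVector(xs: Vector[Int], i: Int, v: Int): Vector[Int] =
    xs.updated(i, v)
}
```

In both cases the original value is left unchanged; the difference is how much memory and copying each new version costs.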
You can simply use Array[T].toIndexedSeq to convert an Array[T] to an ArraySeq[T], which is of type immutable.IndexedSeq[T].
(after Scala 2.13.0)
scala> val array = Array(0, 1, 2)
array: Array[Int] = Array(0, 1, 2)
scala> array.toIndexedSeq
res0: IndexedSeq[Int] = ArraySeq(0, 1, 2)

Sort ArrayBuffer[A] in scala?

I have an array in Scala with the class ArrayBuffer[Actor], where Actor is a class that implements the Ordered[Actor] trait. How do I sort this array without coding it manually?
I know there is an object called Sorting, but it doesn't seem to work since ArrayBuffer doesn't implement/extend the right classes.
How do I sort ArrayBuffer[A] type arrays?
If you are using Scala 2.8, you could use the sortWith method of the ArrayBuffer[T] class, which is inherited from the SeqLike trait.
The following code snippet sorts an ArrayBuffer[T] object in ascending order:
def ascendingSort[T <% Ordered[T]](xs: ArrayBuffer[T]) = xs.sortWith(_ < _)
Note that this does not mutate the actual ArrayBuffer, but creates a new one with the elements in the right order.
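As a sketch of both options, assuming Scala 2.13 or later for the in-place variant (sortInPlaceWith is not available in 2.8). The Actor class below is a hypothetical stand-in for the question's class, ordered by a made-up priority field.

```scala
import scala.collection.mutable.ArrayBuffer

object SortDemo {
  // Hypothetical stand-in for the question's Actor; the fields are assumptions.
  final case class Actor(name: String, priority: Int) extends Ordered[Actor] {
    def compare(that: Actor): Int = Integer.compare(this.priority, that.priority)
  }

  // Returns a new, sorted buffer; xs itself is unchanged.
  def sortedCopy(xs: ArrayBuffer[Actor]): ArrayBuffer[Actor] =
    xs.sortWith(_ < _)

  // Sorts xs itself (Scala 2.13+).
  def sortInPlace(xs: ArrayBuffer[Actor]): Unit = {
    xs.sortInPlaceWith(_ < _)
    ()
  }
}
```

The `<` used in both methods comes from the Ordered trait, so no separate Ordering needs to be supplied.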
If you are using Scala 2.7, you could use the stableSort method of the Sorting object. This takes the elements of the ArrayBuffer and produces an array of elements sorted in the right order (given by a closure as an argument, ascending as a default).
For example:
val a = new scala.collection.mutable.ArrayBuffer[Int]()
a += 5
a += 2
a += 3
scala.util.Sorting.stableSort(a)
The important question is what you want to do with the ArrayBuffer. Usually, a Buffer is used internally in algorithms to build up intermediate results efficiently. If you are using it for that, have a look at the ways of sorting the collection you want to return at the end of your algorithm. The Sorting object already provides a way of transforming an ArrayBuffer into a sorted Array.
From the scaladoc of the Buffer class:
Buffers are used to create sequences of elements incrementally
As you are using it with Actors, it might be used for some kind of actor queue - in which case, you might want to have a look at the Queue collection.
Hope it helps,
-- Flaviu Cipcigan
Btw, the Actor class here is my own class used for "Actors" in a world created using my new game engine for scala ("Awesome Game Engine for Scala ~ AGES"), so it has nothing to do with the concurrency actor class. Also, implementations of lists in scala are a jungle, everything is either deprecated or implemented in a lot of different ways...ArrayBuffer works for my need (my need being a variable size array for containing actors).
Hope this clarifies :)

Resources