Updating string array using java stream - arrays

I know that for mutable objects we can forEach over the collection and update each object as we like, but for immutable objects like Strings, how can we update the array with new objects without converting the stream back into an array?
For example, I have an array of strings. I want to iterate over the strings and trim each one. Otherwise I would have to do something like this:
Arrays.stream(str).map(c -> c.trim()).collect(Collectors.toList())
In the end, I will get a List rather than the String[] that I initially gave. It's a whole lot of processing. Is there any way I can do something similar to:
for(int i = 0; i < str.length; i++) {
str[i] = str[i].trim();
}
using java streams?

Streams are not intended for manipulating other data structures, especially not for updating their source. But the Java API consists of more than the Stream API.
As Alexis C. has shown in a comment, you could use Arrays.setAll(arr, i -> arr[i].trim());
There’s even a parallelSetAll that you could use when you have a really large array.
However, it might be easier to use just Arrays.asList(arr).replaceAll(String::trim);.
Keep in mind that the wrapper returned by Arrays.asList allows modifications of the wrapped array through the List interface. Only adding and removing is not supported.
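A minimal sketch of the two in-place approaches mentioned above, side by side. The class and method names (TrimInPlace, trimWithSetAll, trimWithReplaceAll) are made up for illustration; only Arrays.setAll and Arrays.asList(...).replaceAll come from the answer.

```java
import java.util.Arrays;

class TrimInPlace {
    // Overwrites each slot of the given array with its trimmed value; no new array is created.
    static void trimWithSetAll(String[] arr) {
        Arrays.setAll(arr, i -> arr[i].trim());
    }

    // Same effect through the fixed-size List view that Arrays.asList puts over the array.
    static void trimWithReplaceAll(String[] arr) {
        Arrays.asList(arr).replaceAll(String::trim);
    }

    public static void main(String[] args) {
        String[] a = {" a ", "b  ", "  c"};
        trimWithSetAll(a);
        System.out.println(Arrays.toString(a)); // prints [a, b, c]
    }
}
```

Both variants write through to the caller's array, so the reference the caller holds sees the trimmed values.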

Use toArray:
str = Arrays.stream(str).map(c -> c.trim()).toArray(String[]::new);
The disadvantage here (compared with your original Java 7 loop) is that a new array is created to store the result.
To update the original array, you can rewrite your loop with Streams, though I'm not sure what the point would be:
IntStream.range(0, str.length).forEach(i -> { str[i] = str[i].trim(); });

It's not as much processing as you might think. The array has a known size and its spliterator will be SIZED, so the size of the result is known before processing and, for toArray at least, the space for it can be allocated ahead of time without having to resize anything.
It's also always interesting that in the absence of actual tests we almost always assume that this is slow or memory hungry.
Of course, if you want an array as the result, there is a method for that:
.toArray(String[]::new);
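The SIZED claim can be checked directly. The sketch below asks the array's spliterator for its characteristics and exact size, then produces the trimmed array with toArray. The class and method names are hypothetical; the API calls are from java.util.

```java
import java.util.Arrays;
import java.util.Spliterator;

class SizedDemo {
    // True when the spliterator over the array reports a known, exact size.
    static boolean isSized(String[] arr) {
        return Arrays.spliterator(arr).hasCharacteristics(Spliterator.SIZED);
    }

    // The exact element count the stream pipeline can see before it runs.
    static long exactSize(String[] arr) {
        return Arrays.spliterator(arr).getExactSizeIfKnown();
    }

    // Builds a new trimmed array; the source array is left untouched.
    static String[] trimmedCopy(String[] arr) {
        return Arrays.stream(arr).map(String::trim).toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] str = {" a ", " b "};
        System.out.println(isSized(str) + " " + exactSize(str));
        System.out.println(Arrays.toString(trimmedCopy(str)));
    }
}
```

This says nothing about actual speed; measuring is still the only way to know whether the extra array matters.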

Related

Difference between Dynamic arrays and list c#

I have a query regarding a concept.
Are dynamic arrays in C# also called lists, or are they totally different?
C# does not have a dynamic array type. However, you can use a List as a dynamic array, given that it supports indexed access:
List<string> list = new List<string>();
list.Add("Hello");
list.Add("Goodbye");
Console.WriteLine(list[1]);
Take a look at Generic Lists.
Since this is the first link I found when looking for this difference, I would like to contribute a little.
Nitin Bisht's response is not entirely wrong, but by most programming definitions a dynamic array is an array whose size is set while the program is already running. Simply put, its size is set at run time, not at compile time.
In C# this is written as:
string[] myDynamicArray;
myDynamicArray = SomeMethodThatReturnsAnCollection().ToArray();
Another way would be a method that returns an existing array, like:
string[] myDynamicArray;
myDynamicArray = SomeMethodThatReturnsAnArray();
Those are dynamic arrays: you don't tell your program what size the array has; the logic in that context decides it.
Arrays in C# also have the Array.Resize method, which lets the programmer grow an array (it allocates a new array of the requested size and copies the elements over). I believe this is what Nitin Bisht and so many others had in mind when answering this question, but it is a wrong assumption: resizable is different from dynamic.
Instead of using Resize on your array, you could just use a List. List in C# is backed by an array at its core, which can be seen here, but it can also grow much more conveniently than a plain array, through the Add(T item) method.
List in C# is effectively a native dynamic resizable array. Some languages call a similar structure a Vector; it is not a linked list with nodes and pointers, for example.
Summarizing the difference in code:
var myList = new List<string> { "initial value" };
myList.Add("New value"); // After add, myList has this new value
var myArray = new string[] { "initial value" };
myArray.Append("New value"); // after Append, myArray doesn't have the new value
Append() creates another IEnumerable sequence that includes the new value and leaves the array untouched, so if you iterate through both collections you will see that the list prints the new value but the array doesn't. One way to fix that is:
var myArray = new string[] { "initial value" };
myArray = myArray.Append("New value").ToArray(); // now it will work
The difference here is that the myArray variable is overwritten with the new array produced by the .Append().ToArray() call, whereas in the List example the Add() method resizes the internal storage when needed and then adds the new value.
It is important to note that this resize is done in a manner similar to the "fixed" Append() code: the List creates a new array with increased size and copies its original contents into it.
You can use the facilities of the Array class to get the same result as List's Add() method, but of course if you need that behavior you should just use a List instead of writing your own implementation. You would very rarely have to write one, and to be honest your implementation would probably be less efficient than the built-in one.
In summary, that is the difference between them. As you can see they have similarities, but I would still treat them very differently.
Usually you want to use an array when you do not expect the collection to grow any further. In that case an array is more useful, since it carries less machinery, and that brings a performance gain that can matter in a lot of scenarios.
If not, just use lists.
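To make the "new array plus copy" resize concrete, here is a rough sketch of a growable array, written in Java rather than C# purely for illustration. GrowableStrings, its initial capacity of 2 and the doubling policy are assumptions of this sketch, not a description of how List is actually implemented.

```java
import java.util.Arrays;

class GrowableStrings {
    private String[] items = new String[2]; // assumed initial capacity, for illustration only
    private int size = 0;

    // Appends a value, replacing the backing array with a larger copy when it is full.
    void add(String value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, items.length * 2); // new array, old contents copied
        }
        items[size++] = value;
    }

    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return items[index];
    }

    int size() { return size; }

    int capacity() { return items.length; }

    public static void main(String[] args) {
        GrowableStrings list = new GrowableStrings();
        list.add("initial value");
        list.add("New value");
        list.add("Another value"); // triggers one resize
        System.out.println(list.size() + " items, capacity " + list.capacity());
    }
}
```

The caller never sees the swap of backing arrays, which is the convenience a list type gives you over resizing an array by hand.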

Making a Copy of an Array Independent of the Original

var ary:Array = ["string","string","string"];
var copy_ary:Array = ary;
trace(copy_ary);//string,string,string
ary[1]=false;
trace(copy_ary);//string,false,string
All I want to do is make a copy of the array ary without the copy constantly changing in accordance with the original. Would I have to create a copy with a loop (e.g. below)?
var ary:Array = ["string","string","string"];
var copy_ary:Array = [];
for(var i=0;i<ary.length;i++){
copy_ary[i]=ary[i];
}
Obviously this works, but it seems a lot of work considering one wouldn't think the copied array would constantly stay the same as the original in the first place. Could someone please tell me why this is?
In your first example you didn't create a new array - you created a new reference to an already existing array (so you had 2 references but 1 array). When you modify the array through one reference, you'll see changes through the other reference as well (since you really only have 1 array).
To create an independent copy of an array, you need to actually create a new array instance and then copy the items over. This can be done through a shallow or through a deep copy.
In short, a shallow copy can be created using the Array.concat() or the Array.slice() methods (or using a loop, like in your 2nd example). For a deep copy, you'll have to also copy the objects inside the array - this would likely need more code, depending on what kind of objects are in the array.
When your array only contains primitive (or primitive-like) types, a shallow copy is typically enough. If your array only has strings, a shallow copy should suffice, since String behaves like a primitive type even though it's a complex object.
Read this article for more information.
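The reference-versus-copy distinction is not specific to ActionScript. As an analogy only, the following Java sketch shows the same behavior: assigning an array gives a second reference to one array, while Arrays.copyOf gives an independent shallow copy. CopyDemo and its method names are hypothetical.

```java
import java.util.Arrays;

class CopyDemo {
    // Returns the very same array: a second reference, not a second array.
    static Object[] alias(Object[] source) {
        return source;
    }

    // Returns a new array holding the same elements (a shallow copy).
    static Object[] shallowCopy(Object[] source) {
        return Arrays.copyOf(source, source.length);
    }

    public static void main(String[] args) {
        Object[] ary = {"string", "string", "string"};
        Object[] ref = alias(ary);
        Object[] copy = shallowCopy(ary);
        ary[1] = Boolean.FALSE;
        System.out.println(Arrays.toString(ref));  // prints [string, false, string]
        System.out.println(Arrays.toString(copy)); // prints [string, string, string]
    }
}
```

A shallow copy still shares any mutable objects stored in the slots, which is where a deep copy becomes necessary.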

Sorting and managing numerous variables

My project has classes which, unavoidably, contain hundreds upon hundreds of variables that I'm always having to keep straight. For example, I'm always having to keep track of specific kinds of variables for a recurring set of "items" that occur inside of a class, where placing those variables between multiple classes would cause a lot of confusion.
How do I better sort my variables to keep from going crazy, especially when it comes time to save my data?
Am I missing something? Actionscript is an Object Oriented language, so you might have hundreds of variables, but unless you've somehow treated it like a grab bag and dumped it all in one place, everything should be to hand. Without knowing what all you're keeping track of, it's hard to give concrete advice, but here's an example from a current project I'm working on, which is a platform for building pre-employment assessments.
The basic unit is a Question. A Question has a stem, text that can go in the status bar, a collection of answers, and a collection of measures of things we're tracking about what the user does in that particular type of questions.
The measures are, again, their own type of object, and come in two "flavors": one that is used to track a time limit and one that isn't. The measure has a name (so we know where to write back to the database) and a value (which tells us what). Timed ones also have a property for the time limit.
When we need to time the question, we hand that measure to yet another object that counts the time down and a separate object that displays the time (if appropriate for the situation). The answers, known as distractors, have a label and a value that they can impart to the appropriate measure based on the user selection. For example, if a user selects "d", its value, "4" is transferred to the measure that stores the user's selection.
Once the user submits his answer, we loop through all the measures for the question and send those to the database. If those were not treated as a collection (in this case, a Vector), we'd have to know exactly what specific measures are being stored for each question and each question would have a very different structure that we'd have to dig through. So if looping through collections is your issue, I think you should revisit that idea. It saves a lot of code and is FAR more efficient than "var1", "var2", "var3."
If the part you think is unwieldy is the type checking you have to do because literally anything could be in there, then Vector could be a good solution for you as long as you're using at least Flash Player 10.
So, in summary:
When you have a lot of related properties, write a Class that keeps all of those related bits and pieces together (like my Question).
When objects have 0-n "things" that are all the same or very similar, use a collection of some sort, such as an Array or Vector, to allow you to iterate through them as a group and perform the same operation on each (for example, each Question is part of a larger grouping that allows each question to be presented in turn, and each question has a collection of distractors and another of measures).
These two concepts, used together, should help keep your information tidy and organized.
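A rough sketch of those two ideas together, written in Java for illustration rather than ActionScript. The Measure and Question classes below are simplified guesses at the structure described above (no timing, no distractors), and submit() returns strings where the real project would write to a database.

```java
import java.util.ArrayList;
import java.util.List;

// One tracked value: a name (where it is stored) and a value (what is stored).
class Measure {
    final String name;
    int value;

    Measure(String name) {
        this.name = name;
    }
}

// Related properties live together in one class; the 0-n measures live in a collection.
class Question {
    final String stem;
    final List<Measure> measures = new ArrayList<>();

    Question(String stem) {
        this.stem = stem;
    }

    // Loops over every measure the same way, whatever measures this question happens to have.
    List<String> submit() {
        List<String> rows = new ArrayList<>();
        for (Measure m : measures) {
            rows.add(m.name + "=" + m.value); // stand-in for the database write
        }
        return rows;
    }

    public static void main(String[] args) {
        Question q = new Question("Pick one");
        Measure selection = new Measure("selection");
        q.measures.add(selection);
        selection.value = 4; // the user chose "d", whose value is 4
        System.out.println(q.submit());
    }
}
```

Because submit() only iterates, adding a new kind of measure to a question needs no change to the saving code.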
While I'm certain there are numerous ways of keeping arrays straight, I have found a method that works well for me. Best of all, it collapses large amounts of information into a handful of arrays that I can parse to an XML file or other storage method. I call this method my "indexed array system".
There are actually multiple ways to do this: creating a handful of 1-dimensional arrays, or creating 2-dimensional (or higher) array(s). Both work equally well, so choose the one that works best for your code. I'm only going to show the 1-dimensional method here. Those of you who are familiar with arrays can probably figure out how to rewrite this to use higher dimensional arrays.
I use Actionscript 3, but this approach should work with almost any programming or scripting language.
In this example, I'm trying to keep various "properties" of different "activities" straight. In this case, we'll say these properties are Level, High Score, and Play Count. We'll call the activities Pinball, Word Search, Maze, and Memory.
This method involves creating multiple arrays, one for each property, and creating constants that hold the integer "key" used for each activity.
We'll start by creating the constants, as integers. Constants work for this, because we never change them after compile. The value we put into each constant is the index the corresponding data will always be stored at in the arrays.
const pinball:int = 0;
const wordsearch:int = 1;
const maze:int = 2;
const memory:int = 3;
Now, we create the arrays. Remember, arrays start counting from zero. Since we want to be able to modify the values, this should be a regular variable.
Note, I am constructing the array to be the specific length we need, with the default value for the desired data type in each slot. I've used all integers here, but you can use just about any data type you need.
var highscore:Array = [0, 0, 0, 0];
var level:Array = [0, 0, 0, 0];
var playcount:Array = [0, 0, 0, 0];
So, we have a consistent "address" for each property, and we only had to create four constants, and three arrays, instead of 12 variables.
Now we need to create the functions to read and write to the arrays using this system. This is where the real beauty of the system comes in. Be sure this function is written in public scope if you want to read/write the arrays from outside this class.
To create the function that gets data from the arrays, we need two arguments: the name of the activity and the name of the property. We also want to set up this function to return a value of any type.
GOTCHA WARNING: In Actionscript 3, this won't work in static classes or functions, as it relies on the "this" keyword.
public function fetchData(act:String, prop:String):*
{
var r:*;
r = this[prop][this[act]];
return r;
}
That queer bit of code, r = this[prop][this[act]], simply uses the provided strings "act" and "prop" as the names of the constant and array, and sets the resulting value to r. Thus, if you feed the function the parameters ("maze", "highscore"), that code will essentially act like r = highscore[2] (remember, this[act] returns the integer value assigned to it.)
The writing method works essentially the same way, except we need one additional argument, the data to be written. This argument needs to be able to accept any data type.
GOTCHA WARNING: One significant drawback to this system with strict typing languages is that you must remember the data type for the array you're writing to. The compiler cannot catch these type errors, so your program will simply throw a fatal error if it tries to write the wrong value type.
One clever way around this is to create different functions for different data types, so passing the wrong data type in an argument will trigger a compile-time error.
public function writeData(act:String, prop:String, val:*):void
{
this[prop][this[act]] = val;
}
Now, we just have one additional problem. What happens if we pass an activity or property name that doesn't exist? To protect against this, we just need one more function.
This function will validate a provided constant or variable key by attempting to access it, and catching the resulting fatal error, returning false instead. If the key is valid, it will return true.
function validateName(ID:String):Boolean
{
var checkthis:*;
var r:Boolean = true;
try
{
checkthis = this[ID];
}
catch (error:ReferenceError)
{
r = false;
}
return r;
}
Now, we just need to adjust our other two functions to take advantage of this. We'll wrap the function's code inside an if statement.
If one of the keys is invalid, the function will do nothing - it will fail silently. To get around this, just put a trace (a.k.a. print) statement or a non-fatal error in the else construct.
public function fetchData(act:String, prop:String):*
{
var r:*;
if(validateName(act) && validateName(prop))
{
r = this[prop][this[act]];
return r;
}
}
public function writeData(act:String, prop:String, val:*):void
{
if(validateName(act) && validateName(prop))
{
this[prop][this[act]] = val;
}
}
Now, to use these functions, you simply need to use one line of code each. For the example, we'll say we have a text object in the GUI that shows the high score, called txtHighScore. I've omitted the necessary typecasting for the sake of the example.
//Get the high score.
txtHighScore.text = fetchData("maze", "highscore");
//Write the new high score.
writeData("maze", "highscore", txtHighScore.text);
I hope y'all will find this tutorial useful in sorting and managing your variables.
(Afternote: You can probably do something similar with dictionaries or databases, but I prefer the flexibility with this method.)
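For comparison, the dictionary-based alternative mentioned in the afternote might look roughly like this in Java. This is an illustration under assumptions, not the author's method: the ActivityStats class and its enum names are invented here, and enums replace the string keys so that a misspelled activity or property fails at compile time instead of needing a validateName check.

```java
import java.util.EnumMap;
import java.util.Map;

class ActivityStats {
    enum Activity { PINBALL, WORD_SEARCH, MAZE, MEMORY }
    enum Property { HIGH_SCORE, LEVEL, PLAY_COUNT }

    // One inner map per property, each holding one integer per activity.
    private final Map<Property, Map<Activity, Integer>> data = new EnumMap<>(Property.class);

    ActivityStats() {
        for (Property p : Property.values()) {
            Map<Activity, Integer> row = new EnumMap<>(Activity.class);
            for (Activity a : Activity.values()) {
                row.put(a, 0); // same default as the zero-filled arrays
            }
            data.put(p, row);
        }
    }

    int fetch(Activity act, Property prop) {
        return data.get(prop).get(act);
    }

    void write(Activity act, Property prop, int val) {
        data.get(prop).put(act, val);
    }

    public static void main(String[] args) {
        ActivityStats stats = new ActivityStats();
        stats.write(Activity.MAZE, Property.HIGH_SCORE, 1200);
        System.out.println(stats.fetch(Activity.MAZE, Property.HIGH_SCORE)); // prints 1200
    }
}
```

The trade-off is that every value here is an int; mixing data types per property, as the array version allows, would need a different map per type.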

matlab initialize array of objects

I am playing around with OOP in MATLAB, and I have the following constructor:
function obj = Squadron(num_fighters, num_targets, time_steps)
if nargin == 0
num_targets = 100;
time_steps = 100;
num_fighters = 10;
end
obj.num_shooters = num_fighters;
for iShooter = 1:obj.num_shooters
a(iShooter) = Shooter(num_targets, time_steps);
end
obj.ShooterArray = a;
obj.current_detections = zeros(num_fighters, num_targets);
end
That temporary variable 'a' smells terrible. Is there a better way to initialize an array of objects? I wish there was a push/pop method. I am sure there is a better way to do this.
Looks like you are trying to create an array of handle objects (Shooters) and store it inside a property of another handle object (a Squadron). I have had a discussion about a very similar problem that might help you.
In short: What you are doing might not be pretty - but might be pretty good already.
When creating an array in MATLAB it is usually a good idea to pre-allocate to reserve memory, which speeds things up significantly.
In a normal case something like this:
a=zeros(1,1000);
for n=1:1000
a(n)=n;
end
(here a=1:1000; would be even better)
For objects the pre-allocation works by assigning one of the objects to the very last field in the array. Matlab then fills the other fields before that with objects (handles) that it creates by calling the constructor of that object with no arguments (see Matlab help). Hence a pre-allocation for objects could look like this:
a(1,1000)=ObjectConstructor();
for n=1:1000
a(n)=ObjectConstructor();
end
or simply
for n=1000:-1:1
a(n)=ObjectConstructor();
end
Provided Shooter can be called with no arguments, you should be able to do something like:
for iShooter = obj.num_shooters:-1:1
obj.ShooterArray(iShooter) = Shooter(num_targets, time_steps);
end
However, it turns out that for some reason storing an array of objects directly in another object's property performs very badly (probably the array pre-allocation does not work well in this case). Hence, using an auxiliary variable and assigning the full array to the property at once is a good idea in this case.
I would try:
for iShooter = obj.num_shooters:-1:1
a(iShooter) = Shooter(num_targets, time_steps);
end
obj.ShooterArray = a;
Again, for more detail see this discussion.
There are a couple of ways to handle this situation...
Building object arrays in the constructor:
You could modify your Shooter class such that when you pass arrays of values it creates an array of objects. Then you could initialize ShooterArray like so:
obj.ShooterArray = Shooter(repmat(num_targets,1,num_fighters),...
repmat(time_steps,1,num_fighters));
Replicating instances of a value class:
If Shooter is a value class, and each object is going to be exactly the same (i.e. you don't initialize any of its default properties to random values), then you can create just one object and replicate it using REPMAT:
obj.ShooterArray = repmat(Shooter(num_targets,time_steps),1,num_fighters);
Unfortunately, if Shooter is a subclass of the handle class, you can't just replicate it as you can with a value class. You would actually be replicating references to just one object, when you really need a number of separate objects each with their own unique reference. In such a case, your current code is likely the best solution.
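The handle-class pitfall has a close analogue in any language with reference semantics. The Java sketch below is an illustration only: Shooter here is a hypothetical stand-in with a single field, and Collections.nCopies plays the role that REPMAT plays for a handle object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the MATLAB Shooter class.
class Shooter {
    int shotsFired = 0;
}

class SquadronDemo {
    // Every slot refers to the SAME Shooter, like replicating a handle object.
    static List<Shooter> replicated(int n) {
        return new ArrayList<>(Collections.nCopies(n, new Shooter()));
    }

    // Each slot gets its own Shooter, like calling the constructor inside the loop.
    static List<Shooter> independent(int n) {
        List<Shooter> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(new Shooter());
        }
        return list;
    }

    public static void main(String[] args) {
        List<Shooter> shared = replicated(3);
        shared.get(0).shotsFired = 5;
        System.out.println(shared.get(2).shotsFired); // prints 5: one object, three references

        List<Shooter> separate = independent(3);
        separate.get(0).shotsFired = 5;
        System.out.println(separate.get(2).shotsFired); // prints 0: three separate objects
    }
}
```

This is why the loop in the original constructor is hard to avoid when each element must be a distinct object.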

C# -- Create Managed Array from Pointer

I'm trying to create a managed array of doubles from an array of bytes. I have the program working currently, but I want to optimize it. Here's some code that I would like to work:
private unsafe static double[] _Get_Doubles(byte[] _raw_data)
{
double[] ret;
fixed (byte* _pd = _raw_data)
{
double* _pret = (double*)_pd;
ret = (double[])*_pret; //FAILURE
}
}
Please let me know how to cope with these problems.
-Aaron
One of the key things to notice about the code you have posted is that there is no way to know how many items are pointed to by the return value, and a managed array needs to know how big it is. You can return a double* or create a new double[XXX] and copy the values or even (if the count is constant) create a struct with a public fixed double _data[2]; member and cast the raw data to that type.
Just now, I thought that stackalloc would be the right way, but it fails. Most importantly, I now know that it was doomed to fail. There is no way to do what I want to do.
This can be seen by restating the question:
How can I create a managed array around an 'unsafe' array?
Since a managed array has header information (because it's a class around a chunk of memory), it requires more space in memory than the array itself. So, the answer is:
Allocate space before (and/or after? depending on the way managed arrays are stored in memory) the array itself and put the managed information (length, (et cetera)) around the 'unsafe' array.
This is not easily possible, because any guarantee that there is enough room around the array is shaky at best. In my particular example there may be enough room, because a managed byte[] is passed in, meaning there is header data around the array. But asserting that the same data is appropriate for a managed double[] is dubious at best and most likely erroneous, and changing the data to make it appropriate for a managed double[] is nefarious.
[EDIT]
It looks like Marshal.Copy is the way to go here. Create a new array and let Marshal copy the values (hoping that it will be quicker than me, or that perhaps at some later date it will be):
var ret = new double[_raw_data.Length / sizeof(double)];
System.Runtime.InteropServices.Marshal.Copy(new System.IntPtr(_pret), ret, 0, ret.Length);
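The same "allocate a new array and bulk-copy" idea, sketched in Java for comparison. This is an analogy, not a translation of Marshal.Copy: BytesToDoubles and toDoubles are hypothetical names, and the byte order is an explicit parameter because the source does not say how the raw bytes were produced.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BytesToDoubles {
    // Copies the raw bytes into a freshly allocated double[]; trailing bytes that do not
    // fill a whole double are ignored.
    static double[] toDoubles(byte[] raw, ByteOrder order) {
        double[] result = new double[raw.length / Double.BYTES];
        ByteBuffer.wrap(raw, 0, result.length * Double.BYTES)
                .order(order)
                .asDoubleBuffer()
                .get(result); // bulk copy into the new array
        return result;
    }

    public static void main(String[] args) {
        // Sample data written in big-endian order, the ByteBuffer default.
        byte[] raw = ByteBuffer.allocate(16).putDouble(1.5).putDouble(-2.0).array();
        double[] values = toDoubles(raw, ByteOrder.BIG_ENDIAN);
        System.out.println(values[0] + " " + values[1]); // prints 1.5 -2.0
    }
}
```

As in the C# version, the result length is derived from the byte count, which is exactly the piece of information a bare pointer cannot supply.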
