Unexpected array size [duplicate] - arrays

I'm trying to make a simple blackjack game as a PowerShell script and found out that my ArrayList is not behaving as expected. I want to understand why I get a different answer than expected.
The Write-Host $deck call inside the function prints out the deck as I expect, 52 objects. So far so good.
However, when calling Write-Host $myDeck the weird part begins. What it will do is, it will first print 0...51 and then my actual deck. So instead of having 52 objects in my ArrayList I get 104 (52+52). Can anyone explain what really happens here? Because I find this super confusing.
function Get-Deck(){
    $ranks = 2,3,4,5,6,7,8,9,10,"Jack","Queen","King","Ace"
    $suits = "Spade","Heart","Diamond","Club"
    $deck = [System.Collections.ArrayList]::new()
    foreach($rank in $ranks){
        foreach($suit in $suits){
            $card = $rank , $suit
            $deck.Add($card)
        }
    }
    Write-Host $deck #prints out the actual array with 52 cards.
    return $deck
}
$myDeck = Get-Deck
Write-Host $myDeck #prints out: 0 1 2 3 4 5 6 7 ... 51 2 Spade 2 Heart 2 Diamond ... Ace Club

The unexpected output is caused by ArrayList.Add(). Its prototype is:
public virtual int Add (object value);
Note that it has a non-void return type: an int, which is the ArrayList index at which the value was added.
When $deck.Add($card) is called inside the function, that return value is left on the pipeline and becomes part of the function's output, so $myDeck receives the 52 indexes followed by the 52 cards. The ArrayList itself still holds only 52 items, which is why Write-Host $deck inside the function looks correct. To fix the issue, assign the return value to a variable, assign it to $null, pipe it to Out-Null, or cast it to [void]. There are a few gotchas; see another answer about those. Any of these should work:
$null = $deck.Add($card) # Preferred, (ab)uses automatic variable
[void]$deck.Add($card) # Works too
$deck.Add($card) | out-null # Works, but is the slowest option
$foo = $deck.Add($card) # Use this if you need the index value
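The effect can be reproduced on a smaller scale. The following is a minimal sketch, not the asker's code: Get-Leaky and Get-Clean are hypothetical function names introduced only to compare the two behaviours side by side.

```powershell
# Hypothetical helper names (Get-Leaky, Get-Clean), used only for illustration.
function Get-Leaky {
    $list = [System.Collections.ArrayList]::new()
    foreach ($item in 'a', 'b', 'c') {
        $list.Add($item)          # the returned index (0, 1, 2) becomes function output
    }
    return $list                  # the three items are emitted after the indexes
}

function Get-Clean {
    $list = [System.Collections.ArrayList]::new()
    foreach ($item in 'a', 'b', 'c') {
        $null = $list.Add($item)  # the returned index is discarded
    }
    return $list
}

$leaky = @(Get-Leaky)
$clean = @(Get-Clean)

$leaky.Count        # 6
$leaky -join ' '    # 0 1 2 a b c
$clean.Count        # 3
```

The leaky version returns twice as many objects as were added, which matches the 52+52 seen in the question.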

Related

Powershell: Array receiving the return of a function contains incremental numbers for each value in the array [duplicate]

I am new to PowerShell and there is a weird behavior I cannot explain. I call a function that returns a [System.Collections.ArrayList] but when I print my variable that receives the content of the array, if I have one value(for example: logXXX_20210222_075234355.txt), then I get 0 logXXX_20210222_075234355.txt. The value 0 gets added for some reason as if it has the index of the value.
If I have 4 values, it will look like this:
0 1 2 3 logXXX_20210222_075234315.txt logXXX_20210225_090407364.txt
logXXX_20210204_120318221.txt logXXX_20210129_122737751.txt
Can anyone help?
Here is a simple code that does that:
function returnAnArray{
    $arrayToReturn =[System.Collections.ArrayList]::new()
    $arrayToReturn.Add('logICM_20210222_075234315.txt')
    return $arrayToReturn
}
$fileNames = returnAnArray
Write-Host $fileNames
0 logICM_20210222_075234315.txt
ArrayList.Add(...) returns the index of the element it just added. PowerShell treats every value that a function produces and does not capture as output, so those index numbers get mixed in with the output you intended.
My favorite solution is to simply cast the output from the .Add(...) method to [Void]:
function returnAnArray{
    $arrayToReturn = [System.Collections.ArrayList]::new()
    [Void]$arrayToReturn.Add('logICM_20210222_075234315.txt')
    return $arrayToReturn
}
You can also use Out-Null for this purpose but in many cases it doesn't perform as well.
Another method is to assign it to $null like:
function returnAnArray{
    $arrayToReturn = [System.Collections.ArrayList]::new()
    $null = $arrayToReturn.Add('logICM_20210222_075234315.txt')
    return $arrayToReturn
}
In some cases this can be marginally faster. However, I prefer the [Void] syntax and haven't observed whatever minor performance differential there may be.
Note: $null = ... works in all cases, while there are some cases where [Void] will not; See this answer (thanks again mklement0) for more information.
An aside, you can use casting to establish the list:
$arrayToReturn = [System.Collections.ArrayList]@()
Update Incorporating Important Comments from @mklement0:
return $arrayToReturn may not behave as intended. PowerShell's output behavior is to enumerate (stream) arrays down the pipeline. In such cases a 1 element array will end up returning a scalar. A multi-element array will return a typical object array [Object[]], not [Collections.ArrayList] as seems to be the intention.
The comma operator can be used to guarantee the return type by making the ArrayList the first element of another array. See this answer for more information.
Example without ,:
Function Return-ArrayList { [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Object[]
Example with ,:
Function Return-ArrayList { , [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Collections.ArrayList
Of course, this can also be handled by the calling code, most commonly by wrapping the call in the array subexpression operator @(...). A call like $filenames = @(returnAnArray) will force $filenames to be a typical object array ([Object[]]). Casting like $filenames = [Collections.ArrayList]@(returnAnArray) will make it an ArrayList.
For the latter approach, I always question if it's really needed. The typical use case for an ArrayList is to work around poor performance associated with using += to increment arrays. Often this can be accomplished by allowing PowerShell to return the array for you (see below). But, even if you're forced to use it inside the function, it doesn't mean you need it elsewhere in the code.
For Example:
$array = 1..10 | ForEach-Object{ $_ }
Is preferred over:
$array = [Collections.ArrayList]@()
1..10 | ForEach-Object{ [Void]$array.Add( $_ ) }
Persisting the ArrayList type beyond the function and through to the caller should be based on a persistent need, for example a need to easily add or remove elements further along in the program.
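The caller-side options described above can be compared in a short sketch. Get-OneName is a hypothetical function, assumed here to return a one-element ArrayList without the comma operator, so the single-element case (where a scalar comes back) is visible.

```powershell
# Get-OneName is a hypothetical function that outputs a one-element ArrayList.
function Get-OneName {
    [System.Collections.ArrayList]@('logICM_20210222_075234315.txt')
}

$plain   = Get-OneName                                   # enumerated: a single string
$asArray = @(Get-OneName)                                # forced to Object[]
$asList  = [System.Collections.ArrayList]@(Get-OneName)  # rebuilt as an ArrayList

$plain.GetType().FullName     # System.String
$asArray.GetType().FullName   # System.Object[]
$asList.GetType().FullName    # System.Collections.ArrayList
```

Only the last form gives the caller an ArrayList, and it is a new list built from the function's output rather than the original object.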
Still More Information:
Notice the Return statement isn't needed either. This very much ties back to why you were getting extra output. Anything a function outputs is returned to the caller. Return isn't explicitly needed for this case. More commonly, Return can be used to exit a function at desired points...
A function like:
Function Demo-Return {
    1
    return
    2
}
This will return 1 but not 2 because Return exited the function beforehand. However, if the function were:
Function Demo-Return
{
    1
    return 2
}
This returns 1, 2.
However, that's equivalent to Return 1,2 OR just 1,2 without Return
Update based on comments from @zett42:
You could avoid the ArrayList behavior altogether by using a different collection type, most commonly a generic list, [Collections.Generic.List[object]]. ArrayList is not formally obsolete, but Microsoft's documentation recommends against it for new development in favor of generic lists. Furthermore, the generic list's .Add() method returns nothing (void), so you do not need [Void] or any other nullification method. Generic lists are slightly faster than ArrayLists, and skipping the nullification operation is a further, albeit still small, performance advantage.
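A minimal sketch of the generic-list approach, assuming PowerShell 5 or later for the ::new() syntax. Get-FileNames is a hypothetical function name and the second file name is made-up sample data.

```powershell
# Get-FileNames is a hypothetical function; the file names are sample data.
function Get-FileNames {
    $list = [System.Collections.Generic.List[object]]::new()
    $list.Add('logICM_20210222_075234315.txt')   # List[T].Add returns void: nothing leaks
    $list.Add('logICM_20210225_090407364.txt')
    , $list                                      # comma operator keeps the list intact
}

$names = Get-FileNames
$names.Count            # 2
$names.GetType().Name   # List`1
```

No [Void] or $null assignment is needed here, and the leading comma is only required if the caller should receive the list itself rather than its enumerated elements.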
The returned $filenames (not the ArrayList itself) ends up holding the leaked indexes followed by the values:
PS /home/alistair> $filenames[0]
0
PS /home/alistair> $filenames[1]
logICM_20210222_075234315.txt

Powershell - declaring array of arrays is not so easy? [duplicate]

I tried to describe a small 'list' of things, using arrays of arrays. An odd behaviour I observed:
function write-it-out([array] $arrays)
{
    foreach($a in $arrays)
    {
        write-host "items" $a[0] $a[1]
    }
}
$arrayOfArrays1 = @(
    @("apple","orange"),
    @("monkey","bear")
)
$arrayOfArrays2 = @(
    @("android","linux")
)
# it works
write-it-out $arrayOfArrays1
# it wont
write-it-out $arrayOfArrays2
The first case outputs the expected two lines with the following content:
items apple orange
items monkey bear
But the second function call does not output the expected
items android linux
but
items a n
items l i
Does somebody know why? And how do I describe an array containing only one array inside, not more than one? So how do I fix it? Thanks in advance, guys!
I'm not exactly sure why, but when you declare $arrayOfArrays2, PowerShell is immediately unrolling the outer array.
> $arrayOfArrays2.Count
2
> $arrayOfArrays2[0]
android
In order to make it not do that, you can add an extra comma inside the outer array declaration like this.
> $arrayOfArrays2 = @(,@("android","linux"))
> $arrayOfArrays2.Count
1
> $arrayOfArrays2[0]
android
linux
> write-it-out $arrayOfArrays2
items android linux
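A likely explanation, offered as a sketch rather than a definitive account: @(...) does not wrap its content in a new array. It collects whatever the enclosed statement outputs, and an array that is output gets enumerated, so @(@("android","linux")) collects two strings. The unary comma instead builds a one-element array whose only element is the inner array, and that inner array survives the single level of enumeration. The variable names below are illustrative.

```powershell
# $flat and $nested are illustrative variable names.
$flat   = @( @("android", "linux") )     # inner array is enumerated: two strings
$nested = @( , @("android", "linux") )   # unary comma: one element, itself an array

$flat.Count         # 2
$flat[0]            # android
$nested.Count       # 1
$nested[0].Count    # 2
$nested[0][1]       # linux
```

With two or more inner arrays, the binary comma between them already builds the outer array, which is why the first case in the question worked without any extra comma.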

Stata: assign numbers in a loop

I have a problem creating a loop in Stata.
I have a dataset in Stata where I classified my observations into 6 categories via variable k10. So k10 takes on values 1,2,3,4,5,6.
Now I want to assign each observation one value according to its class:
value 15 for k10=1
value 10 for k10=2
value 8 for k10=3
value 5 for k10=4
value 4 for k10=5
value 2 for k10=6
It is easy if I create a new variable w10 and do it like the following:
gen w10 =.
replace w10 = 15 if k10==1
replace w10 = 10 if k10==2
replace w10 = 8 if k10==3
replace w10 = 5 if k10==4
replace w10 = 4 if k10==5
replace w10 = 2 if k10==6
Now I tried to simplify the code by using a loop, unfortunately it does not do what I want to achieve.
My loop:
gen w10=.
local A "1 2 3 4 5 6"
local B "15 10 8 5 4 2"
foreach y of local A {
    foreach x of local B {
        replace w10 = `x' if k10= `y'
    }
}
The loop assigns value 2 to each observation though. The reason is that the if-condition k10=`y' is always true and overwrites the replaced w10s each time until the end, right?
So how can I write the loop correctly?
It's really just one loop, not two nested loops. That's your main error, which is general programming logic. Only the last time you go through the inner loop has an effect that lasts. Try tracing the loops by hand to see this.
Specifically in Stata, looping over the integers 1/6 is much better done with forval; there is no need at all for the indirection of defining a local macro and then obliging foreach to look inside that macro. That can be coupled with assigning the other values to local macros with names 1 ... 6. Here tokenize is the dedicated command to use.
Try this:
gen w10 = .
tokenize "15 10 8 5 4 2"
quietly forval i = 1/6 {
    replace w10 = ``i'' if k10 == `i'
}
Note incidentally that you need == not = when testing for equality.
See (e.g.) this discussion.
Many users of Stata would want to do it in one line with recode. Here I concentrate on the loop technique, which is perhaps of wider interest.

Stata Nested foreach loop substring comparison

I have just started learning Stata and I'm having a hard time.
My problem is this: I have two different variables, ATC and A, where A is potentially a substring of ATC.
Now I want to mark all the observations in which A is a substring of ATC with OK = 1.
I tried this using a simple nested loop:
foreach x in ATC {
    foreach j in A {
        replace OK = 1 if strpos(`x',`j')!=0
    }
}
However, whenever I run this loop no changes are being made even though there should be plenty.
I feel like I should probably give an index specifying which OK is being changed (the one belonging to the ATC/x), but I have no idea how to do this. This is probably really simple but I've been struggling with it for some time.
I should have clarified: my A list is separate from the main list (simply appended to it) and only contains unique keys which I use to identify the ATCs which I want. So I have ~120 A-keys and a couple million ATC keys. What I wanted to do was iterate over every ATC key for every single A-key and mark those ATC-keys with A that qualify.
That means I don't have complete tuples of (ATC,A,OK) but instead separate lists of different sizes.
For example: I have
ATC OK A
ABCD 0 .
EFGH 0 .
... ... ...
. . AB
. . ET
and want "ABCD" to end up with OK marked as 1 while "EFGH" remains at 0.
We can separate your question into two parts. Your title implies a problem with loops, but your loops are just equivalent to
replace OK = 1 if strpos(ATC, A)!=0
so the use of looping appears irrelevant. That leaves the substring comparison.
Let's supply an example:
. set obs 3
obs was 0, now 3
. gen OK = 0
. gen A = cond(_n == 1, "42", "something else")
. gen ATC = "answer is 42"
. replace OK = 1 if strpos(ATC, A) != 0
(1 real change made)
. list
     +------------------------------------+
     | OK                A            ATC |
     |------------------------------------|
  1. |  1               42   answer is 42 |
  2. |  0   something else   answer is 42 |
  3. |  0   something else   answer is 42 |
     +------------------------------------+
So it works fine; and you really need to give a reproducible example if you think you have something different.
As for specifying where the variable should be changed: your code does precisely that, as again the example above shows.
The update makes the problem clear. Stata will only look in the same observation for a matching substring when you specify the syntax you gave. A variable in Stata is a field in a dataset. To cycle over a set of values, something like this should suffice
gen byte OK = 0
levelsof A, local(Avals)
quietly foreach A of local Avals {
    replace OK = 1 if strpos(ATC, `"`A'"') > 0
}
Notes:
Specifying byte cuts down storage.
You may need an if or in restriction on levelsof.
quietly cuts out messages about changed values. When debugging, it is often better left out.
> 0 could be omitted as a positive result from strpos() is automatically treated as true in logical comparisons. See this FAQ.

Perl adding array with another array

I came across the code below online, where it's trying to add two arrays. Can anyone explain what it is calculating to get 14?
my @a = (1,2,5)+(8,9);
print "@a";
output: 14
Output is 14 as $a[0] is 14 => 5+9
+ operator imposes scalar context on both lists so last elements are taken and added,
# in scalar context $x is assigned with last element
my $x = (1,2,5);
print "\$x is $x\n";
outputs $x is 5
warnings pragma would also complain, giving you a hint that something fishy is going on,
Useless use of a constant (8) in void context
Starting with:
my @a = (1,2,5)+(8,9);
When using a list in a scalar context, the last element is returned. Consult What is the difference between a list and an array? for details.
Therefore the above two lists reduce to:
my @a = 5 + 9;
Which mathematically equals:
my @a = (14);
