Why does PowerShell combine an array of arrays?

I use arrays of arrays in a PowerShell script; however, sometimes my array of arrays actually contains only one array.
For some reason, PowerShell keeps replacing that array containing one array with just the inner array. I don't get it; no other scripting or programming language I have ever used does that.
For example, this is what I do not want:
PS C:\Users\> $a = @(@(123,456,789))
PS C:\Users\> $a[0]
123
This is what I want:
PS C:\Users\> $a = @(@(123,456,789), @())
PS C:\Users\> $a[0]
123
456
789
Why do I have to force an extra empty array for PowerShell to treat my array of arrays as such when it contains only one array? This is driving me nuts!

You need to put a comma as the first item:
$a = @(, @(123,456,789) )
The reason for this is that the comma is essentially the array construction operator. This MSDN article has more information.

The @() operator interprets its content as statements, not as an expression. Let us add explicit ; terminators:
@(@(123,456,789;);)
Here is what you have:
123,456,789: the binary comma operator creates an array with three elements.
Result: array [123,456,789].
123,456,789;: because the expression in this statement returns a collection, PowerShell enumerates the collection and writes its elements (not the collection itself) to the pipeline.
Result: three elements 123, 456 and 789 written to the pipeline.
@(123,456,789;): the array subexpression operator collects all the items written to the pipeline by the nested statements and creates an array from them.
Result: array [123,456,789].
@(123,456,789;);: because the expression in this statement returns a collection, PowerShell enumerates the collection and writes its elements (not the collection itself) to the pipeline.
Result: three elements 123, 456 and 789 written to the pipeline.
@(@(123,456,789;);): the array subexpression operator collects all the items written to the pipeline by the nested statements and creates an array from them.
Result: array [123,456,789].
So, when you write @(collection), PowerShell returns a copy of the collection, not the collection wrapped in a single-element array. If you want to create an array with a single element, use the unary comma operator: ,expression. This creates a single-element array regardless of whether the expression returns a collection.
Also, when you write @(a,b,c,etc), it is the binary comma that creates the array; the array subexpression operator just copies that array. But why would you need a copy? Is there any reason why you cannot use the original array? To avoid the extra copy, simply omit the @ character: (a,b,c,etc).
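As a minimal sketch of the difference described above (the variable names are only illustrative), the following compares nesting @(...) with using the unary comma:

```powershell
# @(...) around a single array enumerates it and re-collects its elements,
# so the result is still a flat three-element array.
$flat = @(@(123, 456, 789))

# The unary comma wraps its operand in a new single-element array.
$nested = , @(123, 456, 789)

# The form from the answer above: the comma inside @(...) has the same effect.
$alsoNested = @(, @(123, 456, 789))

$flat.Count           # 3: the three integers
$nested.Count         # 1: one inner array
$alsoNested.Count     # 1
$nested[0].Count      # 3: the inner array is intact
$nested[0][1]         # 456
```

With either comma form, $a[0] yields the whole inner array, which is what the question asks for.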

Related

Why does an array with a single empty array have the length of 0?

The Length property works as expected on all arrays that I test except one weird case:
PS> @(@()).Length
0
It's not that empty arrays are generally omitted though:
PS> @(@(), @()).Length
2
PS> @(@(), @(), @()).Length
3
What's going on?
@(...), the array-subexpression operator, is not an array constructor; it is an array "guarantor" (see next section), and nesting @(...) operations is pointless.
@(@()) is in effect the same as @(), i.e. an empty array of type [object[]].
To unconditionally construct arrays, use ,, the array constructor operator.
To construct an array wrapper for a single object, use the unary form of ,, as Abraham Zinala suggests:
# Create a single-element array whose only element is an empty array.
# Note: The outer enclosure in (...) is only needed in order to
# access the array's .Count property.
(, @()).Count # -> 1
Note that I've used .Count instead of .Length above, which is more PowerShell-idiomatic; .Count works across different collection types. Even though System.Array doesn't directly implement .Count, it does so via the ICollection interface, and PowerShell allows access to interface members without requiring a cast.
Background information:
@(...)'s primary purpose is to ensure that output objects collected from - invariably pipeline-based - commands (e.g., @(Get-ChildItem *.txt)) are always collected as an array (invariably of type [object[]]) - even if ... produces only one output object.
If getting an array is desired, use of @(...) is necessary because collecting output that happens to contain just one object would by default be collected as-is, i.e. not wrapped in an array (this also applies when you use $(...), the subexpression operator); only for multiple output objects is an array used, which is always [object[]]-typed.
Note that PowerShell commands (typically) do not output collections; instead, they stream a (usually open-ended) number of objects one by one to the pipeline; capturing command output therefore requires collecting the streamed objects - see this answer for more information.
@(...)'s secondary purpose is to facilitate defining array literals, e.g. @('foo', 'bar')
Note:
Using @(...) for this purpose was not by original design, but such use became so prevalent that an optimization was implemented in version 5 of PowerShell so that, say, 1, 2 - which is sufficient to declare a 2-element array - may also be expressed as @(1, 2) without unnecessary processing overhead.
On the plus side, @(...) is visually distinctive in general and syntactically convenient specifically for declaring empty (@()) or single-element arrays (e.g. @(42)) - without @(...), these would have to be expressed as [object[]]::new(0) and , 42, respectively.
However, this use of @(...) invites the misconception that it acts as an unconditional array constructor, which isn't the case; in short: wrapping extra @(...) operations around a @(...) operation does not create nested arrays, it is an expensive no-op; e.g.:
@(42) # Single-element array
@(@(42)) # !! SAME - the outer @(...) has no effect.
When @(...) is applied to a (non-array-literal) expression, what this expression evaluates to is sent to the pipeline, which causes PowerShell to enumerate it, if it considers it enumerable;[1] that is, if the expression result is a collection, its elements are sent to the pipeline, one by one, analogous to a command's streaming output, before being collected again in an [object[]] array.
# @(...) causes the [int[]]-typed array to be *enumerated*,
# and its elements are then *collected again*, in an [object[]] array.
$intArray = [int[]] (1, 2)
@($intArray).GetType().FullName # -> !! 'System.Object[]'
To prevent this enumeration and re-collecting:
Use the expression as-is and, if necessary, enclose it just in (...)
To again ensure that an array is returned, an efficient alternative to @(...) is to use an [array] cast; the only caveat is that if the expression evaluates to $null, the result will be $null too ($null -eq [array] $null):
# With an array as input, an [array] cast preserves it as-is.
$intArray = [int[]] (1, 2)
([array] $intArray).GetType().FullName # -> 'System.Int32[]'
# With a scalar as input, a single-element [object[]] array is created.
([array] 42).GetType().FullName # -> 'System.Object[]'
[1] See the bottom section of this answer for an overview of which .NET types PowerShell considers enumerable in the pipeline.
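The "guarantor" role described above can be sketched with a small pipeline. This is only an illustration; the Where-Object filters are stand-ins for any command that happens to emit one object or none:

```powershell
# A pipeline that emits exactly one object is captured as a scalar by default.
$scalar = 'a' | Where-Object { $true }
$scalar.GetType().Name        # String

# The same pipeline inside @(...) is always collected as an array.
$one = @('a' | Where-Object { $true })
$one.GetType().Name           # Object[]
$one.Count                    # 1

# A pipeline with no output at all still yields an (empty) array.
$none = @('a' | Where-Object { $false })
$none.Count                   # 0

# The $null caveat of the [array] cast, compared with @(...):
$null -eq ([array] $null)     # True: the cast passes $null through
@($null).Count                # 1: an array whose single element is $null
```

Code that indexes into or counts command output can therefore rely on @(...) regardless of how many objects the command returned.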

PowerShell $this variable does not work inside array indexing

I'm trying to emulate MATLAB's 'end' indexing keyword (such as A(5:end)) in PowerShell, but I don't want to type the array name (such as $array) to access $array.Length, as in
$array[0..($array.length - 2)]
as discussed in another Stack Overflow question. I tried $this:
(0..7)[4..($this.Length-1)]
but ($this.Length-1) seems to be interpreted as -1, as the output shows:
4
3
2
1
0
7
This makes me think $this is empty when used inside [] indexing an array. Is there a way for an array to refer to itself without explicitly repeating the variable name so I can call the methods of the array to derive the indices? This would be very handy for emulating logical indexing while taking advantage of method chaining (like a.b.c.d[4..end]).
PowerShell doesn't have any facility for referring to "the collection targeted by this index access operator", but if you want to skip the first N items of a collection/enumerable you can use Select -Skip:
0..7 |Select -Skip 4
To complement Mathias' helpful answer:
The automatic $this variable is not available inside index expressions ([...]), only in custom classes (to refer to the instance at hand) and in script blocks acting as .NET event delegates (to refer to the event sender).
However, for what you're trying to achieve you don't need a reference to the input array (collection) as a whole: instead, an abstract notation for referring to indices relative to the end of the input array should suffice, and ideally also for "all remaining elements" logic.
You can use negative indices to refer to indices relative to the end of the input array, but that only works with individual indices:
# OK: individual negative indices; get the last and the 3rd last item:
('a', 'b', 'c', 'd')[-3, -1] # -> 'b', 'd'
Unfortunately, because .. inside an index expression refers to the independent, general-purpose range operator, this does not work for range-based array slicing when negative indices are used as range endpoints:
# !! DOES NOT WORK:
# *Flawed* attempt to get all elements up to and including the 2nd last,
# i.e. to get all elements but the last.
# 0..-2 evaluates to array 0, -1, -2, whose elements then serve as the indices.
('a', 'b', 'c', 'd')[0..-2] # -> !! 'a', 'd', 'c'
That is, the general range operation 0..-2 evaluates to array 0, -1, -2, and the resulting indices are used to extract the elements.
It is this behavior that currently requires an - inconvenient - explicit reference to the array inside the index expression for everything-except-the-last-N-elements logic, such as $array[0..($array.length - 2)] in your question in order to extract all elements except the last one.
GitHub issue #7940 proposes introducing new syntax that addresses this problem, by effectively implementing C#-style ranges:
While no syntax has been agreed on and no commitment has been made to implement this enhancement, borrowing C#'s syntax directly is an option:
| Now | Potential future syntax | Comment |
| --- | --- | --- |
| $arr[1..($arr.Length-2)] | $arr[1..^1] | From the 2nd el. through to the next to last. |
| $arr[1..($arr.Length-1)] | $arr[1..] | Everything from the 2nd el. |
| $arr[0..9] | $arr[..9] | Everything up to the 10th el. |
| $arr[-9..-1] | $arr[^9..] | Everything from the 9th to last el. |
Note the logic of the from-the-end, 1-based index syntax (e.g., ^1 refers to the last element) when serving as a range endpoint: It is up-to-but-excluding logic, so that ..^1 means: up to the index before the last one, i.e. the second to last one.
As for the workarounds:
Using Select-Object with -Skip / -SkipLast is convenient in simple cases, but:
performs poorly compared to index expressions ([...]) (see below)
lacks the flexibility of the latter[1]
A notable limitation is that you cannot use both -Skip and -SkipLast in a single Select-Object call; GitHub issue #11752 proposes removing this limitation.
E.g., in the following example (which complements Mathias's -Skip example), which extracts all elements but the last:
# Get all elements but the last.
$arr | Select-Object -SkipLast 1
Array $arr is enumerated, i.e. its elements are sent one by one through the pipeline, a process known as streaming.
When captured, the streamed elements are collected in a regular, [object[]]-typed PowerShell array, even if the input array is strongly typed - however, this loss of strict typing also applies to extracting multiple elements via [...].
Depending on the size of your arrays and the number of slicing operations needed, the performance difference can be significant.
[1] Notably, you can use arbitrary expressions inside [...], which is discussed in more detail in this answer.
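Until such syntax exists, the explicit-reference workaround can be wrapped in a small function so that the array is named only once at the call site. The following is a sketch only: Get-ArraySlice and its parameters are hypothetical names, not built-in commands, and it assumes non-negative -Skip and -SkipLast values.

```powershell
# Hypothetical helper (not a built-in): returns the elements of an array
# after skipping $Skip items at the start and $SkipLast items at the end.
function Get-ArraySlice {
    param(
        [object[]] $InputArray,
        [int] $Skip = 0,
        [int] $SkipLast = 0
    )
    $lastIndex = $InputArray.Length - 1 - $SkipLast
    # Guard against an inverted range: 4..2 would count *down* (4, 3, 2)
    # instead of selecting nothing.
    if ($Skip -gt $lastIndex) { return , @() }
    # @(...) guarantees an array; the unary comma keeps that array from
    # being enumerated when the function outputs it.
    , @($InputArray[$Skip..$lastIndex])
}

$tail = Get-ArraySlice -InputArray (0..7) -Skip 4                     # 4, 5, 6, 7
$init = Get-ArraySlice -InputArray ('a', 'b', 'c', 'd') -SkipLast 1   # 'a', 'b', 'c'
$mid  = Get-ArraySlice -InputArray (0..7) -Skip 1 -SkipLast 1         # 1 through 6
```

Because it uses an index expression internally rather than the pipeline, this avoids the streaming overhead of Select-Object, though the result is still an [object[]] array.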

How to compare two arrays and their elements in Visual FoxPro?

I have two arrays named atest and NEWARRAY. I tried to compare the elements of the two arrays with a simple IF, but this compares only the first element of each array. How can I compare all the array values at once? Here's my code:
IF (Alltrim(atest)== Alltrim(NEWARRAY))
Messagebox('Success',64,'Status')
Else
Messagebox('MisMatch',16,'Status')
ENDIF
Fox has a few functions that operate on whole arrays - like acopy, ascan and asort - but there is no built-in function that compares whole arrays. So you'll have to do the comparison element per element, for example with a for loop.
And yes, if you use an array name as an expression - including passing it by value - then you'll get the value of the first array element instead. There is one exception, though: when you pass an array to a built-in function in a place where an array parameter is expected then the compiler will automatically emit a reference token under the hood in order to arrange pass-by-reference instead of pass-by-value.
So, if you have a user-defined function f() to which you want to pass an array a then you need to call it like this: f(@m.a) but you can call built-in functions taking arrays like this: alen(a) (since the m. can be left off in this situation as well). In fact, Fox would complain if you coded something like alen(@m.a) or alen(@a), and older Foxen could even crash in such situations.
Conversely, if an array is the target of an assignment like a = 42 or store 42 to a then the value will be assigned to all array elements. This is convenient for initialising arrays to something like 0, '' or .null..
Hence, if you have two arrays a and b then a = b will assign the first value of b to all elements of a, and if a == b will compare the respective first cells only.
Sidenote: should you ever have to compare records from tables with equal or equivalent structure then you should remember to look up compobj(). It does for objects and scatter records what Fox won't do for arrays: it compares them wholesale. That is, it compares the values of properties with matching names and tells you if there's a mismatch, and it does so much faster than hand-crafted code could.
Theoretically you could gather an array into a table/cursor record and then use scatter name Walther to produce a scatter record, which could then be compared to a scatter record named Herbert that was produced in a similar fashion from the contents of the other array: compobj(m.Walther, m.Herbert) would tell you whether the original arrays were equal or not. However, I'd be hard pressed to imagine circumstances where one might use something like that in production code...
You could create a simple procedure like this for comparison:
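Procedure CompareArrays(ta1, ta2)
    * Arrays of different sizes cannot be equal.
    If Alen(ta1) != Alen(ta2)
        Return .F.
    EndIf
    Local ix
    For ix = 1 to Alen(ta1)
        * Elements must match in both type and value.
        If (Type('ta1[m.ix]') != Type('ta2[m.ix]') or ta1[m.ix] != ta2[m.ix])
            Return .F.
        EndIf
    EndFor
    * No mismatch found (this is also the default return value).
    Return .T.
EndProc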
Then pass your arrays by reference, i.e.:
isIdentical = CompareArrays(@laArr1, @laArr2)
If array members could hold objects, you should use compobj for comparison of array elements.

How to create a valid empty JSON array with PowerShell?

Converting arrays to JSON string in PowerShell couldn't be more simple:
@(1,2,3) | ConvertTo-Json
Produces:
[
1,
2,
3
]
However, if the array is empty, the result is an empty string:
@() | ConvertTo-Json
This results in an empty string instead of [].
It works without pipelining:
PS C:\> ConvertTo-Json @()
[
]
This is a use case for the unary comma operator, https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_operators?view=powershell-7.2
The pipeline is breaking the array apart. In the first example, the array is broken apart into its integer values as it passes through the pipeline (I'm not sure what the actual mechanics of it being reassembled on the other side are: whether the pipeline itself sees three integers grouped together, or whether the receiving cmdlet does the interpretation). Regardless, the unary operator ',' creates an array of one element. When used as either , @(1, 2, 3) or , @(), it creates a container array, which the pipeline still breaks apart, but the sub-arrays are now the objects being passed through, so they are preserved for the ConvertTo-Json cmdlet to interpret properly. Assuming your array is stored in a variable like $myArray, the following general code will work for all situations:
, $myArray | ConvertTo-Json
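The three variants can be compared side by side. This is a sketch assuming $myArray holds an empty array; -Compress is added here only so that the resulting text does not depend on version-specific whitespace:

```powershell
$myArray = @()   # illustrative variable holding an empty array

# Piping enumerates the array; with no elements, ConvertTo-Json receives
# no input and produces no output at all.
$piped = $myArray | ConvertTo-Json
$null -eq $piped      # True

# Passing the array as an argument avoids the enumeration.
$direct = ConvertTo-Json -InputObject $myArray -Compress
$direct               # []

# The unary-comma form from the answer: the wrapper array is enumerated,
# so the original array arrives at the cmdlet as a single object.
$wrapped = , $myArray | ConvertTo-Json -Compress
```

Strictly speaking, the piped form yields no output rather than an empty string, which is why testing the result against $null is the reliable check.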

Smalltalk Array Types

When looking at Smalltalk syntax definitions I noticed a few different notations for arrays:
#[] "ByteArray"
#() "Literal Array"
{} "Array"
Why are there different array types? In other programming languages I know there's only one kind of array independent of the stored type.
When to choose which kind?
Why do literal array and array have a different notation but same class?
There's a bit of terminological confusion in Michael's answer: #() is a literal array, whereas {} is not. A literal array is one created by the compiler, and it can contain any other literal value (including other literal arrays), so the following is a valid literal array:
#(1 #blah nil ('hello' 3.14 true) $c [1 2 3])
On the other hand {} is merely a syntactic sugar for runtime array creation, so { 1+2. #a. anObject} is equivalent to:
(Array new: 3) at: 1 put: 1 + 2; at: 2 put: #a; at: 3 put: anObject; yourself
Here's a little walkthrough:
Firstly, we can find out the types resp. classes of the resulting objects:
#[] class results in ByteArray
#() class results in Array
{} class also results in Array
So apparently the latter two produce Arrays while the first produces a ByteArray. ByteArrays are what you would expect -- fixed sized arrays of bytes.
Now we'll have to figure out the difference between #() and {}. Try evaluating #(a b c), it results in #(#a #b #c); however when you try to evaluate {a b c}, it doesn't work (because a is not defined). The working version would be {#a. #b. #c}, which also results in #(#a #b #c).
The difference between #() and {} is that the first takes a list of Symbol names separated by spaces. You're also allowed to omit the # signs. Using this notation you can only create Arrays that contain Symbols. The second version is the generic Array literal. It takes any expressions, separated by . (dots). You can even write things like {1+2. anyObject complexOperation}.
This could lead you to always use the {} notation. However, there are some things to keep in mind. The moment of object creation differs: while #() Arrays are created during compilation, {} Arrays are created during execution. Thus, when you run code with a #() expression, it will always return the same Array, while {} only returns equal Arrays (as long as you are using equal contents). Also, AFAIK the {} notation is not necessarily portable because it's not part of the ST-80 standard.

Resources