How to create a valid empty JSON array with PowerShell?

Converting arrays to a JSON string in PowerShell couldn't be simpler:
@(1,2,3) | ConvertTo-Json
Produces:
[
1,
2,
3
]
However, if the array is empty, the result is an empty string:
@() | ConvertTo-Json
This results in an empty string instead of [].

It works without pipelining:
PS C:\> ConvertTo-Json @()
[
]

This is a use case for the unary comma operator, https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_operators?view=powershell-7.2
The pipeline is breaking the array apart. In the first example, the array is enumerated into its integer values as it passes through the pipeline (I'm not sure of the actual mechanics of how it is reassembled on the other side: whether the pipeline itself treats the three integers as a group, or whether the receiving cmdlet does the interpretation). Regardless, the unary operator ',' creates an array of one element. When used as either , @(1, 2, 3) or , @() it creates a container array, which the pipeline still breaks apart, but the sub-array is now the object being passed through, so it reaches ConvertTo-Json intact and is converted properly. Assuming your array is stored in a variable like $myArray, the following general code will work for all situations:
, $myArray | ConvertTo-Json
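As a quick check of the behavior described above, the following sketch compares the three call styles on an empty array. The variable names are illustrative only, and -Compress is used just to keep each result on one line.

```powershell
# Illustrative variable; any empty array behaves the same way.
$myArray = @()

# Piping the array directly: it is enumerated, so ConvertTo-Json receives no input.
$unwrapped = $myArray | ConvertTo-Json -Compress

# Passing the array as an argument avoids the pipeline entirely.
$viaParameter = ConvertTo-Json -InputObject $myArray -Compress

# Wrapping with the unary comma sends the array through the pipeline as one object.
$viaPipeline = , $myArray | ConvertTo-Json -Compress

"unwrapped is empty: $([string]::IsNullOrEmpty($unwrapped))"
"via parameter:      $viaParameter"
"via pipeline:       $viaPipeline"
```

The -InputObject form and the unary comma form are two routes to the same goal: handing the array to ConvertTo-Json as a single object rather than as its individual elements.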

Related

Why does an array with a single empty array have the length of 0?

The Length property works as expected on all arrays that I test except one weird case:
PS> @(@()).Length
0
It's not that empty arrays are generally omitted though:
PS> @(@(), @()).Length
2
PS> @(@(), @(), @()).Length
3
What's going on?

@(...), the array-subexpression operator, is not an array constructor; it is an array "guarantor" (see next section), and nesting @(...) operations is pointless.
@(@()) is in effect the same as @(), i.e. an empty array of type [object[]].
To unconditionally construct arrays, use ,, the array constructor operator.
To construct an array wrapper for a single object, use the unary form of ,, as Abraham Zinala suggests:
# Create a single-element array whose only element is an empty array.
# Note: The outer enclosure in (...) is only needed in order to
# access the array's .Count property.
(, @()).Count # -> 1
Note that I've used .Count instead of .Length above, which is more PowerShell-idiomatic; .Count works across different collection types. Even though System.Array doesn't directly implement .Count, it does so via the ICollection interface, and PowerShell allows access to interface members without requiring a cast.
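To make the difference between the "guarantor" and the constructor concrete, here is a small side-by-side sketch. The variable names are illustrative only.

```powershell
$nested   = @(@())      # outer @(...) enumerates the inner empty array: nothing to collect
$wrapped  = , @()       # unary comma: one element, which is itself an empty array
$combined = @(, @())    # the wrapper built by the comma survives one enumeration

$nested.Count       # -> 0
$wrapped.Count      # -> 1
$combined.Count     # -> 1
$combined[0].Count  # -> 0
```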
Background information:
@(...)'s primary purpose is to ensure that output objects collected from - invariably pipeline-based - commands (e.g., @(Get-ChildItem *.txt)) are always collected as an array (invariably of type [object[]]) - even if ... produces only one output object.
If getting an array is desired, use of @(...) is necessary because collecting output that happens to contain just one object would by default be collected as-is, i.e. not wrapped in an array (this also applies when you use $(...), the subexpression operator); only for multiple output objects is an array used, which is always [object[]]-typed.
Note that PowerShell commands (typically) do not output collections; instead, they stream a (usually open-ended) number of objects one by one to the pipeline; capturing command output therefore requires collecting the streamed objects - see this answer for more information.
@(...)'s secondary purpose is to facilitate defining array literals, e.g. @('foo', 'bar')
Note:
Using @(...) for this purpose was not by original design, but such use became so prevalent that an optimization was implemented in version 5 of PowerShell so that, say, 1, 2 - which is sufficient to declare a 2-element array - may also be expressed as @(1, 2) without unnecessary processing overhead.
On the plus side, @(...) is visually distinctive in general and syntactically convenient specifically for declaring empty (@()) or single-element arrays (e.g. @(42)) - without @(...), these would have to be expressed as [object[]]::new(0) and , 42, respectively.
However, this use of @(...) invites the misconception that it acts as an unconditional array constructor, which isn't the case; in short: wrapping extra @(...) operations around a @(...) operation does not create nested arrays, it is an expensive no-op; e.g.:
@(42) # Single-element array
@(@(42)) # !! SAME - the outer @(...) has no effect.
When @(...) is applied to a (non-array-literal) expression, what this expression evaluates to is sent to the pipeline, which causes PowerShell to enumerate it, if it considers it enumerable;[1] that is, if the expression result is a collection, its elements are sent to the pipeline, one by one, analogous to a command's streaming output, before being collected again in an [object[]] array.
# @(...) causes the [int[]]-typed array to be *enumerated*,
# and its elements are then *collected again*, in an [object[]] array.
$intArray = [int[]] (1, 2)
@($intArray).GetType().FullName # -> !! 'System.Object[]'
To prevent this enumeration and re-collecting:
Use the expression as-is and, if necessary, enclose it just in (...)
To again ensure that an array is returned, an efficient alternative to @(...) is to use an [array] cast; the only caveat is that if the expression evaluates to $null, the result will be $null too ($null -eq [array] $null):
# With an array as input, an [array] cast preserves it as-is.
$intArray = [int[]] (1, 2)
([array] $intArray).GetType().FullName # -> 'System.Int32[]'
# With a scalar as input, a single-element [object[]] array is created.
([array] 42).GetType().FullName # -> 'System.Object[]'
[1] See the bottom section of this answer for an overview of which .NET types PowerShell considers enumerable in the pipeline.

Powershell $this variable does not work inside array indexing

I'm trying to emulate MATLAB's 'end' indexing keyword (such as A[5:end]) in Powershell but I don't want to type the array name (such as $array) to access $array.length for
$array[0..($array.length - 2)]
as discussed in another Stack Overflow question. I tried $this
(0..7)[4..($this.Length-1)]
but ($this.Length-1) seems to be evaluated as -1, as the output shows
4
3
2
1
0
7
This makes me think $this is empty when used inside [] indexing an array. Is there a way for an array to refer to itself without explicitly repeating the variable name so I can call the methods of the array to derive the indices? This would be very handy for emulating logical indexing while taking advantage of method chaining (like a.b.c.d[4..end]).

PowerShell doesn't have any facility for referring to "the collection targeted by this index access operator", but if you want to skip the first N items of a collection/enumerable you can use Select -Skip:
0..7 |Select -Skip 4

To complement Mathias' helpful answer:
The automatic $this variable is not available inside index expressions ([...]), only in custom classes (to refer to the instance at hand) and in script blocks acting as .NET event delegates (to refer to the event sender).
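For contrast, here is a minimal sketch of a context in which $this is defined: a method of a custom class. The class name Slicer and its From method are hypothetical, invented purely for illustration; they are not part of PowerShell.

```powershell
# Hypothetical class, used only to show $this referring to the current instance.
class Slicer {
    [int[]] $Items

    Slicer([int[]] $items) { $this.Items = $items }

    # Returns the elements from index $start through the end of the array.
    [int[]] From([int] $start) {
        # Guard: a range such as 8..7 would run backwards instead of being empty.
        if ($start -ge $this.Items.Length) { return @() }
        return $this.Items[$start..($this.Items.Length - 1)]
    }
}

$slicer = [Slicer]::new((0..7))
$slicer.From(4)   # -> 4, 5, 6, 7
```

Inside the method body, $this gives access to the instance's own Items property, which is exactly the kind of self-reference that an index expression on a plain array does not offer.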
However, for what you're trying to achieve you don't need a reference to the input array (collection) as a whole: instead, an abstract notation for referring to indices relative to the end of the input array should suffice, and ideally also for "all remaining elements" logic.
You can use negative indices to refer to indices relative to the end of the input array, but that only works with individual indices:
# OK: individual negative indices; get the 3rd-last and the last item:
('a', 'b', 'c', 'd')[-3, -1] # -> 'b', 'd'
Unfortunately, because .. inside an index expression refers to the independent, general-purpose range operator, this does not work for range-based array slicing when negative indices are used as range endpoints:
# !! DOES NOT WORK:
# *Flawed* attempt to get all elements up to and including the 2nd last,
# i.e. to get all elements but the last.
# 0..-2 evaluates to array 0, -1, -2, whose elements then serve as the indices.
('a', 'b', 'c', 'd')[0..-2] # -> !! 'a', 'd', 'c'
That is, the general range operation 0..-2 evaluates to array 0, -1, -2, and the resulting indices are used to extract the elements.
It is this behavior that currently requires an - inconvenient - explicit reference to the array inside the index expression for everything-except-the-last-N-elements logic, such as $array[0..($array.length - 2)] in your question in order to extract all elements except the last one.
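The three cases discussed so far can be compared directly in a short sketch; the array contents and variable names are illustrative only.

```powershell
$array = 'a', 'b', 'c', 'd'

# All elements but the last: the end index must be computed from the array itself.
$allButLast = $array[0..($array.Length - 2)]    # -> 'a', 'b', 'c'

# The last three elements: both endpoints are negative, so the range
# -3..-1 expands to -3, -2, -1, and each index counts from the end.
$lastThree = $array[-3..-1]                      # -> 'b', 'c', 'd'

# The flawed form, for comparison: 0..-2 expands to 0, -1, -2.
$flawed = $array[0..-2]                          # -> 'a', 'd', 'c'
```

Note that the first form misbehaves for arrays with fewer than two elements, because the range then runs backwards (for a single-element array, 0..-1 expands to 0, -1).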
GitHub issue #7940 proposes introducing new syntax that addresses this problem, by effectively implementing C#-style ranges:
While no syntax has been agreed on and no commitment has been made to implement this enhancement, borrowing C#'s syntax directly is an option:
Now                        Potential future syntax    Comment
$arr[1..($arr.Length-2)]   $arr[1..^1]                From the 2nd el. through to the next to last.
$arr[1..($arr.Length-1)]   $arr[1..]                  Everything from the 2nd el.
$arr[0..9]                 $arr[..9]                  Everything up to the 10th el.
$arr[-9..-1]               $arr[^9..]                 Everything from the 9th to last el.
Note the logic of the from-the-end, 1-based index syntax (e.g., ^1 refers to the last element) when serving as a range endpoint: It is up-to-but-excluding logic, so that ..^1 means: up to the index before the last one, i.e. the second to last one.
As for the workarounds:
Using Select-Object with -Skip / -SkipLast is convenient in simple cases, but it performs poorly compared to index expressions ([...]) (see below) and lacks the flexibility of the latter.[1]
A notable limitation is that you cannot use both -Skip and -SkipLast in a single Select-Object call; GitHub issue #11752 proposes removing this limitation.
E.g., in the following example (which complements Mathias's -Skip example), which extracts all elements but the last:
# Get all elements but the last.
$arr | Select-Object -SkipLast 1
Array $arr is enumerated, i.e. its elements are sent one by one through the pipeline, a process known as streaming.
When captured, the streamed elements are collected in a regular, [object[]]-typed PowerShell array, even if the input array is strongly typed - however, this loss of strict typing also applies to extracting multiple elements via [...].
Depending on the size of your arrays and the number of slicing operations needed, the performance difference can be significant.
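As a sketch of how the -Skip / -SkipLast limitation can be worked around, two Select-Object calls can be chained. The sample array and variable names are illustrative only.

```powershell
$arr = 0..7

# Drop the first element, then the last one, using two chained calls,
# since -Skip and -SkipLast cannot be combined in a single call.
$viaPipeline = $arr | Select-Object -Skip 1 | Select-Object -SkipLast 1

# The equivalent index expression, which needs the explicit array reference.
$viaIndex = $arr[1..($arr.Length - 2)]

$viaPipeline -join ' '   # -> 1 2 3 4 5 6
$viaIndex -join ' '      # -> 1 2 3 4 5 6
```

Both forms yield the same elements; the pipeline version avoids repeating the array name at the cost of streaming every element through two commands.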
[1] Notably, you can use arbitrary expressions inside [...], which is discussed in more detail in this answer.

Why does Powershell combines array of arrays?

I use arrays of arrays in a Powershell script, however, sometimes, my array of arrays, actually contains only one array.
For some reason, Powershell keeps replacing that array containing one array, by just one array. I don't get it, no other scripting / coding language I ever used has done that before.
Example, this is what I do not want:
PS C:\Users\> $a = @(@(123,456,789))
PS C:\Users\> $a[0]
123
This is what I want:
PS C:\Users\> $a = @(@(123,456,789), @())
PS C:\Users\> $a[0]
123
456
789
Why do I have to force an extra empty array for PowerShell to consider my array of arrays as such when it only contains one array? This is driving me nuts!

You need to put a comma as the first item:
$a = @(, @(123,456,789) )
The reason for this is that the comma is essentially the array construction operator. This MSDN article has more information.

The @() operator interprets its content as statements, not as an expression. Let us add the explicit ; terminators:
@(@(123,456,789;);)
Here is what happens, step by step:
123,456,789: the binary comma operator creates an array with three elements.
Result: array [123,456,789].
123,456,789;: because the expression in this statement returns a collection, PowerShell enumerates the collection and writes its elements (not the collection itself) to the pipeline.
Result: three elements 123, 456 and 789 written to the pipeline.
@(123,456,789;): the array subexpression operator collects all the items written to the pipeline by the nested statements and creates an array from them.
Result: array [123,456,789].
@(123,456,789;);: because the expression in this statement returns a collection, PowerShell enumerates the collection and writes its elements (not the collection itself) to the pipeline.
Result: three elements 123, 456 and 789 written to the pipeline.
@(@(123,456,789;);): the array subexpression operator collects all the items written to the pipeline by the nested statements and creates an array from them.
Result: array [123,456,789].
So, when you write @(collection), PowerShell returns a copy of the collection, not the collection wrapped in a single-element array. If you want to create an array with a single element, use the unary comma operator: ,expression. This creates a single-element array regardless of whether the expression returns a collection.
Also, when you write @(a,b,c,etc), it is the binary comma that creates the array; the array subexpression operator just copies that array. But why would you need a copy? Is there any reason why you cannot use the original array? To avoid making the extra copy, simply omit the @ character: (a,b,c,etc).
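The walkthrough above can be checked with a short sketch that compares the flattened and the wrapped forms. The variable names are illustrative only.

```powershell
$flattened = @(@(123, 456, 789))     # outer @(...) re-collects the same three numbers
$nested    = @(, @(123, 456, 789))   # unary comma wraps the inner array first
$viaComma  = , (123, 456, 789)       # same nesting without any @(...)

$flattened.Count    # -> 3
$nested.Count       # -> 1
$nested[0].Count    # -> 3
$viaComma[0][1]     # -> 456
```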

How to use a 2D array in powershell?

I have a PowerShell script in which I group and store a collection of computer names in a hashtable. Everything works fine so far.
$table = @{}
$pcs = "w7cl002","w7cl002","w7cl001","w7cl002","w7cl008", `
       "w7lp001","w7lp001","w7cl008","w7cl004","w7lp001"
foreach ($pc in $pcs){
    if ($table.Keys -notcontains $pc){
        $Table.Add($pc,1)
    }
    else{
        $occ = $table.get_item($pc) +1
        $table.set_item($pc,$occ)
    }
}
$table
This is what I want and what I get.
Name Value
---- -----
stflw7lp001 3
stflw7cl002 3
stflw7cl004 1
stflw7cl001 1
stflw7cl008 2
Initially, I wanted to do this by using a 2D-Array. But after 5 hours struggling and running my head against a wall, I gave up and realized it with a hash table.
I just would be interested in whether this is possible using a 2D array?

2D arrays (in the C# sense, not the Java sense) are icky in PowerShell. I tend to avoid them. There is no direct language support for them either, and you have to create them with New-Object.
As for your code, you can achieve the same with this:
$pcs = "w7cl002","w7cl002","w7cl001","w7cl002","w7cl008",
       "w7lp001","w7lp001","w7cl008","w7cl004","w7lp001"
$table = @{}
$pcs | Group-Object -NoElement | ForEach-Object { $table[$_.Name] = $_.Count }
No need for awkward loops and code that looks like an unholy combination of C# and VBScript (honestly, none of the actual working lines in your code look like PowerShell, except for the first three and the last).
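As a quick sanity check, the following sketch rebuilds the hashtable with Group-Object and looks up a few counts. It reuses the computer names from the question; nothing else is assumed.

```powershell
$pcs = 'w7cl002', 'w7cl002', 'w7cl001', 'w7cl002', 'w7cl008',
       'w7lp001', 'w7lp001', 'w7cl008', 'w7cl004', 'w7lp001'

# Group identical names, then store each group's name and size in the hashtable.
$table = @{}
$pcs | Group-Object -NoElement | ForEach-Object { $table[$_.Name] = $_.Count }

$table['w7cl002']   # -> 3
$table['w7cl008']   # -> 2
$table.Count        # -> 5 distinct computer names
```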
If you need arrays of arrays instead of a hashtable, you can do so as well:
$result = $pcs | Group-Object -NoElement | ForEach-Object { ,($_.Name, $_.Count) }
However, you really don't want to work this way. For one, you can often get confused whether you have an array or a scalar as an item. Some operations may automatically unroll the array again, etc. IMHO it's much better to work with objects when you can. In this case, just use the result from Group-Object directly. It has handy Name and Count properties that spell out exactly what they are, instead of [0] and [1] which are a bit more opaque.
If you need nicer property names, just project into new objects:
$pcs | group -n | % {
    New-Object PSObject -Property @{
        PC = $_.Name
        Count = $_.Count
    }
}
or
$pcs | group -n | select @{l='PC';e={$_.Name}},Count
Generally, when using PowerShell, think of your data as objects and your problem as operations on those objects. Then try to find how to make those operations happen with things the language already gives you. Manipulation of collection classes, or C-like code with lots of loops almost always looks out of place and wrong in PowerShell – usually for good reason, because the pipeline is often a better, easier, and shorter way of accomplishing the same. I can probably count the number of times I used a for loop in PowerShell over the past few years on both hands, while I wouldn't need both hands to count the number of times I used a foreach loop. Those are rare things to use.

In general, what I end up with in a situation like this is not a 2D array, but an array of tuples, or an array of custom objects. This is what import-csv gives you by default. For my work, this is a good representation of a relation, and many of the transforms I want are some combination of set ops and relational ops.
BTW, I'm only a beginner in Powershell.

Smalltalk Array Types

When looking at Smalltalk syntax definitions I noticed a few different notations for arrays:
#[] "ByteArray"
#() "Literal Array"
{} "Array"
Why are there different array types? In other programming languages I know there's only one kind of array independent of the stored type.
When to choose which kind?
Why do literal array and array have a different notation but same class?

There's a bit of terminological confusion in Michael's answer, #() is a literal array whereas {} is not. A literal array is the one created by the compiler and can contain any other literal value (including other literal arrays) so the following is a valid literal array:
#(1 #blah nil ('hello' 3.14 true) $c [1 2 3])
On the other hand {} is merely a syntactic sugar for runtime array creation, so { 1+2. #a. anObject} is equivalent to:
(Array new: 3) at: 1 put: 1 + 2; at: 2 put: #a; at: 3 put: anObject; yourself

Here's a little walkthrough:
Firstly, we can find out the types resp. classes of the resulting objects:
#[] class results in ByteArray
#() class results in Array
{} class also results in Array
So apparently the latter two produce Arrays while the first produces a ByteArray. ByteArrays are what you would expect -- fixed sized arrays of bytes.
Now we'll have to figure out the difference between #() and {}. Try evaluating #(a b c), it results in #(#a #b #c); however when you try to evaluate {a b c}, it doesn't work (because a is not defined). The working version would be {#a. #b. #c}, which also results in #(#a #b #c).
The difference between #() and {} is, that the first takes a list of Symbol names separated by spaces. You're also allowed to omit the # signs. Using this notation you can only create Arrays that contain Symbols. The second version is the generic Array literal. It takes any expressions, separated by . (dots). You can even write things like {1+2. anyObject complexOperation}.
This could lead you to always using the {} notation. However, there are some things to keep in mind. The moment of object creation differs: while #() Arrays are created during compilation, {} Arrays are created during execution. Thus when you run code with an #() expression, it will always return the same Array, while {} only returns equal Arrays (as long as you are using equal contents). Also, AFAIK the {} notation is not necessarily portable because it's not part of the ST-80 standard.