Tuples/ArrayList of pairs - arrays

I'm essentially trying to create a list of pairs which is proving frustratingly difficult
Note before anyone mentions Hashtables that there will be duplicates which I don't care about.
For example, if I do
$b = #{"dog" = "cat"}
I get
Name Value
---- -----
dog cat
which is good. However, I'm then unable to add the likes of
$b += #{"dog" = "horse"}
Item has already been added. Key in dictionary: 'dog' Key being added: 'dog'
I'm just trying to create a table of data which I can add to using something like .Add() or +=.

I believe a list of hashtables should accomplish what you are after.
$ht = #{foo='bar'}
$list = [System.Collections.Generic.List[hashtable]]::new()
$list.Add($ht)
$list.Add($ht)
$list.Add($ht)
$list

If you want a list of tuples I'd recommend actually building a list of tuples:
$list = #()
$tuple1 = New-Object 'Tuple[String,String]' 'dog', 'cat'
$tuple2 = New-Object 'Tuple[String,String]' 'dog', 'horse'
$list += $tuple1
$list
# Output:
#
# Item1 Item2 Length
# ----- ----- ------
# dog cat 2
$list += $tuple2
$list
# Output:
#
# Item1 Item2 Length
# ----- ----- ------
# dog cat 2
# dog horse 2
Note that in this example $list is a regular array, meaning that, as #mklement0 pointed out in his answer, appending to it with the += assignment operator will re-create the array with size increased by 1, put the new item in the new empty slot, then replace the original array. For a small number of append operations this usually isn't a big issue, but with increasing number of append operations the performance impact becomes significant.
Using an ArrayList instead of a plain array avoids this issue:
$list = New-Object Collections.ArrayList
$tuple1 = New-Object 'Tuple[String,String]' 'dog', 'cat'
$tuple2 = New-Object 'Tuple[String,String]' 'dog', 'horse'
$list.Add($tuple1) | Out-Null
$list
# Output:
#
# Item1 Item2 Length
# ----- ----- ------
# dog cat 2
$list.Add($tuple2) | Out-Null
$list
# Output:
#
# Item1 Item2 Length
# ----- ----- ------
# dog cat 2
# dog horse 2
The Add() method of ArrayList objects outputs the index where the item was appended. Out-Null suppresses that output that is in most cases undesired. If you want to work with these index numbers you can collect them in a variable instead of discarding them ($i = $list.Add($t1)).
If you want to avoid having to specify the type of a new tuple all the time you can wrap it into a reusable function like this:
function New-Tuple {
Param(
[Parameter(Mandatory=$true)]
[ValidateCount(2,2)]
[string[]]$Values
)
New-Object 'Tuple[String,String]' $Values
}
$tuple1 = New-Tuple 'dog', 'cat'
$tuple2 = New-Tuple 'dog', 'horse'
or, in a more generic way, like this:
function New-Tuple {
Param(
[Parameter(
Mandatory=$true,
ValueFromPipeline=$true,
ValueFromPipelineByPropertyName=$true
)]
[ValidateCount(2,20)]
[array]$Values
)
Process {
$types = ($Values | ForEach-Object { $_.GetType().Name }) -join ','
New-Object "Tuple[$types]" $Values
}
}
$tuples = ('dog', 'cat'), ('dog', 'horse') | New-Tuple

Persistent13's helpful answer offers an effective and efficient solution - albeit a slightly obscure one.
Creating an array of hashtables is the simplest solution, though it's important to note that "extending" an array means implicitly recreating it, given that arrays are fixed-size data structures, which can become a performance problem with many iterations:
# Using array-construction operator ",", create a single-element array
# containing a hashtable
$b = , #{ dog = "cat"}
# "Extend" array $b by appending another hashtable.
# Technically, a new array must be allocated that contains the original array's
# elements plus the newly added one.
$b += #{ dog = "horse"}
This GitHub issue discusses a potential future enhancement to make PowerShell natively support an efficiently extensible list-like data type or for it to even default to such a data type. (As of Windows PowerShell v5.1 / PowerShell Core 6.2.0, PowerShell defaults to fixed-size arrays, of type [object[]]).

Thanks to mklement0 as below is probably the simplest solution
$b = , #{ dog = "cat"}
$b += #{ dog = "horse"}
And Persistent13's method also yields an easy route
$ht = #{foo='bar'}
$list = [System.Collections.Generic.List[hashtable]]::new()
$list.Add($ht)
$list.Add($ht)
$list.Add($ht)
$list
I suppose i was just surprised that there wasn't a more ingrained way to get what i feel is a pretty basic/standard object type

Related

How to declare a multidimensional 2D array in Powershell? [duplicate]

I have a way of doing Arrays in other languagues like this:
$x = "David"
$arr = #()
$arr[$x]["TSHIRTS"]["SIZE"] = "M"
This generates an error.
You are trying to create an associative array (hash). Try out the following
sequence of commands
$arr=#{}
$arr["david"] = #{}
$arr["david"]["TSHIRTS"] = #{}
$arr["david"]["TSHIRTS"]["SIZE"] ="M"
$arr.david.tshirts.size
Note the difference between hashes and arrays
$a = #{} # hash
$a = #() # array
Arrays can only have non-negative integers as indexes
from powershell.com:
PowerShell supports two types of multi-dimensional arrays: jagged arrays and true multidimensional arrays.
Jagged arrays are normal PowerShell arrays that store arrays as elements. This is very cost-effective storage because dimensions can be of different size:
$array1 = 1,2,(1,2,3),3
$array1[0]
$array1[1]
$array1[2]
$array1[2][0]
$array1[2][1]
True multi-dimensional arrays always resemble a square matrix. To create such an array, you will need to access .NET. The next line creates a two-dimensional array with 10 and 20 elements resembling a 10x20 matrix:
$array2 = New-Object 'object[,]' 10,20
$array2[4,8] = 'Hello'
$array2[9,16] = 'Test'
$array2
for a 3-dimensioanl array 10*20*10
$array3 = New-Object 'object[,,]' 10,20,10
To extend on what manojlds said above is that you can nest Hashtables. It may not be a true multi-dimensional array but give you some ideas about how to structure the data. An example:
$hash = #{}
$computers | %{
$hash.Add(($_.Name),(#{
"Status" = ($_.Status)
"Date" = ($_.Date)
}))
}
What's cool about this is that you can reference things like:
($hash."Name1").Status
Also, it is far faster than arrays for finding stuff. I use this to compare data rather than use matching in Arrays.
$hash.ContainsKey("Name1")
Hope some of that helps!
-Adam
Knowing that PowerShell pipes objects between cmdlets, it is more common in PowerShell to use an array of PSCustomObjects:
$arr = #(
[PSCustomObject]#{Name = 'David'; Article = 'TShirt'; Size = 'M'}
[PSCustomObject]#{Name = 'Eduard'; Article = 'Trouwsers'; Size = 'S'}
)
Or for older PowerShell Versions (PSv2):
$arr = #(
New-Object PSObject -Property #{Name = 'David'; Article = 'TShirt'; Size = 'M'}
New-Object PSObject -Property #{Name = 'Eduard'; Article = 'Trouwsers'; Size = 'S'}
)
And grep your selection like:
$arr | Where {$_.Name -eq 'David' -and $_.Article -eq 'TShirt'} | Select Size
Or in newer PowerShell (Core) versions:
$arr | Where Name -eq 'David' | Where Article -eq 'TShirt' | Select Size
Or (just get the size):
$arr.Where{$_.Name -eq 'David' -and $_.Article -eq 'TShirt'}.Size
Addendum 2020-07-13
Syntax and readability
As mentioned in the comments, using an array of custom objects is straighter and saves typing, if you like to exhaust this further you might even use the ConvertForm-Csv (or the Import-Csv) cmdlet for building the array:
$arr = ConvertFrom-Csv #'
Name,Article,Size
David,TShirt,M
Eduard,Trouwsers,S
'#
Or more readable:
$arr = ConvertFrom-Csv #'
Name, Article, Size
David, TShirt, M
Eduard, Trouwsers, S
'#
Note: values that contain spaces or special characters need to be double quoted
Or use an external cmdlet like ConvertFrom-SourceTable which reads fixed width table formats:
$arr = ConvertFrom-SourceTable '
Name Article Size
David TShirt M
Eduard Trouwsers S
'
Indexing
The disadvantage of using an array of custom objects is that it is slower than a hash table which uses a binary search algorithm.
Note that the advantage of using an array of custom objects is that can easily search for anything else e.g. everybody that wears a TShirt with size M:
$arr | Where Article -eq 'TShirt' | Where Size -eq 'M' | Select Name
To build an binary search index from the array of objects:
$h = #{}
$arr | ForEach-Object {
If (!$h.ContainsKey($_.Name)) { $h[$_.Name] = #{} }
If (!$h[$_.Name].ContainsKey($_.Article)) { $h[$_.Name][$_.Article] = #{} }
$h[$_.Name][$_.Article] = $_ # Or: $h[$_.Name][$_.Article]['Size'] = $_.Size
}
$h.david.tshirt.size
M
Note: referencing a hash table key that doesn't exist in Set-StrictMode will cause an error:
Set-StrictMode -Version 2
$h.John.tshirt.size
PropertyNotFoundException: The property 'John' cannot be found on this object. Verify that the property exists.
Here is a simple multidimensional array of strings.
$psarray = #(
('Line' ,'One' ),
('Line' ,'Two')
)
foreach($item in $psarray)
{
$item[0]
$item[1]
}
Output:
Line
One
Line
Two
Two-dimensional arrays can be defined this way too as jagged array:
$array = New-Object system.Array[][] 5,5
This has the nice feature that
$array[0]
outputs a one-dimensional array, containing $array[0][0] to $array[0][4].
Depending on your situation you might prefer it over $array = New-Object 'object[,]' 5,5.
(I would have commented to CB above, but stackoverflow does not let me yet)
you could also uses System.Collections.ArrayList to make a and array of arrays or whatever you want.
Here is an example:
$resultsArray= New-Object System.Collections.ArrayList
[void] $resultsArray.Add(#(#('$hello'),2,0,0,0,0,0,0,1,1))
[void] $resultsArray.Add(#(#('$test', '$testagain'),3,0,0,1,0,0,0,1,2))
[void] $resultsArray.Add("ERROR")
[void] $resultsArray.Add(#(#('$var', '$result'),5,1,1,0,1,1,0,2,3))
[void] $resultsArray.Add(#(#('$num', '$number'),3,0,0,0,0,0,1,1,2))
One problem, if you would call it a problem, you cannot set a limit. Also, you need to use [void] or the script will get mad.
Using the .net syntax (like CB pointed above)
you also add coherence to your 'tabular' array...
if you define a array...
and you try to store diferent types
Powershell will 'alert' you:
$a = New-Object 'byte[,]' 4,4
$a[0,0] = 111; // OK
$a[0,1] = 1111; // Error
Of course Powershell will 'help' you
in the obvious conversions:
$a = New-Object 'string[,]' 2,2
$a[0,0] = "1111"; // OK
$a[0,1] = 111; // OK also
Another thread pointed here about how to add to a multidimensional array in Powershell. I don't know if there is some reason not to use this method, but it worked for my purposes.
$array = #()
$array += ,#( "1", "test1","a" )
$array += ,#( "2", "test2", "b" )
$array += ,#( "3", "test3", "c" )
Im found pretty cool solvation for making arrays in array.
$GroupArray = #()
foreach ( $Array in $ArrayList ){
$GroupArray += #($Array , $null)
}
$GroupArray = $GroupArray | Where-Object {$_ -ne $null}
Lent from above:
$arr = ConvertFrom-Csv #'
Name,Article,Size
David,TShirt,M
Eduard,Trouwsers,S
'#
Print the $arr:
$arr
Name Article Size
---- ------- ----
David TShirt M
Eduard Trouwsers S
Now select 'David'
$arr.Where({$_.Name -eq "david"})
Name Article Size
---- ------- ----
David TShirt M
Now if you want to know the Size of 'David'
$arr.Where({$_.Name -eq "david"}).size
M

Powershell Compare 2 Arrays of Hashtables based on a property value

I have one array of hashtables like the one below:
$hashtable1 = #{}
$hashtable1.name = "aaa"
$hashtable1.surname =#()
$hashtable1.surname += "bbb"
$hashtable2 = #{}
$hashtable2.name = "aaa"
$hashtable2.surname =#()
$hashtable2.surname += "ccc"
$hashtable3 = #{}
$hashtable3.name = "bbb"
$hashtable3.surname = #()
$hashtable3.surname += "xxx"
$A = #($hashtable1; $hashtable2; $hashtable3)
I need to iterate though the array and I need to find out duplicates based on hashtable[].name
Then I need to group those hashtable.surname to hashtable[].surname so that the result will be an array of hashtables that will group all for name all the surnames:
$hashtable1.name = "aaa"
$hashtable1.surname = ("bbb","ccc")
$hashtable3.name = "bbb"
$hashtable3.surname = ("xxx")
I was looking into iterating to empty array
+
I have found this link:
powershell compare 2 arrays output if match
but I am not sure on how to reach into the elements of the hashtable.
My options:
I was wondering if -contain can do it.
I have read about compare-object but I am not sure it can be done like that.
(It looks a bit scary in the moment)
I am on PS5.
Thanks for your help,
Aster
You can group your array items by the names using a scriptblock like so.
Once grouped, you can easily build your output to do what you seek.
#In PS 7.0+ you can use Name directly but earlier version requires the use of the scriptblock when dealing with arrays of hashtables.
$Output = $A | Group-Object -Property {$_.Name} | % {
[PSCustomObject]#{
Name = $_.Name
Surname = $_.Group.Surname | Sort-Object -Unique
}
}
Here is the output variable content.
Name Surname
---- -------
aaa {bbb, ccc}
bbb xxx
Note
Improvements have been made in PS 7.0 that allows you to use simply the property name (eg: Name) in Group-Object for arrays of hashtables, just like you would do for any other arrays type. For earlier version though, these particular arrays must be accessed by passing the property in a scriptblock, like so: {$_.Name}
References
MSDN - Group_Object
SS64 - Group Object
Dr Scripto - Use a Script block to create custom groupings in PowerShell

PowerShell: modify elements of array

My cmdlet get-objects returns an array of MyObject with public properties:
public class MyObject{
public string testString = "test";
}
I want users without programming skills to be able to modify public properties (like testString in this example) from all objects of the array.
Then feed the modified array to my second cmdlet which saves the object to the database.
That means the syntax of the "editing code" must be as simple as possible.
It should look somewhat like this:
> get-objects | foreach{$_.testString = "newValue"} | set-objects
I know that this is not possible, because $_ just returns a copy of the element from the array.
So you'd need to acces the elements by index in a loop and then modify the property.This gets really quickly really complicated for people that are not familiar with programming.
Is there any "user-friendly" built-in way of doing this? It shouldn't be more "complex" than a simple foreach {property = value}
I know that this is not possible, because $_ just returns a copy of the element from the array (https://social.technet.microsoft.com/forums/scriptcenter/en-US/a0a92149-d257-4751-8c2c-4c1622e78aa2/powershell-modifying-array-elements)
I think you're mis-intepreting the answer in that thread.
$_ is indeed a local copy of the value returned by whatever enumerator you're currently iterating over - but you can still return your modified copy of that value (as pointed out in the comments):
Get-Objects | ForEach-Object {
# modify the current item
$_.propertyname = "value"
# drop the modified object back into the pipeline
$_
} | Set-Objects
In (allegedly impossible) situations where you need to modify a stored array of objects, you can use the same technique to overwrite the array with the new values:
PS C:\> $myArray = 1,2,3,4,5
PS C:\> $myArray = $myArray |ForEach-Object {
>>> $_ *= 10
>>> $_
>>>}
>>>
PS C:\> $myArray
10
20
30
40
50
That means the syntax of the "editing code" must be as simple as possible.
Thankfully, PowerShell is very powerful in terms of introspection. You could implement a wrapper function that adds the $_; statement to the end of the loop body, in case the user forgets:
function Add-PsItem
{
[CmdletBinding()]
param(
[Parameter(Mandatory,ValueFromPipeline,ValueFromRemainingArguments)]
[psobject[]]$InputObject,
[Parameter(Mandatory)]
[scriptblock]$Process
)
begin {
$InputArray = #()
# fetch the last statement in the scriptblock
$EndBlock = $Process.Ast.EndBlock
$LastStatement = $EndBlock.Statements[-1].Extent.Text.Trim()
# check if the last statement is `$_`
if($LastStatement -ne '$_'){
# if not, add it
$Process = [scriptblock]::Create('{0};$_' -f $Process.ToString())
}
}
process {
# collect all the input
$InputArray += $InputObject
}
end {
# pipe input to foreach-object with the new scriptblock
$InputArray | ForEach-Object -Process $Process
}
}
Now the users can do:
Get-Objects | Add-PsItem {$_.testString = "newValue"} | Set-Objects
The ValueFromRemainingArguments attribute also lets users supply input as unbounded parameter values:
PS C:\> Add-PsItem { $_ *= 10 } 1 2 3
10
20
30
This might be helpful if the user is not used to working with the pipeline
Here's a more general approach, arguably easier to understand, and less fragile:
# $dataSource would be get-object in the OP
# $dataUpdater is the script the user supplies to modify properties
# $dataSink would be set-object in the OP
function Update-Data {
param(
[scriptblock] $dataSource,
[scriptblock] $dataUpdater,
[scriptblock] $dataSink
)
& $dataSource |
% {
$updaterOutput = & $dataUpdater
# This "if" allows $dataUpdater to create an entirely new object, or
# modify the properties of an existing object
if ($updaterOutput -eq $null) {
$_
} else {
$updaterOutput
}
} |
% $dataSink
}
Here are a couple of examples of use. The first example isn't applicable to the OP, but it's being used to create a data set that is applicable (a set of objects with properties).
# Use updata-data to create a set of data with properties
#
$theDataSource = #() # will be filled in by first update-data
update-data {
# data source
0..4
} {
# data updater: creates a new object with properties
New-Object psobject |
# add-member uses hash table created on the fly to add properties
# to a psobject
add-member -passthru -NotePropertyMembers #{
room = #('living','dining','kitchen','bed')[$_];
size = #(320, 200, 250, 424 )[$_]}
} {
# data sink
$global:theDataSource += $_
}
$theDataSource | ft -AutoSize
# Now use updata-data to modify properties in data set
# this $dataUpdater updates the 'size' property
#
$theDataSink = #()
update-data { $theDataSource } { $_.size *= 2} { $global:theDataSink += $_}
$theDataSink | ft -AutoSize
And then the output:
room size
---- ----
living 320
dining 200
kitchen 250
bed 424
room size
---- ----
living 640
dining 400
kitchen 500
bed 848
As described above update-data relies on a "streaming" data source and sink. There is no notion of whether the first or fifteenth element is being modified. Or if the data source uses a key (rather than an index) to access each element, the data sink wouldn't have access to the key. To handle this case a "context" (for example an index or a key) could be passed through the pipeline along with the data item. The $dataUpdater wouldn't (necessarily) need to see the context. Here's a revised version with this concept added:
# $dataSource and $dataSink scripts need to be changed to output/input an
# object that contains both the object to modify, as well as the context.
# To keep it simple, $dataSource will output an array with two elements:
# the value and the context. And $dataSink will accept an array (via $_)
# containing the value and the context.
function Update-Data {
param(
[scriptblock] $dataSource,
[scriptblock] $dataUpdater,
[scriptblock] $dataSink
)
% $dataSource |
% {
$saved_ = $_
# Set $_ to the data object
$_ = $_[0]
$updaterOutput = & $dataUpdater
if ($updaterOutput -eq $null) { $updaterOutput = $_}
$_ = $updaterOutput, $saved_[1]
} |
% $dataSink
}

Powershell array values being overwritten

I'm at a bit of a loss to understand why this code isn't doing what I think it should.
This is part of a larger plan but in essence I'm attempting to fill part of an array with entries from a text file, though I've replaced that bit with a hard coded array. The end result is the same.
$users = 12345,23456,34567,45678,56789,67890
$a = New-Object PSCustomObject
$a = $a|Select ID
$collection = #()
$users|%{
$a.ID = $_
$collection += $a
}
$collection|ft -a
This outputs the following:
ID
--
67890
67890
67890
67890
67890
67890
If you output the array to screen as it gets built, you can watch the values get replaced each time with the most recent entry.
What's the fault? Is there something unusual with the way I'm initialising the $a variable or the array?
It's doing that because there is only one object ($a), and all your loop is doing is changing the value of the one property (ID), and adding another reference to it to the array.
You need to create a new object for each cycle of the loop:
$users = 12345,23456,34567,45678,56789,67890
$a = New-Object PSCustomObject
$a = $a|Select ID
$collection = #()
$users|%{
$a = New-Object PSCustomObject
$a = $a|Select ID
$a.ID = $_
$collection += $a
}
$collection|ft -a
PS is adding a reference to the object itself into the array, rather than the object contents at the time it's added. Changing the value of $a.ID at any point changes the contents displayed by $collection.
To fix this, you can move the initialization for $a within the % statement like so:
$users = 12345,23456,34567,45678,56789,67890
$collection = #()
$users|%{
$a = New-Object PSCustomObject
$a = $a|Select ID
$a.ID = $_
$collection += $a
}
$collection|ft -a
or just simply add $a.ID to $collection if you only want the ID value in that array.
Not quite sure what you're trying to accomplish here, really... but what if you try it like this?
$collection += $a.ID
You create one object ($a). You then add this one object to $collection multiple times. Each time you add it you change its .ID property value. So at the end you have $collection holding many copies of this one object. So you get many copies of that one ID value. Create a new object each time inside the loop if you want multiple objects with different values.

PowerShell array unshift

After reading this helpful article on the Windows PowerShell Blog, I realized I could "shift" off the first part of an array, but didn't know how to "unshift" or push elements onto the front of an array in PowerShell.
I am creating an array of hash objects, with the last read item pushed onto the array first. I'm wondering if there is a better way to accomplish this.
## Create a list of files for this collection, pushing item on top of all other items
if ($csvfiles[$collname]) {
$csvfiles[$collname] = #{ repdate = $date; fileobj = $csv }, $csvfiles[$collname] | %{$_}
}
else {
$csvfiles[$collname] = #{ repdate = $date; fileobj = $csv }
}
A couple of things to note:
I need to unroll the previous array element with a foreach loop %{$_}, because merely referencing the array would create a nested array structure. I have to have all the elements at the same level.
I need to differentiate between an empty array and one that contains elements. If I attempt to unroll an empty array, it places a $null element at the end of the array.
Thoughts?
PS: The reason the empty hash element produces a NULL value is that $null is treated as a scalar in PowerShell. For details, see https://connect.microsoft.com/PowerShell/feedback/details/281908/foreach-should-not-execute-the-loop-body-for-a-scalar-value-of-null.
ANSWER:
Looks like the best solution is to pre-create the empty array when necessary, rather than code around the $null issue. Here's the rewrite using a .NET ArrayList and a native PoSh array.
if (!$csvfiles.ContainsKey($collname)) {
$csvfiles[$collname] = [System.Collections.ArrayList]#()
}
$csvfiles[$collname].insert(0, #{ repdate = $repdate; fileobj = $csv })
## NATIVE POSH SOLUTION...
if (!$csvfiles.ContainsKey($collname)) {
$csvfiles[$collname] = #()
}
$csvfiles[$collname] = #{ repdate = $repdate; fileobj = $csv }, `
$csvfiles[$collname] | %{$_}
You might want to use ArrayList objects instead, as they support insertions at arbitrary locations. For example:
# C:\> $a = [System.Collections.ArrayList]#(1,2,3,4)
# C:\> $a.insert(0,6)
# C:\> $a
6
1
2
3
4
You can simply use a plus operator:
$array = #('bar', 'baz')
$array = #('foo') + $array
Note: this re-creates actually creates a new array instead of changing the existing one (but the $head, $tail = $array way of shifting you refer to works extactly in the same way).

Resources