PowerShell: Compare and merge two arrays

I'm very new to Powershell and what I want to do is to compare two arrays of data and then merge them together to one large array that I can either export to Excel or POST request to a web server using restful API and json.
To do this in Python is simple by using Pandas and adding search data but in Powershell I can't really get it done.
Example: I have 2 arrays
$a1 = @(('Name1', 'Link1', 'URL1'),
('Name2', 'Link2', 'URL2'),
('Name3', 'Link3', 'URL3')
)
$a2 = @(('Name4', 'URL4', 'TEXT4'),
('Name2', 'URL2', 'TEXT1'),
('Name1', 'URL1', 'TEXT2')
)
I want to compare $a1 with $a2, and the other way around, so I don't miss any values.
Then I want to merge them so that I end up with an $a3 array that looks something like this:
$a3 = @(('Name1', 'Link1', 'URL1', 'TEXT1'),
('Name2', 'Link2', 'URL2', 'TEXT2'),
('Name3', 'Link3', 'URL3', ''),
('Name4', 'URL4', '', 'TEXT4')
)
The more I dig into the different alternatives, the more confused I get.

I'd recommend building a collection of (at least) simple objects to work with.
This is an almost classic task: combining two different sets of properties by a common key. Convert those arrays-of-arrays to arrays-of-objects-with-properties first, then combine them.
Please note that PowerShell is primarily a sysadmin's language, so there is almost no room for guesses or chance when processing tasks.
You can also use almost any external .NET library (DLL) in PowerShell (search for Deedle as an alternative to Pandas).
$a1 = @(('Name1', 'Link1', 'URL1'),
('Name2', 'Link2', 'URL2'),
('Name3', 'Link3', 'URL3')
)
$a2 = @(('Name4', 'URL4', 'TEXT4'),
('Name2', 'URL2', 'TEXT1'),
('Name1', 'URL1', 'TEXT2')
)
# Make Hashtable K=Name;V={Name, Link, Url, Text}
$AObjects = @{}
# Fill Hashtable from A1
@($a1) |
ForEach-Object {
$name = $_[0]
$AObjects[$name] = @{
Name = $name
Link = $_[1]
Url = $_[2]
Text = ''
}
}
# Fill Hashtable from A2
@($a2) |
ForEach-Object {
$name = $_[0]
$url = $_[1]
if (-not $AObjects.ContainsKey($name)) {
Write-Warning "A2 has [$name] that was not listed in A1!"
$AObjects[$name] = @{Name = $name; Url = $url; Link = ''; Text = ''}
} elseif($AObjects[$name].Url -ne $url) {
Write-Warning "For $($name), A1's URL and A2's URL differ! Using A1's"
}
$AObjects[$name].Text = $_[2]
}
# Convert Hashtable to array of objects with props. Select is required for better output.
$AObjects = @($AObjects.Values) | % { [PSCustomObject]$_ } | Select-Object @('Name', 'Link', 'URL', 'Text')
# Convert array of objects with props back to array
$result = $AObjects |
ForEach-Object {
if ([string]::IsNullOrWhiteSpace($_.Link)) { # Didn't get your logic swapping elements in example
return @(,@($_.Name, $_.Url, $_.Link, $_.Text))
} else {
return @(,@($_.Name, $_.Link, $_.Url, $_.Text))
}
}
#Print
$result | % { "[$($_ -join ',')]" }
# [Name3,Link3,URL3,]
# [Name1,Link1,URL1,TEXT2]
# [Name2,Link2,URL2,TEXT1]
# [Name4,URL4,,TEXT4]
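If the goal is to export the merged rows rather than turn them back into nested arrays, a variation on the same idea might look like the sketch below. It assumes PowerShell 3.0 or later and uses the first element of each row as the key. The names $merged, $rows and $json are illustrative, and using [ordered] dictionaries (so rows keep their first-seen order) is a swapped-in choice, not part of the answer above.

```powershell
$a1 = @(('Name1', 'Link1', 'URL1'), ('Name2', 'Link2', 'URL2'), ('Name3', 'Link3', 'URL3'))
$a2 = @(('Name4', 'URL4', 'TEXT4'), ('Name2', 'URL2', 'TEXT1'), ('Name1', 'URL1', 'TEXT2'))

# Ordered dictionary keyed by name (assumption: names are unique within each array)
$merged = [ordered]@{}

foreach ($row in $a1) {
    $merged[$row[0]] = [ordered]@{ Name = $row[0]; Link = $row[1]; Url = $row[2]; Text = '' }
}

foreach ($row in $a2) {
    $key = $row[0]
    if (-not $merged.Contains($key)) {
        # Name only present in $a2: no Link is known for it
        $merged[$key] = [ordered]@{ Name = $key; Link = ''; Url = $row[1]; Text = '' }
    }
    $merged[$key]['Text'] = $row[2]
}

# One object per merged row, with properties in a fixed order
$rows = foreach ($entry in $merged.Values) { [pscustomobject]$entry }

# Serialize for a REST call
$json = $rows | ConvertTo-Json
```

From here, $rows could be piped to Export-Csv -NoTypeInformation for Excel, or $json could be sent as a request body with Invoke-RestMethod.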

Related

How to create an array like this in powershell?

I need to store data like below for TextToColumns Excel automation.
I need Code-2, Code-3 or Code-4 to work; is there any way to achieve this?
I have more than 350 entries, so writing them out by hand as in Code-1 isn't practical.
Code-1: working fine
$var = (1,2),(2,2),(3,2),(4,2),(5,2),(6,2)........(300,2)
$ColumnA.texttocolumns($colrange,1,-412,$false,$false,$false,$false,$false,$true,"|",$var)
Code-2: not Working
$var = @((1,2)..(300,2))
$ColumnA.texttocolumns($colrange,1,-412,$false,$false,$false,$false,$false,$true,"|",$var)
Code-3: not Working
$var = @()
#forloop upto 300
{ $var += ($i,2) }
$ColumnA.texttocolumns($colrange,1,-412,$false,$false,$false,$false,$false,$true,"|",$var)
Code-4: not Working
[array]$var = 1..300 | foreach-object { ,@($_, 2) }
$ColumnA.texttocolumns($colrange,1,-412,$false,$false,$false,$false,$false,$true,"|",$var)
I can't fully explain what happens here, but I guess it is related to the fact that TextToColumns requires a (deferred) expression rather than an (evaluated) object.
This means that the following appears to work for the Minimal, Reproducible Example from @mclayton:
$Var = Invoke-Expression ((1..6 |% { "($_, `$xlTextFormat)" }) -Join ',')
And I expect the following to work around the issue in the initial question:
$Var = Invoke-Expression ((1..300 |% { "($_, 2)" }) -Join ',')
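Separately from the COM behaviour discussed above, Code-3 in the question has a second problem that can be checked without Excel: `+=` with a two-element array appends both elements individually, so the result is flat rather than nested. The sketch below (variable names $flat and $nested are illustrative) shows the difference and how the unary comma keeps each pair together. Whether Excel then accepts the nested array built this way is not verified here.

```powershell
# As in Code-3: each pass appends two scalars, so the array ends up flat
$flat = @()
foreach ($i in 1..3) { $flat += ($i, 2) }

# The unary comma wraps the pair, so each pass appends one nested array
$nested = @()
foreach ($i in 1..3) { $nested += , ($i, 2) }

"flat:   $($flat.Count) elements"
"nested: $($nested.Count) elements, each with $($nested[0].Count) items"
```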
Not an answer - just documenting some research to save others some time...
I can repro the issue here with the following code:
$xl = new-object -com excel.application;
$xl.Visible = $true;
$workbook = $xl.Workbooks.Add();
$worksheet = $workbook.Worksheets.Item(1);
$worksheet.Range("A1") = "aaa|111";
$worksheet.Range("A2") = "bbb|222";
$worksheet.Range("A3") = "ccc|333";
$worksheet.Range("A4") = "ddd|444";
$worksheet.Range("A5") = "eee|555";
$worksheet.Range("A6") = "fff|666";
which builds a new spreadsheet with those six values in cells A1:A6.
If you then run the following it will parse the contents of column A and put the results into columns B and C:
$range = $worksheet.Range("A:A");
$target = $worksheet.Range("B1");
# XlColumnDataType enumeration
# see https://learn.microsoft.com/en-us/office/vba/api/excel.xlcolumndatatype
$xlTextFormat = 2;
# XlTextParsingType enumeration
# see https://learn.microsoft.com/en-us/office/vba/api/excel.xltextparsingtype
$xlDelimited = 1;
# XlTextQualifier enumeration
# https://learn.microsoft.com/en-us/office/vba/api/excel.xltextqualifier
$xlTextQualifierNone = -4142;
$var = (1,$xlTextFormat),(2,$xlTextFormat),(3,$xlTextFormat),(4,$xlTextFormat),(5,$xlTextFormat),(6,$xlTextFormat);
# parse the values in A1:A6 and puts the values in a 2-dimensional array starting at B1
# see https://learn.microsoft.com/en-us/office/vba/api/excel.range.texttocolumns
$result = $range.TextToColumns(
$target, # Destination
$xlDelimited, # DataType
$xlTextQualifierNone, # TextQualifier
$false, # ConsecutiveDelimiter
$false, # Tab
$false, # Semicolon
$false, # Comma
$false, # Space
$true, # Other
"|", # OtherChar
$var # FieldInfo
);
which then shows the split values in columns B and C.
However, if you change the declaration for $var to
$var = 1..6 | % { ,@($_, $xlTextFormat) };
you get the following error:
OperationStopped: The remote procedure call failed. (0x800706BE)
and the Excel instance terminates.
So there's something different about these two declarations:
$var = (1,$xlTextFormat),(2,$xlTextFormat),(3,$xlTextFormat),(4,$xlTextFormat),(5,$xlTextFormat),(6,$xlTextFormat);
$var = 1..6 | % { ,@($_, $xlTextFormat) };
but what that is eludes me :-S

Working two Array PowerShell

I have two arrays: array1 [POP1, POP2, POP3 .... POP30] and array2 [61,61,62 ... 61]. I need to create a new object with value 62 and its POP.
In this example:
POP3 62.
I am simplifying the explanation because I've already been able to get the value from the database.
Can someone help me?
Code:
$target = @( )
$ini = 0 | foreach {
$apiurl = "http://xxxxxxxxx:8080/fxxxxp/events_xxxx.xml"
[xml]$ini = (New-Object System.Net.WebClient).downloadstring($apiurl)
$target = $ini.events.event.name
$nodename = $target
$target = $ini.events.event.statuscode
$statuscode = $target
}
$column1 = @($nodename)
$column2 = @($statuscode)
$i = 0
($column1,$column2)[0] | foreach {
New-Object PSObject -Property @{
POP = $Column1[$i]
Status = $column2[$i++]
} | ft -AutoSize
I really couldn't figure out what you were trying to do, but you definitely overcomplicated it. Here is what I thought of your code:
# Here you have an empty array
$target = @( )
# Here you call a foreach, but you don't even need it
$ini = 0 | foreach {
$apiurl = "http://xxxxxxxxx:8080/fxxxxp/events_xxxx.xml"
[xml]$ini = (new-object System.Net.WebClient).downloadstring($apiurl)
# You duplicated variables here. Just set $nodename = $ini.events.event.name
$target = $ini.events.event.name
$nodename = $target
# You duplicated variables here. Just set $statuscode = $ini.events.event.statuscode
$target = $ini.events.event.statuscode
$statuscode = $target
}
# You should already have arrays, so now you're making more arrays, duplicating variables again
$column1 = @($nodename)
$column2 = @($statuscode)
# counter, but you won't need it
$i = 0
# So here, you're making a new array again, but this one contains two nested arrays. I don't get it.
($column1,$column2)[0] | foreach {
New-Object PSObject -Property @{
POP = $Column1[$i]
Status = $column2[$i++]
} | ft -AutoSize
} # You were missing a closing bracket for your foreach loop
Here is a solution that should probably work for you:
# Download the file
$apiurl = "http://xxxxxxxxx:8080/fxxxxp/events_xxxx.xml"
[xml]$ini = (New-Object System.Net.WebClient).DownloadString($apiurl)
# Set arrays
$nodename = $ini.events.event.name
$statuscode = $ini.events.event.statuscode
# Create $TableValues by looping through one array
$TableValues = foreach ( $node in $nodename )
{
[pscustomobject] @{
# The current node
POP = $node
# use the array method IndexOf
# This should return the position of the current node
# Then use that index to get the matching value of $statuscode
Status = $statuscode[$nodename.IndexOf($node)]
}
}
# Add a custom value
$TableValues += [pscustomobject] @{
POP = 'POP100'
Status = 100
}
$TableValues | Format-Table -AutoSize
Assuming that your intent is to create an array of custom objects constructed from the pairs of corresponding elements of 2 arrays of the same size:
A concise pipeline-based solution (PSv3+; a for / foreach solution would be faster):
$arr1 = 'one', 'two', 'three'
$arr2 = 1, 2, 3
0..$($arr1.Count-1) | % { [pscustomobject] @{ POP = $arr1[$_]; Status = $arr2[$_] } }
This yields:
POP Status
--- ------
one 1
two 2
three 3
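The for-loop alternative mentioned above could be sketched as follows, using sample values shaped like those in the question (POP names and status codes). The variable names $names, $codes, $pairs and $hits are illustrative, and the filter on status 62 is an assumption about what the asker ultimately wanted.

```powershell
# Sample data standing in for the values read from the XML
$names = 'POP1', 'POP2', 'POP3'
$codes = 61, 61, 62

# Pair up elements by index; the loop output is collected into $pairs
$pairs = for ($i = 0; $i -lt $names.Count; $i++) {
    [pscustomobject]@{ POP = $names[$i]; Status = $codes[$i] }
}

# Keep only the entries whose status is 62; @() guarantees an array result
$hits = @($pairs | Where-Object { $_.Status -eq 62 })
$hits | Format-Table -AutoSize
```

Indexing both arrays with the same counter also avoids the IndexOf lookup, which returns the first match and would pick the wrong status if two nodes shared a name.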

Using intermediate variable to work with array (reference type)

I am trying to use the $a variable in this script for intermediate steps so that I don't have to use $array[$array.Count-1] repeatedly, and similarly $prop. However, the values are being overwritten by the last value in the loop.
$guests = Import-Csv -Path C:\Users\shant_000\Desktop\UploadGuest_test.csv
$output = gc '.\Sample Json.json' | ConvertFrom-Json
$array = New-Object System.Collections.ArrayList;
foreach ($g in $guests) {
$array.Add($output);
$a = $array[$array.Count-1];
$a.Username = $g.'EmailAddress';
$a.DisplayName = $g.'FirstName' + ' ' + $g.'LastName';
$a.Password = $g.'LastName' + '123';
$a.Email = $g.'EmailAddress';
foreach ($i in $a.ProfileProperties.Count) {
$j = $i - 1;
$prop = $a.ProfileProperties[$j];
if ($prop.PropertyName -eq "FirstName") {
$prop.PropertyValue = $g.'FirstName';
} elseif ($prop.PropertyName -eq "LastName") {
$prop.PropertyValue = $g.'LastName';
}
$a.ProfileProperties[$j] = $prop;
}
$array[$array.Count-1] = $a;
}
$array;
All array elements are referencing one actual variable: $output.
Create an entirely new object each time by repeating JSON-parsing:
$jsontext = gc '.\Sample Json.json'
..........
foreach ($g in $guests) {
$a = $jsontext | ConvertFrom-Json
# process $a
# ............
$array.Add($a) >$null
}
In case the JSON file is very big and you change only a few parts of it you can use a faster cloning technique on the changed parts (and their entire parent chain) via .PSObject.Copy():
foreach ($g in $guests) {
$a = $output.PSObject.Copy()
# ............
$a.ProfileProperties = $a.ProfileProperties.PSObject.Copy()
# ............
foreach ($i in $a.ProfileProperties.Count) {
# ............
$prop = $a.ProfileProperties[$j].PSObject.Copy();
# ............
}
$array.Add($a) >$null
}
As others have pointed out, appending $output appends a reference to the same single object, so you keep changing the values for all elements in the list. Unfortunately the approach @wOxxOm suggested (which I thought would work at first too) doesn't work if your JSON data structure has nested objects, because Copy() only clones the topmost object while the nested objects remain references to their originals.
Demonstration:
PS C:\> $o = '{"foo":{"bar":42},"baz":23}' | ConvertFrom-Json
PS C:\> $o | Format-Custom *
class PSCustomObject
{
foo =
class PSCustomObject
{
bar = 42
}
baz = 23
}
PS C:\> $o1 = $o
PS C:\> $o2 = $o.PSObject.Copy()
If you change the nested property bar on both $o1 and $o2, both objects end up with whichever value was set last:
PS C:\> $o1.foo.bar = 23
PS C:\> $o2.foo.bar = 24
PS C:\> $o1.foo.bar
24
PS C:\> $o2.foo.bar
24
Only if you change a property of the topmost object you'll get a difference between $o1 and $o2:
PS C:\> $o1.baz = 5
PS C:\> $o.baz
5
PS C:\> $o1.baz
5
PS C:\> $o2.baz
23
While you could do a deep copy it's not as simple and straightforward as one would like to think. Usually it takes less effort (and simpler code) to just create the object multiple times as @PetSerAl suggested in the comments to your question.
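For data that came from JSON in the first place, one commonly used shortcut for a deep copy is to round-trip the object through ConvertTo-Json and ConvertFrom-Json. This is a sketch of that swapped-in technique, not what the answer above had in mind; the -Depth value of 10 is an arbitrary choice for illustration (the default depth is 2, which would truncate deeper structures), and values that JSON cannot represent faithfully may not survive the trip.

```powershell
$o = '{"foo":{"bar":42},"baz":23}' | ConvertFrom-Json

# Serialize and re-parse: every nested object in $deep is a new instance
$deep = $o | ConvertTo-Json -Depth 10 | ConvertFrom-Json

# Changing the nested property on the copy leaves the original untouched
$deep.foo.bar = 99
$originalBar = $o.foo.bar
$copyBar = $deep.foo.bar
"original: $originalBar, copy: $copyBar"
```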
I'd also recommend to avoid appending to an array (or arraylist) in a loop. You can simply echo your objects inside the loop and collect the entire output as a list/array by assigning the loop to a variable:
$json = Get-Content '.\Sample Json.json' -Raw
$array = foreach ($g in $guests) {
$a = $json | ConvertFrom-Json # create new object
$a.Username = $g.'EmailAddress'
...
$a # echo object, so it can be collected in $array
}
Use Get-Content -Raw on PowerShell v3 and newer (or Get-Content | Out-String on earlier versions) to avoid issues with multiline JSON data in the JSON file.
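A self-contained version of this pattern is sketched below, with an inline JSON string and two made-up guests standing in for the question's files (the template and sample values are assumptions for illustration). It also iterates over ProfileProperties directly: `foreach ($i in $a.ProfileProperties.Count)` in the question loops exactly once, because the count is a single integer rather than a range.

```powershell
# Stand-ins for the JSON template file and the CSV import
$json = '{"Username":"","ProfileProperties":[{"PropertyName":"FirstName","PropertyValue":""},{"PropertyName":"LastName","PropertyValue":""}]}'
$guests = @(
    [pscustomobject]@{ EmailAddress = 'a@example.com'; FirstName = 'Ann'; LastName = 'Lee' },
    [pscustomobject]@{ EmailAddress = 'b@example.com'; FirstName = 'Bob'; LastName = 'Ray' }
)

$array = foreach ($g in $guests) {
    $a = $json | ConvertFrom-Json          # fresh object graph per guest
    $a.Username = $g.EmailAddress
    foreach ($prop in $a.ProfileProperties) {
        # $prop is a reference to the nested object, so no write-back is needed
        if ($prop.PropertyName -eq 'FirstName') { $prop.PropertyValue = $g.FirstName }
        elseif ($prop.PropertyName -eq 'LastName') { $prop.PropertyValue = $g.LastName }
    }
    $a                                     # emit so it is collected in $array
}
```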

How to search a collection with an array in Powershell

I have an array of MailItems from Outlook. I want to search each mail item and return the Subject and a Category based on a list of search terms contained in an array - in the example called $searchArray.
For example:
$mailbox = "my.mailbox@example.com"
$outlook = New-Object -com Outlook.Application
$ns = $outlook.GetNamespace("MAPI")
$inbox = $ns.Folders.Item($mailbox).Folders.Item("Inbox")
$searchItems = $inbox.Folders.Item("MySubFolder").Folders.Item("MyNestedSubFolder").Items
$searchArray = (
"Category1", ("searchTerm1","searchTerm2","searchTerm3"),
"Category2", ("searchTerm4","searchTerm5"),
"Category3", ("searchTerm6")
)
foreach ($msg in $searchItems) {
$msg | select Subject, # <Category where email address contains one of the search terms>
}
I want to return the Subject, and then a column called Category which will look at the $msg.SenderEmailAddress and if any of the searchTerms in the $searchArray is contained within the address, return the category that had that search term in it.
For example if one of the SenderEmailAddress values was "searchTerm2#somewhere.com" then return Category1 as the Category.
I would flip that array on its head and create a hashtable from it. Then use the first matching search term as a lookup key for the category:
$searchArray = (
"Category1", ("searchTerm1","searchTerm2","searchTerm3"),
"Category2", ("searchTerm4","searchTerm5"),
"Category3", ("searchTerm6")
)
# Create hashtable
$searchTable = @{}
# Populate hash table with values from array
for($i=0;$i-lt$searchArray.Count;$i+=2){
foreach($term in $searchArray[$i+1])
{
$searchTable[$term] = $searchArray[$i]
}
}
# Select Category based on first matching search term
$msg |Select-Object Subject,@{Name="Category";Expression={
$sender = $_.SenderEmailAddress
$searchTable[$($searchTable.Keys |Where-Object{$sender -like "*$_*"} |Select -First 1)]
}
}
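To try the lookup without Outlook, the sketch below runs the same approach against two made-up message objects (their Subject and SenderEmailAddress values are assumptions for illustration). It uses $address instead of $sender for the local variable and skips the table lookup when no term matches; both are small swapped-in choices rather than part of the answer above.

```powershell
$searchArray = (
    "Category1", ("searchTerm1","searchTerm2","searchTerm3"),
    "Category2", ("searchTerm4","searchTerm5"),
    "Category3", ("searchTerm6")
)

# Flip the array into a term -> category lookup table
$searchTable = @{}
for ($i = 0; $i -lt $searchArray.Count; $i += 2) {
    foreach ($term in $searchArray[$i + 1]) { $searchTable[$term] = $searchArray[$i] }
}

# Fake mail items standing in for the Outlook collection
$msgs = @(
    [pscustomobject]@{ Subject = 'Hello'; SenderEmailAddress = 'searchTerm2@somewhere.com' },
    [pscustomobject]@{ Subject = 'Other'; SenderEmailAddress = 'nobody@elsewhere.com' }
)

$result = $msgs | Select-Object Subject, @{ Name = 'Category'; Expression = {
    $address = $_.SenderEmailAddress
    # First search term contained in the address, if any
    $hit = $searchTable.Keys | Where-Object { $address -like "*$_*" } | Select-Object -First 1
    if ($hit) { $searchTable[$hit] }
} }
$result | Format-Table -AutoSize
```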
You still need a calculated property, just as Mathias did (it's really the simple way). However, I wanted to show an approach that uses an array of custom objects for $searchArray. If you were to build it from scratch, it would look like this. I also converted the terms into regex patterns, since you say they are unique. The only caveat is that you need to be sure there are no regex metacharacters in your search terms.
$searchArray = (
[pscustomobject]@{
Category = "1"
Pattern = "searchTerm1|searchTerm2|searchTerm3"
},
[pscustomobject]@{
Category = "2"
Pattern = "searchTerm4|searchTerm5"
},
[pscustomobject]@{
Category = "3"
Pattern = "searchTerm6"}
)
foreach ($msg in $searchItems) {
$msg | select Subject, @{
Name="Category";
Expression={$searchArray | Where-Object{$msg.SenderEmailAddress -match $_.pattern } | Select-Object -ExpandProperty Category}
}
}
This solution depends on PowerShell 3.0 because of the [pscustomobject] type accelerator. It could easily be brought back to 2.0 if need be.
To showcase a similar structure using 2.0, with automatic conversion of your array into one that works with my code:
$newSearchArray = for($categoryIndex=0;$categoryIndex-lt$searchArray.Count;$categoryIndex+=2){
New-Object -TypeName pscustomobject -Property @{
Category = $searchArray[$categoryIndex]
Pattern = ($searchArray[$categoryIndex+1] | ForEach-Object{[regex]::Escape($_)}) -join "|"
}
}
Now the search terms are automatically escaped and joined into a search pattern.
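Why the escaping matters can be seen with terms that contain regex metacharacters. In the sketch below the two terms and the test addresses are made up for illustration; it compares a pattern joined from the raw terms with one joined from the escaped terms.

```powershell
$terms = 'first.last', 'sales+eu'

# Raw terms: '.' matches any character and '+' acts as a quantifier
$naive = $terms -join '|'

# Escaped terms: every character is matched literally
$escaped = ($terms | ForEach-Object { [regex]::Escape($_) }) -join '|'

$address = 'firstXlast@example.com'
$naiveHit   = $address -match $naive      # matches, although the term is not in the address
$escapedHit = $address -match $escaped    # no match
$literalHit = 'sales+eu@example.com' -match $escaped   # literal term still matches

"naive: $naiveHit, escaped: $escapedHit, literal: $literalHit"
```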
Using a switch:
$mailbox = "my.mailbox@example.com"
$outlook = New-Object -com Outlook.Application
$ns = $outlook.GetNamespace("MAPI")
$inbox = $ns.Folders.Item($mailbox).Folders.Item("Inbox")
$searchItems = $inbox.Folders.Item("MySubFolder").Folders.Item("MyNestedSubFolder").Items
foreach ($msg in $searchItems) {
$object = $msg | select Subject,Category # <Category where email address contains one of the search terms>
Switch -Regex ($msg.SenderEmailAddress)
{
'searchTerm1|searchTerm2|searchTerm3' { $object.Category = 'Category1' ; break }
'searchTerm4|searchTerm5' { $object.Category = 'Category2' ; break }
'searchTerm6' { $object.Category = 'Category3' ; break }
}
$object
}

PowerShell: modify elements of array

My cmdlet get-objects returns an array of MyObject with public properties:
public class MyObject{
public string testString = "test";
}
I want users without programming skills to be able to modify public properties (like testString in this example) from all objects of the array.
Then feed the modified array to my second cmdlet which saves the object to the database.
That means the syntax of the "editing code" must be as simple as possible.
It should look somewhat like this:
> get-objects | foreach{$_.testString = "newValue"} | set-objects
I know that this is not possible, because $_ just returns a copy of the element from the array.
So you'd need to access the elements by index in a loop and then modify the property. This gets complicated really quickly for people who are not familiar with programming.
Is there any "user-friendly" built-in way of doing this? It shouldn't be more "complex" than a simple foreach {property = value}
I know that this is not possible, because $_ just returns a copy of the element from the array (https://social.technet.microsoft.com/forums/scriptcenter/en-US/a0a92149-d257-4751-8c2c-4c1622e78aa2/powershell-modifying-array-elements)
I think you're misinterpreting the answer in that thread.
$_ is indeed a local copy of the value returned by whatever enumerator you're currently iterating over - but you can still return your modified copy of that value (as pointed out in the comments):
Get-Objects | ForEach-Object {
# modify the current item
$_.propertyname = "value"
# drop the modified object back into the pipeline
$_
} | Set-Objects
In (allegedly impossible) situations where you need to modify a stored array of objects, you can use the same technique to overwrite the array with the new values:
PS C:\> $myArray = 1,2,3,4,5
PS C:\> $myArray = $myArray |ForEach-Object {
>>> $_ *= 10
>>> $_
>>>}
>>>
PS C:\> $myArray
10
20
30
40
50
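The distinction between the two cases can be checked directly. In the sketch below, [pscustomobject] instances stand in for MyObject (an assumption: MyObject is declared as a class in the question, so it is a reference type as well), and a plain integer array shows the value-type case.

```powershell
# Reference type: $_ refers to the same object that is stored in the array,
# so setting a property is visible through the array afterwards
$objects = @(
    [pscustomobject]@{ testString = 'test' },
    [pscustomobject]@{ testString = 'test' }
)
$objects | ForEach-Object { $_.testString = 'newValue' }

# Value type: $_ holds a copy of the integer, so the stored values are unchanged
$numbers = 1, 2, 3
$numbers | ForEach-Object { $_ *= 10 }

"objects: $($objects.testString -join ', ')"
"numbers: $($numbers -join ', ')"
```

So for objects, the property change itself already sticks; emitting `$_` again is only needed so that the next command in the pipeline receives something.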
That means the syntax of the "editing code" must be as simple as possible.
Thankfully, PowerShell is very powerful in terms of introspection. You could implement a wrapper function that adds the $_; statement to the end of the loop body, in case the user forgets:
function Add-PsItem
{
[CmdletBinding()]
param(
[Parameter(Mandatory,ValueFromPipeline,ValueFromRemainingArguments)]
[psobject[]]$InputObject,
[Parameter(Mandatory)]
[scriptblock]$Process
)
begin {
$InputArray = @()
# fetch the last statement in the scriptblock
$EndBlock = $Process.Ast.EndBlock
$LastStatement = $EndBlock.Statements[-1].Extent.Text.Trim()
# check if the last statement is `$_`
if($LastStatement -ne '$_'){
# if not, add it
$Process = [scriptblock]::Create('{0};$_' -f $Process.ToString())
}
}
process {
# collect all the input
$InputArray += $InputObject
}
end {
# pipe input to foreach-object with the new scriptblock
$InputArray | ForEach-Object -Process $Process
}
}
Now the users can do:
Get-Objects | Add-PsItem {$_.testString = "newValue"} | Set-Objects
The ValueFromRemainingArguments attribute also lets users supply input as unbounded parameter values:
PS C:\> Add-PsItem { $_ *= 10 } 1 2 3
10
20
30
This might be helpful if the user is not used to working with the pipeline
Here's a more general approach, arguably easier to understand, and less fragile:
# $dataSource would be get-object in the OP
# $dataUpdater is the script the user supplies to modify properties
# $dataSink would be set-object in the OP
function Update-Data {
param(
[scriptblock] $dataSource,
[scriptblock] $dataUpdater,
[scriptblock] $dataSink
)
& $dataSource |
% {
$updaterOutput = & $dataUpdater
# This "if" allows $dataUpdater to create an entirely new object, or
# modify the properties of an existing object
if ($updaterOutput -eq $null) {
$_
} else {
$updaterOutput
}
} |
% $dataSink
}
Here are a couple of examples of use. The first example isn't applicable to the OP, but it's being used to create a data set that is applicable (a set of objects with properties).
# Use Update-Data to create a set of data with properties
#
$theDataSource = @() # will be filled in by first Update-Data
update-data {
# data source
0..3
} {
# data updater: creates a new object with properties
New-Object psobject |
# add-member uses hash table created on the fly to add properties
# to a psobject
add-member -passthru -NotePropertyMembers @{
room = @('living','dining','kitchen','bed')[$_];
size = @(320, 200, 250, 424 )[$_]}
} {
# data sink
$global:theDataSource += $_
}
$theDataSource | ft -AutoSize
# Now use Update-Data to modify properties in the data set
# this $dataUpdater updates the 'size' property
#
$theDataSink = @()
update-data { $theDataSource } { $_.size *= 2} { $global:theDataSink += $_}
$theDataSink | ft -AutoSize
And then the output:
room size
---- ----
living 320
dining 200
kitchen 250
bed 424
room size
---- ----
living 640
dining 400
kitchen 500
bed 848
As described above update-data relies on a "streaming" data source and sink. There is no notion of whether the first or fifteenth element is being modified. Or if the data source uses a key (rather than an index) to access each element, the data sink wouldn't have access to the key. To handle this case a "context" (for example an index or a key) could be passed through the pipeline along with the data item. The $dataUpdater wouldn't (necessarily) need to see the context. Here's a revised version with this concept added:
# $dataSource and $dataSink scripts need to be changed to output/input an
# object that contains both the object to modify, as well as the context.
# To keep it simple, $dataSource will output an array with two elements:
# the value and the context. And $dataSink will accept an array (via $_)
# containing the value and the context.
function Update-Data {
param(
[scriptblock] $dataSource,
[scriptblock] $dataUpdater,
[scriptblock] $dataSink
)
% $dataSource |
% {
$saved_ = $_
# Set $_ to the data object
$_ = $_[0]
$updaterOutput = & $dataUpdater
if ($updaterOutput -eq $null) { $updaterOutput = $_}
, @($updaterOutput, $saved_[1]) # emit the (value, context) pair so that $dataSink receives it
} |
% $dataSink
}
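A self-contained sketch of the context idea, with a usage example, is shown below. The function name Update-DataWithContext, the sample $rooms data and the $byIndex table are all hypothetical, and running the updater through ForEach-Object -InputObject (so that $_ is bound to the value) is a swapped-in way of achieving what the code above does by reassigning $_.

```powershell
function Update-DataWithContext {
    param(
        [scriptblock] $dataSource,
        [scriptblock] $dataUpdater,
        [scriptblock] $dataSink
    )
    & $dataSource | ForEach-Object {
        $value   = $_[0]
        $context = $_[1]
        # Run the updater with $_ bound to the value only
        $updated = ForEach-Object -InputObject $value -Process $dataUpdater
        if ($null -eq $updated) { $updated = $value }
        # Emit the pair as a single pipeline item
        , @($updated, $context)
    } | ForEach-Object $dataSink
}

$rooms = @(
    [pscustomobject]@{ room = 'living'; size = 320 },
    [pscustomobject]@{ room = 'dining'; size = 200 }
)
$byIndex = @{}

Update-DataWithContext {
    # data source: emit (value, index) pairs
    for ($i = 0; $i -lt $rooms.Count; $i++) { , @($rooms[$i], $i) }
} {
    # data updater: sees only the value
    $_.size *= 2
} {
    # data sink: receives the pair and uses the index as the key
    $byIndex[$_[1]] = $_[0].size
}
```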
