Powershell - remove duplicates from 2 arrays using wildcards - arrays

I am trying to remove duplicates and leave only unique entries from the output of 2 queries.
I am pulling a list of installed Windows Updates using the following (also stripping 12 chars of whitespace and dropping to lower case):
$A = @(Get-HotFix | select-object @{Expression={$_.HotFixID.ToLower()}} | ft -hidetableheaders | Out-String) -replace '\s{12}',''
I am then querying a list of available files in a folder and stripping 3 trailing whitespace chars using:
$B = @(Get-ChildItem D:\y | select-object 'Name' | ft -hidetableheaders | Out-String) -replace '\s{3}',''
The problem I have is that the first query ($A) returns output like:
kb4040981
kb4041693
kb2345678
kb8765432
While the second query ($B) returns output like:
windows8.1-kb4040981-x64_d1eb05bc8c55c7632779086079c7759f40d7386f.cab
windows8.1-kb4041687-x64_3bdf264bcfc0dda01c2eaf2135e322d2d6ce6c64.cab
windows8.1-kb4041693-x64_359b7ac71a48e5af003d67e3e4b80120a2f5b570.cab
windows8.1-kb4049179-x64_e6ec21d5d16fa6d8ff890c0c6042c2ba38a1f7c4.cab
I need to compare the 2 outputs using wildcards around each entry in the $A array (I think), and where it exists in $B remove the entire line from $B array.
I cannot truncate the output of $B as I need to use the full filenames in a subsequent process.
IE in the example output above, the entire FIRST and THIRD lines would be removed from the $B array and the other lines left intact.
I have tried numerous methods from online searches, and used foreach loops, all to no avail.
Thank you in advance for any assistance.

What did you try with foreach loops that didn't work? Unless your output is huge, this method is pretty straightforward.
$a = "kb4040981","kb4041693","kb2345678","kb8765432","test"
[System.Collections.ArrayList]$b = "windows8.1-kb4040981-x64_d1eb05bc8c55c7632779086079c7759f40d7386f.cab","windows8.1-kb4041687-x64_3bdf264bcfc0dda01c2eaf2135e322d2d6ce6c64.cab","windows8.1-kb4041693-x64_359b7ac71a48e5af003d67e3e4b80120a2f5b570.cab","windows8.1-kb4049179-x64_e6ec21d5d16fa6d8ff890c0c6042c2ba38a1f7c4.cab"
$toRemove = New-Object System.Collections.ArrayList
foreach($kb in $a)
{
foreach($line in $b)
{
if($line -match $kb)
{
write-host "$kb found in: $line" -ForegroundColor Green
$toRemove.add($line) | out-null
}
}
}
foreach($line in $toRemove)
{
$b.Remove($line)
}
$b
Hope it helps.

I would recommend that you take a little time to learn the very basics of PowerShell. When you use format cmdlets and text files instead of objects you cut yourself off from the good stuff. ;-)
Here is how I would start the task:
$A = Get-HotFix
$B = Get-ChildItem D:\y | Select-Object -Property Name,@{Name='HotFixID';Expression={($_.BaseName -split '-')[1]}}
Compare-Object -ReferenceObject $A -DifferenceObject $B -Property 'HotFixID' -PassThru

Sincere thanks to sambardo for his patience and input! The final working solution based on his excellent recommendation is:
$a = (Get-Hotfix).hotfixID
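# Cast to an ArrayList so that .Remove() below works; a plain array is fixed-size
[System.Collections.ArrayList]$b = @((Get-ChildItem D:\y\ -file *.cab).name)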
$toRemove = New-Object System.Collections.ArrayList
foreach($kb in $a)
{
foreach($line in $b)
{
if($line -match $kb)
{
# write-host "$kb found in: $line" -ForegroundColor Green
$toRemove.add($line) | out-null
}
}
}
foreach($line in $toRemove)
{
$b.Remove($line)
}
$b
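For comparison, here is a minimal sketch of the same filtering without nested loops. It joins the KB IDs into one regex alternation and lets -notmatch filter the file names in a single pass. The data is the sample from the question, and the variable names ($installed, $files, $remaining) are only illustrative.

```powershell
# Sample data copied from the question
$installed = 'kb4040981', 'kb4041693', 'kb2345678', 'kb8765432'
$files = @(
    'windows8.1-kb4040981-x64_d1eb05bc8c55c7632779086079c7759f40d7386f.cab',
    'windows8.1-kb4041687-x64_3bdf264bcfc0dda01c2eaf2135e322d2d6ce6c64.cab',
    'windows8.1-kb4041693-x64_359b7ac71a48e5af003d67e3e4b80120a2f5b570.cab',
    'windows8.1-kb4049179-x64_e6ec21d5d16fa6d8ff890c0c6042c2ba38a1f7c4.cab'
)

# Build one pattern of the form kb4040981|kb4041693|...
$pattern = ($installed | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Applied to an array, -notmatch returns only the elements that do NOT match
$remaining = @($files -notmatch $pattern)
$remaining
```

The full file names are kept intact, so $remaining can be passed straight to a later step. Note that this is a substring match, so a shorter ID that happens to be a prefix of a longer one would also match.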

Related

Correct usage of ForEach -parallel in Powershell?

I want to process a large number of URLs and grab the *.jpg file locations.
The problem is that the $entry in the second foreach is not threadsafe.
The script is firing hundreds of errors because the $entry is getting overwritten over and over.
When I move the inner foreach outside of the ForEach-Object, then its working fine but very slowly.
How can I process the split output properly within my ForEach-Object without getting these errors?
$array just contains a huge amount of URLs
$clean_img_array is the output array of the operation
$tmpArray is the reference to $clean_img_array in order to use it within a parallel ForEach
Errors:
InvalidOperation:
Line |
14 | [void]$tmpArray:clean_img_array.Add($entry);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| You cannot call a method on a null-valued expression.
Snippet:
$clean_img_array = [System.Collections.ArrayList]@();
$array | ForEach-Object -Parallel {
$web = Invoke-RestMethod $_;
$i=1;
foreach($entry in $web.Split("`"")){
echo $entry;
if($entry.IndexOf(".jpg") -ne -1 -And $entry.IndexOf("http") -ne -1){
if($entry.IndexOf("?") -ne -1){
$tmpArray = $using:clean_img_array;
[void]$tmpArray.Add($entry.Substring(0, $entry.IndexOf('?')));
}else{
$tmpArray = $using:Clean_img_array;
[void]$tmpArray:clean_img_array.Add($entry);
}
}
}
} -ThrottleLimit 20
Here's a simple example. Both $a and $b are arrays. $b is the result of the parallel loop. It's like example 12 in the docs.
$a = 1..10
$b = $a | foreach-object -parallel { $_ + 1 }
$b
2
3
4
5
6
7
8
9
10
11
Thanks for the support!
I combined the answer from @js2012 with something of my own.
The return alone did not solve the thread-unsafe behavior of $entry, but it clearly untangled the workflow.
So I used the inline version of .foreach with the pipeline variable $_, which appears to be thread safe.
It now runs like a charm and is also very fast for more than 2 million entries.
$array holds the URLs to be processed
$clean_img_array returns the grabbed image URLs
$clean_img_array = $array | ForEach-Object -Parallel {
$web = Invoke-RestMethod $_;
$web.Split("`"").foreach({
if($_ -ne $null){
if($_.IndexOf(".jpg") -ne -1 -And $_.IndexOf("http") -ne -1){
if($_.IndexOf("?") -ne -1){
return $_.Substring(0, $_.IndexOf('?'));
}else{
return $_;
}
}
}
});
} -ThrottleLimit 25
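The pattern used above, emitting results from the parallel script block and collecting them on the outside rather than adding to a shared list, can be sketched without any network access. This assumes PowerShell 7 or later (where -Parallel exists). The $pages strings are made-up stand-ins for what Invoke-RestMethod would return.

```powershell
# Hypothetical stand-ins for downloaded page bodies (no network needed)
$pages = @(
    'src="http://example.com/a.jpg?size=large" alt="x"',
    'href="http://example.com/b.jpg" class="y"',
    'src="http://example.com/c.png"'
)

$images = $pages | ForEach-Object -Parallel {
    foreach ($part in $_.Split('"')) {
        if ($part.Contains('.jpg') -and $part.Contains('http')) {
            # Write to the output stream instead of touching a shared collection;
            # everything before the first '?' is kept
            ($part -split '\?')[0]
        }
    }
} -ThrottleLimit 4

# Parallel output order is not guaranteed, so sort before using it
$images = @($images | Sort-Object)
$images
```

Because each runspace only writes to its own output stream, no $using: reference or locking is needed here.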

Powershell compare one array as substring of another array

I have a text file domains.txt
$domains = 'c:\domains.txt'
$list = Get-Content $domains
google.com
google.js
and an array
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
Anything in $list that ends in something in $array should NOT be in my final list.
So google.com would be in final list but google.js would not.
I found some other stackoverflow code that give me the exact opposite of what I'm looking for but, hah I can't get it reversed!!!!
This gives me the exact opposite of what I want, how do I reverse it?
$domains = 'c:\domains.txt'
$list = Get-Content $domains
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
$found = @{}
$list | % {
$line = $_
foreach ($item in $array) {
if ($line -match $item) { $found[$line] = $true }
}
}
$found.Keys | write-host
this gives me google.js I need it to give me google.com.
I've tried -notmatch etc and can't get it to reverse.
Thanks in advance and the more explanation the better!
Take the .s off, mash the items together into a regex OR, tag on an end-of-string anchor, and filter the domains against it.
$array = @("php","zip","html","htm","js","png","ico","0","jpg")
# build a regex of
# .(php|zip|html|htm|...)$
# and filter the list with it
$list -notmatch "\.($($array -join '|'))`$"
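Anyway, the simple way to invert your result is to filter the list against the keys you found: $list | where { $_ -notin $found.Keys }. Just changing your test to $line -notmatch $item will not do it, because nearly every line fails to match at least one of the items.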
But beware that you are doing a regex match and something like top500.org would match .0 and throw your results out. If you need to match at the end specifically, you need to use something like $line.EndsWith($item).
other solution
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
get-content C:\domains.txt | where {[System.IO.Path]::GetExtension($_) -notin $array}
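The EndsWith suggestion above can be sketched like this. The sample list is made up (it adds top500.org to show that the ".0" entry no longer causes a false hit), and $final is just an illustrative name.

```powershell
# Made-up sample input in place of C:\domains.txt
$list  = 'google.com', 'google.js', 'top500.org', 'site.html'
$array = @(".php", ".zip", ".html", ".htm", ".js", ".png", ".ico", ".0", ".jpg")

$final = @($list | Where-Object {
    $line = $_
    # Keep the line only if no suffix in $array matches the end of it
    -not ($array | Where-Object { $line.EndsWith($_) })
})
$final
```

Unlike -match, EndsWith is a literal, case-sensitive comparison, so the dots are not treated as regex wildcards.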

PowerShell - Not creating Jagged Array within forEach loop

So, I'm having an issue enumerating through a forEach loop in PowerShell (v3) and adding the variable being evaluated, as well as a Test-Connection result into an array. I'm trying to make $arrPing a multi-dimensional array as this will make it easier for me to filter and process the objects in there later in the script, but I'm encountering issues with the code.
My code looks like the following:
$arrPing= @();
$strKioskIpAddress= (Get-WmiObject Win32_NetworkAdapterConfiguration | Where-Object { $_.IPAddress -ne $null }).ipaddress
...FURTHER DOWN THE CODE...
$tmpIpAddress= Select-Xml -Path $dirKioskIpAddresses -XPath '//kiosks/kiosk' | Select-Object -ExpandProperty Node
forEach ( $entry in $tmpIpAddress )
{
if ( $entry -ne $strKioskIpAddress )
{
$result= Test-Connection -ComputerName $entry -Count 1 -BufferSize 16 -Quiet -ErrorAction SilentlyContinue
$arrPing+= @($entry,$result);
}
}
But I'm getting the following output when I display the contents of the $arrPing variable:
PS H:\Documents\PowerShell Scripts> $arrPing
10.216.1.134
True
10.216.1.139
True
10.216.23.230
True
10.216.23.196
False
10.216.23.23
False
Can anyone tell me where I'm going wrong? I have a feeling that this is happening because I'm in a forEach loop but I just can't say for sure...
I would simplify it a bit by using a PSCustomObject:
$Ping = foreach ($Entry in $tmpIpAddress) {
if ($Entry -ne $strKioskIpAddress) {
$TestParams = @{
ComputerName = $Entry
Count = '1'
BufferSize = '16'
Quiet = $true
ErrorAction = 'SilentlyContinue'
}
$Result = Test-Connection @TestParams
[PSCustomObject]@{
Entry = $Entry
Result = $Result
}
}
}
$Ping
To avoid a long row of parameters I've used a technique called splatting.
You are seeing how PowerShell unrolls arrays. The variable is as designed: a large array. However PowerShell, when displaying those, puts each element on its own line. If you do not want that, and especially since you say this data will be used to filter out computers which are not on the network, then you should use PowerShell objects.
if ( $entry -ne $strKioskIpAddress ){
$objPing += New-Object -TypeName psobject -Property @{
Entry = $entry
Result = Test-Connection -ComputerName $entry -Count 1 -BufferSize 16 -Quiet -ErrorAction SilentlyContinue
}
}
Instead of that, though, I would go further and use a different foreach construct which is more pipeline friendly. That way you can use other cmdlets like Export-CSV if you need this output in other locations. Also, like PetSerAl says:
[Y]ou should not use array addition operator and add elements one by one. It [will] create [a] new array (as arrays are not resizable) and copy elements from [the] old one on each operation.
$tmpIpAddress | Where-Object{$_ -ne $strKioskIpAddress} | ForEach-Object{
New-Object -TypeName psobject -Property @{
Entry = $_
Result = Test-Connection -ComputerName $_ -Count 1 -BufferSize 16 -Quiet -ErrorAction SilentlyContinue
}
} | Export-CSV -NoTypeInformation $path
The if is redundant now that we have moved that logic into Where-Object, since you were using it to filter out certain records anyway. That is what Where-Object is good for.
The above code is good for PowerShell 2.0. If you have 3.0 or later then use [pscustomobject] and [ordered]:
$tmpIpAddress | Where-Object{$_ -ne $strKioskIpAddress} | ForEach-Object{
[pscustomobject][ordered]@{
Entry = $_
Result = Test-Connection -ComputerName $_ -Count 1 -BufferSize 16 -Quiet -ErrorAction SilentlyContinue
}
} | Export-CSV -NoTypeInformation $path
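If an actual jagged array is still wanted, as the question title asks, the usual trick is the unary comma operator, which wraps the pair so that += adds it as a single element. This is a minimal sketch with made-up addresses and results in place of Test-Connection. It keeps += only to mirror the question's code, despite the copying cost mentioned above.

```powershell
# Made-up sample data standing in for the kiosk addresses and ping results
$entries = '10.216.1.134', '10.216.23.196'
$status  = $true, $false

$flat   = @()
$jagged = @()
for ($i = 0; $i -lt $entries.Count; $i++) {
    # Without the comma, the two-element array is unrolled into $flat
    $flat   += @($entries[$i], $status[$i])
    # The unary comma wraps the pair so it is appended as ONE element
    $jagged += , @($entries[$i], $status[$i])
}

$flat.Count      # four separate elements
$jagged.Count    # two elements, each itself a two-element array
$jagged[1][0]    # address of the second pair
```

Objects are still easier to filter later, for example with Where-Object { -not $_.Result }, which is why the answers above prefer them.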

PowerShell: Set-Content having issues with "file already in use"

I'm working on a PowerShell script that finds all the files with PATTERN within a given DIRECTORY, prints out the relevant lines of the document with the PATTERN highlighted, and then replaces the PATTERN with a provided REPLACE word, then saves the file back. So it actually edits the file.
Except I can't get it to alter the file, because Windows complains about the file already being open. I tried several methods to solve this, but keep running into the issue. Perhaps someone can help:
param(
[string] $pattern = ""
,[string] $replace = ""
,[string] $directory ="."
,[switch] $recurse = $false
,[switch] $caseSensitive = $false)
if($pattern -eq $null -or $pattern -eq "")
{
Write-Error "Please provide a search pattern." ; return
}
if($directory -eq $null -or $directory -eq "")
{
Write-Error "Please provide a directory." ; return
}
if($replace -eq $null -or $replace -eq "")
{
Write-Error "Please provide a string to replace." ; return
}
$regexPattern = $pattern
if($caseSensitive -eq $false) { $regexPattern = "(?i)$regexPattern" }
$regex = New-Object System.Text.RegularExpressions.Regex $regexPattern
function Write-HostAndHighlightPattern([string] $inputText)
{
$index = 0
$length = $inputText.Length
while($index -lt $length)
{
$match = $regex.Match($inputText, $index)
if($match.Success -and $match.Length -gt 0)
{
Write-Host $inputText.SubString($index, $match.Index) -nonewline
Write-Host $match.Value.ToString() -ForegroundColor Red -nonewline
$index = $match.Index + $match.Length
}
else
{
Write-Host $inputText.SubString($index) -nonewline
$index = $inputText.Length
}
}
}
Get-ChildItem $directory -recurse:$recurse |
Select-String -caseSensitive:$caseSensitive -pattern:$pattern |
foreach {
$file = ($directory + $_.FileName)
Write-Host "$($_.FileName)($($_.LineNumber)): " -nonewline
Write-HostAndHighlightPattern $_.Line
%{ Set-Content $file ((Get-Content $file) -replace ([Regex]::Escape("[$pattern]")),"[$replace]")}
Write-Host "`n"
Write-Host "Processed: $($file)"
}
The issue is located within the final block of code, right at the Get-ChildItem call. Of course, some of the code in that block is now a bit mangled due to me trying to fix the problem then stopping, but keep in mind the intent of that part of the script. I want to get the content, replace the words, then save the altered text back to the file I got it from.
Any help at all would be greatly appreciated.
Removed my previous answer, replacing it with this:
Get-ChildItem $directory -recurse:$recurse |
foreach {
$file = $_.FullName
(Get-Content $file) | Foreach-object {
$_ -replace ([Regex]::Escape("[$pattern]")),"[$replace]"
} | Set-Content $file
}
Note:
The parentheses around Get-Content ensure the file is slurped in one go (and therefore closed).
The piping to subsequent commands rather than inlining.
Some of your commands have been removed to ensure it's a simple test.
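To see the effect of the parentheses in isolation, here is a small sketch that works on a throwaway file in the temp directory. The file name and its contents are made up for the demonstration.

```powershell
# Create a scratch file with two lines (hypothetical content)
$file = Join-Path ([System.IO.Path]::GetTempPath()) ("demo_{0}.txt" -f [guid]::NewGuid())
Set-Content -Path $file -Value 'foo bar', 'bar baz'

# The parentheses make Get-Content finish reading (and close the file)
# before Set-Content opens the same file for writing
(Get-Content -Path $file) -replace 'bar', 'qux' | Set-Content -Path $file

$after = Get-Content -Path $file
Remove-Item -Path $file
$after
```

Without the parentheses, Get-Content would still be streaming lines from the file when Set-Content tries to open it, which is what produces the "already in use" error.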
Just a suggestion but you might try looking at the documentation for the parameters code block. There is a more efficient way to ensure that a parameter is entered if you require it and to throw an error message if the user doesn't.
About_throw: http://technet.microsoft.com/en-us/library/dd819510.aspx
About_functions_advanced_parameters: http://technet.microsoft.com/en-us/library/dd347600.aspx
And then about using Write-Host all the time: http://powershell.com/cs/blogs/donjones/archive/2012/04/06/2012-scripting-games-commentary-stop-using-write-host.aspx
Alright, I finally sat down and just typed everything sequentially in PowerShell, then used that to make my script.
It was actually really simple;
$items = Get-ChildItem $directory -recurse:$recurse
$items |
foreach {
$file = $_.FullName
$content = get-content $file
$newContent = $content -replace $pattern, $replace
Set-Content $file $newcontent
}
Thanks for all your help guys.

How to remove item from an array in PowerShell?

I'm using Powershell 1.0 to remove an item from an Array. Here's my script:
param (
[string]$backupDir = $(throw "Please supply the directory to housekeep"),
[int]$maxAge = 30,
[switch]$NoRecurse,
[switch]$KeepDirectories
)
$days = $maxAge * -1
# do not delete directories with these values in the path
$exclusionList = Get-Content HousekeepBackupsExclusions.txt
if ($NoRecurse)
{
$filesToDelete = Get-ChildItem $backupDir | where-object {$_.PsIsContainer -ne $true -and $_.LastWriteTime -lt $(Get-Date).AddDays($days)}
}
else
{
$filesToDelete = Get-ChildItem $backupDir -Recurse | where-object {$_.PsIsContainer -ne $true -and $_.LastWriteTime -lt $(Get-Date).AddDays($days)}
}
foreach ($file in $filesToDelete)
{
# remove the file from the deleted list if it's an exclusion
foreach ($exclusion in $exclusionList)
{
"Testing to see if $exclusion is in " + $file.FullName
if ($file.FullName.Contains($exclusion)) {$filesToDelete.Remove($file); "FOUND ONE!"}
}
}
I realize that Get-ChildItem in powershell returns a System.Array type. I therefore get this error when trying to use the Remove method:
Method invocation failed because [System.Object[]] doesn't contain a method named 'Remove'.
What I'd like to do is convert $filesToDelete to an ArrayList and then remove items using ArrayList.Remove. Is this a good idea or should I directly manipulate $filesToDelete as a System.Array in some way?
Thanks
The best way to do this is to use Where-Object to perform the filtering and use the returned array.
You can also use @splat to pass multiple parameters to a command (new in V2). If you cannot upgrade (and you should if at all possible), then just collect the output from Get-ChildItem (only repeating that one cmdlet) and do all the filtering in common code.
The working part of your script becomes:
$moreArgs = @{}
if (-not $NoRecurse) {
$moreArgs["Recurse"] = $true
}
$filesToDelete = Get-ChildItem $BackupDir @moreArgs |
where-object {
$fullName = $_.FullName
-not $_.PsIsContainer -and
$_.LastWriteTime -lt $(Get-Date).AddDays($days) -and
-not ($exclusionList | where-object { $fullName.Contains($_) })}
In PowerShell arrays have a fixed size: you cannot add or remove elements in place, but it is very easy to create a new one (operators like += on arrays actually create a new array and return that).
I agree with Richard, that Where-Object should be used here. However, it's harder to read.
What I would propose:
# get $filesToDelete and $exclusionList. In V2 use splatting as proposed by Richard.
$res = $filesToDelete | % {
$file = $_
# ? (Where-Object) returns only the exclusions that match, so the result is empty when none do
$isExcluded = ($exclusionList | ? { $file.FullName.Contains($_) } )
if (!$isExcluded) {
$file
}
}
#the files are in $res
Also note that generally it is not possible to iterate over a collection and change it. You would get an exception.
$a = New-Object System.Collections.ArrayList
$a.AddRange((1,2,3))
foreach($item in $a) { $a.Add($item*$item) }
An error occurred while enumerating through a collection:
At line:1 char:8
+ foreach <<<< ($item in $a) { $a.Add($item*$item) }
+ CategoryInfo : InvalidOperation: (System.Collecti...numeratorSimple:ArrayListEnumeratorSimple) [], RuntimeException
+ FullyQualifiedErrorId : BadEnumeration
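One common way around this, sketched here with made-up numbers, is to walk the list by index from the end, so that removing an element never shifts the positions that are still to be visited.

```powershell
$list = [System.Collections.ArrayList]@(1, 2, 3, 4, 5, 6)

# Iterate backwards by index; no enumerator is involved, so removal is safe
for ($i = $list.Count - 1; $i -ge 0; $i--) {
    if ($list[$i] % 2 -eq 0) {
        $list.RemoveAt($i)   # drop the even numbers
    }
}
$list
```

The other usual option is the one from the first answer: collect the items to remove in a separate list during the loop and remove them afterwards.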
This is ancient. But I wrote these a while ago to add and remove from PowerShell lists using recursion. It leverages the ability of PowerShell to do multiple assignment. That is, you can do $a,$b,$c=@('a','b','c') to assign a, b and c to their variables. Doing $a,$b=@('a','b','c') assigns 'a' to $a and @('b','c') to $b.
First is by item value. It'll remove the first occurrence.
function Remove-ItemFromList ($Item,[array]$List=$(throw "the item $item was not in the list"),[array]$chckd_list=@())
{
if ($list.length -lt 1 ) { throw "the item $item was not in the list" }
$check_item,$temp_list=$list
if ($check_item -eq $item )
{
$chckd_list+=$temp_list
return $chckd_list
}
else
{
$chckd_list+=$check_item
return (Remove-ItemFromList -item $item -chckd_list $chckd_list -list $temp_list )
}
}
This one removes by index. You can probably mess it up good by passing a value to count in the initial call.
function Remove-IndexFromList ([int]$Index,[array]$List,[array]$chckd_list=@(),[int]$count=0)
{
if (($list.length+$count-1) -lt $index )
{ throw "the index is out of range" }
$check_item,$temp_list=$list
if ($count -eq $index)
{
$chckd_list+=$temp_list
return $chckd_list
}
else
{
$chckd_list+=$check_item
return (Remove-IndexFromList -count ($count + 1) -index $index -chckd_list $chckd_list -list $temp_list )
}
}
This is a very old question, but the problem is still valid, but none of the answers fit my scenario, so I will suggest another solution.
In my case, I read in an xml configuration file and I want to remove an element from an array.
[xml]$content = get-content $file
$element = $content.PathToArray | Where-Object {$_.name -eq "ElementToRemove" }
$element.ParentNode.RemoveChild($element)
This is very simple and gets the job done.
