Powershell compare one array as substring of another array - arrays

I have a text file domains.txt
$domains = 'c:\domains.txt'
$list = Get-Content $domains
google.com
google.js
and an array
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
Anything in $list that ends with one of the entries in $array should NOT be in my final list.
So google.com would be in final list but google.js would not.
I found some other Stack Overflow code that gives me the exact opposite of what I'm looking for, but I can't get it reversed. How do I reverse it?
$domains = 'c:\domains.txt'
$list = Get-Content $domains
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
$found = @{}
$list | % {
$line = $_
foreach ($item in $array) {
if ($line -match $item) { $found[$line] = $true }
}
}
$found.Keys | write-host
This gives me google.js; I need it to give me google.com.
I've tried -notmatch etc and can't get it to reverse.
Thanks in advance and the more explanation the better!

Take the .s off, mash the items together into a regex OR, tag on an end-of-string anchor, and filter the domains against it.
$array = @("php","zip","html","htm","js","png","ico","0","jpg")
# build a regex of
# .(php|zip|html|htm|...)$
# and filter the list with it
$list -notmatch "\.($($array -join '|'))`$"
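Anyway, the simple way to invert your result is to filter the original list against the matches: $list | where { $_ -notin $found.Keys }. Simply changing your test to $line -notmatch $item does not work, because a line is then added as soon as it fails to match any single item, which is true of almost every line.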
But beware that you are doing a regex match and something like top500.org would match .0 and throw your results out. If you need to match at the end specifically, you need to use something like $line.EndsWith($item).
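A minimal sketch of the EndsWith approach mentioned above. The values in $list are made up for illustration (in the question they come from Get-Content), and the case-insensitive comparison is an assumption about what is wanted:

```powershell
# Sample data standing in for Get-Content c:\domains.txt (hypothetical values).
$list  = @('google.com', 'google.js', 'top500.org', 'example.org/logo.png')
$array = @('.php', '.zip', '.html', '.htm', '.js', '.png', '.ico', '.0', '.jpg')

# Keep a line only if it ends with none of the suffixes.
$final = @($list | Where-Object {
    $line = $_
    # Collect every suffix this line ends with; an empty result means "keep".
    $hits = @($array | Where-Object {
        $line.EndsWith($_, [System.StringComparison]::OrdinalIgnoreCase)
    })
    $hits.Count -eq 0
})

$final
```

Because EndsWith compares literal text, top500.org survives here even though the unescaped regex .0 would have matched it.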

Another solution:
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
get-content C:\domains.txt | where {[System.IO.Path]::GetExtension($_) -notin $array}

Related

Powershell Editing Strings in Array

My array has a lot of properties but I'm only looking to edit one. The goal is to remove domains from the hostname but I haven't been able to get it working. This data is being returned from a REST API and there are thousands of assets that contain the same type of data (JSON content). The end goal is to compare assets pulled from the API and compare it to assets in a CSV file. The issue is that the domain may appear on one list and not the other so I'm trying to strip the domains off for comparison. I didn't want to iterate through both files and the comparison has to go from the CSV file to the API data hence the need to get rid of the domain altogether.
There are other properties in the array that I will need to pull from later. This is just an example of one of the arrays with a few properties:
$array = @(
@{"description"="data"; "name"="host1.domain1.com"; "model"="data"; "ip"="10.0.0.1"; "make"="Microsoft"}
@{"description"="data"; "name"="host2.domain2.com"; "model"="data"; "ip"="10.0.0.2"; "make"="Different"}
@{"description"="data"; "name"="10.0.0.5"; "model"="data"; "ip"="10.0.0.5"; "make"="Different"}
)
The plan was to match the domain and then strip using the period.
$domainList = @(".domain1.com", ".domain2.com")
$domains = $domainList.ForEach{[Regex]::Escape($_)} -join '|'
$array = $array | ForEach-Object {
if($_.name -match $domains) {
$_.name = $_.name -replace $domains }
$_.name
}
You may do the following to output a new array of values without the domain names:
# starting array
$array = @("hosta.domain1.com", "hostb.domain2.com", "host3", "10.0.0.1")
# domains to remove
$domains = @('.domain1.com','.domain2.com')
# create regex expression with alternations: item1|item2|item3 etc.
# regex must escape literal . since dot has special meaning in regex
$regex = $domains.foreach{[regex]::Escape($_)} -join '|'
# replace domain strings and output everything else
$newArray = $array -replace $regex
Arrays use pointers so I needed to load the array and pipe through ForEach-Object and then set the object when logic was complete. Thanks for all of the help.
$domainList = @(".domain1.com", ".domain2.com")
$domains = $domainList.ForEach{[Regex]::Escape($_)} -join '|'
$array = $array | ForEach-Object {
if($_.name -match $domains) {
$_.name = $_.name -replace $domains }
$_.name
}
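If the other properties need to survive for later use, one option is to update the name on each object and keep the objects themselves. This is only a sketch: the records below are illustrative stand-ins for the API data, $asset and $pattern are names introduced here, and the end-of-string anchor is an assumption that the domain is always a suffix.

```powershell
# Sample records shaped like the ones in the question (values are illustrative).
$array = @(
    [pscustomobject]@{ name = 'host1.domain1.com'; ip = '10.0.0.1' }
    [pscustomobject]@{ name = 'host2.domain2.com'; ip = '10.0.0.2' }
    [pscustomobject]@{ name = '10.0.0.5';          ip = '10.0.0.5' }
)
$domainList = @('.domain1.com', '.domain2.com')

# Escape each domain, join with |, and anchor the alternation at the end of the string.
$pattern = '(' + (($domainList | ForEach-Object { [regex]::Escape($_) }) -join '|') + ')$'

foreach ($asset in $array) {
    # -replace returns the input unchanged when nothing matches,
    # so no separate -match test is needed.
    $asset.name = $asset.name -replace $pattern
}

$array
```

Each object still carries its ip (and any other property), while name has had the domain removed.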

Powershell - remove duplicates from 2 arrays using wildcards

I am trying to remove duplicates and leave only unique entries from the output of 2 queries.
I am pulling a list of installed Windows Updates using the following (also stripping 12 chars of whitespace and dropping to lower case):
$A = @(Get-HotFix | select-object @{Expression={$_.HotFixID.ToLower()}} | ft -hidetableheaders | Out-String) -replace '\s{12}',''
I am then querying a list of available files in a folder and stripping 3 trailing whitespace chars using:
$B = @(Get-ChildItem D:\y | select-object 'Name' | ft -hidetableheaders | Out-String) -replace '\s{3}',''
The problem I have is that the first query ($A) returns output like:
kb4040981
kb4041693
kb2345678
kb8765432
While the second query ($B) returns output like:
windows8.1-kb4040981-x64_d1eb05bc8c55c7632779086079c7759f40d7386f.cab
windows8.1-kb4041687-x64_3bdf264bcfc0dda01c2eaf2135e322d2d6ce6c64.cab
windows8.1-kb4041693-x64_359b7ac71a48e5af003d67e3e4b80120a2f5b570.cab
windows8.1-kb4049179-x64_e6ec21d5d16fa6d8ff890c0c6042c2ba38a1f7c4.cab
I need to compare the 2 outputs using wildcards around each entry in the $A array (I think), and where it exists in $B remove the entire line from $B array.
I cannot truncate the output of $B as I need to use the full filenames in a subsequent process.
IE in the example output above, the entire FIRST and THIRD lines would be remove from the $B array and other lines left intact.
I have tried numerous methods from online searches, and used foreach loops, all to no avail.
Thank you in advance for any assistance.
What did you try with foreach loops that didn't work? Unless your output is huge, this method is pretty straightforward.
$a = "kb4040981","kb4041693","kb2345678","kb8765432","test"
[System.Collections.ArrayList]$b = "windows8.1-kb4040981-x64_d1eb05bc8c55c7632779086079c7759f40d7386f.cab","windows8.1-kb4041687-x64_3bdf264bcfc0dda01c2eaf2135e322d2d6ce6c64.cab","windows8.1-kb4041693-x64_359b7ac71a48e5af003d67e3e4b80120a2f5b570.cab","windows8.1-kb4049179-x64_e6ec21d5d16fa6d8ff890c0c6042c2ba38a1f7c4.cab"
$toRemove = New-Object System.Collections.ArrayList
foreach($kb in $a)
{
foreach($line in $b)
{
if($line -match $kb)
{
write-host "$kb found in: $line" -ForegroundColor Green
$toRemove.add($line) | out-null
}
}
}
foreach($line in $toRemove)
{
$b.Remove($line)
}
$b
Hope it helps.
I would recommend that you take a little time to learn the very basics of PowerShell. When you use format cmdlets and text files instead of objects, you cut yourself off from the good stuff. ;-)
Here is how I would start the task:
$A = Get-HotFix
$B = Get-ChildItem D:\y | Select-Object -Property Name,@{Name='HotFixID';Expression={($_.BaseName -split '-')[1]}}
Compare-Object -ReferenceObject $A -DifferenceObject $B -Property 'HotFixID' -PassThru
Sincere thanks to sambardo for his patience and input! The final working solution based on his excellent recommendation is:
$a = (Get-Hotfix).hotfixID
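# cast to ArrayList so that Remove() works; a plain array is fixed-size
[System.Collections.ArrayList]$b = @((Get-ChildItem D:\y\ -file *.cab).name)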
$toRemove = New-Object System.Collections.ArrayList
foreach($kb in $a)
{
foreach($line in $b)
{
if($line -match $kb)
{
# write-host "$kb found in: $line" -ForegroundColor Green
$toRemove.add($line) | out-null
}
}
}
foreach($line in $toRemove)
{
$b.Remove($line)
}
$b
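The nested loops can also be collapsed into a single regex filter. The sketch below assumes $a is never empty (an empty pattern would match every file name) and uses shortened, made-up file names in place of the real ones; $pattern and $remaining are names introduced here.

```powershell
# KB ids as returned by (Get-HotFix).HotFixID, lower-cased for illustration.
$a = @('kb4040981', 'kb4041693', 'kb2345678', 'kb8765432')

# Shortened stand-ins for the .cab file names in the question.
$b = @(
    'windows8.1-kb4040981-x64.cab'
    'windows8.1-kb4041687-x64.cab'
    'windows8.1-kb4041693-x64.cab'
    'windows8.1-kb4049179-x64.cab'
)

# Build one alternation: kb4040981|kb4041693|...
$pattern = ($a | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Applied to an array, -notmatch returns the elements that do not match.
# -match and -notmatch are case-insensitive by default.
$remaining = @($b -notmatch $pattern)

$remaining
```

The full file names are preserved, so $remaining can be fed straight into a later step.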

Trying to thin out backup files but Get-ChildItem isn't returning usable list

We have a backup that runs every other day, but the files are large and we want to just remove every other one once we get a certain amount of backup files with our file signature.
I've tried this:
$Drive = "E:\temp\"
$deleteTime = -42;
$limit = (Get-Date).AddDays($deleteTime)
#this is finding the correct files but I don't think it's really in an array
$temp1 = Get-ChildItem -Path $Drive -filter "*junk.vhd*" | Where-Object {$_.LastWriteTime -lt $limit} | Select -Expand Name
for($i=$temp1.GetLowerBound(0); $i -le $temp1.GetUpperBound(0); $i+=2) {
Write-Host "removing $temp1[$i]" #this is listing the entire array with a [0] for first one and the third [2] element also, whether I cast to an array or not
}
I tried this instead of the above (Get-ChildItem) line currently but it listed the entire set of junk files for [0] instead of just the first junk.vhd at [0]:
[array]$temp1 =#( Get-ChildItem -Path $Drive -filter "*junk.vhd*" | Where-Object {$_.LastWriteTime -lt $limit} | Foreach-Object {$_.Name} )
I tried this too:
$limit = (Get-Date).AddDays(-42)
$list = (dir -Filter *junk.ps1 | where LastWriteTime -lt $limit).FullName
$count = $list.Length
for ($i = 0; $i -lt $count; $i += 2)
{
Write-Verbose "[$i] $($list[$i])"
#it's not getting in here because I'm not sure how
#to add the $Drive location and list is empty
}
Does anyone have a suggestion how to get an array of the filenames from $Drive location with the signature *junk.vhd so I can loop through them and remove every other one?
An internet search isn't turning much up.
This works for me:
$deleteTime = -12;
$limit = (Get-Date).AddDays($deleteTime)
$t = Get-ChildItem -Path $pwd -filter "p*.txt" | Where-Object {$_.LastWriteTime -lt $limit} | Select -Expand Name
foreach ($a in $t) { Write-Host "Name : $a" }
What have I missed from what you were looking for?
(Obviously, you will need to maintain a counter and do some modulo arithmetic in the body of the foreach() statement... )
This works, too:
for($i=$t.GetLowerBound(0); $i -le $t.GetUpperBound(0); $i+=2) {
$n = $t[$i]
Write-Host "removing $n"
}
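The reason the intermediate variable $n helps is string interpolation: inside double quotes, only the variable name is expanded, and the index is treated as literal text. A small sketch with made-up file names shows the difference and the $( ) subexpression alternative:

```powershell
# Hypothetical file names standing in for the Get-ChildItem result.
$t = @('a_junk.vhd', 'b_junk.vhd', 'c_junk.vhd', 'd_junk.vhd')

# "$t[0]" expands the whole array and then appends the literal text "[0]".
$wrong = "removing $t[0]"

# $( ... ) evaluates the indexing expression before it is inserted into the string.
$right = "removing $($t[0])"

# Every other element, starting with the first.
$everyOther = for ($i = 0; $i -lt $t.Count; $i += 2) { $t[$i] }

$wrong
$right
$everyOther
```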

One element containing hashes instead of multiple elements - how to fix?

I am trying to parse robocopy log files to get file size, path, and date modified. I am getting the information via regex with no issues. However, for some reason, I am getting an array with a single element, and that element contains 3 hashes. My terminology might be off; I am still learning about hashes. What I want is a regular array with multiple elements.
Output that I am getting:
FileSize FilePath DateTime
-------- -------- --------
{23040, 36864, 27136, 24064...} {\\server1\folder\Test File R... {2006/03/15 21:08:01, 2010/12...
As you can see, there is only one row, but that row contains multiple items. I want multiple rows.
Here is my code:
[regex]$Match_Regex = "^.{13}\s\d{4}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2}\s.*$"
[regex]$Replace_Regex = "^\s*([\d\.]*\s{0,1}\w{0,1})\s(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2})\s(.*)$"
$MainContent = New-Object System.Collections.Generic.List[PSCustomObject]
Get-Content $Path\$InFile -ReadCount $Batch | ForEach-Object {
$FileSize = $_ -match $Match_Regex -replace $Replace_Regex,('$1').Trim()
$DateTime = $_ -match $Match_Regex -replace $Replace_Regex,('$2').Trim()
$FilePath = $_ -match $Match_Regex -replace $Replace_Regex,('$3').Trim()
$Props = @{
FileSize = $FileSize;
DateTime = $DateTime;
FilePath = $FilePath
}
$Obj = [PSCustomObject]$Props
$MainContent.Add($Obj)
}
$MainContent | % {
$_
}
What am I doing wrong? I am just not getting it. Thanks.
Note: This needs to be as fast as possible because I have to process millions of lines, which is why I am trying System.Collections.Generic.List.
I think the problem is that for what you're doing you actually need two foreach-object loops. Using Get-Content with -Readcount is going to give you an array of arrays. Use the -Match in the first Foreach-Object to filter out the records that match in each array. That's going to give you an array of the matched records. Then you need to foreach through that array to create one object for each record:
[regex]$Match_Regex = "^.{13}\s\d{4}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2}\s.*$"
[regex]$Replace_Regex = "^\s*([\d\.]*\s{0,1}\w{0,1})\s(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2})\s(.*)$"
$MainContent =
Get-Content $Path\$InFile -ReadCount $Batch |
ForEach-Object {
$_ -match $Match_Regex |
ForEach-Object {
$FileSize = $_ -replace $Replace_Regex,('$1').Trim()
$DateTime = $_ -replace $Replace_Regex,('$2').Trim()
$FilePath = $_ -replace $Replace_Regex,('$3').Trim()
[PSCustomObject]@{
FileSize = $FileSize
DateTime = $DateTime
FilePath = $FilePath
}
}
}
You don't really need to use the collection as an accumulator, just output PSCustomObjects, and let them accumulate in the result variable.
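To see why the second loop is needed, it may help to look at what -ReadCount actually sends down the pipeline. The sketch below writes four throwaway lines to a temporary file (the content and the '^t' pattern are arbitrary) and inspects the batches:

```powershell
# Throwaway input file with four lines.
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value 'one', 'two', 'three', 'four'

# With -ReadCount 2 each pipeline item is an array of two lines, not a single line.
$batchSizes = @(Get-Content $tmp -ReadCount 2 | ForEach-Object { $_.Count })

# Applied to an array, -match acts as a filter and returns the matching elements,
# which is why every property in the question ended up holding a whole batch.
$matched = @(Get-Content $tmp -ReadCount 2 | ForEach-Object { $_ -match '^t' })

Remove-Item $tmp

$batchSizes
$matched
```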

How to remove item from an array in PowerShell?

I'm using Powershell 1.0 to remove an item from an Array. Here's my script:
param (
[string]$backupDir = $(throw "Please supply the directory to housekeep"),
[int]$maxAge = 30,
[switch]$NoRecurse,
[switch]$KeepDirectories
)
$days = $maxAge * -1
# do not delete directories with these values in the path
$exclusionList = Get-Content HousekeepBackupsExclusions.txt
if ($NoRecurse)
{
$filesToDelete = Get-ChildItem $backupDir | where-object {$_.PsIsContainer -ne $true -and $_.LastWriteTime -lt $(Get-Date).AddDays($days)}
}
else
{
$filesToDelete = Get-ChildItem $backupDir -Recurse | where-object {$_.PsIsContainer -ne $true -and $_.LastWriteTime -lt $(Get-Date).AddDays($days)}
}
foreach ($file in $filesToDelete)
{
# remove the file from the deleted list if it's an exclusion
foreach ($exclusion in $exclusionList)
{
"Testing to see if $exclusion is in " + $file.FullName
if ($file.FullName.Contains($exclusion)) {$filesToDelete.Remove($file); "FOUND ONE!"}
}
}
I realize that Get-ChildItem in powershell returns a System.Array type. I therefore get this error when trying to use the Remove method:
Method invocation failed because [System.Object[]] doesn't contain a method named 'Remove'.
What I'd like to do is convert $filesToDelete to an ArrayList and then remove items using ArrayList.Remove. Is this a good idea or should I directly manipulate $filesToDelete as a System.Array in some way?
Thanks
The best way to do this is to use Where-Object to perform the filtering and use the returned array.
You can also use @splat to pass multiple parameters to a command (new in V2). If you cannot upgrade (and you should if at all possible), then just collect the output from Get-ChildItem (only repeating that one cmdlet) and do all the filtering in common code.
The working part of your script becomes:
$moreArgs = @{}
if (-not $NoRecurse) {
$moreArgs["Recurse"] = $true
}
$filesToDelete = Get-ChildItem $BackupDir @moreArgs |
where-object {
# capture the path, because $_ means the exclusion inside the nested filter
$path = $_.FullName
-not $_.PsIsContainer -and
$_.LastWriteTime -lt $(Get-Date).AddDays($days) -and
-not ($exclusionList | where-object { $path.Contains($_) })
}
In PowerShell, arrays are fixed-size: you cannot add or remove elements in place, but it is very easy to create a new one (operators like += on arrays actually create a new array and return that).
I agree with Richard, that Where-Object should be used here. However, it's harder to read.
What I would propose:
# get $filesToDelete and $exclusionList. In V2 use splatting as proposed by Richard.
$res = $filesToDelete | % {
$file = $_
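# the inner loop yields one boolean per exclusion; the file is excluded if any of them is $true
$isExcluded = ($exclusionList | % { $file.FullName.Contains($_) } ) -contains $true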
if (!$isExcluded) {
$file
}
}
#the files are in $res
Also note that generally it is not possible to iterate over a collection and change it. You would get an exception.
$a = New-Object System.Collections.ArrayList
$a.AddRange((1,2,3))
foreach($item in $a) { $a.Add($item*$item) }
An error occurred while enumerating through a collection:
At line:1 char:8
+ foreach <<<< ($item in $a) { $a.Add($item*$item) }
+ CategoryInfo : InvalidOperation: (System.Collecti...numeratorSimple:ArrayListEnumeratorSimple) [], RuntimeException
+ FullyQualifiedErrorId : BadEnumeration
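One way around this, sketched here under the assumption that the collection is an ArrayList as in the example above, is to enumerate a snapshot of the list and modify the original:

```powershell
$a = New-Object System.Collections.ArrayList
$a.AddRange((1, 2, 3, 4))

# ToArray() copies the elements, so the loop walks the copy
# while Remove() changes the list itself.
foreach ($item in $a.ToArray()) {
    if ($item % 2 -eq 0) { $a.Remove($item) }
}

$a
```

The copy costs extra memory, so for large collections filtering with Where-Object into a new variable may be the better choice.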
This is ancient. But I wrote these a while ago to add and remove from PowerShell lists using recursion. It leverages the ability of PowerShell to do multiple assignment. That is, you can do $a,$b,$c=@('a','b','c') to assign a, b and c to their variables. Doing $a,$b=@('a','b','c') assigns 'a' to $a and @('b','c') to $b.
First is by item value. It'll remove the first occurrence.
function Remove-ItemFromList ($Item,[array]$List=$(throw "the item $item was not in the list"),[array]$chckd_list=@())
{
if ($list.length -lt 1 ) { throw "the item $item was not in the list" }
$check_item,$temp_list=$list
if ($check_item -eq $item )
{
$chckd_list+=$temp_list
return $chckd_list
}
else
{
$chckd_list+=$check_item
return (Remove-ItemFromList -item $item -chckd_list $chckd_list -list $temp_list )
}
}
This one removes by index. You can probably mess it up good by passing a value to count in the initial call.
function Remove-IndexFromList ([int]$Index,[array]$List,[array]$chckd_list=@(),[int]$count=0)
{
if (($list.length+$count-1) -lt $index )
{ throw "the index is out of range" }
$check_item,$temp_list=$list
if ($count -eq $index)
{
$chckd_list+=$temp_list
return $chckd_list
}
else
{
$chckd_list+=$check_item
return (Remove-IndexFromList -count ($count + 1) -index $index -chckd_list $chckd_list -list $temp_list )
}
}
This is a very old question, but the problem is still valid and none of the answers fit my scenario, so I will suggest another solution.
In my case, I read in an XML configuration file and I want to remove an element from an array.
[xml]$content = get-content $file
$element = $content.PathToArray | Where-Object {$_.name -eq "ElementToRemove" }
$element.ParentNode.RemoveChild($element)
This is very simple and gets the job done.
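A self-contained version of the same idea, for experimenting without a file on disk. The XML document, its element names, and the use of GetAttribute are all assumptions made for this sketch; the real configuration file will have a different shape.

```powershell
# Inline stand-in for the configuration file (hypothetical structure).
[xml]$content = @'
<config>
  <items>
    <item name="Keep" />
    <item name="ElementToRemove" />
  </items>
</config>
'@

# Find the element by its name attribute.
$element = $content.config.items.item |
    Where-Object { $_.GetAttribute('name') -eq 'ElementToRemove' }

# RemoveChild returns the removed node, so discard it to keep the output clean.
$null = $element.ParentNode.RemoveChild($element)

# Count the remaining item elements.
$left = $content.SelectNodes('//item').Count
$left
```

The change only affects the in-memory document; calling $content.Save() with a path would be needed to write it back.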
