My array has a lot of properties, but I only want to edit one. The goal is to remove the domains from the hostnames, but I haven't been able to get it working. The data is returned from a REST API, and there are thousands of assets that contain the same type of data (JSON content). The end goal is to compare the assets pulled from the API with the assets in a CSV file. The issue is that the domain may appear in one list and not the other, so I'm trying to strip the domains off for the comparison. I didn't want to iterate through both files, and the comparison has to go from the CSV file to the API data, hence the need to get rid of the domain altogether.
There are other properties in the array that I will need to pull from later. This is just an example of one of the arrays with a few properties:
$array = @{"description"="data"; "name"="host1.domain1.com"; "model"="data"; "ip"="10.0.0.1"; "make"="Microsoft"},
@{"description"="data"; "name"="host2.domain2.com"; "model"="data"; "ip"="10.0.0.2"; "make"="Different"},
@{"description"="data"; "name"="10.0.0.5"; "model"="data"; "ip"="10.0.0.5"; "make"="Different"}
The plan was to match the domain and then strip using the period.
$domainList = @(".domain1.com", ".domain2.com")
$domains = $domainList.ForEach{[Regex]::Escape($_)} -join '|'
$array = $array | ForEach-Object {
if($_.name -match $domains) {
$_.name = $_.name -replace $domains }
$_.name
}
You may do the following to output a new array of values without the domain names:
# starting array
$array = @("hosta.domain1.com", "hostb.domain2.com", "host3", "10.0.0.1")
# domains to remove
$domains = @('.domain1.com','.domain2.com')
# create regex expression with alternations: item1|item2|item3 etc.
# regex must escape literal . since dot has special meaning in regex
$regex = $domains.foreach{[regex]::Escape($_)} -join '|'
# replace domain strings and output everything else
$newArray = $array -replace $regex
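The same regex can also be applied to a single property of each object, so the other properties stay available. The following is only a sketch: the `$assets` objects are hypothetical stand-ins for the API results, and the pattern is additionally anchored with `$` (an assumption about the intent) so that a domain is stripped only when it appears at the end of the name.

```powershell
# Hypothetical sample objects standing in for the API results
$assets = @(
    [pscustomobject]@{ name = 'host1.domain1.com'; ip = '10.0.0.1' }
    [pscustomobject]@{ name = 'host2.domain2.com'; ip = '10.0.0.2' }
    [pscustomobject]@{ name = '10.0.0.5';          ip = '10.0.0.5' }
)

$domains = '.domain1.com', '.domain2.com'

# Escape each domain, join with |, and anchor the alternation at end of string
$escaped = $domains | ForEach-Object { [regex]::Escape($_) }
$regex   = '(?:' + ($escaped -join '|') + ')$'

# Update only the name property; every other property is left untouched
foreach ($asset in $assets) {
    $asset.name = $asset.name -replace $regex
}

$assets
```

Names that do not end in one of the listed domains, such as the bare IP address, pass through unchanged.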
Arrays use pointers so I needed to load the array and pipe through ForEach-Object and then set the object when logic was complete. Thanks for all of the help.
$domainList = @(".domain1.com", ".domain2.com")
$domains = $domainList.ForEach{[Regex]::Escape($_)} -join '|'
$array = $array | ForEach-Object {
if($_.name -match $domains) {
$_.name = $_.name -replace $domains }
$_ # emit the whole object so the other properties are kept
}
I have a script where I get all of the folders in d:\folder\*\*\ where the name is -like "*\Log". I then split the folder paths apart to run through wmi to get the corresponding services. After that I'm wanting to split apart the PathName property from $Services so I get everything before the \xxxxx.exe and add \log to the end of the result. Eventually I'll then use those paths to do some compression and archiving of files via a gci.
For whatever reason, when I run the script below I get the previous loop's $LocalLogVar without "log" appended and the current loop's $LocalLogVar with "log" appended. I'm sure I'm doing something wrong that's blatantly obvious to somebody out there. If somebody could point me in the right direction on this it'd be much appreciated! I also apologize for the word vomit here, I've been looking at this script all day and my brain's pretty much used up.
A couple of notes:
The number of words in the paths vary which is why I can't manually do $LocalLogVar = "$Var1\$Var2\$Var3\Log"
If I don't have the [array] in front of $LogFolders the object type becomes a string and I get the previous loop's $LocalLogVar without "log" appended combined with the current loop's $LocalLogVar
I tried doing [collections.arraylist]$LogFolders=@() with no success
c:\folder is a shortcut to d:\folder, which is why there's c:\folder\xxx and d:\folder\xxx in the list below
SplitCount is -1 because I don't want the .exe from the path, I just want the folder structure
The naming convention for the string before .exe varies so I can't use an enumerated counter.
Example of first bullet:
word7-word8 #This is the previous loop's $LocalLogVar w/o "log" appended
C:\folder\word5\word6\word9-word8\log #This is the current loop's $LocalLogVar w/ "log" appended.
Example of the second bullet:
word7-word8C:\folder\word5\word6\word9-word8\log
What I should be getting:
D:\folder\word-anotherword\word7-word8\log
D:\folder\word-anotherword\word9-word8\log
C:\folder\word1\word7-word8\log
C:\folder\word1\word9-word8\log
C:\folder\word2\word7-word8\log
C:\folder\word2\word9-word8\log
D:\folder\word2\word10-word11\log
D:\folder\word2\word12-word8\log
C:\folder\word3\word7-word8\log
C:\folder\word3\word9-word8\log
D:\folder\word4\word7-word8\log
C:\folder\word4\word9-word8\log
C:\folder\word5\word6\word7-word8\log
C:\folder\word5\word6\word9-word8\log
C:\folder\word5\word6\word7-word8\log
C:\folder\word5\word6\word9-word8\log
$Folders = Get-ChildItem D:\folder\*\*\ -Directory -Recurse -Verbose `
| Where-Object { $_.fullname -like "*\Log" }
$2 = @()
$LogFolders = @()
foreach ($folder in $folders) {
$ServName = $folder.fullname.split('\')[2]
$ServType = $folder.fullname.split('\')[3]
$ServNameCheck = "*$($ServName.replace('-',' '))*"
$ServTypeCheck = "*$($ServType.replace('-',' '))*"
$PathName = Get-WmiObject -ClassName Win32_Service `
| Where-Object { $_.caption -like "$ServNameCheck" -and $_.caption -like "$ServTypeCheck" } `
| Select-Object Name, Caption, @{n = 'PathName'; e = { ($_.PathName).trim('"') } }
$2 += $PathName
}
$Services = $2 | Sort-Object pathname | Get-Unique -AsString
foreach ($ServPath in $services.pathname) {
$LocalLogVar = @()
if (Get-Variable `
| Where-Object { $_.name -match "^Split([0-9]|10)$" }) {
Get-Variable `
| Where-Object { $_.name -match "^Split([0-9]|10)$" } | Remove-Variable -Force
}
[int]$SplitCount = $ServPath.split('\').count
[int]$SplitCountCheck = $SplitCount - 1
$x = 0
do {
New-Variable -Name "Split$x" -Value "$($ServPath.split('\')[$x])"
$RegEx = "Split$x"
$LogFolderName = Get-Variable | Where-Object { $_.name -match $RegEx } | Select-Object value
[string]$LogFolders = $LogFolderName.value.ToString()
$LocalLogVar += $LogFolders + '\'
$x++
} until ($x -eq $SplitCountCheck)
$LocalLogVar = $LocalLogVar
$LocalLogVar = $LocalLogVar + "log"
[array]$LogFolders += $LocalLogVar
}
Wow, so that's a script. Kind of hard to follow, since some of it seems needlessly complex. I'm not sure if it will accomplish what you're looking for, but that's because you were super vague with your folder descriptions. Do the folders always start like this:
D:\folder\<Service Short Name>\<Service Long Name>\...\logs
If not, you could be in trouble. The last four items on your example list of what you expect to see don't look like that. I think your folders are laid out like this:
D:\folder\...\<Service Short Name>\<Service Long Name>\logs
The difference being where the extra folders are located. If they're before the service like I think they are your script will miss things.
Now, on to getting your list that you want. What I see from looking at your script is that you get a folder list for all folders under D:\folder\*\*\ named 'log'. Then you split out the 3rd and 4th folders to get a service's short name, and long name respectively. Then one by one you pull a list of all services from WMI, and filter for just the service that matches the name and caption (short name, and long name) referred to by the folders. After that you make sure you only have one listing of any given service.
Regarding this first part of the script, you can make it faster by letting the file system provider filter things for you. Instead of pulling a folder list of everything and then filtering for paths that end in '\log', you should use the -filter parameter of the Get-ChildItem cmdlet like this:
$Folders = Get-ChildItem C:\temp\*\*\ -Directory -Recurse -Verbose -Filter 'log'
Then you should query WMI one time, save the result, then pick and choose from there based on your folders. Something like:
# Query WMI once and reuse the result for every folder
$AllServices = Get-WmiObject -ClassName Win32_Service
[array]$2 = foreach ($folder in $folders) {
$ServName,$ServType = $folder.fullname.split('\')[2,3] -replace '-',' '
# Emit the matching services directly so the foreach output is collected in $2
$AllServices |
Where-Object { $_.caption -like "*$ServName*" -and $_.caption -like "*$ServType*" } |
Select-Object Name, Caption, @{n = 'PathName'; e = { $_.PathName -replace '^(\w\S+) .*','$1' -replace '^([''"])([^\1]+)\1.*','$2' } }
}
$Services = $2 | Sort-Object pathname | Get-Unique -AsString
I did a little regex magic to clean up the pathname instead of just .trim('"') since this gets rid of parameters in the service execution, and cleans paths that are enclosed in single quotes not just double quotes. If what you have works for you feel free to keep it, but this is a little more capable. It may be worth noting that Get-Unique is case sensitive, so 'C:\folder\word3\word9-word8' and 'C:\folder\word3\word9-Word8' are different. You might want to do a .ToUpper() on your paths before you look for unique ones.
Once you have your array of services you loop through them, splitting the file path, reassembling it, and finally adding 'log' to the end of it. That was your way to remove the executable from the path. There's a cmdlet that was designed to do just that: split-path. Use that with Join-Path and that whole last loop gets much simpler:
[array]$LogFolders = foreach ($ServPath in $services.pathname) {
Join-Path (Split-Path $ServPath) 'log'
}
Lastly, try not to use +=, since PowerShell has to rebuild the whole array each time you do that. You'll notice I moved the $Variable = bit outside the loop in places that you do that.
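To illustrate the point about +=, here is a small sketch with made-up service names (`$names` and its values are purely illustrative). It builds the same result three ways: with +=, by capturing the loop output, and with a generic list. The `::new()` syntax assumes PowerShell 5 or later.

```powershell
# Hypothetical input values, for illustration only
$names = 'svc1', 'svc2', 'svc3'

# Slow for large inputs: each += allocates a new array and copies the old one
$slow = @()
foreach ($n in $names) { $slow += "$n\log" }

# Preferred: let the loop emit values and capture them in one assignment
[array]$fast = foreach ($n in $names) { "$n\log" }

# Alternative when items must be added from several places: a generic list
$list = [System.Collections.Generic.List[string]]::new()
foreach ($n in $names) { $list.Add("$n\log") }

$fast
```

All three variables end up holding the same three strings; only the amount of copying differs.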
I have a powershell script and a txt database with different number of elements per line.
My txt file is list.txt:
"10345","doomsday","life","hope","run","stone"
"10346","ride","latest","metal"
My powershell script search.ps1:
#Get file path
$path = Split-Path $script:MyInvocation.MyCommand.Path
$search = @()
Get-Content -LiteralPath "$path\list.txt" | ForEach-Object {
$search += $_
}
So, how to convert each line as a element of array? As this:
$search = @(("10345","doomsday","life","hope","run","stone"),("10346","ride","latest","metal"))
To operate as:
echo $search[0][0]
Here's a concise PSv4+ solution:
$search = (Get-Content -LiteralPath $path\list.txt).ForEach({ , ($_ -split ',') })
The .ForEach() method operates on each line read from the input file by Get-Content.
$_ -split ',' splits each line into an array of strings by separator ,
, (...) wraps this array in an aux. single-item array to ensure that the array is effectively output as a whole, resulting in an array of arrays as the overall output.
Note: Strictly speaking, the .ForEach() method outputs a [System.Collections.ObjectModel.Collection[psobject]] collection rather than a regular PowerShell array ([object[]]), but for all practical purposes the two types act the same.
Note: The .ForEach() method was chosen as a faster alternative to a pipeline with the ForEach-Object (%) cmdlet.
Note that the .ForEach() method requires storing the input collection in memory as a whole first.
A faster and more memory-efficient, though perhaps slightly obscure alternative is to use a switch statement with the -file option:
$search = switch -file $path\list.txt { default { , ($_ -split ',') } }
switch -file processes each line of the specified file.
Since each line should be processed, only a default branch is used, in which the desired splitting is performed.
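Note that splitting on the comma alone leaves the double quotes inside each element. A possible variation, sketched below under the assumption that no field contains an embedded comma or quote, trims the quotes after splitting. The temporary file is used only to make the example self-contained.

```powershell
# Write the sample data to a temporary file so the sketch is self-contained
$file = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $file -Value @(
    '"10345","doomsday","life","hope","run","stone"'
    '"10346","ride","latest","metal"'
)

$search = switch -File $file {
    default {
        # Split on commas, then strip the surrounding quotes from every field
        $fields = ($_ -split ',').Trim('"')
        # The unary comma wraps the row so it is output as a single element
        , $fields
    }
}

Remove-Item -LiteralPath $file

$search[0][0]   # first field of the first row
$search[1][3]   # fourth field of the second row
```

With the sample data, `$search` holds two rows of six and four elements respectively.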
Use -split. A code snippet you can debug in ISE or VSCode below.
$x1 = @'
"10345","doomsday","life","hope","run","stone"
"10346","ride","latest","metal"
'@
# split into lines; \r?\n handles both Windows and Unix line endings
$data = $x1 -split "\r?\n"
$data.Count
$data[0] -split ","
$arr = @()
foreach ($row in $data)
{
$arr += ,($row -split ",")
}
"arr"
$arr
"0,3"
$arr[0][3]
"1,3"
$arr[1][3]
So you can split each line in your file returned from Get-Content and add it to your new array which lets you reference how you wanted...
There are other ways you can use your data depending on your needs.
Assuming you do not want each item quoted, you might consider not using the -split operator and instead evaluating each line with the Invoke-Expression cmdlet, or with a more secure [ScriptBlock]:
$Search = Get-Content ".\list.txt" | ForEach-Object {,@(&([ScriptBlock]::Create($_)))}
Noob here.
I'm trying to pare down a list of domains by eliminating all subdomains if the parent domain is present in the list. I've managed to cobble together a script that somewhat does this with PowerShell after some searching and reading. The output is not exactly what I want, but will work OK. The problem with my solution is that it takes so long to run because of the size of my initial list (tens of thousands of entries).
UPDATE: I've updated my example to clarify my question.
Example "parent.txt" list:
adk2.co
adk2.com
adobe.com
helpx.adobe.com
manage.com
list-manage.com
graph.facebook.com
Example output "repeats.txt" file:
adk2.com (different top level domain than adk2.co but that's ok)
helpx.adobe.com
list-manage.com (not subdomain of manage.com but that's ok)
I would then take and eliminate the repeats from the parent, leaving a list of "unique" subdomains and domains. I have this in a separate script.
Example final list with my current script:
adk2.co
adobe.com
manage.com
graph.facebook.com (it's not facebook.com because facebook.com wasn't in the original list.)
Ideal final list:
adk2.co
adk2.com (since adk2.co and adk2.com are actually distinct domains)
adobe.com
manage.com
graph.facebook.com
Below is my code:
I've taken my hosts list (parent.txt) and checked it against itself, and spit out any matches into a new file.
$parent = Get-Content("parent.txt")
$hosts = Get-Content("parent.txt")
$repeats = @()
$out_file = "$PSScriptRoot\repeats.txt"
$hosts | where {
$found = $FALSE
foreach($domains in $parent){
if($_.Contains($domains) -and $_ -ne $domains){
$found = $TRUE
$repeats += $_
}
if($found -eq $TRUE){
break
}
}
$found
}
$repeats = $repeats -join "`n"
[System.IO.File]::WriteAllText($out_file,$repeats)
This seems like a really inefficient way to do it since I'm going through each element of the array. Any suggestions on how to best optimize this? I have some ideas like putting more conditions on what elements to check and check against, but I feel like there's a drastically different approach that would be far better.
First, a solution based strictly on shared domain names (e.g., helpx.adobe.com and adobe.com are considered to belong to the same domain, but list-manage.com and manage.com are not).
This is not what you asked for, but perhaps more useful to future readers:
Get-Content parent.txt | Sort-Object -Unique { ($_ -split '\.')[-2,-1] -join '.' }
Assuming list.manage.com rather than list-manage.com in your sample input, the above command yields:
adk2.co
adk2.com
adobe.com
graph.facebook.com
manage.com
{ ($_ -split '\.')[-2,-1] -join '.' } sorts the input lines by the last 2 domain components (e.g., adobe.com):
-Unique discards duplicates.
A shared-suffix solution, as requested:
# Helper function for (naively) reversing a string.
# Note: Does not work properly with Unicode combining characters
# and surrogate pairs.
function reverse($str) { $a = $str.ToCharArray(); [Array]::Reverse($a); -join $a }
# * Sort the reversed input lines, which effectively groups them by shared suffix
# with the shortest entry first (e.g., the reverse of 'manage.com' before the
# reverse of 'list-manage.com').
# * It is then sufficient to output only the first entry in each group, using
# wildcard matching with -notlike to determine group boundaries.
# * Finally, sort the re-reversed results.
Get-Content parent.txt | ForEach-Object { reverse $_ } | Sort-Object |
ForEach-Object { $prev = $null } {
if ($null -eq $prev -or $_ -notlike "$prev*" ) {
reverse $_
$prev = $_
}
} | Sort-Object
One approach is to use a hash table to store all your parent values, then for each repeat, remove it from the table. The value 1 when adding to the hash table does not matter since we only test for existence of the key.
$parent = @(
'adk2.co',
'adk2.com',
'adobe.com',
'helpx.adobe.com',
'manage.com',
'list-manage.com'
)
$repeats = (
'adk2.com',
'helpx.adobe.com',
'list-manage.com'
)
$domains = @{}
$parent | % {$domains.Add($_, 1)}
$repeats | % {if ($domains.ContainsKey($_)) {$domains.Remove($_)}}
$domains.Keys | Sort
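The same constant-time lookup idea can be used to detect subdomains directly, without the quadratic self-comparison from the question. The sketch below is one possible approach, not the answer author's method: it treats an entry as a repeat only if one of its parent domains (with at least two labels) is also in the list, so under this stricter rule list-manage.com and adk2.com are kept. The sample list is taken from the question.

```powershell
$domains = 'adk2.co', 'adk2.com', 'adobe.com', 'helpx.adobe.com',
           'manage.com', 'list-manage.com', 'graph.facebook.com'

# Case-insensitive set of all entries for fast existence checks
$set = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$domains, [System.StringComparer]::OrdinalIgnoreCase)

$unique = foreach ($d in $domains) {
    $labels    = $d -split '\.'
    $hasParent = $false

    # Drop leading labels one at a time; stop before the bare top-level domain
    for ($i = 1; $i -lt $labels.Count - 1; $i++) {
        $parent = $labels[$i..($labels.Count - 1)] -join '.'
        if ($set.Contains($parent)) { $hasParent = $true; break }
    }

    if (-not $hasParent) { $d }
}

$unique
```

Each entry costs only as many lookups as it has labels, so the run time grows roughly linearly with the size of the list.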
I have a text file domains.txt
$domains = 'c:\domains.txt'
$list = Get-Content $domains
google.com
google.js
and an array
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
Anything in $list that ends in something in $array should NOT be in my final list.
So google.com would be in the final list but google.js would not.
I found some other Stack Overflow code that gives me the exact opposite of what I'm looking for but, hah, I can't get it reversed!!!!
This gives me the exact opposite of what I want, how do I reverse it?
$domains = 'c:\domains.txt'
$list = Get-Content $domains
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
$found = @{}
$list | % {
$line = $_
foreach ($item in $array) {
if ($line -match $item) { $found[$line] = $true }
}
}
$found.Keys | write-host
this gives me google.js I need it to give me google.com.
I've tried -notmatch etc and can't get it to reverse.
Thanks in advance and the more explanation the better!
Take the .s off, mash the items together into a regex OR, tag on an end-of-string anchor, and filter the domains against it.
$array = @("php","zip","html","htm","js","png","ico","0","jpg")
# build a regex of
# .(php|zip|html|htm|...)$
# and filter the list with it
$list -notmatch "\.($($array -join '|'))`$"
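Anyway, the simple way to invert your result is to keep the lines that were not found: $list | where { $_ -notin $found.Keys }. Just changing your test to $line -notmatch $item will not do it, because every line fails to match at least one of the items and so ends up in $found.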
But beware that you are doing a regex match and something like top500.org would match .0 and throw your results out. If you need to match at the end specifically, you need to use something like $line.EndsWith($item).
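A minimal sketch of the EndsWith approach, using a hypothetical in-memory `$list` in place of the file contents and assuming a case-insensitive comparison is wanted:

```powershell
# Hypothetical sample lines standing in for the contents of domains.txt
$list  = 'google.com', 'google.js', 'top500.org'
$array = '.php', '.zip', '.html', '.htm', '.js', '.png', '.ico', '.0', '.jpg'

$kept = $list | Where-Object {
    $line = $_
    # Collect every suffix that the line literally ends with (no regex involved)
    $hits = $array | Where-Object {
        $line.EndsWith($_, [System.StringComparison]::OrdinalIgnoreCase)
    }
    # Keep the line only when no suffix matched
    -not $hits
}

$kept
```

Because the comparison is a literal suffix test, top500.org is kept even though it contains a 0, while google.js is dropped.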
other solution
$array = @(".php",".zip",".html",".htm",".js",".png",".ico",".0",".jpg")
get-content C:\domains.txt | where {[System.IO.Path]::GetExtension($_) -notin $array}
I'm interested in some ideas on how one would approach coding a search of a filesystem for files that match any entries contained in a master CSV file. I have a function to search the filesystem, but filtering against the CSV is proving harder than I expected. I have a CSV with headers in it for Name & IPaddr:
#create CSV object
$csv = import-csv filename.csv
#create filter object containing only Name column
$filter = $csv | select-object Name
#Now run the search function
SearchSubfolders | where {$_.name -match $filter} #returns no results
I guess my question is this: Can I filter against an array within a pipeline like this???
You need a pair of loops:
#create CSV object
$csv = import-csv filename.csv
#Now run the search function
#loop through the folders
foreach ($folder in (SearchSubfolders)) {
#check that folder against each item in the csv filter list
#this sets up the loop
foreach ($Filter in $csv.Name) {
#and this does the checking and outputs anything that is matched
If ($folder.name -match $Filter) { "$filter" }
}
}
Usually CSVs are 2-dimensional data structures, so you can't use them directly for filtering. You can convert the 2-dimensional array into a 1-dimensional array, though:
$filter = Import-Csv 'C:\path\to\some.csv' | % {
$_.PSObject.Properties | % { $_.Value }
}
If the CSV has just a single column, the "mangling" can be simplified to this (replace Name with the actual column name):
$filter = Import-Csv 'C:\path\to\some.csv' | % { $_.Name }
or this:
$filter = Import-Csv 'C:\path\to\some.csv' | select -Expand Name
Of course, if the CSV has just a single column, it would've been better to make it a flat list right away, so it could've been imported like this:
$filter = Get-Content 'C:\path\to\some.txt'
Either way, with the $filter prepared, you can apply it to your input data like this:
SearchSubFolders | ? { $filter -contains $_.Name } # ARRAY -contains VALUE
The -match operator won't work, because it compares a value (left operand) against a regular expression (right operand).
See Get-Help about_Comparison_Operators for more information.
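As a self-contained illustration of the -contains approach, the sketch below uses inline CSV text and made-up file objects (both hypothetical) instead of a real CSV file and the SearchSubFolders function:

```powershell
# Hypothetical CSV content with the two headers from the question
$csv = @'
Name,IPaddr
server01.log,10.0.0.1
server02.log,10.0.0.2
'@ | ConvertFrom-Csv

# Flatten the Name column into a plain array of strings
$filter = $csv | Select-Object -ExpandProperty Name

# Stand-ins for the objects a file search would return
$files = 'server01.log', 'notes.txt', 'server02.log' |
    ForEach-Object { [pscustomobject]@{ Name = $_ } }

# ARRAY -contains VALUE: keep only files whose name is listed in the CSV
$matched = $files | Where-Object { $filter -contains $_.Name }

$matched.Name
```

Here -contains performs an exact (case-insensitive) comparison against each array element, which is why it works where -match against the whole collection does not.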
Another option is to create a regex from the filename collection and use that to filter for all the filenames at once:
$filenames = import-csv filename.csv |
foreach { $_.name }
[regex]$filename_regex = '(?i)^(' + (($filenames | foreach {[regex]::escape($_)}) -join "|") + ')$'
SearchSubfolders |
where { $_.name -match $filename_regex }
You can use Compare-Object to do this pretty easily if you are matching the actual Names of the files to names in the list. An example:
$filter = import-csv files.csv
ls | Compare-Object -ReferenceObject $filter -IncludeEqual -ExcludeDifferent -Property Name
This will print the files in the current directory that match any Name in files.csv. You could also print only the different ones by dropping the -IncludeEqual and -ExcludeDifferent flags. If you need full regex matching you will have to loop through each regex in the CSV and see if it is a match.
Here's an alternate solution that uses regular expression filters. Note that we will create and cache the regex instances so we don't have to rely on the runtime's internal cache (which defaults to 15 items). First we have a useful helper function, Test-Any, that will loop through an array of items and stop if any of them satisfies a criterion:
function Test-Any() {
param(
[Parameter(Mandatory=$True,ValueFromPipeline=$True)]
[object[]]$Items,
[Parameter(Mandatory=$True,Position=2)]
[ScriptBlock]$Predicate)
begin {
$any = $false
}
process {
foreach($item in $items) {
if ($predicate.Invoke($item)) {
$any = $true
break
}
}
}
end { $any }
}
With this, the implementation is relatively simple:
$filters = import-csv files.csv | foreach { [regex]$_.Name }
ls -recurse | where { $name = $_.Name; $filters | Test-Any { $_.IsMatch($name) } }
I ended up using a 'loop within a loop' construct to get this done after much trial and error:
#the SearchSubFolders function was amended to force results in a variable, SearchResults
$SearchResults2 = @()
foreach ($result in $SearchResults){
foreach ($line in $filter){
if ($result -match $line){
$SearchResults2 += $result
}
}
}
This works great after collapsing my CSV file down to a text-based array containing only the necessary column data from that CSV. Much thanks to Ansgar Wiechers for assisting me with that particular thing!!!
All of you presented viable solutions, some more complex than I cared for, nevertheless if I could mark multiple answers as correct, I would!! I chose the correct answer based on not only correctness but also simplicity.....