PowerShell: Remove objects from an array list

Is there a way to remove some items from an ArrayList without a for loop?
$remotesumerrors = $remoteFiles | Select-String -Pattern '^[a-f0-9]{32}( )' -NotMatch
I want to remove the output of the above from the $remoteFiles variable. Is there some pipeline-based way to remove them?

Assuming all of the following:
you do need the results captured in $remotesumerrors separately,
$remoteFiles is a collection of System.IO.FileInfo instances, as output by Get-ChildItem, for instance,
it is acceptable to save the result as an invariably new collection back to $remoteFiles,
then you can use the .Where() array method as follows (this outperforms a pipeline-based solution based on the Where-Object cmdlet):
# Get the distinct set of the full paths of the files of origin
# from the Select-String results stored in $remotesumerrors
# as a hash set, which allows efficient lookup.
$errorFilePaths =
[System.Collections.Generic.HashSet[string]] $remotesumerrors.Path
# Get those file-info objects from $remoteFiles
# whose paths aren't in the list of the paths obtained above.
$remoteFiles = $remoteFiles.Where({ -not $errorFilePaths.Contains($_.FullName) })
As an aside:
Casting a collection to [System.Collections.Generic.HashSet[T]] is a fast and convenient way to get a set of distinct values (duplicates removed), but note that the resulting hash set's elements are invariably unordered and that, with strings, lookups are case-sensitive by default.
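If case-insensitive lookups are wanted (file paths on Windows, for instance), one option is to pass a comparer to the HashSet constructor. The following is a minimal sketch with made-up paths, not part of the original answer; it assumes PowerShell 5 or later for the ::new() syntax:

```powershell
# Made-up sample paths; the first two differ only in case.
$paths = 'C:\Temp\a.txt', 'c:\temp\A.TXT', 'C:\Temp\b.txt'

# Default comparer: ordinal, case-sensitive.
$caseSensitive = [System.Collections.Generic.HashSet[string]]::new([string[]] $paths)

# Explicit comparer: case-insensitive equality and lookups.
$ignoreCase = [System.Collections.Generic.HashSet[string]]::new(
    [string[]] $paths,
    [System.StringComparer]::OrdinalIgnoreCase
)

$caseSensitive.Count                    # 3
$ignoreCase.Count                       # 2
$ignoreCase.Contains('C:\TEMP\B.TXT')   # True
```

With the second set, .Contains($_.FullName) would match regardless of how the path is cased.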

Use the Where-Object cmdlet to filter the list, keeping only the items that match the pattern:
$remoteFiles = $remoteFiles |Where-Object { $_ |Select-String -Pattern '^[a-f0-9]{32}( )' -Quiet }

If it truly was a [collections.arraylist], you could remove an element by value. There's also .RemoveAt(), to remove by array index.
[System.Collections.ArrayList]$array = 'a','b','c','d','e'
$array.remove
OverloadDefinitions
-------------------
void Remove(System.Object obj)
void IList.Remove(System.Object value)
$array.remove('c')
$array
a
b
d
e
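To remove several elements at once without writing an explicit loop, one possibility (a sketch, not something from the answer above) is to hold the items in a generic List instead of an ArrayList and call its RemoveAll() method with a predicate. The sample lines below are made up, and ::new() assumes PowerShell 5 or later:

```powershell
# Made-up checksum lines: two well-formed entries and one malformed entry.
$remoteFiles = [System.Collections.Generic.List[string]]::new()
$remoteFiles.Add('d41d8cd98f00b204e9800998ecf8427e  empty.txt')
$remoteFiles.Add('not-a-checksum line')
$remoteFiles.Add('0cc175b9c0f1b6a831c399e269772661  a.txt')

# RemoveAll() removes, in place, every element for which the predicate
# returns $true, and returns the number of removed elements.
$removedCount = $remoteFiles.RemoveAll({ param($line) $line -notmatch '^[a-f0-9]{32}( )' })

$removedCount        # 1
$remoteFiles.Count   # 2
```

Unlike the .Where() and Where-Object approaches, this modifies the existing collection rather than creating a new one.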

Let's assume that $remoteFiles is a collection of System.IO.FileInfo objects and that you want to filter based on the file name.
$remotesumerrors = $remoteFiles.name | Select-String -Pattern '^[a-f0-9]{32}' -NotMatch
What are you trying to do with "( )", or what is the query that you want to run?
edit: corrected answer based on comment

Related

How to get the "difference" between two objects in another object with the same structure using powershell?

TL;DR: This but in powershell
I'm trying to get the "difference" between a list1 and a list2 where list2 is just a sub-set of list1. (or list1 is a super-set of list2).
Keep in mind that this is not the 'real scenario' but the structures of both objects are the same as this:
#List of all the IDs
$list1='2','5','6','11'
$CustObject1=foreach($i in $list1){
    [pscustomobject]@{
        id=$i
    }
}
#Another list, where invalid IDs are not listed
$list2='2','5'
$CustObject2=foreach($i in $list2){
    [pscustomobject]@{
        id=$i
    }
}
#I am looking for $CustObject3 = {id = '6',id = '11'}
Now, I want to create a new "object" that contains a list of all the 'invalid' IDs, in our example the $CustObject3 would contain just id's '6','11'.
The reason behind this is because I have a list of 3000 ids that are "valid" and just a couple that are "invalid" (at most 10).
I've tried a lot of options but I can't get it to work, used "where", used double foreach loops, honestly I'm just frustrated at this point but I did try to make it work.
Is this feasible?
Thanks.
You can use the -notin or -notcontains operators to perform the check.
#List of all the IDs
$list1='2','5','6','11'
$CustObject1=foreach($i in $list1){
    [pscustomobject]@{
        id=$i
    }
}
#Another list, where invalid IDs are not listed
$list2='2','5'
$CustObject2=foreach($i in $list2){
    [pscustomobject]@{
        id=$i
    }
}
$CustObject1 | Where-Object {$_.id -notin $CustObject2.id}
Unpack the ids from the PSCustomObjects, put them in HashSets, and then do ExceptWith():
$Ids = [Collections.Generic.HashSet[string]]::new([string[]]$CustObject1.id)
$validIds = [Collections.Generic.HashSet[string]]::new([string[]]$CustObject2.id)
$Ids.ExceptWith($validIds)
Here $Ids starts with all the Ids, then removes the valid ones. It's updated internally and ends with only the invalid ones.
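Since the question asks for objects rather than bare IDs, the remaining set can be used to filter the original objects. A minimal end-to-end sketch using the question's sample data (the final Where-Object step is an addition for illustration, not part of the answer above):

```powershell
# Same sample data as in the question.
$CustObject1 = '2', '5', '6', '11' | ForEach-Object { [pscustomobject]@{ id = $_ } }
$CustObject2 = '2', '5' | ForEach-Object { [pscustomobject]@{ id = $_ } }

$Ids = [Collections.Generic.HashSet[string]]::new([string[]] $CustObject1.id)
$validIds = [Collections.Generic.HashSet[string]]::new([string[]] $CustObject2.id)
$Ids.ExceptWith($validIds)   # $Ids now holds only the invalid IDs

# Keep the objects whose id is still in the set; @() guarantees an array.
$CustObject3 = @($CustObject1 | Where-Object { $Ids.Contains($_.id) })

$CustObject3.id -join ','    # 6,11
```

Each Contains() call is a constant-time hash lookup, which matters more as the lists grow.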
Another option is to use the Compare-Object cmdlet:
$CustObject3 =
Compare-Object -Property id -PassThru $CustObject1 $CustObject2 |
Select-Object * -Exclude SideIndicator
Note:
Compare-Object decorates the passed-through objects with a .SideIndicator property indicating which of the input collections a given object is unique to (<= for objects unique to the first collection, => for those unique to the second).
Since this information is not of interest here (all result objects are by definition unique to the first collection), Select-Object * -Exclude SideIndicator then removes this extra property.

Is it possible to make IndexOf case-insensitive in PowerShell?

I have a problem finding an index in an array built from the output of a session query command on a terminal server.
This is the problematic script:
# Array of logged users in terminal servers
$a=Get-RDUsersession -CollectionName "BLABLA" -ConnectionBroker BLABLA.BLA.BL
# Array of all users with two columns from active directory
$b=Get-ADUser -filter * -properties TelephoneNumber,SamAccountName
Now imagine logging in the terminal server using the account name TEST instead of test.
If I do:
$c = $b[$b.SamAccountName.indexof("test")].TelephoneNumber
then I don't get the telephone number.
I think that's because of the case sensitivity, isn't it? If I type TEST in the search command, I get the correct number.
Is there any simple way to solve this problem and make the search of the index case-insensitive?
I've read about using [StringComparison]"CurrentCultureIgnoreCase", but it doesn't seem to work with arrays.
Thanks.
Since $b is of type Object[], you would probably want to use Where-Object:
$b | Where-Object -FilterScript {$_.Samaccountname -like '*Smith*'} | Select-Object -ExpandProperty 'telephoneNumber'
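That being said, an array in PowerShell can be searched case-insensitively if it is converted to a [Collections.Generic.List[Object]], whose FindIndex() method accepts a script block as its predicate; the -eq operator used in that script block ignores case by default.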
$b = [Collections.Generic.List[Object]]$b
$b.FindIndex( {$args[0].sAMAccountName -eq 'test'} )
Note that pulling every single user object in AD and filtering using where-object or index matching can be very slow. You can instead Get-ADUser as needed or pull all ADusers using a filter that pulls only the users returned in $a.
If you insist on having all ADUsers in one spot with one pull, consider looping over the list once to make a hash lookup so you can easily index the hash value.
#Create account lookup hash
#($accounts is assumed to hold the Get-ADUser results, i.e. $b in the question)
$accountlookup = @{}
foreach ($element in $accounts) {
    $accountlookup[$element.SamAccountName] = $element
}
Hope that helps!
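A hashtable created with @{} compares string keys case-insensitively, so a lookup table like the one above also addresses the original TEST vs. test problem. A small sketch with hypothetical stand-in objects in place of real Get-ADUser output:

```powershell
# Hypothetical stand-ins for the objects returned by Get-ADUser.
$accounts = @(
    [pscustomobject]@{ SamAccountName = 'test'; TelephoneNumber = '555-0100' }
    [pscustomobject]@{ SamAccountName = 'jdoe'; TelephoneNumber = '555-0101' }
)

# Build the lookup once ...
$accountLookup = @{}
foreach ($account in $accounts) {
    $accountLookup[$account.SamAccountName] = $account
}

# ... then index it with any casing of the account name.
$phone = $accountLookup['TEST'].TelephoneNumber
$phone   # 555-0100
```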

PowerShell importing one CSV column in array. Contains command doesn't work with this array

With a PowerShell script, I'm importing a CSV file into an array, and everything works fine.
$csv = Import-Csv "C:\test.csv" -delimiter ";"
But I'm not able to easily find a value in a field named PeopleID directly.
The only method I have found is to loop through every line of the array and check whether the item I'm looking for exists, like this:
foreach($item in $csv)
{
if ($csv | where {$item.PeopleID -eq 100263} | select *) {
#Good I found it!!!
}
$List.add($item.PeopleID) > $null
}
Instead, I decided to import only my PeopleID column directly into an array to make it faster:
$csvPeople = Import-Csv "C:\test.csv" -delimiter ";" | select PeopleID
As you can see above, I also create an array $List to which I add every PeopleID in my loop.
So I have two arrays that are identical.
The problem arises when I use the -contains operator:
if($csvPeople -contains 100263)
the result is false
if($List -contains 100263)
the result is true
What can I do to make my $csvPeople array work with -contains?
Importing a single CSV column is faster than looping through the results and adding each value to a new array, but only the latter array works.
Am I missing something when I import my CSV column in order to get a "working" array?
thanks
I think you are looking for -like, not -contains. -contains is a bit finicky about how it is used. If you replace your if condition with this:
if($csvPeople -like "*100263*")
You should be good to go. Note the wildcards on either side, I'm putting these here for you since I don't know exactly what your data looks like. You might be either able to remove them or able to change them.
Obligatory -like vs -contains article if you are interested: http://windowsitpro.com/blog/powershell-contains
Also, @AnsgarWiechers' comment above will work. I believe you will still need to wrap your number in quotes though. I don't like to do this as it requires an exact match. If you are working with the CSV in Excel or elsewhere and you have whitespace or oddball line-ending characters, then you might not get a hit with -contains.
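For what it's worth, the behaviour in the question can be reproduced without a file: Select-Object PeopleID returns objects that each have a PeopleID property, whereas Select-Object -ExpandProperty PeopleID returns the bare values, and -contains compares against the elements themselves. A sketch with made-up in-memory data:

```powershell
# Made-up CSV content, parsed in memory instead of being read from C:\test.csv.
$csv = @'
PeopleID;Name
100263;Alice
100264;Bob
'@ | ConvertFrom-Csv -Delimiter ';'

$asObjects = $csv | Select-Object PeopleID                  # objects with a PeopleID property
$asValues  = $csv | Select-Object -ExpandProperty PeopleID  # plain values

$asObjects -contains 100263   # False: each element is an object, not a value
$asValues  -contains 100263   # True
```

Member enumeration ($csv.PeopleID -contains 100263) yields the same list of values as -ExpandProperty.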
I just noticed this in your script above where you are doubling your efforts:
foreach($item in $csv) # <-- Here $item is a record in the CSV
{
    if ($csv | where {$item.PeopleID -eq 100263} | select *) { # <-- Here you are re-parsing the CSV even though you already have the data in $item
    }
    $List.add($item.PeopleID) > $null
}
Since I don't have the full picture of what you are trying to do I'm providing a solution that allows you to match and take action per record.
foreach($item in $csv)
{
if ($item.PeopleID -eq 100263) {
# Do something with $item here now that you matched it
}
}
To Address your 191 users. I would take a different approach. You should be able to load a raw list of PeopleID's into a variable and then use the -in feature of Where to do a list-to-list comparison.
First load your target PeopleID's into a list. Assuming it is just a flat list, not a csv you could do your comparison like this:
$PeopleID_List = Get-Content YourList.txt
$csv | where { $_.PeopleID -in $PeopleID_List } | % {
    # Do something with the matched record here. Reference it with $_
}
This basically says for each record in the CSV I want you to check if that record's peopleID is in the $PeopleID_List. If it is, take action on that record.

Would a hash table speed this up? If so how would I do it?

I'm looking for the negative intersection of two arrays. Each array has about 20k elements. I'm using a foreach loop over one array and looking each value up in the other array. I'm only keeping elements in the first array not found in the second array:
$deadpaths=@()
$ix=0
ForEach ($f in $FSBuildIDs)
{
    if (-not($blArray -like $f)) {$deadpaths+=$paths[$ix]}
    $ix++
}
$blArray contains valid IDs. $FSBuildIDs contains the IDs corresponding to the file system paths in $paths. The intent is to only keep the elements in $paths where the corresponding ID in $FSBuildIDS is NOT in $blArray.
Is there a better way to do this? The processing here takes an extremely long time. Both $blArray and $FSBuildIDs have about 20k elements and I suspect I'm looking at O(n^2) comparisons.
I thought about using a Dictionary with the elements of $FSBuildIDs as the keys and $paths as the values, but I can't figure out from the docs how to initialize and load the Dictionary (assuming this approach would speed things up). Obviously negative set intersection would be best but this isn't TSQL and I'm painfully aware that even V4 of PS doesn't support set operations.
Would using a dictionary in this problem speed up the comparisons? If so how do I create it from $FSBuildIDs and $paths? Any other techniques that might give me a performance boost vs. just iterating over these large(ish) lists?
Sample data for $blArray:
51012
51044
51049
51055
51058
51060
51073
51074
51077
51085
Sample data for $FSBuildIDs:
51001
51003
51005
51009
51013
51017
51018
51020
51021
51024
51026
Sample data for $paths:
\\server1\d$\software\anthill\var\artifacts\0000\3774\0000\3792\0005\2335
\\server1\d$\software\anthill\var\artifacts\0000\3774\0000\3792\0005\2336
\\server1\d$\software\anthill\var\artifacts\0000\3774\0000\3792\0005\2337
\\server1\d$\software\anthill\var\artifacts\0000\3774\0000\3792\0005\2338
\\server1\d$\software\anthill\var\artifacts\0000\3774\0000\3792\0005\2339
\\server1\d$\software\anthill\var\artifacts\0000\3774\0000\3792\0005\2340
\\server1\d$\software\anthill\var\artifacts\0000\3774\0000\3792\0005\2341
This is similar to the question posed previously, but different in some aspects. I'm essentially looking for guidance on constructing a dictionary from two existing arrays. I realized after posting that I really need a dictionary from $blarray as the keys and maybe $True as the value. The value is irrelevant. The important test is whether or not the current value in $FSBuildIDs is found in $blarray. That could be a dictionary lookup based on the ID as the key. That should speed up the processing, right?
I'm not clear on the comment that I'm destroying and recreating the array each time. Is that the $deadPaths array? Simply adding to it causes that? If so would I be better using a .Net ArrayList?
You could achieve a significant improvement by using the -contains operator instead of -like.
When the left-hand side of a -like operation is an array, PowerShell will iterate the array and perform a -like comparison against each and every entry.
-contains, on the other hand, returns as soon as a match is found.
Consider the following example:
$array1 = 1..2000
$array2 = 2..2001
$like = Measure-Command {
foreach($i in $array2){
$array1 -like $i
}
} |Select -Expand TotalMilliseconds
$contains = Measure-Command {
foreach($i in $array2){
$array1 -contains $i
}
} |Select -Expand TotalMilliseconds
Write-Host "Operation with -like took: $($like)ms"
Write-Host "Operation with -contains took: $($contains)ms"
Just like in your real-world example, we have 2 integer arrays with a large overlap. Let's see how it performs on my Windows 7 laptop (PowerShell 4.0):
I think the result speaks for itself :-)
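The semantic difference between the two operators can be seen in a small sketch (the values are taken from the question's sample data): with an array on the left-hand side, -like acts as a filter and returns the matching elements, while -contains returns a single Boolean.

```powershell
$ids = '51012', '51044', '51049'

# -like with an array on the left: tests every element and returns the matches.
$likeResult = $ids -like '51044'

# -contains: returns $true or $false.
$containsResult = $ids -contains '51044'

@($likeResult).Count   # 1
$containsResult        # True
```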
That being said, you could, as you seem to anticipate, achieve an even greater improvement by populating a hashtable, using the values from the first array as keys:
$hashtable = $array1 |ForEach-Object -Begin {$t = @{}} -Process {
    $t[$_] = $null
    # the value doesn't matter, we're only interested in the key lookup
} -End { $t }
and then use the ContainsKey() method on the hashtable instead of -like:
foreach($i in $array2){
    if($hashtable.ContainsKey($i)) {
        # do stuff
    }
}
You'll need to bump up the size of the array to see the actual difference (here using 20K items in the first array):
Final test script can be found here
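Applied to the variables from the question, the same idea might look like the sketch below. It swaps the hashtable for a HashSet (only key membership is needed) and lets the loop emit its results instead of growing an array with +=, which copies the whole array on every addition. The values are shortened, hypothetical samples, and ::new() assumes PowerShell 5 or later:

```powershell
# Hypothetical, shortened sample data; $FSBuildIDs[$i] belongs to $paths[$i].
$blArray    = '51012', '51044', '51049'
$FSBuildIDs = '51001', '51012', '51003'
$paths      = '\\server1\share\a', '\\server1\share\b', '\\server1\share\c'

# Build the set of valid IDs once.
$validIds = [System.Collections.Generic.HashSet[string]]::new([string[]] $blArray)

# Collect the paths whose ID is NOT among the valid IDs.
$deadPaths = @(
    for ($ix = 0; $ix -lt $FSBuildIDs.Count; $ix++) {
        if (-not $validIds.Contains($FSBuildIDs[$ix])) { $paths[$ix] }
    }
)

$deadPaths.Count   # 2
```

Note that the IDs must have the same type on both sides (strings here); a HashSet[string] lookup with an integer would not behave the same way as the type-converting -contains operator.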
I think this would be the start of what you are looking for. As discussed in the comments, we are going to do two comparisons. First, to get the build IDs we need, we compare $FSBuildIDs and $blArray; then we take the result of that and compare it against the list of $paths. I am going to assume that it is just a string array of paths for now. Note there is room for error prevention and correction here. Still just testing for now.
$parsedIDs = Compare-Object $blArray $FSBuildIDs | Where{$_.SideIndicator -eq "=>"} | Select-Object -ExpandProperty InputObject
$paths = $paths | ForEach-Object{
$_ | Add-Member -MemberType NoteProperty -Name BuildID -Value (($_.Parent.Name + $_.Name) -as [int32]) -PassThru
}
$paths | Where-Object{$_.BuildID -in $parsedIDs}
First we compare the two ID arrays and keep the elements unique to $FSBuildIDs.
Next we go through the $paths. For each one we add a property that contains the build ID, which is the last two path elements concatenated and converted to an integer.
Once we have that, a simple Where-Object gives us the paths that have an ID present in the result of the first comparison.
To answer the question about building a hashtable:
$keyEnumerator = $FSBuildIDs.GetEnumerator()
$valEnumerator = $paths.GetEnumerator()
$idPathHash = @{}
foreach ($key in $keyEnumerator ) {
    $null = $valEnumerator.MoveNext()
    $idPathHash[$key] = $valEnumerator.Current
}
Running this code on my system with a 20000 element array of fake data took 138ms.
To build the list of build ids not in the $idPathHash:
$buildIDsNotIn =
foreach ($buildId in $blArray) {
if (!$idPathHash.ContainsKey($buildId )) {
$buildId
}
}
This took 50ms on my system, with 20000 items in $blArray, again with fake data.

Ensure pipeline always results in array without using @()?

Is there any nice way to ensure a pipeline result is always an array without the array subexpression operator @()?
Currently, I often find myself writing a pipeline and assuming that the result is an array, e.g.
$Results = $ResultFiles | Where Name -like $Pattern | Sort -Unique Name
# Processing that assume $Results is array, e.g.
Out-Host -InputObject "found $($Results.Length) matching files..."
# Further processing that assume $Results is array
Then I realize that I need to ensure $Results is always an array. So I come back and add the magic @(), with @( in front and ) at the back:
$Results = @( <pipeline_statement> )
or use even more magic by adding to an empty array:
$Results = @() + <pipeline_statement>
My question: is there any way to ensure that a pipeline always results in an array that doesn't require the "magic" @()? I am thinking of creating a function to collect pipeline results, e.g. ConvertTo-Array, like:
$Results = <pipeline_statement> | ConvertTo-Array
But I'd rather use a built-in cmdlet or idiom, if there is one.
Note:
I am also tempted to create ConvertTo-CustomObject, as I often find myself creating a PSCustomObject from a hash table.
Yes, there is. You can cast the result directly to an array. If the value is already an [Object[]], nothing is effectively changed.
[Array]$result=<pipeline_statement>
As mentioned in the comments by @wannabeprogrammer, this does not convert a null value into an array. If the pipeline can return null, the following addition will remedy the situation:
if ($null -eq $result) { $result = @() }
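The difference between the cast and @() can be checked with a small sketch. Get-Items is a hypothetical helper invented here to produce zero, one, or several output objects:

```powershell
# Hypothetical helper: emits $Count integers, or nothing at all for 0.
function Get-Items([int] $Count) {
    if ($Count -gt 0) { 1..$Count }
}

[array] $one  = Get-Items 1     # a single object becomes a one-element array
[array] $none = Get-Items 0     # no output: the variable stays $null
$wrapped      = @(Get-Items 0)  # no output: an empty array

$one.Count        # 1
$null -eq $none   # True
$wrapped.Count    # 0
```

So the [array] cast covers the single-object case, while @() also covers the no-output case without an extra null check.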
