PowerShell: grouping a CSV import file

Hopefully I'm just missing something that is simple...
I have a csv file similar to this :
Employee ID, Entry Date, Product Code,Amount Due
0001,20/11/2017,A001,10
0001,20/11/2017,Q003,13
0001,20/11/2017,H001,8
0002,20/11/2017,P003,12
0002,20/11/2017,A001,7
and what I want as an output is similar to this :
0001;<some header text>;20171120
A001;10
Q003;13
H001;8
0002;<some header text>;20171120
P003;12
A001;7
So that each detail section is grouped by the Employee ID it relates to
I have tried piping the group-object ("Employee ID") after using an import-csv ... but I can't get it to work correctly
Any help much appreciated!

I think this does what you need:
$CSV | Group-Object 'Employee ID' | ForEach-Object {
    $EntryDate = (Get-Date ($_.Group.'Entry Date' | Select -First 1) -F 'yyyyMMdd')
    "{0};<some header text>;{1}" -f $_.Name, $EntryDate
    ForEach ($Item in $_.Group) {
        "{0};{1}" -f $Item.'Product Code',$Item.'Amount Due'
    }
}
Explanation:
Uses Group-Object to group the results by Employee ID and then ForEach-Object to iterate through the collection.
Gets the Entry date from the first entry in the group, converts it to a date object with Get-Date and then formats it as year/month/day (this part assumes you don't care if there are other/differing dates in the collection).
Outputs the header string you wanted, using {0},{1} as placeholders for the variables passed via -f (there are multiple ways to do this but this seemed tidiest for your scenario).
Uses ForEach to iterate through the Group property, which contains the grouped items. Again using string replacement to output the product code and amount due fields in the format you desired.
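One caveat: Get-Date parses a string such as 20/11/2017 according to the session culture, so on a machine that expects month/day/year the conversion may fail. A minimal sketch that parses the day/month/year format explicitly is shown below. It assumes the sample rows from the question, inlined here with headers that have no stray spaces, and that every Entry Date uses dd/MM/yyyy.

```powershell
# Sample rows from the question, inlined so the sketch is self-contained
$CSV = @'
Employee ID,Entry Date,Product Code,Amount Due
0001,20/11/2017,A001,10
0001,20/11/2017,Q003,13
0001,20/11/2017,H001,8
0002,20/11/2017,P003,12
0002,20/11/2017,A001,7
'@ | ConvertFrom-Csv

$lines = $CSV | Group-Object 'Employee ID' | ForEach-Object {
    # Parse dd/MM/yyyy explicitly instead of relying on the session culture
    $first = $_.Group[0].'Entry Date'
    $entryDate = [datetime]::ParseExact($first, 'dd/MM/yyyy', [cultureinfo]::InvariantCulture)

    # Header line for the employee, then one detail line per grouped row
    '{0};<some header text>;{1}' -f $_.Name, $entryDate.ToString('yyyyMMdd', [cultureinfo]::InvariantCulture)
    foreach ($item in $_.Group) {
        '{0};{1}' -f $item.'Product Code', $item.'Amount Due'
    }
}
$lines
```

Piping $lines to Set-Content would write the result to a file.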

Related

Transpose to object but without values, yet

I'm importing esxtop data from an ESXi host. The import gives me two sets of data. The first is the list of all the fields that I will be getting data from. The second is data in those fields.
The fields list is like this:
|SchedGroup|GroupID|GroupName|IsValid|IsVM|VMName|TimerPeriod|CPUAllocMin|CPUAllocMax|
I converted this into a Selected.System.String named $Headercols:
header
------
SchedGroup
GroupID
GroupName
IsValid
IsVM
VMName
TimerPeriod
CPUAllocMin
CPUAllocMax
But I actually need it to be in columns instead of rows, because later on I want to add "records" / "rows" containing the data. I tried converting it to columns using Add-Member, but that needs a name and a value, and I don't know the value yet.
The example here only has a few rows, but this grows to 33 rows, so I'd rather fill them in a loop instead of addressing each one specifically.
Later on I will be receiving the data like this:
|SchedGroup|13567824|vm.4430541|1|1|BJLDC002|10416|0|-1|
|SchedGroup|12883482|vm.4316334|1|1|VCDJAGTEST|15260|0|-1|
|SchedGroup|13558449|vm.4429332|1|1|BJLADF02|1000|0|-1|
(Example has 3 rows, each one VM, but it can be many more rows with VMs.)
I know how to split the rows into separate values, but just as with the headers, I wouldn't know how to expand the object.
All examples I google are immediately adding name and value, I can't seem to get my head around how I should work with that.
"But I actually need it to be in columns, instead of rows."
You're getting ahead of yourself - "columns" and "rows" relate to how we might format (or display) the data - but right now we need to import and parse the data somehow.
I would suggest trimming the leading and trailing |'s, at which point we can treat the data set as a CSV (with | instead of , as the separator - a PSV?):
# Read header list and split into individual strings
$headerString = Get-Content .\path\to\headers.txt |Select -First 1
$headers = $headerString.Split('|', [StringSplitOptions]::RemoveEmptyEntries)
# Read data set, remove the leading and trailing pipes
$dataSetRaw = (Get-Content .\path\to\data.txt) -replace '^\||\|$'
# Convert to structured objects
$data = $dataSetRaw |ConvertFrom-Csv -Delimiter '|' -Header $headers
ConvertFrom-Csv will create one object per record, at which point you'll find most formatting cmdlets do exactly what you expect:
PS ~> $data |Format-Table
SchedGroup GroupID GroupName IsValid IsVM VMName TimerPeriod CPUAllocMin CPUAllocMax
---------- ------- --------- ------- ---- ------ ----------- ----------- -----------
SchedGroup 13567824 vm.4430541 1 1 BJLDC002 10416 0 -1
SchedGroup 12883482 vm.4316334 1 1 VCDJAGTEST 15260 0 -1
SchedGroup 13558449 vm.4429332 1 1 BJLADF02 1000 0 -1
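Because each record is now an object with named properties, later filtering works by property name. The sketch below repeats the parsing with the header and data lines from the question inlined as strings instead of files; the 5000 threshold is an arbitrary value chosen for illustration.

```powershell
# Header and data lines from the question, inlined instead of read from files
$headerString = '|SchedGroup|GroupID|GroupName|IsValid|IsVM|VMName|TimerPeriod|CPUAllocMin|CPUAllocMax|'
$rawRows = @(
    '|SchedGroup|13567824|vm.4430541|1|1|BJLDC002|10416|0|-1|'
    '|SchedGroup|12883482|vm.4316334|1|1|VCDJAGTEST|15260|0|-1|'
    '|SchedGroup|13558449|vm.4429332|1|1|BJLADF02|1000|0|-1|'
)

# Drop the outer pipes, then split the header into column names
$headers = $headerString.Trim('|').Split('|')

# Strip the leading and trailing pipe from every row and parse as pipe-separated values
$data = ($rawRows -replace '^\||\|$') | ConvertFrom-Csv -Delimiter '|' -Header $headers

# All values arrive as strings, so cast before comparing numerically
$busy = $data | Where-Object { [int]$_.TimerPeriod -gt 5000 } | Select-Object VMName, TimerPeriod
$busy
```

With 33 columns instead of 9 nothing changes: the property names still come from the header line.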
Continuing from my comment...
'|SchedGroup|GroupID|GroupName|IsValid|IsVM|VMName|TimerPeriod|CPUAllocMin|CPUAllocMax|' -replace '\|', ',' -replace '^,|,\s*$'
# Results
<#
SchedGroup,GroupID,GroupName,IsValid,IsVM,VMName,TimerPeriod,CPUAllocMin,CPUAllocMax
#>
Yet, you say your data is coming in this way...
SchedGroup,13567824,vm.4430541,1,1,BJLDC002,10416
SchedGroup,12883482,vm.4316334,1,1,VCDJAGTEST,15260
SchedGroup,13558449,vm.4429332,1,1,BJLADF02,1000
...then just import it and assign headers.
$Headers = @('|SchedGroup|GroupID|GroupName|IsValid|IsVM|VMName|TimerPeriod|CPUAllocMin|CPUAllocMax|' -replace '\|', ',' -replace '^,|,\s*$' -split ',')
Import-Csv -Path 'SomeFilePath' -Header $Headers

Powershell - Parse duplicates in a list

I'm working on an issue with SCCM delivered App-V connection groups. Occasionally it delivers multiple (duplicate) connection groups to the client and the apps don't function correctly.
I'm running get-appvclientconnectiongroups in user context and where duplicates exist, exporting out Name, Groupid and Version id to a CSV.
I then import this using an elevated Powershell session (as I need admin rights to remove connection groups).
So CSV headers are
Name, GroupID, VersionID
The duplication lies in the Name header only
E.g.
Name, Group ID, Version ID
Adobe_Reader, 123, 456
Adobe_Reader, 456, 789
Adobe_Reader, 111, 555
Notepad, 333,222
Notepad, 111,444
Receiver, 444,777
Receiver, 123,999
What I would like to do is, for each duplicated name, grab the Group ID and Version ID to use in Remove-AppvClientConnectionGroup. HOWEVER, I don't want to do this for every entry: I want to stop when there is one left of each name (i.e. when that name becomes unique in the list).
So in the end the list would be:
Adobe_Reader, 111, 555
Notepad, 111,444
Receiver, 123,999
And these are the ones we don't want to run through the cmdlet.
Any ideas? Apologies if that makes no sense!
I've been playing around with arrays but not getting anywhere fast.
Assuming you have a CSV file already, you can do the following to return the last item in a group of identical names:
Import-Csv file.csv | Group-Object Name |
Foreach-Object { $_.Group[-1] }
Explanation:
Using Group-Object, you can group objects based on a property value. Here, grouping by the Name property creates a collection of items with the properties Count, Name and Group. Name contains the value of the property you are grouping by. Count contains the number of objects that share that value. Group contains the objects themselves.
Since the Group property contains your objects, you can access them with the member access operator (.). When piping to Foreach-Object, $_.Group returns the objects in the current group. Then you simply need to grab the last element of that collection with [-1].
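To make the Count, Name and Group properties concrete, here is a small sketch that uses the sample rows from the question, inlined without the stray spaces. It only computes which rows would be kept and which would be candidates for removal; it does not call any App-V cmdlet. The variable names $keep and $remove are illustrative.

```powershell
# Sample rows from the question, inlined so the sketch is self-contained
$rows = @'
Name,GroupID,VersionID
Adobe_Reader,123,456
Adobe_Reader,456,789
Adobe_Reader,111,555
Notepad,333,222
Notepad,111,444
Receiver,444,777
Receiver,123,999
'@ | ConvertFrom-Csv

$groups = $rows | Group-Object Name

# One summary line per distinct name, with the number of rows sharing it
$groups | Select-Object Name, Count

# Last row of every group: the entries to keep
$keep = $groups | ForEach-Object { $_.Group[-1] }

# Everything except the last row of every group: the entries to remove
$remove = $groups | ForEach-Object { $_.Group | Select-Object -SkipLast 1 }
```

Select-Object -SkipLast needs PowerShell 5.0 or later; on older versions an index range can be used instead.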
If you have that information stored in a CSV file, you can do this to remove all but the last connection group:
Import-Csv -Path 'TheGroups.csv' | Group-Object Name | Where-Object { $_.Count -gt 1 } | Foreach-Object {
    $last = $_.Count - 2
    # remove all but the last connection group
    $_.Group[0..$last] | Remove-AppvClientConnectionGroup
}
Thanks! I managed to get it working with the code below after much messing about. As there can be multiple instances of duplicates, I pushed everything into an editable array, from which a line is removed each time a group is removed. It then checks how many duplicates are left for any given package and stops when there is one of each left.
$data = Import-Csv $env:ALLUSERSPROFILE\AppvDuplicateGroups.csv
# Push the data into an ArrayList so it can be edited on the fly
$dataarray = [System.Collections.ArrayList]($data)
# Get the array size, number of duplicates and target array size
$arraysize = $dataarray.Count
$dupescount = $dataarray | Group-Object Name
$arraytargetsize = $dupescount.Count
$i = $arraysize - 1
Function RemoveDuplicates() {
    # Select the relevant duplicate from the array based on index number (running bottom to top)
    $lineX = $dataarray | Select-Object -Index $i
    # Remove the duplicate
    Remove-AppvClientConnectionGroup -GroupId $lineX.GroupID -VersionId $lineX.VersionID
    # Remove the entry from the array
    $dataarray.RemoveAt($i)
    # Check whether that was the last entry for that particular connection group
    $getcount = $dataarray | Group-Object Name | Select-Object Name, Count | Where-Object Name -eq $lineX.Name
    # If there is still more than one entry for that package, run the function again on the next line up
    If ($getcount.Count -gt 1) {
        $i = $i - 1
        RemoveDuplicates
    }
    # If the array is still larger than the calculated target size, move up 2 lines in the array and run the function again
    Else {
        If ($dataarray.Count -gt $arraytargetsize) {
            $i = $i - 2
            RemoveDuplicates
        }
        # Once the array meets the calculated target size, repair all connection groups and remove the .csv file
        Else {
            Repair-AppvClientConnectionGroup *
            Remove-Item -Path $env:ALLUSERSPROFILE\AppvDuplicateGroups.csv
        }
    }
}
RemoveDuplicates

Selecting Distinct Items within Array using PowerShell and Linq

I have been banging my head on this problem for a few hours.
I have a multi-dimensional array and I need to select the unique items based on two "columns".
Is there an efficient .Net or otherwise way to do this and achieve the desired output?
The data looks something like:
ComputerName, IPAddress, MacAddress, FirstObserved
I would like to determine unique values based on MacAddress and ComputerName and keep the unique value based on the oldest FirstObserved date.
I have tried the PowerShell ways of doing this but it's horribly slow to say the least.
$data | Group-Object -Property ComputerName,MacAddress | ForEach-Object{$_.Group | Sort-Object -Property FirstObserved | Select-Object -First 1}
In a perfect world I would have a list of items no duplicates with the oldest entry based on FirstObserved date.
You can implement the grouping manually with a hashtable:
$FirstObserved = @{}
Import-Csv .\data.csv |ForEach-Object {
    $key = $_.Computername,$_.MacAddress -join ','
    $date = $_.FirstObserved -as [datetime]
    # Check if we already have an entry for this Name + MAC combo
    if($FirstObserved.Contains($key))
    {
        # Make sure the current date is older than what we have already
        if($FirstObserved[$key] -gt $date)
        {
            $FirstObserved[$key] = $date
        }
    }
    else
    {
        # First instance of the Name + MAC combo, save the date
        $FirstObserved[$key] = $date
    }
}
Now you can easily look up the first date for a combination with:
$FirstObserved["Computer-01,11:22:33:44:55:66"]
If you want to export the list to another csv, all you need to do is turn the key-value pairs of the hashtable into objects:
$FirstObserved.GetEnumerator() |Select-Object @{Name='Identity';Expression={$_.Key}},@{Name='FirstObserved';Expression={$_.Value}} |Export-Csv -Path output.csv
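If the other columns (such as IPAddress) should survive as well, a variation is to store the whole row in the hashtable instead of only the date. The following is a sketch under assumptions: the three sample rows and the ISO date format are invented for illustration, and $oldest is a hypothetical variable name.

```powershell
# Hypothetical sample data; the real input would come from Import-Csv
$data = @'
ComputerName,IPAddress,MacAddress,FirstObserved
PC-01,10.0.0.5,11:22:33:44:55:66,2020-03-01
PC-01,10.0.0.9,11:22:33:44:55:66,2020-01-15
PC-02,10.0.0.7,AA:BB:CC:DD:EE:FF,2020-02-10
'@ | ConvertFrom-Csv

$oldest = @{}
foreach ($row in $data) {
    $key = $row.ComputerName, $row.MacAddress -join ','
    # Keep the row if the key is new, or if this row is older than the stored one.
    # A [datetime] cast parses the string using the invariant culture.
    if (-not $oldest.ContainsKey($key) -or
        [datetime]$row.FirstObserved -lt [datetime]$oldest[$key].FirstObserved) {
        $oldest[$key] = $row
    }
}

# One full row per ComputerName + MacAddress combination
$oldest.Values
```

Like the hashtable approach above, this makes a single pass over the data and does not sort each group.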

why is FORMAT not working with expandproperty

I am trying to check if the latest row in the table was added at the current time using the following script check:
$updateTime = "SELECT TOP 1 FORMAT([UPDATE_TIME], 'M/d/yyyy h:mm tt') FROM [dbo].[$Table] ORDER BY UPDATE_TIME DESC" |
Select -ExpandProperty UPDATE_TIME;
Before adding format, the expand property was working very well as intended getting me the latest UPDATE_TIME value
$updateTime = "SELECT TOP 1 [UPDATE_TIME] FROM [dbo].[$Table] ORDER BY UPDATE_TIME DESC" |
Select -ExpandProperty UPDATE_TIME;
However, when I added FORMAT to the query, it results in this error:
Select : Property "UPDATE_TIME" cannot be found.
... ble] ORDER BY UPDATE_TIME DESC" | Select -ExpandProperty UPDATE_TIME;
I need to use FORMAT because I want to ignore the seconds, or, if there is a way I can ignore a 1 second difference between the UPDATE_TIME value and the Get-Date value, that would be even better.
$today = (Get-Date).ToString('M/d/yyyy h:mm tt')
if ($updateTime -eq $today) {
Write-Host "`r`n generated successfully at [ $updateTime ]"
} else {
Write-Host "`r`n NOT generated at [ $today ]"
}
With FORMAT but without -ExpandProperty, I get this output:
Does your PowerShell console auto-execute strings as SQL queries? PowerShell doesn't normally do that.
Anyway, assuming that your PowerShell instance actually works that way, using the FORMAT() function (or other functions) changes the title of the result column, which in turn becomes the name of the respective property of the output object(s). Because of that, the property UPDATE_TIME doesn't exist on those objects and thus cannot be expanded.
You need to define a column name to fix that:
SELECT TOP 1 FORMAT(...) AS UPDATE_TIME ...
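For the second part of the question (tolerating a small time difference), one option is to skip FORMAT and compare the two timestamps with a tolerance. This is only a sketch: it assumes the unformatted query returns UPDATE_TIME as a [datetime], and the value below is a hypothetical stand-in for the query result. The 2 second tolerance is arbitrary.

```powershell
# Hypothetical stand-in for the UPDATE_TIME value returned by the query
$updateTime = (Get-Date).AddSeconds(-1)

$now = Get-Date
$toleranceSeconds = 2

# Absolute difference in seconds between the two timestamps
$difference = [math]::Abs(($now - $updateTime).TotalSeconds)
$withinTolerance = $difference -le $toleranceSeconds

if ($withinTolerance) {
    Write-Host "`r`n generated successfully at [ $updateTime ]"
} else {
    Write-Host "`r`n NOT generated at [ $now ]"
}
```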

Use changing variable in loop in Powershell

I'm fairly new to PowerShell and tried to find a solution to simplify my script.
Problem: I need to update several fields with a number in the variable name in a loop. For now I did it using if statements (see below), but I need to update many fields and am looking for an easier way.
Below the example script:
#START OF SCRIPT
Import-Module ActiveDirectory
$field_value_1 = ""
$field_value_2 = ""
$field_value_3 = ""
$user_1 = "Dave"
$user_2 = "Chris"
$user_3 = "Bob"
for ($i=1;$i-le 3;$i++){
if($i -eq 1){$field_value_1 = Get-ADUser $user_1 | Select -Expand Name}
elseif($i -eq 2){$field_value_2 = Get-ADUser $user_2 | Select -Expand Name}
elseif($i -eq 3){$field_value_3 = Get-ADUser $user_3 | Select -Expand Name}
}
write-host "These are the values: $field_value_1, $field_value_2, $field_value_3"
#END OF SCRIPT
What I'm looking for is to add something like this in the loop instead of the if statements:
$field_value_$i = Get-ADUser $user_$i | Select -Expand Name
So basically that $field_value_$i translates into $field_value_1 and on the next loop to $field_value_2 etc. Same for the $user_x fields.
The result would be that in loop one, it would load the value from $user_1 (Dave), lookup the Active Directory name (Dave Cross), and write it to the $field_value_1 variable. On the next loop it would load the value from $user_2 (Chris), lookup the Active Directory name (Chris Bowes), and write it to the $field_value_2 variable and so on.
Does anyone know how to achieve this? I had a look at Dynamic variables but could not figure out how to do it.
UPDATE
The way to get the info is to use arrays. The next problem was writing the results to a text field on the form, and that only works when using iex (Invoke-Expression):
Invoke-Expression "`$textfield$i.Text = `$var[$i]"
This uses $i in the command to pull the info from the array and also to write it to the correct text field. You have to use the backtick (`) so that the $ is read as literal text. Invoke-Expression then executes the full command text :-)
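For comparison, here is a sketch of two ways to handle the numbered names without Invoke-Expression. Get-DisplayName is a hypothetical stand-in for the Get-ADUser lookup so that the example runs without the ActiveDirectory module.

```powershell
# Hypothetical stand-in for: Get-ADUser $User | Select -Expand Name
function Get-DisplayName([string]$User) { "$User (from AD)" }

# Option 1: keep the inputs and the results in arrays
$users = 'Dave', 'Chris', 'Bob'
$fieldValues = foreach ($user in $users) { Get-DisplayName $user }
Write-Host "These are the values: $($fieldValues -join ', ')"

# Option 2: resolve the numbered variable names at run time
$user_1 = 'Dave'; $user_2 = 'Chris'; $user_3 = 'Bob'
for ($i = 1; $i -le 3; $i++) {
    # Read $user_1, $user_2, ... by name
    $name = Get-Variable -Name "user_$i" -ValueOnly
    # Create or update $field_value_1, $field_value_2, ... by name
    Set-Variable -Name "field_value_$i" -Value (Get-DisplayName $name)
}
Write-Host "These are the values: $field_value_1, $field_value_2, $field_value_3"
```

Assuming the form controls are stored in variables named $textfield1, $textfield2 and so on, the same Get-Variable lookup should also reach them, for example (Get-Variable "textfield$i" -ValueOnly).Text = $fieldValues[$i - 1].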
