I am working with two CSV files. One holds the names of users and the other holds their corresponding email addresses. I want to combine them so that the user names are column 1 and the email addresses are column 2, and write the result to a single file. So far, I've managed to add a second column from the email CSV file to the user CSV file, but its rows are blank. Below is the code that I am using:
$emailCol = Import-Csv "C:\files\temp\emailOnly.csv" | Select-Object -Skip 1
$emailArr = @{}
$i = 0
$nameCol = Import-Csv "C:\files\temp\nameOnly.csv"
foreach ($item in $emailCol) {
    $nameCol | Select *, @{
        Name = "email"; Expression = { $emailArr[$i] }
    } | Export-Csv -Path C:\files\temp\revised.csv -NoTypeInformation
}
Updated: Below is what worked for me. Thanks BenH!
function combineData {
    # This function combines the user CSV file and
    # the email CSV file into a single file.
    $emailCol = Get-Content "C:\files\temp\emailOnly.csv" |
        Select-Object -Skip 1
    $nameCol = Get-Content "C:\files\temp\nameOnly.csv" |
        Select-Object -Skip 1
    # Max function to find the larger count of the two
    # CSVs to use as the boundary for the counter.
    $count = [math]::Max($emailCol.Count, $nameCol.Count)
    $CombinedArray = for ($i = 0; $i -lt $count; $i++) {
        [PSCustomObject]@{
            fullName = $nameCol[$i]
            email    = $emailCol[$i]
        }
    }
    $CombinedArray | Export-Csv C:\files\temp\revised.csv -NoTypeInformation
}
To head off some follow-up questions on this topic, here is an alternative approach. It applies if both CSV files have the same number of lines and each line of the first file corresponds to the same line of the second. For example, users.csv:
User
Name1
Name2
Name3
Name4
Name5
and email.csv (the fourth data line, for Name4, is empty):
Email
mail1@gmail.com
mail2@gmail.com
mail3@gmail.com

mail5@gmail.com
The desired result:
"User","Email"
"Name1","mail1@gmail.com"
"Name2","mail2@gmail.com"
"Name3","mail3@gmail.com"
"Name4",
"Name5","mail5@gmail.com"
Here is how to get it:
$c1 = 'C:\path\to\user.csv'
$c2 = 'C:\path\to\email.csv'
[Linq.Enumerable]::Zip(
(Get-Content $c1), (Get-Content $c2),[Func[Object, Object, Object[]]]{$args -join ','}
) | ConvertFrom-Csv | Export-Csv C:\path\to\output.csv
If the desired result instead leaves out users without an email address:
"User","Email"
"Name1","mail1@gmail.com"
"Name2","mail2@gmail.com"
"Name3","mail3@gmail.com"
"Name5","mail5@gmail.com"
then filter on the Email property:
$c1 = 'C:\path\to\user.csv'
$c2 = 'C:\path\to\email.csv'
([Linq.Enumerable]::Zip(
(Get-Content $c1), (Get-Content $c2),[Func[Object, Object, Object[]]]{$args -join ','}
) | ConvertFrom-Csv).Where{$_.Email} | Export-Csv C:\path\to\output.csv
Hope this helps you in the future.
A for loop would be better suited for your loop. Then use the counter as the index for each of the arrays to build your new object.
$emailCol = Get-Content "C:\files\temp\emailOnly.csv" | Select-Object -Skip 1
$nameCol = Get-Content "C:\files\temp\nameOnly.csv" | Select-Object -Skip 1
# Max function to find the larger count of the two csvs to use as the boundary for the counter.
$count = [math]::Max($emailCol.count,$nameCol.count)
$CombinedArray = for ($i = 0; $i -lt $count; $i++) {
[PSCustomObject]@{
Name = $nameCol[$i]
Email = $emailCol[$i]
}
}
$CombinedArray | Export-Csv C:\files\temp\revised.csv -NoTypeInformation
Edit: the answer now uses Get-Content instead of Import-Csv so that blank lines are preserved, with a Select-Object -Skip added to skip the header line.
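To see why using the larger count as the loop boundary is safe, here is a minimal sketch with in-memory arrays instead of files. The sample values are hypothetical. It relies on PowerShell returning $null for an out-of-range array index, which holds by default but not under Set-StrictMode -Version 3.0 or later.

```powershell
# Hypothetical in-memory stand-ins for the data lines of the two files.
$nameCol  = @('Name1', 'Name2', 'Name3')
$emailCol = @('mail1@example.org', 'mail2@example.org')

# Loop up to the longer of the two arrays.
$count = [math]::Max($nameCol.Count, $emailCol.Count)

$combined = for ($i = 0; $i -lt $count; $i++) {
    [PSCustomObject]@{
        Name  = $nameCol[$i]
        Email = $emailCol[$i]   # $null once $i runs past the shorter array
    }
}

# Show the result as CSV text instead of writing a file.
$combined | ConvertTo-Csv -NoTypeInformation
```

The last user still gets a row; its Email property is simply empty.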
Maybe the title is wrong, but I don't know how else to explain it.
I have 4 CSV files with approximately 15,000 rows each, looking like this:
number,"surname","forename","emailAddress","taxIdentifier"
100238963,"Smith","John","john.smith@gmail.com","xxxxxxxxxxxx"
I read in 9999 of the rows and create a JSON file that we submit to a site to check every person. We then get a response back for most of the users, and that response contains their "number".
Then I need to find all of those persons in the first array.
This is how I do it today, but checking every person this way takes too much time. Is there a better way of doing this?
This is the code that gets the persons from the file and creates the JSON file:
$Files = Get-ChildItem -Path "$Folders\\*" -Include *.csv -Force
foreach ($File in $Files){
$fname = $file
$fname = (Split-Path $File.name -leaf).ToString().Replace(".csv", "")
$Savefile = $fname+ "_Cleaned.csv"
$users = Import-Csv $File
$body = "{`"requestId`": `"144x25`",`"items`": ["
$batchSize = 9999
$batchNum = 0
$row = 0
while ($row -lt $users.Count) {
$test = $users[$row..($row + $batchSize - 1)]
foreach ($user in $test) {
$nr = $user.number
$tax = $user.taxIdentifier
$body += "{`"itemId`": `"$nr`",`"subjectId`": `"$tax`"},"
}
And this is the code that deals with the response:
$Result = @()
foreach ($1 in $response.allowedItemIds)
{
foreach ($2 in $Users){
If ($2.number -like $1)
{
$Result += [pscustomobject]@{
number = $2.number
Surname = $2.surname
Forename = $2.forename
Email = $2.emailaddress
Taxidendifier = $2.taxIdentifier
}
}
}
}
$Result | Export-Csv -path "$folders\$savefile" -NoTypeInformation -Append
$row += $batchSize
$batchNum++
Hope someone has any ideas
Cheers
I think you can just do this:
# read the original data file
$originalCsv = @"
number,"surname","forename","emailAddress","taxIdentifier"
1000,"Smith","Mel","mel.smith@example.org","xxxxxxxxxxxx"
3000,"Wilde","Kim","kim.wilde@example.org","xxxxxxxxxxxx"
2000,"Jones","Gryff Rhys","gryff.jones@example.org","xxxxxxxxxxxx"
"@
$originalData = $originalCsv | ConvertFrom-Csv
# get a response from the api
$responseJson = @"
{
    "requestId": "144x25",
    "responseId": "2efb8b47-d693-46ac-96b1-a31288567cf3",
    "allowedItemIds": [ 1000, 2000 ]
}
"@
$responseData = $responseJson | ConvertFrom-Json
# filter original data for matches to the response
# ($matches is an automatic variable in PowerShell, so use a different name)
$matchedRows = $originalData | Where-Object { $_.number -in $responseData.allowedItemIds }
# number surname forename   emailAddress            taxIdentifier
# ------ ------- --------   ------------            -------------
# 1000   Smith   Mel        mel.smith@example.org   xxxxxxxxxxxx
# 2000   Jones   Gryff Rhys gryff.jones@example.org xxxxxxxxxxxx
# write the data out
$matchedRows | Export-Csv -Path ".\myfile.csv" -NoTypeInformation -Append
I don't know if that will perform better than your example, but it should do as it's not got a nested loop that runs original row count * response row count times.
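If the -in filter is still too slow with thousands of IDs, one option is to put the allowed IDs into a hash set first, so each row needs a single lookup instead of a scan of the whole ID list. This is a minimal sketch under assumptions: the sample rows and the $allowedItemIds array are hypothetical stand-ins for the imported CSV and for $response.allowedItemIds, and the ::new() syntax needs PowerShell 5 or later.

```powershell
# Hypothetical sample data standing in for the CSV rows and the API response.
$originalData = @'
number,surname,forename
1000,Smith,Mel
3000,Wilde,Kim
2000,Jones,Gryff Rhys
'@ | ConvertFrom-Csv
$allowedItemIds = 1000, 2000

# Build the lookup once. IDs are stored as strings because
# Import-Csv / ConvertFrom-Csv produce string properties.
$allowed = [System.Collections.Generic.HashSet[string]]::new()
foreach ($id in $allowedItemIds) { [void]$allowed.Add([string]$id) }

# One hash lookup per row instead of a nested loop.
$matchedRows = $originalData | Where-Object { $allowed.Contains($_.number) }
$matchedRows | Format-Table -AutoSize
```

The rows keep their original order, so the result can go straight to Export-Csv.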
Bond.out file example (looking to replace what is highlighted):
Out.csv file (data to be used):
Code:
#set paths up
$filepath= 'C:\folder\path\bond.out'
$filepath2= 'C:\folder\path\temp.txt'
$Ticklist='C:\folder\path\tick.txt'
$ratelist='C:\folder\path\rate.txt'
#Import needed data from an Excel file, which creates an array
$csv = Import-CSV C:\folder\path\RateIDTable.csv | Where { $_.'Rate' -ne "" } | Export-Csv C:\folder\path\out.csv -NoTypeInformation
$bond = Import-CSV C:\folder\path\out.csv | select -Property TickerID, Rate
#Put array from Excel file into two text files
$Tick = $bond | foreach-object {$_.TickerID} | set-content $Ticklist
$replace = $bond | foreach-object {$_.rate} | set-content $Ratelist
#Create two separate arrays from the new text files
$Tickdata = (Get-content $Ticklist ) -join ','
foreach ($t in $Tickdata)
{
$t = $t -split(",")
$First = $t[0]}
$Ratedata = (Get-content $Ratelist ) -join ','
foreach ($r in $Ratedata)
{
$r = $r -split(",")
$First = $r[0]}
#Get main file to search (bond.out) and search for the word that is in the first line from "t" array file
$data = Select-String $filepath -pattern $t[0] | Select-Object -ExpandProperty Line
$data
#Once found, split the line, replace the rate on the 3rd line with the rate in the first line from the "r" array file, then put the line back together
$split=$data.split("{|}")
$split[3]=$r[0]
$join = $split -join "|"
$join
#Put the updated line back into the "bond.out" file from whence it came
(get-content $filepath) -replace($data,$join) | set-content $filepath
#computer says no :(
Output:
As you can see, it actually replaces the rate and puts it all back like I need it to. But that last line doesn't seem to work. Instead I get the file back like so:
It appears as though it is repeating the same line from the $join parameter and adding letters to the beginning of each iteration.
I believe it has something to do with the '|' at the end of the line, and remember reading something about marking the beginning and end of lines some time ago, but can't find it anywhere.
Here's an idea. Instead of using regular expressions ...
The Import-Csv command has a -Delimiter parameter. Can you just import bond.out as a "CSV" (but with a pipe delimiter), and update it just like you would a CSV file?
Pseudo-code
### Convert bond.out to objects
$BondOut = Import-Csv -Delimiter '|' -Path $FilePath
### Get the line you want to update
$LineToUpdate = $BondOut | Where-Object { $PSItem.TickerID -eq 'BBG0019K2QZ5' } | Select-Object -First 1
### Update the Rate property from your source (out.csv)
$LineToUpdate.Rate = $SomeSource.Rate
### Export the modified objects to a new bond.out.modified file
$BondOut | Export-Csv -Delimiter '|' -Path 'bond.out.modified' -NoTypeInformation
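The same idea as a runnable sketch on in-memory text. The column names (TickerID, Name, Rate), the second sample row, and the $rates hashtable are assumptions for illustration; the real bond.out layout is not shown here. If the file has no header row, ConvertFrom-Csv and Import-Csv accept a -Header parameter to supply the names.

```powershell
# Hypothetical pipe-delimited content standing in for bond.out.
$bondOut = @'
TickerID|Name|Rate
BBG0019K2QZ5|Example Bond|1.25
BBG000000001|Other Bond|2.50
'@ | ConvertFrom-Csv -Delimiter '|'

# Hypothetical new rates keyed by ticker, standing in for out.csv.
$rates = @{ 'BBG0019K2QZ5' = '9.99' }

# Update the Rate property of every row that has a new rate.
foreach ($row in $bondOut) {
    if ($rates.ContainsKey($row.TickerID)) {
        $row.Rate = $rates[$row.TickerID]
    }
}

# Emit the rows as pipe-delimited text again.
$bondOut | ConvertTo-Csv -Delimiter '|' -NoTypeInformation
```

Because the rows are objects, no string splitting or pattern matching is involved.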
As per PetSerAI's clue:
#set paths up
$filepath= 'C:\folder\path\bond.out'
$filepath2= 'C:\folder\path\temp.txt'
$Ticklist='C:\folder\path\tick.txt'
$ratelist='C:\folder\path\rate.txt'
#Import needed data from an Excel file, which creates an array
$csv = Import-CSV C:\folder\path\RateIDTable.csv | Where { $_.'Rate' -ne "" } | Export-Csv C:\folder\path\out.csv -NoTypeInformation
$bond = Import-CSV C:\folder\path\out.csv | select -Property TickerID, Rate
#Put array from Excel file into two text files
$Tick = $bond | foreach-object {$_.TickerID} | set-content $Ticklist
$replace = $bond | foreach-object {$_.rate} | set-content $Ratelist
#Create two separate arrays from the new text files
$Tickdata = (Get-content $Ticklist ) -join ','
foreach ($t in $Tickdata)
{
$t = $t -split(",")
}
$Ratedata = (Get-content $Ratelist ) -join ','
foreach ($r in $Ratedata)
{
$r = $r -split(",")
}
#Get main file to search (bond.out) and search for the word that is in the first line from "t" array file
###Replace all pipes with a comma
(get-content $filepath) -replace('\|', ',') | set-content $filepath
$data = Select-String $filepath -pattern $t[0] | Select-Object -ExpandProperty Line
$data
#Once found, split the line, replace the rate on the 3rd line with the rate in the first line from the "r" array file, then put the line back together
$split=$data.split("{,}")
$split[3]=$r[0]
$join = $split -join ","
#Put the updated line back into the "bond.out" file from whence it came
###change all commas back to pipes
(get-content $filepath) -replace($data,$join) | set-content $filepath
(get-content $filepath) -replace(',', '|') | set-content $filepath
#computer says yay :D
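An alternative to swapping pipes for commas and back is to escape the search text. The -replace operator treats its first operand as a regular expression, where | means alternation, so an unescaped line that ends in | also matches the empty string at every position. The sketch below uses hypothetical sample lines rather than the real bond.out content.

```powershell
# Hypothetical pipe-delimited lines standing in for the bond.out content.
$lines = 'AAA|BBB|CCC|1.25|', 'DDD|EEE|FFF|2.50|'
$data  = 'AAA|BBB|CCC|1.25|'   # the line that was found
$join  = 'AAA|BBB|CCC|9.99|'   # the same line with the new rate

# [regex]::Escape makes -replace match the line literally.
$fixed = $lines -replace [regex]::Escape($data), $join

# String.Replace never interprets the search text as a regex.
$alsoFixed = $lines | ForEach-Object { $_.Replace($data, $join) }

$fixed
```

Either form leaves every non-matching line untouched.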
Here is my dilemma. I have a csv file with two columns
ID,FullFileName
1,Value1
1,Value2
1,Value3
2,Value1
2,Value2
3,Value1
4,Value1
5,Value1
5,Value2
The output I'm looking for is an exported CSV with two columns, ID and FullFilename. The value in FullFilename should contain all matching values for that ID, joined with a pipe delimiter.
The output I'm trying to get is the following:
ID,FullFilename
1,Value1|Value2|Value3
2,Value1|Value2
3,Value1
4,Value1
5,Value1|Value2
I'm not sure how to make PowerShell look up each ID, take all of its results, and combine them into a single concatenated value with a pipe separator. Any assistance with searching the array or joining / concatenating array values would be greatly appreciated.
Group-Object is the cmdlet that can help you here. Grouping the data by ID turns it into:
PS D:\> ipcsv .\t.csv | group id
Count Name Group
----- ---- -----
    3 1    {@{ID=1; FullFileName=Value1}, @{ID=1; FullFileName=Value2}, @{ID=1; FullFileName=Value3}}
    2 2    {@{ID=2; FullFileName=Value1}, @{ID=2; FullFileName=Value2}}
    1 3    {@{ID=3; FullFileName=Value1}}
    1 4    {@{ID=4; FullFileName=Value1}}
    2 5    {@{ID=5; FullFileName=Value1}, @{ID=5; FullFileName=Value2}}
So you want the Name (= ID) and, from the Group property, just the FullFileName values, joined up:
Import-Csv -Path c:\path\data.csv |
    Group-Object -Property ID |
    Select-Object @{Name='ID'; Expression={$_.Name}},
                  @{Name='FullFilename'; Expression={$_.Group.FullFileName -join '|'}} |
    Export-Csv -Path C:\Path\out.csv -NoTypeInformation
$InFile = '.\Sample.csv'
$OutFile= '.\New.csv'
$Csv = Import-Csv $InFile | Group-Object ID | ForEach-Object{
[pscustomobject]@{
ID=$_.Name
FullFileName=$_.Group.FullFileName -join '|'
}
}
$Csv
"----------"
$Csv | Export-Csv $OutFile -NoTypeInformation
Get-Content $OutFile
Sample output:
ID FullFileName
-- ------------
1 Value1|Value2|Value3
2 Value1|Value2
3 Value1
4 Value1
5 Value1|Value2
----------
"ID","FullFileName"
"1","Value1|Value2|Value3"
"2","Value1|Value2"
"3","Value1"
"4","Value1"
"5","Value1|Value2"
Edit: I just saw you wanted the pipe as the delimiter.
I did my best but was too late; anyway, here is the code.
Instead of Group-Object I am using a hashtable; I find it easy to work with when my data comes from multiple sources.
$CSV = Import-Csv -Delimiter ',' -Path "$env:TEMP\testfolder\csv.txt" #can be .csv or whatever.
$HastTable = @{}
Foreach ($Line in $CSV) {
if (!$HastTable["$($Line.ID)"]) {
$HastTable["$($Line.ID)"] = $Line
}
else {
$HastTable["$($Line.ID)"].FullFileName += "|$($Line.FullFileName)"
}
}
$HastTable.Values | Export-Csv -Delimiter ',' -NoTypeInformation -Path "$env:TEMP\testfolder\newcsv.txt" #can be .csv or whatever.
I have a SQL table that contains several hundred rows of data. One of the columns in this table contains text reports that were stored as plain text within the column.
Essentially, I need to iterate through each row of data in SQL and output the contents of each row's report column to its own individual text file with a unique name pulled from another column.
I am trying to accomplish this via PowerShell and I seem to be hung up. Below is what I have thus far.
foreach ($i=0; $i -le $Reports.Count; $i++)
{
$SDIR = "C:\harassmentreports"
$FILENAME = $Reports | Select-Object FILENAME
$FILETEXT = $Reports | Select-Object TEXT
$NAME = "$SDIR\$FILENAME.txt"
if (!([System.IO.File]::Exists($NAME))) {
Out-File $NAME | Set-Content -Path $FULLFILE -Value $FILETEXT
}
}
Assuming that $Reports is a list of the records from your SQL query, you'll want to fix the following issues:
In an indexed loop use indexed access to the elements of your array:
$FILENAME = $Reports[$i] | Select-Object FILENAME
$FILETEXT = $Reports[$i] | Select-Object TEXT
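Define variables outside the loop if their value doesn't change inside the loop. Also note that an indexed loop uses the for keyword (foreach takes no index expression), and that the condition should be -lt rather than -le so the index never runs past the last element:
$SDIR = "C:\harassmentreports"
for ($i=0; $i -lt $Reports.Count; $i++) {
...
}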
Expand properties if you want to use their value:
$FILENAME = $Reports[$i] | Select-Object -Expand FILENAME
$FILETEXT = $Reports[$i] | Select-Object -Expand TEXT
Use Join-Path for constructing paths:
$NAME = Join-Path $SDIR "$FILENAME.txt"
Use Test-Path for checking the existence of a file or folder:
if (-not (Test-Path -LiteralPath $NAME)) {
...
}
Use either Out-File
Out-File -FilePath $NAME -InputObject $FILETEXT
or Set-Content
Set-Content -Path $NAME -Value $FILETEXT
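not both of them. The basic difference between the two cmdlets is their default encoding. In Windows PowerShell, the former writes Unicode (UTF-16LE), while the latter writes ASCII (in practice, the system's ANSI code page); in PowerShell 6 and later, both default to UTF-8 without a BOM. Both allow you to change the encoding via the -Encoding parameter.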
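As a small illustration of the -Encoding parameter, the sketch below writes the same text with both cmdlets using an explicit encoding, so the result does not depend on each cmdlet's default. The file name and the text are hypothetical.

```powershell
# Hypothetical target file in the temp directory.
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-demo.txt'

# Set-Content with an explicit encoding.
Set-Content -Path $file -Value 'report text' -Encoding UTF8

# Out-File with the same explicit encoding (overwrites the file).
Out-File -FilePath $file -InputObject 'report text' -Encoding UTF8

# Read the text back.
$content = Get-Content -Path $file
$content
```

Stating the encoding explicitly also keeps the output consistent across PowerShell versions.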
You may also want to reconsider using a for loop in the first place. A pipeline with a ForEach-Object loop might be a better approach:
$SDIR = "C:\harassmentreports"
$Reports | ForEach-Object {
$file = Join-Path $SDIR ($_.FILENAME + '.txt')
if (-not (Test-Path $file)) { Set-Content -Path $file -Value $_.TEXT }
}
I guess the question is in the title.
I have a CSV that looks something like
user,path,original_path
I'm trying to find duplicates on the original path, then output both the user and original_path line.
This is what I have so far.
$2 = Import-Csv 'Total 20_01_16.csv' | Group-Object -Property Original_path |
Where-Object { $_.count -ge 2 } | fl Group | out-string -width 500
This gives me the duplicates in Original_Path. I can see all the required information but I'll be danged if I know how to get to it or format it into something useful.
I did a bit of Googling and found this script:
$ROWS = Import-CSV -Path 'Total 20_01_16.csv'
$NAMES = @{}
$OUTPUT = foreach ( $ROW in $ROWS ) {
    IF ( $NAMES.ContainsKey( $ROW.Original_path ) -and $NAMES[$ROW.original_path] -lt 2 )
    { $ROW }
    $NAMES[$ROW.original_path] += 1
}
Write-Output $OUTPUT
I'm reluctant to use this because, well, first, I have no idea what it's doing. So little of it makes any sense to me, and I don't like using scripts I can't get my head around.
Also, and this is the more important part, it's only giving me a single duplicate, it's not giving me both sets. I'm after both offending lines, so I can find both users with the same file.
If anyone could be so kind as to lend a hand I'd appreciate it.
Thanks
It depends on the output format you need, but to build on what you already have we can use this to show the records in the console:
Import-Csv 'Total 20_01_16.csv' |
Group-Object -Property Original_path |
Where-Object { $_.count -ge 2 } |
Foreach-Object { $_.Group } |
Format-Table User, Path, Original_path -AutoSize
Alternatively, use this to save them in a new CSV file:
Import-Csv 'Total 20_01_16.csv' |
Group-Object -Property Original_path |
Where-Object { $_.count -ge 2 } |
Foreach-Object { $_.Group } |
Select User, Path, Original_path |
Export-csv -Path output.csv -NoTypeInformation