I'm trying to write a short script that computes the average CPU value for the last XY minutes.
I wrote something like this (just a short overview). The first part of the script stores values in a temp file; the average is then calculated from those values:
$CPU = ........
Add-Content "myfile.txt" "$CPU"
$array = Get-Content -Path myfile.txt
$AVG = ($array | Measure-Object -Average).Average
Then I added a first-in, first-out (FIFO) step:
if ($array.Length -gt XY) { $array = ($array[1..($array.Length-0)]) > myfile.txt }
When this condition is met, the next execution writes a "strange" character to the file instead of a number. The type command reports "?" as the last character in the file instead of a number, so the averaging function can't work with it.
This only fails in PowerShell version 2; I don't have the issue in version 3.
"FIFO" snippet :
(Get-Content c:\temp\test.txt) | Select-Object -Skip 1 | Set-Content c:\temp\test.txt
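A likely culprit in v2 is mixed file encodings: the > redirection operator writes UTF-16, while Add-Content appends ANSI text by default, so alternating between them corrupts the file. Using Add-Content/Set-Content consistently avoids that. Below is a minimal sketch of the whole cycle, assuming a threshold $XY and a WMI processor-load query as a stand-in sample source:
# Minimal sketch of the rolling-average cycle; $XY, the path, and the sample source are placeholders
$XY   = 10
$file = 'c:\temp\test.txt'
# Take one sample (stand-in source: average WMI processor load)
$CPU = (Get-WmiObject Win32_Processor | Measure-Object -Property LoadPercentage -Average).Average
Add-Content $file $CPU
# FIFO trim: keep only the newest $XY samples
$array = @(Get-Content $file)   # @() so a single line still counts as an array
if ($array.Length -gt $XY) {
    $array | Select-Object -Skip ($array.Length - $XY) | Set-Content $file
}
# Average the remaining samples
$AVG = (Get-Content $file | Measure-Object -Average).Average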
OK, PowerShell may not be the best tool for the job, but it's the only one available to me.
I have a bunch of 600K+ row .csv data files. Some of them have delimiter errors, e.g. a stray " in the middle of a text field or "" at the start of one. They are too big to edit and fix manually (even in UltraEdit), even if I wanted to, which I don't!
Because of the double-""-delimiter at the start of some text fields and the rogue " in the middle of others, I haven't used a header row to define the columns: the affected rows appear to have an extra column due to the extra delimiter.
I need to parse the file looking for "" instead of " at the start of a text field, and for " in the middle of a text field, and remove them.
I have managed to write the code to do this (after a fashion) by basically reading the whole file into an array, looping through it and adding output characters to an output array.
What I haven't managed to do is successfully write this output array to a file.
I have read every part of https://learn.microsoft.com/en-us/powershell/module/Microsoft.PowerShell.Utility/out-file?view=powershell-5.1 that seemed relevant. I've also trawled through about 10 similar questions on this site and attempted various code gleaned from them.
The output array prints perfectly to screen using Write-Host, but I can't get the data back into a file for love or money. I have a total of 1.5 days of PowerShell experience so far! All suggestions gratefully received.
Here is my code to read/identify rogue delimiters (not pretty at all; see the explanation above of the data and technology constraints):
$ContentToCheck = Get-Content 'myfile.csv' | ForEach-Object { $_.ToCharArray() }
$ContentOutputArray = @()
for ($i = 0; $i -lt $ContentToCheck.Count; $i++)
{
    if (!($ContentToCheck[$i] -match '"')) { #not a quote
        if (!($ContentToCheck[$i] -match ',')) { #not a comma, i.e. another char that could be enclosed in ""
            if ($ContentToCheck[$i-1] -match '"') { #check for rogue " delimiter in previous char; allow for start-of-file exception i>1?
                if (!($ContentToCheck[$i-2] -match ',') -and !($ContentToCheck[$i-3] -match '"')) {
                    Write-Host 'Delimiter error' $i
                    $ContentOutputArray += ''
                } #endif not preceded by ",
            } #endif"
            else { #previous char not a " so move on
                $ContentOutputArray += $ContentToCheck[$i]
            }
        } #endifnotacomma
        else { #a comma, include it
            $ContentOutputArray += $ContentToCheck[$i]
        } #endacomma
    } #endifnotaquote
    else { #a quote so just append it to the output array
        $ContentOutputArray += $ContentToCheck[$i]
    } #endaquote
} #endfor
So far so good, if inelegant. If I do a simple
Write-Host $ContentOutputArray
the data displays nicely: " 6 5 " , " 652 | | 999 " , " 99 " , " " , " 678 | | 1 " ..... Furthermore, when I check the size of the array (based on a cut-down version of one of the problem files) with
$ContentOutputArray.count
I get 2507, the character length of the array. Happy out. However, then variously using:
$ContentOutputArray | Set-Content 'myfile_FIXED.csv'
creates a blank file
$ContentOutputArray | Out-File 'myfile_FIXED.csv' -Encoding ASCII
creates a blank file
$ContentOutputArray | Export-Csv 'myfile_FIXED.csv'
gives only '#TYPE System.Char' in the file
$ContentOutputArray | Export-Csv 'myfile_FIXED.csv' -NoType
gives an empty file
$ContentOutputArray >> 'myfile_FIXED.csv'
gives blanks separated by commas
What else can I try to write an array of characters to a flat file? It seems such a basic question, but it has me stumped. Thanks for reading.
Convert (or cast) the char array to a string before exporting it.
(New-Object string (,$ContentOutputArray)) | Set-Content myfile_FIXED.csv
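Alternatively, the -join operator produces the same single string from the char array:
# Join the characters into one string, then write it as a single line
($ContentOutputArray -join '') | Set-Content myfile_FIXED.csv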
I just imported a bunch of pictures and realized that there are 3 copies of each picture, but they're named sequentially.
Basically these three files are the same:
P5240901.dng
P5240902.dng
P5240903.dng
And so on, for about 1600 pictures.
I was looking into writing a simple PowerShell script (I use Windows) that would look into the directory of these files and keep 1 file out of every three, just looping through a range of files.
I didn't find anything that would deal with the 'P' character before the file number, and I'm not familiar with the PowerShell language.
Any ideas?
Thank you!
Assuming everything in the dir follows the naming convention and comes in sets of 3, something like this should work:
$mydir = 'C:\path\to\files'
[int]$idx = 1
Get-ChildItem $mydir | Sort-Object Name | ForEach-Object {
    if ($idx % 3 -ne 1) { #use the modulus to keep every 3rd file
        $_ | Remove-Item
    }
    $idx++
}
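If you'd rather preview the deletions first, the same loop can be dry-run with Remove-Item's -WhatIf switch:
# Dry run: reports what would be deleted without touching anything
$idx = 1
Get-ChildItem $mydir | Sort-Object Name | ForEach-Object {
    if ($idx % 3 -ne 1) { $_ | Remove-Item -WhatIf }
    $idx++
}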
Try the following, which will keep only the 1st file in each group of files whose names are the same except for the last character before the filename extension, assuming that character is a digit (syntax assumes PSv3+):
'P5240901.dng', 'P5240902.dng', 'P5240903.dng', 'A1.dng', 'A2.dng', 'singleton.dng' |
Group-Object { $_ -replace '^(.+)\d\.', '$1' } |
? Count -gt 1 |
% { $_.Group[1..$($_.Group.Count)] }
yields:
P5240902.dng
P5240903.dng
A2.dng
Replace the sample input array with a call to Get-ChildItem -File, and prepend Remove-Item to $_.Group[1..$($_.Group.Count)] to perform actual deletion.
The above command uses a string array with input filenames, but the [System.IO.FileInfo] instances output by Get-ChildItem will effectively act the same in a string context: they will expand to their respective filenames.
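Put together, a sketch of the live version might look like this (piping to Remove-Item instead of prepending it, which is equivalent; -WhatIf previews the deletions and should be removed to actually delete):
# Sketch: group real files, keep the 1st of each group, delete the rest
Get-ChildItem -File *.dng |
  Group-Object { $_.Name -replace '^(.+)\d\.', '$1' } |
  ? Count -gt 1 |
  % { $_.Group[1..$($_.Group.Count)] | Remove-Item -WhatIf }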
The advantage of this solution is that it doesn't rely on input files appearing strictly in groups of 3:
Any group of input files whose names are identical except for a digit before the filename extension, and that has at least 2 members (any number beyond that works too), will have every member but the 1st deleted.
Any other files are left untouched.
Explanation:
Group-Object { $_ -replace '^(.+)\d\.', '$1' }
effectively groups the input files by the portion of the filename they share (but only if they share everything except the last character before the filename extension, and that character is a digit).
? Count -gt 1
only passes on those resulting groups that have at least 2 members.
% { $_.Group[1..$($_.Group.Count)] }
processes each group's files, except the 1st.
Update: Here's a variation prompted by the OP's later comments:
The following, given input filenames such as P5240901.dng, P5240902.dng, ..., P5240910.dng, P5240911.dng, ..., P5240990.dng, P5240991.dng, ..., P5240999.dng, will consider each group of 10 files a group (based on the tens place), and within each group only retain the 1st file:
1..99 | % { "P52409$('{0:00}' -f $_).dng" } |
Group-Object { $_ -replace '^(.+\d)\d\.', '$1' } |
? Count -gt 1 |
% { $_.Group[1..$($_.Group.Count)]}
yields:
# tens place of 0; skips ...01.dng
P5240902.dng
P5240903.dng
... # up to ...09.dng
# tens place of 1; skips ...10.dng
P5240911.dng
P5240912.dng
... # skips ...20.dng, ...30.dng, ...
# tens place of 9; skips ...90.dng
P5240991.dng
P5240992.dng
...
P5240999.dng
In order to only pass the files of interest to the command, replace the sample input array with
Get-ChildItem P52515[0-9][0-9].dng.
I have a pipe-delimited file containing 5 columns. I need to append a sixth (pipe-delimited) column to the end of each row.
Old data:
a|b|c|d|e
p|q|r|s|t
New Data:
a|b|c|d|e|x
p|q|r|s|t|x
The sixth column (x) is a value read from a text file.
I am wondering if there is a quick way to append this data to the existing data file using PowerShell? The file contains a variable number of rows (between 10 and 100,000).
Any help is appreciated.
Simple text operations should work:
$replace = 'x'
(Get-Content file.txt) -replace '$',"|$replace"
a|b|c|d|e|x
p|q|r|s|t|x
For large files, you can do this:
$replace = 'x'
filter add-data {$_ -replace '$',"|$replace"}
Get-Content file.txt -ReadCount 1000 | add-data | add-content newfile.txt
That should give very good performance with large files: -ReadCount 1000 sends the lines through the pipeline in arrays of 1000, and -replace operates on every element of such an array at once.
Assuming that your data does not have any headers in the CSV already, then you'll have to define the headers with the -Headers parameter of the Import-Csv cmdlet. To run the example below, put your data into a file called c:\test\test.csv. Then, run the script in PowerShell or PowerShell ISE.
# 1. Import the data
$Data = Import-Csv -Delimiter '|' -Path c:\test\test.csv -Header prop1,prop2,prop3,prop4,prop5;
# 2. Add a new member to each row
foreach ($Item in $Data) {
Add-Member -InputObject $Item -MemberType NoteProperty -Name prop6 -Value x;
}
# 3. Export the data to a new CSV file
$Data | Export-Csv -Delimiter '|' -Path c:\test\test.new.csv -NoTypeInformation;
# 4. Remove the double quotes around values
(Get-Content -Path c:\test\test.new.csv -Raw) -replace '"','' | Set-Content -Path c:\test\test.new.csv;
Original Data
The source data in c:\test\test.csv should look like this (according to your original post):
a|b|c|d|e
p|q|r|s|t
Resulting Data
After executing the script, your resulting data in c:\test\test.new.csv will look like this:
prop1|prop2|prop3|prop4|prop5|prop6
a|b|c|d|e|x
p|q|r|s|t|x
Random Sample Data Generation
Here is a short script that will generate a 10,000-line, randomized sample file to c:\test\test.csv:
$Random = { [System.Text.ASCIIEncoding]::ASCII.GetString((1..5 | % { [byte](Get-Random -Minimum 97 -Maximum 122); })).ToCharArray(); };
1..10000 | % { @('{0}|{1}|{2}|{3}|{4}' -f (& $Random)) } | Set-Content -Path c:\test\test.csv;
After running my first script against this sample data (10,000 lines), it took 1,729 milliseconds to execute. I would say that's pretty fast. Not that this is a race or anything.
I ran the sample file generator again to generate 100,000 lines of data. Running the same script against that data took 19,784 milliseconds. That's roughly proportional to the 10,000-line test, and all in all it still doesn't take very long. Is this a one-time thing, or does it need to run on a schedule?
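For reference, timings like those quoted above can be captured by wrapping the work in Measure-Command:
# Time the import/append/export pipeline from above
$elapsed = Measure-Command {
    $Data = Import-Csv -Delimiter '|' -Path c:\test\test.csv -Header prop1,prop2,prop3,prop4,prop5;
    foreach ($Item in $Data) {
        Add-Member -InputObject $Item -MemberType NoteProperty -Name prop6 -Value x;
    }
    $Data | Export-Csv -Delimiter '|' -Path c:\test\test.new.csv -NoTypeInformation;
}
'{0:N0} milliseconds' -f $elapsed.TotalMilliseconds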
You could loop through the file line by line and just append the value in the loop.
Edit: full sample code:
function append {
    process {
        $_ + "|x"   #emit each line with the new column appended
    }
}
$a = Get-Content yourcsv.csv
$a | append | Set-Content yourcsv.csv
I've got the following script that works and I'm trying to rewrite in a more efficient manner. The following lines of code below work and accomplish what I want:
Get-Content "C:\Documents and Settings\a411882\My Documents\Scripts\printserveroutput.txt" | Select-String -SimpleMatch -AllMatches -Pattern "/GPR" | % {
$_.Line -replace '(?:.*)(/GPR)(?:.*)(?<=on\s)(\w+)(?:.*)', '$1,$2'
}
Get-Content "C:\Documents and Settings\a411882\My Documents\Scripts\printserveroutput.txt" | Select-String -SimpleMatch -AllMatches -Pattern "DV6" | % {
$_.Line -replace '(?:.*)(DV6)(?:.*)(?<=on\s)(\w+)(?:.*)', '$1,$2'
}
I repeat the exact same lines of code seven times, slightly altering what I'm looking for each time. The output I get is the following, which is what I want (note: this is just a small sample):
/GPR,R3556
/GPR,R3556
While this works, I really don't like how cluttered the code is, so I decided to try rewriting it more efficiently. I've rewritten the code like this:
$My_Arr = "/GPR", "DV6", "DV7", "RT3", "DEV", "TST", "PRE"
$low = $My_Arr.getlowerbound(0)
$upper = $My_Arr.getupperbound(0)
for ($temp=$low; $temp -le $upper; $temp++){
$Test = $My_Arr[$Temp]
Get-Content "C:\Documents and Settings\a411882\My Documents\Scripts\printserveroutput.txt" | Select-String -SimpleMatch -AllMatches -Pattern $My_Arr[$temp] | % {
$_.Line -replace '(?:.*)($Test)(?:.*)(?<=on\s)(\w+)(?:.*)', '$1,$2'
}
}
The output that this gives me is the following:
10 Document 81, A361058/GPR0000151814_1: owned by A361058 was printed on R3556 via port IP_***.***.***.***. Size in bytes: 53704; pages printed: 2 20130219123105.000000-300
10 Document 80, A361058/GPR0000151802_1: owned by A361058 was printed on R3556 via port IP_***.***.***.***. Size in bytes: 53700; pages printed: 2 20130219123037.000000-300
This is almost correct; however, the -replace line is where the error occurs, since ($Test) is treated as a literal string rather than expanded as a variable. The code outputs the entire line of the text file every time it finds /GPR in this example, rather than the desired output shown above. Does anyone know of a way to fix this line and get the same output as the original code?
EDIT: the output I'm getting with the newer code is exactly the text that is in the .txt file I'm parsing. There are more lines than that in the .txt, but for the most part it is identical. I'm only interested in the /GPR (or any of the other strings in the array) and the server name, which comes after the word "on" each time.
I'd say this is caused by the single quotes, which prevent variable expansion.
PowerShell is trying to replace the literal string '(?:.*)($Test)(?:.*)(?<=on\s)(\w+)(?:.*)' without substituting the value of $Test.
Use double quotes for the pattern instead, but keep single quotes around the second string, as follows:
Get-Content "C:\Documents and Settings\a411882\My Documents\Scripts\printserveroutput.txt" | Select-String -SimpleMatch -AllMatches -Pattern $My_Arr[$temp] | % {
$_.Line -replace "(?:.*)($Test)(?:.*)(?<=on\s)(\w+)(?:.*)", '$1,$2'
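As a defensive extra: if any of the search strings could ever contain regex metacharacters, escape the variable before interpolating it, e.g. with [regex]::Escape() (the strings in your array happen to be safe as-is):
# Escape $Test so special regex characters are treated literally
$pattern = "(?:.*)($([regex]::Escape($Test)))(?:.*)(?<=on\s)(\w+)(?:.*)"
$_.Line -replace $pattern, '$1,$2'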
Does this work on your data?
$file = 'C:\Documents and Settings\a411882\My Documents\Scripts\printserveroutput.txt'
$regex = '(?:.*)(/GPR|DV6|DV7|RT3|DEV|TST|PRE)(?:.*)(?<=on\s)(\w+)(?:.*)'
(Get-Content $file) -match $regex -replace $regex,'$1,$2'
I am currently importing a CSV file which has a column that is all numbers. I am attempting to cast it as an int and only pull ones that are greater than 100. I have manually gone through this CSV file, and I can confirm that there are three rows with a greater-than-100% value. However, this always returns 0. What am I doing wrong?
$percentTooLarge = Import-Csv path\file.csv | Foreach-Object { $_.SumHoldingPercent = $_.SumHoldingPercent -as [int]; $_ } | Where-Object { $_.SumHoldingPercent -gt 100 } | Measure-Object
$numPercentTooLarge = $percentTooLarge.Count
Because of the way comparison operators work in PowerShell, this should do the trick:
$percentTooLarge = Import-Csv path\file.csv |
Where-Object { 100 -lt $_.SumHoldingPercent} |
Measure-Object
Basically, when you compare things, PowerShell tries to convert the right operand to the type of the left one. If you put the value from ipcsv (Import-Csv) first, the left side is a string. If you put the numeric literal first, PowerShell converts the value from the CSV file to a number (and it is smart enough to pick a numeric type that is big enough ;))
I tested with this code:
#"
foo,bar,percent
alfa,beta,120.5
beta,gamma,99.9
foo,bar,30.4
works,cool,120.7
"# | ConvertFrom-Csv | where { 100 -lt $_.percent }
... and the results seem OK.
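You can see the direction of the coercion with a quick pair of comparisons:
'99.9' -gt 100   # True  - 100 is coerced to the string '100' and compared alphabetically
100 -lt '99.9'   # False - '99.9' is coerced to the number 99.9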
Looking at your conversation with @Shay Levy, I would use:
[System.Globalization.CultureInfo] $culture = "en-us"
So you can try something like:
([decimal]::parse($_.SumHoldingPercent ,$culture)) -gt 100
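Wired into your original pipeline, that might look like this (a sketch, keeping the Measure-Object counting):
# Parse the column with an explicit culture, then count values over 100
[System.Globalization.CultureInfo] $culture = "en-us"
$percentTooLarge = Import-Csv path\file.csv |
    Where-Object { [decimal]::Parse($_.SumHoldingPercent, $culture) -gt 100 } |
    Measure-Object
$numPercentTooLarge = $percentTooLarge.Count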