Modify CSV file to add a digit character to a column - batch-file

I have a problem making a batch file.
The text below is in a file named export.dat, and I have no means to change the CSV field from 5 to 6 digits at the source.
The file comes from an older program that we prefer to keep using.
In the newer program I can add the required digit.
D,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00,07833,300.00,078330
Has to be after modification:
D,1,2126,4372,T,125P,,255473730,person,n,person,19800320,078330,300.00,078330 - A03.1 - Shigellosis d,,etc
So each record has a leading capital D followed by 13 commas, and the batch file should add a zero before the 13th comma.

@echo off
setlocal
set "target=.\export.dat"
set "destination=.\updated.dat"
powershell -noprofile -command^
 "$content = get-content -literalpath '%target%';"^
 "$modifiedContent = @();"^
 "$modifiedContent += $content[0];"^
 "ForEach ($line in $content[1 .. ($content.count - 1)]) {"^
 " if ($line.startswith('D,')) {"^
 "  $items = $line.split(',');"^
 "  if ($items.length -gt 14) {$items[14] += '0'; $line = $items -join ','};"^
 " };"^
 " $modifiedContent += $line"^
 "};"^
 "$modifiedContent | set-content -literalpath '%destination%'"
PowerShell has Import-Csv and Export-Csv, which can handle CSV files, though Export-Csv writes every field in double quotes. PowerShell 7 adds the -UseQuotes and -QuoteFields parameters to Export-Csv to control quoting, but it is still new at this time. With PowerShell versions before 7, the usual suggestion is to read the file with Get-Content, remove the double quotes and write it with Set-Content. Since the example content has no double quotes, I decided to use Get-Content, split on commas and write with Set-Content.
The following code is a hybrid batch file with PowerShell, which I used as the basis for the previous code.
It has comments that explain the code; such comments are not suitable for insertion into a single PowerShell command.
<# :: Begin bat code
@echo off
setlocal
set "target=.\export.dat"
set "destination=.\updated.dat"
powershell -noprofile "invoke-expression(get-content '%~f0' | out-string)"
exit /b 0
#> ## Begin ps1 code
# Filepaths from environment variables.
$target = $env:target
$destination = $env:destination
# Read file content into a variable.
$content = get-content -literalpath $target
# Create an empty array for modified content.
$modifiedContent = @()
# Add 1st line as unmodified header.
$modifiedContent += $content[0]
# Add the rest of the lines.
foreach ($line in $content[1 .. ($content.length - 1)]) {
# Require a leading capital D to modify a line.
if ($line.startswith('D,')) {
# Split by commas.
$items = $line.split(',')
# Modify item 14.
if ($items.length -gt 14) {
$items[14] += '0'
$line = $items -join ','
}
}
# Add line to array.
$modifiedContent += $line
}
# Write modified content to destination file.
$modifiedContent | set-content -literalpath $destination
If suitable, the PowerShell portion can be copied to a .ps1 file; just replace $target = $env:target and $destination = $env:destination with $target = '.\export.dat' and $destination = '.\updated.dat'.
Input export.dat:
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
D,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00,07833,300.00,078330
D,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00
d,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00,07833,300.00,078330
Output updated.dat:
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
D,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00,078330,300.00,078330
D,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00
d,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00,07833,300.00,078330
The 1st line is the header and is left unmodified.
The 2nd line is modified (item 14, counting from 0, gets a trailing zero).
The 3rd line has too few items to modify.
The 4th line starts with a lowercase d; the match on D, is case-sensitive, so the line is not modified.
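The per-line logic described above can also be sketched as a small function, which makes it easy to try on single lines. The function name Add-TrailingZero and its -Index parameter are hypothetical, introduced only for this illustration; the default index of 14 mirrors the code above, and a different index (for example 12, the item before the 13th comma) could be passed if that is the column that needs the extra digit.

```powershell
# Hypothetical helper: appends '0' to one item of a line that starts with 'D,'.
function Add-TrailingZero {
    param(
        [string] $Line,
        [int]    $Index = 14   # 0-based item index, same default as the code above
    )
    # Lines without a leading capital 'D,' pass through unchanged (case-sensitive).
    if (-not $Line.StartsWith('D,')) { return $Line }
    $items = $Line.Split(',')
    # Only modify lines that actually have an item at the requested index.
    if ($items.Length -gt $Index) { $items[$Index] += '0' }
    $items -join ','
}

$sample   = 'D,1,2126,4372,T,125P,,255473730,person,n,person,19800320,07833,300.00,07833,300.00,078330'
$modified = Add-TrailingZero -Line $sample
$short    = Add-TrailingZero -Line 'D,1,2126'
$lower    = Add-TrailingZero -Line 'd,1,2126'
```

Here $modified corresponds to the 2nd line of updated.dat, while $short and $lower come back unchanged.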

Related

Math from a file.txt

$space =("`r`n")
$data = @(Get-Content C:\Users\user1\Desktop\ma.txt)
$Array = ($Data.Split($space)).Split($space)
$pos1 = $Array[0][0]+$Array[0][1]
$pos2 = $Array[1][0]+$Array[1][1]
$pos3 = $Array[2][0]+$Array[2][1]
#$pos1
#$pos2
#$pos3
$zahl1 = $Array[0][5]+$Array[0][7]+$Array[0][9]
$zahl1
PowerShell 7.2
txt1.txt has the text:
x1 = 2 + 3
x2 = 8 / 4
x3 = 1 - 4
I want the results (of x1, x2, x3) to be saved to txt2.txt with a command in the terminal.
I tried with arrays,
but I only get 2+3 instead of 5.
Any thoughts?
You could use Invoke-Expression for this, but read the warning first: it executes whatever text it is given as code, so only use it on input you trust.
Get-Content -Path text1.txt | Where-Object {$_ -match '\S'} | ForEach-Object {
$var,$calculation = ($_ -split '=').Trim()
'{0} --> {1}' -f $var, (Invoke-Expression -Command $calculation)
} | Set-Content -Path text2.txt
This is an attempt at a more secure version that matches only mathematical expressions, so users cannot run arbitrary code through Invoke-Expression:
Get-Content text1.txt |
Select-String '^\s*(\S+)\s*=([\d\.+\-*/%\(\)\s]+)$' |
ForEach-Object {
$var = $_.Matches.Groups[ 1 ].Value
$expression = $_.Matches.Groups[ 2 ].Value
$result = Invoke-Expression $expression
"{0} = {1}" -f $var, $result
} |
Set-Content text2.txt
The Select-String cmdlet uses a regular expression to match only lines that are considered "safe". Within the RegEx there are two groups defined to split the line into variable (1) and calculation (2) sub strings. These are then extracted via $_.Matches.Groups.
RegEx breakdown:
Pattern               Description
^                     line start
\s*                   zero or more whitespace characters
(                     start 1st capturing group
 \S+                  one or more non-whitespace characters
)                     end 1st capturing group
\s*                   zero or more whitespace characters
=                     literal "="
(                     start 2nd capturing group
 [                    start list of allowed characters
  \d\.+\-*/%\(\)\s    digits, dot, math ops, parentheses, whitespace
 ]                    end the list of allowed characters
 +                    one or more chars (from the list of allowed characters)
)                     end 2nd capturing group
$                     line end
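To see which lines the pattern broken down above lets through, here is a minimal sketch that only tests the match and never calls Invoke-Expression. The sample lines are made up for illustration; the third one stands in for input that should be rejected.

```powershell
# Same pattern as in the answer above.
$pattern = '^\s*(\S+)\s*=([\d\.+\-*/%\(\)\s]+)$'

# Made-up sample input; the last line contains letters after '=' and must not match.
$samples = 'x1 = 2 + 3', 'x2 = 8 / 4', 'x3 = Remove-Item foo'

$accepted = foreach ($line in $samples) {
    if ($line -match $pattern) {
        # $Matches[1] is the variable name, $Matches[2] the expression text.
        '{0} -> {1}' -f $Matches[1], $Matches[2].Trim()
    }
}
$accepted
```

Only the first two lines are accepted, because every character after the = must come from the allowed list.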

Out-File output is missing line feeds between lines of data

I am passing in an array of $users.
PS C:\> $users | ft
ID DisplayName AdminID first last Password
---- ----------- ------- ----- ---- --------
Axyz Axyz, Bill NBX_Admin Bill Axyz Secret
The code:
$y = @()
$y = "Create Users process. Run started at $('[{0:MM/dd/yyyy} {0:HH:mm:ss}]' -f (Get-Date))"
foreach ($x in $users) {
$y += "User $($x.DisplayName) with NNN of $($x.ID)"
}
$y += "Completed at $('[{0:MM/dd/yyyy} {0:HH:mm:ss}]' -f (Get-Date))"
$y | Out-File "Log.txt"
$y is now an unformatted string array. When I type $y to the screen, it looks great.
If I direct it to Format-Table, it looks great (no headings).
When I output it to a file, and type that file at a Command Prompt (cmd.exe), it looks great.
However, when I pull it up in Notepad, all the output appears on a single line. To be precise, all the data is there, there are no lines of data missing, but there are no CR/LF so all of the data appears on a single line within the file when viewed with Notepad.exe.
As AdminOfThings correctly points out:
While $y = @() assigns an empty array to $y, it doesn't type-constrain that variable, so your very next assignment - $y = "Create Users process ..." - changes the variable type to a string.
Simply using += instead of = in that subsequent assignment would have prevented the problem: $y += "Create Users process ...".
Alternatively, type-constraining the variable creation - [array] $y = @() - i.e., placing a type literal to the left of the variable being assigned (akin to a cast) - would have prevented the problem too.
Subsequent use of += therefore performs simple string concatenation rather than the desired gradual building of an array, with no separators between the "lines" added.[1]
By contrast, had you used an array as intended, both Out-File and Set-Content would automatically insert platform-appropriate newlines[2] between the elements, plus one at the end, on saving (in PSv5+ you can use the -NoNewline switch to opt out).
That said, using += to "extend" an array is inefficient, because what PowerShell must do behind the scenes is create a new array containing the old elements plus the new one(s), given that arrays are fixed-size data structures.
While the performance penalty for use of += to "extend" arrays in a loop only really matters with high iteration counts, it is more concise, convenient and efficient to let PowerShell create arrays for you implicitly, by using your foreach loop as an expression:
# Initialize the array and assign the first element.
# Due to the type constraint ([array]), the RHS string implicitly becomes
# the array's 1st element.
[array] $y = "Create Users process. Run started at $('[{0:MM/dd/yyyy} {0:HH:mm:ss}]' -f (Get-Date))"
# Add the strings output by the foreach loop to the array.
# PowerShell implicitly collects foreach output in an array when
# you use it as an expression.
$y += foreach ($x in $users)
{
"User $($x.displayname) with NNN of $($x.ID)"
}
# Add the final string to the array.
$y += "Completed at $('[{0:MM/dd/yyyy} {0:HH:mm:ss}]' -f (Get-Date))"
# Send the array to a file with Out-File, which separates
# the elements with newlines and adds a trailing one.
# Windows PowerShell:
# Out-File creates UTF-16LE-encoded files.
# Set-Content, which can alternatively be used, creates "ANSI"-encoded files.
# PowerShell Core:
# Both cmdlets create UTF-8-encoded files without BOM.
$y | Out-File "Log.txt"
Note that you can similarly use for, if, do / while / switch statements as expressions.
In all cases, however, as of PowerShell 7.0, these statements can only serve as expressions by themselves; regrettably, using them as the first segment of a pipeline or embedding them in larger expressions does not work - see this GitHub issue.
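Since the answer notes that arrays are fixed-size and that += therefore copies the whole array each time, another option (not the one used above) is a resizable generic list. This is only a sketch: the two sample users are made up, and the [type]::new() syntax assumes PowerShell 5 or later.

```powershell
# Made-up sample data standing in for $users.
$users = @(
    [pscustomobject]@{ ID = 'Axyz'; DisplayName = 'Axyz, Bill' }
    [pscustomobject]@{ ID = 'Bxyz'; DisplayName = 'Bxyz, Ann' }
)

# A List[string] grows in place, so Add() does not copy the existing elements.
$log = [System.Collections.Generic.List[string]]::new()
$log.Add("Create Users process. Run started at $('[{0:MM/dd/yyyy} {0:HH:mm:ss}]' -f (Get-Date))")
foreach ($x in $users) {
    $log.Add("User $($x.DisplayName) with NNN of $($x.ID)")
}
$log.Add("Completed at $('[{0:MM/dd/yyyy} {0:HH:mm:ss}]' -f (Get-Date))")

# The list can be piped to Out-File or Set-Content just like an array.
$log.Count
```

Each element is written as its own line when the list is piped to Out-File or Set-Content, the same as with an array.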
[1] A simple demonstration of your problem:
# The initialization of $y as @() is overridden by $y = 'first'.
PS> $y = @(); $y = 'first'; $y += 'second'; $y
firstsecond # !! $y contains a single string built with string concatenation
The description of your symptoms is therefore not consistent with your code, as you should have seen a single-line output string in all scenarios (printing directly to the screen / via Format-Table, sending to a file and type-ing that from cmd.exe).
[2] The platform-appropriate newline is reflected in [Environment]::NewLine, and it is "`r`n" (CRLF) on Windows, and just "`n" (LF) on Unix-like platforms (in PowerShell Core).
As using += recreates the array on every iteration, I'd suggest assigning the output of ForEach-Object, with its -Begin, -Process and -End sections, to a variable, and using the more common approach of the format operator:
$Log = $users | ForEach-Object -Begin {
"Create Users process. Run started at [{0:MM/dd/yyyy} {0:HH:mm:ss}]" -f (Get-Date)
} -Process {
"User {0} with NNN of {1}" -f $_.DisplayName,$_.ID
} -End {
"Completed at [{0:MM/dd/yyyy} {0:HH:mm:ss}]" -f (Get-Date)
}
$Log | Set-Content "Log.txt"

Convert txt to array in powershell

I have a powershell script and a txt database with different number of elements per line.
My txt file is list.txt:
"10345","doomsday","life","hope","run","stone"
"10346","ride","latest","metal"
My powershell script search.ps1:
#Get file path
$path = Split-Path $script:MyInvocation.MyCommand.Path
$search = @()
Get-Content -LiteralPath "$path\list.txt" | ForEach-Object {
$search += $_
}
So, how can I convert each line into an element of an array? Like this:
$search = @(("10345","doomsday","life","hope","run","stone"),("10346","ride","latest","metal"))
To operate as:
echo $search[0][0]
Here's a concise PSv4+ solution:
$search = (Get-Content -LiteralPath $path\list.txt).ForEach({ , ($_ -split ',') })
The .ForEach() method operates on each line read from the input file by Get-Content.
$_ -split ',' splits each line into an array of strings by separator ,
, (...) wraps this array in an aux. single-item array to ensure that the array is effectively output as a whole, resulting in an array of arrays as the overall output.
Note: Strictly speaking, the .ForEach() method outputs a [System.Collections.ObjectModel.Collection[psobject]] collection rather than a regular PowerShell array ([object[]]), but for all practical purposes the two types act the same.
Note: The .ForEach() method was chosen as a faster alternative to a pipeline with the ForEach-Object (%) cmdlet.
Note that the .ForEach() method requires storing the input collection in memory as a whole first.
A faster and more memory-efficient, though perhaps slightly obscure alternative is to use a switch statement with the -file option:
$search = switch -file $path\list.txt { default { , ($_ -split ',') } }
switch -file processes each line of the specified file.
Since each line should be processed, only a default branch is used, in which the desired splitting is performed.
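One detail of splitting on commas is that the double quotes stay part of each item. The sketch below, which uses in-memory sample lines instead of list.txt so it can be run as-is, shows the same array-of-arrays construction with an extra .Trim('"') step; that trimming step is an addition for illustration and is not part of the answer above.

```powershell
# In-memory stand-in for the lines of list.txt.
$lines = '"10345","doomsday","life","hope","run","stone"',
         '"10346","ride","latest","metal"'

# @( ... ) collects the loop output; the unary comma keeps each row as one element.
$search = @(foreach ($line in $lines) {
    # Split on commas, then strip the surrounding double quotes from every item.
    $fields = ($line -split ',').Trim('"')
    , $fields
})

$search[0][1]
$search[1].Count
```

With this, $search[0][1] is doomsday without quotes, and the second row keeps its own length of 4.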
Use -split. A code snippet you can debug in ISE or VSCode below.
$x1 = @'
"10345","doomsday","life","hope","run","stone"
"10346","ride","latest","metal"
'@
$data = $x1 -split "`r`n"
$data.Count
$data[0] -split ","
$arr = @()
foreach ($row in $data)
{
$arr += ,($row -split ",")
}
"arr"
$arr
"0,3"
$arr[0][3]
"1,3"
$arr[1][3]
So you can split each line in your file returned from Get-Content and add it to your new array which lets you reference how you wanted...
There are other ways you can use your data depending on your needs.
Assuming you do not want each item quoted, you might consider not using the -split operator and instead evaluating each line with the Invoke-Expression cmdlet or with a [ScriptBlock]. Either way each line of the file is executed as PowerShell code, so only do this with a file you trust:
$Search = Get-Content ".\list.txt" | ForEach-Object {,@(&([ScriptBlock]::Create($_)))}

Use txt file as list in PowerShell array/variable

I've got a script that searches for a string ("End program" in this case). It then goes through each file within the folder and outputs any files not containing the string.
It works perfectly when the phrase is hard-coded, but I want to make it more dynamic by creating a text file to hold the string. In the future, I want to be able to add to the list of strings in the text file. I can't find this online anywhere, so any help is appreciated.
Current code:
$Folder = "\\test path"
$Files = Get-ChildItem $Folder -Filter "*.log" |
? {$_.LastWriteTime -gt (Get-Date).AddDays(-31)}
# String to search for within the file
$SearchTerm = "*End program*"
foreach ($File in $Files) {
$Text = Get-Content "$Folder\$File" | select -Last 1
if ($Text | WHERE {$Text -inotlike $SearchTerm}) {
$Arr += $File
}
}
if ($Arr.Count -eq 0) {
break
}
This is a simplified version of the code displaying only the problematic area. I'd like to put "End program" and another string "End" in a text file.
The following is what the contents of the file look like:
*End program*,*Start*
If you want to check whether a file contains (or doesn't contain) a number of given terms you're better off using a regular expression. Read the terms from a file, escape them, and join them to an alternation:
$terms = Get-Content 'C:\path\to\terms.txt' |
ForEach-Object { [regex]::Escape($_) }
$pattern = $terms -join '|'
Each term in the file should be in a separate line with no leading or trailing wildcard characters. Like this:
End program
Start
With that you can check if the files in a folder don't contain any of the terms like this:
Get-ChildItem $folder | Where-Object {
-not $_.PSIsContainer -and
(Get-Content $_.FullName | Select-Object -Last 1) -notmatch $pattern
}
If you want to check the entire files instead of just their last line change
Get-Content $_.FullName | Select-Object -Last 1
to
Get-Content $_.FullName | Out-String
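The escape-and-join step can be tried without any files. In this sketch the terms and the "last lines" are made-up sample strings; the second term contains parentheses to show why [regex]::Escape matters.

```powershell
# Made-up terms, as they would be read from terms.txt (one per line).
$terms = 'End program', 'Start (v2)'

# Escape regex metacharacters in each term, then join into an alternation.
$pattern = ($terms | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Made-up last lines of three log files.
$lastLines = 'job finished: End program', 'crashed at step 3', 'Start (v2) requested'

# Keep only the lines that contain none of the terms (-notmatch is case-insensitive).
$missing = $lastLines | Where-Object { $_ -notmatch $pattern }
$missing
```

Only the line that contains neither term is left in $missing.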

How to compare substrings within a folder and an array with PowerShell?

I have a folder with 100,000 files (pictures) which are named by their UPC code (8 to 14 numerical digits) followed by an underscore and other digits:
000012345678_00_1
And I have a list of 20,000 unique UPC codes in a word document (separated by commas) which should match a fifth of these pictures (I also have this list in an Excel table).
000000000000, 000000000001, 000000000011
What I'm trying to do, is to find matches between my array (the 20,000 elements list) and files in my folder so as to extract only those 20,000 pictures from the folder.
I've started by cutting the file name at the first "_" so as to get only the relevant part of the file name:
$FName = ($File -split '_')[0]
To make things harder, I also need to add a wild card " * " to the elements in the array since some extra "0" at the beginning of the files name might have been added and are not present in our array. For example, this UPC in the array "05713901" refers to this file name "00005713901_00.png "; so to find matches I will have to use the "like" operator.
Then when I've found those matches, I'll just have to use Move-Item to a new folder or subfolder.
This is what I've started to code without any result:
$Directory = "C:path_to_my_folder";
$AllFiles = Get-ChildItem $Directory
$FileNames = New-Object System.Collections.ArrayList;
foreach($File in $AllFiles)
{
$FName = ($File -split '_')[0]
$FileNames.Add($FName)
}
$Upc = Get-Content C:\path_to_my_word.docx
Compare-Object $FileNames $Upc
You can't read a docx file using Get-Content, and even if you could, Compare-Object wouldn't work, because your Word file is a list of UPC codes separated by commas (a single string in PowerShell), while $FileNames is an array (multiple objects).
Copy the UPC-codes from excel to notepad so you get a simple textfile with one code per line similar to this sample.
UPC.txt - Content:
000000000000
000000000001
000000000011
....
It would take a long time to run each of the 100,000 files through a loop of 20,000 -like tests. I would create a regex pattern that looks for any of the codes followed by an underscore. Example:
$Directory = "C:\path_to_my_folder";
$AllFiles = Get-ChildItem $Directory
#Generate regex that matches 00001_ or 00002_ etc. Trimming leading and trailing whitespace just to be safe.
$regex = ((Get-Content -Path "c:\UPC.txt") | ForEach-Object { "$($_.Trim())_" }) -join '|'
#Get files that match
$AllFiles | Where-Object { $_.Name -match $regex } | ForEach-Object {
#Do something, ex. Move file.
Move-Item -Path $_.FullName -Dest C:\Destination
}
Or simply
$AllFiles | Where-Object { $_.Name -match $regex } | Move-Item -Destination "C:\Destination"
Save your UPC codes as a plain text file. As Frode F. suggested, copying them from Excel to Notepad is probably the easiest way to do it. Save that list. Then we will load that list into PowerShell, and for each file we will split at the underscore like you did, trim any leading zeros, and check if the result is in the list of known codes. Move any files that are in the list of known UPCs with Move-Item:
#Import Known UPC List
$UPCList = Get-Content C:\Path\To\UPCList.txt
#Remove Leading Zeros From List
$UPCList = $UPCList | ForEach{$_.TrimStart('0')}
$Directory = "C:\path_to_my_folder"
Get-ChildItem $Directory | Where{$_.Name.Split('_')[0].TrimStart('0') -in $UPCList} | Move-Item -Dest C:\Destination
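The trim-and-look-up logic above can be checked on a few names before moving any files. This sketch swaps the -in test against an array for a HashSet lookup, which is a substitution made here for illustration, not part of the answer; the codes and file names are made-up samples, and [type]::new() assumes PowerShell 5 or later.

```powershell
# Made-up sample codes, as they would be read from UPCList.txt.
$upcList = '05713901', '000000000011'

# Store the codes without leading zeros in a set for fast membership tests.
$known = [System.Collections.Generic.HashSet[string]]::new()
foreach ($code in $upcList) {
    [void] $known.Add($code.TrimStart('0'))
}

# Made-up file names in the "<UPC>_<other digits>" form from the question.
$names = '00005713901_00.png', '000012345678_00_1.png', '000000000011_01.png'

# Keep names whose UPC part, stripped of leading zeros, is a known code.
$matched = $names | Where-Object { $known.Contains($_.Split('_')[0].TrimStart('0')) }
$matched
```

One caveat with trimming: a code that consists only of zeros becomes an empty string, so such codes would need separate handling.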

Resources