batch script to read multiple files and count commas on each line - batch-file

I am new to batch scripting and I am trying to achieve the following:
loop through multiple files, count the number of commas on each line, then remove the extra commas if the count is greater than 10. I can get the count, but I am stuck there.
All fields are required. There are no carriage returns in the data. The extra commas only occur in the field after the 9th comma.
Example of data in csv file:
Row 1, (good data)
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
Row 2, (bad data)
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
In row 2, "Required text" has an extra comma, which should be removed so that the row looks like the one above.
The logic I would like to have is:
If the number of commas in the row is 10, go to the next line.
If the number of commas is greater than 10, remove the extra one(s) after the 9th comma, since extra commas only occur in that field.
Please note that I cannot put double quotes around the field.
@echo on
setlocal enableextensions enabledelayedexpansion
pause
set "inputFile=test.csv"
set "searchChar=,"
set count16=16
pause
for /f "delims=" %%a in ('
findstr /n "^" "%inputFile%"
') do for /f "delims=:" %%b in ("%%~a") do (
set "line=%%a"
pause
for /f %%c in ('
cmd /u /v /e /q /c"(echo(!line:*:=!)"^|find /c "%searchChar%"
') do set count=%%c echo %%c echo here echo %count% echo %count16% echo %%c line %%b has %%c characters
if %count16% equ %count% (echo ***hit)
)
pause
)
pause

Your question is very confusing. You have not clearly explained the details. More importantly, you have not posted an example of the input data and the desired output, which would make up for the lack of details. So we can only guess what you want...
I think your problem could be better explained if you paid attention to the columns that both the input and output data have. Are you interested in the commas, or in the columns?
This is my attempt at a solution. I used the example input file posted by Compo.
@echo off
setlocal EnableDelayedExpansion
rem Process all files with .csv extension in current folder
for %%F in (*.csv) do (
ECHO/
ECHO Input: "%%F"
TYPE "%%F"
rem Each file has comma-separated columns: there may be 12 columns or more
rem Keep columns 1-9 the same. After that, generate 3 more columns:
rem the last and one-before-last columns are kept the same
rem the two-before-last column contains the rest of the columns separated by spaces
(for /F "usebackq tokens=1-9* delims=," %%a in ("%%F") do (
set "restAfter9=%%j"
set "last="
set "lastBut1="
set "lastBut2="
for %%A in ("!restAfter9:,=" "!") do (
set "lastBut2=!lastBut2! !lastBut1!"
set "lastBut1=!last!"
set "last=%%~A"
)
echo %%a,%%b,%%c,%%d,%%e,%%f,%%g,%%h,%%i,!lastBut2:~3!,!lastBut1!,!last!
)) > "%%~NF.out"
ECHO Output: "%%~NF.out"
TYPE "%%~NF.out"
)
Output example:
Input: "test1.csv"
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re,qu,ir,ed,te,xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,,Re,qu,,,te,xt, pencil ,pen
Output: "test1.out"
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re qu ir ed te xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,Re,qu te xt, pencil ,pen
Input: "test2.csv"
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books,08/22/2022,12/10/2022,$60 basic supplies,37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA2,11800118,Required Supplies
Output: "test2.out"
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books 08/22/2022 12/10/2022 $60 basic supplies 37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA2,11800118,Required Supplies
EDIT: New simpler solution added
@echo off
setlocal EnableDelayedExpansion
rem General method to keep the first N columns the same
rem and group additional fields in column N+1
rem Define the number of "same" and "total" columns:
set /A "same=12, last=17"
rem Process all files with .csv extension in current folder
for %%F in (*.csv) do (
ECHO/
ECHO Input: "%%F"
TYPE "%%F"
rem Process all lines of current file
(for /F "usebackq delims=" %%a in ("%%F") do (
set "line=%%a"
set "head="
set "tail="
set "i=0"
rem Split current line in comma-separated fields
for %%b in ("!line:,=" "!") do (
set /A i+=1
if !i! leq %same% ( rem Accumulate field in "head" columns
set "head=!head!%%~b,"
) else if !i! leq %last% ( rem Accumulate field in "tail" columns
set "tail=!tail!%%~b,"
) else ( rem Combine one field from beginning of "tail" and accumulate last field
for /F "tokens=1* delims=," %%x in ("!tail!") do set "tail=%%x %%y%%~b,"
)
)
echo !head!!tail:~0,-1!
)) > "%%~NF.out"
ECHO Output: "%%~NF.out"
TYPE "%%~NF.out"
)
Output example:
Input: "test1.csv"
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 13a, field 13b, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 13a, field 13b, field 13c, field 14, field 15, field 16, field 17
Output: "test1.out"
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13 field 13a field 13b, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13 field 13a field 13b field 13c, field 14, field 15, field 16, field 17

@ECHO OFF
SETLOCAL
rem The following settings for the directories and filenames are names
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
SET "filename1=%sourcedir%\q73491865.txt"
SET "destdir=u:\your results"
SET "outfile=%destdir%\outfile.txt"
(
FOR /f "usebackqtokens=1-12*delims=," %%g IN ("%filename1%") DO (
IF "%%s"=="" (
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p,%%q,%%r
) ELSE (
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p %%q,%%r,%%s
)
)
)>"%outfile%"
GOTO :EOF
Always verify against a test directory before applying to real data.
I've assumed that no empty fields exist.
There are 12 fields normally, 13 if the extra comma exists. Your column-count and comma-count are incorrect.
Assign each field to %%g..%%s. If %%s is empty, then there are 12 fields, so regurgitate %%g..%%r.
If %%s is not empty, remove the comma between %%p and %%q and append %%s.
=== Revision ==== following extra specification
@ECHO Off
SETLOCAL
rem The following settings for the directories and filenames are names
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
SET "filename1=%sourcedir%\q73491865.txt"
SET "destdir=u:\your results"
SET "outfile=%destdir%\outfile.txt"
:again
SET "modified="
(
FOR /f "usebackqtokens=1-12*delims=," %%g IN ("%filename1%") DO (
IF "%%s"=="" (
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p,%%q,%%r
) ELSE (
SET "modified=y"
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p %%q,%%r,%%s
)
)
)>"%outfile%"
IF DEFINED modified (
SET "filename1=%outfile%"
IF "%outfile:~-1%"=="1" (SET "outfile=%outfile:~0,-1%") ELSE SET "outfile=%outfile%1"
GOTO again
)
IF "%outfile:~-1%"=="1" SET "outfile=%outfile:~0,-1%"&GOTO again
DEL "%outfile%1" 2>NUL
TYPE "%outfile%"
GOTO :EOF
The revision is somewhat inelegant, repeatedly reprocessing the result file until no more substitutions are made.

Here's something which may 'remove' the commas you require, (it will not however 'look like the row above' as you stated, because that would require replacing commas with space characters):
I will not support any changes to your requirements or any script modification beyond changing S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv on line 4
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "SourceCSV=S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv"
If Not Exist "%SourceCSV%" Exit /B
For /F UseBackQ^ Delims^=^ EOL^= %%G In ("%SourceCSV%") Do Call :Sub "%%G"
Pause
Exit /B
:Sub
Set "r=%~1"
For /F "Delims==" %%G In ('"(Set f[) 2>NUL"') Do Set "%%G="
Set "i=1"
SetLocal EnableDelayedExpansion
Set "f[!i!]=%r:,=" & Set /A i += 1 & Set "f[!i!]=%"
If !i! Lss 13 (Echo(%~1& GoTo :EOF) Else Set /A c=i-2
For /L %%G In (1,1,!i!) Do If %%G Equ 1 (Set /P "=!f[%%G]!" 0<NUL
) Else If %%G GTR !c! (Set /P "=,!f[%%G]!" 0<NUL) Else If %%G Lss 11 (
Set /P "=,!f[%%G]!" 0<NUL) Else Set /P "=!f[%%G]!" 0<NUL
Echo(
GoTo :EOF
Note: This code should allow for empty fields, and having multiple commas next to each other in the problem field, (field 10).
As a courtesy, the following example does the same as the code at the top of this answer, except that instead of removing the extra commas, as your question specifically asks, it replaces them with spaces.
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "SourceCSV=S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv"
If Not Exist "%SourceCSV%" Exit /B
For /F UseBackQ^ Delims^=^ EOL^= %%G In ("%SourceCSV%") Do Call :Sub "%%G"
Pause
Exit /B
:Sub
Set "r=%~1"
For /F "Delims==" %%G In ('"(Set f[) 2>NUL"') Do Set "%%G="
Set "i=1"
SetLocal EnableDelayedExpansion
Set "f[!i!]=%r:,=" & Set /A i += 1 & Set "f[!i!]=%"
If !i! Lss 13 (Echo(%~1& GoTo :EOF) Else Set /A c=i-2
For /L %%G In (1,1,!i!) Do If %%G Equ 1 (Set /P "=!f[%%G]!" 0<NUL
) Else If %%G GTR !c! (Set /P "=,!f[%%G]!" 0<NUL) Else If %%G Lss 11 (
Set /P "=,!f[%%G]!" 0<NUL) Else (Set "Prompt=$S!f[%%G]:$=$$!"
%SystemRoot%\System32\cmd.exe /D /K 0<NUL)
Echo(
GoTo :EOF
S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re,qu,ir,ed,te,xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,,Re,qu,,,te,xt, pencil ,pen
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books,08/22/2022,12/10/2022,$60 basic supplies,37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA2,11800118,Required Supplies
intended output second example:
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re qu ir ed te xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,,Required text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,,Re qu te xt, pencil ,pen
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books 08/22/2022 12/10/2022 $60 basic supplies 37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA2,11800118,Required Supplies

Related

How to substitute a string in batch script?

My arqtext.txt has the following dataset:
A,B,C,
(123 or 456) and (789 or 012),1,5,
(456 or 654) and (423 or 947),3,6,
(283 or 335) and (288 or 552),2,56,
I want to change the 1st column of the last 3 rows to a new string set in the script, with the result like:
A,B,C,
roi1,1,5,
roi2,3,6,
roi3,2,56,
But my code only outputs the header "A,B,C,":
@echo off
setlocal EnableDelayedExpansion EnableExtensions
set roi1="(123 or 456) and (789 or 012)"
set roi2="(456 or 654) and (423 or 947)"
set roi3="(283 or 335) and (288 or 552)"
set /p "header="<"arqtext.txt"
echo %header%>arqtextnovo.txt
for /f "skip=1 tokens=1,* delims=," %%a in ("arqtext.txt") do (
if %%a=="roi1" (
echo roi1,%%b>>arqtextnovo.txt
)
if %%a=="roi2" (
echo roi2,%%b>>arqtextnovo.txt
)
if %%a=="roi3" (
echo roi3,%%b>>arqtextnovo.txt
)
)
rem EXIT /B
pause>nul
This is the way I would do it:
@echo off
setlocal EnableDelayedExpansion
set "roi[(123 or 456) and (789 or 012)]=1"
set "roi[(456 or 654) and (423 or 947)]=2"
set "roi[(283 or 335) and (288 or 552)]=3"
set /P "header=" < "arqtext.txt"
> arqtextnovo.txt (
echo %header%
for /f "usebackq skip=1 tokens=1,* delims=," %%a in ("arqtext.txt") do (
echo roi!roi[%%a]!,%%b
)
)
rem EXIT /B
pause>nul
The for /F command requires the "usebackq" option if the file name is enclosed in quotes. Otherwise, it processes the literal string enclosed in quotes.
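A minimal sketch of that difference. The file name demo.txt and its content are assumptions; the script creates and deletes the file itself so that it is self-contained:

```batch
@echo off
setlocal
rem Hypothetical demo file, created here so the example is self-contained
> "demo.txt" echo first line of file

rem Without usebackq, the quoted text is parsed as a literal string
for /f "delims=" %%a in ("demo.txt") do echo Without usebackq: %%a

rem With usebackq, the quoted text is taken as a file name and its lines are read
for /f "usebackq delims=" %%a in ("demo.txt") do echo With usebackq: %%a

del "demo.txt"
```

The first loop echoes the string demo.txt itself, while the second one echoes the line stored in the file.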
It is more efficient to redirect the whole output to a file with > than to append every new line with >>. This also avoids the problem of redirecting lines that end in a number.
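The trailing-number problem shows up when a %variable% is expanded right before the redirection operator. A small sketch, assuming a made-up variable named value and three throwaway file names:

```batch
@echo off
setlocal
set "value=item 2"

rem Risky: after %value% expands, the line reads  echo item 2>"risky.txt"
rem The digit is taken as a handle number (2 = stderr), so "item" is printed
rem on the console and risky.txt ends up empty
echo %value%>"risky.txt"

rem Safer: put the redirection first...
> "safe.txt" echo %value%

rem ...or redirect a whole block at once, which also opens the file only once
> "block.txt" (
    echo %value%
    echo another line
)

type "safe.txt"
del "risky.txt" "safe.txt" "block.txt"
```

Text that comes from a for variable such as %%b is not affected in the same way, because for variables are expanded after the redirection has already been parsed.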
If you have several values and want to select a result value based on the first one, it is much simpler and more efficient to use an array than to test each individual value.
Let's suppose this method:
set /P "selector=Enter selector: "
if "%selector%" equ "nine" set result=9
if "%selector%" equ "seven" set result=7
if "%selector%" equ "five" set result=5
Instead, you may define an array called "value":
set "value[nine]=9"
set "value[seven]=7"
set "value[five]=5"
... and then directly get the result value this way:
set "result=!value[%selector%]!"
The same method is used in this code. However, you have not specified what should happen if the input value is not one of the array elements.
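One possible way to handle a missing element is to test it with if defined before reading it. This is only a sketch: the fallback text "unknown" and the hard-coded selector are assumptions, and the test as written only works for keys without spaces.

```batch
@echo off
setlocal EnableDelayedExpansion
set "value[nine]=9"
set "value[seven]=7"
set "value[five]=5"

rem The selector is hard-coded here; in the text above it comes from set /P
set "selector=eight"

rem Read the element only if it exists, otherwise use a fallback value
if defined value[%selector%] (
    set "result=!value[%selector%]!"
) else (
    set "result=unknown"
)
echo Result for "%selector%": !result!
```

With selector set to eight, the else branch runs; with nine, seven or five, the stored number is used.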
For a further description on array management in Batch files, see this answer

Batch command to find highest number in set of strings?

For instance:
file.txt contains:
4.3 - random1
5.6 - random2
2.2 - random3
3 - random4
1.8 - random5
I need a command that will output the highest number only, not the preceding text.
Ie.
Output = 5.6
You can give SORTN.bat a try.
Here is the code for it as well.
@ECHO OFF
if "%~1"=="/?" (
echo.Sorts text by handling first number in line as number not text
echo.
echo.%~n0 [n]
echo.
echo. n Specifies the character number, n, to
echo. begin each comparison. 3 indicates that
echo. each comparison should begin at the 3rd
echo. character in each line. Lines with fewer
echo. than n characters collate before other lines.
echo. By default comparisons start at the first
echo. character in each line.
echo.
echo.Description:
echo. 'abc10def3' is bigger than 'abc9def4' because
echo. first number in first string is 10
echo. first number in second string is 9
echo. whereas normal text compare returns
echo. 'abc10def3' smaller than 'abc9def4'
echo.
echo.Example:
echo. To sort a directory pipe the output of the dir
echo. command into %~n0 like this:
echo. dir /b^|%~n0
echo.
echo.Source: http://www.dostips.com
goto:EOF
)
if "%~1" NEQ "~" (
for /f "tokens=1,* delims=," %%a in ('"%~f0 ~ %*|sort"') do echo.%%b
goto:EOF
)
SETLOCAL ENABLEDELAYEDEXPANSION
set /a n=%~2+0
for /f "tokens=1,* delims=]" %%A in ('"find /n /v """') do (
set f=,%%B
(
set f0=!f:~0,%n%!
set f0=!f0:~1!
rem call call set f=,%%%%f:*%%f0%%=%%%%
set f=,!f:~%n%!
)
for /f "delims=1234567890" %%b in ("!f!") do (
set f1=%%b
set f1=!f1:~1!
call set f=0%%f:*%%b=%%
)
for /f "delims=abcdefghijklmnopqrstuwwxyzABCDEFGHIJKLMNOPQRSTUWWXYZ~`@#$*_-+=:;',.?/\ " %%b in ("!f!") do (
set f2=00000000000000000000%%b
set f2=!f2:~-20!
call set f=%%f:*%%b=%%
)
echo.!f1!!f2!!f!,%%B
rem echo.-!f0!*!f1!*!f2!*!f!*%%a>&2
)
I gave it a try using this input as an example
4.3 - random1
11.3 - random6
5.6 - random2
2.2 - random3
100.1 - random8
3 - random4
1.8 - random5
11.12 - random7
11.11 - random7
This is how I ran it but you should be able to capture the output as well using a FOR /F command just like Stephan showed you in his answer.
type sortme.txt |sortn.bat
Output
1.8 - random5
2.2 - random3
3 - random4
4.3 - random1
5.6 - random2
11.11 - random7
11.12 - random7
11.3 - random6
100.1 - random8
sort will put the lines in the correct order (Attention: this sorts strings, not numbers, but it works fine for your example. Note that with string comparison, 5 or 6.3 are "bigger" than 15).
Put a for /f around it to process the output (the default token is 1 and space is a default delimiter, so for /f gets only the first element, your desired number):
for /f %%a in ('sort t.txt') do set high=%%a
echo %high%
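The caveat about string comparison can be seen with a small sketch. The file nums.txt and its three lines are made up for the demonstration:

```batch
@echo off
setlocal
rem Hypothetical sample data, written to a temporary file for the demo
> "nums.txt" (
    echo 15 - fifteen
    echo 5 - five
    echo 6.3 - six point three
)

rem sort compares characters from left to right, so "15" is listed before "5"
sort "nums.txt"

rem The last line returned by sort therefore starts with 6.3, not 15
for /f %%a in ('sort "nums.txt"') do set "high=%%a"
echo Highest according to sort: %high%

del "nums.txt"
```

This is why the plain sort approach is only safe while all numbers have the same count of digits before the decimal point.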
EDIT to also process numbers higher than 10. Note: there is no math involved - it's just clever string manipulation.
@echo off
setlocal enabledelayedexpansion
(
for /f "tokens=1,2 delims=. " %%a in (t.txt) do (
set a=0000%%a
if "%%b"=="-" (echo !a:~-4!) else (echo !a:~-4!.%%b)
)
)>temp.txt
type temp.txt
pause
for /f "tokens=1,2 delims=0" %%a in ('sort temp.txt') do set high=%%a
echo %high%
The following script does proper sorting of fractional numbers with eight digits before and after the decimal separator . at most:
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem // Define constants here:
set "_FILE=%~1" & rem // (input file; `%~1` means first command line argument)
set /A "_ZPAD=8" & rem // (number of zeroes to be used for temporary padding)
set "_ZERO=" & for /L %%K in (0,1,%_ZPAD%) do set "_ZERO=!_ZERO!0"
for /F "usebackq delims=- " %%L in ("%_FILE%") do (
for /F "tokens=1,2 delims=." %%I in ("%%L.") do (
set "INT=%_ZERO%%%I" & set "FRA=%%J%_ZERO%"
set "$NUM_!INT:~-%_ZPAD%!_!FRA:~,%_ZPAD%!=%%L"
)
)
set "MIN=" & set "MAX="
for /F "tokens=2 delims==" %%L in ('2^> nul set $NUM') do (
if not defined MIN set "MIN=%%L"
set "MAX=%%L"
)
echo Minimum: %MIN%
echo Maximum: %MAX%
endlocal
exit /B

Batch command to find and replace string with line break

I have data that is mostly organized in a way that I can convert and import into a spreadsheet. But certain lines have carriage returns and text that my current batch file won't use.
Good Data:
Pipers Cove × 2 $25.00
Pipers Cove Petite × 2 $25.00
Pipers Cove Plus × 2 $25.00
Nordic Club × 2 $25.00
Whiteout × 1 $12.50
Bad Data:
Pipers Cove Kids × 2
Size:
Large - ages 10 to 12
$20.00
Pipers Cove Kids × 2
Size:
Medium - ages 6 to 8
$20.00
Pipers Cove Kids × 2
Size:
Small - ages 2 to 4
$20.00
I need to remove the 2 lines starting with Size, Small, Medium, or Large and have the dollar amount follow the quantity number so my batch file can convert it to a CSV file and so on.
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q40953616.txt"
SET "outfile=%destdir%\outfile.txt"
SET "part1="
(
FOR /f "usebackqdelims=" %%i IN ("%filename1%") DO (
ECHO %%i|FIND "$" >NUL
IF ERRORLEVEL 1 (
REM $ not found - set part1 on first such line
IF NOT DEFINED part1 SET "part1=%%i"
) ELSE (
REM $ found - see whether at start or not
FOR /f "tokens=1*delims=$" %%a IN ("%%i") DO (
IF "%%b"=="" (
REM at start - combine and output and reset part1
CALL ECHO %%part1%% %%i
SET "part1="
) ELSE (
ECHO %%i
)
)
)
)
)>"%outfile%"
GOTO :EOF
You would need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q40953616.txt containing your data for my testing.
Produces the file defined as %outfile%
Scan each line of the file. If the line does not contain $ then save the first such line in part1.
Otherwise, tokenise the line. If there is only 1 token, then the $ is at the start of the line, so it needs to be output combined with part1.
Otherwise, just regurgitate the line.
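The "only 1 token" test works because for /f skips delimiters at the start of a line. A minimal sketch of just that step, using two sample strings taken from the question (the × is written as x here, an assumption made to avoid code page issues):

```batch
@echo off
setlocal
rem Two sample lines: one that starts with $ and one with text before the $
for %%L in ("$20.00" "Whiteout x 1 $12.50") do (
    rem %%a receives the text up to the first $, %%b receives the remainder
    for /f "tokens=1* delims=$" %%a in (%%L) do (
        if "%%b"=="" (
            echo %%~L starts with $
        ) else (
            echo %%~L has text before $
        )
    )
)
```

For "$20.00" the leading $ is skipped, so 20.00 becomes the first token and nothing is left for %%b.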
Although you did not show any effort of your own, I decided to provide a solution, as the task at hand does not appear that trivial to me.
The following script (let us call it clean-up-text-file.bat) ignores only lines that begin with the words you specified. Any other lines are appended to the previous one until a $ sign is encountered, in which case a new line is started. With this method, no lines can get lost unintentionally.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set _WORDS="Size","Small","Medium","Large"
for %%F in (%*) do (
set "COLL=" & set "FILE=%%~F"
for /F delims^=^ eol^= %%L in ('type "%%~F" ^& ^> "%%~F" rem/') do (
set "LINE=%%L"
(echo("%%L" | > nul find "$") && (
setlocal EnableDelayedExpansion
>> "!FILE!" echo(!COLL!!LINE!
endlocal
set "COLL="
) || (
set "FLAG="
for %%K in (%_WORDS%) do (
(echo("%%L" | > nul findstr /I /R /B /C:^^^"\"%%~K\>") && (
set "FLAG=#"
)
)
if not defined FLAG (
setlocal EnableDelayedExpansion
rem // The following line contains a TAB character!
for /F "delims=" %%E in (^""!COLL!!LINE! "^") do (
endlocal
set "COLL=%%~E"
)
)
)
)
)
endlocal
exit /B
To use the script, provide your text file(s) as (a) command line argument(s):
clean-up-text-file.bat "good.txt" "bad.txt"
Every specified file is modified directly, so take care when testing!

Batch: How to remove all empty columns from a csv file

I have a CSV file like this:
P,PC,,PL,B,15feb16,P,Bay,RP,15-FEB-16,22-FEB-16,7,,,,,,11,14,138,14,16,993.42,-12,-84,-12,,,,,,,,,17,2,-10,0,0,1,1,16:05:53,15FEB16
P,PC,,PL,I,1FEB-16,P,In,RP,15-FEB-16,22-FEB-16,7,,,,,,25,5,32,5,5,-29.7,-24,-168,-24,,,,,,,,,520,14,-10,0,0,1,1,10-MAY-201606:05:53,15-FEB-16
P,PC,,PC,S,15FEB16,P,Su,RP,15-FEB-16,22-FEB-16,7,,,,,,6,5,32,56,5,4.65,0,0,0,,,,,,,,,546,0,0,0,0,1,1,10-MAY-201606:05:53,15-FEB-16
The code I have written is:
@echo off
setlocal EnableDelayedExpansion
for /F "delims=" %%a in (C:\Pca.csv) do (
set line=%%a
set line=!line:,,=, ,!
set line=!line:,,=, ,!
for /F "tokens=1,2,3* delims=," %%i in (^"!line!^") do (
echo %%i,%%l>>C:\P.csv
)
)
But it only deletes 2nd and 3rd column, no matter whether it is empty or contains data.
The sample output file should be like:
P,PC,PL,B,15feb16,P,Bay,RP,15-FEB-16,22-FEB-16,7,11,14,138,14,16,993.42,-12,-84,-12,17,2,-10,0,0,1,1,16:05:53,15FEB16
P,PC,PL,I,1FEB-16,P,In,RP,15-FEB-16,22-FEB-16,7,25,5,32,5,5,-29.7,-24,-168,-24,520,14,-10,0,0,1,1,10-MAY-201606:05:53,15-FEB-16
P,PC,PC,S,15FEB16,P,Su,RP,15-FEB-16,22-FEB-16,7,6,5,32,56,5,4.65,0,0,0,546,0,0,0,0,1,1,10-MAY-201606:05:53,15-FEB-16
Here is a quite comprehensive and adaptive script that removes empty columns from CSV-formatted data.
Before the code is shown, let us take a look at the help message that appears when called with /?:
"del-empty-cols-from-csv.bat"
This script removes any empty columns from CSV-formatted data. A column is con-
sidered as empty if the related fields in all rows are empty, unless the switch
/H is given, in which case the first line (so the header) is evaluated only.
Notice that fields containing white-spaces only are not considered as empty.
USAGE:
del-empty-cols-from-csv.bat [/?] [/H] csv_in [csv_out]
/? displays this help message;
/H specifies to regard the header only, that is the very first row,
to determine which columns are considered as empty; if NOT given,
the whole data, hence all rows, are taken into account instead;
csv_in CSV data file to process, that is, to remove empty columns of;
these data must be correctly formatted CSV data, using the comma as
separator and the quotation mark as text delimiter; regard that
literal quotation marks must be doubled; there are some additional
restrictions: the data must not contain any line-breaks; neither
must they contain any asterisks nor question marks;
csv_out CSV data file to write the return data to; this must not be equal
to csv_in; note that an already existing file will be overwritten
without prompt; if not given, the data is displayed on the console;
As you can read, there are two operation modes: standard (no switch) and header mode (switch /H).
Given that the following CSV data is fed into the script...:
A, ,C, ,E,F
1, , ,4,5,
1, , , ,5,
1, ,3,4, ,
...the returned CSV data in standard mode will look like...:
A,C, ,E,F
1, ,4,5,
1, , ,5,
1,3,4, ,
...and the returned CSV data in header mode (/H) will look like:
A,C,E,F
1, ,5,
1, ,5,
1,3, ,
Note that the spaces in the above sample data are not actually present in the files; they have been inserted here only to better illustrate the two operation modes.
Now, this is the complete code:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "OPT_HEAD=%~1"
if "%OPT_HEAD%"=="/?" (
goto :MSG_HELP
) else if /I "%OPT_HEAD%"=="/H" (
shift
) else if "%OPT_HEAD:~,1%"=="/" (
set "OPT_HEAD="
shift
) else set "OPT_HEAD="
set "CSV_IN=%~1"
if not defined CSV_IN (
>&2 echo ERROR: no input file specified!
exit /B 1
)
set "CSV_OUT=%~2"
if not defined CSV_OUT set "CSV_OUT=con"
for /F "delims==" %%V in ('2^> nul set CELL[') do set "%%V="
setlocal EnableDelayedExpansion
if not defined OPT_HEAD (
for /F %%C in ('^< "!CSV_IN!" find /C /V ""') do set "NUM=%%C"
) else set /A NUM=1
set /A LIMIT=0
< "!CSV_IN!" (
for /L %%L in (1,1,%NUM%) do (
set /P "LINE="
call :PROCESS LINE LINE || exit /B !ErrorLevel!
set /A COUNT=0
for %%C in (!LINE!) do (
set /A COUNT+=1
if not defined CELL[!COUNT!] set "CELL[!COUNT!]=%%~C"
if !LIMIT! LSS !COUNT! set /A LIMIT=COUNT
)
)
)
set "PAD=" & for /L %%I in (2,1,!LIMIT!) do set "PAD=!PAD!,"
> "!CSV_OUT!" (
for /F usebackq^ delims^=^ eol^= %%L in ("!CSV_IN!") do (
setlocal DisableDelayedExpansion
set "LINE=%%L%PAD%"
set "ROW="
set /A COUNT=0
setlocal EnableDelayedExpansion
call :PROCESS LINE LINE || exit /B !ErrorLevel!
for %%C in (!LINE!) do (
endlocal
set "CELL=%%C"
set /A COUNT+=1
setlocal EnableDelayedExpansion
if !COUNT! LEQ !LIMIT! (
if defined CELL[!COUNT!] (
for /F delims^=^ eol^= %%R in ("!ROW!,!CELL!") do (
endlocal
set "ROW=%%R"
)
) else (
endlocal
)
) else (
endlocal
)
setlocal EnableDelayedExpansion
)
if defined ROW set "ROW=!ROW:~1!"
call :RESTORE ROW ROW || exit /B !ErrorLevel!
echo(!ROW!
endlocal
endlocal
)
)
endlocal
endlocal
exit /B
:PROCESS var_return var_string
set "STRING=!%~2!"
if defined STRING (
set "STRING="!STRING:,=","!""
if not "!STRING!"=="!STRING:**=!" goto :ERR_CHAR
if not "!STRING!"=="!STRING:*?=!" goto :ERR_CHAR
)
set "%~1=!STRING!"
exit /B
:RESTORE var_return var_string
set "STRING=!%~2!"
if "!STRING:~,1!"==^""" set "STRING=!STRING:~1!"
if "!STRING:~-1!"==""^" set "STRING=!STRING:~,-1!"
if defined STRING (
set "STRING=!STRING:","=,!"
)
set "%~1=!STRING!"
exit /B
:ERR_CHAR
endlocal
>&2 echo ERROR: `*` and `?` are not allowed!
exit /B 1
:MSG_HELP
echo(
echo("%~nx0"
echo(
echo(This script removes any empty columns from CSV-formatted data. A column is con-
echo(sidered as empty if the related fields in all rows are empty, unless the switch
echo(/H is given, in which case the first line ^(so the header^) is evaluated only.
echo(Notice that fields containing white-spaces only are not considered as empty.
echo(
echo(
echo(USAGE:
echo(
echo( %~nx0 [/?] [/H] csv_in [csv_out]
echo(
echo( /? displays this help message;
echo( /H specifies to regard the header only, that is the very first row,
echo( to determine which columns are considered as empty; if NOT given,
echo( the whole data, hence all rows, are taken into account instead;
echo( csv_in CSV data file to process, that is, to remove empty columns of;
echo( these data must be correctly formatted CSV data, using the comma as
echo( separator and the quotation mark as text delimiter; regard that
echo( literal quotation marks must be doubled; there are some additional
echo( restrictions: the data must not contain any line-breaks; neither
echo( must they contain any asterisks nor question marks;
echo( csv_out CSV data file to write the return data to; this must not be equal
echo( to csv_in; note that an already existing file will be overwritten
echo( without prompt; if not given, the data is displayed on the console;
echo(
exit /B
Assuming your original CSV looks like this:
id_users,,,quantity,,date
1,,,1,,2013
1,,,1,,2013
2,,,1,,2013
then this single line should solve your request:
(for /f "tokens=1-3 delims=," %%a in (c:\pca.csv) do echo %%a,%%b,%%c)>c:\p.csv
resulting in:
id_users,quantity,date
1,1,2013
1,1,2013
2,1,2013
The trick is: consecutive delimiters are treated as one.
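A quick way to see this behaviour is to feed one of the sample rows to for /f as a literal string (a sketch only; the row is copied from the sample data above):

```batch
@echo off
setlocal
rem The empty fields between the commas are skipped, so only three tokens exist
for /f "tokens=1-3 delims=," %%a in ("1,,,1,,2013") do (
    echo Token 1: %%a
    echo Token 2: %%b
    echo Token 3: %%c
)
```

The three tokens are 1, 1 and 2013. The same behaviour also means that for /f cannot keep an empty field in its original position.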
Edit: another approach, as it turned out that there are many more columns than the original question showed.
@echo off
break>c:\p.csv
for /F "delims=" %%a in (c:\pca.csv) do call :shorten "%%a"
goto :eof
:shorten
set "line=%~1"
:remove
set "line=%line:,,=,%"
echo %line%|find ",,">nul && goto :remove
echo %line%>>c:\p.csv
break>c:\p.csv: create the output file (overwrite it if it exists);
replace two consecutive commas with one;
repeat while there are any more consecutive commas;
write the resulting line to the output file.

Numerical Batch Input with Hyphen representing the sequence between two numbers

I need to receive input in a batch file that could potentially contain a numerical hyphen, ie. 1-5 means 1, 2, 3, 4, and 5 as the input from the user.
I know how to take a single character input from the user, but to split the input into 5 (or more) separate entries kind of baffles me.
@echo off
set /P "input=Enter a number or range: "
for /F "tokens=1,2 delims=-" %%a in ("%input%") do (
set lower=%%a
set upper=%%b
)
if not defined upper set upper=%lower%
for /L %%i in (%lower%,1,%upper%) do (
echo Process number %%i
)
You can use for /f with delims to split a string. You can use for /L to loop over a range. Type help for on the command line to read about the type of loops.
@echo off
set /p "input=Enter a number or range: "
REM The user must enter either a plain number or a range. Either way, we split
REM the user input on the minus sign. If there's no minus sign, then only
REM %upper% won't get a value.
for /f "usebackq delims=- tokens=1,2" %%a in ('%input%') do (
set "lower=%%a"
set "upper=%%b"
)
REM If %upper% has a value, then input was a range. Otherwise, input contained
REM a single number.
if not "%upper%"=="" goto :handle_range
echo Single number: %lower%
goto :eof
:handle_range
echo Range %lower% to %upper%
REM We can use the numeric for loop to loop over the full range.
for /l %%i in (%lower%, 1, %upper%) do (
echo %%i
)
goto :eof
