I am new to batch scripting and am trying to do the following:
loop through multiple files, count the number of commas on each line, and remove the extra commas if there are more than 10. I can get the count, but I am stuck after that.
All fields are required. No carriage returns. An extra comma can only occur in the field after the 9th comma.
Example of data in csv file:
Row 1 (good data):
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
Row 2 (bad data):
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
In row 2, "Required text" has an extra comma, which should be removed so that the row looks like row 1.
The logic I would like to have is:
If the row has 10 commas, go to the next line.
If the row has more than 10 commas, remove the one after the 9th comma, since extra commas only occur in that field.
Please note that I cannot put double quotes around the field.
@echo on
setlocal enabledexpansion enableddelayedexpansion
pause
set "inputFile=test.csv"
set "searchChar=,"
set count16=16
pause
for /f "delims=" %%a in ('
findstr /n "^" "%inputFile%"
') do for /f "delims=:" %%b in ("%%~a") do (
set "line=%%a"
pause
for /f %%c in ('
cmd /u /v /e /q /c"(echo(!line:*:=!)"^|find /c "%searchChar%"
') do set count=%%c echo %%c echo here echo %count% echo %count16% echo %%c line %%b has %%c characters
if %count16% equ %count% (echo ***hit)
)
pause
)
pause
Your question is very confusing. You have not clearly explained the details. More importantly, you have not posted an example of the input data and the desired output, which would make up for the lack of detail. So we can only guess what you want...
I think your problem would be easier to explain if you paid attention to the columns that the input and output data have. Are you interested in the commas, or in the columns?
This is my attempt at a solution. I used the example input file posted by Compo.
@echo off
setlocal EnableDelayedExpansion
rem Process all files with .csv extension in current folder
for %%F in (*.csv) do (
ECHO/
ECHO Input: "%%F"
TYPE "%%F"
rem Each file has comma-separated columns: there may be 12 columns or more
rem Keep columns 1-9 the same. After that, generate 3 more columns:
rem the last and next-to-last columns are kept as they are
rem the third-from-last column contains the remaining columns separated by spaces
(for /F "usebackq tokens=1-9* delims=," %%a in ("%%F") do (
set "restAfter9=%%j"
set "last="
set "lastBut1="
set "lastBut2="
for %%A in ("!restAfter9:,=" "!") do (
set "lastBut2=!lastBut2! !lastBut1!"
set "lastBut1=!last!"
set "last=%%~A"
)
echo %%a,%%b,%%c,%%d,%%e,%%f,%%g,%%h,%%i,!lastBut2:~3!,!lastBut1!,!last!
)) > "%%~NF.out"
ECHO Output: "%%~NF.out"
TYPE "%%~NF.out"
)
Output example:
Input: "test1.csv"
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re,qu,ir,ed,te,xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,,Re,qu,,,te,xt, pencil ,pen
Output: "test1.out"
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re qu ir ed te xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,Re,qu te xt, pencil ,pen
Input: "test2.csv"
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books,08/22/2022,12/10/2022,$60 basic supplies,37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA2,11800118,Required Supplies
Output: "test2.out"
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books 08/22/2022 12/10/2022 $60 basic supplies 37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA2,11800118,Required Supplies
EDIT: New simpler solution added
@echo off
setlocal EnableDelayedExpansion
rem General method to keep the first N columns the same
rem and group additional fields in column N+1
rem Define the number of "same" and "total" columns:
set /A "same=12, last=17"
rem Process all files with .csv extension in current folder
for %%F in (*.csv) do (
ECHO/
ECHO Input: "%%F"
TYPE "%%F"
rem Process all lines of current file
(for /F "usebackq delims=" %%a in ("%%F") do (
set "line=%%a"
set "head="
set "tail="
set "i=0"
rem Split current line in comma-separated fields
for %%b in ("!line:,=" "!") do (
set /A i+=1
if !i! leq %same% ( rem Accumulate field in "head" columns
set "head=!head!%%~b,"
) else if !i! leq %last% ( rem Accumulate field in "tail" columns
set "tail=!tail!%%~b,"
) else ( rem Combine one field from beginning of "tail" and accumulate last field
for /F "tokens=1* delims=," %%x in ("!tail!") do set "tail=%%x %%y%%~b,"
)
)
echo !head!!tail:~0,-1!
)) > "%%~NF.out"
ECHO Output: "%%~NF.out"
TYPE "%%~NF.out"
)
Output example:
Input: "test1.csv"
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 13a, field 13b, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 13a, field 13b, field 13c, field 14, field 15, field 16, field 17
Output: "test1.out"
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13 field 13a field 13b, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13 field 13a field 13b field 13c, field 14, field 15, field 16, field 17
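For checking the script's output, the same grouping rule can be written out in a few lines of Python. This is only a reference sketch, not part of the batch solution: the name `group_overflow` and the sample row are made up for illustration, and `same` and `total` play the roles of `same` and `last` in the script above.

```python
def group_overflow(line, same, total, sep=",", joiner=" "):
    """Keep the first `same` fields and the trailing fields unchanged;
    fold any surplus fields into field number same+1."""
    fields = line.split(sep)          # str.split keeps empty fields
    extra = len(fields) - total       # how many fields too many
    if extra <= 0:
        return line                   # nothing to repair
    head = fields[:same]
    merged = joiner.join(fields[same:same + extra + 1])
    tail = fields[same + extra + 1:]
    return sep.join(head + [merged] + tail)


# Hypothetical sample: 6 columns expected, the 4th one was split in three.
print(group_overflow("a,b,c,d1,d2,d3,e,f", same=3, total=6))
# prints: a,b,c,d1 d2 d3,e,f
```

With `same=9` and `total=12` the function applies the rule to the question's data.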
@ECHO OFF
SETLOCAL
rem The following settings for the directories and filenames are names
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
SET "filename1=%sourcedir%\q73491865.txt"
SET "destdir=u:\your results"
SET "outfile=%destdir%\outfile.txt"
(
FOR /f "usebackqtokens=1-12*delims=," %%g IN ("%filename1%") DO (
IF "%%s"=="" (
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p,%%q,%%r
) ELSE (
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p %%q,%%r,%%s
)
)
)>"%outfile%"
GOTO :EOF
Always verify against a test directory before applying to real data.
I've assumed that no empty fields exist.
There are 12 fields normally, 13 if the extra comma exists. Your column-count and comma-count are incorrect.
Assign each field to %%g..%%s. If %%s is empty, then there are 12 fields, so regurgitate %%g..%%r.
If %%s is not empty, remove the comma between %%p and %%q and append %%s.
=== Revision ==== following extra specification
@ECHO Off
SETLOCAL
rem The following settings for the directories and filenames are names
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
SET "filename1=%sourcedir%\q73491865.txt"
SET "destdir=u:\your results"
SET "outfile=%destdir%\outfile.txt"
:again
SET "modified="
(
FOR /f "usebackqtokens=1-12*delims=," %%g IN ("%filename1%") DO (
IF "%%s"=="" (
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p,%%q,%%r
) ELSE (
SET "modified=y"
ECHO %%g,%%h,%%i,%%j,%%k,%%l,%%m,%%n,%%o,%%p %%q,%%r,%%s
)
)
)>"%outfile%"
IF DEFINED modified (
SET "filename1=%outfile%"
IF "%outfile:~-1%"=="1" (SET "outfile=%outfile:~0,-1%") ELSE SET "outfile=%outfile%1"
GOTO again
)
IF "%outfile:~-1%"=="1" SET "outfile=%outfile:~0,-1%"&GOTO again
DEL "%outfile%1" 2>NUL
TYPE "%outfile%"
GOTO :EOF
The revision is somewhat inelegant, repeatedly reprocessing the result file until no more substitutions are made.
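The "repeat until nothing changes" idea can be sketched in Python as a reference for checking results. This is an illustration under assumptions, not a translation of the script: `merge_once` and `merge_until_stable` are hypothetical names, the loop runs per line rather than over the whole file, and `str.split` keeps empty fields, which `FOR /F` does not.

```python
def merge_once(line):
    """One pass: if there are more than 12 fields, join fields 10 and 11 with a space."""
    fields = line.split(",")
    if len(fields) <= 12:
        return line
    fields[9:11] = [fields[9] + " " + fields[10]]
    return ",".join(fields)


def merge_until_stable(line):
    """Repeat the single merge until a pass leaves the line unchanged."""
    while True:
        merged = merge_once(line)
        if merged == line:
            return line
        line = merged


row = "789,235252,6376,test3,08/11/2022,2,0,1,EA,Re,qu,ir,ed,te,xt, pencil ,pen"
print(merge_until_stable(row))
# prints: 789,235252,6376,test3,08/11/2022,2,0,1,EA,Re qu ir ed te xt, pencil ,pen
```

Each pass removes exactly one surplus comma, so a line with n extra commas needs n merging passes plus one final pass that detects no change.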
Here's something which may 'remove' the commas you require, (it will not however 'look like the row above' as you stated, because that would require replacing commas with space characters):
I will not support any changes to your requirements or any script modification beyond changing S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv on line 4
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "SourceCSV=S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv"
If Not Exist "%SourceCSV%" Exit /B
For /F UseBackQ^ Delims^=^ EOL^= %%G In ("%SourceCSV%") Do Call :Sub "%%G"
Pause
Exit /B
:Sub
Set "r=%~1"
For /F "Delims==" %%G In ('"(Set f[) 2>NUL"') Do Set "%%G="
Set "i=1"
SetLocal EnableDelayedExpansion
Set "f[!i!]=%r:,=" & Set /A i += 1 & Set "f[!i!]=%"
If !i! Lss 13 (Echo(%~1& GoTo :EOF) Else Set /A c=i-2
For /L %%G In (1,1,!i!) Do If %%G Equ 1 (Set /P "=!f[%%G]!" 0<NUL
) Else If %%G GTR !c! (Set /P "=,!f[%%G]!" 0<NUL) Else If %%G Lss 11 (
Set /P "=,!f[%%G]!" 0<NUL) Else Set /P "=!f[%%G]!" 0<NUL
Echo(
GoTo :EOF
Note: This code should allow for empty fields and for multiple adjacent commas in the problem field (field 10).
As a courtesy, the following example does the same as the code above, except that instead of removing the commas, as your question specifically asks, it replaces them with spaces.
@Echo Off
SetLocal EnableExtensions DisableDelayedExpansion
Set "SourceCSV=S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv"
If Not Exist "%SourceCSV%" Exit /B
For /F UseBackQ^ Delims^=^ EOL^= %%G In ("%SourceCSV%") Do Call :Sub "%%G"
Pause
Exit /B
:Sub
Set "r=%~1"
For /F "Delims==" %%G In ('"(Set f[) 2>NUL"') Do Set "%%G="
Set "i=1"
SetLocal EnableDelayedExpansion
Set "f[!i!]=%r:,=" & Set /A i += 1 & Set "f[!i!]=%"
If !i! Lss 13 (Echo(%~1& GoTo :EOF) Else Set /A c=i-2
For /L %%G In (1,1,!i!) Do If %%G Equ 1 (Set /P "=!f[%%G]!" 0<NUL
) Else If %%G GTR !c! (Set /P "=,!f[%%G]!" 0<NUL) Else If %%G Lss 11 (
Set /P "=,!f[%%G]!" 0<NUL) Else (Set "Prompt=$S!f[%%G]:$=$$!"
%SystemRoot%\System32\cmd.exe /D /K 0<NUL)
Echo(
GoTo :EOF
S:\omewhere\AAA708270467-11-08-2022_12-15-36-819-A-2376058.csv
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re,qu,ir,ed,te,xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,,Re,qu,,,te,xt, pencil ,pen
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books,08/22/2022,12/10/2022,$60 basic supplies,37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA2,11800118,Required Supplies
Intended output of the second example:
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re qu ir ed te xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,,Required text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,,Re qu te xt, pencil ,pen
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books 08/22/2022 12/10/2022 $60 basic supplies 37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA2,11800118,Required Supplies
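The note about empty fields is worth testing for, because `for /F` with `delims=,` treats a run of commas as a single delimiter, so a row with an empty field loses a column when it is split into tokens. A rough Python model of that behaviour (the function name is hypothetical, and it only approximates the tokenizer) shows why the `test4` row has 13 fields but yields only 12 tokens:

```python
def split_like_for_f(line, delims=","):
    """Approximate FOR /F tokenizing: a run of delimiters acts as one,
    so empty fields never show up as tokens."""
    tokens = []
    current = ""
    for ch in line:
        if ch in delims:
            if current:
                tokens.append(current)
            current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


row = "012,235252,6376,test4,08/11/2022,2,0,1,,Required,text, pencil ,pen"
print(len(row.split(",")))         # 13: the empty 9th field is counted
print(len(split_like_for_f(row)))  # 12: the empty field is skipped
```

A script that decides "good" or "bad" from the token count would therefore treat this row as good, which is why splitting on every single comma is the safer approach when fields may be empty.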
I'm sorry, but I'm not good at Windows batch.
I want to merge two or more files by matching on a key value. The files have different numbers of lines.
For example:
File_A.txt
Time, D1, D2
1.1, 11, 12
1.2, 21, 22
1.3, 31, 32
1.4, 41, 42
1.5, 51, 52
File_B.txt
Time, D3, D4
1.1, 13, 14
1.3, 33, 34
1.4, 43, 44
File_C.txt
Time, D5, D6
1.2, 25, 26
1.4, 45, 46
1.5, 55, 56
I want to get:
Merged.txt
Time, D1, D2, D3, D4, D5, D6
1.1, 11, 12, 13, 14
1.2, 21, 22, , , 25, 26
1.3, 31, 32, 33, 34
1.4, 41, 42, 43, 44, 45, 46
1.5, 51, 52, , , 55, 56
This would be easy in C or C++, but because of my situation I have to do it in Windows batch, and I cannot imagine how to do it.
Any help would be appreciated.
This solution processes all files named File_*.txt in the current directory and assumes that the "master file" (the one with all keys) is the first file.
@echo off
setlocal EnableDelayedExpansion
set "keys="
for %%f in (File_*.txt) do (
if not defined keys (
for /F "usebackq tokens=1* delims=," %%a in ("%%f") do (
set "line[%%a]=%%a,%%b"
set "keys=!keys! %%a"
)
) else (
set "rest=!keys!"
for /F "usebackq tokens=1* delims=," %%a in ("%%f") do (
set "line[%%a]=!line[%%a]!,%%b"
set "rest=!rest: %%a=!"
)
for %%k in (!rest!) do set "line[%%k]=!line[%%k]!, , "
)
)
(for %%k in (%keys%) do echo !line[%%k]!) > Merged.txt
Using your three example files as input, this is the output:
Time, D1, D2, D3, D4, D5, D6
1.1, 11, 12, 13, 14, ,
1.2, 21, 22, , , 25, 26
1.3, 31, 32, 33, 34, ,
1.4, 41, 42, 43, 44, 45, 46
1.5, 51, 52, , , 55, 56
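To cross-check the merged result, the same master-file logic can be sketched in Python. This is a reference under assumptions, not the batch solution itself: `merge_by_key` is a hypothetical name, the files are passed in as lists of lines, and the filler `", , "` assumes every file contributes exactly two data columns, as in the example.

```python
def merge_by_key(tables, filler=", , "):
    """tables: list of files, each a list of lines.
    The first table is the master and must contain every key."""
    merged = {}
    order = []
    for line in tables[0]:
        key = line.partition(",")[0]
        merged[key] = line
        order.append(key)
    for table in tables[1:]:
        seen = set()
        for line in table:
            key, _, rest = line.partition(",")
            if key in merged:             # keys missing from the master are ignored
                merged[key] += "," + rest
                seen.add(key)
        for key in order:                 # pad the keys this file did not have
            if key not in seen:
                merged[key] += filler
    return [merged[key] for key in order]


file_a = ["Time, D1, D2", "1.1, 11, 12", "1.2, 21, 22", "1.3, 31, 32", "1.4, 41, 42", "1.5, 51, 52"]
file_b = ["Time, D3, D4", "1.1, 13, 14", "1.3, 33, 34", "1.4, 43, 44"]
file_c = ["Time, D5, D6", "1.2, 25, 26", "1.4, 45, 46", "1.5, 55, 56"]
for row in merge_by_key([file_a, file_b, file_c]):
    print(row)
```

Padded rows end with the filler, so they carry a trailing space after the last comma.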
Here is a batch script that should do what you want, although it is not very efficient on large files. It relies on all the files having the same number of columns (three in this case) and on the first file provided containing all possible values in its first column. Supposing the script is saved as merge.bat, provide the files to merge as command-line arguments, like:
merge.bat "File_A.txt" "File_B.txt" "File_C.txt"
The result file will always be Merged.txt in the current working directory. So here is the code:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_SEPAR=, "
set "_FILL=%_SEPAR% %_SEPAR% "
set "_RESULT=Merged.txt"
set "FIRST=#"
for %%A in (%*) do (
if defined FIRST (
> nul copy /Y "%%~nxA" "%_RESULT%"
set "FIRST="
) else (
> "%_RESULT%.tmp" (
for /F usebackq^ delims^=^ eol^= %%F in ("%_RESULT%") do (
set "FLAG=" & set "LINE=%%F"
for /F usebackq^ delims^=^ eol^= %%I in ("%%~A") do (
for /F "eol=%_SEPAR:~,1% delims=%_SEPAR%" %%E in ("%%F") do (
for /F "tokens=1* eol=%_SEPAR:~,1% delims=%_SEPAR%" %%J in ("%%I") do (
if "%%E"=="%%J" set "STR=%%K" & set "FLAG=#"
)
)
)
setlocal EnableDelayedExpansion
if defined FLAG (
echo(!LINE!%_SEPAR%!STR!
) else (
echo(!LINE!%_FILL%
)
endlocal
)
)
> nul move /Y "%_RESULT%.tmp" "%_RESULT%"
)
)
endlocal
exit /B
With your sample files as input, the output in Merged.txt is going to be this:
Time, D1, D2, D3, D4, D5, D6
1.1, 11, 12, 13, 14, ,
1.2, 21, 22, , , 25, 26
1.3, 31, 32, 33, 34, ,
1.4, 41, 42, 43, 44, 45, 46
1.5, 51, 52, , , 55, 56
Here is an alternative batch script. In contrast to the other one, the first file is no longer expected to contain all possible values in its first column. This is the code:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_SEPAR=, "
set "_FILL=%_SEPAR% %_SEPAR% "
set "_RESULT=Merged.txt"
for /F "delims==" %%E in ('set "$" 2^> nul') do set "%%E="
set /A "INDEX=0"
for %%A in (%*) do (
for /F "usebackq eol=%_SEPAR:~,1% delims=%_SEPAR%" %%E in ("%%~A") do (
call set "NUMBER=000000000000%%INDEX%%"
if not defined $[%%E] call set "$[%%E]=%%NUMBER:~-12%%%_SEPAR%%%E"
set /A "INDEX+=1"
)
)
> "%_RESULT%.tmp" (
for /F "tokens=1* delims==" %%E in ('set "$"') do @(
echo(%%F
)
)
> "%_RESULT%" (
for /F "tokens=1* delims=%_SEPAR%" %%E in ('sort "%_RESULT%.tmp"') do @(
echo(%%F
)
)
del "%_RESULT%.tmp"
for %%A in (%*) do (
> "%_RESULT%.tmp" (
for /F usebackq^ delims^=^ eol^= %%F in ("%_RESULT%") do (
set "FLAG=" & set "LINE=%%F"
for /F usebackq^ delims^=^ eol^= %%I in ("%%~A") do (
for /F "eol=%_SEPAR:~,1% delims=%_SEPAR%" %%E in ("%%F") do (
for /F "tokens=1* eol=%_SEPAR:~,1% delims=%_SEPAR%" %%J in ("%%I") do (
if "%%E"=="%%J" set "STR=%%K" & set "FLAG=#"
)
)
)
setlocal EnableDelayedExpansion
if defined FLAG (
echo(!LINE!%_SEPAR%!STR!
) else (
echo(!LINE!%_FILL%
)
endlocal
)
)
> nul move /Y "%_RESULT%.tmp" "%_RESULT%"
)
endlocal
exit /B
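For comparison, the idea behind this alternative (collect every key from every file first, then merge each file against that list) can be sketched in Python. This is only an illustration under assumptions: `merge_all_keys` is a hypothetical name, the files are given as lists of lines, keys keep the order in which they first appear, and the filler assumes two data columns per file.

```python
def merge_all_keys(tables, filler=", , "):
    """Merge files by their first column; no single file needs to hold every key."""
    order = []
    for table in tables:                  # pass 1: collect keys in first-seen order
        for line in table:
            key = line.partition(",")[0].strip()
            if key not in order:
                order.append(key)
    merged = {key: key for key in order}
    for table in tables:                  # pass 2: append each file's columns or padding
        rows = {}
        for line in table:
            key, _, rest = line.partition(",")
            rows[key.strip()] = rest.strip()
        for key in order:
            if key in rows:
                merged[key] += ", " + rows[key]
            else:
                merged[key] += filler
    return [merged[key] for key in order]


# Hypothetical input: the first file is missing key 1.2.
first = ["Time, D1, D2", "1.1, 11, 12"]
second = ["Time, D3, D4", "1.2, 23, 24"]
for row in merge_all_keys([first, second]):
    print(row)
```

Because the key list is built before any merging, the first file is handled like every other file and is padded where it lacks a key.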