Script to convert fixed width flat file to csv - batch-file

I want to make a generic script which will convert a fixed width flat file into csv. Below is my approach:
`echo off
setlocal EnableDelayedExpansion
echo a,b,c>final.txt
for /f "tokens=1 delims=;" %%i in (source.txt) do (
set x=%%i
for /f "tokens=1,2 delims=," %%a in (config.txt) do (
call SET VAR=!x:~%%a,%%b!
for %%p in (!VAR!) do (echo/|set /p ="%%p,"
) >>final.txt
)
)
`
The config file I am using holds the substring parameters: each line gives the start offset and the number of characters to extract.
Config file contains:
0,9
9,3
12,11
23,7
30,1
31,1
32,5
37,9
46,9
55,3
58,9
67,9
76,9
85,9
94,1
The source file contains the actual fixed-width data.
With my code I get a result, but fields that are empty (blank) in the source are not reflected in the final output.
Example:
Source:
1234<space>678 [col1=1234,col2=space,col3=678]
Output (Current):
1234,678
Output(Expected):
1234,,678
Please help

Your for %%p in (!VAR!) do (…) loop executes the (…) command once for each string found in !VAR!, delimited by white space. So if !VAR! consists only of white space, the resulting for %%p in ( ) do (…) loop executes nothing. The next code snippet could help:
echo off
setlocal EnableDelayedExpansion
echo a,b,c>final.txt
rem replace the €€€ string with any unused one
set "fooString=€€€"
for /f "tokens=1 delims=;" %%i in (source.txt) do (
set "x=%%i"
for /f "tokens=1,2 delims=," %%a in (config.txt) do (
call SET "VARraw=!x:~%%a,%%b!%fooString%"
rem replaced with respect to the OP's comment: for %%p in (!VARraw!) do (
for /F "tokens=*" %%p in ("!VARraw!") do (
set "rav=%%p"
set "var=!rav:%fooString%=!"
echo/|set /p "=!var!,"
) >>final.txt
)
)
Edit: note that for /F "tokens=*" %%p in ("!VARraw!") do ( would remove leading spaces from the !VARraw! string.
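The difference between the two loop forms can be seen in isolation with a minimal sketch. The variable names and the # sentinel are hypothetical (the answer above uses €€€); any string that never occurs in the data should do.

```bat
@echo off
setlocal EnableDelayedExpansion
rem A field that contains only spaces, as an empty fixed-width column would.
set "field=   "
set "plain=0"
set "guarded=0"
rem Plain FOR splits on white space, so an all-blank value yields no items.
for %%p in (!field!) do set /A plain+=1
rem FOR /F on a quoted string with a sentinel appended always has one token.
for /F "tokens=*" %%p in ("!field!#") do set /A guarded+=1
echo plain FOR ran !plain! time(s), FOR /F with sentinel ran !guarded! time(s)
```

The plain loop body should never run for the blank field, while the sentinel version runs once; the sentinel is then stripped again, as the answer does with set "var=!rav:%fooString%=!".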

Related

Calling a function for every attribute I want to read from multiple .txt files and write to .csv file

I tried to call a function for every attribute (column) that I want to read from 4 .txt files and then write into a .csv file. One column has flawed output, and the code probably has a few logic flaws, as I haven't learned batch cleanly from scratch. Do you know a fix?
Link to previous solved question: Read information from multiple .txt files and sort it into .csv file
@Magoo
echo Name;Prename;Sign;Roomnumber;Phonenumber > sorted.csv
for /f "tokens=1,2 delims= " %%a in (TestEmployees.txt) do (
call :findSign %%a %%b
)
:findSign
set prename=%1
set name=%2
for /f "tokens=1,2 delims= " %%a in (TestSign.txt) do (
if "%name%"=="%%a" (
call :findRoomNumber
)
)
:End
:findRoomNumber
set sign=%1
for /f "tokens=1,2 delims=|" %%q in (TestRoomNumber.txt) do (
if "%sign%"=="%%q" (
call :findPhoneNumber
)
)
:End
:findPhoneNumber
for /f "tokens=1,2 delims=;" %%u in (TestPhoneNumber.txt) do (
if "%%b"=="%%u" (
echo %name%;%prename%;%%b;%%r;%%v >> sorted.csv
)
)
:End
This is the way I would do it:
@echo off
setlocal EnableDelayedExpansion
rem Load PhoneNumber array
for /F "tokens=1,2 delims=;" %%a in (PhoneNumber.txt) do set "phone[%%a]=%%b"
rem Load RoomNumber array
for /F "tokens=1,2 delims=|" %%a in (RoomNumber.txt) do set "room[%%a]=%%b"
rem Load Sign array
for /F "tokens=1,2" %%a in (Sign.txt) do set "sign[%%a]=%%b"
rem Process Employees file and generate output
> sorted.csv (
echo Name;Prename;Sign;RoomNumber;PhoneNumber
for /F "tokens=1,2" %%a in (Employees.txt) do for %%s in (!sign[%%b]!) do (
echo %%b;%%a;%%s;!room[%%s]!;!phone[%%s]!
)
)
@ECHO OFF
SETLOCAL
rem The following settings for the directories and filenames are names
rem that I use for testing and deliberately includes spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
SET "destdir=u:\your results"
SET "filename1=%sourcedir%\q74258020_TestEmployees.txt"
SET "filename2=%sourcedir%\q74258020_TestSign.txt"
SET "filename3=%sourcedir%\q74258020_TestRoomNumber.txt"
SET "filename4=%sourcedir%\q74258020_TestPhoneNumber.txt"
SET "outfile=%destdir%\outfile.txt"
>"%outfile%" (
echo Surname;Name;Sign;Roomnumber;Phonenumber
for /f "usebackqtokens=1,2 delims= " %%g in ("%filename1%") do (
call :findSign %%g %%h
)
)
GOTO :eof
:findSign
for /f "usebackqtokens=1,2 delims= " %%b in ("%filename2%") do (
if "%2"=="%%b" (
for /f "usebackqtokens=1,2 delims=|" %%q in ("%filename3%") do (
if "%%c"=="%%q" (
for /f "usebackqtokens=1,2 delims=;" %%u in ("%filename4%") do (
if "%%c"=="%%u" (
echo %2;%1;%%c;%%r;%%v
)
)
)
)
)
)
GOTO :EOF
Always verify against a test directory before applying to real data.
Note that if the filename does not contain separators like spaces, then both usebackq and the quotes around %filename1% can be omitted.
See "Why should I not upload images of code/data/errors?" and copy and paste the code as text.
For similar reasons, please post all relevant data into a question to obviate every potential respondent having to switch back and forth to the data to which you have linked.
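To illustrate the note on usebackq above, here is a minimal sketch. The scratch file name and its contents are hypothetical and exist only for the demonstration.

```bat
@echo off
setlocal
rem Hypothetical scratch file whose name contains a space.
set "demo=%TEMP%\usebackq demo.txt"
> "%demo%" echo Smith John
rem With usebackq, the quoted set is read as a file name.
for /f "usebackq tokens=1,2" %%g in ("%demo%") do echo file: %%h %%g
rem Without usebackq, a quoted set is parsed as a literal string.
for /f "tokens=1,2" %%g in ("Smith John") do echo string: %%h %%g
del "%demo%"
```

Both loops should print the two names in swapped order; the only difference is whether the quoted text is treated as a file to read or as the data itself.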

Locating keyword in .csv files

I am trying to create a batch file that reads specific CSV files from a specific directory, extracts the lines that contain a specific number, and prints each whole line on the screen. I wrote the code below, but it doesn't work at all: whenever I run it, it only prints the line numbers.
The code:
@echo off
setlocal EnableDelayedExpansion
set "yourDir=C:\Users\Adminm\Desktop\test11\"
set "yourExt=csv"
set "keyword=44"
set /a count=0
set linenum=!count!
set c=0
pushd %yourDir%
for %%a in (*.%yourExt%) do (
for /f "usebackq tokens=3 delims=," %%b in (%yourDir%%%a) do (
set /a count = !count! + 1
if NOT %%b == %keyword% (
for /f "delims=" %%1 in ('type %yourDir%%%a') do (
set /a c+=1 && if "!c!" equ "%linenum%" echo %%1%
)
)
)
)
echo !count!
popd
endlocal
thanks in advance <3
Assuming what you want to do is scan each file for a target string in column 3, then:
for %%a in (*.%yourExt%) do (
for /f "usebackq delims=" %%L in ("%%a") do (
for /f "tokens=3 delims=," %%b in ("%%L") do (
if %%b == %keyword% echo %%L
)
)
)
Since you have already changed to yourdir, there's no requirement to specify it in the scan-for-filenames for.
Your attempt to locate the required line is clumsy. All you need to do is assign each line in turn to a metavariable (%%L) and then use for /f to parse the metavariable. When the required data matches, simply echo the metavariable containing the entire line.
You've attempted to use %%1 as a metavariable. %n for n=0..9 refers to the parameter number supplied to the routine. The only officially defined metavariables for use here are %%a..%%z and %%A..%%Z (one of the very few places where batch is case-sensitive) - although some other symbols also work. Numerics will not work here.
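The case sensitivity of metavariables mentioned above can be checked with a two-line sketch (the words lower and UPPER are just sample data):

```bat
@echo off
rem %%a and %%A are distinct metavariables, so the nested loops do not clash.
for %%a in (lower) do for %%A in (UPPER) do echo %%a %%A
```

This should print lower UPPER, showing that the inner loop does not overwrite the outer loop's variable.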

Generic Text Converter

I want to make a generic batch script that reads a schema file containing the widths (column lengths) of a fixed-width flat source file and creates a target csv file based on those column lengths.
Example:
Schema.txt
COL1,5
COL2,2
COL3,4
COL4,3
COL5,6
So the above schema.txt file contains the column list. It also contains the width of each field. Our source will always be a fixed-width flat file. Our objective is to convert it into csv.
Source1.txt
11111223333444555555
11111223333444555555
Target1.txt
11111,22,3333,444,555555
11111,22,3333,444,555555
Source2.txt
11111  333344466666
11111223333   66666
Target2.txt
11111,,3333,444,66666
11111,22,3333,,66666
So it should be able to handle spaces and blanks as well, as in the second source file.
The schema should be a dynamic file: if we provide the structure, the bat file will create a csv matching that structure from the source. The final target file should have its header taken from the schema file.
Please help.
My present code is given below:
echo off
setlocal EnableDelayedExpansion
echo a,b,c>final.txt
rem replace the €€€ string with any unused one
set "fooString=€€€"
for /f "tokens=1 delims=;" %%i in (source.txt) do (
set "x=%%i"
for /f "tokens=1,2 delims=," %%a in (config.txt) do (
call SET "VARraw=!x:~%%a,%%b!%fooString%"
rem replaced with respect to the OP's comment: for %%p in (!VARraw!) do (
for /F "tokens=*" %%p in ("!VARraw!") do (
set "rav=%%p"
set "var=!rav:%fooString%=!"
echo/|set /p "=!var!,"
) >>final.txt
)
)
Present config.txt contains
0,9
9,3
12,11
23,7
30,1
But I want to modify it: I want to keep only the field name and the width, not the starting position and the width.
The problem with the existing code is that it prints the result on one single line, but I want a line break after the end of each record.
@echo off
setlocal EnableDelayedExpansion
rem Load the schema
set /A numCol=0, maxSpc=0
set "header="
set "spaces="
for /F "tokens=1,2 delims=," %%a in (Schema.txt) do (
set /A numCol+=1
set "header=!header!,%%a"
set "col[!numCol!]=%%b"
if %%b gtr !maxSpc! (
set /A spc=%%b-maxSpc, maxSpc=%%b
for /L %%i in (1,1,!spc!) do set "spaces=!spaces! "
)
)
rem Process the input file
echo %header:~1%
for /F "delims=" %%a in (%1) do (
set "in=%%a"
set "start=0"
set "out="
for /L %%i in (1,1,%numCol%) do for /F "tokens=1,2" %%j in ("!start! !col[%%i]!") do (
set "col=!in:~%%j,%%k!"
if "!col!" equ "!spaces:~0,%%k!" set "col="
set "out=!out!,!col!"
set /A start+=%%k
)
echo !out:~1!
)
Output of example session:
C:\> type Schema.txt
COL1,5
COL2,2
COL3,4
COL4,3
COL5,6
C:\> type Source1.txt
11111223333444555555
11111223333444555555
C:\> test Source1.txt
COL1,COL2,COL3,COL4,COL5
11111,22,3333,444,555555
11111,22,3333,444,555555
C:\> type Source2.txt
11111  333344466666
11111223333   66666
C:\> test Source2.txt
COL1,COL2,COL3,COL4,COL5
11111,,3333,444,66666
11111,22,3333,,66666
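Because the schema lists only widths, each field's start offset is the running sum of the widths before it. A minimal sketch of that bookkeeping, with the widths from the example Schema.txt hard-coded for illustration:

```bat
@echo off
setlocal EnableDelayedExpansion
set "start=0"
rem Widths taken from the example schema: 5, 2, 4, 3, 6.
for %%w in (5 2 4 3 6) do (
    echo width %%w starts at offset !start!
    set /A start+=%%w
)
```

This should report offsets 0, 5, 7, 11 and 14, which are the values the script above feeds into the !in:~start,width! substring expansion.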
The following script (let us call it convert.bat) converts a text file given as a command line argument into a CSV file according to your requirements. You may provide the result file as a second argument; if omitted, the output is displayed at the console. The default schema file Schema.txt can be changed by specifying a third argument (usage: convert.bat source.txt [target.txt [schema.txt]]):
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Remove leading blanks of every field if this value is non-empty:
set "DELBLANKS=REMOVE"
rem Specify source file as first command line argument:
set "SOURCE=%~1"
rem Specify target file as second argument (optionally):
set "TARGET=%~2"
rem Provide scheme file as third argument (default is "Schema.txt"):
set "SCHEME=%~3"
rem Check the given command line arguments:
if not defined SOURCE >&2 echo ERROR: no source file given! & exit /B 1
if not defined TARGET set "TARGET=con"
if not defined SCHEME set "SCHEME=%~dp0Schema.txt"
rem Read scheme file and build header:
setlocal EnableDelayedExpansion
set "HEADER="
set /A POSITION=0
set /A COLUMN=0
for /F "usebackq tokens=1,2 delims=," %%I in ("!SCHEME!") do (
set /A COLUMN+=1
set "HEADER=!HEADER!,%%I"
if not "%%J"=="" (
set "WIDTH=%%J"
set /A WIDTH[!COLUMN!]+=WIDTH
set /A POSITION[!COLUMN!]=POSITION
set /A POSITION+=WIDTH
)
)
rem Convert source file into CSV format and store to target file:
> "!TARGET!" (
echo(!HEADER:~1!
for /F usebackq^ delims^=^ eol^= %%L in ("!SOURCE!") do (
setlocal DisableDelayedExpansion
set "LINE=%%L"
setlocal EnableDelayedExpansion
set "LINE=!LINE:,=;!"
set "CSV="
for /L %%C in (1,1,%COLUMN%) do (
for /F "tokens=1,2 delims=," %%P in ("!POSITION[%%C]!,!WIDTH[%%C]!") do (
if defined DELBLANKS (
for /F tokens^=*^ eol^= %%S in ("!LINE:~%%P,%%Q!,") do (
for /F "delims=" %%T in (""!CSV!"") do (
endlocal
set "CSV=%%~T%%S"
setlocal EnableDelayedExpansion
set "LINE=!LINE:,=;!"
)
)
) else (
set "CSV=!CSV!!LINE:~%%P,%%Q!,"
)
)
)
if defined CSV echo(!CSV:~,-1!
endlocal
endlocal
)
)
endlocal
endlocal
exit /B
The headers in the schema file should not contain any exclamation marks !.
Any commas , in the source file will be replaced by semicolons ;.

Find And Replace in a TXT From CSV File

I'm trying to find words from the first column of a CSV or XLS file and replace them with the words in the second column of the same file. I have made something like that, but it doesn't work.
Can you help me? For each line, the first column should go into a variable called ita and the second column into a variable called eng; then find ita and replace it with eng. As you can imagine, I need to translate a web page, starting from a csv with one language per column. My csv file structure is:
ita1;eng1
ita2;eng2
etc...
This is my wrong script:
@echo off
setlocal enableextensions enabledelayedexpansion
set host=%COMPUTERNAME%
echo Host: %host%
pause
for /f "tokens=1 delims=;" %%Ita in (index.csv) do (
SET ita=%%Ita
echo %ita%
pause
for /f "tokens=2 delims=;" %%eng in (index.csv) do (
set eng=%%eng
echo %eng
pause
(for /f "delims=" %%i in ('findstr /n "^" "index.txt"') do (
set "transl=%%i"
set "transl=!line:%ita%=%eng%!"
echo(!line!
endlocal
))>"index2.txt"
type "index2.txt"
)
)
)
(for /f "tokens=1,2 delims=;" %%a in (index.csv) do echo(%%b;%%a)>index2.txt
type index2.txt
@echo off
setlocal enableextensions enabledelayedexpansion
(for /f "tokens=* usebackq" %%l in ("index.txt") do (
set "line=%%l"
for /f "tokens=1,2 delims=; usebackq" %%a in ("index.csv") do (
set "line=!line:%%a=%%b!"
)
echo !line!
)) > index2.txt
But this approach does not guard against substring matches: a word is also replaced where it occurs inside a longer word.
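A minimal sketch of that pitfall, using hypothetical sample text rather than the real index.csv pairs:

```bat
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical sample text; "cat" also occurs inside "concatenate".
set "line=cat concatenate"
set "line=!line:cat=dog!"
echo !line!
```

This should print dog condogenate: the standalone word is translated, but so is the fragment inside the longer word.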
Try using awk
awk -F';' 'NR==FNR {a[$1]=$2;next} { for(x in a) gsub(x,a[x]) } 1' index.csv index.txt

Loop through CSV file with batch - Space issue

I have a csv file like this
name,sex,age
venu,m,16
test,,22
[EDIT]
name could have comma also
"venu,gopal",m,16
I want to handle if sex is nothing and save it to another file.
I have a script like this
@Echo Off
For /F "usebackq tokens=1-3 delims=," %%a in (test.csv) Do (
echo %%a, %%b, %%c >> test-new.csv
)
But for the third record, I am getting %%b as 22, when it should be blank. How do I fix this?
[EDIT2]
I have tried as per that link. I am not sure what I am doing wrong. I am getting same issue. Please check it once.
@echo off
setlocal DisableDelayedExpansion
For /F "usebackq tokens=1-3 delims=" %%x in (C:\somefile.csv) Do (
setlocal EnableDelayedExpansion
set "var=%%x"
set "var=!var:"=""!"
set "var=!var:^=^^!"
set "var=!var:&=^&!"
set "var=!var:|=^|!"
set "var=!var:<=^<!"
set "var=!var:>=^>!"
set "var=!var:,=^,^,!"
set var=!var:""="!
set "var=!var:"=""Q!"
set "var=!var:,,="S"S!"
set "var=!var:^,^,=,!"
set "var=!var:""="!"
set "var=!var:"Q=!"
For /F "tokens=1-3 delims=," %%a in ("!var:"S"S=","!") Do (
endlocal
echo %%~a, %%~b, %%~c
setlocal EnableDelayedExpansion
pause
)
endlocal
)
This is a bit tricky, as consecutive delimiters are condensed into a single delimiter.
So you need to replace them beforehand with a unique delimiter sequence.
@Echo Off
setlocal DisableDelayedExpansion
For /F "usebackq tokens=1-3 delims=" %%a in (test.csv) Do (
set "line=%%a"
setlocal EnableDelayedExpansion
set "line="!line:,=","!""
For /F "tokens=1-3 delims=," %%a in ("!line!") Do (
echo %%~a, %%~b, %%~c
)
rem close the per-line setlocal so the nesting limit is not reached on long files
endlocal
)
This encloses each column in quotes; the %%~a modifier removes the quotes again later.
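The collapsing of consecutive delimiters, and the effect of quoting every field, can be compared side by side in a minimal sketch. The row value is taken from the question's sample data; the variable names are hypothetical.

```bat
@echo off
setlocal EnableDelayedExpansion
set "row=test,,22"
rem Consecutive commas count as one delimiter, so 22 lands in the second token.
for /F "tokens=1-3 delims=," %%a in ("!row!") do echo raw: [%%a] [%%b] [%%c]
rem Quoting every field first keeps the empty column in place.
set "quoted="!row:,=","!""
for /F "tokens=1-3 delims=," %%a in ("!quoted!") do echo quoted: [%%~a] [%%~b] [%%~c]
```

The first line of output should show 22 in the second position, the second line should show an empty second field with 22 in the third.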
EDIT: The solution for embedded commas
In this case it's only a bit different than the solution for how to split on ';' in CMD shell
@echo off
setlocal DisableDelayedExpansion
For /F "usebackq tokens=1-3 delims=" %%x in (test.csv) Do (
set "var=%%x"
setlocal EnableDelayedExpansion
set "var=!var:^=^^!"
set "var=!var:&=^&!"
set "var=!var:|=^|!"
set "var=!var:<=^<!"
set "var=!var:>=^>!"
set "var=!var:,=^,^,!"
rem ** This is the key line; the missing quote is intentional
call set var=%%var:""="%%
set "var=!var:"="Q!"
set "var=!var:^,^,="C!"
set "var=!var:,,=,!"
set "var=!var:""="!"
set "var="!var:,=","!""
for /F "tokens=1-3 delims=," %%a in ("!var!") do (
endlocal
set "col1=%%~a"
set "col2=%%~b"
set "col3=%%~c"
setlocal EnableDelayedExpansion
if defined col1 (
set "col1=!col1:"C=,!"
set "col1=!col1:"Q="!"
)
if defined col2 (
set "col2=!col2:"C=,!"
set "col2=!col2:"Q="!"
)
if defined col3 (
set "col3=!col3:"C=,!"
set "col3=!col3:"Q="!"
)
echo a=!col1!, b=!col2!, c=!col3!
endlocal
)
)
Using a stream editor like SSED, I have overcome this issue by creating a temp file in which I replaced ",," with ",-," twice, then treating a "-" as an empty field from the original file...
ssed "s/,,/,-,/ig;s/,,/,-,/ig" file1.csv > file1.tmp
Then the tokens are allocated correctly. If you need to edit the temp file and then return to a CSV, then use...
ssed "s/,-,/,,/ig;s/,-,/,,/ig" file1.tmp > file1.csv
This seems much simpler than doing in-flight string replacements per token with subroutines and the like.
