Find And Replace in a TXT From CSV File - batch-file

I'm trying to find the words listed in the first column of a CSV or XLS file and replace them with the words in the second column of the same file. I have written something along those lines, but it doesn't work.
Can you help me? For each line, the first column should go into a variable called ita and the second column into a variable called eng; the script should then find ita and replace it with eng. As you can imagine, I need to translate a web page, starting from a CSV with one language per column. My CSV file structure is:
ita1;eng1
ita2;eng2
etc...
This is my wrong script:
@echo off
setlocal enableextensions enabledelayedexpansion
set host=%COMPUTERNAME%
echo Host: %host%
pause
for /f "tokens=1 delims=;" %%Ita in (index.csv) do (
SET ita=%%Ita
echo %ita%
pause
for /f "tokens=2 delims=;" %%eng in (index.csv) do (
set eng=%%eng
echo %eng
pause
(for /f "delims=" %%i in ('findstr /n "^" "index.txt"') do (
set "transl=%%i"
set "transl=!line:%ita%=%eng%!"
echo(!line!
endlocal
))>"index2.txt"
type "index2.txt"
)
)
)

(for /f "tokens=1,2 delims=;" %%a in (index.csv) do echo(%%b;%%a)>index2.txt
type index2.txt

@echo off
setlocal enableextensions enabledelayedexpansion
(for /f "tokens=* usebackq" %%l in ("index.txt") do (
set "line=%%l"
for /f "tokens=1,2 delims=; usebackq" %%a in ("index.csv") do (
set "line=!line:%%a=%%b!"
)
echo !line!
)) > index2.txt
Note that this approach is a plain substring replacement: it also replaces matches that occur inside longer words.

If awk is available, try:
awk -F';' 'NR==FNR {a[$1]=$2;next} { for(x in a) gsub(x,a[x]) } 1' index.csv index.txt
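
For comparison outside batch, here is a minimal Python sketch of the same idea: read the `ita;eng` pairs, then apply each one to the text as a literal substring replacement. The function names `load_pairs` and `translate` are made up for this illustration, and the sample words are invented; it is a sketch under those assumptions, not a drop-in replacement for the scripts above.

```python
def load_pairs(csv_text, sep=";"):
    """Parse 'source;target' lines into a list of (source, target) pairs."""
    pairs = []
    for row in csv_text.splitlines():
        if sep not in row:
            continue  # skip blank or malformed rows
        source, target = row.split(sep, 1)
        if source:  # an empty search string would match everywhere
            pairs.append((source, target))
    return pairs


def translate(text, pairs):
    """Apply every pair in order as a literal (non-regex) substring replacement."""
    for source, target in pairs:
        text = text.replace(source, target)
    return text


if __name__ == "__main__":
    # Hypothetical sample data standing in for index.csv and index.txt.
    pairs = load_pairs("ciao;hello\nmondo;world\n")
    print(translate("<p>ciao mondo</p>", pairs))
```

Because the pairs are applied in file order, a short word listed before a longer word that contains it will be replaced first; sorting the pairs by descending source length is one way to avoid that.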

Related

Batch file: count duplicate ids and write them in column of csv

I am currently trying to automate the preprocessing process on a csv file via a batch file. I have the following table:
id;street;name;nrOfIds
4014001;T1;example1;0
4014002;B2;example2;0
4014003;B3;example3;0
4014004;L1;example4;0
4015001;M3;example5;0
4015002;B9;example6;0
4016001;T4;example7;0
4016002;L2;example8;0
4016003;L1;example9;0
The first column, "id", holds the ID of each entry, which is made unique by its last three digits (for example 001, 002, 003, ...). The digits before the last three are not unique. As the result table below shows, I want to count how often the first part of the ID (the part before the last three digits) occurs in the table and write that count into the fourth column, "nrOfIds". The result table should then look like this:
id;street;name;nrOfIds
4014001;T1;example1;4
4014002;B2;example2;4
4014003;B3;example3;4
4014004;L1;example4;4
4015001;M3;example5;2
4015002;B9;example6;2
4016001;T4;example7;3
4016002;L2;example8;3
4016003;L1;example9;3
For example, the part before the last three digits of the first line (4014) exists exactly 4 times in the whole table, so I write 4 in the "nrOfIds" column and so on.
The code used for this looks like this:
@echo off
setlocal enabledelayedexpansion
for /F "tokens=1-3* delims=;" %%a in (%PREPROCESSING_INPUT_PATH%%INPUT_FILENAME%) do (
(echo %%a;%%b;%%c)> "%PREPROCESSING_INPUT_PATH%%OUTPUT_FILENAME%" & goto :file
)
:file
(for /F "skip=1 tokens=1-3* delims=;" %%a in (%PREPROCESSING_INPUT_PATH%%INPUT_FILENAME%) do (
REM count ids (like 4014, 4015, ... and write sum into "nrOfIds" column
)
) >> %PREPROCESSING_OUTPUT_PATH%%OUTPUT_FILENAME%
pause
Any suggestions on how to do this? Thank you very much in advance! Your help is greatly appreciated.
This is very similar to an answer I posted earlier. Here we use findstr /B together with find /C to count how many lines start with the ID prefix (the ID without its last three digits):
@echo off
setlocal enabledelayedexpansion
set "infile=z:\folder31\testcsv.csv"
set "outfile=%PREPROCESSING_OUTPUT_PATH%testOutput.csv"
for /f "usebackq delims=" %%a in ("%infile%") do (
(echo %%a)>"%outfile%" & goto :file
)
:file
(for /f "skip=1 usebackq tokens=1-4* delims=;" %%a in ("%infile%") do (
set "match=%%a"
for /f %%i in ('findstr /B "!match:~0,-3!" "%infile%" ^| find /C "!match:~0,-3!"') do (
set /a _cnt=%%i
echo %%a;%%b;%%c;!_cnt!
)
)
)>>"%outfile%"
Debug version:
@echo off
setlocal enabledelayedexpansion
set "infile=%PREPROCESSING_INPUT_PATH%%INPUT_FILENAME%"
set "outfile=%PREPROCESSING_OUTPUT_PATH%%OUTPUT_FILENAME%"
for /f "usebackq delims=" %%a in ("%infile%") do (
(echo %%a) & goto :file
)
:file
(for /f "skip=1 usebackq tokens=1-4* delims=;" %%a in ("%infile%") do (
set "match=%%a"
for /f %%i in ('findstr /B "!match:~0,-3!" "%infile%" ^|find /C "!match:~0,-3!"') do (
set /a _cnt=%%i
echo %%a;%%b;%%c;!_cnt!
)
)
)
pause
This method is simple and runs fast:
@echo off
setlocal enabledelayedexpansion
rem Count ids
for /F "skip=1 delims=;" %%a in (input.txt) do (
set "id=%%a"
set /A "count[!id:~0,-3!]+=1"
)
rem Update the file
set "header="
(for /F "tokens=1-4 delims=;" %%a in (input.txt) do (
if not defined header (
echo %%a;%%b;%%c;%%d
set "header=1"
) else (
set "id=%%a"
for /F %%i in ("!id:~0,-3!") do echo %%a;%%b;%%c;!count[%%~i]!
)
)) > output.txt
A method based on external commands, like findstr or find, is slower...
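
The two-pass idea used above (count every prefix first, then rewrite each row) can be cross-checked with a short Python sketch. It assumes four semicolon-separated columns with a header line, as in the sample table; the function name `fill_counts` is hypothetical.

```python
from collections import Counter


def fill_counts(lines, sep=";"):
    """Return the rows with column 4 set to the number of ids sharing the same prefix."""
    header = lines[0]
    rows = [line.split(sep) for line in lines[1:] if line]
    # Pass 1: count each id prefix (the id without its last three digits).
    counts = Counter(row[0][:-3] for row in rows)
    # Pass 2: write the count into the fourth column of every row.
    out = [header]
    for row in rows:
        row[3] = str(counts[row[0][:-3]])
        out.append(sep.join(row))
    return out


if __name__ == "__main__":
    sample = [
        "id;street;name;nrOfIds",
        "4014001;T1;example1;0",
        "4014002;B2;example2;0",
        "4015001;M3;example5;0",
    ]
    print("\n".join(fill_counts(sample)))
```

As in the batch version, the file is only scanned twice regardless of how many distinct prefixes it contains, which is why this approach avoids the per-row cost of calling an external search command.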

Batch: split values into separate line with specific pipe delimiter

I'm trying to create a report in which each line is split onto a new row at the pipe delimiter. My data:
I,have,a,report,to,split,|,into,next,row
my,second,line,should,also,|,split,like,before
then it should be like this:
I,have,a,report,to,split
into,next,row
my,second,line,should,also
split,like,before
I have tried the script:
@echo off
setLocal
for /f "tokens=1 delims=.|" %%a in (input.csv) do echo %%a
for /f "tokens=2 delims=.|" %%a in (input.csv) do echo %%a
but the result is:
I,have,a,report,to,split
my,second,line,should,also
into,next,row
split,like,before
Can anyone help me with this? Thanks in advance.
Your result example is wrong. This is the output from your code:
I,have,a,report,to,split,
my,second,line,should,also,
,into,next,row
,split,like,before
Note the comma at the end of lines 1 and 2, and at the beginning of lines 3 and 4.
This method works:
@echo off
setlocal EnableDelayedExpansion
rem Define a variable with CR+LF ASCII chars:
for /F %%a in ('copy /Z "%~F0" NUL') do set CRLF=%%a^
%empty line 1/2%
%empty line 2/2%
rem Change each ",|," string by CR+LF characters
for /F "delims=" %%a in (input.csv) do (
set "line=%%a"
for %%N in ("!CRLF!") do echo !line:,^|,=%%~N!
)
Of course, you may also do it in the "traditional" :/ way:
@echo off
setlocal EnableDelayedExpansion
for /F "tokens=1,2 delims=|" %%a in (input.csv) do (
set "left=%%a"
set "right=%%b"
echo !left:~0,-1!
echo !right:~1!
)
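
Both batch variants above treat the whole ",|," sequence as the separator, which is what removes the stray commas. A minimal Python sketch of that same rule, with a hypothetical function name `split_rows`, looks like this:

```python
def split_rows(lines, marker=",|,"):
    """Split every input line at each ',|,' marker and return the pieces as separate rows."""
    out = []
    for line in lines:
        out.extend(line.split(marker))
    return out


if __name__ == "__main__":
    data = [
        "I,have,a,report,to,split,|,into,next,row",
        "my,second,line,should,also,|,split,like,before",
    ]
    print("\n".join(split_rows(data)))
```

Splitting on the three-character marker rather than on "|" alone is the design choice that matters here: splitting on the bare pipe would leave a trailing comma on the left part and a leading comma on the right part.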

Unable to edit text files with shared delimiter

I am trying to write a script that takes a one-line text file containing the output of another script, removes the commas and spaces, and replaces them with line breaks.
Sample
ENTRY_1331_TFS273350_03, ENTRY_1331_TFS282928, ENTRY_1331_TFS292719,
Desired Output
ENTRY_1331_TFS273350_03
ENTRY_1331_TFS282928
ENTRY_1331_TFS292719
I have tried using variations of the below script (doesn't do anything)
for /f "tokens=* delims=, " %%a in (input.txt) do (
echo %%a>> output.txt
echo( >>output.txt
)
or (only echoes the first object)
setLocal EnableDelayedExpansion
for /f "delims==, " %%A in (input.txt) do (
set string=%%A
echo !string!>>output.txt
echo( >>output.txt
)
Any help or advice would be appreciated,
Thanks!
Try this.
@ECHO OFF
(FOR /F "USEBACKQ DELIMS=" %%A IN ("input.txt"
) DO FOR %%B IN (%%A) DO ECHO=%%B)>"output.txt"
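
The answer above relies on the plain FOR command treating commas and spaces as item separators. As a cross-check outside batch, this Python sketch applies the same rule explicitly; the function name `split_entries` is made up for the illustration.

```python
import re


def split_entries(text):
    """Split on any run of commas and whitespace, dropping empty items."""
    return [item for item in re.split(r"[,\s]+", text) if item]


if __name__ == "__main__":
    line = "ENTRY_1331_TFS273350_03, ENTRY_1331_TFS282928, ENTRY_1331_TFS292719,\n"
    print("\n".join(split_entries(line)))
```

Filtering out empty items handles the trailing comma in the sample, which would otherwise produce a blank last line.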

Script to convert fixed width flat file to csv

I want to make a generic script which will convert a fixed width flat file into csv. Below is my approach:
echo off
setlocal EnableDelayedExpansion
echo a,b,c>final.txt
for /f "tokens=1 delims=;" %%i in (source.txt) do (
set x=%%i
for /f "tokens=1,2 delims=," %%a in (config.txt) do (
call SET VAR=!x:~%%a,%%b!
for %%p in (!VAR!) do (echo/|set /p ="%%p,"
) >>final.txt
)
)
The config file contains the substring parameters: the offset at which each field starts and how many characters it spans.
Config file contains:
0,9
9,3
12,11
23,7
30,1
31,1
32,5
37,9
46,9
55,3
58,9
67,9
76,9
85,9
94,1
The source file contains the actual fixed-width data.
My code produces a result, but fields that are empty (all spaces) in the source are missing from the final output.
Example:
Source:
1234<space>678 [col1=1234, col2=space, col3=678]
Output (Current):
1234,678
Output(Expected):
1234,,678
Please help
Your for %%p in (!VAR!) do (…) loop runs the (…) command once for each whitespace-delimited string found in !VAR!. So if !VAR! consists only of white space, the loop becomes for %%p in ( ) do (…) and executes nothing. The next code snippet could help:
echo off
setlocal EnableDelayedExpansion
echo a,b,c>final.txt
rem replace the €€€ string with any unused one
set "fooString=€€€"
for /f "tokens=1 delims=;" %%i in (source.txt) do (
set "x=%%i"
for /f "tokens=1,2 delims=," %%a in (config.txt) do (
call SET "VARraw=!x:~%%a,%%b!%fooString%"
rem replaced with respect to the OP's comment: for %%p in (!VARraw!) do (
for /F "tokens=*" %%p in ("!VARraw!") do (
set "rav=%%p"
set "var=!rav:%fooString%=!"
echo/|set /p "=!var!,"
) >>final.txt
)
)
Edit: note that for /F "tokens=*" %%p in ("!VARraw!") do ( would remove leading spaces from the !VARraw! string.
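
To make the expected output concrete, here is a small Python sketch that slices each line by the (offset, length) pairs from the config file and keeps empty fields. The names `parse_config` and `to_csv_row` are hypothetical, and stripping both leading and trailing spaces from each field is an assumption of this sketch, not something the question specifies.

```python
def parse_config(config_text):
    """Turn 'offset,length' lines into a list of (offset, length) integer pairs."""
    spec = []
    for row in config_text.splitlines():
        if row.strip():
            start, length = row.split(",")
            spec.append((int(start), int(length)))
    return spec


def to_csv_row(line, spec):
    """Slice one fixed-width line by the spec; all-space fields become empty strings."""
    # Assumption: surrounding spaces are padding and can be stripped.
    return ",".join(line[start:start + length].strip() for start, length in spec)


if __name__ == "__main__":
    spec = parse_config("0,4\n4,1\n5,3\n")
    print(to_csv_row("1234 678", spec))
```

The sketch does not quote fields, so a field that itself contains a comma would need extra handling before the output is valid CSV.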

Loop through CSV file with batch - Space issue

I have a csv file like this
name,sex,age
venu,m,16
test,,22
[EDIT]
name could have comma also
"venu,gopal",m,16
I want to handle the case where sex is empty and save the result to another file.
I have a script like this
@Echo Off
For /F "usebackq tokens=1-3 delims=," %%a in (test.csv) Do (
echo %%a, %%b, %%c >> test-new.csv
)
But for the third record, %%b is 22 when it should be empty. How can I fix this?
[EDIT2]
I have tried as per that link. I am not sure what I am doing wrong. I am getting same issue. Please check it once.
@echo off
setlocal DisableDelayedExpansion
For /F "usebackq tokens=1-3 delims=" %%x in (C:\somefile.csv) Do (
setlocal EnableDelayedExpansion
set "var=%%x"
set "var=!var:"=""!"
set "var=!var:^=^^!"
set "var=!var:&=^&!"
set "var=!var:|=^|!"
set "var=!var:<=^<!"
set "var=!var:>=^>!"
set "var=!var:,=^,^,!"
set var=!var:""="!
set "var=!var:"=""Q!"
set "var=!var:,,="S"S!"
set "var=!var:^,^,=,!"
set "var=!var:""="!"
set "var=!var:"Q=!"
For /F "tokens=1-3 delims=," %%a in ("!var:"S"S=","!") Do (
endlocal
echo %%~a, %%~b, %%~c
setlocal EnableDelayedExpansion
pause
)
endlocal
)
This is a bit tricky, because consecutive delimiters are collapsed into a single delimiter.
So you first need to replace them with a unique delimiter sequence.
@Echo Off
setlocal DisableDelayedExpansion
For /F "usebackq tokens=1-3 delims=" %%a in (test.csv) Do (
set "line=%%a"
setlocal EnableDelayedExpansion
set "line="!line:,=","!""
For /F "tokens=1-3 delims=," %%a in ("!line!") Do (
echo %%~a, %%~b, %%~c
)
rem Close the setlocal opened for this line; nesting is limited to 32 levels
endlocal
)
This encloses each column in quotes; the %%~a syntax removes the quotes again later.
EDIT: The solution for embedded commas
This case differs only slightly from the solution to "how to split on ';' in CMD shell":
@echo off
setlocal DisableDelayedExpansion
For /F "usebackq tokens=1-3 delims=" %%x in (test.csv) Do (
set "var=%%x"
setlocal EnableDelayedExpansion
set "var=!var:^=^^!"
set "var=!var:&=^&!"
set "var=!var:|=^|!"
set "var=!var:<=^<!"
set "var=!var:>=^>!"
set "var=!var:,=^,^,!"
rem ** This is the key line, the missing quote is intentional
call set var=%%var:""="%%
set "var=!var:"="Q!"
set "var=!var:^,^,="C!"
set "var=!var:,,=,!"
set "var=!var:""="!"
set "var="!var:,=","!""
for /F "tokens=1-3 delims=," %%a in ("!var!") do (
endlocal
set "col1=%%~a"
set "col2=%%~b"
set "col3=%%~c"
setlocal EnableDelayedExpansion
if defined col1 (
set "col1=!col1:"C=,!"
set "col1=!col1:"Q="!"
)
if defined col2 (
set "col2=!col2:"C=,!"
set "col2=!col2:"Q="!"
)
if defined col3 (
set "col3=!col3:"C=,!"
set "col3=!col3:"Q="!"
)
echo a=!col1!, b=!col2!, c=!col3!
endlocal
)
)
Using a stream editor like SSED, I overcame this issue by creating a temp file in which ",," is replaced with ",-," twice, then treating a "-" as a field that was empty in the original file...
ssed "s/,,/,-,/ig;s/,,/,-,/ig" file1.csv > file1.tmp
Then the tokens are allocated correctly. If you need to edit the temp file and then return to a CSV, then use...
ssed "s/,-,/,,/ig;s/,-,/,,/ig" file1.tmp > file1.csv
This seems much simpler than doing in flight string replacements by token and having subroutines/etc.
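
For comparison, a CSV-aware parser handles both problems discussed in this thread (empty fields and commas inside quoted fields) without placeholder characters. The following is a minimal Python sketch using the standard csv module; the function name `read_rows` is hypothetical and the sample data is taken from the question.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into lists of fields, keeping empty fields and quoted commas."""
    return list(csv.reader(io.StringIO(csv_text)))


if __name__ == "__main__":
    sample = 'name,sex,age\nvenu,m,16\ntest,,22\n"venu,gopal",m,16\n'
    for name, sex, age in read_rows(sample):
        print(f"{name}, {sex}, {age}")
```

Unlike the for /F tokenizer, the reader does not collapse consecutive delimiters, so the empty sex field in the third record stays in its own column.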
