How to merge two text files using batch script? - batch-file

I have two text files as A.txt and B.txt with the below contents:
A.txt
value_a1,value_a2
value_b
value_c
value_d
value_e1,value_e2
B.txt
12,14
13
15
16
23,34
I want the output file C.txt to be:
"value_a1","12","value_a2","14"
"value_b","13"
"value_c","15"
"value_d","16"
"value_e1","23","value_e2","34"
Please guide me through as I am new to Batch Script.

The following code will work:
@echo off
echo. >>d.txt
type b.txt >>d.txt
set dec=1
For /F "usebackq tokens=1,* delims=, " %%a in ("a.txt") do call :File1 %%a %%b
set dec1=0
del d.txt
exit /b
:File1
SET str1=%~1
SET str2=%~2
SET Count=1
For /F "usebackq tokens=1,* skip=%dec% delims=," %%A in ("d.txt") do call :File2 %%A %%B
set /a dec=%dec%+1
exit /b
:File2
SET str3=%~1
SET str4="%~2"
IF %Count% EQU 1 (
IF %str4%=="" (
echo "%str1%","%str3%" >>c.txt
set /a Count=%Count%+1
) ELSE (
echo "%str1%","%str3%","%str2%",%str4% >>c.txt
set /a Count=%Count%+1)
)
exit /b

There are many restrictions to this solution, but it is about as simple a solution as is possible with pure native batch. It is also fairly efficient for a batch solution.
@echo off
setlocal enableDelayedExpansion
<b.txt >c.txt (
  for /f delims^=^ eol^= %%L in (a.txt) do (
    rem Read the matching line of B.TXT from redirected stdin
    set "ln="
    set /p "ln="
    set "out="
    for %%A in (%%L) do (
      for /f "tokens=1* delims=," %%a in ("!ln!") do (
        set "out=!out!,"%%A","%%a""
        set "ln=%%b"
      )
    )
    rem Print the merged line once, after all values have been paired
    echo !out:~1!
  )
)
Limitations:
A.TXT cannot contain * or ? or !
A.TXT values must be quoted if they contain any of <space> <tab> , ; =
A.TXT max line length is approximately 8191 bytes
B.TXT cannot contain !
B.TXT values cannot contain , (, is strictly a delimiter)
B.TXT max line length is 1021 bytes
B.TXT lines must use Windows style terminators (carriage return/linefeed), not Unix style (linefeed)
Some of the limitations can be overcome fairly easily. Others require a lot more effort, to the point of becoming totally impractical.
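As a cross-check of the expected C.txt, the pairing logic can be sketched outside batch. The following is a minimal Python sketch, not part of any answer above; the function name merge_lines and the in-memory sample lists are assumptions made for illustration. It pairs line i of A.txt with line i of B.txt, splits both on commas, and interleaves the quoted values.

```python
def merge_lines(a_lines, b_lines):
    """Pair line i of A with line i of B and interleave their comma-separated values."""
    merged = []
    # zip stops at the shorter input, so unmatched trailing lines are dropped.
    for a_line, b_line in zip(a_lines, b_lines):
        a_values = a_line.strip().split(",")
        b_values = b_line.strip().split(",")
        # Interleave a1, b1, a2, b2, ... and wrap every value in double quotes.
        pairs = [f'"{a}","{b}"' for a, b in zip(a_values, b_values)]
        merged.append(",".join(pairs))
    return merged


# Sample data copied from the question (hypothetical in-memory stand-ins for the files).
a_txt = ["value_a1,value_a2", "value_b", "value_c", "value_d", "value_e1,value_e2"]
b_txt = ["12,14", "13", "15", "16", "23,34"]

for line in merge_lines(a_txt, b_txt):
    print(line)
```

To run it against real files, the two sample lists could be replaced by open("A.txt").read().splitlines() and the equivalent for B.txt.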

Related

How to read all content from text file in single variable in Batch Script?

This is my first experience with batch scripting. I am trying to read the content of a text file and store it in a single variable. I am running the script from a .bat file, but it is not working.
I want to set all the content of the file in a single variable.
I have tried most of the examples I could find, without success.
Below is the script I am trying:
cd "C:\documents and settings\%USERNAME%\desktop"
for /f "delims=" %%x in (Test.txt) do set Build=%%x
pause >nul
exit
This is my text file, and the result below is what is shown.
I want it in a single variable.
cd "C:\documents and settings\%USERNAME%\desktop"
setlocal enableDelayedExpansion
for /f "useback delims=" %%x in ("Test.txt") do set "Build=!build! %%x"
echo %build%
pause >nul
exit
Mind that the maximum length of a string you can assign to a variable is 8191 characters.
Also, some special characters (such as % and !) could break the script above.
i want to set all content of file in single variable.
In batch, the only way is to build a multi-line variable yourself, joining the lines with a line feed stored in a helper variable, here named !LF!
@echo off
SETLOCAL EnableDelayedExpansion
::Change this
set "file=%~f0"
if NOT EXIST "%file%" exit /b 1
::Initialize & empty variables
( set LF=^
%= 0x0A Line Feed =%
)
set "lines="
set "data="
::Count lines
FOR /F tokens^=*^ delims^=^ eol^= %%L in (
'2^>nul findstr /N "^" "%file%"'
) do set /a "lines+=1"
::Read file using SET /P
<"%file%" (
for /L %%a in (1 1 %lines%) do (
set "x=" %= Workaround for empty lines =%
set /p "x="
set "data=!data!!LF!!x!"
)
)
::Print data
echo(!data!
ENDLOCAL
pause & exit /b
The FOR /F tokens^=*^ delims^=^ eol^= disables eol by escaping every delimiter. Warning: "eol=" DOES NOT disable eol, but sets the quote " as the end-of-line character!
Disadvantages:
SLOW: On my machine, it takes 2 seconds for a 2000 lines file, printing the variable excluded
The line ending style (\n or \r\n) is ignored and always set to \n
Additional \n at the beginning of variable
Advantages:
SAFE: Always works for any file with printable ASCII characters (0x20 ~ 0x7E)
You can also use @dbenham's CERTUTIL method. It is foolproof against <CR> <LF> ! ^...
See more tricks with certutil for the "poorly documented" switches explained in depth.
@echo off
====SETLOCAL EnableDelayedExpansion EnableExtensions
set "B=^!"
set "C=^"
set ^"L=^
%===Line Feed===%
"
for /F %%C in ('wmic qfe list') do set "R=%%C" Carriage Return
>nul 2>&1 certutil -f -encodehex Test.txt "%temp%\hex.txt" 4
pushd "%temp%"
>expand.txt (
for /f "delims=" %%A in (hex.txt) do for %%B in (%%A) do (
set "char=%%B"
REM ! --> !B!
set "char=!char:21=21 42 21!"
REM ^ --> !C!
set "char=!char:5e=21 43 21!"
REM <LF> --> !L!
set "char=!char:0a=21 4c 21!"
REM <CR> --> !R!
set "char=!char:0d=21 52 21!"
echo(!char!
)
)
>nul 2>&1 certutil -f -decodehex expand.txt rawContent.txt
for /f delims^=^ eol^= %%A in (rawContent.txt) do set "fileContent=%%A"
del hex.txt expand.txt rawContent.txt
popd
echo(!fileContent!

Windows Batch FOR Loop improvement

I have a batch file that checks for duplicate lines in a TXT file (over one million lines, 13 MB), and it runs for over 2 hours. How can I speed it up? Thank you!
TXT file
11
22
33
44
.
.
.
44 (over one million lines)
Existing Batch
setlocal
set var1=*
sort original.txt>sort.txt
for /f %%a in ('type sort.txt') do (call :run %%a)
goto :end
:run
if %1==%var1% echo %1>>duplicate.txt
set var1=%1
goto :eof
:end
This should be the fastest method using a Batch file:
@echo off
setlocal EnableDelayedExpansion
set var1=*
sort original.txt>sort.txt
(for /f %%a in (sort.txt) do (
if "%%a" == "!var1!" (
echo %%a
) else (
set "var1=%%a"
)
)) >duplicate.txt
This method uses the findstr command as in aschipfl's answer, but here each line and its duplicates are removed from the file after being processed by findstr. It could be faster if the number of duplicates in the file is high; otherwise it will be slower because of the large volume of data handled on each pass. Only a test can confirm this.
@echo off
setlocal EnableDelayedExpansion
del duplicate.txt 2>NUL
copy /Y original.txt input.txt > NUL
:nextTurn
for %%a in (input.txt) do if %%~Za equ 0 goto end
< input.txt (
set /P "line="
findstr /X /C:"!line!"
find /V "!line!" > output.txt
) >> duplicate.txt
move /Y output.txt input.txt > NUL
goto nextTurn
:end
@echo off
setlocal enabledelayedexpansion
set var1=*
(
for /f %%a in ('sort q42574625.txt') do (
if "%%a"=="!var1!" echo %%a
set "var1=%%a"
)
)>"u:\q42574625_2.txt"
GOTO :EOF
This may be faster - I don't have your file to test against
I used a file named q42574625.txt containing some dummy data for my testing.
It's not clear whether you want only one instance of a duplicate line or not. Your code would produce 5 "duplicate" lines if there were 6 identical lines in the source file.
Here's a version which will report each duplicated line only once:
@echo off
setlocal enabledelayedexpansion
set var1=*
set var2=*
(
for /f %%a in ('sort q42574625.txt') do (
if "%%a"=="!var1!" IF "!var2!" neq "%%a" echo %%a&SET "var2=%%a"
set "var1=%%a"
)
)>"u:\q42574625.txt"
GOTO :EOF
Supposing you provide the text file as the first command line argument, you could try the following:
@echo off
for /F "usebackq delims=" %%L in ("%~1") do (
for /F "delims=" %%K in ('
findstr /X /C:"%%L" "%~1" ^| find /C /V ""
') do (
if %%K GTR 1 echo %%L
)
)
This returns all duplicate lines, but multiple times each, namely as often as each occurs in the file.
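The sort-then-compare idea used by several answers above can be checked on a small sample with the following minimal Python sketch. It is not a replacement for the batch scripts; the function name duplicated_lines and the sample list are assumptions for illustration. It reports each duplicated line once, like the var1/var2 version.

```python
def duplicated_lines(lines):
    """Return each line that occurs more than once, reported once, in sorted order."""
    duplicates = []
    previous = None       # plays the role of var1: the line seen just before
    last_reported = None  # plays the role of var2: the last duplicate written out
    for line in sorted(lines):
        # After sorting, equal lines are adjacent, so one comparison per line suffices.
        if line == previous and line != last_reported:
            duplicates.append(line)
            last_reported = line
        previous = line
    return duplicates


# Hypothetical sample in the style of the question's TXT file.
sample = ["11", "22", "33", "44", "22", "44", "44"]
print(duplicated_lines(sample))
```

Because every line is compared only with its predecessor, the work after sorting is a single pass over the file, which is why this approach scales better than re-scanning the file for every line.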

Extract substring from filename and count

Example below - 5 files will be located in the same folder.
Sales-fid1000-f100.dat
Revenue-fid1000-f100.dat
Sales-fid2000-f200.dat
Revenue-fid2000-f200.dat
Income-fid2000-f200.dat
I need to read the filename and get the number after "fid", in this case 1000 and 2000 and count the number of files associated with each "fid".
So for fid1000, there are 2 files and for fid2000, there are 3 files.
I need to write the output into a .txt file as below with first field being the fid number and second field being the count.
1000|2
2000|3
How can I generate output text file with fid and count using a Windows batch file?
@echo off
setlocal EnableDelayedExpansion
rem Process all file names
for /F "tokens=2 delims=-" %%a in ('dir /B /A-D *.dat') do (
rem Get FID from second dash-delimited token; format: "xxx-fid####-xxx.dat"
set "fid=%%a"
rem Accumulate it to the corresponding element of "count" array
set /A "count[!fid:~3!]+=1"
)
rem Create the output
(for /F "tokens=2,3 delims=[]=" %%a in ('set count[') do echo %%a^|%%b) > output.txt
For further details on array management in Batch files, see: Arrays, linked lists and other data structures in cmd.exe (batch) script
#ECHO OFF
SETLOCAL
:: remove variables starting $
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "outfile=%destdir%\outfile.txt"
FOR /f "delims=" %%a IN (
'dir /b /a-d "%sourcedir%\*-fid*" '
) DO (
SET "filename=%%a"
CALL :process
)
(
FOR /F "tokens=1,2delims=$=" %%a In ('set $ 2^>Nul') DO ECHO(%%a^|%%b
)>"%outfile%"
GOTO :EOF
:process
SET "filename=%filename:*-fid=%"
FOR /f "delims=-" %%q IN ("%filename%") DO SET /a $%%q+=1
GOTO :eof
You would need to change the settings of sourcedir and destdir to suit your circumstances.
Produces the file defined as %outfile%
After clearing all the $ variables (for safety's sake), perform a directory listing without directorynames and in basic form of files in the source directory matching *-fid*.
For each name found, assign the name to filename and execute the :process routine, which first removes the characters up to and including -fid from filename then uses the delims=- option to assign the part originally between -fid and the following - to %%q.
Then set the variable $%%q up by 1 (if $?? is undefined, assign 1).
Finally, when all the names have been processed, list the variables named $... using set which produces a report of the style
$1000=2
$2000=3
Using $ and = as delimiters puts token 1 (eg 2000) into %%a and token 2 (eg 3) into %%b. Write these to the output using echo, remembering to escape the pipe (|) with a caret (^) to suppress the interpretation as a redirector.
The parentheses around the for...$... ensure the output is directed to the destination file specified.
Extract the numbers into a temporary file, then count the occurrences of each number in that file.
@echo off
setlocal EnableDelayedExpansion
>temp.txt type nul
set "unique_num="
for /f "tokens=2 delims=-" %%a in ('dir /b *.dat') do (
set "fid=%%a"
set "num=!fid:~3!"
>>temp.txt echo !num!
echo " !unique_num! " | find " !num! " >nul
if !errorlevel! neq 0 set "unique_num=!unique_num! !num!"
)
for %%n in (%unique_num%) do (
for /f "delims=: tokens=2" %%c in ('find /c "%%n" temp.txt') do (
set "count=%%c"
echo %%n^|!count: =!
)
)
del /f /q temp.txt
Pipe the result into sort if you need the output sorted.
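To verify the expected output for the sample file names, the counting step can be sketched as follows. This is a minimal Python sketch rather than a batch solution; the function name count_fids and the regular expression are assumptions based on the "xxx-fid####-xxx.dat" naming pattern described above.

```python
import re
from collections import Counter


def count_fids(filenames):
    """Count files per fid number and return lines in the form 'fid|count'."""
    counts = Counter()
    for name in filenames:
        # Assumed pattern: the digits between "-fid" and the next "-".
        match = re.search(r"-fid(\d+)-", name)
        if match:
            counts[match.group(1)] += 1
    # The fid values are sorted as text, which is enough for equal-length numbers.
    return [f"{fid}|{n}" for fid, n in sorted(counts.items())]


# File names copied from the question.
files = [
    "Sales-fid1000-f100.dat",
    "Revenue-fid1000-f100.dat",
    "Sales-fid2000-f200.dat",
    "Revenue-fid2000-f200.dat",
    "Income-fid2000-f200.dat",
]

for line in count_fids(files):
    print(line)
```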

Deleting last n lines from file using batch file

How do I delete the last n lines from a file using a batch script?
I have no experience with batch files; I am writing one for the first time.
How should I write this batch file?
This is for Windows 7.
Try it with this sample:
<Project_Name>
<Noter>
<Common>
<File>D:\Project_Name\Util.jar</File>
<File>D:\Project_Name\Noter.bat</File>
<File>D:Project_Name\Noter.xml</File>
<File>D:Project_Name\Noter.jar</File>
</Common>
<Project_Name>
<File>D:\Util.bat</File>
<File>D:\Util.xml</File>
<File>D:\log.bat</File>
</Project_Name>
</Noter>
<CCNET>
This is the complete script for removing the last N lines:
Count the total number of lines.
Set LINES = LINES - N, which leaves just the number of lines to process.
@echo OFF
setlocal EnableDelayedExpansion
set LINES=0
for /f "delims==" %%I in (infile.txt) do (
set /a LINES=LINES+1
)
echo Total Lines : %LINES%
echo.
:: n = 5, the last 5 lines will be ignored
set /a LINES=LINES-5
call:PrintFirstNLine > output.txt
goto EOF
:PrintFirstNLine
set cur=0
for /f "delims==" %%I in (infile.txt) do (
echo %%I
rem echo !cur! : %%I
set /a cur=cur+1
if "!cur!"=="%LINES%" goto EOF
)
:EOF
exit /b
Here, call:PrintFirstNLine > output.txt writes the output to an external file named output.txt.
Output for sample Input
<Project_Name>
<CBA_Notifier>
<Common>
<File>D:\CBA\CBA_Notifier\Project_Name\IPS-Util.jar</File>
<File>D:\CBA\CBA_Notifier\Project_Name\Notifier.bat</File>
<File>D:\CBA\CBA_Notifier\Project_Name\Notifier.xml</File>
<File>D:\CBA\CBA_Notifier\Project_Name\Notifier.jar</File>
</Common>
<Project_Name>
<File>D:\CBA\CBA_Notifier\IPS-Util.bat</File>
(the last 5 lines are removed)
Update
:PrintFirstNLine
set cur=0
for /F "tokens=1* delims=]" %%I in ('type "infile.txt" ^| find /V /N ""') do (
if "%%J"=="" (echo.) else (
echo.%%J
set /a cur=cur+1
)
if "!cur!"=="%LINES%" goto EOF
)
This script takes one argument, the file to be truncated; it creates a temporary file and then replaces the original file with the shorter one.
@echo off
setlocal enabledelayedexpansion
set count=
for /f %%x in ('type %1 ^| find /c /v ""') do set /a lines=%%x-5
copy /y nul %tmp%\tmp.zzz > nul
for /f "tokens=*" %%x in ('type %1 ^| find /v ""') do (
set /a count=count+1
if !count! leq %lines% echo %%x>>%tmp%\tmp.zzz
)
move /y %tmp%\tmp.zzz %1 > nul
If the original file has 5 or fewer lines, the main output routine will not create a file. To handle this, I use copy /y nul to create a zero-byte file.
If you would rather not have an empty file, just remove the copy /y nul line and replace it with the following line:
if %lines% leq 0 del %1
You should use one method or the other; otherwise source files with 5 or fewer lines will remain untouched (neither replaced nor deleted).
To delete the last lines from your file:
1. Copy the leading lines that are needed from the source file, e.g. e:\original.txt
2. Write them to a new file, e.g. e:\new\newfile1.txt
The code is thanks to the person who gave it to me:
remember all may be done if you have motive and even blood hb =6. but help of nature is required always as you are a part of it
@echo off & setLocal enableDELAYedeXpansion
set N=
for /f "tokens=* delims= " %%a in (e:\4.txt) do (
set /a N+=1
if !N! gtr 264 goto :did
>>e:\new4.txt echo.%%a
)
:did
If you have 800 files, use Excel to generate the code for all 800, copy it to Notepad, and use Ctrl+H to remove the spaces. Then save the file as haha.bat and run it in the folder containing the files, numbered 1.txt, 2.txt, 3.txt, etc.
A slow method with less coding (as typed at the command prompt; in a batch file, double the percent signs of the loop variables):
set input=file.txt
set remove=7
for /f "delims=" %i in ('find /c /v "" ^< "%cd%\%input%"') do set lines=%i
set /a lines-=remove
for /l %i in (1,1,%lines%) do findstr /n . "%input%" | findstr /b %i:
May be redirected to a file.
Each line is prefixed with the line number; may be removed with extra coding.
A faster version with /g flag in my answer at:
How to split large text file in windows?
Tested in Win 10 CMD, on 577KB file, 7669 lines.
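For comparison with the batch approaches above, the "count the lines, then keep the first total minus n" logic can be sketched in Python. This is a minimal sketch with an assumed function name, drop_last_lines, and it operates on a list of lines already read from the file. Unlike a FOR /F loop, which skips empty lines, it keeps blank lines in place.

```python
def drop_last_lines(lines, n):
    """Return a copy of lines without the last n entries."""
    if n <= 0:
        # Nothing to remove; also avoids lines[:-0], which would return an empty list.
        return list(lines)
    # If n is at least len(lines), the slice is simply empty.
    return list(lines[:-n])


# Hypothetical sample; the empty string stands for a blank line in the file.
sample = ["line 1", "", "line 3", "line 4", "line 5", "line 6"]
print(drop_last_lines(sample, 2))
```

With a real file, the list could come from open(path).read().splitlines(), and the result would be written back with "\n".join(...).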

Add leading zeroes to a filename

I am trying to modify a batch file created by somebody else, to add leading zeros depending on the number found on line 4 of the file. The actual filename is a concatenation of the name found on line 3, and the numbers on line 4. So if the first few lines are as follows:
3.1.19
-1
TEST
560
The file name would be v_TEST00560.TXT. As you can see, the total number of digits in the file name should be 5. If the number which appears on line 4 is 8 (see below), then:
3.1.19
-1
TEST
8
The file name will be v_TEST00008.txt.
The file I have is as follows:
@Echo Off
Setlocal EnableDelayedExpansion
REM File: rename5.bat
REM The script will look for and parse one (or more) input files
REM Input files can contain records for one or more vessels.
REM This script assumes that each record starts with the "3.1.19" string.
REM %%%%%%%%%%%%%%%%%%%%%%% Configuration Section %%%%%%%%%%%%%%%%%%%%%%%
SET INPUT_DIR=C:\Files\RenameFileName\Input
SET OUTPUT_DIR=C:\Files\RenameFileName\Output
SET ARCHIVE_DIR=C:\Files\RenameFileName\Archive
SET TEMP_DIR=C:\Files\RenameFileName\tmp
SET INPUT_FILENAME=INTERFACE.TXT
SET REC=3.1.19
REM %%%%%%%%%%%%%%%%%%%%%%%%%%%% Checking Section %%%%%%%%%%%%%%%%%%%%%%%%%%%%
FOR /F "usebackq tokens=* eol= delims= " %%d IN (`date /t`) do SET RUNDATE=%%d
echo [%RUNDATE% %TIME%] Script starting...
IF NOT EXIST %INPUT_DIR% (
SET MESSAGE=Input directory not found.
goto END
)
IF NOT EXIST %OUTPUT_DIR% (
SET MESSAGE=Output directory not found.
goto END
)
IF NOT EXIST %ARCHIVE_DIR% (
SET MESSAGE=Archive directory not found.
goto END
)
IF NOT EXIST %TEMP_DIR% (
echo Temporary directory does not exist.
echo Creating %TEMP_DIR%
mkdir %TEMP_DIR%
)
REM %%%%%%%%%%%%%%%%%%%%%%%%% Main Processing %%%%%%%%%%%%%%%%%%%%%%%%%
dir %INPUT_DIR%\%INPUT_FILENAME% 1>NUL 2>NUL
IF %ERRORLEVEL% EQU 1 (
SET MESSAGE=Input files not present.
goto END
)
FOR /F "usebackq tokens=* eol= delims= " %%d IN (`date /t`) do SET RUNDATE=%%d
echo [%RUNDATE% %TIME%] Input files found. Start Processing...
FOR /F "usebackq" %%I IN (`dir /b %INPUT_DIR%\%INPUT_FILENAME%`) DO (
SET INPUT_FILE=!INPUT_DIR!\%%I
echo READING Input file: !INPUT_FILE!
SET N=
FOR /F "tokens=* eol= delims= " %%A IN (!INPUT_FILE!) Do (
set LINE=%%A
set LINE2=!LINE:~0,6!
if !LINE2! EQU !REC! (
SET /A N+=1
echo Creating temp file !TEMP_DIR!\!N!.tmp
)
echo !LINE! >> !TEMP_DIR!\!N!.tmp
)
FOR /F "usebackq" %%Y in (`dir /b !TEMP_DIR!\*.tmp`) DO (
SET TEMPFILE=!TEMP_DIR!\%%Y
SET N=
FOR /F %%A IN (!TEMPFILE!) DO (
SET /A N+=1
IF !N! EQU 3 SET S=%%A
IF !N! EQU 4 SET T=%%A
)
SET S=!S:~0,10!
SET T=!T:~0,10!
echo CREATING Output File: %OUTPUT_DIR%\V_!S!00!T!.TXT
MOVE !TEMPFILE! %OUTPUT_DIR%\V_!S!00!T!.TXT
)
)
REM %%%%%%%%%%%%%%%%%%%%%%%%% Archiving Section %%%%%%%%%%%%%%%%%%%%%%%%%
FOR /F "usebackq" %%t IN (`cscript "%~dp0timestamp.vbs" //Nologo`) do SET TIMESTAMP=%%t
FOR /F "usebackq" %%I IN (`dir /b %INPUT_DIR%\%INPUT_FILENAME%`) DO (
echo ARCHIVING Input file %%I to %ARCHIVE_DIR%
rem COPY !INPUT_DIR!\%%I !ARCHIVE_DIR!\%%I.!TIMESTAMP!
MOVE !INPUT_DIR!\%%I !ARCHIVE_DIR!\%%I.!TIMESTAMP!
)
REM %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
FOR /F "usebackq tokens=* eol= delims= " %%d IN (`date /t`) do SET RUNDATE=%%d
SET MESSAGE=[%RUNDATE% %TIME%] Processing Done.
:END
echo %MESSAGE%
FOR /F "usebackq tokens=* eol= delims= " %%d IN (`date /t`) do SET RUNDATE=%%d
echo [%RUNDATE% %TIME%] Script finished.
As you can see, it's quite sophisticated, and I have no idea how to make these changes myself. The BAT runs perfectly, but the number of zeroes is fixed rather than generated depending on the number of digits already present. Any help is appreciated.
I'm not about to read all of your code, but I use this for padding zeros.
The first line is whatever number you read from your file.
The second line pads more than enough zeros to the start of the variable.
The third line cuts off all but the last five characters from the variable.
Set Number=123
Set Number=00000%Number%
Set Number=%Number:~-5%
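The same "prepend plenty of zeros, then keep the last five characters" trick can be checked against the two examples from the question with a minimal Python sketch. The function names pad_number and output_name are hypothetical, and the V_ prefix and .TXT extension are taken from the MOVE line of the script in the question.

```python
def pad_number(number, width=5):
    """Left-pad with zeros, then keep only the last `width` characters."""
    zeros = "0" * width
    # Mirrors Set Number=00000%Number% followed by %Number:~-5%.
    # Note: a number longer than `width` digits loses its leading digits.
    return (zeros + str(number))[-width:]


def output_name(name, number):
    """Build the output file name from line 3 (name) and line 4 (number)."""
    return f"V_{name}{pad_number(number)}.TXT"


print(output_name("TEST", 560))
print(output_name("TEST", 8))
```

In the batch script, the equivalent change would be to apply the two Set lines to T before building the file name and to drop the hard-coded 00 from the name.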
