Batch command to find and replace a string with line break - batch-file

I have data that is mostly organized in a way that I can convert and import into a spreadsheet. But certain lines contain carriage returns and extra text that my current batch file can't handle.
Good Data:
Pipers Cove × 2 $25.00
Pipers Cove Petite × 2 $25.00
Pipers Cove Plus × 2 $25.00
Nordic Club × 2 $25.00
Whiteout × 1 $12.50
Bad Data:
Pipers Cove Kids × 2
Size:
Large - ages 10 to 12
$20.00
Pipers Cove Kids × 2
Size:
Medium - ages 6 to 8
$20.00
Pipers Cove Kids × 2
Size:
Small - ages 2 to 4
$20.00
I need to remove the 2 lines starting with Size, Small, Medium, or Large and have the dollar amount follow the quantity number so my batch file can convert it to a CSV file and so on.

@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q40953616.txt"
SET "outfile=%destdir%\outfile.txt"
SET "part1="
(
FOR /f "usebackqdelims=" %%i IN ("%filename1%") DO (
ECHO %%i|FIND "$" >NUL
IF ERRORLEVEL 1 (
REM $ not found - set part1 on first such line
IF NOT DEFINED part1 SET "part1=%%i"
) ELSE (
REM $ found - see whether at start or not
FOR /f "tokens=1*delims=$" %%a IN ("%%i") DO (
IF "%%b"=="" (
REM at start - combine and output and reset part1
CALL ECHO %%part1%% %%i
SET "part1="
) ELSE (
ECHO %%i
)
)
)
)
)>"%outfile%"
GOTO :EOF
You would need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q40953616.txt containing your data for my testing.
The result is written to the file defined as %outfile%.
The script scans each line of the file. If the line does not contain $, it saves the first such line in part1.
Otherwise, it tokenises the line on $. If there is only one token, the $ is at the start of the line, so the line is output combined with part1, and part1 is reset.
Otherwise, the line is simply echoed unchanged.
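For readers who want to check the expected output independently of cmd, the scan described above can be sketched in Python. This is only an approximation of the batch logic, not a replacement for it; the function name merge_price_lines is made up for illustration, and the sketch assumes a line "starts with $" when the batch script would find a single token.

```python
def merge_price_lines(lines):
    """Join each line that starts with '$' onto the first '$'-free line before it."""
    merged = []
    part1 = None  # first line without '$' seen since the last combined output
    for line in lines:
        if "$" not in line:
            # Remember only the first such line; later ones (Size:, Large ...) are dropped
            if part1 is None:
                part1 = line
        elif line.startswith("$"):
            # Price on its own line: combine with the remembered line and reset
            merged.append(f"{part1} {line}" if part1 is not None else line)
            part1 = None
        else:
            # Already a complete "good" line: pass it through unchanged
            merged.append(line)
    return merged
```

Fed the "bad" sample, each four-line record should collapse to a single line such as "Pipers Cove Kids × 2 $20.00", while the "good" lines pass through untouched.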

Although you did not show any effort of your own, I decided to provide a solution, as the task at hand does not appear that trivial to me.
The following script -- let us call it clean-up-text-file.bat -- ignores only lines that begin with the words you specified. Any other line is appended to the previous one until a $ sign is encountered, in which case the collected line is written out and a new line is started. With this method, no lines can get lost unintentionally.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set _WORDS="Size","Small","Medium","Large"
for %%F in (%*) do (
set "COLL=" & set "FILE=%%~F"
for /F delims^=^ eol^= %%L in ('type "%%~F" ^& ^> "%%~F" rem/') do (
set "LINE=%%L"
(echo("%%L" | > nul find "$") && (
setlocal EnableDelayedExpansion
>> "!FILE!" echo(!COLL!!LINE!
endlocal
set "COLL="
) || (
set "FLAG="
for %%K in (%_WORDS%) do (
(echo("%%L" | > nul findstr /I /R /B /C:^^^"\"%%~K\>") && (
set "FLAG=#"
)
)
if not defined FLAG (
setlocal EnableDelayedExpansion
rem // The following line contains a TAB character!
for /F "delims=" %%E in (^""!COLL!!LINE! "^") do (
endlocal
set "COLL=%%~E"
)
)
)
)
)
endlocal
exit /B
To use the script, provide your text file(s) as (a) command line argument(s):
clean-up-text-file.bat "good.txt" "bad.txt"
Every specified file is modified directly, so take care when testing!
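The collect-until-$ approach of this script can also be sketched in Python, which may help when checking what the output should look like. This is a rough equivalent under stated assumptions, not the script itself: the names SKIP_WORDS and collect_until_price are hypothetical, and the word match is assumed to behave like the script's case-insensitive, start-of-line, whole-word findstr search.

```python
import re

# Words that mark a line to ignore when they appear at the start of the line
SKIP_WORDS = ("Size", "Small", "Medium", "Large")

def collect_until_price(lines, skip_words=SKIP_WORDS):
    """Append lines (TAB-separated) until one containing '$' completes the record."""
    skip_re = re.compile(r"(?:%s)\b" % "|".join(map(re.escape, skip_words)), re.IGNORECASE)
    out = []
    pending = ""  # text collected so far for the current record
    for line in lines:
        if "$" in line:
            out.append(pending + line)
            pending = ""
        elif not skip_re.match(line):
            # Unlike the first answer, every non-ignored line is kept
            pending += line + "\t"
    return out
```

Note that the collected parts are joined with a TAB here, as in the script, so a later CSV conversion would need to treat TAB as a separator.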

Related

How to find and replace a number in a file and increment it?

I have 5000 same files and I need to update a numeric value in their content, incrementing it for each file. Below is the batch script I use to find and replace a number in a certain file called BULK_1.txt.
I am not sure on how to increment the value after running search and replace.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set search=01118596270001
set replace=01118596270002
set "textFile=BULK_1.txt"
set "rootDir=C:\Batch"
for %%j in ("%rootDir%\%textFile%") do (
for /f "delims=" %%i in ('type "%%~j" ^& break ^> "%%~j"') do (
set "line=%%i"
setlocal EnableDelayedExpansion
set "line=!line:%search%=%replace%!"
>>"%%~j" echo(!line!
endlocal
)
)
endlocal
The result should be like below. The last 4 digits should be updated from 0001 to 5000 for each file
Content of the BULK_1.txt:
DMAIN Test_data 01118596270001
DDOC_DATA Test_docdata 01118596270001
Content of the BULK_2:
DMAIN Test_data 01118596270002
DDOC_DATA Test_docdata 01118596270002
Content of the BULK_3:
DMAIN Test_data 01118596270003
DDOC_DATA Test_docdata 01118596270003
According to your requirements you may need:
@echo off
setlocal EnableDelayedExpansion
for /L %%A IN (1 1 5000) do (
set num_processed=%%A
call:find_len num_processed
if !len! EQU 1 (set num_%%A=0111859627000%%A)
if !len! EQU 2 (set num_%%A=011185962700%%A)
if !len! EQU 3 (set num_%%A=01118596270%%A)
if !len! EQU 4 (set num_%%A=0111859627%%A)
)
for /L %%A IN (1 1 5000) do (
for /L %%B IN (1 1 2) do (
if %%B EQU 1 (
echo DMAIN Test_data !num_%%A!>BULK_%%A.txt
) else (
echo DDOC_DATA Test_docdata !num_%%A!>>BULK_%%A.txt
)
)
)
rem End the main script here so execution does not fall through into the subroutine
exit /b
:find_len
set "s=!%~1!#"
set "len=0"
for %%P in (4 2 1) do (
if "!s:~%%P,1!" NEQ "" (
set /a "len+=%%P"
set "s=!s:~%%P!"
)
)
My code uses the string-length method that @jeb has suggested, slightly edited.
First, we make a loop to count from 1 to 5000.
We want to count the length of each of these numbers. Calling the find_len subroutine does this.
If the string length of the variable is 1, the number is padded with three zeros (for example 0001). The number in front is the same in all cases.
If the string length of the variable is 2, the number is padded with two zeros (for example 0010). The number in front is the same in all cases.
If the string length of the variable is 3, the number is padded with one zero (for example 0100). The number in front is the same in all cases.
If the string length of the variable is 4, no padding is needed (for example 1000). The number in front is the same in all cases.
Note: If we tried something like set /a 0000+1, the result would be 1, without the leading zeros. That is why all of these complications are needed!
In all cases, the variable names will be num_numberCurrentlyProcessed.
Now another loop, with the same range as before. It will loop 5000 times and will create 5000 files named BULK_num.txt.
Inside this loop, another loop from 1 to 2 is required, as each file must have 2 lines.
If we are in line 1, we echo the first line of text specified by OP, overwriting the file.
If we are in line 2, we append the second line of text specified by OP.
"I have 5000 same files" - it's a lot faster to write them from scratch than to edit each of them:
@echo off
setlocal enabledelayedexpansion
set "Hdr=DMAIN Test_data 0111859627"
set "Dta=DDOC_DATA Test_docdata 0111859627"
for /l %%a in (1,1,5000) do (
set "num=0000%%a" REM prepend some zeros
set "num=!num:~-4!" REM take last four chars
>Bulk_%%a.txt (
echo %Hdr%!num!
echo %Dta%!num!
)
)
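The zero-padding trick above (prepend zeros, keep the last four characters) is easy to cross-check outside of cmd. Here is a small Python sketch of the same idea; PREFIX and bulk_file_lines are illustrative names only, and the sketch just builds the two lines for file number n rather than writing any files.

```python
PREFIX = "0111859627"  # fixed leading digits, taken from the sample data

def bulk_file_lines(n, width=4):
    """Return the two lines expected in BULK_<n>.txt, with n zero-padded to `width` digits."""
    serial = f"{PREFIX}{n:0{width}d}"
    return [
        f"DMAIN Test_data {serial}",
        f"DDOC_DATA Test_docdata {serial}",
    ]
```

For example, bulk_file_lines(2) gives the two lines shown for BULK_2 in the question, both ending in 01118596270002.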

Batch file to Copy the last n lines of a text file into a new text file

Is there a way, using a batch file, to copy the last n lines of a log file into a new text file?
Log file:
line 1
line 2
line 3
line 4
line 5
n = 2
Newfile:
line 4
line 5
You could try the following code:
@echo off
set /A "_LAST=2" & rem // (define the number of last lines to keep)
for /F %%C in ('^< "test.log" find /C /V ""') do set "COUNT=%%C"
set /A "LINES=COUNT-_LAST"
if %LINES% gtr 0 (set "SKIP=+%LINES%") else (set "SKIP=")
> "test.log.new" more %SKIP% "test.log"
This script can handle log files that contain empty lines, and lines with a length of up to 65534 characters. However, the output file must not contain more than 65535 lines. Note that TABs are expanded to SPACEs.
Or try this:
@echo off
set /A "_LAST=2" & rem // (define the number of last lines to keep)
for /F %%C in ('^< "test.log" find /C /V ""') do set "COUNT=%%C"
set /A "LINES=COUNT-_LAST"
if %LINES% gtr 0 (set "SKIP=skip^=%LINES%") else (set "SKIP=")
> "test.log.new" (
for /F usebackq^ %SKIP%^ delims^=^ eol^= %%L in ("test.log") do (
echo(%%L
)
)
This one has no limit on the number of lines, though the file size must be less than 2 GiB. However, it cannot handle files containing empty lines (they get lost) or lines longer than 8190 characters.
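Both scripts count the lines first and then skip COUNT minus n of them. If a scripting language is available, the same result can be had in one pass; the following Python sketch is only a cross-check for the expected output (last_lines is a made-up name), and it keeps empty lines.

```python
from collections import deque

def last_lines(lines, n):
    """Return the last n items of `lines`; a bounded deque discards older ones as it fills."""
    return list(deque(lines, maxlen=n))
```

With the five-line log from the question and n = 2, this should return "line 4" and "line 5".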

Batch command to find highest number in set of strings?

For instance:
file.txt contains:
4.3 - random1
5.6 - random2
2.2 - random3
3 - random4
1.8 - random5
I need a command that will output the highest number only, not the preceding text.
Ie.
Output = 5.6
You can give SORTN.bat a try.
Here is the code for it as well.
@ECHO OFF
if "%~1"=="/?" (
echo.Sorts text by handling first number in line as number not text
echo.
echo.%~n0 [n]
echo.
echo. n Specifies the character number, n, to
echo. begin each comparison. 3 indicates that
echo. each comparison should begin at the 3rd
echo. character in each line. Lines with fewer
echo. than n characters collate before other lines.
echo. By default comparisons start at the first
echo. character in each line.
echo.
echo.Description:
echo. 'abc10def3' is bigger than 'abc9def4' because
echo. first number in first string is 10
echo. first number in second string is 9
echo. whereas normal text compare returns
echo. 'abc10def3' smaller than 'abc9def4'
echo.
echo.Example:
echo. To sort a directory pipe the output of the dir
echo. command into %~n0 like this:
echo. dir /b^|%~n0
echo.
echo.Source: http://www.dostips.com
goto:EOF
)
if "%~1" NEQ "~" (
for /f "tokens=1,* delims=," %%a in ('"%~f0 ~ %*|sort"') do echo.%%b
goto:EOF
)
SETLOCAL ENABLEDELAYEDEXPANSION
set /a n=%~2+0
for /f "tokens=1,* delims=]" %%A in ('"find /n /v """') do (
set f=,%%B
(
set f0=!f:~0,%n%!
set f0=!f0:~1!
rem call call set f=,%%%%f:*%%f0%%=%%%%
set f=,!f:~%n%!
)
for /f "delims=1234567890" %%b in ("!f!") do (
set f1=%%b
set f1=!f1:~1!
call set f=0%%f:*%%b=%%
)
for /f "delims=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ~`@#$*_-+=:;',.?/\ " %%b in ("!f!") do (
set f2=00000000000000000000%%b
set f2=!f2:~-20!
call set f=%%f:*%%b=%%
)
echo.!f1!!f2!!f!,%%B
rem echo.-!f0!*!f1!*!f2!*!f!*%%a>&2
)
I gave it a try using this input as an example
4.3 - random1
11.3 - random6
5.6 - random2
2.2 - random3
100.1 - random8
3 - random4
1.8 - random5
11.12 - random7
11.11 - random7
This is how I ran it but you should be able to capture the output as well using a FOR /F command just like Stephan showed you in his answer.
type sortme.txt |sortn.bat
Output
1.8 - random5
2.2 - random3
3 - random4
4.3 - random1
5.6 - random2
11.11 - random7
11.12 - random7
11.3 - random6
100.1 - random8
sort will sort in the correct order (Attention: this is sorting strings, not numbers, but will work fine for your example. Note that with string comparison 5 or 6.3 are "bigger" than 15).
Put a for /f around it to process the output (the default token is 1 and Space is a default delimiter, so the for /f gets only the first element - your desired number):
for /f %%a in ('sort t.txt') do set high=%%a
echo %high%
EDIT to also process numbers higher than 10. Note: there is no math involved - it's just clever string manipulation.
@echo off
setlocal enabledelayedexpansion
(
for /f "tokens=1,2 delims=. " %%a in (t.txt) do (
set a=0000%%a
if "%%b"=="-" (echo !a:~-4!) else (echo !a:~-4!.%%b)
)
)>temp.txt
type temp.txt
pause
for /f "tokens=1,2 delims=0" %%a in ('sort temp.txt') do set high=%%a
echo %high%
The following script properly sorts fractional numbers with at most eight digits before and after the decimal separator (.):
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem // Define constants here:
set "_FILE=%~1" & rem // (input file; `%~1` means first command line argument)
set /A "_ZPAD=8" & rem // (number of zeroes to be used for temporary padding)
set "_ZERO=" & for /L %%K in (0,1,%_ZPAD%) do set "_ZERO=!_ZERO!0"
for /F "usebackq delims=- " %%L in ("%_FILE%") do (
for /F "tokens=1,2 delims=." %%I in ("%%L.") do (
set "INT=%_ZERO%%%I" & set "FRA=%%J%_ZERO%"
set "$NUM_!INT:~-%_ZPAD%!_!FRA:~,%_ZPAD%!=%%L"
)
)
set "MIN=" & set "MAX="
for /F "tokens=2 delims==" %%L in ('2^> nul set $NUM') do (
if not defined MIN set "MIN=%%L"
set "MAX=%%L"
)
echo Minimum: %MIN%
echo Maximum: %MAX%
endlocal
exit /B
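All the batch answers above work around the fact that cmd compares strings rather than fractional numbers. As a sanity check of what the correct result should be, here is a short Python sketch that compares the leading numbers numerically; highest_number is a hypothetical helper, and it assumes every non-empty line starts with a valid decimal number followed by whitespace.

```python
from decimal import Decimal

def highest_number(lines):
    """Return the leading number (as written) of the line whose number is largest."""
    numbers = [line.split()[0] for line in lines if line.strip()]
    # Decimal gives an exact numeric comparison, so 11.3 ranks above 11.12
    return max(numbers, key=Decimal)
```

For the file.txt from the question this should yield 5.6, and for the extended test input it should yield 100.1.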

Listing matrix from batch file

I need to make a batch file that takes two arguments.
The first is a txt file containing matrix A of integer values delimited by ','.
The second argument is also a txt file. It contains array B of integers delimited by ' '.
The batch file should create a file named result.txt which contains matrix C, where
C[i][j]=A[i][j]+k[i][j], where k[i][j] is the number of occurrences of A[i][j] in array B. I would be very thankful if anyone could help me. I tried to solve this, but the for command in DOS is killing me...
For example:
matrix.txt
1,2,3
4,5,6
array.txt
1 3 2 5 1
results.txt
3 3 4
4 6 6
This problem is interesting! This is how I would solve it:
rem Count the number of times each number appear in array B
for each line in %2 do (
for each number %%n in line do (
add 1 to times[%%n]
)
)
rem Process the matrix A
for each line in %1 do (
rem Initialize output line
set "out="
for each number %%n in line do (
set termC = %%n + times[%%n]
join termC to out
)
echo out
)
EDIT: As user aschipfl indicated, this answer is just pseudo-code for solving this problem, which I posted as a hint for you (because posting complete solutions when the OP has not shown his/her own efforts to solve the problem is not good practice here).
However, now that another working solution has been posted, I completed the previous pseudo-code into a fully working program:
@echo off
setlocal EnableDelayedExpansion
rem Check the arguments
if "%~2" equ "" echo Usage: %~NX0 MatrixA.txt ArrayB.txt & goto :EOF
if not exist "%~1" echo MatrixA file not found & goto :EOF
if not exist "%~2" echo ArrayB file not found & goto :EOF
rem Count the number of times each number appears in array B
for /F "usebackq delims=" %%b in ("%~2") do (
for %%n in (%%b) do (
set /A "times[%%n]+=1"
)
)
rem Process the matrix A and create matrix C
(for /F "usebackq delims=" %%a in ("%~1") do (
set "out="
for %%n in (%%a) do (
set /A termC=%%n + times[%%n]
set "out=!out! !termC!"
)
echo !out:~1!
)) > result.txt
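The count-then-add logic of the program above can be verified against the example from the question with a few lines of Python. This is only a sketch for checking results; add_occurrences is an illustrative name, and the inputs are passed as text instead of being read from the two files.

```python
from collections import Counter

def add_occurrences(matrix_text, array_text):
    """Return rows of C, where C[i][j] = A[i][j] + occurrences of A[i][j] in B."""
    # Count how many times each integer appears in array B (space-delimited)
    times = Counter(int(tok) for tok in array_text.split())
    rows = []
    for line in matrix_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines in the matrix file
        values = [int(cell) for cell in line.split(",")]
        rows.append(" ".join(str(v + times[v]) for v in values))
    return rows
```

With the sample matrix and array, the rows come out as "3 3 4" and "4 6 6", matching the expected results.txt.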
@echo off
setlocal enabledelayedexpansion
for /f "usebackq delims=" %%a in ("path+array.txt") do (
set "array_all=%%a"
)
set array_n=0
call :sub_2 %array_all%
set array_n_all=%array_n%
for /f "usebackq delims=" %%a in ("path+matrix.txt") do (
set "current_line=%%a"
set "current_line=!current_line:,= !"
set "matrix_new="
call :sub_1 !current_line!
echo !matrix_new:~1!>>"path+results.txt"
)
exit /b
:sub_1
set "current_value=%1"
if not defined current_value exit /b
set "array_rest=!array_all:%current_value%=!"
set array_n=0
call :sub_2 %array_rest%
set /a current_value_new=%current_value%+%array_n_all%-%array_n%
set "matrix_new=%matrix_new% %current_value_new%"
shift
goto sub_1
:sub_2
set "array_tmp=%1"
if not defined array_tmp exit /b
set /a array_n+=1
shift
goto sub_2

How to read filenames in segments to retrieve a set of files in batch script

I have a set of files in the below format in two different directories.
LS-2bit_a0_c0_apple_p1.log
LS-2bit_a0_c0_apple_p1.txt
LS-2bit_a0_c0_apple_p2.log
LS-2bit_a0_c0_apple_p2.txt
LS-2bit_a0_c0_mango_p1.log
LS-2bit_a0_c0_mango_p1.txt
LS-2bit_a0_c0_mango_p2.log
LS-2bit_a0_c0_mango_p2.txt
LS-2bit_a0_c0_grape_p1.log
LS-2bit_a0_c0_grape_p1.txt
LS-2bit_a0_c0_grape_p2.log
LS-2bit_a0_c0_grape_p2.txt
LS-2bit_a0_c1_apple_p1.log
LS-2bit_a0_c1_apple_p1.txt
LS-2bit_a0_c1_apple_p2.log
LS-2bit_a0_c1_apple_p2.txt
LS-2bit_a0_c1_mango_p1.log
LS-2bit_a0_c1_mango_p1.txt
LS-2bit_a0_c1_mango_p2.log
LS-2bit_a0_c1_mango_p2.txt
LS-2bit_a0_c1_grape_p1.log
LS-2bit_a0_c1_grape_p1.txt
LS-2bit_a0_c1_grape_p2.log
LS-2bit_a0_c1_grape_p2.txt
and the sequence continues for c0, c1, c2, etc. I want to read the files by segment and perform an operation on each set of files individually (e.g. retrieve all files with c0_apple alone). But my script outputs all apple files (e.g. c0_apple, c1_apple, c2_apple...).
I am very new to batch scripting and tried the below.
@echo off
setlocal enabledelayedexpansion
SET PATH=%PATH%;c:\windows\system32
for /f %%a in ('dir /on /ad /b') do (
cd %%a
for %%b in (LS-2bit*) do (
call :fun1
)
:fun1
for %%b in (*a0*) do (
call :fun2
)
:fun2
for %%b in (*C0*) do (
call :fun3
)
:fun3
for %%b in (*apple*) do (
call :fun4
)
for %%b in (*p1.log, *p1.txt, *p2.log *p2.txt) do (
echo %%b
)
cd..
)
Sorry for not being very clear. Here is exactly what I need to do. It is a huge set of data. I need to group each set (e.g. c0_apple_*, c1_apple, c2_apple, c0_mango, c1_mango, c2_mango and so on, nearly 15 groups of fruits). Each set contains 4 log files (p1, p2, p3, p4) and 4 text files (p1, p2, p3, p4). I need to retrieve the average value from the log file and the rate value from the text file and store them in a separate file (the log file value and the text file value can be in the same output file), so each set has one output file (c0_apple has one output file, c1_apple has one output file...). I retrieved the values with the code below in my old script.
for %%b in (*p1* *p2*) do(
type %%b | find " bit " >> new1.txt
)
for /f "tokens=3" %%d in (new1.txt) do (
echo %%d > temp.txt
for /f "delims=." %%r in (temp.txt) do (
echo | set /p = %%r %tab% >> hmlog.txt
del temp.txt
)
)
and
for %%b in (*p1* *p2*) do(
type %%b | find "Average" >> new.txt
)
for /f "tokens=2" %%b in (new.txt) do (
echo %%b >> hmtxt.txt
)
However, the question now is: since the data is huge, is there a way to loop through the folder to get each set of data and generate the output file during each iteration?
EDIT: The original request in this question was changed, so I need to change my solution accordingly. However, clear specifications about the new problem were not posted, so I need to guess the specifications that I will use to develop my new program. Here they are:
In a directory there is a set of files with this name format:
LS-2bit_a0_c#_fruit_p#.txt
LS-2bit_a0_c#_fruit_p#.log
where the LS-2bit_a0 part is fixed; the c# part varies over c0, c1 and c2; the fruit part varies over 15 different groups of fruits (like "apple", "mango", "grape", etc.); and the p# part varies from p1 to p4.
The *.txt files have several lines each, including one with a rate value in this format:
Rate bit ###.##
The *.log files have several lines each, including one with an average value in this format:
Average ###
The program must group data from the 4 *.txt files and the 4 *.log files that correspond to the same c#_fruit combination and store them in the same output file with c#_fruit.out name. Each output file must have these two lines:
RateP1 RateP2 RateP3 RateP4
AvgeP1 AvgeP2 AvgeP3 AvgeP4
... where RateP# is the integer part of the number after "bit" in *.txt files, and AvgeP# is the number after "Average" in *.log files; the values must be separated by a TAB character.
The program below fulfills these specifications:
@echo off
setlocal EnableDelayedExpansion
rem Group rate data from 4 *.txt files in its corresponding "c#_fruit.out" output file
set "i=0"
set "line="
for /F "tokens=3,4,8 delims=_. " %%a in ('findstr /C:" bit " *.txt') do (
rem Insert here v a TAB character
set "line=!line! %%c"
set /A i+=1
if !i! equ 4 (
> %%a_%%b.out echo !line:~1!
set "i=0"
set "line="
)
)
rem Group average data from 4 *.log files in its corresponding "c#_fruit.out" output file
for /F "tokens=3,4,6 delims=_ " %%a in ('findstr "Average" *.log') do (
rem Insert here v a TAB character
set "line=!line! %%c"
set /A i+=1
if !i! equ 4 (
>> %%a_%%b.out echo !line:~1!
set "i=0"
set "line="
)
)
Output example:
C:\> type *.out
c0_apple.out
123 456 789 987
111 222 333 444
c0_mango.out
123 456 789 987
111 222 333 444
c1_apple.out
123 456 789 987
111 222 333 444
c1_mango.out
123 456 789 987
111 222 333 444
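The key step in the program above is deriving the c#_fruit group from each file name. To make that grouping rule explicit, here is a hedged Python sketch; NAME_RE and group_by_set are hypothetical names, and the pattern assumes the guessed name format LS-2bit_a0_c#_fruit_p#.txt / .log stated in the specifications.

```python
import re
from collections import defaultdict

# Assumed name format: LS-2bit_a0_c<digits>_<fruit>_p<digits>.(log|txt)
NAME_RE = re.compile(r"^LS-2bit_a0_(c\d+)_([A-Za-z]+)_p(\d+)\.(log|txt)$")

def group_by_set(filenames):
    """Map each 'c#_fruit' key to the sorted list of file names belonging to it."""
    groups = defaultdict(list)
    for name in filenames:
        match = NAME_RE.match(name)
        if match:  # names that do not fit the assumed format are ignored
            groups[f"{match.group(1)}_{match.group(2)}"].append(name)
    return {key: sorted(names) for key, names in groups.items()}
```

Each key (c0_apple, c1_apple, ...) then corresponds to one .out file in the example output above, which keeps c0_apple separate from c1_apple.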
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
for /f "delims=" %%a in ('dir /on /ad /b "%sourcedir%"') do (
PUSHD "%sourcedir%\%%a"
DIR /b /a-d *LS-2bit*a0*c0*apple*p*.log *LS-2bit*a0*c0*apple*p*.txt
echo \\\\
for %%b in (*LS-2bit*a0*c0*apple*p1*.log *LS-2bit*a0*c0*apple*p1*.txt
*LS-2bit*a0*c0*apple*p2*.log *LS-2bit*a0*c0*apple*p2*.txt) do (
ECHO %%b
)
echo ++++
for %%b in (LS-2bit) do (
for %%c in (a0) do (
for %%d in (C0) do (
for %%e in (apple) do (
for %%n in (p1 p2) DO FOR %%x IN (log txt) do (
IF EXIST "%%b*%%c*%%d*%%e*%%n.%%x" ECHO(%%b*%%c*%%d*%%e*%%n.%%x
)
echo ----
for %%n in (p1 p2) DO FOR %%x IN (log txt) do (
DIR /b /a-d "%%b*%%c*%%d*%%e*%%n.%%x"
)
echo ====
PAUSE
)
)
)
)
POPD
)
ECHO done
pause
GOTO :EOF
It's not really clear what you want to do.
Block statements (a parenthesised series of statements) don't play nicely with labels as a label terminates a block.
Using a string containing * or ? in a for loop makes the string into a filename - to use the string literally, you need to first enclose it in quotes like "*c0*" and then when it's assigned to a metavariable %%X, extract it using %%~X.
There seems to be little point in using the fixed text LS-2bit, a0, c0, apple etc. You don't explain what your batch is intended to do, so we're off to guesswork territory. Perhaps you could run this as a subroutine where these elements would be supplied as parameters (in which case, replace each of the fixed strings with %~1, %~2 etc.).
Here are four different ways to produce a file list. I don't know which you'd prefer...
