How to arrange records correctly in a Windows batch file?

I have a file s_result.txt that looks like this:
AAA,BBB,CCC
DDD,EEE
FFF,GGG
HHH,III,JJJ
...
I am trying to produce sf_result.txt like this:
AAA,BBB
AAA,CCC
DDD,EEE
FFF,GGG
HHH,III
HHH,JJJ
...
I used the script below:
REM Transfer s_result.txt to sf_result.txt
DEL sf_result.txt
Echo. 2>sf_result.txt
for /F "tokens=1,2,3 delims=," %%a in (s_result.txt) do (
If %%c EQU [] (
ECHO %%a,%%b>>sf_result.txt
) else (
ECHO %%a,%%b>>sf_result.txt
ECHO %%a,%%c>>sf_result.txt
)
)
I got this result instead:
AAA,BBB
AAA,CCC
DDD,EEE
DDD,
FFF,GGG
FFF,
HHH,III
HHH,JJJ
...
How can I get the right result?
Thanks,

If you want to parse line-by-line, use for /F. If you want to tokenize word-by-word on a single line, use for without the /F. Also, in a basic for loop, Windows already tokenizes on unquoted commas with no need to specify a delimiter. (It also tokenizes on spaces, tabs, and semicolons.) With this in mind, the solution is actually pretty simple.
@echo off & setlocal
for /f "usebackq tokens=1* delims=," %%I in ("test.txt") do (
for %%x in (%%J) do (
echo(%%I,%%x
)
)
Output:
AAA,BBB
AAA,CCC
DDD,EEE
FFF,GGG
HHH,III
HHH,JJJ

There have already been great answers provided with some smart approaches.
However, I want to stick to the code you posted here.
The main problem is the line if %%c EQU [], because it compares the third token with the literal string []; the third token can be CCC, JJJ, or an empty string, according to your example, so the condition is never going to be fulfilled.
To correct that, you should write if "%%c"=="" instead.
You could further improve your script by doing a single redirection > to the output file rather than creating it in advance and appending multiple times; just put the entire for /F loop in between parentheses and redirect the whole block.
So here is the corrected and improved code:
rem Transfer s_result.txt to sf_result.txt
> "sf_result.txt" (
for /F "usebackq tokens=1-3 delims=," %%a in ("s_result.txt") do @(
if "%%c"=="" (
echo %%a,%%b
) else (
echo %%a,%%b
echo %%a,%%c
)
)
)
The @ symbol prevents command echoes of the loop body from being written to the output file as well. If @echo off is placed at the beginning of your script, you no longer need that symbol.
Of course this code cannot handle lines with more than three tokens correctly.

I do like rojo's clever approach (+1).
To overcome the limitations he mentions, I thought of a recursive approach.
:: Q:\Test\2018\06\28\SO_51073893.cmd
@echo off & setlocal
for /f "usebackq tokens=1-2* delims=," %%A in ("test.txt") do Call :Sub "%%A" "%%B" "%%C"
Goto :Eof
:Sub
Echo %~1,%~2
if "%~3" neq "" for /f "tokens=1* delims=," %%D in (%3) do Call :Sub %1 "%%D" "%%E"
With a slightly changed file test.txt I get this output:
> SO_51073893.cmd
AAA,B=B
AAA,C;C
DDD,EEE
FFF,GGG
HHH,I I
HHH,JJJ

Related

Batch output one occurrence instead of several [duplicate]

Is it possible to remove duplicate rows from a text file? If yes, how?

Sure can, but like most text file processing with batch, it is not pretty, and it is not particularly fast.
This solution ignores case when looking for duplicates, and it sorts the lines. The name of the file is passed in as the 1st and only argument to the batch script.
@echo off
setlocal disableDelayedExpansion
set "file=%~1"
set "sorted=%file%.sorted"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
sort "%file%" >"%sorted%"
>"%deduped%" (
set "prev="
for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%sorted%") do (
set "ln=%%A"
setlocal enableDelayedExpansion
if /i "!ln!" neq "!prev!" (
endlocal
(echo %%A)
set "prev=%%A"
) else endlocal
)
)
>nul move /y "%deduped%" "%file%"
del "%sorted%"
This solution is case sensitive and it leaves the lines in the original order (except for duplicates of course). Again the name of the file is passed in as the 1st and only argument.
@echo off
setlocal disableDelayedExpansion
set "file=%~1"
set "line=%file%.line"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
>"%deduped%" (
for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%file%") do (
set "ln=%%A"
setlocal enableDelayedExpansion
>"%line%" (echo !ln:\=\\!)
>nul findstr /xlg:"%line%" "%deduped%" || (echo !ln!)
endlocal
)
)
>nul move /y "%deduped%" "%file%"
2>nul del "%line%"
EDIT
Both solutions above strip blank lines. I didn't think blank lines were worth preserving when talking about distinct values.
I've modified both solutions to disable the FOR /F "EOL" option so that all non-blank lines are preserved, regardless of what the 1st character is. The modified code sets the EOL option to a linefeed character.
New solution 2016-04-13: JSORT.BAT
You can use my JSORT.BAT hybrid JScript/batch utility to efficiently sort and remove duplicate lines with a simple one liner (plus a MOVE to overwrite the original file with the final result). JSORT is pure script that runs natively on any Windows machine from XP onward.
@jsort file.txt /u >file.txt.new
@move /y file.txt.new file.txt >nul

You may use uniq (http://en.wikipedia.org/wiki/Uniq) from UnxUtils (http://sourceforge.net/projects/unxutils/).

Some time ago I found an unexpectedly simple solution, but unfortunately it only works on Windows 10: the sort command features some undocumented options that can be used:
/UNIQ[UE] to output only unique lines;
/C[ASE_SENSITIVE] to sort case-sensitively;
So use the following line of code to remove duplicate lines (remove /C to do that in a case-insensitive manner):
sort /C /UNIQUE "incoming.txt" /O "outgoing.txt"
This removes duplicate lines from the text in incoming.txt and writes the result to outgoing.txt. Note that the original order is of course not going to be preserved (because, well, sorting is the main purpose of sort).
However, you should use these options with care, as there might be some (un)known issues with them; there is possibly a good reason for them not to be documented (so far).

The Batch file below does what you want:
@echo off
setlocal EnableDelayedExpansion
set "prevLine="
for /F "delims=" %%a in (theFile.txt) do (
if "%%a" neq "!prevLine!" (
echo %%a
set "prevLine=%%a"
)
)
If you need a more efficient method, try this Batch-JScript hybrid script, written as a filter similar to the Unix uniq program. Save it with a .bat extension, like uniq.bat:
@if (@CodeSection == @Batch) @then
@CScript //nologo //E:JScript "%~F0" & goto :EOF
@end
var line, prevLine = "";
while ( ! WScript.Stdin.AtEndOfStream ) {
line = WScript.Stdin.ReadLine();
if ( line != prevLine ) {
WScript.Stdout.WriteLine(line);
prevLine = line;
}
}
Both programs were copied from this post.

set "file=%CD%\%1"
sort "%file%">"%file%.sorted"
del /q "%file%"
rem Enable delayed expansion once, outside the loop, to avoid nesting SETLOCAL on every line
SETLOCAL EnableDelayedExpansion
set "ln="
FOR /F "usebackq tokens=*" %%A IN ("%file%.sorted") DO (
rem Quotes keep the comparison valid when a line contains spaces
if not "%%A"=="!ln!" (
set "ln=%%A"
echo %%A>>"%file%"
)
)
ENDLOCAL
del /q "%file%.sorted"
This should work exactly the same. That dbenham example seemed way too hardcore for me, so I tested my own solution. Usage example: filedup.cmd filename.ext

Pure batch - 3 effective lines.
@ECHO OFF
SETLOCAL
:: remove variables starting $
FOR /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
FOR /f "delims=" %%a IN (q34223624.txt) DO SET $%%a=Y
(FOR /F "delims=$=" %%a In ('set $ 2^>Nul') DO ECHO %%a)>u:\resultfile.txt
GOTO :EOF
Works happily if the data does not contain characters to which batch has a sensitivity.
"q34223624.txt" because question 34223624 contained this data
1.1.1.1
1.1.1.1
1.1.1.1
1.2.1.2
1.2.1.2
1.2.1.2
1.3.1.3
1.3.1.3
1.3.1.3
on which it works perfectly.

I came across this issue and had to resolve it myself because the use case was particular to my needs.
I needed to find duplicate URLs, and the order of the lines was relevant, so it needed to be preserved. The lines of text should not contain any double quotes and should not be very long, and sorting cannot be used.
Thus I did this:
setlocal enabledelayedexpansion
type nul>unique.txt
for /F "tokens=*" %%i in (list.txt) do (
find "%%i" unique.txt 1>nul
if !errorlevel! NEQ 0 (
echo %%i>>unique.txt
)
)
Auxiliary: if the text does contain double quotes then the FIND needs to use a filtered set variable as described in this post: Escape double quotes in parameter
So instead of:
find "%%i" unique.txt 1>nul
it would be more like:
set test=%%i
set test=!test:"=""!
find "!test!" unique.txt 1>nul
Thus find will look like find """what""" file and %%i will be unchanged.

I have used a fake "array" to accomplish this:
@echo off
:: filter out all duplicate ip addresses
REM your file would take the place of %1
set file=%1%
if [%1]==[] goto :EOF
setlocal EnableDelayedExpansion
set size=0
set cond=false
set max=0
for /F %%a IN ('type %file%') do (
if [!size!]==[0] (
set cond=true
set /a size="size+1"
set arr[!size!]=%%a
) ELSE (
call :inner
if [!cond!]==[true] (
set /a size="size+1"
set arr[!size!]=%%a&& ECHO > NUL
)
)
)
break> %file%
:: destroys old output
for /L %%b in (1,1,!size!) do echo !arr[%%b]!>> %file%
endlocal
goto :eof
:inner
for /L %%b in (1,1,!size!) do (
if "%%a" neq "!arr[%%b]!" (set cond=true) ELSE (set cond=false&&goto :break)
)
:break
The use of a label for the inner loop is something specific to cmd.exe, and it is the only way I have been successful at nesting for loops within each other. Basically, this compares each new value read from the file against the values already stored, and if there is no match the program adds the value to memory. When it is done, it destroys the target file's contents and replaces them with the unique strings.

Extract lines from text batch file

I need to take a known number of lines from one text doc and put them in another. During or after this process I need to look at certain columns only. My idea:
setlocal enabledelayedexpansion
FOR /F "SKIP=21 TOKENS 1,3 DELIMS= " %%B IN (INPUT.TXT) DO ECHO %%B %%C > OUTPUT.TXT
When I try this I just get the last line of the file printed. I eventually want just lines 22-34, 1st and 3rd columns. Please keep it simple.

Change your > OUTPUT.TXT to >> OUTPUT.TXT
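A minimal sketch of that change, assuming the file names from the question. It is not a complete solution, since it still copies every line after the 21st rather than stopping at line 34. It also writes tokens=1,3 with the equals sign that for /F requires, and empties OUTPUT.TXT first so that repeated runs do not pile up:

```batch
@echo off
rem Start from an empty output file; ">" truncates, ">>" appends.
type nul > OUTPUT.TXT
rem Each iteration appends one line instead of replacing the file.
for /F "skip=21 tokens=1,3 delims= " %%B in (INPUT.TXT) do (
    >> OUTPUT.TXT echo %%B %%C
)
```

With a plain > inside the loop, the file is recreated on every iteration, which is why only the last line survived.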

setlocal enabledelayedexpansion
rem lines 22-34 are 13 lines
set /a counter=34-21
FOR /F "SKIP=21 TOKENS=1,3 DELIMS= " %%B IN (INPUT.TXT) DO (
ECHO %%B %%C
set /a counter=counter-1
if !counter! == 0 (
goto :break
)
)>> OUTPUT.TXT
:break
As posted, your script would print every line after the 21st. To stop at line 34 you need a break condition.
And as Kevin pointed out, you need the appending redirection.

It is more efficient to enclose the entire construct in parentheses and redirect only once.
You can intentionally divide by zero and detect when to quit without any need for delayed expansion.
@echo off
setlocal disableDelayedExpansion
set "cnt=13"
>output.txt (
for /f "skip=21 tokens=1,3 delims= " %%B in (input.txt) do (
echo(%%B %%C
set /a "1/(cnt-=1)" 2>nul || goto :break
)
)
:break
Or, if you get my JREPL.BAT regular expression text processing utility, then all you need is the following from the command line:
jrepl "^([^ ]*) [^ ]* ([^ ]*)" "$1 $2" /jmatch /jbegln "skip=(ln<22 || ln>34)" /f input.txt /o output.txt
The above assumes there is exactly one space between tokens. The regular expression can be modified if there may be multiple spaces between tokens.
You must use CALL JREPL if you use the command within a batch script.

Batch FOR /F loop won't read relative directory

I have a simple FOR /F loop which strips out all but one line of a text file:
for /f "skip=12 tokens=* delims= " %%f in (.\NonProcessed\*.txt) do (
> newfile.txt echo.%%f
goto :eof
)
But when I run, I get the result:
The system cannot find the file .\NonProcessed\*.txt
The for loop works fine if I enter a fully qualified path to the text file within the brackets, but it can't handle the relative link I have in there. I've been able to use the exact same relative link in another standard for loop in a different batch file running in the same directory without any issues. I can't understand why it won't work! Please help.
EDIT: For comments, code I'm using now is
for %%f in (.\NonProcessed\*.txt) do (
echo f is %%f
for /f "usebackq skip=12 tokens=* delims= " %%a in (%%f) do (
echo a is %%a
> %%f echo.%%a
goto :continue
)
:continue
sqlcmd stuff here
)

Sorry, but for /f does not allow you to do that. And no, the problem is not the relative path to the files but the wildcard.
According to the documentation, this is the syntax in question:
for /F ["ParsingKeywords"] {%% | %}variable in (filenameset) do command [CommandLineOptions]
For this case, the documentation states: "The Set argument specifies one or more file names." You can do
for /f %%a in (file1.txt file2.txt file3.txt) do ...
but wildcards are not allowed.
If you don't know the name of the file you want to process, your best option is to add an additional for command to first select the file
for %%a in (".\NonProcessed\*.txt"
) do for /f "usebackq skip=12 tokens=* delims= " %%f in ("%%~fa"
) do (
> newfile.txt echo(%%f
goto :eof
)
When executed, the goto command will cancel both for loops so you end with the same behaviour you expected from your original code.
Edited to adapt the code to the comments:
@echo off
set "folder=.\NonProcessed"
pushd "%folder%"
for /f "tokens=1,2,* delims=:" %%a in (
' findstr /n "^" *.txt ^| findstr /r /b /c:"[^:]*:13:" '
) do (
echo Overwrite file "%%a" with content "%%c"
>"%%a" echo(%%c
)
popd
This reads all the files in the folder, numbering the lines. The output of the first findstr command has the form
filename.txt:99:lineContents
This output is filtered to find line 13, and the resulting data is split using the colon as a separator, so we end up with the file name in %%a, the line number in %%b and the line content in %%c.

SET FILES_LIST=files_list.config
DIR /b .\NonProcessed\*.txt>%FILES_LIST%
for /f "skip=12 tokens=* delims= " %%f in (%FILES_LIST%) do (
> newfile.txt echo.%%f
goto :eof
)
IF EXIST "%FILES_LIST%" DEL "%FILES_LIST%"
I did not check how your FOR works; I just added my additions/corrections to it. Hope it works for you.
Best regards!

Batch to parse lines containing a specific string from an input file

Edit: yes, this has to be done in batch.
I need to be able to read an input file, and parse out certain sections only of lines that contain a specific string, then write that to an output file. For example:
input =
i_NumberOfPersonInTheDataBase=1
i_NumberOfPersonInTheDataBase < 50 AdjTotal=801
MATCH IdentificationResult id=Olivier Score=11419 (NOT_DEFINED-cfv )
02-11-11-07-00 TAG_CAPTURE Badge:CAPTURE - Candidate Found :Olivier
i_NumberOfPersonInTheDataBase=1
i_NumberOfPersonInTheDataBase < 50 AdjTotal=801
MATCH IdentificationResult id=Martin Score=1008 (NOT_DEFINED-cfv )
02-11-11-08-15 TAG_CAPTURE Badge:CAPTURE - Candidate Found :Martin
In lines that contain the string "IdentificationResult", I need to return the strings that contain the id and Score.
expected output =
id=Olivier Score=11419
id=Martin Score=1008
This is what I have so far:
@setlocal enableextensions enabledelayedexpansion
:: Path of input and output files
set INPUTFILE=DemoFingerOtf-2.log
set OUTPUTFILE=logOutput.txt
:: Clear out the output file
@echo on > %OUTPUTFILE%
:: Read %INPUTFILE% and loop through each line
for /F "tokens=* delims=" %%A in (%INPUTFILE%) do (
SET my_line=%%A
SET my_line=!my_line:IdentificationResult=!
if not !my_line!==%%A (
call :parse_it
)
)
:parse_it
for /F "usebackq tokens=1,2,3,4 delims=~" %%1 in ('%my_line: =~%') do (
echo %%3 %%4>> %OUTPUTFILE%
)
The problem I have right now is that when I run this script, I get a ') was unexpected at this time error. When I remove the parentheses from the input, I get my expected results. I've tried including a line like the following to remove the parentheses:
:: Read %INPUTFILE% and loop through each line
for /F "tokens=* delims=" %%A in (%INPUTFILE%) do (
SET my_line=%%A
SET my_line=!my_line:IdentificationResult=!
if not !my_line!==%%A (
SET new_line=%my_line:~0,-18%
call :parse_it
)
)
:parse_it
for /F "usebackq tokens=1,2,3,4 delims=~" %%1 in ('%new_line: =~%') do (
echo %%3 %%4>> %OUTPUTFILE%
)
I know that in the lines I want, the section with parentheses will always be exactly 18 characters, so I trim them from the end. However, when I do that, for some reason I get the following as my output:
wrong output:
id=Olivier Score=11419
id=Olivier Score=11419
id=Olivier Score=11419
So, I'm getting only the data from the first line that I want to parse, and I'm getting it three times (even though there are only two lines in my input that meet my criteria). Why am I getting this data multiple times instead of the correct data? Additionally, is there a better way around the ') was unexpected at this time error that I was getting?

Edit: modified to avoid the trailing space, to drop "ScoreAdjustment", and to work with names such as "John Smith" :)
@echo off
:: Path of input and output files
set INPUTFILE=DemoFingerOtf-2.log
set OUTPUTFILE=logOutput.txt
setlocal enabledelayedexpansion
for /f "tokens=2,3 delims=^=^(" %%a in ('type "%INPUTFILE%" ^| find /i "IdentificationResult"') do (
set $line=Id=%%a=%%b
set $line=!$line:ScoreAdjustment=!
set $line=!$line:~0,-1!
echo !$line!>>%OUTPUTFILE%)
Endlocal

@ECHO OFF &SETLOCAL
for /f "delims=" %%a in ('^<file find "IdentificationResult"') do call:DOit "%%~a"
goto:Eof
:doit
setlocal
set "string=%~1"
set "STring=%string:*IdentificationResult=%"
for /f "Tokens=1,2" %%b in ("%string%") do echo(%%b %%c
exit /b

The easy way:
if not !my_line!==%%A (
ECHO !my_line:~7,-19!>> %OUTPUTFILE%
)
A few issues with your code, but I applaud your effort to solve the problem.
Batch doesn't stop at a label - they're just markers, so it charges straight through, hence you'd need a
goto :eof
before any subroutine label.
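As a sketch of that layout (the :greet subroutine and its argument are made up for illustration), the main code ends with goto :eof so that it cannot run into the label below it:

```batch
@echo off
setlocal
call :greet World
rem Without the next line, execution would continue into :greet a second time.
goto :eof

:greet
echo Hello, %~1
goto :eof
```

Run as written, this should print Hello, World once. Deleting the first goto :eof makes the greeting appear twice, the second time with an empty argument, which is the same fall-through that produces an extra output line in the question's script.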
You can't use numerics as metavariables, so in parse_it you'd need a letter, not 1. You can also use space as a delims character, but it must be specified as the last delimiter (i.e. just before the closing ").
So parse_it could be reduced (if it was required) to
for /F "tokens=1* delims= " %%q in ("%new_line%") do (
echo %%r>> %OUTPUTFILE%
)
But - overall, bravo for the attempt!

You don't say what language you want, so could you do something like:
grep IdentificationResult file | awk '{ print $3, $4}' > output.file

batch script - read line by line

I have a log file which I need to read in line by line, piping each line to a further loop.
First I grep the log file for the main keyword (like "error") into a separate file, to keep it small. Now I need to take the separate file and read it line by line; each line needs to go to another loop (in these loops I grep the logs and divide them into blocks), but I am stuck here.
The log looks like
xx.xx.xx.xx - - "http://www.blub.com/something/id=?searchword-yes-no" 200 - "something_else"
With a for /f loop I just get the IP instead of the complete line.
How can I pipe/write/buffer the whole line? (It doesn't matter what is written per line.)

Try this:
@echo off
for /f "tokens=*" %%a in (input.txt) do (
echo line=%%a
)
pause
Because of the tokens=*, everything is captured into %%a.
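To see the difference on a small case, here is a sketch that writes a hypothetical one-line sample.txt with leading spaces and reads it back three ways. With the default options only the first space-delimited token arrives in %%a; tokens=* passes the rest of the line after skipping leading delimiters; delims= turns delimiter handling off, so the line arrives unchanged:

```batch
@echo off
rem sample.txt is a throwaway file created only for this demonstration.
> sample.txt echo    alpha beta gamma
for /f %%a in (sample.txt) do echo default : [%%a]
for /f "tokens=*" %%a in (sample.txt) do echo tokens=*: [%%a]
for /f "delims=" %%a in (sample.txt) do echo delims= : [%%a]
del sample.txt
```

The brackets make any leading spaces visible in the output.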
edit:
to reply to your comment, you would have to do that this way:
@echo off
for /f "tokens=*" %%a in (input.txt) do call :processline %%a
pause
goto :eof
:processline
echo line=%*
goto :eof
:eof
Because of the spaces, you can't use %1, because that would only contain the part until the first space. And because the line contains quotes, you can also not use :processline "%%a" in combination with %~1. So you need to use %* which gets %1 %2 %3 ..., so the whole line.

The "call" solution has some problems.
It fails with many different contents, as the parameters of a CALL are parsed twice by the parser.
These lines will produce more or less strange problems
one
two%222
three & 333
four=444
five"555"555"
six"&666
seven!777^!
the next line is empty
the end
Therefore you shouldn't use the value of %%a with a call, better move it to a variable and then call a function with only the name of the variable.
@echo off
SETLOCAL DisableDelayedExpansion
FOR /F "usebackq delims=" %%a in (`"findstr /n ^^ t.txt"`) do (
set "myVar=%%a"
call :processLine myVar
)
goto :eof
:processLine
SETLOCAL EnableDelayedExpansion
set "line=!%1!"
set "line=!line:*:=!"
echo(!line!
ENDLOCAL
goto :eof

This has worked for me in the past and it will even expand environment variables in the file if it can.
for /F "delims=" %%a in (LogName.txt) do (
echo %%a>>MyDestination.txt
)

For those with spaces in the path, you are going to want something like this:
n.b. It expands out to an absolute path, rather than relative, so if your running directory path has spaces in, these count too.
set SOURCE=path\with spaces\to\my.log
FOR /F "usebackq delims=" %%A IN ("%SOURCE%") DO (
ECHO %%A
)
To explain:
(path\with spaces\to\my.log)
will not parse, because of the spaces.
If it becomes:
("path\with spaces\to\my.log")
it will be handled as a string rather than a file path.
"usebackq delims="
With usebackq (see the docs), the quoted text is used as a file path instead (thanks to Stephan).