Split a file using Windows batch script

I have a CSV file and I need to split it into n files such that no split file exceeds 100 MB. I need to achieve this in a Windows batch script. I tried the approach below, but it takes a lot of time because my unsplit file is several GB in size.
@echo off
setlocal enableextensions enabledelayedexpansion
set count=1
set maxbytesize=100000000
set size=1
type NUL > output_1.csv
FOR /F "tokens=*" %%i in (myfile.csv) do (
FOR /F "usebackq" %%A in ('!filename!_!count!.csv') do (
set size=%%~zA)
if !size! LSS !maxbytesize! (
echo %%i>>!filename!_!count!.csv) else (
set /a count+=1
echo %%i>>!filename!_!count!.csv
))
Please let me know if there is a better way to achieve this. I can't switch to any other scripting language, as my server runs Windows.

This would do the trick, assuming your lines are roughly the same size.
Its advantage is that it needs only two passes over the file: one to count the lines and one to write them out.
@rem echo off
@rem usage: batchsplit.bat <file-to-split> <size-limit>
@rem it will generate files named <file-to-split>.part_NNN
setlocal EnableDelayedExpansion
set FILE_TO_SPLIT=%1
set SIZE_LIMIT=%2
for /f %%s in ('dir /b %FILE_TO_SPLIT%') do set SIZE=%%~Zs
for /f %%c in ('type "%FILE_TO_SPLIT%"^|find "" /v /c') do set LINE_COUNT=%%c
set /a AVG_LINE_SIZE=%SIZE%/%LINE_COUNT%
set /a LINES_PER_PART=%SIZE_LIMIT%/%AVG_LINE_SIZE%
set "cmd=findstr /R /N "^^" %FILE_TO_SPLIT%"
rem tokens=1* keeps everything after the line number, including any further colons
for /f "tokens=1* delims=:" %%a in ('!cmd!') do @(
set /a ccc = %%a / %LINES_PER_PART%
echo(%%b>> %FILE_TO_SPLIT%.part_!ccc!
)
Save it as batchsplit.bat and run it using:
batchsplit.bat myfile.csv 100000000
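A caveat on the arithmetic, shown as a small sketch with made-up numbers (a 1,500,000,000-byte file containing 12,000,000 lines; neither figure comes from the question). set /a works with 32-bit signed integers and truncates every division, so the computation only holds for sizes up to 2,147,483,647 bytes, and a rounded-down average line size can let a part slightly exceed the limit:

```batch
@echo off
rem Hypothetical figures, chosen only to illustrate the calculation.
set /a SIZE=1500000000
set /a LINE_COUNT=12000000
set /a SIZE_LIMIT=100000000
rem Integer division: any fractional part is dropped.
set /a AVG_LINE_SIZE=SIZE/LINE_COUNT
set /a LINES_PER_PART=SIZE_LIMIT/AVG_LINE_SIZE
echo %AVG_LINE_SIZE% bytes per line, %LINES_PER_PART% lines per part
```

With these numbers the sketch reports 125 bytes per line and 800000 lines per part. A file larger than the 32-bit range would need its size scaled down before it reaches set /a, which this sketch does not attempt.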

Related

Windows Batch Scripting: Checking file for multiple strings

I have a batch file that processes scanned PDFs using ghostscript. One of the user prompts is for the resolution of the desired output. I wrote a crude autodetect routine like this:
for /f "delims=" %%a in ('findstr /C:"/Height 1650" %1') do set resdect=150
for /f "delims=" %%a in ('findstr /C:"/Height 3300" %1') do set resdect=300
for /f "delims=" %%a in ('findstr /C:"/Height 6600" %1') do set resdect=600
echo %resdect% DPI detected.
%1 is the filename passed to the batch script.
This should return the highest resolution detected among some common sizes we see. My question to the community is: is there a faster or more efficient way to do this than searching the file multiple times?
Assuming that the value of RESDECT is the /Height value divided by 11, and that no line contains more than one /Height token, the following code might work for you:
@echo off
for /F delims^=^ eol^= %%A in ('findstr /R /I /C:"/Height *[0-9][0-9]*" "%~1"') do (
set "LINE=%%A"
setlocal EnableDelayedExpansion
set "RESDECT=!LINE:*/Height =!"
set /A "RESDECT/=11"
echo/!RESDECT!
endlocal
)
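The substring replacement that the loop above relies on can be tried on its own. This is a minimal sketch, assuming a sample line that merely resembles what a PDF might contain; the line content and the names REST and DPI are made up for illustration:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical input line; real PDF content may look different.
set "LINE=/Width 1275 /Height 1650 /BitsPerComponent 1"
rem Remove everything up to and including the first "/Height ".
set "REST=!LINE:*/Height =!"
rem Keep only the first space-delimited token, which is the number itself.
for /f %%H in ("!REST!") do set /a "DPI=%%H/11"
echo !DPI!
```

For this sample line the sketch prints 150. Taking the first token explicitly is a cautious variation for lines where more text follows the number.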
If you only want to match the dedicated /Height values 1650, 3300, 6600, you could use this:
@echo off
for /F delims^=^ eol^= %%A in ('findstr /I /C:"/Height 1650" /C:"/Height 3300" /C:"/Height 6600" "%~1"') do (
set "LINE=%%A"
setlocal EnableDelayedExpansion
set "RESDECT=!LINE:*/Height =!"
set /A "RESDECT/=11"
echo/!RESDECT!
endlocal
)
To gather the greatest /Height value appearing in the file, you can use this script, respecting the aforementioned assumptions:
@echo off
set "RESDECT=0"
for /F delims^=^ eol^= %%A in ('findstr /R /I /C:"/Height *[0-9][0-9]*" "%~1"') do (
set "LINE=%%A"
setlocal EnableDelayedExpansion
set "HEIGHT=!LINE:*/Height =!"
for /F %%B in ('set /A HEIGHT/11') do (
if %%B gtr !RESDECT! (endlocal & set "RESDECT=%%B") else endlocal
)
)
echo %RESDECT%
Of course you can again exchange the findstr command line like above.
Here is another approach to get the greatest /Height value, using (pseudo-)arrays, which might be faster than the above method, because there are no extra cmd instances created in the loop:
@echo off
setlocal
set "RESDECT=0"
for /F delims^=^ eol^= %%A in ('findstr /R /I /C:"/Height *[0-9][0-9]*" "%~1"') do (
set "LINE=%%A"
setlocal EnableDelayedExpansion
set "HEIGHT=!LINE:*/Height =!"
set /A "HEIGHT+=0, RES=HEIGHT/11" & set "HEIGHT=0000000000!HEIGHT!"
for /F %%B in ("$RESOLUTIONS[!HEIGHT:~-10!]=!RES!") do endlocal & set "%%B"
)
for /F "tokens=2 delims==" %%B in ('set $RESOLUTIONS[') do set "RESDECT=%%B"
echo %RESDECT%
endlocal
At first, all heights and their related resolutions are collected in an array called $RESOLUTIONS[], where the /Height values are used as indexes and the resolutions are the values. The heights are left-padded with zeros to a fixed number of digits, so set $RESOLUTIONS[ returns them in ascending order. The second for /F loop returns the last array element, whose value is the greatest resolution.
I do have to admit that this was inspired by Aacini's nice answer.
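The effect of the zero padding can be seen in isolation. The following is a minimal sketch with made-up heights and a hypothetical array name PAD[]; it relies on set listing variables sorted by name:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical heights; PAD[...] is a made-up array name for this sketch.
for %%h in (6600 1650 3300 825) do (
    set "key=0000000000%%h"
    set "PAD[!key:~-10!]=%%h"
)
rem The listing is sorted by variable name, so the last entry has the largest key.
for /f "tokens=2 delims==" %%v in ('set PAD[') do set "max=%%v"
echo %max%
```

This prints 6600. Without the padding, 825 would be listed last, because the names are compared as text and "8" sorts after "6".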
Get the matching line into a variable and work with that instead of the whole file. Instead of your three for loops, you can use just one if you change the logic a bit:
@echo off
setlocal enabledelayedexpansion
for /f "delims=" %%a in ('findstr /C:"/Height " %1') do (
set "line=%%a"
set "line=!line:*/Height =!"
for /f "delims=/ " %%b in ("!line!") do set "hval=!hval! %%b"
)
for %%a in (1650,3300,6600) do @(
echo " %hval% " | find " %%a " >nul && set /a resdect=%%a/11
)
echo %resdect% DPI detected.
A solution with jrepl.bat could look something like:
for /f %a in ('type t.txt^|find "/Height "^|jrepl ".*/Height ([0-9]{4}).*" "$1"^|sort') do set /a dpi=%a / 11
(assuming all valid Heights have 4 digits)
Note: for use in batch files, use %%a instead of %a
I barely scratched the surface of jrepl. I'm quite sure there is a much more elegant (and probably faster) solution.
You may directly convert the Height value into the highest resolution in a single operation using an array. However, to do that we need to know the format of the line that contain the Height value. In the code below I assumed that the format of such a line is /Height xxxx, that is, that the height is the second token in the line. If this is not true, just adjust the "tokens=2" value in the for /F command.
EDIT: Code modified as requested in comments
In this modified code the Height value may appear anywhere in the line.
@echo off
setlocal EnableDelayedExpansion
rem Initialize "resDect" array
for %%a in ("1650=150" "3300=300" "6600=600") do (
for /F "tokens=1,2 delims==" %%b in (%%a) do (
set "resDect[%%b]=%%c"
)
)
set "highResDect=0"
for /F "delims=" %%a in ('findstr "/Height" %1') do (
set "line=%%a"
set "line=!line:*/Height =!"
for /F %%b in ("!line!") do set /A "thisRectDect=resDect[%%b]"
if !thisRectDect! gtr !highResDect! set "highResDect=!thisRectDect!"
)
echo %highResDect% DPI detected.
For the record, the final code was:
setlocal enabledelayedexpansion
set resdetc=0
for /f "delims=" %%a in ('findstr /C:"/Height " %1') do (
set "line=%%a"
set "line=!line:*/Height =!"
for /f "delims=/ " %%b in ("!line!") do set "hval=!hval! %%b"
)
for %%a in (1650,3300,6600) do @(
echo " %hval% " | find " %%a " >nul && set /a resdetc=%%a/11
)
if %resdetc%==0 SET resDefault=3
if %resdetc%==150 SET resDefault=1
if %resdetc%==300 SET resDefault=3
if %resdetc%==600 SET resDefault=6
ECHO.
ECHO Choose your resolution
ECHO ----------------------
ECHO 1. 150 4. 400
ECHO 2. 200 5. 500
ECHO 3. 300 6. 600
ECHO.
IF NOT %RESDETC%==0 ECHO 7. Custom (%resdetc% DPI input detected)
IF %RESDETC%==0 ECHO 7. Custom
ECHO ----------------------
choice /c 1234567 /T 3 /D %resDefault% /N /M "Enter 1-7 (defaults to %resDefault% after 3 sec.): "
IF errorlevel==7 goto choice7
IF errorlevel==6 set reschoice=600 & goto convert
IF errorlevel==5 set reschoice=500 & goto convert
[...]
Thanks everyone for the help!

Preserve a variable across a DisableDelayedExpansion ENDLOCAL

@echo off
setlocal EnableDelayedExpansion
set /a N=0
for /f "tokens=* delims=" %%g in ('dir !FOLDERPATH! /b') do (
setlocal DisableDelayedExpansion
set "item=%%g"
endlocal
set /a N+=1
REM next line loses exclamation marks. replacing %%g with %%item%% gives error: not defined $$ variable
call set "$$%%N%%=%%g"
)
set "ind=%N%"
REM View the results
echo !ind!
for /f "tokens=* delims=" %%i in ('set $$') do echo %%i
pause
EXIT /b
I read many solutions to my problem (stackoverflow.com/questions/3262287; stackoverflow.com/questions/28682268; stackoverflow.com/questions/29869394; stackoverflow.com/questions/3262287) and am still baffled. Any help appreciated
I have a folder with mp3 filenames containing exclamation marks (and percentage signs and ampersands). The above subroutine is supposed to fill an array=$$ with these filenames. It gets called 2000 times, each time for a different folder.
I want to use EnableDelayedExpansion as the master setlocal (twice as fast). In my entire batch program this is the only line (set "item=%%g") where I need DisableDelayedExpansion. I need to know how to efficiently pass this variable (item) across the DisableDelayedExpansion endlocal boundary (and into the $$ set). Alternatively, I guess I could fill the $$ set within the Disabled environment, and then I would need to pass the set across the boundary.
I'm looking for speed. I can use Disabled for the entire subroutine, but this doubles my processing time (for very few exclamation marks and percentage signs).
Based on the responses I see it might be good to include the entire batch script (simplified). This script takes 28 minutes to run with my database on my puny machine. It handles exclamation marks, percentage signs, ampersands and anything else you can throw at it.
@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set /p COUNT=Select Desired Number of Random Episodes per Album:
for /d %%f in (H:\itunes\Podcasts\*) do (
set buffer="%%f"
set /a ind = 0
call:Set$$Variables
setlocal EnableDelayedExpansion
if !COUNT! LEQ !ind! ( set DCOUNT=!COUNT! ) ELSE set DCOUNT=!ind!
for /l %%g in (1, 1, !DCOUNT!) do (
call:GenerateUniqueRandomNumber
for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
)
endlocal
)
pause
Exit /b
:Set$$Variables
set /a N = 0
for /f "tokens=* delims=" %%g in ('dir %buffer% /b') do (
set "item=%%g"
set /a N+=1
call set "$$%%N%%=%%item%%"
)
set "ind=%N%"
EXIT /b
:GenerateUniqueRandomNumber
:nextone
set /a "num = (((!random! & 1) * 1073741824) + (!random! * 32768) + !random!) %% !ind! + 1"
for %%N in (!num!) do (
if !RN%%N!==1 (
goto:nextone
)
set "RN%%N=1"
)
EXIT /b
A simple change to the following and it runs in 13 minutes. But it doesn't handle exclamation marks (bangs=!).
@echo off
chcp 1254>nul
setlocal EnableDelayedExpansion
set /p COUNT=Select Desired Number of Random Episodes per Album:
for /d %%f in (H:\itunes\Podcasts\*) do (
setlocal
set buffer="%%f"
set /a ind = 0
call:Set$$Variables
if !COUNT! LEQ !ind! ( set DCOUNT=!COUNT! ) ELSE set DCOUNT=!ind!
for /l %%g in (1, 1, !DCOUNT!) do (
call:GenerateUniqueRandomNumber
for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
)
endlocal
)
pause
Exit /b
:Set$$Variables
set /a N = 0
for /f "tokens=* delims=" %%g in ('dir %buffer% /b') do (
set "item=%%g"
set /a N+=1
call set "$$%%N%%=%%item%%"
)
set "ind=%N%"
EXIT /b
:GenerateUniqueRandomNumber
:nextone
set /a "num = (((!random! & 1) * 1073741824) + (!random! * 32768) + !random!) %% !ind! + 1"
for %%N in (!num!) do (
if !RN%%N!==1 (
goto:nextone
)
set "RN%%N=1"
)
EXIT /b
More than double the processing time to handle a very few filenames containing exclamation marks. The resource - time hog is the subroutine Set$$Variables.
So my hope is someone out there can find a happy middle ground between these two times: 13 minutes vs. 28 minutes. It seems to me there must be a way to handle the dozen or so affected files in the 15 minute difference.
Best if you truly understand what is slowing down your script.
The following all contribute to unacceptable performance:
Excessive CALLs in a loop
Randomly selecting a number until you get one that has not been selected yet
Redirection in append mode for each file - best to redirect only once
Enabling and disabling delayed expansion normally does not contribute much to bad performance (unless you have a massively large environment space)
My code below uses the FINDSTR technique to quickly build the array, without needing delayed expansion.
I then can enable delayed expansion just once for each folder. I never have to pass any values across the endlocal "barrier"
I guarantee that each random operation selects an available file by building a list of possible numbers, fixed width of 4 with leading spaces. The list must fit in a single variable with max length of ~8190 bytes, so this solution supports up to ~2040 files per folder. Each random number specifies which position to take from the list, and then the value is extracted and the count decremented.
I enclose the entire outer loop in an extra set of parentheses so that I only need to redirect once.
I'm pretty sure this code will be significantly faster than even your 2nd code that does not support ! etc.
benham1.bat
@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"
pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
for /d %%F in (*) do (
pushd "%%F"
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
set "$$%%A=%%B"
set "fileCnt=%%A"
)
setlocal enableDelayedExpansion
set "nums="
for /l %%N in (1 1 !fileCnt!) do (
set "n=   %%N"
set "nums=!nums!!n:~-4!"
)
if !fileCnt! lss !maxCnt! (set "cnt=!fileCnt!") else set "cnt=!maxCnt!"
for /l %%N in (1 1 !cnt!) do (
set /a "pos=(!random!%%fileCnt)*4, next=pos+4, fileCnt-=1"
for /f "tokens=1,2" %%A in ("!pos! !next!") do (
for /f %%N in ("!nums:~%%A,4!") do echo !$$%%N!
set "nums=!nums:~0,%%A!!nums:~%%B!"
)
)
endlocal
popd
)
)
popd
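To see the fixed-width list technique from the script above in isolation, here is a minimal sketch with a hypothetical list of five numbers. Each number occupies a cell of exactly 4 characters, so the cell at zero-based index i starts at offset i*4 and can be cut out with plain substring operations:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Build "   1   2   3   4   5": one 4-character cell per number.
set "nums="
for /l %%N in (1 1 5) do (
    set "n=   %%N"
    set "nums=!nums!!n:~-4!"
)
rem Take the cell at zero-based index 2, which holds the number 3.
set /a "pos=2*4, next=pos+4"
for /f "tokens=1,2" %%A in ("!pos! !next!") do (
    for /f %%V in ("!nums:~%%A,4!") do echo taken: %%V
    set "nums=!nums:~0,%%A!!nums:~%%B!"
)
echo left: [!nums!]
```

The sketch reports 3 as taken and leaves a list holding 1, 2, 4 and 5. In the full script the index comes from !random! instead of being fixed.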
Update and solution
It really bothered me that this code behaved so poorly in RKO's hands (see his comment). So I did some tests of my own to figure out what is happening.
I ran the code against 500 folders with 100 files in each folder, and I asked for an output of 50 files for each folder. I modified the loop to print out the name of each folder to stderr so I could monitor progress. As expected, each folder was processed very quickly at the beginning. But as the program progressed, each folder became slower than the previous one, in a non-linear fashion. By the time it reached the end, each folder was painfully slow.
It took ~3 minutes to process 500 folders. I then doubled the number of folders, and the time exploded to ~18 minutes. Very nasty.
Way back when, a group of us at DosTips tried to investigate how cmd.exe manages the environment space, and how large environments impact performance. See Why does SET performance degrade as environment size grows?
We determined that ENDLOCAL does not actually release allocated memory, which causes SET performance to degrade as the number of SET operations accumulates. This problem has lots of SET operations, so it makes sense that it becomes slow.
But RKO has code with lots of inefficiencies that is performing better than mine. In the absence of environment size issues, his code should be much slower. So somehow his code must not be accumulating memory like mine. So I went on a quest to isolate the memory allocation for each folder from all the rest.
My first attempt was to put all the memory allocation for a single folder within a new cmd.exe process. And it worked! Processing 500 folders now took ~1 minute, and doubling to 1000 folders basically doubled the time to ~2 minutes!
benham2.bat
@echo off
if "%~1" equ ":processFolder" (
pushd "%folder%"
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
set "$$%%A=%%B"
set "fileCnt=%%A"
)
setlocal enableDelayedExpansion
set "nums="
for /l %%N in (1 1 !fileCnt!) do (
set "n=   %%N"
set "nums=!nums!!n:~-4!"
)
if !fileCnt! lss !maxCnt! (set "cnt=!fileCnt!") else set "cnt=!maxCnt!"
for /l %%N in (1 1 !cnt!) do (
set /a "pos=(!random!%%fileCnt)*4, next=pos+4, fileCnt-=1"
for /f "tokens=1,2" %%A in ("!pos! !next!") do (
for /f %%N in ("!nums:~%%A,4!") do echo !$$%%N!
set "nums=!nums:~0,%%A!!nums:~%%B!"
)
)
popd
exit
)
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"
pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
for /d %%F in (*) do (
set "folder=%%F"
cmd /v:off /c ^""%~f0" :processFolder^"
)
)
popd
exit /b
But then I realized that I didn't have to restart my console for each test of my original benham1 code - when the batch script terminated, the memory seemed to have reset because the next run would start out just as fast as the prior one.
So I thought, why not simply CALL a :subroutine instead of initiating a new cmd.exe. This worked about the same, just a little bit better!
benham3.bat
@echo off
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"
pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
for /d %%F in (*) do (
set "folder=%%F"
call :go
)
)
popd
exit /b
:go
setlocal
cd "%folder%"
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
set "$$%%A=%%B"
set "fileCnt=%%A"
)
setlocal enableDelayedExpansion
set "nums="
for /l %%N in (1 1 !fileCnt!) do (
set "n=   %%N"
set "nums=!nums!!n:~-4!"
)
if !fileCnt! lss !maxCnt! (set "cnt=!fileCnt!") else set "cnt=!maxCnt!"
for /l %%N in (1 1 !cnt!) do (
set /a "pos=(!random!%%fileCnt)*4, next=pos+4, fileCnt-=1"
for /f "tokens=1,2" %%A in ("!pos! !next!") do (
for /f %%N in ("!nums:~%%A,4!") do echo !$$%%N!
set "nums=!nums:~0,%%A!!nums:~%%B!"
)
)
exit /b
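Note that the :go subroutine never calls endlocal. It relies on cmd.exe ending any setlocal that is still open when a called subroutine returns. A minimal sketch of that behavior, with made-up variable names (outer, scratch) and a hypothetical label :work:

```batch
@echo off
setlocal DisableDelayedExpansion
set "outer=kept"
call :work
rem scratch was defined inside the subroutine's own setlocal scope.
if defined scratch (echo scratch leaked) else echo scratch discarded
echo outer=%outer%
exit /b

:work
setlocal
set "scratch=temporary"
rem No endlocal here: returning from the call ends the open setlocal.
exit /b
```

The sketch reports that scratch was discarded while outer is still set to kept, which is why each folder's $$ variables do not pile up in the caller's environment.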
Another Update
I substituted Aacini's superior Array based method for guaranteeing each random operation selects a unique file name in place of my string based method. It yields slightly better performance:
benham-aacini.bat
@echo off
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"
pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
for /d %%F in (*) do (
set "folder=%%F"
call :go
)
)
popd
exit /b
:go
setlocal
cd "%folder%"
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
set "$$%%A=%%B"
set /a "fileCnt=%%A, RN%%A=%%A"
)
setlocal enableDelayedExpansion
if !fileCnt! lss !maxCnt! (set end=1) else set /a "end=fileCnt-maxCnt+1"
for /l %%N in (!fileCnt! -1 !end!) do (
set /a "ran=!random!%%%%N+1"
set /a "num=RN!ran!, RN!ran!=RN%%N"
for %%N in (!num!) do echo !cd!\!$$%%N!
)
exit /b
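The array-based selection in the script above can be isolated as follows. This is a minimal sketch that picks 3 distinct numbers from 1 to 10; the sizes and the names slot, total, wanted and pick are hypothetical. It is essentially a partial Fisher-Yates shuffle: each pick takes a random slot among those still in play and then fills it with the content of the last slot in play, so no value can come up twice:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical sizes: choose 3 distinct numbers out of 1..10.
set /a total=10, wanted=3
for /l %%i in (1 1 %total%) do set "slot%%i=%%i"
set /a last=total-wanted+1
for /l %%i in (%total% -1 %last%) do (
    rem r is a random slot number between 1 and the current %%i
    set /a "r=!random!%%%%i+1"
    rem read slot r, then overwrite it with the content of slot %%i
    set /a "pick=slot!r!, slot!r!=slot%%i"
    echo !pick!
)
```

Every run prints three different numbers between 1 and 10, with no retries however large the requested share is.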
Here is a summary of the timings of each version:
folders  | benham1  benham2  benham3  benham-aacini
---------+------------------------------------------
500      |   2:49     0:53     0:43       0:41
1000     |  17:48     1:56     1:44       1:34
So our original thinking at DosTips that the environment never shrinks is wrong. But I haven't had time to fully test and determine exactly when it shrinks.
Regardless, I think I finally have a version that is truly faster than your current 13 minute code :-)
Yet Another Update (Assume few collisions)
Multiple random selections from the complete set of files might result in duplicates. My code assumes that the number of requested files might be a large portion of a large list, in which case you can expect to get many duplicates.
So all of my previous solutions did some extra bookkeeping to guarantee that each random operation results in a unique file.
But RKO seems to typically select just a few files from each folder. So the chance of collision is small. His code just randomly selects a file, and then if the file has been selected before, it loops back and tries again until it finds a new file. But since the chance of collision is small, the retry rarely happens. This method has significantly less bookkeeping. The result is that his random selection algorithm is faster as long as the number requested remains small.
So I have adapted my code to use a slightly modified version of RKO's selection method. I expect this to be the fastest yet for small request counts.
benham4.bat
@echo off
setlocal DisableDelayedExpansion
set /p "maxCnt=Select Desired Number of Random Episodes per Album:"
pushd "H:\itunes\Podcasts"
>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8" (
for /d %%F in (*) do (
set "folder=%%F"
call :selectRandom
)
)
popd
exit /b
:selectRandom
setlocal
cd "%folder%"
set "fileCnt=0"
for /f "delims=: tokens=1,2" %%A in ('dir /b /a-d 2^>nul^|findstr /n "^"') do (
set "$$%%A=%%B"
set "fileCnt=%%A"
)
setlocal enableDelayedExpansion
if !fileCnt! lss !maxCnt! (set /a end=fileCnt) else set /a end=maxCnt
for /l %%N in (1 1 !end!) do (
set /a "N=!random!%%fileCnt+1"
if not defined $$!N! call :tryAgain
for %%N in (!N!) do echo !folder!\!$$%%N!
set "$$!N!="
)
exit /b
:tryAgain
set /a "N=!random!%%fileCnt+1"
if not defined $$!N! goto :tryAgain
exit /b
There isn't any good way to transfer many variables out of the scope (over the endlocal barrier).
But when you transfer them one by one, it works.
@echo off
setlocal
set "folderPath=C:\temp"
call :func
set $$
exit /b
:func
setlocal EnableDelayedExpansion
FOR /F "delims=" %%F in ("!FOLDERPATH!") DO (
endlocal
setlocal DisableDelayedExpansion
set /a Counter=0
for /f "tokens=* delims=" %%g in ('dir %%F /b') do (
set "item=%%g"
setlocal EnableDelayedExpansion
FOR /F "tokens=1,*" %%C in ("!Counter! !item!") DO (
endlocal
endlocal
set "$$%%C=%%D"
setlocal DisableDelayedExpansion
set /a Counter=%%C + 1
)
)
)
EXIT /b
This solution assumes that none of your filenames contain a bang (!), or that you call your function from a disabled delayed expansion context.
Another solution
When you need a bulletproof variant, you could replace the endlocal/set "$$%%C=%%D" with the macroReturn technique.
When you can live with a temporary file, you could also use the set /p technique.
:collectFiles
dir /b !FOLDERPATH! > temp.$$$
FOR /F %%C in ('type temp.$$$ ^| find /v /c ""') DO (
echo count %%C
< temp.$$$ (
for /L %%n in (0 1 %%C) DO (
set /p $$%%n=
)
)
)
exit /b
10 Minutes! It's a modified version of the 13 Minute EnabledDelayedExpansion version shown in the original post. I'm waiting for a slight coding fix from dbenham to see if his is an even faster solution.
Here's the critical piece of coding:
for /f "tokens=* delims=" %%g in ('dir "!buffer!" /b') do (
setlocal DisableDelayedExpansion
for /f "tokens=1* delims=!" %%m in ("%%g") do if not "%%m"=="%%g" (
set "item=%%g"
call set BangFile=%%item:^&=¬%%
call set BangFile=%%Bangfile:!=^^^^!%%
call echo %%BangFile%%>"G:\BangFilename.txt"
)
endlocal
Yes it's ugly. I don't like writing to temporary files, but could find no other way. And again there's probably only a few dozen filenames containing bangs (!) in the entire 120K collection, so very few read-writes. I had to use the above code twice: once for the directory names and once for the filenames.
The full code is here:
@echo off
chcp 1254>nul
setlocal EnableDelayedExpansion
IF EXIST "%USERPROFILE%\Desktop\RandomEpisodes.m3u8" del "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
set /p COUNT=Select Desired Number of Random Episodes per Album:
call:timestart
for /d %%f in (G:\itunes\Podcasts\* H:\itunes\Podcasts\*) do (
setlocal DisableDelayedExpansion
if exist g:\BangDirectory.txt del g:\BangDirectory.txt
for /f "tokens=1* delims=!" %%d in ("%%f") do if not "%%d"=="%%f" (
set "BangDir=%%f"
call set BangDir=%%BangDir:^&=¬%%
call set BangDir=%%BangDir:!=^^^^!%%
call echo %%BangDir%%>g:\BangDirectory.txt
)
endlocal
setlocal
set "buffer=%%f"
set directory=%%~nf
call:timecalc
call:Set$$Variables
if !COUNT! LEQ !ind! ( set DCOUNT=!COUNT! ) ELSE set DCOUNT=!ind!
for /l %%g in (1, 1, !DCOUNT!) do (
call:GenerateUniqueRandomNumber
for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
)
endlocal
)
pause
Exit /b
:Set$$Variables
set /a cnt = 0
if exist g:\BangDirectory.txt for /f "usebackq delims=" %%t in ("G:\BangDirectory.txt") do (
set "buffer=%%t"
set "buffer=!buffer:¬=&!"
)
for /f "tokens=* delims=" %%g in ('dir "!buffer!" /b') do (
setlocal DisableDelayedExpansion
if exist g:\BangFilename.txt del g:\BangFilename.txt
for /f "tokens=1* delims=!" %%m in ("%%g") do if not "%%m"=="%%g" (
set "item=%%g"
call set BangFile=%%item:^&=¬%%
call set BangFile=%%Bangfile:!=^^^^!%%
call echo %%BangFile%%>"G:\BangFilename.txt"
)
endlocal
if exist g:\BangFilename.txt for /f "usebackq tokens=* delims=" %%p in ("G:\BangFilename.txt") do (
set Filename=%%p
set "Filename=!Filename:¬=&!"
set $$!cnt!=!buffer!\!Filename!
)
if not exist g:\BangFilename.txt set "$$!cnt!=!buffer!\%%g"
set /a cnt+=1
)
set "ind=!cnt!"
EXIT /b
:GenerateUniqueRandomNumber
:nextone
set /a "num = (((!random! & 1) * 1073741824) + (!random! * 32768) + !random!) %% !ind!"
for %%N in (!num!) do (
if !RN%%N!==1 (
goto:nextone
)
set "RN%%N=1"
)
EXIT /b
:timestart
for /F "tokens=1-4 delims=:.," %%a in ("%time%") do (
set /A "start=(((%%a*60)+1%%b %% 100)*60+1%%c %% 100)*100+1%%d %% 100")
exit /b
:timecalc
REM Get end time:
for /F "tokens=1-4 delims=:.," %%a in ("%time%") do (
set /A "end=(((%%a*60)+1%%b %% 100)*60+1%%c %% 100)*100+1%%d %% 100"
)
REM Get elapsed time:
set /A elapsed=end-start
REM Show elapsed time:
set /A hh=elapsed/(60*60*100), rest=elapsed%%(60*60*100), mm=rest/(60*100), rest%%=60*100, ss=rest/100, cc=rest%%100
if %mm% lss 10 set mm=0%mm%
if %ss% lss 10 set ss=0%ss%
set "TimeElapsed= Time elapsed (mm:ss) %mm%:%ss%"
title %timeelapsed% WIP: "%directory%"
exit /b
Note on testing. I tested all the versions with virus protection off and with no other running programs. I selected 5 as the Desired Number of Random Episodes. I use an Evo N410c running Win XP. I tried to be as even as possible for each version of the script, but I noticed the 13 minute run can sometimes be as fast as 10 minutes (my guess is XP is creating & keeping some indices on-the-fly which affect runtime).
The 120K item library is contained on an external harddrive connected by USB. The library has about 300 directories with hundreds to thousands of items. It has another 1400 directories with between a few and a few dozen items. This is a live library, but I included an additional testing directory with filenames having every combination and permutation of !, & and %.
Philosophic note. Several times I've had to do 'stuff' with this 120K item library and in the end it has always been the case that avoidance of DisableDelayedExpansion is the best rule. I typically do everything I can with EnabledDelayedExpansion and then take care of any exceptions (bangs) as exceptions.
Any recommendations to reduce the ugliness of this solution (but not its speed) are very welcome.
Postscript--------------------
I incorporated Aacini's and dbenham's random number routine into my 10 minute version. Their coding looked to be much more elegant than my original coding. I deleted my GenerateUniqueRandomNumber subroutine and incorporated the following:
if !ind! lss !Count! (set end=1) else set /a "end=ind-Count+1"
for /l %%N in (!ind! -1 !end!) do (
set /a "ran=!random!%%%%N+1"
set /a "num=RN!ran!, RN!ran!=RN%%N"
for %%N in (!num!) do echo !$$%%N!>>"%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
)
This change added 2 minutes to processing time (increased run-time from 10 minutes to 12 minutes). Sometimes elegant just ain't as fast as plain ugly. I'm stickin' with my original.
This method creates the array in a disabled delayed expansion environment in a very fast way:
@echo off
setlocal DisableDelayedExpansion
for /f "tokens=1* delims=:" %%g in ('dir %FOLDERPATH% /b ^| findstr /N "^"') do (
set "$$%%g=%%h"
set "ind=%%g"
)
REM View the results
echo %ind%
set $$
REM View the results using DelayedExpansion
setlocal EnableDelayedExpansion
for /L %%i in (1,1,%ind%) do echo %%i- !$$%%i!
pause
EXIT /b
You may also transfer the entire array to the environment of the caller program in a very simple way:
for /F "delims=" %%a in ('set $$') do (
endlocal
set "%%a"
)
In this method the endlocal command is executed several times, but it only takes effect the first time, when it is matched with the initial setlocal of the function. If this method could release other previous environments, then just add a simple test:
set _FLAG_=1
for /F "delims=" %%a in ('set $$') do (
if defined _FLAG_ endlocal
set "%%a"
)
EDIT: Complete example code added
I wrote the code below after the OP posted the complete example code; I used the same commands and variable names as the original code. This solution uses the method I originally posted here to create the array in a disabled delayed expansion environment (the findstr /N "^" command generates the indices of the elements), and then extracts the elements of the array in random order using a very efficient method that I already used in this answer. I also made a couple of modifications that increase the efficiency, such as avoiding call commands and replacing the append redirection >> (which is executed once for each output line) with a standard redirection > (which is executed just once). The resulting program should run much faster than the OP's original code.
@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set /p COUNT=Select Desired Number of Random Episodes per Album:
(for /d %%f in (H:\itunes\Podcasts\*) do (
for /f "tokens=1* delims=:" %%g in ('dir "%%f" /b /A-D 2^>NUL ^| findstr /N "^"') do (
set "$$%%g=%%h"
set "ind=%%g"
set "RN%%g=%%g"
)
setlocal EnableDelayedExpansion
if %COUNT% LEQ !ind! (set /A DCOUNT=ind-COUNT+1) ELSE set DCOUNT=1
for /l %%g in (!ind!, -1, !DCOUNT!) do (
set /A "ran=(!random!*%%g)/32768+1"
set /A "num=RN!ran!, RN!ran!=RN%%g"
for %%N in (!num!) do echo !$$%%N!
)
endlocal
)) > "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
pause
Exit /b
2ND EDIT
After reading dbenham's explanation of the problem with the environment release in our original methods, I introduced the modification he suggested into my code in order to fix the problem. Both codes are now expected to run faster than the OP's original code...
@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
(
for /F "delims==" %%a in ('set') do set "%%a="
set "ComSpec=%ComSpec%"
set "USERPROFILE=%USERPROFILE%"
)
set /p COUNT=Select Desired Number of Random Episodes per Album:
(for /d %%f in (H:\itunes\Podcasts\*) do call :Sub "%%f"
) > "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
pause
Exit /b
:Sub
setlocal DisableDelayedExpansion
cd %1
for /f "tokens=1* delims=:" %%g in ('dir /b /A-D 2^>NUL ^| findstr /N "^"') do (
set "$%%g=%%h"
set /A "ind=%%g, N%%g=%%g"
)
setlocal EnableDelayedExpansion
if %COUNT% LEQ %ind% (set /A DCOUNT=ind-COUNT+1) ELSE set DCOUNT=1
for /l %%g in (%ind%, -1, %DCOUNT%) do (
set /A "ran=!random!%%%%g+1"
set /A "num=N!ran!, N!ran!=N%%g"
for %%N in (!num!) do echo %CD%\!$%%N!
)
exit /B
3RD EDIT
The method to generate unique random numbers used by dbenham and me is a general-purpose method that efficiently manages most situations; however, it seems that this method is not best suited for this particular problem. The new code below is an attempt to write a solution for this problem that runs in the fastest possible way.
Mod: A small bug has been fixed.
@echo off
chcp 1254>nul
setlocal DisableDelayedExpansion
set "findstr=C:\Windows\System32\findstr.exe"
for /F "delims=" %%a in ('where findstr 2^>NUL') do set "findstr=%%a"
(
for /F "delims==" %%a in ('set') do set "%%a="
set "ComSpec=%ComSpec%"
set "USERPROFILE=%USERPROFILE%"
set "findstr=%findstr%"
)
set /p COUNT=Select Desired Number of Random Episodes per Album:
(for /d %%f in (H:\itunes\Podcasts\*) do call :Sub "%%f"
) > "%USERPROFILE%\Desktop\RandomEpisodes.m3u8"
pause
Exit /b
:Sub
setlocal DisableDelayedExpansion
cd %1
for /f "tokens=1* delims=:" %%g in ('dir /b /A-D 2^>NUL ^| "%findstr%" /N "^"') do (
set "$%%g=%%h"
set "ind=%%g"
)
setlocal EnableDelayedExpansion
if %COUNT% LEQ %ind% (set "DCOUNT=%COUNT%") ELSE set "DCOUNT=%ind%"
for /l %%g in (1, 1, %DCOUNT%) do (
set /A "num=!random!%%ind+1"
if not defined $!num! call :nextNum
for %%N in (!num!) do echo %CD%\!$%%N!& set "$%%N="
)
exit /B
:nextNum
set /A num=num%%ind+1
if not defined $%num% goto nextNum
exit /B
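The selection step of the script above can be tried in isolation. The following is only a minimal sketch, not the author's code: the five item names are hypothetical, the pick count is fixed at 3, and the random index is drawn modulo the total item count (an assumption of this sketch). When the drawn slot has already been used, the search moves on to the next slot that is still defined.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical data: five items stored in the variables $1..$5
set "ind=0"
for %%i in (alpha beta gamma delta epsilon) do (
    set /A ind+=1
    set "$!ind!=%%i"
)

rem Pick 3 distinct items
for /L %%g in (1,1,3) do (
    set /A "num=!random! %% ind + 1"
    if not defined $!num! call :nextNum
    for %%N in (!num!) do (
        echo !$%%N!
        set "$%%N="
    )
)
exit /B

:nextNum
rem Advance cyclically until a slot that is still defined is found
set /A "num=num %% ind + 1"
if not defined $%num% goto nextNum
exit /B
```

Each printed item is deleted from the array right away, so it cannot be chosen twice. The loop always ends because fewer items are picked than exist.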
...
setlocal DisableDelayedExpansion
endlocal&set "item=%%g"
...
should work.
@ECHO OFF
SETLOCAL DISABLEDELAYEDEXPANSION
FOR /f "tokens=1*delims=:" %%a IN (
'dir /b /ad c:\106x^|findstr /n /r "."') DO (
SET "thing%%a=%%b"
)
SET thing
GOTO :EOF
This should establish your array. c:\106x is just a directory where I have some strange directory-names.
My directory c:\106x:
!dir!
%a
%silly%
-
1 & 2
a silly dirname with & and % and ! an things
exe
flags
nasm32working
nasmxpts
no test file(s)
some test file(s)
stl
wadug
with spaces
with'apostrophes'test
Result of running above code
thing1=!dir!
thing10=nasmxpts
thing11=no test file(s)
thing12=some test file(s)
thing13=stl
thing14=wadug
thing15=with spaces
thing16=with'apostrophes'test
thing2=%a
thing3=%silly%
thing4=-
thing5=1 & 2
thing6=a silly dirname with & and % and ! an things
thing7=exe
thing8=flags
thing9=nasm32working
Works for me, and it only executes the setlocal once.

Copy files from a file list to a folder list in batch

I have two text files: files.txt, containing a list of filenames, and dirs.txt, containing the list of directories the files need to be copied to.
This is how the files need to be copied:
File 1 ------------------------> Folder 1
File 2 ------------------------> Folder 2
File 3 ------------------------> Folder 3
How do I implement this using batch? Thanks in advance...
Try this:
@echo off
setlocal enabledelayedexpansion
for /f "delims=" %%a in (files.txt) do (
set /p dir=
echo copy "%%~a" "!dir!"
)<dirs.txt
pause
The above works - Mona can revise or remove the following:
setlocal enabledelayedexpansion
3<dirs.txt (
for /f "delims=" %%a in (files.txt) do (
set /p dir=<&3
copy "%%~a" "!dir!"
)
)
And that should do what you want. Note that if dirs.txt has fewer lines than files.txt, this will fail.
Mona.
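One way to guard against the short dirs.txt case mentioned above is to clear the variable before each read. This is a minimal sketch, not part of either answer. It relies on set /p leaving the variable untouched when no more input is available, and it only echoes the copy command so that it can be tried safely. A blank line in dirs.txt is treated the same way as a missing line.

```batch
@echo off
setlocal EnableDelayedExpansion
<dirs.txt (
    for /f "delims=" %%a in (files.txt) do (
        rem Clear first: at end of input set /p does not change the variable
        set "dir="
        set /p "dir="
        if defined dir (
            echo copy "%%~a" "!dir!"
        ) else (
            >&2 echo No destination left for "%%~a"
        )
    )
)
```

Remove the echo in front of copy once the printed commands look right.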
Well I managed to figure it out...thanks to this answer from @foxidrive. Here is the code:
@echo off
setlocal enabledelayedexpansion
set /A i=0
for /F "usebackq delims==" %%a in (files.txt) do (
set /A i+=1
call set array1[%%i%%]=%%a
call set n=%%i%%
)
set /A i=0
for /F "usebackq delims==" %%a in (dirs.txt) do (
set /A i+=1
call set array2[%%i%%]=%%a
)
for /L %%i in (1,1,%n%) do call copy "%%array1[%%i]%%" "%%array2[%%i]%%"
It's definitely not the best solution...but it works!
Thanks everyone for your help.

Loop through list, set variable

I'm trying to make a loop that goes through a file with a filename on each line, sets the first filename as a variable and executes the rest of the script. Then it should take the second line and do the same.
etc. etc.
The problem is that it only does the first line of filenames.txt
@echo off
for /F "tokens=*" %%G in (filenames.txt) do (
set filename=%%G
script
script
script
)
pause
It has to be a batch file.
The whole script:
@ECHO OFF
for /F "tokens=*" %%G in (filenames.txt) do (
SET FileName=%%G
SET Word1="ts_confirmImplicitSAMM.gram"
SET Word2="SWIrcnd"
for /f "tokens=3" %%f in ('find /c /i %Word1% %FileName%') do set PairsToShow=%%f
SET /a Lines1=0, Lines2=0
FOR /f "delims=" %%a IN ('findstr "%Word1%" "%FileName%"') DO (
SET "str=%%a"
SET /a Lines1+=1
SETLOCAL enabledelayedexpansion
SET "$1!Lines1!=!str!"
FOR /f "tokens=1*delims==" %%b IN ('set "$1"') DO (IF "!"=="" endlocal)&SET "%%b=%%c"
)
FOR /f "delims=" %%a IN ('findstr "%Word2%" "%FileName%"') DO (
SET "str=%%a"
SET /a Lines2+=1
SETLOCAL enabledelayedexpansion
SET "$2!Lines2!=!str!"
FOR /f "tokens=1*delims==" %%b IN ('set "$2"') DO (IF "!"=="" endlocal)&SET "%%b=%%c"
)
SET /a Lines=Lines1+Lines2
ECHO(%Lines% lines read from %FileName%.
IF %Lines1% leq %Lines2% (SET /a MaxPairs=Lines1) ELSE SET /a MaxPairs=Lines2
IF %PairsToShow% gtr %MaxPairs% (
ECHO only text for %MaxPairs% pairs NOT %PairsToShow% :/
GOTO :END
)
(FOR /l %%a IN (1,1,%PairsToShow%) DO (
SETLOCAL ENABLEDELAYEDEXPANSION
CALL SET "Line1=%%$1%%a%%"
CALL SET "Line2=%%$2%%a%%"
<NUL SET /p "=!Line1!"
ECHO !Line2!
ENDLOCAL
))>> result1.txt
ENDLOCAL
TYPE result1.txt| FINDSTR /V EVNT=SWIgrld >> result.txt
DEL result1.txt
PAUSE
)
Without seeing the rest of your script... you probably need to do one of two things:
1. Use SETLOCAL ENABLEDELAYEDEXPANSION (as the 2nd line of your script) and then reference the variable filename as !filename! instead of %filename%, so that you get the run-time value instead of the load-time value. But that could cause other problems, depending on what goes on in "script".
2. Just use %%G instead of filename.
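The difference between the two kinds of expansion can be seen in a small, self-contained sketch. The file names a.txt and b.txt are hypothetical stand-ins for the lines of filenames.txt, and nothing is read from disk.

```batch
@echo off
setlocal EnableDelayedExpansion
set "filename=initial"
rem %filename% is replaced once, when the whole parenthesized block is parsed.
rem !filename! is replaced each time the echo line actually runs.
for %%G in (a.txt b.txt) do (
    set "filename=%%G"
    echo percent=%filename% delayed=!filename! loopvar=%%G
)
```

This prints `percent=initial delayed=a.txt loopvar=a.txt` and then `percent=initial delayed=b.txt loopvar=b.txt`. The %filename% form keeps the value it had before the loop started, which is why a script that uses it inside the block never sees the current line.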

bat to replace text from text file1 to file2 (relative search)

textfile1 is source and textfile2 is target
Example textfile1.txt:
server1.net 2100 /l /n /k port:2000,server2.net 2100 /l /n /k port:20000
textfile2.txt:
server3.net 2000 /l /k port:xxxx,server4.net 2000 /l /k port:yyyyy
I need to find the "port:" text in textfile1.txt and substitute the port number that follows it (the n characters after "port:") into textfile2.txt.
Note: the port number in textfile1 varies, but the text "port:" is fixed.
Thanks
Basically, I need textfile2 to end up with the same port numbers as textfile1.
The Batch file below works with an unlimited list of servers in both text files and allows the port:nnnn option to be included at any place in the line. To keep it simple, it does not check for errors.
@echo off
setlocal EnableDelayedExpansion
rem Read data from both files
set /P "data1=" < textfile1.txt
set /P "data2=" < textfile2.txt
rem Replace "port" data in all servers
:nextServer
rem Get port number after "port:" in first data and eliminate it
for /F "tokens=1* delims=," %%a in ("%data1:*port:=%") do set port1=%%a& set data1=%%b
rem Replace port number in second data and change colon by semicolon
for /F "tokens=1 delims=," %%a in ("%data2:*port:=%") do set data2=!data2::%%a=;%port1%!
rem Pass to next server, if any
if defined data1 goto nextServer
rem Output result restoring colons
echo %data2:;=:%
For example, with this data:
server1.net 2100 /l /n /k port:2000,server2.net 2100 /l /n /k port:20000,server5.net 2100 /l /n /k port:1234
server3.net 2000 /l /k port:xxxx,server4.net 2000 /l /k port:yyyyy,server6.net 2000 /l /k port:zzzz
The output is:
server3.net 2000 /l /k port:2000,server4.net 2000 /l /k port:20000,server6.net 2000 /l /k port:1234
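The key step in the script above is the `%var:*text=%` substitution, which deletes everything up to and including the first occurrence of the text. Here is a minimal sketch of that single step, using the example line from the question; the variable names data and rest are chosen only for this illustration.

```batch
@echo off
setlocal
set "data=server1.net 2100 /l /n /k port:2000,server2.net 2100 /l /n /k port:20000"

rem Remove everything up to and including the first "port:"
set "rest=%data:*port:=%"
echo %rest%

rem The first comma-delimited token of the remainder is the port number;
rem the second token (*) is the rest of the line, still to be processed
for /F "tokens=1* delims=," %%a in ("%rest%") do (
    echo port=%%a
    echo remaining=%%b
)
```

The first echo prints `2000,server2.net 2100 /l /n /k port:20000`, followed by `port=2000` and `remaining=server2.net 2100 /l /n /k port:20000`. Repeating the same step on the remaining text yields the next port, which is what the goto loop in the script does.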
I don't see why you want to do this task with batch - it would be much simpler using another scripting language like VBScript, JScript, or perhaps PowerShell.
But here is a native batch solution. My solution makes no assumption as to the order of the options in each command. It allows for any number of options both before and after /port:nnnn.
@echo off
setlocal enableDelayedExpansion
::read the line from textfile1
set "str="
<"textfile1.txt" set /p "str="
if not defined str exit /b
::enclose each command in quotes
set "str="!str:,=","!""
::extract the port from each command into an array
set /a n=0
for %%A in (!str!) do (
set "cmd=%%~A"
set /a n+=1
for /f %%B in ("!cmd:*port:=!") do set "port!n!=%%B"
)
::read the line from textfile2
set "str="
<"textfile2.txt" set /p "str="
if not defined str exit /b
::enclose each command in quotes
set "str="!str:,=","!""
::process each command
set /a n=0
set "ln="
for %%A in (!str!) do (
set "cmd=%%~A"
REM break the command into 2 parts, discarding port:
REM %%B = before port
REM %%E = after port
for /f "tokens=1* delims=," %%B in ("!cmd:port:=,!") do (
for /f "tokens=1*" %%D in ("%%C") do (
set /a n+=1
REM transfer !n! into %%N
for %%N in (!n!) do (
REM build the new line
set "ln=!ln!,%%B port:!port%%N! %%E"
)
)
)
)
::remove space before comma
set "ln=!ln: ,=,!"
::write the result back to textfile2
>"textfile2.txt" echo !ln:~1!
A little more clarification on what you're looking to do in the long-term may help, but in the meantime, here is a script I just wrote and tested and it does exactly what you're asking.
for /f "tokens=6,12 delims=, " %%a in (textfile1.txt)do (
set portOne=%%a
set portTwo=%%b
call :writeTextTwo
)
type texttwo_temp>textfile2.txt
goto :EOF
:writeTextTwo
for /f "tokens=1-12 delims=, " %%c in (textfile2.txt)do (echo %%c %%d %%e %%f %portOne%,%%h %%i %%j %%k %portTwo% >>texttwo_temp )
goto :EOF
I hope this helps.

Resources