I adapted this script from another thread on Stack Overflow. The script runs, but produces incorrect output because of special characters (<, >, ", =) in the search string.
Basically, I just need to find <script src="https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js" type="text/javascript" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script> and remove it.
setlocal EnableExtensions DisableDelayedExpansion
set "search=<script src="https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js" type="text/javascript" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>"
set "replace="
set "textFile=index.html"
set "rootDir=."
for %%j in ("%rootDir%\%textFile%") do (
for /f "delims=" %%i in ('type "%%~j" ^& break ^> "%%~j"') do (
set "line=%%i"
setlocal EnableDelayedExpansion
set "line=!line:%search%=%replace%!"
>>"%%~j" echo(!line!
endlocal
)
)
endlocal
I have found other threads on Stack Overflow asking the same question, but I can't understand their implementations and how to apply them to this script.
Windows command processor cmd.exe is designed for executing commands and applications. It is not designed for modifying file contents, whatever the type of file.
Many script interpreters have built-in support for modifying file contents, such as VBScript, JScript, PowerShell, Perl, Python, ... So it is best to use a script interpreter other than Windows command processor for this task, especially if the search or replace string contains "<=>| which makes a file content modification with pure Windows command processor commands a nightmare.
However, this task is easy to achieve using JREPL.BAT written by Dave Benham, a batch file / JScript hybrid which runs a regular expression replace on a file using JScript.
@echo off
if not exist ".\index.html" goto :EOF
if not exist "%~dp0jrepl.bat" goto :EOF
call "%~dp0jrepl.bat" "[\t ]*<script src=\x22https://d1tdp7z6w94jbb.cloudfront.net/js/jquery-3.3.1.min.js\x22 type=\x22text/javascript\x22 integrity=\x22sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=\x22 crossorigin=\x22anonymous\x22></script>[\t ]*\r?\n?" "" /M /F ".\index.html" /O -
The batch file first checks if there is an index.html file in current directory and immediately exits if this condition is not true, see Where does GOTO :EOF return to?
The batch file JREPL.BAT must be stored in same directory as the batch file with the code above. For that reason the batch file checks next if JREPL.BAT really exists in directory of the batch file and exits if this condition is not true.
Next the batch file calls JREPL.BAT to do a case-sensitive regular expression replace with replace string being an empty string.
The search string is mainly the string which should be removed from the file.
Each " in the search string is replaced by \x22, the regular expression escape for the character with hexadecimal code value 22, which is the double quote. This makes it possible to pass the search string on the Windows command line as one argument string enclosed in double quotes.
The main search string contains just one kind of character with a special regular expression meaning: the dot, which matches any character except newline characters. As a dot in the expression also matches a literal dot in the file, the expression works as posted; each dot could be escaped as \. for a strict literal match. No other character must be escaped with a backslash to be interpreted as literal character by the regular expression function of JScript.
The main search string also does not contain any character with a special Windows command processor meaning even inside a double quoted argument string, like the percent sign %. Each % inside the search string would have to be escaped with one more % to be interpreted as literal character by cmd.exe, which parses this command line before calling the other batch file with the already parsed arguments.
The search expression starts with [\t ]* to additionally remove 0 or more horizontal tabs or normal spaces to the left of the string to remove. The string to remove is usually on a separate line in an HTML file, indented with tabs or spaces, and the goal is to remove those indenting whitespaces as well.
The search expression ends with [\t ]*\r?\n? to additionally remove 0 or more horizontal tabs or normal spaces to the right of the string to remove, i.e. trailing whitespaces on the line, plus one carriage return and one line-feed, each only if present.
So an entire line is removed from the file if the string to remove is on a separate line in the HTML file, with or without leading tabs/spaces and with or without trailing tabs/spaces. But if the string to remove is on a line with other HTML tags, just the searched string and the tabs/spaces to its left and right are removed from the HTML file. The JREPL.BAT option /M is used to be able to remove an entire line, instead of removing only the searched string and leaving behind an empty line where the script tag is on a separate line.
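As a rough cross-check of what this search expression matches, here is a minimal sketch in Python, one of the interpreters named above. It is not what JREPL.BAT runs internally; it only reuses the same expression with Python's re module, which understands \x22, \t, \r and \n in the same way. The function name remove_script_tag is made up for illustration, and the dots are escaped here for a strict literal match.

```python
import re

# Same expression as in the JREPL.BAT call, split over several lines.
# Assumption for this sketch: dots are escaped as \. for a strict match.
SCRIPT_TAG = re.compile(
    r"[\t ]*<script src=\x22https://d1tdp7z6w94jbb\.cloudfront\.net/js/jquery-3\.3\.1\.min\.js\x22"
    r" type=\x22text/javascript\x22"
    r" integrity=\x22sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=\x22"
    r" crossorigin=\x22anonymous\x22></script>[\t ]*\r?\n?"
)


def remove_script_tag(html):
    """Remove the tag, the tabs/spaces around it and one line ending."""
    return SCRIPT_TAG.sub("", html)
```

If the tag stands alone on an indented line, the whole line including its line ending disappears; if it shares a line with other tags, only the tag and the tabs/spaces next to it are removed.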
For understanding the used commands and how they work, open a command prompt window, execute there the following commands, and read entirely all help pages displayed for each command very carefully.
call /? ... explains also %~dp0 ... drive and path of argument 0 being the batch file itself.
echo /?
goto /?
if /?
jrepl.bat /?
I have the following batch file to make git diff invoke spreadsheet compare UI in windows. So I'm trying to pass the git diff's 2nd (old file) and 5th (new file) arguments to spreadsheet compare in order to make it compare the file using git diff.
So far, this batch file only handles files with NO spaces in the file names; it CANNOT handle files with spaces in the file names.
What code should I add to this script to make it handle files with spaces in their names:
@ECHO OFF
set path2=%5
set path2=%path2:/=\%
ECHO %2 > tmp.txt
dir %path2% /B /S >> tmp.txt
C:/"Program Files"/"Microsoft Office"/root/vfs/ProgramFilesX86/"Microsoft Office"/Office16/DCF/SPREADSHEETCOMPARE.EXE tmp.txt
It currently throws errors like this:
Unhandled Exception: System.ArgumentException: Illegal characters in path.
at System.IO.Path.CheckInvalidPathChars(String path, Boolean checkAdditional)
at System.IO.Path.GetFileName(String path)
at ProdianceExcelCompare.Form1.StatusReady()
at ProdianceExcelCompare.Form1.Init()
at ProdianceExcelCompare.Form1..ctor(String instructionFile)
at ProdianceExcelCompare.Program.Main(String[] args)
fatal: external diff died, stopping at London comparison.xlsx
See the following answers on Stack Overflow:
How to set environment variables with spaces?
Why is no string output with 'echo %var%' after using 'set var = text' on command line?
They explain the recommended syntax set "VariableName=variable value" to define an environment variable and the reasons recommending this syntax.
Why does ECHO command print some extra trailing space into the file?
It explains why the space character left to redirection operator > on an ECHO command line is also written into the file as trailing space and how to avoid this safely on variable text written into the file.
See also Microsoft documentation about Using command redirection operators.
On other command lines than ECHO a space left to > is usually no problem.
It is in general wrong to use " multiple times within an argument string like a file or folder path. There should be just one " at the beginning and one " at the end. This is explained by the help of Windows command processor, on the last help page output on running cmd /? in a command prompt window.
The Microsoft documentation about Naming Files, Paths, and Namespaces explains that the directory separator on Windows is \ and not / and therefore / should not be used in batch files on Windows in file/folder paths.
The help output on running in a command prompt window call /? explains how the arguments of a batch file can be referenced with which modifiers.
The code rewritten according to information posted above and on the referenced pages:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "path2=%~5"
set "path2=%path2:/=\%"
>"tmp.txt" echo %2
dir "%path2%" /B /S >>"tmp.txt" 2>nul
"%ProgramFiles%\Microsoft Office\root\vfs\ProgramFilesX86\Microsoft Office\Office16\DCF\SPREADSHEETCOMPARE.EXE" "tmp.txt"
endlocal
The first line in tmp.txt contains the second argument exactly as passed to the batch file, i.e. with or without surrounding double quotes.
The following code is necessary to write the second argument safely and always without " into file tmp.txt, even if the second argument passed to the batch file is "Hello & welcome!":
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "path2=%~5"
set "path2=%path2:/=\%"
set "Argument2=%~2"
setlocal EnableDelayedExpansion
echo !Argument2!>"tmp.txt"
endlocal
dir "%path2%" /B /S >>"tmp.txt" 2>nul
"%ProgramFiles%\Microsoft Office\root\vfs\ProgramFilesX86\Microsoft Office\Office16\DCF\SPREADSHEETCOMPARE.EXE" "tmp.txt"
endlocal
>tmp.txt echo %~2 cannot be used because it does not work for something like "Hello & welcome!". With the surrounding double quotes removed, Windows command processor would interpret & as command separator and would try to execute the string after & as command or application, this string ending at the next normal space, horizontal tab, comma, equal sign, or no-break space (in OEM code pages), as described by Single line with multiple commands using Windows batch file.
"tmp.txt" could also be written as just tmp.txt everywhere in both batch files. But it is never wrong to enclose the complete file/folder argument string in double quotes, even if that is not really necessary because the string does not contain a space or one of these characters &()[]{}^=;!'+,`~. So it is good practice to always enclose a complete file/folder argument string in double quotes. For example, replacing tmp.txt with %TEMP%\%~n0.tmp in both batch files would result in using, instead of tmp.txt in the current directory, a temporary file in the directory for temporary files with the name of the batch file as file name and the file extension .tmp, independent of the name of the batch file and the path of the directory for temporary files.
The last suggestion is reading this answer for details about the commands SETLOCAL and ENDLOCAL.
The temporary file should also be deleted before batch file execution reaches an exit point.
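For comparison, the same steps (quotes removed from argument 2, / replaced by \ in argument 5, a temporary instruction file that is deleted at the end) might look as follows in Python. This is only a sketch under assumptions: the function names and the action callback are hypothetical, the callback stands in for starting the compare tool, and the path is written as passed instead of being resolved to a full path as dir /B /S does.

```python
import os
import tempfile


def build_instruction_lines(old_arg, new_arg):
    """Roughly mirror %~2 and %path2:/=\\%: no outer quotes, backslashes."""
    old_clean = old_arg.strip('"')
    new_clean = new_arg.strip('"').replace("/", "\\")
    return [old_clean, new_clean]


def with_instruction_file(old_arg, new_arg, action):
    """Write the two lines to a temporary file, run action(path), clean up.

    action is a hypothetical callback, e.g. one that starts the compare tool.
    """
    fd, path = tempfile.mkstemp(suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            for line in build_instruction_lines(old_arg, new_arg):
                handle.write(line + "\n")
        return action(path)
    finally:
        # The temporary file is removed on every exit path.
        os.remove(path)
```

Because the values are handled as data and never pasted into a command line, a value like Hello & welcome! needs no special treatment here.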
You can use quotes, but the command START treats the first string in quotes as the title of the new command window. So pass an empty title string first, as below:
start "" "yourpath"
Found it in the below link :
https://ccm.net/forum/affich-16973-open-a-file-with-spaces-from-batch-file
I wanted to make my own command prompt. And I wanted a command in it with conditions (e.g. /h or something like that)
This is my command:
if %adm%=="admin /y" (goto add)
I tested it on the original command prompt and it says this:
/y="admin /y" was unexpected at this time.
I tried using carets (the ^ symbol) but it didn't work.
The command IF expects at least three arguments on comparing values:
The first argument is the first value: a string (or an integer).
The second argument is the comparison operator: ==, EQU, etc.
The third argument is the second value: a string (or an integer).
How IF executes the comparison is explained in detail in answer on Symbol equivalent to NEQ, LSS, GTR, etc. in Windows batch files. Run also if /? in a command prompt window to get displayed the help for this command on several window pages.
The argument separator on command line is the space character.
%adm% is replaced by the current value of environment variable adm during the preprocessing phase of the command line by Windows command interpreter cmd.exe, before the command line is executed at all. What is really executed after preprocessing can be seen by temporarily removing @echo off from the top of the batch file, changing it to @echo ON, or commenting out this line, and running the batch file from within a command prompt window instead of double clicking on the batch file. With adm having the value admin /y, the executed command line is:
if admin /y=="admin /y" (goto add)
The command IF interprets admin as first argument. The second argument is /y=="admin /y" which is definitely not a supported comparison operator which is the reason for the error message because that string is really not expected by IF.
An argument string containing a space character or one of the characters &()[]{}^=;!'+,`~<|> requires enclosing the entire argument string in double quotes to get the space and the other characters interpreted as literal characters of an argument string.
So a possible solution is:
if "%adm%"=="admin /y" goto add
This command line in the batch file is expanded during preprocessing to:
if "admin /y" == "admin /y" goto add
It can be seen that Windows command interpreter inserts a space before and after == being the second argument and a valid comparison operator.
The round brackets are removed as command IF is designed by default to execute a single command line. Parentheses are only needed if an ELSE branch is needed too, or multiple command lines should be executed depending on the condition. A command block starting with ( and ending with matching ) causes an extra step during preprocessing phase and should be avoided for that reason if not really needed.
But be aware that any double quote " in string value assigned to adm breaks again the IF condition on using if "%adm%" == "admin /y" goto add in the batch file. This can be seen on using for example
set /P "adm="
if "%adm%" == "admin /y" goto add
And the user enters on execution of this batch file:
" == "" echo rd /Q /S "C:\" & rem "
The command line in the batch file is expanded before execution to:
if "" == "" echo rd /Q /S "C:\" & rem "" == "admin /y" goto add
So rd /Q /S "C:\" is output, and without echo the batch file would start recursively deleting all directories in which the current user has the permissions to delete files and folders.
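The effect can be reproduced outside of cmd.exe by treating %adm% expansion as what it is, a plain text substitution done before the line is parsed. The following Python sketch makes that assumption explicit; expand_percent is a hypothetical helper and deliberately ignores everything else cmd.exe does during preprocessing.

```python
def expand_percent(line, name, value):
    """Replace %name% by value as plain text, before any parsing happens."""
    return line.replace("%" + name + "%", value)


template = 'if "%adm%" == "admin /y" goto add'

# What the user typed at the set /P prompt in the example above.
user_input = '" == "" echo rd /Q /S "C:\\" & rem "'

expanded = expand_percent(template, "adm", user_input)
print(expanded)
```

The printed line is the command line that is finally parsed and executed, which is why the input can change the command itself and not just the compared value.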
So if the string assigned to environment variable adm is input by a user of the batch file, it is highly recommended to use delayed environment variable expansion to avoid a modification of the command line during preprocessing phase to an invalid command line or a command line which does something completely different than it is designed for.
setlocal EnableExtensions EnableDelayedExpansion
rem Other command lines in the batch file.
set /P "adm="
if /I "!adm!" == "admin /y" goto add
The environment variable adm is referenced with usage of delayed environment variable expansion and therefore this command line can't be modified for execution by input of the user.
Additionally the optional parameter /I is used to make the string comparison case-insensitive.
The command processor expands the variable %adm% to its value before the IF command is parsed, so a value containing a space is split into several arguments and the comparison is no longer made against the whole value. To prevent this, surround the variable reference with quotation marks so that it is evaluated as a single string for comparison.
For a correct comparison, your code should look more like this, which compares the entire value of the variable with the string.
IF "%adm%"=="admin /y" goto add
@echo off & setlocal
set "search=jre1.8.0_?"
set "replace=jre1.8.0_156"
set "textfile=C:\Program Files\ABC\_ABC_installation\Uninstall_ABC.lax"
set "newfile=C:\Program Files\ABC\_ABC_installation\Uninstall_ABC1.lax"
pause;
call repl.bat "%search%" "%replace%" L < "%textfile%" >"%newfile%"
pause;
del "%textfile%"
rename "%newfile%" "%textfile%"
pause;
The batch file should match anything like jre1.8.0_121 or jre1.8.0_152.
So I want to replace jre1.8.0_? with jre1.8.0_156, but neither ? nor * works on replace.
The replace works fine on removing this wildcard character.
Deprecated repl.bat as well as JREPL.BAT use JScript which offers regular expression support.
In ECMAScript/JavaScript/JScript/Perl regular expression syntax \d+ or [0-9]+ matches 1 or more digits, i.e. a positive integer number.
So the command line to use in the batch file is one of those 2 lines on using jrepl.bat:
call jrepl.bat "jre1\.8\.0_\d+" "jre1.8.0_156" /F "%ProgramFiles%\ABC\_ABC_installation\Uninstall_ABC.lax" /O -
call jrepl.bat "jre1\.8\.0_[0-9]+" "jre1.8.0_156" /F "%ProgramFiles%\ABC\_ABC_installation\Uninstall_ABC.lax" /O -
Or use one of those two lines on still using repl.bat.
set "search=jre1\.8\.0_\d+"
set "search=jre1\.8\.0_[0-9]+"
The . has a special meaning in a regular expression search string. It means by default any character except newline characters (according to Unicode standard). To get . interpreted as literal character, it must be escaped with a backslash.
The option L must be removed on usage of repl.bat as the search string is not anymore a string to find literally, but a regular expression string.
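The difference between an escaped and an unescaped dot can be checked quickly with any regular expression engine of this family. Below is a minimal Python sketch; the sample strings and the name update_jre are made up for illustration, and Python's re module is used here only as a stand-in for JScript, which treats \d+ and \. the same way.

```python
import re

# Escaped dots match only literal dots; \d+ matches one or more digits.
JRE_VERSION = re.compile(r"jre1\.8\.0_\d+")


def update_jre(text):
    """Replace any jre1.8.0_<digits> by jre1.8.0_156."""
    return JRE_VERSION.sub("jre1.8.0_156", text)


print(update_jre(r"C:\Java\jre1.8.0_121\bin"))

# With unescaped dots the expression is looser than intended:
loose_match = re.search(r"jre1.8.0_\d+", "jre1x8y0_12") is not None
strict_match = JRE_VERSION.search("jre1x8y0_12") is not None
```

Here loose_match is true although the text contains no dots at all, while strict_match is false.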
The batch code using deprecated repl.bat can't work as long as input and output file have same name. And even on using for output file a different name, the posted batch code does not work because command rename requires that the new file name is specified without path.
The command move can be used to replace input file by modified output file.
The entire batch code that really works using deprecated repl.bat:
@echo off & setlocal
set "search=jre1\.8\.0_\d+"
set "replace=jre1.8.0_156"
set "textfile=%ProgramFiles%\ABC\_ABC_installation\Uninstall_ABC.lax"
set "newfile=%ProgramFiles%\ABC\_ABC_installation\Uninstall_ABC.tmp"
call repl.bat "%search%" "%replace%" <"%textfile%" >"%newfile%"
move /Y "%newfile%" "%textfile%"
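The same pattern of writing the result to a temporary file and then moving it over the original can be sketched in Python. This is an illustration under assumptions, not a replacement for the batch code: replace_in_file is a hypothetical helper, and the file is assumed to be UTF-8 text.

```python
import os
import re
import tempfile


def replace_in_file(path, pattern, replacement):
    """Regex-replace in a file via a temporary file in the same directory."""
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as source:
        text = source.read()
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as target:
            target.write(re.sub(pattern, replacement, text))
        # Counterpart of move /Y: replace the input file by the new file.
        os.replace(temp_path, path)
    except BaseException:
        os.remove(temp_path)
        raise
```

A call like replace_in_file(textfile, r"jre1\.8\.0_\d+", "jre1.8.0_156") would then correspond to the two last lines of the batch file, with textfile being the full name of the .lax file.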
Hint: Do not use a semicolon after command pause. There is absolutely no need for a semicolon. The semicolon after pause just tests the error correction handling of Windows command interpreter.
I'm using a batch file to format a text file so that I can append it to an already populated .csv file automatically each hour. My problem is, the text file's intended formatting isn't showing up in Notepad. It opens as a single line with 43 tokens. I only need tokens 30 - 43. Since I can not skip lines and I'm over the token limit, what are my options?
You don't state what your token delimiter is, or whether any of your values contain quoted delimiters, or whether any tokens are empty (consecutive delimiters with nothing between them).
Pure batch solution
I will assume that your input is comma delimited. It is trivial to change the delimiter used by FOR /F.
I will also assume no values contain comma literals, and there are no consecutive commas. A simple FOR /F cannot handle either situation. Both could be solved with batch, (assuming the line is < 8kb), but it is a bit painful. If you have these issues, then I think you are better off with some other language than batch.
A single FOR /F command cannot parse more than 31 tokens. For more info, see
https://stackoverflow.com/a/8520993/1012053
Number of tokens limit in a FOR command in a Windows batch script
But you don't need to parse any tokens past 29 :-)
You can simply use FOR /F with "tokens=29*", and the * "token" will contain tokens 30-43.
for /f "usebackq tokens=29* delims=," %%A in ("yourInputFile.ext") do (echo(%%B) >>yourOutput.csv
If your input delimiter is something other than a comma, then you can store tokens 30-43 in a variable, and then use find/replace to substitute commas for the original delimiter.
For example, if your input delimiter is a pipe, then
@echo off
setlocal
for /f "usebackq tokens=29* delims=|" %%A in ("yourInputFile.ext") do set "line=%%B"
setlocal enableDelayedExpansion
if defined line set "line=!line:|=,!"
(echo(!line!) >>yourOutput.csv
I did not put the manipulation and write operations within the loop because you state your input has only one line.
I do not enable delayed expansion until after the loop completes just in case your input contains ! literals. Expansion of FOR /F variables will corrupt ! values if delayed expansion is enabled.
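The idea behind "tokens=29*" (split off the first 29 tokens and keep the rest as one string) can be sketched in Python as follows. The function name tokens_from_30 is hypothetical. Unlike FOR /F, str.split does not merge consecutive delimiters, so this sketch relies on the same assumptions as above: no empty tokens and no quoted delimiters.

```python
def tokens_from_30(line, delimiter=","):
    """Return tokens 30 and up as one comma separated string."""
    # At most 29 splits: parts[0..28] are tokens 1-29, parts[29] is the rest.
    parts = line.split(delimiter, 29)
    remainder = parts[29] if len(parts) > 29 else ""
    # Same step as set "line=!line:|=,!" for a non-comma delimiter.
    return remainder.replace(delimiter, ",")


sample = "|".join("t%d" % number for number in range(1, 44))
print(tokens_from_30(sample, "|"))
```

For a line with fewer than 30 tokens the function returns an empty string.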
Robust JREPL.BAT solution (hybrid batch/JScript)
If your input violates any of the restrictions that I laid out in the pure batch solution, then you could use JREPL.BAT - A regular expression command line text processing utility. JREPL.BAT is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward - no 3rd party exe file required.
Since you did not specify your input format, I will assume it is CSV. The following solution will simply remove the first 29 tokens. It supports empty tokens, as well as quoted tokens with comma literals.
call jrepl "^(\q([^\q]|\q\q)*\q,|[^,]*,){29}" "" /x /f yourFile.ext >>yourOutput.csv
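To see what this expression consumes, it can be translated to Python's re syntax, reading \q as the double quote character. This is only an approximation for illustration and not the JREPL.BAT implementation; drop_first_29_fields is a hypothetical name.

```python
import re

# 29 repetitions of either a quoted field ("" is an escaped quote)
# or an unquoted field, each followed by its comma.
FIRST_29_FIELDS = re.compile(r'^(?:"(?:[^"]|"")*",|[^,]*,){29}')


def drop_first_29_fields(line):
    """Remove the first 29 CSV fields from one line."""
    return FIRST_29_FIELDS.sub("", line, count=1)


prefix = ["f%d" % number for number in range(27)] + ['"x,y"', ""]
rest = ["keep1", '"k,2"', "keep3"]
print(drop_first_29_fields(",".join(prefix + rest)))
```

The prefix in the example contains a quoted field with a comma and an empty field, the two cases a simple FOR /F cannot handle.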
I'm sure there is an efficient JREPL solution if your input is not csv format. But I am not going to waste my time trying to guess your format.
I'm building a script for Windows command line in which I try to check some filenames in a FOR loop, and then stripping off part of the filename into a variable for further use. Basically, what I want to happen is this:
List all files in a certain directory, splitting off the extension like .osm.pbf in this case.
Assign the filename to a variable.
Put the last 7 characters of the filename in another variable.
Compare this new variable to "-latest".
If the compare is true, cut a part of the variable containing the filename.
If the compare is false, take over the complete variable into another variable.
Through some trial and error and some searching online, I've arrived at this point (which still isn't doing what I want):
FOR /F "tokens=1-2 delims=." %%M IN ('DIR /b %VECTOR_WORKDIR%\*.osm.pbf') DO (
SET VECTOR_CURRENT_MAP2=%%M
ECHO !VECTOR_CURRENT_MAP2! >> %VECTOR_LOGFILE%
SET LAST_BIT_TEMP=!VECTOR_CURRENT_MAP2:~-7!
ECHO !LAST_BIT_TEMP! >> %VECTOR_LOGFILE%
SET LAST_BIT=!LAST_BIT_TEMP: =!
ECHO !LAST_BIT! >> %VECTOR_LOGFILE%
IF !LAST_BIT!=="-latest" (
SET VECTOR_CURRENT_MAP3=!VECTOR_CURRENT_MAP2:~0,-8!
ELSE
SET VECTOR_CURRENT_MAP3=!VECTOR_CURRENT_MAP2!
)
ECHO !VECTOR_CURRENT_MAP3! >> %VECTOR_LOGFILE%
)
This results in these lines in the log file, for the file basse-normandie-latest.osm.pbf:
basse-normandie-latest
-latest
-latest
ECHO is on.
The first echo is correct, although the filename has a trailing space. (So actually it's "basse-normandie-latest ".)
The second echo doesn't seem to take this trailing space into account, as it correctly gives "-latest" as the last 7 characters. This echo also has a trailing space (So actually it's "-latest ".)
The third echo is an attempt to clear the spaces from the variable (by using ": ="), but this results in another trailing space. (So actually it's "-latest ".)
The final echo after the IF statement (where I try to cut the "-latest" part from the filename), results in "ECHO is on".
I have SETLOCAL enabledelayedexpansion enableextensions declared at the top of my script.
Any thoughts on how to make this work, i.e. get rid of the trailing spaces to make the comparison work?
Thanks in advance for any pointers in the right direction!
A line like
ECHO !VECTOR_CURRENT_MAP2! >> %VECTOR_LOGFILE%
results in appending the value of the environment variable VECTOR_CURRENT_MAP2 to file with file name stored in environment variable VECTOR_LOGFILE with a trailing space because there is a space before redirection operator >> which is interpreted by Windows command processor as part of the string to output by command ECHO. This space must be removed to get the file name redirected into the log file without a trailing space.
In general, redirecting a variable string into a file with no space between the variable string and the redirection operator is risky if the variable string ends with a space and a digit being a valid handle number like 1 or 2 or 3, because the digit is then interpreted as the handle to redirect. There are several solutions to work around this problem, like specifying the redirection to the left of command ECHO, i.e.
>>%VECTOR_LOGFILE% ECHO !VECTOR_CURRENT_MAP2!
But when using delayed expansion, which is necessary here anyway, it is safe to append the redirection at the end without a space between the exclamation mark and >>, i.e.
ECHO !VECTOR_CURRENT_MAP2!>> %VECTOR_LOGFILE%
The space after redirection operator is ignored by Windows command processor and therefore can be kept although many batch file programmers (like me) with good syntax highlighting don't insert a space after a redirection operator.
On comparing strings with command IF and enclosing one string in double quotes which is always a good idea, it must be made sure that the other string is also enclosed in double quotes. The command IF does not remove the double quotes before comparing the strings. The double quotes are parts of the compared strings.
The condition
IF !LAST_BIT!=="-latest"
is only true if the string assigned to environment variable LAST_BIT would be with surrounding quotes which is never the case with your batch code and therefore the condition is never true.
Correct would be:
IF "!LAST_BIT!"=="-latest"
There is no need to use command DIR to search for files with a pattern in a directory as command FOR is designed for doing exactly this task. Processing of output of command DIR is an extension of FOR available only if command extensions are enabled as by default.
The file extension is defined by Microsoft as everything after the last dot in the name of a file. Therefore the file extension of your files is pbf (or .pbf with the dot), and .osm belongs to the file name.
Command FOR offers several modifiers to get specific parts of a file or directory name. Those modifiers are explained in help output into console window on running in a command prompt window for /?. Help of command CALL output with call /? explains the same for processing parameters of a batch file or subroutine (batch file embedded within a batch file).
Your code with all mistakes removed:
FOR %%M IN (*.osm.pbf) DO (
SET "VECTOR_CURRENT_MAP2=%%~nM"
SET "VECTOR_CURRENT_MAP2=!VECTOR_CURRENT_MAP2:~0,-4!"
ECHO !VECTOR_CURRENT_MAP2!>>%VECTOR_LOGFILE%
SET "LAST7CHARS=!VECTOR_CURRENT_MAP2:~-7!"
ECHO !LAST7CHARS!>>%VECTOR_LOGFILE%
IF "!LAST7CHARS!" == "-latest" (
SET "VECTOR_CURRENT_MAP3=!VECTOR_CURRENT_MAP2:~0,-7!"
) ELSE (
SET "VECTOR_CURRENT_MAP3=!VECTOR_CURRENT_MAP2!"
)
ECHO !VECTOR_CURRENT_MAP3!>>%VECTOR_LOGFILE%
)
Easier is the following code, which uses the string substitution feature of command SET, i.e. it searches case-insensitively within a string for all occurrences of a string and replaces them with another string, which can also be an empty string.
FOR %%M IN (*.osm.pbf) DO (
SET "VECTOR_CURRENT_MAP2=%%~nM"
SET "VECTOR_CURRENT_MAP2=!VECTOR_CURRENT_MAP2:~0,-4!"
ECHO !VECTOR_CURRENT_MAP2!>>%VECTOR_LOGFILE%
SET "VECTOR_CURRENT_MAP3=!VECTOR_CURRENT_MAP2:-latest=!"
ECHO !VECTOR_CURRENT_MAP3!>>%VECTOR_LOGFILE%
)
%%~nM is replaced on execution by Windows command processor by the name of the file without drive, path and file extension resulting for your example in basse-normandie-latest.osm.
The unwanted file name part .osm is removed with the next line in both batch code blocks which chops the last 4 characters from the file name string.
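For readers who want to check the name handling outside of a batch file, the same steps can be sketched in Python. The function map_name is hypothetical and differs from the batch code in two labeled ways: it removes .osm only if the name really ends with it, and it removes -latest only at the end of the name, whereas the SET substitution removes every occurrence.

```python
from pathlib import PurePath


def map_name(file_name):
    """Return the map name without .osm.pbf and without a -latest suffix."""
    # Like %%~nM: only the part after the last dot counts as extension.
    stem = PurePath(file_name).stem
    # Assumption: chop ".osm" only if present (the batch code chops 4
    # characters unconditionally).
    if stem.lower().endswith(".osm"):
        stem = stem[:-4]
    # Assumption: remove "-latest" only as a suffix, case-insensitively.
    if stem.lower().endswith("-latest"):
        stem = stem[:-7]
    return stem


print(map_name("basse-normandie-latest.osm.pbf"))
```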
For understanding the used commands and how they work, open a command prompt window, execute there the following commands, and read entirely all help pages displayed for each command very carefully.
echo /?
for /?
if /?
set /?
Read the answer on question Why is no string output with 'echo %var%' after using 'set var = text' on command line? for an explanation why I used set "variable=value" on every line which assigns a value string to an environment variable because trailing whitespaces are critical for your task.