Character escaping in variable name - batch-file

Is someone able to explain how cmd handles carets in the following examples?
C:\>set ^^=test
C:\>echo %^%
test
C:\>echo ^%^%
test
C:\>echo %^^%
%^%
I figured that %^% would be handled as simply %%. I assume that the variable expansion is handled before the caret is considered; however, that is a half-arsed answer to a question that I'm sure could be explained more eloquently.
In batch -
@echo off
set ^^=test
echo %^%
echo ^%^%
echo %^^%
--
C:\>test.bat
test
test
ECHO is off.

This is because of the order in which Batch processes each command line. Simply put, variable expansion is performed before special characters are analyzed. That is why the caret is consumed by the variable expansion before it can be removed as an escape character. This is also why percent characters have to be escaped by themselves (%%) instead of by the standard caret (^) escape character.
Phase/order
1) Phase(Percent):
A double %% is replaced by a single %
Expansion of argument variables (%1, %2, etc.)
Expansion of %var%; if var does not exist, replace it with nothing
For a complete explanation, read dbenham's answer in the same thread (percent expansion).
1.5) Remove all <CR> (CarriageReturn 0x0d) from the line
2) Phase(Special chars, " <LF> ^ & | < > ( ): Look at each character
If it is a quote (") toggle the quote flag, if the quote flag is active, the following special characters are no longer special: ^ & | < > ( ).
If it is a caret (^), the next character has no special meaning and the caret itself is removed. If the caret is the last character of the line, the next line is appended, and the first character of the next line is always handled as an escaped character.
<LF> stops the parsing immediately, but not with a caret in front
For a full and great explanation (seriously bookmark this link!) see the answers here:
How does the Windows Command Interpreter (CMD.EXE) parse scripts?
Direct Answer: https://stackoverflow.com/a/4095133/891976
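The two phases above can be imitated with a small Python sketch. It is only a toy model under stated assumptions, not a reimplementation of cmd.exe: the names expand_percent, strip_carets and parse are made up for illustration, argument expansion (%1) and the %var:...% modifiers are left out, and the dictionary env stands in for the environment.

```python
def expand_percent(line, env, batch_mode=True):
    """Phase 1 (toy model): expand %var% before any caret is looked at."""
    out = []
    i = 0
    while i < len(line):
        if line[i] != "%":
            out.append(line[i])
            i += 1
            continue
        if batch_mode and line.startswith("%%", i):
            out.append("%")  # %% -> % (batch files only)
            i += 2
            continue
        end = line.find("%", i + 1)
        name = line[i + 1:end] if end != -1 else None
        if name is not None and name.upper() in env:
            out.append(env[name.upper()])
            i = end + 1
        elif batch_mode:
            # Undefined variable (or a lone %) expands to nothing in a batch file.
            i = end + 1 if end != -1 else i + 1
        else:
            # Assumption: the command line keeps undefined references as typed.
            out.append("%")
            i += 1
    return "".join(out)


def strip_carets(line):
    """Phase 2 (toy model): a caret outside quotes escapes the next character."""
    out = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "^" and not in_quotes:
            i += 1
            if i < len(line):
                out.append(line[i])
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def parse(line, env, batch_mode=True):
    return strip_carets(expand_percent(line, env, batch_mode))


env = {"^": "test"}  # what "set ^^=test" leaves behind: a variable named ^
print(parse("echo %^%", env))                      # echo test
print(parse("echo ^%^%", env))                     # echo test
print(parse("echo %^^%", env, batch_mode=False))   # echo %^%
```

In batch mode the last line becomes a bare "echo " (the undefined variable ^^ vanishes), which is why the script prints "ECHO is off.".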

In addition to David Ruhrmann's answer...
You create a variable named ^, as the parser will escape the second caret and remove the first one in the statement set ^^=test.
As David explained, the percent expansion phase is the first phase, so it can expand even strange expressions like a <CR> character, but it's also the reason why you can't build a multiline percent expansion.
First the percents are expanded (which fails, as there is only one on the line) and then the multiline caret is used to append the next line.
echo %va^
r%
But really confusing is the next caret example
set "^=one caret"
set "^^=two carets"
echo "%^%"
call echo "%%^%%"
The output is
"one caret"
"two carets"
It's because carets will be doubled by a CALL

Related

Why does FOR fail with delayed sub-string expansion after IN?

Sub-string expansion works within the set of a for loop (that is the parenthesised part after in) when immediate expansion is used (write %%I instead of %I in a batch file):
set "X=123"
for %I in (%X:~1,1%) do @echo %I
However, it fails when delayed expansion is applied:
for %I in (!X:~1,1!) do @echo %I
I would expect the output to be:
2
But instead it is:
!X:~1
1!
Why is this and how can I prevent that?
I know I could work around that by quoting the set and using ~, but this is not what I want here:
for %I in ("!X:~1,1!") do @echo %~I
The following command line fails too:
for %I in (!X:*2=!) do @echo %I
The unexpected output is:
!
Command lines using the /R, /L and /F switches also fail with such sub-string syntax.
It surely has got something to do with the fact that , and = are token separators for cmd, just like SPACE, TAB, ;, etc.
According to the thread How does the Windows Command Interpreter (CMD.EXE) parse scripts? and also numerous comments here, the answer lies in the special way for loops are parsed.
The key is the following excerpt of this answer (see the italic text in particular):
Phase 2) Process special characters, tokenize, and build a cached command block:
[...]
Three commands get special handling - IF, FOR, and REM
[...]
FOR is split in two after the DO. A syntax error in the FOR construction will result in a fatal syntax error.
The portion through DO is the actual FOR iteration command that flows all the way through phase 7
All FOR options are fully parsed in phase 2.
The IN parenthesized clause treats <LF> as <space>. After the IN clause is parsed, all tokens are concatenated together to form a single token.
Consecutive token delimiters collapse into a single space throughout the FOR command through DO.
Due to the fact that delayed expansion happens after parsing of for and the described behaviour, token separators like SPACE, TAB, ,, ;, =, etc. become converted to a single SPACE, hence a sub-string expansion expression like !X:~1,1! is changed to !X:~1 1!, and a sub-string substitution expression like !X:*2=! is changed to !X:*2 !, which both are invalid syntax.
Therefore to solve the issue you need to escape the token separators by ^ like:
for %I in (!X:~1^,1!) do @echo %I
and:
for %I in (!X:*2^=!) do @echo %I
(By the way, there is a very similar problem with if statements.)
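The collapsing of token delimiters described above can be sketched in Python. This is a toy model only: normalize_in_clause is a hypothetical name, quotes are not handled, and the delimiter set is the one listed in the question (space, tab, comma, semicolon, equals sign).

```python
DELIMITERS = " \t,;="


def normalize_in_clause(text):
    """Toy model: collapse runs of unescaped delimiters into a single space."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "^" and i + 1 < len(text):
            out.append(text[i + 1])  # escaped character is kept literally
            i += 2
        elif ch in DELIMITERS:
            while i < len(text) and text[i] in DELIMITERS:
                i += 1
            out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(normalize_in_clause("!X:~1,1!"))             # !X:~1 1!
print(normalize_in_clause("!X:~1,1!").split(" "))  # ['!X:~1', '1!']
print(normalize_in_clause("!X:~1^,1!"))            # !X:~1,1!
```

The second line mirrors the two unexpected loop iterations (!X:~1 and 1!) from the question; with the caret in place the expression reaches the delayed expansion phase in one piece.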

Sub-string expansion with empty string causes error in If clause

I have the following code snippet:
if "%ARGV:~,1%"==":" echo %ARGV% begins with a colon.
As long as variable ARGV contains a non-empty value, or correctly said, it is defined, everything works as expected, hence if the string in ARGV begins with a colon, the echo command is executed.
However, as soon as I clear variable ARGV, a syntax error arises:
echo was unexpected at this time.
What is going on here? The syntax is perfectly fine, but why does that command line fail?
Even one of the most helpful threads here, How does the Windows Command Interpreter (CMD.EXE) parse scripts?, for such things does not deliver an explanation for this behaviour.
When I do the same directly in command prompt, everything is in order. Moreover, when I try it using delayed expansion no error occurs either.
My companion answer to jeb's answer to "How does the Windows Command Interpreter (CMD.EXE) parse scripts?" does explain the behavior.
My companion answer gives the necessary details on how % expansion works to fully predict the behavior.
If you keep ECHO ON, then you can see the result of the expansion, and the error message makes sense:
test.bat
@echo on
@set "ARGV="
if "%ARGV:~,1%"==":" echo %ARGV% begins with a colon.
-- output --
C:\test>test
echo was unexpected at this time.
C:\test>if "~,1" echo begins with a colon.
The important rules from my answer that explain the expansion result are:
1)(Percent) Starting from left, scan each character for %. If found then
  1.1 (escape %) ... not relevant
  1.2 (expand argument) ... not relevant
  1.3 (expand variable)
    Else if command extensions are disabled then ... not relevant
    Else if command extensions are enabled then
      Look at next string of characters, breaking before % : or <LF>, and call them VAR (may be an empty list). If VAR breaks before : and the subsequent character is % then include : as the last character in VAR and break before %.
      If next character is % then
        Replace %VAR% with value of VAR (replace with nothing if VAR not defined) and continue scan
      Else if next character is : then
        If VAR is undefined then
          Remove %VAR: and continue scan.
        ... Remainder is not relevant
Starting with
if "%ARGV:~,1%"==":" echo %ARGV% begins with a colon.
The variable expansion expands all of the following strings to nothing because the variable is undefined:
%ARGV:
%"==":
%ARGV%
And you are left with:
if "~,1" echo begins with a colon.
It works with delayed expansion because the IF statement is parsed before delayed expansion (explained in jeb's answer within phase 2)
Everything works from the command line because the command line variable expansion does not remove the string when the variable is not defined. (loosely explained in jeb's answer near the bottom within CmdLineParser:, Phase1(Percent))
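The quoted rules can be traced with a short Python sketch. It is a toy model of rule 1.3 for batch files only, under these assumptions: expand_batch is a hypothetical name, env maps upper-case variable names to values, and modifiers on defined variables (the "remainder" of the rule) are deliberately not implemented.

```python
def expand_batch(line, env):
    """Toy model of batch-mode percent expansion (rule 1.3 above)."""
    out = []
    i = 0
    n = len(line)
    while i < n:
        if line[i] != "%":
            out.append(line[i])
            i += 1
            continue
        if i + 1 < n and line[i + 1] == "%":
            out.append("%")  # rule 1.1: %% -> %
            i += 2
            continue
        # Collect VAR: everything up to the next % or :
        j = i + 1
        while j < n and line[j] not in "%:":
            j += 1
        # "If VAR breaks before : and the subsequent character is %",
        # the colon belongs to the name.
        if j + 1 < n and line[j] == ":" and line[j + 1] == "%":
            j += 1
        name = line[i + 1:j].upper()
        if j < n and line[j] == "%":
            out.append(env.get(name, ""))  # undefined -> nothing
            i = j + 1
        elif j < n and line[j] == ":":
            if name in env:
                raise NotImplementedError("modifiers are outside this sketch")
            i = j + 1  # undefined: remove %VAR: and continue the scan
        else:
            i += 1  # lone % is dropped
    return "".join(out)


line = 'if "%ARGV:~,1%"==":" echo %ARGV% begins with a colon.'
result = expand_batch(line, {})
# Whitespace is collapsed here only to make the result easy to compare.
print(" ".join(result.split()))  # if "~,1" echo begins with a colon.
```

The three removed pieces are exactly %ARGV:, %"==": and %ARGV%, which leaves an IF without a comparison operator.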

Escaping exclamation marks required in replace string but not in search string (substring replacement with delayed expansion on)?

Supposing one wants to replace certain substrings by exclamation marks using the substring replacement syntax while delayed expansion is enabled, they have to use immediate (normal) expansion, because the parser cannot distinguish between !s for expansion and literal ones.
However, why does one have to escape exclamation marks in the replacement string? And why is it not necessary and even disruptive when exclamation marks in the search string are escaped?
The following script replaces !s in a string by ` and in reverse order afterwards, so I expect the result to be equal to the initial string (which must not contain any back-ticks on its own of course):
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem This is the test string:
set "STRING=string!with!exclamation!marks!"
set "DELOFF=%STRING%"
set "DELOFF=%DELOFF:!=`%"
set "DELOFF=%DELOFF:`=!%"
setlocal EnableDelayedExpansion
set "DELEXP=!STRING!"
set "DELEXP=%DELEXP:!=`%"
set "DELEXP=%DELEXP:`=!%"
echo(original string: !STRING!
echo(normal expansion: !DELOFF!
echo(delayed expansion: !DELEXP!
endlocal
endlocal
exit /B
This result is definitely not what I want, the last string is different:
original string: string!with!exclamation!marks!
normal expansion: string!with!exclamation!marks!
delayed expansion: stringexclamation
As soon as I take the line:
set "DELEXP=%DELEXP:`=!%"
and replace the ! by ^! there, hence escaping the exclamation mark in the replace string, the result is exactly what I expect:
original string: string!with!exclamation!marks!
normal expansion: string!with!exclamation!marks!
delayed expansion: string!with!exclamation!marks!
When I try other escaping combinations though (escape the exclamation mark in both the replace and the search string, or in the latter only), the result is again the aforementioned unwanted one.
I walked through the post How does the Windows Command Interpreter (CMD.EXE) parse scripts? but I could not find an explanation for that behaviour, because I learned that the normal (or immediate, percent) expansion is accomplished long before delayed expansion occurs and any exclamation marks are even recognised. Also, caret recognition and escaping seem to happen afterwards. In addition, there are even quotation marks around the strings, which usually hide carets from the parser.
Actually, for the substring replacement itself there is no escaping required. It becomes necessary for the later parsing phases only. This is why:
However, why does one have to escape exclamation marks in the replacement string?
The thing is that immediate (normal, %) expansion is done at quite an early stage, whereas delayed expansion (!), as the name implies, is accomplished as one of the last steps. Hence an immediately expanded string also passes through the delayed expansion phase. As proof, set variables VAR to Value!X! and X to 0, then execute echo %VAR%; you will get Value0 as the result.
But back to the initial question, when using immediate substring replacement, the replacement string is part of the expanded value, so it is also passed through the delayed expansion phase. Therefore, a literal exclamation mark must be escaped in order not to be consumed by the delayed expansion. This implies that the escaping is not needed for the replacement itself, it is actually done afterwards, so the given replace string including the escaping is applied literally.
And why is it not necessary and even disruptive when exclamation marks in the search string are escaped?
Since caret recognition and so escaping happens after immediate expansion, the search string is treated literally. Furthermore, the search string is replaced and therefore not included in output of immediate substring replacement, so it is not passed through the delayed expansion phase.
Let us look at the original example (excerpt only):
set "STRING=string!with!exclamation!marks!"
setlocal EnableDelayedExpansion
set "DELEXP=!STRING!"
set "DELEXP=%DELEXP:!=`%"
set "DELEXP=%DELEXP:`=!%"
echo(delayed expansion: !DELEXP!
endlocal
The replacement set "DELEXP=%DELEXP:!=`%" searches for !. The resulting value is string`with`exclamation`marks`.
Using set "DELEXP=%DELEXP:^!=`%" would search for ^! literally, so no occurrences would be found of course (so all the literal ! in the original string were kept, they were processed by delayed expansion finally).
The replacement set "DELEXP=%DELEXP:`=!%" replaces ` by ! perfectly, the resulting string is string!with!exclamation!marks!, but these exclamation marks are consumed by delayed expansion afterwards.
The escaped replacement %DELEXP:`=^!% replaces ` by ^! literally, so the result is string^!with^!exclamation^!marks^!; the escaping is processed afterwards during the delayed expansion phase, resulting in literal ! and the return string string!with!exclamation!marks! finally.
According to the post How does the Windows Command Interpreter (CMD.EXE) parse scripts?, there is a second phase where escaping occurs, which is the delayed expansion phase. This is the one that applies for the example in the original question, because the first escaping (during the special character recognition phase) is disabled due to the surrounding quotation marks (omitting such would lead to the need of double-escaping like ^^!).
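The interplay of the two steps can be imitated in Python. This is a toy model under stated assumptions: delayed_expand is a hypothetical name, env maps upper-case variable names to values, the immediate substring replacement is represented by a plain str.replace, and modifiers inside !var! are ignored.

```python
def delayed_expand(text, env):
    """Toy model of the delayed expansion phase, including its caret escaping."""
    if "!" not in text:
        return text  # carets are only consumed here when a ! is present
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "^":
            if i + 1 < len(text):
                out.append(text[i + 1])  # escaped character, caret removed
            i += 2
        elif ch == "!":
            end = text.find("!", i + 1)
            if end == -1:
                i += 1  # lone ! is removed
            else:
                out.append(env.get(text[i + 1:end].upper(), ""))
                i = end + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


value = "string`with`exclamation`marks`"

# %DELEXP:`=!% : the replacement happens first, delayed expansion afterwards.
print(delayed_expand(value.replace("`", "!"), {}))   # stringexclamation

# %DELEXP:`=^!% : the caret survives the replacement and protects each !.
print(delayed_expand(value.replace("`", "^!"), {}))  # string!with!exclamation!marks!

# The "proof" from the answer: VAR=Value!X!, X=0, echo %VAR%
print(delayed_expand("Value!X!", {"X": "0"}))        # Value0
```

In the first case !with! and !marks! are read as references to undefined variables, which is why only stringexclamation is left.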

Splitting the parameter in a batch file multiple times

Sample batch execution:
test.bat /s v1.1 1,3,4,5
I want to split the parameter into three tokens using space as a delimiter. The result should be:
1st token = /s
2nd token = v1.1
3rd token = 1,3,4,5
Then the 3rd token will be split again using comma as a delimiter
The code below splits the arguments using common delimiters such as space, comma, etc.
@ECHO OFF
SET PARAMS=
:_PARAMS_LOOP
SET PARAMS=%PARAMS%%1
ECHO %1
SHIFT
IF NOT "%1"=="" GOTO _PARAMS_LOOP
Execution:
test.bat /s v4.1 1,3,4,5
Result:
/s
v4.1
1
3
4
5
I just want to use space as a delimiter, then in the 3rd token(1,3,4,5) I will split it again using comma as a delimiter and echo each of it.
The issue is that cmd recognizes a space, tab, comma, semicolon, or equals sign as command line delimiters unless they are wrapped in doublequotes.
Delimiters
Some characters in the command line are ignored by batch files, depending on the DOS version, whether they are "escaped" or not, and often depending on their location in the command line:
commas (",") are replaced by spaces, unless they are part of a string in doublequotes
semicolons (";") are replaced by spaces, unless they are part of a string in doublequotes
"=" characters are sometimes replaced by spaces, not if they are part of a string in doublequotes
the first forward slash ("/") is replaced by a space only if it immediately follows the command, without a leading space
multiple spaces are replaced by a single space, unless they are part of a string in doublequotes
tabs are replaced by a single space
leading spaces before the first command line argument are ignored
I know of several occasions where these seemingly useless "features" proved very handy. Keep in mind, though, that these "features" may vary with the operating systems used.
More on command line parsing can be found on the PATH and FOR (especially FOR's interactive examples) pages.
http://www.robvanderwoude.com/parameters.php
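To make the effect of the delimiters concrete, here is a small Python sketch of how the arguments end up being split. It is a toy model, not cmd's actual algorithm: split_args is a hypothetical name, and only the delimiters named above (space, tab, comma, semicolon, equals sign) plus double quotes are modelled.

```python
DELIMITERS = " \t,;="


def split_args(command_tail):
    """Toy model: split on unquoted delimiters, keep quoted strings whole."""
    args, current, in_quotes = [], [], False
    for ch in command_tail:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch in DELIMITERS and not in_quotes:
            if current:
                args.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        args.append("".join(current))
    return args


print(split_args("/s v1.1 1,3,4,5"))    # ['/s', 'v1.1', '1', '3', '4', '5']

quoted = split_args('/s v1.1 "1,3,4,5"')
print(quoted)                           # ['/s', 'v1.1', '"1,3,4,5"']

# Second split: strip the quotes from the third token, then split on commas.
print(quoted[2].strip('"').split(","))  # ['1', '3', '4', '5']
```

So passing the list in doublequotes keeps it together as the third argument; inside the batch file, %~3 yields it without the quotes, and it can then be split on commas in a second step.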

Ignore percent sign in batch file

I have a batch file which moves files from one folder to another. The batch file is generated by another process.
Some of the files I need to move have the string "%20" in them:
move /y "\\myserver\myfolder\file%20name.txt" "\\myserver\otherfolder"
This fails as it tries to find a file with the name:
\\myserver\myfolder\file0name.txt
Is there any way to ignore %? I'm not able to alter the file generated to escape this, such as by doubling percent signs (%%), escaping with / or ^ (caret), etc.
You need to use %% in this case. Normally using a ^ (caret) would work, but for % signs you need to double up.
The same applies to %%1, %%i or echo.%%~dp1, because % indicates input either from a command or from a variable (when surrounded with %, as in %variable%).
To achieve what you need:
move /y "\\myserver\myfolder\file%%20name.txt" "\\myserver\otherfolder"
I hope this helps!
The question's title is very generic, which inevitably draws many readers looking for a generic solution.
By contrast, the OP's problem is exotic: needing to deal with an auto-generated batch file that is ill-formed and cannot be modified: % signs are not properly escaped in it.
The accepted answer provides a clever solution to the specific - and exotic - problem, but is bound to create confusion with respect to the generic question.
If we focus on the generic question:
How do you use % as a literal character in a batch file / on the command line?
Inside a batch file, always escape % as %%, whether in unquoted strings or not; the following yields My %USERNAME% is jdoe, for instance:
echo My %%USERNAME%% is %USERNAME%
echo "My %%USERNAME%% is %USERNAME%"
On the command line (interactively) - as well as when using the shell-invoking functions of scripting languages - the behavior fundamentally differs from that inside batch files: technically, % cannot be escaped there and there is no single workaround that works in all situations:
In unquoted strings, you can use the "^ name-disrupter" trick: for simplicity, place a ^ before every % char, but note that you're not technically escaping % that way (see below for more); e.g., the following again yields something like My %USERNAME% is jdoe:
echo My ^%USERNAME^% is %USERNAME%
In double-quoted strings, you cannot escape % at all, but there are workarounds:
You can use unquoted strings as above, which then requires you to additionally ^-escape all other shell metacharacters, which is cumbersome; these metacharacters are: <space> & | < > "
Alternatively, unless you're invoking a batch file, you can individually double-quote % chars as part of a compound argument (most external programs and scripting engines parse a compound argument such as "%"USERNAME"%" as verbatim string %USERNAME%):
some_exe My "%"USERNAME"%" is %USERNAME%
From scripting languages, if you know you're calling a binary executable, you may be able to avoid the whole problem by forgoing the shell-invoking functions in favor of the "shell-free" variants, such as using execFileSync instead of execSync in Node.js.
Optional background information re command-line (interactive) use:
Tip of the hat to jeb for his help with this section.
On the command line (interactively), % can technically not be escaped at all; while ^ is generally cmd.exe's escape character, it does not apply to %.
As stated, there is no solution for double-quoted strings, but there are workarounds for unquoted strings:
The reason that "^ name-disrupter" trick (something like ^%USERNAME^%) works is:
It "disrupts" the variable name; that is, in the example above cmd.exe looks for a variable named USERNAME^, which (hopefully) doesn't exist.
On the command line - unlike in batch files - references to undefined variables are retained as-is.
Technically, a single ^ inside the variable name - anywhere inside it, as long as it's not next to another ^ - is sufficient, so that %USERNAME^%, for instance, would be sufficient, but I suggest adopting the convention of methodically placing ^ before each and every % for simplicity, because it also works for cases such as up 20^%, where the disruption isn't even necessary, but is benign, so you can apply it methodically, without having to think about the specifics of the input string.
A ^ before an opening %, while not necessary, is benign, because ^ escapes the very next character, whether that character needs escaping - or, in this case, can be escaped - or not. The net effect is that such ^ instances are ultimately removed from unquoted strings.
Largely hypothetical caveat: ^ is actually a legal character in variable names (see jeb's example in the comments); if your variable name ends with ^, simply place the "disruptive" ^ somewhere else in the variable name, as long as it's not directly next to another ^ (as that would cause a ^ to appear in the resulting string).
That said, in the (very unlikely) event that your variable has a name such as ^b^, you're out of luck.
In batch files, the percent sign may be "escaped" by using a double percent sign ( %% ).
That way, a single percent sign will be used within the command line (from http://www.robvanderwoude.com/escapechars.php).
I think I've got a partial solution working. If you're only looking to transfer files that have the "%20" string in their name and not looking for a broader solution, you can make a second batch file call the first with %%2 as the second parameter. This way, when your program tries to fetch the second parameter when it hits the %2 in the text name, it will replace the %2 with an escaped %2, leaving the file name unchanged.
Hope this works!
How to "escape" inside a batch file without modifying the file
The original question is about a generated file, that can't be modified, but contains lines like:
move /y "\\myserver\myfolder\file%20name.txt" "\\myserver\otherfolder"
That can be partly solved by calling the script with proper arguments (%1, %2, ...)
@echo off
set "per=%%"
call generated_file.bat %%per%%1 %%per%%2 %%per%%3 %%per%%4
This simply sets the arguments to:
arg1="%1"
arg2="%2"
...
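Why this works can be shown with a toy model of batch-mode percent expansion in Python. It is only a sketch under stated assumptions: expand_batch_line is a hypothetical name, args[0] plays the role of %0, env maps upper-case variable names to values, and modifiers such as %~1 or %var:...% are not modelled.

```python
def expand_batch_line(line, args, env):
    """Toy model: %% -> %, %N -> argument N, %name% -> variable value."""
    out = []
    i = 0
    n = len(line)
    while i < n:
        if line[i] != "%":
            out.append(line[i])
            i += 1
        elif i + 1 < n and line[i + 1] == "%":
            out.append("%")  # escaped percent sign
            i += 2
        elif i + 1 < n and line[i + 1] in "0123456789":
            idx = int(line[i + 1])
            # A missing argument expands to nothing; the value is not rescanned.
            out.append(args[idx] if idx < len(args) else "")
            i += 2
        else:
            end = line.find("%", i + 1)
            if end == -1:
                i += 1  # lone % is dropped
            else:
                out.append(env.get(line[i + 1:end].upper(), ""))
                i = end + 1
    return "".join(out)


# Called without arguments: %2 is empty, so the name is mangled.
print(expand_batch_line("file%20name.txt", ["generated_file.bat"], {}))
# file0name.txt

# Properly escaped in the file itself (not possible for the generated file).
print(expand_batch_line("file%%20name.txt", ["generated_file.bat"], {}))
# file%20name.txt

# The trick above: argument 2 is the literal text %2.
print(expand_batch_line("file%20name.txt", ["generated_file.bat", "%1", "%2"], {}))
# file%20name.txt
```

The limitation is visible in the model as well: only %0 to %9 can be neutralised this way, which is why the answer calls it a partial solution.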
How to add a literal percent sign on the command line
mklement0 describes the problem, that escaping the percent sign on the command line is tricky, and inside quotes it seems to be impossible.
But as always it can be solved with a little trick.
for %Q in ("%") do echo "file%~Q20name.txt"
%Q contains "%" and %~Q expands to only %, independent of quotes.
Or to avoid the %~ use
for /F %Q in ("%") do echo "file%Q20name.txt"
You should be able to use a caret (^) to escape a percent sign.
Editor's note: The link is dead now; either way: It is % itself that escapes %, but only in batch files, not at the command prompt; ^ never escapes %, but at the command prompt it can be used indirectly to prevent variable expansion, in unquoted strings only.
The reason %2 is disappearing is that the batch file is substituting the second argument passed in, and you seem not to have a second argument. One way to work around that would be to actually try foo.bat ^%1 ^%2... so that when a %2 is encountered in a command, it is actually substituted with a literal %2.

Resources