Open URL that contains umlaut with batch - batch-file

I want to open a URL in Chrome with a batch file. This works for normal URLs, but it doesn't for URLs with umlauts.
start chrome.exe https://trends.google.de/trends/explore?q=mähroboter
I cannot use "ae" as a replacement for "ä", as it will give me different results on Google Trends.
When I keep it like this, the URL in my browser changes to
https://trends.google.de/trends/explore?q=mA4hroboter
which again gives me the wrong results. It needs to be "ä".
I tried playing around with the file encoding. Currently UTF8 without BOM. I tried UTF8 with BOM, ANSI, converting to and fro. Nothing seemed to work. What can I do to make it work?

URLs must be URL-encoded, with non-ASCII characters written as percent-encoded bytes.
That means the German umlaut ä in a URL must first be UTF-8 encoded, which yields the two bytes with the hexadecimal values C3 A4, and then percent-encoded, resulting in %C3%A4 in the URL string:
https://trends.google.de/trends/explore?q=m%C3%A4hroboter
In a batch file, a percent sign must be escaped with an additional percent sign so that the Windows command processor interprets it as a literal character and not as
the beginning of a batch file argument reference, as explained by the help of command CALL, output on running call /? in a command prompt window, or
the beginning of a loop variable reference, as explained by the help of command FOR, output on running for /? in a command prompt window, or
the beginning / end of an environment variable reference, as explained by the help of command SET, output on running set /? in a command prompt window.
So the batch file must contain:
start chrome.exe https://trends.google.de/trends/explore?q=m%%C3%%A4hroboter
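A slightly more defensive variant of the same idea, as a minimal sketch: the variable name Query is hypothetical and only used for illustration. It keeps the percent-encoded search term in one place, passes an empty window title to START, and quotes the URL so that other special characters in a query string are not interpreted by cmd.exe.

```batch
@echo off
rem Sketch only: Query is a hypothetical helper variable.
rem Every doubled percent sign below ends up as one literal percent sign.
set "Query=m%%C3%%A4hroboter"

rem The first quoted argument of START is the window title, left empty here.
rem Quoting the URL protects characters like the ampersand in longer query strings.
start "" chrome.exe "https://trends.google.de/trends/explore?q=%Query%"
```

When the variable is expanded on the START line, the single percent signs it contains are not parsed again, so Chrome should receive m%C3%A4hroboter unchanged.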

Related

Controlling codepages in a cmd window when running batch scripts

I have problems controlling character code pages in a Windows cmd window, or rather in DOS scripts (.bat files) I use for certain tasks on my Windows 7 office computer.
Here is the problem:
One of my scripts is used to open certain files in their respective programmes, e.g.
C:\Stuff\Büroeinrichtung\MyFile.xlsx
The crucial thing here is the u-umlaut (ü) in the directory name.
In my script I use
Start "" "C:\Stuff\Büroeinrichtung\MyFile.xlsx"
to start Excel and open the file.
This works as long as I tell my text editor (Notepad++) to encode the script using codepage 850 (Western European), as this is what the cmd windows on my machine use by default.
However, I want to be able to use scripts that are encoded in something else, primarily UTF-8 or UTF-8-BOM. From answers to another question posted here I learned that, in principle, I can put a command in the script that changes the codepage of the cmd window, e.g. chcp 65001 for UTF-8. So my script would then look like
chcp 65001
pause :: this is here just to have some visual control while testing
Start "" "C:\Stuff\Büroeinrichtung\MyFile.xlsx"
pause :: ditto
But: whatever I do, I do not get this running. The cmd window nicely accepts the change to the codepage, then stops due to pause (in Line 2), but on hitting "enter" to continue I
either get an alert that something is wrong with the ü (other, fancy, characters displayed), or
I get an alert that a directory of that name wasn't found (again obviously something wrong with the ü, the actual bits of which seem to represent something else) or
the cmd window just disappears (apparently crashed, and apparently never reaching Line 4 where a new pause would halt it).
I tried all possible combinations of codepages called in the script and various encodings for the script file (.bat) itself but did not get anywhere.
So, to make a long story short: what do I have to do so that, in a script encoded in UTF-8 (or similar) and run on a machine using codepage 850 by default, the character ü (u-umlaut) in a directory name is understood as exactly ü and nothing else?

Search a string in a line in a text file and copy it to another text file or a new variable

I have a problem with batch and I can't figure it out. I searched Google and Stack Overflow for hours, and now I'm asking the question myself because nobody seemed to have this exact problem yet or I simply can't find it. I have even searched the last results page on Google (!).
So I have coded a batch file that automatically pulls file names from a server and puts them into a text file along with each file's path. Now I have a file that looks roughly like this:
q:\0003730310008520150610120508\1_PY98200_00084_00085_09_20150610_140447.antfzg
q:\000649A7B0008520150630085701\1_KP40610_00084_00085_09_20150630_105647.antfzg
q:\000649A7B0008520150630085701\1_KP40610_00084_00085_09_20150630_110508.antfzg
q:\00161083B0008520150429065335\1_J281516_00084_00085_09_20150429_085326.antfzg
q:\00161083B0008520150429122000\1_PV92717_00084_00085_09_20150429_141952.antfzg
q:\00161083B0008520150515065834\1_VY65621_00084_00085_09_20150515_085802.antfzg
q:\00161083B0008520150527075722\1_D894693_00084_00085_09_20150527_095704.antfzg
q:\00161083B0008520150602075809\1_L893216_00084_00085_09_20150602_095757.antfzg
q:\00161083B0008520150608082553\1_VT04798_00084_00085_09_20150608_102033.antfzg
q:\00161083B0008520150610080050\1_LF22563_00084_00085_09_20150610_100016.antfzg
q:\00161083B0008520150623132003\1_VN57593_00084_00085_09_20150623_151927.antfzg
Now I want to search for a specific article number that looks like this for the first example: PY98200 (the part directly behind 1_). If that is found in the file, copy the entire line containing the string into either a new variable or a new text file. If the number exists multiple times, then all lines should be copied, too.
I tried different for loops, but I failed because I am not that experienced with batch coding.
The command line for this task as posted by npocmaka is:
type "original_text_file.txt" | %SystemRoot%\System32\find.exe "PY98200" > "new_text_file.txt"
How does it work?
type is an internal command of the command processor cmd.exe which outputs the contents of a text file to stdout (standard output), which is usually displayed in the console window.
The output of type is piped with | from stdout of type to stdin (standard input) of the standard Windows console application find.
find is a very small console application that finds a simple string (not a regular expression) in lines of a text file and by default outputs to stdout all lines containing the searched string. A more powerful console application to find strings in files is findstr. But find has the feature needed for this task, too.
The output of find is redirected with > to the file new_text_file.txt in the current directory.
Why not call find directly with the name of the text file?
%SystemRoot%\System32\find.exe "PY98200" "original_text_file.txt" > "new_text_file.txt"
With the command line above find outputs also an empty line and one more line containing the name of the file before the lines containing the searched string. This is useful if multiple files are searched for lines containing a string and in output it is important to know which lines are from which file.
But new_text_file.txt should contain only the lines containing PY98200 without any additional information about source. Therefore the command type is used to let find read contents of original_text_file.txt via stdin which avoids printing the two header lines find outputs for each file to file new_text_file.txt.
An alternate command line for this task would be:
%SystemRoot%\System32\findstr.exe /C:PY98200 "original_text_file.txt" > "new_text_file.txt"
For more details about the commands and console applications used, open a command prompt window, execute the following commands and read all help pages output by each command in the console window.
type /?
find /?
findstr /?
All standard commands / console applications of Windows output help on running them with parameter /?, which for some commands is more accurate than the online documentation.
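The question also asked about copying the matching lines into a variable instead of a file. One possible way, as a minimal sketch under the assumption that the lines contain no exclamation marks (which delayed expansion would remove): a FOR /F loop captures the output of findstr line by line. The names Count and Line[n] are hypothetical and only used for illustration.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Sketch only: Count and Line[n] are hypothetical variable names.
set "Count=0"

rem delims= keeps each matching line whole instead of splitting it into tokens.
for /F "delims=" %%I in ('%SystemRoot%\System32\findstr.exe /C:"PY98200" "original_text_file.txt"') do (
    set /A Count+=1
    set "Line[!Count!]=%%I"
)

echo Found %Count% matching line(s).
rem Print every stored line again by its index.
for /L %%N in (1,1,%Count%) do echo !Line[%%N]!
endlocal
```

Each matching line is stored in its own environment variable, so all occurrences of the article number remain available for later commands in the same batch file.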

Running BAT/CMD file with accented characters in it

I have a Windows batch file which has an instruction to execute an EXE file in a location whose path contains accented characters. Following are the contents of the batch file.
@echo off
C:\español\jre\bin\java.exe -version
C:\español\jre\bin\java.exe - This path exists and is proper. I can run this command directly on cmd.exe. But when I run the command from a bat/cmd file it fails saying "The system cannot find the path specified"
One way to fix this is by setting code page to 1252 (that works for me). But I'm afraid we'd have to set code pages for any non-English locale and figuring out which code page to use is pretty difficult.
Is there an alternative approach to fix this problem? Maybe a command-line option or something else?
Another way of doing this, in Windows, is by using wordpad.exe:
Run wordpad.exe
Write your script as you usually do, with accents
Choose Save as > Other formats
Choose to save it as Text document MS-DOS (*.txt)
Change the file extension from .txt to .bat
I had the same problem, and this answer solved it. Basically you have to wrap your script with a bunch of commands to change your terminal codepage, and then to restore it.
@echo off
for /f "tokens=2 delims=:." %%x in ('chcp') do set cp=%%x
chcp 1252>nul
:: your stuff here ::
chcp %cp%>nul
Worked like a charm!
I'm using Notepad++ and it has an option to change "character sets", OEM-US did the trick. ;)
Since you have @echo off you can't see what your batch is sending to the command prompt. Reproducing your problem with that off it seems like the ñ character gets misinterpreted since the output I see is:
C:\espa±ol\jre\bin\java -version
The system cannot find the path specified.
I was able to get it to work by echoing the command into the batch file from the command prompt, i.e.
echo C:\español\jre\bin\java.exe -version>>test.bat
This seems to translate the character into whatever the command prompt is looking for, though I've only tested it with English locale set so I don't know if it'll work in all situations for you. Also, if you open the batch in a text editor like notepad it looks wrong (C:\espa¤ol\jre\bin\java.exe)
Use Alt + 0164 for ¤ instead of Alt + 164 ñ in a batch file... It will look odd, but your script should run.
You can use Visual Studio Code, which lets you select the encoding you want to use. Click the encoding in the lower right corner and choose the option "Save with Encoding". Select DOS, and the accented characters will be saved correctly.
I also had the same problem. I was trying to create a simple XCOPY batch file to copy a spreadsheet from one folder to another. Its name had the "é" character in it, and it refused to copy.
Even trying to use Katalin's and Metalcoder's suggestions didn't work on my neolithic Windows XP machine. Then I suddenly thought: Why not keep things as simple as possible (as I am myself extremely simple-minded when it comes to computers) and just substitute, in the batch file code, "é" with the wildcard character "?".
And guess what? It worked!

Batch script Latin characters

I am writing a batch script to go through some directories doing a specific task, something like the following:
set DBCreationScript=//Here I set the full path for the script
echo %DBCreationScript%
The problem is that the path contains some Latin characters (ç, ã, á), and when I run the script, the output shows strange characters, not the ones I typed in. The batch script is in ANSI encoding.
I already tried to set the script encoding to UTF-8, but apparently the batch interpreter can't handle the control characters that appear at the beginning of the file.
Any thoughts?
Save the batch file in OEM encoding (a decent editor should allow this) or change the code page prior to running it with
chcp 1252
You can also save it as UTF-8 without signature (BOM) and use
chcp 65001
but down that path lies peril and dragons await to eat you (in short: It's usually painful and has a few weird side-effects).
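As a minimal sketch of the UTF-8 route, assuming the batch file is saved as UTF-8 without BOM: the script remembers the active code page, switches to 65001, and restores the previous value at the end. The variable name OldCodePage and the example path are hypothetical, and behavior under code page 65001 can differ between Windows versions, so this should be tested on the target machine.

```batch
@echo off
rem Sketch only. Assumes this file is saved as UTF-8 without BOM.
rem OldCodePage and the example path are hypothetical.
for /F "tokens=2 delims=:." %%P in ('chcp') do set "OldCodePage=%%P"
chcp 65001 >nul

set "DBCreationScript=C:\Example\criação\create.sql"
echo %DBCreationScript%

rem Restore the code page that was active before the script ran.
chcp %OldCodePage% >nul
```

Restoring the original code page keeps the console usable for other scripts that expect the OEM default.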

Stupid Batch File Behavior. Tries to execute comments

I have tried prefixing lines with semicolons, 'REM', etc.. but no matter what when I run my batch file I keep getting "unknown command REM whatever"
"REM test" It is not recognized, and it is windows vista. I simply get "rem" output back to my console.
That's entirely normal behavior. Batch files are simply sequences of commands that are run one after another. So every line will get output to the console as if it were typed there.
H:\>echo rem test > test.cmd
H:\>test
yields the output
H:\>rem test
as if I typed rem test directly to the console.
You can suppress this by either prefixing the line with @:
@rem test
or by including echo off in the batch file:
@echo off
rem test
If I put ":: test" and execute it I get back "Test".
Can't reproduce here.
If I put "; test" it recursively executes itself
A semicolon at the start of the line seemingly gets ignored.
If you're talking about cmd.exe batch files under Windows, you can use:
rem this method or
:: this method.
For bash and a lot of other UNIX-type shells, you use:
# this method.
I'm pretty certain you're not using cmd.exe since that would give you an error like:
'rem' is not recognized as an internal or external command,
operable program or batch file.
rather than:
Unknown command ...
If you are using a UNIX-type shell, the # character is almost certainly what you're after. If you let us know exactly the shell you're using, we can probably help out further.
You probably created a Unicode file. These files start with a 2-byte header called the BOM, which is not shown by editors, but cmd attempts to execute it and fails.
To confirm that this is indeed the issue, put any other command at the very beginning of your file and see whether it throws the same error, for example @echo test.
To fix it, create a new plain text file and copy the content of the original file into it, then remove the original file and replace it with the newly created one.
In my case the problems are line endings. Somehow Maven or the Jenkins pipeline running on a Linux machine changed the line endings from Windows style (CR LF) to Unix style (LF). Changing them back solves the issue for me.
