I created a batch file that contains characters in Hebrew.
ECHO אאאאא
The result is אאא on running the batch file.
How can I fix it?
It looks like you have encoded your batch file with UTF-8 saved without byte order mark (BOM) for Hebrew Letter Aleph with Unicode code value 05D0.
The batch code below copied into a UTF-8 encoded file without BOM changes the code page to UTF-8 (65001) before the characters are written into the console window.
@echo off
chcp 65001 >nul
ECHO אאאא
Instead of using the multi-byte encoding UTF-8, it would also be possible to use a single-byte encoding with code page 862, which contains this letter mapped to code value 80 (hexadecimal, 128 decimal).
@echo off
chcp 862 >nul
ECHO אאאא
Code page 862 is the OEM code page for Hebrew.
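As a rough illustration of the two encodings discussed above, here is a small Python sketch (Python is used only because it can show raw bytes; it is not part of the original answer, and the variable names are made up for the example). It prints how the letter Aleph is stored in each encoding and what happens when the UTF-8 bytes are read under an OEM code page such as 850.

```python
# Illustrative sketch: how Hebrew letter Aleph (U+05D0) is stored on disk.
ALEPH = "\u05d0"

utf8_bytes = ALEPH.encode("utf-8")   # multi-byte encoding
oem_bytes = ALEPH.encode("cp862")    # single-byte OEM code page for Hebrew

print("UTF-8 :", utf8_bytes.hex())
print("CP862 :", oem_bytes.hex())

# If the console is still on an OEM code page, each UTF-8 byte is shown
# as a separate character, so one Aleph turns into two wrong glyphs.
misread = utf8_bytes.decode("cp850")
print("UTF-8 bytes read as CP850:", ascii(misread), "length", len(misread))
```

The point of the sketch is only that the bytes in the file and the active code page have to agree; which of the two approaches is used matters less than using it consistently.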
In console windows usually OEM code pages are used. If you open a command prompt window and execute in this window chcp you can see which code page is by default set on your machine.
But setting the right code page in the batch file, according to the encoding used for the batch file, does not automatically mean that the Hebrew letters are displayed correctly in the console window when the batch file is executed.
The font used for the console window must also support code page 862, that is, the Hebrew letters from the Unicode table.
Because the Hebrew characters were displayed wrong in a command prompt window with the default font setting Raster Fonts on my English Windows 7 x64 machine, which uses code page 850 by default in console windows, I clicked on the icon on the left side of the title bar of the command prompt window, clicked on Properties in the opened menu, and selected Consolas on the Font tab. The Hebrew letters were now displayed differently than with Raster Fonts, but still not right, so Consolas also does not support Hebrew letters on my machine. Next I tried the font Lucida Console, but again the Hebrew letters were not displayed right. In other words, none of the 3 fonts available on my machine for console windows can display the Hebrew letters with the right glyphs.
Read this brief overview of Unicode on a power tip page for text editor UltraEdit if you don't know anything about text encoding.
The command prompt environment is not really designed for Unicode. In Windows Control Panel - Region and Language, select the tab Administrative. There you can set the system locale for non-Unicode programs. There is also a link to a help page explaining what this setting is for: it sets the default font and code page for single-byte encoded text in the Windows GUI (Windows-1255) and in console windows (OEM 862) when Hebrew (Israel) is selected.
Hoping someone can help.
I have a batch job (Windows env) which simply copies a file to another folder.
copy "\\ACP-MS-NAS21\Global\MEC Daily Productivity\Business Analysts\Master_List\HCP_Master_List.xlsx" ^
"\\ACP-MS-NAS21\Global\CSD [?] DWP Medical Services\CSL_CSD_DB\Master_List"
But I get the following error:
The filename, directory name, or volume label syntax is incorrect.
I can see there's an en dash in the file path which I believe is causing the problem.
Is there any way to have a wildcard in the file path or any other way the job can recognise this?
Thanks in advance.
PS. newbie at batch programming, coding etc, so please can explanations be in plain English. Many thanks
Command-line windows (and, by extension, BAT files) operate in an OEM codepage by default. Which codepage exactly is determined by your OS settings (Language for non-Unicode programs or similar). Therefore you cannot normally use any character that is outside ASCII or outside the codepage you have.
Save script as UTF-8
To use those characters you will need to work in a codepage that actually has them. UTF-8 is the best one for this task (and probably the only one that will work in your case).
First, save your script as UTF-8. In Notepad you can select the encoding in the Save As dialog.
If your editor does not allow you to select UTF-8 (no BOM), leave the first line of your BAT file blank, as some editors preface the file with a special header called a BOM that helps detect the encoding. If your editor does this and you leave the first line blank, you will get a Bad command or file name error as soon as your script starts, but this won't prevent it from running properly.
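To make the blank-first-line advice more concrete, the following Python sketch shows what a BOM looks like at the byte level. It is only an illustration: the two-line script is hypothetical, and Python's `utf-8-sig` codec stands in for an editor that saves "UTF-8 with BOM".

```python
import codecs

# Hypothetical two-line script, used only for this demonstration.
script = "CHCP 65001\r\nECHO done\r\n"

with_bom = script.encode("utf-8-sig")   # "UTF-8 with BOM"
without_bom = script.encode("utf-8")    # "UTF-8 without BOM"

print(with_bom[:3].hex())               # the three BOM bytes
print(with_bom[3:] == without_bom)      # the rest of the file is identical

# With a BOM, the three extra bytes are glued to the first command,
# so the first line no longer starts with a valid command name.
first_line = with_bom.split(b"\r\n")[0]
print(first_line.startswith(b"CHCP"))

# With a blank first line, the BOM sits alone on line 1 and line 2 is clean.
padded = ("\r\n" + script).encode("utf-8-sig")
lines = padded.split(b"\r\n")
print(lines[0] == codecs.BOM_UTF8, lines[1] == b"CHCP 65001")
```

So the error message on the first line is the command processor trying to run the BOM bytes as a command, while everything from the second line onward is unaffected.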
Select UTF-8 codepage
Now your script is in UTF-8, but the Windows command processor will still execute it as if it were in the default codepage, thus corrupting all special characters. To specify our encoding we need to add the following command, preferably as the first non-blank line of the script, before any comments (those may contain non-ASCII characters, with unpredictable results):
CHCP 65001
CHCP changes current codepage to 65001 which is internal codepage number for UTF-8.
This works because Latin letters and digits have the same encoding in UTF-8 as in ASCII and the OEM codepages. Your script therefore starts executing in the OEM codepage, but since the CHCP 65001 command itself contains no non-ASCII characters, it is understood correctly. All following commands will be executed in UTF-8 and may contain non-ASCII characters.
You may now insert the en dash into the filename and it won't be replaced with ? upon saving.
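The claim that the CHCP line survives any codepage can be checked with a short Python sketch. This is an illustration only; the list of codepages is an arbitrary sample, and the dash is assumed to be an en dash (U+2013) as stated in the question.

```python
# The CHCP line contains only ASCII characters, so its bytes are the same
# no matter which of these encodings the file is interpreted in.
command = "CHCP 65001"
encoded_forms = {command.encode(name) for name in ("ascii", "cp437", "cp850", "utf-8")}
chcp_is_stable = len(encoded_forms) == 1
print("CHCP line identical in all sampled encodings:", chcp_is_stable)

# A non-ASCII character such as the en dash becomes several bytes in UTF-8,
# all of them outside the ASCII range, so they never collide with ASCII text.
en_dash_bytes = "\u2013".encode("utf-8")
print(en_dash_bytes.hex())
print(all(b >= 0x80 for b in en_dash_bytes))
```

This is why the order matters: the codepage switch has to come before the first line that contains anything other than plain ASCII.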
Set UTF-8 font
Unfortunately, the default console window font does not display UTF-8 text correctly, so you won't be able to see non-ASCII characters as intended.
To solve this, right-click the title bar of the command-line window, select Properties, and change the font to a Unicode one. Consolas, Lucida Console and Courier New should work.
Thanks for all the responses.
I've had to make a design change to my DB so have managed to get around the need to do this now (phew :))
Thanks again
I have problems controlling character code pages in a Windows cmd window, or rather in DOS scripts (.bat files) I use for certain tasks on my Windows 7 office computer.
Here is the problem:
One of my scripts is used to open certain files in their respective programmes, e.g.
C:\Stuff\Büroeinrichtung\MyFile.xlsx
The crucial thing here is the u-umlaut (ü) in the directory name.
In my script I use
Start "" "C:\Stuff\Büroeinrichtung\MyFile.xlsx"
to start Excel and open the file.
This works as long as I tell my text editor (Notepad++) to encode the script using codepage 850 (Western European), as this is what the cmd windows on my machine use by default.
However, I want to be able to use scripts that are encoded in something else, primarily UTF-8 or UTF-8-BOM. From answers to another question posted here I learned that principally I can set a command in the script for the cmd window to change the codepage, e.g. chcp 65001 for UTF-8. So my script would then look like
chcp 65001
pause :: this is here just to have some visual control while testing
Start "" "C:\Stuff\Büroeinrichtung\MyFile.xlsx"
pause :: dito
But: whatever I do, I do not get this running. The cmd window nicely accepts the change to the codepage, then stops due to pause (in Line 2), but on hitting "enter" to continue I
either get an alert that something is wrong with the ü (other, fancy, characters displayed), or
I get an alert that a directory of that name wasn't found (again obviously something wrong with the ü, the actual bits of which seem to represent something else) or
the cmd window just disappears (apparently crashed, and apparently never reaching Line 4 where a new pause would halt it).
I tried all possible combinations of codepages called in the script and various encodings for the script file (.bat) itself but did not get anywhere.
So, to put the long story short: What do I have to do, in a script encoded in UTF-8 (or so) and going to be run on a machine using codepage 850 by default that a character ü (u-umlaut) in a directory name is to be understood in the script as exactly ü, nothing else?
I'm trying to make a login batch file that starts a few services but in a way that the user knows they are being started. So I thought I'd use a batch script for that.
The script is working fine, but I wanted to embellish it a bit more using the logo in ASCII and use colors. Everything is working fine on my development PC (Windows 10 64-bit), but on the user machines (Windows 7 64-bit) the colors are not being shown.
I'm using:
echo <ESC>[93m Logging in
But when I run it, it displays:
←[93m Logging in
So it's not treating the ESC properly.
The issue has to be PC based because it's working on another machine, but I don't know how to solve this.
Only the console of Windows 10 supports ESC sequences, as documented on the MSDN page Console Virtual Terminal Sequences. The console hosts of previous Windows versions don't support ANSI ESC sequences.
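For readers unsure what the sequence in the question actually consists of, here is a minimal Python sketch. It is not part of the original answer; the helper name `colorize` is made up, and the sketch only builds the string. Whether the colors appear still depends on the console, as described above.

```python
ESC = "\x1b"  # the ESC control character (code 27)

def colorize(text, sgr_code=93):
    """Wrap text in a color sequence and reset the attributes afterwards."""
    return f"{ESC}[{sgr_code}m{text}{ESC}[0m"

message = colorize(" Logging in")
print(repr(message))  # shows the raw characters that are sent to the console
print(message)        # colored only if the console interprets ESC sequences
```

On a console without support for these sequences, the ESC character is drawn as a glyph (the left arrow seen in the question) and the rest of the sequence is printed as ordinary text.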
There is the command COLOR to define text color and background color.
Open a command prompt window and run color /? for help on this command.
Hundreds of batch file examples on how to use COLOR can be found on Stack Overflow for example with the search term [batch-file] color.
As mentioned before me, Windows versions prior to 10 do not support escape sequences. You could try ANSICON.
The old ANSI.SYS, which was loaded at boot time, would interpret color commands
such as [esc][1;33;40m (where [esc] was a small arrow) as the foreground and
background colors for text in the DOS prompt window, or outside of windows in
a DOS session. (Worked in Windows 3.1x, Win 95, Win 98 1st and 2nd, Win ME and
perhaps even 32 bit Win XP.)
However, after the introduction of 64-bit systems, ANSI.SYS no longer works as before.
The command "color" in a Windows 7 cmd.exe window colors the ENTIRE window text, not
just the part you want to color. I understand some of this has been alleviated in
Win 10 cmd.exe, but except for that...
There may be a possible solution:
called "CoColor" by Horst Schaeffer
Freeware © Horst Schaeffer -- Contact: horst.schaeffer#gmail.com
http://www.horstmuc.de/wcon.htm
Here is what he says about it:
CoColor 2.1 Change console output color Download 32 bit (6Kb)
Download 64 bit (7Kb)
CoColor changes the console color for the succeeding console output, not for the entire window, like the built-in COLOR command. CoColor uses the same color codes as COLOR.
CoColor also accepts a sequence of color codes and text strings (each in double quote marks), making it a colorful ECHO replacement. Non-ASCII characters will be handled the same way as by ECHO.
Demo.CMD is included.
(NOTE: After running Demo.cmd you will need to run the command color to return
to the default colors of the screen. He did not include that in his script.)
After scanning the files with Avast Antivirus, SuperAntiSpyware and Malwarebytes,
I ran the CoColor 64-bit version on Win 7 Pro 64-bit and it seems to work well.
I wrote a lot of batches back in the old days with color bars for the lines of
text. They did NOT change the color of the entire screen as does the "color"
command in cmd.exe! COMMAND.COM understood color commands with ANSI.SYS loaded
at boot time in the CONFIG.SYS. This is the closest thing I've seen yet to that
original functionality. Hope this helps.
I've tried pasting this character: ☢ into a txt file, but when I use TYPE, it doesn't show the symbol. Is there a certain code page?
Tried different encoding options.
I know about the CHCP command and tried different pages.
As Josep indicates, the text file containing the UTF-8 character has to be encoded in UTF-8 with BOM.
Also, your console font has to include that character. The default fonts don't. I just tested DejaVu Sans Mono, and it works. See this page for instructions on how to add a TrueType font to the cmd console.
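As a side note on why trying different code pages with CHCP could not help here, this Python sketch checks which of a few sample code pages contain the character at all. It is an illustration only; the helper name and the list of code pages are arbitrary choices for the example.

```python
RADIOACTIVE = "\u2622"  # the radioactive sign from the question

def code_pages_containing(char, pages):
    """Return the code pages from `pages` that can represent `char`."""
    found = []
    for page in pages:
        try:
            char.encode(page)
        except UnicodeEncodeError:
            continue  # this code page has no slot for the character
        found.append(page)
    return found

supported = code_pages_containing(RADIOACTIVE, ["cp437", "cp850", "cp1252", "utf-8"])
print(supported)
print(RADIOACTIVE.encode("utf-8").hex())  # the bytes stored in a UTF-8 file
```

Among the sampled code pages only UTF-8 can represent the symbol, which is consistent with the advice to store the file as UTF-8 and use a font that contains the glyph.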
Finally, you can get around messing with the code page garbage by invoking powershell. That way you don't have to mess with scraping the current code page, changing it to 65001, doing your thing, then restoring back to original. Just do this:
powershell "gc nuke.txt"
... where nuke.txt contains ☢
Edit: Since you're unable to install new fonts on your computer, the only solution is ascii art.
___
\_/
.--,O.--,
\/ \/
You have to make sure your file has the UTF-8 BOM (use Notepad++ or the option suggested in the answers below). Then you may need to change the character set of your console with chcp 65001; see these answers for more information:
How to make Unicode charset in cmd.exe by default?
How to convert a batch file stored in utf-8 to something that works via another batch file and run it
I have a bunch of dynamically created *.BAT files. These BAT files are used to create folders in a server. Just one line in each BAT file, such as: MKDIR \NetworkShare\abc\123
This "abc\123" string is from a database.
It ran OK for a while, creating thousands of subfolders on demand, until today, when it stopped at a special subfolder that has a "close single quote" (Alt + 0146 if typing from the DOS prompt) in the string.
I did some research and found that this "close single quote" is an extended ASCII character. It can't be saved properly in an ANSI BAT file (it ends up as something else). I tried a UNICODE and a UTF-8 BAT file, but it doesn't work.
The only near-close solution is that I tried a binary editor to make sure it's code 146, but code 146 gives me Æ (ALT-146) not "close single quote" (Alt + 0146).
I know I can manually type special characters in DOS prompt (by using keyboard Alt + ).
But is there a way to properly save this "close single quote" (Alt + 0146) in BAT file so I can execute them dynamically?
The host system is Windows Server 2003 US-English.
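The Æ versus "close single quote" discrepancy described above can be reproduced with a few lines of Python. This is a sketch under an assumption: on a US-English system the ANSI code page is taken to be Windows-1252 and the OEM code page 437, which is the usual default but is not stated in the question.

```python
# The same byte value 146 means different characters in the two code pages
# that a US-English Windows system typically uses (assumed: cp1252 and cp437).
byte_146 = bytes([146])

as_ansi = byte_146.decode("cp1252")  # what Alt+0146 refers to
as_oem = byte_146.decode("cp437")    # what Alt+146 refers to

print(ascii(as_ansi))  # right single quotation mark, U+2019
print(ascii(as_oem))   # capital letter AE, U+00C6

# In UTF-8 the right single quotation mark is not a single byte at all.
quote_utf8 = "\u2019".encode("utf-8")
print(quote_utf8.hex())
```

So writing byte 146 with a binary editor produces the quote only if the file is later read as Windows-1252; the console reads it in the OEM code page and shows Æ instead.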
Thank you for this CHCP 65001 trick. It leads to proper solution:
I took the following steps to resolve the issue:
+++++++++++++++++++
Prepare the BAT Text File (either manually or dynamically)
+++++++++++++++++++
(1) Make the first line blank (this is necessary, because there are hidden chars in the first line for UTF-8 text file)
(2) Put CHCP 65001 as second line
(3) main line here: MKDIR \networkshare\abc(right single quote-->this is special extended ASCII char)\123
(4) make sure the BAT file saved as UTF-8
+++++++++++++++++++
Now it's the CMD.EXE trick
+++++++++++++++++++
(1) Start cmd.exe
(2) open cmd.exe black screen property
(3) make sure the black screen font is "true type" i.e. "TT" like. By default, it is raster font, can not handle special ascii code properly. (This is the key step)
(4) now I can run my BAT to handle those extended ASCII chars properly.
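Since the BAT files are created dynamically, the file-preparation steps above could be sketched roughly as follows in Python. This is not the generator actually used in the question: the function name and the share path are hypothetical, the double quotes around the path are an addition to cope with spaces, and `utf-8-sig` is assumed to match an editor that saves UTF-8 with a BOM.

```python
def build_mkdir_bat(folder_path):
    """Return the bytes of a one-command BAT file following the steps above."""
    lines = [
        "",                        # (1) blank first line, absorbs the BOM
        "CHCP 65001",              # (2) switch the console to UTF-8
        f'MKDIR "{folder_path}"',  # (3) the actual command
    ]
    text = "\r\n".join(lines) + "\r\n"
    return text.encode("utf-8-sig")  # (4) save as UTF-8 (with BOM here)

# Hypothetical folder name containing a right single quotation mark (U+2019).
bat_bytes = build_mkdir_bat("\\\\NetworkShare\\abc" + "\u2019" + "\\123")
print(bat_bytes)
```

The returned bytes would then be written to the .bat file in binary mode, so that no further encoding conversion takes place.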
Try changing the code page of your batch file to UTF-8: Insert this line at the top of your batch file and save the file as UTF-8:
chcp 65001
Be careful though: creating folders with non-ASCII letters can break some programs that rely on older APIs or libraries, or that simply assume all folder and file names are ASCII.