I'm trying to make a (somewhat) stylish DOS menu as a present for my father.
I was able to get the whole menu system to work, but I wanted to gussy it up with some box drawing characters and, possibly, colored text.
In this YouTube video, the user shows an example of what I'm trying to do (example at the 5:00 mark), but doesn't explain how those characters are being rendered. In the Notepad document, it is displayed as goofy characters.
Do I need to save the file with a special type of encoding? Can it only be done in Notepad (I'm using TextEdit on Mac)? Can someone provide an example menu that can be added to DOSBox's [autoexec] config?
Also, I'm not sure if it is possible, but how can the text color/background color be changed? When running DOSBox initially, it shows their welcome screen with a blue background and box drawing characters, so I would think all of that is possible.
I tried using escaped unicode characters and I tried using a capital-E acute (as shown in the linked video), but they just render funky stuff when run in DOSBox.
The discrepancy in characters comes from different code pages being used to render them. English-language Windows uses ANSI code page 1252 (often loosely called Latin-1), while DOS uses OEM code page 437, also known as the IBM PC character set.
The code page that Windows uses varies with your system language, so you may need to experiment to find the correct characters. The basic approach: find the character you want to print in code page 437 (say ╔, which is 201), then type in your file the character that code page 1252 assigns to the same number (201 is É, which is why the video shows a capital E acute). Then save the file in ANSI encoding.
When I write a C program and try to output special characters (like ä ö ü ß) with printf() in the cmd window on Windows 10, it only shows something like ▒▒▒▒▒▒▒▒▒▒▒▒
But if I just type them in the cmd window without a C program being executed, it displays these characters properly.
When I change the console type to standard output in NetBeans, the output is correct as well.
I tried to change the code page of cmd, but it didn't fix the problem.
I use the gcc C compiler.
The reason is that different code pages are used for character encoding.
A GUI text editor that stores each character of the source file as a single byte uses code page Windows-1252 in Western European and North American countries.
A console window opened when a console application runs uses an OEM code page instead: OEM 850 in Western European countries and OEM 437 in North American countries.
So for ÄÖÜäöüß you need to write different byte values in the code than the ones the editor stores, so that those characters are displayed as expected in the console window, at least when the program runs in Western European and North American countries.
Character   Windows-1252   OEM 850
Ä           \xC4           \x8E
Ö           \xD6           \x99
Ü           \xDC           \x9A
ä           \xE4           \x84
ö           \xF6           \x94
ü           \xFC           \x81
ß           \xDF           \xE1
The code page used by default in a console window can be seen by opening a command prompt window and running either chcp (change code page) or mode, both of which display the active code page.
The default code page for GUI applications and console applications on a computer for a user account depends on the Windows region and language settings for this user account.
Some web pages you should read to better understand character encoding:
Character encoding (English Wikipedia article)
On the Goodness of Unicode by Tim Bray
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
What's the best default new file format? (UltraEdit forum topic)
Programmers should not write non-ASCII characters into strings output by a compiled executable, because the bytes stored in the executable for those characters depend on the code page the compiler uses when it creates the binary. It is better to use hexadecimal notation when the code page active while the application runs is known, or is set by the application itself before the string is output.
It is also possible to store strings in the executable in Unicode, determine the encoding of the output handle before outputting any string, and convert each Unicode string to that encoding before it is written to the output handle.
And of course how the bytes of the strings are finally displayed on screen also depends on the output font.
SITUATION:
My instructor for my micro-controller class refuses to save sample code to a text file and saves it to a Word document instead. When I open the doc file and copy/paste the code into my IDE, "CodeWarrior", it causes errors at compile time.
I am having to rewrite all the code into a text editor and then copy/paste it into my IDE.
MY UNDERSTANDING:
I was told to always save code as a text file, because when you save code as a Word document it brings in unwanted characters when you're copy/pasting the code into your IDE for compiling.
MY QUESTIONS TO YOU:
1.)
Can someone explain this dilemma to me so I can understand it better? I would like to present a better case next time when I receive errors and to also know more about what is happening.
2.)
Is it possible to write a script that will show me all the characters that are being copied and pasted into a file when the code comes from a Word document vs. a text file? In other words, is there a program that will let me see what is going on when copying/pasting code from a doc file versus a txt file?
Saving source code as a Word document is just silly. If your instructor is insisting on this, chances are no matter how well-reasoned and thorough your argument, they're not going to listen. They're beyond help.
However, to answer your questions: 1) It depends on what you're pasting the thing into. Programs that copy onto the clipboard usually make the data available in several different formats, ranging from their own internal format to plain ASCII text, to maximize compatibility so that the data can be pasted into pretty much any target program. Most text editors will only accept the plain-text version, in which case no extra characters should be transferred. However, if your text editor supports RTF or HTML, this may not be true. I'm not sure what CodeWarrior supports, but it is certainly possible.
A workaround if this is the case: First paste into a PURE text editor like Notepad. Then copy from Notepad into CodeWarrior. This should eliminate any hidden formatting. As shoover said above, make sure double-quotes " are really double-quotes and not the fancy left- and right-specific quotes that Word sometimes uses.
Use a hex editor like XVI32 to see the raw contents of the file, including nonprinting characters. Or use a text editor with support for showing nonprinting characters (vi/vim, etc.).
I'm studying C and I've just had the same problem. When copying a piece of code from a PDF file and trying to compile it, gcc returned a series of errors. Reading the answer above, I had an idea: "What if I converted the UTF-8 into ASCII?" Well, I found a website that does just that (https://onlineutf8tools.com/convert-utf8-to-ascii). But instead of converting the non-ASCII UTF-8 characters into ASCII, it showed them as hexadecimal values (you can see this better after copying from the website into a text editor). From there I realised that the problem was mostly the quote marks "".
I then copied the ASCII "translation" into my code editor (I must add that it worked fine with Sublime, while VS Code showed the same UTF-8 text as in the original file, even after copying from the website) and replaced all the hex values with the actual ASCII characters needed to compile the code properly. I used my editor's find and replace function to do it. I must say that it wasn't very fast. But I believe that in some cases, if the code you're trying to copy is long, doing it the way I've just described could be faster than rewriting the entire code.
I've written a simple console program in C which uses ANSI escape codes to color its text.
Is there a way to temporarily set the background of the whole terminal to black and the default font color to light gray? Can this be reverted after the program ends?
I'd prefer to avoid using ncurses.
Probably the simplest way to go is to set the background colour of your text with ANSI:
For instance using:
echo -e "\e[37m\e[41m"
will give you white text on a red background (you can use this to test the effect in dramatic, easy to see colours).
Whereas
echo -e "\e[97m\e[40m"
will set the foreground to white and the background to black for the duration of your program.
If you find that you're getting a kind of ugly transition zone between your background colour and the terminal's, just print a sufficient number of newlines to wipe the whole screen.
To use this in C, you'll obviously want printf instead of echo.
The wiki page on ANSI escape codes has additional information.
How to do this depends on the terminal the user is using. It may be ANSI, it may be VT100, it might be a line printer. ncurses abstracts this horror for you. It uses a database of information about how to talk to different kinds of terminal (see the contents of $TERM to see which one you are currently using) normally stored in /lib/terminfo or /usr/share/terminfo.
Once you look in those files, you'll probably want to reconsider not using ncurses, unless you have specific requirements to avoid it (embedded system with not enough storage, etc.)
In my C program I've had to swap my unicode box-drawing characters into escaped characters for DOS code page 437 to get it to work in the Windows command prompt. Is it possible to change the code page of gnome-terminal to display these characters correctly when natively compiling the program for linux?
Thanks.
From https://nethackwiki.com/wiki/IBMgraphics
The current gnome-terminal does not have a setting for code page 437, but it does support other code pages that are equivalent for NetHack's purposes, such as 862 (Hebrew).
To set code page 862 on gnome-terminal:
Select Terminal->Set Character Encoding->Add or Remove.
In the pane on the left, select the line with description Hebrew and encoding IBM862.
Click the right-pointing arrow between the two panes.
Click Close.
The above steps only need to be done once for the lifetime of the Gnome installation. Once done, it is sufficient to:
Select Terminal, Set Character Encoding, and then Hebrew (IBM862).
It should be noted that the current default gnome-terminal font in Ubuntu Jaunty fully supports DECgraphics as long as eight_bit_tty is set to false.
If you need these characters, you should use their correct Unicode code point values and output them as UTF-8. Or, if you prefer, you can output them as wide characters and let the standard library's locale system take care of converting them to UTF-8 or another "native" encoding the user has selected (which might even be CP437, although I've never seen a system set up that poorly...).
I've redirected stdout of a child process spawned with CreateProcess to a pipe. It works fine except that, as far as I can tell, no information about color changes is coming through. The child process is using SetConsoleTextAttribute to change the text color. Is it possible to detect this through the pipe and, if so, how?
I'm ultimately displaying the output in a RichEdit control and I would like to capture the color information if at all possible.
This is in C with the Win32 API on XP and Vista.
You probably need to use ReadConsoleOutput (and/or related ones) found here: http://msdn.microsoft.com/en-us/library/ms682073(VS.85).aspx.
Hope that helps.
There may be a workaround, though it is old and not used much!
Use ANSI.SYS and load that.
Whenever you output text to the console, you can set a color for that text by using an escape sequence.
Then parse the escape sequences into the equivalent for RichText Colors.
The escape sequences are standard here. Here is how to add support for ANSI.SYS into the console. And here is the official KB from Microsoft on how to do this.
For an example:
printf("\x1b[33;44mYellow on Blue\x1b[0m\n");
Now, parse the bit after the \x1b[: 33 is yellow foreground and 44 is blue background. Then look up the relevant color for each and set it in the RichTextBox.
Note: \x1b[0m turns off the attributes.
Edit: This may not be the best solution, as it applies to the legacy NTVDM 16-bit DOS command.com under XP or later. However, I found another link to 'ansicon' here, which adds ANSI support to the pure 32-bit cmd.exe console.
Hope this helps,
Best regards,
Tom.