Print long unicode in ncurses - c

I need to print some Unicode characters in the terminal for my game, for example \U0001F0A1. My code is:
#include <curses.h>
#include <locale.h>
int main(){
setlocale(LC_ALL, "");
initscr();
printw("\U0001F0A1");
getch();
endwin();
return 0;
}
All it prints is a blank screen, but when I tried printf, it printed the card normally.

The likely problem is this: ncurses uses wcwidth to determine the width of the character, and printf does not check. The locale information on your computer is too old to give proper results.
On my Debian testing system this works (using the sample program given in the question, compiled and linked with ncursesw).
According to fileformat-info, this comes from Unicode 6.0 (2010). Depending on what system you are using, e.g., Debian or Ubuntu, that may be "recent".
ncurses requires correct wcwidth locale information while printf does not use the wcwidth-information. If your locale information is too old, wcwidth returns a negative value, telling ncurses that the character is nonprinting. In that case, ncurses will display a blank.
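To see what the locale data on a particular machine reports, one could query wcwidth directly. The following is a minimal sketch, not part of the original answer; locale_can_print and report_width are hypothetical helper names, and it assumes a system where wchar_t is 32 bits wide (as on Linux).

```c
#define _XOPEN_SOURCE 700   /* exposes wcwidth() in <wchar.h> */
#include <stdio.h>
#include <wchar.h>

/* Returns 1 if the current locale data assigns wc a column width,
 * 0 if wcwidth() reports it as nonprinting or unknown (negative). */
int locale_can_print(wchar_t wc)
{
    return wcwidth(wc) >= 0;
}

/* Prints the raw wcwidth() result for one code point. */
void report_width(wchar_t wc)
{
    printf("U+%04lX: wcwidth = %d\n", (unsigned long)wc, wcwidth(wc));
}
```

Calling setlocale(LC_ALL, "") and then report_width(0x1F0A1) shows the value ncurses would see; a negative width would be consistent with the blank output described above.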

Related

ncurses.h extended characters not displaying properly in c

I am trying to pretty up my program by using ncurses extended characters. However, some of them show up as the question mark in a box: ⍰. This happens when I try functions such as:
addch(ACS_S1);
addch(ACS_LANTERN);
addch(ACS_S3);
And so on. Any help would be appreciated.
#include <ncurses.h>
int main()
{
initscr();
addch(ACS_S1);
addch(ACS_S3);
addch(ACS_S7);
addch(ACS_S9);
addch(ACS_LANTERN);
refresh();
getch();
endwin();
return 0;
}
edit: I forgot to add the code example. So I added it this time
edit: I am using Ubuntu to compile my code
You forgot to tell ncurses what the locale is (and if you did not compile/link with ncursesw, there are still some limitations):
The library uses the locale which the calling program has initialized.
That is normally done with setlocale:
setlocale(LC_ALL, "");
If the locale is not initialized, the library assumes that characters
are printable as in ISO-8859-1, to work with certain legacy programs.
You should initialize the locale and not rely on specific details of
the library when the locale has not been setup.

Print rectangles to terminal

I'm trying to write a text editor for Linux that looks like MS-DOS EDIT.
However, I'm stuck because I can't figure out how to draw the thin rectangles around the editor screen and dialog box. I know the Linux dialog command can do something similar:
How can I draw rectangles like that around the screen (preferably without curses)?
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃These are box-drawing characters. ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│They live in the U+2500-U+257F range of│
│Unicode characters. │
└───────────────────────────────────────┘
░▒▓▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜▓▒░
░▒▓▌ The shadows are block elements, ▐▓▒░
░▒▓▌ Unicode U+2580-U+259F. ▐▓▒░
░▒▓▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟▓▒░
Once upon a time, box-drawing characters and block elements were common in CP-437. Modern terminals likely expect UTF-8. (They don't work very well in web browsers.)
There are also ANSI escapes to set the background color, foreground color, and other attributes of text displayed on a terminal. I can't demonstrate it well on Stack Overflow, though.
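As a rough illustration of those escapes, the sketch below builds an SGR ("select graphic rendition") sequence of the form ESC [ fg ; bg m. The helper names sgr_colors and print_colored are made up for this example, and it assumes a terminal that understands the basic ANSI color codes (30-37 for foreground, 40-47 for background).

```c
#include <stdio.h>

/* Writes an SGR sequence for the given foreground (30-37) and
 * background (40-47) color codes into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int sgr_colors(char *buf, size_t size, int fg, int bg)
{
    int n = snprintf(buf, size, "\x1b[%d;%dm", fg, bg);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Prints text in the given colors, then resets all attributes with ESC[0m. */
void print_colored(const char *text, int fg, int bg)
{
    char seq[16];
    if (sgr_colors(seq, sizeof seq, fg, bg) > 0)
        printf("%s%s\x1b[0m", seq, text);
    else
        fputs(text, stdout);   /* fall back to plain text */
}
```

For example, print_colored("File", 37, 44) would be expected to show white text on a blue background on such a terminal.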
The ncurses library is a good way to do what you want, although you say you want alternatives. You can use the Unicode box-drawing characters as wide characters. They include all the characters from MS-DOS code page 437.
Modern distributions should be set up to support UTF-8 by default, so this should work. (I recommend saving the source file as UTF-8 with a byte-order mark.)
#define _XOPEN_SOURCE 700
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
int main(void)
{
setlocale( LC_ALL, "" );
fputws( L"╒╩╤╣\n", stdout );
return EXIT_SUCCESS;
}
Without curses, you can check the environment variables LINES and COLUMNS to get the dimensions of the terminal (shells set these but do not always export them). The control sequences to print colors and such on the Linux console are in the console_codes(4) man page (they are a variant of the VT102 control codes, which are a superset of VT100, itself a superset of ANSI standard terminals). If you want to run the program in a terminal emulator such as gnome-terminal, check its documentation too, but it will probably implement an extension of xterm, which is an extension of VT102, and so on. The form feed character '\f' (Ctrl-L) is sometimes suggested for clearing the screen, but the Linux console and xterm treat it as a line feed; the sequence ESC [ 2 J followed by ESC [ H clears the screen and homes the cursor so you can redraw it. You could also use terminfo or termcap for a more abstract and general interface, but in practical terms, almost nobody uses anything other than an extension of VT100 plus ANSI color any more.
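A sketch of reading the terminal dimensions without curses follows. It goes slightly beyond the answer above by first asking the kernel through the TIOCGWINSZ ioctl (an addition, not something the answer mentions), then falling back to the environment variables, and finally to an assumed 24x80 default. The function names are hypothetical.

```c
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Parses a positive integer from an environment variable;
 * returns 0 if the variable is unset or not a positive number. */
static int env_dimension(const char *name)
{
    const char *value = getenv(name);
    int n = value ? atoi(value) : 0;
    return n > 0 ? n : 0;
}

/* Stores the terminal size in *rows and *cols.
 * Order of preference: TIOCGWINSZ ioctl, then LINES/COLUMNS, then 24x80. */
void terminal_size(int *rows, int *cols)
{
    struct winsize ws;
    *rows = *cols = 0;
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0) {
        *rows = ws.ws_row;
        *cols = ws.ws_col;
    }
    if (*rows <= 0) *rows = env_dimension("LINES");
    if (*cols <= 0) *cols = env_dimension("COLUMNS");
    if (*rows <= 0) *rows = 24;   /* assumed fallback */
    if (*cols <= 0) *cols = 80;   /* assumed fallback */
}
```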
Make sure your terminal font includes the line-drawing characters you want to use! DejaVu Sans Mono is an excellent monospace font, especially for its coverage of Unicode. Also, you can check that your locale is set correctly with the locale command; the locale names you see should end in something like .utf8 or UTF-8.
What you're describing is using the box drawing characters present in various extended character sets. The characters available depend at least on the platform and terminal emulation.
Given your question is tagged with Linux, the easiest method would be to use the ncurses library. Why would you prefer not to use it and have to reinvent that wheel?
If you can expect at least VT100 emulation (reasonable) then you can use the basic line drawing, but higher levels have more characters.
It's a bit old, but have a look at the window sample code here:
NCURSES Programming HOWTO: Windows
You may also want to look into the Xterm escape characters (expands the VT100 set):
Xterm Control Sequences
You're looking for box-drawing characters. Here's a complete table.
Assuming your system has a Unicode font installed, which most modern distros do, you could print those to your terminal like this:
#include <wchar.h>
#include <locale.h>
...
setlocale(LC_ALL,"en_US.UTF-8");
wprintf(L"\u250C\u2500\u2510\n"); // ┌─┐
wprintf(L"\u2502 \u2502\n"); // │ │
wprintf(L"\u2514\u2500\u2518\n"); // └─┘
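The same idea can be generalized to a rectangle of arbitrary size. This is only a sketch under the same assumptions as above (a UTF-8 locale and a font with the box-drawing range); box_row and draw_box are hypothetical names, and the 128-character row limit is an arbitrary choice for the example.

```c
#include <stdio.h>
#include <wchar.h>

/* Fills buf with one row of a box: left, (width - 2) fill characters, right.
 * Returns the row length, or 0 if width < 2 or buf cannot hold width + 1. */
size_t box_row(wchar_t *buf, size_t size, size_t width,
               wchar_t left, wchar_t fill, wchar_t right)
{
    if (width < 2 || size < width + 1)
        return 0;
    buf[0] = left;
    for (size_t i = 1; i + 1 < width; i++)
        buf[i] = fill;
    buf[width - 1] = right;
    buf[width] = L'\0';
    return width;
}

/* Prints a width x height box to stdout using light box-drawing characters.
 * The caller is expected to have called setlocale(LC_ALL, "") first. */
void draw_box(size_t width, size_t height)
{
    wchar_t row[128];
    if (height < 2 || box_row(row, 128, width, 0x250C, 0x2500, 0x2510) == 0)
        return;
    wprintf(L"%ls\n", row);                            /* top edge */
    box_row(row, 128, width, 0x2502, L' ', 0x2502);
    for (size_t i = 2; i < height; i++)
        wprintf(L"%ls\n", row);                        /* side edges */
    box_row(row, 128, width, 0x2514, 0x2500, 0x2518);
    wprintf(L"%ls\n", row);                            /* bottom edge */
}
```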

How could I guarantee a terminal has Unicode/wide character support with NCURSES?

I am developing an NCURSES application for a little TUI (text user interface) exercise. Unfortunately, I do not have the option of using the ever-so-wonderful-and-faithful ASCII. My program uses a LOT of Unicode box drawing characters.
My program can already detect if the terminal is color-capable. I need to do something like:
if(!supportsUnicode()) //I prefer camel-case, it's just the way I am.
{
fprintf(stderr, "This program requires a Unicode-capable terminal.\n\r");
exit(1);
}
else
{
//Yay, we have Unicode! some random UI-related code goes here.
}
This isn't just a matter of simply including ncursesw and just setting the locale. I need to get specific terminal info and actually throw an error if it's not gonna happen. I need to, for example, throw an error when the user tries to run the program in the lovely XTerm rather than the Unicode-capable UXTerm.
As noted, you cannot detect the terminal's capabilities reliably. For that matter, you cannot detect the terminal's support for color either. In either case, your application can only detect what you have configured, which is not the same thing.
Some people have had partial success detecting Unicode support by writing a UTF-8-encoded character and using the cursor-position report to see where the cursor ends up (see for example Detect how much of Unicode my terminal supports, even through screen).
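A rough sketch of that cursor-position trick is shown below. It is an illustration rather than a reliable test: it assumes the terminal answers the ESC [ 6 n request with ESC [ row ; col R, and that fd is a readable and writable terminal descriptor (for instance /dev/tty opened with O_RDWR). The names parse_cpr and probe_utf8 are hypothetical.

```c
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/* Parses a cursor-position report of the form ESC [ row ; col R.
 * Returns 0 on success, -1 if the reply is malformed. */
int parse_cpr(const char *reply, int *row, int *col)
{
    char final = 0;
    if (sscanf(reply, "\x1b[%d;%d%c", row, col, &final) != 3 || final != 'R')
        return -1;
    return 0;
}

/* Returns 1 if a two-byte UTF-8 character advanced the cursor by exactly
 * one column, 0 if it did not, and -1 if the terminal did not answer. */
int probe_utf8(int fd)
{
    struct termios saved, raw;
    char reply[32];
    size_t len = 0;
    int row, col, result = -1;

    if (!isatty(fd) || tcgetattr(fd, &saved) != 0)
        return -1;
    raw = saved;
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    raw.c_cc[VMIN] = 0;
    raw.c_cc[VTIME] = 5;              /* give up after 0.5 s of silence */
    tcsetattr(fd, TCSANOW, &raw);

    /* Go to column 1, print U+00E9 (0xC3 0xA9), then request the position. */
    const char probe[] = "\r\xc3\xa9\x1b[6n";
    if (write(fd, probe, sizeof probe - 1) == (ssize_t)(sizeof probe - 1)) {
        char ch;
        while (len < sizeof reply - 1 && read(fd, &ch, 1) == 1) {
            reply[len++] = ch;
            if (ch == 'R')
                break;
        }
        reply[len] = '\0';
        if (parse_cpr(reply, &row, &col) == 0)
            result = (col == 2);
    }

    /* Erase the probe character(s) and restore the terminal settings. */
    ssize_t ignored = write(fd, "\r  \r", 4);
    (void)ignored;
    tcsetattr(fd, TCSANOW, &saved);
    return result;
}
```

A terminal that decodes UTF-8 should report column 2 after the probe, while one that treats the two bytes as separate characters would typically report column 3.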
Compiling/linking with ncursesw relies upon having your locale configured properly, with some workarounds for terminals (such as PuTTY) which do not support VT100 line-graphics when in UTF-8 mode.
Further reading:
Line Graphics curs_add_wch(3x)
NCURSES_NO_UTF8_ACS ncurses(3x)
You can't. ncurses(w) uses the terminfo database (or termcap) to determine what capabilities a terminal has, and that looks at the $TERM environment variable to determine what terminal is being used. There is no special value of that variable that indicates that a terminal supports Unicode; both XTerm and UXTerm set TERM=xterm. Many other terminal applications use that value of $TERM as well, including both ones that support Unicode and ones that don't. (Indeed, in many terminal emulators, it's possible to enable and disable Unicode support at runtime.)
If you want to start outputting Unicode text to the terminal, you will just have to take it on faith that the user's terminal will support that.
If all you want to do is output box drawing characters, though, you may not need Unicode at all; those characters are available as part of the VT100 graphical character set. You can output these characters in an ncurses application using the ACS_* constants (e.g., ACS_ULCORNER for ┌), or use a function like box() to draw a larger figure for you.
The nl_langinfo() function shall return a pointer to a string containing information relevant to the particular language or cultural area defined in the current locale.
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>
bool supportsUnicode()
{
/* Set a locale for the ctype and multibyte functions.
* This controls recognition of upper and lower case,
* alphabetic or non-alphabetic characters, and so on.
*/
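/* Use the locale from the user's environment. Forcing "en_US.UTF-8"
 * here would make the check succeed wherever that locale is installed. */
setlocale(LC_CTYPE, "");
return strcmp(nl_langinfo(CODESET), "UTF-8") == 0;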
}
Refer to htop source code which can draw lines with/without Unicode.

putwchar / getwchar encoding?

I'm writing code which runs on both Windows and Linux. The application works with unicode strings, and I'm looking to output them to the console using common code.
Will putwchar and getwchar do the trick? For example, can I provide unicode character values to these functions, and they will both display the same character on Linux and Windows?
You are about to enter a world of pain. Invariably *nix consoles prefer you to send them UTF-8 encoded char* data.
Windows, on the other hand, uses UTF-16 for its Unicode APIs, and for console APIs I believe it is limited to UCS-2.
You probably need to find some library code that abstracts away the differences for you. I don't have a good recommendation for you, but I am sure that putwchar and getwchar are not the solution.
One of the many ways to reconcile them is to use explicit conversion modes in Windows:
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif
#include <wchar.h>
#include <stdio.h>
#include <locale.h>
int main()
{
#ifdef _WIN32
_setmode(_fileno(stdout), _O_WTEXT);
#else
setlocale(LC_ALL, "en_US.UTF-8");
#endif
fputws(L"Кошка\n", stdout);
}
Tested with gcc 4.6.1 on Linux and Visual Studio 2010 on Windows.
There's also a _O_U8TEXT and _O_U16TEXT in Windows. Your mileage may vary.
See the putwchar man page on Linux. It says that the behavior depends on LC_CTYPE and says "It is reasonable to expect that putwchar() will actually write the multibyte sequence corresponding to the wide character wc." Similarly, getwchar() should read a multibyte sequence from standard input, and return it as a wide character.
Don't assume that they will read/write a constant number of bytes like they would in UCS-2.
All that said, character-by-character I/O isn't usually the fastest solution, and when you start optimizing, do keep in mind that on Linux and Unix you'll be working in UTF-8.
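To make the variable-length point concrete, here is a small sketch that asks the current locale how many bytes one wide character occupies, using the standard wcrtomb function. The helper name encoded_length is made up for this example.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Returns how many bytes wc occupies in the current locale's multibyte
 * encoding, or 0 if the locale cannot represent the character. */
size_t encoded_length(wchar_t wc)
{
    char bytes[MB_LEN_MAX];
    mbstate_t state;
    memset(&state, 0, sizeof state);   /* initial conversion state */
    size_t n = wcrtomb(bytes, wc, &state);
    return n == (size_t)-1 ? 0 : n;
}
```

After setlocale(LC_ALL, "") in a UTF-8 locale, encoded_length(0x041A) (Cyrillic К) would be expected to return 2 and encoded_length(0x20AC) (the euro sign) 3, whereas an ASCII letter always takes one byte.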

Equivalent to Windows getch() for Mac/Linux crashes

I am using getch() and my app crashes instantly. Including when doing:
int main()
{
getch();
}
I can't find the link but supposedly the problem is that it needs to turn off buffering or something strange along those lines, and I still want cout to work along with cross platform code.
I was told to use std::cin.get(), but I'd like the app to quit when a key is pressed, not when the user typed in a letter or number then press enter to quit.
Is there any function for this? The code must work under Mac (my os) and Windows.
Linking/compiling is not an issue; I include <curses.h> and link with -lcurses in Xcode, while Windows uses <conio.h>.
Have you looked in <curses.h> to see what the getch() function does?
Hint: OSX and Linux are not the same as Windows.
Specifically, as a macro in <curses.h>, we find:
#define getch() wgetch(stdscr)
Now, there appears, on your system, to be an actual function getch() in the curses library, but it expects stdscr to be set up, and that is done by the curses initialization functions (initscr() and relatives), and that is signally not done by your code. So, your code is invoking undefined behaviour by calling curses routines before the correct initialization is done, leading to the crash.
(Good hint from dmckee - it helped get the link line out of acidzombie24, which was important.)
To get to a point where a single key-stroke can be read and the program terminated cleanly, you have to do a good deal of work on Unix (OSX, Linux). You would have to trap the initial state of the terminal, arrange for an atexit() function - or some similar mechanism - to restore the state of the terminal, change the terminal from cooked mode into raw mode, then invoke a function to read a character (possibly just read(0, &c, 1)), and do your exit. There might be other ways to do it - but it certainly will involve some setup and teardown operations.
One book that might help is Advanced Unix Programming, 2nd Edn by Marc Rochkind; it covers terminal handling at the level needed. Alternatively, you can use <curses.h> properly - that will be simpler than a roll-your-own solution, and probably more reliable.
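The roll-your-own steps described above (save the terminal state, arrange for it to be restored, switch off line buffering, read one byte) can be sketched with termios as follows. This is an illustration under some assumptions: it disables only canonical mode and echo rather than switching to full raw mode, so Ctrl-C still works, and the names single_key_mode, restore_terminal and wait_for_key are hypothetical.

```c
#include <stdlib.h>
#include <termios.h>
#include <unistd.h>

static struct termios saved_state;   /* terminal settings at first use */
static int saved_valid = 0;

/* Returns a copy of t configured for unbuffered, unechoed single-byte reads. */
struct termios single_key_mode(struct termios t)
{
    t.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    t.c_cc[VMIN] = 1;    /* read() returns after one byte */
    t.c_cc[VTIME] = 0;   /* no timeout */
    return t;
}

/* Registered with atexit() so the terminal is restored on normal exit. */
static void restore_terminal(void)
{
    if (saved_valid)
        tcsetattr(STDIN_FILENO, TCSANOW, &saved_state);
}

/* Waits for one key press and returns it as an unsigned byte value,
 * or -1 on error (for example when stdin is not a terminal). */
int wait_for_key(void)
{
    unsigned char c;
    if (!saved_valid) {
        if (tcgetattr(STDIN_FILENO, &saved_state) != 0)
            return -1;
        saved_valid = 1;
        atexit(restore_terminal);
    }
    struct termios mode = single_key_mode(saved_state);
    if (tcsetattr(STDIN_FILENO, TCSANOW, &mode) != 0)
        return -1;
    ssize_t n = read(STDIN_FILENO, &c, 1);
    restore_terminal();
    return n == 1 ? c : -1;
}
```

A program would call wait_for_key() where the Windows version calls getch(); ordinary buffered output such as cout or printf keeps working because only the input side of the terminal is changed.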
You have not exhibited a
#include <stdio.h>
or
#include <curses.h>
or similar line. Are you sure that you are linking against a library that includes getch()?
Use the cin.get() function for example:
#include <iostream>
using namespace std;
int main()
{
char input = cin.get();
cout << "You Pressed: " << input;
}
The program would then wait for you to press a key.
Once you have, the key you pressed would be printed to the screen.
The <conio.h> getch function is not available on Unix-like systems, but you can get a similar pause by running a shell command through the system function.
Usage:
In Windows you can use system("pause");
In Unix-like systems (such as OSX) you can use system("read -n1 -p ' ' key"); (this relies on the bash read builtin, so it may fail where /bin/sh is a different shell that does not support -n).
Note: system is declared in <stdlib.h>.
