putwchar / getwchar encoding? - c

I'm writing code which runs on both Windows and Linux. The application works with unicode strings, and I'm looking to output them to the console using common code.
Will putwchar and getwchar do the trick? For example, can I provide unicode character values to these functions, and they will both display the same character on Linux and Windows?

You are about to enter a world of pain. *nix consoles almost invariably expect you to send them UTF-8 encoded char* data.
Windows, on the other hand, uses UTF-16 for its Unicode APIs, and I believe its console APIs are limited to UCS-2.
You will probably need to find some library code that abstracts away the differences for you. I don't have a good recommendation, but I am sure that putwchar and getwchar are not the solution.
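To make the encoding difference concrete, here is a minimal sketch of turning one Unicode code point into the UTF-8 bytes a *nix console expects. The function name utf8_encode is made up for this illustration and is not part of any standard library.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 for an invalid code point.
   utf8_encode is a hypothetical helper, not a library function. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* ASCII: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* UTF-16 surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

The same code point would be one or two 16-bit units in UTF-16 on Windows, which is why a single byte-oriented output path cannot serve both platforms unchanged.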

One of the many ways to reconcile them is to use explicit conversion modes in Windows:
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif
#include <wchar.h>
#include <stdio.h>
#include <locale.h>
int main()
{
#ifdef _WIN32
_setmode(_fileno(stdout), _O_WTEXT);
#else
setlocale(LC_ALL, "en_US.UTF-8");
#endif
fputws(L"Кошка\n", stdout);
}
Tested with gcc 4.6.1 on Linux and Visual Studio 2010 on Windows.
Windows also offers the _O_U8TEXT and _O_U16TEXT modes. Your mileage may vary.

See the putwchar man page on Linux. It says that the behavior depends on LC_CTYPE and says "It is reasonable to expect that putwchar() will actually write the multibyte sequence corresponding to the wide character wc." Similarly, getwchar() should read a multibyte sequence from standard input, and return it as a wide character.
Don't assume that they will read/write a constant number of bytes like they would in UCS2.
All that said, character-by-character I/O isn't usually the fastest solution, and when you start optimizing, do keep in mind that on Linux and Unix you'll be working in UTF-8.
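As a rough illustration of why byte counts are not constant in UTF-8, the sketch below counts code points in a UTF-8 string by inspecting lead bytes. The names utf8_seq_len and utf8_count are hypothetical helpers, and the check is deliberately simple: it does not reject overlong encodings.

```c
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence that starts with lead byte b,
   or 0 if b cannot start a sequence (hypothetical helper). */
size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)           return 1;   /* 0xxxxxxx */
    if ((b & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 0;                           /* continuation or invalid byte */
}

/* Count code points in a NUL-terminated UTF-8 string.
   Returns (size_t)-1 if the string is malformed or truncated. */
size_t utf8_count(const char *s)
{
    size_t count = 0;
    while (*s != '\0') {
        size_t len = utf8_seq_len((unsigned char)*s);
        if (len == 0)
            return (size_t)-1;
        /* Every following byte must look like 10xxxxxx; a NUL fails this
           test, so we never read past the end of the string. */
        for (size_t i = 1; i < len; i++)
            if (((unsigned char)s[i] & 0xC0) != 0x80)
                return (size_t)-1;
        s += len;
        count++;
    }
    return count;
}
```

For the Cyrillic word used in the earlier example, this reports 5 characters for a string that occupies 10 bytes.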

Related

Print rectangles to terminal

I'm trying to write a text editor for Linux that looks like MS-DOS EDIT.
However, I'm stuck because I can't figure out how to draw the thin rectangles around the editor screen and dialog box. I know the Linux dialog command can do something similar:
How can I draw rectangles like that around the screen (preferably without curses)?
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃These are box-drawing characters. ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│They live in the U+2500-U+257F range of│
│Unicode characters. │
└───────────────────────────────────────┘
░▒▓▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜▓▒░
░▒▓▌ The shadows are block elements, ▐▓▒░
░▒▓▌ Unicode U+2580-U+259F. ▐▓▒░
░▒▓▙▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▟▓▒░
Once upon a time, box-drawing characters and block elements were common in CP-437. Modern terminals likely expect UTF-8. (They don't work very well in web browsers... see here if the above text looks odd.)
There are also ANSI escapes to set the background color, foreground color, and other attributes of text displayed on a terminal. I can't demonstrate it well on Stack Overflow, though.
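Since the escapes are hard to show on a web page, here is a minimal sketch of building one in C. It assumes a terminal that understands the common SGR ("select graphic rendition") codes, where 30-37 select the foreground color and 40-47 the background; sgr_color and SGR_RESET are names invented for this example.

```c
#include <stdio.h>

/* Sequence that restores the terminal's default attributes. */
#define SGR_RESET "\x1b[0m"

/* Write the escape sequence "ESC [ fg ; bg m" into buf.
   Returns the length of the sequence, or -1 if buf is too small.
   sgr_color is a hypothetical helper, not a library function. */
int sgr_color(char *buf, size_t size, int fg, int bg)
{
    int n = snprintf(buf, size, "\x1b[%d;%dm", fg, bg);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Printing the resulting string, then some text, then SGR_RESET should show that text in the chosen colors, for example yellow on blue for fg 33 and bg 44.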
The ncurses library is a good way to do what you want, although you say you want alternatives. You can use the Unicode box-drawing characters as wide characters. They include all the characters from MS-DOS code page 437.
Modern distributions should be set up to support UTF-8 by default, so this should work. (I recommend saving the source file as UTF-8 with a byte-order mark.)
#define _XOPEN_SOURCE 700
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
int main(void)
{
setlocale( LC_ALL, "" );
fputws( L"╒╩╤╣\n", stdout );
return EXIT_SUCCESS;
}
Without curses, you can check the environment variables LINES and COLUMNS to get the dimensions of the terminal (many shells set them without exporting them to child processes, so have a fallback ready). The control characters to print colors and such on the Linux console are in the console_codes(4) man page (and are a variant of the VT102 control codes, which are a superset of VT100, a superset of ANSI standard terminals). If you want to run your program inside a terminal emulator such as gnome-terminal, check its documentation too, but it will probably implement an extension of xterm, which is an extension of VT102, etc. One sequence that is very useful is "\033[2J" (erase display) followed by "\033[H" (cursor home), which will clear the screen and let you redraw it. The form feed character '\f' (Ctrl-L) is sometimes suggested for this, but console_codes(4) documents it as a plain line feed. You could also use terminfo or termcap for a more abstract and general interface, but in practical terms, nobody uses anything other than an extension of VT100 plus ANSI color any more.
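A minimal sketch of the environment-variable approach follows. It assumes the variables are named LINES and COLUMNS, as on typical systems, and falls back to 24 by 80 when they are missing or unusable; parse_dimension, terminal_columns and terminal_lines are hypothetical names chosen for this example.

```c
#include <stdlib.h>

/* Parse a positive decimal integer from text; return fallback if text is
   NULL, empty, not a clean number, or outside a sane range.
   (Hypothetical helper for this sketch.) */
int parse_dimension(const char *text, int fallback)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return fallback;
    value = strtol(text, &end, 10);
    if (*end != '\0' || value <= 0 || value > 10000)
        return fallback;
    return (int)value;
}

/* Terminal size as reported by the environment, with 80x24 as a fallback. */
int terminal_columns(void) { return parse_dimension(getenv("COLUMNS"), 80); }
int terminal_lines(void)   { return parse_dimension(getenv("LINES"), 24); }
```

The fallback matters in practice because the variables are frequently not visible to a child process, and they do not update when the window is resized.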
Make sure your terminal font includes the line-drawing characters you want to use! DejaVu Sans Mono is an excellent monospace font, especially for its coverage of Unicode. Also, you can check that your locale is set correctly with the locale command; the locale names you see should end in something like .utf8 or UTF-8.
What you're describing is using the box drawing characters present in various extended character sets. The characters available depend at least on the platform and terminal emulation.
Given your question is tagged with Linux, the easiest method would be to use the ncurses library. Why would you prefer not to use it and have to reinvent that wheel?
If you can expect at least VT100 emulation (reasonable) then you can use the basic line drawing, but higher levels have more characters.
It's a bit old, but have a look at the window sample code here:
NCURSES Programming HOWTO: Windows
You may also want to look into the Xterm escape characters (expands the VT100 set):
Xterm Control Sequences
You're looking for box-drawing characters. Here's a complete table.
Assuming your system has a Unicode font installed, which most modern distros do, you could print those to your terminal like this:
#include <wchar.h>
#include <locale.h>
...
setlocale(LC_ALL,"en_US.UTF-8");
wprintf(L"\u250C\u2500\u2510\n"); // ┌─┐
wprintf(L"\u2502 \u2502\n"); // │ │
wprintf(L"\u2514\u2500\u2518\n"); // └─┘
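Extending the three-line example to an arbitrary rectangle could look like the sketch below. It renders the box as UTF-8 text into a caller-supplied buffer, so it assumes a UTF-8 terminal when the result is printed; draw_box and box_row are names invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Light box-drawing characters, spelled out as UTF-8 bytes. */
#define BOX_TL "\xE2\x94\x8C"   /* U+250C top-left corner     */
#define BOX_TR "\xE2\x94\x90"   /* U+2510 top-right corner    */
#define BOX_BL "\xE2\x94\x94"   /* U+2514 bottom-left corner  */
#define BOX_BR "\xE2\x94\x98"   /* U+2518 bottom-right corner */
#define BOX_H  "\xE2\x94\x80"   /* U+2500 horizontal line     */
#define BOX_V  "\xE2\x94\x82"   /* U+2502 vertical line       */

/* Append one row (left, inner copies of fill, right, newline) to buf.
   Returns 1 on success, 0 if the row would not fit. */
static int box_row(char *buf, size_t size, const char *left,
                   const char *fill, const char *right, int inner)
{
    size_t used = strlen(buf);
    size_t need = strlen(left) + (size_t)inner * strlen(fill)
                + strlen(right) + 1;              /* +1 for '\n' */

    if (used + need + 1 > size)                   /* +1 for the final NUL */
        return 0;
    strcat(buf, left);
    for (int i = 0; i < inner; i++)
        strcat(buf, fill);
    strcat(buf, right);
    strcat(buf, "\n");
    return 1;
}

/* Render a box of width x height character cells (both at least 2).
   Returns 1 on success, 0 on bad arguments or a buffer that is too small. */
int draw_box(char *buf, size_t size, int width, int height)
{
    if (size == 0 || width < 2 || height < 2)
        return 0;
    buf[0] = '\0';
    if (!box_row(buf, size, BOX_TL, BOX_H, BOX_TR, width - 2))
        return 0;
    for (int row = 0; row < height - 2; row++)
        if (!box_row(buf, size, BOX_V, " ", BOX_V, width - 2))
            return 0;
    return box_row(buf, size, BOX_BL, BOX_H, BOX_BR, width - 2);
}
```

Each box character takes three bytes in UTF-8, so a buffer of roughly 3 * width * height + height + 1 bytes is a safe upper bound. The result can be written with fputs, and cursor-positioning escapes can place it anywhere on the screen.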

Console color on Linux and Windows

Recently, I've been having a problem.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
system("color 1F");
}
This works on Windows, but not on Linux. Why?
This has nothing to do with C: system() hands the string to the platform's command interpreter, and the command you are asking for doesn't necessarily exist there.
color is built into the Windows shell, but it doesn't exist on Linux, so your code is not portable as-is.
Linux has its own way of doing it. You should check which OS you're running on (at run time or at compile time) and, if it is Linux, call setterm instead, for instance. That way you have Windows and Linux covered.
As a portable alternative, standard ANSI escape sequences are also widely available on a lot of OSes (on Windows, you need Windows 10, though).
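A minimal sketch of the escape-sequence route follows. On Windows 10 the console has to be switched into virtual terminal mode first, which the Win32 calls GetConsoleMode and SetConsoleMode with ENABLE_VIRTUAL_TERMINAL_PROCESSING do; elsewhere the sketch simply assumes a VT100/xterm-compatible terminal. The names enable_ansi_colors, WHITE_ON_BLUE and COLOR_RESET are made up for this example.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Bright white text on a blue background, roughly what "color 1F" selects. */
#define WHITE_ON_BLUE "\x1b[97;44m"
#define COLOR_RESET   "\x1b[0m"

/* Returns 1 if escape sequences can be expected to work on stdout.
   (Hypothetical helper for this sketch.) */
int enable_ansi_colors(void)
{
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;

    if (out == INVALID_HANDLE_VALUE || !GetConsoleMode(out, &mode))
        return 0;
    return SetConsoleMode(out, mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING) != 0;
#else
    return 1;   /* assumption: the terminal understands ANSI color codes */
#endif
}
```

After a successful call, writing WHITE_ON_BLUE to stdout changes the colors of the text that follows, and writing COLOR_RESET restores the defaults. Unlike color 1F, this does not repaint what is already on the screen.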

Print long unicode in ncurses

I need to print some Unicode characters on the terminal for my game, such as \U0001F0A1. Here is my code:
#include <curses.h>
#include <locale.h>
int main(){
setlocale(LC_ALL, "");
initscr();
printw("\U0001F0A1");
getch();
endwin();
return 0;
}
All it prints is a blank screen, but when I tried the same string with printf, the card printed normally.
The likely problem is this: ncurses uses wcwidth to determine the width of the character, and printf does not check. The locale information on your computer is too old to give proper results.
Checking with my Debian testing, this works (using the sample program given in the question, compiling/linking with ncursesw).
According to fileformat-info, this comes from Unicode 6.0 (2010). Depending on what system you are using, e.g., Debian or Ubuntu, that may be "recent".
ncurses requires correct wcwidth locale information while printf does not use the wcwidth-information. If your locale information is too old, wcwidth returns a negative value, telling ncurses that the character is nonprinting. In that case, ncurses will display a blank.

portable alternative to kbhit() and getch() and system("cls")

I need a way to use kbhit and getch functionality in a portable way. I'm currently developing a simple ASCII game and I need to detect if a key is pressed. If it is, I need to read it, and if it isn't, I need to continue without waiting for input. I would prefer not to echo it, but I won't be too picky about that. I think kbhit and getch would be great for this, BUT I'm only allowed to use fully portable code (well, at least code for Linux, Mac and PC; not a lot of other OSes come to mind though). As I understand it, the termios, curses, and conio libraries aren't fully implemented on all three OSes I need. I'm at a loss. Every solution I have found uses non-portable code. Is there some way I can write portable functions for this myself? I'm currently including stdio.h, stdlib.h, and time.h.
I also need a portable way to clear the screen, as I'm currently using system("cls") and system("clear"), which must be changed every time I change the OS. Or is there a way I could do an if-else and detect the OS the code is running on to switch between these two statements? Here is a segment of code that has these functions:
char key = ' ';
while(1)
{
system("cls");
if (_kbhit())
{
key =_getch();
printf("output: %c", key);
}
else
printf("output:");
}
This is essentially what functionality I need in my code, but I can't figure out a portable way to do it, and my teacher requires the code to work on Linux, mac, and pc using standard c and standard libraries. Please help! And no c++ please, we are using c.
EDIT: I don't think ncurses was quite what I was looking for. Someone recommended I use #ifdef to implement these at compile time. I like this solution, but I need some help understanding how to do this on Linux and Mac, as I can only test on Windows with my current setup. Hopefully I will soon have Linux running on my other machine for testing, but OS X has a big price tag with it, so I would appreciate the help. Here's the current code:
//libraries
#include <stdio.h> //used for i/o
#include <stdlib.h> //used for clearing the screen
#include <time.h> //used to get time for random number generator
//check OS and include necessary libraries
#ifdef _WIN32
//code for Windows (32-bit and 64-bit, this part is common)
#include <conio.h>
#define CLEARSCREEN system("cls")
#define CHECKKEY _kbhit()
#define NBGETCHAR _getch()
#elif __APPLE__
//code for mac
#define CLEARSCREEN system("clear")
#define CHECKKEY
#define NBGETCHAR
#elif __linux__
//code for linux
#define CLEARSCREEN system("clear")
#define CHECKKEY
#define NBGETCHAR
#else
# error "Unsupported platform"
#endif
int main()
{
char key = ' ';
while(1)
{
CLEARSCREEN;
if (CHECKKEY)
{
key=NBGETCHAR;
printf("output: %c", key);
}
else
printf("output:");
}
}
You should look into the portable ncurses library. In addition to many other tools for drawing in the terminal, it provides a keyboard interface which includes a getch() function.
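If ncurses is not an option, the empty Linux and Mac macros in the question could be filled in along the lines of the sketch below. It relies on POSIX termios and select rather than standard C, so it is an assumption that this level of portability is acceptable; the function names (term_raw_enable, term_raw_disable, key_available, key_read) are made up for this example.

```c
#include <sys/select.h>
#include <termios.h>
#include <unistd.h>

static struct termios saved_termios;   /* settings to restore later */
static int raw_active = 0;

/* Turn off line buffering and echo on stdin.
   Returns 0 on success, -1 if stdin is not a terminal or the call fails. */
int term_raw_enable(void)
{
    struct termios raw;

    if (!isatty(STDIN_FILENO) || tcgetattr(STDIN_FILENO, &saved_termios) != 0)
        return -1;
    raw = saved_termios;
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO);   /* no line mode, no echo */
    raw.c_cc[VMIN] = 1;                          /* read() returns after 1 byte */
    raw.c_cc[VTIME] = 0;
    if (tcsetattr(STDIN_FILENO, TCSANOW, &raw) != 0)
        return -1;
    raw_active = 1;
    return 0;
}

/* Restore the settings saved by term_raw_enable(). */
void term_raw_disable(void)
{
    if (raw_active) {
        tcsetattr(STDIN_FILENO, TCSANOW, &saved_termios);
        raw_active = 0;
    }
}

/* Rough stand-in for kbhit(): 1 if stdin can be read without blocking. */
int key_available(void)
{
    fd_set readfds;
    struct timeval timeout = {0, 0};             /* poll, do not wait */

    FD_ZERO(&readfds);
    FD_SET(STDIN_FILENO, &readfds);
    return select(STDIN_FILENO + 1, &readfds, NULL, NULL, &timeout) > 0;
}

/* Rough stand-in for getch(): one byte, or -1 on end of input or error. */
int key_read(void)
{
    unsigned char c;
    return read(STDIN_FILENO, &c, 1) == 1 ? (int)c : -1;
}
```

With this in place, CHECKKEY could expand to key_available() and NBGETCHAR to key_read(), after calling term_raw_enable() once at startup and term_raw_disable() before exiting. Special keys such as arrows arrive as multi-byte escape sequences, which this sketch does not decode.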

Unbuffered I/O in ANSI C

For the sake of education, and programming practice, I'd like to write a simple library that can handle raw keyboard input, and output to the terminal in 'real time'.
I'd like to stick with ansi C as much as possible, I just have no idea where to start something like this. I've done several google searches, and 99% of the results use libraries, or are for C++.
I'd really like to get it working in windows, then port it to OSX when I have the time.
Sticking with Standard C as much as possible is a good idea, but you are not going to get very far with your adopted task using just Standard C. The mechanisms to obtain characters from the terminal one at a time are inherently platform specific. For POSIX systems (MacOS X), look at the <termios.h> header. Older systems use a vast variety of headers and system calls to achieve similar effects. You'll have to decide whether you are going to do any special character handling, remembering that things like 'line kill' can appear at the end of the line and zap all the characters entered so far.
For Windows, you'll need to delve into the WIN32 API - there is going to be essentially no commonality in the code between Unix and Windows, at least where you put the 'terminal' into character-by-character mode. Once you've got a mechanism to read single characters, you can manage common code - probably.
Also, you'll need to worry about the differences between characters and the keys pressed. For example, to enter 'ï' on MacOS X, you type option-u and i. That's three key presses.
To set an open stream to be non-buffered using ANSI C, you can do this:
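#include <stdio.h>
/* stream is an open FILE *; call setvbuf before any other operation on it. */
/* This changes stdio buffering only, not the terminal's own line editing. */
if (setvbuf(stream, NULL, _IONBF, 0) == 0)
printf("Set stream to unbuffered mode\n");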
(Reference: C89 4.9.5.6)
However, after that you're on your own. :-)
This is not possible using only standard ISO C. However, you can try using the following:
#include <stdio.h>
void setbuf(FILE * restrict stream, char * restrict buf);
and related functions.
Your best bet though is to use the ncurses library.
