Non-spacing characters in curses - c

I was trying to write a basic program to print ā (a with overline) in C using curses and non-spacing characters. I have set the locale to en_US.UTF-8 and I am able to print international language characters using that. This code only prints a without overline. I am getting similar results with ncurses too. What else do I need to do to get ā on screen?
#include <curses.h>
#include <locale.h>
#include <wchar.h>
#include <assert.h>
int main() {
setlocale(LC_ALL, "");
initscr();
int s = 0x41; // represents 'a'
int ns = 0x0305; // represents COMBINING OVERLINE (a non-spacing character)
assert(wcwidth(ns) == 0);
wchar_t wstr[] = { s, ns, L'\0'};
cchar_t *cc;
int x = setcchar(cc, wstr, 0x00, 0, NULL);
assert(x == 0);
add_wch(cc);
refresh();
getch();
endwin();
return 0;
}

The curses calls need a pointer to an actual cchar_t object, not just an uninitialized pointer.
It's okay to pass a null-terminated array for the wide-characters, but the pointer for the cchar_t data needs some repair.
Here's a fix for the program:
> diff -u foo.c.orig foo.c
--- foo.c.orig 2020-05-21 19:50:48.000000000 -0400
+++ foo.c 2020-05-21 19:51:46.799849136 -0400
@@ -3,7 +3,7 @@
#include <wchar.h>
#include <assert.h>
-int main() {
+int main(void) {
setlocale(LC_ALL, "");
initscr();
int s = 0x41; // represents 'a'
@@ -12,11 +12,11 @@
assert(wcwidth(ns) == 0);
wchar_t wstr[] = { s, ns, L'\0'};
- cchar_t *cc;
- int x = setcchar(cc, wstr, 0x00, 0, NULL);
+ cchar_t cc;
+ int x = setcchar(&cc, wstr, 0x00, 0, NULL);
assert(x == 0);
- add_wch(cc);
+ add_wch(&cc);
refresh();
getch();
That produces (on xterm) an "A" with an overbar.
(For what it's worth, 0x61 is "a", while 0x41 is "A").

Your code is basically correct aside from the declaration of cc. You'd be well-advised to hide the cursor, though; I think it is preventing you from seeing the overbar incorrectly rendered in the following character position.
I modified your code as follows:
#include <curses.h>
#include <locale.h>
#include <wchar.h>
#include <assert.h>
int main() {
setlocale(LC_ALL, "");
initscr();
int s = 0x41; // represents 'A'
int ns = 0x0305; // represents COMBINING OVERLINE (a non-spacing character)
assert(wcwidth(ns) == 0);
wchar_t wstr[] = { s, ns, L'\0'};
cchar_t cc; /* Changed *cc to cc */
int x = setcchar(&cc, wstr, 0x00, 0, NULL); /* Changed cc to &cc */
assert(x == 0);
curs_set(0); /* Added to hide the cursor */
add_wch(&cc); /* Changed cc to &cc */
refresh();
getch();
endwin();
return 0;
}
I tested on a kubuntu system, since that's what I have handy. The resulting program worked perfectly on xterm (which has ugly fonts) but not on konsole. On konsole, it rendered the overbar in the following character position, which is clearly a rendering bug since the overbar appears on top of the following character if there is one. After changing konsole's font to Liberation Mono, the test program worked perfectly.
The rendering bug is not going to be easy to track down because it is hard to reproduce, although from my experiments it seems to show up reliably when the font is DejaVu Sans Mono. Curiously, my system is set up to use non-spacing characters from DejaVu Sans Mono as substitutes in other fonts, such as Ubuntu Mono, and when these characters are used as substitutes, the spacing appears to be correct. However, Unicode rendering is sufficiently intricate that I cannot actually prove that the substitute characters really come from the configured font, and the rendering bug seems to come and go. It may depend on the font cache, although I can't prove that either.
If I had more to go on I'd file a bug report, and if I get motivated to look at this some more tomorrow, I might find something. Meanwhile, any information that other people can provide will undoubtedly be useful; at a minimum, that should include operating system and console emulator, with precise version numbers, and a list of fonts tried along with an indication whether they succeeded or not.
It's not necessary to use ncurses to see this bug, by the way. It's sufficient to test in your shell:
printf '\u0041\u0305\u000a'
will suffice. I found it interesting to test
printf '\u0041\u0305\u0321\u000a'
as well.
The system I tested it on:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
$ konsole --version
konsole 17.12.3
$ # Fonts showing bug
$ otfinfo -v /usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf
Version 2.37
$ # Fonts not showing bug
$ otfinfo -v /usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf
Version 1.07.4

There are multiple issues here. First, you're storing the result of setcchar to random memory at an uninitialized pointer, cc. Whenever a function takes a pointer for output, you need to pass the address of an object where the result will be stored, not an uninitialized pointer variable. A single cchar_t is enough here: it holds one spacing character together with the non-spacing characters that combine with it, so no array is needed:
cchar_t cc;
int x = setcchar(&cc, wstr, 0x00, 0, NULL);
Then, the add_wch function takes a pointer to that cchar_t and adds the whole complex character (the base character plus its combining marks) in one call, so you need to pass it &cc rather than the uninitialized pointer.

Related

Using the gotoxy() function to center X coordinate

I want to write something using printf while also centering the x coordinate and y=0.
How can I center the x coordinate? For example, someone might have their compiler window open in fullscreen and others might not. I want the text in the middle. Right now x is assigned an arbitrary value (50).
#include <stdio.h>
#include <conio.h>
int main()
{
gotoxy(50,0);
printf("Test");
return 0;
}
I'm just using an online compiler right now (onlinegdb.com). I was wondering whether there is a way to center x so that it's the same in every compiler.
What is possible or not isn't determined by the compiler you are using, but by the platform and the amount of code you are prepared to write.
Standard C has no idea of consoles, windows and other platform-dependent stuff. If you want to know something about your console's properties, you have to ask the console or operating system. There are also libraries like ncurses for POSIX that allow the different terminals POSIX systems can run on to be treated uniformly.
An implementation of the ncurses-library that is available for DOS, OS/2, Win32, X11 and SDL is PDCurses. It can be used to write platform agnostic code.
But since you mentioned that your platform is Windows, here is a solution that uses only the WinAPI:
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <windows.h>
COORD get_console_dimensions(void)
{
CONSOLE_SCREEN_BUFFER_INFO csbi;
GetConsoleScreenBufferInfo(GetStdHandle(STD_OUTPUT_HANDLE), &csbi);
COORD dimensions = { csbi.srWindow.Right - csbi.srWindow.Left + 1, /* window coordinates are inclusive */
csbi.srWindow.Bottom - csbi.srWindow.Top + 1 };
return dimensions;
}
COORD get_console_cursor_pos(void)
{
CONSOLE_SCREEN_BUFFER_INFO csbi;
GetConsoleScreenBufferInfo(GetStdHandle(STD_OUTPUT_HANDLE), &csbi);
return csbi.dwCursorPosition;
}
void gotoxy(short x, short y)
{
COORD pos = { x, y };
SetConsoleCursorPosition(GetStdHandle(STD_OUTPUT_HANDLE), pos);
}
void puts_centered(char const *str)
{
size_t length = strlen(str);
short x = (short)(get_console_dimensions().X - length) / 2;
gotoxy(x, get_console_cursor_pos().Y);
puts(str);
}
int main(void)
{
puts_centered("Hello, World!");
}
Using ncurses the same can be achieved (also works with PDCurses, include <curses.h> instead of <ncurses.h>):
#include <string.h>
#include <ncurses.h>
int main(void)
{
initscr();
int max_x = getmaxx(stdscr);
int y, x;
getyx(stdscr, y, x);
char const *str = "Hello, World!\n";
mvaddstr(y, (max_x - strlen(str)) / 2, str);
refresh();
// endwin(); // *)
}
Live: https://onlinegdb.com/HkIpXBUim
Please note that OnlineGDB's support for ncurses with its "terminal" is broken. getmaxx() won't tell the real width of its console.
*) Documentation says you should call endwin() before exiting your program. If you do so with OnlineGDB you won't get any visible output at all from OnlineGDB. Only if you click the "Copy output to clipboard"-button and view the copied text you'll see the ANSI escape sequences produced by ncurses.

mbrtowc returns -1 for non-ASCII characters on embedded device but not on Linux computer

Task
At the moment I am porting old DOS code for a device to Linux in pure C. The text is drawn on the surface with the help of bitfonts. I wrote a function which needs the Unicode codepoint to be passed and then draws the corresponding glyph (tested and works with different ASCII and non-ASCII characters). The old source code used DOS encoding but I am trying to use UTF-8 since multilanguage support is desired. I cannot use SDL_ttf or similar functions since the produced glyphs are not "precise" enough. Therefore I have to stick with bitfonts.
Issue
I wrote a small C test program to test the conversion of multibyte characters to their corresponding Unicode codepoint (inspired by http://en.cppreference.com/w/c/string/multibyte/mbrtowc).
#include <stdio.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>
#include <stdint.h>
int main(void)
{
size_t n = 0, x = 0;
setlocale(LC_CTYPE, "en_US.utf8");
mbstate_t state = {0};
char in[] = "!°水"; // or u8"zß水"
size_t in_sz = sizeof(in) / sizeof (*in);
printf("Processing %zu UTF-8 code units: [ ", in_sz);
for(n = 0; n < in_sz; ++n)
{
printf("%#x ", (unsigned char)in[n]);
}
puts("]");
wchar_t out[in_sz];
char* p_in = in, *end = in + in_sz;
wchar_t *p_out = out;
int rc = 0;
while((rc = mbrtowc(p_out, p_in, end - p_in, &state)) > 0)
{
p_in += rc;
p_out += 1;
}
size_t out_sz = p_out - out + 1;
printf("into %zu wchar_t units: [ ", out_sz);
for(x = 0; x < out_sz; ++x)
{
printf("%u ", (unsigned short)out[x]);
}
puts("]");
}
The output is as expected:
Processing 7 UTF-8 code units: [ 0x21 0xc2 0xb0 0xe6 0xb0 0xb4 0 ]
into 4 wchar_t units: [ 33 176 27700 0 ]
When I run this code on my embedded Linux device I get the following as output:
Processing 7 UTF-8 code units: [ 0x21 0xc2 0xb0 0xe6 0xb0 0xb4 0 ]
into 2 wchar_t units: [ 33 55264 ]
After the ! character the mbrtowc output is -1, which, according to the documentation, occurs when an encoding error happened. I tested it with different characters and this error occurs only with non-ASCII characters. The error never occurred on the Linux computer.
Additional Information
I am using a PFM-540I Rev. B as pc on the embedded device. The Linux distribution is built using Buildroot.
You need to make sure that the en_US.utf8 locale is available on the embedded Linux build. By default, Buildroot limits the locales installed on the system in two ways:
Only specific locales are generated, as specified by the BR2_GENERATE_LOCALE configure option. By default, this list is empty, so you only get the C locale. Set this config option to en_US.UTF-8.
All locale data is removed at the end of the build, except the ones specified in BR2_ENABLE_LOCALE_WHITELIST. en_US is already in the default value, so probably you don't need to change this.
Note that if you change these configuration options, you need to make a completely clean build (with make clean; make) for the change to take effect.

How to Build Curses Program That Supports More Than 223 Columns of Mouse Input

I'm trying to get a curses program working with my terminal spanning my monitor. However, the x coordinate can't move past the 223rd column; instead it wraps around. In the source, this seems to be due to the coordinates being defined as 8-bit values, with the position values starting only after the first 32 values (i.e. x = raw_x - ' ').
Here's an example program from https://gist.github.com/sylt/93d3f7b77e7f3a881603 that demonstrates the issue when compiled with libncurses5. In it, if your cursor moves more than 223 columns to the right of the window, the x value will wrap around to 0 - ' ', i.e. -32.
#include <curses.h>
#include <stdio.h>
int main()
{
initscr();
cbreak();
noecho();
// Enables keypad mode. This makes (at least for me) mouse events getting
// reported as KEY_MOUSE, instead as of random letters.
keypad(stdscr, TRUE);
// Don't mask any mouse events
mousemask(ALL_MOUSE_EVENTS | REPORT_MOUSE_POSITION, NULL);
printf("\033[?1003h\n"); // Makes the terminal report mouse movement events
for (;;) {
int c = wgetch(stdscr);
// Exit the program on new line fed
if (c == '\n')
break;
char buffer[512];
size_t max_size = sizeof(buffer);
if (c == ERR) {
snprintf(buffer, max_size, "Nothing happened.");
}
else if (c == KEY_MOUSE) {
MEVENT event;
if (getmouse(&event) == OK) {
snprintf(buffer, max_size, "Mouse at row=%d, column=%d bstate=0x%08lx",
event.y, event.x, event.bstate);
}
else {
snprintf(buffer, max_size, "Got bad mouse event.");
}
}
else {
snprintf(buffer, max_size, "Pressed key %d (%s)", c, keyname(c));
}
move(0, 0);
insertln();
addstr(buffer);
clrtoeol();
move(0, 0);
}
printf("\033[?1003l\n"); // Disable mouse movement events, as l = low
endwin();
return 0;
}
for the curious, you can build this with gcc file.c -lcurses
How do I work around this? I can use vim in full-screen mode, and tmux mouse interactions also work. These both depend on ncurses, so it must be fixed somehow. I tried reading their source for hours and attempting samples of what I thought would work. I've also tried several printf() terminal modes, but none seem to enable this mode. How can I get my mouse event to hold more than 8 bits, and thus let the columns field hold values larger than 223?
That's a terminal-dependent feature (not an ncurses limitation as such). The original xterm protocol dating from the late 1980s encodes each ordinate in a byte, adding 32 to keep the value clear of the control characters. That gives a maximum of 255 - 32 = 223.
xterm introduced an experimental feature in 2010 to extend the range. There is an ncurses terminal description "xterm-1005" which uses that. Some criticized that, and xterm introduced a different feature in 2012. Again, there is a terminal description, "xterm-1006", using that feature.
The descriptions in ncurses were added in 2014. ncurses 6 was released in 2015, and still supports (by compile-time options) the ABI 5 for ncurses 5. If your "ncurses5" is at least as new as the changes in 2014, the library supports SGR 1006 without change.
The reason for not making one of those part of the default "xterm" is that portability across the various xterm imitators is poor (as is their documentation), and that would only increase bug reports. But if you happen to be using one of the terminals (such as xterm...) which support the SGR 1006 feature, that's supported in the ncurses library.

Strange output when using system("clear") command in C program

I have the following code
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <stdbool.h>
#define dimensions 5
int RandomNumInRange(int M, int N)
{
return M + rand() / (RAND_MAX / (N - M + 1) + 1);
}
char ** CreateWorld(int dim)
{
int i,j;
char **world = malloc(dim *sizeof(char*));
for(i=0;i<dim;i++)
world[i]=malloc(dim*sizeof(char));
for(i=0;i<dim;i++)
for(j=0;j<dim;j++)
world[i][j]=42;
return world;
}
void CreateCastle(char **world)
{
//assuming world is big enough
//to hold a match of 2
int randRow,randCol;
//1 to dimension -2 so we can spawn a 3x3 castle
randRow = RandomNumInRange(1,dimensions-2);
randCol = RandomNumInRange(1,dimensions-2);
printf("position: %d %d\n", randRow, randCol);
world[randRow][randCol]='c';
//fill the rest so castle is 3x3
//assuming there is enough space for that
world[randRow-1][randCol-1]=35;
world[randRow-1][randCol]=35;
world[randRow-1][randCol+1]=35;
world[randRow][randCol-1]=35;
world[randRow][randCol+1]=35;
world[randRow+1][randCol-1]=35;
world[randRow+1][randCol]=35;
world[randRow+1][randCol+1]=35;
}
void DisplayWorld(char** world)
{
int i,j;
for(i=0;i<dimensions;i++)
{
for(j=0;j<dimensions;j++)
{
printf("%c",world[i][j]);
}
printf("\n");
}
}
int main(void){
system("clear");
int i,j;
srand (time(NULL));
char **world = CreateWorld(dimensions);
DisplayWorld(world);
CreateCastle(world);
printf("Castle Positions:\n");
DisplayWorld(world);
//free allocated memory
free(world);
//3 star strats
char ***world1 = malloc(3 *sizeof(char**));
for(i=0;i<3;i++)
world1[i]=malloc(3*sizeof(char*));
for(i=0;i<3;i++)
for(j=0;j<3;j++)
world1[i][j]="\u254B";
for(i=0;i<3;i++){
for(j=0;j<3;j++)
printf("%s",world1[i][j]);
puts("");
}
free(world1);
//end
return 0 ;
}
If I use the system("clear") command, I get a line consisting of "[3;J"
followed by an expected output. If I run the program again, I get the same gibberish, then many blank newlines, then the expected output. If I put the system("clear") command in comments then both the "[3;J" and the blank newlines don't show and the output is expected.
Edit: it seems the error is not in the code, but rather in the way the terminal on my system is (not) set. Thank you all for your input, I definitely have a lot of interesting stuff to read and learn now.
The codes sent by your clear command don't seem to be compatible with the Gnome terminal emulator, which I believe is what you are using.
The normal control codes to clear a console are CSI H CSI J. (CSI is the Control Sequence Introducer: an escape character \033 followed by a [). CSI H sends the cursor to the home position, and CSI J clears from the cursor position to the end of the screen. You could also use CSI 2 J which clears the entire screen.
On Linux consoles and some terminal emulators, you can use CSI 3 J to clear both the entire screen and the scrollback. I would consider it unfriendly to do this (and the clear command installed on my system doesn't.)
CSI sequences can typically contain semicolons to separate numeric arguments. However, the J command doesn't accept more than one numeric argument and the semicolon seems to cause Gnome terminal to fail to recognize the control sequence. In any event, I don't believe Gnome terminal supports CSI 3 J.
The clear command normally uses the terminfo database to find the correct control sequences for the terminal. It identifies the terminal by using the value of the TERM environment variable, which suggests that you have the wrong value for that variable. Try setting export TERM=xterm and see if you get different results. If that works, you'll have to figure out where Linux Mint configures environment variables and fix it.
On the whole, you shouldn't need to use system("clear") to clear your screen; it's entirely too much overhead for such a simple task. You would be better off using tputs from the ncurses package. However, that also uses the terminfo database, so you will have to fix your TERM setting in any case.

How can I format currency with commas in C?

I'm looking to format a Long Float as currency in C. I would like to place a dollar sign at the beginning, commas iterating every third digit before decimal, and a dot immediately before decimal. So far, I have been printing numbers like so:
printf("You are owed $%.2Lf!\n", money);
which returns something like
You are owed $123456789.00!
Numbers should look like this
$123,456,789.00
$1,234.56
$123.45
Any answers need not be in actual code. You don't have to spoon feed. If there are C-related specifics which would be of help, please mention. Else pseudo-code is fine.
Your printf might already be able to do that by itself with the ' flag. You probably need to set your locale, though. Here's an example from my machine:
#include <stdio.h>
#include <locale.h>
int main(void)
{
setlocale(LC_NUMERIC, "");
printf("$%'.2Lf\n", 123456789.00L);
printf("$%'.2Lf\n", 1234.56L);
printf("$%'.2Lf\n", 123.45L);
return 0;
}
And running it:
> make example
clang -Wall -Wextra -Werror example.c -o example
> ./example
$123,456,789.00
$1,234.56
$123.45
This program works the way you want it to both on my Mac (10.6.8) and on a Linux machine (Ubuntu 10.10) I just tried.
I know this is a way-old post, but I disappeared down the man rabbit hole today, so I thought I'd document my travels:
There's a function called strfmon() that you can include with monetary.h that will do this, and do it according to local or international standards.
Note that it works like printf(), and will take as many double arguments as there are % formats specified in the string.
There's a lot more to it than what I have here, and I found this page to be the most helpful: https://www.gnu.org/software/libc/manual/html_node/Formatting-Numbers.html
#include <monetary.h>
#include <locale.h>
#include <stdlib.h>
#include <stdio.h>
int main(){
// need to call setlocale(); "" sets the locale to the system locale
setlocale(LC_ALL, "");
double money_amt = 1234.5678;
int buf_len = 16;
char simple_local[buf_len];
char international[buf_len];
char parenthesis_for_neg[buf_len];
char specified_width[buf_len];
char fill_6_stars[buf_len];
char fill_9_stars[buf_len];
char suppress_thousands[buf_len];
strfmon(simple_local, buf_len-1, "%n", money_amt);
strfmon(international, buf_len-1, "%i", money_amt);
strfmon(parenthesis_for_neg, buf_len-1, "%(n", money_amt);
strfmon(specified_width, buf_len-1, "%#6n", money_amt);
strfmon(fill_6_stars, buf_len-1, "%=*#6n", money_amt);
strfmon(fill_9_stars, buf_len-1, "%=*#8n", money_amt);
strfmon(suppress_thousands, buf_len-1, "%^=*#8n", money_amt);
printf( "===================== Output ===================\n"\
"Simple, local: %s\n"\
"International: %s\n"\
"parenthesis for negatives: %s\n"\
"fixed width (6 digits): %s\n"\
"fill character '*': %s\n"\
"-- note fill characters don't\n"\
"-- count where the thousdands\n"\
"-- separator would go:\n"\
"filling with 9 characters: %s\n"\
"Suppress thousands separators: %s\n"\
"================================================\n",
simple_local, international, parenthesis_for_neg,
specified_width, fill_6_stars, fill_9_stars,
suppress_thousands);
/** free(money_string); */
return 0;
}
===================== Output ===================
Simple, local: $1,234.57
International: USD1,234.57
parenthesis for negatives: $1,234.57
fixed width (6 digits): $ 1,234.57
fill character '*': $**1,234.57
-- note fill characters don't
-- count where the thousdands
-- separator would go:
filling with 9 characters: $*****1,234.57
Suppress thousands separators: $****1234.57
================================================
I don't think there's a standard C function to do that, but you could just write your own. Say float price = 23234.45. First print (int)price with commas, then print a decimal point; then for the decimal part, do printf("%02d", (int)(price*100)%100); (the 02 keeps the leading zero for amounts such as .05).
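#include <locale.h>
#include <math.h>
#include <monetary.h>
#include <stdio.h>
int main(void) {
int anio, base = 1e4; /* anio: year; base: initial deposit */
double cantidad, rata = 0.5; /* cantidad: amount; rata: interest rate */
int din_buf = 16;
char dinero[din_buf]; /* dinero: the formatted money string */
printf("%4s%23s\n", "Year", "Amount to deposit");
setlocale(LC_ALL, "en_US");
for ( anio = 1; anio < 11; anio++) {
cantidad = base * pow(rata + 1, anio);
strfmon(dinero, din_buf, "%#6n", cantidad);
printf("%3d\t%s\n", anio, dinero);
}
return 0;
}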
Windows users (with MSVC)
You cannot use:
the POSIX printf() formatting extras in Cal Norum’s answer.
the GNU strfmon() function in rreagan3’s and Edgar Fernando Dagar’s answers.
You are kind of stuck using the Win32 API:
GetCurrencyFormatEx() for locale-dependent currency formatting
(GetNumberFormatEx() for locale-dependent general-purpose number formatting)
Example:
#include <stdio.h>
#include <wchar.h>
#include <windows.h>
int main(void)
{
double money = 1234567.89;
wchar_t s[20], money_s[20];
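swprintf( s, sizeof s / sizeof s[0], L"%.2f", money ); /* size is counted in wchar_t, not bytes */
GetCurrencyFormatEx( L"en-US", 0, s, NULL, money_s, (int)(sizeof money_s / sizeof money_s[0]) ); /* Windows locale names use a hyphen */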
printf( "You are owed %S!\n", money_s );
}
You are owed $1,234,567.89!
As always, watch your rounding errors with swprintf() if you are counting currency with more precision than 100ths. (You may not want to round up if you owe money.)
