This question already has answers here:
Showing characters in extended ASCII code (Ubuntu)
(3 answers)
Closed 9 years ago.
I tried the following
printf ("%c", 236); //236 is the ASCII value for infinity
But I am just getting garbage output on the screen.
printf was working correctly for ASCII values less than 128. So I tried the following
printf ("%c", 236u); //unsigned int 236
Still I am just getting garbage only. So, what should I do to make printf display ASCII values from 128 to 255.
Like everyone else in the comments already mentioned, you would not be able to reliably print characters after 127 (and assuming it as ASCII) since ASCII is only defined upto 127. Also the output you see very much depends on the terminal settings (i.e. which locale it is configured to).
If you're fine using UTF-8 to print, you could give wprintf a try as shown below:
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main()
{
setlocale( LC_ALL, "en_US.UTF-8" );
wprintf (L"%lc\n", 8734);
return 0;
}
It would produce the following output:
∞
8734 (or 0x221E) is the equivalent of the UTF-8 UNICODE character for the symbol ∞.
Standard C does not have a symbol for infinite. That's for your implementation (eg. your compiler, your operating system, your terminal and your hardware) to define. Consider that C was designed with portability for systems that use non-ASCII character sets in mind (eg. EBCDIC).
Related
I can't understand why this code output is weird. I wrote this out of curiosity and now if I enter 55 it shows a leaf. And also many other things depending on number. I searched it in google but didn't find any possible explanation.
#include <stdio.h>
int main(){
char input='*';
int x;
scanf("%d",&x);
printf("%c",input*x);
return 0;
}
Characters are encoded as integers, usually 8-bit by default in C. Some are defined by ASCII and others depend on your OS, screen font, etc.
The * character has code 42. If you enter 55, your code computes 42*55=2310 and uses the low 8 bits of this value, which is 6, as the character to print. Character 6 is ACK which is not defined as a printable character by ASCII, but it sounds like your system is using something like the legacy IBM code page 437, in which character 6 displays as the spade symbol ♠.
Multiplying a character code by an integer is not a very useful thing to do. I'm not sure what you were expecting to accomplish with this program. If you thought it would print 55 copies of the * character, like Perl's x operator, well it doesn't. C has no built-in way to do that; you would just write a loop that prints * once per iteration, and iterates 55 times.
I have to save in a char[] the letter ñ and I'm not being able to do it. I tried doing this:
char example[1];
example[0] = 'ñ';
When compiling I get this:
$ gcc example.c
error: character too large for enclosing
character literal type
example[0] = 'ñ';
Does anyone know how to do this?
If you're using High Sierra, you are presumably using a Mac running macOS 10.13.3 (High Sierra), the same as me.
This comes down to code sets and locales — and can get tricky. Mac terminals use UTF-8 by default and ñ is Unicode character U+00F1, which requires two bytes, 0xC3 and 0xB1, to represent it in UTF-8. And the compiler is letting you know that one byte isn't big enough to hold two bytes of data. (In the single-byte code sets such as ISO 8859-1 or 8859-15, ñ has character code 0xF1 — 0xF1 and U+00F1 are similar, and this is not a coincidence; Unicode code points U+0000 to U+00FF are the same as in ISO 8859-1. ISO 8859-15 is a more modern variant of 8859-1, with the Euro symbol € and 7 other variations from 8859-1.)
Another option is to change the character set that your terminal works with; you need to adapt your code to suit the code set that the terminal uses.
You can work around this by using wchar_t:
#include <wchar.h>
void function(void);
void function(void)
{
wchar_t example[1];
example[0] = L'ñ';
putwchar(example[0]);
putwchar(L'\n');
}
#include <locale.h>
int main(void)
{
setlocale(LC_ALL, "");
function();
return 0;
}
This compiles; if you omit the call to setlocale(LC_ALL, "");, it doesn't work as I want (it generates just octal byte \361 (aka 0xF1) and a newline, which generates a ? on the terminal), whereas with setlocale(), it generates two bytes (\303\261 in octal, aka 0xC3 and 0xB1) and you see ñ on the console output.
You can use "extended ascii". This chart shows that 'ñ' can be represented in extended ascii as 164.
example[0] = (char)164;
You can print this character just like any other character
putchar(example[0]);
As noted in the comments above, this will depend on your environment. It might work on your machine but not another one.
The better answer is to use unicode, for example:
wchar_t example = '\u00F1';
This really depends on which character set / locale you will be using. If you want to hardcode this as a latin1 character, this example program does that:
#include <cstdio>
int main() {
char example[2] = {'\xF1'};
printf("%s", example);
return 0;
}
This, however, results in this output on my system that uses UTF-8:
$ ./a.out
�
So if you want to use non-ascii strings, I'd recommend not representing them as char arrays directly. If you really need to use char directly, the UTF-8 sequence for ñ is two chars wide, and can be written as such (again with a terminating '\0' for good measure):
char s[3] = {"\xC3\xB1"};
I wanted to know how to display special characters with printf().
I'm doing a string conversion program from Text to Code128 (barcode encoding).
For this type of encoding I need to display characters such as Î, Ç, È, Ì.
Example:
string to convert: EPE196000100000002260500004N
expected result: ÌEPEÇ3\ *R 6\ R $ÈNZÎ
printf result typed: ╠EPEÇ3\ *R 6\ R $ÇNZ╬
printf result image: []
EDIT: I only can use C in this program no C++ at all. All the awnsers I've find so far are in C++ not C so I'm asking how to do it with C ^^
I've find it,
#include <locale.h>
int main()
{
setlocale(LC_ALL,"");
printf("%c%c%c%c\n", 'Î', 'Ç', ' È','Ì');
}
Thank you all for your awnsers it helps me a lot!!! :)
If your console is in UTF-8 it is possible just to print UTF-8 hex representation for your symbols. See similar answer for C++ Special Characters on Console
The following line prints heart:
printf("%c%c%c\n", '\xE2', '\x99', '\xA5');
However, since you print '\xCC', '\xC8', '\xCE','\xC7' and you have 4 different symbols it means that the console encoding is some kind of ASCII extension. Probably you have such encoding http://asciiset.com/. In that case you need characters '\x8c', 'x8d'. Unfortunately there are no capital version of those symbols in that encoding. So, you need some other encoding for your console, for example Latin-1, ISO/IEC 8859-1.
For Windows console:
UINT oldcp = GetConsoleOutputCP(); // save current console encoding
SetConsoleOutputCP(1252);
// print in cp1252 (Latin 1) encoding: each byte => one symbol
printf("%c%c%c%c\n", '\xCC', '\xC8', '\xCE','\xC7');
SetConsoleOutputCP(CP_UTF8);
// 3 hex bytes in UTF-8 => one 'heart' symbol
printf("%c%c%c\n", '\xE2', '\x99', '\xA5');
SetConsoleOutputCP(oldcp);
The console font should support Unicode (for example 'Lucida Console'). It can be changed manually in the console properties, since the default font may be 'Raster Fonts'.
I've been tried to print Extended ASCII characters:
http://www.theasciicode.com.ar/
But all those symbols were printed as question-character on the white background ?.
I use the following cycle to print that symbols:
for (i = 0; i <= 30; i++)
printf("%c", 201);
Question: Is there any way to print those Extended ASCII characters or not? Or maybe there is special library for these characters?
OS Linux Ubuntu 13.04, Code::Blocks 12.11 IDE.
It's better to use unicode than extended ASCII, which is non-standard. A thread about printing unicode characters in C :
printing-utf-8-strings-with-printf-wide-vs-multibyte-string-literals
But indeed you need to copy paste unicode characters..
A better way to start:
#include <stdio.h>
int main() {
printf("\u2500\u2501\n");
}
See https://en.wikipedia.org/wiki/Box-drawing_character#Unicode for unicode characters for this extended ASCII style box art..
Edit:
I can only use stdio.h and stdlib.h
I would like to iterate through a char array filled with chars.
However chars like ä,ö take up twice the space and use two elements.
This is where my problem lies, I don't know how to access those special chars.
In my example the char "ä" would use hmm[0] and hmm[1].
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char* hmm = "äö";
printf("%c\n", hmm[0]); //i want to print "ä"
printf("%i\n", strlen(hmm));
return 0;
}
Thanks, i tried to run my attached code in Eclipse, there it works. I assume because it uses 64 bits and the "ä" has enough space to fit. strlen confirms that each "ä" is only counted as one element.
So i guess i could somehow tell it to allocate more space for each char (so "ä" can fit)?
#include <stdio.h>
#include <stdlib.h>
int main()
{
char* hmm = "äüö";
printf("%c\n", hmm[0]);
printf("%c\n", hmm[1]);
printf("%c\n", hmm[2]);
return 0;
}
A char always used one byte.
In your case you think that "ä" is one char: Wrong.
Open your .c source code with an hexadecimal viewer and you will see that ä is using 2 char because the file is encoded in UTF8
Now the question is do you want to use wide character ?
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <locale.h>
int main()
{
const wchar_t hmm[] = L"äö";
setlocale(LC_ALL, "");
wprintf(L"%ls\n", hmm);
wprintf(L"%lc\n", hmm[0]);
wprintf(L"%i\n", wcslen(hmm));
return 0;
}
Your data is in a multi-byte encoding. Therefore, you need to use multibyte character handling techniques to divvy up the string. For example:
#include <stdio.h>
#include <string.h>
#include <locale.h>
int main(void)
{
char* hmm = "äö";
int off = 0;
int len;
int max = strlen(hmm);
setlocale(LC_ALL, "");
printf("<<%s>>\n", hmm);
printf("%zi\n", strlen(hmm));
while (hmm[off] != '\0' && (len = mblen(&hmm[off], max - off)) > 0)
{
printf("<<%.*s>>\n", len, &hmm[off]);
off += len;
}
return 0;
}
On my Mac, it produced:
<<äö>>
4
<<ä>>
<<ö>>
The call to setlocale() was crucial; without that, the program runs in the "C" locale instead of my en_US.UTF-8 locale, and mblen() mishandled things:
<<äö>>
4
<<?>>
<<?>>
<<?>>
<<?>>
The questions marks appear because the bytes being printed are invalid single bytes as far as the UTF-8 terminal is concerned.
You can also use wide characters and wide-character printing, as shown in benjarobin's answer..
Sorry to drag this on. Though I think its important to highlight some issues. As I understand it OS-X has the ability to have the default OS code page to be UTF-8 so the answer is mostly in regards to Windows that under the hood uses UTF-16, and its default ACP code page is dependent on the specified OS region.
Firstly you can open Character Map, and find that
äö
Both reside in the code page 1252 (western), so this is not a MBCS issue. The only way it could be a MBCS issue is if you saved the file using MBCS (Shift-JIS,Big5,Korean,GBK) encoding.
The answer, of using
setlocale( LC_ALL, "" )
Does not give insight into the reason why, äö was rendered in the command prompt window incorrectly.
Command Prompt does use its own code pages, namely OEM code pages. Here is a reference to the following (OEM) code pages available with their character map's.
Going into command prompt and typing the following command (Chcp) Will reveal the current OEM code page that the command prompt is using.
Following Microsoft documentation by using setlocal(LC_ALL,"") it details the following behavior.
setlocale( LC_ALL, "" );
Sets the locale to the default, which is the user-default ANSI code page obtained from the operating system.
You can do this manually, by using chcp and passing your required code page, then run your application and it should output the text perfectly fine.
If it was a multie byte character set problem then there would be a whole list of other issues:
Under MBCS, characters are encoded in either one or two bytes. In two-byte characters, the first, or "lead-byte," signals that both it and the following byte are to be interpreted as one character. The first byte comes from a range of codes reserved for use as lead bytes. Which ranges of bytes can be lead bytes depends on the code page in use. For example, Japanese code page 932 uses the range 0x81 through 0x9F as lead bytes, but Korean code page 949 uses a different range.
Looking at the situation, and that the length was 4 instead of 2. I would say that the file format has been saved in UTF-8 (It could in fact been saved in UTF-16, though you would of run into problems sooner than later with the compiler). You're using characters that are not within the ASCII range of 0 to 127, UTF-8 is encoding the Unicode code point to two bytes. Your compiler is opening the file and assuming its your default OS code page or ANSI C. When parsing your string, it's interpreting the string as a ANSI C Strings 1 byte = 1 character.
To sove the issue, under windows convert the UTF-8 string to UTF-16 and print it with wprintf. Currently there is no native UTF-8 support for the Ascii/MBCS stdio functions.
For Mac OS-X, that has the default OS code page of UTF-8 then I would recommend following Jonathan Leffler solution to the problem because it is more elegant. Though if you port it to Windows later, you will find you will need to covert the string from UTF-8 to UTF-16 using the example bellow.
In either solution you will still need to change the command prompt code page to your operating system code page to print the characters above ASCII correctly.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <Windows.h>
#include <locale>
// File saved as UTF-8, with characters outside the ASCII range
int main()
{
// Set the OEM code page to be the default OS code page
setlocale(LC_ALL, "");
// äö reside outside of the ASCII range and in the Unicode code point Western Latin 1
// Thus, requires a lead byte per unicode code point when saving as UTF-8
char* hmm = "äö";
printf("UTF-8 file string using Windows 1252 code page read as:%s\n",hmm);
printf("Length:%d\n", strlen(hmm));
// Convert the UTF-8 String to a wide character
int nLen = MultiByteToWideChar(CP_UTF8, 0,hmm, -1, NULL, NULL);
LPWSTR lpszW = new WCHAR[nLen];
MultiByteToWideChar(CP_UTF8, 0, hmm, -1, lpszW, nLen);
// Print it
wprintf(L"wprintf wide character of UTF-8 string: %s\n", lpszW);
// Free the memory
delete[] lpszW;
int c = getchar();
return 0;
}
UTF-8 file string using Windows 1252 code page read as:äö
Length:4
wprintf wide character of UTF-8 string: äö
i would check your command prompt font/code page to make sure that it can display your os single byte encoding. note command prompt has its own code page that differs to your text editor.