What's the difference between putch() and putchar()? - c

Okay so, I'm pretty new to C.
I've been trying to figure out what exactly is the difference between putch() and putchar()?
I tried googling my answers but all I got was the same copy-pasted-like message that said:
putchar(): This function is used to print one character on the screen, and this may be any character from C character set (i.e it may be printable or non printable characters).
putch(): The putch() function is used to display all alphanumeric characters through the standard output device like monitor. this function display single character at a time.
As English isn't my first language I'm kinda lost. Are there non printable characters in C? If so, what are they? And why can't putch produce the same results?
I've tried googling the C character set and all of the alphanumeric characters there are, but as much as my testing went, there wasn't really anything that one function could print and the other couldn't.
Anyways, I'm kind of lost.
Could anyone help me out? thanks!
TLDR;
what can putchar() do that putch() can't? (or the opposite or something idk)
dunno, hoped there would be a visible difference between the two but can't seem to find it.

putchar() is a standard function, possibly implemented as a macro, defined in <stdio.h> to output a single byte to the standard output stream (stdout).
putch() is a non standard function available on some legacy systems, originally implemented on MS/DOS as a way to output characters to the screen directly, bypassing the standard streams buffering scheme. This function is mostly obsolete, don't use it.
Output via stdout to a terminal is line buffered by default on most systems, so as long as your output ends with a newline, it will appear on the screen without further delay. If you need output to be flushed in the absence of a newline, use fflush(stdout) to force the stream buffered contents to be written to the terminal.

putch replaces the newline character (\ n)
putchar is a function in the C programming language that writes a single character to the standard output

Related

Why can't we print ASCII values from 0 to 31?

#include<stdio.h>
int main()
{
for(int i=0;i<=31;i++)
printf("%c",i);
}
when we try to run this code then nothing prints
what is the reason for it ?
C is printing them, but perhaps your terminal is not displaying them. This distinction is important because the terminal is responsible for interpreting the output of your program, printing letters, moving the cursor around, changing colors and such.
By historical convention the first 32 characters of the ASCII table are considered "control characters", some of which are printable, some like backspace which move the cursor, others like BEL which can make your terminal beep.
Different terminals may display these differently, or not at all.
It's worth noting that ASCII pre-dates modern "glass" terminals and that these codes were used to move the print-head around on the page. Early machines used teletypes to communicate with them and a line-feed would crank down the paper one line, a carriage return move the cursor back to the start of the line, much like the physical carriage return on a typewriter which would move the "carriage" back to the first column.
These were pretty elaborate elecromechanical contraptions that didn't have any modern circuitry in them, yet they could still process ASCII data, at least for those using ASCII, as there are other character sets like EBCDIC that co-existed with ASCII.
As these characters were never intended to be printed, so they don't have a standard visual representation in ASCII.
With "extended ASCII", as used in DOS, there are symbols defined for them because it seemed like a waste otherwise. These don't have control-code meanings, typically you write them directly to the console character buffer in order to see them.
You can, it's just that most of them are non-printable control characters that most shells ignore. If you pipe stdout to a file, the file will contain those characters, it's just the shell that doesn't know what to do with them. Some of them are handled by shells (e.g. the line feed and backspace characters) but others are just nonsensical (e.g. end of transmission, data link escape) and get ignored, or replaced with a different character for display (often a space or a question mark or the like).

Why does this scanf() conversion actually work?

Ah, the age old tale of a programmer incrementally writing some code that they aren't expecting to do anything more than expected, but the code unexpectedly does everything, and correctly, too.
I'm working on some C programming practice problems, and one was to redirect stdin to a text file that had some lines of code in it, then print it to the console with scanf() and printf(). I was having trouble getting the newline characters to print as well (since scanf typically eats up whitespace characters) and had typed up a jumbled mess of code involving multiple conditionals and flags when I decided to start over and ended up typing this:
(where c is a character buffer large enough to hold the entirety of the text file's contents)
scanf("%[a-zA-Z -[\n]]", c);
printf("%s", c);
And, voila, this worked perfectly. I tried to figure out why by creating variations on the character class (between the outside brackets), such as:
[\w\W -[\n]]
[\w\d -[\n]]
[. -[\n]]
[.* -[\n]]
[^\n]
but none of those worked. They all ended up reading either just one character or producing a jumbled mess of random characters. '[^\n]' doesn't work because the text file contains newline characters, so it only prints out a single line.
Since I still haven't figured it out, I'm hoping someone out there would know the answer to these two questions:
Why does "[a-zA-Z -[\nn]]" work as expected?
The text file contains letters, numbers, and symbols (':', '-', '>', maybe some others); if 'a-z' is supposed to mean "all characters from unicode 'a' to unicode 'z'", how does 'a-zA-Z' also include numbers?
It seems like the syntax for what you can enter inside the brackets is a lot like regex (which I'm familiar with from Python), but not exactly. I've read up on what can be used from trying to figure out this problem, but I haven't been able to find any info comparing whatever this syntax is to regex. So: how are they similar and different?
I know this probably isn't a good usage for scanf, but since it comes from a practice problem, real world convention has to be temporarily ignored for this usage.
Thanks!
You are picking up numbers because you have " -[" in your character set. This means all characters from space (32) to open-bracket (91), which includes numbers in ASCII (48-57).
Your other examples include this as well, but they are missing the "a-zA-Z", which lets you pick up the lower-case letters (97-122). Sequences like '\w' are treated as unknown escape sequences in the string itself, so \w just becomes a single w. . and * are taken literally. They don't have a special meaning like in a regular expression.
If you include - inside the [ (other than at the beginning or end) then the behaviour is implementation-defined.
This means that your compiler documentation must describe the behaviour, so you should consult that documentation to see what the defined behaviour is, which would explain why some of your code worked and some didn't.
If you want to write portable code then you can't use - as anything other than matching a hyphen.

Confused with interesting printf() statement

By reading this code, I stumbled upon the following printf() statement:
// reset, hide cursor and clear screen
printf("\e[0m\e[?25l\e[2J");
I must admit that I am not a fully qualified C hacker and do not fully understand this. I tweaked around, removing the arguments, and I understand what it does (well, the comment actually says it all), but I have no idea how it's done. Also, this is something kind of hard to google for.
How does this printf() call work?
This doesn't really have anything to do with printf. The C11 standard lists escape sequences in §5.2.2, and the list consists of \a, \b, \f, \n, \r, \t and \v. As an extension, GCC considers \e to be an escape sequence which stands for the ASCII character Esc (\E may work as well, or your compiler may support neither of them. Consult the documentation for your compiler). What follows are non-portable control sequences. They are not guaranteed to work the same in all terminals, or even work at all. The best way to know is to consult the documentation for your system.
§6.4.4.4 also describes octal escape sequences. For example, \033, where 033 is 27 in decimal, and therefore the escape character in ASCII. Similarly, you can use \x1b, which is a hexadecimal escape sequence specifying the same character.
If we inspect the output of the program with od -c, it shows 033.
(✿´‿`) ~/test> ./a.out | od -c
0000000 033 [ 0 m 033 [ ? 2 5 l 033 [ 2 J
0000016
The ANSI escape sequences are interpreted by terminal emulators. C will convert the octal/hexadecimal escape sequences to the ASCII Esc character. Your compiler, as an extension, might also convert \e or \E. As requested, a brief explanation of what the control sequences are doing:
[0m: resets all the SGR attributes
[?25l: hides the cursor
[2J: from Wikipedia:
Clears part of the screen. If n is 0 (or missing), clear from cursor
to end of screen. If n is 1, clear from cursor to beginning of the
screen. If n is 2, clear entire screen ...
The printf() call is simply outputting a specific series of byte values. The "magic" is that those values are special in the terminal.
A special series of bytes starting with the ASCII "escape" character is called an "escape sequence". These were invented for serial data terminals, where the only means of communication with the terminal was by sending byte values through the serial connection. Ordinary characters are simply displayed on the terminal, but it was desirable to have a way to move the cursor, clear the screen, etc. and most terminals used escape sequences for this.
http://en.wikipedia.org/wiki/Escape_sequence
There was one particularly popular terminal called the "VT100", and most terminal emulators today operate using VT100 escape sequences.
Even today, escape sequences are useful. You can write a simple C program that will work on the terminal emulators in Linux, Mac, Windows, mobile devices, basically everywhere. When you need to do something simple like clear the screen, just outputting the proper escape sequence is the easiest way.

Impossible to put stdout in wide char mode

On my system, a pretty normal Ubuntu 13.10, the french accented characters "éèàçù..." are always handled correctly by whatever tools I use, despite LC_ environment variables being set to en_US.UTF-8.
In particular command line utilities like grep, cat, ... always read and print these characters without a hitch.
Despite these remarks, such a small program as
int main() {
printf("%c", getchar());
return 0;
}
fails when the user enters "é".
From the man pages, and a lot of googling, there is no standard way to close stdout, then reopening it. From man fwide(), if stdout is in byte mode, I can't pass it to wide character mode, short of closing it and reopening it... therefore I can't use getwchar() and wprintf().
I can't believe that every single utility like cat, grep, etc... reimplements a way to manage wide characters, yet from my research, I see no other way.
Is it my system that has a problem? I can't see how since every utility works flawlessly.
What am I missing, please?
When a C program starts, stdout, stdin and stderr are neither byte nor wide-character oriented. fwide(stdin, 0) should return 0 at this point.
If you expand your minimal program to:
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
int main()
{
setlocale(LC_ALL, "");
printf("%lc\n", getwchar());
return 0;
}
Then it should work as you expect. (There is no need to explicitly set the orientation of stdin here - since the first operation on it is a wide-character operation, it will have wide-character orientation).
You do need to use getwchar() instead of getchar() if you want to read a wide character with it, though.
UTF-8 character are taken as byte code not character and non ascii character are more then one byte.
Check this Question
for more info
The utilities you mention are generally line-oriented. If you were to try to read a whole line with e.g. fgets() rather than a single character, I think it'll work for you, too.
When you start reading single characters (which may be just bytes, and often are), you are of course very much susceptible to encoding issues.
Reading full lines will work just fine, as long as the line-termiation encoding is not mis-understood (and for UTF-8 it won't be).

Carriage return required when printing to the console in Windows?

It seems like just putting a linefeed is good enough, but I know it is supposed to be carriage return + line feed. Does anything horrible happen if you don't put the carriage return and only use line feeds?
This is in ANSI C and not going to be redirected to a file or anything else. Just a normal console app.
The Windows console follows the same line ending convention that is assumed for files, or for that matter for actual, physical terminals. It needs to see both CR and LF to properly move to the next line.
That said, there is a lot of software infrastructure between an ANSI C program and that console. In particular, any standard C library I/O function is going to try to do the right thing, assuming you've allowed it the chance. This is why fopen()'s t and b modifiers for the mode parameter were defined.
With t (the default for most streams, and in particular for stdin and stdout) then any \n printed is converted to a CRLF sequence, and the reverse happens for reads. To turn off that behavior, use the b modifier.
Incidentally, the terminals traditionally hooked to *nix boxes including the DEC VT100 emulated by XTerm also needs both CR and LF. However, in the *nix world, the conversion from a newline character to a CRLF sequence is handled in the tty device driver so most programs don't need to know about it, and the t and b modifiers are both ignored. On those platforms, if you need to send and receive characters on a tty without that modification, you need to look up stty(1) or the system calls it depends on.
If your otherwise ANSI C program is avoiding C library I/O to the console (perhaps because you need access to the console's character color and other attributes) then whether you need to send CR or not will depend on which Win32 API calls you are using to send the characters.
If you're in a *nix environment \n (Linefeed) is probably ok. If you're in Windows and aren't redirecting (now) a linefeed is also ok, but if someone at somepoint redirects, :-(
If you're doing Windows though, there could be issues if the output is redirected to a text file and then another process tries to consume the data.
The console knows what to show, but consumers might not be happy...
If you are using C# You might try the Environment.NewLine "constant".
http://msdn.microsoft.com/en-us/library/system.environment.newline.aspx
If you're really in vanilla c, you're stuck with \r\n. :-)
It depends on what you're using them for. Some programs will not display newlines properly if you don't put both \r and \n.
If you try to only write \n some programs that consume your text file (or output) may display your text as a single line instead of multiple lines.
There are also some file formats and protocols that will completely be invalid without using both \r and \n.
I haven't tried it in so long that I'm not sure I remember what happens... but doesn't a linefeed by itself move down a line without returning to the left column?
Depending on your compiler, the standard output might be opened in text mode, in which case a single linefeed will be translated to \r\n before being written out.
Edit: I just tried a quick test, and in XP a file without returns displays normally. I still don't know if any compilers insert the returns for you.
In C, files (called "streams") come in two flavors - binary or text.
The meaning of this distinction is left implementation/platform dependent, but on Windows (with common implementations that I've seen) when writing to text streams '\n' is automatically translated to "\r\n", and when reading from text streams "\r\n" is automatically translated to '\n'.
The "console" is actually "standard output", which is a stream opened by default as a text stream. So, in practice on Windows, writing "Hello, world!\n" should be quite sufficient - and portable.

Resources