EOF block while loop in c - c

I'm trying to write a C program that simply writes information about drive C to a text file using the cmd command "wmic logicaldisk get", but I only need the numbers (for example, size 4294931).
So I redirect this output into a text file and read it back to extract only the digits. (I know it's quite strange.)
This is the full code:
#include <stdlib.h>
#include <stdio.h>
#include <windows.h>
#include <ctype.h>
int main()
{
system("wmic logicaldisk get size> test.txt");
unsigned char symb;
FILE *FileIn;
FileIn = fopen("test.txt","rt");
int getc(FILE *stream);
while (( symb = getc(FileIn)) !=EOF)
{
if( isdigit(symb))
{
printf("%C", symb);
}
}
printf("test"); //for debug
}
The code works but never exits the while loop: the numbers are printed correctly, but the statements after the loop aren't executed (so the printf of "test" never runs).

There are three things wrong in your code.
You redeclare a prototype for getc. You should not do that, since <stdio.h> already declares it and your declaration might not match the standard one.
The getc function returns an int. That is because EOF is an int constant with a negative value (typically -1), and ((unsigned char) -1) != -1: converting -1 to unsigned char yields 255, which never compares equal to -1. The variable you use with getc (or any similar function) must be an int.
The printf format specifier "%C" (with an upper-case C) is not a standard C format specifier. It is a Microsoft Visual C++ extension and is for wide characters of type wchar_t. Since your variable symb does not have the type matching the format specifier, you will have undefined behavior. For a narrow character like yours you should use lower-case c.

Related

Is there a missing code on the isupper function?

#include <stdio.h>
#include <cs50.h>
#include <ctype.h>
#include <math.h>
// Prototype
string Get_text(void);
char isupper(ch);
int main(void)
{
string text = Get_text();
printf("%s\n", text );
}
// Prompt the user for text
string Get_text(void)
{
string n;
do {
n = get_string("Text: ");
}
while (n >= 0);
// Letters, Words, & Sentences
char ch = 'A';
}
I encountered an error when I ran my code. It points to line 9, where I declared the isupper function to check whether letters are capital. I even added an extra parenthesis to isupper before the parenthesis on char, but there are still errors. P.S. I'm not done with the code yet; I'm reviewing how the isupper function works.
isupper was once a macro. Never declare it. #include <ctype.h> does the right thing.
If the offending declaration were for something other than isupper I would answer rather as follows:
char isupper(ch);
is bad syntax because each parameter should be written as (type argumentname) in parentheses. It should rather be as it appears in the man page (taking the liberty of correcting the return type as well):
int isupper(int ch);
but, as I said, don't actually do this, because the functions in ctype.h may also be defined as macros.
Anyway, you're coding in C (from the tag), so there's no stock string type. This is not a compilable fragment; thankfully the line you're asking about occurs early enough that we can tell what the problem is anyway.
The isupper identifier coincides with a <ctype.h> function/macro from the standard library.
Because <ctype.h> already declares a prototype for isupper (int isupper(int ch);) that doesn't match the one you wrote, the compiler gives you an error.
Simply give your own function a different name (all the more so if you plan to use the <ctype.h> routines), not isupper.

How to read and print a unicode file

I have a test input file input.txt with one line with the following contents:
кёльнский
I am using this code to attempt to read it in and print it out.
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
int main()
{
FILE *input;
wchar_t buf[1000];
setlocale(LC_CTYPE,"");
if ((input = fopen("input.txt","r")) == NULL)
return 1;
printf("Read and print\n");
while (fgetws(buf,1000,input)!=NULL)
wprintf(L"%s",buf);
fclose(input);
}
However when I run it I see "Read and print" and then nothing else.
I am compiling with gcc on Ubuntu.
What am I doing wrong?
It turns out that substituting the wprintf line with
printf("%ls",buf);
fixes the problem.
Why is this?
You're doing two things wrong:
Mixing normal (byte-oriented) and wide output functions to standard output. You need to stick to one or the other. From the C11 draft, section 7.21.2:
Each stream has an orientation. After a stream is associated with an external file, but before any operations are performed on it, the stream is without orientation. Once a wide character input/output function has been applied to a stream without orientation, the stream becomes a wide-oriented stream. Similarly, once a byte input/output function has been applied to a stream without orientation, the stream becomes a byte-oriented stream. ...
Byte input/output functions shall not be applied to a wide-oriented stream and wide character input/output functions shall not be applied to a byte-oriented stream.
Using the wrong printf format to print a wide string. %s is for a normal char string. %ls is for a wchar_t string. But for just printing a wide string to a wide stream, prefer fputws(). No point in using a printf function if you're not actually using its formatting capabilities or mixing literal text with variables or printing wide characters to a byte-oriented stream or something else fancy.
One way (of many alternatives) to fix the above problems, treating standard output as a wide-oriented stream:
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
int main(void)
{
FILE *input;
wchar_t buf[1000];
setlocale(LC_CTYPE,"");
if ((input = fopen("input.txt","r")) == NULL)
return 1;
fputws(L"Read and print\n", stdout);
while (fgetws(buf,1000,input)!=NULL)
fputws(buf, stdout);
fclose(input);
}
Another, using a byte-oriented standard output:
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
int main()
{
FILE *input;
wchar_t buf[1000];
setlocale(LC_CTYPE,"");
if ((input = fopen("input.txt","r")) == NULL)
return 1;
puts("Read and print");
while (fgetws(buf,1000,input)!=NULL)
printf("%ls", buf);
fclose(input);
}
fgetws does not read UTF-16 from the file: it decodes the file's multibyte encoding, as selected by the LC_CTYPE locale, into wchar_t. We normally handle UTF-8 with plain fgets into a char [] and print it with plain printf, and expect it to work. That's the point of UTF-8. If it doesn't display correctly, your terminal probably isn't UTF-8; this is easily checked by running cat input.txt.

Expected encoding of wcwidth() argument

I'm trying to find out what the expected encoding of wcwidth() argument is.
The man page says absolutely nothing about this, and I wasted hours trying to
find out what it is. Here's an example, in C:
#include <stdio.h>
#include <wchar.h>
void main()
{
wchar_t c = L'h';
printf("%d\n", wcwidth(c));
}
I want to know how should I encode this character literal so that this program
prints 2 instead of -1.
Here's a Rust example:
extern "C" {
fn wcwidth(c: libc::wchar_t) -> libc::c_int;
}
fn main() {
let c = 'h';
println!("{}", unsafe { wcwidth(c as libc::wchar_t) });
}
Similarly I want to convert this character constant to wchar_t (i32) so that
this program prints 2.
Thanks.
UPDATE: Sorry for my wording, I made this sound specific to C's long char literals. I want to encode character literals in any language as a 32-bit int so that when I pass it to wcwidth I get a right answer. So my question is not specific to C or C's long char literals.
UPDATE 2: I'd also be happy with another function like wcwidth that is better specified (and maybe even platform independent). E.g. one that takes UTF-8 encoded character and returns number of cols needed to render it in a monospace terminal.
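You need to define _XOPEN_SOURCE before including <wchar.h> (wcwidth is an X/Open function, so the header only declares it when that macro is set), and you also need to set the locale.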
Try this:
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
int main(void)
{
setlocale(LC_CTYPE, "");
wchar_t c = L'h';
printf("%d\n", wcwidth(c));
return 0;
}

How to fix locale?

Add the ru_RU.CP1251 locale (on Debian, uncomment ru_RU.CP1251 in /etc/locale.gen and run sudo locale-gen) and
compile the following program with gcc -fexec-charset=cp1251 test.c (input file is in UTF-8). Neither "lowercase" nor "uppercase" is printed. Only the letter 'я' is wrong.
Other letters are correctly identified as lowercase or uppercase.
#include <locale.h>
#include <ctype.h>
#include <stdio.h>
int main (void)
{
setlocale(LC_ALL, "ru_RU.CP1251");
char c = 'я';
int i;
char z;
for (i = 7; i >= 0; i--) {
z = 1 << i;
if ((z & c) == z) printf("1"); else printf("0");
}
printf("\n");
if (islower(c))
printf("lowercase\n");
if (isupper(c))
printf("uppercase\n");
return 0;
}
Why does neither islower() nor isupper() work on the letter я?
The answer is that the encoding of the lower-case version of that character in CP 1251 is decimal 255. Stored in a signed char, that byte becomes -1, which is the usual value of EOF, so islower() and isupper() in your implementation do not treat it as a letter.
You need to track down the source code for the runtime library to see what it does and why.
The solution is to write your own implementations, or wrap the ones you have. Personally, I never use these functions directly because of the many gotchas.
Igor, if your file is UTF-8 it makes no sense to use code page 1251, as it has nothing in common with the UTF-8 encoding. Just use the locale ru_RU.UTF-8 and you'll be able to display your file without any problem. Or, if you insist on using ru_RU.CP1251, you'll first need to convert your file from UTF-8 to CP1251 (you can use the iconv(1) utility for that):
iconv --from-code=utf-8 --to-code=cp1251 your_file.txt > your_converted_file.txt
On the other hand, -fexec-charset=cp1251 only affects the characters stored in the executable; you have not specified the input charset used for string literals in your source code (GCC's -finput-charset option). The compiler is probably determining that from the environment (which you have set in your LANG or LC_CTYPE environment variables).
Only once you control exactly which locale is used at each stage will you get coherent results.
The main reason an effort is being made to switch all countries to a common charset (Unicode, encoded as UTF-8) is precisely to avoid dealing with all these locale settings at each stage.
If you always deal with documents encoded in CP1251, you'll need to use that encoding for everything on your computer, and when you receive a document encoded in UTF-8 you'll have to convert it to see it correctly.
My main recommendation is to switch to UTF-8, as it is an encoding that supports every country's character set, but that decision is yours alone.
NOTE
On debian linux:
$ sed 's/^/ /' pru-$$.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <locale.h>
#define P(f,v) printf(#f"(%d /* '%c' */) => %d\n", (v), (v), f(v))
#define Q(v) do{P(isupper,(v));P(islower,(v));}while(0)
int main()
{
setlocale(LC_ALL, "");
Q(0xff);
}
Compiled with
$ make pru-$$
cc pru-1342.c -o pru-1342
execution with ru_RU.CP1251 locale
$ locale | sed 's/^/ /'
LANG=ru_RU.CP1251
LANGUAGE=
LC_CTYPE="ru_RU.CP1251"
LC_NUMERIC="ru_RU.CP1251"
LC_TIME="ru_RU.CP1251"
LC_COLLATE="ru_RU.CP1251"
LC_MONETARY="ru_RU.CP1251"
LC_MESSAGES="ru_RU.CP1251"
LC_PAPER="ru_RU.CP1251"
LC_NAME="ru_RU.CP1251"
LC_ADDRESS="ru_RU.CP1251"
LC_TELEPHONE="ru_RU.CP1251"
LC_MEASUREMENT="ru_RU.CP1251"
LC_IDENTIFICATION="ru_RU.CP1251"
LC_ALL=
$ pru-$$
isupper(255 /* 'я' */) => 0
islower(255 /* 'я' */) => 512
So, glibc is not faulty, the fault is in your code.
Jonathan Leffler's first comment on the question is correct. The isxxx() (and iswxxx()) functions are required to handle an EOF (WEOF) argument (probably to be fool-proof).
This is why int was chosen as the argument type. When we pass an argument of type char or a character literal, it is promoted to int (preserving the sign). And because the char type and character literals are signed by default in gcc (on x86, for example), 0xFF becomes -1, which by unhappy coincidence is the usual value of EOF.
Therefore always cast explicitly when passing values of type char (and character literals with code 0xFF) to functions that take an int argument (don't count on char being unsigned, because that is implementation-defined). The cast may be either (unsigned char) or (uint8_t), which is less to type (you must include stdint.h).
See also https://sourceware.org/bugzilla/show_bug.cgi?id=20792 and Why passing char as parameter to islower() does not work correctly?

ASCII characters in C

I'm trying to save a character from the cyrillic alphabet in a char.
When I take a string from the console it saves it in the char array successfully but just initializing it doesn't seem to work. I get "programName.exe has stopped working" when trying to run it.
#include <stdio.h>
#include <conio.h>
#include <string.h>
#include <Windows.h>
#include <stdlib.h>
void test(){
char test = 'Я';
printf("%s",test);
}
void main(){
SetConsoleOutputCP(1251);
SetConsoleCP(1251);
test();
}
fgets ( books[booksCount].bookTitle, 80, stdin ); // this seems to be working ok with ascii.
I tried using wchar_t but I get the same results.
If you're using Russian Windows, which uses the Windows-1251 code page by default, you can print the character encoded as a single byte with plain printf, but you need to make sure that the source file is saved in the same cp1251 charset. Don't save it as Unicode.
But the preferred way is to use wprintf with a wide-character string:
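void test() {
// Requires <wchar.h> (and <stdio.h>) for wprintf
wchar_t test_char = L'Я';
const wchar_t *test_string = L"АБВГ"; // or LPCWSTR test_string
// %lc and %ls are the standard wide specifiers; MSVC accepts them as well
wprintf(L"%lc\n%ls", test_char, test_string);
}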
This time you need to save the file as Unicode (UTF-8 or UTF-16).
UTF-8 may be better, but it's trickier on Windows. Moreover, if you use UTF-8 you cannot store Я in a single char because it needs more than 1 byte. You must use a string (char*) instead.
Note that main must return int, not void, and the fgets call above must be made from inside some function.
This could be solved by doing:
void test()
{
char test = 'Я';
putchar(test);
}
But there is a catch: since 'Я' is not an ASCII character, you might need to set an appropriate locale first.
Moreover, only ASCII characters 32 - 126 are guaranteed to be printable and to display as the same symbol on all systems.
