How does getchar_unlocked() work? - c

My question is based on a CodeChef problem called Lucky Four.
This is my code:
int count_four() {
int count = 0;
char c = getchar_unlocked();
while (c < '0' || c > '9')
c = getchar_unlocked();
while (c >= '0' && c <= '9') {
if (c == '4')
++count;
c = getchar_unlocked();
}
return count;
}
int main() {
int i, tc;
scanf("%d", &tc);
for (i = 0; i < tc; ++i) {
printf("%d\n", count_four());
}
return 0;
}
Let's say I make a slight change to count_four():
int count_four() {
int count = 0;
char c = getchar_unlocked();
while (c >= '0' && c <= '9') {
if (c == '4')
++count;
c = getchar_unlocked();
}
while (c < '0' || c > '9') // I moved this `while` loop
c = getchar_unlocked();
return count;
}
This is my output after moving the while loop below the other one:
0
3
0
1
0
instead of:
4
0
1
1
0
The input used to test the program:
5
447474
228
6664
40
81
Why is this happening? How do getchar() and getchar_unlocked() work?

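getchar_unlocked is just a lower-level function that reads a byte from the stream without locking it. In a single-threaded program, it behaves exactly like getchar().
Your change to the count_four function changes its behavior completely.
The original function reads the standard input. It first skips non-digits (looping forever at end of file, because EOF is never a digit). It then reads the run of digits that follows, counting the occurrences of '4', and stops at the first non-digit. The count is returned.
Your version reads the run of digits first, counting occurrences of '4', then skips non-digits (with the same bug at EOF) and finally returns the count. The trouble is what each call leaves behind: after scanf("%d", &tc) the next character in the stream is the newline following 5, so the first call counts nothing, skips the newline, and stops only after it has consumed the first digit of 447474. The first call therefore returns 0, and each later call counts a number that has lost its first digit (47474 gives 3, 28 gives 0, 664 gives 1, 0 gives 0), which is exactly the output you see.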
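The end-of-file problem mentioned above can be avoided by keeping the result of the read in an int and testing it against EOF. The following is only a sketch: the name count_fours, the FILE * parameter and the -1 return value are assumptions made for illustration, and it uses the standard getc rather than getchar_unlocked so that it is portable.

```c
#include <stdio.h>

/* Counts the '4' digits in the next run of digits read from `in`.
   Returns -1 if end of file is reached before any digit is found.
   (Hypothetical helper, not the asker's original function.) */
int count_fours(FILE *in) {
    int c;          /* int, not char, so EOF stays distinguishable */
    int count = 0;

    /* Skip anything that is not a digit, but stop at EOF. */
    while ((c = getc(in)) != EOF && (c < '0' || c > '9'))
        ;
    if (c == EOF)
        return -1;

    /* Consume the run of digits, counting the fours. */
    while (c != EOF && c >= '0' && c <= '9') {
        if (c == '4')
            ++count;
        c = getc(in);
    }
    return count;
}
```

Calling count_fours(stdin) once per test case after the scanf reproduces the behavior of the original function, with the difference that a truncated input ends the skipping loop instead of spinning forever.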

Related

Parse char from int input in C

I am trying to display a matrix by taking input from a user. Here, the input is a lower triangular matrix and the user may enter the 'x' character which has to be replaced with INT_MAX.
The below program is not working correctly as the output is not matching the expected one.
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int read_int() {
char input[30] = {0};
int number;
for (int i = 0; i < sizeof(input) - 1; i++){
char c = (char)getc(stdin);
if (c == 'x' || c == 'X')
return INT_MAX;
if (c < '0' || '9' < c){
if (i == 0) continue;
input[i] = 0;
return atoi(input);
}
input[i] = c;
}
input[29] = 0;
return atoi(input);
}
int main() {
int N = read_int();
int matrix[N][N];
memset(matrix, 0, N * N * sizeof(int));
for(int i = 0; i < N; ++i){
for(int j = 0; j <= i; ++j){
int distance = read_int();
matrix[i][j] = distance;
matrix[j][i] = distance;
}
}
printf("\n");
for(int i = 0; i < N; ++i){
for(int j = 0; j < N; ++j){
printf("%d\t", matrix[i][j]);
}
printf("\n");
}
printf("\n");
return 0;
}
For input:
3
x 2
x x 2
The above program prints:
3 2147483647 2147483647
2147483647 32 2147483647
2147483647 2147483647 32
which is not what I expected.
It should be:
3 2147483647 2147483647
2147483647 2 2147483647
2147483647 2147483647 2
Update: The answers below don't work for all cases (except the accepted one).
One such case is:
5
10
50 20
30 5 30
100 20 50 40
10 x x 10 50
With this input, the program just keeps on waiting for input.
Your logic for skipping whitespace is broken because when you eventually assign a character after skipping position 0, you will always be writing a "wanted" character at position i. That means anything already in position 0 remains.
In your case, it's undefined behavior because input[0] was originally filled with 3 on the first input where no whitespace was skipped, but in subsequent calls to your function it is uninitialized. You then go on to write a 2 into input[1] and thus by pure chance (your array from previous calls has not been overwritten on the stack and the stack is the same), you end up with the string "32" sitting in input.
What you need to do is have some way to count the actual required characters so that you write them into the array at the correct position. One naive approach would be:
int pos = 0;
for(...) {
// other logic...
// Actually write a character we want
input[pos++] = c;
}
Another way that is more like how integer input works is:
int c;
int pos = 0;
while(pos < sizeof(input) - 1 && (c = getc(stdin)) != EOF)
{
if (c == 'x' || c == 'X')
return INT_MAX;
else if (pos == 0 && isspace(c))
continue;
else if (!isdigit(c) && !(pos == 0 && (c == '-' || c == '+')))
break;
input[pos++] = c;
}
input[pos] = '\0';
return atoi(input);
I think the problem is this part of the loop:
if (c < '0' || '9' < c){
if (i == 0) continue;
input[i] = 0;
return atoi(input);
}
If you have entered 3, Enter, x 2 as your input, then the 3 gets read successfully, and the x gets returned as INT_MAX as intended, but in the next call to read_int, the next character in the input sequence is a space (i.e. c == ' '), and therefore it branches here. Since i == 0 at this point, the loop continues, which means i is incremented to 1, but this also means that input[0] is never changed. Most likely, input[0] contains the same value from the previous call to read_int ('3'), but in any case, it's undefined behaviour.
As a quick alternative, you can simply change this condition to:
if (c != ' ' && (c < '0' || '9' < c)){
This will mean input[0] will be set to a space character, which atoi will ignore.
An alternative solution could be to read in an entire line at once and tokenise the line.
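The line-based alternative could look roughly like the sketch below. The helper name parse_row and its parameters are hypothetical; it assumes the caller has already read one line (for example with fgets) into a writable buffer, and it treats any token starting with x or X as INT_MAX.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place on whitespace and stores up to
   `max` values in `out`. A token starting with 'x' or 'X' becomes INT_MAX;
   every other token is converted with strtol.
   Returns the number of values stored. */
int parse_row(char *line, int *out, int max) {
    int n = 0;
    for (char *tok = strtok(line, " \t\r\n"); tok != NULL && n < max;
         tok = strtok(NULL, " \t\r\n")) {
        if (tok[0] == 'x' || tok[0] == 'X')
            out[n++] = INT_MAX;
        else
            out[n++] = (int)strtol(tok, NULL, 10);
    }
    return n;
}
```

Because whole tokens are converted at once, there is no per-character buffer position to keep in sync, which is where the original read_int went wrong.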

Count the number of words in an input line, where a word is a sequence of characters whose first character is a letter

Exact formulation:
Write a program that counts the number of words in the input line. A word is a sequence of characters whose first character must be a letter.
Examples of inputs and outputs:
Input: one 2two three
Output: 2
Input: one two three four five 6six
Output: 5
Input: 789878moer and more
Output: 2
Input: something like 8this
Output: 2
Program:
#include <stdio.h>
#define YES 1
#define NO 0
int main() {
int c, nw, inword, first_char;
inword = first_char = NO;
nw = 0;
while((c = getchar()) != EOF) {
if (c == ' ' || c == '\n' || c == '\t') {
inword = first_char = NO;
} else if (inword == NO && first_char == NO) {
if ((65 < c && c < 90) || (97 < c && c < 122)) {
++nw;
inword = YES;
} else {
first_char = YES;
}
}
}
printf("%d\n", nw);
}
Questions:
Is this a correct solution?
Is it possible to solve this task in a more elegant way? If yes, how?
Is this a correct solution?
I tested a few cases and it seems okay to me.
Is it possible to solve this task in a more elegant way? If yes, how?
The following line
if((65 < c && c < 90) || (97 < c && c < 122))
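uses magic numbers and ASCII values to check whether c is a letter. Note also that the strict comparisons exclude 'A', 'Z', 'a' and 'z' themselves (65, 90, 97 and 122), so a word starting with one of those letters is not counted.
You can instead use the library function isalpha(), which is declared in the <ctype.h> header, so that the above line becomes: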
if (isalpha(c))
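As a rough illustration of the isalpha() approach, here is a sketch that counts words in a string instead of reading standard input, which makes it easy to check against the examples from the question. The function name count_words and the single at_start flag are choices made for this sketch, not part of the original program.

```c
#include <ctype.h>

/* Counts the whitespace-separated tokens in `s` whose first character
   is a letter. (Hypothetical helper operating on a string.) */
int count_words(const char *s) {
    int nw = 0;
    int at_start = 1;   /* the next non-space character begins a token */

    for (; *s != '\0'; ++s) {
        /* The <ctype.h> functions expect an unsigned char value or EOF. */
        unsigned char ch = (unsigned char)*s;
        if (isspace(ch)) {
            at_start = 1;
        } else if (at_start) {
            if (isalpha(ch))
                ++nw;
            at_start = 0;
        }
    }
    return nw;
}
```

With this version, a token such as Zebra is counted as well, because isalpha() accepts every letter, including the boundary letters.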

Why do I have to write c - '0' instead of just c? [duplicate]

Hey, I can't understand why my code doesn't work when I write just ++ndigit[c] (instead of ++ndigit[c - '0']), whereas ++nchar[c] seems to be OK.
If you know of a tutorial, I'd be really interested!
#include <stdio.h>
int main()
{
int c, i, y, ns;
int ndigit[10];
int nchar[26];
ns = 0;
for(i = 0; i >= 0 && i<= 9; ++i) {
ndigit[i] = 0;
}
for(y = 'a'; y <= 'z'; ++y) {
nchar[y] = 0;
}
while((c = getchar()) != EOF) {
if(c == ' ' || c == '\t') {
++ns;
}
if(c >= 'a' && c <= 'z') {
++nchar[c];
}
if(c >= '0' && c <= '9') {
++ndigit[c];
//++ndigit[c-'0'];
}
if(c == '\n') {
printf("chiffres: ");
for(i=0;i<10;++i) {
printf("%d:%d ", i, ndigit[i]);
}
printf("lettres: ");
for(y='a';y<='z';++y) {
printf("%d:%d ", y, nchar[y]);
}
printf("space: %d\n", ns);
}
}
}
When c holds the character '0', its value is the character code of '0', which is 48 in ASCII, not the integer 0.
The digit characters therefore have the values 48 to 57, but ndigit has only 10 elements, so ++ndigit[c] writes outside the array. In C that is undefined behavior rather than a guaranteed runtime error, which is why the program may simply print garbage. The same is true of ++nchar[c]: 'a' to 'z' are 97 to 122 in ASCII and nchar has only 26 elements, so it only appears to work.
Remember that '0' is a character constant; storing it in an int variable gives the character's code. Subtracting '0' (or 'a' for the letters) turns that code into an index that starts at 0.
Because the character '4' (for example) is usually not equal to the integer 4. I.e. '4' != 4.
Using the most common character encoding scheme, ASCII, the character '4' has the value 52, and the character '0' has the value 48. That means if you do e.g. '4' - '0', you in practice do 52 - 48 and get the result 4 as an integer.
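To make the indexing concrete, here is a small sketch that tallies digits and lowercase letters from a string. The function name tally and its parameters are hypothetical; c - '0' is guaranteed to work for digits, while c - 'a' assumes the lowercase letters are contiguous, as they are in ASCII.

```c
/* Tallies digits and lowercase letters found in `s`.
   ndigit must have 10 elements and nchar 26, all initialised by the caller.
   (Hypothetical helper for illustration.) */
void tally(const char *s, int ndigit[10], int nchar[26]) {
    for (; *s != '\0'; ++s) {
        int c = (unsigned char)*s;
        if (c >= '0' && c <= '9')
            ++ndigit[c - '0'];   /* '0'..'9' -> index 0..9 */
        else if (c >= 'a' && c <= 'z')
            ++nchar[c - 'a'];    /* 'a'..'z' -> index 0..25, assuming ASCII */
    }
}
```

Both subtractions keep every index inside its array, which neither ++ndigit[c] nor ++nchar[c] does.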

Converting from hexadecimal to decimal number in C

I am doing the exercises in The C Programming Language, and exercise 2-3 asks us to write a function htoi that converts a hexadecimal number to a decimal number.
This is the code I wrote; however, when it runs, it always reports that my hexadecimal number is illegal.
Please help!
#include<stdio.h>
#define TRUE 1
#define FALSE 0
int htoi (char s[]);
int main() {
printf("The decimal number is %d\n", htoi("0x134"));
return 0;
}
int htoi (char s[]) {
int j; /* counter for the string */
int temp; /* temp number in between conversion */
int number; /* the converted number */
int ishex; /* if the number is a valid hexadecimal number */
char c;
number = 0;
temp = 0;
ishex = FALSE;
if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
ishex = TRUE;
}
else {
ishex = FALSE;
printf("This is not valid hexadecimal number.\n");
return number = 0;
}
if (ishex == TRUE) {
for (j = 2; (c = s[j]) != EOF; ++j) {
if (c >= '0' && c <= '9')
temp = c - '0';
else if (c >= 'a' && c <= 'f')
temp = 10 + c - 'a';
else if (c >= 'A' && c <= 'F')
temp = 10 + c - 'A';
else {
printf("This is a illegal hexadecimal number.\n");
ishex = FALSE;
return 0;
}
number = number * 16 + temp;
}
}
return number;
}
A string is a sequence of characters that terminates at the first '\0' character. That means "0x134" terminates with a '\0' character value, not an EOF value.
You are operating on a sequence of characters that you expect to be terminated by an EOF value, but that is simply not possible. I'll explain why later... Suffice to say for now, the string "0x134" contains no EOF value.
Your loop reaches the string-terminating '\0', which isn't in the range 0..9, a..f or A..F and so this branch executes:
else {
printf("This is a illegal hexadecimal number.\n");
ishex = FALSE;
return 0;
}
Perhaps you meant to write your loop like so:
for (j = 2; (c = s[j]) != '\0'; ++j) {
/* SNIP */
}
I promised to explain what is wrong with expecting EOF to exist as a character value. Assuming an unsigned char is 8 bits, getchar can return one of 256 character values, and it will return them as a non-negative unsigned char value (converted to int)... OR it can return the negative int value EOF, corresponding to an error or end-of-file.
Confused? In an empty file, there are no characters... Yet if you try to read a character from the file, you will get EOF every time, in spite of there being no characters. Hence, EOF is not a character value. It's an int value, and should be treated as such before you attempt to convert the value to a character, like so:
int c = getchar();
if (c == EOF) {
/* Here, c is NOT A CHARACTER VALUE! *
* It's more like an error code ... *
* XXX: Break or return or something */
}
else {
/* Here, c IS a character value, ... *
* so the following conversion is ok */
char ch = c;
}
On another note, c >= '0' && c <= '9' will evaluate truthfully when c is one of the digits in the range 0..9... This is a requirement from the C standard, which guarantees that the digit characters are contiguous and in increasing order.
Neither c >= 'a' && c <= 'f' nor c >= 'A' && c <= 'F' are required to evaluate truthfully under any circumstance, however. It happens to work on your system, because you are using ASCII which contains all of the lowercase letters in one contiguous block, and all of the uppercase letters in another contiguous block. C does not require that ASCII be the character set.
If you want this code to work portably, you might consider something like:
char alpha_digit[] = "aAbBcCdDeEfF";
if (c >= '0' && c <= '9') {
c -= '0';
}
else if (strchr(alpha_digit, c)) {
c = 10 + (strchr(alpha_digit, c) - alpha_digit) / 2;
}
else {
/* SNIP... XXX invalid digit */
}
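Putting the points above together, a complete htoi might look like the following sketch. It stops at the terminating '\0' rather than EOF and looks each digit up in a table, so it does not depend on the letters being contiguous. Returning -1 for invalid input, rejecting a bare "0x", and ignoring overflow are assumptions made for this sketch, not part of the exercise.

```c
#include <ctype.h>
#include <string.h>

/* Converts a string of the form 0x... or 0X... to an int.
   Returns -1 if the prefix is missing, no digits follow it,
   or an invalid digit is found. Overflow is not checked. */
int htoi(const char *s) {
    static const char digits[] = "0123456789abcdef";
    int number = 0;

    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X') || s[2] == '\0')
        return -1;

    /* The loop ends at '\0', so strchr never sees the terminator. */
    for (int j = 2; s[j] != '\0'; ++j) {
        const char *p = strchr(digits, tolower((unsigned char)s[j]));
        if (p == NULL)
            return -1;
        /* The position in the table is the digit's value. */
        number = number * 16 + (int)(p - digits);
    }
    return number;
}
```

For example, htoi("0x134") computes 1 * 256 + 3 * 16 + 4, which is 308.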

Infinite loop when trying to count letter frequencies

#include <stdio.h>
#include <string.h>
int main(void)
{
char string[100];
int c = 0, count[26] = {0};
int accum = 0;
int a;
while(1)
{
a = scanf("%s", string);
while ( string[c] != '\0' )
{
if ( string[c] >= 'a' && string[c] <= 'z' ){
count[string[c]-'a']++;
accum++;
}
else if (string[c] >= 'A' && string[c] <= 'Z'){
count[string[c]-'A']++;
accum++;
}
c++;
}
if (a == EOF)
{
for ( c = 0 ; c < 26 ; c++ )
{
if( count[c] != 0 )
printf( "%c %f\n", c+'a', ((double)count[c])/accum);
}
}
}
return 0;
}
So I have a program that counts the frequencies of letters that appear in standard input until EOF. But once I reach EOF, my program just goes into an infinite loop and the frequencies don't seem to be right. When I just put a print statement to enter a single string, it works fine. I don't really know what the problem is. Would anybody be able to help me with this?
The if (a == EOF) check should come right after a = scanf("%s", string);
That if () branch should then exit the loop.
You should also reset c = 0 on each pass through the loop.
while(1) {
a = scanf("%s", string);
if (a == EOF) {
...
break;
}
c = 0;
while ( string[c] != '\0' ) {
With the above changes, I'm confident your code will run fine. There are other things to consider, to a lesser degree:
1) scanf("%s", ...) is unbounded; the input length should be limited.
2) The if (a == EOF) handling might as well be coded after the loop.
3) Make the loop condition the positive test that scanf() == 1. Loop on what is good rather than exiting on one case of what is bad.
4) Consider unsigned rather than int for counting.
5) A for () loop rather than while () is nice for incremental loops.
6) Avoid magic numbers like 26.
BTW: Your code made nice use of casting for floating point, character literals such as 'A', and array {0} initialization.
#include <stdio.h>
#include <string.h>
int main(void) {
char string[100];
unsigned count['z' - 'a' + 1] = { 0 };
unsigned accum = 0;
while (scanf("%99s", string) == 1) {
for (int c = 0; string[c]; c++) {
if (string[c] >= 'a' && string[c] <= 'z') {
count[string[c] - 'a']++;
accum++;
} else if (string[c] >= 'A' && string[c] <= 'Z') {
count[string[c] - 'A']++;
accum++;
}
}
}
for (int c = 'a'; c <= 'z'; c++) {
if (count[c - 'a'] != 0)
printf("%c %f\n", c, ((double) count[c - 'a']) / accum);
}
return 0;
}
The infinite loop is caused by this line:
while(1)
Remove it if you don't need it, or add a break statement somewhere.
Some more words to help describe your problem and the solution (as proposed by chux).
The first problem that you have is that you have no logic to exit the while(1) loop.
I.e., you have an infinite loop because that is what you have coded.
Even though you detect the EOF, you don't do anything about it: there is nothing in your code that says "Now that we have EOF, we need to exit this while(1) loop".
This is what chux is suggesting in his answer: that is what the break statement is for: it says "break out of the loop now".
You also have an additional problem that you are parsing the string before you check if you have EOF. If a is EOF, then you must not parse the string, because you didn't get one.
So you need to rearrange your code so that your EOF check comes before your string parsing, and when you finish printing the frequencies after detecting EOF, you need to break.
