checking chars when reading from file with getc - c

In the following code, I'm attempting to store all characters from a file (including newlines).
If a newline is read, variable 'i' should be incremented and 'j' reset to 0, but this doesn't happen. I've confirmed that the newlines are in fact being read and stored, by printing from my array to console.
void scan_solved_nonogram(board *b) {
FILE *file = fopen("test.txt", "r");
int i = 0, j = 0;
while( ( b->symbol[i][j] = getc(file) ) != EOF ) {
j++;
if( b->symbol[i][j] == '\n' ) {
i++;
j = 0;
}
}
fclose(file);
b->size_i = i;
b->size_j = j;
}

The problem is that you increment j before you check for the newline character.
while( ( b->symbol[i][j] = getc(file) ) != EOF ) {
j++;// you increment j, so you need to check for newline at j-1
if( b->symbol[i][j-1] == '\n' ) {
i++;
j = 0;
}
}
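A minimal sketch of the same loop with the test applied to the character that was just read. The `board` layout, the `MAX_ROWS`/`MAX_COLS` limits, and the `scan_board` helper taking an already opened `FILE *` are assumptions made for illustration, since the original struct is not shown. Reading into an `int` first also keeps `EOF` distinguishable from a real character, and the bounds checks are an addition.

```c
#include <stdio.h>

/* Hypothetical board layout; the original struct definition is not shown. */
#define MAX_ROWS 64
#define MAX_COLS 64

typedef struct {
    char symbol[MAX_ROWS][MAX_COLS];
    int size_i; /* number of complete lines read */
    int size_j; /* characters in the last, unterminated line */
} board;

/* Reads every character (newlines included) from an open stream into b.
   Returns 0 on success, -1 if the input does not fit. */
int scan_board(board *b, FILE *file)
{
    int i = 0, j = 0;
    int c; /* int, not char, so EOF stays distinguishable */

    while ((c = getc(file)) != EOF) {
        if (i >= MAX_ROWS || j >= MAX_COLS)
            return -1;
        b->symbol[i][j++] = (char)c;
        if (c == '\n') { /* test the character just read, not the next slot */
            i++;
            j = 0;
        }
    }
    b->size_i = i;
    b->size_j = j;
    return 0;
}
```

Comparing `c` directly avoids the `j-1` indexing altogether, which is one way to sidestep the original off-by-one.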

Related

Reverse line function in C works unpredictably

I'm doing the exercises from The C Programming Language, the one where you need to write a function to reverse a line. I wrote one and it works, but only sometimes: the same test gives different results. I really don't get it and would appreciate some help. Three tries out of four it prints around 150 spaces, and one time out of four it prints the reversed line as I wanted, though with some junk at the end for some reason.
I was thinking of doing it with pointers, but couldn't figure them out as of now.
Here's my code:
#include <stdio.h>
void reverse(char theline[150]){
int i, j;
char tmp[150];
for (i = 0; theline[i] != 0; i++){
tmp[i] = theline[i];
}
for (j = 0; i >= 0; j++){
theline[j] = tmp[i];
i--;
}
}
int main() {
char line[150];
char c;
int counter = 0;
do {
counter = 0;
while (((c = getchar()) != '\n') && (c != EOF)) { //one line loop
line[counter] = c;
counter++;
}
if (counter > 80){
reverse(line);
printf("%s\n", line);
}
}
while (c != EOF);
return 0;
}
I compile it with "gcc -g -Wall program -o test" and the compiler doesn't give me any errors or warnings. My OS is Ubuntu and I test it with "./test < longtext.txt". This text file has a few lines of different length.
After this loop
while (((c = getchar()) != '\n') && (c != EOF)) { //one line loop
line[counter] = c;
counter++;
}
the character array line does not contain a string because the stored characters are not appended with the terminating zero character '\0'.
So this loop within the function
for (i = 0; theline[i] != 0; i++){
tmp[i] = theline[i];
}
invokes undefined behavior.
You need to append the terminating zero character '\0' to the stored characters.
But even if the passed character array contains a string, the second for loop in
for (i = 0; theline[i] != 0; i++){
tmp[i] = theline[i];
}
for (j = 0; i >= 0; j++){
theline[j] = tmp[i];
i--;
}
writes the terminating zero character '\0' to the first position of the array theline, because i still indexes the terminator when the second loop starts. As a result you get an empty string.
Also, the function should use neither the magic number 150 nor an auxiliary array.
Note that the variable c should be declared with the type int, which is what getchar returns. The type char behaves either as signed char or as unsigned char, depending on the implementation and compiler options. If it behaves as unsigned char, then this condition
c != EOF
will always evaluate to true.
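The two points above (store the result of the read in an `int`, and always append '\0') can be sketched together in one helper. The name `read_line` and its return convention are assumptions for illustration, not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from `in` into buf (capacity cap > 0),
   drops the '\n', and always appends '\0'.
   Returns the number of characters stored, or -1 if EOF arrived before
   any character was read. */
int read_line(FILE *in, char buf[], int cap)
{
    int c;     /* int so that every char value and EOF are distinct */
    int n = 0;

    while ((c = getc(in)) != EOF && c != '\n') {
        if (n < cap - 1)   /* excess characters of a long line are discarded */
            buf[n++] = (char)c;
    }
    buf[n] = '\0';         /* buf now holds a proper string */
    if (c == EOF && n == 0)
        return -1;
    return n;
}
```

In the question's main, a call such as `read_line(stdin, line, sizeof line)` could stand in for the inner while loop, with a return value of -1 ending the outer loop.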
Without using standard C string functions the function can be declared and defined the following way
char * reverse( char theline[] )
{
size_t i = 0;
while ( theline[i] != '\0' ) i++;
size_t j = 0;
while ( j < i )
{
char c = theline[j];
theline[j++] = theline[--i];
theline[i] = c;
}
return theline;
}
Here is a demonstration program
#include <stdio.h>
char * reverse( char theline[] )
{
size_t i = 0;
while ( theline[i] != '\0' ) i++;
size_t j = 0;
while ( j < i )
{
char c = theline[j];
theline[j++] = theline[--i];
theline[i] = c;
}
return theline;
}
int main( void )
{
char s[] = "Hello World!";
puts( s );
puts( reverse( s ) );
}
The program output is
Hello World!
!dlroW olleH
Why muck around with reversing a buffer when you can simply store the entered characters back to front as they arrive?
#include <stdio.h>
int main() {
for( ;; ) {
char line[ 150 ];
int c, counter = sizeof line;
line[ --counter ] = '\0';
// NB: EOF is an int, not a char
while( ( c = getchar() ) != '\n' && c != EOF && counter > 0 )
line[ --counter ] = (char)c;
printf( "%s\n\n", line + counter );
if( c == EOF ) break; // stop at end of input instead of looping forever
}
return 0;
}
Output:
the quick
kciuq eht
Once upon a time in a land far awy
ywa raf dnal a ni emit a nopu ecnO
I wish I was what I was when I wished I was what I am now.
.won ma I tahw saw I dehsiw I nehw saw I tahw saw I hsiw I

Why do I have garbage characters at the end of my character array?

I have the following code (from K&R Exercise 2-4):
#include <stdio.h>
#define MAXLINE 1000 /* maximum input line size */
void squeeze(char s1[], char s2[]);
/* Exercise 2-4. Write an alternate version of squeeze(s1,s2) that deletes each
character in s1 that matches any character in the string s2. */
main()
{
int c, i;
char line[MAXLINE]; /* current input line */
int len = 0; /* current line length */
char s2[MAXLINE]; /* array of characters to delete */
printf("Characters to delete: ");
for (i = 0; (c = getchar()) != '\n'; ++i)
s2[i] = c;
while ((c = getchar()) != EOF) {
if (c == '\n') {
squeeze(line, s2);
printf("%s\n", line);
for (i = 0; i < len; ++i)
line[i] = 0;
len = 0;
} else {
line[len] = c;
++len;
}
}
return 0;
}
/* squeeze: delete all chars in s2 from s1 */
void squeeze(char s1[], char s2[])
{
int i, j, k;
for (k = 0; s2[k] != '\0'; k++) {
for (i = j = 0; s1[i] != '\0'; i++)
if (s1[i] != s2[k])
s1[j++] = s1[i];
s1[j] = '\0';
}
}
But when I run it and it reads in input, I find that there are garbage characters at the end of s2. Adding the following code after declaring the character arrays:
for (i = 0; i < MAXLINE; ++i)
line[i] = s2[i] = 0;
seems to fix the issue. But don't character arrays come initialized with 0 to begin with? Does anyone know why this is happening?
The problem here is your strings are not null terminated. Local variables (like line and s2) are not automatically initialized with 0. Their content is indeterminate.
Declare line and s2 like this:
char line[MAXLINE] = { 0 }; // initializes all elements of line with 0
char s2[MAXLINE] = {0}; // initializes all elements of s2 with 0
or just null terminate line:
...
if (c == '\n') {
line[len] = 0; // <<< add this: null terminate line
squeeze(line, s2);
...
and null terminate s2:
...
printf("Characters to delete: ");
for (i = 0; (c = getchar()) != '\n'; ++i)
s2[i] = c;
s2[i] = 0; // <<< add this: null terminate s2
while ((c = getchar()) != EOF) {
...
For starters according to the C Standard the function main without parameters shall be declared like
int main( void )
As the function squeeze accepts two arrays without their lengths, it follows that the arrays must contain strings: sequences of characters terminated by the zero character '\0'.
As the second array is not changed within the function then the second parameter should be declared with the qualifier const.
The function declaration will look like
char * squeeze( char s1[], const char s2[] );
Within the function you should first check whether s1 or s2 contains an empty string.
The function definition can look for example the following way
char * squeeze( char s1[], const char s2[] )
{
if ( *s1 != '\0' && *s2 != '\0' )
{
size_t i = 0;
for ( size_t j = 0; s1[j] != '\0'; ++j )
{
size_t k = 0;
while ( s2[k] != '\0' && s2[k] != s1[j] ) ++k;
if ( s2[k] == '\0' )
{
if ( i != j ) s1[i] = s1[j];
++i;
}
}
s1[i] = '\0';
}
return s1;
}
The nested loops within the function
if ( *s1 != '\0' && *s2 != '\0' )
{
size_t i = 0;
for ( size_t j = 0; s1[j] != '\0'; ++j )
{
size_t k = 0;
while ( s2[k] != '\0' && s2[k] != s1[j] ) ++k;
if ( s2[k] == '\0' )
{
if ( i != j ) s1[i] = s1[j];
++i;
}
}
s1[i] = '\0';
}
can also be rewritten with the standard function strchr (declared in <string.h>) the following way
if ( *s1 != '\0' && *s2 != '\0' )
{
size_t i = 0;
for ( size_t j = 0; s1[j] != '\0'; ++j )
{
if ( strchr( s2, s1[j] ) == NULL )
{
if ( i != j ) s1[i] = s1[j];
++i;
}
}
s1[i] = '\0';
}
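Put together as a complete function, the strchr variant might look like the sketch below. Dropping the empty-string check and the `i != j` test is a simplification made here, not part of the answer: the loop handles empty strings on its own, and the self-assignment is harmless.

```c
#include <stddef.h>
#include <string.h>

/* Deletes from s1 every character that also occurs in s2; returns s1. */
char *squeeze(char s1[], const char s2[])
{
    size_t i = 0; /* next write position */

    for (size_t j = 0; s1[j] != '\0'; ++j) {
        /* strchr returns NULL when s1[j] does not occur in s2 */
        if (strchr(s2, s1[j]) == NULL)
            s1[i++] = s1[j];
    }
    s1[i] = '\0';
    return s1;
}
```

For example, calling `squeeze` on an array holding "hello world" with "lo" as the second argument leaves "he wrd" in the array.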
In main you need to append the terminating zero character to each entered sequence, for example
i = 0;
while ( i + 1 < MAXLINE && ( c = getchar() ) != EOF && c != '\n' )
{
s2[i++] = c;
}
s2[i] = '\0';
and
i = 0;
while ( ( c = getchar() ) != EOF )
{
if ( i + 1 == MAXLINE || c == '\n' )
{
line[i] = '\0';
printf( "%s\n", squeeze( line, s2 ) );
i = 0;
}
else
{
line[i++] = c;
}
}

Is there a way to print all string in capital letters without using strupr function as its not a standard library function?

I want to print the data stored in a file which is randomly cased in all caps and strupr() seems to be something that's been listed by someone previously but its not a standard function and may not be cross platform. Is there something which is cross platform?
EDIT 1:
fgets(input1,254,title);
fgets(input2,254,author);
input1[strcspn(input1, "\n")] = '\0';
input2[strcspn(input2, "\n")] = '\0';
printf("<%s> <%s>\n",input1,input2 );
I want to print the string stored in input1 and input2 in uppercase. How to do that?
You can process the string character by character and use toupper(), a standard function (declared in <ctype.h>) since C89.
Or you can check whether the character lies between 'a' and 'z' and, if so, subtract 32 to get the capital letter. This assumes an ASCII-compatible character set.
Here 'a' - 32 == 'A', because the ASCII value of 'a' is 97, the ASCII value of 'A' is 65, and 97 - 32 = 65.
Code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
char buffer[255] = {'\0'};
int i = 0, c; // c is an int so that EOF can be told apart from a character
fp = fopen("txt.txt", "r");
if(!fp)
{
perror("txt");
exit(1);
}
while( i < (int)sizeof buffer - 1 && (c = getc(fp)) != EOF) // leave room for the terminating '\0'
buffer[i++] = c;
for( i = 0; buffer[i] != '\0'; i++)
{
if(buffer[i] >= 'a' && buffer[i] <= 'z')
buffer[i] = buffer[i] - 32;
printf("%c", buffer[i]);
}
fclose(fp);
return 0;
}
Output:
HELLO!
THIS IS 2ND LINE.
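The toupper() route mentioned at the start of this answer can be sketched as a small in-place helper. The name `str_upper` is an assumption for illustration.

```c
#include <ctype.h>

/* Hypothetical helper: converts s to upper case in place and returns it. */
char *str_upper(char *s)
{
    for (char *p = s; *p != '\0'; ++p) {
        /* toupper expects an unsigned char value (or EOF), hence the cast */
        *p = (char)toupper((unsigned char)*p);
    }
    return s;
}
```

With the buffers from the question, the print call could then become `printf("<%s> <%s>\n", str_upper(input1), str_upper(input2));`. Unlike the subtraction of 32, this does not depend on the character set being ASCII.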
You can use a custom function, for example upcase(). It reads every character in the file, converts it with toupper() (which changes only lowercase letters), stores the whole file content in a buffer, and then overwrites the file with the content of the buffer:
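#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
FILE* upcase (const char* path)
{
int c;
size_t cnt = 0, n = 500;
FILE* fp = fopen(path, "r+");
char* buffer = calloc(n, sizeof(char));
if (!fp || !buffer)
{
if (fp) fclose(fp);
free(buffer);
return NULL;
}
while ((c = fgetc(fp)) != EOF)
{
if ( cnt == n )
{
// grow the buffer; keep the old pointer until realloc succeeds
char* tmp = realloc(buffer, n * 2);
if (!tmp)
{
free(buffer);
fclose(fp);
return NULL;
}
buffer = tmp;
n *= 2;
}
buffer[cnt++] = (char)toupper(c);
}
rewind(fp); // go back to the start of the file before overwriting it
for ( size_t i = 0; i < cnt; i++ )
{
if (fputc(buffer[i], fp) == EOF)
{
free(buffer);
fclose(fp);
return NULL;
}
}
free(buffer);
return fp; // the caller is responsible for closing the file
}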

Why is string modified in C even when I'm not trying to modify it?

I'm trying to solve exercise 1-19 in K&R C second edition. "Write a function reverse that reverses the character string s. Use it to write program that reverses its input a line at a time."
My solution takes two input strings s and t. s is source and t is target. And it copies the data in source s to t. I'm able to solve the problem, but I'm struggling hard to understand why would source string s be modified, even though it is not on the left hand side of the equal operator.
#include <stdio.h>
/* Solution to Exercise 1-19. Chapter 1 */
#define MAXLENGTH 10
int getln(char s[], int lim);
void reverse(char s[], char t[]);
int main()
{
int i, len;
char s[MAXLENGTH]; /* original string */
char t[MAXLENGTH]; /* reversed string */
while ((len = getln(s, MAXLENGTH)) > 0) {
printf("before reverse: %s", s);
reverse(s,t);
printf("reversed string: %s\n", t);
printf("after reverse: %s", s);
}
return 0;
}
/* getln: read a line into s, return length */
int getln(char s[], int lim)
{
int c, i, l;
l = 0;
for (i = 0; ((c = getchar()) != EOF) && (c != '\n'); ++i) {
if (i < (lim - 1)) {
s[l] = c;
++l;
}
}
if (c == '\n') {
s[l] = c;
++l;
}
s[l] = '\0';
return l;
}
/* reverse: reverses s to target t */
void reverse(char s[], char t[])
{
int i, j;
for (i = 0; s[i] != '\0'; ++i)
;
--i;
if (s[i] == '\n') {
--i;
}
for (j = 0; i >= 0; ++j) {
t[j] = s[i];
--i;
}
t[j] = '\0';
}
Test case:
$ ./a.out < testdata
before reverse: abcdefghi
reversed string: ihgfedcba
after reverse: abcdefghi
ihgfedcba$
Contents of file testdata:
$ cat testdata
abcdefghijklmnopqrstuvwxyz
$
There is a bug in the function getln. To simplify the analysis of the function, let's assume that lim is equal to 2.
Then in this loop
l = 0;
for (i = 0; ((c = getchar()) != EOF) && (c != '\n'); ++i) {
if (i < (lim - 1)) {
s[l] = c;
++l;
}
}
you can write lim-1 characters, that is, only one character. The loop stops iterating when the user presses the Enter key, which sends the new line character '\n' to the input buffer.
So the last read character is the new line character '\n'. This character is stored in the string after the loop
if (c == '\n') {
s[l] = c;
++l;
}
Now the limit is exhausted. Two characters of the passed character array are set.
However in the next statement
s[l] = '\0';
memory beyond the array is accessed, because l is now equal to 2 and the only valid indices are 0 and 1.
So the function invokes undefined behavior whenever the value of the parameter lim is equal to the size of the passed character array. The terminating zero character '\0' is written to memory outside the character array, where it can later be overwritten.
I would define the function as shown in the demonstration program below.
#include <stdio.h>
size_t getln( char s[], size_t n )
{
size_t i = 0;
if ( n )
{
int c = EOF; // initialized because the loop below reads nothing when n == 1
while ( i + 1 < n && ( c = getchar() ) != EOF && c != '\n' )
{
s[i++] = c;
}
if ( c == '\n' && i + 1 < n ) s[i++] = c;
s[i] = '\0';
}
return i;
}
int main(void)
{
enum { N = 10 };
char s[N];
while ( getln( s, N ) ) printf( "\"%s\"\n", s );
return 0;
}
If you enter
abcdefghijklmnopqrstuvwxyz
then the program output will be
"abcdefghi"
"jklmnopqr"
"stuvwxyz
"
That is, only the last string read contains the new line character.
Note that the exercise says
Write a function reverse that reverses the character string s.
This means that you need to reverse the original string itself instead of copying it in reverse order to another character array.
Such a function can look the following way
#include <stdio.h>
char * reverse( char *s )
{
size_t n = 0;
while ( s[n] != '\0' ) n++;
if ( n && s[n-1] == '\n' ) --n;
for ( size_t i = 0; i < n / 2; i++ )
{
char c = s[i];
s[i] = s[n-i-1];
s[n-i-1] = c;
}
return s;
}
size_t getln( char s[], size_t n )
{
size_t i = 0;
if ( n )
{
int c = EOF; // initialized because the loop below reads nothing when n == 1
while ( i + 1 < n && ( c = getchar() ) != EOF && c != '\n' )
{
s[i++] = c;
}
if ( c == '\n' && i + 1 < n ) s[i++] = c;
s[i] = '\0';
}
return i;
}
int main(void)
{
enum { N = 10 };
char s[N];
while ( getln( s, N ) ) printf( "\"%s\"\n", reverse( s ) );
return 0;
}
Again if the input is
abcdefghijklmnopqrstuvwxyz
then the program output is
"ihgfedcba"
"rqponmlkj"
"zyxwvuts
"
If you want to remove the new line character '\n' from the string inside the function reverse, then replace this statement
if ( n && s[n-1] == '\n' ) --n;
with this one
if ( n && s[n-1] == '\n' ) s[--n] = '\0';
You are not allocating memory for c or t, therefore you are overwriting things.

Getc() reading \n improperly

I am creating a program that prints each line of a file one by one in reverse word order. i.e. "The big black moose" prints as "moose black big The".
However, in the code below, it does not print the last word of lines that do not have a delimiter before the line break. A delimiter in this case is defined as any whitespace character such as space or tab.
int main(int argc, char const *argv[]) {
if (argc != 2) return 0;
int i = 0, c;
int isD = 0, wasD = 1;
int count = 0;
FILE *file = fopen(argv[1], "r");
while ((c = getc(file)) != EOF) {
isD = c == ' ' || c == '\t' || c == '\n';
if (!isD) {
chars[i++] = c;
count++;
}
if (isD && !wasD) {
shiftInsert(i++, count);
count = 0;
}
wasD = isD;
}
fclose(file);
return 0;
}
int shiftInsert(int length, int shift) {
int word[shift+1], i;
printf("\n----------\nL:%d,S:%d\n", length, shift);
for (i = 0; i < shift; i++)
word[i] = chars[length-shift+i];
word[shift] = ' ';
for (i = 0; i < shift; i++)
printf("%c", word[i]);
for (i = length; i >= 0; i--)
chars[i+shift+1] = chars[i];
for (i = 0; i <= shift; i++)
chars[i] = word[i];
printf("|");
}
This happens because the loop body is not entered when getc reports the end of the file. If wasD is false at that point, one unprocessed word is left in the buffer.
You could treat EOF as whitespace and place the terminating condition at the end of the loop:
do {
c = getc(file);
isD = (c == ' ' || c == '\t' || c == '\n' || c == EOF);
// ...
} while (c != EOF);
This works because you use the value of c only if it is not a delimiter. (The special value EOF is outside the valid range of (unsigned) chars and should not be inserted into strings or printed.)
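As a self-contained illustration of that loop shape, here is a sketch that only counts words rather than reordering them. The function `count_words` is a hypothetical stand-in for the question's processing; the flags isD and wasD play the same role as in the original code.

```c
#include <stdio.h>

/* Hypothetical example: counts whitespace-separated words in a stream.
   EOF is treated as one more delimiter, so a final word without a trailing
   space or newline is still counted. */
int count_words(FILE *file)
{
    int c;
    int isD, wasD = 1; /* start as if a delimiter preceded the input */
    int words = 0;

    do {
        c = getc(file);
        isD = (c == ' ' || c == '\t' || c == '\n' || c == EOF);
        if (isD && !wasD)
            words++; /* a word just ended; c itself is never stored */
        wasD = isD;
    } while (c != EOF);

    return words;
}
```

For a file containing "The big black moose" with no trailing newline, the end-of-word branch still runs once for "moose", which is the case the original while loop missed.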
stdout is not getting flushed because your last output didn't contain a newline...
Change this line
printf("|");
to
printf("|\n");
