Using fgetc to read words? - c

I want to read a text file, character by character, and then do something with the characters and something with the words. This is my implementation:
char c;
char* word="";
fp = fopen("text.txt","rt");
do
{
c = (char)fgetc(fp);
if(c == ' ' || c == '\n' || c == '\0' || c == '\t')
{
//wordfunction(word)
word = ""; //Reset word
}
else
{
strcat(word, &c); //Keeps track of current word
}
//characterfunction(c);
}while(c != EOF);
fclose(fp);
However, when I try to run this my program instantly crashes. Is there a problem with setting word to ""? If so, what should I do instead?

In the initial assignment to word, you're pointing at a string literal of length 0. There is no room to append anything to it, and string literals must not be modified at all, so when strcat writes there the behaviour is undefined and your program will break. Instead, you need to reserve space for your words.
where you have
char* word="";
use instead
char word[100];
This will create a space of 100 chars for your word.
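int c; //int, not char, so EOF can be told apart from real characters
char word[100];
FILE *fp = fopen("text.txt","rt");
int index = 0;
word[0] = 0;
while ((c = fgetc(fp)) != EOF) {
if(c == ' ' || c == '\n' || c == '\0' || c == '\t') {
//wordfunction(word)
word[0] = 0; //Reset word
index = 0;
} else if(index < 99) { //Leave room for the terminator
word[index++] = (char)c;
word[index] = 0;
//strcat(word, &c); //Keeps track of current word
}
//characterfunction(c);
}
//If the file does not end in whitespace, the last word is still in word here
fclose(fp);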

word points to a read-only string literal, which strcat is not allowed to write to, so your program crashed.
word needs enough space to hold the characters: use a fixed-size array, or allocate the buffer with malloc and grow it with realloc(word, new_size).
This version compiles with gcc:
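#include <assert.h>
#include <stdio.h>
int main(){
int c; //int, so EOF can be told apart from real characters
char word[1000] = {0}; //enough space
FILE* fp = fopen("text.txt","rt");
int index = 0;
assert(0 != fp);
do
{
c = fgetc(fp);
if(c == ' ' || c == '\n' || c == '\0' || c == '\t' || c == EOF)
{
word[index] = 0; //Terminate the current word
//wordfunction(word)
index = 0;
}
else if(index < 999) //Leave room for the terminator
{
word[index++] = (char)c;
}
//characterfunction(c);
}while(c != EOF);
fclose(fp);
return 0;
}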
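The realloc suggestion above can be sketched roughly as follows. This is only a minimal sketch: read_word is a hypothetical helper name, the starting capacity of 16 is arbitrary, and words are assumed to be separated by whatever isspace() accepts (so unlike the code above, '\0' is not treated as a separator).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads the next whitespace-delimited word from fp into a heap buffer.
 * Returns the word (the caller frees it), or NULL at end of input or on
 * allocation failure. read_word is a hypothetical helper name. */
char *read_word(FILE *fp)
{
    assert(fp != NULL);

    int c;  /* int, so EOF stays distinct from every real character */

    /* Skip leading whitespace. */
    while ((c = fgetc(fp)) != EOF && isspace(c))
        ;
    if (c == EOF)
        return NULL;

    size_t cap = 16, len = 0;   /* assumed starting capacity */
    char *word = malloc(cap);
    if (word == NULL)
        return NULL;

    do {
        if (len + 1 >= cap) {   /* always keep one byte for '\0' */
            char *bigger = realloc(word, cap * 2);
            if (bigger == NULL) {
                free(word);
                return NULL;
            }
            word = bigger;
            cap *= 2;
        }
        word[len++] = (char)c;
    } while ((c = fgetc(fp)) != EOF && !isspace(c));

    word[len] = '\0';
    return word;
}
```

A caller would loop with while ((w = read_word(fp)) != NULL), pass w to wordfunction, and then free(w).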

Related

How to store each word of a txt file into a 2d array

int calc_stats(void)
{
char in_name[80];
FILE* in_file;
int ch, character = 0, space = 0, words = 0;
char str[30];
int i;
printf("Enter file name:\n");
scanf("%s", in_name);
in_file = fopen(in_name, "r");
if (in_file == NULL)
printf("Can't open %s for reading.\n", in_name);
else
{
while ((ch = fgetc(in_file)) != EOF)
{
character++;
if (ch == ' ')
{
space++;
}
if (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\0')
{
words++;
strcat(str, " ");
}
else
{
strcat(str, ch);
}
}
fclose(in_file);
printf("\nNumber of characters = %d", character);
printf("\nNumber of characters without space = %d", character - space);
printf("\nNumber of words = %d", words);
}
return 0;
}
My goal here is, whenever I find a word, to store it in a 2D array, but here I am comparing characters one at a time through ch = fgetc(in_file). I need to somehow form a word and store it in an array.
Any help would be useful.
Hello Manolis lordanidis,
Here is the core code:
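// This is the place to hold your word (and its length).
static char word[256];
static int word_len = 0;
// Assumed declarations (not shown in the question): the list of words and its length.
// This str replaces the char str[30] from the question.
static char *str[1000];
static int str_len = 0;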
while ((ch = fgetc(in_file)) != EOF) {
character++;
if (ch == ' ') {
space++;
}
if (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\0') {
words++;
// One word end. Now we need to put the word into str array.
word[word_len++] = '\0';
printf("Get: %s.\n", word);
str[str_len++] = malloc((word_len) * sizeof(char));
memcpy(str[str_len - 1], word, (word_len) * sizeof(char));
word_len = 0;
} else {
// Hold your character into the word.
printf("Push: %d.\n", ch);
word[word_len++] = (char)ch;
}
}
// Do not forget the last word.
word[word_len++] = '\0';
printf("Get: %s.\n", word);
str[str_len++] = malloc((word_len) * sizeof(char));
memcpy(str[str_len - 1], word, (word_len) * sizeof(char));
In my opinion, the 2D array can be an array of words, where each word has type char*, right? You could also use a real 2D array (char str[N][M]) if you prefer.
Here I am using malloc to allocate each word's storage at runtime.
Each character that is not a space or another separator is pushed onto the word array. When a separator or end-of-file (EOF) is reached, the finished word is copied into the str array.
I really hope this is helpful for you.
If you use malloc, do not forget to call free :)
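To make the malloc/free pairing concrete, here is a small sketch that works on a string instead of a file, so it is easy to try out. The names split_words, free_words and MAX_WORDS are made up for this example, and it assumes words are separated by anything isspace() accepts.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 64   /* assumed limit for this sketch */

/* Copies each whitespace-separated word of text into its own malloc'd
 * string, stores the pointers in words[], and returns how many were stored. */
size_t split_words(const char *text, char *words[], size_t max_words)
{
    assert(text != NULL && words != NULL);

    size_t count = 0;
    const char *p = text;

    while (count < max_words) {
        /* Skip the separators in front of the next word. */
        while (*p != '\0' && isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;

        /* Find the end of the word. */
        const char *start = p;
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;

        size_t len = (size_t)(p - start);
        char *copy = malloc(len + 1);   /* +1 for the terminator */
        if (copy == NULL)
            break;
        memcpy(copy, start, len);
        copy[len] = '\0';
        words[count++] = copy;
    }
    return count;
}

/* Releases every string stored by split_words. */
void free_words(char *words[], size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(words[i]);
}
```

Because runs of separators are skipped before each word starts, repeated spaces do not produce empty words.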

Why the functions doesn't return new line with replaceable keywords?

Hey, I'm just doing some exercises in C. One asks me to replace tabs in the input string with other characters. I'm restricting myself to getchar() only, no gets(), fgets(), etc., because my book hasn't covered them yet and I don't want to break the flow. The code below just prints the same line it receives. Can you please explain why?
#include <stdio.h>
int main(){
char line[20];
char c;
int i = 0;
printf("Enter name: ");
while ( c != '\n'){
c = getchar();
line[i] = c;
++i;}
while (line[i] != '\0')
if (line[i] == '\t')
line[i] = '*';
printf("Line is %s \n", line);
return 0;}
c, which is used in c != '\n', is not initialized at first. Its initial value is indeterminate, and using its value without initializing it invokes undefined behavior.
You are checking line[i] != '\0', but you never assigned '\0' to line unless '\0' is read from the stream.
You should reset i to 0 before the second loop and increment i inside the second loop.
The return value of getchar() should be assigned to an int to distinguish between EOF and a valid character.
You should check the index to avoid a buffer overrun.
Fixed code:
#include <stdio.h>
#define BUFFER_SIZE 20
int main(){
char line[BUFFER_SIZE];
int c = 0;
int i = 0;
printf("Enter name: ");
while ( i + 1 < BUFFER_SIZE && c != '\n'){
c = getchar();
if (c == EOF) break;
line[i] = c;
++i;
}
line[i] = '\0';
i = 0;
while (line[i] != '\0'){
if (line[i] == '\t')
line[i] = '*';
++i;
}
printf("Line is %s \n", line);
return 0;
}
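The point about keeping the return value in an int can be checked with a byte of value 0xFF. The sketch below uses a made-up helper name, count_bytes. Since fgetc returns each byte as an unsigned char converted to int, 0xFF comes back as 255 and the loop keeps going. If c were a char, the comparison with EOF would either stop early at that byte (signed char) or never match at all (unsigned char).

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes in fp by reading until EOF. The result of fgetc is kept
 * in an int, so a data byte of 0xFF (255) is not mistaken for EOF. */
long count_bytes(FILE *fp)
{
    assert(fp != NULL);

    long n = 0;
    int c;
    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}
```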

How can I break up this into more than two functions?

How can I divide those two functions to more than two?
The functions read the file line by line.
Each instruction appears on its own line in the file (every instruction ends with a newline character). At startup, the program reads the instructions line by line, decodes the required action and its parameters, and calls the action with the appropriate parameters.
I tried to move the for loop, the array and getc() into another function, but it doesn't work.
void read_line(FILE *fp, char *orders, char *book_name, char *book_name_2, int *num)
{
int i = 0;
char c ;
*num = 0;
c = getc(fp);
while ((c != '\n') && (!feof(fp))) {
for (i = 0; (c != '$') && (c != '\n') && (!feof(fp)); i++) {
orders[i] = c;
c = getc(fp);
}
orders[i] = '\0';
if (c != '\n' && (!feof(fp))) {
fseek(fp, 3, 1);
c = getc(fp);
}
else break;
for (i = 0; (c != '$') && (c != '\n'); i++) {
book_name[i] = c;
c = getc(fp);
}
book_name[i] = '\0';
if (c != '\n' && (!feof(fp))) {
fseek(fp, 3, 1);
c = getc(fp);
}
else break;
if (strcmp(orders, "Rename ") != 0) {
for (i = 0; c != '\n'; i++) {
*num = (*num) * 10 + (c - '0');
c = getc(fp);
}
}
else {
for (i = 0; c != '\n'; i++) {
book_name_2[i] = c;
c = getc(fp);
}
book_name_2[i] = ' ';
book_name_2[i + 1] = '\0';
}
return;
}
}
Book* read_file_books(FILE *fp, Book *head, char *book_name, int *copies)
{
int i = 0;
char c ;
*copies = 0;
c = getc(fp);
while ((c != '\n') && (!feof(fp))) {
for (i = 0; (c != '$') && (c != '\n'); i++) {
book_name[i] = c;
c = getc(fp);
}
book_name[i] = '\0';
if (c != '\n') {
fseek(fp, 3, 1);
c = getc(fp);
}
else break;
for (i = 0; (c != '\n') && (!feof(fp)); i++) {
*copies = (*copies) * 10 + (c - '0');
c = getc(fp);
}
return add(head, book_name, *copies);
}
return head;
}
The most streamlined way of extracting a code block is to basically just copy the block to a function and then use pointers for variables that are declared outside that block. Let's take the for loop:
for (i = 0; (c != '$') && (c != '\n'); i++) {
book_name[i] = c;
c = getc(fp);
}
book_name[i] = '\0';
So, we will need access to i, c, book_name and fp. The simplest (but not the best) is this:
void foo(int *i, char *c, char *book_name, FILE *fp)
{
for (*i = 0; (*c != '$') && (*c != '\n'); (*i)++) {
book_name[*i] = *c;
*c = getc(fp);
}
book_name[*i] = '\0';
}
And then replace the for loop with:
foo(&i, &c, book_name, fp);
That's an easy procedure, but it gives you quite a lot of pointers. There's actually nothing wrong with the method itself, but you can get rid of some of them by considering the scope in which you declare the variables. For instance, you can declare variables inside the for header, and you should unless you need to keep the last value. Here i is still needed after the loop to place the terminator, so it cannot live in the for header, but it can be local to the function. That removes one parameter:
void foo(char *c, char *book_name, FILE *fp)
{
int i; // local, but declared outside the for header: it is used after the loop
for (i = 0; (*c != '$') && (*c != '\n'); i++) {
book_name[i] = *c;
*c = getc(fp);
}
book_name[i] = '\0';
}
Note that I had to peek forward to the next usage of i in your code to determine that this was safe. Your code is simply not written in a way that makes it suitable for block extraction.
You should also try to make the c variable local. Speaking of which, it should be an int and not a char. Read the documentation of getc to understand why.
Using for loop here is not wrong per se, but it's non idiomatic. For loops are typically used when you know how many times the loop should execute before the loop starts. When you want to loop until something happens, a while is more suitable. But what would be much better here is do-while. Because, when reading files, you want to try to read, and then check if you succeeded. You are technically doing that, but what makes this code hard to extract is that each block ends with reading a character that the next loop should process.
To correct this, we need to start from the beginning. First, remove the very first call to getc. Also, remove the conditions from the loop headers.
while (1) {
i=0;
// Each block takes care of 100% of their input
while((c = getc(fp)) != '$' && c != '\n' && c != EOF) {
orders[i] = c;
i++;
}
orders[i] = '\0';
// Negated your condition to make a cleaner if and get rid of else
if (!(c != '\n' && (!feof(fp)))) break;
fseek(fp, 3, 1);
...
The main thing I have corrected here is that every code block starts from scratch. It does not care about what happened before. In your code, you always started by checking the values from the previous blocks. This change makes it MUCH easier to extract code. In most cases when people ask how to extract code, the question they really should have asked is to how to change the code so that extraction becomes trivial.
Also, read Why is while(!feof(fp)) always wrong?
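Once every block starts from scratch, the extraction could look something like the sketch below. read_field is a hypothetical name, and the size parameter is an addition that the original code does not have. The function reads one field, always terminates the buffer, and returns whatever ended the field, so the caller never needs feof.

```c
#include <assert.h>
#include <stdio.h>

/* Reads characters from fp into buf until '$', '\n', or end of file.
 * At most size - 1 characters are stored; extra ones are read and dropped.
 * Returns the character that ended the field ('$', '\n', or EOF).
 * read_field is a hypothetical name, not part of the original code. */
int read_field(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 0);

    size_t i = 0;
    int c;  /* int, so EOF can be told apart from real characters */
    while ((c = getc(fp)) != EOF && c != '$' && c != '\n') {
        if (i + 1 < size)
            buf[i++] = (char)c;
    }
    buf[i] = '\0';
    return c;
}
```

The caller can then branch on the return value: '$' means another field follows on the same line, '\n' means the line is finished, and EOF means the file is finished.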

C language: change user input

I need to write a program that gets input from the user, and whenever there is a quote (") I need to change all the chars inside the quotes to uppercase.
int main()
{
int quoteflag = 0;
int ch = 0;
int i = 0;
char str[127] = { '\0' };
while ((ch = getchar()) != EOF && !isdigit(ch))
{
++i;
if (ch == '"')
quoteflag = !quoteflag;
if (quoteflag == 0)
str[i] = tolower(ch);
else
{
strncat(str, &ch, 1);
while ((ch = getchar()) != '\"')
{
char c = toupper(ch);
strncat(str, &c, 1);
}
strncat(str, &ch, 1);
quoteflag = !quoteflag;
}
if (ch == '.')
{
strncat(str, &ch, 1);
addnewline(str);
addnewline(str);
}
else
{
if ((isupper(ch) && !quoteflag))
{
char c = tolower(ch);
strncat(str, &c, 1);
}
}
}
printf("\n-----------------------------");
printf("\nYour output:\n%s", str);
getchar();
return 1;
}
void addnewline(char *c)
{
char tmp[1] = { '\n' };
strncat(c, tmp, 1);
}
So my problem is that when my input is "a", this prints "A at the end instead of "A", and I don't know why.
The problem is that you are using strncat in a weird way. strncat reads its second argument as a string, byte by byte. When you pass the address of an int (typically four bytes), it starts at the first byte of that int. On little-endian systems the first byte is the char you want, but on big-endian systems it is the most significant byte, which is zero for any value below 256, so strncat sees an empty string and appends nothing to str. Passing an int * where a char * is expected is also a type mismatch that the compiler should warn about.
I don't know why you're using strncat for appending a single character, though. You have the right idea with str[i] = tolower(ch). I changed int ch to char ch and then went through and replaced strncat(...) with str[i++] = ... in your code, and it compiled fine and returned the "A" output you wanted. (Keeping ch as an int and casting only when storing, as in str[i++] = (char)ch, would be safer for the comparison with EOF.) The source code for that is below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int main()
{
int quoteflag = 0;
char ch = 0;
int i = 0;
char str[127] = { '\0' };
while ((ch = getchar()) != EOF && !isdigit(ch))
{
if (ch == '"')
quoteflag = !quoteflag;
if (quoteflag == 0)
str[i++] = tolower(ch);
else
{
str[i++] = ch;
while ((ch = getchar()) != '\"')
{
char c = toupper(ch);
str[i++] = c;
}
str[i++] = ch;
quoteflag = !quoteflag;
}
if (ch == '.')
{
str[i++] = '.';
str[i++] = '\n';
str[i++] = '\n';
}
else
{
if ((isupper(ch) && !quoteflag))
{
char c = tolower(ch);
str[i++] = c;
}
}
}
printf("\n-----------------------------");
printf("\nYour output:\n%s", str);
getchar();
return 1;
}
You should delete the ++i; line, then change:
str[i] = tolower(ch);
To:
str[i++] = tolower(ch);
Otherwise, since you pre-increment, if your first character is not a " but say a, your string will be \0a\0\0.... This leads us on to the next problem:
strncat(str, &ch, 1);
If the input is a", then strncat(str, &ch, 1) with ch holding '"' will give a result of \"\0\0... as strncat will see str as an empty string. Replace all occurrences with the above:
str[i++] = toupper(ch);
(The strncat() call may also technically be undefined behaviour, as you are passing in a malformed string, but that's one for the language lawyers.)
This will keep track of the index, otherwise once out of the quote loop, your first str[i] = tolower(ch); will start overwriting everything in quotes.
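For comparison, here is a sketch that keeps a single write index and a quote flag, with no strncat at all. It works on a string rather than on getchar() so it is easy to test. upper_in_quotes is a made-up name, and lower-casing everything outside the quotes is my reading of the original code rather than something the question states.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copies in to out, upper-casing characters between double quotes and
 * lower-casing the rest. The quote characters themselves are copied as-is.
 * Writes at most size - 1 characters plus the terminator. */
void upper_in_quotes(const char *in, char *out, size_t size)
{
    assert(in != NULL && out != NULL && size > 0);

    int in_quotes = 0;  /* 1 while between an opening and a closing quote */
    size_t i = 0;       /* the only write index into out */

    for (; *in != '\0' && i + 1 < size; in++) {
        unsigned char ch = (unsigned char)*in;
        if (ch == '"') {
            in_quotes = !in_quotes;
            out[i++] = '"';
        } else if (in_quotes) {
            out[i++] = (char)toupper(ch);
        } else {
            out[i++] = (char)tolower(ch);
        }
    }
    out[i] = '\0';
}
```

Every character goes through the same out[i++] store, so nothing written inside the quotes can be overwritten later.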

Not counting spaces as words in c

#include <stdlib.h>
#include <stdio.h>
int main()
{
unsigned long c;
unsigned long line;
unsigned long word;
char ch;
c = 0;
line = 0;
word = 0;
while((ch = getchar()) != EOF)
{
c ++;
if (ch == '\n')
{
line ++;
}
if (ch == ' ' || ch == '\n' || ch =='\'')
{
word ++;
}
}
printf( "%lu %lu %lu\n", c, word, line );
return 0;
}
My program works fine for the most part, but when I add extra spaces, it counts the spaces as extra words. So for example, How are you? is counted as 10 words, but I want it to count as 3 words instead. How could I modify my code to get it to work?
I found a way to count the words so that several consecutive spaces between them are not counted as extra words.
Here is the code:
nbword is the number of words, c is the character just read and prvc is the previously read character.
#include <stdio.h>
int main()
{
int nbword = 1;
int c, prvc = 0; /* int, so EOF can be told apart from real characters */
while((c = getchar()) != EOF)
{
if(c == ' ')
{
nbword++;
}
if(c == prvc && prvc == ' ')
nbword--;
if(c == '\n')
{
printf("%d\n", nbword);
nbword = 1;
}
prvc = c;
}
return 0;
}
This is one possible solution:
#include <stdlib.h>
#include <stdio.h>
int main()
{
unsigned long c;
unsigned long line;
unsigned long word;
int ch; /* int, so EOF can be told apart from real characters */
int lastch = -1;
c = 0;
line = 0;
word = 0;
while((ch = getchar()) != EOF)
{
c ++;
if (ch == '\n')
{
line ++;
}
if (ch == ' ' || ch == '\n' || ch =='\'')
{
if (!(lastch == ' ' && ch == ' '))
{
word ++;
}
}
lastch = ch;
}
printf( "%lu %lu %lu\n", c, word, line );
return 0;
}
Hope this helped, good luck!
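Another common way to handle this is to count transitions into a word instead of counting separators. Below is a minimal sketch of that idea. count_words is a made-up name, it takes a string instead of reading stdin so it is easy to test, and it assumes only whitespace separates words (the apostrophe is not treated as a separator here, unlike in the code above).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Counts words by counting transitions from "outside a word" to
 * "inside a word", so runs of whitespace never add to the total. */
size_t count_words(const char *text)
{
    assert(text != NULL);

    size_t words = 0;
    int in_word = 0;    /* 1 while the current character is part of a word */

    for (; *text != '\0'; text++) {
        if (isspace((unsigned char)*text)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;    /* first character of a new word */
        }
    }
    return words;
}
```

With this approach leading, trailing and repeated spaces need no special cases, and a last word without a trailing newline is still counted.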

Resources