I'm trying to read data from a file (ultimately into a structure, but that is not important for now) and to ensure that the file has an equal amount of data on each line. Each line can contain words or numbers, so I've read one line of the file into one big string. I then try to split this string into tokens using strtok with commas (which separate the data) as delimiters. But how can I count the number of tokens that exist between the commas?
I've tried to count the number of commas on each line, but for some reason it is not behaving as I expect. Each line in the file has 5 pieces of data, all separated by commas, so there should be 4 commas on each line.
while (fgets(string, sizeof(string), f)) {
input = fgetc(f);
if(input == ','){
i++;
}
else if (input == ' '){
printf("Error");
exit(0);
}
}
if(i % 4 != 0){
printf("Error");
exit(0);
}
Here I am trying to count the number of commas on every line (and if there's a space on a line, it should show an error, as I ONLY want commas dividing the data). Finally, after fgets stops reading, I want to check whether the "i" variable is a multiple of 4. I'm sure there is a more efficient and user-friendly way to do this, but I can't think of one.
Quick question as well: does fgetc run through every character on the line before the rest of the commands continue, or will my program move on to the next loop iteration as soon as a comma is encountered?
Thank you!
To count the commas on each line of a file, you need to know the exact line delimiter used in the file. Then you can iterate over the file until end-of-file is reached and count the commas within each line.
In the following example I assume '\n' is the line delimiter.
// Requires <stdio.h>, <stdlib.h> and <stdbool.h>.
#define DESIRED_COMMAS_COUNT 4
int commas_per_line = 0;
bool prev_is_comma = false;
int c;
while ((c = fgetc(f)) != EOF) // reads one character from the file and checks for the end-of-file condition
{
    switch (c)
    {
    case ',':
        commas_per_line++;
        if (prev_is_comma)
        {
            printf("Two successive commas! Empty element in line\n");
            exit(1);
        }
        prev_is_comma = true;
        if (commas_per_line > DESIRED_COMMAS_COUNT)
        {
            printf("Error: too many commas on the line. At least %d.\n", commas_per_line);
            exit(1);
        }
        break;
    case ' ':
        printf("Error: space encountered!\n");
        exit(1);
    case '\n':
        if (commas_per_line != DESIRED_COMMAS_COUNT)
        {
            printf("Error: too few commas (%d)\n", commas_per_line);
            exit(1);
        }
        if (prev_is_comma)
        {
            printf("Line ends with comma: no last element in line\n");
            exit(1);
        }
        commas_per_line = 0;
        break;
    default:
        prev_is_comma = false;
    }
}
if ((commas_per_line != DESIRED_COMMAS_COUNT) && // check the comma count for the last line in the file
    (commas_per_line != 0))
{
    printf("Error: too few commas (%d)\n", commas_per_line);
    exit(1);
}
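If the line has already been read into a string with fgets(), the same check can be done on the buffer instead of on the stream. The following is only a minimal sketch: the helper name count_fields is hypothetical, and treating any space or empty field as an error is an assumption carried over from the question.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original answer).
 * Returns the number of comma-separated fields in `line`,
 * or -1 if the line contains a space or an empty field.
 * Scanning stops at the first '\n' or at the end of the string. */
int count_fields(const char *line)
{
    int fields = 1;     /* a line without commas still has one field */
    int field_len = 0;  /* characters seen in the current field */

    for (size_t i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
        if (line[i] == ' ')
            return -1;              /* spaces are not allowed */
        if (line[i] == ',') {
            if (field_len == 0)
                return -1;          /* empty field before this comma */
            fields++;
            field_len = 0;
        } else {
            field_len++;
        }
    }
    /* An empty last field (trailing comma or empty line) is an error. */
    return field_len == 0 ? -1 : fields;
}
```

A caller would compare the result against 5 for every line read, since 5 fields correspond to the 4 commas expected in the question.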
I wrote a fairly small program to help with formatting txt files, but when I tried to read from the input file and skip unwanted '\n' characters, I actually skipped the character after each '\n' instead.
The characters in my sample file look like this:
abcde
abc

ab
abcd
And my code looks like this:
while (!feof(fp1)) {
ch = fgetc(fp1);
if (ch != '\n') {
printf("%c",ch);
}
else {
ch = fgetc(fp1); // move to the next character
if (ch == '\n') {
printf("%c",ch);
}
}
}
The expected result is
abcdeabc
ababcd
But I actually got
abcdebc
abbcd
I guess the problem is in ch = fgetc(fp1); // move to the next character, but I just can't find a correct way to implement this idea.
Think of the flow of your code (lines numbered below):
1: while (!feof(fp1)) {
2: ch = fgetc(fp1);
3: if (ch != '\n') {
4: printf("%c",ch);
5: }
6: else {
7: ch = fgetc(fp1); // move to the next character
8: if (ch == '\n') {
9: printf("%c",ch);
10: }
11: }
12: }
When you get a newline followed by non-newline, the flow is (starting at the else line): 6, 7, 8, 10, 11, 12, 1, 2.
It's that execution of the final 2 in that sequence that effectively throws away the non-newline character that you had read at 7.
If your intent is to basically throw away single newlines and convert sequences of newlines (two or more) to a single one(a), you can use something like the following pseudo-code:
set numNewlines to zero
while not end-file:
    get thisChar
    if numNewlines is one or thisChar is not newline:
        output thisChar
    if thisChar is newline:
        increment numNewlines
    else:
        set numNewlines to zero
This reads the character in one place, making it less likely that you'll inadvertently skip one due to confused flow.
It also uses the newline history to decide what gets printed. It only outputs a newline on the second occurrence in a sequence of newlines, ignoring the first and any after the second.
That means that a single newline will never be echoed and any group of two or more will be transformed into one.
Some actual C code that demonstrates this(b) follows:
#include <stdio.h>
#include <stdbool.h>
int main(void) {
    // Open file.
    FILE *fp = fopen("testprog.in", "r");
    if (fp == NULL) {
        fprintf(stderr, "Cannot open input file\n");
        return 1;
    }
    // Process character by character.
    int numNewlines = 0;
    while (true) {
        // Get next character, stop if none left.
        int ch = fgetc(fp);
        if (ch == EOF) break;
        // Output only the second newline in a sequence of newlines,
        // or any non-newline.
        if (numNewlines == 1 || ch != '\n') {
            putchar(ch);
        }
        // Manage sequence information.
        if (ch == '\n') {
            ++numNewlines;
        } else {
            numNewlines = 0;
        }
    }
    // Finish up cleanly.
    fclose(fp);
    return 0;
}
(a) It's unclear from your question how you want to handle sequences of three or more newlines so I've had to make an assumption.
(b) Of course, you shouldn't use this if your intent is to learn, because:
You'll learn more if you try yourself and have to fix any issues.
Educational institutions will almost certainly check submitted code against a web search, and you'll probably be pinged for plagiarism.
I'm just providing it for completeness.
I have a text file which includes thousands of strings,
with each string separated by a space " ".
How can I count how many strings there are?
You don't need the strtok() as you only need to count the number of space characters.
while (fgets(line, sizeof line, myfile) != NULL) {
for (size_t i = 0; line[i]; i++) {
if (line[i] == ' ') totalStrings++;
}
}
If you want to consider any whitespace character then you can use isspace() function.
You can read character by character as well without using an array:
int ch;
while ((ch=fgetc(myfile)) != EOF) {
if (ch == ' ') totalStrings++;
}
But I don't see why you want to avoid using an array as it would probably be more efficient (reading more chars at a time rather than reading one byte at a time).
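Note that counting spaces gives the number of separators rather than the number of strings, and repeated spaces inflate the count. As a hedged alternative, here is a small sketch that counts runs of non-whitespace characters using isspace(). The helper name count_words is hypothetical, and it works on a string already in memory (for example a buffer filled by fgets()).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the original answer).
 * Counts runs of non-whitespace characters in `text`. */
size_t count_words(const char *text)
{
    size_t words = 0;
    int in_word = 0;    /* 1 while we are inside a word */

    for (size_t i = 0; text[i] != '\0'; i++) {
        /* Cast to unsigned char: isspace() must not receive negative values. */
        if (isspace((unsigned char)text[i])) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;    /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Because a word is counted only on the transition from whitespace to non-whitespace, leading, trailing and repeated separators do not change the result. For a file, the in_word state would have to be kept across fgets() calls if a line can be longer than the buffer.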
The fgets() function will read an entire line from the file (you need to know the maximum possible length of that line). Then you can use strtok() from <string.h> to parse the string and count the words.
Using fgetc(), you can count the spaces.
Note that spaces at the very beginning of the text are counted as well, so the count only comes out right when the text starts with a space. Otherwise the result is off by one, because the first string has no space before it and is not counted.
To solve that, we first check the first character and increment the string counter if it is an alphabetic character.
int str_count = 0;
int c;
// first char
if( isalpha(c = fgetc(myfile)) )
str_count++;
else
ungetc(c, myfile);
Then, we loop through the rest of the contents.
Checking if an alphabet character follows a space will verify if there is a next string after the space, else a space at the end of the line will be counted as well, giving an inaccurate result.
do
{
c = fgetc(myfile);
if( c == EOF )
break;
if(isspace(c)) {
if( isalpha(c = fgetc(myfile)) ) {
str_count++;
ungetc(c, myfile);
} else if(c == '\n') { // for multiple newlines
str_count++;
}
}
} while(1);
Tested on a Lorem Ipsum generator of 1500 words:
http://pastebin.com/w6EiSHbx
I wanted to find the 5th letter from every word after reading from the text file. I do not know where I am wrong.
After entering the file path, I am getting a pop up window reading
read.c has stopped working.
The code:
#include<stdio.h>
#include<conio.h>
int main(void){
char fname[30];
char ch[]={'\0'};
int i=0;
FILE *fp;
printf("enter file name with path\n");
gets(fname);
fp=fopen(fname,"r");
if(fp==0)
printf("file doesnot exist\n");
else{
printf("File read successfully\n");
do{
ch[i]=getc(fp);
if(feof(fp)){
printf("end of file");
break;
}
else if(ch[i]=='\n'){
putc(ch[4],stdout);
}
i++;
}while(1);
fclose(fp);
}
return 0;
}
I wanted to find the 5th letter from every word
That's not something your code is doing now. It is wrong for various reasons, like
char ch[]={'\0'}; is an array of length 1. Indexing it with an unbounded ch[i] overruns the allocated memory, which is undefined behaviour.
gets() is very dangerous: it can cause a buffer overflow.
getc() reads character by character, not word by word, so you also need to treat the space character (' ') as a delimiter.
etc.
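As an illustration of the gets() point, here is a minimal sketch of reading the file name with fgets() instead. The helper name read_line is hypothetical. It strips the trailing newline that fgets() keeps, because fopen() would otherwise look for a file whose name ends in '\n'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original answer).
 * Reads one line from `in` into `buf`, storing at most size-1 characters,
 * and removes the trailing newline if there is one.
 * Returns 1 on success and 0 on end-of-file or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut the string at the first '\n' */
    return 1;
}
```

In the program above it would be called as read_line(stdin, fname, sizeof fname).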
My suggestion: rewrite your code using the following algorithm.
1. Allocate a buffer long enough to hold a complete line from the file.
2. Open the file and check for success.
3. Read a whole line into the buffer using fgets().
3.1. If fgets() returns NULL, you've most probably reached the end of the file. End.
3.2. Otherwise, continue to the next step.
4. Tokenize the line using strtok(), with space ' ' as the delimiter. Check the returned token against NULL.
4.1. If the token is NULL, go to step 3.
4.2. If the token is not NULL, proceed to the next step.
5. Check strlen() of the returned token (which is the word). If it is more than 4, print the character at index 4 of the token. (Remember, array indexes in C are 0-based.)
6. Continue to step 4.
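The tokenizing part of the steps above (steps 4 to 6) could look roughly like this for a single line. It is only a sketch: the helper name fifth_letters is hypothetical, and using tab, carriage return and newline as delimiters in addition to the space is my assumption, so that the '\n' left in the buffer by fgets() does not become part of the last word.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original answer).
 * Collects the 5th character of every word in `line` into `out`.
 * `line` is modified, because strtok() writes '\0' over the delimiters.
 * Returns the number of characters stored (excluding the terminator). */
size_t fifth_letters(char *line, char *out, size_t out_size)
{
    const char *delims = " \t\r\n";   /* assumed set of word delimiters */
    size_t n = 0;

    if (out_size == 0)
        return 0;
    for (char *tok = strtok(line, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        /* Index 4 is the 5th character; skip words that are too short. */
        if (strlen(tok) > 4 && n + 1 < out_size)
            out[n++] = tok[4];
    }
    out[n] = '\0';
    return n;
}
```

The surrounding program would call this once for every line returned by fgets() and print the collected characters.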
You can use the following snippet. With this code you need to know the maximum length of a line. It prints the 5th character of every word on every line, provided the word has 5 or more characters. I hope this works for you.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE 1000
int main(void)
{
    char fname[30];
    char myCurLine[MAX_LINE + 1];
    char myCurWord[MAX_LINE + 1];
    int index, loop;
    FILE *fp;
    printf("enter file name with path\n");
    /* fgets() instead of gets(): it cannot overflow fname */
    if (fgets(fname, sizeof fname, stdin) == NULL)
        return 1;
    fname[strcspn(fname, "\n")] = '\0'; /* strip the trailing newline */
    fp = fopen(fname, "r");
    if (fp == NULL)
        printf("file does not exist\n");
    else
    {
        printf("File read successfully\n");
        while (fgets(myCurLine, sizeof myCurLine, fp) != NULL)
        {
            index = 0;
            for (loop = 0; ; loop++)
            {
                char c = myCurLine[loop];
                if (c == ' ' || c == '\n' || c == '\0')
                {
                    /* End of a word: terminate it without the delimiter. */
                    myCurWord[index] = '\0';
                    if (strlen(myCurWord) > 4)
                    {
                        putchar(myCurWord[4]);
                    }
                    index = 0;
                    if (c != ' ')
                        break; /* end of the line or of the buffer */
                }
                else
                {
                    myCurWord[index++] = c;
                }
            }
        }
        fclose(fp);
    }
    return 0;
}
I edited your code because of these reasons:
You don't need to use char array at all, since you're only checking for letters and you can count the letters in every word of the file (which can be checked using spaces) and print when your count reaches 4 (since we start at 0).
Since gets() has no overflow protection, fgets() is preferred.
fgets(fname, sizeof(fname), stdin);
Another point is that you can simplify your do-while loop into a single while loop that stops on reaching EOF, since your do-while is simply an infinite loop (with the condition defined as true or 1) that breaks at EOF (which is checked in a separate if inside the infinite do-while).
while (!feof(fp))
An alternative to char array is to loop until a space ' ' or newline '\n' is found.
I also removed the else from if (fp==0) to avoid too many indents.
I also added ctype.h to check if the 5th letter is really a letter using isalpha().
This is how the word's 5th letter search works:
Loop (outer loop) until end-of-file (EOF).
In each iteration of outer loop, loop (inner loop) until a space ' ' or newline '\n' is found.
If the counter in inner loop reaches 4 (which means 5th letter is reached),
print the current letter,
reset counter to zero,
then break the inner loop.
Applying those edits to your code,
#include<stdio.h>
#include<string.h>
#include<ctype.h>
int main(void){
    char fname[30];
    int ch = 0; // changed from array to a single int, so that EOF can be detected
    int i=0;
    FILE *fp;
    printf("enter file name with path\n");
    if (!fgets(fname, sizeof(fname), stdin)) // changed from gets()
        return -1;
    // removes the \n from the input, or else the file won't be located
    // I never encountered a file with newlines in its name.
    fname[strcspn(fname, "\n")] = '\0';
    fp=fopen(fname,"r");
    // if file open failed,
    // it tells the user that file doesn't exist,
    // then ends the program
    if (!fp) {
        printf("file does not exist\n");
        return -1;
    }
    // loops until end-of-file (or a read error)
    while (!feof(fp) && !ferror(fp))
        // loops until space or newline or end-of-file
        for (i = 0; (ch=getc(fp)) != EOF && ch != ' ' && ch != '\n'; i++)
            // if 5th character is reached and it is a letter
            if (i == 4 && isalpha(ch)) {
                putc(ch ,stdout);
                // skips the rest of the word, ends the loop
                while ((ch=getc(fp)) != EOF && ch != ' ' && ch != '\n')
                    ;
                break;
            }
    fclose(fp);
    return 0;
}
*Note: Words with fewer than 5 letters will not be included in the output, but you can print a character or number (such as 0 or -1) to indicate that a word has fewer than 5 letters.
sample read.txt:
reading write love
coder heart
stack overflow
output:
enter file name with path
read.txt
iertkf
We have a program that will take a file as input, and then count the lines in that file, but without counting the empty lines.
There is already a post in Stack Overflow with this question, but the answer to that doesn't cover me.
Let's take a simple example.
File:
I am John\n
I am 22 years old\n
I live in England\n
If the last '\n' didn't exist, then the counting would be easy. We actually already had a function that did this here:
/* Reads a file and returns the number of lines in this file. */
uint32_t countLines(FILE *file) {
uint32_t lines = 0;
int32_t c;
while (EOF != (c = fgetc(file))) {
if (c == '\n') {
++lines;
}
}
/* Reset the file pointer to the start of the file */
rewind(file);
return lines;
}
This function, when taking as input the file above, counted 4 lines. But I only want 3 lines.
I tried to fix this in many ways.
First I tried by doing fgets in every line and comparing that line with the string "\0". If a line was just "\0" with nothing else, then I thought that would solve the problem.
I also tried some other solutions but I can't really find any.
What I basically want is to check the last character in the file (excluding '\0') and see whether it is '\n'. If it is, subtract 1 from the number of lines previously counted (with the original function). I don't really know how to do this, though. Are there any easier ways to do this?
I would appreciate any type of help.
Thanks.
You can actually very efficiently amend this issue by keeping track of just the last character as well.
This works because empty lines have the property that the previous character must have been an \n.
/* Reads a file and returns the number of lines in this file. */
uint32_t countLines(FILE *file) {
    uint32_t lines = 0;
    int32_t c;
    int32_t last = '\n';
    while (EOF != (c = fgetc(file))) {
        if (c == '\n' && last != '\n') {
            ++lines;
        }
        last = c;
    }
    /* Reset the file pointer to the start of the file */
    rewind(file);
    return lines;
}
Here is a slightly better algorithm.
#include <stdio.h>
// Reads a file and returns the number of lines in it, ignoring empty lines
unsigned int countLines(FILE *file)
{
    unsigned int lines = 0;
    int c = '\0';
    int pc = '\n';
    while (c = fgetc(file), c != EOF)
    {
        if (c == '\n' && pc != '\n')
            lines++;
        pc = c;
    }
    if (pc != '\n')
        lines++;
    return lines;
}
Only the first newline in any sequence of newlines is counted, since all but the first newline indicate blank lines.
Note that if the file does not end with a '\n' newline character, any characters encountered (beyond the last newline) are considered a partial last line. This means that reading a file with no newlines at all returns 1.
Reading an empty file will return 0.
Reading a file ending with a single newline will return 1.
(I removed the rewind() since it is not necessary.)
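To make the cases listed above easy to check, here is a sketch of the same previous-character logic applied to a string in memory instead of a FILE. The function name count_nonempty_lines_str is hypothetical; the logic is meant to mirror countLines() above, not to replace it.

```c
#include <stddef.h>

/* Hypothetical string-based variant of countLines() above.
 * Counts lines in `text`, ignoring empty lines. */
unsigned int count_nonempty_lines_str(const char *text)
{
    unsigned int lines = 0;
    char pc = '\n';   /* previous character; '\n' means "at start of a line" */

    for (size_t i = 0; text[i] != '\0'; i++) {
        /* Count a newline only if it ends a line that had some content. */
        if (text[i] == '\n' && pc != '\n')
            lines++;
        pc = text[i];
    }
    /* Text after the last newline is a partial last line. */
    if (pc != '\n')
        lines++;
    return lines;
}
```

With the three-line example file from the question this gives 3, an empty string gives 0, and text without any newline gives 1.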
Firstly, detect lines that only consist of whitespace. So let's create a function to do that.
bool stringIsOnlyWhitespace(const char * line) {
    int i;
    for (i=0; line[i] != '\0'; ++i)
        // cast to unsigned char: isspace() must not receive negative values
        if (!isspace((unsigned char)line[i]))
            return false;
    return true;
}
Now that we have a test function, let's build a loop around it.
while (fgets(line, sizeof line, fp)) {
if (! (stringIsOnlyWhitespace(line)))
notemptyline++;
}
printf("\n The number of nonempty lines is: %d\n", notemptyline);
The source is Bill Lynch's answer; I have changed it a little.
I think your approach using fgets() is totally fine. Try something like this:
char line[200];
while(fgets(line, sizeof line, file) != NULL) {
    // An empty line holds only "\n", so any longer line is non-empty.
    if(strlen(line) > 1) {
        lines++;
    }
}
If you don't know about the length of the lines in your files, you may want to check if line actually contains a whole line.
Edit:
Of course this depends on how you define an empty line. If you treat a line containing only whitespace as empty, the above code will not work, because strlen() counts whitespace characters too.
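Regarding the check for whether the buffer contains a whole line: fgets() stores the '\n' only when the line fits into the buffer, so a chunk without a trailing newline means the rest of the line comes with the next call. A rough sketch follows. The function name count_nonempty_lines and the CHUNK size are hypothetical, and CHUNK is small on purpose so that long lines really are split.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 8   /* assumed buffer size; use something larger in practice */

/* Hypothetical helper (not from the original answer).
 * Counts non-empty lines even when a line is longer than the buffer. */
unsigned count_nonempty_lines(FILE *fp)
{
    char buf[CHUNK];
    unsigned lines = 0;
    int at_line_start = 1;   /* 1 if the next chunk begins a new line */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        /* The chunk ends the line only if it finishes with '\n'. */
        int complete = (len > 0 && buf[len - 1] == '\n');

        /* Count a line once, at its first chunk, unless it is just "\n". */
        if (at_line_start && !(complete && len == 1))
            lines++;
        at_line_start = complete;
    }
    return lines;
}
```

Like the strlen() version, this treats a line containing only spaces as non-empty.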
I'm trying to read a file and use fscanf to read some values and store them in an array. However, some lines in the file start with '#' (e.g. #this is just a command), and I want to skip them. How should I do that? The lines that start with # appear at random positions. Here is some of my code:
//do line counts of how many lines contain parameters
while(!EOF) {
fgets(lines, 90, hi->agentFile);
count++;
if (lines[0] == '#') {
count--;
}
}
//mallocing an array of struct.
agentInfo* array = malloc(count*sizeof(agentInfo));
for (i = 0; i < count; i++) {
fscanf(hi->agentFile,"%d %d %c %s %c",&array[i].r,&array[i].c,
&array[i].agent_name,&array[i].function[80],
&array[i].func_par);
So I need to add something so that I can skip the lines that start with '#'. How?
Your EOF test is wrong. You also need to rewind the file between the fgets() loop and the fscanf() loop. And you need to replace the fscanf() loop with a second fgets() loop using sscanf() to read the data. Or you need to allocate the memory as you go while reading the file once. Let's leave that for later, though:
// fgets() returns NULL, not EOF, when there is nothing more to read
while (fgets(lines, sizeof(lines), hi->agentFile) != NULL)
{
    if (lines[0] != '#')
        count++;
}
agentInfo *array = malloc(count*sizeof(agentInfo));
if (array != NULL)
{
    int i = 0;
    rewind(hi->agentFile);
    while (i < count && fgets(lines, sizeof(lines), hi->agentFile) != NULL)
    {
        if (lines[0] == '#')
            continue; // skip comment lines without using up an array slot
        // assumes function is a char array; pass the array itself, not &function[80]
        if (sscanf(lines, "%d %d %c %s %c",&array[i].r,&array[i].c,
                   &array[i].agent_name,array[i].function,
                   &array[i].func_par) != 5)
            ...format error in non-comment line...
        i++;
    }
    assert(i == count); // else someone changed the file, or ...
}
Note that this checks for a memory allocation error and for format errors for non-comment lines.
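For reference, the per-line part can be isolated into a small function, which makes the comment-skipping and the format check easy to see. This is only a sketch under assumptions: the real agentInfo type is not shown in the question, so the struct AgentSketch, its field sizes and the function name parse_agent_line are all hypothetical.

```c
#include <stdio.h>

/* Hypothetical record layout; the real agentInfo is not shown in the question. */
typedef struct {
    int r, c;
    char agent_name;
    char function[80];   /* assumed size */
    char func_par;
} AgentSketch;

/* Hypothetical helper (not from the original answer).
 * Returns 1 if `line` was parsed into *out, 0 if it is a comment line
 * starting with '#', and -1 if a non-comment line has the wrong format. */
int parse_agent_line(const char *line, AgentSketch *out)
{
    if (line[0] == '#')
        return 0;
    /* %79s limits the string to the assumed size of function[]. */
    if (sscanf(line, "%d %d %c %79s %c", &out->r, &out->c,
               &out->agent_name, out->function, &out->func_par) != 5)
        return -1;
    return 1;
}
```

The reading loop would then advance the array index only when the function returns 1.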