Reading data from a file in C

I have a txt file named prob which contains:
6 2 8 3
4 98652
914
143 789
1
527 146
85
1 74 8
7 6 3
Each line has 9 characters and there are 9 lines. Since I can't make a string array in C, I'll be using a two-dimensional array. Be careful running the code: infinite loops are common and it prints weird output. I'm also curious where it stops reading the string. Is it at a newline?
Expected result for each "save": 6 2 8 3
or whatever the line contained.
#include <stdio.h>
FILE *prob;
main()
{
prob = fopen("prob.txt", "r");
char grid_values[9][9];
char save[9];
int i;
for (i = 0; (fscanf(prob, "%s", save) != EOF); i++)
{
int n;
for (n = 0; n <= 9; n++)
{
grid_values[i][n] = save[n];
printf("%c", grid_values[i][n]);
}
}
fclose(prob);
}

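If you use fscanf with "%s", it stops at the first whitespace character.
Try fgets instead; it reads line by line. Note that fgets returns NULL (not EOF) at end of file, and save must be large enough for the 9 characters, the newline, and the terminating null (at least 11 bytes):
for (i = 0; (fgets(save, sizeof(save), prob) != NULL); i++)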
the detail of fgets usage can be found here:
http://www.cplusplus.com/reference/clibrary/cstdio/fgets/
--edited--
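Here's a second way to write the loop. Test the return value of fgets rather than feof, because feof only becomes true after a read has already failed:
while (fgets(s, sizeof(s), file) != NULL)
{
......
}
I think it'll work well.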

This looks like a homework problem, so I will try to give you some good advice.
First, read the description of the fscanf function and the description of the "%s" conversion.
Here is a snip from the description I have for "%s":
Matches a sequence of non-white-space characters; the next pointer must be a pointer to a character array that is long enough to hold the input sequence and the terminating null character ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
Here are the two important points:
Each of your input lines contains numbers and whitespace characters. So the function will read a number, reach whitespace, and stop. It will not read 9 characters.
If it did read 9 characters, you do not have enough room in your array to store the 10 bytes required. Note that a "terminating null character" will be added. 9 characters read, plus 1 null, equals 10. This is a common mistake in C programming and it is best to learn now to always account for the terminating null in any C string.
Now, to fix this to read characters into a two dimensional array: You need to use a different function. Look through your list of C stdio functions.
See anything useful sounding?
If you haven't, I will give you a hint: fread. It will read a fixed number of bytes from the input stream. In your case you could tell it to always read 9 bytes.
That would only work if each line is guaranteed to be padded out to 9 characters.
Another function is fgets. Again, carefully read the function documentation. fgets is another function that appends a terminating null. However! In this case, if you tell fgets a size of 9, fgets will only read 8 characters and it will write the terminating null as the 9th character.
But there is even another way! Back to fscanf!
If you look at the other conversion specifiers, you could use "%9c" to read 9 characters. If you use this operation, it will not add a terminating null to the string.
With both fread and fscanf "%9c" if you wanted to use those 9 bytes as a string in other functions such as printf, you would need to make your buffers 10 bytes and after every fread or fscanf function you would need to write save[9] = '\0'.
Always read the documentation carefully. C string functions sometimes do it one way. But not always.
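As a rough sketch of the fgets approach for the 9x9 grid in the question: the function below reads each line into a scratch buffer, then copies at most 9 characters into a row that has room for the terminating null. The names read_grid, ROWS and COLS are made up for this example, and it assumes no input line is longer than 63 characters. Short lines are padded with spaces, which is an assumption about what the grid should contain.

```c
#include <stdio.h>
#include <string.h>

#define ROWS 9
#define COLS 9

/* Reads up to ROWS lines from fp into grid. Each row is padded with
   spaces to COLS characters and null-terminated. Returns rows read. */
int read_grid(FILE *fp, char grid[ROWS][COLS + 1])
{
    char line[64];  /* assumed to be longer than any input line */
    int row = 0;

    while (row < ROWS && fgets(line, sizeof line, fp) != NULL) {
        size_t len = strcspn(line, "\r\n");  /* length without line ending */
        if (len > COLS)
            len = COLS;                      /* ignore anything past COLS */
        memset(grid[row], ' ', COLS);        /* pad short lines with spaces */
        memcpy(grid[row], line, len);
        grid[row][COLS] = '\0';              /* the 10th byte holds the null */
        row++;
    }
    return row;
}
```

Calling read_grid(prob, grid_values) with grid_values declared as char grid_values[ROWS][COLS + 1] would fill the array, and each row can then be printed with "%s".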

Related

Why is the last character of a string not captured?

What happens to the last (nth) character of a n-character string when I try to output the string?
I've included my code, sample input and output below that highlights that the last character I input is lost.
Code:
char buffer[10];
fgets(buffer, sizeof(buffer), stdin);
printf("%s", buffer);
return 0;
Input:
aaaaaaaaab (that's 9 a's followed by 1 b)
Output:
aaaaaaaaa (9 a's)
For an array of characters to be treated as a proper string, its last character must be a null terminator (or null byte) '\0'.
The fgets function in particular always makes sure that this character is added to the char array, so for a size argument of 10 it stores the first 9 characters in the array and a null byte in the last available space, if the input is 9 characters or longer.
Be aware that the unread characters, like b in your sample case, will remain in the input buffer of stdin and can disrupt future input reads.
This null byte acts as a sentinel and is used by functions like printf to know where the string ends. Needless to say, this character is not printable.
If you pass a non-null-terminated array of characters to printf with %s, the behavior is undefined.
Many other functions in the standard library (and others) rely on this to work properly so it's imperative that you make sure that all your strings are properly null terminated.
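One possible way to deal with the leftover characters is to discard the rest of the line whenever fgets could not fit it in the buffer. The helper below is only a sketch; the name read_line is invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (at most size-1 characters), drops the newline,
   and discards any characters of that line that did not fit.
   Returns 1 on success, 0 on end-of-file or error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';          /* the whole line fit in buf */
    } else {
        int ch;                       /* truncated: skip the rest of the line */
        while ((ch = fgetc(fp)) != '\n' && ch != EOF)
            ;
    }
    return 1;
}
```

With a 10-byte buffer and the input from the question, the first call stores the nine a's and throws away the b, so the next call starts cleanly on the following line.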

How to slice/index data in c

I am trying to learn C and have received a homework assignment to write code which can read data from a .txt file and print out particular lines.
I wrote the following:
#include <stdio.h>
void main() {
char str[5];
FILE *fp;
fp=fopen("data.txt","r");
int i;
for (i=1;i<=5;i++){
fgets(str,5,fp);
printf("%d \n",i);
if (i==1||i==3||i==5) {
printf("%s \n \n",str);
}
}
}
The file data.txt is just the following:
3.21
5.22
4.67
2.31
2.51
1.11
I had read that each time fgets is run, the pointer is updated to point to the next line. I thought I could keep running fgets and then only print the string str when at the correct value for i (the line I want output on the console).
It partially worked, here is the output:
1
3.21
2
3
5.22
4
5
4.67
Process returned 8 (0x8) execution time : 0.024 s
Press any key to continue.
It did only print when i had the correct values, but for some reason it only printed the first 3 lines, even though fgets was supposed to have been run 5 times by the last iteration, and so the pointer should have been reading the last line.
Can someone explain why the pointer did not update as expected, and whether there is an easier way to slice or index through a file in C?
You need to account for (at least) two additional characters, in addition to the numbers you have in the file. There is the end-of-line delimiter (\n on UNIX/Mac, or possibly \r\n on Windows... so maybe 3 additional characters), plus (from the fgets documentation):
A terminating null character is automatically appended after the characters copied to str.
A lot of the C functions that manipulate character arrays (ie. strings) will give you this extra null "for free" and it can be tricky if you forget about it.
Also, a better way to loop over the lines might be:
#define MAX_CHARS 7
char buf[MAX_CHARS];
while((fgets(buf, MAX_CHARS, fp)) != NULL) {
printf("%s\n", buf);
}
It's still not the best way to do it (no error checking) but a little more compact/readable and idiomatic C, IMO.
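To pick out a particular line, one option is to count lines while reading and stop at the one you want. This is a minimal sketch, assuming every line is shorter than 255 characters; nth_line is a made-up name, and snprintf is used for the copy so the output buffer cannot overflow.

```c
#include <stdio.h>
#include <string.h>

/* Copies line number `wanted` (1-based) of fp into out, without the
   line ending. Returns 1 if the line exists, 0 otherwise. */
int nth_line(FILE *fp, int wanted, char *out, size_t out_size)
{
    char buf[256];  /* assumed longer than any line in the file */
    int line_no = 0;

    rewind(fp);     /* always start counting from the top of the file */
    while (fgets(buf, sizeof buf, fp) != NULL) {
        line_no++;
        if (line_no == wanted) {
            buf[strcspn(buf, "\r\n")] = '\0';   /* drop "\n" or "\r\n" */
            snprintf(out, out_size, "%s", buf);
            return 1;
        }
    }
    return 0;
}
```

Because the buffer is larger than a line plus its line ending, each fgets call consumes exactly one line, which is what the 5-byte buffer in the question could not do.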

How do you scan redirected files in C (STDIN)?

Say I'm calling a program:
$ ./dataset < filename
where filename is any file with x amount of line pairs where the first line contains a string and second line contains 10 numbers separated by spaces. The last line ends with "END"
How can I then start putting the first lines of pairs (string) into:
char *experiments[20] // max of 20 pairs
and the second lines of the pairs (numbers) into:
int data[10][20] // max of 20, 10 integers each
Any guidance? I don't even understand how I'm supposed to scan the file into my arrays.
Update:
So say this is my file:
Test One
0 1 2 3 4 5 6 7 8 9
END
Then redirecting this file would mean if I want to put the first line into my *experiments, that I would need to scan it as such?
scanf("%s", *experiments[0]);
Doing so gives me an error: Segmentation fault (core dumped)
What is incorrect about this?
Say my file is simply numbers, for ex:
0 1 2 3 4 5 6 7 8 9
Then,
scanf("%d", data[0][0]); works, and will hold value of '1'. Is there an easier way to do this for the whole line of data? i.e. data[0-9][0].
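Here is a skeleton that shows how to read the input; the parsing steps are left as comments. It uses fgets rather than gets, because gets cannot limit how much it writes into the buffer:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[100]; /* make sure this is large enough to hold a single line */
    int no_line = 1;
    while (fgets(str, sizeof str, stdin) != NULL)
    {
        str[strcspn(str, "\r\n")] = '\0'; /* fgets keeps the newline; remove it */
        if (strcmp(str, "END") == 0)
            break;
        if (no_line % 2 == 0)
        {
            /* read integer values from the string "str" using sscanf; sscanf can be called in a loop with %d until it fails */
        }
        else
        {
            /* store the string in your variable "experiments"; allocate memory for each entry before copying */
        }
        no_line++;
    }
    return 0;
}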
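The "sscanf in a loop" step could look roughly like the following. The function name parse_ints is hypothetical; the %n conversion is standard and records how many characters have been consumed so far, which lets the loop advance through the line.

```c
#include <stdio.h>

/* Parses up to max whitespace-separated integers from line into out.
   Returns the number of integers stored. */
int parse_ints(const char *line, int *out, int max)
{
    int count = 0;
    int value, consumed;

    /* sscanf returns 1 while it can convert another integer;
       %n does not count toward the return value. */
    while (count < max && sscanf(line, "%d%n", &value, &consumed) == 1) {
        out[count++] = value;
        line += consumed;   /* continue after the number just read */
    }
    return count;
}
```

For the data array in the question, each numeric line would fill one row, so declaring it as int data[20][10] (20 rows of 10 integers) lets you pass data[row] directly as the out argument.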
The redirected file is associated with the FILE * stdin. It's already opened for you...
Otherwise, you can treat it the same as any other text file and/or use the functions that are dedicated to standard input. The one exception is that you should not rely on being able to seek in it or to retrieve the size of the input.
For the data sizes you're talking about, by far the easiest thing to do is just slurp all of the content into a buffer and work on that: you don't have to be super-stingy, just make sure that you don't overrun.
If you want to be super-stingy with memory, preallocate a 4kB buffer with malloc(), progressively read() into it from stdin, and realloc() another 4kB every time the input exceeds what you've already read. If you don't care so much about being stingy with memory (e.g. on a modern machine with gigabytes of memory), just malloc() something much bigger than the expected input (e.g. a megabyte) and bug out if the input is more than that: this is far simpler to implement but less general/elegant.
You then have all of the input in a buffer and you can do what you like with it, which depends too strongly on the format of the input for me to say how you should approach that part.
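A sketch of the grow-as-you-go variant is shown below. It swaps the POSIX read() call for the standard fread() so that it works on any FILE *, including stdin, and it grows the buffer in 4 kB steps. The name slurp is invented for this example, and the caller is expected to free the result.

```c
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 4096

/* Reads everything from fp into a malloc'd, null-terminated buffer.
   Stores the byte count in *out_len. Returns NULL if allocation fails. */
char *slurp(FILE *fp, size_t *out_len)
{
    size_t cap = CHUNK, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* always leave one byte free for the terminating null */
        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        if (n == 0)
            break;                       /* end of input or read error */
        len += n;
        if (len == cap - 1) {            /* buffer full: grow by one chunk */
            char *tmp = realloc(buf, cap + CHUNK);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap += CHUNK;
        }
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```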

splitting string in c

I have a file where each line looks like this:
cc ssssssss,n
where the two first 'c's are individual characters, possibly spaces, then a space after that, then the 's's are a string that is 8 or 9 characters long, then there's a comma and then an integer.
I'm really new to C and I'm trying to figure out how to put this into 4 separate variables per line (each of the first two characters, the string, and the number).
Any suggestions? I've looked at fscanf and strtok but i'm not sure how to make them work for this.
Thank you.
I'm assuming this is a C question, as the question suggests, not C++ as the tags perhaps suggest.
Read the whole line in.
Use strchr to find the comma.
Do whatever you want with the first two characters.
Switch the comma for a zero, marking the end of a string.
Call strcpy from the fourth character on to extract the sssssss part.
Call atoi on one character past where the comma was to extract the integer.
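The steps above can be sketched as a small function. The name split_record and its parameters are made up for illustration, and snprintf stands in for strcpy so that the copy is bounded by the size of the destination.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Splits a line of the form "cc ssssssss,n" in place.
   Returns 1 on success, 0 if the line is malformed. */
int split_record(char *line, char *c1, char *c2,
                 char *name, size_t name_size, int *n)
{
    char *comma;

    if (strlen(line) < 4)
        return 0;                       /* too short for "cc " plus data */
    comma = strchr(line, ',');
    if (comma == NULL || comma < line + 3)
        return 0;                       /* no comma after the string part */

    *c1 = line[0];
    *c2 = line[1];
    *comma = '\0';                      /* end the string part at the comma */
    snprintf(name, name_size, "%s", line + 3);  /* fourth character onward */
    *n = atoi(comma + 1);               /* integer follows the comma */
    return 1;
}
```

Note that atoi cannot report conversion errors; strtol is the usual choice when the integer needs validating.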
A string is a sequence of characters that ends at the first '\0'. Keep this in mind. What you have in the file you described isn't a string.
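I presume n is an integer that could span multiple decimal places and could be negative. If that's the case, I believe the format string you require is "%2[^ ] %9[^,\n],%d". Note that %2[^ ] will not match if either of the first two characters is a space; if that can happen, read them with "%c%c" instead. You'll want to pass fscanf the following expressions: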
Your FILE *,
The format string,
An array of 3 chars silently converted to a pointer,
An array of 10 chars (9 characters plus the terminating null) silently converted to a pointer,
... and a pointer to int.
Store the return value of fscanf into an int. If fscanf returns negative, you have a problem such as EOF or some other read error. Otherwise, fscanf tells you how many objects it assigned values into. The "success" value you're looking for in this case is 3. Anything else means incorrectly formed input.
I suggest reading the fscanf manual for more information, and/or for clarification.
fscanf function is very powerful and can be used to solve your task:
We need to read two chars - the format is "%c%c".
Then skip a space (just add it to the format string) - "%c%c ".
Then read a string until we hit a comma. Don't forget to specify max string size. So, the format is "%c%c %10[^,]". 10 - max chars to read. [^,] - list of allowed chars. ^, - means all except a comma.
Then skip a comma - "%c%c %10[^,],".
And finally read an integer - "%c%c %10[^,],%d".
The last step is to be sure that all 4 tokens are read - check fscanf return value.
Here is the complete solution:
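FILE *f = fopen("input_file", "r");
if (f != NULL)
{
    char c1, c2;
    char str[11];
    int d;
    /* keep going until fscanf cannot convert all 4 fields (EOF or a malformed line) */
    while (4 == fscanf(f, "%c%c %10[^,],%d", &c1, &c2, str, &d))
    {
        /* successfully got 4 values from the file */
        int ch;
        while ((ch = fgetc(f)) != '\n' && ch != EOF)
            ; /* skip the rest of the line so the next %c does not read the newline */
    }
    fclose(f);
}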

Using fgets to read strings from file in C

I am trying to read strings from a file that has each string on a new line but I think it reads a newline character once instead of a string and I don't know why. If I'm going about reading strings the wrong way please correct me.
i=0;
F1 = fopen("alg.txt", "r");
F2 = fopen("tul.txt", "w");
if(!feof(F1)) {
do{ //start scanning file
fgets(inimene[i].Enimi, 20, F1);
fgets(inimene[i].Pnimi, 20, F1);
fgets(inimene[i].Kood, 12, F1);
printf("i=%d\nEnimi=%s\nPnimi=%s\nKaad=%s",i,inimene[i].Enimi,inimene[i].Pnimi,inimene[i].Kood);
i++;}
while(!feof(F1));};
/*finish getting structs*/
The printf is there to let me see what was read into what and here is the result
i=0
Enimi=peter
Pnimi=pupkin
Kood=223456iatb i=1
Enimi=
Pnimi=masha
Kaad=gubkina
i=2
Enimi=234567iasb
Pnimi=sasha
Kood=dudkina
As you can see, after the first struct is read there is a blank (a newline?) once and then everything is shifted. I suppose I could read a dummy string to absorb that extra blank and then nothing would be shifted, but that doesn't help me understand the problem and avoid it in the future.
Edit 1: I know that it stops at a newline character but still reads it. I'm wondering why it doesn't read it during the third string and transfers to the fourth string instead of giving the fourth string the fourth line of the source but it happens just once.
The file is formatted like this by the way
peter
pupkin
223456iatb
masha
gubkina
234567iasb
sasha
dudkina
123456iasb
fgets stops reading when it reads a newline, but the newline is considered a valid character and is included in the returned string.
If you want to remove it, you'll need to trim it yourself:
length = strlen(str);
if (length > 0 && str[length - 1] == '\n')
str[length - 1] = '\0';
Where str is the string into which you read the data from the file, and length is of type size_t.
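A shorter idiom for the same job uses strcspn, which returns the index of the first character found in the reject set, or the string length if none is found. The wrapper name strip_newline is only for illustration.

```c
#include <string.h>

/* Cuts the string at the first '\r' or '\n', if there is one.
   Safe for empty strings and for strings without a line ending. */
void strip_newline(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}
```

When there is no line ending, the assignment simply overwrites the existing null terminator, so no length check is needed.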
To answer the edit to the question: the reason the newline is not read during the third read is because you are not reading enough characters. You give fgets a limit of 12 characters, which means it can actually read a maximum of 11 characters since it has to add the null terminator to the end.
The line you read is 11 characters in length before the newline. Note that there is a space at the end of that line when you output it:
Kood=223456iatb i=1
^
As already stated, if there's enough room in the buffer, then fgets() reads the data including the newline into the buffer and null terminates the line. If there isn't enough room in the buffer before coming across the newline, fgets() copies what it can (the length of the buffer minus one byte) and null terminates the string. The library resumes reading from where fgets() left off on the next iteration.
Don't mess with buffers smaller than 2 bytes long.
Note that gets() removes the newline (but does not protect you from buffer overflows, so do not use it). gets() was removed from the C standard in C11; it may linger in C libraries for a long time as a non-standard (or ex-standard) additional function available for abuse.
Your code should check each of the fgets() function calls:
while (fgets(inimene[i].Enimi, 20, F1) != 0 &&
fgets(inimene[i].Pnimi, 20, F1) != 0 &&
fgets(inimene[i].Kood, 12, F1) != 0)
{
printf("i=%d\nEnimi=%s\nPnimi=%s\nKaad=%s", i, inimene[i].Enimi, inimene[i].Pnimi, inimene[i].Kood);
i++;
}
There are places for do/while loops; they are not used very often, though.
the fgets function reads newline char as a part of the string read.
From the description of fgets:
The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a newline is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.
if Enimi/Pnimi/Kood are arrays not pointers:
while( fgets(inimene[i].Enimi,sizeof inimene[i].Enimi,F1) &&
fgets(inimene[i].Pnimi,sizeof inimene[i].Pnimi,F1) &&
fgets(inimene[i].Kood,sizeof inimene[i].Kood,F1) )
{
if( strchr(inimene[i].Enimi,'\n') ) *strchr(inimene[i].Enimi,'\n')=0;
if( strchr(inimene[i].Pnimi,'\n') ) *strchr(inimene[i].Pnimi,'\n')=0;
if( strchr(inimene[i].Kood,'\n') ) *strchr(inimene[i].Kood,'\n')=0;
printf("i=%d\nEnimi=%s\nPnimi=%s\nKaad=%s", i, inimene[i].Enimi, inimene[i].Pnimi,inimene[i].Kood);
i++;
}
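Putting the pieces together, a record reader might look like the sketch below. The struct person layout and the function names are hypothetical (the question does not show the definition of inimene); the code field is given 16 bytes so that an 11-character line plus its newline and the terminating null all fit.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout; each field leaves room for the
   newline and the terminating null. */
struct person {
    char first[20];
    char last[20];
    char code[16];
};

/* Reads one line into dst without its line ending. Returns 1 on success. */
static int read_field(FILE *fp, char *dst, size_t size)
{
    if (fgets(dst, (int)size, fp) == NULL)
        return 0;
    dst[strcspn(dst, "\r\n")] = '\0';
    return 1;
}

/* Reads complete three-line records until max is reached or input ends.
   Returns the number of complete records stored. */
int read_people(FILE *fp, struct person *people, int max)
{
    int count = 0;
    while (count < max &&
           read_field(fp, people[count].first, sizeof people[count].first) &&
           read_field(fp, people[count].last,  sizeof people[count].last) &&
           read_field(fp, people[count].code,  sizeof people[count].code))
        count++;
    return count;
}
```

Using sizeof on each field keeps the limit passed to fgets in step with the array size, which avoids the mismatch that caused the shifted output in the question.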
