Reading data from a text file in C?

So I'm pretty new at reading data from a text file in C. I'm used to getting input using scanf or hard coding.
I am trying to learn how to not only read data from a text file but manipulate that data. For example, say a text file called bst.txt had the following information used to perform operations on a binary search tree:
insert 10
insert 13
insert 5
insert 7
insert 20
delete 5
delete 10
....
With that example, I would have the following code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
FILE *fptr;
char *charptr;
char temp[50];
fptr = fopen("bst.txt", "r");
while(fgets(temp, 50, fptr) != NULL)
{
charptr = strtok(temp, " ");
while(charptr != NULL)
{
charptr = strtok(NULL, " ");
}
}
return 0;
}
I know that within the first while loop strtok() splits each line in the text file and within the second while loop strtok() splits off when the program recognizes a space, which in this case would separate the operations from the integers.
So my main question is, after, for example, the word "insert" is separated from the integer "10", how do I get the program to continue like this:
if(_____ == "insert")
{
//read integer from input file and call insert function, i.e. insert(10);
}
I need to fill in the blank.
Any help would be greatly appreciated!

If I were doing what you're doing, I would be doing it that way :)
I see a lot of people getting upvoted (not here, I mean on SO generally) for recommending functions like scanf() and strtok(), despite the fact that these functions are widely considered evil: strtok() isn't thread-safe because it keeps hidden state between calls, it overwrites the delimiters in the buffer you hand it, and scanf() makes it hard to recover from malformed input. Both are a giant pain in the ass to debug.
If you're malloc()ing an input buffer for reading from a file, make it at least 4kB. That's a typical page size on common systems, so unless you're doing a bazillion stupid little 100-byte malloc()s there's little point in going smaller, and don't be afraid to allocate 10x or 100x that if that makes life easy.
So, for these kinds of problems where you're dealing with little text files of input data, here's what you do:
malloc() yourself a fine big buffer that's big enough to slurp in the whole file with buckets and buckets of headroom
open the file, slurp the whole damn thing in with read(), and close it
record how many bytes you read in n_chars (or whatever)
do one pass through the buffer and 1) replace all the newlines with NULs and 2) record the start of each line (occurs after a newline!) into successive positions in a lines array (e.g. char **lines; lines=malloc(n_chars*sizeof(char *)): there can't be more lines than bytes!)
(optional) as you go, advance your start-of-line pointers to skip leading whitespace
(optional) as you go, overwrite trailing whitespace with NULs
keep a count of the lines as you go and save it in n_lines
remember to free() that buffer when you're done with it
Now, what do you have? You have an array of strings that are the lines of your file (optionally with each line stripped of leading and trailing whitespace) and you can do what the hell you like with it.
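The steps above can be sketched roughly as follows. This is a minimal sketch, not the definitive version: slurp() and split_lines() are hypothetical helper names, it uses fread() from <stdio.h> where the steps say read() so that it stays portable, and it leaves out the optional whitespace trimming.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read up to `capacity` bytes from fp into a fresh
 * buffer. One extra byte is allocated so the data can be NUL-terminated. */
char *slurp(FILE *fp, size_t capacity, size_t *n_chars)
{
    char *buf;

    assert(fp != NULL && n_chars != NULL);
    buf = malloc(capacity + 1);
    if (buf == NULL)
        return NULL;
    *n_chars = fread(buf, 1, capacity, fp);
    buf[*n_chars] = '\0';
    return buf;
}

/* Hypothetical helper: replace every newline in buf with a NUL and record
 * where each line starts. buf must have room for one byte at buf[n_chars].
 * Returns the number of lines; *lines_out is NULL if malloc() failed. */
size_t split_lines(char *buf, size_t n_chars, char ***lines_out)
{
    char **lines;
    char *start = buf;
    size_t n_lines = 0;

    assert(buf != NULL && lines_out != NULL);
    /* there can't be more lines than bytes; +1 avoids a zero-size request */
    lines = malloc((n_chars + 1) * sizeof *lines);
    *lines_out = lines;
    if (lines == NULL)
        return 0;

    buf[n_chars] = '\0';
    for (size_t i = 0; i < n_chars; i++) {
        if (buf[i] == '\n') {
            buf[i] = '\0';
            lines[n_lines++] = start;
            start = buf + i + 1;   /* next line begins after the newline */
        }
    }
    if (start < buf + n_chars)     /* last line had no trailing newline */
        lines[n_lines++] = start;
    return n_lines;
}
```

A caller would open the file, call slurp(), close the file, call split_lines(), and free() both the lines array and the buffer once it is done with them.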
So what do you do?
Go through the array of lines one-by-one, like this:
for(i=0; i<n_lines; i++) {
if( '\0'==*lines[i] || '#' == *lines[i] )
continue;
// More code
}
Already you have ignored empty lines and lines that start with a "#". Your config file now has comments!
long n;
size_t len;
char *endp; // strtol() stores the address of the first unparsed character here
for(i=0; i<n_lines; i++) {
if( '\0'==*lines[i] || '#' == *lines[i] )
continue;
// More code
len = strlen("insert");
if( 0== strncmp(lines[i], "insert", len) ) {
n = strtol(lines[i]+len+1, &endp, 10);
// error checking
tree_insert( (int)n );
continue;
}
len = strlen("delete");
if( 0== strncmp(lines[i], "delete", len) ) {
n = strtol(lines[i]+len+1, &endp, 10);
// error checking
tree_delete( (int)n );
}
}
Now, you can probably see 10 ways of making this code better. Me too. How about a struct that contains a keyword and a function pointer to the appropriate tree function?
Other ideas? Knock yourself out!
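One way the struct-plus-function-pointer idea might look is sketched below. The names struct command, commands and dispatch_line() are made up for illustration, and tree_insert()/tree_delete() are stand-ins that only record their argument, since the real tree functions aren't shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*tree_op)(int);

struct command {
    const char *keyword;   /* word at the start of the line */
    tree_op     run;       /* tree function to call for it */
};

/* Stand-ins for the real tree functions (assumption: they take one int). */
static int last_inserted, last_deleted;
static void tree_insert(int n) { last_inserted = n; }
static void tree_delete(int n) { last_deleted = n; }

static const struct command commands[] = {
    { "insert", tree_insert },
    { "delete", tree_delete },
};

/* Returns 1 if the line was handled, 0 if the keyword is unknown
 * or no number follows it. */
int dispatch_line(const char *line)
{
    assert(line != NULL);
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        size_t len = strlen(commands[i].keyword);
        const char *arg = line + len + 1;
        char *endp;
        long n;

        /* keyword must match and be followed by a space */
        if (strncmp(line, commands[i].keyword, len) != 0 || line[len] != ' ')
            continue;
        n = strtol(arg, &endp, 10);
        if (endp == arg)           /* strtol() consumed no digits */
            return 0;
        commands[i].run((int)n);
        return 1;
    }
    return 0;
}
```

Adding a new operation then means adding one row to the commands table rather than another if block in the loop.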

You can do it as follows. For example, I have used printf here, but you can call your insert/delete functions instead.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
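int main(void)
{
    FILE *fptr;
    char *charptr;
    char *numptr;
    char temp[50];
    fptr = fopen("bst.txt", "r");
    if(fptr == NULL)
    {
        perror("bst.txt");
        return EXIT_FAILURE;
    }
    while(fgets(temp, sizeof temp, fptr) != NULL)
    {
        charptr = strtok(temp, " \n"); // command word
        numptr = strtok(NULL, " \n");  // number, if present
        if(charptr == NULL || numptr == NULL)
            continue;                  // skip blank or incomplete lines
        if(strcmp(charptr,"insert")==0)
        {
            printf("insert num %d\n",atoi(numptr));
        }
        else if(strcmp(charptr,"delete")==0)
        {
            printf("delete num %d\n",atoi(numptr));
        }
    }
    fclose(fptr);
    return 0;
}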

I think the best way to read formatted strings from a file is to use fscanf. The following example shows how to parse the file. You could store charptr and value for further operations:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
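int main(void)
{
    FILE *fptr;
    char charptr[50];
    int value;
    fptr = fopen("bst.txt", "r");
    if (fptr == NULL)
    {
        perror("bst.txt");
        return EXIT_FAILURE;
    }
    // %49s leaves room for the terminating '\0'; stop unless both fields were read
    while (fscanf(fptr, "%49s%d", charptr, &value) == 2)
    {
        printf("%s: %d\n", charptr, value);
    }
    fclose(fptr);
    return 0;
}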
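If a malformed line is a concern, a variant worth considering is to read each line with fgets() and parse it with sscanf(), so one bad line can't throw off the lines after it. The sketch below assumes the "word number" layout of bst.txt; parse_command() and count_valid_commands() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: split one line into a command word and an integer.
 * Returns 1 only if both fields were present. op must hold 16 bytes. */
int parse_command(const char *line, char *op, int *value)
{
    assert(line != NULL && op != NULL && value != NULL);
    /* %15s caps the word at 15 characters plus the terminating '\0' */
    return sscanf(line, "%15s %d", op, value) == 2;
}

/* Hypothetical helper: count the lines of fp that parse cleanly. */
int count_valid_commands(FILE *fp)
{
    char line[64];
    char op[16];
    int value;
    int valid = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_command(line, op, &value))
            valid++;   /* a real program would act on op and value here */
    }
    return valid;
}
```

A line such as "delete" with no number is simply rejected, and the next line is still parsed from its own start.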

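Try this code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void insert(int value); // your own BST insert function, defined elsewhere
int main(void){
    FILE *fp;
    char character[50];
    int value;
    fp = fopen("input.txt", "r");
    if(fp == NULL){
        perror("input.txt");
        return EXIT_FAILURE;
    }
    while (fscanf(fp, "%49s%d", character, &value) == 2)
    {
        if(strcmp(character,"insert")==0){
            insert(value); // value is the number read from the file, e.g. 10
        }
    }
    fclose(fp);
    return 0;
}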

Related

C Using char* in fscanf causing error Segmentation fault: 11

I am new to C and I came across an issue when using fscanf to read all strings from a .txt file.
The code is as follows:
#include <stdlib.h>
#include <stdio.h>
int main() {
FILE *spIn;
char *numIn;
spIn = fopen("data.txt", "r");
if (spIn == NULL) {
printf("Can't Open This File \n");
}
while ((fscanf(spIn, "%s", numIn)) == 1) {
printf("%s\n", numIn);
};
fclose(spIn);
return 1;
}
This throws an error: Segmentation fault: 11.
The original data on txt file is:
1 2 345 rrtts46
dfddcd gh 21
789 kl
a mix of ints, strings, white space and newline characters.
At least 4 candidate undefined behaviors (UB) that could lead to a fault of some kind.
Code fails to pass to fscanf(spIn,"%s",numIn) an initialized pointer.
Code calls fscanf() even if fopen() fails.
Code calls fclose() even if fopen() fails.
No width limit in fscanf(spIn,"%s",numIn), which is worse than gets().
Text files do not really contain strings ('\0'-terminated data) or ints; they contain lines (various characters terminated by '\n').
To read a line in and save as a string, use fgets(). Do not use fscanf() to read lines of data.
#include <stdlib.h>
#include <stdio.h>
int main() {
FILE *spIn = fopen("data.txt", "r");
if (spIn == NULL) {
printf("Can't Open This File \n");
} else {
char buf[100];
while (fgets(buf, sizeof buf, spIn)) {
printf("%s", buf);
}
fclose(spIn);
}
}
char* numIn is a pointer, and it is uninitialized, so you can't store anything through it. You need to either allocate memory for it or make it point to some valid memory location:
#include<stdlib.h> // for malloc
char* numIn = malloc(100); // space for 99 char + null terminator byte
//...
while ((fscanf(spIn, "%99s", numIn)) == 1)
{
printf("%s\n",numIn);
};
Or:
char str[100];
char *numIn = str;
In this small program that makes little sense, though; you should probably make numIn a fixed-size array to begin with:
char numIn[100];
Note that you should use a width specifier in *scanf to avoid buffer overflow. This still has a problem, though: it will read word by word instead of line by line.
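To illustrate both points, here is a small sketch that uses sscanf() on an in-memory string so it stays self-contained; the same format string works with fscanf(). The names WORD_MAX, STR and count_words() are made up for this example. The macro trick builds the width from the buffer size so the two can't drift apart.

```c
#include <assert.h>
#include <stdio.h>

#define WORD_MAX 99
#define STR2(x) #x
#define STR(x) STR2(x)   /* STR(WORD_MAX) expands to the string "99" */

/* Hypothetical helper: count how many times %s succeeds on text.
 * Each call reads one whitespace-delimited word, capped at WORD_MAX chars. */
size_t count_words(const char *text)
{
    char word[WORD_MAX + 1];
    size_t count = 0;
    int used = 0;

    assert(text != NULL);
    /* the format is "%99s%n": %n reports how many characters were consumed */
    while (sscanf(text, "%" STR(WORD_MAX) "s%n", word, &used) == 1) {
        count++;
        text += used;
    }
    return count;
}
```

On the sample data from the question this yields one item per word, not one per line, and a word longer than 99 characters is split across two reads instead of overflowing the buffer.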
Looking at your input file, using fgets seems like a better option, it can read complete lines, including spaces:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
FILE *spIn;
char numIn[100];
spIn = fopen("data.txt", "r");
if (spIn != NULL)
{
while ((fgets(numIn, sizeof numIn, spIn)))
{
numIn[strcspn(numIn, "\n")] = '\0'; // removing \n
printf("%s\n", numIn);
}
fclose(spIn);
}
else
{
perror("Can't Open This File");
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
Since fgets also stores the \n character in the buffer, I'm removing it with strcspn.
Your code does check the return value of fopen, but execution continues even if the file fails to open; I addressed that issue as well.

code in C being killed when reading a 250MB file

I am trying to process a 250MB file using a script in C.
The file is basically a dataset and I want to read just some of the columns and (more importantly) break one of them (which is originally a string) into a sequence of characters.
However, even though I have plenty of RAM available, the code is killed by konsole (using KDE Neon) every time I run it.
The source is available below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
FILE *arquivo;
char *line = NULL;
size_t len = 0;
int i = 0;
int j;
int k;
char *vetor[500];
int acertos[45];
FILE *licmat = fopen("licmat.csv", "w");
//creating the header
fprintf(licmat,"CO_CATEGAD,CO_UF_CURSO,ACERTO09,ACERTO10,ACERTO11,ACERTO12,ACERTO13,ACERTO14,ACERTO15,ACERTO16,ACERTO17,ACERTO18,ACERTO19,ACERTO20,ACERTO21,ACERTO22,ACERTO23,ACERTO24,ACERTO25,ACERTO26,ACERTO27,ACERTO28,ACERTO29,ACERTO30,ACERTO31,ACERTO32,ACERTO33,ACERTO34,ACERTO35\n");
if ((arquivo = fopen("MICRODADOS_ENADE_2017.csv", "r")) == NULL) {
printf ("\nError");
exit(0);
}
//reading one line at a time
while (getline(&line, &len, arquivo)) {
char *ptr = strsep(&line,";");
j=0;
//breaking the line into a vector based on ;
while(ptr != NULL)
{
vetor[j]=ptr;
j=j+1;
ptr = strsep(&line,";");
}
//filtering based on content
if (strcmp(vetor[4],"702")==0 && strcmp(vetor[33],"555")==0) {
//copying some info
fprintf(licmat,"%s,%s,",vetor[2],vetor[8]);
//breaking the string (32) into isolated characters
for (k=0;k<27;k=k+1) {
fprintf(licmat,"%c", vetor[32][k]);
if (k<26) {
fprintf(licmat,",");
}
}
fprintf(licmat,"\n");
}
i=i+1;
}
free(line);
fclose(arquivo);
fclose(licmat);
}
The output is perfect up to the point when the script is killed. The output file is just 640KB long and has about 10000 lines only.
What could be the issue?
It looks to me like you're mishandling the memory buffer managed by getline() - which allocates/reallocates as needed - by the use of strsep(), which seems to manipulate that same pointer value.
Once line has been updated to reflect some other element on the line, it's no longer pointing to the start of allocated memory, and then boom the next time getline() needs to do anything with it.
Use a different variable to pass to strsep():
while (getline(&line, &len, arquivo) > 0) { // getline() returns -1 at EOF or on error; a blank line still returns 1 for its '\n'
char *parseline = line;
char *ptr = strsep(&parseline,";");
// do the same thing later
The key thing here: you're not allowed to muck with the value of line other than to free() it at the end (which you do), and you can't let any other routine do it either.
Edit: updated to reflect getline() returning <0 on error (h/t to @user3121023)
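For reference, a self-contained sketch of the same pattern is below. It assumes a POSIX/glibc system where getline() and strsep() are available (hence the _GNU_SOURCE define), and count_fields() is a hypothetical helper, not part of the original program.

```c
#define _GNU_SOURCE   /* assumption: glibc; exposes getline() and strsep() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: total number of ';'-separated fields in the stream. */
size_t count_fields(FILE *in)
{
    char *line = NULL;   /* owned by getline(); only ever passed to free() */
    size_t cap = 0;
    size_t fields = 0;

    assert(in != NULL);
    while (getline(&line, &cap, in) != -1) {
        char *cursor = line;   /* strsep() advances this copy, not line */
        char *field;

        line[strcspn(line, "\n")] = '\0';   /* drop the newline, if any */
        while ((field = strsep(&cursor, ";")) != NULL)
            fields++;
    }
    free(line);
    return fields;
}
```

Because line itself is never modified, getline() can keep reusing and growing the same allocation on every iteration.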

Read lines from FILE

I would like to know the best way to read lines from a file, given that I have a file that I'm promised will look as follows:
type
string table
color
string brown
height
int 120
cost
double 129.90
Each record is one line with a single word, followed by a line with two words.
I know that fscanf returns the number of items it scanned, and that's why I have a problem here: one line has 1 argument and the next line has 2.
The first line of each pair is always a single char* no longer than 10 characters, and the next line has 3 options:
if it says int, the value that follows is an int, and likewise for double or string.
thank you.
From the structure of the file, I think each record can be grouped into a struct, and fscanf can be used like this:
#include <stdio.h>
#include <stdlib.h>
#define SIZE 100
typedef struct Node {
char name[SIZE];
char type[SIZE], value[SIZE];
} Node;
int main() {
FILE *pFile = fopen("sample-test.txt", "r");
if(pFile == NULL) {
fprintf(stderr, "Error in reading file\n");
return EXIT_FAILURE;
}
Node nodes[SIZE];
int nRet, nIndex = 0;
// Read 3 tokens per record; %99s keeps each token within SIZE bytes
while(nIndex < SIZE &&
(nRet = fscanf(pFile, "%99s%99s%99s", nodes[nIndex].name,
nodes[nIndex].type, nodes[nIndex].value)) == 3)
nIndex++;
fclose(pFile);
for(int i = 0; i < nIndex; i++)
printf("%s %s %s\n", nodes[i].name, nodes[i].type, nodes[i].value);
return EXIT_SUCCESS;
}
After reading the file, you can look up an entry in the structure array by name and convert its value to an int or double according to its type, using sscanf as pointed out by Some Programmer Dude.
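That conversion step might look something like the sketch below. Note that it swaps in strtol() and strtod() for sscanf, because they make it easy to reject trailing junk; struct value, enum value_kind and convert_value() are hypothetical names, and the three type words are taken from the sample file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum value_kind { VALUE_INT, VALUE_DOUBLE, VALUE_STRING, VALUE_INVALID };

struct value {
    enum value_kind kind;
    long        as_int;      /* valid when kind == VALUE_INT */
    double      as_double;   /* valid when kind == VALUE_DOUBLE */
    const char *as_string;   /* valid when kind == VALUE_STRING */
};

/* Hypothetical helper: interpret text according to the type word
 * ("int", "double" or "string") that preceded it in the file. */
struct value convert_value(const char *type, const char *text)
{
    struct value v = { VALUE_INVALID, 0, 0.0, NULL };
    char *end;

    assert(type != NULL && text != NULL);
    if (strcmp(type, "int") == 0) {
        v.as_int = strtol(text, &end, 10);
        if (end != text && *end == '\0')   /* whole token was a number */
            v.kind = VALUE_INT;
    } else if (strcmp(type, "double") == 0) {
        v.as_double = strtod(text, &end);
        if (end != text && *end == '\0')
            v.kind = VALUE_DOUBLE;
    } else if (strcmp(type, "string") == 0) {
        v.as_string = text;   /* points into the caller's storage */
        v.kind = VALUE_STRING;
    }
    return v;
}
```

With the Node array from the answer, a call would be convert_value(nodes[i].type, nodes[i].value), followed by a check of the kind field before using the result.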

Reading information from a file in C language

So I have the txt file from which I need to read the number of students written in that file, and because every student is in separate line, it means that I need to read the number of lines in that document. So I need to:
Print all lines from that document
Write the number of lines from that document.
So, I write this:
#include "stdafx.h"
#include <stdio.h>
int _tmain(int argc, _TCHAR* Argo[]){
FILE *student;
char brst[255];
student = fopen("student.txt", "r");
while(what kind of condition to put here?)
{
fgetc(brst, 255, (FILE*)student);
printf("%s\n", brst);
}
return 0;
}
Ok, I understand that I can use the same loop for printing and calculating the number of lines, but I can't find any working rule to end the loop. Every rule I tried caused an endless loop. I tried brst != EOF, brst != \0. So, it works fine and print all elements of the document fine, and then it start printing the last line of document without end. So any suggestions? I need to do this homework in C language, and I am using VS 2012 C++ compiler.
OP's code is close, but it needs to use fgets() rather than fgetc() and use the return value of fgets() to detect when to quit: it will be NULL (as @Weather Vane noted). Also add a line counter.
#include <stdio.h>
int main(void) {
FILE *student = fopen("student.txt", "r");
unsigned line_count = 0;
if (student) {
char brst[255];
// fgetc(brst, 255, (FILE*)student);
while (fgets(brst, sizeof brst, student)) {
line_count++;
printf("%u %s", line_count, brst);
}
fclose(student);
}
printf("Line Count %u\n", line_count);
return 0;
}
Try this:
#include "stdafx.h"
#include <stdio.h>
int _tmain(int argc, _TCHAR* Argo[]){
FILE *student;
char brst[255];
char* result = NULL;
//Ensure file open works, if it doesn't quit
if ((student = fopen("student.txt", "r")) == NULL)
{
printf("Failed to load file\n");
return 1;
}
//Read in the file; fgets() returns NULL at end of file or on error
while ( (result = fgets( brst, sizeof(brst), student)) != NULL )
{
printf("%s", brst); //fgets() keeps the newline, so don't add another
}
fclose( student );
return 0;
}
feof() only becomes true after a read has already hit the end of the file, so it can't tell you in advance that the next read will fail. Looping on the return value of fgets() avoids that problem: the loop ends on the first failed read, and a last line without a trailing newline is still printed. (I've worked on embedded systems where a NULL from fgets() actually meant waiting on data, not EOF; in that case you would retry instead of stopping.)
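As a rough sketch of where feof()/ferror() do fit, the helper below lets fgets() drive the loop and only afterwards asks why it stopped. count_lines() is a hypothetical name; the buffer size matches the one in the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: number of lines in fp, or -1 on a read error.
 * Lines longer than the buffer arrive in several chunks, so a new line
 * is only counted when the previous chunk ended with '\n'. */
long count_lines(FILE *fp)
{
    char chunk[255];
    long lines = 0;
    int at_line_start = 1;

    assert(fp != NULL);
    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        if (at_line_start)
            lines++;
        at_line_start = (strchr(chunk, '\n') != NULL);
    }
    /* fgets() returned NULL: either end of file or an error */
    return ferror(fp) ? -1 : lines;
}
```

The end-of-file test happens exactly once, after a read has already failed, which is the only point at which its answer is reliable.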
Use feof() after a read fails to check whether you hit the end of the file.
You are correctly reading the file line-by-line, but use fgets(), not fgetc() - and the cast is not needed.
Then use sscanf() to assign the line data to variables (or some "safe" form of it).

Data entry into array of character pointers in C

this is my first question asked on here so if I'm not following the formatting rules here please forgive me. I am writing a program in C which requires me to read a few lines from a file. I am attempting to put each line into a cstring. I have declared a 2D character array called buf which is to hold each of the 5 lines from the file. The relevant code is shown below
#include <stdlib.h>
#include <sys/types.h>
#include <sys/file.h>
#include <sys/socket.h>
#include <sys/un.h> /* UNIX domain header */
void FillBuffersForSender();
char buf[5][2000]; //Buffer for 5 frames of output
int main()
{
FillBuffersForSender();
return 0;
}
void FillBuffersForSender(){
FILE *fp;
int line = 0;
char* temp = NULL;
size_t len = 0;
ssize_t read;
fp = fopen("frames.txt", "r");
printf("At the beginning of Fill Buffers loop.\n");
//while ((read = getline(&temp, &len, fp)) != -1){
while(line < 5){
//fprintf(stderr, "Read in: %s\n", temp);
fgets(temp, 2000, fp);
strcpy(buf[line], temp);
line++;
fprintf(stderr, "Line contains: %s.\n", temp);
temp = NULL;
}
while(line != 0){
fprintf(stderr, "Line contains: %s.\n", buf[line]);
line--;
}
}
The line
strcpy(buf[line], temp);
is causing a segmentation fault. I have tried this numerous ways, and cannot seem to get it to work. I am not used to C, but have been tasked with writing a bidirectional sliding window protocol in it. I keep having problems with super basic issues like this! If this were in C++, I'd be done already. Any help anyone could provide would be incredible. Thank you.
temp needs to point to an allocated buffer that fgets can write into.
In C programming, error checking is an important part of every program (in fact sometimes it seems like there's more error handling code than functional code). The code should check the return value from every function to make sure that it worked, e.g. if fopen returns NULL then it wasn't able to open the file, likewise if fgets returns NULL it wasn't able to read a line.
Also, the code needs to clean up after itself. For example, there is no destructor that closes a file when the file pointer goes out of scope, so the code needs to call fclose explicitly to close the file when it's finished with the file.
Finally, note that many of the C library functions have quirks that need to be understood, and properly handled. You can learn about these quirks by reading the man pages for the functions. For example, the fgets function will leave the newline character \n at the end of each line that it reads. But the last line of a file may not have a newline character. So when using fgets, it's good practice to strip the newline.
With all that in mind, the code should look like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXLINE 5
#define MAXLENGTH 2000
static char buffer[MAXLINE][MAXLENGTH];
void FillBufferForSender(void)
{
char *filename = "frames.txt";
FILE *fp;
if ((fp = fopen(filename, "r")) == NULL)
{
printf("file '%s' does not exist\n", filename);
exit(1);
}
for (int i = 0; i < MAXLINE; i++)
{
// read a line
if (fgets( buffer[i], MAXLENGTH, fp ) == NULL)
{
printf("file does not have %d lines\n", MAXLINE);
exit(1);
}
// strip the newline, if any
size_t newline = strcspn(buffer[i], "\n");
buffer[i][newline] = '\0';
}
fclose(fp);
}
int main(void)
{
FillBufferForSender();
for (int i = 0; i < MAXLINE; i++)
printf("%s\n", buffer[i]);
}
Note: for an explanation of how strcspn is used to strip the newline, see this answer.
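In short, strcspn(s, "\n") returns the number of characters before the first newline, or the full length of the string if there is none, so writing a '\0' at that index is safe either way. A tiny sketch, with strip_newline() as a made-up helper name:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cut s at its first newline, if it has one.
 * Returns the resulting length. */
size_t strip_newline(char *s)
{
    size_t len;

    assert(s != NULL);
    len = strcspn(s, "\n");   /* index of '\n', or of the existing '\0' */
    s[len] = '\0';            /* overwrites '\n', or rewrites the '\0' */
    return len;
}
```

Unlike s[strlen(s) - 1] = '\0', this doesn't index before the start of an empty string and doesn't chop a real character off a last line that has no newline.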
When it comes to C, you have to think about the memory. Where is the memory for a pointer that has NULL assigned to it? How can we copy something to a place that we have no space for?
