How to read an entire file in C into a numeric array - c

I'm writing a program in C, that is supposed to read arrays of doubles from files of arbitrary length. How to stop it at EOF? I've tried using feof, but then read the discussions:
Why is “while ( !feof (file) )” always wrong?
How does C handle EOF?
Then again they concern strings, not numbers. So when reading a file of the form "4 23.3 2 1.2", how to state the loop condition, so that my array A has all the numbers from the file and nothing more (also the iterator N = length of A)?
Here's the part of my code with feof, which produces one entry too many in A. The reallocation part is based on http://www.cplusplus.com/reference/cstdlib/realloc/ example.
int N = 0;
double *A = NULL;
double *more_space;
double buff = 0;
FILE *data;
data = fopen("data.dat","r");
while(feof(data) == 0) {
fscanf(data, "%lg", &buff);
more_space = realloc(A, (N+1)*sizeof(double));
A = more_space;
A[N] = buff;
N++;
}

The return value of fscanf is EOF (a negative value, typically -1) when you hit end of file (or on a read error), so try while(fscanf(…) >= 0) for your outer loop. Whether you accept a zero return value or not depends on how you want to handle bad data: as other posters have pointed out, you will get 0 if no numbers are converted. Be aware that fscanf does not consume the offending input in that case, so a loop that simply continues on 0 will spin forever unless you skip the bad token yourself.
I like using the 'greater than or equals' test (instead of testing for ==1) because it doesn't break if you change the scanf format to, say, read two items from the input line. On the other hand, you could argue that it's better to verify that you always read exactly the correct number of arguments, and that it's better to have your program 'fail fast' as soon as you change the number of scanf arguments than to have some mysterious failure later on if you get bad input. It's a matter of style, and of whether this is throwaway code for an assignment or production code.

During the scan, save the results of fscanf()
int cnt;
double d;
while ((cnt = fscanf(data, "%lf", &d)) == 1) {
more_space = realloc(A, (N+1)*sizeof(double));
if (more_space == NULL) Handle_MemoryError();
A = more_space;
A[N] = d;
N++;
}
if (cnt == 0) Handle_BadDataError();

Just stop when fscanf tells you there is no more data to be read. Its return value indicates the number of items it read or you get EOF if you reach end of file or there is a read error. Therefore, since you are reading only one item in each call, you will get 1 so long as reading is successful.
while (fscanf(data, "%lg", &buff) == 1) {
...
}
You may want to store the return value in a variable to verify whether reading stopped due to end-of-file or because the double parsing failed.
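As a minimal sketch of that idea, the function below keeps the fscanf return value and reports why reading stopped. The name read_doubles, the status codes and the doubling growth policy are assumptions made for this illustration, not part of the question's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads whitespace-separated doubles from fp until fscanf stops converting.
 * Returns a malloc'd array (NULL if nothing was stored) and puts its length
 * in *count; the caller frees it. *status is set to 0 for a clean end of
 * file, 1 for malformed input, and -1 for an allocation or read error. */
double *read_doubles(FILE *fp, size_t *count, int *status)
{
    assert(fp != NULL && count != NULL && status != NULL);

    double *values = NULL;
    size_t used = 0, capacity = 0;
    double item;
    int rc;

    while ((rc = fscanf(fp, "%lf", &item)) == 1) {
        if (used == capacity) {
            /* Grow geometrically instead of one element at a time. */
            size_t new_capacity = capacity ? capacity * 2 : 16;
            double *grown = realloc(values, new_capacity * sizeof *grown);
            if (grown == NULL) {
                free(values);
                *count = 0;
                *status = -1;
                return NULL;
            }
            values = grown;
            capacity = new_capacity;
        }
        values[used++] = item;
    }

    /* rc == 0: a token that is not a number; rc == EOF: end of file or error. */
    if (rc == 0)
        *status = 1;
    else
        *status = ferror(fp) ? -1 : 0;

    *count = used;
    return values;
}
```

For a file containing "4 23.3 2 1.2" this should give a count of 4 and a status of 0, so the array holds exactly the numbers in the file and nothing more.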

You should do:
while (fscanf(data, "%lg", &buff) == 1)
That way if you hit a "non number" then it'll stop instead of loop infinitely!! ... And fscanf will check for EOF and it'll break the loop for you.

Alternatively to other answers, read the entire file:
int32_t readFile ( char* filename, uint8_t** contentPtr, int32_t* sizePtr)
{
int32_t ret = 0;
struct stat st;
FILE *file = NULL;
uint8_t* content = NULL;
memset ( &st, 0, sizeof ( struct stat ) );
ret = stat( filename, &st );
if( ret < 0 ) {
return -1;
}
file = fopen64(filename, "rb");
if ( file == NULL ) {
return -1;
}
content = calloc ( st.st_size+10, sizeof ( uint8_t ) );
if ( content == NULL ) {
fclose( file );
return -1;
}
ret = fread ( content, 1, st.st_size, file );
fclose( file );
if( ret != st.st_size ) {
free(content);
return -1;
}
*contentPtr = content;
if (sizePtr != NULL) {
*sizePtr = st.st_size;
}
return 0;
}
And then process the contents of the buffer with sscanf().
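One way to walk such a buffer is sketched below. It swaps in strtod for sscanf, because strtod reports where each number ended, and it assumes the buffer is null-terminated (the calloc call above zero-fills 10 spare bytes, so that holds for readFile). The name parse_doubles and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Parses up to max_out doubles from the null-terminated string text.
 * Returns how many were stored. If rest is not NULL, *rest is set to the
 * position where parsing stopped; if the whole buffer was numeric, only
 * whitespace remains from there on. */
size_t parse_doubles(const char *text, double *out, size_t max_out,
                     const char **rest)
{
    assert(text != NULL && out != NULL);

    size_t n = 0;
    const char *p = text;
    while (n < max_out) {
        char *end;
        double v = strtod(p, &end);
        if (end == p)      /* no conversion: end of data or a bad token */
            break;
        out[n++] = v;
        p = end;           /* continue right after the number just read */
    }
    if (rest != NULL)
        *rest = p;
    return n;
}
```

Since the output array has a fixed capacity here, a caller that does not know the count in advance could parse in batches, resuming from *rest each time.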

Related

Reading variables via fread and storing via sscanf

I have a .txt file with the values of some variables. I need to read them to initialize my variables in main. What is wrong?
#include <stdio.h>
#define INPUT "input.txt"
int main (void){
FILE *open_file = fopen(INPUT, "r");
if (open_file == NULL){
puts("ERROR.");
} else {
puts("SUCCESS.");
}
char *buffer;
int size = ftell(open_file);
int read_file = fread(buffer,size,*open_file);
int Integer1, Integer2;
while (size != EOF){
sscanf("%d%d",Integer1, Integer2);
}
int close_file = fclose(open_file);
if (close_file == -1){
puts("Error in closing file.");
} else {
puts("Closing file: SUCCESS.");
}
return 0;
}
What is wrong? I have to read every line of my file. For example if my file contains:
1
2
My scanf should set:
Integer1 = 1;
Integer2 = 2;
One problem is that after the line
char *buffer;
the variable buffer does not point to any valid memory location. It is a wild pointer.
You can either create a fixed size array like this:
char buffer[100];
or you can create a dynamically sized memory buffer, like this:
if ( fseek( open_file, 0, SEEK_END ) != 0 )
DoSomethingToHandleError();
long size = ftell(open_file);
if ( fseek( open_file, 0, SEEK_SET ) != 0 )
DoSomethingToHandleError();
char *buffer = malloc( size );
if ( buffer == NULL )
DoSomethingToHandleError();
Depending on whether you used a fixed-size array or a dynamically allocated buffer, you should change the call to fread to one of the following:
fread( buffer, 1, sizeof(buffer), open_file ); //for fixed size array
fread( buffer, 1, size, open_file ); //for dynamically allocated buffer
Instead of using the function fread, you would probably be better off using the function fgets, especially because sscanf requires a null-terminated string as input, not binary data. The function fread will give you binary data that is not null-terminated, whereas fgets will give you a null-terminated string.
Also, change the line
sscanf("%d%d",Integer1, Integer2);
to
sscanf( buffer, "%d%d", &Integer1, &Integer2);
Before using Integer1 and Integer2 afterwards, you should also check the return value of sscanf to make sure that both integers were found in the string.
However, if you don't want to handle the memory management and reading in of the file yourself, you can simply use fscanf, like this:
if ( fscanf( open_file, "%d%d", &Integer1, &Integer2 ) != 2 )
DoSomethingToHandleError();
But this has the disadvantage that fscanf will just give you the first two numbers that it finds in the file, and won't perform much input validation. For example, fscanf won't enforce that the two numbers are on separate lines. See the following link for more information on the disadvantages of using fscanf:
A beginners' guide away from scanf()
If you use the function fgets as I suggested, then you will need two calls to fgets to read both lines of the file. This means that sscanf will be unable to find both integers in the same string. Therefore, if you use fgets, then you will have to change your program logic a bit, for example like this:
#define MAX_INTEGERS 2
char buffer[100];
int integers[MAX_INTEGERS];
int num_found = 0;
while ( fgets( buffer, sizeof(buffer), open_file ) != NULL )
{
int i;
if ( strchr( buffer, '\n' ) == NULL && !feof( open_file ) )
{
printf( "Error: Line size too long! Aborting.\n" );
break;
}
if ( sscanf( buffer, "%d", &i ) == 1 )
{
if ( num_found == MAX_INTEGERS )
{
printf(
"Error: Found too many integers in file! This "
"program only has room for storing %d integers. "
"Aborting.\n",
MAX_INTEGERS
);
break;
}
//add found integer to array and increment num_found
integers[num_found++] = i;
}
else
{
printf(
"Warning: Line without number encountered. This could "
"simply be a harmless empty line at the end of the "
"file, but could also indicate an error.\n"
);
}
}
printf( "Found the following integers:\n" );
for ( int i = 0; i < num_found; i++ )
{
printf( "%d\n", integers[i] );
}
Instead of using sscanf, you may want to use strtol, as that function allows you to perform stricter input validation.
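A minimal sketch of what stricter validation with strtol could look like follows. The helper name parse_int_strict and its exact rejection rules (trailing junk, empty lines, out-of-range values) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Converts one line to an int. Returns 0 on success, -1 if the line holds
 * no number, has trailing non-whitespace characters, or is out of range. */
int parse_int_strict(const char *line, int *out)
{
    assert(line != NULL && out != NULL);

    char *end;
    errno = 0;
    long value = strtol(line, &end, 10);

    if (end == line)                       /* no digits at all */
        return -1;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;                         /* does not fit in an int */
    while (isspace((unsigned char)*end))   /* allow the newline kept by fgets */
        end++;
    if (*end != '\0')                      /* e.g. "12abc" */
        return -1;

    *out = (int)value;
    return 0;
}
```

In the fgets loop above, the call sscanf( buffer, "%d", &i ) == 1 could then be replaced by parse_int_strict( buffer, &i ) == 0, which would also reject a line such as "12abc" that sscanf accepts as 12.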
If you don't want to use the function fgets, which reads input line by line, but really want to read the whole file at once using fread, you can do that, but you would have to add the terminating null character manually.
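A sketch of that approach is shown here, under the assumption that the stream is a seekable regular file. The function name read_whole_file is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads the whole file behind fp into a freshly allocated, null-terminated
 * buffer. Returns NULL on failure; the caller frees the result. */
char *read_whole_file(FILE *fp, size_t *length)
{
    assert(fp != NULL);

    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0)
        return NULL;

    char *buffer = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (buffer == NULL)
        return NULL;

    size_t got = fread(buffer, 1, (size_t)size, fp);
    if (ferror(fp)) {
        free(buffer);
        return NULL;
    }
    buffer[got] = '\0';   /* fread does not terminate the data itself */

    if (length != NULL)
        *length = got;
    return buffer;
}
```

The returned string can be handed to sscanf directly, for example sscanf( text, "%d%d", &Integer1, &Integer2 ), because %d skips the newline between the two numbers.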
EDIT: Since you stated in the comments that you didn't want a loop and wanted to store the individual lines each into their own named variable, then you could use the following code instead:
//This function will assume that one number is stored per line, and
//write it to the memory location that "output" points to. Note that
//the return value does not specify the number found, but rather
//whether an error occurred or not. A return value of 0 means no error,
//nonzero means an error occurred.
int get_number( FILE *fp, int *output )
{
char buffer[100];
int i;
//read the next line into buffer and check for error
if ( fgets( buffer, sizeof(buffer), fp ) == NULL )
return -1;
//make sure line was not too long to fit into buffer
if ( strchr( buffer, '\n' ) == NULL && !feof( fp ) )
return -1;
//parse line and make sure that a number was found
if ( sscanf( buffer, "%d", &i ) != 1 )
return -1;
*output = i;
return 0;
}
Now, in your function main, you can simply use the following code:
int error_occurred = 0;
if ( get_number( open_file, &Integer1 ) != 0 )
{
printf( "Error occurred when reading the first integer.\n" );
error_occurred = 1;
}
if ( get_number( open_file, &Integer2 ) != 0 )
{
printf( "Error occurred when reading the second integer.\n" );
error_occurred = 1;
}
if ( !error_occurred )
{
printf( "The value of Integer1 is: %d\n", Integer1 );
printf( "The value of Integer2 is: %d\n", Integer2 );
}
There are a couple of problems in your code:
1. In fread(buffer,size,*open_file); the buffer is not allocated.
A) You have to allocate memory using malloc or calloc if you use pointers, and free it after you are done with it.
If you want to avoid the headache of allocating and freeing, you are better off using an array large enough to store the contents.
2. fread takes 4 arguments.
A) The 4th argument is not a FILE but a FILE *, so pass open_file, not *open_file. You have also left out the nmemb (number of members) parameter.
On success, fread() returns the number of items read, so check the return value to avoid errors.
3. ftell() returns the current offset; on error, -1 is returned.
A) You really don't need it here. Check the return value of fread to find out whether you have reached EOF.
4. Check the syntax of sscanf and an example of its use.
OP's code failed to seek to the end of the file, failed to allocate space for buffer, and used fread() incorrectly.
Correct version shown with some error checking.
//char *buffer;
//int size = ftell(open_file);
//int read_file = fread(buffer,size,*open_file);
// Seek and report, hopefully, the end position
if (fseek(open_file, 0, SEEK_END)) { fprintf(stderr, "fseek() failed.\n"); return EXIT_FAILURE; }
long size = ftell(open_file);
if (size == -1) { fprintf(stderr, "ftell() failed.\n"); return EXIT_FAILURE; }
// Return to the start of the file so that fread() has data to read
rewind(open_file);
// Allocate memory and read
if ((unsigned long) size > SIZE_MAX) { fprintf(stderr, "size too big.\n"); return EXIT_FAILURE; }
char *buffer = malloc((size_t) size);
if (buffer == NULL) { fprintf(stderr, "malloc() failed.\n"); return EXIT_FAILURE; }
size_t length = fread(buffer, sizeof *buffer, size, open_file);
// Use `length` as the number of char read
// ...
// when done
free(buffer);
Other problems too.
// while (size != EOF){
// sscanf("%d%d",Integer1, Integer2);
// }
Maybe later, GTG.

Reading text file into an array in C

I want to parse a .txt file into a 1D array in C. I'm using the fgets function to read the contents of the file into the array("waveform" as the array into which the file contents are to be stored - defined as a "char"). The saved values need to be saved into a new array as integer values. I am not sure where I am going wrong.
P.S: I am new to programming in C, please bear with me :)
Please ignore the indexing issues, done due to pasting
int main(){
int a, win[10];
FILE *filename = fopen("testFile.txt","r");
char waveform[10];
if (filename == NULL)
{
printf("Error opening file.\n");
exit(8);
}
for(int i =0;1;i++){
if(fgets(waveform[i], 10, filename) == NULL);
break;
if(i < 10)
{
a = atoi(waveform[i]);
win[i] = a;
}
}
fclose(filename);
return 0;
}
Compiler errors - image embedded
Data in testFile.txt:
1 to 10 in a row vector.
You are on the right track. Here is my contribution on the topic:
Open the file (fopen)
Count number of lines (getc and rewind)
Read all lines into array (getline)
Free memory and close file (free and fclose)
Code example:
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
// Open File
const char fname[] = "testFile.txt";
FILE *fp = fopen(fname, "r");
if( !fp )
goto error_open_file;
printf("Opened file: %s\n", fname);
// Count Lines
// getc returns an int, so cr must be an int to be compared with EOF
int cr;
size_t lines = 0;
while( (cr = getc(fp)) != EOF ) {
if ( cr == '\n' ) {
lines++;
}
}
printf("Number of lines: %zu\n", lines);
rewind(fp);
// Read data
{// 'goto' + data[lines] causes error, introduce block as a workaround
char *data[lines];
size_t n;
for (size_t i = 0; i < lines; i++) {
data[i] = NULL;
size_t n = 0;
getline(&data[i], &n, fp);
if ( ferror( fp ) )
goto error_read_file;
}
for (size_t i = 0; i < lines; i++) {
printf("%s", data[i]);
free(data[i]);
}
}
// Close File
fclose(fp);
return 0;
error_read_file:
perror("getline ");
fclose(fp);
return 1;
error_open_file:
perror("fopen ");
return 2;
}
There are several errors in this loop
for(int i =0;1;i++){
if(fgets(waveform[i], 10, filename) == NULL);
break;
if(i < 10)
{
a = atoi(waveform[i]);
win[i] = a;
}
}
For starters there is a semicolon after the if statement
if(fgets(waveform[i], 10, filename) == NULL);
^^^
Secondly the fgets call
fgets(waveform[i], 10, filename)
^^^
is invalid because the type of the expression waveform[i] is char.
And correspondingly this statement
a = atoi(waveform[i]);
is also invalid.
There must be at least
fgets( waveform, 10, filename)
and
a = atoi( waveform );
I suppose that each line of the file contains exactly one number. (Otherwise you should use for example sscanf to extract numbers from a line using an internal additional loop.)
The loop can look like
int i = 0;
for ( ; i < 10 && fgets( waveform, 10, filename) != NULL; i++ )
{
a = atoi( waveform );
win[i] = a;
}
After the loop the variable i will contain the actual number of elements of the array win.
Note that filename is a poor name for a pointer of type FILE *. The file name is the string "testFile.txt" in your code.
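For the case mentioned above where one line holds several numbers, the inner loop with sscanf could be sketched as follows. It relies on the %n conversion, which stores how many characters sscanf has consumed so far. The function name ints_from_line is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Extracts up to max_out integers from one line of text.
 * Returns the number of integers stored in out. */
size_t ints_from_line(const char *line, int *out, size_t max_out)
{
    assert(line != NULL && out != NULL);

    size_t n = 0;
    int value;
    int used;   /* characters consumed by the last sscanf call, via %n */

    while (n < max_out && sscanf(line, "%d%n", &value, &used) == 1) {
        out[n++] = value;
        line += used;   /* continue right after the number just read */
    }
    return n;
}
```

Note that a line such as "1 2 3 4 5 6 7 8 9 10" is longer than the 10-character waveform buffer in the question, so the buffer passed to fgets would need to be larger before a helper like this can see the whole line.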
If you want to use the fgets() function, you don't have to put it into a loop here, since all the numbers are on one line. The second argument of fgets() is the size of the buffer, so it limits how many characters are read (at most one fewer than that size).
I would put the fgets() call into a single instruction, and then loop from 0 to 10 to make the conversion from char to int with the atoi() function.
Moreover, you have a ; at the end of your if() statement, so it will not execute the way you want.

How to use the read function, with an unknown size

I made a function that opens a file and reads the single characters of them.
int getBlocks(char *file){
char c;
int line = 0;
int pos = 0;
int blocks;
FILE *inptr = fopen(file, "rt");
// Count the amount of blocks/tetriminos
while((c=fgetc(inptr))!=EOF){
if(c == '.' || c == '#'){
pos++;
if(pos == 4){
pos = 0;
line++;
}
}
}
blocks = (line/4);
fclose(inptr);
return blocks;
}
Now I have to rewrite it, because the only functions I am allowed to use are exit, open, close, write, read, malloc and free.
I think I could basically use int inptr = open(file, O_RDONLY); instead of my fopen line, and simply close(inptr); instead of my fclose function.
My only problem now is fgetc. I am pretty sure that I can use read here, but according to its definition ssize_t read(int fildes, void *buf, size_t nbytes); I would need to specify a fixed number of bytes in advance, but the file size can always differ in my case.
Do you know how I could rewrite fgetc here?
It's pretty similar with slight changes
char chr;
int fd = open(file, O_RDONLY);
if (fd == -1)
return -1; // Or some error integer
while (read(fd, &chr, 1) == 1) {
/* the rest of your code */
}
close(fd);
Note that one important change is that the type of chr is char and not int.
Instead of checking for EOF, you simply check that read() returned a suitable value. Ideally you should store the return value and check that it is 0 after the loop, which means the end of the file was reached; otherwise an error occurred.
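A sketch of that check is given below. It stores the return value of read() and treats 0 as end of file and a negative value as an error. It also reads a chunk at a time rather than a single byte, which is a design choice of this example, not something the answer requires. The name count_cells and the 4096-byte buffer are assumptions, and the assert is only a sanity check that can be dropped where the exercise forbids it.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Counts the '.' and '#' characters available from fd.
 * Returns the count, or -1 if read() reported an error. */
long count_cells(int fd)
{
    assert(fd >= 0);

    char buf[4096];
    long cells = 0;
    ssize_t got;

    /* read() may deliver fewer bytes than requested; that is not an error. */
    while ((got = read(fd, buf, sizeof buf)) != 0) {
        if (got < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: try again */
            return -1;         /* a real error */
        }
        for (ssize_t i = 0; i < got; i++)
            if (buf[i] == '.' || buf[i] == '#')
                cells++;
    }
    return cells;              /* got == 0 means end of file */
}
```

With four cells per line and four lines per block, as in the question, the number of blocks would then be count_cells(fd) / 16.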

Copy a file with buffers of different sizes for read and write

I have been doing some practice problems for job interviews and I came across a function that I can't wrap my mind around. The idea is to create a function that takes the names of two files, the allowed buffer size for reading from file1 and the allowed buffer size for writing to file2. If the buffer sizes are the same, I know how to go through the question, but I am having problems figuring out how to move data between the buffers when the sizes are different. Part of the constraints is that we always have to fill the write buffer before writing it to the file. If the size of file1 is not a multiple of the write buffer size, we pad the last buffer transfer with zeros.
// input: name of two files made for copy, and their limited buffer sizes
// output: number of bytes copied
int fileCopy(char* file1,char* file2, int bufferSize1, int bufferSize2){
int bytesTransfered=0;
int bytesMoved=0;
char* buffer1, *buffer2;
FILE *fp1, *fp2;
fp1 = fopen(file1, "r");
if (fp1 == NULL) {
printf ("Not able to open this file");
return -1;
}
fp2 = fopen(file2, "w");
if (fp2 == NULL) {
printf ("Not able to open this file");
fclose(fp1);
return -1;
}
buffer1 = (char*) malloc (sizeof(char)*bufferSize1);
if (buffer1 == NULL) {
printf ("Memory error");
return -1;
}
buffer2 = (char*) malloc (sizeof(char)*bufferSize2);
if (buffer2 == NULL) {
printf ("Memory error");
return -1;
}
bytesMoved=fread(buffer1, sizeof(buffer1),1,fp1);
//TODO: Fill buffer2 with maximum amount, either when buffer1 <= buffer2 or buffer1 > buffer2
//How do I iterate trough file1 and ensuring to always fill buffer 2 before writing?
bytesTransfered+=fwrite(buffer2, sizeof(buffer2),1,fp2);
fclose(fp1);
fclose(fp2);
return bytesTransfered;
}
How should I write the while loop for the buffer transfers before the fwrites?
I am having problems figuring how to move data between the buffers when the sizes are of different
Lay out a plan. For "some practice problems for job interviews", a good plan and the ability to justify it are important. Coding, although important, is secondary.
given valid: 2 FILE *, 2 buffers and their sizes
while write active && read active
  while write buffer not full && reading active
    if read buffer empty
      read
      update read active
    append min(read buffer length, write buffer available space) of read to write buffer
  if write buffer not empty
    pad write buffer
    write
    update write active
return file status
Now code it. A more robust solution would use a struct to group the FILE*, buffer, size, offset, length, active variables.
// Return true on problem
static bool rw(FILE *in_s, void *in_buf, size_t in_sz, FILE *out_s,
void *out_buf, size_t out_sz) {
size_t in_offset = 0;
size_t in_length = 0;
bool in_active = true;
size_t out_length = 0;
bool out_active = true;
while (in_active && out_active) {
// While room for more data
while (out_length < out_sz && in_active) {
if (in_length == 0) {
in_offset = 0;
// Element size 1, so the return value is a byte count
in_length = fread(in_buf, 1, in_sz, in_s);
in_active = in_length > 0;
}
// Append a portion of `in` to `out`
size_t room = out_sz - out_length;
size_t chunk = in_length < room ? in_length : room; // C has no standard min()
memcpy((char*) out_buf + out_length, (char*) in_buf + in_offset, chunk);
out_length += chunk;
in_length -= chunk;
in_offset += chunk;
}
if (out_length > 0) {
// Padding only occurs, maybe, on last write
memset((char*) out_buf + out_length, 0, out_sz - out_length);
out_active = fwrite(out_buf, 1, out_sz, out_s) == out_sz;
out_length = 0;
}
}
return ferror(in_s) || ferror(out_s);
}
Other notes;
Casting malloc() results not needed. #Gerhardh
// buffer1 = (char*) malloc (sizeof(char)*bufferSize1);
buffer1 = malloc (sizeof *buffer1 * bufferSize1);
Use stderr for error messages. #Jonathan Leffler
Open the file in binary.
size_t is more robust for array/buffer sizes than int.
Consider sizeof buffer1 vs. sizeof (buffer1) as parens not needed with sizeof object
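The struct-based grouping suggested above might look roughly like the following sketch. It is not the answerer's code: struct side, copy_padded and the choice to return the number of bytes written are all assumptions made for illustration, and the function allocates its own buffers to stay self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One side of the copy: a stream plus its fixed-size buffer. */
struct side {
    FILE *stream;
    char *buf;
    size_t size;     /* capacity of buf */
    size_t offset;   /* read side: first unconsumed byte */
    size_t length;   /* read side: unconsumed bytes; write side: bytes filled */
};

/* Copies in to out, reading at most in_sz bytes at a time and writing only
 * full out_sz-byte blocks; the last block is padded with zeros.
 * Returns the number of bytes written, or -1 on error. */
long copy_padded(FILE *in, FILE *out, size_t in_sz, size_t out_sz)
{
    assert(in != NULL && out != NULL && in_sz > 0 && out_sz > 0);

    struct side r = { in, malloc(in_sz), in_sz, 0, 0 };
    struct side w = { out, malloc(out_sz), out_sz, 0, 0 };
    long written = -1;
    bool more = true;

    if (r.buf == NULL || w.buf == NULL)
        goto done;

    written = 0;
    while (more) {
        /* Fill the write buffer, refilling the read buffer as needed. */
        while (w.length < w.size) {
            if (r.length == 0) {
                r.offset = 0;
                r.length = fread(r.buf, 1, r.size, r.stream);
                if (r.length == 0) {      /* end of file or read error */
                    more = false;
                    break;
                }
            }
            size_t room = w.size - w.length;
            size_t chunk = r.length < room ? r.length : room;
            memcpy(w.buf + w.length, r.buf + r.offset, chunk);
            w.length += chunk;
            r.length -= chunk;
            r.offset += chunk;
        }
        if (w.length > 0) {
            /* Only the final block can be partial; pad it with zeros. */
            memset(w.buf + w.length, 0, w.size - w.length);
            if (fwrite(w.buf, 1, w.size, w.stream) != w.size) {
                written = -1;
                goto done;
            }
            written += (long)w.size;
            w.length = 0;
        }
    }
    if (ferror(r.stream))
        written = -1;

done:
    free(r.buf);
    free(w.buf);
    return written;
}
```

Copying a 10-byte file with a 3-byte read buffer and a 4-byte write buffer should produce three 4-byte writes, the last one carrying two padding zeros.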
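int offset = 0; /* position of the next unsent byte in buffer1 */
while(bytesMoved > 0) {
for(i=0; i<bytesMoved && i<bufferSize2; i++)
buffer2[i]=buffer1[offset+i];
/* element size 1, so fwrite returns the number of bytes written */
bytesTransfered+=fwrite(buffer2, 1, i, fp2);
bytesMoved-=i;
offset+=i;
}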
If bufferSize1 is smaller than the filesize you need an outer loop.
As the comments to your question have indicated, this solution is not the best way to transfer data from 1 file to another file. However, your case has certain restrictions, which this solution accounts for.
(1) Since you are using a buffer, you do not need to read and write 1 char at a time; instead you can make as few calls to those functions as possible.
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
:from the man page for fread, nmemb can = bufferSize1
(2) You will need to check the return value from fread() (i.e. bytesMoved) and compare it with both bufferSize1 and bufferSize2. If (a) bytesMoved is equal to bufferSize1, or (b) bufferSize2 is less than bufferSize1 or less than bytesMoved, then you know that there is still data that needs to be read (or written). In that case, begin the next transfer of data, and when it is complete return to the step you left off on.
Note: The pointer to the File Stream in fread() and fwrite() will begin where it left off in the event that the data is larger than the bufferSizes.
PseudoCode:
/* in while() loop continue reading from file 1 until nothing is left to read */
while ((bytesMoved = fread(buffer1, 1, bufferSize1, fp1)) > 0)
{
/* hand buffer1 over to buffer2 in pieces of at most bufferSize2 bytes;
offset, chunk and j are extra int variables introduced for this sketch */
for (offset = 0; offset < bytesMoved; offset += chunk)
{
chunk = bytesMoved - offset;
if (chunk > bufferSize2)
chunk = bufferSize2;
/* transfer from buffer1 to buffer2 */
for (j = 0; j < chunk; j++)
buffer2[j] = buffer1[offset + j];
/* write only the bytes that were actually copied; note that this
writes partial blocks as they come and does not pad with zeros */
if (fwrite(buffer2, 1, chunk, fp2) != (size_t) chunk)
return -1; /* write error */
bytesTransfered += chunk;
}
/* buffer1 needs no reset: the next fread() overwrites it */
}

File I/O function for C

char *loadTextFile(const char *filename)
{
FILE *fileh;
char *text = 0;
long filelength;
if((fileh=fopen(filename,"rb"))== 0)
printf("loadTextFile() - could not open file");
else
{
fseek(fileh, 0, SEEK_END);
filelength = ftell(fileh);
rewind(fileh);
text=(char *) smartmalloc((int)filelength + 1);
fread(text,(int)filelength, 1, fileh);
fclose(fileh);
text[filelength]=0;
}
printf(text);
return(text);
}
This function only returns partial data of a txt file. It is also inconsistent... sometimes it gives me 100 characters of the file, sometimes 20. I don't see anything wrong with it. Thought I might get another pair of eyes on it. Thanks.
Obvious things to check:
What did ftell(fileh) give you?
Can there be embedded NUL characters in the file? That would cause printf(text) to stop prematurely.
Here is a slightly better version of your code. You need more error checking with the IO function calls. Also, there is the annoying long to size_t implicit conversions which I would recommend dealing with properly in production code.
char* loadTextFile(const char *filename) {
char *text;
long length;
FILE *fileh = fopen(filename, "rb");
if ( !fileh ) {
return NULL;
}
fseek(fileh, 0, SEEK_END);
length = ftell(fileh);
rewind(fileh);
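text = malloc(length + 1);
if ( !text ) {
fclose(fileh);
return NULL;
}
/* fread may return fewer bytes than requested; terminate after what was actually read */
size_t nread = fread(text, 1, length, fileh);
text[nread] = 0;
fclose(fileh);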
return text;
}
Note that, John R. Strohm is right: If your assessment of what has been read is based on what printf prints, then you are likely being misled by embedded nuls.
fread is not guaranteed to return as many characters as you ask for. You need to check its return value and use it in a loop.
Example loop (not tested):
char *p = text;
do {
size_t n = fread(p,1,(size_t)filelength, fileh);
if (n == 0) {
*p = '\0';
break;
}
filelength -= n;
p += n;
} while (filelength > 0);
The test for n==0 catches the case where some other process truncates the file as you are trying to read it.

Resources