File pointer in binary files after using fread - C

I'm currently learning about file pointers and came across this code, which was given as an example. I tried replicating it in Visual Studio, but I keep getting runtime errors that are not relevant at the moment.
#include <stdio.h>

/* struct definition inferred from the printf format strings below */
struct clientData {
    int acctNum;
    char lastName[ 15 ];
    char firstName[ 10 ];
    double balance;
};

int main( void )
{
    FILE *cfPtr; /* credit.dat file pointer */

    /* create clientData with default information */
    struct clientData client = { 0, "", "", 0.0 };

    if ( ( cfPtr = fopen( "credit.dat", "rb" ) ) == NULL )
        printf( "File could not be opened.\n" );
    else {
        printf( "%-6s%-16s%-11s%10s\n", "Acct", "Last Name",
                "First Name", "Balance" );

        /* read all records from file (until eof) */
        fread( &client, sizeof( struct clientData ), 1, cfPtr );

        while ( !feof( cfPtr ) ) {
            /* display record */
            if ( client.acctNum != 0 )
                printf( "%-6d%-16s%-11s%10.2f\n",
                        client.acctNum, client.lastName,
                        client.firstName, client.balance );

            fread( &client, sizeof( struct clientData ), 1, cfPtr );
        } /* end while */

        fclose( cfPtr ); /* fclose closes the file */
    } /* end else */

    return 0;
} /* end main */
My question is this: if the file is empty, what does struct client contain?
Also, if the file holds only one struct: when the code calls fread, wouldn't the file position indicator move past that struct, so that it sits at EOF? Then the !feof( cfPtr ) check would be false, the loop body would never run, and the struct would never be printed on screen?

7.21.8.1 The fread function
...
Synopsis

#include <stdio.h>
size_t fread(void * restrict ptr,
    size_t size, size_t nmemb,
    FILE * restrict stream);

Description

The fread function reads, into the array pointed to by ptr, up to nmemb elements whose size is specified by size, from the stream pointed to by stream. For each object, size calls are made to the fgetc function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object. The file position indicator for the stream (if defined) is advanced by the number of characters successfully read. If an error occurs, the resulting value of the file position indicator for the stream is indeterminate. If a partial element is read, its value is indeterminate.

Returns

The fread function returns the number of elements successfully read, which may be less than nmemb if a read error or end-of-file is encountered. If size or nmemb is zero, fread returns zero and the contents of the array and the state of the stream remain unchanged.

C 2011 Online Draft
Emphasis added; if fread returns something less than 1 for this code, then you should assume that client doesn't contain anything meaningful.
Never use feof as your loop condition - it won't return true until you try to read past the end of the file, so the loop will execute once too often. Instead, check the result of your input operation like so:
while ( fread( &client, sizeof client, 1, cfPtr ) == 1 )
{
    // process client normally
}

if ( feof( cfPtr ) )
    fputs( "End of file detected on input\n", stderr );
else
    fputs( "Error on input\n", stderr );

Related

Why is my program perceiving an EOF condition way before my file actually ends?

My code reads line by line from a text file and stores the lines in a massive array of char pointers. When I use an ordinary text file, this works with no issues. However, when I try to read from the 'dictionary.txt' file I'm supposed to be using, my program detects EOF after reading the first of MANY lines in the file.
int i = 0;
while( 1 ) {
    size_t size = 50;
    fseek( dicFile, 0L, getline( &dictionary[i++], &size, dicFile ) );
    printf( "%d:\t%s", i, dictionary[i - 1] );
    if( feof( dicFile ) ) {
        fclose( dicFile );
        break;
    }
}
puts("finished loading dictionary");
Here is the start of the dictionary file I'm attempting to load:
A
A's
AA's
AB's
ABM's
AC's
ACTH's
AI's
AIDS's
AM's
AOL
AOL's
ASCII's
ASL's
ATM's
ATP's
AWOL's
AZ's
The output I get from this portion of the program is:
1: A
2: finished loading dictionary
Thanks for any help.
Your third argument to fseek() is nuts. I've seen at least one implementation that treated every out of range third argument as SEEK_END. Oops.
You should just call getline() in the loop instead. In fact, just check the return value of getline() for -1 and get rid of that feof().
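A minimal sketch of that fix, assuming each entry of dictionary starts out NULL so getline() can allocate the line buffers itself:

int i = 0;
while ( 1 ) {
    size_t size = 0;
    // getline() returns -1 on EOF or error, so no feof() check is needed
    if ( getline( &dictionary[i], &size, dicFile ) == -1 )
        break;
    i++;
    printf( "%d:\t%s", i, dictionary[i - 1] );
}
fclose( dicFile );
puts( "finished loading dictionary" );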

Overwrite the Contents of a File in C

I am writing a program that forks multiple processes. They all share a file called "character". What I want to do is have every process read the only character in the file and then replace it with its own character, so the other processes can do the same. The file is the only way the processes can communicate with each other. How can I erase the only character in the file and put a new one in its place? I was advised to use freopen() (which closes the file and reopens it, erasing its previous contents), but I am not sure it is the best way to achieve this.
You should not have to reopen the file. That gains you nothing. If you're worried about each process buffering input or output, disable buffering if you want to use FILE *-based stdio functions.
But if I'm reading your question correctly (you want each process to replace the one character in the file when a specific value is held in the file, and that value changes for each process), this will do what you want, using POSIX open(), pread(), and pwrite(). (You're already using POSIX fork(), so using low-level POSIX IO makes things a lot simpler; note that pread() and pwrite() eliminate the need for seeking.)
I'll say this is what I think you're trying to do:
// header files and complete error checking are omitted for clarity
int fd = open( filename, O_RDWR );

// fork() here?

// loop until we read the char we want from the file
for ( ;; )
{
    char data;
    ssize_t result = pread( fd, &data, sizeof( data ), 0 );

    // pread() failed or hit EOF
    if ( result != sizeof( data ) )
    {
        break;
    }

    // if the data read matches this process's value, replace the value
    // (replace 'a' with 'b', 'c', 'z' or '*' - whatever value you
    // want the current process to wait for)
    if ( data == 'a' )
    {
        data = 'b';
        result = pwrite( fd, &data, sizeof( data ), 0 );
        break;
    }
}
close( fd );
For any decent number of processes, that's going to put a lot of stress on your filesystem.
If you really want to start with fopen() and use that family of calls, this might work depending on your implementation:
FILE *fp = fopen( filename, "rb+" );

// disable buffering
setbuf( fp, NULL );

// fork() here???

// loop until the desired char value is read from the file
for ( ;; )
{
    char data;

    // with fread(), we need to fseek()
    fseek( fp, 0, SEEK_SET );
    int result = fread( &data, 1, 1, fp );
    if ( result != 1 )
    {
        break;
    }

    if ( data == 'a' )
    {
        data = 'b';
        fseek( fp, 0, SEEK_SET );
        fwrite( &data, 1, 1, fp );
        break;
    }
}
fclose( fp );
Again, that assumes I'm reading your question properly. Note that the POSIX rules John Bollinger mentioned in his comments regarding multiple handles don't apply - because the streams are explicitly not buffered.

Improving IO performance for merging two files in C

I wrote a function which merges two large files (file1, file2) into a new file (outputFile).
Each file has a line-based format, with entries separated by a \0 byte. Both files have the same number of null bytes.
One example file with two entries could look like this: A\nB\n\0C\nZ\nB\n\0
Input:
file1: A\nB\0C\nZ\nB\n\0
file2: BBA\nAB\0T\nASDF\nQ\n\0
Output
outputFile: A\nB\nBBA\nAB\0C\nZ\nB\nT\nASDF\nQ\n\0
FILE * outputFile = fopen(...);
setvbuf( outputFile, NULL, _IOFBF, 1024 * 1024 * 1024 );

FILE * file1 = fopen(...);
FILE * file2 = fopen(...);

int c1, c2;
while ( (c1 = fgetc(file1)) != EOF ) {
    if ( c1 == '\0' ) {
        while ( (c2 = fgetc(file2)) != EOF && c2 != '\0' ) {
            fwrite( &c2, sizeof(char), 1, outputFile );
        }
        char nullByte = '\0';
        fwrite( &nullByte, sizeof(char), 1, outputFile );
    } else {
        fwrite( &c1, sizeof(char), 1, outputFile );
    }
}
Is there a way to improve the IO performance of this function? I increased the buffer size of outputFile to 1 GB by using setvbuf. Would it help to use posix_fadvise on file1 and file2?
You're doing IO character-by-character. That is going to be needlessly and painfully S-L-O-W, even with buffered streams.
Take advantage of the fact that your data is stored in your files as NUL-terminated strings.
Assuming you're alternating nul-terminated strings from each file, and running on a POSIX platform so you can simply mmap() the input files:
typedef struct mapdata
{
    const char *ptr;
    size_t bytes;
} mapdata_t;

mapdata_t mapFile( const char *filename )
{
    mapdata_t data;
    struct stat sb;
    int fd = open( filename, O_RDONLY );
    fstat( fd, &sb );
    data.bytes = sb.st_size;

    /* assumes we have a NUL byte after the file data
       If the size of the file is an exact multiple of the
       page size, we won't have the terminating NUL byte! */
    data.ptr = mmap( NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0 );
    close( fd );
    return( data );
}

void unmapFile( mapdata_t data )
{
    munmap( ( void * ) data.ptr, data.bytes );
}

void mergeFiles( const char *file1, const char *file2, const char *output )
{
    char zeroByte = '\0';
    mapdata_t data1 = mapFile( file1 );
    mapdata_t data2 = mapFile( file2 );
    size_t strOffset1 = 0UL;
    size_t strOffset2 = 0UL;

    /* get a page-aligned buffer - a 64kB alignment should work */
    char *iobuffer = memalign( 64UL * 1024UL, 1024UL * 1024UL );

    /* memset the buffer to ensure the virtual mappings exist */
    memset( iobuffer, 0, 1024UL * 1024UL );

    /* use of direct IO should reduce memory pressure - the 1 MB
       buffer is already pretty large, and since we're not seeking
       the page cache is really only slowing things down */
    int fd = open( output, O_RDWR | O_TRUNC | O_CREAT | O_DIRECT, 0644 );
    FILE *outputfile = fdopen( fd, "wb" );
    setvbuf( outputfile, iobuffer, _IOFBF, 1024UL * 1024UL );

    /* loop until we reach the end of either mapped file */
    for ( ;; )
    {
        fputs( data1.ptr + strOffset1, outputfile );
        fwrite( &zeroByte, 1, 1, outputfile );
        fputs( data2.ptr + strOffset2, outputfile );
        fwrite( &zeroByte, 1, 1, outputfile );

        /* skip over each string, assuming there's one NUL
           byte in between strings */
        strOffset1 += 1 + strlen( data1.ptr + strOffset1 );
        strOffset2 += 1 + strlen( data2.ptr + strOffset2 );

        /* if either offset is past the end of its file, end the loop */
        if ( ( strOffset1 >= data1.bytes ) ||
             ( strOffset2 >= data2.bytes ) )
        {
            break;
        }
    }

    fclose( outputfile );
    unmapFile( data1 );
    unmapFile( data2 );
}
I've put in no error checking at all. You'll also need to add the proper header files.
Note also that the file data is assumed to NOT be an exact multiple of the system page size, thus ensuring that there's a NUL byte mapped after the file contents. If the size of the file is an exact multiple of the page size, you'll have to mmap() an additional page after the file contents to ensure that there's a NUL byte to terminate the last string.
Or you can rely on there being a NUL byte as the last byte of the file's contents. If that ever turns out to not be true, you'll likely get either a SEGV or corrupted data.
You are using two function calls per character (one for input, one for output). Function calls are slow (they pollute the instruction pipeline).
fgetc() and fputc() have getc()/putc() counterparts, which are (or can be) implemented as macros, enabling the compiler to inline the entire loop, except for the reading/writing of buffers once per 512, 1024 or 4096 characters processed (these will invoke system calls, but those are inevitable anyway).
Using read/write instead of buffered I/O will probably not be worth the effort; the extra bookkeeping will make your loop fatter. (BTW: using fwrite() to write one character is certainly wasteful, same for write().)
Maybe a larger output buffer could help, but I wouldn't count on it.
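A minimal sketch of the same merge loop using the macro variants (identical logic to the question's code, just cheaper per-character calls):

int c1, c2;
while ( (c1 = getc(file1)) != EOF ) {
    if ( c1 == '\0' ) {
        // copy one entry from file2, then the separating NUL byte
        while ( (c2 = getc(file2)) != EOF && c2 != '\0' ) {
            putc( c2, outputFile );
        }
        putc( '\0', outputFile );
    } else {
        putc( c1, outputFile );
    }
}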
If you can use threads, make one for file1 and another for file2. Make outputFile as big as you need, then have thread1 write file1 into outputFile, while thread2 seeks its output position in outputFile to the length of file1 + 1 and writes file2.
Edit:
This is not a correct answer for this case, but to prevent confusion I'll leave it here.
More discussion I found about it: improve performance in file IO in C

How to print the first 10 lines of a text file using Unix system calls?

I want to write my own version of the head Unix command, but my program is not working.
I am trying to print the first 10 lines of a text file, but instead the program prints all the lines. I specify the file name and number of lines to print via command-line arguments. I am only required to use Unix system calls such as read(), open() and close().
Here is the code:
#include "stdlib.h"
#include "stdio.h"
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#define BUFFERSZ 256
#define LINES 10
void fileError( char*, char* );
int main( int ac, char* args[] )
{
char buffer[BUFFERSZ];
int linesToRead = LINES;
int in_fd, rd_chars;
// check for invalid argument count
if ( ac < 2 || ac > 3 )
{
printf( "usage: head FILE [n]\n" );
exit(1);
}
// check for n
if ( ac == 3 )
linesToRead = atoi( args[2] );
// attempt to open the file
if ( ( in_fd = open( args[1], O_RDONLY ) ) == -1 )
fileError( "Cannot open ", args[1] );
int lineCount = 0;
//count no. of lines inside file
while (read( in_fd, buffer, 1 ) == 1)
{
if ( *buffer == '\n' )
{
lineCount++;
}
}
lineCount = lineCount+1;
printf("Linecount: %i\n", lineCount);
int Starting = 0, xline = 0;
// xline = totallines - requiredlines
xline = lineCount - linesToRead;
printf("xline: %i \n\n",xline);
if ( xline < 0 )
xline = 0;
// count for no. of line to print
int printStop = lineCount - xline;
printf("printstop: %i \n\n",printStop);
if ( ( in_fd = open( args[1], O_RDONLY ) ) == -1 )
fileError( "Cannot open ", args[1] );
//read and print till required number
while (Starting != printStop) {
read( in_fd, buffer, BUFFERSZ );
Starting++; //increment starting
}
//read( in_fd, buffer, BUFFERSZ );
printf("%s \n", buffer);
if ( close( in_fd ) == -1 )
fileError( "Error closing files", "" );
return 0;
}
void fileError( char* s1, char* s2 )
{
fprintf( stderr, "Error: %s ", s1 );
perror( s2 );
exit( 1 );
}
What am I doing wrong?
It's very odd that you open the file and scan it to count the total number of lines before going on to echo the first lines. There is absolutely no need to know in advance how many lines there are altogether before you start echoing lines, and it does nothing useful for you. If you're going to do it anyway, however, then you ought to close() the file before you re-open it. For your simple program, this is a matter of good form, not of correct function; the misbehavior you observe is unrelated to it.
There are several problems in the key portion of your program:
// read and print till required number
while ( Starting != printStop )
{
    read( in_fd, buffer, BUFFERSZ );
    Starting++; // increment starting
}
//read( in_fd, buffer, BUFFERSZ );
printf( "%s \n", buffer );
You do not check the return value of your read() call in this section. You must check it, because it tells you not only whether there was an error / end-of-file, but also how many bytes were actually read. You are not guaranteed to fill the buffer on any call, and only in this way can you know which elements of the buffer afterward contain valid data. (Pre-counting lines does nothing for you in this regard.)
You are performing raw read()s, and apparently assuming that each one will read exactly one line. That assumption is invalid. read() does not give any special treatment to line terminators, so you are likely to have reads that span multiple lines, and reads that read only partial lines (and maybe both in the same read). You therefore cannot count lines by counting read() calls. Instead, you must scan the valid characters in the read buffer and count the newlines among them.
You do not actually print anything inside your read loop. Instead, you wait until you've done all your reading, then print everything in the buffer after the last read. That's not going to serve your purpose when you don't get all the lines you need in the first read, because each subsequent successful read will clobber the data from the preceding one.
You pass the buffer to printf() as if it were a null-terminated string, but you do nothing to ensure that it is, in fact, terminated. read() does not do that for you.
I have trouble believing your claim that your program always prints all the lines of the designated file, but I can believe that it prints all the lines of the specific file you're testing it on. It might do that if the file is short enough that the whole thing fits into your buffer. Your program then might read the whole thing into the buffer on the first read() call (though it is not guaranteed to do so), and then read nothing on each subsequent call, returning 0 and leaving the buffer unchanged. When you finally print the buffer, it still contains the whole contents of the file.
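Putting those points together, a minimal sketch of a correct printing loop might look like this (assuming in_fd has just been reopened and linesToRead holds the requested count; no line-counting pass is needed at all):

int linesSeen = 0;
ssize_t n;
while ( linesSeen < linesToRead
        && ( n = read( in_fd, buffer, BUFFERSZ ) ) > 0 )
{
    ssize_t i;
    // scan only the n valid bytes, stopping after the last wanted newline
    for ( i = 0; i < n && linesSeen < linesToRead; i++ )
        if ( buffer[i] == '\n' )
            linesSeen++;
    write( STDOUT_FILENO, buffer, i ); // write only the bytes we scanned
}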

Error checking and the added length thereof - is there an analog to interrupts from embedded system programming?

Of course it is necessary to check whether certain operations occurred as expected: calls to malloc, fopen, fgetc
However, sometimes adding these checks makes the code way too long, especially for very simple functions. For example, I have a function where I must open a file, read in a few parameters, and allocate memory corresponding to what was just read in.
Therefore, the code ends up looking something like:
Open file
Check if file opened
Read parameter
Check if file EOF was not read (if it was, file format is incorrect)
Allocate memory
Check if memory allocation occurred as expected
Etc.
There appears to be quite a bit of redundancy here. At least for my simple program, if any of the above checks fail, I simply report the error and return control to the operating system. The code ends up looking something like this:
if ( filePointer == NULL )
{
    perror( "Error X occurred" );
    exit( EXIT_FAILURE );
}
So, a simple few-line function turns into perhaps 20 or more lines because of this error checking. Is there some way to aggregate the handling of these errors?
Just wondering if there was something that I missed.
EDIT: For example, is there a way to interrupt the flow of program when certain events occur? I.e. if EOF is read prematurely, then jump to some function that informs the user (something like an interrupt in embedded systems).
This is a question that every C programmer asks at some point in his/her career. You are correct that some portions of your code will have more lines of error handling code than actual useful productive code. One technique I've used in the past to streamline error handling is to implement an error function, like this
static FILE *fpin = NULL;
static FILE *fpout = NULL;
static BYTE *buffer = NULL;

static void error( char *msg, char *name )
{
    if ( msg != NULL )
    {
        if ( name != NULL )
            fprintf( stderr, "%s: %s\n", msg, name );
        else
            fprintf( stderr, "%s\n", msg );
    }

    if ( fpin != NULL )
        fclose( fpin );
    if ( fpout != NULL )
        fclose( fpout );
    if ( buffer != NULL )
        free( buffer );

    exit( 1 );
}
which then gets used like this
int main( int argc, char *argv[] )
{
    if ( argc != 3 )
        error( "Usage: ChangeBmp infile outfile", NULL );

    if ( (fpin = fopen( argv[1], "rb" )) == NULL )
        error( "Unable to open input file", argv[1] );

    if ( (fpout = fopen( argv[2], "wb" )) == NULL )
        error( "Unable to open output file", argv[2] );

    size = sizeof( bmphead );
    if ( fread( &bmphead, 1, size, fpin ) != size )
        error( "Unable to read header", NULL );

    size = sizeof( bmpinfo );
    if ( fread( &bmpinfo, 1, size, fpin ) != size )
        error( "Unable to read info", NULL );
Of course, this only works if the error function has access to all of the necessary variables. For simple, single file programs, I just make the necessary variables global. In a larger project, you might have to manage the variables more carefully.
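For instance, one sketch of managing those variables more carefully is to hand the error function a small context struct instead of relying on globals (the names here are illustrative, not from the original program):

typedef struct cleanup
{
    FILE *fpin;
    FILE *fpout;
    void *buffer;
} cleanup_t;

static void error_ctx( cleanup_t *ctx, char *msg, char *name )
{
    if ( msg != NULL )
    {
        if ( name != NULL )
            fprintf( stderr, "%s: %s\n", msg, name );
        else
            fprintf( stderr, "%s\n", msg );
    }
    if ( ctx->fpin != NULL )
        fclose( ctx->fpin );
    if ( ctx->fpout != NULL )
        fclose( ctx->fpout );
    free( ctx->buffer ); /* free( NULL ) is a no-op */
    exit( 1 );
}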
One common way to address this, at least to reduce apparent code size, is wrapping the various checks with macros: e.g.,
#define CHECK_NULL(expr) do { \
    if ((expr) == NULL) { \
        perror("Error X"); \
        exit(EXIT_FAILURE); \
    } \
} while (0)

CHECK_NULL(p = malloc(size));
CHECK_NULL(filePointer = fopen("foo.txt", "r"));
As for interrupting control flow, other languages often use exceptions, which are also possible in C. However, this tends to be platform-specific and isn't usually the way it's done in practice with C.
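That said, if you want the interrupt-like jump your edit describes, standard C does offer setjmp()/longjmp() from <setjmp.h>. A minimal sketch (the file name and helper function are made up for illustration):

#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf error_env;

static void read_parameters( FILE *fp )
{
    int value;
    // "raise" on premature EOF or a bad format: jump back to setjmp()
    if ( fscanf( fp, "%d", &value ) != 1 )
        longjmp( error_env, 1 );
    printf( "read parameter: %d\n", value );
}

int main( void )
{
    if ( setjmp( error_env ) != 0 )
    {
        // control lands here after any longjmp( error_env, ... )
        fputs( "Premature EOF or bad file format\n", stderr );
        return EXIT_FAILURE;
    }

    FILE *fp = fopen( "params.txt", "r" );
    if ( fp == NULL )
        longjmp( error_env, 1 );

    read_parameters( fp );
    fclose( fp );
    return EXIT_SUCCESS;
}

Note that longjmp() does not run any cleanup on the way out, so open files and allocations still have to be released by the code that catches the jump.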
