Overwrite the Contents of a File in C - c

I am writing code that forks multiple processes. They all share a file called "character". I want every process to read the only character in the file and then replace it with its own character so the next process can do the same. The file is the only way the processes can communicate with each other. How can I erase the only character in the file and put a new one in its place? I was advised to use freopen() (which closes the file and reopens it, erasing its previous contents), but I am not sure it is the best way to achieve this.

You should not have to reopen the file. That gains you nothing. If you're worried about each process buffering input or output, disable buffering if you want to use FILE *-based stdio functions.
But if I'm reading your question correctly (you want each process to replace the one character in the file when the file holds a specific value, and that value is different for each process), this will do what you want, using POSIX open(), pread(), and pwrite(). You're already using POSIX fork(), so using low-level POSIX IO makes things a lot simpler. Note that pread() and pwrite() eliminate the need for seeking.
I'll say this is what I think you're trying to do:
// header files and complete error checking is omitted for clarity
int fd = open( filename, O_RDWR );
// fork() here?
// loop until we read the char we want from the file
for ( ;; )
{
char data;
ssize_t result = pread( fd, &data, sizeof( data ), 0 );
// pread failed
if ( result != sizeof( data ) )
{
break;
}
// if data read matches this process's value, replace the value
// (replace 'a' with 'b', 'c', 'z' or '*' - whatever value you
// want the current process to wait for)
if ( data == 'a' )
{
data = 'b';
result = pwrite( fd, &data, sizeof( data ), 0 );
break;
}
}
close( fd );
For any decent number of processes, that's going to put a lot of stress on your filesystem, because every process polls the file in a tight loop.
If you really want to start with fopen() and use that family of calls, this might work depending on your implementation:
FILE *fp = fopen( filename, "rb+" );
// disable buffering
setbuf( fp, NULL );
// fork() here???
// loop until the desired char value is read from the file
for ( ;; )
{
char data;
// with fread(), we need to fseek()
fseek( fp, 0, SEEK_SET );
size_t result = fread( &data, 1, 1, fp );
if ( result != 1 )
{
break;
}
if ( data == 'a' )
{
data = 'b';
fseek( fp, 0, SEEK_SET );
fwrite( &data, 1, 1, fp );
break;
}
}
fclose( fp );
Again, that assumes I'm reading your question properly. Note that the POSIX rules John Bollinger mentioned in his comments regarding multiple handles don't apply - because the streams are explicitly not buffered.
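To make the hand-off concrete, here is a minimal sketch of the pread()/pwrite() approach with the fork() calls filled in. The names take_turn and run_relay, the 'a', 'b', 'c', ... turn order, and the 1 ms pause between polls are all assumptions made for illustration; they are not part of the original question or answer.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Wait until the file's single byte equals `expect`, then overwrite it
   with `next`. Returns 0 on success, -1 on an I/O error. */
static int take_turn(int fd, char expect, char next)
{
    struct timespec delay = { 0, 1000000L };   /* 1 ms between polls (assumed) */
    for (;;) {
        char data;
        if (pread(fd, &data, 1, 0) != 1)
            return -1;
        if (data == expect)
            return pwrite(fd, &next, 1, 0) == 1 ? 0 : -1;
        nanosleep(&delay, NULL);               /* back off instead of spinning */
    }
}

/* Hypothetical driver: seed the file with 'a', fork `nprocs` children
   (child i waits for 'a' + i and writes 'a' + i + 1), and return the
   byte left in the file, or -1 on failure. */
static int run_relay(const char *path, int nprocs)
{
    assert(nprocs >= 0 && nprocs <= 25);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    char seed = 'a';
    if (pwrite(fd, &seed, 1, 0) != 1) {
        close(fd);
        return -1;
    }
    int started = 0;
    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)   /* child: take one turn, then exit without returning */
            _exit(take_turn(fd, (char)('a' + i), (char)('a' + i + 1)) == 0 ? 0 : 1);
        started++;
    }
    int failed = (started != nprocs);
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    char last;
    ssize_t n = pread(fd, &last, 1, 0);
    close(fd);
    return (failed || n != 1) ? -1 : last;
}
```

Because pread() and pwrite() take an explicit offset, the children can share the descriptor inherited from the parent without disturbing each other's file position. Only one process's expected value is ever in the file at a time, which is what keeps the turns from colliding in this sketch.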

Related

pointer in binary files after using fread command

I'm learning about file pointers at the moment and came across this code, which was given as an example.
I tried replicating it in Visual Studio, but I keep getting run-time errors that are not relevant at the moment.
void main()
{
FILE *cfPtr; /* credit.dat file pointer */
/* create clientData with default information */
struct clientData client = { 0, "", "", 0.0 };
if ( ( cfPtr = fopen( "credit.dat", "rb" ) ) == NULL )
printf( "File could not be opened.\n" );
else {
printf( "%-6s%-16s%-11s%10s\n", "Acct", "Last Name",
"First Name", "Balance" );
/* read all records from file (until eof) */
fread( &client, sizeof( struct clientData ), 1, cfPtr );
while ( !feof( cfPtr ) ) {
/* display record */
if ( client.acctNum != 0 )
printf( "%-6d%-16s%-11s%10.2f\n",
client.acctNum, client.lastName,
client.firstName, client.balance );
fread( &client, sizeof( struct clientData ), 1, cfPtr );
} /* end while */
fclose( cfPtr ); /* fclose closes the file */
} /* end else */
} /* end main */
My question is this: if the file is empty, what does struct client contain?
Also, if the file has only one struct, wouldn't the first fread move the file pointer past the struct so that it sits at EOF, making the while condition false, so that the struct is never printed on screen?
7.21.8.1 The fread function
...
Synopsis
1 #include <stdio.h>
size_t fread(void * restrict ptr,
size_t size, size_t nmemb,
FILE * restrict stream);
Description
2 The fread function reads, into the array pointed to by ptr, up to nmemb elements
whose size is specified by size, from the stream pointed to by stream. For each
object, size calls are made to the fgetc function and the results stored, in the order
read, in an array of unsigned char exactly overlaying the object. The file position
indicator for the stream (if defined) is advanced by the number of characters successfully
read. If an error occurs, the resulting value of the file position indicator for the stream is
indeterminate. If a partial element is read, its value is indeterminate.
Returns
3 The fread function returns the number of elements successfully read, which may be
less than nmemb if a read error or end-of-file is encountered. If size or nmemb is zero,
fread returns zero and the contents of the array and the state of the stream remain
unchanged.
C 2011 Online Draft
Emphasis added; if fread returns something less than 1 for this code, then you should assume that client doesn't contain anything meaningful.
Never use feof as your loop condition - it won't return true until you try to read past the end of the file, so the loop will execute once too often. Instead, check the result of your input operation like so:
while ( fread( &client, sizeof client, 1, cfPtr ) == 1 )
{
// process client normally
}
if ( feof( cfPtr ) )
fputs( "End of file detected on input\n", stderr );
else
fputs( "Error on input\n", stderr );
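As a rough, self-contained illustration of that loop, the sketch below writes a file and reads it back. The original post does not show struct clientData, so the field sizes here are guesses, and write_clients and list_clients are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout: the question never shows this struct, so the array
   sizes are illustrative only. */
struct clientData {
    int acctNum;
    char lastName[15];
    char firstName[10];
    double balance;
};

/* Hypothetical helper: create `path` holding exactly n records. */
static int write_clients(const char *path, const struct clientData *recs, size_t n)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t written = (n > 0) ? fwrite(recs, sizeof *recs, n, fp) : 0;
    int ok = (written == n);
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Print every record with a non-zero account number. Returns the number
   of complete records read, or -1 on open or read error. */
static int list_clients(const char *path)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    struct clientData client;
    int count = 0;
    /* The loop body runs only when a whole record was actually read. */
    while (fread(&client, sizeof client, 1, fp) == 1) {
        if (client.acctNum != 0)
            printf("%-6d%-16s%-11s%10.2f\n", client.acctNum,
                   client.lastName, client.firstName, client.balance);
        count++;
    }
    int ok = !ferror(fp);   /* distinguish end-of-file from a read error */
    fclose(fp);
    return ok ? count : -1;
}
```

With an empty file the loop body never runs, so client is never inspected. With a file holding one record, the first fread succeeds and the record is printed; only the second fread, which tries to read past the end, returns 0.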

Why is my program perceiving an EOF condition way before my file actually ends?

My code reads line by line from a text file and stores the lines in a massive array of char pointers. When I use an ordinary text file, this works with no issues. However, when I try to read from the 'dictionary.txt' file I'm supposed to be using, my program detects EOF after reading the first of MANY lines in the file.
int i = 0;
while( 1 ) {
size_t size = 50;
fseek( dicFile, 0L, getline( &dictionary[i++], &size, dicFile) );
printf( "%d:\t%s", i, dictionary[i - 1] );
if( feof( dicFile ) ) {
fclose( dicFile );
break;
}
}
puts("finished loading dictionary");
Here is the start of the dictionary file I'm attempting to load:
A
A's
AA's
AB's
ABM's
AC's
ACTH's
AI's
AIDS's
AM's
AOL
AOL's
ASCII's
ASL's
ATM's
ATP's
AWOL's
AZ's
The output I get from this portion of the program is:
1: A
2: finished loading dictionary
Thanks for any help.
Your third argument to fseek() is nuts: it must be SEEK_SET, SEEK_CUR, or SEEK_END, not the return value of getline(). I've seen at least one implementation that treated every out-of-range third argument as SEEK_END. Oops.
You should just call getline() in the loop instead. In fact, just check the return value of getline() for -1 and get rid of that feof().
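A minimal sketch of that suggestion, assuming a POSIX system where getline() is available. The function name load_lines and the max_lines limit are made up for this example; the original program's dictionary array would take the place of lines.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Read up to max_lines lines from fp into lines[]. Each stored line is
   allocated by getline() and must be freed by the caller.
   Returns the number of lines stored. */
static size_t load_lines(FILE *fp, char **lines, size_t max_lines)
{
    assert(fp != NULL && lines != NULL);
    size_t count = 0;
    while (count < max_lines) {
        char *line = NULL;   /* NULL with capacity 0 lets getline() allocate */
        size_t cap = 0;
        if (getline(&line, &cap, fp) == -1) {
            free(line);      /* getline() may allocate even when it fails */
            break;           /* -1 means end-of-file or error: stop here */
        }
        lines[count++] = line;
    }
    return count;
}
```

The return value of getline() is the only loop condition, so there is no fseek() and no feof() call, and no line is printed after the read has already failed.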

Improving IO performance for merging two files in C

I wrote a function which merges two large files (file1,file2) into a new file (outputFile).
Each file is line-based, and entries are separated by a \0 byte. Both files contain the same number of null bytes.
One example file with two entries could look like this A\nB\n\0C\nZ\nB\n\0
Input:
file1: A\nB\0C\nZ\nB\n\0
file2: BBA\nAB\0T\nASDF\nQ\n\0
Output
outputFile: A\nB\nBBA\nAB\0C\nZ\nB\nT\nASDF\nQ\n\0
FILE * outputFile = fopen(...);
setvbuf ( outputFile , NULL , _IOFBF , 1024*1024*1024 );
FILE * file1 = fopen(...);
FILE * file2 = fopen(...);
int c1, c2;
while((c1=fgetc(file1)) != EOF) {
if(c1 == '\0'){
while((c2=fgetc(file2)) != EOF && c2 != '\0') {
fwrite(&c2, sizeof(char), 1, outputFile);
}
char nullByte = '\0';
fwrite(&nullByte, sizeof(char), 1, outputFile);
}else{
fwrite(&c1, sizeof(char), 1, outputFile);
}
}
Is there a way to improve this IO performance of this function? I increased the buffer size of outputFile to 1 GB by using setvbuf. Would it help to use posix_fadvise on file1 and file2?
You're doing IO character-by-character. That is going to be needlessly and painfully S-L-O-W, even with buffered streams.
Take advantage of the fact that your data is stored in your files as NUL-terminated strings.
Assuming you're alternating nul-terminated strings from each file, and running on a POSIX platform so you can simply mmap() the input files:
typedef struct mapdata
{
const char *ptr;
size_t bytes;
} mapdata_t;
mapdata_t mapFile( const char *filename )
{
mapdata_t data;
struct stat sb;
int fd = open( filename, O_RDONLY );
fstat( fd, &sb );
data.bytes = sb.st_size;
/* assumes we have a NUL byte after the file data
If the size of the file is an exact multiple of the
page size, we won't have the terminating NUL byte! */
data.ptr = mmap( NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0 );
close( fd );
return( data );
}
void unmapFile( mapdata_t data )
{
munmap( ( void * ) data.ptr, data.bytes );
}
void mergeFiles( const char *file1, const char *file2, const char *output )
{
char zeroByte = '\0';
mapdata_t data1 = mapFile( file1 );
mapdata_t data2 = mapFile( file2 );
size_t strOffset1 = 0UL;
size_t strOffset2 = 0UL;
/* get a page-aligned buffer - a 64kB alignment should work */
char *iobuffer = memalign( 64UL * 1024UL, 1024UL * 1024UL );
/* memset the buffer to ensure the virtual mappings exist */
memset( iobuffer, 0, 1024UL * 1024UL );
/* use of direct IO should reduce memory pressure - the 1 MB
buffer is already pretty large, and since we're not seeking
the page cache is really only slowing things down */
int fd = open( output, O_RDWR | O_TRUNC | O_CREAT | O_DIRECT, 0644 );
FILE *outputfile = fdopen( fd, "wb" );
setvbuf( outputfile, iobuffer, _IOFBF, 1024UL * 1024UL );
/* loop until we reach the end of either mapped file */
for ( ;; )
{
fputs( data1.ptr + strOffset1, outputfile );
fwrite( &zeroByte, 1, 1, outputfile );
fputs( data2.ptr + strOffset2, outputfile );
fwrite( &zeroByte, 1, 1, outputfile );
/* skip over the string, assuming there's one NUL
byte in between strings */
strOffset1 += 1 + strlen( data1.ptr + strOffset1 );
strOffset2 += 1 + strlen( data2.ptr + strOffset2 );
/* if either offset is too big, end the loop */
if ( ( strOffset1 >= data1.bytes ) ||
( strOffset2 >= data2.bytes ) )
{
break;
}
}
fclose( outputfile );
unmapFile( data1 );
unmapFile( data2 );
}
I've put in no error checking at all. You'll also need to add the proper header files.
Note also that the file data is assumed to NOT be an exact multiple of the system page size, thus ensuring that there's a NUL byte mapped after the file contents. If the size of the file is an exact multiple of the page size, you'll have to mmap() an additional page after the file contents to ensure that there's a NUL byte to terminate the last string.
Or you can rely on there being a NUL byte as the last byte of the file's contents. If that ever turns out to not be true, you'll likely get either a SEGV or corrupted data.
You are using two function calls per character (one for input, one for output). Function calls are slow (they pollute the instruction pipeline).
fgetc() and fputc() have getc() / putc() counterparts, which are (or can be) implemented as macros, enabling the compiler to inline the entire loop except for the reading/writing of buffers, twice per 512, 1024, or 4096 characters processed. (These will invoke system calls, but those are inevitable anyway.)
Using read/write instead of buffered I/O will probably not be worth the effort; the extra bookkeeping will make your loop fatter. (By the way, using fwrite() to write one character is certainly wasteful, and the same goes for write().)
Maybe a larger output buffer could help, but I wouldn't count on that.
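A sketch of what the getc()/putc() version of the questioner's loop might look like. The helper names copy_entry and merge_streams are invented here, and whether this is measurably faster depends on the C library, so treat it as an experiment to benchmark, not a guaranteed improvement.

```c
#include <assert.h>
#include <stdio.h>

/* Copy bytes from `in` to `out` until a NUL byte or EOF has been consumed.
   Returns the last value read: '\0' or EOF. The NUL itself is not written. */
static int copy_entry(FILE *in, FILE *out)
{
    int c;
    while ((c = getc(in)) != EOF && c != '\0')
        putc(c, out);
    return c;
}

/* Merge NUL-separated entries: entry k of the output is entry k of file1
   followed by entry k of file2, then a single NUL byte. */
static void merge_streams(FILE *file1, FILE *file2, FILE *out)
{
    assert(file1 != NULL && file2 != NULL && out != NULL);
    while (copy_entry(file1, out) == '\0') {
        copy_entry(file2, out);
        putc('\0', out);
    }
}
```

This keeps the same byte-for-byte behavior as the original loop but replaces each one-byte fwrite() with putc(), so the per-character work is mostly a buffer store.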
If you can use threads, make one for file1 and another for file2.
Make the outputFile as big as you need, then have thread1 write file1 into outputFile,
while thread2 seeks in outputFile to the length of file1 + 1 and writes file2.
Edit:
This is not a correct answer for this case, but to prevent confusion I'll leave it here.
More discussion I found about it: improve performance in file IO in C

Write to the same file with different processes in order of occurence

I am working on a UNIX-based operating system (Lubuntu 14.10). I have several processes that need to print a message to the same file and to the standard output.
When I print my message to the screen, it works the way I want, in the order of occurrence. E.g.:
Process1_message1
Process2_message1
Process3_message1
Process1_message2
Process2_message2
Process3_message2
...
However, when I check the output file it is like below:
Process1_message1
Process1_message2
Process2_message1
Process2_message2
Process3_message1
Process3_message2
...
I use fprintf(FILE *ptr, char *str) to write the message to the file.
Note: I opened the file with following format in the main process:
fptr=fopen("output.txt", "a");
where fptr is a global FILE *.
Any help will be appreciated. Thank you!
fprintf() isn't going to work. It's prone to being translated into multiple calls to write() to actually write out the data, which can produce output like what you posted. You call fprintf() once, and under the covers it may make multiple calls to write() to actually write the data into the file.
You need to use open( filename, O_WRONLY | O_CREAT | O_APPEND, 0600 ), and write data something like this in order to ensure you only call write() once, which is guaranteed to be atomic:
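ssize_t myprintf( int fd, const char *fmt, ... )
{
    char buffer[ 1024 ];
    ssize_t bytesWritten = -1;
    va_list argp;
    va_list argp2;
    va_start( argp, fmt );
    // keep a copy: a va_list can't be reused once vsnprintf() has consumed it
    va_copy( argp2, argp );
    int bytes = vsnprintf( buffer, sizeof( buffer ), fmt, argp );
    va_end( argp );
    if ( bytes >= 0 && ( size_t ) bytes < sizeof( buffer ) )
    {
        bytesWritten = write( fd, buffer, bytes );
    }
    // buffer was too small, get a bigger one
    else if ( bytes >= 0 )
    {
        char *bufptr = malloc( bytes + 1 );
        if ( bufptr != NULL )
        {
            bytes = vsnprintf( bufptr, bytes + 1, fmt, argp2 );
            bytesWritten = write( fd, bufptr, bytes );
            free( bufptr );
        }
    }
    va_end( argp2 );
    // -1 means formatting, allocation, or write() failed
    return( bytesWritten );
}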
Most likely, your problem is that the file output is fully buffered, so the output from each process doesn't appear until the standard I/O buffer for the stream (in that process) is full.
You can probably work around it sufficiently by setting line buffering:
FILE *fptr = fopen("output.txt", "a");
if (fptr != 0)
{
setvbuf(fptr, 0, _IOLBF, BUFSIZ);
/* ...code using fptr, including your fork() calls... */
fclose(fptr);
}
Every time a process writes a line to the buffer, it will be flushed. You might run into problems if your output lines are longer than BUFSIZ; then you might want to increase the size passed to setvbuf() to the largest line length you need written atomically.
If that still isn't good enough, or if you need to be able to write groups of lines at one time, you'll have to go to a solution using file descriptors as in Andrew Henle's answer. You might want to look at the O_SYNC and O_DSYNC options to open().
Flushing buffers is different in stdio when you are writing to a terminal (isatty(fileno(fptr)) returns true; see isatty(3)) than when you output to a file. For a file, stdio output only does a write(2) system call when the buffer fills up, which makes all the messages from one process appear together (each process's buffer is flushed in one go when it exits). On ttys, output is flushed when the buffer fills up or when a \n character is output to the buffer (as a compromise between buffering and not buffering).
You can force buffer flushing with fflush(fptr); after fprintf(fptr, ...); or even do fflush(NULL); (which flushes all output buffers in one call).
But be careful: it is the write() calls, not the fprintf() calls, that control atomicity, so if you have to write several pages of output in one fprintf call, be ready to accept interleaved output.
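A rough sketch of the fflush() approach with several forked writers. The helper names log_line and run_writers and the "ProcessN_messageM" format are assumptions chosen to mirror the question's sample output. The ordering between processes still depends on scheduling, so this only shows that each short message reaches the file as soon as it is produced.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Format one message and flush it right away, so it reaches the file
   when it is produced and not when the process exits. */
static int log_line(FILE *fp, const char *who, int n)
{
    assert(fp != NULL && who != NULL);
    if (fprintf(fp, "%s_message%d\n", who, n) < 0)
        return -1;
    return fflush(fp);
}

/* Hypothetical driver: fork nprocs children that each append nmsgs
   lines to `path`. Returns 0 if every child exited cleanly. */
static int run_writers(const char *path, int nprocs, int nmsgs)
{
    FILE *fp = fopen(path, "a");   /* "a" opens the file in append mode */
    if (fp == NULL)
        return -1;
    int started = 0;
    for (int p = 0; p < nprocs; p++) {
        pid_t pid = fork();        /* the stdio buffer is empty at this point */
        if (pid < 0)
            break;
        if (pid == 0) {
            char who[32];
            snprintf(who, sizeof who, "Process%d", p + 1);
            int rc = 0;
            for (int m = 1; m <= nmsgs && rc == 0; m++)
                rc = log_line(fp, who, m);
            _exit(rc == 0 ? 0 : 1);   /* _exit: skip the parent's atexit flushing */
        }
        started++;
    }
    int failed = (started != nprocs);
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    fclose(fp);
    return failed ? -1 : 0;
}
```

Each message here is far smaller than the stdio buffer, so every fflush() should translate into one short write() on an append-mode descriptor. Longer messages may be split across several writes, which is the caveat raised above.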

Reading data from stdin in C

I'm trying to read from stdin and output the data, things work, EXCEPT that it's not outputting the new incoming data. I'm not quite sure where is the issue. I'm guessing it has something to do when determining the stdin size. Any help would be greatly appreciated! Thanks
tail -f file | my_prog
Updated
#include <stdio.h>
#include <sys/stat.h>
long size(FILE *st_in) {
struct stat st;
if (fstat(fileno(st_in), &st) == 0)
return st.st_size;
return -1;
}
int main (){
FILE *file = stdin;
char line [ 128 ];
while ( fgets ( line, sizeof line, file ) != NULL )
fputs ( line, stdout ); /* write the line */
long s1, s2;
s1 = size(file);
for (;;) {
s2 = size (file);
if (s2 != s1) {
if (!fseek (file, s1, SEEK_SET)) {
while ( fgets ( line, sizeof line, file ) != NULL ) {
fputs ( line, stdout ); /* write the line */
}
}
s1 = s2;
usleep(300000);
}
}
return 0;
}
Edit: Fixed!
After a FILE * has reached EOF, it stays in a state where it will read no more data until you clear the 'EOF' bit either with clearerr() or with fseek(). However, if standard input is connected to a terminal, then that is not a seekable device, so instead of clearing the error, it might not do anything useful:
POSIX says:
The behavior of fseek() on devices which are incapable of seeking is implementation-defined.
Your loop entry condition is suspect; you need to sleep before starting it, and you need to sleep on each iteration. Indeed, normally you write tail -f without worrying about the file size; you sleep, try to read until the next 'EOF', reset the file EOF indicator, and repeat. Note, too, that the size of a pipe or terminal is not defined.
Separately, it is unconventional to call a FILE * argument to a function filename; it has completely the wrong connotations. A filename is a string.
This is not really standard C:
size(file);
Call stat() to get file information - organization type of a file, file size and permissions.
What your code does is to eventually set the file pointer to the end of the file, as it tries to read through it. Consider stat() (or fstat() on an open file) instead.
rewind() resets the file pointer to the start of the file, fseek() will place it anywhere you need.
tail -f repeatedly tries the file at the EOF point with a short sleep in between tries.... It does not "consider" EOF to be an error. It remembers the current file offset for the EOF, then fseeks() using SEEK_END, then calls ftell(), and compares the offsets. If there is a difference it then fseek()-s back to the last known endpoint and reads the data.
This description is from old unix source. I'm sure it has been tweaked since then.
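The "read until EOF, reset the EOF indicator, sleep, repeat" idea from the answers above can be sketched roughly as follows. The names drain and follow, the one-second pause, and the max_polls limit are assumptions for illustration; a real tail -f would loop until interrupted.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Copy everything currently available from `in` to `out`, then clear the
   EOF indicator so a later call can pick up newly appended data.
   Returns the number of fgets() chunks copied. */
static int drain(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    char line[128];
    int copied = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        fputs(line, out);
        copied++;
    }
    clearerr(in);   /* forget the EOF condition; the stream stays usable */
    return copied;
}

/* Poll `in` a fixed number of times, pausing one second between polls. */
static void follow(FILE *in, FILE *out, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        drain(in, out);
        fflush(out);
        sleep(1);
    }
}
```

This version never asks for the size of the input, so it also applies to pipes, where st_size is not meaningful. A call such as follow(stdin, stdout, 10) would poll standard input ten times.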
