Exclude terminator when writing to file - c

I'm receiving a message through a socket in C which is written to a buffer. The message has a terminator (##) added to the end so I know when to stop writing to the buffer, however, the buffer is much larger than the message. Is there a way to write only up to and not including the terminator, throwing away the rest of the buffer?
Maybe something with a position pointer?
char *pos;
FILE* fp = fopen("tempfile", "w+");
pos = strstr(buffer, "##");
pos = '\0'; // Maybe I could stop the writing with NULL?
fwrite(buffer, 1, BUF_SIZE /*too big*/, fp);
fclose(fp);
I need to get the size of the portion of the buffer that contains the message, or maybe only write to my file up to a certain character. Either way would work.

The statement fwrite(buffer, 1, BUF_SIZE, fp) writes BUF_SIZE bytes regardless of what the buffer contains; fwrite does not treat '\0' as a string terminator and will not stop at one.
So the only way is to tell fwrite the correct number of elements to write, which is easily calculated with pointer arithmetic:
fwrite(buffer, 1, pos - buffer, fp);
Of course, you'll have to check that pos != NULL and so on, but I think this is straightforward.
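A minimal sketch of that idea, with the NULL check included. It assumes you know how many bytes the socket actually delivered (passed here as len) and, since a receive buffer is not guaranteed to be NUL-terminated, it uses a bounded search instead of strstr. The helper names find_terminator and write_message are made up for this illustration.

```c
#include <stdio.h>

/* Return a pointer to the first "##" within buf[0..len), or NULL if absent.
   Bounded search: does not require buf to be NUL-terminated. */
const char *find_terminator(const char *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '#' && buf[i + 1] == '#') {
            return buf + i;
        }
    }
    return NULL;
}

/* Write only the message part (everything before "##") to fp.
   Returns the number of bytes written, or -1 if no terminator was
   found or the write came up short. */
long write_message(FILE *fp, const char *buf, size_t len)
{
    const char *pos = find_terminator(buf, len);
    if (pos == NULL) {
        return -1;
    }
    size_t msg_len = (size_t)(pos - buf);  /* pointer arithmetic gives the size */
    if (fwrite(buf, 1, msg_len, fp) != msg_len) {
        return -1;
    }
    return (long)msg_len;
}
```

With the question's variables this would be called as write_message(fp, buffer, bytes_received), where bytes_received is whatever your receive call reported.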

Related

character by character reading from a file in C

How to read text from a file into a dynamic array of characters?
I found a way to count the number of characters in a file and create a dynamic array, but I can't figure out how to assign characters to the elements of the array?
FILE *text;
char* Str;
int count = 0;
char c;
text = fopen("text.txt", "r");
while(c = (fgetc(text))!= EOF)
{
count ++;
}
Str = (char*)malloc(count * sizeof(char));
fclose(text);
There is no portable, standard-conforming way in C to know in advance how many bytes may be read from a FILE stream.
First, the stream might not even be seekable - it can be a pipe or a terminal or even a socket connection. On such streams, once you read the input it's gone, never to be read again. You can push back one char value, but that's not enough to be able to know how much data remains to be read, or to reread the entire stream.
And even if the stream is to a file that you can seek on, you can't use fseek()/ftell() in portable, strictly-conforming C code to know how big the file is.
If it's a binary stream, you can not use fseek() to seek to the end of the file - that's explicitly undefined behavior per the C standard:
... A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
Footnote 268 even says:
Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream ...
So you can't portably use fseek() in a binary stream.
And you can't use ftell() to get a byte count for a text stream. Per the C standard again:
For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.
Systems do exist where the value returned from ftell() is nothing like a byte count.
The only portable, conforming way to know how many bytes you can read from a stream is to actually read them, and you can't rely on being able to read them again.
If you want to read the entire stream into memory, you have to continually reallocate memory, or use some other dynamic scheme.
This is a very inefficient but portable and strictly-conforming way to read the entire contents of a stream into memory (all error checking and header files are omitted for algorithm clarity and to keep the vertical scrollbar from appearing - it really needs error checking and will need the proper header files):
// get input stream with `fopen()` or some other manner
FILE *input = ...
size_t count = 0;
char *data = NULL;
for ( ;; )
{
int c = fgetc( input );
if ( c == EOF )
{
break;
}
data = realloc( data, count + 1 );
data[ count ] = c;
count++;
}
// optional - terminate the data with a '\0'
// to treat the data as a C-style string
data = realloc( data, count + 1 );
data[ count ] = '\0';
count++;
That will work no matter what the stream is.
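The loop above calls realloc once per byte. A common refinement, sketched here under the same portability constraints, is to read in chunks with fread and double the buffer whenever it fills up. The function name read_all and the 4096-byte starting capacity are arbitrary choices for this example, not something from the answer above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read everything remaining on `in` into a malloc'd, NUL-terminated buffer.
   On success returns the buffer and stores the byte count (excluding the
   terminator) in *out_len; on failure returns NULL. The caller frees it. */
char *read_all(FILE *in, size_t *out_len)
{
    size_t cap = 4096;              /* initial capacity, an arbitrary choice */
    size_t len = 0;
    char *data = malloc(cap);
    if (data == NULL) {
        return NULL;
    }
    for (;;) {
        /* keep one byte in reserve for the terminating '\0' */
        size_t n = fread(data + len, 1, cap - len - 1, in);
        len += n;
        if (len < cap - 1) {
            break;                  /* short read: end of file or error */
        }
        /* buffer is full: double it and keep reading */
        char *tmp = realloc(data, cap * 2);
        if (tmp == NULL) {
            free(data);
            return NULL;
        }
        data = tmp;
        cap *= 2;
    }
    if (ferror(in)) {
        free(data);
        return NULL;
    }
    data[len] = '\0';
    *out_len = len;
    return data;
}
```

Like the char-by-char version, this never seeks, so it should behave the same on pipes and terminals as on regular files.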
On a POSIX-style system such as Linux, you can use fileno() and fstat() to get the size of a file (again, all error checking and header files are omitted):
char *data = NULL;
FILE *input = ...
int fd = fileno( input );
struct stat sb;
fstat( fd, &sb );
if ( S_ISREG( sb.st_mode ) )
{
// sb.st_size + 1 for C-style string
data = malloc( sb.st_size + 1 ); // assign to the outer data; redeclaring it here would shadow it
data[ sb.st_size ] = '\0';
}
// now if data is not NULL you can read into the buffer data points to
// if data is NULL, see above code to read char-by-char
// this tries to read the entire stream in one call to fread()
// there are a lot of other ways to do this
size_t totalRead = 0;
while ( totalRead < sb.st_size )
{
size_t bytesRead = fread( data + totalRead, 1, sb.st_size - totalRead, input );
totalRead += bytesRead;
}
The above code should work on Windows, too. You may get some compiler warnings or have to use _fileno(), _fstat() and struct _stat instead.
You may also need to define the S_ISREG() macro on Windows:
#define S_ISREG(m) (((m) & S_IFMT) == S_IFREG)
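For reference, here is roughly what the fstat() approach could look like with the omitted error checking put back in. This is a POSIX-only sketch; the name read_regular_file is invented for the example, and it assumes the stream is still positioned at the start of the file.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Read a regular file into a NUL-terminated buffer sized with fstat().
   Returns NULL if the stream is not a regular file or on error; in that
   case fall back to a read-until-EOF loop. The caller frees the result. */
char *read_regular_file(FILE *input, size_t *out_len)
{
    struct stat sb;
    if (fstat(fileno(input), &sb) != 0 || !S_ISREG(sb.st_mode)) {
        return NULL;
    }
    size_t size = (size_t)sb.st_size;
    char *data = malloc(size + 1);          /* +1 for C-style string */
    if (data == NULL) {
        return NULL;
    }
    size_t total = 0;
    while (total < size) {
        size_t n = fread(data + total, 1, size - total, input);
        if (n == 0) {
            break;      /* EOF or error: stop instead of looping forever */
        }
        total += n;
    }
    data[total] = '\0';
    *out_len = total;
    return data;
}
```

Terminating at total rather than at size keeps the result a valid string even if the file shrank between the fstat() and the read.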
For a binary file, you can use fseek and ftell to know the size without reading the file, allocate the memory and then read everything:
...
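text = fopen("text.txt", "r");
fseek(text, 0, SEEK_END);
char *ix = Str = malloc(ftell(text));
rewind(text); // go back to the start before reading
int ch; // int rather than char, so EOF can be told apart from data
while ((ch = fgetc(text)) != EOF)
{
*ix++ = (char)ch;
}
count = ix - Str; // get the exact count...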
...
For a text file, on a system that has a multi-byte end of line (like Windows which uses \r\n), this will allocate more bytes than required. You could of course scan the file twice, first time for the size and second for actually reading the characters, but you can also just ignore the additional bytes, or you could realloc:
...
count = ix - Str;
Str = realloc(Str, count);
...
Of course, for a real-world program, you should check the return values of all I/O and allocation functions: fopen, fseek, ftell, malloc and realloc...
To just do what you asked for, you would have to read the whole file again:
...
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns a size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
// do stuff with it
// ...
fclose(text);
But your string is not null terminated this way. That will get you in some trouble if you try to use some common string functions like strlen.
To properly null terminate your string you would have to allocate space for one additional character and set that last one to '\0':
...
// allocate count + 1 (for the null terminator)
Str = (char*)malloc((count + 1) * sizeof(char));
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns a size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
// add null terminator
Str[count] = '\0';
// do stuff with it
// ...
fclose(text);
Now, if you want to know the number of characters in the file without counting them one by one, you could get that number in a more efficient way:
...
text = fopen("text.txt", "r");
// seek to the end of the file
fseek(text, 0L, SEEK_END);
// get your current position in that file
count = ftell(text);
// allocate count + 1 (for the null terminator)
Str = (char*)malloc((count + 1) * sizeof(char));
...
Putting this into a more structured form:
// open file
FILE *text = fopen("text.txt", "r");
// seek to the end of the file
fseek(text, 0L, SEEK_END);
// get your current position in that file (ftell returns a long)
long count = ftell(text);
// allocate count + 1 (for the null terminator)
char* Str = (char*)malloc((count + 1) * sizeof(char));
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read (fread returns a size_t)
size_t readsize = fread(Str, sizeof(char), count, text);
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
fclose(text);
// add null terminator after the bytes that were actually read
Str[readsize] = '\0';
// do stuff with it
// ...
Edit:
As Andrew Henle pointed out, not every FILE stream is seekable, and you can't even rely on being able to read the file again (or on the file having the same length or content when you read it again). Even though this is the accepted answer, if you don't know in advance what kind of file stream you're dealing with, his solution is definitely the way to go.

Difference between specifications of fread and fgets?

What is the difference between fread and fgets when reading in from a file?
I use the same fwrite statement, however when I use fgets to read in a .txt file it works as intended, but when I use fread() it does not.
I've switched from fgets/fputs to fread/fwrite when reading from and writing to a file. I open the files with fopen in "rb"/"wb" mode to read binary data rather than text. I understand that fread will also read '\0' (null) bytes rather than stopping at single lines.
//while (fgets(buff,1023,fpinput) != NULL) //read in from file
while (fread(buff, 1, 1023, fpinput) != 0) // read from file
I expect to read in from a file to a buffer, put the buffer in shared memory, and then have another process read from shared memory and write to a new file.
When I use fgets() it works as intended with .txt files, but when using fread it puts a single line of roughly 300 characters into the buffer, with a newline. I can't for the life of me figure out why.
fgets will stop when encountering a newline. fread does not. So fgets is typically only useful for text files, while fread can be used for both text and binary files.
From the C11 standard:
7.21.7.2 The fgets function
The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.
7.21.8.1 The fread function
The fread function reads, into the array pointed to by ptr, up to nmemb elements whose size is specified by size, from the stream pointed to by stream. For each object, size calls are made to the fgetc function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object. The file position indicator for the stream (if defined) is advanced by the number of characters successfully read. If an error occurs, the resulting value of the file position indicator for the stream is indeterminate. If a partial element is read, its value is indeterminate.
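To make the difference concrete, here is a small sketch that consumes the same stream both ways and counts the calls. The function names are invented for the example; the 1024-byte buffer just mirrors the size used in the question.

```c
#include <stdio.h>

/* Count how many fgets calls it takes to consume the stream: each call
   stops after a newline (or when the buffer is full). */
size_t count_fgets_reads(FILE *fp)
{
    char buff[1024];
    size_t calls = 0;
    while (fgets(buff, sizeof buff, fp) != NULL) {
        calls++;                /* buff now holds one NUL-terminated line */
    }
    return calls;
}

/* Count how many fread calls it takes: each call fills the buffer
   regardless of newlines. *total receives the number of bytes read. */
size_t count_fread_reads(FILE *fp, size_t *total)
{
    char buff[1024];
    size_t calls = 0;
    size_t n;
    *total = 0;
    while ((n = fread(buff, 1, sizeof buff, fp)) != 0) {
        /* buff holds n raw bytes and is NOT NUL-terminated */
        *total += n;
        calls++;
    }
    return calls;
}
```

For a three-line file that fits in the buffer, fgets returns three times and fread once. The practical consequence for the shared-memory case is that the byte count returned by fread has to travel with the buffer, since strlen cannot be trusted on it.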
This snippet may make things clearer for you. It just copies a file in chunks.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char ** argv)
{
if(argc != 3) {
fprintf(stderr, "Usage: ./a.out src dst\n");
fprintf(stderr, "Copies file src to dst\n");
exit(EXIT_FAILURE); // wrong usage is an error, so don't report success
}
const size_t chunk_size = 1024;
FILE *in, *out;
if(! (in = fopen(argv[1], "rb"))) exit(EXIT_FAILURE);
if(! (out = fopen(argv[2], "wb"))) exit(EXIT_FAILURE);
char * buffer;
if(! (buffer = malloc(chunk_size))) exit(EXIT_FAILURE);
size_t bytes_read;
do {
// fread returns the number of successfully read elements
bytes_read = fread(buffer, 1, chunk_size, in);
/* Insert any modifications you may */
/* want to do here */
// write bytes_read bytes from buffer to output file
if(fwrite(buffer, 1, bytes_read, out) != bytes_read) exit(EXIT_FAILURE);
// When we read less than chunk_size we are either done or an error has
// occurred. This error is not handled in this program.
} while(bytes_read == chunk_size);
free(buffer);
fclose(out);
fclose(in);
}
You mentioned in a comment below that you wanted to use this for byteswapping. Well, you can just use the following snippet. Just insert it where indicated in code above.
for(size_t i = 0; i + 1 < bytes_read; i += 2) { // size_t matches bytes_read; a trailing odd byte is left alone
char tmp = buffer[i];
buffer[i] = buffer[i+1];
buffer[i+1] = tmp;
}

Length of character array after fread is smaller than expected

I am attempting to read a file into a character array, but when I try to pass in a value for MAXBYTES of 100 (the arguments are FUNCTION FILENAME MAXBYTES), the length of the string array is 7.
FILE * fin = fopen(argv[1], "r");
if (fin == NULL) {
printf("Error opening file \"%s\"\n", argv[1]);
return EXIT_SUCCESS;
}
int readSize;
//get file size
fseek(fin, 0L, SEEK_END);
int fileSize = ftell(fin);
fseek(fin, 0L, SEEK_SET);
if (argc < 3) {
readSize = fileSize;
} else {
readSize = atof(argv[2]);
}
char *p = malloc(fileSize);
fread(p, 1, readSize, fin);
int length = strlen(p);
filedump(p, length);
As you can see, the memory allocation for p is always equal to fileSize. When I use fread, I am trying to read in the 100 bytes (readSize is set to 100, as it should be) and store them in p. However, strlen(p) returns 7 when I pass in that argument. Am I using fread wrong, or is there something else going on?
Thanks
That is the limitation with attempting to read text with fread. There is nothing wrong with doing so, but you must know whether the file contains something other than ASCII characters (such as the nul-character) and you certainly cannot treat any part of the buffer as a string until you manually nul-terminate it at some point.
fread does not guarantee the buffer will contain a nul-terminating character at all -- and it doesn't guarantee that the first character read will not be the nul-character.
Again, there is nothing wrong with reading an entire file into an allocated buffer. That's quite common; you just cannot treat what you have read as a string. That is a further reason why there are character-oriented, formatted, and line-oriented input functions (getchar, fgetc, fscanf, fgets and POSIX getline, to list a few). The formatted and line-oriented functions guarantee a nul-terminated buffer; otherwise, you are on your own to account for what you have read and to ensure you nul-terminate your buffer before treating it as a string.
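As a rough sketch of what "accounting for what you have read" can look like: allocate one extra byte, keep the count that fread returns, and terminate the buffer yourself. The function name read_prefix is made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read up to max_bytes from fp into a freshly allocated buffer and
   NUL-terminate it. *len_out receives the number of bytes actually read.
   Returns NULL on allocation failure. The caller frees the result. */
char *read_prefix(FILE *fp, size_t max_bytes, size_t *len_out)
{
    char *p = malloc(max_bytes + 1);        /* +1 for the terminator */
    if (p == NULL) {
        return NULL;
    }
    size_t n = fread(p, 1, max_bytes, fp);  /* n is the real length */
    p[n] = '\0';                            /* safe: n <= max_bytes */
    *len_out = n;
    return p;
}
```

The value stored in *len_out is the length to pass on to something like filedump. strlen(p) will still stop early if the data itself contains a nul byte, which is one way to end up with 7 instead of 100.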

Read contents of a file into string while skipping first two lines

I'm trying to read the contents of a file into one string without reading in the first two lines.
Right now I have:
char* LoadDocument(char* name) {
char* buffer = 0;
long length;
FILE* f = fopen(name, "r");
if(f) {
fseek(f, 0, SEEK_END);
length = ftell(f);
fseek(f, 0, SEEK_SET);
buffer = malloc(length);
if (buffer) {
fgets (buffer, 100, f);
}
fclose (f);
}
return buffer;
}
But I'm not sure how to skip the first two lines. Also, it appears my malloc is insufficient to hold the whole file here, because it's not getting the whole file.
One solution is to read the complete file into the buffer, manually find the end of the second line, and move the remaining data to the beginning of the buffer.
Also, don't forget to add the string terminator, if you want to use the buffer as a string.
The easiest way is just to read two lines before your "real" reading begins:
char line[1024];
fgets(line, sizeof line, f);
fgets(line, sizeof line, f);
You should probably error-check this too, since the file might be shorter than two lines, in which case you won't get the expected results. Also, the 1024-byte line buffer might be too short for a long line. If you really want to support any line length, read single characters until you find the end of line, twice.
Your final fgets() should probably be a fread() call, to read the entire rest of the file. You might want to compensate for the lost length due to the initial skipping, too.
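Putting those suggestions together could look something like the sketch below. It keeps the fseek/ftell sizing from the question, so it assumes a regular, seekable file; skip_line and load_document_skipping are names invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Consume characters up to and including the next '\n' (or EOF),
   so lines of any length are handled. */
void skip_line(FILE *f)
{
    int c;
    while ((c = fgetc(f)) != EOF && c != '\n') {
        /* discard */
    }
}

/* Load the file `name` as a NUL-terminated string, minus its first
   `skip` lines. Returns NULL on failure; the caller frees the result. */
char *load_document_skipping(const char *name, int skip)
{
    FILE *f = fopen(name, "rb");
    if (f == NULL) {
        return NULL;
    }
    char *buffer = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long length = ftell(f);
        rewind(f);
        for (int i = 0; i < skip; i++) {
            skip_line(f);
        }
        long start = ftell(f);      /* where the wanted data begins */
        if (length >= 0 && start >= 0 && start <= length) {
            /* compensate for the skipped part, +1 for the terminator */
            size_t remaining = (size_t)(length - start);
            buffer = malloc(remaining + 1);
            if (buffer != NULL) {
                size_t n = fread(buffer, 1, remaining, f);
                buffer[n] = '\0';
            }
        }
    }
    fclose(f);
    return buffer;
}
```

For the question's case the call would be load_document_skipping(name, 2).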

getting string from file

Why is the output of this code some random words from memory?
void conc()
{
FILE *source = fopen("c.txt", "r+");
if(!source)
{
printf("File not found");
return;
}
short i = 0;
while(fgetc(source) != EOF)
i++;
char tmp_str[i];
fgets(tmp_str, i, source);
fclose(source);
printf("%s", tmp_str);
}
This should give me the content of the file, I think.
Because after you have walked through the file using fgetc(), then the position indicator is at end-of-file. fgets() has nothing to read. You need to reset it so that it points to the beginning using rewind(source);.
By the way, don't loop through the file using fgetc() just to count its bytes; that's an extremely ugly solution. Use fseek() and ftell(), or lseek(), to get the size of the file instead:
fseek(source, 0, SEEK_END); // the offset comes before the whence argument
long size = ftell(source);
fseek(source, 0, SEEK_SET); // or rewind(source);
Alternative (POSIX; lseek works on a file descriptor, not a FILE *):
off_t size = lseek(fileno(source), 0, SEEK_END);
rewind(source);
Use rewind(source); before fgets(tmp_str, i, source);
After your fgetc() - loop you have reached EOF and if you don't fseek( source, 0l, SEEK_SET ) back to the beginning you won't get any more data.
Anyway you should avoid reading the file twice. Use fstat( fileno(source), ... ) instead to determine the file size.
fgetc reads a character from the stream.
fgets reads a string from the stream.
Now, in your code you are iterating to the end of the file first. So the call to fgets on the stream will simply return NULL and leave the buffer content unchanged. In your case, your buffer is not initialised, which explains the random values you are seeing.
Instead of reading the complete file content with fgetc to get the character count, I recommend using fseek / ftell (see answer from this thread)
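If you do keep the counting loop, a corrected version might look like the sketch below: rewind after counting, allocate one extra byte, and trust the count returned by the read. The name slurp_by_counting is made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the characters with fgetc, rewind, then read them back.
   Returns a malloc'd, NUL-terminated copy of the stream contents,
   or NULL if the allocation fails. The caller frees the result. */
char *slurp_by_counting(FILE *source)
{
    size_t count = 0;
    while (fgetc(source) != EOF) {
        count++;
    }
    /* The position indicator is now at end-of-file; without this
       rewind, every further read would fail immediately. */
    rewind(source);

    char *str = malloc(count + 1);          /* +1 for the terminator */
    if (str == NULL) {
        return NULL;
    }
    size_t n = fread(str, 1, count, source);
    str[n] = '\0';
    return str;
}
```

This only works on streams that can be rewound, such as regular files.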
Your code is wrong. As was said before:
You should not read the file twice.
To allocate an array dynamically you must use operator new (in C++) or the function malloc (in C).
If you need code to read the content of the file, try the following (sorry, I didn't compile it, but it should work):
FILE* source = fopen("c.txt", "r+b");
if(!source){return;}
fseek(source, 0, SEEK_END);
size_t filesize = ftell(source);
fseek(source, 0, SEEK_SET);
char* buf = new char[filesize+1]; // +1 is for '\0'
fread(buf, sizeof(char), filesize, source);
fclose(source);
buf[filesize]=0;
printf("%s", buf);
delete[] buf; // arrays allocated with new[] must be released with delete[]
Each time you call fgetc() you advance the internal file pointer one character further. At the end of your while() loop the file pointer will then be at the end of the file. Subsequent calls intended to read from the file handle will fail with an EOF condition.
The fgets manual says that :
If the end-of-file is encountered while attempting to read a character, the eof indicator is set (feof). If this happens before any characters could be read, the pointer returned is a null pointer (and the contents of str remain unchanged).
The consequence is that tmp_str is left unchanged. The garbage you get back when you call printf is actually a part of the conc() function stack.
A solution to your problem would be rewinding the file pointer with fseek() just before calling fgets().
fseek(source, 0, SEEK_SET);
Then a better way to get the size of your file would be to fseek to the end of the file, and use ftell to get the current position :
long size;
fseek(source, 0, SEEK_END);
size = ftell(source);
That said, your code still has a problem. char tmp_str[i] is a variable-length array: its size is taken from i at run time (this requires C99 and is not standard C++), and it lives on the stack, so a large file can overflow it. I suggest you investigate dynamic allocation with malloc, or the keyword new if you're coding in C++.
A proper allocation would look like this :
char *tmp_str = malloc(size + 1); // +1 leaves room for the terminating '\0'
// Here you read the file
free(tmp_str);
A simpler solution could be to preallocate a buffer large enough to hold your file.
char tmp_str[1024 * 100]; // 100 KB
Then use the size variable we got earlier to check that the file will fit in tmp_str before reading.

Resources