Read contents of a file into string while skipping first two lines - c

I'm trying to read the contents of a file into one string without reading in the first two lines.
Right now I have:
char* LoadDocument(char* name) {
char* buffer = 0;
long length;
FILE* f = fopen(name, "r");
if(f) {
fseek(f, 0, SEEK_END);
length = ftell(f);
fseek(f, 0, SEEK_SET);
buffer = malloc(length);
if (buffer) {
fgets (buffer, 100, f);
}
fclose (f);
}
return buffer;
}
But I'm not sure how to skip the first two lines. Also, it appears my malloc is insufficient to hold the whole file here, because it's not getting the whole file.

One solution is to read the complete file into the buffer, manually find the end of the second line, and move the remaining data to the beginning of the buffer.
Also, don't forget to add the string terminator, if you want to use the buffer as a string.
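A minimal sketch of that approach, assuming the caller has already opened the stream. The helper name load_skipping_lines and the growth strategy are illustrative choices, not something from the question:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads everything left in f, drops the first
   `skip` lines, and returns a NUL-terminated heap string (caller frees).
   Returns NULL on allocation or read error. */
char *load_skipping_lines(FILE *f, int skip)
{
    size_t cap = 4096, len = 0, n;
    char *buf = malloc(cap);
    if (!buf) return NULL;

    /* Grow the buffer as needed, so no file-size query is required.
       One byte is always kept free for the terminator. */
    while ((n = fread(buf + len, 1, cap - len - 1, f)) > 0) {
        len += n;
        if (len == cap - 1) {
            char *tmp = realloc(buf, cap * 2);
            if (!tmp) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(f)) { free(buf); return NULL; }

    /* Find the byte just past the end of line number `skip`. */
    char *start = buf;
    char *end = buf + len;
    while (skip-- > 0 && start < end) {
        char *nl = memchr(start, '\n', (size_t)(end - start));
        start = nl ? nl + 1 : end;   /* no newline left: nothing remains */
    }

    /* Move the remainder to the front and add the string terminator. */
    len = (size_t)(end - start);
    memmove(buf, start, len);
    buf[len] = '\0';
    return buf;
}
```

memmove() is used rather than memcpy() because the source and destination ranges overlap.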

The easiest way is just to read two lines before your "real" reading begins:
char line[1024];
fgets(line, sizeof line, f);
fgets(line, sizeof line, f);
You should error-check these calls too, since the file might have fewer than two lines, in which case you won't get the expected results. Also, the 1024-byte line buffer might be too short: a longer line takes more than one fgets() call to consume. If you really want to support any line length, read single characters until you find the end of the line, twice.
Your final fgets() should probably be a fread() call, to read the entire rest of the file. You might want to compensate for the lost length due to the initial skipping, too.
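Putting those suggestions together might look roughly like this. It is a sketch under assumptions: the names skip_line and load_after_two_lines are made up for illustration, and the function takes an already opened stream instead of a file name:

```c
#include <stdio.h>
#include <stdlib.h>

/* Consume one line of any length.
   Returns 0 if end of file came before a newline was seen. */
static int skip_line(FILE *f)
{
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n') return 1;
    }
    return 0;
}

/* Hypothetical rework of LoadDocument: skips two lines, then reads
   the rest of the stream into a NUL-terminated buffer (caller frees). */
char *load_after_two_lines(FILE *f)
{
    if (!skip_line(f) || !skip_line(f)) return NULL;  /* file too short */

    /* Measure what is left after the skipped lines. */
    long start = ftell(f);
    if (start < 0 || fseek(f, 0, SEEK_END) != 0) return NULL;
    long end = ftell(f);
    if (end < start || fseek(f, start, SEEK_SET) != 0) return NULL;

    size_t remaining = (size_t)(end - start);
    char *buffer = malloc(remaining + 1);             /* +1 for '\0' */
    if (!buffer) return NULL;

    /* In text mode fread may return fewer bytes than `remaining`. */
    size_t got = fread(buffer, 1, remaining, f);
    buffer[got] = '\0';
    return buffer;
}
```

Terminating at the count fread() returned, not at the computed size, keeps the string valid even when newline translation shortens the data.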

Related

how to read a line of unrestricted length from a file using the fread function?

I have a question: how do I read a line of text from a file without specifying the size of the line? I don't want to use fgets because you have to give fgets the number of characters in advance. I can load the whole file, but not a single line.
FILE *f;
long lSize;
char *buffer;
size_t result;
f = fopen("file.txt", "r");
fseek(f, 0, SEEK_END);
lSize = ftell(f);
rewind (f);
buffer = (char*)malloc(sizeof(char)*lSize);
result = fread(buffer,1,lSize, f);
fclose(f);
free(buffer);
1. Use malloc() to set up an initial buffer for your text line. Say, 16 chars.
2. Loop over the file and retrieve one character at a time with fgetc(). Store it in your buffer, at the appropriate place. If it is a newline, put a NUL character in the buffer instead and exit the loop.
3. When the buffer is about to get full, realloc() it and expand it by 16 more chars. If the realloc is successful, go to step 2.
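The steps above can be sketched as follows. The function name read_line is hypothetical, and the 16-byte increment is simply the figure suggested in the answer:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns one line without its newline,
   or NULL at end of file or on allocation failure. Caller frees. */
char *read_line(FILE *f)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);          /* step 1: initial buffer */
    int c;
    if (!buf) return NULL;

    /* step 2: one character at a time until newline or EOF */
    while ((c = fgetc(f)) != EOF && c != '\n') {
        if (len + 1 == cap) {         /* step 3: keep one slot for the NUL */
            char *tmp = realloc(buf, cap + 16);
            if (!tmp) { free(buf); return NULL; }
            buf = tmp;
            cap += 16;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {       /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';                  /* NUL replaces the newline */
    return buf;
}
```

On POSIX systems getline() does the same job, but it is not part of standard C.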

writing multiple lines back to a file, wrong file length

I want to get a multiline text file's content, and put it back into the file.
However, I have an issue with the file length.
The null terminator (0) that I add is after some strange characters.
Something wrong with my f_length ?
Edit : If I set the line-endings of my file to Unix (LF), I don't have the issue. So it seems that my code is incompatible with Windows line endings. How can I account for Windows text files ?
int main()
{
FILE *fp = NULL;
int f_length;
char *buffer = NULL;
size_t size = 0;
fp = fopen(FILENAME, "r+");
fseek(fp, 0, SEEK_END);
f_length = ftell(fp);
rewind(fp);
buffer = malloc((f_length + 1) * sizeof(*buffer));
fread(buffer, f_length, 1, fp);
buffer[f_length] = 0;
printf("%s\n", buffer);
fp = fopen(FILENAME, "w+");
fputs(buffer, fp);
fclose(fp);
return 0;
}
Use fp = fopen(FILENAME, "rb+"); instead. In text mode, newline sequences are translated while reading (you've already noticed that in the comments). The translated form can be shorter (a single "\n" where the file itself contains "\r\n"), so f_length will be larger than the amount of data actually read.
Or you can use line-by-line reading functions, they are made for text-mode files.
You are assuming the size of the file on disk is going to be equal to the number of bytes you will read. That is a valid assumption for a clean, binary file. It is not a valid assumption for a text file.
I'd suggest using the return value from fread instead of f_length as it reports the number of objects you actually read after any required read processing. You'll need to adjust your fread parameters to read 1-byte sized objects.
regarding:
buffer[f_length] = 0;
This places the terminating '\0' too far into the buffer whenever fewer than f_length bytes were actually read, which is why you see garbage characters. It is much better to capture the value returned by the call to fread() and set the terminator using:
buffer[ <returnedValue> ] = '\0';
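A sketch of what these answers describe, with the fread() arguments arranged so the return value is a byte count. The helper name read_all and its out_len parameter are assumptions made for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole of f into a NUL-terminated
   buffer (caller frees). *out_len receives the bytes actually read. */
char *read_all(FILE *f, size_t *out_len)
{
    if (fseek(f, 0, SEEK_END) != 0) return NULL;
    long size = ftell(f);            /* only an upper bound in text mode */
    if (size < 0) return NULL;
    rewind(f);

    char *buffer = malloc((size_t)size + 1);
    if (!buffer) return NULL;

    /* Element size 1, count `size`: the return value is a byte count. */
    size_t got = fread(buffer, 1, (size_t)size, f);
    buffer[got] = '\0';              /* terminate right after the real data */
    if (out_len) *out_len = got;
    return buffer;
}
```

With the original argument order, fread(buffer, f_length, 1, fp), the return value is only 0 or 1, so it cannot be used as an index.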

Length of character array after fread is smaller than expected

I am attempting to read a file into a character array, but when I try to pass in a value for MAXBYTES of 100 (the arguments are FUNCTION FILENAME MAXBYTES), the length of the string array is 7.
FILE * fin = fopen(argv[1], "r");
if (fin == NULL) {
printf("Error opening file \"%s\"\n", argv[1]);
return EXIT_SUCCESS;
}
int readSize;
//get file size
fseek(fin, 0L, SEEK_END);
int fileSize = ftell(fin);
fseek(fin, 0L, SEEK_SET);
if (argc < 3) {
readSize = fileSize;
} else {
readSize = atof(argv[2]);
}
char *p = malloc(fileSize);
fread(p, 1, readSize, fin);
int length = strlen(p);
filedump(p, length);
As you can see, the memory allocation for p is always equal to fileSize. When I use fread, I am trying to read in the 100 bytes (readSize is set to 100 as it should be) and store them in p. However, strlen(p) returns 7 when I pass in that argument. Am I using fread wrong, or is there something else going on?
Thanks
That is the limitation with attempting to read text with fread. There is nothing wrong with doing so, but you must know whether the file contains something other than ASCII characters (such as the nul-character) and you certainly cannot treat any part of the buffer as a string until you manually nul-terminate it at some point.
fread does not guarantee the buffer will contain a nul-terminating character at all -- and it doesn't guarantee that the first character read will not be the nul-character.
Again, there is nothing wrong with reading an entire file into an allocated buffer. That's quite common; you just cannot treat what you have read as a string. That is a further reason why there are character oriented, formatted, and line oriented input functions (getchar, fgetc, fscanf, fgets and POSIX getline, to list a few). The formatted and line oriented functions guarantee a nul-terminated buffer; otherwise, you are on your own to account for what you have read and to ensure you nul-terminate your buffer before treating it as a string.
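To make the point concrete, here is a small sketch that keeps the count reported by fread() and checks for an embedded nul-character, which is what makes strlen() stop early. Both helper names are hypothetical:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads up to max bytes into buf, which must have
   room for max + 1 bytes, terminates it, and returns the byte count. */
size_t read_counted(FILE *f, char *buf, size_t max)
{
    size_t got = fread(buf, 1, max, f);
    buf[got] = '\0';                 /* manual nul-termination */
    return got;
}

/* Returns 1 if the data holds a nul byte before position `got`,
   in which case strlen(buf) under-reports the amount of data. */
int has_embedded_nul(const char *buf, size_t got)
{
    return memchr(buf, '\0', got) != NULL;
}
```

When has_embedded_nul() returns 1, pass the count around with the buffer and use memcpy() or fwrite() with that count in place of the string functions.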

Exclude terminator when writing to file

I'm receiving a message through a socket in C which is written to a buffer. The message has a terminator (##) added to the end so I know when to stop writing to the buffer, however, the buffer is much larger than the message. Is there a way to write only up to and not including the terminator, throwing away the rest of the buffer?
Maybe something with a position pointer?
char *pos;
FILE* fp = fopen("tempfile", "w+");
pos = strstr(buffer, "##");
pos = '\0'; // Maybe I could stop the writing with NULL?
fwrite(buffer, 1, BUF_SIZE /*too big*/, fp);
fclose(fp);
I need to get the size of the portion of the buffer that contains the message, or maybe only write to my file up to a certain character. Either way would work.
Statement fwrite(buffer, 1, BUF_SIZE, fp) will write BUF_SIZE bytes, regardless of the actual content of the buffer and regardless if it contains a '\0' considered as "string termination".
So the only way is to tell fwrite the correct number of elements to write, and this can easily be calculated through pointer arithmetic:
fwrite(buffer, 1, pos-buffer, fp)
Of course, you'll have to check that pos != NULL and so on, but I think this is straightforward.
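With the NULL check included, that could look like the sketch below. The function name write_message is made up, and it assumes the buffer is nul-terminated somewhere after the message, since strstr() requires that:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the part of a nul-terminated buffer that
   precedes the first "##". Returns the number of bytes written, or 0
   when no terminator is present. */
size_t write_message(const char *buffer, FILE *fp)
{
    const char *pos = strstr(buffer, "##");
    if (pos == NULL) return 0;       /* terminator not found: write nothing */

    /* pos - buffer is the message length, excluding the "##". */
    return fwrite(buffer, 1, (size_t)(pos - buffer), fp);
}
```

If the socket data can contain nul bytes, strstr() is not suitable and a length-aware search over the received byte count is needed.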

getting string from file

Why is the output of this code some random words from memory?
void conc()
{
FILE *source = fopen("c.txt", "r+");
if(!source)
{
printf("Ficheiro não encontrado");
return;
}
short i = 0;
while(fgetc(source) != EOF)
i++;
char tmp_str[i];
fgets(tmp_str, i, source);
fclose(source);
printf("%s", tmp_str);
}
This should give me the content of the file, I think.
Because after you have walked through the file using fgetc(), then the position indicator is at end-of-file. fgets() has nothing to read. You need to reset it so that it points to the beginning using rewind(source);.
By the way, don't loop through the file using fgetc() just to count characters; that's an extremely ugly solution. Use fseek() and ftell(), or lseek(), instead to get the size of the file:
fseek(source, 0, SEEK_END);
long size = ftell(source);
fseek(source, 0, SEEK_SET); // or rewind(source);
Alternative (POSIX; lseek() takes a file descriptor, not a FILE*):
off_t size = lseek(fileno(source), 0, SEEK_END);
rewind(source);
Use rewind(source); before fgets(tmp_str, i, source);
After your fgetc() - loop you have reached EOF and if you don't fseek( source, 0l, SEEK_SET ) back to the beginning you won't get any more data.
Anyway you should avoid reading the file twice. Use fstat( fileno(source), ... ) instead to determine the file size.
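For reference, a minimal sketch of the fstat() route. It assumes a POSIX system, since fstat() and fileno() are not part of standard C, and the helper name stream_size is hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: size in bytes of the file behind an open stream,
   or -1 on failure. POSIX only. */
long long stream_size(FILE *source)
{
    struct stat st;
    if (fstat(fileno(source), &st) != 0) return -1;
    return (long long)st.st_size;
}
```

This leaves the stream position untouched, so no rewind is needed afterwards.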
fgetc reads a character from the stream.
fgets reads a string from the stream.
Now in your code you are iterating through the end of the file. So the call to fgets on the stream will simply return NULL and leave the buffer content unchanged. In your case, your buffer is not initialised. This explains the random values you are seeing.
Instead of reading the complete file content with fgetc to get the character count, I recommend using fseek / ftell (see answer from this thread)
Your code is wrong. As was said before:
You should not read the file twice.
To allocate an array dynamically you must use operator new (in C++)
or the function malloc (in C).
If you need code to read the content of the file, try the following (sorry, but I didn't compile it; it should work anyway):
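FILE* source = fopen("c.txt", "r+b");
if(!source){return;}
fseek(source, 0, SEEK_END);
long filesize = ftell(source); // ftell returns long, -1 on error
fseek(source, 0, SEEK_SET);
if(filesize < 0){fclose(source); return;}
char* buf = new char[filesize+1]; // +1 is for '\0'
size_t got = fread(buf, sizeof(char), filesize, source); // bytes actually read
fclose(source);
buf[got]=0; // terminate after the data that was read
printf("%s", buf);
delete[] buf; // array form, to match new[]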
Each time you call fgetc() you advance the internal file pointer one character further. At the end of your while() loop the file pointer will then be at the end of the file. Subsequent calls intended to read from the file handle will fail with an EOF condition.
The fgets manual says that :
If the end-of-file is encountered while attempting to read a character, the eof indicator is set (feof). If this happens before any characters could be read, the pointer returned is a null pointer (and the contents of str remain unchanged).
The consequence is that tmp_str is left unchanged. The garbage you get back when you call printf is actually a part of the conc() function stack.
A solution to your problem would be rewinding the file pointer with fseek() just before calling fgets().
fseek(source, 0, SEEK_SET);
Then a better way to get the size of your file would be to fseek to the end of the file, and use ftell to get the current position :
long size;
fseek(source, 0, SEEK_END);
size = ftell(source);
This being said, your code still has a problem. The array char tmp_str[i] is a variable-length array: its size is only known at run time, which C99 permits but C++ and older C compilers do not, and a large file can overflow the stack. I suggest you investigate dynamic allocation with malloc, or the keyword new if you're coding in C++.
A proper allocation would look like this :
char *tmp_str = malloc(size + 1); // +1 for the terminating '\0'
// Here you read the file and terminate the string
free(tmp_str);
A simpler solution could be to preallocate a string large enough to hold your file.
char tmp_str[1024 * 100]; // 100 KB
Then use the size variable we got earlier to check the file will fit in tmp_str before reading.
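That check might be sketched like this. The helper name read_into_fixed is an assumption for illustration, and it reserves one byte of the destination for the terminator:

```c
#include <stdio.h>

/* Hypothetical helper: reads the whole stream into dst if it fits,
   leaving room for '\0'. Returns bytes read, or -1 if the file is
   too large for dst or its size cannot be determined. */
long read_into_fixed(FILE *source, char *dst, size_t dst_size)
{
    if (fseek(source, 0, SEEK_END) != 0) return -1;
    long size = ftell(source);
    if (size < 0 || (size_t)size >= dst_size) return -1;  /* would not fit */
    rewind(source);

    size_t got = fread(dst, 1, (size_t)size, source);
    dst[got] = '\0';
    return (long)got;
}
```

A call such as read_into_fixed(source, tmp_str, sizeof tmp_str) would then replace the fgets() call in conc().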
