Gibberish after file contents - c

I'm working on the following function to pull scripts into my C application:
char *anne_script_load(char fileName[]){
char path[6400] = "data/scripts/";
strncat(path, fileName, 6000);
strncat(path, ".lua", 5);
struct stat fileInfo;
char *contents = NULL;
FILE * pFile;
stat(path, &fileInfo);
contents = (char*)malloc(fileInfo.st_size);
pFile = fopen (path,"r");
fread (contents,1,fileInfo.st_size,pFile);
fclose(pFile);
printf("Script path: %s\n", path);
printf("Script loaded: %s\n", contents);
return contents;
}
At runtime, the second printf generates the following output:
test script - if you see this load is working :) helicopters����
The garbled text looks different on my console, but I'm not sure it matters: my theory is that the file stream doesn't end in a null byte (it's not stored on disk as a c string, after all - so I terminated it myself as follows:
contents[fileInfo.st_size] = 0;
This appears to work, but I'm concerned about the robustness of this solution. Is there a better, generally accepted way of doing this?

You need to add +1 to your malloc for the termination:
if(stat(path, &fileInfo) != 0) {
perror("stat");
...
}
contents = (char*)malloc(fileInfo.st_size + 1); // don't forget this
pFile = fopen (path,"r");
if(pFile == 0) {
perror("fopen");
...
}
int n = fread (contents,1,fileInfo.st_size,pFile);
if(n != fileInfo.st_size) {
perror("read");
...
}
contents[n] = 0; // terminate after what you read. Not what you think you read.
** Check the return values from fopen, stat, and read.**

There are number of things you could probably improve here... but to answer your question, fread() is reading bytes into an array that you didn't initialize. In C, you can't expect a \0 character after the last byte you read via fread -- you've got to put it there first, either with a memset() or calloc().
Also, if the file content is treated as a text string, be sure to allocate one additional byte over the size to hold that terminating \0 character!

You must nul-terminated the string otherwise printf() will not know for sure when to stop reading from the pointer you just passed. The same applies to other functions like strcmp(), you could have two strings with the same content while having strcmp() yelling non-zero because of a non nul-terminated string.
So contents[fileInfo.st_size] = 0; is just fine.
You could read a little bit more about nul-terminated strings on Wikipedia

Related

How to avoid calling fopen() with a buffer that is not null-terminated in C?

Let's look at this example:
static FILE *open_file(const char *file_path)
{
char buf[80];
size_t n = snprintf(buf, sizeof (buf), "%s", file_path);
assert(n < sizeof (buf));
return fopen(buf, "r");
}
Here, the assert() is off-by-one. From the manpage for snprintf:
"Upon successful return, these functions return the number of characters printed (excluding the null byte used to end output to strings)."
So, if it returns 80, then the string will fill the buffer, and won't be terminated by \0. This will cause a problem because fopen() assumes it is null terminated.
What is the best way to prevent this?
So, if it returns 80, then the string will fill the buffer, and won't be terminated by \0
That is incorrect: the string would be null-terminated no matter what you pass for file_path. Obviously, the string would be cut off at the sizeof(buf)-1.
Note that snprintf could return a number above 80 as well. This would mean that the string you wanted to print was longer than the buffer you have provided.
What is the best way to prevent this?
You are already doing it: the assert is not necessary for preventing unterminated strings. You can use the return value to decide if any truncation has happened, and pass a larger buffer to compensate:
// Figure out the size
size_t n = snprintf(NULL, 0, "%s", file_path);
// Allocate the buffer and print into it
char *tmpBuf = malloc(n+1);
snprintf(tmpBuf, n+1, "%s", file_path);
// Prepare the file to return
FILE *res = fopen(tmpBuf, "r");
// Free the temporary buffer
free(tmpBuf);
return res;
What is the best way to prevent this?
Simple, don't give it a non-null terminated string. Academic questions aside, you are in control of the code you write. You don't have to protect against yourself sabotaging the project in every conceivable way, you just have to not sabotage yourself.
If everyone checked and double checked everything in code, the performance loss would be incredible. There's a reason why fopen doesn't do it.
There are a couple of issues here.
First assert() is used to catch issues as a part of designer testing. It is not meant to be used in production code.
Secondly if the file path is not complete then do you really want to call fopen()?
Normally what is done is to add one to the expected number of characters.
static FILE *open_file(const char *file_path)
{
char buf[80 + 1] = {0};
size_t n = snprintf(buf, 80, "%s", file_path);
assert(n < sizeof (buf));
return fopen(buf, "r");
}

Length of character array after fread is smaller than expected

I am attempting to read a file into a character array, but when I try to pass in a value for MAXBYTES of 100 (the arguments are FUNCTION FILENAME MAXBYTES), the length of the string array is 7.
FILE * fin = fopen(argv[1], "r");
if (fin == NULL) {
printf("Error opening file \"%s\"\n", argv[1]);
return EXIT_SUCCESS;
}
int readSize;
//get file size
fseek(fin, 0L, SEEK_END);
int fileSize = ftell(fin);
fseek(fin, 0L, SEEK_SET);
if (argc < 3) {
readSize = fileSize;
} else {
readSize = atof(argv[2]);
}
char *p = malloc(fileSize);
fread(p, 1, readSize, fin);
int length = strlen(p);
filedump(p, length);
As you can see, the memory allocation for p is always equal to filesize. When I use fread, I am trying to read in the 100 bytes (readSize is set to 100 as it should be) and store them in p. However, strlen(p) results in 7 during if I pass in that argument. Am I using fread wrong, or is there something else going on?
Thanks
That is the limitation with attempting to read text with fread. There is nothing wrong with doing so, but you must know whether the file contains something other than ASCII characters (such as the nul-character) and you certainly cannot treat any part of the buffer as a string until you manually nul-terminate it at some point.
fread does not guarantee the buffer will contain a nul-terminating character at all -- and it doesn't guarantee that the first character read will not be the nul-character.
Again, there is nothing wrong with reading an entire file into an allocated buffer. That's quite common, you just cannot treat what you have read as a string. That is a further reason why there are character oriented, formatted, and line oriented input functions. (getchar, fgetc, fscanf, fgets and POSIX getline, to list a few). The formatted and line oriented functions guarantee a nul-terminated buffer, otherwise, you are on your own to account for what you have read, and insure you nul-terminate your buffer -- before treating it as a string.

Handle memory while reading long lines from a file in C

First of all, I know this question is very close to this topic, but the question was so poorly worded that I am not even sure it is a duplicate plus no code were shown so I thought it deserved to be asked properly.
I am trying to read a file line by line and I need to store a line in particular in a variable. I have managed to do so quite easily using fgets, nevertheless the size of the lines to be read and the number of lines in the file remain unknown.
I need a way to properly allocate memory to the variable whatever the size of the line might be, using C and not C++.
So far my code looks like that :
allowedMemory = malloc(sizeof(char[1501])); // Checks if enough memory
if (NULL == allowedMemory)
{
fprintf(stderr, "Not enough memory. \n");
exit(1);
}
else
char* res;
res = allowedMemory;
while(fgets(res, 1500, file)) // Iterate until end of file
{
if (res == theLineIWant) // Using strcmp instead of ==
return res;
}
The problem of this code is that it is not adaptable at all. I am looking for a way to allocate just enough memory to res so that I don't miss any data in line.
I was thinking about something like that :
while ( lineContainingKChar != LineContainingK+1Char) // meaning that the line has not been fully read
// And using strcmp instead of ==
realloc(lineContainingKChar, K + 100) // Adding memory
But I would need to iterate through two FILE object in order to fill these variables which would not be very efficient.
Any hints about how to implement this solution or advise about how to do it in a easier way would be appreciated.
EDIT : Seems like using getline() is the best way to do so because this function allocates the memory needed by itself and free it when needed. Nevertheless I don't think that it is 100% portable since I still can't use it though I have included <stdio.h>. To be verified though, since my issues are often situated between keyboard and computer. Until then I am still open to a solution which would not use POSIX-compliant C.
getline() appears to do exactly what you want:
DESCRIPTION
The getdelim() function shall read from stream until it encounters a
character matching the delimiter character.
...
The getline() function shall be equivalent to the getdelim()
function with the delimiter character equal to the <newline>
character.
...
EXAMPLES
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
fp = fopen("/etc/motd", "r");
if (fp == NULL)
exit(1);
while ((read = getline(&line, &len, fp)) != -1) {
printf("Retrieved line of length %zu :\n", read);
printf("%s", line);
}
if (ferror(fp)) {
/* handle error */
}
free(line);
fclose(fp);
return 0;
}
And per the Linux man page:
DESCRIPTION
getline() reads an entire line from stream, storing the address of
the buffer containing the text into *lineptr. The buffer is null-
terminated and includes the newline character, if one was found.
If *lineptr is set to NULL and *n is set 0 before the call, then
getline() will allocate a buffer for storing the line. This buffer
should be freed by the user program even if getline() failed.
Alternatively, before calling getline(), *lineptr can contain a
pointer to a malloc(3)-allocated buffer *n bytes in size. If the
buffer is not large enough to hold the line, getline() resizes it
with realloc(3), updating *lineptr and *n as necessary.
In either case, on a successful call, *lineptr and *n will be
updated to reflect the buffer address and allocated size respectively.

Cannot get Call to Function Working

I found this piece of code at Reading a file character by character in C and it compiles and is what I wish to use. My problem that I cannot get the call to it working properly. The code is as follows:
char *readFile(char *fileName)
{
FILE *file = fopen(fileName, "r");
char *code;
size_t n = 0;
int c;
if (file == NULL)
return NULL; //could not open file
code = malloc(1500);
while ((c = fgetc(file)) != EOF)
{
code[n++] = (char) c;
}
code[n] = '\0';
return code;
}
I am not sure of how to call it. Currently I am using the following code to call it:
.....
char * rly1f[1500];
char * RLY1F; // This is the Input File Name
rly1f[0] = readFile(RLY1F);
if (rly1f[0] == NULL) {
printf ("NULL array); exit;
}
int n = 0;
while (n++ < 1000) {
printf ("%c", rly1f[n]);
}
.....
How do I call the readFile function such that I have an array (rly1f) which is not NULL? The file RLY1F exists and has data in it. I have successfully opened it previously using 'in line code' not a function.
Thanks
The error you're experiencing is that you forgot to pass a valid filename. So either the program crashes, or fopen tries to open a trashed name and returns NULL
char * RLY1F; // This is not initialized!
RLY1F = "my_file.txt"; // initialize it!
The next problem you'll have will be in your loop to print the characters.
You have defined an array of pointers char * rly1f[1500];
You read 1 file and store it in the first pointer of the array rly1f[0]
But when you display it you display the pointer values as characters which is not what you want. You should just do:
while (n < 1000) {
printf ("%c", rly1f[0][n]);
n++;
}
note: that would not crash but would print trash if the file read is shorter than 1000.
(BLUEPIXY suggested the post-incrementation fix for n BTW or first character is skipped)
So do it more simply since your string is nul-terminated, pass the array to puts:
puts(rly1f[0]);
EDIT: you have a problem when reading your file too. You malloc 1500 bytes, but you read the file fully. If the file is bigger than 1500 bytes, you get buffer overflow.
You have to compute the length of the file before allocating the memory. For instance like this (using stat would be a better alternative maybe):
char *readFile(char *fileName, unsigned int *size) {
...
fseek(file,0,SEEK_END); // set pos to end of file
*size = ftell(file); // get pos, i.e. size
rewind(file); // set pos to 0
code = malloc(*size+1); // allocate the proper size plus one
notice the extra parameter which allows you to return the size as well as the file data.
Note: on windows systems, text files use \r\n (CRLF) to delimit lines, so the allocated size will be higher than the number of characters read if you use text mode (\r\n are converted to \n so there are less chars in your buffer: you could consider a realloc once you know the exact size to shave off the unused allocated space).

create a binary file in C

I'm currently working on a binary file creation. Here is what I have tried.
Example 1:
#include<stdio.h>
int main() {
/* Create the file */
int a = 5;
FILE *fp = fopen ("file.bin", "wb");
if (fp == NULL)
return -1;
fwrite (&a, sizeof (a), 1, fp);
fclose (fp);
}
return 0;
}
Example 2:
#include <stdio.h>
#include <string.h>
int main()
{
FILE *fp;
char str[256] = {'\0'};
strcpy(str, "3aae71a74243fb7a2bb9b594c9ea3ab4");
fp = fopen("file.bin", "wb");
if(fp == NULL)
return -1;
fwrite(str, sizeof str, 1, fp);
return 0;
}
Example 1 gives the right output in binary form. But Example 2 where I'm passing string doesn't give me right output. It writes the input string which I have given into the file and appends some data(binary form).
I don't understand and I'm unable to figure it out what mistake I'm doing.
The problem is that sizeof str is 256, that is, the entire size of the locally declared character array. However, the data you are storing in it does not require all 256 characters. The result is that the write operation writes all the characters of the string plus whatever garbage happened to be in the character array already. Try the following line as a fix:
fwrite(str, strlen(str), 1, fp);
C strings are null terminated, meaning that anything after the '\0' character must be ignored. If you read the file written by Example 2 into a str[256] and print it out using printf("%s", str), you would get the original string back with no extra characters, because null terminator would be read into the buffer as well, providing proper termination for the string.
The reason you get the extra "garbage" in the output is that fwrite does not interpret str[] array as a C string. It interprets it as a buffer of size 256. Text editors do not interpret null character as a terminator, so random characters from str get written to the file.
If you want the string written to the file to end at the last valid character, use strlen(str) for the size in the call of fwrite.

Resources