#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *readLine(FILE *inFile) //Simply reads line in a text file till "\n"
{
char *line = realloc(NULL, 1);
char c;
int i=0;
while (!feof(inFile))
{
c = fgetc(inFile);
if (ferror(inFile)) printf("Error reading");
if (c == 10)
{
realloc(line,i+1);
line[i]= 10;
break;
}
realloc(line, i+1);
line[i++] = c;
}
return line;
}
int main(int argc,char **argv)
{
FILE *inFile;
inFile = fopen("testFile","r");
printf("%s",readLine(inFile));
printf("%s",readLine(inFile));
printf("%s",readLine(inFile));
return 0;
}
If the contents of testFile is:-
abc
def
ghi
The three printf statements should show "abc" three times. But the output is:
abc
def
ghi
I know I am wrong in the concept somewhere. Pls help.
Usage of realloc() is incorrect.
realloc(line,i+1); // wrong
// OK
void *new_line = realloc(line,i+1);
if (!new_line)
{
free(line);
return NULL;
}
line = new_line;
Because line is passed to realloc() by value, realloc() cannot change it. The address of the re-allocated memory is in the return value, which the code discards. Therefore line stays the same on every call, and you would see the same line over and over again. Edit: I just realized that, even though this is a bug, it's not what would cause repeating lines. The other points are still valid.
What's worse:
You have a memory leak by losing the newly re-allocated pointer every time.
You are potentially accessing freed memory, because old line value may become invalid after reallocation, if it was reallocated in a different part of the heap.
You are re-allocating memory every character, which is potentially an expensive operation.
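Putting those points together, here is a minimal sketch of a line reader that keeps the pointer realloc() returns, checks for failure, and grows the buffer geometrically instead of once per character. The name read_line and the starting capacity of 16 are assumptions made for illustration, not part of the original code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line (including the trailing '\n', if present) from fp.
 * Returns a malloc'd, NUL-terminated string that the caller must free(),
 * or NULL at end of file, on a read error, or if allocation fails. */
char *read_line(FILE *fp)
{
    size_t cap = 16;            /* current buffer capacity (arbitrary start) */
    size_t len = 0;             /* characters stored so far */
    char *line = malloc(cap);
    int c;                      /* int, so EOF differs from every real char */

    if (line == NULL)
        return NULL;

    while ((c = fgetc(fp)) != EOF) {
        if (len + 2 > cap) {    /* need room for this char plus '\0' */
            char *bigger = realloc(line, cap * 2);
            if (bigger == NULL) {
                free(line);     /* the old block is still ours to free */
                return NULL;
            }
            line = bigger;      /* keep the pointer realloc() returned */
            cap *= 2;
        }
        line[len++] = (char)c;
        if (c == '\n')
            break;
    }

    if (len == 0) {             /* nothing read: end of file or error */
        free(line);
        return NULL;
    }
    line[len] = '\0';
    return line;
}
```

Testing the return value of fgetc() against EOF also replaces the while (!feof(inFile)) loop, which only notices the end of the file after a read has already failed.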
But I am passing file pointer by value. So i should get output "abc" again and again
Ah, I understand your confusion.
A file pointer only points to the actual file structure. State such as the current offset is not part of the pointer but part of the internal structure.
Another way to think about this is that the actual object representing the file is FILE. To get pass-by-reference semantics, you pass a pointer to the object. Since you are passing by reference, each line picks up where the last one left off.
fgetc() advances the file pointer (which is "where the next character to be read is located"). That's how you're able to call it in a loop and read a whole line of characters.
After it advances past the newline character, it naturally moves on to the next character, which is the beginning of the next line.
You could modify the file pointer with the fseek() function. For example, calling fseek(inFile, 0, SEEK_SET) would reset it to the beginning of the file, causing the next fgetc() call to start over from the first character of the file.
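If the goal really is to read the same line repeatedly, one option is to remember the position before reading and seek back afterwards. The following is a rough sketch under that assumption; peek_line is a hypothetical helper name, not a standard function.

```c
#include <stdio.h>

/* Copies the next line of fp into buf without consuming it:
 * remembers the current position, reads the line, then seeks back.
 * Returns 1 on success, 0 at end of file or on error. */
int peek_line(FILE *fp, char *buf, int size)
{
    long pos = ftell(fp);               /* position before the read */

    if (pos == -1L)
        return 0;
    if (fgets(buf, size, fp) == NULL)   /* this advances the position */
        return 0;
    return fseek(fp, pos, SEEK_SET) == 0;   /* undo the advance */
}
```

With the testFile from the question, three consecutive calls to peek_line() would each fill buf with "abc\n", because the position stored inside the FILE object is restored every time.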
Related
The file I'm reading from just has names separated by a line. What happens is that the program tries to print the contents of line_array, and it prints about 20 copies of the last line in the txt file.
#include <stdio.h>
FILE* fp;
int main(){
char* line;
const char* line_array[255];
int i= 0;
int b =0;
fp = fopen("noob.txt","r");
while(fgets(line,255,fp)){
line_array[i]=line;
printf("%s",line);
printf("%s",line_array[i]);
i++;
}
for(;b<i;b++){
printf("%s",line_array[b]);
}
fclose(fp);
return 0;
}
The first issue, in your code,
while(fgets(line,255,fp))
line is used uninitialized. There is no memory allocated to line. It invokes undefined behavior.
Then, you did not check for the success of fopen() before using the returned file pointer. Again, possible UB.
And finally, by saying
line_array[i]=line;
what you did is store the same pointer, line, in every element line_array[n]. The array holds no copies of the text, so in the later printf() loop the latest content of line is printed over and over again.
Solution(s):
Allocate memory to line or use a fixed length-array.
Check for the success of fopen() before using the returned pointer.
Allocate memory to each line_array[n] and use strcpy() to copy the content. Otherwise, you can directly use strdup(), too.
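The three fixes can be sketched as a small helper. This is only an illustration: load_lines and LINE_BUF_SIZE are made-up names, and the caller is assumed to have already checked that fopen() succeeded.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_BUF_SIZE 255   /* assumed maximum line length, as in the question */

/* Reads up to max_lines lines from fp and stores a separate heap copy of
 * each one in lines[]. Returns the number of lines stored. The caller is
 * responsible for calling free() on every stored element. */
size_t load_lines(FILE *fp, char *lines[], size_t max_lines)
{
    char line[LINE_BUF_SIZE];   /* real storage for fgets() to write into */
    size_t count = 0;

    while (count < max_lines && fgets(line, sizeof line, fp) != NULL) {
        char *copy = malloc(strlen(line) + 1);
        if (copy == NULL)
            break;              /* out of memory: keep what we have so far */
        strcpy(copy, line);     /* each element owns its own copy */
        lines[count++] = copy;
    }
    return count;
}
```

Because every element points at its own allocation, overwriting line on the next fgets() call no longer changes the strings already stored.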
I am pretty new to C and memory allocation in general. Basically what I am trying to do is copy the contents of an input file of unknown size and reverse its contents using recursion. I feel that I am very close, but I keep getting a segmentation fault when I try to put in the contents of what I presume to be the reversed contents of the file (I presume because I think I am doing it right....)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int recursive_back(char **lines, int lineNumber, FILE *input) {
char *input_line = malloc(sizeof(char) * 1000);
lines = realloc(lines, (lineNumber) * 1000 * sizeof(char));
if(fgets(input_line, 201, input) == NULL) {
*(lines + lineNumber) = input_line;
return 1;
}
else {
printf("%d\n", lineNumber);
return (1+recursive_back(lines, ++lineNumber, input));
}
}
void backward (FILE *input, FILE *output, int debugflag ) {
int i;
char **lines; //store lines in here
lines = malloc(1000 * sizeof(char *) ); //1000 lines
if(lines == NULL) { //if malloc failed
fprintf(stderr, "malloc of lines failed\n");
exit(1);
}
int finalLineCount, lineCount;
finalLineCount = recursive_back(lines, 0, input);
printf("test %d\n", finalLineCount);
for(i = finalLineCount; i > 0; i--) {
fputs(*(lines+i), output); //segfault here
}
}
I am using a simple input file to test the code. My input file is 6 lines long that says "This is a test input file". The actual input files are being opened in another function and passed over to the backward function. I have verified that the other functions in my program work since I have been playing around with different options. These two functions are the only functions that I am having trouble with. What am I doing wrong?
Your problem is here:
lines = realloc(lines, (lineNumber) * 1000 * sizeof(char));
exactly as @ooga said. There are at least three separate things wrong with it:
You are reallocating the memory block pointed to by recursive_back()'s local variable lines, and storing the new address (supposing that the reallocation succeeds) back into that local variable. The new location is not necessarily the same as the old, but the only pointer to it is a local variable that goes out of scope at the end of recursive_back(). The caller's corresponding variable is not changed (including when the caller is recursive_back() itself), and therefore can no longer be relied upon to be a valid pointer after recursive_back() returns.
You allocate space using the wrong type. lines has type char **, so the object it points to has type char *, but you are reserving space based on the size of char instead.
You are not reserving enough space, at least on the first call, when lineNumber is zero. On that call the space requested is exactly zero bytes; a zero-size realloc() is not portable, and on many implementations its effect is to free the memory pointed to by lines. On subsequent calls, the space allocated is always one line's worth less than you think you are allocating.
It looks like the realloc() is altogether unnecessary if you can rely on the input to have at most 1000 lines, so you should consider just removing it. If you genuinely do need to be able to reallocate in a way that the caller will see, then the caller needs to pass a pointer to its variable, so that recursive_back() can modify it via that pointer.
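A minimal sketch of that last option, assuming a hypothetical helper named ensure_capacity that the caller invokes with the address of its own lines variable:

```c
#include <stdlib.h>

/* Makes sure *lines has room for at least `needed` char* elements.
 * On success, updates *lines and *capacity and returns 1.
 * On failure, leaves both untouched and returns 0. */
int ensure_capacity(char ***lines, size_t *capacity, size_t needed)
{
    size_t new_cap;
    char **bigger;

    if (needed <= *capacity)
        return 1;                       /* already large enough */

    new_cap = (*capacity == 0) ? 16 : *capacity;
    while (new_cap < needed)
        new_cap *= 2;

    /* Size the request by the element type (char *), not by char. */
    bigger = realloc(*lines, new_cap * sizeof **lines);
    if (bigger == NULL)
        return 0;

    *lines = bigger;                    /* the caller's variable is updated */
    *capacity = new_cap;
    return 1;
}
```

A caller would write something like ensure_capacity(&lines, &capacity, lineNumber + 1) before storing into lines[lineNumber]; after the call, the caller's own lines pointer refers to the block that is actually valid.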
I have a function that receives the name of a file as an argument.
The idea is to read each word in the given file and save each one in a linked list (as a struct with a value and a pointer to the next struct).
I could get it working for small files, but when I give a big .txt file I get a segmentation fault.
Using gdb I could figure out that this happens at the while(fscanf(fi, "%s", value) != EOF){ line.
For some reason when the file is bigger the fscanf() segfaults.
As I could figure out the linked list part, here I pasted just enough code to compile and for you to see my problem.
So my questions are:
Why does fscanf() segfault with big .txt files (thousands of words), but not with small files (ten words)?
By the way, is there a better way to check for the end of the file?
Thanks in advance.
bool read(const char* file){
// open file
FILE* fi = fopen(file, "r"); //file is a variable that contains the name of the file to be opened
if (fi == NULL)
{
return false;
}
// malloc for value
char* value = malloc(sizeof(int));
// fscanf() until the end of the file
while(fscanf(fi, "%s", value) != EOF){ // HERE IS MY PROBLEM
// some code for the linked list
// where the value will be saved at the linked list
}
// free space
free(value);
// close the file
fclose(fi);
return true;
}
No, here is your problem:
char* value = malloc(sizeof(int)); // <<<<<<< You allocate room for only sizeof(int) bytes
while(fscanf(fi, "%s", value) != EOF){ // <<<<<<< but you read strings of arbitrary length
So you end up with a buffer overflow!
You have to make sure that you never overflow your buffer by setting a limit, for example by using the width field of fscanf() to cap the number of characters read for the string:
char* value = malloc(512); // Allocate your buffer
while(fscanf(fi, "%511s", value) != EOF){ // read max 511 chars + 1 char for terminating 0
...
(disclaimer: simplified explanation)
A char* is a pointer to an address of memory. It specifies that it points to an array of characters. A malloc call reserves a block of memory of a certain size.
Your line
char* value = malloc(sizeof(int));
creates a character array that can hold 4 characters (as an int is generally 4 bytes long). For it to be a complete string, the last character has to be the null terminator '\0', so really it can only hold 3 readable characters.
You should make that malloc create a block of memory that is larger than the biggest string in the file. Or you could use another safer method such as fgets : http://www.cplusplus.com/reference/cstdio/fgets/
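On the question of a better end-of-file check: a common idiom is to compare the return value of fscanf() with the number of items expected, which stops the loop both at end of file and on a failed match. A small sketch, with count_words and WORD_BUF_SIZE as illustrative names only:

```c
#include <stdio.h>

#define WORD_BUF_SIZE 512   /* assumed upper bound on word length, plus '\0' */

/* Counts whitespace-delimited words in fp. A word longer than
 * WORD_BUF_SIZE - 1 characters is read in pieces and therefore
 * counted as more than one word. */
size_t count_words(FILE *fp)
{
    char word[WORD_BUF_SIZE];
    size_t count = 0;

    /* "%511s" stores at most 511 characters plus the terminating '\0'.
     * fscanf() returns 1 only when a word was actually stored. */
    while (fscanf(fp, "%511s", word) == 1)
        count++;
    return count;
}
```

In the original function, the body of this loop is where each word would be copied into a new linked-list node.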
In the code below, I hope you can see that I have a char* variable and that I want to read in a string from a file. I then want to pass this string back from the function. I'm rather confused by pointers so I'm not too sure what I'm supposed to do really.
The purpose of this is to then pass the array to another function to be searched for a name.
Unfortunately the program crashes as a result and I've no idea why.
char* ObtainName(FILE *fp)
{
char* temp;
int i = 0;
temp = fgetc(fp);
while(temp != '\n')
{
temp = fgetc(fp);
i++;
}
printf("%s", temp);
return temp;
}
Any help would be vastly appreciated.
fgetc returns an int, not a char*. This int is a character from the stream, or EOF if you reach the end of the file.
You're implicitly converting the int to a char*, i.e., interpreting it as an address (turn your warnings on). When you call printf, it reads that address and continues to read a character at a time looking for the null terminator which ends the string, but that address is almost certainly invalid. This is undefined behavior.
I've taken some liberties with what you wanted to accomplish. Rather than deal with pointers, you can just use a fixed-size array as long as you can set a maximum length. I've also included several checks so that you don't run off the end of the buffer or the end of the file. Also important is to make sure that you have a null terminator '\0' at the end of the string.
#define MAX_LEN 100
char* ObtainName(FILE *fp)
{
static char temp[MAX_LEN];
int i = 0;
while(i < MAX_LEN-1)
{
int c = fgetc(fp); // int, so EOF can be told apart from real characters
if (c == EOF || c == '\n')
{
break; // stop at end of file or end of line
}
temp[i] = (char)c;
i++;
}
temp[i] = '\0';
printf("%s", temp);
return temp;
}
So, there are several problems here:
You're not setting aside any storage for the string contents;
You're not storing the string contents correctly;
You're attempting to read memory that doesn't belong to you;
The way you're attempting to return the string is going to give you heartburn.
1. You're not setting aside storage for the string contents
The line
char *temp;
declares temp as a pointer to char; its value will be the address of a single character value. Since it's declared at local scope without the static keyword, its initial value will be indeterminate, and that value may not correspond to a valid memory address.
It does not set aside any storage for the string contents read from fp; that would have to be done as a separate step, which I'll get to below.
2. You're not storing the string contents correctly
The line
temp = fgetc(fp);
reads the next character from fp and assigns it to temp. First of all, this means you're only storing the last character read from the stream, not the whole string. Secondly, and more importantly, you're assigning the result of fgetc() (which returns a value of type int) to an object of type char * (which is treated as an address). You're basically saying "I want to treat the letter 'a' as an address into memory." This brings us to...
3. You're attempting to read memory that doesn't belong to you
In the line
printf("%s", temp);
you're attempting to print out the string beginning at the address stored in temp. Since the last thing you wrote to temp was most likely a character whose value is < 127, you're telling printf to start at a very low and most likely not accessible address, hence the crash.
4. The way you're attempting to return the string is guaranteed to give you heartburn
Since you've defined the function to return a char *, you're going to need to do one of the following:
Allocate memory dynamically to store the string contents, and then pass the responsibility of freeing that memory on to the function calling this one;
Declare an array with the static keyword so that the array doesn't "go away" after the function exits; however, this approach has serious drawbacks;
Change the function definition.
Allocate memory dynamically
You could use dynamic memory allocation routines to set aside a region of storage for the string contents, like so:
char *temp = malloc( MAX_STRING_LENGTH * sizeof *temp );
or
char *temp = calloc( MAX_STRING_LENGTH, sizeof *temp );
and then return temp as you've written.
Both malloc and calloc set aside the number of bytes you specify; calloc will initialize all those bytes to 0, which takes a little more time, but can save your bacon, especially when dealing with text.
The problem is that somebody has to deallocate this memory when it's no longer needed; since you return the pointer, whoever calls this function now has the responsibility to call free() when it's done with that string, something like:
void Caller( FILE *fp )
{
...
char *name = ObtainName( fp );
...
free( name );
...
}
This spreads the responsibility for memory management around the program, increasing the chances that somebody will forget to release that memory, leading to memory leaks. Ideally, you'd like to have the same function that allocates the memory free it.
Use a static array
You could declare temp as an array of char and use the static keyword:
static char temp[MAX_STRING_SIZE];
This will set aside MAX_STRING_SIZE characters in the array when the program starts up, and it will be preserved between calls to ObtainName. No need to call free when you're done.
The problem with this approach is that by creating a static buffer, the code is not re-entrant; if ObtainName called another function which in turn called ObtainName again, that new call will clobber whatever was in the buffer before.
Why not just declare temp as
char temp[MAX_STRING_SIZE];
without the static keyword? The problem is that when ObtainName exits, the temp array ceases to exist (or rather, the memory it was using is available for someone else to use). That pointer you return is no longer valid, and the contents of the array may be overwritten before you can access it again.
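To make the clobbering concrete, here is a tiny sketch using a hypothetical function format_id (not part of the original code) that returns a pointer to its static buffer:

```c
#include <stdio.h>

/* Hypothetical helper: formats n into a static buffer and returns it.
 * Every call returns the same address and overwrites the previous text. */
const char *format_id(int n)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "id-%d", n);
    return buf;
}
```

Calling format_id(1) and then format_id(2) yields two equal pointers, and both now read "id-2". A caller that needs to keep the first result has to copy it out before making the second call.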
Change the function definition
Ideally, you'd like for ObtainName to not have to worry about the memory it has to write to. The best way to achieve that is for the caller to pass the target buffer as a parameter, along with the buffer's size:
int ObtainName( FILE *fp, char *buffer, size_t bufferSize )
{
...
}
This way, ObtainName writes data into the location that the caller specifies (useful if you want to obtain multiple names for different purposes). The function will return an integer value, which can be a simple success or failure, or an error code indicating why the function failed, etc.
Note that if you're reading text, you don't have to read character by character; you can use functions like fgets() or fscanf() to read an entire string at a time.
Use fscanf if you want to read whitespace-delimited strings (i.e., if the input file contains "This is a test", fscanf( fp, "%s", temp); will only read "This"). If you want to read an entire line (delimited by a newline character), use fgets().
Assuming you want to read an individual string at a time, you'd use something like the following (assumes C99):
#define FMT_SIZE 20
...
int ObtainName( FILE *fp, char *buffer, size_t bufsize )
{
int result = 1; // assume success
int scanfResult = 0;
char fmt[FMT_SIZE];
sprintf( fmt, "%%%zus", bufsize - 1 );
scanfResult = fscanf( fp, fmt, buffer );
if ( scanfResult == EOF )
{
// hit end-of-file before reading any text
result = 0;
}
else if ( scanfResult == 0 )
{
// did not read anything from input stream
result = 0;
}
else
{
result = 1;
}
return result;
}
So what's this noise
char fmt[FMT_SIZE];
sprintf( fmt, "%%%zus", bufsize - 1 );
about? There is a very nasty security hole in fscanf() when you use the %s or %[ conversion specifiers without a maximum length specifier. The %s conversion specifier tells fscanf to read characters until it sees a whitespace character; if there are more non-whitespace characters in the stream than the buffer is sized to hold, fscanf will store those extra characters past the end of the buffer, clobbering whatever memory is following it. This is a common malware exploit. So we want to specify a maximum length for the input; for example, %20s says to read no more than 20 characters from the stream and store them to the buffer.
Unfortunately, since the buffer length is passed in as an argument, we can't write something like %20s, and fscanf doesn't give us a way to specify the length as an argument the way fprintf does. So we have to create a separate format string, which we store in fmt. Because the code passes bufsize - 1 to leave room for the terminating '\0', a buffer length of 10 produces the format string %9s, and a buffer length of 1000 produces %999s.
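For the whole-line case mentioned earlier, fgets() takes the buffer size directly, so no format string has to be built. The following is a sketch only; ReadNameLine is an illustrative name, and stripping the trailing newline is an assumption about what the caller wants.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads one line from fp into buffer (at most bufsize - 1 characters)
 * and strips the trailing newline, if any.
 * Returns 1 on success, 0 at end of file, on error, or on a bad size. */
int ReadNameLine(FILE *fp, char *buffer, size_t bufsize)
{
    if (bufsize == 0 || bufsize > INT_MAX)  /* fgets() takes an int size */
        return 0;

    if (fgets(buffer, (int)bufsize, fp) == NULL)
        return 0;

    buffer[strcspn(buffer, "\n")] = '\0';   /* cut at the newline, if present */
    return 1;
}
```

A line longer than the buffer is not lost: the remainder stays in the stream and is returned by the next call.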
The following code expands on that in your question, and returns the string in allocated storage:
char* ObtainName(FILE *fp)
{
int temp;
int i = 1;
char *string = malloc(i);
if(NULL == string)
{
fprintf(stderr, "malloc() failed\n");
goto CLEANUP;
}
*string = '\0';
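temp = fgetc(fp);
while(temp != '\n' && temp != EOF) // stop at end of line or end of file
{
char *newMem;
++i;
newMem=realloc(string, i);
if(NULL==newMem)
{
fprintf(stderr, "realloc() failed.\n");
goto CLEANUP;
}
string=newMem;
string[i-2] = (char)temp; // i bytes are allocated, so valid indices are 0..i-1
string[i-1] = '\0';
temp = fgetc(fp);
}
CLEANUP:
if(string != NULL) printf("%s", string); // string is NULL if malloc() failed
return(string);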
}
Take care to 'free()' the string returned by this function, or a memory leak will occur.
I have a function that reads an input file and is supposed to modify the contents of a char** and a int*. The function is as follows:
void
input_parser(arguments* args, char** input, int* files) {
char buffer[MAX];
FILE *fr;
fr = fopen(args->file,"r");
if (fr == NULL) {
printf("No correct input file was entered\n");
exit(0);
}
while(fgets(buffer,MAX,fr) != NULL) {
input[*files] = strtok(buffer,"\n");
(*files)++;
}
fclose(fr);
return;
}
I have defined input and files as follows in the main program:
char* input[25];
files = 0;
I call the function as follows:
input_parser(args, input, &files);
The input file contains 3 lines as follows:
output1.xml
output2.xml
output3.xml
I notice that during the while loop the 'current' value is read correctly but stored in all input[*] resulting in:
input[0] = output3.xml
input[1] = output3.xml
input[2] = output3.xml
I would greatly appreciate if someone has any idea what is going wrong here.
The function is storing the address of the local variable buffer in each element of the input array: you need to copy the value returned by strtok(). The code as it stands is undefined behaviour, as buffer is out of scope once input_parser() returns; even if it were not, the logic would be incorrect anyway.
If you have strdup(), you just use it:
input[*files] = strdup(strtok(buffer,"\n")); /* NULL check omitted. */
otherwise malloc() and strcpy(). Remember to free() the elements of input when no longer required.
Initialise input to be able to determine which elements point to valid strings:
char* input[25] = { NULL };
You are going to end up with dangling pointers, which point inside your buffer after the buffer has been deallocated.
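Where strdup() is not available, a small helper can make the copy. This is a sketch under assumptions: copy_without_newline is a made-up name, and it uses strcspn() rather than strtok() so that the input buffer is left unmodified.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of `line` without its trailing newline,
 * or NULL if allocation fails. The caller must free() the result. */
char *copy_without_newline(const char *line)
{
    size_t len = strcspn(line, "\n");   /* length up to, not including, '\n' */
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, line, len);
    copy[len] = '\0';
    return copy;
}
```

Inside the loop this would be used as input[*files] = copy_without_newline(buffer);. Unlike strtok(buffer, "\n"), which returns NULL for a line consisting only of a newline, the helper returns an empty string in that case.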