C - Realloc invalid next size w strings - c

I'm having a problem in C when listing the files from a folder.
The strange here is that it works fine several times, but then the program is calling other functions and after that run again then the function to list the files.
I added a print malloc_usable_size - it says that has enough space but when it breaks out it says 0.
Also when is broken out the ent->d_name has some strange characters.
At the end finishes with an error: realloc(): invalid next size
Do you have any idea?
Here is the code:
struct dirent *ent;
int size = 6;
char *file_names = NULL, *temp = NULL;
while((ent=readdir(dirp))!=NULL) {
if( (strcmp(ent->d_name, ".")!=0) && (strcmp(ent->d_name, "..")!=0) ) {
size += strlen(ent->d_name)*sizeof(char) + 6;
temp = file_names;
file_names = (char *) realloc(file_names, size);
if(file_names != NULL) {
strcat(file_names, ent->d_name);
strcat(file_names, "\n\0");
}
else {
file_names = temp;
}
}
}
closedir(dirp);
if(file_names != NULL) {
strcat(file_names, "\0");
}

strcat appends to the end of a string. But you never start off with a string; the first call to realloc gets you uninitialized memory. Perhaps it chances that you get a zero byte the first time, but after other functions have used and freed memory, next time you allocate memory it started with a non-zero byte.
You'll need to set file_names[0] = 0; after the first allocation. (e.g. if ( temp == NULL ) file_names[0] = 0;
BTW it's more usual to use this pattern for realloc: (and don't cast it)
temp = realloc(file_names, size);
if ( temp != NULL )
{
if ( file_names == NULL )
temp[0] = 0;
file_names = temp;
strcat(file_names, ent->d_name);
strcat(file_names, "\n"); // extra \0 is redundant
}
NB. The algorithm is rather inefficient (every call to strcat has to scan the whole string again). You could instead store the current offset; this would also fix your problem with the initial strcat. E.g. (pseudocode)
// before loop
size_t offset = 0;
// in the loop; after allocating the right amount of memory as before
offset += sprintf(file_names + offset, "%s\n", ent->d_name);

Related

C - Copying text from a file results in unknown characters being copied over as well

When running the following C file, copying the character to fgetc to my tmp pointer results in unknown characters being copied over for some reason. The characters received from fgetc() are the expected characters. However, for some reason when assigning this character to my tmp pointer unknown characters get copied over.
I've tried looking for the reason why online, but haven't found any luck. From what I have read it could be something to do with UTF-8 and ASCII issues. However, I'm not sure about the fix. I'm a relatively new C programmer and still new to memory management.
Output:
TMP: Hello, DATA!�
TEXT: Hello, DATA!�
game.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <allegro5/allegro5.h>
#include <allegro5/allegro_font.h>
const int WIN_WIDTH = 1366;
const int WIN_HEIGHT = 768;
char *readFile(const char *fileName) {
FILE *file;
file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
}
size_t tmpSize = 1;
char *tmp = (char *)malloc(tmpSize);
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
}
for (int c = fgetc(file); c != EOF; c = fgetc(file)) {
if (c != NULL) {
if (tmpSize > 1)
tmp = (char *)realloc(tmp, tmpSize);
tmp[tmpSize - 1] = (char *)c;
tmpSize++;
}
}
tmp[tmpSize] = 0;
fclose(file);
printf("TMP: %s\n", tmp);
return tmp;
}
int main(int argc, char **argv) {
al_init();
al_install_keyboard();
ALLEGRO_TIMER* timer = al_create_timer (1.0 / 30.0);
ALLEGRO_EVENT_QUEUE *queue = al_create_event_queue();
ALLEGRO_DISPLAY* display = al_create_display(WIN_WIDTH, WIN_HEIGHT);
ALLEGRO_FONT* font = al_create_builtin_font();
al_register_event_source(queue, al_get_keyboard_event_source());
al_register_event_source(queue, al_get_display_event_source(display));
al_register_event_source(queue, al_get_timer_event_source(timer));
int redraw = 1;
ALLEGRO_EVENT event;
al_start_timer(timer);
char *text = readFile("game.DATA");
printf("TEXT: %s\n", text);
while (1) {
al_wait_for_event(queue, &event);
if (event.type == ALLEGRO_EVENT_TIMER)
redraw = 1;
else if ((event.type == ALLEGRO_EVENT_KEY_DOWN) || (event.type == ALLEGRO_EVENT_DISPLAY_CLOSE))
break;
if (redraw && al_is_event_queue_empty(queue)) {
al_clear_to_color(al_map_rgb(0, 0, 0));
al_draw_text(font, al_map_rgb(255, 255, 255), 0, 0, 0, text);
al_flip_display();
redraw = false;
}
}
free(text);
al_destroy_font(font);
al_destroy_display(display);
al_destroy_timer(timer);
al_destroy_event_queue(queue);
return 0;
}
game.DATA file:
Hello, DATA!
What I use to run the program:
gcc game.c -o game $(pkg-config allegro-5 allegro_font-5 --libs --cflags)
--EDIT--
I tried taking the file reading code and running it in a new c file, for some reason it works there, but not when in the game.c file with allegro code.
test.c:
#include <stdlib.h>
#include <stdio.h>
char *readFile(const char *fileName) {
FILE *file;
file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
}
size_t tmpSize = 1;
char *tmp = (char *)malloc(tmpSize);
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
}
for (int c = fgetc(file); c != EOF; c = fgetc(file)) {
if (c != NULL) {
if (tmpSize > 1)
tmp = (char *)realloc(tmp, tmpSize);
tmp[tmpSize - 1] = (char *)c;
tmpSize++;
}
}
tmp[tmpSize] = 0;
fclose(file);
printf("TMP: %s\n", tmp);
return tmp;
}
void main() {
char *text = readFile("game.DATA");
printf("TEXT: %s\n", text);
free(text);
return 0;
}
Produces the correct output always:
TMP: Hello, DATA!
TEXT: Hello, DATA!
When you write a loop that updates various things each time through, like you do with tmpSize in your loop here, it's important to have a handle on what the theoretical computer science types call your "loop invariants". That is, what is it that's true each time through the loop? It's important not only to maintain your loop invariants properly, but also to pick your loop invariants so that they're easy to maintain, and easy for a later reader to understand and to verify.
Since tmpSize starts out as 1, I'm guessing your loop invariant is trying to be, "tmpSize is always one more than the size of the string I've read so far". A reason for picking that slightly-strange loop invariant is, of course, that you'll need that extra byte for the terminating \0. The other clue is that you're setting tmp[tmpSize-1] = c;.
But here's the first problem. When we exit the loop, and if tmpSize is still one more than the size of the string you've read so far, let's see what happens. Suppose we read three characters. So tmpSize should be 4. So we'll set tmp[4] = 0;. But wait! Remember, arrays in C are 0-based. So the three characters we read are in tmp[0], tmp[1], and tmp[2], and we want the terminating \0 character to go into tmp[3], not tmp[4]. Something is wrong.
But actually, it's worse than that. I wasn't at all sure I understood the loop invariant, so I cheated, and inserted a few debugging printouts. Right before the realloc call, I added
printf("realloc %zu\n", tmpSize);
and at the end, right before the tmp[tmpSize] = 0; line, I added
printf("final %zu\n", tmpSize);
The last few lines it printed (while reading a game.DATA file containing "Hello, DATA!" just like yours) were:
...
realloc 10
realloc 11
realloc 12
final 13
But this is off by two! If the last reallocation gave the array a size of 12, the valid indices are from 0 to 11. But somehow we end up writing the \0 into cell 13.
It took me a while to figure it out, but the second problem is that you do the reallocation at the top of the loop, before you've incremented tmpLen.
To me, the loop invariant of "one more than the size of the string read so far" is just too hard to think about. I very much prefer to use a loop invariant where the "size" variable keeps track of the number of characters I have read, not +1 or -1 off of that. Let's see how that loop might look. (I've also cleaned up a few other things.)
size_t tmpSize = 0;
char *tmp = malloc(tmpSize+1);
if (tmp == NULL) {
printf("malloc() failed.\n");
exit(1);
}
for (int c = getc(file); c != EOF; c = getc(file)) {
printf("realloc %zu\n", tmpSize+1+1);
tmp = realloc(tmp, tmpSize+1+1); /* +1 for c, +1 for \0 */
if (tmp == NULL) {
printf("realloc() failed.\n");
exit(1);
}
tmp[tmpSize] = c;
tmpSize++;
}
printf("final %zu\n", tmpSize);
tmp[tmpSize] = '\0';
There's still something fishy here -- I said I didn't like "fudge factors" like +1, and here I've got two -- but at least now the debugging printouts go
...
realloc 11
realloc 12
realloc 13
final 12
so it looks like I'm not overrunning the allocated memory any more.
To make this even better, I want to take a slightly different approach. You're not supposed to worry abut efficiency at first, but I can tell you that a loop that calls realloc to make the buffer bigger by 1, each time it reads a character, can end up being really inefficient. So let's make a few more changes:
size_t nchAllocated = 0;
size_t nchRead = 0;
char *tmp = NULL;
for (int c = getc(file); c != EOF; c = getc(file)) {
if(nchAllocated <= nchRead) {
nchAllocated += 10;
printf("realloc %zu\n", nchAllocated);
tmp = realloc(tmp, nchAllocated);
if (tmp == NULL) {
printf("realloc() failed.\n");
exit(1);
}
}
tmp[nchRead++] = c;
}
printf("final %zu\n", nchRead);
tmp[nchRead] = '\0';
Now there are two separate variables: nchAllocated keeps track of exactly how many characters I've allocated, and nchRead keeps track of exactly how many characters I've read. And although I've doubled the number of "counter" variables, in doing so I've simplified a lot of other things, so I think it's a net improvement.
First of all, notice that there are no +1 fudge factors any more, at all.
Second, this loop doesn't call realloc every time -- instead it allocates characters 10 at a time. And because there are separate variables for the number of characters allocated versus read, it can keep track of the fact that it may have allocated more characters than it has read so far. For this code, the debugging printouts are:
realloc 10
realloc 20
final 12
Another little improvement is that we don't have to "preallocate" the array -- there's no initial malloc call. One of our loop invariants is that nchAllocated is the number of characters allocated, and we start this out as 0, and if there are no characters allocated, then it's okay that tmp starts out as NULL. This relies on the fact that when you call realloc for the first time, with tmp equal to NULL, realloc is fine with that, and essentially acts like malloc.
But there's one question you might be asking: If I got rid of all my fudge factors, where do we arrange to allocate one extra byte to hold the terminating \0 character? It's there, but it's subtle: it's lurking in the test
if(nchAllocated <= nchRead)
The very first time through the loop, nchAllocated will be 0, and nchRead will be 0, but this test will be true, so we'll allocate our first chunk of 10 characters, and we're off and running. (If we didn't care about the \0 character, the test nchAllocated < nchRead would have sufficed.)
...But, actually, I've made a mistake! There's a subtle bug here!
What if the file being read is empty? tmp will start out NULL, and we'll never make any trips through the loop, so tmp will remain NULL, so when we assign tmp[nchRead] = 0 it'll blow up.
And actually, it's worse than that. If you trace through the logic very carefully, you'll find that any time the file size is an exact multiple of 10, not enough space gets allocated for the \0, after all.
And this indicates a significant drawback of the "allocate characters 10 at a time" scheme. The code is now harder to test, because the control flow is different for files whose size is a multiple of 10. If you never happen to test that case, you won't realize that this program has a bug in it.
The way I usually fix this is to notice that the \0 byte I have to add to terminate the string is sort of balanced by the EOF character I read that indicated the end of the file. Maybe, when I read the EOF, I can use it to remind me to allocate space for the \0. That's actually easy enough to do, and it looks like this:
int c;
while(1) {
c = getc(file);
if(nchAllocated <= nchRead) {
nchAllocated += 10;
printf("realloc %zu\n", nchAllocated);
tmp = realloc(tmp, nchAllocated);
if (tmp == NULL) {
printf("realloc() failed.\n");
exit(1);
}
}
if(c == EOF)
break;
tmp[nchRead++] = c;
}
printf("final %zu\n", nchRead);
tmp[nchRead] = '\0';
The trick here is that we don't test for EOF until after we've checked that there's enough space in the buffer, and called realloc if necessary. It's as if we allocate space in the buffer for the EOF -- except then we use that space for the \0 instead. This is what I meant by "use it to remind me to allocate space for the \0".
Now, I have to admit that there's still a drawback here, in that the loop is now somewhat unconventional. A loop that has while(1) at the top looks like an infinite loop. This one has
if(c == EOF) break;
down in the middle of it, so it is literally a "break in the middle" loop. (This is by contrast to conventional for and while loops, which are "break at the top", or a do/while loop, which is "break at the bottom".) Personally, I find this to be a useful idiom, and I use it all the time. But some programmers, and perhaps your instructor, would frown on it, because it's "weird", it's "different", it's "unconventional". And to some extent they're right: unconventional programming is somewhat dangerous programming, and is bad if later maintenance programmers can't understand it because they don't recognize or don't understand the idioms in it. (It's sort of the programming equivalent of the English word "ain't", or a split infinitive.)
Finally, if you're still with me, I have one more point to make. (And if you are still with me, thank you. I realize this answer has gotten very long, but I hope you're learning something.)
Earlier I said that "a loop that calls realloc to make the buffer bigger by 1, each time it reads a character, can end up being really inefficient." It turns out that a loop that makes the buffer bigger by 10 isn't much better, and can still be significantly inefficient. You can do a little better by incrementing it by 50 or 100, but if you're dealing with input that might be really big (thousands of characters or more), you're usually better off increasing the buffer size by leaps and bounds, perhaps by multiplying it by some factor, rather than adding. So here's the final version of that part of the loop:
if(nchAllocated <= nchRead) {
if(nchAllocated == 0) nchAllocated = 10;
else nchAllocated *= 2;
printf("realloc %zu\n", nchAllocated);
tmp = realloc(tmp, nchAllocated);
And even this improvement -- multiplying by 2, rather than adding something -- comes with a cost: we need an extra test, to special-case the first trip through the loop, because nchAllocated started out as 0, and 0 × 2 = 0.
Your reallocation scheme is incorrect: the array is always too short by one byte and the null terminator is written one position past the end of the string, instead of at the end of the string. This causes an extra byte to be printed, with whatever value happens to be in memory in the block returned by realloc(), which is uninitialized.
It is less confusing to use tmpLen as the length of the string read si far and allocate 2 extra bytes for the newly read character and the null terminator.
Furthermore the test c != NULL makes no sense: c is byte and NULL is a pointer. Similarly, tmp[tmpSize - 1] = (char *)c; is incorrect: you should just write
tmp[tmpSize - 1] = c;
Here is a corrected version:
char *readFile(const char *fileName) {
FILE *file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
return NULL;
}
size_t tmpLen = 0;
char *tmp = (char *)malloc(tmpLen + 1);
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
fclose(file);
return NULL;
}
int c;
while ((c = fgetc(file)) != EOF) {
char *new_tmp = (char *)realloc(tmp, tmpLen + 2);
if (new_tmp == NULL) {
printf("realloc() failure for %zu bytes.\n", tmpLen + 2);
free(tmp);
fclose(file);
return NULL;
}
tmp = new_tmp;
tmp[tmpLen++] = c;
}
tmp[tmpLen] = '\0';
fclose(file);
printf("TMP: %s\n", tmp);
return tmp;
}
It is usually better to reallocate in chunks or with a geometric size increment. Here is a simple implementation:
char *readFile(const char *fileName) {
FILE *file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
return NULL;
}
size_t tmpLen = 0;
size_t tmpSize = 16;
char *tmp = (char *)malloc(tmpSize);
char *newTmp;
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
fclose(file);
return NULL;
}
int c;
while ((c = fgetc(file)) != EOF) {
if (tmpSize - tmpLen < 2) {
size_t newSize = tmpSize + tmpSize / 2;
newTmp = (char *)realloc(tmp, newSize);
if (newTmp == NULL) {
printf("realloc() failure for %zu bytes.\n", newSize);
free(tmp);
fclose(file);
return NULL;
}
tmpSize = newSize;
tmp = newTmp;
}
tmp[tmpLen++] = c;
}
tmp[tmpLen] = '\0';
fclose(file);
printf("TMP: %s\n", tmp);
// try to shrink allocated block to the minimum size
// if realloc() fails, return the current block
// it seems impossible for this reallocation to fail
// but the C Standard allows it.
newTmp = (char *)realloc(tmp, tmpLen + 1);
return newTmp ? newTmp : tmp;
}

Using gdb--still can't find malloc error

I've looked at previous posts about this and they didn't help me locate my problem... To keep it short I'm making a function should read a text file line by line (and yes, I do realize there are many posts like this). But when I run my program through CMD, it's giving me this error:
Program received signal SIGSEGV, Segmentation fault.
__GI___libc_realloc (oldmem=0x10011, bytes=1) at malloc.c:2999
2999 malloc.c: No such file or directory.
I'm pretty sure I wrote out my malloc/realloc lines correctly. I've tried finding alot of posts similar to this, but none of the solutions offered are helping. If you have any post suggestions that maybe I missed, please let me know. Regardless, here are my functions:
char* read_single_line(FILE* fp){
char* line = NULL;
int num_chars = 0;
char c;
fscanf(fp, "%c", &c);
while(!feof(fp)) {
num_chars++;
line = (char*) realloc(line, num_chars * sizeof(char));
line[num_chars -1] = c;
if (c == '\n') {
break;
}
fscanf(fp, "%c", &c);
}
if(line != NULL) {
line = realloc(line, (num_chars+1) * sizeof(char));
line[num_chars] = '\0';
}
return line;
}
void read_lines(FILE* fp, char*** lines, int* num_lines) {
int i = 0;
int num_lines_in_file = 0;
char line[1000];
if (fp == NULL) {
*lines = NULL;
*num_lines = 0;
} else {
(*lines) = (char**)malloc(1 * sizeof(char*));
while (read_single_line(fp) != NULL) {
(*lines)[i] = (char*)realloc((*lines)[i], sizeof(char));
num_lines_in_file++;
i++;
}
*lines[i] = line;
*num_lines = num_lines_in_file;
}
}
I would really appreciate any help--I'm a beginner in C so hear me out!!
char line[1000];
:
while (read_single_line(fp) != NULL) {
:
}
*lines[i] = line;
This doesn't look at all right to me. Your read_single_line function returns an actual line but, other than checking that against NULL, you never actually store it anywhere. Instead, you point the line pointer to line, a auto-scoped variable which could contain literally anything (and, more worrying, possibly no terminator character).
I think you should probably store the return value from read_single_line and use that to set your line pointers.
By the way, it may also be quite inefficient to expand your buffer one character at a time. I'd suggest initially allocating more bytes and then keeping both that capacity and the bytes currently in use. Then, only when you're about to use beyond your capacity do you expand, and by more than one. In pseudo-code, something like:
def getLine:
# Initial allocation with error check.
capacity = 64
inUse = 0
buffer = allocate(capacity)
if buffer == null:
return null
# Process each character made available somehow.
while ch = getNextChar:
# Expand buffer if needed, always have room for terminator.
if inUse + 1 == capacity:
capacity += 64
newBuff = realloc buffer with capacity
# Failure means we have to release old buffer.
if newBuff == null:
free buffer
return null
# Store character in buffer, we have enough room.
buffer[inUse++] = ch
# Store terminator, we'll always have room.
buffer[inUse] = '\0';
return buffer
You'll notice, as well as the more efficient re-allocations, better error checking on said allocations.
while (read_single_line(fp) != NULL) {
(*lines)[i] = (char*)realloc((*lines)[i], sizeof(char));
num_lines_in_file++;
i++;
}
*lines[i] = line;
There are more errors then lines in this short fragment. Let's go over them one by one.
while (read_single_line(fp) != NULL)
You read a line, check whether it's a null pointer, and throw it away instead of keeping it around to accumulate it in the lines array.
(*lines)[i] = (char*)realloc((*lines)[i], sizeof(char));
You are trying to realloc (*lines[i]). First, it does not exist beyond i==0, because (*lines) was only ever allocated to contain one element. Second, it makes no sense to realloc individual lines, because you are (should be) getting perfect ready-made lines from the line reading function. You want to realloc *lines instead:
*lines = realloc (*lines, i * sizeof(char*));
Now these two lines
num_lines_in_file++;
i++;
are not an error per se, but why have two variables which always have the exact same value? In addition, you want them (it) be before the realloc line, per usual increment-realloc-assign pattern (you are using it in the other function).
Speaking of the assign part, there isn't any. You should insert one now:
(*lines)[i-1] = // what?
The line pointer you should have saved when calling read_single_line, that's what. From the beginning:
char* cur_line;
int i = 0;
*lines = NULL;
while ((cur_line=read_single_line(fp)) != NULL)
{
++i;
*lines = realloc (*lines, i * sizeof(char*));
(*lines)[i-1] = cur_line;
}
*num_lines = i;
The last one
*lines[i] = line;
is downright ugly.
First, lines is not an array, it's a pointer pointing to a single variable, so lines[i] accesses intergalactic dust. Second, you are trying to assign it an address of a local variable, which will cease to exist as soon as your function returns. Third, what is it doing outside of the loop? If you want to terminate your line array with a null pointer, do so:
}
*num_lines = i;
++i;
*lines = realloc (*lines, i * sizeof(char*));
(*lines)[i-1] = NULL;
But given that you return the number of lines, this may not be necessary.
Disclaimer: none of the above is tested. If there are any bugs, fix them!

How to fix console return of malloc(): memory corruption (fast)

When I ran a program, my console output showed the following:
malloc(): memory corruption (fast) and was followed by what seemed like an address in memory. I have narrowed it down to a few functions, but I feel that I free all memory that I allocate properly.
The following function takes a string representing a file name.
void readAndProcessFile(char* filename){
FILE *fileptr;
char* word = malloc(32*sizeof(char));
fileptr = fopen(filename, "r");
while(fscanf(fileptr,"%s",word) != EOF){
processWord(word);
}
fclose(fileptr);
free(word);
}
This function takes a word, removes any non-alphabetic characters and changes all letters to uppercase.
void processWord(char* text){
char* processedWord;
processedWord = trimAndCaps(text);
if(processedWord != NULL && processedWord[0] != '\0'){
addWord(processedWord);
}
free(processedWord);
}
Here's the trim and cap function
char* trimAndCaps(char* text)
{
int i = 0;
int j = 0;
char currentChar;
char* rv = malloc(sizeof(text));
while ((currentChar = text[i++]) != '\0')
{
if (isalpha(currentChar))
{
rv[j++] = toupper(currentChar);
}
}
rv[j] = '\0';
return rv;
}
And just for good measure here's the addWord function
void addWord(char* word)
{
// check if word is already in list
struct worddata* currentWord = findWord(word);
// word is in list
if(currentWord != NULL)
{
incrementCount(currentWord);
}
// word is not in list
else
{
currentWord = malloc(sizeof(struct worddata));
currentWord->count = 1;
strcpy(currentWord->word, word);
ll_add(wordList, currentWord, sizeof(struct worddata));
free(currentWord);
}
}
As you can see, all instances where I manually allocate memory, I free afterwards. This program works when there is a smaller amount of words, but not for larger. My thought process leads me to believe that there is some sort of leak, but when I have few enough words to the point that I can run it, I run the following code:
// The following code will print out the final dynamic memory used
struct mallinfo veryend = mallinfo();
fprintf(stderr, "Final Dynamic Memory used : %d\n", veryend.uordblks);
And this shows a 0 every time for memory used. What else can cause this? Any direction or fixes are much appreciated.
The following line doesn't do what you are hoping to:
char* rv = malloc(sizeof(text));
It only allocates 4 or8 bytes or memory depending on the size of pointers on your platform.
You need:
char* rv = malloc(strlen(text) + 1);

How to convert my malloc + strcpy to strdup in C?

I am trying to save csv data in an array for use in other functions. I understand that strdup is good for this, but am unsure how to make it work for my situation. Any help is appreciated!
The data is stored in a struct:
typedef struct current{
char **data;
}CurrentData;
Function call:
int main(void){
int totalProducts = 0;
CurrentData *AllCurrentData = { '\0' };
FILE *current = fopen("C:\\User\\myfile.csv", "r");
if (current == NULL){
puts("current file data not found");
}
else{
totalProducts = getCurrentData(current, &AllCurrentData);
}
fclose(current);
return 0;
}
How I allocated memory;
int getCurrentData(FILE *current, CurrentData **AllCurrentData){
*AllCurrentData = malloc(totalProducts * sizeof(CurrentData));
/*allocate struct data memory*/
while ((next = fgetc(current)) != EOF){
if (next == '\n'){
(*AllCurrentData)[newLineCount].data = malloc(colCount * sizeof(char*));
newLineCount++;
}
}
newLineCount = 0;
rewind(current);
while ((next = fgetc(current)) != EOF && newLineCount <= totalProducts){
if (ch != '\0'){
buffer[i] = ch;
i++;
characterCount++;
}
if (ch == ',' && next != ' ' || ch == '\n' && ch != EOF){
if (i > 0){
buffer[i - 1] = '\0';
}
length = strlen(buffer);
/*(*AllCurrentData)[newLineCount].data[tabCount] = malloc(length + 1); /* originally was using strcpy */
strcpy((*AllCurrentData)[newLineCount].data[tabCount], buffer);
*/
(*AllCurrentData)[newLineCount].data[tabCount] = strdup(buffer); /* something like this? */
i = 0;
tabCount++;
for (j = 0; j < BUFFER_SIZE; j++){
buffer[j] = '\0';
}
}
You define a ptr AllCurrentData but you should set it to NULL.
CurrentData* AllCurrentData = NULL;
In getCurrentData you use totalProducts which seems a bit
odd since it is a local variable in main(), either you have another
global variable with the same name or there is an error.
The **data inside the structure seems odd, instead maybe you want
to parse the csv line and create proper members for them. You already
have an array of CurrentData so it seems odd to have another array
inside the struct -- i am just guessing cause you haven't explained
that part.
Since a csv file is line based use fgets() to read one line
from the file, then parse the string by using e.g. strtok or just by
checking the buffer after delimiters. Here strdup can come into play,
when you have taken out a token, do a strdup on it and store it in your
structure.
char line[255];
if ( fgets(line,sizeof(line),current) != NULL )
{
char* token = strdup(strtok( line, "," ));
...
}
Instead of allocating a big buffer that may be enough (or not) use
realloc to increase your buffer as you read from the file.
That said there are faster ways to extract data from a csv-file e.g.
you can read in the whole file with fread, then look for delimiters
and set these to \0 and create an array of char pointers into the buffer.
Okay, I wouldn't comment on other parts of your code, but you can use strdup to get rid of this line (*AllCurrentData)[newLineCount].data = malloc(colCount * sizeof(char*));, and this line (*AllCurrentData)[newLineCount].data[tabCount] = strdup(buffer); /* something like this? */
and replace them with this: (*AllCurrentData)[newLineCount].data = strdup(buffer);
For the function to read in the array of strings I would start with the following approach. This has not been tested or even compiled however it is a starting place.
There are a number of issues not addressed by this sample. The temporary buffer size of 4K characters may or may not be sufficiently large for all lines in the file. There may be more lines of text in the file than elements in the array of pointers and there is no indication from the function that this has happened.
Improvements to this would be better error handling. Also it might be modified so that the array of pointers is allocated in the function with some large amount and then if there are more lines in the file than array elements, using the realloc() function to enlarge the array of pointers by some size. Perhaps also a check on the file size and using an average text line length would be appropriate to provide an initial size for the array of pointers.
// Read lines of text from a text file returning the number of lines.
// The caller will provide an array of char pointers which will be used
// to return the list of lines of text from the file.
int GetTextLines (FILE *hFile, char **pStringArrays, int nArrayLength)
{
int iBuffSize = 4096;
int iLineCount = 0;
char tempBuffer [4096];
while (fgets (tempBuffer, iBuffSize, hFile) && iLineCount < nArrayLength) {
pStringArrays[iLineCount] = malloc ((strlen(tempBuffer) + 1) * sizeof (char));
if (! pStringArrays[iLineCount])
break;
strcpy (pStringArrays[iLineCount], tempBuffer);
iLineCount++;
}
return iLineCount;
}

how to put char * into array so that I can use it in qsort, and then move on to the next line

I have lineget function that returns char *(it detects '\n') and NULL on EOF.
In main() I'm trying to recognize particular words from that line.
I used strtok:
int main(int argc, char **argv)
{
char *line, *ptr;
FILE *infile;
FILE *outfile;
char **helper = NULL;
int strtoks = 0;
void *temp;
infile=fopen(argv[1],"r");
outfile=fopen(argv[2],"w");
while(((line=readline(infile))!=NULL))
{
ptr = strtok(line, " ");
temp = realloc(helper, (strtoks)*sizeof(char *));
if(temp == NULL) {
printf("Bad alloc error\n");
free(helper);
return 0;
} else {
helper=temp;
}
while (ptr != NULL) {
strtoks++;
fputs(ptr, outfile);
fputc(' ', outfile);
ptr = strtok(NULL, " ");
helper[strtoks-1] = ptr;
}
/*fputs(line, outfile);*/
free(line);
}
fclose(infile);
fclose(outfile);
return 0;
}
Now I have no idea how to put every of tokenized words into an array (I created char ** helper for that purpose), so that it can be used in qsort like qsort(helper, strtoks, sizeof(char*), compare_string);.
Ad. 2 Even if it would work - I don't know how to clear that line, and proceed to sorting next one. How to do that?
I even crashed valgrind (with the code presented above) -> "valgrind: the 'impossible' happened:
Killed by fatal signal"
Where is the mistake ?
The most obvious problem (there may be others) is that you're reallocating helper to the value of strtoks at the beginning of the line, but then incrementing strtoks and adding to the array at higher values of strtoks. For instance, on the first line, strtoks is 0, so temp = realloc(helper, (strtoks)*sizeof(char *)); leaves helper as NULL, but then you try to add every word on that line to the helper array.
I'd suggest an entirely different approach which is conceptually simpler:
char buf[1000]; // or big enough to be bigger than any word you'll encounter
char ** helper;
int i, numwords;
while(!feof(infile)) { // most general way of testing if EOF is reached, since EOF
// is just a macro and may not be machine-independent.
for(i = 0; (ch = fgetc(infile)) != ' ' && ch != '\n'; i++) {
// get chars one at a time until we hit a space or a newline
buf[i] = ch; // add char to buffer
}
buf[i + 1] = '\0' // terminate with null byte
helper = realloc(++numwords * sizeof(char *)); // expand helper to fit one more word
helper[numwords - 1] = strdup(buffer) // copy current contents of buffer to the just-created element of helper
}
I haven't tested this so let me know if it's not correct or there's anything you don't understand. I've left out the opening and closing of files and the freeing at the end (remember you have to free every element of helper before you free helper itself).
As you can see in strtok's prototype:
char * strtok ( char * str, const char * delimiters );
...str is not const. What strtok actually does is replace found delimiters by null bytes (\0) into your str and return a pointer to the beginning of the token.
Per example:
char in[] = "foo bar baz";
char *toks[3];
toks[0] = strtok(in, " ");
toks[1] = strtok(NULL, " ");
toks[2] = strtok(NULL, " ");
printf("%p %s\n%p %s\n%p %s\n", toks[0], toks[0], toks[1], toks[1],
toks[2], toks[2]);
printf("%p %s\n%p %s\n%p %s\n", &in[0], &in[0], &in[4], &in[4],
&in[8], &in[8]);
Now look at the results:
0x7fffd537e870 foo
0x7fffd537e874 bar
0x7fffd537e878 baz
0x7fffd537e870 foo
0x7fffd537e874 bar
0x7fffd537e878 baz
As you can see, toks[1] and &in[4] point to the same location: the original str has been modified, and in reality all tokens in toks point to somewhere in str.
In your case your problem is that you free line:
free(line);
...invalidating all your pointers in helper. If you (or qsort) try to access helper[0] after freeing line, you end up accessing freed memory.
You should copy the tokens instead, e.g.:
ptr = strtok(NULL, " ");
helper[strtoks-1] = malloc(strlen(ptr) + 1);
strcpy(helper[strtoks-1], ptr);
Obviously, you will need to free each element of helper afterwards (in addition to helper itself).
You should be getting a 'Bad alloc' error because:
char **helper = NULL;
int strtoks = 0;
...
while ((line = readline(infile)) != NULL) /* Fewer, but sufficient, parentheses */
{
ptr = strtok(line, " ");
temp = realloc(helper, (strtoks)*sizeof(char *));
if (temp == NULL) {
printf("Bad alloc error\n");
free(helper);
return 0;
}
This is because the value of strtoks is zero, so you are asking realloc() to free the memory pointed at by helper (which was itself a null pointer). One outside chance is that your library crashes on realloc(0, 0), which it shouldn't but it is a curious edge case that might have been overlooked. The other possibility is that realloc(0, 0) returns a non-null pointer to 0 bytes of data which you are not allowed to dereference. When your code dereferences it, it crashes. Both returning NULL and returning non-NULL are allowed by the C standard; don't write code that crashes regardless of which behaviour realloc() shows. (If your implementation of realloc() does not return a non-NULL pointer for realloc(0, 0), then I'm suspicious that you aren't showing us exactly the code that managed to crash valgrind (which is a fair achievement — congratulations) because you aren't seeing the program terminate under control as it should if realloc(0, 0) returns NULL.)
You should be able to avoid that problem if you use:
temp = realloc(helper, (strtoks+1) * sizeof(char *));
Don't forget to increment strtoks itself at some point.

Resources