I am hoping that someone can help me understand where I have gone wrong here. I am implementing a program to check for spelling correctness. In the process I use a trie data structure to load into memory a dictionary text file to check words against.
Overall it seems to operate as expected but I get a lot of problems when loading in the longest possible word, namely pneumonoultramicroscopicsilicovolcanoconiosis. I do not understand why but first let me present some code -
/**
* Loads dictionary into memory. Returns true if successful else false.
*/
bool load(const char *dictionary)
{
FILE *dict = fopen(dictionary, "r");
if (dict == NULL)
{
fprintf(stderr, "Could not open %s dictionary file.\n", dictionary);
return false;
}
// Initialise the root t_node
root = (t_node *) malloc(sizeof(t_node));
if (root == NULL)
{
fprintf(stderr, "Could not allocate memory to trie structure.\n");
return false;
}
// Set all current values in root to NULL and is_word to false
for (int i = 0; i < ALPHA_SIZE; i++)
{
root->branch[i] = NULL;
}
root->is_word = false;
while (1)
{
// Create char aray to hold words from .txt dictionary file once read
char *word = (char *) malloc((LENGTH + 1) * sizeof(char));
if (fscanf(dict, "%s", word) == EOF)
{
free(word);
break;
}
t_node *cursor = root;
int len = strlen(word) + 1;
for (int i = 0; i < len; i++)
{
if (word[i] == '\0')
{
cursor->is_word = true;
cursor = root;
word_count++;
}
else
{
int index = (word[i] == '\'') ? ALPHA_SIZE - 1 : tolower(word[i]) - 'a';
if (cursor->branch[index] == NULL)
{
cursor->branch[index] = (t_node *) malloc(sizeof(t_node));
for (int j = 0; j < ALPHA_SIZE; j++)
{
cursor->branch[index]->branch[i] = NULL;
}
cursor->branch[index]->is_word = false;
}
cursor = cursor->branch[index];
}
}
free(word);
}
fclose(dict);
return true;
}
This is my entire function to load in a dictionary into memory. For reference I defined the trie structure and created root prior to this function. LENGTH is defined as 45 to account for the longest possible word. And ALPHA_SIZE is 27 to include lower case letters plus apostrophes.
As I said already with all other shorter words this function works well. But with the longest word the function works through about half of the word, getting up to index 29 of the word variable before suffering a sysmalloc assertion issue where it then aborts.
I've tried to find what is happening here but the most I can see is that it suffers the fault at -
cursor->branch[index] = (t_node *) malloc(sizeof(t_node));
once it gets to the 29th index of word, but no other indexes prior. And all other posts I can find relate to functions giving this error that do not work at all rather than most of the time with an exception.
Can anyone see what I cannot and what may be the error I made in this code? I'd appreciate any help and thank you all for your time taken to consider my problem.
* UPDATE *
First of all I want to thank everyone for all of their help. I was so pleasantly surprised to see how many people responded to my issue and how quickly they did! I cannot express my gratitude to all of you for your help. Especially Basile Starynkevitch who gave me a great amount of information and offered a lot of help.
I am extremely embarrassed to say that I have found my issue and it's something that I should have caught a LONG time before turning to SO. So I must apologise for using up everyone's time on something so silly. My problem lied here -
else
{
int index = (word[i] == '\'') ? ALPHA_SIZE - 1 : tolower(word[i]) - 'a';
if (cursor->branch[index] == NULL)
{
cursor->branch[index] = (t_node *) malloc(sizeof(t_node));
for (int j = 0; j < ALPHA_SIZE; j++)
{
cursor->branch[index]->branch[j] = NULL; // <<< PROBLEM WAS HERE
}
cursor->branch[index]->is_word = false;
}
cursor = cursor->branch[index];
}
In my code originally I had 'cursor->branch[index]->branch[i] = NULL' where I was iterating through 'int j' in that loop, not i ....
Sooooo once again thank you all for your help! I am sorry for my poorly formatted question and I will do better to abide by the SO guidelines in the future.
Your
char *word = (char *) malloc((LENGTH + 1) * sizeof(char));
is not followed by a test on failure of malloc; you need to add:
if (!word) { perror("malloc word"); exit(EXIT_FAILURE); }
before
if (fscanf(dict, "%s", word) == EOF)
since using fscanf with %s on a NULL pointer is wrong (undefined behavior, probably).
BTW, recent versions of fscanf (or with dynamic memory TR) accepts the %ms specifier to allocate a string when reading it. On those systems you could:
char*word = NULL;
if (fscanf(dict, "%ms", &word) == EOF))
break;
and some systems have getline, see this.
At last, compile with all warnings and debug info (gcc -Wall -Wextra -g with GCC), improve your code to get no warnings, and use the debugger gdb and valgrind.
BTW pneumonoultramicroscopicsilicovolcanoconiosis has 45 letters. You need one additional byte for the terminating NUL (otherwise you have a buffer overflow). So your LENGTH should be at least 46 (and I recommend choosing something slightly bigger, perhaps 64; in fact I recommend using systematically C dynamic memory allocation and avoiding hard-coding such limits and code in a more robust style, following the GNU coding standards).
Related
When running the following C file, copying the character to fgetc to my tmp pointer results in unknown characters being copied over for some reason. The characters received from fgetc() are the expected characters. However, for some reason when assigning this character to my tmp pointer unknown characters get copied over.
I've tried looking for the reason why online, but haven't found any luck. From what I have read it could be something to do with UTF-8 and ASCII issues. However, I'm not sure about the fix. I'm a relatively new C programmer and still new to memory management.
Output:
TMP: Hello, DATA!�
TEXT: Hello, DATA!�
game.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <allegro5/allegro5.h>
#include <allegro5/allegro_font.h>
const int WIN_WIDTH = 1366;
const int WIN_HEIGHT = 768;
char *readFile(const char *fileName) {
FILE *file;
file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
}
size_t tmpSize = 1;
char *tmp = (char *)malloc(tmpSize);
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
}
for (int c = fgetc(file); c != EOF; c = fgetc(file)) {
if (c != NULL) {
if (tmpSize > 1)
tmp = (char *)realloc(tmp, tmpSize);
tmp[tmpSize - 1] = (char *)c;
tmpSize++;
}
}
tmp[tmpSize] = 0;
fclose(file);
printf("TMP: %s\n", tmp);
return tmp;
}
int main(int argc, char **argv) {
al_init();
al_install_keyboard();
ALLEGRO_TIMER* timer = al_create_timer (1.0 / 30.0);
ALLEGRO_EVENT_QUEUE *queue = al_create_event_queue();
ALLEGRO_DISPLAY* display = al_create_display(WIN_WIDTH, WIN_HEIGHT);
ALLEGRO_FONT* font = al_create_builtin_font();
al_register_event_source(queue, al_get_keyboard_event_source());
al_register_event_source(queue, al_get_display_event_source(display));
al_register_event_source(queue, al_get_timer_event_source(timer));
int redraw = 1;
ALLEGRO_EVENT event;
al_start_timer(timer);
char *text = readFile("game.DATA");
printf("TEXT: %s\n", text);
while (1) {
al_wait_for_event(queue, &event);
if (event.type == ALLEGRO_EVENT_TIMER)
redraw = 1;
else if ((event.type == ALLEGRO_EVENT_KEY_DOWN) || (event.type == ALLEGRO_EVENT_DISPLAY_CLOSE))
break;
if (redraw && al_is_event_queue_empty(queue)) {
al_clear_to_color(al_map_rgb(0, 0, 0));
al_draw_text(font, al_map_rgb(255, 255, 255), 0, 0, 0, text);
al_flip_display();
redraw = false;
}
}
free(text);
al_destroy_font(font);
al_destroy_display(display);
al_destroy_timer(timer);
al_destroy_event_queue(queue);
return 0;
}
game.DATA file:
Hello, DATA!
What I use to run the program:
gcc game.c -o game $(pkg-config allegro-5 allegro_font-5 --libs --cflags)
--EDIT--
I tried taking the file reading code and running it in a new c file, for some reason it works there, but not when in the game.c file with allegro code.
test.c:
#include <stdlib.h>
#include <stdio.h>
char *readFile(const char *fileName) {
FILE *file;
file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
}
size_t tmpSize = 1;
char *tmp = (char *)malloc(tmpSize);
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
}
for (int c = fgetc(file); c != EOF; c = fgetc(file)) {
if (c != NULL) {
if (tmpSize > 1)
tmp = (char *)realloc(tmp, tmpSize);
tmp[tmpSize - 1] = (char *)c;
tmpSize++;
}
}
tmp[tmpSize] = 0;
fclose(file);
printf("TMP: %s\n", tmp);
return tmp;
}
void main() {
char *text = readFile("game.DATA");
printf("TEXT: %s\n", text);
free(text);
return 0;
}
Produces the correct output always:
TMP: Hello, DATA!
TEXT: Hello, DATA!
When you write a loop that updates various things each time through, like you do with tmpSize in your loop here, it's important to have a handle on what the theoretical computer science types call your "loop invariants". That is, what is it that's true each time through the loop? It's important not only to maintain your loop invariants properly, but also to pick your loop invariants so that they're easy to maintain, and easy for a later reader to understand and to verify.
Since tmpSize starts out as 1, I'm guessing your loop invariant is trying to be, "tmpSize is always one more than the size of the string I've read so far". A reason for picking that slightly-strange loop invariant is, of course, that you'll need that extra byte for the terminating \0. The other clue is that you're setting tmp[tmpSize-1] = c;.
But here's the first problem. When we exit the loop, and if tmpSize is still one more than the size of the string you've read so far, let's see what happens. Suppose we read three characters. So tmpSize should be 4. So we'll set tmp[4] = 0;. But wait! Remember, arrays in C are 0-based. So the three characters we read are in tmp[0], tmp[1], and tmp[2], and we want the terminating \0 character to go into tmp[3], not tmp[4]. Something is wrong.
But actually, it's worse than that. I wasn't at all sure I understood the loop invariant, so I cheated, and inserted a few debugging printouts. Right before the realloc call, I added
printf("realloc %zu\n", tmpSize);
and at the end, right before the tmp[tmpSize] = 0; line, I added
printf("final %zu\n", tmpSize);
The last few lines it printed (while reading a game.DATA file containing "Hello, DATA!" just like yours) were:
...
realloc 10
realloc 11
realloc 12
final 13
But this is off by two! If the last reallocation gave the array a size of 12, the valid indices are from 0 to 11. But somehow we end up writing the \0 into cell 13.
It took me a while to figure it out, but the second problem is that you do the reallocation at the top of the loop, before you've incremented tmpLen.
To me, the loop invariant of "one more than the size of the string read so far" is just too hard to think about. I very much prefer to use a loop invariant where the "size" variable keeps track of the number of characters I have read, not +1 or -1 off of that. Let's see how that loop might look. (I've also cleaned up a few other things.)
size_t tmpSize = 0;
char *tmp = malloc(tmpSize+1);
if (tmp == NULL) {
printf("malloc() failed.\n");
exit(1);
}
for (int c = getc(file); c != EOF; c = getc(file)) {
printf("realloc %zu\n", tmpSize+1+1);
tmp = realloc(tmp, tmpSize+1+1); /* +1 for c, +1 for \0 */
if (tmp == NULL) {
printf("realloc() failed.\n");
exit(1);
}
tmp[tmpSize] = c;
tmpSize++;
}
printf("final %zu\n", tmpSize);
tmp[tmpSize] = '\0';
There's still something fishy here -- I said I didn't like "fudge factors" like +1, and here I've got two -- but at least now the debugging printouts go
...
realloc 11
realloc 12
realloc 13
final 12
so it looks like I'm not overrunning the allocated memory any more.
To make this even better, I want to take a slightly different approach. You're not supposed to worry abut efficiency at first, but I can tell you that a loop that calls realloc to make the buffer bigger by 1, each time it reads a character, can end up being really inefficient. So let's make a few more changes:
size_t nchAllocated = 0;
size_t nchRead = 0;
char *tmp = NULL;
for (int c = getc(file); c != EOF; c = getc(file)) {
if(nchAllocated <= nchRead) {
nchAllocated += 10;
printf("realloc %zu\n", nchAllocated);
tmp = realloc(tmp, nchAllocated);
if (tmp == NULL) {
printf("realloc() failed.\n");
exit(1);
}
}
tmp[nchRead++] = c;
}
printf("final %zu\n", nchRead);
tmp[nchRead] = '\0';
Now there are two separate variables: nchAllocated keeps track of exactly how many characters I've allocated, and nchRead keeps track of exactly how many characters I've read. And although I've doubled the number of "counter" variables, in doing so I've simplified a lot of other things, so I think it's a net improvement.
First of all, notice that there are no +1 fudge factors any more, at all.
Second, this loop doesn't call realloc every time -- instead it allocates characters 10 at a time. And because there are separate variables for the number of characters allocated versus read, it can keep track of the fact that it may have allocated more characters than it has read so far. For this code, the debugging printouts are:
realloc 10
realloc 20
final 12
Another little improvement is that we don't have to "preallocate" the array -- there's no initial malloc call. One of our loop invariants is that nchAllocated is the number of characters allocated, and we start this out as 0, and if there are no characters allocated, then it's okay that tmp starts out as NULL. This relies on the fact that when you call realloc for the first time, with tmp equal to NULL, realloc is fine with that, and essentially acts like malloc.
But there's one question you might be asking: If I got rid of all my fudge factors, where do we arrange to allocate one extra byte to hold the terminating \0 character? It's there, but it's subtle: it's lurking in the test
if(nchAllocated <= nchRead)
The very first time through the loop, nchAllocated will be 0, and nchRead will be 0, but this test will be true, so we'll allocate our first chunk of 10 characters, and we're off and running. (If we didn't care about the \0 character, the test nchAllocated < nchRead would have sufficed.)
...But, actually, I've made a mistake! There's a subtle bug here!
What if the file being read is empty? tmp will start out NULL, and we'll never make any trips through the loop, so tmp will remain NULL, so when we assign tmp[nchRead] = 0 it'll blow up.
And actually, it's worse than that. If you trace through the logic very carefully, you'll find that any time the file size is an exact multiple of 10, not enough space gets allocated for the \0, after all.
And this indicates a significant drawback of the "allocate characters 10 at a time" scheme. The code is now harder to test, because the control flow is different for files whose size is a multiple of 10. If you never happen to test that case, you won't realize that this program has a bug in it.
The way I usually fix this is to notice that the \0 byte I have to add to terminate the string is sort of balanced by the EOF character I read that indicated the end of the file. Maybe, when I read the EOF, I can use it to remind me to allocate space for the \0. That's actually easy enough to do, and it looks like this:
int c;
while(1) {
c = getc(file);
if(nchAllocated <= nchRead) {
nchAllocated += 10;
printf("realloc %zu\n", nchAllocated);
tmp = realloc(tmp, nchAllocated);
if (tmp == NULL) {
printf("realloc() failed.\n");
exit(1);
}
}
if(c == EOF)
break;
tmp[nchRead++] = c;
}
printf("final %zu\n", nchRead);
tmp[nchRead] = '\0';
The trick here is that we don't test for EOF until after we've checked that there's enough space in the buffer, and called realloc if necessary. It's as if we allocate space in the buffer for the EOF -- except then we use that space for the \0 instead. This is what I meant by "use it to remind me to allocate space for the \0".
Now, I have to admit that there's still a drawback here, in that the loop is now somewhat unconventional. A loop that has while(1) at the top looks like an infinite loop. This one has
if(c == EOF) break;
down in the middle of it, so it is literally a "break in the middle" loop. (This is by contrast to conventional for and while loops, which are "break at the top", or a do/while loop, which is "break at the bottom".) Personally, I find this to be a useful idiom, and I use it all the time. But some programmers, and perhaps your instructor, would frown on it, because it's "weird", it's "different", it's "unconventional". And to some extent they're right: unconventional programming is somewhat dangerous programming, and is bad if later maintenance programmers can't understand it because they don't recognize or don't understand the idioms in it. (It's sort of the programming equivalent of the English word "ain't", or a split infinitive.)
Finally, if you're still with me, I have one more point to make. (And if you are still with me, thank you. I realize this answer has gotten very long, but I hope you're learning something.)
Earlier I said that "a loop that calls realloc to make the buffer bigger by 1, each time it reads a character, can end up being really inefficient." It turns out that a loop that makes the buffer bigger by 10 isn't much better, and can still be significantly inefficient. You can do a little better by incrementing it by 50 or 100, but if you're dealing with input that might be really big (thousands of characters or more), you're usually better off increasing the buffer size by leaps and bounds, perhaps by multiplying it by some factor, rather than adding. So here's the final version of that part of the loop:
if(nchAllocated <= nchRead) {
if(nchAllocated == 0) nchAllocated = 10;
else nchAllocated *= 2;
printf("realloc %zu\n", nchAllocated);
tmp = realloc(tmp, nchAllocated);
And even this improvement -- multiplying by 2, rather than adding something -- comes with a cost: we need an extra test, to special-case the first trip through the loop, because nchAllocated started out as 0, and 0 × 2 = 0.
Your reallocation scheme is incorrect: the array is always too short by one byte and the null terminator is written one position past the end of the string, instead of at the end of the string. This causes an extra byte to be printed, with whatever value happens to be in memory in the block returned by realloc(), which is uninitialized.
It is less confusing to use tmpLen as the length of the string read si far and allocate 2 extra bytes for the newly read character and the null terminator.
Furthermore the test c != NULL makes no sense: c is byte and NULL is a pointer. Similarly, tmp[tmpSize - 1] = (char *)c; is incorrect: you should just write
tmp[tmpSize - 1] = c;
Here is a corrected version:
char *readFile(const char *fileName) {
FILE *file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
return NULL;
}
size_t tmpLen = 0;
char *tmp = (char *)malloc(tmpLen + 1);
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
fclose(file);
return NULL;
}
int c;
while ((c = fgetc(file)) != EOF) {
char *new_tmp = (char *)realloc(tmp, tmpLen + 2);
if (new_tmp == NULL) {
printf("realloc() failure for %zu bytes.\n", tmpLen + 2);
free(tmp);
fclose(file);
return NULL;
}
tmp = new_tmp;
tmp[tmpLen++] = c;
}
tmp[tmpLen] = '\0';
fclose(file);
printf("TMP: %s\n", tmp);
return tmp;
}
It is usually better to reallocate in chunks or with a geometric size increment. Here is a simple implementation:
char *readFile(const char *fileName) {
FILE *file = fopen(fileName, "r");
if (file == NULL) {
printf("File could not be opened for reading.\n");
return NULL;
}
size_t tmpLen = 0;
size_t tmpSize = 16;
char *tmp = (char *)malloc(tmpSize);
char *newTmp;
if (tmp == NULL) {
printf("malloc() could not be called on tmp.\n");
fclose(file);
return NULL;
}
int c;
while ((c = fgetc(file)) != EOF) {
if (tmpSize - tmpLen < 2) {
size_t newSize = tmpSize + tmpSize / 2;
newTmp = (char *)realloc(tmp, newSize);
if (newTmp == NULL) {
printf("realloc() failure for %zu bytes.\n", newSize);
free(tmp);
fclose(file);
return NULL;
}
tmpSize = newSize;
tmp = newTmp;
}
tmp[tmpLen++] = c;
}
tmp[tmpLen] = '\0';
fclose(file);
printf("TMP: %s\n", tmp);
// try to shrink allocated block to the minimum size
// if realloc() fails, return the current block
// it seems impossible for this reallocation to fail
// but the C Standard allows it.
newTmp = (char *)realloc(tmp, tmpLen + 1);
return newTmp ? newTmp : tmp;
}
I am writing a simple Shell for school assignment and stuck with a segmentation problem. Initially, my shell parses the user input to remove whitespaces and endofline character, and seperate the words inside the input line to store them in a char **args array. I can seperate the words and can print them without any problem, but when storing the words into a char **args array, and if argument number is greater than 1 and is odd, I get a segmentation error.
I know the problem is absurd, but I stuck with it. Please help me.
This is my parser code and the problem occurs in it:
char **parseInput(char *input){
int idx = 0;
char **parsed = NULL;
int parsed_idx = 0;
while(input[idx]){
if(input[idx] == '\n'){
break;
}
else if(input[idx] == ' '){
idx++;
}
else{
char *word = (char*) malloc(sizeof(char*));
int widx = 0; // Word index
word[widx] = input[idx];
idx++;
widx++;
while(input[idx] && input[idx] != '\n' && input[idx] != ' '){
word = (char*)realloc(word, (widx+1)*sizeof(char*));
word[widx] = input[idx];
idx++;
widx++;
}
word = (char*)realloc(word, (widx+1)*sizeof(char*));
word[widx] = '\0';
printf("Word[%d] --> %s\n", parsed_idx, word);
if(parsed == NULL){
parsed = (char**) malloc(sizeof(char**));
parsed[parsed_idx] = word;
parsed_idx++;
}else{
parsed = (char**) realloc(parsed, (parsed_idx+1)*sizeof(char**));
parsed[parsed_idx] = word;
parsed_idx++;
}
}
}
int i = 0;
while(parsed[i] != NULL){
printf("Parsed[%d] --> %s\n", i, parsed[i]);
i++;
}
return parsed;
}
In your code you have the loop
while(parsed[i] != NULL) { ... }
The problem is that the code never sets any elements of parsed to be a NULL pointer.
That means the loop will go out of bounds, and you will have undefined behavior.
You need to explicitly set the last element of parsed to be a NULL pointer after you parsed the input:
while(input[idx]){
// ...
}
parsed[parsed_idx] = NULL;
On another couple of notes:
Don't assign back to the same pointer you pass to realloc. If realloc fails it will return a NULL pointer, but not free the old memory. If you assign back to the pointer you will loose it and have a memory leak. You also need to be able to handle this case where realloc fails.
A loop like
int i = 0;
while (parsed[i] != NULL)
{
// ...
i++;
}
is almost exactly the same as
for (int i = 0; parsed[i] != NULL; i++)
{
// ...
}
Please use a for loop instead, it's usually easier to read and follow. Also for a for loop the "index" variable (i in your code) will be in a separate scope, and not available outside of the loop. Tighter scope for variables leads to less possible problems.
In C you shouldn't really cast the result of malloc (or realloc) (or really any function returning void *). If you forget to #include <stdlib.h> it could lead to hard to diagnose problems.
Also, a beginner might find the -pedantic switch helpful on your call to the compiler. That switch would have pointed up most of the other suggestions made here. I personally am also a fan of -Wall, though many find it annoying instead of helpful.
I've been getting segmentation faults (with gdb printing "??" on backtraces) on a program I'm trying to compile for a while now and after trying many things (such re-programming a data structure I used which should work now) I still kept getting segfaults although now it gave me a line (which I added a comment onto here).
getMains() is ran multiple times to tokenize different lines from the same file.
I wanted mains to be an array of size 4 but when passing it as "char * mains[4]" I I got a compile error for trying to pass it an array (*)[4] which I've never dealt with beforehand (Just started using C). I'm assuming maybe that could be a problem if I try to access any part that wasn't used, but the problem happens while initializing the indices of the array.
The code I'm trying to get to work, where the "char *** mains" argument is taking in a &(char **) from a separate function "runner" which I want to be edited so I can look at its contents in "runner":
bool getMains(FILE * file, char *** mains)
{
char line[256];
int start = 0;
char * token;
const char * mainDelim = "\t \n\0", * commDelim = "\n\t\0";
if(fgets(line, sizeof(line), file) == NULL)
return false;
while(line[0] == '.')
if(fgets(line, sizeof(line), file) == NULL);
return false;
if(line[0] == '\t' || line[0] == ' ')
{
(*mains)[0] = " ";
start = 1;
}
token = strtok(line, mainDelim);
int i;
for(i = start; token != NULL; ++i)
{
(*mains)[i] = strdup(token); // <- gdb: Segmentation Fault occurs here
if(i % 3 == 2)
token = strtok(NULL, commDelim);
else
token = strtok(NULL, mainDelim);
}
free(token); // Unsure if this was necessary but added in case.
return true;
}
/* Snippet of code running it... */
void runner(FILE * file) {
char ** mains;
if(!getMains(*file, &mains))
return;
while(strcmp(mains[1], "END") != 0){
/* do stuff lookinig through indices 0, 1, 2, & 3 */
if(!getMains(*file, &mains))
break;
}
}
Any tips on this or just generally safely modifying arrays through other functions?
Should I change getMains() into "getMains(FILE * file, char ** mains[4]);" and pass it a &"char * mains[4]") for it to be a set size as wanted? Or would that also produce errors?
You need to allocate memory for mains, it should look like this:
char ** mains;
mains = malloc(some number N * sizeof(char*));
You need something like this if you don't use strdup, which allocates the memory for you:
for (int i = 0; i < N; ++i) {
mains[i] = malloc(some number K);
}
In all cases, do not forget to call free on every pointer you received from malloc or strdup. You can skip this part if the program ends right after you would call free.
I've looked at previous posts about this and they didn't help me locate my problem... To keep it short I'm making a function should read a text file line by line (and yes, I do realize there are many posts like this). But when I run my program through CMD, it's giving me this error:
Program received signal SIGSEGV, Segmentation fault.
__GI___libc_realloc (oldmem=0x10011, bytes=1) at malloc.c:2999
2999 malloc.c: No such file or directory.
I'm pretty sure I wrote out my malloc/realloc lines correctly. I've tried finding alot of posts similar to this, but none of the solutions offered are helping. If you have any post suggestions that maybe I missed, please let me know. Regardless, here are my functions:
char* read_single_line(FILE* fp){
char* line = NULL;
int num_chars = 0;
char c;
fscanf(fp, "%c", &c);
while(!feof(fp)) {
num_chars++;
line = (char*) realloc(line, num_chars * sizeof(char));
line[num_chars -1] = c;
if (c == '\n') {
break;
}
fscanf(fp, "%c", &c);
}
if(line != NULL) {
line = realloc(line, (num_chars+1) * sizeof(char));
line[num_chars] = '\0';
}
return line;
}
void read_lines(FILE* fp, char*** lines, int* num_lines) {
int i = 0;
int num_lines_in_file = 0;
char line[1000];
if (fp == NULL) {
*lines = NULL;
*num_lines = 0;
} else {
(*lines) = (char**)malloc(1 * sizeof(char*));
while (read_single_line(fp) != NULL) {
(*lines)[i] = (char*)realloc((*lines)[i], sizeof(char));
num_lines_in_file++;
i++;
}
*lines[i] = line;
*num_lines = num_lines_in_file;
}
}
I would really appreciate any help--I'm a beginner in C so hear me out!!
char line[1000];
:
while (read_single_line(fp) != NULL) {
:
}
*lines[i] = line;
This doesn't look at all right to me. Your read_single_line function returns an actual line but, other than checking that against NULL, you never actually store it anywhere. Instead, you point the line pointer to line, a auto-scoped variable which could contain literally anything (and, more worrying, possibly no terminator character).
I think you should probably store the return value from read_single_line and use that to set your line pointers.
By the way, it may also be quite inefficient to expand your buffer one character at a time. I'd suggest initially allocating more bytes and then keeping both that capacity and the bytes currently in use. Then, only when you're about to use beyond your capacity do you expand, and by more than one. In pseudo-code, something like:
def getLine:
# Initial allocation with error check.
capacity = 64
inUse = 0
buffer = allocate(capacity)
if buffer == null:
return null
# Process each character made available somehow.
while ch = getNextChar:
# Expand buffer if needed, always have room for terminator.
if inUse + 1 == capacity:
capacity += 64
newBuff = realloc buffer with capacity
# Failure means we have to release old buffer.
if newBuff == null:
free buffer
return null
# Store character in buffer, we have enough room.
buffer[inUse++] = ch
# Store terminator, we'll always have room.
buffer[inUse] = '\0';
return buffer
You'll notice, as well as the more efficient re-allocations, better error checking on said allocations.
while (read_single_line(fp) != NULL) {
(*lines)[i] = (char*)realloc((*lines)[i], sizeof(char));
num_lines_in_file++;
i++;
}
*lines[i] = line;
There are more errors then lines in this short fragment. Let's go over them one by one.
while (read_single_line(fp) != NULL)
You read a line, check whether it's a null pointer, and throw it away instead of keeping it around to accumulate it in the lines array.
(*lines)[i] = (char*)realloc((*lines)[i], sizeof(char));
You are trying to realloc (*lines[i]). First, it does not exist beyond i==0, because (*lines) was only ever allocated to contain one element. Second, it makes no sense to realloc individual lines, because you are (should be) getting perfect ready-made lines from the line reading function. You want to realloc *lines instead:
*lines = realloc (*lines, i * sizeof(char*));
Now these two lines
num_lines_in_file++;
i++;
are not an error per se, but why have two variables which always have the exact same value? In addition, you want them (it) be before the realloc line, per usual increment-realloc-assign pattern (you are using it in the other function).
Speaking of the assign part, there isn't any. You should insert one now:
(*lines)[i-1] = // what?
The line pointer you should have saved when calling read_single_line, that's what. From the beginning:
char* cur_line;
int i = 0;
*lines = NULL;
while ((cur_line=read_single_line(fp)) != NULL)
{
++i;
*lines = realloc (*lines, i * sizeof(char*));
(*lines)[i-1] = cur_line;
}
*num_lines = i;
The last one
*lines[i] = line;
is downright ugly.
First, lines is not an array, it's a pointer pointing to a single variable, so lines[i] accesses intergalactic dust. Second, you are trying to assign it an address of a local variable, which will cease to exist as soon as your function returns. Third, what is it doing outside of the loop? If you want to terminate your line array with a null pointer, do so:
}
*num_lines = i;
++i;
*lines = realloc (*lines, i * sizeof(char*));
(*lines)[i-1] = NULL;
But given that you return the number of lines, this may not be necessary.
Disclaimer: none of the above is tested. If there are any bugs, fix them!
Im trying to split a string according to the following rules:
words without "" around them should be treated as seperate strings
anything wiht "" around it should be treated as one string
However when i run it in valgrind i get invalid frees and invalid read size errors, but if i remove the two frees i get a memory leak. If anyone could point me in the right direction i would appreciate it
The code that calls split_string
char *param[5];
for(i = 0;i < 5;i++) {
param[i] = NULL;
}
char* user = getenv("LOGNAME");
char tid[9];
char* instring = (char*) malloc(201);
/
while((printf("%s %s >",user,gettime(tid)))&&(instring
=fgets(instring,201,stdin)) != NULL) {
int paramsize = split_string(param, instring);
The code that tries to free param
for(i = 0;i < 5;i++) {
if(param[i] != NULL) {
free(param[i]);
fprintf(stderr,"%d",i);
}
}
int split_string(char** param, char* string) {
int paramplace = 0; //hvor vi er i param
int tempplace = 0; //hvor i temp vi er
char* temp = malloc(201);
int command = 0;
int message = 0;
for(; (*string != '\0') && (*string != 10) && paramplace < 4; string++) {
if((*string == ' ') && (message == 0)) {
if(command == 1) {
temp[tempplace] = '\0';
param[paramplace++] = temp;
tempplace = 0;
command = 0;
}
}
else {
if(*string =='"') {
if(message == 0) message = 1;
else message = 0;
}
if(command == 0) {
free(temp);
temp = malloc(201);
}
command = 1;
if(*string != '"') {
temp[tempplace++] = *string;
}
}
}
if(command == 1) {
temp[tempplace] = '\0';
param[paramplace++] = temp;
}
param[paramplace] = NULL;
free(temp);
return paramplace;
}
As far as I can see, you want to put the split strings into param as an array of pointers (presumably making the caller responsible for freeing them). In the first branch of the if statement in your loop, you do so by assigning the current temp buffer to that place. However, once you start a new string (when comnmand == 0, you free that space, rendering the previous param entry pointer invalid.
Only free each pointer once. I wouldn't rule out other leaks in this code: I think you can simplify your state machine (and probably find other bugs as a result).
When you free the temp buffer you also free the param[] buffer, where your tokens are stored. On the other hand, if you don't call free(temp), which you shouldn't, it will be the responsibility of the caller of your function to call free(param[n]), when the tokens aren't needed.
Maybe you're not removing the right free()'s? The actual problem might be in the code that calls split_string. Can you show it?
It's hard to understand your code. I suggest you use sscanf instead.
You can use a format string like this:
"\"%[^\"]\"%n"
Read up on what it does.
I wrote an example:
if( sscanf( string, "\"%[^\"]\"%n", matchedstring, &bytesread ) )
{
handlestring( matchedstring );
string += bytesread;
}
else if( sscanf( string, "%s%n", matchedstring, &bytesread ) )
{
handlestring( matchedstring );
string += bytesread;
}
else
{
handleexception();
}
Untested. :)
Thanks to all the comments i found the answer.
The problem was that the first malloc before the for loop is superfluous as there will be another one before it starts to put temp into param and therefore there were no pointers to the first malloc so it was simply lost.