I am having trouble with a struct array. I need to read in a text file line by line, and compare the values side by side. For example "Mama" would return 2 ma , 1 am because you have ma- am- ma. I have a struct:
typedef struct{
char first, second;
int count;
} pair;
I need to create an array of structs for the entire string, and then compare those structs. We also were introduced to memory allocation so we have to do it for any size file. That is where my trouble is really coming in. How do I reallocate the memory properly for an array of structs? This is my main as of now (doesn't compile, has errors obviously having trouble with this).
int main(int argc, char *argv[]){
//allocate memory for struct
pair *p = (pair*) malloc(sizeof(pair));
//if memory allocated
if(p != NULL){
//Attempt to open io files
for(int i = 1; i<= argc; i++){
FILE * fileIn = fopen(argv[i],"r");
if(fileIn != NULL){
//Read in file to string
char lineString[137];
while(fgets(lineString,137,fileIn) != NULL){
//Need to reallocate here, sizeof returning error on following line
//having trouble seeing how much memory I need
pair *realloc(pair *p, sizeof(pair)+strlen(linestring));
int structPos = 0;
for(i = 0; i<strlen(lineString)-1; i++){
for(int j = 1; j<strlen(lineSTring);j++){
p[structPos]->first = lineString[i];
p[structPos]->last = lineString[j];
structPos++;
}
}
}
}
}
}
else{
printf("pair pointer length is null\n");
}
}
I am happy to change things around obviously if there is a better method for this. I HAVE to use the above struct, have to have an array of structs, and have to work with memory allocation. Those are the only restrictions.
Allocating memory for an array of struct is as simple as allocating for one struct:
pair *array = malloc(sizeof(pair) * count);
Then you can access each item by subscribing "array":
array[0] => first item
array[1] => second item
etc
Regarding the realloc part, instead of:
pair *realloc(pair *p, sizeof(pair)+strlen(linestring));
(which is not syntactically valid, looks like a mix of realloc function prototype and its invocation at the same time), you should use:
p=realloc(p,[new size]);
In fact, you should use a different variable to store the result of realloc, since in case of memory allocation failure, it would return NULL while still leaving the already allocated memory (and then you would have lost its position in memory). But on most Unix systems, when doing casual processing (not some heavy duty task), reaching the point where malloc/realloc returns NULL is somehow a rare case (you must have exhausted all virtual free memory). Still it's better to write:
pair*newp=realloc(p,[new size]);
if(newp != NULL) p=newp;
else { ... last resort error handling, screaming for help ... }
So if I get this right you're counting how many times pairs of characters occur? Why all the mucking about with nested loops and using that pair struct when you can just keep a frequency table in a 64KB array, which is much simpler and orders of magnitude faster.
Here's roughly what I would do (SPOILER ALERT: especially if this is homework, please don't just copy/paste):
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
void count_frequencies(size_t* freq_tbl, FILE* pFile)
{
int first, second;
first = fgetc(pFile);
while( (second = fgetc(pFile)) != EOF)
{
/* Only consider printable characters */
if(isprint(first) && isprint(second))
++freq_tbl[(first << 8) | second];
/* Proceed to next character */
first = second;
}
}
int main(int argc, char*argv[])
{
size_t* freq_tbl = calloc(1 << 16, sizeof(size_t));;
FILE* pFile;
size_t i;
/* Handle some I/O errors */
if(argc < 2)
{
perror ("No file given");
return EXIT_FAILURE;
}
if(! (pFile = fopen(argv[1],"r")))
{
perror ("Error opening file");
return EXIT_FAILURE;
}
if(feof(pFile))
{
perror ("Empty file");
return EXIT_FAILURE;
}
count_frequencies(freq_tbl, pFile);
/* Print frequencies */
for(i = 0; i <= 0xffff; ++i)
if(freq_tbl[i] > 0)
printf("%c%c : %d\n", (char) (i >> 8), (char) (i & 0xff), freq_tbl[i]);
free(freq_tbl);
return EXIT_SUCCESS;
}
Sorry for the bit operations and hex notation. I just happen to like them in such a context of char tables, but they can be replaced with multiplications and additions, etc for clarity.
Related
I am trying to write a simple program that will read words from a file and print the number of occurrences of a particular word passed to it as argument.
For that, I use fscanf to read the words and copy them into an array of strings that is dynamically allocated.
For some reason, I get an error message.
Here is the code for the readFile function:
void readFile(char** buffer, char** argv){
unsigned int i=0;
FILE* file;
file = fopen(argv[1], "r");
do{
buffer = realloc(buffer, sizeof(char*));
buffer[i] = malloc(46);
}while(fscanf(file, "%s", buffer[i++]));
fclose(file);
}
And here is the main function :
int main(int argc, char** argv){
char** buffer = NULL;
readFile(buffer, argv);
printf("%s\n", buffer[0]);
return 0;
}
I get the following error message :
realloc(): invalid next size
Aborted (core dumped)
I have looked at other threads on this topic but none of them seem to be of help. I could not apply whatever I learned there to my problem.
I used a debugger (VS Code with gdb). Data is written successfully into indices 0,1,2,3 of the buffer array but says error : Cannot access memory at address 0xfbad2488 for index 4 and pauses on exception.
Another thread on this topic suggests there might be a wild pointer somewhere. But I don't see one anywhere.
I have spent days trying to figure this out. Any help will be greatly appreciated.
Thanks.
Your algorithm is wrong on many fronts, including:
buffer is passed by-value. Any modifications where buffer = ... is the assignment will mean nothing to the caller. In C, arguments are always pass-by-value (arrays included, but their "value" is a conversion to temporary pointer to first element, so you get a by-ref synonym there whether you want it or not).
Your realloc usage is wrong. It should be expanding based on the iteration of the loop as a count multiplied by the size of a char *. You only have the latter, with no count multiplier. Therefore, you never allocate more than a single char * with that realloc call.
Your loop termination condition is wrong. Your fscanf call should check for the expected number of arguments to be processed, which in your case is 1. Instead, you're looking for any non-zero value, which EOF is going to be when you hit it. Therefore, the loop never terminates.
Your fscanf call is not protected from buffer overflow : You're allocating a static-sized string for each string read, but not limiting the %s format to the static size specified. This is a recipe for buffer-overflow.
No IO functions are ever checked for success/failure : The following APIs could fail, yet you never check that possibility: fopen, fscanf, realloc, malloc. In failing to do so, you're violating Henry Spencer's 6th Commandment for C Programmers : "If a function be advertised to return an error code in the event of difficulties, thou shalt check for that code, yea, even though the checks triple the size of thy code and produce aches in thy typing fingers, for if thou thinkest ``it cannot happen to me'', the gods shall surely punish thee for thy arrogance."
No mechanism for communicating the allocated string count to the caller : The caller of this function is expecting a resulting char**. Assuming you fix the first item in this list, you still have not provided the caller any means of knowing how long that pointer sequence is when readFile returns. An out-parameter and/or a formal structure is a possible solution to this. Or perhaps a terminating NULL pointer to indicate the list is finished.
(Moderate) You never check argc : Instead, you just send argv directly to readFile, and assume the file name will be at argv[1] and always be valid. Don't do that. readFile should take either a FILE* or a single const char * file name, and act accordingly. It would be considerably more robust.
(Minor) : Extra allocation : Even fixing the above items, you'll still leave one extra buffer allocation in your sequence; the one that failed to read. Not that it matter much in this case, as the caller has no idea how many strings were allocated in the first place (see previous item).
Shoring up all of the above would require a basic rewrite of nearly everything you have posted. In the end, the code would look so different, it's almost not worth trying to salvage what is here. Instead, look at what you have done, look at this list, and see where things went wrong. There's plenty to choose from.
Sample
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char ** readFile(const char *fname)
{
char **strs = NULL;
int len = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
// array expansion
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand pointer array");
for (int i=0; i<len; ++i)
free(strs[i]);
free(strs);
strs = NULL;
break;
}
// allocation was good; save off new pointer
strs = tmp;
strs[len] = malloc( STR_MAX_LEN );
if (strs[len] == NULL)
{
// failed. cleanup prior sucess
perror("Failed to allocate string buffer");
for (int i=0; i<len; ++i)
free(strs[i]);
free(strs);
strs = NULL;
break;
}
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. we're leaving regardless. the last
// allocation is thrown out, but we terminate the list
// with a NULL to indicate end-of-list to the caller
free(strs[len]);
strs[len] = NULL;
break;
}
} while (1);
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char **strs = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char **pp = strs; *pp; ++pp)
{
puts(*pp);
free(*pp);
}
// free the now-defunct pointer array
free(strs);
}
return EXIT_SUCCESS;
}
Output (run against /usr/share/dict/words)
A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
Aaronic
Aaronical
Aaronite
Aaronitic
Aaru
Ab
aba
Ababdeh
Ababua
abac
abaca
......
zymotechny
zymotic
zymotically
zymotize
zymotoxic
zymurgy
Zyrenian
Zyrian
Zyryan
zythem
Zythia
zythum
Zyzomys
Zyzzogeton
Improvements
The secondary malloc in this code is completely pointless. You're using a fixed length word maximum size, so you could easily retool you array to be a pointer to use this:
char (*strs)[STR_MAX_LEN]
and simply eliminate the per-string malloc code entirely. That does leave the problem of how to tell the caller how many strings were allocated. In the prior version we used a NULL pointer to indicate end-of-list. In this version we can simply use a zero-length string. Doing this makes the declaration of readFile rather odd looking, but for returning a pointer-to-array-of-size-N, its' correct. See below:
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char (*readFile(const char *fname))[STR_MAX_LEN]
{
char (*strs)[STR_MAX_LEN] = NULL;
int len = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
// array expansion
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand pointer array");
free(strs);
strs = NULL;
break;
}
// allocation was good; save off new pointer
strs = tmp;
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. make the final string zero-length
strs[len][0] = 0;
break;
}
} while (1);
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char (*strs)[STR_MAX_LEN] = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char (*s)[STR_MAX_LEN] = strs; (*s)[0]; ++s)
puts(*s);
free(strs);
}
return EXIT_SUCCESS;
}
The output is the same as before.
Another Improvement: Geometric Growth
With a few simple changes we can significantly cut down on the realloc invokes (we're currently doing one per string added) by only doing them in a double-size growth pattern. If each time we reallocate, we double the size of the prior allocation, we will make more and more space available for reading larger numbers of strings before the next allocation:
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char (*readFile(const char *fname))[STR_MAX_LEN]
{
char (*strs)[STR_MAX_LEN] = NULL;
int len = 0;
int capacity = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
if (len == capacity)
{
printf("Expanding capacity to %d\n", (2 * capacity + 1));
void *tmp = realloc(strs, (2 * capacity + 1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand string array");
free(strs);
strs = NULL;
break;
}
// save the new string pointer and capacity
strs = tmp;
capacity = 2 * capacity + 1;
}
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. make the final string zero-length
strs[len][0] = 0;
break;
}
} while (1);
// shrink if needed. remember to retain the final empty string
if (strs && (len+1) < capacity)
{
printf("Shrinking capacity to %d\n", len);
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp)
strs = tmp;
}
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char (*strs)[STR_MAX_LEN] = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char (*s)[STR_MAX_LEN] = strs; (*s)[0]; ++s)
puts(*s);
// free the now-defunct pointer array
free(strs);
}
return EXIT_SUCCESS;
}
Output
The output is the same as before, but I added instrumentation to show when expansion happens to illustrate the expansions and final shrinking. I'll leave out the rest of the output (which is over 200k lines of words)
Expanding capacity to 1
Expanding capacity to 3
Expanding capacity to 7
Expanding capacity to 15
Expanding capacity to 31
Expanding capacity to 63
Expanding capacity to 127
Expanding capacity to 255
Expanding capacity to 511
Expanding capacity to 1023
Expanding capacity to 2047
Expanding capacity to 4095
Expanding capacity to 8191
Expanding capacity to 16383
Expanding capacity to 32767
Expanding capacity to 65535
Expanding capacity to 131071
Expanding capacity to 262143
Shrinking capacity to 235886
so I'm having a little problem with my struct array not doing what its supposed to. I get no compiler warnings or errors when building the program.
int Array_Size=0;;
int Array_Index=0;
FILE *Writer;
struct WordElement
{
int Count;
char Word[50];
};
struct WordElement *StructPointer; //just a pointer to a structure
int Create_Array(int Size){
StructPointer = (struct WordElement *) malloc(Size * sizeof(StructPointer));
Array_Size = Size;
return 0;
}
int Add_To_Array(char Word[50]){
int Word_Found=0;
for(int i=0; i <= Array_Size && Word_Found!=1; i++)
{
if(strcmp(StructPointer[i].Word, Word)) // This should only run if the word exists in struct array
{
StructPointer[i].Count++;
Word_Found=1;
}
}
if(Word_Found==0) // if the above if statement doesnt evualate, this should run
{
strcpy(StructPointer[Array_Index].Word, Word); //copying the word passed by the main function to the struct array at a specific index
printf("WORD: %s\n", StructPointer[Array_Index].Word); // printing it just to make sure it got copied correctly
Array_Index++;
}
return 0;
}
int Print_All(char File_Name[50])
{
Writer = fopen(File_Name, "w");
printf("Printing starts now: \n");
for(int i=0; i < Array_Size; i++)
{
fprintf(Writer, "%s\t\t%d\n",StructPointer[i].Word, StructPointer[i].Count);
}
free(StructPointer);
return 0;
}
These functions get called from a different file, The Add_To_Array is called when the program reads a new word form the text file. That function is supposed to check if the word already exists in the struct array and if it does, it should just increment the counter. If it doesn't, then it adds it.
The Print_All function is called after all the words have been stored in the struct array. Its supposed to loop through them and print each word and their occurrence. In the text file, there are 2 of every words but my program outputs:
this 13762753
document -1772785369
contains 1129268256
two 6619253
of 5701679
every 5570645
word 3342389
doccontains 5374021
I don't know what to make of this as im really new to C programming... It's probably worth mentioning the if(Word_Foun==0) doesn't execute
StructPointer = malloc(Size * sizeof(*StructPointer));
This will be the correct allocation. Otherwise you will have erroneous behavior in your code. Also check the return value of malloc.
StructPointer = malloc(Size * sizeof(*StructPointer));
if(NULL == StructPointer){
perror("malloc failure");
exit(EXIT_FAILURE);
}
You are allocating for struct WordElement not a for a pointer to it. You already have a pointer to struct WordElement all that you needed was memory for a struct WordElement.
Also in the loop you are accessing array index out of bound
for(int i=0; i <= Array_Size && Word_Found!=1; i++)
^^^
It will be i < Array_Size.
In case match occurs you want to set the variable Word_found to 1.
if(strcmp(StructPointer[i].Word, Word) == 0){
/* it macthed */
}
Also Writer = fopen(File_Name, "w"); you should check the return value of fopen.
if(Writer == NULL){
fprintf(stderr,"Error in file opening");
exit(EXIT_FAILURE);
}
Also when you are increasing the Array_index place a check whether it might access the array index out of bound.
The more global variable you use for achieving a small task would make it more difficult to track down a bug. It is always problematic because the places from which data might change is scattered - making it difficult to manage.
it's my first post so I apologize in advance if I've posted the code format wrong.
I've been trying to work out where I'm going wrong here for awhile now and haven't been able to find an answer. I keep getting a segmentation fault after two Lines of a text file have been scanned into my arrays. Text file follows the pattern of : City1 City2 distance.
I feel like it has something to do with the memory but I can't understand why.
#include <stdio.h>
#include <stdlib.h>
#include <string.h> //Chosen to use this library to break text file down.
#include "list.h" //Using file created in lab 3 earlier this year.
#define DYNAMIC_RESIZE 0 ///Might not be needed...
#define Max_Lines 40
#define LINE_SIZE 150
int main()
{
FILE *Distances_File = fopen("Distances.txt", "r");
char *City1[Max_Lines];
char *City2[Max_Lines];
int *Distances[Max_Lines];
City1[Max_Lines] = malloc(sizeof(Max_Lines));
City2[Max_Lines] = malloc(sizeof(Max_Lines));
Distances[Max_Lines] = malloc(sizeof(Max_Lines));
char File_Line[LINE_SIZE];
int Line_Count = 0;
if (!Distances_File) {
printf("File could not open");
return 1;
}
if ( Distances_File != NULL )
{
///Intro to the program.
printf("This is a program that will calculate the shortest distance between selected \ncities.");
printf("\nSo wish me luck :(\n \n");
while(fgets(File_Line, sizeof(File_Line), Distances_File))
{
//printf("%s", File_Line);
sscanf(File_Line, "%s%s%s", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
///Still issue of Distances being a char.
printf("%s\n%s\n%s\n\n", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
Line_Count++;
printf("%d", Line_Count);
}
}
}
This statement
char *City1[Max_Lines];
declare an array of char pointers and the size of the array is Max_Lines.
And here you are allocating memory to an invalid index of the array:
City1[Max_Lines] = malloc(sizeof(Max_Lines));
Max_Lines value is 40, so the valid index of the array of size Max_Lines will be 0-39.
You need to allocate memory to all the pointers of array City1 and City2 before using them.
May you write a function to perform this allocation, like this:
void allocate_city_mem(char *arr[], size_t sz) {
for(size_t i = 0; i < sz; i++) {
arr[i] = malloc(LINE_SIZE);
if (NULL == arr[i])
exit(EXIT_FAILURE);
}
}
In doing so, you need to make sure to free the dynamically allocated memory once you have done with it. You can do:
void free_city_mem(char *arr[], size_t sz) {
for(size_t i = 0; i < sz; i++) {
free(arr[i]);
arr[i] = NULL;
}
}
In your program, I can see that the Max_Lines and LINE_SIZE are small values, so as an alternative you can do this:
char City1[Max_Lines][LINE_SIZE];
With this, you don't need to take care of any allocation/deallocation of memory.
Also, no need to take the array of integer pointer for storing distance. Distances could be an array of integers.
In your code, you are calling fopen() and checking the return value of it (if (!Distances_File) {....) somewhere below in the code after calling malloc. As a good programming practice, you should immediately check the return value of library function in such cases because if they fail, there is no point in proceeding further.
Also, when using scanf family functions, make sure that format specifier corresponding to the parameter passed should be correct.
Collectively all above points, the main() will be something like this:
int main() {
FILE *Distances_File = fopen("Distances.txt", "r");
if (!Distances_File) {
printf("File could not open");
return 1;
}
char City1[Max_Lines][LINE_SIZE];
char City2[Max_Lines][LINE_SIZE];
int Distances[Max_Lines];
char File_Line[LINE_SIZE];
int Line_Count = 0;
//Intro to the program.
printf("This is a program that will calculate the shortest distance between selected \ncities.");
printf("\nSo wish me luck :(\n \n");
while(fgets(File_Line, sizeof(File_Line), Distances_File)) {
printf("Line number : %d\n", Line_Count+1);
//printf("%s", File_Line);
sscanf(File_Line, "%s%s%d", City1[Line_Count], City2[Line_Count], &Distances[Line_Count]);
///Still issue of Distances being a char.
printf("%s\n%s\n%d\n\n", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
Line_Count++;
}
return 0;
}
You have undefined behavior accessing City1[Max_Lines] where the size of the array is Max_Lines. You should change it instead allocate memory and make those pointers point to it. So you will do something like this:-
#define MAXLEN 100
...
for(size_y i = 0; i < Max_Lines; i++){
City1[i] = malloc(MAXLEN);
/* check malloc return value */
}
And also here you can simply do this too,
char City1[Max_Lines][MAXLEN];
Because here if you want to get MAXLEN byte buffer then instead of dynamic allocation you can do this. The only problem is that if Max_lines and MAXLEN is larger then there is a possibilty of constrained by stack size. Then dynamic allocation will be a rescue.
Also you should check the return value fopen. There are cases when fopen fails. You need to handle those cases separately.
try below variant:
#include <stdio.h>
#include <stdlib.h>
#include <string.h> //Chosen to use this library to break text file down.
//#include "list.h" //Using file created in lab 3 earlier this year.
#define DYNAMIC_RESIZE 0 ///Might not be needed...
#define Max_Lines 40
#define LINE_SIZE 150
typedef struct {
char data[LINE_SIZE];
} LineBuffer;
int main()
{
FILE *Distances_File = fopen("Distances.txt", "r");
LineBuffer *City1;
LineBuffer *City2;
int *Distances;
char File_Line[LINE_SIZE];
int Line_Count = 0;
City1 = (LineBuffer *)malloc(sizeof(LineBuffer)*Max_Lines);
City2 = (LineBuffer *)malloc(sizeof(LineBuffer)*Max_Lines);
Distances = (int *)malloc(Max_Lines);
if (!Distances_File) {
printf("File could not open");
return 1;
}
if ( Distances_File != NULL ) {
///Intro to the program.
printf("This is a program that will calculate the shortest distance between selected \ncities.");
printf("\nSo wish me luck :(\n \n");
while(fgets(File_Line, sizeof(File_Line), Distances_File) && Line_Count < Max_Lines)
{
//printf("%s", File_Line);
sscanf(File_Line, "%s%s%d", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
///Still issue of Distances being a char.
printf("%s\n%s\n%d\n\n", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
Line_Count++;
printf("%d", Line_Count);
}
}
free(City1);
free(City2);
free(Distances);
}
edit
sizeof(int) should be 4 bytes so malloc(sizeof(Max_Lines)) will allocate 4 bytes
I'm making a raytracing engine in C using the minilibX library.
I want to be able to read in a .conf file the configuration for the scene to display:
For example:
(Az#Az 117)cat universe.conf
#randomcomment
obj:eye:x:y:z
light:sun:100
light:moon:test
The number of objects can vary between 1 and the infinite.
From now on, I'm reading the file, copying each line 1 by 1 in a char **tab, and mallocing by the number of objects found, like this:
void open_file(int fd, struct s_img *m)
{
int i;
char *s;
int curs_obj;
int curs_light;
i = 0;
curs_light = 0;
curs_obj = 0;
while (s = get_next_line(fd))
{
i = i + 1;
if (s[0] == 'l')
{
m->lights[curs_light] = s;
curs_light = curs_light + 1;
}
else if (s[0] == 'o')
{
m->objs[curs_obj] = s;
curs_obj = curs_obj + 1;
}
else if (s[0] != '#')
{
show_error(i, s);
stop_parsing(m);
}
}
Now, I want to be able to store each information of each tab[i] in a new char **tab, 1 for each object, using the ':' as a separation.
So I need to initialize and malloc an undetermined number of char **tab. How can I do that?
(Ps: I hope my code and my english are good enough for you to understand. And I'm using only the very basic function, like read, write, open, malloc... and I'm re-building everything else, like printf, get_line, and so on)
You can't allocate an indeterminate amount of memory; malloc doesn't support it. What you can do is to allocate enough memory for now and revise that later:
size_t buffer = 10;
char **tab = malloc(buffer);
//...
if (indexOfObjectToCreate > buffer) {
buffer *= 2;
tab = realloc(tab, buffer);
}
I'd use an alternative approach (as this is c, not c++) and allocate simply large buffers as we go by:
char *my_malloc(size_t n) {
static size_t space_left = 0;
static char *base = NULL;
if (base==NULL || space_left < n) base=malloc(space_left=BIG_N);
base +=n; return base-n;
}
Disclaimer: I've omitted the garbage collection stuff and testing return values and all safety measures to keep the routine short.
Another way to think this is to read the file in to a large enough mallocated array (you can check it with ftell), scan the buffer, replace delimiters, line feeds etc. with ascii zero characters and remember the starting locations of keywords.
While working on a program which requires frequent memory allocation I came across behaviour I cannot explain. I've implemented a work around but I am curious to why my previous implementation didn't work. Here's the situation:
Memory reallocation of a pointer works
This may not be best practice (and if so please let me knwow) but I recall that realloc can allocate new memory if the pointer passed in is NULL. Below is an example where I read file data into a temporary buffer, then allocate appropriate size for *data and memcopy content
I have a file structure like so
typedef struct _my_file {
int size;
char *data;
}
And the mem reallocation and copy code like so:
// cycle through decompressed file until end is reached
while ((read_size = gzread(fh, buf, sizeof(buf))) != 0 && read_size != -1) {
// allocate/reallocate memory to fit newly read buffer
if ((tmp_data = realloc(file->data, sizeof(char *)*(file->size+read_size))) == (char *)NULL) {
printf("Memory reallocation error for requested size %d.\n", file->size+read_size);
// if memory was previous allocated but realloc failed this time, free memory!
if (file->size > 0)
free(file->data);
return FH_REALLOC_ERROR;
}
// update pointer to potentially new address (man realloc)
file->data = tmp_data;
// copy data from temporary buffer
memcpy(file->data + file->size, buf, read_size);
// update total read file size
file->size += read_size;
}
Memory reallocation of pointer to pointer fails
However, here is where I'm confused. Using the same thought that reallocation of a NULL pointer will allocate new memory, I parse a string of arguments and for each argument I allocate a pointer to a pointer, then allocate a pointer that is pointed by that pointer to a pointer. Maybe code is easier to explain:
This is the structure:
typedef struct _arguments {
unsigned short int options; // options bitmap
char **regexes; // array of regexes
unsigned int nregexes; // number of regexes
char *logmatch; // log file match pattern
unsigned int limit; // log match limit
char *argv0; // executable name
} arguments;
And the memory allocation code:
int i = 0;
int len;
char **tmp;
while (strcmp(argv[i+regindex], "-logs") != 0) {
len = strlen(argv[i+regindex]);
if((tmp = realloc(args->regexes, sizeof(char **)*(i+1))) == (char **)NULL) {
printf("Cannot allocate memory for regex patterns array.\n");
return -1;
}
args->regexes = tmp;
tmp = NULL;
if((args->regexes[i] = (char *)malloc(sizeof(char *)*(len+1))) == (char *)NULL) {
printf("Cannot allocate memory for regex pattern.\n");
return -1;
}
strcpy(args->regexes[i], argv[i+regindex]);
i++;
}
When I compile and run this I get a run time error "realloc: invalid pointer "
I must be missing something obvious but after not accomplishing much trying to debug and searching for solutions online for 5 hours now, I just ran two loops, one counts the numbers of arguments and mallocs enough space for it, and the second loop allocates space for the arguments and strcpys it.
Any explanation to this behaviour is much appreciated! I really am curious to know why.
First fragment:
// cycle through decompressed file until end is reached
while (1) {
char **tmp_data;
read_size = gzread(fh, buf, sizeof buf);
if (read_size <= 0) break;
// allocate/reallocate memory to fit newly read buffer
tmp_data = realloc(file->data, (file->size+read_size) * sizeof *tmp_data );
if ( !tmp_data ) {
printf("Memory reallocation error for requested size %d.\n"
, file->size+read_size);
if (file->data) {
free(file->data)
file->data = NULL;
file->size = 0;
}
return FH_REALLOC_ERROR;
}
file->data = tmp_data;
// copy data from temporary buffer
memcpy(file->data + file->size, buf, read_size);
// update total read file size
file->size += read_size;
}
Second fragment:
unsigned i; // BTW this variable is already present as args->nregexes;
for(i =0; strcmp(argv[i+regindex], "-logs"); i++) {
char **tmp;
tmp = realloc(args->regexes, (i+1) * sizeof *tmp );
if (!tmp) {
printf("Cannot allocate memory for regex patterns array.\n");
return -1;
}
args->regexes = tmp;
args->regexes[i] = strdup( argv[i+regindex] );
if ( !args->regexes[i] ) {
printf("Cannot allocate memory for regex pattern.\n");
return -1;
}
...
return 0;
}
A few notes:
the syntax ptr = malloc ( CNT * sizeof *ptr); is more robust than the sizeof(type) variant.
strdup() does exactly the same as your malloc+strcpy()
the for(;;) loop is less error prone than a while() loop with a loose i++; at the end of the loop body. (it also makes clear that the loopcondition is never checked)
to me if ( !ptr ) {} is easyer to read than if (ptr != NULL) {}
the casts are not needed and sometimes unwanted.