I am trying to write a simple program that will read words from a file and print the number of occurrences of a particular word passed to it as argument.
For that, I use fscanf to read the words and copy them into an array of strings that is dynamically allocated.
For some reason, I get an error message.
Here is the code for the readFile function:
void readFile(char** buffer, char** argv){
unsigned int i=0;
FILE* file;
file = fopen(argv[1], "r");
do{
buffer = realloc(buffer, sizeof(char*));
buffer[i] = malloc(46);
}while(fscanf(file, "%s", buffer[i++]));
fclose(file);
}
And here is the main function :
int main(int argc, char** argv){
char** buffer = NULL;
readFile(buffer, argv);
printf("%s\n", buffer[0]);
return 0;
}
I get the following error message :
realloc(): invalid next size
Aborted (core dumped)
I have looked at other threads on this topic but none of them seem to be of help. I could not apply whatever I learned there to my problem.
I used a debugger (VS Code with gdb). Data is written successfully into indices 0,1,2,3 of the buffer array but says error : Cannot access memory at address 0xfbad2488 for index 4 and pauses on exception.
Another thread on this topic suggests there might be a wild pointer somewhere. But I don't see one anywhere.
I have spent days trying to figure this out. Any help will be greatly appreciated.
Thanks.
Your algorithm is wrong on many fronts, including:
buffer is passed by-value. Any modifications where buffer = ... is the assignment will mean nothing to the caller. In C, arguments are always pass-by-value (arrays included, but their "value" is a conversion to temporary pointer to first element, so you get a by-ref synonym there whether you want it or not).
Your realloc usage is wrong. It should be expanding based on the iteration of the loop as a count multiplied by the size of a char *. You only have the latter, with no count multiplier. Therefore, you never allocate more than a single char * with that realloc call.
Your loop termination condition is wrong. Your fscanf call should check for the expected number of arguments to be processed, which in your case is 1. Instead, you're looking for any non-zero value, which EOF is going to be when you hit it. Therefore, the loop never terminates.
Your fscanf call is not protected from buffer overflow : You're allocating a static-sized string for each string read, but not limiting the %s format to the static size specified. This is a recipe for buffer-overflow.
No IO functions are ever checked for success/failure : The following APIs could fail, yet you never check that possibility: fopen, fscanf, realloc, malloc. In failing to do so, you're violating Henry Spencer's 6th Commandment for C Programmers : "If a function be advertised to return an error code in the event of difficulties, thou shalt check for that code, yea, even though the checks triple the size of thy code and produce aches in thy typing fingers, for if thou thinkest ``it cannot happen to me'', the gods shall surely punish thee for thy arrogance."
No mechanism for communicating the allocated string count to the caller : The caller of this function is expecting a resulting char**. Assuming you fix the first item in this list, you still have not provided the caller any means of knowing how long that pointer sequence is when readFile returns. An out-parameter and/or a formal structure is a possible solution to this. Or perhaps a terminating NULL pointer to indicate the list is finished.
(Moderate) You never check argc : Instead, you just send argv directly to readFile, and assume the file name will be at argv[1] and always be valid. Don't do that. readFile should take either a FILE* or a single const char * file name, and act accordingly. It would be considerably more robust.
(Minor) : Extra allocation : Even fixing the above items, you'll still leave one extra buffer allocation in your sequence; the one that failed to read. Not that it matter much in this case, as the caller has no idea how many strings were allocated in the first place (see previous item).
Shoring up all of the above would require a basic rewrite of nearly everything you have posted. In the end, the code would look so different, it's almost not worth trying to salvage what is here. Instead, look at what you have done, look at this list, and see where things went wrong. There's plenty to choose from.
Sample
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char ** readFile(const char *fname)
{
char **strs = NULL;
int len = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
// array expansion
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand pointer array");
for (int i=0; i<len; ++i)
free(strs[i]);
free(strs);
strs = NULL;
break;
}
// allocation was good; save off new pointer
strs = tmp;
strs[len] = malloc( STR_MAX_LEN );
if (strs[len] == NULL)
{
// failed. cleanup prior sucess
perror("Failed to allocate string buffer");
for (int i=0; i<len; ++i)
free(strs[i]);
free(strs);
strs = NULL;
break;
}
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. we're leaving regardless. the last
// allocation is thrown out, but we terminate the list
// with a NULL to indicate end-of-list to the caller
free(strs[len]);
strs[len] = NULL;
break;
}
} while (1);
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char **strs = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char **pp = strs; *pp; ++pp)
{
puts(*pp);
free(*pp);
}
// free the now-defunct pointer array
free(strs);
}
return EXIT_SUCCESS;
}
Output (run against /usr/share/dict/words)
A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
Aaronic
Aaronical
Aaronite
Aaronitic
Aaru
Ab
aba
Ababdeh
Ababua
abac
abaca
......
zymotechny
zymotic
zymotically
zymotize
zymotoxic
zymurgy
Zyrenian
Zyrian
Zyryan
zythem
Zythia
zythum
Zyzomys
Zyzzogeton
Improvements
The secondary malloc in this code is completely pointless. You're using a fixed length word maximum size, so you could easily retool you array to be a pointer to use this:
char (*strs)[STR_MAX_LEN]
and simply eliminate the per-string malloc code entirely. That does leave the problem of how to tell the caller how many strings were allocated. In the prior version we used a NULL pointer to indicate end-of-list. In this version we can simply use a zero-length string. Doing this makes the declaration of readFile rather odd looking, but for returning a pointer-to-array-of-size-N, its' correct. See below:
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char (*readFile(const char *fname))[STR_MAX_LEN]
{
char (*strs)[STR_MAX_LEN] = NULL;
int len = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
// array expansion
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand pointer array");
free(strs);
strs = NULL;
break;
}
// allocation was good; save off new pointer
strs = tmp;
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. make the final string zero-length
strs[len][0] = 0;
break;
}
} while (1);
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char (*strs)[STR_MAX_LEN] = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char (*s)[STR_MAX_LEN] = strs; (*s)[0]; ++s)
puts(*s);
free(strs);
}
return EXIT_SUCCESS;
}
The output is the same as before.
Another Improvement: Geometric Growth
With a few simple changes we can significantly cut down on the realloc invokes (we're currently doing one per string added) by only doing them in a double-size growth pattern. If each time we reallocate, we double the size of the prior allocation, we will make more and more space available for reading larger numbers of strings before the next allocation:
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char (*readFile(const char *fname))[STR_MAX_LEN]
{
char (*strs)[STR_MAX_LEN] = NULL;
int len = 0;
int capacity = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
if (len == capacity)
{
printf("Expanding capacity to %d\n", (2 * capacity + 1));
void *tmp = realloc(strs, (2 * capacity + 1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand string array");
free(strs);
strs = NULL;
break;
}
// save the new string pointer and capacity
strs = tmp;
capacity = 2 * capacity + 1;
}
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. make the final string zero-length
strs[len][0] = 0;
break;
}
} while (1);
// shrink if needed. remember to retain the final empty string
if (strs && (len+1) < capacity)
{
printf("Shrinking capacity to %d\n", len);
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp)
strs = tmp;
}
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char (*strs)[STR_MAX_LEN] = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char (*s)[STR_MAX_LEN] = strs; (*s)[0]; ++s)
puts(*s);
// free the now-defunct pointer array
free(strs);
}
return EXIT_SUCCESS;
}
Output
The output is the same as before, but I added instrumentation to show when expansion happens to illustrate the expansions and final shrinking. I'll leave out the rest of the output (which is over 200k lines of words)
Expanding capacity to 1
Expanding capacity to 3
Expanding capacity to 7
Expanding capacity to 15
Expanding capacity to 31
Expanding capacity to 63
Expanding capacity to 127
Expanding capacity to 255
Expanding capacity to 511
Expanding capacity to 1023
Expanding capacity to 2047
Expanding capacity to 4095
Expanding capacity to 8191
Expanding capacity to 16383
Expanding capacity to 32767
Expanding capacity to 65535
Expanding capacity to 131071
Expanding capacity to 262143
Shrinking capacity to 235886
Related
I'm using an array of strings in C to hold arguments given to a custom shell. I initialize the array of buffers using:
char *args[MAX_CHAR];
Once I parse the arguments, I send them to the following function to determine the type of IO redirection if there are any (this is just the first of 3 functions to check for redirection and it only checks for STDIN redirection).
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
Once I compile and run the shell, it crashes with a SIGSEGV. Using GDB I determined that the shell is crashing on the following line:
if (strlen(args[i]) == 0) {
This is because the address of arg[i] (the first empty string after the parsed commands) is inaccessible. Here is the error from GDB and all relevant variables:
(gdb) next
359 if (strlen(args[i]) == 0) {
(gdb) p args[0]
$1 = 0x7fffffffe570 "echo"
(gdb) p args[1]
$2 = 0x7fffffffe575 "test"
(gdb) p args[2]
$3 = 0x0
(gdb) p i
$4 = 2
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
parseInputFile (args=0x7fffffffd570, inputFilePath=0x7fffffffd240 "") at shell.c:359
359 if (strlen(args[i]) == 0) {
I believe that the p args[2] returning $3 = 0x0 means that because the index has yet to be written to, it is mapped to address 0x0 which is out of the bounds of execution. Although I can't figure out why this is because it was declared as a buffer. Any suggestions on how to solve this problem?
EDIT: Per Kaylum's comment, here is a minimal reproducible example
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>
#include <sys/stat.h>
#include<readline/readline.h>
#include<readline/history.h>
#include <fcntl.h>
// Defined values
#define MAX_CHAR 256
#define MAX_ARG 64
#define clear() printf("\033[H\033[J") // Clear window
#define DEFAULT_PROMPT_SUFFIX "> "
char PROMPT[MAX_CHAR], SPATH[1024];
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
int ioRedirectHandler(char **args) {
char inputFilePath[MAX_CHAR] = "";
// Check if any redirects exist
if (parseInputFile(args, inputFilePath)) {
return 1;
} else {
return 0;
}
}
void parseArgs(char *cmd, char **cmdArgs) {
int na;
// Separate each argument of a command to a separate string
for (na = 0; na < MAX_ARG; na++) {
cmdArgs[na] = strsep(&cmd, " ");
if (cmdArgs[na] == NULL) {
break;
}
if (strlen(cmdArgs[na]) == 0) {
na--;
}
}
}
int processInput(char* input, char **args, char **pipedArgs) {
// Parse the single command and args
parseArgs(input, args);
return 0;
}
int getInput(char *input) {
char *buf, loc_prompt[MAX_CHAR] = "\n";
strcat(loc_prompt, PROMPT);
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
add_history(buf);
strcpy(input, buf);
return 0;
} else {
return 1;
}
}
void init() {
char *uname;
clear();
uname = getenv("USER");
printf("\n\n \t\tWelcome to Student Shell, %s! \n\n", uname);
// Initialize the prompt
snprintf(PROMPT, MAX_CHAR, "%s%s", uname, DEFAULT_PROMPT_SUFFIX);
}
int main() {
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
int isPiped = 0, isIORedir = 0;
init();
while(1) {
// Get the user input
if (getInput(input)) {
continue;
}
isPiped = processInput(input, args, pipedArgs);
isIORedir = ioRedirectHandler(args);
}
return 0;
}
Note: If I forgot to include any important information, please let me know and I can get it updated.
When you write
char *args[MAX_CHAR];
you allocate room for MAX_CHAR pointers to char. You do not initialise the array. If it is a global variable, you will have initialised all the pointers to NULL, but you do it in a function, so the elements in the array can point anywhere. You should not dereference them before you have set the pointers to point at something you are allowed to access.
You also do this, though, in parseArgs(), where you do this:
cmdArgs[na] = strsep(&cmd, " ");
There are two potential issues here, but let's deal with the one you hit first. When strsep() is through the tokens you are splitting, it returns NULL. You test for that to get out of parseArgs() so you already know this. However, where your program crashes you seem to have forgotten this again. You call strlen() on a NULL pointer, and that is a no-no.
There is a difference between NULL and the empty string. An empty string is a pointer to a buffer that has the zero char first; the string "" is a pointer to a location that holds the character '\0'. The NULL pointer is a special value for pointers, often address zero, that means that the pointer doesn't point anywhere. Obviously, the NULL pointer cannot point to an empty string. You need to check if an argument is NULL, not if it is the empty string.
If you want to check both for NULL and the empty string, you could do something like
if (!args[i] || strlen(args[i]) == 0) {
If args[i] is NULL then !args[i] is true, so you will enter the if body if you have NULL or if you have a pointer to an empty string.
(You could also check the empty string with !(*args[i]); *args[i] is the first character that args[i] points at. So *args[i] is zero if you have the empty string; zero is interpreted as false, so !(*args[i]) is true if and only if args[i] is the empty string. Not that this is more readable, but it shows again the difference between empty strings and NULL).
I mentioned another issue with the parsed arguments. Whether it is a problem or not depends on the application. But when you parse a string with strsep(), you get pointers into the parsed string. You have to be careful not to free that string (it is input in your main() function) or to modify it after you have parsed the string. If you change the string, you have changed what all the parsed strings look at. You do not do this in your program, so it isn't a problem here, but it is worth keeping in mind. If you want your parsed arguments to survive longer than they do now, after the next command is passed, you need to copy them. The next command that is passed will change them as it is now.
In main
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
are all uninitialized. They contain indeterminate values. This could be a potential source of bugs, but is not the reason here, as
getInput modifies the contents of input to be a valid string before any reads occur.
pipedArgs is unused, so raises no issues (yet).
args is modified by parseArgs to (possibly!) contain a NULL sentinel value, without any indeterminate pointers being read first.
Firstly, in parseArgs it is possible to completely fill args without setting the NULL sentinel value that other parts of the program should rely on.
Looking deeper, in parseInputFile the following
if (strlen(args[i]) == 0)
contradicts the limits imposed by parseArgs that disallows empty strings in the array. More importantly, args[i] may be the sentinel NULL value, and strlen expects a non-NULL pointer to a valid string.
This termination condition should simply check if args[i] is NULL.
With
strcpy(inputFilePath, args[i+1]);
args[i+1] might also be the NULL sentinel value, and strcpy also expects non-NULL pointers to valid strings. You can see this in action when inputSymbol is a match for the final token in the array.
args[i+1] may also evaluate as args[MAX_ARGS], which would be out of bounds.
Additionally, inputFilePath has a string length limit of MAX_CHAR - 1, and args[i+1] is (possibly!) a dynamically allocated string whose length might exceed this.
Some edge cases found in getInput:
Both arguments to
strcat(loc_prompt, PROMPT);
are of the size MAX_CHAR. Since loc_prompt has a length of 1. If PROMPT has the length MAX_CHAR - 1, the resulting string will have the length MAX_CHAR. This would leave no room for the NUL terminating byte.
readline can return NULL in some situations, so
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
can again pass the NULL pointer to strlen.
A similar issue as before, on success readline returns a string of dynamic length, and
strcpy(input, buf);
can cause a buffer overflow by attempting to copy a string greater in length than MAX_CHAR - 1.
buf is a pointer to data allocated by malloc. It's unclear what add_history does, but this pointer must eventually be passed to free.
Some considerations.
Firstly, it is a good habit to initialize your data, even if it might not matter.
Secondly, using constants (#define MAX_CHAR 256) might help to reduce magic numbers, but they can lead you to design your program too rigidly if used in the same way.
Consider building your functions to accept a limit as an argument, and return a length. This allows you to more strictly track the sizes of your data, and prevents you from always designing around the maximum potential case.
A slightly contrived example of designing like this. We can see that find does not have to concern itself with possibly checking MAX_ARGS elements, as it is told precisely how long the list of valid elements is.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_ARGS 100
char *get_input(char *dest, size_t sz, const char *display) {
char *res;
if (display)
printf("%s", display);
if ((res = fgets(dest, sz, stdin)))
dest[strcspn(dest, "\n")] = '\0';
return res;
}
size_t find(char **list, size_t length, const char *str) {
for (size_t i = 0; i < length; i++)
if (strcmp(list[i], str) == 0)
return i;
return length;
}
size_t split(char **list, size_t limit, char *source, const char *delim) {
size_t length = 0;
char *token;
while (length < limit && (token = strsep(&source, delim)))
if (*token)
list[length++] = token;
return length;
}
int main(void) {
char input[512] = { 0 };
char *args[MAX_ARGS] = { 0 };
puts("Welcome to the shell.");
while (1) {
if (get_input(input, sizeof input, "$ ")) {
size_t argl = split(args, MAX_ARGS, input, " ");
size_t redirection = find(args, argl, "<");
puts("Command parts:");
for (size_t i = 0; i < redirection; i++)
printf("%zu: %s\n", i, args[i]);
puts("Input files:");
if (redirection == argl)
puts("[[NONE]]");
else for (size_t i = redirection + 1; i < argl; i++)
printf("%zu: %s\n", i, args[i]);
}
}
}
it's my first post so I apologize in advance if I've posted the code format wrong.
I've been trying to work out where I'm going wrong here for awhile now and haven't been able to find an answer. I keep getting a segmentation fault after two Lines of a text file have been scanned into my arrays. Text file follows the pattern of : City1 City2 distance.
I feel like it has something to do with the memory but I can't understand why.
#include <stdio.h>
#include <stdlib.h>
#include <string.h> //Chosen to use this library to break text file down.
#include "list.h" //Using file created in lab 3 earlier this year.
#define DYNAMIC_RESIZE 0 ///Might not be needed...
#define Max_Lines 40
#define LINE_SIZE 150
int main()
{
FILE *Distances_File = fopen("Distances.txt", "r");
char *City1[Max_Lines];
char *City2[Max_Lines];
int *Distances[Max_Lines];
City1[Max_Lines] = malloc(sizeof(Max_Lines));
City2[Max_Lines] = malloc(sizeof(Max_Lines));
Distances[Max_Lines] = malloc(sizeof(Max_Lines));
char File_Line[LINE_SIZE];
int Line_Count = 0;
if (!Distances_File) {
printf("File could not open");
return 1;
}
if ( Distances_File != NULL )
{
///Intro to the program.
printf("This is a program that will calculate the shortest distance between selected \ncities.");
printf("\nSo wish me luck :(\n \n");
while(fgets(File_Line, sizeof(File_Line), Distances_File))
{
//printf("%s", File_Line);
sscanf(File_Line, "%s%s%s", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
///Still issue of Distances being a char.
printf("%s\n%s\n%s\n\n", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
Line_Count++;
printf("%d", Line_Count);
}
}
}
This statement
char *City1[Max_Lines];
declare an array of char pointers and the size of the array is Max_Lines.
And here you are allocating memory to an invalid index of the array:
City1[Max_Lines] = malloc(sizeof(Max_Lines));
Max_Lines value is 40, so the valid index of the array of size Max_Lines will be 0-39.
You need to allocate memory to all the pointers of array City1 and City2 before using them.
May you write a function to perform this allocation, like this:
void allocate_city_mem(char *arr[], size_t sz) {
for(size_t i = 0; i < sz; i++) {
arr[i] = malloc(LINE_SIZE);
if (NULL == arr[i])
exit(EXIT_FAILURE);
}
}
In doing so, you need to make sure to free the dynamically allocated memory once you have done with it. You can do:
void free_city_mem(char *arr[], size_t sz) {
for(size_t i = 0; i < sz; i++) {
free(arr[i]);
arr[i] = NULL;
}
}
In your program, I can see that the Max_Lines and LINE_SIZE are small values, so as an alternative you can do this:
char City1[Max_Lines][LINE_SIZE];
With this, you don't need to take care of any allocation/deallocation of memory.
Also, no need to take the array of integer pointer for storing distance. Distances could be an array of integers.
In your code, you are calling fopen() and checking the return value of it (if (!Distances_File) {....) somewhere below in the code after calling malloc. As a good programming practice, you should immediately check the return value of library function in such cases because if they fail, there is no point in proceeding further.
Also, when using scanf family functions, make sure that format specifier corresponding to the parameter passed should be correct.
Collectively all above points, the main() will be something like this:
int main() {
FILE *Distances_File = fopen("Distances.txt", "r");
if (!Distances_File) {
printf("File could not open");
return 1;
}
char City1[Max_Lines][LINE_SIZE];
char City2[Max_Lines][LINE_SIZE];
int Distances[Max_Lines];
char File_Line[LINE_SIZE];
int Line_Count = 0;
//Intro to the program.
printf("This is a program that will calculate the shortest distance between selected \ncities.");
printf("\nSo wish me luck :(\n \n");
while(fgets(File_Line, sizeof(File_Line), Distances_File)) {
printf("Line number : %d\n", Line_Count+1);
//printf("%s", File_Line);
sscanf(File_Line, "%s%s%d", City1[Line_Count], City2[Line_Count], &Distances[Line_Count]);
///Still issue of Distances being a char.
printf("%s\n%s\n%d\n\n", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
Line_Count++;
}
return 0;
}
You have undefined behavior accessing City1[Max_Lines] where the size of the array is Max_Lines. You should change it instead allocate memory and make those pointers point to it. So you will do something like this:-
#define MAXLEN 100
...
for(size_y i = 0; i < Max_Lines; i++){
City1[i] = malloc(MAXLEN);
/* check malloc return value */
}
And also here you can simply do this too,
char City1[Max_Lines][MAXLEN];
Because here if you want to get MAXLEN byte buffer then instead of dynamic allocation you can do this. The only problem is that if Max_lines and MAXLEN is larger then there is a possibilty of constrained by stack size. Then dynamic allocation will be a rescue.
Also you should check the return value fopen. There are cases when fopen fails. You need to handle those cases separately.
try below variant:
#include <stdio.h>
#include <stdlib.h>
#include <string.h> //Chosen to use this library to break text file down.
//#include "list.h" //Using file created in lab 3 earlier this year.
#define DYNAMIC_RESIZE 0 ///Might not be needed...
#define Max_Lines 40
#define LINE_SIZE 150
typedef struct {
char data[LINE_SIZE];
} LineBuffer;
int main()
{
FILE *Distances_File = fopen("Distances.txt", "r");
LineBuffer *City1;
LineBuffer *City2;
int *Distances;
char File_Line[LINE_SIZE];
int Line_Count = 0;
City1 = (LineBuffer *)malloc(sizeof(LineBuffer)*Max_Lines);
City2 = (LineBuffer *)malloc(sizeof(LineBuffer)*Max_Lines);
Distances = (int *)malloc(Max_Lines);
if (!Distances_File) {
printf("File could not open");
return 1;
}
if ( Distances_File != NULL ) {
///Intro to the program.
printf("This is a program that will calculate the shortest distance between selected \ncities.");
printf("\nSo wish me luck :(\n \n");
while(fgets(File_Line, sizeof(File_Line), Distances_File) && Line_Count < Max_Lines)
{
//printf("%s", File_Line);
sscanf(File_Line, "%s%s%d", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
///Still issue of Distances being a char.
printf("%s\n%s\n%d\n\n", City1[Line_Count], City2[Line_Count], Distances[Line_Count]);
Line_Count++;
printf("%d", Line_Count);
}
}
free(City1);
free(City2);
free(Distances);
}
edit
sizeof(int) should be 4 bytes so malloc(sizeof(Max_Lines)) will allocate 4 bytes
I have a file which stored a sequence of integers. The number of total integers is unknown, so I keep using malloc() to apply new memory if i read an integer from the file.
I don't know if i could keep asking for memory and add them at the end of the array. The Xcode keeps warning me that 'EXC_BAD_EXCESS' in the line of malloc().
How could i do this if i keep reading integers from a file?
int main()
{
//1.read from file
int *a = NULL;
int size=0;
//char ch;
FILE *in;
//open file
if ( (in=fopen("/Users/NUO/Desktop/in.text","r")) == NULL){
printf("cannot open input file\n");
exit(0); //if file open fail, stop the program
}
while( ! feof(in) ){
a = (int *)malloc(sizeof(int));
fscanf(in,"%d", &a[size] );;
printf("a[i]=%d\n",a[size]);
size++;
}
fclose(in);
return 0;
}
Calling malloc() repeatedly like that doesn't do what you think it does. Each time malloc(sizeof(int)) is called, it allocates a separate, new block of memory that's only large enough for one integer. Writing to a[size] ends up writing off the end of that array for every value past the first one.
What you want here is the realloc() function, e.g.
a = realloc(a, sizeof(int) * (size + 1));
if (a == NULL) { ... handle error ... }
Reworking your code such that size is actually the size of the array, rather than its last index, would help simplify this code, but that's neither here nor there.
Instead of using malloc, use realloc.
Don't use feof(in) in a while loop. See why.
int number;
while( fscanf(in, "%d", &number) == 1 ){
a = realloc(a, sizeof(int)*(size+1));
if ( a == NULL )
{
// Problem.
exit(0);
}
a[size] = number;
printf("a[i]=%d\n", a[size]);
size++;
}
Your malloc() is overwriting your previous storage with just enough space for a single integer!
a = (int *)malloc(sizeof(int));
^^^ assignment overwrites what you have stored!
Instead, realloc() the array:
a = realloc(a, sizeof(int)*(size+1));
You haven't allocated an array of integers, you've allocated one integer here. So you'll need to allocate a default array size, then resize if you're about to over run. This will resize it by 2 each time it is full. Might not be in your best interest to resize it this way, but you could also reallocate each for each additional field.
size_t size = 0;
size_t current_size = 2;
a = (int *)malloc(sizeof(int) * current_size);
if(!a)
handle_error();
while( ! feof(in) ){
if(size >= current_size) {
current_size *= 2;
a = (int *)realloc(a, sizeof(int) * current_size);
if(!a)
handle_error();
}
fscanf(in,"%d", &a[size] );;
printf("a[i]=%d\n",a[size]);
size++;
}
The usual approach is to allocate some amount of space at first (large enough to cover most of your cases), then double it as necessary, using the realloc function.
An example:
#define INITIAL_ALLOCATED 32 // or enough to cover most cases
...
size_t allocated = INITIAL_ALLOCATED;
size_t size = 0;
...
int *a = malloc( sizeof *a * allocated );
if ( !a )
// panic
int val;
while ( fscanf( in, "%d", &val ) == 1 )
{
if ( size == allocated )
{
int *tmp = realloc( a, sizeof *a * allocated * 2 ); // double the size of the buffer
if ( tmp )
{
a = tmp;
allocated *= 2;
}
else
{
// realloc failed - you can treat this as a fatal error, or you
// can give the user a choice to continue with the data that's
// been read so far.
}
a[size++] = val;
}
}
We start by allocating 32 elements to a. Then we read a value from the file. If we're not at the end of the array (size is not equal to allocated), we add that value to the end of the array. If we are at the end of the array, we then double the size of it using realloc. If the realloc call succeeds, we update the allocated variable to keep track of the new size and add the value to the array. We keep going until we reach the end of the input file.
Doubling the size of the array each time we reach the limit reduces the total number of realloc calls, which can save performance if you're loading a lot of values.
Note that I assigned the result of realloc to a different variable tmp. realloc will return NULL if it cannot extend the array for any reason. If we assign that NULL value to a, we lose our reference to the memory that was allocated before, causing a memory leak.
Note also that we check the result of fscanf instead of calling feof, since feof won't return true until after we've already tried to read past the end of the file.
While working on a program which requires frequent memory allocation I came across behaviour I cannot explain. I've implemented a work around but I am curious to why my previous implementation didn't work. Here's the situation:
Memory reallocation of a pointer works
This may not be best practice (and if so please let me knwow) but I recall that realloc can allocate new memory if the pointer passed in is NULL. Below is an example where I read file data into a temporary buffer, then allocate appropriate size for *data and memcopy content
I have a file structure like so
typedef struct _my_file {
int size;
char *data;
}
And the mem reallocation and copy code like so:
// cycle through decompressed file until end is reached
while ((read_size = gzread(fh, buf, sizeof(buf))) != 0 && read_size != -1) {
// allocate/reallocate memory to fit newly read buffer
if ((tmp_data = realloc(file->data, sizeof(char *)*(file->size+read_size))) == (char *)NULL) {
printf("Memory reallocation error for requested size %d.\n", file->size+read_size);
// if memory was previous allocated but realloc failed this time, free memory!
if (file->size > 0)
free(file->data);
return FH_REALLOC_ERROR;
}
// update pointer to potentially new address (man realloc)
file->data = tmp_data;
// copy data from temporary buffer
memcpy(file->data + file->size, buf, read_size);
// update total read file size
file->size += read_size;
}
Memory reallocation of pointer to pointer fails
However, here is where I'm confused. Using the same thought that reallocation of a NULL pointer will allocate new memory, I parse a string of arguments and for each argument I allocate a pointer to a pointer, then allocate a pointer that is pointed by that pointer to a pointer. Maybe code is easier to explain:
This is the structure:
typedef struct _arguments {
unsigned short int options; // options bitmap
char **regexes; // array of regexes
unsigned int nregexes; // number of regexes
char *logmatch; // log file match pattern
unsigned int limit; // log match limit
char *argv0; // executable name
} arguments;
And the memory allocation code:
int i = 0;
int len;
char **tmp;
while (strcmp(argv[i+regindex], "-logs") != 0) {
len = strlen(argv[i+regindex]);
if((tmp = realloc(args->regexes, sizeof(char **)*(i+1))) == (char **)NULL) {
printf("Cannot allocate memory for regex patterns array.\n");
return -1;
}
args->regexes = tmp;
tmp = NULL;
if((args->regexes[i] = (char *)malloc(sizeof(char *)*(len+1))) == (char *)NULL) {
printf("Cannot allocate memory for regex pattern.\n");
return -1;
}
strcpy(args->regexes[i], argv[i+regindex]);
i++;
}
When I compile and run this I get a run time error "realloc: invalid pointer "
I must be missing something obvious but after not accomplishing much trying to debug and searching for solutions online for 5 hours now, I just ran two loops, one counts the numbers of arguments and mallocs enough space for it, and the second loop allocates space for the arguments and strcpys it.
Any explanation to this behaviour is much appreciated! I really am curious to know why.
First fragment:
// cycle through decompressed file until end is reached
while (1) {
char **tmp_data;
read_size = gzread(fh, buf, sizeof buf);
if (read_size <= 0) break;
// allocate/reallocate memory to fit newly read buffer
tmp_data = realloc(file->data, (file->size+read_size) * sizeof *tmp_data );
if ( !tmp_data ) {
printf("Memory reallocation error for requested size %d.\n"
, file->size+read_size);
if (file->data) {
free(file->data)
file->data = NULL;
file->size = 0;
}
return FH_REALLOC_ERROR;
}
file->data = tmp_data;
// copy data from temporary buffer
memcpy(file->data + file->size, buf, read_size);
// update total read file size
file->size += read_size;
}
Second fragment:
unsigned i; // BTW this variable is already present as args->nregexes;
for(i =0; strcmp(argv[i+regindex], "-logs"); i++) {
char **tmp;
tmp = realloc(args->regexes, (i+1) * sizeof *tmp );
if (!tmp) {
printf("Cannot allocate memory for regex patterns array.\n");
return -1;
}
args->regexes = tmp;
args->regexes[i] = strdup( argv[i+regindex] );
if ( !args->regexes[i] ) {
printf("Cannot allocate memory for regex pattern.\n");
return -1;
}
...
return 0;
}
A few notes:
the syntax ptr = malloc ( CNT * sizeof *ptr); is more robust than the sizeof(type) variant.
strdup() does exactly the same as your malloc+strcpy()
the for(;;) loop is less error prone than a while() loop with a loose i++; at the end of the loop body. (it also makes clear that the loopcondition is never checked)
to me if ( !ptr ) {} is easyer to read than if (ptr != NULL) {}
the casts are not needed and sometimes unwanted.
I am having trouble with a struct array. I need to read in a text file line by line, and compare the values side by side. For example "Mama" would return 2 ma , 1 am because you have ma- am- ma. I have a struct:
typedef struct{
char first, second;
int count;
} pair;
I need to create an array of structs for the entire string, and then compare those structs. We also were introduced to memory allocation so we have to do it for any size file. That is where my trouble is really coming in. How do I reallocate the memory properly for an array of structs? This is my main as of now (doesn't compile, has errors obviously having trouble with this).
int main(int argc, char *argv[]){
//allocate memory for struct
pair *p = (pair*) malloc(sizeof(pair));
//if memory allocated
if(p != NULL){
//Attempt to open io files
for(int i = 1; i<= argc; i++){
FILE * fileIn = fopen(argv[i],"r");
if(fileIn != NULL){
//Read in file to string
char lineString[137];
while(fgets(lineString,137,fileIn) != NULL){
//Need to reallocate here, sizeof returning error on following line
//having trouble seeing how much memory I need
pair *realloc(pair *p, sizeof(pair)+strlen(linestring));
int structPos = 0;
for(i = 0; i<strlen(lineString)-1; i++){
for(int j = 1; j<strlen(lineSTring);j++){
p[structPos]->first = lineString[i];
p[structPos]->last = lineString[j];
structPos++;
}
}
}
}
}
}
else{
printf("pair pointer length is null\n");
}
}
I am happy to change things around obviously if there is a better method for this. I HAVE to use the above struct, have to have an array of structs, and have to work with memory allocation. Those are the only restrictions.
Allocating memory for an array of struct is as simple as allocating for one struct:
pair *array = malloc(sizeof(pair) * count);
Then you can access each item by subscribing "array":
array[0] => first item
array[1] => second item
etc
Regarding the realloc part, instead of:
pair *realloc(pair *p, sizeof(pair)+strlen(linestring));
(which is not syntactically valid, looks like a mix of realloc function prototype and its invocation at the same time), you should use:
p=realloc(p,[new size]);
In fact, you should use a different variable to store the result of realloc, since in case of memory allocation failure, it would return NULL while still leaving the already allocated memory (and then you would have lost its position in memory). But on most Unix systems, when doing casual processing (not some heavy duty task), reaching the point where malloc/realloc returns NULL is somehow a rare case (you must have exhausted all virtual free memory). Still it's better to write:
pair*newp=realloc(p,[new size]);
if(newp != NULL) p=newp;
else { ... last resort error handling, screaming for help ... }
So if I get this right you're counting how many times pairs of characters occur? Why all the mucking about with nested loops and using that pair struct when you can just keep a frequency table in a 64KB array, which is much simpler and orders of magnitude faster.
Here's roughly what I would do (SPOILER ALERT: especially if this is homework, please don't just copy/paste):
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
void count_frequencies(size_t* freq_tbl, FILE* pFile)
{
int first, second;
first = fgetc(pFile);
while( (second = fgetc(pFile)) != EOF)
{
/* Only consider printable characters */
if(isprint(first) && isprint(second))
++freq_tbl[(first << 8) | second];
/* Proceed to next character */
first = second;
}
}
int main(int argc, char*argv[])
{
size_t* freq_tbl = calloc(1 << 16, sizeof(size_t));;
FILE* pFile;
size_t i;
/* Handle some I/O errors */
if(argc < 2)
{
perror ("No file given");
return EXIT_FAILURE;
}
if(! (pFile = fopen(argv[1],"r")))
{
perror ("Error opening file");
return EXIT_FAILURE;
}
if(feof(pFile))
{
perror ("Empty file");
return EXIT_FAILURE;
}
count_frequencies(freq_tbl, pFile);
/* Print frequencies */
for(i = 0; i <= 0xffff; ++i)
if(freq_tbl[i] > 0)
printf("%c%c : %d\n", (char) (i >> 8), (char) (i & 0xff), freq_tbl[i]);
free(freq_tbl);
return EXIT_SUCCESS;
}
Sorry for the bit operations and hex notation. I just happen to like them in such a context of char tables, but they can be replaced with multiplications and additions, etc for clarity.