Memory Leaks when Tokenizing a String - c

I wrote a program in C that tokenizes an input string, and when the user enters "exit", it exits the program.
It tokenizes the string correctly; however, when I test my program with valgrind, I get some memory leaks. The only case where I don't get memory leaks is when I type exit immediately after starting the program.
Here is the output with valgrind:
Valgrind Memory Leaks
And here is my code for the program:
int main() {
/* Main Function Variables */
char *buf;
char *savecpy = NULL;
char *token;
size_t num_chars;
size_t bufsize = 2048;
int run = 1;
int tok_count = 0;
int cmp;
/* Allocate memory for the input buffer. */
buf = (char *) malloc(sizeof(char) * bufsize);
/*main run loop*/
while(run) {
/* Print >>> then get the input string */
printf(">>> ");
num_chars = getline(&buf, &bufsize, stdin);
cmp = strcmp(buf, "exit\n");
if (num_chars > 1) {
/* Tokenize the input string */
if (cmp != 0) {
/* Display each token */
savecpy = strdup(buf);
while((token = strtok_r(savecpy, " ", &savecpy))) {
printf("T%d: %s\n", tok_count, token);
tok_count++;
}
}
/* If the user entered <exit> then exit the loop */
else {
run = 0;
break;
}
}
tok_count = 0;
}
/*Free the allocated memory*/
free(buf);
return 1;
}
What may be the problem here that is causing the memory leaks in valgrind? I am freeing my memory for my input string, but I still get memory leaks.

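The memory returned by strdup must be freed. As seen in the manual:
Memory for the new string is obtained
with malloc(3), and can be freed with free(3).
However, savecpy cannot be freed after it has been passed as the third argument of strtok_r, because that function overwrites the pointer, so it no longer holds the address strdup returned. Pass a separate pointer instead, something like this:
char* ptr;
strtok_r(..,.., &ptr);
Then you can free savecpy once the inner loop finishes, on every pass through the outer loop.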
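A minimal sketch of that fix, assuming a POSIX system (strdup and strtok_r are POSIX, not ISO C). The helper name print_tokens and the variable saveptr are illustrative and do not appear in the question; the point is only that the pointer returned by strdup is never handed to strtok_r as its state argument, so it can still be passed to free:

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup and strtok_r */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print each token of line and return how many were found, or -1 if
 * the copy could not be allocated. print_tokens is a hypothetical helper. */
int print_tokens(const char *line)
{
    char *copy = strdup(line);   /* strtok_r modifies its input, so work on a copy */
    char *saveptr = NULL;        /* scratch state owned by strtok_r */
    int count = 0;

    if (copy == NULL)
        return -1;

    /* The first call passes the string; every later call passes NULL. */
    for (char *tok = strtok_r(copy, " \n", &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, " \n", &saveptr)) {
        printf("T%d: %s\n", count, tok);
        count++;
    }

    free(copy);                  /* copy still holds the address strdup returned */
    return count;
}
```

Calling print_tokens(buf) once per input line in the main loop would replace the strdup/strtok_r block in the question, and each copy is released before the next line is read.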

Related

free() errors and munmap_chunk errors when exiting a program

I wrote a program in C that tokenizes an input string, and when the user enters "exit", it exits the program.
It seems to tokenize the string correctly; however, when I exit the program, I get either an "Aborted (core dumped)" error about freeing memory or a munmap_chunk error.
Here is the link to the picture of my output:
ErrorMessage
And here is my code for the program:
int main() {
/* Main Function Variables */
char *buf;
char *token;
size_t num_chars;
size_t bufsize = 2048;
int run = 1;
int tok_count = 0;
int cmp;
/* Allocate memory for the input buffer. */
buf = (char *) malloc(sizeof(char) * bufsize);
/*main run loop*/
while(run) {
/* Print >>> then get the input string */
printf(">>> ");
num_chars = getline(&buf, &bufsize, stdin);
cmp = strcmp(buf, "exit\n");
if (num_chars > 1) {
/* Tokenize the input string */
if (cmp != 0) {
/* Display each token */
while((token = strtok_r(buf, " ", &buf))) {
printf("T%d: %s\n", tok_count, token);
tok_count++;
}
}
/* If the user entered <exit> then exit the loop */
else {
run = 0;
break;
}
}
tok_count = 0;
}
/*Free the allocated memory*/
free(buf);
return 1;
}
What may be the problem here that is causing the free() errors and munmap_chunk errors?

How to Delete Duplicate Elements from Dynamically Allocated String Array in C

I have created a program in C that reads in a word file and counts how many words are in that file, along with how many times each word occurs.
When I run it through Valgrind I either get too many bytes lost or a Segmentation Fault.
How can I remove a duplicate element from a dynamically allocated array and free the memory as well?
Gist: wordcount.c
int tokenize(Dictionary **dictionary, char *words, int total_words)
{
char *delim = " .,?!:;/\"\'\n\t";
char **temp = malloc(sizeof(char) * strlen(words) + 1);
char *token = strtok(words, delim);
*dictionary = (Dictionary*)malloc(sizeof(Dictionary) * total_words);
int count = 1, index = 0;
while (token != NULL)
{
temp[index] = (char*)malloc(sizeof(char) * strlen(token) + 1);
strcpy(temp[index], token);
token = strtok(NULL, delim);
index++;
}
for (int i = 0; i < total_words; ++i)
{
for (int j = i + 1; j < total_words; ++j)
{
if (strcmp(temp[i], temp[j]) == 0) // <------ segmentation fault occurs here
{
count++;
for (int k = j; k < total_words; ++k) // <----- loop to remove duplicates
temp[k] = temp[k+1];
total_words--;
j--;
}
}
int length = strlen(temp[i]) + 1;
(*dictionary)[i].word = (char*)malloc(sizeof(char) * length);
strcpy((*dictionary)[i].word, temp[i]);
(*dictionary)[i].count = count;
count = 1;
}
free(temp);
return 0;
}
Thanks in advance.
Without A Minimal, Complete, and Verifiable example, there is no guarantee that additional problems do not originate elsewhere in your code, but the following need careful attention:
char **temp = malloc(sizeof(char) * strlen(words) + 1);
Above you are allocating space for pointers, not characters, so the allocation is too small by a factor of sizeof (char*). To prevent such problems, size the allocation with sizeof *thepointer and you will always have the correct element size, e.g.
char **temp = malloc (sizeof *temp * strlen(words) + 1);
(The + 1 is unnecessary unless you plan to store a sentinel NULL as the final pointer, and in that case the count must be parenthesized as (strlen(words) + 1) so that the extra slot is a whole pointer rather than one byte. You must also validate the return, see below.)
Next:
*dictionary = (Dictionary*)malloc(sizeof(Dictionary) * total_words);
There is no need to cast the return of malloc; the cast is unnecessary. See: Do I cast the result of malloc?. Further, if *dictionary was previously allocated elsewhere, the allocation above creates a memory leak because you lose the reference to the original block. If it has been previously allocated, you need realloc, not malloc. And if it wasn't allocated, a better way of writing it would be:
*dictionary = malloc (sizeof **dictionary * total_words);
You must also validate that the allocation succeeds before attempting to use the block of memory, e.g.
if (! *dictionary) {
perror ("malloc - *dictionary");
exit (EXIT_FAILURE);
}
In:
temp[index] = (char*)malloc(sizeof(char) * strlen(token) + 1);
sizeof(char) is always 1 and can be omitted. Better written as:
temp[index] = malloc (strlen(token) + 1);
or better, allocate and validate in a single block:
if (!(temp[index] = malloc (strlen(token) + 1))) {
perror ("malloc - temp[index]");
exit (EXIT_FAILURE);
}
then
strcpy(temp[index++], token);
Next, while total_words may be equal to the words in temp, you have only validated that you have index number of words. That combined with your original allocation times sizeof (char) instead of sizeof (char *), makes it no wonder there can be segfaults where you attempt to iterate over your list of pointers in temp. Better:
for (int i = 0; i < index; ++i)
{
for (int j = i + 1; j < index; ++j)
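(The same applies to your k loop as well.) Additionally, since you have allocated each temp[index], when you shuffle pointers with temp[k] = temp[k+1]; the first assignment overwrites the pointer to the duplicate in temp[j], causing a memory leak for every duplicate removed. Only that one pointer is lost (every later pointer has just been copied one slot down), so free temp[j] once before the shift begins. Note too that temp[k+1] reads one element past the end of the list when k reaches the last index.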
While you are updating total_words--, there still to this point has never been a validation that index == total_words, and in the event they are not, you can have no confidence in total_words or that you won't segfault attempting to iterate over uninitialized pointers as the result.
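A minimal sketch of removing duplicates without leaking, assuming every element of the array was allocated with malloc. The names remove_at and remove_duplicates are illustrative and are not part of the question's code:

```c
#include <stdlib.h>
#include <string.h>

/* Remove the string at position k from words[0..*n-1], freeing it first.
 * remove_at is a hypothetical helper. */
static void remove_at(char **words, int *n, int k)
{
    free(words[k]);                     /* release the block before its address is lost */
    for (int i = k; i < *n - 1; i++)    /* stop early so words[i + 1] stays in bounds */
        words[i] = words[i + 1];
    (*n)--;
}

/* Drop every later duplicate of each word and return the new count. */
int remove_duplicates(char **words, int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; ) {
            if (strcmp(words[i], words[j]) == 0) {
                remove_at(words, &n, j);    /* slot j now holds the next candidate */
            } else {
                j++;
            }
        }
    }
    return n;
}
```

The count is passed by pointer to remove_at so the loop bounds shrink as entries are removed, which keeps the comparison from ever touching a slot that no longer holds a valid pointer.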
The rest appears workable, but after the changes above are made, you should ensure that no additional changes are needed. Look things over and let me know if you need additional help. (And with an MCVE, I'm happy to help further.)
Additional Problems
I apologize for the delay, real-world called -- and this took a lot longer than anticipated, because what you have is an awkward slow-motion logical train-wreck. First and foremost, while there is nothing wrong with reading an entire text file into a buffer with fread -- the buffer is NOT nul-terminated and therefore cannot be used with any functions expecting a string. Yes, strtok, strcpy or any string function will read past the end of word_data looking for the nul-terminating character (well out into memory you don't own), which can result in a SegFault.
Your various scattered +1 tacked onto your malloc allocations now make a little more sense, as it appears you were looking for where you needed to add an additional character to make sure you could nul-terminate word_data, but couldn't quite figure out where it went. (don't worry, I straightened that out for you, but it is a big hint that you are probably going about this in the wrong way -- reading with POSIX getline or fgets is probably a better approach than the file-at-once for this type of text processing)
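As a rough sketch of the line-at-a-time alternative, the following counts tokens with fgets and strtok. The function name count_tokens and the LINE_MAX_LEN limit are assumptions made for illustration; a line longer than the buffer is read in pieces, which could split a word, so treat this as a starting point rather than a drop-in replacement:

```c
#include <stdio.h>
#include <string.h>

#define LINE_MAX_LEN 1024   /* assumed upper bound on line length */

/* Count delimiter-separated tokens in fp, one line at a time.
 * count_tokens is a hypothetical helper. */
size_t count_tokens(FILE *fp, const char *delim)
{
    char line[LINE_MAX_LEN];
    size_t count = 0;

    /* fgets always nul-terminates what it stores, so strtok is safe here. */
    while (fgets(line, sizeof line, fp) != NULL) {
        for (char *tok = strtok(line, delim); tok != NULL;
             tok = strtok(NULL, delim)) {
            count++;
        }
    }
    return count;
}
```

Because each line arrives already nul-terminated, no extra byte has to be reserved by hand and no whole-file buffer needs to be sized in advance.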
That is literally just the tip of the iceberg of the problems in your code. As hinted at earlier, in tokenize, you failed to validate that index equals total_words. This ends up being important given your choice of delim, which includes the ASCII apostrophe (or single-quote). This causes your index to exceed the word_count any time a possessive or contraction is encountered in the buffer (e.g. "can't" is split into "can" and "t", "Peter's" is split into "Peter" and "s", etc.). You will have to decide how you want to resolve this; I have simply removed the single quote for now.
Your logic in both tokenize and count_words was difficult to follow, and just wrong in some aspects, and your return type (void) for read_file provided absolutely no way to indicate success (or failure) within. Always choose a return type that provides meaningful information from which you can determine whether a critical function has succeeded or failed (reading your data qualifies as critical).
If it provides a return -- use it. This applies to all functions that can fail (including functions like fseek)
Returning 0 from tokenize discards the number of words (allocated structs) in dictionary, leaving you unable to properly free the information and leaving you to guess at some number to display (e.g. for (int i = 0; i < 333; ++i) in main()). You need to track the number of dictionary structs and member word that are allocated in tokenize (keep an index, say dindex). Then returning dindex to main() (assigned to hello in your code) provides the information you need to iterate over the structs in main() to output your information, as well as to free each allocated word before freeing the pointers.
If you don't have an accurate count of the number of allocated dictionary structs back in main(), you have failed in the two responsibilities you have regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed. If you don't know how many blocks there are, then you haven't done (1) and can't do (2).
This is a nit about style, and while not an error, the standard coding style for C avoids the use of Initialcaps, camelCase or MixedCase variable names in favor of all lower-case while reserving upper-case names for use with macros and constants. It is a matter of style -- so it is completely up to you, but failing to follow it can lead to the wrong first impression in some circles.
Rather than carry on for another handful of paragraphs, I've reworked your example for you and added a few comments inline. Go through it; I haven't punishingly tested it for all corner-cases, but it should be a sound base to build from. You will note in going through it that your count_words and tokenize have been simplified. Try to understand why what was done was done, and ask if you have any questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
typedef struct{
char *word;
int count;
} dictionary_t;
char *read_file (FILE *file, char **words, size_t *length)
{
size_t size = *length = 0;
if (fseek (file, 0, SEEK_END) == -1) {
perror ("fseek SEEK_END");
return NULL;
}
size = (size_t)ftell (file);
if (fseek (file, 0, SEEK_SET) == -1) {
perror ("fseek SEEK_SET");
return NULL;
}
/* +1 needed to nul-terminate buffer to pass to strtok */
if (!(*words = malloc (size + 1))) {
perror ("malloc - size");
return NULL;
}
if (fread (*words, 1, size, file) != size) {
perror ("fread words");
free (*words);
return NULL;
}
*length = size;
(*words)[*length] = 0; /* nul-terminate buffer - critical */
return *words;
}
int tokenize (dictionary_t **dictionary, char *words, int total_words)
{
// char *delim = " .,?!:;/\"\'\n\t"; /* don't split on apostrophes */
char *delim = " .,?!:;/\"\n\t";
char **temp = malloc (sizeof *temp * total_words);
char *token = strtok(words, delim);
int index = 0, dindex = 0;
if (!temp) {
perror ("malloc temp");
return -1;
}
if (!(*dictionary = malloc (sizeof **dictionary * total_words))) {
perror ("malloc - dictionary");
return -1;
}
while (token != NULL && index < total_words) /* never write past the end of temp */
{
if (!(temp[index] = malloc (strlen (token) + 1))) {
perror ("malloc - temp[index]");
exit (EXIT_FAILURE);
}
strcpy(temp[index++], token);
token = strtok (NULL, delim);
}
if (total_words != index) { /* validate total_words = index */
fprintf (stderr, "error: total_words != index (%d != %d)\n",
total_words, index);
/* handle error */
}
for (int i = 0; i < index; i++) { /* only index entries of temp were filled */
int found = 0, j = 0;
for (; j < dindex; j++)
if (strcmp((*dictionary)[j].word, temp[i]) == 0) {
found = 1;
break;
}
if (!found) {
if (!((*dictionary)[dindex].word = malloc (strlen (temp[i]) + 1))) {
perror ("malloc (*dictionary)[dindex].word");
exit (EXIT_FAILURE);
}
strcpy ((*dictionary)[dindex].word, temp[i]);
(*dictionary)[dindex++].count = 1;
}
else
(*dictionary)[j].count++;
}
for (int i = 0; i < index; i++) /* free only the entries that were allocated */
free (temp[i]); /* you must free storage for words */
free (temp); /* before freeing pointers */
return dindex;
}
int count_words (char *words, size_t length)
{
int count = 0;
char previous_char = ' ';
while (length--) {
if (isspace (previous_char) && !isspace (*words))
count++;
previous_char = *words++;
}
return count;
}
int main (int argc, char **argv)
{
char *word_data = NULL;
int word_count, hello;
size_t length = 0;
dictionary_t *dictionary = NULL;
FILE *input = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!input) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
if (!read_file (input, &word_data, &length)) {
fprintf (stderr, "error: file_read failed.\n");
return 1;
}
if (input != stdin) fclose (input); /* close file if not stdin */
word_count = count_words (word_data, length);
printf ("wordct: %d\n", word_count);
/* number of dictionary words returned in hello */
if ((hello = tokenize (&dictionary, word_data, word_count)) <= 0) {
fprintf (stderr, "error: no words or tokenize failed.\n");
return 1;
}
for (int i = 0; i < hello; ++i) {
printf("%-16s : %d\n", dictionary[i].word, dictionary[i].count);
free (dictionary[i].word); /* you must free word storage */
}
free (dictionary); /* free pointers */
free (word_data); /* free buffer */
return 0;
}
Let me know if you have further questions.
There are a few things that you need to do to make your code work:
Fix the memory allocation of temp by replacing sizeof(char) with sizeof(char *) like so:
char **temp = malloc(sizeof(char *) * strlen(words) + 1);
Fix the memory allocation of dictionary by replacing sizeof(Dictionary) with sizeof(Dictionary *):
*dictionary = (Dictionary*)malloc(sizeof(Dictionary *) * (*total_words));
Pass the address of address of word_count when calling tokenize:
int hello = tokenize(&dictionary, word_data, &word_count);
Replace all occurrences of total_words in tokenize function with (*total_words). In the tokenize function signature, you can replace int total_words with int *total_words.
You should also replace the hard-coded value of 333 in your for loop in the main function with word_count.
After you make these changes, your code should work as expected. I was able to run it successfully with these changes.

Splitting a string to remove everything after a delimiter

I am trying to parse the phone number out of a sip uri, or if the string is just a number, returning that. Basically, I want to chop off the # and anything after it if it exists.
I wrote a small function using strtok() but the function always returns NULL.
Can anyone tell me what I'm doing wrong here?
char* GetPhoneNumber(const char* sCallee) {
char* buf = (char*)malloc(strlen(sCallee) + 1);
strcpy(buf, sCallee);
char *p = strtok (buf, "#");
char *q = strtok (p, ":");
if (buf) {
free(buf);
}
return q;
}
int main() {
const char* raw_uri = "2109999999#10.0.0.1";
char* number = GetPhoneNumber(raw_uri);
if (number == NULL) {
printf("I am screwed! %s comes out null!", raw_uri);
}
char* second = GetPhoneNumber("2109999999");
if (second == NULL) {
printf("This does not work either.");
}
}
edit
This
If it's returning NULL because the while loops ends when q is NULL. So I
assume that the q = strtok (NULL, ":"); also returns NULL and that's why it
leaves the loop.
makes no sense anymore, since you've edited your question and removed the code
in question.
end edit
Regardless, you are using it wrong, though.
strtok returns a pointer to the original string plus an offset that marks the
beginning of the next token. The pointer is at an offset of buf. So when you
do free(buf), you are making the pointers returned by strtok also invalid.
You should also check first if malloc returns NULL and then try to parse
it. Checking of malloc returning NULL after the parsing is wrong. Also you
would need to make a copy of the value you are returning.
char* GetPhoneNumber(const char* sCallee) {
char* buf = malloc(strlen(sCallee) + 1);
if(buf == NULL)
return NULL;
strcpy(buf, sCallee);
char *p = strtok (buf, "#");
if(p == NULL)
{
// no token found
free(buf);
return NULL;
}
char *q = strtok (p, ":"); // makes no sense after your edit
// I don't see any colons in your input
// but I leave this to show you how
// it would be done if colons were present
if(q == NULL)
{
// no token found
free(buf);
return NULL;
}
char *copy = malloc(strlen(q) + 1);
if(copy == NULL)
{
free(buf);
return NULL;
}
strcpy(copy, q);
free(buf);
return copy;
}
Also when you call this function, you have to remember to free the pointer
returned by GetPhoneNumber.
edit2
To be honest, I don't see why you even use strtok if the number comes before
#. You can use strchr instead:
char* GetPhoneNumber(const char* sCallee) {
if(sCallee == NULL)
return NULL;
char *p = strchr(sCallee, '#');
if(p == NULL)
return NULL; // wrong format
char *q = calloc(1, p - sCallee + 1);
if(q == NULL)
return NULL;
strncpy(q, sCallee, p - sCallee);
// q already \0-terminated because of calloc
return q;
}
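The strchr version above returns NULL when there is no '#', whereas the question also wants a bare number returned unchanged. A small sketch that covers both cases with strcspn follows; the name copy_before_hash is hypothetical, and the caller is assumed to free the result:

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of s up to, but not including, the first '#'.
 * If s contains no '#', the whole string is copied. The caller frees the
 * result. copy_before_hash is a hypothetical name. */
char *copy_before_hash(const char *s)
{
    if (s == NULL)
        return NULL;

    size_t len = strcspn(s, "#");   /* length of the run before '#' or the end */
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, s, len);
    out[len] = '\0';                /* terminate explicitly; malloc does not zero */
    return out;
}
```

strcspn stops at either the delimiter or the terminating nul, which is what lets one code path serve both "2109999999#10.0.0.1" and "2109999999".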
Your code works fine, you just cannot free(buf) before the return or you release the memory holding number and second leading to Undefined Behavior. You further need to validate each step. Suggest something like:
(Note: updated to protect against a leading '#' in sCallee or a leading ':' in p resulting in NULL being returned)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *GetPhoneNumber(const char* sCallee)
{
if (*sCallee == '#') { /* protect against leading '#' */
fprintf (stderr, "invalid sCallee - leading '#'\n");
return NULL;
}
char* buf = malloc (strlen(sCallee) + 1);
if (!buf) { /* if you allocate/validate */
perror ("malloc - buf");
return NULL;
}
strcpy(buf, sCallee);
char *p = strtok (buf, "#"); /* get first token with '#' */
if (!p) { /* validate */
fprintf (stderr, "error: strtok with '#' failed.\n");
free (buf); /* don't leak buf on failure */
return NULL;
}
if (*p == ':') { /* protect against leading ':' */
fprintf (stderr, "invalid p - leading ':'\n");
free (buf);
return NULL;
}
char *q = strtok (p, ":"); /* get first token with ':' */
// free(buf);
return q;
}
int main () {
const char* raw_uri = "2109999999:abc#10.0.0.1";
char* number = GetPhoneNumber(raw_uri);
if (number == NULL) {
printf("I am screwed! %s comes out null!\n", raw_uri);
}
else {
printf ("number: %s\n", number);
free (number);
}
char* second = GetPhoneNumber("2109999999");
if (second == NULL) {
printf("This does not work either.\n");
}
else {
printf ("second: %s\n", second);
free (second);
}
}
(note: There is no need to cast the return of malloc, it is unnecessary. See: Do I cast the result of malloc?)
Example Use/Output
$ ./bin/strtokpnum
number: 2109999999
second: 2109999999
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/strtokpnum
==24739== Memcheck, a memory error detector
==24739== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24739== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==24739== Command: ./bin/strtokpnum
==24739==
number: 2109999999
second: 2109999999
==24739==
==24739== HEAP SUMMARY:
==24739== in use at exit: 0 bytes in 0 blocks
==24739== total heap usage: 2 allocs, 2 frees, 35 bytes allocated
==24739==
==24739== All heap blocks were freed -- no leaks are possible
==24739==
==24739== For counts of detected and suppressed errors, rerun with: -v
==24739== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Protect Against All Corner Cases
Though not part of the question, there are possible corner cases such as multiple leading '#' and multiple leading ':' in the secondary string. If you are going to protect against the possibility of having multiple leading '#' delimiters or multiple leading ':' in the secondary string, then simply use the multiple allocation method. Adding any additional checks and you might as well just walk a pointer down sCallee.
char *gettok (const char *s, const char *d1, const char *d2)
{
char *buf = malloc (strlen (s) + 1),
*p = NULL,
*q = NULL,
*s2 = NULL;
if (!buf) { /* validate allocation */
perror ("malloc - buf");
return NULL;
}
strcpy (buf, s); /* copy s to buf */
if (!(p = strtok (buf, d1))) { /* if token on d1 fails */
free (buf); /* free buf */
return NULL;
}
if (!(q = strtok (p, d2))) { /* if token on d2 fails */
free (buf); /* free buf */
return NULL;
}
/* allocate/validate return */
if (!(s2 = malloc (strlen (q) + 1))) {
perror ("malloc - s2");
free (buf); /* don't leak buf on failure */
return NULL;
}
strcpy (s2, q); /* copy token */
free (buf); /* free buf */
return s2; /* return token */
}
In that case you would simply call
gettok ("#####:::::2109999999:abc#10.0.0.1", "#", ":");
and you are protected.
(and if you are getting strings like that from your sip uri, fix that process)
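For comparison, walking a pointer down the string can be sketched with strspn and strcspn, which avoids the intermediate copy that strtok needs. The name first_field is hypothetical, and note one deliberate simplification: both delimiter sets are treated as a single set, so this is not identical to gettok for every input:

```c
#include <stdlib.h>
#include <string.h>

/* Copy the first run of characters in s that contains none of the
 * characters in delims, skipping any leading delimiters. Returns NULL if
 * there is no such run or allocation fails; the caller frees the result.
 * first_field is a hypothetical name. */
char *first_field(const char *s, const char *delims)
{
    s += strspn(s, delims);             /* skip leading delimiters */
    size_t len = strcspn(s, delims);    /* length up to the next delimiter */
    if (len == 0)
        return NULL;

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, s, len);
    out[len] = '\0';
    return out;
}
```

Since the input is only read, it can stay const and there is a single allocation to validate and free.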

String is truncated after allocating 2D array (Edited)

I am dynamically allocating the 2D array like this:
char ** inputs;
inputs = (char **) malloc(4 * sizeof(char));
After doing this I started having the problem with the string. I printed the string before and after allocating the 2D-array:
printf("%s\n", str);
char ** inputs;
inputs = (char **) malloc(4 * sizeof(char));
printf("%s\n", str);
But I get strange output:
before: input aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa with len 34
after: input aaaaaaaaaaaaaaaaaaaaaaaaaaaa with len 29
Why the length is changed? I've searched through stackoverflow and other websites but couldn't find reasonable answer for that.
Here is my all function call:
int main(int argc, char const *argv[])
{
/* code */
mainProcess();
printf("\nEnd of the program\n");
return 0;
}
// Reading the input from the user
char * getInput(){
printf("Inside of the getInput\n");
char * result;
char * st;
char c;
result = malloc(4 * sizeof(char));
st = malloc(4 * sizeof(char));
// code goes here
printf("$ ");
while(3){
c = fgetc(stdin);
if(c == 10){
break;
}
printf("%c", c);
result[length] = c;
length++;
}
result[length] = '\0';
return result;
}
void mainProcess(){
char * input;
printf("Inside of Main process\n");
input = getInput();
printf("\nthis is input %s with len %d\n", input, strlen(input));
splitInput(input);
printf("\nthis is input %s with len %d\n", input, strlen(input));
}
char ** splitInput(const char * str){
char ** inputs;
inputs = NULL;
printf("inside split\n");
printf("%s\n", str);
inputs = (char **) malloc( sizeof(char));
// free(inputs);
printf("------\n"); // for testing
printf("%s\n", str);
if(!inputs){
printf("Error in initializing the 2D array!\n");
exit(EXIT_FAILURE);
}
return NULL;
}
It is not entirely clear what you are trying to accomplish, but it appears you are attempting to read a line of text with getInput and then separate the input into individual words in splitInput, but are not clear on how to go about doing it. The process of separating a line of text into words is called tokenizing a string. The standard library provides strtok (aptly named), and BSD and glibc additionally provide strsep (primarily useful if you have the possibility of an empty delimited field).
I have explained the difference between a 2D array and your use of a pointer-to-pointer-to-char in the comments above.
To begin, look at getInput. One issue that will give you no end of grief is that c must be type int or you cannot detect EOF. In addition, you can simply pass a pointer to size_t as a parameter and keep count of the characters in result, avoiding the need for strlen to get the length of the returned string. You MUST use a counter anyway to ensure you do not write beyond the end of result to begin with, so you may as well make the count available back in the calling function, e.g.
char *getInput (size_t *n)
{
printf ("Inside of the getInput\n");
char *result = NULL;
int c = 0; /* c must be type 'int' or you cannot detect EOF */
/* validate ALL allocations */
if ((result = malloc (MAXC * sizeof *result)) == NULL) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return result;
}
printf ("$ ");
fflush (stdout); /* output is buffered, flush buffer to show prompt */
while (*n + 1 < MAXC && (c = fgetc (stdin)) != '\n' && c != EOF) {
printf ("%c", c);
result[(*n)++] = c;
}
putchar ('\n'); /* tidy up with newline */
result[*n] = 0;
return result;
}
Next, as indicated above, it appears you want to take the line of text in result and use splitInput to fill a pointer-to-pointer-to-char with the individual words (which you are confusing to be a 2D array). To do that, you must keep in mind that strtok will modify the string it operates on so you must make a copy of str which you pass as const char * to avoid attempting to modify a constant string (and the segfault).
You are confused in how to allocate the pointer-to-pointer-to-char object. First you must allocate space for a sufficient number of pointers, e.g. (with #define MAXW 32) you would need something like:
/* allocate MAXW pointers */
if ((inputs = malloc (MAXW * sizeof *inputs)) == NULL) {
fprintf (stderr, "error: memory exhausted - inputs.\n");
return inputs;
}
Then as you tokenize the input string, you must allocate for each individual word (each themselves an individual string), e.g.
if ((inputs[*n] = malloc ((len + 1) * sizeof *inputs[*n])) == NULL) {
fprintf (stderr, "error: memory exhausted - word %zu.\n", *n);
break;
}
strcpy (inputs[*n], p);
(*n)++;
note: 'n' is a pointer to size_t to make the word count available back in the caller.
To tokenize the input string you can wrap the allocation above in:
for (char *p = strtok (cpy, delim); p; p = strtok (NULL, delim))
{
size_t len = strlen (p);
...
if (*n == MAXW) /* check if limit reached */
break;
}
Throughout your code you should also validate all memory allocations and provide effective returns for each function that allocates to allow the caller to validate whether the called function succeeded or failed.
Putting all the pieces together, you could do something like the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 256 /* constant for maximum characters of user input */
#define MAXW 32 /* constant for maximum words in line */
void mainProcess();
int main (void)
{
mainProcess();
printf ("End of the program\n");
return 0;
}
char *getInput (size_t *n)
{
printf ("Inside of the getInput\n");
char *result = NULL;
int c = 0; /* c must be type 'int' or you cannot detect EOF */
/* validate ALL allocations */
if ((result = malloc (MAXC * sizeof *result)) == NULL) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return result;
}
printf ("$ ");
fflush (stdout); /* output is buffered, flush buffer to show prompt */
while (*n + 1 < MAXC && (c = fgetc (stdin)) != '\n' && c != EOF) {
printf ("%c", c);
result[(*n)++] = c;
}
putchar ('\n'); /* tidy up with newline */
result[*n] = 0;
return result;
}
/* split str into tokens, return pointer to array of char *
* update pointer 'n' to contain number of words
*/
char **splitInput (const char *str, size_t *n)
{
char **inputs = NULL,
*delim = " \t\n", /* split on 'space', 'tab' or 'newline' */
*cpy = strdup (str);
printf ("inside split\n");
printf ("%s\n", str);
/* allocate MAXW pointers */
if ((inputs = malloc (MAXW * sizeof *inputs)) == NULL) {
fprintf (stderr, "error: memory exhausted - inputs.\n");
free (cpy); /* don't leak the copy on failure */
return inputs;
}
/* split cpy into tokens (words) max of MAXW words allowed */
for (char *p = strtok (cpy, delim); p; p = strtok (NULL, delim))
{
size_t len = strlen (p);
if ((inputs[*n] = malloc ((len + 1) * sizeof *inputs[*n])) == NULL) {
fprintf (stderr, "error: memory exhausted - word %zu.\n", *n);
break;
}
strcpy (inputs[*n], p);
(*n)++;
if (*n == MAXW) /* check if limit reached */
break;
}
free (cpy); /* free copy */
return inputs;
}
void mainProcess()
{
char *input = NULL,
**words = NULL;
size_t len = 0, nwords = 0;
printf ("Inside of Main process\n\n");
input = getInput (&len);
if (!input || !*input) {
fprintf (stderr, "error: input is empty or NULL.\n");
return;
}
printf ("this is input '%s' with len: %zu (before split)\n", input, len);
words = splitInput (input, &nwords);
printf ("this is input '%s' with len: %zu (after split)\n", input, len);
free (input); /* done with input, free it! */
printf ("the words in input are:\n");
for (size_t i = 0; i < nwords; i++) {
printf (" word[%2zu]: '%s'\n", i, words[i]);
free (words[i]); /* free each word */
}
free (words); /* free pointers */
putchar ('\n'); /* tidy up with newline */
}
Example Use/Output
$ ./bin/mainprocess
Inside of Main process
Inside of the getInput
$ my dog has fleas
my dog has fleas
this is input 'my dog has fleas' with len: 16 (before split)
inside split
my dog has fleas
this is input 'my dog has fleas' with len: 16 (after split)
the words in input are:
word[ 0]: 'my'
word[ 1]: 'dog'
word[ 2]: 'has'
word[ 3]: 'fleas'
End of the program
Memory Error Check
In any code you write that dynamically allocates memory, you need to run your code through a memory/error checking program. On Linux, valgrind is the normal choice. Simply run your code through it, e.g.
$ valgrind ./bin/mainprocess
==15900== Memcheck, a memory error detector
==15900== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==15900== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==15900== Command: ./bin/mainprocess
==15900==
Inside of Main process
Inside of the getInput
$ my dog has fleas
my dog has fleas
this is input 'my dog has fleas' with len: 16 (before split)
inside split
my dog has fleas
this is input 'my dog has fleas' with len: 16 (after split)
the words in input are:
word[ 0]: 'my'
word[ 1]: 'dog'
word[ 2]: 'has'
word[ 3]: 'fleas'
End of the program
==15900==
==15900== HEAP SUMMARY:
==15900== in use at exit: 0 bytes in 0 blocks
==15900== total heap usage: 7 allocs, 7 frees, 546 bytes allocated
==15900==
==15900== All heap blocks were freed -- no leaks are possible
==15900==
==15900== For counts of detected and suppressed errors, rerun with: -v
==15900== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always verify you have freed any memory you allocate, and that there are no memory errors.
Look things over and let me know if you have any questions. If I guess wrong about what you intended, well that's where an MCVE helps :)
This code compiles (gcc -Wall) without warnings and does not change the length of the input.
It also tries to stress the need for allocating enough space and/or not to write beyond allocated memory.
Note for example the
malloc((MaxInputLength+1) * sizeof(char))
while(length<MaxInputLength)
inputs[i]=malloc((MaxLengthOfSplitString+1) * sizeof(char));
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// the length which was used in your MCVE, probably accidentally
#define MaxInputLength 3 // you will probably want to increase this
#define MaxLengthOfSplitString 1 // and this
#define MaxNumberOfSplitStrings 3 // and this
// Reading the input from the user
char * getInput(){
printf("Inside of the getInput\n");
char * result;
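int c; // fgetc() returns int so that EOF can be detected
int length=0;
result = malloc((MaxInputLength+1) * sizeof(char));
// code goes here
printf("$ ");
while(length<MaxInputLength){
c = fgetc(stdin);
if(c == '\n' || c == EOF){ // stop at end of line or end of input
break;
}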
printf("%c", c);
result[length] = c;
length++;
}
result[length] = '\0';
return result;
}
char ** splitInput(const char * str){
char ** inputs;
inputs = NULL;
printf("inside split\n");
printf("%s\n", str);
inputs = malloc(MaxNumberOfSplitStrings * sizeof(char*));
if(!inputs){ // check before the loop below dereferences inputs
printf("Error in initializing the 2D array!\n");
exit(EXIT_FAILURE);
}
{
int i;
for (i=0; i< MaxNumberOfSplitStrings; i++)
{
inputs[i]=malloc((MaxLengthOfSplitString+1) * sizeof(char));
}
// Now you have an array of MaxNumberOfSplitStrings char*.
// Each of them points to a buffer which can hold a zero-terminated string
// with at most MaxLengthOfSplitString chars, not counting the '\0'.
}
// free(inputs);
printf("------\n"); // for testing
printf("%s\n", str);
return NULL; // note: inputs is neither returned nor freed here, so it leaks
}
void mainProcess(){
char * input;
printf("Inside of Main process\n");
input = getInput();
printf("\nthis is input %s with len %zu\n", input, strlen(input)); // %zu matches size_t
splitInput(input);
printf("\nthis is input %s with len %zu\n", input, strlen(input));
free(input); // release the buffer allocated in getInput()
}
int main(int argc, char const *argv[])
{
/* code */
mainProcess();
printf("\nEnd of the program\n");
return 0;
}
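The splitInput() above only allocates the buffers; it never fills them and returns NULL, so the caller can neither use nor free them. As a minimal sketch of how the split could respect those limits, the helper below copies words into fixed-size buffers, truncating long words and ignoring extra ones so that nothing is written past an allocation. The names split_bounded, free_split, MAX_WORDS and MAX_WORD_LEN are hypothetical and not part of the original code, and the limits are arbitrary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limits, standing in for the Max... macros above. */
#define MAX_WORDS    3
#define MAX_WORD_LEN 8

/* Free an array returned by split_bounded(); unused slots are NULL. */
void free_split(char **words)
{
    if (words == NULL)
        return;
    for (size_t i = 0; i < MAX_WORDS; i++)
        free(words[i]);
    free(words);
}

/* Split str on spaces into at most MAX_WORDS buffers of MAX_WORD_LEN + 1
 * bytes each. Longer words are truncated and extra words are ignored.
 * Stores the number of words in *count; returns NULL on allocation failure. */
char **split_bounded(const char *str, size_t *count)
{
    assert(str != NULL && count != NULL);
    *count = 0;

    char **words = malloc(MAX_WORDS * sizeof *words);
    if (words == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_WORDS; i++)
        words[i] = NULL;

    const char *p = str + strspn(str, " ");        /* skip leading spaces */
    while (*p != '\0' && *count < MAX_WORDS) {
        size_t len  = strcspn(p, " ");             /* length of this word */
        size_t keep = len < MAX_WORD_LEN ? len : MAX_WORD_LEN;

        words[*count] = malloc(MAX_WORD_LEN + 1);
        if (words[*count] == NULL) {
            free_split(words);
            *count = 0;
            return NULL;
        }
        memcpy(words[*count], p, keep);            /* never more than the buffer holds */
        words[*count][keep] = '\0';
        (*count)++;

        p += len;                                  /* step over the word */
        p += strspn(p, " ");                       /* and the spaces after it */
    }
    return words;
}
```

With these limits, splitting "my dog has fleas" keeps "my", "dog" and "has" and drops "fleas". The caller releases everything with free_split().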

dynamically allocating my 2d array in c

Any hints on how I would dynamically allocate myArray so I can enter any amount of strings and it would store correctly.
int main()
{
char myArray[1][1]; //how to dynamically allocate the memory?
counter = 0;
char *readLine;
char *word;
char *rest;
printf("\n enter: ");
ssize_t buffSize = 0;
getline(&readLine, &buffSize, stdin);//get user input
//tokenize the strings
while(word = strtok_r(readLine, " \n", &rest )) {
strcpy(myArray[counter], word);
counter++;
readLine= rest;
}
//print the elements user has entered
int i =0;
for(i = 0;i<counter;i++){
printf("%s ",myArray[i]);
}
printf("\n");
}
Use realloc like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
char **myArray = NULL;
char *readLine = NULL;
size_t buffSize = 0;
size_t counter = 0;
char *word, *rest, *p;
printf("\n enter: ");
if (getline(&readLine, &buffSize, stdin) == -1) { // EOF or read error
free(readLine);
return 1;
}
p = readLine;
while((word = strtok_r(p, " \n", &rest)) != NULL) {
myArray = realloc(myArray, (counter + 1) * sizeof(*myArray));//check omitted
myArray[counter++] = strdup(word);
p = NULL;
}
free(readLine);
for(size_t i = 0; i < counter; i++){
printf("<%s> ", myArray[i]);
free(myArray[i]);
}
printf("\n");
free(myArray);
}
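The realloc() check marked "check omitted" above could look something like the sketch below. append_word is a hypothetical helper, not part of the answer; the point is only that the result of realloc() goes into a temporary first, so the old array is still valid (and can still be freed) if the call fails.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append a copy of word to *arr, which currently holds
 * *count pointers. Returns 0 on success, -1 on allocation failure. On failure
 * *arr and *count are unchanged, so the caller can still free what it has. */
int append_word(char ***arr, size_t *count, const char *word)
{
    assert(arr != NULL && count != NULL && word != NULL);

    char *copy = strdup(word);
    if (copy == NULL)
        return -1;

    /* Assign to a temporary: if realloc() fails, *arr is still valid. */
    char **tmp = realloc(*arr, (*count + 1) * sizeof **arr);
    if (tmp == NULL) {
        free(copy);
        return -1;
    }
    tmp[*count] = copy;
    *arr = tmp;
    (*count)++;
    return 0;
}
```

Inside the tokenizing loop this would replace the two lines that call realloc() and strdup(), with the loop stopping as soon as append_word() returns -1.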
Here is one way you might approach this problem. If you are going to dynamically allocate storage for an unknown number of words of unknown length, you can start with a buffSize that seems reasonable, allocate that much space for the readLine buffer, and grow this memory as needed. Similarly, you can choose a reasonable size for the number of words expected, and grow word storage as needed.
In the program below, myArray is a pointer to pointer to char. arrSize is initialized so that pointers to 100 words may be stored in myArray. First, readLine is filled with an input line. If more space than provided by the initial allocation is required, the memory is realloced to be twice as large. After reading in the line, the memory is again realloced to trim it to the size of the line (including space for the '\0').
strtok_r() breaks the line into tokens. The pointer store is used to hold the address of the memory allocated to hold the word, and then word is copied into this memory using strcpy(). If more space is needed to store words, the memory pointed to by myArray is realloced and doubled in size. After all words have been stored, myArray is realloced a final time to trim it to its minimum size.
When doing this much allocation, it is nice to write functions which allocate memory and check for errors, so that you don't have to do this manually every allocation. xmalloc() takes a size_t argument and an error message string. If an allocation error occurs, the message is printed to stderr and the program exits. Otherwise, a pointer to the allocated memory is returned. Similarly, xrealloc() takes a pointer to the memory to be reallocated, a size_t argument, and an error message string. Note here that realloc() can return a NULL pointer if there is an allocation error, so you need to assign the return value to a temporary pointer to avoid a memory leak. Moving realloc() into a separate function helps protect you from this issue. If you assigned the return value of realloc() directly to readLine, for example, and if there were an allocation error, readLine would no longer point to the previously allocated memory, which would be lost. This function prints the error message and exits if there is an error.
Also, you need to free all of these memory allocations, so this is done before the program exits.
This method is more efficient than calling realloc() for every added character in the line, and for every added pointer to a word in myArray. With generous starting values for buffSize and arrSize, you may only need the initial allocations, which are then trimmed to final size. Of course, there are still the individual allocations for each of the individual words. You could also use strdup() for this part, but you would still need to remember to free those allocations as well. Still, not nearly as many allocations will be needed as when readLine and myArray are grown one char or one pointer at a time.
#define _POSIX_C_SOURCE 1
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void * xmalloc(size_t size, char *msg);
void * xrealloc(void *ptr, size_t size, char *msg);
int main(void)
{
char **myArray;
size_t buffSize = 1000;
size_t arrSize = 100;
size_t charIndex = 0;
size_t wordIndex = 0;
char *readLine;
char *inLine;
char *word;
char *rest;
char *store;
/* Initial allocations */
readLine = xmalloc(buffSize, "Allocation error: readLine");
myArray = xmalloc(sizeof(*myArray) * arrSize,
"Allocation error: myArray\n");
/* Get user input */
printf("\n enter a line of input:\n");
int c;
while ((c = getchar()) != '\n' && c != EOF) {
if (charIndex + 1 >= buffSize) { // keep room for '\0'
buffSize *= 2;
readLine = xrealloc(readLine, buffSize,
"Error in readLine realloc()\n");
}
readLine[charIndex++] = c;
}
readLine[charIndex] = '\0'; // add '\0' terminator
/* If you must, trim the allocation now */
readLine = xrealloc(readLine, strlen(readLine) + 1,
"Error in readLine trim\n");
/* Tokenize readLine */
inLine = readLine;
while((word = strtok_r(inLine, " \n", &rest)) != NULL) {
store = xmalloc(strlen(word) + 1, "Error in word allocation\n");
strcpy(store, word);
if (wordIndex >= arrSize) {
arrSize *= 2;
myArray = xrealloc(myArray, sizeof(*myArray) * arrSize,
"Error in myArray realloc()\n");
}
myArray[wordIndex] = store;
wordIndex++;
inLine = NULL;
}
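/* You can trim this allocation, too (skipped when no words were read,
since realloc() with size 0 may return NULL) */
if (wordIndex > 0) {
myArray = xrealloc(myArray, sizeof(*myArray) * wordIndex,
"Error in myArray trim\n");
}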
/* Print words */
for(size_t i = 0; i < wordIndex; i++){
printf("%s ",myArray[i]);
}
printf("\n");
/* Free allocated memory */
for (size_t i = 0; i < wordIndex; i++) {
free(myArray[i]);
}
free(myArray);
free(readLine);
return 0;
}
void * xmalloc(size_t size, char *msg)
{
void *temp = malloc(size);
if (temp == NULL) {
fprintf(stderr, "%s\n", msg);
exit(EXIT_FAILURE);
}
return temp;
}
void * xrealloc(void *ptr, size_t size, char *msg)
{
void *temp = realloc(ptr, size);
if (temp == NULL) {
fprintf(stderr, "%s\n", msg);
exit(EXIT_FAILURE);
}
return temp;
}
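To put a rough number on the efficiency point made above, here is a small sketch that counts how many realloc() calls each growth strategy needs. count_reallocs is a hypothetical function written for this illustration only; it appends dummy characters rather than reading real input.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: append n chars to a buffer that starts with room for
 * start_cap chars, growing by doubling when grow_by_one is 0 and by a single
 * char otherwise. Returns the number of realloc() calls, or -1 on failure. */
long count_reallocs(size_t n, size_t start_cap, int grow_by_one)
{
    assert(start_cap > 0);

    size_t cap = start_cap;
    size_t len = 0;
    long calls = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (len == cap) {                          /* buffer is full, grow it */
            size_t new_cap = grow_by_one ? cap + 1 : cap * 2;
            char *tmp = realloc(buf, new_cap);     /* temporary keeps buf valid */
            if (tmp == NULL) {
                free(buf);
                return -1;
            }
            buf = tmp;
            cap = new_cap;
            calls++;
        }
        buf[len++] = 'x';
    }
    free(buf);
    return calls;
}
```

With a starting size of 1000, count_reallocs(5000, 1000, 0) returns 3 (1000 to 2000 to 4000 to 8000), while count_reallocs(5000, 1000, 1) returns 4000. For 500 characters, the doubling version needs no realloc() at all.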
I suggest you first scan the data and then call malloc() with the appropriate size.
Otherwise, you can use realloc() to reallocate memory as you go through the data.
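A minimal sketch of the scan-first approach, assuming the whole line is already in memory: one pass counts the words without modifying the line, then a second pass allocates exactly that many pointers and copies each word. count_tokens and split_exact are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* First pass: count the tokens in line without modifying it. */
size_t count_tokens(const char *line, const char *delims)
{
    assert(line != NULL && delims != NULL);

    size_t n = 0;
    const char *p = line + strspn(line, delims);   /* skip leading delimiters */
    while (*p != '\0') {
        n++;
        p += strcspn(p, delims);                   /* step over the token */
        p += strspn(p, delims);                    /* and the delimiters after it */
    }
    return n;
}

/* Second pass: allocate exactly *count pointers, then copy each token.
 * Returns NULL when there are no tokens or an allocation fails. */
char **split_exact(const char *line, const char *delims, size_t *count)
{
    assert(count != NULL);

    *count = count_tokens(line, delims);
    if (*count == 0)
        return NULL;

    char **words = malloc(*count * sizeof *words);
    if (words == NULL) {
        *count = 0;
        return NULL;
    }

    const char *p = line + strspn(line, delims);
    for (size_t i = 0; i < *count; i++) {
        size_t len = strcspn(p, delims);
        words[i] = malloc(len + 1);
        if (words[i] == NULL) {                    /* undo the copies made so far */
            while (i > 0)
                free(words[--i]);
            free(words);
            *count = 0;
            return NULL;
        }
        memcpy(words[i], p, len);
        words[i][len] = '\0';
        p += len;
        p += strspn(p, delims);
    }
    return words;
}
```

The caller frees each words[i] and then words. The trade-off is that the line is scanned twice, in exchange for a single array allocation of exactly the right size.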

Resources