Loading strings divided by delimiter to array - c

I have an array that needs to be filled with values from a string looking like this:
value0;value1;value2;value3;\n
I tried using strtok() but couldn't figure out how to load more than 2 elements into the array.
Desirable output is something like
arrayValues[0] = value0;
arrayValues[1] = value1;
etc.

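You need to use strtok() and realloc(). Both are a bit difficult to use:
char input[] = "value0;value1;value2;value3\n";
char **arrayValues = NULL;
int N = 0;
int i;
char *token = strtok(input, ";\n"); /* '\n' added so the last token has no trailing newline */
while(token != NULL)
{
    /* grow through a temporary so the old block is not lost on failure */
    char **tmp = realloc(arrayValues, (N + 1) * sizeof(char *));
    if(!tmp)
    {
        /* out of memory - very unlikely to happen */
        perror("realloc");
        exit(EXIT_FAILURE);
    }
    arrayValues = tmp;
    arrayValues[N] = strdup(token); /* strdup() is POSIX; check for NULL in real code */
    N++;
    token = strtok(NULL, ";\n");
}
/* print out to check */
for(i=0;i<N;i++)
    printf("***%s***\n", arrayValues[i]);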
Note that strtok() overwrites the delimiter ';'. If you want to retain it, as your example output suggests, you'll have to append it to the end of each string, which is fiddly and probably not what you really want.
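To illustrate why retaining the delimiter is fiddly, here is a minimal sketch. It assumes a hypothetical helper named dup_with_suffix (not part of the answer above) that copies a token and appends a single character, so every stored string needs its own slightly larger allocation:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of token with the
 * character suffix appended, or NULL if allocation fails.
 * The caller is responsible for calling free() on the result. */
char *dup_with_suffix (const char *token, char suffix)
{
    size_t len;
    char *out;

    assert (token != NULL);          /* precondition */

    len = strlen (token);
    out = malloc (len + 2);          /* token + suffix + '\0' */
    if (!out)
        return NULL;

    memcpy (out, token, len);        /* copy the token text */
    out[len] = suffix;               /* put the delimiter back */
    out[len + 1] = '\0';             /* nul-terminate */
    return out;
}
```

In the loop above you would store dup_with_suffix(token, ';') instead of strdup(token), and still check the result for NULL.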

In the simplest case, if the string you need to separate will remain in scope while you are using the individual tokens, there is no need to allocate. Simply declare an array of pointers with enough pointers for your tokens, and as you tokenize the string, assign the address of the beginning of each token to the next pointer in the array. That way, the pointers in your array simply point to the place within the original string where each token is found. Example:
#include <stdio.h>
#include <string.h>
#define MAXS 16
int main (void) {
char str[] = "value0;value1;value2;value3;\n",
*array[MAXS] = {NULL}, /* array of pointers */
*p = str, /* pointer to str */
*delim = ";\n"; /* delimiters */
int i, n = 0; /* loop var & index - n */
p = strtok (p, delim); /* get 1st token */
while (n < MAXS && p) { /* check bounds/validate token */
array[n++] = p; /* add pointer to array */
p = strtok (NULL, delim); /* get next token */
}
for (i = 0; i < n; i++) /* output tokens */
printf ("array[%2d] : %s\n", i, array[i]);
return 0;
}
(note: strtok modifies the original string by placing nul-terminating characters (e.g. '\0') in place of the delimiters. If you need to preserve the original string, make a copy before calling strtok)
Note above, you are limited to a fixed number of pointers, so while you are separating the tokens and assigning them to pointers in your array, you need to check the number against your array bounds to prevent writing beyond the end of your array.
Example Use/Output
$ ./bin/parsestrstrtok
array[ 0] : value0
array[ 1] : value1
array[ 2] : value2
array[ 3] : value3
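The earlier note about strtok modifying the original string can be sketched as follows. This is only an illustration under stated assumptions: the helper name split_copy and its parameters are hypothetical. It tokenizes a private copy, so the caller's string (even a string literal) is left untouched:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: tokenizes a private copy of src so that src itself
 * is never modified. Stores up to max token pointers in out and returns
 * the number stored. *copy receives the duplicated buffer that the tokens
 * point into; the caller frees it when done with the tokens.
 * Returns 0 and sets *copy to NULL if allocation fails. */
size_t split_copy (const char *src, const char *delim,
                   char **out, size_t max, char **copy)
{
    size_t n = 0;
    size_t len;
    char *p;

    assert (src && delim && out && copy);   /* preconditions */

    len = strlen (src);
    *copy = malloc (len + 1);
    if (!*copy)
        return 0;
    memcpy (*copy, src, len + 1);           /* includes the '\0' */

    /* strtok writes '\0' into *copy only, never into src */
    for (p = strtok (*copy, delim); p && n < max; p = strtok (NULL, delim))
        out[n++] = p;                       /* points into *copy */

    return n;
}
```

The tokens stay valid only until the copy is freed, so free it after the last use of the array.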
Taking the parsing to the next step, where your original string may not remain in scope during the time your array values are needed, you simply need to allocate storage for each token and copy each token to your newly allocated memory and assign the starting address for each new block to the pointers in your array. That way, even if you pass your array of pointers and the string to a function for parsing, the array values remain available after the function completes until you free the memory you have allocated.
You are still limited to a fixed number of pointers, but your array is now usable wherever required in your program. The additions required for this are minimal. Note, malloc and strcpy are used below and can be replaced by a single call to strdup. However, since strdup is a POSIX function and only became part of standard C with C23, malloc and strcpy are used instead. (but note, strdup does make for a very convenient replacement if your compiler supports it)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXS 16
int main (void) {
char str[] = "value0;value1;value2;value3;\n",
*array[MAXS] = {NULL}, /* array of pointers */
*p = str, /* pointer to str */
*delim = ";\n"; /* delimiters */
int i, n = 0; /* loop var & index - n */
p = strtok (p, delim); /* get 1st token */
while (n < MAXS && p) { /* check bounds/validate token */
/* allocate/validate storage for token */
if (!(array[n] = malloc (strlen (p) + 1))) {
perror ("malloc failed");
exit (EXIT_FAILURE);
}
strcpy (array[n++], p); /* copy token to array */
p = strtok (NULL, delim); /* get next token */
}
for (i = 0; i < n; i++) { /* output tokens */
printf ("array[%2d] : %s\n", i, array[i]);
free (array[i]); /* free memory for tokens */
}
return 0;
}
(output is the same)
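If you want the convenience of strdup without depending on it, the malloc/copy pair used above can be wrapped in a small function. A minimal sketch, assuming a hypothetical name dupstr (chosen to avoid clashing with a library-provided strdup):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup() using only standard C.
 * Returns a newly allocated copy of s, or NULL if malloc fails.
 * The caller frees the result. */
char *dupstr (const char *s)
{
    size_t len;
    char *copy;

    assert (s != NULL);         /* precondition */

    len = strlen (s) + 1;       /* +1 for the nul-terminating character */
    copy = malloc (len);
    if (copy)
        memcpy (copy, s, len);  /* copies the terminator as well */
    return copy;
}
```

With that helper, the allocation and copy in the loop reduce to array[n] = dupstr (p); followed by a NULL check before n is incremented.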
Finally, you can eliminate your dependency on a fixed number of pointers by dynamically allocating the pointers and reallocating the pointers on an as needed basis. You can start with the same number, and then allocate twice the current number of pointers when your current supply is exhausted. It is simply one additional level of allocation before you start parsing, and a requirement to realloc when you have used all the pointers at hand. Example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXS 16
int main (void) {
char str[] = "value0;value1;value2;value3;\n",
**array = NULL, /* pointer to pointer to char */
*p = str, /* pointer to str */
*delim = ";\n"; /* delimiters */
int i, n = 0, /* loop var & index - n */
nptrs = MAXS; /* number allocated pointers */
/* allocate/validate an initial number of pointers for array */
if (!(array = malloc (nptrs * sizeof *array))) {
perror ("malloc pointers failed");
exit (EXIT_FAILURE);
}
p = strtok (p, delim); /* get 1st token */
while (p) { /* validate token */
/* allocate/validate storage for token */
if (!(array[n] = malloc (strlen (p) + 1))) {
perror ("malloc failed");
exit (EXIT_FAILURE);
}
strcpy (array[n++], p); /* copy token to array */
if (n == nptrs) { /* pointer limit reached */
/* realloc 2X number of pointers/validate */
void *tmp = realloc (array, nptrs * 2 * sizeof *array);
if (!tmp) {
perror ("realloc - pointers");
goto memfull; /* don't exit, array has original values */
}
array = tmp; /* assign new block to array */
nptrs *= 2; /* update no. allocated pointers */
}
p = strtok (NULL, delim); /* get next token */
}
memfull:;
for (i = 0; i < n; i++) { /* output tokens */
printf ("array[%2d] : %s\n", i, array[i]);
free (array[i]); /* free memory for tokens */
}
free (array); /* free memory for pointers */
return 0;
}
note: you should validate your memory use with a memory use and error checking program like valgrind on Linux. There are similar tools for every platform. Just run your code through the checker and confirm there are no memory errors and that all memory you have allocated has been properly freed.
Example:
$ valgrind ./bin/parsestrstrtokdbl
==15256== Memcheck, a memory error detector
==15256== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==15256== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==15256== Command: ./bin/parsestrstrtokdbl
==15256==
array[ 0] : value0
array[ 1] : value1
array[ 2] : value2
array[ 3] : value3
==15256==
==15256== HEAP SUMMARY:
==15256== in use at exit: 0 bytes in 0 blocks
==15256== total heap usage: 5 allocs, 5 frees, 156 bytes allocated
==15256==
==15256== All heap blocks were freed -- no leaks are possible
==15256==
==15256== For counts of detected and suppressed errors, rerun with: -v
==15256== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
(note: above you see 5 allocations, 1 for the pointers and 1 for each token. All memory has been freed and there are no memory errors.)
There are probably a dozen more approaches you can take to inch-worm down your string picking out tokens, but this is the general progression of how to expand on the approach using strtok. Let me know if you have any further questions.

You can just use the good old strchr function to hunt for the ';' terminator of each substring, and malloc and realloc for memory allocation.
Note that the input str is modified (reused): each ';' in it is replaced by '\0'.
(If you need str untouched, then allocate another buffer, copy str to it and point the p1 and p2 pointers at the copy.)
The arrayValues holds pointers to the substrings:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int main ()
{
char **arrayValues = malloc(sizeof(char *)); // allocate memory for string pointers
char str[] = "value1;value2;value3\n"; // input
char *p1 = str; // init pointer helpers
char *p2 = str;
int n = 0; // substring counter
// OPTIONAL if you want to get rid of ending '\n'
//--s
size_t len = strlen(str);
if(len>0)
if(str[len-1] == '\n')
str[len-1] = 0;
//--ee
while(p1 != NULL)
{
p1 = strchr(p1,';'); // find ';'
if(p1 != NULL)
{
arrayValues[n] = p2; // beginning of the substring
*p1 = 0; // terminate the substring; get rid of ';'
n++; // count the substrings
char **tmp = realloc( arrayValues, (n+1) * sizeof(char *)); // allocate more memory for next pointer
if(tmp == NULL) { free(arrayValues); return 1; } // on failure the old block is still valid, so free it
arrayValues = tmp;
p2 = p1+1; // move the pointer after the ';'
p1 = p1+1; // we start the search for next ';'
}
else
{
arrayValues[n] = p2; // this is the last (or first) substring
n++;
}
} // while
// Output:
for (int j=0; j<n; j++)
{
printf("%s \n", arrayValues[j]);
}
printf("------");
free(arrayValues);
return 0;
}
Output:
value1
value2
value3
------

Related

Creating an dynamic array, but getting segmentation fault as error

I wanted to create a dynamic array that will contain user input, but I keep getting a segmentation fault after my first input. I know that a segmentation fault is caused by invalid memory access. Is there a way to locate the error in the code?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
int length,i;
int size_arr = 1;
int size_word = 102;
char **arr;
char *word;
arr = malloc(size_arr * sizeof(char*));
word = malloc(size_word *sizeof(char));
if(arr == NULL)
{
fprintf(stderr, "No reallocation");
exit(EXIT_FAILURE);
}
else
{
while(!feof(stdin) && fgets(word, size_word , stdin) != NULL)
{
arr[i] = strndup(word,size_word);
length = strlen(word);
if(length > 102)
{
fprintf(stderr, "Word is too long");
}
if(size_arr-1 == i)
{
size_arr *= 2;
arr = realloc(arr, size_arr * sizeof(char*));
}
i++;
}
}
free(arr);
return 0;
}
Greeting,
Soreseth
You are close, you are just a bit confused on how to handle putting all the pieces together. To begin with, understand there are no arrays involved in your code. When you are building a collection using a pointer-to-pointer (e.g. char **arr), it is a two step process. arr itself is a single-pointer to an allocated block of pointers -- which you expand by one each time to add the next word by calling realloc() to reallocate an additional pointer.
The second step is to allocate storage for each word. You then assign the beginning address for that word's storage to the pointer you have allocated.
So you have one block of pointers which you expand (realloc()) to add a pointer to hold the address for the next word, you then allocate storage for that word, assigning the beginning address to your new pointer and then copy the word to that address.
(note: calling realloc() every iteration to add just one-pointer is inefficient. You solve that by adding another counter (e.g. allocated) that holds the number of pointers allocated, and a counter used, to keep track of the number of pointers you have used. You only reallocate when used == allocated. That way you can, e.g. double the number of pointers available each time -- or whatever growth scheme you choose)
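One possible sketch of the used == allocated bookkeeping described in the note above. The names strlist, strlist_push and strlist_free are hypothetical, and the starting capacity of 2 is an arbitrary assumption:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: 'used' pointers are filled, 'allocated' exist. */
typedef struct {
    char **ptrs;
    size_t used;
    size_t allocated;
} strlist;

/* Append a copy of word. Returns 1 on success, 0 on allocation failure.
 * On failure the list is unchanged and still valid. */
int strlist_push (strlist *l, const char *word)
{
    size_t len;
    char *copy;

    assert (l && word);                       /* preconditions */

    if (l->used == l->allocated) {            /* out of pointers: double */
        size_t newcap = l->allocated ? l->allocated * 2 : 2;
        void *tmp = realloc (l->ptrs, newcap * sizeof *l->ptrs);
        if (!tmp)                             /* old block still intact */
            return 0;
        l->ptrs = tmp;
        l->allocated = newcap;
    }

    len = strlen (word) + 1;                  /* +1 for the terminator */
    copy = malloc (len);
    if (!copy)
        return 0;
    memcpy (copy, word, len);

    l->ptrs[l->used++] = copy;                /* count only after success */
    return 1;
}

/* Free every stored word, then the block of pointers. */
void strlist_free (strlist *l)
{
    size_t i;

    assert (l != NULL);
    for (i = 0; i < l->used; i++)
        free (l->ptrs[i]);
    free (l->ptrs);
    l->ptrs = NULL;
    l->used = l->allocated = 0;
}
```

Start from a zero-initialized list (strlist l = {0};) and call strlist_push once per line read. With doubling, storing 5 words triggers only 3 reallocations instead of 5.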
Also note that strdup() and strndup() are handy, but are not part of the standard C library (they are POSIX). While most compilers will provide them, you may need the right compiler option to ensure they are available.
Let's look at your example, in the simple case, using only functions provided by the standard library. We will keep your reallocate-by-one scheme, and leave you to implement the used == allocated scheme to clean that up later.
When reading lines of data, you won't know how many characters you need to store until the line is read -- so just reuse the same fixed-size buffer to read each line. Then you can trim the '\n' included by fgets() and get the number of characters you need to allocate (+1 for the nul-terminating character). Then simply allocate, assign to your new pointer and copy from the fixed buffer to the new block of storage for the word (or line). A 1K, 2K, etc. fixed buffer is fine.
So let's collect the variables you need for your code, defining a constant for the fixed buffer size, e.g.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (void)
{
char **arr = NULL, /* pointer to block of allocated pointers */
buf[MAXC]; /* fixed buffer for read-input */
int size_arr = 0; /* only 1 counter needed here */
Now let's read a line into buf and start by allocating your pointer:
while (fgets (buf, MAXC, stdin))
{
size_t len;
/* allocate pointer (one each time rather inefficient) */
void *tmp = realloc (arr, (size_arr + 1) * sizeof *arr);
if (!tmp) { /* VALIDATE */
perror ("realloc-arr");
break;
}
arr = tmp; /* assign on success */
(as noted in my comment, you never realloc() using the pointer itself because when (not if) realloc() fails, it will overwrite the pointer address (e.g. the address to your collection of pointers) with NULL.)
So above, you realloc() to the temporary pointer tmp, validate the reallocation succeeded, then assign the newly allocated block of pointers to arr.
Now trim the '\n' from buf and get the number of characters. (where strcspn() allows you to do this all in a single call):
buf[(len = strcspn (buf, "\n"))] = 0; /* trim \n, save len */
Now just allocate storage for len + 1 characters and copy from buf to arr[size_arr].
arr[size_arr] = malloc (len + 1); /* allocate for word */
if (!arr[size_arr]) { /* VALIDATE */
perror ("malloc-arr[i]");
break;
}
memcpy (arr[size_arr], buf, len + 1); /* copy buf to arr[i] */
size_arr += 1; /* increment counter */
}
(note: when reallocating 1 pointer per-iteration, only a single counter variable is needed. Note how it is not incremented until the pointer reallocation and the allocation for your word storage have both been validated, and the word has been copied from buf to arr[size_arr]. On failure of either allocation, the loop is broken and your size_arr will still hold the correct number of stored words)
That completes your read-loop.
Now you can use your stored collection of size_arr pointers, each pointing to an allocated and stored word as you wish. But remember, when it comes time to free the memory, that too is a 2-step process. You must free the allocated block for each word, before freeing the block of allocated pointers, e.g.
for (int i = 0; i < size_arr; i++) { /* output result */
puts (arr[i]);
free (arr[i]); /* free word storage */
}
free(arr); /* free pointers */
Done.
The complete program is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (void)
{
char **arr = NULL, /* pointer to block of allocated pointers */
buf[MAXC]; /* fixed buffer for read-input */
int size_arr = 0; /* only 1 counter needed here */
while (fgets (buf, MAXC, stdin))
{
size_t len;
/* allocate pointer (one each time rather inefficient) */
void *tmp = realloc (arr, (size_arr + 1) * sizeof *arr);
if (!tmp) { /* VALIDATE */
perror ("realloc-arr");
break;
}
arr = tmp; /* assign on success */
buf[(len = strcspn (buf, "\n"))] = 0; /* trim \n, save len */
arr[size_arr] = malloc (len + 1); /* allocate for word */
if (!arr[size_arr]) { /* VALIDATE */
perror ("malloc-arr[i]");
break;
}
memcpy (arr[size_arr], buf, len + 1); /* copy buf to arr[i] */
size_arr += 1; /* increment counter */
}
for (int i = 0; i < size_arr; i++) { /* output result */
puts (arr[i]);
free (arr[i]); /* free word storage */
}
free(arr); /* free pointers */
}
Example Use/Output
Test it out. Do something creative like read, store and output your source file to make sure it works, e.g.
$ ./bin/dynarr < dynarr.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (void)
{
char **arr = NULL, /* pointer to block of allocated pointers */
buf[MAXC]; /* fixed buffer for read-input */
int size_arr = 0; /* only 1 counter needed here */
while (fgets (buf, MAXC, stdin))
{
size_t len;
/* allocate pointer (one each time rather inefficient) */
void *tmp = realloc (arr, (size_arr + 1) * sizeof *arr);
if (!tmp) { /* VALIDATE */
perror ("realloc-arr");
break;
}
arr = tmp; /* assign on success */
buf[(len = strcspn (buf, "\n"))] = 0; /* trim \n, save len */
arr[size_arr] = malloc (len + 1); /* allocate for word */
if (!arr[size_arr]) { /* VALIDATE */
perror ("malloc-arr[i]");
break;
}
memcpy (arr[size_arr], buf, len + 1); /* copy buf to arr[i] */
size_arr += 1; /* increment counter */
}
for (int i = 0; i < size_arr; i++) { /* output result */
puts (arr[i]);
free (arr[i]); /* free word storage */
}
free(arr); /* free pointers */
}
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/dynarr < dynarr.c
==30995== Memcheck, a memory error detector
==30995== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==30995== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==30995== Command: ./bin/dynarr
==30995==
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
<snipped code>
}
==30995==
==30995== HEAP SUMMARY:
==30995== in use at exit: 0 bytes in 0 blocks
==30995== total heap usage: 84 allocs, 84 frees, 13,462 bytes allocated
==30995==
==30995== All heap blocks were freed -- no leaks are possible
==30995==
==30995== For counts of detected and suppressed errors, rerun with: -v
==30995== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

Function to split string sometimes gives segmentation fault

I have the following function to split a string. Most of the time it works fine, but sometimes it randomly causes a segmentation fault.
char** splitString(char* string, char* delim){
int count = 0;
char** split = NULL;
char* temp = strtok(string, delim);
while(temp){
split = realloc(split, sizeof(char*) * ++count);
split[count - 1] = temp;
temp = strtok(NULL, " ");
}
int i = 0;
while(split[i]){
printf("%s\n", split[i]);
i++;
}
split[count - 1][strlen(split[count - 1]) - 1] = '\0';
return split;
}
You have a number of subtle issues, not the least of which is that your function will likely segfault if you pass a string literal. strtok modifies the string it splits, and a string literal is stored in read-only memory. Your compiler has no way of warning about this unless you have declared the parameter as const char *string.
To avoid these problems, simply make a copy of the string you will tokenize. That way, regardless of how the string you pass to the function was declared, you avoid the problem altogether.
You should also pass a pointer to size_t as a parameter to your function in order to make the number of tokens available back in the calling function. That way you do not have to leave a sentinel NULL as the final pointer in the pointer to pointer to char you return. Just pass a pointer and update it to reflect the number of tokens parsed in your function.
Putting those pieces together, and cleaning things up a bit, you could use the following to do what you are attempting to do:
char **splitstr (const char *str, char *delim, size_t *n)
{
char *cpy = strdup (str), *p = cpy; /* copy of str & pointer */
char **split = NULL; /* pointer to pointer to char */
*n = 0; /* zero 'n' */
for (p = strtok (p, delim); p; p = strtok (NULL, delim)) {
void *tmp = realloc (split, sizeof *split * (*n + 1));
if (!tmp) { /* validate realloc succeeded */
fprintf (stderr, "splitstr() error: memory exhausted.\n");
break;
}
split = tmp; /* assign tmp to split */
split[(*n)++] = strdup (p); /* allocate/copy to split[n] */
}
free (cpy); /* free cpy */
return split; /* return split */
}
Adding a short example program, you could do the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **splitstr (const char *str, char *delim, size_t *n)
{
char *cpy = strdup (str), *p = cpy; /* copy of str & pointer */
char **split = NULL; /* pointer to pointer to char */
*n = 0; /* zero 'n' */
for (p = strtok (p, delim); p; p = strtok (NULL, delim)) {
void *tmp = realloc (split, sizeof *split * (*n + 1));
if (!tmp) { /* validate realloc succeeded */
fprintf (stderr, "splitstr() error: memory exhausted.\n");
break;
}
split = tmp; /* assign tmp to split */
split[(*n)++] = strdup (p); /* allocate/copy to split[n] */
}
free (cpy); /* free cpy */
return split; /* return split */
}
int main (void) {
size_t n = 0; /* number of strings */
char *s = "My dog has fleas.", /* string to split */
*delim = " .\n", /* delims */
**strings = splitstr (s, delim, &n); /* split s */
for (size_t i = 0; i < n; i++) { /* output results */
printf ("strings[%zu] : %s\n", i, strings[i]);
free (strings[i]); /* free string */
}
free (strings); /* free pointers */
return 0;
}
Example Use/Output
$ ./bin/splitstrtok
strings[0] : My
strings[1] : dog
strings[2] : has
strings[3] : fleas
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/splitstrtok
==14471== Memcheck, a memory error detector
==14471== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14471== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==14471== Command: ./bin/splitstrtok
==14471==
strings[0] : My
strings[1] : dog
strings[2] : has
strings[3] : fleas
==14471==
==14471== HEAP SUMMARY:
==14471== in use at exit: 0 bytes in 0 blocks
==14471== total heap usage: 9 allocs, 9 frees, 115 bytes allocated
==14471==
==14471== All heap blocks were freed -- no leaks are possible
==14471==
==14471== For counts of detected and suppressed errors, rerun with: -v
==14471== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.
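The loop while(split[i]) only stops at a NULL pointer, but the function never stores one, so the loop reads past the end of the block returned by realloc(). That is the likely cause of the intermittent crash. Reserve one extra pointer in each realloc() call (sizeof(char*) * (count + 1)) and, before the while(split[i]) loop, add
split[count] = NULL;
so that the while can stop when it reaches NULL. (Return early if count is 0, since split is then still NULL.) The line
split[count - 1][strlen(split[count - 1]) - 1] = '\0';
does not do that: it overwrites the last character of the final token, and it uses an invalid pointer when no tokens were found. Also note that the second strtok call passes " " rather than delim.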
The function strtok() is not reentrant. Consider using strtok_r() (POSIX), which is the reentrant version of strtok().
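To show what reentrancy buys you, here is a minimal sketch of nested tokenizing with strtok_r(). The function name count_fields and the "a=1;b=2" input format are hypothetical, and the feature-test macro is an assumption that some toolchains need to expose the POSIX declaration:

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strtok_r on strict toolchains */
#include <assert.h>
#include <string.h>

/* Hypothetical example: counts the fields in an "a=1;b=2" style string.
 * The outer loop splits on ';' and the inner loop splits each piece on '='.
 * Each loop keeps its own save pointer, so the two do not interfere.
 * Like strtok, strtok_r modifies s, so s must be writable. */
size_t count_fields (char *s)
{
    size_t n = 0;
    char *outer_save, *inner_save;
    char *pair, *field;

    assert (s != NULL);              /* precondition */

    for (pair = strtok_r (s, ";", &outer_save); pair;
         pair = strtok_r (NULL, ";", &outer_save))
        for (field = strtok_r (pair, "=", &inner_save); field;
             field = strtok_r (NULL, "=", &inner_save))
            n++;                     /* one key or one value */

    return n;
}
```

The same nesting written with plain strtok() would not work, because the inner call would overwrite the single hidden position that the outer loop relies on.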

how to connect/link words in an string array using one loop?

I have an array with n words and I want to join the strings together.
For example, if the array has the following strings: "hello" "world" "stack77"
I want the function to return "helloworldstack77". How can I do something like this without recursion and with one loop, using only the two functions strcpy and strlen from the string library?
Any ideas? Thanks.
I need to use one loop only!
char *connect(char**words,int n){
int i=0;
while(words){
strcpy(words+i,
I saw many solutions, but they all use other string functions, whereas I only want to use strcpy and strlen.
If you may use only the two standard string functions mentioned, the function can look like the one in the following demonstration program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * connect( char **words, size_t n )
{
    size_t length = 0;
    /* first pass: total length of all words */
    for ( size_t i = 0; i < n; i++ ) length += strlen( words[i] );
    char *s = malloc( length + 1 );
    if ( s == NULL ) return NULL; /* allocation failed */
    /* second pass: copy each word to its offset */
    size_t pos = 0;
    for ( size_t i = 0; i < n; i++ )
    {
        strcpy( s + pos, words[i] );
        pos += strlen( words[i] );
    }
    s[pos] = '\0';
    return s;
}
int main( void )
{
    char * s[] = { "Hello", " ", "World" };
    char *p = connect( s, sizeof( s ) / sizeof( *s ) );
    if ( p != NULL ) puts( p );
    free( p );
}
The program output is
Hello World
If you may use only one loop, the function can look like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * connect( char **words, size_t n )
{
char *s = calloc( 1, sizeof( char ) );
if ( s != NULL )
{
size_t pos = 0;
for ( size_t i = 0; s != NULL && i < n; i++ )
{
size_t length = strlen( words[i] );
char *tmp = realloc( s, pos + length + 1 );
if ( tmp != NULL )
{
s = tmp;
strcpy( s + pos, words[i] );
pos += length;
}
else
{
free( s );
s = NULL;
}
}
}
return s;
}
int main( void )
{
char * s[] = { "Hello", " ", "World" };
char *p = connect( s, sizeof( s ) / sizeof( *s ) );
if ( p != NULL ) puts( p );
free( p );
}
In addition to using strcpy, you can also use sprintf. Each of the functions in the printf family returns the number of characters actually output allowing you to compute an offset in your final string without an additional function call. Now, there is nothing wrong with using a strcpy/strlen approach, and in fact, that is probably the preferred approach, but be aware that there are always multiple ways of doing things within the parameters you have given. Also note that the printf family offers a wealth of formatting benefits in the event you would need to include additional information along with the concatenation of strings.
For example, you can use sprintf to concatenate each string while accumulating the number of characters written in nc, which serves as the offset for writing the next string to the resulting buffer buf. A ternary operator controls the addition of a space between the words based on your loop counter. You could do something similar to the following:
char *compress (char **p, int n)
{
char *buf = NULL; /* buffer to hold concatenated string */
size_t total = 0; /* total number of characters required */
int nc = 0; /* number of chars added (counter) */
for (int i = 0; i < n; i++) /* get total required length */
total += strlen (p[i]) + 1; /* including spaces between */
if (!(buf = malloc (total + 1))) /* allocate/validate mem */
return buf; /* return NULL on error */
for (int i = 0; i < n; i++) /* add each word to buf, save nc */
nc += sprintf (buf + nc, i ? " %s" : "%s", p[i]);
*(buf + nc) = 0; /* affirmatively nul-terminate buf */
return buf;
}
note: each memory allocation with malloc, calloc or realloc should be validated to ensure it succeeds, and the error handled in the event of failure. (here NULL is returned if allocation fails).
Putting that together in a short example, you could do something similar to the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *compress (char **p, int n)
{
char *buf = NULL; /* buffer to hold concatenated string */
size_t total = 0; /* total number of characters required */
int nc = 0; /* number of chars added (counter) */
for (int i = 0; i < n; i++) /* get total required length */
total += strlen (p[i]) + 1; /* including spaces between */
if (!(buf = malloc (total + 1))) /* allocate/validate mem */
return buf; /* return NULL on error */
for (int i = 0; i < n; i++) /* add each word to buf, save nc */
nc += sprintf (buf + nc, i ? " %s" : "%s", p[i]);
*(buf + nc) = 0; /* affirmatively nul-terminate buf */
return buf;
}
int main (void) {
char *sa[] = { "My", "dog", "has", "too many", "fleas." },
*result = compress (sa, sizeof sa/sizeof *sa);
if (result) { /* check return */
printf ("result: '%s'\n", result); /* print string */
free (result); /* free memory */
}
return 0;
}
Example Use/Output
$ ./bin/strcat_sprintf
result: 'My dog has too many fleas.'
Memory/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/strcat_sprintf
==27595== Memcheck, a memory error detector
==27595== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==27595== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==27595== Command: ./bin/strcat_sprintf
==27595==
result: 'My dog has too many fleas.'
==27595==
==27595== HEAP SUMMARY:
==27595== in use at exit: 0 bytes in 0 blocks
==27595== total heap usage: 1 allocs, 1 frees, 28 bytes allocated
==27595==
==27595== All heap blocks were freed -- no leaks are possible
==27595==
==27595== For counts of detected and suppressed errors, rerun with: -v
==27595== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any additional questions.
compress with a Single Loop
As mentioned in the comment to MarianD's answer, you can use a single loop, reallocating your buffer with the addition of each word to the final string, but that is less efficient than getting the total number of characters required and then allocating once. However, there are many occasions where that is exactly what you will be required to do. Basically, you will simply get the length of each word and then allocate memory for that word (and the space between it and the next and for the nul-byte) using realloc instead of malloc (or calloc). realloc acts just like malloc for the first allocation, thereafter it resizes the buffer maintaining its current contents.
note: never realloc the buffer directly (e.g. buf = realloc (buf, newsize);), instead, always use a temporary pointer. Why? If realloc fails, it returns NULL, and assigning that to buf overwrites your only reference to the original block (e.g. it will result in buf = NULL;). The original block is still allocated, but its address is lost (and you have created a memory leak).
Putting that together, you could do something like the following:
char *compress (char **p, int n)
{
char *buf = NULL; /* buffer to hold concatenated string */
size_t bufsz = 0; /* current allocation size for buffer */
int nc = 0; /* number of chars added (counter) */
for (int i = 0; i < n; i++) { /* add each word to buf */
size_t len = strlen (p[i]) + 1; /* get length of word */
void *tmp = realloc (buf, bufsz + len); /* realloc buf */
if (!tmp) /* validate reallocation */
return buf; /* return current buffer */
buf = tmp; /* assign reallocated block to buffer */
bufsz += len; /* increment bufsz to current size */
nc += sprintf (buf + nc, i ? " %s" : "%s", p[i]);
}
if (buf) *(buf + nc) = 0; /* affirmatively nul-terminate buf (buf is NULL when n is 0) */
return buf;
}
Memory/Error Check
$ valgrind ./bin/strcat_sprintf_realloc
==28175== Memcheck, a memory error detector
==28175== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==28175== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==28175== Command: ./bin/strcat_sprintf_realloc
==28175==
result: 'My dog has too many fleas.'
==28175==
==28175== HEAP SUMMARY:
==28175== in use at exit: 0 bytes in 0 blocks
==28175== total heap usage: 5 allocs, 5 frees, 68 bytes allocated
==28175==
==28175== All heap blocks were freed -- no leaks are possible
==28175==
==28175== For counts of detected and suppressed errors, rerun with: -v
==28175== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
note: now there are 5 allocations instead of 1.
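If the number of allocations matters but the total length is not known up front, a common middle ground is to grow the buffer geometrically rather than once per word. The following is only a sketch of that idea, not part of the answer above: compress_grow is a hypothetical name, the starting capacity of 16 is arbitrary, and unlike compress it returns NULL (after freeing the buffer) if an allocation fails.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of compress(): doubles the buffer capacity when
 * needed instead of reallocating for every word.
 * Returns a malloc'ed string the caller must free, or NULL on failure. */
char *compress_grow (char **p, int n)
{
    size_t cap = 16, used = 0;      /* capacity, bytes used (excluding nul) */
    char *buf = malloc (cap);

    if (!buf)
        return NULL;
    buf[0] = 0;                     /* valid empty string when n is 0 */

    for (int i = 0; i < n; i++) {
        size_t len = strlen (p[i]);
        size_t need = used + len + (i ? 1 : 0) + 1;   /* + space, + nul */

        if (need > cap) {
            while (cap < need)      /* double until the word fits */
                cap *= 2;
            void *tmp = realloc (buf, cap);   /* temporary pointer */
            if (!tmp) {
                free (buf);
                return NULL;
            }
            buf = tmp;
        }
        if (i)
            buf[used++] = ' ';      /* separator before every word but the first */
        memcpy (buf + used, p[i], len + 1);   /* copies the nul as well */
        used += len;
    }
    return buf;
}
```

Called as char *s = compress_grow (words, n);, the caller owns the returned string and releases it with free (s).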
Let me know if you have any questions.

Create array from another array in C

I'm trying to create an array from another array. If I have char *arr[100] = {"hi", "&&", "hello", 0}; I want the result to be new[0] = "hi"; new[1] = "hello"; but my code below doesn't seem to work. How can I fix this?
#include <stdio.h>
#include <string.h>
void split_by_word(char *av[], char **arr, char *word)
{
int i = 0;
int j = 0;
while (strcmp(*arr, word) == 0)
arr++;
if (!arr)
return ;
while (arr[i])
{
strcat(av[j], arr[i]);
if (strcmp(*arr, word) == 0)
j++;
i++;
}
}
int main()
{
char *av[100];
char *arr[100] = {"hi", "&&", "hello", 0};
memset(av, 0, sizeof(char *) * 100);
split_by_word(av, arr, "&&");
return 0;
}
Given the array
char *arr[] =
{
"Hello", "good",
"morning", "out",
"hello", "good",
"afternoon", 0
};
Output when I split by out (split_by_word(av, arr, "out"));
av[0] = "hello good morning";
av[1] = "hello good afternoon";
You need to allocate space for the new 2D array for a start. For simplicity, I allocated one with a size of 100 x 10.*
Moreover, the logic can be simpler: loop over your array and copy each element that is not the word; if it is the word, skip it.
So, a basic, good example to start, is:
#include <stdio.h>
#include <string.h>
void split_by_word(char av[100][10], char **arr, char *word)
{
int i = 0, j = 0;
while(arr[i])
{
// if not 'word', copy
if(strcmp(arr[i], word))
strcpy(av[j++], arr[i]);
++i;
}
}
int main()
{
int i;
char av[100][10] = {{0}};
char *arr[100] = {"hi", "&&", "hello", 0};
split_by_word(av, arr, "&&");
for(i = 0; i < 2; ++i)
printf("%s\n", av[i]);
return 0;
}
Output:
Georgioss-MacBook-Pro:~ gsamaras$ gcc -Wall main.c
Georgioss-MacBook-Pro:~ gsamaras$ ./a.out
hi
hello
*For a 2D dynamically allocated array, I would do it like this 2d-dynamic-array-c.
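As a rough illustration of the dynamically allocated alternative mentioned in the footnote, the rows can be allocated one by one at run time. This is only a sketch under stated assumptions: alloc_2d, free_2d and split_by_word_dyn are hypothetical names, cols is assumed to be at least 1, and words longer than cols - 1 characters are silently truncated.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate 'rows' zero-filled strings of 'cols' bytes. */
char **alloc_2d (size_t rows, size_t cols)
{
    char **a = calloc (rows, sizeof *a);
    if (!a)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        a[i] = calloc (cols, 1);
        if (!a[i]) {                /* undo the partial allocation */
            while (i--)
                free (a[i]);
            free (a);
            return NULL;
        }
    }
    return a;
}

/* Hypothetical helper: release everything alloc_2d allocated. */
void free_2d (char **a, size_t rows)
{
    if (!a)
        return;
    for (size_t i = 0; i < rows; i++)
        free (a[i]);
    free (a);
}

/* Copy every element of arr that is not 'word' into av.
 * Stops when av is full; returns the number of strings copied. */
size_t split_by_word_dyn (char **av, size_t rows, size_t cols,
                          char **arr, const char *word)
{
    size_t j = 0;
    for (size_t i = 0; arr[i] && j < rows; i++) {
        if (strcmp (arr[i], word) != 0) {
            strncpy (av[j], arr[i], cols - 1);
            av[j][cols - 1] = '\0'; /* strncpy may not terminate */
            j++;
        }
    }
    return j;
}
```

The returned count tells the caller how many rows to print, so the hard-coded loop bound from the fixed-size example is not needed.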
Here's some code that seems to work according to the requirements of your revised question. I have little doubt that it could be improved with some diligence — particularly in split_by_word(). Your revised requirement seems to concatenate strings where it was certainly not clear that your original requirement did.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void split_by_word(char **av, char **arr, char *word)
{
while (*arr != 0)
{
if (strcmp(*arr, word) == 0)
av++;
else if (*av == 0)
*av = strdup(*arr);
else
{
size_t len = strlen(*av) + strlen(*arr) + 2; // 1 for null byte, 1 for blank
void *space = realloc(*av, len);
if (space == 0)
{
fprintf(stderr, "Memory allocation failed (%zu bytes)\n", len);
exit(EXIT_FAILURE);
}
*av = space;
strcat(*av, " ");
strcat(*av, *arr);
}
arr++;
}
*++av = 0; // Null terminate pointer list
}
static void free_words(char **words)
{
while (*words != 0)
{
free(*words);
*words++ = 0;
}
}
static void print_words(char **words)
{
for (int i = 0; words[i] != 0; i++)
printf("%d: [%s]\n", i, words[i]);
}
int main(void)
{
char *av[100] = { 0 };
char *arr1[100] = { "hi", "&&", "hello", 0 };
split_by_word(av, arr1, "&&");
print_words(av);
free_words(av);
char *arr2[] =
{
"Hello", "good",
"morning", "out",
"hello", "good",
"afternoon", 0
};
split_by_word(av, arr2, "out");
print_words(av);
free_words(av);
return 0;
}
Sample output:
0: [hi]
1: [hello]
0: [Hello good morning]
1: [hello good afternoon]
You need to ensure you understand that your arr is an array of pointers to string literals, some of which are tokens that mark where to split the contents into separate strings, and that arr is ultimately terminated by a sentinel NULL pointer.
One issue that has been skirted is how to handle changes in the length of the strings built from the words in arr. Depending on the length of those words, how do you ensure you have adequate space for the combined strings that make up the results array?
You can either guess and set a static storage size for each element in the result array (hopefully large enough for any arr you need to separate), or you can allocate dynamically (allocating and reallocating as needed). The second option ensures you can handle whatever arr contains.
There are many ways to do this and many routines you can use. Regardless, the approach is basically the same: read each word in arr, ensure the result string has adequate storage, and then concatenate the word from arr to the result string. One approach would be as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXS 16
int split_by_word (char **res, char **arr, char *tok);
void *xrealloc (void *ptr, size_t psz, size_t *nelem, size_t inc);
int main (void) {
char *arr[] = { "Hello", "good",
"morning", "out",
"hello", "good",
"afternoon", 0 },
*res[sizeof arr/sizeof *arr] = { NULL },
*tok = "out";
if (split_by_word (res, arr, tok) > 0)
for (int i = 0; res[i]; i++) {
printf ("%s\n", res[i]);
free (res[i]);
}
return 0;
}
int split_by_word (char **res, char **arr, char *tok)
{
int aidx = 0, cidx = 0, ridx = 0; /* array, current and result index */
size_t szres = MAXS; /* current size of res[ridx] */
if (!res || !arr || !tok) return -1; /* validate parameters */
if (!(res[ridx] = calloc (szres, sizeof *(res[ridx])))) /* allocate result */
return -1;
while (arr[aidx]) {
if (strcmp (arr[aidx], tok) == 0) { /* separator found */
*(res[ridx] + cidx) = 0; /* nul-terminate */
ridx++; /* advance result index */
szres = MAXS; /* reset alloc size, alloc */
if (!(res[ridx] = calloc (szres, sizeof *(res[ridx]))))
return -1;
cidx = 0; /* reset current index */
}
else { /* append word from arr to res */
size_t len = strlen (arr[aidx]), /* get length */
reqd = cidx ? len + 2 : len + 1; /* add space and nulbyte */
if (cidx + reqd > szres) /* check space, realloc */
res[ridx] = xrealloc (res[ridx], sizeof *(res[ridx]), &szres,
cidx + reqd);
/* write word to result */
snprintf (res[ridx] + cidx, reqd, cidx ? " %s" : "%s", arr[aidx]);
cidx += reqd - 1; /* advance current index */
}
aidx++; /* advance array index */
}
*(res[ridx] + cidx) = 0; /* nul-terminate */
return ridx ? ridx : cidx ? 1 : ridx; /* return strings in results */
}
/** realloc 'ptr' to 'nelem' of 'psz' to 'nelem + inc' of 'psz'.
* returns pointer to reallocated block of memory with all new
* memory initialized to 0/NULL. return must be assigned to
* original pointer in caller.
*/
void *xrealloc (void *ptr, size_t psz, size_t *nelem, size_t inc)
{ void *memptr = realloc ((char *)ptr, (*nelem + inc) * psz);
if (!memptr) {
fprintf (stderr, "realloc() error: virtual memory exhausted.\n");
exit (EXIT_FAILURE);
} /* zero new memory (optional) */
memset ((char *)memptr + *nelem * psz, 0, inc * psz);
*nelem += inc;
return memptr;
}
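Above, split_by_word returns -1 on error and a positive value once at least one word or separator has been processed. Note that, as written, the return statement yields the index of the last result string (or 1 when no separator was seen) rather than a true count, so walk res up to its NULL sentinel, as main does, instead of relying on the return value as a count.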
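Since main walks res until it reaches a NULL pointer, the number of stored strings can be obtained the same way. A minimal sketch, assuming the caller zero-initialized res and left at least one unused slot after the last string; count_results is a hypothetical helper, not part of the code above:

```c
#include <stddef.h>

/* Hypothetical helper: count the entries before the NULL sentinel. */
size_t count_results (char **res)
{
    size_t n = 0;

    if (!res)
        return 0;
    while (res[n])      /* stop at the first NULL pointer */
        n++;
    return n;
}
```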
Example Use/Output
$ ./bin/splitap
Hello good morning
hello good afternoon
Verify Your Memory Use
If you allocate memory, it is your responsibility to preserve a pointer to the beginning of each block, so it can be freed when no longer needed. On Linux, valgrind is the tool of choice. Simply run your program through it. (There are similar memory error checking tools for each OS.)
$ valgrind ./bin/splitap
==13491== Memcheck, a memory error detector
==13491== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==13491== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==13491== Command: ./bin/splitap
==13491==
Hello good morning
hello good afternoon
==13491==
==13491== HEAP SUMMARY:
==13491== in use at exit: 0 bytes in 0 blocks
==13491== total heap usage: 4 allocs, 4 frees, 104 bytes allocated
==13491==
==13491== All heap blocks were freed -- no leaks are possible
==13491==
==13491== For counts of detected and suppressed errors, rerun with: -v
==13491== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
You want to validate that each allocation has been freed, that no memory leaks are possible, and that there are no errors in the way you have used the memory you have allocated (e.g. invalid reads/writes, etc.).
Since it looks (from your declarations) like you only want to store pointers in the new array, there is no need for strcat() or strcpy(). The first loop in your function appears to be skipping initial delimiters, but you can do this in the main loop. Here is a modified version of your code:
#include <stdio.h>
#include <string.h>
void split_by_word(char *av[], char **arr, char *word)
{
size_t i = 0;
size_t j = 0;
while (arr[i]) {
if (strcmp(arr[i], word)) {
av[j] = arr[i];
++j;
}
++i;
}
}
int main(void)
{
char *av[100];
char *arr[100] = {"hi", "&&", "hello", 0};
memset(av, 0, sizeof(char *) * 100);
split_by_word(av, arr, "&&");
for (size_t i = 0; av[i]; i++)
puts(av[i]);
return 0;
}
After arr is passed to split_by_word(), av contains pointers to the string literals "hi" and "hello":
λ> ./a.out
hi
hello
If, on the other hand, you actually want the new array to contain copies of the strings, you must declare av so that there is space for these copies, and you need to use strcpy(), or some similar function, to copy the characters to the array. Here is another version that accomplishes this. Note that the size of the largest string must be known in advance; here I have #defined a constant for this purpose. Also note that the display loop is slightly different from the previous loop. The first display loop continued until a NULL pointer was encountered, but in the second version the loop continues until an empty string is encountered. The output is the same as before.
#include <stdio.h>
#include <string.h>
#define MAXWORD 100
void split_by_word(char av[][MAXWORD], char **arr, char *word)
{
size_t i = 0;
size_t j = 0;
while (arr[i]) {
if (strcmp(arr[i], word)) {
strcpy(av[j], arr[i]);
++j;
}
++i;
}
}
int main(void)
{
char av[100][MAXWORD] = { { 0 } };
char *arr[100] = {"hi", "&&", "hello", 0};
split_by_word(av, arr, "&&");
for (size_t i = 0; av[i][0]; i++)
puts(av[i]);
return 0;
}
UPDATE
I have modified the previous solution to meet the refined requirements suggested in your revised example. The constant MAXWORD is now MAXLEN, and is large enough to hold quite a few words. strcat() is used instead of strcpy(), and an additional space character is added to the end of the string every time a word is added. The string-index j is incremented only when the delimiter string is encountered.
Note that there are no checks to ensure that there is room for a new string of words in av (which can currently hold up to 99 strings and one empty string as a terminator), or room for a new word in a string (999 characters plus room for a '\0' terminator seems reasonably generous). There is no dynamic allocation here, and if you need this perhaps Jonathan Leffler's solution is more to your taste.
#include <stdio.h>
#include <string.h>
#define MAXLEN 1000
void split_by_word(char av[][MAXLEN], char **arr, char *word)
{
size_t i = 0;
size_t j = 0;
while (arr[i]) {
if (strcmp(arr[i], word)) {
strcat(av[j], arr[i]);
strcat(av[j], " ");
} else {
++j;
}
++i;
}
}
int main(void)
{
char av[100][MAXLEN] = { { 0 } };
char *arr[] =
{
"Hello", "good",
"morning", "out",
"hello", "good",
"afternoon", 0
};
split_by_word(av, arr, "out");
for (size_t i = 0; av[i][0]; i++)
puts(av[i]);
return 0;
}
Here is the output of this program:
λ> ./a.out
Hello good morning
hello good afternoon
Bounds Checking
I couldn't bear to leave this without adding some checks on array bounds to avoid undefined behavior in case of unexpected input sizes. Here is a version of the split_by_word() function that only adds a new string to av if there is room, and only adds a new word to a string if there is room. If there is not enough space for the new word, the function skips to the next delimiter, or the end of arr, whichever comes first. I added a MAXNUM constant for the maximum number of strings to be stored, to replace the hard-coded 100 from previous versions. I have no doubt that you could improve this function.
#define MAXNUM 100
#define MAXLEN 1000
void split_by_word(char av[][MAXLEN], char **arr, char *word)
{
size_t i = 0;
size_t j = 0;
while ((j + 1) < MAXNUM && arr[i]) {
if (strcmp(arr[i], word)) {
/* Verify space for word + extra space */
if ((strlen(av[j]) + strlen(arr[i]) + 1) < MAXLEN) {
strcat(av[j], arr[i]);
strcat(av[j], " ");
} else { // No space: skip to next delimiter
++i;
while (arr[i] && strcmp(arr[i], word)) {
++i;
}
++j; // increment to next string
}
} else {
++j; // increment to next string
}
if (arr[i]) ++i; // increment i if not already at end
}
}

Copying a string into a dynamical two-dimensional array

As the title already states, I would like to copy various strings into a two-dimensional array. Each string has a different size, so I need to use memory reallocation. The code below should do this job, but somehow I cannot get it working.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char **str = NULL;
int str_num = 1;
// get string
char *str_new = "hey mate\0";
printf("%s\n",str_new);
// reallocation for str_new
str = realloc(str,str_num*sizeof(char));
str[str_num-1] = realloc(str,strlen(str_new));
// copy string to new space
strcpy(*str, str_new);
// displaying string
printf("%s\n",str[str_num-1]);
return EXIT_SUCCESS;
}
One problem is in the reallocation:
str = realloc(str,str_num*sizeof(char));
You only allocate space for a single byte and not for a pointer to char. This leads to undefined behavior.
Change to e.g.
str = realloc(str,str_num*sizeof(char*));
Another problem, and also a cause for undefined behavior is the actual string allocation:
str[str_num-1] = realloc(str,strlen(str_new));
Here you are reallocating str and not str[0]. And you don't allocate space for the string terminator.
Don't use realloc for this at all. Since you allocate each string only once, malloc is enough, or you can use the strdup function to do both the allocation and the copy in one call:
str[str_num - 1] = strdup(str_new);
By the way, when using realloc never assign to the same pointer you're passing in as the first argument, because if the realloc function fails it will return NULL and you will lose the original pointer. Instead assign to a temporary pointer, and if it's non-null then assign to the actual pointer.
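The temporary-pointer advice and the terminator fix can be combined in one small helper. This is a sketch rather than a definitive implementation: append_string is a hypothetical name, and it uses malloc plus strcpy instead of strdup only so that it does not depend on POSIX.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append a copy of 's' to the array *arr, which
 * currently holds *n strings. Returns 0 on success and -1 on failure;
 * on failure *arr and *n are left unchanged and still valid. */
int append_string (char ***arr, size_t *n, const char *s)
{
    char *copy = malloc (strlen (s) + 1);   /* +1 for the terminator */
    if (!copy)
        return -1;
    strcpy (copy, s);

    /* realloc into a temporary so the old block is not lost on failure */
    char **tmp = realloc (*arr, (*n + 1) * sizeof **arr);
    if (!tmp) {
        free (copy);
        return -1;
    }
    tmp[*n] = copy;
    *arr = tmp;
    (*n)++;
    return 0;
}
```

Starting from char **str = NULL; size_t n = 0;, each call such as append_string (&str, &n, str_new) adds one string; realloc behaves like malloc on the first call because *arr is NULL.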
char **str = malloc(sizeof(char *) * n); /* n= number of pointers */
Allocate memory for the pointers first, then allocate memory for each individual pointer, like this:
for(i=0;i<n;i++)
{
str[i] = malloc(sizeof(char) * 20);
/* copy your string to the allocated memory location here*/
}
Else you can have
char **str = NULL;
str = realloc(str,str_num * sizeof(char *));
I think you have misunderstood what realloc does. The code you posted is not right; it should be written this way:
char **str = NULL;
int str_num = 1;
// get string
char *str_new = "hey mate\0";
printf("%s\n",str_new);
/*
// reallocation for str_new
str = realloc(str, str_num * sizeof(char));
*/
/* allocate space for `str_num` pointers of char */
str = malloc(str_num * sizeof(char *));
if (str == NULL) /* check it worked */
return -1;
/* allocate space for the number of characters in str_new and
* the termination null byte
*/
str[str_num - 1] = malloc(1 + strlen(str_new));
if (str[str_num - 1] == NULL)
{
free(str);
return -1;
}
/*
* // copy string to new space
* strcpy(*str, str_new);
*/
/* use the same index since if str_num > 1 the above is wrong */
strcpy(str[str_num - 1], str_new);
// displaying string
printf("%s\n",str[str_num - 1]);
realloc should be used like this
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char **str = NULL;
int str_num;
// get string
char *str_new[2] = {"hey mate 1", "hey mate 2"};
for (str_num = 0 ; str_num < 2 ; ++str_num)
{
char **pointer;
printf("%s\n", str_new[str_num]);
/*
// reallocation for str_new
str = realloc(str, str_num * sizeof(char));
*/
/* re-allocate space for `str_num` pointers of char */
pointer = realloc(str, (1 + str_num) * sizeof(char *));
/*
* If you use
*
* str = realloc(str, str_num * sizeof(char *));
*
* you won't be able to free(str) on failure
*/
if (pointer == NULL)
{
int j;
/* on failure cleanup and return */
for (j = str_num - 1 ; j >= 0 ; --j)
free(str[j]);
free(str);
return -1;
}
str = pointer;
/* allocate space for the number of characters in str_new and
* the termination null byte
*/
str[str_num] = malloc(1 + strlen(str_new[str_num]));
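if (str[str_num] == NULL)
{
int j;
/* free the strings stored so far, then the array itself */
for (j = str_num - 1 ; j >= 0 ; --j)
free(str[j]);
free(str);
return -1;
}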
/*
* // copy string to new space
* strcpy(*str, str_new);
*/
/* use the same index since if str_num > 1 the above is wrong */
strcpy(str[str_num], str_new[str_num]);
// displaying string
printf("%s\n",str[str_num]);
}
return EXIT_SUCCESS;
}
You should remember to free all the allocated memory once you are done with it.
Also you don't need to embed the '\0' in a string literal.
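Freeing everything means releasing each string first and the pointer array last. A minimal sketch, assuming the array holds exactly *pn malloc'ed strings; free_strings is a hypothetical helper, and resetting the caller's variables is a design choice to avoid dangling pointers:

```c
#include <stdlib.h>

/* Hypothetical helper: free *pn strings and the array that holds them,
 * then reset the caller's pointer and count. */
void free_strings (char ***pstr, size_t *pn)
{
    if (!pstr || !pn || !*pstr)
        return;
    for (size_t i = 0; i < *pn; i++)
        free ((*pstr)[i]);      /* each string */
    free (*pstr);               /* then the array of pointers */
    *pstr = NULL;
    *pn = 0;
}
```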
Besides correcting (as pointed out in Joachim Pileborg's answer)
str = realloc(str,str_num*sizeof(char));
to
str = realloc(str,str_num*sizeof(*str));
you should note that the pointer passed to realloc must either be a pointer previously returned by malloc, calloc, or realloc, or be NULL. A newly added slot in str is uninitialized, so set it to NULL first. You can do it as
str[str_num-1] = NULL;
str[str_num-1] = realloc(str[str_num-1], strlen(str_new)+1);

Resources