Overlaying array of strings over char array - c

I am working on a function that needs to be re-entrant - the function is given a memory buffer as an argument and should use that buffer for all its memory needs. In other words, it can't use malloc, but rather should draw the memory from the supplied buffer.
The challenge that I ran into is how to overlay an array of strings over a char array of given size (the buffer is supplied as char *), but my result is array of strings (char **).
Below is a repro:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUFFER_SIZE 100
#define INPUT_ARRAY_SIZE 3
char *members[] = {
"alex",
"danny",
"max"
};
int main() {
// this simulates a buffer that is presented to my func
char *buffer = malloc(BUFFER_SIZE);
char *orig = buffer;
memset(buffer, NULL, BUFFER_SIZE);
// pointers will be stored at the beginning of the buffer
char **pointers = &buffer;
// strings will be stored after the pointers
char *strings = buffer + (sizeof(char *) * INPUT_ARRAY_SIZE);
for(int i = 0; i < INPUT_ARRAY_SIZE; i++) {
strncpy(strings, members[i], (strlen(members[i]) + 1));
// Need to store pointer to string in the pointers section
// pointers[i] = strings; // This does not do what I expect
strings += ((strlen(members[i]) + 1));
}
for (int i=0; i < BUFFER_SIZE; i++) {
printf("%c", orig[i]);
}
// Need to return pointers
}
With the problematic line commented out, the code above prints:
alex danny max
However, I need some assistance in figuring out how to write addresses of the strings at the beginning.
Of course, if there is an easier way of accomplishing this task, please let me know.

Here take a look at this.
/* conditions :
*
* 'buffer' should be large enough, 'arr_length','arr' should be valid.
*
*/
char ** pack_strings(char *buffer, char * arr[], int arr_length)
{
char **ptr = (char**) buffer;
char *string;
int index = 0;
string = buffer + (sizeof(char *) * (arr_length+1)); /* +1 for NULL */
while(index < arr_length)
{
size_t offset;
ptr[index] = string;
offset = strlen(arr[index])+1;
strcpy(string,arr[index]);
string += offset;
++index;
}
ptr[index] = NULL;
return ptr;
}
usage
char **ptr = pack_strings(buffer,members,INPUT_ARRAY_SIZE);
for (int i=0; ptr[i] != NULL; i++)
puts(ptr[i]);
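The answer above relies on the caller guaranteeing that 'buffer' is large enough. As a rough sketch of how that condition could be checked rather than assumed, here is a variant that also takes the buffer size. The name pack_strings_checked and the buffer_size parameter are made up for illustration, and the code assumes the buffer is suitably aligned for storing char * (true for memory from malloc).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of pack_strings that also receives the buffer size.
   Returns NULL when the pointer table plus all strings would not fit. */
char **pack_strings_checked(char *buffer, size_t buffer_size,
                            char *arr[], int arr_length)
{
    assert(buffer != NULL && arr != NULL && arr_length >= 0);

    /* Space for the pointer table, including the terminating NULL entry. */
    size_t table_size = sizeof(char *) * ((size_t)arr_length + 1);
    size_t needed = table_size;
    for (int i = 0; i < arr_length; i++)
        needed += strlen(arr[i]) + 1;      /* +1 for each '\0' */
    if (needed > buffer_size)
        return NULL;

    char **ptr = (char **)buffer;          /* assumes buffer is aligned for char * */
    char *string = buffer + table_size;    /* strings start after the table */
    for (int i = 0; i < arr_length; i++) {
        size_t len = strlen(arr[i]) + 1;
        ptr[i] = string;
        memcpy(string, arr[i], len);
        string += len;
    }
    ptr[arr_length] = NULL;
    return ptr;
}
```

It is called the same way as pack_strings, with the size added: pack_strings_checked(buffer, BUFFER_SIZE, members, INPUT_ARRAY_SIZE). A NULL return means the buffer was too small and nothing was written.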

Related

How do I allocate memory for a new string in a C Multiarray?

I am trying to find a way to create a dynamically allocated array of C strings. So far I have come with the following code that allows me to initialize an array of strings and change the value of an already existing index.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void replace_index(char *array[], int index, char *value) {
array[index] = malloc(strlen(value) + 1);
memmove(array[index], value, strlen(value) + 1);
}
int main(int argc, const char * argv[]) {
char *strs[] = {"help", "me", "learn", "dynamic", "strings"};
replace_index(strs, 2, "new_value");
// The above code works fine, but I can not use it to add a value
// beyond index 4.
// The following line will not add the string to index 5.
replace_index(strs, 5, "second_value");
}
The function replace_index will work to change the value of a string already included in the initializer, but will not work to add strings beyond the maximum index in the initializer. Is there a way to allocate more memory and add a new index?
First off, if you want to do serious string manipulation it would be so much easier to use almost any other language or to get a library to do it for you.
Anyway, onto the answer.
The reason replace_index(strs, 5, "second_value"); doesn't work in your code is because 5 is out of bounds-- the function would write to memory unassociated with strs. That wasn't your question, but that's something important to know if you didn't. Instead, it looks like you want to append a string. The following code should do the trick.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
char **content;
int len;
} string_array;
void free_string_array(string_array *s) {
for (int i = 0; i < s->len; i++) {
free(s->content[i]);
}
free(s->content);
free(s);
}
int append_string(string_array *s, char *value) {
value = strdup(value);
if (!value) {
return -1;
}
s->len++;
char **resized = realloc(s->content, sizeof(char *)*s->len);
if (!resized) {
s->len--;
free(value);
return -1;
}
resized[s->len-1] = value;
s->content = resized;
return 0;
}
string_array* new_string_array(char *init[]) {
string_array *s = calloc(1, sizeof(string_array));
if (!s || !init) {
return s;
}
while (*init) {
if (append_string(s, *init)) {
free_string_array(s);
return NULL;
}
init++;
}
return s;
}
// Note: It's up to the caller to free what was in s->content[index]
int replace_index(string_array *s, int index, char *value) {
value = strdup(value);
if (!value) {
return -1;
}
s->content[index] = value;
return 0;
}
int main() {
string_array *s = new_string_array((char *[]) {"help", "me", "learn", "dynamic", "strings", NULL});
if (!s) {
printf("out of memory\n");
exit(1);
}
free(s->content[2]);
// Note: No error checking for the following two calls
replace_index(s, 2, "new_value");
append_string(s, "second value");
for (int i = 0; i < s->len; i++) {
printf("%s\n", s->content[i]);
}
free_string_array(s);
return 0;
}
Also, you don't have to keep the char ** and int in a struct together but it's much nicer if you do.
If you don't want to use this code, the key takeaway is that the array of strings (char ** if you prefer) must be dynamically allocated. Meaning, you would need to use malloc() or similar to get the memory you need, and you would use realloc() to get more (or less). Don't forget to free() what you get when you're done using it.
My example uses strdup() to make copies of char *s so that you can always change them if you wish. If you have no intention of doing so it might be easier to remove the strdup()ing parts and also the free()ing of them.
Static array
char *strs[] = {"help", "me", "learn", "dynamic", "strings"};
This declares strs as an array of pointer to char and initializes it with 5 elements, thus the implied [] is [5]. A more restrictive const char *strs[] would be more appropriate if one were not intending to modify the strings.
Maximum length
char strs[][32] = {"help", "me", "learn", "dynamic", "strings"};
This declares strs as an array of array 32 of char which is initialized with 5 elements. The 5 elements are zero-filled beyond the strings. One can modify each string up to 31 characters plus the terminating '\0', but not add more strings.
Maximum capacity singleton for constant strings
static struct str_array { size_t size; const char *data[1024]; } strs;
This will pre-allocate the maximum capacity at startup and use that to satisfy requests. In this, the capacity is 1024, but the size can be any number up to the capacity. The reason I've made this static is that this is typically a lot to put on the stack. There is no reason why it couldn't be dynamic memory, as required.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
static struct { size_t size; const char *data[1024]; } strs;
static const size_t strs_capacity = sizeof strs.data / sizeof *strs.data;
/** Will reserve `n` pointers to strings. A null return indicates that the size
is overflowed, and sets `errno`, otherwise it returns the first string. */
static const char **str_array_append(const size_t n) {
const char **r;
if(n > strs_capacity - strs.size) { errno = ERANGE; return 0; }
r = strs.data + strs.size;
strs.size += n;
return r;
}
/** Will reserve one pointer to a string, null indicates the string buffer is
overflowed. */
static const char **str_array_new(void) { return str_array_append(1); }
int main(void) {
const char **s;
size_t i;
int success = EXIT_FAILURE;
if(!(s = str_array_append(5))) goto catch;
s[0] = "help";
s[1] = "me";
s[2] = "learn";
s[3] = "dynamic";
s[4] = "strings";
strs.data[2] = "new_value";
if(!(s = str_array_new())) goto catch;
s[0] = "second_value";
for(i = 0; i < strs.size; i++) printf("->%s\n", strs.data[i]);
{ success = EXIT_SUCCESS; goto finally; }
catch:
perror("strings");
finally:
return success;
}
Dynamic array
struct str_array { const char **data; size_t size, capacity; };
I think you are asking for a dynamic array of const char *. Language-level support for dynamic arrays is not in the standard C run-time; one must write one's own. This is entirely possible, but more involved. Because the size is variable, it will probably be slower than the fixed-capacity version, but with geometric growth the cost per append stays constant on average as the problem grows.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
/** A dynamic array of constant strings. */
struct str_array { const char **data; size_t size, capacity; };
/** Returns success allocating `min` elements of `a`. This is a dynamic array,
with the capacity going up exponentially, suitable for amortized analysis. On
resizing, any pointers in `a` may become stale. */
static int str_array_reserve(struct str_array *const a, const size_t min) {
size_t c0;
const char **data;
const size_t max_size = ~(size_t)0 / sizeof *a->data;
if(a->data) {
if(min <= a->capacity) return 1;
c0 = a->capacity < 5 ? 5 : a->capacity;
} else {
if(!min) return 1;
c0 = 5;
}
if(min > max_size) return errno = ERANGE, 0;
/* `c_n = a1.625^n`, approximation golden ratio `\phi ~ 1.618`. */
while(c0 < min) {
size_t c1 = c0 + (c0 >> 1) + (c0 >> 3);
if(c0 >= c1) { c0 = max_size; break; } /* Unlikely. */
c0 = c1;
}
if(!(data = realloc(a->data, sizeof *a->data * c0)))
{ if(!errno) errno = ERANGE; return 0; }
a->data = data, a->capacity = c0;
return 1;
}
/** Returns a pointer to the `n` buffered strings in `a`, that is,
`a + [a.size, a.size + n)`, or null on error, (`errno` will be set.) */
static const char **str_array_buffer(struct str_array *const a,
const size_t n) {
if(a->size > ~(size_t)0 - n) { errno = ERANGE; return 0; }
return str_array_reserve(a, a->size + n)
&& a->data ? a->data + a->size : 0;
}
/** Makes any buffered strings in `a` and beyond if `n` is greater than the
buffer, (containing uninitialized values) part of the size. A null on error
will only be possible if the buffer is exhausted. */
static const char **str_array_append(struct str_array *const a,
const size_t n) {
const char **b;
if(!(b = str_array_buffer(a, n))) return 0;
return a->size += n, b;
}
/** Returns a pointer to a string that has been buffered and created from `a`,
or null on error. */
static const char **str_array_new(struct str_array *const a) {
return str_array_append(a, 1);
}
/** Returns a string array that has been zeroed, with zero strings and idle,
not taking up any dynamic memory. */
static struct str_array str_array(void) {
struct str_array a;
a.data = 0, a.capacity = a.size = 0;
return a;
}
/** Erases `a`, if not null, and returns it to idle, not taking up dynamic
memory. */
static void str_array_(struct str_array *const a) {
if(a) free(a->data), *a = str_array();
}
int main(void) {
struct str_array strs = str_array();
const char **s;
size_t i;
int success = EXIT_FAILURE;
if(!(s = str_array_append(&strs, 5))) goto catch;
s[0] = "help";
s[1] = "me";
s[2] = "learn";
s[3] = "dynamic";
s[4] = "strings";
strs.data[2] = "new_value";
if(!(s = str_array_new(&strs))) goto catch;
s[0] = "second_value";
for(i = 0; i < strs.size; i++) printf("->%s\n", strs.data[i]);
{ success = EXIT_SUCCESS; goto finally; }
catch:
perror("strings");
finally:
str_array_(&strs);
return success;
}
but will not work to add strings beyond the maximum index in the initializer
To do that, you need the pointer array to be dynamic as well. To create a dynamic array of strings is one of the very few places where using a pointer-to-pointer to emulate 2D arrays is justified:
size_t n = 5;
char** str_array = malloc(n * sizeof *str_array);
...
size_t size = strlen(some_string)+1;
str_array[i] = malloc(size);
memcpy(str_array[i], some_string, size);
You have to keep track of the used size n manually and realloc more room in str_array when you run out of it. realloc guarantees that previous values are preserved.
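As a minimal sketch of that bookkeeping (the name push_string and the doubling growth policy are my own choices for illustration, not something the approach requires), appending could look like this:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: appends a copy of `s` to the array `*arr`, which
   currently holds `*used` strings and has room for `*cap` pointers.
   Returns 0 on success, -1 on allocation failure (array left unchanged). */
int push_string(char ***arr, size_t *used, size_t *cap, const char *s)
{
    assert(arr && used && cap && s);

    if (*used == *cap) {                        /* pointer array is full */
        size_t new_cap = *cap ? *cap * 2 : 4;   /* assumed policy: double */
        char **tmp = realloc(*arr, new_cap * sizeof **arr);
        if (!tmp)
            return -1;                          /* old array is still valid */
        *arr = tmp;
        *cap = new_cap;
    }

    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (!copy)
        return -1;
    memcpy(copy, s, size);
    (*arr)[(*used)++] = copy;
    return 0;
}
```

Start with char **strs = NULL; size_t used = 0, cap = 0; and call push_string(&strs, &used, &cap, "help") for each string. When done, free every strs[i] for i < used and then strs itself.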
This is very flexible but that comes at the cost of fragmented allocation, which is relatively slow. Had you used fixed-size 2D arrays, the code would perform much faster but then you can't resize them.
Note that I used memcpy, not memmove - the former is what you should normally use, since it's the fastest. memmove is for specialized scenarios where you suspect that the two arrays being copied may overlap.
As a side note, the strlen + malloc + memcpy can be replaced with strdup, which has long been a POSIX function and was added to standard C in C23, so it is reasonable to use it where available.

copy a const char* into array of char (facing a bug)

I have following method
static void setName(const char* str, char buf[16])
{
int sz = MIN(strlen(str), 16);
for (int i = 0; i < sz; i++) buf[i] = str[i];
buf[sz] = 0;
}
int main()
{
const char* string1 = "I am getting bug for this long string greater than 16 length";
char mbuf[16];
setName(string1, mbuf);
// if I use mbuf in my code it is leading to spurious characters since length is greater than 16 .
}
Please let me know what is the correct way to code above if the restriction for buf length is 16 in method static void setName(const char* str, char buf[16])
When passing an array as argument, array decays into the pointer of FIRST element of array. One must define a rule, to let the method know the number of elements.
You declare char mbuf[16], you pass it to setName(), setName() will not get char[], but will get char* instead.
So, the declaration should be
static void setName(const char* str, char* buf)
Next, char mbuf[16] can only store 15 chars, because the last char has to be 'null terminator', which is '\0'. Otherwise, the following situation will occur:
// if I use buf in my code it is leading to spurious characters since length is greater than 16 .
Perhaps this will help you understand:
char str[] = "foobar"; // = {'f','o','o','b','a','r','\0'};
So the code should be
static void setName(const char* str, char* buf)
{
int sz = MIN(strlen(str), 15); // not 16
for (int i = 0; i < sz; i++) buf[i] = str[i];
buf[sz] = '\0'; // assert that you're assigning 'null terminator'
}
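Also, I would recommend not reinventing the wheel: why not use strncpy instead? Just remember that strncpy does not add the terminator when the source is longer than the limit, so set it yourself.
char mbuf[16];
strncpy(mbuf, "12345678901234567890", 15);
mbuf[15] = '\0';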
The following code passes the size of the memory allocated to the buffer, to the setName function.
That way the setName function can ensure that it does not write outside the allocated memory.
Inside the function either a for loop or strncpy can be used. Both will be controlled by the size parameter sz and both will require that a null terminator character is placed after the copied characters. Again, sz will ensure that the null terminator is written within the memory allocated to the buffer.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void setName(const char *str, char *buf, int sz);
int main()
{
const int a_sz = 16;
char* string = "This bit is OK!! but any more than 15 characters are dropped";
/* allocate memory for a buffer & test successful allocation*/
char *mbuf = malloc(a_sz);
if (mbuf == NULL) {
printf("Out of memory!\n");
return(1);
}
/* call function and pass size of buffer */
setName(string, mbuf, a_sz);
/* print resulting buffer contents */
printf("%s\n", mbuf); // printed: This bit is OK!
/* free the memory allocated to the buffer */
free(mbuf);
return(0);
}
static void setName(const char *str, char *buf, int sz)
{
int i;
/* size of string or max 15 */
if (strlen(str) > sz - 1) {
sz--;
} else {
sz = strlen(str);
}
/* copy a maximum of 15 characters into buffer (0 to 14) */
for (i = 0; i < sz; i++) buf[i] = str[i];
/* null terminate the string - won't be more than buf[15]) */
buf[i] = '\0';
}
Changing one value const int a_sz allows different numbers of characters to be copied. There is no 'hard coding' of the size in the function, so reducing the risk of errors if the code is modified later on.
I replaced MIN with a simple if ... else structure so that I could test the code.
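For comparison, the same size-passing idea can be sketched with snprintf, which truncates to the given size and always writes the terminator as long as the size is greater than zero. The function name set_name_snprintf is made up here to avoid clashing with the versions above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies at most sz-1 characters of str into buf and always terminates it.
   sz is the total size of buf in bytes, including room for the '\0'. */
static void set_name_snprintf(const char *str, char *buf, size_t sz)
{
    assert(str != NULL && buf != NULL && sz > 0);
    snprintf(buf, sz, "%s", str);
}
```

With char mbuf[16]; the call is set_name_snprintf(string, mbuf, sizeof mbuf); so the size is never hard-coded in two places.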

memory leakage on asprintf using inside a loop in C

I have a piece of code that looks like this
#include <stdio.h>
int main()
{
int i;
int number_of_chunks = 12;
char *final_string = NULL;
for(i = 0; i < number_of_chunks; i++)
{
char *chunk = some_hash_table.pop(i);
asprintf(&final_string, "%s%s", (final_string==NULL?"":final_string), chunk);
}
free(final_string);
return 0;
}
Here I am concatenating string chunks dynamically, meaning I don't know the size of each chunk in advance. For this I am using asprintf. The code works fine, but it raises a serious memory issue. My suspicion is that asprintf allocates memory in each iteration and the code loses the previous pointer each time. If there is any other way I can concatenate strings inside a loop, please guide me.
To put your question in the simplest possible way, what you are essentially trying to do with the above code is
1. Allocate memory to a pointer continuously(in your case 12 times in the for loop) and
2. free it at the end only once, which is causing memory leak.
Like in the below code
#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
int number_of_chunks = 12;
char *final_string = NULL;
for(i = 0; i < number_of_chunks; i++)
{
/*For example: similar to what asprintf does, allocate memory to the pointer*/
final_string = malloc(1);
}
free(final_string);
return 0;
}
From the above example it is easily visible that you have allocated the memory 12 times but freed only once.
code snippet:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int i;
int number_of_chunks = 12;
char *final_string = NULL;
char *tmp = NULL;
for(i = 0; i < number_of_chunks; i++)
{
char *chunk = some_hash_table.pop(i);
asprintf(&final_string, "%s%s", (tmp==NULL?"":tmp), chunk);
if (tmp)
free(tmp);
tmp = final_string;
}
printf("%s\n", final_string);
free(final_string);
return 0;
}
Others have already pointed out that you lose the reference to all but the last allocation and that having the same string that is written to as printf argument is probably undefined behaviour, even more so as re-allocations might occur and invalidate the format argument.
You don't use asprintf's formatting capabilities, you use it only to concatenate strings, so you might want to take another approach. You could either collect the strings in an array, determine the needed length, allocate as appropriate and fill the allocated buffer with memcpy.
Or you could write a self-allocating string buffer similar to C++'s std::stringstream, for example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
struct append_t {
char *str; /* string */
size_t len; /* length of string */
size_t size; /* allocated size */
};
void append(struct append_t *app, const char *str)
{
size_t len = strlen(str);
while (app->len + len + 1 >= app->size) {
app->size = app->size ? app->size * 2 : 0x100;
app->str = realloc(app->str, app->size);
/* error handling on NULL re-allocation */
}
strcpy(app->str + app->len, str);
app->len += len;
}
int main(int argc, char **argv)
{
struct append_t app = {NULL};
for (int i = 1; i < argc; i++) {
append(&app, argv[i]);
}
if (app.str) puts(app.str);
free(app.str);
return 0;
}
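The first alternative mentioned above (collect the strings, determine the needed length, allocate once, fill with memcpy) might look roughly like this. The name join_chunks is made up for the sketch, and it assumes the chunks are already available as an array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: concatenates `count` strings into one malloc'ed
   buffer. Returns NULL on allocation failure; the caller frees the result. */
char *join_chunks(const char *const chunks[], size_t count)
{
    assert(chunks != NULL || count == 0);

    /* First pass: total length of all chunks. */
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += strlen(chunks[i]);

    char *result = malloc(total + 1);   /* +1 for the terminating '\0' */
    if (!result)
        return NULL;

    /* Second pass: copy each chunk right after the previous one. */
    char *out = result;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(chunks[i]);
        memcpy(out, chunks[i], len);
        out += len;
    }
    *out = '\0';
    return result;
}
```

There is exactly one allocation and one free, so nothing can leak between iterations.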

Copying a file line by line into a char array with strncpy

So I am trying to read a text file line by line and save each line into a char array.
From my printout in the loop I can tell it is counting the lines and the number of characters per line properly but I am having problems with strncpy. When I try to print the data array it only displays 2 strange characters. I have never worked with strncpy so I feel my issue may have something to do with null-termination.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char* argv[])
{
FILE *f = fopen("/home/tgarvin/yes", "rb");
fseek(f, 0, SEEK_END);
long pos = ftell(f);
fseek(f, 0, SEEK_SET);
char *bytes = malloc(pos); fread(bytes, pos, 1, f);
int i = 0;
int counter = 0;
char* data[counter];
int length;
int len=strlen(data);
int start = 0;
int end = 0;
for(; i<pos; i++)
{
if(*(bytes+i)=='\n'){
end = i;
length=end-start;
data[counter]=(char*)malloc(sizeof(char)*(length)+1);
strncpy(data[counter], bytes+start, length);
printf("%d\n", counter);
printf("%d\n", length);
start=end+1;
counter=counter+1;
}
}
printf("%s\n", data);
return 0;
}
Your "data[]" array is declared as an array of pointers to characters of size 0. When you assign pointers to it there is no space for them. This could cause no end of trouble.
The simplest fix would be to make a pass over the array to determine the number of lines and then do something like "char **data = malloc(number_of_lines * sizeof(char *))". Then doing assignments of "data[counter]" will work.
You're right that strncpy() is a problem -- it won't '\0' terminate the string if it copies the maximum number of bytes. After the strncpy() add "data[counter][length] = '\0';"
The printf() at the end is wrong. To print all the lines use "for (i = 0; i < counter; i++) printf("%s\n", data[i]);"
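A rough sketch of the "count first, then allocate" fix suggested above. The helper names count_lines and alloc_line_table are made up, and the sketch only counts lines that end in '\n'.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: counts '\n' bytes in the first `size` bytes. */
size_t count_lines(const char *bytes, size_t size)
{
    assert(bytes != NULL || size == 0);
    size_t lines = 0;
    for (size_t i = 0; i < size; i++)
        if (bytes[i] == '\n')
            lines++;
    return lines;
}

/* Allocates a pointer array with one slot per line and stores the line
   count in *lines_out. Returns NULL on allocation failure. */
char **alloc_line_table(const char *bytes, size_t size, size_t *lines_out)
{
    assert(lines_out != NULL);
    size_t lines = count_lines(bytes, size);
    *lines_out = lines;
    /* Ask for at least one slot so a successful result is never NULL. */
    return malloc((lines ? lines : 1) * sizeof(char *));
}
```

In the question's program this would replace the declaration of data: char **data = alloc_line_table(bytes, pos, &number_of_lines); followed by the existing loop.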
Several instances of bad juju, the most pertinent one being:
int counter = 0;
char* data[counter];
You've just declared data as a variable-length array with zero elements. Despite their name, VLAs are not truly variable; you cannot change the length of the array after allocating it. So when you execute the lines
data[counter]=(char*)malloc(sizeof(char)*(length)+1);
strncpy(data[counter], bytes+start, length);
data[counter] is referring to memory you don't own, so you're invoking undefined behavior.
Since you don't know how many lines you're reading from the file beforehand, you need to create a structure that can be extended dynamically. Here's an example:
/**
* Initial allocation of data array (array of pointer to char)
*/
char **dataAlloc(size_t initialSize)
{
char **data= malloc(sizeof *data * initialSize);
return data;
}
/**
* Extend data array; each extension doubles the length
* of the array. If the extension succeeds, the function
* will return 1; if not, the function returns 0, and the
* values of data and length are unchanged.
*/
int dataExtend(char ***data, size_t *length)
{
int r = 0;
char **tmp = realloc(*data, sizeof *tmp * 2 * *length);
if (tmp)
{
*length= 2 * *length;
*data = tmp;
r = 1;
}
return r;
}
Then in your main program, you would declare data as
char **data;
with a separate variable to track the size:
size_t dataLength = SOME_INITIAL_SIZE_GREATER_THAN_0;
You would allocate the array as
data = dataAlloc(dataLength);
initially. Then in your loop, you would compare your counter against the current array size and extend the array when they compare equal, like so:
if (counter == dataLength)
{
if (!dataExtend(&data, &dataLength))
{
/* Could not extend data array; treat as a fatal error */
fprintf(stderr, "Could not extend data array; exiting\n");
exit(EXIT_FAILURE);
}
}
data[counter] = malloc(sizeof *data[counter] * length + 1);
if (data[counter])
{
strncpy(data[counter], bytes+start, length);
data[counter][length] = 0; // add the 0 terminator
}
else
{
/* malloc failed; treat as a fatal error */
fprintf(stderr, "Could not allocate memory for string; exiting\n");
exit(EXIT_FAILURE);
}
counter++;
You are trying to print data with a format specifier %s, while your data is an array of pointers to char.
On copying a string with a size limit:
If it is available on your platform, I would suggest using
strlcpy() instead of strncpy()
size_t strlcpy( char *dst, const char *src, size_t siz);
as strncpy won't terminate the string with '\0' when the source is at least as long as the limit;
strlcpy() solves this issue.
Strings copied by strlcpy are always '\0'-terminated (as long as siz is greater than 0). Note that strlcpy is a BSD extension and not part of ISO C.
Allocate proper memory for the array data itself. In your case counter is 0 at the declaration, so the array has no elements and any access, even data[0], is out of bounds and may cause a segmentation fault.
Declaring a variable like data[counter] is bad practice. Even if counter changes later in the program, the array does not grow with it.
Hence use a char ** as stated above.
You can use your existing loop to find the number of lines first.
The last printf is wrong. You will be printing just the first line with it.
Iterate over the loop once you fix the above issue.
Change
int counter = 0;
char* data[counter];
...
int len=strlen(data);
...
for(; i<pos; i++)
...
strncpy(data[counter], bytes+start, length);
...
to
int counter = 0;
#define MAX_DATA_LINES 1024
char* data[MAX_DATA_LINES]; //1
...
for(; i<pos && counter < MAX_DATA_LINES ; i++) //2
...
strncpy(data[counter], bytes+start, length);
...
//1: provides valid storage for the pointers to lines (data[0] to data[MAX_DATA_LINES-1]). Without it you may hit a 'segmentation fault'; if you do not, you are just lucky.
//2: ensures that if the file has more than MAX_DATA_LINES lines you do not run into a 'segmentation fault', because there is no valid storage for data[MAX_DATA_LINES] and beyond.
I think that this might be a quicker implementation as you won't have to copy the contents of all the strings from the bytes array to a secondary array. You will of course lose your '\n' characters though.
It also takes into account files that don't end with a newline character. And since pos is defined as long, the index used for bytes[] and the length are long as well.
#include <stdio.h>
#include <stdlib.h>
#define DEFAULT_LINE_ARRAY_DIM 100
int main(int argc, char* argv[])
{
FILE *f = fopen("test.c", "rb");
fseek(f, 0, SEEK_END);
long pos = ftell(f);
fseek(f, 0, SEEK_SET);
char *bytes = malloc(pos+1); /* include an extra byte incase file isn't '\n' terminated */
fread(bytes, pos, 1, f);
if (bytes[pos-1]!='\n')
{
bytes[pos++] = '\n';
}
long i;
long length = 0;
int counter = 0;
size_t size=DEFAULT_LINE_ARRAY_DIM;
char** data=malloc(size*sizeof(char*));
data[0]=bytes;
for(i=0; i<pos; i++)
{
if (bytes[i]=='\n') {
bytes[i]='\0';
counter++;
if (counter>=size) {
size+=DEFAULT_LINE_ARRAY_DIM;
data=realloc(data,size*sizeof(char*));
if (data==NULL) {
fprintf(stderr,"Couldn't allocate enough memory!\n");
exit(1);
}
}
data[counter]=&bytes[i+1];
length = data[counter] - data[counter - 1] - 1;
printf("%d\n", counter);
printf("%ld\n", length);
}
}
for (i=0;i<counter;i++)
printf("%s\n", data[i]);
return 0;
}

C: creating array of strings from delimited source string

What would be an efficient way of converting a delimited string into an array of strings in C (not C++)? For example, I might have:
char *input = "valgrind --leak-check=yes --track-origins=yes ./a.out"
The source string will always have only a single space as the delimiter. And I would like a malloc'ed array of malloc'ed strings char *myarray[] such that:
myarray[0]=="valgrind"
myarray[1]=="--leak-check=yes"
...
Edit I have to assume that there are an arbitrary number of tokens in the inputString so I can't just limit it to 10 or something.
I've attempted a messy solution with strtok and a linked list I've implemented, but valgrind complained so much that I gave up.
(If you're wondering, this is for a basic Unix shell I'm trying to write.)
What about something like:
char string[] = "valgrind --leak-check=yes --track-origins=yes ./a.out"; /* writable array: strtok modifies its input */
char** args = (char**)malloc(MAX_ARGS*sizeof(char*));
memset(args, 0, sizeof(char*)*MAX_ARGS);
char* curToken = strtok(string, " \t");
for (int i = 0; curToken != NULL && i < MAX_ARGS - 1; ++i) /* keep the last slot NULL */
{
args[i] = strdup(curToken);
curToken = strtok(NULL, " \t");
}
If you have all of the input in input to begin with, then you can never have more tokens than strlen(input). If you don't allow "" as a token, then you can never have more than (strlen(input)+1)/2 tokens. So unless input is huge you can safely write the following (get_token_copy and skip_token stand for helpers you would write yourself).
char ** myarray = malloc( ((strlen(input)+1)/2) * sizeof(char*) );
int NumActualTokens = 0;
char * pToken;
while ((pToken = get_token_copy(input)) != NULL)
{
myarray[NumActualTokens++] = pToken;
input = skip_token(input);
}
myarray = (char**) realloc(myarray, NumActualTokens * sizeof(char*));
As a further optimization, you can keep input around and just replace spaces with \0 and put pointers into the input buffer into myarray[]. No need for a separate malloc for each token unless for some reason you need to free them individually.
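A minimal sketch of that in-place variant, assuming the input is a writable buffer (not a string literal) and that a single space is the only delimiter, as the question states. The name split_in_place is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `input` in place on spaces.
   `input` must be writable. Returns a malloc'ed, NULL-terminated array of
   pointers into `input`, or NULL on allocation failure. */
char **split_in_place(char *input, size_t *count)
{
    assert(input != NULL && count != NULL);

    /* At most (len + 1) / 2 non-empty tokens, plus one NULL terminator. */
    size_t max_tokens = (strlen(input) + 1) / 2;
    char **tokens = malloc((max_tokens + 1) * sizeof *tokens);
    if (!tokens)
        return NULL;

    size_t n = 0;
    char *p = input;
    while (*p) {
        if (*p == ' ') {            /* separator becomes a terminator */
            *p++ = '\0';
            continue;
        }
        tokens[n++] = p;            /* start of a token */
        while (*p && *p != ' ')     /* skip to the end of the token */
            p++;
    }
    tokens[n] = NULL;
    *count = n;
    return tokens;
}
```

Only the pointer array is allocated, so cleanup is a single free(tokens); the tokens themselves live as long as the input buffer does.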
Were you remembering to malloc an extra byte for the terminating null that marks the end of string?
From the strsep(3) manpage on OSX:
char **ap, *argv[10], *inputstring;
for (ap = argv; (*ap = strsep(&inputstring, " \t")) != NULL;)
if (**ap != '\0')
if (++ap >= &argv[10])
break;
Edited for arbitrary # of tokens:
char **ap, **argv, *inputstring;
int arglen = 10;
argv = calloc(arglen, sizeof(char*));
for (ap = argv; (*ap = strsep(&inputstring, " \t")) != NULL;)
if (**ap != '\0')
if (++ap >= &argv[arglen])
{
arglen += 10;
argv = realloc(argv, arglen * sizeof(char*));
ap = &argv[arglen-10];
}
Or something close to that. The above may not work, but if not it's not far off. Building a linked list would be more efficient than continually calling realloc, but that's really beside the point - the point is how best to make use of strsep.
Looking at the other answers, the compact code might look complex to a beginner in C, so I thought I would add this. It might be easier to actually parse the string yourself instead of using strtok, something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
char **parseInput(const char *str, int *nLen);
void resizeptr(char ***, int nLen);
int main(int argc, char **argv){
int maxLen = 0;
int i = 0;
char **ptr = NULL;
char *str = "valgrind --leak-check=yes --track-origins=yes ./a.out";
ptr = parseInput(str, &maxLen);
if (!ptr) printf("Error!\n");
else{
for (i = 0; i < maxLen; i++) printf("%s\n", ptr[i]);
}
for (i = 0; i < maxLen; i++) free(ptr[i]);
free(ptr);
return 0;
}
char **parseInput(const char *str, int *Index){
char **pStr = NULL;
const char *ptr = str;
const char *start = str; /* start of the current token */
int indx = 0;
for (;; ptr++){
/* keep walking while we are inside a token */
if (*ptr && !isspace((unsigned char)*ptr)) continue;
if (ptr > start){ /* a non-empty token ends here */
size_t len = (size_t)(ptr - start);
resizeptr(&pStr, ++indx);
if (!pStr) return NULL;
pStr[indx-1] = (char *)malloc(len + 1);
if (!pStr[indx-1]) return NULL;
strncpy(pStr[indx-1], start, len);
pStr[indx-1][len]='\0';
}
if (!*ptr) break; /* end of input */
start = ptr + 1; /* next token starts after the space */
}
*Index = indx;
return pStr;
}
void resizeptr(char ***ptr, int nLen){
/* realloc(NULL, n) behaves like malloc(n), so one call covers both cases */
char **tmp = (char **)realloc(*(ptr), nLen * sizeof(char*));
if (!tmp) perror("error!");
*(ptr) = tmp;
}
I slightly modified the code to make it easier. The only string function that I used was strncpy. Sure, it is a bit long-winded, but it reallocates the array of strings dynamically instead of using a hard-coded MAX_ARGS, which would hog memory for the full array when only 3 or 4 entries are needed. Growing with realloc keeps memory usage small. The simple parsing is done with isspace as the code iterates using the pointer. When a space is encountered, it reallocates the double pointer and mallocs a block to hold the string.
Notice how the triple pointers are used in the resizeptr function.. in fact, I thought this would serve an excellent example of a simple C program, pointers, realloc, malloc, passing-by-reference, basic element of parsing a string...
Hope this helps,
Best regards,
Tom.
