How to get a substring from a string in C?

I want to create a function in C that gets a substring from a string. This is what I have so far:
char* substr(char* src, int start, int len){
char* sub = malloc(sizeof(char)*(len+1));
memcpy(sub, &src[start], len);
sub[len] = '\0';
return sub;
}
int main(){
char* test = malloc(sizeof(char)*5); // the reason I don't use char* = "test"; is because I wouldn't be able to use free() on it then
strcpy(test, "test");
char* sub = substr(test, 1, 2); // save the substr in a new char*
free(test); // just wanted the substr from test
printf("%s\n", sub); // prints "es"
// ... free when done with sub
free(sub);
}
Is there any way I can save the substring into test without having to create a new char*? If I do test = substr(test, 1, 2), the old value of test no longer has a pointer pointing to it, so it's leaked memory (I think. I'm a noob when it comes to C languages.)

void substr(char* str, char* sub , int start, int len){
memcpy(sub, &str[start], len);
sub[len] = '\0';
}
int main(void)
{
char *test = (char*)malloc(sizeof(char)*5);
char *sub = (char*)malloc(sizeof(char)*3);
strcpy(test, "test");
substr(test, sub, 1, 2);
printf("%s\n", sub); // prints "es"
free(test);
free(sub);
return 0;
}

Well, you could always keep the address of the malloc'd memory in a separate pointer:
char* test = malloc(~~~)
char* toFree = test;
test = substr(test,1,2);
free(toFree);
But most of the work of shuffling this sort of data around has already been done in string.h, and one of those functions probably does the job you want. memmove(), as others have pointed out, could move the substring to the start of your char pointer, voilà!
If you specifically want to make a new dynamic string to play with while keeping the original separate and safe, and also want to be able to overlap these pointers... that's tricky. You could probably do it if you passed in the source and destination, range-checked the affected memory, and freed the source if there was overlap, but that seems a little over-complicated.
I'm also loath to malloc memory that I trust higher levels to free, but that's probably just me.
As an aside,
char* test = "test";
Is one of those niche cases in C. When you initialize a pointer to a string literal (stuff in quotes), the data typically lives in a special, often read-only, section of memory. Writing to it is undefined behavior (it may appear to work on some systems and crash on others), so you shouldn't, and it can't grow.
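As a rough sketch of the memmove() suggestion above: the helper below (substr_inplace is a hypothetical name, not something from string.h) shifts the substring to the front of the same buffer, so the pointer returned by malloc never changes and can still be passed to free(). It assumes str points to writable memory, so it must not be called on a string literal.

```c
#include <string.h>

/* Hypothetical helper: shrink str in place to the substring
 * [start, start + len). Out-of-range requests are clamped. */
char *substr_inplace(char *str, size_t start, size_t len)
{
    size_t n = strlen(str);

    if (start > n)
        start = n;               /* past the end: result is "" */
    if (len > n - start)
        len = n - start;         /* clamp to the characters available */

    memmove(str, str + start, len);  /* regions may overlap, hence memmove */
    str[len] = '\0';
    return str;
}
```

With the question's variables, substr_inplace(test, 1, 2) leaves "es" in test, and a single free(test) at the end releases the block.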

There are a number of ways to do this, and the way you approached it is a good one, but there are several areas where you seemed a bit confused. First, there is no need to allocate test. Simply using a pointer is fine. You could simply do char *test = "test"; in your example. No need to free it then either.
Next, when you are beginning to allocate memory dynamically, you need to always check the return to make sure your allocation succeeded. Otherwise, you can easily segfault if you attempt to write to a memory location when there has been no memory allocated.
In your substr, you should also validate the range of start and len you send to the function to ensure you are not attempting to read past the end of the string.
When dealing with only non-negative numbers, it is better to use type size_t or unsigned. There will never be a negative start or len in your code, so size_t fits the purpose nicely.
Lastly, free(NULL) is defined to do nothing, so a check such as if (sub) free (sub); is harmless but not required. Note that the check alone does not prevent freeing a block twice; to guard against that, set the pointer to NULL after freeing it.
Take a look at the following and let me know if you have questions. I changed the code to accept command line arguments for string, start and len, so the use is:
./progname the_string_to_get_sub_from start len
I hope the following helps.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* substr (char* src, size_t start, size_t len)
{
/* validate indexes */
if (start > strlen (src) || len > strlen (src) - start) { /* written this way, start + len cannot overflow */
fprintf (stderr, "%s() error: invalid substring index (start+len > length).\n", __func__);
return NULL;
}
char* sub = calloc (1, len + 1);
/* validate allocation */
if (!sub) {
fprintf (stderr, "%s() error: memory allocation failed.\n", __func__);
return NULL;
}
memcpy (sub, src + start, len);
// sub[len] = '\0'; /* by using calloc, sub is filled with 0 (null) */
return sub;
}
int main (int argc, char **argv) {
if (argc < 4 ) {
fprintf (stderr, "error: insufficient input, usage: %s string ss_start ss_length\n", argv[0]);
return 1;
}
char* test = argv[1]; /* no need to allocate test, a pointer is fine */
size_t ss_start = (size_t)atoi (argv[2]); /* convert start & length from */
size_t ss_length = (size_t)atoi (argv[3]); /* the command line arguments */
char* sub = substr (test, ss_start, ss_length);
if (sub) /* validate sub before use */
printf("\n sub: %s\n\n", sub);
if (sub) /* validate sub before free */
free(sub);
return 0;
}
Output
$ ./bin/str_substr test 1 2
sub: es
If you choose an invalid start / len combination:
$ ./bin/str_substr test 1 4
substr() error: invalid substring index (start+len > length).
Verify All Memory Freed
$ valgrind ./bin/str_substr test 1 2
==13515== Memcheck, a memory error detector
==13515== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==13515== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==13515== Command: ./bin/str_substr test 1 2
==13515==
sub: es
==13515==
==13515== HEAP SUMMARY:
==13515== in use at exit: 0 bytes in 0 blocks
==13515== total heap usage: 1 allocs, 1 frees, 4 bytes allocated
==13515==
==13515== All heap blocks were freed -- no leaks are possible
==13515==
==13515== For counts of detected and suppressed errors, rerun with: -v
==13515== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
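One caveat about the program above: atoi cannot report errors, and a negative argument cast to size_t becomes a very large value. A minimal sketch of a stricter conversion follows, assuming a hypothetical helper named parse_size (it is not part of the answer's code) built on the standard strtoul.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a non-negative decimal integer into *out.
 * Returns 0 on success, -1 on malformed, negative or out-of-range input. */
int parse_size(const char *s, size_t *out)
{
    char *end;
    unsigned long v;

    while (isspace((unsigned char)*s))
        s++;
    if (*s == '-' || *s == '\0')    /* strtoul would silently accept "-1" */
        return -1;

    errno = 0;
    v = strtoul(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;                  /* no digits, trailing junk, or overflow */
    if (v > SIZE_MAX)
        return -1;                  /* only possible if size_t is narrower */

    *out = (size_t)v;
    return 0;
}
```

In main, the two atoi calls could then be replaced by calls such as parse_size (argv[2], &ss_start), printing the usage message when -1 is returned.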

Let's break down what is being talked about:
You allocate some memory, and you create the variable test to point to it.
You allocate some more memory, and you'd like to store that pointer in the variable named test as well.
You have 2 pieces of information that you claim you'd like to store in the same pointer. You can't do this!
Solution 1
Use two variables. I don't know why this isn't acceptable...
char *input = "hello";
char *output = substr(input, 2, 3);
Solution 2
Have your input parameter not be heap memory. There are a number of ways we could do this:
// Use a string literal
char *test = substr("test", 2, 2);
// Use a stack allocated string
char s[] = "test";
char *test = substr(s, 2, 2);
Personally...
If you're already passing the length of the substring to the function, I'd rather see that function just get passed the piece of memory that it will push the data into. Something like:
char *substr(char *dst, char *src, size_t offset, size_t length) {
memcpy(dst, src + offset, length);
dst[length] = '\0';
return dst;
}
int main() {
char s[5] = "test";
char d[3] = "";
substr(d, s, 2, 2);
}
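The version above trusts the caller to make dst large enough and to pass an offset inside src. A hedged sketch of a bounded variant follows; the name substr_n and the dst_size parameter are assumptions added for illustration, not part of the answer.

```c
#include <string.h>

/* Hypothetical variant: copy at most length characters starting at
 * offset into dst, never writing more than dst_size bytes.
 * Returns dst, or NULL if dst_size is 0 or offset is past the end of src. */
char *substr_n(char *dst, size_t dst_size, const char *src,
               size_t offset, size_t length)
{
    size_t src_len = strlen(src);

    if (dst_size == 0 || offset > src_len)
        return NULL;
    if (length > src_len - offset)
        length = src_len - offset;   /* do not read past the end of src */
    if (length > dst_size - 1)
        length = dst_size - 1;       /* leave room for the terminator */

    memcpy(dst, src + offset, length);
    dst[length] = '\0';
    return dst;
}
```

A call such as substr_n(d, sizeof d, s, 2, 2) keeps the same shape as the original while letting the function truncate instead of overflowing d.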

In C, string functions quickly run into memory management. The space for the substring needs to exist: either the caller provides it and passes it to the function, or the function allocates it.
const char source[] = "Test";
size_t start, length;
char sub1[sizeof source];
substring1(source, sub1, start, length);
// or
char *sub2 = substring2(source, start, length);
...
free(sub2);
Code needs to specify what happens when 1) the start index is greater than the original string's length and 2) the length similarly extends past the end of the original string. These are 2 important steps not done in OP's code.
void substring1(const char *source, char *dest, size_t start, size_t length) {
size_t source_len = strlen(source);
if (start > source_len) start = source_len;
if (length > source_len - start) length = source_len - start; /* start <= source_len here, so no overflow */
memmove(dest, &source[start], length);
dest[length] = 0;
}
char *substring2(const char *source, size_t start, size_t length) {
size_t source_len = strlen(source);
if (start > source_len) start = source_len;
if (length > source_len - start) length = source_len - start; /* start <= source_len here, so no overflow */
char *dest = malloc(length + 1);
if (dest == NULL) {
return NULL;
}
memcpy(dest, &source[start], length);
dest[length] = 0;
return dest;
}
By using memmove() rather than memcpy() in substring1(), code can use the same buffer as both source and destination, provided that buffer is modifiable (not const, as source is declared above). memmove() is well defined even if the buffers overlap.
substring1(source, source, start, length);

Related

C programming problem in dynamic memory allocation

The problem should be simple, but I have spent hours on this and cannot see what is wrong in my logic. The output works as it should, but Valgrind prints memory issues that should be fixed. I have added the origdest = (char*)realloc(origdest, strlen(origdest) + i * sizeof(char)); code to the while loop, my question is why doesn't this dynamically adjust the memory? The exact error given by Valgrind is
==9== Invalid write of size 1
==9== at 0x1087E2: mystrcat (mystrcat.c:18)
==9== by 0x10883C: main (mystrcat.c:34)
==9== Address 0x522d046 is 6 bytes inside a block of size 7 free'd
==9== at 0x4C31D2F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9== by 0x1087C2: mystrcat (mystrcat.c:17)
==9== by 0x10883C: main (mystrcat.c:34)
==9== Block was alloc'd at
==9== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9== by 0x108811: main (mystrcat.c:31)
char *mystrcat(char *dest, const char *src)
{
char *origdest = dest;
while(*dest) {
dest++;
}
int i = 1;
while (*src) {
origdest = (char*)realloc(origdest, strlen(origdest) + i * sizeof(char));
*dest++ = *src++; // Copies character and increases/moves pointer
i++;
}
*dest = 0;
return origdest;
}
int main(void)
{
char *str = malloc(7);
strcpy(str, "Mydogs");
str = mystrcat(str, "arecool");
printf("%s\n", str);
free(str);
}

This Valgrind message:
Address 0x522d046 is 6 bytes inside a block of size 7 free'd
is saying that the realloc() call that follows the segment below leaves the old pointer, dest, pointing into freed memory:
char *origdest = dest;
while(*dest) {
dest++;
}
EDIT to address comment "what is specifically wrong with the code and what could be changed to make it work?"
The explanation of my first observation above is that dest is advanced to the end of the original block before realloc() is called. If realloc() moves the block, origdest receives the new address, but dest still points into the old, now freed, block, so every write through dest is invalid.
Your stated goal here is to create a version of strcat(), so using realloc() is a reasonable approach, but to use it safely allocate the new memory into a temporary buffer first, then if allocation fails, the original memory location still exists, and can be freed.
One other small change that makes a big difference is how i is initialized. In the adaptation below, i is the offset from the end of the first string, so it must start at 0. If 1 were used, the second string would begin one position further along in memory, leaving the \0 character just after the first string in place and effectively making it the end of the resultant string, i.e. you would never see the appended part of the string.
In memory it would look like this:
|M|y|d|o|g|s|\0|a|r|e|c|o|o|l|
Placing the final nul terminator at the end of the concatenated buffer would then overflow the buffer by one byte, resulting in undefined behavior.
The following adaptation of your code illustrates these, along with some other simplifications:
char *mystrcat(char *dest, const char *src)
{
char *temp = NULL;
int i = 0;//changed from i = 1 as first location to
//copy to is len1, not len1 + 1
//Note, starting at len1 + 1 would leave a nul character
//after "Mydogs", effectively ending the string
//the following are simplifications for use in realloc()
int len1 = strlen(dest);
int len2 = strlen(src);
//call once rather than in a loop. It is more efficient.
temp = realloc(dest, len1+len2+1);//do not cast return of realloc
if(!temp)
{
//handle error
return NULL;
}
dest = temp;
while(*src)
{
dest[len1 + i] = *src;
i++;
src++;
}
dest[len1 + i] = 0;//add null terminator
return dest;
}
int main(void)
{
char *temp = NULL;
char *str = malloc(7);
if(str)//always a good idea to test pointer before using
{
strcpy(str, "Mydogs");
temp = mystrcat(str, "arecool");
if(!temp)
{
free(str);
printf("memory allocation error, leaving early");
return 0;
}
str = temp;
printf("%s\n", str);
free(str);
}
return 0;
}
See also: why casting the return of calloc(), malloc() or realloc() is unnecessary in C.

Here you move to the end of the original string:
while(*dest)
dest++;
Here you allocate some new memory, but dest still points to the end of the original string. So you are overwriting memory after the end of the original string. Since you are reallocating, the original string may not even exist anymore at the previous location you are writing to, because realloc can move the data to a completely new location.
while (*src)
{
origdest = (char*)realloc(origdest, strlen(origdest) + i * sizeof(char));
*dest++ = *src++; // Copies character and increases/moves pointer
i++;
}
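One way to avoid a stale pointer across realloc is to remember an offset rather than a pointer, and to resize once instead of once per character. The following is only a sketch under that assumption; append_str is a hypothetical name, and src must not point into *dest.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append src to the heap string *dest, growing it
 * with realloc. Returns 0 on success, or -1 on allocation failure, in
 * which case *dest is unchanged and still valid. */
int append_str(char **dest, const char *src)
{
    size_t old_len = strlen(*dest);           /* an offset survives a move */
    size_t add_len = strlen(src);
    char *tmp = realloc(*dest, old_len + add_len + 1);

    if (tmp == NULL)
        return -1;

    memcpy(tmp + old_len, src, add_len + 1);  /* copies the '\0' too */
    *dest = tmp;
    return 0;
}
```

Because the write position is computed from the pointer realloc returns, it stays correct whether or not the block was moved.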

Loading strings divided by delimiter to array

I have an array that needs to be filled with values from a string looking like this:
value0;value1;value2;value3;\n
I tried using strtok() but couldn't really figure out how to properly load more than 2 elements into table.
Desirable output is something like
arrayValues[0] = value0;
arrayValues[1] = value1;
etc.

You need to use strtok() and realloc(). Both are a bit difficult to use.
char input[] = "value0;value1;value2;value3\n";
char **arrayValues = NULL;
int i, N = 0;
char *token = strtok(input, ";");
while(token != NULL)
{
/* grow through a temporary so the old block is not lost on failure */
void *tmp = realloc(arrayValues, (N + 1) * sizeof(char *));
if(!tmp)
break; /* out of memory - very unlikely to happen */
arrayValues = tmp;
arrayValues[N++] = strdup(token);
token = strtok(NULL, ";");
}
/* print out to check */
for(i=0;i<N;i++)
printf("***%s***\n", arrayValues[i]);
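Note that the delimiter ';' is overwritten. If you want to retain it as you specified, you'll have to add it back to the end of each string, which is fiddly and probably not what you really want. Also note that the final token keeps its trailing '\n', because only ';' is used as a delimiter.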

In its simplest form, if the string you need to separate will remain in scope during the time you are making use of the individual tokens, then there is no need to allocate. Simply declare an array of pointers with a sufficient number of pointers for the tokens you have, and as you tokenize your string, just assign the address of the beginning of each token to the pointers in your array. That way, the pointers in your array simply point to the place within the original string where each of your tokens is found. Example:
#include <stdio.h>
#include <string.h>
#define MAXS 16
int main (void) {
char str[] = "value0;value1;value2;value3;\n",
*array[MAXS] = {NULL}, /* array of pointers */
*p = str, /* pointer to str */
*delim = ";\n"; /* delimiters */
int i, n = 0; /* loop var & index - n */
p = strtok (p, delim); /* get 1st token */
while (n < MAXS && p) { /* check bounds/validate token */
array[n++] = p; /* add pointer to array */
p = strtok (NULL, delim); /* get next token */
}
for (i = 0; i < n; i++) /* output tokens */
printf ("array[%2d] : %s\n", i, array[i]);
return 0;
}
(note: strtok modifies the original string by placing nul-terminating characters (e.g. '\0') in place of the delimiters. If you need to preserve the original string, make a copy before calling strtok)
Note above, you are limited to a fixed number of pointers, so while you are separating the tokens and assigning them to pointers in your array, you need to check the number against your array bounds to prevent writing beyond the end of your array.
Example Use/Output
$ ./bin/parsestrstrtok
array[ 0] : value0
array[ 1] : value1
array[ 2] : value2
array[ 3] : value3
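As the note above says, strtok overwrites the delimiters in the string it is given. A minimal sketch of tokenizing a private copy so the caller's string stays intact follows; count_tokens is a hypothetical helper that only counts tokens, since the copy is freed before it returns.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the tokens in str without modifying it,
 * by tokenizing a private copy. Returns -1 if the copy cannot be allocated. */
int count_tokens(const char *str, const char *delim)
{
    size_t len = strlen(str) + 1;
    char *copy = malloc(len);
    int n = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, str, len);              /* strtok will write into copy */

    for (char *p = strtok(copy, delim); p != NULL; p = strtok(NULL, delim))
        n++;

    free(copy);                          /* tokens are invalid after this */
    return n;
}
```

If the tokens themselves are needed afterwards, the copy has to stay alive for as long as the pointers into it are used.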
Taking the parsing to the next step, where your original string may not remain in scope during the time your array values are needed, you simply need to allocate storage for each token and copy each token to your newly allocated memory and assign the starting address for each new block to the pointers in your array. That way, even if you pass your array of pointers and the string to a function for parsing, the array values remain available after the function completes until you free the memory you have allocated.
You are still limited to a fixed number of pointers, but your array is now usable wherever required in your program. The additions required for this are minimal. Note, malloc and strcpy are used below and can be replaced by a single call to strdup. However, since strdup is not part of all versions of C, malloc and strcpy are used instead. (but note, strdup does make for a very convenient replacement if your compiler supports it)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXS 16
int main (void) {
char str[] = "value0;value1;value2;value3;\n",
*array[MAXS] = {NULL}, /* array of pointers */
*p = str, /* pointer to str */
*delim = ";\n"; /* delimiters */
int i, n = 0; /* loop var & index - n */
p = strtok (p, delim); /* get 1st token */
while (n < MAXS && p) { /* check bounds/validate token */
/* allocate/validate storage for token */
if (!(array[n] = malloc (strlen (p) + 1))) {
perror ("malloc failed");
exit (EXIT_FAILURE);
}
strcpy (array[n++], p); /* copy token to array */
p = strtok (NULL, delim); /* get next token */
}
for (i = 0; i < n; i++) { /* output tokens */
printf ("array[%2d] : %s\n", i, array[i]);
free (array[i]); /* free memory for tokens */
}
return 0;
}
(output is the same)
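For compilers without strdup, the malloc and strcpy pair used above can be wrapped once and reused. The helper below is a sketch with a hypothetical name, dup_string, chosen to avoid clashing with a library strdup where one exists.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: returns a heap copy of s that the
 * caller must free, or NULL if allocation fails. */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;      /* include the terminating '\0' */
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

In the loop above, the allocation and copy would then reduce to checking the result of array[n] = dup_string (p) before incrementing n.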
Finally, you can eliminate your dependency on a fixed number of pointers by dynamically allocating the pointers and reallocating the pointers on an as needed basis. You can start with the same number, and then allocate twice the current number of pointers when your current supply is exhausted. It is simply one additional level of allocation before you start parsing, and a requirement to realloc when you have used all the pointers at hand. Example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXS 16
int main (void) {
char str[] = "value0;value1;value2;value3;\n",
**array = NULL, /* pointer to pointer to char */
*p = str, /* pointer to str */
*delim = ";\n"; /* delimiters */
int i, n = 0, /* loop var & index - n */
nptrs = MAXS; /* number allocated pointers */
/* allocate/validate an initial number of pointers for array */
if (!(array = malloc (nptrs * sizeof *array))) {
perror ("malloc pointers failed");
exit (EXIT_FAILURE);
}
p = strtok (p, delim); /* get 1st token */
while (p) { /* validate token */
/* allocate/validate storage for token */
if (!(array[n] = malloc (strlen (p) + 1))) {
perror ("malloc failed");
exit (EXIT_FAILURE);
}
strcpy (array[n++], p); /* copy token to array */
if (n == nptrs) { /* pointer limit reached */
/* realloc 2X number of pointers/validate */
void *tmp = realloc (array, nptrs * 2 * sizeof *array);
if (!tmp) {
perror ("realloc - pointers");
goto memfull; /* don't exit, array has original values */
}
array = tmp; /* assign new block to array */
nptrs *= 2; /* update no. allocated pointers */
}
p = strtok (NULL, delim); /* get next token */
}
memfull:;
for (i = 0; i < n; i++) { /* output tokens */
printf ("array[%2d] : %s\n", i, array[i]);
free (array[i]); /* free memory for tokens */
}
free (array); /* free memory for pointers */
return 0;
}
Note: you should validate your memory use with a memory use and error checking program like valgrind on Linux. There are similar tools for every platform. Just run your code through the checker and validate that there are no memory errors and that all memory you have allocated has been properly freed.
Example:
$ valgrind ./bin/parsestrstrtokdbl
==15256== Memcheck, a memory error detector
==15256== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==15256== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==15256== Command: ./bin/parsestrstrtokdbl
==15256==
array[ 0] : value0
array[ 1] : value1
array[ 2] : value2
array[ 3] : value3
==15256==
==15256== HEAP SUMMARY:
==15256== in use at exit: 0 bytes in 0 blocks
==15256== total heap usage: 5 allocs, 5 frees, 156 bytes allocated
==15256==
==15256== All heap blocks were freed -- no leaks are possible
==15256==
==15256== For counts of detected and suppressed errors, rerun with: -v
==15256== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Note: above you see 5 allocations (1 for the pointers and 1 for each token). All memory has been freed and there are no memory errors.
There are probably a dozen more approaches you can take to inch-worm down your string picking out tokens, but this is the general progression of how to expand on the approach using strtok. Let me know if you have any further questions.

You can just use the good old strchr function to hunt for the ';' terminator of each substring, and malloc and realloc for memory allocation.
Note that the input str is modified (reused): each ';' in it is replaced by '\0'.
(If you need str untouched, then allocate another buffer, copy str to it, and point the p1 and p2 pointers to it.)
The arrayValues holds pointers to the substrings:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int main ()
{
char **arrayValues = malloc(sizeof(char *)); // allocate memory for string pointers
char str[] = "value1;value2;value3\n"; // input
char *p1 = str; // init pointer helpers
char *p2 = str;
int n = 0; // substring counter
// OPTIONAL if you want to get rid of ending '\n'
//--s
size_t len = strlen(str);
if(len>0)
if(str[len-1] == '\n')
str[len-1] = 0;
//--ee
while(p1 != NULL)
{
p1 = strchr(p1,';'); // find ';'
if(p1 != NULL)
{
arrayValues[n] = p2; // beginning of the substring
*p1 = 0; // terminate the substring string; get rid of ';'
n++; // count the substrings
arrayValues = realloc( arrayValues, (n+1) * sizeof(char *)); // allocate more memory for next pointer
p2 = p1+1; // move the pointer after the ';'
p1 = p1+1; // we start the search for next ';'
}
else
{
arrayValues[n] = p2; // this is the last (or first) substring
n++;
}
} // while
// Output:
for (int j=0; j<n; j++)
{
printf("%s \n", arrayValues[j]);
}
printf("------");
free(arrayValues);
return 0;
}
Output:
value1
value2
value3
------

Function to split string sometimes gives segmentation fault

I have the following function to split a string. Most of the time it works fine, but sometimes it randomly causes a segmentation fault.
char** splitString(char* string, char* delim){
int count = 0;
char** split = NULL;
char* temp = strtok(string, delim);
while(temp){
split = realloc(split, sizeof(char*) * ++count);
split[count - 1] = temp;
temp = strtok(NULL, " ");
}
int i = 0;
while(split[i]){
printf("%s\n", split[i]);
i++;
}
split[count - 1][strlen(split[count - 1]) - 1] = '\0';
return split;
}

You have a number of subtle issues, not the least of which is that your function will segfault if you pass a string literal. You need to make a copy of the string you will be splitting, as strtok modifies the string. If you pass a string literal (typically stored in read-only memory), your compiler has no way of warning you unless you have declared the parameter as const char *string.
To avoid these problems, simply make a copy of the string you will tokenize. That way, regardless of how the string you pass to the function was declared, you avoid the problem altogether.
You should also pass a pointer to size_t as a parameter to your function in order to make the number of tokens available back in the calling function. That way you do not have to leave a sentinel NULL as the final pointer in the pointer to pointer to char you return. Just pass a pointer and update it to reflect the number of tokens parsed in your function.
Putting those pieces together, and cleaning things up a bit, you could use the following to do what you are attempting to do:
char **splitstr (const char *str, char *delim, size_t *n)
{
char *cpy = strdup (str), *p = cpy; /* copy of str & pointer */
char **split = NULL; /* pointer to pointer to char */
*n = 0; /* zero 'n' */
for (p = strtok (p, delim); p; p = strtok (NULL, delim)) {
void *tmp = realloc (split, sizeof *split * (*n + 1));
if (!tmp) { /* validate realloc succeeded */
fprintf (stderr, "splitstr() error: memory exhausted.\n");
break;
}
split = tmp; /* assign tmp to split */
split[(*n)++] = strdup (p); /* allocate/copy to split[n] */
}
free (cpy); /* free cpy */
return split; /* return split */
}
Adding a short example program, you could do the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **splitstr (const char *str, char *delim, size_t *n)
{
char *cpy = strdup (str), *p = cpy; /* copy of str & pointer */
char **split = NULL; /* pointer to pointer to char */
*n = 0; /* zero 'n' */
for (p = strtok (p, delim); p; p = strtok (NULL, delim)) {
void *tmp = realloc (split, sizeof *split * (*n + 1));
if (!tmp) { /* validate realloc succeeded */
fprintf (stderr, "splitstr() error: memory exhausted.\n");
break;
}
split = tmp; /* assign tmp to split */
split[(*n)++] = strdup (p); /* allocate/copy to split[n] */
}
free (cpy); /* free cpy */
return split; /* return split */
}
int main (void) {
size_t n = 0; /* number of strings */
char *s = "My dog has fleas.", /* string to split */
*delim = " .\n", /* delims */
**strings = splitstr (s, delim, &n); /* split s */
for (size_t i = 0; i < n; i++) { /* output results */
printf ("strings[%zu] : %s\n", i, strings[i]);
free (strings[i]); /* free string */
}
free (strings); /* free pointers */
return 0;
}
Example Use/Output
$ ./bin/splitstrtok
strings[0] : My
strings[1] : dog
strings[2] : has
strings[3] : fleas
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use: just run your program through one of them.
$ valgrind ./bin/splitstrtok
==14471== Memcheck, a memory error detector
==14471== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14471== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==14471== Command: ./bin/splitstrtok
==14471==
strings[0] : My
strings[1] : dog
strings[2] : has
strings[3] : fleas
==14471==
==14471== HEAP SUMMARY:
==14471== in use at exit: 0 bytes in 0 blocks
==14471== total heap usage: 9 allocs, 9 frees, 115 bytes allocated
==14471==
==14471== All heap blocks were freed -- no leaks are possible
==14471==
==14471== For counts of detected and suppressed errors, rerun with: -v
==14471== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

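split[count - 1][strlen(split[count - 1]) - 1] = '\0';
only removes the last character of the final token; it does not mark the end of the array. The loop while(split[i]) relies on a NULL pointer after the last token, and split has no such element, so the loop reads past the end of the allocated block. Allocate one extra pointer in the realloc call (sizeof(char*) * (count + 1)) and, before the while loop, add
split[count] = NULL;
so that the while can stop when it reaches NULL. (If no tokens are found, split itself is still NULL and must be checked before it is used.)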

The function strtok() is not reentrant. If that matters (for example with nested tokenizing or multiple threads), use strtok_r(), the reentrant version of strtok() defined by POSIX.
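A short sketch of where the reentrant version helps: tokenizing inside a tokenizing loop. The function count_words and its ';' and space delimiters are assumptions made up for this illustration, and strtok_r is a POSIX function, so the feature-test macro may be needed on some systems.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r */
#include <string.h>

/* Hypothetical example: count the words in a buffer of ';'-separated
 * records, where each record holds space-separated words.
 * buf is modified, as with strtok. */
size_t count_words(char *buf)
{
    size_t words = 0;
    char *rec_state, *word_state;

    for (char *rec = strtok_r(buf, ";", &rec_state); rec != NULL;
         rec = strtok_r(NULL, ";", &rec_state)) {
        /* the inner loop keeps its own state, so it does not
         * disturb the outer tokenization */
        for (char *w = strtok_r(rec, " ", &word_state); w != NULL;
             w = strtok_r(NULL, " ", &word_state))
            words++;
    }
    return words;
}
```

With plain strtok, the inner loop would overwrite the single hidden state and the outer loop would stop after the first record.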

how to connect/link words in an string array using one loop?

I have an array with n words, and I want to attach the strings together.
For example, if the array has the following strings: "hello" "world" "stack77"
I want the function to return "helloworldstack77". Any help on how I can do something like this without recursion and with one loop? From the string library I can only use the two functions strcpy and strlen!
Any ideas? Thanks.
I NEED TO USE ONE LOOP ONLY!
char *connect(char**words,int n){
int i=0;
while(words){
strcpy(words+i,
I saw many solutions, but they all use other string functions, whereas I only want to use strcpy and strlen.

If only the two standard string functions mentioned may be used, then the function can look as shown in the demonstrative program below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * connect( char **words, size_t n )
{
size_t length = 0;
for ( size_t i = 0; i < n; i++ ) length += strlen( words[i] );
char *s = malloc( length + 1 );
size_t pos = 0;
for ( size_t i = 0; i < n; i++ )
{
strcpy( s + pos, words[i] );
pos += strlen( words[i] );
}
s[pos] = '\0';
return s;
}
int main( void )
{
char * s[] = { "Hello", " ", "World" };
char *p = connect( s, sizeof( s ) / sizeof( *s ) );
puts( p );
free( p );
}
The program output is
Hello World
If only one loop may be used, then the function can look the following way:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * connect( char **words, size_t n )
{
char *s = calloc( 1, sizeof( char ) );
if ( s != NULL )
{
size_t pos = 0;
for ( size_t i = 0; s != NULL && i < n; i++ )
{
size_t length = strlen( words[i] );
char *tmp = realloc( s, pos + length + 1 );
if ( tmp != NULL )
{
s = tmp;
strcpy( s + pos, words[i] );
pos += length;
}
else
{
free( s );
s = NULL;
}
}
}
return s;
}
int main( void )
{
char * s[] = { "Hello", " ", "World" };
char *p = connect( s, sizeof( s ) / sizeof( *s ) );
if ( p != NULL ) puts( p );
free( p );
}

In addition to using strcpy, you can also use sprintf. Each of the functions in the printf family returns the number of characters actually output allowing you to compute an offset in your final string without an additional function call. Now, there is nothing wrong with using a strcpy/strlen approach, and in fact, that is probably the preferred approach, but be aware that there are always multiple ways of doing things within the parameters you have given. Also note that the printf family offers a wealth of formatting benefits in the event you would need to include additional information along with the concatenation of strings.
For example, using sprintf to concatenate each string while saving the number of characters in each nc as the offset for writing the next string to the resulting buffer buf, while using a ternary operator to control the addition of a space between the words based on your loop counter, you could do something similar to the following:
char *compress (char **p, int n)
{
char *buf = NULL; /* buffer to hold concatenated string */
size_t total = 0; /* total number of characters required */
int nc = 0; /* number of chars added (counter) */
for (int i = 0; i < n; i++) /* get total required length */
total += strlen (p[i]) + 1; /* including spaces between */
if (!(buf = malloc (total + 1))) /* allocate/validate mem */
return buf; /* return NULL on error */
for (int i = 0; i < n; i++) /* add each word to buf, save nc */
nc += sprintf (buf + nc, i ? " %s" : "%s", p[i]);
*(buf + nc) = 0; /* affirmatively nul-terminate buf */
return buf;
}
Note: each memory allocation with malloc, calloc or realloc should be validated to ensure it succeeds, and the error handled in the event of failure (here NULL is returned if allocation fails).
Putting that together in a short example, you could do something similar to the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *compress (char **p, int n)
{
char *buf = NULL; /* buffer to hold concatenated string */
size_t total = 0; /* total number of characters required */
int nc = 0; /* number of chars added (counter) */
for (int i = 0; i < n; i++) /* get total required length */
total += strlen (p[i]) + 1; /* including spaces between */
if (!(buf = malloc (total + 1))) /* allocate/validate mem */
return buf; /* return NULL on error */
for (int i = 0; i < n; i++) /* add each word to buf, save nc */
nc += sprintf (buf + nc, i ? " %s" : "%s", p[i]);
*(buf + nc) = 0; /* affirmatively nul-terminate buf */
return buf;
}
int main (void) {
char *sa[] = { "My", "dog", "has", "too many", "fleas." },
*result = compress (sa, sizeof sa/sizeof *sa);
if (result) { /* check return */
printf ("result: '%s'\n", result); /* print string */
free (result); /* free memory */
}
return 0;
}
Example Use/Output
$ ./bin/strcat_sprintf
result: 'My dog has too many fleas.'
Memory/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for most platforms. They are all simple to use; just run your program through the checker.
$ valgrind ./bin/strcat_sprintf
==27595== Memcheck, a memory error detector
==27595== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==27595== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==27595== Command: ./bin/strcat_sprintf
==27595==
result: 'My dog has too many fleas.'
==27595==
==27595== HEAP SUMMARY:
==27595== in use at exit: 0 bytes in 0 blocks
==27595== total heap usage: 1 allocs, 1 frees, 28 bytes allocated
==27595==
==27595== All heap blocks were freed -- no leaks are possible
==27595==
==27595== For counts of detected and suppressed errors, rerun with: -v
==27595== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any additional questions.
compress with a Single Loop
As mentioned in the comment to MarianD's answer, you can use a single loop, reallocating your buffer with the addition of each word to the final string, but that is less efficient than getting the total number of characters required and then allocating once. However, there are many occasions where that is exactly what you will be required to do. Basically, you get the length of each word and then allocate memory for that word (plus the space between it and the next word, or the nul-byte) using realloc instead of malloc (or calloc). realloc acts just like malloc for the first allocation (when buf is NULL); thereafter it resizes the buffer while preserving its current contents.
note: never realloc the buffer directly (e.g. buf = realloc (buf, newsize);); instead, always use a temporary pointer. Why? If realloc fails, it returns NULL, and assigning that to buf overwrites your only reference to the original block (it results in buf = NULL;). The original block is not freed, so its address is lost and you have created a memory leak.
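In isolation, the temporary-pointer idiom might look like the following minimal sketch. The helper name grow_buffer and its return convention are assumptions made for illustration; they are not part of the answer's code. It also assumes newsize is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper (illustration only): resize *bufp to newsize bytes.
 * On success *bufp is updated and 1 is returned. On failure *bufp is left
 * untouched, so the caller can still use or free the original block, and
 * 0 is returned. Assumes newsize > 0. */
int grow_buffer (char **bufp, size_t newsize)
{
    void *tmp = realloc (*bufp, newsize);   /* do not assign to *bufp yet */

    if (!tmp)               /* realloc failed: original block still valid */
        return 0;

    *bufp = tmp;            /* only now is it safe to replace the pointer */
    return 1;
}
```

A caller would start with char *buf = NULL; and call grow_buffer (&buf, newsize), freeing buf itself if the call reports failure.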
Putting that together, you could do something like the following:
char *compress (char **p, int n)
{
char *buf = NULL; /* buffer to hold concatenated string */
size_t bufsz = 0; /* current allocation size for buffer */
int nc = 0; /* number of chars added (counter) */
for (int i = 0; i < n; i++) { /* add each word to buf */
size_t len = strlen (p[i]) + 1; /* get length of word */
void *tmp = realloc (buf, bufsz + len); /* realloc buf */
if (!tmp) /* validate reallocation */
return buf; /* return current buffer */
buf = tmp; /* assign reallocated block to buffer */
bufsz += len; /* increment bufsz to current size */
nc += sprintf (buf + nc, i ? " %s" : "%s", p[i]);
}
if (buf) /* buf is still NULL when n is 0 */
*(buf + nc) = 0; /* affirmatively nul-terminate buf */
return buf;
}
Memory/Error Check
$ valgrind ./bin/strcat_sprintf_realloc
==28175== Memcheck, a memory error detector
==28175== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==28175== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==28175== Command: ./bin/strcat_sprintf_realloc
==28175==
result: 'My dog has too many fleas.'
==28175==
==28175== HEAP SUMMARY:
==28175== in use at exit: 0 bytes in 0 blocks
==28175== total heap usage: 5 allocs, 5 frees, 68 bytes allocated
==28175==
==28175== All heap blocks were freed -- no leaks are possible
==28175==
==28175== For counts of detected and suppressed errors, rerun with: -v
==28175== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
note: now there are 5 allocations instead of 1.
Let me know if you have any questions.

Returning a character array from a function in c

Can I return an array that is created dynamically (using malloc) inside a function back to its caller?
I know that returning a locally declared (stack-allocated) array is wrong because the stack unwinds as the function returns and the variable is no longer valid, but what about a dynamically allocated variable?

Returning anything allocated with malloc is fine, as long as whoever uses your function takes care of free()ing it when they're done. malloc allocates on the heap which is essentially global within your program.
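As a minimal sketch of that ownership contract, the function below allocates, fills, and returns a string, and leaves the free() to the caller. The name make_filled and its parameters are hypothetical, chosen only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (illustration only): build a string of n copies of c
 * on the heap. Returns NULL if allocation fails. The caller owns the result
 * and must free() it. */
char *make_filled (size_t n, char c)
{
    char *s = malloc (n + 1);       /* +1 for the nul-terminator */

    if (!s)
        return NULL;

    memset (s, c, n);               /* fill the n visible characters */
    s[n] = '\0';
    return s;                       /* still valid after return: heap memory */
}
```

The caller would check the result against NULL, use the string, and then pass the same pointer to free() when done.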

As others have noted, you can in fact return a char pointer.
However, another common method is for the caller to pass in a pointer to a buffer for the function to fill, along with a length parameter. That way the code that allocates the memory is also the code that frees it, which can make memory leaks easier to spot. This is the approach taken by functions such as snprintf and strncpy.
/* Performs a reverse strcpy. Returns the number of characters copied
* (excluding the terminator) if dst is large enough, or the negated
* buffer size required (including the terminator) if dst is too small
* to hold the copy. */
int rev_strcpy(char *dst, const char *src, unsigned int dst_len) {
unsigned int src_len = strlen(src); /* assumes src is in fact nul-terminated */
int i,j;
if (src_len+1 > dst_len) {
return -(int)(src_len+1); /* +1 for terminating nul */
}
i = 0;
j = src_len-1;
while (i < src_len) {
dst[i] = src[j]; /* copy from the end of src toward its start */
++i;
--j; /* walk src backwards */
}
dst[src_len] = '\0';
return src_len;
}
void random_function() {
unsigned int buf_len;
char *buf;
int len;
const char *str = "abcdefg";
buf_len = 4;
buf = malloc(buf_len * sizeof(char));
if (!buf) {
/* fail hard, log, whatever you want */
return;
}
/* ...whatever randomness this function needs to do */
len = rev_strcpy(buf, str, buf_len);
if (len < 0) {
/* allocate a buffer large enough and try again */
free(buf);
buf_len = -len;
buf = malloc(buf_len * sizeof(char)); /* element size, not pointer size */
if (!buf) {
/* fail hard, log, whatever you want */
return;
}
len = rev_strcpy(buf, str, buf_len); /* pass the allocated length */
}
/* ... the rest of the randomness this function needs to do */
/* random_function has allocated the memory, random_function frees the memory */
free(buf);
}
This can lead to some overhead though if you don't know how big a buffer you'll need and need to call the function twice, but often the caller has a good idea to how large the buffer needs to be. Also it requires a little more logic to ensure the function doesn't overrun the given buffer. But it keeps the responsibility of freeing the memory with whatever is allocating the memory, while also allowing the option to pass local stack memory.
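The size-then-fill pattern and the stack-buffer option can be sketched with snprintf itself, which returns the length the complete output would need even when it has to truncate. The wrapper name format_pair and the key=value format are assumptions made only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper (illustration only): write "<key>=<value>" into dst,
 * using at most dst_len bytes (nul-terminated whenever dst_len > 0).
 * Returns the length the full string needs, excluding the terminator, so
 * the caller can detect truncation by checking result >= dst_len. */
int format_pair (char *dst, size_t dst_len, const char *key, const char *value)
{
    return snprintf (dst, dst_len, "%s=%s", key, value);
}
```

A caller can first try a local array, for example char small[8]; with format_pair (small, sizeof small, key, value), and only if the returned length does not fit, allocate returned length + 1 bytes with malloc and call the function a second time. Either way the buffer is created and released by the caller.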
Example just returning the char*:
/* Performs a reverse strcpy. Returns char buffer holding reverse copy of
* src, or NULL if memory could not be allocated. Caller is responsible
* to free memory. */
char* rev_strcpy(const char *src) {
unsigned int src_len = strlen(src); /* assumes src is in fact nul-terminated */
char *dst;
int i,j;
dst = malloc((src_len+1) * sizeof(char));
if (!dst) {
return NULL;
}
i = 0;
j = src_len-1;
while (i < src_len) {
dst[i] = src[j];
++i;
--j; /* walk src backwards */
}
dst[src_len] = '\0';
return dst;
}
void random_function() {
char *buf;
const char *str = "abcdefg";
/* ...whatever randomness this function needs to do */
buf = rev_strcpy(str);
if (!buf) {
/* fail hard, log, whatever you want */
return;
}
/* ... the rest of the randomness this function needs to do */
/* random_function frees the memory that was allocated by rev_strcpy */
free(buf);
}

Yes you can. Just malloc() the array inside your function and return the pointer.
BUT, the caller needs to understand it needs to be freed at some point, or you'll have a memory leak.

You can certainly return an array allocated with malloc, but you have to make sure the caller of the function eventually releases the array with free; if you don't free a malloc'd array, the memory remains "in use" until program exit.
