Concatenating two strings : C - c

Consider the code below for concatenating two char arrays with a delimiter:
void addStrings(char* str1,char* str2,char del)
{
//str1=str1+str2
int len1=strlen(str1);
int len2=strlen(str2);
int i=0;
//char* temp=(char*) malloc((len1+1)*sizeof(char));
//strcpy(temp,str1);
str1=(char*) realloc(str1,(len1+len2+1)*sizeof(char));
printf("Here--%d\n",strlen(str1));
*(str1+len1)=del; //adding delimiter
for(i=0;i<=len2;i++)
*(str1+len1+i+1)=*(str2+i);
printf("Concatenated String: %s\n",str1);
i=0;
while( *(str1+i) != '\0')
{
printf("~~%d:%c\n",i,*(str1+i));
i++;
}
}
When I run this function with addStrings("A","test",'#'); the code crashes at realloc. Below is the gdb output:
Breakpoint 3, addStrings (str1=0x40212f <_data_start__+303> "A", str2=0x40212a <_data_start__+298> "test",
del=64 '#') at string.c:34
34 int len1=strlen(str1);
(gdb) s
35 int len2=strlen(str2);
(gdb) s
36 int i=0;
(gdb) s
39 str1=(char*) realloc(str1,(len1+len2+1)*sizeof(char));
(gdb)
Program received signal SIGABRT, Aborted.
0x004012f2 in addStrings (str1=0xc0 <Address 0xc0 out of bounds>,
str2=0xea60 <Address 0xea60 out of bounds>, del=0 '\000') at string.c:39
39 str1=(char*) realloc(str1,(len1+len2+1)*sizeof(char));
I am not able to figure out why it is crashing. Is it because I am passing str1 as an auto variable rather than creating it on the heap?
If this is the case, how do I modify my code to accept auto as well as heap variables?

You need to pass your target string pointer by address, and it must hold either the address of a previously allocated string or NULL (if coded correctly). Passing a string literal such as "A" to realloc, as in your call, is undefined behavior because that memory did not come from malloc. The allocation size must be both lengths + 2 (one for the delimiter, one for the terminator). The result can look something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void addStrings(char** str1, const char* str2,char del)
{
size_t len1 = *str1 ? strlen(*str1) : 0;
size_t len2 = str2 ? strlen(str2) : 0;
char *res = realloc(*str1, len1 + len2 + 2);
if (res)
{
res[len1] = del;
if (len2) // str2 may be NULL when len2 is 0, and memcpy must not be given NULL
memcpy(res + len1 + 1, str2, len2);
res[len1 + 1 + len2] = 0;
*str1 = res;
}
}
int main()
{
char *p = NULL;
const char test[] = "test";
int i=0;
// prove it works with no input whatsoever
addStrings(&p, NULL, 'X');
printf("p = %p, %s\n", (void *)p, p);
// loop on some input for awhile
for (;i<10;++i)
{
addStrings(&p, test, '#');
printf("p = %p, %s\n", (void *)p, p);
}
free(p);
return 0;
}
Output
p = 0x128610, X
p = 0x128610, X#test
p = 0x128610, X#test#test
p = 0x128620, X#test#test#test
p = 0x128620, X#test#test#test#test
p = 0x128620, X#test#test#test#test#test
p = 0x128620, X#test#test#test#test#test#test
p = 0x128640, X#test#test#test#test#test#test#test
p = 0x128640, X#test#test#test#test#test#test#test#test
p = 0x128640, X#test#test#test#test#test#test#test#test#test
p = 0x128670, X#test#test#test#test#test#test#test#test#test#test
Compiled with:
Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
Target: i386-apple-darwin13.2.0
Thread model: posix
Note the change in resulting address on some of the passes. I leave the checking for valid parameter input as an exercise for you.
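On the question of accepting automatic as well as heap strings: realloc can only resize memory that came from malloc, calloc or realloc, so a literal or a local array has to be copied to the heap before it can grow. A minimal sketch of that idea follows; heap_copy and join_with_delim are hypothetical helper names, not part of the answer above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy any string (literal, automatic array or heap)
   into freshly allocated memory so it can later be passed to realloc. */
char *heap_copy(const char *s)
{
    size_t n = strlen(s) + 1;          /* include the terminator */
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Hypothetical helper: build a new heap string "a<del>b" without ever
   reallocating the inputs, so both may live anywhere. */
char *join_with_delim(const char *a, const char *b, char del)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *res = malloc(la + lb + 2);   /* delimiter + terminator */
    if (res == NULL)
        return NULL;
    memcpy(res, a, la);
    res[la] = del;
    memcpy(res + la + 1, b, lb + 1);   /* copies b's terminator too */
    return res;
}
```

A caller holding a literal could write char *p = heap_copy("A"); and then pass &p to addStrings, or call join_with_delim("A", "test", '#') directly. In both cases the caller frees the result.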

Related

Extracting a string between two similar (or different) strings in C as fast as possible

I made a program in C that can find two similar or different strings and extract the string between them. This type of program has so many uses, and generally when you use such a program, you have a lot of info, so it needs to be fast. I would like tips on how to make this program as fast and efficient as possible.
I am looking for suggestions that won't make me resort to heavy libraries (such as regex).
The code must:
be able to extract a string between two similar or different strings
find the 1st occurrence of string1
find the 1st occurrence of string2 which occurs AFTER string1
extract the string between string1 and string2
be able to use string arguments of any size
be foolproof to human error and return NULL if such occurs (example, string1 exceeds entire text string length. don't crash in an element error, but gracefully return NULL)
focus on speed and efficiency
Below is my code. I am quite new to C, coming from C++, so I could probably use a few suggestions, especially regarding efficient/proper use of the 'malloc' command:
fast_strbetween.c:
/*
Compile with:
gcc -Wall -O3 fast_strbetween.c -o fast_strbetween
*/
#include <stdio.h> // printf
#include <stdlib.h> // malloc
// inline function if it pleases the compiler gods
inline size_t fast_strlen(char *str)
{
int i; // Cannot return 'i' if inside for loop
for(i = 0; str[i] != '\0'; ++i);
return i;
}
char *fast_strbetween(char *str, char *str1, char *str2)
{
// size_t segfaults when incorrect length strings are entered (due to going below 0), so use int instead for increased robustness
int str0len = fast_strlen(str);
int str1len = fast_strlen(str1);
int str1pos = 0;
int charsfound = 0;
// Find str1
do {
charsfound = 0;
while (str1[charsfound] == str[str1pos + charsfound])
++charsfound;
} while (++str1pos < str0len - str1len && charsfound < str1len);
// '++str1pos' increments past by 1: needs to be set back by one
--str1pos;
// Whole string not found or logical impossibilty
if (charsfound < str1len)
return NULL;
/* Start searching 2 characters after last character found in str1. This will ensure that there will be space, and logical possibility, for the extracted text to exist or not, and allow immediate bail if the latter case; str1 cannot possibly have anything between it if str2 is right next to it!
Example:
str = 'aa'
str1 = 'a'
str2 = 'a'
returned = '' (should be NULL)
Without such preventative, str1 and str2 would would be found and '' would be returned, not NULL. This also saves 1 do/while loop, one check pertaining to returning null, and two additional calculations:
Example, if you didn't add +1 str2pos, you would need to change the code to:
if (charsfound < str2len || str2pos - str1pos - str1len < 1)
return NULL;
It also allows for text to be found between three similar strings—what??? I can feel my brain going fuzzy!
Let this example explain:
str = 'aaa'
str1 = 'a'
str2 = 'a'
result = '' (should be 'a')
Without the aforementioned preventative, the returned string is '', not 'a'; the program takes the first 'a' for str1 and the second 'a' for str2, and tries to return what is between them (nothing).
*/
int str2pos = str1pos + str1len + 1; // the '1' added to str2pos
int str2len = fast_strlen(str2);
// Find str2
do {
charsfound = 0;
while (str2[charsfound] == str[str2pos + charsfound])
++charsfound;
} while (++str2pos < str0len - str2len + 1 && charsfound < str2len);
// Deincrement due to '++str2pos' over-increment
--str2pos;
if (charsfound < str2len)
return NULL;
// Only allocate what is needed
char *strbetween = (char *)malloc(sizeof(char) * str2pos - str1pos - str1len);
unsigned int tmp = 0;
for (unsigned int i = str1pos + str1len; i < str2pos; i++)
strbetween[tmp++] = str[i];
return strbetween;
}
int main() {
char str[30] = { "abaabbbaaaabbabbbaaabbb" };
char str1[10] = { "aaa" };
char str2[10] = { "bbb" };
//Result should be: 'abba'
printf("The string between is: \'%s\'\n", fast_strbetween(str, str1, str2));
// free malloc as we go
for (int i = 10000000; --i;)
free(fast_strbetween(str, str1, str2));
return 0;
}
In order to have some way of measuring progress, I have already timed the code above (extracting a small string 10000000 times):
$ time fast_strbetween
The string between is: 'abba'
0m11.09s real 0m11.09s user 0m00.00s system
Process used 99.3 - 100% CPU according to 'top' command (Linux).
Memory used while running: 3.7Mb
Executable size: 8336 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
If anyone would like to offer code, tips, pointers... I would appreciate it. I will also implement the changes and give a timed result for your troubles.
Oh, and one thing that I learned is to always de-allocate malloc; I ran the code above (with extra loops), just before posting this. My computer's ram filled up, and the computer froze. Luckily, Stack made a backup draft! Lesson learned!
* EDIT *
Here is the revised code using chqrlie's advice as best I could. Added extra checks for end of string, which ended up costing about a second of time with the tested phrase but can now bail very fast if the first string is not found. Using null or illogical strings should not result in error, hopefully. Lots of notes in the code, where they can be better understood. If I've left anything out or done something incorrectly, please let me know guys; it is not intentional.
fast_strbetween2.c:
/*
Compile with:
gcc -Wall -O3 fast_strbetween2.c -o fast_strbetween2
Corrections and additions courtesy of:
https://stackoverflow.com/questions/55308295/extracting-a-string-between-two-similar-or-different-strings-in-c-as-fast-as-p
*/
#include<stdio.h> // printf
#include<stdlib.h> // malloc, free
// Strings now set to 'const'
char * fast_strbetween(const char *str, const char *str1, const char *str2)
{
// string size will now be calculated by the characters picked up
size_t str1pos = 0;
size_t str1chars;
// Find str1
do{
str1chars = 0;
// Will the do/while str1 check for '\0' suffice?
// I haven't seen any issues yet, but not sure.
while(str1[str1chars] == str[str1pos + str1chars] && str1[str1chars] != '\0')
{
//printf("Found str1 char: %i num: %i pos: %i\n", str1[str1chars], str1chars + 1, str1pos);
++str1chars;
}
// Incrementing whilst not in conditional expression tested faster
++str1pos;
/* There are two checks for "str1[str1chars] != '\0'". Trying to find
another efficient way to do it in one. */
}while(str[str1pos] != '\0' && str1[str1chars] != '\0');
--str1pos;
//For testing:
//printf("str1pos: %i str1chars: %i\n", str1pos, str1chars);
// exit if no chars were found or if didn't reach end of str1
if(!str1chars || str1[str1chars] != '\0')
{
//printf("Bailing from str1 result\n");
return '\0';
}
/* Got rid of the '+1' code which didn't allow for '' returns.
I agree with your logic of <tag></tag> returning ''. */
size_t str2pos = str1pos + str1chars;
size_t str2chars;
//printf("Starting pos for str2: %i\n", str1pos + str1chars);
// Find str2
do{
str2chars = 0;
while(str2[str2chars] == str[str2pos + str2chars] && str2[str2chars] != '\0')
{
//printf("Found str2 char: %i num: %i pos: %i \n", str2[str2chars], str2chars + 1, str2pos);
++str2chars;
}
++str2pos;
}while(str[str2pos] != '\0' && str2[str2chars] != '\0');
--str2pos;
//For testing:
//printf("str2pos: %i str2chars: %i\n", str2pos, str2chars);
if(!str2chars || str2[str2chars] != '\0')
{
//printf("Bailing from str2 result!\n");
return '\0';
}
/* Trying to allocate strbetween with malloc. Is this correct? */
char * strbetween = malloc(2);
// Check if malloc succeeded:
if (strbetween == '\0') return '\0';
size_t tmp = 0;
// Grab and store the string between!
for(size_t i = str1pos + str1chars; i < str2pos; ++i)
{
strbetween[tmp] = str[i];
++tmp;
}
return strbetween;
}
int main() {
char str[30] = { "abaabbbaaaabbabbbaaabbb" };
char str1[10] = { "aaa" };
char str2[10] = { "bbb" };
printf("Searching \'%s\' for \'%s\' and \'%s\'\n", str, str1, str2);
printf(" 0123456789\n\n"); // Easily see the elements
printf("The word between is: \'%s\'\n", fast_strbetween(str, str1, str2));
for(int i = 10000000; --i;)
free(fast_strbetween(str, str1, str2));
return 0;
}
** Results **
$ time fast_strbetween2
Searching 'abaabbbaaaabbabbbaaabbb' for 'aaa' and 'bbb'
0123456789
The word between is: 'abba'
0m10.93s real 0m10.93s user 0m00.00s system
Process used 99.0 - 100% CPU according to 'top' command (Linux).
Memory used while running: 1.8Mb
Executable size: 8336 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
chqrlie's answer
I understand that this is just some example code that shows proper programming practices. Nonetheless, it can make for a decent control in testing.
Please note that I do not know how to deallocate malloc in your code, so it is NOT a fair test. As a result, ram usage builds up, taking 130Mb+ for the process alone. I was still able to run the test for the full 10000000 loops. I will say that I tried deallocating this code the way I did my code (via bringing the function 'simple_strbetween' down into main and deallocating with 'free(strndup(p, q - p));'), and the results weren't much different from not deallocating.
** simple_strbetween.c **
/*
Compile with:
gcc -Wall -O3 simple_strbetween.c -o simple_strbetween
Courtesy of:
https://stackoverflow.com/questions/55308295/extracting-a-string-between-two-similar-or-different-strings-in-c-as-fast-as-p
*/
#include<string.h>
#include<stdio.h>
char *simple_strbetween(const char *str, const char *str1, const char *str2) {
const char *q;
const char *p = strstr(str, str1);
if (p) {
p += strlen(str1);
q = *str2 ? strstr(p, str2) : p + strlen(p);
if (q)
return strndup(p, q - p);
}
return NULL;
}
int main() {
char str[30] = { "abaabbbaaaabbabbbaaabbb" };
char str1[10] = { "aaa" };
char str2[10] = { "bbb" };
printf("Searching \'%s\' for \'%s\' and \'%s\'\n", str, str1, str2);
printf(" 0123456789\n\n"); // Easily see the elements
printf("The word between is: \'%s\'\n", simple_strbetween(str, str1, str2));
for(int i = 10000000; --i;)
simple_strbetween(str, str1, str2);
return 0;
}
$ time simple_strbetween
Searching 'abaabbbaaaabbabbbaaabbb' for 'aaa' and 'bbb'
0123456789
The word between is: 'abba'
0m19.68s real 0m19.34s user 0m00.32s system
Process used 100% CPU according to 'top' command (Linux).
Memory used while running: 130Mb (leak due to my lack of knowledge)
Executable size: 8380 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
Results for the above code run with this alternate strndup:
char *alt_strndup(const char *s, size_t n)
{
size_t i;
char *p;
for (i = 0; i < n && s[i] != '\0'; i++)
continue;
p = malloc(i + 1);
if (p != NULL) {
memcpy(p, s, i);
p[i] = '\0';
}
return p;
}
$ time simple_strbetween
Searching 'abaabbbaaaabbabbbaaabbb' for 'aaa' and 'bbb'
0123456789
The word between is: 'abba'
0m20.99s real 0m20.54s user 0m00.44s system
I kindly ask that nobody make judgements on the results until the code is properly run. I will revise the results as soon as it is figured out.
* Edit *
Was able to decrease the time by over 25% (11.93s vs 8.7s). This was done by using pointers to increment the positions, as opposed to size_t. Collecting the return string whilst checking the last string was likely what caused the biggest change. I feel there is still lots of room for improvement. A big loss comes from having to free malloc. If there is a better way, I'd like to know.
fast_strbetween3.c:
/*
gcc -Wall -O3 fast_strbetween.c -o fast_strbetween
*/
#include<stdio.h> // printf
#include<stdlib.h> // malloc, free
char * fast_strbetween(const char *str, const char *str1, const char *str2)
{
const char *sbegin = &str1[0]; // String beginning
const char *spos;
// Find str1
do{
spos = str;
str1 = sbegin;
while(*spos == *str1 && *str1)
{
++spos;
++str1;
}
++str;
}while(*str1 && *spos);
// Nothing found if spos hasn't advanced
if (spos == str)
return NULL;
char *strbetween = malloc(1);
if (!strbetween)
return '\0';
str = spos;
int i = 0;
//char *p = &strbetween[0]; // Alt. for advancing strbetween (slower)
sbegin = &str2[0]; // Recycle sbegin
// Find str2
do{
str2 = sbegin;
spos = str;
while(*spos == *str2 && *str2)
{
++str2;
++spos;
}
//*p = *str;
//++p;
strbetween[i] = *str;
++str;
++i;
}while(*str2 && *spos);
if (spos == str)
return NULL;
//*--p = '\0';
strbetween[i - 1] = '\0';
return strbetween;
}
int main() {
char s[100] = "abaabbbaaaabbabbbaaabbb";
char s1[100] = "aaa";
char s2[100] = "bbb";
printf("\nString: \'%s\'\n", fast_strbetween(s, s1, s2));
for(int i = 10000000; --i; )
free(fast_strbetween(s, s1, s2));
return 0;
}
String: 'abba'
0m08.70s real 0m08.67s user 0m00.01s system
Process used 99.0 - 100% CPU according to 'top' command (Linux).
Memory used while running: 1.8Mb
Executable size: 8336 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
* Edit *
This doesn't really count as it does not 'return' a value, and therefore is against my own rules, but it does pass a variable through, which is changed and brought back to main. It runs with 1 library and takes 3.6s. Getting rid of malloc was the key.
/*
gcc -Wall -O3 fast_strbetween.c -o fast_strbetween
*/
#include<stdio.h> // printf
unsigned int fast_strbetween(const char *str, const char *str1, const char *str2, char *strbetween)
{
const char *sbegin = &str1[0]; // String beginning
const char *spos;
// Find str1
do{
spos = str;
str1 = sbegin;
while(*spos == *str1 && *str1)
{
++spos;
++str1;
}
++str;
}while(*str1 && *spos);
// Nothing found if spos hasn't advanced
if (spos == str)
{
strbetween[0] = '\0';
return 0;
}
str = spos;
sbegin = &str2[0]; // Recycle sbegin
// Find str2
do{
str2 = sbegin;
spos = str;
while(*spos == *str2 && *str2)
{
++str2;
++spos;
}
*strbetween = *str;
++strbetween;
++str;
}while(*str2 && *spos);
if (spos == str)
{
strbetween[0] = '\0';
return 0;
}
*--strbetween = '\0';
return 1; // Successful (found text)
}
int main() {
char s[100] = "abaabbbaaaabbabbbaaabbb";
char s1[100] = "aaa";
char s2[100] = "bbb";
char sret[100];
fast_strbetween(s, s1, s2, sret);
printf("String: %s\n", sret);
for(int i = 10000000; --i; )
fast_strbetween(s, s1, s2, sret);
return 0;
}
Your code has multiple problems and is probably not as efficient as it should be:
you use types int and unsigned int for indexes into the strings. These types may have a smaller range than size_t. You should revise your code to use size_t and avoid mixing signed and unsigned types in comparisons.
your functions' string arguments should be declared as const char * as you do not modify the strings and should be able to pass const strings without a warning.
redefining strlen is a bad idea: your version will be slower than the system's optimized, assembly coded and very likely inlined version.
computing the length of str is unnecessary and potentially costly: both str1 and str2 may appear close to the beginning of str, scanning for the end of str will be wasteful.
the while loop inside the first do / while loop is incorrect: while(str1[charsfound] == str[str1pos + charsfound]) charsfound++; may access characters beyond the end of str and str1 as the loop does not stop at the null terminator. If str1 only appears at the end of str, you have undefined behavior.
if str1 is an empty string, you will find it at the end of str instead of at the beginning.
why do you initialize str2pos as int str2pos = str1pos + str1len + 1;? If str2 immediately follows str1 inside str, an empty string should be allocated and returned. Your comment regarding this case is unreadable, you should break such long lines to fit within a typical screen width such as 80 columns. It is debatable whether strbetween("aa", "a", "a") should return "" or NULL. IMHO it should return an allocated empty string, which would be consistent with the expected behavior on strbetween("<name></name>", "<name>", "</name>") or strbetween("''", "'", "'"). Your specification preventing strbetween from returning an empty string produces a counter-intuitive border case.
the second scanning loop has the same problems as the first.
the line char *strbetween = (char *) malloc(sizeof(char) * str2pos - str1pos - str1len); has multiple problems: no cast is necessary in C, if you insist on specifying the element size sizeof(char), which is 1 by definition, you should parenthesize the number of elements, and last but not least, you must allocate one extra element for the null terminator.
You do not test if malloc() succeeded. If it returns NULL, you will have undefined behavior, whereas you should just return NULL.
the copying loop uses a mix of signed and unsigned types, causing potentially counterintuitive behavior on overflow.
you forget to set the null terminator, which is consistent with the allocation size error, but incorrect.
Before you try and optimize code, you must ensure correctness! Your code is too complicated and has multiple flaws. Optimisation is a moot point.
You should first try a very simple implementation using standard C string functions: searching a string inside another one is performed efficiently by strstr.
Here is a simple implementation using strstr and strndup(), which should be available on your system:
#include <string.h>
char *simple_strbetween(const char *str, const char *str1, const char *str2) {
const char *q;
const char *p = strstr(str, str1);
if (p) {
p += strlen(str1);
q = *str2 ? strstr(p, str2) : p + strlen(p);
if (q)
return strndup(p, q - p);
}
return NULL;
}
strndup() is defined in POSIX and is part of the Extensions to the C Library Part II: Dynamic Allocation Functions, ISO/IEC TR 24731-2:2010. If it is not available on your system, it can be redefined as:
#include <stdlib.h>
#include <string.h>
char *strndup(const char *s, size_t n) {
size_t i;
char *p;
for (i = 0; i < n && s[i] != '\0'; i++)
continue;
p = malloc(i + 1);
if (p != NULL) {
memcpy(p, s, i);
p[i] = '\0';
}
return p;
}
To ensure correctness, write a number of test cases, with border cases such as all combinations of empty strings and identical strings.
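A handful of such border cases can be collected in a small checker. The sketch below is only an illustration: ref_strbetween is a stand-in that follows the same strstr-based logic as simple_strbetween (with malloc and memcpy in place of strndup), and count_strbetween_failures is a hypothetical name. The expected values assume the semantics argued for above, where adjacent markers yield an allocated empty string.

```c
#include <stdlib.h>
#include <string.h>

typedef char *(*strbetween_fn)(const char *, const char *, const char *);

/* Stand-in implementation: same logic as the strstr-based version. */
char *ref_strbetween(const char *str, const char *str1, const char *str2)
{
    const char *p = strstr(str, str1);
    if (p == NULL)
        return NULL;
    p += strlen(str1);
    const char *q = *str2 ? strstr(p, str2) : p + strlen(p);
    if (q == NULL)
        return NULL;
    size_t n = (size_t)(q - p);
    char *res = malloc(n + 1);
    if (res != NULL) {
        memcpy(res, p, n);
        res[n] = '\0';
    }
    return res;
}

/* Returns 1 if fn's result differs from expected (NULL means no match). */
static int case_fails(strbetween_fn fn, const char *str, const char *s1,
                      const char *s2, const char *expected)
{
    char *got = fn(str, s1, s2);
    int ok = (expected == NULL) ? (got == NULL)
                                : (got != NULL && strcmp(got, expected) == 0);
    free(got);
    return !ok;
}

/* Runs the border cases and returns how many of them failed. */
int count_strbetween_failures(strbetween_fn fn)
{
    int failures = 0;
    failures += case_fails(fn, "abaabbbaaaabbabbbaaabbb", "aaa", "bbb", "abba");
    failures += case_fails(fn, "aa", "a", "a", "");      /* adjacent markers */
    failures += case_fails(fn, "aaa", "a", "a", "");     /* first match wins */
    failures += case_fails(fn, "<name></name>", "<name>", "</name>", "");
    failures += case_fails(fn, "abc", "x", "c", NULL);   /* str1 missing */
    failures += case_fails(fn, "abc", "a", "x", NULL);   /* str2 missing */
    failures += case_fails(fn, "abc", "", "", "abc");    /* both empty */
    failures += case_fails(fn, "abc", "a", "", "bc");    /* empty str2 */
    failures += case_fails(fn, "", "", "", "");          /* all empty */
    return failures;
}
```

Any candidate with the same signature can be passed to count_strbetween_failures, so a hand-optimized version can be checked against the same cases before it is timed.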
Once you have thoroughly tested your strbetween function, you can write a benchmarking framework to test performance. It is not so easy to get reliable performance figures, as you will experience if you try. Remember to configure your compiler to select the appropriate optimisations, -O3 for example.
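A very small timing harness might look like the sketch below. It uses the standard clock() function, which measures CPU time and has coarse resolution on some systems, so it is only a starting point; time_calls and count_call are hypothetical names introduced for illustration.

```c
#include <time.h>

/* Hypothetical harness: calls fn(arg) the given number of times and
   returns the CPU time spent, in seconds. */
double time_calls(void (*fn)(void *), void *arg, unsigned long iterations)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++)
        fn(arg);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Example workload: increments the counter that arg points to. A real
   benchmark would call the function under test and free its result. */
void count_call(void *arg)
{
    ++*(unsigned long *)arg;
}
```

Running each candidate several times and comparing the fastest runs tends to give steadier numbers than a single measurement.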
Only then can you move to the next step: if you are really restricted from using standard C library functions, you may first recode your versions of strstr and strlen and still use the same method. Test this new version both for correctness and for performance.
The redundant parts are the computation of strlen(str1) which must have been determined by strstr when it finds a match. And the scan in strndup() which is unnecessary since no null byte is present between p and q. If you have time to waste, you can try and remove these redundancies at the expense of readability, risking non conformity. I would be surprised if you get any improvement at all on average over a wide variety of test cases. 20% would be remarkable.

C language how to cut out part of a string

Let's say I have a string which contains the following text:
#Line1 Hello, today I ate 3 crackers for dinner
#Line2 and 4 crackers with some soup for lunch.
#Line3 For breakfast tomorrow, I plan on eating
#Line4 bacon, eggs, and ham.
and I wanted to cut the part of the string from one substring to another substring, for example from "#Line3" to \n to get the following output:
#Line1 Hello, today I ate 3 crackers for dinner
#Line2 and 4 crackers with some soup for lunch.
#Line4 bacon, eggs, and ham.
(Just basically cutting out everything from #Line3 to \n and in essence removing the entire 3rd line)
I have read that this could be done with the function memmove but have not been able to figure out how to correctly do so. However, if anyone has a solution that does not involve memmove, of course that would be equally appreciated.
Here's what I have so far:
int str_cut(char *str, char *begin, int len)
{
int l = strlen(str);
if (strlen(begin) + len > l) len = l - begin;
memmove(str + strlen(begin), str + begin + len, l - len + 1);
return len;
}
This is still pretty far off from what I want, because it depends on knowing the length of what needs to be cut out. What I want is to cut out everything between two substrings; in my previous example, everything between "#Line3" and \n.
Pretty easy using memmove:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *
strfilter(const char *str, const char *substr)
{
char *res = strdup(str), *ptr = NULL;
size_t len = 0;
if (!res)
return NULL;
ptr = strstr(res, substr);
if (!ptr)
{
free(res);
return NULL;
}
len = strlen(ptr) - strlen(substr);
memmove(ptr, ptr + strlen(substr), len);
memset(ptr + len, 0, strlen(ptr + len));
return res;
}
char *
strfilter2(const char *str, const char *start, const char *end)
{
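char *res = strdup(str), *ptr1 = NULL, *ptr2 = NULL, *tmp = NULL;
const char *tmpstr = NULL; // separate temporary so the const marker strings are not assigned to a char *
size_t len1 = 0, len2 = 0;
if (!res)
return NULL;
ptr1 = strstr(res, start);
ptr2 = strstr(res, end);
if (!ptr1 || !ptr2)
{
free(res); // do not leak the copy when a marker is missing
return NULL;
}
if (ptr1 > ptr2)
{
tmp = ptr2;
ptr2 = ptr1;
ptr1 = tmp;
tmpstr = end;
end = start;
start = tmpstr;
}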
len1 = strlen(start);
len2 = strlen(ptr2);
memmove(ptr1 + len1, ptr2, len2);
memset(ptr1 + len1 + len2, 0, strlen(ptr1 + len1 + len2));
return res;
}
int
main(void)
{
const char *str = "Hello, today I ate 3 crackers for dinner\n"
"and 4 crackers with some soup for lunch.\n"
"For breakfast tomorrow, I plan on eating\n"
"bacon, eggs, and ham.\n";
char *res = strfilter2(str, "and 4 crackers with some soup for lunch.\n", "bacon, eggs, and ham.\n");
if (!res)
{
perror("strfilter2()");
return 1;
}
puts(res);
free(res);
return 0;
}
The function just finds the substring that you want to remove and overwrites it with everything that comes after it, and then it zeroes out the remainder of the string.
EDIT:
Added strfilter2 to eliminate the content between two substrings.
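When the caller owns a writable buffer, the same overwrite can be done in place with a single memmove. This is a minimal sketch under that assumption; remove_first is a hypothetical name. Moving the tail together with its terminator means no memset is needed afterwards.

```c
#include <string.h>

/* Removes the first occurrence of sub from the writable string s.
   Returns 1 if something was removed, 0 if sub is empty or not found. */
int remove_first(char *s, const char *sub)
{
    size_t sublen = strlen(sub);
    if (sublen == 0)
        return 0;
    char *hit = strstr(s, sub);
    if (hit == NULL)
        return 0;
    char *tail = hit + sublen;
    /* The regions overlap, so memmove is required; +1 moves the '\0'. */
    memmove(hit, tail, strlen(tail) + 1);
    return 1;
}
```

This must not be called on a string literal, since the buffer is modified.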
Using memcpy you can copy two parts into two buffers and then concat them using strcat.
Get a substring of a char*
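A sketch of the copying approach: it is a variant of the suggestion that copies the part before the cut and the part after it straight into one result buffer with memcpy, instead of using two temporary buffers and strcat. The name cut_copy and the choice to remove both markers along with the text between them are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new heap string equal to str with the
   span from `from` through the first `to` after it (inclusive) removed.
   Returns NULL if either marker is missing or allocation fails. */
char *cut_copy(const char *str, const char *from, const char *to)
{
    const char *start = strstr(str, from);
    if (start == NULL)
        return NULL;
    const char *stop = strstr(start + strlen(from), to);
    if (stop == NULL)
        return NULL;
    stop += strlen(to);                 /* first character to keep */
    size_t head = (size_t)(start - str);
    size_t tail = strlen(stop);
    char *res = malloc(head + tail + 1);
    if (res == NULL)
        return NULL;
    memcpy(res, str, head);             /* part before the cut */
    memcpy(res + head, stop, tail + 1); /* part after it, with '\0' */
    return res;
}
```

With the text from the question, cut_copy(text, "#Line3", "\n") would drop the third line. The caller frees the result.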
To cut out a part of the string you need to find - of course - first the start and end marks of the part you want to cut. For that you can use strstr. Then just copy the remaining part (everything after the end mark) to the place where you found the start mark:
char * cut_between(
char * const str,
char const * const from,
char const * const to) {
char * const startMark = strstr(str, from);
if (! startMark) {
return NULL;
}
char * const endMark =
strstr(startMark+strlen(from), to);
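if (endMark) {
char const * const tail = endMark + strlen(to);
memmove(startMark, tail, strlen(tail) + 1); // source and destination may overlap, which strcpy does not allow
return startMark + strlen(startMark) + 1;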
} else {
*startMark = '\0';
return startMark + 1;
}
}
On success the above function returns a pointer beyond the end of the resulting string. This is useful for buffer compaction, like:
int main() {
char * const input = malloc(400);
fgets(input, 400, stdin);
char const * const end =
cut_between(input, "from", "to");
if (end) {
char const * const result =
realloc(input, end - input);
puts(result);
// OMG missing free(s), well ... OK for this simple test.
}
return 0;
}
(Live on ideone)
Please note the missing error checks on above test. In production code these must be added.

String manipulating (cut the head of a string)

Why can't I get "xxx"? The returned value is a string of very strange symbols. I want the returned value to be xxx, but I don't know what is wrong with this program. The function works well and can print "xxx" for me, but once it returns the value to the main function, the outcome string does not display "xxx" correctly. Can somebody tell me the reason?
char* cut_command_head(char *my_command, char *character) {
char command_arg[256];
//command_arg = (char *)calloc(256, sizeof(char));
char *special_position;
int len;
special_position = strstr(my_command, character);
//printf("special_position: %s\n", special_position);
for (int i=1; special_position[i] != '\0'; i++) {
//printf("spcial_position[%d]: %c\n", i, special_position[i]);
command_arg[i-1] = special_position[i];
//printf("command_arg[%d]: %c\n", i-1, command_arg[i-1]);
}
len = (int)strlen(command_arg);
//printf("command_arg len: %d\n", len);
command_arg[len] = '\0';
my_command = command_arg;
printf("my_command: %s\n", my_command);
return my_command;
}
int main(int argc, const char * argv[]) {
char *test = "cd xxx";
char *outcome;
outcome = cut_command_head(test, " ");
printf("outcome: %s\n",outcome);
return 0;
}
Here
my_command = command_arg;
you assign the address of a local variable to the variable you are returning. This local variable lives on the stack of cut_command_head().
This address is invalid after the function returned. Accessing the memory returned by cut_command_head() provokes undefined behaviour.
You need to allocate memory somewhen, somewhere.
The easiest way is to use strdup() (if available):
my_command = strdup(command_arg);
A portable approach is to use malloc() followed by copying the data in question:
my_command = malloc(strlen(command_arg) + 1); // +1 for the 0-terminator
if (NULL != my_command)
{
strcpy(my_command, command_arg);
}
Also this looks strange:
len = (int)strlen(command_arg);
//printf("command_arg len: %d\n", len);
command_arg[len] = '\0';
Just remove it and initialise command_arg to all zeros right at the beginning to make sure it is always 0-terminated:
char command_arg[256] = {0};
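Putting these points together, a complete version of the function could look like the sketch below. It is one possible rewrite, not the original author's code: it returns a heap copy that the caller must free, returns NULL when the separator is not found (the original dereferences the NULL from strstr in that case), and skips the whole separator rather than exactly one character.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of the text that follows the first
   occurrence of sep in command, or NULL if sep is not found or the
   allocation fails. The caller must free the result. */
char *cut_command_head(const char *command, const char *sep)
{
    const char *hit = strstr(command, sep);
    if (hit == NULL)
        return NULL;
    const char *rest = hit + strlen(sep);
    size_t n = strlen(rest) + 1;   /* include the 0-terminator */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, rest, n);
    return copy;
}
```

In main, the result of cut_command_head(test, " ") can then be printed and passed to free once it is no longer needed.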

memcpy, segmentation fault

I have tried to write this code, but I get a segmentation fault in memcpy (I stepped through the code with a debugger):
FILE *tp;
int l = 0;
while ((fgets(buffer, sizeof buffer, tp))) {
// search equal sign
char *equalsign = strchr(buffer, '=');
l++;
// search quote near value
char *q1 = equalsign + 1;
char *q2 = strchr(q1 + 1, '"');
// extract name and value
char* names = strndup(buffer, equalsign - buffer);
char* values = strndup(q1 + 1, q2 - q1 - 1);
memcpy(g_names,names,strlen(names));
memcpy(g_values,values,strlen(values));
free(names);
free(values);
}
with
const char* g_names[SIZE] = { 0, };
char* g_values[SIZE] = { 0, };
char buffer[MAXLINE] = {0,};
defined as globals. With the debugger I have seen that the problem is in memcpy (segmentation fault). Does anyone have a suggestion?
Thanks.
Regards.
There are at least two problems with your code: First, it is using g_names as the destination of memcpy, which copies the characters over the array of pointers. You should be copying to g_names[l] (assuming l was to be the index in the g_names array).
Second, your code is missing the actual allocation of g_names[l], something like:
g_names[l] = malloc(strlen(names) + 1);
But since you're calling strndup anyway, you can simply store the result of that call into the array:
// search for equal sign
char *equalsign = strchr(buffer, '=');
// search quote near value
char *q1 = equalsign + 1;
char *q2 = strchr(q1 + 1, '"');
// extract name and value
g_names[l] = strndup(buffer, equalsign - buffer);
g_values[l] = strndup(q1 + 1, q2 - q1 - 1);
l++;
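The snippet above still dereferences the results of strchr without checking them, so a line without an equal sign or a closing quote would crash. A hedged sketch of a checked parser follows; it assumes each line has the form name="value", and parse_pair and dup_range are hypothetical names (dup_range stands in for strndup so that the example does not depend on POSIX).

```c
#include <stdlib.h>
#include <string.h>

/* Copies the characters in [begin, end) into a new 0-terminated string. */
static char *dup_range(const char *begin, const char *end)
{
    size_t n = (size_t)(end - begin);
    char *p = malloc(n + 1);
    if (p != NULL) {
        memcpy(p, begin, n);
        p[n] = '\0';
    }
    return p;
}

/* Parses a line of the assumed form name="value". On success stores two
   heap strings in *name and *value and returns 1; otherwise returns 0
   and leaves the outputs untouched. */
int parse_pair(const char *line, char **name, char **value)
{
    const char *eq = strchr(line, '=');
    if (eq == NULL || eq[1] != '"')
        return 0;                      /* no '=' or no opening quote */
    const char *q1 = eq + 1;
    const char *q2 = strchr(q1 + 1, '"');
    if (q2 == NULL)
        return 0;                      /* no closing quote */
    char *n = dup_range(line, eq);
    char *v = dup_range(q1 + 1, q2);
    if (n == NULL || v == NULL) {
        free(n);
        free(v);
        return 0;
    }
    *name = n;
    *value = v;
    return 1;
}
```

Inside the reading loop, the two results would be stored in g_names[l] and g_values[l] only when parse_pair returns 1, and l should also be checked against SIZE before storing.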

How to convert from string array to int array and then sort it using c

I have string array initialized like that:
char ** strArray;
if ( (strArray = malloc(sizeof(*strArray) + 3)) == NULL ) {
fprintf(stderr, "ls1: couldn't allocate memory");
//exit(EXIT_FAILURE);
}
strArray[0] = NULL;
strArray[0] = "111";
strArray[1] = "222";
strArray[2] = "1";
strArray[3] = "2";
I want to convert this string array to int array, like that:
int * toIntArray(char ** strArray) {
int size = getCharArraySize(strArray);
int intArray[size];
int i;
for ( i = 0; i < size ; ++i)
{
intArray[i] = atoi(strArray[i]);
printf( "r[%d] = %d\n", i, intArray[i]);
}
intArray[size] = '\0';
return intArray;
}
int getCharArraySize(char ** strArray) {
int s = 0;
while ( strArray[s]) {
printf("Char array: %s.\n", strArray[s]);
s++;
}
return s;
}
And then I want to sort this int array.
I must have the string array initialized as above (char ** strArray), then convert it to an int array and sort it. Can anybody help me with that? I would like to print the sorted integers in the main function.
A few minor things to take note of in the question code:
char ** strArray;
if ( (strArray = malloc(sizeof(*strArray) + 3)) == NULL ) {
fprintf(stderr, "ls1: couldn't allocate memory");
//exit(EXIT_FAILURE);
}
If successful, the intention of the above code is to allocate memory to strArray sufficient for three char *'s. Specifically, strArray[0], strArray[1] and strArray[2].
NOTE: As pointed out in Matt McNabb's comment below, it actually incorrectly allocates memory sufficient for one char *, and three extra bytes.
strArray[0] = NULL;
The above line sets the first pointer in the **strArray to point at NULL.
strArray[0] = "111";
The above code is odd. After just setting strArray[0] to point at NULL, the above line changes it to point to "111". Kind of makes setting it to NULL (in the first place) seem unnecessary.
strArray[1] = "222";
strArray[2] = "1";
The above two lines initialize the other two pointers in the strArray correctly.
strArray[3] = "2";
The above line attempts to initialize strArray[3], when that element of the array really doesn't exist. So, it is changing something to point to "2", but probably not with the expected result.
Perhaps the intent would be better served by changing the above code to:
char **strArray;
size_t strArrayElements=4;
if(NULL == (strArray = malloc((strArrayElements+1) * sizeof(*strArray))))
{
fprintf(stderr, "ls1: couldn't allocate memory");
exit(EXIT_FAILURE);
}
strArray[strArrayElements] = NULL;
strArray[0] = "111";
strArray[1] = "222";
strArray[2] = "1";
strArray[3] = "2";
As can be observed, the above code allocates 5 elements (strArrayElements+1) to the **strArray. The last element strArray[4] is initialized to NULL; a marker to indicate End-Of-List. Then the other 4 elements [0..3] are initialized.
Now shifting focus to:
int * toIntArray(char ** strArray) {
int size = getCharArraySize(strArray);
int intArray[size];
int i;
for ( i = 0; i < size ; ++i)
{
intArray[i] = atoi(strArray[i]);
printf( "r[%d] = %d\n", i, intArray[i]);
}
intArray[size] = '\0';
return intArray;
}
The above code is successful at converting the strings to their integer forms, and storing them in intArray. However, the code is flawed when it attempts to return intArray to the caller. The intArray variable was declared as a local stack object. The return statement causes all such stack variables to become invalid; and allows the stack memory such variables were using to be used for other things.
Perhaps the following code better represents what was intended. It allocates memory from the heap for intArray. This allocated memory can outlive the return statement:
int *toIntArray(char **strArray)
{
int size = getCharArraySize(strArray);
int *intArray = malloc((size + 1) * sizeof(*intArray)); // +1 for the trailing 0 stored below
if (intArray == NULL)
return NULL;
int i;
for ( i = 0; i < size ; ++i)
{
intArray[i] = atoi(strArray[i]);
printf( "r[%d] = %d\n", i, intArray[i]);
}
intArray[size] = '\0';
return(intArray);
}
Spoiler code may be found here.
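The sorting step that the question asks about can be handled by the standard qsort function. The following is a minimal sketch, assuming the NULL-terminated strArray layout shown above; to_sorted_ints and compare_ints are hypothetical names, and the element count is returned through a pointer instead of storing a 0 marker at the end of the array.

```c
#include <stdlib.h>

/* qsort comparison callback for ints, written to avoid overflow. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Converts a NULL-terminated array of strings into a heap-allocated int
   array sorted in ascending order. Stores the element count in *count.
   Returns NULL if the allocation fails; the caller frees the result. */
int *to_sorted_ints(char **strArray, size_t *count)
{
    size_t n = 0;
    while (strArray[n] != NULL)
        n++;
    int *out = malloc((n ? n : 1) * sizeof *out);  /* never request 0 bytes */
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        out[i] = atoi(strArray[i]);
    qsort(out, n, sizeof *out, compare_ints);
    *count = n;
    return out;
}
```

In main, the caller can loop from 0 to count - 1 to print the sorted integers and then free the array. Note that atoi gives no way to detect invalid input; strtol would be the choice if that matters.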

Resources