Related
I made a program in C that can find two similar or different strings and extract the string between them. This type of program has so many uses, and generally when you use such a program, you have a lot of info, so it needs to be fast. I would like tips on how to make this program as fast and efficient as possible.
I am looking for suggestions that won't make me resort to heavy libraries (such as regex).
The code must:
be able to extract a string between two similar or different strings
find the 1st occurrence of string1
find the 1st occurrence of string2 which occurs AFTER string1
extract the string between string1 and string2
be able to use string arguments of any size
be foolproof to human error and return NULL if such occurs (example, string1 exceeds entire text string length. don't crash in an element error, but gracefully return NULL)
focus on speed and efficiency
Below is my code. I am quite new to C, coming from C++, so I could probably use a few suggestions, especially regarding efficient/proper use of the 'malloc' command:
fast_strbetween.c:
/*
Compile with:
gcc -Wall -O3 fast_strbetween.c -o fast_strbetween
*/
#include <stdio.h> // printf
#include <stdlib.h> // malloc
// inline function if it pleases the compiler gods
inline size_t fast_strlen(char *str)
{
int i; // Cannot return 'i' if inside for loop
for(i = 0; str[i] != '\0'; ++i);
return i;
}
char *fast_strbetween(char *str, char *str1, char *str2)
{
// size_t segfaults when incorrect length strings are entered (due to going below 0), so use int instead for increased robustness
int str0len = fast_strlen(str);
int str1len = fast_strlen(str1);
int str1pos = 0;
int charsfound = 0;
// Find str1
do {
charsfound = 0;
while (str1[charsfound] == str[str1pos + charsfound])
++charsfound;
} while (++str1pos < str0len - str1len && charsfound < str1len);
// '++str1pos' increments past by 1: needs to be set back by one
--str1pos;
// Whole string not found or logical impossibilty
if (charsfound < str1len)
return NULL;
/* Start searching 2 characters after last character found in str1. This will ensure that there will be space, and logical possibility, for the extracted text to exist or not, and allow immediate bail if the latter case; str1 cannot possibly have anything between it if str2 is right next to it!
Example:
str = 'aa'
str1 = 'a'
str2 = 'a'
returned = '' (should be NULL)
Without such preventative, str1 and str2 would would be found and '' would be returned, not NULL. This also saves 1 do/while loop, one check pertaining to returning null, and two additional calculations:
Example, if you didn't add +1 str2pos, you would need to change the code to:
if (charsfound < str2len || str2pos - str1pos - str1len < 1)
return NULL;
It also allows for text to be found between three similar strings—what??? I can feel my brain going fuzzy!
Let this example explain:
str = 'aaa'
str1 = 'a'
str2 = 'a'
result = '' (should be 'a')
Without the aforementioned preventative, the returned string is '', not 'a'; the program takes the first 'a' for str1 and the second 'a' for str2, and tries to return what is between them (nothing).
*/
int str2pos = str1pos + str1len + 1; // the '1' added to str2pos
int str2len = fast_strlen(str2);
// Find str2
do {
charsfound = 0;
while (str2[charsfound] == str[str2pos + charsfound])
++charsfound;
} while (++str2pos < str0len - str2len + 1 && charsfound < str2len);
// Deincrement due to '++str2pos' over-increment
--str2pos;
if (charsfound < str2len)
return NULL;
// Only allocate what is needed
char *strbetween = (char *)malloc(sizeof(char) * str2pos - str1pos - str1len);
unsigned int tmp = 0;
for (unsigned int i = str1pos + str1len; i < str2pos; i++)
strbetween[tmp++] = str[i];
return strbetween;
}
int main() {
char str[30] = { "abaabbbaaaabbabbbaaabbb" };
char str1[10] = { "aaa" };
char str2[10] = { "bbb" };
//Result should be: 'abba'
printf("The string between is: \'%s\'\n", fast_strbetween(str, str1, str2));
// free malloc as we go
for (int i = 10000000; --i;)
free(fast_strbetween(str, str1, str2));
return 0;
}
In order to have some way of measuring progress, I have already timed the code above (extracting a small string 10000000 times):
$ time fast_strbetween
The string between is: 'abba'
0m11.09s real 0m11.09s user 0m00.00s system
Process used 99.3 - 100% CPU according to 'top' command (Linux).
Memory used while running: 3.7Mb
Executable size: 8336 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
If anyone would like to offer code, tips, pointers... I would appreciate it. I will also implement the changes and give a timed result for your troubles.
Oh, and one thing that I learned is to always de-allocate malloc; I ran the code above (with extra loops), just before posting this. My computer's ram filled up, and the computer froze. Luckily, Stack made a backup draft! Lesson learned!
* EDIT *
Here is the revised code using chqrlie's advice as best I could. Added extra checks for end of string, which ended up costing about a second of time with the tested phrase but can now bail very fast if the first string is not found. Using null or illogical strings should not result in error, hopefully. Lots of notes int the code, where they can be better understood. If I've left anything thing out or done something incorrectly, please let me know guys; it is not intentional.
fast_strbetween2.c:
/*
Compile with:
gcc -Wall -O3 fast_strbetween2.c -o fast_strbetween2
Corrections and additions courtesy of:
https://stackoverflow.com/questions/55308295/extracting-a-string-between-two-similar-or-different-strings-in-c-as-fast-as-p
*/
#include<stdio.h> // printf
#include<stdlib.h> // malloc, free
// Strings now set to 'const'
char * fast_strbetween(const char *str, const char *str1, const char *str2)
{
// string size will now be calculated by the characters picked up
size_t str1pos = 0;
size_t str1chars;
// Find str1
do{
str1chars = 0;
// Will the do/while str1 check for '\0' suffice?
// I haven't seen any issues yet, but not sure.
while(str1[str1chars] == str[str1pos + str1chars] && str1[str1chars] != '\0')
{
//printf("Found str1 char: %i num: %i pos: %i\n", str1[str1chars], str1chars + 1, str1pos);
++str1chars;
}
// Incrementing whilst not in conditional expression tested faster
++str1pos;
/* There are two checks for "str1[str1chars] != '\0'". Trying to find
another efficient way to do it in one. */
}while(str[str1pos] != '\0' && str1[str1chars] != '\0');
--str1pos;
//For testing:
//printf("str1pos: %i str1chars: %i\n", str1pos, str1chars);
// exit if no chars were found or if didn't reach end of str1
if(!str1chars || str1[str1chars] != '\0')
{
//printf("Bailing from str1 result\n");
return '\0';
}
/* Got rid of the '+1' code which didn't allow for '' returns.
I agree with your logic of <tag></tag> returning ''. */
size_t str2pos = str1pos + str1chars;
size_t str2chars;
//printf("Starting pos for str2: %i\n", str1pos + str1chars);
// Find str2
do{
str2chars = 0;
while(str2[str2chars] == str[str2pos + str2chars] && str2[str2chars] != '\0')
{
//printf("Found str2 char: %i num: %i pos: %i \n", str2[str2chars], str2chars + 1, str2pos);
++str2chars;
}
++str2pos;
}while(str[str2pos] != '\0' && str2[str2chars] != '\0');
--str2pos;
//For testing:
//printf("str2pos: %i str2chars: %i\n", str2pos, str2chars);
if(!str2chars || str2[str2chars] != '\0')
{
//printf("Bailing from str2 result!\n");
return '\0';
}
/* Trying to allocate strbetween with malloc. Is this correct? */
char * strbetween = malloc(2);
// Check if malloc succeeded:
if (strbetween == '\0') return '\0';
size_t tmp = 0;
// Grab and store the string between!
for(size_t i = str1pos + str1chars; i < str2pos; ++i)
{
strbetween[tmp] = str[i];
++tmp;
}
return strbetween;
}
int main() {
char str[30] = { "abaabbbaaaabbabbbaaabbb" };
char str1[10] = { "aaa" };
char str2[10] = { "bbb" };
printf("Searching \'%s\' for \'%s\' and \'%s\'\n", str, str1, str2);
printf(" 0123456789\n\n"); // Easily see the elements
printf("The word between is: \'%s\'\n", fast_strbetween(str, str1, str2));
for(int i = 10000000; --i;)
free(fast_strbetween(str, str1, str2));
return 0;
}
** Results **
$ time fast_strbetween2
Searching 'abaabbbaaaabbabbbaaabbb' for 'aaa' and 'bbb'
0123456789
The word between is: 'abba'
0m10.93s real 0m10.93s user 0m00.00s system
Process used 99.0 - 100% CPU according to 'top' command (Linux).
Memory used while running: 1.8Mb
Executable size: 8336 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
chqrlie's answer
I understand that this is just some example code that shows proper programming practices. Nonetheless, it can make for a decent control in testing.
Please note that I do not know how to deallocate malloc in your code, so it is NOT a fair test. As a result, ram usage builds up, taking 130Mb+ for the process alone. I was still able to run the test for the full 10000000 loops. I will say that I tried deallocating this code the way I did my code (via bringing the function 'simple_strbetween' down into main and deallocating with 'free(strndup(p, q - p));'), and the results weren't much different from not deallocating.
** simple_strbetween.c **
/*
Compile with:
gcc -Wall -O3 simple_strbetween.c -o simple_strbetween
Courtesy of:
https://stackoverflow.com/questions/55308295/extracting-a-string-between-two-similar-or-different-strings-in-c-as-fast-as-p
*/
#include<string.h>
#include<stdio.h>
char *simple_strbetween(const char *str, const char *str1, const char *str2) {
const char *q;
const char *p = strstr(str, str1);
if (p) {
p += strlen(str1);
q = *str2 ? strstr(p, str2) : p + strlen(p);
if (q)
return strndup(p, q - p);
}
return NULL;
}
int main() {
char str[30] = { "abaabbbaaaabbabbbaaabbb" };
char str1[10] = { "aaa" };
char str2[10] = { "bbb" };
printf("Searching \'%s\' for \'%s\' and \'%s\'\n", str, str1, str2);
printf(" 0123456789\n\n"); // Easily see the elements
printf("The word between is: \'%s\'\n", simple_strbetween(str, str1, str2));
for(int i = 10000000; --i;)
simple_strbetween(str, str1, str2);
return 0;
}
$ time simple_strbetween
Searching 'abaabbbaaaabbabbbaaabbb' for 'aaa' and 'bbb'
0123456789
The word between is: 'abba'
0m19.68s real 0m19.34s user 0m00.32s system
Process used 100% CPU according to 'top' command (Linux).
Memory used while running: 130Mb (leak due do my lack of knowledge)
Executable size: 8380 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
Results for above code ran with this alternate strndup:
char *alt_strndup(const char *s, size_t n)
{
size_t i;
char *p;
for (i = 0; i < n && s[i] != '\0'; i++)
continue;
p = malloc(i + 1);
if (p != NULL) {
memcpy(p, s, i);
p[i] = '\0';
}
return p;
}
$ time simple_strbetween
Searching 'abaabbbaaaabbabbbaaabbb' for 'aaa' and 'bbb'
0123456789
The word between is: 'abba'
0m20.99s real 0m20.54s user 0m00.44s system
I kindly ask that nobody make judgements on the results until the code is properly ran. I will revise the results as soon as it is figured out.
* Edit *
Was able to decrease the time by over 25% (11.93s vs 8.7s). This was done by using pointers to increment the positions, as opposed to size_t. Collecting the return string whilst checking the last string was likely what caused the biggest change. I feel there is still lots of room for improvement. A big loss comes from having to free malloc. If there is a better way, I'd like to know.
fast_strbetween3.c:
/*
gcc -Wall -O3 fast_strbetween.c -o fast_strbetween
*/
#include<stdio.h> // printf
#include<stdlib.h> // malloc, free
char * fast_strbetween(const char *str, const char *str1, const char *str2)
{
const char *sbegin = &str1[0]; // String beginning
const char *spos;
// Find str1
do{
spos = str;
str1 = sbegin;
while(*spos == *str1 && *str1)
{
++spos;
++str1;
}
++str;
}while(*str1 && *spos);
// Nothing found if spos hasn't advanced
if (spos == str)
return NULL;
char *strbetween = malloc(1);
if (!strbetween)
return '\0';
str = spos;
int i = 0;
//char *p = &strbetween[0]; // Alt. for advancing strbetween (slower)
sbegin = &str2[0]; // Recycle sbegin
// Find str2
do{
str2 = sbegin;
spos = str;
while(*spos == *str2 && *str2)
{
++str2;
++spos;
}
//*p = *str;
//++p;
strbetween[i] = *str;
++str;
++i;
}while(*str2 && *spos);
if (spos == str)
return NULL;
//*--p = '\0';
strbetween[i - 1] = '\0';
return strbetween;
}
int main() {
char s[100] = "abaabbbaaaabbabbbaaabbb";
char s1[100] = "aaa";
char s2[100] = "bbb";
printf("\nString: \'%s\'\n", fast_strbetween(s, s1, s2));
for(int i = 10000000; --i; )
free(fast_strbetween(s, s1, s2));
return 0;
}
String: 'abba'
0m08.70s real 0m08.67s user 0m00.01s system
Process used 99.0 - 100% CPU according to 'top' command (Linux).
Memory used while running: 1.8Mb
Executable size: 8336 bytes
Ran on a Raspberry Pi 3B+ (4 x 1.4Ghz, Arm 6)
* Edit *
This doesn't really count as it does not 'return' a value, and therefore is against my own rules, but it does pass a variable through, which is changed and brought back to main. It runs with 1 library and takes 3.6s. Getting rid of malloc was the key.
/*
gcc -Wall -O3 fast_strbetween.c -o fast_strbetween
*/
#include<stdio.h> // printf
unsigned int fast_strbetween(const char *str, const char *str1, const char *str2, char *strbetween)
{
const char *sbegin = &str1[0]; // String beginning
const char *spos;
// Find str1
do{
spos = str;
str1 = sbegin;
while(*spos == *str1 && *str1)
{
++spos;
++str1;
}
++str;
}while(*str1 && *spos);
// Nothing found if spos hasn't advanced
if (spos == str)
{
strbetween[0] = '\0';
return 0;
}
str = spos;
sbegin = &str2[0]; // Recycle sbegin
// Find str2
do{
str2 = sbegin;
spos = str;
while(*spos == *str2 && *str2)
{
++str2;
++spos;
}
*strbetween = *str;
++strbetween;
++str;
}while(*str2 && *spos);
if (spos == str)
{
strbetween[0] = '\0';
return 0;
}
*--strbetween = '\0';
return 1; // Successful (found text)
}
int main() {
char s[100] = "abaabbbaaaabbabbbaaabbb";
char s1[100] = "aaa";
char s2[100] = "bbb";
char sret[100];
fast_strbetween(s, s1, s2, sret);
printf("String: %s\n", sret);
for(int i = 10000000; --i; )
fast_strbetween(s, s1, s2, sret);
return 0;
}
Your code has multiple problems and is probably not as efficient as it should be:
you use types int and unsigned int for indexes into the strings. These types may be smaller than the range of size_t. You should revise your code to use size_t and avoid mixing signed and unsigned types in comparisons.
your functions' string arguments should be declared as const char * as you do not modify the strings and should be able to pass const strings without a warning.
redefining strlen is a bad idea: your version will be slower than the system's optimized, assembly coded and very likely inlined version.
computing the length of str is unnecessary and potentially costly: both str1 and str2 may appear close to the beginning of str, scanning for the end of str will be wasteful.
the while loop inside the first do / while loop is incorrect: while(str1[charsfound] == str[str1pos + charsfound]) charsfound++; may access characters beyond the end of str and str1 as the loop does not stop at the null terminator. If str1 only appears at the end of str, you have undefined behavior.
if str1 is an empty string, you will find it at the end of str instead of at the beginning.
why do you initialize str2pos as int str2pos = str1pos + str1len + 1;? If str2 immediately follows str1 inside str, an empty string should be allocated and returned. Your comment regarding this case is unreadable, you should break such long lines to fit within a typical screen width such as 80 columns. It is debatable whether strbetween("aa", "a", "a") should return "" or NULL. IMHO it should return an allocated empty string, which would be consistent with the expected behavior on strbetween("<name></name>", "<name>", "</name>") or strbetween("''", "'", "'"). Your specification preventing strbetween from returning an empty string produces a counter-intuitive border case.
the second scanning loop has the same problems as the first.
the line char *strbetween = (char *) malloc(sizeof(char) * str2pos - str1pos - str1len); has multiple problems: no cast is necessary in C, if you insist on specifying the element size sizeof(char), which is 1 by definition, you should parenthesize the number of elements, and last but not least, you must allocate one extra element for the null terminator.
You do not test if malloc() succeeded. If it returns NULL, you will have undefined behavior, whereas you should just return NULL.
the copying loop uses a mix of signed and unsigned types, causing potentially counterintuitive behavior on overflow.
you forget to set the null terminator, which is consistent with the allocation size error, but incorrect.
Before you try and optimize code, you must ensure correctness! Your code is too complicated and has multiple flaws. Optimisation is a moot point.
You should first try a very simple implementation using standard C string functions: searching a string inside another one is performed efficiently by strstr.
Here is a simple implementation using strstr and strndup(), which should be available on your system:
#include <string.h>
char *simple_strbetween(const char *str, const char *str1, const char *str2) {
const char *q;
const char *p = strstr(str, str1);
if (p) {
p += strlen(str1);
q = *str2 ? strstr(p, str2) : p + strlen(p);
if (q)
return strndup(p, q - p);
}
return NULL;
}
strndup() is defined in POSIX and is part of the Extensions to the C Library Part II: Dynamic Allocation Functions, ISO/IEC TR 24731-2:2010. If it is not available on your system, it can be redefined as:
#include <stdlib.h>
#include <string.h>
char *strndup(const char *s, size_t n) {
size_t i;
char *p;
for (i = 0; i < n && s[i] != '\0'; i++)
continue;
p = malloc(i + 1);
if (p != NULL) {
memcpy(p, s, i);
p[i] = '\0';
}
return p;
}
To ensure correctness, write a number of test cases, with border cases such as all combinations of empty strings and identical strings.
Once your have thoroughly your strbetween function, you can write a benchmarking framework to test performance. This is not so easy to get reliable performance figures, as you will experience if you try. Remember to configure your compiler to select the appropriate optimisations, -O3 for example.
Only then can you move to the next step: if you are really restricted from using standard C library functions, you may first recode your versions of strstr and strlen and still use the same method. Test this new version both for correctness and for performance.
The redundant parts are the computation of strlen(str1) which must have been determined by strstr when it finds a match. And the scan in strndup() which is unnecessary since no null byte is present between p and q. If you have time to waste, you can try and remove these redundancies at the expense of readability, risking non conformity. I would be surprised if you get any improvement at all on average over a wide variety of test cases. 20% would be remarkable.
Let's say I have a string which contains the following text:
#Line1 Hello, today I ate 3 crackers for dinner
#Line2 and 4 crackers with some soup for lunch.
#Line3 For breakfast tomorrow, I plan on eating
#Line4 bacon, eggs, and ham.
and I wanted to cut the part of the string from one substring to another substring, for example from "#Line3" to \n to get the following output:
#Line1 Hello, today I ate 3 crackers for dinner
#Line2 and 4 crackers with some soup for lunch.
#Line4 bacon, eggs, and ham.
(Just basically cutting out everything from #Line3 to \n and in essence removing the entire 3rd line)
I have read that this could be done with the function memmove but have not been able to figure out how to correctly do so. However, if anyone has a solution that does not involve memmove, of course that would be equally appreciated.
Here's what I have so far:
int str_cut(char *str, char *begin, int len)
{
int l = strlen(str);
if (strlen(begin) + len > l) len = l - begin;
memmove(str + strlen(begin), str + begin + len, l - len + 1);
return len;
}
This is so far pretty far off in accomplishing what I want because it depends on knowing the length of what needs to be cut out and what I want it to do is cut out between 2 chars , to go along with my previous example to cut everything between "line3" and \n
Pretty easy using memmove:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *
strfilter(const char *str, const char *substr)
{
char *res = strdup(str), *ptr = NULL;
size_t len = 0;
if (!res)
return NULL;
ptr = strstr(res, substr);
if (!ptr)
{
free(res);
return NULL;
}
len = strlen(ptr) - strlen(substr);
memmove(ptr, ptr + strlen(substr), len);
memset(ptr + len, 0, strlen(ptr + len));
return res;
}
char *
strfilter2(const char *str, const char *start, const char *end)
{
char *res = strdup(str), *ptr1 = NULL, *ptr2 = NULL, *tmp = NULL;
size_t len1 = 0, len2 = 0;
if (!res)
return NULL;
ptr1 = strstr(res, start);
ptr2 = strstr(res, end);
if (!ptr1 || !ptr2)
return NULL;
if (ptr1 > ptr2)
{
tmp = ptr2;
ptr2 = ptr1;
ptr1 = tmp;
tmp = end;
end = start;
start = tmp;
}
len1 = strlen(start);
len2 = strlen(ptr2);
memmove(ptr1 + len1, ptr2, len2);
memset(ptr1 + len1 + len2, 0, strlen(ptr1 + len1 + len2));
return res;
}
int
main(void)
{
const char *str = "Hello, today I ate 3 crackers for dinner\n"
"and 4 crackers with some soup for lunch.\n"
"For breakfast tomorrow, I plan on eating\n"
"bacon, eggs, and ham.\n";
char *res = strfilter2(str, "and 4 crackers with some soup for lunch.\n", "bacon, eggs, and ham.\n");
if (!res)
{
perror("strfilter2()");
return 1;
}
puts(res);
free(res);
return 0;
}
The function just finds the substring that you want to remove and overwrites it with everything that comes after it, and then it zeroes out the remainder of the string.
EDIT:
Added strfilter2 to eliminate the content between two substrings.
Using memcpy you can copy two parts into two buffers and then concat them using strcat.
Get a substring of a char*
To cut out a part of the string you need to find - of course - first the start and end marks of the part you want to cut. For that you can use strstr. Then just copy the remaining part (everything after the end mark) to the place where you found the start mark:
char * cut_between(
char * const str,
char const * const from,
char const * const to) {
char * const startMark = strstr(str, from);
if (! startMark) {
return NULL;
}
char * const endMark =
strstr(startMark+strlen(from), to);
if (endMark) {
strcpy(startMark, endMark+strlen(to));
return startMark + strlen(startMark) + 1;
} else {
*startMark = '\0';
return startMark + 1;
}
}
On success the above function returns a pointer beyond the end of the resulting string. This is useful for buffer compaction, like:
int main() {
char * const input = malloc(400);
fgets(input, 400, stdin);
char const * const end =
cut_between(input, "from", "to");
if (end) {
char const * const result =
realloc(input, end - input);
puts(result);
// OMG missing free(s), well ... OK for this simple test.
}
return 0;
}
(Live on ideone)
Please note the missing error checks on above test. In production code these must be added.
I got the following string from the user:
char *abc = "a234bc567d";
but all the numbers can have different lengths than in this example (letters are constants).
How can I get each part of numbers? (again, it can be 234 or 23743 or something else..)
I tried to use strchr and strncpy but I need to allocate memory for this (for strncpy), and I hope there is a better solution.
Thanks.
You can do something like this:
char *abc = "a234bc567d";
char *ptr = abc; // point to start of abc
// While not at the end of the string
while (*ptr != '\0')
{
// If position is the start of a number
if (isdigit(*ptr))
{
// Get value (assuming base 10), store end position of number in ptr
int value = strtol(ptr, &ptr, 10);
printf("Found value %d\n", value);
}
else
{
ptr++; // Increase pointer
}
}
If I understand your question, you are trying to extract the parts of the user input that contain numbers ... and the sequence of numbers can be variable ... but the letters are fixed i.e. a or b or c or d. Correct ... ? The following program may help you. I tried it for strings "a234bc567d", "a23743bc567d" and "a23743bc5672344d". Works ...
int main()
{
char *sUser = "a234bc567d";
//char *sUser = "a23743bc567d";
//char *sUser = "a23743bc5672344d";
int iLen = strlen(sUser);
char *sInput = (char *)malloc((iLen+1) * sizeof(char));
strcpy(sInput, sUser);
char *sSeparator = "abcd";
char *pToken = strtok(sInput, sSeparator);
while(1)
{
if(pToken == NULL)
break;
printf("Token = %s\n", pToken);
pToken = strtok(NULL, sSeparator);
}
return 0;
}
Given a (char *) string, I want to find all occurrences of a substring and replace them with an alternate string. I do not see any simple function that achieves this in <string.h>.
The optimizer should eliminate most of the local variables. The tmp pointer is there to make sure strcpy doesn't have to walk the string to find the null. tmp points to the end of result after each call. (See Shlemiel the painter's algorithm for why strcpy can be annoying.)
// You must free the result if result is non-NULL.
char *str_replace(char *orig, char *rep, char *with) {
char *result; // the return string
char *ins; // the next insert point
char *tmp; // varies
int len_rep; // length of rep (the string to remove)
int len_with; // length of with (the string to replace rep with)
int len_front; // distance between rep and end of last rep
int count; // number of replacements
// sanity checks and initialization
if (!orig || !rep)
return NULL;
len_rep = strlen(rep);
if (len_rep == 0)
return NULL; // empty rep causes infinite loop during count
if (!with)
with = "";
len_with = strlen(with);
// count the number of replacements needed
ins = orig;
for (count = 0; tmp = strstr(ins, rep); ++count) {
ins = tmp + len_rep;
}
tmp = result = malloc(strlen(orig) + (len_with - len_rep) * count + 1);
if (!result)
return NULL;
// first time through the loop, all the variable are set correctly
// from here on,
// tmp points to the end of the result string
// ins points to the next occurrence of rep in orig
// orig points to the remainder of orig after "end of rep"
while (count--) {
ins = strstr(orig, rep);
len_front = ins - orig;
tmp = strncpy(tmp, orig, len_front) + len_front;
tmp = strcpy(tmp, with) + len_with;
orig += len_front + len_rep; // move to next "end of rep"
}
strcpy(tmp, orig);
return result;
}
This is not provided in the standard C library because, given only a char* you can't increase the memory allocated to the string if the replacement string is longer than the string being replaced.
You can do this using std::string more easily, but even there, no single function will do it for you.
There isn't one.
You'd need to roll your own using something like strstr and strcat or strcpy.
You could build your own replace function using strstr to find the substrings and strncpy to copy in parts to a new buffer.
Unless what you want to replace_with is the same length as what you you want to replace, then it's probably best to use a new buffer to copy the new string to.
Here's some sample code that does it.
#include <string.h>
#include <stdlib.h>
char * replace(
char const * const original,
char const * const pattern,
char const * const replacement
) {
size_t const replen = strlen(replacement);
size_t const patlen = strlen(pattern);
size_t const orilen = strlen(original);
size_t patcnt = 0;
const char * oriptr;
const char * patloc;
// find how many times the pattern occurs in the original string
for (oriptr = original; patloc = strstr(oriptr, pattern); oriptr = patloc + patlen)
{
patcnt++;
}
{
// allocate memory for the new string
size_t const retlen = orilen + patcnt * (replen - patlen);
char * const returned = (char *) malloc( sizeof(char) * (retlen + 1) );
if (returned != NULL)
{
// copy the original string,
// replacing all the instances of the pattern
char * retptr = returned;
for (oriptr = original; patloc = strstr(oriptr, pattern); oriptr = patloc + patlen)
{
size_t const skplen = patloc - oriptr;
// copy the section until the occurence of the pattern
strncpy(retptr, oriptr, skplen);
retptr += skplen;
// copy the replacement
strncpy(retptr, replacement, replen);
retptr += replen;
}
// copy the rest of the string.
strcpy(retptr, oriptr);
}
return returned;
}
}
#include <stdio.h>
int main(int argc, char * argv[])
{
if (argc != 4)
{
fprintf(stderr,"usage: %s <original text> <pattern> <replacement>\n", argv[0]);
exit(-1);
}
else
{
char * const newstr = replace(argv[1], argv[2], argv[3]);
if (newstr)
{
printf("%s\n", newstr);
free(newstr);
}
else
{
fprintf(stderr,"allocation error\n");
exit(-2);
}
}
return 0;
}
As strings in C can not dynamically grow inplace substitution will generally not work. Therefore you need to allocate space for a new string that has enough room for your substitution and then copy the parts from the original plus the substitution into the new string. To copy the parts you would use strncpy.
// Here is the code for unicode strings!
int mystrstr(wchar_t *txt1,wchar_t *txt2)
{
wchar_t *posstr=wcsstr(txt1,txt2);
if(posstr!=NULL)
{
return (posstr-txt1);
}else
{
return -1;
}
}
// assume: supplied buff is enough to hold generated text
void StringReplace(wchar_t *buff,wchar_t *txt1,wchar_t *txt2)
{
wchar_t *tmp;
wchar_t *nextStr;
int pos;
tmp=wcsdup(buff);
pos=mystrstr(tmp,txt1);
if(pos!=-1)
{
buff[0]=0;
wcsncpy(buff,tmp,pos);
buff[pos]=0;
wcscat(buff,txt2);
nextStr=tmp+pos+wcslen(txt1);
while(wcslen(nextStr)!=0)
{
pos=mystrstr(nextStr,txt1);
if(pos==-1)
{
wcscat(buff,nextStr);
break;
}
wcsncat(buff,nextStr,pos);
wcscat(buff,txt2);
nextStr=nextStr+pos+wcslen(txt1);
}
}
free(tmp);
}
The repl_str() function on creativeandcritical.net is fast and reliable. Also included on that page is a wide string variant, repl_wcs(), which can be used with Unicode strings including those encoded in UTF-8, through helper functions - demo code is linked from the page. Belated full disclosure: I am the author of that page and the functions on it.
Here is the one that I created based on these requirements:
Replace the pattern regardless of whether is was long or shorter.
Not use any malloc (explicit or implicit) to intrinsically avoid memory leaks.
Replace any number of occurrences of pattern.
Tolerate the replace string having a substring equal to the search string.
Does not have to check that the Line array is sufficient in size to hold the replacement. e.g. This does not work unless the caller knows that line is of sufficient size to hold the new string.
avoid use of strcat() to avoid overhead of scanning the entire string to append another string.
/* returns number of strings replaced.
*/
int replacestr(char *line, const char *search, const char *replace)
{
int count;
char *sp; // start of pattern
//printf("replacestr(%s, %s, %s)\n", line, search, replace);
if ((sp = strstr(line, search)) == NULL) {
return(0);
}
count = 1;
int sLen = strlen(search);
int rLen = strlen(replace);
if (sLen > rLen) {
// move from right to left
char *src = sp + sLen;
char *dst = sp + rLen;
while((*dst = *src) != '\0') { dst++; src++; }
} else if (sLen < rLen) {
// move from left to right
int tLen = strlen(sp) - sLen;
char *stop = sp + rLen;
char *src = sp + sLen + tLen;
char *dst = sp + rLen + tLen;
while(dst >= stop) { *dst = *src; dst--; src--; }
}
memcpy(sp, replace, rLen);
count += replacestr(sp + rLen, search, replace);
return(count);
}
Any suggestions for improving this code are cheerfully accepted. Just post the comment and I will test it.
i find most of the proposed functions hard to understand - so i came up with this:
static char *dull_replace(const char *in, const char *pattern, const char *by)
{
size_t outsize = strlen(in) + 1;
// TODO maybe avoid reallocing by counting the non-overlapping occurences of pattern
char *res = malloc(outsize);
// use this to iterate over the output
size_t resoffset = 0;
char *needle;
while (needle = strstr(in, pattern)) {
// copy everything up to the pattern
memcpy(res + resoffset, in, needle - in);
resoffset += needle - in;
// skip the pattern in the input-string
in = needle + strlen(pattern);
// adjust space for replacement
outsize = outsize - strlen(pattern) + strlen(by);
res = realloc(res, outsize);
// copy the pattern
memcpy(res + resoffset, by, strlen(by));
resoffset += strlen(by);
}
// copy the remaining input
strcpy(res + resoffset, in);
return res;
}
output must be free'd
a fix to fann95's response, using in-place modification of the string, and assuming the buffer pointed to by line is large enough to hold the resulting string.
static void replacestr(char *line, const char *search, const char *replace)
{
char *sp;
if ((sp = strstr(line, search)) == NULL) {
return;
}
int search_len = strlen(search);
int replace_len = strlen(replace);
int tail_len = strlen(sp+search_len);
memmove(sp+replace_len,sp+search_len,tail_len+1);
memcpy(sp, replace, replace_len);
}
/*замена символа в строке*/
char* replace_char(char* str, char in, char out) {
char * p = str;
while(p != '\0') {
if(*p == in)
*p == out;
++p;
}
return str;
}
This function only works if ur string has extra space for new length
void replace_str(char *str,char *org,char *rep)
{
char *ToRep = strstr(str,org);
char *Rest = (char*)malloc(strlen(ToRep));
strcpy(Rest,((ToRep)+strlen(org)));
strcpy(ToRep,rep);
strcat(ToRep,Rest);
free(Rest);
}
This only replaces First occurrence
Here goes mine, make them all char*, which makes calling easier...
char *strrpc(char *str,char *oldstr,char *newstr){
char bstr[strlen(str)];
memset(bstr,0,sizeof(bstr));
int i;
for(i = 0;i < strlen(str);i++){
if(!strncmp(str+i,oldstr,strlen(oldstr))){
strcat(bstr,newstr);
i += strlen(oldstr) - 1;
}else{
strncat(bstr,str + i,1);
}
}
strcpy(str,bstr);
return str;
}
There is a function in string.h but it works with char [] not char* but again it outputs a char* and not a char []
It is simple and beautiful
Supposing we want to replace 'and' in 'TheandQuickandBrownandFox'.
We first split with strtok and then join with snprintf defined in the stdio.h
char sometext[] = "TheandQuickandBrownandFox";
char* replaced = malloc(1024);
// split on the substring, here I am using (and)
char* token = strtok(sometext, "and");
snprintf(replaced, 1, "%s", ""); // initialise so we can compare
while(token) {
if (strcmp(replaced, "") < 1) {
// if it is the first one
snprintf(replaced, 1024, "%s", token);
token = NULL;
} else {
// put the space between the existing and new
snprintf(replaced, 1024, "%s %s", replaced, token);
token = NULL;
}
}
free(replaced);
This should give us:
The Quick Brown Fox
You can use this function (the comments explain how it works):
void strreplace(char *string, const char *find, const char *replaceWith){
if(strstr(string, find) != NULL){
char *temporaryString = malloc(strlen(strstr(string, find) + strlen(find)) + 1);
strcpy(temporaryString, strstr(string, find) + strlen(find)); //Create a string with what's after the replaced part
*strstr(string, find) = '\0'; //Take away the part to replace and the part after it in the initial string
strcat(string, replaceWith); //Concat the first part of the string with the part to replace with
strcat(string, temporaryString); //Concat the first part of the string with the part after the replaced part
free(temporaryString); //Free the memory to avoid memory leaks
}
}
DWORD ReplaceString(__inout PCHAR source, __in DWORD dwSourceLen, __in const char* pszTextToReplace, __in const char* pszReplaceWith)
{
DWORD dwRC = NO_ERROR;
PCHAR foundSeq = NULL;
PCHAR restOfString = NULL;
PCHAR searchStart = source;
size_t szReplStrcLen = strlen(pszReplaceWith), szRestOfStringLen = 0, sztextToReplaceLen = strlen(pszTextToReplace), remainingSpace = 0, dwSpaceRequired = 0;
if (strcmp(pszTextToReplace, "") == 0)
dwRC = ERROR_INVALID_PARAMETER;
else if (strcmp(pszTextToReplace, pszReplaceWith) != 0)
{
do
{
foundSeq = strstr(searchStart, pszTextToReplace);
if (foundSeq)
{
szRestOfStringLen = (strlen(foundSeq) - sztextToReplaceLen) + 1;
remainingSpace = dwSourceLen - (foundSeq - source);
dwSpaceRequired = szReplStrcLen + (szRestOfStringLen);
if (dwSpaceRequired > remainingSpace)
{
dwRC = ERROR_MORE_DATA;
}
else
{
restOfString = CMNUTIL_calloc(szRestOfStringLen, sizeof(CHAR));
strcpy_s(restOfString, szRestOfStringLen, foundSeq + sztextToReplaceLen);
strcpy_s(foundSeq, remainingSpace, pszReplaceWith);
strcat_s(foundSeq, remainingSpace, restOfString);
}
CMNUTIL_free(restOfString);
searchStart = foundSeq + szReplStrcLen; //search in the remaining str. (avoid loops when replWith contains textToRepl
}
} while (foundSeq && dwRC == NO_ERROR);
}
return dwRC;
}
char *replace(const char*instring, const char *old_part, const char *new_part)
{
#ifndef EXPECTED_REPLACEMENTS
#define EXPECTED_REPLACEMENTS 100
#endif
if(!instring || !old_part || !new_part)
{
return (char*)NULL;
}
size_t instring_len=strlen(instring);
size_t new_len=strlen(new_part);
size_t old_len=strlen(old_part);
if(instring_len<old_len || old_len==0)
{
return (char*)NULL;
}
const char *in=instring;
const char *found=NULL;
size_t count=0;
size_t out=0;
size_t ax=0;
char *outstring=NULL;
if(new_len> old_len )
{
size_t Diff=EXPECTED_REPLACEMENTS*(new_len-old_len);
size_t outstring_len=instring_len + Diff;
outstring =(char*) malloc(outstring_len);
if(!outstring){
return (char*)NULL;
}
while((found = strstr(in, old_part))!=NULL)
{
if(count==EXPECTED_REPLACEMENTS)
{
outstring_len+=Diff;
if((outstring=realloc(outstring,outstring_len))==NULL)
{
return (char*)NULL;
}
count=0;
}
ax=found-in;
strncpy(outstring+out,in,ax);
out+=ax;
strncpy(outstring+out,new_part,new_len);
out+=new_len;
in=found+old_len;
count++;
}
}
else
{
outstring =(char*) malloc(instring_len);
if(!outstring){
return (char*)NULL;
}
while((found = strstr(in, old_part))!=NULL)
{
ax=found-in;
strncpy(outstring+out,in,ax);
out+=ax;
strncpy(outstring+out,new_part,new_len);
out+=new_len;
in=found+old_len;
}
}
ax=(instring+instring_len)-in;
strncpy(outstring+out,in,ax);
out+=ax;
outstring[out]='\0';
return outstring;
}
Using only strlen from string.h
sorry for my English
char * str_replace(char * text,char * rep, char * repw){//text -> to replace in it | rep -> replace | repw -> replace with
int replen = strlen(rep),repwlen = strlen(repw),count;//some constant variables
for(int i=0;i<strlen(text);i++){//search for the first character from rep in text
if(text[i] == rep[0]){//if it found it
count = 1;//start searching from the next character to avoid repetition
for(int j=1;j<replen;j++){
if(text[i+j] == rep[j]){//see if the next character in text is the same as the next in the rep if not break
count++;
}else{
break;
}
}
if(count == replen){//if count equals to the lenght of the rep then we found the word that we want to replace in the text
if(replen < repwlen){
for(int l = strlen(text);l>i;l--){//cuz repwlen greater than replen we need to shift characters to the right to make space for the replacement to fit
text[l+repwlen-replen] = text[l];//shift by repwlen-replen
}
}
if(replen > repwlen){
for(int l=i+replen-repwlen;l<strlen(text);l++){//cuz replen greater than repwlen we need to shift the characters to the left
text[l-(replen-repwlen)] = text[l];//shift by replen-repwlen
}
text[strlen(text)-(replen-repwlen)] = '\0';//get rid of the last unwanted characters
}
for(int l=0;l<repwlen;l++){//replace rep with repwlen
text[i+l] = repw[l];
}
if(replen != repwlen){
i+=repwlen-1;//pass to the next character | try text "y" ,rep "y",repw "yy" without this line to understand
}
}
}
}
return text;
}
if you want strlen code to avoid calling string.h
int strlen(char * string){//use this code to avoid calling string.h
int lenght = 0;
while(string[lenght] != '\0'){
lenght++;
}
return lenght;
}
There you go....this is the function to replace every occurance of char x with char y within character string str
char *zStrrep(char *str, char x, char y){
char *tmp=str;
while(*tmp)
if(*tmp == x)
*tmp++ = y; /* assign first, then incement */
else
*tmp++;
// *tmp='\0'; -> we do not need this
return str;
}
An example usage could be
Exmaple Usage
char s[]="this is a trial string to test the function.";
char x=' ', y='_';
printf("%s\n",zStrrep(s,x,y));
Example Output
this_is_a_trial_string_to_test_the_function.
The function is from a string library I maintain on Github, you are more than welcome to have a look at other available functions or even contribute to the code :)
https://github.com/fnoyanisi/zString
EDIT:
#siride is right, the function above replaces chars only. Just wrote this one, which replaces character strings.
#include <stdio.h>
#include <stdlib.h>
/* replace every occurance of string x with string y */
char *zstring_replace_str(char *str, const char *x, const char *y){
char *tmp_str = str, *tmp_x = x, *dummy_ptr = tmp_x, *tmp_y = y;
int len_str=0, len_y=0, len_x=0;
/* string length */
for(; *tmp_y; ++len_y, ++tmp_y)
;
for(; *tmp_str; ++len_str, ++tmp_str)
;
for(; *tmp_x; ++len_x, ++tmp_x)
;
/* Bounds check */
if (len_y >= len_str)
return str;
/* reset tmp pointers */
tmp_y = y;
tmp_x = x;
for (tmp_str = str ; *tmp_str; ++tmp_str)
if(*tmp_str == *tmp_x) {
/* save tmp_str */
for (dummy_ptr=tmp_str; *dummy_ptr == *tmp_x; ++tmp_x, ++dummy_ptr)
if (*(tmp_x+1) == '\0' && ((dummy_ptr-str+len_y) < len_str)){
/* Reached end of x, we got something to replace then!
* Copy y only if there is enough room for it
*/
for(tmp_y=y; *tmp_y; ++tmp_y, ++tmp_str)
*tmp_str = *tmp_y;
}
/* reset tmp_x */
tmp_x = x;
}
return str;
}
int main()
{
char s[]="Free software is a matter of liberty, not price.\n"
"To understand the concept, you should think of 'free' \n"
"as in 'free speech', not as in 'free beer'";
printf("%s\n\n",s);
printf("%s\n",zstring_replace_str(s,"ree","XYZ"));
return 0;
}
And below is the output
Free software is a matter of liberty, not price.
To understand the concept, you should think of 'free'
as in 'free speech', not as in 'free beer'
FXYZ software is a matter of liberty, not price.
To understand the concept, you should think of 'fXYZ'
as in 'fXYZ speech', not as in 'fXYZ beer'
You can use strrep()
char* strrep ( const char * cadena,
const char * strf,
const char * strr
)
strrep (String Replace). Replaces strf with strr in cadena and returns the new string. You need to free the returned string in your code after using strrep.
Parameters:
cadena: The string with the text.
strf: The text to find.
strr: The replacement text.
Returns
The text updated wit the replacement.
Project can be found at https://github.com/ipserc/strrep
Given a (char *) string, I want to find all occurrences of a substring and replace them with an alternate string. I do not see any simple function that achieves this in <string.h>.
The optimizer should eliminate most of the local variables. The tmp pointer is there to make sure strcpy doesn't have to walk the string to find the null. tmp points to the end of result after each call. (See Shlemiel the painter's algorithm for why strcpy can be annoying.)
// You must free the result if result is non-NULL.
char *str_replace(char *orig, char *rep, char *with) {
char *result; // the return string
char *ins; // the next insert point
char *tmp; // varies
int len_rep; // length of rep (the string to remove)
int len_with; // length of with (the string to replace rep with)
int len_front; // distance between rep and end of last rep
int count; // number of replacements
// sanity checks and initialization
if (!orig || !rep)
return NULL;
len_rep = strlen(rep);
if (len_rep == 0)
return NULL; // empty rep causes infinite loop during count
if (!with)
with = "";
len_with = strlen(with);
// count the number of replacements needed
ins = orig;
for (count = 0; tmp = strstr(ins, rep); ++count) {
ins = tmp + len_rep;
}
tmp = result = malloc(strlen(orig) + (len_with - len_rep) * count + 1);
if (!result)
return NULL;
// first time through the loop, all the variable are set correctly
// from here on,
// tmp points to the end of the result string
// ins points to the next occurrence of rep in orig
// orig points to the remainder of orig after "end of rep"
while (count--) {
ins = strstr(orig, rep);
len_front = ins - orig;
tmp = strncpy(tmp, orig, len_front) + len_front;
tmp = strcpy(tmp, with) + len_with;
orig += len_front + len_rep; // move to next "end of rep"
}
strcpy(tmp, orig);
return result;
}
This is not provided in the standard C library because, given only a char* you can't increase the memory allocated to the string if the replacement string is longer than the string being replaced.
You can do this using std::string more easily, but even there, no single function will do it for you.
There isn't one.
You'd need to roll your own using something like strstr and strcat or strcpy.
You could build your own replace function using strstr to find the substrings and strncpy to copy in parts to a new buffer.
Unless what you want to replace_with is the same length as what you you want to replace, then it's probably best to use a new buffer to copy the new string to.
Here's some sample code that does it.
#include <string.h>
#include <stdlib.h>
char * replace(
char const * const original,
char const * const pattern,
char const * const replacement
) {
size_t const replen = strlen(replacement);
size_t const patlen = strlen(pattern);
size_t const orilen = strlen(original);
size_t patcnt = 0;
const char * oriptr;
const char * patloc;
// find how many times the pattern occurs in the original string
for (oriptr = original; patloc = strstr(oriptr, pattern); oriptr = patloc + patlen)
{
patcnt++;
}
{
// allocate memory for the new string
size_t const retlen = orilen + patcnt * (replen - patlen);
char * const returned = (char *) malloc( sizeof(char) * (retlen + 1) );
if (returned != NULL)
{
// copy the original string,
// replacing all the instances of the pattern
char * retptr = returned;
for (oriptr = original; patloc = strstr(oriptr, pattern); oriptr = patloc + patlen)
{
size_t const skplen = patloc - oriptr;
// copy the section until the occurence of the pattern
strncpy(retptr, oriptr, skplen);
retptr += skplen;
// copy the replacement
strncpy(retptr, replacement, replen);
retptr += replen;
}
// copy the rest of the string.
strcpy(retptr, oriptr);
}
return returned;
}
}
#include <stdio.h>
int main(int argc, char * argv[])
{
if (argc != 4)
{
fprintf(stderr,"usage: %s <original text> <pattern> <replacement>\n", argv[0]);
exit(-1);
}
else
{
char * const newstr = replace(argv[1], argv[2], argv[3]);
if (newstr)
{
printf("%s\n", newstr);
free(newstr);
}
else
{
fprintf(stderr,"allocation error\n");
exit(-2);
}
}
return 0;
}
As strings in C can not dynamically grow inplace substitution will generally not work. Therefore you need to allocate space for a new string that has enough room for your substitution and then copy the parts from the original plus the substitution into the new string. To copy the parts you would use strncpy.
// Here is the code for unicode strings!
int mystrstr(wchar_t *txt1,wchar_t *txt2)
{
wchar_t *posstr=wcsstr(txt1,txt2);
if(posstr!=NULL)
{
return (posstr-txt1);
}else
{
return -1;
}
}
// assume: supplied buff is enough to hold generated text
void StringReplace(wchar_t *buff,wchar_t *txt1,wchar_t *txt2)
{
wchar_t *tmp;
wchar_t *nextStr;
int pos;
tmp=wcsdup(buff);
pos=mystrstr(tmp,txt1);
if(pos!=-1)
{
buff[0]=0;
wcsncpy(buff,tmp,pos);
buff[pos]=0;
wcscat(buff,txt2);
nextStr=tmp+pos+wcslen(txt1);
while(wcslen(nextStr)!=0)
{
pos=mystrstr(nextStr,txt1);
if(pos==-1)
{
wcscat(buff,nextStr);
break;
}
wcsncat(buff,nextStr,pos);
wcscat(buff,txt2);
nextStr=nextStr+pos+wcslen(txt1);
}
}
free(tmp);
}
The repl_str() function on creativeandcritical.net is fast and reliable. Also included on that page is a wide string variant, repl_wcs(), which can be used with Unicode strings including those encoded in UTF-8, through helper functions - demo code is linked from the page. Belated full disclosure: I am the author of that page and the functions on it.
Here is the one that I created based on these requirements:
Replace the pattern regardless of whether is was long or shorter.
Not use any malloc (explicit or implicit) to intrinsically avoid memory leaks.
Replace any number of occurrences of pattern.
Tolerate the replace string having a substring equal to the search string.
Does not have to check that the Line array is sufficient in size to hold the replacement. e.g. This does not work unless the caller knows that line is of sufficient size to hold the new string.
avoid use of strcat() to avoid overhead of scanning the entire string to append another string.
/* returns number of strings replaced.
*/
int replacestr(char *line, const char *search, const char *replace)
{
int count;
char *sp; // start of pattern
//printf("replacestr(%s, %s, %s)\n", line, search, replace);
if ((sp = strstr(line, search)) == NULL) {
return(0);
}
count = 1;
int sLen = strlen(search);
int rLen = strlen(replace);
if (sLen > rLen) {
// move from right to left
char *src = sp + sLen;
char *dst = sp + rLen;
while((*dst = *src) != '\0') { dst++; src++; }
} else if (sLen < rLen) {
// move from left to right
int tLen = strlen(sp) - sLen;
char *stop = sp + rLen;
char *src = sp + sLen + tLen;
char *dst = sp + rLen + tLen;
while(dst >= stop) { *dst = *src; dst--; src--; }
}
memcpy(sp, replace, rLen);
count += replacestr(sp + rLen, search, replace);
return(count);
}
Any suggestions for improving this code are cheerfully accepted. Just post the comment and I will test it.
i find most of the proposed functions hard to understand - so i came up with this:
static char *dull_replace(const char *in, const char *pattern, const char *by)
{
size_t outsize = strlen(in) + 1;
// TODO maybe avoid reallocing by counting the non-overlapping occurences of pattern
char *res = malloc(outsize);
// use this to iterate over the output
size_t resoffset = 0;
char *needle;
while (needle = strstr(in, pattern)) {
// copy everything up to the pattern
memcpy(res + resoffset, in, needle - in);
resoffset += needle - in;
// skip the pattern in the input-string
in = needle + strlen(pattern);
// adjust space for replacement
outsize = outsize - strlen(pattern) + strlen(by);
res = realloc(res, outsize);
// copy the pattern
memcpy(res + resoffset, by, strlen(by));
resoffset += strlen(by);
}
// copy the remaining input
strcpy(res + resoffset, in);
return res;
}
output must be free'd
a fix to fann95's response, using in-place modification of the string, and assuming the buffer pointed to by line is large enough to hold the resulting string.
static void replacestr(char *line, const char *search, const char *replace)
{
char *sp;
if ((sp = strstr(line, search)) == NULL) {
return;
}
int search_len = strlen(search);
int replace_len = strlen(replace);
int tail_len = strlen(sp+search_len);
memmove(sp+replace_len,sp+search_len,tail_len+1);
memcpy(sp, replace, replace_len);
}
/*замена символа в строке*/
char* replace_char(char* str, char in, char out) {
char * p = str;
while(p != '\0') {
if(*p == in)
*p == out;
++p;
}
return str;
}
This function only works if ur string has extra space for new length
void replace_str(char *str,char *org,char *rep)
{
char *ToRep = strstr(str,org);
char *Rest = (char*)malloc(strlen(ToRep));
strcpy(Rest,((ToRep)+strlen(org)));
strcpy(ToRep,rep);
strcat(ToRep,Rest);
free(Rest);
}
This only replaces First occurrence
Here goes mine, make them all char*, which makes calling easier...
char *strrpc(char *str,char *oldstr,char *newstr){
char bstr[strlen(str)];
memset(bstr,0,sizeof(bstr));
int i;
for(i = 0;i < strlen(str);i++){
if(!strncmp(str+i,oldstr,strlen(oldstr))){
strcat(bstr,newstr);
i += strlen(oldstr) - 1;
}else{
strncat(bstr,str + i,1);
}
}
strcpy(str,bstr);
return str;
}
There is a function in string.h but it works with char [] not char* but again it outputs a char* and not a char []
It is simple and beautiful
Supposing we want to replace 'and' in 'TheandQuickandBrownandFox'.
We first split with strtok and then join with snprintf defined in the stdio.h
char sometext[] = "TheandQuickandBrownandFox";
char* replaced = malloc(1024);
// split on the substring, here I am using (and)
char* token = strtok(sometext, "and");
snprintf(replaced, 1, "%s", ""); // initialise so we can compare
while(token) {
if (strcmp(replaced, "") < 1) {
// if it is the first one
snprintf(replaced, 1024, "%s", token);
token = NULL;
} else {
// put the space between the existing and new
snprintf(replaced, 1024, "%s %s", replaced, token);
token = NULL;
}
}
free(replaced);
This should give us:
The Quick Brown Fox
You can use this function (the comments explain how it works):
void strreplace(char *string, const char *find, const char *replaceWith){
if(strstr(string, find) != NULL){
char *temporaryString = malloc(strlen(strstr(string, find) + strlen(find)) + 1);
strcpy(temporaryString, strstr(string, find) + strlen(find)); //Create a string with what's after the replaced part
*strstr(string, find) = '\0'; //Take away the part to replace and the part after it in the initial string
strcat(string, replaceWith); //Concat the first part of the string with the part to replace with
strcat(string, temporaryString); //Concat the first part of the string with the part after the replaced part
free(temporaryString); //Free the memory to avoid memory leaks
}
}
DWORD ReplaceString(__inout PCHAR source, __in DWORD dwSourceLen, __in const char* pszTextToReplace, __in const char* pszReplaceWith)
{
DWORD dwRC = NO_ERROR;
PCHAR foundSeq = NULL;
PCHAR restOfString = NULL;
PCHAR searchStart = source;
size_t szReplStrcLen = strlen(pszReplaceWith), szRestOfStringLen = 0, sztextToReplaceLen = strlen(pszTextToReplace), remainingSpace = 0, dwSpaceRequired = 0;
if (strcmp(pszTextToReplace, "") == 0)
dwRC = ERROR_INVALID_PARAMETER;
else if (strcmp(pszTextToReplace, pszReplaceWith) != 0)
{
do
{
foundSeq = strstr(searchStart, pszTextToReplace);
if (foundSeq)
{
szRestOfStringLen = (strlen(foundSeq) - sztextToReplaceLen) + 1;
remainingSpace = dwSourceLen - (foundSeq - source);
dwSpaceRequired = szReplStrcLen + (szRestOfStringLen);
if (dwSpaceRequired > remainingSpace)
{
dwRC = ERROR_MORE_DATA;
}
else
{
restOfString = CMNUTIL_calloc(szRestOfStringLen, sizeof(CHAR));
strcpy_s(restOfString, szRestOfStringLen, foundSeq + sztextToReplaceLen);
strcpy_s(foundSeq, remainingSpace, pszReplaceWith);
strcat_s(foundSeq, remainingSpace, restOfString);
}
CMNUTIL_free(restOfString);
searchStart = foundSeq + szReplStrcLen; //search in the remaining str. (avoid loops when replWith contains textToRepl
}
} while (foundSeq && dwRC == NO_ERROR);
}
return dwRC;
}
char *replace(const char*instring, const char *old_part, const char *new_part)
{
#ifndef EXPECTED_REPLACEMENTS
#define EXPECTED_REPLACEMENTS 100
#endif
if(!instring || !old_part || !new_part)
{
return (char*)NULL;
}
size_t instring_len=strlen(instring);
size_t new_len=strlen(new_part);
size_t old_len=strlen(old_part);
if(instring_len<old_len || old_len==0)
{
return (char*)NULL;
}
const char *in=instring;
const char *found=NULL;
size_t count=0;
size_t out=0;
size_t ax=0;
char *outstring=NULL;
if(new_len> old_len )
{
size_t Diff=EXPECTED_REPLACEMENTS*(new_len-old_len);
size_t outstring_len=instring_len + Diff;
outstring =(char*) malloc(outstring_len);
if(!outstring){
return (char*)NULL;
}
while((found = strstr(in, old_part))!=NULL)
{
if(count==EXPECTED_REPLACEMENTS)
{
outstring_len+=Diff;
if((outstring=realloc(outstring,outstring_len))==NULL)
{
return (char*)NULL;
}
count=0;
}
ax=found-in;
strncpy(outstring+out,in,ax);
out+=ax;
strncpy(outstring+out,new_part,new_len);
out+=new_len;
in=found+old_len;
count++;
}
}
else
{
outstring =(char*) malloc(instring_len);
if(!outstring){
return (char*)NULL;
}
while((found = strstr(in, old_part))!=NULL)
{
ax=found-in;
strncpy(outstring+out,in,ax);
out+=ax;
strncpy(outstring+out,new_part,new_len);
out+=new_len;
in=found+old_len;
}
}
ax=(instring+instring_len)-in;
strncpy(outstring+out,in,ax);
out+=ax;
outstring[out]='\0';
return outstring;
}
Using only strlen from string.h
sorry for my English
char * str_replace(char * text,char * rep, char * repw){//text -> to replace in it | rep -> replace | repw -> replace with
int replen = strlen(rep),repwlen = strlen(repw),count;//some constant variables
for(int i=0;i<strlen(text);i++){//search for the first character from rep in text
if(text[i] == rep[0]){//if it found it
count = 1;//start searching from the next character to avoid repetition
for(int j=1;j<replen;j++){
if(text[i+j] == rep[j]){//see if the next character in text is the same as the next in the rep if not break
count++;
}else{
break;
}
}
if(count == replen){//if count equals to the lenght of the rep then we found the word that we want to replace in the text
if(replen < repwlen){
for(int l = strlen(text);l>i;l--){//cuz repwlen greater than replen we need to shift characters to the right to make space for the replacement to fit
text[l+repwlen-replen] = text[l];//shift by repwlen-replen
}
}
if(replen > repwlen){
for(int l=i+replen-repwlen;l<strlen(text);l++){//cuz replen greater than repwlen we need to shift the characters to the left
text[l-(replen-repwlen)] = text[l];//shift by replen-repwlen
}
text[strlen(text)-(replen-repwlen)] = '\0';//get rid of the last unwanted characters
}
for(int l=0;l<repwlen;l++){//replace rep with repwlen
text[i+l] = repw[l];
}
if(replen != repwlen){
i+=repwlen-1;//pass to the next character | try text "y" ,rep "y",repw "yy" without this line to understand
}
}
}
}
return text;
}
if you want strlen code to avoid calling string.h
int strlen(char * string){//use this code to avoid calling string.h
int lenght = 0;
while(string[lenght] != '\0'){
lenght++;
}
return lenght;
}
There you go....this is the function to replace every occurance of char x with char y within character string str
char *zStrrep(char *str, char x, char y){
char *tmp=str;
while(*tmp)
if(*tmp == x)
*tmp++ = y; /* assign first, then incement */
else
*tmp++;
// *tmp='\0'; -> we do not need this
return str;
}
An example usage could be
Exmaple Usage
char s[]="this is a trial string to test the function.";
char x=' ', y='_';
printf("%s\n",zStrrep(s,x,y));
Example Output
this_is_a_trial_string_to_test_the_function.
The function is from a string library I maintain on Github, you are more than welcome to have a look at other available functions or even contribute to the code :)
https://github.com/fnoyanisi/zString
EDIT:
#siride is right, the function above replaces chars only. Just wrote this one, which replaces character strings.
#include <stdio.h>
#include <stdlib.h>
/* replace every occurance of string x with string y */
char *zstring_replace_str(char *str, const char *x, const char *y){
char *tmp_str = str, *tmp_x = x, *dummy_ptr = tmp_x, *tmp_y = y;
int len_str=0, len_y=0, len_x=0;
/* string length */
for(; *tmp_y; ++len_y, ++tmp_y)
;
for(; *tmp_str; ++len_str, ++tmp_str)
;
for(; *tmp_x; ++len_x, ++tmp_x)
;
/* Bounds check */
if (len_y >= len_str)
return str;
/* reset tmp pointers */
tmp_y = y;
tmp_x = x;
for (tmp_str = str ; *tmp_str; ++tmp_str)
if(*tmp_str == *tmp_x) {
/* save tmp_str */
for (dummy_ptr=tmp_str; *dummy_ptr == *tmp_x; ++tmp_x, ++dummy_ptr)
if (*(tmp_x+1) == '\0' && ((dummy_ptr-str+len_y) < len_str)){
/* Reached end of x, we got something to replace then!
* Copy y only if there is enough room for it
*/
for(tmp_y=y; *tmp_y; ++tmp_y, ++tmp_str)
*tmp_str = *tmp_y;
}
/* reset tmp_x */
tmp_x = x;
}
return str;
}
int main()
{
char s[]="Free software is a matter of liberty, not price.\n"
"To understand the concept, you should think of 'free' \n"
"as in 'free speech', not as in 'free beer'";
printf("%s\n\n",s);
printf("%s\n",zstring_replace_str(s,"ree","XYZ"));
return 0;
}
And below is the output
Free software is a matter of liberty, not price.
To understand the concept, you should think of 'free'
as in 'free speech', not as in 'free beer'
FXYZ software is a matter of liberty, not price.
To understand the concept, you should think of 'fXYZ'
as in 'fXYZ speech', not as in 'fXYZ beer'
You can use strrep()
char* strrep ( const char * cadena,
const char * strf,
const char * strr
)
strrep (String Replace). Replaces strf with strr in cadena and returns the new string. You need to free the returned string in your code after using strrep.
Parameters:
cadena: The string with the text.
strf: The text to find.
strr: The replacement text.
Returns
The text updated wit the replacement.
Project can be found at https://github.com/ipserc/strrep