Replace String pattern with new string - c

I could really use some help with this question...
char *replace(char *s, char *pat, char *rep)
Returns a copy of the string s, but with each instance of pat replaced with rep. Note that len(pat) can be less than, greater than, or equal to len(rep). The function allocates memory for the resulting string, and it is up to the caller to free it. For example, if we call replace("Fiore X", "X", "sucks"), what is returned is the new string Fiore sucks (but remember, pat could be longer than an individual character and could occur multiple times).
I've managed to determine whether the pattern occurs in the original string, but I run into a problem if the pattern occurs more than once. I also haven't got to the part of creating a new string with the replaced text. I'm not allowed to use any functions from <string.h>. (I'm still very new to C)
char *replace(char *s, char *pat, char *rep){
    char *a = malloc(300);
    char *pa = s;
    int patLen = 0;
    int i;
    for(i = 0; pat[i] != '\0'; i++)
    {
        patLen++;
    }
    int ogLen = patLen;
    while(*s != '\0')
    {
        if(*s == *pat)
        {
            s++;
            pat++;
            patLen--;
            while(*s == *pat)
            {
                s++;
                pat++;
                patLen--;
            }
            if(patLen == 0)
            {
                printf("This is a pattern");
                patLen = ogLen;
            }
        }
        s++;
    }
    return s;
}

Since you can't use string.h functions, I'd write a few string utility functions to count string length, compare strings and copy strings. This will make your code easier to understand. I'd make 2 passes through string s. The first pass counts the occurrences of pat, which lets me calculate the size of the new string: length(s) + occurrences * (length(rep) - length(pat)), plus 1 for the terminating '\0'. Allocate the new string. Then pass through string s again, copying into the new string, but whenever an occurrence of pat is found, copy rep instead. Hope this helps.
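A minimal sketch of the two-pass approach described above, without anything from string.h. The helper names str_len and starts_with are made up for this illustration, the parameters are declared const (the assignment's signature uses plain char *), and treating an empty pat as "matches nothing" is my assumption, not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Length of a NUL-terminated string (hypothetical stand-in for strlen). */
static size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Returns 1 if s begins with prefix, 0 otherwise (hypothetical helper). */
static int starts_with(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (*s++ != *prefix++)
            return 0;
    }
    return 1;
}

/* Returns a newly allocated copy of s with each non-overlapping pat replaced
   by rep. The caller frees the result. Returns NULL if allocation fails. */
char *replace(const char *s, const char *pat, const char *rep)
{
    assert(s != NULL && pat != NULL && rep != NULL);
    size_t pat_len = str_len(pat);
    size_t rep_len = str_len(rep);

    /* Pass 1: count matches (assumption: an empty pattern matches nothing). */
    size_t count = 0;
    const char *p = s;
    while (*p != '\0') {
        if (pat_len > 0 && starts_with(p, pat)) {
            count++;
            p += pat_len;
        } else {
            p++;
        }
    }
    size_t s_len = (size_t)(p - s);

    /* length(s) + occurrences * (length(rep) - length(pat)), plus the terminator */
    size_t out_len = s_len - count * pat_len + count * rep_len;
    char *out = malloc(out_len + 1);
    if (out == NULL)
        return NULL;

    /* Pass 2: copy s, writing rep wherever pat occurs. */
    char *w = out;
    p = s;
    while (*p != '\0') {
        if (pat_len > 0 && starts_with(p, pat)) {
            for (size_t i = 0; i < rep_len; i++)
                *w++ = rep[i];
            p += pat_len;
        } else {
            *w++ = *p++;
        }
    }
    *w = '\0';
    return out;
}
```

Calling replace("one two one", "one", "three") should give back a new string holding three two three, which the caller then passes to free.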

Ok, not using string.h, it can be done.
First thing, you're doing pat++ but never going back. After matching the first letter of pat, you never return to the start of the pattern, so every later comparison starts from the wrong place.
Using s++ is fine, as you don't need to come back to the start of this string, but for pat I would advise you to use an index and test pat[i]. Alternatively, if you keep track of how many times you advanced with pat++, you can step pat back by exactly that amount (by the way, recursion would be an elegant way to do so without creating an int to keep track of how many times you advanced).
On the second while, just for safety, I would include && *s != '\0'. And for correctness, add && patLen != 0. If you don't include this last one, you'll do one extra s++ and lose one possible starting point.
And finally, a printf alone won't solve your problem: you need to record where the pattern was found (easily done with an array of ints) so you can go back and replace it.
The replacing gets tricky when pat and rep have different sizes. I would create some additional functions to make room for chars (in case rep > pat) and to eliminate some chars (if rep < pat).
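Here is a rough sketch of the index-based matching suggested above: pat is never advanced, an index j restarts at 0 for every candidate position, and each match start is recorded in an array of ints. The name find_matches and its parameters are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Records the start index of each non-overlapping match of pat in s.
   Writes at most max_hits indices into hits and returns the total number
   of matches found (which may be larger than max_hits). */
size_t find_matches(const char *s, const char *pat, int *hits, size_t max_hits)
{
    assert(s != NULL && pat != NULL && hits != NULL);
    size_t found = 0;
    size_t i = 0;

    if (pat[0] == '\0')   /* assumption: an empty pattern never matches */
        return 0;

    while (s[i] != '\0') {
        size_t j = 0;     /* index into pat, restarts for every candidate */
        while (pat[j] != '\0' && s[i + j] == pat[j])
            j++;
        if (pat[j] == '\0') {      /* reached the end of pat: full match */
            if (found < max_hits)
                hits[found] = (int)i;
            found++;
            i += j;                /* continue after the match */
        } else {
            i++;                   /* mismatch: try the next start position */
        }
    }
    return found;
}
```

Because the inner loop stops as soon as s[i + j] differs from pat[j], it also stops at the end of s, so no separate end-of-string check is needed there.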

Related

Is using input parameters instead of local variables more efficient in C?

Consider the two functions below. Both functions compute the number of times a character appears in a string with a specified length.
int str_get_num_occurrences1(char * str, char c, unsigned int len){
    if (!len)
        len = strlen(str);
    int res = 0;
    int n = len;
    for ( ; n--; )
        if (str[n] == c)
            res++;
    return res;
}
int str_get_num_occurrences2(char * str, char c, unsigned int len){
    int res = 0;
    if (!len)
        len = strlen(str);
    for ( ; len--; )
        if (str[len] == c)
            res++;
    return res;
}
Obviously, the two functions do the same thing. Besides the fact that the first function is a little bit more readable than the second, is the second function more efficient since it avoids a local variable? I'm sure that these particular functions are really too simple to measure a true difference. I'm asking in more of a general or theoretical way.
Are there reasons why a user should avoid using input parameters as temporary storage (besides readability)? I'm not asking about pointers, where the input could be changed by the function. Does the compiler interpret the two functions differently which could cause function one to be preferred?
I searched through the questions, and I did find some related questions but none that I could find discussed the efficiency.
TL;DR
Write the code you find easiest to read/write/maintain. The difference between your functions will probably disappear when you compile with optimizations.
You might want to think about a couple of things you can do to write a more flexible function, or at least code that is easier to read. This answer focuses more on coding style than on the question "Which is best, X or Y?", because the answer will almost always be "That depends on Z".
Given that you're allowing the call to pass a 0 value for the string length, you could just write something like this:
int get_char_count(const char *str, char c)
{
    int count = 0;
    // test the current char, then advance; while(*str++) would skip str[0]
    for (; *str; ++str) {
        if (*str == c) {
            ++count;
        }
    }
    return count;
}
That, to me, looks like the least amount of code, it's easy to read, and easy to maintain.
The drawbacks are:
Strings with '\0' characters in the middle (ie char[][]) can't be processed in full in a single call using this approach
Not possible to get the char count in a part of the string.
Strings containing '\0' chars can't be processed in full
If you want to support those use cases, you'll have to add a length argument. But even then, I'd just add it to the function, and not call strlen:
int get_char_count(const char *str, char c, unsigned int len)
{
    int count = 0;
    if (!len) {
        // no length given: scan up to the terminating '\0'
        for (; *str; ++str) {
            if (*str == c) {
                ++count;
            }
        }
        return count; // return early
    }
    //len is given
    while (len--) {
        if (str[len] == c) {
            ++count;
        }
    }
    return count;
}
Now that I'm able to specify how many characters to iterate over, rather than to return on '\0', I can use this function, for example, to count how many occurrences of a given character are in an array of strings:
Example: count in char[][]
Example: count in part of a string
Example: string with nul-chars
The first case (char [][]) works because of how the arrays are stored in memory: An array is a contiguous block of memory, and all values are stored in succession. If you know the total size of said block, you can use a char[][] as though it is one big string. The result being: only 1 function call is needed to count a character in all elements of the array.
The last case is pretty much the same thing, because the string in the example is actually how an array of strings is stored.
The second example (counting in partial string) is self-evident: rather than specifying the length of the full string, you can specify the number of characters you want to check...
The same approach can be used for strings lacking a terminating nul character
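To make the three use cases concrete, here is a small sketch. It repeats a length-aware get_char_count so the block compiles on its own; the wrapper functions (count_in_table, count_in_part, count_with_nuls) and their sample data are invented for illustration. The 2D array is viewed through a char pointer to the whole object, which relies on the contiguous layout described above.

```c
#include <assert.h>
#include <stddef.h>

/* len == 0 means "scan up to the terminating NUL";
   otherwise exactly len chars are examined. */
int get_char_count(const char *str, char c, unsigned int len)
{
    int count = 0;
    assert(str != NULL);
    if (!len) {
        for (; *str; ++str)
            count += (*str == c);
        return count;
    }
    while (len--)
        count += (str[len] == c);
    return count;
}

/* Example: count in char[][] with a single call. */
int count_in_table(void)
{
    char words[3][8] = { "banana", "bandana", "cabana" };
    /* view the whole array object as one block of sizeof words bytes */
    return get_char_count((const char *)words, 'a', sizeof words);
}

/* Example: count in part of a string (only the first 5 chars, "missi"). */
int count_in_part(void)
{
    return get_char_count("mississippi", 's', 5);
}

/* Example: string with NUL chars in the middle. */
int count_with_nuls(void)
{
    const char packed[] = "a\0ba\0ca";   /* 7 chars plus the final NUL */
    return get_char_count(packed, 'a', sizeof packed);
}
```

With this sample data the three wrappers return 9, 2 and 3. Passing 0 as the length for the packed array would stop at the first embedded NUL and count only one 'a'.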
Because this is a fairly trivial function to implement, it's common to see most of the brackets being omitted:
for (; *str; ++str)
    if (*str == c)
        ++count;
//or even
while(len--) count += str[len] == c;
The last version is technically valid, but it's not as easy to read. Omitting brackets for one-line if's and simple loops is fairly common, but has been the cause of bugs, like the goto fail bug from a few years back.
One last style-related thing:
When using the pointer to iterate over the string like I did in the first snippet, some will tell you that the best thing to do is to create a local pointer to increment:
int get_char_count(const char *str, char c)
{
    int count = 0;
    const char *local = str;
    for (; *local; ++local) {
        if (*local == c) {
            ++count;
        }
    }
    return count;
}
The obvious advantage here being that you're not losing the original position/pointer that was passed in. If you later add something to the function, you can always reassign, or assign a new pointer based off str.

C version of strpos and substr?

I'm really surprised I can't figure out a way to do this effectively. I've tried strstr, a combination of things with sscanf, and nothing seems to work the way I would expect it to based on my experience in other languages.
I have a char of "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS". I do not know where "BEGINTheMiddleEND" is in the string, and I would like to end with a char that equals "TheMiddle" by finding the occurrences of "BEGIN" and "END" and grabbing what is in between.
What is the most efficient way to accomplish this (find and sub-string)?
Thanks!
-- EDIT BASED ON ANSWERS --
I have tried this:
char *searchString = "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS";
char *t1, *t2;
t1 = strstr(searchString, "BEGIN");
t2 = strstr(t1, "END");
But something must be wrong from a pointer standpoint as it doesn't work for me. Strstr only takes two arguments, so I'm not sure what you mean by starting at the previous pointer. I'm also not sure how to then use those pointers to substring it, as they are not integer values like strpos returns, but character pointers.
Thanks again.
-- EDIT WITH FINAL CODE --
For anyone else who hits this, the final, working code:
char searchString[] = "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS"; // an array, because it is modified below
char *b = strstr(searchString , "BEGIN");
char *e = strstr(b, "END");
int offset = e - b;
b[offset] = 0;
Where "b" is now equal to "BEGINTheMiddle". (which as it turns out is what I needed in this case).
Thanks again everyone.
You need to realize what a string is: a 0-terminated sequence of chars.
strstr does what it says: it finds the beginning of the given substring.
So calling strstr with the needle "BEGIN" takes you to the position of this substring. The pointer pointing to "BEGIN", advanced by 5 characters is pointing to "TheMiddle" onward to the next 0 char. By searching for "END" you can find the end pointer, and then you need to copy the substring into a new string array (or cut it, by replacing the "E" with a 0; or implement your own string functions that do not use 0 terminated strings, so they can arbitrarily overlap).
That is probably the step that you are still missing: actually copy the string. E.g. using
t3 = strndup(t1, t2 - t1);
Take the string ABCDEF0, where 0 is an actual 0 character. A pointer to the beginning points to the full string, a pointer pointing to the E points to "EF" only. If you want to get a string "AB", you need to either copy that to "AB0", or replace C by 0.
strstr does not do the copying for you. It just finds the position. If you want an index, you can do int offset = newPosition - oldPosition;, but if you need to continue searching, it's easier to work with the newPosition pointer.
All this is less intuitive than e.g. String operations in Java. Except for truncating strings, it actually is more efficient as far as I know, and if you realize the 0-terminated memory layout, it makes a lot of sense. It's only when you think of strings as arrays that it may seem odd to have a pointer somewhere in the middle, and continue using it like a regular array. That makes "sub = string + offset" the C way of writing "sub = string.substring(offset)".
Use strstr() twice, but the second time start from the position returned by the first call to strstr() + strlen("BEGIN").
This will be efficient because the first pointer returned from strstr() is going to be the beginning of BEGIN, therefore you won't be looking through the whole string again but start at the BEGIN-ing and look for the END from there; which means that at the most you run through the whole string once.
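Putting the pieces from the answers above together, a possible sketch: find the opening marker with strstr, skip past it, search the remainder for the closing marker, then copy what lies in between. The function name extract_between is made up here, and malloc plus memcpy stands in for strndup, which is POSIX rather than ISO C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of the text between the first `begin` and the next
   `end` after it. Returns NULL if either marker is missing or allocation
   fails. The caller frees the result. */
char *extract_between(const char *s, const char *begin, const char *end)
{
    assert(s != NULL && begin != NULL && end != NULL);

    const char *b = strstr(s, begin);
    if (b == NULL)
        return NULL;
    b += strlen(begin);               /* skip past the opening marker */

    const char *e = strstr(b, end);   /* search only the remainder */
    if (e == NULL)
        return NULL;

    size_t n = (size_t)(e - b);       /* pointer difference = substring length */
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, b, n);
    out[n] = '\0';
    return out;
}
```

For the string in the question, extract_between(searchString, "BEGIN", "END") should yield TheMiddle, and the original string is left untouched.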
I hope this will help
#include <stdio.h>
#include <string.h>
// Returns the index of the first needle at or after offset, or -1 if none.
int strpos(const char *haystack, const char *needle, int offset);
int main(void)
{
    const char *p = "Hello there all y'al, hope that you are all well";
    int pos = strpos(p, "all", 0);
    printf("First all at : %d\n", pos);
    pos = strpos(p, "all", pos + 1); // resume just past the first match
    printf("Second all at : %d\n", pos);
    return 0;
}
int strpos(const char *haystack, const char *needle, int offset)
{
    // an offset outside the string cannot match
    if (offset < 0 || (size_t)offset > strlen(haystack))
        return -1;
    // search in place; no copy of the haystack is needed
    const char *p = strstr(haystack + offset, needle);
    if (p)
        return (int)(p - haystack);
    return -1;
}

Reverse a string using recursion

I got this code from the internet but I couldn't understand all of it.
For example, if(*str): what does this code mean? Also, can a string be returned? I thought that an array in main can be changed directly in a function, but here it's being returned.
#include<stdio.h>
#define MAX 100
char* getReverse(char[]);
int main(){
    char str[MAX],*rev;
    printf("Enter any string: ");
    scanf("%s",str);
    rev = getReverse(str);
    printf("Reversed string is: %s\n\n",rev);
    return 0;
}
char* getReverse(char str[]){
    static int i=0;
    static char rev[MAX];
    if(*str){
        getReverse(str+1);
        rev[i++] = *str;
    }
    return rev;
}
This is not the clearest example of recursion due to the use of the static variables. Hopefully the code generally seems clear to you. I suspect the part that is confusing to you is the same part that confused me at first.
if(*str){
    getReverse(str+1);
    rev[i++] = *str;
}
So line by line.
if(*str){
If we have not reached the null terminator.
getReverse(str+1);
Call the getReverse function on the next character of the string. It seems pretty straight forward up to here. But it also seems like it may not actually reverse anything because this is the next line
rev[i++] = *str;
We assign the character at the beginning of str to index i of rev and increment i, but here is the tricky part: i may not be what you think. The recursive call to getReverse happens before i is incremented, and i is static, so changes persist between function calls. Say we have a 5-letter word, "horse": we end up with 6 calls to getReverse on the stack. The 6th does nothing because that is where it finds the null terminator. The trick is that the calls then resolve in reverse order. The call where str points to 'e' resolves first and increments i, because all the others are still waiting for their calls to getReverse to return. So the last letters are actually the first ones to be added to rev, which is what can be confusing here.
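For comparison, here is a sketch of a recursive reverse that avoids the static variables, so it can be called more than once (in the version above, i is never reset between top-level calls). It also touches on the asker's other point: the caller's array is changed directly, and the returned pointer is just the same array handed back. The names reverse_range and reverse_in_place are invented for this example, and strlen is used for brevity.

```c
#include <assert.h>
#include <string.h>

/* Swaps the outermost pair of chars, then recurses on the interior. */
static void reverse_range(char *left, char *right)
{
    if (left >= right)   /* zero or one char left: nothing to swap */
        return;
    char tmp = *left;
    *left = *right;
    *right = tmp;
    reverse_range(left + 1, right - 1);
}

/* Reverses str in place and returns the same pointer. */
char *reverse_in_place(char *str)
{
    assert(str != NULL);
    size_t len = strlen(str);
    if (len > 1)
        reverse_range(str, str + len - 1);
    return str;
}
```

Note that the argument must be a modifiable array such as char word[] = "horse"; passing a string literal directly would be undefined behaviour.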

Inserting characters in the middle of char array

I have a char array filled with some characters. Let's say I have "HelloWorld" in my char array. (not string. taking up index of 0 to 9)
What I'm trying to do is insert a character in the middle of the array, and push the rest to the side to make room for the new character that is being inserted.
So, I can make the char array to have "Hello.World" in it.
char ch[15]; // assume it has "HelloWorld" in it
for(int i=0; i<=strlen(ch)-1; i++) {
    if(ch[i]=='o' && ch[i+1]=='W') {
        for(int j=strlen(ch)-1; j>=i+2; j--) {
            ch[j] = ch[j-1]; // pushing process?
        }
        ch[i+1] = '.';
        break;
    }
}
Would this work? Would there be an easier way? I might just be thinking way too complicated on this.
You need to start the inner loop from strlen(ch) + 1, not strlen(ch) - 1, because you need to move the null terminator to the right one place as well. Remember that strlen returns the length of the string such that string[strlen(string)] == '\0'; you can think of strlen as a function for obtaining the index of the null terminator of a C string.
If you want to move all the characters up by one, then you could do it using memmove.
#include <string.h>
char ch[15];
int make_room_at = 5;
int room_to_make = 1;
memmove(
    ch + make_room_at + room_to_make,
    ch + make_room_at,
    15 - (make_room_at + room_to_make)
);
Simply do:
#define SHIFT 1
char bla[32] = "HelloWorld"; // We reserve enough room for this example
char *ptr = bla + 5; // pointer to "World"
memmove(ptr + SHIFT, ptr, strlen(ptr) + 1); // +1 for the trailing null
The initial starting value for the inner loop is one short. It should be something like the following. Note too that since the characters are moved to the right, a new null terminator needs to be added:
ch[strlen(ch) + 1] = '\0';
for(j=strlen(ch); j>=i+2; j--) { // note no "-1" after the strlen
Edit As far as the "Is this a good way?" part, I think it is reasonable; it just depends on the intended purpose. A couple thoughts come to mind:
Reducing the calls to strlen might be good. It could depend on how good the optimizer is (perhaps some might be optimized out). But each call to strlen requires a scan of the string looking for the null terminator. In high-traffic code, that can add up. So storing the initial length in a variable and then using the variable elsewhere could help.
This type of operation has the chance for buffer overflow. Always make sure the buffer is long enough (it is in the OP).
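Both thoughts can be folded into one small helper. The sketch below measures the length once, refuses to insert when the buffer has no room, and uses memmove to shift the tail together with its terminator. The name insert_char and the cap parameter are made up for illustration, and it assumes buf already holds a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inserts c at index pos of the NUL-terminated string in buf, where cap is
   the total size of buf in bytes. Returns 1 on success, 0 if pos is past the
   end of the string or the buffer has no room for one more char. */
int insert_char(char *buf, size_t cap, size_t pos, char c)
{
    assert(buf != NULL);
    size_t len = strlen(buf);          /* measured once, reused below */
    if (pos > len || len + 2 > cap)    /* need len + 1 chars plus the terminator */
        return 0;
    /* shift the tail, including the terminator, one place to the right */
    memmove(buf + pos + 1, buf + pos, len - pos + 1);
    buf[pos] = c;
    return 1;
}
```

With char ch[15] = "HelloWorld", the call insert_char(ch, sizeof ch, 5, '.') turns the contents into Hello.World. The same call on an 11-byte buffer returns 0 and leaves the string unchanged.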
If you're going to grow a char array, a fixed-size declaration limits you. By doing this:
char ch[15];
you're hardcoding the array to hold at most 15 chars (14 characters plus the terminator). Using a pointer to dynamically allocated memory would be step 1:
char *ch = malloc(15);
This way you can resize it with realloc as need be.
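A pointer on its own holds no characters; it has to point at memory you allocated. As a rough sketch of what "modify it as need be" can look like with a heap buffer (copy_with_room and grow_buffer are names invented here, not standard functions):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of src with room for `extra` additional chars,
   or NULL if allocation fails. The caller frees the result. */
char *copy_with_room(const char *src, size_t extra)
{
    assert(src != NULL);
    size_t len = strlen(src);
    char *buf = malloc(len + extra + 1);   /* +1 for the terminator */
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, len + 1);             /* copies the terminator too */
    return buf;
}

/* Resizes *buf to new_cap bytes. On failure returns 0 and leaves *buf
   valid and unchanged; on success returns 1 and updates *buf. */
int grow_buffer(char **buf, size_t new_cap)
{
    assert(buf != NULL && new_cap > 0);
    char *tmp = realloc(*buf, new_cap);
    if (tmp == NULL)
        return 0;
    *buf = tmp;
    return 1;
}
```

Assigning the result of realloc to a temporary first means the original block is not leaked if the resize fails.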

C homework - string loops replacements

I know it's a little unorthodox and will probably cost me some downvotes, but since it's due in 1 hour and I have no idea where to begin I thought I'd ask you guys.
Basically I'm presented with a string that contains placeholders in + form, for example:
1+2+5
I have to create a function to print out all the possibilities of placing different combinations of any given series of digits. I.e. for the series:
[9,8,6] // string array
The output will be
16265
16285
16295
18265
18285
18295
19265
19285
19295
So for each input I get (number of digits)^(number of placeholders) lines of output.
Digits are 0-9 and the maximum form of the digits string is [0,1,2,3,4,5,6,7,8,9].
The original string can have many placeholders (as you'd expect, the output can get VERY lengthy).
I have to do it in C, preferably with no recursion. Again I really appreciate any help, couldn't be more thankful right now.
If you can offer an idea, a simplified way to look at solving this, even in a different language or recursively, it'd still be ok, I could use a general concept and move on from there.
It prints them in a different order, but that does not matter, and it's not recursive.
#include <stdlib.h>
#include <stdio.h>
#include <string.h> // strlen
// Fills each '+' in s with the next base-spare_cnt digit of comb_num.
int // 0 if no more.
get_string(char* s, const char* spare_chr, int spare_cnt, int comb_num){
    for (; *s; s++){
        if (*s != '+') continue;
        *s = spare_chr[comb_num % spare_cnt];
        comb_num /= spare_cnt;
    }
    return !comb_num; // leftover digits mean comb_num was out of range
}
int main(){
    const char* spare_str = "986";
    int num = 0;
    while (1){
        char str[] = "1+2+5";
        if (!get_string(str, spare_str, strlen(spare_str), num++))
            break; // done
        printf("str num %2d: %s\n", num, str);
    }
    return 0;
}
In order to do the actual replacement, you can use strchr to find the first occurrence of a character and return a char * pointer to it. You can then simply change that pointer's value and bam, you've done a character replacement.
Because strchr searches for the first occurrence (before a null terminator), you can use it repeatedly for every value you want to replace.
The loop's a little trickier, but let's see what you make of this.
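One possible shape for that loop, as a sketch only: call strchr repeatedly, each time continuing from just past the previous hit, and write the next fill character through the returned pointer. The function name fill_placeholders and the idea of passing the digits as a string are assumptions for this example. It fills a single combination; generating all of them still needs an outer loop.

```c
#include <assert.h>
#include <string.h>

/* Replaces '+' placeholders in s, left to right, with successive chars from
   fill. Stops when either the placeholders or the fill chars run out.
   Returns the number of placeholders replaced. */
int fill_placeholders(char *s, const char *fill)
{
    assert(s != NULL && fill != NULL);
    int replaced = 0;
    char *p = s;
    /* strchr returns a pointer to the next '+', or NULL when none is left */
    while (*fill != '\0' && (p = strchr(p, '+')) != NULL) {
        *p++ = *fill++;   /* overwrite the '+', then move past it */
        replaced++;
    }
    return replaced;
}
```

Starting from char s[] = "1+2+5", the call fill_placeholders(s, "68") changes s to 16285, one of the lines in the expected output.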

Resources