Remove substring between parentheses in a string - c

I need to remove every substring enclosed in parentheses. The solutions I have found so far do not work. Here is an example:
My string is: text(lorem(ipsum)abcd)pieceoftext and the actual output is: lorem(ipsum
However, the expected output is: text(())pieceoftext or textpieceoftext
Here is the code. I've run out of ideas. I thought of using strtok(), but I have two different delimiters.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
const char *s = "text(lorem(ipsum)abcd)pieceoftext";
const char *patternA = "(";
const char *patternB = ")";
char *target = NULL;
char *start, *end;
if (start = strstr( s, patternA ))
{
start += strlen( patternA);
if (end = strstr( start, patternB ) )
{
target = (char *)malloc(end - start + 1);
memcpy(target, start, end - start);
target[end - start] = '\0';
}
}
if (target)
printf("Answer: %s\n", target);
return 0;
}
Looking forward to hearing some of your ideas to solve this problem. Thank you

To begin with, just allocate enough memory to target as you need to hold the entire source string s, because you really have no idea how much space you will need. Remember to add one for the end-of-string character.
Then change patternA and patternB from char * to just char, so you can compare them against individual chars in s.
Then you need to loop through the source string, keeping track of whether you are inside parentheses or not. Since you need to support nested parentheses, I would use a counter of how deep inside the parentheses you are:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
const char *s = "text(lorem(ipsum)abcd)pieceoftext";
const char patternA = '(';
const char patternB = ')';
char *target;
int targetIndex = 0;
int parenDepth = 0;
target = malloc(strlen(s) + 1);
if (target == NULL) {
// malloc() failed, nothing to work with
return 1;
}
for (int sourceIndex = 0; sourceIndex < strlen(s); sourceIndex++) {
if (s[sourceIndex] == patternA) {
// we are going deeper into parens, add to level, then ignore the char
parenDepth++;
continue;
}
if (s[sourceIndex] == patternB) {
// we are coming out of the parens, lower our level (never below 0), ignore the parens char
if (parenDepth > 0)
parenDepth--;
continue;
}
if (parenDepth == 0) {
// if depth is 0, then we are not inside parens, so copy the char to target
target[targetIndex++] = s[sourceIndex];
}
}
// add end-of-string
target[targetIndex] = '\0';
printf("Answer: %s\n", target);
free(target);
return 0;
}

I don't understand why you don't just use strtok() (or strtok_r()). I think it suits this purpose well. Experiment with it a little.
#include <stdio.h>
#include <string.h>
int main(void) {
char str[] = "text(lorem(ipsum)abcd)pieceoftext";
char const * delim = ")(";
char *token;
char *rest = str;
while ((token = strtok_r(rest, delim, &rest))) {
printf("token: %s\n", token);
printf("rest: %s\n", rest);
}
}

You should investigate basic parsing techniques and use them to build a small program that does what you want.
Take hello(world)world as an example.
A simple solution:
If the lookahead is an opening paren, stop saving until the matching closing paren. When the parens may be nested, maintain a variable that records how deep you are (increment it on an opening paren and decrement it on a closing paren). Save characters only while this variable is zero.
You can run the same pass beforehand to check that every opening paren has a closing paren.

Related

Deleting a char and moving it in a string

I need ideas for recursive code that deletes a specific char from a string and moves all the other string chars together.
For example:
"the weather is cloudy"
the entered char is 'e':
result:
"th wathr is cloudy"
I really don't have any idea how to start. Thanks for the help.
#include <stdio.h>
/* Copies s to d, skipping every occurrence of c; the terminator is copied too. */
void remove_impl(char* s, char c, char* d) {
if (*s != c) {
*d++ = *s;
}
if (*s != '\0') {
remove_impl(++s, c, d);
}
}
/* Not called remove() because <stdio.h> already declares a function with that name. */
void remove_char(char* s, char c) {
remove_impl(s, c, s);
}
int main(void) {
char s[] = "the weather is cloudy";
remove_char(s, 'e');
puts(s);
}
How does it work? Consider remove_impl. s is the original string, c is the character to be deleted from s, and d is the resulting string, into which the characters of s that are not equal to c are written. The function recursively iterates through the characters of s. If the current character is not equal to c, it is written to d. The recursion stops once the end of s is reached; the terminating '\0' is copied as well, so the result is properly terminated. Since the source string must be modified in place, a wrapper (remove_char) passes the original string s as d. The wrapper cannot be called remove, because <stdio.h> already declares a function with that name.
An easy way to do it is to loop over the string and add any letter that doesn't match the unwanted letter.
Here's a demonstration:
const char *source = "the weather is cloudy";
int source_len = strlen(source);
char *target = calloc(source_len + 1, sizeof(char)); // +1 leaves room for the terminating '\0'
int target_len = 0;
char to_remove = 'e';
for(int i = 0; i < source_len; i++)
{
if(source[i] != to_remove)
{
target[target_len++] = source[i];
}
}
puts(target); // Output "th wathr is cloudy" in the console
My turn to make a proposal! I add an assert test and use existing functions (strchr and memmove).
#include <string.h>
#include <stdio.h>
#include <assert.h>
void removeChar(char *str, char chr)
{
assert(str != 0); // Always check the input!
char *str_pnt = strchr(str, chr);
if (str_pnt) {
// memmove is safe for overlapping regions, strcpy is not
memmove(str_pnt, str_pnt + 1, strlen(str_pnt + 1) + 1);
removeChar(str_pnt, chr);
}
}
int main (void)
{
char str[] = "the weather is cloudy";
char char_to_delete = 'e';
removeChar(str, char_to_delete);
puts(str);
return 0;
}
This can be done in many ways. What I am thinking of right now is a notAllowedChar array that decides which chars are shown. Something like the following:
#include <stdio.h>
#include <string.h>
// Global Scope variable declaration
int notAllowedChar[256] = {0}; // 0 for allowed , 1 for not allowed; one slot per unsigned char value
char inputString[100];
void recursion(int pos, int len) {
if( pos >= len ) {
printf("\n"); // new line
return;
}
if( notAllowedChar[(unsigned char)inputString[pos]]) {// not printing
recursion( pos + 1 , len );
}
else {
printf("%c", inputString[pos]);
recursion( pos + 1 , len );
}
}
int main() {
// taking input String; gets() was removed in C11, fgets() bounds the read
if (fgets(inputString, sizeof inputString, stdin) == NULL)
return 1;
inputString[strcspn(inputString, "\n")] = '\0'; // drop the trailing newline
printf("Enter not allowed chars:: "); // here we can even run a loop for all of them
char notAllowed;
if (scanf("%c", &notAllowed) != 1)
return 1;
notAllowedChar[(unsigned char)notAllowed] = 1;
int len = strlen(inputString);
recursion( 0 , len );
}
How this works
Let's say we have a simple string "Hello world"
and we want l removed from the final string, so the final output will be "Heo word".
Here "Hello world" is 11 chars long.
Before calling the recursion function, we set the entry for 'l' (ASCII value 108) to 1 in the notAllowedChar array.
Now we call recursion with ( 0 , 11 ). The function has two main checks. The first is the base case: the recursion terminates when pos is equal to or greater than 11. Otherwise, the second check decides whether the current char is printed, simply by looking it up in the notAllowedChar array. Each call increases pos by 1 and recurses, so once pos reaches 11 a decision has been made for every char and the recursion ends. I tried to give the variables meaningful names. If it is still unclear how this works, look up a basic recursion walkthrough (search on YouTube) and trace by hand how the values change in each call's local scope. This may take time, but it is worth understanding. All the very best.
#include <stdio.h>
/**
* Returns the number of removed chars.
* Base case: if the current char is the null char (end of the string)
* If the char should be deleted return 1 + no of chars removed in the remaining string.
* If it's a some other char simply return the number of chars removed in the remaining string
*/
int removeCAfterwardsAndCount(char* s,char c){
if((*s) == '\0'){
return 0;
}
if((*s) == c){
int noOfChars = removeCAfterwardsAndCount(s+1,c);// s+1 means the remaining string
s[noOfChars] = *s; // move the current char (*s) noOfChars locations ahead
return noOfChars +1; // means this char is removed... some other char should be copied here...
}
else{
int noOfChars = removeCAfterwardsAndCount(s+1,c);
s[noOfChars ] = *s;
return noOfChars ; // means this char is intact ...
}
}
int main()
{
char s[] = "Arifullah Jan";
printf("\n%s",s);
int totalRemoved = removeCAfterwardsAndCount(s,'a');
char *newS = &s[totalRemoved]; // the start of the string should now be originalPointer + total Number of chars removed
printf("\n%s",newS);
return 0;
}
To avoid moving the chars using loops, I just shift the kept chars toward the end of the string, which leaves unused space at the start. newS is simply a pointer into the same string that skips this leading garbage.
#include <stdio.h>
void RemoveChar(char* str, char chr) {
char *str_old = str;
char *str_new = str;
while (*str_old)
{
*str_new = *str_old++;
str_new += (*str_new != chr);
}
*str_new = '\0'; }
int main() {
char string[] = "the weather is cloudy";
RemoveChar(string, 'e');
printf("'%s'\n", string);
return 0; }
#include <stdio.h>
#include <string.h>
char *remove_char(char *str, int c)
{
char *pos;
char *wrk = str;
while((pos = strchr(wrk, c)))
{
memmove(pos, pos + 1, strlen(pos + 1) + 1); /* strcpy must not be used on overlapping buffers */
wrk = pos;
}
return str;
}
int main()
{
char str[] = "Hello World";
printf("%s\n", remove_char(str, 'l')); /* never pass data as the format string */
return 0;
}
Or a faster but more difficult to understand version:
char *remove_char(char *str, int c)
{
char *pos = str;
char *wrk = str;
while(*wrk)
{
if(*wrk == c)
{
wrk++;
continue;
}
*pos++ = *wrk++;
}
*pos = 0;
return str;
}
Both require the string to be writable (so you can't pass a pointer to a string literal, for example).

Substrings in the middle of a String in C

I need to extract substrings that lie between strings I know.
I have something like char string[] = "abcdefg";
I know that what I need is between "c" and "f", so my return value should be "de".
I know the strncpy() function but do not know how to apply it in the middle of a string.
Thank you.
Here's a full, working example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char string[] = "abcdefg";
char from[] = "c";
char to[] = "f";
char *first = strstr(string, from);
if (first == NULL) {
first = &string[0];
} else {
first += strlen(from);
}
char *last = strstr(first, to);
if (last == NULL) {
last = &string[strlen(string)];
}
char *sub = calloc(last - first + 1, sizeof(char));
strncpy(sub, first, last - first);
printf("%s\n", sub);
free(sub);
return 0;
}
Now, the explanation:
1.
char string[] = "abcdefg";
char from[] = "c";
char to[] = "f";
Declarations of strings: main string to be checked, beginning delimiter, ending delimiter. Note these are arrays as well, so from and to could be, for example, cd and fg, respectively.
2.
char *first = strstr(string, from);
Find the occurrence of the beginning delimiter in the main string. Note that it finds the first occurrence. If you need the last one (for example, if you had the string abcabc and wanted a substring starting from the second a), the search would have to be different.
3.
if (first == NULL) {
first = &string[0];
} else {
first += strlen(from);
}
Handle the situation in which the first delimiter doesn't appear in the string. In that case, we make the substring start at the beginning of the entire string. If it does appear, however, we advance the pointer by the length of the from string, because the substring we want begins after the first delimiter (correction thanks to #dau_sama).
Depending on your specifications, this may or may not be needed, or another result might be expected.
4.
char *last = strstr(first, to);
Find the occurrence of the ending delimiter in the main string. Note that it finds the first occurrence.
As noted by #dau_sama, it's better to search for the ending delimiter starting from first, not from the beginning of the entire string. This avoids matching a to that appears earlier than from.
5.
if (last == NULL) {
last = &string[strlen(string)];
}
Handle the situation in which the second delimiter doesn't appear in the string. In that case, we make the substring run to the end of the string, so we take a pointer to the terminating '\0', one past the last character.
Again, depending on your specifications, this may or may not be needed, or another result might be expected.
6.
char *sub = calloc(last - first + 1, sizeof(char));
strncpy(sub, first, last - first);
Allocate sufficient memory and extract substring based on pointers found earlier. We copy last - first (length of the substring) characters beginning from first character.
7.
printf("%s\n", sub);
Here's the result.
I hope it does present the problem with enough details. Depending on your exact specifications, you may need to alter this somehow. For example, if you needed to find all substrings, and not just the first one, you may want to make a loop for finding first and last.
TY guys, worked using the form below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *between_substring(char *str, char from, char to){
while(*str && *str != from)
++str;//skip
if(*str == '\0')
return NULL;
else
++str;
char *ret = malloc(strlen(str)+1);
char *p = ret;
while(*str && *str != to){
*p++ = *str++;//To the end if `to` do not exist
}
*p = 0;
return ret;
}
int main (void){
char source[] = "abcdefg";
char *target;
target = between_substring(source, 'c', 'f');
printf("%s\n", source);
if (target != NULL) {
printf("%s\n", target);
free(target);
}
return 0;
}
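Since people in the comments did not seem to understand my approach, here's a quickly hacked-together stub.
const char* string = "abcdefg";
const char* b = "c";
const char* e = "f";
//look for the first pattern
const char* begin = strstr(string, b);
if(!begin)
return NULL;
begin += strlen(b); //skip past the first pattern
//look for the end pattern, starting after the first one
const char* end = strstr(begin, e);
if(!end)
return NULL;
//end already points at the start of the end pattern
char result[MAXLENGTH]; //MAXLENGTH must be larger than end-begin
strncpy(result, begin, end-begin);
result[end-begin] = '\0'; //strncpy does not add the terminator here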
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *between(const char *str, char from, char to){
while(*str && *str != from)
++str;//skip
if(*str == '\0')
return NULL;
else
++str;
char *ret = malloc(strlen(str)+1);
char *p = ret;
while(*str && *str != to){
*p++ = *str++;//To the end if `to` do not exist
}
*p = 0;
return ret;
}
int main(void){
const char* string = "abcdefg";
char *substr = between(string, 'c', 'f');
if(substr!=NULL){
puts(substr);
free(substr);
}
return 0;
}

How to get the last part of a string in C

I have a string in C that contains a file path like "home/usr/wow/muchprogram".
I was wondering how, in C, I can get the string after the last "/" so that I can store it in a variable. To be clear, that variable would equal "muchprogram".
I am also wondering how I could get everything before that final "/" as well. Thanks in advance.
Start scanning the string from the end. Once you get a / stop. Note the index and copy from index+1 to last_index, to a new array.
You get everything before the final / as well. You have the index. Start copying from start_index to index-1, to a new array.
Someone else already suggested this, but they forgot to include the C. This assumes it is ok to mutate the source string. Tested with GCC 4.7.3.
#include <stdio.h>
#include <string.h>
int main() {
char s[] = "home/usr/wow/muchprogram"; /* an array, not a literal, because it is modified below */
int n = strlen(s);
char* suffix = s + n;
printf("%s\n%s\n", s, suffix);
while (0 < n && s[--n] != '/');
if (s[n] == '/') {
suffix = s + n + 1;
s[n] = '\0';
}
printf("%s\n%s\n", s, suffix);
return 0;
}
Search backwards from the end of the string until you find a '/'. The pointer to the next index from there is the string of everything after that. If you copy everything up to, but not including, the '/' into another string (or replace the '/' with '\0'), you obtain the string of everything before the last '/'.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/strrchr.html
strrchr(3) has been there since C89 for that purpose.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void find_destructive(char *s) {
char *p_sl = strrchr(s, '/');
if (p_sl) {
*p_sl = '\0';
printf("[%s] [%s]\n", s, p_sl + 1);
} else {
printf("Cannot find any slashes.\n");
}
}
static void find_transparent(const char *s) {
const char *p_sl = strrchr(s, '/');
if (p_sl) {
char *first = (char *)malloc(p_sl - s + 1);
if ( ! first) {
perror("malloc for a temp buffer: ");
return;
}
memcpy(first, s, p_sl - s);
first[p_sl - s] = '\0';
printf("[%s] [%s]\n", first, p_sl + 1);
free(first);
} else {
printf("Cannot find any slashes.\n");
}
}
int main() {
char s[] = "home/usr/wow/muchprogram";
find_transparent(s);
find_destructive(s);
return 0;
}
http://ideone.com/vApvqp
You can solve this in C# as follows (this assumes str contains at least one '/'):
var tokens = str.Split('/');
var lastItem = tokens[tokens.Length - 1];
var everythingBeforeLastItem = string.Empty;
Enumerable.Range(0, tokens.Length - 2).ToList().
ForEach(i => everythingBeforeLastItem = everythingBeforeLastItem + tokens[i] + "/");
everythingBeforeLastItem += tokens[tokens.Length - 2];
You can use StringBuilder for efficiency if you expect a deep path that results in a large number of tokens.

How to copy front part of string up to a delimiter

I need to grab the first part of a string up to and including the last backslash in a path. I am fairly new to C. So I was wondering if the following code is a good approach? Or is there a better way?
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[]) {
char szPath[260] = {0};
strcpy(szPath, argv[0]);
char* p = szPath;
size_t len = strlen(argv[0]);
p+=len; //go to end of string
int backpos = 0;
while(*--p != '\\')
++backpos;
szPath[len-backpos] = 0;
printf("%s\n", szPath);
return 0;
}
After receiving comments, I changed it to this:
char szPath[260];
strcpy(szPath, argv[0]);
/*Scan a string for the last occurrence of a character.*/
char *p = strrchr(szPath, '\\');
if (p) {
*(p + 1) = 0; /* retain backslash and null terminate after that */
} else {
/* handle error */
}
printf("%s\n", szPath);
I would go with strrchr. This assumes str points to writable memory:
char *p;
if ((p = strrchr(str, '\\')))
*(p + 1) = 0; /* Since we passed it to strrchr, it's 0-terminated. */
Obviously, basename and dirname might be available if you are working with paths, and they might be more appropriate.

How to safely parse a tab-delimited string?

How can I safely parse a tab-delimited string? For example:
test\tbla-bla-bla\t2332
strtok() is a standard function for parsing strings with arbitrary delimiters. It is, however, not thread-safe. Your C library of choice might have a thread-safe variant.
Another standard-compliant way (just wrote this up, it is not tested):
#include <string.h>
#include <stdio.h>
int main()
{
char string[] = "foo\tbar\tbaz";
char * start = string;
char * end;
while ( ( end = strchr( start, '\t' ) ) != NULL )
{
// %.*s prints at most a given number of characters; * takes that number
// (an int) from the argument list, since the token is not zero-terminated
printf( "%.*s\n", (int)(end - start), start );
start = end + 1;
}
// start points to last token, zero-terminated
printf( "%s", start );
return 0;
}
Use strtok_r instead of strtok (if it is available). It has similar usage, except it is reentrant, and it does not modify the string like strtok does. [Edit: Actually, I misspoke. As Christoph points out, strtok_r does replace the delimiters by '\0'. So, you should operate on a copy of the string if you want to preserve the original string. But it is preferable to strtok because it is reentrant and thread safe]
strtok will leave your original string modified. It replaces the delimiter with '\0'. And if your string happens to be a constant stored in read-only memory (some compilers do that), you may actually get an access violation.
Using strtok() from string.h.
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "test\tbla-bla-bla\t2332";
char * pch;
pch = strtok (str," \t");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " \t");
}
return 0;
}
You can use any regex library or even the GLib GScanner.
Yet another version; this one separates the logic into a new function
#include <stdio.h>
static _Bool next_token(const char **start, const char **end)
{
if(!*end) *end = *start; // first call
else if(!**end) // check for terminating zero
return 0;
else *start = ++*end; // skip tab
// advance to terminating zero or next tab
while(**end && **end != '\t')
++*end;
return 1;
}
int main(void)
{
const char *string = "foo\tbar\tbaz";
const char *start = string;
const char *end = NULL; // NULL value indicates first call
while(next_token(&start, &end))
{
// print substring [start,end[
printf("%.*s\n", (int)(end - start), start);
}
return 0;
}
If you need a binary safe way to tokenize a given string:
#include <string.h>
#include <stdio.h>
void tokenize(const char *str, const char delim, const size_t size)
{
const char *start = str, *next;
const char *end = str + size;
while (start < end) {
if ((next = memchr(start, delim, end - start)) == NULL) {
next = end;
}
printf("%.*s\n", (int)(next - start), start);
start = next + 1;
}
}
int main(void)
{
char str[] = "test\tbla-bla-bla\t2332";
int len = strlen(str);
tokenize(str, '\t', len);
return 0;
}
