Simpler way to extract occurrences without using strtok in C - c

I'm trying to extract the strings before and after the first comma from the given string. However, I feel there's got to be a better way than what I have below, perhaps I don't even need the strdup calls. Thanks
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int extract_names(const char *str)
{
char *name, *last, *p1, *p2, *p3;
name = strdup(str);
last = strdup(str);
p1 = strchr(name, ',');
if (p1)
{
*p1 = '\0';
printf("%s\n", name);
}
p2 = strchr(last, ',');
p2++;
if (p2)
{
p3 = strpbrk(p2 + 1, " \0");
if (p3)
*p3 = '\0';
printf("%s\n", p2);
}
free(name);
free(last);
return 0;
}
int main()
{
// strings should at least contain last,name.
// but can contain several words
const char *str1 = "jones,bob age,12";
extract_names(str1);
const char *str2 = "smith,peter";
extract_names(str2);
return 0;
}
Output
jones
bob
smith
peter

Use strchr to find the limits of the last and first names. Then you can use the precision specifier in printf to print just the part of the string you are interested in.
For example:
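int extract_names(const char *str)
{
const char *comma = strchr(str, ',');
if (!comma) {
return -1; /* no comma: nothing to split */
}
/* first name ends at the next space or at the end of the string */
const char *name_end = strchr(comma + 1, ' ');
if (!name_end) {
name_end = str + strlen(str);
}
/* print last name; the precision argument must be an int */
printf("%.*s\n", (int)(comma - str), str);
/* print first name, not counting the comma itself */
printf("%.*s\n", (int)(name_end - comma - 1), comma + 1);
return 0;
}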

Since you're obviously trying to learn, I'll only give you a few pointers and not a "better working solution".
strdup is a useful idea here. As you do, it allows you to overwrite the string (const char *str is "readonly").
This combination:
p2 = strchr(last, ',');
p2++;
if (p2) {
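is wrong. After the call, p2 holds strchr's return value. If you advance it before the if, the test no longer checks anything: when strchr returns NULL, p2 has already been incremented (which is itself undefined behavior on a null pointer), so the condition cannot reliably catch the failure. Test p2 first, then advance it.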
You don't need two strdups, and you don't need to search twice. p1 already points to the right place in name, you can use that for the rest of your logic.
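As a rough illustration of that last point, here is a minimal sketch that searches once and makes no strdup copies at all. The name split_names and the caller-supplied buffers are assumptions made for this example; they are not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text before the first comma into `last`
   and the word after it (up to a space or the end) into `first`.
   Returns 0 on success, -1 if there is no comma or a buffer is too small. */
int split_names(const char *str, char *last, size_t last_sz,
                char *first, size_t first_sz)
{
    const char *comma = strchr(str, ',');
    if (comma == NULL)              /* test the result before using it */
        return -1;

    size_t last_len = (size_t)(comma - str);
    size_t first_len = strcspn(comma + 1, " ");  /* stops at ' ' or '\0' */
    if (last_len >= last_sz || first_len >= first_sz)
        return -1;

    memcpy(last, str, last_len);
    last[last_len] = '\0';
    memcpy(first, comma + 1, first_len);
    first[first_len] = '\0';
    return 0;
}
```

Calling it with "jones,bob age,12" and two 16-byte buffers fills them with "jones" and "bob"; the input string is never modified, so it also works on string literals.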

Related

How to obtain a single word from a string in C?

In order to complete a program I am working on, I have to be able to put pieces of a string into a stack for later use. For example, say I had this string:
"22 15 - 2 +"
Ideally, I first want to extract 22 from the string, place it in a separate, temporary string, and then manipulate it as I would like. Here is the code that I'm using which I think would work, but it is very over-complicated.
void evaluatePostfix(char *exp){
stack *s = initStack();
char *temp_str;
char temp;
int temp_len, val, a, b, i=0, j;
int len = strlen(exp);
while(len > 0){
temp_str = malloc(sizeof(char)); //holds the string i am extracting
j=0; //first index in temp_str
temp = exp[i]; //current value in exp, incremented later on the function
temp_len = 1; //for reallocation purposes
while(!isspace(temp)){ //if a white space is hit, the full value is already scanned
if(ispunct(temp)) //punctuation will always be by itself
break; //break if it is encountered
temp_str = (char*)realloc(temp_str, temp_len+1); //or else reallocate the string to hold the new character
temp_str[j] = temp; //copy the character to the string
temp_len++; //increment for the length of temp_str
i++; //advance one value in exp
j++; //advance one value in temp_str
len--; //the number of characters left to scan is one less
temp = exp[i]; //prepare for the next loop
} //and so on, and so on...
} //more actions follow this, but are excluded
}
Like I said, overcomplicated. Is there a simpler way for me to extract this code? I can reliably depend upon there being white space between the values and characters I need to extract.
If you are fine with using a library function, strtok is made for this:
#include <string.h>
#include <stdio.h>
int main()
{
char str[80] = "22 15 - 2 +";
const char s[2] = " ";
char *token;
/* get the first token */
token = strtok(str, s);
/* walk through other tokens */
while( token != NULL )
{
printf( " %s\n", token );
token = strtok(NULL, s);
}
return(0);
}
The limitation of strtok(char *str, const char *delim) is that it cannot tokenize multiple strings at the same time: it keeps its position in a hidden static pointer between calls, so it is only sufficient when you work on one string at a time (and it is not thread-safe). The safer alternative is POSIX strtok_r(char *str, const char *delim, char **saveptr), which takes an explicit third argument in which the caller stores the parsing position.
#include <string.h>
#include <stdio.h>
int main()
{
char str[80] = "22 15 - 2 +";
const char s[2] = " ";
char *token, *saveptr;
/* get the first token */
token = strtok_r(str, s, &saveptr);
/* walk through other tokens */
while( token != NULL )
{
printf( " %s\n", token );
token = strtok_r(NULL, s, &saveptr);
}
return(0);
}
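To show why the explicit save pointer matters, here is a small sketch that tokenizes two levels at once, which plain strtok cannot do. The function name sum_values and the "key=value;key=value" input format are assumptions made up for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r is POSIX, not ISO C */
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: sums the values in a "key=value;key=value" string.
   The buffer is modified in place. */
long sum_values(char *buf)
{
    long total = 0;
    char *outer_save, *inner_save;

    for (char *pair = strtok_r(buf, ";", &outer_save);
         pair != NULL;
         pair = strtok_r(NULL, ";", &outer_save)) {
        /* The inner tokenizer keeps its own state in inner_save,
           so it does not disturb the outer loop's position. */
        char *key = strtok_r(pair, "=", &inner_save);
        char *value = strtok_r(NULL, "=", &inner_save);
        if (key != NULL && value != NULL)
            total += strtol(value, NULL, 10);
    }
    return total;
}
```

With strtok in both loops, the inner call would overwrite the single hidden position and the outer loop would stop after the first pair.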
Take a look at the strtok function; I think it's what you're looking for.

Split string by a substring

I have following string:
char str[] = "A/USING=B)";
I want to split to get separate A and B values with /USING= as a delimiter
How can I do it? I know strtok(), but it only splits on a single character as the delimiter.
As others have pointed out, you can use strstr from <string.h> to find the delimiter in your string. Then either copy the substrings or modify the input string to split it.
Here's an implementation that returns the second part of a split string. If the string can't be split, it returns NULL and the original string is unchanged. If you need to split the string into more substrings, you can call the function on the tail repeatedly. The first part will be the input string, possibly shortened.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *split(char *str, const char *delim)
{
char *p = strstr(str, delim);
if (p == NULL) return NULL; // delimiter not found
*p = '\0'; // terminate string after head
return p + strlen(delim); // return tail substring
}
int main(void)
{
char str[] = "A/USING=B";
char *tail;
tail = split(str, "/USING=");
if (tail) {
printf("head: '%s'\n", str);
printf("tail: '%s'\n", tail);
}
return 0;
}
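Calling the function on the tail repeatedly could look roughly like the sketch below. The names split_once and split_all are hypothetical, and the sketch assumes delim is not the empty string.

```c
#include <stddef.h>
#include <string.h>

/* Same idea as split() above: cut str at the first delim, return the tail. */
static char *split_once(char *str, const char *delim)
{
    char *p = strstr(str, delim);
    if (p == NULL) return NULL;
    *p = '\0';
    return p + strlen(delim);
}

/* Hypothetical helper: stores up to max pieces of str in parts[] and
   returns how many were stored. str is modified in place. */
size_t split_all(char *str, const char *delim, char *parts[], size_t max)
{
    size_t n = 0;
    while (str != NULL && n < max) {
        char *tail = split_once(str, delim);
        parts[n++] = str;   /* head, now terminated at the delimiter */
        str = tail;         /* NULL once no delimiter is left */
    }
    return n;
}
```

The pointers in parts[] refer into the original buffer, so nothing needs to be freed, but the buffer must outlive them.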
I know strtok(), but it only splits on a single character as the delimiter
No, that's not quite right.
As per the man page for strtok(),
char *strtok(char *str, const char *delim);
[...] The delim argument specifies a set of bytes that delimit the tokens in the parsed string. [...] A sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter. [...]
So delim need not be "one character" as you've mentioned. Be aware, though, that it is a set of bytes and not a substring: passing "/USING=" makes strtok split on any of '/', 'U', 'S', 'I', 'N', 'G' and '='. That gets the job done for "A/USING=B" only because neither A nor B contains one of those bytes.
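The "set of bytes" wording in the man page is easy to check. The sketch below (count_tokens is a hypothetical helper) counts the tokens strtok produces, so you can compare an input that happens to avoid the delimiter bytes with one that contains a 'U'.

```c
#include <string.h>

/* Counts the tokens strtok() produces; buf is modified in place. */
int count_tokens(char *buf, const char *delims)
{
    int n = 0;
    for (char *tok = strtok(buf, delims); tok != NULL;
         tok = strtok(NULL, delims))
        n++;
    return n;
}
```

With the delimiter set "/USING=", the input "abcd/USING=efgh" yields two tokens, while "abUcd/USING=efgh" yields three ("ab", "cd", "efgh"), because the lone 'U' is treated as a delimiter too.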
Here is a little function to do this. It works exactly like strtok_r except that the delimiter is taken as a delimiting string, not a list of delimiting characters.
char *strtokstr_r(char *s, char *delim, char **save_ptr)
{
char *end;
if (s == NULL)
s = *save_ptr;
if (s == NULL || *s == '\0')
{
*save_ptr = s;
return NULL;
}
// Skip leading delimiters.
while (strstr(s,delim)==s) s+=strlen(delim);
if (*s == '\0')
{
*save_ptr = s;
return NULL;
}
// Find the end of the token.
end = strstr (s, delim);
if (end == NULL)
{
*save_ptr = s + strlen(s);
return s;
}
// Terminate the token and make *SAVE_PTR point past it.
memset(end, 0, strlen(delim));
*save_ptr = end + strlen(delim);
return s;
}
The strtok-based answers are only valid for this particular input. With "abUcd/USING=efgh" they break, because strtok splits on every byte of "/USING=" individually and so also cuts at the 'U' in "abUcd".
The only answer that worked for me is the strstr-based split() function shown above.
I found this approach when I searched for your question on Google.
In your case it will be:
#include <stdio.h>
#include <string.h>
int main (int argc, char* argv [])
{
char theString [16] = "abcd/USING=efgh";
char theCopy [16];
char *token;
strcpy (theCopy, theString);
token = strtok (theCopy, "/USING=");
while (token)
{
printf ("%s\n", token);
token = strtok (NULL, "/USING=");
}
return 0;
}
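This passes "/USING=" to strtok as the set of delimiter bytes. It works here only because neither "abcd" nor "efgh" contains any of those bytes.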
The output of this was:
abcd
efgh

Substrings in the middle of a String in C

I need to extract substrings that are between Strings I know.
I have something like char string = "abcdefg";
I know what I need is between "c" and "f", then my return should be "de".
I know the strncpy() function but do not know how to apply it in the middle of a string.
Thank you.
Here's a full, working example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char string[] = "abcdefg";
char from[] = "c";
char to[] = "f";
char *first = strstr(string, from);
if (first == NULL) {
first = &string[0];
} else {
first += strlen(from);
}
char *last = strstr(first, to);
if (last == NULL) {
last = &string[strlen(string)];
}
char *sub = calloc(last - first + 1, sizeof(char));
strncpy(sub, first, last - first);
printf("%s\n", sub);
free(sub);
return 0;
}
Now, the explanation:
1.
char string[] = "abcdefg";
char from[] = "c";
char to[] = "f";
Declarations of strings: main string to be checked, beginning delimiter, ending delimiter. Note these are arrays as well, so from and to could be, for example, cd and fg, respectively.
2.
char *first = strstr(string, from);
Find the first occurrence of the beginning delimiter in the main string. If you need the last occurrence instead (for example, if you had the string abcabc and wanted a substring starting from the second a), the search would have to be different.
3.
if (first == NULL) {
first = &string[0];
} else {
first += strlen(from);
}
Handle the situation in which the first delimiter doesn't appear in the string. In that case, we make the substring start at the beginning of the entire string. If it does appear, however, we advance the pointer by the length of from, since the substring must begin after the first delimiter (correction thanks to @dau_sama).
Depending on your specifications, this may or may not be needed, or another result might be expected.
4.
char *last = strstr(first, to);
Find the first occurrence of the ending delimiter, starting from first.
As noted by @dau_sama, it's better to search for the ending delimiter from first rather than from the beginning of the entire string. This avoids matching an occurrence of to that appears before from.
5.
if (last == NULL) {
last = &string[strlen(string)];
}
Handle the situation in which the second delimiter doesn't appear in the string. In that case, we make the substring run to the end of the string, so last points at the terminating '\0', one past the last character.
Again, depending on your specifications, this may or may not be needed, or another result might be expected.
6.
char *sub = calloc(last - first + 1, sizeof(char));
strncpy(sub, first, last - first);
Allocate sufficient memory and extract substring based on pointers found earlier. We copy last - first (length of the substring) characters beginning from first character.
7.
printf("%s\n", sub);
Here's the result.
I hope it does present the problem with enough details. Depending on your exact specifications, you may need to alter this somehow. For example, if you needed to find all substrings, and not just the first one, you may want to make a loop for finding first and last.
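A loop over all substrings could be built on a helper like the one sketched here. The name next_between and its interface are assumptions for illustration, and the sketch assumes from and to are non-empty. Unlike the example above, it treats a missing delimiter as "no match" instead of falling back to the start or end of the string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next span between `from` and `to` in s.
   On success it stores the span's start and length and returns the position
   just past `to`, so the caller can continue from there.
   Returns NULL when no complete span is left. */
const char *next_between(const char *s, const char *from, const char *to,
                         const char **start, size_t *len)
{
    const char *first = strstr(s, from);
    if (first == NULL) return NULL;
    first += strlen(from);              /* span begins after `from` */
    const char *last = strstr(first, to);
    if (last == NULL) return NULL;
    *start = first;
    *len = (size_t)(last - first);
    return last + strlen(to);
}
```

A caller would loop with something like while ((p = next_between(p, "[", "]", &start, &len)) != NULL) and print each span with printf("%.*s\n", (int)len, start), which avoids allocating a copy per match.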
TY guys, worked using the form below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *between_substring(char *str, char from, char to){
while(*str && *str != from)
++str;//skip
if(*str == '\0')
return NULL;
else
++str;
char *ret = malloc(strlen(str)+1);
char *p = ret;
while(*str && *str != to){
*p++ = *str++;//To the end if `to` do not exist
}
*p = 0;
return ret;
}
int main (void){
char source[] = "abcdefg";
char *target;
target = between_substring(source, 'c', 'f');
printf("%s\n", source);
if (target != NULL) {
printf("%s\n", target);
free(target);
}
return 0;
}
Since people seemed to not understand my approach in the comments, here's a quick hacked together stub.
const char* string = "abcdefg";
const char* b = "c";
const char* e = "f";
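//look for the first pattern
const char* begin = strstr(string, b);
if(!begin)
return NULL;
begin += strlen(b); //skip past the begin pattern itself
//look for the end pattern
const char* end = strstr(begin, e);
if(!end)
return NULL;
//end already points at the start of the end pattern
char result[MAXLENGTH]; //MAXLENGTH must be larger than end-begin
strncpy(result, begin, end-begin);
result[end-begin] = '\0'; //strncpy does not terminate the copy here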
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *between(const char *str, char from, char to){
while(*str && *str != from)
++str;//skip
if(*str == '\0')
return NULL;
else
++str;
char *ret = malloc(strlen(str)+1);
char *p = ret;
while(*str && *str != to){
*p++ = *str++;//To the end if `to` do not exist
}
*p = 0;
return ret;
}
int main(void){
const char* string = "abcdefg";
char *substr = between(string, 'c', 'f');
if(substr!=NULL){
puts(substr);
free(substr);
}
return 0;
}

Print delim used by strtok_r

I have this text for example:
I know,, more.- today, than yesterday!
And I'm extracting words with this code:
while(getline(&line, &len, fpSourceFile) > 0) {
last_word = NULL;
word = strtok_r(line, delim, &last_word);
while(word){
printf("%s ", word);
word = strtok_r(NULL, delim, &last_word);
// delim_used = ;
}
}
The output is:
I know more today than yesterday
But is there any way to get the delimiter used by strtok_r()? I want to replace identical words with one integer, and do the same with delimiters. I can get one word with strtok_r(), but how do I get the delimiter used by that function?
Fortunately, strtok_r() is a pretty simple function - it's easy to create your own variant that does what you need:
#include <string.h>
/*
* public domain strtok_ex() based on a public domain
* strtok_r() by Charlie Gordon
*
* strtok_r from comp.lang.c 9/14/2007
*
* http://groups.google.com/group/comp.lang.c/msg/2ab1ecbb86646684
*
* (Declaration that it's public domain):
* http://groups.google.com/group/comp.lang.c/msg/7c7b39328fefab9c
*/
/*
strtok_ex() is an extended version of strtok_r() that optionally
returns the delimiter that was used to terminate the token
the first 3 parameters are the same as for strtok_r(), the last
parameter:
char* delim_found
is an optional pointer to a character that will get the value of
the delimiter that was found to terminate the token.
*/
char* strtok_ex(
char *str,
const char *delim,
char **nextp,
char* delim_found)
{
char *ret;
char tmp;
if (!delim_found) delim_found = &tmp;
if (str == NULL)
{
str = *nextp;
}
str += strspn(str, delim);
if (*str == '\0')
{
*delim_found = '\0';
return NULL;
}
ret = str;
str += strcspn(str, delim);
*delim_found = *str;
if (*str)
{
*str++ = '\0';
}
*nextp = str;
return ret;
}
#include <stdio.h>
int main(void)
{
char delim[] = " ,.-!";
char line[] = "I know,, more.- today, than yesterday!";
char delim_used;
char* last_word = NULL;
char* word = strtok_ex(line, delim, &last_word, &delim_used);
while (word) {
printf("word: \"%s\" \tdelim: \'%c\'\n", word, delim_used);
word = strtok_ex(NULL, delim, &last_word, &delim_used);
}
return 0;
}
Getting any skipped delimiters would be a bit more work. I don't think it would be a lot of work, but I do think the interface would be unwieldy (strtok_ex()'s interface is already clunky), so you'd have to put some thought into that.
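One possible shape for such an interface, offered only as a sketch: instead of writing '\0' into the line, return the word and the whole run of delimiters after it as pointer-plus-length spans. The struct span type and the next_word function are hypothetical names, not part of strtok_ex() above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical span type: a pointer into the original line plus a length. */
struct span { const char *ptr; size_t len; };

/* Reads one word and the full run of delimiters that follows it, starting
   at *pos, and advances *pos past both. The line is not modified.
   Returns 0 when no word is left, 1 otherwise. */
int next_word(const char **pos, const char *delim,
              struct span *word, struct span *seps)
{
    const char *s = *pos + strspn(*pos, delim);  /* skip leading delimiters */
    if (*s == '\0') return 0;
    word->ptr = s;
    word->len = strcspn(s, delim);
    seps->ptr = s + word->len;
    seps->len = strspn(seps->ptr, delim);
    *pos = seps->ptr + seps->len;
    return 1;
}
```

Because nothing is overwritten, both spans can be printed with the "%.*s" precision form of printf, and a run such as ",, " after "know" is reported in full rather than as a single character.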
No, you cannot identify the delimiter (by means of the call to strtok_r() itself).
From man strtok_r:
BUGS
[...]
The identity of the delimiting byte is lost.

How to remove \n or \t from a given string in C?

How can I strip a string with all \n and \t in C?
This works in my quick and dirty tests. Does it in place:
#include <stdio.h>
void strip(char *s) {
char *p2 = s;
while(*s != '\0') {
if(*s != '\t' && *s != '\n') {
*p2++ = *s++;
} else {
++s;
}
}
*p2 = '\0';
}
int main() {
char buf[] = "this\t is\n a\t test\n test";
strip(buf);
printf("%s\n", buf);
}
And to appease Chris, here is a version which will place the result in a newly malloced buffer and return it (thus it'll work on literals). You will need to free the result, and to include <stdlib.h> and <string.h> for malloc and strlen.
char *strip_copy(const char *s) {
char *p = malloc(strlen(s) + 1);
if(p) {
char *p2 = p;
while(*s != '\0') {
if(*s != '\t' && *s != '\n') {
*p2++ = *s++;
} else {
++s;
}
}
*p2 = '\0';
}
return p;
}
If you want to replace \n or \t with something else, you can use the function strstr(). It returns a pointer to the first place in a string where a certain substring occurs, or NULL if there is none. For example:
// Find the first "\n".
char new_char = 't';
char* pFirstN = strstr(szMyString, "\n");
if (pFirstN != NULL)
*pFirstN = new_char;
You can run that in a loop to find all \n's and \t's.
If you want to "strip" them, i.e. remove them from the string, you'll need to actually use the same method as above, but copy the contents of the string "back" every time you find a \n or \t, so that "this i\ns a test" becomes: "this is a test".
You can do that with memmove (not memcpy, since the src and dst are pointing to overlapping memory), like so:
char* temp;
// Remove \n.
while ((temp = strstr(str, "\n")) != NULL) {
// Length of the tail after the \n, plus one for the terminating '\0'.
size_t len = strlen(temp + 1) + 1;
memmove(temp, temp + 1, len);
}
You'll need to repeat this loop again to remove the \t's.
Note: Both of these methods work in place. This might not be safe, for example when the string is a literal (read Evan Teran's comments for details). Also, these methods are not very efficient, although they do use a library function for some of the work instead of rolling your own.
Basically, you have two ways to do this: you can create a copy of the original string, minus all '\t' and '\n' characters, or you can strip the string "in-place." However, I bet money that the first option will be faster, and I promise you it will be safer.
So we'll make a function:
char *strip(const char *str, const char *d);
We want to use strlen() and malloc() to allocate a new char * buffer the same size as our str buffer. Then we go through str character by character. If the character is not contained in d, we copy it into our new buffer. We can use something like strchr() to see if each character is in the string d. Once we're done, we have a new buffer, with the contents of our old buffer minus characters in the string d, so we just return that. I won't give you sample code, because this might be homework, but here's the sample usage to show you how it solves your problem:
char *string = "some\n text\t to strip";
char *stripped = strip(string, "\t\n");
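For readers who are not doing this as homework, the description above could be sketched as follows. The function is named strip_chars here (a hypothetical name) to keep it distinct from the other strip() functions on this page.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of str without any character found in d,
   or NULL if allocation fails. The caller frees the result. */
char *strip_chars(const char *str, const char *d)
{
    char *out = malloc(strlen(str) + 1);   /* result is never longer */
    if (out == NULL) return NULL;
    char *w = out;
    for (; *str != '\0'; str++) {
        if (strchr(d, *str) == NULL)       /* keep characters not listed in d */
            *w++ = *str;
    }
    *w = '\0';
    return out;
}
```

Since the input is only read, this version is safe to call on string literals.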
strpbrk is a C string function that finds the first occurrence of any character in accept and returns a pointer to that position, or NULL if none is found.
#include <string.h>
char *strpbrk(const char *s, const char *accept);
Example:
char search[] = "a string with \t and \n";
char *first_occ = strpbrk( search, "\t\n" );
first_occ will point to the \t, the 15th character in search. You can replace it and then call strpbrk again in a loop until all have been replaced.
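Such a loop might look like the sketch below; replace_chars is a hypothetical name, and the search resumes one character past each hit.

```c
#include <string.h>

/* Replaces every character of s that appears in accept with repl, in place.
   Returns the number of replacements made. */
int replace_chars(char *s, const char *accept, char repl)
{
    int count = 0;
    for (char *p = strpbrk(s, accept); p != NULL;
         p = strpbrk(p + 1, accept)) {
        *p = repl;
        count++;
    }
    return count;
}
```

For the search string above, replace_chars(search, "\t\n", ' ') makes two replacements. Note that this replaces characters rather than removing them, so the string keeps its length.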
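I like to make the standard library do as much of the work as possible, so I would use something similar to Evan's solution but with strspn() and strcspn(). Note that the SPACE set below also strips spaces and \r; shrink it to "\t\n" if you only want to remove tabs and newlines.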
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define SPACE " \t\r\n"
static void strip(char *s);
static char *strip_copy(char const *s);
int main(int ac, char **av)
{
char s[] = "this\t is\n a\t test\n test";
char *s1 = strip_copy(s);
strip(s);
printf("%s\n%s\n", s, s1);
return 0;
}
static void strip(char *s)
{
char *p = s;
int n;
while (*s)
{
n = strcspn(s, SPACE);
memmove(p, s, n); /* source and destination may overlap, so not strncpy */
p += n;
s += n + strspn(s+n, SPACE);
}
*p = 0;
}
static char *strip_copy(char const *s)
{
char *buf = malloc(1 + strlen(s));
if (buf)
{
char *p = buf;
char const *q;
int n;
for (q = s; *q; q += n + strspn(q+n, SPACE))
{
n = strcspn(q, SPACE);
strncpy(p, q, n);
p += n;
}
*p++ = '\0';
buf = realloc(buf, p - buf);
}
return buf;
}
