I'm looking for a very simple way to return an array of strings that are contained between trailing and leading strings. Here's an example:
char *text = ";;;Text I want]]] Text I don't care about ;;;More Text I want]]] More text I don't care about";
Calling stringBetweenString(";;;","]]]",text) should return an array (const char *myArray[2]) with the following values: "Text I want","More Text I want".
Unfortunately, I do not have access to RegEx for this application, nor external libraries. Any help would be greatly appreciated, thanks!
There is no need for a regex. As others have noted, strstr searches a string for the first occurrence of a substring and returns a pointer to the beginning of that occurrence on success, or NULL otherwise. You can combine that with simple pointer arithmetic to parse the wanted text from between the substrings, e.g.:
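#include <stdio.h>
#include <string.h>

#define MAXC 128

int main (void) {

    char *text = ";;;Text I want]]] Text I don't care about ;;;More "
                 "Text I want]]] More text I don't care about";
    char buf[MAXC] = "", *p = text, *ep;

    while ((p = strstr (p, ";;;"))) {           /* find next opening delimiter */
        if ((ep = strstr (p + 3, "]]]"))) {     /* find the closing delimiter after it */
            size_t len = (size_t)(ep - p - 3);  /* length of the text in between */
            if (len >= MAXC)                    /* truncate rather than overflow buf */
                len = MAXC - 1;
            memcpy (buf, p + 3, len);
            buf[len] = 0;
            printf ("buf: '%s'\n", buf);
        }
        else
            break;
        p = ep + 3;                             /* resume after the closing delimiter */
    }

    return 0;
}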
Example Use/Output
$ ./bin/splitbetween
buf: 'Text I want'
buf: 'More Text I want'
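The question asked for an array of strings rather than printed output. A minimal sketch of the same strstr approach that collects the matches could look like the following. The name strings_between, its parameter order and the caller-supplied out array are my own assumptions, not something from the question, and each result is a malloc'd copy that the caller has to free.

```c
#include <stdlib.h>
#include <string.h>

/* Collect up to max substrings of text that lie between left and right.
 * Each result is a malloc'd, nul-terminated copy stored in out[];
 * the caller frees them. Returns the number of strings stored.
 * (Hypothetical helper, shown only as a sketch.) */
size_t strings_between(const char *left, const char *right,
                       const char *text, char **out, size_t max)
{
    size_t count = 0, llen = strlen(left), rlen = strlen(right);
    const char *p = text;

    if (llen == 0 || rlen == 0)
        return 0;                        /* empty delimiters are ambiguous */

    while (count < max && (p = strstr(p, left)) != NULL) {
        const char *start = p + llen;    /* first byte after the opener */
        const char *end = strstr(start, right);
        if (end == NULL)
            break;                       /* opener without a closer: stop */

        size_t len = (size_t)(end - start);
        char *copy = malloc(len + 1);
        if (copy == NULL)
            break;                       /* out of memory: keep what we have */
        memcpy(copy, start, len);
        copy[len] = '\0';
        out[count++] = copy;

        p = end + rlen;                  /* resume after the closer */
    }
    return count;
}
```

Called as strings_between(";;;", "]]]", text, myArray, 2), it should fill myArray with the two wanted strings and return 2.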
How to get string until second symbol through sscanf?
for example:
char *str = "struct1.struct2.struct3.int";
char buf[256] = {0};
sscanf(str, "", buf); // is there any format string that reads up to the second dot?
sscanf get string until second symbol (include one)
How to get string until second symbol through sscanf?
Not generally possible with a single use of sscanf().
Certainly, without a lot of work, a more involved use of sscanf() will work for many input strings, yet fail for select ones [1]. sscanf() is not the best fit for this task.
strchr() and strcspn() are better suited.
#include <string.h>
#include <stdlib.h>

// Return offset to 2nd needle occurrence
// or end of string, if not found.
size_t foo(const char *haystack, const char *needle) {
    size_t offset = strcspn(haystack, needle);
    if (haystack[offset]) {
        offset++;
        offset += strcspn(haystack + offset, needle);
    }
    return offset;
}

#include <stdio.h>

int main(void) {
    const char *haystack = "struct1.struct2.struct3.int";
    printf("<%.*s>\n", (int) foo(haystack, "."), haystack);
}
Output
<struct1.struct2>
[1] Consider: "struct1.struct2", "struct1..", "..struct2", ".struct2.", "..", ".", "".
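To see how the strcspn() approach behaves on those inputs, here is a small generalization that returns the offset of the n-th delimiter. The name offset_of_nth and the n parameter are assumptions of mine. For n = 2 it follows the same steps as the function above.

```c
#include <stddef.h>
#include <string.h>

/* Offset of the n-th delimiter in s (n >= 1), or strlen(s) if s holds
 * fewer than n delimiters. delims is a set of single-character
 * delimiters, as for strcspn(). (Hypothetical helper.) */
size_t offset_of_nth(const char *s, const char *delims, unsigned n)
{
    size_t offset = strcspn(s, delims);     /* first delimiter or end */
    while (n > 1 && s[offset] != '\0') {
        offset++;                           /* step over the delimiter */
        offset += strcspn(s + offset, delims);
        n--;
    }
    return offset;
}
```

With "." as the delimiter and n = 2, the inputs from the footnote give 15 for "struct1.struct2", 8 for "struct1..", 1 for "..struct2", 8 for ".struct2.", 1 for "..", 1 for "." and 0 for "". Printing with "%.*s" and that length then shows the prefix up to, but not including, the second dot.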
You can use a * to tell scanf to ignore an element:
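#include <stdio.h>

const char *str = "struct1.struct2.struct3.int";

int main(void) {
    char buf[256];
    /* %*[^.] skips the first field; %255[^.] stores the second, bounded to fit buf */
    int i = sscanf(str, "%*[^.].%255[^.]", buf);
    if (i == 1)
        printf("%d >%s<\n", i, buf);
    return 0;
}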
This outputs as expected:
1 >struct2<
because exactly 1 element was assigned even if another one was parsed.
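If what you want is everything up to the second dot ("struct1.struct2") rather than just the second field, one possible variation is to suppress both fields and let %n report how many characters were consumed. This is only a sketch. The function name prefix_before_second_dot is made up for illustration, and because a %[ directive fails on an empty match, inputs such as ".struct2" or "a..b" are rejected.

```c
#include <stdio.h>

/* Length of the prefix of s that ends just before the second '.',
 * or -1 if s does not start with two non-empty fields separated by a dot.
 * (Hypothetical helper.) */
int prefix_before_second_dot(const char *s)
{
    int n = -1;     /* stays -1 unless the scan gets as far as %n */

    /* Both fields are scanned but discarded (*); %n stores the number of
     * characters consumed so far and does not count as an assignment,
     * so the return value of sscanf is not useful here. */
    (void)sscanf(s, "%*[^.].%*[^.]%n", &n);
    return n;
}
```

With the string from the question the function returns 15, so printf("<%.*s>\n", n, str) prints <struct1.struct2>.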
Can I use the strstr function to match exact word? For example, let's say I have the word hello, and an input string line:
If I have
char* line = "hellodarkness my old friend";
and I use
result = strstr(line, "hello");
result will match (be non-NULL). However, I want to match only the exact word "hello", so that "hellodarkness" does not match and result is NULL.
Is it possible to do this using strstr, or do I have to use fscanf, scan the line word by word and check for matches?
Here is a generic function for your purpose. It returns a pointer to the first match or NULL if none can be found:
#include <ctype.h>
#include <string.h>
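char *word_find(const char *str, const char *word) {
    const char *p = NULL;
    size_t len = strlen(word);

    if (len > 0) {
        for (p = str; (p = strstr(p, word)) != NULL; p++) {
            /* the match must start at the beginning or after a non-alphanumeric */
            if (p == str || !isalnum((unsigned char)p[-1])) {
                /* and must be followed by a non-alphanumeric or the end */
                if (!isalnum((unsigned char)p[len]))
                    break;  /* we have a match! */
                p += len;   /* next match is at least len+1 bytes away */
            }
        }
    }
    return (char *)p;       /* cast away const, as strstr itself does */
}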
I would:
check if string is in sentence
if found at start (same pointer as line), add the length of the word and check if alphanumerical char found. If not (or null-terminated), then match
if found anywhere else, add the extra "no alphanum before" test
code:
#include <stdio.h>
#include <string.h>   /* strstr and strlen live here, not in <strings.h> */
#include <ctype.h>

int main()
{
    const char* line = "hellodarkness my old friend";
    const char *word_to_find = "hello";
    const char* p = strstr(line,word_to_find);
    if ((p==line) || (p!=NULL && !isalnum((unsigned char)p[-1])))
    {
        p += strlen(word_to_find);
        if (!isalnum((unsigned char)*p))
        {
            printf("Match\n");
        }
    }
    return 0;
}
Here it doesn't print anything, but insert punctuation or a space after "hello", or terminate the string there, and you'll get a match. Likewise, you won't get a match if you insert alphanumeric characters before "hello".
EDIT: the above code is nice when there's only 1 "hello" but fails to find the second "hello" in "hellohello hello". So we have to insert a loop to look for the word or NULL, advancing p each time, like this:
#include <stdio.h>
#include <string.h>   /* strstr and strlen live here, not in <strings.h> */
#include <ctype.h>

int main()
{
    const char* line = " hellohello hello darkness my old friend";
    const char *word_to_find = "hello";
    const char* p = line;

    for(;;)
    {
        p = strstr(p,word_to_find);
        if (p == NULL) break;

        if ((p==line) || !isalnum((unsigned char)p[-1]))
        {
            p += strlen(word_to_find);
            if (!isalnum((unsigned char)*p))
            {
                printf("Match\n");
                break;  // found, quit
            }
        }
        // substring was found, but no word match, move by 1 char and retry
        p+=1;
    }
    return 0;
}
Since strstr() returns a pointer to the start of the substring that you want to identify, you can use strlen(result) to check whether it is part of a longer string or the isolated string that you are looking for. If strlen(result) == strlen("hello"), then the match runs to the end of the string and ends correctly. If the match is followed by a space or punctuation (or some other delimiter), then it is also isolated at the end. You would also need to check whether the start of the substring is at the beginning of the "long string" or is preceded by a blank, punctuation, or another delimiter.
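The checks described above can be sketched roughly as follows. The names is_boundary and find_isolated are hypothetical. This version treats the end of the string, whitespace and punctuation as delimiters, so note that an underscore also counts as a delimiter here because ispunct() accepts it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if c may sit next to a word: end of string, whitespace
 * or punctuation. (Hypothetical helper.) */
static int is_boundary(char c)
{
    unsigned char u = (unsigned char)c;
    return u == '\0' || isspace(u) || ispunct(u);
}

/* First occurrence of word in line that is delimited on both sides,
 * or NULL if there is none. (Hypothetical helper.) */
const char *find_isolated(const char *line, const char *word)
{
    size_t len = strlen(word);
    if (len == 0)
        return NULL;

    for (const char *p = line; (p = strstr(p, word)) != NULL; p++) {
        int starts_ok = (p == line) || is_boundary(p[-1]);
        int ends_ok = is_boundary(p[len]);  /* p[len] is '\0' at the very end */
        if (starts_ok && ends_ok)
            return p;
    }
    return NULL;
}
```

For "hellodarkness my old friend" this returns NULL, and for "hellohello hello" it returns a pointer to the last "hello".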
So, I have an issue. I'm trying to only get the inside of a string as given by this example:
User input: insert("someWord")
And I want to first make sure that the user spelt insert(" correctly, then I want to copy the string contained inside the " ". As of now, I have a function with a parameter that is the full user input, and inside that function, I have the following:
method header(char *string){
char insert[]="insert(";
if(strncmp(string,insert,6)==0)
{
//the first part was right up to the "
//how do I now get the string contained between " "?
}
else
{ //invalid input
}
}
I'm not even 100% positive the strncmp method is comparing the first 6 letters of the two strings correctly.
sscanf(3) to the rescue:
char insert[31];
int matched = sscanf(string, "insert(\"%30[^\"]\")", insert);
if (matched == 1) printf("Got %s\n", insert);
This matches a string no larger than 30 characters that doesn't contain a " and is surrounded by insert(" and ").
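One caveat: the return value of sscanf only says that the word was stored, not that the closing ") was actually present. A possible way to check that as well is a trailing %n, sketched below. The function name parse_insert is an assumption for illustration, and an empty word (insert("")) is rejected because %[ needs at least one character.

```c
#include <stdio.h>

/* Parse  insert("word")  and copy word (at most 30 chars) into out.
 * Returns 1 only if the whole input matched, 0 otherwise.
 * (Hypothetical helper.) */
int parse_insert(const char *input, char out[31])
{
    int end = -1;   /* set by %n only if the closing ") matched too */

    if (sscanf(input, "insert(\"%30[^\"]\")%n", out, &end) != 1)
        return 0;                           /* prefix or word did not match */

    /* require the closing ") and nothing after it */
    return end >= 0 && input[end] == '\0';
}
```

With the input insert("someWord") it returns 1 and leaves someWord in out. A missing closing parenthesis, trailing text or a word longer than 30 characters makes it return 0.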
strncmp compares at most the given number of characters and stops early when it sees the first null byte.
So (reading "true" as "strncmp returned 0", i.e. the compared characters match),
strncmp("inser","insert(",6) // false -- the first string too short
strncmp("insert","insert(",6) // true, the first 6 char match
strncmp("insert(","insert(",6) // true, first 6 char match
strncmp("insertxyz","insert(",6) // true, the first 6 chars match
strncmp("insert","inse",6) // false -- the second string too short
strncmp("append","insert",6) // false -- they just don't match
Note that "insert(" is actually 7 characters long (8 if you include the null byte), so comparing with strncmp(,,6) will not take the trailing '(' into account -- not sure if that is your problem.
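In your case, pass 7 (or strlen(insert)) as the count so that the '(' is checked as well. Plain strcmp compares the whole input, including the quoted word, so it only fits if you compare against the complete expected string. The sscanf solution suggested by a3f is another option.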
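A small sketch of a prefix check that avoids hard-coding the count is shown below. The helper name starts_with is hypothetical and not a standard function.

```c
#include <string.h>

/* Nonzero if s begins with every byte of prefix, including a
 * trailing '(' if prefix has one. (Hypothetical helper.) */
int starts_with(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

Here starts_with(string, "insert(") is true for insert("someWord") but false for insertxyz, which the 6-character comparison would have accepted.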
strchr, strncpy and simple pointer arithmetic can also do the trick, e.g.
void header (char *string)
{
    char insert[] = "insert(";
    size_t n = strlen (insert);     /* 7, so the '(' is compared as well */

    if (strncmp (string, insert, n) == 0) {
        /* p starts right after the prefix; ep will mark the ')' */
        char text[MAXC] = "", *p = string + n, *ep = NULL;
        if ((ep = strchr (p, ')')) && ep - p < MAXC)
            strncpy (text, p, ep - p);
        printf ("header text: '%s'\n", text);
    }
    else { /* invalid input */
    }
}
A short example that prints the wanted text between the parentheses (the text cannot contain an embedded close parenthesis).
#include <stdio.h>
#include <string.h>
#define MAXC 128
void header (char *string);
int main (void) {
char string[] = "insert(headerinfo)";
header (string);
return 0;
}
void header (char *string)
{
    char insert[] = "insert(";
    size_t n = strlen (insert);     /* 7, so the '(' is compared as well */

    if (strncmp (string, insert, n) == 0) {
        /* p starts right after the prefix; ep will mark the ')' */
        char text[MAXC] = "", *p = string + n, *ep = NULL;
        if ((ep = strchr (p, ')')) && ep - p < MAXC)
            strncpy (text, p, ep - p);
        printf ("header text: '%s'\n", text);
    }
    else { /* invalid input */
    }
}
note: the nul-terminating byte is provided by virtue of initialization. If text is reused with strncpy within the same scope, you should affirmatively nul-terminate text for each subsequent use.
Example Use/Output
$ ./bin/headertxt
header text: 'headerinfo'
I have a gps string like below:
char gps_string[] = "$GPRMC,080117.000,A,4055.1708,N,02918.9336,E,0.00,316.26,00,,,A*78";
I want to parse the substrings between the commas like below sequence:
$GPRMC
080117.000
A
4055.1708
.
.
.
I have tried sscanf function like below:
sscanf(gps_string,"%s,%s,%s,%s,%s,",char1,char2,char3,char4,char5);
But this is not working. The char1 array gets the whole string if I use the function above.
Actually, I used the strchr function in my previous algorithm and got it to work, but it would be easier and simpler if I could get it to work with sscanf and get those parameters as substrings.
By the way, the substrings between the commas can vary, but the comma sequence is fixed. For example, below is another GPS string that is missing some of its parts because of a satellite problem:
char gps_string[] = "$GPRMC,001041.799,V,,,,,0.00,0.00,060180,,,N*";
There have been a number of comments in other answers stating that there are a number of problems with strtok() and suggesting using strpbrk() instead. An example of how this is used can be found at Arrays and strpbrk in C
I do not have a compiler available so I could not test this. I could have typos or other misteaks in the code, but I am sure that you can figure out what is meant.
In this case you would use
char *String_Buffer = gps_string;
char *start = String_Buffer;
char *end;
char *fields[MAXFIELDS];
int i = 0;
int n = 0;
char *match = NULL;
while (i < MAXFIELDS - 1 && (end = strpbrk(start, ","))) // Get pointer to next delimiter, leaving room for the final field
{
/* found it, allocate enough space for it and NUL */
/* If there are two consecutive delimiters, only the NUL gets entered */
n = end - start;
match = malloc(n + 1);
/* copy and NUL terminate */
/* Note that if n is 0, nothing will be copied so do not need to test */
memcpy(match, start, n);
match[n] = '\0';
printf("Found field entry: %s\n", match);
/* Now save the actual match string pointer into the fields array*/
/* Since the match pointer is in fields, it does not need to be freed */
fields[i++] = match;
start = end + 1;
}
/* After the last comma, start points at the final field (possibly empty),
which is terminated by the NUL of gps_string itself */
n = strlen(start);
match = malloc(n + 1);
/* Note that if n is 0, only the terminator will be put in */
strcpy(match, start);
printf("Found field entry: %s\n", match);
fields[i++] = match;
printf("Total number of fields: %d\n", i);
You can use strtok:
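#include <stdio.h>
#include <string.h>   /* strtok */

int main(void) {
    char gps_string[] = "$GPRMC,080117.000,A,4055.1708,N,02918.9336,E,0.00,316.26,00,,,A*78";
    char* c = strtok(gps_string, ",");   /* first token; gps_string is modified in place */
    while (c != NULL) {
        printf("%s\n", c);
        c = strtok(NULL, ",");           /* NULL continues with the same string */
    }
    return 0;
}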
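EDIT: As Carey Gregory mentioned, strtok modifies the given string. This is explained in the man page I linked to, and you can find some details here too. Note also that strtok treats a run of consecutive delimiters as one, so empty fields such as the ,,, in the example are skipped rather than returned as empty strings.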
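If the input must stay unmodified, or the position of empty fields matters (as it does for the fixed comma sequence in the question), a strcspn-based walk is one possible alternative. This is a sketch only. The name next_field and its cursor interface are my own invention.

```c
#include <stddef.h>
#include <string.h>

/* Walk the comma-separated fields of a string without modifying it.
 * *cursor must point at the start of a field, or be NULL when done.
 * Stores the field length in *len, returns the start of the field and
 * advances *cursor to the next field. Empty fields are reported with
 * *len == 0 instead of being skipped. (Hypothetical helper.) */
const char *next_field(const char **cursor, size_t *len)
{
    const char *field = *cursor;
    if (field == NULL)
        return NULL;                    /* no more fields */

    *len = strcspn(field, ",");         /* length up to ',' or end of string */
    *cursor = (field[*len] == ',') ? field + *len + 1 : NULL;
    return field;
}
```

A typical loop would initialise const char *cur = gps_string and then, while next_field(&cur, &len) returns non-NULL, print each field with printf("%.*s\n", (int)len, field). The fields are not nul-terminated, which is why the length is passed along.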
I'm trying to parse the string below in a good way so I can get the sub-string stringI-wantToGet:
const char *str = "Hello \"FOO stringI-wantToGet BAR some other extra text";
str will vary in length but always follows the same pattern: the wanted text sits between FOO and BAR.
What I had in mind was something like:
const char *str = "Hello \"FOO stringI-wantToGet BAR some other extra text";
char *probe, *pointer;
probe = str;
while(probe != '\n'){
if(probe = strstr(probe, "\"FOO")!=NULL) probe++;
else probe = "";
// Nulterm part
if(pointer = strchr(probe, ' ')!=NULL) pointer = '\0';
// not sure here, I was planning to separate it with \0's
}
Any help would be appreciated.
I had some time on my hands, so there you are.
#include <string.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
int getStringBetweenDelimiters(const char* string, const char* leftDelimiter, const char* rightDelimiter, char** out)
{
    // find the left delimiter
    const char* beginning = strstr(string, leftDelimiter);
    if(beginning == NULL)
        return 1; // left delimiter not found

    // offset the beginning by the length of the left delimiter, so beginning points _after_ the left delimiter
    beginning += strlen(leftDelimiter);

    // find the right delimiter, searching only after the left one
    const char* end = strstr(beginning, rightDelimiter);
    if(end == NULL)
        return 2; // right delimiter not found

    // get the length of the substring
    ptrdiff_t segmentLength = end - beginning;

    // allocate memory and copy the substring there
    *out = malloc(segmentLength + 1);
    if(*out == NULL)
        return 3; // allocation failed
    memcpy(*out, beginning, segmentLength);
    (*out)[segmentLength] = 0;
    return 0; // success!
}

int main(void)
{
    char* output;
    if(getStringBetweenDelimiters("foo FOO bar baz quaz I want this string BAR baz", "FOO", "BAR", &output) == 0)
    {
        printf("'%s' was between 'FOO' and 'BAR'\n", output);
        // Don't forget to free() 'output'!
        free(output);
    }
    return 0;
}
In a first loop, scan until you find your first delimiter string, and set an anchor pointer there.
If it was found, then in a second loop, scan from the anchor pointer until you find your second delimiter string or reach the end of the string.
If you are not at the end of the string, copy the characters between the anchor pointer and the second pointer (plus whatever adjustments for spaces, etc. you need).
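Those three steps might be sketched like this, using strstr for the two scans and a caller-supplied buffer instead of malloc. The function name copy_between and its return convention are assumptions for illustration, and the "adjustment for spaces" is taken here to mean trimming whitespace on both sides.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the text between the first left delimiter and the next right
 * delimiter into out, with surrounding whitespace trimmed.
 * Returns 0 on success, -1 if a delimiter is missing or the result
 * does not fit in outsz bytes. (Hypothetical helper.) */
int copy_between(const char *s, const char *left, const char *right,
                 char *out, size_t outsz)
{
    const char *anchor = strstr(s, left);       /* step 1: find the opener */
    if (anchor == NULL)
        return -1;
    anchor += strlen(left);                     /* anchor just after it */

    const char *end = strstr(anchor, right);    /* step 2: find the closer */
    if (end == NULL)
        return -1;

    /* step 3: trim whitespace on both sides, then copy */
    while (anchor < end && isspace((unsigned char)*anchor))
        anchor++;
    while (end > anchor && isspace((unsigned char)end[-1]))
        end--;

    size_t len = (size_t)(end - anchor);
    if (len >= outsz)
        return -1;                              /* would not fit with the NUL */
    memcpy(out, anchor, len);
    out[len] = '\0';
    return 0;
}
```

For the string in the question, copy_between(str, "\"FOO", "BAR", buf, sizeof buf) should leave stringI-wantToGet in buf.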