Hi guys, I'm currently using the code below, and I'm pretty sure there's a better way to do it. The code checks whether the buffer contains the delimiter (~~~~), then puts everything before ~~~~ in cmd and everything after ~~~~ in param. If anyone could let me know how I should be doing this, it would be much appreciated! I'm not used to low-level languages, so strings and pointers are still confusing to me!
Thanks!
char buffer[1024], *tempCharPointer, cmd[100], param[1024];
if(strstr(buffer, "~~~~"))
{
strcpy(cmd, buffer);
tempCharPointer = strstr(buffer, "~~~~");
index = (tempCharPointer-buffer) + 4;
strcpy(param, &tempCharPointer[4]);
memmove(&cmd[index-4], "", (index-4));
}
You can simplify your code as follows:
char cmd[1024], *param = "";
// Fill in cmd from somewhere...
...
char *delim = strstr(cmd, "~~~~");
if(delim)
{
param = delim+4;
*delim = '\0';
}
You can simplify your code by overwriting the first character of the delimiter with \0. The command is then simply a pointer to the beginning of the string, and param is a pointer to the first character after the delimiter. This saves memory and avoids all the copying and moving.
char buffer[1024], *tempCharPointer;
tempCharPointer = strstr(buffer, "~~~~");
if (tempCharPointer){
*tempCharPointer = '\0';
tempCharPointer +=4;
//now buffer points to the first half, and tempCharPointer points to second half
//do with them what you will
}
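For what it's worth, the same idea can be wrapped in a small helper. This is only a minimal sketch: the name split_once is made up for illustration, and it assumes the buffer is writable and the delimiter is a non-empty string.

```c
#include <string.h>

/* Splits buf in place at the first occurrence of delim.
 * Returns a pointer to the text after the delimiter, or NULL if delim
 * is not found (buf is then left unchanged).
 * split_once is a hypothetical name, not a library function. */
char *split_once(char *buf, const char *delim)
{
    char *hit = strstr(buf, delim);
    if (hit == NULL)
        return NULL;
    *hit = '\0';                 /* terminate the first half */
    return hit + strlen(delim);  /* second half starts after the delimiter */
}
```

After a successful call, buf holds the command and the returned pointer is the parameter. Both point into the same array, so nothing is copied.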
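The strtok function in the C library (extract tokens from strings) can be useful here. Be aware that strtok treats its second argument as a set of single-character delimiters, not as one multi-character delimiter, so "~~~~" behaves exactly like "~" and a lone ~ inside the data will also split the string.
A small example follows. See man strtok for more info. Note that strtok_r (used below) is the reentrant variant.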
#include <string.h>
#include <stdio.h>

int main(void)
{
    char buffer[1024];
    char *saveptr = NULL;
    char *token;

    strcpy(buffer, "~~~~foo~~~~bar~~~~baz");
    /* The delimiter argument is a set of characters: "~~~~" acts like "~". */
    token = strtok_r(buffer, "~~~~", &saveptr);
    while (token != NULL)
    {
        printf("TOKEN: %s\n", token);
        token = strtok_r(NULL, "~~~~", &saveptr);
    }
    return 0;
}
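One thing to keep in mind with this approach: strtok and strtok_r split on any single character from the delimiter string, so a lone ~ in the input also ends a token. A quick way to see this is to count the tokens. The helper below is only an illustration (count_tokens is a made-up name), and it uses plain strtok to stay within standard C.

```c
#include <string.h>

/* Counts the tokens strtok finds in s. The string s is modified.
 * count_tokens is a hypothetical helper for illustration only. */
size_t count_tokens(char *s, const char *delims)
{
    size_t n = 0;
    for (char *tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}
```

Calling it on "a~b~~~~c" with "~~~~" gives 3 tokens (a, b and c), the same as with "~". If a single ~ can legitimately appear inside cmd or param, a strstr-based split is the safer choice.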
I'm a beginner at C and I'm stuck on a simple problem. Here it goes:
I have a string formatted like this: "first1:second1\nsecond2\nfirst3:second3" ... and so on.
As you can see from the example, the first field is optional ([firstx:]secondx).
I need to get a resulting string which contains only the second field. Like this: "second1\nsecond2\nsecond3".
I did some research here on stack (string splitting in C) and I found that there are two main functions in C for string splitting: strtok (obsolete) and strsep.
I tried to write the code using both functions (plus strdup) without success. Most of the time I get some unpredictable result.
Better ideas?
Thanks in advance
EDIT:
This was my first try
int main(int argc, char** argv){
char * stri = "ciao:come\nva\nquialla:grande\n";
char * strcopy = strdup(stri); // since strsep and strtok both modify the input string
char * token;
while((token = strsep(&strcopy, "\n"))){
if(token[0] != '\0'){ // I don't want the last match of '\n'
char * sub_copy = strdup(token);
char * sub_token = strtok(sub_copy, ":");
sub_token = strtok(NULL, ":");
if(sub_token[0] != '\0'){
printf("%s\n", sub_token);
}
}
free(sub_copy);
}
free(strcopy);
}
Expected output: "come", "va", "grande"
Here's a solution with strcspn:
#include <stdio.h>
#include <string.h>
int main(void) {
const char *str = "ciao:come\nva\nquialla:grande\n";
const char *p = str;
while (*p) {
size_t n = strcspn(p, ":\n");
if (p[n] == ':') {
p += n + 1;
n = strcspn(p , "\n");
}
if (p[n] == '\n') {
n++;
}
fwrite(p, 1, n, stdout);
p += n;
}
return 0;
}
We compute the size of the initial segment not containing : or \n. If it's followed by a :, we skip over it and get the next segment that doesn't contain \n.
If it's followed by \n, we include the newline character in the segment. Then we just need to output the current segment and update p to continue processing the rest of the string in the same way.
We stop when *p is '\0', i.e. when the end of the string is reached.
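Since the question asks for a resulting string rather than printed output, the same loop can write into a buffer instead. This is a sketch under the assumption that the caller provides a destination of at least strlen(src) + 1 bytes (the result is never longer than the input). The name second_fields is made up for illustration.

```c
#include <string.h>

/* Copies only the "second" field of each line of src into dst.
 * dst must have room for at least strlen(src) + 1 bytes.
 * second_fields is a hypothetical name for illustration. */
void second_fields(const char *src, char *dst)
{
    const char *p = src;
    while (*p) {
        size_t n = strcspn(p, ":\n");
        if (p[n] == ':') {          /* optional first field present: skip it */
            p += n + 1;
            n = strcspn(p, "\n");
        }
        if (p[n] == '\n')
            n++;                    /* keep the newline with the segment */
        memcpy(dst, p, n);
        dst += n;
        p += n;
    }
    *dst = '\0';
}
```

The input is never modified, so this also works directly on a string literal.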
I'm new to C.
I'm trying to eliminate ".,;?!" from a string using strtok and then build a simple string without any punctuation marks, but the program gives me a 'Segmentation fault' when I run it. Why, and how do I fix it?
char simple_s[100];
char delim[20];
memset(simple_s,0,100);
memset(delim,0,100);
strcpy(delim,strtok(s,",.;:!? "));
while(delim != NULL) {
strcat(simple_s,delim);
strcpy(delim,strtok(NULL,",.;:!? "));
}
printf("%s",simple_s);
There are several errors in the code. First you zero too many bytes with
char delim[20];
memset(delim,0,100);
To avoid this error, you should use
char simple_s[100];
char delim[20];
memset(simple_s,0,sizeof(simple_s));
memset(delim,0,sizeof(delim));
Next, you have used the return value of strtok() before checking if it is NULL
strcpy(delim,strtok(s,",.;:!? "));
and from there you go on to test delim for being NULL instead of checking for a NULL pointer from strtok()
while(delim != NULL) {
strcat(simple_s,delim);
strcpy(delim,strtok(NULL,",.;:!? ")); // <--- copying from NULL pointer
}
but delim is not even necessary, you need to work with the pointer returned by strtok(). Putting this back together, I would have
char simple_s[100] = ""; // initialise string
char seps[] = ",.;:!? "; // added separators, so not to duplicate
char *tok; // added to receive value from strtok()
tok = strtok(s, seps);
while(tok) { // until `NULL` returned
strcat(simple_s, tok);
tok = strtok(NULL, seps);
}
printf("%s",simple_s);
Additionally, I have skipped the string length checking. Once you have it working, check that the new length will still fit in simple_s[] before you strcat() the next substring.
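The length check could look roughly like this. It is only a sketch: strip_chars is a made-up name, and instead of strcat() it uses memcpy with a running offset, which avoids rescanning the output on every append.

```c
#include <string.h>

/* Removes the characters in seps from s (s is modified by strtok) and
 * writes the result to out, which holds out_size bytes.
 * Returns 0 on success, -1 if the result would not fit.
 * strip_chars is a hypothetical name for illustration. */
int strip_chars(char *s, const char *seps, char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (char *tok = strtok(s, seps); tok != NULL; tok = strtok(NULL, seps)) {
        size_t len = strlen(tok);
        if (len >= out_size - used)   /* need len bytes plus the terminator */
            return -1;
        memcpy(out + used, tok, len + 1);
        used += len;
    }
    return 0;
}
```

On failure the output holds whatever fitted so far, still null-terminated, so the caller can decide whether to report an error or retry with a larger buffer.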
delim must be defined as a char *. Here is your code with a couple of bugs fixed:
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char s[100];
char simple_s[100];
char *delim;
strcpy(s, "abc,def");
memset(simple_s,0,sizeof(simple_s));
delim = strtok(s,",.;:!? ");
while(delim != NULL) {
strcat(simple_s,delim);
delim = strtok(NULL,",.;:!? ");
}
printf("%s",simple_s);
return 0;
}
This outputs abcdef
Ok, so what I need to do is split a string. Here is an example string that I need to split:
test_french_fries
and let's say I want to split the string into two parts, the part before and after:
french
thus I would get a return of something like:
match[1]=[test_]
match[2]=[_fries]
I have tried messing around with strtok() but I have had no success with that. It appears that strtok() splits the string at all delims that are given - so if I were to give it a delim of french, it would separate the string for each character matched - which ISN'T what I want.
I would like to have a delim where I can provide it a str or char* such as "french" and the string be split before and after the string provided.
Are there any other methods besides strtok() that I could possibly try to implement in order to achieve my desired outcome?
I really appreciate any and all help that I receive.
-Patrick
strtok() can only split on single characters. You can specify multiple separator characters in the form of a null-terminated string, but they are dealt with individually. To split a string using another string as the separator, you will have to write a custom function. The example below works just like strtok(), but uses a null-terminated C string as the separator:
#include <stdio.h>
#include <string.h>
char *strstrtok(char *src, const char *sep)
{
static char *str = NULL;
if (src) str = src;
else src = str;
if (str == NULL)
return NULL;
char *next = strstr(str, sep);
if (next)
{
*next = '\0';
str = next + strlen(sep);
}
else
str = NULL;
return src;
}
int main(void) {
char buf[] = "yoursepmamasepissepsosepfatsepsheseptooksepasepspoonseptosepthesepsuperbowl";
char *token = strstrtok(buf, "sep");
while (token)
{
printf("%s ", token);
token = strstrtok(NULL, "sep");
}
return 0;
}
Note that, just like strtok(), this solution is not thread-safe, because it keeps its position in a static variable.
See it working on Ideone: http://ideone.com/cjRGgJ
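If the static state is a problem, one possible variation is to let the caller hold the position, in the style of strtok_r. The sketch below is my own adaptation, not part of the answer above, and split_str_r is a made-up name.

```c
#include <string.h>

/* Like the function above, but the position is kept in *save instead of
 * a static variable. Pass the string on the first call and NULL afterwards.
 * split_str_r is a hypothetical name for illustration. */
char *split_str_r(char *src, const char *sep, char **save)
{
    if (src == NULL)
        src = *save;
    if (src == NULL)
        return NULL;            /* previous call returned the last piece */
    char *next = strstr(src, sep);
    if (next != NULL) {
        *next = '\0';
        *save = next + strlen(sep);
    } else {
        *save = NULL;           /* this is the last piece */
    }
    return src;
}
```

With the string from the question, the first call on "test_french_fries" with separator "french" returns "test_", the second returns "_fries", and the third returns NULL.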
I'm sorry for the sloppy title, but I didn't know how to phrase my question correctly. I'm trying to read a .txt file in which every line has the information needed to fill a struct. First I use fgets to read the line, and then I was going to use sscanf to read the individual parts. Here is where I'm stuck: normally sscanf breaks off parts at whitespace, but I need the whitespace to be included. I know that sscanf allows ignoring whitespace, but the tricky part is that I then need some other arbitrary character to separate the parts. For example, I have to break the line
Carl Sagan~Contact~scifi~1997
up into parts for Author, Name, Genre, Year. You can see I need to keep the space in Carl Sagan, but I need the function to split the string on the tilde character. Any help is appreciated.
If your input is delimited by ~ (or any other specific character), use this:
sscanf(s, "%[^~]", name);
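%[^...] is a conversion that matches a run of characters other than the ones listed between [^ and the closing ]. The sample program below also gives each conversion a maximum width (%99[^~]) so that it cannot overflow its 100-byte buffer.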
Here is the sample program for testing it:
#include <stdio.h>
int main(void)
{
const char *s = "Carl Sagan~Contact~scifi~1997";
char name[100], contact[100], genre[100];
int yr;
sscanf(s, "%99[^~]~%99[^~]~%99[^~]~%d", name, contact, genre, &yr);
printf("%s\n%s\n%s\n%d\n", name, contact, genre, yr);
return 0;
}
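Because the lines come from a file, it is probably worth checking the return value of sscanf, which is the number of fields it managed to convert. The sketch below assumes a struct layout that is not in the question (struct book and parse_book are made-up names).

```c
#include <stdio.h>

/* Hypothetical record type; the real struct is not shown in the question. */
struct book {
    char author[100];
    char title[100];
    char genre[100];
    int year;
};

/* Parses "author~title~genre~year" into *b.
 * Returns 1 if all four fields were converted, 0 otherwise. */
int parse_book(const char *line, struct book *b)
{
    int n = sscanf(line, "%99[^~]~%99[^~]~%99[^~]~%d",
                   b->author, b->title, b->genre, &b->year);
    return n == 4;
}
```

Note that %[^~] needs at least one character to match, so a line with an empty field (for example "Carl Sagan~~scifi~1997") is rejected by this check rather than parsed with a blank title.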
You need strtok. Use ~ as your delimiter.
See the documentation: http://linux.die.net/man/3/strtok
strtok has some drawbacks but it sounds like it will work for you.
EDIT:
After reading this, it sounds like you can use sscanf cleverly to achieve the same result, and it may actually be safer after all.
#include <stddef.h>
#include <string.h>
#include <stdio.h>
char* mystrsep(char** input, const char* delim)
{
char* result = *input;
char* p;
p = (result != NULL) ? strpbrk(result, delim) : NULL;
if (p == NULL)
*input = NULL;
else
{
*p = '\0';
*input = p + 1;
}
return result;
}
int main()
{
char str[] = "Carl Sagan~Contact~scifi~1997";
const char delimiters[] = "~";
char* ptr;
char* token;
ptr = str;
token = mystrsep(&ptr, delimiters);
while(token)
{
printf("%s\n",token);
token = mystrsep(&ptr, delimiters);
}
return 0;
}
Output:
Carl Sagan
Contact
scifi
1997
If I have a string like:
const char* mystr = "Test Test Bla Bla \n Bla Bla Test \n Test Test \n";
How would I use the newline character '\n', to split the string into an array of strings?
I'm trying to accomplish in C what string.Split() does in C# or what Boost's string algorithm split does in C++.
Try to use the strtok function. Be aware that it modifies the source memory so you can't use it with a string literal.
char *copy = strdup(mystr);
char *tok;

tok = strtok(copy, "\n");
while (tok) {
    /* Do something with tok here, before asking for the next token,
       so that a NULL result is never processed. */
    tok = strtok(NULL, "\n");
}
free(copy);
The simplest way to split a string in C is to use strtok(). However, that comes with an arm's-length list of caveats on its usage:
It's destructive (it overwrites the delimiters in the input string with '\0'), so you couldn't use it on the string literal you have above.
It's not reentrant (it keeps its state between calls, so you can only tokenize one string at a time, let alone use it from several threads). Some systems provide a reentrant version, e.g. strtok_r().
Your example might be split up like:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (void) {
char mystr[] = "Test Test Bla Bla \n Bla Bla Test \n Test Test \n";
char *word = strtok(mystr, " \n");
while (word) {
printf("word: %s\n", word);
word = strtok(NULL, " \n");
}
return 0;
}
Note the important change of your string declaration -- it's now an array and can be modified. It's possible to tokenize a string without destroying it, of course, but C does not provide a simple solution for doing so as part of the standard library.
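As an illustration of the non-destructive route, one option is to walk the string with strcspn and hand back a pointer plus a length for each line. This is only a sketch, and next_line is a made-up name.

```c
#include <string.h>

/* Finds the next line in *cursor without modifying the string.
 * On success stores the line's length in *len, advances *cursor past the
 * line and its '\n', and returns a pointer to the start of the line.
 * Returns NULL when nothing is left.
 * next_line is a hypothetical name for illustration. */
const char *next_line(const char **cursor, size_t *len)
{
    const char *start = *cursor;
    if (*start == '\0')
        return NULL;
    *len = strcspn(start, "\n");
    *cursor = start + *len;
    if (**cursor == '\n')
        (*cursor)++;
    return start;
}
```

The lines are not null-terminated, so print them with a precision, e.g. printf("%.*s\n", (int)len, line). The upside is that this works on the original const char * literal without any copy.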
Remember that C makes you do all the memory allocation by hand. Remember also that C doesn't really have strings, only arrays of characters. Also, string literals are immutable, so you're going to need to copy it. It will be easier to copy the whole thing first.
So, something like this (wholly untested):
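/* xstrdup and xmalloc are assumed to be the usual wrappers around
   strdup and malloc that abort on allocation failure. */
char *copy = xstrdup(mystr);
char *p;
char **arry;
size_t count = 0;
size_t i;

/* Count the newlines to know how many pieces there will be. */
for (p = copy; *p; p++)
    if (*p == '\n')
        count++;

arry = xmalloc((count + 1) * sizeof(char *));
i = 0;
p = copy;
arry[i] = p;
while (*p)
{
    if (*p == '\n')
    {
        *p = '\0';
        arry[++i] = p + 1; /* pre-increment, so arry[0] is not overwritten */
    }
    p++;
}
return arry; /* deallocating arry and arry[0] is
                the responsibility of the caller */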
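For reference, here is a self-contained variant of the same approach that uses only standard malloc and free and terminates the array with NULL, so the caller can tell where it ends. It is a sketch rather than a drop-in replacement, and split_lines is a made-up name.

```c
#include <stdlib.h>
#include <string.h>

/* Splits text at each '\n'. Returns a NULL-terminated array of pointers
 * into a private copy of text, or NULL on allocation failure.
 * Free with: free(lines[0]); free(lines);
 * split_lines is a hypothetical name for illustration. */
char **split_lines(const char *text)
{
    size_t len = strlen(text);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, text, len + 1);

    size_t count = 1;                       /* one piece even with no '\n' */
    for (const char *p = copy; *p; p++)
        if (*p == '\n')
            count++;

    char **lines = malloc((count + 1) * sizeof *lines);  /* +1 for NULL */
    if (lines == NULL) {
        free(copy);
        return NULL;
    }

    size_t i = 0;
    lines[i++] = copy;
    for (char *p = copy; *p; p++) {
        if (*p == '\n') {
            *p = '\0';                      /* end the current piece */
            lines[i++] = p + 1;             /* next piece starts after it */
        }
    }
    lines[i] = NULL;
    return lines;
}
```

As with the code above, input that ends in '\n' produces a final empty string, which the caller may want to skip.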
In the answers above, I see only while(){} loops, where IMHO for(){} loops are more compact.
cnicutar:
for (tok = strtok(copy, "\n"); tok; tok = strtok(NULL, "\n")) {
    /* ... */
}
FatalError:
char *word;
for (word = strtok(mystr, " \n"); word; word = strtok(NULL, " \n")) {
    printf("word: %s\n", word);
}
Zack:
for (arry[i=0]=p=copy; *p; p++)
{
    if (*p == '\n')
    {
        *p = '\0';
        arry[++i] = p+1; /* pre-increment, so arry[0] is not overwritten */
    }
}
[the clarity of this last example is disputable]
If you can use C++, the Boost Tokenizer library linked below is an option. Boost has many other useful functions as well.
http://www.boost.org/doc/libs/1_48_0/libs/tokenizer/index.html
Otherwise, in plain C, you can use the strtok function.