C strtok() split string into tokens but keep old data unaltered

I have the following code:
#include <stdio.h>
#include <string.h>
int main (void) {
char str[] = "John|Doe|Melbourne|6270|AU";
char fname[32], lname[32], city[32], zip[32], country[32];
char *oldstr = str;
strcpy(fname, strtok(str, "|"));
strcpy(lname, strtok(NULL, "|"));
strcpy(city, strtok(NULL, "|"));
strcpy(zip, strtok(NULL, "|"));
strcpy(country, strtok(NULL, "|"));
printf("Firstname: %s\n", fname);
printf("Lastname: %s\n", lname);
printf("City: %s\n", city);
printf("Zip: %s\n", zip);
printf("Country: %s\n", country);
printf("STR: %s\n", str);
printf("OLDSTR: %s\n", oldstr);
return 0;
}
Execution output:
$ ./str
Firstname: John
Lastname: Doe
City: Melbourne
Zip: 6270
Country: AU
STR: John
OLDSTR: John
Why can't I keep the old data in either str or oldstr? What am I doing wrong, and how can I keep the data unaltered?

Each time you call strtok(str, "|") or strtok(NULL, "|"), strtok() finds the next delimiter and overwrites it with a null character (\0), so it modifies the string.
Your str, with each | replaced by \0, becomes:
John\0Doe\0Melbourne\06270\0AU
Str array in memory
+------------------------------------------------------------------------------------------------+
|'J'|'o'|'h'|'n'|0|'D'|'o'|'e'|0|'M'|'e'|'l'|'b'|'o'|'u'|'r'|'n'|'e'|0|'6'|'2'|'7'|'0'|0|'A'|'U'|0|
+------------------------------------------------------------------------------------------------+
^ replace | with \0 (ASCII value is 0)
The diagram matters because the character '0' and the value 0 are different: the digits of 6270 are characters (written in quotes in the figure), while each \0 is shown as the number 0.
When you print str using %s, it prints characters up to the first \0, which gives John.
To keep your original str unchanged, first copy str into a temporary variable and then pass that copy to strtok():
char str[] = "John|Doe|Melbourne|6270|AU";
char* tempstr = calloc(strlen(str)+1, sizeof(char));
strcpy(tempstr, str);
Now use this tempstr string in place of str in your code.
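The copy-then-tokenize idea above can be wrapped in a small function. This is only a sketch: split_preserving and its parameters are hypothetical names, not part of the original code, and it assumes the caller frees the working copy once the tokens are no longer needed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: tokenizes a private copy of `input`, so `input`
   itself is never modified. Up to `max` token pointers are stored in
   `fields`; they point into *work, which the caller must free().
   Returns the number of tokens stored, or -1 if malloc fails. */
int split_preserving(const char *input, const char *delims,
                     char *fields[], int max, char **work)
{
    size_t n = strlen(input) + 1;   /* +1 for the terminator */
    int count = 0;

    *work = malloc(n);
    if (*work == NULL)
        return -1;
    memcpy(*work, input, n);

    /* strtok writes '\0' into *work only, never into input */
    for (char *tok = strtok(*work, delims);
         tok != NULL && count < max;
         tok = strtok(NULL, delims))
        fields[count++] = tok;

    return count;
}
```

After the call, the original string still prints in full, and the tokens stay valid until the working copy is freed.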

Because oldstr is just a pointer, an assignment will not make a new copy of your string.
Copy it before passing str to the strtok:
char *oldstr=malloc(sizeof(str));
strcpy(oldstr,str);
Your corrected version:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main (void) {
char str[] = "John|Doe|Melbourne|6270|AU";
char fname[32], lname[32], city[32], zip[32], country[32];
char *oldstr = malloc(sizeof(str));
strcpy(oldstr,str);
...................
free(oldstr);
return 0;
}
EDIT:
As @CodeClown mentioned, in your case it's better to use strncpy. And instead of fixing the sizes of fname etc. beforehand, you can use pointers in their place and allocate exactly as much memory as is required. That way you avoid writing past the end of a buffer.
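A bounded copy along those lines might look like the sketch below. copy_field is a hypothetical name; the sketch assumes that a missing token (strtok returning NULL) or a token that does not fit should be reported as an error rather than copied blindly.

```c
#include <string.h>

/* Hypothetical helper: copy one token into a fixed-size field.
   Returns 0 on success, -1 if the token is NULL or had to be truncated.
   dst is always left null-terminated when dstsize > 0. */
int copy_field(char *dst, size_t dstsize, const char *token)
{
    if (dstsize == 0)
        return -1;
    if (token == NULL) {            /* strtok found no more tokens */
        dst[0] = '\0';
        return -1;
    }
    strncpy(dst, token, dstsize - 1);
    dst[dstsize - 1] = '\0';        /* strncpy does not terminate on truncation */
    return strlen(token) < dstsize ? 0 : -1;
}
```

It would be called as copy_field(fname, sizeof fname, strtok(str, "|")), which also avoids passing a NULL pointer to strcpy when the input has fewer fields than expected.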
Another Idea:
would be to assign the result of strtok to pointers (char *fname, *lname, etc.) instead of copying it into arrays. Judging by the accepted answer, strtok seems designed to be used that way.
Caution: With this approach, any further change to str is also reflected in fname, lname, etc., because they point into str's data and not to new memory blocks. So use oldstr for other manipulations.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main (void) {
char str[] = "John|Doe|Melbourne|6270|AU";
char *fname, *lname, *city, *zip, *country;
char *oldstr = malloc(sizeof(str));
strcpy(oldstr,str);
fname=strtok(str,"|");
lname=strtok(NULL,"|");
city=strtok(NULL, "|");
zip=strtok(NULL, "|");
country=strtok(NULL, "|");
printf("Firstname: %s\n", fname);
printf("Lastname: %s\n", lname);
printf("City: %s\n", city);
printf("Zip: %s\n", zip);
printf("Country: %s\n", country);
printf("STR: %s\n", str);
printf("OLDSTR: %s\n", oldstr);
free(oldstr);
return 0;
}

strtok requires a writable input string, and it modifies that string. If you want to keep the input string, you have to make a copy of it first.
For example:
char str[] = "John|Doe|Melbourne|6270|AU";
char oldstr[32];
strcpy(oldstr, str); // Use strncpy if you don't know
// the size of str

You just copy the pointer to the string, but not the string itself. Use strncpy() to create a copy.
char *oldstr = str; // just copy of the address not the string itself!

The for() loop below shows how code calls strtok() at only one location.
int separate( char *flds[], int size, char *fullStr ) {
int count = 0;
for( char *cp = fullStr; ( cp = strtok( cp, " " ) ) != NULL; cp = NULL ) {
flds[ count ] = strdup( cp ); // must be free'd later!
if( ++count == size )
break;
}
return( count );
}

Related

strcpy() Delete everything else

I need to read a file and look for a word and replace with a new word but it's not working as it should:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
FILE * f = fopen("text" , "rb" );
if(f == NULL ){
perror("error ");
return 1;
}
char chain[100];
while(!feof(f)){
fgets(chain, 100 , f);
}
printf("%s", chain);
fclose(f);
printf("\n \n ;D \n");
return 0;
}
And this is how I replace the old word:
char str[] ="This is a hiall samplesss friends string";
char * pch;
pch = strstr (str,"hiall");
strncpy (pch,"sam",5);
puts (str);
thanks
That strncpy call is told to write 5 characters from "sam" into pch. However, "sam" has only 3 characters, so strncpy copies those three and then pads the remaining two positions with \0. That's why it is "deleting" the rest of the string: a terminator is written right after "sam", so you get This is a sam in the output.
If you strncpy only 3 characters, you'll get this:
This is a samll samplesss friends string
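A minimal sketch of that plain overwrite, assuming you only want to replace characters in place without shifting anything. overwrite_prefix is a hypothetical name, and memcpy is used here instead of strncpy so that no terminator or padding is ever written:

```c
#include <string.h>

/* Hypothetical helper: overwrite the start of dst with the characters of
   src, without copying src's terminator. dst must have at least
   strlen(src) characters before its own terminator. */
void overwrite_prefix(char *dst, const char *src)
{
    memcpy(dst, src, strlen(src));
}
```

Calling overwrite_prefix(pch, "sam") on the string from the question leaves the trailing "ll" of "hiall" in place, which is the samll result shown above.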
If what you want is something like find-and-replace (i.e. replacing "hiall" with "sam" and getting This is a sam samplesss friends string), you need to move the rest of the string backwards. A simple strncpy won't work because the source and destination memory overlap. For this, you can either use memmove, or use an auxiliary buffer:
char str[] = "This is a hiall samplesss friends string";
char* pch;
char* old_word = "hiall";
char* new_word = "sam";
size_t len_old = strlen(old_word);
size_t len_new = strlen(new_word);
pch = strstr(str, old_word);
assert(len_new <= len_old);
if (pch) {
char* rest = pch + len_old;
size_t len_rest = strlen(rest);
char* aux = malloc(len_rest + 1);
strncpy(aux, rest, len_rest + 1);
strncpy(pch, new_word, len_new);
strncpy(pch + len_new, aux, len_rest + 1);
free(aux);
puts(str);
}
Note that this will only work if new_word is the same size or shorter than old_word. If new_word is longer, you won't be able to edit in-place (in the string str itself), unless the original string has extra memory for it (e.g. if you declared it with str[1000] to guarantee you can increase its size enough — then the code above would work). The safest approach if you can't plan ahead what size new_word is going to be would be to allocate a new string:
char str[] ="This is a hiall samplesss friends string";
char* pch;
char* old_word = "hiall";
char* new_word = "sam";
pch = strstr(str, old_word);
size_t len_str = strlen(str);
size_t len_new = strlen(new_word);
size_t len_old = strlen(old_word);
if (pch) {
char* new_str = malloc(len_str - len_old + len_new + 1);
ptrdiff_t pos_word = pch - str;
strncpy(new_str, str, pos_word);
strncpy(new_str + pos_word, new_word, len_new);
strncpy(new_str + pos_word + len_new, pch + len_old,
len_str - pos_word - len_old + 1);
puts(new_str);
}
(Edit: addressed issues pointed out by David Bowling in the comments.)
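The memmove alternative mentioned earlier in this answer avoids the auxiliary buffer. The following is a sketch under the same restriction as the in-place version (the new word must not be longer than the old one); replace_in_place is a hypothetical name:

```c
#include <string.h>

/* Hypothetical helper: replace the first occurrence of old_word with
   new_word inside str, in place. Only works when new_word is not longer
   than old_word. Returns 1 if a replacement was made, 0 otherwise. */
int replace_in_place(char *str, const char *old_word, const char *new_word)
{
    size_t len_old = strlen(old_word);
    size_t len_new = strlen(new_word);
    char *pch;

    if (len_old == 0 || len_new > len_old)
        return 0;
    pch = strstr(str, old_word);
    if (pch == NULL)
        return 0;

    memcpy(pch, new_word, len_new);
    /* memmove is safe for overlapping regions; +1 also moves the '\0' */
    memmove(pch + len_new, pch + len_old, strlen(pch + len_old) + 1);
    return 1;
}
```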

C Delete last character in string

I want to delete the last character of each word in a string.
First, I use the strtok function.
My input is: "Hello World Yaho"
I use " " as my delimiter.
My expectation is this:
Hell
Worl
Yah
But the actual output is this
Hello
Worl
Yaho
How can I solve this problem? I can't understand this output
this is my code
int main(int argc, char*argv[])
{
char *string;
char *ptr;
string = (char*)malloc(100);
puts("Input a String");
fgets(string,100,stdin);
printf("Before calling a function: %s]n", string);
ptr = strtok(string," ");
printf("%s\n", ptr);
while(ptr=strtok(NULL, " "))
{
ptr[strlen(ptr)-1]=0;
printf("%s\n", ptr);
}
return 0;
}
This program deletes the last character of every word.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main(int argc, char*argv[]){
char *string;
char *ptr;
string = (char*)malloc(100);
puts("Input a String");
fgets(string,100,stdin);
printf("Before calling a function: %s\n", string);
string[strlen(string)-1]=0;
ptr = strtok(string," ");
printf("%s\n", ptr);
while(ptr){
ptr[strlen(ptr)-1]=0;
printf("%s\n", ptr);
ptr = strtok(0, " ");
}
return 0;
}
You must remember to:
- trim the trailing newline from the string
- use strtok properly
Test
Input a String
Hello World Yaho
Before calling a function: Hello World Yaho
Hello
Hell
Worl
Yah
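The newline trimming step can be made a little more defensive. The sketch below is one common idiom, not the code used in the answer above; trim_newline is a hypothetical name. Unlike string[strlen(string)-1]=0, it does nothing harmful when the input is empty or has no trailing newline.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\n', if there is one.
   strcspn returns the index of the first '\n', or strlen(s) when none is
   found, in which case the assignment just rewrites the terminator. */
void trim_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}
```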
Your problem is best solved by splitting it in 2 phases: parsing the phrase into words on one hand, with strtok if you wish, and printing the words with their last character omitted in a separate function:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Print a word without its last character; the word itself is not modified. */
static void print_truncated_word(const char *ptr) {
int len = (int)strlen(ptr);
if (len > 0) len -= 1;
printf("%.*s\n", len, ptr);
}
int main(int argc, char *argv[]) {
char buf[128];
char *ptr;
puts("Input a string: ");
if (fgets(buf, sizeof buf, stdin) == NULL) {
/* premature end of file */
exit(1);
}
printf("Before calling a function: %s\n", buf);
/* '\n' is a delimiter too, so the newline stored by fgets never ends up in a token */
ptr = strtok(buf, " \n");
while (ptr) {
print_truncated_word(ptr);
ptr = strtok(NULL, " \n");
}
return 0;
}
Note that the print_truncated_word function does not modify the buffer. Side effects on input arguments should be avoided, unless they are the explicit goal of the function. strtok is ill-behaved in this regard, among other shortcomings such as its hidden state, which prevents nested use.
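If the buffer should not be modified at all, the words can be located with strspn and strcspn instead of strtok. This is a sketch of that alternative, not part of the answer's code; next_word and its cursor parameter are hypothetical names. It keeps no hidden state, so it can be nested freely.

```c
#include <string.h>

/* Hypothetical helper: find the next word starting at *cursor without
   writing to the buffer. Returns a pointer to the word and stores its
   length in *len, or returns NULL when no words are left. The word is
   NOT null-terminated; use its length when printing or copying it. */
const char *next_word(const char **cursor, const char *delims, size_t *len)
{
    /* skip any leading delimiters */
    const char *start = *cursor + strspn(*cursor, delims);

    if (*start == '\0')
        return NULL;
    *len = strcspn(start, delims);  /* word runs up to the next delimiter */
    *cursor = start + *len;
    return start;
}
```

Each word can then be printed without its last character using printf("%.*s\n", (int)(len - 1), word), since every returned word has a length of at least 1.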
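Since you used a space as the delimiter, strtok creates a separate token for each space-separated word in your string. Note that ptr[strlen(ptr)-1] is the last character before the terminating '\0', not the terminator itself. In your code the first word is printed before the loop and is never truncated, and the last token still ends with the '\n' stored by fgets, so it is that newline, not the last letter, that gets removed from it.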
check this out
http://www.cprogramming.com/tutorial/c/lesson9.html
it turns out that C-style strings are always terminated with a null character, literally a '\0' character (with the value of 0),

Do I need to free the strtok resulting string?

Or rather, how does strtok produce the string to which its return value points? Does it allocate memory dynamically? I am asking because I am not sure if I need to free the token in the following code:
The STANDARD_INPUT variables are for the exit procedure in case I run out of memory for allocation, and str is the string being tested.
int ValidTotal(STANDARD_INPUT, char *str)
{
char *cutout = NULL, *temp, delim = '#';
int i = 0; //Checks the number of ladders in a string, 3 is the required number
temp = (char*)calloc(strlen(str),sizeof(char));
if(NULL == temp)
Pexit(STANDARD_C); //Exit function, frees the memory given in STANDARD_INPUT(STANDARD_C is defined as the names given in STANDARD_INPUT)
strcpy(temp,str);//Do not want to touch the actual string, so copying it
cutout = strtok(temp,&delim);//Here is the lynchpin -
while(NULL != cutout)
{
if(cutout[strlen(cutout) - 1] == '_')
cutout[strlen(cutout) - 1] = '\0'; //cutout the _ at the end of a token
if(Valid(cutout,i++) == INVALID) //Checks validity for substring, INVALID is -1
return INVALID;
cutout = strtok(NULL,&delim);
strcpy(cutout,cutout + 1); //cutout the _ at the beginning of a token
}
free(temp);
return VALID; // VALID is 1
}
strtok manipulates the string you pass in and returns a pointer to it,
so no memory is allocated.
Please consider using strsep or at least strtok_r to save you some headaches later.
The first parameter to the strtok(...) function is YOUR string:
str
C string to truncate. Notice that this string is modified by
being broken into smaller strings (tokens). Alternatively, a null
pointer may be specified, in which case the function continues
scanning where a previous successful call to the function ended.
It puts '\0' characters into YOUR string and returns them as terminated strings. Yes, it mangles your original string. If you need it later, make a copy.
Further, it should not be a constant string (e.g. char* myStr = "constant string";). See here.
It could be allocated locally or by malloc/calloc.
If you allocated it locally on the stack (e.g. char myStr[100];), you don't have to free it.
If you allocated it by malloc (e.g. char* myStr = malloc(100*sizeof(char));), you need to free it.
Some example code:
#include <string.h>
#include <stdio.h>
int main()
{
char str[80] = "This is an example string."; /* not const: strtok writes into it */
const char s[2] = " ";
char *token;
/* get the first token */
token = strtok(str, s);
/* walk through other tokens */
while( token != NULL )
{
printf( " %s\n", token );
token = strtok(NULL, s);
}
return(0);
}
NOTE: This example shows how you iterate through the string...since your original string was mangled, strtok(...) remembers where you were last time and keeps working through the string.
According to the docs:
Return Value
A pointer to the last token found in string.
Since the return pointer just points to one of the bytes in your input string where the token starts, whether you need to free depends on whether you allocated the input string or not.
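To make that concrete, here is a small sketch for the heap-allocated case. second_token_offset is a hypothetical demo function: it tokenizes a malloc'ed copy and reports where the second token sits inside that buffer. Only the buffer is freed; the tokens are just pointers into it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: returns the offset of the second token inside the
   heap buffer, or -1 if there is no second token or malloc fails. */
long second_token_offset(const char *input, const char *delims)
{
    size_t n = strlen(input) + 1;
    char *buf = malloc(n);
    long offset = -1;

    if (buf == NULL)
        return -1;
    memcpy(buf, input, n);

    if (strtok(buf, delims) != NULL) {
        char *second = strtok(NULL, delims);
        if (second != NULL)
            offset = (long)(second - buf);  /* the token lives inside buf */
    }

    free(buf);  /* one malloc, one free; the tokens are never freed */
    return offset;
}
```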
As others mentioned, strtok uses its first parameter, your input string, as the memory buffer. It doesn't allocate anything. It's stateful and non-thread safe; if strtok's first argument is null, it reuses the previously-provided buffer. During a call, strtok destroys the string, adding nulls into it and returning pointers to the tokens.
Here's an example:
#include <stdio.h>
#include <string.h>
int main() {
char s[] = "foo;bar;baz";
char *foo = strtok(s, ";");
char *bar = strtok(NULL, ";");
char *baz = strtok(NULL, ";");
printf("%s %s %s\n", foo, bar, baz); // => foo bar baz
printf("original: %s\n", s); // => original: foo
printf("%ssame memory loc\n", s == foo ? "" : "not "); // => same memory loc
return 0;
}
s started out as foo;bar;baz\0. Three calls to strtok turned it into foo\0bar\0baz\0. s is basically the same as the first chunk, foo.
Valgrind:
==89== HEAP SUMMARY:
==89== in use at exit: 0 bytes in 0 blocks
==89== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==89==
==89== All heap blocks were freed -- no leaks are possible
While the code below doesn't fix all of the problems with strtok, it might help get you moving in a pinch, preserving the original string with strdup:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
const char s[] = "foo;bar_baz";
const char delims[] = ";_";
char *cpy = strdup(s);
char *foo = strtok(cpy, delims);
char *bar = strtok(NULL, delims);
char *baz = strtok(NULL, delims);
printf("%s %s %s\n", foo, bar, baz); // => foo bar baz
printf("original: %s\n", s); // => original: foo;bar_baz
printf("%ssame memory loc\n", s == foo ? "" : "not "); // => not same memory loc
free(cpy);
return 0;
}
Or a more full-fledged example (still not thread safe):
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void must(
bool predicate,
const char *msg,
const char *file,
const unsigned int line
) {
if (!predicate) {
fprintf(stderr, "%s:%d: %s\n", file, line, msg);
exit(1);
}
}
size_t split(
char ***tokens,
const size_t len,
const char *s,
const char *delims
) {
char temp[len+1];
temp[0] = '\0';
strcpy(temp, s);
*tokens = malloc(sizeof(**tokens) * 1);
must(*tokens, "malloc failed", __FILE__, __LINE__);
size_t chunks = 0;
for (;;) {
char *p = strtok(chunks == 0 ? temp : NULL, delims);
if (!p) {
break;
}
size_t sz = sizeof(**tokens) * (chunks + 1);
*tokens = realloc(*tokens, sz);
must(*tokens, "realloc failed", __FILE__, __LINE__);
(*tokens)[chunks++] = strdup(p);
}
return chunks;
}
int main() {
const char s[] = "foo;;bar_baz";
char **tokens;
size_t len = split(&tokens, strlen(s), s, ";_");
for (size_t i = 0; i < len; i++) {
printf("%s ", tokens[i]);
free(tokens[i]);
}
puts("");
free(tokens);
return 0;
}

C string append

I want to append two strings. I used the following command:
new_str = strcat(str1, str2);
This command changes the value of str1. I want new_str to be the concatanation of str1 and str2 and at the same time str1 is not to be changed.
You need to allocate new space as well. Consider this code fragment:
char * new_str ;
if((new_str = malloc(strlen(str1)+strlen(str2)+1)) != NULL){
new_str[0] = '\0'; // ensures the memory is an empty string
strcat(new_str,str1);
strcat(new_str,str2);
} else {
fprintf(stderr,"malloc failed!\n");
// exit?
}
You might want to consider strnlen(3) which is slightly safer.
Updated, see above. In some versions of the C runtime, the memory returned by malloc isn't initialized to 0. Setting the first byte of new_str to zero ensures that it looks like an empty string to strcat.
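The same approach can be packaged as a function that leaves both inputs untouched. This is a sketch, with concat_new as a hypothetical name; it uses memcpy with the lengths it has already computed, so no initialization of the new buffer is needed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding str1
   followed by str2, or NULL if malloc fails. Neither input is modified.
   The caller must free() the result. */
char *concat_new(const char *str1, const char *str2)
{
    size_t len1 = strlen(str1);
    size_t len2 = strlen(str2);
    char *result = malloc(len1 + len2 + 1);  /* +1 for the terminator */

    if (result == NULL)
        return NULL;
    memcpy(result, str1, len1);
    memcpy(result + len1, str2, len2 + 1);   /* also copies str2's '\0' */
    return result;
}
```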
do the following:
strcat(new_str,str1);
strcat(new_str,str2);
Consider using the great but unknown open_memstream() function.
FILE *open_memstream(char **ptr, size_t *sizeloc);
Example of usage :
// open the stream
FILE *stream;
char *buf;
size_t len;
stream = open_memstream(&buf, &len);
// write what you want with fprintf() into the stream
fprintf(stream, "Hello");
fprintf(stream, " ");
fprintf(stream, "%s\n", "world");
// close the stream, the buffer is allocated and the size is set !
fclose(stream);
printf ("the result is '%s' (%zu characters)\n", buf, len);
free(buf);
If you don't know in advance the length of what you want to append, this is convenient and safer than managing buffers yourself.
You'll have to strncpy str1 into new_string first then.
You could use asprintf to concatenate both into a new string:
char *new_str;
asprintf(&new_str,"%s%s",str1,str2);
I wrote a function that supports appending formatted strings with a variable number of arguments, similar to string appending in PHP (str + str + ... etc.).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
int str_append(char **json, const char *format, ...)
{
char *str = NULL;
char *old_json = NULL, *new_json = NULL;
va_list arg_ptr;
va_start(arg_ptr, format);
vasprintf(&str, format, arg_ptr);
va_end(arg_ptr); // every va_start needs a matching va_end
// save old json
asprintf(&old_json, "%s", (*json == NULL ? "" : *json));
// calloc new json memory
new_json = (char *)calloc(strlen(old_json) + strlen(str) + 1, sizeof(char));
strcat(new_json, old_json);
strcat(new_json, str);
if (*json) free(*json);
*json = new_json;
free(old_json);
free(str);
return 0;
}
int main(int argc, char *argv[])
{
char *json = NULL;
str_append(&json, "name: %d, %d, %d", 1, 2, 3);
str_append(&json, "sex: %s", "male");
str_append(&json, "end");
str_append(&json, "");
str_append(&json, "{\"ret\":true}");
int i;
for (i = 0; i < 10; i++) {
str_append(&json, "id-%d", i);
}
printf("%s\n", json);
if (json) free(json);
return 0;
}
I needed to append substrings to create an ssh command, I solved with sprintf (Visual Studio 2013)
char gStrSshCommand[SSH_COMMAND_MAX_LEN]; // declare ssh command string
strcpy(gStrSshCommand, ""); // empty string
void appendSshCommand(const char *substring) // append substring
{
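// write at the current end of the string; using gStrSshCommand as both
// the destination and a source argument of sprintf is undefined behavior
sprintf(gStrSshCommand + strlen(gStrSshCommand), " %s", substring);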
}
strcpy(str1+strlen(str1), str2);
The man page of strcat says that the second argument is appended to the first, and that a pointer to the first is returned. If you don't want to disturb str1 or str2, you have to write your own function.
char * my_strcat(const char * str1, const char * str2)
{
char * ret = malloc(strlen(str1)+strlen(str2)+1); /* +1 for the terminating '\0' */
if(ret!=NULL)
{
sprintf(ret, "%s%s", str1, str2);
return ret;
}
return NULL;
}
Hope this solves your purpose
You can try something like this:
strncpy(new_str, str1, strlen(str1) + 1); /* +1 copies the terminator that strcat needs */
strcat(new_str, str2);
More info on strncpy: http://www.cplusplus.com/reference/clibrary/cstring/strncpy/

How to split a string to 2 strings in C

I was wondering how you could take 1 string, split it into 2 with a delimiter, such as space, and assign the 2 parts to 2 separate strings. I've tried using strtok() but to no avail.
#include <string.h>
char *token;
char line[] = "SEVERAL WORDS";
char *search = " ";
// Token will point to "SEVERAL".
token = strtok(line, search);
// Token will point to "WORDS".
token = strtok(NULL, search);
Update
Note that on some operating systems, strtok man page mentions:
This interface is obsoleted by strsep(3).
An example with strsep is shown below:
char* token;
char* string;
char* tofree;
string = strdup("abc,def,ghi");
if (string != NULL) {
tofree = string;
while ((token = strsep(&string, ",")) != NULL)
{
printf("%s\n", token);
}
free(tofree);
}
For purposes such as this, I tend to use strtok_r() instead of strtok().
For example ...
int main (void) {
char str[128];
char *ptr;
strcpy (str, "123456 789asdf");
strtok_r (str, " ", &ptr);
printf ("'%s' '%s'\n", str, ptr);
return 0;
}
This will output ...
'123456' '789asdf'
If more delimiters are needed, then loop.
Hope this helps.
char *line = strdup("user name"); // don't do char *line = "user name"; see Note
char *first_part = strtok(line, " "); //first_part points to "user"
char *sec_part = strtok(NULL, " "); //sec_part points to "name"
Note: strtok modifies the string, so don't hand it a pointer to string literal.
You can use strtok() for that
Example: it works for me
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
If you have a char array allocated you can simply put a '\0' wherever you want.
Then point a new char * pointer to the location just after the newly inserted '\0'.
This will destroy your original string though depending on where you put the '\0'
If you're open to changing the original string, you can simply replace the delimiter with \0. The original pointer will point to the first string and the pointer to the character after the delimiter will point to the second string. The good thing is you can use both pointers at the same time without allocating any new string buffers.
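A minimal sketch of that in-place split, assuming the string is writable (an array or heap memory, not a string literal). split_at is a hypothetical name:

```c
#include <string.h>

/* Hypothetical helper: split s in place at the first occurrence of delim.
   The delimiter is overwritten with '\0', so s becomes the first part.
   Returns a pointer to the second part, or NULL if delim is not found.
   No memory is allocated; both parts share the original buffer. */
char *split_at(char *s, char delim)
{
    char *p;

    if (delim == '\0')      /* the terminator is not a usable delimiter */
        return NULL;
    p = strchr(s, delim);
    if (p == NULL)
        return NULL;
    *p = '\0';
    return p + 1;
}
```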
You can do:
char str[] ="Stackoverflow Serverfault";
char piece1[20] = ""
,piece2[20] = "";
char * p;
p = strtok (str," "); // call the strtok with str as 1st arg for the 1st time.
if (p != NULL) // check if we got a token.
{
strcpy(piece1,p); // save the token.
p = strtok (NULL, " "); // subsequent call should have NULL as 1st arg.
if (p != NULL) // check if we got a token.
strcpy(piece2,p); // save the token.
}
printf("%s :: %s\n",piece1,piece2); // prints Stackoverflow :: Serverfault
If you expect more than one token, it's better to make the second and subsequent calls to strtok in a while loop that runs until strtok returns NULL.
This is how you implement a strtok() like function (taken from a BSD licensed string processing library for C, called zString).
The function below differs from the standard strtok() in how it handles consecutive delimiters: it reports each extra delimiter as a token of its own, whereas the standard strtok() silently skips them.
char *zstring_strtok(char *str, const char *delim) {
static char *static_str=0; /* var to store last address */
int index=0, strlength=0; /* integers for indexes */
int found = 0; /* check if delim is found */
/* delimiter cannot be NULL
* if no more char left, return NULL as well
*/
if (delim==0 || (str == 0 && static_str == 0))
return 0;
if (str == 0)
str = static_str;
/* get length of string */
while(str[strlength])
strlength++;
/* find the first occurrence of delim */
for (index=0;index<strlength;index++)
if (str[index]==delim[0]) {
found=1;
break;
}
/* if delim is not contained in str, return str */
if (!found) {
static_str = 0;
return str;
}
/* check for consecutive delimiters
*if first char is delim, return delim
*/
if (str[0]==delim[0]) {
static_str = (str + 1);
return (char *)delim;
}
/* terminate the string
 * this assignment requires writable storage, so str has to
 * be a char[] rather than a pointer to a string literal
 */
str[index] = '\0';
/* save the rest of the string */
if ((str + index + 1)!=0)
static_str = (str + index + 1);
else
static_str = 0;
return str;
}
Below is an example code that demonstrates the usage
Example Usage
char s[] = "A,B,,,C";
printf("1 %s\n",zstring_strtok(s,","));
printf("2 %s\n",zstring_strtok(NULL,","));
printf("3 %s\n",zstring_strtok(NULL,","));
printf("4 %s\n",zstring_strtok(NULL,","));
printf("5 %s\n",zstring_strtok(NULL,","));
printf("6 %s\n",zstring_strtok(NULL,","));
Example Output
1 A
2 B
3 ,
4 ,
5 C
6 (null)
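For comparison, the standard strtok() treats a run of delimiters as a single separator, so the same input yields only three tokens. A small sketch that counts them (count_tokens is a hypothetical name):

```c
#include <string.h>

/* Hypothetical helper: counts the tokens that the standard strtok()
   reports for s. Like any strtok() call, it modifies s. */
int count_tokens(char *s, const char *delims)
{
    int count = 0;

    for (char *tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;
    return count;
}
```

With "A,B,,,C" and "," as the delimiter this returns 3 (A, B and C), while zstring_strtok above reports five tokens for the same input.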
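You can even use a while loop (standard library's strtok() would give the same result here)
char s[]="some text here";
char *tok = zstring_strtok(s," "); /* pass the string only on the first call */
while (tok) {
printf("%s\n",tok);
tok = zstring_strtok(NULL," "); /* NULL continues from the saved position */
}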

Resources