Or rather, how does strtok produce the string to which it's return value points? Does it allocate memory dynamically? I am asking because I am not sure if I need to free the token in the following code:
The STANDARD_INPUT variables is for exit procedure in case I run out of memory for allocation and the string is the tested subject.
int ValidTotal(STANDARD_INPUT, char *str)
{
char *cutout = NULL, *temp, delim = '#';
int i = 0; //Checks the number of ladders in a string, 3 is the required number
temp = (char*)calloc(strlen(str),sizeof(char));
if(NULL == temp)
Pexit(STANDARD_C); //Exit function, frees the memory given in STANDARD_INPUT(STANDARD_C is defined as the names given in STANDARD_INPUT)
strcpy(temp,str);//Do not want to touch the actual string, so copying it
cutout = strtok(temp,&delim);//Here is the lynchpin -
while(NULL != cutout)
{
if(cutout[strlen(cutout) - 1] == '_')
cutout[strlen(cutout) - 1] = '\0'; \\cutout the _ at the end of a token
if(Valid(cutout,i++) == INVALID) //Checks validity for substring, INVALID is -1
return INVALID;
cutout = strtok(NULL,&delim);
strcpy(cutout,cutout + 1); //cutout the _ at the beginning of a token
}
free(temp);
return VALID; // VALID is 1
}
strtok manipulates the string you pass in and returns a pointer to it,
so no memory is allocated.
Please consider using strsep or at least strtok_r to save you some headaches later.
The first parameter to the strtok(...) function is YOUR string:
str
C string to truncate. Notice that this string is modified by
being broken into smaller strings (tokens). Alternativelly, a null
pointer may be specified, in which case the function continues
scanning where a previous successful call to the function ended.
It puts '\0' characters into YOUR string and returns them as terminated strings. Yes, it mangles your original string. If you need it later, make a copy.
Further, it should not be a constant string (e.g. char* myStr = "constant string";). See here.
It could be allocated locally or by malloc/calloc.
If you allocated it locally on the stack (e.g. char myStr[100];), you don't have to free it.
If you allocated it by malloc (e.g. char* myStr = malloc(100*sizeof(char));), you need to free it.
Some example code:
#include <string.h>
#include <stdio.h>
int main()
{
const char str[80] = "This is an example string.";
const char s[2] = " ";
char *token;
/* get the first token */
token = strtok(str, s);
/* walk through other tokens */
while( token != NULL )
{
printf( " %s\n", token );
token = strtok(NULL, s);
}
return(0);
}
NOTE: This example shows how you iterate through the string...since your original string was mangled, strtok(...) remembers where you were last time and keeps working through the string.
According to the docs:
Return Value
A pointer to the last token found in string.
Since the return pointer just points to one of the bytes in your input string where the token starts, whether you need to free depends on whether you allocated the input string or not.
As others mentioned, strtok uses its first parameter, your input string, as the memory buffer. It doesn't allocate anything. It's stateful and non-thread safe; if strtok's first argument is null, it reuses the previously-provided buffer. During a call, strtok destroys the string, adding nulls into it and returning pointers to the tokens.
Here's an example:
#include <stdio.h>
#include <string.h>
int main() {
char s[] = "foo;bar;baz";
char *foo = strtok(s, ";");
char *bar = strtok(NULL, ";");
char *baz = strtok(NULL, ";");
printf("%s %s %s\n", foo, bar, baz); // => foo bar baz
printf("original: %s\n", s); // => original: foo
printf("%ssame memory loc\n", s == foo ? "" : "not "); // => same memory loc
return 0;
}
s started out as foo;bar;baz\0. Three calls to strtok turned it into foo\0bar\0baz\0. s is basically the same as the first chunk, foo.
Valgrind:
==89== HEAP SUMMARY:
==89== in use at exit: 0 bytes in 0 blocks
==89== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==89==
==89== All heap blocks were freed -- no leaks are possible
While the code below doesn't fix all of the problems with strtok, it might help get you moving in a pinch, preserving the original string with strdup:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
const char s[] = "foo;bar_baz";
const char delims[] = ";_";
char *cpy = strdup(s);
char *foo = strtok(cpy, delims);
char *bar = strtok(NULL, delims);
char *baz = strtok(NULL, delims);
printf("%s %s %s\n", foo, bar, baz); // => foo bar baz
printf("original: %s\n", s); // => original: foo;bar_baz
printf("%ssame memory loc\n", s == foo ? "" : "not "); // => not same memory loc
free(cpy);
return 0;
}
Or a more full-fledged example (still not thread safe):
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void must(
bool predicate,
const char *msg,
const char *file,
const unsigned int line
) {
if (!predicate) {
fprintf(stderr, "%s:%d: %s\n", file, line, msg);
exit(1);
}
}
size_t split(
char ***tokens,
const size_t len,
const char *s,
const char *delims
) {
char temp[len+1];
temp[0] = '\0';
strcpy(temp, s);
*tokens = malloc(sizeof(**tokens) * 1);
must(*tokens, "malloc failed", __FILE__, __LINE__);
size_t chunks = 0;
for (;;) {
char *p = strtok(chunks == 0 ? temp : NULL, delims);
if (!p) {
break;
}
size_t sz = sizeof(**tokens) * (chunks + 1);
*tokens = realloc(*tokens, sz);
must(*tokens, "realloc failed", __FILE__, __LINE__);
(*tokens)[chunks++] = strdup(p);
}
return chunks;
}
int main() {
const char s[] = "foo;;bar_baz";
char **tokens;
size_t len = split(&tokens, strlen(s), s, ";_");
for (size_t i = 0; i < len; i++) {
printf("%s ", tokens[i]);
free(tokens[i]);
}
puts("");
free(tokens);
return 0;
}
Related
one of the assignments in my class has this objective:
Complete CapVowels(), which takes a string as a parameter and returns a new string containing the string parameter with the first occurrence of each of the five English vowels (a, e, i, o, and u) capitalized.
Hint: Begin CapVowels() by copying the string parameter to a newly allocated string.
Ex: If the input is:
management
the output is:
Original: management
Modified: mAnagEment
This is the current code I have, and I will highlight the section I'm supposed to complete:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
**// Return a newly allocated copy of original
// with the first occurrence of each vowel capitalized
char* CapVowels(char* original) {
return CapVowels(*original= "A.E,I,O,U");
}**
int main(void) {
char userCaption[50];
char* resultStr;
scanf("%s", userCaption);
resultStr = CapVowels(userCaption);
printf("Original: %s\n", userCaption);
printf("Modified: %s\n", resultStr);
// Always free dynamically allocated memory when no longer needed
free(resultStr);
return 0;
}
The section with the ** meaning it's bolded is the section I'm supposed to complete before the int main(void). I can't figure out how to complete the objective. I get mixed up with pointers and dereferencing and, I tried dereferencing when returning the function so that the value will come out to what it's supposed to. I understand one part of it, but I don't know how you would complete it to output to the required output:
Original: management
Modified: mAnagEment
Hint: Begin CapVowels() by copying the string parameter to a newly allocated string.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Return a newly allocated copy of original
// with the first occurrence of each vowel capitalized
char* CapVowels(const char* original) {
char* result = strcpy(malloc(strlen(original+1)), original);
char* vowels = "aeiou";
while(*vowels)
{
char* ptr = strchr(result, *vowels);
(ptr)? *ptr = toupper(*ptr) : vowels++;
}
return result;
}
int main(void) {
char userCaption[50];
char* resultStr;
scanf("%s", userCaption);
resultStr = CapVowels(userCaption);
printf("Original: %s\n", userCaption);
printf("Modified: %s\n", resultStr);
// Always free dynamically allocated memory when no longer needed
free(resultStr);
return 0;
}
Output
Success #stdin #stdout 0s 5424KB
Original: management
Modified: mAnAgEmEnt
You can use strlen to get the length of the input, then use malloc to allocate enough space for the result. Then, just loop over the input until the terminating null character ('\0'), incrementally assigning the current character to the result if it is a consonant or the uppercase version if it is a vowel (using the toupper function).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
char *CapVowels(char *original){
if (!original)
return NULL;
const char *vowels = "aeiou";
size_t len = strlen(original); // get length of input (terminating '\0' not included)
char *result = malloc(len + 1); // allocate memory for new string (note that sizeof(char) is 1)
for (char *dest = result; *original; ++original)
*dest++ = strchr(vowels, *original) // check if current character is in the vowels
? toupper(*original) : *original;
result[len] = '\0';
return result;
}
Here's a version that works even with multiple words in 'original'.
char *CapVowels( const char *original ) {
char *cp, *out = strdup( original );
for( char *vowels = "aeiou"; *vowels; vowels++ ) {
if( ( cp = strchr( out, *vowels ) ) != NULL )
*cp = toupper( *cp );
}
return out;
}
void main( void ) {
char userCaption[50];
gets( userCaption );
char *capped = CapVowels( userCaption );
printf( "Original: %s\n", userCaption );
printf( "Modified: %s\n", capped );
free( capped );
}
According to this:
strcpy vs strdup,
strcpy could be implemented with a loop, they used this while(*ptr2++ = *ptr1++). I have tried to do similar:
#include <stdio.h>
#include <stdlib.h>
int main(){
char *des = malloc(10);
for(char *src="abcdef\0";(*des++ = *src++););
printf("%s\n",des);
}
But that prints nothing, and no error. What went wrong?
Thanks a lot for answers, I have played a bit, and decided how best to design the loop to see how the copying is proceeding byte by byte. This seems the best:
#include <stdio.h>
#include <stdlib.h>
int main(){
char *des = malloc(7);
for(char *src="abcdef", *p=des; (*p++=*src++); printf("%s\n",des));
}
In this loop
for(char *src="abcdef\0";(*des++ = *src++););
the destination pointer des is being changed. So after the loop it does not point to the beginning of the copied string.
Pay attention to that the explicit terminating zero character '\0' is redundant in the string literal.
The loop can look the following way
for ( char *src = "abcdef", *p = des; (*p++ = *src++););
And then after the loop
puts( des );
and
free( des );
You could write a separate function similar to strcpy the following way
char * my_strcpy( char *des, const char *src )
{
for ( char *p = des; ( *p++ = *src++ ); );
return des;
}
And call it like
puts( my_strcpy( des, "abcdef" ) )'
free( des );
You are incrementing des so naturally at the end of the cycle it will be pointing past the end of the string, printing it amounts to undefined behavior, you have to bring it back to the beginning of des.
#include <stdio.h>
#include <stdlib.h>
int main(){
int count = 0;
char *des = malloc(10);
if(des == NULL){
return EXIT_FAILURE; //or otherwise handle the error
}
// '\0' is already added by the compiler so you don't need to do it yourself
for(char *src="abcdef";(*des++ = *src++);){
count++; //count the number of increments
}
des -= count + 1; //bring it back to the beginning
printf("%s\n",des);
free(dest); //to free the allocated memory when you're done with it
return EXIT_SUCCESS;
}
Or make a pointer to the beginning of des and print that instead.
#include <stdio.h>
#include <stdlib.h>
int main(){
char *des = malloc(10);
if(des == NULL){
return EXIT_FAILURE; //or otherwise handle the error
}
char *ptr = des;
for(char *src="abcdef";(*des++ = *src++);){} //using {} instead of ;, it's clearer
printf("%s\n",ptr);
free(ptr) // or free(dest); to free the allocated memory when you're done with it
return EXIT_SUCCESS;
}
printf("%s\n",des); is undefined behavior (UB) as it attempts to print starting beyond the end of the string written to allocated memory.
Copy the string
Save the original pointer, check it and free when done.
const char *src = "abcdef\0"; // string literal here has 2 ending `\0`,
char *dest = malloc(strlen(src) + 1); // 7
char *d = dest;
while (*d++ = *src++);
printf("%s\n", dest);
free(dest);
Copy the string literal
const char src[] = "abcdef\0"; // string literal here has 2 ending `\0`,
char *dest = malloc(sizeof src); // 8
for (size_t i = 0; i<sizeof src; i++) {
dest[i] = src[i];
}
printf("%s\n", dest);
free(dest);
You just need to remember the original allocated pointer.
Do not program in main. Use functions.
#include <stdio.h>
#include <stdlib.h>
size_t strSpaceNeedeed(const char *str)
{
const char *wrk = str;
while(*wrk++);
return wrk - str;
}
char *mystrdup(const char *str)
{
char *wrk;
char *dest = malloc(strSpaceNeedeed(str));
if(dest)
{
for(wrk = dest; *wrk++ = *str++;);
}
return dest;
}
int main(){
printf("%s\n", mystrdup("asdfgfd"));
}
or even better
size_t strSpaceNeedeed(const char *str)
{
const char *wrk = str;
while(*wrk++);
return wrk - str;
}
char *mystrcpy(char *dest, const char *src)
{
char *wrk = dest;
while((*wrk++ = *src++)) ;
return dest;
}
char *mystrdup(const char *str)
{
char *wrk;
char *dest = malloc(strSpaceNeedeed(str));
if(dest)
{
mystrcpy(dest, str);
}
return dest;
}
int main(){
printf("%s\n", mystrdup("asdfgfd"));
}
You allocate the destination buffer des and correctly copy the source string into place. But since you are incrementing des for each character you copy, you have moved des from the start of the string to the end. When you go to print the result, you are printing the last byte which is the nil termination, which is empty.
Instead, you need to keep a pointer to the start of the string, as well as having a pointer to each character you copy.
The smallest change from your original source is:
#include <stdio.h>
#include <stdlib.h>
int main(){
char *des = malloc(10);
char *p = des;
for(char *src="abcdef";(*p++ = *src++););
printf("%s\n",des);
}
So p is the pointer to the next destination character, and moves along the string. But the final string that you print is des, from the start of the allocation.
Of course, you should also allocate strlen(src)+1 worth of bytes for des. And it is not necessary to null-terminate a string literal, since that will be done for you by the compiler.
But that prints nothing, and no error. What went wrong?
des does not point to the start of the string anymore after doing (*des++ = *src++). In fact, des is pointing to one element past the NUL character, which terminates the string, thereafter.
Thus, if you want to print the string by using printf("%s\n",des) it invokes undefined behavior.
You need to store the address value of the "start" pointer (pointing at the first char object of the allocated memory chunk) into a temporary "holder" pointer. There are various ways possible.
#include <stdio.h>
#include <stdlib.h>
int main (void) {
char *des = malloc(sizeof(char) * 10);
if (!des)
{
fputs("Error at allocation!", stderr);
return 1;
}
char *tmp = des;
for (const char *src = "abcdef"; (*des++ = *src++) ; );
des = temp;
printf("%s\n",des);
free(des);
}
Alternatives:
#include <stdio.h>
#include <stdlib.h>
int main (void) {
char *des = malloc(sizeof(char) * 10);
if (!des)
{
fputs("Error at allocation!", stderr);
return 1;
}
char *tmp = des;
for (const char *src = "abcdef"; (*des++ = *src++) ; );
printf("%s\n", tmp);
free(tmp);
}
or
#include <stdio.h>
#include <stdlib.h>
int main (void) {
char *des = malloc(sizeof(char) * 10);
if (!des)
{
fputs("Error at allocation!", stderr);
return 1;
}
char *tmp = des;
for (const char *src = "abcdef"; (*tmp++ = *src++) ; );
printf("%s\n", des);
free(des);
}
Side notes:
"abcdef\0" - The explicit \0 is not needed. It is appended automatically during translation. Use "abcdef".
Always check the return of memory-management function if the allocation succeeded by checking the returned for a null pointer.
Qualify pointers to string literal by const to avoid unintentional write attempts.
Use sizeof(char) * 10 instead of plain 10 in the call the malloc. This ensures the write size if the type changes.
int main (void) instead of int main (void). The first one is standard-compliant, the second not.
Always free() dynamically allocated memory, since you no longer need the allocated memory. In the example above it would be redundant, but if your program becomes larger and the example is part-focused you should free() the unneeded memory immediately.
I really want to change all spaces ' ' in my char array for NULL -
#include <string.h>
void ReplaceCharactersInString(char *pcString, char *cOldChar, char *cNewChar) {
char *p = strtok(pcString, cOldChar);
strcpy(pcString, p);
while (p != NULL) {
strcat(pcString, p);
p = strtok(cNewChar, cOldChar);
}
}
int main() {
char pcString[] = "I am testing";
ReplaceCharactersInString(pcString, " ", NULL);
printf(pcString);
}
OUTPUT: Iamtesting
If I simply put the printf(p) function before:
p = strtok(cNewChar, cOldChar);
In the result I have what I need - but the problem is how to store it in pcString (directly)?
Or there is maybe a better solution to simply do it?
While some functions expect a [single] string to be pre-parsed to: I\0am\0testing, that is rare.
And, if you have multiple spaces/delimiters, you'll get (e.g.) foo\0\0bar, which you probably don't want.
And, your printf in main will only print the first token in the string because it will stop on the first EOS (i.e. '\0').
(i.e.) You probably don't want strcpy/strcat.
More likely, you want to fill an array of char * pointers to the tokens you parse.
So, you'd want to pass down char **argv, then do: argv[argc++] = strtok(...); and then do: return argc
Here's how I would refactor your code:
#include <stdio.h>
#include <string.h>
#define ARGMAX 100
int
ReplaceCharactersInString(int argmax,char **argv,char *pcString,
const char *delim)
{
char *p;
int argc;
// allow space for NULL termination
--argmax;
for (argc = 0; argc < argmax; ++argc, ++argv) {
// get next token
p = strtok(pcString,delim);
if (p == NULL)
break;
// zap the buffer pointer
pcString = NULL;
// store the token in the [returned] array
*argv = p;
}
*argv = NULL;
return argc;
}
int
main(void)
{
char pcString[] = "I am testing";
int argc;
char **av;
char *argv[ARGMAX];
argc = ReplaceCharactersInString(ARGMAX,argv,pcString," ");
printf("argc: %d\n",argc);
for (av = argv; *av != NULL; ++av)
printf("'%s'\n",*av);
return 0;
}
Here's the output:
argc: 3
'I'
'am'
'testing'
strcat strcpy should not be used when the source and destination overlap in memory.
Iterate through the array and replace the matching character with the desired character.
Since zeros are part of the string, printf will stop at the first zero and strlen can't be used for the length to print. sizeof can be used as pcString is defined in the same scope.
Note that ReplaceCharactersInString would not work a second time as it would stop at the first zero. The function could be written to accept a length parameter and loop using the length.
#include <stdio.h>
#include <stdlib.h>
void ReplaceCharactersInString(char *pcString, char cOldChar,char cNewChar){
while ( pcString && *pcString) {//not NULL and not zero
if ( *pcString == cOldChar) {//match
*pcString = cNewChar;//replace
}
++pcString;//advance to next character
}
}
int main ( void) {
char pcString[] = "I am testing";
ReplaceCharactersInString ( pcString, ' ', '\0');
for ( int each = 0; each < sizeof pcString; ++each) {
printf ( "pcString[%02d] = int:%-4d char:%c\n", each, pcString[each], pcString[each]);
}
return 0;
}
You want to split the string into individual tokens separated by spaces such as "I\0am\0testing\0". You can use strtok() for this but this function is error prone. I suggest you allocate an array of pointers and make them point to the words. Note that splitting the source string is sloppy and does not allow for tokens to be adjacent such as in 1+1. You could allocate the strings instead.
Here is an example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **split_string(const char *str, char *delim) {
size_t i, len, count;
const char *p;
/* count tokens */
p = str;
p += strspn(p, delim); // skip initial delimiters
count = 0;
while (*p) {
count++;
p += strcspn(p, delim); // skip token
p += strspn(p, delim); // skip delimiters
}
/* allocate token array */
char **array = calloc(sizeof(*array, count + 1);
p = str;
p += strspn(p, delim); // skip initial delimiters
for (i = 0; i < count; i++) {
len = strcspn(p, delim); // token length
array[i] = strndup(p, len); // allocate a copy of the token
p += len; // skip token
p += strspn(p, delim); // skip delimiters
}
/* array ends with a null pointer */
array[count] = NULL;
return array;
}
int main() {
const char *pcString = "I am testing";
char **array = split_string(pcString, " \t\r\n");
for (size_t i = 0; array[i] != NULL; i++) {
printf("%zu: %s\n", i, array[i]);
}
return 0;
}
The strtok function pretty much does exactly what you want. It basically replaces the next delimiter with a '\0' character and returns the pointer to the current token. The next time you call strtok, you should pass a NULL argument (see the documentation for strtok) and it will point to the next token, which will again be delimited by '\0'. Read some more examples of correct strtok usage.
I'm trying to allocate a string within a struct with fscanf,
I tried this:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <Windows.h>
#include <string.h>
typedef struct _SPerson {
char *name;
char *surname;
char *id;
char *telephone;
}SPerson;
void main (void) {
unsigned int ne;
SPerson Archive[1000];
Load(Archive,&ne);
}
int Load(SPerson Archive[],unsigned int *ne) {
int k,i=0;
char s[4][20];
FILE *f;
f = fopen("archive.txt","r");
if(f==0) return 0;
while((k=fscanf(f,"%s %s %s %s",s[0],s[1],s[2],s[3]))==4) {
Archive[i].id = (char*) malloc( sizeof(char) *strlen(s[0]));
Archive[i].id =s[0];
Archive[i].name = (char*) malloc( sizeof(char) *strlen(s[1]));
Archive[i].name = s[1];
Archive[i].surname = (char*) malloc( sizeof(char) *strlen(s[2]));
Archive[i].surname = s[2];
Archive[i].telephone = (char*) malloc( sizeof(char) *strlen(s[3]));
Archive[i].telephone =s[3];
i++;
}
*ne = i;
fclose(f);
return 1;
}
Maybe in my brain it's correct, but something went wrong while loading the data, so is this the correct and clear method to read strings dynamically?
I thought to use fgets, but my string is separated by a space, so I need to implement another function, the split. Can Anyone help me?
while((k=fscanf(f,"%s %s %s %s",s[0],s[1],s[2],s[3]))==4) {
//your code
i++;
}
Instead of this use fgets to read complete data and then tokenize it using strtok -
char data[1024],*token;
int j=0;
while(fgets(data,sizeof data,f)!=NULL){ //read from file
token=strtok(data," "); //tokenize data using space as delimiter
while(token!=NULL && j<4){
j=0;
sprintf(s[j],"%s",token); //store it into s[i]
j++;
token=strtok(NULL," ");
}
Archive[i].id = malloc(sizeof *Archive[i].id * (strlen(s[0])+1)); //allocate memory
strcpy(Archive[i].id,s[i]); //copy at that allocated memory
//similar for all
i++;
}
This can be used instead of your loop.
Note - Do not do this -
Archive[i].id = (char*) malloc( sizeof(char) *(strlen(s[0])+1));
Archive[i].id =s[0]; <-- 2.
As after 2. statement you will lose reference to previous allocated memory and will not be able to free it — causing a memory leak.
This is same for all such following statements.
A couple of quick suggestions to improve:
1) void main (void) is really not a good prototype for main. Use:
int main(void);
Or:
int main(int argc, char **argv);
2) There is no need to cast the return of [m][c][re]alloc in C.
This line in your code:
Archive[i].id = (char*) malloc( sizeof(char) *strlen(s[0]));
^^^^^^^ ^^^^^^^^^^^^^^ //^^^ == remove
Should be written:
Archive[i].id = malloc( strlen(s[0]) + 1);//no cast, and sizeof(char) is always == 1
//"+ 1" for NULL termination
3) Suggest using fgets(), strtok() and strcpy() as a minimal method to read, parse and copy strings from file into struct members:
Note: there will be a number of calls to malloc() here, and
each will have to be freed at some point. To avoid all that, it would
be better if your struct contained members with hard coded stack
memory:
typedef struct
{
char name[80];
char surname[80];
char id[80];
char telephone[80];
}SPerson;
But assuming you have a reason to use heap memory, here is a way to do it using the struct as you have defined it:
char line[260];//line buffer (hardcoded length for quick demo)
char *tok={0};//for use with strtok()
int len = 0;//for use with strlen()
FILE *f;
f = fopen("archive.txt","r");//did not have example file for this demo
//therefore do not have delimiters, will guess
if(f==0) return 0;
i = 0;//initialize your index
while(fgets(line, 260, f))
{
tok = strtok(line, " ,\n\t");// will tokenize on space, newline, tab and comma
if(tok)
{
len = strlen(tok);
Archive[i].id = malloc(len + 1);//include space for NULL termination
strcpy(Archive[i].id, tok);//note correct way to assign string
//(Archive[i].id = tok is incorrect!!)
}
else {//handle error, free memory and return}
tok = strtok(NULL, " ,\n\t");//note NULL in first arg this time
if(tok)
{
len = strlen(tok);
Archive[i].name= malloc(len + 1);//include space for NULL termination
strcpy(Archive[i].name, tok);
}
else {//handle error, free memory and return}
//...And so on for rest of member assignments
//
i++;//increment index at bottom of while loop just before reading new line
}
I learnt C in uni but haven't used it for quite a few years. Recently I started working on a tool which uses C as the programming language. Now I'm stuck with some really basic functions. Among them are how to split and join strings using a delimiter? (I miss Python so much, even Java or C#!)
Below is the function I created to split a string, but it does not seem to work properly. Also, even this function works, the delimiter can only be a single character. How can I use a string as a delimiter?
Can someone please provide some help?
Ideally, I would like to have 2 functions:
// Split a string into a string array
char** fSplitStr(char *str, const char *delimiter);
// Join the elements of a string array to a single string
char* fJoinStr(char **str, const char *delimiter);
Thank you,
Allen
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
char** fSplitStr(char *str, const char *delimiters)
{
char * token;
char **tokenArray;
int count=0;
token = (char *)strtok(str, delimiters); // Get the first token
tokenArray = (char**)malloc(1 * sizeof(char*));
if (!token) {
return tokenArray;
}
while (token != NULL ) { // While valid tokens are returned
tokenArray[count] = (char*)malloc(sizeof(token));
tokenArray[count] = token;
printf ("%s", tokenArray[count]);
count++;
tokenArray = (char **)realloc(tokenArray, sizeof(char *) * count);
token = (char *)strtok(NULL, delimiters); // Get the next token
}
return tokenArray;
}
int main (void)
{
char str[] = "Split_The_String";
char ** splitArray = fSplitStr(str,"_");
printf ("%s", splitArray[0]);
printf ("%s", splitArray[1]);
printf ("%s", splitArray[2]);
return 0;
}
Answers: (Thanks to Moshbear, Joachim and sarnold):
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
char** fStrSplit(char *str, const char *delimiters)
{
char * token;
char **tokenArray;
int count=0;
token = (char *)strtok(str, delimiters); // Get the first token
tokenArray = (char**)malloc(1 * sizeof(char*));
tokenArray[0] = NULL;
if (!token) {
return tokenArray;
}
while (token != NULL) { // While valid tokens are returned
tokenArray[count] = (char*)strdup(token);
//printf ("%s", tokenArray[count]);
count++;
tokenArray = (char **)realloc(tokenArray, sizeof(char *) * (count + 1));
token = (char *)strtok(NULL, delimiters); // Get the next token
}
tokenArray[count] = NULL; /* Terminate the array */
return tokenArray;
}
char* fStrJoin(char **str, const char *delimiters)
{
char *joinedStr;
int i = 1;
joinedStr = realloc(NULL, strlen(str[0])+1);
strcpy(joinedStr, str[0]);
if (str[0] == NULL){
return joinedStr;
}
while (str[i] !=NULL){
joinedStr = (char*)realloc(joinedStr, strlen(joinedStr) + strlen(str[i]) + strlen(delimiters) + 1);
strcat(joinedStr, delimiters);
strcat(joinedStr, str[i]);
i++;
}
return joinedStr;
}
int main (void)
{
char str[] = "Split_The_String";
char ** splitArray = (char **)fStrSplit(str,"_");
char * joinedStr;
int i=0;
while (splitArray[i]!=NULL) {
printf ("%s", splitArray[i]);
i++;
}
joinedStr = fStrJoin(splitArray, "-");
printf ("%s", joinedStr);
return 0;
}
Use strpbrk instead of strtok, because strtok suffers from two weaknesses:
it's not re-entrant (i.e. thread-safe)
it modifies the string
For joining, use strncat for joining, and realloc for resizing.
The order of operations is very important.
Before doing the realloc;strncat loop, set the 0th element of the target string to '\0' so that strncat won't cause undefined behavior.
For starters, don't use sizeof to get the length of a string. strlen is the function to use. In this case strdup is better.
And you don't actually copy the string returned by strtok, you copy the pointer. Change you loop to this:
while (token != NULL) { // While valid tokens are returned
tokenArray[count] = strdup(token);
printf ("%s", tokenArray[count]);
count++;
tokenArray = (char **)realloc(tokenArray, sizeof(char *) * count);
token = (char *)strtok(NULL, delimiters); // Get the next token
}
tokenArray[count] = NULL; /* Terminate the array */
Also, don't forget to free the entries in the array, and the array itself when you're done with it.
Edit At the beginning of fSplitStr, wait with allocating the tokenArray until after you check that token is not NULL, and if token is NULL why not return NULL?
I'm not sure the best solution for you, but I do have a few notes:
token = (char *)strtok(str, delimiters); // Get the first token
tokenArray = (char**)malloc(1 * sizeof(char*));
if (!token) {
return tokenArray;
}
At this point, if you weren't able to find any tokens in the string, you return a pointer to an "array" that is large enough to hold a single character pointer. It is un-initialized, so it would not be a good idea to use the contents of this array in any way. C almost never initializes memory to 0x00 for you. (calloc(3) would do that for you, but since you need to overwrite every element anyway, it doesn't seem worth switching to calloc(3).)
Also, the (char **) case before the malloc(3) call indicates to me that you've probably forgotten the #include <stdlib.h> that would properly prototype malloc(3). (The cast was necessary before about 1989.)
Do note that your while() { } loop is setting pointers to the parts of the original input string to your tokenArray elements. (This is one of the cons that moshbear mentioned in his answer -- though it isn't always a weakness.) If you change tokenArray[1][1]='H', then your original input string also changes. (In addition to having each of the delimiter characters replaced with an ASCII NUL character.)