C: strtok exception [duplicate]

This question already has answers here:
strtok segmentation fault
(8 answers)
strtok causing segfault but not when step through code
(1 answer)
Closed 9 years ago.
For some reason I get an exception at the first call to strtok().
What I am trying to write is a function that simply checks whether a substring repeats itself inside a string, but so far I haven't gotten strtok to work.
int CheckDoubleInput(char* input){
char* word = NULL;
char cutBy[] = ",_";
word = strtok(input, cutBy); /* <--- error line */
/* walk through other tokens */
while (word != NULL)
{
printf(" %s\n", word);
word = strtok(NULL, cutBy);
}
return 1;
}
and the main calling the function:
CheckDoubleInput("asdlakm,_asdasd,_sdasd,asdas_sas");

CheckDoubleInput() itself is fine; the problem is the argument you pass to it. Compare the two calls below:
int main(){
char a[100] = "asdlakm,_asdasd,_sdasd,asdas_sas";
// This will lead to segmentation fault.
CheckDoubleInput("asdlakm,_asdasd,_sdasd,asdas_sas");
// This works ok.
CheckDoubleInput(a);
return 0;
}

The strtok function modifies its first argument (the string being parsed), so you can't pass it a pointer to a constant string. In your code you are passing a string literal. In C a string literal has type char[N], but it is typically stored in read-only memory, and attempting to modify it is undefined behaviour. You'll have to copy the string into a temporary writable buffer first.
char* copy = strdup("asdlakm,_asdasd,_sdasd,asdas_sas");
int result = CheckDoubleInput(copy);
free(copy);
Here is what the man page for strtok says:
Bugs
Be cautious when using these functions. If you do use them, note that:
These functions modify their first argument.
These functions cannot be used on constant strings.
The identity of the delimiting byte is lost.
The strtok() function uses a static buffer while parsing, so it's not thread safe. Use strtok_r() if this matters to you.
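A minimal sketch of the asker's stated goal (detecting a repeated substring between delimiters), built on the copy-first advice above. The name has_repeated_token and its return convention are hypothetical, not something from the question; the function tokenizes a private copy, so it is safe to call with a string literal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any token occurs twice, 0 if none does,
   and -1 if memory could not be allocated. The input is never modified. */
int has_repeated_token(const char *input, const char *delims)
{
    assert(input != NULL && delims != NULL);

    size_t size = strlen(input) + 1;            /* include the '\0' */
    char *copy = malloc(size);                  /* writable copy for strtok */
    char **seen = malloc(size * sizeof *seen);  /* a string has fewer tokens than bytes */
    if (copy == NULL || seen == NULL) {
        free(copy);
        free(seen);
        return -1;
    }
    memcpy(copy, input, size);

    size_t count = 0;
    int repeated = 0;
    char *tok = strtok(copy, delims);
    while (tok != NULL && !repeated) {
        /* Compare the new token against every token seen so far. */
        for (size_t i = 0; i < count; i++) {
            if (strcmp(seen[i], tok) == 0) {
                repeated = 1;
                break;
            }
        }
        seen[count++] = tok;
        tok = strtok(NULL, delims);
    }

    free(seen);
    free(copy);
    return repeated;
}
```

Calling has_repeated_token("asdlakm,_asdasd,_sdasd,asdas_sas", ",_") works even though the argument is a literal, because only the copy is handed to strtok.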

Either input is somehow bad (try printing it before calling strtok), or you are calling strtok from multiple threads. strtok keeps internal state between calls, so the classic implementation is not thread safe. POSIX systems provide a reentrant version called strtok_r, and Visual Studio's strtok keeps its state per thread.
Your edited question shows that you're passing a string literal, which is read-only.
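For the threading concern, here is a hedged sketch of the reentrant variant. It assumes a POSIX system where strtok_r is available; count_tokens_r is a hypothetical name used only for illustration. The parsing position lives in a local variable instead of hidden static storage, so concurrent or nested tokenizations do not interfere with each other.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations such as strtok_r */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts tokens without touching strtok's hidden state.
   Returns -1 if the working copy cannot be allocated. */
int count_tokens_r(const char *input, const char *delims)
{
    assert(input != NULL && delims != NULL);

    size_t size = strlen(input) + 1;
    char *copy = malloc(size);       /* strtok_r also writes into its argument */
    if (copy == NULL)
        return -1;
    memcpy(copy, input, size);

    int count = 0;
    char *state = NULL;              /* parsing position, owned by this call */
    for (char *tok = strtok_r(copy, delims, &state);
         tok != NULL;
         tok = strtok_r(NULL, delims, &state)) {
        count++;
    }

    free(copy);
    return count;
}
```

Note that strtok_r still modifies its first argument, so the copy is needed for literals either way.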

Related

\n is not substituted using strtok

I am trying to use C's strtok function to process a char* and print it on a display, but for some reason the character '\n' is not replaced by '\0' as I believe strtok should do. The code is as follows:
- Declaration of the char* and the call to the function that processes it:
char *string_to_write = "Some text\nSome other text\nNewtext";
malloc(sizeof string_to_write);
screen_write(string_to_write,ALIGN_LEFT_TOP,I2C0);
- Processing of the char* in the function:
void screen_write(char *string_to_write,short alignment,short I2C)
{
char *stw;
stw = string_to_write;
char* text_to_send;
text_to_send=strtok(stw,"\n");
while(text_to_send != NULL)
{
write_text(text_to_send,I2C);
text_to_send=strtok(NULL, "\n");
}
}
When I run the code, the result can be seen on imgur (sorry, I had problems adding the image to the post): the \n is not replaced, and it shows up as the strange character in the image. The debugger still showed the character as well. Any hints as to where the problem might be?
Thanks for your help,
Javier
strtok expects to be able to mutate the string you pass it: instead of allocating new memory for each token, it puts \0 characters into the string at token boundaries, then returns a series of pointers into that string.
But in this case, your string is immutable: it's a string literal stored in read-only memory, and it can't be changed. Writing to it is undefined behaviour; on your platform the writes apparently have no effect, so strtok is doing its best: it returns a pointer to the starting point of each token, but it can't insert the \0s to mark the ends. Your device can't handle \ns in the way you'd expect, so it displays them with that error character instead. (Which is presumably why you're using this code in the first place.)
The key is to pass in only mutable strings. To define a mutable string with a literal value, you need char my_string[] = "..."; rather than char* my_string = "...". In the latter case, it just gives you a pointer to some constant memory; in the former case, it actually makes an array for you to use. Alternatively, you can use strlen to find out how long the string is, malloc that many bytes plus one for the terminating \0, then strcpy it over.
P.S. I'm concerned by your malloc: you're not saving the memory it gives you anywhere, and you're not doing anything with it. Be sure you know what you're doing before working with dynamic memory allocation! C is not friendly about that, and it's easy to start leaking without realizing it.
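To make the in-place mutation concrete, here is a small sketch. split_lines is a hypothetical helper, not part of the asker's code; it stores the pointers strtok returns, and the caller must pass a writable buffer such as a char array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf on '\n' in place and stores up to
   max_lines token pointers in lines. Returns the number stored.
   buf must be writable memory, never a string literal. */
size_t split_lines(char *buf, char **lines, size_t max_lines)
{
    assert(buf != NULL && lines != NULL);

    size_t n = 0;
    char *tok = strtok(buf, "\n");
    while (tok != NULL && n < max_lines) {
        lines[n++] = tok;           /* points into buf, no new memory */
        tok = strtok(NULL, "\n");
    }
    return n;
}
```

With char text[] = "Some text\nSome other text\nNewtext"; as the buffer, the call stores three pointers, and text[9], which held the first '\n', now holds '\0'. That overwrite is exactly what cannot happen when the buffer is a literal.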
1. malloc(sizeof string_to_write); allocates sizeof(char *) bytes, not the number of bytes your string needs. You also do not assign the allocated block to anything.
2. Allocate enough space for the string and copy it before calling screen_write:
char *string_to_write = "Some text\nSome other text\nNewtext";
char *ptr;
ptr = malloc(strlen(string_to_write) + 1);
strcpy(ptr, string_to_write);
screen_write(ptr,ALIGN_LEFT_TOP,I2C0);

Unable to predict the output of the following program [duplicate]

This question already has answers here:
Returning an array using C
(8 answers)
Closed 8 years ago.
I have some idea of what a dangling pointer is, and I know that the following program produces one, but I couldn't understand the output of the program.
char *getString()
{
char str[] = "Stack Overflow ";
return str;
}
int main()
{
char *s=getString();
printf("%c\n",s[1]);
printf("%s",s); // Statement -1
printf("%s\n",s); // Statement -2
return 0;
}
The output of the following program is
t
If only Statement-1 is there, the output is some garbage values.
If only Statement-2 is there, the output is a newline.
Your code shows undefined behaviour, as you're returning the address of a local variable.
str no longer exists once the getString() function has finished execution and returned.
As for the question,
if only Statement-1 is there then output is some garbage values; if only Statement-2 is there then output is new line
No explanations. Once your program exhibits undefined behaviour, the output cannot be predicted, that's all. [who knows, it might print your cell phone number, too, or a daemon may fly out of my nose]
As for the one part that does have a simple explanation: adding a \n in printf() causes the output buffer to be flushed immediately when stdout is line buffered, which is typically the case when it is connected to a terminal.
Solution:
You can do the job in either of the two ways stated below:
Take a pointer, allocate memory dynamically inside getString(), and return the pointer (I'd recommend this). Also, free() it later in main() once you're done.
Make char str[] static so that its lifetime is not limited to the function call (not so good, but it will still do the job).
Your str in getString is a local variable, which is allocated on the stack; when the function returns, it doesn't exist anymore.
I suggest you rewrite getString() like this
char *getString()
{
char str[] = "Stack Overflow ";
char *tmp = malloc(strlen(str) + 1); /* +1 for the terminating '\0' */
if (tmp != NULL) memcpy(tmp, str, strlen(str) + 1); /* copy the terminator too */
return tmp;
}
and you need to add
free(s);
before return 0;
In this version, the pointer tmp points to a block of memory on the heap, which stays valid until you free() it (or the program ends).
It is worth reading up on the difference between the stack and the heap.
Besides, there is still another way: use a static variable instead.
char *getString()
{
static char str[] = "Stack Overflow ";
return str;
}
PS: Getting the correct output from printf("%c\n",s[1]); is just a coincidence. Nothing had overwritten that part of the stack yet when you returned from the function, but eventually something will.
The array is returned as a pointer, yet the array itself is garbage after the function returns. Just use the static modifier.
As for s[1] being correct: it is evaluated for the first printf after you get the dangling pointer, so the stack at that point is (probably) still intact. Recall that the stack is used for function calls and local variables only (in DOS it could also be used by system interrupts, but that is no longer the case). So before the first printf call, when s[1] is evaluated, s[] still holds the old contents; afterwards it does not, because printf's own code has overwritten that memory. None of this is guaranteed, since the behaviour is undefined. I hope it's clear now.
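A third option, not mentioned in the answers above but common in C, is to let the caller own the storage. The sketch below is only an illustration: get_string and its return convention are hypothetical names chosen here. Nothing is allocated and nothing dangles, because the bytes are written into a buffer whose lifetime the caller controls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of getString(): copies the text into a buffer
   supplied by the caller. Returns 0 on success, -1 if cap is too small. */
int get_string(char *dst, size_t cap)
{
    static const char text[] = "Stack Overflow ";

    assert(dst != NULL);
    if (cap < sizeof text)          /* sizeof counts the terminating '\0' */
        return -1;
    memcpy(dst, text, sizeof text);
    return 0;
}
```

In main() this would be used as char s[32]; if (get_string(s, sizeof s) == 0) printf("%s\n", s); with no free() needed afterwards.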

Unexpected error with the strtok function with pointers in c [duplicate]

This question already has answers here:
strtok program crashing
(4 answers)
Closed 8 years ago.
I'm just trying to understand how the strtok() function works below:
#define _CRT_SECURE_NO_WARNINGS
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(){
char* p="abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
char* q=strtok(p, ",");
printf("%s", q);
return 0;
}
I was trying to run this simple code but I keep getting a weird error that I can't understand at all.
This is the error I keep getting:
It's probably not relevant, but because my command prompt kept closing I changed some settings in my project and added Console (/SUBSYSTEM:CONSOLE) somewhere. I first tried to run the program without debugging, which didn't run it at all; then I ran it with debugging, and that's how I got the error in the link.
Instead of
char* p="abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
use
char p[] = "abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
because, as per the man page of strtok(), it modifies the input string. In your code, the input is a string literal, which is read-only. That's why it produces the segmentation fault.
Related quote:
Be cautious when using these functions. If you do use them, note that:
These functions modify their first argument.
First, to fix the segmentation fault:
Change char *p to char p[].
The reason is that strtok() writes a '\0' string terminator wherever it finds a delimiter character.
In your case, char* p="abc,def,ghi,jkl,mno,pqr,stu,vwx,yz"; points to a string literal stored in a read-only section of the program.
Trying to modify a read-only location leads to the segmentation fault.
You may not use string literals with function strtok because the function tries to change the string passed to it as an argument.
Any attempt to change a string literal results in undefined behaviour.
According to the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
Change this definition
char* p="abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
to
char s[] = "abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
//...
char* q = strtok( s, "," );
For example function main can look like
int main()
{
char s[] = "abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
char* q = strtok( s, "," );
while ( q != NULL )
{
puts( q );
q = strtok( NULL, "," );
}
return 0;
}
char* p="abc,def,ghi,jkl,mno,pqr,stu,vwx,yz"; makes p point to read-only memory. Really the type of p should be const char*.
Attempting to change that string in any way gives undefined behaviour, so using strtok on it is a very bad thing to do. To remedy this, use
char p[]="abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
That creates a modifiable array (on the stack, if it is declared inside a function) initialised with a copy of the literal, so modifications to it are permitted. p decays to a pointer in most expressions, including when it is passed as an argument to strtok.

Why am I getting a segfault on string tokenizer function?

The code:
#include <string.h>
#include <libgen.h>
#include <stdio.h>
int main()
{
char *token = strtok(basename("this/is/a/path.geo.bin"), ".");
if (token != NULL){
printf( " %s\n", token );
}
return(0);
}
If I extract the filepath and put it into a char array it works perfectly. However, like this I'm getting a segfault.
It seems that the function basename simply returns a pointer into the string literal "this/is/a/path.geo.bin", or even tries to change it.
However string literals may not be changed. Any attempt to change a string literal results in undefined behaviour. And function strtok changes the string passed to it as an argument.
According to the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
Thus the program could look like
int main()
{
char name[] = "this/is/a/path.geo.bin";
char *token = strtok(basename( name ), ".");
if (token != NULL){
printf( " %s\n", token );
}
return(0);
}
From the documentation of basename():
char *basename(char *path);
Both dirname() and basename() may modify the contents of path, so it may be desirable to pass a copy when calling one of these functions.
These functions may return pointers to statically allocated memory which may be overwritten by subsequent calls. Alternatively, they may return a pointer to some part of path, so that the string referred to by path should not be modified or freed until the pointer returned by the function is no longer required.
In addition strtok() is known to modify the string passed to it:
Be cautious when using these functions. If you do use them, note that:
These functions modify their first argument.
These functions cannot be used on constant strings.
This is likely the source of your problem as string literals must not be modified:
If the program attempts to modify such an array, the behavior is undefined.
You should work around that by passing a modifiable copy of the path instead of the literal.
For reference:
http://linux.die.net/man/3/basename
http://linux.die.net/man/3/strtok
What is the type of string literals in C and C++?
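A cautious sketch that follows both man-page warnings at once, assuming the POSIX basename() from <libgen.h>. The name stem_of is hypothetical. The path is copied before basename() sees it, and basename()'s result is copied again before strtok() modifies it, since that result may point into static storage.

```c
#include <assert.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes the part of the file name before the first '.'
   into out. Returns 0 on success, -1 on failure. path is never modified. */
int stem_of(const char *path, char *out, size_t cap)
{
    assert(path != NULL && out != NULL);

    size_t size = strlen(path) + 1;
    char *copy = malloc(size);          /* basename() may modify its argument */
    if (copy == NULL)
        return -1;
    memcpy(copy, path, size);

    char *base = basename(copy);        /* may point into copy or static memory */

    size_t base_size = strlen(base) + 1;
    char *name = malloc(base_size);     /* private buffer for strtok() to modify */
    int rc = -1;
    if (name != NULL) {
        memcpy(name, base, base_size);
        char *token = strtok(name, ".");
        if (token != NULL && strlen(token) < cap) {
            strcpy(out, token);
            rc = 0;
        }
        free(name);
    }
    free(copy);
    return rc;
}
```

With the path from the question, stem_of("this/is/a/path.geo.bin", out, sizeof out) leaves "path" in out, and the literal itself is never written to.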

Problem with C program never returning from strtok() function

I am working on a university assignment and I've been racking my brain over a weird problem where my program calls strtok and never returns.
My code looks like:
int loadMenuDataIn(GJCType* menu, char *data)
{
char *lineTokenPtr;
int i;
lineTokenPtr = strtok(data, "\n");
while (lineTokenPtr != NULL) {
/* ... */
}
}
I've looked at a bunch of sites on the web, but I can't see anything wrong with the way that I am using strtok, and I can't determine why my code would get stuck on the line lineTokenPtr = strtok(data, "\n");
Can anyone help me shed some light on this?
(Using OSX and Xcode if it makes any difference)
Have you checked the contents of the argument? Is it \0 terminated?
Is the argument that you pass writable memory? strtok writes to the buffer that it gets as its first argument when it tokenizes the string.
In other words, if you write
char* mystring = "hello\n";
strtok(mystring,"\n"); // you get problems
The function strtok() replaces the actual token delimiting symbols in the character string with null (i.e., \0) chars, and returns a pointer to the start of the token in the string. So after repeated calls to strtok() with a newline delimiting symbol, a string buffer that looked like
"The fox\nran over\nthe hill\n"
in memory will be literally modified in-place and turned into
"The fox\0ran over\0the hill\0"
with char pointers returned from strtok() that point to the strings The fox\0, ran over\0, and the hill\0. No new memory is allocated: the original string memory is modified in place, which means it's important not to pass a string literal or any other read-only memory.
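When the input really must stay untouched (for example because it is a literal), one alternative is to find token boundaries with strspn() and strcspn(), which only read the string. The sketch below is an illustration of that idea, not a drop-in replacement for strtok; next_token is a hypothetical name, and tokens are reported as a start pointer plus a length because no '\0' is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical non-mutating tokenizer. *cursor is the current read position.
   Returns the start of the next token and stores its length in *len,
   or returns NULL when no tokens remain. The string is never written to. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    assert(cursor != NULL && *cursor != NULL && delims != NULL && len != NULL);

    const char *start = *cursor + strspn(*cursor, delims);  /* skip delimiters */
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    *len = strcspn(start, delims);      /* token runs up to the next delimiter */
    *cursor = start + *len;
    return start;
}
```

A caller would loop while next_token() returns non-NULL and print each token with printf("%.*s\n", (int)len, tok), since the tokens are not null-terminated.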

Resources