C: Assign strtok token to char * Segfault

Why do I get a segfault with the below code?
#include <stdio.h>
int main()
{
char * tmp = "0.1";
char * first = strtok(tmp, ".");
return 0;
}
Edited:
#include <stdio.h>
int main()
{
char tmp[] = "0.1";
char *first = strtok(tmp, ".");
char *second = strtok(tmp, "."); // Yes, should be NULL
printf("%s\n", first);
printf("Hello World\n");
return 0;
}
The segfault can be reproduced at the online gdb here:
https://www.onlinegdb.com/online_c_compiler

The problem with your first code is that tmp points at a string literal, which is read-only. When strtok tries to modify the string, it crashes.
The problem with your second code is a missing include:
#include <string.h>
This missing header means strtok is undeclared in your program. The C compiler assumes all undeclared functions return int. This is not true for strtok, which returns char *. The likely cause of the crash in your example is that the code is running on a 64-bit machine where pointers are 8 bytes wide but int is only 4 bytes. This messes up the return value of strtok, so first is a garbage pointer (and printf crashes when it tries to use it).
You can confirm this for yourself by doing
char *first = strtok(tmp, ".");
printf("%p %p\n", (void *)tmp, (void *)first);
The addresses printed for tmp and first should be identical (and they are if you #include <string.h>).
The funny thing is that gcc can warn you about these problems:
main.c: In function 'main':
main.c:6:19: warning: implicit declaration of function 'strtok' [-Wimplicit-function-declaration]
char *first = strtok(tmp, ".");
^
main.c:6:19: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
main.c:7:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
char *second = strtok(tmp, "."); // Yes, should be NULL
^
... and onlinegdb will show you these warnings, but only if compilation fails!
So to see compiler warnings on onlinegdb, you have to add a hard error to the code (e.g. by putting a # in the last line of the file).

The behaviour of the function strtok goes something like this:
Accept a string str (or NULL) and a string of delimiter characters.
The strtok function then processes the given string str, reading it character by character until it encounters a character that is one of the provided delimiter characters.
If the number of characters it has read before reaching the delimiter is > 0, it replaces the delimiter character with '\0' and returns a pointer to the first character of this iteration that was not a delimiter character.
Else, if the number of characters it has read before reaching the delimiter is == 0, it continues iterating over the rest of the string without replacing this delimiter character.
I've created some code snippets which will help you better understand the nature of the function, https://ideone.com/6NCcrR and https://ideone.com/KVI5n4 (<- taking excerpts from your code)
Now to answer your question, including string.h header and setting
char tmp[] = "0.1"; should solve your issue.

With char * tmp = "0.1";, tmp points to a string literal that cannot be modified and strtok tries to modify the string by replacing . with '\0'.
Another approach, avoiding the segfault, would be to use strchr to find the dot and the precision field to print a limited number of characters. The sub-strings may be copied to other variables as well.
#include <stdio.h>
#include <string.h>
int main ( void) {
char * tmp = "0.1";
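char * first = strchr(tmp, '.'); // points at the dot, or NULL if there is none
if ( first) {
char * second = first + 1; // computed only once first is known to be non-NULL
printf ( "%.*s\n", (int)(first - tmp), tmp); // the precision argument must be an int
printf ( "%s\n", second);
}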
printf ( "Hello World\n");
return 0;
}

In the edited code, tmp is not a string literal, contrary to what a few answers and comments suggest.
char *tmp = "0.1" makes tmp point to a string literal.
char tmp[] = "0.1" is a character array, and all array operations (including modification) can be performed on it.
The segfault arises because no declaration of strtok is visible, as string.h is not included, and gcc (like other C compilers that still accept implicit declarations) assumes a return type of int.
The size of int varies by platform. If int is 4 bytes and a pointer is 8 bytes, the call effectively behaves like
char *first = (int)strtok(tmp,".");
The pointer returned by strtok is truncated, and when you print it you dereference the value stored in first, which may point outside any valid memory region, resulting in a segmentation fault or other undefined behavior.
Casting the output of strtok to a type that is 8 bytes (long in my case) may appear to avoid the segfault, but this is not a clean way to do it.
Include the proper header files to avoid undefined behavior.

Related

Wrong output after modifying an array in a function (in C)

I'm a C noob and I'm having problems with the following code:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
void split_string(char *conf, char *host_ip[]){
long unsigned int conf_len = sizeof(conf);
char line[50];
strcpy(line, conf);
int i = 0;
char* token;
char* rest = line;
while ((token = strtok_r(rest, "_", &rest))){
host_ip[i] = token;
printf("-----------\n");
printf("token: %s\n", token);
i=i+1;
}
}
int main(){
char *my_conf[1];
my_conf[0] = "conf01_192.168.10.1";
char *host_ip[2];
split_string(my_conf[0], host_ip);
printf("%s\n",host_ip[0]);
printf("%s\n",host_ip[1]);
}
I want to modify the host_ip array inside the split_string function and then print the 2 resulting strings in the main.
However, the 2 last printf() are only printing unknown/random characters (maybe an address?). Any help?

There are 2 problems:
First, you're returning pointers to local variables. You can avoid this by strduping the strings and freeing in the caller.
Second:
On the first call to strtok_r(), str should point to the string to be parsed, and the value of saveptr is ignored. In subsequent calls, str should be NULL, and saveptr should be unchanged since the previous call.
I.e. you must pass NULL as the first argument after the first iteration of the loop. Nowhere is it said that it is OK to use the same pointer for both arguments.
This is because the strtok_r is an almost drop-in replacement to the braindead strtok, with just one extra argument, so that you could even wrap it with a macro...
Thus we get
char *start = rest;
while ((token = strtok_r(start, "_", &rest))){
host_ip[i] = strdup(token);
printf("-----------\n");
printf("token: %s\n", token);
i++;
start = NULL;
}
and in the caller:
free(host_ip[0]);
free(host_ip[1]);

You are storing the address of a local variable (line), which lives on the stack. The stack is LIFO and holds valid data for a function's local variables only during that function's lifetime. After that, the same stack memory is reused for another function's local variables. So the data stored in line[50] is invalid once split_string returns.

Why the string initialized with the too long initializer would behave like this

#include<stdio.h>
void main()
{
int i;
char str[4] = "f4dfkjfj";
char str2[3] = "987";
char str3[2] = {'j','j','\0'};
//printf("%c\n",str[1]);
//printf("%c\n",1[str]);
puts(str);
puts(str2);
puts(str3);
}
Output observation:
Printing str2 prints the contents from both str2 and str.
Printing str3 prints str3, str2 and str.
Why do I see this behaviour when printing the strings that are not nul-terminated, where each one is concatenated with the previously defined strings until puts() encounters a "\0" character (this function prints until it encounters a nul)?
(Note: I deliberately initialised them with too long initializer strings)

(Note: I deliberately initialised them with too long initializer strings)
Then you should be aware of the side-effects, too.
The problem with all your arrays is that they are not null-terminated, so they are not strings.
Quoting C11, chapter §7.1.1, Definitions of terms, (emphasis mine)
A string is a contiguous sequence of characters terminated by and including the first null
character. [...]
Using them with string handling functions (like puts()) would invoke undefined behavior, as the functions, in search of the null-terminator, would go out of bound (i.e., outside allowed memory region) and cause the invalid memory access.
Quoting the standard again, chapter 7.21.7.9,
The puts function writes the string pointed to by s to the stream pointed to by stdout,
and appends a new-line character to the output. The terminating null character is not
written.
The expected argument is a string, which, none of the arguments in your code is.
That said, FWIW, for a hosted environment, the recommended signature of main() is int main(void), at least.

Initialization
char str[4] = "f4dfkjfj";
as well as
char str2[3] = "987";
and
char str3[2] = {'j','j','\0'};
are incorrect, because the declaration char str[4] allocates 4 bytes for data, but the data "f4dfkjfj" requires 9 bytes: 8 bytes for the visible characters and one more byte for '\0'.
UPDATE:
Let's consider the following example
#include<stdio.h>
void main()
{
int i;
char str[4] = "f4dfkjfj";
char str2[3] = "987";
printf("Address of str2 = %p and size is %u bytes\n", (void *)str2, (unsigned)sizeof(str2));
printf("Address | Data in memory\n");
char * ptr;
for (ptr = str2 - 2; ptr <= str2 + 5; ptr++) // deliberately reads outside str2 (undefined behaviour)
{
printf("%p | %c\n", (void *)ptr, *ptr);
}
}
In my Visual Studio 2013 under Windows 7 I see the following:
But if I change char str2[3] = "987"; to char str2[4] = "987"; result will be
Try puts(str2) for char str2[4] = "987"; and you will see the difference.
Note: the memory addresses (on the stack, for local variables) can differ on each run; what matters more is the data around the allocated memory (changed or not).

These lines are constraint violations. The compiler should give an error message and the behaviour of the program is completely undefined:
char str[4] = "f4dfkjfj";
char str3[2] = {'j','j','\0'};
The constraint being violated is that there are too many initializers for the array. (C11 6.7.9/2)
However char str2[3] = "987"; is correct, there is a special case that when an array is initialized from a string literal, it is allowed to ignore the null terminator if there is no room in the array. (C11 6.7.9/14)
Going on to pass str2 to a function that expects a null-terminated string would cause undefined behaviour however.

"integer from pointer without cast" when adding nullbyte to pointer

I was messing around with all of the string functions today and while most worked as expected, especially because I stopped trying to modify literals (sigh), there is one warning and oddity I can't seem to fix.
#include <stdio.h>
#include <string.h>
int main() {
char array[] = "Longword";
char *string = "Short";
strcpy(array, string); // Short
strcat(array, " "); // Short (with whitespace)
strcat(array, string); // Short Short
strtok(array, " "); // Short
if (strcmp(array, string) == 0)
{
printf("They are the same!\n");
}
char *substring = "or";
if (strstr(array, substring) != NULL)
{
printf("There's a needle in there somewhere!\n");
char *needle = strstr(array, substring);
int len = strlen(needle);
needle[len] = "\0"; // <------------------------------------------------
printf("Found it! There ya go: %s",needle);
}
printf("%s\n", array);
return 0;
}
Feel free to ignore the first few operations - I left them in because they modified array in a way, that made the strstr function useful to begin with.
The point in question is the second if statement, line 32 if you were to copy it in an editor.
(EDIT: Added arrow to the line. Sorry about that!)

This line is wrong:
needle[len] = "\0";
Double quotes make a string literal, which is an array of char that decays to a char * here. But needle[len] is a char. To make a char literal you use single quotes:
needle[len] = '\0';
See Single quotes vs. double quotes in C or C++

Your second strcat call overruns the end of array: array has room for 9 bytes ("Longword" plus the terminator), but "Short Short" needs 12. The overrun corrupts whatever happens to be after it in memory. Once that happens, the later code might do just about anything, which is why writing past the end of an array is undefined behavior.

C strcat() gives wrong appended string

I am appending a string using single character, but I am not able to get it right. I am not sure where I am making mistake. Thank you for your help in advance. The original application of the method is in getting dynamic input from user.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void main(){
int j;
char ipch=' ';
char intext[30]="What is the problem";
char ipstr[30]="";
printf("Input char: ");
j=0;
while(ipch!='\0'){
//ipch = getchar();
ipch = intext[j];
printf("%c", ipch);
strcat(ipstr,&ipch);
j++;
}
puts("\n");
puts(ipstr);
return;
}
Following is the output I am getting.
$ ./a.out
Input char: What is the problem
What is h e
p
oblem

Change
strcat(ipstr,&ipch);
to
strncat(ipstr, &ipch, 1);
This will force appending only one byte from ipch. strcat() will continue appending bytes, since there's no null terminator after the char you are appending. As others said, strcat might find a \0 somewhere in memory and then terminate, but if not, it can result in a segfault.
From the manpage:
char *strncat(char *dest, const char *src, size_t n);
The strncat() function is similar to strcat(), except that
it will use at most n characters from src; and
src does not need to be null-terminated if it contains n or more characters.

strcat requires its second argument to be a pointer to a well-formed string. &ipch does not point to a well-formed string (the character sequence of one it points to lacks a terminal null character).
You could use char ipch[2]=" "; to declare ipch. In this case also use:
strcat(ipstr,ipch); to append the character to ipstr.
ipch[0] = intext[j]; to change the character to append.
What happens when you pass &ipch to strcat in your original program is that the function strcat assumes that the string continues, and reads the next bytes in memory. A segmentation fault can result, but it can also happen that strcat reads a few garbage characters and then accidentally finds a null character.

strcat() is to concatenate strings... so passing just a char pointer is not enough... you have to put that character followed by a '\0' char, and then pass the pointer of that thing. As in
/* you must have enough space in string to concatenate things */
char string[100] = "What is the problem";
char *s = "?"; /* a proper '\0' terminated string */
strcat(string, s);
printf("%s\n", string);

The strcat function is used to concatenate two strings, not a string and a character. Syntax:
char *strcat(char *dest, const char *src);
So you need to pass two strings to the strcat function.
In your program
strcat(ipstr,&ipch);
is not a valid call. The second argument, &ipch, points to a single char rather than a null-terminated string. You should not do that: it is undefined behavior and can result in a segmentation fault.

C String parsing errors with strtok(),strcasecmp()

So I'm new to C and the whole string manipulation thing, but I can't seem to get strtok() to work. It seems everywhere everyone has the same template for strtok being:
char* tok = strtok(source,delim);
do
{
{code}
tok=strtok(NULL,delim);
}while(tok!=NULL);
So I try to do this with the delimiter being the space key, and it seems that strtok() not only reads NULL after the first run (the first entry into the while/do-while) no matter how big the string, but it also seems to wreck the source, turning the source string into the same thing as tok.
Here is a snippet of my code:
char* str;
scanf("%ms",&str);
char* copy = malloc(sizeof(str));
strcpy(copy,str);
char* tok = strtok(copy," ");
if(strcasecmp(tok,"insert"))
{
printf(str);
printf(copy);
printf(tok);
}
Then, here is some output for the input "insert a b c d e f g"
aaabbbcccdddeeefffggg
"Insert" seems to disappear completely, which I think is the fault of strcasecmp(). Also, I would like to note that I realize strcasecmp() seems to all-lower-case my source string, and I do not mind. Anyhoo, input "insert insert insert" yields absolutely nothing in output. It's as if those functions just eat up the word "insert" no matter how many times it is present. I may* end up just using some of the C functions that read the string char by char but I would like to avoid this if possible. Thanks a million guys, i appreciate the help.

With the second snippet of code you have five problems: The first is that your format for the scanf function is not standard C. The 'm' modifier is a POSIX.1-2008 extension (supported by glibc, for example) that makes scanf allocate the buffer itself, so code that must be portable can't rely on it. (See e.g. here for a good reference of the standard function.)
The second problem concerns the address-of operator on the pointer, which passes a pointer to a pointer to char (i.e. char**) to the scanf function. That is exactly what the 'm' modifier expects, but with a plain %s it would be wrong: scanf then wants a char* that points to a buffer you have already allocated.
The third problem, if you drop the 'm' modifier, is that the pointer str is uninitialized. You have to remember that uninitialized local variables are truly uninitialized, and their values are indeterminate. In reality, it means that their values will be seemingly random. So str will point to some "random" memory.
The fourth problem is with the malloc call, where you use the sizeof operator on a pointer. This will return the size of the pointer and not the length of what it points to.
The fifth problem is that copy therefore has room for only a few bytes (typically 4 or 8, depending on whether you're on a 32 or 64 bit platform, see the fourth problem), so strcpy writes past the end of the allocation for any longer string, and strtok then works on memory you don't own.
So, five problems in only four lines of code. That's pretty good! ;)

It looks like you're trying to print space delimited tokens following the word "insert" 3 times. Does this do what you want?
#include <stdio.h>
#include <string.h>
#include <strings.h> // strcasecmp() is declared here on POSIX systems
#include <stdlib.h>
int main(int argc, char **argv)
{
char str[BUFSIZ] = {0};
char *copy;
char *tok;
int i;
// safely read a string and chop off any trailing newline
if(fgets(str, sizeof(str), stdin)) {
int n = strlen(str);
if(n && str[n-1] == '\n')
str[n-1] = '\0';
}
// copy the string so we can trash it with strtok
copy = strdup(str);
// look for the first space-delimited token
tok = strtok(copy, " ");
// check that we found a token and that it is equal to "insert"
if(tok && strcasecmp(tok, "insert") == 0) {
// iterate over all remaining space-delimited tokens
while((tok = strtok(NULL, " "))) {
// print the token 3 times
for(i = 0; i < 3; i++) {
fputs(tok, stdout);
}
}
putchar('\n');
}
free(copy);
return 0;
}
