Segmentation Fault when using strtok - c

I get a segmentation fault when using char *s in main. If I use char s[100] or something like that, everything is OK. Why is that? SIGSEGV is raised when I call the find_short(char *s) function, on the line char *token = strtok(s, delim);. This is my code:
#include <sys/types.h>
#include <string.h>
#include <limits.h>
#include <stdio.h>
int find_short(char *s)
{
int min = INT_MAX;
const char delim[2] = " ";
char *token = strtok(s, delim);
while(token != NULL) {
int len = (int)strlen(token);
if (min > len)
min = len;
token = strtok(NULL, delim);
}
return min;
}
int main()
{
char *s = "lel qwew dasdqew";
printf("%d",find_short(s));
return 0;
}

The line:
char *s = "lel qwew dasdqew";
creates a pointer to a string literal, which is typically stored in read-only memory.
Modifying a string literal is undefined behavior, so you must not change its contents.
The strtok function modifies its argument by writing \0 at the token-delimiter locations, and here that write fails because the string cannot be modified.
Changing the line to:
char s[] = "lel qwew dasdqew";
Now makes s an array of local data that you are free to change. strtok will now work because it can change the array.

Your main mistake is that you chose the wrong function for the task. :)
I will come back to this below.
As for the current program: string literals in C are immutable, even though they do not have const-qualified character array types. Any attempt to change a string literal results in undefined behavior. And the function strtok changes the string passed to it, writing terminating zeros between substrings.
Instead of strtok you should use the string functions strspn and strcspn. They do not change the passed argument, so with these functions you can also process string literals.
Here is a demonstrative program.
#include <stdio.h>
#include <string.h>
size_t find_short( const char *s )
{
const char *delim= " \t";
size_t shortest = 0;
while ( *s )
{
s += strspn( s, delim );
const char *p = s;
s += strcspn( s, delim );
size_t n = s - p;
if ( shortest == 0 || ( n && n < shortest ) ) shortest = n;
}
return shortest;
}
int main(void)
{
const char *s = "lel qwew dasdqew";
printf( "%zu", find_short( s ) );
return 0;
}
Its output is
3

Related

How to use toupper library function in C?

I need to print the initials of a name, like tyler jae woodbury would print TJW, but I can't seem to print the uppercase initials.
This is my code:
#include <stdio.h>
#include <cs50.h>
#include <ctype.h>
#include <string.h>
string get_initials(string name, char initials[]);
int main(void)
{
// User input
string name = get_string("Name: ");
// Gets the users initials
char initials[10];
get_initials(name, initials);
printf("%s\n", initials);
}
string get_initials(string name, char initials[])
{
int counter = 0;
for (int i = 0, n = strlen(name); i < n; i++)
{
if (name[i] == ' ')
{
initials[counter] = name[i+1];
counter++;
}
else if (i == 0)
{
initials[counter] = name[i];
counter++;
}
}
return initials;
}
I know that usually toupper() is used for chars, and the print statement declares a string, but I don't know what to do.
The function is incorrect.
For starters, in general a string can contain adjacent spaces between words or have trailing spaces.
Secondly, the function does not build a string because it does not append the terminating zero character '\0' to the destination array.
Also the call of strlen is inefficient and redundant.
To convert a character to upper case, use the standard function toupper declared in the header <ctype.h>.
Also the function declaration is confusing:
string get_initials(string name, char initials[]);
Either use
string get_initials(string name, string initials);
or it will be better to write
char * get_initials( const char *name, char *initials);
The function can be defined the following way as shown in the demonstration program below.
#include <stdio.h>
#include <string.h>
#include <ctype.h>
char * get_initials( const char *name, char *initials )
{
const char *blank = " \t";
char *p = initials;
while ( name += strspn( name, blank ), *name )
{
*p++ = toupper( ( unsigned char )*name );
name += strcspn( name, blank );
}
*p = '\0';
return initials;
}
int main( void )
{
char name[] = " tyler jae woodbury ";
char initials[10];
puts( get_initials( name, initials ) );
}
The program output is
TJW

Why my returned value of strchr is ignored?

I have to make a function, that will code my sentence like this: I want to code all words with an o, so for example I love ice cream becomes I **** ice cream.
But my function ignores the result of strchr. And I don't know why.
This is my code:
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#define LEN 1000
char *Shift(char *str, char *let) {
const char *limits = " ,-;+.";
char copy[LEN];
strcpy(copy, str);
char *p;
char *ptr;
ptr = strtok(copy, limits);
for (int j = 0; ptr != NULL; ptr = strtok(NULL, limits), ++j) {
int len = 0;
if (strchr(ptr, let) != NULL) {
p = strstr(str, ptr);
for (int i = 0; i < strlen(ptr); i++) {
p[i] = "*";
}
}
}
return str;
}
int main() {
char *s = Shift("I love my cocktail", "o");
puts(s);
}
Expected output is: I **** my ********
but I've got just printed the original string
For starters the function strchr is declared like
char *strchr(const char *s, int c);
that is, its second parameter has the type int and the expected argument must represent a character. You, however, are calling the function with an argument of the type char *, which results in undefined behavior:
if (strchr(ptr, let) != NULL) {
It seems you mean
if (strchr(ptr, *let) != NULL) {
Also you may not change a string literal. Any attempt to change a string literal results in undefined behavior and this code snippet
p = strstr(str, ptr);
for (int i = 0; i < strlen(ptr); i++) {
p[i] = "*";
}
tries to change the string literal passed to the function
char *s = Shift("I love my cocktail", "o");
And moreover in this statement
p[i] = "*";
you are trying to assign a pointer of the type char * to a character. At least you should write
p[i] = '*';
If you want to change an original string you need to store it in an array and pass to the function the array instead of a string literal. For example
char s[] = "I love my cocktail";
puts( Shift( s, "o" ) );
Note that there is little point in declaring the second parameter with the type char *. Declare its type as char.
Also the function name Shift is confusing. You could name it for example like Hide or something else.
Here is a demonstration program.
#include <stdio.h>
#include <string.h>
char * Hide( char *s, char c )
{
const char *delim = " ,-;+.";
for ( char *p = s += strspn( s, delim ); *p; p += strspn( p, delim ) )
{
char *q = p;
p += strcspn( p, delim );
char *tmp = q;
while ( tmp != p && *tmp != c ) ++tmp;
if ( tmp != p )
{
for ( ; q != p; ++q ) *q = '*';
}
}
return s;
}
int main( void )
{
char s[] = "I love my cocktail";
puts(s);
puts( Hide( s, 'o' ) );
}
The program output is
I love my cocktail
I **** my ********
The for loop
for ( ; q != p; ++q ) *q = '*';
within the function can be rewritten as a call of memset
memset( q, '*', p - q );
There are multiple problems:
copying the string to a fixed length local array char copy[LEN] will cause undefined behavior if the string is longer than LEN-1. You should allocate memory from the heap instead.
you work on a copy of the string to use strtok to split the words, but you do not use the correct method to identify the parts of the original string to patch.
you should pass a character to strchr(), not a string.
patching the string with p[i] = "*" does not work: the address of the string literal "*" is converted to a char and stored into p[i]... this conversion is meaningless: you should use p[i] = '*' instead.
attempting to modify a string literal has undefined behavior anyway. You should define a modifiable array in main and pass it to Shift.
Here is a corrected version:
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *Shift(char *str, char letter) {
const char *limits = " ,-;+.";
char *copy = strdup(str);
char *ptr = strtok(copy, limits);
while (ptr != NULL) {
if (strchr(ptr, letter)) {
while (*ptr != '\0') {
str[ptr - copy] = '*';
ptr++;
}
}
ptr = strtok(NULL, limits);
}
free(copy);
return str;
}
int main() {
char s[] = "I love my cocktail";
puts(Shift(s, 'o'));
return 0;
}
The above code still has undefined behavior if the memory cannot be allocated. Here is a modified version that operates in place to avoid this problem:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
char *Shift(char *str, char letter) {
char *ptr = str;
while ((ptr = strchr(ptr, letter)) != NULL) {
char *p = ptr;
while (p > str && isalpha((unsigned char)p[-1]))
*--p = '*';
while (isalpha((unsigned char)*ptr))
*ptr++ = '*';
}
return str;
}
int main() {
char s[] = "I love my cocktail";
puts(Shift(s, 'o'));
return 0;
}
Note that you can also search for multiple characters at a time using strcspn():
#include <ctype.h>
#include <stdio.h>
#include <string.h>
char *Shift(char *str, const char *letters) {
char *ptr = str;
while (*(ptr += strcspn(ptr, letters)) != '\0') {
char *p = ptr;
while (p > str && isalpha((unsigned char)p[-1]))
*--p = '*';
while (isalpha((unsigned char)*ptr))
*ptr++ = '*';
}
return str;
}
int main() {
char s[] = "I love my Xtabentun cocktail";
puts(Shift(s, "oOxX"));
return 0;
}

Getting no output why is that?

I'm learning C and I'm getting no output for some reason. I'm probably not returning the value as I should, but how should I? (I described the problem in the comments below :D)
Any help is appreciated!
#include <stdio.h>
#include <ctype.h>
#include <string.h>
char *makeUpperCase (char *string);
int main()
{
printf(makeUpperCase("hello")); //Here there is no output, and when I'm trying with the format %s it returns null
return 0;
}
char *makeUpperCase(char *string)
{
char str_out[strlen(string) + 1];
for (int i = 0; i < strlen(string); ++i)
str_out[i] = toupper(string[i]);
printf(str_out); //Here I get the output.
return str_out;
}
You declared within the function a local variable length array that will not be alive after exiting the function
char str_out[strlen(string) + 1];
So your program has undefined behavior.
If the function parameter is declared without the qualifier const, it suggests that the function changes the passed string in place. Such a function can be defined the following way
char * makeUpperCase( char *string )
{
for ( char *p = string; *p != '\0'; ++p )
{
*p = toupper( ( unsigned char )*p );
}
return string;
}
Otherwise you need to allocate dynamically a new string. For example
char * makeUpperCase( const char *string )
{
char *str_out = malloc( strlen( string ) + 1 );
if ( str_out != NULL )
{
char *p = str_out;
for ( ; *string != '\0'; ++string )
{
*p++ = toupper( ( unsigned char )*string );
}
*p = '\0';
}
return str_out;
}
Here is a demonstration program.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *makeUpperCase( const char *string )
{
char *str_out = malloc( strlen( string ) + 1 );
if (str_out != NULL)
{
char *p = str_out;
for (; *string != '\0'; ++string)
{
*p++ = toupper( ( unsigned char )*string );
}
*p = '\0';
}
return str_out;
}
int main( void )
{
char *p = makeUpperCase( "hello" );
puts( p );
free( p );
}
The program output is
HELLO
The problem is that printf() buffers its output using a somewhat complex mechanism. When you are writing to a terminal, printf() buffers everything until the buffer fills (which is not going to happen with just the string "hello") or until it receives a '\n' character (which you have not used in your statement).
So, to force a buffer flush, just add the following statement
fflush(stdout);
after your printf() call.

Segmentation fault on strcat

I have recently begun working on learning the C language and have repeatedly run into an error in which calling the strcat function from the <string.h> module results in a segmentation fault. I've searched for the answers online, including on this stackoverflow post, without success. I thought this community might have a more personal insight into the problem, as the general solutions don't seem to be working. Might be user error, might be a personal issue with the code. Take a look.
#include <stdio.h>
#include <string.h>
char * deblank(const char str[]){
char *new[strlen(str)];
char *buffer = malloc(strlen(new)+1);
for (int i=0; i<strlen(*str); i++){
if(buffer!=NULL){
if(str[i]!=" "){
strcat(new,str[i]); //Segmentation fault
}
}
}
free(buffer);
return new;
}
int main(void){
char str[] = "This has spaces in it.";
char new[strlen(str)];
*new = deblank(str);
puts(new);
}
I've placed a comment on the line I've traced the segmentation fault back to. The following is some Java to make some sense out of this C code.
public class deblank {
public static void main(String[]args){
String str = "This has space in it.";
System.out.println(removeBlanks(str));
}
public static String removeBlanks(String str){
String updated = "";
for(int i=0; i<str.length(); i++){
if(str.charAt(i)!=' '){
updated+=str.charAt(i);
}
}
return updated;
}
}
Any insights into this error will be much appreciated. Please point out typos as well... I've been known to make them. Thanks.
OK, let's do this.
#include <stdio.h>
#include <string.h>
char * deblank(const char str[]){
char *new[strlen(str)];
^ This line creates an array of pointers, not a string.
char *buffer = malloc(strlen(new)+1);
malloc is undeclared. Missing #include <stdlib.h>. Also, you should check for allocation failure here.
strlen(new) is a type error. strlen takes a char * but new is (or rather evaluates to) a char **.
for (int i=0; i<strlen(*str); i++){
strlen(*str) is a type error. strlen takes a char * but *str is a char (i.e. a single character).
i<strlen(...) is questionable. strlen returns size_t (an unsigned type) whereas i is an int (signed, and possibly too small).
Calling strlen in a loop is inefficient because it has to walk the whole string to find the end.
if(buffer!=NULL){
This is a weird place to check for allocation failure. Also, you don't use buffer anywhere, so why create/check it at all?
if(str[i]!=" "){
str[i]!=" " is a type error. str[i] is a char whereas " " is (or rather evaluates to) a char *.
strcat(new,str[i]); //Segmentation fault
This is a type error. strcat takes two strings (char *), but new is a char ** and str[i] is a char. Also, the first argument to strcat must be a valid string but new is uninitialized.
}
}
}
free(buffer);
return new;
new is a local array in this function. You're returning the address of its first element, which makes no sense: As soon as the function returns, all of its local variables are gone. You're returning an invalid pointer here.
Also, this is a type error: deblank is declared to return a char * but actually returns a char **.
}
int main(void){
char str[] = "This has spaces in it.";
char new[strlen(str)];
*new = deblank(str);
This is a type error: *new is a char but deblank returns a char *.
puts(new);
puts takes a string, but new is essentially garbage at this point.
}
You can't use strcat like you did, it is intended to catenate a C-string at the end of another given one. str[i] is a char not a C-string (remember that a C-string is a contiguous sequence of chars the last being the NUL byte).
You also cannot compare strings with standard comparison operators, if you really need to compare strings then there is a strcmp function for it. But you can compare chars with standard operators as char is just a kind of integer type.
This should do the trick:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char * deblank(const char str[]) {
char *buffer = malloc(strlen(str)+1); // allocate space to contains as much char as in str, included ending NUL byte
for (int i=0, j=0; i<strlen(str)+1; i++) { // for every char in str, included the ending NUL byte
if (str[i]!=' ') { // if not blank
buffer[j++] = str[i]; // copy
}
}
return buffer; // return a newly constructed C-string
}
int main(void){
char str[] = "This has spaces in it.";
char *new = deblank(str);
puts(new);
free(new); // release the allocated memory
}
So, not sure whether this helps you, but a C code doing the same as your Java code would look like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static char *removeBlanks(const char *str)
{
char *result = malloc(strlen(str) + 1);
if (!result) exit(1);
const char *r = str;
char *w = result;
while (*r)
{
// copy each character except when it's a blank
if (*r != ' ') *w++ = *r;
++r;
}
*w = 0; // terminate the result to be a string (0 byte)
return result;
}
int main(void)
{
const char *str = "This has spaces in it.";
char *new = removeBlanks(str);
puts(new);
free(new);
return 0;
}
I wouldn't recommend naming a variable new ... if you ever want to use C++, it is a reserved keyword there.
I tried compiling with warnings enabled, here are some you should fix.
You need to include stdlib.h
char *new[strlen(str)] creates an array of char* not of char, so not really a string. Change it to char new[strlen(str) + 1] (the extra element holds the terminating null character).
To check if str[i] is a space, you compare it to the space character ' ', not a string whose only character is a space " ". So change it to str[i]!=' '
strcat takes a string as the second argument and not a character, like you're giving it with str[i].
Also, what are you using buffer for?
Another mistake, is that you probably assumed that uninitialized arrays take zero values. The new array has random values, not zero/null. strcat concatenates two strings, so it would try to put the string in its second argument at the end of the first argument new. The "end" of a string is the null character. The program searches new for the first null character it can find, and when it finds this null, it starts writing the second argument from there.
But because new is uninitialized, the program might not find a null character in new, and it would keep searching further than the length of new, strlen(str), continuing the search in unallocated memory. That is probably what causes the segmentation fault.
There can be three approaches to the task.
The first one is to update the string "in place". In this case the function can look something like the following way
#include <stdio.h>
#include <ctype.h>
#include <iso646.h>
char * deblank( char s[] )
{
size_t i = 0;
while ( s[i] and not isblank( s[i] ) ) ++i;
if ( s[i] )
{
size_t j = i++;
do
{
if ( not isblank( s[i] ) ) s[j++] = s[i];
} while( s[i++] );
}
return s;
}
int main(void)
{
char s[] = "This has spaces in it.";
puts( s );
puts( deblank( s ) );
return 0;
}
The program output is
This has spaces in it.
Thishasspacesinit.
Another approach is to copy the source string in a destination character array skipping blanks.
In this case the function will have two parameters: the source array and the destination array. The destination array must be at least as large as the source array, because in general the source array may contain no blanks at all.
#include <stdio.h>
#include <ctype.h>
#include <iso646.h>
char * deblank( char *s1, const char *s2 )
{
char *t = s1;
do
{
if ( not isblank( *s2 ) ) *t++ = *s2;
} while ( *s2++ );
return s1;
}
int main(void)
{
char s1[] = "This has spaces in it.";
char s2[sizeof( s1 )];
puts( s1 );
puts( deblank( s2, s1 ) );
return 0;
}
The program output will be the same as shown above.
Pay attention to this declaration
char s2[sizeof( s1 )];
The size of the destination string in general should be not less than the size of the source string.
And at last the third approach is to create an array dynamically inside the function and return a pointer to its first element.
In this case it is desirable first to count the non-blank characters in the source array, so that the destination array can be allocated with the appropriate size.
To use the functions malloc and free you need to include the following header
#include <stdlib.h>
The function can be implemented as it is shown in the demonstrative program.
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <iso646.h>
char * deblank( const char *s )
{
size_t n = 1; /* one byte reserved for the terminating zero character */
for ( const char *t = s; *t; ++t )
{
if ( not isblank( *t ) ) ++n;
}
char *s2 = malloc( n );
if ( s2 != NULL )
{
char *t = s2;
do
{
if ( not isblank( *s ) ) *t++ = *s;
} while ( *s++ );
}
return s2;
}
int main(void)
{
char s1[] = "This has spaces in it.";
char *s2 = deblank( s1 );
puts( s1 );
if ( s2 ) puts( s2 );
free( s2 );
return 0;
}
The program output is the same as for the two previous programs.
As for the standard C function strcat, it concatenates two strings.
For example
#include <stdio.h>
#include <string.h>
int main(void)
{
char s1[12] = "Hello ";
char *s2 = "World";
puts( strcat( s1, s2 ) );
return 0;
}
The destination array (in this case s1) must have enough space to be able to append a string.
There is another function in the C Standard, strncat, that can be used to append a single character to a string. For example, the above program can be rewritten the following way
#include <stdio.h>
#include <string.h>
int main(void)
{
char s1[12] = "Hello ";
char *s2 = "World";
for ( size_t i = 0; s2[i] != '\0'; i++ )
{
strncat( s1, &s2[i], 1 );
}
puts( s1 );
return 0;
}
But such an approach is not efficient for your original task, because each time the function is called it has to find the terminating zero in the destination string in order to append a character.
You can also try it recursively:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
void deblank(const char* str, char *dest) {
if (!*str) {*dest = '\0';return;}
// when we encounter a space we skip
if (*str == ' ') {
deblank(str+1, dest);
return;
}
*dest = *str;
deblank(str+1, dest+1);
}
int main(void) {
const char *str = "This has spaces in it.";
char *output = malloc(strlen(str)+1);
deblank(str, output);
puts(output);
free(output);
}

How to safely parse a tab-delimited string?

How can I safely parse a tab-delimited string? For example:
test\tbla-bla-bla\t2332
strtok() is a standard function for parsing strings with arbitrary delimiters. It is, however, not thread-safe. Your C library of choice might have a thread-safe variant.
Another standard-compliant way (just wrote this up, it is not tested):
#include <string.h>
#include <stdio.h>
int main()
{
char string[] = "foo\tbar\tbaz";
char * start = string;
char * end;
while ( ( end = strchr( start, '\t' ) ) != NULL )
{
// %.*s prints at most the given number of characters; the count is
// passed as an int argument just before the string
// (your token is not zero-terminated!)
printf( "%.*s\n", (int)( end - start ), start );
start = end + 1;
}
// start points to last token, zero-terminated
printf( "%s", start );
return 0;
}
Use strtok_r instead of strtok (if it is available). It has similar usage, except it is reentrant, and it does not modify the string like strtok does. [Edit: Actually, I misspoke. As Christoph points out, strtok_r does replace the delimiters by '\0'. So, you should operate on a copy of the string if you want to preserve the original string. But it is preferable to strtok because it is reentrant and thread safe]
strtok will leave your original string modified. It replaces the delimiter with '\0'. And if your string happens to be a constant stored in read-only memory (some compilers will do that), you may actually get an access violation.
Using strtok() from string.h.
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "test\tbla-bla-bla\t2332";
char * pch;
pch = strtok (str," \t");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " \t");
}
return 0;
}
You can use any regex library or even the GLib GScanner, see here and here for more information.
Yet another version; this one separates the logic into a new function
#include <stdio.h>
static _Bool next_token(const char **start, const char **end)
{
if(!*end) *end = *start; // first call
else if(!**end) // check for terminating zero
return 0;
else *start = ++*end; // skip tab
// advance to terminating zero or next tab
while(**end && **end != '\t')
++*end;
return 1;
}
int main(void)
{
const char *string = "foo\tbar\tbaz";
const char *start = string;
const char *end = NULL; // NULL value indicates first call
while(next_token(&start, &end))
{
// print substring [start,end[
printf("%.*s\n", (int)(end - start), start);
}
return 0;
}
If you need a binary safe way to tokenize a given string:
#include <string.h>
#include <stdio.h>
void tokenize(const char *str, const char delim, const size_t size)
{
const char *start = str, *next;
const char *end = str + size;
while (start < end) {
if ((next = memchr(start, delim, end - start)) == NULL) {
next = end;
}
printf("%.*s\n", (int)(next - start), start);
start = next + 1;
}
}
int main(void)
{
char str[] = "test\tbla-bla-bla\t2332";
size_t len = strlen(str);
tokenize(str, '\t', len);
return 0;
}
