Segmentation fault on char string reference - c

I have a small C++ function which reverses a string in place:
void reverse1(string& s, int start, int end) {
if (s.empty()) return;
char tmp;
while (start < end) {
tmp = s[end];
s[end] = s[start];
s[start] = tmp;
++start;
--end;
}
}
This function works fine. However, when I rewrote it in C as below, I got a segmentation fault at line 11.
5 void reverse2(char *s, int start, int end) {
6 if (!s) return;
7 char tmp;
8
9 while (start < end) {
10 tmp = s[end];
11 *(s + end) = *(s + start);
12 *(s + start) = tmp;
13 ++start;
14 --end;
15 }
16 }
Driver program that calls the function:
int main() {
/* Flavor1 works */
string a = "hello world2012!";
reverse1(a, 0, a.length() - 1);
/* Flavor2 does not - segmentation fault */
char *b = "hello world2012!";
reverse2(b, 0, strlen(b) - 1);
}
I use gcc v 4.6.1 to compile my program. When stepping through the code with gdb, the program crashes at runtime with segmentation fault.
The char string s is not a const. Can someone please suggest what's going on here? How do I fix this issue? Thanks.
Update:
The reverse2 function is called on a string literal. The problem is that I was trying to modify the string literal. As Jim and H2CO3 pointed out, this is undefined behavior.
Now what's the exact difference between a string object (a) initialized with a string literal and a string literal (b)?

It depends on how you invoke your routine. If end is the length of the array, as is common in C, then s[end] is not a valid reference ... it's one character beyond s.
Also, !s is not equivalent to C++ s.empty ... it tests whether the pointer is NULL, rather than whether the string is empty -- for that, use !*s, !s[0], s[0] == '\0', strlen(s) == 0, etc.
The char string s is not a const.
It could fail anyway if it's a string literal constant; writing to such a string is Undefined Behavior.

You can rewrite the code as below, clamping end to the last valid index:
void reverse(char *s, int start, int end) {
if (!s) return;
char tmp;
if( end >= strlen(s) )
end = strlen(s)-1;
while (start < end) {
tmp = s[end];
*(s + end) = *(s + start);
*(s + start) = tmp;
++start;
--end;
}
}
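On the follow-up question in the update: a string object (or a char array) initialized from a literal holds its own writable copy of the characters, while char *b = "..." points at the literal itself, which must not be modified. A minimal sketch of the array form, assuming the same swap loop as reverse2; reverse_range and reversed_copy are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse s[start..end] in place; same swap loop as reverse2. */
void reverse_range(char *s, size_t start, size_t end)
{
    assert(s != NULL);
    while (start < end) {
        char tmp = s[end];
        s[end] = s[start];
        s[start] = tmp;
        ++start;
        --end;
    }
}

/* Hypothetical helper: reverses a writable copy of the literal into out. */
void reversed_copy(char *out, size_t outsize)
{
    /* An array initialized from a literal is a private, writable copy. */
    char b[] = "hello world2012!";
    /* By contrast, char *p = "hello world2012!"; points at the literal
       itself, and writing through p is undefined behavior. */
    assert(outsize >= sizeof b);
    reverse_range(b, 0, strlen(b) - 1);
    memcpy(out, b, sizeof b);
}
```

Declaring b as char b[] = "hello world2012!"; in the original main and leaving the reverse2 call unchanged should have the same effect.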

Related

Recursive program to print all string combinations of 'a' and 'b' of given length n in C

The task is:
Write a full program that takes an int n > 0 and recursively prints all combinations of characters 'a' and 'b' on the screen.
Example for n=3: aaa, baa, bba, aba, bab, aab, abb, bbb.
I assume I have to use something similar to Backtracking.
This is what I have, but I'm not able to think of the rest.
void rep(int n, char str, int pos) { //n would be the length and str would be the pointer
char c[n + 1];
char d[3];
d[0] = 'a';
d[1] = 'b';
for (int j = 0; j < 2; j++) {
if (strlen(c) == n) { // if c is n long recursion ends
printf("%s", c);
} else {
c[pos] = d[j]; // put 'a' or 'b' in c[pos]
rep(n, c, pos + 1); // update pos to next position
}
}
}
The variable length array c is not initialized
char c[n+1]
Thus the call of strlen in this if statement
if(strlen(c) == n){
invokes undefined behavior.
Moreover the parameter str is not used within the function.
I can suggest the following solution as it is shown in the demonstration program below
#include <stdio.h>
#include <string.h>
void rep( char *s )
{
puts( s );
char *p = strchr( s, 'a' );
if (p != NULL)
{
memset( s, 'a', p - s );
*p = 'b';
rep( s );
}
}
int main()
{
char s[] = "aaa";
rep( s );
}
The program output is
aaa
baa
aba
bba
aab
bab
abb
bbb
That is the function rep is initially called with an array that contains a string of the required size n (in the demonstration program n is equal to 3) consisting of all characters equal to the character 'a' and recursively outputs all combinations until the string contains all characters equal to the character 'b'.
There are some issues in your code:
the str argument should have type char *
you do not need new arrays in the recursive function; use the one the str argument points to.
you do not set a null terminator at the end of your char arrays.
instead of strlen(), use pos to determine if the recursion should stop.
Here is a modified version
#include <stdio.h>
// n is the length and str points to an array of length n+1
void rep(int n, char *str, int pos) {
if (pos >= n) {
str[n] = '\0'; // set the null terminator
printf("%s\n", str);
} else {
str[pos] = 'a';
rep(n, str, pos + 1);
str[pos] = 'b';
rep(n, str, pos + 1);
}
}
#define LEN 3
int main() {
char array[LEN + 1];
rep(LEN, array, 0);
return 0;
}
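The task asks for a program that takes n at run time, whereas the version above fixes it with LEN. One possible sketch, assuming the buffer is allocated with malloc; rep_count and print_combinations are hypothetical names, and the returned count is only there to make the result easy to check:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Same recursion as rep above, but returns how many strings were printed. */
static unsigned long rep_count(int n, char *str, int pos)
{
    if (pos >= n) {
        str[n] = '\0';              /* set the null terminator */
        puts(str);
        return 1;
    }
    str[pos] = 'a';
    unsigned long total = rep_count(n, str, pos + 1);
    str[pos] = 'b';
    total += rep_count(n, str, pos + 1);
    return total;
}

/* Hypothetical wrapper: allocates the n + 1 byte buffer for a run-time n.
   Returns the number of combinations printed, or 0 if allocation fails. */
unsigned long print_combinations(int n)
{
    assert(n > 0);
    char *buf = malloc((size_t)n + 1);
    if (buf == NULL)
        return 0;
    unsigned long total = rep_count(n, buf, 0);
    free(buf);
    return total;
}
```

Calling print_combinations(3) prints the eight strings from aaa to bbb and returns 8.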

Head First C string.h related questions

#include <stdio.h>
#include <string.h>
void print_reverse(char *s)
{
size_t len = strlen(s);
char *t = s + len - 1;
while (t >= s)
{
printf("%c", *t);
t = t - 1;
}
puts("");
}
Above is a function that will display a string backward on the screen. But I don't understand the 7th line (char *t = s + len - 1;). Could anybody explain this in plain English, please?
For starters this function
void print_reverse(char *s)
{
size_t len = strlen(s);
char *t = s + len - 1;
while (t >= s)
{
printf("%c", *t);
t = t - 1;
}
puts("");
}
is wrong and has undefined behavior.:)
There are two problems.
The first one is that the string passed as the argument can have zero length. In this case this declaration
char *t = s + len - 1;
will look like
char *t = s - 1;
and the pointer t is invalid, because it points before the start of the string.
The second problem is that this expression statement
t = t - 1;
has undefined behavior in case when the pointer t is equal to s.
From the C Standard (6.5.6 Additive operators)
...If both the pointer operand and the result point to elements of the same
array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
A correct function implementation can look the following way
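void print_reverse( const char *s) /* const added */
{
size_t len = strlen(s);
const char *t = s + len; /* start one past the last character */
while (t != s) /* stop at s without ever stepping before it */
{
printf("%c", *--t); /* decrement first, then print */
}
puts("");
}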
As for your question then in this declaration
char *t = s + len - 1;
the pointer t is meant to be initialized with the address of the last character of the string, the one just before the terminating zero.
The main logic behind this function is that this code:
char *t = s+ len-1;
yields a pointer to the last char of the string you are passing to the function. The loop prints that char and then moves the pointer back by one:
t = t - 1;
So in simple words, it prints the string backwards, one char at a time.
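To make the arithmetic concrete: s + len points at the terminating '\0', so s + len - 1 is one step back, at the last real character. A small sketch, assuming a hypothetical helper last_char that also guards the zero-length case discussed above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the last character before the terminator,
   or NULL for an empty string. */
const char *last_char(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len == 0)
        return NULL;        /* s + len - 1 would point before the string */
    return s + len - 1;     /* s + len is the '\0'; one step back is the last char */
}
```

For "hello", len is 5, so the result is s + 4, which points at 'o'.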

C strlen using pointers

I have seen the standard implementation of strlen using pointer as:
int strlen(char * s) {
char *p = s;
while (*p!='\0')
p++;
return p-s;
}
I get that this works, but I tried to do this in 3 more ways (I'm learning pointer arithmetic right now), and I would like to know what's wrong with them.
This is somewhat similar to what the book does. Is this wrong?
int strlen(char * s) {
char *p = s;
while (*p)
p++;
return p-s;
}
I thought this would be wrong if I passed an empty string, but it still gives me 0, which is kind of confusing since p is pre-incremented: (and now it's returning 5)
int strlen(char * s) {
char *p = s;
while (*++p)
;
return p-s;
}
Figured this out, does the post increment and returns +1 on it.
int strlen(char * s) {
char *p = s;
while (*p++)
;
return p-s;
}
1) Looks fine to me. I personally prefer the explicit comparison against '\0' so that it's clear you didn't mean to (for example) compare p to the NULL pointer in situations where it's not clear from context.
2) The problem is the order of operations, not where p itself is stored. If the string has length 0, s points directly at the terminator, as if you had char s[1] = {0}. Pre-incrementing skips that \0 and looks at the byte immediately after it, so you're reading memory outside the string, which is undefined behavior. Here be dragons!
3) I don't think there's a question there :) As you see, it always returns one more than the correct answer.
Addendum: You can also write this using a for-loop, if you prefer this style:
size_t strlen(char * s) {
char *p = s;
for (; *p != '\0'; p++) {}
return p - s;
}
Or (more error-prone-ly)
size_t strlen(char * s) {
char *p = s;
for (; *p != '\0'; p++);
return p - s;
}
Also, strlen can't return a negative number, so you should use an unsigned value. size_t is even better.
Version 1 is fine - while (*p != '\0') is equivalent to while (*p != 0), which is equivalent to while (*p).
In the original code and version 1, the pointer p is advanced if and only if *p is not 0 (IOW, you're not at the end of the string).
Versions 2 and 3 advance p regardless of whether *p is 0 or not. *p++ evaluates to the character p points to, and as a side effect advances p. *++p evaluates to the character following the character p points to, and as a side effect advances p. Therefore, version 3 always leaves p one past the terminator and returns one more than the length, while version 2 never tests the first character: it gives the right answer for non-empty strings but reads past the terminator of an empty one.
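The difference between the three loops can be checked with a small sketch. The names len_v1, len_v2 and len_v3 are hypothetical stand-ins for the three versions in the question, changed only to take const char * and return size_t:

```c
#include <assert.h>
#include <stddef.h>

/* Version 1: test the current character, advance only if it is not 0. */
size_t len_v1(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    while (*p)
        p++;
    return (size_t)(p - s);
}

/* Version 2: advance first, then test; the first character is never tested. */
size_t len_v2(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    while (*++p)
        ;
    return (size_t)(p - s);
}

/* Version 3: test, then advance regardless; p ends one past the terminator. */
size_t len_v3(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    while (*p++)
        ;
    return (size_t)(p - s);
}
```

With "hello" these return 5, 5 and 6. To probe the empty-string case without leaving the array, one can pass a buffer such as {'\0', 'x', 'y', '\0'}: len_v1 returns 0, len_v2 returns 3 because it skips the leading terminator, and len_v3 returns 1.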
One issue you will run into when you compare the performance of strlen replacement functions is that they will suffer compared to the actual strlen function for long strings. Why? Library strlen implementations typically process more than one byte per iteration when searching for the end of the string. How can you implement a more efficient replacement?
It's not that difficult. The basic approach is to look at 4 bytes per iteration and adjust the return value based on where within those 4 bytes the nul byte is found. You could do something like the following (using array indexing):
size_t strsz_idx (const char *s) {
size_t len = 0;
for(;;) {
if (s[0] == 0) return len;
if (s[1] == 0) return len + 1;
if (s[2] == 0) return len + 2;
if (s[3] == 0) return len + 3;
s += 4, len += 4;
}
}
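You can do the same thing using pointers and masks. Note that this version is not portable: it assumes a little-endian machine, a 4-byte unsigned, a suitably aligned s, and that reading up to 3 bytes past the terminator is harmless, none of which the C standard guarantees: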
size_t strsz (const char *s) {
size_t len = 0;
for(;;) {
unsigned x = *(unsigned*)s;
if((x & 0xff) == 0) return len;
if((x & 0xff00) == 0) return len + 1;
if((x & 0xff0000) == 0) return len + 2;
if((x & 0xff000000) == 0) return len + 3;
s += 4, len += 4;
}
}
Either way, you should find that checking 4 bytes per iteration brings performance much closer to that of strlen itself.

Can anyone explain to me this piece of code? I am new to C

I am trying to learn C, so I went to try out some of the coderbyte challenges in C, one of which is to reverse a string. After getting multiple compilation errors due to syntax, I tried looking for examples and encountered this one on http://www.programmingsimplified.com/c-program-reverse-string
#include<stdio.h>
int string_length(char*);
void reverse(char*);
main()
{
char string[100];
printf("Enter a string\n");
gets(string);
reverse(string);
printf("Reverse of entered string is \"%s\".\n", string);
return 0;
}
void reverse(char *string)
{
int length, c;
char *begin, *end, temp;
length = string_length(string);
begin = string;
end = string;
for (c = 0; c < length - 1; c++)
end++;
for (c = 0; c < length/2; c++)
{
temp = *end;
*end = *begin;
*begin = temp;
begin++;
end--;
}
}
int string_length(char *pointer)
{
int c = 0;
while( *(pointer + c) != '\0' )//I DON'T UNDERSTAND THIS PART!!!!!
c++;
return c;
}
c is not even a char, so why would you add it? Or is pointer some sort of index considering the context of the while loop?
The + operator here does not mean string concatenation; it means pointer arithmetic. pointer is a pointer to a char, and c is an int, so pointer + c results in a pointer to a char that is c chars further forward in memory. For example, if you have an array {'j', 'k', 'l', 'm'} and pointer pointed at the 'j', and c was 2, then pointer + c would point at the 'l'. If you advance a pointer like this and then dereference it, that acts the same as the array indexing syntax: pointer[c]. The loop is therefore equivalent to:
while( pointer[c] != '\0' )
c++;
The effect of adding a pointer to an integer (or vice-versa) scales according to the size of what the pointer is (supposedly) pointing to, so you don't need to account for different sizes of objects. foo + 5, if foo is a pointer, will go 5 objects further forward in memory, whatever size object foo points to (assuming that foo is pointing at the type it is declared to point to).
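The scaling described above can be observed directly. A minimal sketch, assuming two hypothetical helpers: byte_distance measures how many bytes a + n lies beyond a, and element_at reads an element through pointer arithmetic alone:

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes between a and a + n, measured through char pointers.
   a must point into an array with at least n elements. */
size_t byte_distance(const int *a, size_t n)
{
    assert(a != NULL);
    return (size_t)((const char *)(a + n) - (const char *)a);
}

/* Element i read via pointer arithmetic; identical to a[i]. */
int element_at(const int *a, size_t i)
{
    assert(a != NULL);
    return *(a + i);
}
```

For int a[3] = {1, 3, 6}, element_at(a, 2) is 6, the same as a[2], and byte_distance(a, 2) equals 2 * sizeof(int), whatever size int has on the platform.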
What you are looking at here is pointer arithmetic.
There is an important concept:
Adding an integer to a pointer moves the pointer forward. The number you add is multiplied by the size of the type that the pointer points to.
Example:
An int typically occupies 4 bytes, so when we increment the pointer by 1, the address actually advances by 4 bytes.
int a[3] = {1, 3, 6};
printf("%d\n", *(a + 1)); // print 3, look 4 bytes ahead
printf("%d \n", *(a + 2)); //print 6, look 8 bytes ahead
In your case :
A char occupies 1 byte, so
*(pointer + c) with c == 3
will evaluate to the char stored 3 bytes (3 chars) ahead of pointer.
So the code :
while( *(pointer + c) != '\0' )
c++;
reads the character stored at that memory address. If the character is equal to the null character, we have reached the end of the string.
Remember that *(pointer + c) is equivalent to pointer[c].
c is being used as an index.
/* reverse: Reverses the string pointed to by `string` */
void reverse(char *string)
{
int length, c;
char *begin, *end, temp;
/* compute length of string, initialize pointers */
length = string_length(string);
begin = string; /* points to the beginning of string */
end = string; /* will later point to the end of string */
/* make end point to the end */
for (c = 0; c < length - 1; c++)
end++;
/* walk through half of the string */
for (c = 0; c < length/2; c++)
{
/* swap the characters that begin and end point to */
temp = *end;
*end = *begin;
*begin = temp;
/* advance pointers */
begin++;
end--;
}
}
/* computes the length of the string that pointer points to */
int string_length(char *pointer)
{
int c = 0;
/* while we walk `pointer` and we don't see the null terminator */
while( *(pointer + c) != '\0' )//I DON'T UNDERSTAND THIS PART!!!!!
c++; /* advance position, c */
/* return the length */
return c;
}
The string_length function can be rewritten as
size_t strlen(const char *str)
{
size_t i;
for (i = 0; str[i] != '\0'; ++i)
;
return i;
}
From what I understand if the string is : abcd
the result would be : dcba
if the input is : HELLO-WORLD
the output would be : DLROW-OLLEH

C substrings / C string slicing?

Hi everybody!
I am trying to write a program that checks if a given string of text is a palindrome (for this I made a function called is_palindrome that works) and if any of its substrings is a palindrome, and I can't figure out the best way to do this:
For example, for the string s = "abcdefg" it should first check "a", then "ab", "abc", "abcd" and so on, for each character
In Python this is the equivalent of
s[:1], s[:2], ... (a, ab, ...)
s[1:2], s[1:3] ... (b, bc, ...)
What function/method is there that I can use in a similar way in C ?
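This is the short function I use to get a slice of a string in C.
void slice(const char *str, char *result, size_t start, size_t end)
{
strncpy(result, str + start, end - start);
result[end - start] = '\0'; /* strncpy adds no terminator when the source has at least end - start characters */
}
Pretty straightforward.
Given you've checked boundaries, made sure end > start, and given result room for end - start + 1 bytes.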
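If the caller cannot guarantee the boundaries, the checks can live inside the function. A minimal sketch, assuming Python-style half-open indices that are clamped to the string length; slice_clamped is a hypothetical name and result must still be large enough for the clamped slice plus the terminator:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies str[start:end] (end exclusive, as in Python) into result and
   terminates it. Out-of-range indices are clamped to the string length.
   Returns the number of characters copied. */
size_t slice_clamped(const char *str, char *result, size_t start, size_t end)
{
    assert(str != NULL && result != NULL);
    size_t len = strlen(str);
    if (end > len)
        end = len;
    if (start > end)
        start = end;                 /* empty slice instead of a negative length */
    size_t n = end - start;
    memcpy(result, str + start, n);  /* n is known, so memcpy is enough */
    result[n] = '\0';
    return n;
}
```

With "abcdefg", the call slice_clamped(s, buf, 1, 3) stores "bc", matching s[1:3] in Python.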
This slice_str() function will do the trick, with end actually being the end character, rather than one-past-the-end as in Python slicing:
#include <stdio.h>
#include <string.h>
void slice_str(const char * str, char * buffer, size_t start, size_t end)
{
size_t j = 0;
for ( size_t i = start; i <= end; ++i ) {
buffer[j++] = str[i];
}
buffer[j] = 0;
}
int main(void) {
const char * str = "Polly";
const size_t len = strlen(str);
char buffer[len + 1];
for ( size_t start = 0; start < len; ++start ) {
for ( int end = len - 1; end >= (int) start; --end ) {
slice_str(str, buffer, start, end);
printf("%s\n", buffer);
}
}
return 0;
}
which, when used from the above main() function, outputs:
paul#horus:~/src/sandbox$ ./allsubstr
Polly
Poll
Pol
Po
P
olly
oll
ol
o
lly
ll
l
ly
l
y
paul#horus:~/src/sandbox$
There isn't; you'll have to write your own.
In order to check a string for a palindrome, you would need to supply the number of characters to check:
int palindrome(char* str, int len)
{
if (len < 2 )
{
return 0;
}
// position p and q on the first and last character
char* p = str;
char* q = str + len - 1;
// compare start char with end char
for ( ; p < str + len / 2; ++p, --q )
{
if (*p != *q)
{
return 0;
}
}
return 1;
}
now you would need to call the function above for each substring (as you described it, i.e. always starting from the beginning) e.g.
char candidate[] = "wasitaratisaw";
for (int len = 1; len <= (int) strlen(candidate); ++len)
{
if (palindrome(candidate, len))
{
...
}
}
disclaimer: not compiled.
Honestly, you don't need a string slicing function just to check for palindromes within substrings:
/* start: Pointer to first character in the string to check.
* end: Pointer to one byte beyond the last character to check.
*
* Return:
* -1 if start >= end; this is considered an error
* 0 if the substring is not a palindrome
* 1 if the substring is a palindrome
*/
int
ispalin (const char *start, const char *end)
{
if (start >= end)
return -1;
for (; start < end; ++start)
if (*start != *--end)
return 0;
return 1;
}
With that, you can create the following:
int
main ()
{
const char *s = "madam";
/* i: index of first character in substring
* n: number of characters in substring
*/
size_t i, n;
size_t len = strlen (s);
for (i = 0; i < len; ++i)
{
for (n = 1; n <= len - i; ++n)
{
/* Start of substring. */
const char *start = s + i;
/* ispalin(s[i:i+n]) in Python */
switch (ispalin (start, start + n))
{
case -1:
fprintf (stderr, "error: %p >= %p\n", (void *) start, (void *) (start + n));
break;
case 0:
printf ("Not a palindrome: %.*s\n", (int) n, start);
break;
case 1:
printf ("Palindrome: %.*s\n", (int) n, start);
break;
} /* switch (ispalin) */
} /* for (n) */
} /* for (i) */
}
Of course, if you really wanted a string slicing function merely for output (since you technically shouldn't cast a size_t to int), and you still want to be able to format the output easily, the answer by Paul Griffiths should suffice quite well, or you can use mine or even one of strncpy or the nonstandard strlcpy, though they all have their strengths and weaknesses:
/* dest must have
* 1 + min(strlen(src), n)
* bytes available and must not overlap with src.
*/
char *
strslice (char *dest, const char *src, size_t n)
{
char *destp = dest;
/* memcpy here would be ideal, but that would mean walking the string twice:
* once by calling strlen to determine the minimum number of bytes to copy
* and once for actually copying the substring.
*/
for (; n != 0 && *src != 0; --n)
*destp++ = *src++;
*destp = 0;
return dest;
}
strslice actually works like a combination of strncpy and the nonstandard strlcpy, though there are differences between these three functions:
strlcpy will cut the copied string short to add a null terminator at dest[n - 1], so copying exactly n bytes before adding a null terminator requires you to pass n + 1 as the buffer size.
strncpy may not terminate the string at all, leaving dest[n - 1] equal to src[n - 1], so you would need to add a null terminator yourself just in case. If n is greater than the src string length, dest will be padded with null terminators until n bytes have been written.
strslice will copy up to n bytes if necessary, like strncpy, and will require an extra byte for the null terminator, meaning a maximum of n+1 bytes are necessary. It doesn't waste time writing unnecessary null terminators as strncpy does. This can be thought of as a "lightweight strlcpy" with a small difference in what n means and can be used where the resulting string length won't matter.
You could also create a memslice function if you wanted, which would allow for embedded null bytes, but it already exists as memcpy.
There is not any built-in function/method in any standard C library which can handle this. However, you can come up with your own method to do the same.
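As a rough illustration of writing your own, every substring can be described by a start pointer and a one-past-the-end pointer, so no copying is needed just to test it. A sketch under that assumption; is_palindrome_range and count_palindromic_substrings are hypothetical names, and single characters are counted as palindromes here:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the characters in [start, end) read the same both ways. */
static int is_palindrome_range(const char *start, const char *end)
{
    for (; start < end; ++start)
        if (*start != *--end)
            return 0;
    return 1;
}

/* Counts every substring s[i:i+n] that is a palindrome. */
size_t count_palindromic_substrings(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    size_t count = 0;
    for (size_t i = 0; i < len; ++i)            /* first character of the substring */
        for (size_t n = 1; n <= len - i; ++n)   /* number of characters in it */
            if (is_palindrome_range(s + i, s + i + n))
                ++count;
    return count;
}
```

For "madam" this counts 7: the five single characters plus "ada" and "madam".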

Resources