Why doesn't strcpy work? - c

char sentence2[10];
strncpy(sentence2, second, sizeof(sentence2)); //shouldn't I specify the sizeof(source) instead of sizeof(destination)?
sentence2[10] = '\0'; //Is this okay since strncpy does not provide the null character.
puts(sentence2);
//////////////////////////////////////////////////////////////
char *pointer = first;
for(int i =0; i < 500; i++) //Why does it crashes without this meaningless loop?!
{
printf("%c", *pointer);
if(*pointer == '\n')
putchar('\n');
pointer++;
}
So here's the problem. When I run the first part of this code, the program crashes.
However, when I add the for loop that just prints garbage values in memory locations, it does not crash but still won't strcpy properly.
Second, when using strncpy, shouldn't I specify the sizeof(source) instead of sizeof(destination) since I'm moving the bytes of the source ?
Third, it makes sense to me to add the null terminating character after strncpy, since I've read that it doesn't add the null character on its own, but I get a warning that it's a possible out-of-bounds store from my Pelles C IDE.
Fourth and most importantly, why doesn't the simple strcpy work?!
////////////////////////////////////////////////////////////////////////////////////
UPDATE:
#include <stdio.h>
#include <string.h>
void main3(void)
{
puts("\n\n-----main3 reporting for duty!------\n");
char *first = "Metal Gear";
char *second = "Suikoden";
printf("strcmp(first, first) = %d\n", strcmp(first, first)); //returns 0 when both strings are identical.
printf("strcmp(first, second) = %d\n", strcmp(first, second)); //returns a negative when the first different char is less in first string. (M=77 S=83)
printf("strcmp(second, first) = %d\n", strcmp(second, first)); //returns a positive when the first different char is greater in first string.(M=77 S=83)
char sentence1[10];
strcpy(sentence1, first);
puts(sentence1);
char sentence2[10];
strncpy(sentence2, second, 10); //shouldn't I specify the sizeof(source) instead of sizeof(destination).
sentence2[9] = '\0'; //Is this okay since strncpy does not provide the null character.
puts(sentence2);
char *pointer = first;
for(int i =0; i < 500; i++) //Why does it crashes without this nonsensical loop?!
{
printf("%c", *pointer);
if(*pointer == '\n')
putchar('\n');
pointer++;
}
}
This is how I teach myself to program. I write code and comment all I know about it so that
the next time I need to look up something, I just look at my own code in my files. In this one, I'm trying to learn the string library in c.

char *first = "Metal Gear";
char sentence1[10];
strcpy(sentence1, first);
This doesn't work because first has 11 characters: the ten in the string, plus the null terminator. So you would need char sentence1[11]; or more.
strncpy(sentence2, second, sizeof(sentence2));
//shouldn't I specify the sizeof(source) instead of sizeof(destination)?
No. The third argument to strncpy is supposed to be the size of the destination. The strncpy function will always write exactly that many bytes.
If you want to use strncpy you must also put a null terminator on (and there must be enough space for that terminator), unless you are sure that strlen(second) < sizeof sentence2.
Generally speaking, strncpy is almost never a good idea. If you want to put a null-terminated string into a buffer that might be too small, use snprintf.
This is how I teach myself to program.
Learning C by trial and error is not good. The problem is that if you write bad code, you may never know. It might appear to work, and then fail later on. For example, whether your strcpy steps on any other variable's toes depends on what lies in memory after sentence1.
Learning from a book is by far and away the best idea. K&R 2 is a decent starting place if you don't have any other.
If you don't have a book, do look up online documentation for standard functions anyway. You could have learnt all this about strcpy and strncpy by reading their man pages, or their definitions in a C standard draft, etc.

Your problems start from here:
char sentence1[10];
strcpy(sentence1, first);
The number of characters in first, excluding the terminating null character, is 10. The space allocated for sentence1 has to be at least 11 for the program to behave in a predictable way. Since you have already used memory that you are not supposed to use, expecting anything to behave after that is not right.
You can fix this problem by changing
char sentence1[10];
to
char sentence1[N]; // where N > 10.
But then, you have to ask yourself. What are you trying to accomplish by allocating memory on the stack that's on the edge of being wrong? Are you trying to learn how things behave at the boundary of being wrong/right? If the answer to the second question is yes, hopefully you learned from it. If not, I hope you learned how to allocate adequate memory.

This is an array bounds write error. The valid indices are only 0-9.
sentence2[10] = '\0';
it should be
sentence2[9] = '\0';
Second, you're protecting the destination from buffer overflow, so specifying its size is appropriate.
EDIT:
Lastly, in this amazingly bad piece of code, which really isn't worth mentioning, is relevant to neither strcpy() nor strncpy(), yet seems to have earned me the disfavor of #nonsensicke, who seems to write very verbose and thoughtful posts... there are the following:
char *pointer = first;
for(int i =0; i < 500; i++)
{
printf("%c", *pointer);
if(*pointer == '\n')
putchar('\n');
pointer++;
}
Your use of int i=0 in the for loop is C99 specific. Depending on your compiler and compiler arguments, it can result in a compilation error.
for(int i =0; i < 500; i++)
better
int i = 0;
...
for(i=0;i<500;i++)
You neglect to check the return code of printf or indicate that you are deliberately ignoring it. I/O can fail after all...
printf("%c", *pointer);
better
int n = 0;
...
n = printf("%c", *pointer);
if(n!=1) { /* error! */ }
or
(void) printf("%c", *pointer);
some folks will get onto you for not using {} with your if statements
if(*pointer == '\n') putchar('\n');
better
if(*pointer == '\n') {
putchar('\n');
}
but wait there's more... you didn't check the return code of putchar()... dang
better
int c = 0; /* putchar returns an int: the character written, or EOF on error */
...
if(*pointer == '\n') {
c = putchar('\n');
if(c == EOF) { /* error */ }
}
and lastly, with this nasty little loop you're basically romping through memory like a Kiwi in a Tulip field and lucky if you hit a newline. Depending on the OS (if you even have an OS), you might actually encounter some type of fault, e.g. outside your process space, maybe outside addressable RAM, etc. There's just not enough info provided to say actually, but it could happen.
My recommendation, beyond the absurdity of actually performing some type of detailed analysis on the rest of that code, would be to just remove it altogether.
Cheers!

Related

How the while statement is executed in C or how this array-referenced pointers work?

I started learning C and I had this exercise from the book "Prentice Hall - The C Programming Language".
Chapter 5 Exercise 3:
Write a pointer version of the function strcat that we showed in Chapter 2. strcat(s, t) copies the string t to the end of s.
I did the exercise but the first method that came up to my mind was:
void stringcat(char *s, char *t){
int i,j;
i = j = 0;
while(*(s+i) != '\0'){
printf("%d", i);
i++;
}
while ( (*(t+j)) != '\0'){
*(s+i) = *(t+j);
i++;
j++;
}
}
In main I had:
int main(){
char s[] = "Hola";
char t[] = "lala";
stringcat(s,t);
printf("%s\n", s);
}
At first sight I thought it was right but the actual output was Holalalaa.
Of course it was not the output that I expected, but then I coded this:
void stringcat(char *s, char *t){
int i,j;
i = j = 0;
while(*(s+i) != '\0'){
printf("%d", i);
i++;
}
while((*(s+i) = *(t+j)) != '\0'){
i++;
j++;
}
}
And the output was right.
But then I was thinking a lot about the first code because it's very similar to the second one but why the first output was wrong?. Is it something related with the while statement? or something with pointers?. I found it really hard to understand because you can't see what's happening in the array.
Thanks a lot.
Your code has more than the one problem that you found, but let's start with it.
Actually you are asking why
/* ... */
while ((*(t+j)) != '\0') {
*(s+i) = *(t+j);
/* ... */
works differently than
/* ... */
while ((*(s+i) = *(t+j)) != '\0') {
/* ... */
I hope you see it already, now that both cases stand side by side, actually vertically ;-). In the first case the value of t[j] is compared before it is copied to s[i]. In the second case the comparison is done after the copy. That's why the second case copies the terminating '\0' to the target string, and the first case does not.
The output you get works only by accident. It is Undefined Behavior, since you are writing beyond the end of the target array. Fortunately for you, both strings happen to lie next to each other in memory, and you are overwriting the source string with its own characters.
Because your first case does not copy the '\0', the final printf() outputs more characters until a '\0' is encountered. By chance this is the last 'a'.
As others commented, the target string has not enough space for the concatenated string. Provide some more space like this:
char s[10] = "Hola"; /* 10 is enough for both strings and the terminating '\0'. */
However, if you had done this already, the error would have not been revealed, because the last 6 characters of s are initialized with '\0'. Not copying the terminating '\0' makes no difference. You can see this if you use
char s[10] = "Hola\0xxxx";
I don't think that your solution is the expected one. Instead of s[i] you are using *(s + i), which is essentially the same, accessing an array. Consider changing s (and in the course, t) in the function and use just *s.
Side note: The printf() in the function is most probably a leftover from debugging. But I'm sure you know.

Trouble \0 null terminating a string (C)

I seem to have some trouble getting my string to terminate with a \0. I'm not sure if this the problem, so I decided to make a post.
First of all, I declared my strings as:
char *input2[5];
Later in the program, I added this line of code to convert all remaining unused slots to become \0, changing them all to become null terminators. Could've done with a for loop, but yea.
while (c != 4) {
input2[c] = '\0';
c++;
}
In Eclipse when in debug mode, I see that the empty slots now contain 0x0, not \0. Are these the same things? The other string where I declared it as
char input[15] = "";
shows \000 when in debug mode though.
My problem is that I am getting segmentation faults (on a Debian VM; it works on my Ubuntu 12.04 though). My GUESS is that because the string hasn't really been terminated, the compiler doesn't know when it stops and thus continues to try to access memory in the array when it is clearly already out of bounds.
Edit: I will try to answer all other questions soon, but when I change my string declaration to the other suggested one, my program crashes. There is a strtok() function, used to chop my fgets input into strings and then putting them into my input2 array.
So,
input1[0] = 'l'
input1[1] = 's'
input1[2] = '\n'
input2[0] = "ls".
This is a shell simulating program with fork and execvp. I will post more code soon.
Regarding the suggestion:
char *input2[5]; This is a perfectly legal declaration, but it
defined input2 as an array of pointers. To contain a string, it needs
to be an array of char.
I will try that change again. I did try that earlier, but I remember it giving me another run-time error (seg fault?). I think it is because of the way I implemented my strtok() function though. I will check it out again. Thanks!
EDIT 2: I added a response below to update my progress so far. Thanks for all the help!
It is here.
Your code should rather look like this:
char input2[5];
for (int c=0; c < 4; c++) {
input2[c] = '\0';
}
0x0 and \0 are different representations of the same value, 0.
Response 1:
Thanks for all the answers!
I made some changes from the responses, but I reverted the char suggestion (or correct string declaration) because like someone pointed out, I have a strtok function. Strtok requires me to send in a char *, so I reverted back to what I originally had (char * input[5]). I posted my code up to strtok below. My problem is that the program works fine in my Ubuntu 12.04, but gives me a segfault error when I try to run it on the Debian VM.
I am pretty confused as I originally thought the error was because the compiler was trying to access an array index that is already out of bound. That doesn't seem like the problem because a lot of people mentioned that 0x0 is just another way of writing \000. I have posted my debug window's variable section below. Everything seems right though as far as I can see.. hmm..
Input2[0] and input[0], input[1 ] are the focus points.
Here is my code up to the strtok function. The rest is just fork and then execvp call:
int flag = 0;
int i = 0;
int status;
char *s; //for strchr, strtok
char input[15] = "";
char *input2[5];
//char input2[5];
//Prompt
printf("Please enter prompt:\n");
//Reads in input
fgets(input, 100, stdin);
//Remove \n
int len = strlen(input);
if (len > 0 && input[len-1] == '\n')
input[len-1] = ' ';
//At end of string (numb of args), add \0
//Check for & via strchr
s = strchr (input, '&');
if (s != NULL) { //If there is a &
printf("'&' detected. Program not waiting.\n");
//printf ("'&' Found at %s\n", s);
flag = 1;
}
//Now for strtok
input2[i] = strtok(input, " "); //strtok: returns a pointer to the last token found in string, so must declare
//input2 as char * I believe
while(input2[i] != NULL)
{
input2[++i] = strtok( NULL, " ");
}
if (flag == 1) {
i = i - 1; //Removes & from total number of arguments
}
//Sets null terminator for unused slots. (Is this step necessary? Does the C compiler know when to stop?)
int c = i;
while (c < 5) {
input2[c] = '\0';
c++;
}
Q: Why didn't you declare your string char input[5];? Do you really need the extra level of indirection?
Q: while (c < 4) is safer. And be sure to initialize "c"!
And yes, "0x0" in the debugger and '\0' in your source code are "the same thing".
SUGGESTED CHANGE:
char input2[5];
...
c = 0;
while (c < 4) {
input2[c] = '\0';
c++;
}
This will almost certainly fix your segmentation violation.
char *input2[5];
This is a perfectly legal declaration, but it defined input2 as an array of pointers. To contain a string, it needs to be an array of char.
while (c != 4) {
input2[c] = '\0';
c++;
}
Again, this is legal, but since input2 is an array of pointers, input2[c] is a pointer (of type char*). The rules for null pointer constants are such that '\0' is a valid null pointer constant. The assignment is equivalent to:
input2[c] = NULL;
I don't know what you're trying to do with input2. If you pass it to a function expecting a char* that points to a string, your code won't compile -- or at least you'll get a warning.
But if you want input2 to hold a string, it needs to be defined as:
char input2[5];
It's just unfortunate that the error you made happens to be one that a C compiler doesn't necessarily diagnose. (There are too many different flavors of "zero" in C, and they're often quietly interchangeable.)

Why does this simple code not work at times? When I restart Code::Blocks it sometimes works

#include <stdio.h>
#include <stdlib.h>
main()
{
char *a,*b,*c={0};
int i=0,j=0;
a=(char *)malloc(20*sizeof(char));
b=(char *)malloc(20*sizeof(char));
c=(char *)malloc(20*sizeof(char));
printf("Enter two strings:");
gets(a);
gets(b);
while(a[i]!=NULL)
{
c[i]=a[i];
i++;
}
while(b[j]!=NULL)
{
c[i]=b[j];
i++;
j++;
}
printf("The concated string is %s",c);
}
this is crazy........i spend one whole night it didn't work and then next night it suddenly works perfectly....i'm confused
There are many things wrong with your code.
Not all of them matter if all you care about is getting the code to work.
However, I have tried here to show you different misconceptions that are clear from your code, and show you how to code it better.
You are misunderstanding what NULL means. A NULL pointer doesn't point at anything.
Strings are terminated with '\0', which is the ASCII NUL character. That is not the same thing, though both have the value 0.
char* s = "hello";
The above string is actually 6 characters long. 5 bytes for the hello, 1 for the '\0' that is stuck at the end. Incidentally, this means that you can only have strings up to 19 characters long because you need to reserve one byte for the terminal '\0'
char* r = NULL;
The pointer r is pointing at nothing. There is no '\0' there, and if you attempt to look at r[0], you will crash.
As Ooga pointed out, you missed terminating with '\0' which is going to create random errors because your printf will keep going to try to print until the first zero byte it finds. Whether you crash on any particular run is a matter of luck. Zeros are common, so usually you will stop before you crash, but you will probably print out some junk after the string.
Personally, I would rather crash than have the program randomly print out the wrong thing. At least when you crash, you know something is wrong and can fix it.
You also seem to have forgotten to free the memory you malloc.
If you are going to use malloc, you should use free at the end:
char* a = malloc(20);
...
free(a);
You also are only mallocing 20 characters. If you go over that, you will do horrible things in memory. 20 seems too short, you will have only 19 characters plus the null on the end to play with but if you do have 20 characters each in a and b, you would need 40 characters in c.
If this is an assignment to use malloc, then use it, but you should free when you are done. If you don't have to use malloc, this example does not show a reason for using it since you are allocating a small, constant amount of memory.
You are initializing c:
char* c = {0};
In a way that is misleading.
For a scalar such as a pointer, the braces around the initializer are optional, so {0} does not create an array; it simply initializes c to a null pointer. You then immediately point c at something else.
If you mean that c points to nothing at first, the usual way to write that is:
char* c = NULL;
but then you are immediately wiping out the null, so why initialize c, but not a and b?
As a general rule, you should not declare variables and initialize them later. You can always do something stupid and use them before they are initialized. Instead, initialize them as you declare them:
char* a = malloc(20);
char* b = malloc(20);
char* c = malloc(40);
Incidentally, the size of a char is by definition 1, so:
20* sizeof(char)
is the same as 20.
You probably saw an example like:
20 * sizeof(int)
Since sizeof(int) is not 1, the multiplication there actually does something. Typically sizeof(int) is 4 bytes, so the above would allocate 80 bytes.
gets is unsafe, since it doesn't say how long the buffer is
ALWAYS use fgets instead of gets. (see below).
Many computers have been hacked using this bug (see http://en.wikipedia.org/wiki/Robert_Tappan_Morris)
Still, since malloc is not really needed, in your code, you really should write:
enum { SIZE = 128 };
char a[SIZE];
fgets(a, SIZE, stdin); /* note: fgets keeps the trailing '\n' */
char b[SIZE];
fgets(b, SIZE, stdin);
char c[SIZE*2];
int i;
int j = 0;
for (i = 0; i < SIZE - 1 && a[i] != '\0'; i++)
    c[j++] = a[i];
for (i = 0; i < SIZE - 1 && b[i] != '\0'; i++) /* restart at 0 and copy from b */
    c[j++] = b[i];
c[j] = '\0';
...
Last, I don't know if you are learning C or C++. I will simply point out that this kind of programming is a lot easier in C++ where a lot of the work is done for you. You can first get concatenation done the easy way, then learn all the pointer manipulation which is harder.
#include <string>
#include <iostream>
using namespace std;
int main() {
string a,b,c;
getline(cin, a); // read in a line
getline(cin, b);
c = a + b;
cout << c;
}
Of course, you still need to learn this low-level pointer stuff to be a sophisticated programmer in C++, but if the purpose is just to read in and concatenate lines, C++ makes it a lot easier.
You are not properly null-terminating c. Add this before the printf:
c[i] = '\0';
Leaving out null-termination will seem to work correctly if the char at i happens to be 0, but you need to set it to be sure.
The string c is not being terminated with a null char, this means printf does not know where to stop and will likely segfault your program when it overruns. The reason you may be getting sporadic success is that there is a random chance the malloced area has been pre zeroed when you allocate it, if this is the case it will succeed as a null char is represented as a literal 0 byte.
there are two solutions available to you here, first you could manually terminate the string with a null char as so:
c[i] = '\0';
Second you can use calloc instead of malloc, it guarantees the memory is always pre zeroed.
As a side note you should likely add some length checking to your code to ensure c will not overflow if A and B are both over 10. (or just make c 40 long)
I hope this helps.

Why is fgets() and strncmp() not working in this C code for string comparison?

This is a very fun problem I am running into. I did a lot of searching on stack overflow and found others had some similar problems. So I wrote my code accordingly. I originally had fscan() and strcmp(), but that completely bombed on me. So other posts suggested fgets() and strncmp() and using the length to compare them.
I tried to debug what I was doing by printing out the size of my two strings. I thought, maybe they have \n floating in there or something and messing it up (another post talked about that, but I don't think that is happening here). So if the size is the same, the limit for strncmp() should be the same. Right? Just to make sure they are supposedly being compared right. Now, I know that if the strings are the same, it returns 0 otherwise a negative with strncmp(). But it's not working.
Here is the output I am getting:
perk
repk
Enter your guess: perk
Word size: 8 and Guess size: 8
Your guess is wrong
Enter your guess:
Here is my code:
void guess(char *word, char *jumbleWord)
{
size_t wordLen = strlen(word);
size_t guessLen;
printf("word is: %s\n",word);
printf("jumble is: %s\n", jumbleWord);
char *guess = malloc(sizeof(char) * (MAX_WORD_LENGTH + 1));
do
{
printf("Enter your guess: ");
fgets(guess, MAX_WORD_LENGTH, stdin);
printf("\nword: -%s- and guess: -%s-", word, guess);
guessLen = strlen(guess);
//int size1 = strlen(word);
//int size2 = strlen(guess);
//printf("Word size: %d and Guess size: %d\n",size1,size2);
if(strncmp(guess,word,wordLen) == 0)
{
printf("Your guess is correct\n");
break;
}
}while(1);
}
I updated it from suggestions below. Especially after learning the difference between char * as a pointer and referring to something as a string. However, it's still giving me the same error.
Please note that MAX_WORD_LENGTH is a define statement used at the top of my program as
#define MAX_WORD_LENGTH 25
Use strlen, not sizeof. Also, you shouldn't use strncmp here, if your guess is a prefix of the word it will mistakenly report a match. Use strcmp.
sizeof(guess) is returning the size of a char * not the length of the string guess. Your problem is that you're using sizeof to manage string lengths. C has a function for string length: strlen.
sizeof is used to determine the size of data types and arrays. sizeof only works for strings in one very specific case - I won't go into that here - but even then, always use strlen to work with string lengths.
You'll want to decide how many characters you'll allow for your words. This is a property of your game, i.e. words in the game are never more than 11 characters long.
So:
// define this somewhere, a header, or near top of your file
#define MAX_WORD_LENGTH 11
// ...
size_t wordlen = strlen(word);
size_t guessLen;
// MAX_WORD_LENGTH + 1, 1 more for the null-terminator:
char *guess = malloc(sizeof(char) * (MAX_WORD_LENGTH + 1));
printf("Enter your guess: ");
fgets(guess, MAX_WORD_LENGTH, stdin);
guessLen = strlen(guess);
Also review the docs for fgets and note that the newline character is retained in the input, so you'll need to account for that if you want to compare the two words. One quick fix for this is to only compare up to the length of word, and not the length of guess, so: if( strncmp(guess, word, wordLen) == 0). The problem with this quick fix is that it will pass invalid inputs, i.e. if word is eject, and guess is ejection, the comparison will pass.
Finally, there's no reason to allocate memory for a new guess in each iteration of the loop, just use the string that you've already allocated. You could change your function setup to:
char guess(char *word, char *jumbledWord)
{
int exit;
size_t wordLen = strlen(word);
size_t guessLen;
char *guess = malloc(sizeof(char) * (MAX_WORD_LENGTH + 1));
do
{
printf("Enter your guess: ");
// ...
As everyone else has stated, use strlen not sizeof. The reason this is happening though, is a fundamental concept of C that is different from Java.
Java does not give you access to pointers. Not only does C have pointers, but they are fundamental to the design of the language. If you don't understand and use pointers properly in C then things won't make sense, and you will have quite a bit of trouble.
So, in this case, sizeof is returning the size of the char * pointer, which is (usually) 4 or 8 bytes. What you want is the length of the data structure "at the other end" of the pointer. This is what strlen encapsulates for you.
If you didn't have strlen, you would need to dereference the pointer, then walk the string until you find the null byte marking the end.
const char *p = guess; /* walk a copy so guess itself is left unchanged */
i = 0;
while(*p++) { i++; }
Afterwards, i will hold the length of your string.
Update:
Your code is fine, except for one minor detail. The docs for fgets note that it will keep the trailing newline char.
To fix this, add the following code in between the fgets and strncmp sections:
if ( guessLen > 0 && guess[guessLen-1] == '\n' ) {
guess[guessLen-1] = '\0';
}
That way the trailing newline, if any, gets removed and you are no longer off by one.
A list of problems and pieces of advice for your code, much too long to fit in a comment:
your function returns a char, which is strange. I don't see the logic and, more importantly, you never actually return a value. Don't do that, it will bring you trouble
look into other control structures in C, in particular don't do your exit thing. First, exit in C is a function, which does what it says, it exits the program. Then there is a break statement to leave a loop.
A common idiom is
do {
if (something) break;
} while(1);
you allocate a buffer in each iteration, but you never free it. this will give you big memory leaks, buffers that will be wasted and inaccessible to your code
your strncmp approach is only correct if the strings have the same length, so you'd have to test that first

robust string reverse

I am trying to code a trivial interview question of reversing a string.
This is my code:
#include <string.h>
char* rev( char* str)
{
int i,j,l;
l = strlen(str);
for(i=0,j=l-1; i<l/2 ; i++, j--)
{
str[i] = (str[i] + str[j]);
str[j] = str[i] - str[j];
str[j] = str[i] - str[j];
}
return str;
}
int main()
{
char *str = " hello";
printf("\nthe reverse is %s ...", rev(str));
return 1;
}
Basically, this one gives a segmentation fault.
I have following questions:
1. I get segmentation fault probably because, the characters add up to something not defined in ascii and hence I cannot store them back as characters, I am using www.codepad.org [I wonder if it supports just ascii !!] . Is my understanding correct or there is something else to it.
2. How do I correct the problem, for the same platform [I mean swapping in place for codepad.org]
3. Here I have to use an additional integer l to calculate length. So to save a single char space by swapping in place .. I am using an extra int !!! .. just to impress the interviewer :) ... Is this approach even worth it !!!
4. This one is for those who are interested in writing unit tests/API tests. I want to have a robust implementation so what can be possible test cases. I assume that if interviewer asks such a simple question .. he definitely wants some very robust implementation and test cases. Few that I thought of:
passing empty strings
passing integer strings
passing integer array instead of char array
very long string
single char string
string of special chars
Any advise/suggestions would be helpful.
This line:
char *str = " hello";
Probably points to read-only memory. Try this:
char str[] = " hello";
(You have some other bugs too, but this change will fix your segfault).
Use a temporary variable rather than your approach for the swap. The compiler will probably use a register for the temporary variable due to optimizations.
Either way, you implemented the swap algorithm wrong. It should be
str[i] = str[i] + str[j];
str[j] = str[i] - str[j];
str[i] = str[i] - str[j];
Page 62 of Kernighan & Ritchie's The C Programming Language shows an algorithm for in-place string reversal with a temporary variable.
Similar to this:
#include <string.h>

char* rev_string(char* const str)
{
    int i, j;
    char tmp;
    /* i walks forward from the start, j backward from the end */
    for(i = 0, j = (int)strlen(str) - 1; i < j; i++, j--)
    {
        tmp = str[i];
        str[i] = str[j];
        str[j] = tmp;
    }
    return str;
}
This algorithm is easier to understand than the one without a temporary variable, imho.
As for item #3 in your list of questions:
As an interviewer, I would want to see simple, clear, and well structured code. That's impressive. Trickery will not impress me. Especially when its comes to premature optimization. BTW, my solution reverses the string in place with one additional char instead of an int. Impressive? :)
And item #4:
One other test case would be an unterminated string. Is your function robust enough to handle this case? Your function will only be as robust as the least robust part of it. Passing an unterminated string into my solution causes a Segmentation Fault, due to strlen reporting an incorrect string length. Not very robust.
The important point about robustness is, your code might be robust but you have to make sure all other external functions you use are, too!
Where to start...
OK, first you should be aware that your routine reverses a string in place, in other words changes are made to the original buffer.
This means that you could do
int main()
{
char str[] = "hello";
rev(str);
printf("\nthe reverse is %s ...", str);
return 0;
}
and the string would be reversed.
The other alternative is to create a new string that is a reversed copy of the original string. The algorithm is somewhat different, you should be able to do this too.
Next point:
str[i] = (str[i] + str[j]);
str[j] = str[i] - str[j];
str[j] = str[i] - str[j];
is broken. It should be
str[i] = str[i] + str[j];
str[j] = str[i] - str[j];
str[i] = str[i] - str[j];
But, as ~mathepic said, you should do this instead:
temp = str[i];
str[i] = str[j];
str[j] = temp;
Also: codepad makes it difficult to debug your code. Install a compiler and debugger (such as gcc and gdb) on your own computer.
the characters add up to something not defined in ascii and hence I cannot store them back as characters, I am using www.codepad.org [I wonder if it supports just ascii !!] . Is my understanding correct or there is something else to it.
In most C implementations (those that run on 32-bit PCs anyway), a char is an 8-bit integer. An int is a 32-bit integer. When you add or subtract two chars and the result is more than 8 bits, it will "wrap around" to some other value, but this process is reversible.
For instance, 255 + 1 gives 0, but 0 - 1 = 255. (Just an illustrative example.) This means that "I cannot store them back as characters" is not the problem here.
I want to have a robust implementation
You want to show that you take into account the costs and benefits of different design choices. Perhaps it is better to cause a segmentation fault if your routine is supplied with a NULL, because this very quickly alerts the programmer of a bug in his code.
passing empty strings
You must make sure your code works in a case like that.
passing integer
passing integer array instead
You can't pass integers, or an int [] to a function expecting a char *. In C, you cannot tell whether a char * really is a string or something else.
single char string
Make sure your routine works for a single char string, also for both strings with an odd number and an even number of chars.
string of special chars
There are no special chars in C (except, by convention, the null terminator '\0'). However, multi-byte sequences must be considered: reversing a UTF-8 string byte by byte is different from reversing a regular single-byte string. If the question does not specify this, I don't think you should be concerned about it.
Three final points:
In main(), return 1; usually indicates that your program failed. return 0; is more common, but return EXIT_SUCCESS; is best; it requires #include <stdlib.h>.
Consider using more descriptive variable names.
Consider making a strnrev() function, modeled on strncpy() and its relatives, that will not go beyond n characters if no null terminator is found within them.
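One possible reading of the strnrev() suggestion, as a sketch only: the function looks at no more than n characters, stops early at a null terminator, and reverses whatever it found. The exact semantics are an assumption here, since the answer does not spell them out:

```c
#include <stddef.h>

/* Sketch: reverses at most the first n characters of str in place.
   Scanning stops at a null terminator if one appears within those n chars,
   so an unterminated buffer of size n is never read past its end.
   Assumes str is non-NULL and writable. Returns str. */
char *strnrev(char *str, size_t n)
{
    size_t len = 0;
    while (len < n && str[len] != '\0')
        len++;                          /* bounded length, like strnlen */

    if (len < 2)
        return str;

    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char temp = str[i];
        str[i] = str[j];
        str[j] = temp;
    }
    return str;
}
```

Under this interpretation, calling strnrev(buf, 3) on "abcdef" reverses only the first three characters and leaves the rest alone.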
If you are going to implement the swap of two characters without a temporary variable (which is a neat trick but not something that you should actually use in practice), it would be prudent to either use "bitwise exclusive or" instead of addition/subtraction, or use unsigned char instead of char. Overflow in signed arithmetic is undefined in the C99 standard, and gcc has started to make use of this undefinedness for optimization purposes. I was just ranting about another case of unwanted optimization in another question.
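For illustration, the exclusive-or variant on unsigned char might be sketched as follows. The name xor_swap is hypothetical, and the guard against swapping an object with itself is an addition of this sketch, because x ^ x is 0 and would wipe the value:

```c
/* Sketch: swaps two bytes with bitwise exclusive or, no temporary needed.
   XOR never overflows, so there is no wrap-around to worry about. */
void xor_swap(unsigned char *a, unsigned char *b)
{
    if (a == b)
        return;                         /* same object: XOR would zero it */

    *a ^= *b;
    *b ^= *a;                           /* *b now holds the original *a */
    *a ^= *b;                           /* *a now holds the original *b */
}
```

In a reversal loop that runs while start < end, the two positions are always distinct, so the guard only matters when the helper is used elsewhere.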
I get segmentation fault probably because, the characters add up to something not defined in ascii and hence I cannot store them back as characters
I don't think so. They're all just numbers in C (albeit only 1 byte long), so you shouldn't have any problem there.
I think (but I'm not sure) that the problem is with this:
char *str = " hello";
printf("\nthe reverse is %s ...", rev(str));
What you're actually doing is pointing str at the string literal " hello". In C a string literal has type char[] rather than const char[], but modifying it is undefined behavior, so you're not supposed to change it. When you call rev, it changes the array in place, so it's trying to write into the literal's storage.
Because the literal's type is not const-qualified, the assignment char *str = " hello" needs no cast and is not treated as a compile-time error. But a string literal is stored as part of the executable image itself, typically in a read-only section, i.e. it's not in memory your program can freely change. That is why you're getting a run-time seg-fault and not a compile-time error (although some compilers can warn about this, for example gcc with -Wwrite-strings).
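The difference between a pointer to a literal and an array initialised from one can be sketched like this. The function literal_vs_array and its parameters are invented for the example:

```c
#include <stddef.h>
#include <stdio.h>

/* Sketch: contrasts a pointer to a string literal with an array copy.
   Writes both strings into out (at most outsize bytes, null-terminated). */
void literal_vs_array(char *out, size_t outsize)
{
    const char *literal = "hello";      /* points at the literal: read-only */
    char array[] = "hello";             /* array holding its own writable copy */

    array[0] = 'j';                     /* fine: modifies the copy only */
    /* literal[0] = 'j';                   rejected by the compiler thanks to const;
                                           through a plain char * it would compile
                                           but be undefined behavior */

    snprintf(out, outsize, "%s %s", literal, array);
}
```

Declaring pointers to literals as const char * turns the mistake from the question into a compile-time diagnostic instead of a crash at run time.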
Thanks all for the reply. Here is the code with changes everyone suggested:
#include <stdio.h>
#include <string.h>

/* Reverses str in place and returns it. */
char *rev(char *str)
{
    int start, end, len;
    len = strlen(str);
    for (start = 0, end = len - 1; start < len / 2; start++, end--)
    {
        /* swap str[start] and str[end] without a temporary */
        str[start] = str[start] + str[end];
        str[end] = str[start] - str[end];
        str[start] = str[start] - str[end];
    }
    return str;
}

int main(void)
{
    char str[] = " hello there !";   /* array, not a pointer to a literal */
    printf("\n the reverse string is %s ...", rev(str));
    return 0;
}
The segmentation fault occurred because *str was pointing to read-only memory; changing it to str[] fixed it. Thanks to Carl Norum for pointing that out.
Any test cases [specifically for API testing] ?
As for testing:
null argument
empty string argument
length 1 string argument
various other lengths - perhaps one long string
You can of course implement the test method with the following strategy:
A generic verify method:
verifyEquals( expected, actual ) { ... }
A test method with various cases:
testReverse() {
    verifyEquals(NULL, rev(NULL));
    verifyEquals("", rev(""));
    verifyEquals("a", rev("a"));
    verifyEquals("ba", rev("ab"));
    verifyEquals("zyx", rev("xyz"));
    verifyEquals("edcba", rev("abcde"));
}
You can also refactor the swap "algorithm" into a separate procedure and unit test it as well.
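The pseudocode above could be turned into plain C along these lines. This is a sketch under two assumptions that are not in the thread: rev() is taken to return NULL when given NULL, and every input is first copied into a writable buffer, because an in-place rev() must not be handed a string literal. The helper names verify_equals, check_reverse and test_reverse are made up for the example:

```c
#include <stdio.h>
#include <string.h>

/* Assumed implementation under test: reverses in place, NULL in gives NULL out. */
char *rev(char *str)
{
    if (str == NULL)
        return NULL;
    size_t len = strlen(str);
    for (size_t i = 0; i < len / 2; i++) {
        char temp = str[i];
        str[i] = str[len - 1 - i];
        str[len - 1 - i] = temp;
    }
    return str;
}

/* Returns 1 when both are NULL or both hold the same text, otherwise 0. */
int verify_equals(const char *expected, const char *actual)
{
    if (expected == NULL || actual == NULL)
        return expected == actual;
    return strcmp(expected, actual) == 0;
}

/* Copies input into a writable buffer, reverses it, compares with expected. */
int check_reverse(const char *input, const char *expected)
{
    char buf[64];
    if (strlen(input) >= sizeof buf)
        return 0;                       /* input too long for this sketch */
    strcpy(buf, input);
    return verify_equals(expected, rev(buf));
}

/* Runs all cases and returns the number of failures. */
int test_reverse(void)
{
    int failures = 0;
    failures += !verify_equals(NULL, rev(NULL));
    failures += !check_reverse("", "");
    failures += !check_reverse("a", "a");
    failures += !check_reverse("ab", "ba");
    failures += !check_reverse("xyz", "zyx");
    failures += !check_reverse("abcde", "edcba");
    if (failures != 0)
        printf("%d reverse test(s) failed\n", failures);
    return failures;
}
```

Calling test_reverse() from main() and using its return value as the exit status gives a quick pass/fail check without any test framework.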

Resources