There are two versions of string copy functions written in C. My question is why the version1 need "!= '\0'" but the version2 doesn't. What if I have a character 0 to be copied using version2, will the '0' terminate coping process?
void version1(char to[], char from[])
{
int i;
i = 0;
while ((to[i] = from[i]) != '\0')
++i;
}
char *version2(char *dest, const char *src)
{
char *addr = dest;
while (*dest++ = *src++);
return addr;
}
In addition, why an input like "1230456" will not terminate the coping since '0' appears in the middle of the string?
This is because in C comparison to zero is optional. When you use an expression in a context requiring a logical expression, C would insert an implicit comparison to zero for you.
You can rewrite the first function as follows without changing the semantic:
while ((to[i] = from[i]))
++i;
Moreover, you can rewrite the second function as follows:
while ((*dest++ = *src++) != '\0');
There is exactly that same != 0 test in the second version, but it's impicit: The result of the expression *dest++ = *src++ becomes the value checked by the while, and in C, all tests boil down to a comparison with zero.
By the same token, in the first example, the while line could be rewritten:
while (to[i] = from[i])
and not change the meaning.
Both versions make the same check, in the second version you just don't see it. You can try to remove != '\0' from the first version, it should still work.
version1 does not NEED the != '\0' but it is better programming practice to include it. It just so happens that '\0' is equal to zero and so version2 will work, but, if you happen to come across a system where '\0' is NOT zero, version2 will not work.
One thing perhaps not mentioned yet is that the reason the zero termination gets copied in the second example is because of the post increment ensures that the value is copied BEFORE it gets checked for a zero value, which terminates the loop.
Related
I started learning C and I had this exercise from the book "Prentice Hall - The C Programming Language".
Chapter 5 Exercise 3:
Write a pointer version of the fuction strcat that we showed in Chapter 2. strcat(s, t) copies the string t to the end of s.
I did the exercise but the first method that came up to my mind was:
void stringcat(char *s, char *t){
int i,j;
i = j = 0;
while(*(s+i) != '\0'){
printf("%d", i);
i++;
}
while ( (*(t+j)) != '\0'){
*(s+i) = *(t+j);
i++;
j++;
}
}
In main I had:
int main(){
char s[] = "Hola";
char t[] = "lala";
stringcat(s,t);
printf("%s\n", s);
}
At first sight I thought it was right but the actual output was Holalalaa.
Of course it was not the output that I expected, but then I coded this:
void stringcat(char *s, char *t){
int i,j;
i = j = 0;
while(*(s+i) != '\0'){
printf("%d", i);
i++;
}
while((*(s+i) = *(t+j)) != '\0'){
i++;
j++;
}
}
And the output was right.
But then I was thinking a lot about the first code because it's very similar to the second one but why the first output was wrong?. Is it something related with the while statement? or something with pointers?. I found it really hard to understand because you can't see what's happening in the array.
Thanks a lot.
Your code has more than the one problem that you found, but let's start with it.
Actually you are asking why
/* ... */
while ((*(t+j)) != '\0') {
*(s+i) = *(t+j);
/* ... */
works differently than
/* ... */
while ((*(s+i) = *(t+j)) != '\0') {
/* ... */
I hope you see it already, now that both cases stand side by side, actually vertically ;-). In the first case the value of t[j] is compared before it is copied to s[i]. In the second case the comparison is done after the copy. That's why the second case copies the terminating '\0' to the target string, and the first case does not.
The output you get works accidentally, it is Undefined Behavior, since you are writing beyond the border of the target array. Fortunately for you, both strings are laying in sequence in the memory, and you are overwriting the source string with its own characters.
Because your first case does not copy the '\0', the final printf() outputs more characters until a '\0' is encountered. By chance this is the last 'a'.
As others commented, the target string has not enough space for the concatenated string. Provide some more space like this:
char s[10] = "Hola"; /* 10 is enough for both strings and the terminating '\0'. */
However, if you had done this already, the error would have not been revealed, because the last 6 characters of s are initialized with '\0'. Not copying the terminating '\0' makes no difference. You can see this if you use
char s[10] = "Hola\0xxxx";
I don't think that your solution is the expected one. Instead of s[i] you are using *(s + i), which is essentially the same, accessing an array. Consider changing s (and in the course, t) in the function and use just *s.
Side note: The printf() in the function is most probably a leftover from debugging. But I'm sure you know.
I saw this C program that copies the first string into the second one using pointers.
void copy(char const *s1, char *s2)
{
for(;(*s2=*s1);++s1,++s2){};
}
I don´t understand the condition that stops the for loop, because I could have written (*s2=*s1)!='\0'and it works, but if I don´t write the !='\0' it works too. How does the for loop know when to stop?
The parentheses are indicating that the test criteria is the result of the assignment to the location pointed to by s1 (the left operand). In other words, the loop runs until the value of *s2 is false.
(*s2=*s1)!='\0' is equivalent to (*s2=*s1)!=0, which is equivalent to (*s2=*s1); or (*s2=*s1)==true if you prefer. Obviously a non-zero value is evaluated as true, so the loop runs until the second string has a nul terminator.
A character between single quotes is a char. The char \0 has a value of 0 thus
char a = '\0`;
is equal to
char a = 0;
And thus if (x != '\0') is equal to if (x != 0), which is equal to if (x) similar in the condition as part of for.
This gets at the heart of how C distinguishes between true and false.
True is any non-zero value (any bit on in an integer). While the condition tests like == and > produce a value of 1 any non-zero value works for true.
False is a value of zero (all bits off in an integer), which includes NULL in pointers.
The value of '\0' is of course a binary zero so the (*s2++=*s1++) in the condition part of the for does an implicit test for non-zero so this works up, and until, the \0 is copied. The \0 returns false and exits the loop. Adding your own !='\0' is adding an explicit test for the same.
Beware: If you just used an incorrect *s1++ = *s2++ != '\0' without the parenthesis it would be treated as a very buggy *s1++ = (*s2++ != '\0') which would assign a series of 1's to *s1 followed by a '\0' to terminate the "string". Oops.
I seem to have some trouble getting my string to terminate with a \0. I'm not sure if this the problem, so I decided to make a post.
First of all, I declared my strings as:
char *input2[5];
Later in the program, I added this line of code to convert all remaining unused slots to become \0, changing them all to become null terminators. Could've done with a for loop, but yea.
while (c != 4) {
input2[c] = '\0';
c++;
}
In Eclipse when in debug mode, I see that the empty slots now contain 0x0, not \0. Are these the same things? The other string where I declared it as
char input[15] = "";
shows \000 when in debug mode though.
My problem is that I am getting segmentation faults (on Debian VM. Works on my Linux 12.04 though). My GUESS is that because the string hasn't really been terminated, the compiler doesn't know when it stops and thus continues to try to access memory in the array when it is clearly already out of bound.
Edit: I will try to answer all other questions soon, but when I change my string declaration to the other suggested one, my program crashes. There is a strtok() function, used to chop my fgets input into strings and then putting them into my input2 array.
So,
input1[0] = 'l'
input1[1] = 's'
input1[2] = '\n'
input2[0] = "ls".
This is a shell simulating program with fork and execvp. I will post more code soon.
Regarding the suggestion:
char *input2[5]; This is a perfectly legal declaration, but it
defined input2 as an array of pointers. To contain a string, it needs
to be an array of char.
I will try that change again. I did try that earlier, but I remember it giving me another run-time error (seg fault?). I think it is because of the way I implemented my strtok() function though. I will check it out again. Thanks!
EDIT 2: I added a response below to update my progress so far. Thanks for all the help!
It is here.
.
You code should rather look like this:
char input2[5];
for (int c=0; c < 4; c++) {
input2[c] = '\0';
}
0x0 and \0 are different representation of the same value 0;
Response 1:
Thanks for all the answers!
I made some changes from the responses, but I reverted the char suggestion (or correct string declaration) because like someone pointed out, I have a strtok function. Strtok requires me to send in a char *, so I reverted back to what I originally had (char * input[5]). I posted my code up to strtok below. My problem is that the program works fine in my Ubuntu 12.04, but gives me a segfault error when I try to run it on the Debian VM.
I am pretty confused as I originally thought the error was because the compiler was trying to access an array index that is already out of bound. That doesn't seem like the problem because a lot of people mentioned that 0x0 is just another way of writing \000. I have posted my debug window's variable section below. Everything seems right though as far as I can see.. hmm..
Input2[0] and input[0], input[1 ] are the focus points.
Here is my code up to the strtok function. The rest is just fork and then execvp call:
int flag = 0;
int i = 0;
int status;
char *s; //for strchr, strtok
char input[15] = "";
char *input2[5];
//char input2[5];
//Prompt
printf("Please enter prompt:\n");
//Reads in input
fgets(input, 100, stdin);
//Remove \n
int len = strlen(input);
if (len > 0 && input[len-1] == '\n')
input[len-1] = ' ';
//At end of string (numb of args), add \0
//Check for & via strchr
s = strchr (input, '&');
if (s != NULL) { //If there is a &
printf("'&' detected. Program not waiting.\n");
//printf ("'&' Found at %s\n", s);
flag = 1;
}
//Now for strtok
input2[i] = strtok(input, " "); //strtok: returns a pointer to the last token found in string, so must declare
//input2 as char * I believe
while(input2[i] != NULL)
{
input2[++i] = strtok( NULL, " ");
}
if (flag == 1) {
i = i - 1; //Removes & from total number of arguments
}
//Sets null terminator for unused slots. (Is this step necessary? Does the C compiler know when to stop?)
int c = i;
while (c < 5) {
input2[c] = '\0';
c++;
}
Q: Why didn't you declare your string char input[5];? Do you really need the extra level of indirection?
Q: while (c < 4) is safer. And be sure to initialize "c"!
And yes, "0x0" in the debugger and '\0' in your source code are "the same thing".
SUGGESTED CHANGE:
char input2[5];
...
c = 0;
while (c < 4) {
input2[c] = '\0';
c++;
}
This will almost certainly fix your segmentation violation.
char *input2[5];
This is a perfectly legal declaration, but it defined input2 as an array of pointers. To contain a string, it needs to be an array of char.
while (c != 4) {
input2[c] = '\0';
c++;
}
Again, this is legal, but since input2 is an array of pointers, input2[c] is a pointer (of type char*). The rules for null pointer constants are such that '\0' is a valid null pointer constant. The assignment is equivalent to:
input2[c] = NULL;
I don't know what you're trying to do with input2. If you pass it to a function expecting a char* that points to a string, your code won't compile -- or at least you'll get a warning.
But if you want input2 to hold a string, it needs to be defined as:
char input2[5];
It's just unfortunate that the error you made happens to be one that a C compiler doesn't necessarily diagnose. (There are too many different flavors of "zero" in C, and they're often quietly interchangeable.)
I'm trying to understand function which copies characters from stdin but I can't understand the while loop and the code following it exactly..... How does the while loop here work??
From what I understand it means until ith character from to[] isn't equal to ith character of from[] keep on adding i am I correct??
If yes than how does the ith character be equal in both the variables ??
Here is a short code :
void copy(char to[] , char from[])
{
int i;
i = 0 ;
while ((to[i] = from[i]) != '\0')
++i;
}
Rewriting it might help:
do{
to[i] = from[i];
++i;
}while (from[i-1] != '\0') // -1 here because we incremented i in the line before and need to check the copied position
Do you understand now?
The condition in the while loop uses the fact that in C an assignment expression has a value which is the value assigned in the assignment. This means that the condition in the while loop can be implemented to have a side-effect, namely the element-wise assignment of the source to the destination. In total, the actual work of the loop is carried out in its condition, while the loop's body just increases the index i.
It's how assignments work. An assignment (a = b) returns a value (b). What you're doing there, is moving from[i] to to[i], and comparing the return value (in this case, from[i]) to the character '\0'.
The null character (0x00) terminates any string, and is thus the terminating character of the string you're copying.
I'd be careful with this code, however, as you don't check the bounds on the array and are leaving yourself open to a segmentation fault if you were to encounter a string that isn't properly null terminated, or where the to[] string is too short.
It first copy from ith character to to ith position and check that, if its the end of string. if not then it increments i(position or index that will point now to next character) and perform this operation until its matches end of string i.e '\0' .
Your code is the same as
void copy(char to[] , char from[])
{
int i;
i = 0 ;
while (from[i] != '\0')
{
to[i] = from[i];
++i;
}
to[i] = '\0';
}
So while it's not at the end of to, it continue copying from in to.
I'm writing a simple string concatenation program.
The program works the way I have posted it. However, I first wrote it using the following code to find the end of the string:
while (*s++)
;
However, that method didn't work. The strings I passed to it weren't copied correctly. Specifically, I tried to copy "abc" to a char[] variable that held "\0".
From reading the C K&R book, it looks like it should work. That compact form should take the following steps.
*s is compared with '\0'
s points to the next address
So why doesn't it work? I am compiling with gcc on Debian.
I found that this version does work:
strncat(char *s, const char *t, int n)
{
char *s_start = s;
while (*s)
s++;
for ( ; n > 0 && *t; n--, s++, t++)
*s = *t;
*(s++) = '\0';
return s_start;
}
Thanks in advance.
After the end of while (*s++);, s points to the character after the null terminator. Take that into account in the code that follows.
The problem is that
while (*s++)
;
Always Increments s, even when s is zero (*s is false)
while (*s)
s++;
only increments s when *s is nonzero
so the first one will leave s pointing to first character after the first \0, while the second one will leave s pointing to the first \0.
There is difference. In the first case, s will point to the position after '\0', while the second stops right at '\0'.
As John Knoeller said, at the end of the run it'll s will point to the location after the NULL. BUT There is no need to sacrifice performance for the correct solution.. Take a look for yourself:
while (*s++); --s;
Should do the trick.
In addition what has been said, note that in C it is technically illegal for a pointer to point to unallocated memory, even if you don't dereference it. So be sure to fix your program, even if it appears to work.