Can someone explain the use of --destination here? - c

I am completing my Cisco course on C and I have a question about the following function.
Can someone please explain the logic of the function, especially the use of --destination here?
char *mystrcat(char *destination, char *source)
{
char *res;
for(res = destination; *destination++; ) ;
for(--destination; (*destination++ = *source++); ) ;
return res;
}

The first loop looks for the string terminator. When it finds it, *destination is false and the loop ends, but the pointer has still been post-incremented by *destination++, so it now points one past the terminator.
So the next loop starts by decrementing the pointer back to the '\0' terminator, which is where the concatenation begins.
In the second loop, each character is copied by (*destination++ = *source++), and the value of that assignment is the loop condition. The loop therefore stops only after the string terminator itself has been copied, which is exactly what is required.
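To make the two loops easier to follow, here is a sketch of the same logic with each step spelled out. The name mystrcat_explicit is made up for illustration, and like the original it assumes destination has enough room for both strings.

```c
/* Hypothetical, spelled-out equivalent of mystrcat from the question.
   Assumes destination points to a buffer large enough for both strings. */
char *mystrcat_explicit(char *destination, const char *source)
{
    char *res = destination;      /* remember the start for the return value */

    /* First loop: stop ON the terminator instead of one past it,
       so no --destination is needed afterwards. */
    while (*destination != '\0')
        destination++;

    /* Second loop: copy characters, including the final '\0'. */
    while ((*destination = *source) != '\0') {
        destination++;
        source++;
    }
    return res;
}
```

Because the first loop here only increments while *destination is non-zero, the pointer never overshoots the terminator, which is the only reason the original needed --destination.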

This function is written far more cryptically than it needs to be.
--destination uses the prefix decrement operator. I'm assuming you already know that variable++ increments the variable by one; similarly, variable-- decrements it by one. The difference between the two positions is the value of the expression: when ++ or -- comes after the variable name, the expression yields the old value and the variable is updated as a side effect; when it comes before, the expression yields the updated value.
For example:
int c = 5;
printf("%d\n", c++); /* outputs 5 */
printf("%d\n", c);   /* outputs 6 */
but
int d = 5;
printf("%d\n", ++d); /* outputs 6 */
printf("%d\n", d);   /* outputs 6 */
This is because in the second example the expression ++d yields the value after the increment, whereas c++ yields the value before it.
Hope that helps.

Related

Why isn't this pointing to the null character in array? ('\0')

Sorry about the poorly worded question, I couldn't think of a better name.
I am learning C, have just moved onto pointers and have written a function, strcat(char *s, char *t), which adds t to the end of s:
void strcat(char *s, char *t) //add t to the end of s
{
while(*s++) //get to the end of s
;
*s--; //unsure why I need this
while(*s++ = *t++) //copy t to the end of s
;
return;
}
Now the question I have is why do I need the line:
*s--;
When I originally added it I thought it made sense until I went through the code.
I would have thought the following was true though:
1) The first loop increments continually and when *s is 0 (or the null character) it moves on so now *s points to the null character of the array.
2) So all I should have to do is implement the second loop. The original null character of s will be replaced by the first character of t until we get to t's null character at which point we exit the second loop and returns.
Clearly I am missing something as the code doesn't work without it!!
After the first loop *s points to one position beyond '\0' but my question is why?
Thanks in advance :)
First *s is evaluated then s is incremented.
So when reaching s's 0-terminator the loop ends, but s still is incremented one more time.
Also there is no need to do:
*s--;
Doing
--s;
or
s--;
would be enough. There is no need to de-reference s here.
Or simply do
while (*s)
++s;
to avoid the need for --s; altogether.
You incremented the pointer after checking the value of the location it was pointing at. Functionally, while( *s++ ) ; behaves like this:
while( *s )
++s;
++s; /* the extra increment that still happens when the test fails */
Change your first while to:
if (*s) {
while(*(++s)) //get to the end of s
;
}
Your code always tests whether s points to '\0' and then increments, whatever the result of the test, so when s reaches the '\0' the loop stops but s is still incremented once more. Note that the pre-increment version never tests the character s initially points to, which is why the separate if (*s) check is needed before the while.
Note that your code (post-increment followed by a decrement after the while) might be faster on some platforms (a branch can be slower than a decrement); my code in this answer is only meant to help you understand the problem.
The ++ operator after the variable name does postincrement, which means it increments by one, but the result of the operator is the value before the increment. If you used ++s, it would be different.
If s is 4, then s will be 5 after x = ++s as well as after x = s++. But the result (the value of x) is 5 in the first case and 4 in the second.
So in your while *s++, when s points to the '\0', you increment it, then take the old, un-incremented pointer, dereference it, see the \0, and stop the loop.
Btw, your '*s--' should be s-- because you don't need the character 'behind' the pointer there.
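A quick way to convince yourself is to measure how far s has moved after each loop form. This is only a sketch; both function names are made up for the illustration.

```c
#include <stddef.h>

/* Hypothetical helpers: each returns how far s has advanced from the start. */
size_t offset_after_postincrement_loop(const char *s)
{
    const char *start = s;
    while (*s++)          /* tests *s, then increments s even when *s is '\0' */
        ;
    return (size_t)(s - start);   /* length + 1: one past the terminator */
}

size_t offset_after_plain_loop(const char *s)
{
    const char *start = s;
    while (*s)            /* increments only while *s is non-zero */
        ++s;
    return (size_t)(s - start);   /* length: exactly on the terminator */
}
```

For the string "abc" the first function returns 4 and the second returns 3, which is the one-position difference that the s-- in the question compensates for.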

Reverse a string using recursion

I got this code from the internet but I couldn't understand all of it.
For example, what does if(*str) mean? Also, can a string be returned? I thought that an array in main could be changed directly in a function, but here it is being returned.
#include<stdio.h>
#define MAX 100
char* getReverse(char[]);
int main(){
char str[MAX],*rev;
printf("Enter any string: ");
scanf("%s",str);
rev = getReverse(str);
printf("Reversed string is: %s\n\n",rev);
return 0;
}
char* getReverse(char str[]){
static int i=0;
static char rev[MAX];
if(*str){
getReverse(str+1);
rev[i++] = *str;
}
return rev;
}
This is not the clearest example of recursion because of the static variables. Hopefully the code generally seems clear to you; I suspect the part that confuses you is the same part that confused me at first.
if(*str){
getReverse(str+1);
rev[i++] = *str;
}
So line by line.
if(*str){
If we have not reached the null terminator.
getReverse(str+1);
Call the getReverse function on the rest of the string, starting at the next character. It seems pretty straightforward up to here. But it also seems like it may not actually reverse anything, because this is the next line:
rev[i++] = *str;
We assign the character at the beginning of str to index i and increment i, but here is the tricky part: i may not be what you think. The recursive call to getReverse happens before i is incremented, and i is static, so changes persist between function calls. Say we have a 5-letter word, "horse". We end up with 6 calls to getReverse on the stack. The 6th does nothing because that is where it finds the null terminator. The trick is that the calls then resolve in reverse order. The call where str points to 'e' resolves first and increments i, because all the others are still waiting for their calls to getReverse to return. So the last letters are actually the first ones to be added to rev, which is what can be confusing here.
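If the static variables get in the way, the same "recurse first, write on the way back" idea can be sketched with the index passed explicitly. This is not the code from the question: get_reverse and reverse_into are hypothetical names, and the caller is assumed to supply an output buffer with room for strlen(str) + 1 bytes.

```c
#include <stddef.h>

/* Hypothetical helper: recurse to the end first, then write characters
   while the calls return, so the last character is written first. */
static void reverse_into(const char *str, char *out, size_t *i)
{
    if (*str) {
        reverse_into(str + 1, out, i);  /* go deeper before writing anything */
        out[(*i)++] = *str;             /* runs only after the deeper calls return */
    }
}

/* Assumes out has room for strlen(str) + 1 bytes. */
char *get_reverse(const char *str, char *out)
{
    size_t i = 0;                       /* fresh index on every call */
    reverse_into(str, out, &i);
    out[i] = '\0';                      /* terminate explicitly */
    return out;
}
```

Unlike the static version, this one can be called more than once, because i starts at 0 each time instead of keeping its value from the previous call.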

Unexpected result in pointer arithmetic, in C

I'm new to C and I'm trying to solve one of the exercise problems in the K&R book. The problem is to detect whether a given string ends with the contents of another string.
For example, if string one is ratandcat and string two is cat, then it is true that string one ends with string two. To solve this, I came up with half a solution and I got stuck with pointers.
My idea is to find the position where string one ends and also the position where string two starts, so that I can proceed further. I have this code:
int strEnd(char *s,char *t){
int ans,slength =0 ,tlength = 0;
while(*s++)
slength++;
while(*t++)
tlength++;
s += slength - tlength;
printf("%c\n",*t);
printf("%c\n",*s);
return ans;
}
void main() {
char *message = "ratandcat";
char *message2 = "at";
int a = strEnd(message,message2);
}
But strangely this outputs :
%
(and a blank space)
But I want my s pointer to point to a. But it doesn't. What's going wrong here?
Thanks in advance.
One problem is that you have incremented the s pointer to the end of the string, and then you add more stuff to it. You could do that loop with something like:
while(s[slength])
++slength;
Another problem is that you are assuming that the s string is longer than the other. What if it's not? And fix the ; problem noted by Simon.
You have an extra semicolon at the end of
while(*t++);
This means that tlength is never incremented, as only the semicolon (empty statement) is executed in the loop.
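Putting the points from the answers together, one possible way to finish the exercise is sketched below. The name str_end and the 1/0 return convention are my own assumptions, not something given in the question.

```c
#include <stddef.h>

/* Sketch: returns 1 if t occurs at the end of s, 0 otherwise. */
int str_end(const char *s, const char *t)
{
    size_t slength = 0, tlength = 0;

    while (s[slength])      /* measure s without moving the pointer */
        ++slength;
    while (t[tlength])
        ++tlength;

    if (tlength > slength)  /* t cannot fit at the end of a shorter s */
        return 0;

    s += slength - tlength; /* s now points at the candidate suffix */
    while (*s && *s == *t) {
        ++s;
        ++t;
    }
    return *s == '\0' && *t == '\0';
}
```

Indexing with s[slength] leaves s and t at the start of their strings, so the later s += slength - tlength lands inside the string instead of past its end.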

C pointers: difference between while(*s++) { ;} and while(*s) { s++;}

I'm going through K & R, and am having difficulty with incrementing pointers. Exercise 5.3 (p. 107) asks you to write a strcat function using pointers.
In pseudocode, the function does the following:
Takes 2 strings as inputs.
Finds the end of string one.
Copies string two onto the end of string one.
I got a working answer:
void strcats(char *s, char *t)
{
while (*s) /* finds end of s*/
s++;
while ((*s++ = *t++)) /* copies t to end of s*/
;
}
But I don't understand why this code doesn't also work:
void strcats(char *s, char *t)
{
while (*s++)
;
while ((*s++ = *t++))
;
}
Clearly, I'm missing something about how pointer incrementation works. I thought the two forms of incrementing s were equivalent. But the second code only prints out string s.
I tried a dummy variable, i, to check whether the function went through both loops. It did. I read over the sections 5.4 and 5.5 of K & R, but I couldn't find anything that sheds light on this.
Can anyone help me figure out why the second version of my function isn't doing what I would like it to? Thanks!
edit: Thanks everyone. It's incredible how long you can stare at a relatively simple error without noticing it. Sometimes there's no better remedy than having someone else glance at it.
This:
while(*s++)
;
due to post-increment, locates the nul byte at the end of the string, then increments s once more before exiting the loop. t is copied after the nul:
scontents␀tcontents␀
Printing s will stop at the first nul.
This:
while(*s)
s++;
breaks from the loop when the 0 is found, so you are left pointing at the nul byte. t is copied over the nul:
scontentstcontents␀
It's an off-by-one issue. Your second version increments the pointer every time the test is evaluated. The original increments one fewer time -- the last time when the test evaluates to 0, the increment isn't done. Therefore in the second version, the new string is appended after the original terminating \0, while in the first version, the first character of the new string overwrites that \0.
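You can see the "scontents␀tcontents␀" layout directly by inspecting the buffer after the call. The function below is the non-working version from the question, renamed strcats_postinc for this sketch; the buffer is assumed to be large enough for both strings plus two terminators.

```c
/* The non-working version from the question, kept as-is for illustration.
   Assumes s has room for both strings plus two terminating bytes. */
void strcats_postinc(char *s, const char *t)
{
    while (*s++)            /* leaves s one past the original '\0' */
        ;
    while ((*s++ = *t++))   /* so t is copied AFTER that '\0' */
        ;
}
```

With char buf[16] = "ab" and t = "cd", the buffer afterwards holds the bytes a, b, \0, c, d, \0. Printing buf still shows only "ab", while buf + 3 is the string "cd".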
This:
while (*s)
s++;
stops as soon as *s is '\0', at which point it leaves s there (because it doesn't execute the body of the loop).
This:
while (*s++)
;
stops as soon as *s is '\0', but still executes the postincrement ++, so s ends up pointing right after the '\0'. So the string-terminating '\0' never gets overwritten, and it still terminates the string.
There's one less operation in while (*s) ++s; When *s is zero, then the loop breaks, while the form while (*s++) breaks but still increments s one last time.
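Strictly speaking, the latter form always leaves the pointer one past the terminator. Forming that pointer is allowed even when the '\0' is the last element of its array (C permits a pointer one past the end of an object), but dereferencing it is undefined behavior. This is contrived, of course, but here's an example: char x = 0, *p = &x; while (*p++) { } leaves p pointing one past x.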
Independent of that, it's best to write clean, readable and deliberate code rather than trying to outsmart yourself. Sometimes you can write nifty code in C that is actually elegant, and other times it's better to spell something out properly. Use your judgement, and ask someone else for feedback (or watch their faces as they look at your code).
let's assume the following characters in memory:
Address 0x00 0x01 0x02 0x03
------- ---- ---- ---- ----
0x8000 'a' 'b' 'c' 0
0x8004 ...
While the loop executes, this is what happens in memory:
1. *s = 'a'
2. s = 0x8001
3. *s = 'b'
4. s = 0x8002
5. *s = 'c'
6. s = 0x8003
7. *s = 0
8. s = 0x8004
9. end loop
While evaluating, *s++ advances the pointer even if the value of *s is 0.
// move s forward until it points one past a 0 character
while (*s++);
It doesn't work because s ends up pointing to a different place.
In summary, t gets copied after the original '\0' rather than over it, because the while loop steps one position past the '\0'. The terminator is never overwritten, so the target string still appears to end where it did before.
You can avoid this by using the code below, which also performs one less increment:
while (*s)
s++;
In memory, it executes as follows:
1. *s = 'a'
2. s = 0x8001
3. *s = 'b'
4. s = 0x8002
5. *s = 'c'
6. s = 0x8003
7. *s = 0
8. end loop

How does this C code work?

I was looking at the following code I came across for printing a string in reverse order in C using recursion:
void ReversePrint(char *str) { //line 1
if(*str) { //line 2
ReversePrint(str+1); //line 3
putchar(*str); //line 4
}
}
I am relatively new to C and am confused by line 2. As I understand it, *str dereferences the pointer and should return the character at the current position of the string. But how is this being used as the argument to a conditional statement (which should expect a boolean, right?)? In line 3, the pointer will always be incremented to the next block (4 bytes since it's an int)... so couldn't this code fail if there happens to be data in the next memory block after the end of the string?
Update: so there are no boolean types in c correct? A conditional statement evaluates to 'false' if the value is 0, and 'true' otherwise?
Line 2 is checking to see if the current character is the null terminator of the string. Since C strings are null-terminated, and the null character is considered a false value, the recursion begins unwinding when it hits the end of the string (instead of calling ReversePrint on the character after the null terminator, which would be beyond the bounds of the valid data).
Also note that the pointer is to a char, thus incrementing the pointer only increments by 1 byte (since char is a single-byte type).
Example:
0 1 2 3
+--+--+--+--+
|f |o |o |\0|
+--+--+--+--+
When str = 0, then *str is 'f' so the recursive call is made for str+1 = 1.
When str = 1, then *str is 'o' so the recursive call is made for str+1 = 2.
When str = 2, then *str is 'o' so the recursive call is made for str+1 = 3.
When str = 3, then *str is '\0' and \0 is a false value thus if(*str) evaluates to false, so no recursive call is made, thus going back up the recursion we get...
Most recent recursion was followed by putchar('o'), then after that,
Next most recent recursion was followed by putchar('o'), then after that,
Least recent recursion was followed by putchar('f'), and we're done.
The type of a C string is nothing but a pointer to char. The convention is that what the pointer points to is an array of characters, terminated by a zero byte.
*str, thus, is the first character of the string pointed to by str.
Using *str in a conditional evaluates to false if str points to the terminating null byte in the (empty) string.
At the end of a string is typically a 0 byte - the line if (*str) is checking for the existence of that byte and stopping when it gets to it.
In line 3, the pointer will always be incremented to the next block (4 bytes since it's an int)...
That's wrong: str is a char *, so it is only incremented by 1, because a char is exactly 1 byte.
But how is this being used as an argument to a conditional statement (which should expect a boolean, right?)?
You can put any scalar value in the condition of an if; it only checks whether the value is non-zero. A bool works the same way underneath: 1 is true and 0 is false.
In some higher-level, strongly typed languages you can't use such values in an if, but in C everything boils down to numbers, and any of them can be used.
if(1) // evaluates to true
if("string") // evaluates to true (a non-null pointer)
if(0) // evaluates to false
You can use any scalar expression in if and while conditions in C.
At the end of the string there is a 0 - so you have "test" => [0]'t' [1]'e' [2]'s' [3]'t' [4]0
and if(0) -> false
this way this will work.
C had no dedicated boolean type originally (C99 added _Bool and <stdbool.h>), and conditions do not require one: every scalar type (i.e. arithmetic and pointer types) can be used in boolean contexts, where 0 means false and non-zero means true.
As strings are null-terminated, the terminator will be interpreted as false, whereas every other character (with non-zero value!) will be true. This means there's an easy way to iterate over the characters of a string:
for(;*str; ++str) { /* do something with *str */ }
ReversePrint() does the same thing, but by recursion instead of iteration.
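As a concrete use of that loop shape, here is a small sketch that counts how often a character occurs in a string. count_char is a made-up name used only for this example.

```c
#include <stddef.h>

/* Hypothetical example: count how many times c occurs in str. */
size_t count_char(const char *str, char c)
{
    size_t n = 0;
    for (; *str; ++str)   /* *str is 0 (false) only at the terminator */
        if (*str == c)
            ++n;
    return n;
}
```

The loop condition is just *str, with no comparison against '\0' written out, which is the same test that if(*str) performs in the recursive version.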
conditional statements (if, for, while, etc) expect a boolean expression. If you provide an integer value the evaluation boils down to 0 == false or non-0 == true. As mentioned already, the terminating character of a c-string is a null byte (integer value 0). So the if will fail at the end of the string (or first null byte within the string).
As an aside, if you do *str on a NULL pointer you are invoking undefined behavior; you should always verify that a pointer is valid before dereferencing.
This is kind of off topic, but when I saw the question I immediately wondered if that was actually faster than just doing an strlen and iterate from the back.
So, I made a little test.
#include <string.h>
void reverse1(const char* str)
{
int total = 0;
if (*str) {
reverse1(str+1);
total += *str;
}
}
void reverse2(const char* str)
{
int total = 0;
size_t t = strlen(str);
while (t > 0) {
total += str[--t];
}
}
int main()
{
const char* str = "here I put a very long string ...";
int i=99999;
while (--i > 0) reverseX(str);
}
I first compiled it with X=1 (using the function reverse1) and then with X=2. Both times with -O0.
It consistently returned approximately 6 seconds for the recursive version and 1.8 seconds for the strlen version.
I think it's because strlen is implemented in assembler and the recursion adds quite an overhead.
I'm quite sure that the benchmark is representative, if I'm mistaken please correct me.
Anyway, I thought I should share this with you.
1.
str is a pointer to a char. Incrementing str will make the pointer point to the second character of the string (as it's a char array).
NOTE: Incrementing a pointer advances it by the size of the data type it points to.
For example:
int *p_int;
p_int++; /* advances by sizeof(int) bytes, typically 4 */
double *p_dbl;
p_dbl++; /* advances by sizeof(double) bytes, typically 8 */
2.
if(expression)
{
statements;
}
The expression is evaluated and if the resulting value is zero (NULL, \0, 0), the statements are not executed. As every string ends with \0 the recursion will have to end some time.
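To check point 1 on your own machine, you can measure the distance in bytes between p and p + 1. This is a sketch with made-up function names; the exact sizes depend on the platform, so the code compares against sizeof rather than hard-coding 4 or 8.

```c
#include <stddef.h>

/* Returns the number of bytes between p and p + 1 for an int pointer. */
size_t int_pointer_stride(void)
{
    int a[2] = {0, 0};
    int *p = a;
    return (size_t)((char *)(p + 1) - (char *)p);   /* equals sizeof(int) */
}

/* Same measurement for a double pointer. */
size_t double_pointer_stride(void)
{
    double d[2] = {0.0, 0.0};
    double *p = d;
    return (size_t)((char *)(p + 1) - (char *)p);   /* equals sizeof(double) */
}
```

For a char pointer the same measurement always gives 1, which is why str + 1 in the question moves to the very next character.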
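Try this code, which is about as simple as the one you are using. It swaps the outermost pair of characters and then recurses inward:
int rev(int lower,int upper,char*string)
{
if(lower>=upper)
return 0;
char tmp=string[lower]; /* swap the outermost pair */
string[lower]=string[upper];
string[upper]=tmp;
return rev(lower+1,upper-1,string); /* then move inward */
}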
