Sorry about the poorly worded question, I couldn't think of a better name.
I am learning C, have just moved onto pointers and have written a function, strcat(char *s, char *t), which adds t to the end of s:
void strcat(char *s, char *t) //add t to the end of s
{
while(*s++) //get to the end of s
;
*s--; //unsure why I need this
while(*s++ = *t++) //copy t to the end of s
;
return;
}
Now the question I have is why do I need the line:
*s--;
When I originally added it I thought it made sense until I went through the code.
I would have thought the following was true though:
1) The first loop increments continually and when *s is 0 (or the null character) it moves on so now *s points to the null character of the array.
2) So all I should have to do is implement the second loop. The original null character of s will be replaced by the first character of t, and copying continues until we get to t's null character, at which point we exit the second loop and return.
Clearly I am missing something as the code doesn't work without it!!
After the first loop *s points to one position beyond '\0' but my question is why?
Thanks in advance :)
First *s is evaluated then s is incremented.
So when reaching s's 0-terminator the loop ends, but s still is incremented one more time.
Also there is no need to do:
*s--;
Doing
--s;
or
s--;
would be enough. There is no need to de-reference s here.
Or simply do
while (*s)
++s;
to avoid the need for --s; altogether.
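A minimal sketch of the difference between the two scanning loops. The helper names offset_after_postincrement and offset_after_test_first are hypothetical (they are not from the question); each one reports how far the pointer has moved from the start of the string when its loop ends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scan with while (*s++) and report where s ends up. */
size_t offset_after_postincrement(const char *str)
{
    const char *s = str;
    assert(str != NULL);
    while (*s++)            /* s is incremented even on the failing test */
        ;
    return (size_t)(s - str);
}

/* Hypothetical helper: scan with while (*s) ++s and report where s ends up. */
size_t offset_after_test_first(const char *str)
{
    const char *s = str;
    assert(str != NULL);
    while (*s)              /* the body is skipped once *s is '\0' */
        ++s;
    return (size_t)(s - str);
}
```

For "abc", the first helper returns 4 (one past the terminator) and the second returns 3 (the index of the terminator itself), which is why the original code needs the extra decrement.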
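You incremented the pointer after checking the value of the location it was pointing at, so the increment also happens on the final test, the one that finds '\0'. The loop that stops on the terminator instead of overshooting it, unlike while( *s++ ), is: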
while( *s )
++s;
Change your first while to:
if (*s) {
while(*(++s)) //get to the end of s
;
}
In your code, you would always be checking if it was pointing to '\0' and then incrementing, so when you reach the '\0' you would check it only on the next iteration, and then you would increment it. Note that changing to pre-increment will not check if the pointer currently points to '\0', so you need to check it before the while.
Note that your code (post-increment plus a decrement after the while) might be faster on most platforms (usually a branch is slower than a decrement); my code in this answer is just to help you understand the problem.
The ++ operator after the variable name does postincrement, which means it increments by one, but the result of the operator is the value before the increment. If you used ++s, it would be different.
If s is 4, then s will be 5 after x = ++s as well as after x = s++. But the result (the value of x) is 5 in the first case, while it's 4 in the second case.
So in your while (*s++), when s points to the '\0', you increment it, then dereference the old, un-incremented pointer, see the '\0', and stop the loop.
Btw, your '*s--' should be s-- because you don't need the character 'behind' the pointer there.
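The value-versus-side-effect distinction can be checked with a small sketch. The function names here are hypothetical, chosen only for illustration; each returns the value of the increment expression so it can be compared with the variable afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the value of the expression (*v)++. */
int value_of_postincrement(int *v)
{
    assert(v != NULL);
    return (*v)++;      /* yields the old value; *v is incremented as a side effect */
}

/* Hypothetical helper: returns the value of the expression ++(*v). */
int value_of_preincrement(int *v)
{
    assert(v != NULL);
    return ++(*v);      /* *v is incremented first; yields the new value */
}
```

Starting from 4, value_of_postincrement returns 4 and value_of_preincrement returns 5, while the variable itself ends up as 5 in both cases.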
I am completing my CISCO course on C and I have a question about the following function.
Can someone please explain the logic of the function to me, especially the use of --destination here?
char *mystrcat(char *destination, char *source)
{
char *res;
for(res = destination; *destination++; ) ;
for(--destination; (*destination++ = *source++); ) ;
return res;
}
The first loop is looking for the string terminator. When it finds it, with *destination being false, the pointer is still post-incremented by *destination++.
So the next loop starts by decrementing the pointer back to the '\0' terminator, to start the concatenation.
In the second loop, each character is copied until the string terminator is found with (*destination++ = *source++); which is evaluated as the loop control. Again, this will include the required string terminator being copied.
This is a very complicated way to write something that shouldn't be this hard to read.
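As a rough sketch of a more explicit variant, assuming the caller guarantees that destination has room for both strings: the name concat_explicit is hypothetical, and the const on source is an addition that the course's version does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, more explicit variant of the function discussed above.
   The caller must ensure destination has room for both strings. */
char *concat_explicit(char *destination, const char *source)
{
    char *res = destination;
    assert(destination != NULL && source != NULL);

    while (*destination != '\0')    /* stop ON the terminator, not past it */
        ++destination;

    while ((*destination = *source) != '\0') {  /* copy, including the final '\0' */
        ++destination;
        ++source;
    }
    return res;
}
```

Because the first loop tests before it increments, no --destination is needed before the copy starts.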
--destination is the pre-decrement form. I'm assuming you already know that variable++ increments the variable by one. Similarly, variable-- decrements the variable by one. The difference lies in the value the expression yields: when the ++ or -- comes after the variable name, the expression evaluates to the old value and the variable is updated as a side effect; when it comes before the variable, the variable is updated first and the expression evaluates to the new value.
For an example:
int c = 5;
printf("%d\n", c++); /* outputs 5 */
printf("%d\n", c);   /* outputs 6 */
but
int d = 5;
printf("%d\n", ++d); /* outputs 6 */
printf("%d\n", d);   /* outputs 6 */
This is because in the second example the increment is applied before the value is used.
Hope that helps.
Can someone explain how this loop works? The entire function serves to figure out where in hash to place certain strings and the code is as follows:
//determine string location in hash
int hash(char* str)
{
int size = 100;
int sum = 0;
for(; *str; str++)
sum += *str;
return sum % size;
}
It seems to iterate over the string character by character until it hits null, but why does a plain *str work as a condition? And why does str++ move to the next character? Shouldn't it be something like *(str+i), where i increments with each loop and moves "i" places in memory from the address in str?
In C, chars and integers implicitly convert to booleans as: 0 - false, non-zero - true;
So for(; *str; str++) iterates until *str is zero. (or nul)
str is a pointer to an array of chars. str++ increments this pointer to point to the next element in the array and therefore the next character in the string.
So instead of using an index, you are moving the pointer.
The condition in a for loop is an expression that is tested for a zero value. The NUL character at the end of str is zero.
The more explicit form of this condition is of course *str != '\0', but that's equivalent since != produces zero when *str is equal to '\0'.
As for why str++ moves to the next character: that's how ++ is defined on pointers. When you increment a char*, you point it to the next char-sized cell in memory. Your *(str + i) solution would also work, it just takes more typing (even though it can be abbreviated str[i]).
This for loop makes use of pointer arithmetic. With it you can increment or decrement the pointer, or add or subtract an offset, to navigate to particular entries in the array. This works because arrays are contiguous blocks of memory.
str points to a string. Strings in C always end with a terminating \0.
*str dereferences the actual pointer to get the char value.
The for loop's continuation condition is equivalent to:
*str != '\0'
and
str++
moves the pointer forward to next element.
The whole for loop is equivalent to:
int len = strlen(str);
int i;
for(i = 0; i < len; i++)
sum += str[i];
You could also write it as a while loop:
while(*str)
sum += *str++;
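To make the equivalence concrete, here is a small sketch with two hypothetical helpers, sum_by_pointer and sum_by_index. Two details are assumptions of this sketch rather than part of the question: the sum starts at 0, and each character is read as unsigned char so the result does not depend on whether plain char is signed.

```c
#include <assert.h>
#include <stddef.h>

/* Walks the string by moving the pointer itself. */
unsigned sum_by_pointer(const char *str)
{
    unsigned sum = 0;
    assert(str != NULL);
    for (; *str; str++)                 /* stops when *str is '\0' */
        sum += (unsigned char)*str;
    return sum;
}

/* Walks the same string with an index; str never changes. */
unsigned sum_by_index(const char *str)
{
    unsigned sum = 0;
    size_t i;
    assert(str != NULL);
    for (i = 0; str[i] != '\0'; i++)    /* str[i] is *(str + i) */
        sum += (unsigned char)str[i];
    return sum;
}
```

Both functions return the same total for any string; the pointer version simply avoids carrying a separate index.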
Why does str++ moves to the next character, shouldn't it be something like this
instead: *(str+i) where i increments with each loop and moves "i" places in
memory based on *str address?
In C/C++, a string variable such as str is a pointer that holds the address of the string's first character. Initially str points to the first character, so *(str) returns the first character of the string.
After str++, str points to the second character, so *(str) returns the second character of the string.
why does simple *str works as a condition?
Every C/C++ string ends with a null character. This null character marks the end of a character string in C. The ASCII code of the NUL character is 0.
In C/C++, 0 means false. Thus a NUL character in a conditional statement
means a false condition.
for(;0;) /* 0 in the condition means false, hence the loop terminates
            when the pointer points to the null character. */
{
}
It has to do with how C converts values to "True" and "False". In C, 0 is "False" and anything else is "True"
Since null (the character) happens to also be zero it evaluates to "False". If the character set were defined differently and the null character had a value of "11" then the above loop wouldn't work!
As for the 2nd half of the question, a pointer points to a "location" in memory. Incrementing that pointer makes it point to the next "location" in memory. The type of the pointer is relevant here too, because the "next" location depends on how big the thing being pointed to is.
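A short sketch of how the pointed-to type sets the step size. The two function names are hypothetical; each measures, in bytes, how far a pointer moves when 1 is added to it.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes between a char* and the same pointer plus one. */
size_t bytes_per_char_step(void)
{
    char cells[2] = {0, 0};
    char *p = cells;
    char *next = p + 1;
    return (size_t)(next - p);      /* char* differences are counted in bytes */
}

/* Distance in bytes between an int* and the same pointer plus one. */
size_t bytes_per_int_step(void)
{
    int cells[2] = {0, 0};
    int *p = cells;
    int *next = p + 1;              /* same effect as ++p: one int further */
    assert(next - p == 1);          /* one element apart ... */
    return (size_t)((char *)next - (char *)p);  /* ... but sizeof(int) bytes apart */
}
```

The first function returns 1 and the second returns sizeof(int), so str++ on a char* always lands on the very next character.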
When the pointer points to a null character, *str is 0, and 0 is regarded as false. This is defined by the C language itself: a condition is false when it compares equal to 0.
C treats 0 as false and everything else as true.
For example in the following code.
if(0) {
puts("true");
} else {
puts("false");
}
false will be the output
The unary * operator is a dereference operator -- *str means "the value pointed to by str." str is a pointer, so incrementing it with str++ (or ++str) changes the pointer to point to the next character. So it is the correct way to increment in the for loop.
Any integral value can be treated as a Boolean. *str as the condition of the for loop takes the value pointed to by str and determines whether it is non-zero. If so, the loop continues. Once it hits a null character, it terminates.
I'm going through K & R, and am having difficulty with incrementing pointers. Exercise 5.3 (p. 107) asks you to write a strcat function using pointers.
In pseudocode, the function does the following:
Takes 2 strings as inputs.
Finds the end of string one.
Copies string two onto the end of string one.
I got a working answer:
void strcats(char *s, char *t)
{
while (*s) /* finds end of s*/
s++;
while ((*s++ = *t++)) /* copies t to end of s*/
;
}
But I don't understand why this code doesn't also work:
void strcats(char *s, char *t)
{
while (*s++)
;
while ((*s++ = *t++))
;
}
Clearly, I'm missing something about how pointer incrementation works. I thought the two forms of incrementing s were equivalent. But the second code only prints out string s.
I tried a dummy variable, i, to check whether the function went through both loops. It did. I read over the sections 5.4 and 5.5 of K & R, but I couldn't find anything that sheds light on this.
Can anyone help me figure out why the second version of my function isn't doing what I would like it to? Thanks!
edit: Thanks everyone. It's incredible how long you can stare at a relatively simple error without noticing it. Sometimes there's no better remedy than having someone else glance at it.
This:
while(*s++)
;
due to post-increment, locates the nul byte at the end of the string, then increments s once more before exiting the loop. t is copied after the nul:
scontents␀tcontents␀
Printing s will stop at the first nul.
This:
while(*s)
s++;
breaks from the loop when the 0 is found, so you are left pointing at the nul byte. t is copied over the nul:
scontentstcontents␀
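The first layout can be reproduced with a small sketch. The function name append_after_terminator is hypothetical, and the function deliberately repeats the non-working version so the buffer can be inspected afterwards. It assumes the buffer behind s has room for s, its terminator, t, and t's terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Reproduces the non-working version on purpose. The buffer behind s must
   have room for s, its terminator, t, and t's terminator. */
void append_after_terminator(char *s, const char *t)
{
    assert(s != NULL && t != NULL);
    while (*s++)            /* leaves s one past the original '\0' */
        ;
    while ((*s++ = *t++))   /* so t lands after that '\0' */
        ;
}
```

With a 16-byte buffer holding "ab" and t equal to "cd", the buffer afterwards contains 'a', 'b', '\0', 'c', 'd', '\0', so printing it still shows only "ab".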
It's an off-by-one issue. Your second version increments the pointer every time the test is evaluated. The original increments one fewer time -- the last time when the test evaluates to 0, the increment isn't done. Therefore in the second version, the new string is appended after the original terminating \0, while in the first version, the first character of the new string overwrites that \0.
This:
while (*s)
s++;
stops as soon as *s is '\0', at which point it leaves s there (because it doesn't execute the body of the loop).
This:
while (*s++)
;
stops as soon as *s is '\0', but still executes the postincrement ++, so s ends up pointing right after the '\0'. So the string-terminating '\0' never gets overwritten, and it still terminates the string.
There's one less operation in while (*s) ++s; When *s is zero, then the loop breaks, while the form while (*s++) breaks but still increments s one last time.
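Strictly speaking, the latter form leaves the pointer one element past the terminator. C allows you to form a pointer one past the end of an object, but dereferencing it is undefined behavior (UB), as is forming a pointer any further out. This is contrived, of course, but here's an example: char x = 0, *p = &x; while (*p++) { }. Afterwards p points one past x and must not be dereferenced.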
Independent of that, it's best to write clean, readable and deliberate code rather than trying to outsmart yourself. Sometimes you can write nifty code in C that is actually elegant, and other times it's better to spell something out properly. Use your judgement, and ask someone else for feedback (or watch their faces as they look at your code).
let's assume the following characters in memory:
Address 0x00 0x01 0x02 0x03
------- ---- ---- ---- ----
0x8000 'a' 'b' 'c' 0
0x8004 ...
While executing the loop, this is what happens in memory:
1. *s = 'a'
2. s = 0x8001
3. *s = 'b'
4. s = 0x8002
5. *s = 'c'
6. s = 0x8003
7. *s = 0;
8. s = 0x8004
9. end loop
While evaluating, *s++ advances the pointer even if the value of *s is 0.
// move s forward until it points one past a 0 character
while (*s++);
It doesn't work at all because s ends up pointing to a different place.
To summarize, the target string comes out wrong because the while loop steps one position past the '\0'.
You can avoid that by using the code below, which I think is efficient:
while (*s)
s++;
It executes as follows, from the memory perspective:
1. *s = 'a'
2. s = 0x8001
3. *s = 'b'
4. s = 0x8002
5. *s = 'c'
6. s = 0x8003
7. *s = 0
8. end loop
I am learning C, and I came across this function that finds the length of a string.
size_t strlen(const char *str)
{
size_t len = 0U;
while(*(str++)) ++len;
return len;
}
Now, when does the loop exit? I am confused, since str++ always increases the pointer.
while(*(str++)) ++len;
is the same, as far as len is concerned, as:
while(*str) {
++len;
++str;
}
which is the same as:
while(*str != '\0') {
++len;
++str;
}
So now you see when str points to the null char at the end of the string, the test condition fails and you stop looping.
C strings are terminated by the NUL character which has the value of 0
0 is false in C and anything else is true.
So we keep incrementing the pointer into the string and the length until we find a NUL and then return.
You need to understand two notions to grasp the idea of the function:
1°) A C string is an array of characters.
2°) In C, an array passed to a function is converted to a pointer to its first element.
So what does strlen do? It uses pointer arithmetic to walk the array (++ on a pointer means: next element) until it reaches the end marker ('\0').
Once *(str++) evaluates to 0, the loop exits. This happens when str points to the terminating character of the string (because strings in C are 0-terminated).
Correct, str++ increases the pointer and returns the previous value. The asterisk (*) dereferences the pointer, i.e. it gives you the character value.
C strings end with a zero byte. The while loop exits when the conditional is no longer true, which means when it is zero.
So the while loop runs until it encounters a zero byte in the string.
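The same idea can be written without a separate counter, by subtracting the start pointer from the end pointer. This is only a sketch; the name length_by_pointer_difference is hypothetical and chosen to avoid clashing with the standard strlen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name, to avoid clashing with the standard strlen. */
size_t length_by_pointer_difference(const char *str)
{
    const char *end = str;
    assert(str != NULL);
    while (*end)                    /* stop on the terminator */
        ++end;
    return (size_t)(end - str);     /* number of characters before '\0' */
}
```

For "hello" this returns 5, and for the empty string it returns 0, because the loop body never runs when the first character is already the terminator.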
I'm writing a simple string concatenation program.
The program works the way I have posted it. However, I first wrote it using the following code to find the end of the string:
while (*s++)
;
However, that method didn't work. The strings I passed to it weren't copied correctly. Specifically, I tried to copy "abc" to a char[] variable that held "\0".
From reading the C K&R book, it looks like it should work. That compact form should take the following steps.
*s is compared with '\0'
s points to the next address
So why doesn't it work? I am compiling with gcc on Debian.
I found that this version does work:
char *strncat(char *s, const char *t, int n)
{
char *s_start = s;
while (*s)
s++;
for ( ; n > 0 && *t; n--, s++, t++)
*s = *t;
*(s++) = '\0';
return s_start;
}
Thanks in advance.
After the end of while (*s++);, s points to the character after the null terminator. Take that into account in the code that follows.
The problem is that
while (*s++)
;
always increments s, even when *s is zero (false)
while (*s)
s++;
only increments s when *s is nonzero
so the first one will leave s pointing to first character after the first \0, while the second one will leave s pointing to the first \0.
There is difference. In the first case, s will point to the position after '\0', while the second stops right at '\0'.
As John Knoeller said, at the end of the loop s will point to the location after the nul terminator. But there is no need to sacrifice performance for the correct solution. Take a look for yourself:
while (*s++); --s;
Should do the trick.
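Put together with a bounded copy, the step-back fix might look like the sketch below. The name append_at_most and the size_t type for n are assumptions of this sketch, not the asker's code, and the caller is assumed to provide a buffer large enough for the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded append: copies at most n characters of t onto s.
   The caller must ensure the buffer behind s is large enough. */
char *append_at_most(char *s, const char *t, size_t n)
{
    char *start = s;
    assert(s != NULL && t != NULL);

    while (*s++)        /* overshoots the terminator by one */
        ;
    --s;                /* step back onto the terminator */

    for (; n > 0 && *t; n--)
        *s++ = *t++;    /* overwrite the old terminator first, then continue */
    *s = '\0';          /* always terminate the result */
    return start;
}
```

With a buffer holding "ab", appending at most 2 characters of "cdef" leaves "abcd" in the buffer.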
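In addition to what has been said, note that in C a pointer may point at most one element past the end of an array, and that one-past-the-end pointer must not be dereferenced. Forming a pointer any further out is technically undefined behavior, even if you don't dereference it. So be sure to fix your program, even if it appears to work.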