How does this C code work? - c

I was looking at the following code I came across for printing a string in reverse order in C using recursion:
void ReversePrint(char *str) { //line 1
if(*str) { //line 2
ReversePrint(str+1); //line 3
putchar(*str); //line 4
}
}
I am relatively new to C and am confused by line 2. *str from my understanding is dereferencing the pointer and should return the value of the string in the current position. But how is this being used as an argument to a conditional statement (which should except a boolean right?)? In line 3, the pointer will always be incremented to the next block (4 bytes since its an int)...so couldn't this code fail if there happens to be data in the next memory block after the end of the string?
Update: so there are no boolean types in c correct? A conditional statement evaluates to 'false' if the value is 0, and 'true' otherwise?

Line 2 is checking to see if the current character is the null terminator of the string - since C strings are null-terminated, and the null character is considered a false value, it will begin unrolling the recursion when it hits the end of the string (instead of trying to call StrReverse4 on the character after the null terminator, which would be beyond the bounds of the valid data).
Also note that the pointer is to a char, thus incrementing the pointer only increments by 1 byte (since char is a single-byte type).
Example:
0 1 2 3
+--+--+--+--+
|f |o |o |\0|
+--+--+--+--+
When str = 0, then *str is 'f' so the recursive call is made for str+1 = 1.
When str = 1, then *str is 'o' so the recursive call is made for str+1 = 2.
When str = 2, then *str is 'o' so the recursive call is made for str+1 = 3.
When str = 3, then *str is '\0' and \0 is a false value thus if(*str) evaluates to false, so no recursive call is made, thus going back up the recursion we get...
Most recent recursion was followed by `putchar('o'), then after that,
Next most recent recursion was followed by `putchar('o'), then after that,
Least recent recursion was followed by `putchar('f'), and we're done.

The type of a C string is nothing but a pointer to char. The convention is that what the pointer points to is an array of characters, terminated by a zero byte.
*str, thus, is the first character of the string pointed to by str.
Using *str in a conditional evaluates to false if str points to the terminating null byte in the (empty) string.

At the end of a string is typically a 0 byte - the line if (*str) is checking for the existence of that byte and stopping when it gets to it.

In line 3, the pointer will always be incremented to the next block (4 bytes since its an int)...
Thats wrong, this is char *, it will only be incremented by 1. Because char is 1 byte long only.
But how is this being used as an argument to a conditional statement (which should except a boolean right?)?
You can use any value in if( $$ ) at $$, and it will only check if its non zero or not, basically bool is also implemented as simple 1=true and 0=false only.
In other higher level strongly typed language you cant use such things in if, but in C everything boils down to numbers. And you can use anything.
if(1) // evaluates to true
if("string") // evaluates to true
if(0) // evaulates to false
You can give any thing in if,while conditions in C.

At the end of the string there is a 0 - so you have "test" => [0]'t' [1]'e' [2]'s' [3]'t' [4]0
and if(0) -> false
this way this will work.

C has no concept of boolean values: in C, every scalar type (ie arithmetic and pointer types) can be used in boolean contexts where 0 means false and non-zero true.
As strings are null-terminated, the terminator will be interpreted as false, whereas every other character (with non-zero value!) will be true. This means means there's an easy way to iterate over the characters of a string:
for(;*str; ++str) { /* so something with *str */ }
StrReverse4() does the same thing, but by recursion instead of iteration.

conditional statements (if, for, while, etc) expect a boolean expression. If you provide an integer value the evaluation boils down to 0 == false or non-0 == true. As mentioned already, the terminating character of a c-string is a null byte (integer value 0). So the if will fail at the end of the string (or first null byte within the string).
As an aside, if you do *str on a NULL pointer you are invoking undefined behavior; you should always verify that a pointer is valid before dereferencing.

This is kind of off topic, but when I saw the question I immediately wondered if that was actually faster than just doing an strlen and iterate from the back.
So, I made a little test.
#include <string.h>
void reverse1(const char* str)
{
int total = 0;
if (*str) {
reverse1(str+1);
total += *str;
}
}
void reverse2(const char* str)
{
int total = 0;
size_t t = strlen(str);
while (t > 0) {
total += str[--t];
}
}
int main()
{
const char* str = "here I put a very long string ...";
int i=99999;
while (--i > 0) reverseX(str);
}
I first compiled it with X=1 (using the function reverse1) and then with X=2. Both times with -O0.
It consistently returned approximately 6 seconds for the recursive version and 1.8 seconds for the strlen version.
I think it's because strlen is implemented in assembler and the recursion adds quite an overhead.
I'm quite sure that the benchmark is representative, if I'm mistaken please correct me.
Anyway, I thought I should share this with you.

1.
str is a pointer to a char. Incrementing str will make the pointer point to the second character of the string (as it's a char array).
NOTE: Incrementing pointers will increment by the data type the pointer points to.
For ex:
int *p_int;
p_int++; /* Increments by 4 */
double *p_dbl;
p_dbl++; /* Increments by 8 */
2.
if(expression)
{
statements;
}
The expression is evaluated and if the resulting value is zero (NULL, \0, 0), the statements are not executed. As every string ends with \0 the recursion will have to end some time.

Try this code, which is as simple as the one which you are using:
int rev(int lower,int upper,char*string)
{
if(lower>upper)
return 0;
else
return rev(lower-1,upper-1,string);
}

Related

I don´t understand this C program made with pointers

I saw this C program that copies the first string into the second one using pointers.
void copy(char const *s1, char *s2)
{
for(;(*s2=*s1);++s1,++s2){};
}
I don´t understand the condition that stops the for loop, because I could have written (*s2=*s1)!='\0'and it works, but if I don´t write the !='\0' it works too. How does the for loop know when to stop?
The parentheses are indicating that the test criteria is the result of the assignment to the location pointed to by s1 (the left operand). In other words, the loop runs until the value of *s2 is false.
(*s2=*s1)!='\0' is equivalent to (*s2=*s1)!=0, which is equivalent to (*s2=*s1); or (*s2=*s1)==true if you prefer. Obviously a non-zero value is evaluated as true, so the loop runs until the second string has a nul terminator.
A character between single quotes is a char. The char \0 has a value of 0 thus
char a = '\0`;
is equal to
char a = 0;
And thus if (x != '\0') is equal to if (x != 0), which is equal to if (x) similar in the condition as part of for.
This gets at the heart of how C distinguishes between true and false.
True is any non-zero value (any bit on in an integer). While the condition tests like == and > produce a value of 1 any non-zero value works for true.
False is a value of zero (all bits off in an integer), which includes NULL in pointers.
The value of '\0' is of course a binary zero so the (*s2++=*s1++) in the condition part of the for does an implicit test for non-zero so this works up, and until, the \0 is copied. The \0 returns false and exits the loop. Adding your own !='\0' is adding an explicit test for the same.
Beware: If you just used an incorrect *s1++ = *s2++ != '\0' without the parenthesis it would be treated as a very buggy *s1++ = (*s2++ != '\0') which would assign a series of 1's to *s1 followed by a '\0' to terminate the "string". Oops.

What does while(*pointer) means in C?

When I recently look at some passage about C pointers, I found something interesting. What it said is, a code like this:
char var[10];
char *pointer = &var;
while(*pointer!='\0'){
//Something To loop
}
Can be turned into this:
//While Loop Part:
while(*pointer){
//Something to Loop
}
So, my problem is, what does *pointer means?
while(x) {
do_something();
}
will run do_something() repeatedly as long as x is true. In C, "true" means "not zero".
'\0' is a null character. Numerically, it's zero (the bits that represents '\0' is the same as the number zero; just like a space is the number 0x20 = 32).
So you have while(*pointer != '\0'). While the pointed-to -memory is not a zero byte. Earlier, I said "true" means "non-zero", so the comparison x != 0 (if x is int, short, etc.) or x != '\0' (if x is char) the same as just x inside an if, while, etc.
Should you use this shorter form? In my opinion, no. It makes it less clear to someone reading the code what the intention is. If you write the comparison explicitly, it makes it a lot more obvious what the intention of the loop is, even if they technically mean the same thing to the compiler.
So if you write while(x), x should be a boolean or a C int that represents a boolean (a true-or-false concept) already. If you write while(x != 0), then you care about x being a nonzero integer and are doing something numerical with x. If you write while(x != '\0'), then x is a char and you want to keep going until you find a null character (you're probably processing a C string).
*pointer means dereference the value stored at the location pointed by pointer. When pointer points to a string and used in while loop like while(*pointer), it is equivalent to while(*pointer != '\0'): loop util null terminator if found.
Let's start with a simple example::
int a = 2 ;
int *b = &a ;
/* Run the loop till *b i.e., 2 != 0
Now, you know that, the loop will run twice
and then the condition will become false
*/
while( *b != 0 )
{
*b-- ;
}
Similarly, your code is working with char*, a string.
char var[10] ;
/* copy some string of max char count = 9,
and append the end of string with a '\0' char.*/
char *pointer = &var ;
while( *pointer != '\0' )
{
// do something
// Increment the pointer 1 or some other valid value
}
So, the while loop will run till *pointer don't hit '\0'.
while( *pointer )
/* The above statement means the same as while( *pointer != '\0' ),
because, null char ('\0') = decimal value, numeric zero, 0*/
But the usage can change when you do, while(*pointer != 'x'), where x can be any char. In this case, your first code will exit after *pointer hits the 'x' char but your second snippet will run till *pointer hits '\0' char.
Yes, you can go for it.
Please note that *pointer is the value at the memory location the pointer point to(or hold the address of).
Your *pointer is now pointing to the individual characters of the character array var.
So, while(*pointer) is shorthand usage of the equivalent
while(*pointer!='\0').
Suppose, your string is initialized to 9 characters say "123456789" and situated at an address say addr(memory location).
Now because of the statement:
char *pointer=&var;
pointer will point to first element of string "1234567890".
When you write the *pointer it will retrieve the value stored at the memory location addr which is 1.
Now, the statement:
while(*pointer)
will be equivalent to
while(49)
because ASCII Value of 1 is 49, and condition is evaluated to true.
This will continue till \0 character is reached after incrementing pointer for nine times.
Now, the statement:
while(*pointer)
will be equivalent to
while(0)
because ASCII value of \0 is 0. Thus, condition is evaluated to false and loop stops.
Summary:
In while(condition), condition must be non-zero to continue loop execution. If condition evaluates to zero then loop stops executing.
while(*pointer) will work till the value at memory location being pointed to is a non-zero ASCII value.
Also you can use:
if(*ptr){ //instead of if(*ptr!='\0')
//do somthing
}
if(!*ptr){ //instead of if(*ptr=='\0')
//do somthing
}
*pointer means exactly what it says: "Give me the value that's stored at the place that the pointer points to". Or "dereference pointer" for short. In your concrete example, dereferencing the pointer produces the one of the characters in a string.
while(*pointer) also means exactly what is says: "While the expression *pointer yields a true value, execute the body of the loop".
Since C considers all non-zero values as true, using *pointer in a condition is always equivalent to using the expression *pointer != 0. Consequently, many C programmers omit the != 0 part in order to practice boolean zen.

Don't understand how this for loop works

Can someone explain how this loop works? The entire function serves to figure out where in hash to place certain strings and the code is as follows:
//determine string location in hash
int hash(char* str)
{
int size = 100;
int sum;
for(; *str; str++)
sum += *str;
return sum % size;
}
It seems to iterate over the string character by character until it hits null, however why does simple *str works as a condition? Why does str++ moves to the next character, shouldn't it be something like this instead: *(str+i) where i increments with each loop and moves "i" places in memory based on *str address?
In C, chars and integers implicitly convert to booleans as: 0 - false, non-zero - true;
So for(; *str; str++) iterates until *str is zero. (or nul)
str is a pointer to an array of chars. str++ increments this pointer to point to the next element in the array and therefore the next character in the string.
So instead of indexing by index. You are moving the pointer.
The condition in a for loop is an expression that is tested for a zero value. The NUL character at the end of str is zero.
The more explicit form of this condition is of course *str != '\0', but that's equivalent since != produces zero when *str is equal to '\0'.
As for why str++ moves to the next character: that's how ++ is defined on pointers. When you increment a char*, you point it to the next char-sized cell in memory. Your *(str + i) solution would also work, it just takes more typing (even though it can be abbreviated str[i]).
This for loop makes use of pointer arithmetic. With that you can increment/decrement the pointer or add/substract an offset to it to navigate to certain entries in the array, since array are continuous blocks of memory you can do that.
str points to a string. Strings in C always end with a terminating \0.
*str dereferences the actual pointer to get the char value.
The for loop's break condition is equivalent to:
*str != '\0'
and
str++
moves the pointer forward to next element.
The hole for-loop is equivalent to:
int len = strlen(str);
int i;
for(i = 0; i < len; i++)
sum += str[i];
You could also write is as while-loop:
while(*str)
sum += *str++;
Why does str++ moves to the next character, shouldn't it be something like this
instead: *(str+i) where i increments with each loop and moves "i" places in
memory based on *str address?
In C/C++, string is a pointer variable that contains the address of your string literal.Initially Str points to the first character.*(str) returns the first character of string.
Str++ points to second charactes.Thus *(str) returns the second character of the string.
why does simple *str works as a condition?
Every c/c++ string contains null character.These Null Characters signify the end of a character string in C. ASCII code of NUL character is 0.
In C/C++,0 means FALSE.Thus, NUL Character in Conditional statement
means FALSE Condition.
for(;0;)/*0 in conditions means false, hence the loop terminates
when pointer points to Null Character.
{
}
It has to do with how C converts values to "True" and "False". In C, 0 is "False" and anything else is "True"
Since null (the character) happens to also be zero it evaluates to "False". If the character set were defined differently and the null character had a value of "11" then the above loop wouldn't work!
As for the 2nd half of the question, a pointer points to a "location" in memory. Incrementing that pointer makes it point to the next "location" in memory. The type of the pointer is relevant here too because the "Next" location depends on how big the thing being pointed to is
When the pointer points to a null character it is regarded as false. This happens in pointers. I don't know who defined it, but it happens.
It may be just becuase C treats 0 as false and every other things as true.
For example in the following code.
if(0) {
puts("true");
} else {
puts("false");
}
false will be the output
The unary * operator is a dereference operator -- *str means "the value pointed to by str." str is a pointer, so incrementing it with str++ (or ++str) changes the pointer to point to the next character. So it is the correct way to increment in the for loop.
Any integral value can be treated as a Boolean. *str as the condition of the for loop takes the value pointed to by str and determine if it is non-zero. If so, the loop continues Once it hits a null character, it terminates.

While (*s) - How does this work?

How does this while loop works? When this *s argument terminates?
void putstr (char *s)
{
while (*s) putchar(*s++);
}
So other notable behaviors, arguments for while?
Logical expressions in C evaluate to false if they are 0, otherwise they evaluate to true. Thus your loop will terminate when *s is equal to 0. In the context of a char that is when the null-terminating character is encountered.
Note that ++ has a higher precedence than pointer dereferencing * and so the ++ is bound to the pointer rather than the char to which it points. Thus the body of your loop will call putchar for the character that s points to, and then increment the pointer s.
*s dereferences into a char, which in the loop, a zero (0, or '\0') will act as false, terminating the loop, all other non-zero characters keep it as true.
The char (*s) gets cast to int, for conditions it holds that any integer != 0 is interpreted as true, so the loop ands when a '\0' char is encountered.
Because the loop itself modifies s (with *s++), the while condition can examine it each time around the loop, and it will eventually terminate, when the pointer points to a nul character.
while (*s)
while the character pointed by s is not zero (that is, if we did't reach the end of the string)
putchar(*s++);
it can be thought as
putchar(*s); // write the character pointed by s
s += 1; // go to next one
s is a pointer on a string.
The end of a string is detected by a 0 value

understanding strlen function in C

I am learning C. And, I see this function find length of a string.
size_t strlen(const char *str)
{
size_t len = 0U;
while(*(str++)) ++len; return len;
}
Now, when does the loop exit? I am confused, since str++, always increases the pointer.
while(*(str++)) ++len;
is same as:
while(*str) {
++len;
++str;
}
is same as:
while(*str != '\0') {
++len;
++str;
}
So now you see when str points to the null char at the end of the string, the test condition fails and you stop looping.
C strings are terminated by the NUL character which has the value of 0
0 is false in C and anything else is true.
So we keep incrementing the pointer into the string and the length until we find a NUL and then return.
You need to understand two notions to grab the idea of the function :
1°) A C string is an array of characters.
2°) In C, an array variable is actually a pointer to the first case of the table.
So what strlen does ? It uses pointer arithmetics to parse the table (++ on a pointer means : next case), till it gets to the end signal ("\0").
Once *(str++) returns 0, the loop exits. This will happen when str points to the last character of the string (because strings in C are 0 terminated).
Correct, str++ increases the counter and returns the previous value. The asterisk (*) dereferences the pointer, i.e. it gives you the character value.
C strings end with a zero byte. The while loop exits when the conditional is no longer true, which means when it is zero.
So the while loop runs until it encounters a zero byte in the string.

Resources