Does a C for-loop exit on encountering a '\0' character?

I was going through a hash function and came across a for loop that is supposed to exit when a '\0' (NUL) character is reached.
unsigned int hash_string (const char *s)
{
    register unsigned int i;
    for (i = 0; *s; s++) { // This for loop is supposed to end
                           // when a '\0' comes?
        i *= 16777619;
        i ^= *s;
    }
    return i;
}
As far as I know, a C loop is supposed to end when its condition evaluates to 0.
Here, however, there is no explicit comparison and it still works?
Could someone also explain under which conditions a loop continues or stops?

The null character has the value of 0, so in your example, *s will evaluate to zero if it corresponds to the null termination of the character string.
From 5.2.1 Character sets
... A byte with all bits set to 0, called the null character, shall
exist in the basic execution character set; it is used to terminate a
character string.
Then in 6.4.4.4 Character constants
12 EXAMPLE 1 The construction '\0' is commonly used to represent the
null character.

*s dereferences the pointer s, yielding the character it points to.
If the character code is 0 the loop ends; for every value other than 0 it continues.
\0 is guaranteed to be 0, which is why the loop is guaranteed to terminate at the end of the string, when it encounters the NUL character.
One of the reasons for choosing \0 as the string terminator in C is to make constructs like this possible.

When *s evaluates to 0, which a condition treats as false, the loop ends.
In fact, the integer value of the character \0 is 0, so it's the same thing.
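To make the equivalence concrete, here is a minimal sketch of the same loop written with the implicit and the explicit test. The function names count_implicit and count_explicit are made up for illustration; they are not part of the question's code.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters using the implicit test. */
size_t count_implicit(const char *s)
{
    size_t n = 0;
    for (; *s; s++)            /* stops when *s is 0, i.e. at '\0' */
        n++;
    return n;
}

/* Hypothetical helper: the same loop with the comparison spelled out. */
size_t count_explicit(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)    /* identical behavior to the loop above */
        n++;
    return n;
}
```

Both functions should give the same result for any valid string, for example 4 for "hash" and 0 for "".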

Related

I don't understand this C program made with pointers

I saw this C program that copies the first string into the second one using pointers.
void copy(char const *s1, char *s2)
{
    for(;(*s2=*s1);++s1,++s2){};
}
I don't understand the condition that stops the for loop, because I could have written (*s2=*s1)!='\0' and it works, but if I don't write the !='\0' it works too. How does the for loop know when to stop?
The parentheses indicate that the test is the result of the assignment to the location pointed to by s2 (the left operand). In other words, the loop runs until the value stored in *s2 is false (zero).
(*s2=*s1)!='\0' is equivalent to (*s2=*s1)!=0, which in a condition is equivalent to plain (*s2=*s1). Any non-zero value is evaluated as true, so the loop runs until the nul terminator has been copied into the second string.
A character between single quotes is a character constant. The character \0 has a value of 0, thus
char a = '\0';
is equal to
char a = 0;
And thus if (x != '\0') is equal to if (x != 0), which is equal to if (x). The same holds for the condition part of a for.
This gets at the heart of how C distinguishes between true and false.
True is any non-zero value (any bit on in an integer). Comparison operators like == and > produce a value of 1 when they hold, but any non-zero value works as true.
False is a value of zero (all bits off in an integer), which includes a null pointer.
The value of '\0' is of course a binary zero, so the (*s2=*s1) in the condition part of the for does an implicit test for non-zero. This works up to, and including, the point where the \0 is copied: the \0 counts as false and exits the loop. Adding your own !='\0' just makes the same test explicit.
Beware: if you wrote *s2++ = *s1++ != '\0' without the inner parentheses, it would be parsed as the very buggy *s2++ = (*s1++ != '\0'), which would assign a series of 1s to the destination followed by a 0 that terminates the "string". Oops.
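A rough sketch of both cases follows. The names copy_ok and copy_buggy are invented here, and the parameter order (source first, destination second) follows the question's copy function. The buggy version writes out explicitly how the unparenthesized form is parsed.

```c
/* Hypothetical name: copies src into dst, including the terminator. */
void copy_ok(const char *src, char *dst)
{
    /* The value of the assignment is the char just stored,
       so the loop ends right after '\0' has been copied. */
    while ((*dst++ = *src++))
        ;
}

/* Hypothetical name: what *dst++ = *src++ != '\0' actually means. */
void copy_buggy(const char *src, char *dst)
{
    /* The comparison binds tighter than the assignment, so dst
       receives the result of the comparison (1 or 0), not the char. */
    while ((*dst++ = (*src++ != '\0')))
        ;
}
```

With a source of "abc", copy_ok should leave "abc" in the destination, while copy_buggy should leave three bytes of value 1 followed by a 0.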

String termination C/C++ char = 0

#include<stdio.h>
#include<string.h>
void terminateString(char *str){
    str[3] = 0;
    printf("string after termination is:%s\n",str);
}
int main(){
    char str[]="abababcdfef";
    terminateString(str);
    return 0;
}
Output:
string after termination is:aba
We are only assigning 0 to the element at index 3, so why are all characters after that index ignored? Can someone please explain this behavior?
The convention with a zero-terminated string is that the 0 byte is what indicates the end of the string. So when printf() encounters the zero-byte at position 3, it stops printing.
The ISO C standard defines a string as follows (see, for example, C11 7.1.1 Definition of terms), emphasis is mine:
A string is a contiguous sequence of characters terminated by and including the first null character.
Hence, when you have the character sequence abababcdfef\0, that is indeed a string.
However, when you put a null at offset 3, the string is not aba\0abcdfef\0 but, by virtue of the fact it's only a string up to and including the first null, it is aba\0.
A C string is a null-terminated string. Null-terminated means "a null character terminates (indicates the end of) the string".
A null character is a character with all its bits set to 0, written \0 and represented in memory as 0x00.
When you set str[3] = 0 you're changing str[3] to the terminator, so when printf reads the terminator, it considers the string finished and only prints "aba".
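As a small illustration, the sketch below writes a terminator into a buffer and reports the string length afterwards. The helper name truncate_at is made up for this example. The array itself keeps its size and its remaining bytes; only the string seen by functions like strlen and printf gets shorter.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: ends the string at index `at` and
   returns the length that string functions will now see. */
size_t truncate_at(char *str, size_t at)
{
    str[at] = '\0';        /* same effect as str[at] = 0 */
    return strlen(str);    /* counts up to the first null character */
}
```

For char str[] = "abababcdfef"; the array holds 12 bytes and strlen(str) is 11. After truncate_at(str, 3) the length should be 3, while sizeof str is still 12 and str[4] still holds 'a'.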
What you are demonstrating is the difference between strings and char arrays. A string is a sequence of characters that continues up to and including the first null character. A character array is a memory allocation unit. A string might not use the entire character array allocated for it (and if the terminator is missing, string functions will read past the bounds of the containing array). If you want to print an array rather than a string for diagnostic purposes, you need to iterate over the array in a loop. See below:
#include<stdio.h>
#include<string.h>
void terminateString(char *str){
    str[3] = 0;
    printf("string after termination is:%s\n",str);
}
int main(){
    char str[]="abababcdfef";
    terminateString(str);
    /* walk the whole array, not just the string; size_t matches sizeof */
    for (size_t i = 0; i < sizeof(str)/sizeof(str[0]); i++) {
        (str[i] != 0) ? printf("%c ", str[i]) : printf("\\0 ");
    }
    printf("\n");
    return 0;
}
// OUTPUT
// string after termination is:aba
// a b a \0 a b c d f e f \0
C and C++ treat 0 and '\0' as the same value, and NULL is a null pointer constant that also compares equal to 0. C-style strings are sequences of characters that end with '\0', so every function that works with them stops when it finds this character. Assigning str[3]=0; is the same as str[3]='\0'; i.e. it ends the string after 3 chars. If you want the digit 0, write str[3]='0';, where the single quotes tell the compiler you want the character 0 (48 in ASCII).
Edit:
Note that NULL is a macro for a null pointer constant, which is not the same thing as nullptr.
In C it is typically 0 or ((void *)0) and in C++98 it is 0; starting with C++11 an implementation may define it as nullptr.
http://www.cplusplus.com/reference/cstring/NULL/
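The difference between the terminator and the digit character can be checked with a tiny sketch like this one (is_terminator is a made-up name):

```c
/* Hypothetical helper: returns 1 if c is the string terminator. */
int is_terminator(char c)
{
    return c == 0;    /* the same test as c == '\0' */
}
```

Here is_terminator(0) and is_terminator('\0') should both be 1, while is_terminator('0') should be 0, because the digit character has a non-zero code.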

While (*s) - How does this work?

How does this while loop work? When does the *s condition make it terminate?
void putstr (char *s)
{
    while (*s) putchar(*s++);
}
Are there other notable behaviors or arguments for while?
Logical expressions in C evaluate to false if they are 0, otherwise they evaluate to true. Thus your loop will terminate when *s is equal to 0. In the context of a char that is when the null-terminating character is encountered.
Note that ++ has a higher precedence than pointer dereferencing * and so the ++ is bound to the pointer rather than the char to which it points. Thus the body of your loop will call putchar for the character that s points to, and then increment the pointer s.
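The precedence point can be checked with a short sketch. Both function names are invented for illustration, and each returns 1 if the described behavior holds.

```c
/* Hypothetical check: *s++ reads the current char, then moves the pointer. */
int postfix_reads_then_advances(void)
{
    const char *s = "xy";
    char c = *s++;          /* parsed as *(s++): reads 'x', s now points at 'y' */
    return c == 'x' && *s == 'y';
}

/* Hypothetical check: (*p)++ changes the char and leaves the pointer alone. */
int parenthesized_changes_char(void)
{
    char buf[] = "ab";
    char *p = buf;
    (*p)++;                 /* increments the stored char by one; p does not move */
    return p == buf && buf[0] == 'a' + 1;
}
```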
*s dereferences to a char. In the loop condition a zero (0, or '\0') acts as false and terminates the loop; every non-zero character keeps it true.
The char *s is compared against 0 in the condition, and any value != 0 is interpreted as true, so the loop ends when a '\0' char is encountered.
Because the loop itself modifies s (with *s++), the while condition can examine it each time around the loop, and it will eventually terminate, when the pointer points to a nul character.
while (*s)
while the character pointed to by s is not zero (that is, while we haven't reached the end of the string)
putchar(*s++);
it can be thought as
putchar(*s); // write the character pointed by s
s += 1; // go to next one
s is a pointer to a string.
The end of a string is detected by a 0 value.
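Putting the pieces together, here is a sketch of the same loop with every step written out. The name putstr_counted and the returned count are additions for illustration only; the original putstr returns nothing.

```c
#include <stdio.h>

/* Hypothetical variant of putstr that also reports how many
   characters were written. */
int putstr_counted(const char *s)
{
    int n = 0;
    while (*s != '\0') {    /* same test as while (*s) */
        putchar(*s);        /* write the character s points to */
        s++;                /* then move on to the next one */
        n++;
    }
    return n;
}
```

Calling putstr_counted("hi") should write hi and return 2; an empty string never enters the loop and returns 0.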

How to understand "return *test == '\0';"

There is a code snippet,
int matchhere(char *regexp, char *text)
{
    /* do sth */
    return *test== '\0';
}
I do not understand what
return *test== '\0';
means, or what it will return. How does "==" work here?
It compares *test to '\0' and returns 0 if they are not equal, 1 if they are equal.
The *test part reads the first character for the C string (a C string is merely a bunch of characters starting at a given address, and the *foo operator looks at that address which happens to contain the first character). By definition, a C string ends with a null byte ('\0' or simply 0).
So this tests whether the first character is the end-of-string character. Or in other words: it tests whether the string is empty. That comparison result (1 if empty, 0 if non-empty) is returned.
It fails to compile because "test" is not the same as "text", and because there is no such type Int in C.
If the typos were fixed, it would check whether the first character of the buffer pointed to by text is the null character -- i.e. it returns 1 if the string is empty, and 0 otherwise.
It checks whether the character pointed to by the text pointer equals the '\0' character (the string-terminating character).
*test means the contents of the test pointer, which is a char.
*test == '\0' just compares that character to the null character.
return *test == '\0' means return the result of that comparison.
So basically, if test points to a null-character then matchhere() will return true, otherwise false.
It checks whether test points to an empty string, and in that case returns a non-zero value.
*test represents the first character of a string.
== is the equality operator.
'\0' is the null character, which in C represents the end of a string.
*test == '\0' is a logical expression which is true whenever the string is empty.
The whole instruction returns that logical result to the caller.
The statement
return *text == '\0';
is equivalent to
return text[0] == '\0';
which is also equivalent to
return text[0] == 0;
In each case, it's comparing the first character of the string pointed to by text to 0, which is the string terminator, and returning the result of the comparison. It's equivalent to writing
if (*text == '\0') // or *text == 0, or text[0] == 0, or !*text, or !text[0]
    return 1;
else
    return 0;
Another equivalent would be
return !*text; // or !text[0]
which will return 0 if *text is non-zero, 1 otherwise, but that's pushing the bounds of good taste.
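The equivalent spellings can be put side by side in a quick sketch. The three function names are invented for the comparison; only the expressions come from the answers above.

```c
/* Hypothetical helpers: three spellings of "is this string empty?" */
int is_empty_deref(const char *text) { return *text == '\0'; }
int is_empty_index(const char *text) { return text[0] == 0; }
int is_empty_not(const char *text)   { return !*text; }
```

All three should return 1 for "" and 0 for any non-empty string such as "abc".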

understanding strlen function in C

I am learning C, and I came across this function that finds the length of a string.
size_t strlen(const char *str)
{
    size_t len = 0U;
    while(*(str++)) ++len;
    return len;
}
Now, when does the loop exit? I am confused, since str++, always increases the pointer.
while(*(str++)) ++len;
computes the same length as (the original additionally steps str once past the terminator):
while(*str) {
    ++len;
    ++str;
}
which is the same as:
while(*str != '\0') {
    ++len;
    ++str;
}
So now you see when str points to the null char at the end of the string, the test condition fails and you stop looping.
C strings are terminated by the NUL character which has the value of 0
0 is false in C and anything else is true.
So we keep incrementing the pointer into the string and the length until we find a NUL and then return.
You need to understand two notions to grasp the idea of the function:
1°) A C string is an array of characters.
2°) In C, an array passed to a function decays to a pointer to its first element.
So what does strlen do? It uses pointer arithmetic to walk the array (++ on a pointer means: next element) until it reaches the end marker ('\0').
Once *(str++) yields 0, the loop exits. This happens when str points to the terminating null character, just after the last character of the string (because strings in C are 0-terminated).
Correct, str++ increments the pointer and returns its previous value. The asterisk (*) dereferences that pointer, i.e. it gives you the character value.
C strings end with a zero byte. The while loop exits when the conditional is no longer true, which means when it is zero.
So the while loop runs until it encounters a zero byte in the string.
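Here is a sketch comparing the post-increment loop with the spelled-out loop. The function names and the extra end parameter are made up so that the final pointer position can be inspected; neither is part of the original strlen.

```c
#include <stddef.h>

/* Hypothetical variant of the question's loop: also reports where str ends up. */
size_t len_postinc(const char *str, const char **end)
{
    size_t len = 0U;
    while (*(str++)) ++len;   /* str is incremented even on the final, failing test */
    *end = str;               /* one past the terminator */
    return len;
}

/* Hypothetical variant of the spelled-out loop. */
size_t len_split(const char *str, const char **end)
{
    size_t len = 0U;
    while (*str != '\0') {
        ++len;
        ++str;
    }
    *end = str;               /* exactly at the terminator */
    return len;
}
```

For "abc" both should return 3. The first leaves its pointer four positions from the start, the second three, which does not matter for the length because str is a local copy of the caller's pointer.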
