why this "for loop" works instead of overflows - c

This is a part of code in an arduino SPI communication instance.
char c;
for (const char* p = "Hello, world!\n"; c = *p; p++){
SPI.transfer(c);
Serial.print(c);
}
It outputs "Hello, world!" on both the serial port and SPI. But how can c = *p tell whether the string has ended or not? Normally we would use something like this:
int k;
char p[]="hello world!";
for (k=0; k < strlen(p); k++){
SPI.transfer(p[k]);
Serial.print(p[k]);
}

Before explaining, you have to understand a few things:
A character (i.e., a value of type char) is an integer type, so it can be treated as a number without any cast; its value is its character code (ASCII on most systems).
Strings in C always end with a NUL terminator. Importantly, this character has a numerical value of 0 (i.e., when you treat it as a number, it's equal to zero).
Boolean values don't really exist in C independent of an integer representation (even C99's _Bool is an integer type). In a condition, zero is treated as "false" and anything else is treated as "true".
With this out of the way, let's look at what's happening in the loop.
First, we have a pointer p that points at the first character of the string "Hello, world!\n" (this is coming from the statement const char* p = "Hello, world!\n").
At the top of the loop, we check our loop condition, which is c = *p. Note that this assigns the value pointed at by p into c, and assignments in C evaluate to the value that's been assigned. What that means in this context is that our test is essentially just *p.
Since a condition in C is just a test against zero, *p is false only when the character pointed at by p has a value of zero. This can only happen when p is pointing to a NUL terminator, which is always at the end of a string in C. Therefore, our loop will stop when we've hit the end of our string.
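To make the "assign, then test" idiom concrete, here is a minimal sketch. The helper name copy_chars and the destination buffer are made up for illustration (they are not part of the Arduino code); the buffer write stands in for the SPI.transfer and Serial.print calls.

```c
#include <stddef.h>

/* Hypothetical helper: copies src into dst one character at a time using
   the same "assign, then test" loop condition as the question.
   Returns the number of characters copied, not counting the terminator.
   Assumes dst has room for the whole string plus the terminator. */
size_t copy_chars(const char *src, char *dst)
{
    size_t n = 0;
    char c;

    /* (c = *src) stores the current character in c. The value of the
       assignment is that character, so the loop stops when c is '\0'.
       The extra parentheses tell the compiler the assignment is intended. */
    for (; (c = *src); src++) {
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

Called with "Hello, world!\n" and a large enough buffer, this should copy 14 characters and stop at the terminator without ever needing strlen.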

Related

Printing the value of a 0-initialized array element prints nothing, why?

I have to initialize a char array to 0's. I did it like
char array[256] = {0};
I wanted to check if it worked so I tried testing it
#include <stdio.h>
int main()
{
char s[256] = {0};
printf("%c\n", s[10]);
return 0;
}
After I compile and run it, the command line output shows nothing.
What am I missing ? Perhaps I initialized the array in a wrong manner ?
TL;DR -- %c is the character representation. Use %d to see the decimal 0 value.
Related, from C11, chapter §7.21.6.1 (emphasis mine):
c If no l length modifier is present, the int argument is converted to an
unsigned char, and the resulting character is written.
FYI, see the list of printable values.
That said, for a hosted environment, int main() should be int main(void), at least to conform to the standard.
You are printing s[10] as a character (%c), and the numeric value of s[10] is 0, which represents the character \0, which means end of string and has no textual representation. For this reason you are not seeing anything.
If you want to see the numeric value instead of the character value, use %d to print it as a decimal (integer) number:
printf("%d\n", s[10]);
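As a small sketch of the difference, the hypothetical helper below (the name char_as_decimal is mine, not from the question) formats a char with %d into a buffer, so the result can be inspected instead of just looked at on a terminal.

```c
#include <stdio.h>

/* Hypothetical helper: writes the decimal value of c into out (capacity n)
   and returns whatever snprintf returns, i.e. the number of characters
   the formatted text needs. */
int char_as_decimal(char c, char *out, size_t n)
{
    /* c is promoted to int when passed to a variadic function,
       which is exactly what %d expects. */
    return snprintf(out, n, "%d", c);
}
```

For an element of a zero-initialized array this should produce the one-character text "0", whereas %c would emit a single byte with value 0 that most terminals display as nothing.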
Note that end of string isn't the same as end of line, as said in one of your comments. End of string means that any string operation over a character sequence must stop when the first \0 arrives. If the character sequence has anything else after \0, it won't be printed, because the string operation stops on the first \0 character.
An end of line, however, is a normal character, whose visual effect is to tell the terminal or text editor to print whatever follows it on a new line.
If you want to have an array full of end of line characters (and print them as such), you have to walk the array and fill it:
char s[256];
int i;
for (i = 0; i < 256; ++i)
s[i] = '\n';
printf("%c\n", s[10]);
The ASCII (decimal/numerical) value of the end of line character (\n) is 10, so the following snippet is equivalent:
char s[256];
int i;
for (i = 0; i < 256; ++i)
s[i] = 10;
printf("%c\n", s[10]);
The following, however, doesn't work (s[10] doesn't print as a new line):
char s[256] = {'\n'}; // or {10};
printf("%c\n", s[10]);
because the effect of {'\n'} is to assign \n to the first element of the array only; the remaining 255 elements are filled with the value 0, no matter which type of array you are making (char[], int[] or whatever). If you write an empty pair of braces {}, all the elements will be 0 (an empty initializer is standard in C++ and in C since C23; older C compilers accept it only as an extension).
So, these two statements are equivalent:
char s[256] = {}; // Implicit filling to 0.
char s[256] = {0}; // Implicit filling to 0 from the second element.
However, if you declare the array without an initializer:
char s[256];
The array is not initialized (assuming it is a local variable; objects with static storage duration are zero-initialized), so each element of s holds an indeterminate value until you fill it, for example with a for loop.
I hope all of these examples give you the whole picture.
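The partial-initializer rule can be checked with a short sketch. Both function names here are invented for illustration; the array size matches the one used in the question.

```c
#include <stddef.h>

/* Hypothetical helper: counts how many of the first n elements are non-zero. */
size_t count_nonzero(const char *a, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (a[i] != 0) {
            ++count;
        }
    }
    return count;
}

/* Builds an array with a one-element initializer and reports how many
   elements ended up non-zero. */
size_t nonzero_after_partial_init(void)
{
    char s[256] = {'\n'};   /* s[0] is '\n'; s[1] through s[255] are 0 */
    return count_nonzero(s, sizeof s);
}
```

If the rule described above holds, nonzero_after_partial_init() should report exactly one non-zero element, and an array declared with {0} should report none.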

How does a char pointer differ from an int pointer in the below code?

1)
int main()
{
int *j,i=0;
int A[5]={0,1,2,3,4};
int B[3]={6,7,8};
int *s1=A,*s2=B;
while(*s1++ = *s2++)
{
for(i=0; i<5; i++)
printf("%d ", A[i]);
}
}
2)
int main()
{
char str1[] = "India";
char str2[] = "BIX";
char *s1 = str1, *s2=str2;
while(*s1++ = *s2++)
printf("%s ", str1);
}
The second program works fine, whereas the first results in some error (maybe a segmentation fault). Why does the pointer variable s2 in program 2 work fine (i.e. stop at the end of the string) but not in program 1, where the loop runs indefinitely?
Also, in the second program, won't the s2 variable get incremented beyond the length of the array?
The thing with strings in C is that they have a special character that marks the end of the string. It's the '\0' character. This special character has the value zero.
In the second program the arrays you have include the terminator character, and since it is zero it is treated as "false" when used in a boolean expression (like the condition in your while loop). That means your loop in the second program will copy characters up to and including the terminator character, but since that is "false" the loop will then end.
In the first program there is no such terminator, and the loop will continue and go out of bounds until it just randomly happens to find a zero in the memory you're copying from. This leads to undefined behavior, which is a common cause of crashes.
So the difference isn't in how pointers are handled, but in the data. If you add a zero at the end of the source array in the first program (B) then it will also work well.
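Here is a minimal sketch of that suggestion, assuming the source array is given an explicit 0 as its last element. The function name copy_until_zero is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: copies ints from src to dst up to and including a
   0 sentinel, mirroring while(*s1++ = *s2++) from the question.
   Returns the number of non-zero values copied.
   Assumes src really contains a 0 and dst is large enough. */
size_t copy_until_zero(int *dst, const int *src)
{
    size_t n = 0;
    while ((*dst++ = *src++) != 0) {
        ++n;
    }
    return n;
}
```

With A = {0,1,2,3,4} as the destination and B = {6,7,8,0} as the source, the loop should copy three values plus the sentinel and leave A as {6,7,8,0,4}. Without the trailing 0 in B, the same loop reads past the end of B, which is the undefined behavior described above.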
In str2 you have stored a string, which means it ends with the end-of-string character ('\0', also called NUL; not to be confused with the NULL pointer). When s2 is incremented and reaches the end of the string, the value copied is zero, so the condition is false and your loop ends.
With the integer pointer there is no end-of-string marker; that's why the loop keeps running.
Joachim gave a good explanation about String terminal character \0 in C language.
Another thing to be aware of when working with pointer is pointer arithmetic.
The unit of pointer arithmetic is the size of the type pointed to.
With a char * pointer named charPtr, a char is always 1 byte, so doing charPtr++ increases the address in charPtr by 1 (1 byte) to make it point to the next char in memory.
With an int * pointer named intPtr, on a system where an int is stored in 4 bytes, doing intPtr++ increases the address in intPtr by 4 (4 bytes) to make it point to the next int in memory.
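A quick way to see the step size is to measure the distance in bytes between a pointer and the same pointer plus one. This is only a sketch; the function names are made up, and the int size is whatever sizeof(int) is on your system rather than a fixed 4.

```c
#include <stddef.h>

/* Distance in bytes covered by one increment of an int pointer. */
size_t int_step_bytes(void)
{
    int values[2] = {0, 0};
    int *p = values;
    int *q = p + 1;                    /* same effect as p++ on a copy */
    return (size_t)((char *)q - (char *)p);
}

/* Distance in bytes covered by one increment of a char pointer. */
size_t char_step_bytes(void)
{
    char values[2] = {0, 0};
    char *p = values;
    char *q = p + 1;
    return (size_t)(q - p);
}
```

int_step_bytes() should equal sizeof(int), and char_step_bytes() should be 1.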

What does while(*pointer) means in C?

While recently reading a passage about C pointers, I found something interesting. It said that code like this:
char var[10];
char *pointer = var; /* var decays to a pointer to its first element; &var has the wrong type */
while(*pointer!='\0'){
//Something To loop
}
Can be turned into this:
//While Loop Part:
while(*pointer){
//Something to Loop
}
So, my question is, what does *pointer mean?
while(x) {
do_something();
}
will run do_something() repeatedly as long as x is true. In C, "true" means "not zero".
'\0' is a null character. Numerically, it's zero (the bits that represent '\0' are the same as those of the number zero, just as a space is the number 0x20 = 32 in ASCII).
So you have while(*pointer != '\0'), which means "while the pointed-to memory is not a zero byte". Earlier, I said "true" means "non-zero", so the comparison x != 0 (if x is int, short, etc.) or x != '\0' (if x is char) is the same as just x inside an if, while, etc.
Should you use this shorter form? In my opinion, no. It makes it less clear to someone reading the code what the intention is. If you write the comparison explicitly, it makes it a lot more obvious what the intention of the loop is, even if they technically mean the same thing to the compiler.
So if you write while(x), x should be a boolean or a C int that represents a boolean (a true-or-false concept) already. If you write while(x != 0), then you care about x being a nonzero integer and are doing something numerical with x. If you write while(x != '\0'), then x is a char and you want to keep going until you find a null character (you're probably processing a C string).
*pointer means dereference pointer, i.e. take the value stored at the location pointer points to. When pointer points to a string and is used in a while loop like while(*pointer), it is equivalent to while(*pointer != '\0'): loop until the null terminator is found.
Let's start with a simple example:
int a = 2;
int *b = &a;
/* Run the loop while *b, i.e. the value of a, is not 0.
The loop will run twice,
and then the condition will become false.
*/
while (*b != 0)
{
(*b)--; /* decrement the value pointed to; *b-- would move the pointer instead */
}
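One detail worth spelling out: postfix -- binds tighter than unary *, so *b-- means *(b--) and moves the pointer, while (*b)-- decrements the value it points to. The sketch below (function name invented, start assumed to be non-negative) counts the iterations of the value-decrementing form.

```c
/* Hypothetical helper: counts how many times a loop of the form
   while (*b != 0) runs when each pass decrements the pointed-to value.
   Assumes start >= 0, otherwise the loop would never reach zero cleanly. */
int count_down_iterations(int start)
{
    int a = start;
    int *b = &a;
    int iterations = 0;

    while (*b != 0) {
        (*b)--;          /* parentheses make this decrement a, not b */
        iterations++;
    }
    return iterations;
}
```

With a starting value of 2 the loop should run exactly twice.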
Similarly, your code is working with a char*, i.e. a string.
char var[10];
/* copy some string of at most 9 characters into var,
and terminate it with a '\0' char. */
char *pointer = var; /* var decays to a pointer to its first element */
while (*pointer != '\0')
{
// do something
// Increment the pointer by 1 or some other valid value
}
So, the while loop will run until *pointer hits '\0'.
while( *pointer )
/* The above statement means the same as while( *pointer != '\0' ),
because, null char ('\0') = decimal value, numeric zero, 0*/
The two forms differ only when you compare against something other than '\0'. With while(*pointer != 'x'), where x can be any char, the loop exits when *pointer hits the 'x' char, whereas while(*pointer) runs until *pointer hits the '\0' char.
Yes, you can go for it.
Please note that *pointer is the value at the memory location that pointer points to (i.e. whose address it holds).
Here, pointer points to the individual characters of the character array var.
So, while(*pointer) is shorthand usage of the equivalent
while(*pointer!='\0').
Suppose your string is initialized to the 9 characters "123456789" and is located at some address, say addr (a memory location).
Now because of the statement:
char *pointer = var;
pointer will point to the first element of the string "123456789".
When you write *pointer, it retrieves the value stored at the memory location addr, which is the character '1'.
Now, the statement:
while(*pointer)
will be equivalent to
while(49)
because the ASCII value of '1' is 49, and the condition evaluates to true.
This continues until the \0 character is reached, after pointer has been incremented nine times.
Now, the statement:
while(*pointer)
will be equivalent to
while(0)
because ASCII value of \0 is 0. Thus, condition is evaluated to false and loop stops.
Summary:
In while(condition), condition must be non-zero to continue loop execution. If condition evaluates to zero then loop stops executing.
while(*pointer) keeps looping as long as the value at the memory location being pointed to is non-zero.
Also you can use:
if(*ptr){ //instead of if(*ptr!='\0')
//do something
}
if(!*ptr){ //instead of if(*ptr=='\0')
//do something
}
*pointer means exactly what it says: "Give me the value that's stored at the place that the pointer points to". Or "dereference pointer" for short. In your concrete example, dereferencing the pointer produces one of the characters in a string.
while(*pointer) also means exactly what it says: "While the expression *pointer yields a true value, execute the body of the loop".
Since C considers all non-zero values as true, using *pointer in a condition is always equivalent to using the expression *pointer != 0. Consequently, many C programmers omit the != 0 part in order to practice boolean zen.
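The equivalence of the two spellings can be sketched with a pair of length functions. Both names are hypothetical; each is just a hand-written stand-in for strlen.

```c
#include <stddef.h>

/* Length of a string using the explicit comparison against '\0'. */
size_t length_explicit(const char *pointer)
{
    size_t n = 0;
    while (*pointer != '\0') {
        ++pointer;
        ++n;
    }
    return n;
}

/* Length of a string using the shorthand condition. */
size_t length_implicit(const char *pointer)
{
    size_t n = 0;
    while (*pointer) {       /* same test: stop at the zero byte */
        ++pointer;
        ++n;
    }
    return n;
}
```

For any properly terminated string the two functions should return the same value, for example 9 for "123456789".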

How does "for ( ; *p; ++p) *p = tolower(*p);" work in c?

I'm fairly new to programming and was just wondering by why this code:
for ( ; *p; ++p) *p = tolower(*p);
works to lower a string case in c, when p points to a string?
In general, this code:
for ( ; *p; ++p) *p = tolower(*p);
does not "work to lower a string case in C, when p points to a string".
It does work for pure ASCII, but since char usually is a signed type, and since tolower requires a non-negative argument (except the special value EOF), the piece will in general have Undefined Behavior.
To avoid that, cast the argument to unsigned char, like this:
for ( ; *p; ++p) *p = tolower( (unsigned char)*p );
Now it can work for single-byte encodings like Latin-1, provided you have set the correct locale via setlocale, e.g. setlocale( LC_ALL, "" );. However, note that the very common UTF-8 encoding does not use a single byte per character. To deal with UTF-8 text you can convert it to a wide string and lowercase that.
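Wrapped in a function, the cast version might look like the sketch below. The function name lowercase_in_place is made up; it assumes p points to a modifiable, NUL-terminated array (not a string literal) and a single-byte encoding.

```c
#include <ctype.h>

/* Hypothetical helper: lowercases a modifiable, NUL-terminated string
   in place, one byte at a time. */
void lowercase_in_place(char *p)
{
    for (; *p; ++p) {
        /* The cast keeps the argument non-negative even when char is
           signed; the result is converted back to char for storage. */
        *p = (char)tolower((unsigned char)*p);
    }
}
```

Applied to an array holding "Hello, WORLD!", this should leave "hello, world!" in the default "C" locale.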
Details:
*p is an expression that denotes the object that p points to, presumably a char.
As a continuation condition for the for loop, any non-zero char value that *p denotes acts as logical true, while the zero char value at the end of the string acts as logical false, ending the loop.
++p advances the pointer to point to the next char.
To unpick, let's assume p is a pointer to a char and just before the for loop, it points to the first character in a string.
In C, strings are typically modelled by a set of contiguous char values with a final 0 added at the end which acts as the null terminator.
*p will evaluate to 0 once the string null-terminator is reached. Then the for loop will exit. (The second expression in the for loop acts as the termination test).
++p advances to the next character in the string.
*p = tolower(*p) sets that character to lower case.

Don't understand how this for loop works

Can someone explain how this loop works? The entire function serves to figure out where in hash to place certain strings and the code is as follows:
//determine string location in hash
int hash(char* str)
{
int size = 100;
int sum = 0; /* must start at zero; an uninitialized sum has an indeterminate value */
for(; *str; str++)
sum += *str;
return sum % size;
}
It seems to iterate over the string character by character until it hits null, however why does simple *str works as a condition? Why does str++ moves to the next character, shouldn't it be something like this instead: *(str+i) where i increments with each loop and moves "i" places in memory based on *str address?
In C, chars and integers used as a condition are tested against zero: 0 is false, non-zero is true.
So for(; *str; str++) iterates until *str is zero (the NUL terminator).
str is a pointer to the first element of an array of chars. str++ increments this pointer to point to the next element in the array and therefore the next character in the string.
So instead of indexing into the array, you are moving the pointer.
The condition in a for loop is an expression that is tested for a zero value. The NUL character at the end of str is zero.
The more explicit form of this condition is of course *str != '\0', but that's equivalent since != produces zero when *str is equal to '\0'.
As for why str++ moves to the next character: that's how ++ is defined on pointers. When you increment a char*, you point it to the next char-sized cell in memory. Your *(str + i) solution would also work, it just takes more typing (even though it can be abbreviated str[i]).
This for loop makes use of pointer arithmetic. With it you can increment/decrement the pointer or add/subtract an offset to navigate to certain entries in the array; this works because arrays are contiguous blocks of memory.
str points to a string. Strings in C always end with a terminating \0.
*str dereferences the actual pointer to get the char value.
The for loop's continuation condition is equivalent to:
*str != '\0'
and
str++
moves the pointer forward to next element.
The whole for-loop is equivalent to:
int len = strlen(str);
int i;
for(i = 0; i < len; i++)
sum += str[i];
You could also write it as a while-loop:
while(*str)
sum += *str++;
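To check that the pointer form and the index form really agree, here is a sketch with both written as complete functions. The names hash_ptr and hash_index are invented; unlike the question's code, sum starts at 0 and each character is read as unsigned char so the result cannot be negative. Those two changes are my assumptions, not part of the original function.

```c
#include <stddef.h>
#include <string.h>

/* Pointer-walking version: the condition *str stops at the terminator. */
int hash_ptr(const char *str)
{
    int size = 100;
    int sum = 0;
    for (; *str; str++) {
        sum += (unsigned char)*str;
    }
    return sum % size;
}

/* Index version: same result, using strlen and str[i]. */
int hash_index(const char *str)
{
    int size = 100;
    int sum = 0;
    size_t len = strlen(str);
    for (size_t i = 0; i < len; i++) {
        sum += (unsigned char)str[i];
    }
    return sum % size;
}
```

Both functions should return the same bucket for any string, and 0 for the empty string.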
Why does str++ moves to the next character, shouldn't it be something like this
instead: *(str+i) where i increments with each loop and moves "i" places in
memory based on *str address?
In C/C++, str is a pointer variable that holds the address of the first character of your string. Initially str points to the first character, so *str yields the first character of the string.
After str++, str points to the second character, so *str yields the second character of the string.
why does simple *str works as a condition?
Every C/C++ string ends with a null character, which signifies the end of the character string. The ASCII code of the NUL character is 0.
In C/C++, 0 means false. Thus, a NUL character in a conditional statement means a false condition.
for(;0;) /* 0 in the condition means false, hence the loop terminates
when the pointer points to the null character. */
{
}
It has to do with how C converts values to "True" and "False". In C, 0 is "False" and anything else is "True"
Since null (the character) happens to also be zero it evaluates to "False". If the character set were defined differently and the null character had a value of "11" then the above loop wouldn't work!
As for the second half of the question, a pointer points to a "location" in memory. Incrementing that pointer makes it point to the next "location" in memory. The type of the pointer is relevant here too, because the "next" location depends on how big the thing being pointed to is.
When the pointer points to a null character, the dereferenced value *str is regarded as false. This is not something special about pointers; it is how the C language defines conditions.
It is simply because C treats 0 as false and everything else as true.
For example in the following code.
if(0) {
puts("true");
} else {
puts("false");
}
The output will be false.
The unary * operator is a dereference operator -- *str means "the value pointed to by str." str is a pointer, so incrementing it with str++ (or ++str) changes the pointer to point to the next character. So it is the correct way to increment in the for loop.
Any integral value can be treated as a Boolean. *str as the condition of the for loop takes the value pointed to by str and determines whether it is non-zero. If so, the loop continues. Once it hits a null character, it terminates.

Resources