So I was reading some code from my C programming course and came across this question:
Trim the string: " make trim " into "make trim".
This was the code:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#define MAX 100
void trim(char *s) {
char *d = s;
while (isspace(*s++))
;
s--;
while (*d++ = *s++)
;
d--;
while (isspace(*--d))
*d = 0;
}
int main() {
char s[MAX];
gets(s);
printf("[%s] -> ", s);
trim(s);
printf("[%s]", s);
return 0;
}
So, what I want to ask is, what are
while (isspace(*s++))
;
s--;
while (*d++ = *s++)
;
d--;
while (isspace(*--d))
*d = 0;
the incremented string (*s++), the ; after the while() loop doing, and how do they work?
Thanks in advance
The empty body is simply because the condition of the while has a side effect so the whole loop is encapsulated in the condition and there is nothing to put in the body.
So:
while (isspace(*s++))
;
Together with the s--; that follows it in the original, this is the same as:
while (isspace(*s)) { s++; }
They put the semicolon on the extra line to make it clearer that the loop is empty (otherwise you might think the next real statement is part of the loop). Other people might put a comment there as well, or even rewrite it to avoid the body being empty.
Code with some descriptive comments:
This loop picks up each character in turn, increments s to point to the next one, and keeps doing that until the character it picked up is not a whitespace character. Note that because the increment happens even when the test fails, s ends up incremented one time too many, which is why there's a decrement after the loop ends.
while (isspace(*s++))
;
s--;
This would have been equivalent and doesn't need the separate decrement:
while (isspace(*s)) { s++; }
This loop copies each character from s, starting at the first non-whitespace character, to d, which initially points at the start of the string. If there was no leading whitespace, it copies each character directly on top of itself. The loop ends after we have copied a null character, i.e. the end of the string. As before, there is one increment too many, so we have to decrement d to end up pointing at the null character:
while (*d++ = *s++)
;
d--;
This loop decrements d and then picks up whatever character it now points at. If it is a whitespace character, the loop replaces it with a null character. Note that there is no check for hitting the start of the string, so if the initial string was empty (i.e. a single null character) or contained only whitespace, it will happily work backward through memory, corrupting bytes outside the string for as long as they are whitespace:
while (isspace(*--d))
*d = 0;
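As a minimal sketch of how the missing start-of-string check could be added, the function below keeps the same three steps but remembers where the buffer begins. The name trim_checked and the casts to unsigned char are my own additions, not part of the course code; the cast is there because passing a negative char value to isspace is undefined behavior.

```c
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing whitespace in place (hypothetical name). */
void trim_checked(char *s)
{
    char *start = s;   /* remembers the beginning of the buffer */
    char *d = s;       /* write position */

    /* Skip leading whitespace without overshooting, so no s-- is needed. */
    while (isspace((unsigned char)*s)) {
        s++;
    }

    /* Shift the rest of the string to the front; the loop stops right
       after copying the terminator, leaving d pointing at it. */
    while ((*d = *s) != '\0') {
        d++;
        s++;
    }

    /* Walk back over trailing whitespace, but never past the start. */
    while (d > start && isspace((unsigned char)d[-1])) {
        d--;
    }
    *d = '\0';         /* a single terminator is enough */
}
```

With this version an empty string or a string of only spaces simply becomes the empty string, because the last loop refuses to step in front of start.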
You can rewrite:
while (condition);
as follows:
while (condition) { /* empty block */ }
If the condition contains expressions with side effects, these are evaluated on every iteration for as long as the condition holds. This is the case in your example.
1.
In your case the first while loop:
while (isspace(*s++));
loops until it finds the first character that is not a space (which is the m of make in this case). Because of the postfix increment, s will then point to the a of make:
" make trim \0"
^ ^
d| s|
2.
After that, s is decremented (s--;) so that it points at the first non-space character (the m of make).
" make trim \0"
^ ^
d| s|
3.
The second while loop writes the string beginning with make to the beginning of the character buffer, because d still points to the beginning: it was set there at the start of the function (char *d = s;). This continues until the null terminator '\0' is reached:
while (*d++ = *s++)
could be rewritten as:
while ((*d++ = *s++) != '\0')
and will result in (note that null terminator '\0' is also written):
"make trim \0 \0"
^ ^
d| s|
4.
Now d is decremented (d--;) so that it points at the null terminator '\0' that was just written.
"make trim \0 \0"
^ ^
d| s|
5.
The last while loop searches backwards (with d) for the first character that is not a space:
while (isspace(*--d))
*d = 0;
It writes a null terminator '\0' over each trailing space until it finds a character that is not a space.
"make trim\0\0\0\0 \0"
So the resulting string will be:
"make trim"
It isn't necessary to write multiple null terminators '\0' here; a single null terminator '\0' after the m of trim would be enough.
while (isspace(*s++))
;
is just equal to
while (isspace(*s++));
Don't let the ; on a new line confuse you. It is just a way to emphasize that this while loop does not have a body. A while loop that has no body keeps running until its condition, here isspace(*s++), is false. Evaluating isspace(*s++) is also the only work this loop has to do, which is why it does not need a body.
When you read code like this, it is easy to miss the ; and then wrongly assume that the expression directly below the loop is the loop body, just not surrounded by {}:
while (isspace(*s++));
s--;
while (isspace(*s++))
s--;
See the difference? In the second example s-- is the while loop's body. To draw attention to this, a programmer can write:
while (isspace(*s++))
;
s--;
to clarify and emphasize the real intent.
Related
I just saw this "while(something);" syntax. I googled it but did not find anything. How does this work? The second while in the example code especially confuses me.
This code is a program to concatenate two strings using pointers.
#include <stdio.h>
#define MAX_SIZE 100 // Maximum string size
int main()
{
char str1[MAX_SIZE], str2[MAX_SIZE];
char * s1 = str1;
char * s2 = str2;
/* Input two strings from user */
printf("Enter first string: ");
gets(str1);
printf("Enter second string: ");
gets(str2);
/* !!!!!!!!!!!!!!!!! this is it!!!!!!!!!!!!!!!!!!!! Move till the end of str1 */
while(*(++s1));
/* !!!!!!!!!!!!!!!!! this is it!!!!!!!!!!!!!!!!!!!! Copy str2 to str1 */
while(*(s1++) = *(s2++));
printf("Concatenated string = %s", str1);
return 0;
}
The while loop is defined in C the following way
while ( expression ) statement
In this while loop
while(*(++s1));
the statement is a null statement. (The C Standard, 6.8.3 Expression and null statements)
3 A null statement (consisting of just a semicolon) performs no
operations.
So in the above while loop the expression is evaluated repeatedly until it becomes false.
Note that this while loop has a bug. ;)
Let's assume that the pointed string is empty "". In memory it is represented the following way
{ '\0' }
So initially s1 points to the terminating zero.
But before dereferencing it is incremented in the expression of the while loop
while(*(++s1));
^^^^
and after that it points into the uninitialized part of the character array after the terminating zero '\0'. So the loop can invoke undefined behavior.
It would be more correct to rewrite it as
while( *s1 != '\0' ) ++s1;
In this case after the loop the pointer s1 will point to the terminating zero '\0' of the source string.
This while loop where the statement is again a null statement
while(*(s1++) = *(s2++));
can be rewritten the following way
while( ( *s1++ = *s2++ ) != '\0' );
that is in essence the same as
while( ( *s1 = *s2 ) != '\0' )
{
++s1;
++s2;
}
(except that if the terminating zero was encountered and copied the pointers are not incremented)
That is, the result of the assignment ( *s1 = *s2 ) is the assigned character, which is then compared with the terminating zero character '\0'. If they are equal, the loop stops, which means that the string pointed to by s2 has been appended to the string pointed to by s1.
Note that the function gets is unsafe and is no longer supported by the C Standard (it was removed in C11). Instead you should use the function fgets, for example
#include <string.h>
#include <stdio.h>
//...
printf("Enter first string: ");
fgets(str1, sizeof( str1 ), stdin );
str1[ strcspn( str1, "\n" ) ] = '\0';
The last statement removes the newline character '\n' that the function call can store at the end of the entered string.
You also need to check in the program whether the array str1 has enough space for the string stored in str2 to be appended to the string stored in str1.
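One way the space check mentioned above might look is sketched below. The helper name append_checked and its return convention are assumptions made for illustration; the copy itself is the same null-statement loop discussed in the question, just started at the terminator of the destination.

```c
#include <stddef.h>
#include <string.h>

/* Append src to dst only if it fits (hypothetical helper).
   dst_size is the total capacity of dst in bytes.
   Returns 1 on success, 0 if dst was left unchanged. */
int append_checked(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    size_t need = strlen(src);

    /* Both strings plus one terminator must fit into dst. */
    if (used >= dst_size || need >= dst_size - used) {
        return 0;
    }

    char *d = dst + used;            /* points at dst's terminator */
    while ((*d++ = *src++) != '\0')  /* copy including the terminator */
        ;
    return 1;
}
```

In the program from the question this would be called as append_checked(str1, sizeof str1, str2) after both strings have been read.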
while(*(++s1)); is an obfuscated and buggy way of writing while(*s1 != '\0') { s1++; }.
(while(*(s1++)); would at least test the first character before moving on, but it is wrong too: it increments the pointer even when the test fails, so s1 ends up one past the null terminator.)
while(*(s1++) = *(s2++)); is an obfuscated (and likely inefficient) way of writing strcpy(s1,s2);.
The whole program is an obfuscated way of writing strcat(s1, s2);. You can replace both of these buggy while loops with that single function call.
Generally while(something); is bad practice, to the point where compilers might even warn about it, since it isn't clear whether the semicolon ended up there on purpose or by a slip of the finger. The preferred style is either:
while(something)
; // aha this was surely not placed there by accident
or
while(something){}
or
while(something)
{}
++s1 advances (or increments) the pointer before the while checks the value it points to.
The while loop will iterate through the string until it reaches the null terminator, since the null character '\0' has the value 0 and while(0) is the same as while(false).
The loop
while(*(++s1));
doesn't need a body because everything is done inside the loop condition.
Therefore the loop body is an empty statement ;.
The loop consists of the following steps:
++s1 increment pointer
*(...) dereference pointer, i.e. get the data where the pointer points to.
use the value as the condition (0 is false, everything else is true)
The loop can be rewritten as
do
{
++s1;
}
while(*s1); // or while(*s1 != '\0');
Similarly, the other loop
while(*(s1++) = *(s2++));
can be written as
char c;
do
{
*s1 = *s2;
c = *s1;
s1++;
s2++;
}
while(c != '\0');
Note that the original loop condition contains an assignment (=), not a comparison (==). The assigned value is used as the loop condition.
I found this program for reversing a string online.
I have just started learning C.
I am not able to understand a few things here.
1. Why is the while ended with ;
2. What does while(str[++i]!='\0'); mean?
3. Is rev[j++] = str[--i]; the same as writing j++; and i--; inside the while loop?
This is the program:
#include<stdio.h>
int main(){
char str[50];
char rev[50];
int i=-1,j=0;
printf("Enter any string : ");
scanf("%s",str);
while(str[++i]!='\0');
while(i>=0)
rev[j++] = str[--i];
rev[j]='\0';
printf("Reverse of string is : %s",rev);
return 0;
}
while(str[++i]!='\0');
is equivalent to
while(str[++i]!='\0')
/*do nothing*/;
which is equivalent to
++i;
while (str[i]!='\0') {
++i;
}
and
while(i>=0)
rev[j++] = str[--i];
is equivalent to
while (i>=0) {
--i;
rev[j] = str[i];
++j;
}
Note that i is decremented before the statement since --i is a pre-decrement, whereas j is incremented after the statement since j++ is a post-increment.
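One thing the expansion above makes visible: with the condition i>=0, the last pass starts with i equal to 0, decrements it to -1 and then reads str[-1], which is outside the array. The sketch below shows one possible corrected version, assuming the loop is meant to stop once the first character has been copied. The function name reverse_copy and the use of size_t are my own choices, not part of the original program.

```c
#include <stddef.h>

/* Write the reverse of str into rev (hypothetical helper).
   rev must have room for at least strlen(str) + 1 bytes. */
void reverse_copy(const char *str, char *rev)
{
    size_t i = 0;
    size_t j = 0;

    /* Find the index of the terminator, i.e. the string length. */
    while (str[i] != '\0') {
        ++i;
    }

    /* --i steps back before each read, so the loop has to stop once
       i reaches 0; with i >= 0 the last read would be str[-1]. */
    while (i > 0) {
        rev[j++] = str[--i];
    }
    rev[j] = '\0';
}
```

Starting i at 0 and testing before incrementing also removes the need for the -1 initialization used in the question.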
I'll try to answer as best as i can...
Why while is ended with ;
This is valid syntax; it's often used to make the program wait at that line until a certain flag is set in an embedded scenario. In this case it's used to find the length of the string.
All strings are terminated with a null character, which is '\0', and the preincrement on i means that after that line i will hold the length of the string.
Effectively it's equivalent to this (assuming i starts at 0 instead of -1):
/* If the ith position of the string is not the end */
while (str[i] != '\0') {
/* Increment i and repeat */
i = i + 1;
}
The main concept here is the difference between postincrement and preincrement operators - might be worth reading up on that.
What does while(str[++i]!='\0'); mean?
See above.
3. Is rev[j++] = str[--i]; same as writing j++; and i--; inside the while loop?
If you're asking whether it's inside the while loop, it's entirely equivalent to:
while(i>=0) {
rev[j++] = str[--i];
}
Since there is only a single statement in the while loop, the braces aren't needed.
Just a note, and this is entirely subjective, but the majority of coding standards I've come across use braces even in this scenario.
Your questions seem to be related mainly to the syntax of C - it might be worth getting a book out or watching some tutorials to familiarise yourself with it.
1: The ; is there to end the loop statement; it is the loop's empty body.
2: while(str[++i]!='\0'); means "Go through each char of str until the \0 char is reached". \0 is the terminating char of a string.
3: Yes
First of all, while(str[++i]!='\0'); increments i until it finds the terminating null character. In C all strings end with the null character '\0' (NUL), which should not be confused with the null pointer NULL.
The second one, no. --i is not the same as i--.
Check the following code snippet:
int a,b,x=10,y=10;
a = x--;
b = --y;
At the end of execution, a = 10 but b = 9. This is because --y is a pre-decrement. It decrements the value first and then assigns its value to b.
Here's a commented version of the program:
// Include standard input output functions
#include<stdio.h>
// declares the main function. It accepts an unspecified number
// of parameters but does not handle them, and returns an integer
int main(){
// declares two arrays of 50 characters, left uninitialized (indeterminate contents)
char str[50];
char rev[50];
// declare and initialize two integer variables
int i=-1,j=0;
printf("Enter any string : ");
scanf("%s",str);
// executes the ';' statement while the condition is satisfied.
// ';' is an empty statement, so it does nothing.
// The only action executed here is the increment of the i variable with
// a preincrement. Because the i variable was initialized with
// -1, the first check is whether str[0] != '\0'.
// If the post-increment operator were used, the variable would have to
// be initialized with 0, and i would end up one past the '\0'.
while(str[++i]!='\0');
// at the end of the previous while, the i variable holds the
// str length (the index of the '\0')
// starting from the end (excluding the '\0', using the pre-decrement on the i variable)
// assign to rev[j] the value of str[i], then (post increment)
// increment the j variable
while(i>=0)
rev[j++] = str[--i];
// now j is equal to str length + 1
// therefore in this position add the null byte
rev[j]='\0';
// print result
printf("Reverse of string is : %s",rev);
// return 0 to the OS
return 0;
}
; means the end of a statement in c.
while(condition)
{
//do something
}
A while loop needs a statement as its body. When there is nothing to do in the body, the empty statement ; is used, as here.
while(str[++i]!='\0'); '\0' represents the end of the string. Here the loop terminates at the end of the string, and ++i increases i on each pass.
Is rev[j++] = str[--i]; same as writing j++; and i--; inside the while loop?
Yes. But as --i decrements i before rev[j++] = str[--i] is executed, i-- should come before rev[j] = str[i]; and as j++ increments j after rev[j++] = str[--i] is executed, j++ should come after rev[j] = str[i].
The key here is understanding the difference in behaviour between prefix (++i), and postfix (i--) operators.
The prefix operator will increment its operand (i), and then evaluate to the new value.
The postfix operator will evaluate to its operand's current value, and then increment the operand afterwards.
As for:
int i = -1;
while (str[++i] != '\0');
This is a loop with no block, because all of the statements can be expressed in the conditional. On each iteration:
Increment i by one.
Get the char at the position i evaluates to.
continue if it is not the NUL character.
This might be better understood when written as:
int i = -1;
do {
i++;
} while (str[i] != '\0');
The result of this operation is that i now holds the position of the NUL character in the string, since all valid character strings must end with the NUL character.
In the next section of the program, the prefix operator is used again to immediately get the character one position before the NUL character, and then one position before that, and so on, until we get the first character of the string, and then we're done.
while(i>=0)
rev[j++] = str[--i];
Why while is ended:
while(str[++i]!='\0')
Since str is an ASCIIZ (null-terminated) string, it ends with a '\0' character. So the while loop ends when it reaches the end of the string.
The line above means:
=> ++i : Increments the string index before getting the corresponding character.
=> Checks if str[index] != '\0' // End of the string reached
At the end of the while loop, the i variable will contain the string length (excluding the '\0' character).
It would be easier to use this:
i = strlen(str);
Is rev[j++] = str[--i]; same as writing j++; and i--; inside the while loop?
No.
This line is the same as:
while(i>=0)
{
i = i - 1;
rev[j] = str[i];
j = j + 1;
}
--i : Gets the string character after decrementing i.
If you changed to i-- the code would get the str[i] before decrementing i, but it is not what you want.
I'm trying to understand a function which copies characters from stdin, but I can't understand the while loop and the code following it exactly. How does the while loop here work?
From what I understand, it means: as long as the ith character of to[] isn't equal to the ith character of from[], keep incrementing i. Am I correct?
If yes, then how can the ith character be equal in both variables?
Here is a short code :
void copy(char to[] , char from[])
{
int i;
i = 0 ;
while ((to[i] = from[i]) != '\0')
++i;
}
Rewriting it might help:
do{
to[i] = from[i];
++i;
}while (from[i-1] != '\0'); // -1 here because we incremented i in the line before and need to check the copied position
Do you understand now?
The condition in the while loop uses the fact that in C an assignment expression has a value which is the value assigned in the assignment. This means that the condition in the while loop can be implemented to have a side-effect, namely the element-wise assignment of the source to the destination. In total, the actual work of the loop is carried out in its condition, while the loop's body just increases the index i.
It's how assignments work. An assignment (a = b) returns a value (b). What you're doing there, is moving from[i] to to[i], and comparing the return value (in this case, from[i]) to the character '\0'.
The null character (0x00) terminates any string, and is thus the terminating character of the string you're copying.
I'd be careful with this code, however, as you don't check the bounds on the array and are leaving yourself open to a segmentation fault if you were to encounter a string that isn't properly null terminated, or where the to[] string is too short.
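A bounds-checked variant might look like the sketch below. The extra to_size parameter, the name copy_bounded and the return value are assumptions added for illustration; the loop itself still does the assignment inside the condition, exactly as in the original.

```c
#include <stddef.h>

/* Copy from into to, writing at most to_size bytes including the
   terminator (hypothetical helper). Returns 1 if the whole string
   fit, 0 if it was truncated or to_size was 0. */
int copy_bounded(char to[], size_t to_size, const char from[])
{
    size_t i = 0;

    if (to_size == 0) {
        return 0;
    }

    /* Same idea as the original loop, plus a bound on i. */
    while (i < to_size - 1 && (to[i] = from[i]) != '\0') {
        ++i;
    }
    to[i] = '\0';            /* needed when the bound stopped the loop */

    return from[i] == '\0';  /* true only if the end of from was reached */
}
```

This protects the destination only. A source that is not null-terminated would still be read past its end, so that part of the warning above remains the caller's responsibility.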
It first copies the ith character of from to the ith position of to and checks whether it is the end of the string. If not, it increments i (the index, which now refers to the next character) and repeats this until it reaches the end of the string, i.e. '\0'.
Your code is the same as
void copy(char to[] , char from[])
{
int i;
i = 0 ;
while (from[i] != '\0')
{
to[i] = from[i];
++i;
}
to[i] = '\0';
}
So while it's not at the end of from, it continues copying from into to.
I am learning C now and I don't really get the point of terminating the string with the null '\0' character. Below is the example from the book:
#include <stdio.h>
#include <string.h>
int main(){
int i;
char str1[] = "String to copy";
char str2[20];
for(i = 0; str1[i]; i++)
str2[i] = str1[i];
str2[i] = '\0'; //<====WHY ADDING THIS LINE??
printf("String str2 %s\n\n", str2);
return 0;
}
So, why do I have to add the null character? Because it works without that line as well. Also, is there a difference if I use:
for (i = 0; str1[i]; i++){
str2[i] = str1[i];
}
Thanks for your time.
The line you're referring to is added in general use for safety. When you copy values to a string you always want to be sure that it's null terminated, otherwise when reading the string it will continue past the point where you want the end of that string to be (because it doesn't know where to stop due to lack of the null terminator).
There is no difference with the alternate code you posted: the braces only enclose the single statement below the for, which is part of the loop by default anyway when you don't use curly braces {}.
In C, the end of the string is detected by the null character. Consider the string 'abcd'. If another variable is stored in memory immediately after the 'd' character, C will treat the bytes that follow as part of that string and keep reading. This is called a buffer overrun.
Your initial statement allowing 20 bytes for str2 may happen to leave it filled with 20 zeroes. However, this is not guaranteed: a local array without an initializer has indeterminate contents. Additionally, let us say you move a 15 character string into str2. If it happened to start with 20 zeroes, this will work. However, say that you then copy a 10 character string into str2. The remaining 5 characters will be unchanged and you will then have a 15 character string consisting of the new 10 characters, followed by the five characters previously copied in.
In the code above the for loop says move the character in str1 to str2 and point to the next character. If the character now pointed to in str1 is not 0, loop back and do again. Otherwise drop out of the loop. Now add the null character to the end of the str2. If you left that out, the null character at the end of str1 would not be copied to str2, and you would have no null character at the end of str2.
This can be expressed as
i = 0;
label:
if (str1[i] == 0) goto end;
str2[i] = str1[i];
i = i + 1;
goto label;
end: /* This is the end of the loop*/
Note that the '\0' character has not yet been moved into str2.
Since C uses braces to mark the extent of the for body, without them only the first statement after the for is part of the loop. If i had scope local to the loop and were lost after it, you could not simply fall out of the loop and then write the 0. You would no longer have a valid index i to tell you where in str2 the 0 belongs.
An example is C++ (or C99 and later), which allows (syntactically)
for (int i = 0; str1[i]; i++)
{
str2[i] = str1[i];
}
str2[i] = 0;
This would fail because the i declared in the for statement goes out of scope when the loop ends. If another i had been defined before the loop, str2[i] would use that variable with whatever value it had before the loop (probably 0). If not, you would get an undeclared-variable compiler error.
I see that you fixed the indentation, but had the original indentation stayed there, the following comment would apply.
C does not work solely by indentation (as Python does, for example). If it did, the logic would be as follows and it would fail because str2 would be overwritten as all 0.
for (int i = 0; str1[i]; i++)
{
str2[i] = str1[i];
str2[i] = 0;
}
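The stale-character scenario described earlier in this answer (a 10 character string copied over a 15 character one) can be reproduced with a small helper. This is only a sketch: copy_chars and its terminate flag are hypothetical names, and the demonstration relies on the buffer being explicitly zero-initialized so that the unterminated copies still produce a valid string.

```c
#include <stddef.h>

/* Copy characters the way the loop in the question does, optionally
   leaving out the final terminator (hypothetical helper). */
void copy_chars(char *dst, const char *src, int terminate)
{
    size_t i;

    /* The loop stops at the '\0' of src without copying it. */
    for (i = 0; src[i]; i++) {
        dst[i] = src[i];
    }
    if (terminate) {
        dst[i] = '\0';   /* the line the question asks about */
    }
}
```

With char buf[20] = {0}, calling copy_chars(buf, "String to copy!", 0) and then copy_chars(buf, "0123456789", 0) leaves the last five characters of the first string in place after the ten new ones. Passing 1 as the last argument cuts the string off where the new text ends.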
You should only add a \0 (also called the null byte) in the end of the string. Do as follows:
...
for(i = 0; str1[i]; i++) {
str2[i] = str1[i];
}
str2[i] = '\0'; //<====WHY ADDING THIS LINE??
...
(note that I simply added braces to make the code more readable, it was confusing before)
For me, that is clearer. What you were doing before is basically take advantage of the fact that the integer i that you declared is still available after you ran the loop to add a \0 in the end of str2.
The way strings work in C is that they are basically a pointer to the location of the first character and string functions (such as the ones you can find in string.h) will read every single char until they find a \0 (null byte). It is simply a convention for marking the end of the string.
Some further reading: http://www.cs.nyu.edu/courses/spring05/V22.0201-001/c_tutorial/classes/String.html
'\0' is used to denote the end of a string. It is not for the compiler; it is for the libraries and possibly your own code. C does not pass arrays around as whole objects: you can have local arrays, but if you try to pass one to a function you only pass its start address (the address of the first element). So you either make the last element special, e.g. '\0', or always pass the size, being careful not to mess up.
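The two conventions mentioned above can be put side by side in a short sketch. Both function names are made up for illustration; the first relies on a special last element, the second on a size supplied by the caller.

```c
#include <stddef.h>

/* Sentinel convention: scan until the special last element '\0'. */
size_t count_until_nul(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Explicit-size convention: the caller passes the element count,
   because inside the function a is only a pointer to the first element. */
long sum_ints(const int *a, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += a[i];
    }
    return total;
}
```

For a local array int v[] = {1, 2, 3, 4}, the caller can compute the count as sizeof v / sizeof v[0]; that expression only works where v is still an array, not inside sum_ints.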
For example:
If your string is like this:
char str[]="Hello \0 World";
can you tell what is displayed if you print str?
Output is:
Hello
This is how character arrays behave. Hence, to be on the safe side, it is good to add '\0' at the end of the string.
If you didn't add '\0', some garbage values might get printed, and printing would continue until a '\0' is reached.
In C, a char[] does not know the length of the string it holds. The character '\0' (ASCII 0) is therefore important to mark the end of the string. Your for loop does not copy the '\0', so without that line the output may be longer than the string in str2 (printing continues until some '\0' is found).
Try:
#include <stdio.h>
#include <string.h>
int main(){
int i;
char str[5] = "1234";
str[4] = '5';
printf("String %s\n\n", str);
return 0;
}
I have a snippet of C code:
I want to add a new line character at certain intervals. The problem is, when I add it in the if block, on the next iteration, strcat takes it away, then concats s on, and then puts the \n at the end.
I can't think of any other way to do this so strcat does not remove the \n that I want to add.
Any ideas?
Hmm... if curPos points to the next available position in the array, then I would do this. curPos + 1 points after the current '\0' termination of ans, since arrays in C are indexed starting at 0.
if (i % 8 == 0 && i != 0)
{
ans[curPos++] = '\n';
ans[curPos] = '\0'; /* always null terminate a string after any extension */
}
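Since the snippet from the question is not shown, the following is only a guess at how the surrounding loop could be arranged so that every write goes through curPos and the string is re-terminated after each step. The function name wrap_every, its parameters and the capacity check are all assumptions; only ans, curPos and the i % 8 test come from the answer above.

```c
#include <stddef.h>

/* Copy src into ans, inserting '\n' before every `every`-th character
   except the first (hypothetical helper; ans_size is the capacity of ans).
   Returns 1 on success, 0 if ans is too small or every is 0. */
int wrap_every(char *ans, size_t ans_size, const char *src, size_t every)
{
    size_t curPos = 0;

    if (ans_size == 0 || every == 0) {
        return 0;
    }
    ans[0] = '\0';

    for (size_t i = 0; src[i] != '\0'; i++) {
        /* Conservative check: in the worst case this step writes
           '\n', one character, and the terminator. */
        if (ans_size - curPos < 3) {
            return 0;
        }
        if (i % every == 0 && i != 0) {
            ans[curPos++] = '\n';
        }
        ans[curPos++] = src[i];
        ans[curPos] = '\0';   /* always null terminate after any extension */
    }
    return 1;
}
```

Because the terminator is rewritten right after each character, ans is a valid string at every point, so mixing this with strcat later would append after the newline instead of overwriting it.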