Can someone explain why to[i] = '\0' is correct? - arrays

It's a function that should copy a given string into another string. Can someone explain why to[i] = '\0' is correct without incrementing i after the loop has stopped, and what ++ means before and after i?
#include <stdio.h>
void copyStringArr(char to[], char from[]);
int main(void)
{
char string1[] = "A string to be copied";
char string2[250];
copyStringArr(string2, string1);
printf("%s\n", string2);
return 0;
}
void copyStringArr(char to[], char from[])
{
int i;
for(i = 0; from[i] != '\0'; i++)
to[i] = from[i];
to[i] = '\0';
}

Let's copy a 2-char string ("hi") instruction by instruction ..
// from = "hi"
// to = "HERE BE DRAGONS..." // garbage
i = 0;
from[0] != 0 // is true
to[0] = from[0]; // to = "hERE BE DRAGONS..."
i++ // i is 1 now
from[1] != 0 // is true
to[1] = from[1]; // to = "hiRE BE DRAGONS..."
i++ // i is 2 now
from[2] != 0 // is false, so exit loop
to[2] = 0; // terminate string, don't care about i anymore
// to = "hi"; the DRAGONS (the initial garbage) are
// still in memory, you need "special tricks" to see them

Because for loops are executed as:
Execute once: i = 0;.
Then loop:
Check from[i] != '\0'
Execute loop body.
Execute i++.
The last full lap of the loop executes i++, and then the next from[i] != '\0' check yields false. That's where the loop stops, and since i was already incremented, its value corresponds to the index of the null terminator.
You can illustrate it with this code that uses the comma operator to sneak in a printf at each of the mentioned steps:
for(int i=(printf("[i=0]\n"),0);
printf("[i<3] "), i<3;
printf("[i++]\n"), i++)
{
printf("[loop body i: %d] ", i);
}
Output:
[i=0]
[i<3] [loop body i: 0] [i++]
[i<3] [loop body i: 1] [i++]
[i<3] [loop body i: 2] [i++]
[i<3]

Related

Why I am getting a space character in my program in the place of third last character?

Why I am getting a space character in my program in the place of third last character?
Even if I change the string str variable I get the same result.
#include <stdio.h>
#include <string.h>
void parser(char array[])
{
int a, b;
for (int i = 0; i < strlen(array); i++) {
if (array[i] == '>') {
a = i;
break;
}
}
for (int j = a; j < strlen(array); j++) {
if (array[j] == '<') {
b = j;
}
}
for (int p = 0, q = a + 1; p < b - a - 1, q < b; p++, q++) {
array[p] = array[q];
array[b - a] = '\0';
printf("%c", array[p]);
}
}
int main()
{
char str[] = "<h1>hello there i am programmer.</h1>";
parser(str);
return 0;
}
There are many things that could be written better in the code but they do not affect the result.
The line that produces the unexpected outcome is:
array[b-a]='\0';
When this for loop starts...
for(int p=0,q=a+1;p<b-a-1,q<b;p++,q++){
array[p]=array[q];
array[b-a]='\0';
printf("%c",array[p]);
}
... the values of a and b are 3 and 32.
The statement array[b-a]='\0'; puts the NUL terminator character at position 29 in array.
The loop starts with p=0, q=4 (a+1) and repeats while q<b*, so the last iteration runs with p=27 and q=31.
When p is 25, q is 29 and array[29] has been repeatedly set to '\0' on the previous iterations, therefore '\0' is copied at position 25 and printed on screen.
You should set the NUL terminator only once, after the loop. And the right position for it is b-a-1, not b-a; you expressed this correctly in the for initialization (p=0) and exit condition (p<b-a-1).
All in all, the code around the last for loop should be like this:
for(int p=0, q=a+1;q<b;p++,q++){
array[p]=array[q];
printf("%c",array[p]);
}
array[b-a-1]='\0';
*The condition p<b-a-1 is ignored because of the comma operator, which discards the value of its left operand. You probably wanted && between the conditions, but they are equivalent, so one of them is enough.

Can't set second element of an empty string in C

I have this code, which doubles every letter in nabucco:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40];
int j=-1;
for(int i=0; i<strlen(string1);i++){
j++;
string2[j] = string1[i];
j++;
string2[j] = string1[i];
}
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
string2 is set to and prints nnaabbuuccccoo.
However, when I try to set j=0; string2 prints (and is presumably set to) nothing.
If j=0 before going into the loop, then j is incremented to 1 in the for loop, which starts off setting the second element of the empty string2. Is there something behind the scenes preventing any element other than the first from being set? I do not understand what is causing this behavior.
The program as posted has undefined behavior since you don't put a null terminator in string2.
If you start with j=0, you never set the value of string2[0] since you increase j before using it, so that position remains indeterminate.
Possible fix:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40];
size_t j=0;
for(size_t i=0, len = strlen(string1); i<len; i++) {
string2[j++] = string1[i]; // increase j "after" using the current value
string2[j++] = string1[i];
}
string2[j] = '\0'; // add null terminator
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
An alternative could be to initialize string2 when you create it:
#include<stdio.h>
#include<string.h>
int main(){
char string1[20] = "nabucco";
char string2[40] = {0}; // zero initialized
for(size_t i=0, j=0, len = strlen(string1); i<len; i++) {
string2[j++] = string1[i];
string2[j++] = string1[i];
}
printf("string1=\t%s\n",string1);
printf("string2=\t%s\n",string2);
}
You increment the counter before using it the first time, so if you set j=0 you do j++ before using it first, which means you start with string2[1].
You can change it like so:
int j=0;
for(int i=0; i<strlen(string1);i++){
string2[j] = string1[i];
j++;
string2[j] = string1[i];
j++;
}
When you do this
printf("string2=\t%s\n",string2);
and string2 was set from the second character, this is undefined behaviour. string2[0] was not initialised and could contain anything, from sensitive data to something that other functions cannot parse. Here it is a single character; in other circumstances it could be more than that, for example if you started with j=20. This is why undefined behaviour is to be avoided at all costs.
In this case, the quirks of platform, compiler and process made it so the first character is a zero. Since C strings are zero-terminated, and string2 begins with a zero, printf prints nothing. If you printed something like this, where "." is a 0 character (ASCII 0),
"nnaa.bbuuccoo."
you would get only "nnaa". By the same token, printf'ing ".nnaabbuuccoo" gets you nothing.
If you were to print string2 starting at the second character, you'd get different results:
printf("string2=\t%s\n",string2 + 1);
After the for loop the character array string2 does not contain a string because you forgot to append the terminating zero character '\0'. So this call of printf
printf("string2=\t%s\n",string2);
invokes undefined behavior.
The simplest approach to resolve the problem is to initialize the array in its declaration. For example
char string2[40] = "";
As for this statement
However, when I try to set j=0; string2 prints (and is presumably set
to) nothing.
within the for loop the variable j is incremented at once, before its first use.
int j = 0;
for(int i=0; i<strlen(string1);i++){
j++;
string2[j] = string1[i];
//...
So the first character of the array string2 is not set.
In this case it is better to write the for loop the following way
size_t j = 0;
for ( size_t i = 0, n = strlen( string1 ); i < n; i++ ){
string2[j++] = string1[i];
string2[j++] = string1[i];
}
string2[j] = '\0';
It is because strings in C are null-terminated. When you set j=0 outside the loop, you skip setting the first character; it stays uninitialized and here happened to be a zero byte, so printf sees an empty string.

Nested for loop isn't iterating

I'm trying a K and R exercise. The program is to compare two strings. If the first string has any characters that are also in string 2 then it will be deleted in string1.
The goal of my compare function below is to compare every array element in the first string with every array element in the second string. If we've got a match then we "raise a red flag" (acting as a boolean value) and we DON'T add it to the new array that will contain the edited string1. However, it seems to be ignoring the second for loop. It only passes through on the k = 0 iteration for every i iteration. My other issue is that, based on the output (provided beneath the code), it seems that s1[i] is being assigned to s2[k]. I'm guessing this takes place in the if statement, but how would that be possible? Any help anyone could provide would be very appreciated.
I used the GNU GCC compiler if it makes a difference.
#include <stdio.h>
int getLength(char s[]);
char compare(char s1[], char s2[],int s1Length, int s2Length);
int main()
{
char stringOne[] = {'a','b','c','d','e'};
char stringTwo[] = {'P','f','g','c','t','y','u','o','z'};
int lengthOne;
int lengthTwo;
lengthOne = getLength(stringOne);
char theResultingString[lengthOne];
lengthTwo = getLength(stringTwo);
compare(stringOne, stringTwo, lengthOne, lengthTwo);
return 0;
} //end of main.
int getLength(char s[]) //getLength gives us the length of each and every string
{
int i=0;
for(i = 0; s[i]!='\0'; i++) {
} //end for loop
return i;
} //end of getLength
char compare(char s1[], char s2[],int s1Length, int s2Length)
{
int redFlagRaised = 0; //This will be used as a boolean indicator if we have a matching element
char toBeReturned[s1Length];
int i;
int k;
for(i = 0; i<s1Length; i++) {
printf("i is now %d\n",i);
for(k = 0; k<s2Length; k++) {
printf("k is now %d\n",k);
if(s1[i] = s2[k]) { //If at any point the s1 char being examined equals any of s2 chars then
printf("s1[i] is %c\n",s1[i]);
printf("s2[i] is %c\n",s2[i]);
redFlagRaised = 1; //we raise the red flag!
} //end first inner if statement
if((k=(s2Length-1))&&(redFlagRaised = 0)) { //if we reach the end and we DON'T have a red flag then
toBeReturned[i] = s1[i];
printf("toBeReturned[0] is %c\n",toBeReturned[0]);
} //end second inner if statement
} //end inner for loop
redFlagRaised = 0; //We lower the flag again for the next inner for loop iteration
} //end outer for loop
printf("The result is %c", toBeReturned[0]);
return toBeReturned[0];
} //end of compare
Output:
i is now 0
k is now 0
s1[i] is P
s2[i] is P
i is now 1
k is now 0
s1[i] is P
s2[i] is f
i is now 2
k is now 0
s1[i] is P
s2[i] is g
i is now 3
k is now 0
s1[i] is P
s2[i] is c
i is now 4
k is now 0
s1[i] is P
s2[i] is t
i is now 5
k is now 0
s1[i] is P
s2[i] is y
The result is �
Process returned 0 (0x0) execution time : 0.005 s
Press ENTER to continue.
char stringOne[] = {'a','b','c','d','e'};
char stringTwo[] = {'P','f','g','c','t','y','u','o','z'};
These are not strings. You need to terminate them with a null character.
Try this:
char stringOne[] = {'a','b','c','d','e','\0'};
char stringTwo[] = {'P','f','g','c','t','y','u','o','z','\0'};
Also, in this condition:
if(s1[i] = s2[k])
use == instead of = (which is the assignment operator). So the condition should be written as:
if(s1[i]==s2[k])
Similarly, in this condition (as Weather Vane mentioned in a comment), if((k=(s2Length-1))&&(redFlagRaised = 0)), use ==:
if((k==(s2Length-1))&&(redFlagRaised == 0))
In the compare function, the if condition assigns a value to k, as shown below:
if((k=(s2Length-1))&&(redFlagRaised = 0)){ //if we reach the end and we DON'T have a red flag then
toBeReturned[i] = s1[i];
printf("toBeReturned[0] is %c\n",toBeReturned[0]);
}
But it needs to be like this:
if((k==(s2Length-1))&&(redFlagRaised == 0)){ //if we reach the end and we DON'T have a red flag then
toBeReturned[i] = s1[i];
printf("toBeReturned[0] is %c\n",toBeReturned[0]);
}
You have to use the comparison operator (==), not the assignment operator (=).
In the code below:
char stringOne[] = {'a','b','c','d','e'};
char stringTwo[] = {'P','f','g','c','t','y','u','o','z'};
These are not strings. You need to terminate them with a null character. Try this:
char stringOne[] = {'a','b','c','d','e','\0'};
char stringTwo[] = {'P','f','g','c','t','y','u','o','z','\0'};
Here, too, use the == operator instead of the = operator:
if(s1[i] = s2[k])

Why do I keep getting extra characters at the end of my string?

I have the string, "helLo, wORld!" and I want my program to change it to "Hello, World!". My program works, the characters are changed correctly, but I keep getting extra characters after the exclamation mark. What could I be doing wrong?
void normalize_case(char str[], char result[])
{
if (islower(str[0]) == 1)
{
result[0] = toupper(str[0]);
}
for (int i = 1; str[i] != '\0'; i++)
{
if (isupper(str[i]) == 1)
{
result[i] = tolower(str[i]);
}
else if (islower(str[i]) == 1)
{
result[i] = str[i];
}
if (islower(str[i]) == 0 && isupper(str[i]) == 0)
{
result[i] = str[i];
}
if (str[i] == ' ')
{
result[i] = str[i];
}
if (str[i - 1] == ' ' && islower(str[i]) == 1)
{
result[i] = toupper(str[i]);
}
}
}
You are not null terminating result so when you print it out it will keep going until a null is found. If you move the declaration of i to before the for loop:
int i ;
for ( i = 1; str[i] != '\0'; i++)
you can add:
result[i] = '\0' ;
after the for loop; this assumes result is large enough.
Extra random-ish characters at the end of a string usually means you've forgotten to null-terminate ('\0') your string. Your loop copies everything up to, but not including, the terminal null into the result.
Add result[i] = '\0'; after the loop before you return.
Normally, you treat the isxxxx() functions (macros) as returning a boolean condition, and you'd ensure that you only have one of the chain of conditions executed. You'd do that with more careful use of else clauses. Your code actually copies str[i] multiple times if it is a blank. In fact, I think you can compress your loop to:
int i;
for (i = 1; str[i] != '\0'; i++)
{
if (isupper(str[i]))
result[i] = tolower(str[i]);
else if (str[i - 1] == ' ' && islower(str[i]))
result[i] = toupper(str[i]);
else
result[i] = str[i];
}
result[i] = '\0';
If I put result[i] outside of the for loop, won't the compiler complain about i?
Yes, it will. In this context, you need i defined outside the loop control, because you need the value after the loop. See the amended code above.
You might also note that your pre-loop code quietly skips the first character of the string if it is not lower-case, leaving garbage as the first character of the result. You should really write:
result[0] = toupper(str[0]);
so that result[0] is always set.
You should add the statement result[i] = '\0' after the loop, because in C a string must end with the special character '\0', which tells functions such as printf where the string ends.
I took the liberty of simplifying your code as a lot of the checks you do are unnecessary. The others have already explained some basic points to keep in mind:
#include <stdio.h> /* for printf */
#include <ctype.h> /* for islower and the like */
void normalise_case(char str[], char result[])
{
result[0] = toupper(str[0]); /* capitalise at the start; toupper leaves other characters unchanged, so result[0] is always set */
int i; /* older C standards (pre C99) won't like it if you don't pre-declare 'i' so I've put it here */
for (i = 1; str[i] != '\0'; i++)
{
result[i] = str[i]; /* I've noticed that you copy the string in each if case, so I've put it here at the top */
if (isupper(result[i]))
{
result[i] = tolower(result[i]);
}
if (result[i - 1] == ' ' && islower(result[i])) /* at the start of a word, capitalise! */
{
result[i] = toupper(result[i]);
}
}
result[i] = '\0'; /* this has already been explained */
}
int main()
{
char in[20] = "tESt tHIs StrinG";
char out[20] = ""; /* space to store the output */
normalise_case(in, out);
printf("%s\n", out); /* Prints 'Test This String' */
return 0;
}

Search for '\n' in a char pointer using C

I am trying to loop over a char *str to find out how many lines it contains:
char *str = "test1\ntest2\ntest3";
int lines = 0;
for(int i = 0 ; i < ?? ; i ++ )
{
if(str[i] == '\n') {
lines++;
}
}
I am not sure what to put at the ??. My questions are:
1. Do I need to use strlen(str) + 1?
2. When str is "test1\ntest2\ntest3\n", does the code still count the lines correctly?
I am using gcc, by the way. Thanks.
Every string literal ends with \0, the null character, which marks the end of the string.
So you can do this:
for(int i = 0 ; str[i]!='\0' ; i ++ )
To extend the already-existent good answers: the idiomatic way for looping through a C string is
const char *s = "abc\ndef\nghi\n";
int lines = 0;
int nonempty = 0;
while (*s) {
nonempty = 1;
if (*s++ == '\n') lines++;
}
If you don't want to count the last empty line as a separate line, then add
if (nonempty && s[-1] == '\n' && lines > 0) lines--;
after the while loop.
Take the length of the string and iterate through all characters.
const size_t length = strlen(str); /* strlen needs <string.h> and returns size_t */
for (size_t i = 0; i < length; i++)
{
if(str[i] == '\n') {
lines++;
}
}
The following will deliver the same result regardless if the last character is a newline or not.
char *abc = "test1\ntest2\ntest3";
int lines = 0;
{
bool lastWasNewline = true; /* bool needs <stdbool.h> */
char * p = abc;
for (; *p; ++p) {
if (lastWasNewline) ++lines;
lastWasNewline = *p == '\n';
}
}
1.I mean do I need to use strlen(str) + 1 ?
No, just use str[i] in place of i < ??; this tests for the 0 character that terminates the string.
2.when the abc is "test1\ntest2\ntest3\n",does the code still calculate correct lines?
No, your code assumes that the input is broken into one input line per buffer line[j].
In place of ?? put strlen(abc), and make sure to #include <string.h>.
For better efficiency do
int length= strlen(abc);
and then use i < length
Or use str[i]!= '\0'
