Assignment after initialization to specific index in an array - c

After assigning 26th element, when printed, still "Computer" is printed out in spite I assigned a character to 26th index. I expect something like this: "Computer K "
What is the reason?
#include <stdio.h>
int main()
{
char m1[40] = "Computer";
printf("%s\n", m1); /*prints out "Computer"*/
m1[26] = 'K';
printf("%s\n", m1); /*prints out "Computer"*/
printf("%c", m1[26]); /*prints "K"*/
}

At 8th index of that string the \0 character is found and %s prints only till it finds a \0 (the end of string, marked by \0) - at 26th the character k is there but it will not be printed as \0 is found before that.

char s[100] = "Computer";
is basically the same as
char s[100] = { 'C', 'o', 'm', 'p', 'u','t','e','r', '\0'};
Since printf stops when the string is 0-terminated it won't print character 26

Whenever you partially initialize an array, the remaining elements are filled with zeroes. (This is a rule in the C standard, C17 6.7.9 §19.)
Therefore char m1[40] = "Computer"; ends up in memory like this:
[0] = 'C'
[1] = 'o'
...
[7] = 'r'
[8] = '\0' // the null terminator you automatically get by using the " " syntax
[9] = 0 // everything to zero from here on
...
[39] = 0
Now of course \0 and 0 mean the same thing, the value 0. Either will be interpreted as a null terminator.
If you go ahead and overwrite index 26 and then print the array as a string, it will still only print until it encounters the first null terminator at index 8.
If you do like this however:
#include <stdio.h>
int main()
{
char m1[40] = "Computer";
printf("%s\n", m1); // prints out "Computer"
m1[8] = 'K';
printf("%s\n", m1); // prints out "ComputerK"
}
You overwrite the null terminator, and the next zero that happened to be in the array is treated as null terminator instead. This code only works because we partially initialized the array, so we know there are more zeroes trailing.
Had you instead written
int main()
{
char m1[40];
strcpy(m1, "Computer");
This is not initialization but run-time assignment. strcpy would only set index 0 to 8 ("Computer" with null term at index 8). Remaining elements would be left uninitialized to garbage values, and writing m1[8] = 'K' would destroy the string, as it would then no longer be reliably null terminated. You would get undefined behavior when trying to print it: something like garbage output or a program crash.

In C strings are 0-terminated.
Your initialization fills all array elements after the 'r' with 0.
If you place a non-0 character in any random field of the array, this does not change anything in the fields before or after that element.
This means your string is still 0-terminated right after the 'r'.
How should any function know that after that string some other string might follow?

That's because after "Computer" there's a null terminator (\0) in your array. If you add a character after this \0, it won't be printed because printf() stops printing when it encounters a null terminator.

Just as an addition to the other users answers - you should try to answer your question by being more proactive in your learning. It is enough to write a simple program to understand what is happening.
int main()
{
char m1[40] = "Computer";
printf("%s\n", m1); /*prints out "Computer"*/
m1[26] = 'K';
for(size_t index = 0; index < 40; index++)
{
printf("m1[%zu] = 0x%hhx ('%c')\n", index, (unsigned char)m1[index], (m1[index] >=32) ? m1[index] : ' ');
}
}

Related

C printf prints an array that I didn't ask for

I have recently started learning C and I got into this problem where printf() prints an array I didn't ask for.
I was expecting an error since I used %s format in char array without the '\0', but below is what I got.
char testArray1[] = { 'a','b','c'};
char testArray2[] = { 'q','w','e','r','\0' };
printf("%c", testArray1[0]);
printf("%c", testArray1[1]);
printf("%c\n", testArray1[2]);
printf("%s\n", testArray1);
the result is
abc
abcqwer
thanks
The format "%s" expects that the corresponding argument points to a string: sequence of characters terminated by the zero character '\0'.
printf("%s\n", testArray1);
As the array testArray1 does not contain a string then the call above has undefined behavior.
Instead you could write
printf("%.*s\n", 3,testArray1);
or
printf("%.3s\n", testArray1);
specifying exactly how many elements of the array you are going to output.
Pay attention to that in C instead of these declarations
char testArray1[] = { 'a','b','c'};
char testArray2[] = { 'q','w','e','r','\0' };
you may write
char testArray1[3] = { "abc" };
char testArray2[] = { "qwer" };
or that is the same
char testArray1[3] = "abc";
char testArray2[] = "qwer";
In C++ the first declaration will be invalid.
%s indeed stop when encountered \0, but testArray1 didn't have that \0, so it keeps printing the following bytes in the memory.
And the compiler magically(actually intentionally) places the testArray2 next to testArray1, the memory is like:
a b c q w e r \0
^ testArray1 starts here
^ testArray2 starts here
And the %s will print all of those chars above until it meets a \0.
You can validate that by:
printf("%d\n", testArray2 == testArray1 + 3);
// prints `1`(true)
As your question, there was no error because the a ... r \0 sequece in memory is owned by your process. Only the program is trying to access an address not owned by it, the OS will throw an error.
Add zero at the end of first array:
char testArray1[] = { 'a','b','c', 0 };
Otherwise printf continues with memory after 'c' until zero byte and there is the second array.
PS: zero 0 is 100% identical to longer ASCII '\0'.

'\0' and printf() in C

In an introductory course of C, I have learned that while storing the strings are stored with null character \0 at the end of it. But what if I wanted to print a string, say printf("hello") although I've found that that it doesn't end with \0 by following statement
printf("%d", printf("hello"));
Output: 5
but this seem to be inconsistent, as far I know that variable like strings get stored in main memory and I guess while printing something it might also be stored in main memory, then why the difference?
The null byte marks the end of a string. It isn't counted in the length of the string and isn't printed when a string is printed with printf. Basically, the null byte tells functions that do string manipulation when to stop.
Where you will see a difference is if you create a char array initialized with a string. Using the sizeof operator will reflect the size of the array including the null byte. For example:
char str[] = "hello";
printf("len=%zu\n", strlen(str)); // prints 5
printf("size=%zu\n", sizeof(str)); // prints 6
printf returns the number of the characters printed. '\0' is not printed - it just signals that the are no more chars in this string. It is not counted towards the string length as well
int main()
{
char string[] = "hello";
printf("szieof(string) = %zu, strlen(string) = %zu\n", sizeof(string), strlen(string));
}
https://godbolt.org/z/wYn33e
sizeof(string) = 6, strlen(string) = 5
Your assumption is wrong. Your string indeed ends with a \0.
It contains of 5 characters h, e, l, l, o and the 0 character.
What the "inner" print() call outputs is the number of characters that were printed, and that's 5.
In C all literal strings are really arrays of characters, which include the null-terminator.
However, the null terminator is not counted in the length of a string (literal or not), and it's not printed. Printing stops when the null terminator is found.
All answers are really good but I would like to add another example to complete all these
#include <stdio.h>
int main()
{
char a_char_array[12] = "Hello world";
printf("%s", a_char_array);
printf("\n");
a_char_array[4] = 0; //0 is ASCII for null terminator
printf("%s", a_char_array);
printf("\n");
return 0;
}
For those don't want to try this on online gdb, the output is:
Hello world
Hell
https://linux.die.net/man/3/printf
Is this helpful to understand what escape terminator does? It's not a boundary for a char array or a string. It's the character that will say to the guy that parses -STOP, (print) parse until here.
PS: And if you parse and print it as a char array
for(i=0; i<12; i++)
{
printf("%c", a_char_array[i]);
}
printf("\n");
you get:
Hell world
where, the whitespace after double l, is the null terminator, however, parsing a char array, will just the char value of every byte. If you do another parse and print the int value of each byte ("%d%,char_array[i]), you'll see that (you get the ASCII code- int representation) the whitespace has a value of 0.
In C function printf() returns the number of character printed, \0 is a null terminator which is used to indicate the end of string in c language and there is no built in string type as of c++, however your array size needs to be a least greater than the number of char you want to store.
Here is the ref: cpp ref printf()
But what if I wanted to print a string, say printf("hello") although
I've found that that it doesn't end with \0 by following statement
printf("%d", printf("hello"));
Output: 5
You are wrong. This statement does not confirm that the string literal "hello" does not end with the terminating zero character '\0'. This statement confirmed that the function printf outputs elements of a string until the terminating zero character is encountered.
When you are using a string literal as in the statement above then the compiler
creates a character array with the static storage duration that contains elements of the string literal.
So in fact this expression
printf("hello")
is processed by the compiler something like the following
static char string_literal_hello[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
printf( string_literal_hello );
Th action of the function printf in this you can imagine the following way
int printf( const char *string_literal )
{
int result = 0;
for ( ; *string_literal != '\0'; ++string_literal )
{
putchar( *string_literal );
++result;
}
return result;
}
To get the number of characters stored in the string literal "hello" you can run the following program
#include <stdio.h>
int main(void)
{
char literal[] = "hello";
printf( "The size of the literal \"%s\" is %zu\n", literal, sizeof( literal ) );
return 0;
}
The program output is
The size of the literal "hello" is 6
You have to clear your concept first..
As it will be cleared when you deal with array, The print command you are using its just counting the characters that are placed within paranthesis. Its necessary in array string that it will end with \0
A string is a vector of characters. Contains the sequence of characters that form the
string, followed by the special ending character
string: '\ 0'
Example:
char str[10] = {'H', 'e', 'l', 'l', 'o', '\0'};
Example: the following character vector is not one string because it doesn't end with '\ 0'
char str[2] = {'h', 'e'};

Unexplainable behaviour when printing out strings in C

The following code works as expected and outputs ABC:
#include <stdio.h>
void printString (char toPrint [100]);
int main()
{
char hello [100];
hello[0] = 'A';
hello[1] = 'B';
hello[2] = 'C';
hello[3] = '\0';
printString(hello);
}
void printString (char toPrint [100])
{
int i = 0;
while (toPrint[i] != '\0')
{
printf("%c", toPrint[i]);
++i;
}
}
But if I remove the line that adds the null-character
hallo[3] = '\0';
I get random output like wBCÇL, ╗BCÄL, ┬BCNL etc.
Why is that so? What I expected is the loop in printString() to run forever because it doesn't run into a '\0', but what happend to 'A', 'B' and 'C'? Why do B and C still show up in the output but A is replaced by some random character?
You declaration of hello leaves it uninitialized and filled with random bytes
int main()
{
char hello [100];
...
}
If you want zero initialized array use
int main()
{
char hello [100] = {0};
...
}
There must have been, by pure chance, the value for \r somewhere in the memory cells following those of my array hello. That's why my character 'A' was overwritten.
On other machines, "ABC" was ouput as expected, followed by random characters.
Initializing the array with 0s, purposely omitted here, of course solves the problem.
edit:
I let the code print out each character in binary and toPrint[5] was indeed 00001101 which is ASCII for \r (carriage return).
When you declare an automatic like char hello [100];, the first thing to understand is that the 100 bytes can contain just about anything. You must assign values to each byte explicitly to do / have something meaningful.
You are terminating you loop when you find the \0 a.k.a the NUL character. Now, if you comment out the instruction which puts the \0 after the character c, your loop runs until you actually find \0.
Your array might contain \0 at some point or it might not. There are chances you might go beyond the 100 bytes still looking for a \0 and invoke undefined behaviour. You also invoke UB when you try to work with an unassigned piece of memory.

What is with the null character in reverse string program?

I had to make a simple program to reverse a string. I eventually got this code from my own understanding but also help from google since I couldn't originally get it to work. It runs fine and outputs as it should and I understand all of it except for the reverse[j] = '\0' statement.
I kept getting symbols in my output when I didn't state it but I want to know how this works. Can anyone explain please?
#include<stdio.h>
int main(void)
{
char original[20], reverse[20];
int length, i, j;
printf("Enter a string:\n");
gets(original);
length = strlen(original);
for (i = length - 1, j= 0; i >= 0; i--, j++)
reverse[j] = original[i];
reverse[j] = '\0'; //I don't know what this statement does exactly
printf("The string reversed is:\n %s\n", reverse);
return 0;
}
If you want that a character array would contain a string then it has to have the terminating zero.
For example function strlen that is used in your program counts characters in a character array before the terminating zero.
Also function printf used with the format specifier %s outputs characters from a character array until the terminating zero will be encountered.
For example if you have an array like this
char s[10] = "Hello";
then the call strlen( s ) returns 5 instead of 10. And call printf( "%s\n", s ); outputs 6 characters (including the new line character).
Consider this demonstrative program
#include <stdio.h>
int main(void)
{
char s[10] = "Hello";
printf( "%d\n", printf( "%s\n", s ) );
return 0;
}
Its output is
Hello
6
This initialization
char s[10] = "Hello";
is fully equivalent to
char s[10] = { 'H', 'e', 'l', 'l', 'o', '\0', '\0', '\0', '\0', '\0'};
If you need to reverse the string stored in the array it is obvious that you need to reverse characters before the first terminating zero. And if you want to copy the string in the reversed order to another character array you need to append the destination array with the terminating zero.
This loop
for (i = length - 1, j= 0; i >= 0; i--, j++)
reverse[j] = original[i];
copies in the reversed order all characters except the terminating zero from the original character array starting with the last character before the terminating zero to the destination character array. You need to append the destination character array with the terminating zero
reverse[j] = '\0';
The \0 is not really a character. It's the way a c program mark the end of a string. So if you only want to reverse a string you don't have to move this character at the begining of the reversed string.
original string: hello\0
reverse string : olleh\0
In C, a string always has a \0 at the end. Read why and how.
\0 is for null terminating the string. This will mark the end of the string. If it is not null terminated, the data in the succeeding locations will also wrongly treat as part of string.

C Programming - Functionality of strlen

I'm working to try and understand some string functions so I can more effectively use them in later coding projects, so I set up the simple program below:
#include <stdio.h>
#include <string.h>
int main (void)
{
// Declare variables:
char test_string[5];
char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};
int init;
int length = 0;
int match;
// Initialize array:
for (init = 0; init < strlen(test_string); init++)
{ test_string[init] = '\0';
}
// Fill array:
test_string[0] = 'T';
test_string[1] = 'E';
test_string[2] = 'S';
test_string[3] = 'T';
// Get Length:
length = strlen(test_string);
// Get number of characters from string 1 in string 2:
match = strspn(test_string, test_string2);
printf("\nstrlen return = %d", length);
printf("\nstrspn return = %d\n\n", match);
return 0;
}
I expect to see a return of:
strlen return = 4
strspn return = 4
However, I see strlen return = 6 and strspn return = 4. From what I understand, char test_string[5] should allocate 5 bytes of memory and place hex 00 into the fifth byte. The for loop (which should not even be nessecary) should then set all the bytes of memory for test_string to hex 00. Then, the immediately proceeding lines should fill test_string bytes 1 through 4 (or test_string[0] through test_string[3]) with what I have specified. Calling strlen at this point should return a 4, because it should start at the address of string 0 and count an increment until it hits the first null character, which is at string[4]. Yet strlen returns 6. Can anyone explain this? Thanks!
char test_string[5];
test_string is an array of 5 uninitialized char objects.
for (init = 0; init < strlen(test_string); init++)
Kaboom. strlen scans for the first '\0' null character. Since the contents of test_string are garbage, the behavior is undefined. It might return a small value if there happens to be a null character, or a large value or program crash if there don't happen to be any zero bytes in test_string.
Even if that weren't the case, evaluating strlen() in the header of a for loop is inefficient. Each strlen() call has to re-scan the entire string (assuming you've given it a valid string), so if your loop worked it would be O(N2).
If you want test_string to contain just zero bytes, you can initialize it that way:
char test_string[5] = "";
or, since you initialize the first 4 bytes later:
char test_string[5] = "TEST";
or just:
char test_string[] = "TEST";
(The latter lets the compiler figure out that it needs 5 bytes.)
Going back to your declarations:
char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};
This causes test_string2 to be 7 bytes long, without a trailing '\0' character. That means that passing test_string2 to any function that expects a pointer to a string will cause undefined behavior. You probably want something like:
char test_string2[] = "GO_TEST";
strlen searches for '\0' character to count them, in your test_string, there is none so it continues until it finds one which happens to be 6 bytes away from the start of your array since it is uninitialized.
The compiler does not generate code to initialize the array so you don't have to pay to run that code if you fill it later.
To initialize it to 0 and skip the loop, you can use
char test_string[5] = {0};
This way, all character will be initialized to 0 and your strlen will work after you filled the array with "TEST".
There are a few problems here. First of all, char test_string[5]; simply sets aside 5 bytes for that string, but does not set the bytes to anything. In particular, when you say "char test_string[5] should allocate 5 bytes of memory and place hex 00 into the fifth byte", the second part is wrong.
Secondly, your array initialization loop uses strlen(test_string) but since the bytes of test_string are uninitialized, there's no way to know what's there so strlen(test_string) returns some undefined result. A better way to clear the array would be memset( test_string, 0, sizeof(test_string) );.
You fill the array with "TEST" but don't set the NULL byte at the end, so the last byte is still uninitialized. If you do the memset above this will be fixed, or you can manually do test_string[4] = '\0'.

Resources