Why is the result of strlen() not what I expected? - c

The function strlen() counts the number of characters in a string up to, but not including, the NUL character.
In ASCII, NUL is equal to '\0'.
#include<stdio.h>
#include<string.h>
int main(void)
{
char a[]="abc";
char b[]="abcd'\0'def";
printf("%zu\n%zu",strlen(a),strlen(b));
return 0;
}
The result is 3 and 5.
The second result seems to contradict the first.
So I tried to find out how strlen() is implemented.
int strlen(char a[])
{
int i;
for(i=0;a[i]!='\0';i++);
return i;
}
Based on this code, I can understand the first result, but I really can't understand the second one.
Why is the second result 5 and not 4? Thanks in advance.

You are getting 5 because you have wrapped the NUL character in single quotes; the value 5 is the length of the string "abcd'".
If you change the second example to "abcd\0def" (no single quotes), you get a value of 4.

char b[]="abcd'\0'def";
the array elements are
[a][b][c][d]['][0]['][d][e][f][0]
so the length of the string before the first [0] is 5, as it contains the ' character as well.

In ASCII, NUL is equal to '\0'.
NUL, and the null character, are equal to '\0', that is correct. But what you are looking at here is a character literal (one character, \0, enclosed in single quotes). Within a string literal, those single quotes are not needed, and will indeed be interpreted as characters of their own.
So this...
char a[]="abc";
char b[]="abcd'\0'def";
...is equivalent to:
char a[]= { 'a', 'b', 'c', '\0' };
char b[]= { 'a', 'b', 'c', 'd', '\'', '\0', '\'', 'd', 'e', 'f', '\0' };
Your intention for b was this:
char b[]="abcd\0def";

Related

What is the difference between chars and strings?

I am working on a CS50 coding exercise but I don't get strings and chars. I have problems with them because I can't understand what they are. I don't know the difference between chars and strings because they seem the same to me. My code is:
#include<cs50.h>
#include<stdio.h>
int main(void){
char c='A';
printf("%c", c);
string a = "A";
printf("%s", a);
}
but it prints AA without changes whether it is a char or string.
If I do this:
#include<stdio.h>
int main(void){
char c='A';
printf("%c", c);
char a = 'A';
printf("%c", a);
}
it still prints AA. Even if I do this:
#include<cs50.h>
#include<stdio.h>
int main(void){
string c="A";
printf("%s", c);
string a = "A";
printf("%s", a);
}
It STILL prints AA, even if I swap from chars to strings. I don't see a difference at all! Please help me understand.
A char represents a single character. A string is a series of characters.
string is an alias of String, in which we can store a collection of chars (a word), like
string name = "mukesh"; whereas in a char we can store only a single character, like
char ch='b';
A string is an array of chars.
A char is just one character or letter in English.
Like examples of chars are 'A', 'B', 'w', '3', '\n' etc.
In C/C++, a char takes 1 byte of space and is within single quotes.
Special characters like '\n', '\0' are also chars.
They take 1 byte of storage and can be represented as an integer also. Search ASCII on google to understand the integer relation.
string is a collection of chars.
Examples are "John", "Hello", "What is an Apple?".
The chars inside the string "John" are J, o, h, n, \0. The string occupies 5 bytes, although strlen reports its length as 4 because the terminating \0 is not counted. A string always terminates with a \0 char.
strings are inside double quotes in C/C++.
There are a few possible common-use definitions of string (the C standard defines "A string is a contiguous sequence of characters terminated by and including the first null character.", see C11 7.1.1)
a) an array of char values of which the last one has value 0, eg
(char[4]){'b', 'a', 'r', '\0'} // a) string
b) an array of char values that contains a '\0', eg
(char[7]){'f', 'o', 'o', '\0', 'x', 'y', 'z'} // not an a) string; b) string
this last string can "grow" because the underlying array has enough space
c) a pointer to one of the above or to somewhere in the middle of such an array, eg
char array[10] = "foo\0bar\0X"; // not an a) string
char *p = &array[4]; // not an a) string, not a b) string
This last array is not a string by definition a); it is a string by definition b). p points to the 'b'; that pointer is a string by definition c).
Always make sure you know what kind of string you're discussing!!
Of these definitions, c) is the one farthest from the C Standard definition.

'\0' and printf() in C

In an introductory C course, I learned that strings are stored with a null character \0 at the end. But what if I want to print a string, say with printf("hello")? I've found that it doesn't end with \0, judging by the following statement:
printf("%d", printf("hello"));
Output: 5
But this seems inconsistent. As far as I know, variables such as strings are stored in main memory, and I guess that something being printed is also stored in main memory, so why the difference?
The null byte marks the end of a string. It isn't counted in the length of the string and isn't printed when a string is printed with printf. Basically, the null byte tells functions that do string manipulation when to stop.
Where you will see a difference is if you create a char array initialized with a string. Using the sizeof operator will reflect the size of the array including the null byte. For example:
char str[] = "hello";
printf("len=%zu\n", strlen(str)); // prints 5
printf("size=%zu\n", sizeof(str)); // prints 6
printf returns the number of characters printed. '\0' is not printed; it just signals that there are no more chars in this string. It is not counted towards the string length either.
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "hello";
printf("sizeof(string) = %zu, strlen(string) = %zu\n", sizeof(string), strlen(string));
}
https://godbolt.org/z/wYn33e
sizeof(string) = 6, strlen(string) = 5
Your assumption is wrong. Your string indeed ends with a \0.
It consists of 5 characters h, e, l, l, o and the 0 character.
What the "inner" printf() call outputs is the number of characters that were printed, and that's 5.
In C all literal strings are really arrays of characters, which include the null-terminator.
However, the null terminator is not counted in the length of a string (literal or not), and it's not printed. Printing stops when the null terminator is found.
All answers are really good but I would like to add another example to complete all these
#include <stdio.h>
int main()
{
char a_char_array[12] = "Hello world";
printf("%s", a_char_array);
printf("\n");
a_char_array[4] = 0; //0 is ASCII for null terminator
printf("%s", a_char_array);
printf("\n");
return 0;
}
For those who don't want to try this on Online GDB, the output is:
Hello world
Hell
https://linux.die.net/man/3/printf
Does this help to understand what the '\0' terminator does? It's not a boundary for a char array or a string. It's the character that tells whoever parses the string: STOP, parse (print) only up to here.
PS: And if you parse and print it as a char array
for(i=0; i<12; i++)
{
printf("%c", a_char_array[i]);
}
printf("\n");
you get:
Hell world
where the whitespace after the double l is the null terminator; parsing a char array this way just prints the char value of every byte. If you do another pass and print the int value of each byte (printf("%d", a_char_array[i])), you'll see that the whitespace has a value of 0 (you get the ASCII code, the int representation).
In C, the function printf() returns the number of characters printed. \0 is the null terminator, which is used to indicate the end of a string in the C language, and there is no built-in string type as there is in C++. However, your array size needs to be at least one greater than the number of chars you want to store.
Here is the ref: cpp ref printf()
But what if I wanted to print a string, say printf("hello") although
I've found that that it doesn't end with \0 by following statement
printf("%d", printf("hello"));
Output: 5
You are wrong. This statement does not confirm that the string literal "hello" does not end with the terminating zero character '\0'. This statement confirmed that the function printf outputs elements of a string until the terminating zero character is encountered.
When you are using a string literal as in the statement above then the compiler
creates a character array with the static storage duration that contains elements of the string literal.
So in fact this expression
printf("hello")
is processed by the compiler something like the following
static char string_literal_hello[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
printf( string_literal_hello );
You can imagine the action of the function printf in this case the following way
int printf( const char *string_literal )
{
int result = 0;
for ( ; *string_literal != '\0'; ++string_literal )
{
putchar( *string_literal );
++result;
}
return result;
}
To get the number of characters stored in the string literal "hello" you can run the following program
#include <stdio.h>
int main(void)
{
char literal[] = "hello";
printf( "The size of the literal \"%s\" is %zu\n", literal, sizeof( literal ) );
return 0;
}
The program output is
The size of the literal "hello" is 6
You have to get the concept clear first.
It will become clearer when you deal with arrays. The printf call you are using is just counting the characters placed within the parentheses. An array holding a string must end with \0.
A string is a vector of characters. It contains the sequence of characters that form the
string, followed by the special string-ending character '\0'.
Example:
char str[10] = {'H', 'e', 'l', 'l', 'o', '\0'};
Example: the following character vector is not a string because it doesn't end with '\0'
char str[2] = {'h', 'e'};

Can a '\0' in the middle of a string be recognized as the end of the string in C?

I know that in C '\0' is always at the end of a string, but does '\0' always mark the end of a string?
For example, would "1230123" be recognized as "123"?
One edition of the question used the notation '/0' instead of '\0'.
A byte with the value 0 by definition marks the end of a string. So if you had something like this:
char s[] = { 'a', 'b', 'c', '\0', 'd', 'e', 'f', '\0' };
printf("%s\n", s);
It would print abc.
This is different from "1230123" where the 4th character in the string is not the value 0 but the character '0', which has an ASCII code of 48.
The null terminating character is represented as \0, not /0, and it always marks the end of a string because, in C, strings are actually one-dimensional arrays of characters terminated by a null character \0.
This
char s[] = "1230123";
is same as this
char s[] = {'1', '2', '3', '0', '1', '2', '3', '\0'};
Here the fourth element is the character '0', whose decimal value is 48, and the last element is the null terminating character, whose decimal value is 0.
Check this example:
#include <stdio.h>
int main (void)
{
int x,y,z;
char s1[] = "1230123";
char s2[] = {'1','2','3','\0','4','5','6','\0'};
printf ("%s\n", s1);
printf ("%s\n", s2);
return 0;
}
Output:
1230123
123 <======= The characters after the null terminating character are not printed.
A string literal can have a '\0' in the middle.
A string only has a '\0' at the end.
'\0' is the null character and has a value of 0. '0' is the character zero.
Note that the value of '\0' is the same as 0.
C has, as part of the language, string literals.
The two string literals below have a size of 8: the 7 you see plus the 1 not explicitly coded trailing null character '\0'.
"abc-xyz" // size 8
"abc\0xyz" // size 8
C, as part of the standard library, defines a string.
A string is a contiguous sequence of characters terminated by and including the first null character.
Many str...() functions only work with the data up to the first null character.
strlen("abc-xyz") --> 7
strlen("abc\0xyz") --> 3

Maximum number of elements that can be stored in an array in C

Please pardon me if this is a duplicate question. I will be happy to delete it if that is pointed out.
The question is that, if I declare a character array in c, say
char character_array[4];
Does that mean I can only store 3 characters and one '\0' is added as the fourth character? But I have tried it and successfully stored four characters in the character array. When I do that, where is the '\0' added, since I have already used up the four positions?
Well, yes, you can store any four characters. The string-termination character '\0' is a character just like any other.
But you don't have to store strings, char is a small integer so you can do:
char character_array[] = { 1, 2, 3, 4 };
This uses all four elements, but doesn't store printable characters nor any termination; the result is not a C string.
If you want to store a string, you need to accommodate the terminator character of course, since C strings by definition always end with the termination character.
C does not have protection against buffer overflow, if you aim at your foot and pull the trigger it will, in general, happily blow it off for you. Some of us like this. :)
You mix two notions: the notion of arrays and the notion of strings.
In this declaration
char character_array[4];
there is declared an array that can store 4 objects of type char. It is not important what values the objects will have.
On the other hand the array can contain a string: a sequence of characters terminated by a zero character.
For example you can initialize the array above in C the following way
char character_array[4] = { 'A', 'B', 'C', 'D' };
or
char character_array[4] = "ABCD";
or
char character_array[4] = { '\0' };
or
char character_array[4] = "";
and so on.
In all these cases the array has 4 objects of type char. In the last two cases you may suppose that the array contains strings (empty strings) because the array has an element with zero character ( '\0' ). That is in the last two cases you may apply to the array functions that deal with strings.
Or another example
char character_array[4] = { 'A', 'B', '\0', 'C' };
You can deal with the array as if it had a string "AB" or just four objects.
Consider this demonstrative program
#include <stdio.h>
#include <string.h>
int main( void )
{
char character_array[4] = { 'A', 'B', '\0', 'C' };
char *p = strchr(character_array, 'C');
if (p == NULL)
{
printf("Character '%c' is not found in the array\n", 'C');
}
else
{
printf("Character '%c' is found in the array at position %zu\n",
'C',
(size_t)(p - character_array));
}
p = ( char * )memchr(character_array, 'C', sizeof(character_array));
if (p == NULL)
{
printf("Character '%c' is not found in the array\n", 'C');
}
else
{
printf("Character '%c' is found in the array at position %zu\n",
'C',
(size_t)(p - character_array));
}
}
The program output is
Character 'C' is not found in the array
Character 'C' is found in the array at position 3
In the first part of the program it is assumed that the array contains a string. The standard string function strchr just ignores all elements of the array after encountering the element with the value '\0'.
In the second part of the program it is assumed that the array contains a sequence of objects with the length of 4. The standard function memchr knows nothing about strings.
Conclusion.
This array
char character_array[4];
can contain 4 objects of type character. It is so declared.
The array can contain a string, if its content is interpreted as a string, provided that at least one element of the array is equal to '\0'.
For example, if the array is declared like
char character_array[4] = "A";
that is equivalent to
char character_array[4] = { 'A', '\0', '\0', '\0' };
then it may be said that the array contains the string "A" with the length equal to 1. On the other hand the array actually contains 4 objects of type char, as the second equivalent declaration shows.
You just reserve 4 bytes to fill. If you write to _array[4] (the fifth character), you have a so-called buffer overflow, meaning you write to memory you did not reserve.
If you store a string in 4 characters, you actually have just 3 characters for printable characters (_array[0], ..., _array[2]), and the last one (_array[3]) is just for holding the string terminator '\0'.
For instance, in your case the function strlen() scans until that string terminator '\0' and returns a length of 3.

character array printing output

I am trying to understand the output of printing character arrays, and it gives me varying output on ideone.com (C++ 4.3.2) and on my machine (Dev-C++, MinGW compiler).
1)
#include<stdio.h>
main()
{
char a[] = {'s','t','a','c','k','o'};
printf("%s ",a);
}
It prints "stacko" on my machine but doesn't print anything on ideone.
2)
#include<stdio.h>
main()
{
char a[] = {'s','t','a','c','k','o','v','e'};
printf("%s ",a);
}
On ideone, it prints "stackove" only the first time, then prints nothing on subsequent runs of the program.
On my Dev-C++, it prints "stackove.;||w".
What should the ideal output be when I try to print this kind of character array without any '\0' at the end? It seems to give varying output everywhere. Please help!
%s conversion specifier expects a string. A string is a character array containing a terminating null character '\0' which marks the end of the string. Therefore, your program as such invokes undefined behaviour because printf overruns the array accessing memory out of the bound of the array looking for the terminating null byte which is not there.
What you need is
#include <stdio.h>
int main(void)
{
char a[] = {'s', 't', 'a', 'c', 'k', 'o', 'v', 'e', '\0'};
// the final '\0' is the terminating null character that makes the array a string
// output a newline to immediately print on the screen as
// stdout is line-buffered by default
printf("%s\n", a);
return 0;
}
You can also initialize your array with a string literal as
#include <stdio.h>
int main(void)
{
char a[] = "stackove"; // initialize with the string literal
// equivalent to
// char a[] = {'s', 't', 'a', 'c', 'k', 'o', 'v', 'e', '\0'};
// output a newline to immediately print on the screen as
// stdout is line-buffered by default
printf("%s\n", a);
return 0;
}
