Null termination of C string - c

Is it right to say that the terminating null character of a C string literal is, in general, added automatically by the compiler?
So in the following example:
char * str = "0124";
printf("%x", str[str[3] - str[2] + str[4]]);
the output is always 32?
Thanks.

First question: yes.
Second question: yes, on an ASCII system. You calculate '4' - '2' + '\0', which in integers is 0x34 - 0x32 + 0 = 2, so you get str[2], which is '2', which is 0x32.
C guarantees that '4' - '2' is 2, but the value of '2' itself depends on the character set: on an EBCDIC system '2' is 0xf2, so the output would be f2.

Yes, the compiler does add the null terminator. The string literal therefore occupies 5 bytes, stored in an array with static storage duration (not on the stack); str is just a pointer to its first character.
With that string literal, (str[3] - str[2] + str[4]) evaluates to (52 - 50 + 0) on an ASCII system, so you are accessing str[2], which prints 32 in hex.

The terminating null character is added by the compiler; 6.4.5p6:
6 - In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals. The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. [...]
The printf output will be the character code of the character '2' on your system. The characters 0 to 9 are guaranteed to have contiguous codes (5.2.1p3), but not to have any particular value.

Related

Why doesn't the strlen() function return the correct length for a hex string?

I have a hex string, for example \xF5\x17\x30\x91\x00\xA1\xC9\x00\xDF\xFF. When I use the strlen() function to get the length of that hex string, it returns 4!
const char string_[] = { "\xF5\x17\x30\x91\x00\xA1\xC9\x00\xDF\xFF" };
size_t string_length = strlen(string_);
printf("%zu", string_length); // the result: 4
Is the strlen() function dealing with that hex as a string, or is something unclear to me?
For string functions in the C standard library, a character with value zero, also called a null character, marks the end of a string. Your string contains \x00, which designates a null character, so the string ends there. There are four non-null characters before it, so strlen returns four.
C 2018 7.1.1 1 says:
A string is a contiguous sequence of characters terminated by and including the first null character… The length of a string is the number of bytes preceding the null character…
C 2018 7.24.6.3 2 says:
The strlen function computes the length of the string pointed to by s [its first argument].
You could compute the size of your array as sizeof string_ (because it is an array of char) or sizeof string_ / sizeof *string_ (to compute the number of elements regardless of type), but this will include a terminating null character because defining an array with [] and letting the length be computed from a string literal initializer includes the terminating null character of the string literal. You may need to hard-code the length of the array, possibly using #define to define a preprocessor macro, and use that length in the array definition and in other places where the length is needed.
It is because you have zero at index [4]
string_[0] == 0xF5
string_[1] == 0x17
string_[2] == 0x30
string_[3] == 0x91
string_[4] == 0
...
"\xf5" puts char having integer value 0xf5 at position [0]
To see it as a string you need to escape the \ character
const char string_[] = "\\xF5\\x17\\x30\\x91\\x00\\xA1\\xC9\\x00\\xDF\\xFF";
At compile time, your "string" appears as consecutive hex values expressed in C syntax inside a pair of quotation marks.
strlen() is a run time function that scans through a series of bytes, looking for the first instance of a zero-value byte.
It's good to understand the difference between "compile time" and "run time".

Why does *strptr = 0 truncate the string? (C)

Why does the character 0, whose ASCII value is 0x30, become the 0 of the null character?
Here I'm confused about the number 0, the character '0' and the string terminator '\0'.
Your explanation will be appreciated.
More precisely, there are three lexical elements that contain a zero character: 0 (unquoted), '0' (quoted, typically (but not always) equal to 48 or 0x30 unquoted) and '\0' (equal to 0, but in character notation).
The question is talking about two distinct values...'0' != '\0'. Forget about 30, 48, etc. Just remember '0' and '\0' are different characters, and '\0' is a string terminator that has a value of 0...
I think you meant to use '0' (emphasis on the quotation marks).
All standard library string routines treat the character '\0' as the string terminator. If you put it at the beginning of the string, they see no data to process: the first character is a terminator, so the string is effectively empty. And yes, per the standard '\0' is a character that has the value 0. As a result, '\0' == 0 is true.

What is wrong with printing sizeof(char) and sizeof("a")?

The sizeof(char) in C gives 1 and sizeof("a") gives 2. Please help
A char i.e. a character has size 1.
The string literal "a" is not a character. It is a "string" (and by string I mean char[]). All "strings" in C are null-terminated, so your "string" is actually:
{'a','\0'}
And that's two characters. So size is 2.
sizeof("a")
"a" is a string that reads {'a','\0'}, which is 2 chars, or 2 bytes. This is because in C, double quotes indicate a string. A string in C is required to be null-terminated.
sizeof(char)
a single character is guaranteed to have the size 1 byte.
sizeof(char) is 1 byte, whereas "a" is a string that has 1 byte for the character and ends with a null '\0', so sizeof("a") is 2 bytes.
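'a' is not the same as "a".
'a' is a character constant. In C it has type int, so sizeof('a') == sizeof(int): typically 4 on desktop systems and typically 2 on 8-bit CPUs like AVRs, where int is usually 16 bits.
Only in C++ is 'a' a char with sizeof('a') == 1, which may be the answer you expected.
"a" is a string, as noted in the other answers.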

C: sizeof() related doubts?

#include <stdio.h>
#include <string.h>
int main(void)
{
printf("%zu \n ",sizeof(' '));
printf("%zu ",sizeof(""));
return 0;
}
output:
4
1
Why is the output 4 for the first printf? Moreover, if I write '' it gives the error "empty character constant", but an empty pair of double quotes, i.e. without any space, is fine and gives no error?
' ' is an example of an integer character constant, which has type int (it is not converted; it has that type). The second, "", is a string literal that contains only one character, the null character, and since sizeof(char) is guaranteed to be 1, the size of the whole array is 1 as well.
' ' is an integer character constant (hence 4 bytes on your machine); "" is an empty string literal, whose array still holds the 1-byte terminator '\0'.
Here in below check the difference
#include<stdio.h>
int main()
{
char a= 'b';
printf("%zu %zu %zu", sizeof(a),sizeof('b'), sizeof("a"));
return 0;
}
Here a is defined as a char, whose size is 1 byte.
But 'b' is a character constant. A character constant is an integer: its value is the numeric value of the character in the machine's character set. The size of a character constant is therefore the size of int, which is 4 bytes on most current systems.
"a" is a string literal, an array of char whose size is the number of characters plus the terminating '\0' (the null character, not NULL). Here it is 2.
This is answered in Size of character ('a') in C/C++
In C, the type of a character constant like 'a' is actually an int, with size of 4 (or some other implementation-dependent value). In C++, the type is char, with size of 1. This is one of many small differences between the two languages.
' ' (a space), or any other single character in single quotes, actually has type int, with a value equal to the character's code (its ASCII value on ASCII systems). So its size is sizeof(int), typically 4 bytes.
Only when you create a char variable and store the character in it is it stored in 1 byte of memory.
char ch;
ch=' ';
printf("%zu",sizeof(ch));
//outputs 1
For anything to be a string, it must be terminated with a null character represented as '\0'.
If we write the string "hello", it is actually stored as 'h' 'e' 'l' 'l' 'o' '\0', so that the system knows the string ends after the 'o' in "hello" and stops reading when the null character comes. The length of this string is still 5 if you use the strlen() function, but sizeof("hello") is 6 bytes.
When we create an empty string, like "", its length is 0 but its size is 1 byte, as it must terminate where it starts, i.e. at the 0th character.
Hence an empty string consists of only one character, that is null character, giving size 1 byte.
From C Traps and Pitfalls
Single and double quotes mean very different things in C.
A character enclosed in single quotes is just another way of writing the integer that corresponds to the given character in the implementation's character set. Thus, in an ASCII implementation, ' ' means exactly the same thing as 32.
On the other hand, A string enclosed in double quotes is a short-hand way of writing a pointer to the initial character of a nameless array that has been initialized with the characters between the quotes and an extra character whose binary value is zero. Thus writing "" that is empty string still has '\0' character whose size is one.
In the first case the operand is a character constant, so sizeof sees an integer (the character's code, of type int), and the result is 4.
In the second case the operand is a string literal with nothing between the quotes, i.e. the empty string. It still contains the terminating null character, so its size is 1.

difference between sizeof('a') and sizeof("a")

My question is about the sizeof operator in C.
sizeof('a'); equals 4, as it will take 'a' as an integer: 97.
sizeof("a"); equals 2: why? Also (int)("a") will give some garbage value. Why?
'a' is a character constant - of type int in standard C - and represents a single character. "a" is a different sort of thing: it's a string literal, and is actually made up of two characters: a and a terminating null character.
A string literal is an array of char, with enough space to hold each character in the string and the terminating null character. Because sizeof(char) is 1, and because a string literal is an array, sizeof("stringliteral") will return the number of character elements in the string literal including the terminating null character.
That 'a' is an int instead of a char is a quirk of standard C, and explains why sizeof('a') == 4: it's because sizeof('a') == sizeof(int). This is not the case in C++, where sizeof('a') == sizeof(char).
because 'a' is a character, while "a" is a string consisting of the 'a' character followed by a null.
