The following code is completely OK in C but not in C++. In this code the if condition is always false. How does C compare a character variable against a string?
int main()
{
char ch='a';
if(ch=="a")
printf("confusion");
return 0;
}
The following code is completely ok in C
No, not at all.
In your code
if(ch=="a")
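is essentially trying to compare the value of ch with the base address of the string literal "a". This is meaningless and useless. In fact, comparing an integer with a pointer like this violates a constraint of the == operator, so a conforming C compiler must issue a diagnostic for it.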
What you want here, is to use single quotes (') to denote a char literal, like
if(ch == 'a')
NOTE 1:
To elaborate on the difference between single quotes for char literals and double quotes for string literals,
For char literal, C11, chapter §6.4.4.4
An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'
and, for string literal, chapter §6.4.5
A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz".
NOTE 2:
That said, as a note, the recommended signature of main() is int main(void).
I wouldn't say the code is okay in either language.
'a' is a single character. It is actually a small integer, having as its value the value of the given character in the machine's character set (almost invariably ASCII). So 'a' has the value 97, as you can see by running
char c = 'a';
printf("%d\n", c);
"a", on the other hand, is a string. It is an array of characters, terminated by a null character. In C, arrays are almost always referred to by pointers to their first element, so in this case the string constant "a" acts like a pointer to an array of two characters, 'a' and the terminating '\0'. You could see that by running
char *str = "a";
printf("%d %d\n", str[0], str[1]);
This will print
97 0
Now, we don't know where in memory the compiler will choose to put our string, so we don't know what the value of the pointer will be, but it's safe to say that it will never be equal to 97. So the comparison if(ch=="a") will always be false.
When you need to compare a character and a string, you have two choices. You can compare the character to the first character of the string:
if(c == str[0])
printf("they are equal\n");
else printf("confusion\n");
Or you can construct a string from the character, and compare that. In C, that might look like this:
char tmpstr[2];
tmpstr[0] = c;
tmpstr[1] = '\0';
if(strcmp(tmpstr, str) == 0)
printf("they are equal\n");
else printf("confusion\n");
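The two approaches above can be wrapped in small helpers. This is only a sketch: the names char_matches_first and char_equals_string are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if c equals the first character of s. */
bool char_matches_first(char c, const char *s)
{
    assert(s != NULL);
    return s[0] == c;
}

/* Hypothetical helper: true if s is exactly the one-character
   string made from c. */
bool char_equals_string(char c, const char *s)
{
    assert(s != NULL);
    char tmp[2] = { c, '\0' };  /* build a real, null-terminated string */
    return strcmp(tmp, s) == 0;
}
```

Note that the two helpers answer different questions: char_matches_first('a', "abc") is true, while char_equals_string('a', "abc") is false because the string has more characters after the first.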
That's the answer for C. There's a different, more powerful string type in C++, so things would be different in that language.
There is a difference between 'a' (a character) and "a" (a string consisting of two characters, a and \0). The comparison ch=="a" evaluates to false because in this expression "a" is converted to a pointer to its first element, and that address is a memory location, not a character code.
Change it to
if(ch=='a')
Related
#include <stdio.h>
int main(void)
{
printf("%zu\n", sizeof(' '));
printf("%zu\n", sizeof(""));
return 0;
}
output:
4
1
Why is the output 4 for the first printf? Moreover, if I write '' it gives the error "empty character constant", but the empty double-quoted "" (without any space) is accepted without error. Why?
' ' is an example of an integer character constant, which has type int (it is not converted; it has that type to begin with). The second, "", is a string literal that contains only one character, the null character. Since sizeof(char) is guaranteed to be 1, the size of the whole array is 1 as well.
' ' is an integer character constant of type int (hence 4 bytes on your machine), whereas "" is an empty character array that still holds the terminating '\0', so its size is 1 byte.
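Both claims can be checked by the compiler itself. The following is a minimal sketch, assuming a C11 compiler (for static_assert); the function names are hypothetical. Since sizeof yields a size_t, the results should be printed with %zu rather than %d.

```c
#include <assert.h>
#include <stddef.h>

/* In C a character constant has type int, so this is sizeof(int). */
size_t size_of_char_constant(void)
{
    return sizeof(' ');
}

/* "" is an array that holds only the terminating '\0'. */
size_t size_of_empty_string(void)
{
    return sizeof("");
}

/* Compile-time checks of the same two facts (C11). */
static_assert(sizeof(' ') == sizeof(int),
              "character constants have type int in C");
static_assert(sizeof("") == 1,
              "the empty string literal is one char long");
```

If this file were compiled as C++ instead, the first static_assert would be expected to fail, because there a character literal has type char.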
Check the difference in the example below:
#include<stdio.h>
int main()
{
char a= 'b';
printf("%zu %zu %zu", sizeof(a), sizeof('b'), sizeof("a"));
return 0;
}
Here a is defined as a char, whose size is 1 byte.
But 'b' is a character constant. In C a character constant has type int, and its value is the numeric value of the character in the machine's character set. So sizeof('b') is sizeof(int), which is 4 bytes here.
"a" is a string literal: a character array whose size is the number of characters plus one for the terminating '\0' (the null character). Here it is 2.
This is answered in Size of character ('a') in C/C++
In C, the type of a character constant like 'a' is actually an int, with size of 4 (or some other implementation-dependent value). In C++, the type is char, with size of 1. This is one of many small differences between the two languages.
A ' ', or any other single-character constant, actually has type int, with a value equal to the character's code (its ASCII value on most machines). So its size is sizeof(int), typically 4 bytes.
Only when you create a char variable and store a character in it is the character stored in 1 byte of memory.
char ch;
ch=' ';
printf("%zu",sizeof(ch));
//outputs 1
For anything to be a string, it must be terminated with a null character represented as '\0'.
If we write a string "hello", it is actually stored as 'h' 'e' 'l' 'l' 'o' '\0', so that the system knows the string ends after the 'o' in "hello" and stops reading when the null character comes. The length of this string is still 5 if you use the strlen() function, but sizeof("hello") is 6 bytes.
When we create an empty string, like "", its length is 0 but its size is 1 byte, as it must terminate where it starts, i.e. at the 0th character.
Hence an empty string consists of only one character, that is null character, giving size 1 byte.
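The difference between length and size can be seen side by side in a short sketch. The array and function names here are invented for the example; the arrays are sized by the compiler from their initialisers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Arrays initialised from string literals; each gets room for '\0'. */
static const char hello[] = "hello";
static const char empty[] = "";

static_assert(sizeof(hello) == 6, "five letters plus the terminator");
static_assert(sizeof(empty) == 1, "just the terminator");

/* Characters before the terminating '\0'. */
size_t hello_length(void) { return strlen(hello); }
size_t empty_length(void) { return strlen(empty); }

/* Storage in bytes, terminator included. */
size_t hello_size(void) { return sizeof(hello); }
size_t empty_size(void) { return sizeof(empty); }
```

sizeof only reports the array size when it is applied to the array itself; applied to a char * that points at the same characters, it gives the size of the pointer instead.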
From C Traps and Pitfalls
Single and double quotes mean very different things in C.
A character enclosed in single quotes is just another way of writing the integer that corresponds to the given character. Thus, in an ASCII implementation, ' ' means exactly the same thing as 32.
On the other hand, a string enclosed in double quotes is a short-hand way of writing a pointer to the initial character of a nameless array that has been initialized with the characters between the quotes and an extra character whose binary value is zero. Thus "", the empty string, still contains the '\0' character, so its size is one.
In the first case there is a character constant, which sizeof treats as an int holding the character's ASCII value, so it gives 4.
In the second case the operand is a string with no characters in it, i.e. an empty string. An empty string still contains the terminating null character, so its size is 1.
When I came across this C implementation of Porter's stemming algorithm, I found a C-ism I was confused about.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void test( char *s )
{
int len = s[0];
printf("len = %i\n", len );
printf("s[len] = %c\n", s[len] );
}
int main()
{
test("\07" "abcdefg");
return 0;
}
and output:
len = 7
s[len] = g
However, when I input
test("\08" "abcdefgh");
or any string constant longer than 7 characters with the corresponding length in the first pair of quotes (e.g. test("\09" "abcdefghi");), the output is
len = 0
s[len] =
But any input like test("\01" "abcdefgh"); prints out the character in that position (if we call the first character position 1 and not 0 for the moment).
It appears that test( char *s ) reads the number in the first pair of quotes (how it does this I am not sure, since I thought s[0] would only be able to read a single char, i.e. the '\') and prints the character at that index + 1 of the string constant in the second pair of quotes.
My question is this: it seems as if we are passing two string constants into test( char *s ). What exactly is happening here? That is, how does the compiler seem to "split" the string over two pairs of quotes? Another question one might have: is a string of the form "blah" "abcdefg" one consecutive block of memory? It may be that I have overlooked something elementary, but even so I would like to know what I overlooked. I know this is a basic concept, but I could not find a clear example on the web that explains it, and in all honesty I don't follow the output. Any helpful comments are welcome.
There are at least three things going on here:
Literal strings juxtaposed against one another are concatenated by the compiler. "a" "b" is exactly the same as "ab".
The backslash is an escape character, which means it is not copied literally into the resulting string. The notation \01 means "the character with ASCII value 1".
The notation \0... means an octal character constant. Octal numbers are base 8, made up from digits that range from 0 through 7 inclusive. 8 is not a valid octal digit, so "\08" is not the successor of "\07".
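The three points above can be checked at compile time. This is a small sketch, assuming a C11 compiler for static_assert; the array and function names are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Adjacent literals are joined by the compiler: "a" "b" is "ab". */
static const char joined[] = "a" "b";

/* "\07" is one character (code 7) plus the terminator. */
static const char seven[] = "\07";

/* "\08" is '\0' followed by the digit '8', plus the terminator. */
static const char not_eight[] = "\08";

static_assert(sizeof(joined) == 3, "'a', 'b', terminator");
static_assert(sizeof(seven) == 2, "code 7, terminator");
static_assert(sizeof(not_eight) == 3, "code 0, '8', terminator");

/* 1 if the joined literal equals "ab", else 0. */
int joined_is_ab(void) { return strcmp(joined, "ab") == 0; }

/* Numeric value of the only character in "\07". */
int seven_code(void) { return seven[0]; }

/* Numeric value of the first character of "\08". */
int not_eight_first_code(void) { return not_eight[0]; }

/* Second character of "\08". */
char not_eight_second_char(void) { return not_eight[1]; }
```

Because the first byte of "\08" is already the terminator, strlen on it would report 0 even though the array occupies three bytes.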
The problem is not in the length of the string, but in the \o syntax for specifying non-printable values in string literals. \o, \oo, and \ooo denote octal constants, i.e. a single character whose value is written in base 8. Since 08 in \08 doesn't represent a valid base 8 number, it is interpreted as \0 followed by the ASCII character 8.
To fix the problem, represent 8 as \10 or \010:
test("\007" "abcdefg");
test("\010" "abcdefgh");
...or switch to hexadecimal, where the \x prefix makes the base more explicit to the casual reader:
test("\x07" "abcdefg");
test("\x08" "abcdefgh");
test("\x09" "abcdefghi");
test("\x0a" "abcdefghij");
...
\number in a character constant or string literal means the character whose code is the value number. number is interpreted in octal, so the first non-octal digit terminates the number (as does reaching three digits). So "\07" is a one-character string containing the character with code 7, but "\08" is a two-character string containing the character with code 0 followed by the digit 8.
Additionally, code 0 is the null terminator that C uses to indicate the end of a string. So that second string ends at the beginning, because its first byte is the terminator. This is why the length of the string in your second example is 0.
When two or more string literals are adjacent (separated only by white-space), the compiler will join them into a single string. Therefore "\07" "abcdefg" is equivalent to "\07abcdefg".
"\07" is an octal escape. An octal escape ends after three digits or at the first non-octal character. So when you write "\08", 8 is not an octal digit; the escape therefore ends after the 0, and the value 0 is stored at s[0].
Now len is 0, and printing s[len] prints s[0], which is the null character and is not printable (the printable ASCII characters are those with codes 32 through 126).
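To see the effect on the length-prefix trick from the question, here is a hedged sketch of a function that does the same lookup as test() but returns the character instead of printing it. The name char_at_prefixed_length is hypothetical, and the caller is assumed to pass a string whose first byte is a valid index into it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: s[0] holds a length n; the result is s[n]. */
char char_at_prefixed_length(const char *s)
{
    assert(s != NULL);
    /* Go through unsigned char so a byte above 127 cannot become
       a negative index on platforms where char is signed. */
    size_t len = (unsigned char)s[0];
    return s[len];
}
```

With this, "\007" "abcdefg" yields 'g' and "\010" "abcdefgh" yields 'h', while "\08" "abcdefgh" yields '\0' because the prefix byte is 0. Keeping the prefix in its own literal also matters for hexadecimal escapes: escapes are processed before adjacent literals are joined, so "\x09" "abcdefghi" starts with code 9, whereas in a single literal "\x09abcdefghi" the letters a through f would be read as further hex digits of the escape.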
When I give 'x' as input, compVal gets the value -1. I expected 0, since both values are the same. Can someone please explain the reason?
char ch = getchar();
int compVal = strcmp("x", &ch);
You have to give strcmp two strings. A string is an array of char whose last value is \0.
In your example, the second argument is just the address of a single char with no string terminator after it, so the function goes blindly ahead, reading past the variable, until it finds a 0 (the same thing as \0).
You should either use strcmp with a char array such as char ch[2] (one element for the character you want and the other for the \0 mentioned earlier) or, in your case, just use the == operator, since you want to compare only one character.
You probably shouldn't be using strcmp() to compare single characters.
Char variables can just be compared using relational operators such as ==, >, >=, etc.
I would think the reason that your comparison isn't working is that you're comparing a string to a single character. A string literal such as "x" has a null terminator '\0' added to the end by the compiler, but a lone char variable has no terminator after it. strcmp is therefore comparing "x\0" against 'x' followed by whatever happens to be next in memory, and reports that they differ.
strcmp reads from each address it is given until a \0 is found, so you need to pass it null-terminated strings. Not doing so results in undefined behavior.
These are two different data types.
Remember that internally "x" is stored as 'x' and '\0' in memory. You need to make memory look the same for it to work as a string in C.
This will work:
char ch[2];
ch[0] = getchar();
ch[1] = 0;
int compVal = strcmp("x",ch);
Here you compare two arrays of characters, not the address of a single char and a char*.
You compare the constant string "x" with a char 'x'. By passing the pointer to that char you make strcmp think it is comparing strings. However, the constant string "x" ends with '\0', but the char you use as a string is not followed by '\0', which is a requirement of a string.
x \0
x ?
  ^ <- difference found
However, what you are doing might result in a segmentation fault on other systems. The correct fix for this is to put a terminating null character after the input or just compare the chars (in this case that is even better!).
You can compare characters directly:
char ch = getchar();
if ('x'==ch)
{
/* ... */
}
My question is about the sizeof operator in C.
sizeof('a'); equals 4, as it will take 'a' as an integer: 97.
sizeof("a"); equals 2: why? Also (int)("a") will give some garbage value. Why?
'a' is a character constant - of type int in standard C - and represents a single character. "a" is a different sort of thing: it's a string literal, and is actually made up of two characters: a and a terminating null character.
A string literal is an array of char, with enough space to hold each character in the string and the terminating null character. Because sizeof(char) is 1, and because a string literal is an array, sizeof("stringliteral") will return the number of character elements in the string literal including the terminating null character.
That 'a' is an int instead of a char is a quirk of standard C, and explains why sizeof('a') == 4: it's because sizeof('a') == sizeof(int). This is not the case in C++, where sizeof('a') == sizeof(char).
Because 'a' is a character, while "a" is a string consisting of the 'a' character followed by a null.
I have a small snippet of code.
When I run this on my DevC++ gnu compiler it shows the following output:
main ()
{ char b = 'a';
printf ("%d,", sizeof ('a'));
printf ("%d", sizeof (b));
getch ();
}
OUTPUT: 4,1
Why is 'a' treated as an integer, whereas b is treated as only a character?
Because character literals are of type int and not char in C.
So sizeof 'a' == sizeof (int).
Note that in C++, a character literal is of type char and so sizeof 'a' == sizeof (char).
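The type difference can also be observed directly rather than inferred from sizeof. The sketch below assumes a C11 compiler, since it relies on _Generic; TYPE_NAME and the two function names are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Maps an expression to the name of its type (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x), \
    char: "char",                  \
    int: "int",                    \
    default: "other")

static_assert(sizeof 'a' == sizeof(int),
              "a character constant has type int in C");

/* Type of the character constant 'a' as a C compiler sees it. */
const char *type_of_char_constant(void)
{
    return TYPE_NAME('a');
}

/* Type of a char variable initialised from the same constant. */
const char *type_of_char_variable(void)
{
    char b = 'a';
    return TYPE_NAME(b);
}
```

The initialisation char b = 'a' converts the int value 97 to char, which is why sizeof (b) is 1 even though sizeof ('a') is sizeof (int).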
That's just the way it is in C. That's just how the language was originally defined. As for why... Back then virtually everything in C was an int, unless there was a very good reason to make it something else. So, historically character constants in C have type int.
Note, BTW, that in the C standard's nomenclature 'a' is called a character constant, not a literal. The standard uses the word "literal" only for string literals and (since C99) compound literals.
In C, a character literal has type int.
In C++, a character literal that contains only one character has type char, which is an integral type.
In both C and C++, a wide character literal has type wchar_t, and a multicharacter literal has type int.
From IBM XL C/C++ documentation
A character literal contains a sequence of characters or escape
sequences enclosed in single quotation mark symbols, for example 'c'.
A character literal may be prefixed with the letter L, for example
L'c'. A character literal without the L prefix is an ordinary
character literal or a narrow character literal. A character literal
with the L prefix is a wide character literal. An ordinary character
literal that contains more than one character or escape sequence
(excluding single quotes ('), backslashes (\) or new-line characters)
is a multicharacter literal.
Character literals have the following form:
      .---------------------.
      V                     |
>>-+---+--'----+-character-------+-+--'------------------------><
   '-L-'       '-escape_sequence-'
At least one character or escape sequence must appear in the character
literal. The characters can be from the source program character set,
excluding the single quotation mark, backslash and new-line symbols. A
character literal must appear on a single logical source line.
In C, a character literal has type int.