C string comparison [duplicate] - c

I've been coding in C++ and am completely new to C.
Why doesn't this work? I want to end the program by typing exit.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char command[4];
do{
printf( " -> " ) ;
scanf("%c", &command);
}while(&command != "exit");
return 0;
}

Because in C you have to use strcmp for string comparison.
In C a string is a sequence of characters that ends with the '\0'-terminating byte, whose value is 0.
The string "exit" looks like this in memory:
+-----+-----+-----+-----+------+
| 'e' | 'x' | 'i' | 't' | '\0' |
+-----+-----+-----+-----+------+
where 'e' == 101, 'x' == 120, etc.
The values of the characters are determined by the codes of the ASCII Table.
&command != "exit"
is just comparing pointers.
while(strcmp(command, "exit") != 0);
would be correct. strcmp returns 0 when both strings are equal, a non-zero
value otherwise. See
man strcmp
#include <string.h>
int strcmp(const char *s1, const char *s2);
DESCRIPTION
The strcmp() function compares the two strings s1 and s2. It returns an integer less than, equal to, or greater than zero if s1 is
found, respectively, to be less than, to match, or be greater than s2.
But you've made another error:
scanf("%c", &command);
Here you are reading only one character; that is not a string.
scanf("%s", command);
would be correct.
The next error would be
char command[4];
This can hold strings with a maximal length of 3 characters, so "exit" doesn't
fit in the buffer.
Make it
char command[1024];
Then you can store a string with max. length of 1023 bytes.
In general, if you want to store a string of length n, you need a char array with
at least n+1 elements.

You use strcmp, obviously:
while (strcmp(command, "exit") != 0)
What your code does is compare the address of the input buffer with the address of the static string "exit", which of course will never match. You must compare the characters at the pointers.
The other problem is that you have a four-byte buffer for a five-byte string; the terminator character needs to fit too. C is extremely tricky this way: you need to allocate a "big enough" buffer for whatever people might type in, because writing past the end of the buffer is undefined behavior and may crash the program. Use 1024 or something reasonably big for test programs.
Now I say "obviously" because when writing C code you should have a C standard library reference open at all times to be sure you're using the correct functions and arguments, plus to know what tools you have available.

Multiple issues with the code
You need five chars not four to include the ending null char \0.
You should use %s for inputting string. %c is for characters
You are comparing memory locations (or pointers) in the while loop. You need strcmp for comparing strings.

Related

Why is my output wrong? C newbie

#include <stdio.h>
int main(void)
{
char username;
username = '10A';
printf("%c\n", username);
return 0;
}
I just started learning C, and here is my first problem. Why is this program giving me 2 warnings (multi-character constant, overflow in implicit constant conversion)?
And instead of giving 10A as output, it is giving just A.
You are trying to stuff multiple characters into a single set of '', and into a single char variable. You need "" for string literals, and you'll need an array of characters to hold a string. And to print a string, use %s.
Putting all of this together, you get:
#include <stdio.h>
int main(void)
{
char username[] = "10A";
printf("%s\n", username);
return 0;
}
Footnote
From Jonathan Leffler in the comments below regarding multi-character constants:
Note that multi-character constants are a part of C (hence the warning, not an error), but the value of a multi-character constant is implementation defined and hence not portable. It is an integer value; it is larger than fits in a char, so you get that warning. You could have gotten almost anything as the output — 1, A and a null byte could all be plausible.
'10A' is an allowed but obscure way to define a value.
In the case of an int variable,
int username = '10A';
printf("%x\n", username);
will output
313041
These are pairs of hexadecimal values - each pair is
0x31 is the '1' of your input.
0x30 is the '0' of your input.
0x41 is the 'A' of your input.
But a char type can't hold this.
In C there are no String objects. Instead, strings are arrays of characters (followed by a null character). Other answers have pointed out statically allocating this memory. However, I recommend dynamically allocating strings. Just remember that C lacks a garbage collector (unlike Java), so remember to free your pointers. Have fun!!
You could use char *username to point to the beginning of the string and loop through the memory after it. For instance, use strlen(username) to get the length (sizeof(username) would only give the size of the pointer) and then loop printf until you have printed that many characters. However, you may end up with major problems if you aren't careful...

String declaration length in C

So I'm writing a small program (I'm new to C, coming from C++), and I want to take in a string of maximum length ten.
I declare a character array as
#define SYMBOL_MAX_LEN 10 //Maximum length a symbol can be from the user (NOT including null character)
.
.
.
char symbol[SYMBOL_MAX_LEN + 1]; //Holds the symbol given by the user (+1 for null character)
So why is it when I use:
scanf("%s", symbol); //Take in a symbol given by the user as a string
I am able to type '01234567890', and the program will still store the entire value?
My questions are:
Does scanf not prevent values from being recorded in the adjacent
blocks of memory after symbol?
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
You can limit the number of characters scanf() will read like so:
#include <stdio.h>
int main(void) {
char buffer[4];
scanf("%3s", buffer);
printf("%s\n", buffer);
return 0;
}
Sample output:
paul@local:~/src/c/scratch$ ./scanftest
abc
abc
paul@local:~/src/c/scratch$ ./scanftest
abcdefghijlkmnop
abc
paul@local:~/src/c/scratch$
scanf() will add the terminating '\0' for you.
If you don't want to hardcode the length in your format string, you can just construct it dynamically, e.g.:
#include <stdio.h>
#define SYMBOL_MAX_LEN 4
int main(void) {
char buffer[SYMBOL_MAX_LEN];
char fstring[100];
sprintf(fstring, "%%%ds", SYMBOL_MAX_LEN - 1);
scanf(fstring, buffer);
printf("%s\n", buffer);
return 0;
}
For the avoidance of doubt, scanf() is generally a terrible function for dealing with input. fgets() is much better for this type of thing.
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
As far as I know, No.
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
By using buffer safe functions like fgets.
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
scanf always writes the nul terminator, but it only lands inside the array if there is room for it. For example, if your array has length 10 and you input 10 chars, the terminator is written one byte past the end of the array, which is undefined behavior.
I am able to type '01234567890', and the program will still store the entire value?
This is because you are unlucky: you happen to get the result you wanted. Writing past the end of the array invokes undefined behavior.
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
No.
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
Use fgets.
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
Yes

How strcmp() is returning -1 even though the two values are same?

When I am giving an input as 'x' the compVal is giving the value as -1. I am expecting 0 since both the values are same. Please someone explain the reason.
char ch = getchar();
int compVal = strcmp("x", &ch);
You have to give to strcmp two strings. A string is an array of char with the last value being \0.
In your example, the second value you are passing it is just the address of a char and there is no string terminator so the function goes blindly ahead until it finds a 0 ( same thing as \0).
You should either use strcmp with a char array like char ch[2] (one element for the character you want and the other for the \0 I mentioned earlier) or, in your case, just use the == operator, since you want to compare only one character.
You probably shouldn't be using strcmp() to compare single characters.
Char variables can just be compared using relational operators such as ==, >, >= etc
I would think the reason that your comparison isn't working is that you're comparing a string to a single character. Strings have a null terminator '\0' on the end of them; the compiler adds it to the literal "x", but nothing adds it after your single char. Therefore strcmp keeps reading past the 'x' in ch and reports that the two are not equal.
strcmp reads from the input address until a \0 is found, so you need to provide null-terminated strings to strcmp. Not doing so results in undefined behavior.
These are two different data types.
Remember that internally "x" is stored as 'x' and '\0' in memory. You need to make memory look the same for it to work as a string in C.
This will work:
char ch[2];
ch[0] = getchar();
ch[1] = 0;
int compVal = strcmp("x",ch);
Here you compare two arrays of characters. Not an address of a single char and a char*.
You compare the constant string "x" with a char 'x'. By giving the pointer to that char, you make strcmp think it is comparing strings. However, the constant string "x" ends with '\0' but the char you use as a string does not end with '\0', which is a requirement of a string.
x\0
x ^ <- difference found
However, what you are doing might result in a segmentation fault on other systems. The correct fix for this is to put a terminating null character after the input or just compare the chars (in this case that is even better!).
You can compare characters directly:
char ch = getchar();
if ('x'==ch)
{
/* ... */
}

What is the difference between memcmp, strcmp and strncmp in C?

I wrote this small piece of code in C to test memcmp() strncmp() strcmp() functions in C.
Here is the code that I wrote:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char *word1="apple",*word2="atoms";
if (strncmp(word1,word2,5)==0)
printf("strncmp result.\n");
if (memcmp(word1,word2,5)==0)
printf("memcmp result.\n");
if (strcmp(word1,word2)==0)
printf("strcmp result.\n");
}
Can somebody explain me the differences because I am confused with these three functions?
My main problem is that I have a file whose lines I tokenize, and when I reach the word "atoms" in the file I have to stop tokenizing.
I first tried strcmp(), but unfortunately when it reached the point where the word "atoms" was placed in the file it didn't stop and it continued. When I used either memcmp() or strncmp() it stopped and I was happy.
But then I thought, what if there is a string in which the first 5 letters are a, t, o, m, s and these are followed by other letters?
Unfortunately, my thoughts were right: I tested it using the above code by initializing word1 to "atomsaaaaa" and word2 to "atoms", and memcmp() and strncmp() in the if statements returned 0. On the other hand, strcmp() didn't. It seems that I must use strcmp().
In short:
strcmp compares null-terminated C strings
strncmp compares at most N characters of null-terminated C strings
memcmp compares binary byte buffers of N bytes
So, if you have these strings:
const char s1[] = "atoms\0\0\0\0"; // extra null bytes at end
const char s2[] = "atoms\0abc"; // embedded null byte
const char s3[] = "atomsaaa";
Then these results hold true:
strcmp(s1, s2) == 0 // strcmp stops at null terminator
strcmp(s1, s3) != 0 // Strings are different
strncmp(s1, s3, 5) == 0 // First 5 characters of strings are the same
memcmp(s1, s3, 5) == 0 // First 5 bytes are the same
strncmp(s1, s2, 8) == 0 // Strings are the same up through the null terminator
memcmp(s1, s2, 8) != 0 // First 8 bytes are different
memcmp compares a number of bytes.
strcmp and the like compare strings.
You kind of cheat in your example because you know that both strings are 5 characters long (plus the null terminator). However, what if you don't know the length of the strings, which is often the case? Well, you use strcmp because it knows how to deal with strings, memcmp does not.
memcmp is all about comparing byte sequences. If you know how long each string is then yeah, you could use memcmp to compare them, but how often is that the case? Rarely. You often need string comparison functions because, well... they know what a string is and how to compare them.
As for any other issues you are experiencing it is unclear from your question and code. Rest assured though that strcmp is better equipped in the general case for string comparisons than memcmp is.
strcmp():
It compares two null-terminated strings character by character and stops at the first difference or at the terminator.
strncmp():
It is very similar to the previous one, but it compares at most the first n characters.
memcmp():
This function compares two blocks of memory of a given length and does not stop at a 0 byte. Because the length is known in advance, implementations can often compare several bytes at a time, so it may be faster. Only use it when you know both buffers are at least that long.
To summarize:
strncmp() and strcmp() treat a 0 byte as the end of a string, and don't compare beyond it
to memcmp(), a 0 byte has no special meaning
strncmp and memcmp are the same except that the former stops at the null terminator of a string.
Use strcmp only when you know both arguments are going to be strings. That is not always the case, for example when reading lines of a binary file; there you would want to use memcmp to compare input that contains NUL characters, because memcmp keeps checking the full length instead of stopping at the first NUL.

Checking contents of char variable - C Programming

This might seem like a very simple question, but I am struggling with it. I have been writing iPhone apps with Objective C for a few months now, but decided to learn C Programming to give myself a better grounding.
In Objective-C if I had a UILabel called 'label1' which contained some text, and I wanted to run some instructions based on that text then it might be something like;
if (label1.text == @"Hello, World!")
{
NSLog(@"This statement is true");
}
else {
NSLog(@"Uh Oh, an error has occurred");
}
I have written a VERY simple C program which uses printf() to ask for some input, then uses scanf() to accept some input from the user, so something like this:
int main()
{
char decision[3];
printf("Hi, welcome to the introduction program. Are you ready to answer some questions? (Answer yes or no)");
scanf("%s", &decision);
}
What I wanted to do is apply an if statement to say if the user entered yes then continue with more questions, else print out a line of text saying thanks.
After using the scanf() function I am capturing the users input and assigning it to the variable 'decision' so that should now equal yes or no. So I assumed I could do something like this;
if (decision == yes)
{
printf("Ok, let's continue with the questions");
}
else
{
printf("Ok, thank you for your time. Have a nice day.");
}
That brings up an error of "use of undeclared identifier yes". I have also tried;
if (decision == "yes")
Which brings up "result of comparison against a string literal is unspecified"
I have tried seeing if it works by counting the number of characters so have put;
if (decision > 3)
But get "Ordered comparison between pointer and integer 'Char and int'"
And I have also tried this to check the size of the variable, if it is greater than 2 characters it must be a yes;
if (sizeof (decision > 2))
I appreciate this is probably something simple or trivial I am overlooking but any help would be great, thanks.
Daniel Haviv's answer told you what you should do. I wanted to explain why the things you tried didn't work:
if (decision == yes)
There is no identifier 'yes', so this isn't legal.
if (decision == "yes")
Here, "yes" is a string literal which evaluates to a pointer to its first character. This compares 'decision' to a pointer for equality. It is legal, but it is only true if they both point to the same place, which is not what you want. In fact, if you do this:
if ("yes" == "yes")
The result is unspecified. They will both point to the same place if the implementation collapses identical string literals to the same memory location, which it may or may not do. So that's definitely not what you want.
if (sizeof (decision > 2))
I assume you meant:
if( sizeof(decision) > 2 )
The 'sizeof' operator evaluates at compile time, not run time. And it's independent of what's stored. The sizeof decision is 3 because you defined it to hold three characters. So this doesn't test anything useful.
As mentioned in the other answer, C has the 'strcmp' function to compare two strings. You could also write your own code to compare them character by character if you wanted to. C++ has much better ways to do this, including string classes.
Here's an example of how you might do that:
int StringCompare(const char *s1, const char *s2)
{ // returns 0 if the strings are equivalent, 1 if they're not
while( (*s1!=0) && (*s2!=0) )
{ // loop until either string runs out
if(*s1!=*s2) return 1; // check if they match
s1++; // skip to next character
s2++;
}
if( (*s1==0) && (*s2==0) ) // did both strings run out at the same length?
return 0;
return 1; // one is longer than the other
}
You should use strcmp:
if(strcmp(decision, "yes") == 0)
{
/* ... */
}
You should be especially careful with null-terminated strings in C programming. A string is not an object; it is referred to by a pointer to a memory address. So you can't compare the content of decision directly with a constant string "yes", which is at another address. Use strcmp() instead.
And be aware that "yes" is actually "yes\0", which takes 4 bytes; the "\0" is very important to strcmp(), which recognizes it as the termination during the comparison loop.
Ok a few things:
decision needs to be an array of 4 chars in order to fit the string "yes" in it. That's because in C, the end of a string is indicated by the NUL char ('\0'). So your char array will look like: { 'y', 'e', 's', '\0' }.
Strings are compared using functions such as strcmp, which compare the contents of the string (char array), and not the location/pointer. A return value of 0 indicates that the two strings match.
With: scanf("%s", &decision);, you don't need to use the address-of operator; the name of an array is converted to the address of the start of the array.
You use strlen to get the length of a string, which will just increment a counter until it reaches the NUL char, '\0'. You don't use sizeof to check the length of strings, it's a compile-time operation which will return the value 3 * sizeof(char) for a char[3].
scanf is unsafe to use with strings unless you limit the input; you should alternatively use fgets(..., stdin), or include a width specifier in the format string (such as "%3s") in order to prevent overflowing your buffer. Note that if you use fgets, take into account that it'll store the newline char '\n' if it reads a whole line of text.
To compare you could use strcmp like this:
if(strcmp(decision, "yes") == 0) {
// decision is equal to 'yes'
}
Also you should change char decision[3] into char decision[4] so that the buffer has
room for a terminating null character.
char decision[4] = {0}; // initialize to 0
There are several issues here:
You haven't allocated enough storage for the answer:
char decision[3];
C strings are bytes in the string followed by an ASCII NUL byte: 0x00, \0. You have only allocated enough space for ye\0 at this point. (Well, scanf(3) will give you yes\0 and place that NUL in unrelated memory. C can be cruel.) Amend that to include space for the terminating \0 and amend your scanf(3) call to prevent the buffer overflow:
char decision[4];
/* ... */
scanf("%3s", decision);
(I've left off the &, because simply giving the name of the array is the same as giving the address of its first element. It doesn't matter, but I believe this is more idiomatic.)
C strings cannot be compared with ==. Use strcmp(3) or strncmp(3) or strcasecmp(3) or strncasecmp(3) to compare your strings:
if(strcasecmp(decision, "yes") == 0) {
/* yes */
}
C has lots of lib functions to handle this but it pays to know what you are declaring.
Declaring
char decision[3];
is actually declaring a char array of length 3. Therefore attempting a comparison of
if(decision == "yes")
is comparing an array against a literal and therefore will not work. Since there is no defined string type in C you have to use pointers, but not directly, if you don't want to. In C strings are in fact arrays of char so you can declare them both ways, e.g.:
char decision[3];
char *decision;
Both will in point of fact work, but in the first instance the compiler will allocate the memory for you, and it will ONLY allocate 3 bytes. Now since strings in C are null terminated you need to actually allocate 4 bytes since you need room for "yes" and the null. Declaring it the second way simply declares a pointer to someplace in memory but you have no idea really where. You would then have to allocate memory to contain whatever you are going to put there, since to do otherwise will more than likely cause a SEGFAULT.
To compare what you get from input you have two options: either use the strcmp() function or do it yourself by iterating through decision and comparing each individual byte against 'y' and 'e' and 's' until you hit null, aka \0.
There are variations on strcmp() to deal with uppercase and lowercase, such as the POSIX strcasecmp() declared in strings.h; they are not part of the standard C string.h library.
