I have this code:
char *name = "George";
if (name == "George")
    printf("It's George");
I thought that C strings could not be compared with the == operator and that I have to use strcmp. For some reason, when I compile with gcc (version 4.7.3), this code works. I thought this was wrong because it is like comparing pointers, so I searched on Google, and many people say that it is wrong and that comparing with == can't be done. So why does this comparison work?
I thought that c strings could not be compared with == sign and I have to use strcmp
Right.
I thought that this was wrong because it is like comparing pointers so I searched in google and many people say that it's wrong and comparing with == can't be done
That's right too.
So why this comparing method works ?
It doesn't "work". It only appears to be working.
The reason why this happens is probably a compiler optimization: the two string literals are identical, so the compiler really generates only one instance of them, and uses that very same pointer/array whenever the string literal is referenced.
Just to provide a reference to @H2CO3's answer:
C11 6.4.5 String literals
It is unspecified whether these arrays are distinct provided their elements have the
appropriate values. If the program attempts to modify such an array, the behavior is
undefined.
This means that in your example, name (which points to the string literal "George") and the second "George" may or may not share the same location; it's up to the implementation. So don't count on this: the result may differ with other compilers or on other machines.
The comparison you have done compares the location of the two strings, rather than their content. It just so happens that your compiler decided to only create one string literal containing the characters "George". This means that the location of the string stored in name and the location of the second "George" are the same, so the comparison returns non-zero.
The compiler is not required to do this, however - it could just as easily create two different string literals, with different locations but the same content, and the comparison would then return zero.
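To make the difference concrete, here is a minimal sketch. The helper names same_address, same_content and demo_distinct_arrays are made up for illustration. It uses two separate char arrays rather than two literals, because distinct arrays are guaranteed to live at different addresses, while the placement of identical literals is unspecified.

```c
#include <stdbool.h>
#include <string.h>

/* What == does on char pointers: compares addresses only. */
bool same_address(const char *a, const char *b)
{
    return a == b;
}

/* What strcmp does: compares the characters up to the terminator. */
bool same_content(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Two separate arrays holding identical text:
   equal content, but different addresses. */
bool demo_distinct_arrays(void)
{
    char first[] = "George";
    char second[] = "George";
    return same_content(first, second) && !same_address(first, second);
}
```

With two literals instead of two arrays, same_address could return either result depending on the compiler and its settings, which is exactly why the original code only appears to work.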
This comparison is unreliable, since you are comparing two pointers that may refer to two separate strings.
If this code still works, it is the result of an optimization in GCC, which keeps only one copy of identical string literals to save space.
Use strcmp().
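If you compare two strings with ==, you are comparing the base addresses of those strings, not the actual characters in them. To compare strings, use the strcmp() library function (or strcasecmp() where available for case-insensitive comparison), or write a function like the one below. It is not a full program, just the logic required for string comparison.
#include <stddef.h>
int mystrcmp(const char *a, const char *b)
{
    size_t i = 0;
    /* advance while the characters match and the end of a is not reached */
    while (a[i] != '\0' && a[i] == b[i])
        i++;
    /* 0 if equal; otherwise the sign of the first difference, as in strcmp */
    return (unsigned char)a[i] - (unsigned char)b[i];
}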
For clarity I'm only talking about null terminated strings.
I'm familiar with the standard way of doing string comparisons in C with the usage of strcmp. But I feel like it's slow and inefficient.
I'm not necessarily looking for the easiest method but the most efficient.
Can the current comparison method (strcmp) be optimized further while the underlying code remains cross platform?
If strcmp can't be optimized further, what is the fastest way which I could perform the string comparison without strcmp?
Current use case:
Determine if two arbitrary strings match
Strings will not exceed 4096 bytes, nor be less than 1 byte in size
Strings are allocated/deallocated and compared within the same code/library
Once comparison is complete I do pass the string to another C library which needs the format to be in a standard null terminated format
System memory limits are not a huge concern, but I will have tens of thousands of such strings queued up for comparison
Strings may contain high-ascii character set or UTF-8 characters but for my purposes I only need to know if they match, content is not a concern
Application runs on x86 but should also run on x64
Reference to current strcmp() implementation:
How does strcmp work?
What does strcmp actually do?
GLIBC strcmp() source code
Edit: Clarified the solution does not need to be a modification of strcmp.
Edit 2: Added specific examples for this use case.
I'm afraid your reference implementation for strcmp() is both inaccurate and irrelevant:
it is inaccurate because it compares characters using the char type instead of the unsigned char type as specified in the C11 Standard:
7.24.4 Comparison functions
The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp is determined by the sign of the difference between the values of the first pair of characters (both interpreted as unsigned char) that differ in the objects being compared.
It is irrelevant because the actual implementation used by modern compilers is much more sophisticated, expanded inline using hand-coded assembly language.
Any generic implementation is likely to be less optimal, especially if coded to remain portable across platforms.
Here are a few directions to explore if your program's bottleneck is comparing strings.
Analyze your algorithms and try to find ways to reduce the number of comparisons: for example, if you search for a string in an array, sorting that array and using a binary search will drastically reduce the number of comparisons.
If your strings are tokens used in many different places, allocate unique copies of these tokens and use those as scalar values. The strings will be equal if and only if the pointers are equal. I use this trick in compilers and interpreters all the time with a hash table.
If your strings have the same known length, you can use memcmp() instead of strcmp(). memcmp() is simpler than strcmp() and can be implemented even more efficiently in places where the strings are known to be properly aligned.
EDIT: with the extra information provided, you could use a structure like this for your strings:
typedef struct string_t {
size_t len;
size_t hash; // optional
char str[]; // flexible array, use [1] for pre-c99 compilers
} string_t;
You allocate this structure this way:
string_t *create_str(const char *s) {
    size_t len = strlen(s);
    string_t *str = malloc(sizeof(*str) + len + 1);
    if (str == NULL)
        return NULL; // allocation failed
    str->len = len;
    str->hash = hash_str(s, len);
    memcpy(str->str, s, len + 1); // copies the null terminator too
    return str;
}
If you can use these str things for all your strings, you can greatly improve the efficiency of the matching by first comparing the lengths or the hashes. You can still pass the str member to your library function, it is properly null terminated.
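A comparison built on that structure could look like the sketch below. The name str_equal is made up here, and since hash_str is not defined in the answer, the body shown (an FNV-1a style loop) is only a stand-in assumption. Any reasonable hash would do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct string_t {
    size_t len;
    size_t hash;
    char str[]; /* flexible array member (C99) */
} string_t;

/* Assumed placeholder hash (FNV-1a style); not part of the original answer. */
size_t hash_str(const char *s, size_t len)
{
    size_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

string_t *create_str(const char *s)
{
    size_t len = strlen(s);
    string_t *str = malloc(sizeof(*str) + len + 1);
    if (str == NULL)
        return NULL;
    str->len = len;
    str->hash = hash_str(s, len);
    memcpy(str->str, s, len + 1);
    return str;
}

/* Cheap checks first: differing length or hash means the strings differ.
   Only when both match do we compare the bytes themselves. */
bool str_equal(const string_t *a, const string_t *b)
{
    if (a->len != b->len || a->hash != b->hash)
        return false;
    return memcmp(a->str, b->str, a->len) == 0;
}
```

The final memcmp is still required because two different strings can share a hash. The early exits just make the common mismatch case cost two integer comparisons.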
I'm currently trying to port a program from VB6 to plain C. A lot of functions use multiple instances of the & Operator to concatenate Strings like this:
(VB6 Code)
Public Function myFunc (myString As String) As String
Dim myNewString As String
myNewString = globalString & myString
myFunc = myNewString
End Function
The intent behind that is to concatenate different strings together. This is done exhaustively on multiple hundred occasions in the code.
I currently emulate this behavior like this:
sprintf(myString, "%s%s", myString, newString);
the strings are declared like this:
char myString[500] = {};
char newString[100] = {};
It's very important to note that my concatenation operations will never exceed the maximum length of the string, as that is not possible in any scenario this program deals with.
My question is now:
Assuming that I never exceed the max length of the char arrays, is this a safe and performing way to emulate this operation (I have not run into any issue so far with it in production testing)
Are there better ways to do this?
1. [...] is this a safe and performing way to emulate this operation (I have not run into any issue so far with it in production testing)
No, not at all. Your code,
sprintf(myString, "%s%s", myString, newString);
produces undefined behavior.
As mentioned in the C11 standard, chapter §7.21.6.6, The sprintf function
[...] If copying takes place between objects that overlap, the behavior is undefined.
Then,
2. Are there better ways to do this?
Yes, sure. You should be using strcat() to concatenate the strings.
If the destination string is large enough and already contains a valid, null-terminated string, simply use strcat.
strcat(myString, newString);
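If you would rather not rely on the "never exceeds the buffer" assumption, a small wrapper can check the size before appending. This is only a sketch: append_str is a hypothetical helper name, and it expects dst to already hold a null-terminated string inside a buffer of dst_size bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Appends src to dst if the result, including the terminator,
   fits in dst_size bytes. Leaves dst untouched and returns false otherwise. */
bool append_str(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    size_t extra = strlen(src);

    /* need used + extra + 1 <= dst_size, written to avoid overflow */
    if (used >= dst_size || extra >= dst_size - used)
        return false;

    memcpy(dst + used, src, extra + 1); /* copies the terminator too */
    return true;
}
```

A call such as append_str(myString, sizeof myString, newString) then behaves like strcat when there is room. Unlike the sprintf version, the source and destination do not overlap as long as newString is a different array from myString.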
I am fairly new to C, so I am not overly familiar with its syntax. However, I have debugged my code and researched the correct syntax, and it seems to be correct. I have also changed the scope of the variables to see if this was causing the error.
The if statement should compare two variables, which both hold strings, I have even printed both the variables out to ensure they are the same, however it is still skipping straight to the else section of the if statement. Can anyone give me any pointers on why it will not run the if statement, it just skips straight to 'incorrect'.
The correctWord variable is defined at a different section in the code.
Find full code here.
-UPDATE-
I have now updated the syntax of the code, however it is still returning false.
char correctWord[20];
void userGuess(){
    char userWordGuess[20];
    printf("Anagram: ");
    printf("%s", anagramWord);
    printf("Your Guess: ");
    scanf("%19s", userWordGuess); //Reads in user input (at most 19 characters)
    printf("%s", correctWord);
    printf("%s", userWordGuess);
    if(strcmp(userWordGuess, correctWord) == 0){
        printf("Congratulations, you guessed correctly!");
    }else{
        printf("Incorrect, try again or skip this question");
    }
}
You cannot compare strings in C using ==, because this compares the addresses of the strings, not their contents. That is certainly not what you want, and the addresses of two different arrays are never equal anyway.
C has a function for exactly this: strcmp(), which returns 0 if both strings are equal.
Try using this in your if condition:
if (!strcmp(userWordGuess,correctWord))
{
//Yay! Strings are equal. Do what you want to here.
}
Be sure to #include <string.h> before using strcmp().
In C, you can't compare strings using ==. You will end up comparing the addresses of the strings, which is not the same thing as comparing their contents.
You need to call the strcmp() function, which will return 0 if its arguments (two strings) are equal.
So the code should be if(strcmp(userWordGuess, correctWord) == 0).
You're comparing addresses of different arrays, which will always be unequal.
You need to use strcmp or some other strings library function to compare strings character by character.
userWordGuess == correctWord will compare the pointers (i.e. the locations in memory of the arrays), which are probably not equal.
For string comparison in C, use strcmp (or strncmp):
if (!strcmp(userWordGuess, correctWord)){
    /*Strings are equal*/
}
Use
if(strcmp(userWordGuess, correctWord) == 0) // strings are equal
{
printf("Congratulations, you guessed correctly!");
}
else // not equal
{
printf("Incorrect, try again or skip this question");
}
If both strings are equal, the if branch runs; otherwise the else branch runs.
The strings are not first-class citizens in the C language. The strings are represented as either arrays of characters or pointers to such arrays.
In both cases, the variable you use to access the string is a synonym for the address in memory of the first character of the string.
What you compare with userWordGuess == correctWord is not the strings but their addresses in memory. Since userWordGuess and correctWord are two different arrays of characters, their addresses in memory are always different and their comparison will always produce FALSE.
In order to compare the actual string values you have to use the standard function strcmp() or one of its variants (find them at the bottom of the documentation page).
Change in the code:
/** Need to include the header that declares the strcmp() function */
#include <string.h>
char correctWord[20];
void userGuess(){
    char userWordGuess[20];
    /** stripped some lines here ... */
    /** compare the strings, not their addresses in memory */
    if (strcmp(userWordGuess, correctWord) == 0) {
        /** the rest of your code */
    }
}
What you are doing here is comparing two pointers. userWordGuess and correctWord point each to the beginning of an array of characters (which is what you defined at the beginning of your example code).
So if you want to compare the two arrays of chars you can use the strcmp function defined in string.h
It is important that you learn the relation between arrays and pointers. Pointer arithmetic is as well important here. Check this out: Arrays, Pointers, Pointer Arithmetic
char first_array[5][4] = {"aaa","bbb","ccc","ddd","eee"};
char second_array[1][4];
How would I copy, for example, the third element in first_array ("ccc") and save it to second_array?
The syntax below is clearly wrong, but this is what I'm asking for:
second_array[0] = first_array[2];
Also, after copying, I also want to know how to compare elements in the two arrays. Again, the syntax below might be wrong, I'm just explaining what I'm trying to do:
if(second_array[0] == first_array[2]){ printf("yes"); } //should print yes
You can't assign to arrays in C, but you can fill them with library functions like strcpy(), so
second_array[0] = first_array[2];
would be
strcpy(second_array[0], first_array[2]);
you must however ensure that the destination array fits the number of characters you are copying to it.
If you want to compare two strings in C, you can't do it with the == operator, because strings in C are arrays of char that contain a sequence of non-nul characters followed by a nul character. So if you write this
if (second_array[0] == first_array[2])
then, even after you have copied the data successfully, the result will be false, because you are comparing the addresses of the arrays rather than their contents. To compare the contents there is the function strcmp(), so the correct way of comparing the strings is
if (strcmp(second_array[0], first_array[2]) == 0)
The functions above require you to include the string.h header, and also that the passed strings are strings in the c sense, i.e what I described above.
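Putting both steps together with the arrays from the question gives something like this sketch. The wrapper function copy_and_compare is just an illustrative name.

```c
#include <stdbool.h>
#include <string.h>

/* Copies the third row into second_array, then compares contents.
   Each row holds 3 characters plus the terminator, so it fits in 4 bytes. */
bool copy_and_compare(void)
{
    char first_array[5][4] = {"aaa", "bbb", "ccc", "ddd", "eee"};
    char second_array[1][4];

    strcpy(second_array[0], first_array[2]);

    /* strcmp returns 0 when the two strings have the same characters */
    return strcmp(second_array[0], first_array[2]) == 0;
}
```

The function returns true here, whereas second_array[0] == first_array[2] would be false because the two rows are different objects in memory.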
I was recently trying to do this, as well: it is not possible to do this sort of direct assignment in C.
When you write first_array[2], the compiler reads that as an address pointing to the first element (character) of that row, not as the entire string. If the assignment were to work at all, it would only set the first character.
The easiest way is to use strncpy or memcpy (or a loop to cycle through the string).
in one of my university assignments I am restricted in the libraries I use. I am new to C and pointers and want to see if two strings (or should I say char's) are equal.
Part of me wants to loop through every char of the 'char string' and test equivalence, but then it comes back how to test equivalence (lol).
Any help is appreciated.
edit: I am seeing this:
warning: result of comparison against a string literal is
unspecified (use strncmp instead) [-Wstring-compare]
which leads to a segmentation fault. I know it has to do with this piece of code because all I added was:
if (example.name == "testName"){
printf("here!\n");
}
Part of me wants to loop through every char of the 'char string' and test equivalence
That's exactly what you need to do. Make a function mystrcmp with the signature identical to regular strcmp,
int mystrcmp ( const char * str1, const char * str2 );
and write your own implementation.
but then it comes back how to test equivalence.
When you loop character-by-character, you test equivalence of individual characters, not strings. Characters in C can be treated like numbers: you can compare them for equality using ==, check what character code is less than or greater than using < and >, and so on.
The only thing left to do now is deciding when to stop. You do that by comparing the current character of each string to zero, which is the null terminator.
Don't forget to forward-declare your mystrcmp function before using it.
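Since the assignment restricts libraries, here is one way those pieces could fit together without string.h. Treat it as a sketch rather than the required solution. names_match is a hypothetical caller, included only to show why the forward declaration matters.

```c
/* Forward declaration, so code above the definition can call it. */
int mystrcmp(const char *str1, const char *str2);

/* Hypothetical caller: checks a name against a fixed string. */
int names_match(const char *name)
{
    return mystrcmp(name, "testName") == 0;
}

int mystrcmp(const char *str1, const char *str2)
{
    /* Walk both strings while the characters match and str1 has not ended.
       If str2 ends first, the characters differ and the loop stops. */
    while (*str1 != '\0' && *str1 == *str2) {
        str1++;
        str2++;
    }
    /* 0 when equal; otherwise negative or positive, like strcmp */
    return (unsigned char)*str1 - (unsigned char)*str2;
}
```

In the question's code, the test would then become if (mystrcmp(example.name, "testName") == 0) instead of the == comparison.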
A string in C is terminated with a null character (0x00 or \0). You should compare both strings in a loop, character by character, until the null character of either string is reached.
Loop should be broken if characters are not equal.
EDIT:
To answer your edit in question:
You should take two character pointers pointing to both strings and then compare them like this:
/* loop until the null terminator of either string is found */
while (*ptr1 != '\0' && *ptr2 != '\0')
{
    if (*ptr1 != *ptr2)
    {
        break;
    }
    ptr1++; ptr2++;
}
if ((*ptr1 == *ptr2) && (*ptr1 == 0x00))
{
    /* strings are equal */
}
Given that this is a university assignment, you should pay heed to chars just being small integers. You should also pay heed that C strings are contiguous memory buffers terminated by a binary zero (0x00).
You should also learn about pointer math. You will learn ways to shorten the code you have to write while learning something really interesting concerning the C language and how computers work. It will certainly help you if you choose a career on lower-level programming.