what is the effective way to compare strings? [duplicate] - c

I am writing the different sets of strings generated by a piece of software to a text file. I want to write a test that compares the generated text with the written text to catch any errors.
What is an effective way to do such a test?

The standard method to compare C strings is the strcmp() function declared in <string.h>.
There are a few special cases where more efficient solutions can be sought:
If the strings have a known length, memcmp() can be used and might perform better, as it does not need to test for the end of the strings.
If only equality is to be tested, the extra work performed by strcmp() to compute the relative lexicographical order of the strings could be avoided, but strcmp() is usually implemented very efficiently, so it is unlikely that you will get any improvement by hand-coding an alternative in C.
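A minimal sketch of the known-length case, assuming the caller already tracks both lengths. The helper name equal_with_len is hypothetical, not a standard library function:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: equality test for two strings whose lengths
   are already known, so no scan for the terminator is needed. */
static bool equal_with_len(const char *a, size_t a_len,
                           const char *b, size_t b_len)
{
    /* Strings of different lengths can never be equal. */
    if (a_len != b_len)
        return false;
    /* memcmp compares exactly a_len bytes and does not look for '\0'. */
    return memcmp(a, b, a_len) == 0;
}
```

Whether this is measurably faster than strcmp() depends on the library and the data, so it is worth profiling before relying on it.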

To compare two strings in C, pass them to the function strcmp(); the strings can come from anywhere, including user input.
If it returns 0, the strings are equal.
If it returns a nonzero value, the strings differ: a negative value means the first string sorts before the second, and a positive value means it sorts after.
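As a small illustration of how the return value can be interpreted, here is a sketch with a hypothetical helper, compare_sign, that reduces the result of strcmp() to -1, 0 or 1. The standard only guarantees the sign of the result, not its magnitude:

```c
#include <string.h>

/* Hypothetical helper: collapse strcmp's result to -1, 0 or 1.
   Only the sign of strcmp's return value is meaningful. */
static int compare_sign(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

For a plain equality test, checking strcmp(a, b) == 0 directly is enough.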

Related

compare two binary files using strcmp() in c Language

Sorry for my bad English.
I have two binary files.
I read each file into its own buffer.
Then I compared the two buffers using strcmp().
The result of strcmp() was zero.
So I thought the two binaries were identical.
I then opened both binaries to check that there were no differences.
But I found a small difference.
What is the problem?
Is strcmp() not the proper way to compare binary data?
The C function strcmp is written to compare strings. In C, a string is a sequence of chars, held in an array or reached through a char pointer, that ends with a null byte ('\0'). Therefore, the comparison only goes up to the first null byte.
Example:
File A: "abcd\0efg"
File B: "abcd\0xyz"
Since both files are equal up to the null byte, the "strings" at these locations are equal, although what comes after may differ. You should use the function memcmp instead, passing it the number of bytes to compare.
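The File A / File B example can be reproduced with two in-memory buffers. This is only a sketch: the array and function names are made up for illustration, and real file contents would be read with fread() along with their byte counts:

```c
#include <stdbool.h>
#include <string.h>

/* The two example buffers. sizeof counts the implicit final '\0',
   so each array holds 9 bytes. */
static const char file_a[] = "abcd\0efg";
static const char file_b[] = "abcd\0xyz";

/* strcmp stops at the first '\0', so it only ever sees "abcd". */
static bool same_as_strings(void)
{
    return strcmp(file_a, file_b) == 0;
}

/* memcmp compares every byte of the given length, including the
   bytes that follow the embedded '\0'. */
static bool same_as_bytes(void)
{
    return sizeof file_a == sizeof file_b &&
           memcmp(file_a, file_b, sizeof file_a) == 0;
}
```

Here same_as_strings() reports the buffers as equal while same_as_bytes() reports a difference, which matches the behaviour described in the question.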
EDIT:
As pointed out by the comment under this answer and as mentioned in the other answer, the man pages of strcmp and memcmp are reliable resources for learning about these functions from the standard library.
You can't compare binary data using string functions.
You need to use memcmp instead.
https://man7.org/linux/man-pages/man3/memcmp.3.html

'If' Condition Not Working As Expected In My C Code

I am fully aware that this is due to some error overlooked by me while writing my text-based calculator project in C, but I have only started learning C less than a week ago, so please help me out!
Since the entire code is 119 lines, I'll just post the necessary snippet where the real issue lies: (There are no errors during compiling, so there is no error beyond these lines)
char choice[15];
printf("Type 'CALCULATE' or 'FACTORISE' or 'AVERAGE' to choose function :\n");
gets(choice);
if (choice == "CALCULATE")
The bug is that even after perfectly entering CALCULATE or FACTORISE or AVERAGE, I still get the error message that I programmed in case of invalid input (i.e, if none of these 3 inputs are entered). It SHOULD be going on to ask me the first number I wish to operate on, as written for the CALCULATE input.
The code runs fine, with no errors in VS 2013, so I'm sure it's not a syntax error but rather something stupid I've done in these few lines.
If you use == you are comparing the addresses of 2 arrays, not the contents of the arrays.
Instead, you need to do:
if (strcmp(choice, "CALCULATE") == 0)
Two things to mention here:
Never use gets(): it has serious security issues and was removed from the standard in C11. Use fgets() instead.
To compare strings, you should use strcmp(), not ==.
The problem is that you're trying to compare a char array with a string literal using ==. C does not treat them as the same, since for arrays the == operator compares addresses rather than contents.
You have two options for performing that comparison:
1) Use the strcmp() function, from the string.h library
2) Manually compare the chars in your array with those in the string literal
The first option is definitely the easiest and cleanest one.
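Putting the two recommendations together (fgets() instead of gets(), strcmp() instead of ==), a sketch might look like the following. The helper names strip_newline, read_choice and is_choice are hypothetical and not taken from the original program:

```c
#include <stdio.h>
#include <string.h>

/* fgets keeps the trailing '\n'; cut the string at the first one. */
static char *strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
    return s;
}

/* Read one line into buf (at most size - 1 chars).
   Returns 1 on success, 0 on end of input or read error. */
static int read_choice(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    strip_newline(buf);
    return 1;
}

/* Compare contents with strcmp; choice == "CALCULATE" would compare
   addresses and never be true for a local array. */
static int is_choice(const char *choice, const char *expected)
{
    return strcmp(choice, expected) == 0;
}
```

In the calculator, this could be used as read_choice(stdin, choice, sizeof choice) followed by if (is_choice(choice, "CALCULATE")).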

Handle a string with length in C

In C (not C++), we can think of several ways of handling a string together with its length:
Just rely on the null terminating character (\0): We assume that the string doesn't contain \0. Store the string in a char array and append \0 at the end. Use functions like strlen() when we need its size.
Store the characters and the length into a struct:
typedef struct _String {
char* data;
int size;
} String;
Use another variable for storing the length: For example,
char name[] = "hello";
int name_size = 5;
some_func(name, name_size, ...);
Personally, I prefer to use the second approach, since
It can cover some 'weird' strings which contain \0 in the middle.
We may implement some functions like string_new(), string_del(), string_getitem(), etc. to write some 'OOP-like' code.
We don't have to manage two (or more) variables to handle the string and its length together.
My question is: What is the most-used way to handle strings in C? (especially when we have to use a lot of strings, e.g., when writing an interpreter)
Thanks.
What is the most-used way to handle strings in C?
No doubt the most common way by far is to simply rely on the null termination.
Is it the "best" way? Probably not. Using a custom string library may be the "best" way as far as execution speed and program design are concerned. The downside is that you would have to drag that library around, since there are no standard or even de facto standard string libraries for C.
Most C programmers simply use ASCIIZ strings and accept the inefficiency. C is still a very fast language.
However, if you are doing a lot of string processing, it may be worthwhile to write a dedicated string library or suite. A struct with a length member and a pointer is an obvious choice. If you get really advanced, for example for genetic data processing, you will find that you need structures such as suffix trees, which allow substring searches in time proportional to the length of the pattern, independent of the length of the text.
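The struct with a length member and a pointer could be sketched as follows. This is one possible layout under simple assumptions (each String owns a heap copy of its bytes), and string_equal is a hypothetical addition alongside the string_new() and string_del() names from the question:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Counted string: data need not be null-terminated and may contain
   '\0' bytes anywhere. */
typedef struct {
    char  *data;
    size_t size;
} String;

/* Copy size bytes from src. On allocation failure the result has
   data == NULL and size == 0. */
static String string_new(const char *src, size_t size)
{
    String s = { NULL, 0 };
    s.data = malloc(size ? size : 1);   /* never request 0 bytes */
    if (s.data != NULL) {
        memcpy(s.data, src, size);
        s.size = size;
    }
    return s;
}

/* Release the buffer and reset the struct to the empty state. */
static void string_del(String *s)
{
    free(s->data);
    s->data = NULL;
    s->size = 0;
}

/* Equal when the lengths match and all bytes match. */
static bool string_equal(const String *a, const String *b)
{
    return a->size == b->size &&
           (a->size == 0 || memcmp(a->data, b->data, a->size) == 0);
}
```

Because the length is stored, comparing or measuring a String never requires scanning for a terminator, at the cost of not being directly usable with the strxxx functions.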
In the C language, a string is by definition a null-terminated string. That's the reason why string literals are null-terminated, and why the strxxx functions of the Standard Library operate on null-terminated strings.
On the other hand, character arrays can contain what you want including nulls, and you have to pass their length in another way, like for any other array.
Because of the way C handles string literals and because of the C standard library, C programmers ordinarily use null-terminated strings. But it is worth noting that in C++ a std::string is close(*) to a character array plus a length, and even though C++ is a different language, the introduction of the C++ standard says (emphasis mine):
C++ is a general purpose programming language based on the C programming language...
Another example is the way the Windows API manages Unicode strings as BSTR. A BSTR is a special array of 16-bit characters whose length is stored immediately before the first character. This was chosen for compatibility with Visual Basic.
So if you need it, it is perfectly fine to build a library using strings defined as a struct array + length... or use the WINAPI implementation if appropriate or migrate to C++.
(*) In some older implementations, a C++ string is a reference-counted handle to a character array and its length; since C++11, implementations no longer share the buffer this way.
Obviously the most used way is the null-terminated way, since that is supported by the standard libraries.
Writing your own structures for strings may make sense for your purpose, but it will never become "the most used way", because it is not a standard way.

algo to find number of common characters in 2 strings [duplicate]

I am writing a C program to find the number of common characters in two strings.
E.g.: aabbccc aabc Ans: 4
aabcA aa Ans: 2
(Strings will contain upper-case letters, lower-case letters and digits.)
I have two algorithms in mind.
Assume the lengths of the strings are n and m.
1. Sort the arrays and then count: O(n log n + m log m) complexity
2. Scan through the two strings and use count arrays: O(n + m) complexity
Can anyone please suggest further optimizations or any other methods to do this?
Basically, you are asking about a bag (multiset) intersection.
I guess there won't be any algorithm more efficient than O(n+m), because you have to go through every element of the two bags at least once.
Since optimization matters for big inputs, I think your second method (the counting-array method) is pretty good. Whatever algorithm you come up with, you can't find the answer without looking at both strings completely. Hence, there shouldn't be any further asymptotic improvement, as it is already O(m+n). For smaller inputs your first algorithm may be faster, as there is a constant overhead of 26+26+10 = 62 counters associated with your second algorithm. But if you are really interested in faster code, then try to optimize how the input is read and the output is written. You may google for "faster I/O in C++" and read about it.
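The counting-array method can be sketched as below. This version assumes one counter per possible byte value rather than only the 62 alphanumeric characters, which keeps the indexing simple, and the function name count_common is made up for illustration:

```c
#include <limits.h>
#include <stddef.h>

/* Count the common characters (multiset intersection) of two strings
   in O(n + m) time, using one counter per possible byte value. */
static size_t count_common(const char *a, const char *b)
{
    size_t counts[UCHAR_MAX + 1] = { 0 };
    size_t common = 0;

    /* First pass: tally every character of a. */
    for (; *a != '\0'; ++a)
        counts[(unsigned char)*a]++;

    /* Second pass: each character of b uses up one matching
       occurrence from a, if any is left. */
    for (; *b != '\0'; ++b) {
        if (counts[(unsigned char)*b] > 0) {
            counts[(unsigned char)*b]--;
            common++;
        }
    }
    return common;
}
```

For the examples in the question, count_common("aabbccc", "aabc") gives 4 and count_common("aabcA", "aa") gives 2.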

What is the way to find the String literal size limit [duplicate]

I wanted to know what ways there are to find out the string literal size limit. I guess different compilers impose a maximum size on string literals, but how do I find it programmatically? Or is there some standard header file that exposes this size limit as a macro?
I checked the C99 draft, all it says is that at least 4,095 characters should be supported in a string literal; there doesn't seem to be a maximum length. This makes sense to me; why impose such a limit?
I really don't think you can "detect" this at run-time. Of course you should be able to detect it, crudely speaking, at compile-time by checking if the compilation succeeds. Write a program that generates a program containing a string literal of a given length, then try to build that output and iterate until building fails. Of course you will only have learnt something about your particular compiler, not a general lesson.
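The generator half of that experiment might look like the sketch below. The function name emit_literal_program and the shape of the generated program are assumptions for illustration; the driver that compiles each generated file and checks whether the build succeeds is left out, since it depends on the compiler being probed:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical generator: writes to out a C program whose only
   string literal consists of len copies of 'x'.
   Returns 0 on success, -1 if a write failed. */
static int emit_literal_program(FILE *out, size_t len)
{
    if (fputs("#include <stdio.h>\nint main(void)\n{\n    puts(\"", out) == EOF)
        return -1;
    for (size_t i = 0; i < len; ++i) {
        if (fputc('x', out) == EOF)
            return -1;
    }
    if (fputs("\");\n    return 0;\n}\n", out) == EOF)
        return -1;
    return 0;
}
```

A driver would call this with increasing values of len, writing each result to a source file, and stop at the first length for which the compiler rejects the output.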
Perhaps you should try to state your actual problem; it seems you're only hinting at it.
