Can someone explain how this simple C character comparison function works? - c

I am very new to the C programming language. Im confused why the function takes pointers as arguments? From my understanding, it looks like the function returns the difference in character value of 2 characters, is this correct?
Heres the function:
int charcomp(char *x, char *y) { return *x - *y; }
Thanks in advance for any help!

There is a C library idiom that a 'generic' comparison function takes two pointers to objects to compare and returns an int result that indicates the following:
a negative result means the first object is less than the second object
a zero result means the two objects are equal
a positive result means the first object is greater than the second object
Since the negative and positive results don't have to be specific values (though -1 and 1 are commonly used), it's also a common idiom to use a subtraction to generate that result for numeric comparisons. That's what is happening here.
However, beware that subtracting two int values could lead to undefined behavior in certain situations where overflow could occur (that usually won't be the case when subtracting char types though). A simple if/else set of tests can be used to avoid overflow.

This will compare the first character of two C strings (pointers-to-char). If the characters are the same it will return 0, otherwise it will return non-zero. This can be used as a boolean comparator, as in:
if (charcomp("alice", "aarvark"))
print("they start with different letters")
else
printf("they start with the same letter");

Related

Recursive function for finding factorial of a number

I am getting an output of 24 which is the factorial for 4, but I should be getting the output for 5 factorial which is 120
#include <stdio.h>
int factorial(int number){
if(number==1){
return number;
}
return number*factorial(--number);
}
int main(){
int a=factorial(5);
printf("%d",a);
}
Your program suffers from undefined behavior.
In the first call to factorial(5), where you have
return number * factorial(--number);
you imagine that this is going to compute
5 * factorial(4);
But that's not guaranteed!
What if the compiler looks at it in a different order?
What it if works on the right-hand side first?
What if it first does the equivalent of:
temporary_result = factorial(--number);
and then does the multiplication:
return number * temporary_result;
If the compiler does it in that order, then temporary_result will be factorial(4), and it'll return 4 times that, which won't be 5!. Basically, if the compiler does it in that order -- and it might! -- then number gets decremented "too soon".
You might not have imagined that the compiler could do things this way.
You might have imagined that the expression would always be "parsed left to right".
But those imaginations are not correct.
(See also this answer for more discussion on order of evaluation.)
I said that the expression causes "undefined behavior", and this expression is a classic example. What makes this expression undefined is that there's a little too much going on inside it.
The problem with the expression
return number * factorial(--number);
is that the variable number is having its value used within it, and that same variable number is also being modified within it. And this pattern is, basically, poison.
Let's label the two spots where number appears, so that we can talk about them very clearly:
return number * factorial(--number);
/* A */ /* B */
At spot A we take the value of the variable number.
At spot B we modify the value of the variable number.
But the question is, at spot A, do we get the "old" or the "new" value of number?
Do we get it before or after spot B has modified it?
And the answer, as I already said, is: we don't know. There is no rule in C to tell us.
Again, you might have thought there was a rule about left-to-right evaluation, but there isn't. Because there's no rule that says how an expression like this should be parsed, a compiler can do anything it wants. It can parse it the "right" way, or the "wrong" way, or it can do something even more bizarre and unexpected. (And, really, there's no "right" or "wrong" way to parse an undefined expression like this in the first place.)
The solution to this problem is: Don't do that!
Don't write expressions where one variable (like number) is both used and modified.
In this case, as you've already discovered, there's a simple fix:
return number * factorial(number - 1);
Now, we're not actually trying to modify the value of the variable number (as the expression --number did), we're just subtracting 1 from it before passing the smaller value off to the recursive call.
So now, we're not breaking the rule, we're not using and modifying number in the same expression.
We're just using its value twice, and that's fine.
For more (much more!) on the subject of undefined behavior in expressions like these, see Why are these constructs using pre and post-increment undefined behavior?
How to find the factorial of a number;
function factorial(n) {
if(n == 0 || n == 1 ) {
return 1;
}else {
return n * factorial(n-1);
}
//return newnum;
}
console.log(factorial(3))

Count length and number of words in string

I am a beginner in C. I am trying to print out the length and the number of words in a string inputted by user. I have trouble dealing with the counting of words. I attempt to scan the number of the space character , then add 1 to the result. It is guaranteed that each word is only separated by one space.
I executed the code, and inputted
I go to school by bus.
The second output was not as I expected
22
24
22 is correct since it is indeed the length of the string, but I do not understand why it prints 24 instead of 6 (result of 5 spaces plus 1).
The code is as follows
#include<stdio.h>
#include<string.h>
int main(){
char in[256];
int cl=0,cw,i;
scanf("%[^\n]",&in);
cw=strlen(in);
for(i=0;i<=cw;i++){
if(in[i]=' '){
cl=cl+1;
}
}
printf("%d\n",cw);
printf("%d\n",cl+1);
return 0;
}
What went wrong, and how can I get a correct output?
if(in[i]=' '){
will always be true, you need to change it to
if(in[i]==' '){
2 mistakes I see in your code.
if(in[i]=' ') should be if(in[i]==' ')
== Checks if the values of two operands are equal or not. If yes, then the condition becomes true.
= Simple assignment operator. Assigns values from right side operands to left side operand.
i<=cw should be i<cw
indexing starts with 0 in C.
As Pras said, on ifs you normally use ==, because you are comparing and it returns true or false
If you just use = it attributes (insert) a value on the right to the left.
You should use == instead of = in the if statement that detects the space.
The assignment operator, = gives is always true if the number is asssigned to a nonzero value.
The equivalence operator, == is what actually compares whether two numbers have the same value.
It may be surprising that the assignment operator can actually test whether or not a number is zero(at least on x860. This is because jz, one of the testing instructions, checks something called a flag register (a special memory location) that contains information about the arithmetic operation most recently executed. == does a cmp, which subtracts both numbers from each other to see if it is 0. Jz stands for "jump if zero", so two equal numbers result to the "jump" to the instructions in the if statement. When = is used, the compiler uses jz instead (in order to comply with the c idea that nonzero number in if is true), which results in the behavior described above.

Is it safe to compare floats strictly, given we do no operations on them?

Normally, when we want to test fractional numbers for equality, we do so with some amount of uncertainty, because of approximate nature of IEEE754.
if (fabs(one_float - other_float) < SUFFICIENTLY_SMALL) {
consider them equal;
}
Other approach might be to cast the floats to integers of specific magnitude, and compare resulting integers instead.
if ((uint)one_float == (uint)other_float) {
consider them equal;
}
But consider the situation where our floats never undergo any arithmetics, and the only thing we do with them is assignment.
// Might be called at any moment
void resize(float new_width, float new_height)
{
if (current_width == new_width && current_height == new_height) {
return;
}
accomodate framebuffers;
update uniforms;
do stuff;
current_width = new_width;
current_height = new_height;
}
In the example above, we have an event handler that may fire up spontaneously, and we want to reallocate resources only if real resize occurred. We might take the usual path with approximate comparison, but that seems like a waste, because that handler will fire up pretty often. As far as I know, floats are assigned just as everything else, using memory move; and equality operator performs just the memory comparison. So it looks like we are in a safe harbour. When checked on x86 and ARM, this assumption holds, but I just wanted to be sure. Maybe it's backed by some specs?
So, is there something that might change floats when only assigning them? Is the described approach viable?
A mass of "should"s follows.
I don't think there's anything that says an assignment from float to float (current_width = new_width) couldn't alter the value, but I would be surprised if such a thing existed. There should be no reason to make assignment between variables of the same type anything else than a direct copy.
If the incoming new_width and new_height keep their values until they change, then this comparison should not have any issues. But if they are calculated before every call they might change their value, depending on how the calculation is done. So it's not only this function that needs to be checked.
The C 2011 standard says that calculations may use bigger precision than the format you assign to, but nothing specific about assigning a variable to another. So the only imprecision should be in the calculation stage. The "simple assignment" part (6.5.16.1) says:
In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
So if the types already match, there should be no need for conversion.
So, simply put: if you don't recalculate the incoming value on every call the comparison for equality should hold true. But is there really a case where your framebuffers are sized as floats and not integers?
In the case you mentioned it is safe to use == to compare because of the following lines at the end of the function:
current_width = new_width;
current_height = new_height;
any change to new_width or new_height will fail the if statement and you get the behavior you wanted.
abs() function usually is used when there is one variable float which is assigned dynamically in the program and one constant float which you want to use as a reference. Something like:
bool isEqual(float first, float second)
{
if(fabs(first-second)<0.0001)
return true;
return false;
}
int main()
{
float x = (float) 1 / 3;
if(isEqual(x,0.3333))
printf("It is equal\n");
else
printf("It is not equal\n");
}
Have a look here for how to compare two floats.
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
Find the test with "ULP, he said nervously". You can compare the floats as ints but have a look at how it's been done in the article ie (I hope I'm not breaching any licensing here, if so please delete the code that follows it's not mine)
bool AlmostEqualUlps(float A, float B, int maxUlpsDiff){
Float_t uA(A);
Float_t uB(B);
// Different signs means they do not match.
if (uA.Negative() != uB.Negative()) {
// Check for equality to make sure +0==-0
if (A == B) {
return true;
}
return false;
}
// Find the difference in ULPs.
int ulpsDiff = abs(uA.i - uB.i);
if (ulpsDiff <= maxUlpsDiff) {
return true;
}
return false;
}
Please read the rest of that article, it's really good.

How to get and evaluate the Expressions from a string in C

How to get and evaluate the Expressions from a string in C
char *str = "2*8-5+6";
This should give the result as 17 after evaluation.
Try by yourself. you can Use stack data structure to evaluate this string here is reference to implement (its in c++)
stack data structre for string calcualtion
You have to do it yourself, C does not provide any way to do this. C is a very low level language. Simplest way to do it would be to find a library that does it, or if that does not exist use lex + yacc to create your own interpreter.
A quick google suggests the following:
http://www.gnu.org/software/libmatheval/
http://expreval.sourceforge.net/
You should try TinyExpr. It's a single C source code file (with no dependencies) that you can add to your project.
Using it to solve your problem is just:
#include <stdio.h>
#include "tinyexpr.h"
int main()
{
double result = te_interp("2*8-5+6", 0);
printf("Result: %f\n", result);
return 0;
}
That will print out: Result: 17
C does not have a standard eval() function.
There are lots of libraries and other tools out there that can do this.
But if you'd like to learn how to write an expression evaluator yourself, it can be surprisingly easy. It is not trivial: it is actually a pretty deeply theoretical problem, because you're basically writing a miniature parser, perhaps built on a miniature lexical analyzer, just like a real compiler.
One straightforward way of writing a parser involves a technique called recursive descent. Writing a recursive descent parser has a lot in common with another great technique for solving big or hard problems, namely by breaking the big, hard problem up into smaller and hopefully easier subproblems.
So let's see what we can come up with. We're going to write a function int eval(const char * expr) that takes a string containing an expression, and returns the int result of evaluating it. But first let's write a tiny main program to test it with. We'll read a line of text typed by the user using fgets, pass it to our expr() function, and print the result.
#include <stdio.h>
int eval(const char *expr);
int main()
{
char line[100];
while(1) {
printf("Expression? ");
if(fgets(line, sizeof line, stdin) == NULL) break;
printf(" -> %d\n", eval(line));
}
}
So now we start writing eval(). The first question is, how will we keep track of how how far we've read through the string as we parse it? A simple (although mildly cryptic) way of doing this will be to pass around a pointer to a pointer to the next character. That way any function can move forwards (or occasionally backwards) through the string. So our eval() function is going to do almost nothing, except take the address of the pointer to the string to be parsed, resulting in the char ** we just decided we need, and calling a function evalexpr() to do the work. (But don't worry, I'm not procrastinating; in just a second we'll start doing something interesting.)
int evalexpr(const char **);
int eval(const char *expr)
{
return evalexpr(&expr);
}
So now it's time to write evalexpr(), which is going to start doing some actual work. Its job is to do the first, top-level parse of the expression. It's going to look for a series of "terms" being added or subtracted. So it wants to get one or more subexpressions, with + or - operators between them. That is, it's going to handle expressions like
1 + 2
or
1 + 2 - 3
or
1 + 2 - 3 + 4
Or it can read a single expression like
1
Or any of the terms being added or subtracted can be a more-complicated subexpression, so it will also be able to (indirectly) handle things like
2*3 + 4*5 - 9/3
But the bottom line is that it wants to take an expression, then maybe a + or - followed by another subexpression, then maybe a + or - followed by another subexpression, and so on, as long as it keeps seeing a + or -. Here is the code. Since it's adding up the additive "terms" of the expression, it gets subexpressions by calling a function evalterm(). It also needs to look for the + and - operators, and it does this by calling a function gettok(). Sometimes it will see an operator other than + or -, but those are not its job to handle, so if it sees one of those it "ungets" it, and returns, because it's done. All of these functions pass the pointer-to-pointer p around, because as I said earlier, that's how all of these functions keep track of how they're moving through the string as they parse it.
int evalterm(const char **);
int gettok(const char **, int *);
void ungettok(int, const char **);
int evalexpr(const char **p)
{
int r = evalterm(p);
while(1) {
int op = gettok(p, NULL);
switch(op) {
case '+': r += evalterm(p); break;
case '-': r -= evalterm(p); break;
default: ungettok(op, p); return r;
}
}
}
This is some pretty dense code, Stare at it carefully and convince yourself that it's doing what I described. It calls evalterm() once, to get the first subexpression, and assigns the result to the local variable r. Then it enters a potentially infinite loop, because it can handle an arbitrary number of added or subtracted terms. Inside the loop, it gets the next operator in the expression, and uses it to decide what to do. (Don't worry about the second, NULL argument to gettok; we'll get to that in a minute.)
If it sees a +, it gets another subexpression (another term) and adds it to the running sum. Similarly, if it sees a -, it gets another term and subtracts it from the running sum. If it gets anything else, this means it's done, so it "ungets" the operator it doesn't want to deal with, and returns the running sum -- which is literally the value it has evaluated. (The return statement also breaks the "infinite" loop.)
At this point you're probably feeling a mixture of "Okay, this is starting to make sense" but also "Wait a minute, you're playing pretty fast and loose here, this is never going to work, is it?" But it is going to work, as we'll see.
The next function we need is the one that collects the "terms" or subexpressions to be added (and subtracted) together by evalexpr(). That function is evalterm(), and it ends up being very similar -- very similar -- to evalexpr. Its job is to collect a series of one or more subexpressions joined by * and/or /, and multiply and divide them. At this point, we're going to call those subexpressions "primaries". Here is the code:
int evalpri(const char **);
int evalterm(const char **p)
{
int r = evalpri(p);
while(1) {
int op = gettok(p, NULL);
switch(op) {
case '*': r *= evalpri(p); break;
case '/': r /= evalpri(p); break;
default: ungettok(op, p); return r;
}
}
}
There's actually nothing more to say here, because the structure of evalterm ends up being exactly like evalexpr, except that it does things with * and /, and it calls evalpri to get/evaluate its subexpressions.
So now let's look at evalpri. Its job is to evaluate the three lowest-level, but highest-precedence elements of an expression: actual numbers, and parenthesized subexpressions, and the unary - operator.
int evalpri(const char **p)
{
int v;
int op = gettok(p, &v);
switch(op) {
case '1': return v;
case '-': return -evalpri(p);
case '(':
v = evalexpr(p);
op = gettok(p, NULL);
if(op != ')') {
fprintf(stderr, "missing ')'\n");
ungettok(op, p);
}
return v;
}
}
The first thing to do is call the same gettok function we used in evalexpr and evalterm. But now it's time to say a little more about it. It is actually the lexical analyzer used by our little parser. A lexical analyzer returns primitive "tokens". Tokens are the basic syntactic elements of a programming language. Tokens can be single characters, like + or -, or they can also be multi-character entities. An integer constant like 123 is considered a single token. In C, other tokens are keywords like while, and identifiers like printf, and multi-character operators like <= and ++. (Our little expression evaluator doesn't have any of those, though.)
So gettok has to return two things. First it has to return a code for what kind of token it found. For single-character tokens like + and - we're going to say that the code is just the character. For numeric constants (that is, any numeric constant), we're going to say that gettok is going to return the character 1. But we're going to need some way of knowing what the value of the numeric constant was, and that, as you may have guessed, is what the second, pointer argument to the gettok function is for. When gettok returns 1 to indicate a numeric constant, and if the caller passes a pointer to an int value, gettok will fill in the integer value there. (We'll see the definition of the gettok function in a moment.)
At any rate, with that explanation of gettok out of the way, we can understand evalpri. It gets one token, passing the address of a local variable v in which the value of the token can be returned, if necessary. If the token is a 1 indicating an integer constant, we simply return the value of that integer constant. If the token is a -, this is a unary minus sign, so we get another subexpression, negate it, and return it. Finally, if the token is a (, we get another whole expression, and return its value, checking to make sure that there's another ) token after it. And, as you may notice, inside the parentheses we make a recursive call back to the top-level evalexpr function to get the subexpression, because obviously we want to allow any subexpression, even one containing lower-precedence operators like + and -, inside the parentheses.
And we're almost done. Next we can look at gettok. As I mentioned, gettok is the lexical analyzer, inspecting individual characters and constructing full tokens from them. We're now, finally, starting to see how the passed-around pointer-to-pointer p is used.
#include <stdlib.h>
#include <ctype.h>
void skipwhite(const char **);
int gettok(const char **p, int *vp)
{
skipwhite(p);
char c = **p;
if(isdigit(c)) {
char *p2;
int v = strtoul(*p, &p2, 0);
*p = p2;
if(vp) *vp = v;
return '1';
}
(*p)++;
return c;
}
Expressions can contain arbitrary whitespace, which is ignored, so we skip over that with an auxiliary function skipwhite. And now we look at the next character. p is a pointer to pointer to that character, so the character itself is **p. If it's a digit, we call strtoul to convert it. strtoul helpfully returns a pointer to the character following the number it scans, so we use that to update p. We fill in the passed pointer vp with the actual value strtoul computed for us, and we return the code 1 indicating an integer constant.
Otherwise -- if the next character isn't a digit -- it's an ordinary character, presumably an operator like + or - or punctuation like ( ), so we simply return the character, after incrementing *p to record the fact that we've consumed it. Properly "incrementing" p is mildly tricky: it's a pointer to a pointer, and we want to increment the pointed-to pointer. If we wrote p++ or *p++ it would increment the pointer p, so we need (*p)++ to say that it's the pointed-to pointer that we want to increment. (See also C FAQ 4.3.)
Two more utility functions, and then we're done. Here's skipwhite:
void skipwhite(const char **p)
{
while(isspace(**p))
(*p)++;
}
This simply skips over zero or more whitespace characters, as determined by the isspace function from <ctype.h>. (Again, we're taking care to remember that p is a pointer-to-pointer.)
Finally, we come to ungettok. It's a hallmark of a recursive descent parser (or, indeed, almost any parser) that it has to "look ahead" in the input, making a decision based on the next token. Sometimes, however, it decides that it's not ready to deal with the next token after all, so it wants to leave it there on the input for some other part of the parser to deal with later.
Stuffing input "back on the input stream", so to speak, can be tricky. This implementation of ungettok is simple, but it's decidedly imperfect:
void ungettok(int op, const char **p)
{
(*p)--;
}
It doesn't even look at the token it's been asked to put back; it just backs the pointer up by 1. This will work if (but only if) the token it's being asked to unget is in fact the most recent token that was gotten, and if it's not an integer constant token. In fact, for the program as written, and as long as the expression it's parsing is well-formed, this will always be the case. It would be possible to write a more complicated version of gettok that explicitly checked for these assumptions, and that would be able to back up over multi-character tokens (such as integer constants) if necessary, but this post has gotten much longer than I had intended, so I'm not going to worry about that for now.
But if you're still with me, we're done! And if you haven't already, I encourage you to copy all the code I've presented into your friendly neighborhood C compiler, and try it out. You'll find, for example, that 1 + 2 * 3 gives 7 (not 9), because the parser "knows" that * and / have higher precedence than + and -. Just like in a real compiler, you can override the default precedence using parentheses: (1 + 2) * 3 gives 9. Left-to-right associativity works, too: 1 - 2 - 3 is -4, not +2. It handles plenty of complicated, and perhaps surprising (but legal) cases, too: (((((5))))) evaluates to just 5, and ----4 evaluates to just 4 (it's parsed as "negative negative negative negative four", since our simplified parser doesn't have C's -- operator).
This parser does have a pretty big limitation, however: its error handling is terrible. It will handle legal expressions, but for illegal expressions, it will either do something bizarre, or just ignore the problem. For example, it simply ignores any trailing garbage it doesn't recognize or wasn't expecting -- the expressions 4 + 5 x, 4 + 5 %, and 4 + 5 ) all evaluate to 9.
Despite being somewhat of a "toy", this is also a very real parser, and as we've seen it can parse a lot of real expressions. You can learn a lot about how expressions are parsed (and about how compilers can be written) by studying this code. (One footnote: recursive descent is not the only way to write a parser, and in fact real compilers will usually use considerably more sophisticated techniques.)
You might even want to try extending this code, to handle other operators or other "primaries" (such as settable variables). Once upon a time, in fact, I started with something like this and extended it all the way into an actual C interpreter.

Why does strcmp() return 0 when its inputs are equal?

When I make a call to the C string compare function like this:
strcmp("time","time")
It returns 0, which implies that the strings are not equal.
Can anyone tell me why C implementations seem to do this? I would think it would return a non-zero value if equal. I am curious to the reasons I am seeing this behavior.
strcmp returns a lexical difference (or should i call it "short-circuit serial byte comparator" ? :-) ) of the two strings you have given as parameters.
0 means that both strings are equal
A positive value means that s1 would be after s2 in a dictionary.
A negative value means that s1 would be before s2 in a dictionary.
Hence your non-zero value when comparing "time" and "money" which are obviously different, even though one would say that time is money ! :-)
The nice thing about an implementation like this is you can say
if(strcmp(<stringA>, <stringB>) > 0) // Implies stringA > stringB
if(strcmp(<stringA>, <stringB>) == 0) // Implies stringA == stringB
if(strcmp(<stringA>, <stringB>) < 0) // Implies stringA < stringB
if(strcmp(<stringA>, <stringB>) >= 0) // Implies stringA >= stringB
if(strcmp(<stringA>, <stringB>) <= 0) // Implies stringA <= stringB
if(strcmp(<stringA>, <stringB>) != 0) // Implies stringA != stringB
Note how the comparison with 0 exactly matches the comparison in the implication.
It's common to functions to return zero for the common - or one-of-a-kind - case and non-zero for special cases. Take the main function, which conventionally returns zero on success and some nonzero value for failure. The precise non-zero value indicates what went wrong. For example: out of memory, no access rights or something else.
In your case, if the string is equal, then there is no reason why it is equal other than that the strings contain the same characters. But if they are non-equal then either the first can be smaller, or the second can be smaller. Having it return 1 for equal, 0 for smaller and 2 for greater would be somehow strange i think.
You can also think about it in terms of subtraction:
return = s1 - s2
If s1 is "lexicographically" less, then it will give is a negative value.
Another reason strcmp() returns the codes it does is so that it can be used directly in the standard library function qsort(), allowing you to sort an array of strings:
#include <string.h> // for strcmp()
#include <stdlib.h> // for qsort()
#include <stdio.h>
int sort_func(const void *a, const void *b)
{
const char **s1 = (const char **)a;
const char **s2 = (const char **)b;
return strcmp(*s1, *s2);
}
int main(int argc, char **argv)
{
int i;
printf("Pre-sort:\n");
for(i = 1; i < argc; i++)
printf("Argument %i is %s\n", i, argv[i]);
qsort((void *)(argv + 1), argc - 1, sizeof(char *), sort_func);
printf("Post-sort:\n");
for(i = 1; i < argc; i++)
printf("Argument %i is %s\n", i, argv[i]);
return 0;
}
This little sample program sorts its arguments ASCIIbetically (what some would call lexically). Lookie:
$ gcc -o sort sort.c
$ ./sort hi there little fella
Pre-sort:
Argument 1 is hi
Argument 2 is there
Argument 3 is little
Argument 4 is fella
Post-sort:
Argument 1 is fella
Argument 2 is hi
Argument 3 is little
Argument 4 is there
If strcmp() returned 1 (true) for equal strings and 0 (false) for inequal ones, it would be impossible to use it to obtain the degree or direction of inequality (i.e. how different, and which one is bigger) between the two strings, thus making it impossible to use it as a sorting function.
I don't know how familiar you are with C. The above code uses some of C's most confusing concepts - pointer arithmetic, pointer recasting, and function pointers - so if you don't understand some of that code, don't worry, you'll get there in time. Until then, you'll have plenty of fun questions to ask on StackOverflow. ;)
You seem to want strcmp to work like a (hypothetical)
int isEqual(const char *, const char *)
To be sure that would be true to the "zero is false" interpretation of integer results, but it would complicate the logic of sorting because, having established that the two strings were not the same, you would still need to learn which came "earlier".
Moreover, I suspect that a common implementation looks like
int strcmp(const char *s1, const char *s2){
const unsigned char *q1=s1, *q2=s2;
while ((*q1 == *q2) && *q1){
++q1; ++q2;
};
return (*q1 - *q2);
}
which is [edit: kinda] elegant in a K&R kind of way. The important point here (which is increasingly obscured by getting the code right (evidently I should have left well enough alone)) is the way the return statement:
return (*q1 - *q2);
which gives the results of the comparison naturally in terms of the character values.
There's three possible results: string 1 comes before string 2, string 1 comes after string 2, string 1 is the same as string 2. It is important to keep these three results separate; one use of strcmp() is to sort strings. The question is how you want to assign values to these three outcomes, and how to keep things more or less consistent. You might also look at the parameters for qsort() and bsearch(), which require compare functions much like strcmp().
If you wanted a string equality function, it would return nonzero for equal strings and zero for non-equal strings, to go along with C's rules on true and false. This means that there would be no way of distinguishing whether string 1 came before or after string 2. There are multiple true values for an int, or any other C data type you care to name, but only one false.
Therefore, having a useful strcmp() that returned true for string equality would require a lot of changes to the rest of the language, which simply aren't going to happen.
I guess it is simply for symmetry: -1 if less, 0 if equal, 1 if more.

Resources