I do not understand strcmp results - c

This is my implementation of strcmp:
#include <stdio.h>
#include <string.h>
int ft_strcmp(const char *s1, const char *s2)
{
while (*s1 == *s2)
{
if (*s1 == '\0')
return (0);
s1++;
s2++;
}
return (*s1 - *s2);
}
int main()
{
char s1[100] = "bon";
char s2[100] = "BONN";
char str1[100] = "bon";
char str2[100] = "n";
printf("%d\n", ft_strcmp(s1, s2));
printf("%d\n", ft_strcmp(str1, str2));
return (0);
}
It is from the Kernighan and Ritchie book, but I use a while loop instead of the for loop. I have tested it many times, and my strcmp gives the same results as the original strcmp.
But I do not understand the results. I read the man page:
"The strcmp() and strncmp() functions lexicographically compare the null-terminated strings s1 and s2."
What does "lexicographically" mean?
"return an integer greater than, equal to, or less than 0, according as the string s1 is greater than, equal to, or less than the string s2."
I understand this part, but my question is how it comes up with results such as:
32
-12
s1 looks < s2 to me, so how and why do I get 32, and how is the calculation made?
str1 looks > str2 to me, so how and why do I get -12, and how is the calculation made?
I have compiled it with the real strcmp and I get the same results.
Last question: why do I need to compare *s1 to '\0'? Won't it work fine without it?
Thank you for your answers, I am confused.

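1) K&R's strcmp compares the character codes of the first pair of characters that differ and returns their difference. In ASCII, 'b' - 'B' is 98 - 66 = 32 and 'b' - 'n' is 98 - 110 = -12; check an ASCII table and you'll understand.
2) If you don't check for '\0', how can you know where the string ends? That's the C string terminator. Without the check, two equal strings would make the loop run past the end of both arrays.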

Capital letters in terms of ASCII codes actually precede lowercase letters, as you can see in any ASCII table.
So in terms of lexicographic ordering, s1 is treated as being bigger than s2, because at the first position where the strings differ, s1 holds the character with the larger ASCII value.
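To see where numbers like 32 and -12 come from, here is a minimal sketch. The helper name first_difference is made up for illustration, and the concrete values mentioned below assume an ASCII system.

```c
/* Hypothetical helper: walk both strings until the first pair of
   characters that differ (or until s1 ends), then return the
   difference of that pair, compared as unsigned char. */
int first_difference(const char *s1, const char *s2)
{
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    return (unsigned char)*s1 - (unsigned char)*s2;
}
```

With ASCII, first_difference("bon", "BONN") stops at the very first pair and returns 'b' - 'B', which is 32, and first_difference("bon", "n") returns 'b' - 'n', which is -12. The length of the strings never enters the calculation unless one string is a prefix of the other.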

So we compare *s1 to '\0' to see where the string ends,
and the result is computed from the decimal values of the first pair of characters that differ between the two strings.

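int ft_strcmp(const char *s1, const char *s2)
{
    int x;

    x = 0;
    /* advance while both strings continue and the characters match */
    while (s1[x] != '\0' && s2[x] != '\0' && s1[x] == s2[x])
        x++;
    /* difference of the first pair that differs, or 0 if both ended */
    return (s1[x] - s2[x]);
}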
by mokgohloa ally

Related

NUL character and static character arrays/string literals in C

I understand that strings are terminated by a NUL '\0' byte in C.
However, what I can't figure out is why a 0 in a string literal acts differently than a 0 in a char array created on the stack. When checking for NUL terminators in a literal, the zeros in the middle of the array are not treated as such.
For example:
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
int main()
{
/* here, one would expect strlen to evaluate to 2 */
char *confusion = "11001";
size_t len = strlen(confusion);
printf("length = %zu\n", len); /* why is this == 5, as opposed to 2? */
/* why is the entire segment printed here, instead of the first two bytes?*/
char *p = confusion;
while (*p != '\0')
putchar(*p++);
putchar('\n');
/* this evaluates to true ... OK */
if ((char)0 == '\0')
printf("is null\n");
/* and if we do this ... */
char s[6];
s[0] = 1;
s[1] = 1;
s[2] = 0;
s[3] = 0;
s[4] = 1;
s[5] = '\0';
len = strlen(s); /* len == 2, as expected. */
printf("length = %zu\n", len);
return 0;
}
output:
length = 5
11001
is null
length = 2
Why does this occur?
The variable 'confusion' is a pointer to char that points to a string literal.
So the memory looks something like
[11001\0]
So when you print the variable 'confusion', it prints everything up to the first NUL character, which is written as \0.
The zeroes in "11001" are not NUL; they are the digit character '0', since they appear inside double quotes.
However, in your char array assignments for the variable 's', you are assigning the integer value 0 to
a char variable. When you do that, the character whose code is 0, which is the NUL character, gets assigned to it. So the character array looks something like this in memory
[happyface, happyface, NUL, NUL, happyface, NUL]
Here "happyface" stands for the character with code 1, a control character (SOH in ASCII) that some consoles draw as a smiley face.
So when you print, it prints everything up to the first NUL, and thus
the strlen is 2.
The trick here is understanding what really gets assigned to a character variable when an integer value is assigned to it.
Try this code:
#include <stdio.h>
int
main(void)
{
char c = 0;
printf( "%c\n", c ); //Prints the character with code 0, which is NUL.
printf( "%d\n", c ); //Prints the decimal value.
return 0;
}
You can view an ASCII table (e.g. http://www.asciitable.com/) to check the exact values of the character '0' and of NUL.
'0' and 0 are not the same value. (The first one is 48, usually, although technically the precise value is implementation-defined and it is considered very bad style to write 48 to refer to the character '0'.)
If a '0' terminated a character string, you wouldn't be able to put zeros in strings, which would be a bit... limiting.
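As a small illustration of the difference, here is a sketch of a strlen-like counter. The name length_until_nul is an assumption made for this example; it is not a library function.

```c
#include <stddef.h>

/* Hypothetical helper: count characters until the terminating NUL,
   the same way strlen does. The digit character '0' does not stop
   the loop, only the value 0 does. */
size_t length_until_nul(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}
```

Calling it on the literal "11001" gives 5, because every element is a printable digit character. Calling it on an array filled with the integers 1, 1, 0, 0, 1, 0 gives 2, because the third element already has the value 0.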

How to use the character array to identify a string

For my class we use char arrays for strings. If I was to use an if else statement, would something like this work if I had it modified to do so?
I know an array like this would make every character broken down to simple letters. And to use an if else statement I have to go like array[1] == 'H' and so on.
Is there a way to modify the code below to print the information I want if I type "Alas"? Right now, it only goes to the else part.
int main()
{
char s[10];
printf("Yo, this is a string: ");
gets_s(s);
if (s == "Alas")
{
printf("B ");
}
else
{
printf("A");
}
system("pause");
}
Use the strcmp (or strncmp) standard library function to compare two strings. Include the <string.h> header.
int strncmp(const char *s1, const char *s2, size_t n)
RETURN VALUE:
Upon successful completion, strncmp() shall return an integer greater than, equal to, or less than 0, if the possibly null-terminated array pointed to by s1 is greater than, equal to, or less than the possibly null-terminated array pointed to by s2 respectively.
Now in your code, s is an array that decays to a pointer to its first element, and "Alas" is also treated as a pointer, to a different memory area. So s == "Alas" compares two addresses, and that is the reason why they are always different. Use
if (!strcmp(s, "Alas"))
Something like:
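#include <stdio.h>
#include <stdlib.h>   /* system */
#include <string.h>   /* strcmp */

int main(void)
{
    char s[10];

    printf("Yo, this is a string: ");
    gets_s(s, sizeof s);   /* in C, gets_s takes the buffer size as a second argument */
    if (strcmp(s, "Alas") == 0)   /* 0 means the contents are equal */
    {
        printf("B ");
    }
    else
    {
        printf("A");
    }
    system("pause");
    return 0;
}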
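To make the address-versus-contents distinction concrete, here is a minimal sketch. Both helper names, same_address and same_text, are invented for this illustration.

```c
#include <string.h>

/* Hypothetical helper: true only if both pointers refer to the
   very same location in memory. This is what == does on pointers. */
int same_address(const char *a, const char *b)
{
    return a == b;
}

/* Hypothetical helper: true if the two strings hold the same
   characters, wherever they are stored. */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Given two separate arrays, char a[] = "Alas"; and char b[] = "Alas";, same_address(a, b) is 0 while same_text(a, b) is 1, which is why the original if always took the else branch.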
If the only thing you want to know is whether two strings are identical or not, you can define a function yourself to check each character of two strings, returning a 0 as soon as you encounter a difference, returning a 1 only if it encounters the terminating zero on both at the same time:
int SameStrings( const char * s1, const char * s2 ) {
    int i;  // declared outside the loop so it is still in scope after it
    for ( i = 0; s1[i] && s2[i]; i++ )
        if ( s1[i] != s2[i] )
            return 0;
    // if the programme advanced this far
    // at least one of the two characters is 0
    return s1[i] == s2[i];
    // if they are equal, then both must be 0, in which case it will return 1
    // else it will return a 0
}
You can add one more argument to that function, an integer that will limit the maximum number of characters to be checked, in case, for example, you want SameStrings( "lalaqwe", "lalaasd", 4 ) to return true.
This is good if you don't want to include a library for a function that does much more than you need...
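The variant with a length limit could look roughly like the sketch below. The name same_strings_n and the choice of size_t for the limit are assumptions for this example, not part of the answer above.

```c
#include <stddef.h>

/* Hypothetical variant: compare at most n characters.
   Returns 1 if the first n characters match (or both strings end
   together before n is reached), 0 otherwise. */
int same_strings_n(const char *s1, const char *s2, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (s1[i] != s2[i])
            return 0;       /* found a difference within the limit */
        if (s1[i] == '\0')
            return 1;       /* both strings ended at the same place */
    }
    return 1;               /* limit reached without a difference */
}
```

With this sketch, same_strings_n("lalaqwe", "lalaasd", 4) returns 1, while a limit of 5 returns 0 because the fifth characters differ.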

Representation of C string at memory and comparison

I have such code:
char str1[100] = "Hel0lo";
char *p;
for (p = str1; *p != 0; p++) {
cout << *p << endl;
/* skip along till the end */
}
and there are some parts not clear for me.
I understand that a null-terminated string in memory ends with a byte whose bits are all 0. That's why we decide we have found the end of the string when *p != 0 fails. If I wanted to search for the zero character instead, I should compare with 48, which is the decimal representation of '0' in ASCII.
But why do we use hex numbers when accessing memory and decimal numbers for comparison?
Is it possible to compare against "\0" as the end of the string? Something like this (not working):
for (p = str1; *p != "\0"; p++) {
And as I understand "\48" is equal to 0?
Your loop includes the exit test
*p != "\0"
This takes the value of the char that p points to, promotes it to int, then compares this against the address of the string literal "\0". I think you meant to compare against the nul character '\0' instead
*p != '\0'
Regarding comparison against hex, decimal or octal values: there are no set rules. You can use them interchangeably, but should try to use whatever makes your code easiest to read. People often use hex along with bitwise operations or when handling binary data. Remember, however, that in ASCII '0' is identical to 48, 0x30 and '\060', but different from '\0'.
Yes you can compare end of string like:
for (p = str1; *p != '\0'; p++) {
// your code
}
The value of the '\0' char is 0 (zero),
so you could just do
for (p = str1; *p ; p++) {
// your code
}
As @Whozraig also commented: at the end of the string *p is '\0', whose value is 0, and 0 counts as false, so the loop stops.
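To show both comparisons side by side, here is a short sketch that scans up to the terminating '\0' while looking for the digit character '0'. The function name count_zero_digits is invented for this example.

```c
#include <stddef.h>

/* Hypothetical helper: count how many '0' digit characters a string
   contains. The loop ends at '\0' (value 0), while the test inside
   looks for '0' (value 48 in ASCII). */
size_t count_zero_digits(const char *s)
{
    size_t count = 0;
    const char *p;

    for (p = s; *p != '\0'; p++) {
        if (*p == '0')
            count++;
    }
    return count;
}
```

For "Hel0lo" this returns 1: the digit in the middle is counted but does not end the loop, because only the terminator has the value 0.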

strcmp() return values in C [duplicate]

I am learning about strcmp() in C. I understand that when two strings are equal, strcmp returns 0.
However, when the man pages state that strcmp returns less than 0 when the first string is less than the second string, is it referring to length, ASCII values, or something else?
In this sense, "less than" for strings means lexicographic (alphabetical) order.
So cat is less than dog because cat is alphabetically before dog.
Lexicographic order is, in some sense, an extension of alphabetical order to all ASCII (and UNICODE) characters.
A value greater than zero indicates that the first character that does not match has a greater value in the first string than in the second, and a value less than zero indicates the opposite.
C99 7.21.4:
The sign of a nonzero value returned by the comparison functions
memcmp, strcmp, and strncmp is determined by the sign of
the difference between the values of the first pair of characters (both
interpreted as unsigned char) that differ in the objects being
compared.
Note in particular that the result doesn't depend on the current locale; LC_COLLATE (see C99 7.11) affects strcoll() and strxfrm(), but not strcmp().
int strcmp (const char * s1, const char * s2)
{
for(; *s1 == *s2; ++s1, ++s2)
if(*s1 == 0)
return 0;
return *(unsigned char *)s1 < *(unsigned char *)s2 ? -1 : 1;
}
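Since only the sign of the result is specified, code that needs a fixed value can normalise it. The following is a minimal sketch; compare_sign is a hypothetical name, not a standard function.

```c
#include <string.h>

/* Hypothetical helper: reduce whatever strcmp returns to exactly
   -1, 0 or 1, because the standard only fixes the sign. */
int compare_sign(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);

    return (r > 0) - (r < 0);
}
```

This gives -1 for ("cat", "dog"), 1 for ("aab", "aaa") and 0 for equal strings, whether the underlying strcmp returns the character difference or just -1 and 1.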
Look at the following program; it prints the value strcmp returns for the strings you type. strcmp does not weigh the string as a whole: the sign of its result comes from the first pair of characters that differ.
For example, str1 = "aab" and str2 = "aaa" will return a positive value (1 with many implementations), as aab > aaa.
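#include <stdio.h>
#include <string.h>

int main(void)
{
    char str1[15], str2[15];
    int n;

    /* fgets replaces gets(), which is unsafe and was removed in C11 */
    printf("Enter the str1 string: ");
    if (fgets(str1, sizeof str1, stdin) == NULL)
        return 1;
    printf("Enter the str2 string : ");
    if (fgets(str2, sizeof str2, stdin) == NULL)
        return 1;
    str1[strcspn(str1, "\n")] = '\0';   /* drop the newline fgets keeps */
    str2[strcspn(str2, "\n")] = '\0';
    n = strcmp(str1, str2);
    printf("Value returned = %d\n", n);
    return 0;
}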

comparing strings (from other indices rather than 0)

How can one compare a string from the middle (or some other point, but not the start) to another string?
For example, I have a string
str1[]="I am genius";
Now if I want to find a word in it, how should I compare it with the word? For example, the word is am.
Here is what I did. It's a bit stupid but works perfectly :D
#include<stdio.h>
#include<string.h>
void print( char string[]);
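int main(void)
{
char string1[20];
printf("Enter a string:");
/* fgets replaces gets(), which is unsafe and was removed in C11 */
if (fgets(string1, sizeof string1, stdin) == NULL)
return 1;
string1[strcspn(string1, "\n")] = '\0'; /* drop the newline fgets keeps */
print(string1);
return 0;
}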
void print(char string[])
{
int i,word=1,sum=0,x;
for(i=0; ;i++)
{
sum++;
if(string[i]==' ')
{
printf("Word#%d:%d\n",word,sum-1);
sum=0;
word++;
}/* if ends */
if(string[i]=='\0')
{ // for the program to work correctly, this code has to be pasted here as well
printf("Word#%d:%d\n",word,sum-1);
sum=0;
word++;
break;
}
}/* for ends*/
}
Use strncmp():
strncmp( whereToFind + offsetToStartAt, patternToFind, patternLength );
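Spelled out as a function, that call could be wrapped as in the sketch below. The name matches_at and the bounds check are assumptions added for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does pattern occur in text starting exactly
   at the given offset? */
int matches_at(const char *text, size_t offset, const char *pattern)
{
    if (offset > strlen(text))
        return 0;   /* offset lies beyond the end of text */
    return strncmp(text + offset, pattern, strlen(pattern)) == 0;
}
```

For the string from the question, matches_at("I am genius", 2, "am") returns 1 and matches_at("I am genius", 0, "am") returns 0. The bounds check matters because text + offset must not point past the terminating '\0'.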
If you wish to find a substring in a string, use the function strstr():
char *p = strstr(str1, "am");
if (p != NULL)
{
// p now points to start of substring
printf("found substring\n");
}
else
{
printf("substring not found\n");
}
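If you need the position of the match rather than a pointer, subtracting the start of the string from the strstr result gives the index. A minimal sketch, with find_index as an invented name and -1 as an assumed "not found" value:

```c
#include <string.h>

/* Hypothetical helper: index of the first occurrence of word in
   text, or -1 if word does not occur. */
long find_index(const char *text, const char *word)
{
    const char *p = strstr(text, word);

    return p == NULL ? -1L : (long)(p - text);
}
```

With the string from the question, find_index("I am genius", "am") returns 2.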
If you want to compare the remainder of string s1 starting at index i1 to the remainder of string s2 starting at i2, it's very easy:
result = strcmp(s1+i1, s2+i2);
If you want to see if the substring of s1 beginning at i1 matches the string s2, try:
result = strcmp(s1+i1, s2);
or:
result = strncmp(s1+i1, s2, strlen(s2));
depending on whether you want the whole remainder of s1 to match or just the portion equal in length to s2 to match (i.e. whether s1 contains s2 as a substring beginning at position i1).
If you want to search for a substring, use strstr.
Since this is homework I am assuming you can't use standard functions, so I can think of two solutions:
1) Split all of the words into a linked list, then just compare each string until you find your word.
2) Just use a for loop, start at the beginning, and you can use [] to help jump through the string, so instr[3] would be the fourth character, as the index is zero-based. Then you just see if you are at your word yet.
There are optimizations you can do with (2), but I am not trying to do your homework for you. :)
One option would be to use
size_t strspn(const char *s1, const char *s2) /* from #include <string.h> */
It returns the length of the longest initial segment of s1 that consists only of the characters found in s2.
If it returns zero, then s1 does not start with any of those characters. Note that it matches a set of characters, not a substring.
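A short sketch may make the behaviour of strspn clearer. The wrapper name leading_digits is made up for this example; strspn itself is the standard function from <string.h>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many characters at the start of s are
   decimal digits? strspn stops at the first character that is not
   in the accepted set. */
size_t leading_digits(const char *s)
{
    return strspn(s, "0123456789");
}
```

leading_digits("2024-01") returns 4 and leading_digits("abc") returns 0. Note also that strspn("I am genius", "am") is 0 even though "am" occurs inside the string, because the string does not begin with 'a' or 'm'; for finding a word, strstr is the better fit.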
You can parse the string into words and store them in new char arrays/pointers.
Or
Suppose the string you want to find is "am", pointed to by char *str2.
1) Walk through str1 with an index until you find a char matching index 0 of str2.
2) Once you find a match, increment both pointers until you reach the end of str2, to compare the entire string.
3) If there is no match, then continue to look in str1 for the char at index 0 of str2, from the place where you entered step 2.
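The manual search described above can be sketched as follows, without any library call. The name find_substring and the decision to return a pointer (or NULL) are assumptions for this example; it is a plain, unoptimised version.

```c
#include <stddef.h>

/* Hypothetical helper: return a pointer to the first occurrence of
   str2 inside str1, or NULL if there is none. */
const char *find_substring(const char *str1, const char *str2)
{
    const char *start;

    for (start = str1; ; start++) {
        const char *a = start;
        const char *b = str2;

        /* advance both pointers while the characters keep matching */
        while (*b != '\0' && *a == *b) {
            a++;
            b++;
        }
        if (*b == '\0')
            return start;   /* reached the end of str2: full match */
        if (*a == '\0')
            return NULL;    /* ran out of str1 before str2 ended */
    }
}
```

For "I am genius" and "am" it returns a pointer to index 2. Restarting from start + 1 after a failed attempt is what keeps overlapping cases correct, for example finding "aab" in "aaab".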
Alternatively
You could use a two-dimensional array.
char str[3][10] = { "i","am","2-darray"};
Here str[1] will contain "am". That's assuming you want to get the individual words of a string.
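One way to fill such an array from a sentence is sketched below. The names split_words and WORD_SIZE are invented for this example, and splitting on single spaces only is a simplifying assumption.

```c
#include <stddef.h>

#define WORD_SIZE 10   /* assumed row size, matching str[3][10] above */

/* Hypothetical helper: copy the space-separated words of s into the
   rows of words, at most max_words of them. Returns the number of
   words stored. Words longer than WORD_SIZE - 1 are truncated. */
int split_words(const char *s, char words[][WORD_SIZE], int max_words)
{
    int count = 0;

    while (*s != '\0' && count < max_words) {
        size_t len = 0;

        while (*s == ' ')               /* skip spaces between words */
            s++;
        if (*s == '\0')
            break;
        while (*s != '\0' && *s != ' ') {
            if (len < WORD_SIZE - 1)    /* leave room for the '\0' */
                words[count][len++] = *s;
            s++;
        }
        words[count][len] = '\0';
        count++;
    }
    return count;
}
```

After char w[3][WORD_SIZE]; and split_words("i am 2-darray", w, 3), row w[1] holds "am", and each row can then be compared with strcmp.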
Edit: Removed the point diverting from OP

Resources