Creating my own strcmp() function in C

I was assigned by my teacher to write my own strcmp() function in C. I did create my own version of said function, and I was hoping to get some feedback.
int CompareTwoStrings ( char *StringOne, char *StringTwo ) {
// Evaluates if both strings have the same length.
if ( strlen ( StringOne ) != strlen ( StringTwo ) ) {
// Given that the strings have an unequal length, it compares between both
// lengths.
if ( strlen ( StringOne ) < strlen ( StringTwo ) ) {
return ( StringOneIsLesser );
}
if ( strlen ( StringOne ) > strlen ( StringTwo ) ) {
return ( StringOneIsGreater );
}
}
int i;
// Since both strings are equal in length...
for ( i = 0; i < strlen ( StringOne ); i++ ) {
// It goes comparing letter per letter.
if ( StringOne [ i ] != StringTwo [ i ] ) {
if ( StringOne [ i ] < StringTwo [ i ] ) {
return ( StringOneIsLesser );
}
if ( StringOne [ i ] > StringTwo [ i ] ) {
return ( StringOneIsGreater );
}
}
}
// If it ever reaches this part, it means they are equal.
return ( StringsAreEqual );
}
StringOneIsLesser, StringOneIsGreater, StringsAreEqual are defined as const int with the respective values: -1, +1, 0.
Thing is, I'm not exactly sure if, for example, my StringOne has a lesser length than my StringTwo, that automatically means StringTwo is greater, because I don't know how strcmp() is particularly implemented. I need some of your feedback for that.

So much for such a simple task. I believe something simple as this would do:
int my_strcmp(const char *a, const char *b)
{
while (*a && *a == *b) { ++a; ++b; }
return (int)(unsigned char)(*a) - (int)(unsigned char)(*b);
}

strcmp compares alphabetically: so "aaa" < "b" even though "b" is shorter.
Because of this, you can skip the length check and just do the letter-by-letter comparison. If you reach the null character ('\0') of one string while both strings are equal so far, then the shorter one is the lesser one.
Also: make StringsAreEqual == 0, not 1 for compatibility with standard sorting functions.
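A minimal sketch of that approach, keeping the question's three return values. The name compare_two_strings and the enum constants are illustrative stand-ins for the asker's identifiers, not anything from a library:

```c
/* Illustrative constants mirroring the question's -1 / 0 / +1 values. */
enum {
    STRING_ONE_IS_LESSER = -1,
    STRINGS_ARE_EQUAL = 0,
    STRING_ONE_IS_GREATER = 1
};

/* Compare character by character. No length check is needed: when one
   string ends first, its '\0' compares lower than any other character. */
int compare_two_strings(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    /* Skip the common prefix. */
    while (*p != '\0' && *p == *q) {
        p++;
        q++;
    }
    if (*p < *q)
        return STRING_ONE_IS_LESSER;
    if (*p > *q)
        return STRING_ONE_IS_GREATER;
    return STRINGS_ARE_EQUAL;
}
```

With this, compare_two_strings("aaa", "b") reports the first string as lesser even though it is longer, which matches what strcmp does.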

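int mystrncmp(const char * str1, const char * str2, unsigned int n)
{
    /* Compare at most n characters, stopping early at a terminator. */
    while (n > 0 && *str1 == *str2) {
        if (*str1 == '\0')
            return 0;
        str1++;
        str2++;
        n--;
    }
    if (n == 0)
        return 0;
    /* Order the first differing characters as unsigned char. */
    return ((unsigned char)*str1 < (unsigned char)*str2) ? -1 : 1;
}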

strcmp() is fairly easy to code. The usual coding mistakes include:
Parameter type
strcmp(s1,s2) uses const char * types, not char *. This allows the function to be called with pointers to const data. It conveys to the user the function's non-altering of data. It can help with optimization.
Sign-less compare
All str...() functions perform as if char was unsigned char, even if char is signed. This readily affects the result when strings differ and a character outside the range [1...CHAR_MAX] is found.
Range
On select implementations, the range of unsigned char minus unsigned char is outside the int range. Using two compares, (a>b) - (a<b), rather than a-b avoids any problem. Further, many compilers recognize that idiom and emit good code.
int my_strcmp(const char *s1, const char *s2) {
// All compares done as if `char` was `unsigned char`
const unsigned char *us1 = (const unsigned char *) s1;
const unsigned char *us2 = (const unsigned char *) s2;
// As long as the data is the same and '\0' not found, iterate
while (*us1 == *us2 && *us1 != '\0') {
us1++;
us2++;
}
// Use compares to avoid any mathematical overflow
// (possible when `unsigned char` and `unsigned` have the same range).
return (*us1 > *us2) - (*us1 < *us2);
}
Dinosaur computers
On machines that use a signed char and a non-2's complement representation, the following can be wrong or trap at *s1 != '\0'. Such machines could have a negative 0, which does not indicate the end of a string yet quits the loop. Using unsigned char * pointers solves that.
int my_strcmp(const char *s1, const char *s2) {
while (*s1 == *s2 && *s1 != '\0') { // Error!
s1++;
s2++;
}
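To make the sign-less compare point concrete, here is a small sketch with two hypothetical helpers (cmp_plain_char and cmp_unsigned_char are names invented for this illustration). The first compares plain char values, the second compares through unsigned char as the standard string functions are specified to do:

```c
/* Naive version: compares plain char values, so the result for bytes
   above 127 depends on whether char is signed on the platform. */
int cmp_plain_char(const char *s1, const char *s2)
{
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    return (*s1 > *s2) - (*s1 < *s2);
}

/* strcmp-like version: every comparison goes through unsigned char. */
int cmp_unsigned_char(const char *s1, const char *s2)
{
    const unsigned char *u1 = (const unsigned char *)s1;
    const unsigned char *u2 = (const unsigned char *)s2;

    while (*u1 != '\0' && *u1 == *u2) {
        u1++;
        u2++;
    }
    return (*u1 > *u2) - (*u1 < *u2);
}
```

On a platform where plain char is signed, cmp_plain_char("\x80", "a") comes out negative, while cmp_unsigned_char("\x80", "a") is positive on any platform. Where char is unsigned, both agree.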

Try this as well for a better understanding:
#include <stdio.h>
#include <string.h>
int main(void)
{
    char string1[20], string2[20];
    size_t i = 0;
    int diff;
    puts("enter the string one to compare");
    if (fgets(string1, sizeof(string1), stdin) == NULL)
        return 1;
    /* strip the trailing newline, if any */
    string1[strcspn(string1, "\n")] = '\0';
    puts("enter the string two to compare");
    if (fgets(string2, sizeof(string2), stdin) == NULL)
        return 1;
    string2[strcspn(string2, "\n")] = '\0';
    /* skip the common prefix; stop at the first difference or at the end */
    while (string1[i] != '\0' && string1[i] == string2[i])
        i++;
    /* compare the first differing characters as unsigned char, like strcmp() */
    diff = (unsigned char)string1[i] - (unsigned char)string2[i];
    if (diff == 0)
        printf("strings are equal");
    else if (diff < 0)
        printf("string1 is less than string2");
    else
        printf("string2 is less than string1");
    return 0;
}

bool str_cmp(char* str1,char* str2)
{
if (str1 == nullptr || str2 == nullptr)
return false;
const int size1 = str_len_v(str1);
const int size2 = str_len_v(str2);
if (size1 != size2)
return false;
for(int i=0;str1[i] !='\0' && str2[i] !='\0';i++)
{
if (str1[i] != str2[i])
return false;
}
return true;
}

Related

How do I return a char from a char pointer function in C?

I recently made a function that finds the smallest character in a string. I am not sure how to return the smallest character in a char pointer function.
#include <stdio.h>
#include <string.h>
char * smallest(char s[])
{
char small = 'z';
int i = 0;
while (s[i] != '\0')
{
if (s[i] < small)
{
small = s[i];
}
i++;
}
return small;
}
int main(void)
{
char s[4] = "dog";
printf("%c",smallest(s));
}
The variable small has the type char according to its declaration
char small = 'z';
//...
return small;
and this variable is returned from the function while the function return type is the pointer type char *.
char * smallest(char s[])
Also, if the user passes an empty string to the function, you will return the character 'z' as the smallest character even though this character is absent from the empty string.
I think in this case you should return a pointer to the terminating zero character '\0'.
The function can be defined the following way
char * smallest( char s[] )
{
char *small = s;
if ( *s )
{
while ( *++s )
{
if ( *s < *small ) small = s;
}
}
return small;
}
Or, since there is no function overloading in C, the function should take a const-qualified parameter and be declared and defined like this
char * smallest( const char s[] )
{
const char *small = s;
if ( *s )
{
while ( *++s )
{
if ( *s < *small ) small = s;
}
}
return ( char * )small;
}
Note that this assert
assert(smallest(s[4] == 'd'));
is incorrect. It seems you mean
assert( *smallest( s ) == 'd');
Or, after you have updated your program, you need to write
printf("%c\n",*smallest(s));
instead of
printf("%c",smallest(s));
Using this function you can not only find the smallest character but also determine its position in the source string.
For example
char *small = smallest( s );
printf( "The smallest character is '%c' at the position %tu\n",
*small, small - s );
or
char *small = smallest( s );
if ( *small == '\0' )
{
puts( "The source string is empty" );
}
else
{
printf( "The smallest character is '%c' at the position %tu\n",
*small, small - s );
}
There are two problems with your program.
1. Wrong parameters
The function smallest(char s[]) expects to be given a character array, but what you are passing in as an argument is s[4] == 'd', which is not a character array.
This has nothing to do with the assert() itself.
What you want to do is assert(smallest(s) == 'd').
2. Wrong return type
Your function declares that it returns char * (a pointer to a char), but you are trying to return a char. So you should adjust the return type of your function to be char.
The correct program:
#include <stdio.h>
#include <assert.h>
#include <string.h>
char smallest(char s[]) {
char small = 'z';
int i = 0;
while(s[i] != '\0') {
if (s[i] < small) {
small = s[i];
}
i++;
}
return small;
}
int main(void) {
char s[4] = "dog";
assert(smallest(s) == 'd');
printf("Passed\n");
}
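One caveat with starting from 'z': any string whose characters all sort after 'z' (for example "{|}" in ASCII), and the empty string, would still yield 'z'. A hedged variant that seeds the minimum with the first character instead; smallest_char is a name made up for this sketch:

```c
/* Hypothetical variant of smallest() that returns a char and seeds the
   minimum with the first character rather than with 'z'. */
char smallest_char(const char *s)
{
    char small = *s; /* '\0' when the string is empty */

    while (*s != '\0') {
        if (*s < small)
            small = *s;
        s++;
    }
    return small;
}
```

For an empty string this returns '\0', which the caller can treat as "no character found".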

Which string is the longest

My code:
What I'm trying to do is to input two strings, then return the longest one. If they're the same length then return NULL. Now, the code is just outputting gibberish and I cannot find out why. The function returns a pointer to the first character of the largest string. Then it goes through the while loop, and I'm trying to dereference the pointer and print out its value.
Note: I'm revising for an exam and we have to use only pointers and not treat strings as arrays.
#include<stdio.h>
char* string_ln(char*, char*);
int main() {
char str1[20];
char str2[20];
char* length;
scanf("%s%s", str1, str2);
length = string_ln(str1, str2);
while (length != '\0') {
printf("%c", *length);
length++;
}
}
char* string_ln(char*p1, char*p2) {
int count1 = 0;
while (*p1 != '\0') {
count1++;
p1++;
}
int count2 = 0;
while (*p2 != '\0') {
count2++;
p2++;
}
if (count1 > count2) {
return p1;
}
else if (count2 > count1) {
return p2;
}
else {
return NULL;
}
}
In writing string_ln you iterate over both strings completely to find their lengths, and then compare those numbers. This can work, but you don't actually need to do this. You only need to know which is longer. It doesn't matter how much longer the longer string is.
char *string_ln(char *str1, char *str2) {
char *iter1, *iter2;
for (iter1 = str1, iter2 = str2;
*iter1 && *iter2;
iter1++, iter2++);
if (!(*iter1 || *iter2)) {
return NULL;
}
else if (*iter1) {
return str1;
}
else {
return str2;
}
}
We simply need to iterate over both strings until at least one hits a null character. Once we get to that point, we can test which iterator points at a null character. If both do, the strings are the same length. If the first iterator does not, then the first string is longer. Otherwise, the second string is longer.
The benefit to this approach is that we avoid unnecessary work, and make it much quicker to compare strings of very different lengths.
There are a few problems here. First, you're modifying p1 and p2 in the function, so you won't actually return a pointer to the beginning of the largest string, but to its end. One way to avoid this is to iterate over copies of p1 and p2:
char* string_ln(char*p1, char*p2)
{
char* tmp1 = p1;
int count1 = 0;
while (*tmp1 != '\0') {
count1++;
tmp1++;
}
char* tmp2 = p2;
int count2 = 0;
while (*tmp2 != '\0') {
count2++;
tmp2++;
}
if(count1>count2){
return p1;
}
else if(count2>count1){
return p2;
}
else{
return NULL;
}
}
Second, in your main, you're using the %c format specifier, which prints a single char, not a whole string. Since you have a string anyway, you can print it in one call with the %s specifier. Also, note that you should explicitly check for NULLs:
int main() {
    char str1[20];
    char str2[20];
    char* longest;
    /* limit each input to 19 characters so the buffers cannot overflow */
    scanf("%19s%19s", str1, str2);
    longest = string_ln(str1, str2);
    if (longest) {
        /* never pass user input as the format string itself */
        printf("%s", longest);
    } else {
        printf("They are the same length");
    }
}
I think you forgot to dereference the pointer. Instead of
while(length!='\0')
you'd need
while(*length!='\0')
That said, in the called function, you're returning pointers after the increment, i.e., the returned pointers do not point to the start of the string anymore. You need to ensure that you return pointers which point to the beginning of the string. You can change your code to
int count1 = 0;
while (p1[count1] != '\0') {
count1++;
}
int count2 = 0;
while (p2[count2] != '\0') {
count2++;
}
so that p1 and p2 do not change.
For starters the function should be declared like
char * string_ln( const char *, const char * );
because the passed strings are not being changed within the function.
You are returning from the function the already modified pointer p1 or p2 that is being changed in one of the while loops
while (*p1 != '\0') {
count1++;
p1++;
}
while (*p2 != '\0') {
count2++;
p2++;
}
So the returned pointer points to the terminating zero '\0' of a string.
Moreover in main before this while loop
length = string_ln(str1, str2);
while(length!='\0'){
printf("%c", *length);
length++;
}
you are not checking whether the pointer length is equal to NULL. As a result the program can invoke undefined behavior.
The function itself can be defined the following way using only pointers.
char * string_ln( const char *p1, const char *p2 )
{
const char *s1 = p1;
const char *s2 = p2;
while ( *s1 != '\0' && *s2 != '\0' )
{
++s1;
++s2;
}
if ( *s1 == *s2 )
{
return NULL;
}
else if ( *s1 == '\0' )
{
return ( char * )p2;
}
else
{
return ( char * )p1;
}
}
and in main you need to write
char *length = string_ln( str1, str2 );
if ( length != NULL )
{
while ( *length )
printf( "%c", *length++ );
}
Note that the return type of the function is char * instead of const char *. This is because there is no function overloading in C, and the returned pointer can point to a constant string or to a non-constant string. It is a general convention in C for declaring string functions.

How to find out if a word is inside of a line?

I am trying to write a function from an exercise in the book "Programming in C". The function should indicate whether a line contains some word and, if it does, return the position of the word's first character in the line.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
int substring (char a[], char b[]);
int main ()
{
char line1[15], line2[15];
printf("Print first one\n");
int i = 0;
char character;
do
{
character = getchar ();
line1[i] = character;
i++;
}
while (character != '\n');
line1[i-1] = '\0';
printf ("Print the second one\n");
scanf("%s", line2);
printf("%s, %s\n", line1, line2); // for checking lines
int index;
index = substring (line1, line2);
printf("The result is: %i\n", index);
}
int substring (char a[], char b[]) /* function to determine if a line contains a word; if yes then return its position, else return -3 */
{
int len1 = strlen(a), len2 = strlen(b);
int current1 = 0, current2 = 0;
bool found = false;
int result;
while( current1 < len1 )
{
if (a[current1] == b[current2])
{
if(!found)
{
result = current1+1;
}
else
found = true;
while ((a[current1] == b[current2]) && (a[current1] != '\0') && (b[current2] != '\0'))
{
current1++;
if(current2+1 == len2)
return result;
current2++;
}
current1 = result;
}
else
{
current2 = 0;
found = false;
current1++;
}
}
return -3;
}
The problem is somewhere in the second function (substring): when I search for "word" in the line "Here is your word", the function works properly, but when I search for "word" in the line "Here is your wwwwword", the function returns -3 (which indicates that something went wrong).
For starters there is the standard C string function strstr that allows to determine whether one string is contained in other string.
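A minimal sketch of the strstr route. The helper name find_word is hypothetical, and it reports a zero-based index, or -1 when the word is absent:

```c
#include <stddef.h>
#include <string.h>

/* Returns the zero-based index of the first occurrence of word in line,
   or -1 if word does not occur. find_word is an illustrative name. */
long find_word(const char *line, const char *word)
{
    const char *hit = strstr(line, word);

    return hit != NULL ? (long)(hit - line) : -1L;
}
```

Because strstr restarts the match at every position, the repeated 'w' characters in "Here is your wwwwword" do not confuse it.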
If you need to write such a function yourself then your function implementation looks too complicated.
For example the if-else statement
if (a[current1] == b[current2])
{
if(!found)
{
result = current1+1;
}
else
found = true;
// ...
does not make much sense, for at least two reasons. The first is that the variable result, which can be returned from the function in the while loop, does not contain the exact index of the found word. The second is that found can never be equal to true within the function. So the condition in the if statement
if( !found )
always evaluates to true.
Also you forgot to reset the variable current2 in the compound statement of the if statement after the while loop.
Apart from this the function parameters should have the qualifier const because passed strings are not being changed in the function. And the function return type shall be size_t.
The function can look for example the following way as it is shown in the demonstrative program below.
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
size_t substring( const char *s1, const char *s2 )
{
size_t n1 = strlen( s1 );
size_t n2 = strlen( s2 );
size_t i = 0;
bool found = false;
if (!( n1 < n2 ))
{
for ( size_t n = n1 - n2; !found && i <= n; i += !found )
{
size_t j = 0;
while (s2[j] != '\0' && s2[j] == s1[i + j]) ++j;
found = s2[j] == '\0';
}
}
return found ? i : -1;
}
int main( void )
{
const char *word = "word";
const char *s1 = "Here is your word";
const char *s2 = "Here is your wwwwword";
printf( "%zu\n", substring( s1, word ) );
printf( "%zu\n", substring( s2, word ) );
}
The program output is
13
17
If a word is not found in the source string then the function returns a value of the type size_t that is equal to -1.
Note that the variable character should be declared with the type int instead of char. Otherwise, in general, the comparison with EOF can produce an unexpected result if the type char behaves as the type unsigned char.

C - Determining alphabetical order of characters/strings

I'm trying to write a function that compares two strings (s1 and s2) and works out whether s1 comes before, after or is equal to the s2 string, alphabetically (in the same way as a dictionary is read). If s1 comes before s2 it should return -1. If it's equal to s2 it should return 0. If it comes after s2 it should return 1.
I'm having difficulty getting the function to work - I can only seem to get returns for the first chars in each string and only using the same case. Grateful for any help you can give.
Here's the code so far:
#include <stdio.h>
#include <stdlib.h>
int cmpstr(const char *, const char *);
int main()
{
printf("Test 1: %d\n", cmpstr( "Hello", "World"));
printf("Test 2: %d\n", cmpstr( "Hello", "Hello"));
printf("Test 3: %d\n", cmpstr( "World", "Hello"));
return 0;
}
int cmpstr(const char *s1, const char *s2)
{
/*compare corresponding string characters until null is reached*/
while(*s1 != '\0' && *s2 != '\0' )
{
if (*s1 < *s2)
{
return -1;
}
else if (*s1 > *s2)
{
return 1;
}
else
{
return 0;
s1++;
s2++;
}
}
return 0;
}
Just remove the last else part and put return 0 outside the loop. The strings are equal only if neither the if part nor the else-if part is ever true, so when control leaves the loop the function returns 0.
int cmpstr(const char *s1, const char *s2)
{
/*compare corresponding string characters until null is reached*/
while(*s1 != '\0' && *s2 != '\0' )
{
if (*s1 < *s2)
{
return -1;
}
else if (*s1 > *s2)
{
return 1;
}
s1++;
s2++;
}
return 0;
}
Your code has a very obvious mistake: the return 0 statement makes s1++; s2++ unreachable code (your compiler should have warned you about that).
But it also has a conceptual mistake, as it ignores situations where s1 is longer than s2 or vice versa. So in your approach (once the return 0 issue is corrected), "Hello" and "Hello there" would compare equal.
See the following code, which works in a different manner. It skips equal characters until one (or both) of the strings has ended. Then, according to this state, the result is determined:
int cmpstr(const char *s1, const char *s2)
{
while (*s1 && *s2 && *s1 == *s2) { // move forward until either one of the strings ends or the first difference is detected.
s1++;
s2++;
}
int result = (unsigned char)*s1 - (unsigned char)*s2; // compare as unsigned char, like strcmp
// if both strings are equal, s1 and s2 have reached their ends and result is 0
// if *s1 > *s2, s1 is lexicographically greater than s2 and result is positive
// if *s1 < *s2, s1 is lexicographically lower than s2 and result is negative
// normalize "positive" and "negative" to 1 and -1, respectively
if (result < 0)
result = -1;
else if (result > 0)
result = 1;
return result;
}
Removing 'return 0' in else statement will work. If the chars are equal in same level, you need to look next ones until the equality breaks.
Edit: Also, you need to think about when lengths of strings are not equal.
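int cmpstrMY(const char *s1, const char *s2)
{
int sc1, sc2;
/*compare corresponding string characters until null is reached*/
/* tolower() is declared in <ctype.h>; its argument must be an unsigned char value */
while (1)
{
sc1 = tolower((unsigned char)*s1);
sc2 = tolower((unsigned char)*s2);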
if (sc1 == '\0' && sc2 == '\0') {
break;
}
else if (sc1 == '\0' && sc2 != '\0') {
return -1;
}
else if (sc1 != '\0' && sc2 == '\0') {
return 1;
}
else if (sc1 < sc2)
{
return -1;
}
else if (sc1 > sc2)
{
return 1;
}
else
{
s1++;
s2++;
}
}
return 0;
}
Your cmpstr must be something like the code above.
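A more compact sketch of the same case-insensitive idea, assuming single-byte characters and the current C locale. The name cmpstr_nocase is made up for this example:

```c
#include <ctype.h>

/* Case-insensitive comparison returning -1, 0 or 1.
   cmpstr_nocase is an illustrative name, not a standard function. */
int cmpstr_nocase(const char *s1, const char *s2)
{
    for (;; s1++, s2++) {
        /* tolower() requires a value representable as unsigned char. */
        int c1 = tolower((unsigned char)*s1);
        int c2 = tolower((unsigned char)*s2);

        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (c1 == '\0')
            return 0; /* both strings ended at the same place */
    }
}
```

The single c1 != c2 test also covers strings of different lengths, because the terminating '\0' of the shorter string compares lower than any other character of the longer one.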

How to make strcmp function?

I want to make my own strcmp function, like the one in C.
int my_cmp(const char* str1, const char* str2)
{
int index;
for (index = 0; str1[index] != '\0' && str2[index] != '\0'; index++)
if (str1[index] != str2[index])
return (str1[index] - str2[index]);
return 0;
}
Am I right?
I know that not all the strings have the same length.
I'm not sure about condition of for statement.
Here is one of the official implementations.
int strcmp(const char *s1, const char *s2)
{
for ( ; *s1 == *s2; s1++, s2++)
if (*s1 == '\0')
return 0;
return ((*(unsigned char *)s1 < *(unsigned char *)s2) ? -1 : +1);
}
Update:
Problems of your code:
Your code works for strings of the same length, but when one string is a prefix of the other it wrongly returns 0.
For extended ASCII (range 128~255), you use signed char, so those values wrap around to negative values and you may get a wrong result.
Fixed version:
int my_cmp(const char* str1, const char* str2)
{
int index;
for (index = 0; str1[index] != '\0' && str2[index] != '\0'; index++)
if (str1[index] != str2[index])
return (((unsigned char)str1[index] < (unsigned char)str2[index]) ? -1 : +1);
// here is the fix code.
if (str1[index] != '\0') {
return 1;
} else if (str2[index] != '\0') {
return -1;
}
return 0;
}
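To see the prefix problem concretely, here is a sketch that puts the question's loop next to a version with the extra end-of-string check. Both function names are invented for this comparison:

```c
/* The question's loop, reproduced under a different name. */
int my_cmp_original(const char *str1, const char *str2)
{
    int index;

    for (index = 0; str1[index] != '\0' && str2[index] != '\0'; index++)
        if (str1[index] != str2[index])
            return str1[index] - str2[index];
    return 0; /* also reached when one string is a prefix of the other */
}

/* Same loop, plus a look at what is left after the common prefix. */
int my_cmp_checked(const char *str1, const char *str2)
{
    int index;

    for (index = 0; str1[index] != '\0' && str2[index] != '\0'; index++)
        if (str1[index] != str2[index])
            return (unsigned char)str1[index] < (unsigned char)str2[index] ? -1 : 1;
    if (str1[index] != '\0')
        return 1;  /* str1 is longer */
    if (str2[index] != '\0')
        return -1; /* str2 is longer */
    return 0;
}
```

Comparing "Hello" with "Hello there" gives 0 from the first function and -1 from the second.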
The following code snippet shows how you could implement a strcmp function:
int myStrCmp (const char *s1, const char *s2) {
const unsigned char *p1 = (const unsigned char *)s1;
const unsigned char *p2 = (const unsigned char *)s2;
while (*p1 != '\0') {
if (*p2 == '\0') return 1;
if (*p2 > *p1) return -1;
if (*p1 > *p2) return 1;
p1++;
p2++;
}
if (*p2 != '\0') return -1;
return 0;
}
Am I right? I know that not all the strings have the same length. I'm not sure about condition of for statement.
You are almost right. Your if statement
if (str1[index] != str2[index])
return (str1[index] - str2[index]);
is basically correct (though the characters should be subtracted as unsigned chars), but the for loop itself
for (index = 0; str1[index] != '\0' && str2[index] != '\0'; index++)
is wrong. Specifically the condition:
str1[index] != '\0' && str2[index] != '\0'
This is wrong because it checks to make sure that both characters at the given index are not '\0', rather than either character. This can be fixed by replacing && with ||.
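A sketch of the loop with that one change applied, plus the unsigned char subtraction mentioned earlier. The name my_cmp_or is only for illustration:

```c
/* The question's function with && replaced by ||. */
int my_cmp_or(const char *str1, const char *str2)
{
    int index;

    /* Keep going while either string still has characters left. */
    for (index = 0; str1[index] != '\0' || str2[index] != '\0'; index++)
        if (str1[index] != str2[index])
            return (unsigned char)str1[index] - (unsigned char)str2[index];
    return 0;
}
```

This never reads past a terminator: as soon as one string ends before the other, the characters at that index differ and the function returns from inside the loop.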
Here's how a seasoned C programmer might write the strcmp function (I wrote this :p (EDIT: @chux suggested an improvement)):
int strcmp(const char *s1, const char *s2) {
for (; *s1 && (*s1 == *s2); s1++, s2++) {}
return (unsigned char)(*s1) - (unsigned char)(*s2);
}

Resources