Comparing strings through pointers - c

I am new to the C language. Please, could someone tell me why I am always getting zero as output when comparing differing strings using my own implementation of strcmp?
I wrote the function xstrcmp to compare two strings: if they are equal, then it returns 0; otherwise, it returns the numeric difference between the ASCII values of the first non-matching pair of characters.
#include<stdio.h>
int xstrcmp(char*,char*);
int main()
{
int i;
char string1[]="jerry";
char string2[]="ferry";
i=xstrcmp(string1,string2);
printf("difference=%d\n",i);
return 0;
}
int xstrcmp(char*p,char*q)
{
int m;
while(*p!=*q)
{
if((*p=='\0')&&(*q=='\0'))
break;
p++;
q++;
}
m=(*p)-(*q);
return m;
}

You loop until you find equal chars, then you subtract them -- so of course the result is always 0.
Also, the condition inside the loop will always fail ... if the chars aren't equal, they can't both be NUL.
That should be enough for you to fix your code.

The reason you always get zero is that your while loop, while(*p!=*q), executes only as long as the characters are NOT the same.
The exit from loop will happen when *p and *q have the same value.
Hence the return value, which is m=(*p)-(*q); will always be zero.
while (*p == *q && *p != '\0') /* as long as they have the same value and the string has not ended, loop; otherwise exit */
{
p++; /* increment the pointers */
q++;
}
return (*p)-(*q);
would be the way to go.
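Putting the corrected loop into a full function might look like the sketch below. It keeps the question's name xstrcmp; the const qualifiers and the unsigned char casts are additions made here (the casts mirror how the standard strcmp compares characters), not something the original code had. The loop also stops at the terminating NUL so that two equal strings are not read past their end.

```c
/* Sketch of a strcmp-like comparison: returns 0 for equal strings,
   otherwise the difference between the first pair of differing chars. */
int xstrcmp(const char *p, const char *q)
{
    /* Advance while the characters match and the end has not been reached. */
    while (*p == *q && *p != '\0') {
        p++;
        q++;
    }
    /* Compare as unsigned char, as the standard strcmp does. */
    return (unsigned char)*p - (unsigned char)*q;
}
```

With the strings from the question, xstrcmp("jerry", "ferry") yields 'j' - 'f', which is 4 in ASCII.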

Related

How to count the number of distinct characters in common between two strings?

How can a program count the number of distinct characters in common between two strings?
For example, if s1="connect" and s2="rectangle", the count is being displayed as 5 but the correct answer is 4; repeating characters must be counted only once.
How can I modify this code so that the count is correct?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i,j,count=0;
char s1[100],s2[100];
scanf("%s",s1);//string 1 is inputted
scanf("%s",s2);//string 2 is taken as input
for(i=1;i<strlen(s1);i++)
{
for(j=1;j<strlen(s2);j++)
{
if(s1[i]==s2[j])//compare each char of both the strings to find common letters
{
count++;//count the common letters
break;
}
}
}
printf("%d",count);//display the count
}
The program is to take two strings as input and display the count of the common characters in those strings. Please let me know what's the problem with this code.
If repeating characters must be ignored, the program must 'remember' the characters which were already encountered. You could do this by storing the characters which were processed in a character array and then consulting this array while processing the other characters.
You could use a counter variable to keep track of the number of common characters like
int ctr=0;
char s1[100]="connect", s2[100]="rectangle", t[100]="";
Here, t is the character array where the examined characters will be stored. Its size is made to be same as the size of the largest of the other 2 character arrays.
Now use a loop like
for(int i=0; s1[i]; ++i)
{
if(strchr(t, s1[i])==NULL && strchr(s2, s1[i])!=NULL)
{
t[ctr++]=s1[i];
t[ctr]=0;
}
}
t initially has an empty string. Characters which were previously absent in t are added to it via the body of the loop which will be executed only if the character being examined (ie, s1[i]) is not in t but is present in the other string (ie, s2).
strchr() is a function with a prototype
char *strchr( const char *str, int c );
strchr() finds the first occurrence of c in the string pointed to by str. It returns NULL if c is not present in str.
Your usage of scanf() may cause trouble.
Use
scanf("%99s",s1);
(where 99 is one less than the size of the array s1) instead of
scanf("%s",s1);
to prevent overflow problems. And check the return value of scanf() and see if it's 1. scanf() returns the number of successful assignments that it made.
Or use fgets() to read the string.
Read this post to see more about this.
And note that array indexing starts from 0. So in your loops, the first character of each string is not checked.
So it should've been something like
for(i=0;i<strlen(s1);i++)
instead of
for(i=1;i<strlen(s1);i++)
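The fgets() suggestion above could be sketched as follows. The helper name read_line is hypothetical, chosen only for this illustration. Unlike scanf("%99s", ...), fgets() reads a whole line including spaces and keeps the trailing newline, so the sketch strips it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into buf (at most
   size - 1 characters) and strips the trailing newline.
   Returns 1 on success, 0 on end-of-file or error. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */
    return 1;
}
```

In the question's program, a call such as read_line(s1, sizeof s1, stdin) would take the place of scanf("%s", s1).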
Here's a solution that avoids quadratic O(N²) or cubic O(N³) time algorithms — it is linear time, requiring one access to each character in each of the input strings. The code uses a pair of constant strings rather than demanding user input; an alternative might take two arguments from the command line and compare those.
#include <limits.h>
#include <stdio.h>
int main(void)
{
int count = 0;
char bytes[UCHAR_MAX + 1] = { 0 };
char s1[100] = "connect";
char s2[100] = "rectangle";
for (int i = 0; s1[i] != '\0'; i++)
bytes[(unsigned char)s1[i]] = 1;
for (int j = 0; s2[j] != '\0'; j++)
{
int k = (unsigned char)s2[j];
if (bytes[k] == 1)
{
bytes[k] = 0;
count++;
}
}
printf("%d\n",count);
return 0;
}
The first loop records which characters are present in s1 by setting an appropriate element of the bytes array to 1. It doesn't matter whether there are repeated characters in the string.
The second loop detects when a character in s2 was in s1 and has not been seen before in s2, and then both increments count and marks the character as 'no longer relevant' by setting the entry in bytes back to 0.
At the end, it prints the count — 4 (with a newline at the end).
The use of (unsigned char) casts is necessary in case the plain char type on the platform is a signed type and any of the bytes in the input strings are in the range 0x80..0xFF (equivalent to -128..-1 if the char type is signed). Using negative subscripts would not lead to happiness. The code does also assume that you're working with a single-byte code set, not a multi-byte code set (such as UTF-8). Counts will be off if you are dealing with multi-byte characters.
The code in the question is at minimum a quadratic algorithm because for each character in s1, it could step through all the characters in s2 only to find that it doesn't occur. That alone requires O(N²) time. Both loops also use a condition based on strlen(s1) or strlen(s2), and if the optimizer does not recognize that the value returned is the same each time, then the code could scan each string on each iteration of each loop.
Similarly, the code in the other two answers as I type (Answer 1 and Answer 2) is also quadratic or worse because of its loop structures.
At the scale of 100 characters in each string, you probably won't readily spot the difference, especially not in a single iteration of the counting. If the strings were bigger — thousands or millions of bytes — and the counts were performed repeatedly, then the difference between the linear and quadratic (or worse) algorithms would be much bigger and more easily detected.
I've also played marginally fast'n'loose with the Big-O notation. I'm assuming that N is the size of the strings, and they're sufficiently similar in size that treating N₁ (the length of s1) as approximately equal to N₂ (the length of s2) isn't going to be a major problem. The 'quadratic' algorithms might be more formally expressed as O(N₁•N₂) whereas the linear algorithm is O(N₁+N₂).
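The same linear technique can be wrapped in a function that takes the two strings as arguments, which would suit the command-line variant mentioned above. This is only a sketch; the name count_common_chars is made up for the illustration, and it carries the same single-byte code set assumption as the program above.

```c
#include <limits.h>

/* Counts the distinct byte values that occur in both s1 and s2. */
int count_common_chars(const char *s1, const char *s2)
{
    char seen[UCHAR_MAX + 1] = { 0 };
    int count = 0;

    /* Mark every byte value that appears in s1. */
    for (; *s1 != '\0'; s1++)
        seen[(unsigned char)*s1] = 1;

    /* Count each marked value once, clearing the mark as we go. */
    for (; *s2 != '\0'; s2++) {
        unsigned char k = (unsigned char)*s2;
        if (seen[k]) {
            seen[k] = 0;
            count++;
        }
    }
    return count;
}
```

For "connect" and "rectangle" the common letters are c, n, e and t, so the function returns 4.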
Based on what you expect as output, you should keep track of which characters you have already used from the second string. You can achieve this as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i, j, count = 0, skeep;
char s1[100], s2[100], s2Used[100] = {0};
scanf("%s", s1); //string 1 is inputted
scanf("%s", s2); //string 2 is taken as input
for (i = 0; i<strlen(s1); i++)
{
skeep = 0;
for (j = 0; j < i; j++)
{
if (s1[j] == s1[i])
{
skeep = 1;
break;
}
}
if (skeep)
continue;
for (j = 0; j<strlen(s2); j++)
{
if (s1[i] == s2[j] && s2Used[j] == 0) //compare each char of both the strings to find common letters
{
//printf("%c\n", s1[i]);
s2Used[j] = 1;
count++;//count the common letters
break;
}
}
}
printf("%d", count);//display the count
}

Pointer De-referencing

#include<stdlib.h>
#include<stdio.h>
#define NO_OF_CHARS 256
/* Returns an array of size 256 containing count
of characters in the passed char array */
int *getCharCountArray(char *str)
{
int *count = (int *)calloc(sizeof(int), NO_OF_CHARS);
int i;
for (i = 0; *(str+i); i++)
count[*(str+i)]++;
return count;
}
/* The function returns index of first non-repeating
character in a string. If all characters are repeating
then returns -1 */
int firstNonRepeating(char *str)
{
int *count = getCharCountArray(str);
int index = -1, i;
for (i = 0; *(str+i); i++)
{
if (count[*(str+i)] == 1)
{
index = i;
break;
}
}
free(count); // To avoid memory leak
return index;
}
/* Driver program to test above function */
int main()
{
char str[] = "geeksforgeeks";
int index = firstNonRepeating(str);
if (index == -1)
printf("Either all characters are repeating or string is empty");
else
printf("First non-repeating character is %c", str[index]);
getchar();
return 0;
}
I really can't grasp the following lines:
count[*(str+i)]++;
and
int *getCharCountArray(char *str)
{
int *count = (int *)calloc(sizeof(int), NO_OF_CHARS);
int i;
for (i = 0; *(str+i); i++)
count[*(str+i)]++;
return count;
}
The program is used to find the first Non-Repeating character in the string.
*(str+i) is same as str[i]. The line:
for (i = 0; *(str+i); i++)
is the same as:
for (i = 0; str[i]; i++)
The statements in the loop will be executed as long as str[i] evaluates to non-zero. Since C strings are arrays of characters that are terminated by a null character, the for loop will be executed for each character in str. It will stop when the end of the string is reached.
count[*(str+i)]++;
is the same as:
count[str[i]]++;
If str[i] is 'a', this line will increment the value of count['a'], which is count[97] in ASCII encoding.
At the end of the loop, count will be filled with integers that represent the number of times a particular character appears in str.
I really can't grasp the following lines:
count[*(str+i)]++;
Work from the outside in:
since str is a pointer to char and i is an int, str + i is a pointer to the char that is i chars after the one str itself points to
*(str+i) dereferences pointer str+i, meaning it evaluates to the char the pointer points to. This is exactly equivalent to str[i].
count[*(str+i)] uses the char at index i in string str as an index into dynamic array count. The expression designates the int at that index (since count points to an array of ints). See also below.
count[*(str+i)]++ evaluates to the int at index *(str+i) in the array count points to. As a side effect, it increments that array element by one after the value of the expression is determined. This overall expression is present in your code exclusively for its side effect.
It is important to note that although space is reserved in array count for counting appearances of 256 distinct char values, the expression you asked about is not a safe way to count all of them. That's because type char can be implemented as a signed type (at the C implementer's discretion), and it is common for it to be implemented that way. In that case, only the non-negative char values correspond to array elements, and undefined behavior will result if the input string contains others. Safer would be:
#include <stdint.h>
/* ... */
count[(uint8_t) *(str+i)]++;
i.e. the same as the original, except for explicitly casting each character of the input string to an unsigned 8-bit value.
Overall, the function simply creates an array of 256 ints, one for each possible char value, and scans the string to count the number of occurrences of each char value that appears in it. It then returns this array of occurrence counts.
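As a sketch of how the whole program's logic could be written with the safer indexing, the function below folds the counting and the search into one place. It is not the original code: the name first_non_repeating, the local array in place of calloc(), and the cast to unsigned char (instead of uint8_t) are choices made for this illustration.

```c
#include <limits.h>

/* Returns the index of the first character that occurs exactly once
   in str, or -1 if there is no such character. */
int first_non_repeating(const char *str)
{
    int count[UCHAR_MAX + 1] = { 0 };
    int i;

    /* First pass: count[c] becomes the number of occurrences of c. */
    for (i = 0; str[i] != '\0'; i++)
        count[(unsigned char)str[i]]++;

    /* Second pass: the first character with a count of 1 wins. */
    for (i = 0; str[i] != '\0'; i++)
        if (count[(unsigned char)str[i]] == 1)
            return i;

    return -1;
}
```

For "geeksforgeeks" this returns 5, the index of 'f'.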
This code is equivalent to the confusing loop you posted. Does it help?
*(str + i) is a confusing way of expressing str[i] and IMO inappropriate here.
for (i = 0; str[i] != '\0'; ++i)
{
char curr_char = str[i];
++count[curr_char];
}
Explanation of the for loop
There are three parts of a for loop that we need to consider:
1) Initialization of the counter variable (i in your example). 2) Condition (*(str+i)). 3) Increment/decrement part (i++).
The for loop keeps executing while the condition is true (i.e. any non-zero value), so *(str+i) provides a non-zero value until the terminating null character of the array is reached.
count[*(str+i)]++; // counts how often each character occurs by stepping through the string character by character
count[*(str+i)]++ => count[*(str+i)]=count[*(str+i)]+1
Now consider one scenario:
char str[] = "aaab";
*(str+i), i.e. str[i], yields chars like 'a', 'b', etc.
So
count[*(str+i)]++ is count['a']++, which means:
count['a']=count['a']+1 // stores the occurrence count of a = 1
count['a']=count['a']+1 // updates the occurrence count of a = 2
count['a']=count['a']+1 // updates the occurrence count of a = 3
and likewise for the other characters.
So count[*(str+i)]++ updates the occurrence count of the current character in count.

C: Replacing a substring within a string using loops

I am struggling with the concept of replacing substrings within strings. This particular exercise does not want you to use built in functions from <string.h> or <strings.h>.
Given the string made up of two lines below:
"Mr. Fay, is this going to be a battle of wits?"
"If it is," was the indifferent retort, "you have come unarmed!"
I have to replace a substring with another string.
This is what I have so far, and I'm having trouble copying the substring to a new array, and replacing the substring with the new string:
#include <stdio.h>
#include <string.h>
int dynamic();
int main()
{
char str[]="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";
int i, j=0, k=0, l=0, n=0;
unsigned int e = n-2;
char data[150];
char newData[150];
char newStr[150];
printf("Give me a substring from the string");
gets(data);
printf("Give me a substring to replace it with");
gets(newData);
dynamic();
for (i=0; str[i] != '\0'; i++)
{
if (str[i] != data[j])
{
newStr[l] = str[i];
l++;
}
else if ((str[i+e] == data[j+e]) && (j<n))
{
newStr[l] = newData[j];
j++;
l++;
e--;
}
else if ((str[i+e] == data[j+e]) && (j>=n))
{
j++;
e--;
}
else
{
newStr[l] = str[i];
l++;
}
}
printf("original string is-");
for (k=0; k<n; k++)
printf("%c",str[k]);
printf("\n");
printf("modified string is-");
for(k=0; k<n; k++)
printf("%c",newStr[k]);
printf("\n");
}
int dynamic()
{
char str[]="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";
int i, n=0;
for (i=0; str[i] != '\0'; i++)
{
n++;
}
printf("the number of characters is %d\n",n);
return (n);
}
I tried your problem and got output for my code. Here is the code-
EDIT- THIS IS THE EDITED MAIN CODE
#include <stdio.h>
#include <string.h>
int var(char *); //function declaration. I am telling the compiler that I will be using this function at a later stage with one argument of type char *
int main() //main function
{
char *str="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";
int i,j=0,k=0,l=0;
char data[] = "indifferent";
char newData[] = "nonchalant";
char newStr[150];
//here 'n' is returned from the 'var' function and is received in form of r,r1,r2,r3.
int r=var(str); //getting the length of str from the function 'var' and storing in 'r'
int r1=var(data); //getting the length of data from the function 'var' and storing in 'r1'
int r2=var(newData); //getting the length of newData from the function and storing in 'r2'
unsigned int e=r1-2; //r1-2 because r1 is the length of the data to be replaced, and string indexes start from 0. Here var() returns 11 for "indifferent"
//because the terminating '\0' is not counted: the characters sit at indexes 0 to 10 and index 11 holds '\0'. So e starts at 11-2=9.
for(i=0;str[i]!='\0';i++)
{
if(str[i]!=data[j])
{
newStr[l]=str[i];
l++;
}
else if((str[i+e]==data[j+e]) && (j<r2))
{
newStr[l]=newData[j];
j++;
l++;
e--;
}
else if((str[i+e]==data[j+e]) && (j>=r2))
{
j++;
e--;
}
else
{
newStr[l]=str[i];
l++;
}
}
newStr[l]='\0'; //terminate newStr so that 'var' can find its end
int r3=var(newStr); //getting the length of newStr from the function and storing in 'r3'
printf("original string is-");
for(k=0;k<r;k++)
printf("%c",str[k]);
printf("\n");
printf("modified string is-");
for(k=0;k<r3;k++)
printf("%c",newStr[k]);
printf("\n");
} // end of main function
// Below is the new function called 'var' to get the character length
//'var' is the function name and it has one parameter. I am returning integer. so, it is int var.
int var(char *stri)//common function to get length of strings and substrings
{
int i,n=0;
for(i=0;stri[i]!='\0';i++)
{
n++; //n holds the length of a string.
}
// printf("the number of characters is %d\n",n);
return (n); //returning this 'n' wherever the function is called.
}
Let me explain few parts of the code-
I have used unsigned int e because I don't want 'e' to go negative (I will explain more about this later).
In the first for loop, I am checking whether my string has reached the end.
In the first 'IF' condition, I am checking whether the first character of the string is NOT EQUAL to the first character of the word which needs to be replaced. If the condition is satisfied, the original string is copied as usual.
ELSE IF (i.e. the first character of the string is EQUAL to the first character of the word), then check the next few characters to make sure that the word matches. Here, I used 'e' because it will check the condition for str[i+e] and data[j+e]. Example: ai is not equal to ae. If I had not used 'e' in the code, newdata would have been written into newstr right after checking the first character. I used 'e'=5 because the probability of the 1st letter and the 5th letter both being the same in data and str is low. You can use 'e'=4 also. There is no rule that you have to use 'e'=5 only.
Now, I am decrementing 'e' and checking whether the letters in the string are the same or not. I can't increment because there is a limit on the size of a string. Since I used unsigned int, 'e' cannot hold a negative value (note that decrementing an unsigned 0 wraps around to a very large value rather than stopping at 0).
ELSE (this means that only the first letter matches and the 5th letters of str and data do not), copy str into newstr.
In the last FOR loop, I have used k<114 because that many characters are in the string. (You can write code to find how many characters there are in a string. No need to count manually.)
And lastly, I have used the conditions (j<10) and (j>=10) along with the ELSE-IF conditions because, in the first ELSE-IF, the new data is of size 10. So, even if the word to be replaced is longer than 10, say 12 for example, I don't need the extra 2 characters to be stored in the new data. So, if the size is more than 10, just bypass them in the next ELSE-IF condition. Note that this 10 is the size of the new word, so it varies if your word is smaller or bigger. And in the second ELSE-IF, I am not incrementing 'l' (l++) because here I am not putting anything in newstr. I am just bypassing it, so I didn't increment.
I tried my best to put the code in words. If you have any doubt, you can ask again. I will be glad to help. And this code is NOT OPTIMAL. The numerical values used vary with the words/strings you use. Of course, I can write generalized code for that (to fetch the numerical values automatically from the strings). But I didn't write that code here. This code works for your problem. You can change a few variables like 'e' and the ELSE-IF part and try to understand how the code works. Play with it.
EDIT-
#include <stdio.h>
int main()
{
char str[]="\n\"Mr. Fay, is this going to be a battle of wits?\" \n\"If it is,\" was the indifferent retort, \"you have come unarmed!\"";// I took this as string. The string which u need to calculate the length, You have to pass that as the function parameter.
int i,n=0;
for(i=0;str[i]!='\0';i++)
{
n++;
}
printf("the number of characters is %d\n",n);
return (n);
}// If you execute this as a separate program, you will get the number of characters in the string. Basically, you just have to modify this code to act as a separate function and when calling the function, you have to pass correct arguments.
//Use Pointers in the function to pass arguments.
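For comparison, here is a different, more general way the exercise could be approached without any <string.h> functions. It is a sketch, not the code from the answer above: the names str_len, starts_with and replace_all are invented for this illustration, and it replaces every occurrence of the substring while checking that the output buffer is large enough.

```c
#include <stddef.h>

/* Length of a NUL-terminated string, written by hand because the
   exercise forbids <string.h>. */
static size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Returns 1 if the text starting at s begins with prefix, else 0. */
static int starts_with(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (*s != *prefix)      /* also fails when s ends first */
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Copies src into out, replacing every occurrence of old_sub with
   new_sub. Returns 0 on success, -1 if old_sub is empty or the
   result does not fit into out_size bytes. */
int replace_all(const char *src, const char *old_sub, const char *new_sub,
                char *out, size_t out_size)
{
    size_t old_len = str_len(old_sub);
    size_t new_len = str_len(new_sub);
    size_t o = 0;               /* next free position in out */

    if (old_len == 0 || out_size == 0)
        return -1;

    while (*src != '\0') {
        if (starts_with(src, old_sub)) {
            if (new_len >= out_size - o)    /* keep room for '\0' */
                return -1;
            for (size_t k = 0; k < new_len; k++)
                out[o++] = new_sub[k];
            src += old_len;                 /* skip the matched text */
        } else {
            if (o + 1 >= out_size)          /* keep room for '\0' */
                return -1;
            out[o++] = *src++;
        }
    }
    out[o] = '\0';
    return 0;
}
```

Called with the question's string, "indifferent" and "nonchalant", and a 150-byte newStr buffer, this would write the modified text into newStr and return 0.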

How to use the character array to identify a string

For my class we use char arrays for strings. If I was to use an if else statement, would something like this work if I had it modified to do so?
I know an array like this would make every character broken down to simple letters. And to use an if else statement I have to go like array[1] == 'H' and so on.
Is there a way to modify the code below to spit out the information I want if I type up "Alas". Right now, it only goes to the else part.
int main()
{
char s[10];
printf("Yo, this is a string: ");
gets_s(s);
if (s == "Alas")
{
printf("B ");
}
else
{
printf("A");
}
system("pause");
}
Use the strncmp standard library function to compare two strings. Include the <string.h> header.
strncmp(const char *s1, const char *s2, size_t n)
RETURN VALUE:
Upon successful completion, strncmp() shall return an integer greater than, equal to, or less than 0, if the possibly null-terminated array pointed to by s1 is greater than, equal to, or less than the possibly null-terminated array pointed to by s2 respectively.
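Now in your code, s decays to a pointer and "Alas" is also treated as a pointer, each to a different memory area, so s == "Alas" compares addresses rather than contents. This is the reason why they are always different. To compare whole strings, strcmp (the same comparison without the length limit n) is enough. Use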
if (!strcmp(s, "Alas"))
Something like:
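#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char s[10];
printf("Yo, this is a string: ");
gets_s(s, sizeof s); /* the C11 gets_s takes the buffer size as well */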
if (strcmp(s, "Alas") == 0)
{
printf("B ");
}
else
{
printf("A");
}
system("pause");
}
If the only thing you want to know is whether two strings are identical or not, you can define a function yourself to check each character of two strings, returning a 0 as soon as you encounter a difference, returning a 1 only if it encounters the terminating zero on both at the same time:
int SameStrings( const char * s1, const char * s2 ) {
int i; // declared outside the loop because it is still needed after it
for ( i = 0; s1[i] && s2[i]; i++ )
if ( s1[i] != s2[i] )
return 0;
// if the programme advanced this far
// at least one of the two characters must be 0
return s1[i] == s2[i];
// if they are equal, then both must be 0, in which case it will return 1
// else it will return a 0
}
You can add one more argument to that function, an integer that will limit the maximum number of characters to be checked, in case, for example, you want SameStrings( "lalaqwe", "lalaasd", 4 ) to return true.
This is good if you don't want to include a library for a function that does much more than you need...
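The variant with a length limit mentioned above might be sketched like this. The name same_strings_n and the choice of size_t for the limit are assumptions made for the illustration.

```c
#include <stddef.h>

/* Returns 1 if the first n characters of s1 and s2 are identical
   (or both strings end, identically, before n characters), else 0. */
int same_strings_n(const char *s1, const char *s2, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s1[i] != s2[i])
            return 0;           /* first difference found */
        if (s1[i] == '\0')
            return 1;           /* both strings ended together */
    }
    return 1;                   /* n characters matched */
}
```

With this sketch, same_strings_n("lalaqwe", "lalaasd", 4) returns 1, while a limit of 5 returns 0 because 'q' and 'a' differ.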

Writing a program in C with the function isAlphabetic to determine if a string strictly contains alphabetic letters or not

This is what I have so far.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int value;
char c='Z';
char alph[30]="there is a PROF 1 var orada";
char freq[27];
int i;
// The function isAlphabetic will accept a string and test each character to
// verify if it is an alphabetic character ( A through Z , lowercase or uppercase)
// if all characters are alphabetic characters then the function returns 0.
// If a nonalphabetic character is found, it will return the index of the nonalpabetic
// character.
value = isAlphabetic(alph);
if (value == 0)
printf("\n The string is alphabetic");
else
printf("Non alphabetic character is detected at position %d\n",value);
return EXIT_SUCCESS;
}
int isAlphabetic(char *myString) {
}
What I'm confused is how will I have the program scan through a string to detect exactly where a non alphabetic character is, if any? I'm guessing it'll first involve counting all the characters in a string first?
Not going to provide the answer via code (as someone else did), but consider:
A string in C is nothing more than an array of characters and a null terminator.
You can iterate through each item in an array using [] (i.e., input[i]) to check its value against an ASCII table for example.
Your function can exit as soon as it finds one value that is not alphabetic.
There are certainly other ways to solve this problem, but my assumption is that at this level, your professor would be a bit suspicious if you started using a bunch of libraries / tools you haven't been taught.
Let's take your questions one at a time:
...how will I have the program scan through a string...
"Scan through a string" means you skin the cat with a loop:
char xx[] = "ABC DEF 123 456";
int ii;
/* for, while, do while; pick your poison */
for (ii = 0; xx[ii] != '\0'; ++ii)
{
/* Houston, we're scanning. */
}
...to detect...
"Detect" means you skin the cat with a comparison of some sort:
char a, b;
a == b; /* equality of two char's */
a >= b; /* greater-than-or-equal-to relationship of two char's */
a < b; /* I'll bet you can guess what this does now */
...exactly where a non alphabetic character is...
Well by virtue of scanning you'll know "exactly where" due to your index.
Scan from the first character to the last character. Begin with a counter variable set to 0.
Each time you move to the next character, do counter++; this will give you the index of the non-alphabetic character.
If you find any non-alphabetic character, return counter right there.
I will give you a hint :
#include <stdio.h>
int main()
{
char c = '1';
printf("%d",c-48); //notice this
return 0;
}
Output : 1
Should be more than enough to solve it on your own now :)
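Pulling the hints above together, one possible sketch is shown below. It deliberately departs from the assignment's convention: returning 0 for "all alphabetic" cannot be told apart from a non-alphabetic character at index 0, so this hypothetical first_non_alpha returns -1 instead. It also uses isalpha() from <ctype.h> rather than comparing against ASCII values by hand, which is an assumption about what the course allows.

```c
#include <ctype.h>

/* Returns the index of the first non-alphabetic character in s,
   or -1 if every character is alphabetic. */
int first_non_alpha(const char *s)
{
    for (int i = 0; s[i] != '\0'; i++) {
        /* The cast avoids passing a negative value to isalpha(). */
        if (!isalpha((unsigned char)s[i]))
            return i;
    }
    return -1;
}
```

For the question's string "there is a PROF 1 var orada", the first non-alphabetic character is the space at index 5.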

Resources