Design a Character Searching Function, While Forced to Use strchr

Background Information
I was recently approached by a friend who was given a homework problem to develop a searching algorithm. Before anyone asks, I did think of a solution! However, my solution is not what the teacher is asking for...
Anyway, this is an introductory C programming course where the students have been asked to write a search function called ch_search that is supposed to search an array of characters to determine how many times a specific character occurs. The constraints are what I don't understand...
Constraints:
The arguments are: array to search, character to search for, and length of the array being searched.
The function must use a for-loop.
The algorithm must use the strchr function.
Okay, so the first two constraints I can understand... but the 3rd constraint is what really gets me... I was initially thinking that we could just use a for-loop to iterate through the string from the beginning to the end, simply counting each instance of the character. When the student originally described the problem to me, I came up with (although incorrect) the solution:
Proposed Solution
int ch_search(char array_to_search[], char char_to_search_for, int array_size)
{
int count = 0;
for (int i = 0; i < array_size; i++)
{
// count each character instance
if (array_to_search[i] == char_to_search_for)
{
// keep incrementing the count
count++;
}
}
return count;
}
Then I was told that I had to specifically use the character position function (and apparently it has to be strchr and not strrchr so we can't start at the end I guess?)... I just don't see how that wouldn't be overcomplicating this. I don't see how that would help at all, especially counting from the beginning... Even strrchr might make a little more sense to me. Thoughts?

It's true that, given the length of the array and the requirement to use a for loop,
the most natural thing to do would be to iterate over every character of the
source array. But you can also loop over the result of strchr like this:
int ch_search(char haystack[], char needle, int size)
{
int count = 0;
char *found;
for(; (found = strchr(haystack, needle)) != NULL; haystack = found + 1)
count++;
return count;
}
In this case you don't need the size of the array but the assignment doesn't say
that you have to use it. Obviously this solution requires the source to be '\0'-terminated.

I think the teacher wanted you to use strchr to navigate to the next occurrence of the char_to_search_for within a string:
int ch_search(char array_to_search[], char char_to_search_for, int array_size) {
int count = 0;
for (char *ptr = array_to_search ; ptr != &array_to_search[array_size] ; ptr++) {
ptr = strchr(ptr, char_to_search_for);
if (!ptr) {
break; // Character is not found
}
count++;
}
return count;
}
Note that array_to_search must be null-terminated in order to be used with the strchr solution above.

This sounds like your friend was given a trick question. The function gets an array of chars and the length of that array but is required to use strchr(), even though that function only works on '\0'-terminated strings (and no guarantee was given that the array is '\0'-terminated).
You might think that it would be fine to use strchr() on the array anyway and then compare the returned pointer to the given length of the array to check if it went past the end of the array. But there are two problems with that:
If strchr() searches past the end of the array, then you already have Undefined Behavior before getting to the check. The program might have crashed before returning from strchr(), the returned pointer might be some total garbage or you might get a pointer to an address a bit further in memory than the end of the array.
Even if the returned pointer is just to an address a bit further in memory than the end of the array, there is the problem that comparing two pointers with a relational operator such as < (or subtracting them to find the distance between the pointed-to addresses) is Undefined Behavior unless they both point into the same memory object (or one position past its end). In this instance it means that checking whether the returned pointer is within the bounds of the array is only defined behavior if the returned pointer is within the bounds of the array (or one past the end), which makes the check rather useless.
The only solution to that is to make sure that strchr() is working with a '\0' terminated string. For example:
int ch_search(char array_to_search[], char char_to_search_for, int array_size)
{
char *buffer = malloc(array_size + 1);
if (buffer == NULL) {
return -1; // allocation failed; -1 is an arbitrary error value
}
strncpy(buffer, array_to_search, array_size);
buffer[array_size] = '\0';
int count = 0;
for (char *i = buffer; (i = strchr(i, char_to_search_for)) != NULL; i++) {
count++;
}
free(buffer);
return count;
}
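If the strchr requirement could be relaxed, its bounded counterpart memchr would avoid the copy, because it takes an explicit length and never reads past it. The following is only a sketch under that assumption: the name ch_search_bounded is made up for this example, and the function does not satisfy the original assignment.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded variant: counts occurrences of c in the first
 * size bytes of array. It uses memchr instead of strchr, so the array
 * does not need a '\0' terminator. */
int ch_search_bounded(const char array[], char c, int size)
{
    if (size <= 0)
        return 0;

    int count = 0;
    const char *end = array + size;   /* one past the last element */

    /* Each memchr call is limited to the bytes that remain before end. */
    for (const char *p = array;
         p < end && (p = memchr(p, c, (size_t)(end - p))) != NULL;
         p++) {
        count++;
    }
    return count;
}
```

Because every search is limited to the remaining bytes, the returned pointer is always inside the array, so no out-of-bounds check is needed afterwards.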

strchr is a very convenient function to search for a char in a string.
Find and read more about strchr. This is my favorite function ever!
The C library function char *strchr(const char *str, int c) searches for the first occurrence of the character c (converted to a char) in the string pointed to by the argument str. The terminating null character is considered part of the string.
Declaration
Following is the declaration for strchr() function.
char *strchr(const char *str, int c)
Parameters
str − This is the C string to be scanned.
c − This is the character to be searched in str.
Return value
Function returns a pointer to the first occurrence of the character c in the string str, or NULL if the character is not found.
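As a small illustration of the return value, here is a sketch that turns the returned pointer into an index by subtracting the start of the string. The helper name first_index_of is invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of c in s,
 * or -1 if c does not occur. */
long first_index_of(const char *s, char c)
{
    const char *hit = strchr(s, c);
    return hit != NULL ? (long)(hit - s) : -1;
}
```

For "hello", searching for 'l' gives index 2, and searching for 'z' gives -1. Searching for '\0' gives 5, because the terminator counts as part of the string.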
Constraints:
1) The arguments are: array to search, character to search for, and
length of the array being searched.
This constraint gives the length of the array to be searched. The given array has to contain '\0' at some point. However, the length of the search can be shorter, as specified by search_length.
The following compact solution takes this into account.
int ch_search(char array_to_search[], char char_to_search_for, int search_length)
{
int count = 0;
for(char *p = array_to_search; ;p++)
{
p = strchr(p, char_to_search_for);
if( p != NULL && (p - array_to_search < search_length) )
count++;
else
break;
}
return count;
}
Or the equivalent ch_search2, shown here in a complete program:
#include<stdio.h>
#include<string.h>
int ch_search(char array_to_search[], char char_to_search_for, int search_length)
{
int count = 0;
for(char *p = array_to_search; ;p++)
{
p = strchr(p, char_to_search_for);
if( p != NULL && (p - array_to_search < search_length) )
count++;
else
break;
}
return count;
}
// Your original function:
int ch_search1(char array_to_search[], char char_to_search_for, int array_size)
{
int count = 0;
for (int i = 0; i < array_size; i++){
// count each character instance
if (array_to_search[i] == char_to_search_for){
count++; // keep incrementing the count
}
}
return count;
}
int ch_search2(char array_to_search[], char char_to_search_for, int array_size)
{
int count = 0;
char *p = array_to_search;
for(;;)
{
p = strchr(p, char_to_search_for);
if( p != NULL )
{
if (p - array_to_search >= array_size) // we reached beyond
{
break;
}
else
{
count++;
p++;
}
}
else
break; // char not found
}
return count;
}
int main(void)
{
// the arr has to contain '\0' terminator but we can search within the specified length.
char arr[]={'1','1','2','2','1','1','3','3','3','1','4','4', '1','1','!','1','\0','1'};
char arr1[] = "zdxbab";
printf("count %d count %d \n",ch_search(arr , '1', 12),ch_search2(arr , '1', 12));
printf("count %d count %d \n",ch_search(arr1,'b',strlen(arr1)),ch_search2(arr1,'b',strlen(arr1)));
return 0;
}
Output:
count 5 count 5
count 2 count 2

Related

What recursive function can I use to substitute the characters in a string by the greatest character on its right?

I got this exercise (it's not homework or an assignment, just practice) that asks for a program that replaces the characters in a string by the greatest character on the right using recursive functions, no loops allowed.
If I have a string acbcba the function has to return ccccba.
One thing I've tried with the loops for the string is this, to maybe try to turn it into a recursion if it worked:
void nextGreatest(char *str, int size)
{
int max_from_right = str[size-1];
str[size-1] = -1;
for(int i = size-2; i >= 0; i--)
{
int temp = str[i];
str[i] = max_from_right;
if(max_from_right < temp)
max_from_right = temp;
}
}
Output: cccba
I think the issue is that it doesn't count the characters that don't have to be replaced.
There was also another example using python I found and tried to change to C (MAX is a macro from here):
void nextGreatest(char *arr, int rev_i, int maxnum){
if (rev_i == strlen(arr) - 1){
arr[0] = maxnum;
return;
}
int i = strlen(arr) - 1 - rev_i;
arr[i], maxnum = maxnum, MAX(maxnum, arr[i]);
return nextGreatest(arr, rev_i + 1, maxnum);
}
Output: ~bccba

This is a simple problem of recursion. The idea is that you MUST make one pass over the full string to detect the maximum and to record the index at which it occurred, and then a second pass over the string to replace the characters at indexes less than the index of the maximum with the maximum detected in the first pass. I wrote this code for you:
#include <stdio.h>
void
replace_max_char(char*s, char **index, char *max)
{
if (!*s) return;
if (*s>=*max)
*index = s, *max = *s;
replace_max_char(s+1, index, max);
if (s<*index)
*s=*max;
}
int
main(void)
{
char s[] = "acbcba", *index = s, max = '\0'; /* initialize before the first comparison */
replace_max_char(s, &index, &max);
printf("%s\n", s);
return 0;
}
Notice the if (s<*index) after the recursive call of replace_max_char(s+1, index, max). When the recursion finishes, it will start executing the stacked if... and at that moment the maximum is known and also the index the maximum occurred at.

It does not need to be hard:
char nextGreatest(char *s)
{
char n;
if (!*s) return 0;
n = nextGreatest(s + 1);
return *s = MAX(*s, n);
}
If a character is \0 then we've reached the end of the string. Otherwise, we have more characters to the right, and we want the current character to be the greatest of its own value and all values to the right. But since we call the function in this order, you only will need to call it for its direct right neighbour, because it will in turn do the same first.
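For reference, here is a self-contained sketch of the same recursion. It assumes that MAX is the usual two-argument maximum macro (the answer does not show its definition), and the function name fill_suffix_max is made up for this example.

```c
/* Assumption: MAX is the conventional two-argument maximum macro. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical name. Replaces every character with the greatest
 * character found at its own position or anywhere to its right,
 * and returns the greatest character of the whole string
 * (0 for an empty string). */
char fill_suffix_max(char *s)
{
    char right_max;

    if (*s == '\0')
        return 0;                      /* nothing to the right of the end */
    right_max = fill_suffix_max(s + 1); /* solve the rest of the string first */
    return *s = MAX(*s, right_max);
}
```

Called on a writable buffer holding "acbcba", this leaves "ccccba" in the buffer, which matches the expected result from the question.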

Think of it as an induction.
First, define the recursive function well:
the function gets an array of chars and replaces each letter with the greatest letter to its right (you can define it however you want).
We will recurse on the length of the array (call it n, as in induction).
Base: n=0, an array with only the null character; just return the array.
Step: assume that the function works for an array of length n-1. What do we need to do for n? We simply need to look at the first letter, check what the greatest letter to its right is, and switch to it.
Let's look at an example: for acbcba, calling the function on the smaller array cbcba returns cccba (since we assume it works for n-1). Now we need to solve for n, so we have the array acccba; we look at the first letter, see what the greatest letter to its right is, and replace it.
Now try to code it.

First Not Repeating Character Code

Here is the question:
Write a solution that only iterates over the string once and uses O(1) additional memory, since this is what you would be asked to do during a real interview.
Given a string s, find and return the first instance of a non-repeating character in it. If there is no such character, return '_'.
And here is my code:
char firstNotRepeatingCharacter(char * s) {
int count;
for (int i=0;i<strlen(s);i++){
count=0;
char temp=s[i];
s[i]="_";
char *find= strchr(s,temp);
s[i]=temp;
if (find!=NULL) count++;
else return s[i];
}
if (count!=0) return '_';
}
I don't know what's wrong, but when given the input:
s: "abcdefghijklmnopqrstuvwxyziflskecznslkjfabe"
the output of my code is "g" instead of "d".
I thought the code should have exited the loop and returned "d" as soon as "d" was found.
Thx in advance!!!

In your program, the problem is in this statement:
s[i]="_";
You are assigning a string literal (which decays to a pointer) to s[i], which is a single char. Change it to:
s[i]='_';
At the bottom of your firstNotRepeatingCharacter() function, the return statement sits under an if condition, so the compiler should be warning you, because the function is supposed to return a char on every path. Moreover, the count variable is not needed. You could do something like:
char firstNotRepeatingCharacter(char * s) {
for (int i=0;i<strlen(s);i++){
char temp=s[i];
s[i]='_';
char *find= strchr(s,temp);
s[i]=temp;
if (find==NULL)
return s[i];
}
return '_';
}
But this code calls strchr inside the loop that iterates over the string, so it is not an exact solution to your problem, since you have the condition that the program should iterate over the string only once. You need to reconsider the solution.
You could use recursion to achieve your goal: iterate over the string using recursion, count how often each character occurs on the way down, and identify the first non-repeating character while the stack unwinds. Its implementation:
#include <stdio.h>
int ascii_arr[256] = {0};
char firstNotRepeatingCharacter(char * s) {
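char result = '_'; /* the question asks for '_' when nothing qualifies */
if (*s == '\0')
return result;
ascii_arr[(unsigned char)*s] += 1; /* cast avoids a negative index */
result = firstNotRepeatingCharacter(s+1);
if (ascii_arr[(unsigned char)*s] == 1)
result = *s;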
return result;
}
int main()
{
char a[] = "abcdefghijklmnopqrstuvwxyziflskecznslkjfabe";
printf ("First non repeating character: %c\n", firstNotRepeatingCharacter(a));
return 0;
}
In the above code, firstNotRepeatingCharacter() iterates over the string only once using recursion, and while the stack unwinds it identifies the first non-repeating character. I am using a global int array ascii_arr of length 256 to keep track of how many times each character occurs. Note that the recursion itself uses stack space proportional to the length of the string.

Java Solution:
Time Complexity: O(n)
Space Complexity: constant, as it only uses one additional 26-element array to maintain the count of chars in the input
Using Java built-in utilities (with these, the time complexity is more than O(n)):
char solution(String s) {
char[] c = s.toCharArray();
for (int i = 0; i < s.length(); i++) {
if (s.indexOf(c[i]) == s.lastIndexOf(c[i]))
return c[i];
}
return '_';
}
Using simple arrays. O(n)
char solution(String s) {
// maintain count of the chars in a constant space
int[] base = new int[26];
// convert string to char array
char[] input = s.toCharArray();
// linear loop to get count of all
for(int i=0; i< input.length; i++){
int index = input[i] - 'a';
base[index]++;
}
// just find first element in the input that is not repeated.
for(int j=0; j<input.length; j++){
int inputIndex = input[j]-'a';
if(base[inputIndex]==1){
System.out.println(j);
return input[j];
}
}
return '_';
}
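Since the question is about C, here is a rough C counterpart of the counting approach, offered as a sketch only. The function name first_not_repeating is made up, and the table covers every unsigned char value rather than just 'a' to 'z'. Like the Java version, it makes two passes over the string, so it does not meet the "iterate only once" wording of the question.

```c
#include <limits.h>

/* Hypothetical name. Returns the first character of s that occurs
 * exactly once, or '_' if there is no such character. */
char first_not_repeating(const char *s)
{
    /* Fixed-size table, so the extra memory does not grow with the input. */
    int counts[UCHAR_MAX + 1] = {0};

    /* First pass: count every character. */
    for (const char *p = s; *p != '\0'; p++)
        counts[(unsigned char)*p]++;

    /* Second pass: the first character with a count of 1 wins. */
    for (const char *p = s; *p != '\0'; p++)
        if (counts[(unsigned char)*p] == 1)
            return *p;

    return '_';
}
```

For the input from the question, "abcdefghijklmnopqrstuvwxyziflskecznslkjfabe", this returns 'd'.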

Pointers to string C

I'm trying to write a function that returns 1 if every letter in “word” appears in “s”.
for example:

containsLetters1("this_is_a_long_string","gas") returns 1
containsLetters1("this_is_a_longstring","gaz") returns 0
containsLetters1("hello","p") returns 0
I can't understand why it's not right:
#include <stdio.h>
#include <string.h>
#define MAX_STRING 100
int containsLetters1(char *s, char *word)
{
int j,i, flag;
long len;
len=strlen(word);
for (i=0; i<=len; i++) {
flag=0;
for (j=0; j<MAX_STRING; j++) {
if (word==s) {
flag=1;
word++;
s++;
break;
}
s++;
}
if (flag==0) {
break;
}
}
return flag;
}
int main() {
char string1[MAX_STRING] , string2[MAX_STRING] ;
printf("Enter 2 strings for containsLetters1\n");
scanf ("%s %s", string1, string2);
printf("Return value from containsLetters1 is: %d\n",containsLetters1(string1,string2));
return 0;
}

Try these:
for (i=0; i < len; i++)... (use < instead of <=, since otherwise you would take one additional character);
if (word==s) should be if (*word==*s) (you compare characters stored at the pointed locations, not pointers);
Pointer s advances, but it should get back to the start of the word s, after reaching its end, i.e. s -= len after the for (j=...);
s++ after word++ is not needed, you advance the pointer by the same amount, whether or not you found a match;
flag should be initialized with 1 when declared.

Ah, that should be if(*word == *s); you need to use the indirection operator. Also, as hackss said, flag = 0; must be outside the first for() loop.

Unrelated, but you should probably replace scanf with fgets, or use scanf with a length specifier. For example:
scanf("%99s",string1)
Things I can see wrong at first glance:
Your loop goes over MAX_STRING, it only needs to go over the length of s.
Your iteration should cover only the length of the string, but indexes start at 0 and not 1. for (i=0; i<=len; i++) is not correct.
You should also compare the contents of the pointer and not the pointers themselves. if(*word == *s)
The pointer advance logic is incorrect. Maybe treating the pointer as an array could simplify your logic.
Another unrelated point: A different algorithm is to hash the characters of string1 to a map, then check each character of the string2 and see if it is present in the map. If all characters are present then return 1 and when you encounter the first one that is not present then return 0. If you are only limited to using ASCII characters a hashing function is very easy. The longer your ASCII strings are the better the performance of the second approach.
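A minimal sketch of that table idea follows, assuming single-byte characters so that the "hash" is just the character value used as an array index. The name contains_letters_table is invented for this example.

```c
#include <limits.h>

/* Hypothetical name. Returns 1 if every character of word occurs
 * somewhere in s, and 0 otherwise. */
int contains_letters_table(const char *s, const char *word)
{
    unsigned char seen[UCHAR_MAX + 1] = {0};

    /* Record every character that appears in s. */
    for (; *s != '\0'; s++)
        seen[(unsigned char)*s] = 1;

    /* Fail on the first character of word that was never recorded. */
    for (; *word != '\0'; word++)
        if (!seen[(unsigned char)*word])
            return 0;

    return 1;
}
```

Each string is walked once, so the cost grows with the sum of the two lengths rather than with their product.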

Here is a one-liner solution, in keeping with Henry Spencer's Commandment 7 for C Programmers.
#include <string.h>
/*
* Does l contain every character that appears in r?
*
* Note degenerate cases: true if r is an empty string, even if l is empty.
*/
int contains(const char *l, const char *r)
{
return strspn(r, l) == strlen(r);
}
However, the problem statement is not about characters, but about letters. To solve the problem as literally given in the question, we must remove non-letters from the right string. For instance if r is the word error-prone, and l does not contain a hyphen, then the function returns 0, even if l contains every letter in r.
If we are allowed to modify the string r in place, then what we can do is replace every non-letter in the string with one of the letters that it does contain. (If it contains no letters, then we can just turn it into an empty string.)
void nuke_non_letters(char *r)
{
static const char *alpha =
"abcdefghijklmnopqrstuvwxyz"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
while (*r) {
size_t letter_span = strspn(r, alpha);
size_t non_letter_span = strcspn(r + letter_span, alpha);
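/* For a leading run of non-letters, borrow the letter that follows it ('\0' if there is none). */
char replace = (letter_span != 0) ? *r : r[non_letter_span];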
memset(r + letter_span, replace, non_letter_span);
r += letter_span + non_letter_span;
}
}
This also brings up another flaw: letters can be upper and lower case. If the right string is A, and the left one contains only a lower-case a, then we have failure.
One way to fix it is to filter the characters of both strings through tolower or toupper.
A third problem is that a letter is more than just the 26 letters of the English alphabet. A modern program should work with wide characters and recognize all Unicode letters as such so that it works in any language.
By the time we deal with all that, we may well surpass the length of some of the other answers.
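To give a feel for the first two fixes (skipping non-letters and ignoring case), here is a sketch that swaps the strspn one-liner for a lookup table. It is an illustration under stated assumptions, not the answer's method: it handles single-byte characters only, it relies on isalpha and tolower from the current C locale, and the name contains_letters_nocase is made up.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical name. Does l contain every letter that appears in r,
 * ignoring case and ignoring non-letters in r? */
int contains_letters_nocase(const char *l, const char *r)
{
    unsigned char seen[UCHAR_MAX + 1] = {0};

    /* Record the lower-case form of every character in l. */
    for (; *l != '\0'; l++)
        seen[tolower((unsigned char)*l)] = 1;

    /* Only letters of r are checked; hyphens, digits, etc. are skipped. */
    for (; *r != '\0'; r++) {
        unsigned char c = (unsigned char)*r;
        if (isalpha(c) && !seen[tolower(c)])
            return 0;
    }
    return 1;
}
```

With this version, a left string of "a prone error" is accepted for the right string "Error-Prone", since the hyphen is skipped and case is folded. The Unicode problem described above remains unsolved.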

Extending the idea in Rajiv's answer, you might build the character map incrementally, as in containsLetters2() below.
The containsLetters1() function is a simple brute force implementation using the standard string functions. If there are N characters in the string (haystack) and M in the word (needle), it has a worst-case performance of O(N*M) when the characters of the word being looked for only appear at the very end of the searched string. The strchr(needle, needle[i]) >= &needle[i] test is an optimization if there are likely to be repeated characters in the needle; if there won't be any repeats, it is a pessimization (but it can be removed and the code still works fine).
The containsLetters2() function searches through the string (haystack) at most once and searches through the word (needle) at most once, for a worst case performance of O(N+M).
#include <assert.h>
#include <stdio.h>
#include <string.h>
static int containsLetters1(char const *haystack, char const *needle)
{
for (int i = 0; needle[i] != '\0'; i++)
{
if (strchr(needle, needle[i]) >= &needle[i] &&
strchr(haystack, needle[i]) == 0)
return 0;
}
return 1;
}
static int containsLetters2(char const *haystack, char const *needle)
{
char map[256] = { 0 };
size_t j = 0;
for (int i = 0; needle[i] != '\0'; i++)
{
unsigned char c_needle = needle[i];
if (map[c_needle] == 0)
{
/* We don't know whether needle[i] is in the haystack yet */
unsigned char c_stack;
do
{
c_stack = haystack[j++];
if (c_stack == 0)
return 0;
map[c_stack] = 1;
} while (c_stack != c_needle);
}
}
return 1;
}
int main(void)
{
assert(containsLetters1("this_is_a_long_string","gagahats") == 1);
assert(containsLetters1("this_is_a_longstring","gaz") == 0);
assert(containsLetters1("hello","p") == 0);
assert(containsLetters2("this_is_a_long_string","gagahats") == 1);
assert(containsLetters2("this_is_a_longstring","gaz") == 0);
assert(containsLetters2("hello","p") == 0);
}
Since you can see the entire scope of the testing, this is not anything like thoroughly tested, but I believe it should work fine, regardless of how many repeats there are in the needle.

Pointer Arithmetic for String Parsing

So I'm trying to get a grasp of pointers/arrays by creating a basic function, strend, that returns 1 if a substring occurs at the end of a given string and 0 if not. I realize I could accomplish this by measuring the length of a char array, subtracting the length of the substring from that length, and starting my program there, but I kind of want to do it the way my function does it to get a stronger grasp of pointer arithmetic. So here is the program:
#include <stdio.h>
#define NELEMS(x) (sizeof(x) / sizeof(x[0]))
int strend(char *string, char *substring, int substringLength){
int count; /*keep track of how many number of chars that match in a row*/
while(*string != '\0'){
count = 0;
while(*string == *substring){
if (count + 1 == substringLength) return 1; /*if matches = length of str*/
count++;
string ++;
substring ++;
}
if (count == 0) string++; /*only increment outer loop if inner loop has done no work*/
else {
substring - count; /*reset substring, don't increment string... BUGGY*/
}
}
return 0;
}
int main(){
char string[] = "John Coltrane";
char substring[] = "Coltrane";
int substringLength = NELEMS(substring);
printf("%d \n", strend(string, substring, substringLength));
char string2[] = "John Coltrane is Awesome Coltrane";
char substring2[] = "Coltrane";
int substringLength2 = NELEMS(substring);
printf("%d \n", strend(string2, substring2, substringLength2));
return 1;
}
For the first test strings, string and substring, I get the right result, returning 1 because "Coltrane" is at the end of the string. Similarly, if I take the "Coltrane" out of string2, I get the correct result, returning 0 because the string does not end with Coltrane.
However, for the version of string2 that you see above, I also get zero, and the problem lies in the fact that strend does not reset substring after I iterate over it and increment it while it matches part of the main string. This is fine when the first instance of substring is at the end of the string, but not when there are two instances, as in string2. I thought that substring - count would decrement the pointer back to the beginning of the substring array, but it does not appear to be doing that.
If I change that expression with substring--, it does show the last character of the substring, but is an expression like for(int i = 0; i < count; i++, substring--) really the only way to do this?
Edit: Replacing substring - count with for(; count > 0; count--, substring--) seems like a pretty elegant one liner and it works for me, but I still have a gut feeling there's a better way.

This is an expression which does not change the value of any variable:
substring - count;
This is how you change the value of the variable:
substring -= count;
The other error in your code is to only increment string when count is 0. What if there is a partial match like "Cole Slaw"?
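For comparison, here is a sketch that sidesteps the partial-match problem by working from the end of both strings, which is the approach the question set aside. It still exercises pointer arithmetic (subtraction and decrement). The name strend_tail and the two-argument signature are choices made for this example.

```c
/* Hypothetical rewrite: returns 1 if string ends with substring,
 * and 0 otherwise. */
int strend_tail(const char *string, const char *substring)
{
    const char *s = string;
    const char *t = substring;

    while (*s != '\0') s++;   /* s now points at string's terminator */
    while (*t != '\0') t++;   /* t now points at substring's terminator */

    /* Pointer subtraction gives the two lengths. */
    if (t - substring > s - string)
        return 0;             /* substring is longer than string */

    /* Walk both pointers backwards while the characters agree. */
    while (t > substring) {
        if (*--s != *--t)
            return 0;
    }
    return 1;
}
```

Because the comparison starts at the tail, earlier occurrences such as the first "Coltrane" in string2, or a partial match like "Cole Slaw", never come into play.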

Swap each even pair of characters with next pair of characters in array

I want to create a function which splits a string into two sets of characters, character by character, then merges the second set before the first, character by character. For example, the string "KILOS" (odd number of chars) would split into "KL", "IO" and then "S", and the final output would look like "IKOLS". This means that in every odd case, the last character of the original string holds the last place in the new string. The encode function expects s2 to point to a string containing a string that is converted from s1. Any help or hint would be appreciated! Thank you.
(I have deleted my code because actual students might copy it and get caught plagiarising. Sorry.)

The code you have written is complicated; simply looping over the string solves the problem.
void convert (char *s1, char *s2){
size_t len = strlen(s1);
for( size_t i = 0; i < len; i+=2 ){
if(i+1 < len){
s2[i+1] = s1[i];
s2[i] = s1[i+1];
}else{
s2[i] = s1[i];
}
}
s2[len]=0;
}
You can use the function like this:
char s[6]="hello";
char t[6];
convert(s,t);
printf("%s\n",t);
Here, of course, it is assumed that s2 has enough memory to hold the processed string. This is nothing more than copying logic: you consider two characters at a time and swap them while copying. At the end you may reach a position where an element has no pair (odd number of elements). Then you simply copy it and move on.
In case you don't know what array subscripting means: s1[i] is the same as *(s1+i).
Edit1
Also in your adaptation of my code in the last line you have put *s2 = 0.
It should be
*(s2+len)=0;
Another thing is in your readline code you don't need these two lines. You can do it simply like this:-
int read_line(char *str, int n)
{
int words; int store=0;
while((words=getchar())!='\n' && words!=EOF) /* also stop at end of input */
{
if(store<n)
{
*str++=words;
store++;
}
}
*str=0;
return store;
}
And
void encode(char *s1, char *s2)
{
int len = strlen(s1);
for( int i = 0; i < len; i+=2 ){
if(i+1 < len){
*(s2+i+1) = *(s1+i);
*(s2+i) = *(s1+i+1);
}else{
*(s2+i) = *(s1+i);
}
}
*(s2+len)='\0'; //<---- note this
}