How can I go about finding balance in a string in C?

I want this program to solve the problem recursively, using a stack implementation with push and pop. I have push and pop done, as well as the other stack functions.
A string the user enters may contain only the characters below; any other character makes it unbalanced.
'(', ')', '{', '}', '[', ']'
Examples of balanced strings:
()
(())
()()
{()()}
[]
[()[]{}]()
etc..
An unbalanced string looks like this:
{}}
()[}
[()}
etc..
This is the recursive definition of a balanced string:
(BASIS) The empty string is balanced.
(NESTING) If s is a balanced string, then (s), [s], and {s} are balanced.
(CONCATENATION) If A and B are both balanced strings, then AB is also balanced.
I do not know what my base case would be or how to implement this with recursion. I can do it without recursion, but I want to learn recursion. Any help?

I think you want to implement the "Balanced Parentheses" problem.
You can solve it easily with a stack, without any recursion.
You can follow this outline (C++-style pseudocode):
// stk is a stack of char
// s is the input string
bool balanced = true;
for (int i = 0; i < s.size() && balanced; i++)
{
    if (s[i]=='(' || s[i]=='[' || s[i]=='{')
        stk.push(s[i]);
    else if (s[i]==')' && !stk.empty() && stk.top()=='(')
        stk.pop();
    else if (s[i]==']' && !stk.empty() && stk.top()=='[')
        stk.pop();
    else if (s[i]=='}' && !stk.empty() && stk.top()=='{')
        stk.pop();
    else
        balanced = false; // mismatched closer or invalid character
}
The string is balanced only if the flag is still true and the stack is empty after the loop.
This question looks the same as yours and may help: Basic Recursion, Check Balanced Parenthesis.
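Since the question asks for C, here is a minimal sketch of the same loop-and-flag idea in plain C. It uses a local char array as the stack instead of the asker's own push and pop, and the names opener_for and is_balanced_iter are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the opening bracket that pairs with closer c, or 0 if c is not a closer. */
static char opener_for(char c)
{
    switch (c) {
    case ')': return '(';
    case ']': return '[';
    case '}': return '{';
    default:  return 0;
    }
}

/* Returns 1 if s consists only of correctly nested (), [] and {} pairs, else 0. */
int is_balanced_iter(const char *s)
{
    size_t len, top = 0, i;
    char *stack;
    int balanced = 1;                 /* the flag mentioned in the answer */

    assert(s != NULL);
    len = strlen(s);
    stack = malloc(len + 1);          /* at most len openers can be pending */
    if (stack == NULL) return 0;

    for (i = 0; i < len && balanced; i++) {
        char c = s[i];
        if (c == '(' || c == '[' || c == '{') {
            stack[top++] = c;         /* push the opener */
        } else if (opener_for(c) != 0 && top > 0 && stack[top - 1] == opener_for(c)) {
            top--;                    /* closer matches the most recent opener: pop */
        } else {
            balanced = 0;             /* wrong closer, empty stack, or invalid character */
        }
    }
    if (top != 0) balanced = 0;       /* openers left over */
    free(stack);
    return balanced;
}
```

Calling is_balanced_iter("[()[]{}]()") should report balanced, while "{}}" and "()[}" should not.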

Well, having double as your stack's element type is rather wasteful, but I'll play along:
int is_balanced(char *ins) {
SPointer st = stk_create();
int rval = 1;
for (int i = 0; i < strlen(ins); i += 1) {
int c = ins[i];
if ('(' == c) stk_push(st, (ElemType)')');
else if ('[' == c) stk_push(st, (ElemType)']');
else if ('{' == c) stk_push(st, (ElemType)'}');
else if (')' == c || ']' == c || '}' == c) {
if (stk_empty(st) || c != stk_pop(st)) {
rval = 0;
break;
}
} else {
rval = 0;
break;
}
}
if (! stk_empty(st)) rval = 0;
stk_free(st);
return rval;
}

Recursively done...
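const char* balanced_r(const char* s, int* r)
{
    /* openers at indices 0..2, matching closers at indices 4..6 */
    const char* brackets = "([{\0)]}";
    const char* b = brackets;
    if (s == 0 || *s == 0) return s;
    while (*b && *b != *s) b++;
    if (*b)
    {
        /* *s is an opener: parse the nested part, then expect the matching closer */
        s = balanced_r(s + 1, r);
        if (*s != *(b + 4)) { *r = 0; return s; }
        return balanced_r(s + 1, r);
    }
    return s;
}
int balanced(const char* s)
{
    int r = 1;
    const char* end = balanced_r(s, &r);
    /* anything left unparsed is a stray closer or an invalid character */
    if (end && *end) r = 0;
    return r;
}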

Here is a demonstration program written in C++ that you can use as the algorithm and rewrite in C.
#include <iostream>
#include <iomanip>
#include <stack>
#include <cstring>
bool balance( const char *s, std::stack<char> &st )
{
const char *open = "({[<";
const char *close = ")}]>";
if ( *s == '\0' )
{
return st.empty();
}
const char *p;
if ( ( p = std::strchr( open, *s ) ) != nullptr )
{
st.push( *s );
return balance( s + 1, st );
}
else if ( ( p = std::strchr( close, *s ) ) != nullptr )
{
if ( !st.empty() && st.top() == open[p-close] )
{
st.pop();
return balance( s + 1, st );
}
else
{
return false;
}
}
else
{
return false;
}
}
int main()
{
for ( const char *s : {
"()", "(())", "()()", "{()()}", "[]", "[()[]{}]()",
"{}}", "()[}", "[()}"
} )
{
std::stack<char> st;
std::cout <<'\"' << s << "\" is balanced - "
<< std::boolalpha << balance( s, st )
<< std::endl;
}
return 0;
}
The program output is
"()" is balanced - true
"(())" is balanced - true
"()()" is balanced - true
"{()()}" is balanced - true
"[]" is balanced - true
"[()[]{}]()" is balanced - true
"{}}" is balanced - false
"()[}" is balanced - false
"[()}" is balanced - false

Related

How do I scan a string for numbers, and convert the numbers to an integer form which I can access?

For instance, if a character array contains "86*20/20/(10)", I want to be able to store 86 in an int variable that I can manipulate. I do not want to just count how many numbers there are; I need each whole number itself. Also, is it possible to take the whole array and subtract '0' to convert the whole thing to a number?
int is_operand(char item)
{
if (item != '(' && item != ')' && item != '+' && item != '-' && item != '/' && item != '*' && item != '%')
{
return 1;
}
else
{
return 0;
}
}
void expressionQ(char *infix, Queue* qPtr)
{
// Write your code here
// this is used to iterate the list
int i = 0;
int j = 0;
int num;
char num2;
char* result;
printf("This %c\n", infix[0]);
num = infix[0] - '0';
printf("testingHERE %d", num);
// while (infix[i] != '\0')
// {
// if (is_operand(infix[i]))
// {
// num = infix[i];
// i++;
// // while (is_operand(infix[i]))
// // {
// // num2 = infix[i];
// // result = malloc(strlen(&num) + strlen(&num2)+1);
// // strcpy(result, &num);
// // strcat(result, &num2);
// // i++;
// }
// }
// i++;
// }
// // printf("Tests %c\n", result[0]);
// }
Write some functions that accept (or expect) a specific valid input.
void skip_whitespace( char ** s )
{
while (**s && isspace( (unsigned char)**s )) *s += 1;
}
bool accept_operator( char desired_operator, char ** s )
{
skip_whitespace( s );
if (**s == desired_operator)
{
*s += 1;
return true;
}
return false;
}
int expect_number( char ** s )
{
// success only if NON-whitespace was scanned,
// so let us get rid of any leading whitespace ourselves
char * first_non_ws = *s;
skip_whitespace( &first_non_ws );
// now we can try to extract a number
int result = strtol( first_non_ws, s, 10 );
if (first_non_ws == *s)
{
fprintf( stderr, "Expected a number at \"%s\"\n", *s );
exit( 1 );
}
return result;
}
You can then get information by asking for it:
char s[1000] = "12 * 3";
char * p = s;
int n1 = expect_number( &p ); // extracts '12'
if (accept_operator( '*', &p )) ... // successfully extracts '*'
Your next problem will be properly handling the operators. You should google around “recursive descent parser”. Here is a freebie for add/subtract:
int term( char ** s )
{
int lhs = factor( s );
while (true)
{
if (accept_operator( '+', s )) lhs += factor( s );
else if (accept_operator( '-', s )) lhs -= factor( s );
else break;
}
return lhs;
}
Unless you are doing something with lower operator precedence than addition, that should be the first thing your expression() function calls.
int expression( char ** s )
{
return term( s );
}
And the last thing should be parentheses, which should either:
accept an open parenthesis, call expression(), and accept a close parenthesis, or
expect a number
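The parenthesis step described above could be sketched like this. It is only an illustration under a few assumptions: the names primary, factor, fail and evaluate are mine, I have put multiplication and division in factor (between term and the parenthesis level), and the helpers are repeated with const pointers so the block compiles on its own.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

static int expression(const char **s);  /* primary() recurses into this */

static void fail(const char *what, const char *where)
{
    fprintf(stderr, "%s at \"%s\"\n", what, where);
    exit(1);
}

static void skip_whitespace(const char **s)
{
    while (**s && isspace((unsigned char)**s)) *s += 1;
}

static int accept_operator(char desired_operator, const char **s)
{
    skip_whitespace(s);
    if (**s != desired_operator) return 0;
    *s += 1;
    return 1;
}

static int expect_number(const char **s)
{
    char *end;
    long value;
    skip_whitespace(s);
    value = strtol(*s, &end, 10);
    if (end == *s) fail("Expected a number", *s);
    *s = end;
    return (int)value;
}

/* Highest precedence: a parenthesised expression or a plain number. */
static int primary(const char **s)
{
    if (accept_operator('(', s)) {
        int value = expression(s);
        if (!accept_operator(')', s)) fail("Expected ')'", *s);
        return value;
    }
    return expect_number(s);
}

/* Multiplication and division bind tighter than addition and subtraction. */
static int factor(const char **s)
{
    int lhs = primary(s);
    for (;;) {
        if (accept_operator('*', s)) lhs *= primary(s);
        else if (accept_operator('/', s)) {
            int rhs = primary(s);
            if (rhs == 0) fail("Division by zero", *s);
            lhs /= rhs;
        }
        else break;
    }
    return lhs;
}

static int term(const char **s)
{
    int lhs = factor(s);
    for (;;) {
        if (accept_operator('+', s)) lhs += factor(s);
        else if (accept_operator('-', s)) lhs -= factor(s);
        else break;
    }
    return lhs;
}

static int expression(const char **s) { return term(s); }

/* Convenience wrapper: parses the whole string and rejects trailing input. */
int evaluate(const char *text)
{
    const char *p = text;
    int result;
    assert(text != NULL);
    result = expression(&p);
    skip_whitespace(&p);
    if (*p) fail("Unexpected input", p);
    return result;
}
```

With integer division, evaluate("86*20/20/(10)") works out to 8.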
Now you can parse stuff in your code easily enough:
char s[1000] = "86*20/20/(10)";
char * p = s;
int result = expression( &p );
// if it matters, check to make sure there isn’t any other input left unread
skip_whitespace( &p );
if (*p)
{
fprintf( stderr, "Expression not properly terminated!\n" );
exit( 1 );
}
Recursive Descent isn’t the only way to do this stuff, but it is the most direct.
The other good way to handle infix expressions is using Shunting Yard. That, however, requires a bit more setup and post-processing.
You must first properly tokenize things into an array/list/whatever
Tokenization is identifying and storing objects
You must pre-verify that the infix expression is properly-formed
Because SY is terrible at validation
You must afterwards compute the result!
SY is an infix→postfix (RPN) converter. It doesn’t actually process the RPN expression.
Both RD and SY are very, very easy, but I suspect you are just trying to get through a simple homework assignment, and RD is really just the easier path for you, methinks.
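Since Shunting Yard only produces RPN, the final step of computing the result still needs a small evaluator. Here is a minimal sketch, assuming the postfix expression is a space-separated string of non-negative integers and the four basic operators; eval_rpn and RPN_STACK_MAX are names made up for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define RPN_STACK_MAX 64

/* Evaluates a postfix (RPN) expression such as "4 2 - 5 * 3 +".
   Stores the result in *out and returns 1 on success, 0 on malformed input. */
int eval_rpn(const char *rpn, int *out)
{
    int stack[RPN_STACK_MAX];
    int top = 0;

    assert(rpn != NULL && out != NULL);
    while (*rpn) {
        int lhs, rhs;
        if (isspace((unsigned char)*rpn)) { rpn++; continue; }
        if (isdigit((unsigned char)*rpn)) {
            char *end;
            long value = strtol(rpn, &end, 10);
            if (top == RPN_STACK_MAX) return 0;   /* stack overflow */
            stack[top++] = (int)value;            /* operands are pushed */
            rpn = end;
            continue;
        }
        if (top < 2) return 0;                    /* operator without two operands */
        rhs = stack[--top];
        lhs = stack[--top];
        switch (*rpn) {
        case '+': stack[top++] = lhs + rhs; break;
        case '-': stack[top++] = lhs - rhs; break;
        case '*': stack[top++] = lhs * rhs; break;
        case '/':
            if (rhs == 0) return 0;               /* division by zero */
            stack[top++] = lhs / rhs;
            break;
        default:
            return 0;                             /* unknown character */
        }
        rpn++;
    }
    if (top != 1) return 0;                       /* leftover operands */
    *out = stack[0];
    return 1;
}
```

For example, the postfix form of 86*20/20/(10) is "86 20 * 20 / 10 /", which evaluates to 8 with integer division.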

Find all vowels which are never used in any word of the text

I wrote code that finds all vowels that are used in every word of the text, and I do not know how to adapt it to find the vowels that are never used in any word. Do I need to rewrite all of the code?
So, I need to have such results:
Text:
wwe w fa
Result:
o u i
#include <stdio.h>
#include <ctype.h>
#define vowel (1u<<('a'-'a') | 1u<<('e'-'a') | 1u<<('i'-'a') | 1u<<('o'-'a') | 1u<<('u'-'a'))
unsigned int char_to_set(char c)
{
c = tolower(c);
if (c < 'a' || c > 'z')
return 0; else return 1u<<(c-'a');
}
int letter(int c)
{
return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
int sign(int c)
{
return c == ' ' || c == ',' || c == '\n' || c == '\t';
}
int main ()
{
int c, flag=0;
char alpha;
unsigned int sl = 0, mn = vowel;
FILE *pf;
pf=fopen("l13.txt","r");
printf ("Ishodnyi text:\n\n");
while (!feof(pf))
{
c=getc(pf);
printf("%c",c);
switch (flag)
{
case (0):
{
if (letter(c))
{
sl = sl | char_to_set(c);
flag = 1;
}
if (sign(c)) flag = 0;
break;
}
case (1):
{
if (letter(c))
{
sl = sl | char_to_set(c);
flag = 1;
}
if (sign(c))
{
mn = mn & sl;
sl = 0;
flag = 0;
}
break;
}
}
}
if (mn == 0)
{
printf ("\n\n no vowels are included in all word");
}
else
{
printf ("\n\n vowels are included in all word:\n");
for(alpha='a'; alpha <= 'z'; alpha++)
{
if((mn & char_to_set(alpha)) != 0)
{
printf("%c ", alpha);
}
}
}
fclose(pf);
getchar();
return 0;
}
There are many ways to do what you want. Below is one way. There may be better ways but hopefully it will give you some ideas to improve it.
If I understand your code correctly, mn contains the bit mask of vowels present in the text. So you can write a function to check all the vowel bits that are not set. The following code checks for a and e only, but it should be clear how to extend it to the other vowels.
#define A_MASK (1u<<('a'-'a'))
#define E_MASK (1u<<('e'-'a'))
/*
* Convenience struct for associating masks with characters.
* Could be done without this by deriving the character from the mask
* but this (IMHO) makes the code simpler to understand.
*/
struct {
unsigned int mask;
char c;
} masks[] = { { A_MASK, 'a'} , { E_MASK, 'e'} };
void vowels_not_present (unsigned int vowels_mask)
{
int ix;
for (ix = 0; ix < sizeof(masks) / sizeof(masks[0]); ix++) {
if (!(vowels_mask & masks[ix].mask)) {
printf("vowel %c is not present\n", masks[ix].c);
}
}
}
Then in your main invoke the above function:
vowels_not_present(mn);
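One caveat: in the posted program mn is ANDed with each word's letter set, so it ends up holding the vowels common to every word. To get the vowels that never appear at all, the mask would need to be the union (OR) of all letters seen. A minimal sketch of that variant, assuming the text is already in a string rather than read from the file; VOWEL_MASK, unused_vowels and vowels_to_string are names made up for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Bit i is set when the letter 'a' + i is a vowel. */
#define VOWEL_MASK (1u<<('a'-'a') | 1u<<('e'-'a') | 1u<<('i'-'a') | 1u<<('o'-'a') | 1u<<('u'-'a'))

/* Returns a bit mask of the vowels that never occur anywhere in text. */
unsigned int unused_vowels(const char *text)
{
    unsigned int seen = 0;            /* union of all letters in the text */

    assert(text != NULL);
    for (; *text; text++) {
        int c = tolower((unsigned char)*text);
        if (c >= 'a' && c <= 'z')
            seen |= 1u << (c - 'a');
    }
    return VOWEL_MASK & ~seen;        /* vowels whose bit was never set */
}

/* Writes the letters contained in mask into out, which needs room for 27 chars. */
void vowels_to_string(unsigned int mask, char *out)
{
    char c;
    assert(out != NULL);
    for (c = 'a'; c <= 'z'; c++)
        if (mask & (1u << (c - 'a')))
            *out++ = c;
    *out = '\0';
}
```

For the sample text "wwe w fa" only e and a are seen, so the remaining vowels are i, o and u.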

Calculator in C using stack

I'm trying to create a calculator in c, which can calculate with priority and get right results for examples like these:
((5+5)/3)*3) -- > 9
((1+2) * 3) -- > 9
These examples my code below can calculate. But for something like this
(2+5) * (2+5), my program gives wrong answer.
I'm using two stacks, one for operators and one for numbers. It works on this principle:
((4 - 2) * 5) + 3 --> normal infix expression, which in prefix form is:
+ * - 4 2 5 3
Pseudo code:
Read + (an operation), push it onto the stack,
Read * (an operation), push it onto the stack,
Read - (an operation), push it onto the stack,
Read 4 (a number), the top of the stack is not a number, so push it onto the stack.
Read 2 (a number), the top of the stack is a number, so pop from the stack twice, you get 4 - 2, calculate it (2), and push the result (2) onto the stack.
Read 5 (a number), the top of the stack is a number, so pop from the stack twice, you get 2 * 5, push the result (10) onto the stack.
Read 3 (a number), the top of the stack is a number, so pop from the stack twice, you get 3 + 10, push the result (13) onto the stack.
Nothing left to read, pop from the stack and return the result (13).
Actual code:
#include <stdio.h>
#include<ctype.h>
#include<stdlib.h>
#include<string.h>
#define MAXSIZE 102
typedef struct
{
char stk[MAXSIZE];
int top;
}STACK;
typedef struct stack
{
int stk[MAXSIZE];
int itop;
}INT_STACK;
STACK s;
INT_STACK a;
void push(char);
char pop(void);
void display(void);
int main()
{
a.itop = 0;
char string[MAXSIZE],vyb,vyb2;
int cislo1,cislo2,vysledok;
while (gets(string) != NULL){
for(int j = strlen(string); j > 0; j--){
if(string[j] == '*' || string[j] == '/' || string[j] == '+' || string[j] == '-')
push(string[j]);
}
//display();
for(int j = 0; j < strlen(string); j++){
if(isdigit(string[j])&&!(a.itop)){
//display();
char pomoc[2];
pomoc[0] = string[j];
pomoc[1] = '\0';
int_push(atoi(pomoc));
}
else if(isdigit(string[j])&&(a.itop)){
cislo1 = int_pop();
vyb2 = pop();
char pomoc[2];
pomoc[0] = string[j];
pomoc[1] = '\0';
cislo2 = atoi(pomoc);
if(vyb2 == '+')
vysledok = cislo1+cislo2;
else if(vyb2 == '-')
vysledok = cislo1-cislo2;
else if(vyb2 == '*')
vysledok = cislo1*cislo2;
else if(vyb2 == '/')
vysledok = cislo1 / cislo2;
//printf(" v %d",vysledok);
int_push(vysledok);
}
}
printf("%d\n",int_pop());
}
}
/* Function to add an element to the stack */
void push (char c)
{
s.top++;
s.stk[s.top] = c;
//printf ("pushed element is = %c \n", s.stk[s.top]);
}
/* Function to delete an element from the stack */
char pop ()
{
char num = s.stk[s.top];
// printf ("poped element is = %c\n", s.stk[s.top]);
s.top--;
return(num);
}
int empty()
{
if (s.top == - 1)
{
printf ("Stack is Empty\n");
return (s.top);
}
return 1;
}
void display ()
{
int i;
if (!empty)
{
printf ("Stack is empty\n");
return;
}
else
{
printf ("\n The status of the stack is \n");
for (i = s.top; i >= 0; i--)
{
printf ("%c\n", s.stk[i]);
}
}
printf ("\n");
}
void int_push (int c)
{
a.itop++;
a.stk[a.itop] = c;
//printf ("pushed element is = %d \n", a.stk[a.itop]);
}
/* Function to delete an element from the stack */
int int_pop ()
{
int num = a.stk[a.itop];
// printf ("poped element is = %d\n", a.stk[a.itop]);
a.itop--;
return(num);
}
Is there any other way to create a calculator that respects operator priority and gives correct answers?
Thanks for your response.
Step through the code with breakpoints and you'll see that the parentheses are ignored: the operators are popped in the order + * + and applied left to right to the numbers 2 5 2 5. The problem is that your interpreter evaluates this as ((2+5)*2)+5 = 19 instead of (2+5) * (2+5) = 49.
You might be wondering how to solve this. There's no simple single solution: you could either fix your own interpreter or build a whole new mechanism, because the way you build expressions just can't handle more than one pair of parentheses.
For example, you may want to calculate all the values in parentheses separately before even building the expression, possibly using recursion for nested parentheses. However, if you choose that method, you might want to change the way you work with the expressions entirely, because that's a different approach.
If you need me to show actual code examples to explain this further using parts of the code you made, just ask and I'll edit and provide what you need.
Either way, I really advise you to look up how interpreters work in general. You could learn a lot about analysing strings and working with different inputs, and people have built similar calculators before.
EDIT: you asked for examples, so here you go. This is an example of a completely different method using recursion. This way, you handle a single pair of parentheses at a time, and thus you won't have the problem you currently do. Note: the source I'm basing this on (pretty much copy-pasted, with edits from the thread and some personal comments) is from Code Review on Stack Exchange.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void getInput(char * in) {
printf("> ");
fgets(in, 256, stdin);
}
int isLeftParantheses(char p) {
if (p == '(') return 1;
else return 0;
}
int isRightParantheses(char p) {
if (p == ')') return 1;
else return 0;
}
int isOperator(char p) {
if (p == '+' || p == '-' || p == '*' || p == '/') return p;
else return 0;
}
int performOperator(int a, int b, char p) {
switch(p) {
case '+': return a+b;
case '-': return a-b;
case '*': return a*b;
case '/':
if (b == 0) { printf("Can't divide by 0, aborting...\n"); exit(1); } // guard against division by zero
return a/b;
default:
puts("Bad value in switch.\n"); // A replacement which was mentioned in the thread- better have a default response just in case something goes wrong.
break;
}
return 0;
}
char isDigit(char p) {
if (p >= '0' && p <= '9') return 1;
else return 0;
}
int charToDigit(char p) {
if (p >= '0' && p <= '9') return p - '0';
else return 0;
}
int isNumber(char * p) {
while(*p) {
if (!isDigit(*p)) return 0;
p++;
}
return 1;
}
int len(char * p)
{
return (int) strlen(p); // This was bugged in the source, so I fixed it like the thread advised.
}
int numOfOperands(char * p) {
int total = 0;
while(*p) {
if (isOperator(*p)) total++;
p++;
}
return total+1;
}
int isMDGRoup(char *p)
{
for(; *p; p++) // used to be a while loop in the source, but this is better imho. more readable, also mentioned on the thread itself.
{
if (!isDigit(*p) && *p != '/' && *p != '*') return 0;
}
return 1;
}
int getLeftOperand(char * p, char * l) {
// Grab the left operand in p, put it in l,
//and return the index where it ends.
int i = 0;
// Operand is part of multi-*/ group
if (isMDGRoup(p)) {
while(1) {
if (*p == '*' || *p == '/') break;
l[i++] = *p++;
}
return i;
}
// Operand is in parantheses (so that's how you write it! sorry for my bad english :)
if(isLeftParantheses(*p)) {
int LeftParantheses = 1;
int RightParantheses= 0;
p++;
while(1) {
if (isLeftParantheses(*p)) LeftParantheses++;
if (isRightParantheses(*p)) RightParantheses++;
if (isRightParantheses(*p) && LeftParantheses == RightParantheses)
break;
l[i++] = *p++;
}
// while (!isRightParantheses(*p)) {
// l[i++] = *p++;
// }
l[i] = '\0';
return i+2;
}
// Operand is a number
while (1) {
if (!isDigit(*p)) break;
l[i++] = *p++;
}
l[i] = '\0';
return i;
}
int getOperator(char * p, int index, char * op) {
*op = p[index];
return index + 1;
}
int getRightOperand(char * p, char * l) {
// Grab the right operand (the rest of the expression) in p
// and put it in l.
while(*p && (isDigit(*p) || isOperator(*p) ||
isLeftParantheses(*p) || isRightParantheses(*p))) {
*l++ = *p++;
}
*l = '\0';
return 0;
}
int isEmpty(char * p) {
// Check if string/char is empty
if (len(p) == 0) return 1;
else return 0;
}
int calcExpression(char * p) {
// if p = #: return atoi(p)
//
// else:
// L = P.LeftSide
// O = P.Op
// R = P.RightSide
// return PerformOp(calcExpression(L), calcExpression(R), O)
// ACTUAL FUNCTION
// if p is a number, return it
if (isNumber(p)) return atoi(p);
// Get Left, Right and Op from p.
char leftOperand[256] = ""; char rightOperand[256]= "";
char op;
int leftOpIndex = getLeftOperand(p, leftOperand);
int operatorIndex = getOperator(p, leftOpIndex, &op);
int rightOpIndex = getRightOperand(p+operatorIndex, rightOperand);
printf("%s, %c, %s", leftOperand, op, rightOperand);
getchar();
if (isEmpty(rightOperand)) return calcExpression(leftOperand);
return performOperator(
calcExpression(leftOperand),
calcExpression(rightOperand),
op
);
}
int main()
{
char in[256];
while(1) {
// Read input from user
getInput(in);
if (strncmp(in, "quit", 4) == 0) break;
// Perform calculations
int result = calcExpression(in);
printf("%d\n", result);
}
}
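For comparison, the two-stack idea from the question can also be made to respect precedence and parentheses by applying pending operators before pushing a new one of lower or equal priority. This is a minimal sketch, not a fix of the posted code; Calc, CALC_MAX, precedence, apply_top and calc_infix are names made up for this illustration, and only non-negative integers and + - * / are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define CALC_MAX 128

typedef struct {
    int  values[CALC_MAX]; int nvalues;   /* number stack */
    char ops[CALC_MAX];    int nops;      /* operator stack, also holds '(' */
} Calc;

static int precedence(char op)
{
    return (op == '*' || op == '/') ? 2 : (op == '+' || op == '-') ? 1 : 0;
}

/* Pops one operator and two operands and pushes the result. Returns 0 on error. */
static int apply_top(Calc *c)
{
    char op;
    int lhs, rhs, r;
    if (c->nops < 1 || c->nvalues < 2) return 0;
    op  = c->ops[--c->nops];
    rhs = c->values[--c->nvalues];
    lhs = c->values[--c->nvalues];
    switch (op) {
    case '+': r = lhs + rhs; break;
    case '-': r = lhs - rhs; break;
    case '*': r = lhs * rhs; break;
    case '/': if (rhs == 0) return 0; r = lhs / rhs; break;
    default:  return 0;                   /* e.g. an unmatched '(' */
    }
    c->values[c->nvalues++] = r;
    return 1;
}

/* Evaluates an infix expression. Returns 1 and stores the result in *out, or 0 on error. */
int calc_infix(const char *s, int *out)
{
    Calc c = { {0}, 0, {0}, 0 };

    assert(s != NULL && out != NULL);
    while (*s) {
        if (isspace((unsigned char)*s)) {
            s++;
        } else if (isdigit((unsigned char)*s)) {
            char *end;
            if (c.nvalues == CALC_MAX) return 0;
            c.values[c.nvalues++] = (int)strtol(s, &end, 10);
            s = end;
        } else if (*s == '(') {
            if (c.nops == CALC_MAX) return 0;
            c.ops[c.nops++] = *s++;
        } else if (*s == ')') {
            /* finish everything back to the matching '(' */
            while (c.nops > 0 && c.ops[c.nops - 1] != '(')
                if (!apply_top(&c)) return 0;
            if (c.nops == 0) return 0;    /* no matching '(' */
            c.nops--;                     /* discard the '(' */
            s++;
        } else if (precedence(*s) > 0) {
            /* apply pending operators of higher or equal priority first */
            while (c.nops > 0 && precedence(c.ops[c.nops - 1]) >= precedence(*s))
                if (!apply_top(&c)) return 0;
            if (c.nops == CALC_MAX) return 0;
            c.ops[c.nops++] = *s++;
        } else {
            return 0;                     /* unknown character */
        }
    }
    while (c.nops > 0)
        if (!apply_top(&c)) return 0;
    if (c.nvalues != 1) return 0;
    *out = c.values[0];
    return 1;
}
```

With this approach (2+5) * (2+5) evaluates to 49, because each parenthesised group is reduced to a single number before the multiplication is applied.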

Searching a particular word in a matrix of characters

I was trying to search for a particular word in a matrix of characters in C but was unable to come up with a working solution.
For example:
Suppose I have to search for the word INTELLIGENT in a matrix of characters (3*9)
(Once you have picked a character from the matrix to form the word, you cannot pick it again for the same word. There is a path from any cell to all its neighboring cells. A neighbor may share an edge or a corner.)
IIIINN.LI
....TTEGL
.....NELI
Output: YES (the word INTELLIGENT can be found)
Can anybody please give a solution to the above problem?
Use a depth first search.
You can do this using a recursive algorithm. Find all the (unused) places containing the first letter, then see if it is possible to find the rest of the word on the remaining board by starting from one of the adjacent squares.
#include <stdio.h>
char Matrix[3][9] = {
{ 'I','I','I','I','N','N','.','L','I'},
{ '.','.','.','.','T','T','E','G','L'},
{ '.','.','.','.','.','N','E','L','I'}
};
char Choice[3][9] = { { 0 }, { 0 }, { 0 } };
const char WORD[] = "INTELLIGENT";
const int Len = sizeof(WORD)-1;
int Path[sizeof(WORD)-1] = { 0 };
char get(int row, int col){
if(1 > col || col > 9) return '\0';
if(1 > row || row > 3) return '\0';
if(Choice[row-1][col-1] || Matrix[row-1][col-1] == '.')
return '\0';
else
return Matrix[row-1][col-1];
}
#define toLoc(r, c) (r)*10+(c)
#define getRow(L) L/10
#define getCol(L) L%10
int search(int loc, int level){
int r,c,x,y;
char ch;
if(level == Len) return 1;//find it
r = getRow(loc);
c = getCol(loc);
ch = get(r,c);
if(ch == 0 || ch != WORD[level]) return 0;
Path[level]=toLoc(r,c);
Choice[r-1][c-1] = 'v';//marking
for(x=-1;x<=1;++x){
for(y=-1;y<=1;++y){
if(search(toLoc(r+y,c+x), level + 1)) return 1;
}
}
Choice[r-1][c-1] = '\0';//reset
return 0;
}
int main(void){
int r,c,i;
for(r=1;r<=3;++r){
for(c=1;c<=9;++c){
if(search(toLoc(r,c), 0)){
printf("YES\nPath:");
for(i=0;i<Len;++i){
printf("(%d,%d)", getRow(Path[i]), getCol(Path[i]));
}
printf("\n");
return 0;
}
}
}
printf("NO\n");
return 0;
}
I think this is what you mean, though it seems simpler than what you have been offered so far, so I may have misunderstood the question: it only checks that the letters are available, not that they are adjacent.
I use NumPy to reshape an arbitrary array into a single list of letters, then create a mask for the search term and a copy of the input list.
I tick off each letter of the search term while updating the mask.
import numpy as np
import copy
def findInArray(I, Word):
    # flatten the rows into one list of letters
    M = [list(x) for x in I]
    M = list(np.ravel(M))
    print("Letters to start: %s" % "".join(M))
    Mask = [False] * len(Word)
    T = copy.copy(M)
    for n, v in enumerate(Word):
        try:
            p = T.index(v)
        except ValueError:
            pass
        else:
            # letter found: remove it from the pool and mark it in the mask
            T[p] = ''
            Mask[n] = True
    print("Letters left over: %s" % "".join(T))
    if all(Mask): print("Found %s" % Word)
    else: print("%s not Found" % Word)
    print("\n")
    return all(Mask)
I = ["IIIINN.LI", "....TTEGL", ".....NELI"]
findInArray(I, "INTEL")
findInArray(I, "INTELLIGENT")
findInArray(I, "INTELLIGENCE")
Example output
Letters to start: IIIINN.LI....TTEGL.....NELI
Letters left over: IIIN.I....TGL.....NELI
Found INTEL
Letters to start: IIIINN.LI....TTEGL.....NELI
Letters left over: II.I.........NLI
Found INTELLIGENT
Letters to start: IIIINN.LI....TTEGL.....NELI
Letters left over: II.I....T.....NLI
INTELLIGENCE not Found
#include <stdio.h>
#define ROW 1
#define COL 11
char Matrix[ROW][COL] = { { 'I','N','T','E','L','L','I','G','E', 'N', 'T'} };
char Choice[ROW][COL] = { { 0 } };
const char WORD[] = "INTELLIGENT";
const int Len = sizeof(WORD)-1;
int Path[sizeof(WORD)-1] = { 0 };
char get(int row, int col){
if(1 > col || col > COL) return '\0';
if(1 > row || row > ROW) return '\0';
if(Choice[row-1][col-1] || Matrix[row-1][col-1] == '.')
return '\0';
else
return Matrix[row-1][col-1];
}
#define toLoc(r, c) (r)*16+(c)
#define getRow(L) L/16
#define getCol(L) L%16
int search(int loc, int level){
int r,c,x,y;
char ch;
if(level == Len) return 1;//find it
r = getRow(loc);
c = getCol(loc);
ch = get(r,c);
if(ch == 0 || ch != WORD[level]) return 0;
Path[level]=toLoc(r,c);
Choice[r-1][c-1] = 'v';//marking
for(x=-1;x<=1;++x){
for(y=-1;y<=1;++y){
if(search(toLoc(r+y,c+x), level + 1)) return 1;
}
}
Choice[r-1][c-1] = '\0';//reset
return 0;
}
int main(void){
int r,c,i;
for(r=1;r<=ROW;++r){
for(c=1;c<=COL;++c){
if(search(toLoc(r,c), 0)){
printf("YES\nPath:");
for(i=0;i<Len;++i){
printf("(%d,%d)", getRow(Path[i]), getCol(Path[i]));
}
printf("\n");
return 0;
}
}
}
printf("NO\n");
return 0;
}

Match sub-string within a string with tolerance of 1 character mismatch

I was going through some Amazon interview questions on CareerCup.com, and I came across this interesting question which I haven't been able to figure out. I have been thinking about it for 2 days. Either I am taking a way-off approach, or it's a genuinely hard function to write.
Question is as follows:
Write a function in C that can find if a string is a sub-string of another. Note that a mismatch of one character
should be ignored.
A mismatch can be an extra character: 'dog' matches 'xxxdoogyyyy'
A mismatch can be a missing character: 'dog' matches 'xxxdgyyyy'
A mismatch can be a different character: 'dog' matches 'xxxdigyyyy'
The return value wasn't mentioned in the question, so I assume the signature of the function can be something like this:
char * MatchWithTolerance(const char * str, const char * substr);
If there is a match with the given rules, return the pointer to the beginning of matched substring within the string. Else return null.
Bonus
If someone can also figure out a generic way of making the tolerance to n instead of 1, then that would be just brilliant.
In that case the signature would be:
char * MatchWithTolerance(const char * str, const char * substr, unsigned int tolerance = 1);
This seems to work, let me know if you find any errors and I'll try to fix them:
int findHelper(const char *str, const char *substr, int mustMatch)
{
if ( *substr == '\0' )
return 1;
if ( *str == '\0' )
return 0;
if ( *str == *substr )
return findHelper(str + 1, substr + 1, mustMatch);
else
{
if ( mustMatch )
return 0;
if ( *(str + 1) == *substr )
return findHelper(str + 1, substr, 1);
else if ( *str == *(substr + 1) )
return findHelper(str, substr + 1, 1);
else if ( *(str + 1) == *(substr + 1) )
return findHelper(str + 1, substr + 1, 1);
else if ( *(substr + 1) == '\0' )
return 1;
else
return 0;
}
}
int find(const char *str, const char *substr)
{
int ok = 0;
while ( *str != '\0' )
ok |= findHelper(str++, substr, 0);
return ok;
}
int main()
{
printf("%d\n", find("xxxdoogyyyy", "dog"));
printf("%d\n", find("xxxdgyyyy", "dog"));
printf("%d\n", find("xxxdigyyyy", "dog"));
}
Basically, I make sure only one character can differ, and run the function that does this for every suffix of the haystack.
This is related to a classical problem of IT, referred to as Levenshtein distance.
See Wikibooks for a bunch of implementations in different languages.
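To make the Levenshtein connection concrete, here is a minimal sketch of the usual dynamic-programming variant for approximate substring search, where a match is allowed to start at any position of the text for free. It answers yes or no instead of returning a pointer as the question's signature does, and the names min3 and contains_with_tolerance are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Returns 1 if some substring of text is within edit distance `tolerance`
   (insertions, deletions, replacements) of pattern, otherwise 0. */
int contains_with_tolerance(const char *text, const char *pattern, size_t tolerance)
{
    size_t m, i;
    size_t *col;
    const char *t;
    int found;

    assert(text != NULL && pattern != NULL);
    m = strlen(pattern);
    col = malloc((m + 1) * sizeof *col);
    if (col == NULL) return 0;

    /* col[i] = cost of matching the first i pattern characters against
       the best substring ending at the current text position */
    for (i = 0; i <= m; i++) col[i] = i;
    found = col[m] <= tolerance;

    for (t = text; *t && !found; t++) {
        size_t diag = col[0];          /* previous column, row i-1 */
        col[0] = 0;                    /* a match may start anywhere in text */
        for (i = 1; i <= m; i++) {
            size_t up_left = diag;
            size_t cost = (pattern[i - 1] == *t) ? 0 : 1;
            diag = col[i];             /* previous column, row i */
            col[i] = min3(col[i - 1] + 1,   /* pattern character missing in text */
                          diag + 1,         /* extra character in text */
                          up_left + cost);  /* same or replaced character */
        }
        if (col[m] <= tolerance) found = 1;
    }
    free(col);
    return found;
}
```

This runs in O(n*m) time and handles any tolerance, which also covers the bonus part of the question.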
This is slightly different than the earlier solution, but I was intrigued by the problem and wanted to give it a shot. Obviously optimize if desired, I just wanted a solution.
char *match(char *str, char *substr, int tolerance)
{
if (! *substr) return str;
if (! *str) return NULL;
while (*str)
{
char *str_p;
char *substr_p;
char *matches_missing;
char *matches_mismatched;
str_p = str;
substr_p = substr;
while (*str_p && *substr_p && *str_p == *substr_p)
{
str_p++;
substr_p++;
}
if (! *substr_p) return str;
if (! tolerance)
{
str++;
continue;
}
if (strlen(substr_p) <= tolerance) return str;
/* missed due to a missing letter */
matches_missing = match(str_p, substr_p + 1, tolerance - 1);
if (matches_missing == str_p) return str;
/* missed due to a mismatch of letters */
matches_mismatched = match(str_p + 1, substr_p + 1, tolerance - 1);
if (matches_mismatched == str_p + 1) return str;
str++;
}
return NULL;
}
Is the problem to do this efficiently?
The naive solution is to loop over every substring of str that has the same length as substr, from left to right, and return true if the current substring differs from substr in at most one character.
Let n = size of str
Let m = size of substr
There are O(n) substrings in str, and the matching step takes time O(m). Ergo, the naive solution runs in time
O(n*m)
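The naive scan described above might look like the sketch below. Note that comparing equal-length windows only covers the "different character" case from the question, not extra or missing characters. The name find_with_mismatches and the max_mismatches parameter are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the first window of text that differs from pattern in
   at most max_mismatches positions, or NULL if there is none.
   Only replaced characters are tolerated, not insertions or deletions. */
const char *find_with_mismatches(const char *text, const char *pattern, size_t max_mismatches)
{
    size_t n, m, start, i;

    assert(text != NULL && pattern != NULL);
    n = strlen(text);
    m = strlen(pattern);
    if (m > n) return NULL;

    for (start = 0; start + m <= n; start++) {          /* O(n) windows */
        size_t mismatches = 0;
        for (i = 0; i < m && mismatches <= max_mismatches; i++)   /* O(m) each */
            if (text[start + i] != pattern[i])
                mismatches++;
        if (mismatches <= max_mismatches)
            return text + start;
    }
    return NULL;
}
```

For "xxxdigyyyy" and "dog" with one allowed mismatch this returns a pointer to "digyyyy", while "xxxdgyyyy" is not matched because a missing character shifts the rest of the window.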
With an arbitrary number of tolerance levels.
Worked for all the test cases I could think of. Loosely based on |/|ad's solution.
#include<stdio.h>
#include<string.h>
void report (int x, const char *str, const char *sstr, int t) {
if ( x )
printf( "%s is a substring of %s for a tolerance[%d]\n", sstr, str, t );
else
printf ( "%s is NOT a substring of %s for a tolerance[%d]\n", sstr, str, t );
}
int find_with_tolerance (char *str, char *sstr, int tol) {
if ( (*sstr) == '\0' ) //end of substring, and match
return 1;
if ( (*str) == '\0' ) //end of string
if ( tol >= strlen(sstr) ) //but tol saves the day
return 1;
else //there's nothing even the poor tol can do
return 0;
if ( *sstr == *str ) { //current char match, smooth
return find_with_tolerance ( str+1, sstr+1, tol );
} else {
if ( tol <= 0 ) //that's it. no more patience
return 0;
for(int i=1; i<=tol; i++) {
if ( *(str+i) == *sstr ) //insertion of a foreign character
return find_with_tolerance ( str+i+1, sstr+1, tol-i );
if ( *str == *(sstr+i) ) //deal with deletion
return find_with_tolerance ( str+1, sstr+i+1, tol-i );
if ( *(str+i) == *(sstr+i) ) //deal with replacement
return find_with_tolerance ( str+i+1, sstr+i+1, tol-i );
if ( *(sstr+i) == '\0' ) //substr ends, thanks to tol & this loop
return 1;
}
return 0; //when all fails
}
}
int find (char *str, char *sstr, int tol ) {
int w = 0;
while (*str!='\0')
w |= find_with_tolerance ( str++, sstr, tol );
return (w) ? 1 : 0;
}
int main() {
enum { n = 3 }; //no of test cases (a constant, so the array below can be initialized)
char *sstr = "dog"; //the substr
char *str[n] = { "doox", //those cases
"xxxxxd",
"xxdogxx" };
int t[] = {1,1,0}; //tolerance levels for those cases
for(int i = 0; i < n; i++) {
report( find ( *(str+i), sstr, t[i] ), *(str+i), sstr, t[i] );
}
return 0;
}

Resources