Method to expand a-z into abc...xyz form in C

Hi :) I'm trying to write a simple program that expands a shorthand entry,
for example
a-z or 0-9 or a-b-c or a-z0-9
into its full form,
for example
abc...xyz or 0123456789 or abc or abcdefghijklmnopqrstuvwxyz0123456789
The first shorthand example should give the first full-form example, and so on :)
So far I have written the following, and it works only for letters from a to z:
expand(char s[])
{
int i,n,c;
n=c=0;
int len = strlen(s);
for(i = 1;s[i] > '0' && s[i]<= '9' || s[i] >= 'a' && s[i] <= 'z' || s[i]=='-';i++)
{
/*c = s[i-1];
g = s[i];
n = s[i+1];*/
if( s[0] == '-')
printf("%c",s[0]);
else if(s[i] == '-')
{
if(s[i-1]<s[i+1])
{
while(s[i-1] <= s[i+1])
{
printf("%c", s[i-1]);
s[i-1]++;
}
}
else if(s[i-1] == s[i+1])
printf("%c",s[i]);
else if(s[i+1] != '-')
printf("%c",s[i]);
else if(s[i-1] != '-')
printf("%c",s[i]);
}
else if(s[i] == s[i+1])
{
while(s[i] == s[i+1])
{
printf("%c",s[i]);
s[i]++;
}
}
else if( s[len] == '-')
printf("%c",s[len]);
}
}
But now I'm stuck :(
Any ideas what I should check to make my program work correctly?
Edit 1: @Andrew Kozak (1) abcd (2) 01234
Thanks in advance :)

Here is a C version (in about 38 effective lines) that satisfies the same test as my earlier C++ version.
The full test program including your test cases, mine and some torture test can be seen live on http://ideone.com/sXM7b#info_3915048
Rationale
I'm pretty sure I'm overstating the requirements, but
this should be an excellent example of how to do parsing in a robust fashion
use states in an explicit fashion
validate input (!)
this version doesn't assume a-c-b can't happen
It also doesn't choke or even fail on simple input like 'Hello World' (or (char*) 0)
it shows how you can avoid printf("%c", c) each char without using extraneous functions.
I put in some comments as to explain what happens why, but overall you'll find that the code is much more legible anyways, by
staying away from too many short-named variables
avoiding complicated conditionals with un-transparent indexers
avoiding the whole string length business: We only need max lookahead of 2 characters, and *it=='-' or predicate(*it) will just return false if it is the null character. Shortcut evaluation prevents us from accessing past-the-end input characters
ONE caveat: I haven't implemented a proper check for output buffer overrun (the capacity is hardcoded at 2048 chars). I'll leave it as the proverbial exercise for the reader
Last but not least, the reason I did this:
It will allow me to compare the raw performance of the C++ version and this C version, now that they perform equivalent functions. Right now, I fully expect the C version to outperform the C++ one by some factor (let's guess: 4x?) but, again, let's just see what surprises the GNU compilers have in store for us. More later.
Update: it turns out I wasn't far off: github (code + results)
Pure C Implementation
Without further ado, the implementation, including the testcase:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int alpha_range(char c) { return (c>='a') && (c<='z'); }
int digit_range(char c) { return (c>='0') && (c<='9'); }
char* expand(const char* s)
{
char buf[2048];
const char* in = s;
char* out = buf;
// parser state
int (*predicate)(char) = 0; // either: NULL (free state), alpha_range (in alphabetic range), digit_range (in digit range)
char lower=0,upper=0; // tracks lower and upper bound of character ranges in the range parsing states
// init
*out = 0;
while (*in)
{
if (!predicate)
{
// free parsing state
if (alpha_range(*in) && (in[1] == '-') && alpha_range(in[2]))
{
lower = upper = *in++;
predicate = &alpha_range;
}
else if (digit_range(*in) && (in[1] == '-') && digit_range(in[2]))
{
lower = upper = *in++;
predicate = &digit_range;
}
else *out++ = *in;
} else
{
// in a range
if (*in < lower) lower = *in;
if (*in > upper) upper = *in;
if (in[1] == '-' && predicate(in[2]))
in++; // more coming
else
{
// end of range mode, dump expansion
char c;
for (c=lower; c<=upper; *out++ = c++);
predicate = 0;
}
}
in++;
}
*out = 0; // null-terminate buf
return strdup(buf);
}
void dotest(const char* const input)
{
char* ex = expand(input);
printf("input : '%s'\noutput: '%s'\n\n", input, ex);
if (ex)
free(ex);
}
int main (int argc, char *argv[])
{
dotest("a-z or 0-9 or a-b-c or a-z0-9"); // from the original post
dotest("This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6"); // from my C++ answer
dotest("-x-s a-9 9- a-k-9 9-a-c-7-3"); // assorted torture tests
return 0;
}
Test output:
input : 'a-z or 0-9 or a-b-c or a-z0-9'
output: 'abcdefghijklmnopqrstuvwxyz or 0123456789 or abc or abcdefghijklmnopqrstuvwxyz0123456789'
input : 'This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6'
output: 'This is some efghijklmnopqrstuvwxyz test in 567 steps; this works: abc. This works too: bcdefghijk. Likewise 45678'
input : '-x-s a-9 9- a-k-9 9-a-c-7-3'
output: '-stuvwx a-9 9- abcdefghijk-9 9-abc-34567'

OK, I tested your program and it seems to work for nearly every case. It correctly expands a-z and other expansions with only two letters/numbers. It fails when there are more letters and numbers. The fix is easy: add a new char that keeps the last printed character, and if the character about to be printed matches it, skip it. The a-z0-9 scenario didn't work because the loop condition uses s[i] > '0' where it should use s[i] >= '0'. The code is:
#include <stdio.h>
#include <string.h>
void expand(char s[])
{
int i,g,n,c,l = 0; /* l: last printed character; 0 means nothing printed yet */
n=c=0;
int len = strlen(s);
for(i = 1;s[i] >= '0' && s[i]<= '9' || s[i] >= 'a' && s[i] <= 'z' || s[i]=='-';i++)
{
c = s[i-1];
g = s[i];
n = s[i+1];
//printf("\nc = %c g = %c n = %c\n", c,g,n);
if(s[0] == '-')
printf("%c",s[0]);
else if(g == '-')
{
if(c<n)
{
if (c != l){
while(c <= n)
{
printf("%c", c);
c++;
}
l = c - 1;
//printf("\nl is %c\n", l);
}
else
{
c++;
while(c <= n)
{
printf("%c", c);
c++;
}
l = c - 1;
//printf("\nl is %c\n", l);
}
}
else if(c == n)
printf("%c",g);
else if(n != '-')
printf("%c",g);
else if(c != '-')
printf("%c",g);
}
else if(g == n)
{
while(g == n)
{
printf("%c",s[i]);
g++;
}
}
else if( s[len] == '-')
printf("%c",s[len]);
}
printf("\n");
}
int main (int argc, char *argv[])
{
if (argc < 2) /* no input string given on the command line */
return 1;
expand(argv[1]);
return 0;
}
Isn't this problem from K&R? I think I saw it there. Anyway I hope I helped.

Based on the fact that the existing function addresses "a-z" and "0-9" sequences just fine, separately, we should explore what happens when they meet. Trace your code (try printing each variable's value at each step -- yes it will be cluttered, so use line breaks), and I believe you will find a logical short-circuit when iterating, for example, from "current token is 'y' and next token is 'z'" to "current token is 'z' and next token is '0'". Explore the if() condition and you will find that it does not cover all possibilities, i.e. you have covered yourself if you are within a<-->z, within 0<-->9, or exactly equal to '-', but you have not considered being at the end of one (a-z or 0-9) with your next character at the start of the next.

Just for fun, I decided to demonstrate to myself that C++ is really just as suited to this kind of thing.
Test-first, please
First, let me define the requirements a little more strictly: I assumed it needs to handle these cases:
int main()
{
const std::string in("This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6");
std::cout << "input : " << in << std::endl;
std::cout << "output: " << expand(in) << std::endl;
}
input : This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6
output: This is some efghijklmnopqrstuvwxyz test in 567 steps; this works: abc. This works too: bcdefghijk. Likewise 45678
C++0x Implementation
Here is an implementation (actually a few variants) in 14 lines (23 including whitespace, comments) of C++0x code1
static std::string expand(const std::string& in)
{
static const regex re(R"([a-z](?:-[a-z])+|[0-9](?:-[0-9])+)");
std::string out;
auto tail = in.begin();
for (auto match : make_iterator_range(sregex_iterator(in.begin(), in.end(), re), sregex_iterator()))
{
out.append(tail, match[0].first);
// char range bounds: the cost of accepting unordered ranges...
char a=127, b=0;
for (auto x=match[0].first; x<match[0].second; x+=2)
{ a = std::min(*x,a); b = std::max(*x,b); }
for (char c=a; c<=b; out.push_back(c++));
tail = match.suffix().first;
}
out.append(tail, in.end());
return out;
}
Of course I'm cheating a little because I'm using regex iterators from Boost. I will do some timings comparing to the C version for performance. I rather expect the C++ version to compete within a 50% margin. But, let's see what kind of surprises the GNU compiler has in store for us :)
Here is a complete program that demonstrates the sample input. It also contains some benchmark timings and a few variations that trade off
functional flexibility
legibility / performance
#include <algorithm> // std::min, std::max
#include <iostream> // std::cout
#include <string>
#include <set> // only needed for the 'slow variant'
#include <boost/regex.hpp>
#include <boost/range.hpp>
using namespace boost;
using namespace boost::range;
static std::string expand(const std::string& in)
{
// static const regex re(R"([a-z]-[a-z]|[0-9]-[0-9])"); // "a-c-d" --> "abc-d", "a-c-e-g" --> "abc-efg"
static const regex re(R"([a-z](?:-[a-z])+|[0-9](?:-[0-9])+)");
std::string out;
out.reserve(in.size() + 12); // heuristic
auto tail = in.begin();
for (auto match : make_iterator_range(sregex_iterator(in.begin(), in.end(), re), sregex_iterator()))
{
out.append(tail, match[0].first);
// char range bounds: the cost of accepting unordered ranges...
#if !SIMPLE_BUT_SLOWER
// debug 15.149s / release 8.258s (at 1024k iterations)
char a=127, b=0;
for (auto x=match[0].first; x<match[0].second; x+=2)
{ a = std::min(*x,a); b = std::max(*x,b); }
for (char c=a; c<=b; out.push_back(c++));
#else // simpler but slower
// debug 24.962s / release 10.270s (at 1024k iterations)
std::set<char> bounds(match[0].first, match[0].second);
bounds.erase('-');
for (char c=*bounds.begin(); c<=*bounds.rbegin(); out.push_back(c++));
#endif
tail = match.suffix().first;
}
out.append(tail, in.end());
return out;
}
int main()
{
const std::string in("This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6");
std::cout << "input : " << in << std::endl;
std::cout << "output: " << expand(in) << std::endl;
}
1 Compiled with g++-4.6 -std=c++0x

This is a Java implementation. It expands the character ranges similar to 0-9, a-z and A-Z. Maybe someone will need it someday and Google will bring them to this page.
package your.package;
public class CharacterRange {
/**
* Expands character ranges similar to 0-9, a-z and A-Z.
*
* @param string a string to be expanded
* @return a string
*/
public static String expand(String string) {
StringBuilder buffer = new StringBuilder();
int i = 1;
while (i <= string.length()) {
final char a = string.charAt(i - 1); // previous char
if ((i < string.length() - 1) && (string.charAt(i) == '-')) {
final char b = string.charAt(i + 1); // next char
char[] expanded = expand(a, b);
if (expanded.length != 0) {
i += 2; // skip
buffer.append(expanded);
} else {
buffer.append(a);
}
} else {
buffer.append(a);
}
i++;
}
return buffer.toString();
}
private static char[] expand(char a, char b) {
char[] expanded = expand(a, b, '0', '9'); // digits (0-9)
if (expanded.length == 0) {
expanded = expand(a, b, 'a', 'z'); // lower case letters (a-z)
}
if (expanded.length == 0) {
expanded = expand(a, b, 'A', 'Z'); // upper case letters (A-Z)
}
return expanded;
}
private static char[] expand(char a, char b, char min, char max) {
if ((a > b) || !(a >= min && a <= max && b >= min && b <= max)) {
return new char[0];
}
char[] buffer = new char[(b - a) + 1];
for (int i = 0; i < buffer.length; i++) {
buffer[i] = (char) (a + i);
}
return buffer;
}
public static void main(String[] args) {
String[] ranges = { //
"0-9", "a-z", "A-Z", "0-9a-f", "a-z2-7", "0-9a-v", //
"0-9a-hj-kmnp-tv-z", "0-9a-z", "1-9A-HJ-NP-Za-km-z", //
"A-Za-z0-9", "A-Za-z0-9+/", "A-Za-z0-9-_" };
for (int i = 0; i < ranges.length; i++) {
String input = ranges[i];
String output = CharacterRange.expand(ranges[i]);
System.out.println("input: " + input);
System.out.println("output: " + output);
System.out.println();
}
}
}
Output:
input: 0-9
output: 0123456789
input: a-z
output: abcdefghijklmnopqrstuvwxyz
input: A-Z
output: ABCDEFGHIJKLMNOPQRSTUVWXYZ
input: 0-9a-f
output: 0123456789abcdef
input: a-z2-7
output: abcdefghijklmnopqrstuvwxyz234567
input: 0-9a-v
output: 0123456789abcdefghijklmnopqrstuv
input: 0-9a-hj-kmnp-tv-z
output: 0123456789abcdefghjkmnpqrstvwxyz
input: 0-9a-z
output: 0123456789abcdefghijklmnopqrstuvwxyz
input: 1-9A-HJ-NP-Za-km-z
output: 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
input: A-Za-z0-9
output: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
input: A-Za-z0-9+/
output: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
input: A-Za-z0-9-_
output: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_

Related

Substitute characters in a string with their values?

I have a string given (a+b)&(a+c) and I have created a truth table with values of a,b, and c. Now the problem is to evaluate the logic expression by substituting a,b, and c with corresponding values from the truth table. How it can be done in C?
Ex: a=0 b=0 c=0 r=(0+0)&(0+0)=0
a=0 b=0 c=1 r=(0+0)&(0+1)=0
and so on
The code itself looks like this
#include <stdio.h>
#include <stdlib.h>
int main()
{
char c,* str, *vars, **result;
int i=0,count=0,j=0;
unsigned long long rows;
str = (char*) malloc(1*sizeof(char));
vars=(char*) malloc(1*sizeof(char));
result=(char**)malloc(1*sizeof(char));
char values[] = {'F', 'T'};
while ((c = getchar()) != EOF)
{
str[i++] = c;
str = (char*) realloc(str, (i+1) * sizeof(char));
if (c >= 'a' && c <= 'z')
{
vars[j++]=c;
vars=(char*) realloc(vars,(j+1)*sizeof(char));
count++;
}
}
rows=1ULL<<(count);
result=(char**)realloc(result,(rows+2)*sizeof(char));
for (i = 0; i < rows+1; i++)
{
result[i]=(char*)malloc(sizeof(char)*(count+1));
for (j = 0; j < count; j++)
{
if(i==0)
result[i][j]=vars[j];
else
result[i][j]=values[(i >> j) & 1];
}
}
result[0][count]='R';
for(i=0;i<rows+1;i++)
{
for(j=0;j<count+1;j++)
{
//do something
}
}
Now the problem is to evaluate the logic expression by substituting a,b, and c with corresponding values from the truth table.
Aside from the issues mentioned in the question's comments, substituting alone won't do the job to evaluate the logic expression. The following function for example substitutes the values while evaluating the expression. (You didn't specify the general syntax of your expressions, so I chose to support combinations of the used operators and lower case variables.)
#include <ctype.h>
#include <string.h>
int indx(char *s, char c) { return strchr(s, c)-s; }
char *gstr, *gvars, *vals; // expression string, variables, value combination
char eval()
{ // evaluate expression "gstr"
char or = 0; // neutral element of +
do
{
char and = 1; // neutral element of &
do
{
char c = *gstr++; // get next token
if (islower(c))
and &= indx("FT", vals[indx(gvars, c)]);
else
if (c == '(')
{ // evaluate subexpression
and &= eval();
c = *gstr++; // get next token
if (c != ')')
printf("error at '%c': expected ')'\n", c), exit(1);
}
else
printf("error at '%c'\n", c), exit(1);
} while (*gstr == '&' && ++gstr);
or |= and;
} while (*gstr == '+' && ++gstr);
return or;
}
It can be called from your main (inserted in your code, hence the inconsistent spacing)
result[0][count]='R';
gvars = vars; // make variable names globally accessible
for (i = 1; i <= rows; ++i)
{
gstr = str, vals = result[i], // globally accessible
result[i][count] = values[eval()];
while (isspace(*gstr)) ++gstr;
if (*gstr)
printf("error at '%c': expected end of input\n", *gstr), exit(1);
}
for(i=0;i<rows+1;i++)
{
for(j=0;j<count+1;j++)
{
putchar(result[i][j]);
}
putchar('\n');
}
(Don't forget to put str[i] = '\0'; after your getchar loop to make a null-terminated string.) Note that due to the given for loop counting, the order of the truth table entries is somewhat unusual in that the row with all variables F comes last.

Comparing two arrays in C without case sensitivity

I have for example two arrays
char 1 is: 77a abcd Abc abc1d #### v k
char2 is: 789 ABA AABB 123 ab #% abcde
The common index should be in places 0,3,4,5,9,10,12,20
The result should be 8 but I get 9. The problem is that characters with an ASCII code lower than 64 still match, and they should not.
Here is the code:
int intersection(char arrayNumberOne[size], char arrayNumberTwo[size])
{
int counter = 0;
for (int i = 0; i < strlen(arrayNumberOne); i++)
{
if ((arrayNumberOne[i] == arrayNumberTwo[i]) || (arrayNumberOne[i] == arrayNumberTwo[i] + 32) || (arrayNumberOne[i] + 32 == arrayNumberTwo[i]))
{
if (arrayNumberOne[i] < 64)
{
?????
}
counter++;
}
}
return counter;
}
Adding 32 to an ASCII character can lead to undesirable combinations, and you have already found the issue.
I suggest you break your conditionals into small pieces, with a few small changes:
First, the condition about the symbols is incomplete, so change:
if (arrayNumberOne[i] < 64)
to:
if (arrayNumberOne[i] <= 64 || arrayNumberTwo[i] <= 64)
Because both arrays could contain a symbol.
Second, split the expressions so the logic is easier to follow (you can merge them again afterwards):
// considering inside a for-loop used 'continue' to skip the other verifications
// compare any character
// both lower case or upper case
if (arrayNumberOne[i] == arrayNumberTwo[i]) {
counter++;
continue;
}
// is a symbol, skip the verification ahead
if (arrayNumberOne[i] <= 64 || arrayNumberTwo[i] <= 64)
continue;
// character verifications
// first is uppercase
if (arrayNumberOne[i] + 32 == arrayNumberTwo[i]) {
counter++;
continue;
}
// second is uppercase
if (arrayNumberOne[i] == arrayNumberTwo[i] + 32) {
counter++;
continue;
}
This will make your code work, but it would be better to check for 'a-z' or to use tolower, as suggested in the comments and other answers.
Use tolower; there is no need to call strlen, and you need to check the length of both strings.
Cast to (unsigned char): this avoids undefined behavior if char is a signed type (which it often is) and either array contains negative char values (other than EOF, the one negative value permitted in tolower) (thanks to @Kaz for flagging this)
#include <ctype.h>
#include <stdio.h>
int intersection(const char *s1, const char *s2)
{
int counter = 0;
while (*s1 != '\0' && *s2 != '\0') {
if (tolower((unsigned char) *s1++) == tolower((unsigned char) *s2++)) {
counter++;
}
}
return counter;
}
int main() {
printf("%d\n", intersection("77a abcd Abc abc1d #### v k",
"789 ABA AABB 123 ab #% abcde"));
return 0;
}
Returns 8

function doesn't pass certain test case

I have a problem with one of the tests for my solution to a challenge on Codewars. I have to write a function that returns the alphabet positions of the characters in the input string. My solution is below. I pass all my own tests and also the sample tests from Codewars, but fail on this one (I did not write this test code; it is part of the test code implemented by Codewars):
Test(number_tests, should_pass) {
srand(time(NULL));
char in[11] = {0};
char *ptr;
for (int i = 0; i < 15; i++) {
for (int j = 0; j < 10; j++) {
char c = rand() % 10;
in[j] = c + '0';
}
ptr = alphabet_position(in);
cr_assert_eq(strcmp(ptr, ""), 0);
free(ptr);
}
}
The error I receive is the following: The expression (strcmp(ptr, "")) == (0) is false. Thanks for the help!
P.S. I also noticed that I am leaking memory (I don't know how to solve this, so I suppose I would use an array to keep track of the string instead of malloc) --> I suppose this is not an issue; I would just free(ptr) in the main function.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *alphabet_position(char *text);
// test
int main()
{
if (!strcmp("1 2 3", alphabet_position("abc")))
{
printf("success...\n");
}
else
{
printf("fail...\n");
}
if (!strcmp("", alphabet_position("..")))
{
printf("success...\n");
}
else
{
printf("fail...\n");
}
if (!strcmp("20 8 5 19 21 14 19 5 20 19 5 20 19 1 20 20 23 5 12 22 5 15 3 12 15 3 11", alphabet_position("The sunset sets at twelve o' clock.")))
{
printf("success...\n");
}
else
{
printf("fail...\n");
}
}
char *alphabet_position(char *text)
{
// signature: string -> string
// purpose: extact alphabet position of letters in input string and
// return string of alphabet positions
// return "123"; // stub
// track numerical value of each letter according to it's alphabet position
char *alph = "abcdefghijklmnopqrstuvwxyz";
// allocate maximum possible space for return string
// each char maps to two digit number + trailing space after number
char *s = malloc(sizeof(char) * (3 * strlen(text) + 1));
// keep track of the begining of return string
char *head = s;
int index = 0;
int flag = 0;
while(*text != '\0')
{
if ( ((*text > 64) && (*text < 91)) || ((*text > 96) && (*text < 123)))
{
flag = 1;
index = (int)(strchr(alph, tolower(*text)) - alph) + 1;
if (index > 9)
{
int n = index / 10;
int m = index % 10;
*s = n + '0';
s++;
*s = m + '0';
s++;
*s = ' ';
s++;
}
else
{
*s = index + '0';
s++;
*s = ' ';
s++;
}
}
text++;
}
if (flag != 0) // if string contains at least one letter
{
*(s -1) = '\0'; // remove the trailing space and insert string termination
}
return head;
}
Here is what I think is happening:
In the cases where none of the characters in the input string is an alphabet character, s is never used, and therefore the memory allocated by malloc() could be anything. malloc() does not clear / zero-out memory.
The fact that your input case of ".." passes is just coincidence. The codewars test case does many such non-alphabetical tests in a row, each of which causes a malloc(), and if any one of them fails, the whole thing fails.
I tried recreating this situation, but it's (as I say) unpredictable. To test this, add a debugging line to output the value of s when flag is still 0:
if (flag != 0) { // if string contains at least one letter
*(s -1) = '\0'; // remove the trailing space and insert string termination
}
else {
printf("flag is still 0 : %s\n", s);
}
I'll wager that sometimes you get a garbage / random string that is not "".

Effective way of checking if a given string is palindrome in C

I was preparing for my interview and started working through simple C programming questions. One question I came across was to check whether a given string is a palindrome. I wrote code that checks whether a user-supplied string is a palindrome using pointers. I'd like to know whether this is an efficient approach in terms of runtime, or whether there is any enhancement I could make. It would also be nice if anyone could suggest how to remove characters other than letters (like apostrophes and commas) when using pointers. I've added my function below. It accepts a pointer to the string as a parameter and returns an integer.
int palindrome(char* string)
{
char *ptr1=string;
char *ptr2=string+strlen(string)-1;
while(ptr2>ptr1){
if(tolower(*ptr1)!=tolower(*ptr2)){
return(0);
}
ptr1++;ptr2--;
}
return(1);
}
"how to remove other characters other than letters?"
I think you don't actually want to remove them, just skip them, and you can use isalpha to do so. Also note that the condition ptr2 > ptr1 is already sufficient for strings with an odd number of characters such as abcba, because the middle character always equals itself; the version below uses ptr2 >= ptr1, which only adds that one redundant comparison:
int palindrome(char* string)
{
size_t len = strlen(string);
// handle empty string and string of length 1:
if (len == 0) return 0;
if (len == 1) return 1;
char *ptr1 = string;
char *ptr2 = string + len - 1;
while(ptr2 >= ptr1) {
if (!isalpha(*ptr2)) {
ptr2--;
continue;
}
if (!isalpha(*ptr1)) {
ptr1++;
continue;
}
if( tolower(*ptr1) != tolower(*ptr2)) {
return 0;
}
ptr1++; ptr2--;
}
return 1;
}
you might need to #include <ctype.h>
How about doing like this if you want to do it using pointers only:
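#include <stdio.h>
#include <string.h>
int main(void)
{
char str[100];
char *p,*t;
printf("Your string : ");
if (fgets(str, sizeof str, stdin) == NULL) /* gets() is unsafe and was removed in C11 */
return 1;
str[strcspn(str, "\n")] = '\0'; /* drop the newline that fgets keeps */
for(p=str ; *p!='\0' ; p++); /* advance p to the terminating null character */
for(t=str, p-- ; p>=t; ) /* compare from both ends towards the middle */
{
if(*p==*t)
{
p--;
t++;
}
else
break;
}
if(t>p)
printf("\nPalindrome");
else
printf("\nNot a palindrome");
return 0;
}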
int main()
{
const char *p = "MALAYALAM";
int count = 0;
int len = strlen(p);
for(int i = 0; i < len; i++ )
{
if(p[i] == p[len - i - 1])
count++;
}
cout << "Count: " << count;
if(count == len)
cout << "Palindrome";
else
cout << "Not Palindrome";
return 0;
}
I have actually experimented quite a lot with this kind of problem.
There are two optimisations that can be done:
Check for odd string length, odd strings can't be palindromes (this only holds if the inputs are known to have even length; in general an odd-length string such as abcba is a palindrome, so the check is not safe for arbitrary input)
Start using vectorised compares, but this only really gives you performance if you expect a lot of palindromes. If the majority of your strings aren't palindromes you are still best off with byte-by-byte comparisons. In fact my vectorised palindrome checker ran 5% slower than the non-vectorised one just because palindromes were so rare in the input. The extra branch that decided vectorised vs non-vectorised made this big difference.
Here is code draft how you can do it vectorised:
int palindrome(char* string)
{
size_t length = strlen(string);
if (length >= sizeof(uintptr_t)) { // if the string fits into a vector
uintptr_t * ptr1 = (uintptr_t*)string;
size_t length_v = length / sizeof(uintptr_t); // number of whole words in the string
uintptr_t * ptr2 = (uintptr_t*)(string + (length - (length_v * sizeof(uintptr_t)))) + length_v - 1;
while(ptr2>ptr1){
if(*ptr1 != bswap(*ptr2)){ // byte swap for your word length, x86 has an instruction for it, needs to be defined separately
return(0);
}
ptr1++;ptr2--;
}
} else {
// standard byte by byte comparison
}
return(1);
}

C - Largest String From a Big One

So pray tell, how would I go about getting the largest contiguous string of letters out of a string of garbage in C? Here's an example:
char *s = "(2034HEY!!11 th[]thisiswhatwewant44";
Would return...
thisiswhatwewant
I had this on a quiz the other day...and it drove me nuts (still is) trying to figure it out!
UPDATE:
My fault guys, I forgot to include the fact that the only function you are allowed to use is the strlen function. Thus making it harder...
Use strtok() to split your string into tokens, using all non-letter characters as delimiters, and find the longest token.
To find the longest token you will need to organise some storage for tokens - I'd use a linked list.
As simple as this.
EDIT
OK, if strlen() is the only function allowed, you can first find the length of your source string, then loop through it and replace all non-letter characters with the null character ('\0') - basically that's what strtok() does.
Then you need to go through your modified source string a second time, advancing one token at a time, and find the longest one, using strlen().
This sounds similar to the standard UNIX 'strings' utility.
Keep track of the longest run of printable characters terminated by a null character.
Walk through the bytes until you hit a printable character. Start counting. If you hit a non-printable character, stop counting and throw away the starting point. If you hit a null character, check whether the length of the current run is greater than that of the previous record holder. If so, record it, and start looking for the next string.
What defines the "good" substrings compared to the many others -- being lowercase alphas only? (i.e., no spaces, digits, punctuation, uppercase, &c)?
Whatever the predicate P that checks for a character being "good", a single pass over s applying P to each character lets you easily identify the start and end of each "run of good characters", and remember and pick the longest. In pseudocode:
longest_run_length = 0
longest_run_start = longest_run_end = null
status = bad
for i in (all indices over s):
    if P(s[i]): # current char is good
        if status == bad: # previous one was bad
            current_run_start = current_run_end = i
            status = good
        else: # previous one was also good
            current_run_end = i
    else: # current char is bad
        if status == good: # previous one was good -> end of run
            current_run_length = current_run_end - current_run_start + 1
            if current_run_length > longest_run_length:
                longest_run_start = current_run_start
                longest_run_end = current_run_end
                longest_run_length = current_run_length
        status = bad
# if a good run ends with end-of-string:
if status == good: # previous one was good -> end of run
    current_run_length = current_run_end - current_run_start + 1
    if current_run_length > longest_run_length:
        longest_run_start = current_run_start
        longest_run_end = current_run_end
        longest_run_length = current_run_length
Why use strlen() at all?
Here's my version which uses no function whatsoever.
#ifdef UNIT_TEST
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#endif
/*
// largest_letter_sequence()
// Returns a pointer to the beginning of the largest letter
// sequence (including trailing characters which are not letters)
// or NULL if no letters are found in s
// Passing NULL in `s` causes undefined behaviour
// If the string has two or more sequences with the same number of letters
// the return value is a pointer to the first sequence.
// The parameter `len`, if not NULL, will have the size of the letter sequence
//
// This function assumes an ASCII-like character set
// ('z' > 'a'; 'z' - 'a' == 25; ('a' <= each of {abc...xyz} <= 'z'))
// and the same for uppercase letters
// Of course, ASCII works for the assumptions :)
*/
const char *largest_letter_sequence(const char *s, size_t *len) {
const char *p = NULL;
const char *pp = NULL;
size_t curlen = 0;
size_t maxlen = 0;
while (*s) {
if ((('a' <= *s) && (*s <= 'z')) || (('A' <= *s) && (*s <= 'Z'))) {
if (p == NULL) p = s;
curlen++;
if (curlen > maxlen) {
maxlen = curlen;
pp = p;
}
} else {
curlen = 0;
p = NULL;
}
s++;
}
if (len != NULL) *len = maxlen;
return pp;
}
#ifdef UNIT_TEST
void fxtest(const char *s) {
char *test;
const char *p;
size_t len;
p = largest_letter_sequence(s, &len);
if (len && (len < 999)) {
test = malloc(len + 1);
if (!test) {
fprintf(stderr, "No memory.\n");
return;
}
strncpy(test, p, len);
test[len] = 0;
printf("%s ==> %s\n", s, test);
free(test);
} else {
if (len == 0) {
printf("no letters found in \"%s\"\n", s);
} else {
fprintf(stderr, "ERROR: string too large\n");
}
}
}
int main(void) {
fxtest("(2034HEY!!11 th[]thisiswhatwewant44");
fxtest("123456789");
fxtest("");
fxtest("aaa%ggg");
return 0;
}
#endif
While I waited for you to post this as a question I coded something up.
This code iterates through a string passed to a "longest" function, and when it finds the first of a sequence of letters it sets a pointer to it and starts counting the length of it. If it is the longest sequence of letters yet seen, it sets another pointer (the 'maxStringStart' pointer) to the beginning of that sequence until it finds a longer one.
At the end, it allocates enough room for the new string and returns a pointer to it.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int isLetter(char c){
return ( (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') );
}
char *longest(char *s) {
char *newString = 0;
int maxLength = 0;
char *maxStringStart = 0;
int curLength = 0;
char *curStringStart = 0;
do {
//reset the current string length and skip this
//iteration if it's not a letter
if( ! isLetter(*s)) {
curLength = 0;
continue;
}
//increase the current sequence length. If the length before
//incrementing is zero, then it's the first letter of the sequence:
//set the pointer to the beginning of the sequence of letters
if(curLength++ == 0) curStringStart = s;
//if this is the longest sequence so far, set the
//maxStringStart pointer to the beginning of it
//and start increasing the max length.
if(curLength > maxLength) {
maxStringStart = curStringStart;
maxLength++;
}
} while(*s++);
//return null pointer if there were no letters in the string,
//or if we can't allocate any memory.
if(maxLength == 0) return NULL;
if( ! (newString = malloc(maxLength + 1)) ) return NULL;
//copy the longest string into our newly allocated block of
//memory (see my update for the strlen() only requirement)
//and null-terminate the string by putting 0 at the end of it.
memcpy(newString, maxStringStart, maxLength);
newString[maxLength] = 0; // index maxLength is the last byte of the maxLength + 1 allocated
return newString;
}
int main(int argc, char *argv[]) {
int i;
for(i = 1; i < argc; i++) {
printf("longest all-letter string in argument %d:\n", i);
printf(" argument: \"%s\"\n", argv[i]);
printf(" longest: \"%s\"\n\n", longest(argv[i]));
}
return 0;
}
This is my solution in simple C, without any data structures.
I can run it in my terminal like this:
~/c/t $ ./longest "hello there, My name is Carson Myers." "abc123defg4567hijklmnop890"
longest all-letter string in argument 1:
argument: "hello there, My name is Carson Myers."
longest: "Carson"
longest all-letter string in argument 2:
argument: "abc123defg4567hijklmnop890"
longest: "hijklmnop"
~/c/t $
the criteria for what constitutes a letter could be changed in the isLetter() function easily. For example:
return (
(c >= 'a' && c <= 'z') ||
(c >= 'A' && c <= 'Z') ||
(c == '.') ||
(c == ' ') ||
(c == ',') );
would count periods, commas and spaces as 'letters' also.
as per your update:
replace memcpy(newString, maxStringStart, maxLength); with:
int i;
for(i = 0; i < maxLength; i++)
newString[i] = maxStringStart[i];
however, this problem would be much more easily solved with the use of the C standard library:
char *longest(char *s) {
int longest = 0;
int curLength = 0;
char *curString = 0;
char *longestString = 0;
char *tokens = " ,.!?'\"()#$%\r\n;:+-*/\\";
curString = strtok(s, tokens);
while( curString != NULL ) { // strtok returns NULL when no token is left (or none exists)
curLength = strlen(curString);
if( curLength > longest ) {
longest = curLength;
longestString = curString;
}
curString = strtok(NULL, tokens);
}
char *newString = 0;
if( longest == 0 ) return NULL;
if( ! (newString = malloc(longest + 1)) ) return NULL;
strcpy(newString, longestString);
return newString;
}
First, define "string" and define "garbage". What do you consider a valid, non-garbage string? Write down a concrete definition you can program - this is how programming specs get written. Is it a sequence of alphanumeric characters? Should it start with a letter and not a digit?
Once you get that figured out, it's very simple to program. Start with a naive method of looping over the "garbage" looking for what you need. Once you have that, look up useful C library functions (like strtok) to make the code leaner.
Another variant.
#include <stdio.h>
#include <string.h>
int main(void)
{
char s[] = "(2034HEY!!11 th[]thisiswhatwewant44";
int len = strlen(s);
int i = 0;
int biggest = 0;
char* p = s;
while (p[0])
{
if (!((p[0] >= 'A' && p[0] <= 'Z') || (p[0] >= 'a' && p[0] <= 'z')))
{
p[0] = '\0';
}
p++;
}
for (; i < len; i++)
{
if (s[i] && strlen(&s[i]) > biggest)
{
biggest = strlen(&s[i]);
p = &s[i];
}
}
printf("%s\n", p);
return 0;
}

Resources