Substitute characters in a string with their values? - c

I have been given the string (a+b)&(a+c), and I have created a truth table with values of a, b, and c. Now the problem is to evaluate the logic expression by substituting a,b, and c with corresponding values from the truth table. How can it be done in C?
Ex: a=0 b=0 c=0 r=(0+0)&(0+0)=0
a=0 b=0 c=1 r=(0+0)&(0+1)=0
and so on
The code itself looks like this
#include <stdio.h>
#include <stdlib.h>
int main()
{
int c; // int rather than char, so that EOF can be told apart from every character
char *str, *vars, **result;
int i=0,count=0,j=0;
unsigned long long rows;
str = (char*) malloc(1*sizeof(char));
vars=(char*) malloc(1*sizeof(char));
result=(char**)malloc(1*sizeof(char*)); // elements are pointers, not chars
char values[] = {'F', 'T'};
while ((c = getchar()) != EOF)
{
str[i++] = c;
str = (char*) realloc(str, (i+1) * sizeof(char));
if (c >= 'a' && c <= 'z')
{
vars[j++]=c;
vars=(char*) realloc(vars,(j+1)*sizeof(char));
count++;
}
}
rows=1ULL<<(count);
result=(char**)realloc(result,(rows+2)*sizeof(char*)); // elements are pointers, not chars
for (i = 0; i < rows+1; i++)
{
result[i]=(char*)malloc(sizeof(char)*(count+1));
for (j = 0; j < count; j++)
{
if(i==0)
result[i][j]=vars[j];
else
result[i][j]=values[(i >> j) & 1];
}
}
result[0][count]='R';
for(i=0;i<rows+1;i++)
{
for(j=0;j<count+1;j++)
{
//do something
}
}

Now the problem is to evaluate the logic expression by substituting a,b, and c with corresponding values from the truth table.
Aside from the issues mentioned in the question's comments, substituting alone won't do the job to evaluate the logic expression. The following function for example substitutes the values while evaluating the expression. (You didn't specify the general syntax of your expressions, so I chose to support combinations of the used operators and lower case variables.)
#include <ctype.h>
#include <string.h>
int indx(char *s, char c) { return strchr(s, c)-s; }
char *gstr, *gvars, *vals; // expression string, variables, value combination
char eval()
{ // evaluate expression "gstr"
char or = 0; // neutral element of +
do
{
char and = 1; // neutral element of &
do
{
char c = *gstr++; // get next token
if (islower(c))
and &= indx("FT", vals[indx(gvars, c)]);
else
if (c == '(')
{ // evaluate subexpression
and &= eval();
c = *gstr++; // get next token
if (c != ')')
printf("error at '%c': expected ')'\n", c), exit(1);
}
else
printf("error at '%c'\n", c), exit(1);
} while (*gstr == '&' && ++gstr);
or |= and;
} while (*gstr == '+' && ++gstr);
return or;
}
It can be called from your main (inserted in your code, hence the inconsistent spacing)
result[0][count]='R';
gvars = vars; // make variable names globally accessible
for (i = 1; i <= rows; ++i)
{
gstr = str, vals = result[i], // globally accessible
result[i][count] = values[eval()];
while (isspace(*gstr)) ++gstr;
if (*gstr)
printf("error at '%c': expected end of input\n", *gstr), exit(1);
}
for(i=0;i<rows+1;i++)
{
for(j=0;j<count+1;j++)
{
putchar(result[i][j]);
}
putchar('\n');
}
(Don't forget to put str[i] = '\0'; after your getchar loop to make a null-terminated string.) Note that due to the given for loop counting, the order of the truth table entries is somewhat unusual in that the row with all variables F comes last.

Related

Replacing a sequence of ascending characters by ascii code with a specific pattern

I am doing some kind of challenge in C in the internet and got the following mission:
Given a string, that consists of some ascending characters one after the other (by ASCII value), return the same string, but replace the middle letters of the sequence with one minus sign
Example:
Given the following input: dabcefLMNOpQrstuv567zyx
We expect the following output: da-cefL-OpQr-v567zyx
The code I've tried:
/* Importing useful libs */
#include <stdio.h>
#include <string.h>
/* Declaring boolean definitions */
typedef enum {
false,
true
}
bool_enum;
/* Declaring Max. Input Length */
#define MAX_INPUT_LENGTH 80
void sequence_replace(char string[]);
/* Main Function */
int main() {
char input_str[MAX_INPUT_LENGTH];
printf("Please enter the string you'd like to switch its sequences with the three char method: ");
scanf("%s", input_str);
sequence_replace(input_str);
return 0;
}
void sequence_replace(char string[]) {
int first_char, last_char;
int slen = strlen(string);
bool_enum sequence = false;
for(int i = 0; i < slen; i ++) {
int s1 = string[i];
int s2 = string[i+1];
if (s1 + 1 == s2) {
if (sequence = false) {
sequence = true;
first_char = i;
}
}
if (s1 + 1 != s2) {
if (sequence = true) {
last_char = i;
string[first_char + 1] = '-';
for(int j = first_char+2; j < last_char; j++) {
string[j] = '';
}
}
sequence = false;
}
}
printf("Sequences after replacement are: %s", string);
}
Basically, in the sequence_replace function I iterate over the string until I find a character whose ASCII code + 1 equals the ASCII code of the next character. I then set a boolean flag to true to show that I am inside a sequence, and I keep the index at which the first character of the sequence appeared. Once I hit a character whose ASCII code - 1 is not equal to the previous character's ASCII code, I replace the character right after the first character of the sequence with a '-' sign, and then run a loop until the end of the sequence to replace all remaining characters with an empty string.
Unfortunately, it doesn't seem to be working. I would appreciate any help.
For starters there is no need to introduce this typedef declaration
/* Declaring boolean definitions */
typedef enum {
false,
true
}
bool_enum;
It is much better just to include the header <stdbool.h> and use names bool, false and true defined in the header.
The function itself should be declared like
char * sequence_replace(char string[]);
Using the function strlen is redundant and inefficient.
As it follows from the provided example you should check whether a current character is an alpha character or not.
You may not write an integer character constant that contains no character, like this:
string[j] = '';
That is, in C there are no empty integer character constants.
Also there is a logical error in this if statement (apart from the typo in the inner if statement, if (sequence = true) {, where the assignment operator = is used instead of the equality operator ==):
if (s1 + 1 != s2) {
if (sequence = true) {
last_char = i;
string[first_char + 1] = '-';
for(int j = first_char+2; j < last_char; j++) {
string[j] = '';
}
}
sequence = false;
}
It unconditionally writes the symbol '-' even if there are only two characters that satisfy the condition
s1 + 1 == s2
In this case according to the provided example the symbol '-' is not inserted.
Also, for example, the for loop will not process a tail of the string that is an increasing sequence of letters.
The function can look as shown in the demonstration program below.
#include <stdio.h>
#include <ctype.h>
char * sequence_replace( char s[] )
{
char *p = s;
for ( char *q = s; *q; )
{
*p++ = *q++;
char *current = q;
while (isalpha( ( unsigned char )q[-1] ) &&
isalpha( ( unsigned char )q[0] ) &&
( unsigned char )( q[-1] + 1 ) == ( unsigned char )q[0])
{
++q;
}
if (current != q)
{
if (q - current > 1)
{
*p++ = '-';
}
*p++ = q[-1];
}
}
*p = '\0';
return s;
}
int main( void )
{
char s[] = "dabcefLMNOpQrstuv567zyx";
puts( s );
puts( sequence_replace( s ) );
}
The program output is
dabcefLMNOpQrstuv567zyx
da-cefL-OpQr-v567zyx

Counting the number of vowels in a string in C

I'm solving a decryption problem in C, and I've run into a problem.
Part of the process is counting the vowels in the string,
but the code does not count the vowels properly in 'countingmeasure'.
I was curious, so I debugged it:
the counter increment doesn't happen at 'o'.
Why is this happening?
#include <stdio.h>
int main(void)
{
char original[15] = { 't','f','l','e','k','v','i','d','v','r','j','l','i','v',NULL };
printf("Encrypted string : %s\n", original);
printf("Possible original strings \n");
printf("\n");
for (int j = 0; j < 26; j++)//how can I print only when there are 7 vowels?
{
char change[14] = { 0 };
int counter=0;
char a;
for (int i = 0; i < 14; i++)
{
a = original[i] + j;
if (a > 122)
{
original[i] -= 26 ;
}
if (a == 'a' || a == 'e' || a == 'i' || a == 'o' || a == 'u')
{
counter++;
}
printf("%c", original[i] + j);
}
printf(" %d\n",counter);
}
}
a = original[i] + j; doesn't make any sense, since a is a char and the result might not fit inside it. Specifically, "character value + 26" might be larger than 127. See also: Is char signed or unsigned by default?
Furthermore, arithmetic on any symbols except '0' to '9' isn't well-defined and they are not guaranteed to be allocated adjacently. Also please refrain from using hard-coded "magic numbers" in source code. Instead of 122 you should use 'z' etc.
There are several ways you can fix the program.
The quick & dirty solution is to do unsigned char a on the existing program, if you are content with "it just works, but I don't even know what I'm doing".
A better solution is to declare a string of vowels and then for every character in the input string, do a strchr() search in the vowel string for a match. (Correct but naive and slow, good enough beginner solution.)
A professional solution would be to create a look-up table of 128 booleans like
const bool LOOKUP [128] = { ['A'] = true, ['a'] = true, ['E'] = true, ... }; Then check if(LOOKUP[ item[i] ]) /* then vowel */.
Use functions.
NULL is a pointer constant, not zero or the null terminating character; use '\0' to terminate a char array.
Use string literals.
Use standard functions to change the case (tolower, toupper).
#include <ctype.h>
#include <stdio.h>
// like strchr, optionally ignoring case
char *mystrchr(const char *str, const char lt, int ignoreCase)
{
    while(*str)
    {
        if(ignoreCase)
        {   // braces are required here, otherwise the else binds to the inner if
            if(tolower((unsigned char)*str) == tolower((unsigned char)lt)) return (char *)str;
        }
        else
        {
            if(*str == lt) return (char *)str;
        }
        str++;
    }
    return NULL;
}
// counts the characters of haystack that occur in needle
size_t count(const char *haystack, const char *needle, int ignoreCase)
{
    size_t matches = 0;
    while(*haystack)
    {
        if(mystrchr(needle, *haystack, ignoreCase)) matches++;
        haystack++;
    }
    return matches;
}
int main(void)
{
    const char *str = "tflekvidvrjliv";
    printf("%zu\n", count(str, "aeiou", 1));
}

How do I check the first two characters of my char array in C?

This is code to create a function similar to the C library function atoi(), without the use of any C runtime library routines.
I'm currently stuck on how to check the first two characters of the char array s to see whether the input begins with "0x".
If it starts with 0x, this means that I can then convert it from hexadecimal.
#include <stdio.h>
int checkforint(char x){
if (x>='0' && x<='9'){
return 1;
}
else{
return 0;
}
}
unsigned char converthex(char x){
//lets convert the character to lowercase, in the event its in uppercase
x = tolower(x);
if (x >= '0' && x<= '9') {
return ( x -'0');
}
if (x>='a' && x<='f'){
return (x - 'a' +10);
}
return 16;
}
int checkforhex(const char *a, const char *b){
if(a = '0' && b = 'x'){
return 1;
}else{
return 0;
}
}
//int checkforint
/* We read an ASCII character s and return the integer it represents*/
int atoi_ex(const char*s, int ishex)
{
int result = 0; //this is the result
int sign = 1; //this variable is to help us deal with negative numbers
//we initialise the sign as 1, as we will assume the input is positive, and change the sign accordingly if not
int i = 0; //iterative variable for the loop
int j = 2;
//we check if the input is a negative number
if (s[0] == '-') { //if the first digit is a negative symbol
sign = -1; //we set the sign as negative
i++; //also increment i, so that we can skip past the sign when we start the for loop
}
//now we can check whether the first characters start with 0x
if (ishex==1){
for (j=2; s[j]!='\0'; ++j)
result = result + converthex(s[j]);
return sign*result;
}
//iterate through all the characters
//we start from the first character of the input and then iterate through the whole input string
//for every iteration, we update the result accordingly
for (; s[i]!='\0'; ++i){
//this checks whether the current character is an integer or not
//if it is not an integer, we skip past it and go to the top of the loop and move to the next character
if (checkforint(s[i]) == 0){
continue;
} else {
result = result * 10 + s[i] -'0';
}
//result = s[i];
}
return sign * result;
}
int main(int argc)
{
int isithex;
char s[] = "-1";
char a = s[1];
char b = s[2];
isithex=checkforhex(a,b);
int val = atoi_ex(s,isithex);
printf("%d\n", val);
return 0;
}
There are several errors in your code. First, in C you start counting from zero. So in main(), you should write:
char a = s[0];
char b = s[1];
isithex = checkforhex(a, b);
Then, in checkforhex(), you should use == (two equal signs) to do comparisons, not =. So:
if (a == '0' && b == 'x')
However, as pointed out by kaylum, why not write the function to pass a pointer to the string instead of two characters? Like so:
int checkforhex(const char *str) {
if (str[0] == '0' && str[1] == 'x') {
...
}
}
And in main() call it like so:
isithex = checkforhex(s);

Attempting to split and store arrays similar to strtok

For an assignment in class, we have been instructed to write a program which takes a string and a delimiter and then takes "words" and stores them in a new array of strings. i.e., the input ("my name is", " ") would return an array with elements "my" "name" "is".
Roughly, what I've attempted is to:
Use a separate helper called number_of_delimiters() to determine the size of the array of strings
Iterate through the initial array to find the number of elements in a given string which would be placed in the array
Allocate storage within my array for each string
Store the elements within the allocated memory
Include directives:
#include <stdlib.h>
#include <stdio.h>
This is the separate helper:
int number_of_delimiters (char* s, int d)
{
int numdelim = 0;
for (int i = 0; s[i] != '\0'; i++)
{
if (s[i] == d)
{
numdelim++;
}
}
return numdelim;
}
This is the function itself:
char** split_at (char* s, char d)
{
int numdelim = number_of_delimiters(s, d);
int a = 0;
int b = 0;
char** final = (char**)malloc((numdelim+1) * sizeof(char*));
for (int i = 0; i <= numdelim; i++)
{
int sizeofj = 0;
while (s[a] != d)
{
sizeofj++;
a++;
}
final[i] = (char*)malloc(sizeofj);
a++;
int j = 0;
while (j < sizeofj)
{
final[i][j] = s[b];
j++;
b++;
}
b++;
final[i][j+1] = '\0';
}
return final;
}
To print:
void print_string_array(char* a[], unsigned int alen)
{
printf("{");
for (int i = 0; i < alen; i++)
{
if (i == alen - 1)
{
printf("%s", a[i]);
}
else
{
printf("%s ", a[i]);
}
}
printf("}");
}
int main(int argc, char *argv[])
{
print_string_array(split_at("Hi, my name is none.", ' '), 5);
return 0;
}
This currently returns {Hi, my name is none.}
After doing some research, I realized that the purpose of this function is either similar or identical to strtok. However, looking at the source code for this proved to be little help because it included concepts we have not yet used in class.
I know the question is vague, and the code rough to read, but what can you point to as immediately problematic with this approach to the problem?
The program has several problems.
while (s[a] != d) is wrong, there is no delimiter after the last word in the string.
final[i][j+1] = '\0'; is wrong, j+1 is one position too much.
The returned array is unusable, unless you know beforehand how many elements are there.
Just for explanation:
strtok will modify the array you pass in! After
char test[] = "a b c ";
for(char* t = test; strtok(t, " "); t = NULL);
test content will be:
{ 'a', 0, 'b', 0, 'c', 0, 0 }
You get subsequently these pointers to your test array: test + 0, test + 2, test + 4, NULL.
strtok remembers the pointer you pass to it internally (most likely, you saw a static variable in your source code...) so you can (and must) pass NULL the next time you call it (as long as you want to operate on the same source string).
You, in contrast, apparently want to copy the data. Fine, one can do so. But here we get a problem:
char** final = //...
return final;
void print_string_array(char* a[], unsigned int alen)
You just return the array, but you are losing length information!
How do you want to pass the length to your print function then?
char** tokens = split_at(...);
print_string_array(tokens, sizeof(tokens));
will fail, because sizeof(tokens) will always return the size of a pointer on your local system (most likely 8, possibly 4 on older hardware)!
My personal recommendation: create a null terminated array of c strings:
char** final = (char**)malloc((numdelim + 2) * sizeof(char*));
// ^ (!)
// ...
final[numdelim + 1] = NULL;
Then your print function could look like this:
void print_string_array(char* a[]) // no len parameter any more!
{
printf("{");
if(*a)
{
printf("%s", *a); // printing first element without space
for (++a; *a; ++a) // *a: checking, if current pointer is not NULL
{
printf(" %s", *a); // next elements with spaces
}
}
printf("}");
}
No problems with length any more. Actually, this is exactly the same principle C strings use themselves (the terminating null character, remember?).
Additionally, here is a problem in your own code:
while (j < sizeofj)
{
final[i][j] = s[b];
j++; // j will always point behind your string!
b++;
}
b++;
// thus, you need:
final[i][j] = '\0'; // no +1 !
For completeness (this was discovered by n.m. already, see the other answer): If there is no trailing delimiter in your source string,
while (s[a] != d)
will read beyond your input string (which is undefined behaviour and could result in your program crashing). You need to check for the terminating null character, too:
while(s[a] && s[a] != d)
Finally: how do you want to handle consecutive delimiters? Currently, you will insert empty strings into your array. Print out your strings as follows (with two delimiting symbols; I used * and + like birth and death...):
printf("*%s+", *a);
and you will see. Is this intended?
Edit 2: The variant with pointer arithmetic (only):
char** split_at (char* s, char d)
{
    int numdelim = 0;
    char* t = s; // need a copy
    while(*t)
    {
        numdelim += *t == d;
        ++t;
    }
    char** final = (char**)malloc((numdelim + 2) * sizeof(char*));
    char** f = final; // pointer to current position within final
    t = s; // re-assign t, using s as start pointer for new strings
    for(;;)
    {
        if(*t == d || *t == 0) // delimiter or end of input found!
        {
            // can subtract pointers --
            // as long as they point to the same array!!!
            char* n = (char*)malloc(t - s + 1); // +1: terminating null
            *f++ = n; // store in position pointer and increment it
            while(s != t) // copy the string from start to current t
                *n++ = *s++;
            *n = 0; // terminate the new string
            if(*t == 0) // last token has been stored
                break;
            s = t + 1; // next token starts behind the delimiter
        }
        ++t; // next character...
    }
    *f = NULL; // and finally terminate the string array
    return final;
}
While I've now been shown a more elegant solution, I've found and rectified the issues in my code:
char** split_at (char* s, char d)
{
int numdelim = 0;
int x;
for (x = 0; s[x] != '\0'; x++)
{
if (s[x] == d)
{
numdelim++;
}
}
int a = 0;
int b = 0;
char** final = (char**)malloc((numdelim+1) * sizeof(char*));
for (int i = 0; i <= numdelim; i++)
{
int sizeofj = 0;
while ((s[a] != d) && (a < x))
{
sizeofj++;
a++;
}
final[i] = (char*)malloc(sizeofj + 1); // +1 for the terminating null
a++;
int j = 0;
while (j < sizeofj)
{
final[i][j] = s[b];
j++;
b++;
}
final[i][j] = '\0';
b++;
}
return final;
}
I consolidated what I previously had as a helper function, and fixed some points where I incremented incorrectly.

method for expand a-z to abc...xyz form

Hi :) What I'm trying to do is write a simple program that expands shorthand notation,
for example
a-z or 0-9 or a-b-c or a-z0-9
into the full form,
for example
abc...xyz or 0123456789 or abc or abcdefghijklmnopqrstuvwxyz0123456789
The first shorthand example should give the first expanded result, and so on :)
So far I have written something like this, and it works only for letters from a to z:
expand(char s[])
{
int i,n,c;
n=c=0;
int len = strlen(s);
for(i = 1;s[i] > '0' && s[i]<= '9' || s[i] >= 'a' && s[i] <= 'z' || s[i]=='-';i++)
{
/*c = s[i-1];
g = s[i];
n = s[i+1];*/
if( s[0] == '-')
printf("%c",s[0]);
else if(s[i] == '-')
{
if(s[i-1]<s[i+1])
{
while(s[i-1] <= s[i+1])
{
printf("%c", s[i-1]);
s[i-1]++;
}
}
else if(s[i-1] == s[i+1])
printf("%c",s[i]);
else if(s[i+1] != '-')
printf("%c",s[i]);
else if(s[i-1] != '-')
printf("%c",s[i]);
}
else if(s[i] == s[i+1])
{
while(s[i] == s[i+1])
{
printf("%c",s[i]);
s[i]++;
}
}
else if( s[len] == '-')
printf("%c",s[len]);
}
}
But now I'm stuck :(
Any ideas what I should check to make my program work correctly?
Edit1: @Andrew Kozak (1) abcd (2) 01234
Thanks in advance :)
Here is a C version (in about 38 effective lines) that satisfies the same test as my earlier C++ version.
The full test program including your test cases, mine and some torture test can be seen live on http://ideone.com/sXM7b#info_3915048
Rationale
I'm pretty sure I'm overstating the requirements, but
this should be an excellent example of how to do parsing in a robust fashion
use states in an explicit fashion
validate input (!)
this version doesn't assume a-c-b can't happen
It also doesn't choke or even fail on simple input like 'Hello World' (or (char*) 0)
it shows how you can avoid printf("%c", c) each char without using extraneous functions.
I put in some comments as to explain what happens why, but overall you'll find that the code is much more legible anyways, by
staying away from too many short-named variables
avoiding complicated conditionals with un-transparent indexers
avoiding the whole string length business: We only need max lookahead of 2 characters, and *it=='-' or predicate(*it) will just return false if it is the null character. Shortcut evaluation prevents us from accessing past-the-end input characters
ONE caveat: I haven't implemented a proper check for output buffer overrun (the capacity is hardcoded at 2048 chars). I'll leave it as the proverbial exercise for the reader
Last but not least, the reason I did this:
It will allow me to compare raw performance of the C++ version and this C version, now that they perform equivalent functions. Right now, I fully expect the C version to outperform the C++ by some factor (let's guess: 4x?) but, again, let's just see what surprises the GNU compilers have in store for us. More later. Update: turns out I wasn't far off: github (code + results)
Pure C Implementation
Without further ado, the implementation, including the testcase:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int alpha_range(char c) { return (c>='a') && (c<='z'); }
int digit_range(char c) { return (c>='0') && (c<='9'); }
char* expand(const char* s)
{
char buf[2048];
const char* in = s;
char* out = buf;
// parser state
int (*predicate)(char) = 0; // either: NULL (free state), alpha_range (in alphabetic range), digit_range (in digit range)
char lower=0,upper=0; // tracks lower and upper bound of character ranges in the range parsing states
// init
*out = 0;
while (*in)
{
if (!predicate)
{
// free parsing state
if (alpha_range(*in) && (in[1] == '-') && alpha_range(in[2]))
{
lower = upper = *in++;
predicate = &alpha_range;
}
else if (digit_range(*in) && (in[1] == '-') && digit_range(in[2]))
{
lower = upper = *in++;
predicate = &digit_range;
}
else *out++ = *in;
} else
{
// in a range
if (*in < lower) lower = *in;
if (*in > upper) upper = *in;
if (in[1] == '-' && predicate(in[2]))
in++; // more coming
else
{
// end of range mode, dump expansion
char c;
for (c=lower; c<=upper; *out++ = c++);
predicate = 0;
}
}
in++;
}
*out = 0; // null-terminate buf
return strdup(buf);
}
void dotest(const char* const input)
{
char* ex = expand(input);
printf("input : '%s'\noutput: '%s'\n\n", input, ex);
if (ex)
free(ex);
}
int main (int argc, char *argv[])
{
dotest("a-z or 0-9 or a-b-c or a-z0-9"); // from the original post
dotest("This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6"); // from my C++ answer
dotest("-x-s a-9 9- a-k-9 9-a-c-7-3"); // assorted torture tests
return 0;
}
Test output:
input : 'a-z or 0-9 or a-b-c or a-z0-9'
output: 'abcdefghijklmnopqrstuvwxyz or 0123456789 or abc or abcdefghijklmnopqrstuvwxyz0123456789'
input : 'This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6'
output: 'This is some efghijklmnopqrstuvwxyz test in 567 steps; this works: abc. This works too: bcdefghijk. Likewise 45678'
input : '-x-s a-9 9- a-k-9 9-a-c-7-3'
output: '-stuvwx a-9 9- abcdefghijk-9 9-abc-34567'
OK, I tested your program and it seems to be working for nearly every case. It correctly expands a-z and other expansions with only two letters/numbers. It fails when there are more letters and numbers. The fix is easy: add a new variable to keep the last printed character, and if the character about to be printed matches the last one, skip it. The a-z0-9 scenario didn't work because you wrote s[i] > '0' instead of s[i] >= '0'. The code is:
#include <stdio.h>
#include <string.h>
void expand(char s[])
{
int i,g,n,c,l;
n=c=l=0; // l (last printed character) must be initialized before it is compared
int len = strlen(s);
for(i = 1;s[i] >= '0' && s[i]<= '9' || s[i] >= 'a' && s[i] <= 'z' || s[i]=='-';i++)
{
c = s[i-1];
g = s[i];
n = s[i+1];
//printf("\nc = %c g = %c n = %c\n", c,g,n);
if(s[0] == '-')
printf("%c",s[0]);
else if(g == '-')
{
if(c<n)
{
if (c != l){
while(c <= n)
{
printf("%c", c);
c++;
}
l = c - 1;
//printf("\nl is %c\n", l);
}
else
{
c++;
while(c <= n)
{
printf("%c", c);
c++;
}
l = c - 1;
//printf("\nl is %c\n", l);
}
}
else if(c == n)
printf("%c",g);
else if(n != '-')
printf("%c",g);
else if(c != '-')
printf("%c",g);
}
else if(g == n)
{
while(g == n)
{
printf("%c",s[i]);
g++;
}
}
else if( s[len] == '-')
printf("%c",s[len]);
}
printf("\n");
}
int main (int argc, char *argv[])
{
expand(argv[1]);
}
Isn't this problem from K&R? I think I saw it there. Anyway I hope I helped.
Based on the fact that the existing function addresses "a-z" and "0-9" sequences just fine, separately, we should explore what happens when they meet. Trace your code (try printing each variable's value at each step -- yes it will be cluttered, so use line breaks), and I believe you will find a logical short-circuit when iterating, for example, from "current token is 'y' and next token is 'z'" to "current token is 'z' and next token is '0'". Explore the if() condition and you will find that it does not cover all possibilities, i.e. you have covered yourself if you are within a<-->z, within 0<-->9, or exactly equal to '-', but you have not considered being at the end of one (a-z or 0-9) with your next character at the start of the next.
Just for fun, I decided to demonstrate to myself that C++ is really just as suited to this kind of thing.
Test-first, please
First, let me define the requirements a little more strictly: I assumed it needs to handle these cases:
int main()
{
const std::string in("This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6");
std::cout << "input : " << in << std::endl;
std::cout << "output: " << expand(in) << std::endl;
}
input : This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6
output: This is some efghijklmnopqrstuvwxyz test in 567 steps; this works: abc. This works too: bcdefghijk. Likewise 45678
C++0x Implementation
Here is an implementation (actually a few variants) in 14 lines (23 including whitespace, comments) of C++0x code [1]
static std::string expand(const std::string& in)
{
static const regex re(R"([a-z](?:-[a-z])+|[0-9](?:-[0-9])+)");
std::string out;
auto tail = in.begin();
for (auto match : make_iterator_range(sregex_iterator(in.begin(), in.end(), re), sregex_iterator()))
{
out.append(tail, match[0].first);
// char range bounds: the cost of accepting unordered ranges...
char a=127, b=0;
for (auto x=match[0].first; x<match[0].second; x+=2)
{ a = std::min(*x,a); b = std::max(*x,b); }
for (char c=a; c<=b; out.push_back(c++));
tail = match.suffix().first;
}
out.append(tail, in.end());
return out;
}
Of course I'm cheating a little because I'm using regex iterators from Boost. I will do some timings comparing to the C version for performance. I rather expect the C++ version to compete within a 50% margin. But, let's see what kind of surprises the GNU compiler has in store for us :)
Here is a complete program that demonstrates the sample input. It also contains some benchmark timings and a few variations that trade off
functional flexibility
legibility / performance
#include <algorithm> // std::min, std::max
#include <iostream>
#include <string>
#include <set> // only needed for the 'slow variant'
#include <boost/regex.hpp>
#include <boost/range.hpp>
using namespace boost;
using namespace boost::range;
static std::string expand(const std::string& in)
{
// static const regex re(R"([a-z]-[a-z]|[0-9]-[0-9])"); // "a-c-d" --> "abc-d", "a-c-e-g" --> "abc-efg"
static const regex re(R"([a-z](?:-[a-z])+|[0-9](?:-[0-9])+)");
std::string out;
out.reserve(in.size() + 12); // heuristic
auto tail = in.begin();
for (auto match : make_iterator_range(sregex_iterator(in.begin(), in.end(), re), sregex_iterator()))
{
out.append(tail, match[0].first);
// char range bounds: the cost of accepting unordered ranges...
#if !SIMPLE_BUT_SLOWER
// debug 15.149s / release 8.258s (at 1024k iterations)
char a=127, b=0;
for (auto x=match[0].first; x<match[0].second; x+=2)
{ a = std::min(*x,a); b = std::max(*x,b); }
for (char c=a; c<=b; out.push_back(c++));
#else // simpler but slower
// debug 24.962s / release 10.270s (at 1024k iterations)
std::set<char> bounds(match[0].first, match[0].second);
bounds.erase('-');
for (char c=*bounds.begin(); c<=*bounds.rbegin(); out.push_back(c++));
#endif
tail = match.suffix().first;
}
out.append(tail, in.end());
return out;
}
int main()
{
const std::string in("This is some e-z test in 5-7 steps; this works: a-b-c. This works too: b-k-c-e. Likewise 8-4-6");
std::cout << "input : " << in << std::endl;
std::cout << "output: " << expand(in) << std::endl;
}
[1] Compiled with g++-4.6 -std=c++0x
This is a Java implementation. It expands the character ranges similar to 0-9, a-z and A-Z. Maybe someone will need it someday and Google will bring them to this page.
package your.package;
public class CharacterRange {
/**
* Expands character ranges similar to 0-9, a-z and A-Z.
*
* @param string a string to be expanded
* @return a string
*/
public static String expand(String string) {
StringBuilder buffer = new StringBuilder();
int i = 1;
while (i <= string.length()) {
final char a = string.charAt(i - 1); // previous char
if ((i < string.length() - 1) && (string.charAt(i) == '-')) {
final char b = string.charAt(i + 1); // next char
char[] expanded = expand(a, b);
if (expanded.length != 0) {
i += 2; // skip
buffer.append(expanded);
} else {
buffer.append(a);
}
} else {
buffer.append(a);
}
i++;
}
return buffer.toString();
}
private static char[] expand(char a, char b) {
char[] expanded = expand(a, b, '0', '9'); // digits (0-9)
if (expanded.length == 0) {
expanded = expand(a, b, 'a', 'z'); // lower case letters (a-z)
}
if (expanded.length == 0) {
expanded = expand(a, b, 'A', 'Z'); // upper case letters (A-Z)
}
return expanded;
}
private static char[] expand(char a, char b, char min, char max) {
if ((a > b) || !(a >= min && a <= max && b >= min && b <= max)) {
return new char[0];
}
char[] buffer = new char[(b - a) + 1];
for (int i = 0; i < buffer.length; i++) {
buffer[i] = (char) (a + i);
}
return buffer;
}
public static void main(String[] args) {
String[] ranges = { //
"0-9", "a-z", "A-Z", "0-9a-f", "a-z2-7", "0-9a-v", //
"0-9a-hj-kmnp-tv-z", "0-9a-z", "1-9A-HJ-NP-Za-km-z", //
"A-Za-z0-9", "A-Za-z0-9+/", "A-Za-z0-9-_" };
for (int i = 0; i < ranges.length; i++) {
String input = ranges[i];
String output = CharacterRange.expand(ranges[i]);
System.out.println("input: " + input);
System.out.println("output: " + output);
System.out.println();
}
}
}
Output:
input: 0-9
output: 0123456789
input: a-z
output: abcdefghijklmnopqrstuvwxyz
input: A-Z
output: ABCDEFGHIJKLMNOPQRSTUVWXYZ
input: 0-9a-f
output: 0123456789abcdef
input: a-z2-7
output: abcdefghijklmnopqrstuvwxyz234567
input: 0-9a-v
output: 0123456789abcdefghijklmnopqrstuv
input: 0-9a-hj-kmnp-tv-z
output: 0123456789abcdefghjkmnpqrstvwxyz
input: 0-9a-z
output: 0123456789abcdefghijklmnopqrstuvwxyz
input: 1-9A-HJ-NP-Za-km-z
output: 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
input: A-Za-z0-9
output: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
input: A-Za-z0-9+/
output: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
input: A-Za-z0-9-_
output: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_
