How to check a list of regex faster - c

I'm using regex in C to check a word against a list of regexes. Below is what I have.
I have this string
0118889994444
and I have these regexes
^012[0-9]{10}$ if this one hits then do 1
^0011[0-9]{10}$ if this one hits then do 2
^00[0-9]{10}$ if this one hits then do 3
^11[0-9]{10}$ if this one hits then do 4
^011[0-9]{10}$ if this one hits then do 5 // this one will match the string
What I'm currently doing is looping through the regex list to see which one hits, and then doing whatever is set for that regex. The bigger the list, the more time it takes to finish the loop. Is there a way or a trick to make this faster and more intelligent? :)

In the case above I would drop regex altogether, and go for a straightforward approach of checking the prefix against a fixed list, followed by checking that the rest of the string is composed of exactly ten digits. You can do it like this:
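#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct match_def {
    const char *prefix;
    int offset; /* length of the prefix */
} match_defs[] = {
      {.prefix = "012", .offset = 3}
    , {.prefix = "0011", .offset = 4}
    , {.prefix = "00", .offset = 2}
    , {.prefix = "11", .offset = 2}
    , {.prefix = "011", .offset = 3}
};

/* true if str consists of exactly ten digits and nothing else (the regex ends in $) */
bool ten_digits(const char* str) {
    int i = 0;
    while (isdigit((unsigned char)str[i])) {
        i++;
    }
    return i == 10 && str[i] == '\0';
}

int main(void) {
    const char *str = "0118889994444";
    for (int i = 0 ; i != 5 ; i++) {
        /* strncmp looks only at the first `offset` characters; strstr would scan the whole string */
        if (strncmp(str, match_defs[i].prefix, match_defs[i].offset) == 0
            && ten_digits(&str[match_defs[i].offset])) {
            printf("Item %d matched.\n", i);
        }
    }
    return 0;
}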

Not sure if it will be the fastest approach, but you could convert the (decimal) string into a 64-bit integer and check the value divided by 1e10. Since the conversion drops leading zeroes, also check the length of the string to account for them.
This will of course only work as long as the characters are in the [0-9] range.
Example, using uintmax_t as integer type:
uintmax_t val,prefix;
char *endp=NULL;
int len, prefixlen;
printf("sizeof val: %zu\n",sizeof(val));
val = strtoumax(argv[1],&endp,10);
prefix = val / 10000000000ULL; // integer division; 1e10 would force a floating-point division
len = (int)(endp - argv[1]);
prefixlen = len - 10;
printf("val=%ju, len=%d\n",val,len);
printf("prefix=%ju, len=%d\n",prefix,prefixlen);
switch ( prefixlen )
{
case 2:
if ( prefix == 0 ) ; // do 3
if ( prefix == 11 ) ; // do 4
break;
case 3:
if ( prefix == 12 ) ; // do 1
if ( prefix == 11 ) ; // do 5
break;
case 4:
if ( prefix == 11 ) ; // do 2
break;
}
This is probably fast compared to any solution comparing actual strings (just one loop to parse the number), but hard to maintain or expand to non-decimal strings or values exceeding 2^64.
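As a rough sketch, the fragment above could be wrapped into a function that returns the action number from the question. The name classify and the convention of returning 0 for "no match" are assumptions made for illustration. So is the extra validation: it rejects a leading sign or whitespace (which strtoumax would otherwise accept) and any trailing non-digit characters.

```c
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical helper: returns the action number (1..5) from the question,
 * or 0 if the string matches none of the five patterns. */
int classify(const char *s) {
    char *end = NULL;

    /* strtoumax skips whitespace and accepts a sign, so insist on a digit first */
    if (s[0] < '0' || s[0] > '9')
        return 0;

    uintmax_t val = strtoumax(s, &end, 10);
    if (*end != '\0')                 /* trailing non-digit characters */
        return 0;

    ptrdiff_t len = end - s;          /* total number of digits */
    if (len < 12 || len > 14)         /* prefix of 2..4 digits plus 10 digits */
        return 0;

    uintmax_t prefix = val / UINTMAX_C(10000000000);  /* drop the last 10 digits */

    switch (len - 10) {               /* prefix length restores the leading zeroes */
    case 2:
        if (prefix == 0)  return 3;   /* "00"   */
        if (prefix == 11) return 4;   /* "11"   */
        break;
    case 3:
        if (prefix == 12) return 1;   /* "012"  */
        if (prefix == 11) return 5;   /* "011"  */
        break;
    case 4:
        if (prefix == 11) return 2;   /* "0011" */
        break;
    }
    return 0;
}
```

Calling classify("0118889994444") with the string from the question selects action 5.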

Related

Maximum binary number a binary string will result to if only one operation is allowed i.e. Right-Rotate By K-Bits where K = [0, Length of String]

Suppose you have a binary string S and you are only allowed to do one operation i.e. Right-Rotate by K-bits where K = [0, Length of the string]. Write an algorithm that will print the maximum binary-number you can create by the defined process.
For Example:
S = [00101] then maximum value I can get from the process is 10100 i.e. 20.
S = [011010] then maximum value I can get from the process is 110100 i.e. 52.
S = [1100] then maximum value I can get from the process is 1100 i.e. 12.
The length of the string S has an upper-limit i.e. 5*(10^5).
The idea which I thought of is kind of very naive which is: as we know that when you right-rotate any binary number by 1-bit, you get the same binary number after m rotations where m = number of bits required to represent that number.
So, I right-rotate by 1 until I get to the number with which I started with and during the process, I keep track of the max-value I encountered and in last I print the max-value.
Is there an efficient approach to solve the problem?
UPD1: This is the source of the problem One-Zero, it all boils down to the statement I have described above.
UPD2: As the answer can be huge, the program will print the answer modulo 10^9 + 7.
You want to find the largest number expressed in a binary encoded string with wrap around.
Here are steps for a solution:
let len be the length of the string
allocate an array of size 2 * len and duplicate the string to it.
using linear search, find the position pos of the largest substring of length len in this array (lexicographical order can be used for that).
compute res, the converted number modulo 10^9+7, reading len bits starting at pos.
free the array and return res.
Here is a simple implementation as a function:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
long max_bin(const char *S) {
size_t i, pos, len;
char *a;
// long has at least 31 value bits, enough for numbers up to 2 * 1000000007
long res;
if ((len = strlen(S)) == 0)
return 0;
if ((a = malloc(len + len)) == NULL)
return -1;
memcpy(a, S, len);
memcpy(a + len, S, len);
// find the smallest right rotation for the greatest substring
for (pos = i = len; --i > 0;) {
if (memcmp(a + i, a + pos, len) > 0)
pos = i;
}
res = 0;
for (i = 0; i < len; i++) {
res = res + res + a[pos + i] - '0';
if (res >= 1000000007)
res -= 1000000007;
}
free(a);
return res;
}
int main(int argc, char *argv[]) {
for (int i = 1; i < argc; i++) {
printf("[%s] -> %ld\n", argv[i], max_bin(argv[i]));
}
return 0;
}
It is feasible to avoid memory allocation if it is a requirement.
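One way this might be done, as a sketch only: compare rotations in place with modular indexing instead of duplicating the string. The names rot_cmp and max_bin_noalloc are hypothetical. The search is still the same brute-force comparison of every rotation against the best one so far, so the worst case remains quadratic.

```c
#include <stddef.h>
#include <string.h>

/* Compare the rotation of s starting at i with the one starting at j.
 * Returns <0, 0 or >0 like memcmp, but wraps around instead of using a copy. */
static int rot_cmp(const char *s, size_t len, size_t i, size_t j) {
    for (size_t k = 0; k < len; k++) {
        char a = s[(i + k) % len];
        char b = s[(j + k) % len];
        if (a != b)
            return a < b ? -1 : 1;
    }
    return 0;
}

/* Largest rotation of the binary string s, modulo 10^9+7, without malloc. */
long max_bin_noalloc(const char *s) {
    size_t len = strlen(s);
    size_t pos = 0;                   /* start of the best rotation so far */
    long res = 0;

    if (len == 0)
        return 0;
    for (size_t i = 1; i < len; i++) {
        if (rot_cmp(s, len, i, pos) > 0)
            pos = i;
    }
    for (size_t k = 0; k < len; k++) {
        /* res stays below 10^9+7, so doubling it fits in a 32-bit long */
        res = res + res + (s[(pos + k) % len] - '0');
        if (res >= 1000000007L)
            res -= 1000000007L;
    }
    return res;
}
```

For the examples in the question, max_bin_noalloc("00101") gives 20 and max_bin_noalloc("011010") gives 52.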
It's me again.
I got to thinking a bit more about your problem in the shower this morning, and it occurred to me that you could do a QuickSelect (if you're familiar with that) over an array of the start indexes of the input string and determine the index of the most "valuable" rotate based on that.
What I show here does not concern itself with presenting the result the way you are required to, only with determining what the best offset for rotation is.
This is not a textbook QuickSelect implementation but rather a simplified method that does the same thing while taking into account that it's a string of zeros and ones that we are dealing with.
Main driver logic:
static void Main(string[] args)
{
Console.WriteLine(FindBestIndex("")); // exp -1
Console.WriteLine(FindBestIndex("1")); // exp 0
Console.WriteLine(FindBestIndex("0")); // exp 0
Console.WriteLine(FindBestIndex("110100")); // exp 0
Console.WriteLine(FindBestIndex("100110")); // exp 3
Console.WriteLine(FindBestIndex("01101110")); // exp 4
Console.WriteLine(FindBestIndex("11001110011")); // exp 9
Console.WriteLine(FindBestIndex("1110100111110000011")); // exp 17
}
Set up the index array that we'll be sorting, then call FindHighest to do the actual work:
static int FindBestIndex(string input)
{
if (string.IsNullOrEmpty(input))
return -1;
int[] indexes = new int[input.Length];
for (int i = 0; i < indexes.Length; i++)
{
indexes[i] = i;
}
return FindHighest(input, indexes, 0, input.Length);
}
Partition the index array into two halves depending on whether each index points to a string that starts with zero or one at this offset within the string.
Once that's done, if we have just one element that started with one, we have the best string, else if we have more, partition those based on the next index. If none started with one, proceed with zero in the same way.
static int FindHighest(string s, int[] p, int index, int len)
{
// allocate two new arrays,
// one for the elements of p that have zero at this index, and
// one for the elements of p that have one at this index
int[] zero = new int[len];
int[] one = new int[len];
int count_zero = 0;
int count_one = 0;
// run through p and distribute its elements to 'zero' and 'one'
for (int i = 0; i < len; i++)
{
int ix = p[i]; // index into string
int pos = (ix + index) % s.Length; // wrap for string length
if (s[pos] == '0')
{
zero[count_zero++] = ix;
}
else
{
one[count_one++] = ix;
}
}
// if we have a one in this position, use that, else proceed with zero (below)
if (count_one > 1)
{
// if we have more than one, sort on the next position (index)
return FindHighest(s, one, index + 1, count_one);
} else if (count_one == 1)
{
// we have exactly one entry left in ones, so that's the best one we have overall
return one[0];
}
if (count_zero > 1)
{
// if we have more than one, sort on the next position (index)
return FindHighest(s, zero, index + 1, count_zero);
}
else if (count_zero == 1)
{
// we have exactly one entry left in zeroes, and we didn't have any in ones,
// so this is the best one we have overall
return zero[0];
}
return -1;
}
Note that this can be optimized further by expanding the logic: If the input string has any ones at all, there's no point in adding indexes where the string starts with zero to the index array in FindBestIndex since those will be inferior. Also, if an index does start with a one but the previous index also did, you can omit the current one as well because the previous index will always be better since that string flows into this character.
If you like, you can also refactor to remove the recursion in favor of a loop.
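A loop-based variant might look like the following sketch. It is written in C rather than the C# used above, and the function name find_best_index is an assumption. It filters a single candidate array in place instead of allocating two new arrays per level, and it caps the offset at the string length so that inputs whose rotations are all identical (for example "1010") still terminate. The worst case is still quadratic.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the start index of the largest rotation of the binary string s,
 * or -1 for an empty string or on allocation failure. */
long find_best_index(const char *s) {
    size_t n = strlen(s);
    if (n == 0)
        return -1;

    size_t *cand = malloc(n * sizeof *cand);   /* candidate start indexes */
    if (cand == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        cand[i] = i;

    size_t count = n;
    for (size_t off = 0; off < n && count > 1; off++) {
        /* Does any remaining candidate have a '1' at this offset? */
        int any_one = 0;
        for (size_t i = 0; i < count; i++) {
            if (s[(cand[i] + off) % n] == '1') {
                any_one = 1;
                break;
            }
        }
        /* Keep only the candidates with the better digit, preserving order. */
        char want = any_one ? '1' : '0';
        size_t kept = 0;
        for (size_t i = 0; i < count; i++) {
            if (s[(cand[i] + off) % n] == want)
                cand[kept++] = cand[i];
        }
        count = kept;
    }

    long best = (long)cand[0];   /* smallest index among equally good starts */
    free(cand);
    return best;
}
```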
If I were tackling this I would do so as follows.
I think it's all to do with counting alternating runs of '1' and runs of '0', treating a run of '1's followed by a run of '0's as a pair, then bashing a list of those pairs.
Let us start by scanning to the first '1', and setting start position s. Then count each run of '1's c1 and the following run of '0's c0, creating pairs (c1,c0).
The scan then proceeds forwards to the end, wrapping round as required. If we represent runs of one or more '1' and '0' as single digits, and '|' as the start and end of the string, then we have cases:
|10101010|
^ initial start position s -- the string ends neatly with a run of '0's
|1010101|
^ new start position s -- the string starts and ends in a '1', so the first
run of '1's and run of '0's are rotated (left),
to be appended to the last run of '1's
Note that this changes our start position.
|01010101|
^ initial start position s -- the first run of '0's is rotated (left),
to follow the last run of '1's.
|010101010|
^ initial start position s -- the first run of '0's is rotated (left),
to be appended to the last run of '0's.
NB: if the string both starts and ends with a '1', there are, initially, n runs of '0's and n+1 runs of '1's, but the rotation reduces that to n runs of '1's. And similarly, but conversely, if the string both starts and ends with a '0'.
Let us use A as shorthand for the pair (a1,a0). Suppose we have another pair, X -- (x1,x0) -- then we can compare the two pairs, thus:
if a1 > x1 or (a1 = x1 and a0 < x0) => A > X -- A is better start
if a1 = x1 and a0 = x0 => A = X
if a1 < x1 or (a1 = x1 and a0 > x0) => A < X -- X is better start
The trick is probably to pack each pair into an integer -- say (x1 * N) - x0, where N is at least the maximum allowed length of the string -- for ease of comparison.
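A minimal sketch of that packing, using a shift by 32 bits in place of the multiplier N (the code later in this answer does the same). The function name pack_pair is hypothetical, and the sketch assumes every pair has at least one '1' and that both run lengths fit in 32 bits.

```c
#include <stdint.h>

/* Pack a (run of '1's, run of '0's) pair so that plain integer comparison
 * orders pairs as described: more '1's wins, then fewer '0's wins.
 * Assumes ones >= 1, so the subtraction cannot wrap below zero. */
uint64_t pack_pair(uint32_t ones, uint32_t zeros) {
    return ((uint64_t)ones << 32) - zeros;
}
```

With this encoding, pack_pair(3, 1) compares greater than pack_pair(3, 2), and pack_pair(3, 5) compares greater than pack_pair(2, 1).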
During the scan of the string (described above) let us construct a vector of pairs. During that process, collect the largest pair value A, and a list of the positions, s, of each appearance of A. Each s on the list is a potential best start position. (The recorded s needs to be the index in the vector of pairs and the offset in the original string.)
[If the input string is truly vast, then constructing the entire vector of all pairs will eat memory. In which case the vector would need to be handled as a "virtual" vector, and when an item in that vector is required, it would have to be created by reading the respective portion of the actual string.]
Now:
let us simplify groups of two or more contiguous A. Clearly the second and subsequent A's in such a group cannot be the best start, since there is a better one immediately to the left. So, in the scan we need to record only the s for the first A of such groups.
if the string starts with one or more A's and ends with one or more A's, need to "rotate" to collect those as a single group, and record the s only for the leftmost A in that group (in the usual way).
If there is only one s on the list, our work is done. If the string is end-to-end A, that will be spotted here.
Otherwise, we need to consider the pairs which follow each of the s for our (initial) A's -- where when we say 'follow' we include wrapping round from the end to the start of the string (and, equivalently, the list of pairs).
NB: at this point we know that all the (initial) A's on our list are followed by zero or more A's and then at least one x, where x < A.
So, set i = 1, and consider all the pairs at s+i for our list of s. Keep only the s for the instances of the largest pair found. So for i = 1, in this example we are considering pairs x, y and z:
...Ax....Az...Az..Ax...Ay...
And if x is the largest, this pass discards Ay and both Az. If only one s remains -- in the example, y is the largest -- our work is done. Otherwise, repeat for i = i+1.
There is one last (I think) wrinkle. Suppose after finding z to be the largest of the ith pairs, we have:
...A===zA===z...
where the two runs === are the same as each other. By the same logic that told us to ignore second and subsequent A's in runs of same, we can now discard the second A===z. Indeed we can discard third, fourth, etc. contiguous A===z. Happily that deals with the extreme case of (say):
=zA===zA===zA==
where the string is a sequence of A===z !
I dunno, that all seems more complicated than I expected when I set out with my pencil and paper :-(
I imagine someone much cleverer than I can reduce this to some standard greatest prefix-string problem.
I was bored today, so I knocked out some code (and revised it on 10-Apr-2020).
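#include <limits.h> // UINT_MAX
#include <stdbool.h> // bool, true, false
#include <stdint.h> // uint64_t
#include <stdio.h> // printf
#include <stdlib.h> // malloc, free
typedef unsigned int uint ; // assume that's uint32_t or similar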
enum { max_string_len = 5 * 100000 } ; // 5 * 10^5
typedef uint64_t pair_t ;
static uint
one_zero(const char* str, uint N)
{
enum { debug = false } ;
void* mem ;
size_t nmx ;
uint s1, s0 ; // initial run of '1'/'0's to be rotated
uint s ;
pair_t* pv ; // pair vector
uint* psi ; // pair string index
uint* spv ; // starts vector -- pair indexes
uint pn ; // count of pairs
uint sn ; // count of starts
pair_t bp ; // current best start pair value
uint bpi ; // current best start pair index
uint bsc ; // count of further best starts
char ch ;
if (N > max_string_len)
{
printf("*** N = %u > max_string_len (%u)\n", N, max_string_len) ;
return UINT_MAX ;
} ;
if (N < 1)
{
printf("*** N = 0 !!\n") ;
return UINT_MAX ;
} ;
// Decide on initial start position.
s = s1 = s0 = 0 ;
if (str[0] == '0')
{
// Start at first '1' after initial run of '0's
do
{
s += 1 ;
if (s == N)
return 0 ; // String is all '0's !!
}
while (str[s] == '0') ;
s0 = s ; // rotate initial run of '0's
}
else
{
// First digit is '1', but if last digit is also '1', need to rotate.
if (str[N-1] == '1')
{
// Step past the leading run of '1's and set length s1.
// This run will be appended to the last run of '1's in the string
do
{
s += 1 ;
if (s == N)
return 0 ; // String is all '1's !!
}
while (str[s] == '1') ;
s1 = s ; // rotate initial run of '1's
// Step past the (now) leading run of '0's and set length s0.
// This run will be appended to the last run of '1's in the string
//
// NB: we know there is at least one '0' and at least one '1' before
// the end of the string
do { s += 1 ; } while (str[s] == '0') ;
s0 = s - s1 ;
} ;
} ;
// Scan the string to construct the vector of pairs and the list of potential
// starts.
nmx = (((N / 2) + 64) / 64) * 64 ;
mem = malloc(nmx * (sizeof(pair_t) + sizeof(uint) + sizeof(uint))) ;
pv = (pair_t*)mem ;
spv = (uint*)(pv + nmx) ;
psi = (uint*)(spv + nmx) ;
pn = 0 ;
bp = 0 ; // no pair is 0 !
bpi = 0 ;
bsc = 0 ; // no best yet
do
{
uint x1, x0 ;
pair_t xp ;
psi[pn] = s ;
x1 = x0 = 0 ;
do
{
x1 += 1 ;
s += 1 ;
ch = (s < N) ? str[s] : '\0' ;
}
while (ch == '1') ;
if (ch == '\0')
{
x1 += s1 ;
x0 = s0 ;
}
else
{
do
{
x0 += 1 ;
s += 1 ;
ch = (s < N) ? str[s] : '\0' ;
}
while (ch == '0') ; // test ch, not str[s], so we never read past str[N-1]
if (ch == '\0')
x0 += s0 ;
} ;
// Register pair (x1,x0)
reg:
pv[pn] = xp = ((uint64_t)x1 << 32) - x0 ;
if (debug && (N == 264))
printf("si=%u, sn=%u, pn=%u, xp=%lx bp=%lx\n", psi[sn], sn, pn, xp, bp) ;
if (xp > bp)
{
// New best start.
bpi = pn ;
bsc = 0 ;
bp = xp ;
}
else
bsc += (xp == bp) ;
pn += 1 ;
}
while (ch != '\0') ;
// If there are 2 or more best starts, need to find them all, but discard
// second and subsequent contiguous ones.
spv[0] = bpi ;
sn = 1 ;
if (bsc != 0)
{
uint pi ;
bool rp ;
pi = bpi ;
rp = true ;
do
{
pi += 1 ;
if (pv[pi] != bp)
rp = false ;
else
{
bsc -= 1 ;
if (!rp)
{
spv[sn++] = pi ;
rp = true ;
} ;
} ;
}
while (bsc != 0) ;
} ;
// We have: pn pairs in pv[]
// sn start positions in spv[]
for (uint i = 1 ; sn > 1 ; ++i)
{
uint sc ;
uint pi ;
pair_t bp ;
if (debug && (N == 264))
{
printf("i=%u, sn=%u, pv:", i, sn) ;
for (uint s = 0 ; s < sn ; ++s)
printf(" %u", psi[spv[s]]) ;
printf("\n") ;
} ;
pi = spv[0] + i ; // index of first s+i pair
if (pi >= pn) { pi -= pn ; } ;
bp = pv[pi] ; // best value, so far.
sc = 1 ; // new count of best starts
for (uint sj = 1 ; sj < sn ; ++sj)
{
// Consider the ith pair ahead of sj -- compare with best so far.
uint pb, pj ;
pair_t xp ;
pb = spv[sj] ;
pj = pb + i ; // index of next s+i pair
if (pj >= pn) { pj -= pn ; } ;
xp = pv[pj] ;
if (xp == bp)
{
// sj is equal to the best so far
//
// So we keep both, unless we have the 'A==zA==z' case,
// where 'z == xp == bp', the current 'ith' position.
uint pa ;
pa = pi + 1 ;
if (pa == pn) { pa = 0 ; } ; // position after first 'z'
// If this is not an A==zA==z case, keep sj
// Otherwise, drop sj (by not putting it back into the list),
// but update pi so can spot A==zA==zA===z etc. cases.
if (pa != pb)
spv[sc++] = spv[sj] ; // keep sj
else
pi = pj ; // for further repeats
}
else if (xp < bp)
{
// sj is less than best -- do not put it back into the list
}
else
{
// sj is better than best -- discard everything so far, and
// set new best.
sc = 1 ; // back to one start
spv[0] = spv[sj] ; // new best
pi = pj ; // new index of ith pair
bp = xp ; // new best pair
} ;
} ;
sn = sc ;
} ;
s = psi[spv[0]] ;
free(mem) ;
return s ;
}
I have tested this against the brute force method given elsewhere, and as far as I can see this is (now, on 10-Apr-2020) working code.
When I timed this on my machine, for 100,000 random strings of 400,000..500,000 characters (at random) I got:
Brute Force: 281.8 secs CPU
My method: 130.3 secs CPU
and that's excluding the 8.3 secs to construct the random string and run an empty test. (That may sound a lot, but for 100,000 strings of 450,000 characters, on average, that's a touch less than 1 CPU cycle per character.)
So for random strings, my complicated method is a little over twice as fast as brute-force. But it uses ~N*16 bytes of memory, where the brute-force method uses N*2 bytes. Given the effort involved, the result is not hugely gratifying.
However, I also tried two pathological cases, (1) repeated "10" and (2) repeated "10100010" and for just 1000 (not 100000) strings of 400,000..500,000 characters (at random) I got:
Brute Force: (1) 1730.9 (2) 319.0 secs CPU
My method: 0.7 0.7 secs CPU
That O(n^2) will kill you every time !
#include <iostream>
#include <string>
#include <math.h>
#include <climits> // INT_MIN
using namespace std;
int convt(int N,string S)
{
int sum=0;
for(int i=0; i<N; i++)
{
int num=S[i];
sum += pow(2,N-1-i)*(num-48);
}
return sum;
}
string rot(int N, string S)
{
int temp;
temp = S[0];
for( int i=0; i<N;i++)
S[i]=S[i+1];
S[N-1]=temp;
return S;
}
int main() {
int t;
cin>>t;
while(t--)
{
int N,K;
cin>>N;
cin>>K;
string S(N, '0'); // std::string rather than an unterminated, variable-length char array
for(int i=0; i<N; i++)
cin>>S[i];
string SS= S;
int mx_val=INT_MIN;
for(int i=0;i<N;i++)
{
string S1=rot(N,SS);
SS= S1;
int k_val=convt(N,SS);
if (k_val>mx_val)
mx_val=k_val;
}
int ki=0;
int j=0;
string S2=S;
while(ki!=K)
{
S2=rot(N,S2);
if (convt(N,S2)==mx_val)
ki++;
j++;
}
cout<<j<<endl;
}
}

Parsing a CSV File Problems

I tried this to parse data given in a csv file into ID, AGE, and GPA fields in a "data" file, but I don't think I'm doing this right (when I tried printing the data, its printing weird numbers). What am I doing wrong?
char data[1000];
FILE *x = fopen("database.csv","rt");
char NAME[300];
int ID[300],AGE[300],GPA[300];
int i,j;
i = 0;
while(!feof(x)) {
fgets(data,999,x);
for (j = 0; j < 300 && data[i] != ','; j++, i++) {
ID[j] = data[i];
i++;
}
for (j = 0; j < 300 && data[i] != ','; j++, i++) {
NAME[j] = data[i];
i++;
}
for (j = 0; j < 300 && ( data[i] != '\0' || data[i] != '\r' || data[i] != data[i] != '\n'); j++, i++) {
GPA[j] = data[i];
}
}
First of all: for what you're doing, you probably want to look carefully at the functions strtok and atoi. But given the code you posted, that's perhaps still a bit too advanced, so I'm taking a longer way here.
Supposing that the line is something like
172,924,1182
then you need to parse those numbers. The number 172 is actually represented by two or four bytes in memory, in a very different format, and the byte "0" is nothing like the number 0. What you'll read is the ASCII code, which is 48 in decimal, or 0x30 in hex.
If you take the ASCII value of a single digit and subtract 48, you will get a number, because fortunately the numbers are stored in digit order, so "0" is 48, "1" is 49 and so on.
But you still have the problem of converting the three digits 1 7 2 into 172.
So once you have 'data':
(I have added commented code to deal with an unquoted, unescaped text field inside the CSV, since in your question you mention an AGE field, but then you seem to want to use a NAME field. The case when the text field is quoted or escaped is another can of worms entirely.)
size_t i = 0;
int number = 0;
int c;
int field = 0; // Fields start at 0 (ID).
// size_t x = 0;
// A for loop that never ends until we issue a "break"
for(;;) {
c = data[i++];
// What character did we just read?
// 0x0d is carriage return, 0x0a is line feed, 0x00 is the string terminator
if ((',' == c) || (0x0d == c) || (0x0a == c) || (0x00 == c)) {
// We have completed read of a number field. Which field was it?
switch(field) {
case 0: ID[j] = number; break;
case 1: AGE[j] = number; break;
// case 1: NAME[j][x] = 0; break; // we have already read in NAME, but we need the ASCIIZ string terminator.
case 2: GPA[j] = number; break;
}
// Are we at the end of line (or at the end of the buffer)?
if ((0x0a == c) || (0x0d == c) || (0x00 == c)) {
// Yes, break the cycle and read the next line
break;
}
// Read the next field. Reinitialize number.
field++;
number = 0;
// x = 0; // if we had another text field
continue;
}
// Each time we get a digit, the old value of number is shifted one order of magnitude, and c gets added. This is called Horner's algorithm:
// Number Read You get
// 0 "1" 0*10+1 = 1
// 1 "7" 1*10+7 = 17
// 17 "2" 17*10+2 = 172
// 172 "," Finished. Store 172 in the appropriate place.
if (c >= '0' && c <= '9') {
number = number * 10 + (c - '0');
}
/*
switch (field) {
case 1:
NAME[j][x++] = c;
break;
}
*/
}
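For comparison, the shorter route mentioned at the start of this answer might look like the sketch below. It uses strtok to split the line and strtol (rather than atoi) to convert each field. The function name parse_int_fields and its parameters are assumptions for illustration. Note that strtok modifies the buffer and silently skips empty fields, so this only suits simple lines of numbers such as 172,924,1182.

```c
#include <stdlib.h>
#include <string.h>

/* Splits `line` on commas and line endings and converts each field to a long.
 * Stores at most max_fields values in out[] and returns how many were stored.
 * The contents of `line` are modified by strtok. */
int parse_int_fields(char *line, long *out, int max_fields) {
    int count = 0;
    char *tok = strtok(line, ",\r\n");
    while (tok != NULL && count < max_fields) {
        out[count++] = strtol(tok, NULL, 10);
        tok = strtok(NULL, ",\r\n");
    }
    return count;
}
```

After fgets has filled data, calling parse_int_fields(data, fields, 3) on the line above would leave 172, 924 and 1182 in fields[0] to fields[2].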

String to space separated integer

I have a string with integers in it (in descending order) then output should be space separated integers. Integers are not negative.
INPUT: 9876543 OUTPUT: 9 8 7 6 5 4 3
INPUT: 109876543 OUTPUT: 10 9 8 7 6 5 4 3
INPUT: 400399398397 OUTPUT: 400 399 398 397
So I tried using sscanf() but was not able to get the desired result, this is the code I tried:
fgets(s1,100,stdin); // Get string
while(sscanf(data1,"%d%n",&m1,&len)==1){
b[i] = m1; // Store the integers in the array
data1 += len;
i += 1;
}
How can I achieve the desired result?
Continuing from the comments and the additional answer, to parse and separate the string into a space separated series of integers decreasing by one, there are probably a number of differing approaches you can take. The biggest design question is whether you start with the length of the input string, cut it in half and then work backwards decreasing the number of digits you check for adjacent values by one -- or whether you start at the beginning and work toward the end incrementing the number of digits being considered along the way.
Regardless of the direction you choose, the twist is handling/checking adjacent values with a different number of digits. Your second example, 109876543, hits at the heart of this twist, where you must code a way to check the 2-digit value 10 against the next single-digit value in the series 9. There is just no pretty way to do this. One reasonable way is to simply compute the smallest number that can be represented by n-digits (e.g. 10, 100, 1000, ...). Essentially 10^(n-1) (where we let int expn = n - 1;). If your first value v1 is equal to 10^(n-1), then reduce the number of characters you consider for the next smallest values. Something like the following:
while (expn--) /* loop to build 10 ^ (n-1) */
x10 *= 10; /* compute 10 ^ (n-1), 10, 100 */
if (v1 == x10) /* compare against v1 */
n--; /* reduce t2 by 1-char/digit */
The remainder of the task is just basically a brute force check with a minimum number of validations necessary to protect array bounds, while handling adding values to your integer array (or however you want to store values until you validate or invalidate the remaining characters in the string) while you work your way through the remaining characters.
Putting all the pieces together, and noting there are many, many ways to code this, this example being only one, you could do something similar to the following. Note that the code handles the conversion from ASCII to int in the single-digit series case by subtracting '0' from the character value; for multi-digit conversions, strtol is used with a validation check of errno. The code works from beginning to end of the string, incrementing the number of digits checked until the end of the string is reached. If a solution is found, a space-separated list of integers is output; otherwise, "no solution found." is output. The code is commented to help you work through it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#define MAXI 256
int main (int argc, char **argv) {
int a[MAXI] = {0}, i = 1, idx = 0, n = 1, len;
char *p = argc > 1 ? argv[1] : "9876543";
printf ("string : %s\n", p); /* original input string */
len = (int)strlen (p); /* get length */
while (i + n <= len && n < len) { /* loop until conditions met */
if (n >= MAXI) { /* protect int array bounds */
fprintf (stderr, "warning: array full, %d elements filled.\n", n);
break;
}
if (n == 1) { /* handle single digits series */
if (p[i - 1] == p[i] + 1) { /* previous equal current + 1? */
if (!idx) /* if array index == 0 */
a[idx++] = p[i - 1] - '0'; /* store first integer */
a[idx++] = p[i] - '0'; /* store current integer */
i++; /* increment string index */
}
else
n++, i = n, idx = 0; /* increment n-digits to check */
} /* set i = n, zero array index */
else { /* handle multi-digit values */
char t1[MAXI] = "", t2[MAXI] = ""; /* tmp strings for values */
int v1 = 0, v2 = 0, /* tmp for converted values */
expn = n - 1, x10 = 1, /* 10 ^ expn for n-- test */
norig = n; /* n to restore on no match */
strncpy (t1, p + i - n, n); /* copy n-digits for 1st value */
errno = 0;
v1 = (int) strtol (t1, NULL, 10); /* convert to int/validate */
if (errno) {
fprintf (stderr, "error: failed conversion, i: %d, n: %d\n",
i, n);
return 1;
}
while (expn--) /* loop to build 10 ^ (n-1) */
x10 *= 10; /* compute 10 ^ (n-1), 10, 100 */
if (v1 == x10) /* compare against v1 */
n--; /* reduce t2 by 1-char/digit */
strncpy (t2, p + i, n); /* copy n-digits for 2nd value */
errno = 0;
v2 = (int) strtol (t2, NULL, 10); /* convert to int/validate */
if (errno) {
fprintf (stderr, "error: failed conversion, i: %d, n: %d\n",
i, n);
return 1;
}
if (v1 == v2 + 1) { /* check decreasing values */
if (!idx) /* if array index == 0 */
a[idx++] = v1; /* store first integer */
a[idx++] = v2; /* store current integer */
i += n; /* increment string index */
}
else {
n += n < norig ? 2 : 1; /* reset n if no match */
i = n; /* set string index to n */
idx = 0; /* reset array index to 0 */
}
}
}
if (idx && n < len) { /* if array has values, output */
printf ("integers :");
for (int j = 0; j < idx; j++)
printf (" %*d", n, a[j]);
putchar ('\n');
}
else
printf ("no solution found.\n");
return 0;
}
note: not all corner-cases have been evaluated and the input is presumed to contain only digits. (you are free to add the check for isdigit if you expect otherwise), further testing on your part should be done to satisfy yourself any odd-ball cases are sufficiently covered.
Example Use/Output
$ ./bin/intsepdecr
string : 9876543
integers : 9 8 7 6 5 4 3
$ ./bin/intsepdecr 109876543
string : 109876543
integers : 10 9 8 7 6 5 4 3
$ ./bin/intsepdecr 400399398397
string : 400399398397
integers : 400 399 398 397
$ ./bin/intsepdecr 400399398396
string : 400399398396
no solution found.
$ ./bin/intsepdecr 101176543
string : 101176543
no solution found.
Look things over and let me know if you have any further questions.
I can give a basic algorithm of how to go about this problem.
1) Convert the first x digits of the string into an integer (initially x = 1). You can use a simple function like strtol for this. (Let us say this number is N.)
2) Find the number of digits in N-1 and convert that many of the following digits into an integer.
3) Is the converted value equal to N-1? If so, continue and convert the rest of the string by repeating steps 2 and 3 (setting N = N-1 each time).
4) If it is not equal to N-1, repeat from step 1, this time incrementing the number of digits converted.
5) Exit the program and declare a malformed string if you are not able to convert the entire string and x is greater than half the length of the string.
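The steps above can be sketched in C roughly as follows. The function name split_descending and its output-buffer interface are assumptions, the input is presumed to contain only digits, and the first number is limited to 18 digits so that it fits in an unsigned long long. On failure the contents of out are unspecified.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Tries to split s into integers that decrease by one.
 * On success, writes them space-separated into out and returns 1.
 * Returns 0 if no split exists or out is too small. */
int split_descending(const char *s, char *out, size_t outsz) {
    size_t len = strlen(s);

    /* x is the number of digits tried for the first integer */
    for (size_t x = 1; x <= len / 2 && x <= 18; x++) {
        char first[20] = "";
        memcpy(first, s, x);
        unsigned long long n = strtoull(first, NULL, 10);

        int w = snprintf(out, outsz, "%llu", n);
        if (w < 0 || (size_t)w >= outsz)
            return 0;
        size_t used = (size_t)w;
        size_t pos = x;
        int ok = 1;

        while (pos < len) {
            char next[24];
            if (n == 0) { ok = 0; break; }   /* integers are not negative */
            n--;
            int d = snprintf(next, sizeof next, "%llu", n);
            /* the next digits of s must spell out exactly n */
            if (strncmp(s + pos, next, (size_t)d) != 0) { ok = 0; break; }
            w = snprintf(out + used, outsz - used, " %s", next);
            if (w < 0 || (size_t)w >= outsz - used)
                return 0;
            used += (size_t)w;
            pos += (size_t)d;
        }
        if (ok)
            return 1;
    }
    return 0;
}
```

For the inputs in the question, "109876543" yields "10 9 8 7 6 5 4 3" and "400399398397" yields "400 399 398 397".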
Here is something I just whipped up. I tested it against your cases and it works.
Mind that it is in C++, but it would be simple to convert to C.
string spaceNum(string in)
{
string numConvBuff;
size_t matchSize = 0;
if (in.size() == 1)
{
return to_string(in[0] - '0');
}
for (int i = 0; i < in.size() / 2; i++)
{
numConvBuff = in.substr(0, i + 1);
unsigned int numRes = stoul(numConvBuff) - 1;
string numResStr = to_string(numRes);
string n = in.substr(i + 1, numResStr.length());
if(numRes == stoul(n))
matchSize = i+1;
}
if (matchSize)
{
string out = in.substr(0, matchSize);
unsigned int numRes = stoul(out);
for (size_t i = matchSize; i < in.length();)
{
numRes--;
string numResStr = to_string(numRes);
string n = in.substr(i, numResStr.length());
out += " " + n;
i += numResStr.length();
}
return out;
}
return "";
}

I need to add string characters in C. A + B must = C. Literally

I am writing a program that is due tonight at midnight, and I am utterly stuck. The program is written in C, and takes input from the user in the form SOS where S = a string of characters, O = an operator (I.E. '+', '-', '*', '/'). The example input and output in the book is the following:
Input> abc+aab
Output: abc + aab => bce
And that's literally, not variable. Like, a + a must = b.
What is the code to do this operation? I will post the code I have so far, however all it does is take the input and divide it between each part.
#include <stdio.h>
#include <string.h>
int main() {
system("clear");
char in[20], s1[10], s2[10], o[2], ans[15];
while(1) {
printf("\nInput> ");
scanf("%s", in);
if (in[0] == 'q' && in[1] == 'u' && in[2] == 'i' && in[3] == 't') {
system("clear");
return 0;
}
int i, hold, breakNum;
for (i = 0; i < 20; i++) {
if (in[i] == '+' || in[i] == '-' || in[i] == '/' || in[i] == '*') {
hold = i;
}
if (in[i] == '\0') {
breakNum = i;
}
}
int j;
for (j = 0; j < hold; j++) {
s1[j] = in[j];
}
s1[hold] = '\0';
o[0] = in[hold];
o[1] = '\0';
int k;
int l = 0;
for (k = (hold + 1); k < breakNum; k++) {
s2[l] = in[k];
l++;
}
s2[breakNum] = '\0';
printf("%s %s %s =>\n", s1, o, s2);
}
}
Since this is homework, let's focus on how to solve this, rather than providing a bunch of code which I suspect your instructor would frown upon.
First, don't do everything from within the main() function. Break it up into smaller functions each of which do part of the task.
Second, break the task into its component pieces and write out the pseudocode:
while ( 1 )
{
// read input "abc + def"
// convert input into tokens "abc", "+", "def"
// evaluate tokens 1 and 3 as operands ("abc" -> 123, "def" -> 456)
// perform the operation indicated by token 2
// format the result as a series of characters (579 -> "egi")
}
Finally, write each of the functions. Of course, if you stumble upon roadblocks along the way, be sure to come back to ask your specific questions.
Based on your examples, it appears “a” acts like 1, “b” acts like 2, and so on. Given this, you can perform the arithmetic on individual characters like this:
// Map character from first string to an integer.
int c1 = s1[j] - 'a' + 1;
// Map character from second string to an integer.
int c2 = s2[j] - 'a' + 1;
// Perform operation.
int result = c1 + c2;
// Map result to a character.
char c = result - 1 + 'a';
There are some things you have to add to this:
You have to put this in a loop, to do it for each character in the strings.
You have to vary the operation according to the operator specified in the input.
You have to do something with each result, likely printing it.
You have to do something about results that extend beyond the alphabet, like “y+y”, “a-b”, or “a/b”.
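The items in the list above could be put together roughly as follows. This is only a sketch: the names letter_op and string_op are hypothetical, it assumes lowercase input in a character set where 'a' to 'z' are contiguous (such as ASCII), and returning '?' for results outside the alphabet is an arbitrary choice, since the original problem statement does not say what should happen there.

```c
#include <stddef.h>

/* Hypothetical helper: apply `op` to one pair of letters, treating
 * 'a' as 1 ... 'z' as 26. Returns '?' when the result is not a letter
 * (an assumption made for this sketch). */
char letter_op(char x, char op, char y)
{
    int a = x - 'a' + 1;
    int b = y - 'a' + 1;
    int r;

    switch (op) {
    case '+': r = a + b; break;
    case '-': r = a - b; break;
    case '*': r = a * b; break;
    case '/': r = (b != 0) ? a / b : 0; break;   /* integer division */
    default:  return '?';
    }
    return (r >= 1 && r <= 26) ? (char)(r - 1 + 'a') : '?';
}

/* Apply the operation position by position, stopping at the end of the
 * shorter string. `out` must have room for the result plus terminator. */
void string_op(const char *s1, char op, const char *s2, char *out)
{
    size_t i;
    for (i = 0; s1[i] != '\0' && s2[i] != '\0'; i++)
        out[i] = letter_op(s1[i], op, s2[i]);
    out[i] = '\0';
}
```

With the book's example, string_op("abc", '+', "aab", out) leaves "bce" in out.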
If we assume, from your example answer, that a represents 1, then the value of any other letter is its character code minus the character code of a, plus 1.
for (i = 0; i < str_len; i++) {
int s1Int = (int)s1[i];
int s2Int = (int)s2[i]; // second operand comes from s2, not s1
int addAmount = 1 + abs((int)'a' - s2Int);
output[i] = (char)(s1Int + addAmount);
}
Steps
1) For the length of the s1 or s2
2) Retrieve the decimal value of the first char
3) Retrieve the decimal value of the second char
4) Find the difference between the letter a (97) and the second char, then add 1 (assuming a is the representation of 1)
5) Add the difference to the s1 char and convert the decimal representation back to a character.
Example 1:
if S1 char is a, S2 char is b:
s1Int = 97
s2Int = 98
addAmount = 1 + abs((int)'a' - s2Int) = 1 + abs(97 - 98) = 2
output = s1Int + addAmount = 97 + 2 = 99 = c
Example 2:
if S1 char is c, S2 char is a:
s1Int = 99
s2Int = 97
addAmount = 1 + abs((int)'a' - s2Int) = 1 + abs(97 - 97) = 1
output = s1Int + addAmount = 99 + 1 = 100 = d

Display the binary representation of a number in C? [duplicate]

Possible Duplicate:
Is there a printf converter to print in binary format?
Still learning C and I was wondering:
Given a number, is it possible to do something like the following?
char a = 5;
printf("binary representation of a = %b",a);
> 101
Or would I have to write my own method to do the transformation to binary?
There is no direct way (i.e. using printf or another standard library function) to print it. You will have to write your own function.
/* This code has an obvious bug and another non-obvious one :) */
void printbits(unsigned char v) {
for (; v; v >>= 1) putchar('0' + (v & 1));
}
If you're using a terminal, you can use control codes to print out bytes in natural order:
void printbits(unsigned char v) {
printf("%*s", (int)ceil(log2(v)) + 1, "");
for (; v; v >>= 1) printf("\x1b[2D%c",'0' + (v & 1));
}
Based on dirkgently's answer, but fixing his two bugs, and always printing a fixed number of digits:
void printbits(unsigned char v) {
int i; // for C89 compatibility
for(i = 7; i >= 0; i--) putchar('0' + ((v >> i) & 1));
}
Yes (write your own), something like the following complete function.
#include <stdio.h> /* only needed for the printf() in main(). */
#include <string.h>
/* Create a string of binary digits based on the input value.
Input:
val: value to convert.
buff: buffer to write to must be >= sz+1 chars.
sz: size of buffer.
Returns address of string or NULL if not enough space provided.
*/
static char *binrep (unsigned int val, char *buff, int sz) {
char *pbuff = buff;
/* Must be able to store one character at least. */
if (sz < 1) return NULL;
/* Special case for zero to ensure some output. */
if (val == 0) {
*pbuff++ = '0';
*pbuff = '\0';
return buff;
}
/* Work from the end of the buffer back. */
pbuff += sz;
*pbuff-- = '\0';
/* For each bit (going backwards) store character. */
while (val != 0) {
if (sz-- == 0) return NULL;
*pbuff-- = ((val & 1) == 1) ? '1' : '0';
/* Get next bit. */
val >>= 1;
}
return pbuff+1;
}
Add this main to the end of it to see it in operation:
#define SZ 32
int main(int argc, char *argv[]) {
int i;
int n;
char buff[SZ+1];
/* Process all arguments, outputting their binary. */
for (i = 1; i < argc; i++) {
n = atoi (argv[i]);
printf("[%3d] %9d -> %s (from '%s')\n", i, n,
binrep(n,buff,SZ), argv[i]);
}
return 0;
}
Run it with "progname 0 7 12 52 123" to get:
[  1]         0 -> 0 (from '0')
[  2]         7 -> 111 (from '7')
[  3]        12 -> 1100 (from '12')
[  4]        52 -> 110100 (from '52')
[  5]       123 -> 1111011 (from '123')
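#include <iostream>
#include <cstdio>   // printf
#include <conio.h>  // getch(); non-standard, available only on some DOS/Windows compilers
#include <stdlib.h> // itoa() is a non-standard extension that some compilers declare here
using namespace std;
void displayBinary(int n)
{
char bistr[1000];
itoa(n,bistr,2); // 2 means binary; itoa can convert n to any base up to 36
printf("%s",bistr);
}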
int main()
{
int n;
cin>>n;
displayBinary(n);
getch();
return 0;
}
Use a lookup table, like:
char *table[16] = {"0000", "0001", .... "1111"};
then print each nibble like this
printf("%s%s", table[a / 0x10], table[a % 0x10]);
You could of course use a single 256-entry table covering whole bytes instead, but it would be only marginally faster and much larger.
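Spelled out in full, the nibble lookup might look like the sketch below. The names nibble_bits and byte_to_bits are illustrative only, and the value is taken as an unsigned char so that the table index can never be negative.

```c
#include <stdio.h>

/* One 4-character string per possible nibble value (0..15). */
static const char *const nibble_bits[16] = {
    "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
    "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
};

/* Write the 8 binary digits of the low byte of `v` into `out`,
 * which must hold at least 9 bytes. Returns `out`. */
char *byte_to_bits(unsigned char v, char out[9])
{
    snprintf(out, 9, "%s%s",
             nibble_bits[(v >> 4) & 0x0F],   /* high nibble first */
             nibble_bits[v & 0x0F]);         /* then low nibble */
    return out;
}
```

For example, byte_to_bits(5, buf) fills buf with "00000101".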
There is no direct format specifier for this in the C language. However, I wrote this quick Python snippet (Python 2 syntax) to help you understand the process step by step so you can roll your own.
#!/usr/bin/python
dec = input("Enter a decimal number to convert: ")
base = 2
solution = ""
while dec >= base:
solution = str(dec%base) + solution
dec = dec/base
if dec > 0:
solution = str(dec) + solution
print solution
Explained:
dec = input("Enter a decimal number to convert: ") - prompt the user for numerical input (there are multiple ways to do this in C via scanf for example)
base = 2 - specify our base is 2 (binary)
solution = "" - create an empty string in which we will concatenate our solution
while dec >= base: - loop while our number is greater than or equal to the base
solution = str(dec%base) + solution - take the number modulo the base and add it to the beginning of our string (the division-and-remainder method produces digits from right to left). The str() function converts the result of the operation to a string. You cannot concatenate integers with strings in Python without a type conversion.
dec = dec/base - divide the number by the base (integer division in Python 2) in preparation for taking the next modulo
if dec > 0:
solution = str(dec) + solution - if anything is left over, add it to the beginning (this will be 1, if anything)
print solution - print the final number
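The same division-and-remainder method could be written in C roughly like this. It is a sketch rather than a definitive implementation: the function name to_base and its parameters are invented here, it only supports bases 2 to 10, and unlike the Python snippet it uses a do/while loop so that an input of 0 produces "0" instead of an empty string.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write `value` in the given base (2..10) into `out`.
 * Returns `out`, or NULL if the base is unsupported or `cap` is too small. */
char *to_base(unsigned int value, unsigned int base, char *out, size_t cap)
{
    char tmp[sizeof value * CHAR_BIT + 1];   /* enough digits for base 2 */
    size_t pos = sizeof tmp - 1;

    if (base < 2 || base > 10)
        return NULL;

    tmp[pos] = '\0';
    do {
        tmp[--pos] = (char)('0' + value % base);  /* remainder is the next digit */
        value /= base;                            /* integer division */
    } while (value != 0);

    if (sizeof tmp - pos > cap)              /* digits plus terminator */
        return NULL;
    memcpy(out, tmp + pos, sizeof tmp - pos);
    return out;
}
```

Digits are produced from right to left, which is why the temporary buffer is filled from its end, mirroring the way the Python version prepends to the string.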
This code should handle your needs up to 64 bits.
char* pBinFill(long int x,char *so, char fillChar); // version with fill
char* pBin(long int x, char *so); // version without fill
#define width 64
char* pBin(long int x,char *so)
{
char s[width+1];
int i=width;
s[i--]=0x00; // terminate string
do
{ // fill in array from right to left
s[i--]=(x & 1) ? '1':'0'; // determine bit
x>>=1; // shift right 1 bit
} while( x > 0);
i++; // point to last valid character
sprintf(so,"%s",s+i); // stick it in the temp string
return so;
}
char* pBinFill(long int x,char *so, char fillChar)
{ // fill in array from right to left
char s[width+1];
int i=width;
s[i--]=0x00; // terminate string
do
{
s[i--]=(x & 1) ? '1':'0';
x>>=1; // shift right 1 bit
} while( x > 0);
while(i>=0) s[i--]=fillChar; // fill with fillChar
sprintf(so,"%s",s);
return so;
}
void test()
{
char so[width+1]; // working buffer for pBin
long int val=1;
do
{
printf("%ld =\t\t%#lx =\t\t0b%s\n",val,val,pBinFill(val,so,'0')); // fill with the character '0'; a fill value of 0 would terminate the string at its first byte
val*=11; // generate test data
} while (val < 100000000);
}
Output:
00000001 = 0x000001 = 0b00000000000000000000000000000001
00000011 = 0x00000b = 0b00000000000000000000000000001011
00000121 = 0x000079 = 0b00000000000000000000000001111001
00001331 = 0x000533 = 0b00000000000000000000010100110011
00014641 = 0x003931 = 0b00000000000000000011100100110001
00161051 = 0x02751b = 0b00000000000000100111010100011011
01771561 = 0x1b0829 = 0b00000000000110110000100000101001
19487171 = 0x12959c3 = 0b00000001001010010101100111000011
You have to write your own transformation. Only decimal, hex and octal numbers are supported with format specifiers.

Resources