find the longest non decreasing sub sequence - c

Given a string consisting only of 0s and 1s, say 10101,
how do I find the length of the longest non-decreasing subsequence?
for example,
for the string,
10101
the longest non decreasing sub sequences are
111
001
so you should output 3
for the string
101001
the longest non decreasing sub sequence is
0001
so you should output 4
How can I find this?
And how can it be done when limits are given, i.e. for the part of the string between the limits?
for example
101001
limits [3,6]
the longest non decreasing sub sequence is
001
so you should output 3
Can this be achieved in O(strlen)?

Can this be achieved in O(strlen)?
Yes. Observe that the non-decreasing subsequences would have one of these three forms:
0........0 // Only zeros
1........1 // Only ones
0...01...1 // Some zeros followed by some ones
The first two forms can easily be checked in O(n) by counting all zeros and by counting all ones.
The last one is a bit harder: you need to go through the string keeping the counter of zeros that you've seen so far, along with the length of the longest string of 0...01...1 form that you have discovered so far. At each step where you see 1 in the string, the length of the longest subsequence of the third form is the larger of the number of zeros plus one or the longest 0...01...1 sequence that you've seen so far plus one.
Here is the implementation of the above approach in C:
char *str = "10101001";
int longest0=0, longest1=0;
for (char *p = str ; *p ; p++) {
    if (*p == '0') {
        longest0++;
    } else { // *p must be 1
        longest1 = max(longest0, longest1)+1;
    }
}
printf("%d\n", max(longest0, longest1));
max is defined as follows:
#define max( a, b ) ( ((a) > (b)) ? (a) : (b) )

Use dynamic programming. Run through the string from left to right, and keep track of two variables:
zero: length of longest subsequence ending in 0
one: length of longest subsequence ending in 1
If we see a 0, we can append it to any subsequence that ends in 0, so we increase zero. If we see a 1, we can append it either to the subsequence that ends in 0 or to the one that ends in 1, so we set one to the longer of the two plus one. In C99:
int max(int a, int b) {
    return a > b ? a : b;
}
int longest(char *string) {
    int zero = 0;
    int one = 0;
    for (; *string; ++string) {
        switch (*string) {
        case '0':
            ++zero;
            break;
        case '1':
            one = max(zero, one) + 1;
            break;
        }
    }
    return max(zero, one);
}
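The limits variant from the question can probably reuse the same two counters and simply scan only the requested part of the string. The sketch below assumes the limits are 1-based and inclusive, which is how the example [3,6] reads; the name longest_in_limits and its parameters are made up for illustration.

```c
#include <stddef.h>

/* Longest non-decreasing subsequence of the 0/1 string s, restricted to the
   1-based inclusive positions lo..hi. Hypothetical helper: the caller must
   ensure 1 <= lo <= hi <= strlen(s). */
int longest_in_limits(const char *s, size_t lo, size_t hi)
{
    int zero = 0; /* longest subsequence so far made only of 0s */
    int one = 0;  /* longest subsequence so far that ends in 1 */
    for (size_t i = lo - 1; i < hi; ++i) {
        if (s[i] == '0') {
            ++zero;
        } else {
            one = (zero > one ? zero : one) + 1;
        }
    }
    return zero > one ? zero : one;
}
```

Calling longest_in_limits("101001", 3, 6) scans "1001" and returns 3, matching the example in the question. Each query costs time proportional to the length of the range.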

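count = 1;
max = 1;
while (++i < length) {
    if (array[i] < array[i - 1])
        count = 0;   /* run broken: start a new run at array[i] */
    count++;
    if (count > max)
        max = count;
}
Single pass. Works on any numbers, not just 1s and 0s. Note that this measures the longest contiguous non-decreasing run (elements cannot be skipped), so for 10101 it gives 2, not 3.
For limits, set i to the starting index and use the ending index instead of the array length.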
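The loop above only compares neighbouring elements. For arbitrary numbers where elements may be skipped (a true subsequence), one well-known option is the binary-search technique often used for longest increasing subsequences, adapted to allow equal elements. The following is a rough sketch of that swapped-in technique, not the method of the answer above; the function name is invented here.

```c
#include <stdlib.h>

/* Length of the longest non-decreasing subsequence of a[0..n-1].
   tails[k] holds the smallest possible last element of a non-decreasing
   subsequence of length k+1 seen so far. Returns -1 if the scratch
   buffer cannot be allocated. */
int longest_non_decreasing(const int *a, size_t n)
{
    if (n == 0)
        return 0;
    int *tails = malloc(n * sizeof *tails);
    if (tails == NULL)
        return -1;
    size_t len = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Binary search for the first tail strictly greater than a[i]. */
        size_t lo = 0, hi = len;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (tails[mid] <= a[i])
                lo = mid + 1;
            else
                hi = mid;
        }
        tails[lo] = a[i];   /* extend a subsequence or improve a tail */
        if (lo == len)
            ++len;
    }
    free(tails);
    return (int)len;
}
```

This runs in O(n log n). For strings of only 0s and 1s the two-counter O(n) approach is simpler and faster.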

Related

Subsequence using single string

Given a string S with length N. Choose an integer K and two non-empty subsequences A and B of characters of this string, each with length K, such that:
A=B, i.e. for each valid i, the i-th character in A is the same as
the i-th character in B.
Let's denote the indices of characters used to construct A by
a1,a2,…,aK, i.e. A=(Sa1,Sa2,…,SaK). Similarly, let's denote the
indices of characters used to construct B by b1,b2,…,bK.
If we denote the number of common indices in the sequences a and b by
M, then M+1≤K.
What is the maximum value of K such that it is possible to find sequences A and B which satisfy the above conditions.
Please give the simplest solution to this problem. I'm not able to work out how to approach it.
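The answer works out like this:
Find the minimum distance d between two occurrences of the same character; the answer is the total string length minus d. (To see that N - d is achievable: if S[p] = S[q] with q - p = d, take A as S without positions p+1..q and B as S without positions p..q-1. Both spell the same string of length N - d, and their index sequences differ in exactly one place.)
For example, for "ababdbbdhfdksl":
minimum distance between repeated characters = 1 (the two b's in the middle)
so the answer = length (14) - 1 = 13.
If all characters are distinct, the answer is 0.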
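The rule above can be sketched in C roughly as follows. The function name and the choice to track byte values in a 256-entry table are assumptions made for this illustration.

```c
#include <string.h>

/* Returns N - d, where d is the smallest distance between two equal
   characters of s, or 0 when all characters are distinct. */
int max_equal_subsequence_length(const char *s)
{
    int last[256];            /* last index at which each byte value was seen */
    for (int c = 0; c < 256; ++c)
        last[c] = -1;
    int n = (int)strlen(s);
    int best = n + 1;         /* sentinel: no repeated character found yet */
    for (int i = 0; i < n; ++i) {
        unsigned char c = (unsigned char)s[i];
        if (last[c] >= 0 && i - last[c] < best)
            best = i - last[c];
        last[c] = i;          /* only the nearest earlier occurrence matters */
    }
    return best > n ? 0 : n - best;
}
```

For "ababdbbdhfdksl" this returns 13, and for "ababdb" it returns 4. It is a single pass over the string.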
The question seems to ask for the longest subsequence that can be generated in two different ways.
For example, if the string is "ababdb", the answer is 4,
because abab can be generated twice with at least one index different.
So the approach is:
1. generate all possible subsequences.
2. store and match with the previous one.
3. if any duplicate found we update the max .
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

vector<string> test;  // candidate subsequences seen so far
size_t maxLen = 0;    // length of the longest subsequence found twice

void Sub(string input, string output)
{
    if (input.length() == 0)
    {
        if (output.length() <= maxLen)  // only longer candidates matter
            return;
        if (find(test.begin(), test.end(), output) != test.end())
        {
            maxLen = output.length();   // duplicate found
            return;
        }
        test.push_back(output);
        return;
    }
    Sub(input.substr(1), output);             // skip input[0]
    Sub(input.substr(1), output + input[0]);  // take input[0]
}

int main()
{
    string input = "ababdb";
    string output = "";
    Sub(input, output);
    cout << maxLen;
    return 0;
}

Subsequence calculator. Can i make it more efficient?

The idea of subsequences is explained very well in this post: Generate subsequences
But I didn't understand the answers to that question because I'm a beginner.
What I want to know is whether I can make my C program more efficient while still keeping it simple and understandable, and without using functions.
#include <stdio.h>
#define NUM 123456
int main(void) {
int i,num=NUM,x,y,mask,digits=0,max=1;
while ( num != 0 ) { //calculate digits
num /= 10;
digits++;
}
for ( i = 1; i <= digits; i++ ) { //calculate max number of subsequences
max *= 2;
}
max=max-2;
printf("Subsequences are:\n");
for ( i = 1; i <= max ; i++ ) {
mask = i; //digit selector
x = 1; //multiplier
num = NUM;
y=0; //subsequence value
while ( num != 0 ) {
if ( mask % 2 == 1 ) {
y += num % 10 * x;
x *= 10;
}
num /= 10;
mask /= 2;
}
printf("%d \n" , y);
}
return 0;
}
Note that when we define NUM as a number such as 5111 or 100 some of the subsequences appear twice. Is there any simple way to fix that?
Thanks!
Certain subsequences appear more than once with some numbers because those numbers contain repeated digits.
That repetition could be eliminated by saving each subsequence in an array and checking whether the specific subsequence is already in the array. If it is, do not print it. Otherwise, add the subsequence to the array and print it.
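A minimal sketch of that array check, reusing the mask and digit loop from the question, might look like this. It is wrapped in a function only so it can be tested in isolation; the function name and the out parameter are invented here, and the body could be pasted into main just as well.

```c
#include <stddef.h>

/* Stores each distinct non-empty proper subsequence value of num in out[]
   (in the order first produced) and returns how many there are.
   out must have room for 2^digits - 2 entries. */
size_t unique_subsequences(unsigned num, unsigned out[])
{
    unsigned total = 1;                 /* becomes 2^digits */
    for (unsigned t = num; t != 0; t /= 10)
        total *= 2;
    size_t count = 0;
    /* mask 0 is the empty subsequence, mask total-1 the full number */
    for (unsigned mask = 1; mask + 1 < total; mask++) {
        unsigned m = mask, n = num, x = 1, y = 0;
        while (n != 0) {
            if (m % 2 == 1) {           /* bit set: keep this digit */
                y += n % 10 * x;
                x *= 10;
            }
            n /= 10;
            m /= 2;
        }
        size_t k = 0;                   /* linear search for y */
        while (k < count && out[k] != y)
            k++;
        if (k == count)
            out[count++] = y;           /* first time seen: remember it */
    }
    return count;
}
```

For 5111 this leaves six values (1, 11, 111, 5, 51, 511 in some order). The linear search makes the whole thing quadratic in the number of subsequences, which is acceptable for the handful of digits an unsigned value can hold. Because values are compared as integers, "0" and "00" count as the same subsequence.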
The problem can be divided into two tasks: (1) find all subsequences of an array of digits and (2) pack and unpack integers into digits.
Let's consider the subsequences of the array {a, b, c}. You can generate them by walking through the array from left to right and follow two paths: One where you include the current element in a subsequence and one where you don't.
That leads to a recursive approach that we can represent as a tree:
                     {}
              /              \
          {}                    {a}
       /      \              /       \
    {}         {b}        {a}         {ab}
   /  \       /   \      /   \       /    \
 {}   {c}   {b}  {bc}  {a}  {ac}  {ab}  {abc}
When we branch left, we skip the current element and when we go right, we include the element. The elements themselves correspond to the depth of the tree: on the first level we treat element a, on the next b and on the last c.
The bottom row has all subsequences. This includes the empty sequence and the full sequence, which you don't want. But let's include them for now. (The arrays in the bottom row are usually called a power set, which is a nice web-searchable term.)
I mentioned recursion, which entails recursive functions, and functions are out.
So we need to tackle the problem another way. Let's turn the tree on its side. A dash denotes that the element was skipped. The notation on the right uses another notation: 0 means the element was skipped, 1 means the element was included:
- - - 000
- - c 001
- b - 010
- b c 011
a - - 100
a - c 101
a b - 110
a b c 111
I hope the codes on the right look familiar, because that's how you count from 000 to 111 in binary. That's a nice way to enumerate our elements. Now we need a way to tell which bits are set in each number.
The rightmost bit is set when the number is odd. We can find out about the other bits by repeatedly dividing the number by two, which in binary is a shift to the right, discarding the rightmost bit.
Now how to extract the digits from the original number? That number is a decimal number; it's in base 10. We can use the same approach as for finding the bits in the binary number, because the bits 0 and 1 are the binary digits.
Start with the number. The last digit is the result of taking the remainder after a division by 10. Then divide the number by ten until it is zero. This code yields the digits from right to left. So does the code for finding the bits, which means we can find whether a bit is set and which digit to print in a single loop, always taking the rightmost bit and if it is set, print the rightmost digit of the original number.
The empty and the full subsequences are the first and last items in the enumeration. If you don't want them, skip them.
That leaves the problem of the duplicated subsequences if the number has repeated digits. I don't see an easy way to solve this except user3629249's suggestion to create the subsequence anyway and later check whether it has already been printed.
An easy way to do this is to keep an array of the subsequences. That array has max entries. After you have filled that array, sort it and then print it, but skip entries that are equal to the previous entry.
Here's an example implementation that uses an array of digits so that the original number doesn't have to be decomposed each time. It uses the sorting function qsort from <stdlib.h>, which requires a sorting function:
#include <stdlib.h>
#include <stdio.h>
#define NUM 412131
typedef unsigned int uint;
int uintcmp(const void *pa, const void *pb)
{
const uint *a = pa;
const uint *b = pb;
return (*a > *b) - (*a < *b);
}
int main(void)
{
uint digit[20]; // array of digits
size_t ndigit = 0; // length of array
uint num = NUM;
uint max = 1;
size_t i;
while (num) {
digit[ndigit++] = num % 10;
num /= 10;
max *= 2;
}
uint res[max]; // array of subsequences
for (i = 0; i < max; i++) {
uint mask = i; // mask for bit detection
uint j = ndigit; // index into digit array
uint s = 0;
while (j--) {
if (mask % 2) s = s*10 + digit[j];
mask /= 2;
}
res[i] = s;
}
qsort(res, max, sizeof(*res), uintcmp);
for (i = 1; i < max - 1; i++) {
if (res[i] != res[i - 1]) printf("%u\n", res[i]);
}
return 0;
}

How should I generate the n-th digit of this sequence in logarithmic time complexity?

I have the following problem:
The point (a) was easy, here is my solution:
#include <stdio.h>
#include <string.h>
#define MAX_DIGITS 1000000
char conjugateDigit(char digit)
{
if(digit == '1')
return '2';
else
return '1';
}
void conjugateChunk(char* chunk, char* result, int size)
{
int i = 0;
for(; i < size; ++i)
{
result[i] = conjugateDigit(chunk[i]);
}
result[i] = '\0';
}
void displaySequence(int n)
{
// +1 for '\0'
char result[MAX_DIGITS + 1];
// In this variable I temporarily store the conjugates at each iteration.
// Since every component of the sequence is 1/4 the size of the sequence,
// the length of `tmp` will be MAX_DIGITS / 4 + the string terminator.
char tmp[(MAX_DIGITS / 4) + 1];
// Here I assign the base value to the sequence.
strcpy(result, "1221");
// The initial value of k is 4, since the base sequence has length 4.
// At each step the sequence becomes 4 times longer than the previous one.
for(int k = 4; k < n; k *= 4)
{
// We conjugate the first part of the sequence.
conjugateChunk(result, tmp, k);
// We will concatenate the conjugate 2 time to the original sequence
strcat(result, tmp);
strcat(result, tmp);
// Now we conjugate the conjugate in order to get the first part.
conjugateChunk(tmp, tmp, k);
strcat(result, tmp);
}
for(int i = 0; i < n; ++i)
{
printf("%c", result[i]);
}
printf("\n");
}
int main()
{
int n;
printf("Insert n: ");
scanf("%d", &n);
printf("The result is: ");
displaySequence(n);
return 0;
}
But for the point b I have to generate the n-th digit in logarithmic time. I have no idea how to do it. I have tried to find a mathematical property of that sequence, but I failed. Can you help me please? It is not the solution itself that really matters, but how do you tackle this kind of problems in a short amount of time.
This problem was given last year (in 2014) at the admission exam at the Faculty of Mathematics and Computer Science at the University of Bucharest.
Suppose you define d_ij as the value of the ith digit in s_j.
Note that for a fixed i, d_ij is defined only for large enough values of j (at first, s_j is not large enough).
Now you should be able to prove to yourself the two following things:
once d_ij is defined for some j, it will never change as j increases (hint: induction).
For a fixed i, d_ij is defined for j logarithmic in i (hint: how does the length of s_j increase as a function of j?).
Combining this with the first item, which you solved, should give you the result along with the complexity proof.
There is a simple programming solution, the key is to use recursion.
First determine the minimal k such that the length of s_k is more than n, so that the n-th digit exists in s_k. By definition, s_k can be split into 4 equal-length parts. You can easily determine into which part the n-th symbol falls, and what the position of this symbol is within that part; say the n-th symbol of the whole string is the n'-th within this part. This part is either s_{k-1} or inv(s_{k-1}). In either case you recursively determine the n'-th symbol within s_{k-1}, and then, if needed, invert it.
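The descent just described might be sketched iteratively like this. It assumes the construction implied by the question's code, namely s_1 = "1221" and s_{k+1} = s_k inv(s_k) inv(s_k) s_k, uses 1-based n, and the function name is hypothetical.

```c
/* n-th digit (1-based) of the sequence, found by repeatedly locating the
   quarter that contains it. Assumes 1 <= n and that n is small enough for
   len to grow past it without overflowing unsigned long. */
int nth_digit(unsigned long n)
{
    unsigned long len = 4;      /* length of an s_k long enough to hold n */
    while (len < n)
        len *= 4;
    unsigned long pos = n - 1;  /* 0-based position inside the current s_k */
    int flip = 0;               /* 1 if an odd number of inversions applies */
    while (len > 4) {
        unsigned long quarter = len / 4;
        unsigned long part = pos / quarter;
        if (part == 1 || part == 2)   /* the two middle quarters are inverted */
            flip = !flip;
        pos %= quarter;
        len = quarter;
    }
    int d = "1221"[pos] - '0';
    return flip ? 3 - d : d;    /* inverting swaps 1 and 2 */
}
```

Each step divides the length by four, so the loop runs about log4(n) times.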
The digits up to 4^k are used to determine the digits up to 4^(k+1). This suggests writing n in base 4.
Consider the binary expansion of n where we pair digits together, or equivalently the base 4 expansion where we write 0=(00), 1=(01), 2=(10), and 3=(11).
Let f(n) = +1 if the nth digit is 1, and -1 if the nth digit is 2, where the sequence starts at index 0 so f(0)=1, f(1)=-1, f(2)=-1, f(3)=1. This index is one lower than the index starting from 1 used to compute the examples in the question. The 0-based nth digit is (3-f(n))/2. If you start the indices at 1, the nth digit is (3-f(n-1))/2.
f((00)n) = f(n).
f((01)n) = -f(n).
f((10)n) = -f(n).
f((11)n) = f(n).
You can use these to compute f recursively, but since it is a back-recursion you might as well compute f iteratively. f(n) is (-1)^(binary weight of n) = (-1)^(sum of the binary digits of n).
See the Thue-Morse sequence.
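As a small illustration of the binary-weight formula, the digit can be read off the parity of the set bits of n-1. This sketch assumes 1-based n as in the question; the function name is made up.

```c
/* n-th digit (1-based): 1 if n-1 has an even number of set bits, else 2. */
int nth_digit_parity(unsigned long n)
{
    unsigned long m = n - 1;    /* switch to the 0-based index used by f */
    int odd = 0;
    while (m != 0) {
        odd ^= (int)(m & 1UL);  /* toggle for every set bit */
        m >>= 1;
    }
    return odd ? 2 : 1;
}
```

The loop visits each bit of n-1 once, so the running time is logarithmic in n.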

atmost K mismatch substrings?

I'm trying to solve this problem. I was able to solve it using brute force, but
the following optimised algorithm gives incorrect results for some of the test cases. I tried but couldn't find the problem with the code. Can anybody help me?
Problem :
Given a string S and an integer K, find the integer C which equals the number of pairs of substrings (S1,S2) such that S1 and S2 have equal length and Mismatch(S1, S2) <= K, where the mismatch function is defined below.
The Mismatch Function
Mismatch(s1,s2) is the number of positions at which the characters in S1 and S2 differ. For example mismatch(bag,boy) = 2 (there is a mismatch in the second and third position), mismatch(cat,cow) = 2 (again, there is a mismatch in the second and third position), Mismatch(London,Mumbai) = 6 (since the character at every position is different in the two strings). The first character in London is ‘L’ whereas it is ‘M’ in Mumbai, the second character in London is ‘o’ whereas it is ‘u’ in Mumbai - and so on.
int main() {
int k;
char str[6000];
cin>>k;
cin>>str;
int len=strlen(str);
int i,j,x,l,m,mismatch,count,r;
count=0;
for(i=0;i<len-1;i++)
for(j=i+1;j<len;j++)
{ mismatch=0;
for(r=0;r<len-j+i;r++)
{
if(str[i+r]!=str[j+r])
{ ++mismatch;
if(mismatch>=k)break;
}
if(mismatch<=k)++count;
}
}
cout<<count;
return 0;
}
Sample test cases
Test case (passing for above code)
**input**
0
abab
**output**
3
Test case (failing for above code)
**input**
3
hjdiaceidjafcchdhjacdjjhadjigfhgchadjjjbhcdgffibeh
**expected output**
4034
**my output**
4335
You have two errors. First,
for(r=1;r<len;r++)
should be
for(r=1;r<=len-j;r++)
since otherwise,
str[j+r]
would at some point begin comparing characters past the null-terminator (i.e. beyond the end of the string). The greatest r can be is the remaining number of characters from the jth index to the last character.
Second, writing
str[i+r]
and
str[j+r]
skips the comparison of the ith and jth characters since r is always at least 1. You should write
for(r=0;r<len-j;r++)
You have two basic errors. You are quitting when mismatches>=k instead of mismatches>k (mismatches==k is an acceptable number) and you are letting r get too large. These skew the final count in opposite directions but, as you see, the second error "wins".
The real inner loop should be:
for (r=0; r<len-j; ++r)
{
    if (str[i+r] != str[j+r])
    {
        ++mismatch;
        if (mismatch > k)
            break;
    }
    ++count;
}
r is an index into the substring, and j+r MUST be less than len to be valid for the right substring. Since i<j, if str[j+r] is valid, then so is str[i+r], so there's no need to have i involved in the upper limit calculation.
Also, you want to break on mismatch>k, not on >=k, since k mismatches are allowed.
Next, if you test for too many mismatches after incrementing mismatch, you don't have to test it again before counting.
Finally, the upper limit of r<len-j (instead of <=) means that the trailing '\0' character won't be compared as part of the str[j+r] substring. You were comparing that and more when j+r >= len, but mismatches was less than k when that first happened.
Note: You asked about a faster method. There is one, but the coding is more involved. Make the outer loop on the difference delta between starting index values. (0<delta<len) Then, count all acceptable matches with something like:
count = 0
for delta = 1 to len-1
    set i=0; j=delta; mismatches=0; r=0
    while j < len
        .. extend r while the pair has at most k mismatches and the
        .. right substring stays inside str:
        while j+r<len and (mismatches < k or str[i+r]==str[j+r])
            if str[i+r] != str[j+r] then mismatches=mismatches+1
            r = r+1
        end while
        .. arrive here with r being the length of the longest string pair
        .. starting at str[i] and str[j] with no more than k mismatches.
        .. Add (r) to the count and advance i,j one space to the right
        .. without recounting the character mismatches inside. Rather, if a
        .. mismatch is dropped off the front, mismatches is decremented by 1.
        count = count + r
        if r > 0
            if str[i] != str[j] then mismatches=mismatches-1
            r = r-1
        end if
        i = i+1, j = j+1
    end while
end for
That's pseudocode, and also pseudocorrect. The general idea is to compare all substrings with starting indices differing by (delta) in one pass, starting at the left, and increasing the substring length r until the end of the source string is reached or k+1 mismatches have been seen. That is, str[j+r] is either the end of the string, or the camel's-back-breaking mismatch position in the right substring. That makes r substrings that had k or fewer mismatches starting at str[i] and str[j].
So count those r substrings and move to the next positions i=i+1,j=j+1 and new length r=r-1, reducing the mismatch count if unequal characters were dropped off the left side.
It should be pretty easy to see that on each loop either r increases by 1 or j increases by 1 and (j+r) stays the same. Both j and (j+r) will reach len in O(n) time, so the whole thing is O(n^2).
Edit: I fixed the handling of r, so the above should be even more pseudocorrect. The improvement to O(n^2) runtime might help.
Re-edit: Fixed comment bugs.
Re-re-edit: More typos in algorithm, mostly mismatches misspelled and incremented by 2 instead of 1.
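For reference, the sliding-window idea described above could be written in C along these lines. This is a sketch under the assumption that k is non-negative and that pairs are counted with the left substring starting before the right one, as in the question's loops; count_pairs is a name chosen for this example.

```c
#include <string.h>

/* Counts pairs of equal-length substrings (starting at i < j) that differ
   in at most k positions, in O(n^2) time overall. */
long long count_pairs(const char *s, int k)
{
    int len = (int)strlen(s);
    long long count = 0;
    for (int delta = 1; delta < len; ++delta) {
        int mismatches = 0; /* mismatches inside the current window */
        int r = 0;          /* current window length */
        for (int i = 0, j = delta; j < len; ++i, ++j) {
            /* Grow the window while it stays inside s and within k. */
            while (j + r < len &&
                   (mismatches < k || s[i + r] == s[j + r])) {
                if (s[i + r] != s[j + r])
                    ++mismatches;
                ++r;
            }
            count += r;     /* lengths 1..r are all acceptable here */
            if (r > 0) {    /* drop the leftmost pair before advancing */
                if (s[i] != s[j])
                    --mismatches;
                --r;
            }
        }
    }
    return count;
}
```

For the first sample (k = 0, "abab") this yields 3. The window end j + r never moves backwards within one delta, which is what keeps each delta pass linear.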
@Mike, I made some modifications to your logic; here is working code for it:
#include<iostream>
#include<string>
using namespace std;
int main()
{
long long int k,c=0;
string s;
cin>>k>>s;
int len = s.length();
for(int gap = 1 ; gap < len; gap ++)
{
int i=0,j=gap,mm=0,tmp_len=0;
while (mm <=k && (j+tmp_len)<len)
{
if (s[i+tmp_len] != s[j+tmp_len])
mm++;
tmp_len++;
}
// while (((j+tmp_len)<len) && (s[i+tmp_len]==s[j+tmp_len]))
// tmp_len++;
if(mm>k){tmp_len--;mm--;}
do{
c = c + tmp_len ;
if (s[i] != s[j]) mm--;
i++;
j++;
tmp_len--;
while (mm <=k && (j+tmp_len)<len)
{
if (s[i+tmp_len] != s[j+tmp_len])
mm++;
tmp_len++;
}
if(mm>k){tmp_len--;mm--;}
}while(tmp_len>0);
}
cout<<c<<endl;
return 0;
}

I don't understand itoa() in K&R book

I am reading K&R; so far I'm doing well with it, but there is something in function itoa() which I don't understand. Here in itoa() they say they reverse the numbers themselves. For example 10 is 01 (they reverse the string):
void itoa(int n, char s[])
{
int i, sign;
if ((sign = n) < 0) /* record sign */
n = -n; /* make n positive */
i = 0;
do { /* generate digits in reverse order */
s[i++] = n % 10 + '0'; /* get next digit */
} while ((n /= 10) > 0); /* delete it */
if (sign < 0)
s[i++] = '-';
s[i] = '\0';
reverse(s);
return;
}
I don't understand how it reverses the number. We are just doing n % 10 + '0', so for 10 that gives the digit 0 first, then that digit gets deleted and we get the 1, right? Or am I missing the logic?
In the do-while loop, it is pulling the numbers off from behind (the least significant digit first). So, if you had the number -123456789, it processes the 9, then the 8, then the 7, etc.
So, when it hits the null-terminator (3rd to last line), you would have "987654321-", which is then reversed.
n % 10 gives 0 for n = 10, so after the loop, the string s contains 01.
The call to reverse() fixes this.
The algorithm determines the digits from least to most significant order. Because the total number of digits that will be generated is not known in advance, the correct position cannot be determined as they are generated - the least significant digit will be at the end, but the 'end' is not known. So they are buffered in the order they are calculated (reverse) and then the whole string is reversed to correct the ordering.
One way of avoiding this is to determine the length in advance:
decimal_digits = (int)log10( n ) + 1 ;
but on devices without an FPU (and some with very simple FPUs) that is likely to be a heavier task than string reversal.
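Another option that avoids both reverse() and floating point is to count the digits with integer division first and then fill the buffer from the back. The log10 expression above only makes sense for n > 0, whereas this sketch also covers 0 and negative values. The function name itoa_forward is invented here and the buffer is assumed to be large enough.

```c
/* Writes the decimal representation of n into s without reversing.
   s must have room for the sign, all digits and the terminator. */
void itoa_forward(int n, char s[])
{
    /* Work on the magnitude as unsigned so the most negative int is safe. */
    unsigned int u = n < 0 ? 0u - (unsigned int)n : (unsigned int)n;
    int len = n < 0 ? 1 : 0;    /* one slot for '-' if needed */
    unsigned int t = u;
    do {                        /* count the digits */
        ++len;
    } while ((t /= 10) > 0);
    s[len] = '\0';
    int i = len;
    do {                        /* least significant digit goes last */
        s[--i] = (char)('0' + u % 10);
    } while ((u /= 10) > 0);
    if (n < 0)
        s[0] = '-';
}
```

The price is a second pass of divisions over the number, so it is not obviously cheaper than K&R's reverse(); it mainly shows that the final position of each digit can be known in advance.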
