Subsequence using single string

Given a string S of length N, choose an integer K and two non-empty subsequences A and B of characters of this string, each of length K, such that:
A=B, i.e. for each valid i, the i-th character of A is the same as the i-th character of B.
Let's denote the indices of the characters used to construct A by a1,a2,…,aK, i.e. A=(S_a1,S_a2,…,S_aK). Similarly, let's denote the indices of the characters used to construct B by b1,b2,…,bK.
If we denote the number of common indices in the sequences a and b by M, then M+1≤K.
What is the maximum value of K for which it is possible to find sequences A and B that satisfy the above conditions?
Please give the simplest solution to this problem. I can't work out how to approach it.

The answer looks like this:
Find the minimum distance between two occurrences of the same character; the answer is the total string length minus that distance.
For example, for "ababdbbdhfdksl":
minimum distance between repeated characters = 1 (the two b's in the middle)
so the answer = length (14) - 1 = 13
If all characters are distinct, the answer is 0.
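
The rule above can be sketched in C as a single pass that remembers where each character was last seen. This is only an illustration: the function name max_equal_subsequence_length is made up here, and it assumes the string is a NUL-terminated sequence of single-byte characters.

```c
#include <string.h>

/* Hypothetical helper (not from the original answer): returns
   length - (minimum distance between two equal characters),
   or 0 when no character repeats. */
int max_equal_subsequence_length(const char *s)
{
    int n = (int)strlen(s);
    int last_seen[256];   /* last index of each byte value, -1 if unseen */
    int min_gap = 0;      /* 0 means "no repeated character found yet" */

    for (int c = 0; c < 256; ++c)
        last_seen[c] = -1;

    for (int i = 0; i < n; ++i) {
        unsigned char c = (unsigned char)s[i];
        if (last_seen[c] >= 0) {
            int gap = i - last_seen[c];
            if (min_gap == 0 || gap < min_gap)
                min_gap = gap;
        }
        last_seen[c] = i;
    }
    return min_gap == 0 ? 0 : n - min_gap;
}
```

Only the most recent occurrence of each character needs to be tracked, because the closest pair of equal characters is always a pair of consecutive occurrences.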

The question seems to ask for the longest subsequence that can be formed in two different ways.
For example, if the string is "ababdb", the answer is 4,
because abab can be generated twice with at least one index differing.
So the approach is:
1. Generate all possible subsequences.
2. Store each one and compare it with those generated earlier.
3. If a duplicate is found, update the maximum.
#include <algorithm>   // std::find
#include <iostream>
#include <string>
#include <vector>
using namespace std;

vector<string> test;   // subsequences generated so far
size_t maxLen = 0;     // renamed from max, which clashes with std::max

void Sub(string input, string output)
{
    if (input.length() == 0)
    {
        // only candidates longer than the current best are interesting
        if (output.length() <= maxLen)
            return;
        if (find(test.begin(), test.end(), output) != test.end())
        {
            // the same string was already built from a different index set
            maxLen = output.length();
            return;
        }
        test.push_back(output);
        return;
    }
    Sub(input.substr(1), output);             // skip input[0]
    Sub(input.substr(1), output + input[0]);  // take input[0]
}

int main()
{
    string input = "ababdb";
    Sub(input, "");
    cout << maxLen;
    return 0;
}

Related

CODEVITA "similar char" how to reduce my time complexity in c?

Example 1
Input
9
abacsddaa
2
9
3
Output
3
1
Explanation
Here Q = 2
For P=9, character at 9th location is 'a'. Number of occurrences of 'a' before P i.e., 9 is three.
Similarly for P=3, 3rd character is 'a'. Number of occurrences of 'a' before P. i.e., 3 is one.
My answer is
#include<stdio.h>
#include<stdlib.h>
int occ(int a,char *p){
    int cnt=0;
    for(int i=0;i<a;i++){
        if(p[i]==p[a]){
            cnt++;
        }
    }
    return cnt;
}
int main(){
    int l,q;
    scanf("%d",&l);
    char s[l+1]; /* +1 for the terminating '\0' written by %s */
    scanf("\n%s\n%d",s,&q);
    while(q>0){
        int n;
        scanf("\n%d",&n);
        n=n-1;
        int r=occ(n,s);
        printf("%d\n",r);
        q--;
    }
}

I am not a C expert, but I can give you an idea of how to improve your time complexity here.
You can use some sort of precomputation. First ask: is there any useful information I can get from iterating over the array only once, so that I can answer each query faster?
Right now your solution does not preprocess anything, and your complexity is O(n) per query. Let's improve that: preprocess the data in O(n) and answer each query in O(1).
You would keep a map of characters that counts how many times each character has appeared. Notice that for index i you only take into account earlier appearances of s[i], so index i doesn't care about other characters.
Follow this approach:
Create a vector(int) v of size s.length.
Create a map(char to int) m for counting character appearances.
For i = 0 until s.length do:
    v[i] = m[s[i]]++
That way, you have calculated the answer for each index in one iteration.
Now, for each query q, just print v[q - 1].
Time complexity per query: O(1)
Extra space complexity: O(n)
Note: for a better understanding of the whole answer, n is the length of the string (s.length).
Hope that helps :)
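
For readers who want the approach above in C, here is a minimal sketch. The function name count_earlier_occurrences is made up for illustration, and a 256-entry counter array stands in for the map, assuming single-byte characters. The caller provides v with room for n entries.

```c
#include <stddef.h>

/* Hypothetical helper: fills v[i] with the number of times s[i]
   occurs strictly before index i. */
void count_earlier_occurrences(const char *s, size_t n, int *v)
{
    int seen[256] = {0};   /* plays the role of the map m */

    for (size_t i = 0; i < n; ++i) {
        unsigned char c = (unsigned char)s[i];
        v[i] = seen[c]++;  /* store the old count, then bump it */
    }
}
```

After one call, a query for the 1-based position P is answered by reading v[P - 1].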

The current execution complexity is O(lq), where l is the length of the input array and q is the number of queries.
The complexity of each query is O(l).
With a proper data structure, you can store the input data in such a way that each query is O(1). For example, you can create a table where each row represents a letter (from a to z; for this example let's assume we get only lower-case letters). Each column holds the number of times the given letter has occurred up to (and including) the index of that column.
For instance, if the input is aabz, the table will look like this:
    |  0  1  2  3
  --+-------------
  a |  1  2  2  2
  b |  0  0  1  1
  . |  .  .  .  .
  . |  .  .  .  .
  y |  0  0  0  0
  z |  0  0  0  1
In that case, if you need the number of occurrences of the letter at index 2 up to (and including) this index, all you need to do is:
Check the letter at index 2 in the input string ('b')
Check the value in the lookup table at ['b'][2] --> 1
The complexity of creating such a table is O(l) for a fixed alphabet size (26 letters here). Here is example code to build such a table:
#define CHARS_SIZE ('z' - 'a' + 1)
// 'arr' - is the input array of chars
// 'len' - length of the input array
// 'lookup' - pointer to a zeroed (cleared) array of size: CHARS_SIZE * len * sizeof(*lookup)
void build_lookup(const char *arr, int len, int *lookup)
{
    int char_val;
    // normalize the letter to integer value between 0 (for 'a') and 25 (for 'z')
    char_val = arr[0] - 'a';
    lookup[char_val*len] = 1;
    // 'i' indicates the column index in the table
    for (int i = 1; i < len; ++i)
    {
        char_val = arr[i] - 'a';
        // update the number of occurrences for each letter a..z at column 'i'
        for (int char_iter = 0; char_iter < CHARS_SIZE; ++char_iter)
        {
            if (char_iter != char_val)
            {
                // same value as the previous one
                lookup[char_iter*len + i] = lookup[char_iter*len + i - 1];
            }
            else {
                // +1 to the value in the previous column
                lookup[char_iter*len + i] = lookup[char_iter*len + i - 1] + 1;
            }
        }
    }
}
The query, in such case, would be:
int occ(const char *arr, int len, const int *lookup, int idx){
    // normalize the letter to integer value between 0 (for 'a') and 25 (for 'z')
    int char_val = arr[idx] - 'a';
    return lookup[char_val * len + idx];
}
Here is your code with few additions of what I explained above: https://godbolt.org/z/zaY4RL
Note that I haven't tested it, so there are probably a few bugs; use it as a reference and not as a full solution.

Selecting elements of n inputs in c without using arrays

Given a sequence of integers as input I need to be able to select specific integers of the sequence so that I can execute various arithmetic operations between them.
For example, given this input I want to know the sum of the third and the last integer of the sequence.
EDIT: in the example it would be the sum of 7 and 9
3
1
7
4
9
The condition I have to follow is that I can't use arrays, so I can't give every integer an index.
Also, I can't know in advance how many integers there will be in the sequence, so I can't read a count that determines the number of integers.
To read the inputs I thought of using this loop with a scanf condition:
while (scanf("%d", &i) == 1);
The thing I'm stuck in is how to select the third and the last integers; if it was something like finding the sum of the odd integers I would just put a condition like this:
if (i%2 != 0)
{
sum = sum + i;
i++;
}
Most examples are solved using a for loop, but they either have the number of inputs declared or the inputs are just consecutive integers.
Any suggestions on how I can solve this problem?

You can mix a simple for loop with your scanf condition so you can associate an index to each number and memorize numbers needed to compute your operation (including the last).
Here is an example:
int i, value;
int sum = 0;
int lastValue = 0;
for(i=1 ; scanf("%d", &value) == 1 ; ++i)
{
    if(i == 3) /* select the third element */
        sum += value;
    lastValue = value; /* memorize the last element so far to compute it later */
}
sum += lastValue; /* assume that there is at least one value in the sequence */

How should I generate the n-th digit of this sequence in logarithmic time complexity?

I have the following problem:
The point (a) was easy, here is my solution:
#include <stdio.h>
#include <string.h>
// 4^10. The sequence grows by a factor of 4 at each step, so a power of 4
// ensures the last step never writes past the buffers (assumes n <= MAX_DIGITS).
#define MAX_DIGITS 1048576
char conjugateDigit(char digit)
{
    if(digit == '1')
        return '2';
    else
        return '1';
}
void conjugateChunk(char* chunk, char* result, int size)
{
    int i = 0;
    for(; i < size; ++i)
    {
        result[i] = conjugateDigit(chunk[i]);
    }
    result[i] = '\0';
}
void displaySequence(int n)
{
    // +1 for '\0'; static so the large buffers do not live on the stack
    static char result[MAX_DIGITS + 1];
    // In this variable I temporarily store the conjugates at each iteration.
    // Since every component of the sequence is 1/4 the size of the sequence,
    // the length of `tmp` will be MAX_DIGITS / 4 + the string terminator.
    static char tmp[(MAX_DIGITS / 4) + 1];
    // Here I assign the base value to the sequence
    strcpy(result, "1221");
    // The initial value of k will be 4, since the base sequence has length
    // 4. At each step the sequence becomes 4 times bigger than the
    // previous one.
    for(int k = 4; k < n; k *= 4)
    {
        // We conjugate the first part of the sequence.
        conjugateChunk(result, tmp, k);
        // We concatenate the conjugate 2 times to the original sequence
        strcat(result, tmp);
        strcat(result, tmp);
        // Now we conjugate the conjugate in order to get the first part.
        conjugateChunk(tmp, tmp, k);
        strcat(result, tmp);
    }
    for(int i = 0; i < n; ++i)
    {
        printf("%c", result[i]);
    }
    printf("\n");
}
int main()
{
    int n;
    printf("Insert n: ");
    scanf("%d", &n);
    printf("The result is: ");
    displaySequence(n);
    return 0;
}
But for point (b) I have to generate the n-th digit in logarithmic time. I have no idea how to do it. I have tried to find a mathematical property of that sequence, but I failed. Can you help me please? It is not the solution itself that really matters, but how you tackle this kind of problem in a short amount of time.
This problem was given last year (in 2014) at the admission exam at the Faculty of Mathematics and Computer Science at the University of Bucharest.

Suppose you define d_ij as the value of the ith digit in s_j.
Note that for a fixed i, d_ij is defined only for large enough values of j (at first, s_j is not large enough).
Now you should be able to prove to yourself the two following things:
Once d_ij is defined for some j, it will never change as j increases (hint: induction).
For a fixed i, d_ij is defined for j logarithmic in i (hint: how does the length of s_j increase as a function of j?).
Combining this with the first item, which you solved, should give you the result along with the complexity proof.

There is a simple programming solution; the key is to use recursion.
First determine the minimal k such that the length of s_k is at least n, so that the n-th digit exists in s_k. By definition, s_k can be split into 4 equal-length parts. You can easily determine which part the n-th symbol falls into, and its position within that part: say the n-th symbol of the whole string is the n'-th within this part. This part is either s_{k-1} or inv(s_{k-1}). In either case you recursively determine the n'-th symbol of s_{k-1}, and then, if needed, invert it.
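
A minimal sketch of this recursion in C follows. It assumes the construction used in the question's code: the base block is "1221", and each step builds s, inv(s), inv(s), s. The function names are made up for illustration, and n is taken to be 1-based.

```c
/* Digit at 0-based position idx inside a block of length len,
   where len is a power of 4 and the block of length 4 is "1221". */
static int digit_in_block(unsigned long long idx, unsigned long long len)
{
    if (len == 4)
        return "1221"[idx] - '0';

    unsigned long long part = len / 4;      /* length of each quarter */
    unsigned long long which = idx / part;  /* quarter index, 0..3 */
    int d = digit_in_block(idx % part, part);

    /* quarters 1 and 2 are the inverted copies: 1 <-> 2 */
    return (which == 1 || which == 2) ? 3 - d : d;
}

/* Hypothetical entry point: n-th digit of the sequence, n >= 1. */
int nth_digit_recursive(unsigned long long n)
{
    unsigned long long len = 4;
    while (len < n)        /* smallest block that contains position n */
        len *= 4;
    return digit_in_block(n - 1, len);
}
```

Each recursive call divides the block length by 4, so the depth is about log4(n), which gives the logarithmic running time.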

The digits up to 4^k are used to determine the digits up to 4^(k+1). This suggests writing n in base 4.
Consider the binary expansion of n where we pair digits together, or equivalently the base 4 expansion where we write 0=(00), 1=(01), 2=(10), and 3=(11).
Let f(n) = +1 if the nth digit is 1, and -1 if the nth digit is 2, where the sequence starts at index 0, so f(0)=1, f(1)=-1, f(2)=-1, f(3)=1. This index is one lower than the index starting from 1 used to compute the examples in the question. The 0-based nth digit is (3-f(n))/2. If you start the indices at 1, the nth digit is (3-f(n-1))/2.
f((00)n) = f(n).
f((01)n) = -f(n).
f((10)n) = -f(n).
f((11)n) = f(n).
You can use these to compute f recursively, but since it is a back-recursion you might as well compute f iteratively. f(n) is (-1)^(binary weight of n) = (-1)^(sum of the binary digits of n).
See the Thue-Morse sequence.
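
The closed form above translates into a short loop over the bits of the index. The sketch below uses a made-up function name and assumes 1-based n, so it applies the rule to n-1.

```c
/* Hypothetical helper: n-th digit (n >= 1) computed from the parity
   of the number of 1 bits in n - 1. Even parity gives 1, odd gives 2. */
int nth_digit_by_bit_parity(unsigned long long n)
{
    unsigned long long m = n - 1;   /* switch to a 0-based index */
    int odd = 0;

    while (m != 0) {
        odd ^= (int)(m & 1u);       /* flip once per set bit */
        m >>= 1;
    }
    return odd ? 2 : 1;
}
```

The loop runs once per bit of n-1, so the cost is proportional to log n.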

Non-recursive combination algorithm to generate distinct character strings

This problem has been irritating me for too long. I need a non-recursive algorithm in C to generate every character string of a given length over a character set, with repeated characters allowed. For instance, if the character set is 26 characters long and the string is of length 2, then there are 26^2 such strings.
Please note that order matters: aab is not the same as baa or aba. I've searched S.O., and most solutions treat these as the same combination. Also, I do not need permutations.
The algorithm can't rely on any libraries. I'm going to translate this C code into cuda, where standard C libraries don't work (at least not efficiently).
Before I show you what I started, let me explain an aspect of the program. It is multithreaded on a GPU, so I initialize the beginning string with a few characters, aa in this case. To create a combination, I add one or more characters depending on the desired length.
Here's one method that I have attempted:
int main(void){
//Declarations
char final[12] = {0};
char b[3] = "aa";
char charSet[27] = "abcdefghijklmnopqrstuvwxyz";
int max = 4; //Set for demonstration purposes
int ul = 1;
int k,i;
//This program is multithreaded on a GPU. Each thread is initialized
//to a starting value for the string. In this case, it is aa
//Set final with a starting prefix
int pref = strlen(b);
memcpy(final, b, pref+1);
//Determine the number of non-distinct combinations
for(int j = 0; j < length; j++) ul *= strlen(charSet);
//Start concatenating characters to the current character string
for(k = 0; k < ul; k++)
{
final[pref+1] = charSet[k];
//Do some work with the string
}
...
It should be obvious that this program does nothing useful, except if I'm only appending one character from charSet.
My professor suggested that I try using a mapping (this isn't homework; I asked him about possible ways to generate distinct combinations without recursion).
His suggestion is similar to what I started above. Using the number of combinations calculated, he suggested to decompose it according to mod 10. However, I realized it wouldn't work.
For example, say I need to append two characters. This gives me 676 combinations using the character set above. If I am on the 523rd combination, the decomposition he demonstrated would yield
523 % 10 = 3
52 % 10 = 2
5 % 10 = 5
It should be obvious that this doesn't work. For one, it yields three characters, and two, if my character set is larger than 10 characters, the mapping ignores those above index 9.
Still, I believe a mapping is key to the solution.
The other method I explored utilized for loops:
//Pseudocode
c = charset;
for(i = 0; i <length(charset); i++){
    concat string
    for(j = 0; j <length(charset); j++){
        concat string
        for...
However, this hardcodes the length of the string I want to compute. I could use an if statement with a goto to break it, but I would like to avoid this method.
Any constructive input is appreciated.

Given a string, to find the next possible string in the sequence:
1. Find the last character in the string which is not the last character in the alphabet.
2. Replace it with the next character in the alphabet.
3. Replace every character to the right of that character with the first character in the alphabet.
Start with a string which is a repetition of the first character of the alphabet. When step 1 fails (because the string consists entirely of the last character of the alphabet), you're done.
Example: the alphabet is "ajxz".
Start with aaaa.
First iteration: the rightmost character which is not z is the last one. Change it to the next character: aaaj
Second iteration: ditto. aaax
Third iteration: again. aaaz
Fourth iteration: now the rightmost non-z character is the second-last one. Advance it and change all characters to the right to a: aaja
Etc.
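
The three steps can be sketched in C without any library calls, which matches the constraint in the question. The function name next_string and its parameters are made up for illustration; the sketch assumes every character of s occurs in the alphabet.

```c
/* Hypothetical helper: advances s (len characters) to the next string
   over the given alphabet. Returns 1 on success, 0 if s was already
   the last string (all characters equal to the last alphabet entry). */
int next_string(char *s, int len, const char *alphabet, int alpha_len)
{
    for (int pos = len - 1; pos >= 0; --pos) {
        /* find the alphabet index of s[pos] */
        int idx = 0;
        while (idx < alpha_len && alphabet[idx] != s[pos])
            ++idx;

        if (idx < alpha_len - 1) {
            s[pos] = alphabet[idx + 1];      /* step 2: advance it */
            for (int r = pos + 1; r < len; ++r)
                s[r] = alphabet[0];          /* step 3: reset the tail */
            return 1;
        }
        /* s[pos] is the last alphabet character: keep moving left */
    }
    return 0;                                /* step 1 failed: done */
}
```

Calling it in a loop, starting from a string of all first characters, visits every string of that length exactly once. Storing alphabet indices instead of characters would avoid the inner search, at the cost of a second small array.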

First, thanks for everyone's input; it was helpful. Since I am translating this algorithm into cuda, I need it to be as efficient as possible on a GPU. The methods proposed certainly work, but they are not necessarily optimal for a GPU architecture. I came up with a different solution using modular arithmetic that takes advantage of the base of my character set. Here's an example program, primarily in C with a mix of C++ for output, and it's fairly fast.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <iostream>
using namespace std;
typedef unsigned long long ull;
int main(void){
    //Declarations
    int init = 2;                 //Length of the starting prefix
    char final[12] = {'a', 'a'};  //Prefix; the rest of the array is zero-filled
    char charSet[27] = "abcdefghijklmnopqrstuvwxyz";
    ull max = 2; //Number of characters to append; modify as need be
    int base = strlen(charSet);
    int placeHolder; //Maps to character in charset (result of %)
    ull quotient; //Quotient after division by base
    ull nComb = 1;
    int c = 0;
    ull i,j;
    //Compute the number of distinct combinations ((size of charset)^length)
    for(j = 0; j < max; j++) nComb *= base;
    //Begin computing combinations
    for(i = 0; i < nComb; i++){
        quotient = i;
        for(j = 0; j < max; j++){ //No need to check whether the quotient is zero
            placeHolder = quotient % base;
            final[init+j] = charSet[placeHolder]; //Copy the indicated character
            quotient /= base; //Divide the number by its base to calculate the next character
        }
        c++;
        //Print combinations
        cout << final << "\n";
    }
    cout << "\n\n" << c << " combinations calculated";
    getchar();
}

find the longest non decreasing sub sequence

Given a string that consists only of 0s and 1s, say 10101,
how do I find the length of the longest non-decreasing subsequence?
For example,
for the string
10101
the longest non-decreasing subsequences are
111
001
so you should output 3.
For the string
101001
the longest non-decreasing subsequence is
0001
so you should output 4.
How do I find this?
And how can this be done when limits are given, i.e. only for the part of the sequence between the limits?
For example,
101001
limits [3,6]
the longest non-decreasing subsequence is
001
so you should output 3.
Can this be achieved in O(strlen)?

Can this be achieved in O(strlen)?
Yes. Observe that the non-decreasing subsequences would have one of these three forms:
0........0 // Only zeros
1........1 // Only ones
0...01...1 // Some zeros followed by some ones
The first two forms can be easily checked in a single O(n) pass by counting all zeros and counting all ones.
The last one is a bit harder: you need to go through the string keeping a counter of the zeros that you've seen so far, along with the length of the longest subsequence of the 0...01...1 form that you have discovered so far. At each step where you see a 1 in the string, the length of the longest subsequence of the third form becomes the larger of two values: the number of zeros plus one, or the longest 0...01...1 subsequence that you've seen so far plus one.
Here is the implementation of the above approach in C:
const char *str = "10101001";
int longest0=0, longest1=0;
for (const char *p = str ; *p ; p++) {
    if (*p == '0') {
        longest0++;
    } else { // *p must be 1
        longest1 = max(longest0, longest1)+1;
    }
}
printf("%d\n", max(longest0, longest1));
max is defined as follows:
#define max( a, b ) ( ((a) > (b)) ? (a) : (b) )
Here is a link to a demo on ideone.

Use dynamic programming. Run through the string from left to right, and keep track of two variables:
zero: length of longest subsequence ending in 0
one: length of longest subsequence ending in 1
If we see a 0, we can append it to any subsequence that ends in 0, so we increase zero. If we see a 1, we can append it to a subsequence that ends in either 0 or 1, so we set one to the longer of the two, plus one. In C99:
int max(int a, int b) {
    return a > b ? a : b;
}
int longest(char *string) {
    int zero = 0;
    int one = 0;
    for (; *string; ++string) {
        switch (*string) {
        case '0':
            ++zero;
            break;
        case '1':
            one = max(zero, one) + 1;
            break;
        }
    }
    return max(zero, one);
}
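
The question also asks about limits such as [3,6]. One way to handle that, sketched here under the assumption that the limits are 1-based and inclusive as in the question's example, is to run the same two-counter scan over just that part of the string. The function name is made up for illustration.

```c
/* Hypothetical helper: longest non-decreasing subsequence of the
   characters s[lo-1] .. s[hi-1], for a string of '0' and '1'.
   Assumes 1 <= lo <= hi <= strlen(s). */
int longest_non_decreasing_in_range(const char *s, int lo, int hi)
{
    int zero = 0;   /* longest subsequence ending in 0 */
    int one = 0;    /* longest subsequence ending in 1 */

    for (int i = lo - 1; i < hi; ++i) {
        if (s[i] == '0')
            ++zero;
        else if (s[i] == '1')
            one = (zero > one ? zero : one) + 1;
    }
    return zero > one ? zero : one;
}
```

Each query costs time proportional to the length of the range, so a single query over the whole string is still O(strlen).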

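int i = 0, count = 0, max = 0;
int prev = array[0];
do {
    if (array[i] < prev) { /* run broken: a new one starts at i */
        if (count > max)
            max = count;
        count = 0;
    }
    count++;
    prev = array[i];
} while (++i < length);
if (count > max) /* account for the final run */
    max = count;
Single pass. Works on any numbers, not just 1s and 0s. Note that this measures the longest contiguous non-decreasing run, not a subsequence with gaps.
For limits, set i to the starting index and use the ending index instead of the array length.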
