Why/how does the Longest Proper Prefix/Suffix algorithm work?

The LPS (Longest Proper Prefix which is also a Suffix) algorithm goes as follows:
public static int[] constructLPSArray(String s) {
    int n = s.length();
    int[] arr = new int[n];
    int j = 0;
    for (int i = 1; i < n; ) {
        if (s.charAt(i) == s.charAt(j)) {
            arr[i] = j + 1;
            i++;
            j++;
        } else {
            if (j != 0) {
                j = arr[j - 1];
            } else {
                i++;
            }
        }
    }
    return arr;
}
The if (s.charAt(i) == s.charAt(j)) branch is clear to me, but the else branch is not.
Why do we do this?
    if (j != 0) {
        j = arr[j - 1];
    } else {
        i++;
    }
More specifically, why does j = arr[j - 1] work? Why do we do it at all, and how can the correctness of this step be validated?

Let's say we are parsing an array of characters with i and j positioned like this:
a b a b x x a b a b ...
      ^           ^
      j           i
with arr holding:
0 0 1 2 0 0 1 2 3 4
i.e., for each prefix of s processed so far (up to i), the length of its longest proper prefix that is also a suffix. You can probably guess how that was generated from the rest of the algorithm. Now, if the next character after i does not match the next character after j,
a b a b x x a b a b a ...
        ^           ^
        j           i
we don't have to restart the matching from scratch, because we know the longest prefix/suffix of our previous prefix/suffix! Looking up arr[j - 1] yields 2, so we essentially cached the information that the parts highlighted here
A B a b x x a b A B a ...
=== ^           === ^
    j               i
are identical and don't need to be compared again!
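
To check the fallback step by hand, it can help to run the same loop on the example string. The following is a minimal C sketch of the algorithm from the question; the name build_lps is made up for illustration. One difference from the Java version is an assumption worth noting: Java zero-initializes arr, so C has to write the zeros explicitly.

```c
#include <assert.h>
#include <string.h>

/* Fills lps[0..strlen(s)-1]. lps[i] is the length of the longest proper
   prefix of s[0..i] that is also a suffix of s[0..i]. */
void build_lps(const char *s, int lps[])
{
    assert(s != NULL && lps != NULL);
    int n = (int)strlen(s);
    int i = 1; /* position whose lps value is computed next */
    int j = 0; /* length of the prefix/suffix matched so far */
    if (n > 0)
        lps[0] = 0; /* a single character has no proper prefix */
    while (i < n) {
        if (s[i] == s[j]) {
            /* the match of length j extends by one character */
            lps[i++] = ++j;
        } else if (j != 0) {
            /* Mismatch: any shorter candidate must itself be a
               prefix/suffix of the j characters already matched,
               and the longest such candidate is lps[j - 1]. */
            j = lps[j - 1];
        } else {
            /* nothing left to fall back to */
            lps[i++] = 0;
        }
    }
}
```

For "ababxxababa" the mismatch at the final 'a' falls back from j = 4 to j = 2, s[2] is 'a', and the last entry becomes 3.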

Here's one more solution, a brute-force check that only considers prefixes and suffixes that do not overlap (at most half of the string):
int length = str.length();
int result = -1; // stays -1 if no non-empty prefix equals the suffix of the same length
for (int i = length / 2; i >= 1; i--) {
    String prefix = str.substring(0, i);
    String suffix = str.substring(length - i, length);
    if (suffix.equals(prefix)) {
        result = i;
        break;
    }
}
System.out.println(result);

Related

Check whether exists index k such that elements of array A[] moved clockwise make a reverse bitonic array

Check whether there exists an index 0 <= k < n - 2 such that the elements of array A[], moved clockwise by k positions, form a reverse bitonic array.
My approach, which runs in O(n) time:
bool is_antibitonicable(int A[], int n) {
// returns if there is such index k that
// after moving clockwise k elements of array
// A[], that array is reverse bitonic
// - strictly decreasing then strictly
// increasing
if (n < 3)
return false;
// if is_increasing[i] == 1 means this part of A[] is increasing,
// == 0 means that part of A[] is decreasing, == -1 default
int is_increasing[3] = { -1, -1, -1 };
for (int i = 0, j; i < n - 1;) {
if (A[i] < A[i + 1]) { // if A[] is increasing
j = 0;
while (j < 3 && is_increasing[j] != -1)
j++;
if (j == 3)
return false;
is_increasing[j] = 1;
while (i < n - 1 && A[i] < A[i + 1])
i++;
}
else if (A[i] > A[i + 1]) { // check if decreasing
j = 0;
while (j < 3 && is_increasing[j] != -1)
j++;
if (j == 3)
return false;
is_increasing[j] = 0;
while (i < n - 1 && A[i] > A[i + 1])
i++;
}
else // sequence of A[] is neither increasing nor decreasing
return false;
}
// if A[] is only increasing/decreasing
if (is_increasing[1] == is_increasing[2])
return false;
// if A[] is increasing->decreasing->increasing check if increasing
// parts can be merged into one increasing sequence
if (is_increasing[0] == 1 && is_increasing[1] == 0 && is_increasing[2] == 1)
return (A[0] > A[n - 1]);
// decreasing->increasing->decreasing
if (is_increasing[0] == 0 && is_increasing[1] == 1 && is_increasing[2] == 0)
return (A[0] < A[n - 1]);
return true; // increasing -> decreasing or opposite
}
I'd be very glad if someone could look at my solution and comment on whether it seems correct or how to do it better. Any feedback will be appreciated.

Your solution doesn't look bad, but it incorrectly returns false in the case commented as "if A[] is only increasing/decreasing". Such a sequence can always be turned into one that first decreases and then increases by rotating it by one position in the appropriate direction.
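
One way to test a claim like this is to compare the O(n) function against a slow reference that simply tries every rotation. The sketch below is such a reference in C. The names is_reverse_bitonic_from and is_antibitonicable_slow are made up here, and two things are assumptions rather than facts from the question: that every rotation 0 <= k < n is allowed, and that both the decreasing and the increasing part must be non-empty.

```c
#include <assert.h>
#include <stdbool.h>

/* True if a[start], a[start + 1], ... (indices taken modulo n) is strictly
   decreasing and then strictly increasing, with both parts non-empty. */
static bool is_reverse_bitonic_from(const int a[], int n, int start)
{
    int i = 0;
    /* walk down the strictly decreasing part */
    while (i < n - 1 && a[(start + i) % n] > a[(start + i + 1) % n])
        i++;
    if (i == 0 || i == n - 1)
        return false; /* no decreasing part, or no increasing part */
    /* everything that is left must be strictly increasing */
    while (i < n - 1 && a[(start + i) % n] < a[(start + i + 1) % n])
        i++;
    return i == n - 1;
}

/* Brute force over all rotations: O(n^2), intended only as a test oracle. */
bool is_antibitonicable_slow(const int a[], int n)
{
    assert(a != 0 && n >= 0);
    if (n < 3)
        return false;
    for (int k = 0; k < n; k++)
        if (is_reverse_bitonic_from(a, n, k))
            return true;
    return false;
}
```

Under these assumptions the oracle accepts both 1 2 3 4 (rotated to 4 1 2 3) and 4 3 2 1 (rotated to 3 2 1 4), which are the purely monotonic inputs the answer refers to.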

Find the longest palindromic DNA sub-sequence that has the most mutations on it

I've been trying to do a Dynamic Programming assignment for university but I had no success so far.
The problem:
Given a DNA string and a list of mutation locations (for example, positions 0 and 2 are mutations), find the longest palindromic sub-sequence that contains the most mutations.
Input: a string S with 0 to 2000 chars; an integer N such that 0<=N<=|S| and N positions (numbers from 0 to |S|) of mutations.
Output: an integer representing the size of the longest palindromic sub-sequence containing the maximum number of mutations.
Examples:
Input: CAGACAT 0
Output: 5
Input: GATTACA 1 0
Output: 1
Input: GATTACA 3 0 4 5
Output: 3
Input: TATACTATA 2 4 8
Output: 7
We have to code it in C, but what I really need are ideas, any language or pseudo-code is good to me.
My code to find the LPS (in C)
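#include <string.h>

/* Length of the longest palindromic subsequence of input (must be non-empty). */
int find_lps(const char *input)
{
    int len = strlen(input), i, cur_len;
    int c[len][len]; /* c[i][j]: LPS length of input[i..j] (VLA, may be too big for long inputs) */
    for (i = 0; i < len; i++)
        c[i][i] = 1;
    for (cur_len = 1; cur_len < len; cur_len++) {
        for (i = 0; i < len - cur_len; i++) {
            int j = i + cur_len;
            if (input[i] == input[j]) {
                /* for adjacent characters the inner range is empty, so do not read c[i + 1][i] */
                c[i][j] = (cur_len == 1 ? 0 : c[i + 1][j - 1]) + 2;
            } else {
                c[i][j] = c[i + 1][j] > c[i][j - 1] ? c[i + 1][j] : c[i][j - 1];
            }
        }
    }
    return c[0][len - 1];
}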
What I tried to do for the mutations:
1- Creating an array of the places where the LPS changes. That doesn't work, and I really have no idea what to do.
More details about the problem:
If several palindromic subsequences contain the same number of mutations, I need the longest of them. The number of mutations takes priority: if there are M mutations in total and some palindromic subsequence contains all M of them, it must be chosen even if a longer subsequence with fewer mutations exists. So the first criterion is the most mutations in a palindromic subsequence; among those with the same number of mutations, the longest wins.
Any help is appreciated, thank you.

Let's define C[i][j] to store 2 values:
1- The length of the longest palindromic sub-sequence in the sub-string S(i,j) that contains the most mutations; let's denote it by C[i][j].len
2- The number of mutations in that sub-sequence; let's denote it by C[i][j].ms
Then the result of the problem is C[0][|S|-1].len
Note: m[i] = 1 means the character s[i] is a mutation; otherwise m[i] = 0
Here is the full code written in C++:
#include <iostream>
#include <string>
using namespace std;
string s;
int m[2001];
struct Node {
int ms;//number of mutations
int len;
Node() {
ms = len = 0;
}
Node(int v1,int v2) {
ms = v1;
len = v2;
}
};
Node C[2001][2001];
Node getBestNode(Node n1, Node n2) {
if (n1.ms > n2.ms)
return n1;
if (n1.ms < n2.ms)
return n2;
if (n1.len > n2.len)
return n1;
if (n1.len < n2.len)
return n2;
return n1;
}
void init() {
for (int i = 0; i < 2001; i++) {
m[i] = 0;
for (int j = 0; j < 2001; j++) C[i][j] = Node(0,0);
}
}
void solve() {
int len = s.length();
// initializing the ranges of length = 1
for (int i = 0; i < len; i++)
C[i][i] = Node( m[i],1 );
// initializing the ranges of length = 2
for (int i = 0; i < len - 1; i++)
if (s[i] == s[i + 1])
C[i][i + 1] = Node(m[i] + m[i + 1],2);
else // different characters: keep the better of the two single characters
C[i][i + 1] = getBestNode(C[i][i], C[i + 1][i + 1]);
// for ranges of length >= 3
for (int cur_len = 3; cur_len <= len; cur_len++)
for (int i = 0; i <= len - cur_len; i++) {
int j = i + cur_len - 1;
C[i][j] = getBestNode(C[i + 1][j], C[i][j-1]);
if (s[i] == s[j]) {
Node nn = Node(
C[i + 1][j - 1].ms + m[i] + m[j] ,
C[i + 1][j - 1].len + 2
);
C[i][j] = getBestNode(C[i][j], nn);
}
}
}
int main() {
int n;
cin >> s >> n;
init();//initializing the arrays with zeros
for (int i = 0; i < n; i++) {
int x; cin >> x;
m[x] = 1;
}
solve();
cout << C[0][s.length()-1].len << endl;
return 0;
}
The function getBestNode() returns the better of 2 solutions by comparing the number of mutations first and then the length of the sub-sequence.
Note: The code can be shorter, but I made it this way for clarity.
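
Since the question asks for C, the same recurrence can be sketched in C as well. This is only an illustration under a few assumptions: the names struct best, better and longest_mutated_palindrome are made up, the table is heap-allocated for the actual string length instead of a fixed 2001 x 2001 array, and an empty inner range (two adjacent characters) is treated as 0 mutations and length 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Best palindromic subsequence of a range: more mutations wins, then length. */
struct best { int ms; int len; };

static struct best better(struct best a, struct best b)
{
    if (a.ms != b.ms)
        return a.ms > b.ms ? a : b;
    return a.len >= b.len ? a : b;
}

/* s: the DNA string; mut: n_mut mutation positions (0-based).
   Returns the length of the longest palindromic subsequence among those
   with the most mutations, or -1 if memory cannot be allocated. */
int longest_mutated_palindrome(const char *s, const int *mut, int n_mut)
{
    assert(s != NULL && n_mut >= 0);
    int len = (int)strlen(s);
    if (len == 0)
        return 0;
    int *m = calloc((size_t)len, sizeof *m);
    struct best *c = calloc((size_t)len * len, sizeof *c);
    if (m == NULL || c == NULL) {
        free(m);
        free(c);
        return -1;
    }
    for (int k = 0; k < n_mut; k++)
        if (mut[k] >= 0 && mut[k] < len) /* ignore out-of-range positions */
            m[mut[k]] = 1;
#define C(i, j) c[(size_t)(i) * len + (j)]
    for (int i = 0; i < len; i++) {
        C(i, i).ms = m[i];
        C(i, i).len = 1;
    }
    for (int cur = 2; cur <= len; cur++) {
        for (int i = 0; i + cur <= len; i++) {
            int j = i + cur - 1;
            struct best b = better(C(i + 1, j), C(i, j - 1));
            if (s[i] == s[j]) {
                struct best in = {0, 0}; /* empty inner range when cur == 2 */
                if (cur > 2)
                    in = C(i + 1, j - 1);
                in.ms += m[i] + m[j];
                in.len += 2;
                b = better(b, in);
            }
            C(i, j) = b;
        }
    }
    int result = C(0, len - 1).len;
#undef C
    free(c);
    free(m);
    return result;
}
```

Called with the four inputs from the question, this sketch returns 5, 1, 3 and 7 respectively.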

Insertion Sort C programming

I started studying C programming at university this year.
Today I was trying to understand insertion sort.
I wrote this code, which works perfectly:
void insertionSort (int v[], int s)
{
int i;
int j;
int value;
for (i = 1; i < s; i++)
{
value = v[i];
for (j = i - 1; (j >= 0) && (value < v[j]); j --)
{
v[j + 1] = v[j];
}
v[j + 1] = value; // why v[j+1]?
}
}
My question is about the last line of code: v[j + 1] = value. If I understand correctly, j (which decreases on every iteration) has a value of -1 at the end of the for loop, and that's why it is correct to write v[j + 1] = value.
Am I right, or am I missing something? Many thanks to anyone who can help me by explaining this better.

The way your code is set up right now, you need v[j + 1] because j always ends up one position before the place where you want to insert.
For example:
int v[6] = {1, 34, 2, 50, 4, 10};
s = sizeof(v) / sizeof(v[0]) = 6
Stepping through your code:
i = 1, j = 0
value = v[i] = 34
34 < 1 is false, so it doesn't go into the inner for loop
v[j + 1] = 34, which is right where 34 should be
On the second pass of the outer loop: i = 2, j = 1, value = 2
Both conditions are met (j >= 0 and 2 < 34), so you go into the inner loop
The body copies v[1] into v[2] (the 2 is safe because you stored it in value earlier), and then j is decreased by 1, making j = 0
The start of your array now looks like this:
1, 34, 34
The inner for loop tries to loop again but fails the second check (2 < 1 is false)
At this point j is 0, and when you do v[j + 1] = value, you're storing value (2) in its proper place.
The start of your array now looks like 1, 2, 34
So again, the significance of v[j + 1] is to insert at the correct place. j only reaches -1 when value is smaller than every element before it; if the value is already in the correct place, you simply write it back over itself.
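
The walkthrough above can be checked by isolating one pass of the outer loop. The following C sketch does that; the helper name insert_into_sorted_prefix is made up for illustration, and it returns j + 1 so the final value of j can be observed from outside.

```c
#include <assert.h>

/* Inserts v[i] into the already sorted prefix v[0..i-1] and returns the
   index where the value ended up, which is j + 1 after the inner loop. */
int insert_into_sorted_prefix(int v[], int i)
{
    assert(i >= 1);
    int value = v[i];
    int j;
    for (j = i - 1; j >= 0 && value < v[j]; j--)
        v[j + 1] = v[j]; /* shift the larger element one slot to the right */
    /* Here either j == -1 (value is smaller than the whole prefix) or
       v[j] <= value. In both cases slot j + 1 is the free one. */
    v[j + 1] = value;
    return j + 1;
}
```

With v = {1, 34, 2, 50, 4, 10}, the calls for i = 1 to 5 return 1, 1, 3, 2 and 3, so j ends at 0, 0, 2, 1 and 2. It never reaches -1 for this input, because 1 stays the smallest element.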

This is the process of insertion sort: each element is swapped toward the front for as long as it is smaller than the element before it.

A visualized example can be found here: https://visualgo.net/en/sorting
Here is an example in C:
#include <stdio.h>
int main()
{
int n, array[1000], c, d, t;
printf("Enter number of elements\n");
scanf("%d", &n);
printf("Enter %d integers\n", n);
for (c = 0; c < n; c++) {
scanf("%d", &array[c]);
}
// Insertion Sort
for (c = 1 ; c <= n - 1; c++) {
d = c;
while ( d > 0 && array[d] < array[d-1]) {
t = array[d];
array[d] = array[d-1];
array[d-1] = t;
d--;
}
}
printf("Sorted list in ascending order:\n");
for (c = 0; c <= n - 1; c++) {
printf("%d\n", array[c]);
}
return 0;
}
Insertion sort in pseudocode:
mark first element as sorted
for each unsorted element
    'extract' the element
    for i = lastSortedIndex to 0
        if currentSortedElement > extractedElement
            move sorted element to the right by 1
        else: insert extracted element

Generate all possible permutations in C

I'm trying to write code to solve the travelling salesman problem in C, but I have some restrictions: I can only use "for", "while", "do", arrays, matrices and simple things like that, so no functions or recursion (unfortunately).
What I've got so far:
The user will type the city coordinates X and Y like this:
8.15 1.58
9.06 9.71
1.27 9.57
9.13 4.85
The code to store the coordinates:
float city[4][2];
int i;
for (i = 0; i < 4; i++)
    scanf("%f %f", &city[i][0], &city[i][1]);
There are 4 cities, so "i" goes from 0 to 3. X and Y are stored in the second dimension of the matrix, at [0] and [1].
The problem now is that I have to generate ALL POSSIBLE permutations of the first dimension of the matrix. It seems easy with just 4 cities, because all possible routes are (each must start with city A):
A B C D
A B D C
A C B D
A C D B
A D C B
A D B C
But I'll have to expand it to 10 cities. People have told me that it will take 9 nested for loops, but I haven't been able to work it out =(
Can somebody give me an idea?

Extending this to 10 cities (and looking up city names) is left as an exercise for the reader. It's horrid, but that's what you get with your professor's limitations:
#include <stdio.h>
int main(void) {
    for (int one = 0; one < 4; one++) {
        for (int two = 0; two < 4; two++) {
            if (two != one) {
                for (int three = 0; three < 4; three++) {
                    if (one != three && two != three) {
                        for (int four = 0; four < 4; four++)
                            if (one != four && two != four && three != four) {
                                printf("%d %d %d %d\n", one, two, three, four);
                            }
                    }
                }
            }
        }
    }
    return 0;
}

This is based on https://stackoverflow.com/a/3928241/5264491
#include <stdio.h>
int main(void)
{
enum { num_perm = 10 };
int perm[num_perm];
int i;
for (i = 0; i < num_perm; i++) {
perm[i] = i;
}
for (;;) {
int j, k, l, tmp;
for (i = 0; i < num_perm; i++) {
printf("%d%c", perm[i],
(i == num_perm - 1 ? '\n' : ' '));
}
/*
* Find largest j such that perm[j] < perm[j+1].
* Break if no such j.
*/
j = num_perm;
for (i = 0; i < num_perm - 1; i++) {
if (perm[i + 1] > perm[i]) {
j = i;
}
}
if (j == num_perm) {
break;
}
for (i = j + 1; i < num_perm; i++) {
if (perm[i] > perm[j]) {
l = i;
}
}
tmp = perm[j];
perm[j] = perm[l];
perm[l] = tmp;
/* reverse j+1 to end */
k = (num_perm - 1 - j) / 2; /* pairs to swap */
for (i = 0; i < k; i++) {
tmp = perm[j + 1 + i];
perm[j + 1 + i] = perm[num_perm - 1 - i];
perm[num_perm - 1 - i] = tmp;
}
}
return 0;
}
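
The question requires every route to start with city A, which the loop above does not enforce. One possible adaptation is to permute only the tail of the array. The sketch below is a hedged illustration: the name next_permutation_from and its first parameter are made up, and it is written as a function only to keep it testable. Under the no-functions restriction the body could be pasted into a loop in main.

```c
#include <assert.h>

/* Advances perm[first..n-1] to the next arrangement in lexicographic
   order and leaves perm[0..first-1] untouched. Returns 1 on success,
   0 if the tail was already the last (descending) arrangement. */
int next_permutation_from(int perm[], int n, int first)
{
    assert(first >= 0 && first <= n);
    int i, j = -1, l = -1, tmp;
    /* largest j >= first with perm[j] < perm[j + 1] */
    for (i = first; i < n - 1; i++)
        if (perm[i] < perm[i + 1])
            j = i;
    if (j < 0)
        return 0;
    /* largest l > j with perm[l] > perm[j] */
    for (i = j + 1; i < n; i++)
        if (perm[i] > perm[j])
            l = i;
    tmp = perm[j];
    perm[j] = perm[l];
    perm[l] = tmp;
    /* reverse the tail after position j */
    for (i = j + 1, l = n - 1; i < l; i++, l--) {
        tmp = perm[i];
        perm[i] = perm[l];
        perm[l] = tmp;
    }
    return 1;
}
```

Starting from 0 1 2 3 and calling it repeatedly with first = 1 visits six routes in total, all beginning with city 0 (city A), and ends at 0 3 2 1.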

I need help creating a k-combinations algorithm non-recursively

I've looked around online for a non-recursive k-combinations algorithm, but have had trouble understanding all of the reindexing involved; the code I've found online is either poorly commented or crashes.
For example, if I have the collection {'a', 'b', 'c', 'd', 'e'} and I want to find all 3-combinations, i.e.,
abc
abd
abe
acd
ace
ade
bcd
bce
bde
cde
How can I implement an algorithm to do this? When I write down the general procedure, it is clear: I increment the last element of the combination until it points to 'e', then increment the second-to-last element and set the last element to the second-to-last element + 1, then increment the last element again until it reaches 'e', and so on, as illustrated by the order in which I printed the combinations. I looked at "Algorithm to return all combinations of k elements from n" for inspiration, but my code only prints 'abc'. Here is a copy of it:
#include <stdio.h>
#include <stdlib.h>
static void
comb(char *buf, int n, int m)
{
// Initialize a pointer representing the combinations
char *ptr = malloc(sizeof(char) * m);
int i, j, k;
for (i = 0; i < m; i++) ptr[i] = buf[i];
while (1) {
printf("%s\n", ptr);
j = m - 1;
i = 1;
// flag used to denote that the end substring is at it's max and
// the j-th indice must be incremented and all indices above it must
// be reset.
int iter_down = 0;
while((j >= 0) && !iter_down) {
//
if (ptr[j] < (n - i) ) {
iter_down = 1;
ptr[j]++;
for (k = j + 1; k < m; k++) {
ptr[k] = ptr[j] + (k - j);
}
}
else {
j--;
i++;
}
}
if (!iter_down) break;
}
}
int
main(void)
{
char *buf = "abcde";
comb(buf, 5, 3);
return 1;
}

The main problem with your code is that it mixes up indices and values. You have an array of chars, but then you try to increment the chars as if they were indices into the buffer. What you really need is an array of indices. The array of chars can be discarded, since the indices provide all you need, or you can keep the array of chars separately.
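
A minimal sketch of that separation, assuming the collection is a C string: the combination is stored as k increasing indices, and characters are looked up only when printing. The names next_combination and combination_to_string are made up for illustration.

```c
#include <assert.h>

/* idx holds k strictly increasing indices into a collection of n items.
   Advances idx to the next combination in lexicographic order. Returns 1
   on success, 0 if idx was already the last combination. */
int next_combination(int idx[], int n, int k)
{
    assert(k >= 0 && k <= n);
    int j = k - 1;
    /* find the rightmost position that is not at its maximum, n - k + j */
    while (j >= 0 && idx[j] == n - k + j)
        j--;
    if (j < 0)
        return 0;
    idx[j]++;
    /* reset everything to the right to the smallest allowed values */
    for (int t = j + 1; t < k; t++)
        idx[t] = idx[t - 1] + 1;
    return 1;
}

/* Writes the characters selected by idx into out, which needs k + 1 bytes. */
void combination_to_string(const char *items, const int idx[], int k, char out[])
{
    for (int t = 0; t < k; t++)
        out[t] = items[idx[t]];
    out[k] = '\0';
}
```

Starting from the indices 0 1 2 with items "abcde", converting before each call to next_combination yields the ten strings from abc to cde in the order listed in the question.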

I found a pseudocode description here: http://www4.uwsp.edu/math/nwodarz/Math209Files/209-0809F-L10-Section06_03-AlgorithmsForGeneratingPermutationsAndCombinations-Notes.pdf
and implemented it in C as follows:
#include <stdlib.h>
#include <stdio.h>
// Prints an array of integers
static void
print_comb(int *val, int len) {
int i;
for (i = 0; i < len; i++) {
printf("%d ", val[i]);
}
printf("\n");
}
// Calculates n choose k with integer arithmetic
static int
choose(int n, int k)
{
int i, val = 1;
for (i = 1; i <= k; i++) {
// exact: val * (n + 1 - i) is always divisible by i at this point
val = val * (n + 1 - i) / i;
}
return val;
}
static void
comb(int n, int r)
{
int i, j, m, max_val;
int s[r];
// Initialize combinations
for (i = 0; i < r; i++) {
s[i] = i;
}
print_comb(s, r);
// Iterate over the remaining space
for (i = 1; i < choose(n, r); i++) {
// use for indexing the rightmost element which is not at maximum value
m = r - 1;
// the maximum value allowed at index m
max_val = n - 1;
while(s[m] == max_val) {
m--;
max_val--;
}
// increment the index which is not at it's maximum value
s[m]++;
// iterate over the elements after m increasing their value recursively
// ie if the m-th element is incremented, all elements afterwards are
// incremented by one plus it's offset from m
// For example, this is responsible for switching 0 3 4 to 1 2 3 in
// comb(5, 3) since 3 and 4 in the first combination are at their maximum
// value
for (j = m; j < r - 1; j++) {
s[j + 1] = s[j] + 1;
}
print_comb(s, r);
}
}
int
main(void)
{
comb(5, 3);
return 0;
}
