Python Counting consecutive characters in a string - arrays

I'm working on PSET6 for CS50 on edX; the problem is called DNA:
https://cs50.harvard.edu/x/2020/psets/6/dna/
A video that explains the problem in detail: https://youtu.be/j84b_EgntcQ
Below is the code that I need help with.
I want to count how many times each of the following sets of characters repeats consecutively:
"AGATC", "AATG", and "TATC"
This means that if "AGATC" appears only once, I ignore it,
but if it is repeated back to back, I count those repeats, and so on, and then return the maximum count that was found.
Below is a sample text; you are free to edit it for testing.
The code I tried doesn't give the needed result, because the counter groups each letter separately,
so is there a way I can get the result described below?
# code from https://www.journaldev.com/23666/python-string-find
def find_all_indexes(input_str, search_str):
    l1 = []
    length = len(input_str)
    index = 0
    while index < length:
        i = input_str.find(search_str, index)
        if i == -1:
            return l1
        l1.append(i)
        index = i + 1
    return l1

s = 'AAGAGATCAGATCAGATCAGATCAGGTGAGTTAAATAGAAGATCAGATCAGATCAGATCAGATCATAGGTTAAAAATGAATGAATGAATGAATGATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCAATGAATGAATGTATCTATCTATCAGAAAATGAATGAATGAAGAGTATATCTATCAATAGTTAAAGAGTAAGATATTGAATTGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG'
print(find_all_indexes(s, 'AGATC'))
# printed values: [3, 8, 13, 18, 39, 44, 49, 54, 59, 102, 107, 112, 117]
So now I'm able to find the locations,
but I don't know how to count the consecutive ones.
For example, since "AGATC" is 5 characters long, locations that are 5 apart are back to back:
locations 3, 8, 13, 18 are counted as 4,
locations 39, 44, 49, 54, 59 are counted as 5,
and locations 102, 107, 112, 117 are counted as 4,
so the greatest number is 5.
I need to get this 5, which is the maximum number of times this string was repeated consecutively.

The code is based on your example; you may want to optimize it:
import re

def find_all_indexes(input_str, search_str):
    # The name is kept from your example, but this returns the longest run length
    max_count = 0
    # Each match of (?:...)+ is one run of back-to-back repeats of search_str
    for group in re.findall(f"(?:{search_str})+", input_str):
        # Count how many copies of search_str the run contains
        count = len(re.findall(f"(?:{search_str})", group))
        if count > max_count:
            max_count = count
    return max_count

s = 'AAGAGATCAGATCAGATCAGATCAGGTGAGTTAAATAGAAGATCAGATCAGATCAGATCAGATCATAGGTTAAAAATGAATGAATGAATGAATGATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCAATGAATGAATGTATCTATCTATCAGAAAATGAATGAATGAAGAGTATATCTATCAATAGTTAAAGAGTAAGATATTGAATTGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG'
print(find_all_indexes(s, 'AGATC'))  # prints 5
For details, check the documentation for re.findall: https://docs.python.org/3/library/re.html#re.findall
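If you would rather avoid regular expressions, the same count can be obtained by scanning the string directly. This is only a sketch: the function name longest_run and the short test string are made up for illustration and are not part of the problem set.

```python
def longest_run(sequence, pattern):
    """Return the largest number of back-to-back repeats of pattern in sequence."""
    if not pattern:
        return 0
    best = 0
    step = len(pattern)
    for start in range(len(sequence)):
        count = 0
        position = start
        # Keep jumping ahead by one pattern length while the pattern repeats
        while sequence.startswith(pattern, position):
            count += 1
            position += step
        best = max(best, count)
    return best


# Hypothetical short sample: three repeats back to back, then one isolated repeat
print(longest_run("TTAGATCAGATCAGATCGGAGATC", "AGATC"))  # prints 3
```

It tries every starting position, so it does more work than strictly necessary, but it is easy to check by hand against the index list from the question.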

Related

How to find out if an arithmetic sequence exists in an array

If there is an array that contains random integers in ascending order, how can I tell whether this array contains an arithmetic sequence (length>3) with the common difference x?
Example:
Input: Array=[1,2,4,5,8,10,17,19,20,23,30,36,40,50]
x=10
Output: True
Explanation of the example: the array contains [10,20,30,40,50], which is an arithmetic sequence (length=5) with the common difference 10.
Thanks!
I apologize that I have not tried any code to solve this, since I have no clue yet.
After reading the answers, I tried it in Python.
Here is my code:
df = [1,10,11,20,21,30,40]
i=0
common_differene=10
df_len=len(df)
for position_1 in range(df_len):
    for position_2 in range(df_len):
        if df[position_1] + common_differene == df[position_2]:
            position_1=position_2
            i=i+1
print(i)
However, it returns 9 instead of 4.
Is there any way to prevent the repeated counting within one sequence [10,20,30,40], and also to prevent i from accumulating counts from other sequences such as [1,11,21]?
You can solve your problem with two loops: one runs through every element, and the other checks whether some element equals currentElement+x. If you find one that does, you can continue from there.
With the added rule of the sequence being more than 2 elements long, I have recreated your problem in FREE BASIC:
DIM array(13) As Integer = {1, 2, 4, 5, 8, 10, 17, 19, 20, 23, 30, 36, 40, 50}
DIM x as Integer = 10
DIM arithmeticArrayMinLength as Integer = 3
DIM index as Integer = 0
FOR position As Integer = LBound(array) To UBound(array)
    FOR position2 As Integer = LBound(array) To UBound(array)
        IF (array(position) + x = array(position2)) THEN
            position = position2
            index = index + 1
        END IF
    NEXT
NEXT
IF (index <= arithmeticArrayMinLength) THEN
    PRINT false
ELSE
    PRINT true
END IF
Hope it helps
Edit:
After reviewing your edit, I have come up with a solution in Python that returns all arithmetic sequences, keeping the order of the list:
def arithmeticSequence(A,n):
    SubSequence=[]
    ArithmeticSequences=[]
    #Create array of pairs from array A
    for index,item in enumerate(A[:-1]):
        for index2,item2 in enumerate(A[index+1:]):
            SubSequence.append([item,item2])
    #finding arithmetic sequences
    for index,pair in enumerate(SubSequence):
        if (pair[1] - pair[0] == n):
            found = [pair[0],pair[1]]
            for index2,pair2 in enumerate(SubSequence[index+1:]):
                if (pair2[0]==found[-1] and pair2[1]-pair2[0]==n):
                    found.append(pair2[1])
            if (len(found)>2): ArithmeticSequences.append(found)
    return ArithmeticSequences

df = [1,10,11,20,21,30,40]
common_differene=10
arseq=arithmeticSequence(df,common_differene)
print(arseq)
Output: [[1, 11, 21], [10, 20, 30, 40], [20, 30, 40]]
This is how you can get all the arithmetic sequences out of df for you to do whatever you want with them.
Now, if you want to remove the sub-sequences of already existing arithmetic sequences, you can try running it through:
def distinct(A):
    # Work on a copy so that A is not modified while it is being iterated
    DistinctArithmeticSequences = list(A)
    for item in A:
        for item2 in A:
            # Drop item2 if it is a different sequence fully contained in item
            if (item2 != item and set(item2) <= set(item)
                    and item2 in DistinctArithmeticSequences):
                DistinctArithmeticSequences.remove(item2)
    return DistinctArithmeticSequences

darseq=distinct(arseq)
print(darseq)
Output: [[1, 11, 21], [10, 20, 30, 40]]
Note: Not gonna lie, this was fun figuring out!
Try from 1: check the presence of 11, 21, 31... (you can stop immediately)
Try from 2: check the presence of 12, 22, 32... (you can stop immediately)
Try from 4: check the presence of 14, 24, 34... (you can stop immediately)
...
Try from 10: check the presence of 20, 30, 40... (bingo !)
You can use linear searches, but for a large array, a hash map will be better. If you can stop as soon as you have found a sequence of length > 3, this procedure takes linear time.
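The hash-based check described above could look roughly like this in Python. It is a sketch under a few assumptions: the function name and the min_length parameter are mine, "length > 3" is read as at least 4 elements, and starts that merely continue an earlier sequence are skipped so that no sequence is walked twice.

```python
def has_arithmetic_run(values, step, min_length=4):
    """Return True if values contains an arithmetic sequence with common
    difference `step` and at least `min_length` elements (assumed >= 2)."""
    present = set(values)  # hash-based membership test, O(1) on average
    for start in values:
        # Skip elements that only continue a sequence begun at an earlier element
        if start - step in present:
            continue
        length = 1
        current = start
        while current + step in present:
            current += step
            length += 1
            if length >= min_length:
                return True  # stop as soon as a long enough sequence is found
    return False


print(has_arithmetic_run([1, 2, 4, 5, 8, 10, 17, 19, 20, 23, 30, 36, 40, 50], 10))  # prints True
```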
Scan the list in increasing order and, for every element v, check whether the element v + 10 is present and draw a link between them. This search can be done in linear time as a modified merge operation.
E.g. from 1, search 11; you can stop at 17; from 2, search 12; you can stop at 17; ... ; from 8, search 18; you can stop at 19...
Now you have a graph, the connected components of which form arithmetic sequences. You can traverse the array in search of a long sequence (or a longest), also in linear time.
In the given example, the only links are 10 -> 20 -> 30 -> 40 -> 50.
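Here is one way the linking idea might be written in Python, as a sketch rather than a definitive implementation. It assumes the input is strictly increasing and the step is positive; the function name and the successor/length arrays are my own choices for illustration.

```python
def longest_arithmetic_chain(sorted_values, step):
    """Return the longest chain v, v+step, v+2*step, ... found in sorted_values.

    Assumes sorted_values is strictly increasing and step > 0.
    """
    n = len(sorted_values)
    if n == 0:
        return []
    # successor[i] is the index of sorted_values[i] + step, or None if absent
    successor = [None] * n
    j = 0
    for i in range(n):
        target = sorted_values[i] + step
        # j only ever moves forward, as in a merge, so this pass is linear
        while j < n and sorted_values[j] < target:
            j += 1
        if j < n and sorted_values[j] == target:
            successor[i] = j
    # Chain length starting at each index, filled from right to left
    length = [1] * n
    for i in range(n - 1, -1, -1):
        if successor[i] is not None:
            length[i] = length[successor[i]] + 1
    # Follow the links from the start of a longest chain
    k = max(range(n), key=length.__getitem__)
    chain = []
    while k is not None:
        chain.append(sorted_values[k])
        k = successor[k]
    return chain


print(longest_arithmetic_chain([1, 2, 4, 5, 8, 10, 17, 19, 20, 23, 30, 36, 40, 50], 10))
# prints [10, 20, 30, 40, 50]
```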

Creating a count down list of numbers and string using loops and going in steps of 3

I have to generate a count down list that stops before the numbers become negative. The list has to go in steps of 3, for instance, 60, 57, 54. However, when the number is even it should print the string 'even' and when odd it should print the number, for instance, even, 57, even, 51, even, etc.
The count down should be displayed as starting from 60, but it should work for any integer starting value.
This is my attempt:
def IsEven(counter):
    if (counter % 1 == 0):
        out = counter
    else:
        type(counter)=='even'

def count_down(list):
    #loop over every value of the list
    for counter in list:
        counter=counter-3
    return counter

list1=[range(60,0)]
count_down(list1)
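One possible way to get the behaviour described above is sketched below. Two things in the attempt stand in the way: counter % 1 is 0 for every integer (the even test is counter % 2 == 0), and range(60, 0) is empty because the default step is +1. The sketch assumes that 0 still counts as "not negative" and is therefore included; the step parameter is my own addition.

```python
def count_down(start, step=3):
    """Count down from start in steps of `step`, stopping before the values
    become negative; even values are replaced by the string 'even'."""
    result = []
    # The stop value -1 is exclusive, so 0 is the last value that can appear
    for value in range(start, -1, -step):
        result.append('even' if value % 2 == 0 else value)
    return result


print(count_down(60)[:5])  # prints ['even', 57, 'even', 51, 'even']
```

Because the start value is a parameter, the same function works for any non-negative integer, for example count_down(7) gives [7, 'even', 1].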

Discord py limit instead of requirement on range

I'm getting a "list index out of range" error. The issue is that I'm trying to show 25 results of players on a squad. Squads don't require 25 players; they only have a limit of 25. So when the squad contains fewer than 25 players, I get the out of range error. My question is, how do I display a list of squad members up to 25, without requiring 25? Here is the line that is causing issues:
e = discord.Embed(title=f"{x2[0]['squadName']} ({squadnumber})", color=discord.Colour(value=235232), description='\n'.join([f"{c} <#{x[c-1]['player']}> - {int(x[c-1]['points']):,d} Score" for c in range(1+(25*(0)), 26+(25*(0)))]))
I used this method to get the range:
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [x[i] for i in range(0, 5 if len(x) >= 5 else len(x))]
# this will get the first 5 elements of the list, and if the list isn't long enough
# it will get the length of the list
Applying this method to your code gives:
# range() excludes its end value, so add 1 to include the last listed player
e = discord.Embed(title=f"{x2[0]['squadName']} ({squadnumber})",
                  color=0x396E0,
                  description='\n'.join([f"{c} <#{x[c-1]['player']}> - {int(x[c-1]['points']):,d} Score" for c in range(1, min(len(x), 25) + 1)]))
I also noticed another thing in the code: instead of discord.Colour(value=some_value), you can pass the colour directly as a hex literal, 0xHEXCODE, so I edited that in to make it easier on the eyes.
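The same "up to 25, but not requiring 25" idea can also be written with a slice, which never raises an IndexError when the list is shorter than the limit. This is a sketch without any discord.py calls; the sample data and the helper name format_squad are hypothetical, and it assumes x is a list of dicts as the indexing in the question suggests.

```python
def format_squad(members, limit=25):
    """Build one description line per member, for at most `limit` members."""
    # members[:limit] simply returns fewer items when the list is short
    return [
        f"{rank} <#{member['player']}> - {int(member['points']):,d} Score"
        for rank, member in enumerate(members[:limit], start=1)
    ]


# Hypothetical sample data standing in for the squad list from the question
squad = [
    {"player": 111, "points": "1500"},
    {"player": 222, "points": "920"},
    {"player": 333, "points": "30"},
]
print("\n".join(format_squad(squad)))
```

The joined string is what would be passed as the embed's description.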
Please let me know if you need clarification on anything.
References:
0x usage in python
Using if/else in list comprehension
Getting hex colour codes

adding some elements in matlab with known Index

I have one array like below:
Array = [21.2, 13.6, 86.2, 54.6, 76, 34, 78, 12, 90, 4];
Now I want to add Array values from the first index to the fourth index, and from the seventh index to the tenth.
I wrote this code but it did not work correctly.
s = 0
for I=1:10
    if 1<=I<=4 | I>6
        s = s + Array(I);
    end
end
Please help me with this problem.
You can implement it without any kind of loop that may slow your code. To compute those sums, you just need to use 'sum'; see its documentation for further help. In your case, I'd do the following (the second range starts at the seventh index, as in your description):
a = [21.2, 13.6, 86.2, 54.6, 76, 34, 78, 12, 90, 4];
b = sum(a(1:4))+sum(a(7:end));
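As for why the loop in the question gives the wrong total: MATLAB evaluates 1<=I<=4 from left to right as (1<=I)<=4, and since the first comparison yields 0 or 1, the result is always true, so every element gets added. The sketch below imitates that in Python purely for illustration; the function names are made up, and the indices are 1-based to mirror the MATLAB loop.

```python
array = [21.2, 13.6, 86.2, 54.6, 76, 34, 78, 12, 90, 4]


def condition_as_written(i):
    # Mimics MATLAB's (1<=I)<=4 | I>6: the inner comparison gives 0 or 1,
    # which is always <= 4, so this is true for every index
    return (int(1 <= i) <= 4) or (i > 6)


def condition_intended(i):
    # What was meant: indices 1..4 or 7..10 (in MATLAB: (I>=1 && I<=4) || I>6)
    return (1 <= i <= 4) or (i > 6)


sum_as_written = sum(v for i, v in enumerate(array, start=1) if condition_as_written(i))
sum_intended = sum(v for i, v in enumerate(array, start=1) if condition_intended(i))

print(round(sum_as_written, 1))  # prints 469.6 (all ten elements)
print(round(sum_intended, 1))    # prints 359.6 (elements 1-4 and 7-10)
```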

Element by Element Comparison of Multiple Arrays in MATLAB

I have multiple input arrays and I want to generate one output array where the value is 0 if all elements in a column are the same and 1 if they are not all the same.
For example, if there are three arrays :
A = [28, 28, 43, 43]
B = [28, 43, 43, 28]
C = [28, 28, 43, 43]
Output = [0, 1, 0, 1]
The arrays can be of any size and there can be any number of them, but they are always all the same size.
A loop-free way is to use diff and any to advantage:
A = [28, 28, 43,43];
B = [28, 43, 43,28];
C = [28, 28, 43,43];
% Combine all three (or all N) vectors into a matrix, then use diff to find the
% difference between each element from row to row. If any difference in a column
% is non-zero, return 1, else return 0. The dimension argument 1 keeps any working
% column-wise even when only two vectors are stacked.
D = any(diff([A;B;C]), 1)
D = 0 1 0 1
There are several easy ways to do it.
Let's start by putting the relevant vectors in a matrix:
M = [A; B; C];
Now we can do things like:
idx = min(M) ~= max(M);
or
idx = var(M) ~= 0;
No one seems to have addressed that you have a variable number of arrays. You have three in your example, but you said you could have any number. I'd also like to take a stab at this using broadcasting.
You can create a function that takes a variable number of arrays and returns an array with the same number of columns as the inputs, in the output format you describe.
First create a larger matrix that concatenates all of the arrays together, then use bsxfun to take advantage of broadcasting the first row and ensuring that you find columns that are all equal. You can use all to complete this step:
function out = array_compare(varargin)
    matrix = vertcat(varargin{:});
    out = ~all(bsxfun(@eq, matrix(1,:), matrix), 1);
end
This will take the first row of the stacked matrix and see if this row is the same among all of the rows in the stacked matrix for every column and returns a corresponding vector where 0 denotes each column being all equal and 1 otherwise.
Save this function in MATLAB and call it array_compare.m, then you can call it in MATLAB like so:
A = [28, 28, 43, 43];
B = [28, 43, 43, 28];
C = [28, 28, 43, 43];
Output = array_compare(A, B, C);
We get in MATLAB:
>> Output
Output =
0 1 0 1
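For anyone following along in Python instead, the same column-wise comparison for a variable number of arrays can be sketched in plain Python. This is not a translation of bsxfun; it just groups the arrays by column with zip, and it assumes the values are hashable (plain numbers are).

```python
def array_compare(*arrays):
    """Return 0 for each column whose values are all equal, 1 otherwise."""
    # zip(*arrays) yields one tuple per column across all input arrays
    return [0 if len(set(column)) == 1 else 1 for column in zip(*arrays)]


A = [28, 28, 43, 43]
B = [28, 43, 43, 28]
C = [28, 28, 43, 43]
print(array_compare(A, B, C))  # prints [0, 1, 0, 1]
```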
Not fancy but will do the trick
Output=nan(length(A),1); %preallocation and check if an index isn't reached
for i=1:length(A)
    Output(i)= ~isequal(A(i),B(i),C(i));
end
If someone has an answer without the loop, take that, but I feel like performance is not an issue here.
