I am trying to speed up my algorithm by using CUDA to find all possible combinations of the characters in a string. What is the best way to achieve this?
example:
abc
gives:
a
b
c
ab
ac
bc
I have nothing so far. I am not asking for code; I am just asking for the best way to do it. An algorithm? Pseudocode? Maybe a discussion?
The advantage to using CUDA is massive parallelism with potentially thousands of threads with little overhead. To that end, you have to figure out a way to divide the problem into small chunks without relying too much on communication between the threads. In this problem you have n characters and each can be either present or absent in each output string. This yields 2^n total output strings. (You've left the empty string and the original string off your list; if that's the desired result, then you have 2^n - 2 total output strings.)
In any event, one way you can divide up the work of creating the strings is to assign each potential output string a number and have each thread compute the output strings for a certain range of numbers. The mapping from number to output string is easy if you look at the binary representation of each number. Each binary digit in an n-bit number corresponds to a character in the string of length n. Thus, for your example, the number 5 or 101 in binary maps to the string "ac". The strings you listed would be created by computing the mappings for numbers from 1 to 6 as follows:
1 c
2 b
3 bc
4 a
5 ac
6 ab
You could compute 7 to get abc or 0 to get the empty string if desired.
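To make the mapping concrete, here is a minimal sketch in Python rather than CUDA (so it can be run anywhere). The function names are my own, and `subsets_in_range` only stands in for the work a single thread would do on its range of numbers; it is not a kernel.

```python
def subset_for_index(s, k):
    """Return the combination of s selected by the bits of k.

    The highest of the len(s) bits corresponds to s[0], so for
    s = "abc" the number 5 (binary 101) selects "ac".
    """
    n = len(s)
    return "".join(s[i] for i in range(n) if (k >> (n - 1 - i)) & 1)


def subsets_in_range(s, start, stop):
    """Illustrative stand-in for one thread: map every number in [start, stop)."""
    return [subset_for_index(s, k) for k in range(start, stop)]
```

Calling `subsets_in_range("abc", 1, 7)` reproduces the table above. In a real kernel each thread would derive its own `start` and `stop` from its thread and block indices.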
Unless you're doing this for words longer than a dozen or so characters, I'm not sure this will be that much faster though. If you're doing it for words longer than 25 or so characters you might start running into memory constraints since you'll be wrangling hundreds of megabytes.
I will be very, very surprised if CUDA is the right solution to this problem.
However, I would write a kernel to find all substrings of length n, and launch the kernel in a loop for each value of n from 0 to the length of the string. Thus, each thread in a kernel will have exactly the same instructions (no threads will sit around idle while others finish).
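Each thread will "find" one substring, so you might as well have thread i find the substring starting at index i in the string. Note that each substring length requires a different number of threads. Also note that this enumerates contiguous substrings only, so a non-contiguous combination such as "ac" is not produced.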
so, for n=1:
thread 0: a
thread 1: b
thread 2: c
and for n=2:
thread 0: ab
thread 1: bc
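A rough sketch of that scheme, written in Python instead of CUDA just to show the indexing. The names are hypothetical, and each list element plays the role of one thread's result.

```python
def substrings_of_length(s, n):
    """One 'kernel launch': 'thread' i yields the substring s[i:i+n].

    A launch for length n needs len(s) - n + 1 threads.
    """
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def all_substrings(s):
    """Launch once per substring length, from 1 up to len(s)."""
    result = []
    for n in range(1, len(s) + 1):
        result.extend(substrings_of_length(s, n))
    return result
```

For "abc" this gives a, b, c, ab, bc, abc, which matches the per-thread listing above.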
I'm trying to write a program that reads a .txt file containing several thousand strings (each one is exactly 9 letters long) made up only of the letters A,C,G and T (i.e. DNA sequences).
Now, there are of course 4^9 possible combinations of A,C,G and T in a 9-letter string. I need to know how often each of these 262144 combinations appears in my .txt file.
My problem is that I (obviously) don't want to initialize 262144 individual variables, increment each when a match is found and then print them all individually, because that would be crazy.
So, my idea was to create either some kind of tree which goes down the branches according to the letter encountered at each node and stores the number of times each branch was 'run down' (i.e. each possible 9-letter combination) at the last node.
Or an array of 262144 positions where I can store the number of appearances of each possible combination. For that, however, I would need some kind of non-redundant system that chooses a unique position in the array (to store the number of times that combination has been encountered) based on which letters have been encountered in which sequence in the 9-letter string.
For example: For each 'A' encountered in the 9-letter string, I increment my 'pointer variable' (which points to the position in the big array) by 0, so every time the sequence AAAAAAAAA is encountered, position [0] of my array is incremented by 1. For every 'T' I increment the pointer by 1, so TTTTTTTTT would increment position [9] of my array by 1 and so on.
This, however, gives me the problem that both sequences AAAAAAAAT and TAAAAAAAA (and all other combinations of 8 As and 1T) will increment position [1] of the array. So I would have to use some kind of system where the pointer can actually reach each value between 0 and 262143 exactly once?
I'm sure there is some better way? Multi-dimensional arrays or something like that?
Best regards,
rokyo
You want to store this as a tree of depth 9 in which each node can have up to 4 children, one for each possible next letter. Each leaf holds a counter. Once you have built the tree, go through all the leaves and that will give you the counts.
So it would work like this:
Read in a sequence.
For each character in the sequence, select the matching child. If it does not exist, create the node; then move to the child.
If you are at the end of your string, update the count in the node.
Loop back to read in the next sequence.
Once all sequences are read, the tree is built.
Iterate through the tree; if a node is a leaf (no children), then spit out the count.
The benefit to this approach is if the size of the data changes, or the length of each sequence it will still work. This is a typical use for a tree.
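The steps above can be sketched as follows. This is only an illustration in Python using nested dictionaries as tree nodes; the `"count"` key and the function names are my own choices, not part of the original description.

```python
def build_tree(sequences):
    """Build a tree with one level per character and a counter at each leaf."""
    root = {}
    for seq in sequences:
        node = root
        for ch in seq:
            # Select the child for this letter, creating it if it is missing.
            node = node.setdefault(ch, {})
        # End of the string: update the counter stored in this node.
        # "count" cannot clash with the one-letter child keys.
        node["count"] = node.get("count", 0) + 1
    return root


def leaf_counts(node, prefix=""):
    """Walk the tree and collect {sequence: count} from the counters."""
    counts = {}
    for key, child in node.items():
        if key == "count":
            counts[prefix] = child
        else:
            counts.update(leaf_counts(child, prefix + key))
    return counts
```

Only sequences that actually occur get nodes, so memory use grows with the data rather than with 4^9.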
Why multidimensional? If you just want to count, encode each string as an integer and increment the corresponding slot in an array of 262144 integers.
How to encode your string: think of each of those 4 letters as a binary number with 2 places, so you need 18 bits to represent one combination.
A - 00
C - 01
G - 10
T - 11
AAAAAAAAA - 000000000000000000 - 0
ACACACACA - 000100010001000100 - 17476
GAAAAAAAA - 100000000000000000 - 131072
TAAAAAAAA - 110000000000000000 - 196608
AAAAAAAAT - 000000000000000011 - 3
The size of the array in memory depends on the maximum number of occurrences you want to cope with. If about 4 billion per combination is enough (a 32-bit counter), you need 262144 x 4 bytes, i.e. about a megabyte of memory, to represent this "counter" array.
Each counting access would be O(1).
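A small sketch of this encoding, in Python for illustration. The table `CODES` follows the mapping listed above; the function names and the plain list used as the counter array are assumptions of mine.

```python
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def encode(seq):
    """Pack a DNA string into an integer, 2 bits per letter, first letter highest."""
    value = 0
    for ch in seq:
        value = (value << 2) | CODES[ch]
    return value


def count_sequences(sequences, length=9):
    """Count occurrences in a flat array with one slot per possible string."""
    counts = [0] * (4 ** length)  # 262144 slots for length 9
    for seq in sequences:
        counts[encode(seq)] += 1
    return counts
```

`encode("ACACACACA")` gives 17476 and `encode("AAAAAAAAT")` gives 3, matching the table, so AAAAAAAAT and TAAAAAAAA no longer collide.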
This idea has been floating around in my head for 3 years, and I am having problems applying it.
I wanted to create a compression algorithm that cuts the file size in half,
e.g. 8 MB to 4 MB.
With some searching and programming experience, I understood the following.
Let's take a .txt file containing the letters (a,b,c,d).
The IO.File.ReadAllBytes function gives the following array of bytes: ( 97 | 98 | 99 | 100 ), which according to https://en.wikipedia.org/wiki/ASCII#ASCII_control_code_chart are the decimal values of the letters.
What I thought about was how to mathematically cut this 4-element array down to a 2-element array by combining each pair of elements into a single element. But you can't simply combine two numbers mathematically and then reverse the operation, because there are many possibilities, e.g.
80 | 90 : 90+80=170, but there is no way to know that 170 was the result of 80+90 rather than 100+70 or 110+60.
And even if you could overcome that, you would be limited by the maximum value of a byte (255) in a single element of the array.
I understand that most compression algorithms work at the binary level and have been successful, but imagine cutting a file's size in half. I would like to hear your ideas on this.
Best Regards.
It's impossible to make a compression algorithm that makes every file shorter. The proof is called the "counting argument", and it's easy:
There are 256^L possible files of length L.
Let's say there are N(L) possible files with length < L.
Since N(L) = 1 + 256 + 256^2 + ... + 256^(L-1) is a geometric series, if you do the math you find that 256^L = 255*N(L)+1
So. You obviously cannot compress every file of length L, because there just aren't enough shorter files to hold them uniquely. If you made a compressor that always shortened a file of length L, then MANY files would have to compress to the same shorter file, and of course you could only get one of them back on decompression.
In fact, there are more than 255 times as many files of length L as there are shorter files, so you can't even compress most files of length L. Only a small proportion can actually get shorter.
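If you want to convince yourself of the arithmetic, here is a quick numeric check in Python. It is only a sanity check of the identity for small L, not a proof, and the function names are mine.

```python
def files_of_length(length):
    """Number of distinct files that are exactly `length` bytes long."""
    return 256 ** length


def shorter_file_count(length):
    """N(L): number of distinct files strictly shorter than `length` bytes."""
    return sum(256 ** k for k in range(length))


def identity_holds(length):
    """Check 256^L == 255 * N(L) + 1 for one value of L."""
    return files_of_length(length) == 255 * shorter_file_count(length) + 1
```

For L = 2, for example, there are 65536 files of that length but only 1 + 256 = 257 shorter ones to map them to.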
This is explained pretty well (again) in the comp.compression FAQ:
http://www.faqs.org/faqs/compression-faq/part1/section-8.html
EDIT: So maybe you're now wondering what this compression stuff is all about...
Well, the vast majority of those "all possible files of length L" are random garbage. Lossless data compression works by assigning shorter representations (the output files) to the files we actually use.
For example, Huffman encoding works character by character and uses fewer bits to write the most common characters. "e" occurs in text more often than "q", for example, so it might spend only 3 bits to write "e"s, but 7 bits to write "q"s. Bytes that hardly ever occur, like character 131, may be written with 9 or 10 bits, which is longer than the 8-bit bytes they came from. On average you can compress simple English text by almost half this way.
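As a rough illustration of how common symbols end up with shorter codes, here is a small Python sketch that computes only the Huffman code lengths (not the actual bit patterns). The function name and the tie-breaking counter are my own choices, and real compressors differ in many details.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]
```

For the input "aaaabbc" the frequent "a" gets a 1-bit code while "b" and "c" get 2 bits each, so the 7 characters need 10 bits instead of 56.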
LZ and similar compressors (like PKZIP, etc) remember all the strings that occur in the file, and assign shorter encodings to strings that have already occurred, and longer encodings to strings that have not yet been seen. This works even better since it takes into account more information about the context of every character encoded. On average, it will take fewer bits to write "boy" than "boe", because "boy" occurs more often, even though "e" is more common than "y".
Since it's all about predicting the characteristics of the files you actually use, it's a bit of a black art, and different kinds of compressors work better or worse on different kinds of data -- that's why there are so many different algorithms.
I have test.csv (300 lines) file as below
10 20 100 2 5 4 5 7 9 10 ....
55 600 7000 500 25
3 10
2 5 6
....
Each line has a different number of integers (the maximum number of values per line is 1000), and I need to process these records line by line. I tried the following:
integer,dimension(1000)::rec
integer::i,j
open(unit=5,file="test.csv",status="old",action="read")
do i=1,300
read(unit=5,fmt=*) (rec(j),j=1,1000)
!do some procedure with rec
enddo
close(unit=5)
But it seems that the rec array is not filled line by line: when i=n, rec gets numbers from lines other than the nth. How can I solve this problem?
thank you
List directed formatting (as specified by the star in the read statement) reads what it needs to satisfy the list (hence it is "list directed"). As shown, your code will try and read 1000 values each iteration, consuming as many records (lines) as required each iteration in order to do that.
(List directed formatting has a number of surprising features beyond that, which may have made sense with card based input forms 40 years ago, but are probably misplaced today. Before using list directed input you should understand exactly what the rules around it say.)
A realistic and robust approach to this sort of situation is to read in the input line by line, then "manually" process each line, tokenising that line and extracting values as per whatever rules you are following.
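To show the logic of "read a line, then tokenise it", here is a minimal sketch. It is in Python rather than Fortran, purely as an illustration, and it assumes whitespace-separated integers as in the sample shown in the question. The function names are hypothetical, and skipping blank lines is my own assumption.

```python
def parse_line(line):
    """Split one line on whitespace and convert each token to an integer."""
    return [int(token) for token in line.split()]


def read_records(lines):
    """Return one list of integers per non-blank input line."""
    records = []
    for line in lines:
        if line.strip():  # assumption: blank lines carry no record
            records.append(parse_line(line))
    return records
```

Because each record comes from exactly one line, a short line can never pull values from the lines after it. `lines` can be any iterable of strings, including an open file.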
(You should get in touch with whoever is naming files that have absolutely no commas or semicolons with an extension ".csv", and have a bit of a chat.)
(As a general rule, low value unit numbers are to be avoided. Due to historical reasons they may have been preconnected for other purposes. Reconnecting them to a different file is guaranteed to work for that specific purpose, but other statements in your program may implicitly be assuming that the low value unit is still connected as it was before the program started executing - for example, PRINT and READ statements intended to work with the console might start operating on your file instead.)
I have to write a simple program in C that prints to the standard output triangle with two equal edges for given number n. Meaning that for n=3 the output would be:
x
xx
xxx
Now I'm supposed to write two versions of this program:
1. Memory conservative.
2. Time conservative.
Now I'm not entirely sure, but I think that the first version would just print x one at a time, and the second would expand the char table one at a time and then print it.
But is printing a char* faster than printing multiple single chars?
You may not be able to observe it, but building the entire string in memory and then printing it at once is faster in principle. The reason is that you make fewer calls to the printf function. Each time you call a function, several things happen in the background, such as pushing the current function's variables and the current location onto the stack and popping them back after returning.
However, as I mentioned, you may not be able to observe this difference for smaller inputs, because the time needed for each of these operations is small unless you use a computer from the 1960s.
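The difference in the number of output calls can be illustrated with a small sketch. This is Python, not C, and `CountingWriter` is a made-up stand-in for the output stream that simply counts how often it is called; it measures call counts, not real timings.

```python
class CountingWriter:
    """Hypothetical output sink that records every write call."""

    def __init__(self):
        self.calls = 0
        self.chunks = []

    def write(self, text):
        self.calls += 1
        self.chunks.append(text)

    def getvalue(self):
        return "".join(self.chunks)


def triangle_char_by_char(n, out):
    """Memory-conservative version: one write per character."""
    for row in range(1, n + 1):
        for _ in range(row):
            out.write("x")
        out.write("\n")


def triangle_buffered(n, out):
    """Time-conservative version: build the whole output, then write once."""
    out.write("".join("x" * row + "\n" for row in range(1, n + 1)))
```

For n=3 both versions produce the same text, but the first makes 9 write calls and the second makes 1, at the cost of holding the whole triangle in memory.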
I need some good problems that students can think about and solve with their own logic, using control instructions only. The topics covered so far are basic; not even arrays have been covered yet. But I want students to be solid before proceeding to more advanced topics.
I tried searching for example problems, but none were what I expected, or they were ones I already knew.
Some of which I know:
Write a program to find out the value of a^b without using built in functions.
Write a program to find out Armstrong numbers between a range.
Write a program to print binary equivalent of a number in reverse order (since arrays are not yet done, just simple logic to print the remainder and divide the number further)
Count all -ve, +ve and 0 numbers entered by user until user wishes to terminate the program.
Write a program to display all divisors of a given number.
Write a program to find if the given number is prime or not.
Check if the given number is odd or even.
Need more good logically interesting problems which would help students to build their problem solving capability.
Thanks.
PS: Please forgive me if this question is vague or not to the point, because it has scope for a wide range of answers and I probably cannot accept a single one.
Check if number is a palindrome (1234554321)
Write a function that uses write() to print a number to the console (similar to printf("%d", ...))
A function that writes all combinations of 2 digits from 12 to 89, never using the same digit twice and never repeating a pair in a different order (12, 13, ..., 19, 23, 24... : skipping 21 because it is already covered by 12)
A function that writes all combinations of n digits (n given as a parameter from 1 to 9) with the same rules (without using arrays)
Print the first 33 terms of the Fibonacci series
Read n from the keyboard and print the factorial of n to the console.
Find hours, minutes and seconds from a given number of seconds (305 s = 5 min + 5 s ....)
Calculate dot-product and cross-product of two 2D vectors.
Find the intersection point of two lines (m=slope, (x0,y0)=a point for each line)
Calculate sin(pi/4) using a series expansion
Print the minimum of the values entered from the keyboard.
Simulate **and** , **or** and **xor** gates.
Find projection of a vector(3D) on another vector.
Find area of a polygon(2D)
Calculate the integral of x-square between x=0 and x=3 using integration by trapezoidal rule
Find roots of: (x-square) plus (two times x) plus (one) equals (zero)