I am trying to generate permutations of strings of length up to 50. That means up to 50! strings (50 factorial, approximately 3.041 * 10^64). I have the following parallel algorithm in mind, considering it's not possible to do this with a single virtual machine or process. I plan to hive off computations to a number of virtual machines:
Say input string is "abcdef"
1) Have a function that removes a request from a common global pool and processes it. A request object consists of two variables: ancestorSequence, remainingCollection
2) Once function is invoked, do the following:
2a) if remainingCollection has two characters then print this list:
[ancestorSequence+remainingCollection[0]+remainingCollection[1], ancestorSequence+remainingCollection[1]+remainingCollection[0]]
otherwise do the following
2b) Create the following requests and add them to the common global pool:
2b1) (ancestorSequence+remainingCollection[0],remainingCollection without remainingCollection[0])
2b2) (ancestorSequence+remainingCollection[1],remainingCollection without remainingCollection[1])
2b3) (ancestorSequence+remainingCollection[2],remainingCollection without remainingCollection[2])
and so on until
2bn) (ancestorSequence+remainingCollection[n-1], remainingCollection without remainingCollection[n-1])
We can always optimize by having each function accept multiple such requests. Please let me know your thoughts on this.
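For concreteness, here is a minimal single-process sketch in C of the expansion step I have in mind. The in-memory array stands in for the common global pool, the struct and buffer sizes are only illustrative, and nothing is distributed across virtual machines yet:

#include <stdio.h>
#include <string.h>

/* One request from the pool: the prefix built so far plus the
   characters that still need to be arranged. */
typedef struct {
    char ancestor[64];   /* ancestorSequence */
    char remaining[64];  /* remainingCollection */
} request;

static request pool[1024];   /* toy stand-in for the common global pool */
static int pool_len = 0;

/* Expand a single request exactly as in steps 2a/2b. */
static void process(request r)
{
    size_t n = strlen(r.remaining);
    size_t alen = strlen(r.ancestor);

    if (n == 2) {                              /* step 2a: emit both orderings */
        printf("%s%c%c\n", r.ancestor, r.remaining[0], r.remaining[1]);
        printf("%s%c%c\n", r.ancestor, r.remaining[1], r.remaining[0]);
        return;
    }

    for (size_t i = 0; i < n; i++) {           /* step 2b: one child request per character */
        request child;
        memcpy(child.ancestor, r.ancestor, alen);
        child.ancestor[alen] = r.remaining[i];
        child.ancestor[alen + 1] = '\0';

        /* remainingCollection without remainingCollection[i] */
        memcpy(child.remaining, r.remaining, i);
        memcpy(child.remaining + i, r.remaining + i + 1, n - i - 1);
        child.remaining[n - 1] = '\0';

        pool[pool_len++] = child;              /* no overflow check in this sketch */
    }
}

int main(void)
{
    strcpy(pool[0].ancestor, "");
    strcpy(pool[0].remaining, "abcdef");
    pool_len = 1;

    while (pool_len > 0)
        process(pool[--pool_len]);             /* pop a request and process it */

    return 0;
}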
Context:
Erlang programs running on heterogeneous nodes, retrieving and storing data
from Mnesia databases. These database entries are meant to be used for a long
time (e.g. across multiple Erlang version releases) and remain in the form of
Erlang objects (i.e. no serialization). Among the information stored, there are
currently two uses for arrays:
Large (up to 16384 elements) arrays. Fast access to an element
using its index was the basis for choosing this type of collection.
Once the array has been created, the elements are never modified.
Small (up to 64 elements) arrays. Accesses are mostly done using indices, but there are also some iterations (foldl/foldr). Both reading and replacement of the elements are done frequently. The size of the collection remains constant.
Problem:
Erlang's documentation on arrays states that "The representation is not
documented and is subject to change without notice." Clearly, arrays should not be used in my context: database entries containing arrays may be
interpreted differently depending on the node executing the program and
unannounced changes to how arrays are implemented would make them unusable.
I have noticed that Erlang features "ordsets"/"orddict" to address a similar
issue with "sets"/"dict", and am thus looking for the "array" equivalent. Do you know of any? If none exists, my strategy is likely going to be using lists of lists to replace my large arrays, and orddict (with the index as key) to replace the smaller ones. Is there a better solution?
An array is a tuple of nested tuples and integers, with each tuple being a fixed size of 10 and representing a segment of cells. Where a segment is not currently used, an integer (10) acts as a placeholder. This, without the abstraction, is I suppose the closest equivalent. You could indeed copy the array module from OTP into your own app, and thus it would be a stable representation.
As for what to use instead of array, it depends on the data and what you will do with it. If the data that would go in your array is fixed, then a tuple makes sense; it has constant access time for reads/lookups. Otherwise a list sounds like a winner, be it a list of lists, a list of tuples, etc. However, once again, that's a shot in the dark, because I don't know your data or how you use it.
See the implementation here: https://github.com/erlang/otp/blob/master/lib/stdlib/src/array.erl
Also see Robert Virding's answer on the implementation of array here: Arrays implementation in erlang
And what Fred Hebert says about the array in A Short Visit to Common Data Structures
An example showing the structure of an array:
1> A1 = array:new(30).
{array,30,0,undefined,100}
2> A2 = array:set(0, true, A1).
{array,30,0,undefined,
{{true,undefined,undefined,undefined,undefined,undefined,
undefined,undefined,undefined,undefined},
10,10,10,10,10,10,10,10,10,10}}
3> A3 = array:set(19, true, A2).
{array,30,0,undefined,
{{true,undefined,undefined,undefined,undefined,undefined,
undefined,undefined,undefined,undefined},
{undefined,undefined,undefined,undefined,undefined,
undefined,undefined,undefined,undefined,true},
10,10,10,10,10,10,10,10,10}}
4>
My idea is to create a 2-stage pipeline with a global time clock (100 cycles). The two stages are represented as two functions and work as follows: the first one generates a 2D random matrix; once generated, this data is moved to stage two, and stage 1 immediately starts generating new data. Meanwhile, stage 2 sums the 2D matrix and produces an output. This movement/computation is repeated for the 100 cycles.
Should I use a local iteration of 100 cycles for each function? I don't want to use a fork/pipe option.
I have used the following but it generates sequentially:
for (i = 0; i < 100; i++) {
    stage_one();
    stage_two();
}
The other option is to run the loop locally for each stage and use a data queue to move data between the functions.
Can someone point me to a resource where I can read about and test how to do this? Thank you very much for your help.
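To make the second option concrete, this is roughly what I am imagining with POSIX threads and a one-slot mailbox between the stages (the matrix size, names and printing are placeholders; I am not sure this is the right way to do it):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define N 4          /* matrix dimension, small for the example */
#define CYCLES 100   /* the global clock: number of pipeline iterations */

/* One-slot mailbox between stage 1 and stage 2. */
static double slot[N][N];
static int slot_full = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* Stage 1: generate a random matrix and hand it to stage 2. */
static void *stage_one(void *arg)
{
    for (int c = 0; c < CYCLES; c++) {
        double m[N][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                m[i][j] = (double)rand() / RAND_MAX;

        pthread_mutex_lock(&lock);
        while (slot_full)                 /* wait until stage 2 took the last matrix */
            pthread_cond_wait(&cond, &lock);
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                slot[i][j] = m[i][j];
        slot_full = 1;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Stage 2: take a matrix out of the mailbox, then sum it outside the lock. */
static void *stage_two(void *arg)
{
    for (int c = 0; c < CYCLES; c++) {
        double m[N][N], sum = 0.0;

        pthread_mutex_lock(&lock);
        while (!slot_full)                /* wait for stage 1 to deliver a matrix */
            pthread_cond_wait(&cond, &lock);
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                m[i][j] = slot[i][j];
        slot_full = 0;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);

        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += m[i][j];
        printf("cycle %d: sum = %f\n", c, sum);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, stage_one, NULL);
    pthread_create(&t2, NULL, stage_two, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

This way stage 1 can already generate the matrix for cycle c+1 while stage 2 is still summing the matrix from cycle c, which is the overlap I am after.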
I am using JDBC to access a postgresql database through Matlab, and have gotten hung up when trying to insert an array of values that I would rather store as an array instead of individual values. The Matlab code that I'm using is as follows:
insertCommand = 'INSERT INTO neuron (classifier_id, threshold, weights, neuron_num) VALUES (?,?,?,?)';
statementObject = dbhandle.prepareStatement(insertCommand);
statementObject.setObject(1,1);
statementObject.setObject(2,output_thresholds(1));
statementObject.setArray(3,dbHandle.createArrayOf('"float8"',outputnodes(1,:)));
statementObject.setObject(4,1);
statementObject.execute;
close(statementObject);
Everything functions properly except for the line dealing with Arrays. The object outputnodes is a <5x23> double matrix, so I'm attempting to put the first <1x23> into my table.
I've tried several different combinations of names and quotes for the '"float8"' part of the createArrayOf call, but I always get this error:
??? Java exception occurred:
org.postgresql.util.PSQLException: Unable to find server array type for provided name "float8".
at org.postgresql.jdbc4.AbstractJdbc4Connection.createArrayOf(AbstractJdbc4Connection.java:82)
at org.postgresql.jdbc4.Jdbc4Connection.createArrayOf(Jdbc4Connection.java:19)
Error in ==> Databasetest at 22
statementObject.setArray(3,dbHandle.createArrayOf('"float8"',outputnodes(1,:)));
Performance of JDBC connector for arrays
I'd like to note that in case you have to export rather big volumes of data containing arrays, JDBC may not be the best choice. Firstly, its performance degrades due to the overhead of converting native Matlab arrays into org.postgresql.jdbc.PgArray objects. Secondly, this may lead to a shortage of Java heap memory (and simply increasing the Java heap size may not be a panacea). Both points can be seen in the following picture, illustrating the performance of the datainsert method from Matlab Database Toolbox (which works with PostgreSQL through a direct JDBC connection):
The blue graph displays the performance of the batchParamExec command from the PgMex library (see https://pgmex.alliedtesting.com/#batchparamexec for details). The endpoint of the red graph corresponds to the maximum data volume that datainsert can pass into the database without any error; a data volume greater than that maximum causes an "out of Java heap memory" problem (the Java heap size is specified at the top of the figure). For further details of the experiments, please see the following paper with full benchmarking results for data insertion.
Example reworked
As can be seen, PgMex, being based on libpq (the official C application programmer's interface to PostgreSQL), has greater performance and is able to process data volumes of more than 2 GB.
Using this library, your code can be rewritten as follows. We assume below that all the parameters marked by <> signs are properly filled in, that the table neuron already exists in the database with fields classifier_id of type int4, threshold of type float8, weights of type float8[] and neuron_num of type int4, and, finally, that the variables classifierIdVec, output_thresholds, outputnodes and neuronNumVec are already defined as numerical arrays of the sizes shown in the comments in the code below. If the types of the table fields are different, you need to adjust the last command of the code accordingly:
% Create the database connection
dbConn = com.allied.pgmex.pgmexec('connect',[...
'host=<yourhost> dbname=<yourdb> port=<yourport> '...
'user=<your_postgres_username> password=<your_postgres_password>']);
insertCommand = ['INSERT INTO neuron '...
'(classifier_id, threshold, weights, neuron_num) VALUES ($1,$2,$3,$4)'];
SData = struct();
SData.classifier_id = classifierIdVec(:); % [nTuples x 1]
SData.threshold = output_thresholds(:); % [nTuples x 1]
SData.weights = outputnodes; % [nTuples x nWeights]
SData.neuron_num = neuronNumVec; % [nTuples x 1]
com.allied.pgmex.pgmexec('batchParamExec',dbConn,insertCommand,...
'%int4 %float8 %float8[] %int4',SData);
It should be noted that outputnodes does not need to be cut along its rows into separate arrays, because the rows are all of the same length. If the arrays for different tuples have different sizes, it is necessary to pass them as a column cell array, with each cell containing its own array for the corresponding tuple.
EDIT: Currently PgMex has free academic licensing.
I was getting confused by the documentation, which all used double quotes; Matlab doesn't allow those, and using only single quotes actually resolved this. The correct line was:
statementObject.setArray(3,dbHandle.createArrayOf('float8',outputnodes(1,:)));
instead of
statementObject.setArray(3,dbHandle.createArrayOf('"float8"',outputnodes(1,:)));
I originally thought that the problem was that the alias I was using for double precision was incorrect, but as Craig pointed out in the comment above, this isn't the case.
I have sequential code for finding the maximum value in matrix columns. Because this matrix could be as large as 5000 x 5000, I am thinking of speeding it up with MPI. I don't know how to achieve this yet, but I have looked at the functions MPI_Scatter for distributing items from the columns (maybe block mapping) and MPI_Gather for getting the max values from all processes (in my case at most 3 processes) and then comparing them... Do you think this could reduce the computing time? If so, can someone give me a kick-off?
Is all you want to do to find the maximum entry in the matrix (or within a part of the matrix)?
If so, the easiest way for you is probably to split the matrix across the processes, search for the maximum value in the part each process gets assigned, and then combine the results using MPI_Allreduce, which can take a variable that holds a different value in each process and deliver its maximum to all processes.
No matter whether you are dealing with the whole matrix or just a column, this technique can always be applied. You just have to think about a good way of splitting the area across the processes.
Of course this will speed up your computation only from a certain matrix size upwards. I assume that if you are dealing with a 10 x 10 matrix and want to split it into 3 processes, the overhead for MPI is larger than the gain from parallelization. :)
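Here is a minimal sketch of that approach in C, assuming the matrix is already distributed by rows so that each process holds its own block (the values filled in below are just a stand-in for your real data). The same call with count set to the number of columns, applied to an array of per-column local maxima, would give you the column-wise maxima instead:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Assume the matrix is stored row-major and n is divisible by the
       number of processes; each process fills its own block here. */
    const int n = 5000;
    int rows_per_proc = n / size;
    double *block = malloc((size_t)rows_per_proc * n * sizeof *block);
    for (long i = 0; i < (long)rows_per_proc * n; i++)
        block[i] = rank * 1000.0 + i;      /* stand-in for the real data */

    /* Local maximum over this process's block. */
    double local_max = block[0];
    for (long i = 1; i < (long)rows_per_proc * n; i++)
        if (block[i] > local_max)
            local_max = block[i];

    /* Combine: every process ends up with the global maximum. */
    double global_max;
    MPI_Allreduce(&local_max, &global_max, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global maximum = %f\n", global_max);

    free(block);
    MPI_Finalize();
    return 0;
}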
I have to do a table lookup to translate from input A to output A'. I have a function with input A which should return A'. Using databases or flat files are not possible for certain reasons. I have to hardcode the lookup in the program itself.
What would be the most optimal (space-wise and time-wise, considered separately): using a hashmap with A as the key and A' as the value, or using switch-case statements in the function?
The table is a string to string lookup with a size of about 60 entries.
If speed is ultra ultra necessary, then I would consider perfect hashing. Otherwise I'd use an array/vector of string to string pairs, created statically in sort order and use binary search. I'd also write a small test program to check the speed and memory constraints were met.
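A small sketch of that sorted-array-plus-binary-search variant, with a few placeholder entries standing in for the ~60 real ones:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Statically initialised table, kept sorted by key so that it can be
   searched with bsearch(). The entries here are placeholders. */
struct entry { const char *key; const char *value; };

static const struct entry table[] = {
    { "alpha",   "A1" },
    { "bravo",   "B2" },
    { "charlie", "C3" },
    /* ... about 60 entries in total, in ascending key order ... */
};

static int cmp(const void *k, const void *e)
{
    return strcmp((const char *)k, ((const struct entry *)e)->key);
}

/* Returns the translated string, or NULL if the key is unknown. */
const char *lookup(const char *key)
{
    const struct entry *hit = bsearch(key, table,
                                      sizeof table / sizeof table[0],
                                      sizeof table[0], cmp);
    return hit ? hit->value : NULL;
}

int main(void)
{
    printf("%s\n", lookup("bravo"));   /* prints B2 */
    return 0;
}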
I believe that both the switch and the table look-up will be equivalent (although one should run some tests with the compiler being used). A modern C compiler will implement a big switch as a look-up table. The table look-up can be generated more easily with a macro or a scripting language.
For both solutions the input A must be an integer. If this is not the case, one solution would be to implement a huge if-else statement.
If you have strings, you can create two arrays - one for input and one for output (this will be inefficient if they aren't of the same size). Then you iterate over the contents of the input array to find a match; based on the index where you find it, you return the corresponding output string.
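Something like the following rough sketch, with placeholder strings; a linear scan is usually acceptable for ~60 entries:

#include <stdio.h>
#include <string.h>

/* Parallel arrays: in[i] translates to out[i]. Placeholder contents. */
static const char *in[]  = { "alpha", "bravo", "charlie" };
static const char *out[] = { "A1",    "B2",    "C3"      };

const char *translate(const char *key)
{
    for (size_t i = 0; i < sizeof in / sizeof in[0]; i++)
        if (strcmp(in[i], key) == 0)
            return out[i];          /* same index in the output array */
    return NULL;                    /* no match found */
}

int main(void)
{
    printf("%s\n", translate("charlie"));  /* prints C3 */
    return 0;
}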
Make a key that is fast to calculate, and hash
If the table is pretty static and unlikely to change in the future, you could have a look to see whether a few selected chars (at fixed indexes) in the "key" string give unique values (a value K). If so, insert the "value" strings into a hash table using the pre-calculated K value for each "key" string.
Although a hash method is fast, there is still the possibility of collision (two inputs generating the same hash value). A fast method depends on the data type of the input.
For integral types, the fastest table lookup method is an array. Use the incoming datum as an index into the array. One of the problems with this method is that the array must account for the entire spectrum of values for the fastest speed. Otherwise execution is slowed down by translating the original index into an index for the array (kind of like a hashing method).
For string input types, a nested look up may be the fastest. One example is to break up tables by length. The first array returns pointers to the table to search based on length, e.g. char * sub_table = First_Array[5] for a string of length 5. These can be configured for specialized input data.
Another method is to use a B-tree, which is a search tree of "pages". Behavior is similar to nested arrays.
If you let us know the input type, we can better answer your question.