Data structure for quick access - C

The following data is the input to an algorithm:
F1    P1    P2    P3    ...
2     4     5     2     ...
5     2     10    1     ...
1     4     15    0     ...
(The numbers under F1 are unique and nonzero. The numbers under P1, P2, P3 may repeat and may be zero.)
The algorithm selects some numbers from F1 and, for each one, gives positions 0, 1, 2, ... at which the numbers in that row have to be read (position 0 corresponds to P1, position 1 to P2, and so on).
Further processing is then done on the numbers read from the selected positions.
I have built a data structure in which all the numbers from F1 go into Part I of Portion A in sorted order, and Part II of Portion A points to an array holding the corresponding numbers from P1, P2, P3. When the algorithm selects an F1 number and a position, that position can be accessed quickly and the number retrieved from it.
An image of the data structure is at the following link:
https://www.dropbox.com/s/iz9nqfg8jy4iekn/DS.png
(In case the image is not accessible: the data structure is an array of structures with two members. Member 1 stores a number from F1, with the array kept sorted by that member; member 2 points to an array holding the P1, P2, P3 numbers that belong to that F1 number.)
The access time has to be reduced. That is why I load everything into memory and reach a position directly through its index in the horizontal array. A plain 2-D array would add overhead, because the horizontal arrays would have to be moved around to keep the F1 numbers in Part I sorted.
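A minimal sketch of the layout described above, written in Python for brevity rather than C. The names `SortedIndex` and `get` are hypothetical, and looking up the F1 number by binary search over the sorted keys is an assumption about how Part I is meant to be searched.

```python
from bisect import bisect_left


class SortedIndex:
    """Sorted F1 keys (Part I), each paired with its row of P values (Part II)."""

    def __init__(self, rows):
        # rows: iterable of (f1, [p1, p2, p3, ...]) pairs; F1 values are unique
        ordered = sorted(rows, key=lambda row: row[0])
        self.keys = [f1 for f1, _ in ordered]
        self.values = [list(ps) for _, ps in ordered]

    def get(self, f1, position):
        # Binary search for the F1 key, then direct indexing into its row
        i = bisect_left(self.keys, f1)
        if i == len(self.keys) or self.keys[i] != f1:
            raise KeyError(f1)
        return self.values[i][position]


index = SortedIndex([(2, [4, 5, 2]), (5, [2, 10, 1]), (1, [4, 15, 0])])
```

With the sample data, `index.get(5, 1)` reads position 1 (P2) of the row whose F1 number is 5. Sorting the (key, row) pairs together means only references to rows move, not the row contents.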
How can I improve this?

Related

How to generate a random pair of numbers in a loop without duplicate pairs

I want to make a program where, on an 8x8 2D array, I input an integer from 0 to 64 and it places that many 'Z' characters on the array randomly. I can't figure out how to generate pairs of random numbers in a loop without getting any duplicate pairs. Since each generated pair represents the position in the 2D array where a 'Z' will be placed, there must be no duplicate pairs, so that all the 'Z' characters are printed.
Since the problem is only of size 64, an efficient way of doing this is to:
1.) Create an array of 64 items. Each item corresponds to one position in the grid. For example, the item at index 0 corresponds to the row with index 0 and the column with index 0, and the item at index 63 corresponds to the row with index 7 and the column with index 7. In general, the formula that maps a grid location to an array location is: i = r * grid columns + c, where i is the index of the item in the 1D array, r is the row index in the grid and c is the column index in the grid.
2.) Place the letter 'Z' n times in the array, where n is the user's input. You can place them all, for example, in the first n places of the array of items.
3.) Shuffle the array randomly. For example, you can shuffle it by performing random permutations of indices (swaps).
4.) Place the 'Z' letters from the array into the grid using the inverse of the formula from step 1, i.e. r = i / grid columns (integer division) and c = i % grid columns.
You can probably also do this without the temporary array of step 1 at all, by translating each coordinate on the fly.
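The steps above can be sketched as follows. This is only an illustration in Python: the function name `place_z` and the use of '.' for empty cells are assumptions, and `random.shuffle` stands in for the manual swap loop.

```python
import random

ROWS, COLS = 8, 8


def place_z(n, rng=random):
    """Return an 8x8 grid (list of rows) holding exactly n 'Z' cells."""
    if not 0 <= n <= ROWS * COLS:
        raise ValueError("n must be between 0 and 64")
    # Steps 1 and 2: a flat array of 64 items with 'Z' in the first n places
    cells = ['Z'] * n + ['.'] * (ROWS * COLS - n)
    # Step 3: shuffle the flat array
    rng.shuffle(cells)
    # Step 4: flat index i maps to row i // COLS and column i % COLS
    return [cells[r * COLS:(r + 1) * COLS] for r in range(ROWS)]
```

Because every 'Z' occupies its own slot in the flat array before shuffling, no position can be chosen twice, so no duplicate check is needed.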
See also: Choosing random cards from a deck? for a related problem.

Maximize sum of weights with constraints given on left and right indices in array

I recently came across an interesting coding problem, which is as follows:
There are n boxes, let us assume this is an array of n boxes.
For each index i of this array, three values are given -
1.) Weight(i)
2.) Left(i)
3.) Right(i)
Left(i) means: if element i is chosen, we are not allowed to choose the Left(i) elements immediately to the left of element i.
Similarly, Right(i) means: if element i is chosen, we are not allowed to choose the Right(i) elements immediately to the right of it.
Example :
Weight[2] = 5
Left[2] = 1
Right[2] = 3
Then, if I pick element at position 2, I get weight of 5 units. But, I cannot pick elements at position {1} (due to left constraint). And cannot pick elements at position {3,4,5} (due to right constraint).
Objective - We need to calculate the maximum sum of the weights we can pick.
Sample test case:
**Input:**
5
2 0 3
4 0 0
3 2 0
7 2 1
9 2 0
**Output:**
13
Note: the first column holds the weights, the second column the left constraints, and the third column the right constraints.
I used a dynamic programming approach (similar to Longest Increasing Subsequence) to reach an O(n^2) solution, but I cannot think of an O(n*logn) solution. (n can be up to 10^5.)
I also tried a priority queue in which elements with a lower value of (right[i] + i) get higher priority (on ties, the element with the lower value of i gets higher priority). But that also gives a timeout error.
Is there any other approach for this, or any optimization of the priority queue method? I can post both of my codes if needed.
Thanks.
One approach is to use a binary indexed tree to build a data structure that supports two operations in O(log n) time each:
1.) Insert a number into an array
2.) Find the maximum in a given range
We will use this data structure to hold, for each box i, the maximum weight that can be achieved by selecting box i along with an optimal selection of boxes to its left.
The key is that we only insert a value into this data structure once we have moved past the boxes blocked by its right constraint.
To find the best value for box i, we need the maximum value in the data structure over all positions to the left of i-left[i], that is, positions 0..(i-left[i]-1), which can be found in O(log n). (Positions i-left[i] through i-1 are the ones blocked by the left constraint.)
The final algorithm is to loop over i=0..n-1 and for each i, in this order:
1.) Compute the result for box i as weight[i] plus the maximum in the range 0..(i-left[i]-1)
2.) Schedule the result to be added when we reach location i+right[i]
3.) Add any results scheduled for location i to the data structure
The final result is the largest result computed for any box. (Results scheduled for a location past the end of the array are never inserted, so track the maximum as the results are computed.)
Overall, the complexity is O(n log n), because each value of i results in one lookup and at most one update operation.
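A possible sketch of this algorithm in Python. The function name `max_weight` and the helper names are hypothetical, and weights are assumed to be non-negative, so an empty selection to the left counts as 0. The binary indexed tree stores prefix maxima, which is enough here because every query starts at position 0.

```python
def max_weight(weights, left, right):
    n = len(weights)
    tree = [0] * (n + 1)  # binary indexed tree over prefix maxima, 1-based

    def update(pos, value):
        # Record `value` at 0-based position `pos`
        pos += 1
        while pos <= n:
            if tree[pos] < value:
                tree[pos] = value
            pos += pos & -pos

    def query(count):
        # Maximum over 0-based positions 0..count-1 (0 if count == 0)
        best = 0
        while count > 0:
            if tree[count] > best:
                best = tree[count]
            count -= count & -count
        return best

    pending = [[] for _ in range(n)]  # results to insert once location k is reached
    answer = 0
    for i in range(n):
        # Best total that ends with box i; positions i-left[i]..i-1 are blocked
        best_i = weights[i] + query(max(0, i - left[i]))
        answer = max(answer, best_i)
        release = i + right[i]
        if release < n:
            pending[release].append((i, best_i))
        # Insert only after the query, so a result released at location i
        # becomes visible to boxes i+1 onwards
        for pos, value in pending[i]:
            update(pos, value)
    return answer
```

On the sample input, `max_weight([2, 4, 3, 7, 9], [0, 0, 2, 2, 2], [3, 0, 0, 1, 0])` picks the boxes with weights 4 and 9.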

Fill gaps of matrix that has irregular steps

I am trying to perform a calculation using two different matrices, but they have come in slightly different forms.
One matrix (for interest's sake) is filled with reflectance values of a material at wavelengths from 200nm to 2600nm in increments of 1nm, so each individual wavelength has a reflectance value.
The second matrix is a solar energy matrix which stores the amount of energy that is present at each wavelength. This one however has irregular steps and ranges from 280nm to 4000nm. But from 280nm-400nm it is in steps of 0.5nm, from 400nm-1705nm it is in steps of 1nm, and from 1750nm-4000nm it is in steps of 5nm.
What I have been trying to do, unsuccessfully thus far, is to edit this solar energy matrix so that it covers the entire range in steps of 1nm.
filename = 'H:\I_sol data.csv';
Dataisol = csvread(filename, 1, 0);
j = 1;
while j < size(Dataisol, 1)
    if Dataisol(j+1, 1) - Dataisol(j, 1) > 1   % compare the wavelength to the value of the next wavelength
        newx = Dataisol(j, :) + [1, 0];         % if the next wavelength is more than 1 larger, build a new row
        newx(1, 2) = NaN;                       % make the new row to add blank
        Dataisol = [Dataisol(1:j, :); newx; Dataisol(j+1:end, :)];  % insert the new blank row (plain concatenation in place of insertrows)
    end
    j = j + 1;
end
Above is what I have started with. At the moment I am just trying to fill the gaps by adding new rows where there is a 5nm jump between wavelengths. Once I am able to create the missing elements, I will turn my attention to populating them with the correct values (probably the midpoint between the 2 given values).
My end goal is to trim both matrices so that they have the same starting and ending wavelength and increments of 1nm throughout (also for interest's sake, or for advice if this is trivial for someone). If anyone knows how to fill these gaps or make the necessary changes to the matrix, it would be a great help!
Example of the csv file:
Wvlgth nm Etr W*m-2*nm-1
280.0 8.2000E-02
280.5 9.9000E-02
281.0 1.5000E-01
281.5 2.1200E-01
282.0 2.6700E-01
282.5 3.0300E-01
283.0 3.2500E-01
283.5 3.2300E-01
284.0 2.9900E-01
284.5 2.5024E-01
285.0 1.7589E-01
285.5 1.5500E-01
286.0 2.4200E-01
... .....
428.0 1.6510E+00
429.0 1.5230E+00
430.0 1.2120E+00
431.0 1.0990E+00
432.0 1.8220E+00
433.0 1.6913E+00
434.0 1.5600E+00
435.0 1.7090E+00
436.0 1.8680E+00
437.0 1.9000E+00
438.0 1.6630E+00
439.0 1.6010E+00
440.0 1.8300E+00
.... .....
2205.0 8.0900E-02
2210.0 8.0810E-02
2215.0 8.0410E-02
2220.0 7.9990E-02
2225.0 7.8840E-02
2230.0 7.8400E-02
2235.0 7.7930E-02
2240.0 7.6510E-02
2245.0 7.6250E-02
2250.0 7.5370E-02
... .....
Here is the code I use for assigning the variables to be used in the interp1 function, which is called as follows:
solx=Dataisol(:,1);
soly=Dataisol(:,2);
xi=280:1:2600;
newsol = [xi interp1(solx,soly,xi,'linear','extrap')];
The values that are stored in these variables as well as the error I am receiving are given below:
The function you need here is interp1. Set xi to be a column vector of all the wavelengths you want to consider, say xi=(280:1:2600)';. (xi has to be a column so that the concatenation below produces two columns rather than one long row.)
If wavelength is a vector of all the irregular values from your file, and sol is the corresponding vector of all the solar energies (you can use column references into your single matrix here as well), then
newsol = [xi interp1(wavelength,sol,xi,'linear','extrap')];
will give you a new matrix with wavelengths increasing by 1 in column 1, and column 2 will contain values directly from your file where they exist and linearly interpolated values where they do not.
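To show what the linear interpolation does on an irregular grid, here is a small standalone sketch in Python using only the standard library. It is not interp1 itself: the function name `resample_linear` is hypothetical, it assumes the wavelengths are strictly increasing, and it raises an error instead of extrapolating.

```python
from bisect import bisect_right


def resample_linear(xs, ys, start, stop, step=1.0):
    """Linearly interpolate the points (xs, ys) onto start, start+step, ..., stop."""
    out = []
    count = int(round((stop - start) / step)) + 1
    for k in range(count):
        x = start + k * step
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x lies outside the tabulated range")
        lo = bisect_right(xs, x) - 1  # last tabulated wavelength <= x
        if xs[lo] == x:
            y = ys[lo]  # value taken directly from the table
        else:
            t = (x - xs[lo]) / (xs[lo + 1] - xs[lo])
            y = ys[lo] + t * (ys[lo + 1] - ys[lo])
        out.append((x, y))
    return out
```

On the 0.5nm part of the table this simply keeps the whole-number wavelengths, and on the 5nm part it fills in the four missing wavelengths between each pair of rows.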

How to find the biggest subset of an array given some constraints?

There is an array A[1........N]. How do I find the largest subset of the array such that the product of any two distinct elements of the subset is not a perfect cube? The upper bound for N is 100000.
Example:
For A = 1 2 4 8, the answer will be {1, 2} or {1, 4} or {8, 2} or {8, 4}.
1 and 8 cannot appear together in the solution.
The same goes for 2 and 4.
My approaches:
1.) Check all the subsets of the given array and return the subset of maximum length which satisfies the constraint. It will take O(N*N*2^N).
2.) Create a graph out of the given array. Two nodes in the graph are connected if their product is a perfect cube. The main task is then to remove the minimum number of nodes such that there are no edges left in the graph (when we remove a node, all the edges associated with it disappear). Here the main issue is space (the representation of the graph). In the worst case the size of the graph will be O(N*N).
Please help.
Explanation
Consider the factorization of each number as follows:
A[i] = x^3.y^2.z
i.e. we first find the largest cube that divides A[i] (and call it x^3), then the largest square that divides what remains (and call it y^2), then call whatever is left over z.
The product of A[i] with another A[j]=X^3.Y^2.Z will be a cube if and only if Y=z and Z=y.
Therefore, if you consider groups of numbers with the same value of y^2.z, these groups form pairs (the group for y^2.z is linked with the group for z^2.y), and you cannot take elements from both groups of a linked pair.
Clearly the best choice is to take all the elements from whichever group in each pair is larger.
There is one special case, where y^2.z is equal to 1. In this case every number in the group is already a perfect cube, so it cannot be combined with another number from the same group. Therefore you can add just 1 number from the set of perfect cubes.
Example
Suppose our array was (expressed as a prime factorization):
A[0] = 2^3
A[1] = 3^3
A[2] = 2^2.3.5^3
A[3] = 2^2.3.7^3
A[4] = 2.3^2.13^3
We first assign these into groups:
Value 1 = Group A (2^3, 3^3)
Value 2^2.3 = Group B (2^2.3.5^3, 2^2.3.7^3)
Value 2.3^2 = Group C (2.3^2.13^3)
Group A is paired with itself, while group B is paired with group C.
Therefore we can take one element from group A, and the whole of group B, for a total of 3 elements in the final subset.
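The grouping idea can be sketched in Python as below. The function names are hypothetical, and the factorization uses plain trial division, which assumes the array values are small enough for that to be fast; for large values a sieve or a faster factorization would be needed.

```python
from collections import Counter


def cube_free_signature(n):
    """Return (y^2.z, z^2.y) for n: its cube-free part and the part that pairs with it."""
    sig, comp = 1, 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            e %= 3  # only the exponent modulo 3 matters
            if e:
                sig *= p ** e
                comp *= p ** (3 - e)
        p += 1
    if n > 1:  # one prime factor left over, with exponent 1
        sig *= n
        comp *= n * n
    return sig, comp


def largest_subset_size(values):
    counts = Counter()
    partner = {}
    for v in values:
        sig, comp = cube_free_signature(v)
        counts[sig] += 1
        partner[sig] = comp
    total = 0
    for sig, c in counts.items():
        if sig == 1:
            total += 1  # at most one perfect cube can be kept
            continue
        other = counts.get(partner[sig], 0)
        # Count each linked pair once, through its larger group (ties: smaller signature)
        if c > other or (c == other and sig < partner[sig]):
            total += c
    return total
```

For the question's example, `largest_subset_size([1, 2, 4, 8])` keeps one of the cubes {1, 8} and one of {2, 4}. For the five-element example above (8, 27, 1500, 4116, 39546) it keeps one cube plus both elements of group B.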
You can formulate it as a largest clique problem.
Create a graph with each number as a vertex and connect two vertices if their product is not a cube.
Now find the largest clique in the graph. See https://en.wikipedia.org/wiki/Clique_problem#Finding_maximum_cliques_in_arbitrary_graphs

Splitting an array into n parts and then joining them again forming a histogram

I am new to Matlab.
Let's say I have an array a = [1:1:1000]
I have to divide this into 50 parts: 1-20, 21-40, ..., 981-1000.
I am trying to do it this way:
E=1000X
a=[1:E]
n=50
d=E/n
b=[]
for i=0:n
b(i)=a[i:d]
end
But I am unable to get the result.
The second part I am working on depends on another result: if my answer is 3, the counter of the first split array should go up by 1; if the answer is 45, the counter of the 3rd split array should go up by 1, and so on. In the end I have to make a histogram of all the counters.
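You can do all of this with one function: histc. In your situation:
X = (1:1:1000)';
Edges = (1:20:1001)';
Count = histc(X, Edges);
Essentially, Count contains the number of elements in X that fall into the categories defined by Edges, where Edges is a monotonically increasing vector whose elements define the boundaries of sequential categories. Each category includes its left edge but not its right edge, and the final element of Count only counts values exactly equal to the last edge (here 1001, so it is 0); that is why the edges run to 1001 rather than stopping at 981. A more common example might be to construct X using a probability density, say, the uniform distribution, e.g.: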
X = 1000 * rand(1000, 1);
Play around with specifications for X and Edges and you should get the idea. If you want the actual histogram plot, look into the hist function.
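For comparison, the same binning can be written out by hand. This is a sketch in Python rather than MATLAB: the function name `bin_counts` is hypothetical, and it assumes integer values and 50 equal bins covering 1 to 1000.

```python
def bin_counts(values, low=1, high=1000, width=20):
    """Count how many values fall into each bin: low..low+width-1, and so on."""
    n_bins = (high - low + 1) // width
    counts = [0] * n_bins
    for v in values:
        if low <= v <= high:  # values outside the range are ignored
            counts[(v - low) // width] += 1
    return counts
```

Here a value of 3 increments the first counter and a value of 45 increments the third, so feeding in a list of results one by one gives the counters that the histogram is built from.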
As for the second part of your question, I'm not really sure what you're asking.
