How to find top 10 sum of each match?


Say I have several X and several Y,
and I do a match like this:
X -[ W ]-> Y
X and Y are related by W, and there can be several W between the same pair (X,Y).
I want the top ten X for each Y, ranked by sum(W.property).
If I return
return Y , sum(W.property) , X order by sum(W.property) desc Limit 10
I just get 10 rows in total, but I need 10 for every Y.
Is there a way to do that?

MATCH (X)-[W]->(Y)
WITH Y, sum(W.property) AS total, X
ORDER BY total DESC
WITH Y, collect({sum: total, X: X})[0..10] AS values
UNWIND values AS value
RETURN Y, value.sum, value.X
You can actually skip the UNWIND and just change that second WITH to a RETURN if you're OK with getting the top ten back as an array per Y. That would be a bit more efficient because you're not repeating the value of Y on every row. If you were going to do that, you could even replace the map with a two-element array like this:
collect([total, X])[0..10]
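If you want to check the logic outside the database, the same group, sum, sort and slice steps can be sketched in plain Python. This is only an illustration: the (x, y, value) triples and the function name top_n_per_y are made up here and are not part of the query above.

```python
from collections import defaultdict

def top_n_per_y(edges, n=10):
    """edges: iterable of (x, y, value) triples, one per W relationship."""
    totals = defaultdict(float)
    for x, y, value in edges:
        totals[(x, y)] += value          # sum(W.property) per (X, Y) pair

    per_y = defaultdict(list)
    for (x, y), total in totals.items():
        per_y[y].append((total, x))      # like collect([total, X]) per Y

    # Sort each Y's list by total, descending, and keep the first n entries.
    return {y: sorted(pairs, key=lambda p: p[0], reverse=True)[:n]
            for y, pairs in per_y.items()}

# Hypothetical sample data: two W between x1 and y1, one each elsewhere.
edges = [("x1", "y1", 2), ("x1", "y1", 3), ("x2", "y1", 4), ("x1", "y2", 1)]
result = top_n_per_y(edges, n=1)
```

With n=1 this keeps only the best X for each Y, which makes it easy to see that the limit is applied per Y and not to the whole result.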

Related

Creating mesh from 3-column table in matlab

I have a table with values extracted from a CSV file, and I want to make a contour plot from it.
Let's use this table as an example
tdata.x = [1;2;1;2];
tdata.y = [3;3;4;4];
tdata.z = randn(4,1);
tdata=struct2table(tdata);
>> tdata
tdata =
  4×3 table
    x    y       z
    _    _    _______
    1    3    0.53767
    2    3     1.8339
    1    4    -2.2588
    2    4    0.86217
I would like to pivot this so that I can use it for plotting a contour. In principle I want a 2x2 z matrix whose rows and columns are given by y and x respectively, something along these lines:
    x       1          2
y
3        0.53767    1.8339
4       -2.2588     0.86217
where the first row holds the x coordinates, the first column holds the y coordinates, and in between are the corresponding z-values. That is to say, the z-value corresponding to (x,y)=(1,4) is -2.2588.
Note that I am going to use this grid for other things down the road, so solutions involving interpolation are not valid. Also, the data is guaranteed to be given on a grid.

You can use unstack, i.e.
t = unstack( tdata, {'z'}, {'x'} );
Which will give you a table with one row per y value and one column of z values per x value.
Note that the new column names are all prefixed with x (x1 and x2 here) because you can't have a column name beginning with a number. You should be able to extract the x values back again for whatever operations you want from here; if they're always integers it won't be too hard.

Here's the approach I would use:
result = full(sparse(findgroups(tdata.y), findgroups(tdata.x), tdata.z));
Equivalently, you could use the third output of unique instead of findgroups, or accumarray instead of sparse:
[~, ~, ux] = unique(tdata.x);
[~, ~, uy] = unique(tdata.y);
result = accumarray([uy ux], tdata.z);
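For comparison, here is the same idea (map each distinct x and y to an index, then drop every z into its cell) as a rough plain-Python sketch. The function name pivot_grid is made up for this example. Unlike accumarray, which sums values that land in the same cell, this version simply overwrites, which is fine under the stated assumption that the data lies on a grid with one z per (x, y).

```python
def pivot_grid(xs, ys, zs):
    """Return (x_values, y_values, grid) with grid[row][col] = z
    at (x_values[col], y_values[row])."""
    x_values = sorted(set(xs))
    y_values = sorted(set(ys))
    col = {x: j for j, x in enumerate(x_values)}   # plays the role of findgroups(tdata.x)
    row = {y: i for i, y in enumerate(y_values)}   # plays the role of findgroups(tdata.y)

    # Cells with no data stay None (the sparse/accumarray versions would give 0).
    grid = [[None] * len(x_values) for _ in y_values]
    for x, y, z in zip(xs, ys, zs):
        grid[row[y]][col[x]] = z
    return x_values, y_values, grid

# The example table from the question.
x_values, y_values, grid = pivot_grid(
    [1, 2, 1, 2], [3, 3, 4, 4], [0.53767, 1.8339, -2.2588, 0.86217])
```

Returning x_values and y_values alongside the grid keeps the original coordinates, so there is no need to recover them from column names afterwards.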

recursive brute force implementation in 2d array

I am tackling a problem and have gotten stuck, so I decided to ask here. The problem is: given n teams in a World Cup group and their respective points, determine whether that set of points is possible or not. Each team plays every other team in the group once, hence each team plays (n-1) matches, for 1<=n<=5. In a match, a team gets 3 points for a win, 0 points for a loss, and 1 point for a tie. My idea for the solution is to use a 2D (n x n) array which acts like a scoreboard.
    A  B  C  D  E   // column
A   X  1  3  0  1   // r
B   1  X  0  1  0   // o
C   0  3  X  0  3   // w
D   3  1  3  X  1
E   1  3  0  1  X
So every column and every row represents one distinct team, in multiplication-table fashion (the team in column 1 (A) is the same as the team in row 1 (A), and so on). Note that the letters above and beside the array (A, B, ...) aren't part of it; they are just there for clarity. Every intersection between a row and a column represents a match, except the intersection of the row and column of the same team. E.g. column 1, row 2 means team A tied against team B, and column 2, row 1 means team B tied against team A.
My idea is to use a recursive brute-force algorithm to check every possibility. I have developed one; it works well enough in a 4-team setting, but not so well for 5. The algorithm starts from column 2, row 1, tries one of the 3 possibilities, then crawls to the cell below it and the cell to the right of it and repeats, up to the second-to-last column and the last row.
You may have noticed that the X diagonal acts like a mirror: when we change column 1, row 3 (A against C) to a win, we must change column 3, row 1 (C against A) to a loss at the same time. Here is part of my code:
/*
 * scoreBoard[][] array <- the array which I have described above
 * scores[] array       <- stores the given scores
 * x                    <- current column
 * y                    <- current row
 * n                    <- number of teams
 */
bool Solve(int x, int y, int scoreBoard[][5], int scores[], int n)
{
    bool con1, con2, con3;
    if ((x < y) && (y < n)) {
        scoreBoard[x][y] = 3; // win-lose - possibility 1
        scoreBoard[y][x] = 0;
        // crawl to the right and bottom side of the array
        con1 = (Solve(x + 1, y, scoreBoard, scores, n)) || (Solve(x, y + 1, scoreBoard, scores, n));
        scoreBoard[x][y] = 0; // lose-win - possibility 2
        scoreBoard[y][x] = 3;
        // crawl to the right and bottom side of the array
        con2 = (Solve(x + 1, y, scoreBoard, scores, n)) || (Solve(x, y + 1, scoreBoard, scores, n));
        scoreBoard[x][y] = 1; // tied - possibility 3
        scoreBoard[y][x] = 1;
        // crawl to the right and bottom side of the array
        con3 = (Solve(x + 1, y, scoreBoard, scores, n)) || (Solve(x, y + 1, scoreBoard, scores, n));
        return con1 || con2 || con3;
    } else {
        if ((x == y) && (y == n - 1))
            return CheckArr(scoreBoard, scores, n); // check whether the current array matches the given scores
        else
            return 0;
    }
}
I presume the problem is that this algorithm does not cover every possibility, because it gives the expected output for some 5-team inputs and not for others, but I haven't managed to work out how to fix it.
Thanks in advance for every suggestion and helpful link. I also welcome any other strategy. I hope this is clear enough.
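One way to be sure every combination is covered is to list each pair of teams exactly once and recurse over that list, rather than crawling right and down through the array. Below is a minimal Python sketch of that idea, not a fix of the posted function: the name is_possible is made up, and it tracks the points each team still has to earn instead of filling a scoreboard. With at most 5 teams there are at most 10 matches, so at most 3^10 = 59049 outcomes to try.

```python
from itertools import combinations

def is_possible(scores):
    """Return True if `scores` can be the final points of a round-robin group."""
    n = len(scores)
    matches = list(combinations(range(n), 2))   # every pair (i, j) with i < j, once
    remaining = list(scores)                    # points each team still has to earn

    def solve(k):
        if k == len(matches):                   # all matches decided
            return all(r == 0 for r in remaining)
        i, j = matches[k]
        for pi, pj in ((3, 0), (0, 3), (1, 1)): # i wins, j wins, tie
            if remaining[i] >= pi and remaining[j] >= pj:
                remaining[i] -= pi
                remaining[j] -= pj
                found = solve(k + 1)
                remaining[i] += pi              # undo before trying the next outcome
                remaining[j] += pj
                if found:
                    return True
        return False

    return solve(0)

# The scoreboard in the question gives A=5, B=2, C=6, D=8, E=5.
example = is_possible([5, 2, 6, 8, 5])
```

Because each match appears once in the list, the mirror update of the two cells is no longer needed, and skipping outcomes that would push a team past its target prunes most of the search.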

Recover matrix X using pointers from rows in X to rows in matrix Y without doing loops in MATLAB

In MATLAB, suppose I know how each row of my N-row matrix X corresponds to a row in matrix Y. Specifically, suppose X is 1000-by-3, Y is 40-by-3, v is 1000-by-1, and X(i,:) = Y(v(i,1),:) for every row i of X. Is there any efficient way to recover X using Y and v without making a loop over 1:1000?

The index can be a vector:
X = Y(v,:);
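To make it concrete what that one-liner does, here is the same gather written out in Python on small made-up data. MATLAB indices are 1-based, so this sketch subtracts 1 from each pointer; the values of Y and v below are only for illustration.

```python
Y = [[10, 11, 12],
     [20, 21, 22],
     [30, 31, 32]]
v = [3, 1, 3, 2]            # 1-based row pointers, as in MATLAB

# Each row of X is a copy of the row of Y that the matching entry of v points to.
X = [list(Y[i - 1]) for i in v]
```

Rows of Y may be picked more than once (row 3 here) or not at all, exactly as with Y(v,:).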

Grid calculation and export to CSV

I have two ranges, x in [xmin, xmax] and y in [ymin, ymax], and I would like to evaluate a function while stepping from the min to the max values of x and y. In other words, I want to apply a function to the Cartesian product of x and y. I would then like to save each combination as a row in a CSV file. I've been trying for some time with a do loop, but got stuck on how to build the list at the end. For instance:
for x: 1 thru 2 step 1 do
for y: 1 thru 2 step 1 do
print([x,y,find_root (exp(a*x) = y, a, 0, 1)])
This prints the values of x and y and the function value for every combination, but I struggle to save it and export it to a CSV, because I don't know how to create a list like [[1,1,function(1,1)],[1,2,function(1,2)],[2,1,function(2,1)],[2,2,function(2,2)]] that I could export with write_data.
Alternatively, I'd like to use:
xlist:makelist(x,x,1,2,1);
ylist:makelist(y,y,1,2,1);
create_list([x,y,x^y],x,xlist,y,ylist);
In this case I don't know how to include the function in create_list or how to use map.
How do I do the above or is there a better way?

About speeding up the construction of the 1 million item list, how about solving the equation just once and then substituting values of x and y? E.g.:
solve (exp(a*x) = y, a);
my_solution : rhs (first (%));
my_list : create_list ([x, y, ev (my_solution)], x, xlist, y, ylist);
Here ev evaluates my_solution with the current values of the variables it contains (namely x and y).
About writing a CSV file, try this:
write_data (my_list, "my_output_file", 'comma);
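As a cross-check of the approach (solve once, substitute over the grid, write rows), here is a rough Python equivalent. It assumes the closed form a = log(y)/x, which is the solution of exp(a*x) = y for a, and it writes to an in-memory buffer so the sketch has no side effects; the file name in the comment just mirrors the one above.

```python
import csv
import io
import math
from itertools import product

xlist = [1, 2]
ylist = [1, 2]

# Closed-form solution of exp(a*x) = y for a, evaluated on the Cartesian product.
# product() varies y fastest, the same order create_list uses above.
rows = [[x, y, math.log(y) / x] for x, y in product(xlist, ylist)]

# Write the rows as CSV. To write a real file instead, use
# open("my_output_file", "w", newline="") in place of the StringIO buffer.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
```

Each row holds x, y and the solved value of a, so the output has one line per (x, y) combination.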

What's a good way to store this relation so I can answer queries of this form efficiently?

Excuse me if I get a little mathy for a second:
I have two sets, X and Y, and a many-to-many relation ℜ ⊆ X×Y.
For all x ∈ X, let xℜ = { y | (x,y) ∈ ℜ } ⊆ Y, the subset of Y associated with x by ℜ.
For all y ∈ Y, let ℜy = { x | (x,y) ∈ ℜ } ⊆ X, the subset of X associated with y by ℜ.
Define a query as a set of subsets of Y, Q ⊆ ℘(Y).
Let the image of the query be the union of the subsets in Q: image(Q) = ⋃_{q∈Q} q
Say an element x of X satisfies a query Q if for all q ∈ Q, q ∩ xℜ ≠ ∅, that is, if every subset in Q overlaps with the subset of Y associated with x.
Define the evidence of satisfaction of a query Q by an element x as: evidence(x,Q) = xℜ ∩ image(Q)
That is, the parts of Y that are associated with x and were used to match some part of Q. This could be used to verify whether x satisfies Q.
My question is how should I store my relation ℜ so that I can efficiently report which x∈X satisfy queries, and preferably report evidence of satisfaction?
The relation isn't overly huge; as CSV it's only about 6GB. I've got a couple of ideas, neither of which I'm particularly happy with:
1. I could store { (x, xℜ) | ∀ x∈X } just in a flat file, then do O(|X||Q||Y|) work checking each x to see if it satisfies the query. This could be parallelized, but feels wrong.
2. I could store ℜ in a DB table indexed on Y, retrieve { (y, ℜy) | ∀ y∈image(Q) }, then invert it to get { (x, evidence(x,Q)) | ∀ x s.t. evidence(x,Q) ≠ ∅ }, then check just that to find the x that satisfy Q and the evidence. This seems a little better, but I feel like inverting it myself might be doing something I could ask my RDBMS to do.
How could I be doing this better?

I think #2 is the way to go. Also, if Q can be represented in CNF you can use several queries plus INTERSECT to get the RDBMS to do some of the heavy lifting. (Similarly with DNF and UNION.)
This also looks a bit like you want an "inverted index", which some RDBMSs have support for. For example: X = set of documents, Y = set of words, q = set of words matching the glob "a*c".
HTH
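To illustrate what #2 amounts to, here is a small in-memory sketch of an inverted index in Python. It is only a model of the idea, not a replacement for the database: the names build_index and answer are made up, the index maps each y to ℜy, and a query is a list of sets of y.

```python
from collections import defaultdict

def build_index(relation):
    """relation: iterable of (x, y) pairs. Returns {y: set of x}, i.e. y -> ℜy."""
    index = defaultdict(set)
    for x, y in relation:
        index[y].add(x)
    return index

def answer(index, query):
    """query: list of sets of y. Returns {x: evidence(x, Q)} for every x satisfying Q."""
    evidence = defaultdict(set)      # x -> xℜ ∩ image(Q), built by inverting the index rows
    satisfying = None
    for q in query:
        hits = set()                 # every x associated with at least one y in q
        for y in q:
            for x in index.get(y, ()):
                hits.add(x)
                evidence[x].add(y)
        satisfying = hits if satisfying is None else satisfying & hits
    if satisfying is None:
        # An empty Q is vacuously satisfied by every x; this sketch just returns {}.
        return {}
    return {x: evidence[x] for x in satisfying}

# Hypothetical data: "a" relates to 1 and 2, "b" to 1, "c" to 3.
index = build_index([("a", 1), ("a", 2), ("b", 1), ("c", 3)])
result = answer(index, [{1}, {2, 3}])
```

The per-q union followed by an intersection across Q is the CNF shape mentioned above: each q is an OR over its elements, and Q is the AND of those.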