Django: SQL UNION or concatenating querysets does not work properly when used with annotate

I have the two querysets below, books1 and books2. On both I use annotate to add an extra column with a constant value: 10 for books1 and 30 for books2. The problem is that after I concatenate the querysets, the combined set shows the annotated value 10 even for the rows from books2.
author1 = Author.objects.get(pk=1)
books1 = author1.books_set.annotate(sample_var=Value(10))
author2 = Author.objects.get(pk=2)
books2 = author2.books_set.annotate(sample_var=Value(30))
combine = books1 | books2
What I see:
name                 author_id  sample_var
name1 (from books1)  1          10
..                   1          10
..                   1          10
..                   1          10
name2 (from books2)  2          10 (instead of 30)
..                   2          10 (instead of 30)
..                   2          10 (instead of 30)
..                   2          10 (instead of 30)
The reason is the SQL used for the combined queryset, shown below (I get it when I access combine[0].name). It handles the WHERE part but not the annotation: it uses 10 AS "sample_var" for the whole combined queryset, which is why every row shows 10 in the table above.
SELECT "books_books"."name", "books_books"."author_id", 10 AS "sample_var"
FROM "books_books"
WHERE ("books_books"."author_id" = 1 OR "books_books"."author_id" = 2)
I think that for the union to work properly, the SQL should be:
SELECT "books_books"."name", "books_books"."author_id", 10 AS "sample_var"
FROM "books_books"
WHERE ("books_books"."author_id" = 1)
union all
SELECT "books_books"."name", "books_books"."author_id", 30 AS "sample_var"
FROM "books_books"
WHERE ("books_books"."author_id" = 2)

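Since Django 1.11 you can use QuerySet.union(), which generates a real SQL UNION instead of merging the two filters into a single query. In your case: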
all_books = books1.union(books2, all=True)
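
The difference between the two SQL statements in the question can be reproduced outside Django. The following is a minimal sketch using Python's built-in sqlite3 module; the table contents are made-up sample rows, and the queries are simplified versions of the ones shown above, not the exact SQL Django emits.

```python
import sqlite3

# In-memory database with a few hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books_books (name TEXT, author_id INTEGER)")
conn.executemany(
    "INSERT INTO books_books VALUES (?, ?)",
    [("name1", 1), ("name1b", 1), ("name2", 2), ("name2b", 2)],
)

# Shape of the query behind `books1 | books2`: a single SELECT whose
# WHERE clauses are OR-ed, with only the first constant kept.
or_rows = conn.execute(
    "SELECT name, author_id, 10 AS sample_var FROM books_books "
    "WHERE (author_id = 1 OR author_id = 2) "
    "ORDER BY author_id, name"
).fetchall()

# Shape of the query the question asks for: two SELECTs, each with its
# own constant, glued together with UNION ALL.
union_rows = conn.execute(
    "SELECT name, author_id, 10 AS sample_var FROM books_books "
    "WHERE author_id = 1 "
    "UNION ALL "
    "SELECT name, author_id, 30 AS sample_var FROM books_books "
    "WHERE author_id = 2 "
    "ORDER BY author_id, name"
).fetchall()

print(or_rows)
print(union_rows)
```

In the first result every row carries 10; in the second, the rows for author 2 carry 30.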

Related

How can I use a loop to create lag variables?

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(date date2)
18257 16112
18206 16208
17996 16476
18197 17355
18170 17204
end
format %d date
format %d date2
I'm trying to write a loop in Stata that generates four variables (lags at 0 months, 3 months, 12 months, and 18 months). I tried the code below and got the error "invalid syntax":
foreach x inlist (0,3,12,18) & foreach y inlist (0,90,360,540){
gen var`x' = (date > date2 + `y')
}
Here is how I can create these variables successfully without a loop. It would be much nicer if this could be simplified with a loop.
gen var0=(date>date2)
gen var3=(date>date2+90)
gen var12=(date>date2+360)
gen var18=(date>date2+540)
Good news: you need just one loop over 4 possibilities, because 0 3 12 18 and 0 90 360 540 are paired (each offset in days is 30 times the number of months).
foreach x in 0 3 12 18 {
gen var`x' = date > (date2 + 30 * `x')
}
foreach requires either in or of after the macro name, so your code fails at that point. There is also no construct foreach ... & foreach ...; perhaps you are using syntax from elsewhere or just guessing.

SQL exclude rows that contain anything other than desired item

How can I select rows for groups that contain only the desired item? If a group contains the desired item together with other items, it should be excluded.
For example, with this sample data:
Primarykey  food_code  recipes
1           22         only_rice_5874136489
2           22         only_rice_9549618454
3           33         only_rice_5874136489
4           33         only_peanut_8889548456
5           44         only_pepper_7777777715
food_code = 22 contains only recipes beginning with only_rice, which is what I want. food_code = 33 contains both rice and peanut, so it should not be selected; food_code = 44 should not be selected either, because it contains no rice.
Expected output:
Primarykey  food_code  recipes
1           22         only_rice_5874136489
2           22         only_rice_9549618454
The challenge is that I have millions of rows. They all follow the same string pattern and differ only in the trailing numbers, so writing down every item to exclude (e.g. only_peanut..., only_pepper...) is not a good solution.
Just check with NOT EXISTS that the same food_code has no other item that is not only_rice:
SELECT *
FROM recipes r
WHERE r.recipes LIKE 'only_rice%'
AND NOT EXISTS
(
SELECT *
FROM recipes x
WHERE x.food_code = r.food_code
AND x.recipes NOT LIKE 'only_rice%'
)
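
To check the query against the sample data, here is a small sketch using Python's built-in sqlite3 module. The table name recipes and the lowercase column name primarykey are assumptions taken from the query and the sample table; the real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recipes (primarykey INTEGER, food_code INTEGER, recipes TEXT)"
)
conn.executemany(
    "INSERT INTO recipes VALUES (?, ?, ?)",
    [
        (1, 22, "only_rice_5874136489"),
        (2, 22, "only_rice_9549618454"),
        (3, 33, "only_rice_5874136489"),
        (4, 33, "only_peanut_8889548456"),
        (5, 44, "only_pepper_7777777715"),
    ],
)

# Keep a rice row only if its food_code has no row that is not rice.
rows = conn.execute(
    """
    SELECT *
    FROM recipes r
    WHERE r.recipes LIKE 'only_rice%'
      AND NOT EXISTS (
          SELECT *
          FROM recipes x
          WHERE x.food_code = r.food_code
            AND x.recipes NOT LIKE 'only_rice%'
      )
    ORDER BY r.primarykey
    """
).fetchall()

print(rows)
```

Only the two rows with food_code 22 come back: food_code 33 is rejected by the NOT EXISTS subquery because of its peanut row, and food_code 44 never passes the first LIKE.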

Get value to max date

I have a problem in QlikView.
I would like to get the value associated with the max date when I select an ID.
For example, if I have the following data:
id value date
1 2 1
1 4 2
1 6 3
1 5 4
When I select ID=1, I would like to get the value associated with the max date (4), which is 5.
Thank you!
You can use set analysis to achieve this:
= Sum( {< date = {"$(=Max(date))"} >} value )
You can read more about set analysis on the Qlik Help website.

summing & matching cell arrays of different sizes

I have a 4016 x 4 cell array called 'totalSalesCell'. The first two columns contain text; the remaining two are numeric.
1st field CompanyName
2nd field UniqueID
3rd field NumberItems
4th field TotalValue
In my code I have a loop that goes over the last month in weekly steps, i.e. 4 iterations.
On each iteration my code returns a cell array with the same structure as totalSalesCell, called weeklySalesCell, which generally has a different number of rows from totalSalesCell.
There are two things I need to do. First, if weeklySalesCell contains a company that is not in totalSalesCell, it needs to be added to totalSalesCell, which I believe the code below will do for me.
co_list = unique([totalSalesCell(:, 1); weeklySalesCell (:, 1)]);
index = ismember(co_list, totalSalesCell(:, 1));
new_co = co_list(index==0, :);
totalSalesCell = [totalSalesCell; new_co];
The second thing I need to do, and I am unsure of the best way of going about it, is to add the numeric fields of weeklySalesCell to totalSalesCell. As mentioned, the two cell arrays will have different numbers of rows about 90% of the time, so I cannot apply a simple addition. Below is an example of what I wish to achieve.
totalSalesCell        weeklySalesCell       Result
co_id  sales_value    co_id  sales_value    co_id  sales_value
23DFG  5              DGH84  3              23DFG  5
DGH84  6              ABC33  1              DGH84  9
12345  7              PLM78  4              ABC33  1
PLM78  4              12345  3              12345  10
KLH11  11                                   PLM78  8
                                            KLH11  11
I believe the following code should take care of both of your tasks (shown for the two-column example above, with the ID in column 1 and the value in column 2) -
[x1,x2] = ismember(totalSalesCell(:,1),weeklySalesCell(:,1))
corr_c2 = nonzeros(x1.*x2)
newval = cell2mat(totalSalesCell(x1,2)) + cell2mat(weeklySalesCell(corr_c2,2))
totalSalesCell(x1,2) = num2cell(newval)
excl_c2 = ~ismember(weeklySalesCell(:,1),totalSalesCell(:,1))
out = vertcat(totalSalesCell,weeklySalesCell(excl_c2,:)) %// desired output
Output -
out =
'23DFG' [ 5]
'DGH84' [ 9]
'12345' [10]
'PLM78' [ 8]
'KLH11' [11]
'ABC33' [ 1]
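
For comparison, the same add-or-append logic can be sketched in Python with plain dictionaries. This is not a translation of the ismember approach; it is only an illustration of the intended result on the two-column example, and the names total_sales, weekly_sales and merge_sales are hypothetical.

```python
# Two-column example from the question: company ID -> sales value.
total_sales = {"23DFG": 5, "DGH84": 6, "12345": 7, "PLM78": 4, "KLH11": 11}
weekly_sales = {"DGH84": 3, "ABC33": 1, "PLM78": 4, "12345": 3}


def merge_sales(total, weekly):
    """Add weekly values to matching IDs and append IDs that are new."""
    merged = dict(total)  # copy, so the input is left untouched
    for co_id, value in weekly.items():
        # Existing IDs are summed; unknown IDs start from 0 and are appended.
        merged[co_id] = merged.get(co_id, 0) + value
    return merged


result = merge_sales(total_sales, weekly_sales)
print(result)
```

Because dictionaries keep insertion order, existing companies stay in place and new ones (here ABC33) end up at the bottom, matching the order of the output above.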

Changing indices and order in arrays

I have a struct mpc with the following structure:
             num  type  col3  col4  ...
mpc.bus =      1     2   ...   ...
               2     2   ...   ...
               3     1   ...   ...
               4     3   ...   ...
               5     1   ...   ...
              10     2   ...   ...
              99     1   ...   ...
              to  from  col3  col4  ...
mpc.branch =   1     2   ...   ...
               1     3   ...   ...
               2     4   ...   ...
              10     5   ...   ...
              10    99   ...   ...
What I need to do is:
1: Re-order the rows of mpc.bus so that all rows of type 1 come first, followed by type 2 and finally type 3. There is only one element of type 3, and there are no other types (4, 5, etc.).
2: Make the numbering (column 1 of mpc.bus) consecutive, starting at 1.
3: Change the numbers in the to/from columns of mpc.branch to correspond to the new numbering in mpc.bus.
4: After running simulations, reverse the steps above to end up with the same order and numbering as at the start.
It is easy to update mpc.bus using find.
type_1 = find(mpc.bus(:,2) == 1);
type_2 = find(mpc.bus(:,2) == 2);
type_3 = find(mpc.bus(:,2) == 3);
mpc.bus(:,:) = mpc.bus([type_1; type_2; type_3],:);
mpc.bus(:,1) = 1:nb % Where nb is the number of rows of mpc.bus
The numbers in the to/from columns of mpc.branch correspond to the numbers in column 1 of mpc.bus.
It's OK to update the numbers in the to/from columns of mpc.branch as well.
However, I'm not able to find a non-messy way of retracing my steps. Can I update the numbering using some simple commands?
For the record: I have deliberately not included my code for re-numbering mpc.branch, since I'm sure someone has a smarter, simpler solution (one that will make it easier to undo once the simulations are finished).
Edit: It might be easier to create normal arrays (to avoid working with structs):
bus = mpc.bus;
branch = mpc.branch;
Edit #2: The order of things:
1: Re-order and re-number.
2: Columns (3:end) of bus and branch are changed. (Not part of this question)
3: Restore original order and indices.
Thanks!
I'm proposing this solution. It generates an n x 2 matrix PAIRS, where n is the number of rows in mpc.bus, mapping each new number to its original number, and it uses a temporary copy of mpc.branch:
function [mpc_1, mpc_2, mpc_3] = minimal_example
mpc.bus = [ 1 2;...
2 2;...
3 1;...
4 3;...
5 1;...
10 2;...
99 1];
mpc.branch = [ 1 2;...
1 3;...
2 4;...
10 5;...
10 99];
mpc.bus = sortrows(mpc.bus,2);
mpc_1 = mpc;
mpc_tmp = mpc.branch;
for I=1:size(mpc.bus,1)
PAIRS(I,1) = I;
PAIRS(I,2) = mpc.bus(I,1);
mpc.branch(mpc_tmp(:,1:2)==mpc.bus(I,1)) = I;
mpc.bus(I,1) = I;
end
mpc_2 = mpc;
% (a) the following mpc_tmp is only needed if you want to truly reverse the operation
mpc_tmp = mpc.branch;
%
% do some stuff
%
for I=1:size(mpc.bus,1)
% (b) you can decide not to use the following line, then comment the line below (a)
mpc.branch(mpc_tmp(:,1:2)==mpc.bus(I,1)) = PAIRS(I,2);
mpc.bus(I,1) = PAIRS(I,2);
end
% uncomment the following line, if you commented (a) and (b) above:
% mpc.branch = mpc_tmp;
mpc.bus = sortrows(mpc.bus,1);
mpc_3 = mpc;
The minimal example above can be executed as is. The three outputs (mpc_1, mpc_2 and mpc_3) are only there to demonstrate how the code works and are otherwise not necessary.
1.) mpc.bus is ordered using sortrows, which simplifies the approach and avoids calling find three times. It sorts on the second column of mpc.bus and reorders the rest of the matrix accordingly.
2.) The original contents of mpc.branch are stored.
3.) A loop replaces the entries in the first column of mpc.bus with ascending numbers while replacing them correspondingly in mpc.branch. The comparison is made against mpc_tmp to ensure a correct replacement: otherwise an entry that has already been renumbered could be matched and replaced again.
4.) Afterwards, mpc.branch can be reverted analogously to (3.). One might argue that if the original mpc.branch was stored earlier on, one could just copy the matrix back. The original values of mpc.bus are re-assigned as well.
5.) Finally, sortrows is applied to mpc.bus again, this time with the first column as reference, to restore the original format.
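
The core idea of steps (3.) to (5.), keeping a lookup from old to new numbers and inverting it afterwards, can be sketched in Python as follows. This is an illustration under assumptions rather than a port of the MATLAB code: the function names renumber and restore are hypothetical, and, like the sortrows call in step (5.), the restore step assumes the original bus rows were ordered by ascending bus number.

```python
bus = [[1, 2], [2, 2], [3, 1], [4, 3], [5, 1], [10, 2], [99, 1]]
branch = [[1, 2], [1, 3], [2, 4], [10, 5], [10, 99]]


def renumber(bus, branch):
    """Sort buses by type and number them 1..n; remap branch ends."""
    ordered = sorted(bus, key=lambda row: row[1])  # stable sort by type
    old_to_new = {row[0]: i for i, row in enumerate(ordered, start=1)}
    new_bus = [[old_to_new[row[0]]] + row[1:] for row in ordered]
    new_branch = [
        [old_to_new[row[0]], old_to_new[row[1]]] + row[2:] for row in branch
    ]
    return new_bus, new_branch, old_to_new


def restore(new_bus, new_branch, old_to_new):
    """Invert the lookup, then sort buses back by their original number."""
    new_to_old = {new: old for old, new in old_to_new.items()}
    old_bus = sorted(
        ([new_to_old[row[0]]] + row[1:] for row in new_bus),
        key=lambda row: row[0],
    )
    old_branch = [
        [new_to_old[row[0]], new_to_old[row[1]]] + row[2:] for row in new_branch
    ]
    return old_bus, old_branch


new_bus, new_branch, mapping = renumber(bus, branch)
# ... simulations would modify columns 3:end here ...
restored_bus, restored_branch = restore(new_bus, new_branch, mapping)
print(new_bus)
print(new_branch)
```

Because every branch entry is looked up in the dictionary exactly once, there is no risk of replacing an already renumbered value, so no temporary copy of branch is needed.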
