Conditionally Sum a googlesheet column based on entries in related tables - arrays

Say I have two related sheets/tabs within a google sheet. One sheet/tab is titled "Categories", the other is "Measures".
Categories:
userid  catcode
----------------
1       a
1       b
2       a
3       c
Measures:
userid  catcode  points
------------------------
1       a        5
1       b        5
1       c        3
2       a        4
3       c        3
For each user I'd like to be able to sum the points from the Measures table where the catcode is present for the user in the categories table. Ideally using an auto-extending/filling formula (like an arrayformula or query).
I have some idea how I'd approach this with SQL statements (joining the related tables, or doing a select where exists), but I'm new to googlesheets and would appreciate some direction here. I've experimented with this a bit and assuming a third table named "Users" with userids in column A, I can add this formula:
=sum(filter(measure!C2:C4, measure!A2:A4=users!A2, not(iserror(vlookup(measure!B2:B4, unique(filter(categories!B2:B5, categories!A2:A5=users!A2)), 1, false)))))
However this approach doesn't seem to be compatible with arrayformula and won't allow me to autofill down the Users tab for newly added userids. Sum itself is apparently incompatible with arrayformula. Additionally, if I enclose the above in arrayformula and replace sum with sumproduct or some other approach to the summation, I'm unable to get the users!A2 references to extend down as I'd expect via something like users!A2:A.
Any help/direction would be appreciated. Thanks!

Try:
=ARRAYFORMULA(QUERY({A1:A, VLOOKUP(A1:A&B1:B, {D1:D&E1:E, F1:F}, 2, 0)},
"select Col1,sum(Col2) where Col1 is not null group by Col1 label sum(Col2)''"))
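For readers more at home outside Sheets, the logic the formula appears to rely on (match each row on a combined userid and catcode key, then group by user) can be sketched in Python. This is only an illustration using the sample rows from the question; the function name `sum_points_by_user` is made up for this sketch.

```python
from collections import defaultdict

# Sample rows copied from the question's Categories and Measures tabs.
categories = [(1, "a"), (1, "b"), (2, "a"), (3, "c")]
measures = [(1, "a", 5), (1, "b", 5), (1, "c", 3), (2, "a", 4), (3, "c", 3)]


def sum_points_by_user(categories, measures):
    """Sum points per user, counting only (userid, catcode) pairs listed in categories."""
    allowed = set(categories)  # plays the role of the combined lookup key
    totals = defaultdict(int)
    for userid, catcode, points in measures:
        if (userid, catcode) in allowed:
            totals[userid] += points
    return dict(totals)


print(sum_points_by_user(categories, measures))  # {1: 10, 2: 4, 3: 3}
```

User 1's row for catcode c is skipped because that pair does not appear in Categories.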

Related

How do you use ArrayFormula with arrays (after aggregation)?

In my example:
https://docs.google.com/spreadsheets/d/1QQNTw_r9-q-FqVNwUoYklup73niZCFyO0VDUYImP5fo/edit?usp=sharing
I'm using Google Forms as an eBay clone to sell rare items. Each bid is outputted from the form to the "Data" worksheet and then I have ArrayFormulas set up inside the "Processed" worksheet. The idea is that I want to process the bids so that we filter everything except the items with the highest bids. All data should be automatically updated, hence why I want to use ArrayFormulas.
My strategy is that in column A, I first filter all unique items (=unique(filter(Data!A2:A,Data!A2:A<>""))) and end up with:
Jurassic Park 6-Pog Hologram Set
Princess the Bear TY Beanie Baby
Holographic 1st Ed Charizard
However, then in column B, we have to find the highest bid that corresponds to that unique item, e.g.:
=IF(ISBLANK(A2),,ArrayFormula(MAX(IF(Data!A2:A=A2,Data!B2:B))))
However, I don't want to have A2 be a single cell (A2) but an array (A2:A) so that it doesn't have to be manually copied down the rows. Similarly, I also want columns D and E to be automatic as well. Is there any way to achieve this?
Not sure if it would be considered easier than the previously posted answer, but in case this thread is found in the future, I think that this is a slightly simpler way to solve these kinds of problems:
Try this on a fresh tab in cell A1:
=FILTER(Data!A:D,COUNTIFS(Data!A:A,Data!A:A,Data!B:B,">"&Data!B:B)=0)
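The COUNTIFS condition keeps a row only when no other row has the same item and a strictly higher bid. A rough Python equivalent of that test, using made-up bid rows (the usernames and amounts are assumptions, not data from the linked sheet):

```python
# Hypothetical bid rows: (item, bid, username).
bids = [
    ("Holographic 1st Ed Charizard", 50, "ann"),
    ("Holographic 1st Ed Charizard", 75, "bob"),
    ("Princess the Bear TY Beanie Baby", 20, "cat"),
    ("Princess the Bear TY Beanie Baby", 15, "ann"),
]


def highest_bids(rows):
    """Keep rows for which no row has the same item and a strictly higher bid."""
    return [
        row
        for row in rows
        if sum(1 for other in rows if other[0] == row[0] and other[1] > row[1]) == 0
    ]


for row in highest_bids(bids):
    print(row)
```

As with the formula, two bids tied for the highest amount on the same item would both be kept.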
I did some research and found an answer very similar to what you were looking for. After rearranging the formula slightly to match your sheet, I was able to get this to work:
=ArrayFormula(vlookup(query({row(Data!A2:A),sort(Data!A2:C)},"select max(Col1) where Col2 <> '' group by Col2 label max(Col1)''",0),{row(Data!A2:A),sort(Data!A2:D)},{2,3,4,5},0))
This formula automatically populates product name, highest bid, username, and timestamp. I ran some tests, adding my own random names and values into the data sheet, and the formula worked great.
Reference: https://webapps.stackexchange.com/a/75637
use:
={A1:D1; SORTN(SORT(A2:D, 1, , 2, ), 9^9, 2, 1, )}
translated:
{A1:D1} - headers
SORT(A2:D, 1, , 2, ) - sort 1st column then 2nd column descending
9^9 - output all possible rows
2 - use 2nd mode of sortn which will group selected column
1 - selected column to be merged based on unique values
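The same sort-then-deduplicate idea, sketched in Python with made-up rows (item, bid, username); `top_row_per_item` is a hypothetical helper name. Sorting so that the highest bid comes first within each item and then keeping only the first row per item mirrors what SORT followed by SORTN does here.

```python
# Hypothetical bid rows: (item, bid, username).
bids = [
    ("Holographic 1st Ed Charizard", 50, "ann"),
    ("Holographic 1st Ed Charizard", 75, "bob"),
    ("Princess the Bear TY Beanie Baby", 20, "cat"),
    ("Princess the Bear TY Beanie Baby", 15, "ann"),
]


def top_row_per_item(rows):
    """Sort by item, then bid (both descending), and keep the first row seen per item."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]), reverse=True)
    seen = set()
    result = []
    for row in ordered:
        if row[0] not in seen:
            seen.add(row[0])
            result.append(row)
    return result


for row in top_row_per_item(bids):
    print(row)
```

Unlike the COUNTIFS approach, this keeps exactly one row per item even when bids are tied.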

Select Row values based on values in another row

I have a table that has an auto-incrementing identity "Reference" field and a pair of other fields that determine the sort order. What I need to do is find the 'next' item in the table when sorted based on the pair of fields based on the reference field of an initial item.
So my data looks like this when sorted by SortParent.SortChild:
Reference SortParent SortChild Data
------------------------------------------
9 1 2 Fred
7 1 3 Jim
11 1 4 Sheila
4 2 1 Micro
5 2 2 Archimedes
12 2 3 Electron
So in this example the "Jim" row (Reference=7) comes after "Fred" (Reference=9) even though its reference is smaller.
So I want to be able to find which row comes after Fred by searching based on Fred's reference.
At the moment in code I do a query to find the values for Fred's row:
SELECT SortParent,SortChild From MyTable WHERE Reference=9
Which returns 1,2. Then do a search for the first row that comes after 1,2:
SELECT * FROM MyTable
WHERE ((SortParent=1 and SortChild>2) OR (SortParent>1))
ORDER BY SortParent,SortChild
Which will therefore come back with the row having reference 7 and sort values 1,3
I'm pretty sure this can be done in a single query, but I'm stumped on the best way.
Incidentally, if anyone has any suggestions on an alternate way of handling the two-part sort columns that would make this easier, please feel free to help!
I believe you are looking for the LEAD or LAG window function:
https://msdn.microsoft.com/en-US/library/hh213125.aspx
SELECT
    NextReference
FROM
    (SELECT
         Reference
       , LEAD(Reference, 1, 0) OVER (ORDER BY SortParent, SortChild) AS NextReference
     FROM
         MyTable
    ) newTable
WHERE
    Reference = 9
I used LEAD, but try it with LAG if you are looking in the other direction for the row
I haven't tested this particular query, so it may not be syntactically sound, but let me know if you have any trouble with it and I'll go over it a bit more once I'm back at my desk.
EDIT: Used the wrong sql from your question as my base
EDIT2: Put the lead into a subquery to allow us to query on it
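For anyone who wants to try the LEAD approach without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module and the sample rows from the question. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions), and the helper name `next_reference` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable "
    "(Reference INTEGER, SortParent INTEGER, SortChild INTEGER, Data TEXT)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (9, 1, 2, "Fred"),
        (7, 1, 3, "Jim"),
        (11, 1, 4, "Sheila"),
        (4, 2, 1, "Micro"),
        (5, 2, 2, "Archimedes"),
        (12, 2, 3, "Electron"),
    ],
)


def next_reference(conn, reference):
    """Return the Reference that follows `reference` in sort order.

    Returns 0 for the last row (the LEAD default) and None if the
    reference does not exist.
    """
    row = conn.execute(
        """
        SELECT NextReference
        FROM (
            SELECT Reference,
                   LEAD(Reference, 1, 0)
                       OVER (ORDER BY SortParent, SortChild) AS NextReference
            FROM MyTable
        ) AS newTable
        WHERE Reference = ?
        """,
        (reference,),
    ).fetchone()
    return row[0] if row else None


print(next_reference(conn, 9))  # 7, i.e. Jim follows Fred
```

The window function has to be computed in the subquery over all rows; filtering on Reference inside it would leave LEAD with nothing to look ahead to.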

How to write a query to see who called who the most in an Access database of phone calls?

I have a phone bill in Excel that shows all calls made to and from my phone and I imported it into a table in Access 2007. I want to learn to use Access to do a simple query to determine who I talk to the most.
Say we have Column A (caller) and Column B (person being called), and that my number will always be in either column. How do I make a query in Access to determine which phone number I talk the most with? I've got the Table with the Excel data in it, but I need some step-by-step handholding to learn how to do the query.
In plain English, I want to query all phone calls that contain my number in either column A or column B. Then, I want to count each unique pair (mynumber + othernumber or othernumber + mynumber should be counted under the same pair). Then, I want to count/summarize each unique pair to see which pair yields the highest count.
E.g. Go to Create ribbon, click Query Wizard, etc...
Thanks!
Let's say you have the following table :-
Column A : Column B
---------:----------
Fred : 1
Bill : 2
Fred : 1
You could do a query for example :-
SELECT A, B, Count(B) AS CountOfB
FROM Table1
GROUP BY A, B
ORDER BY Count(B) DESC
This would give you :-
Column A : Column B : CountOfB
---------:----------:----------
Fred : 1 : 2
Bill : 2 : 1
The first row would list the most common occurrences of column B and the count would list the number of times that row has been seen.
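The grouping above treats (A, B) and (B, A) as different pairs. One way to fold them together, sketched here with Python's sqlite3 module rather than Access, is to group on "the number that is not mine". The table name `Calls`, its column names and all phone numbers are invented for the example; in Access itself the CASE expression would need to be written with IIf, since Access SQL has no CASE.

```python
import sqlite3

MY_NUMBER = "555-0100"  # hypothetical: the bill owner's number

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Calls (Caller TEXT, Callee TEXT)")
conn.executemany(
    "INSERT INTO Calls VALUES (?, ?)",
    [
        (MY_NUMBER, "555-0111"),
        ("555-0111", MY_NUMBER),
        (MY_NUMBER, "555-0122"),
        ("555-0111", MY_NUMBER),
        ("555-0133", "555-0122"),  # does not involve my number, so it is filtered out
    ],
)

# Pick whichever side of the call is not my number, then count per other number.
rows = conn.execute(
    """
    SELECT CASE WHEN Caller = :me THEN Callee ELSE Caller END AS OtherNumber,
           COUNT(*) AS CallCount
    FROM Calls
    WHERE Caller = :me OR Callee = :me
    GROUP BY OtherNumber
    ORDER BY CallCount DESC
    """,
    {"me": MY_NUMBER},
).fetchall()

print(rows)  # [('555-0111', 3), ('555-0122', 1)]
```

The first row is the number talked to most often, regardless of who placed the call.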

Reporting Services - Calculating row totals/percentages in a table

I am trying to calculate totals for each row as well as a percentage of the overall total.
Right now I have a table like this:
Blah Col1 Col2 Col3
-----------------------------
ABC 1 1 1
DEF 2 2 3
-----------------------------
Total 3 3 4
And I want it to include totals/percentages like so:
Blah Col1 Col2 Col3 Total %
--------------------------------------------
ABC 1 1 1 3 30%
DEF 2 2 3 7 70%
--------------------------------------------
Total 3 3 4 10 100%
I know I can do the calculations in the SQL query, but the stored procedure is rather complicated so I'd like to avoid that if possible. So I'm wondering if there's a simple way to achieve this in SSRS.
Right now I just have a row group for each Blah which I use to calculate column totals.
I added a Totals Row for my matrix, then I referenced the totals textbox (textbox 8 in my case) for the column and I did:
Sum(Fields!FieldName.Value)/ReportItems!Textbox8.Value
I hope this makes sense!
To calculate the total, just do a simple sum using the + operator. For the percentage, you can refer to the grand total using ReportItems!ItemName.
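The arithmetic behind those two expressions, checked against the numbers in the question, looks like this in Python. It is only a sketch of the calculation, not SSRS expression syntax, and `totals_and_percentages` is a made-up name.

```python
# Rows from the question's table: Blah -> [Col1, Col2, Col3].
rows = {"ABC": [1, 1, 1], "DEF": [2, 2, 3]}


def totals_and_percentages(rows):
    """Return {name: (row_total, percent_of_grand_total)} for each row."""
    row_totals = {name: sum(values) for name, values in rows.items()}
    grand_total = sum(row_totals.values())  # the value the totals textbox would hold
    return {
        name: (total, 100 * total / grand_total)
        for name, total in row_totals.items()
    }


print(totals_and_percentages(rows))  # {'ABC': (3, 30.0), 'DEF': (7, 70.0)}
```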
You can use aggregate functions in Reporting Services like "SUM" and "AVG" to achieve what you are trying to do. The way it works is "Detail" parts of groups in SSRS tables will list all of the data, while non-detail parts (like headers and footers) of groups can be used for aggregates like:
=SUM(Fields!TestValue.Value)
http://msdn.microsoft.com/en-us/library/ms159134%28v=sql.90%29.aspx
Let me know if you need any more help.
Create two groups, one on a column that has the same data for each row, then one on column Blah. Add a row for the encompassing group and do a sum there.
You can simply do the following:
Sum(CInt(Fields!TestValue.Value))
or Sum(CInt(Fields!DollarAmountOfCheck.Value),"DataSet1")
Sometimes, when data is coming through WCF, the plain Sum() function is not accepted, but this works fine in that case.

yet another tsql question

I have three tables:
documents
attributes
attributevalues
Documents can have many attributes,
and these attributes have values in the attributevalues table.
What I want is a single query that returns all documents together with their assigned attribute values, one row per document.
(I assume every document has the same attributes assigned; I don't need the complexity of different attributes for now.)
For example:
docid attvalue1 attvalue2
1 2 2
2 2 2
3 1 1
How can I do that in a single query?
Off the top of my head, I don't think you can do this without dynamic SQL.
The crux of the Entity-Attribute-Value (EAV) technique (which is what you are using) is to store columns as rows. What you want to do is convert those rows back to columns for the purpose of this query. Using PIVOT makes this possible. However, PIVOT requires knowing the number of rows that need to be converted to columns at the time the query is written. So assuming you are using EAV because you need flexible attributes/values, you won't know this information when you write the query.
So the solution would be to use dynamic SQL in conjunction with PIVOT. Did a quick search and this looks promising (didn't really read the whole thing):
http://www.simple-talk.com/community/blogs/andras/archive/2007/09/14/37265.aspx
For the record, I am not a fan of dynamic SQL and would recommend finding another approach to the larger problem (e.g. pivoting in application code).
If you know all the attributes (and their IDs) at design-time:
SELECT d.docid,
       a1.attvalue AS attvalue1,
       a2.attvalue AS attvalue2
FROM documents d
JOIN attributevalues a1 ON d.docid = a1.docid
JOIN attributevalues a2 ON d.docid = a2.docid
WHERE a1.attrid = 1
AND a2.attrid = 2
If you don't, things get quite a bit messier and difficult to answer without knowing your schema.
Let's make an example.
documents table's columns
docid,docname,createddate,createduser
and values
1 account.doc 10.10.2010 aeon
2 hr.doc 10.11.2010 aeon
attributes table's columns
attid,name,type
and values
1 subject string
2 recursive int
attributevalues table's columns
attvalueid,docid,attid,attvalue(sql_variant)
and values
1 1 1 "accounting doc"
1 1 2 0
1 2 1 "humen r doc"
1 2 2 1
and I want the query result
docid,name,attribvalue1,attribvalue2,attribvalueN
1 account.doc "accounting doc" 0
2 hr.doc "humen r doc" 1
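With this schema, the join-per-attribute approach from the earlier answer (valid when the attribute IDs are known up front) can be tried out quickly. The sketch below uses Python's sqlite3 module instead of SQL Server, so the sql_variant column is declared without a type, and it loads only the sample rows shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE documents (docid INTEGER, docname TEXT, createddate TEXT, createduser TEXT);
    CREATE TABLE attributevalues (attvalueid INTEGER, docid INTEGER, attid INTEGER, attvalue);
    """
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [(1, "account.doc", "10.10.2010", "aeon"), (2, "hr.doc", "10.11.2010", "aeon")],
)
# attvalueid is 1 on every row, exactly as in the sample data above.
conn.executemany(
    "INSERT INTO attributevalues VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "accounting doc"), (1, 1, 2, 0), (1, 2, 1, "humen r doc"), (1, 2, 2, 1)],
)

# One join per known attribute turns the attribute rows into columns.
result = conn.execute(
    """
    SELECT d.docid, d.docname,
           a1.attvalue AS attvalue1,
           a2.attvalue AS attvalue2
    FROM documents d
    JOIN attributevalues a1 ON d.docid = a1.docid AND a1.attid = 1
    JOIN attributevalues a2 ON d.docid = a2.docid AND a2.attid = 2
    ORDER BY d.docid
    """
).fetchall()

for row in result:
    print(row)
```

Because these are inner joins, a document missing either attribute would drop out of the result; LEFT JOIN would keep it with NULLs instead.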
