Google Sheets - Query Multiple Sheets That Have Different Structure - arrays

I'm looking for help structuring a query that processes data from multiple sheets (i.e. tabs) that have different data structures.
The first query (below) runs against a tab that contains all of my itemised expenses and sums them by month.
=query(Expense_Data, "SELECT C, SUM(Q) where T Matches 'Expense' GROUP BY C ORDER BY C desc limit 3 label SUM(Q) 'Expenses'",1)
Example output:
Date Expenses
01/01/2021 -$1000
01/02/2021 -$1500
01/03/2021 -$1000
I want to query another sheet, whose data is in column G, and return the value that matches each date returned by the first query (in column A), so that I can then calculate the difference between the two. My issue is associating the two data sets. Any support would be greatly appreciated!
Date Expenses Budget Difference
01/01/2021 -$1000 -$2000 -$XXXX
01/02/2021 -$1500 -$1500 -$XXXX
01/03/2021 -$1000 -$1500 -$XXXX

try:
=QUERY(Expense_Input,
"select C,sum(Q)
where T matches 'Expense'
group by C
order by C desc
limit 3
label sum(Q) 'Expenses', C'Month'
format C'mmmm yyyy'", 1)
then:
={"Budget"; ARRAYFORMULA(IFNA(VLOOKUP(TO_TEXT(A13:A),
{'Expense Lookup (Monthly)'!C:C&" "&'Expense Lookup (Monthly)'!D:D,
SUBSTITUTE('Expense Lookup (Monthly)'!G:G, "$", )*1}, 2, 0)))}
and:
={"Difference"; INDEX(IF(A13:A="",,C13:C-B13:B))}
update, all in one go:
=ARRAYFORMULA(QUERY({QUERY(Expense_Input,
"select C,sum(Q)
where T matches 'Expense'
group by C
order by C desc
limit 3
format C 'mmmm yyyy'", 1), IFNA(VLOOKUP(TEXT(INDEX(QUERY(Expense_Input,
"select C,sum(Q)
where T matches 'Expense'
group by C
order by C desc
limit 3",1),,1), "mmmm yyyy"),
{'Expense Lookup (Monthly)'!C:C&" "&'Expense Lookup (Monthly)'!D:D,
SUBSTITUTE('Expense Lookup (Monthly)'!G:G, "$", )*1}, 2, 0))},
"select Col1,Col2,Col3,Col3-Col2
label Col1'Month',Col2'Expenses',Col3'Budget',Col3-Col2'Difference'"))
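For anyone who wants to check the logic outside Sheets, here is a minimal Python sketch of the same idea: total the expenses per month, look up the budget for that month, and subtract. The sample rows, the budget figures and the name `monthly_report` are made up for illustration and only mirror the example tables in the question, not the real workbook.

```python
from collections import defaultdict
from datetime import date

# Hypothetical itemised expenses: (month, amount, type), like columns C, Q, T.
expenses = [
    (date(2021, 1, 1), -600, "Expense"),
    (date(2021, 1, 1), -400, "Expense"),
    (date(2021, 2, 1), -1500, "Expense"),
    (date(2021, 3, 1), -1000, "Expense"),
    (date(2021, 3, 1), 5000, "Income"),  # skipped: not an expense
]

# Hypothetical monthly budget keyed by month, like column G of the lookup tab.
budget = {
    date(2021, 1, 1): -2000,
    date(2021, 2, 1): -1500,
    date(2021, 3, 1): -1500,
}


def monthly_report(expenses, budget, limit=3):
    """Return (month, expenses, budget, difference) rows, newest month first."""
    totals = defaultdict(int)
    for month, amount, kind in expenses:
        if kind == "Expense":            # where T matches 'Expense'
            totals[month] += amount      # sum(Q) ... group by C
    rows = []
    for month in sorted(totals, reverse=True)[:limit]:  # order by C desc limit 3
        planned = budget.get(month)      # VLOOKUP; None plays the role of IFNA
        diff = None if planned is None else planned - totals[month]  # Col3-Col2
        rows.append((month, totals[month], planned, diff))
    return rows


report = monthly_report(expenses, budget)
for row in report:
    print(row)
```

The difference is computed as budget minus expenses, matching `Col3-Col2` in the combined formula.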


How do I get this average percentage formula to work across multiple sheets?

I have 4 sheets named: STATS, SHEET01, SHEET02 and SHEET03.
SHEET01, SHEET02 and SHEET03 look exactly alike (dates in column A, percentages in column B), but with different dates and percentages.
In cell A1 of the STATS sheet I have the following formula:
=IFERROR(ArrayFormula(AVERAGEIFS(SHEET01!B2:B,month(SHEET01!A2:A),2,year(SHEET01!A2:A),2020)))
This formula returns the average percentage of February 2020.
This formula clearly only works with SHEET01 and I can't figure out how to make it also take the averages of SHEET02 and SHEET03.
I have tried:
=IFERROR(ArrayFormula(AVERAGEIFS(SHEET01!B2:B,SHEET02!B2:B,SHEET03!B2:B,month(SHEET01!A2:A,SHEET02!A2:A,SHEET03!A2:A),2,year(SHEET01!A2:A,SHEET02!A2:A,SHEET03!A2:A),2020)))
UPDATE
Now using the following query:
=query({'SHEET01'!A:B;'SHEET02'!A:B;'SHEET03'!A:B},"select avg(B) Where A is not null and A>=date'2020-01-01' and A<=date'2020-01-31'")
This outputs the following error:
Unable to parse query string for Function QUERY parameter 2: NO_COLUMN: B
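a range built with {} has no column letters, so inside QUERY you refer to its columns as Col1, Col2 instead of A, B. try this in B2 of the STATS sheet if A2:A contains dates: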
=ARRAYFORMULA(IFNA(VLOOKUP(A2:A,
QUERY({SHEET01!A2:B; SHEET02!A2:B; SHEET03!A2:B},
"select Col1,avg(Col2)
where Col1 is not null
group by Col1
label avg(Col2)''"), 2, 0)))
or use only this to get the full summary:
=QUERY({SHEET01!A2:B; SHEET02!A2:B; SHEET03!A2:B},
"select Col1,avg(Col2)
where Col1 is not null
group by Col1
label avg(Col2)''")
note that you might need to format the dates via the 123 button if your dates are output as plain numbers (date serial values)
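The stacking step is the key idea here: the three tabs are appended into one two-column table, and only then grouped. A rough Python sketch of that logic follows, with made-up sample rows and hypothetical function names (`average_by_date`, `average_for_month`); percentages are stored as whole numbers to keep the arithmetic exact.

```python
from collections import defaultdict
from datetime import date

# Hypothetical contents of the three tabs: (date, percentage) rows.
sheet01 = [(date(2020, 2, 1), 50), (date(2020, 2, 2), 70)]
sheet02 = [(date(2020, 2, 1), 30), (date(2020, 3, 1), 90)]
sheet03 = [(date(2020, 2, 2), 10), (None, None)]  # trailing blank row


def average_by_date(*sheets):
    """Stack all sheets, then average the second column per date."""
    stacked = [row for sheet in sheets for row in sheet]  # {S1; S2; S3}
    groups = defaultdict(list)
    for day, pct in stacked:
        if day is not None:               # where Col1 is not null
            groups[day].append(pct)
    return {day: sum(v) / len(v) for day, v in sorted(groups.items())}


def average_for_month(year, month, *sheets):
    """Average every value that falls in the given month, across all sheets."""
    values = [
        pct
        for sheet in sheets
        for day, pct in sheet
        if day is not None and day.year == year and day.month == month
    ]
    return sum(values) / len(values) if values else None


by_date = average_by_date(sheet01, sheet02, sheet03)
february = average_for_month(2020, 2, sheet01, sheet02, sheet03)
print(by_date)
print(february)
```

`average_for_month` corresponds to the single February 2020 figure the question asked for, while `average_by_date` corresponds to the grouped summary.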

Google Sheets Query for Maximum Date

I have a table that I would like to return in its entirety, but filtered to the row with the maximum date in column H for each account, because rows have duplicate account numbers with different dates.
https://docs.google.com/spreadsheets/d/1x3hYy1igiL3_lqhFE3IwrQNFbOjdpXRtHLAbOXB2BE4/edit?usp=sharing
I'm using this query right now, but clearly I'm overlooking something, can anyone help, please?
=QUERY(B3:I12, "Select *, Max(H) WHERE B is not null Group by C")
correct syntax would be:
=QUERY(B3:I12,
"select B,C,D,E,F,G,H,I,max(H)
where B is not null
group by B,C,D,E,F,G,H,I")
but you probably need this:
={B3:I3; SORTN(SORT(B4:I, 7, 0), 99^99, 2, 2, 1)}
The final formula was ={DataTable;SORTN(SORT(Data,17,0),99^99,2,2,1)}
DataTable is the headings and Data is the dataset. Out of 14,000 lines this filtered down to around 7,000 lines of unique rows with the highest (Maximum) date.
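The SORT plus SORTN combination boils down to "sort newest first, then keep the first row seen for each account". A small Python sketch of that logic, with invented rows and a hypothetical helper name `latest_per_account`:

```python
from datetime import date

# Hypothetical rows: (account, date, note). Accounts repeat with different dates.
rows = [
    ("A-100", date(2021, 1, 5), "old"),
    ("A-100", date(2021, 3, 9), "new"),
    ("B-200", date(2021, 2, 1), "only"),
    ("C-300", date(2021, 4, 2), "new"),
    ("C-300", date(2021, 1, 2), "old"),
]


def latest_per_account(rows, key_index=0, date_index=1):
    """Keep one row per account: the one with the most recent date."""
    # SORT(data, date_column, 0): newest rows first.
    newest_first = sorted(rows, key=lambda r: r[date_index], reverse=True)
    first_seen = {}
    for row in newest_first:
        # Only the first (newest) row per account is stored.
        first_seen.setdefault(row[key_index], row)
    # Return the survivors ordered by account, ascending.
    return [first_seen[key] for key in sorted(first_seen)]


latest = latest_per_account(rows)
for row in latest:
    print(row)
```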

SQL Server Reporting Services - datasets - grouping

I have table1, which uses dataset A. I grouped this table by date. I have another dataset, B, which contains a total_qty column.
I have to sum this column (total_qty) and show the result in table1 under every group total.
Data set A columns:
Date
company_code
location_name
volumes
Data set B columns:
Date
Company_code
Total_Qty
How can I get Sum(total_qty) from dataset B into table1 under the group total? Also, Total_Qty should change per group, since it is placed in the group.
Thanks
If you don't need to SUM anything for DataSet B's Company_code, you can use a Lookup to get the value. You would need to match the date and Company_code from both tables.
=LookUp(CSTR(FIELDS!Company_code.VALUE) & "|" & CSTR(FIELDS!Date.VALUE),
CSTR(FIELDS!Company_code.VALUE) & "|" & CSTR(FIELDS!Date.VALUE),
FIELDS!Total_Qty.VALUE, "DataSetB")
This will combine your company code and date fields into one string and lookup the same combination in DataSet B.
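To make the composite-key idea concrete, here is a small Python sketch of the matching logic. The sample rows and the helper names `make_key` and `lookup_total_qty` are invented for illustration; like the expression above, it returns the first match only and does no summing.

```python
# Hypothetical rows for the two datasets described in the question.
dataset_a = [
    {"Date": "2021-01-01", "company_code": 10, "location_name": "North", "volumes": 5},
    {"Date": "2021-01-01", "company_code": 20, "location_name": "South", "volumes": 7},
    {"Date": "2021-01-02", "company_code": 10, "location_name": "North", "volumes": 3},
]
dataset_b = [
    {"Date": "2021-01-01", "Company_code": 10, "Total_Qty": 100},
    {"Date": "2021-01-01", "Company_code": 20, "Total_Qty": 250},
]


def make_key(company_code, day):
    """Combine company code and date into one string, e.g. '10|2021-01-01'."""
    return f"{company_code}|{day}"


def lookup_total_qty(row_a, dataset_b):
    """Return Total_Qty of the first dataset B row with the same key, else None."""
    wanted = make_key(row_a["company_code"], row_a["Date"])
    for candidate in dataset_b:
        if make_key(candidate["Company_code"], candidate["Date"]) == wanted:
            return candidate["Total_Qty"]
    return None


matched = [lookup_total_qty(row, dataset_b) for row in dataset_a]
print(matched)
```

The third row of dataset A has no partner in dataset B, so it comes back empty, which is how an unmatched lookup behaves in the report as well.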

SQL Query for continuous eligibility dates

How can I write a query for the following?
I have data like
id eff date term date
C 20000101 20050228
C 20000501 20120229
C 20060101 20120301
I need to check for continuous eligibility and, if it is continuous, get rid of the other records. For the above example I need the output as
id Eff date Term date
C 20000101 20120301
I need to remove the overlapping ranges. In the above example I have overlapping ranges of eligibility. If I had another input record (a 4th one), say C 20130101 20130531, then my output should be two records: C 20000101 20120301 and C 20130101 20130531.
I am not sure about the business rules for continuous eligibility - but wonder if just this will do it:
SELECT id, min(Eff_date), max(Term_date)
FROM my_table
GROUP BY id
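Note that a plain min/max per id would fold the hypothetical 4th record (C 20130101 20130531) into a single row, whereas the question expects two rows in that case. The sketch below shows the interval-merging logic in Python rather than SQL. It assumes, since the business rule is not stated, that two ranges are continuous when the next one starts no later than one day after the current one ends; the function name `merge_eligibility` and the `max_gap_days` parameter are made up for this example.

```python
from datetime import datetime, timedelta


def parse(yyyymmdd):
    """Turn a 'YYYYMMDD' string into a date object."""
    return datetime.strptime(yyyymmdd, "%Y%m%d").date()


def merge_eligibility(records, max_gap_days=1):
    """Merge overlapping or adjacent (id, eff, term) ranges per id.

    Assumption: ranges count as continuous when the next eff date is at most
    max_gap_days after the running term date.
    """
    merged = []
    ordered = sorted((member, parse(eff), parse(term)) for member, eff, term in records)
    for member, eff, term in ordered:
        last = merged[-1] if merged else None
        if last and last[0] == member and eff <= last[2] + timedelta(days=max_gap_days):
            last[2] = max(last[2], term)         # extend the current range
        else:
            merged.append([member, eff, term])   # start a new range
    return [(m, e.strftime("%Y%m%d"), t.strftime("%Y%m%d")) for m, e, t in merged]


records = [
    ("C", "20000101", "20050228"),
    ("C", "20000501", "20120229"),
    ("C", "20060101", "20120301"),
    ("C", "20130101", "20130531"),  # the hypothetical 4th record from the question
]
merged_ranges = merge_eligibility(records)
for row in merged_ranges:
    print(row)
```

In SQL this is usually called a gaps-and-islands problem; the loop above is the same idea written procedurally.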

MS Access : Average and Total Calculation in Single Query

INTRODUCTION TO DATABASE TABLE BEING USED -
I am working on a “Stock Market Prices” based Database Table. My table has got the data for the following FIELDS –
ID
SYMBOL
OPEN
HIGH
LOW
CLOSE
VOLUME
VOLUME CHANGE
VOLUME CHANGE %
OPEN_INT
SECTOR
TIMESTAMP
New data gets added to the table daily “Monday to Friday”, based on the stock market price changes for that day. The current requirement is based on the VOLUME field, which shows the volume traded for a particular stock on daily basis.
REQUIREMENT –
To get the Average and Total Volume for last 10,15 and 30 Days respectively.
METHOD USED CURRENTLY -
I created these 9 SEPARATE QUERIES in order to get my desired results –
First I have created these 3 queries to take out the most recent last 10,15 and 30 dates from the current table:
qryLast10DaysStored
qryLast15DaysStored
qryLast30DaysStored
Then I have created these 3 queries for getting the respective AVERAGES:
qrySymbolAvgVolume10Days
qrySymbolAvgVolume15Days
qrySymbolAvgVolume30Days
And then I have created these 3 queries for getting the respective TOTALS:
qrySymbolTotalVolume10Days
qrySymbolTotalVolume15Days
qrySymbolTotalVolume30Days
PROBLEM BEING FACED WITH CURRENT METHOD -
Now, my problem is that I have ended up with so many different queries, whereas I wanted to get the output into One Single Query, as shown in the Snapshot of the Excel Sheet:
http://i49.tinypic.com/256tgcp.png
SOLUTION NEEDED -
Is there some way by which I can get these required fields into ONE SINGLE QUERY, so that I do not have to look into multiple places for the required fields? Can someone please tell me how to get all these separate queries into one -
A) Either by taking out or moving the results from these separate individual queries to one.
B) Or by making a new query which calculates all these fields within itself, so that these separate individual queries are no longer needed. This would be a better solution I think.
One Clarification about Dates –
Some might wonder why I used Top 10, 15 and 30 to get the last 10, 15 and 30 date values. Why not just use the PC date to get these values, or something like -
("VOLUME","tbl-B", "TimeStamp BETWEEN Date() - 10 AND Date()")
The answer is that I require my query to "Read" the date from the "TIMESTAMP" Field, and then perform its calculations accordingly for LAST / MOST RECENT "10 days, 15 days, 30 days” FOR WHICH THE DATA IS AVAILABLE IN THE TABLE, WITHOUT BOTHERING WHAT THE CURRENT DATE IS. It should not depend upon the current date in any way.
If there is any better or more efficient way to create these queries, please enlighten me.
You have separate queries to compute 10DayTotalVolume and 10DayAvgVolume. I suspect you can compute both in one query, qry10DayVolumes.
SELECT
    b.SYMBOL,
    Sum(b.VOLUME) AS 10DayTotalVolume,
    Avg(b.VOLUME) AS 10DayAvgVolume
FROM
    [tbl-B] AS b INNER JOIN
    qryLast10DaysStored AS q
    ON b.TIMESTAMP = q.TIMESTAMP
GROUP BY b.SYMBOL;
However, that makes me wonder whether 10DayAvgVolume can ever be anything other than 10DayTotalVolume / 10
Similar considerations apply to the 15 and 30 day values.
Ultimately, I think you want something based on a starting point like this:
SELECT
    q10.SYMBOL,
    q10.[10DayTotalVolume],
    q10.[10DayAvgVolume],
    q15.[15DayTotalVolume],
    q15.[15DayAvgVolume],
    q30.[30DayTotalVolume],
    q30.[30DayAvgVolume]
FROM
    (qry10DayVolumes AS q10
    INNER JOIN qry15DayVolumes AS q15
    ON q10.SYMBOL = q15.SYMBOL)
    INNER JOIN qry30DayVolumes AS q30
    ON q10.SYMBOL = q30.SYMBOL;
That assumes you have created qry15DayVolumes and qry30DayVolumes following the approach I suggested for qry10DayVolumes.
If you want to cut down the number of queries, you could use subqueries for each of the qry??DayVolumes saved queries, but try it this way first to make sure the logic is correct.
In that second query above, there can be a problem due to field names which start with digits. Enclose those names in square brackets or re-alias them in qry10DayVolumes, qry15DayVolumes, and qry30DayVolumes using alias names which begin with letters instead of digits.
I tested the query as written above with the "2nd Upload.mdb" you uploaded, and it ran without error from Access 2007. Here is the first row of the result set from that query:
SYMBOL 10DayTotalVolume 10DayAvgVolume 15DayTotalVolume 15DayAvgVolume 30DayTotalVolume 30DayAvgVolume
ACC-1 42909 4290.9 54892 3659.46666666667 89669 2988.96666666667
Access doesn't support most advanced SQL syntax and clauses, so this is a bit of a hack, but it works, and is fast on your small sample. You're basically running 3 queries but the Union clauses allow you to combine into one:
select
    Symbol,
    sum([10DayTotalVol]) as 10DayTotalV,
    sum([10DayAvgVol]) as 10DayAvgV,
    sum([15DayTotalVol]) as 15DayTotalV,
    sum([15DayAvgVol]) as 15DayAvgV,
    sum([30DayTotalVol]) as 30DayTotalV,
    sum([30DayAvgVol]) as 30DayAvgV
from (
    select
        Symbol,
        sum(volume) as 10DayTotalVol, avg(volume) as 10DayAvgVol,
        0 as 15DayTotalVol, 0 as 15DayAvgVol,
        0 as 30DayTotalVol, 0 as 30DayAvgVol
    from
        [tbl-b]
    where
        timestamp >= (select min(ts) from (select distinct top 10 timestamp as ts from [tbl-b] order by timestamp desc ))
    group by
        Symbol
    UNION
    select
        Symbol,
        0, 0,
        sum(volume), avg(volume),
        0, 0
    from
        [tbl-b]
    where
        timestamp >= (select min(ts) from (select distinct top 15 timestamp as ts from [tbl-b] order by timestamp desc ))
    group by
        Symbol
    UNION
    select
        Symbol,
        0, 0,
        0, 0,
        sum(volume), avg(volume)
    from
        [tbl-b]
    where
        timestamp >= (select min(ts) from (select distinct top 30 timestamp as ts from [tbl-b] order by timestamp desc ))
    group by
        Symbol
) s
group by
    Symbol
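To sanity-check the windowing logic independently of Access, here is a minimal Python sketch of the same calculation: take the N most recent distinct dates present in the data (not the calendar date), then total and average the volume per symbol within each window. The sample rows, the function name `volume_windows` and the tiny windows (2 and 3 days instead of 10, 15 and 30) are invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (symbol, timestamp, volume). ISO date strings sort correctly.
rows = [
    ("ACC", "2012-11-01", 100),
    ("ACC", "2012-11-02", 200),
    ("ACC", "2012-11-05", 300),
    ("ACC", "2012-11-06", 400),
    ("XYZ", "2012-11-01", 10),
    ("XYZ", "2012-11-05", 30),   # XYZ has no row for 2012-11-02
    ("XYZ", "2012-11-06", 50),
]


def volume_windows(rows, windows=(10, 15, 30)):
    """Return {symbol: {'total_N': ..., 'avg_N': ...}} for each window N.

    rows must be a list, since it is scanned once per window.
    """
    # Distinct dates stored in the table, newest first (select distinct top N ...).
    dates_desc = sorted({ts for _, ts, _ in rows}, reverse=True)
    result = defaultdict(dict)
    for n in windows:
        recent = set(dates_desc[:n])
        volumes = defaultdict(list)
        for symbol, ts, volume in rows:
            if ts in recent:
                volumes[symbol].append(volume)
        for symbol, values in volumes.items():
            result[symbol][f"total_{n}"] = sum(values)
            result[symbol][f"avg_{n}"] = sum(values) / len(values)
    return dict(result)


summary = volume_windows(rows, windows=(2, 3))
for symbol, stats in summary.items():
    print(symbol, stats)
```

In this sample XYZ has no row on one of the three most recent dates, so its 3-day average (40.0) is not its 3-day total divided by 3. That is the one situation where the average and total / N can differ.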
