Query for the latest 3 weeks of data in a Snowflake query

In a Snowflake query, I would like to query for the latest 3 weeks of data based on a column that holds the WorkWeek number, with the data type Number (not a timestamp).
My table looks like this:
WorkWeek  Data1  Data2
--------  -----  -----
202235    ...    ...
202235    ...    ...
202235    ...    ...
202234    ...    ...
202233    ...    ...
202233    ...    ...
202232    ...    ...
202232    ...    ...
What I want (latest 3 Workweeks data):
WorkWeek  Data1  Data2
--------  -----  -----
202235    ...    ...
202235    ...    ...
202235    ...    ...
202234    ...    ...
202233    ...    ...
202233    ...    ...
What I am using so far is a pretty manual way of catching the latest 3 weeks of data. I would like the new code to pick up the latest 3 weeks automatically as a new week rolls into the database:
Select *
From DB
where WorkWeek in ('202235','202234','202233')

QUALIFY with DENSE_RANK is what you need:
select column1 as WorkWeek, column2 as SomeData from values
(202235,'a')
,(202235,'b')
,(202235,'c')
,(202234,'d')
,(202233,'e')
,(202233,'f')
,(202232,'g')
,(202232,'h')
qualify dense_rank() over (order by WorkWeek desc) <= 3;
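For readers without a Snowflake session, the same ranking idea can be tried locally. The following is a minimal sketch, assuming an in-memory SQLite database (version 3.25 or later for window functions) driven from Python; the table name `db` and the variable `latest_three` are placeholders chosen for illustration. SQLite has no QUALIFY clause, so the DENSE_RANK filter is moved into an outer WHERE, which is not the Snowflake syntax shown above but expresses the same logic.

```python
import sqlite3

# In-memory SQLite database standing in for the Snowflake table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (WorkWeek INTEGER, SomeData TEXT)")
conn.executemany(
    "INSERT INTO db VALUES (?, ?)",
    [(202235, "a"), (202235, "b"), (202235, "c"), (202234, "d"),
     (202233, "e"), (202233, "f"), (202232, "g"), (202232, "h")],
)

# Rank the distinct week numbers from newest to oldest, then keep ranks 1 to 3.
# SQLite lacks QUALIFY, so the rank is computed in a subquery and filtered outside.
latest_three = conn.execute(
    """
    SELECT WorkWeek, SomeData
    FROM (
        SELECT WorkWeek, SomeData,
               DENSE_RANK() OVER (ORDER BY WorkWeek DESC) AS wk_rank
        FROM db
    )
    WHERE wk_rank <= 3
    ORDER BY WorkWeek DESC, SomeData
    """
).fetchall()

print(latest_three)
```

Because DENSE_RANK assigns the same rank to every row of a given week, all rows of the three newest weeks are kept regardless of how many rows each week has.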

Related

Perform Query Using Columns of Table as Arguments

I need to run the same query with different arguments. The arguments are stored in table T2 below. They change every day, so table T2 changes and the result of the query changes with it. The query is simple, but I don't know how to run it using the columns of T2 as arguments.
-----Table T2----
ID   Country    FilterExpression
---  ---------  ----------------
1    Argentina  'Filter01'
2    Brazil     'Filter02'
3    USA        'Filter03'
4    UK         'Filter04'
5    France     'Filter05'
6    Mexico     'Filter06'
...
100  Canada     'Filter100'
The query I need to perform:
SELECT Element,Value
FROM ArchTot
WHERE Country = [Column Country of T2]
AND FilterExpr = [Column FilterExpression of T2]
Since the table of arguments has 100 rows, the result of my query must have the same 100 rows.
Can somebody help me build this query?
Try to be more specific so I can understand what you need to do. Why do you need to retrieve 100 rows for 100 arguments? And when the table data changes, what happens to the old data? Do you need to query the new data or the data before it changed?
Every day the table of arguments changes, so I have to run the query again to update the result. Since I have 100 arguments, I'll have 100 results:
Result of query from Tables Archtot or Archtot2
Country    FilterExpr   Value
---------  -----------  ------
Argentina  'Filter01'   100.82
Brazil     'Filter02'   102.87
USA        'Filter03'   82.7
UK         'Filter04'   106.8
France     'Filter05'   110.7
Mexico     'Filter06'   79.9
...
Canada     'Filter100'  102.04
I tried the following, and it works:
SELECT A.Element,A.Value
FROM ArchTot A, T2 B
WHERE A.Country = B.Country
AND A.FilterExpr = B.FilterExpression
The problem now is that I can't use an INNER JOIN once I have put table T2 in the FROM clause; the query times out. For example:
SELECT * FROM
(
(SELECT A.Element As Element,A.Value As Value, A.Country As Country
FROM ArchTot A, T2 B
WHERE A.Country = B.Country
AND A.FilterExpr = B.FilterExpression)T1
INNER JOIN
(SELECT C.Element As Element,C.Value As Value, C.Country As Country
FROM ArchTot2 C, T2 D
WHERE C.Country = D.Country
AND C.FilterExpr = D.FilterExpression)T2
ON T1.Country=T2.Country
)
Is the query I built OK? Any ideas why I can't perform the INNER JOIN?
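The idea of using T2 as a table of arguments can be tried on a small scale. This is a minimal sketch, assuming an in-memory SQLite database driven from Python; the Element values (`E1` to `E4`) and the extra non-matching rows are made up for illustration. It writes the comma join from the question as an explicit INNER JOIN, which is equivalent, and does not attempt to reproduce or diagnose the timeout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ArchTot (Element TEXT, Value REAL, Country TEXT, FilterExpr TEXT);
    CREATE TABLE T2 (ID INTEGER, Country TEXT, FilterExpression TEXT);
    -- Hypothetical sample rows; only two of them match an argument row in T2.
    INSERT INTO ArchTot VALUES
        ('E1', 100.82, 'Argentina', 'Filter01'),
        ('E2', 55.0,   'Argentina', 'Filter99'),
        ('E3', 102.87, 'Brazil',    'Filter02'),
        ('E4', 82.7,   'USA',       'Filter03');
    INSERT INTO T2 VALUES
        (1, 'Argentina', 'Filter01'),
        (2, 'Brazil',    'Filter02');
""")

# Each row of T2 acts as one (Country, FilterExpression) argument pair.
rows = conn.execute("""
    SELECT A.Element, A.Value
    FROM ArchTot AS A
    INNER JOIN T2 AS B
        ON A.Country = B.Country
       AND A.FilterExpr = B.FilterExpression
    ORDER BY B.ID
""").fetchall()

print(rows)
```

Note that the result has one row per argument only if each (Country, FilterExpr) pair matches exactly one row in ArchTot; duplicates in either table multiply the output.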

Convert columns to rows by ID

Looking for a way to convert columns to rows in SQL Server.
I have a table with the columns below:
[ID] [Action] [Note] [Resolution]
Here is what I want to get as the result with the columns: [ID] [Notes]
And the result values will be:
'1' 'Action1'
'1' 'Note1'
'1' 'Resolution1'
'2' 'Action2'
'2' 'Note2'
'2' 'Note2.1'
'2' 'Resolution2' etc
Any ideas how I could do this in T-SQL? Also for the note field there could be multiple entries. Thanks!
Assuming your source table and data look like this:
-- select * from t:
ID Action Note Resolution
--- ------- ------- -----------
1 Action1 Note1 Resolution1
2 Action2 Note2 Resolution2
2 Action2 Note2.1 Resolution2
This query:
select distinct id, notes
from (select * from t) as source
unpivot (notes for ids in ([action], [note], [resolution])
) as unpivotted_table
will produce this result:
id notes
--- ------
1 Action1
1 Note1
1 Resolution1
2 Action2
2 Note2
2 Note2.1
2 Resolution2
which looks a lot like what you are asking for.
You can find more information on how the UNPIVOT operator works in its documentation.
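On engines without an UNPIVOT operator, the same reshaping can be written with UNION, which also removes duplicates the way `select distinct` does above. The following is a minimal sketch, assuming an in-memory SQLite database driven from Python; it is a substitute technique, not the T-SQL UNPIVOT syntax itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (ID INTEGER, "Action" TEXT, Note TEXT, Resolution TEXT);
    INSERT INTO t VALUES
        (1, 'Action1', 'Note1',   'Resolution1'),
        (2, 'Action2', 'Note2',   'Resolution2'),
        (2, 'Action2', 'Note2.1', 'Resolution2');
""")

# One SELECT per source column; UNION (not UNION ALL) drops the repeated
# 'Action2' and 'Resolution2' rows that ID 2 would otherwise produce twice.
rows = conn.execute("""
    SELECT ID, "Action" AS Notes FROM t
    UNION
    SELECT ID, Note FROM t
    UNION
    SELECT ID, Resolution FROM t
    ORDER BY ID, Notes
""").fetchall()

for row in rows:
    print(row)
```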

nontrivial query

I need to write a query.
I have 3 tables, one main and two associated.
For example:
main_table
id name
-- -------------
1 example
2 example2
join_table1
id main_table_id
-- -------------
1 1
join_table2
id main_table_id
-- -------------
1 2
If main_table_id is contained in join_table1, I need to sort by join_table1.id; otherwise, if main_table_id is contained in join_table2, I need to sort by join_table2.id.
Any ideas how to write such a query?
main_table_id can be in either join_table1 or join_table2.
Database: SQL Server
You should combine COALESCE and left joins. The query will look like:
select ...
from main_table
left outer join join_table1 j1 on ...
left outer join join_table2 j2 on ...
order by coalesce( j1.id, j2.id )
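To make the COALESCE ordering concrete, here is a minimal sketch, assuming an in-memory SQLite database driven from Python. The join conditions (`main_table_id = m.id`) and the sample rows are assumptions made for illustration, since the answer above leaves them elided; the row values differ from the question so that the sort order is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table  (id INTEGER, name TEXT);
    CREATE TABLE join_table1 (id INTEGER, main_table_id INTEGER);
    CREATE TABLE join_table2 (id INTEGER, main_table_id INTEGER);
    -- Hypothetical data: rows 1 and 3 are referenced from join_table1,
    -- row 2 only from join_table2.
    INSERT INTO main_table  VALUES (1, 'example'), (2, 'example2'), (3, 'example3');
    INSERT INTO join_table1 VALUES (5, 1), (2, 3);
    INSERT INTO join_table2 VALUES (3, 2);
""")

# COALESCE picks j1.id when the row exists in join_table1, else j2.id.
rows = conn.execute("""
    SELECT m.id, m.name, COALESCE(j1.id, j2.id) AS sort_key
    FROM main_table m
    LEFT OUTER JOIN join_table1 j1 ON j1.main_table_id = m.id
    LEFT OUTER JOIN join_table2 j2 ON j2.main_table_id = m.id
    ORDER BY COALESCE(j1.id, j2.id)
""").fetchall()

print(rows)
```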

SQL Case statements, making sub selections on a condition?

I've come across a scenario where I need to return a complex set of calculated values at a crossover point from "legacy" to current.
To cut a long story short I have something like this ...
with someofit as
(
select id, col1, col2, col3 from table1
)
select someofit.*,
case when id < #lastLegacyId then
(select ... from table2 where something = id) as 'bla'
,(select ... from table2 where something = id) as 'foo'
,(select ... from table2 where something = id) as 'bar'
else
(select ... from table3 where something = id) as 'bla'
,(select ... from table3 where something = id) as 'foo'
,(select ... from table3 where something = id) as 'bar'
end
from someofit
Now here lies the problem ...
I don't want to be constantly doing that case check for each sub selection, but at the same time, when that condition applies, I need all of the selections within the relevant case block.
Is there a smarter way to do this?
If I were in a proper OO language I would use something like this ...
var common = GetCommonSuff()
foreach (object item in common)
{
if(item.id <= lastLegacyId)
{
AppendLegacyValuesTo(item);
}
else
{
AppendCurrentValuesTo(item);
}
}
I did initially try doing 2 complete selections with a union all, but this doesn't work very well due to efficiency / the number of rows to be evaluated.
The sub selections are looking for total row counts where some condition is met other than the id match on either table 2 or 3, but those tables may have millions of rows in them.
The CTE is used for 2 reasons ...
Firstly, it pulls only the rows from table 1 I am interested in, so straight away I'm only doing a fraction of the sub selections in each case.
Secondly, it returns the common stuff in a single lookup on table 1.
Any ideas?
EDIT 1 :
Some context to the situation ...
I have a table called "imports" (table 1 above). This represents an import job where we take data from a file (CSV or similar) and pull the records into the db.
I then have a table called "steps". This represents the processing / cleaning rules we go through, and each record contains a sproc name and a bunch of other stuff about the rule.
There is then a join table that represents the rule for a particular import, "ImportSteps" (table 2 above, for current data). This contains a "rowsaffected" column and the import id.
So for the current jobs my SQL is quite simple ...
select 123 456
from imports
join importsteps
For the older legacy stuff, however, I have to look through table 3. Table 3 is the holding table; it contains every record ever imported, and each row has an import id and key values.
On the new data, rowsaffected on table 2 for import id x where step id is y will return my value.
On the legacy data I have to count the rows in holding where col z = something.
I need data on about 20 imports, and this data is bound to a "datagrid" on my MVC web app (if that makes any difference).
The CTE I use determines, through some parameters, the "current 20 I'm interested in"; those params represent the start and end record (ordered by import id).
My biggest issue is that holding table. It's massive: individual jobs have been known to contain 500k+ records on their own, and this table holds years of imported rows, so I need my lookups on that table to be as fast as possible and as few as possible.
EDIT 2:
The actual solution (pseudocode only) ...
-- declare and populate the subset to reduce reads on the big holding table
declare table #holding ( ... )
insert into #holding
select .. from holding
select
... common stuff from inner select in "from" below
... bunch of ...
case when id < #legacy then (select getNewValue(id, stepid))
else (select x from #holding where id = ID and ... ) end as 'bla'
from
(
select ROW_NUMBER() over (order by importid desc) as 'RowNum'
, ...
) as I
-- this bit handles the paging
where RowNum >= #StartIndex
and RowNum < #EndIndex
I'm still confident I can clean it up more, but my original query, which looked something like Bill's solution, took about 45 seconds to execute; this one takes about 7.
I take it the subqueries must return a single scalar value, correct? This point is important because it is what ensures the LEFT JOINs will not multiply the result.
;with someofit as
(
select id, col1, col2, col3 from table1
)
select someofit.*,
bla = coalesce(t2.col1, t3.col1),
foo = coalesce(t2.col2, t3.col2),
bar = coalesce(t2.bar, t3.bar)
from someofit
left join table2 t2 on t2.something=someofit.id and someofit.id < #lastLegacyId
left join table3 t3 on t3.something=someofit.id and someofit.id >= #lastLegacyId
Beware that I have used id >= #lastLegacyId as the complement of the condition, by assuming that id is not nullable. If it is, you need an IsNull there, i.e. someofit.id >= isnull(#lastLegacyId,someofit.id).
Your edit to the question doesn't change the fact that this is an almost literal translation of the O-O syntax.
foreach (object item in common) --> "from someofit"
{
if(item.id <= lastLegacyId) --> the precondition to the t2 join
{
AppendLegacyValuesTo(item); --> putting t2.x as first argument of coalesce
}
else --> sql would normally join to both tables
--> hence we need an explicit complement
--> condition as an "else" clause
{
AppendCurrentValuesTo(item); --> putting t3.x as 2nd argument
--> tbh, the order doesn't matter since t2/t3
--> are mutually exclusive
}
}
function AppendCurrentValuesTo --> the correlation between t2/t3 to someofit.id
Now, if you have actually tried this and it doesn't solve your problem, I'd like to know where it broke.
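The mutually exclusive LEFT JOIN pattern can be checked on toy data. This is a minimal sketch, assuming an in-memory SQLite database driven from Python; the column name `bla` on both lookup tables, the sample values, and the `last_legacy_id` parameter are hypothetical. Both lookup tables deliberately hold a row for every id, so that only the extra join condition decides which table supplies the value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, col1 TEXT);
    CREATE TABLE table2 (something INTEGER, bla TEXT);
    CREATE TABLE table3 (something INTEGER, bla TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- Hypothetical values: every id exists in both lookup tables.
    INSERT INTO table2 VALUES (1, 'legacy-1'),  (2, 'legacy-2'),  (3, 'legacy-3');
    INSERT INTO table3 VALUES (1, 'current-1'), (2, 'current-2'), (3, 'current-3');
""")

last_legacy_id = 3  # ids below this value are treated as legacy (assumption)

# For each row exactly one of the two joins can match, so COALESCE
# simply returns whichever side is non-NULL.
rows = conn.execute("""
    SELECT s.id, COALESCE(t2.bla, t3.bla) AS bla
    FROM table1 s
    LEFT JOIN table2 t2 ON t2.something = s.id AND s.id <  :last_legacy_id
    LEFT JOIN table3 t3 ON t3.something = s.id AND s.id >= :last_legacy_id
    ORDER BY s.id
""", {"last_legacy_id": last_legacy_id}).fetchall()

print(rows)
```

The sketch only demonstrates the logic; whether it performs well on a holding table with millions of rows depends on indexing and on the engine's plan.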
Assuming you know that there are no conflicting IDs between the two tables, you can do something like this (DB2 syntax, because that's what I know, but it should be similar):
with combined_tables as (
select ... as id, ... as bla, ...as bar, ... as foo from table 2
union all
select ... as id, ... as bla, ...as bar, ... as foo from table 3
)
select someofit.*, combined_tables.bla, combined_tables.foo, combined_tables.bar
from someofit
join combined_tables on someofit.id = combined_tables.id
If you had cases like overlapping ids, you could handle that within the combined_tables CTE.
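A small runnable version of the combined-table idea, as a sketch only: it assumes an in-memory SQLite database driven from Python, a single hypothetical value column `bla`, and sample rows in which no id appears in both lookup tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, col1 TEXT);
    CREATE TABLE table2 (something INTEGER, bla TEXT);
    CREATE TABLE table3 (something INTEGER, bla TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- Hypothetical values: ids 1 and 2 live in table2, id 3 in table3.
    INSERT INTO table2 VALUES (1, 'legacy-1'), (2, 'legacy-2');
    INSERT INTO table3 VALUES (3, 'current-3');
""")

# Stack both lookup tables into one CTE, then join it once.
rows = conn.execute("""
    WITH someofit AS (
        SELECT id, col1 FROM table1
    ),
    combined_tables AS (
        SELECT something AS id, bla FROM table2
        UNION ALL
        SELECT something AS id, bla FROM table3
    )
    SELECT someofit.id, someofit.col1, combined_tables.bla
    FROM someofit
    JOIN combined_tables ON someofit.id = combined_tables.id
    ORDER BY someofit.id
""").fetchall()

print(rows)
```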

TSQL: Get all rows for given ID

I am trying to output all of my database reports into one report. I'm currently using nested select statements to get each line for each ID (the number of IDs is unknown). Now I would like to return all the rows for every ID (e.g. 1-25 if there are 25 rows) in one query. How would I do this?
SELECT
(SELECT ... FROM ... WHERE id = x) As Col1,
(SELECT ... FROM ... WHERE id = x) As Col2,
(SELECT ... FROM ... WHERE id = x) As Col3
EDIT: Here's an example:
SELECT
(select post_id from posts where report_id = 1) As ID,
(select isnull(rank, 0) from results where report_id = 1 and url like '%www.testsite.com%') As Main,
(select isnull(rank, 0) from results where report_id = 1 and url like '%.testsite%' and url not like '%www.testsite%') As Sub
This will return the rank of a result for the main domain and the sub-domain, as well as the ID for the posts table.
ID Main Sub
--------------------------------------
1 5 0
I'd like to loop through this query and change report_id to 2, then 3, then 4 and carry on until all results are displayed. Nothing else needs to change other than the report_id.
Here's a basic example of what is inside the tables
POSTS
post_id post report_id
---------------------------------------------------------
1 "Hello, I am..." 1
2 "This may take..." 2
3 "Bla..." 2
4 "Bla..." 3
5 "Bla..." 4
RESULTS
result_id url title report_id
--------------------------------------------------------
1 http://... "Intro" 1
2 http://... "Hello!" 1
3 http://... "Question" 2
4 http://... "Help" 3
REPORTS
report_id description
---------------------------------
1 Introductions
2 Q&A
3 Starting Questions
4 Beginner Guides
5 Lectures
The query will want to pull the first post, the first result from the main website (www) and the first result from a subdomain by their report_id. These tables are part of a complicated join structure with many other tables but for these purposes these tables are the only ones that are needed.
I've managed to solve the problem by creating a table, setting variables to take all the contents and insert them in a while loop, then selecting them and dropping the table. I'll leave this open for a bit to see if anyone picks up a better way of doing it because I hate doing it this way.
If you need each report id on its own column, take a look at the PIVOT/UNPIVOT commands.
Here's one way of doing it:
SELECT posts.post_id AS ID,
IsNull(tblMain.Rank, 0) AS Main,
IsNull(tblSub.Rank, 0) AS Sub
FROM posts
LEFT JOIN results AS tblMain ON posts.report_id = tblMain.report_id AND tblMain.url like '%www.testsite.com%'
LEFT JOIN results AS tblSub ON posts.report_id = tblSub.report_id AND tblSub.url like '%.testsite%' and tblSub.url not like '%www.testsite%'
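The LEFT JOIN approach can be exercised on a few rows. This is a minimal sketch, assuming an in-memory SQLite database driven from Python. The `rank` values, the URLs, and the choice to join `posts` and `results` on `report_id` (as the question's own subqueries filter both tables by `report_id`) are assumptions for illustration; COALESCE stands in for T-SQL's IsNull.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts   (post_id INTEGER, report_id INTEGER);
    CREATE TABLE results (result_id INTEGER, url TEXT, "rank" INTEGER, report_id INTEGER);
    -- Hypothetical data: report 1 has a main-domain hit, report 2 a
    -- subdomain hit, report 3 has no results at all.
    INSERT INTO posts   VALUES (1, 1), (2, 2), (4, 3);
    INSERT INTO results VALUES
        (1, 'http://www.testsite.com/a',  5, 1),
        (2, 'http://blog.testsite.com/x', 7, 2);
""")

# Missing matches come back as NULL from the LEFT JOINs and are turned into 0.
rows = conn.execute("""
    SELECT p.post_id AS ID,
           COALESCE(m."rank", 0) AS Main,
           COALESCE(s."rank", 0) AS Sub
    FROM posts p
    LEFT JOIN results m
           ON m.report_id = p.report_id
          AND m.url LIKE '%www.testsite.com%'
    LEFT JOIN results s
           ON s.report_id = p.report_id
          AND s.url LIKE '%.testsite%'
          AND s.url NOT LIKE '%www.testsite%'
    ORDER BY p.post_id
""").fetchall()

print(rows)
```

If a report has several posts or several matching results, the joins return one row per combination, so a real query may need an extra filter or aggregation to keep only the first of each.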
That is one query? You've provided your own answer?
If you mean you want to return a series of 'rows' as, for some reason, 'columns', this ability does exist, but I can't remember the exact name. Possibly PIVOT. But it's a little odd.
See if this is what you are looking for:
SELECT
CASE WHEN reports.id = 1 THEN reports.Name
ELSE '' END AS Col1,
CASE WHEN reports.id = 2 THEN reports.Name
ELSE '' END AS Col2
....
FROM reports
Best Regards,
Iordan
Assuming you have a "master" table of IDs (if not, I suggest you create one for foreign key purposes):
SELECT
(SELECT ... FROM ... WHERE id = m.ID) As Col1,
(SELECT ... FROM ... WHERE id = m.ID) As Col2,
(SELECT ... FROM ... WHERE id = m.ID) As Col3
FROM MasterIDs m
Depending on how similar the reports are, you may be able to speed that up by moving some of the logic out of the nested statements and into the main body of the query.
Possibly a better way of thinking about this is to alter each report statement to return (ID,value) and do something like:
SELECT
report1.Id
,report1.Value AS Col1
,report2.Value AS Col2
FROM (SELECT Id, ... AS Value FROM ...) report1
JOIN (SELECT Id, ... AS Value FROM ...) report2 ON report1.Id = report2.Id
Again, depending on the similarity of your reports, you could probably combine these in some way.
