I'm in the process of rewriting a MySQL 5.7 query for Snowflake. I would like to keep the Snowflake version as close as possible to the MySQL original.
PROBLEM: I have some user-defined variables that change dynamically based on the previous value in a given column, and I'm unable to replicate that functionality. I've been able to set variables, but when I attempt to change their values as described I get a series of errors.
MySQL Query:
SELECT
temp.id,
temp.profile_id,
temp.citations,
temp.DataValue,
@rank := CASE
WHEN @prevId = temp.profile_id AND @prevDataValue = temp.DataValue THEN @rank+1
ELSE 1
END as DataValueRank,
@prevId := temp.profile_id AS prevId,
@prevDataValue := temp.DataValue AS prevDataValue
FROM
l.temp AS temp,
(SELECT @prevId := 0, @prevDataValue := 0, @rank := 0) AS X
ORDER BY
profile_id DESC,
DataValue ASC,
citations DESC
In the above query, @prevId holds the value of profile_id from the previous row, and @prevDataValue does the same for DataValue.
The table that gets created looks like the following:
id profile_id citations DataValue DataValueRank prevId prevDataValue
...
24508771 1003077033 1 E04.936.580.225 49 1003077033 E04.936.580.225
24160975 1003077033 1 E04.987 50 1003077033 E04.987
24160975 1003077033 1 E04.987.775 51 1003077033 E04.987.775
28079605 1003077025 9 C10 1 1003077025 C10
28079605 1003077025 9 C10.597 2 1003077025 C10.597
...
Where l.temp is the same as above without the last three columns:
id profile_id citations DataValue
...
24508771 1003077033 1 E04.936.580.225
24160975 1003077033 1 E04.987
24160975 1003077033 1 E04.987.775
28079605 1003077025 9 C10
28079605 1003077025 9 C10.597
...
ATTEMPT
Snowflake Query
SET (prevId, prevDataValue, rank) = (0, 0, 0); -- AS X
I'm thinking that this acts as my variable initializer, similar to
(SELECT @prevId := 0, @prevDataValue := 0, @rank := 0) AS X
SELECT
temp.id,
temp.profile_id,
temp.citations,
temp.DataValue
-- $rank = CASE
-- WHEN $prevId = temp.profile_id AND $prevDataValue = temp.DataValue THEN $rank+1
-- ELSE 1
-- END as DataValueRank, -- THIS WON'T CHANGE VALUE DEPENDING ON THE CASE
-- $prevId = temp.profile_id, -- THIS WON'T CHANGE VALUE
-- $prevDataValue = temp.DataValue -- THIS WON'T CHANGE VALUE
FROM
DATAWAREHOUSE.MY_DATA AS temp
ORDER BY
profile_id DESC,
DataValue ASC,
citations DESC
;
If anybody knows how the variables can change value within the SELECT statement, that would be helpful.
It's not possible to change the value of a SQL variable within a SELECT statement in Snowflake. The only option is to use the SET command to assign or change the value of a session variable:
https://docs.snowflake.com/en/sql-reference/session-variables.html
I understand that MySQL's variables are very useful, but Snowflake does not support the same functionality.
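Judging from the sample output in the question, the rank behaves like a per-profile row counter, which window functions can express without any variables. The following is a minimal sketch of that idea, not a verified translation of the MySQL query. It runs the SQL through Python's built-in sqlite3 module as a stand-in for Snowflake, and the table name my_data is hypothetical. ROW_NUMBER() and LAG() are assumed to behave the same way in Snowflake.

```python
import sqlite3

# SQLite is only a stand-in engine here; my_data is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_data (id INTEGER, profile_id INTEGER, citations INTEGER, DataValue TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO my_data VALUES (?, ?, ?, ?)",
    [
        (24508771, 1003077033, 1, "E04.936.580.225"),
        (24160975, 1003077033, 1, "E04.987"),
        (24160975, 1003077033, 1, "E04.987.775"),
        (28079605, 1003077025, 9, "C10"),
        (28079605, 1003077025, 9, "C10.597"),
    ],
)

QUERY = """
SELECT
    id,
    profile_id,
    citations,
    DataValue,
    -- Counter that restarts at 1 for every profile_id.
    ROW_NUMBER() OVER (
        PARTITION BY profile_id ORDER BY DataValue ASC, citations DESC
    ) AS DataValueRank,
    -- Value of DataValue on the previous row of the same profile (NULL on the first row).
    LAG(DataValue) OVER (
        PARTITION BY profile_id ORDER BY DataValue ASC, citations DESC
    ) AS prevDataValue
FROM my_data
ORDER BY profile_id DESC, DataValue ASC, citations DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Unlike the MySQL output, where prevDataValue ends up holding the current row's value, LAG() returns the value from the row before it.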
Related
I have a table in MS SQL Server where the EXPECTED result should look like this:
Prior to expected result/query execution, all FieldX values are NULL. When I run my query, FieldX is only updated from row 2 to 8.
I need to UPDATE FieldX using a set of rules, which I define as such:
WITH cte_previous_rows AS (
SELECT Date, Staff_Id, LAG(FieldX) OVER (partition by Staff_Id ORDER by [date]) as Prev_Row
FROM Sales
) UPDATE Sales
SET FieldX = (CASE
WHEN Staff_id_sales < 1500 AND ClosedSale = 0 THEN 0
WHEN Staff_id_sales = 1500 and ClosedSale = 0 THEN 5
WHEN Staff_id_sales <= 3000 and Staff_id_sales > 1500 and ClosedSale = 0 THEN 1
WHEN Staff_id_sales > 3000 and (c.Prev_Row = 1 OR c.Prev_Row = 0) THEN 2
WHEN Staff_id_sales > 3000 and (c.Prev_Row = 2 or c.Prev_Row = 3) THEN 3
ELSE FieldX
END)
FROM Sales
JOIN cte_previous_rows as c ON Sales.staff_id = c.staff_id AND Sales.Date = c.Date;
This query works just fine, but the problem lies in the last two WHEN branches. The reason is that c.Prev_Row (the previous row's FieldX) is used in the rules for these two branches, and it is still NULL when the statement runs.
How can I edit my query so that the above rule set is applied to all 50k rows in a SINGLE execution? Perhaps a new method is required.
A recursive CTE that works from the earliest row for each Staff_Id forward may be the ticket:
Note: This query was not run on an image of the data, so it might have some errors.
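A minimal sketch of the recursive-CTE idea, run through Python's sqlite3 module rather than SQL Server. The five sample rows are hypothetical, since the question's data was not included. Each step of the recursion computes FieldX for one row from the value just computed for the previous row of the same Staff_Id. SQL Server would spell the CTE without the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (Date TEXT, Staff_Id INTEGER, Staff_id_sales INTEGER,
                    ClosedSale INTEGER, FieldX INTEGER);
-- Hypothetical rows; FieldX starts out NULL everywhere, as in the question.
INSERT INTO Sales VALUES
    ('2020-01-01', 1, 1000, 0, NULL),
    ('2020-01-02', 1, 3500, 0, NULL),
    ('2020-01-03', 1, 4000, 0, NULL),
    ('2020-01-04', 1, 4500, 0, NULL),
    ('2020-01-05', 1, 2000, 0, NULL);
""")

QUERY = """
WITH RECURSIVE ordered AS (
    -- Number each staff member's rows by date so the walk can go row by row.
    SELECT Date, Staff_Id, Staff_id_sales, ClosedSale, FieldX,
           ROW_NUMBER() OVER (PARTITION BY Staff_Id ORDER BY Date) AS rn
    FROM Sales
),
walk AS (
    -- Anchor: the earliest row per staff member has no previous row.
    SELECT Date, Staff_Id, rn,
           CASE
               WHEN Staff_id_sales < 1500 AND ClosedSale = 0 THEN 0
               WHEN Staff_id_sales = 1500 AND ClosedSale = 0 THEN 5
               WHEN Staff_id_sales <= 3000 AND Staff_id_sales > 1500 AND ClosedSale = 0 THEN 1
               ELSE FieldX
           END AS NewX
    FROM ordered
    WHERE rn = 1
    UNION ALL
    -- Step: w.NewX is the value just computed for the previous row.
    SELECT o.Date, o.Staff_Id, o.rn,
           CASE
               WHEN o.Staff_id_sales < 1500 AND o.ClosedSale = 0 THEN 0
               WHEN o.Staff_id_sales = 1500 AND o.ClosedSale = 0 THEN 5
               WHEN o.Staff_id_sales <= 3000 AND o.Staff_id_sales > 1500 AND o.ClosedSale = 0 THEN 1
               WHEN o.Staff_id_sales > 3000 AND w.NewX IN (0, 1) THEN 2
               WHEN o.Staff_id_sales > 3000 AND w.NewX IN (2, 3) THEN 3
               ELSE o.FieldX
           END
    FROM walk AS w
    JOIN ordered AS o ON o.Staff_Id = w.Staff_Id AND o.rn = w.rn + 1
)
SELECT Date, NewX FROM walk ORDER BY Staff_Id, rn
"""

computed = conn.execute(QUERY).fetchall()
for date, new_x in computed:
    print(date, new_x)
```

The computed values could then be joined back to Sales on Staff_Id and Date in a single UPDATE.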
I am trying to update my table's data (1=>3, 2=>1, 3=>2) by swapping the values using the queries below.
/* Temporarily set 1 to a dummy unused value of 11
so they are disambiguated from those set to 1 in the next step */
update <tablename>
set id = 11
where id = 1
update <tablename>
set id = 1
where id = 2
update <tablename>
set id = 2
where id = 3
update <tablename>
set id = 3
where id = 11
Wondering if I can optimize my script.
You can just use case. Conceptually the operation happens "all at once" so there's no need to use a fourth dummy value as in your sequential approach.
UPDATE YourTable
SET ID = CASE ID WHEN 1 THEN 3
WHEN 2 THEN 1
WHEN 3 THEN 2
END
WHERE ID IN (1,2,3)
Though changing ids is unusual as they should generally be immutable.
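A small runnable sketch of the "all at once" behaviour, using Python's sqlite3 module as a stand-in engine. The table and its label column are hypothetical and exist only to make the result visible. The id column is deliberately not a key here, because some engines check uniqueness row by row during the update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: label lets us see which row received which id.
conn.execute("CREATE TABLE YourTable (id INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

# Every CASE branch sees the row's original id, so no dummy value is needed.
conn.execute("""
UPDATE YourTable
SET id = CASE id WHEN 1 THEN 3
                 WHEN 2 THEN 1
                 WHEN 3 THEN 2
         END
WHERE id IN (1, 2, 3)
""")

swapped = conn.execute("SELECT label, id FROM YourTable ORDER BY label").fetchall()
print(swapped)
```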
Let's say I have the following table:
camp_1, camp_2
0, 048
00, 048
000, 042
000, 043
I now want to insert these values into a new table dim_promotion, which should look like this:
PromotionID, CampaignID, CouponID
1, 1, 1
2, 2, 1
3, 3, 2
4, 3, 3
I know how I can fill the tables (dim_campaigns and dim_coupons which stand behind CampaignID and CouponID) by doing this:
INSERT INTO [REPORTING].dbo.dim_campaigns
SELECT DISTINCT
camp_2 AS CampaignID
FROM [reporting2].[dbo].[reporting_rawdata_v2]
The primary keys for the respective three tables (dim_coupons, dim_campaigns and dim_promotion) are all set to "Identity Specification = YES"
So how do I then fill dim_promotion? What is the natural order? First derive dim_campaigns and dim_coupons and then insert into dim_promotion, or the other way round?
EDIT: I have the following DB model (only an excerpt). I am only referring to the yellow fields.
Right now, we have only one large table (rawdatatbl) where all data is stored (CampaignCode, CouponCode, CampaignName, CouponName). This is not really efficient, and that's why I want to completely restructure the model (see the screenshot). As the data currently does not contain any IDs at all, I need IDs to fill the new tables dim_campaigns and dim_coupons. This means: to fill dim_campaigns I would run a SELECT DISTINCT campaign_code on my current rawdatatbl and then insert the result into dim_campaigns (CampaignID is filled automatically by auto-increment and CampaignName is filled with 'Dummy'). I could do the same for dim_coupons. But how can I then use this data to initialize the dim_promotion table? Or what is the best process to transform my current data from rawdatatbl into IDs?
One possible way is:
---- insert your all distinct Campaign data
Insert Into dim_campaigns(CampaignCode,CampaignName)
Select Distinct
cm.CampaignCode
,cm.CampaignName
From rawdatatbl As cm
---- insert your all distinct Coupon data
Insert Into dim_coupons(CouponCode,CouponName)
Select Distinct
c.CouponCode
,c.CouponName
From rawdatatbl As c
Declare @Total Int
,@Inc Int
,@RowId Int
,@CampaignCode Varchar(100)
,@CouponCode Varchar(100)
----Primary keys
,@CampaignId Int
,@CouponId Int
,@PromotionID Int
Select @CampaignCode = ''
,@CouponCode = ''
----loop through one by one and take necessary action
Select @Total = Count(1)
,@Inc = 0
,@RowId = 0
From rawdatatbl As c With (Nolock)
While (@Inc < @Total)
Begin
Select Top 1
@RowId = [TableUniqueId]
,@CampaignCode = c.CampaignCode
,@CouponCode = c.CouponCode
---- others columns
From rawdatatbl As c
Where [TableUniqueId] > @RowId
Order By [TableUniqueId] Asc
Select @CampaignId = dc.CampaignId
From dim_campaigns As dc
Where dc.CampaignCode = @CampaignCode
Select @CouponId = dc.CouponId
From dim_coupons As dc
Where dc.CouponCode = @CouponCode
If (@CampaignId > 0 And @CouponId > 0)
Begin
Insert Into dim_promotion(CampaignID,CouponID)
Select @CampaignId,@CouponId
Select @PromotionID = @@Identity
---- other operation insert/update based on @PromotionID you can do here.
End
Select @Inc = @Inc + 1
,@CampaignCode = ''
,@CouponCode = ''
,@CampaignId = 0
,@CouponId = 0
End
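If no per-row work is needed inside the loop, the same result can probably be had with a set-based INSERT ... SELECT that joins the raw rows back to the two dimension tables. The sketch below is a different technique from the loop above, shown through Python's sqlite3 module with the four sample rows from the question. It assumes camp_1 is the campaign code and camp_2 the coupon code (inferred from the expected output), and the table name rawdata is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rawdata (camp_1 TEXT, camp_2 TEXT);
CREATE TABLE dim_campaigns (CampaignID INTEGER PRIMARY KEY AUTOINCREMENT, CampaignCode TEXT);
CREATE TABLE dim_coupons (CouponID INTEGER PRIMARY KEY AUTOINCREMENT, CouponCode TEXT);
CREATE TABLE dim_promotion (PromotionID INTEGER PRIMARY KEY AUTOINCREMENT,
                            CampaignID INTEGER, CouponID INTEGER);

-- Sample rows from the question; codes stay TEXT so leading zeros survive.
INSERT INTO rawdata VALUES ('0', '048'), ('00', '048'), ('000', '042'), ('000', '043');

-- Step 1: fill both dimensions with distinct codes, in order of first appearance.
INSERT INTO dim_campaigns (CampaignCode)
SELECT camp_1 FROM rawdata GROUP BY camp_1 ORDER BY MIN(rowid);

INSERT INTO dim_coupons (CouponCode)
SELECT camp_2 FROM rawdata GROUP BY camp_2 ORDER BY MIN(rowid);

-- Step 2: translate every raw row into its two generated IDs in one statement.
INSERT INTO dim_promotion (CampaignID, CouponID)
SELECT c.CampaignID, p.CouponID
FROM rawdata AS r
JOIN dim_campaigns AS c ON c.CampaignCode = r.camp_1
JOIN dim_coupons AS p ON p.CouponCode = r.camp_2
ORDER BY r.rowid;
""")

promotions = conn.execute(
    "SELECT PromotionID, CampaignID, CouponID FROM dim_promotion ORDER BY PromotionID"
).fetchall()
print(promotions)
```

So the natural order is dimensions first, then dim_promotion, because the promotion rows need the generated keys to exist.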
I have a table called Settings with columnA, columnB, columnC, columnD and columnE, holding the following values:
columnA = 1000 columnB = 100 columnC = 200 columnD = 18 columnE = 6
I want to change the values in columnA/B/C when the time is between 18:00 and 06:00.
I'm thinking of some kind of trigger that updates the values by looking at the timestamp, but I just don't know how to do that. Any ideas?
If you can control the application reading the table then could you create a view that checks the time and returns the values you require?
Something like:
SELECT
[COLUMN1],
CASE
WHEN DATEPART("hh", GETDATE()) BETWEEN 6 AND 14 THEN 1
ELSE [COLUMN2] END AS [COLUMN2],
[COLUMN3]
FROM [TABLE1]
Edit:
If you're querying the table via SQL built within your app you can alter the SQL query to return these values with the CASE above:
CASE
WHEN DATEPART("hh", GETDATE()) BETWEEN 6 AND 14 THEN 1
ELSE [COLUMN2] END AS [COLUMN2],
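For the 18:00 to 06:00 window in the question, the range wraps past midnight, so a single BETWEEN will not cover it; the usual test is "hour >= start OR hour < end". Here is a minimal sketch through Python's sqlite3 module. It assumes columnD and columnE hold the start and end hours, and the night-time value of 0 is purely hypothetical. The hour is passed in as a parameter so the example is repeatable; in SQL Server it would come from DATEPART(hour, GETDATE()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Settings (columnA INTEGER, columnB INTEGER, columnC INTEGER,"
    " columnD INTEGER, columnE INTEGER)"
)
# Values from the question; columnD/columnE are assumed to be the window bounds.
conn.execute("INSERT INTO Settings VALUES (1000, 100, 200, 18, 6)")

NIGHT_VALUE_A = 0  # hypothetical override used during the night window

QUERY = """
SELECT CASE
           -- The window wraps around midnight, hence OR instead of BETWEEN.
           WHEN :hour >= columnD OR :hour < columnE THEN :night_a
           ELSE columnA
       END AS columnA
FROM Settings
"""


def column_a_at(hour):
    """Return the effective columnA value for the given hour of day (0-23)."""
    params = {"hour": hour, "night_a": NIGHT_VALUE_A}
    return conn.execute(QUERY, params).fetchone()[0]


for hour in (3, 6, 12, 18, 23):
    print(hour, column_a_at(hour))
```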
I have a simple table that stores stock levels. ie.
ID int PK
LocationID int
StockLevel real
There could be multiple rows in this table for each location ie:
ID | LocationID | StockLevel
----------------------------
1 | 1 | 100
2 | 1 | 124
3 | 2 | 300
In this example it's trivial to see that 224 units exist at location 1.
When I come to decrement the stock level at location 1, I use a cursor to iterate over all rows where LocationID is 1, with some simple logic to decide whether the stock available at the current row will satisfy the passed-in decrement value. If the row has sufficient quantity to satisfy the requirement, I decrement the row's value, break out of the cursor and end the procedure. If the row doesn't have sufficient quantity available, I decrement its value to zero, move to the next row and try again (with the reduced quantity).
It's quite simple and works OK, but the inevitable question is: is there a way of performing this RBAR operation without a cursor? I have attempted to search for alternatives, but even wording the search criteria for such an operation is painful!
Thanks in advance
Nick
ps. I am storing data in this format because each row also contains other columns that are unique, and hence can't simply be aggregated into one row for each location.
pps. Cursor logic as requested (where '@DecrementStockQuantityBy' is the quantity that we need to reduce the stock level by at the specified location):
WHILE @@FETCH_STATUS = 0
BEGIN
IF CurrentRowStockStockLevel >= @DecrementStockQuantityBy
BEGIN
--This row has enough stock to satisfy decrement request
--Update Quantity on the Current Row by @DecrementStockQuantityBy
--End Procedure
BREAK
END
IF CurrentRowStockStockLevel < @DecrementStockQuantityBy
BEGIN
--Update CurrentRowStockStockLevel to Zero
--Reduce @DecrementStockQuantityBy by CurrentRowStockStockLevel
--Repeat until @DecrementStockQuantityBy is zero or end of rows reached
END
FETCH NEXT FROM Cursor
END
Hope this is clear enough? Let me know if further/better explanation is required.
Thanks
You are correct, sir: a simple UPDATE statement can help you in this scenario. I'm still trying to find a legitimate use for a cursor or WHILE loop that I can't solve with a CTE or a set-based approach.
After looking a little deeper into your question, I will also propose an alternate solution:
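Declare @LocationValue int = 1, @DecrementValue int = 20;
-- Number the rows at the location so the recursion can walk them in ID order.
with ordered as (
select ID, StockLevel, row_number() over (order by ID) as rn
from simpleTable
where LocationID = @LocationValue
),
temp (ID, StockLevel, remaining, rn) as (
-- Anchor: take as much as possible from the first row.
select ID,
case when StockLevel > @DecrementValue then StockLevel - @DecrementValue else 0 end,
case when StockLevel > @DecrementValue then 0 else @DecrementValue - StockLevel end,
rn
from ordered
where rn = 1
union all
-- Step: carry the unsatisfied quantity on to the next row while any remains.
select o.ID,
case when o.StockLevel > t.remaining then o.StockLevel - t.remaining else 0 end,
case when o.StockLevel > t.remaining then 0 else t.remaining - o.StockLevel end,
o.rn
from temp t
inner join ordered o on o.rn = t.rn + 1
where t.remaining > 0 )
-- Note: recursion stops after 100 levels by default (see OPTION (MAXRECURSION n)).
update st
set st.StockLevel = t.StockLevel
from simpleTable st
inner join temp t on t.ID = st.ID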
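The cursor logic from the question can also be sketched with a running total instead of recursion: each row only needs to know how much stock the rows before it could already supply. This is a different technique from the recursive CTE, run here through Python's sqlite3 module with the three sample rows from the question. The table name Stock and the decrement of 150 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Stock (ID INTEGER PRIMARY KEY, LocationID INTEGER, StockLevel REAL)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO Stock VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 124), (3, 2, 300)],
)

QUERY = """
WITH running AS (
    -- supplied_before = total stock held by earlier rows at this location.
    SELECT ID, StockLevel,
           SUM(StockLevel) OVER (ORDER BY ID) - StockLevel AS supplied_before
    FROM Stock
    WHERE LocationID = :loc
)
SELECT ID,
       CASE
           -- Earlier rows already covered the request: leave this row alone.
           WHEN supplied_before >= :qty THEN StockLevel
           -- Request still not covered after this row: drain it completely.
           WHEN supplied_before + StockLevel <= :qty THEN 0
           -- This row covers the rest of the request: keep the leftover.
           ELSE supplied_before + StockLevel - :qty
       END AS NewLevel
FROM running
"""

# Hypothetical request: take 150 units from location 1.
new_levels = conn.execute(QUERY, {"loc": 1, "qty": 150}).fetchall()
conn.executemany(
    "UPDATE Stock SET StockLevel = ? WHERE ID = ?",
    [(level, row_id) for row_id, level in new_levels],
)

stock_after = conn.execute("SELECT ID, StockLevel FROM Stock ORDER BY ID").fetchall()
print(stock_after)
```

In SQL Server the SELECT could feed an UPDATE ... FROM join directly, so the whole decrement would be one statement.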