snowflake unsupported subquery cannot be evaluated - snowflake-cloud-data-platform

Table TEMP has a customer hash, an effective start date and an effective end date. Table CDTLS has a customer hash and an effective start date. I want to select the customer hash, effective from date and customer name from TEMP and CDTLS. I am calculating the CDTLS end date on the fly and comparing it with the TEMP.EFFECTIVE_FROM and TEMP.EFFECTIVE_TO dates. I get an error that unsupported subquery cannot be evaluated.
SELECT
TEMP.CUSTOMER_HASH,
TEMP.EFFECTIVE_FROM,
TEMP.EFFECTIVE_TO,
CDTLS.NAME
FROM TEMP
LEFT JOIN CDTLS
ON
TEMP.CUSTOMER_HASH = CDTLS.CUSTOMER_HASH
AND
CDTLS.EFFECTIVE_FROM <= TEMP.EFFECTIVE_FROM
AND
(
SELECT VW.EFFECTIVE_TO FROM
(
SELECT CUSTOMER_HASH, EFFECTIVE_FROM, LEAD(EFFECTIVE_FROM, 1, '9999-12-31') OVER (PARTITION
BY CUSTOMER_HASH ORDER BY EFFECTIVE_FROM ASC) AS EFFECTIVE_TO
FROM CUST_DETAILS
) AS VW
WHERE CDTLS.CUSTOMER_HASH = VW.CUSTOMER_HASH AND CDTLS.EFFECTIVE_FROM = VW.EFFECTIVE_FROM
) >= TEMP.EFFECTIVE_TO
;

I suppose you wanted to run this query:
SELECT
TEMP.CUSTOMER_HASH,
TEMP.EFFECTIVE_FROM,
TEMP.EFFECTIVE_TO,
CDTLS.NAME
FROM TEMP
LEFT join CDTLS
ON
TEMP.CUSTOMER_HASH = CDTLS.CUSTOMER_HASH
AND
CDTLS.EFFECTIVE_FROM <= TEMP.EFFECTIVE_FROM
left join (
SELECT CUSTOMER_HASH, EFFECTIVE_FROM, LEAD(EFFECTIVE_FROM, 1, '9999-12-31') OVER (PARTITION
BY CUSTOMER_HASH ORDER BY EFFECTIVE_FROM ASC) AS EFFECTIVE_TO
FROM CUST_DETAILS
) AS VW on CDTLS.CUSTOMER_HASH = VW.CUSTOMER_HASH AND CDTLS.EFFECTIVE_FROM = VW.EFFECTIVE_FROM
where
VW.EFFECTIVE_TO >= TEMP.EFFECTIVE_TO

You could also try wrapping the subquery's selected column in an aggregate such as MIN, MAX or LISTAGG, so that the subquery is guaranteed to return a single (scalar) value, and check whether that helps.
https://docs.snowflake.net/manuals/user-guide/querying-subqueries.html#differences-between-correlated-and-non-correlated-subqueries
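The join-based rewrite can be tried out on a small scale. The following is a minimal sketch, not the answerer's exact query: it uses Python's built-in sqlite3 module (SQLite 3.25+ for window functions) instead of Snowflake, the sample rows and the table name temp_t are made up, and it folds the computed end date into a CTE so that a single LEFT JOIN carries all three conditions.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE temp_t (customer_hash TEXT, effective_from TEXT, effective_to TEXT);
CREATE TABLE cust_details (customer_hash TEXT, effective_from TEXT, name TEXT);
INSERT INTO temp_t VALUES
  ('A', '2020-02-01', '2020-03-01'),
  ('A', '2020-07-01', '2020-08-01'),
  ('B', '2020-01-01', '2020-02-01');
INSERT INTO cust_details VALUES
  ('A', '2020-01-01', 'Alice v1'),
  ('A', '2020-06-01', 'Alice v2');
""")

# Compute the end date once with LEAD in a CTE, then join on plain columns
# instead of using a correlated scalar subquery inside the ON clause.
query = """
WITH cdtls AS (
  SELECT customer_hash, effective_from, name,
         LEAD(effective_from, 1, '9999-12-31')
           OVER (PARTITION BY customer_hash ORDER BY effective_from) AS effective_to
  FROM cust_details
)
SELECT t.customer_hash, t.effective_from, t.effective_to, c.name
FROM temp_t AS t
LEFT JOIN cdtls AS c
  ON  t.customer_hash = c.customer_hash
  AND c.effective_from <= t.effective_from
  AND c.effective_to   >= t.effective_to
ORDER BY t.customer_hash, t.effective_from
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Keeping the date comparison in the ON clause rather than in a WHERE clause means TEMP rows without a matching CDTLS row are still returned, with a NULL name.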

Related

Using recursion with a CTE in SQL Server

I have the following table structure (this is just a sample set with exactly the same columns as my final output query).
The actual data has many more rows per index, and I have to remove a few symbols before arriving at the index value. This is a custom index to be built for internal use.
https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=b1d5ed7db79c665d8cc179ae4cc7d4f1
This is the link to the fiddle with the SQL data.
Below is an image of the same:
I want to calculate each symbol's point contribution to the index value and, finally, the index value itself.
The formula for the points contribution of each symbol is:
ptsC = yesterday_index * wt * px_change / yest_close
I do not have a beginning value for yesterday's index, i.e. for 17 Nov 2021; it should be taken as 1000.
The index value for 18 Nov will then be 1000 + sum(ptsC).
This value should then be used to calculate ptsC for each symbol for 22 Nov, and so on.
I am trying to write a recursive CTE but am not sure where I am going wrong.
Yesterday's index value should be determined recursively, and ptsC calculated from it.
The final output should be:
where the total point contribution is the sum of all the ptsC values for the day, and the new index value is yesterday's index value plus the total point contribution.
Below is the code I have, which generates the first table:
declare @beginval as float=17671.65
set @beginval=1000
declare @indexname varchar(20)='NIFTY ENERGY'
declare @mindt as datetime
select @mindt=min(datetime) from indices_json where indexname=@indexname
;
with tbl as (
SELECT IndexName, datetime, sum(Indexmcap_today) totalMcap_today,sum(Indexmcap_yst) totalmcap_yst
FROM indices_json
WHERE IndexName = @indexname
group by indexname,datetime
)
,tbl2 as
(
select j.indexname,j.datetime,symbol,Indexmcap_today/d.totalMcap_today*100 calc_wt_today,Indexmcap_yst/d.totalmcap_yst*100 calc_wt_yest,iislPtsChange,adjustedClosePrice,pointchange
from indices_json j inner join tbl d on d.datetime=j.datetime and d.IndexName=j.IndexName
)
,tbl3 as
(
select indexname,datetime,symbol,calc_wt_today,calc_wt_yest,iislPtsChange,adjustedClosePrice,pointchange
,case when datetime=@mindt then @beginval*calc_wt_yest*iislPtsChange/adjustedClosePrice/100 else null end ptsC
from tbl2
)
,tbl4 as
(
select indexname,datetime,sum(ptsC) + @beginval NewIndexVal,sum(pointchange) PTSCC
from tbl3
group by indexname,datetime
)
,tbl5 as
(
select *,lag(datetime,1,null) over(order by datetime asc) yest_dt
from tbl4
)
,
tbl6 as
(
select d.*,s.yest_dt
from tbl2 d inner join tbl5 s on d.datetime=s.datetime
)
,tbl7 as
(
select d.IndexName,d.datetime,d.symbol,d.calc_wt_today,d.calc_wt_yest,d.iislPtsChange,d.adjustedClosePrice,d.pointchange,case when i.datetime is null then @beginval else i.NewIndexVal end yest_index
from tbl6 d left join tbl4 i on d.yest_dt=i.datetime
)
select IndexName,convert(varchar(12),datetime,106)date,symbol,round(calc_wt_yest,4) wt,iislPtsChange px_change,adjustedClosePrice yest_close--,pointchange,yest_index
from tbl7 d where datetime <='2021-11-24'
order by datetime
Thanks in advance.
I found a solution for this:
I calculated the return of each constituent for each date,
then summed these returns per date,
then multiplied (1 + the summed return) across all dates to arrive at the final value. This works.
Below is the query for the same. I did not need recursion here.
declare @beginval as float=17671.65
declare @indexname varchar(20)='NIFTY 50'
declare @mindt as datetime
select @mindt=min(datetime) from indices_json where indexname=@indexname
declare @startdt as datetime = '2021-11-01'
;
with tbl as (
SELECT IndexName, datetime, sum(Indexmcap_today) totalMcap_today,sum(Indexmcap_yst) totalmcap_yst
FROM indices_json
WHERE IndexName = @indexname-- and symbol!='AXISBANK'
group by indexname,datetime
)
,tbl2 as
(
select j.indexname,j.datetime,symbol,Indexmcap_today/d.totalMcap_today*100 calc_wt_today,Indexmcap_yst/d.totalmcap_yst*100 calc_wt_yest,iislPtsChange,adjustedClosePrice,pointchange
from indices_json j inner join tbl d on d.datetime=j.datetime and d.IndexName=j.IndexName
)
,tbl7 as
(
select d.IndexName,d.datetime,d.symbol,d.calc_wt_today,d.calc_wt_yest,d.iislPtsChange,d.adjustedClosePrice,d.pointchange, d.calc_wt_yest*d.iislPtsChange/d.adjustedClosePrice/100 ret
from tbl2 d
)
,tbl8 as
(
select indexname,datetime,1+sum(ret) tot_ret from tbl7 group by indexname,datetime
)
select indexname,datetime date
,round(exp(sum(log(sum(tot_ret))) over (partition by IndexName order by datetime)),6)*@beginval final_Ret
from tbl8 where datetime>=@startdt
group by indexname,datetime order by date
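Why recursion is not needed can be checked with a few lines of arithmetic. Since every ptsC for a day shares the factor yesterday_index, the new index is yesterday_index * (1 + sum of the per-symbol returns), so the whole series is a running product, and a running product can be written as exp of a running sum of logs. The sketch below is plain Python rather than T-SQL, and the daily return values are hypothetical, chosen only to illustrate the equivalence.

```python
import math

begin_val = 1000.0
# Hypothetical per-day totals of weighted constituent returns (sum(ret) per date).
daily_returns = [0.01, -0.02, 0.005]

# Recursive definition: index_t = index_{t-1} * (1 + r_t)
recursive = []
level = begin_val
for r in daily_returns:
    level = level * (1.0 + r)
    recursive.append(level)

# Non-recursive form, mirroring exp(sum(log(...)) over (order by date)) * begin value.
closed_form = []
log_sum = 0.0
for r in daily_returns:
    log_sum += math.log(1.0 + r)  # requires 1 + r > 0
    closed_form.append(begin_val * math.exp(log_sum))

for a, b in zip(recursive, closed_form):
    print(round(a, 6), round(b, 6))
```

The log trick only works while every daily factor (1 + r) is positive, which holds for ordinary index returns.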

Display of online users on the system

I don't know exactly where I'm going wrong, but I need a list of all the workers who are currently at work (for the current day). This is my SQL query:
SELECT
zp.ID,
zp.USER_ID,
zp.Arrive,
zp.Deppart,
zp.DATUM
FROM time_recording as zp
INNER JOIN personal AS a on zp.USER_ID = zp.USER_ID
WHERE zp.Arrive IS NOT NULL
AND zp.Deppart IS NULL
AND zp.DATUM = convert(date, getdate())
ORDER BY zp.ID DESC
This is what the data looks like with my query:
My question is: how can I correct my query so that I only get the last Arrive time for the current day for each user?
In this case, I want to get only these values:
Try the script below, which uses ROW_NUMBER:
SELECT * FROM
(
SELECT zp.ID, zp.USER_ID, zp.Arrive, zp.Deppart, zp.DATUM,
ROW_NUMBER() OVER(PARTITION BY zp.User_id ORDER BY zp.Arrive DESC) RN
FROM time_recording as zp
INNER JOIN personal AS a
on zp.USER_ID = zp.USER_ID
-- You need to adjust the join condition above, because both sides refer to the same table
-- Also, since nothing is selected from table personal, you can drop the JOIN entirely
WHERE zp.Arrive IS NOT NULL
AND zp.Deppart IS NULL
AND zp.DATUM = convert(date, getdate())
)A
WHERE RN =1
You can try this:
SELECT DISTINCT
USER_ID,
LAR.LastArrive
FROM time_recording as tr
CROSS APPLY (
SELECT
MAX(Arrive) as LastArrive
FROM time_recording as ta
WHERE
tr.USER_ID = ta.USER_ID AND
ta.Arrive IS NOT NULL
) as LAR
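To see the ROW_NUMBER approach return one open arrival per user, here is a minimal sketch that runs against SQLite through Python's sqlite3 module (SQLite 3.25+ for window functions) rather than SQL Server. The sample rows are invented, the join to personal is dropped, and the current date is passed as a fixed parameter instead of getdate() so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_recording (id INTEGER, user_id INTEGER, arrive TEXT, deppart TEXT, datum TEXT);
INSERT INTO time_recording VALUES
  (1, 10, '2024-01-15 07:58', '2024-01-15 12:00', '2024-01-15'),
  (2, 10, '2024-01-15 12:45', NULL,               '2024-01-15'),
  (3, 20, '2024-01-15 08:10', NULL,               '2024-01-15'),
  (4, 20, '2024-01-15 09:30', NULL,               '2024-01-15'),
  (5, 30, '2024-01-14 08:00', NULL,               '2024-01-14');
""")

today = "2024-01-15"  # hypothetical "current day", fixed for reproducibility

# Number each user's open arrivals from newest to oldest, then keep number 1.
query = """
SELECT id, user_id, arrive
FROM (
  SELECT id, user_id, arrive,
         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY arrive DESC) AS rn
  FROM time_recording
  WHERE arrive IS NOT NULL AND deppart IS NULL AND datum = ?
) AS ranked
WHERE rn = 1
ORDER BY user_id
"""
rows = conn.execute(query, (today,)).fetchall()
for row in rows:
    print(row)
```

User 30 is absent from the result because the only open row for that user belongs to the previous day.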

Returning a single date with max(date)

I have the following query. I want to retrieve a list of unique Object ID's with the value closest to a specified date:
INSERT INTO #temp
(
[Object ID]
,[Waarde]
,[Kenmerk]
)
select DISTINCT PME.OBJECTID,
LEFT(PME.OBJECTSCORINGVALUE,LEN(PME.OBJECTSCORINGVALUE)-2),
'P3'
FROM PMEOBJECTSCORINGPOINTS PME
LEFT JOIN PMEOBJECTSCORINGHISTORY PMEH ON PME.OBJECTSCORINGHISTORYID = PMEH.OBJECTSCORINGHISTORYID
INNER JOIN(SELECT OBJECTSCORINGHISTORYID, MAX(DATE) DATE
FROM PMEOBJECTSCORINGHISTORY
WHERE DATE < DATEFROMPARTS(YEAR(getdate())-1, 12, 31)
GROUP BY OBJECTSCORINGHISTORYID) P3 ON PME.OBJECTSCORINGHISTORYID = P3.OBJECTSCORINGHISTORYID
AND PMEH.DATE = P3.DATE
AND PME.ATTRIBUTEID = 'Energie-idx'
AND PME.OBJECTSCORINGVALUE <> ''
------------------
select * from #temp
order by [Object ID], [Kenmerk] ASC
When a certain Object ID only has one known value before 2019-12-31, I get one record in the result set. However, if an Object ID has two (or more) known values before that date, I still get multiple results instead of the value for the date closest to 2019-12-31.
Any pointers on how to get the desired results? Thanks in advance!
(edit: apologies for the bad readability of the code, thanks for fixing it)
Use the analytic function ROW_NUMBER() if you can sort on a column, perhaps P3.Date or PME.OBJECTSCORINGVALUE:
select Objectid, OBJECTSCORINGVALUE, Pcol
from(
select PME.OBJECTID,
LEFT(PME.OBJECTSCORINGVALUE,LEN(PME.OBJECTSCORINGVALUE)-2) OBJECTSCORINGVALUE,
'P3' Pcol, P3.Date,
-- partition by OBJECTID only, so that a single row per object survives
row_number() over (partition by PME.OBJECTID order by P3.Date DESC) rn
FROM PMEOBJECTSCORINGPOINTS PME
LEFT JOIN PMEOBJECTSCORINGHISTORY PMEH ON PME.OBJECTSCORINGHISTORYID = PMEH.OBJECTSCORINGHISTORYID
INNER JOIN(SELECT OBJECTSCORINGHISTORYID, MAX(DATE) DATE
FROM PMEOBJECTSCORINGHISTORY
WHERE DATE < DATEFROMPARTS(YEAR(getdate())-1, 12, 31)
GROUP BY OBJECTSCORINGHISTORYID) P3 ON PME.OBJECTSCORINGHISTORYID = P3.OBJECTSCORINGHISTORYID
AND PMEH.DATE = P3.DATE
AND PME.ATTRIBUTEID = 'Energie-idx'
AND PME.OBJECTSCORINGVALUE <> ''
) t where rn=1
Here I order the rows by P3.Date descending and return only the most recent one.
This is guaranteed to return only one row per partition. However, if two rows tie on the date, which of them is returned is not deterministic, so you have to be sure about your data.
I fixed the problem. Turned out I did an incorrect join (wrong level of granularity). I should have done it on OBJECTID instead of OBJECTSCORINGHISTORYID. The result was that the max(date) was returned for OBJECTSCORINGHISTORYID instead of on the level of OBJECTID.
This is the correct query:
INSERT INTO #temp
(
[Object ID]
,[Waarde]
,[Kenmerk]
)
select PME.OBJECTID,
LEFT(PME.OBJECTSCORINGVALUE,LEN(PME.OBJECTSCORINGVALUE)-2),
'P3'
FROM PMEOBJECTSCORINGPOINTS PME
LEFT JOIN PMEOBJECTSCORINGHISTORY PMEH ON PME.OBJECTSCORINGHISTORYID = PMEH.OBJECTSCORINGHISTORYID
INNER JOIN(SELECT OBJECTID, MAX(DATE) DATE
FROM PMEOBJECTSCORINGHISTORY
WHERE DATE < DATEFROMPARTS(YEAR(getdate())-1, 12, 31)
GROUP BY OBJECTID) P3 ON PME.OBJECTID = P3.OBJECTID
AND PMEH.DATE = P3.DATE
AND PME.ATTRIBUTEID = 'Energie-idx'
AND PME.OBJECTSCORINGVALUE <> ''
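The granularity issue is easy to reproduce in isolation. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 module, a simplified hypothetical history table (history_id, object_id, score_date) standing in for PMEOBJECTSCORINGHISTORY, and it assumes each history ID identifies a single scoring event.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE history (history_id INTEGER, object_id TEXT, score_date TEXT);
INSERT INTO history VALUES
  (1, 'OBJ-1', '2019-03-01'),
  (2, 'OBJ-1', '2019-11-15'),
  (3, 'OBJ-2', '2019-06-30');
""")

# Grouping by the history ID: every history row is its own group, so MAX()
# filters nothing and OBJ-1 keeps both of its dates.
per_history = conn.execute(
    "SELECT history_id, MAX(score_date) FROM history "
    "GROUP BY history_id ORDER BY history_id"
).fetchall()

# Grouping by the object ID: one latest date per object, as intended.
per_object = conn.execute(
    "SELECT object_id, MAX(score_date) FROM history "
    "GROUP BY object_id ORDER BY object_id"
).fetchall()

print(per_history)
print(per_object)
```

With the first grouping, joining back on the date still matches every history row, which is why multiple records per Object ID appeared in the result set.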

SQL Query Get Last record Group by multiple fields

Hi, I have a table with the following fields:
ALERTID POLY_CODE ALERT_DATETIME ALERT_TYPE
I need to query the table above for records in the last 24 hours.
Then I need to group by POLY_CODE and ALERT_TYPE and get the latest ALERT_LEVEL value, ordered by ALERT_DATETIME.
I can get this far, but I also need the ALERTID of the resulting records.
Any suggestions for an efficient way of getting this?
I have created a query in SQL Server. See below:
SELECT POLY_CODE, ALERT_TYPE, X.ALERT_LEVEL AS LAST_ALERT_LEVEL
FROM
(SELECT * FROM TableA where ALERT_DATETIME >= GETDATE() -1) T1
OUTER APPLY (SELECT TOP 1 [ALERT_LEVEL]
FROM (SELECT * FROM TableA where ALERT_DATETIME >= GETDATE() -1) T2
WHERE T2.POLY_CODE = T1.POLY_CODE AND
T2.ALERT_TYPE = T1.ALERT_TYPE ORDER BY T2.[ALERT_DATETIME] DESC) X
GROUP BY POLY_CODE, ALERT_TYPE, X.[ALERT_LEVEL]
POLY_CODE  ALERT_TYPE  ALERT_LEVEL
04575      Elec        2
04737      Gas         3
06239      Elec        2
06552      Elec        2
06578      Elec        2
10320      Elec        2
select top 1 with ties *
from TableA
where ALERT_DATETIME >= GETDATE() -1
order by row_number() over (partition by POLY_CODE,ALERT_TYPE order by [ALERT_DATETIME] DESC)
The way this works is that each group of POLY_CODE, ALERT_TYPE gets its own row_number() sequence, starting from the most recent ALERT_DATETIME. The WITH TIES clause then ensures that all rows with a row_number value of 1 (one per group) are returned.
Another way of doing it is to create a CTE that calculates the latest datetime for each group and then join it back to the table to get the full rows. Just keep in mind that if more than one record has the same combination of poly_code, alert_type, alert_level and datetime, they will all show.
WITH list AS (
SELECT ta.poly_code,ta.alert_type,MAX(ta.alert_datetime) AS LatestDatetime,
ta.alert_level
FROM dbo.TableA AS ta
WHERE ta.alert_datetime >= DATEADD(DAY,-1,GETDATE())
GROUP BY ta.poly_code, ta.alert_type,ta.alert_level
)
SELECT ta.*
FROM list AS l
INNER JOIN dbo.TableA AS ta ON ta.alert_level = l.alert_level AND ta.alert_type = l.alert_type AND ta.poly_code = l.poly_code AND ta.alert_datetime = l.LatestDatetime
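The CTE-and-join-back idea can be tried on a toy table. The following is a minimal sketch, not the answer's exact query: it runs on SQLite via Python's sqlite3 module, the rows are invented, the 24-hour filter is left out so the result does not depend on the current time, and it groups only by poly_code and alert_type (a variation on the query above, which also groups by alert_level).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (alertid INTEGER, poly_code TEXT, alert_datetime TEXT,
                      alert_type TEXT, alert_level INTEGER);
INSERT INTO table_a VALUES
  (1, '04575', '2024-01-15 08:00', 'Elec', 1),
  (2, '04575', '2024-01-15 10:00', 'Elec', 2),
  (3, '04737', '2024-01-15 09:00', 'Gas',  3),
  (4, '04575', '2024-01-15 09:30', 'Gas',  1);
""")

# Step 1 (CTE): latest datetime per (poly_code, alert_type).
# Step 2: join back to the table to recover the full row, including alertid.
query = """
WITH latest AS (
  SELECT poly_code, alert_type, MAX(alert_datetime) AS latest_datetime
  FROM table_a
  GROUP BY poly_code, alert_type
)
SELECT ta.alertid, ta.poly_code, ta.alert_type, ta.alert_level
FROM latest AS l
JOIN table_a AS ta
  ON  ta.poly_code = l.poly_code
  AND ta.alert_type = l.alert_type
  AND ta.alert_datetime = l.latest_datetime
ORDER BY ta.alertid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two alerts in the same group share the latest datetime, both come back, which is the same caveat noted for the query above.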

SELECT most recent date out of group

I have a T-SQL query that is designed to weed out duplicate entries of a certain product training, grabbing only the one with the most recent DateTaken. For example, if someone has taken a certain training course 3 times, we only want to display one row, that row being the one that contains the most recent DateTaken. Here is what I have so far, however I am receiving the following error:
An expression of non-boolean type specified in a context where a condition is expected, near 'ORDER'.
The ORDER BY is necessary since we want to sort all the results of this query by the expiration date. Below is the full query:
SELECT DISTINCT
p.ProductDescription as ProductDesc,
c.CourseDescription as CourseDesc,
c.Partner, a.DateTaken, a.DateExpired, p.Status
FROM
sNumberToAgentId u, AgentProductTraining a, Course c, Product p
WHERE
@agentId = u.AgentId
and u.sNumber = a.sNumber
and a.CourseCode = c.CourseCode
and (a.DateExpired >= @date or a.DateExpired IS NULL)
and a.ProductCode = p.ProductCode
and (p.status != 'D' or p.status IS NULL)
GROUP BY
(p.ProductDescription)
HAVING
MIN(a.DateTaken)
ORDER BY
DateExpired ASC
EDIT
I've made the following changes to the GROUP BY and HAVING clauses, however I am still receiving errors:
GROUP BY
(p.ProductDescription, c.CourseDescription)
HAVING
MIN(a.DateTaken) > GETUTCDATE()
In SQL Server Management Studio, a red error marker appears under the ',' after p.ProductDescription, the ')' after c.CourseDescription, the 'a' in a.DateTaken, and the closing parenthesis ')' of GETUTCDATE(). If I leave the GROUP BY clause with only p.ProductDescription, I get this error message:
Column 'Product.ProductDescription' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I'm relatively new to SQL, could someone explain what's going on? Thank you!
My suggestion since you are using sql server is to implement row_number() and partition by the ProductDescription and CourseDescription. This will go in a subquery and then you apply a filter to return only those where the row number is equal to one or the most recent record:
select *
from
(
SELECT p.ProductDescription as ProductDesc,
c.CourseDescription as CourseDesc,
c.Partner, a.DateTaken, a.DateExpired, p.Status,
row_number() over(partition by p.ProductDescription, c.CourseDescription order by a.DateTaken desc) rn
FROM sNumberToAgentId u
INNER JOIN AgentProductTraining a
ON u.sNumber = a.sNumber
AND (a.DateExpired >= @date or a.DateExpired IS NULL)
INNER JOIN Course c
ON a.CourseCode = c.CourseCode
INNER JOIN Product p
ON a.ProductCode = p.ProductCode
AND (p.status != 'D' or p.status IS NULL)
WHERE u.AgentId = @agentId
) src
where rn = 1
order by DateExpired
The problem is this line:
HAVING MIN(a.DateTaken)
HAVING expects a boolean condition, such as:
HAVING MIN(a.DateTaken) > GETUTCDATE()
The expression has to evaluate to true or false.
Here is the final query I wound up using. It is similar to the suggestions above:
SELECT ProductDesc, CourseDesc, Partner, DateTaken, DateExpired, Status
FROM(
SELECT
p.ProductDescription as ProductDesc,
c.CourseDescription as CourseDesc,
c.Partner, a.DateTaken, a.DateExpired, p.Status,
row_number() OVER (PARTITION BY p.ProductDescription, c.CourseDescription ORDER BY abs(datediff(dd, DateTaken, GETDATE()))) as Ranking
FROM
sNumberToAgentId u, AgentProductTraining a, Course c, Product p
WHERE
@agentId = u.AgentId
and u.sNumber = a.sNumber
and a.CourseCode = c.CourseCode
and (a.DateExpired >= @date or a.DateExpired IS NULL)
and a.ProductCode = p.ProductCode
and (p.status != 'D' or p.status IS NULL)
) aa
WHERE Ranking = 1
