Does anyone know how to return a column while counting in TDengine database? - tdengine

select count(*), path as value from
(select distinct page_id, url_path
from t_front_data_report where tenant_id = 10025 and app_id = 1613513711340531714
and ts >= today() and visit_id is not null) tmp group by url_path;
For example, here I want to return each path along with its count.

Related

snowflake unsupported subquery cannot be evaluated

Table TEMP has customer hash, effective start date and effective end date. Table CDTLS has customer hash and effective start date. I want to select customer hash, effective from, and customer name from TEMP and CDTLS. I am calculating the CDTLS end date on the fly and comparing it with the TEMP.EFFECTIVE_FROM and TEMP.EFFECTIVE_TO dates. I get an error saying that an unsupported subquery cannot be evaluated.
SELECT
TEMP.CUSTOMER_HASH,
TEMP.EFFECTIVE_FROM,
TEMP.EFFECTIVE_TO,
CDTLS.NAME
FROM TEMP
LEFT JOIN CDTLS
ON
TEMP.CUSTOMER_HASH = CDTLS.CUSTOMER_HASH
AND
CDTLS.EFFECTIVE_FROM <= TEMP.EFFECTIVE_FROM
AND
(
SELECT VW.EFFECTIVE_TO FROM
(
SELECT CUSTOMER_HASH, EFFECTIVE_FROM, LEAD(EFFECTIVE_FROM, 1, '9999-12-31') OVER (PARTITION
BY CUSTOMER_HASH ORDER BY EFFECTIVE_FROM ASC) AS EFFECTIVE_TO
FROM CUST_DETAILS
) AS VW
WHERE CDTLS.CUSTOMER_HASH = VW.CUSTOMER_HASH AND CDTLS.EFFECTIVE_FROM = VW.EFFECTIVE_FROM
) >= TEMP.EFFECTIVE_TO
;
I suppose you wanted to run this query:
SELECT
TEMP.CUSTOMER_HASH,
TEMP.EFFECTIVE_FROM,
TEMP.EFFECTIVE_TO,
CDTLS.NAME
FROM TEMP
LEFT join CDTLS
ON
TEMP.CUSTOMER_HASH = CDTLS.CUSTOMER_HASH
AND
CDTLS.EFFECTIVE_FROM <= TEMP.EFFECTIVE_FROM
left join (
SELECT CUSTOMER_HASH, EFFECTIVE_FROM, LEAD(EFFECTIVE_FROM, 1, '9999-12-31') OVER (PARTITION
BY CUSTOMER_HASH ORDER BY EFFECTIVE_FROM ASC) AS EFFECTIVE_TO
FROM CUST_DETAILS
) AS VW on CDTLS.CUSTOMER_HASH = VW.CUSTOMER_HASH AND CDTLS.EFFECTIVE_FROM = VW.EFFECTIVE_FROM
where
VW.EFFECTIVE_TO >= TEMP.EFFECTIVE_TO
You could also try wrapping the column selected by the subquery in an aggregate such as MIN, MAX, or LISTAGG, so that the subquery is guaranteed to return a single value, and check whether that helps.
https://docs.snowflake.net/manuals/user-guide/querying-subqueries.html#differences-between-correlated-and-non-correlated-subqueries
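To see the derived-table idea in isolation, here is a minimal sketch run against SQLite through Python's sqlite3 module rather than Snowflake. The table contents are invented, TEMP is renamed temp_t because TEMP is a keyword in SQLite, and CDTLS and CUST_DETAILS are assumed to be the same table. It also needs an SQLite build with window functions (3.25 or later). Unlike the query above, the end-date comparison is kept in the ON clause, which is one way to keep TEMP rows that have no matching detail row; filtering on VW.EFFECTIVE_TO in WHERE drops them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE temp_t (customer_hash TEXT, effective_from TEXT, effective_to TEXT);
CREATE TABLE cust_details (customer_hash TEXT, effective_from TEXT, name TEXT);
-- invented sample data
INSERT INTO temp_t VALUES
  ('A', '2020-03-01', '2020-04-01'),
  ('B', '2020-01-01', '2020-02-01');
INSERT INTO cust_details VALUES
  ('A', '2020-01-01', 'Old name'),
  ('A', '2020-06-01', 'New name');
""")

# The end date is computed once per detail row with LEAD in a derived table,
# then joined like an ordinary table instead of being looked up by a
# correlated scalar subquery inside the ON clause.
rows = conn.execute("""
SELECT t.customer_hash, t.effective_from, t.effective_to, vw.name
FROM temp_t AS t
LEFT JOIN (
    SELECT customer_hash, effective_from, name,
           LEAD(effective_from, 1, '9999-12-31')
               OVER (PARTITION BY customer_hash
                     ORDER BY effective_from) AS effective_to
    FROM cust_details
) AS vw
  ON  vw.customer_hash  = t.customer_hash
  AND vw.effective_from <= t.effective_from
  AND vw.effective_to   >= t.effective_to
ORDER BY t.customer_hash
""").fetchall()

for row in rows:
    print(row)
```

Customer A picks up the detail row whose computed validity range covers the TEMP range, and customer B is still returned with a NULL name.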

Update records SQL?

When I first started this project, it seemed very simple. There are two tables, and field tbl1_USERMASTERID in Table 1 should be updated from field tbl2_USERMASTERID in Table 2. After looking more closely at Table 2, I found there is no unique ID that I can use as a key to join these two tables. The only way to match the records from Table 1 and Table 2 is based on FIRST_NAME, LAST_NAME and DOB. So I have to find records in Table 1 where:
tbl1_FIRST_NAME equals tbl2_FIRST_NAME
AND
tbl1_LAST_NAME equals tbl2_LAST_NAME
AND
tbl1_DOB equals tbl2_DOB
and then update the USERMASTERID field. I am afraid that this can produce duplicate matches, so that some users end up with a USERMASTERID that does not belong to them. So if I find more than one record based on first name, last name and DOB, those records should not be updated. I would like to skip them and leave them blank. That way I won't populate an invalid USERMASTERID. I'm not sure what the best way to approach this problem is: should I use SQL or ColdFusion (my server-side language)? Also, how can I detect more than one matching record?
Here is what I have so far:
UPDATE Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
Here is the query where I tried to detect duplicates:
SELECT DISTINCT
tbl1.FName,
tbl1.LName,
tbl1.dob,
COUNT(*) AS count
FROM Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.FName = tbl2.first
AND tbl1.LName = tbl2.last
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND LTRIM(RTRIM(tbl1.first)) <> ''
AND LTRIM(RTRIM(tbl1.last)) <> ''
AND LTRIM(RTRIM(tbl1.dob)) <> ''
GROUP BY tbl1.FName,tbl1.LName,tbl1.dob
Some sample results from the query above:
First   Last      DOB         Count
John    Cook      2008-07-11  2
Kate    Witt      2013-06-05  1
Deb     Ruis      2016-01-22  1
Mike    Bennet    2007-01-15  1
Kristy  Cruz      1997-10-20  1
Colin   Jones     2011-10-13  1
Kevin   Smith     2010-02-24  1
Corey   Bruce     2008-04-11  1
Shawn   Maiers    2016-08-28  1
Alenn   Fitchner  1998-05-17  1
If anyone has an idea how I can prevent or skip updating duplicate records, or how to improve this query, please let me know. Thank you.
You could check for and avoid duplicate matches using a common table expression (the Transact-SQL WITH clause) along with row_number(), like so:
with cte as (
select
t.fname
, t.lname
, t.dob
, t.usermasterid
, NewUserMasterId = t2.usermasterid
, rn = row_number() over (partition by t.fname, t.lname, t.dob order by t2.usermasterid)
from table1 as t
inner join table2 as t2 on t.dob = t2.dob
and t.fname = t2.fname
and t.lname = t2.lname
and ltrim(rtrim(t.usermasterid)) = ''
)
--/* confirm these are the rows you want updated
select *
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
/* update those where only 1 usermasterid matches this record
update t
set t.usermasterid = t.NewUserMasterId
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
I use the CTE to pull the subquery out for readability. Per the documentation, a common table expression (CTE):
Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
row_number() assigns a number to each row, starting at 1 for each partition of t.fname, t.lname, t.dob. Having the rows numbered lets us check for duplicates with the not exists() clause and its i.rn > 1 condition.
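The numbering-and-skipping logic can be tried out on a small scale. The following is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, the sample rows and IDs are invented, and the update is applied from Python rather than through T-SQL's UPDATE ... FROM. It needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (fname TEXT, lname TEXT, dob TEXT, usermasterid TEXT);
CREATE TABLE table2 (fname TEXT, lname TEXT, dob TEXT, usermasterid TEXT);
-- invented sample data: John Cook matches two different IDs, Kate Witt one
INSERT INTO table1 VALUES
  ('John', 'Cook', '2008-07-11', ''),
  ('Kate', 'Witt', '2013-06-05', '');
INSERT INTO table2 VALUES
  ('John', 'Cook', '2008-07-11', 'U100'),
  ('John', 'Cook', '2008-07-11', 'U200'),
  ('Kate', 'Witt', '2013-06-05', 'U300');
""")

# Number the candidate matches per person, then keep only people for whom
# no match with rn > 1 exists, i.e. exactly one candidate ID.
safe = conn.execute("""
WITH cte AS (
    SELECT t.fname, t.lname, t.dob, t2.usermasterid AS new_id,
           ROW_NUMBER() OVER (PARTITION BY t.fname, t.lname, t.dob
                              ORDER BY t2.usermasterid) AS rn
    FROM table1 AS t
    JOIN table2 AS t2
      ON t.dob = t2.dob AND t.fname = t2.fname AND t.lname = t2.lname
    WHERE TRIM(t.usermasterid) = ''
)
SELECT t.fname, t.lname, t.dob, t.new_id
FROM cte AS t
WHERE NOT EXISTS (
    SELECT 1 FROM cte AS i
    WHERE i.fname = t.fname AND i.lname = t.lname AND i.dob = t.dob
      AND i.rn > 1
)
""").fetchall()

# Apply only the unambiguous matches; ambiguous rows stay blank.
conn.executemany(
    "UPDATE table1 SET usermasterid = ? "
    "WHERE fname = ? AND lname = ? AND dob = ? AND TRIM(usermasterid) = ''",
    [(new_id, fname, lname, dob) for fname, lname, dob, new_id in safe],
)

result = conn.execute(
    "SELECT fname, usermasterid FROM table1 ORDER BY fname"
).fetchall()
print(result)
```

Only Kate Witt receives an ID here; John Cook has two candidate IDs, so his row is left blank.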
You could use a CTE to filter out the duplicates from Table1 before joining:
; with CTE as (select *
, count(ID) over (partition by LastName, FirstName, DoB) as IDs
from Table1)
update a
set a.ID = b.ID
from Table2 a
left join CTE b
on a.FirstName = b.FirstName
and a.LastName = b.LastName
and a.Dob = b.Dob
and b.IDs = 1
This will work provided there are no exact duplicates (same demographics and same ID) in table 1. If there are exact duplicates, they will also be excluded from the join, but you can filter them out before the CTE to avoid this.
Please try the SQL below:
UPDATE Table1 AS tbl1
INNER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
ON tbl3.dob = tbl2.dob
AND tbl3.fname = tbl2.fname
AND tbl3.lname = tbl2.lname
AND tbl3.usermasterid <> tbl2.usermasterid
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND tbl3.usermasterid is null

Can someone help me make this SQL query more efficient?

SELECT
datepart(qq, o.created_date),
count(DISTINCT o.order_id),
sum(o.order_margin)
FROM
orders o
WHERE
o.account_id IN (SELECT e.account_id
FROM emailsegment e
WHERE e.segment = 'H')
AND o.created_date >= '1/1/2016'
AND o.order_status = 'Shipped'
GROUP BY
datepart(qq, o.created_date)
ORDER BY
datepart(qq, o.created_date)
This is taking forever to run, any ideas?
Try this:
SELECT
datepart(qq, o.created_date),
count(DISTINCT o.order_id),
sum(o.order_margin)
FROM
orders o
INNER JOIN
emailsegment e ON e.account_id = o.account_id AND e.segment = 'H'
WHERE
o.created_date >= '2016-01-01'
AND o.order_status = 'Shipped'
GROUP BY
datepart(qq, o.created_date)
ORDER BY
datepart(qq, o.created_date)
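One caveat with the join rewrite: it gives the same sums as the IN version only if emailsegment holds at most one 'H' row per account_id, which the question does not state. The sketch below uses SQLite through Python's sqlite3 module with invented data, where one account deliberately appears twice in emailsegment, to show how the join can inflate sum(order_margin) while count(DISTINCT order_id) stays correct. EXISTS is shown as an alternative that cannot multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, account_id INTEGER, order_margin INTEGER);
CREATE TABLE emailsegment (account_id INTEGER, segment TEXT);
-- invented sample data; account 10 is listed twice in segment 'H'
INSERT INTO orders VALUES (1, 10, 5), (2, 10, 7), (3, 20, 3);
INSERT INTO emailsegment VALUES (10, 'H'), (10, 'H'), (20, 'L');
""")

select_list = "SELECT COUNT(DISTINCT o.order_id), SUM(o.order_margin) FROM orders o "

# Semi-join via IN: each order is counted once.
in_result = conn.execute(
    select_list
    + "WHERE o.account_id IN "
    "(SELECT e.account_id FROM emailsegment e WHERE e.segment = 'H')"
).fetchone()

# Inner join: each order is repeated once per matching emailsegment row.
join_result = conn.execute(
    select_list
    + "INNER JOIN emailsegment e "
    "ON e.account_id = o.account_id AND e.segment = 'H'"
).fetchone()

# Semi-join via EXISTS: same result as IN.
exists_result = conn.execute(
    select_list
    + "WHERE EXISTS (SELECT 1 FROM emailsegment e "
    "WHERE e.account_id = o.account_id AND e.segment = 'H')"
).fetchone()

print(in_result, join_result, exists_result)
```

If account_id is unique within a segment, the three forms agree and the choice comes down to which plan the optimizer handles best.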
Just several wild guesses:
How many records does the following query return, and how long does it take to run?
SELECT e.account_id FROM emailsegment e WHERE e.segment = 'H';
It might be beneficial to create an index on account_id with filter on segment column.
CREATE INDEX ix_account_id_emailsegment ON emailsegment(account_id)
WHERE segment = 'H'
Or
CREATE INDEX ix_segment_emailsegment ON emailsegment(segment)
INCLUDE (account_id)
The leading column of the index should be the one whose filter leaves the fewest rows: created_date >= '1/1/2016', o.account_id IN (...), or o.order_status = 'Shipped'. I would guess that it is created_date, in which case you would create an index like:
CREATE INDEX ix_created_date_orders
on orders(created_date, account_id, order_id)
INCLUDE (order_margin)
WHERE order_status = 'Shipped';
If order_id is the clustered index key, then you do not have to include it in the index.
You can also try adding DISTINCT to your subquery, though I am not sure that will help.

SQL WHERE != clause results in no results when the WHERE returns nothing

I have this SQL query:
USE thr_clinic
GO
WITH CompleteSchedule AS (
SELECT U.ID as UserID, U.Role, U.Surname, U.Clinic, TS.ID as TimeSlotID, TS.TimeSlot
FROM Users U
CROSS JOIN TimeSlots TS
)
SELECT CS.*
FROM CompleteSchedule CS
LEFT JOIN Appointments A
ON A.MedicalStaffID = CS.UserID
AND A.TimeSlot = CS.TimeSlotID
AND A.AppDate = CONVERT(DATE,DATEADD(day, 3, GETDATE()))
WHERE A.ID is null
AND CS.Role != 'Patient'
AND CS.Clinic = (SELECT Clinic FROM Users WHERE Users.ID = 1)
AND CS.UserID != (SELECT StaffID FROM DaysOff WHERE DayOff = CONVERT(DATE,DATEADD(day, 3, GETDATE())))
ORDER BY CS.UserID, CS.TimeSlotID
However, if the subquery in the last WHERE condition, just before the ORDER BY, returns nothing (meaning no one is off on the given date), the overall query returns nothing. If it does return a result (someone is off), that person is excluded, everyone else appears, and it works fine.
I assumed that if it returned nothing then the query would show everyone, since an empty result isn't a UserID that could be excluded.
Since the subquery used in the WHERE clause might presumably return more than one value, you probably shouldn't use != but rather NOT IN:
AND CS.UserID NOT IN (SELECT StaffID FROM DaysOff WHERE DayOff = CONVERT(DATE,DATEADD(day, 3, GETDATE())))
So you wonder why you get no rows when the subquery returns NULL? Because NULL is neither = nor <> anything else, so the comparison is never true. Use IS NULL:
AND ((SELECT StaffID FROM DaysOff WHERE DayOff = CONVERT(DATE,DATEADD(day, 3, GETDATE()))) IS NULL
OR CS.UserID != (SELECT StaffID FROM DaysOff WHERE DayOff = CONVERT(DATE,DATEADD(day, 3, GETDATE()))))
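The NULL behavior is easy to reproduce. This is a small sketch using SQLite through Python's sqlite3 module, with simplified, made-up tables (staff, days_off) and a fixed date standing in for the real schema. An empty scalar subquery yields NULL, so != filters out every row, while NOT IN over an empty result keeps them all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (id INTEGER);
CREATE TABLE days_off (staff_id INTEGER, day_off TEXT);
INSERT INTO staff VALUES (1), (2), (3);
""")  # days_off starts empty: nobody is off

# != against an empty scalar subquery compares with NULL, which is never true.
not_equal = conn.execute("""
SELECT id FROM staff
WHERE id != (SELECT staff_id FROM days_off WHERE day_off = '2024-01-01')
ORDER BY id
""").fetchall()

# NOT IN against an empty set is true for every row.
not_in_query = """
SELECT id FROM staff
WHERE id NOT IN (SELECT staff_id FROM days_off WHERE day_off = '2024-01-01')
ORDER BY id
"""
not_in_empty = conn.execute(not_in_query).fetchall()

# Once someone is off, NOT IN excludes only that person.
conn.execute("INSERT INTO days_off VALUES (2, '2024-01-01')")
not_in_one_off = conn.execute(not_in_query).fetchall()

print(not_equal, not_in_empty, not_in_one_off)
```

Note that NOT IN has its own NULL pitfall: if the subquery can return a NULL StaffID, the condition is never true, so that column should be NOT NULL or filtered.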

SELECT most recent date out of group

I have a T-SQL query that is designed to weed out duplicate entries of a certain product training, grabbing only the one with the most recent DateTaken. For example, if someone has taken a certain training course 3 times, we only want to display one row, that row being the one that contains the most recent DateTaken. Here is what I have so far, however I am receiving the following error:
An expression of non-boolean type specified in a context where a condition is expected, near 'ORDER'.
The ORDER BY is necessary since we want to sort all the results of this query by the expiration date. Below is the full query:
SELECT DISTINCT
p.ProductDescription as ProductDesc,
c.CourseDescription as CourseDesc,
c.Partner, a.DateTaken, a.DateExpired, p.Status
FROM
sNumberToAgentId u, AgentProductTraining a, Course c, Product p
WHERE
#agentId = u.AgentId
and u.sNumber = a.sNumber
and a.CourseCode = c.CourseCode
and (a.DateExpired >= #date or a.DateExpired IS NULL)
and a.ProductCode = p.ProductCode
and (p.status != 'D' or p.status IS NULL)
GROUP BY
(p.ProductDescription)
HAVING
MIN(a.DateTaken)
ORDER BY
DateExpired ASC
EDIT
I've made the following changes to the GROUP BY and HAVING clauses, however I am still receiving errors:
GROUP BY
(p.ProductDescription, c.CourseDescription)
HAVING
MIN(a.DateTaken) > GETUTCDATE()
In SQL Server Management Studio, a red error marker appears under the ',' after p.ProductDescription, the ')' after c.CourseDescription, the 'a' in a.DateTaken, and the closing parenthesis ')' of GETUTCDATE(). If I leave the GROUP BY clause with only p.ProductDescription, I get this error message:
Column 'Product.ProductDescription' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I'm relatively new to SQL, could someone explain what's going on? Thank you!
Since you are using SQL Server, my suggestion is to use row_number() partitioned by ProductDescription and CourseDescription. This goes in a subquery, and you then filter to return only the rows where the row number equals one, that is, the most recent record:
select *
from
(
SELECT p.ProductDescription as ProductDesc,
c.CourseDescription as CourseDesc,
c.Partner, a.DateTaken, a.DateExpired, p.Status,
row_number() over(partition by p.ProductDescription, c.CourseDescription order by a.DateTaken desc) rn
FROM sNumberToAgentId u
INNER JOIN AgentProductTraining a
ON u.sNumber = a.sNumber
AND (a.DateExpired >= #date or a.DateExpired IS NULL)
INNER JOIN Course c
ON a.CourseCode = c.CourseCode
INNER JOIN Product p
ON a.ProductCode = p.ProductCode
AND (p.status != 'D' or p.status IS NULL)
WHERE u.AgentId = #agentId
) src
where rn = 1
order by DateExpired
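Stripped of the joins, the row_number() pattern looks like this. It is a minimal sketch run on SQLite through Python's sqlite3 module, with a single invented training table standing in for the four joined tables, and it needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE training (product TEXT, course TEXT, date_taken TEXT);
-- invented sample data: the Widget course was taken three times
INSERT INTO training VALUES
  ('Widget', 'Intro',  '2012-01-10'),
  ('Widget', 'Intro',  '2013-05-02'),
  ('Widget', 'Intro',  '2011-08-30'),
  ('Gadget', 'Basics', '2012-11-15');
""")

# Number rows within each (product, course) group, newest first,
# then keep only the first row of each group.
latest = conn.execute("""
SELECT product, course, date_taken
FROM (
    SELECT product, course, date_taken,
           ROW_NUMBER() OVER (PARTITION BY product, course
                              ORDER BY date_taken DESC) AS rn
    FROM training
) AS src
WHERE rn = 1
ORDER BY product
""").fetchall()

for row in latest:
    print(row)
```

Because the filter is on the row number rather than an aggregate, every other column of the newest row comes along without needing to appear in a GROUP BY.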
It's this line:
HAVING MIN(a.DateTaken)
HAVING needs a Boolean expression, such as:
HAVING MIN(a.DateTaken) > GETUTCDATE()
The condition has to evaluate to true or false.
Here is the final query I wound up using. It is similar to the suggestions above:
SELECT ProductDesc, CourseDesc, Partner, DateTaken, DateExpired, Status
FROM(
SELECT
p.ProductDescription as ProductDesc,
c.CourseDescription as CourseDesc,
c.Partner, a.DateTaken, a.DateExpired, p.Status,
row_number() OVER (PARTITION BY p.ProductDescription, c.CourseDescription ORDER BY abs(datediff(dd, DateTaken, GETDATE()))) as Ranking
FROM
sNumberToAgentId u, AgentProductTraining a, Course c, Product p
WHERE
#agentId = u.AgentId
and u.sNumber = a.sNumber
and a.CourseCode = c.CourseCode
and (a.DateExpired >= #date or a.DateExpired IS NULL)
and a.ProductCode = p.ProductCode
and (p.status != 'D' or p.status IS NULL)
) aa
WHERE Ranking = '1'

Resources