Subquery distinct field - sql-server

Hello, I am learning to write SQL queries. I am trying to query a ledger table to SELECT a field "ENCID" where there are more than 4 distinct values in a separate field "TDATE" for each distinct ENCID, and then filter by a third field, "ITEMTYPE".
This is what I have:
SELECT
[ENCID]
,[PATIENTID]
,[ITEMTYPE]
,[Service Date]
,[Transaction Date]
,[Trans]
,[PracticeName]
FROM TABLE1
WHERE ITEMTYPE = 'S'
AND ENCID IN (SELECT ENCID FROM TABLE1 WHERE Count(Distinct [Transaction Date]) >4
AND ITEMTYPE = 'S')
I am getting this error "DataSource.Error: Microsoft SQL: An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference."

Try this instead:
SELECT
[ENCID]
,[PATIENTID]
,[ITEMTYPE]
,[Service Date]
,[Transaction Date]
,[Trans]
,[PracticeName]
FROM TABLE1
WHERE ITEMTYPE = 'S'
AND ENCID IN (
SELECT ENCID
FROM TABLE1
WHERE ITEMTYPE = 'S'
GROUP BY ENCID
HAVING Count(Distinct [Transaction Date]) >4
AND MAX ([Transaction Date]) - MIN ([Transaction Date]) > 60
)
Generally, aggregate functions can only be used in SELECT, HAVING, and ORDER BY clauses (as WHERE determines exactly which records are being aggregated).
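The error message also describes the rare case in which a WHERE clause may contain an aggregate function: the WHERE clause sits in a subquery inside a HAVING clause (or select list), and the aggregated column is an outer reference. For example: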
SELECT a.id
FROM a
GROUP BY a.id
HAVING a.id IN (
SELECT b.a_id
FROM b
WHERE b.total = COUNT(a.something)
)
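To check the GROUP BY / HAVING fix on a small data set, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; the two dialects differ in places). The TDATE column, the sample rows, and the helper name encids_with_many_dates are made up for illustration.

```python
import sqlite3

def encids_with_many_dates(rows, min_distinct=4):
    """Return ENCIDs of ITEMTYPE 'S' with more than min_distinct distinct dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TABLE1 (ENCID INTEGER, ITEMTYPE TEXT, TDATE TEXT)")
    conn.executemany("INSERT INTO TABLE1 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT ENCID
        FROM TABLE1
        WHERE ITEMTYPE = 'S'              -- row filter: applied before grouping
        GROUP BY ENCID
        HAVING COUNT(DISTINCT TDATE) > ?  -- group filter: applied after aggregation
        ORDER BY ENCID
        """,
        (min_distinct,),
    ).fetchall()
    conn.close()
    return [r[0] for r in result]

# Hypothetical data: ENCID 1 has five distinct dates; ENCID 2 has five rows
# but only two distinct dates; ENCID 3 has five dates but the wrong ITEMTYPE.
sample = (
    [(1, "S", f"2023-01-0{d}") for d in range(1, 6)]
    + [(2, "S", "2023-01-01")] * 4 + [(2, "S", "2023-01-02")]
    + [(3, "P", f"2023-01-0{d}") for d in range(1, 6)]
)
print(encids_with_many_dates(sample))
```

Only ENCID 1 passes both filters. The same list of ENCIDs is what the IN (...) subquery feeds to the outer query.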

Related

How do I have 2 averages based on different conditions in a query

I am trying to find the average time per month it takes for someone to complete a task, split into two groups: people who have a disability and people who don't.
I have a temp table named #Temp that holds the unique identifier for each person who has a disability. The join value Number is the unique identifier for each person.
The query currently looks like this:
DROP TABLE IF EXISTS #Temp
SELECT *
INTO #Temp
FROM [Table]
WHERE [Disability] = 'Y'
SELECT [MonthName]
, AVG(DATEDIFF(DAY, [DateStarted], [DateEnded])) AS [Average Length In Days For Completion For Disabled Users]
FROM TableName
LEFT JOIN #Temp AS T ON T.[Number] = [Number]
LEFT JOIN [Calendar] AS Cal ON Cal.[Date] = [DateStarted]
WHERE [DateStarted] >= '20220101'
AND T.[Disability] = 'Y'
GROUP BY [MonthName]
ORDER BY [MonthName]
SELECT [MonthName]
, AVG(DATEDIFF(DAY, [DateStarted], [DateEnded])) AS [Average Length In Days For Completion For Non-Disabled Users]
FROM TableName
LEFT JOIN [Calendar] AS Cal ON Cal.[Date] = [DateStarted]
WHERE [DateStarted] >= '20220101'
GROUP BY [MonthName]
ORDER BY [MonthName]
How can I merge both these queries together so that there is one record per month for each average? If I do a subquery, it returns 2 rows per month with the non-disability people having NULL records as I have to group it by disability.
Since avg ignores null values you can combine the two queries using conditional aggregation:
SELECT [MonthName]
, AVG( D.DAYS ) AS [Average Length In Days For Completion For All Users]
, AVG( CASE WHEN T.[Disability] = 'Y' THEN D.DAYS END ) AS [Average Length In Days For Completion For Disabled Users]
, AVG( CASE WHEN T.[Disability] <> 'Y' THEN D.DAYS END ) AS [Average Length In Days For Completion For Non-Disabled Users]
FROM TableName
LEFT JOIN #Temp AS T ON T.[Number] = [Number]
LEFT JOIN [Calendar] AS Cal ON Cal.[Date] = [DateStarted]
CROSS APPLY ( SELECT DATEDIFF( DAY, [DateStarted], [DateEnded] ) AS DAYS ) AS D
WHERE [DateStarted] >= '20220101'
GROUP BY [MonthName]
ORDER BY [MonthName]
The semantics of Disability were not provided by the OP so I have taken the liberty of making an uneducated guess that 'Y' and something else are present for all users, a fact belied by the use of left outer join. Some tweaking of the case conditions may be needed to make the logic correct, e.g. checking for T.[Disability] IS NULL OR T.[Disability] <> 'Y'.
Note: Best practice would be to use a table alias on each column reference. Since the OP declined to share DDL for the tables I have not attempted to add the aliases.
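As a rough illustration of the conditional aggregation, here is a small sketch run through Python's sqlite3 module (an assumption: SQLite stands in for SQL Server, and the Tasks and Disabled tables, their columns, and the rows are hypothetical). It uses the IS NULL variant of the CASE condition, since the lookup table only contains people flagged 'Y'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for TableName and #Temp
    CREATE TABLE Tasks (Number INTEGER, MonthName TEXT, Days INTEGER);
    CREATE TABLE Disabled (Number INTEGER, Disability TEXT);
    INSERT INTO Tasks VALUES (1, 'Jan', 10), (2, 'Jan', 20), (3, 'Jan', 40),
                             (1, 'Feb', 6),  (3, 'Feb', 8);
    INSERT INTO Disabled VALUES (1, 'Y');
""")

rows = conn.execute("""
    SELECT t.MonthName,
           AVG(t.Days) AS AllUsers,
           -- CASE yields NULL for non-matching rows, and AVG skips NULLs
           AVG(CASE WHEN d.Disability = 'Y' THEN t.Days END) AS DisabledUsers,
           AVG(CASE WHEN d.Disability IS NULL OR d.Disability <> 'Y'
                    THEN t.Days END) AS NonDisabledUsers
    FROM Tasks AS t
    LEFT JOIN Disabled AS d ON d.Number = t.Number
    GROUP BY t.MonthName
    ORDER BY t.MonthName
""").fetchall()

for row in rows:
    print(row)
```

One dialect difference to keep in mind: SQLite's AVG returns a floating-point value, while SQL Server's AVG over an int expression returns an int, so in SQL Server you may want to cast the day count to a decimal type first.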

How To Use a Sub-Query instead of a View In SQL Server

I have a table that has many records and duplicate IDs. I'm trying to group by each ID and show the record with the latest date. Only one ID should show and it should be the record with the latest date. I have done that in the below query and I have put this query into a View in SQL Server:
Pymt_View:
SELECT MAX(CONVERT(datetime, pmt_dt)) AS PymtDate, id_no AS ID, MAX(CONVERT(datetime, due_dt))
AS PymtDueDate, pmt_cd
FROM Payment_Table
WHERE (CONVERT(datetime, due_dt) <= GETDATE()) AND (pmt_cd = '999') AND (amount > 0)
GROUP BY ID, pmt_cd
Then I use this View in a new query to join on the original Payment_Table to grab the columns I need based off the matching of columns in the join. I am using the View as my left table and the original payment table as my right.
New Query With View as the Left Table:
select a.PymtDate, a.[ID], a.PymtDueDate, b.amount, b.pmt_cd, b.batch_no
from Pymt_View a left join Payment_Table b on (a.ID = b.ID) and a.PymtDate = cast(b.pmt_dt as
datetime) and a.pmt_cd = b.pmt_cd)
order by a.[ID] asc, PymtDate desc, PymtDueDate desc
This produces the results I need. However, I am curious how I can do this in one query/view without having to create the Pymt_View? I have tried the following code in an attempt to subquery the grouped population I need and left join it onto the table to grab the columns I need that I was unable to grab from the group by query. I need the b.amount, b.pmt_cd, b.batch_no columns, but couldn't get them through with the group by
select x.*,
from (
SELECT MAX(CONVERT(datetime, pmt_dt)) AS PymtDate, id_no AS ID, MAX(CONVERT(datetime, due_dt))
AS PymtDueDate, pmt_cd
FROM Payment_Table
WHERE (CONVERT(datetime, due_dt) <= GETDATE()) AND (pmt_cd = '999') AND (amount > 0)
GROUP BY ID, pmt_cd
) x left join Payment_Table on (a.ID = b.ID) and a.PymtDate = cast(b.pmt_dt as datetime) and a.pmt_cd
= b.pmt_cd)
This doesn't work. Eventually, once I can get the population I need from Payment_Table without using Pymt_View, I will need to join that query with another population, so I would have to create a View again to be used in another View. Is there any way to avoid creating Views and using them as tables? I would like all of this to be in a single query that I can put into one View, without creating more Views to be used inside it.
Thank you for your assistance.
You can do this with a row-numbering solution very efficiently:
SELECT
CONVERT(datetime, pmt_dt) PymtDate,
p.[ID],
CONVERT(datetime, due_dt) PymtDueDate,
p.amount,
p.pmt_cd,
p.batch_no
from (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID, pmt_cd ORDER BY CONVERT(datetime, pmt_dt) DESC) AS rn
FROM Payment_Table
WHERE (CONVERT(datetime, due_dt) <= GETDATE()) AND (pmt_cd = '999') AND (amount > 0)
) p
WHERE rn = 1
order by p.[ID] asc, PymtDate desc, PymtDueDate desc
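Here is a minimal sketch of the row-numbering idea, again using Python's sqlite3 module as a stand-in for SQL Server (an assumption; window functions need SQLite 3.25 or later). The table is cut down to a few hypothetical columns and rows, and the dates are stored as ISO text so they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical version of Payment_Table
    CREATE TABLE Payment_Table (id_no INTEGER, pmt_dt TEXT, amount REAL, batch_no TEXT);
    INSERT INTO Payment_Table VALUES
        (1, '2023-01-10', 50.0, 'B1'),
        (1, '2023-03-05', 75.0, 'B2'),
        (2, '2023-02-01', 20.0, 'B3');
""")

latest = conn.execute("""
    SELECT id_no, pmt_dt, amount, batch_no
    FROM (
        SELECT p.*,
               -- number each id's rows from newest to oldest payment
               ROW_NUMBER() OVER (PARTITION BY id_no ORDER BY pmt_dt DESC) AS rn
        FROM Payment_Table AS p
    ) AS ranked
    WHERE rn = 1          -- keep only the newest row per id
    ORDER BY id_no
""").fetchall()

for row in latest:
    print(row)
```

Because the whole row is carried through the derived table, columns such as amount and batch_no come along without a second join back to the table.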

SQL - Return a value sum only once when grouped

I want to count the unique values of a string column, grouped by date. If a string has already appeared in an earlier group, it should not be counted again.
I've tried using distinct and it does show the unique count of the record but the record is counted again on every month.
Actual and minified SQL query:
select
date,
count(distinct d.name) as count
from ...
group by date
Sample and desired output: [image not included]
Grab unique names and tag them with the earliest date. At that point it's just a matter of regrouping the resulting rows by date. Each name will uniquely correspond to only one date as desired:
with data as (select name, min("date") as dt from T group by name)
select dt, count(name) as cnt from data group by dt;
If you still need to see the original dates even when no names are counted, then flag each row according to whether it should be counted and then count the flags per date:
with data as (
select *,
case when "date" = min("date") over (partition by name)
then 1 end as flag
from T
)
select "date", count(flag) as cnt
from data
group by "date";
So you want each name counted only once:
SELECT COUNT(u.name) as name_count, u.[date]
FROM (
SELECT d.name,MIN(d.date) AS [date]
FROM yourTable d
GROUP BY d.name) u
GROUP BY u.[date];
You can add a ROW_NUMBER() that is Partitioned by name and ordered by date and add a WHERE clause that only returns the rows with Row_Number = 1.
You can try the following option:
SELECT A.Date,COUNT(B.[Name]) Count
FROM
(
SELECT DISTINCT Date FROM your_table
)A
LEFT JOIN
(
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY Date) RN
FROM your_table
)A WHERE RN = 1
)B ON A.Date = B.Date
GROUP BY A.Date
But the best option, if I slightly modify the approach from Shawnt00, is the following:
SELECT A.Date,COUNT(B.[Name]) Count
FROM
(
SELECT DISTINCT Date FROM your_table
)A
LEFT JOIN
(
SELECT [Name],MIN(Date) Date FROM your_table GROUP BY [Name]
)B ON A.Date = B.Date
GROUP BY A.Date
In both cases the output will be:
Date Count
20190101 2
20190201 0
20190301 1
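For anyone who wants to reproduce that output, here is a small sketch of the MIN / LEFT JOIN variant using Python's sqlite3 module (an assumption: SQLite stands in for SQL Server). The sample rows are hypothetical, chosen so that A and B first appear on 20190101 and C first appears on 20190301.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (Date TEXT, Name TEXT);
    INSERT INTO your_table VALUES
        ('20190101', 'A'), ('20190101', 'B'),
        ('20190201', 'A'),
        ('20190301', 'B'), ('20190301', 'C');
""")

counts = conn.execute("""
    SELECT a.Date, COUNT(b.Name) AS Cnt   -- COUNT(column) skips the NULLs from unmatched dates
    FROM (SELECT DISTINCT Date FROM your_table) AS a
    LEFT JOIN (
        -- one row per name, tagged with the first date it appears
        SELECT Name, MIN(Date) AS Date FROM your_table GROUP BY Name
    ) AS b ON a.Date = b.Date
    GROUP BY a.Date
    ORDER BY a.Date
""").fetchall()

for row in counts:
    print(row)
```

The LEFT JOIN from the distinct dates is what keeps 20190201 in the result with a count of 0.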

SQL Query to retrieve total order count for each user

Write an SQL command that retrieves last name and first name of all customers and the order numbers of orders they have placed…
CustDetails TABLE: http://prntscr.com/msicdp
OrderDetails TABLE: http://prntscr.com/msichp
I am trying to display a list of all users from CustDetails (table), with an additional column, "TotalOrders", that counts how many orders each user has in OrderDetails (table) using COUNT(*), but I can't work out how to do it.
I've tried LEFT JOIN paired with COUNT(*) AS [Total Orders], and I get all kinds of errors whatever I try:
SELECT DISTINCT CustDetails.*, OrderDetails.CustRef,COUNT(*) AS [Order_number]
FROM CustDetails
LEFT JOIN OrderDetails ON CustDetails.CustRef = OrderDetails.CustRef
GROUP BY CustDetails.FName
--SELECT CustDetails.CustRef, count(*) AS NUM
-- FROM CustDetails GROUP BY CustRef
You can't put * with GROUP BY. If you are using GROUP BY, all non-aggregated columns should be present in your GROUP BY clause.
You need to write your query like the following.
select c.CustRef,
c.LName,
c.Fname,
sum(case when od.CustRef is null then 0 else 1 end) TotalOrders
from CustDetails c
left join OrderDetails od on od.CustRef =c.CustRef
group by c.CustRef ,c.LName, C.Fname
In case you need all the columns you can try like the following without GROUP BY.
select *,
(select count(*) from OrderDetails od where od.CustRef =c.CustRef) TotalOrders
from CustDetails c
Another way of doing it, using PARTITION BY:
select * from
(
select c.*,
sum(case when od.CustRef is null then 0 else 1 end) over(partition by c.CustRef) as TotalOrders,
row_number() over (partition by c.CustRef order by (select 1)) rn
from CustDetails c
left join OrderDetails od on od.CustRef =c.CustRef
) t
where rn=1
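As a quick sanity check of the LEFT JOIN counting, here is a minimal sketch using Python's sqlite3 module (an assumption: SQLite stands in for SQL Server; the OrderNo column and the sample rows are hypothetical). It uses COUNT(od.CustRef), which should behave like the SUM(CASE ...) expression because COUNT over a column skips NULLs, and puts COUNT(*) next to it to show why the latter over-counts customers with no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustDetails (CustRef INTEGER PRIMARY KEY, LName TEXT, FName TEXT);
    CREATE TABLE OrderDetails (OrderNo INTEGER PRIMARY KEY, CustRef INTEGER);
    INSERT INTO CustDetails VALUES (1, 'Smith', 'Ann'), (2, 'Jones', 'Bob');
    INSERT INTO OrderDetails VALUES (100, 1), (101, 1);   -- Bob has no orders
""")

totals = conn.execute("""
    SELECT c.CustRef, c.LName, c.FName,
           COUNT(od.CustRef) AS TotalOrders,  -- ignores the NULL row for Bob
           COUNT(*)          AS JoinedRows    -- counts Bob's NULL row as 1
    FROM CustDetails AS c
    LEFT JOIN OrderDetails AS od ON od.CustRef = c.CustRef
    GROUP BY c.CustRef, c.LName, c.FName
    ORDER BY c.CustRef
""").fetchall()

for row in totals:
    print(row)
```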

Invalid Column Name in case statement

I'm getting Invalid Column Name 'Average'. When I write the expression out without the alias it works, but I have too many conditions to repeat it each time.
SELECT
ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.EMPLOYEEID,
Employees.EMPLOYEENAMESURNAMEFORMAT AS LastFirstName,
(CAST(SUM(EVALUATION) AS FLOAT)
/
(SELECT TOP 1
COUNT(*)
FROM ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES
WHERE DATE BETWEEN '2013-01-01' AND '2013-12-12' AND TYPE=2 AND SUPERVISORID=1020 GROUP BY EmployeeID )) AS AVERAGE,
CASE WHEN AVERAGE=1
THEN 'GOOD' END AS EVALUATION
FROM ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES INNER JOIN Employees
ON ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.EMPLOYEEID=Employees.EmployeeID
WHERE DATE BETWEEN '2013-01-01' AND '2013-12-12' AND TYPE=2 AND ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.SUPERVISORID=1020 AND ACTIVESTATUS=1
GROUP BY
ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.EMPLOYEEID,
EMPLOYEENAMESURNAMEFORMAT
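Try this. A column alias cannot be referenced by another expression in the same SELECT list, so compute AVERAGE in a derived table and apply the CASE in the outer query: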
SELECT *,case when average =1 then 'good' end as evaluation from(
select
ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.EMPLOYEEID,
Employees.EMPLOYEENAMESURNAMEFORMAT AS LastFirstName,
(CAST(SUM(EVALUATION) AS FLOAT)
/
(SELECT TOP 1
COUNT(*)
FROM ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES
WHERE DATE BETWEEN '2013-01-01' AND '2013-12-12' AND TYPE=2 AND SUPERVISORID=1020 GROUP BY EmployeeID )) AS AVERAGE
FROM ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES INNER JOIN Employees
ON ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.EMPLOYEEID=Employees.EmployeeID
WHERE DATE BETWEEN '2013-01-01' AND '2013-12-12' AND TYPE=2 AND ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.SUPERVISORID=1020 AND ACTIVESTATUS=1
GROUP BY
ST_LAWERP_PERFORMANCE_EVALUATION_ENTRIES.EMPLOYEEID,
EMPLOYEENAMESURNAMEFORMAT) as subquery
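The derived-table pattern can be tried on a toy table. This is a minimal sketch using Python's sqlite3 module (an assumption: SQLite stands in for SQL Server). The Entries table and its rows are hypothetical, and the denominator is simplified to COUNT(*) per employee rather than the TOP 1 subquery from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down evaluation table
    CREATE TABLE Entries (EmployeeID INTEGER, Evaluation INTEGER);
    INSERT INTO Entries VALUES (1, 1), (1, 1), (2, 1), (2, 3);
""")

graded = conn.execute("""
    SELECT sub.EmployeeID,
           sub.Average,
           -- the alias is visible here because it was defined one level down
           CASE WHEN sub.Average = 1 THEN 'GOOD' END AS Rating
    FROM (
        SELECT EmployeeID,
               CAST(SUM(Evaluation) AS REAL) / COUNT(*) AS Average
        FROM Entries
        GROUP BY EmployeeID
    ) AS sub
    ORDER BY sub.EmployeeID
""").fetchall()

for row in graded:
    print(row)
```

Any further conditions on Average can be added as extra WHEN branches in the outer CASE without repeating the calculation.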
