Using SQL to combine detailed and aggregated results - sql-server

I am developing a report against a SQL Server database. Using the query presented here...
SELECT
f.FacilityID as 'FID',
COUNT (DISTINCT f.PhoneTypeID) as 'Ptypes',
COUNT (DISTINCT f.PhoneID) as 'Pnumbers'
from dbo.FacilityPhones as f
inner join
dbo.Phones as ph
on f.PhoneID = ph.PhoneID
group by f.FacilityID
having COUNT(DISTINCT f.PhoneTypeID)<>COUNT(DISTINCT f.PhoneId);
...I have identified 107 records where the number of phone numbers present for a Facility differs from the number of phone number types (e.g., there are two distinct phone numbers, both listed as primary).
I would like to be able to produce a detailed report that would list phone numbers and phone types for each facility, but ONLY when the distinct counts differ.
Is there a way to do this with a single query? Or would I need to save the summaries to a temp table, then join back to that temp table to get the details?

I'm not sure what fields exist in dbo.Phones, but I assume the number comes from there. You will likely need to join to the phone type table to get its description as well.
This uses a common table expression (CTE) to get your base list of facilities and then a correlated subquery to ensure that only the facilities in the CTE are displayed.
WITH CTE AS (
SELECT f.FacilityID as 'FID'
, COUNT (DISTINCT f.PhoneTypeID) as 'Ptypes'
, COUNT (DISTINCT f.PhoneID) as 'Pnumbers'
FROM dbo.FacilityPhones as f
GROUP BY f.FacilityID
HAVING COUNT(DISTINCT f.PhoneTypeID)<>COUNT(DISTINCT f.PhoneId))
SELECT *
FROM dbo.FacilityPhones FP
INNER JOIN dbo.Phones as ph
ON FP.PhoneID = ph.PhoneID
WHERE EXISTS (SELECT 1
FROM CTE
WHERE FID = FP.FacilityID)
The WHERE clause shows a FacilityID and its associated records only if that FacilityID exists in your original query (the CTE, which returns the 107 facilities). If we needed data from the CTE we would join to it, but because it only restricts the rows, putting it in the WHERE clause with EXISTS will likely be more efficient.
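A minimal sketch of the CTE-plus-EXISTS pattern described above, run against an in-memory SQLite database as a stand-in for SQL Server. The table and column names follow the question; the sample rows and the CTE name Mismatched are invented for illustration.

```python
import sqlite3

# SQLite stands in for SQL Server; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FacilityPhones (FacilityID INT, PhoneID INT, PhoneTypeID INT);
INSERT INTO FacilityPhones VALUES
    (1, 10, 1), (1, 11, 2),   -- facility 1: two numbers, two types (consistent)
    (2, 20, 1), (2, 21, 1);   -- facility 2: two numbers, one type (mismatch)
""")

DETAIL_SQL = """
WITH Mismatched AS (
    -- summary step: facilities whose distinct counts differ
    SELECT FacilityID
    FROM FacilityPhones
    GROUP BY FacilityID
    HAVING COUNT(DISTINCT PhoneTypeID) <> COUNT(DISTINCT PhoneID)
)
-- detail step: every phone row, but only for the facilities found above
SELECT fp.FacilityID, fp.PhoneID, fp.PhoneTypeID
FROM FacilityPhones AS fp
WHERE EXISTS (SELECT 1 FROM Mismatched AS m WHERE m.FacilityID = fp.FacilityID)
ORDER BY fp.FacilityID, fp.PhoneID
"""
detail_rows = conn.execute(DETAIL_SQL).fetchall()
print(detail_rows)
```

Only the detail rows for facility 2 come back, because it is the only facility whose distinct phone count differs from its distinct type count.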

Related

SQL - Filter calculated column with calculated column

I'm trying to find the most-dosed patients in a database. The sum of the doses has to be calculated, and then I have to dynamically list the patients who have been dosed that much. The query has to be dynamic, and more than 5 patients can be listed. For example, if the 5 highest dose totals are 7, 6, 5, 4 and 3, but 3 people have received 5 doses, I'd have to list 7 people in total (the patients with 7, 6, 5, 5, 5, 4 and 3 doses). I'm having issues because you cannot refer to a column alias in a WHERE clause, and I have no idea how to fix this.
The query goes like this:
SELECT
info.NAME, SUM(therapy.DOSE) AS total
FROM
dbo.PATIENT_INFORMATION_TBL info
JOIN
dbo.PATIENT_THERAPY_TBL therapy ON info.HOSPITAL_NUMBER = therapy.HOSPITAL_NUMBER
LEFT JOIN
dbo.FORMULARY_CLINICAL clinical ON clinical.ITEMID = therapy.ITEMID
WHERE
total IN (SELECT DISTINCT TOP 5 SUM(t.DOSE) AS 'DOSES'
FROM dbo.PATIENT_INFORMATION_TBL i
JOIN dbo.PATIENT_THERAPY_TBL t ON i.HOSPITAL_NUMBER = t.HOSPITAL_NUMBER
LEFT JOIN dbo.FORMULARY_CLINICAL c ON c.ITEMID = t.ITEMID
GROUP BY NAME
ORDER BY 'DOSES' DESC)
GROUP BY
info.NAME
ORDER BY
total DESC
The database looks like this:
The main question is: how can I use a where/having clause where I need to compare a calculated column to a list of dynamically calculated values?
I'm using Microsoft SQL Server 2012. The DISTINCT in the subquery is needed so that only the top 5 distinct dosages appear (e.g. with DISTINCT I get 7,6,5,4,3; without DISTINCT I get 7,6,6,5,4; my goal is the first one).
Most DBMSes support Standard SQL Analytical Functions like DENSE_RANK:
with cte as
(
SELECT info.NAME, SUM(therapy.DOSE) as total,
DENSE_RANK() OVER (ORDER BY SUM(therapy.DOSE) DESC) AS dr
FROM dbo.PATIENT_INFORMATION_TBL info
JOIN dbo.PATIENT_THERAPY_TBL therapy ON info.HOSPITAL_NUMBER=therapy.HOSPITAL_NUMBER
LEFT JOIN dbo.FORMULARY_CLINICAL clinical ON clinical.ITEMID=therapy.ITEMID
GROUP BY info.NAME
)
select *
from cte
where dr <= 5 -- only the five highest doses
ORDER BY total desc
Btw, you probably don't need the LEFT JOIN as you're not selecting any column from dbo.FORMULARY_CLINICAL
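To see how DENSE_RANK keeps tied totals together, here is a small sketch using SQLite (3.25 or later, for window functions) in place of SQL Server. The Doses table and its rows are hypothetical simplifications of the question's schema, and the ranking is done in a second CTE rather than in the same SELECT as the SUM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Doses (Name TEXT, Dose INT);
INSERT INTO Doses VALUES
    ('Ann', 4), ('Ann', 3),   -- total 7
    ('Bob', 5),               -- total 5
    ('Cy', 2), ('Cy', 3),     -- total 5, ties with Bob
    ('Dee', 1);               -- total 1
""")

TOP_SQL = """
WITH Totals AS (
    SELECT Name, SUM(Dose) AS total
    FROM Doses
    GROUP BY Name
),
Ranked AS (
    -- tied totals share a rank, and the next rank is not skipped
    SELECT Name, total, DENSE_RANK() OVER (ORDER BY total DESC) AS dr
    FROM Totals
)
SELECT Name, total, dr
FROM Ranked
WHERE dr <= 2            -- the two highest distinct totals
ORDER BY total DESC, Name
"""
top_rows = conn.execute(TOP_SQL).fetchall()
print(top_rows)
```

Asking for the top 2 totals returns three patients here, because Bob and Cy tie on the second-highest total.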

SQL query where person has a request for every product

I have a shortcut query where I just counted the total number of products myself and used COUNT to show that the person has requested that many products (which is 8 products).
I want to know if there's an easier way, where I wouldn't need to count the products myself and the query would do it instead. Basically, replace the 8 with the total number of products in the database.
SELECT DISTINCT
Tb_Consumer.Name
FROM
Tb_Consumer, Tb_Product, Tb_Requests
WHERE
Tb_Consumer.Con_ID = Tb_Requests.Con_ID
AND Tb_Requests.Prod_ID = Tb_Product.Prod_ID
GROUP BY
Tb_Consumer.Name
HAVING
COUNT(Tb_Product.Name) = 8
Use a subquery to find the number of products in the Tb_Product table:
SELECT
tbc.Name
FROM Tb_Consumer tbc
INNER JOIN Tb_Requests tbr
ON tbc.Con_ID = tbr.Con_ID
INNER JOIN Tb_Product tbp
ON tbr.Prod_ID = tbp.Prod_ID
GROUP BY
tbc.Name
HAVING
COUNT(tbp.Name) = (SELECT COUNT(*) FROM Tb_Product); -- count products here
This assumes that every record in Tb_Product corresponds to a single unique product. If there could be duplication for some reason, then you can count distinct products, e.g.
(SELECT COUNT(DISTINCT Name) FROM Tb_Product)
Other changes I made include removing DISTINCT from the select clause, since the GROUP BY should already make each name distinct. I also refactored your query to remove the commas in the FROM clause. Instead, I use explicit joins between the three tables.
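The following sketch tries the HAVING comparison on a tiny SQLite database. Table names follow the question, while the rows are invented. It also compares a plain COUNT with COUNT(DISTINCT ...) over the requested product IDs; that variant is an addition of this sketch, not part of the answer, and only matters under the assumption that a consumer can request the same product more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tb_Consumer (Con_ID INT, Name TEXT);
CREATE TABLE Tb_Product  (Prod_ID INT, Name TEXT);
CREATE TABLE Tb_Requests (Con_ID INT, Prod_ID INT);
INSERT INTO Tb_Consumer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Tb_Product  VALUES (1, 'Milk'), (2, 'Eggs'), (3, 'Tea');
INSERT INTO Tb_Requests VALUES
    (1, 1), (1, 2), (1, 3),   -- Alice requested every product
    (2, 1), (2, 1), (2, 2);   -- Bob requested Milk twice and Eggs once
""")

QUERY = """
SELECT tbc.Name
FROM Tb_Consumer AS tbc
INNER JOIN Tb_Requests AS tbr ON tbc.Con_ID = tbr.Con_ID
GROUP BY tbc.Name
HAVING COUNT({counted}) = (SELECT COUNT(*) FROM Tb_Product)
ORDER BY tbc.Name
"""
# Counting request rows: Bob's three rows happen to equal the product count.
plain_count = [r[0] for r in conn.execute(QUERY.format(counted="tbr.Prod_ID"))]
# Counting distinct products: only consumers who cover every product match.
distinct_count = [r[0] for r in conn.execute(QUERY.format(counted="DISTINCT tbr.Prod_ID"))]
print(plain_count, distinct_count)
```

If repeated requests cannot occur in your data, the plain COUNT from the answer is enough.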

MSSQL Group By failing, but no dup Column names

I know this question has been asked time and time again, but I have no two column names that are the same, yet I am getting:
Msg 8120, Level 16, State 1, Line 13 Column 'dbo.PRODUCT.ProductName'
is invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
My ProductId column is unique to my dbo.Product Table, and I am not sure why it is getting confused with another value. In this image you can see the dup ProductIds
WITH products AS
(
SELECT
*,
ROW_NUMBER() OVER(ORDER BY p.[ProductName]) AS 'RowNumber'
FROM dbo.PRODUCT p
JOIN dbo.Category c ON p.ProductCategoryCode = c.CategoryCode
JOIN dbo.Supplier s ON p.ProductSupplierCode = s.SupplierCode
LEFT JOIN dbo.ProductTag pt ON pt.ProductUPC = p.UPC
LEFT JOIN dbo.Tag t ON pt.ProductTagTagCode = t.TagCode
GROUP BY p.ProductId
)
SELECT *
FROM products
WHERE RowNumber BETWEEN 0 AND 2;
Your error is because you are selecting ALL of the fields in ALL of the tables, but you are only grouping by one value. If a value is returned by the query, then it must either be GROUPED or aggregated (Min, Max, SUM, AVG, etcetera).
If you simply add the Product Name to your grouping:
GROUP BY p.ProductId, p.ProductName
You will still have the same problem with (for example) p.ProductCategoryCode, p.ProductSupplierCode, c.CategoryCode, etc, etc.
In this case, where you are looking for unique rows, do not use GROUP BY; use DISTINCT instead (it automatically applies to all of the fields returned). Note that @bjones is still correct as to why you are getting duplicates: one of the tables you are joining can have multiple rows for each product (e.g. a product will often come from more than one supplier).
To solve this, you need to:
Determine what data you need to return, and only select those columns
Determine if you need to summarize any data (i.e. Total Sold or On Hand), then:
Use GROUP BY if you do need to summarize any values, or
Use DISTINCT if you do not need to summarize any values
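The choice between DISTINCT and GROUP BY in the steps above can be sketched as follows, using SQLite with a deliberately simplified, hypothetical two-table schema (Product and ProductTag joined on ProductId rather than on UPC). Note that SQLite tolerates non-grouped columns in the select list, so it would not reproduce Msg 8120; the sketch only shows the two correct forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product    (ProductId INT, ProductName TEXT);
CREATE TABLE ProductTag (ProductId INT, TagCode TEXT);
INSERT INTO Product    VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO ProductTag VALUES (1, 'A'), (1, 'B');   -- Widget has two tags
""")

JOINED = """
FROM Product AS p
LEFT JOIN ProductTag AS pt ON pt.ProductId = p.ProductId
"""

# Nothing to summarize: select only the needed columns and use DISTINCT.
unique_rows = conn.execute(
    "SELECT DISTINCT p.ProductId, p.ProductName" + JOINED + "ORDER BY p.ProductId"
).fetchall()

# Something to summarize: group by every non-aggregated column.
summary_rows = conn.execute(
    "SELECT p.ProductId, p.ProductName, COUNT(pt.TagCode)" + JOINED
    + "GROUP BY p.ProductId, p.ProductName ORDER BY p.ProductId"
).fetchall()
print(unique_rows, summary_rows)
```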

Right way to use distinct in SQL Server

I am trying to retrieve some records based on the query
Select distinct
tblAssessmentEcosystemCredit.AssessmentEcosystemCreditID,
tblSpecies.CommonName
from
tblAssessmentEcosystemCredit
left join
tblSpeciesVegTypeLink on tblAssessmentEcosystemCredit.VegTypeID = tblSpeciesVegTypeLink.VegTypeID
left join
tblSpecies on tblSpecies.SpeciesID = tblSpeciesVegTypeLink.SpeciesID
where
tblAssessmentEcosystemCredit.SpeciesTGValue < 1
The above query returns 17,000 records but when I remove tblSpecies.CommonName, it retrieves only 4200 (that's actually correct).
I have no idea how to distinct only tblAssessmentEcosystemCredit.AssessmentEcosystemCreditID column and retrieve all other table columns in the query.
This query selects each distinct COMBINATION of AssessmentEcosystemCreditID and CommonName. If you want only one row per value of AssessmentEcosystemCreditID, then you need to use a GROUP BY, as suggested by @JonasB. In that case, however, there could be several values of CommonName per value of AssessmentEcosystemCreditID, so SQL requires you to specify WHICH one you want:
Select tblAssessmentEcosystemCredit.AssessmentEcosystemCreditID ,
max(tblSpecies.CommonName) as CommonName,
min(tblSpecies.CommonName) as CommonName2 -- so you can verify you only have one value
from tblAssessmentEcosystemCredit
left join tblSpeciesVegTypeLink
on tblAssessmentEcosystemCredit.VegTypeID = tblSpeciesVegTypeLink.VegTypeID
left join tblSpecies on tblSpecies.SpeciesID= tblSpeciesVegTypeLink.SpeciesID
where tblAssessmentEcosystemCredit.SpeciesTGValue <1
GROUP BY tblAssessmentEcosystemCredit.AssessmentEcosystemCreditID
See this topic: mySQL select one column DISTINCT, with corresponding other columns
In MySQL you probably have to deactivate ONLY_FULL_GROUP_BY, see http://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by (this setting is MySQL-specific; SQL Server has no equivalent mode).
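The difference between DISTINCT over a combination of columns and GROUP BY with MIN/MAX, as used in the first answer above, can be checked with a small sketch. It uses SQLite and a hypothetical flattened table CreditSpecies in place of the three joined tables from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CreditSpecies (CreditID INT, CommonName TEXT);
INSERT INTO CreditSpecies VALUES
    (1, 'Koala'), (1, 'Wombat'),   -- credit 1 matches two species
    (2, 'Emu');                    -- credit 2 matches one
""")

# DISTINCT applies to the whole (CreditID, CommonName) pair.
combo_rows = conn.execute("""
    SELECT DISTINCT CreditID, CommonName
    FROM CreditSpecies
    ORDER BY CreditID, CommonName
""").fetchall()

# GROUP BY gives one row per CreditID; MIN and MAX differ when a credit
# has more than one name, which exposes the ambiguity.
grouped_rows = conn.execute("""
    SELECT CreditID, MIN(CommonName), MAX(CommonName), COUNT(DISTINCT CommonName)
    FROM CreditSpecies
    GROUP BY CreditID
    ORDER BY CreditID
""").fetchall()
print(combo_rows, grouped_rows)
```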

Multiple Select against one CTE

I have a CTE query filtering a table Student
Student
(
StudentId PK,
FirstName ,
LastName,
GenderId,
ExperienceId,
NationalityId,
CityId
)
Based on a lot of filters (multiple cities, gender, multiple experiences (1, 2, 3), multiple nationalities), I create a CTE by using dynamic SQL and joining the Student table with user-defined tables (CityTable, NationalityTable, ...).
After that I have to retrieve the count of students for each filter, like
CityId City Count
NationalityId Nationality Count
and the same thing for the other filters.
Can I do something like
;WITH CTE AS (
SELECT ...
FROM Student
INNER JOIN ...
INNER JOIN ....)
SELECT CityId, City, COUNT(StudentId)
FROM CTE
GROUP BY CityId, City
SELECT GenderId, Gender, COUNT(StudentId)
FROM CTE
GROUP BY GenderId, Gender
I want to do something like what LinkedIn does with search (people search, job search):
http://www.linkedin.com/search/fpsearch?type=people&keywords=sales+manager&pplSearchOrigin=GLHD&pageKey=member-home
It's very fast and does the same thing.
You cannot run multiple SELECT statements against one CTE, but you can define more than one CTE, like this:
WITH CTEA
AS
(
SELECT 'Column1' A, 'Column2' B
),
CTEB
AS
(
SELECT 'ColumnX' X, 'ColumnY' Y
)
SELECT * FROM CTEA, CTEB
To get the count, use ROW_NUMBER and a windowed COUNT in the CTE, something like this:
ROW_NUMBER() OVER (ORDER BY ColumnName) AS RowNumber,
COUNT(1) OVER () AS TotalRecordsFound
Please let me know if you need more information on this.
Sample for your reference.
With CTE AS (
Select StudentId, S.CityId, S.GenderId
FROM Student S
Inner JOIN CITY C
ON S.CityId = C.CityId
INNER JOIN GENDER G
ON S.GenderId = G.GenderId)
,
GENDER
AS
(
SELECT GenderId
FROM CTE
GROUP BY GenderId
)
SELECT * FROM GENDER, CTE
It is not possible to get multiple result sets from a single CTE.
You can however use a table variable to cache some of the information and use it later instead of issuing the same complex query multiple times:
declare @relevantStudent table (StudentID int);
insert into @relevantStudent
select s.StudentID from Student s
join ...
where ...
-- now issue the multiple queries
select s.GenderID, count(*)
from Student s
join @relevantStudent r on r.StudentID = s.StudentID
group by s.GenderID
select s.CityID, count(*)
from Student s
join @relevantStudent r on r.StudentID = s.StudentID
group by s.CityID
The trick is to store only the minimum required information in the table variable.
As with any query, whether this actually improves performance compared with issuing the queries independently depends on many things (how big the table variable data set is, how complex the query used to populate it is, how complex the subsequent joins/subselects against the table variable are, etc.).
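A rough sketch of the cache-the-keys idea, using SQLite. SQLite has no table variables, so a TEMP table named RelevantStudent plays that role here; the filter on ExperienceId and the sample rows are hypothetical stand-ins for the real dynamic filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StudentId INTEGER PRIMARY KEY, GenderId INT,
                      CityId INT, ExperienceId INT);
INSERT INTO Student VALUES (1, 1, 10, 1), (2, 2, 10, 2),
                           (3, 1, 20, 3), (4, 2, 20, 1);
-- Run the (potentially expensive) filter once and keep only the keys.
CREATE TEMP TABLE RelevantStudent AS
    SELECT StudentId FROM Student WHERE ExperienceId IN (1, 2);
""")

# Each facet query joins back to the cached keys instead of re-filtering.
by_gender = conn.execute("""
    SELECT s.GenderId, COUNT(*)
    FROM Student AS s
    JOIN RelevantStudent AS r ON r.StudentId = s.StudentId
    GROUP BY s.GenderId ORDER BY s.GenderId
""").fetchall()
by_city = conn.execute("""
    SELECT s.CityId, COUNT(*)
    FROM Student AS s
    JOIN RelevantStudent AS r ON r.StudentId = s.StudentId
    GROUP BY s.CityId ORDER BY s.CityId
""").fetchall()
print(by_gender, by_city)
```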
Do a UNION ALL to do multiple SELECT and concatenate the results together into one table.
;WITH CTE AS(
SELECT
FROM Student
INNER JOIN ...
INNER JOIN ....)
SELECT CityId, City, COUNT(StudentId), NULL, NULL, NULL
FROM CTE
GROUP BY CityId, City
UNION ALL
SELECT NULL, NULL, NULL, GenderId, Gender, COUNT(StudentId)
FROM CTE
GROUP BY GenderId, Gender
Note: The NULL values above just allow the two results to have matching columns, so the results can be concatenated.
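A small sketch of the UNION ALL approach with NULL padding, run on SQLite. The CTE name Filtered, the filter and the sample rows are hypothetical, and the name columns (City, Gender) are left out to keep it short. Both branches must produce the same number of columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StudentId INTEGER PRIMARY KEY, GenderId INT,
                      CityId INT, ExperienceId INT);
INSERT INTO Student VALUES (1, 1, 10, 1), (2, 2, 10, 2),
                           (3, 1, 20, 3), (4, 2, 20, 1);
""")

FACET_SQL = """
WITH Filtered AS (
    SELECT StudentId, GenderId, CityId
    FROM Student
    WHERE ExperienceId IN (1, 2)
)
-- city facet: the gender columns are padded with NULL
SELECT CityId, COUNT(StudentId) AS CityCount, NULL AS GenderId, NULL AS GenderCount
FROM Filtered
GROUP BY CityId
UNION ALL
-- gender facet: the city columns are padded with NULL
SELECT NULL, NULL, GenderId, COUNT(StudentId)
FROM Filtered
GROUP BY GenderId
"""
facet_rows = conn.execute(FACET_SQL).fetchall()
print(facet_rows)
```

NULL arrives in Python as None, so the caller can tell the two facets apart by which pair of columns is populated.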
I know this is a very old question, but here's a solution I just used. I have a stored procedure that returns a PAGE of search results, and I also need it to return the total count matching the query parameters.
WITH results AS (...complicated foo here...)
SELECT results.*,
CASE
WHEN @page=0 THEN (SELECT COUNT(*) FROM results)
ELSE -1
END AS totalCount
FROM results
ORDER BY bar
OFFSET @page * @pageSize ROWS FETCH NEXT @pageSize ROWS ONLY;
With this approach, there's a small "hit" on the first results page to get the count, and for the remaining pages, I pass back "-1" to avoid the hit (I assume the number of results won't change during the user session). Even though totalCount is returned for every row of the first page of results, it's only computed once.
My CTE does a bunch of filtering based on stored procedure arguments, so I couldn't just move it to a view and query it twice. This approach lets me avoid duplicating the CTE's logic just to get a count.
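The paging idea can be sketched in SQLite as below. SQLite uses LIMIT/OFFSET instead of SQL Server's OFFSET ... FETCH, and the Items table, its rows and the fetch_page helper are hypothetical; the CASE expression that returns the total only on page 0 is the part taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (id INT, bar INT)")
conn.executemany("INSERT INTO Items VALUES (?, ?)",
                 [(i, i * 10) for i in range(1, 6)])

PAGE_SQL = """
WITH results AS (
    SELECT id, bar FROM Items WHERE bar >= 0   -- stands in for the real filter
)
SELECT id,
       CASE WHEN :page = 0 THEN (SELECT COUNT(*) FROM results)
            ELSE -1
       END AS totalCount
FROM results
ORDER BY bar
LIMIT :size OFFSET :offset
"""

def fetch_page(page, page_size):
    """Return (id, totalCount) rows for a zero-based page number."""
    params = {"page": page, "size": page_size, "offset": page * page_size}
    return conn.execute(PAGE_SQL, params).fetchall()

first_page = fetch_page(0, 2)
second_page = fetch_page(1, 2)
print(first_page, second_page)
```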
