Obtain distinct top 1 columns in SQL Server

I am writing a stored procedure for a project in SQL Server 2014 and I have this code:
ALTER PROCEDURE FOF_MejorVendedor
AS
BEGIN
SELECT TOP 1
F.Nombre, Em.Nombre, (P.Precio * CA.Cantidad) as 'Ganancia'
FROM
dbo.FO_Carrito CA
JOIN
dbo.FO_Solicitud S on S.ID = CA.FK_SolicitudC
JOIN
dbo.FO_Recibo R ON R.FK_Solicitud = S.ID
JOIN
dbo.FO_Productos P ON P.ID = CA.FK_ProductosC
JOIN
dbo.FO_Cliente C ON C.ID = S.FK_Cliente
JOIN
dbo.FO_Estante E ON E.FK_Producto = P.ID
JOIN
dbo.FO_PasilloXDepartamento PD ON PD.FK_Estante = E.NumeroEstante
JOIN
dbo.FO_Encargado En ON En.ID = PD.FK_Encargado
JOIN
dbo.FO_Empleado Em ON Em.ID = En.FK_EmpleadoE
JOIN
dbo.FO_Departamento D ON D.ID = PD.FK_Departamento
JOIN
dbo.FO_Ferreteria F ON D.FK_Ferreteria = F.ID
JOIN
dbo.FO_EmpleadosXFerreteria EF ON EF.FK_Ferreterias = F.ID
GROUP BY
F.Nombre, Em.Nombre, (P.Precio * CA.Cantidad)
ORDER BY
Ganancia DESC
END
This returns only the single row with the highest 'Ganancia', but I want the top row for each distinct value in the column "F.Nombre". How can I modify the query?

You are getting only the top record because of the TOP 1 clause. Remove it, and the GROUP BY will return one row for each distinct combination of the grouped columns.
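If the goal is the single highest 'Ganancia' row per F.Nombre rather than every group, one common approach is ROW_NUMBER() partitioned by the store name. The following is a minimal sketch, not the asker's actual procedure: the table variable @Ventas and its rows are hypothetical stand-ins for the joined result set.

```sql
-- Hypothetical stand-in for the joined result set (names and values are made up).
DECLARE @Ventas TABLE (
    Ferreteria varchar(50),
    Empleado   varchar(50),
    Ganancia   decimal(10, 2)
);

INSERT INTO @Ventas (Ferreteria, Empleado, Ganancia)
VALUES ('Norte', 'Ana',   120.00),
       ('Norte', 'Luis',  300.00),
       ('Sur',   'Marta', 250.00),
       ('Sur',   'Pedro',  80.00);

-- Number the rows inside each Ferreteria, highest Ganancia first,
-- then keep only the first row of every partition.
WITH Ranked AS (
    SELECT Ferreteria, Empleado, Ganancia,
           ROW_NUMBER() OVER (PARTITION BY Ferreteria
                              ORDER BY Ganancia DESC) AS rn
    FROM @Ventas
)
SELECT Ferreteria, Empleado, Ganancia
FROM Ranked
WHERE rn = 1;
```

In the stored procedure, the full join chain would take the place of @Ventas inside the CTE. If tied rows should all be returned, RANK() could be used instead of ROW_NUMBER().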

Related

Select row with max value in multiple where condition

I edited the following query based on this page:
Selecting a Record With MAX Value
Select query :
select
Users.Id, Users.[Name], Users.Family, Users.BirthDate,
Users.Mobile, Users.[Description], Users.Email,
Users.UserName, Users.fatherName,
Users.archiveNumber, Users.[Address], Users.IsMarried,
Users.Mazhab,
Cities.CityName, Religions.PersianName, Users.Date_insert,
Users.ImageName,
MaghtaeTahsilis.[Name] as MaghtaeTahsilisName,
FieldStudies.[Name] as FieldStudiesName,
Eductionals.Institute, Eductionals.Moaddal,
Eductionals.FromYear, Eductionals.ToYear
from
Users
left outer join
Eductionals on Users.id = Eductionals.UserID
left outer join
MaghtaeTahsilis on Eductionals.MaghtaeID = MaghtaeTahsilis.ID
left outer join
Cities on Users.City_Id = Cities.Id
left outer join
Religions on Users.Relegion_ID = Religions.ID
left outer join
FieldStudies on Eductionals.FieldStudy_ID = FieldStudies.ID
where
Users.UserName = @code_melli
and Eductionals.MaghtaeID = (select MAX(MaghtaeID) from Eductionals
where Eductionals.UserID = Users.Id)
This query selects the MAX value correctly, but if the subquery returns NULL, no rows are returned. I want the row to be shown with NULL values in that case.
Your left outer joins are being turned into inner joins by the where conditions. Your query should look like:
select u.Id, u.[Name], u.Family, u.BirthDate, u.Mobile, u.[Description], u.Email, u.UserName, u.fatherName,
u.archiveNumber, u.[Address], u.IsMarried, u.Mazhab, c.CityName, r.PersianName, u.Date_insert, u.ImageName,
mt.[Name] As MaghtaeTahsilisName, fs.[Name] As FieldStudiesName, e.Institute, e.Moaddal, e.FromYear, e.ToYear
from Users u left outer join
Eductionals e
on u.id = e.UserID and
e.MaghtaeID = (select MAX(e2.MaghtaeID)
from Eductionals e2
where e2.UserID = u.Id
) left outer join
MaghtaeTahsilis mt
on e.MaghtaeID = mt.ID left outer join
Cities c
on u.City_Id = c.Id left outer join
Religions r
on u.Relegion_ID = r.ID left outer join
FieldStudies fs
on e.FieldStudy_ID = fs.ID
where u.UserName = @code_melli;
In a chain of left joins, conditions on the first table should be in the where clause. Conditions on subsequent tables should be in the on clauses.
You'll notice that I also added table aliases so the query is easier to write and to read.
You can also use window functions:
from Users u left outer join
(select e2.*,
row_number() over (partition by e2.userId order by e2.MaghtaeID desc) as seqnum
from Eductionals e2
) e
on u.id = e.UserID and
e.seqnum = 1 left outer join
. . .
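The rule about where conditions belong in a chain of left joins can be seen on a tiny example. This is only a sketch: the table variables @Users and @Edu and their rows are hypothetical, reduced to the columns that matter.

```sql
-- Hypothetical data: user 2 has no education rows.
DECLARE @Users TABLE (Id int, UserName varchar(20));
DECLARE @Edu   TABLE (UserID int, MaghtaeID int);

INSERT INTO @Users VALUES (1, 'a'), (2, 'b');
INSERT INTO @Edu   VALUES (1, 3), (1, 5);

-- Filter in WHERE: user 2 disappears, because e.MaghtaeID is NULL
-- after the left join and a comparison with NULL is never true.
SELECT u.Id, e.MaghtaeID
FROM @Users u
LEFT JOIN @Edu e ON e.UserID = u.Id
WHERE e.MaghtaeID = (SELECT MAX(e2.MaghtaeID) FROM @Edu e2 WHERE e2.UserID = u.Id);

-- Filter in ON: user 2 is kept, with NULL in the education column.
SELECT u.Id, e.MaghtaeID
FROM @Users u
LEFT JOIN @Edu e
       ON e.UserID = u.Id
      AND e.MaghtaeID = (SELECT MAX(e2.MaghtaeID) FROM @Edu e2 WHERE e2.UserID = u.Id);
```

The first query returns only user 1. The second returns both users, which is the behaviour the question asks for.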
No rows are returned when the subquery returns NULL because the condition then effectively becomes:
And Eductionals.MaghtaeID=NULL
The Eductionals table probably also holds NULL values in the MaghtaeID column.
A comparison with NULL using = never evaluates to true, so no rows are returned.
The correct way to check for NULL values is:
And Eductionals.MaghtaeID is NULL
So modify the where condition in your query as follows to get the desired result:
where Users.UserName = @code_melli AND isnull(Eductionals.MaghtaeID,0) = isnull((select MAX(MaghtaeID) from Eductionals where Eductionals.UserID = Users.Id),0)
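The difference between = NULL and IS NULL can be checked in isolation. A minimal sketch, assuming the default ANSI_NULLS ON setting; the variable @x is hypothetical and exists only for illustration.

```sql
-- Hypothetical variable, for illustration only.
DECLARE @x int = NULL;

SELECT CASE WHEN @x = NULL  THEN 'match' ELSE 'no match' END AS equals_null,   -- 'no match'
       CASE WHEN @x IS NULL THEN 'match' ELSE 'no match' END AS is_null_test;  -- 'match'
```

Note that the isnull(..., 0) form treats NULL and 0 as equal, so it assumes 0 is never a real MaghtaeID value.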

SQL Server 2016 + Linked Server Join = Row not returned

I'm querying on SQL Server 2016:
SELECT 1
FROM LINKEDSERVER1.DATABASE1.DBO.TABLE1 A with (nolock)
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE2 B with (nolock) ON ...
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE3 C with (nolock) ON ...
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE4 D with (nolock) ON (D.FIELD1 = C.FIELD1 AND D.CHAR_FIELD2 = B.VARCHAR_FIELD2 AND (D.FIELD3 = B.FIELD3 OR D.FIELD3 = B.FIELD4))
WHERE B.FIELD5 IN ('4472')
Result: Row is not returned
1 - If I move the condition AND D.CHAR_FIELD2 = B.VARCHAR_FIELD2 out of the join and into the WHERE clause:
SELECT 1
FROM LINKEDSERVER1.DATABASE1.DBO.TABLE1 A with (nolock)
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE2 B with (nolock) ON ...
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE3 C with (nolock) ON ...
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE4 D with (nolock) ON (D.FIELD1 = C.FIELD1 AND (D.FIELD3 = B.FIELD3 OR D.FIELD3 = B.FIELD4))
WHERE B.FIELD5 IN ('4472')
AND D.CHAR_FIELD2 = B.VARCHAR_FIELD2
Result: Row is returned
2 - If I remove the linked server on TABLE4:
SELECT 1
FROM LINKEDSERVER1.DATABASE1.DBO.TABLE1 A with (nolock)
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE2 B with (nolock) ON ...
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE3 C with (nolock) ON ...
LEFT JOIN TABLE4 D with (nolock) ON (D.FIELD1 = C.FIELD1 AND D.CHAR_FIELD2 = B.VARCHAR_FIELD2 AND (D.FIELD3 = B.FIELD3 OR D.FIELD3 = B.FIELD4))
WHERE B.FIELD5 IN ('4472')
Result: Row is returned
3 - If I run the same query on SQL Server 2005:
SELECT 1
FROM LINKEDSERVER1.DATABASE1.DBO.TABLE1 A with (nolock)
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE2 B with (nolock) ON ...
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE3 C with (nolock) ON ...
LEFT JOIN LINKEDSERVER1.DATABASE1.DBO.TABLE4 D with (nolock) ON (D.FIELD1 = C.FIELD1 AND D.CHAR_FIELD2 = B.VARCHAR_FIELD2 AND (D.FIELD3 = B.FIELD3 OR D.FIELD3 = B.FIELD4))
WHERE B.FIELD5 IN ('4472')
Result: Row is returned
I'm running SQL Server 2016 13.0.1601.5.
I couldn't find anything about this on SP1 and SP2.
Is this a known issue? Am I missing something?
Found the problem. I had been checking the fields' collations, but the databases' default collations are different.
I guess that because the fields are char and varchar, it is casting one of them and using the default collation, so when the remote query is executed it doesn't find the record.
If I change the default collation or force the field collation, it works. If I disable "Use Remote Collation" in the linked server properties, it works too.
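Both workarounds can be sketched in T-SQL. This is an illustration under assumptions, not the asker's actual fix: the temp tables #B and #D and the two collation names are hypothetical choices, and only the linked server name LINKEDSERVER1 comes from the question.

```sql
-- Hypothetical local tables whose columns use different collations.
CREATE TABLE #B (VARCHAR_FIELD2 varchar(10) COLLATE SQL_Latin1_General_CP1_CI_AS);
CREATE TABLE #D (CHAR_FIELD2    char(10)    COLLATE Latin1_General_CI_AS);

INSERT INTO #B VALUES ('ABC');
INSERT INTO #D VALUES ('ABC');

-- Forcing one collation on both sides makes the comparison explicit.
SELECT b.VARCHAR_FIELD2, d.CHAR_FIELD2
FROM #B b
LEFT JOIN #D d
       ON d.CHAR_FIELD2 COLLATE Latin1_General_CI_AS
        = b.VARCHAR_FIELD2 COLLATE Latin1_General_CI_AS;

DROP TABLE #B, #D;

-- Alternative from the answer: turn off "Use Remote Collation" for the linked server.
EXEC sp_serveroption @server   = N'LINKEDSERVER1',
                     @optname  = N'use remote collation',
                     @optvalue = N'false';
```

The COLLATE clause changes only the one comparison, while the server option affects every query that goes through that linked server, so the narrower fix may be preferable when other queries depend on the current setting.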

How to improve SQL Server performance issue with hash match right outer join

I am new to performance tuning, so I am not sure what my approach should be.
This is the query that is taking over 7 minutes to run.
INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID)
SELECT
PV.PatientVisitId AS PatientEncounterID,
InsSub.InsuranceSubscriberID
FROM
DB1.dbo.PatientVisit PV
JOIN
DB1.dbo.PatientVisitInsurance PVI ON PV.PatientVisitId = PVI.PatientVisitId
JOIN
DB1.dbo.PatientInsurance PatIns on PatIns.PatientInsuranceId = PVI.PatientInsuranceId
JOIN
DB1.dbo.PatientProfile PP On PP.PatientProfileId = PatIns.PatientProfileId
LEFT OUTER JOIN
DB1.dbo.Guarantor G ON PatIns.PatientProfileId = G.PatientProfileId
JOIN
Warehouse.dbo.InsuranceSubscriber InsSub ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId
AND InsSub.OrderForClaims = PatIns.OrderForClaims
AND ((InsSub.GuarantorID = G.GuarantorId) OR (InsSub.GuarantorID IS NULL AND G.GuarantorId IS NULL))
JOIN
Warehouse.dbo.Encounter E ON E.PatientEncounterID = PV.PatientVisitId
The execution plan shows a Hash Match Right Outer Join that costs 89% of the query.
There is not a right outer join in the query so I don't see where the problem is.
How can I make the query more efficient?
Here is the Hash Match detail:
To elaborate on my comment, you could try splitting it into two queries: the first matches on GuarantorID, and the second handles the case where GuarantorID is NULL in InsuranceSubscriber and is either NULL in Guarantor or missing from Guarantor completely:
INSERT INTO SubscriberToEncounterMapping(PatientEncounterID, InsuranceSubscriberID)
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID
FROM DB1.dbo.PatientVisit PV
JOIN DB1.dbo.PatientVisitInsurance PVI
ON PV.PatientVisitId = PVI.PatientVisitId
JOIN DB1.dbo.PatientInsurance PatIns
ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId
JOIN DB1.dbo.PatientProfile PP
ON PP.PatientProfileId = PatIns.PatientProfileId
JOIN DB1.dbo.Guarantor G
ON PatIns.PatientProfileId = G.PatientProfileId
JOIN Warehouse.dbo.InsuranceSubscriber InsSub
ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId
AND InsSub.OrderForClaims = PatIns.OrderForClaims
AND InsSub.GuarantorID = G.GuarantorId
JOIN Warehouse.dbo.Encounter E
ON E.PatientEncounterID = PV.PatientVisitId
UNION ALL
SELECT PV.PatientVisitId AS PatientEncounterID, InsSub.InsuranceSubscriberID
FROM DB1.dbo.PatientVisit PV
JOIN DB1.dbo.PatientVisitInsurance PVI
ON PV.PatientVisitId = PVI.PatientVisitId
JOIN DB1.dbo.PatientInsurance PatIns
ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId
JOIN DB1.dbo.PatientProfile PP
ON PP.PatientProfileId = PatIns.PatientProfileId
JOIN Warehouse.dbo.InsuranceSubscriber InsSub
ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId
AND InsSub.OrderForClaims = PatIns.OrderForClaims
AND InsSub.GuarantorID IS NULL
JOIN Warehouse.dbo.Encounter E
ON E.PatientEncounterID = PV.PatientVisitId
WHERE NOT EXISTS
( SELECT 1
FROM DB1.dbo.Guarantor G
WHERE PatIns.PatientProfileId = G.PatientProfileId
AND G.GuarantorId IS NOT NULL
);
I would re-order the joins so that the ones that reduce the number of records the most come first. The earlier a join cuts down the number of records, the less work the remaining joins have to do. Then perform the outer join. Table locking can also be an issue, so add WITH (NOLOCK) to avoid waiting on locked records, at the cost of possibly reading uncommitted data.
Perhaps something like this would work with a little tweaking.
INSERT INTO SubscriberToEncounterMapping (
PatientEncounterID
, InsuranceSubscriberID
)
SELECT PV.PatientVisitId AS PatientEncounterID
, InsSub.InsuranceSubscriberID
FROM DB1.dbo.PatientVisit PV WITH (NOLOCK)
INNER JOIN Warehouse.dbo.Encounter E WITH (NOLOCK)
ON E.PatientEncounterID = PV.PatientVisitId
INNER JOIN DB1.dbo.PatientVisitInsurance PVI WITH (NOLOCK)
ON PV.PatientVisitId = PVI.PatientVisitId
INNER JOIN DB1.dbo.PatientInsurance PatIns WITH (NOLOCK)
ON PatIns.PatientInsuranceId = PVI.PatientInsuranceId
INNER JOIN DB1.dbo.PatientProfile PP WITH (NOLOCK)
ON PP.PatientProfileId = PatIns.PatientProfileId
INNER JOIN Warehouse.dbo.InsuranceSubscriber InsSub WITH (NOLOCK)
ON InsSub.InsuranceCarriersID = PatIns.InsuranceCarriersId
AND InsSub.OrderForClaims = PatIns.OrderForClaims
LEFT JOIN DB1.dbo.Guarantor G WITH (NOLOCK)
ON PatIns.PatientProfileId = G.PatientProfileId
AND (
(InsSub.GuarantorID = G.GuarantorId)
OR (
InsSub.GuarantorID IS NULL
AND G.GuarantorId IS NULL
)
)

How to group row value using SQL Server?

I want to group rows with the same yAxisTitle in SQL Server. The image below shows my data.
Expected result:
Query I used:
select
q.questionId, q.questionName,
p.perspectiveTitle, x.xAxisTitle, y.yAxisTitle, c.value
from
coaching_questionPerspectiveMap as c
inner join
Coaching_question as q on c.questionId = q.questionId
inner join
Coaching_perspective as p on c.perspectiveId = p.perspectiveId
inner join
coaching_xAxisData x on c.xAxisDataId = x.xAxisDataId
inner join
coaching_yAxisData y on c.yAxisDataId = y.yAxisDataId
where
q.questionId = 14
and p.perspectiveId = 1
order by
c.sort
Can anyone suggest a solution?
Thanks
If you just want the data ordered so that it shows in groups of yAxisTitle, use this:
select
q.questionId, q.questionName,
p.perspectiveTitle, x.xAxisTitle, y.yAxisTitle, c.value
from
coaching_questionPerspectiveMap as c
inner join
Coaching_question as q on c.questionId = q.questionId
inner join
Coaching_perspective as p on c.perspectiveId = p.perspectiveId
inner join
coaching_xAxisData x on c.xAxisDataId = x.xAxisDataId
inner join
coaching_yAxisData y on c.yAxisDataId = y.yAxisDataId
where
q.questionId = 14
and p.perspectiveId = 1
order by
y.yAxisTitle, c.sort

SQL Server AVG and Excel AVERAGE producing different results?

I'm trying to show averages in SQL Server, but when I test the data in Excel the results are not the same. There must be something obvious I am missing.
Here is the code and results from SQL server:
SELECT DISTINCT
d.d_reference + ' - ' + d.d_name AS Faculty,
AVG(sub.GroupSize) AS FacultyAverage
FROM
unitesnapshot.dbo.capd_register r
INNER JOIN unitesnapshot.dbo.capd_studentregister sr ON sr.sr_register = r.r_id
INNER JOIN unitesnapshot.dbo.capd_activity a ON a.a_register = r.r_id
INNER JOIN unitesnapshot.dbo.capd_moduleactivity ma ON ma.ma_activity = a.a_id
INNER JOIN unitesnapshot.dbo.capd_module m ON m.m_id = ma.ma_activitymodule
INNER JOIN unitesnapshot.dbo.capd_department d ON d.d_id = m.m_moduledept
INNER JOIN unitesnapshot.dbo.capd_section sec ON sec.s_id = m.m_modulesection
INNER JOIN (SELECT
r.r_reference,
COUNT(DISTINCT s.s_studentreference) AS GroupSize
FROM
unitesnapshot.dbo.capd_student s
INNER JOIN unitesnapshot.dbo.capd_person p ON p.p_id = s.s_id
INNER JOIN unitesnapshot.dbo.capd_studentregister sr ON sr.sr_student = p.p_id
INNER JOIN unitesnapshot.dbo.capd_register r ON r.r_id = sr.sr_register
GROUP BY
r.r_reference) sub ON sub.r_reference = r.r_reference
WHERE
SUBSTRING(r.r_reference,4,2) = '12' AND
d.d_reference = '730'
GROUP BY
d.d_reference,
d.d_name
Here are the results in Excel:
Thanks
Try this for fun:
select avg(a)
from
(values(1),(2),(3),(4)) x(a);
avg(a)
-------
2
AVG() returns the same datatype as the base column. If your columns are of type int, then the result will be truncated to an int as well. The below returns the "correct" result.
select avg(cast(a as decimal(10,5)))
from
(values(1),(2),(3),(4)) x(a);
result
--------
2.5
The discrepancy you are showing (24 vs 19.50484) most likely involves another error in addition to this one. To check that you are summing the same data in Excel as in SQL Server, dump the result of the query below into Excel and sum it up. If it doesn't match what you currently believe is the Excel equivalent of the SQL Server data, line the columns up and check that they have the same number of rows. Then sort each column individually by value in ascending order and compare again.
SELECT d.d_name, sub.GroupSize AS FacultyAverage
FROM unitesnapshot.dbo.capd_register r
INNER JOIN unitesnapshot.dbo.capd_studentregister sr ON sr.sr_register = r.r_id
INNER JOIN unitesnapshot.dbo.capd_activity a ON a.a_register = r.r_id
INNER JOIN unitesnapshot.dbo.capd_moduleactivity ma ON ma.ma_activity = a.a_id
INNER JOIN unitesnapshot.dbo.capd_module m ON m.m_id = ma.ma_activitymodule
INNER JOIN unitesnapshot.dbo.capd_department d ON d.d_id = m.m_moduledept
INNER JOIN unitesnapshot.dbo.capd_section sec ON sec.s_id = m.m_modulesection
INNER JOIN (SELECT r.r_reference,
COUNT(DISTINCT s.s_studentreference) AS GroupSize
FROM unitesnapshot.dbo.capd_student s
INNER JOIN unitesnapshot.dbo.capd_person p ON p.p_id = s.s_id
INNER JOIN unitesnapshot.dbo.capd_studentregister sr ON sr.sr_student = p.p_id
INNER JOIN unitesnapshot.dbo.capd_register r ON r.r_id = sr.sr_register
GROUP BY r.r_reference) sub ON sub.r_reference = r.r_reference
WHERE SUBSTRING(r.r_reference,4,2) = '12' AND d.d_reference = '730'
ORDER BY d.d_name
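One possible source of such a gap, offered here as an assumption that the thread does not confirm, is that joining to a one-row-per-student table repeats each register once per student, which weights the average toward larger groups. A minimal sketch with hypothetical table variables @Register and @Enrolment and made-up rows:

```sql
-- Hypothetical data: register 1 has 1 student, register 2 has 3 students.
DECLARE @Register  TABLE (r_id int, GroupSize int);
DECLARE @Enrolment TABLE (r_id int, student_id int);

INSERT INTO @Register  VALUES (1, 1), (2, 3);
INSERT INTO @Enrolment VALUES (1, 101), (2, 201), (2, 202), (2, 203);

-- Joined to the enrolment rows, each register is counted once per student:
-- (1 + 3 + 3 + 3) / 4 = 2.5
SELECT AVG(CAST(r.GroupSize AS decimal(10, 2))) AS WeightedAverage
FROM @Register r
INNER JOIN @Enrolment e ON e.r_id = r.r_id;

-- One row per register: (1 + 3) / 2 = 2
SELECT AVG(CAST(r.GroupSize AS decimal(10, 2))) AS PlainAverage
FROM @Register r;
```

The cast to decimal also avoids the integer truncation described above. Comparing row counts between the dumped result and the Excel sheet would show whether this kind of duplication is actually present.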

Resources