How to remove duplicates in a SQL query? - sql-server

I have a query built for a specific filter, and I want to restructure it so that it returns output for all records, but the restructured version produces duplicated sum values.
The initial query I have is as follows:
SELECT
CustomerNo,
(SELECT SUM(Amount)
FROM DetailedCustLedgEntry DLE
WHERE
DLE.PostingDate <= '2019-12-16' AND
DLE.[Entry Type] In (1,3,4,5,6,7,8,9,12,13,14,15,16,17) AND
DLE.CustLedgerEntryNo = CLE.EntryNo) as TotalAmount,
--Amt Modifer 1
(SELECT SUM(ClosedByAmount)
FROM CustLedgerEntry
WHERE
"Closed by EntryNo" = CLE.EntryNo AND
PostingDate <= '2019-12-16') as AmtModifer1,
--Amt Modifier 2
(SELECT SUM(ClosedByAmount * -1)
FROM CustLedgerEntry
WHERE
EntryNo = CLE."Closed by EntryNo" AND
PostingDate <= '2019-12-16') as AmtModifer2
FROM
CustLedgerEntry CLE
WHERE
CustomerNo = '104421' AND
"Open"=1 AND
Branch='D--HE-001' AND
PostingDate<='2019-12-17'
The query above filters by CustomerNo = '104421', but I want to be able to see all customers.
I have created joins as follows:
SELECT
CLE.CustomerNo,
SUM(DLE.Amount) as TotalAmount,
SUM(CLE_AM1.ClosedByAmount) as AmtModifer1,
SUM(CLE_AM2.ClosedByAmount * -1) as AmtModifer2
INTO
#TempCLE
FROM
CustLedgerEntry CLE
LEFT JOIN
DetailedCustLedgEntry DLE ON DLE.CustLedgerEntryNo = CLE.EntryNo
LEFT JOIN
CustLedgerEntry CLE_AM1 ON CLE_AM1."Closed by EntryNo" = CLE.EntryNo
LEFT JOIN
CustLedgerEntry CLE_AM2 ON CLE_AM2.EntryNo = CLE."Closed by EntryNo"
WHERE
CLE."Open" = 1 AND
CLE.Branch = 'D--HE-001' AND
CLE.PostingDate <= '2019-12-17' and
DLE.PostingDate <= '2019-12-16' AND
DLE.[Entry Type] IN (1,3,4,5,6,7,8,9,12,13,14,15,16,17)
GROUP BY
CLE.CustomerNo
The problem is that TotalAmount is duplicated in the output: the value is summed twice, so instead of 2 I see 4.
If I remove the joins for CLE_AM1 and CLE_AM2, the duplicates disappear. I'm guessing the problem is in these joins, but I don't see how they should be built instead:
LEFT JOIN
"Bobcat Bensheim GmbH$Cust_ Ledger Entry" CLE_AM1 ON CLE_AM1."Closed by Entry No_" = CLE."Entry No_"
LEFT JOIN
"Bobcat Bensheim GmbH$Cust_ Ledger Entry" CLE_AM2 ON CLE_AM2."Entry No_" = CLE."Closed by Entry No_"
Any ideas?
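
A likely cause is join fan-out: every CLE row is repeated once per matching CLE_AM1 (and CLE_AM2) row, so DLE.Amount is counted once per repetition. The sketch below reproduces this with Python's sqlite3 module as a stand-in for SQL Server. The schema is a hypothetical, simplified one (ClosedByEntryNo instead of "Closed by EntryNo", no date or entry-type filters), and the fix shown, aggregating each child table in a derived table before joining, is one common approach rather than the only one.

```python
import sqlite3

# Hypothetical, simplified schema: no spaces in column names, no date filters.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CustLedgerEntry (
    EntryNo INTEGER, CustomerNo TEXT, ClosedByEntryNo INTEGER, ClosedByAmount REAL);
CREATE TABLE DetailedCustLedgEntry (CustLedgerEntryNo INTEGER, Amount REAL);
-- Entry 1 (customer A) has one detail row; entries 2 and 3 point at it
-- through ClosedByEntryNo.
INSERT INTO CustLedgerEntry VALUES (1, 'A', NULL, 0), (2, 'B', 1, 10), (3, 'B', 1, 20);
INSERT INTO DetailedCustLedgEntry VALUES (1, 2);
""")

# Joining both child tables directly repeats the detail row once per AM1 match.
fanned = con.execute("""
SELECT CLE.CustomerNo, SUM(DLE.Amount)
FROM CustLedgerEntry CLE
LEFT JOIN DetailedCustLedgEntry DLE ON DLE.CustLedgerEntryNo = CLE.EntryNo
LEFT JOIN CustLedgerEntry AM1 ON AM1.ClosedByEntryNo = CLE.EntryNo
WHERE CLE.CustomerNo = 'A'
GROUP BY CLE.CustomerNo
""").fetchall()

# Aggregating each child table first leaves at most one row per ledger entry.
fixed = con.execute("""
SELECT CLE.CustomerNo, SUM(D.TotalAmount), SUM(A1.Amt)
FROM CustLedgerEntry CLE
LEFT JOIN (SELECT CustLedgerEntryNo, SUM(Amount) AS TotalAmount
           FROM DetailedCustLedgEntry
           GROUP BY CustLedgerEntryNo) D ON D.CustLedgerEntryNo = CLE.EntryNo
LEFT JOIN (SELECT ClosedByEntryNo, SUM(ClosedByAmount) AS Amt
           FROM CustLedgerEntry
           GROUP BY ClosedByEntryNo) A1 ON A1.ClosedByEntryNo = CLE.EntryNo
WHERE CLE.CustomerNo = 'A'
GROUP BY CLE.CustomerNo
""").fetchall()

print(fanned)  # TotalAmount is doubled here
print(fixed)   # TotalAmount is counted once
```

In the real query, the PostingDate and [Entry Type] conditions on DLE would presumably move inside the derived table; left in the outer WHERE clause, they discard the rows where the LEFT JOIN found no match.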

Related

How can I use outer join with subquery and groupby?

Tool : MySQL Workbench 6.3
Version : MySQL 5.7
SELECT *
FROM cars as a, battery_log as b
WHERE a.user_seq = 226 AND a.seq = b.car_seq
AND b.created = ( SELECT MAX(created) FROM battery_log WHERE car_seq = a.seq )
GROUP BY car_type
ORDER BY a.created DESC;
I want to turn this query into an outer join.
I look up rows in the 'cars' table by user_seq.
For each matching car I need the latest battery log value; cars and battery_log have a one-to-many relationship.
Sometimes the battery log has no row matching the car's seq, so that car is dropped when tables a and b are joined. How can I fix this?
SELECT a.*, b.battery
FROM cars as a
LEFT OUTER JOIN battery_log as b ON a.seq = b.car_seq
LEFT OUTER JOIN ( SELECT MAX(created) FROM battery_log WHERE a.seq = b.car_seq) as c
ON b.created = c.MAX(created)
WHERE a.user_seq = 226
GROUP BY car_type
ORDER BY a.created DESC
I tried to fix it this way, but I got the following error:
Error Code: 1054, Unknown column 'a.seq' in 'where clause'

I solved this problem like this.
SELECT *
FROM cars as a
LEFT OUTER JOIN battery_log as b ON a.seq = b.car_seq
AND b.created = (SELECT MAX(created) FROM battery_log WHERE car_seq = b.car_seq)
WHERE a.user_seq = 226
GROUP BY car_type
ORDER BY a.created DESC;
After LEFT OUTER JOIN ... ON, I added an extra condition with AND, so the latest-row filter is applied as part of the join and cars without a matching log row are kept.
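
To illustrate why putting the condition in the ON clause keeps cars that have no log rows, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The table contents are made up, the columns are reduced to the ones the query needs, and the subquery correlates on a.seq rather than b.car_seq.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cars (seq INTEGER, user_seq INTEGER, car_type TEXT);
CREATE TABLE battery_log (car_seq INTEGER, battery INTEGER, created TEXT);
-- Car 1 has two log rows, car 2 has none (made-up sample data).
INSERT INTO cars VALUES (1, 226, 'sedan'), (2, 226, 'truck');
INSERT INTO battery_log VALUES (1, 80, '2020-01-01'), (1, 55, '2020-01-02');
""")

rows = con.execute("""
SELECT a.seq, a.car_type, b.battery
FROM cars AS a
LEFT OUTER JOIN battery_log AS b
  ON a.seq = b.car_seq
 -- Only the newest log row per car survives the join condition.
 AND b.created = (SELECT MAX(created) FROM battery_log WHERE car_seq = a.seq)
WHERE a.user_seq = 226
ORDER BY a.seq
""").fetchall()

print(rows)  # car 2 is still listed, with no battery value
```

If the same MAX(created) test were placed in the WHERE clause instead, car 2 would be filtered out, because its b.created is NULL after the outer join.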

SQL Server ISNULL not working in multi table selection

I'm very new to SQL Server. I'm trying to get the maximum price of an item based on the table's update date, and to replace the value with zero if it is NULL.
Here is what I did:
DECLARE @itemid BIGINT
SELECT
(SELECT ISNULL(MAX(ITEM_SUPPLIER_PRICE.Price), 0.00)
FROM ITEM_SUPPLIER_PRICE
WHERE (ITEM_SUPPLIER_PRICE.item_id = 7)) AS price,
itemunits.unit_id,
itemunits.unit_name
FROM
ITEM_SUPPLIER_PRICE
INNER JOIN
Items ON ITEM_SUPPLIER_PRICE.item_id = Items.Item_id
INNER JOIN
itemunits ON Items.Item_unit_id = itemunits.unit_id
WHERE
(Items.Item_id = 7)
GROUP BY
itemunits.unit_id, itemunits.unit_name,
ITEM_SUPPLIER_PRICE.update_date
ORDER BY
ITEM_SUPPLIER_PRICE.update_date DESC;

I think you're just looking for the max price in the group. Since prices probably can't be negative, the second option below should be equivalent but I throw it in just in case the problem is there.
SELECT
COALESCE(MAX(isp.Price), 0.00) AS price1,
MAX(COALESCE(isp.Price, 0.00)) AS price2,
iu.unit_id,
iu.unit_name
FROM ITEM_SUPPLIER_PRICE isp
INNER JOIN Items i ON i.item_id = isp.Item_id
INNER JOIN itemunits iu ON iu.unit_id = i.Item_unit_id
WHERE i.Item_id = 7
GROUP BY
iu.unit_id,
iu.unit_name,
isp.update_date
ORDER BY isp.update_date desc;
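
One possible reason the zero never shows up (an assumption, since the question does not show the output) is that no price rows exist for the item. A grouped query over zero rows returns zero rows, so ISNULL or COALESCE has nothing to act on, whereas an aggregate without GROUP BY always returns exactly one row. A minimal sketch with Python's sqlite3 module as a stand-in for SQL Server, using an empty, cut-down version of the table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Cut-down table with no rows at all for item 7 (hypothetical data).
con.executescript("CREATE TABLE ITEM_SUPPLIER_PRICE (item_id INTEGER, Price REAL);")

# With GROUP BY there are no groups, hence no result rows to apply COALESCE to.
grouped = con.execute("""
SELECT COALESCE(MAX(Price), 0.0)
FROM ITEM_SUPPLIER_PRICE
WHERE item_id = 7
GROUP BY item_id
""").fetchall()

# Without GROUP BY the aggregate yields one row, and COALESCE replaces the NULL.
scalar = con.execute("""
SELECT COALESCE(MAX(Price), 0.0)
FROM ITEM_SUPPLIER_PRICE
WHERE item_id = 7
""").fetchall()

print(grouped)
print(scalar)
```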

T-Sql How to get Max dated records?

I want the row with the latest date for each GroupCode.
I wrote this.
SELECT FH.BelgeNo AS FaturaNo
,FHD.UrunId
,FH.Tarih
,UG.Grup AS GrupKodu
,FHD.Kodu
,FHD.UrunAdi
,FHD.BirimFiyat
FROM FirmaHareketDetayi FHD
LEFT JOIN FirmaHareketleri FH ON FH.ID = FHD.HareketId
LEFT JOIN Urunler U ON U.UrunId = FHD.UrunId --and U.Kodu = FHD.Kodu
LEFT JOIN UrunGruplari UG ON UG.GrupId = U.GrupId
WHERE FHD.Kodu = '2S619H307CF'
AND FH.FirmaId = 2610
ORDER BY Tarih DESC
and the results look like this:
There are 2 PIERBURG rows.
Is it possible to get only one PIERBURG row?
I mean the one with the latest date (Tarih: Date column, GrupKodu: Group Code)
Notes: Table UrunGrupları: ProductGroups
Table FirmaHareketleri: FirmMovements
Table FirmaHareketDetayi: FirmMovementDetails (Connected with FirmMovements by HareketId (Foreign Key))
Sorry about my english :(

You can use window functions for this
;with cte as (
SELECT FH.BelgeNo AS FaturaNo
,FHD.UrunId
,FH.Tarih
,UG.Grup AS GrupKodu
,FHD.Kodu
,FHD.UrunAdi
,FHD.BirimFiyat
, row_number() over(partition by UG.Grup order by FH.Tarih desc) as rownum
FROM FirmaHareketDetayi FHD
LEFT JOIN FirmaHareketleri FH ON FH.ID = FHD.HareketId
LEFT JOIN Urunler U ON U.UrunId = FHD.UrunId --and U.Kodu = FHD.Kodu
LEFT JOIN UrunGruplari UG ON UG.GrupId = U.GrupId
WHERE FHD.Kodu = '2S619H307CF'
AND FH.FirmaId = 2610
)
select *
from cte
where rownum = 1
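
The same pattern can be tried on a small table. The sketch below uses Python's sqlite3 module (window functions need SQLite 3.25 or later) in place of SQL Server, with a hypothetical single invoices table and English column names standing in for the joined tables above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Hypothetical flattened table; the original query joins four tables.
CREATE TABLE invoices (invoice_no TEXT, group_code TEXT, invoice_date TEXT, unit_price REAL);
INSERT INTO invoices VALUES
    ('F1', 'PIERBURG', '2019-01-10', 100.0),
    ('F2', 'PIERBURG', '2019-03-05', 110.0),
    ('F3', 'BOSCH',    '2019-02-01',  90.0);
""")

rows = con.execute("""
WITH cte AS (
    SELECT invoice_no, group_code, invoice_date, unit_price,
           -- Number the rows of each group, newest date first.
           ROW_NUMBER() OVER (PARTITION BY group_code
                              ORDER BY invoice_date DESC) AS rownum
    FROM invoices
)
SELECT invoice_no, group_code, invoice_date
FROM cte
WHERE rownum = 1
ORDER BY group_code
""").fetchall()

print(rows)  # one row per group_code, the newest one
```

If two rows in a group share the same latest date, ROW_NUMBER picks one of them arbitrarily; adding a tie-breaking column to the ORDER BY makes the choice deterministic.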

Intersect query with no duplicates

I have not used SQL Server on a large, complex scale in years, and I'm looking for help with the proper syntax for an intersect-type query that joins these two data sets without creating duplicate names. Some patients will have both an order and a clinical event entry, and some will only have a clinical event.
Data Set 1
SELECT
distinct
ea.alias as FIN,
per.NAME_Last + ', ' + per.NAME_FIRST + ' ' + Isnull(per.NAME_MIDDLE, '') as PatientName,
oa.action_dt_tm as CirOrder,
od.ORIG_ORDER_DT_TM as DischOrder,
e.disch_dt_tm as ActualDisch,
prs.NAME_FULL_FORMATTED as OrderedBy
from pathway py
join encounter e on e.CERNER_ENCOUNTER_ID = py.encntr_id
join encntr_alias ea on ea.CERNER_ENCNTR_ID = e.CERNER_ENCOUNTER_ID and ea.ENCNTR_ALIAS_TYPE_WCD = 1049
join person per on per.CERNER_PERSON_ID = e.cerner_PERSON_ID
join orders o on o.CERNER_ENCNTR_ID= e.CERNER_ENCOUNTER_ID and o.CATALOG_wCD = '82111' -- communication order
and o.pathway_catalog_id = '43809296' ---Circumcision Order
join order_action oa on oa.[CERNER_ORDER_ID] = o.CERNER_ORDER_ID and oa.ACTION_TYPE_WCD = '2494'--ordered
join orders od on od.CERNER_ENCNTR_ID= e.CERNER_ENCOUNTER_ID and od.CATALOG_WCD = '203520' --- Discharge Patient
join prsnl prs on prs.CERNER_PERSON_ID = oa.order_provider_id
where py.pathway_catalog_id = '43809296' and ---Circumcision Order
oa.action_dt_tm > '2016-01-01 00:00:00'
and oa.ACTION_DT_TM < '2016-01-19 23:59:59'
--use the report prompts as parameters for the action_dt_tm
Data Set 2
SELECT
distinct e.[CERNER_ENCOUNTER_ID],
ea.alias as FIN,
per.NAME_Last + ', ' + per.NAME_FIRST + ' ' + Isnull(per.NAME_MIDDLE, '') as PatientName,
ce.EVENT_END_DT_TM as CircTime,
od.ORIG_ORDER_DT_TM as DischOrder,
e.disch_dt_tm as ActualDisch,
'' OrderedBy, -- should be blank for this set
cv.DISPLAY
from encounter e
join clinical_event ce on e.CERNER_ENCOUNTER_ID = ce.CERNER_ENCNTR_ID
join encntr_alias ea on ea.CERNER_ENCNTR_ID = e.CERNER_ENCOUNTER_ID and ea.ENCNTR_ALIAS_TYPE_WCD = 1049
join person per on per.CERNER_PERSON_ID = e.cerner_PERSON_ID
join orders od on od.CERNER_ENCNTR_ID= e.CERNER_ENCOUNTER_ID and od.CATALOG_WCD = '203520' --- Discharge Patient
left outer join ENCNTR_LOC_HIST elh on elh.CERNER_ENCNTR_ID = e.CERNER_ENCOUNTER_ID
left outer join CODE_VALUE cv on cv.CODE_VALUE_WK = elh.LOC_NURSE_UNIT_WCD
where ce.event_wcd = '201148' ---Newborn Circumcision
and ce.[RESULT_VAL] = 'Newborn Circumcision'
and ce.EVENT_END_DT_TM > '2016-01-01 00:00:00'
and ce.event_end_dt_tm < '2016-01-19 23:59:59'
and ce.RESULT_STATUS_WCD = '25'
and elh.ACTIVE_STATUS_DT_TM < ce.event_end_dt_tm -- Circ time between the location's active time and end time.
and elh.END_EFFECTIVE_DT_TM > ce.[EVENT_END_DT_TM]
--use the report prompts as parameters for the ce.[EVENT_END_DT_TM]

The structure of an intersect query is as simple as:
select statement 1
intersect
select statement 2
intersect
select statement 3
...
This will return the distinct rows that appear in the results of both select statements. The select statements must return the same number of columns, and the corresponding columns must be of the same type (or at least be convertible to a common type).
You can also do an intersect type of query just using inner joins to filter out records in the one query that are not in the other. So for a simple example let's say you have two tables of colors.
Select distinct ColorTable1.Color
from ColorTable1
join ColorTable2
on ColorTable1.Color = ColorTable2.Color
This will return all the distinct colors in ColorTable1 that are also in ColorTable2. Using joins to filter could help your query perform better, but it does take more thought.
Also see: Set Operators (Transact-SQL)
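
As a quick check that the two forms agree on the color example, here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server. The sample colors are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ColorTable1 (Color TEXT);
CREATE TABLE ColorTable2 (Color TEXT);
-- Made-up sample data with repeated values in both tables.
INSERT INTO ColorTable1 VALUES ('red'), ('red'), ('blue'), ('green');
INSERT INTO ColorTable2 VALUES ('red'), ('green'), ('green'), ('black');
""")

# INTERSECT removes duplicates by itself.
via_intersect = con.execute("""
SELECT Color FROM ColorTable1
INTERSECT
SELECT Color FROM ColorTable2
ORDER BY Color
""").fetchall()

# The join needs DISTINCT, otherwise 'red' and 'green' would each appear twice.
via_join = con.execute("""
SELECT DISTINCT t1.Color
FROM ColorTable1 t1
JOIN ColorTable2 t2 ON t1.Color = t2.Color
ORDER BY t1.Color
""").fetchall()

print(via_intersect)
print(via_join)
```

One difference to keep in mind: INTERSECT treats two NULLs as equal, while the join condition t1.Color = t2.Color never matches NULLs, so the two forms can differ on nullable columns.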

paging over SELECT UNION super slow and killing my server

I have an SP that returns paged data from a query that contains a UNION. This is killing my DB and sometimes takes 30 seconds to run. Am I missing something obvious here? What can I do to improve its performance?
Tables Involved: Products, Categories, CategoryProducts
Goal:
Any Products that are not in a Category or have been deleted from a category UNION all Products currently in a category and page over them for a web service.
I have Indexes on all columns that I am joining on and there are 427,996 Products, 6148 Categories and 409,691 CategoryProducts in the database.
Here is my query that is taking between 6, and 30 seconds to run:
SELECT * FROM (
SELECT ROW_NUMBER() OVER(ORDER BY Products.ItemID, Products.ManufacturerID) AS RowNum, *
FROM
(
SELECT Products.*,
CategoryID = NULL, CategoryName = NULL,
CategoryProductID = NULL,
ContainerMinimumQuantity =
CASE COALESCE(Products.ContainerMinQty, 0)
WHEN 0 THEN Products.OrderMinimumQuantity
ELSE Products.ContainerMinQty
END,
Products.IsDeleted,
SortOrder = NULL
FROM CategoryProducts RIGHT OUTER JOIN Products
ON CategoryProducts.ManufacturerID = Products.ManufacturerID
AND CategoryProducts.ItemID = Products.ItemID
WHERE (Products.ManufacturerID = @ManufacturerID)
AND (Products.ModifiedOn > @tStamp )
AND ((CategoryProducts.IsDeleted = 1) OR (CategoryProducts.IsDeleted IS NULL))
UNION
SELECT Products.*,
CategoryProducts.CategoryID , CategoryProducts.CategoryName,
CategoryProducts.CategoryProductID ,
ContainerMinimumQuantity =
CASE COALESCE(Products.ContainerMinQty, 0)
WHEN 0 THEN Products.OrderMinimumQuantity
ELSE Products.ContainerMinQty
END,
CategoryProducts.IsDeleted,
CategoryProducts.SortOrder
FROM Categories INNER JOIN
CategoryProducts ON Categories.CategoryID = CategoryProducts.CategoryID INNER JOIN
Products ON CategoryProducts.ManufacturerID = Products.ManufacturerID
AND CategoryProducts.ItemID = Products.ItemID
WHERE (Products.ManufacturerID = @ManufacturerID)
AND (Products.ModifiedOn > @tStamp OR CategoryProducts.ModifiedOn > @tStamp))
AS Products) AS C
WHERE RowNum >= @StartRow AND RowNum <= @EndRow
Any insight would be greatly appreciated.

If I read your situation correctly, the only reason for having two separate queries is the treatment of missing or deleted CategoryProducts. I addressed this with a left join that includes IsDeleted = 0, which turns all deleted CategoryProducts into NULLs so they don't have to be tested again. The ModifiedOn condition gets an additional NULL test to pick up the missing or deleted CategoryProducts you want to retrieve.
select *
from (
SELECT
Products.*,
-- Following three columns will be null for deleted/missing categories
CategoryProducts.CategoryID,
CategoryProducts.CategoryName,
CategoryProducts.CategoryProductID ,
ContainerMinimumQuantity = COALESCE(nullif(Products.ContainerMinQty, 0),
Products.OrderMinimumQuantity),
CategoryProducts.IsDeleted,
CategoryProducts.SortOrder,
ROW_NUMBER() OVER(ORDER BY Products.ItemID,
Products.ManufacturerID) AS RowNum
FROM Products
LEFT JOIN CategoryProducts
ON CategoryProducts.ManufacturerID = Products.ManufacturerID
AND CategoryProducts.ItemID = Products.ItemID
-- Filter IsDeleted in join so we get nulls for deleted categories
-- And treat them the same as missing ones
AND CategoryProducts.IsDeleted = 0
LEFT JOIN Categories
ON Categories.CategoryID = CategoryProducts.CategoryID
WHERE Products.ManufacturerID = @ManufacturerID
AND (Products.ModifiedOn > @tStamp
-- Deleted/missing categories
OR CategoryProducts.ModifiedOn is null
OR CategoryProducts.ModifiedOn > @tStamp)
) C
WHERE RowNum >= @StartRow AND RowNum <= @EndRow
On a third look, I don't see Categories used at all except as a filter on CategoryProducts. If that is the case, the second LEFT JOIN should be changed to an INNER JOIN and this section should be enclosed in parentheses.
