select top 1 with a group by


I have two columns:
namecode name
050125 chris
050125 tof
050125 tof
050130 chris
050131 tof
I want to group by namecode and return only the name that occurs most often for each namecode. In this instance, the result would be:
050125 tof
050130 chris
050131 tof
This is with SQL Server 2000
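To try the answers below, the sample data from the question can be loaded with something like the following. This is only a sketch: the table name MyTable and the column types are assumptions (the question does not give them), and the rows are inserted with SELECT ... UNION ALL because multi-row VALUES lists are not available on SQL Server 2000.

```sql
-- Hypothetical table definition; names and types are assumed, not from the question.
-- NameCode is stored as char(6) so that the leading zero is preserved.
CREATE TABLE MyTable (
    NameCode char(6)     NOT NULL,
    Name     varchar(50) NOT NULL
);

-- The five sample rows from the question.
INSERT INTO MyTable (NameCode, Name)
SELECT '050125', 'chris' UNION ALL
SELECT '050125', 'tof'   UNION ALL
SELECT '050125', 'tof'   UNION ALL
SELECT '050130', 'chris' UNION ALL
SELECT '050131', 'tof';
```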

I usually use ROW_NUMBER() to achieve this (note that it requires SQL Server 2005 or later, so it is not available on SQL Server 2000). I have not measured how it performs against various data sets, but we haven't had any performance issues as a result of using ROW_NUMBER.
The PARTITION BY clause specifies which value to "group" the row numbers by, and the ORDER BY clause specifies how the records within each "group" should be sorted. So count the rows per NameCode and Name, partition the result by NameCode, order each partition by that count descending, and get all records with a row number of 1 (that is, the most frequent name in each partition).
SELECT
    i.NameCode,
    i.Name
FROM
(
    SELECT
        -- number the names within each NameCode, most frequent first
        RowNumber = ROW_NUMBER() OVER (PARTITION BY t.NameCode ORDER BY COUNT(*) DESC, t.Name),
        t.NameCode,
        t.Name
    FROM
        MyTable t
    GROUP BY
        t.NameCode,
        t.Name
) i
WHERE
    i.RowNumber = 1;

select distinct o.namecode
     , (
         -- most frequent name for this namecode
         select top 1 i.name
         from myTable i
         where i.namecode = o.namecode
         group by i.name
         order by count(*) desc
       ) as name
from myTable o

SELECT max_table.namecode, count_table2.name
FROM
(SELECT namecode, MAX(count_name) AS max_count
FROM
(SELECT namecode, name, COUNT(name) AS count_name
FROM mytable
GROUP BY namecode, name) AS count_table1
GROUP BY namecode) AS max_table
INNER JOIN
(SELECT namecode, COUNT(name) AS count_name, name
FROM mytable
GROUP BY namecode, name) count_table2
ON max_table.namecode = count_table2.namecode AND
count_table2.count_name = max_table.max_count

I did not try but this should work,
select top 1 t2.* from (
select namecode, count(*) count from temp
group by namecode) t1 join temp t2 on t1.namecode = t2.namecode
order by t1.count desc

Here are two examples that you could use. The temp table was more efficient than the view, but that was measured on a small data sample, so you would want to check the statistics on your own data.
--Creating A View
GO
CREATE VIEW StateStoreSales AS
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank'
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
GO
SELECT * FROM StateStoreSales
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP VIEW StateStoreSales
---Using a Temp Table
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank' INTO #TEMP
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
SELECT * FROM #TEMP
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP TABLE #TEMP

Related

Is there any way to sum duplicate rows when deleting duplicates using CTE?

I have a table that contains duplicated ItemId values. I am using a CTE to remove the duplicate records and keep only a single record for each item. I am able to do this successfully with the following query:
Create procedure sp_SumSameItems
as
begin
with cte as (select a.Id,a.ItemId,Qty, QtyPrice,
ROW_NUMBER() OVER(PARTITION by ItemId ORDER BY Id) AS rn from tblTest a)
delete x from tblTest x Join cte On x.Id = cte.Id where cte.rn > 1
end
The actual problem is that I want to sum Qty and QtyPrice before deleting the duplicate records. Where should I add the SUM function?

You can't combine an update with a delete statement; you need to run the update first:
update t
set t.qty = (select sum(t1.qty) from tblTest t1 where t1.itemid = t.itemid)
from tblTest t;

A CTE is valid for only one statement, so you will need to either write the CTE twice (once for the summing statement and once for the delete), or put the result of the CTE in a temp table and then use the temp table to sum and then delete records in the original table.
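The temp table variant could look roughly like this. It is a sketch under assumptions: tblTest and its columns Id, ItemId, Qty and QtyPrice are taken from the question, while the temp table name #Totals and the choice to keep the row with the lowest Id per item are mine.

```sql
-- One row per ItemId: the Id to keep and the summed quantities.
SELECT ItemId,
       MIN(Id)       AS KeepId,
       SUM(Qty)      AS Qty,
       SUM(QtyPrice) AS QtyPrice
INTO #Totals
FROM tblTest
GROUP BY ItemId;

-- Write the totals onto the row that will be kept.
UPDATE x
SET x.Qty      = t.Qty,
    x.QtyPrice = t.QtyPrice
FROM tblTest x
JOIN #Totals t
  ON t.KeepId = x.Id;

-- Remove every other row of the same item.
DELETE x
FROM tblTest x
JOIN #Totals t
  ON t.ItemId = x.ItemId
WHERE x.Id <> t.KeepId;

DROP TABLE #Totals;
```

Wrapping the three statements in a transaction would keep the table consistent if one of them fails.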

First update Qty and QtyPrice with the per-item totals, then remove the duplicate records.
For example:
CREATE PROCEDURE Sp_sumsameitems
AS
  BEGIN
      -- Step 1: write the per-item totals onto every row of that item
      WITH cte1
           AS (SELECT a.itemid,
                      Sum(a.qty)      AS Qty,
                      Sum(a.qtyprice) AS QtyPrice
               FROM   tbltest a
               GROUP  BY a.itemid)
      UPDATE x
      SET    x.qty = c.qty,
             x.qtyprice = c.qtyprice
      FROM   tbltest x
             JOIN cte1 c
               ON x.itemid = c.itemid;

      -- Step 2: keep the first row of each item and delete the rest
      WITH cte
           AS (SELECT a.id,
                      a.itemid,
                      Row_number()
                        OVER(
                          partition BY a.itemid
                          ORDER BY a.id) AS rn
               FROM   tbltest a)
      DELETE x
      FROM   tbltest x
             JOIN cte
               ON x.id = cte.id
      WHERE  cte.rn > 1;
  END

SQL Simple Join with two tables, but one is random

I am stuck with this. I have a simple set-up with two tables: one holds email addresses and the other holds voucher codes. I want to join them into a third table, so that each email address gets one random voucher code.
Unfortunately, there are no identical IDs to match the two tables on. What I have so far returns no result:
Select
A.Email
B.CouponCode
FROM Emailaddresses as A
JOIN CouponCodes as B
on A.Email = B.CouponCode
A hint would be great, as searching has not got me any further yet.
Edit -
Table A (Addresses)
----------------------------
Column A         | Column B
----------------------------
email1@gmail.com | True
email2@gmail.com |
email3@gmail.com | True
email4@gmail.com |

Table B (Voucher)
----------------------------
ABCD1234
ABCD5678
ABCD9876
ABCD5432

Table C
----------------------------
Column A         | Column B
----------------------------
email1@gmail.com | ABCD1234
email2@gmail.com | ABCD5678
email3@gmail.com | ABCD9876
email4@gmail.com | ABCD5432

While joining without proper keys is not a good solution, for your case you can try this (note: not tested, just a quick suggestion):
;with cte_email as (
    select row_number() over (order by Email) as rownum, Email
    from Emailaddresses
),
cte_coupon as (
    select row_number() over (order by CouponCode) as rownum, CouponCode
    from CouponCodes
)
select a.Email, b.CouponCode
from cte_email a
join cte_coupon b
    on a.rownum = b.rownum

You want to randomly join records, one email with one coupon each. So create random row numbers and join on these:
select
e.email,
c.couponcode
from (select t.*, row_number() over (order by newid()) as rn from emailaddresses t) e
join (select t.*, row_number() over (order by newid()) as rn from CouponCodes t) c
on c.rn = e.rn;
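Since the question asks for the pairs to end up in a third table, the random pairing can feed an INSERT directly. The sketch below assumes a target table called EmailCoupons with made-up column types; only Emailaddresses, CouponCodes, Email and CouponCode come from the question.

```sql
-- Hypothetical target table; name and column types are assumptions.
CREATE TABLE EmailCoupons (
    Email      varchar(255) NOT NULL,
    CouponCode varchar(50)  NOT NULL
);

-- Number both tables in random order and pair rows with the same number.
INSERT INTO EmailCoupons (Email, CouponCode)
SELECT e.Email, c.CouponCode
FROM (SELECT Email,
             ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn
      FROM Emailaddresses) e
JOIN (SELECT CouponCode,
             ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn
      FROM CouponCodes) c
  ON c.rn = e.rn;
```

Because this is an inner join, any email addresses beyond the number of available coupon codes are left out; a LEFT JOIN from the email side would keep them with a NULL code, provided the CouponCode column allows NULL.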

Assign a row number to the rows of both tables and join on that row number.
Query
;with cte as(
select [rn] = row_number() over(
order by [Column_A]
), *
from [Table_A]
),
cte2 as(
select [rn] = row_number() over(
order by [Column_A]
), *
from [Table_B]
)
select t1.[Column_A] as [Email_Id], t2.[Column_A] as [Coupon]
from cte t1
join cte2 t2
on t1.rn = t2.rn;

SQL Server 2014 Consolidate Tables avoiding duplicates

I have 36 Sales tables, each belonging to one store:
st1.dbo.Sales
st2.dbo.Sales
...
st35.dbo.Sales
st36.dbo.Sales
Each record has the following key columns:
UserName, PostalCode, Location, Country, InvoiceAmount, ItemsCount, StoreID
Here is SQLFiddle
I need to copy into the Customers table every UserName (with its details) that is not already present in Customers.
When a UserName appears more than once, the fields must be taken from the record where InvoiceAmount is highest.
I tried to build a query, but it looks too complicated and it is also wrong, because the CROSS APPLY should consider the full list of Sales tables:
INSERT INTO Customers (.....)
SELECT distinct
d.UserName,
w.postalCode,
w.location,
W.country,
max(w.invoiceamount) invoiceamount,
max(w.itemscount) itemscount,
w.storeID
FROM
(SELECT * FROM st1.dbo.Sales
UNION
SELECT * FROM st2.dbo.Sales
UNION
...
SELECT * FROM st36.dbo.Sales) d
LEFT JOIN
G.dbo.Customers s ON d.Username = s.UserName
CROSS APPLY
(SELECT TOP (1) *
FROM s.dbo.[Sales]
WHERE d.Username=w.Username
ORDER BY InvoiceAmount DESC) w
WHERE
s.UserName IS NULL
AND d.username IS NOT NULL
GROUP BY
d.UserName, w.postalCode, w.location,
w.country, w.storeID
Can somebody please give some hints?

As a basic SQL query, I'd compute a row_number in the inner subquery after left joining to Customers to keep only the users not already in that table, and then isolate the row with the highest invoice amount for each of those users.
INSERT INTO Customers (.....)
SELECT w.UserName,
w.postalCode,
w.location,
w.country,
w.invoiceamount,
w.itemscount,
w.storeID
FROM (select d.*,
row_number() over(partition by d.Username order by d.invoiceamount desc) rownumber
from (SELECT *
FROM st1.dbo.Sales
UNION
SELECT *
FROM st2.dbo.Sales
UNION
...
SELECT *
FROM st36.dbo.Sales
) d
LEFT JOIN G.dbo.Customers s
ON d.Username = s.UserName
WHERE s.UserName IS NULL
AND d.username IS NOT NULL
) w
where w.rownumber = 1

Using your fiddle, this selects one row per username, the one with the highest invoiceamount:
with d as(
SELECT * FROM Sales
UNION
SELECT * FROM Sales2
)
select *
from ( select *,
rn = row_number() over(partition by Username order by invoiceamount desc)
from d) dd
where rn=1;

Step 1: combine the Sales tables in a CTE.
with cte as (
select username, invoiceamount, itemscount from Sales
union all
select username, invoiceamount, itemscount from Sales2
.....
...
)
Step 2: in a second CTE, group the previous result set by user and take the max invoiceamount and itemscount.
, cte2 as (
select username, max(invoiceamount) as invoiceamount, max(itemscount) as itemscount
from cte
group by username)
Step 3: left join cte2 to the Customers table to find the missing users, together with their itemscount and invoiceamount.
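Put together, the three steps might look like the sketch below. It uses the two tables from the fiddle (Sales and Sales2) as stand-ins for the 36 store tables, and it assumes Customers has a UserName column to match on; both are assumptions to adjust to the real schema.

```sql
-- Step 1: stack the store tables.
WITH cte AS (
    SELECT UserName, InvoiceAmount, ItemsCount FROM Sales
    UNION ALL
    SELECT UserName, InvoiceAmount, ItemsCount FROM Sales2
),
-- Step 2: one row per user with the maximum values.
cte2 AS (
    SELECT UserName,
           MAX(InvoiceAmount) AS InvoiceAmount,
           MAX(ItemsCount)    AS ItemsCount
    FROM cte
    WHERE UserName IS NOT NULL
    GROUP BY UserName
)
-- Step 3: keep only the users that are not in Customers yet.
SELECT c2.UserName, c2.InvoiceAmount, c2.ItemsCount
FROM cte2 c2
LEFT JOIN Customers cu
       ON cu.UserName = c2.UserName
WHERE cu.UserName IS NULL;
```

Note that the two MAX values are computed independently, so they can come from different sales rows. If every field has to come from the single record with the highest InvoiceAmount, a ROW_NUMBER() partitioned by UserName is the safer choice.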

Using max(col) with count in sub-query SQL Server

I am putting together a query in SQL Server but am having issues with the sub-query.
I wish to take the max(loadid) and count the number of records that have it.
For example, if my last loadid is 400 and there are 2300 records with loadid 400, my record_count column should display 2300. I have tried the various ways below but am getting errors.
select count (loadid)
from t1
where loadid = (select max(loadid) from t1) record_count;
(select top 1 LOADID, count(*)
from t1
group by loadid
order by count(*) desc) as Record_Count

This shows the loadid and its number of matching rows by grouping on loadid, ordering by loadid descending so the highest one comes first, and limiting the output to 1 row with TOP.
select top 1 loadid, count(*) as cnt
from t1
group by loadid
order by loadid desc

This may be easier to achieve with a window function in the inner query:
SELECT COUNT(*)
FROM (SELECT RANK() OVER (ORDER BY loadid DESC) AS rk
FROM t1) t
WHERE rk = 1
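For comparison, the first attempt in the question is close to working as a plain subquery once the alias is attached to the column instead of the end of the statement. A minimal sketch, assuming only the table t1 and column loadid from the question:

```sql
-- Count the rows that carry the highest loadid, and show that loadid too.
SELECT MAX(loadid) AS loadid,
       COUNT(*)    AS record_count
FROM t1
WHERE loadid = (SELECT MAX(loadid) FROM t1);
```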

Another simple way to achieve the result:
Set Nocount On;
Declare @Test Table
(
    Id Int
)
Insert Into @Test(Id) Values
(397),(398),(399),(400)
Declare @Abc Table
(
    Id Int
    ,Value Varchar(100)
)
Insert Into @Abc(Id,Value) Values
(398,'')
,(400,'')
,(397,'')
,(400,'')
,(400,'')
Select a.Id
    ,Count(a.Value) As RecordCount
From @Abc As a
Join
(
    Select Max(t.Id) As Id
    From @Test As t
) As v On a.Id = v.Id
Group By a.Id

RANK() Over Partition BY not working

When I run the code below, the ROWID is always 1.
I need the ID to start at 1 for each item with the same Credit Value.
;WITH CTETotal AS (SELECT
TranRegion
,TranCustomer
,TranDocNo
,SUM(TranSale) 'CreditValue'
FROM dbo.Transactions
LEFT JOIN customers AS C
ON custregion = tranregion
AND custnumber = trancustomer
LEFT JOIN products AS P
ON prodcode = tranprodcode
GROUP BY
TranRegion
,TranCustomer
,TranDocNo)
SELECT
r.RegionDesc
,suppcodedesc
,t.tranreason as [Reason]
,t.trandocno as [Document Number]
,sum(tranqty) as Qty
,sum(tranmass) as Mass
,sum(transale) as Sale
,cte.CreditValue AS 'Credit Value'
,RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.CreditValue)AS ROWID
FROM transactions t
LEFT JOIN dbo.Regions AS r
ON r.RegionCode = TranRegion
LEFT JOIN CTETotal AS cte
ON cte.TranRegion = t.TranRegion
AND cte.TranCustomer = t.TranCustomer
AND cte.TranDocNo = t.TranDocNo
GROUP BY
r.RegionDesc
,suppcodedesc
,t.tranreason
,t.trandocno
,cte.CreditValue
ORDER BY CreditValue ASC
EDIT
All the credit values with 400 must have the ROWID set to 1. And all the credit values with 200 must have the ROWID set to 2. And so on and so on.

Do you need something like this?
with cte (item,CreditValue)
as
(
select 'a',8 as CreditValue union all
select 'b',18 union all
select 'a',8 union all
select 'b',18 union all
select 'a',8
)
select CreditValue,dense_rank() OVER (ORDER BY CreditValue)AS ROWID from cte
Result
CreditValue ROWID
----------- --------------------
8 1
8 1
8 1
18 2
18 2
In your code replace
,RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.CreditValue)AS ROWID
by
,DENSE_RANK() OVER (ORDER BY cte.CreditValue)AS ROWID

You don't need PARTITION BY at all; just use DENSE_RANK() OVER (ORDER BY cte.CreditValue).
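The difference is easy to see on a few literal values. The sketch below uses made-up credit values and a VALUES table constructor (which needs SQL Server 2008 or later) instead of the tables from the question.

```sql
SELECT v.CreditValue,
       -- every row in a partition ties on the ORDER BY column, so this is always 1
       RANK() OVER (PARTITION BY v.CreditValue
                    ORDER BY v.CreditValue)       AS RankInPartition,
       -- one number per distinct value across the whole result: 200 gets 1, 400 gets 2
       DENSE_RANK() OVER (ORDER BY v.CreditValue) AS DenseRankOverall
FROM (VALUES (200), (200), (400), (400), (400)) AS v (CreditValue);
```

Adding DESC to the ORDER BY inside DENSE_RANK() reverses the numbering, so that the highest credit value gets 1.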

I think the problem is with the RANK() OVER (PARTITION BY ...) clause:
you have to partition by item, not by CreditValue.

Try this
RANK() OVER (PARTITION BY cte.CreditValue ORDER BY r.RegionDesc)AS ROWID

Edit: The issue here isn't actually the nesting of the subquery. The likely cause is that the PARTITION BY columns make every row its own partition, so every rank comes out as 1.
Rather than ranking within your complex query like this:
select
rank() over(partition by...),
*
from
data_source
join table1
join table2
join table3
join table4
order by
some_column
Try rank() or row_number() on the resulting data set, not within it.
For example, using the query above, remove rank() and implement it this way:
select
    rank() over(partition by...),
    results.*
from (
    select
        *
    from
        data_source
        join table1
        join table2
        join table3
        join table4
) as results
order by
    some_column

