SQL Server: Count distinct occurrences in one field by value in another - sql-server

Currently I'm writing two queries to count distinct occurrences of fieldOne for each possible value of fieldTwo. How can I do this in one query? Thanks
select count(*) from (select distinct fieldOne from myTable where fieldTwo = 'valueOne') x
select count(*) from (select distinct fieldOne from myTable where fieldTwo = 'valueTwo') y

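Try conditional aggregation: put a CASE expression inside COUNT(DISTINCT ...) so that each column counts only the rows for one fieldTwo value. Rows that do not match yield NULL, which COUNT ignores: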
SELECT COUNT(DISTINCT CASE WHEN FIELDTWO= 'VALUEONE' THEN FIELDONE END) X ,
COUNT(DISTINCT CASE WHEN FIELDTWO= 'VALUETWO' THEN FIELDONE END)Y
FROM MYTABLE

This can be done with cross apply to remove the need to know the possible values in fieldTwo:
select twos.FieldTwo, count(1)
from (select distinct fieldTwo from MyTable) twos
cross apply (select distinct t.fieldOne
from MyTable t
where t.fieldTwo = twos.FieldTwo) ones
group by twos.FieldTwo
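When the values of fieldTwo are not known in advance, a plain GROUP BY with COUNT(DISTINCT ...) is a simpler sketch of the same idea, using the myTable, fieldOne and fieldTwo names from the question (the distinctFieldOne alias is made up for illustration). One difference to be aware of: COUNT(DISTINCT fieldOne) skips NULLs in fieldOne, and this version also returns a row for a NULL fieldTwo group, which the cross apply version does not.

```sql
-- One row per fieldTwo value, with the number of distinct fieldOne values in it.
select fieldTwo,
       count(distinct fieldOne) as distinctFieldOne
from myTable
group by fieldTwo;
```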

Related

Is there any way to sum duplicate rows when deleting duplicates using CTE?

I have a table that contains duplicate ItemId values. I am using a CTE to remove the duplicate records and keep a single record for each item. The following query does this successfully:
Create procedure sp_SumSameItems
as
begin
with cte as (select a.Id,a.ItemId,Qty, QtyPrice,
ROW_NUMBER() OVER(PARTITION by ItemId ORDER BY Id) AS rn from tblTest a)
delete x from tblTest x Join cte On x.Id = cte.Id where cte.rn > 1
end
The actual problem is that I want to sum Qty and QtyPrice before deleting the duplicate records. Where should I add the SUM function?
Problem Illustration:
You can't combine an update with a delete in one statement, so you need to run the update first:
update t
set t.qty = (select sum(t1.qty) from tblTest t1 where t1.itemid = t.itemid)
from tblTest t;
A CTE is valid for only one statement, so you will need to either run the CTE twice (once to sum, once to delete), or put the result of the CTE in a temp table and then use the temp table to sum and then delete records in the original table.
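The temp table variant could look roughly like the sketch below. It reuses tblTest, Id, ItemId, Qty and QtyPrice from the question; the temp table name #totals is an assumption, and the delete assumes Id is unique, as the question's join on Id suggests. Wrapping the three steps in a transaction is probably wise, since a failure between the update and the delete would leave summed duplicates behind.

```sql
-- Sketch only: #totals is a hypothetical temp table name.
-- 1) Capture the per-item totals once.
select ItemId, sum(Qty) as Qty, sum(QtyPrice) as QtyPrice
into #totals
from tblTest
group by ItemId;

-- 2) Copy the totals onto every row of the item.
update t
set t.Qty = s.Qty,
    t.QtyPrice = s.QtyPrice
from tblTest t
join #totals s on s.ItemId = t.ItemId;

-- 3) Keep the lowest Id per item and delete the rest.
delete t
from tblTest t
where t.Id > (select min(k.Id) from tblTest k where k.ItemId = t.ItemId);

drop table #totals;
```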
First update Qty and QtyPrice with the per-item totals, then remove the duplicate records.
For example:
CREATE PROCEDURE Sp_sumsameitems
AS
BEGIN
    -- Step 1: write the per-item totals onto every row of that item.
    WITH cte1
    AS (SELECT a.itemid,
               Sum(a.qty) AS Qty,
               Sum(a.qtyprice) AS QtyPrice
        FROM tbltest a
        GROUP BY a.itemid)
    UPDATE x
    SET x.qty = c.qty,
        x.qtyprice = c.qtyprice
    FROM tbltest x
    JOIN cte1 c
      ON x.itemid = c.itemid;

    -- Step 2: keep the first row per item and delete the rest.
    -- The statement before a WITH must end with a semicolon.
    WITH cte
    AS (SELECT a.id,
               a.itemid,
               Row_number()
                 OVER(
                   partition BY a.itemid
                   ORDER BY a.id) AS rn
        FROM tbltest a)
    DELETE x
    FROM tbltest x
    JOIN cte
      ON x.id = cte.id
    WHERE cte.rn > 1;
END

SQL Server Group By - Aggregate NULL or empty values into all other values

I am trying to group by a column. The problem is that the NULL values of the column are grouped as a separate group.
I want the NULL values to be added to each of the other group values instead.
Example of a table:
The results I want to get from group by with sum aggregation over the 'val' column:
Can anyone help me?
Thanks!
You can precalculate the value to spread through the rows and then just do arithmetic:
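select t.id,
       sum(t.val) + (tt.null_sum / tt.cnt_id) as val
from t cross join
     (select count(distinct id) as cnt_id,
             sum(case when id is null then val else 0 end) as null_sum
      from t
     ) tt
where t.id is not null  -- the NULL rows are already folded into null_sum
group by t.id, tt.null_sum, tt.cnt_id;  -- SQL Server requires non-aggregated columns in GROUP BY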
Note that SQL Server performs integer division when both operands are integers, so you might need null_sum * 1.0 / cnt_id.
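A quick way to see the integer division issue is to compare the two forms on literal values. This is only an illustration; the column aliases are made up.

```sql
-- Both operands are integers, so the fractional part is discarded: returns 3.
select 7 / 2 as int_division;

-- Multiplying by 1.0 first forces decimal arithmetic: returns a decimal equal to 3.5.
select 7 * 1.0 / 2 as decimal_division;
```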
A GROUP BY operation can't generate values for each group on the fly, so logically the missing records need to actually be present.
One approach is to cross join the distinct non-NULL ids with the NULL records, which produces one copy of each NULL record for every id group:
WITH ids AS (
SELECT DISTINCT id FROM yourTable
WHERE id IS NOT NULL
),
cte AS (
SELECT t1.id, t2.val
FROM ids t1
CROSS JOIN yourTable t2
WHERE t2.id IS NULL
)
SELECT t.id, SUM(t.val) AS val
FROM
(
SELECT id, val FROM yourTable WHERE id IS NOT NULL
UNION ALL
SELECT id, val FROM cte
) t
GROUP BY
id;

Sql Filter table by two dates in order

I have been trying to filter one table by two dates with an order of importance (date2 > date1) as follows:
SELECT
t1.customer, t1.weights, t1.max(t1.date1) as date1, t1.date2
FROM
(SELECT *
FROM table
WHERE CAST(date2 AS smalldatetime) = '10/29/2017') t2
INNER JOIN
table t1 ON t1.customer = t2.customer
AND t1.date2 = t2.date2
GROUP BY
t1.customer, t1.date2
ORDER BY
t1.customer;
It filters the table correctly by date2 first, but max(t1.date1) doesn't do what I want. I get duplicate customers that share the same (and correct) date2 but show different date1 values. The only difference between these duplicate records is the weights column. What would I need to do to output just the customer records connected to the most recent date1, without taking other columns into consideration?
I am still a noob, help would be greatly appreciated!
Solution for t-sql (all based on the accepted answer):
SELECT * FROM (
SELECT row_number() over(partition by t1.customer order by t1.date1 desc) as rownum, t1.customer, t1.weights, t1.date1 , t1.date2
FROM
(SELECT *
FROM table
WHERE CAST(date2 AS smalldatetime) = '10/29/2017') t2
INNER JOIN
table t1 ON t1.customer = t2.customer
AND t1.date2 = t2.date2
)t3
where rownum = 1;
If I understood correctly, then instead of GROUP BY logic I would just use a QUALIFY clause with ROW_NUMBER(). Note that QUALIFY is not available in SQL Server; there, the same filter goes in an outer query on the row number.
Try the code below and tell me if it's what you needed. It brings back only one row per customer, choosing the row by sorting the dates in ascending order. However, I'm unclear on what you mean by the importance of the two dates, so I may be completely off base here. Can you please give an example of input and desired output?
SELECT t1.customer, t1.weights, t1.date1, t1.date2
FROM
(
Select *
FROM table
WHERE Cast(date2 as smalldatetime)='10/29/2017'
) t2
Inner Join table t1
ON t1.customer = t2.customer
AND t1.date2 = t2.date2
Qualify row_number() over(partition by t1.customer order by date2 , date1)=1
Order By t1.customer;

Using max(col) with count in sub-query SQL Server

I am putting together a query in SQL Server but having issues with the sub-query
I wish to use the max(loadid) and count the number of records the query returns.
So, for example, if my last loadid is 400 and there are 2300 records with loadid 400, my record_count column should display 2300. I have tried the variations below but am getting errors.
select count (loadid)
from t1
where loadid = (select max(loadid) from t1) record_count;
(select top 1 LOADID, count(*)
from t1
group by loadid
order by count(*) desc) as Record_Count
Group by loadid to get the number of matching rows for each value, order by loadid descending so the highest loadid comes first, and limit the output to one row with TOP:
select top 1 loadid, count(*) as cnt
from t1
group by loadid
order by loadid desc
This may be easier to achieve with a window function in the inner query:
SELECT COUNT(*)
FROM (SELECT RANK() OVER (ORDER BY loadid DESC) AS rk
FROM t1) t
WHERE rk = 1
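The first attempt in the question fails only because the record_count alias is placed after the WHERE clause. Moving the alias into the select list gives a sketch like the following, using the t1 and loadid names from the question.

```sql
-- Count the rows that belong to the highest loadid.
select count(*) as record_count
from t1
where loadid = (select max(loadid) from t1);
```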
Another simple way to achieve the result, shown here with table variables as sample data:
Set Nocount On;
Declare @Test Table
(
Id Int
)
Insert Into @Test(Id) Values
(397),(398),(399),(400)
Declare @Abc Table
(
Id Int
,Value Varchar(100)
)
Insert Into @Abc(Id,Value) Values
(398,'')
,(400,'')
,(397,'')
,(400,'')
,(400,'')
-- Join to the single row holding the highest Id, then count the matches.
Select a.Id
,Count(a.Value) As RecordCount
From @Abc As a
Join
(
Select Max(t.Id) As Id
From @Test As t
) As v On a.Id = v.Id
Group By a.Id

select top 1 with a group by

I have two columns:
namecode name
050125 chris
050125 tof
050125 tof
050130 chris
050131 tof
I want to group by namecode, and return only the name with the most number of occurrences. In this instance, the result would be
050125 tof
050130 chris
050131 tof
This is with SQL Server 2000
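To try the answers against the sample rows above, a setup script along these lines may help. The table name MyTable and the column types are assumptions (namecode is kept as text so the leading zero survives), and single-row inserts are used because SQL Server 2000 does not accept multi-row VALUES lists.

```sql
-- Hypothetical setup: table name and column types are assumptions.
create table MyTable (namecode varchar(6), name varchar(20));

insert into MyTable (namecode, name) values ('050125', 'chris');
insert into MyTable (namecode, name) values ('050125', 'tof');
insert into MyTable (namecode, name) values ('050125', 'tof');
insert into MyTable (namecode, name) values ('050130', 'chris');
insert into MyTable (namecode, name) values ('050131', 'tof');
```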
I usually use ROW_NUMBER() to achieve this (it requires SQL Server 2005 or later, so it is not an option on SQL Server 2000). Not sure how it performs against various data sets, but we haven't had any performance issues as a result of using ROW_NUMBER.
The PARTITION BY clause specifies which value to "group" the row numbers by, and the ORDER BY clause specifies how the records within each "group" should be sorted. So count the rows per NameCode and Name, partition that result by NameCode, order each partition by the count descending, and keep the records with a row number of 1 (that is, the most frequent name in each partition).
SELECT
    i.NameCode,
    i.Name
FROM
(
    SELECT
        RowNumber = ROW_NUMBER() OVER (PARTITION BY t.NameCode ORDER BY COUNT(*) DESC),
        t.NameCode,
        t.Name
    FROM
        MyTable t
    GROUP BY
        t.NameCode,
        t.Name
) i
WHERE
    i.RowNumber = 1;
select distinct o.namecode
, (
-- ORDER BY is only allowed in a subquery together with TOP,
-- so the grouping and the TOP 1 live in the same query.
select top 1 i.name
from myTable i
where i.namecode = o.namecode
group by i.name
order by count(*) desc
) as name
from myTable o
SELECT max_table.namecode, count_table2.name
FROM
(SELECT namecode, MAX(count_name) AS max_count
FROM
(SELECT namecode, name, COUNT(name) AS count_name
FROM mytable
GROUP BY namecode, name) AS count_table1
GROUP BY namecode) AS max_table
INNER JOIN
(SELECT namecode, COUNT(name) AS count_name, name
FROM mytable
GROUP BY namecode, name) count_table2
ON max_table.namecode = count_table2.namecode AND
count_table2.count_name = max_table.max_count
I did not try it, but this should work:
select top 1 t2.* from (
select namecode, count(*) count from temp
group by namecode) t1 join temp t2 on t1.namecode = t2.namecode
order by t1.count desc
Here are two examples that you could use. The temp table version was more efficient than the view, but that was measured on a small data sample, so you will want to check your own statistics.
--Creating A View
GO
CREATE VIEW StateStoreSales AS
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank'
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
GO
SELECT * FROM StateStoreSales
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP VIEW StateStoreSales
---Using a Temp Table
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank' INTO #TEMP
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
SELECT * FROM #TEMP
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP TABLE #TEMP
