T-SQL count distinct and group by distinct IDs - sql-server

I have the following type of table:
BranchName CustID
========== ======
Branch1 1111
Branch1 1111
Branch1 2222
Branch2 2222
Branch2 4444
Branch3 1111
Branch4 3333
What I'm trying to achieve is to count distinct CustID and group them by branches, without repeating any CustID.
Basically I'm trying to get to this:
BranchName DistCountofCust
========== ======
Branch1 2
Branch2 1
Branch3 0
Branch4 1
I tried this code:
SELECT X.BranchName, COUNT(DISTINCT X.CustID) as DistCountofCust
FROM
(SELECT T.BranchName, T.CustID
FROM MyTable T) as X
GROUP BY X.BranchName
It doesn't give the correct result: the per-branch counts are off because some CustIDs appear under more than one branch. Is it possible to count each CustID only once and group the counts by branch? (In the final result, each customer should be counted for only one branch.)

You may use NOT EXISTS for this. I'm assuming that branches have certain priority according to their name.
SELECT T1.BranchName, COUNT(DISTINCT X.CustID) as DistCountofCust
FROM MyTable T1
LEFT JOIN (
SELECT BranchName, CustID
FROM MyTable T1
WHERE NOT EXISTS (
SELECT 1
FROM MyTable T2
WHERE T1.BranchName > T2.BranchName and
T1.CustID = T2.CustID
)
) as X ON T1.BranchName = X.BranchName and T1.CustID = X.CustID
GROUP BY T1.BranchName
dbfiddle demo

One answer for you. As I say in my comment, ideally you want to have a separate table for your branches; otherwise, once the rows are filtered, a branch with no remaining rows can't show a count of 0 (it would return no rows and be omitted). This is why there is a second CTE to get the DISTINCT values of BranchName.
CREATE TABLE #Branch (BranchName varchar(10), CustID int);
INSERT INTO #Branch
VALUES
('Branch1',1111),
('Branch1',1111),
('Branch1',2222),
('Branch2',2222),
('Branch2',4444),
('Branch3',1111),
('Branch4',3333);
WITH FirstCustomer AS(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY BranchName) AS RN
FROM #Branch),
Branches AS ( --really you should have a separate table to store your branches in
SELECT DISTINCT BranchName
FROM #Branch)
SELECT B.BranchName,
COUNT(FC.CustID) AS Customers
FROM Branches B
LEFT JOIN FirstCustomer FC ON B.BranchName = FC.BranchName
AND FC.RN = 1
GROUP BY B.BranchName;
DROP TABLE #Branch;
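Both answers above settle the overlap the same way: a customer is credited to the alphabetically first branch it appears under. That rule is an assumption made by the answers, not something the question states. As a minimal sketch of the same rule, MIN(BranchName) per CustID can stand in for ROW_NUMBER. The sample rows are inlined with VALUES so the query runs on its own; FirstBranch and AllBranches are illustrative names.

```sql
-- Sample data from the question, inlined so the query is self-contained.
WITH MyTable AS (
    SELECT BranchName, CustID
    FROM (VALUES ('Branch1', 1111), ('Branch1', 1111), ('Branch1', 2222),
                 ('Branch2', 2222), ('Branch2', 4444),
                 ('Branch3', 1111), ('Branch4', 3333)) AS v (BranchName, CustID)
),
-- Credit each customer to the alphabetically first branch it appears under.
FirstBranch AS (
    SELECT CustID, MIN(BranchName) AS BranchName
    FROM MyTable
    GROUP BY CustID
),
-- Every branch, so that branches with no credited customers still show up.
AllBranches AS (
    SELECT DISTINCT BranchName
    FROM MyTable
)
SELECT B.BranchName,
       COUNT(FB.CustID) AS DistCountofCust  -- counts only matched rows, so 0 is possible
FROM AllBranches B
LEFT JOIN FirstBranch FB ON FB.BranchName = B.BranchName
GROUP BY B.BranchName
ORDER BY B.BranchName;
```

With the sample rows this returns 2, 1, 0 and 1 for Branch1 to Branch4, which matches the output the question asks for.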

Related

TSQL : Find PAIR Sequence in a table

I have following table in T-SQL(there are other columns too but no identity column or primary key column):
Oid Cid
1 a
1 b
2 f
3 c
4 f
5 a
5 b
6 f
6 g
7 f
So in the above example, I would like to flag the following Oids as duplicates when the Cid values of each Oid are compared as "PAIRS":
Oid:
1 (1 matches Oid: 5)
2 (2 matches Oid: 4 and 7)
Please NOTE that the matches for Oid 2 do not include Oid 6, since Oid 6 also has the letter 'g'.
Is it possible to write a query, without using a WHILE loop, that highlights the Oids as above, along with a count of how many other matches exist in the database?
I am trying to find patterns within the dataset relating to these two columns. Thank you in advance.
Here is a worked example - see comments for explanation:
--First set up your data in a table variable
declare @oidcid table (Oid int, Cid char(1));
insert into @oidcid values
(1,'a'),
(1,'b'),
(2,'f'),
(3,'c'),
(4,'f'),
(5,'a'),
(5,'b'),
(6,'f'),
(6,'g'),
(7,'f');
--This cte gets a table with all of the cids in order, for each oid
with cte as (
select distinct Oid, (select Cid + ',' from @oidcid i2
where i2.Oid = i.Oid order by Cid
for xml path('')) Cids
from @oidcid i
)
select Oid, cte.Cids
from cte
inner join (
-- Here we get just the lists of cids that appear more than once
select Cids, Count(Oid) as OidCount
from cte group by Cids
having Count(Oid) > 1 ) as gcte on cte.Cids = gcte.Cids
-- And when we list them, we are showing the oids with duplicate cids next to each other
Order by cte.Cids
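On SQL Server 2017 or later, STRING_AGG can build the same ordered list without the FOR XML PATH trick. The sketch below is a variation on the answer above, not part of it. It inlines the sample rows, and it adds the "how many other matches" count the question asks for. It assumes each (Oid, Cid) pair appears at most once; Lists, Counted and OtherMatches are illustrative names.

```sql
-- Sample data from the question, inlined so the query is self-contained.
WITH OidCid AS (
    SELECT Oid, Cid
    FROM (VALUES (1,'a'), (1,'b'), (2,'f'), (3,'c'), (4,'f'),
                 (5,'a'), (5,'b'), (6,'f'), (6,'g'), (7,'f')) AS v (Oid, Cid)
),
-- One row per Oid with its Cids as an ordered, comma-separated list.
Lists AS (
    SELECT Oid,
           STRING_AGG(Cid, ',') WITHIN GROUP (ORDER BY Cid) AS Cids
    FROM OidCid
    GROUP BY Oid
),
-- For each Oid, how many other Oids share exactly the same list.
Counted AS (
    SELECT Oid, Cids,
           COUNT(*) OVER (PARTITION BY Cids) - 1 AS OtherMatches
    FROM Lists
)
SELECT Oid, Cids, OtherMatches
FROM Counted
WHERE OtherMatches > 0
ORDER BY Cids, Oid;
```

For the sample data this lists Oid 1 and 5 (list a,b, one other match each) and Oid 2, 4 and 7 (list f, two other matches each). Oid 6 is excluded because its list is f,g.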
select o1.Cid, o1.Oid, o2.Oid
, count(*) over (partition by o1.Cid) + 1 as [cnt]
from [table] o1 -- replace [table] with your table name
join [table] o2
on o1.Cid = o2.Cid
and o1.Oid < o2.Oid
order by o1.Cid, o1.Oid, o2.Oid
Maybe Like this then:
WITH CTE AS
(
SELECT Cid, oid
,ROW_NUMBER() OVER (PARTITION BY cid ORDER BY cid) AS RN
,SUM(1) OVER (PARTITION BY oid) AS maxRow2
,SUM(1) OVER (PARTITION BY cid) AS maxRow
FROM oid
)
SELECT * FROM CTE WHERE maxRow != 1 AND maxRow2 = 1
ORDER BY oid

select statement with "Group by" on specific columns but displaying other columns along with group by columns

I want to get all the data, grouped only by the encounter and medicationname columns.
select encounter,medicationname,count(*) as freq,labdate,result
from Medications where (labdate between @admitdate and DATEDIFF(dd,24,@admitdate))
group by encounter,medicationname having count(*)>2
I have records like
encounter medicationname freq
8604261 ACC 3
Now, based on this data, this is my desired output:
encounter medicationname labtime result
8604261 ACC 2015-05-22 18
8604261 ACC 2015-07-23 23
8604261 ACC 2015-09-09 27
You can use COUNT() as a window function, something like this:
;With Counted as (
SELECT encounter,medicationname,labdate,result,
COUNT(*) OVER (PARTITION BY encounter,medicationname) as cnt
from Medications
where (labdate between @admitdate
and DATEDIFF(dd,24,@admitdate))
)
select encounter,medicationname,labdate,result
from Counted
where cnt > 2
I would note that I think DATEDIFF [1] is probably wrong also but since I don't have your data, inputs and an actual spec, I've left it as is for now.
[1] DATEDIFF returns an int, but you're using it in a comparison against a column which is apparently a date. DATEADD would be the more probably desired function here, but as I say, I don't have full information to go on.
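To make the footnote concrete, here is a small sketch of the difference between the two functions. The @admitdate value and the two sample dates are made up, and reading the filter as "within 24 days after admission" is an assumption; the question does not say which window is wanted.

```sql
DECLARE @admitdate datetime = '20150520';  -- hypothetical admission date

-- DATEDIFF returns an int: the number of day boundaries between its two date arguments.
-- The literal 24 is implicitly converted to a datetime (24 days after 1900-01-01),
-- so this is a day count, not a date.
SELECT DATEDIFF(dd, 24, @admitdate) AS DaysAsInt;

-- DATEADD returns a date/time value: the moment 24 days after @admitdate.
SELECT DATEADD(dd, 24, @admitdate) AS WindowEnd;

-- The same kind of filter written with DATEADD, run against two made-up lab dates.
SELECT v.labdate
FROM (VALUES (CAST('20150522' AS datetime)),
             (CAST('20150723' AS datetime))) AS v (labdate)
WHERE v.labdate BETWEEN @admitdate AND DATEADD(dd, 24, @admitdate);
```

With these values only the 22 May row falls inside the window, since the window ends 24 days after 20 May.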
If I understand your question correctly, what you need is this:
;WITH CTE AS
(
select encounter,medicationname,count(*) as freq
from Medications where (labdate between @admitdate and DATEDIFF(dd,24,@admitdate))
group by encounter,medicationname having count(*) > 2
)
select M.encounter,M.medicationname,M.labdate,M.result
from Medications M
INNER JOIN CTE C
ON M.encounter = C.encounter
AND M.medicationname = C.medicationname
where (M.labdate between @admitdate and DATEDIFF(dd,24,@admitdate))
or better yet using COUNT()OVER()
;WITH CTE AS
(
SELECT encounter,medicationname,COUNT(*) OVER(PARTITION BY encounter,medicationname)as freq,labdate,result
FROM Medications
WHERE (labdate between @admitdate and DATEDIFF(dd,24,@admitdate))
)
SELECT * FROM CTE
WHERE freq > 2
select encounter,medicationname,count(*) as freq,labdate,result
from Medications
where (labdate between @admitdate and DATEDIFF(dd,24,@admitdate))
group by encounter,medicationname having count(*) > 2

T-SQL order by, based on other column value

I'm stuck with a query which should be pretty simple but, for reasons unknown, my brain is not playing ball here ...
Table:
id(int) | strategy (varchar) | value (whatever)
1 "ABC" whatevs
2 "ABC" yeah
3 "DEF" hello
4 "DEF" kitty
5 "QQQ" hurrr
The query should select ALL rows grouped on strategy but only one row per strategy - the one with the highest id.
In the case above, it should return rows with id 2, 4 and 5
SELECT id, strategy , value
FROM (
SELECT id, strategy , value
,ROW_NUMBER() OVER (PARTITION BY strategy ORDER BY ID DESC) rn
FROM Table_Name
) Sub
WHERE rn = 1
Working SQL FIDDLE
You can use window function to get the solution you want. Fiddle here
with cte as
(
select
rank()over(partition by strategy order by id desc) as rnk,
id, strategy, value from myT
)
select id, strategy, value from
cte where rnk = 1;
Try this:
SELECT T2.id,T1.strategy,T1.value
FROM TableName T1
INNER JOIN
(SELECT MAX(id) as id,strategy
FROM TableName
GROUP BY strategy) T2
ON T1.id=T2.id
Result:
ID STRATEGY VALUE
2 ABC yeah
4 DEF kitty
5 QQQ hurrr
See result in SQL Fiddle.
SELECT id, strategy , value
FROM (
SELECT id, strategy , value
,MAX(id) OVER (PARTITION BY strategy) MaxId
FROM YourTable
) Sub
WHERE id=MaxId
You may try this one as well:
SELECT id, strategy, value FROM TableName WHERE id IN (
SELECT MAX(id) FROM TableName GROUP BY strategy
)
It depends a bit on your data: you might get results faster with this, as it does no sorting, but on the other hand it uses IN, which can slow you down if there are many 'strategies'.

Check for applicable Group query for shopping cart

I have a problem with one of my discount check conditions. I have the table structure below:
Cart table (id, customerid, productid)
Group table (groupid, groupname, discountamount)
Group Products table (groupproductid, groupid, productid)
When an order is placed, there will be multiple items in the cart. I want to check those items against the topmost group, to see whether the cart contains all of the products in that group.
Example:
If group 1 consists of 2 products and both of those products exist in the cart table, then the group 1 discount should be returned.
Please help.
It's tricky, without having real table definitions nor sample data. So I've made some up:
create table Carts(
id int,
customerid int,
productid int
)
create table Groups(
groupid int,
groupname int,
discountamount int
)
create table GroupProducts(
groupproductid int,
groupid int,
productid int
)
insert into Carts (id,customerid,productid) values
(1,1,1),
(2,1,2),
(3,1,4),
(4,2,2),
(5,2,3)
insert into Groups (groupid,groupname,discountamount) values
(1,1,10),
(2,2,15),
(3,3,20)
insert into GroupProducts (groupproductid,groupid,productid) values
(1,1,1),
(2,1,5),
(3,2,2),
(4,2,4),
(5,3,2),
(6,3,3)
;With MatchedProducts as (
select
c.customerid,gp.groupid,COUNT(*) as Cnt
from
Carts c
inner join
GroupProducts gp
on
c.productid = gp.productid
group by
c.customerid,gp.groupid
), GroupSizes as (
select groupid,COUNT(*) as Cnt from GroupProducts group by groupid
), MatchingGroups as (
select
mp.*
from
MatchedProducts mp
inner join
GroupSizes gs
on
mp.groupid = gs.groupid and
mp.Cnt = gs.Cnt
)
select * from MatchingGroups
Which produces this result:
customerid groupid Cnt
----------- ----------- -----------
1 2 2
2 3 2
What we're doing here is called "relational division" - if you want to search elsewhere for that term. In my current results, each customer only matches one group - if there are multiple matches, we need some tie-breaking conditions to determine which group to report. I prompted with two suggestions in comments (lowest groupid or highest discountamount). Your response of "added earlier" doesn't help - we don't have a column which contains the addition dates of groups. Rows have no inherent ordering in SQL.
We would do the tie-breaking in the definition of MatchingGroups and the final select:
MatchingGroups as (
select
mp.*,
ROW_NUMBER() OVER (PARTITION BY mp.customerid ORDER BY /*Tie break criteria here */) as rn
from
MatchedProducts mp
inner join
GroupSizes gs
on
mp.groupid = gs.groupid and
mp.Cnt = gs.Cnt
)
select * from MatchingGroups where rn = 1
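As an illustration of the tie-break, the sketch below fills in the placeholder with one of the two criteria suggested above: highest discountamount, then lowest groupid. The choice of criterion is an assumption, and so is group 4, which is hypothetical and added only so that each customer matches more than one group. The tables are inlined with VALUES so the query runs on its own.

```sql
WITH Carts AS (
    SELECT id, customerid, productid
    FROM (VALUES (1,1,1), (2,1,2), (3,1,4), (4,2,2), (5,2,3)) AS v (id, customerid, productid)
),
Groups AS (
    SELECT groupid, discountamount
    FROM (VALUES (1,10), (2,15), (3,20),
                 (4,5)) AS v (groupid, discountamount)  -- group 4 is hypothetical
),
GroupProducts AS (
    SELECT groupid, productid
    FROM (VALUES (1,1), (1,5), (2,2), (2,4), (3,2), (3,3),
                 (4,2)) AS v (groupid, productid)       -- group 4 holds product 2 only
),
-- How many products of each group appear in each customer's cart.
MatchedProducts AS (
    SELECT c.customerid, gp.groupid, COUNT(*) AS Cnt
    FROM Carts c
    INNER JOIN GroupProducts gp ON c.productid = gp.productid
    GROUP BY c.customerid, gp.groupid
),
-- How many products each group contains.
GroupSizes AS (
    SELECT groupid, COUNT(*) AS Cnt
    FROM GroupProducts
    GROUP BY groupid
),
-- Groups fully covered by the cart, ranked per customer by the tie-break.
MatchingGroups AS (
    SELECT mp.customerid, mp.groupid, g.discountamount,
           ROW_NUMBER() OVER (PARTITION BY mp.customerid
                              ORDER BY g.discountamount DESC, mp.groupid) AS rn
    FROM MatchedProducts mp
    INNER JOIN GroupSizes gs ON mp.groupid = gs.groupid AND mp.Cnt = gs.Cnt
    INNER JOIN Groups g ON g.groupid = mp.groupid
)
SELECT customerid, groupid, discountamount
FROM MatchingGroups
WHERE rn = 1;
```

With these rows customer 1 fully matches groups 2 and 4 and is given group 2 (discount 15); customer 2 fully matches groups 3 and 4 and is given group 3 (discount 20). The count comparison relies on a product appearing at most once per cart and once per group.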

How to Get current column value, Previous Column Value

How to get Previous Column Value?
IIf id1 = id2 then display previous column id1 value
id1 id2
1001 1001
1002 1002
1003 1003
so on...
select id1, id2, IIf id2 = id1 then display previous id1 value as idadjusted
Output
id1 id2 id3(Expected)
1001 1001 **1000**
1002 1002 **1001**
1003 1003 **1002**
so on...
I want to display the id1 value from the previous row.
My query
SELECT CARDNO, NAME, TITLENAME, CARDEVENTDATE, MIN(CARDEVENTTIME) AS INTIME, MAX(CARDEVENTTIME) AS OUTTIME,
CARDEVENTDATE AS LASTDATE, MAX(CARDEVENTTIME) AS LASTTIME
FROM (SELECT T_PERSON.CARDNO, T_PERSON.NAME, T_TITLE.TITLENAME, T_CARDEVENT.CARDEVENTDATE, T_CARDEVENT.CARDEVENTTIME FROM (T_TITLE INNER JOIN T_PERSON ON T_TITLE.TITLECODE = T_PERSON.TITLECODE) INNER JOIN T_CARDEVENT ON T_PERSON.PERSONID = T_CARDEVENT.PERSONID ORDER BY T_PERSON.TITLECODE) GROUP BY CARDNO, NAME, TITLENAME, CARDEVENTDATE
For the LastDate - I want to Display Previous column cardeventdate value
For the Lasttime - I want to display previous column outtime value
Need Query Help?
The ON clause is used to retrieve the previous id; I have tested it and it works fine.
This solution will work even if intermediate ids are missing, i.e. the ids are not consecutive.
select t1.id, t1.column1, t1.column2,
case
when (t1.column1 = t1.column2) then t2.column1
else null
end as column3
from mytable t1
left outer join mytable t2
on t2.id = (select max(id) from mytable where id < t1.id)
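If your database supports window functions (SQL Server 2012 or later, for example), LAG returns the previous row's value directly. This is a swapped-in alternative to the self-join, not the method of the answer above, sketched with made-up rows and the question's id1/id2 column names. The gap between 1001 and 1003 is deliberate, to show that non-consecutive ids are handled.

```sql
-- Made-up rows, inlined so the query is self-contained.
WITH mytable AS (
    SELECT id1, id2
    FROM (VALUES (1000, 1000), (1001, 1001), (1003, 1003)) AS v (id1, id2)
)
SELECT id1, id2,
       CASE WHEN id1 = id2
            THEN LAG(id1) OVER (ORDER BY id1)  -- id1 of the preceding row in id1 order
       END AS idadjusted                       -- NULL when id1 <> id2 or there is no previous row
FROM mytable
ORDER BY id1;
```

For these rows idadjusted is NULL for 1000 (no previous row), 1000 for 1001, and 1001 for 1003.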
For your complex query, you can create a view and then use the above sql format for your view:
Create a view MyView for:
SELECT CARDNO, NAME, TITLENAME, CARDEVENTDATE, MIN(CARDEVENTTIME) AS INTIME, MAX(CARDEVENTTIME) AS OUTTIME
FROM (SELECT T_PERSON.CARDNO, T_PERSON.NAME, T_TITLE.TITLENAME, T_CARDEVENT.CARDEVENTDATE, T_CARDEVENT.CARDEVENTTIME
FROM T_TITLE
INNER JOIN T_PERSON ON T_TITLE.TITLECODE = T_PERSON.TITLECODE
INNER JOIN T_CARDEVENT ON T_PERSON.PERSONID = T_CARDEVENT.PERSONID
ORDER BY T_PERSON.TITLECODE) GROUP BY CARDNO, NAME, TITLENAME, CARDEVENTDATE
And then the query would be:
select v1.CARDNO, v1.NAME, v1.TITLENAME, v1.CARDEVENTDATE, v1.INTIME, v1.OUTTIME,
case
when (v1.NAME = v1.TITLENAME) then v2.CARDEVENTDATE -- Replace v1.NAME = v1.TITLENAME with your reqd condn
else null end as LASTDATE,
case
when (v1.NAME = v1.TITLENAME) then v2.OUTTIME -- Replace v1.NAME = v1.TITLENAME with your reqd condn
else null end as LASTTIME
from myview v1
left outer join myview v2
on v2.CARDNO = (select max(CARDNO) from myview where CARDNO < v1.CARDNO)
The v1.NAME = v1.TITLENAME condition in the CASE expressions needs to be replaced with the appropriate condition. I was not sure what it should be, as it's not mentioned in the question.
When you are designing your database, you should consider the fact that you cannot rely on all the rows being in the right order. Instead you should create an identity value that increments by one for every new row. If you do this, your solution becomes easy (or easier at least).
Assuming a new column called ID:
SELECT Column1 FROM myTable WHERE ID = (SELECT ID FROM myTable WHERE Column1 = Column2) - 1
If you get no match, the subquery returns NULL, so the comparison is never true and no row comes back, so you're ok.
If it is possible to get more than one match, you will have to consider that too.
Your table isn't in first normal form (1NF):
According to Date's definition of 1NF,
a table is in 1NF if and only if it is
'isomorphic to some relation', which
means, specifically, that it satisfies
the following five conditions:
1) There's no top-to-bottom ordering to the rows.
2) There's no left-to-right ordering to the columns.
3) ...

Resources