Check for applicable Group query for shopping cart - sql-server

I have a problem with one of my discount check conditions. My table structure is as below:
Cart table (id, customerid, productid)
Group table (groupid, groupname, discountamount)
Group Products table (groupproductid, groupid, productid)
While placing an order, there will be multiple items in the cart. I want to check those items against the topmost group: is every product in that group present in the shopping cart?
Example:
If group 1 consists of 2 products and both of those products exist in the cart table, then the group 1 discount should be returned.
Please help.

It's tricky without real table definitions or sample data, so I've made some up:
create table Carts(
id int,
customerid int,
productid int
)
create table Groups(
groupid int,
groupname int,
discountamount int
)
create table GroupProducts(
groupproductid int,
groupid int,
productid int
)
insert into Carts (id,customerid,productid) values
(1,1,1),
(2,1,2),
(3,1,4),
(4,2,2),
(5,2,3)
insert into Groups (groupid,groupname,discountamount) values
(1,1,10),
(2,2,15),
(3,3,20)
insert into GroupProducts (groupproductid,groupid,productid) values
(1,1,1),
(2,1,5),
(3,2,2),
(4,2,4),
(5,3,2),
(6,3,3)
;With MatchedProducts as (
select
c.customerid,gp.groupid,COUNT(*) as Cnt
from
Carts c
inner join
GroupProducts gp
on
c.productid = gp.productid
group by
c.customerid,gp.groupid
), GroupSizes as (
select groupid,COUNT(*) as Cnt from GroupProducts group by groupid
), MatchingGroups as (
select
mp.*
from
MatchedProducts mp
inner join
GroupSizes gs
on
mp.groupid = gs.groupid and
mp.Cnt = gs.Cnt
)
select * from MatchingGroups
Which produces this result:
customerid groupid Cnt
----------- ----------- -----------
1 2 2
2 3 2
What we're doing here is called "relational division" - if you want to search elsewhere for that term. In my current results, each customer only matches one group - if there are multiple matches, we need some tie-breaking conditions to determine which group to report. I prompted with two suggestions in comments (lowest groupid or highest discountamount). Your response of "added earlier" doesn't help - we don't have a column which contains the addition dates of groups. Rows have no inherent ordering in SQL.
We would do the tie-breaking in the definition of MatchingGroups and the final select:
MatchingGroups as (
select
mp.*,
ROW_NUMBER() OVER (PARTITION BY mp.customerid ORDER BY /*Tie break criteria here */) as rn
from
MatchedProducts mp
inner join
GroupSizes gs
on
mp.groupid = gs.groupid and
mp.Cnt = gs.Cnt
)
select * from MatchingGroups where rn = 1
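As an illustration only, here is the full query with one possible tie-break filled in. It assumes the made-up Carts, Groups and GroupProducts tables above, and it assumes "highest discountamount wins, then lowest groupid"; that rule is my choice for the sketch, not something the question specifies.

```sql
-- Assumes the sample Carts, Groups and GroupProducts tables defined above.
;With MatchedProducts as (
    -- How many products of each group appear in each customer's cart
    select c.customerid, gp.groupid, COUNT(*) as Cnt
    from Carts c
    inner join GroupProducts gp on c.productid = gp.productid
    group by c.customerid, gp.groupid
), GroupSizes as (
    -- How many products each group contains
    select groupid, COUNT(*) as Cnt from GroupProducts group by groupid
), MatchingGroups as (
    select
        mp.customerid, mp.groupid, g.discountamount,
        ROW_NUMBER() OVER (
            PARTITION BY mp.customerid
            -- Assumed tie-break: highest discount first, then lowest groupid
            ORDER BY g.discountamount DESC, mp.groupid
        ) as rn
    from MatchedProducts mp
    inner join GroupSizes gs on mp.groupid = gs.groupid and mp.Cnt = gs.Cnt
    inner join Groups g on g.groupid = mp.groupid
)
select customerid, groupid, discountamount
from MatchingGroups
where rn = 1
```

With the sample data above, this should return group 2 (discount 15) for customer 1 and group 3 (discount 20) for customer 2. Swapping the ORDER BY expression is all it takes to use a different rule.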

Related

T-SQL count distinct and group by distinct IDs

I have the following type of table:
BranchName CustID
========== ======
Branch1 1111
Branch1 1111
Branch1 2222
Branch2 2222
Branch2 4444
Branch3 1111
Branch4 3333
What I'm trying to achieve is to count distinct CustID and group them by branches, without repeating any CustID.
Basically I'm trying to get to this:
BranchName DistCountofCust
========== ======
Branch1 2
Branch2 1
Branch3 0
Branch4 1
I tried this code:
SELECT X.BranchName, COUNT(DISTINCT X.CustID) as DistCountofCust
FROM
(SELECT T.BranchName, T.CustID
FROM MyTable T) as X
GROUP BY X.BranchName
It doesn't give the correct result (it doesn't count the number of CustIDs per branch correctly, because CustIDs overlap for certain branches). Is it possible to eliminate duplicate CustIDs and group them per given branch? (In the final result I need only unique customers to be listed for branches.)
You may use NOT EXISTS for this. I'm assuming that branches have certain priority according to their name.
SELECT T1.BranchName, COUNT(DISTINCT X.CustID) as DistCountofCust
FROM MyTable T1
LEFT JOIN (
SELECT BranchName, CustID
FROM MyTable T1
WHERE NOT EXISTS (
SELECT 1
FROM MyTable T2
WHERE T1.BranchName > T2.BranchName and
T1.CustID = T2.CustID
)
) as X ON T1.BranchName = X.BranchName and T1.CustID = X.CustID
GROUP BY T1.BranchName
One answer for you. As I say in my comment, ideally you want to have a separate table for your branches, as otherwise, when filtering your results, you can't do a count of 0 (as that would return no rows, and be omitted). This is why there is a second CTE to get the DISTINCT values of BranchName.
CREATE TABLE #Branch (BranchName varchar(10), CustID int);
INSERT INTO #Branch
VALUES
('Branch1',1111),
('Branch1',1111),
('Branch1',2222),
('Branch2',2222),
('Branch2',4444),
('Branch3',1111),
('Branch4',3333);
WITH FirstCustomer AS(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY BranchName) AS RN
FROM #Branch),
Branches AS ( --really you should have a separate table to store your branches in
SELECT DISTINCT BranchName
FROM #Branch)
SELECT B.BranchName,
COUNT(FC.CustID) AS Customers
FROM Branches B
LEFT JOIN FirstCustomer FC ON B.BranchName = FC.BranchName
AND FC.RN = 1
GROUP BY B.BranchName;
DROP TABLE #Branch;

SQL Server : how to divide database records into even, random groups

tblNames
OrganizationID (int)
LastName (varchar)
...
GroupNumber (int)
GroupNumber is currently NULL for all records, I need an UPDATE statement to update this column.
I need to split up records on an OrganizationID level into even, random groups.
If there are < 20,000 records for an OrganizationID, I need 2 even, random groups. So records for that OrganizationID will have a GroupNumber of 1 or 2. There will be the same (or if odd number of records difference of only 1) number of records for GroupNumber = 1 and for GroupNumber = 2, and there will be no recognizable way to tell how a person got into a GroupNumber - i.e. LastNames that start with A-L are group 1, M-Z are group 2 would not be OK.
If there are > 20,000 records for an OrganizationID, I need 4 even, random groups. So records for that OrganizationID will have a GroupNumber values of 1, 2, 3, or 4. There will be the same (or if odd number of records difference of only 1) number of records for each GroupNumber, and there will be no recognizable way to tell how a person got into a GroupNumber - i.e. LastNames that start with A-F are group 1, G-L are group 2, etc. would not be OK.
There are only about 20 organizations, so I can run an update statement 20 times, once per organizationID if needed.
I have full control of the table so I can add keys or columns, but for now this is what it is.
Would appreciate any help.
Create row numbers randomly (with ROW_NUMBER and NEWID). Then take them modulo 2 or 4, depending on the record count, to get buckets 0 to 1 or 0 to 3.
select
organizationid, lastname, ...,
case when cnt <= 20000 then rn % 2 else rn % 4 end as bucket
from
(
select
organizationid, lastname, ...,
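row_number() over(partition by organizationid order by newid()) as rn, -- random numbering within each organization
count(*) over (partition by organizationid) as cnt -- record count per organization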
from mytable
) randomized;
UPDATE: I suppose the update statement would have to look something like this:
with randomized as
(
select
groupnumber,
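row_number() over(partition by organizationid order by newid()) as rn, -- random numbering within each organization
count(*) over (partition by organizationid) as cnt -- record count per organization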
from mytable
)
update randomized
set groupnumber = case when cnt <= 20000 then rn % 2 else rn % 4 end + 1;
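To sanity-check the result after running an update like the one above, a simple count per organization and group shows whether the groups came out even. This is only a sketch: it uses the tblNames table and the OrganizationID and GroupNumber columns from the question, so adjust the names if your table is really called something else.

```sql
-- Assumes tblNames(OrganizationID, GroupNumber, ...) as described in the question.
-- Within each organization, the Members counts should differ by at most 1.
select OrganizationID, GroupNumber, count(*) as Members
from tblNames
group by OrganizationID, GroupNumber
order by OrganizationID, GroupNumber;
```

Any row with a NULL GroupNumber in the output means some records were not assigned a group.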
Another slightly different approach;
Setting up some fake data:
if object_id('tempdb.dbo.#Orgs') is not null drop table #Orgs
create table #Orgs
(
RID int identity(1,1) primary key clustered,
OrganizationId int,
LastName varchar(36),
GroupId int
)
insert into #Orgs (OrganizationId, LastName)
select top 40000 row_number() over (order by (select null)) % 20000, newid()
from sys.all_objects a, sys.all_objects b
Then use the rarely useful ntile() function to get groups as close to identical in size as possible. Sorting by newid() essentially sorts the data randomly (or as randomly as one generated GUID follows the next).
declare @NumRandomGroups int = 4
update o
set GroupId = x.GroupId
from #orgs o
inner join (select RID, GroupId = ntile(@NumRandomGroups) over (order by newid())
from #orgs) x
on o.RID = x.RID
select GroupId, count(1)
from #Orgs
group by GroupId
select *
from #Orgs
order by RID
You can then set @NumRandomGroups to whatever you want it to be based on the count of Organizations.

TSQL : Find PAIR Sequence in a table

I have following table in T-SQL(there are other columns too but no identity column or primary key column):
Oid Cid
1 a
1 b
2 f
3 c
4 f
5 a
5 b
6 f
6 g
7 f
So in above example I would like to highlight that following Oid are duplicate when looking at Cid column values as "PAIRS":
Oid:
1 (1 matches Oid: 5)
2 (2 matches Oid: 4 and 7)
Please NOTE that Oid 2 match did not include Oid 6, since the pair of 6 has letter 'G' as well.
Is it possible to create a query without using While loop to highlight the "Oid" like above? along with how many other matches count exist in database?
I am trying to find the patterns within the dataset relating to these two columns. Thank you in Advance.
Here is a worked example - see comments for explanation:
--First set up your data in a table variable
declare @oidcid table (Oid int, Cid char(1));
insert into @oidcid values
(1,'a'),
(1,'b'),
(2,'f'),
(3,'c'),
(4,'f'),
(5,'a'),
(5,'b'),
(6,'f'),
(6,'g'),
(7,'f');
--This cte gets a table with all of the cids in order, for each oid
with cte as (
select distinct Oid, (select Cid + ',' from @oidcid i2
where i2.Oid = i.Oid order by Cid
for xml path('')) Cids
from @oidcid i
)
select Oid, cte.Cids
from cte
inner join (
-- Here we get just the lists of cids that appear more than once
select Cids, Count(Oid) as OidCount
from cte group by Cids
having Count(Oid) > 1 ) as gcte on cte.Cids = gcte.Cids
-- And when we list them, we are showing the oids with duplicate cids next to each other
Order by cte.Cids
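On SQL Server 2017 or later, the same idea can be sketched with STRING_AGG instead of FOR XML PATH, and a windowed count gives the number of other matches directly. This is an illustrative variant, not part of the answer above; the CTE names (sets, counted) and output columns (SetCount, OtherMatches) are made up for the example, and it assumes there are no duplicate (Oid, Cid) rows.

```sql
-- Requires SQL Server 2017+ for STRING_AGG.
declare @oidcid table (Oid int, Cid char(1));
insert into @oidcid values
(1,'a'),(1,'b'),(2,'f'),(3,'c'),(4,'f'),
(5,'a'),(5,'b'),(6,'f'),(6,'g'),(7,'f');

with sets as (
    -- One row per Oid with its Cids as an ordered, comma-separated list
    select Oid, string_agg(Cid, ',') within group (order by Cid) as Cids
    from @oidcid
    group by Oid
), counted as (
    -- How many Oids share exactly the same list
    select Oid, Cids, count(*) over (partition by Cids) as SetCount
    from sets
)
select Oid, Cids, SetCount - 1 as OtherMatches
from counted
where SetCount > 1
order by Cids, Oid;
```

For the sample data this should list Oids 1 and 5 (one other match each) and Oids 2, 4 and 7 (two other matches each). Oid 6 is left out because its list is "f,g" rather than "f".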
select o1.Cid, o1.Oid, o2.Oid
, count(*) over (partition by o1.Cid) + 1 as [cnt]
from table o1
join table o2
on o1.Cid = o2.Cid
and o1.Oid < o2.Oid
order by o1.Cid, o1.Oid, o2.Oid
Maybe like this then:
WITH CTE AS
(
SELECT Cid, oid
,ROW_NUMBER() OVER (PARTITION BY cid ORDER BY cid) AS RN
,SUM(1) OVER (PARTITION BY oid) AS maxRow2
,SUM(1) OVER (PARTITION BY cid) AS maxRow
FROM oid
)
SELECT * FROM CTE WHERE maxRow != 1 AND maxRow2 = 1
ORDER BY oid

SQL Server : return value in specific table2 column based on value in table1

I have a query that gets data from 2 tables.
Transaction table contains week_id, customer_id, upc12, sales_dollars
Products table contains upc12, column_1, column_2, column_3
I want my query to return the value in products table, based on what the customer_id is in the transaction table. customer_id = 1 should return column_1, customer_id = 2 should return column_3, etc.
SELECT
t.week_id,
customer_id,
upc12,
p.___________ sum(t.sales_dollars)
FROM
transaction t, products p
WHERE
t.upc_12 = p.upc_12
GROUP BY
t.week_id, customer_id, upc12, p.___________
Sorry if this makes no sense, but my research hasn't been very good, as I don't know how to correctly formulate my question. You probably guessed I'm new to SQL.
Thanks!
Here is one way to do it:
;WITH cte as
(
SELECT
t.week_id,
t.customer_id,
t.upc12,
-- Pick the product column that corresponds to the customer
CASE t.customer_id
WHEN 1 THEN p.Column_1
WHEN 2 THEN p.Column_2
WHEN 3 THEN p.Column_3
END As ColByCustomer,
t.sales_dollars
FROM [transaction] t -- transaction is a reserved word, so it needs brackets
INNER JOIN products p on t.upc12 = p.upc12 -- column name taken from the table description
)
SELECT week_id, customer_id, upc12, ColByCustomer, SUM(sales_dollars) AS total_sales
FROM cte
GROUP BY week_id, customer_id, upc12, ColByCustomer

Problem with unique SQL query

I want to select all records, but have the query only return a single record per Product Name. My table looks similar to:
SellId ProductName Comment
1 Cake dasd
2 Cake dasdasd
3 Bread dasdasdd
where the Product Name is not unique. I want the query to return a single record per ProductName with results like:
SellId ProductName Comment
1 Cake dasd
3 Bread dasdasdd
I have tried this query,
Select distinct ProductName,Comment ,SellId from TBL#Sells
but it is returning multiple records with the same ProductName. My table is not really as simple as this; this is just a sample. What is the solution? Is it clear?
Select ProductName,
min(Comment) , min(SellId) from TBL#Sells
group by ProductName
If you only want one record per ProductName, you of course have to choose which value you want for the other fields.
If you aggregate (using GROUP BY) you can choose an aggregate function,
that is, a function that takes a list of values and returns only one. Here I have chosen MIN, the smallest value for each field.
NOTE: Comment and SellId can come from different records, since MIN is taken per column.
Other aggregates you might find useful:
FIRST : first record encountered
LAST : last record encountered
AVG : average
COUNT : number of records
FIRST/LAST have the advantage that all fields are from the same record. (These two exist in some dialects such as MS Access, but not as aggregates in SQL Server.)
SELECT S.ProductName, S.Comment, S.SellId
FROM
Sells S
JOIN (SELECT MAX(SellId) AS SellId
FROM Sells
GROUP BY ProductName) AS TopSell ON TopSell.SellId = S.SellId
This will get the latest comment as your selected comment assuming that SellId is an auto-incremented identity that goes up.
I know, you've got an answer already, I'd like to offer a way that was fastest in terms of performance for me, in a similar situation. I'm assuming that SellId is Primary Key and identity. You'd want an index on ProductName for best performance.
select
Sells.*
from
(
select
distinct ProductName
from
Sells
) x
join
Sells
on
Sells.ProductName = x.ProductName
and Sells.SellId =
(
select
top 1 s2.SellId
from
Sells s2
where
x.ProductName = s2.ProductName
Order By SellId
)
A slower method, (but still better than Group By and MIN on a long char column) is this:
select
*
from
(
select
*,ROW_NUMBER() over (PARTITION BY ProductName order by SellId) OccurenceId
from sells
) x
where
OccurenceId = 1
An advantage of this one is that it's much easier to read.
create table Sale
(
SaleId int not null
constraint PK_Sale primary key,
ProductName varchar(100) not null,
Comment varchar(100) not null
)
insert Sale
values
(1, 'Cake', 'dasd'),
(2, 'Cake', 'dasdasd'),
(3, 'Bread', 'dasdasdd')
-- Option #1 with over()
select *
from Sale
where SaleId in
(
select SaleId
from
(
select SaleId, row_number() over(partition by ProductName order by SaleId) RowNumber
from Sale
) tt
where RowNumber = 1
)
order by SaleId
-- Option #2
select *
from Sale
where SaleId in
(
select min(SaleId)
from Sale
group by ProductName
)
order by SaleId
drop table Sale
