SQL Server 2014 Consolidate Tables avoiding duplicates - sql-server

I have 36 Sales tables each referred to one store:
st1.dbo.Sales
st2.dbo.Sales
...
st35.dbo.Sales
st36.dbo.Sales
Each record has the following key columns:
UserName, PostalCode, Location, Country, InvoiceAmount, ItemsCount, StoreID
Here is SQLFiddle
I need to copy into Customers table all Username (and their details) that are not already present into Customers
in case of duplicated it is required to use the fields of record where InvoiceAmount is MAX
I tried to build a query but looks too complicated and it is also wrong because in CROSS APPLY should consider the full list of Sales Tables
INSERT INTO Customers (.....)
SELECT distinct
d.UserName,
w.postalCode,
w.location,
W.country,
max(w.invoiceamount) invoiceamount,
max(w.itemscount) itemscount,
w.storeID
FROM
(SELECT * FROM st1.dbo.Sales
UNION
SELECT * FROM st2.dbo.Sales
UNION
...
SELECT * FROM st36.dbo.Sales) d
LEFT JOIN
G.dbo.Customers s ON d.Username = s.UserName
CROSS APPLY
(SELECT TOP (1) *
FROM s.dbo.[Sales]
WHERE d.Username=w.Username
ORDER BY InvoiceAmount DESC) w
WHERE
s.UserName IS NULL
AND d.username IS NOT NULL
GROUP BY
d.UserName, w.postalCode, w.location,
w.country, w.storeID
Can somebody please give some hints?

As a basic SQL query, I'd create a row_number in the inner subquery and then join to customers and then isolated the max invoice number for each customer not in the customer table.
INSERT INTO Customers (.....)
SELECT w.UserName,
w.postalCode,
w.location,
w.country,
w.invoiceamount,
w.itemscount,
w.storeID
FROM (select d.*,
row_number() over(partition by d.Username order by d.invoiceamount desc) rownumber
from (SELECT *
FROM st1.dbo.Sales
UNION
SELECT *
FROM st2.dbo.Sales
UNION
...
SELECT *
FROM st36.dbo.Sales
) d
LEFT JOIN G.dbo.Customers s
ON d.Username = s.UserName
WHERE s.UserName IS NULL
AND d.username IS NOT NULL
) w
where w.rownumber = 1

Using your fiddle this will select distinct usernames rows with max invoiceamount
with d as(
SELECT * FROM Sales
UNION
SELECT * FROM Sales2
)
select *
from ( select *,
rn = row_number() over(partition by Username order by invoiceamount desc)
from d) dd
where rn=1;

step 1 - use cte .
select username , invoiceamount ,itemscount from Sales
UNION all
select user name , invoiceamount ,itemscount from Sales
.....
...
step 2
next cte use group by and get max invoiceamount ,itemscount for user of last result set.
,cte2 as (
select user name , max (invoiceamount) as invoiceamount ,max(itemscount) as itemscount from cte)
step3
use left join with user table and find missing record and itemscount invoiceamount

Related

Print results side by side SQL Server

I have following result set,
Now with above results i want to print the records via select query as below attached image
Please note, I will have only two types of columns in output Present Employee & Absent Employees.
I tried using pivot tables, temporary table but cant achieve what I want.
One method would be to ROW_NUMBER each the the "statuses" and then use a FULL OUTER JOIN to get the 2 datasets into the appropriate columns. I use a FULL OUTER JOIN as I assume you could have a different amount of employees who were present/absent.
CREATE TABLE dbo.YourTable (Name varchar(10), --Using a name that doesn't require delimit identification
Status varchar(7), --Using a name that doesn't require delimit identification
Days int);
GO
INSERT INTO dbo.YourTable(Name, Status, Days)
VALUES('Mal','Present',30),
('Jess','Present',20),
('Rick','Absent',30),
('Jerry','Absent',10);
GO
WITH RNs AS(
SELECT Name,
Status,
Days,
ROW_NUMBER() OVER (PARTITION BY Status ORDER BY Days DESC) AS RN
FROM dbo.YourTable)
SELECT P.Name AS PresentName,
P.Days AS PresentDays,
A.Name AS AbsentName,
A.Days AS AbsentDays
FROM (SELECT R.Name,
R.Days,
R.Status,
R.RN
FROM RNs R
WHERE R.Status = 'Present') P
FULL OUTER JOIN (SELECT R.Name,
R.Days,
R.Status,
R.RN
FROM RNs R
WHERE R.Status = 'Absent') A ON P.RN = A.RN;
GO
DROP TABLE dbo.YourTable;
db<>fiddle
2 CTE's is actually far neater:
WITH Absents AS(
SELECT Name,
Status,
Days,
ROW_NUMBER() OVER (ORDER BY Days DESC) AS RN
FROM dbo.YourTable
WHERE Status = 'Absent'),
Presents AS(
SELECT Name,
Status,
Days,
ROW_NUMBER() OVER (ORDER BY Days DESC) AS RN
FROM dbo.YourTable
WHERE Status = 'Present')
SELECT P.Name AS PresentName,
P.Days AS PresentDays,
A.Name AS AbsentName,
A.Days AS AbsentDays
FROM Absents A
FULL OUTER JOIN Presents P ON A.RN = P.RN;

SQL Simple Join with two tables, but one is random

I am stuck with this. I have a simple set-up with two tables. One table is holding emailaddresses one table is holding vouchercodes. I want to join them in a third table, so that each emailaddress has one random vouchercode.
Unfortunatly I am stuck with this as there are no identic Ids to match both values. What I have so far brings no result:
Select
A.Email
B.CouponCode
FROM Emailaddresses as A
JOIN CouponCodes as B
on A.Email = B.CouponCode
A hint would be great as search did not bring me any further yet.
Edit -
Table A (Addresses)
-------------------
Column A | Column B
-------------------------
email1#gmail.com True
email2#gmail.com
email3#gmail.com True
email4#gmail.com
Table B (Voucher)
-------------------
ABCD1234
ABCD5678
ABCD9876
ABCD5432
Table C
-------------------------
column A | column B
-------------------------
email1#gmail.com ABCD1234
email2#gmail.com ABCD5678
email3#gmail.com ABCD9876
email4#gmail.com ABCD5432
Sample Data:
While joining without proper keys is not a good solution, for your case you can try this. (note: not tested, just a quick suggestion)
;with cte_email as (
select row_number() over (order by Email) as rownum, Email
from Emailaddresses
)
;with cte_coupon as (
select row_number() over (order by CouponCode) as rownum, CouponCode
from CouponCodes
)
select a.Email,b.CouponCode
from cte_email a
join cte_coupon b
on a.rownum = b.rownum
You want to randomly join records, one email with one coupon each. So create random row numbers and join on these:
select
e.email,
c.couponcode
from (select t.*, row_number() over (order by newid()) as rn from emailaddresses t) e
join (select t.*, row_number() over (order by newid()) as rn from CouponCodes t) c
on c.rn = e.rn;
Give a row number for both the tables and join it with row number.
Query
;with cte as(
select [rn] = row_number() over(
order by [Column_A]
), *
from [Table_A]
),
cte2 as(
select [rn] = row_number() over(
order by [Column_A]
), *
from [Table_B]
)
select t1.[Column_A] as [Email_Id], t2.[Column_A] as [Coupon]
from cte t1
join cte2 t2
on t1.rn = t2.rn;
Find a demo here

Latest Customer Donation And Amount

I have two tables:
Customer which has an Id column representing the customer Id.
CustomerDonation that contains CustomerId (FK), Amount and DatePayed
I'd like have all the customers together with their latest donation and the amount of that donation.
I am receiving duplicate values on my query so I will not paste it here.
You could also use the WITH TIES option
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By CustomerId Order By DatePayed Desc)
WITH
SortedDonation AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY DatePayed DESC) AS SeqID,
*
FROM
CustomerDonation
)
SELECT
*
FROM
Customer
LEFT JOIN
SortedDonation
ON SortedDonation.CustomerId = Customer.Id
AND SortedDonation.SeqId = 1
If the same customer can make multiple donations with the same DatePayed, then this will arbitrarily pick just one of them.
If you add additional fields to the ORDER BY you can deterministically pick which one you want.
Or, if you want all of them use DENSE_RANK() instead of ROW_NUMBER()
Use Row_Number() Analytic function .
Select * from (
Select customerId,Amount,DatePayed, row_number() over (partition by CustomerId order by DatePayed desc) as rowN)
as tab where rowN = 1
You only need the CustomerDonation table for this. You can join with the Customer table if you want other information of the customer.
WITH cte AS (
SELECT
CustomerId
, MAX(DatePayed) AS LastDate
FROM
CustomerDonation
)
SELECT
cd.CustomerId
, cd.Amount
, cd.DatePayed
FROM
CustomerDonation cd
JOIN cte ON cd.CustomerId = cte.CustomerId
AND cd.DatePayed = cte.LastDate

RANK() Over Partition BY not working

When I run the code below the ROWID is always 1.
I need to the ID to start at 1 for each item with the same Credit Value.
;WITH CTETotal AS (SELECT
TranRegion
,TranCustomer
,TranDocNo
,SUM(TranSale) 'CreditValue'
FROM dbo.Transactions
LEFT JOIN customers AS C
ON custregion = tranregion
AND custnumber = trancustomer
LEFT JOIN products AS P
ON prodcode = tranprodcode
GROUP BY
TranRegion
,TranCustomer
,TranDocNo)
SELECT
r.RegionDesc
,suppcodedesc
,t.tranreason as [Reason]
,t.trandocno as [Document Number]
,sum(tranqty) as Qty
,sum(tranmass) as Mass
,sum(transale) as Sale
,cte.CreditValue AS 'Credit Value'
,RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.CreditValue)AS ROWID
FROM transactions t
LEFT JOIN dbo.Regions AS r
ON r.RegionCode = TranRegion
LEFT JOIN CTETotal AS cte
ON cte.TranRegion = t.TranRegion
AND cte.TranCustomer = t.TranCustomer
AND cte.TranDocNo = t.TranDocNo
GROUP BY
r.RegionDesc
,suppcodedesc
,t.tranreason
,t.trandocno
,cte.CreditValue
ORDER BY CreditValue ASC
EDIT
All the credit values with 400 must have the ROWID set to 1. And all the credit values with 200 must have the ROWID set to 2. And so on and so on.
Do you need something like this?
with cte (item,CreditValue)
as
(
select 'a',8 as CreditValue union all
select 'b',18 union all
select 'a',8 union all
select 'b',18 union all
select 'a',8
)
select CreditValue,dense_rank() OVER (ORDER BY item)AS ROWID from cte
Result
CreditValue ROWID
----------- --------------------
8 1
8 1
8 1
18 2
18 2
In your code replace
,RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.CreditValue)AS ROWID
by
,DENSE_RANK() OVER (ORDER BY cte.CreditValue)AS ROWID
You just don't have to use PARTITION, just DENSE_RANK() OVER (ORDER BY cte.CreditValue)
I think the problem is with the RANK() OVER (PARTITION BY clause
you have to partition it by item not by CreditValue
Try this
RANK() OVER (PARTITION BY cte.CreditValue ORDER BY cte.RegionDesc)AS ROWID
Edit: The issue here isn't actually the nesting of the subquery, it's potentially based on partition by having columns that truly make each row unique (or 1)
Rather than ranking within your complex query like this
select
rank() over(partition by...),
*
from
data_source
join table1
join table2
join table3
join table4
order by
some_column
Try rank() or row_number() on the resulting data set, not within it.
For example, using the query above, remove rank() and implement it this way:
select
rank() over(partition by...),
results.*
from (
select
*
from
data_source
join table1
join table2
join table3
join table4
order by
some_column
) as results

select top 1 with a group by

I have two columns:
namecode name
050125 chris
050125 tof
050125 tof
050130 chris
050131 tof
I want to group by namecode, and return only the name with the most number of occurrences. In this instance, the result would be
050125 tof
050130 chris
050131 tof
This is with SQL Server 2000
I usually use ROW_NUMBER() to achieve this. Not sure how it performs against various data sets, but we haven't had any performance issues as a result of using ROW_NUMBER.
The PARTITION BY clause specifies which value to "group" the row numbers by, and the ORDER BY clause specifies how the records within each "group" should be sorted. So partition the data set by NameCode, and get all records with a Row Number of 1 (that is, the first record in each partition, ordered by the ORDER BY clause).
SELECT
i.NameCode,
i.Name
FROM
(
SELECT
RowNumber = ROW_NUMBER() OVER (PARTITION BY t.NameCode ORDER BY t.Name),
t.NameCode,
t.Name
FROM
MyTable t
) i
WHERE
i.RowNumber = 1;
select distinct namecode
, (
select top 1 name from
(
select namecode, name, count(*)
from myTable i
where i.namecode = o.namecode
group by namecode, name
order by count(*) desc
) x
) as name
from myTable o
SELECT max_table.namecode, count_table2.name
FROM
(SELECT namecode, MAX(count_name) AS max_count
FROM
(SELECT namecode, name, COUNT(name) AS count_name
FROM mytable
GROUP BY namecode, name) AS count_table1
GROUP BY namecode) AS max_table
INNER JOIN
(SELECT namecode, COUNT(name) AS count_name, name
FROM mytable
GROUP BY namecode, name) count_table2
ON max_table.namecode = count_table2.namecode AND
count_table2.count_name = max_table.max_count
I did not try but this should work,
select top 1 t2.* from (
select namecode, count(*) count from temp
group by namecode) t1 join temp t2 on t1.namecode = t2.namecode
order by t1.count desc
Here are to examples that you could use but the temp table use is more efficient than the view, but was done on a small data sample. You would want to check your own statistics.
--Creating A View
GO
CREATE VIEW StateStoreSales AS
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank'
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
GO
SELECT * FROM StateStoreSales
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP VIEW StateStoreSales
---Using a Temp Table
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank' INTO #TEMP
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
SELECT * FROM #TEMP
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP TABLE #TEMP

Resources