I am trying to increment a project code when I do a SQL merge operation
I have two tables. One has lots of information, including customers and projects. I want to merge the customer name and project name from one table into the other. This article showed me how to do what I needed:
https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/
However, I need to maintain a project number that increments every time a record is added and is left alone when the customer or project name is edited. If a project is deleted, numbering carries on from the next available number. I tried ROW_NUMBER() OVER (PARTITION BY ...), but it didn't give me the correct project numbers.
Using the article's example as a visualisation, I would need another column called Type, with Food or Drink as the value, and would expect:
Item     Cost  Code  Type
Tea      10    1     Drink
Coffee   12    2     Drink
Muffin   11    1     Food
Biscuit  4     2     Food
I will use the data from the linked example and add a few more rows to make sure all the cases are covered. First, create and fill these tables.
--Create a target table
DECLARE @Products TABLE
(
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(100),
    ProductNumber INT,
    ProductType VARCHAR(100),
    Rate MONEY
);
--Insert records into target table
INSERT INTO @Products
VALUES
    (1, 'Tea', 1, 'Drink', 10.00),
    (2, 'Coffee', 2, 'Drink', 20.00),
    (3, 'BiscuitX1', 1, 'Food', 45.00),
    (4, 'Muffin', 2, 'Food', 30.00),
    (5, 'BiscuitX2', 3, 'Food', 40.00),
    (6, 'BiscuitX3', 4, 'Food', 45.00),
    (7, 'Donut', 5, 'Food', 30.00),
    (8, 'BiscuitX4', 6, 'Food', 40.00),
    (9, 'BiscuitX5', 7, 'Food', 45.00);
--Create source table
DECLARE @UpdatedProducts TABLE
(
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(100),
    ProductNumber INT,
    ProductType VARCHAR(100),
    Rate MONEY
);
--Insert records into source table (ProductNumber is not known yet, so it is 0)
INSERT INTO @UpdatedProducts
VALUES
    (1, 'Tea', 0, 'Drink', 10.00),
    (2, 'Coffee', 0, 'Drink', 25.00),
    (4, 'Muffin', 0, 'Food', 35.00),
    (7, 'Donut', 0, 'Food', 30.00),
    (10, 'Pizza', 0, 'Food', 60.00),
    (11, 'PizzaLarge', 0, 'Food', 80.00);
You can see that I added the ProductNumber and ProductType columns.
I am assuming the source table does not carry the product number. If it does, you can merge directly with no problem; if it does not, you need to work it out first.
So let's first update ProductNumber in the source table:
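;WITH cte AS (
    SELECT u.ProductID, u.ProductName, u.ProductType, u.Rate,
           -- Existing products keep their number. New products get:
           --   their position within the type in the source table
           -- + the highest number already used for that type in the target
           -- - the number of source rows of that type that already exist in the target
           -- (this relies on new rows having higher ProductIDs than existing ones)
           COALESCE(p.ProductNumber,
                    ROW_NUMBER() OVER (PARTITION BY u.ProductType ORDER BY u.ProductID)
                    + (SELECT MAX(pp.ProductNumber) FROM @Products pp WHERE pp.ProductType = u.ProductType)
                    - (SELECT COUNT(*)
                       FROM @UpdatedProducts uu
                       INNER JOIN @Products ppp ON ppp.ProductID = uu.ProductID
                       WHERE uu.ProductType = u.ProductType)) AS [ProductNumber]
    FROM @UpdatedProducts u
    LEFT OUTER JOIN @Products p ON p.ProductID = u.ProductID
)
UPDATE a
SET a.[ProductNumber] = cte.[ProductNumber]
FROM @UpdatedProducts a
INNER JOIN cte ON cte.ProductID = a.ProductID;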
I did not find a way to put this in the merge directly.
After the update, the source table looks like this:
ProductID  ProductName  ProductNumber  ProductType  Rate
=========  ===========  =============  ===========  =====
1          Tea          1              Drink        10.00
2          Coffee       2              Drink        25.00
4          Muffin       2              Food         35.00
7          Donut        5              Food         30.00
10         Pizza        8              Food         60.00
11         PizzaLarge   9              Food         80.00
So now we can do a direct merge:
--Synchronize the target table with refreshed data from source table
MERGE @Products AS TARGET
USING @UpdatedProducts AS SOURCE
ON (TARGET.ProductID = SOURCE.ProductID)
--When records are matched, update the records if there is any change
WHEN MATCHED AND (TARGET.ProductName <> SOURCE.ProductName OR TARGET.Rate <> SOURCE.Rate)
    THEN UPDATE SET TARGET.ProductName = SOURCE.ProductName,
                    TARGET.Rate = SOURCE.Rate,
                    TARGET.ProductNumber = TARGET.ProductNumber --left alone on edit
--When no records are matched, insert the incoming records from source table to target table
WHEN NOT MATCHED BY TARGET
    THEN INSERT (ProductID, ProductName, Rate, ProductNumber, ProductType)
         VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate, SOURCE.ProductNumber, SOURCE.ProductType) --increments every time a record is added
--When a row exists in the target but not in the source, delete it from the target
WHEN NOT MATCHED BY SOURCE
    THEN DELETE
--$action specifies a column of type nvarchar(10) in the OUTPUT clause that returns
--one of three values for each row: 'INSERT', 'UPDATE', or 'DELETE' according to the action that was performed on that row
OUTPUT $action,
    DELETED.ProductID AS TargetProductID,
    DELETED.ProductName AS TargetProductName,
    DELETED.Rate AS TargetRate,
    INSERTED.ProductID AS SourceProductID,
    INSERTED.ProductName AS SourceProductName,
    INSERTED.Rate AS SourceRate;
SELECT * FROM @Products;
After the merge, the target table contains:
ProductID  ProductName  ProductNumber  ProductType  Rate
=========  ===========  =============  ===========  =====
1          Tea          1              Drink        10.00
2          Coffee       2              Drink        25.00
4          Muffin       2              Food         35.00
7          Donut        5              Food         30.00
10         Pizza        8              Food         60.00
11         PizzaLarge   9              Food         80.00
The Food products numbered 1, 3, 4, 6 and 7 were deleted and their numbers are not reused. The new Food product Pizza took product number 8, and PizzaLarge continued with 9.
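As a cross-check on the numbering rule, here is a minimal sketch that numbers only the rows that are new, starting from the highest number already used for each type. It assumes table variables shaped like the ones above, trimmed to the columns that matter and to a few sample rows; the alias m and the column MaxNumber are names made up for this illustration, not part of the original answer.

```sql
-- Trimmed-down copies of the target and source tables (assumed shapes).
DECLARE @Products TABLE (ProductID INT PRIMARY KEY, ProductNumber INT, ProductType VARCHAR(100));
DECLARE @UpdatedProducts TABLE (ProductID INT PRIMARY KEY, ProductType VARCHAR(100));

INSERT INTO @Products VALUES (1, 1, 'Drink'), (2, 2, 'Drink'), (4, 2, 'Food'), (7, 5, 'Food'), (9, 7, 'Food');
INSERT INTO @UpdatedProducts VALUES (1, 'Drink'), (4, 'Food'), (7, 'Food'), (10, 'Food'), (11, 'Food');

-- Only rows missing from the target survive the WHERE clause, so ROW_NUMBER
-- counts 1, 2, ... over the new rows of each type and is added to the
-- highest number that type has already used (0 if the type is new).
SELECT u.ProductID,
       u.ProductType,
       ISNULL(m.MaxNumber, 0)
         + ROW_NUMBER() OVER (PARTITION BY u.ProductType ORDER BY u.ProductID) AS ProductNumber
FROM @UpdatedProducts u
LEFT JOIN (SELECT ProductType, MAX(ProductNumber) AS MaxNumber
           FROM @Products
           GROUP BY ProductType) m ON m.ProductType = u.ProductType
WHERE NOT EXISTS (SELECT 1 FROM @Products p WHERE p.ProductID = u.ProductID);
```

With this sample data the query returns 8 for product 10 and 9 for product 11. Because matched rows are filtered out before ROW_NUMBER runs, the result does not depend on new IDs sorting after the existing ones.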
hope this helps.
I have the following table:
respid, uploadtime
I need a query that returns all records whose respid is duplicated, except the latest one (by uploadtime).
Example:
4 2014-01-01
4 2014-06-01
4 2015-01-01
4 2015-06-01
4 2016-01-01
In this case the query should return four records (the latest one, 4 2016-01-01, is excluded).
Thank you very much.
Use ROW_NUMBER:
WITH cte AS (
SELECT respid, uploadtime,
ROW_NUMBER() OVER (PARTITION BY respid ORDER BY uploadtime DESC) rn
FROM yourTable
)
SELECT respid, uploadtime
FROM cte
WHERE rn > 1
ORDER BY respid, uploadtime;
The logic here is to show all records except those having the first row number value, which would be the latest records for each respid group.
If I interpreted your question correctly, then you want to see all records where respid occurs multiple times, but exclude the last duplicate.
Translated into SQL terms, that is "show all records that have a later record for the same respid". That is exactly what the solution below does: for every row in the result, a later record with the same respid must exist.
Sample data
declare @MyTable table
(
    respid int,
    uploadtime date
);
insert into @MyTable (respid, uploadtime) values
(4, '2014-01-01'),
(4, '2014-06-01'),
(4, '2015-01-01'),
(4, '2015-06-01'),
(4, '2016-01-01'), --> last duplicate of respid=4, not part of result
(5, '2020-01-01'); --> has no duplicate, not part of result
Solution
select mt.respid, mt.uploadtime
from @MyTable mt
where exists ( select top 1 'x'
               from @MyTable mt2
               where mt2.respid = mt.respid
               and mt2.uploadtime > mt.uploadtime );
Result
respid uploadtime
----------- ----------
4 2014-01-01
4 2014-06-01
4 2015-01-01
4 2015-06-01
This is not a homework question.
I'm trying to take the count of t-shirts in an order and see which price range the shirts fall into, depending on how many have been ordered.
My initial thought (I am brand new at this) was to ask another table if count > 1st price range's maximum, and if so, keep looking until it's not.
printing_range_max printing_price_by_range
15 4
24 3
33 2
So for example here, if the order count is 30 shirts they would be $2 each.
When I'm looking into how to do that, it looks like most people are using BETWEEN or IF and hard-coding the ranges instead of looking in another table. I imagine in a business setting it's best to be able to leave the range in its own table so it can be changed more easily. Is there a good/built-in way to do this or should I just write it in with a BETWEEN command or IF statements?
EDIT:
SQL Server 2014
Let's say we have this table:
DECLARE @priceRanges TABLE(printing_range_max tinyint, printing_price_by_range tinyint);
INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
You can derive a table of ranges (minimum and maximum) that map to the correct price. Below is how you would do this on SQL Server 2012 and later, and on earlier versions:
DECLARE @priceRanges TABLE(printing_range_max tinyint, printing_price_by_range tinyint);
INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
-- 2012 and later, using LAG
WITH pricerange AS
(
    SELECT
        printing_range_min = LAG(printing_range_max, 1, 0) OVER (ORDER BY printing_range_max),
        printing_range_max,
        printing_price_by_range
    FROM @priceRanges
)
SELECT * FROM pricerange;
-- pre-2012, using ROW_NUMBER and a self-join
WITH prices AS
(
    SELECT
        rn = ROW_NUMBER() OVER (ORDER BY printing_range_max),
        printing_range_max,
        printing_price_by_range
    FROM @priceRanges
),
pricerange AS
(
    SELECT
        printing_range_min = ISNULL(p2.printing_range_max, 0),
        printing_range_max = p1.printing_range_max,
        p1.printing_price_by_range
    FROM prices p1
    LEFT JOIN prices p2 ON p1.rn = p2.rn + 1
)
SELECT * FROM pricerange;
Both queries return:
printing_range_min printing_range_max printing_price_by_range
------------------ ------------------ -----------------------
0 15 4
15 24 3
24 33 2
Now that you have that, you can join each order to its range. BETWEEN is inclusive at both ends, so a count that sits exactly on a boundary (15, for example) would match two ranges; the join below uses an exclusive lower bound instead. Here's the full solution:
-- Sample data
DECLARE @priceRanges TABLE
(
    printing_range_max tinyint,
    printing_price_by_range tinyint
    -- if you're on 2014+
    ,INDEX ix_xxx NONCLUSTERED(printing_range_max, printing_price_by_range)
    -- note: second column should be an INCLUDE but not supported in table variables
);
DECLARE @orders TABLE
(
    orderid int identity,
    ordercount int
    -- if you're on 2014+
    ,INDEX ix_xxy NONCLUSTERED(orderid, ordercount)
    -- note: second column should be an INCLUDE but not supported in table variables
);
INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
INSERT @orders(ordercount) VALUES (10), (20), (25), (30);
-- Solution:
WITH pricerange AS
(
    SELECT
        printing_range_min = LAG(printing_range_max, 1, 0) OVER (ORDER BY printing_range_max),
        printing_range_max,
        printing_price_by_range
    FROM @priceRanges
)
SELECT
    o.orderid,
    o.ordercount,
    --p.printing_range_min,
    --p.printing_range_max
    p.printing_price_by_range
FROM pricerange p
JOIN @orders o
    ON o.ordercount > p.printing_range_min   -- exclusive lower bound
   AND o.ordercount <= p.printing_range_max; -- inclusive upper bound
Results:
orderid ordercount printing_price_by_range
----------- ----------- -----------------------
1 10 4
2 20 3
3 25 2
4 30 2
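The "keep looking until the count fits" idea from the question can also be sketched without building a minimum column at all: for each order, pick the first range whose maximum is at least the order count. This is only an illustration under the same sample data; it uses CROSS APPLY, which is available on SQL Server 2005 and later, and the aliases r and p are arbitrary.

```sql
DECLARE @priceRanges TABLE (printing_range_max tinyint, printing_price_by_range tinyint);
DECLARE @orders TABLE (orderid int identity, ordercount int);

INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
INSERT @orders (ordercount) VALUES (10), (20), (25), (30);

-- For every order, look up the smallest range maximum that still covers the count.
SELECT o.orderid, o.ordercount, p.printing_price_by_range
FROM @orders o
CROSS APPLY (SELECT TOP (1) r.printing_price_by_range
             FROM @priceRanges r
             WHERE r.printing_range_max >= o.ordercount
             ORDER BY r.printing_range_max) p;
```

For counts of 10, 20, 25 and 30 this returns prices 4, 3, 2 and 2. An order larger than the biggest range (more than 33 here) drops out of the result; switching to OUTER APPLY would keep it with a NULL price.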
I am facing this problem where I need to compare the most recent row with the immediate previous one based on the same criteria (it will be trader in this case).
Here is my table:
ID Trader Price
-----------------
1 abc 5
2 xyz 5.2
3 abc 5.7
4 xyz 5
5 abc 5.2
6 abc 6
Here is the script
CREATE TABLE Sale
(
ID int not null PRIMARY KEY ,
trader varchar(10) NOT NULL,
price decimal(2,1),
)
INSERT INTO Sale (ID,trader, price)
VALUES (1, 'abc', 5), (2, 'xyz', 5.2),
(3, 'abc', 5.7), (4, 'xyz', 5),
(5, 'abc', 5.2), (6, 'abc', 6);
So far I am working with this solution that is not perfect yet
select
a.trader,
(a.price - b.price ) New_price
from
sale a
join
sale b on a.trader = b.trader and a.id > b.ID
left outer join
sale c on a.trader = c.trader and a.id > c.ID and b.id < c.ID
where
c.ID is null
The query above is not what I need, because I want to compare only the most recent row with the immediately previous one. In this sample, for example:
Trader abc : I will compare only id = 6 and id = 5
Trader xyz : id = 4 and id = 2
Thanks for any help!
If you are using SQL Server 2012 or later, you can use the LEAD and LAG functions to reach the next and previous rows. These functions can only be used in the SELECT or ORDER BY clause, so you will need a subquery to filter on them:
SELECT t.trader, t.current_price - t.previous_price as difference
FROM (
SELECT
a.trader,
a.price as current_price,
LAG(a.price) OVER(PARTITION BY a.trader ORDER BY a.ID) as previous_price,
LEAD(a.price) OVER(PARTITION BY a.trader ORDER BY a.ID) as next_price
FROM sale a
) t
WHERE t.next_price IS NULL
Here in your subquery you create additional columns for previous and next value. Then in your main query you filter only these rows where next price is NULL - that indicates this is the last row for the specific trader.
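For servers older than 2012, where LAG and LEAD are not available, the same comparison can be sketched with ROW_NUMBER: number each trader's rows from newest to oldest and subtract row 2 from row 1. This is a hedged alternative, not part of the answer above; it uses a table variable @Sale in place of the Sale table so the snippet runs on its own, and the multi-row VALUES insert needs SQL Server 2008 or later.

```sql
DECLARE @Sale TABLE (ID int NOT NULL PRIMARY KEY, trader varchar(10) NOT NULL, price decimal(2,1));

INSERT INTO @Sale (ID, trader, price)
VALUES (1, 'abc', 5), (2, 'xyz', 5.2), (3, 'abc', 5.7),
       (4, 'xyz', 5), (5, 'abc', 5.2), (6, 'abc', 6);

-- rn = 1 is the most recent row per trader, rn = 2 the one just before it.
WITH ranked AS (
    SELECT trader, price,
           ROW_NUMBER() OVER (PARTITION BY trader ORDER BY ID DESC) AS rn
    FROM @Sale
)
SELECT cur.trader, cur.price - prev.price AS difference
FROM ranked cur
JOIN ranked prev ON prev.trader = cur.trader AND prev.rn = 2
WHERE cur.rn = 1;
```

With the sample data this gives 0.8 for abc (6 minus 5.2) and -0.2 for xyz (5 minus 5.2). A trader with a single row is omitted by the inner join, whereas the LAG version returns that trader with a NULL difference.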
I have seen a few similar questions on Stack Overflow on the same subject, but none of them fits this case.
I have 3 tables: Purchase, Hotel and Car. Purchase to Hotel and Purchase to Car are each a 1 to 0-or-many relationship. The server is MS SQL Server 2008.
Purchase Table
pid bookingdate ...
1
2
3
Hotel Table
hid pid amount rooms location brand...
1 1
2 1
3 1
4 3
4 3
Car Table
cid pid make model ...
1 1
2 2
3 2
What I want is display the hotel data in columns
pid bookingdate cid make model hid1 amount1 rooms1 location1 brand1 hid2 amount2 rooms2 location2 brand2 hid3 amount3 rooms3 location3 brand3 hid4 amount4 rooms4 location4 brand4
If there is only one hotel for a given purchase id, the other columns should be null. If there are more than 4 hotels, ignore the rest (5th, 6th, etc.).
Please assume a purchase can have 0 or 1 Car rows at this stage.
(Advanced: if Car can be 0 to many per Purchase, either 1) take only the first record into account, or 2) get the sum of the records.)
The purpose of this project is to generate a CSV file that can be uploaded to a third-party product. Currently we have about 100 columns for Purchase and Car and about 30 columns for Hotel (130 in all). With Hotel rows displayed this way it will be 220 (100 + 30 x 4) columns, and about 100K rows, so performance is an issue as well.
I tried a few things, but nothing was successful or even close. This was the closest I got, but I have 30 columns to duplicate, not just one. In fact, I have a feeling that PIVOT is not the way to go. I'm thinking of using DENSE_RANK() or something like that:
SELECT pid
, hid
, DENSE_RANK() OVER(ORDER BY pid)
FROM HOTEL
WHERE pid IN (
SELECT pid
FROM Hotel
WHERE pid IN (
SELECT pid
FROM Purchase
WHERE rdate BETWEEN '2014-04-01' AND '2014-12-31'
)
GROUP BY pid
HAVING COUNT(pid) > 1
)
I assume we might then be able to process the purchases with more than one hotel first, using a cursor (ugly) or something like that.
Okay, I have found a solution. You are welcome to refine it and add your thoughts. Dynamically named columns would be handy (i.e. product1, product2, product3, product4).
Schema
create table Purchase
(
pid int,
purchasedate date,
currency varchar(10),
paymenttype varchar(10),
creditcard varchar(10),
creditcardtype varchar(10)
);
create table Hotel
(
hid int,
pid int,
product varchar(30),
country varchar(10),
city varchar(10),
rooms int,
starrating int
);
create table Car
(
cid int,
pid int,
product varchar(30),
country varchar(10),
city varchar(10),
cancel int,
starrating int
);
insert into Purchase values (1, '2015-01-15','AUD', 'CC','12345678','AMEX')
insert into Purchase values (2, '2015-01-15','AUD', 'CC','12345678','AMEX')
insert into Purchase values (3, '2015-01-15','AUD', 'CC','12345678','AMEX')
insert into Purchase values (4, '2015-01-15','AUD', 'CC','12345678','AMEX')
insert into Purchase values (5, '2015-01-15','AUD', 'CC','12345678','AMEX')
insert into Hotel values (1,1, 'Five for Two','Australia', 'Melbourne','1','3')
insert into Hotel values (2,1, 'Five for None','Australia', 'Sydney','1','3')
insert into Hotel values (3,1, 'Five for Five','Australia', 'Melbourne','1','3')
insert into Hotel values (4,1, 'Five for Two','Australia', 'Jamboora','1','3')
insert into Hotel values (5,2, 'Five for Three','Australia', 'Sydney','1','3')
insert into Hotel values (6,2, 'Five for Love','Australia', 'Cook','1','3')
insert into Hotel values (7,2, 'Five for Grease','Australia', 'Darwin','1','3')
insert into Hotel values (8,3, 'Love Me','Australia', 'Darwin','1','3')
insert into Hotel values (9,4, 'Live for Grease','Australia', 'Footscray','1','3')
insert into Hotel values (10,4, 'Love Grease','Australia', 'Officer','1','3')
insert into Car values (1,1, 'Love Grease','Australia', 'Officer','1','3')
insert into Car values (2,2, 'Love Grease','Australia', 'Cook','1','3')
insert into Car values (3,4, 'Live Grease','Australia', 'Jamboora','1','3')
-- For Advance insert into Car values (4,4, 'Cove Grease','Australia', 'Melbourne','1','3')
And the Code is
SELECT *, DENSE_RANK() OVER(PARTITION BY pid ORDER BY hid DESC) AS Ranking
INTO #TempHotel
FROM Hotel
SELECT P.pid, P.purchasedate, P.currency, P.paymenttype, P.creditcard, P.creditcardtype
,C.cid, C.product, C.country, C.city, C.cancel, C.starrating
,H1.hid, H1.product, H1.country, H1.city,H1.rooms,H1.starrating
,H2.hid, H2.product, H2.country, H2.city,H2.rooms,H2.starrating
,H3.hid, H3.product, H3.country, H3.city,H3.rooms,H3.starrating
,H4.hid, H4.product, H4.country, H4.city,H4.rooms,H4.starrating
FROM Purchase P
left join Car C on P.pid = C.pid
left join #TempHotel H1 on P.pid = H1.pid and H1.Ranking = 1
left join #TempHotel H2 on P.pid = H2.pid and H2.Ranking = 2
left join #TempHotel H3 on P.pid = H3.pid and H3.Ranking = 3
left join #TempHotel H4 on P.pid = H4.pid and H4.Ranking = 4
DROP Table #TempHotel
You can also find this in SQL fiddle Here.
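One way to get numbered column names (product1, product2, ...) without the temp table and the repeated joins is conditional aggregation over ROW_NUMBER. The sketch below is only an illustration: it uses a cut-down @Hotel table variable with a few sample rows, shows two hotel slots instead of four, and orders hotels by hid ascending rather than descending as in the code above.

```sql
DECLARE @Hotel TABLE (hid int, pid int, product varchar(30), city varchar(10));

INSERT INTO @Hotel VALUES
    (1, 1, 'Five for Two',  'Melbourne'),
    (2, 1, 'Five for None', 'Sydney'),
    (8, 3, 'Love Me',       'Darwin');

-- Number the hotels within each purchase, then fold each numbered row
-- into its own set of aliased columns.
WITH ranked AS (
    SELECT hid, pid, product, city,
           ROW_NUMBER() OVER (PARTITION BY pid ORDER BY hid) AS rn
    FROM @Hotel
)
SELECT pid,
       MAX(CASE WHEN rn = 1 THEN hid END)     AS hid1,
       MAX(CASE WHEN rn = 1 THEN product END) AS product1,
       MAX(CASE WHEN rn = 1 THEN city END)    AS city1,
       MAX(CASE WHEN rn = 2 THEN hid END)     AS hid2,
       MAX(CASE WHEN rn = 2 THEN product END) AS product2,
       MAX(CASE WHEN rn = 2 THEN city END)    AS city2
FROM ranked
WHERE rn <= 2   -- raise to 4 and add the rn = 3 and rn = 4 columns for the full layout
GROUP BY pid
ORDER BY pid;
```

Purchase 3 has a single hotel here, so its hid2, product2 and city2 come back NULL. The grouped result can then be left-joined to Purchase and Car on pid in the same way as the temp table.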
If the column count never changes you could use Pivot / Unpivot
Here is a technet article which says it all: http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
and a sample from the article.
-- Pivot table with one row and five columns
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days,
[0], [1], [2], [3], [4]
FROM
(SELECT DaysToManufacture, StandardCost
FROM Production.Product) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;
I'm working with SQL Server 2005 and looking to export some data from a table I have. However, prior to doing that I need to update a status column based upon a field called "VisitNumber", which can contain multiple entries with the same value. I have a table set up in the following manner. There are more columns to it, but I am just putting in what's relevant to my issue.
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000
2 Test John Test456 123 2014-01-01 07.00.00.000
3 Test Sue Test123 555 2014-01-02 08.00.00.000
4 Test Ann Test123 888 2014-01-02 09.00.00.000
5 Test Ann Test456 888 2014-01-02 10.00.00.000
6 Test Ann Test789 888 2014-01-02 11.00.00.000
Field Notes
ID column is a unique ID in incremental numbers
MyReport is a text value and can actually be thousands of characters. Shortened for simplicity. In my scenario the text would be completely different
Rest of fields are varchar
My Goal
I need to set the status according to two conditions:
* If a VisitNumber occurs only once, set the status column to "F"
* If a VisitNumber occurs more than once, put "F" only on the row with the earliest timestamp. For the other rows, put in a status of "A"
So going back to my table, here is the expectation
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000 F
2 Test John Test456 123 2014-01-01 07.00.00.000 A
3 Test Sue Test123 555 2014-01-02 08.00.00.000 F
4 Test Ann Test123 888 2014-01-02 09.00.00.000 F
5 Test Ann Test456 888 2014-01-02 10.00.00.000 A
6 Test Ann Test789 888 2014-01-02 11.00.00.000 A
I was thinking I could handle this by splitting out each type of duplicate (groups of 2, 3, 4, 5 rows and so on), then updating every other row (or every 3rd, 4th, 5th row), then deleting those from the original table and combining them to export the data in SSIS. But I suspect there is a much more efficient way of handling it.
Any thoughts? I can accomplish this by updating the table directly in SQL for this status column and then export normally through SSIS. Or if there is some way I can manipulate the column for the exact conditions I need, I can do it all in SSIS. I am just not sure how to proceed with this.
WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY VisitNumber ORDER BY DateTimeStamp) rn from MyTable
)
UPDATE cte
SET [status] = (CASE WHEN rn = 1 THEN 'F' ELSE 'A' END)
I put together a test script to check the results. For your purposes, use the update statements and replace the temp table with your table name.
create table #temp1 (id int, [name] varchar(50), myreport varchar(50), visitnumber varchar(50), dts datetime, [status] varchar(1))
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (1,'Test John','Test123','123','2014-01-01 05:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (2,'Test John','Test456','123','2014-01-01 07:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (3,'Test Sue','Test123','555','2014-01-01 08:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (4,'Test Ann','Test123','888','2014-01-01 09:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (5,'Test Ann','Test456','888','2014-01-01 10:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (6,'Test Ann','Test789','888','2014-01-01 11:00')
select * from #temp1;
update #temp1 set status = 'F'
where id in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
update #temp1 set status = 'A'
where id not in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
select * from #temp1;
drop table #temp1
Hope this helps