Update latest record with previous record values - sql-server

I have a table called Audits with CompanyId, Date, and AuditNumber fields. Each company can have many audits, and the AuditNumber field keeps track of how many audits that company has.
I'm trying to update the date of every latest audit record to its previous audit's date + 5 years. So, say CompanyId 12345 has 3 audits: I want to update the 3rd audit's date (the 3rd audit being the latest one) to the 2nd audit's date + 5 years into the future, and do the same for the latest record of every company.
What I've got so far uses a while loop, but I'm pretty stuck as it's not doing what I want it to...
DECLARE @counter INT = 1;
WHILE (@counter <= (SELECT COUNT(*) FROM Audits WHERE AuditNumber > 1))
BEGIN
UPDATE Audits
SET Date = CASE
WHEN AuditNumber > 1 THEN (SELECT TOP 1 DATEADD(YEAR, 5, Date) FROM Audits WHERE AuditNumber < (SELECT MAX(AuditNumber) FROM Audits))
END
WHERE AuditNumber > 1
SET @counter = @counter + 1
END
I'm no expert on SQL, but this just updates the Date with the first previous date it can find due to the SELECT TOP(1) but if I don't put that TOP(1) the subquery returns more than 1 record so it complains.
Any help would be appreciated.
Thanks!

No need for a procedure and a loop. I would recommend window functions and an updatable cte for this:
with cte as (
select date,
row_number() over(partition by companyid order by auditnumber desc) rn,
lag(date) over(partition by companyid order by auditnumber) lag_date
from audits
)
update cte
set date = dateadd(year, 5, lag_date)
where rn = 1 and lag_date is not null
The common table expression ranks records having the same company by descending audit number, and retrieves the date of the previous audit. The outer query filters on the top record per group, and updates the date to 5 years after the previous date.
You did not say what to do when a company has just one audit. I added a condition to not update those rows, if any.
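To see the effect on concrete rows, here is a minimal sketch of the same pattern. The table variable @Audits and its sample rows are assumptions made up for illustration; the column names follow the question. LAG requires SQL Server 2012 or later.

```sql
-- Hypothetical sample data; only the column names come from the question.
DECLARE @Audits TABLE (CompanyId INT, AuditNumber INT, [Date] DATE);
INSERT INTO @Audits (CompanyId, AuditNumber, [Date]) VALUES
(12345, 1, '2010-01-15'),
(12345, 2, '2015-03-01'),
(12345, 3, '2016-06-30'),
(67890, 1, '2012-09-09');   -- company with a single audit

WITH cte AS (
    SELECT [Date],
           -- rn = 1 marks the latest audit of each company
           ROW_NUMBER() OVER (PARTITION BY CompanyId ORDER BY AuditNumber DESC) AS rn,
           -- date of the audit immediately before this one (NULL for the first audit)
           LAG([Date]) OVER (PARTITION BY CompanyId ORDER BY AuditNumber) AS lag_date
    FROM @Audits
)
UPDATE cte
SET [Date] = DATEADD(YEAR, 5, lag_date)
WHERE rn = 1 AND lag_date IS NOT NULL;

SELECT CompanyId, AuditNumber, [Date]
FROM @Audits
ORDER BY CompanyId, AuditNumber;
```

With this sample data, only audit 3 of company 12345 should change (to 2020-03-01, five years after audit 2); company 67890 has no previous audit, so its row is left alone.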

First add a row number to your result set, then join the result set to itself ON
A1.CompanyId=A2.CompanyId AND A1.IND=1 AND A2.IND=2. Now you have the latest record and the previous record in one row, and you can update the original table:
WITH A AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY CompanyId ORDER BY AuditNumber DESC) IND FROM Audits
),B AS
(
SELECT A1.CompanyId,A1.AuditNumber,A2.[DATE] FROM A A1 INNER JOIN A A2 ON A1.CompanyId=A2.CompanyId AND A1.IND=1 AND A2.IND=2
)UPDATE _Audits SET _Audits.[Date]= DATEADD(YEAR,5,B.[DATE]) FROM
B LEFT JOIN Audits _Audits ON B.CompanyId=_Audits.CompanyId AND B.AuditNumber=_Audits.AuditNumber

Related

SQL Server - Where clause uses maximum date in data

I'm struggling with something I thought would be easy.
I have a table that is updated via an append on most days and has a report date field that shows the date the rows were updated.
I want to join to this table but only pull back the records from the date the table was last updated.
Most of the time I could get away with just looking for yesterday's date, as the table is updated most days:
Where [reportdate] > DATEADD(DAY, -1, GETDATE())
But as it's not always updated daily, I wanted to rule this issue out. Is there any way of returning the max date?
I was trying to figure out max(date), but I can't figure out the grouping. I need to return all the fields. The query below just seems to return the whole table:
SELECT max ([ReportDate]) as reportdate
,[GUID]
,[Make]
,[Model]
,[MPxN]
,[PaymentMode]
,[Consent]
,[Category]
,[Fuel]
,[pkCommCompID]
FROM table
group by guid
,[Make]
,[Model]
,[MPxN]
,[PaymentMode]
,[Consent]
,[Category]
,[Fuel]
,[pkCommCompID]
I could get round it with a temp table that just has the max report date and then using this as the left part of a join
SELECT max ([ReportDate]) as reportdate
FROM [DOMCustomers].[dbo].[DCC_Device_Comms_Compiled]
But the SQL is triggered in Excel, so temp tables are problematic (I think).
Is there any way of returning the max date?
Like this:
SELECT *
FROM SomeTable
where ReportDate = (select max(ReportDate) from SomeTable)
Here is a conceptual example.
It will produce a latest row for each car make.
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, make VARCHAR(20), ReportDate DATETIME);
INSERT INTO @tbl (make, ReportDate) VALUES
('Ford', '2020-12-31'),
('Ford', '2020-10-17'),
('Tesla', '2020-10-25'),
('Tesla', '2020-12-30');
-- DDL and sample data population, end
;WITH rs AS
(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY make ORDER BY ReportDate DESC) AS seq
FROM @tbl
)
SELECT * FROM rs
WHERE seq = 1;
Seems like a DENSE_RANK and TOP would work (assuming ReportDate is a date):
SELECT TOP (1) WITH TIES
[ReportDate]
,[GUID]
,[Make]
,[Model]
,[MPxN]
,[PaymentMode]
,[Consent]
,[Category]
,[Fuel]
,[pkCommCompID]
FROM YourTable
ORDER BY DENSE_RANK() OVER (ORDER BY ReportDate DESC);
If ReportDate is a date and time value, and you want everything for the latest date (ignoring time), then replace ReportDate with CONVERT(date,ReportDate) in the ORDER BY.
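As a rough illustration of that last point, the sketch below applies the CONVERT(date, ...) variant to a small table variable. The name @tbl, its columns, and the sample rows are assumptions for demonstration only, not the asker's real table.

```sql
-- Hypothetical sample data with time components on ReportDate.
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, Make VARCHAR(20), ReportDate DATETIME);
INSERT INTO @tbl (Make, ReportDate) VALUES
('Ford',  '2020-12-30T08:00:00'),
('Tesla', '2020-12-31T09:15:00'),
('Ford',  '2020-12-31T17:40:00');

-- Rank rows by calendar date only, then keep every row tied for rank 1.
SELECT TOP (1) WITH TIES ID, Make, ReportDate
FROM @tbl
ORDER BY DENSE_RANK() OVER (ORDER BY CONVERT(date, ReportDate) DESC);
```

With these rows, both 2020-12-31 records should come back even though their times differ, while the 2020-12-30 row is excluded.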

T-SQL: GROUP BY, but while keeping a non-grouped column (or re-joining it)?

I'm on SQL Server 2008, and having trouble querying an audit table the way I want to.
The table shows every time a new ID comes in, as well as every time an ID's Type changes:
Record # ID Type Date
1 ae08k M 2017-01-02:12:03
2 liei0 A 2017-01-02:12:04
3 ae08k C 2017-01-02:13:05
4 we808 A 2017-01-03:20:05
I'd kinda like to produce a snapshot of the status for each ID, at a certain date. My thought was something like this:
SELECT
ID
,max(date) AS Max
FROM
Table
WHERE
Date < 'whatever-my-cutoff-date-is-here'
GROUP BY
ID
But that loses the Type column. If I add the Type column to my GROUP BY, then naturally I'd get duplicate rows per ID, one for each type it had before the date.
So I was thinking of running a second version of the table (via a common table expression), and left joining that in to get the Type.
On my query above, all I have to join to are the ID & Date. Somehow if the dates are too close together, I end up with duplicate results (like say above, ae08k would show up once for each Type). That or I'm just super confused.
Basically all I ever do in SQL are left joins, group bys, and common table expressions (to then left join). What am I missing that I'd need in this situation...?
Use row_number()
select *
from ( select *
, row_number() over (partition by id order by date desc) as rn
from table
WHERE Date < 'whatever-my-cutoff-date-is-here'
) tt
where tt.rn = 1
I'd kinda like know how many IDs are of each type, at a certain date.
Well, for that you use COUNT and GROUP BY on Type:
SELECT Type, COUNT(ID)
FROM Table
WHERE Date < 'whatever-your-cutoff-date-is-here'
GROUP BY Type
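If each ID should be counted only under the latest type it had before the cutoff, the row_number() idea and the GROUP BY on Type can be combined. This is a minimal sketch under that assumption; the table variable @Audit, the @cutoff value, and the ISO-formatted timestamps are stand-ins for the sample rows in the question.

```sql
-- Hypothetical copy of the question's sample rows.
DECLARE @Audit TABLE (RecordNo INT, ID VARCHAR(10), [Type] CHAR(1), [Date] DATETIME);
INSERT INTO @Audit (RecordNo, ID, [Type], [Date]) VALUES
(1, 'ae08k', 'M', '2017-01-02T12:03:00'),
(2, 'liei0', 'A', '2017-01-02T12:04:00'),
(3, 'ae08k', 'C', '2017-01-02T13:05:00'),
(4, 'we808', 'A', '2017-01-03T20:05:00');

DECLARE @cutoff DATETIME = '2017-01-03T00:00:00';   -- assumed snapshot date

WITH latest AS (
    SELECT ID, [Type],
           -- rn = 1 is the most recent row per ID before the cutoff
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Date] DESC) AS rn
    FROM @Audit
    WHERE [Date] < @cutoff
)
SELECT [Type], COUNT(*) AS IdCount
FROM latest
WHERE rn = 1
GROUP BY [Type];
```

For this cutoff, ae08k should count once under C (its latest type) rather than under both M and C, liei0 counts under A, and we808 is excluded because it arrived after the cutoff.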
Based on your comment under Zohar Peled's answer, you are probably looking for something like this:
; with cte as (select distinct ID from Table where Date < '$param')
select [data].*, [data2].[count]
from cte
cross apply
( select top 1 *
from Table
where Table.ID = cte.ID
and Table.Date < '$param'
order by Table.Date desc
) as [data]
cross apply
( select count(1) as [count]
from Table
where Table.ID = cte.ID
and Table.Date < '$param'
) as [data2]

Repeat customers: multiple purchases on the same day count as 1

I am trying to wrap my head around this problem. I was asked to create a report that shows repeat customers in our database.
One of the requirements is that if a customer has more than 1 order on a specific date, it only counts as 1.
Then, if they have more than 1 purchase date, they count as a repeat customer.
Searching on here, I found this, which works for finding the customers with more than 1 purchase on a specific purchase date.
SELECT DISTINCT s.[CustomerName], s.PurchaseDate
FROM Reports.vw_Repeat s WHERE s.PurchaseDate <> ''
GROUP BY s.[CustomerName] , cast(s.PurchaseDate as date)
HAVING COUNT(*) > 1;
This MSSQL code works like it should, by showing customers who had more than 1 purchase on the same date.
My problem is: what would be the best approach to join this into another query (this is where I need help) that then shows a complete repeat customer list, returning customers with more than 1 purchase date?
I am using MSSQL. Any help would be greatly appreciated.
You're close, you need to move distinct into your having clause because you want to include only customers that have more than 1 distinct purchase date.
Also, only group by the customer id because the different dates have to be part of the same group for count distinct to work.
SELECT s.[CustomerName], COUNT(distinct cast(s.PurchaseDate as date))
FROM Reports.vw_Repeat s WHERE s.PurchaseDate <> ''
GROUP BY s.[CustomerName]
HAVING COUNT(distinct cast(s.PurchaseDate as date)) > 1;
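A small self-contained sketch of the same query may make the behaviour clearer. The table variable @Orders and its rows are assumptions for illustration, and PurchaseDate is assumed to be a DATETIME here (so the filter is IS NOT NULL rather than <> '').

```sql
-- Hypothetical orders: Cust1 buys twice on one day, Cust2 buys on two different days.
DECLARE @Orders TABLE (CustomerName VARCHAR(25), PurchaseDate DATETIME);
INSERT INTO @Orders (CustomerName, PurchaseDate) VALUES
('Cust1', '2017-11-16T09:00:00'),
('Cust1', '2017-11-16T15:30:00'),
('Cust2', '2017-11-16T10:00:00'),
('Cust2', '2017-11-20T11:00:00');

-- Count distinct calendar days per customer; keep customers with more than one.
SELECT CustomerName,
       COUNT(DISTINCT CAST(PurchaseDate AS date)) AS PurchaseDays
FROM @Orders
WHERE PurchaseDate IS NOT NULL
GROUP BY CustomerName
HAVING COUNT(DISTINCT CAST(PurchaseDate AS date)) > 1;
```

Only Cust2 should be returned (with 2 purchase days): Cust1's two orders fall on the same date and therefore count as one.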
If you want to pass a parameter to a query and join the result, that's what table-valued functions are for. When you join it, you use CROSS APPLY or OUTER APPLY instead of an INNER JOIN or a LEFT JOIN.
Also, I think this goes without saying, but when you check if PurchaseDate is empty:
WHERE s.PurchaseDate <> ''
Could be issues there... it implies it's a varchar field instead of a datetime (yes?) and doesn't handle null values. You might, at least, want to replace that with ISNULL(s.PurchaseDate, '') <> ''. If it's actually a datetime, use IS NOT NULL instead of <> ''.
(Edited to add sample data and DDL statements. I recommend adding these to SQL posts to assist answerers. Also, I made purchasedate a varchar instead of a datetime because of the string comparison in the query.)
https://technet.microsoft.com/en-us/library/ms191165(v=sql.105).aspx
CREATE TABLE company (company_name VARCHAR(25))
INSERT INTO company VALUES ('Company1'), ('Company2')
CREATE TABLE vw_repeat (customername VARCHAR(25), purchasedate VARCHAR(25), company VARCHAR(25))
INSERT INTO vw_repeat VALUES ('Cust1', '11/16/2017', 'Company1')
INSERT INTO vw_repeat VALUES ('Cust1', '11/16/2017', 'Company1')
INSERT INTO vw_repeat VALUES ('Cust2', '11/16/2017', 'Company2')
GO
-- CREATE FUNCTION must be the first statement in its batch, hence the GO above
CREATE FUNCTION [dbo].tf_customers
(
@company varchar(25)
)
RETURNS TABLE AS RETURN
(
SELECT s.[CustomerName], cast(s.PurchaseDate as date) PurchaseDate
FROM vw_Repeat s
WHERE s.PurchaseDate <> '' AND s.Company = @company
GROUP BY s.[CustomerName] , cast(s.PurchaseDate as date)
HAVING COUNT(*) > 1
)
GO
SELECT *
FROM company c
CROSS APPLY tf_customers(c.company_name)
First, thanks to everyone for the help.
@MaxSzczurek suggested I use table-valued functions. After looking into this more, I ended up using a temporary table first to get the DISTINCT purchase dates for each customer. I then loaded that into another temp table RIGHT JOINed to the main table. This gave me the result I was looking for. It's a little (lot) ugly, but it works.

Unique vs MAX in SQL statement

I have a table with three columns:
PERSON
VISITOR
DATE
The table is basically a transactional table. The following is true:
There are multiple rows per person
There are multiple rows per visitor
There are multiple rows of a given person/visitor combination.
Assumed unique person/date combination
What I need is
I want visitor for each Person's MAX Date.
I cannot have multiple persons in the output.
Person must be unique.
visitor may repeat.
I have tried:
SELECT
ROW_NUMBER() OVER (PARTITION BY PERSON, VISITOR ORDER BY Date DESC) row_num,
PERSON,
VISITOR as VISITOR
FROM
`TABLE`
ORDER BY
PERSON
Maybe this... I'm not sure I fully understand the question. Sample data and expected results would help.
You said you wanted only one row per person, with the visitor for that person's max date, so the row with row_num = 1 will be the record with the max date. Since we partition by person, it does not matter if person A had 3 visitors: only the person and their most recent visitor will be listed.
WITH cte as (
SELECT ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY Date DESC) row_num
, PERSON
, VISITOR as VISITOR
FROM `TABLE`)
SELECT *
FROM cte
WHERE row_Num = 1
I think this can be done with a cross apply too, though I'm not as good at using them yet...
SELECT C.Person, C.Visitor, C.Date
FROM (SELECT DISTINCT Person FROM [TABLE]) A
CROSS APPLY (SELECT TOP 1 *
FROM [TABLE] B
WHERE A.Person = B.Person
ORDER BY B.Date DESC) C
Essentially, the inner query runs once for each row of the outer query, so only the topmost record (the one with the newest date) is returned.
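Here is a self-contained sketch of the CROSS APPLY approach in T-SQL. The table variable @Visits and its rows are hypothetical; the outer query is reduced to one row per person so that the inner TOP (1) picks that person's most recent visit.

```sql
-- Hypothetical sample data using the question's three columns.
DECLARE @Visits TABLE (Person VARCHAR(20), Visitor VARCHAR(20), [Date] DATE);
INSERT INTO @Visits (Person, Visitor, [Date]) VALUES
('Alice', 'Bob',   '2021-01-05'),
('Alice', 'Carol', '2021-03-10'),
('Dave',  'Bob',   '2021-02-01');

SELECT p.Person, v.Visitor, v.[Date]
FROM (SELECT DISTINCT Person FROM @Visits) AS p      -- one outer row per person
CROSS APPLY (
    SELECT TOP (1) b.Visitor, b.[Date]               -- newest visit for that person
    FROM @Visits AS b
    WHERE b.Person = p.Person
    ORDER BY b.[Date] DESC
) AS v;
```

With this data the result should be one row per person: Alice with Carol (2021-03-10) and Dave with Bob (2021-02-01). Visitors may repeat across persons, but each person appears only once.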
select a.* from myTable as a inner join (
SELECT person, max(date) as maxDate from myTable group by person
) as b
on a.date = b.maxDate
and a.person = b.person;
I am weak in reading and writing English.
In my opinion the answer may be:
SELECT `PERSON`, `VISITOR`, MAX(`DATE`) AS `DATE`
FROM `TABLE`
GROUP BY `PERSON`, `VISITOR`;

Find records updated within hour timeframe from creation of new record

Let's say I have records that are created for every entry a user creates. We will call these documents. They have a CreatedWhen column.
Let's also state that documents can be updated changing their timestamp. A column known as UpdatedOn.
For a newly created document, the UpdatedOn column initially holds the same timestamp as CreatedWhen, unless they saved it before exiting.
Now, someone wants to know what documents were updated 2 hours prior to the creation of a new document.
Here was my attempt, and I can't figure out why it's not working.
EDIT: The times I get back are not correct.
SELECT
CASE WHEN (d2.UpdatedOn IS NOT NULL)
THEN ROW_NUMBER() OVER (partition by d2.DocumentID, d2.DocumentName, d2.CreatedWhen --sequence grouping
order by d2.UpdatedOn desc) --get earliest dates
ELSE 0
END as SeqNum,
d.DocumentName,
d.CreatedWhen,
d2.DocumentName as [DocUpdatedWithin2HR],
d2.UpdatedOn
Into #temp
FROM Documents d
JOIN Documents d2 ON d.DocumentID = d2.DocumentID
WHERE d2.UpdatedOn BETWEEN DATEADD(Hour,-2,d.CreatedWhen) --2 hours prior, 1 hour window
AND DATEADD(Second,-1,(DATEADD(Hour,-1,d.CreatedWhen))) --1 second shy of an hour
select distinct
DocumentName,
CreatedWhen,
DocUpdatedWithin2HR, --xml path this column later
UpdatedOn
from #temp where SeqNum = 1 order by 1,3
The JOIN needs the DocumentID. I'm surprised you didn't get a syntax error. Update the JOIN to JOIN Documents d2 ON d.DocumentID = d2.DocumentID
