How to recreate an old snapshot using a field history table in BigQuery

I'm trying to recreate the state of a table as it was on a given past date. I have two tables:
Table A: holds live data and is refreshed hourly.
Table A_field_history: records every change made to the fields of Table A.
The following image shows the current state: Table A holds the live data, and Table A_field_history captures only the changes made to fields of Table A.
I want to recreate Table A as of a particular date. The following image shows the table as it was on 06/30/2020.
The requirement is to be able to recreate the state of Table A for any given date.

I identified a way to roll back (virtually, not on the actual table) all the updates made after a given date. These are the steps I followed:
Create dummy tables:
WITH
Table_A AS
(
SELECT 1 As ID, '2020-6-28' as created_date, 10 as qty, 100 as value
Union ALL
SELECT 2 As ID, '2020-5-29' as created_date, 20 as qty, 200 as value),
Table_A_field_history AS
(
SELECT 'xyz' id,'2020-07-29' created_date,'12345' created_by,'qty' field,'10' new_value,'200' old_value,'1' A_id
UNION ALL
SELECT 'abc' id,'2020-07-24' created_date,'12345' created_by,'qty' field,'20' new_value,'10' old_value,'2' A_id
UNION ALL
SELECT 'xyz' id,'2020-07-29' created_date,'12345' created_by,'value' field,'100' new_value,'2000' old_value,'1' A_id
UNION ALL
SELECT 'abc' id,'2020-07-24' created_date,'12345' created_by,'value' field,'200' new_value,'5000' old_value,'2' A_id
UNION ALL
SELECT 'xyz' id,'2020-06-29' created_date,'12345' created_by,'qty' field,'200' new_value,'' old_value,'1' A_id
UNION ALL
SELECT 'abc' id,'2020-05-30' created_date,'12345' created_by,'qty' field,'10' new_value,'' old_value,'2' A_id
UNION ALL
SELECT 'xyz' id,'2020-06-29' created_date,'12345' created_by,'value' field,'2000' new_value,'' old_value,'1' A_id
UNION ALL
SELECT 'abc' id,'2020-05-30' created_date,'12345' created_by,'value' field,'5000' new_value,'' old_value,'2' A_id
),
Step 1. Create a date CTE so the data can be filtered by a given date:
date_spine
AS
(
SELECT * FROM UNNEST(GENERATE_DATE_ARRAY('2020-01-01', CURRENT_DATE(), INTERVAL 1 Day)) AS as_of_date
),
Step 2. The date CTE above serves as a spine for the query: cross join it with the history table to pair every as_of_date with every A_id that has changes.
date_changes
AS
(
SELECT DISTINCT
date.as_of_date,
hist.A_id
FROM Table_A_field_history hist CROSS JOIN date_spine date
),
Step 3. With as_of_date mapped to every A_id, take the latest change date on or before each as_of_date.
most_recent_changes AS (
SELECT
dc.as_of_date,
dc.A_id ,
MAX(fh.created_date) AS created_date,
FROM date_changes dc
LEFT JOIN Table_A_field_history AS fh
ON dc.A_id = fh.A_id
WHERE CAST(fh.created_date AS DATE) <= dc.as_of_date
GROUP BY dc.as_of_date,
dc.A_id
),
Step 4. Join that latest change date back to the history table to pick up the values written on that date.
past_changes AS (
SELECT
mr.as_of_date,
mr.A_id,
mr.created_date,
a.id AS entry_id,
a.created_by AS created_by_id,
CASE WHEN a.field='qty' THEN a.new_value ELSE '' END AS qty,
CASE WHEN a.field='value' THEN a.new_value ELSE '' END AS value,
FROM most_recent_changes AS mr
LEFT JOIN Table_A_field_history AS a
ON mr.A_id = a.A_id
AND mr.created_date = a.created_date
WHERE a.id IS NOT NULL
)
Step 5. Filter on as_of_date to get the historical state of Table A.
Select *
From past_changes x
WHERE x.as_of_date = '2020-07-29'
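One thing to watch with the steps above: they pick the latest change date per A_id, so a field that last changed on an earlier date than its sibling field would be missed, and the output has one row per field rather than one row per record. Below is a minimal sketch of a per-field variant with a pivot back to one row per record. It is not BigQuery: it assumes Python's built-in sqlite3 module purely so the logic can be run locally, and the names field_history, AS_OF_SQL and state_as_of are made up for the illustration. The rows mirror the dummy history data above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for BigQuery here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE field_history (a_id TEXT, created_date TEXT, field TEXT, new_value TEXT);
INSERT INTO field_history VALUES
  ('1', '2020-06-29', 'qty',   '200'),
  ('1', '2020-06-29', 'value', '2000'),
  ('1', '2020-07-29', 'qty',   '10'),
  ('1', '2020-07-29', 'value', '100'),
  ('2', '2020-05-30', 'qty',   '10'),
  ('2', '2020-05-30', 'value', '5000'),
  ('2', '2020-07-24', 'qty',   '20'),
  ('2', '2020-07-24', 'value', '200');
""")

# For every (a_id, field) keep the change with the latest date on or before
# the as-of date, then pivot the fields back into columns.
AS_OF_SQL = """
SELECT h.a_id,
       MAX(CASE WHEN h.field = 'qty'   THEN h.new_value END) AS qty,
       MAX(CASE WHEN h.field = 'value' THEN h.new_value END) AS value
FROM field_history AS h
WHERE h.created_date = (
        SELECT MAX(h2.created_date)
        FROM field_history AS h2
        WHERE h2.a_id = h.a_id
          AND h2.field = h.field
          AND h2.created_date <= :as_of)
GROUP BY h.a_id
ORDER BY h.a_id
"""

def state_as_of(as_of):
    """Return one (a_id, qty, value) row per record as of the given ISO date."""
    return conn.execute(AS_OF_SQL, {"as_of": as_of}).fetchall()

snapshot_june = state_as_of("2020-06-30")
snapshot_july = state_as_of("2020-07-29")
print(snapshot_june)
print(snapshot_july)
```

The comparison relies on the dates being zero-padded ISO strings, which sort correctly as text. In BigQuery the same idea would normally be written with real DATE values.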

Related

Compare row with other rows in the same table in sql server

I have the records below in my table.
If a HoleNumber does not have both an 'A' and a 'B' variant for a particular datetime, the letter should be removed from the number.
That is, remove 'A' from the third and sixth records, because they have no matching 'B' for that datetime.
delete from myTable
where id in
(
select id from myTable t1
inner join
(
select [date], left([holeNumber], len(holeNumber)-1) as hNumber
from myTable
group by [date], left([holeNumber], len(holeNumber)-1)
having count(holeNumber) = 1
) tmp
on t1.[date] = tmp.[date] and left(t1.holeNumber, len(holeNumber)-1) = tmp.hNumber);
This would do it, provided the requirement is strictly to remove the rows that have only one type of holeNumber for a date.
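If the goal is to strip the letter rather than delete the row, as the question describes, the same grouping can drive an UPDATE instead. Here is a rough sketch of that variant. It assumes Python's sqlite3 module instead of SQL Server so it can be run anywhere, invented sample rows (the original data was only shown as an image), and that every hole number ends in exactly one letter. The names holes, entry_date and hole_number are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE holes (id INTEGER PRIMARY KEY, entry_date TEXT, hole_number TEXT);
INSERT INTO holes (entry_date, hole_number) VALUES
  ('2020-01-01', '1A'), ('2020-01-01', '1B'), ('2020-01-01', '2A'),
  ('2020-01-02', '3A'), ('2020-01-02', '3B'), ('2020-01-02', '4A');
""")

# Group by date and by the number without its trailing letter; a group of
# size 1 means the row has no A/B partner, so its letter is stripped.
conn.execute("""
UPDATE holes
SET hole_number = substr(hole_number, 1, length(hole_number) - 1)
WHERE id IN (
    SELECT MIN(id)
    FROM holes
    GROUP BY entry_date, substr(hole_number, 1, length(hole_number) - 1)
    HAVING COUNT(*) = 1
)
""")

rows = conn.execute("SELECT id, hole_number FROM holes ORDER BY id").fetchall()
print(rows)
```

With this sample data only the third and sixth rows change, to '2' and '4'.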

Left outer join returning extra records

I have two tables, "Item" and "Messages".
The Item table has columns such as Id, Amount, etc.
The Messages table has columns such as ItemId, Count, Comment, etc.
The link between the two tables is "Id" in Item and "ItemId" in Messages.
The "Count" column in the Messages table is a running count of comments per ItemId: when a user adds a comment to a record, a row is created in Messages with Count = 1 for that ItemId. If the user adds another comment to the same record, the new row has Count = 2, and so on. If the user never comments on a record, no row is created in Messages at all (NULL).
I want to return every record from the Item table whether or not it has a comment. If there are no comments, the query should return NULL in the Comments column for that record. If there are comments, it should return the one with the highest "Count". E.g. if a record has 8 comments, the query should return only the row where Messages.Count = 8, not all 8 rows. If there is only one comment, that comment should be returned.
I wrote a LEFT OUTER JOIN, but it returns all 8 rows. In the results, 7 rows show NULL as the count and the 8th shows a count of 8; I need only the 8th row, not the other 7.
Any help would be highly appreciated. Below is my query:
Select
Id,
Amount,
Messages.Comment As Comments
From Item
Left Outer Join Messages ON Messages.ItemId=Item.Id
Left Outer Join (Select ItemId, MAX(Id) as max_id from Messages Group by ItemId) T ON Messages.ItemId=T.ItemId and Messages.Id=T.max_id
Where amount > 100
I've put together an example using temp tables that I think covers what you're looking for. Just replace the temp tables with your actual tables and it should work.
CREATE TABLE #Item ( ID int PRIMARY KEY,
Amount numeric(9,2))
CREATE TABLE #Messages ( ItemId int REFERENCES #Item(ID),
[Count] smallint,
Comment nvarchar(max))
INSERT INTO #Item (ID, Amount)
SELECT 1, 100
UNION
SELECT 2, 120
UNION
SELECT 3, 140
UNION
SELECT 4, 50
INSERT INTO #Messages ( ItemID,
[Count],
Comment)
SELECT 1, 1, 'Comment 1 - 1'
UNION
SELECT 1, 2, 'Comment 1 - 2'
UNION
SELECT 2, 1, 'Comment 2 - 1'
UNION
SELECT 3, 1, 'Comment 3 - 1'
UNION
SELECT 3, 2, 'Comment 3 - 2'
SELECT I.Id,
I.Amount,
M.Comment
FROM #Item AS I
OUTER APPLY ( SELECT TOP 1 M.Comment
FROM #Messages AS M
WHERE M.ItemId = I.ID
ORDER BY M.[Count] DESC) AS M
WHERE i.amount > 100
DROP TABLE #Messages
DROP TABLE #Item
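Try this. It finds the highest Count per ItemId first, then joins back to Messages to fetch the matching comment:
Select
Item.Id,
Item.Amount,
M.Comment As Comments
From Item
Left Outer Join (Select ItemId, MAX([Count]) as max_count from Messages Group by ItemId) T ON T.ItemId=Item.Id
Left Outer Join Messages M ON M.ItemId=T.ItemId and M.[Count]=T.max_count
Where Item.Amount > 100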
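To check the "highest Count per item, NULL when there are no comments" behaviour against sample data, here is a small runnable sketch. It assumes Python's sqlite3 module rather than SQL Server, lower-case table names, and a column called cnt standing in for Count. The rows are invented so that one item has two comments and another has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, amount INTEGER);
CREATE TABLE messages (item_id INTEGER, cnt INTEGER, comment TEXT);
INSERT INTO item VALUES (1, 100), (2, 120), (3, 140), (4, 50);
INSERT INTO messages VALUES
  (1, 1, 'Comment 1 - 1'), (1, 2, 'Comment 1 - 2'),
  (2, 1, 'Comment 2 - 1'), (2, 2, 'Comment 2 - 2');
""")

# Derived table t holds the highest count per item; the second left join
# fetches only the message row that carries that count.
rows = conn.execute("""
SELECT i.id, i.amount, m.comment
FROM item AS i
LEFT JOIN (SELECT item_id, MAX(cnt) AS max_cnt
           FROM messages
           GROUP BY item_id) AS t
       ON t.item_id = i.id
LEFT JOIN messages AS m
       ON m.item_id = t.item_id AND m.cnt = t.max_cnt
WHERE i.amount > 100
ORDER BY i.id
""").fetchall()
print(rows)
```

Item 2 comes back once with its latest comment, and item 3, which has no messages, comes back with a NULL comment (None in Python).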

SQL Server : return value in specific table2 column based on value in table1

I have a query that gets data from 2 tables.
Transaction table contains week_id, customer_id, upc12, sales_dollars
Products table contains upc12, column_1, column_2, column_3
I want my query to return the value in products table, based on what the customer_id is in the transaction table. customer_id = 1 should return column_1, customer_id = 2 should return column_3, etc.
SELECT
t.week_id,
customer_id,
upc12,
p.___________ sum(t.sales_dollars)
FROM
transaction t, products p
WHERE
t.upc_12 = p.upc_12
GROUP BY
t.week_id, customer_id, upc12, p.___________
Sorry if this makes no sense, but my research hasn't been very good, as I don't know how to correctly formulate my question. You probably guessed I'm new to SQL.
Thanks!
Here is one way to do it:
;WITH cte AS
(
SELECT
t.week_id,
t.customer_id,
t.upc12,
-- pick the products column that corresponds to the customer
CASE t.customer_id
WHEN 1 THEN p.column_1
WHEN 2 THEN p.column_2
WHEN 3 THEN p.column_3
END AS ColByCustomer,
t.sales_dollars
-- TRANSACTION is a reserved word in SQL Server, so the table name needs brackets
FROM [transaction] t
INNER JOIN products p ON t.upc12 = p.upc12
)
SELECT week_id, customer_id, upc12, ColByCustomer, SUM(sales_dollars) AS sales_dollars
FROM cte
GROUP BY week_id, customer_id, upc12, ColByCustomer
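For anyone who wants to see the CASE-based column pick run end to end, here is a small sketch. It assumes Python's sqlite3 module instead of SQL Server, a table named sales_tx in place of the transaction table (to sidestep the keyword), and made-up rows. The customer-to-column mapping is the one from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_tx (week_id INTEGER, customer_id INTEGER, upc12 TEXT, sales_dollars INTEGER);
CREATE TABLE products (upc12 TEXT PRIMARY KEY, column_1 TEXT, column_2 TEXT, column_3 TEXT);
INSERT INTO products VALUES ('0001', 'A1', 'A2', 'A3');
INSERT INTO sales_tx VALUES (1, 1, '0001', 10), (1, 1, '0001', 5), (1, 2, '0001', 7);
""")

# The CASE expression chooses which products column to return per customer;
# grouping on that expression keeps the SUM aligned with the chosen value.
rows = conn.execute("""
SELECT t.week_id,
       t.customer_id,
       t.upc12,
       CASE t.customer_id
            WHEN 1 THEN p.column_1
            WHEN 2 THEN p.column_2
            WHEN 3 THEN p.column_3
       END AS col_by_customer,
       SUM(t.sales_dollars) AS sales_dollars
FROM sales_tx AS t
JOIN products AS p ON p.upc12 = t.upc12
GROUP BY t.week_id, t.customer_id, t.upc12, col_by_customer
ORDER BY t.week_id, t.customer_id
""").fetchall()
print(rows)
```

Customer 1 gets the column_1 value with its two sales summed, and customer 2 gets the column_2 value.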

SELECT from multiple queries

I have these tables:
tblDiving(
diving_number int primary key,
diving_club int,
date_of_diving date)
tblDivingClub(
number int primary key not null check (number>0),
name char(30),
country char(30))
tblWorks_for(
diver_number int,
club_number int,
end_working_date date)
tblCountry(
name char(30) not null primary key)
I need to write a query that returns the name of each country and the number of "super clubs" in it.
A super club is a club that has more than 25 working divers (tblWorks_for.end_working_date is null) or had more than 100 dives (tblDiving) in the last year.
Once I have the countries and their super club counts, I need to show only the countries that contain more than 2 super clubs.
I wrote these 2 queries:
select tblDivingClub.name,count(distinct tblWorks_for.diver_number) as number_of_guids
from tblWorks_for
inner join tblDivingClub on tblDivingClub.number = tblWorks_for.club_number,tblDiving
where tblWorks_for.end_working_date is null
group by tblDivingClub.name
select tblDivingClub.name, count(distinct tblDiving.diving_number) as number_of_divings
from tblDivingClub
inner join tblDiving on tblDivingClub.number = tblDiving.diving_club
WHERE tblDiving.date_of_diving <= DATEADD(year,-1, GETDATE())
group by tblDivingClub.name
But I don't know how to continue.
Each query works on its own, but how do I combine them and select from them?
It's a university assignment and I'm not allowed to use views or temporary tables.
It's my first program, so I'm not really sure what I'm doing :)
WITH CTE AS (
-- active divers per club
select tblDivingClub.name, count(distinct tblWorks_for.diver_number) as diving_number
from tblWorks_for
inner join tblDivingClub on tblDivingClub.number = tblWorks_for.club_number
where tblWorks_for.end_working_date is null
group by tblDivingClub.name
UNION ALL
-- dives per club within the last year
select tblDivingClub.name, count(distinct tblDiving.diving_number) as diving_number
from tblDivingClub
inner join tblDiving on tblDivingClub.number = tblDiving.diving_club
WHERE tblDiving.date_of_diving >= DATEADD(year,-1, GETDATE())
group by tblDivingClub.name
)
SELECT * FROM CTE
You can combine the queries with UNION ALL as long as each query returns the same number of columns with compatible types. You can then wrap them in a common table expression (CTE) and select from that.
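The remaining step, counting super clubs per country and keeping countries with more than 2, could look something like the sketch below. It is only an illustration: it assumes Python's sqlite3 module rather than SQL Server, shortened table names (club, works_for, diving), and generated rows. It uses two CTEs joined to the club table instead of UNION ALL so that a club meeting both conditions is counted once.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE club (number INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE works_for (diver_number INTEGER, club_number INTEGER, end_working_date TEXT);
CREATE TABLE diving (diving_number INTEGER PRIMARY KEY, diving_club INTEGER, date_of_diving TEXT);
""")
conn.executemany("INSERT INTO club VALUES (?, ?, ?)",
                 [(1, "Reef", "A"), (2, "Coral", "A"), (3, "Kelp", "A"), (4, "Tide", "B")])
# Clubs 1, 3 and 4 each get 26 active divers (end_working_date is NULL).
for club_no in (1, 3, 4):
    conn.executemany("INSERT INTO works_for VALUES (?, ?, NULL)",
                     [(club_no * 1000 + d, club_no) for d in range(26)])
# Club 2 gets 101 dives dated today, i.e. within the last year.
today = date.today().isoformat()
conn.executemany("INSERT INTO diving VALUES (?, ?, ?)",
                 [(n, 2, today) for n in range(101)])

rows = conn.execute("""
WITH active AS (
    SELECT club_number, COUNT(DISTINCT diver_number) AS n
    FROM works_for
    WHERE end_working_date IS NULL
    GROUP BY club_number
), recent AS (
    SELECT diving_club, COUNT(*) AS n
    FROM diving
    WHERE date_of_diving >= date('now', '-1 year')
    GROUP BY diving_club
)
SELECT c.country, COUNT(*) AS super_clubs
FROM club AS c
LEFT JOIN active AS a ON a.club_number = c.number
LEFT JOIN recent AS r ON r.diving_club = c.number
WHERE COALESCE(a.n, 0) > 25 OR COALESCE(r.n, 0) > 100
GROUP BY c.country
HAVING COUNT(*) > 2
""").fetchall()
print(rows)
```

Country A has three super clubs and is returned; country B has only one and is filtered out by the HAVING clause.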

SQL Server: Joining in rows via a comma-separated field

I'm trying to extract some data from a third party system which uses an SQL Server database. The DB structure looks something like this:
Order
OrderID  OrderNumber
1        OX101
2        OX102
OrderItem
OrderItemID  OrderID  OptionCodes
1            1        12,14,15
2            1        14
3            2        15
Option
OptionID  Description
12        Batteries
14        Gift wrap
15        Case
[etc.]
What I want is one row per order item that includes a concatenated field with each option description. So something like this:
OrderItemID  OrderNumber  Options
1            OX101        Batteries\nGift Wrap\nCase
2            OX101        Gift Wrap
3            OX102        Case
Of course this is complicated by the fact that the options are a comma separated string field instead of a proper lookup table. So I need to split this up by comma in order to join in the options table, and then concat the result back into one field.
At first I tried creating a function which splits out the option data by comma and returns this as a table. Although I was able to join the result of this function with the options table, I wasn't able to pass the OptionCodes column to the function in the join, as it only seemed to work with declared variables or hard-coded values.
Can someone point me in the right direction?
I would use a splitting function (here's an example) to get individual values and keep them in a CTE. Then you can join the CTE to your table called "Option".
SELECT * INTO #Order
FROM (
SELECT 1 OrderID, 'OX101' OrderNumber UNION SELECT 2, 'OX102'
) X;
SELECT * INTO #OrderItem
FROM (
SELECT 1 OrderItemID, 1 OrderID, '12,14,15' OptionCodes
UNION
SELECT 2, 1, '14'
UNION
SELECT 3, 2, '15'
) X;
SELECT * INTO #Option
FROM (
SELECT 12 OptionID, 'Batteries' Description
UNION
SELECT 14, 'Gift Wrap'
UNION
SELECT 15, 'Case'
) X;
WITH N AS (
SELECT I.OrderID, I.OrderItemID, X.items OptionCode
FROM #OrderItem I CROSS APPLY dbo.Split(OptionCodes, ',') X
)
SELECT Q.OrderItemID, Q.OrderNumber,
CONVERT(NVarChar(1000), (
SELECT T.Description + ','
FROM N INNER JOIN #Option T ON N.OptionCode = T.OptionID
WHERE N.OrderItemID = Q.OrderItemID
FOR XML PATH(''))
) Options
FROM (
SELECT N.OrderItemID, O.OrderNumber
FROM #Order O INNER JOIN N ON O.OrderID = N.OrderID
GROUP BY N.OrderItemID, O.OrderNumber) Q
DROP TABLE #Order;
DROP TABLE #OrderItem;
DROP TABLE #Option;
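A different way to get the same shape of result, without a split function, is to join on a delimiter-wrapped LIKE match and aggregate the descriptions. The sketch below swaps in that technique and is not the approach from the answer above. It assumes Python's sqlite3 module rather than SQL Server, SQLite's GROUP_CONCAT for the concatenation, and the hypothetical table names orders, order_item and option_lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_number TEXT);
CREATE TABLE order_item (order_item_id INTEGER PRIMARY KEY, order_id INTEGER, option_codes TEXT);
CREATE TABLE option_lookup (option_id INTEGER PRIMARY KEY, description TEXT);
INSERT INTO orders VALUES (1, 'OX101'), (2, 'OX102');
INSERT INTO order_item VALUES (1, 1, '12,14,15'), (2, 1, '14'), (3, 2, '15');
INSERT INTO option_lookup VALUES (12, 'Batteries'), (14, 'Gift wrap'), (15, 'Case');
""")

# Wrapping both sides in commas makes ',14,' match only the whole code 14,
# never a longer code that merely contains those digits.
rows = conn.execute("""
SELECT oi.order_item_id,
       o.order_number,
       GROUP_CONCAT(op.description, char(10)) AS options
FROM order_item AS oi
JOIN orders AS o ON o.order_id = oi.order_id
JOIN option_lookup AS op
  ON ',' || oi.option_codes || ',' LIKE '%,' || op.option_id || ',%'
GROUP BY oi.order_item_id, o.order_number
ORDER BY oi.order_item_id
""").fetchall()

# GROUP_CONCAT does not promise an order, so sort the pieces for display.
result = {item_id: (number, sorted(options.split("\n")))
          for item_id, number, options in rows}
print(result)
```

The LIKE join cannot use an index, so this is best suited to small lookups. The split-function approach above remains the closer fit for larger tables.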
