I have a table like the one shown below. I need to add a new field called Expiration date based on these rows. Because TypeName ABCD appears again in a later row, the first row should get an expiration date of 2/28/2021, and the 2nd and 3rd rows should get 1/1/9999.
How can I browse through the rows (via a Snowflake function) in order to calculate the expiration date?
I guess you are looking for basic SCD2 functionality. Here is the query:
with CTE as (
select 'ABCD' as TypeName, 200 as Amount, '2021-01-01'::DATE as ValidFrom union all
select 'b1234' as TypeName, 300 as Amount, '2021-02-01'::DATE as ValidFrom union all
select 'ABCD' as TypeName, 150 as Amount, '2021-03-01'::DATE as ValidFrom
)
select TypeName, Amount, ValidFrom,
nvl(dateadd(DAY, -1, lead(ValidFrom) over (partition by TypeName order by ValidFrom)), '9999-01-01'::DATE) as ExpirationDate
from CTE
order by ValidFrom;
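To check the logic without a Snowflake account, here is a minimal sketch that runs the same LEAD idea against an in-memory SQLite database from Python. The table name type_history is hypothetical, SQLite's date(x, '-1 day') and coalesce stand in for Snowflake's dateadd and nvl, and it assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Sample rows copied from the CTE above; the table name type_history is hypothetical.
ROWS = [
    ("ABCD", 200, "2021-01-01"),
    ("b1234", 300, "2021-02-01"),
    ("ABCD", 150, "2021-03-01"),
]

# lead() looks at the next row with the same TypeName; date(..., '-1 day')
# returns NULL when there is no next row, so coalesce supplies the open-ended date.
QUERY = """
select TypeName, Amount, ValidFrom,
       coalesce(
           date(lead(ValidFrom) over (partition by TypeName order by ValidFrom), '-1 day'),
           '9999-01-01'
       ) as ExpirationDate
from type_history
order by ValidFrom
"""


def expiration_dates(rows):
    """Load the rows into an in-memory table and compute each expiration date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table type_history (TypeName text, Amount integer, ValidFrom text)")
    conn.executemany("insert into type_history values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = expiration_dates(ROWS)
for row in result:
    print(row)
```

The first ABCD row expires the day before the second ABCD row starts, and rows with no later version keep the open-ended date.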
Do you need to physically add that column, or can the column be virtual?
Physical:
Alter the table and add the column.
Run an update statement to set the value on each row that has a newer row with the same TypeName.
Alternatively, do a CTAS (create table as select).
Virtual:
Create a view over the table that uses the window function to track changes:
select *
, coalesce(lead(ValidFrom) over (partition by TypeName order by ValidFrom)
, cast('9999-12-31' as date)) as ValidTo
from CTE;
I have some sample data as follows
Name Value Timestamp
a 23 2016/12/23 11:23
a 43 2016/12/23 12:55
b 12 2016/12/23 12:55
I want to select the latest value for a and b. When I used Last_Value, I used the following query
Select Name, Last_Value(Value) over (partition by Name order by timestamp) from table
This returned 2 rows for a, but I wanted it grouped so that I get only the last entered value for each name. So I had to use sub queries.
select x.Name, x.Value from (Select Name, Last_Value(Value) over (partition by Name order by timestamp) as Value from table) as x group by x.Name, x.Value
This again returns 2 records for a. I just wanted to group by name, order by timestamp, and select the top record instead of selecting the max().
Can anybody tell me how to solve this problem?
One method doesn't use window functions:
select t.*
from table t
where t.timestamp = (select max(t2.timestamp) from table t2 where t2.name = t.name);
Otherwise, the subquery method is fine, although I would often use row_number() and conditional aggregation rather than last_value() (or first_value() with a descending order by).
Unfortunately, SQL Server does not support first_value() or last_value() as an aggregation function, only as a window function.
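As a rough illustration of the row_number() route, the sketch below numbers each name's rows from newest to oldest and keeps only row 1. It runs on SQLite through Python rather than SQL Server, the table name readings is hypothetical, and it filters on the row number instead of using conditional aggregation. It assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

# Sample data from the question; the table name readings is hypothetical.
SAMPLE = [
    ("a", 23, "2016/12/23 11:23"),
    ("a", 43, "2016/12/23 12:55"),
    ("b", 12, "2016/12/23 12:55"),
]

# Number the rows of each name from newest to oldest, then keep only the newest.
QUERY = """
select Name, Value
from (
    select Name, Value,
           row_number() over (partition by Name order by Timestamp desc) as rn
    from readings
) ranked
where rn = 1
order by Name
"""


def latest_values(rows):
    """Return one (Name, Value) pair per name, taken from its latest row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table readings (Name text, Value integer, Timestamp text)")
    conn.executemany("insert into readings values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


latest = latest_values(SAMPLE)
print(latest)
```

Note that row_number() picks an arbitrary row when two rows of the same name share the latest timestamp, so add a tie-breaker column to the order by if that can happen.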
How do I get the current year and prior year of a numeric column using a date column?
Columns available are
Premium Column that has a calculated/numeric value. Example would be 115.20, 325,126.29.
Date column. Example would be 03/09/2016, 12/10/2015
Date column Premium Column
03/09/2016 115.20
12/10/2015 325,126.29
I need to create 2 new columns, Current Date Premium and Prior Date premium.
The results I need to get for Current Date premium is 115.20
and for prior column is 325,126.29
Current Date premium Prior Date premium
115.20 325,126.29
How do I apply the date part function to these two new columns if date part does not accept another expression (a numeric column like Premium Column) to get the current year?
If you have SQL Server 2008 or above, you can try the query below. You can also refactor it by replacing the inner queries with a CTE.
Logic:
The inner query ranks the table rows by date, newest first, and we then self join on (current row's rank) = (prior row's rank) - 1. The outer filter keeps only the most recent row as the current one.
SELECT
t1.[Premium Column] as [Current Date Premium],
t2.[Premium Column] as [Prior Date Premium]
FROM
(
SELECT
*,
-- DESC so that Ranking 1 is the most recent date
ROW_NUMBER() OVER (ORDER BY [Date column] DESC) AS Ranking
FROM Tbl
)t1
LEFT JOIN
(
SELECT
*,
ROW_NUMBER() OVER (ORDER BY [Date column] DESC) AS Ranking
FROM Tbl
)t2
ON t1.Ranking=t2.Ranking-1
-- keep only the pair whose current row is the newest one
WHERE t1.Ranking=1
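Here is a small sketch of the ranking self-join, written with a CTE and run on SQLite from Python so it can be tried quickly. The table and column names (premiums, PremiumDate, Premium) are hypothetical, the dates are stored as ISO strings so they sort correctly, and the ranking is ordered by date descending so that rank 1 is the most recent row. It assumes SQLite 3.25 or later.

```python
import sqlite3

# The two sample rows from the question, with dates rewritten as ISO strings.
# Table and column names here are hypothetical.
ROWS = [
    ("2016-03-09", 115.20),
    ("2015-12-10", 325126.29),
]

# Rank rows newest first, then pair rank 1 (current) with rank 2 (prior).
QUERY = """
with ranked as (
    select PremiumDate, Premium,
           row_number() over (order by PremiumDate desc) as Ranking
    from premiums
)
select cur.Premium as CurrentDatePremium,
       prev.Premium as PriorDatePremium
from ranked cur
left join ranked prev on prev.Ranking = cur.Ranking + 1
where cur.Ranking = 1
"""


def current_and_prior(rows):
    """Return a single (current premium, prior premium) row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table premiums (PremiumDate text, Premium real)")
    conn.executemany("insert into premiums values (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


pairs = current_and_prior(ROWS)
print(pairs)
```

The left join means a table with only one row still returns a result, with the prior premium as NULL.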
Environment:
OS: Windows Server 2012 DataCenter
DBMS: SQL Server 2012
Hardware (VPS): Xeon E5530 4 cores + 4GB RAM
Question:
I have a large table with 140 million rows. Some rows are considered duplicates, so I want to remove them. For example:
id name value timestamp
---------------------------------------
001 dummy1 10 2015-7-27 10:00:00
002 dummy1 10 2015-7-27 10:00:00 <-- duplicate
003 dummy1 20 2015-7-27 10:00:00
The second row is deemed a duplicate because it has the same name, value and timestamp as the first row, even though its id is different.
Note: the first two rows are duplicate NOT because of all identical columns, but due to self-defined rules.
I tried to remove such duplication by using window function:
select
id, name, value, timestamp
from
(select
id, name, value, timestamp,
DATEDIFF(SECOND, lag(timestamp, 1) over (partition by name order by timestamp),
timestamp) [TimeDiff]
from table) tab
But after an hour of execution, the locks were used up and an error was raised:
Msg 1204, Level 19, State 4, Line 2
The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.
How could I remove such duplicate rows in an efficient way?
What about using a cte? Something like this.
with DeDupe as
(
select id
, [name]
, [value]
, [timestamp]
, ROW_NUMBER() over (partition by [name], [value], [timestamp] order by id) as RowNum
from SomeTable
)
Delete DeDupe
where RowNum > 1;
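The same ROW_NUMBER idea can be tried on a tiny copy of the data. The sketch below uses SQLite from Python, which cannot delete through a CTE the way T-SQL can, so it swaps in a delete that selects the ids whose row number is greater than 1. The sample rows come from the question, and it assumes SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question; id 2 duplicates id 1 on (name, value, timestamp).
ROWS = [
    (1, "dummy1", 10, "2015-07-27 10:00:00"),
    (2, "dummy1", 10, "2015-07-27 10:00:00"),
    (3, "dummy1", 20, "2015-07-27 10:00:00"),
]

# Within each (name, value, timestamp) group the lowest id gets RowNum 1;
# every other row in the group is treated as a duplicate and deleted.
DELETE_DUPLICATES = """
delete from SomeTable
where id in (
    select id
    from (
        select id,
               row_number() over (partition by name, value, timestamp order by id) as RowNum
        from SomeTable
    ) ranked
    where RowNum > 1
)
"""


def remaining_ids(rows):
    """Delete the duplicates and return (deleted row count, surviving ids)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table SomeTable (id integer primary key, name text, value integer, timestamp text)")
    conn.executemany("insert into SomeTable values (?, ?, ?, ?)", rows)
    deleted = conn.execute(DELETE_DUPLICATES).rowcount
    ids = [r[0] for r in conn.execute("select id from SomeTable order by id")]
    conn.close()
    return deleted, ids


deleted_count, kept_ids = remaining_ids(ROWS)
print(deleted_count, kept_ids)
```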
If you only need to select the non-duplicate rows from the table, consider using this script:
SELECT MIN(id), name, value, timestamp FROM table GROUP BY name, value, timestamp
If you need to delete duplicate rows:
DELETE FROM table WHERE id NOT IN ( SELECT MIN(id) FROM table GROUP BY name, value, timestamp)
or
DELETE t FROM table t INNER JOIN
table t2 ON
t.name=t2.name AND
t.value=t2.value AND
t.timestamp=t2.timestamp AND
t2.id<t.id
Try something like this - determine the lowest ID for each set of values, then delete rows that have an ID other than the lowest one.
Select Name, Value, TimeStamp, min(ID) as LowestID
into #temp1
From MyTable
group by Name, Value, TimeStamp
Delete a
from MyTable a
inner join #temp1 b
on a.Name = b.Name
and a.Value = b.Value
and a.Timestamp = b.timestamp
and a.ID <> b.LowestID
I have a table with more than one date column.
Each date column holds a date or a null value.
I want to write a SQL query that outputs one row for each date column that has a value, with an additional column named LogDate that contains that column's date.
It's difficult to explain; please refer to the attached image.
Just use UNION ALL to concatenate the three result sets:
SELECT [ReceivedDate] AS LogDate, * FROM MyTable WHERE [ReceivedDate] IS NOT NULL
UNION ALL
SELECT [Closing Date] AS LogDate, * FROM MyTable WHERE [Closing Date] IS NOT NULL
UNION ALL
SELECT [LPODate] AS LogDate, * FROM MyTable WHERE [LPODate] IS NOT NULL
To sort by LogDate simply add the following ORDER BY clause to the end of this query:
ORDER BY LogDate
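A quick way to see what the UNION ALL produces is to run it on two made-up rows. In the sketch below the Id column and the sample dates are hypothetical (the original image is not available), only Id is selected instead of * to keep the output short, and SQLite is used from Python as a stand-in for the real database.

```python
import sqlite3

# Hypothetical sample rows: (Id, ReceivedDate, Closing Date, LPODate).
ROWS = [
    (1, "2020-01-05", "2020-01-20", None),
    (2, None, "2020-01-10", "2020-01-15"),
]

# One SELECT per date column; rows where that column is null are skipped.
QUERY = """
select ReceivedDate as LogDate, Id from MyTable where ReceivedDate is not null
union all
select [Closing Date] as LogDate, Id from MyTable where [Closing Date] is not null
union all
select LPODate as LogDate, Id from MyTable where LPODate is not null
order by LogDate
"""


def log_rows(rows):
    """Return one (LogDate, Id) row per non-null date value, sorted by LogDate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table MyTable (Id integer, ReceivedDate text, [Closing Date] text, LPODate text)")
    conn.executemany("insert into MyTable values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


log = log_rows(ROWS)
for row in log:
    print(row)
```

Each source row appears once per non-null date, so the two sample rows produce four output rows.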
I have 2 relational tables orders and order_items
orders has
id
customer_name
delivery_date
order_items has
order_id
item
unit
qty
I need to see how much (i.e. Sum(qty)) of each item/unit combination each customer ordered within a specified date range.
The only way I can see to do this is to use C# or VB.NET and first create a datatable with the distinct item/unit combinations for the date range.
Then I would loop through those item/units, get a total per customer for them in that date range, and add the totals to another datatable.
Is there a way to do this in sql alone?
Yes, there is:
SELECT customer_name, item, unit, SUM(Qty) as totals
FROM orders o INNER JOIN order_items oi
ON o.id = oi.order_id
WHERE o.delivery_date BETWEEN #datefrom AND #dateto
GROUP BY customer_name, item, unit
ORDER BY customer_name, item, unit
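To show the join and GROUP BY doing all of the work in SQL, here is a small sketch on SQLite from Python. The customers, items and dates are made up for illustration, the date range is passed as bound parameters, and the result is ordered by the grouped columns.

```python
import sqlite3

# Hypothetical sample data for the two tables described in the question.
ORDERS = [
    (1, "Alice", "2021-01-05"),
    (2, "Alice", "2021-01-20"),
    (3, "Bob", "2021-01-10"),
    (4, "Bob", "2021-03-01"),  # outside the date range used below
]
ORDER_ITEMS = [
    (1, "apple", "kg", 2),
    (2, "apple", "kg", 3),
    (2, "milk", "l", 1),
    (3, "apple", "kg", 4),
    (4, "apple", "kg", 10),
]

# Join each item to its order, filter by delivery date, then sum per group.
QUERY = """
select o.customer_name, oi.item, oi.unit, sum(oi.qty) as totals
from orders o
inner join order_items oi on o.id = oi.order_id
where o.delivery_date between ? and ?
group by o.customer_name, oi.item, oi.unit
order by o.customer_name, oi.item, oi.unit
"""


def totals_per_customer(date_from, date_to):
    """Return (customer, item, unit, total qty) rows for the given date range."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table orders (id integer primary key, customer_name text, delivery_date text)")
    conn.execute("create table order_items (order_id integer, item text, unit text, qty integer)")
    conn.executemany("insert into orders values (?, ?, ?)", ORDERS)
    conn.executemany("insert into order_items values (?, ?, ?, ?)", ORDER_ITEMS)
    result = conn.execute(QUERY, (date_from, date_to)).fetchall()
    conn.close()
    return result


totals = totals_per_customer("2021-01-01", "2021-01-31")
for row in totals:
    print(row)
```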