Select rows where a value is maximum, and a column is null - sql-server

I have a table, products, that looks along these lines:
productID | version | done
1 | 1 | 2000-01-01
1 | 2 | NULL
2 | 1 | NULL
2 | 2 | 2000-01-01
Version is assumed to be increasing.
What I want is a query that returns a ProductID and its highest / current Version, if the Done column for that version is NULL. In plain English, I want all products where the latest version is not Done, and the corresponding version. The goal: among products, find the ones with a new version that have not been "done" / processed yet.
Note: in the example above, I would expect the query to return ProductID 1, Version 2 only. I do not want the highest not-done version of a product, I want the highest version of a product, if it is not-done. Sorry if the clarification is overkill.
I wrote a query which appears to do what I want:
SELECT productID ProductID, version Version
FROM products
WHERE done IS NULL
AND version IN (
SELECT MAX(version)
FROM products
GROUP BY productID
)
However, it also appears to not be very efficient. So my question is, is there a better way to approach this query?

We can try using ROW_NUMBER here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY productID ORDER BY version DESC) rn
FROM products
)
SELECT productID, version
FROM cte
WHERE rn = 1 AND done IS NULL;
The CTE above numbers each product's records in descending version order, so the latest version of every product gets rn = 1. The outer query then keeps only the products whose latest record has no value in the done column.
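As a rough check of the ROW_NUMBER approach, here is a minimal sketch that runs the same query against the sample data from the question. It assumes SQLite (3.25 or later, for window functions) as a stand-in for SQL Server, and it stores done as text; both are assumptions made only for this illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productID INTEGER, version INTEGER, done TEXT);
    INSERT INTO products VALUES
        (1, 1, '2000-01-01'),
        (1, 2, NULL),
        (2, 1, NULL),
        (2, 2, '2000-01-01');
""")

# rn = 1 marks the highest version of each product; keep it only if not done.
latest_not_done = conn.execute("""
    WITH cte AS (
        SELECT productID, version, done,
               ROW_NUMBER() OVER (PARTITION BY productID ORDER BY version DESC) AS rn
        FROM products
    )
    SELECT productID, version
    FROM cte
    WHERE rn = 1 AND done IS NULL
""").fetchall()

print(latest_not_done)  # [(1, 2)]
```

Product 2 is excluded because its latest version (2) is done, even though its version 1 is not.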

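Your query is almost correct; what's missing is the correlation between the productID in your subquery and the productID in your main table. Without it, a row matches whenever its version equals the maximum version of any product, not just its own.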
SELECT t.productID ProductID, t.version Version
FROM products t
WHERE t.done IS NULL
AND version IN (
SELECT MAX(p.version)
FROM products p
WHERE p.productID = t.productID
GROUP BY p.productID
)
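To see why the correlation matters, the following sketch compares the uncorrelated and correlated forms. It assumes SQLite as a stand-in for SQL Server and adds a hypothetical product 3 (not in the question) whose only version is 1, which is enough to make the uncorrelated query return a wrong row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productID INTEGER, version INTEGER, done TEXT);
    INSERT INTO products VALUES
        (1, 1, '2000-01-01'), (1, 2, NULL),
        (2, 1, NULL),         (2, 2, '2000-01-01'),
        (3, 1, '2000-01-01');  -- hypothetical extra product, max version is 1
""")

# Uncorrelated: the IN list is {2, 2, 1}, so product 2 / version 1 slips through.
uncorrelated = conn.execute("""
    SELECT productID, version
    FROM products
    WHERE done IS NULL
      AND version IN (SELECT MAX(version) FROM products GROUP BY productID)
    ORDER BY productID
""").fetchall()

# Correlated: each row is compared only with the max version of its own product.
correlated = conn.execute("""
    SELECT t.productID, t.version
    FROM products t
    WHERE t.done IS NULL
      AND t.version = (SELECT MAX(p.version)
                       FROM products p
                       WHERE p.productID = t.productID)
    ORDER BY t.productID
""").fetchall()

print(uncorrelated)  # [(1, 2), (2, 1)]
print(correlated)    # [(1, 2)]
```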
Another solution is to use a join:
select t1.*
from products t1
inner join
(select max(version) as versionId, productID
from products
group by productID) t2 on t2.productID = t1.productID and t2.versionId = t1.version
where t1.done is null

Related

SQL - Return first non-empty value for previous days

I'm currently working with an exchange rates table in SQL that has these fields:
| Country | ExchangeRateDt | ExchangeRateValue |
| DK | 20200601 | 0.2 |
| DK | 20200603 | 0.21 |
| HR | 20200601 | 0.10 |
| HR | 20200602 | 0.12 |
For each currency, some days of the year have no value because of bank holidays or weekends.
I need to join it with an order table where some orders are placed on weekends, so on a given day there may be no exchange rate available to calculate taxes.
I need to take the first non-missing value from the previous days (so in the example, if I have an order for 2020-06-02 in Denmark, I should convert it using the rate 0.2).
I thought about using a calendar table but I can't manage to get the job done.
Can someone help me?
Thanks in advance,
R
To get the most recent value less than or equal to the current day:
SELECT
<whatever columns you need from order>
,ex.ExchangeRateValue
FROM
<order table> o -- ORDER is a reserved word, so use a short alias instead
LEFT JOIN
<exchange rate table> ex
ON ex.Country = o.Country
AND ex.ExchangeRateDt =
(
SELECT
MAX(ExchangeRateDt)
FROM
<exchange rate table>
WHERE
Country = o.Country
AND ExchangeRateDt <= o.OrderDt
)
Ensure the clustered index on the exchange rate table is (Country, ExchangeRateDt).
I have this as a left join so you will still return order results if the currency information is somehow missing. You would have to refer to business rules on how to proceed if no exchange rate was available.
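A minimal sketch of this pattern, run against the sample rates from the question. It assumes SQLite as a stand-in for SQL Server; the table names orders and rates, the OrderID column, and the three sample orders (including one for a country with no rates, SE) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rates (Country TEXT, ExchangeRateDt TEXT, ExchangeRateValue REAL);
    INSERT INTO rates VALUES
        ('DK', '20200601', 0.2), ('DK', '20200603', 0.21),
        ('HR', '20200601', 0.10), ('HR', '20200602', 0.12);

    -- Hypothetical orders; yyyymmdd text compares correctly as a string.
    CREATE TABLE orders (OrderID INTEGER, Country TEXT, OrderDt TEXT);
    INSERT INTO orders VALUES
        (1, 'DK', '20200602'),
        (2, 'HR', '20200603'),
        (3, 'SE', '20200602');
""")

# For each order, join to the latest rate dated on or before the order date.
order_rates = conn.execute("""
    SELECT o.OrderID, o.Country, o.OrderDt, ex.ExchangeRateValue
    FROM orders o
    LEFT JOIN rates ex
           ON ex.Country = o.Country
          AND ex.ExchangeRateDt = (SELECT MAX(r.ExchangeRateDt)
                                   FROM rates r
                                   WHERE r.Country = o.Country
                                     AND r.ExchangeRateDt <= o.OrderDt)
    ORDER BY o.OrderID
""").fetchall()

for row in order_rates:
    print(row)
```

The DK order on 2020-06-02 picks up the rate from 2020-06-01, and the SE order is still returned, with a NULL rate, because of the left join.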
You would typically create a calendar table that stores all the days you are interested in, say dates, with each date on a separate row.
You would also probably have a table that lists the countries: I assumed countries.
Then, one option is a lateral join:
select c.country, d.date, t.ExchangeRateValue
from dates d
cross join countries c
outer apply (
select top (1) t.*
from mytable t
where t.country = c.country and t.ExchangeRateDt <= d.date
order by t.ExchangeRateDt desc
) t
If you don't have these two tables, or can't create them, then one option is a recursive query to generate the dates and a subquery to list the countries. For example, this would generate the data for the month of June:
with dates as (
select cast('20200601' as date) date
union all
select dateadd(day, 1, date) from dates where date < '20200630'
)
select c.country, d.date, t.ExchangeRateValue
from dates d
cross join (select distinct country from mytable) c
outer apply (
select top (1) t.*
from mytable t
where t.country = c.country and t.ExchangeRateDt <= d.date
order by t.ExchangeRateDt desc
) t
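The same idea can be sketched in a runnable form. This is an approximation under stated assumptions: it uses SQLite rather than SQL Server, so a correlated scalar subquery with LIMIT 1 stands in for OUTER APPLY with TOP (1), date(day, '+1 day') stands in for DATEADD, dates are stored as ISO text, and only four days are generated to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rates (Country TEXT, ExchangeRateDt TEXT, ExchangeRateValue REAL);
    INSERT INTO rates VALUES
        ('DK', '2020-06-01', 0.2), ('DK', '2020-06-03', 0.21),
        ('HR', '2020-06-01', 0.10), ('HR', '2020-06-02', 0.12);
""")

# Recursive CTE generates 2020-06-01 .. 2020-06-04; each (country, day) pair
# then looks up the most recent rate on or before that day.
filled = conn.execute("""
    WITH RECURSIVE dates(day) AS (
        SELECT '2020-06-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM dates WHERE day < '2020-06-04'
    )
    SELECT c.Country, d.day,
           (SELECT t.ExchangeRateValue
            FROM rates t
            WHERE t.Country = c.Country AND t.ExchangeRateDt <= d.day
            ORDER BY t.ExchangeRateDt DESC
            LIMIT 1) AS ExchangeRateValue
    FROM dates d
    CROSS JOIN (SELECT DISTINCT Country FROM rates) c
    ORDER BY c.Country, d.day
""").fetchall()

rate_by_day = {(country, day): rate for country, day, rate in filled}
print(rate_by_day[("DK", "2020-06-02")])  # 0.2, carried forward from 2020-06-01
```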
You should be able to map each transaction date to its exchange rate date with this query:
select TAB.primary_key, TAB.TransactionDate, max(EXR.ExchangeRateDt) as ExchangeRateDt
from yourtable TAB
inner join exchangerate EXR
on TAB.Country = EXR.Country and TAB.TransactionDate >= EXR.ExchangeRateDt
group by TAB.primary_key, TAB.TransactionDate

Finding invoices without matching credits

The simplified table looks like this:
BillID|ProductID|CustomerID|Price|TypeID
------+---------+----------+-----+-------
111111|Product1 |Customer1 | 100| I
111112|Product1 |Customer1 | -100| C
111113|Product1 |Customer1 | 100| I
111114|Product1 |Customer1 | -100| C
111115|Product1 |Customer1 | 100| I
I need to find invoices (I) that have their matching credits (C) but not "odd" invoices without matching credits (the last record) - or the other way around (unmatched invoices without corresponding credits).
So far I've got this:
SELECT Invoices.billid, Credits.billid
FROM
(SELECT B1.billid
FROM billing B1
WHERE B1.typeid='I') Invoices
INNER JOIN
(SELECT B2.billid
FROM billing B2
WHERE B2.typeid='C') Credits
ON Invoices.customerid = Credits.customerid
AND Invoices.productid = Credits.productid
AND Invoices.price = -(Credits.price)
But it obviously doesn't work, as it returns something looking like:
billid | billid2
-------+ -------
111111 | 111112
111113 | 111114
111115 | 111114
What I would like to get is a list of unmatched invoices;
billid |
-------+
111115 |
Or alternatively only the matching invoices;
billid | billid2
-------+ -------
111111 | 111112
111113 | 111114
The invoice numbers (BillID) will not necessarily be consecutive of course, it's just a simplified view.
Any help would be appreciated.
This should work. I tested by adding a few consecutive invoices before a credit. The query below shows all invoices with matching credit and shows NULL for the aliased "bar" part of the query if a match doesn't exist.
SELECT * FROM (
SELECT
ROW_NUMBER() OVER(Partition By TypeID, CustomerID, ProductID, Price ORDER BY BillID ASC) AS rownumber,
*
FROM Billing
) AS foo
LEFT JOIN
(SELECT
ROW_NUMBER() OVER(Partition By TypeID, CustomerID, ProductID, Price ORDER BY BillID ASC) AS rownumber,
*
FROM Billing
) AS bar
on foo.CustomerID = bar.CustomerID and
foo.ProductID = bar.ProductID and
foo.rownumber = bar.rownumber and
foo.Price = -1*bar.Price
where foo.Price > 0
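The pairing idea can be checked with a small sketch on the question's sample data: number invoices and credits separately within each customer/product/price group, then match the n-th invoice to the n-th credit. It assumes SQLite as a stand-in for SQL Server and filters invoices by TypeID rather than by the sign of Price, which is a small variation on the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE billing (BillID INTEGER, ProductID TEXT, CustomerID TEXT,
                          Price INTEGER, TypeID TEXT);
    INSERT INTO billing VALUES
        (111111, 'Product1', 'Customer1',  100, 'I'),
        (111112, 'Product1', 'Customer1', -100, 'C'),
        (111113, 'Product1', 'Customer1',  100, 'I'),
        (111114, 'Product1', 'Customer1', -100, 'C'),
        (111115, 'Product1', 'Customer1',  100, 'I');
""")

# Invoices get rn 1, 2, 3 and credits get rn 1, 2, so the third invoice
# finds no credit with the same rn and comes back with NULL.
pairs = conn.execute("""
    WITH numbered AS (
        SELECT BillID, ProductID, CustomerID, Price, TypeID,
               ROW_NUMBER() OVER (PARTITION BY TypeID, CustomerID, ProductID, Price
                                  ORDER BY BillID) AS rn
        FROM billing
    )
    SELECT inv.BillID, cr.BillID
    FROM numbered inv
    LEFT JOIN numbered cr
           ON cr.TypeID = 'C'
          AND cr.CustomerID = inv.CustomerID
          AND cr.ProductID = inv.ProductID
          AND cr.rn = inv.rn
          AND cr.Price = -inv.Price
    WHERE inv.TypeID = 'I'
    ORDER BY inv.BillID
""").fetchall()

unmatched = [invoice for invoice, credit in pairs if credit is None]
print(pairs)      # [(111111, 111112), (111113, 111114), (111115, None)]
print(unmatched)  # [111115]
```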
I wrote this a long time ago, so there may be better ways to solve it now. I've also attempted to adapt it to your table structure, so apologies if it's not 100% there. I assume that your BillID is sequential in date order, i.e. larger numbers were entered later. I've also assumed that invoices are always positive and credit notes always negative, so I don't bother checking the type.
Essentially the query filters out any matched items.
Anyway here goes:
select *
from billing X
/* If we are inside the number of unmatched entries then show it. e.g. if there are 3 unmatched entries, and we are in the top 3 then display */
where (
/* Number of later entries that match this entry on Price/Product/Customer */
select count(*)
from billing Z
where Z.CustomerID = X.CustomerID and Z.ProductID = X.ProductID
and Z.Price = X.Price
and Z.BillID >= X.BillId
) <=
(
/* Number of unmatched entries for this Price/Product/Customer there are, and whether they are negative or positive. */
select abs(Y.Number)
from (
-- Works out how many unmatched billing entries for this Price/Product/Customer there are, and whether they are negative or positive
select ProductID, CustomerID, abs(Price) Price, sum(case when Price < 0 then -1 else +1 end) Number
from billing
group by ProductID, CustomerID, abs(Price)
having sum(Price) <> 0
) as Y
where X.ProductID = Y.ProductID
and X.CustomerID = Y.CustomerID
and X.Price = case when Y.Number < 0 then -1*Y.Price else Y.Price end
)
The odd/even thing concerns me a bit. But assuming this is an incremental key and your business logic is in place, try including this logic in the WHERE clause, the JOIN PREDICATE, or implementing a Lead/Lag function.
SELECT DISTINCT
Invoices.billid
,Credits.billid AS billid2
FROM
(SELECT B1.billid, B1.customerid, B1.productid, B1.price
FROM billing B1
WHERE B1.typeid='I') Invoices
INNER JOIN (SELECT B2.billid, B2.customerid, B2.productid, B2.price
FROM billing B2
WHERE B2.typeid='C') Credits
ON Invoices.customerid = Credits.customerid
AND Invoices.productid = Credits.productid
AND Invoices.price = -(Credits.price)
AND (Invoices.Billid + 1) = Credits.Billid
Note: This is using your INNER JOIN, so we will get the cases where the invoices have a corresponding credit. You could also do a FULL OUTER JOIN instead, then include a WHERE CLAUSE that specifies WHERE Invoices.Billid IS NULL OR Credits.Billid IS NULL. That scenario would give you the trailing case where you don't have a match.

Max Value with unique values in more than one column

I feel like I'm missing something really obvious here.
Using T-SQL/SQL-Server:
I have unique values in more than one column but want to select the max version based on one particular column.
Dataset:
Example
ID | Name| Version | Code
------------------------
1 | Car | 3 | NULL
1 | Car | 2 | 1000
1 | Car | 1 | 2000
Target status: I want my query to only select the row with the highest version value. Running a MAX on the version column pulls all three because of the distinct values in the 'Code' column:
SELECT ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
The net result is that I get all three entries as per the data set due to the unique values in the Code column, but I only want the top row (Version 3).
Any help would be appreciated.
You need to identify the row with the highest version in one query, then join back to the table in an outer query to pull out all the fields for that row. Like so:
SELECT t.ID, t.Name, GRP.Version, t.Code
FROM (
SELECT ID
,Name
,MAX(Version) as Version
FROM Table
GROUP BY ID, Name
) GRP
INNER JOIN Table t on GRP.ID = t.ID and GRP.Name = t.Name and GRP.Version = t.Version
You can also use row_number() to do this kind of logic, for example like this:
select ID, Name, Version, Code
from (
select *, row_number() over (partition by ID order by Version desc) as RN
from Table1
) X where RN = 1
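Here is a small sketch of the row_number() approach, partitioned by ID so that each ID keeps its own highest version. It assumes SQLite as a stand-in for SQL Server; the table name items and the second ID (a 'Bus' row) are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (ID INTEGER, Name TEXT, Version INTEGER, Code INTEGER);
    INSERT INTO items VALUES
        (1, 'Car', 3, NULL),
        (1, 'Car', 2, 1000),
        (1, 'Car', 1, 2000),
        (2, 'Bus', 1, 500);  -- hypothetical second ID
""")

# RN = 1 is the highest Version within each ID; Code comes along for free
# because no GROUP BY is involved.
top_rows = conn.execute("""
    SELECT ID, Name, Version, Code
    FROM (
        SELECT ID, Name, Version, Code,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Version DESC) AS RN
        FROM items
    ) X
    WHERE RN = 1
    ORDER BY ID
""").fetchall()

print(top_rows)  # [(1, 'Car', 3, None), (2, 'Bus', 1, 500)]
```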
Add TOP 1 to force the return of a single row, together with an ORDER BY on the maximum version. Note that this returns one row overall, so it only fits the case of a single ID:
SELECT top 1 ID
,Name
,MAX(Version)
,Code
FROM Table
GROUP BY ID, Name, Code
order by max(version) desc

SQL Pulling the latest information and information from another table

I have a record table that records changes within a table. I can pull the data from the first table fine; however, when I try to join in another table to add some of its column information, it stops displaying the information.
PartNumber | PartDesc | value | date
1 | test | 1 | 3/4/2015
I want to include the aisle tags from the location table:
PartNumber| AisleTag | AisleTagTwo
1 | A1 | N/A
Here is what I have as my SQL statement so far:
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
where Date = (select max(Date) from InvRecord where t1.PartNumber = InvRecord.PartNumber)
order by t1.PartNumber
It is coming up blank; my original SQL statement doesn't include anything from t2. I am not sure what approach to take to get the data combined. Any help is much appreciated, thank you!
This should be the end result:
PartNumber | PartDesc | value | date | AisleTag | AisleTagTwo
1 | test | 1 | 3/4/2015 | A1 | N/A
Pull the most recent row (based on Date) for each PartNumber in Table A and append data from Table B (joined on PartNumber):
SELECT *
FROM (
SELECT A.PartNumber
, A.PartDesc
, A.NewValue
, A.Date
, B.AisleTag
, B.AisleTagTwo
, DateSeq = ROW_NUMBER() OVER(PARTITION BY A.PartNumber ORDER BY A.Date DESC)
FROM InvRecord A
LEFT JOIN PartAisleListTbl B
ON A.PartNumber = B.PartNumber
) A
WHERE A.DateSeq = 1
ORDER BY A.PartNumber
Are you returning no records at all, or only records with AisleTag and AisleTagTwo as null?
Your sentence "it is coming up blank, my original sql statement doesn't include anything from t2." makes it sound like you're getting records with nulls for the t2 fields.
If you are, then you probably have a record in t2 that has nulls for those fields.
For troubleshooting purposes, try running the query without the WHERE clause:
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
order by t1.PartNumber
If you do get records, your problem is with the WHERE clause. If you don't, your problem is with the PartNumber fields in InvRecord and PartAisleListTbl not matching.
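One way to test the second possibility (part numbers that do not match between the two tables) is an anti-join. The sketch below assumes SQLite as a stand-in for SQL Server and adds a hypothetical part 2 that has no row in PartAisleListTbl; an inner join silently drops that part, while a LEFT JOIN filtered on NULL lists it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InvRecord (PartNumber INTEGER, PartDesc TEXT,
                            NewValue INTEGER, "Date" TEXT);
    CREATE TABLE PartAisleListTbl (PartNumber INTEGER, AisleTag TEXT,
                                   AisleTagTwo TEXT);
    INSERT INTO InvRecord VALUES
        (1, 'test',  1, '2015-03-04'),
        (2, 'other', 5, '2015-03-05');  -- hypothetical part with no aisle row
    INSERT INTO PartAisleListTbl VALUES (1, 'A1', 'N/A');
""")

# Inner join: only parts present in both tables survive.
inner_rows = conn.execute("""
    SELECT t1.PartNumber, t2.AisleTag
    FROM InvRecord t1
    JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
    ORDER BY t1.PartNumber
""").fetchall()

# Anti-join: parts in InvRecord that have no match in PartAisleListTbl.
unmatched_parts = conn.execute("""
    SELECT t1.PartNumber
    FROM InvRecord t1
    LEFT JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
    WHERE t2.PartNumber IS NULL
    ORDER BY t1.PartNumber
""").fetchall()

print(inner_rows)       # [(1, 'A1')]
print(unmatched_parts)  # [(2,)]
```

If the anti-join lists every part, the PartNumber values themselves probably differ between the tables (for example in type or padding), which would explain an empty result from the inner join.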
Not sure why yours isn't working... is Date in both t1 and t2 by any chance?
Here it is refactored to use an inline view instead of a correlated subquery; I wonder if it makes a difference.
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2
ON t1.PartNumber = t2.PartNumber
JOIN (select max(Date) mdate, PartNumber from InvRecord GROUP BY PartNumber) t3
on t3.partNumber= T1.PartNumber
and T3.mdate = T1.Date
order by t1.PartNumber

T-SQL order by, based on other column value

I'm stuck with a query which should be pretty simple but, for reasons unknown, my brain is not playing ball here ...
Table:
id(int) | strategy (varchar) | value (whatever)
1 "ABC" whatevs
2 "ABC" yeah
3 "DEF" hello
4 "DEF" kitty
5 "QQQ" hurrr
The query should select ALL rows grouped on strategy, but only one row per strategy: the one with the highest id.
In the case above, it should return rows with id 2, 4 and 5
SELECT id, strategy , value
FROM (
SELECT id, strategy , value
,ROW_NUMBER() OVER (PARTITION BY strategy ORDER BY ID DESC) rn
FROM Table_Name
) Sub
WHERE rn = 1
You can use a window function to get the result you want:
with cte as
(
select
rank()over(partition by strategy order by id desc) as rnk,
id, strategy, value from myT
)
select id, strategy, value from
cte where rnk = 1;
Try this:
SELECT T2.id,T1.strategy,T1.value
FROM TableName T1
INNER JOIN
(SELECT MAX(id) as id,strategy
FROM TableName
GROUP BY strategy) T2
ON T1.id=T2.id
Result:
ID STRATEGY VALUE
2 ABC yeah
4 DEF kitty
5 QQQ hurrr
SELECT id, strategy , value
FROM (
SELECT id, strategy , value
,MAX(id) OVER (PARTITION BY strategy) MaxId
FROM YourTable
) Sub
WHERE id=MaxId
You may try this one as well:
SELECT id, strategy, value FROM TableName WHERE id IN (
SELECT MAX(id) FROM TableName GROUP BY strategy
)
It depends a bit on your data: you might get results faster with this, as it does not sort, but on the other hand it uses IN, which can slow you down if there are many strategies.
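As a quick sanity check that the window-function and IN/MAX variants agree on the sample data, here is a minimal sketch. It assumes SQLite as a stand-in for SQL Server, and the table name strategies is hypothetical. It says nothing about relative performance, which depends on the engine, indexes, and data volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE strategies (id INTEGER, strategy TEXT, value TEXT);
    INSERT INTO strategies VALUES
        (1, 'ABC', 'whatevs'),
        (2, 'ABC', 'yeah'),
        (3, 'DEF', 'hello'),
        (4, 'DEF', 'kitty'),
        (5, 'QQQ', 'hurrr');
""")

# Variant 1: number rows per strategy, highest id first, keep rn = 1.
by_row_number = conn.execute("""
    SELECT id, strategy, value
    FROM (
        SELECT id, strategy, value,
               ROW_NUMBER() OVER (PARTITION BY strategy ORDER BY id DESC) AS rn
        FROM strategies
    ) Sub
    WHERE rn = 1
    ORDER BY id
""").fetchall()

# Variant 2: keep rows whose id is the maximum id of some strategy.
# This is safe here only because id is unique across the whole table.
by_in_max = conn.execute("""
    SELECT id, strategy, value
    FROM strategies
    WHERE id IN (SELECT MAX(id) FROM strategies GROUP BY strategy)
    ORDER BY id
""").fetchall()

print(by_row_number)  # [(2, 'ABC', 'yeah'), (4, 'DEF', 'kitty'), (5, 'QQQ', 'hurrr')]
print(by_row_number == by_in_max)  # True
```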

Resources