How do I join the first row of a subquery? - sql-server

I've got a table of invoices and a child table of related data, linked by key. For each invoice, I'm interested in only the first related row from the child table. Given that I want exactly one related row for every invoice key, how do I accomplish this?
Select i.[Invoice Number],
c.[Carrier Name]
From Invoice i
Left Join Carriers c on i.[InvoiceKey] = c.[InvoiceKey]
Where -- what?
I guess semantically speaking, what I'm looking for is something akin to the concept of Top 1 c.CarrierName Group by InvoiceKey (or whatever the equivalent would be, if that were possible in T-SQL).
I've thought about doing a left join on a subquery, but that doesn't seem very efficient. Does anyone have any T-SQL tricks to achieve this efficiently?
Edit: Sorry guys, I forgot to mention this is SQL Server 2000, so while I'm going to give upvotes for the current SQL Server 2005/2008 responses that will work, I can't accept them I'm afraid.

Provided that Carriers has a PRIMARY KEY called id:
SELECT i.[Invoice Number],
c.[Carrier Name]
FROM Invoice i
JOIN Carriers c
ON c.id =
(
SELECT TOP 1 ID
FROM Carriers ci
WHERE ci.InvoiceKey = i.InvoiceKey
ORDER BY
id -- or whatever
)

;with cteRowNumber as (
select c.InvoiceKey, c.[Carrier Name], ROW_NUMBER() over (partition by c.InvoiceKey order by c.[Carrier Name]) as RowNum
from Carriers c
)
select i.[Invoice Number],
rn.[Carrier Name]
from Invoice i
left join cteRowNumber rn
on i.InvoiceKey = rn.InvoiceKey
and rn.RowNum = 1

This works for me:
select ir.[Invoice Number], c.[Carrier Name]
from
(select ROW_NUMBER() over (order by i.[Invoice Number] asc) AS RowNumber, i.[Invoice Number], i.InvoiceKey
from Invoice i) AS ir
left join Carriers c
on ir.InvoiceKey = c.InvoiceKey
where RowNumber = 1
union all
select ir.[Invoice Number], NULL as [Carrier Name]
from
(select ROW_NUMBER() over (order by i.[Invoice Number] asc) AS RowNumber, i.[Invoice Number]
from Invoice i) AS ir
where RowNumber > 1
or
select TOP 1 i.[Invoice Number], c.[Carrier Name]
from Invoice i
left join Carriers c
on i.InvoiceKey = c.InvoiceKey
union all
select ir.[Invoice Number], NULL as [Carrier Name]
from
(select ROW_NUMBER() over (order by i.[Invoice Number] asc) AS RowNumber, i.[Invoice Number]
from Invoice i) AS ir
where RowNumber > 1

This is how I would do it, using a slightly different syntax than yours (MySQL style), but I guess you could apply it to your solution as well:
SELECT i.invoiceNumber, c.carrierName
FROM Invoice as i
LEFT JOIN Carriers as c ON (c.id = (SELECT id FROM Carriers WHERE invoiceKey = i.invoiceKey ORDER BY id LIMIT 1))
This will take all records from Invoice, and join it with one (or zero) record from Carriers, specifically the record which has the same invoiceKey and only the first one.
As long as you have an index on Carriers.invoiceKey the performance of this query should be acceptable.
Sebastian

Alternatively you could use OUTER APPLY as well.
Please notice the use of angle brackets for unknown field names:
Select i.[Invoice Number], x.[Carrier Name], x.<Carrier_field1>
From Invoice i
OUTER APPLY
(
SELECT TOP 1 *
FROM Carriers c
WHERE c.[InvoiceKey] = i.[InvoiceKey]
ORDER BY <order_clause>
) x

In such cases I often employ a device which I here apply to your example and describe below:
SELECT
i.[Invoice Number],
c.[Carrier Name]
FROM Invoice i
INNER JOIN Carriers c ON i.InvoiceKey = c.InvoiceKey
INNER JOIN (
SELECT MIN(ID) AS ID
FROM Carriers
GROUP BY InvoiceKey
) c_top ON c.ID = c_top.ID
I think this is roughly what Quassnoi has posted, only I try to avoid using SELECT TOPs like that.
Invoice is joined with Carriers on their linking expression (InvoiceKey in this case). Carriers can have multiple rows for the same InvoiceKey, so we need to limit the output, and that is done with a derived table.
The derived table groups the rows of Carriers by the same expression that links the two tables (InvoiceKey) and keeps only the lowest ID in each group, so the final join matches exactly one carrier row per invoice.
There's another way: instead of joining the derived table, you could use IN (subquery) to the same effect. The complete query would then look like this:
SELECT
i.[Invoice Number],
c.[Carrier Name]
FROM Invoice i
INNER JOIN Carriers c ON i.InvoiceKey = c.InvoiceKey
AND c.ID IN (SELECT MIN(ID) FROM Carriers GROUP BY InvoiceKey)
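The MIN(ID)-per-group idea can be tried end to end with a small script. This is only a sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server; the sample rows and the space-free column names (InvoiceNumber, CarrierName) are made up for illustration. It uses LEFT JOINs instead of the INNER JOINs above so that invoices with no carrier still appear, as in the original question.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Invoice (InvoiceKey INTEGER PRIMARY KEY, InvoiceNumber TEXT);
        CREATE TABLE Carriers (ID INTEGER PRIMARY KEY, InvoiceKey INTEGER, CarrierName TEXT);
        INSERT INTO Invoice VALUES (1, 'INV-001'), (2, 'INV-002'), (3, 'INV-003');
        INSERT INTO Carriers VALUES (10, 1, 'Acme'), (11, 1, 'Zenith'), (12, 2, 'Borealis');
    """)
    return conn


def first_carrier_per_invoice(conn):
    """Return one (invoice number, carrier name) row per invoice."""
    sql = """
        SELECT i.InvoiceNumber, c.CarrierName
        FROM Invoice i
        -- derived table: the lowest carrier ID for each invoice key
        LEFT JOIN (
            SELECT InvoiceKey, MIN(ID) AS ID
            FROM Carriers
            GROUP BY InvoiceKey
        ) c_top ON c_top.InvoiceKey = i.InvoiceKey
        -- fetch the carrier row that owns that ID (NULL if there is none)
        LEFT JOIN Carriers c ON c.ID = c_top.ID
        ORDER BY i.InvoiceNumber
    """
    return conn.execute(sql).fetchall()


rows = first_carrier_per_invoice(make_demo_db())
for row in rows:
    print(row)
```

Nothing in the query relies on TOP, APPLY or window functions, so the same shape should carry over to SQL Server 2000; which row counts as "first" is decided entirely by the aggregate in the derived table.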

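Put the aggregate in the SELECT list rather than in HAVING (HAVING expects a condition, not a bare aggregate). To get one invoice for each carrier:
select carriername, max(invoicenumber) ... group by carriername
To get one carrier for each invoice:
select invoicenumber, max(carriername) ... group by invoicenumber
-- aggregate whichever column you want to order by, using min() or max(), to change which row counts as 'first'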

Related

Subtracting two columns within the sql query

I have been trying to subtract two columns in SQL Server to form a third one. Below is my query:
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
What I tried is below, but it does not work:
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate , (TotalDue-AllocatedToDate) as NewColumn
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
Finally I tried it using a CTE, which worked fine. But I want to do it without creating a CTE. Is there any other way to get the same result? I do not want to use a CTE because it is forecast that there may be other columns to calculate in future.
with CTE as(select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id) select * , (CTE.TotalDue-CTE.AllocatedToDate)As Newcolumn from CTE
You can do it without a CTE by repeating the entire formula that makes up AllocatedToDate.
You cannot refer to a column alias elsewhere in the same SELECT list, so you cannot do this:
SELECT {some calculation} AS ColumnA, (ColumnA - ColumnB) AS ColumnC
If you don't want to use a CTE or derived table, you have to do this:
SELECT {some calculation} AS ColumnA, ({some calculation} - ColumnB) AS ColumnC
And by the way, I can't imagine why the possibility of future columns being added is a reason not to use a CTE. To me, it sounds like a reason TO use a CTE, as you will only have to make changes in one place in the code, and not duplicate the same code in different places in the same query.
You can just use nested queries:
select Id, TotalDue, AllocatedToDate, (TotalDue-AllocatedToDate) as NewColumn
from (
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
) x

SQL Server - Distinct case in one column but last record in second column

I have a requirement (simple, but can't find simple solution) to fetch mobile number and unique transaction id (latest transaction would be good, but any transaction id is also ok)
Sample Data
Seq. Mobile No. Transaction No.
1 1234567890 ABC1234
2 2345678901 ABC2392
3 2345678901 ABC2782
I simply want one row for mobile number 2345678901 with any one of its transactions, though the latest would be best.
Output
Seq. Mobile No. Transaction No.
1 1234567890 ABC1234
2 2345678901 ABC2782
I know a plain DISTINCT won't work, so I'm not sure what the best way to get this outcome is.
I found a way to do it via a sub-query, but I want to do it in a single query for better performance.
Please help!
You can use ROW_NUMBER for this:
SELECT Seq, MobileNo, TransactionNo
FROM (
SELECT Seq, MobileNo, TransactionNo,
ROW_NUMBER() OVER (PARTITION BY MobileNo ORDER BY Seq DESC) AS rn
FROM mytable) AS t
WHERE t.rn = 1
The above query will pick exactly one record per MobileNo: the one having the greatest Seq value.
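To check that behaviour against the sample data from the question, here is a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer) and reuses the table name mytable from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (Seq INTEGER, MobileNo TEXT, TransactionNo TEXT);
    INSERT INTO mytable VALUES
        (1, '1234567890', 'ABC1234'),
        (2, '2345678901', 'ABC2392'),
        (3, '2345678901', 'ABC2782');
""")

# Number the rows of each MobileNo from the highest Seq down, then keep row 1.
latest = conn.execute("""
    SELECT Seq, MobileNo, TransactionNo
    FROM (
        SELECT Seq, MobileNo, TransactionNo,
               ROW_NUMBER() OVER (PARTITION BY MobileNo ORDER BY Seq DESC) AS rn
        FROM mytable
    ) AS t
    WHERE t.rn = 1
    ORDER BY Seq
""").fetchall()

for row in latest:
    print(row)
```

Because ROW_NUMBER never repeats a number within a partition, each MobileNo yields exactly one row even if two rows were to share the same Seq.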
You can use group by.
select t1.[Seq], t1.[Mobile No], t1.[Transaction No] from yourtable t1
inner join
(select [Mobile No], max([Transaction No]) as T_no from yourtable
group by [Mobile No]) t2
on t1.[Mobile No]=t2.[Mobile No] and t1.[Transaction No]=t2.T_no
The derived table (t2) gives you the highest [Transaction No] per [Mobile No]; the outer table (t1) is used only to find the matching [Seq]. The selected columns are qualified with t1 because [Mobile No] exists on both sides of the join.
CREATE TABLE #Transaction
(
Seq INT, -- numeric, so that ORDER BY sorts 10 after 9
MobileNo VARCHAR(12),
TransactionNo VARCHAR(12)
)
INSERT INTO #Transaction VALUES
(1,'1234567890','ABC1234')
,(2,'2345678901','ABC2392')
,(3,'2345678901','ABC2782')
-- Keep one row per MobileNo: the one with the highest Seq (the latest)
SELECT DT.Seq,DT.MobileNo,DT.TransactionNo FROM
(SELECT Seq,
MobileNo,
TransactionNo,
ROW_NUMBER() OVER(PARTITION BY MobileNo ORDER BY Seq DESC) AS Rn
FROM
#Transaction) DT
WHERE DT.Rn = 1

T-SQL Group skip and take

Consider following tables:
How to skip and take groups from the table? Tried using Row_Number() but it doesn't help. Any ideas?
Used query
;WITH cte AS (SELECT Room.Id, Room.RoomName,
ROW_NUMBER() OVER
(ORDER BY Room.Id) AS RN
FROM Room INNER JOIN
RoomDetails ON Room.Id = RoomDetails.RoomId)
SELECT Id, RoomName
FROM cte
WHERE RN = 1
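You need DENSE_RANK rather than ROW_NUMBER, so that every row belonging to the same room gets the same number (note that the OVER clause of a ranking function requires an ORDER BY):
dense_rank() over (order by Room.Id) as GroupNo
You can then filter on GroupNo to skip and take whole groups. For more examples, see: Windowing functions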
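One way to turn DENSE_RANK into a group-level skip and take is sketched below. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or newer for window functions). The table layouts, the Detail column and the rooms_page helper are hypothetical, since the tables from the question are not shown.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with hypothetical rooms and details."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Room (Id INTEGER PRIMARY KEY, RoomName TEXT);
        CREATE TABLE RoomDetails (RoomId INTEGER, Detail TEXT);
        INSERT INTO Room VALUES (1, 'Red'), (2, 'Blue'), (3, 'Green');
        INSERT INTO RoomDetails VALUES (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (3, 'e');
    """)
    return conn


def rooms_page(conn, skip, take):
    """Skip `skip` rooms, then return every joined row of the next `take` rooms."""
    sql = """
        SELECT Id, RoomName, Detail
        FROM (
            SELECT r.Id, r.RoomName, d.Detail,
                   -- all rows of one room share the same group number
                   DENSE_RANK() OVER (ORDER BY r.Id) AS GroupNo
            FROM Room r
            INNER JOIN RoomDetails d ON d.RoomId = r.Id
        ) AS ranked
        WHERE GroupNo > ? AND GroupNo <= ?
        ORDER BY Id, Detail
    """
    return conn.execute(sql, (skip, skip + take)).fetchall()


conn = make_demo_db()
second_room = rooms_page(conn, skip=1, take=1)
last_two_rooms = rooms_page(conn, skip=1, take=2)
print(second_room)
print(last_two_rooms)
```

ROW_NUMBER would number each joined row separately, so a filter on it would cut rooms in half; DENSE_RANK numbers the rooms themselves, which is what makes the range filter work on whole groups.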

Need help creating a query for a non-normalized database

I've never worked with a non-normalized database before, so I'll try and explain my problem as best I can. So I have two tables:
The customers table holds all the customers' information, and the orders table holds all the orders that they have placed. I haven't listed all the fields in the tables, just the ones that I need. The customer number is not the primary key in either table, but I'm inner joining on it anyway. The problem I'm having is that I don't know how to make a query that:
Selects all the customers with their first name, last name, and email, and also show the most recent orderdate, most recent total, and most recent ordertype. I know that I have to use a max() aggregate for the date, but that's as far as I got. Please help a noob out.
You can try:
SELECT FirstName,
LastName,
Email,
OrderDate,
OrderTotal,
OrderType
FROM Customers AS C
INNER JOIN Orders AS O
ON O.CustomerNumber = C.CustomerNumber AND
O.OrderDate = (
SELECT MAX(O1.OrderDate)
FROM Orders AS O1
WHERE O1.CustomerNumber = C.CustomerNumber
)
Assuming that Orders.OrderDate is unique for each CustomerNumber, does this work for you? If a single CustomerNumber has more than one row in Orders with the same latest OrderDate, you'll get each of those rows.
select c.FirstName, c.LastName, c.Email, o.OrderDate, o.OrderTotal, o.OrderType
from Customers c
join
(select CustomerNumber, max(OrderDate) as MostRecentOrderDate
from Orders
group by CustomerNumber
) mro on mro.CustomerNumber=c.CustomerNumber
join Orders o on o.OrderDate=mro.MostRecentOrderDate and
o.CustomerNumber=mro.CustomerNumber
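If ties on OrderDate are possible, numbering each customer's orders with ROW_NUMBER and keeping only row 1 guarantees a single row per customer. The sketch below assumes Python's built-in sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or newer). The OrderId tie-breaker column and all sample rows are hypothetical, since the real table definitions are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerNumber INTEGER, FirstName TEXT, LastName TEXT, Email TEXT);
    CREATE TABLE Orders (OrderId INTEGER, CustomerNumber INTEGER, OrderDate TEXT,
                         OrderTotal INTEGER, OrderType TEXT);
    INSERT INTO Customers VALUES
        (100, 'Ann', 'Lee', 'ann@example.com'),
        (200, 'Bob', 'Ray', 'bob@example.com');
    INSERT INTO Orders VALUES
        (1, 100, '2012-04-01', 20, 'phone'),
        (2, 100, '2012-05-01', 50, 'web'),
        (3, 100, '2012-05-01', 75, 'store'),  -- same date as order 2: a tie
        (4, 200, '2012-03-15', 30, 'web');
""")

# Rank each customer's orders newest first; OrderId breaks ties on OrderDate.
most_recent = conn.execute("""
    SELECT c.FirstName, c.LastName, c.Email, o.OrderDate, o.OrderTotal, o.OrderType
    FROM Customers c
    JOIN (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY CustomerNumber
                                  ORDER BY OrderDate DESC, OrderId DESC) AS rn
        FROM Orders
    ) o ON o.CustomerNumber = c.CustomerNumber AND o.rn = 1
    ORDER BY c.CustomerNumber
""").fetchall()

for row in most_recent:
    print(row)
```

With the MAX(OrderDate) join, Ann would come back twice because two of her orders share the latest date; here the second ORDER BY key decides which one wins.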
Try this:
SELECT
Customers.*, Orders.*
FROM
Customers
JOIN
(SELECT
Customer_Number,
MAX(Order_Date) OrderDate
FROM
Orders
GROUP BY
Customer_Number
) as Ord ON Customers.Customer_Number = Ord.Customer_Number
JOIN Orders ON Orders.Customer_Number = Ord.Customer_Number
AND Orders.Order_Date = Ord.OrderDate
If you are doing this with SQL Server, you can use the query designer. Basically all you need is a join, since both tables share the same key: join Customer to Orders on the customer number, aliasing the Customer table as C and the Orders table as O.
For example:
SELECT C.*, O.*
FROM Customer C INNER JOIN Orders O ON C.[Customer Number] = O.[Customer Number]
This should be enough to get you started. If you don't want all the fields, list the ones you need with qualified names, for example:
SELECT C.FirstName, C.LastName, O.OrderDate, O.OrderType FROM Customer C, Orders O
WHERE C.[Customer Number] = O.[Customer Number] -- this is another way of writing the join, using the WHERE clause

Using SELECT TOP from one column, then sorting on a different column

I'm using SQL Server 2005, and I want to query for the vendors generating the most revenue, sorted by the vendor's name. Below is the query I have tried. The inner subquery gets the 15 largest vendors sorted by revenue, and I try to order those results by the vendor name.
SELECT Revenue, VendorName
FROM (
SELECT TOP 15
SUM(po.POTotal) AS Revenue
, Vendors.VendorName AS VendorName
FROM PurchaseOrders po
INNER JOIN Vendors ON po.Vendor_ID = Vendors.Vendor_ID
WHERE ...
GROUP BY Vendors.VendorName
ORDER BY Revenue DESC
)
ORDER BY VendorName ASC
But this gives me an error message:
Msg 156, Level 15, State 1, Line 14
Incorrect syntax near the keyword 'ORDER'.
Is there another way to do this? I think this might be possible with a view, but I'd prefer not to do it that way.
I apologize if this is a duplicate, I don't even know what to search for to see if this has already been asked.
Add an alias for the subquery:
SELECT Revenue, VendorName
FROM (SELECT TOP 15
SUM(po.POTotal) AS Revenue,
v.VendorName AS VendorName
FROM PurchaseOrders po
JOIN Vendors v
ON po.Vendor_ID = v.Vendor_ID
WHERE ...
GROUP BY v.VendorName
ORDER BY Revenue DESC) Z
ORDER BY VendorName ASC
You need to give your derived table an alias:
...
ORDER BY Revenue DESC
) AS DerivedTable
ORDER BY VendorName;
I believe you can do this with a CTE:
WITH revenue (Revenue, VendorName)
AS
(SELECT TOP 15 SUM(po.POTotal) AS Revenue, Vendors.VendorName AS VendorName
FROM PurchaseOrders po
INNER JOIN Vendors
ON po.Vendor_ID = Vendors.Vendor_ID
WHERE ...
GROUP BY Vendors.VendorName
ORDER BY Revenue DESC)
SELECT Revenue, VendorName
FROM revenue
ORDER BY VendorName ASC
You can also do this without a sub-query if you like --
SELECT sum(po.POTotal) as Revenue, vendors.VendorName
FROM PurchaseOrders po INNER JOIN Vendors ON po.Vendor_ID = Vendors.Vendor_ID
WHERE ...
GROUP BY Vendors.VendorName
ORDER BY sum(po.POTotal) DESC, VendorName ASC
Try that and see if it works - we do the same sort of thing here and this was our solution...
Sorry, I forgot the TOP 15 in the query above. It needs to go just before the sum() aggregate function.
