MS SQL problem: Max and GUID

select first(orderid), accountid
from [Order]
group by AccountId
order by DateCreated desc
first() is not a valid function.
max() does not work for uniqueidentifier columns.
How would I get the last orderid created for each account? Thanks.

Something like (untested):
;WITH CTE_LatestOrders AS (
select accountid, lastcreated = max(datecreated)
from [Order]
group by accountid
)
select
o.accountid, o.orderid
from
[Order] o
join CTE_LatestOrders l
on o.AccountID = l.AccountID
and o.datecreated = l.lastcreated

Max() does work for unique identifiers as of MS SQL 2012

You can also use the following.
Select Temp.orderid, T.AccountId, T.DateCreated
From
(
Select AccountId, max(DateCreated) as DateCreated
From [Order]
Group By AccountId
)T
Inner Join [Order] Temp on Temp.AccountId = T.AccountId
AND Temp.DateCreated = T.DateCreated
A CTE is not a UDT/temp table; think of a CTE as a view that is defined only for your current query. Just like a view, a CTE is expanded and folded into the overall query plan. Global optimization will still occur, but do not think that just because you use a CTE you will only execute the query once. Here is a trivial example that fits in this space:
WITH vw AS ( SELECT COUNT(*) c FROM Person )
SELECT a.c, b.c FROM vw a, vw b;
The query plan will clearly show two scans/aggregations and a join instead of just projecting the same result twice.
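
If two orders for the same account can share the same DateCreated, the join-on-max-date queries above return more than one row for that account. A minimal sketch of a ROW_NUMBER() alternative (SQL Server 2005 or later), assuming the column names from the question. The table variable @Order and its rows are made up for illustration, and the multi-row VALUES syntax needs SQL Server 2008 or later:

```sql
-- Hypothetical sample data standing in for the [Order] table in the question.
DECLARE @Order TABLE (orderid uniqueidentifier, accountid int, DateCreated datetime);
INSERT INTO @Order (orderid, accountid, DateCreated)
VALUES (NEWID(), 1, '2020-01-01'),
       (NEWID(), 1, '2020-02-01'),
       (NEWID(), 2, '2020-01-15');

-- Number each account's orders from newest to oldest, then keep the newest one.
-- orderid is only a tie-breaker so that exactly one row per account gets rn = 1.
;WITH Ranked AS (
    SELECT orderid, accountid, DateCreated,
           ROW_NUMBER() OVER (PARTITION BY accountid
                              ORDER BY DateCreated DESC, orderid) AS rn
    FROM @Order
)
SELECT accountid, orderid, DateCreated
FROM Ranked
WHERE rn = 1;
```

This avoids aggregating the uniqueidentifier column at all, so it does not depend on whether max() accepts that type.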

Related

SQL Simple Join with two tables, but one is random

I am stuck with this. I have a simple setup with two tables: one holds email addresses, the other holds voucher codes. I want to join them into a third table so that each email address gets one random voucher code.
Unfortunately, there are no identical IDs to match the two tables on. What I have so far returns no results:
Select
A.Email,
B.CouponCode
FROM Emailaddresses as A
JOIN CouponCodes as B
on A.Email = B.CouponCode
A hint would be great as search did not bring me any further yet.
Edit - Sample Data:
Table A (Addresses)
-------------------------
Column A         | Column B
-------------------------
email1@gmail.com | True
email2@gmail.com |
email3@gmail.com | True
email4@gmail.com |
Table B (Voucher)
-------------------
ABCD1234
ABCD5678
ABCD9876
ABCD5432
Table C
-------------------------
Column A         | Column B
-------------------------
email1@gmail.com | ABCD1234
email2@gmail.com | ABCD5678
email3@gmail.com | ABCD9876
email4@gmail.com | ABCD5432

While joining without proper keys is not a good solution, for your case you can try this. (note: not tested, just a quick suggestion)
;with cte_email as (
select row_number() over (order by Email) as rownum, Email
from Emailaddresses
),
cte_coupon as (
select row_number() over (order by CouponCode) as rownum, CouponCode
from CouponCodes
)
select a.Email,b.CouponCode
from cte_email a
join cte_coupon b
on a.rownum = b.rownum

You want to randomly join records, one email with one coupon each. So create random row numbers and join on these:
select
e.email,
c.couponcode
from (select t.*, row_number() over (order by newid()) as rn from emailaddresses t) e
join (select t.*, row_number() over (order by newid()) as rn from CouponCodes t) c
on c.rn = e.rn;

Assign a row number to the rows of each table and join the two tables on that row number.
Query
;with cte as(
select [rn] = row_number() over(
order by [Column_A]
), *
from [Table_A]
),
cte2 as(
select [rn] = row_number() over(
order by [Column_A]
), *
from [Table_B]
)
select t1.[Column_A] as [Email_Id], t2.[Column_A] as [Coupon]
from cte t1
join cte2 t2
on t1.rn = t2.rn;

Alias name issue in SQL

I was trying to write a query for the SQL Server sample DB Northwind. The question was: "Show the most recent five orders that were purchased by a customer who has spent more than $25,000 with Northwind."
In my query the Alias name - "Amount" is not being recognized. My query is as follows:
select top(5) a.customerid, sum(b.unitprice*b.quantity) as "Amount", max(c.orderdate) as Orderdate
from customers a join orders c
on a.customerid = c.customerid
join [order details] b
on c.orderid = b.orderid
group by a.customerid
--having Amount > 25000 --throws error
having sum(b.unitprice*b.quantity) > 25000 --works, but I don't think that this is a good solution
order by Orderdate desc
Please let me know what I am doing wrong here, as I am a newbie at writing T-SQL. Also, can this query and my logic be treated as a production-level query?
TIA,

You must use the aggregate in the query you have. This all has to do with the order in which a SELECT statement is executed. The syntax of the SELECT statement is as follows:
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
The order in which a SELECT statement is logically processed is as follows. Since the SELECT clause isn't processed until after the HAVING clause, you can't use the alias in HAVING the way you can in the ORDER BY clause.
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
Reference Article: http://www.bennadel.com/blog/70-sql-query-order-of-operations.htm
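
The effect of that processing order can be seen in a small self-contained sketch. The inline rows and the names t, amount and Total are made up for illustration, and the VALUES table constructor assumes SQL Server 2008 or later:

```sql
-- Hypothetical inline data standing in for per-order amounts.
SELECT customerid, SUM(amount) AS Total
FROM (VALUES ('A', 20000), ('A', 9000), ('B', 500)) AS t(customerid, amount)
GROUP BY customerid
-- HAVING Total > 25000       -- fails: the alias does not exist yet at this step
HAVING SUM(amount) > 25000    -- works: HAVING is processed before SELECT
ORDER BY Total DESC;          -- works: ORDER BY is processed after SELECT
```

Only customer 'A' passes the filter, since its amounts add up to 29000.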

This is a known limitation in SQL Server, at least; I have no idea whether it's a bug, intentional, or even part of the standard. The thing is, neither the WHERE nor the HAVING clause accepts an alias as part of its conditions. You must use only columns from the original source tables, which means that to filter by a calculated expression you must repeat the very same expression in both the SELECT and the WHERE/HAVING parts.
A workaround that avoids this duplication is to use a subquery or CTE and apply the filter in the outer query, where the alias is just a column of an "input" table:
WITH TopOrders AS (
select a.customerid, sum(b.unitprice*b.quantity) as "Amount", max(c.orderdate) as Orderdate
from customers a join orders c
on a.customerid = c.customerid
join [order details] b
on c.orderid = b.orderid
group by a.customerid
--no filter here, and no ORDER BY: it is not allowed inside a CTE without TOP
)
SELECT TOP(5) * FROM TopOrders WHERE Amount > 25000 ORDER BY Orderdate DESC;
Interestingly enough, the ORDER BY clause does accept aliases directly.

Depending on what you need, you can use WHERE b.unitprice*b.quantity > 25000 instead of HAVING Amount > 25000.
HAVING is used for conditions on aggregates, so your business rule determines which condition you need. If you want to sum only the line items whose own price is above 25000, use WHERE b.unitprice*b.quantity > 25000. If you want to show the customers whose total is above 25000, use HAVING sum(b.unitprice*b.quantity) > 25000 in your query.
select top(5) a.customerid, sum(b.unitprice*b.quantity) as Amount, max(c.orderdate) as Orderdate
from customers a
JOIN orders c ON a.customerid = c.customerid
join [order details] b ON c.orderid = b.orderid
group by a.customerid
having sum(b.unitprice*b.quantity) > 25000 --works, but I don't think that this is a good solution
Order by Amount

I don't have that schema at hand, so table and column names might be slightly off, but the principle is the same:
select top (5) ord2.*
from (
select top (1) ord.CustomerId
from dbo.Orders ord
inner join dbo.[Order Details] od on od.OrderId = ord.OrderId
group by ord.CustomerId
having sum(od.unitPrice * od.Quantity) > $25000
) sq
inner join dbo.Orders ord2 on ord2.CustomerId = sq.CustomerId
order by ord2.OrderDate desc;

The HAVING clause works with aggregate functions like SUM, MAX, AVG.
You may try something like this:
SELECT TOP 5 customerid,SUM(Amount)Amount , MAX(Orderdate) Orderdate
FROM
(
SELECT A.customerid, (B.unitprice * B.quantity) As "Amount", C.orderdate As Orderdate
FROM customers A JOIN orders C ON A.customerid = C.customerid
JOIN [order details] B ON C.orderid = B.orderid
) Tmp
GROUP BY customerid
HAVING SUM(Amount) > 25000
ORDER BY Orderdate DESC

The question is a little ambiguous.
Show the most recent five orders that were purchased by a customer who
has spent more than $25,000 with Northwind.
Is it asking to show the 5 most recent orders for every customer who has spent more than $25,000 across all of their transactions (which can be more than 5)?
The following query shows the 5 most recent orders for each customer who spent more than $25,000 across all of their transactions (not just the recent 5).
The subquery BigSpenders gets all the customers who spent more than $25,000.
Another subquery calculates the total amount for each order.
Then the query ranks each customer's orders by OrderDate and OrderID.
Finally it keeps only the top 5 orders for each customer.
--
SELECT *
FROM (SELECT C.customerid,
C.orderdate,
C.orderid,
B3.amount,
Row_number()
OVER(
partition BY C.customerid
ORDER BY C.orderdate DESC, C.orderid DESC) Rank
FROM orders C
JOIN
--Get Amount Spend Per Order
(SELECT b2.orderid,
Sum(b2.unitprice * b2.quantity) AS Amount
FROM [order details] b2
GROUP BY b2.orderid) B3
ON C.orderid = B3.orderid
JOIN
--Get Customers who spent more than 25000
(SELECT c.customerid
FROM orders c
JOIN [order details] b
ON c.orderid = b.orderid
GROUP BY c.customerid
HAVING Sum(b.unitprice * b.quantity) > 25000) BigSpenders
ON C.customerid = BigSpenders.customerid) X
WHERE X.rank <= 5

SQL SERVER : select the latest comment using the max date

I have a table like so:
Id, Comment, LastUpdatedDate
I'm trying to select the latest comment for each id. The table can have many comments for one id with different dates, and I want only the one with the latest date. I've tried the following with no success:
SELECT tt.*
FROM tagtestresultcomment tt
INNER JOIN
(
SELECT tag_id, MAX(last_update) AS MaxDateTime
FROM tagtestresultcomment
GROUP BY tag_id
) groupedtt ON tt.tag_id = groupedtt.tag_id AND tt.last_update = groupedtt.MaxDateTime
order by tag_id
Does anyone have any ideas of how to achieve this?
Thanks!

It sounds like you want only the latest comment for each tag_id? In which case, here is one approach you can use from SQL 2005 and on:
;WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY tag_id ORDER BY last_update DESC) AS RowNo
FROM TagTestResultComment
)
SELECT * FROM CTE WHERE RowNo = 1

try this
Select * from tagtestresultcomment where last_update in
(select max(last_update) from tagtestresultcomment group by tag_id)

Your query is more redundant than it needs to be. First, instead of
tt.tag_id = groupedtt.tag_id AND tt.last_update = groupedtt.MaxDateTime
it's enough to use just
tt.tag_id = groupedtt.tag_id
and second, it's enough to use just
SELECT [desired field list except last_update and tag_id],
tag_id,
MAX(last_update) AS MaxDateTime
FROM
tagtestresultcomment
group by
tag_id, [desired field list except last_update and tag_id]
to achieve your objective.

I have tried something like this:
declare @tagtestresultcomment table
(
id int
, comment varchar(50)
,LastUpdatedDate datetime
)
--==== Populate table
insert into @tagtestresultcomment(id,comment,LastUpdatedDate)
select 1,'My name is Arthur','2011-06-09 00:00:00' union all
select 2,'My name is DW','2011-06-19 00:00:00' union all
select 2,'Arthur is my brother','2011-06-21 00:00:00' union all
select 1,'I have a sister named DW','2011-06-21 00:00:00' union all
select 3,'I am Muffy','2011-06-14 00:00:00' union all
select 3,'I like sports','2011-06-14 00:00:00'
-- SELECT stmt
select * from @tagtestresultcomment t
join
(
select id, MAX(lastupdateddate) as LastUpdatedDate from @tagtestresultcomment group by id
) m
on t.id = m.id
and t.LastUpdatedDate = m.LastUpdatedDate

The "MAX" group function wasn't working for me, so I used a sub-query. I had trouble wrapping my head around your single table example, so I'm using a common parent-child 1-to-many relationship with a blog and comment tables as an example.
SELECT
b.id,
b.content,
c.id,
c.blog_id,
c.content,
c.last_update
FROM blog b
INNER JOIN blog_comment c
ON b.id = c.blog_id AND c.id = (
SELECT TOP 1 id FROM blog_comment WHERE blog_id = b.id ORDER BY last_update DESC
)
The query takes a hit on my sub-query, as it will call that "SELECT TOP 1" query for each record in the blog table. I'd like to hear of a faster, more efficient example if possible.

select top 1 with a group by

I have two columns:
namecode name
050125 chris
050125 tof
050125 tof
050130 chris
050131 tof
I want to group by namecode, and return only the name with the most number of occurrences. In this instance, the result would be
050125 tof
050130 chris
050131 tof
This is with SQL Server 2000

I usually use ROW_NUMBER() to achieve this. Not sure how it performs against various data sets, but we haven't had any performance issues as a result of using ROW_NUMBER.
The PARTITION BY clause specifies which value to "group" the row numbers by, and the ORDER BY clause specifies how the records within each "group" should be sorted. So partition the data set by NameCode, and get all records with a Row Number of 1 (that is, the first record in each partition, ordered by the ORDER BY clause).
SELECT
i.NameCode,
i.Name
FROM
(
SELECT
RowNumber = ROW_NUMBER() OVER (PARTITION BY t.NameCode ORDER BY t.Name),
t.NameCode,
t.Name
FROM
MyTable t
) i
WHERE
i.RowNumber = 1;
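
Note that the query above orders each partition by Name, so it returns the alphabetically first name rather than the most frequent one. A minimal sketch that ranks by the number of occurrences instead, using the rows from the question in a hypothetical table variable @MyTable. ROW_NUMBER() was introduced in SQL Server 2005, so this does not run on SQL Server 2000, and the multi-row VALUES syntax needs 2008 or later:

```sql
-- Sample rows from the question, loaded into a made-up table variable.
DECLARE @MyTable TABLE (namecode char(6), name varchar(20));
INSERT INTO @MyTable (namecode, name)
VALUES ('050125', 'chris'),
       ('050125', 'tof'),
       ('050125', 'tof'),
       ('050130', 'chris'),
       ('050131', 'tof');

-- Count each (namecode, name) pair, then number the pairs within each namecode
-- from most to least frequent; name is only a tie-breaker.
SELECT x.namecode, x.name
FROM (
    SELECT namecode, name,
           ROW_NUMBER() OVER (PARTITION BY namecode
                              ORDER BY COUNT(*) DESC, name) AS rn
    FROM @MyTable
    GROUP BY namecode, name
) x
WHERE x.rn = 1
ORDER BY x.namecode;
```

For the sample rows this yields tof for 050125, chris for 050130 and tof for 050131, matching the result the question asks for.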

select distinct namecode
, (
-- most frequent name for this namecode
select top 1 name
from myTable i
where i.namecode = o.namecode
group by name
order by count(*) desc
) as name
from myTable o

SELECT max_table.namecode, count_table2.name
FROM
(SELECT namecode, MAX(count_name) AS max_count
FROM
(SELECT namecode, name, COUNT(name) AS count_name
FROM mytable
GROUP BY namecode, name) AS count_table1
GROUP BY namecode) AS max_table
INNER JOIN
(SELECT namecode, COUNT(name) AS count_name, name
FROM mytable
GROUP BY namecode, name) count_table2
ON max_table.namecode = count_table2.namecode AND
count_table2.count_name = max_table.max_count

I did not try it, but this should work:
select top 1 t2.* from (
select namecode, count(*) count from temp
group by namecode) t1 join temp t2 on t1.namecode = t2.namecode
order by t1.count desc

Here are two examples that you could use. The temp table version was more efficient than the view, but that was measured on a small data sample, so you would want to check your own statistics.
--Creating A View
GO
CREATE VIEW StateStoreSales AS
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank'
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
GO
SELECT * FROM StateStoreSales
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP VIEW StateStoreSales
---Using a Temp Table
SELECT t.state,t.stor_id,t.stor_name,SUM(s.qty) 'TotalSales'
,ROW_NUMBER() OVER (PARTITION BY t.state ORDER BY SUM(s.qty) DESC) AS 'Rank' INTO #TEMP
FROM [dbo].[sales] s
JOIN [dbo].[stores] t ON (s.stor_id = t.stor_id)
GROUP BY t.state,t.stor_id,t.stor_name
SELECT * FROM #TEMP
WHERE Rank <= 1
ORDER BY TotalSales Desc
DROP TABLE #TEMP

SQL Server 2005 Syntax Help - "Select Info based upon Max Value of Sub Query"

The objective is below the list of tables.
Tables:
Table: Job
JobID
CustomerID
Value
Year
Table: Customer
CustomerID
CustName
Table: Invoice
SaleAmount
CustomerID
The Objective
Part 1: (easy) I need to select all invoice records and sort by Customer (to play nice with Crystal Reports)
Select * from Invoice as A inner join Customer as B on A.CustomerID = B.CustomerID
Part 2: (hard) Now, we need to add two fields:
JobID associated with that customer's job that has the Maximum Value (from 2008)
Value associated with that job
Pseudo Code
Select * from
Invoice as A
inner join Customer as B on A.CustomerID = B.CustomerID
inner join
(select JobID, Value from Jobs where Job:JobID has the highest value out of all of THIS customer's jobs from 2008)
General Thoughts
This is fairly easy to do If I am only dealing with one specific customer:
select max(JobId) as MaxJobID, max(Value) as MaxValue from Jobs where Value = (select max(Value) from Jobs where CustomerID = @SpecificCustID and Year = '2008') and CustomerID = @SpecificCustID and Year = '2008'
This subquery determines the max Value for this customer in 2008, and then it's a matter of choosing a single job (can't have dupes) out of the potentially multiple jobs from 2008 for that customer that have the same value.
The Difficulty
What happens when we don't have a specific customer ID to compare against? If my goal is to select ALL invoice records and sort by customer, then this subquery needs access to which customer it is currently dealing with. I suppose this can "sort of" be done through the ON clause of the JOIN, but that doesn't really seem to work because the sub-sub query has no access to that.
I'm clearly over my head. Any thoughts?

How about using a CTE. Obviously, I can't test, but here is the idea. You need to replace col1, col2, ..., coln with the stuff you want to select.
;WITH Inv( col1, col2, ... coln, RowNumber)
AS
(
SELECT col1, col2, ... coln,
ROW_NUMBER() OVER (PARTITION BY A.CustomerID
ORDER BY A.Value DESC) AS [RowNumber]
FROM Invoice A INNER JOIN Customer B ON A.CustomerID = B.CustomerID
WHERE A.CustomerID = @CustomerID
AND A.Year = @Year
)
SELECT * FROM Inv WHERE RowNumber = 1
If you don't have a CustomerID and leave that filter out, this will return the top value for each customer (that will hurt performance, though).

The row_number() function can give you what you need:
Select A.*, B.*, C.JobID, C.Value
from
Invoice as A
inner join Customer as B on A.CustomerID = B.CustomerID
inner join (
select JobID, Value, CustomerID,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY Value DESC) AS Ordinal
from Jobs
WHERE Year = 2008
) AS C ON (A.CustomerID = C.customerID AND C.Ordinal = 1)
The ROW_NUMBER() function in this query numbers the rows by Value in descending order, and the PARTITION BY clause makes it do this separately for each distinct CustomerID. This means that the row with the highest Value for each customer will always have Ordinal 1, so we can join to that row.

The OVER clause is awesome but often neglected. You can use it in a subquery to pull back your valid jobs, like so:
select
a.*
from
invoice a
inner join customer b on
a.customerid = b.customerid
inner join (select customerid, max(jobid) as jobid, maxVal from
(select customerid,
jobid,
value,
max(value) over (partition by customerid) as maxVal
from jobs
where Year = '2008') s
where s.value = s.maxVal
group by customerid, maxVal) c on
b.customerid = c.customerid
and a.jobid = c.jobid
Essentially, that first inner query looks like this:
select
customerid,
jobid,
value,
max(value) over (partition by customerid) as maxVal
from jobs
where Year = '2008'
You'll see that this pulls back all of the jobs, but with an additional column that tells you the maximum value for each customer. With the next subquery, we keep only the rows where value equals maxVal. Additionally, it finds the max JobID per customerid and maxVal, because we need to pull back one and only one JobID (as per the requirements).
Now, you have a complete listing of CustomerID and JobID that meet the conditions of having the highest JobID that contains the maximum Value for that CustomerID in a given year. All that's left is to join it to Invoice and Customer, and you're good to go.
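
To see what MAX(...) OVER (PARTITION BY ...) adds to each row, here is a small sketch on made-up data. The table variable @Jobs and its rows are hypothetical and only borrow the column names from the question; the multi-row VALUES syntax assumes SQL Server 2008 or later:

```sql
-- Hypothetical sample rows using the Jobs columns named in the question.
DECLARE @Jobs TABLE (CustomerID int, JobID int, Value money, [Year] int);
INSERT INTO @Jobs (CustomerID, JobID, Value, [Year])
VALUES (1, 10, 500, 2008),
       (1, 11, 900, 2008),
       (1, 12, 900, 2008),
       (2, 20, 300, 2008);

-- Every 2008 job is returned, each carrying its customer's maximum Value.
SELECT CustomerID, JobID, Value,
       MAX(Value) OVER (PARTITION BY CustomerID) AS maxVal
FROM @Jobs
WHERE [Year] = 2008;
```

Customer 1 has two jobs tied at the maximum (11 and 12), which is the situation the MAX(JobID) step in the outer subquery is there to resolve.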

Just to be complete, here is the non-row_number solution for those on versions before MSSQL 2005. Personally, I find it easier to follow myself, but that could be biased considering how much time I spend in MSSQL 2000 vs 2005+.
SELECT *
FROM Invoice as A
INNER JOIN Customer as B ON
A.CustomerID = B.CustomerID
INNER JOIN (
SELECT
Jobs.CustomerId,
--MAX in case of dupe Values.
--If UC on CustomerId, Value (or CustomerId, Year, Value) then not needed
MAX(Jobs.JobId) as JobId
FROM Jobs
JOIN (
SELECT
CustomerId,
MAX(Value) as MaxValue
FROM Jobs
WHERE Year = 2008
GROUP BY
CustomerId
) as MaxValue ON
Jobs.CustomerId = MaxValue.CustomerId
AND Jobs.Value = MaxValue.MaxValue
WHERE Jobs.Year = 2008
GROUP BY
Jobs.CustomerId
) as C ON
B.CustomerID = C.CustomerID