How to get only last row after inner join? - sql-server

I need to return unique funeral_homes who contains not completed leads and sort these by last lead timestamp.
This is my sql query:
select distinct h.funeral_home_name, h.funeral_home_guid, h.address1, h.city, h.state, p.discount_available, t.date_created
from tblFuneralHomes h inner join tblFuneralTransactions t on h.funeral_home_guid = t.funeral_home_guid
inner join vwFuneralHomePricing p on h.funeral_home_guid = p.funeral_home_guid where completed=0 order by 'funeral_home_name' asc;
This is the result, but I need only unique homes with last added lead
What I should change here?

The problem here appears that you are joining into tables with 1 to many relationships with table tblFuneralHomes, yet you expect only one row per funeral home.
Instead of using distinct, I would suggest that instead you group by the required output funeral home columns, and then apply some kind of aggregate on the columns needed from the joined tables in order to return just a single computed value from all possible joined values.
For instance, below we find the first transaction date (min) associated with each funeral home:
select h.funeral_home_name, h.funeral_home_guid, h.address1, h.city, h.state,
p.discount_available, min(t.date_created)
from tblFuneralHomes h
inner join tblFuneralTransactions t on h.funeral_home_guid = t.funeral_home_guid
inner join vwFuneralHomePricing p on h.funeral_home_guid = p.funeral_home_guid
where completed=0
group by h.funeral_home_name, h.funeral_home_guid, h.address1, h.city, h.state,
p.discount_available
order by h.funeral_home_name asc
Note that depending on the cardinality of the association between tblFuneralHomes and vwFuneralHomePricing, you may also need to remove p.discount_available from the grouping and also introduce it with an aggregate function, similar to what I've done with t.date_created

Related

Understanding DISTINCT vs DISTINCT ON vs Group by

I have a query which returns a set of 'records'.
The result is always from the same table, and should always be unique. It has a set of inner joins to filter the rows down to the appropriate subset.
The query is returning roughly 10 columns.
However, I found that it was returning duplicate rows, so I added select distinct to the query, which solved the duplication problem but has significant performance issues.
My understanding is that select distinct on (records.id), id... will return the same result in this case, as all duplicates would have the same primary key, and seems to be about twice as fast.
My other tests show that group by records.id is even faster again, and seems to do the same thing?
Am I correct that all three of these approaches will always return the same set of single table records?
Also, is there an easy way to compare the results of different approaches to ensure the set is being returned?
Here is my query:
SELECT DISTINCT records.*
FROM records
INNER JOIN records parents on parents.path #> records.path
INNER JOIN record_types ON record_types.id = records.record_type_id
INNER JOIN user_roles ON user_roles.record_id = parents.id AND user_roles.user_id = _user_id
INNER JOIN memberships ON memberships.role_id = user_roles.role_id
INNER JOIN roles ON roles.id = memberships.role_id
INNER JOIN groups ON memberships.group_id = groups.id AND
groups.id = record_types.view_group_id
Any individual record can have tree of 'parent' records. This is done using the ltree plugin. Effectively, we are looking to see if the user has a role which is in a group which is defined as the 'view group' for either the current record, or any of the parents. The query is actually a function, and _user_id is being passed in.
Since you are only selecting from records, you don't need DISTINCT; the records are already distinct (I presume).
So the duplicates you encounter could be caused by all the joins, for instance if more than one role or group membership matches one of your records, the same record will be combined with each of these references.
SELECT *
FROM records r
WHERE EXISTS (
SELECT *
FROM records pa on pa.path #> r.path
JOIN record_types typ ON typ.id = r.record_type_id
JOIN user_roles ur ON ur.record_id = pa.id AND ur.user_id = _user_id
JOIN memberships mem ON mem.role_id = ur.role_id
JOIN roles ON roles.id = mem.role_id
JOIN groups gr ON mem.group_id = gr.id AND gr.id = typ.view_group_id
)
;

MSSQL Group By failing, but no dup Column names

I know this question has been asked time and time again, but I have no two column names that are the same, yet I am getting:
Msg 8120, Level 16, State 1, Line 13 Column 'dbo.PRODUCT.ProductName'
is invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
My ProductId column is unique to my dbo.Product Table, and I am not sure why it is getting confused with another value. In this image you can see the dup ProductIds
WITH products AS
(
SELECT
*,
ROW_NUMBER() OVER(ORDER BY p.[ProductName]) AS 'RowNumber'
FROM dbo.PRODUCT p
JOIN dbo.Category c ON p.ProductCategoryCode = c.CategoryCode
JOIN dbo.Supplier s ON p.ProductSupplierCode = s.SupplierCode
LEFT JOIN dbo.ProductTag pt ON pt.ProductUPC = p.UPC
LEFT JOIN dbo.Tag t ON pt.ProductTagTagCode = t.TagCode
GROUP BY p.ProductId
)
SELECT *
FROM products
WHERE RowNumber BETWEEN 0 AND 2;
Your error is because you are selecting ALL of the fields in ALL of the tables, but you are only grouping by one value. If a value is returned by the query, then it must either be GROUPED or aggregated (Min, Max, SUM, AVG, etcetera).
If you simply add the Product Name to your grouping:
GROUP BY p.ProductId, p.ProductName
You will still have the same problem with (for example) p.ProductCategoryCode, p.ProductSupplierCode, c.CategoryCode, etc, etc.
In this case, where you are looking for unique rows, do not use GROUP BY - use DISTINCT (which works on all fields returned automatically) instead. Note that #bjones is still correct as to why you are getting duplicates - one of the tables you are joining in can have multiple rows for each product (e.g. many times a product will come from more than one supplier.)
To solve this, you need to:
Determine what data you need to return, and only select those columns
Determine if you need to summarize any data (i.e. Total Sold or On Hand), then:
Use GROUP BY if you do need to summarize any values, or
Use DISTINCT if you do not need to summarize any values

SQL Left Outer Join?

I have table that should joint to another table based on the unique id. If I do LEFT OUTER JOIN ON I will have duplicates. If I put DISTINCT in my SELECT I will get correct number of records. Then if I include any field from the table that I did LEFT OUTER JOIN in that case I'm getting duplicates again. Here is my query:
SELECT DISTINCT
Table1.fname,
Table1.lname,
Table2.address
FROM Table1
LEFT OUTER JOIN Table2
ON Table2.user_id = Table1.userid
In the example above I'm getting duplicates, also I have tried to do:
LEFT OUTER JOIN (
SELECT user_id
FROM Table2
GROUP BY user_id
) AS t2 ON Table1.user_id = t2.user_id
This gave me correct number of records but I need some additional columns from that second table, after I include extra columns I'm getting duplicates again, example:
LEFT OUTER JOIN (
SELECT user_id, address
FROM Table2
GROUP BY user_id, address
) AS t2 ON Table1.user_id = t2.user_id
I'm wondering if I missed something or there is better way to handle this type of problem. If anyone see something or know better solution please let me know.
It is impossible for you to pick the correct answer here without understanding your data.
It seems that Table2 supports multiple addresses per user_id. This is a common design. If you want to return only one address per user_id you have several options:
Fix the data - Remove the duplicate addresses from table 2 and add a constraint that prevents this situation again. You will need to determine which addresses are incorrect.
Reduce the left join to only include one address per user - How you do this will depend on your other data. You could use min() or max() with a group by if you don't care which one to return where there are multiples or you will need to perhaps order by an effective date and take the latest one - or maybe there are billing and shipping addresses and you should pick the correct one.
Accept that there are multiple addresses per user - this may be correct - and adjust the rest of your code.

Find matching records of two different tables in SQL Server

I have two tables in one of them a seller saves a record for a product he is selling. and in another table buyers save what they need to buy.
I need to get a list of user ids (uid field) from buyers table which matches a specific product on sales table. this is what I have written:
select n.[uid]
from needs n
left join ads(getdate()) a
on n.mid=a.mid
and a.[year] between n.from_year and n.to_year
and a.price between n.from_price and n.to_price
and n.[uid]=a.[uid]
and a.pid=n.pid
Well I need to use a where clause to eliminate those records which doesn't match. as I think all of these conditions are defined with ON must be defined with a where clause. but joining needs at least one ON clause. may be I shouldn't join two tables? what can I do?
There is an important difference between LEFT JOIN and JOIN, or more accurately OUTER and INNER joins respectively.
Inner joins require that both sides of the join match. In other words, if you:
had a table representing People
you had another table representing Automobiles
and each automobile had a PersonId
and joined these tables using ON with the PersonId
using LEFT (OUTER) JOIN would return all people, even those without automobiles. INNER JOIN only returns the people with vehicles.
This article may help: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html

SQL Same Column in one row

I have a lookup table that has a Name and an ID in it. Example:
ID NAME
-----------------------------------------------------------
5499EFC9-925C-4856-A8DC-ACDBB9D0035E CANCELLED
D1E31B18-1A98-4E1A-90DA-E6A3684AD5B0 31PR
The first record indicates and order status. The next indicates a service type.
In a query from an orders table I do the following:
INNER JOIN order.id = lut.Statusid
This returns the 'cancelled' name from my lookup table. I also need the service type in the same row. This is connected in the order table by the orders.serviceid How would I go about doing this?
It Cancelled doesnt connect to 31PR.
Orders connects to both. Orders has 2 fields in it called Servicetypeid and orderstatusid. That is how those 2 connect to the order. I need to return both names in the same order row.
I think many will tell you that having two different pieces of data in the same column violates first normal form. There is a reason why having one lookup table to rule them all is a bad idea. However, you can do something like the following:
Select ..
From order
Join lut
On lut.Id = order.StatusId
Left Join lut As l2
On l2.id = order.ServiceTypeId
If order.ServiceTypeId (or whatever the column is named) is not nullable, then you can use a Join (inner join) instead.
A lot of info left out, but here it goes:
SELECT orders.id, lut1.Name AS OrderStatus, lut2.Name AS ServiceType
FROM orders
INNER JOIN lut lut1 ON order.id = lut.Statusid
INNER JOIN lut lut2 ON order.serviceid = lut.Statusid

Resources