Removing Duplicated Data after Joining Tables - sql-server

I've been stuck on this query for a while and don't know how to move forward. The problem arises when joining multiple tables: the numeric totals do not match what I expect. This is because, for each record in the table with fewer records, the join returns all of the corresponding records from the bigger table.
For example, suppose you have the following tables. The table with fewer records, Available Fruits, has one record for each A, B, C, D, and E. The table with more records, Sales Today, has multiple records each for A, B, C, D, and E.
Then suppose you use a JOIN to combine the two tables above.
SELECT A.*, B.*
FROM [Available Fruits] A
JOIN [Sales Today] B
ON A.[Fruit ID]=B.[Fruit ID]
The result is the table below. Notice that the rows from the Available Fruits table are duplicated for every instance in which the corresponding ID appears in the Sales Today table. For example, if you place the Inventory and Fruit fields into a new table, this joined table causes the inventory of apples to appear as 375 instead of the expected 75, as shown in the following image.
Unfortunately I still don't have enough points to post images.
EDIT: What I'd like to do in SQL is either roll up the "Sales Today" table to the granularity of "Available Fruits" so that nothing gets duplicated, or calculate the distribution of Inventory sold per A, B, C, D, and E, so that I can join the two tables without duplicating the Inventory field as described above.
I really appreciate all your help guys.

You haven't actually posted what you want the result to look like. Try something like this (I am not in a position to test it right now, and I have never done a GROUP BY on joined tables before):
SELECT A.[Fruit ID], A.Fruit, sum(B.[Fruits Sold])
from [Available Fruits] A
left outer join [Sales Today] B on A.[Fruit ID] = B.[Fruit ID]
group by A.[Fruit ID], A.[Fruit]
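The roll-up the question asks for can also be done by aggregating Sales Today before the join, so each fruit contributes exactly one row. Below is a minimal sketch, assuming Python's built-in sqlite3 module in place of SQL Server; the snake_case table and column names and the sample figures are hypothetical stand-ins for the tables described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE available_fruits (fruit_id TEXT PRIMARY KEY, fruit TEXT, inventory INTEGER);
CREATE TABLE sales_today (fruit_id TEXT, fruits_sold INTEGER);
-- Hypothetical sample data: five sales rows for apples, one for bananas.
INSERT INTO available_fruits VALUES ('A', 'Apple', 75), ('B', 'Banana', 50);
INSERT INTO sales_today VALUES ('A', 5), ('A', 3), ('A', 2), ('A', 4), ('A', 1), ('B', 10);
""")

# Naive join: the apple inventory row is repeated once per sale,
# so summing it gives 5 * 75 = 375 instead of 75.
naive = conn.execute("""
    SELECT a.fruit, SUM(a.inventory)
    FROM available_fruits a
    JOIN sales_today s ON a.fruit_id = s.fruit_id
    GROUP BY a.fruit
    ORDER BY a.fruit
""").fetchall()

# Roll sales up to one row per fruit first, then join.
rolled_up = conn.execute("""
    SELECT a.fruit, a.inventory, COALESCE(s.total_sold, 0) AS total_sold
    FROM available_fruits a
    LEFT JOIN (
        SELECT fruit_id, SUM(fruits_sold) AS total_sold
        FROM sales_today
        GROUP BY fruit_id
    ) s ON a.fruit_id = s.fruit_id
    ORDER BY a.fruit
""").fetchall()

print(naive)      # apple inventory inflated by the join
print(rolled_up)  # one row per fruit, inventory intact
```

The same subquery shape should carry over to SQL Server with the bracketed names from the question, though that has not been tested here.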

Related

Access 2016 two table join based on a field found in another table

In advance, thank you for your assistance and explanation, I am a novice at best and trying to teach myself.
I have two different types of claims tables, [claim_a] and [claim_b]. Each of these tables capture similar data and have one claim record per row. I have a similar form to display data by record for each table. One of my forms has a grid that captures documents sent and the date sent.
In a third table, [documents], each instance of a document sent is associated with an ssn, claim_num, first, last and clm_type, doc_type and date_sent.
I want to create one query that outputs all correspondence sent for both claim tables. I realize I could just run two individual queries, but I think this can be done in one and is not too difficult; I am just missing something and would like to know what. I have tried various join types (inner, left, right) and get various results, but none that is actually correct. With INNER JOIN I got only 78 records but am expecting 2,261, and with LEFT OUTER JOIN I got 3,070, which is more than the number of rows in my [documents] table. I do understand that an outer join in which one row in the left table matches two rows in the right table returns two rows.
I have also been sure to use parentheses around my first join, which, based on Google searches, seems to be required by Access. I also tried using WHERE clauses.
I think the problem may be that some of the records in [documents] do not correspond to a record in either claims table. I also just tried joining one claim table to [documents] but even that did not return the expected number of results.
Here are a few of the joins I have tried:
Inner Join for one table: My output was missing 4 records for an SSN with 6 total records and I could not figure out why it skipped over the remaining 4. It was only for this SSN. I had other SSNs with more than 6 records.
SELECT documents.date, documents.doc_type
FROM documents INNER JOIN claim_a ON documents.ssn = claim_a.ssn
WHERE (((documents.clm_type)="Life Only")) OR
(((documents.clm_type)<>("Health")) AND (("Life/ ADB")<>False) AND (("Life")<>False));
I got 78 records with this join
SELECT documents.date, documents.doc_type
FROM (documents INNER JOIN claim_a ON documents.ssn = claim_a.ssn)
INNER JOIN claim_b ON documents.ssn = claim_b.ssn;
I got 3070 records with this join
SELECT documents.date, documents.doc_type
FROM (documents LEFT OUTER JOIN claim_a ON documents.ssn = claim_a.ssn)
LEFT OUTER JOIN claim_b ON documents.ssn = claim_b.ssn;
I got the correct number of results with this query but I am concerned it will not work with my Master Form to display header specific information for my form associated with table, claim_b.
SELECT documents.date, documents.doc_type
FROM documents LEFT JOIN claim_a ON documents.ssn = claim_a.ssn
WHERE (((documents.clm_type)<>""));
I am obviously doing something wrong. Can someone please advise?
Sounds like you need a union query between the two claims tables to get a list of all claims. Then use the results of that query to get the document list.
Union query
Select ssn from claim_a
Union all select ssn from claim_b
Save this with a name like SSN_List, then join it to Documents in another query
Select * from Documents
left join SSN_List on Documents.ssn=SSN_List.ssn
And of course change the 2nd query as needed to get the information you need from Documents.
This can probably be done in one query, but I find it easier to understand and use the 2 step approach.
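The two-step approach can be sketched as a single statement by placing the union in a subquery. This is a minimal sketch, assuming Python's built-in sqlite3 module rather than Access; the sample rows are hypothetical. It also shows one thing worth checking against the row counts in the question: if an SSN appears in both claim tables, UNION ALL lists it twice and the join then repeats that person's documents, whereas plain UNION removes the duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claim_a (ssn TEXT);
CREATE TABLE claim_b (ssn TEXT);
CREATE TABLE documents (ssn TEXT, doc_type TEXT);
-- Hypothetical data: SSN 222 has a claim in both tables, SSN 444 in neither.
INSERT INTO claim_a VALUES ('111'), ('222');
INSERT INTO claim_b VALUES ('222'), ('333');
INSERT INTO documents VALUES
    ('111', 'letter'), ('222', 'form'), ('222', 'notice'), ('444', 'letter');
""")

# has_claim is 1 when the document's SSN exists in either claim table, else 0.
query = """
    SELECT d.ssn, d.doc_type, s.ssn IS NOT NULL AS has_claim
    FROM documents d
    LEFT JOIN ({ssn_list}) s ON d.ssn = s.ssn
    ORDER BY d.ssn, d.doc_type
"""

# UNION ALL keeps SSN 222 twice, so both of its documents come back twice.
with_union_all = conn.execute(query.format(
    ssn_list="SELECT ssn FROM claim_a UNION ALL SELECT ssn FROM claim_b")).fetchall()

# UNION de-duplicates the SSN list, so each document comes back once.
with_union = conn.execute(query.format(
    ssn_list="SELECT ssn FROM claim_a UNION SELECT ssn FROM claim_b")).fetchall()

print(len(with_union_all), len(with_union))
```

Because the join starts from documents, rows such as the one for SSN 444 are still returned even though no claim matches them.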

SQL: Building hierarchies and nesting queries on the same table

I am trying to build hierarchies by nesting queries on the same table in MS SQL Server 2014.
To give an example of what I am trying to achieve:
I have a table 'employees' with the following columns:
[ID],[First Name],[Last Name],[ReportsTo]
{1},{John},{Doe},{2}
{2},{Mary},{Miller},{NULL}
I am trying to build a statement, where I join the employees table with itself and where I build a hierarchy with the boss on top.
Expected Result:
[Employee],[Boss]
{Miller,Mary},{NULL}
{Doe, John},{Miller,Mary}
I apologize if this is a stupid question, but I have failed to create a working nested query.
Could you please help me with that?
Thank you very much in advance
Based on the intended results, it looks like what you essentially want is a list of employees. So let's start with that:
SELECT LastName, FirstName, ReportsTo FROM Employees
This gives you the list, so you now have the objects you're looking for. But you need to fill out more data. You want to follow ReportsTo and show data from the record to which that points as well. This would be done exactly as it would if the foreign key pointed to a different table. (The only difference from being the same table is that you must use table aliases in the query, since you're including the same table twice.)
So let's start by joining the table:
SELECT e.LastName, e.FirstName, e.ReportsTo
FROM Employees e
LEFT OUTER JOIN Employees b on e.ReportsTo = b.ID
The results should still be the same, but now you have more data to select from. So you can add the new columns to the SELECT clause:
SELECT
e.LastName AS EmployeeLastName,
e.FirstName AS EmployeeFirstName,
b.LastName AS BossLastName,
b.FirstName AS BossFirstName
FROM Employees e
LEFT OUTER JOIN Employees b on e.ReportsTo = b.ID
It's a join like any other; it just happens to be a join to the same table.
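To see the self-join produce the result the question expects, here is a minimal sketch using Python's built-in sqlite3 module in place of SQL Server, loaded with the two sample rows from the question. The snake_case column names are an assumption, and the `||` concatenation operator is SQLite syntax (T-SQL would use `+` or CONCAT).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    reports_to INTEGER
);
INSERT INTO employees VALUES (1, 'John', 'Doe', 2), (2, 'Mary', 'Miller', NULL);
""")

# e is the employee row, b is the boss row found by following reports_to.
# LEFT JOIN keeps employees without a boss; their boss column is NULL.
rows = conn.execute("""
    SELECT e.last_name || ', ' || e.first_name AS employee,
           b.last_name || ', ' || b.first_name AS boss
    FROM employees e
    LEFT JOIN employees b ON e.reports_to = b.id
    ORDER BY b.id IS NOT NULL, e.last_name  -- rows without a boss first
""").fetchall()

for employee, boss in rows:
    print(employee, "->", boss)
```

This resolves one level of the hierarchy only; each further level would need another join to the same table.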

SQL Aggregate function issues using DISTINCT function

I need help with a question on this assignment that I've been racking my brain over for almost two days now. I'm still pretty new to SQL and I'm just struggling.
I DO NOT WANT THE ANSWER!! I'm just looking for help getting going in the right direction.
Here is the question:
Write a SELECT statement that answers this question: Which customers have ordered more than one product? Return these columns:
The email address from the Customers table
The count of distinct products from the customer’s orders
Here is what I have so far:
SELECT Customers.CustomerID,
Count(DISTINCT ProductID) AS ProductsCount
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID
JOIN Products
ON Products.ProductID = OrderItems.ProductID
GROUP BY Customers.CustomerID,
Orders.CustomerID
But I keep getting this error:
Msg 4104, Level 16, State 1, Line 2
The multi-part identifier "OrderItems.ProductID" could not be bound.
The structure of the tables in play here is:
The Customer table has an Emailaddress and CustomerID column.
The Orders table has CustomerID and OrderID columns.
The Products table has ProductID column.
The OrderItems table has OrderID, ProductID, and Quantity columns.
Any help would be really appreciated!
Thanks!
Fix the syntax error by joining to the OrderItems table, as suggested. To find something that occurs more than once, use GROUP BY field1, field2, etc. with HAVING COUNT(field) > 1. You are almost there.
You are almost there :-)
You forgot to join OrderItems.
You don't need the table Products in this query (you don't want to see anything from that table; you get the product count from OrderItems).
To limit by aggregates (as by the number of products here) use HAVING.
To group by Orders.CustomerID is superfluous, as it equals Customers.CustomerID.
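The general pattern those points describe (join through OrderItems, group per customer, filter the groups with HAVING) can be sketched as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module instead of SQL Server; the snake_case names and the sample rows are hypothetical, and it is meant as an illustration of the pattern rather than the assignment's official answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email_address TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE order_items (order_id INTEGER, product_id INTEGER, quantity INTEGER);
-- Hypothetical data: customer 1 bought products 100 and 101 (100 twice),
-- customer 2 bought only product 100.
INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO order_items VALUES (10, 100, 1), (11, 101, 2), (11, 100, 1), (12, 100, 3);
""")

# WHERE filters rows before grouping; HAVING filters the groups afterwards,
# which is why the aggregate condition has to live in HAVING.
rows = conn.execute("""
    SELECT c.email_address, COUNT(DISTINCT oi.product_id) AS product_count
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    JOIN order_items oi ON oi.order_id = o.order_id
    GROUP BY c.customer_id, c.email_address
    HAVING COUNT(DISTINCT oi.product_id) > 1
""").fetchall()

print(rows)
```

COUNT(DISTINCT ...) matters here: customer 1 has three order lines but only two different products.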
Here is a tip for finding the answer:
First, find the orders that have more than one item in the OrderItems table by using a HAVING clause with the COUNT() aggregate and GROUP BY OrderID. This first step gives you the orders that have more than one item, along with the count of products in each order.
Next, join the first step's result with the Orders table to get the CustomerID.
Finally, join the second step's result with the Customers table to get the information for the customers who bought more than one product in a single order, along with the count.

Find matching records of two different tables in SQL Server

I have two tables. In one of them, a seller saves a record for a product he is selling; in the other, buyers save what they need to buy.
I need to get a list of user IDs (the uid field) from the buyers table that match a specific product in the sales table. This is what I have written:
select n.[uid]
from needs n
left join ads(getdate()) a
on n.mid=a.mid
and a.[year] between n.from_year and n.to_year
and a.price between n.from_price and n.to_price
and n.[uid]=a.[uid]
and a.pid=n.pid
I need to use a WHERE clause to eliminate the records that don't match. I think all of these conditions defined with ON should instead be defined in a WHERE clause, but a join needs at least one ON clause. Maybe I shouldn't join the two tables? What can I do?
There is an important difference between LEFT JOIN and JOIN, or more accurately between OUTER and INNER joins respectively.
Inner joins require that both sides of the join match. In other words, if you:
had a table representing People
had another table representing Automobiles
where each automobile had a PersonId
and joined these tables using ON with the PersonId
then LEFT (OUTER) JOIN would return all people, even those without automobiles, whereas INNER JOIN would return only the people with vehicles.
This article may help: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
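The People/Automobiles example can be run directly. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of SQL Server, with hypothetical sample rows. It also shows that a LEFT JOIN followed by a WHERE filter on the right-hand key returns the same rows as an INNER JOIN, so the unmatched rows in the question can be dropped simply by switching the join type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE automobiles (auto_id INTEGER PRIMARY KEY, person_id INTEGER, model TEXT);
-- Hypothetical data: Ann owns a car, Bob does not.
INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO automobiles VALUES (1, 1, 'Sedan');
""")

# INNER JOIN: only people with a matching automobile row.
inner = conn.execute("""
    SELECT p.name, a.model
    FROM people p
    JOIN automobiles a ON a.person_id = p.person_id
    ORDER BY p.person_id
""").fetchall()

# LEFT JOIN: every person; model is NULL where nothing matched.
left = conn.execute("""
    SELECT p.name, a.model
    FROM people p
    LEFT JOIN automobiles a ON a.person_id = p.person_id
    ORDER BY p.person_id
""").fetchall()

# LEFT JOIN plus a WHERE filter on the right side behaves like INNER JOIN.
left_filtered = conn.execute("""
    SELECT p.name, a.model
    FROM people p
    LEFT JOIN automobiles a ON a.person_id = p.person_id
    WHERE a.person_id IS NOT NULL
    ORDER BY p.person_id
""").fetchall()

print(inner, left, left_filtered)
```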

Using Full-Text Search in SQL Server 2008 across multiple tables, columns

I need to search across multiple columns from two tables in my database using Full-Text Search. The two tables in question have the relevant columns full-text indexed.
The reason I'm opting for Full-text search:
1. To be able to search accented words easily (cafè)
2. To be able to rank according to word proximity, etc.
3. "Did you mean XXX?" functionality
Here is a dummy table structure, to illustrate the challenge:
Table Book
BookID
Name (Full-text indexed)
Notes (Full-text indexed)
Table Shelf
ShelfID
BookID
Table ShelfAuthor
AuthorID
ShelfID
Table Author
AuthorID
Name (Full-text indexed)
I need to search across Book Name, Book Notes and Author Name.
I know of two ways to accomplish this:
Using a Full-text Indexed View: This would have been my preferred method, but I can't do this because for a view to be full-text indexed, it needs to be schema-bound, have no outer joins, and have a unique index. The view I will need to get my data does not satisfy these constraints (it contains many other joined tables I need to get data from).
Using joins in a stored procedure: The problem with this approach is that I need to have the results sorted by rank. If I am making multiple joins across the tables, SQL Server won't search across multiple fields by default. I can combine two individual CONTAINS queries on the two linked tables, but I don't know of a way to extract the combined rank from the two search queries. For example, if I search for 'Arthur', the results of both the Book query and the Author query should be taken into account and weighted accordingly.
Using FREETEXTTABLE, you just need to design some algorithm to calculate the merged rank on each joined table result. The example below skews the result towards hits from the book table.
-- ISNULL guards against a NULL rank when the LEFT JOIN finds no author hit
SELECT b.Name, a.Name, bkt.[Rank] + ISNULL(akt.[Rank], 0)/2 AS [Rank]
FROM Book b
INNER JOIN Author a ON b.AuthorID = a.AuthorID
INNER JOIN FREETEXTTABLE(Book, Name, @criteria) bkt ON b.ContentID = bkt.[Key]
LEFT JOIN FREETEXTTABLE(Author, Name, @criteria) akt ON a.AuthorID = akt.[Key]
ORDER BY [Rank] DESC
Note that I simplified your schema for this example.
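The rank-merging idea can be tried without a full-text index. The sketch below assumes Python's built-in sqlite3 module; since FREETEXTTABLE does not exist in SQLite, two hypothetical tables (book_hits and author_hits, with invented rank values) stand in for its key/rank output. As a variation on the query above, both hit tables are LEFT JOINed so that a book matched only through its author still appears, and COALESCE keeps a missing rank from turning the sum into NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (book_id INTEGER PRIMARY KEY, name TEXT, author_id INTEGER);
-- Hypothetical stand-ins for the (key, rank) rows a full-text search returns.
CREATE TABLE book_hits (hit_key INTEGER, hit_rank INTEGER);
CREATE TABLE author_hits (hit_key INTEGER, hit_rank INTEGER);
INSERT INTO book VALUES (1, 'Arthur Rex', 10), (2, 'Camelot', 11), (3, 'Cooking', 12);
INSERT INTO book_hits VALUES (1, 80);     -- the book title matched
INSERT INTO author_hits VALUES (11, 60);  -- the author of 'Camelot' matched
""")

# Author hits count half as much as book hits, mirroring the skew described above.
rows = conn.execute("""
    SELECT b.name,
           COALESCE(bh.hit_rank, 0) + COALESCE(ah.hit_rank, 0) / 2.0 AS merged_rank
    FROM book b
    LEFT JOIN book_hits bh ON bh.hit_key = b.book_id
    LEFT JOIN author_hits ah ON ah.hit_key = b.author_id
    WHERE bh.hit_key IS NOT NULL OR ah.hit_key IS NOT NULL  -- at least one hit
    ORDER BY merged_rank DESC
""").fetchall()

print(rows)
```

The 1 : 0.5 weighting is arbitrary; the point is only that the two ranks are combined in one expression that the ORDER BY can use.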
I had the same problem as you but it actually involved 10 tables (a Users table and several others for information)
I did my first query using FREETEXT in the WHERE clause for each table but the query was taking far too long.
I then saw several replies about using FREETEXTTABLE instead and checking for non-null values in the key column for each table, but that also took too long to execute.
I fixed it by using a combination of FREETEXTTABLE and UNION selects:
SELECT Users.* FROM Users INNER JOIN
(SELECT Users.UserId FROM Users INNER JOIN FREETEXTTABLE(Users, (column1, column2), @variableWithSearchTerm) UsersFT ON Users.UserId = UsersFT.[key]
UNION
SELECT Table1.UserId FROM Table1 INNER JOIN FREETEXTTABLE(Table1, TextColumn, @variableWithSearchTerm) Table1FT ON Table1.UserId = Table1FT.[key]
UNION
SELECT Table2.UserId FROM Table2 INNER JOIN FREETEXTTABLE(Table2, TextColumn, @variableWithSearchTerm) Table2FT ON Table2.UserId = Table2FT.[key]
... --same for all tables
) fts ON Users.UserId = fts.UserId
This proved to be dramatically faster.
I hope it helps.
I don't think the accepted answer will solve the problem. If you try to find all the books by a certain author and therefore use the author's name (or part of it) as the search criteria, the only books returned by the query will be those that contain the search criteria in their own names.
The only way I see around this problem is to replicate the Author's columns that you wish to search by in the Book table and index those columns (or column since it would probably be smart to store the author's relevant information in an XML column in the Book table).
FWIW, in a similar situation our DBA created DML triggers to maintain a dedicated full-text search table. It was not possible to use a materialized view because of its many restrictions.
I would use a stored procedure. The full-text function returns a rank that you can sort by. I am not sure how the ranks will be weighted against each other, but I'm sure you could tinker for a while and figure it out. For example:
Select SearchResults.[key], SearchResults.[rank] From FREETEXTTABLE(myTable, *, @searchString) as SearchResults Order By SearchResults.[rank] Desc
This answer is well overdue, but one way to do this if you cannot modify the primary tables is to create a new table with the searchable fields combined into one column.
Then create a full text index on that column and query that column.
Example
SELECT
FT_TBL.[EANHotelID] AS HotelID,
ISNULL(FT_TBL.[Name],'-') AS HotelName,
ISNULL(FT_TBL.[Address1],'-') AS HotelAddress,
ISNULL(FT_TBL.[City],'-') AS HotelCity,
ISNULL(FT_TBL.[StateProvince],'-') AS HotelCountyState,
ISNULL(FT_TBL.[PostalCode],'-') AS HotelPostZipCode,
ISNULL(FT_TBL.[Latitude],0.00) AS HotelLatitude,
ISNULL(FT_TBL.[Longitude],0.00) AS HotelLongitude,
ISNULL(FT_TBL.[CheckInTime],'-') AS HotelCheckinTime,
ISNULL(FT_TBL.[CheckOutTime],'-') AS HotelCheckOutTime,
ISNULL(b.[CountryName],'-') AS HotelCountry,
ISNULL(c.PropertyDescription,'-') AS HotelDescription,
KEY_TBL.RANK
FROM [EAN].[dbo].[tblactivepropertylist] AS FT_TBL INNER JOIN
CONTAINSTABLE ([EAN].[dbo].[tblEanFullTextSearch], FullTextSearchColumn, @s)
AS KEY_TBL
ON FT_TBL.EANHotelID = KEY_TBL.[KEY]
INNER JOIN [EAN].[dbo].[tblCountrylist] b
ON FT_TBL.Country = b.CountryCode
INNER JOIN [EAN].[dbo].[tblPropertyDescriptionList] c
ON FT_TBL.[EANHotelID] = c.EANHotelID
In the code above, [EAN].[dbo].[tblEanFullTextSearch] and FullTextSearchColumn are the new table and column containing the combined fields. You can now query the new table and join it to the tables you want to display data from.
Hope this helps

Resources