Using INNER JOIN to Find Similar DATA-My MyProblem:Repeating some results - sql-server

I am using this query to find similar results in two tables, however some of the data repeat themselves, and I have no idea what might the problem be.
Could you please check this query, to see if there is something I did wrong or not.
All sorts of join(inner,left,right and join alone! returned the same results)
select dbo.netss.[CODE]
,dbo.netss.[NUM]
,dbo.netss.[state]
,dbo.netss.[county]
,dbo.netss.[zone]
,dbo.netss.[Mvillage]
,dbo.netss.[Village]
,dbo.netss.[operator]
,dbo.P1.*
from dbo.P1
inner join
dbo.netss
on (dbo.netss.[state]=dbo.p1.[state])
where dbo.P1.Name=dbo.netss.Village

Joining on [state] is probably not what you want. If the values in the [state] column are not unique in both tables, then you will get multiple matches from dbo.p1 to dbo.netss and possibly from dbo.netss to dbo.p1 also. Add more columns to your join list as follows, for example:
on dbo.netss.[state] = dbo.p1.[state] and dbo.netss.[county] = dbo.p1.[county] and dbo.netss.[CODE] = dbo.netss.[CODE]
In other words, join on the combination of columns that make a row unique and that are also in the other table. This will define a unique key in one table that can be a foreign key to the other table.

--You Can try this
select DISTINCT
dbo.netss.[CODE]
,dbo.netss.[NUM]
,dbo.netss.[state]
,dbo.netss.[county]
,dbo.netss.[zone]
,dbo.netss.[Mvillage]
,dbo.netss.[Village]
,dbo.netss.[operator]
,dbo.P1.*
from dbo.P1
inner join
dbo.netss
on (dbo.netss.[state]=dbo.p1.[state])
where dbo.P1.Name=dbo.netss.Village
----the DISTINCT keywords do eliminate duplicate

Related

How to create a VIEW by join multiple tables having multiple columns using SQL Server

I am creating a view for a table and joining multiple tables for additional columns, getting exact output for some join tables which has only two columns(ID, Name) in it. But i am getting 'Ambiguous column error' for a column which i am using one join table which has more than 10 columns in it.
I know why I am getting this error but i need help to call the EventName from EventInfo join table. Also is it possible to give different column names in the view.
Below are the select statement and the join table that I am getting error.
Thanks in advance for your suggestions.
SELECT
Event_ID, EventName
FROM
[dbo].[tbl_ValueTool_Event_ResultantDataEntryData] Res_event
JOIN
[GRID_ProjectServer_Applications].[dbo].[tbl_ValueTool_EventInformation] EVT ON Res_event.[Event_ID] = EVT.[Event_ID]
Below is the join table from which I want to call the column names corresponding to the Id's
You can use alias for the column names if needed.
SELECT EVT.Event_ID, EVT.EventName
FROM [dbo].[tbl_ValueTool_Event_ResultantDataEntryData] Res_event
JOIN [GRID_ProjectServer_Applications].[dbo].[tbl_ValueTool_Event‌​Information] EVT
ON Res_event.[Event_ID] = EVT.[Event_ID]

SQL Left Outer Join?

I have table that should joint to another table based on the unique id. If I do LEFT OUTER JOIN ON I will have duplicates. If I put DISTINCT in my SELECT I will get correct number of records. Then if I include any field from the table that I did LEFT OUTER JOIN in that case I'm getting duplicates again. Here is my query:
SELECT DISTINCT
Table1.fname,
Table1.lname,
Table2.address
FROM Table1
LEFT OUTER JOIN Table2
ON Table2.user_id = Table1.userid
In the example above I'm getting duplicates, also I have tried to do:
LEFT OUTER JOIN (
SELECT user_id
FROM Table2
GROUP BY user_id
) AS t2 ON Table1.user_id = t2.user_id
This gave me correct number of records but I need some additional columns from that second table, after I include extra columns I'm getting duplicates again, example:
LEFT OUTER JOIN (
SELECT user_id, address
FROM Table2
GROUP BY user_id, address
) AS t2 ON Table1.user_id = t2.user_id
I'm wondering if I missed something or there is better way to handle this type of problem. If anyone see something or know better solution please let me know.
It is impossible for you to pick the correct answer here without understanding your data.
It seems that Table2 supports multiple addresses per user_id. This is a common design. If you want to return only one address per user_id you have several options:
Fix the data - Remove the duplicate addresses from table 2 and add a constraint that prevents this situation again. You will need to determine which addresses are incorrect.
Reduce the left join to only include one address per user - How you do this will depend on your other data. You could use min() or max() with a group by if you don't care which one to return where there are multiples or you will need to perhaps order by an effective date and take the latest one - or maybe there are billing and shipping addresses and you should pick the correct one.
Accept that there are multiple addresses per user - this may be correct - and adjust the rest of your code.

SQL WHERE NOT EXISTS (skip duplicates)

Hello I'm struggling to get the query below right. What I want is to return rows with unique names and surnames. What I get is all rows with duplicates
This is my sql
DECLARE #tmp AS TABLE (Name VARCHAR(100), Surname VARCHAR(100))
INSERT INTO #tmp
SELECT CustomerName,CustomerSurname FROM Customers
WHERE
NOT EXISTS
(SELECT Name,Surname
FROM #tmp
WHERE Name=CustomerName
AND ID Surname=CustomerSurname
GROUP BY Name,Surname )
Please can someone point me in the right direction here.
//Desperate (I tried without GROUP BY as well but get same result)
DISTINCT would do the trick.
SELECT DISTINCT CustomerName, CustomerSurname
FROM Customers
Demo
If you only want the records that really don't have duplicates (as opposed to getting duplicates represented as a single record) you could use GROUP BY and HAVING:
SELECT CustomerName, CustomerSurname
FROM Customers
GROUP BY CustomerName, CustomerSurname
HAVING COUNT(*) = 1
Demo
First, I thought that #David answer is what you want. But rereading your comments, perhaps you want all combinations of Names and Surnames:
SELECT n.CustomerName, s.CustomerSurname
FROM
( SELECT DISTINCT CustomerName
FROM Customers
) AS n
CROSS JOIN
( SELECT DISTINCT CustomerSurname
FROM Customers
) AS s ;
Are you doing that while your #Tmp table is still empty?
If so: your entire "select" is fully evaluated before the "insert" statement, it doesn't do "run the query and add one row, insert the row, run the query and get another row, insert the row, etc."
If you want to insert unique Customers only, use that same "Customer" table in your not exists clause
SELECT c.CustomerName,c.CustomerSurname FROM Customers c
WHERE
NOT EXISTS
(SELECT 1
FROM Customers c1
WHERE c.CustomerName = c1.CustomerName
AND c.CustomerSurname = c1.CustomerSurname
AND c.Id <> c1.Id)
If you want to insert a unique set of customers, use "distinct"
Typically, if you're doing a WHERE NOT EXISTS or WHERE EXISTS, or WHERE NOT IN subquery,
you should use what is called a "correlated subquery", as in ypercube's answer above, where table aliases are used for both inside and outside tables (where inside table is joined to outside table). ypercube gave a good example.
And often, NOT EXISTS is preferred over NOT IN (unless the WHERE NOT IN is selecting from a totally unrelated table that you can't join on.)
Sometimes if you're tempted to do a WHERE EXISTS (SELECT from a small table with no duplicate values in column), you could also do the same thing by joining the main query with that table on the column you want in the EXISTS. Not always the best or safest solution, might make query slower if there are many rows in that table and could cause many duplicate rows if there are dup values for that column in the joined table -- in which case you'd have to add DISTINCT to the main query, which causes it to SORT the data on all columns.
-- Not efficient at all.
And, similarly, the WHERE NOT IN or NOT EXISTS correlated subqueries can be accomplished (and give the exact same execution plan) if you LEFT OUTER JOIN the table you were going to subquery -- and add a WHERE . IS NULL.
You have to be careful using that, but you don't need a DISTINCT. Frankly, I prefer to use the WHERE NOT IN subqueries or NOT EXISTS correlated subqueries, because the syntax makes the intention clear and it's hard to go wrong.
And you do not need a DISTINCT in the SELECT inside such subqueries (correlated or not). It would be a waste of processing (and for WHERE EXISTS or WHERE IN subqueries, the SQL optimizer would ignore it anyway and just use the first value that matched for each row in the outer query). (Hope that makes sense.)

SQL Same Column in one row

I have a lookup table that has a Name and an ID in it. Example:
ID NAME
-----------------------------------------------------------
5499EFC9-925C-4856-A8DC-ACDBB9D0035E CANCELLED
D1E31B18-1A98-4E1A-90DA-E6A3684AD5B0 31PR
The first record indicates and order status. The next indicates a service type.
In a query from an orders table I do the following:
INNER JOIN order.id = lut.Statusid
This returns the 'cancelled' name from my lookup table. I also need the service type in the same row. This is connected in the order table by the orders.serviceid How would I go about doing this?
It Cancelled doesnt connect to 31PR.
Orders connects to both. Orders has 2 fields in it called Servicetypeid and orderstatusid. That is how those 2 connect to the order. I need to return both names in the same order row.
I think many will tell you that having two different pieces of data in the same column violates first normal form. There is a reason why having one lookup table to rule them all is a bad idea. However, you can do something like the following:
Select ..
From order
Join lut
On lut.Id = order.StatusId
Left Join lut As l2
On l2.id = order.ServiceTypeId
If order.ServiceTypeId (or whatever the column is named) is not nullable, then you can use a Join (inner join) instead.
A lot of info left out, but here it goes:
SELECT orders.id, lut1.Name AS OrderStatus, lut2.Name AS ServiceType
FROM orders
INNER JOIN lut lut1 ON order.id = lut.Statusid
INNER JOIN lut lut2 ON order.serviceid = lut.Statusid

Using Full-Text Search in SQL Server 2008 across multiple tables, columns

I need to search across multiple columns from two tables in my database using Full-Text Search. The two tables in question have the relevant columns full-text indexed.
The reason I'm opting for Full-text search:
1. To be able to search accented words easily (cafè)
2. To be able to rank according to word proximity, etc.
3. "Did you mean XXX?" functionality
Here is a dummy table structure, to illustrate the challenge:
Table Book
BookID
Name (Full-text indexed)
Notes (Full-text indexed)
Table Shelf
ShelfID
BookID
Table ShelfAuthor
AuthorID
ShelfID
Table Author
AuthorID
Name (Full-text indexed)
I need to search across Book Name, Book Notes and Author Name.
I know of two ways to accomplish this:
Using a Full-text Indexed View: This would have been my preferred method, but I can't do this because for a view to be full-text indexed, it needs to be schemabound, not have any outer joins, have a unique index. The view I will need to get my data does not satisfy these constraints (it contains many other joined tables I need to get data from).
Using joins in a stored procedure: The problem with this approach is that I need to have the results sorted by rank. If I am making multiple joins across the tables, SQL Server won't search across multiple fields by default. I can combine two individual CONTAINS queries on the two linked tables, but I don't know of a way to extract the combined rank from the two search queries. For example, if I search for 'Arthur', the results of both the Book query and the Author query should be taken into account and weighted accordingly.
Using FREETEXTTABLE, you just need to design some algorithm to calculate the merged rank on each joined table result. The example below skews the result towards hits from the book table.
SELECT b.Name, a.Name, bkt.[Rank] + akt.[Rank]/2 AS [Rank]
FROM Book b
INNER JOIN Author a ON b.AuthorID = a.AuthorID
INNER JOIN FREETEXTTABLE(Book, Name, #criteria) bkt ON b.ContentID = bkt.[Key]
LEFT JOIN FREETEXTTABLE(Author, Name, #criteria) akt ON a.AuthorID = akt.[Key]
ORDER BY [Rank] DESC
Note that I simplified your schema for this example.
I had the same problem as you but it actually involved 10 tables (a Users table and several others for information)
I did my first query using FREETEXT in the WHERE clause for each table but the query was taking far too long.
I then saw several replies about using FREETEXTTABLE instead and checking for not nulls values in the key column for each table, but that took also to long to execute.
I fixed it by using a combination of FREETEXTTABLE and UNION selects:
SELECT Users.* FROM Users INNER JOIN
(SELECT Users.UserId FROM Users INNER JOIN FREETEXTTABLE(Users, (column1, column2), #variableWithSearchTerm) UsersFT ON Users.UserId = UsersFT.key
UNION
SELECT Table1.UserId FROM Table1 INNER JOIN FREETEXTTABLE(Table1, TextColumn, #variableWithSearchTerm) Table1FT ON Table1.UserId = Table1FT.key
UNION
SELECT Table2.UserId FROM Table2 INNER JOIN FREETEXTTABLE(Table2, TextColumn, #variableWithSearchTerm) Table2FT ON Table2.UserId = Table2FT.key
... --same for all tables
) fts ON Users.UserId = fts.UserId
This proved to be incredibly much faster.
I hope it helps.
I don't think the accepted answer will solve the problem. If you try to find all the books from a certain author and, therefore, use the author's name (or part of it) as the search criteria, the only books returned by the query will be those which have the search criteria in its own name.
The only way I see around this problem is to replicate the Author's columns that you wish to search by in the Book table and index those columns (or column since it would probably be smart to store the author's relevant information in an XML column in the Book table).
FWIW, in a similar situation our DBA created DML triggers to maintain a dedicated full-text search table. It was not possible to use a materialized view because of its many restrictions.
I would use a stored procedure. The full text method or whatever returns a rank which you can sort by. I am not sure how they will be weighted against eachother, but I'm sure you could tinker for awhile and figure it out. For example:
Select SearchResults.key, SearchResults.rank From FREETEXTTABLE(myColumn, *, #searchString) as SearchResults Order By SearchResults.rank Desc
This answer is well overdue, but one way to do this if you cannot modify primary tables is to create a new table with the search parameters added to one column.
Then create a full text index on that column and query that column.
Example
SELECT
FT_TBL.[EANHotelID] AS HotelID,
ISNULL(FT_TBL.[Name],'-') AS HotelName,
ISNULL(FT_TBL.[Address1],'-') AS HotelAddress,
ISNULL(FT_TBL.[City],'-') AS HotelCity,
ISNULL(FT_TBL.[StateProvince],'-') AS HotelCountyState,
ISNULL(FT_TBL.[PostalCode],'-') AS HotelPostZipCode,
ISNULL(FT_TBL.[Latitude],0.00) AS HotelLatitude,
ISNULL(FT_TBL.[Longitude],0.00) AS HotelLongitude,
ISNULL(FT_TBL.[CheckInTime],'-') AS HotelCheckinTime,
ISNULL(FT_TBL.[CheckOutTime],'-') AS HotelCheckOutTime,
ISNULL(b.[CountryName],'-') AS HotelCountry,
ISNULL(c.PropertyDescription,'-') AS HotelDescription,
KEY_TBL.RANK
FROM [EAN].[dbo].[tblactivepropertylist] AS FT_TBL INNER JOIN
CONTAINSTABLE ([EAN].[dbo].[tblEanFullTextSearch], FullTextSearchColumn, #s)
AS KEY_TBL
ON FT_TBL.EANHotelID = KEY_TBL.[KEY]
INNER JOIN [EAN].[dbo].[tblCountrylist] b
ON FT_TBL.Country = b.CountryCode
INNER JOIN [EAN].[dbo].[tblPropertyDescriptionList] c
ON FT_TBL.[EANHotelID] = c.EANHotelID
In the code above [EAN].[dbo].[tblEanFullTextSearch], FullTextSearchColumn is the new table and column with the fields added, you can now do a query on the new table with joins to the table you want to display the data from.
Hope this helps

Resources