The column "xxxx" was specified multiple times for "CTE_YYYY" - sql-server

I am new to Microsoft SQL Server. I am trying to join two tables that have a common key named CampaignID using LEFT OUTER JOIN. I need to reuse the result in a different query, so I decided to capture the result set in a CTE named CTE_Results. For example,
-- This is my CTE script
WITH CTE_Results AS
(
SELECT t1.CampaignID, t2.CampaignID, t1.Name, t2.Vendor
FROM CampaignDetails AS t1
LEFT OUTER JOIN CampaignOnlineDetails AS t2
ON t1.CampaignID = t2.CampaignID
)
-- This is the script I want to use to compare the resulting table. For example,
SELECT Vendor
FROM CTE_Results
However, when I run the above, I get:
The column `CampaignID` was specified multiple times for `CTE_Results`.
From reading through old StackOverflow questions and answers, it seems that because CampaignID is in both of the joined tables, I must use table aliases to specify which table's CampaignID I want to SELECT. I think I did that, yet the error still occurs.
Is there a way for me to select and keep BOTH CampaignIDs in my CTE? If so, what should be changed? Thank you for the answers!

You have selected CampaignID twice in the CTE, so the CTE ends up with two columns of the same name. Give each one a different alias to fix the problem:
WITH CTE_Results
AS (SELECT t1.CampaignID AS cd_CampaignID,
t2.CampaignID AS cod_CampaignID,
t1.NAME,
t2.Vendor
FROM CampaignDetails AS t1
LEFT OUTER JOIN CampaignOnlineDetails AS t2
ON t1.CampaignID = t2.CampaignID)
-- This is the script I want to use to compare the resulting table. For example,
SELECT Vendor
FROM CTE_Results
Or name the columns in the CTE declaration instead:
WITH CTE_Results(cd_CampaignID, cod_CampaignID, NAME, Vendor)
AS (SELECT t1.CampaignID,
t2.CampaignID,
t1.NAME,
t2.Vendor
FROM CampaignDetails AS t1
LEFT OUTER JOIN CampaignOnlineDetails AS t2
ON t1.CampaignID = t2.CampaignID)
-- This is the script I want to use to compare the resulting table. For example,
SELECT Vendor
FROM CTE_Results

You need to alias the CampaignID columns in your CTE, or define the returned column names in the CTE declaration. Otherwise it would be like creating a table with two columns that have the same name.
Example with column aliases:
WITH CTE_Results AS
(
SELECT t1.CampaignID as 'CampaignID1', t2.CampaignID as 'CampaignID2', t1.Name, t2.Vendor
FROM CampaignDetails AS t1
LEFT OUTER JOIN CampaignOnlineDetails AS t2
ON t1.CampaignID = t2.CampaignID
)
Or in the CTE declaration:
WITH CTE_Results (CampaignID1, CampaignID2, [Name], Vendor) AS
(
SELECT t1.CampaignID, t2.CampaignID , t1.Name, t2.Vendor
FROM CampaignDetails AS t1
LEFT OUTER JOIN CampaignOnlineDetails AS t2
ON t1.CampaignID = t2.CampaignID
)
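To see what keeping both IDs looks like in practice, here is a minimal runnable sketch of the column-list form. It uses Python's built-in sqlite3 module purely as a stand-in engine, and the two sample campaigns and the vendor name are invented for illustration. SQLite does not raise SQL Server's duplicate-column error, so this only demonstrates the fixed query, not the failure.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CampaignDetails (CampaignID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE CampaignOnlineDetails (CampaignID INTEGER, Vendor TEXT);
    INSERT INTO CampaignDetails VALUES (1, 'Spring'), (2, 'Summer');
    INSERT INTO CampaignOnlineDetails VALUES (1, 'Acme');
""")

# Each CTE column gets a unique name via the column list in the declaration.
query = """
WITH CTE_Results (CampaignID1, CampaignID2, Name, Vendor) AS
(
    SELECT t1.CampaignID, t2.CampaignID, t1.Name, t2.Vendor
    FROM CampaignDetails AS t1
    LEFT OUTER JOIN CampaignOnlineDetails AS t2
        ON t1.CampaignID = t2.CampaignID
)
SELECT CampaignID1, CampaignID2, Name, Vendor
FROM CTE_Results
ORDER BY CampaignID1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this sample, campaign 2 has no online details, so its CampaignID2 and Vendor come back as NULL (None in Python). That is one reason to keep both ID columns: the second one shows whether the LEFT OUTER JOIN found a match.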

Related

SQL get counts using subqueries from multiple linked tables

Suppose I have tables 1-4, where all the other tables are linked to table1. For what it's worth, table1, table2 and table3 are relatively small, but table4 contains a lot of data.
Now I have the following query:
SELECT t1.id
, (SELECT COUNT(*) FROM table2 WHERE table1_id = t1.id) AS t2_count
, (SELECT COUNT(*) FROM table3 WHERE table1_id = t1.id) AS t3_count
, (SELECT COUNT(*) FROM table4 WHERE table1_id = t1.id) AS t4_count
FROM table1 t1
Due to the fact that the subqueries are dependent/correlated I assumed that there must be a better way (performance wise) to get the data.
I tried the following, but it drastically increased the execution time (from about 2s to 35s). I'm guessing that the multiple left joins create a very big data set?!
SELECT t1.id
, COUNT(t2.id) AS t2_count
, COUNT(t3.id) AS t3_count
, COUNT(t4.id) AS t4_count
FROM table1 t1
LEFT JOIN table2 t2 ON t2.table1_id = t1.id
LEFT JOIN table3 t3 ON t3.table1_id = t1.id
LEFT JOIN table4 t4 ON t4.table1_id = t1.id
GROUP BY t1.id
Is there a better way to get the counts? I don't need the data from the other tables.
UPDATE:
Bart's answer got me thinking that the table1_id columns are nullable. I added an IS NOT NULL check to the WHERE clauses and this brought the time down to 1s.
SELECT t1.id
, (SELECT COUNT(*) FROM table2 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t2_count
, (SELECT COUNT(*) FROM table3 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t3_count
, (SELECT COUNT(*) FROM table4 WHERE table1_id IS NOT NULL AND table1_id = t1.id) AS t4_count
FROM table1 t1
I guess not. If you execute a SELECT COUNT(*) FROM [table], it should perform a count on the table's PK. That should be pretty fast, even for very large tables.
Is your table4 a real table (and not a view, or a table-valued function, or something else that looks like a table)? And does it have a primary key? If so, I don't think that the performance of a SELECT COUNT(*) FROM [table4] query can be improved significantly.
It may also be the case, that your table4 is heavily targeted (in concurrent transactions over multiple connections), or perhaps your SQL Server is doing some heavy IO or computations. I cannot assume anything about that, however. You may try to check if your query is also slow on a restored database backup on a physically separate test server.
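The asker's guess about the triple LEFT JOIN can be checked on a toy data set. The sketch below uses Python's sqlite3 module as a stand-in engine, with invented rows, and compares the join-based query from the question against a variant that aggregates each child table before joining. That pre-aggregated variant is not from the thread; it is shown only as one common way to avoid the row multiplication, and whether it would be faster on the asker's data is untested.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    CREATE TABLE table4 (id INTEGER PRIMARY KEY, table1_id INTEGER);
    INSERT INTO table1 (id) VALUES (1), (2);
    INSERT INTO table2 (table1_id) VALUES (1), (1);
    INSERT INTO table3 (table1_id) VALUES (1), (1), (1);
    INSERT INTO table4 (table1_id) VALUES (1), (2);
""")

# The join-based query from the question: the child rows multiply each other.
joined_sql = """
SELECT t1.id, COUNT(t2.id), COUNT(t3.id), COUNT(t4.id)
FROM table1 t1
LEFT JOIN table2 t2 ON t2.table1_id = t1.id
LEFT JOIN table3 t3 ON t3.table1_id = t1.id
LEFT JOIN table4 t4 ON t4.table1_id = t1.id
GROUP BY t1.id
ORDER BY t1.id
"""

# Hypothetical alternative: count each child table first, then join the counts.
preagg_sql = """
SELECT t1.id,
       COALESCE(c2.n, 0) AS t2_count,
       COALESCE(c3.n, 0) AS t3_count,
       COALESCE(c4.n, 0) AS t4_count
FROM table1 t1
LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table2 GROUP BY table1_id) c2
       ON c2.table1_id = t1.id
LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table3 GROUP BY table1_id) c3
       ON c3.table1_id = t1.id
LEFT JOIN (SELECT table1_id, COUNT(*) AS n FROM table4 GROUP BY table1_id) c4
       ON c4.table1_id = t1.id
ORDER BY t1.id
"""

joined = conn.execute(joined_sql).fetchall()
preagg = conn.execute(preagg_sql).fetchall()
print("joined:", joined)
print("pre-aggregated:", preagg)
```

For id 1 the joined query produces 2 x 3 x 1 = 6 intermediate rows, and every COUNT reports 6. So the join version is not only larger, it also returns different numbers from the correlated subqueries whenever more than one child table has several rows per parent.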

Convert T-SQL Cross Apply to Oracle

I'm looking to convert this SQL Server (T-SQL) query that uses a cross apply to Oracle 11g. Oracle does not support CROSS APPLY until 12c, so I have to find a workaround. The idea behind the query is that for each row where Tab.Name = 'Foobar', I need to find the previous row's name with the same ID, ordered by Tab.Date. (This table contains multiple rows for one ID, with different Name and Date values.)
Here is the T-SQL code:
SELECT DISTINCT t1.ID,
t1.Name,
t1.Date,
t2.Date as 'PreviousDate',
t2.Name as 'PreviousName'
FROM Tab t1
OUTER apply (SELECT TOP 1 t2.Date,
t2.Name
FROM Tab t2
WHERE t1.Id = t2.Id
ORDER BY t2.Date DESC) t2
WHERE t1.Name = 'Foobar'
Technically, I was able to recreate this same functionality in Oracle using LEFT JOIN and LAG() function:
SELECT DISTINCT t1.ID,
t1.Name,
t1.Date,
t2.PreviousDate as PreviousDate,
t2.PreviousName as PreviousName
FROM Tab t1
LEFT JOIN (
SELECT ID,
LAG(Name) OVER (PARTITION BY ID ORDER BY Date) as PreviousName,
LAG(Date) OVER (PARTITION BY ID ORDER BY Date) as PreviousDate
FROM Tab) t2 ON t2.ID = t1.ID
WHERE t1.Name = 'Foobar'
The issue is the order it executes the Oracle query. It will pull back ALL rows from Tab, order them (because of the LAG function), then it will filter them down using the ON statement when it joins it to the main query. That table has millions of records, so doing that for EACH ID is not feasible. Basically, I want to change the order of operations in the sub-query to just pull back rows for a single ID, sort those rows to find the previous, and join that. Any ideas on how to tweak it?
TL;DR
SQL Server: filters, orders, joins
Oracle: orders, filters, joins
You can look for the latest row per (id) group with row_number():
select *
from tab t1
left join
(
select row_number() over (
partition by id
order by Date desc) as rn
, t.*
from tab t
) t2
on t1.id = t2.id
and t2.rn = 1 -- Latest row per id
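Another option, not proposed in the thread, is to drop the self-join entirely: compute LAG() once over the table, carry the current row's Name alongside it, and filter on 'Foobar' in the outer query. Here is a minimal sketch, using Python's sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for Oracle, with invented rows. It still sorts every partition, so it removes the row-multiplying join on ID rather than the sort itself, and how Oracle would plan it on millions of rows is not verified here.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tab (ID INTEGER, Name TEXT, Date TEXT);
    INSERT INTO Tab VALUES
        (1, 'Alpha',  '2020-01-01'),
        (1, 'Foobar', '2020-02-01'),
        (1, 'Beta',   '2020-03-01'),
        (2, 'Foobar', '2020-01-15'),
        (2, 'Gamma',  '2020-02-15'),
        (2, 'Foobar', '2020-03-15');
""")

# LAG() runs over all rows first; the Name filter is applied afterwards,
# so the "previous" row may have any Name.
query = """
SELECT ID, Name, Date, PreviousDate, PreviousName
FROM (
    SELECT ID, Name, Date,
           LAG(Date) OVER (PARTITION BY ID ORDER BY Date) AS PreviousDate,
           LAG(Name) OVER (PARTITION BY ID ORDER BY Date) AS PreviousName
    FROM Tab
) t
WHERE Name = 'Foobar'
ORDER BY ID, Date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The filter has to stay in the outer query. Moving WHERE Name = 'Foobar' inside the subquery would make LAG() see only 'Foobar' rows and return the previous 'Foobar' row instead of the previous row.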

SQL Server - Invalid object name while joining table to itself

I've tried to find an answer to my problem but couldn't find similar example.
I have results from such a query
SELECT * FROM (
SELECT id FROM table
) AS t1
Now I would like to join t1 to another instance of itself because I need to shift it. For example if I wanted to compare a row with the previous one. I tried:
SELECT * FROM (
SELECT id FROM table
) AS t1
LEFT JOIN t1 AS t2 ON (my conditions)
But I get an error that t1 is invalid object name. When I copy my select statement:
SELECT * FROM (
SELECT id FROM table
) AS t1
LEFT JOIN (
SELECT id FROM table
) AS t2 ON (my conditions)
The above works, but is it not slower than joining to already returned results?
Any help would be appreciated
The first one is incorrect:
SELECT * FROM (
SELECT id FROM table
) AS t1
LEFT JOIN t1 AS t2 ON (my conditions)
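This fails because you can't alias an alias: t1 names a derived table, and it cannot be referenced again in the same FROM clause as if it were a table. You can do something similar with a CTE, like so: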
;WITH cte
AS
(
SELECT * FROM Table
)
SELECT *
FROM Cte t1
INNER JOIN cte t2 ON --
I think your select should be of the form:
SELECT *
FROM [table] t1
LEFT JOIN [table] t2 ON (your conditions)
From a performance perspective, this is identical to your last select and to the CTE solution in Mahmoud's answer (I've reviewed the execution plan for all three in SQL Server).
It might only be a matter of taste, but I find this form to be more readable/maintainable.
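As a concrete version of the "compare a row with the previous one" case, the sketch below references one CTE twice under two aliases. It uses Python's sqlite3 module as a stand-in for SQL Server. The table name readings, its rows, and the join condition t2.id = t1.id - 1 are all hypothetical, since the question leaves the real conditions out.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (id INTEGER PRIMARY KEY);
    INSERT INTO readings (id) VALUES (1), (2), (3), (5);
""")

# The CTE is defined once and joined to itself under two aliases.
query = """
WITH cte AS
(
    SELECT id FROM readings
)
SELECT t1.id, t2.id AS previous_id
FROM cte AS t1
LEFT JOIN cte AS t2 ON t2.id = t1.id - 1   -- hypothetical "shift by one"
ORDER BY t1.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Rows 1 and 5 have no predecessor with id - 1 in this sample, so their previous_id is NULL. The LEFT JOIN keeps them in the result anyway.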

preventing display of duplicate records in SQL server

I'm using a stored procedure in SQL Server, but it is giving me some duplicate records. I don't have duplicate records in my database, yet the stored procedure returns two instances of the same record. What could be wrong? How can I prevent my query from returning duplicate records?
This is the SELECT clause from my stored procedure:
select (ROW_NUMBER() OVER (ORDER BY Review.Point desc) ) as rownumber,
Business.BusinessId,Business.BName,Business.BAddress1
,Business.BAddress2,Business.BCity,Business.BState,Business.BZipCode,Business.countryCode,Business.BPhone1,Business.BPhone2,Business.BEmail,Business.Keyword
,Business.BWebAddress,Business.BCatId,Business.BSubCatId,Business.BDetail,Business.bImage,Business.UCId,Business.UCConfirm
,Business.UOId,Business.UOConfirm,Business.x,Business.y,Cat.CatName,SubCat1.SubCatName
from Business left outer join
Review on business.BusinessId=Review.BusinessId left outer join
Cat on business.BCatid=Cat.CatId left outer join
SubCat1 on business.BSubCatid=SubCat1.SubCatId '+@sql2+'
) as tbl
where rownumber between '+CONVERT(varchar, @lbound)+' and '+CONVERT(varchar, @ubound);
I don't know your data well enough to dig into your join logic, but if it is duplicating across BusinessId, you could add another ROW_NUMBER() for the duplicates:
select (ROW_NUMBER() OVER (ORDER BY Review.Point desc) ) as rownumber,
r = ROW_NUMBER() OVER (PARTITION BY Business.BusinessId ORDER BY Business.BusinessId),
Business.BusinessId,Business.BName,Business.BAddress1
,Business.BAddress2,Business.BCity,Business.BState,Business.BZipCode,Business.countryCode,Business.BPhone1,Business.BPhone2,Business.BEmail,Business.Keyword
,Business.BWebAddress,Business.BCatId,Business.BSubCatId,Business.BDetail,Business.bImage,Business.UCId,Business.UCConfirm
,Business.UOId,Business.UOConfirm,Business.x,Business.y,Cat.CatName,SubCat1.SubCatName
from Business left outer join
Review on business.BusinessId=Review.BusinessId left outer join
Cat on business.BCatid=Cat.CatId left outer join
SubCat1 on business.BSubCatid=SubCat1.SubCatId '+@sql2+'
) as tbl
where rownumber between '+CONVERT(varchar, @lbound)+' and '+CONVERT(varchar, @ubound)
AND r = 1;
Include the reserved word DISTINCT in your query.
For example:
select distinct
*
from
students s
inner join enrollments e on e.StudentId = s.Id
inner join courses c on c.Id = e.CourseId
However, unexpected duplicates in a result set are often (but not always) a clue that you have a badly formed query or a badly designed database.
Try removing this left join:
Review on business.BusinessId=Review.BusinessId left outer join
It does not seem to be needed in your query, and if there is more than one review for a business ...
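The effect these answers describe can be reproduced on a small scale. The sketch below uses Python's sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for SQL Server, with two invented businesses and three invented reviews. It shows the LEFT JOIN to Review returning one business twice, and then the ROW_NUMBER() ... PARTITION BY filter keeping a single row per business. Ordering the partition by Point DESC, so that the highest-rated review survives, is an assumption made for this example; the answer above partitions and orders by BusinessId.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Business (BusinessId INTEGER PRIMARY KEY, BName TEXT);
    CREATE TABLE Review (BusinessId INTEGER, Point INTEGER);
    INSERT INTO Business VALUES (1, 'Cafe'), (2, 'Bakery');
    INSERT INTO Review VALUES (1, 5), (1, 3), (2, 4);
""")

# Plain join: a business with two reviews appears twice.
joined_sql = """
SELECT b.BusinessId, b.BName, rv.Point
FROM Business b
LEFT OUTER JOIN Review rv ON b.BusinessId = rv.BusinessId
ORDER BY b.BusinessId, rv.Point DESC
"""

# Number the rows within each business and keep only the first one.
deduped_sql = """
SELECT BusinessId, BName, Point
FROM (
    SELECT b.BusinessId, b.BName, rv.Point,
           ROW_NUMBER() OVER (PARTITION BY b.BusinessId
                              ORDER BY rv.Point DESC) AS r
    FROM Business b
    LEFT OUTER JOIN Review rv ON b.BusinessId = rv.BusinessId
) AS tbl
WHERE r = 1
ORDER BY BusinessId
"""

joined = conn.execute(joined_sql).fetchall()
deduped = conn.execute(deduped_sql).fetchall()
print("joined:", joined)
print("deduped:", deduped)
```

Unlike SELECT DISTINCT, this still works when the duplicated rows differ in a joined column (here, Point), because it picks one row per business instead of comparing whole rows.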

CROSS APPLY Performance

Is it possible to improve performance by taking the following SQL:
SELECT t1.id,
t1.name,
t2.subname,
t2.refvalue
FROM table1 AS t1
CROSS apply (SELECT TOP 1 t2.subid,
t2.subname,
t3.refvalue
FROM table2 AS t2
INNER JOIN table3 AS t3
ON t2.subid = t3.subid
ORDER BY lastupdated DESC) AS t2
And rewriting it so that it looks like this:
SELECT t1.id,
t1.name,
t2.subname,
t3.refvalue
FROM table1 AS t1
CROSS apply (SELECT TOP 1 t2.subid,
t2.subname
FROM table2 AS t2
ORDER BY lastupdated DESC) AS t2
INNER JOIN table3 AS t3
ON t2.subid = t3.subid
Firstly, does it give the same result?
If so, what does the query plan say, and what does SET STATISTICS IO ON report?
How many rows in Table1, Table2 and Table3? How many intersect and end up in the result? I'm trying to figure out the purpose of rewriting the query, and agree with gbn... do you get the same result, does the query plan look the same in both cases, do the statistics i/o get any better, and does the rewritten query run any faster?
