SQL Server view performance

If I have defined a view in SQL Server like this:
CREATE View V1
AS
SELECT *
FROM t1
INNER JOIN t2 ON t1.f1 = t2.f2
ORDER BY t1.f1
Should I expect performance differences between
SELECT * FROM V1 WHERE V1.f1 = 100
and just avoiding the view, like this:
SELECT *
FROM t1
INNER JOIN t2 ON t1.f1 = t2.f2
WHERE t1.f1 = 100
ORDER BY t1.f1
?
We don't have any reason to use views except the need to centralize complex queries.
Thanks

There should be no performance penalty.
Simplifying complex queries is what views are for.
If performance is something you are concerned about, read about indexed views in SQL Server:
Indexed views provide additional performance benefits that cannot be achieved using standard indexes. Indexed views can improve query performance in the following ways:
Aggregations can be precomputed and stored in the index to minimize expensive computations during query execution.
Tables can be prejoined and the resulting data set stored.
Combinations of joins or aggregations can be stored.
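As a rough sketch (assuming f1 uniquely identifies each joined row; indexed views also require SCHEMABINDING, two-part table names, an explicit column list, and no ORDER BY), the view from the question could be indexed like this:
CREATE VIEW dbo.V1
WITH SCHEMABINDING
AS
SELECT t1.f1, t2.f2   -- SELECT * and ORDER BY are not allowed in an indexed view
FROM dbo.t1
INNER JOIN dbo.t2 ON t1.f1 = t2.f2;
GO
-- The unique clustered index materialises the join; uniqueness of f1 is assumed.
CREATE UNIQUE CLUSTERED INDEX IX_V1 ON dbo.V1 (f1);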

Generally you shouldn't expect performance differences, but check the execution plans for your queries.
If you are joining views onto views, the execution plans can be suboptimal and contain repeated accesses to the same table that could have been consolidated. There can also be issues with views and predicate pushdown.
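A quick way to make that check, as a sketch using the objects from the question:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Logical reads and plans should match if the view is inlined cleanly.
SELECT * FROM V1 WHERE f1 = 100;

SELECT *
FROM t1
INNER JOIN t2 ON t1.f1 = t2.f2
WHERE t1.f1 = 100
ORDER BY t1.f1;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;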

Related

Does SQL Server expand a view's sql inline during execution?

Let's say I have a (hypothetical) table called Table1 with 500 columns and there is a view called View1 which is basically
select Column1, Column2,..., Column500, ComputedOrForeignKeyColumn1,...
from Table1
inner join ForeignKeyTables .....
Now, when I execute something like
Select Column32, Column56
from View1
which of the three forms below does SQL Server turn it into?
Query #1:
select Column32, Column56
from (select Column1, Column2, ..., Column500, ComputedOrForeignKeyColumn1, ...
      from Table1
      inner join ForeignKeyTables ......) v
Query #2:
select Column32, Column56
from Table1
Query #3:
select Column32, Column56
from (select Column32, Column56
      from Table1) v
The reason I'm asking is that I do have a very wide table with a view sitting on top of it (which basically inner joins to bring in the texts for all the foreign key ids), and I can't figure out whether SQL Server fetches all columns and then selects the ones that are needed, or fetches only those that are needed (while also ignoring unnecessary joins, etc.). If it is the former, a view would not be the best choice for performance.
SQL Server query compilation can be split into phases:
Parsing
Binding
Optimization
View resolution is performed during binding. At this stage the view reference is replaced with its definition. At this point, unused view columns will be present.
The next stage is optimization, where the bound syntax tree is transformed into an execution plan. The optimizer considers many kinds of manipulations on the execution plan to increase efficiency, and removing unused columns is one of the most basic. At this point, the unused column references will be removed.
So to answer your question, unused columns in the view definition will not impact performance, since the optimizer will be smart enough to remove them.
Note: this answer assumes the view is not indexed. For indexed views, the resolution process works differently, and there is view maintenance overhead for UPDATEs of the base tables.
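If you want to see the pruning for yourself, here is one sketch (SHOWPLAN_ALL must run in its own batch, hence the GO separators):
SET SHOWPLAN_ALL ON;
GO
-- The OutputList column of the plan should mention only Column32 and Column56,
-- and joins that provably cannot change the result may disappear entirely.
SELECT Column32, Column56 FROM View1;
GO
SET SHOWPLAN_ALL OFF;
GO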
None of the above. SQL Server will parse the query and create an execution plan. The resulting execution plan is calculated based on many factors, like indexes, joins, etc.
Your question cannot truly be answered by anyone other than you, by examining that execution plan.
See How do I obtain a Query Execution Plan? for more information.
The view definition is merged with the outer query at a very early stage of compilation. You may or may not get the same execution plan for a query on a view vs. an equivalent query touching the base tables, depending on the complexity of the view and the limitations of the query optimizer (QO).
For your particular case it's worth noting that an inner join doesn't only fetch data from joined tables, but it also limits the result (in the same way as an IF EXISTS check does). If there is a declarative FK between the tables, the QO will be smart enough not to check the referenced tables, as the existence is guaranteed by the constraint, but otherwise it has to.
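As a sketch of that last point (the Lookup table, LookupId column, and constraint name here are hypothetical):
-- A trusted FK (added WITH CHECK) lets the optimizer skip the existence check;
-- if LookupId is also NOT NULL, the inner join to dbo.Lookup can be eliminated
-- entirely when no Lookup columns are referenced by the outer query.
ALTER TABLE dbo.Table1 WITH CHECK
    ADD CONSTRAINT FK_Table1_Lookup
    FOREIGN KEY (LookupId) REFERENCES dbo.Lookup (Id);

-- This can now be answered from Table1 alone, even though View1 joins to dbo.Lookup.
SELECT Column32, Column56 FROM View1;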

Multiple Joins in SQL query

I'm trying to run a query from multiple tables and I'm having an issue with the query taking over 10 minutes to provide just 3 records. The query is as follows:
select TOP 100 pm_entity_type_name, year(event_date),
pm_event_type_name, pm_event_name, pm_entity_name,
pm_entity_code, event_priority, event_cost
from pm_event_priority, pm_entity, pm_entity_type, pm_event_type, pm_event
where pm_event.pm_event_id = pm_event_priority.pm_event_id
And pm_entity.pm_entity_id = pm_event_priority.pm_entity_id
And pm_entity_type.pm_entity_type_id = pm_entity.pm_entity_type_id
And pm_event_type.pm_event_type_id = pm_event_priority.pm_event_type_id
And ( pm_entity.pm_entity_type_id = '002LEITUU0005T8EX40001XFTEW000000OZX' OR
pm_entity_type.parent_id= '002LEITUU0005T8EX40001XFTEW000000OZX' )
ORDER BY 1,2,3
I wonder, is there any way I can modify this query to possibly make the query a little faster?
Query performance can tank when you have to join many large tables together, particularly when the join columns are not properly indexed. In your case, I suspect your tables are quite large (many rows) and the _id columns are not indexed.
If you are using SQL Server Management Studio, you can click on "Display Estimated Execution Plan" to see how the query optimizer is interpreting your query. If you see a bunch of Table Scans rather than Index Scans/Seeks, this means SQL Server has to read through each and every row in your tables; a performance nightmare! Try putting some indexes on the _id columns of each table (perhaps a clustered index), and/or using Database Engine Tuning Advisor to automatically recommend the best index structure to apply to your tables to improve this query's performance.
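For example, a first pass might look like this (a sketch using nonclustered indexes; the exact key choices are best taken from the plan or the Tuning Advisor):
CREATE INDEX IX_pep_event_id  ON pm_event_priority (pm_event_id);
CREATE INDEX IX_pep_entity_id ON pm_event_priority (pm_entity_id);
CREATE INDEX IX_pep_etype_id  ON pm_event_priority (pm_event_type_id);
CREATE INDEX IX_pe_etype_id   ON pm_entity (pm_entity_type_id);
CREATE INDEX IX_pet_parent_id ON pm_entity_type (parent_id);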
You need to look at the query plan. See this question on how to obtain it.
Once you have a query plan, see if you can tell from the list what is so slow. Chances are there are table scans because of missing indexes.
What happens if you take out the TOP 100 and do a SELECT * using the same criteria? If you get a ridiculous amount of data back, there may be missing join criteria.
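A cheap variant of that test, as a sketch, is to count the rows under the same criteria instead of returning them:
-- A huge count here suggests a missing join predicate is multiplying rows.
SELECT COUNT(*)
FROM pm_event_priority, pm_entity, pm_entity_type, pm_event_type, pm_event
WHERE pm_event.pm_event_id = pm_event_priority.pm_event_id
  AND pm_entity.pm_entity_id = pm_event_priority.pm_entity_id
  AND pm_entity_type.pm_entity_type_id = pm_entity.pm_entity_type_id
  AND pm_event_type.pm_event_type_id = pm_event_priority.pm_event_type_id
  AND (pm_entity.pm_entity_type_id = '002LEITUU0005T8EX40001XFTEW000000OZX'
       OR pm_entity_type.parent_id = '002LEITUU0005T8EX40001XFTEW000000OZX');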

FULL TEXT INDEX - Huge Performance Decrease on Multiple Tables

I've recently been learning something very new to me - FULLTEXT Indexes.
It seems that I can run two separate queries (using CONTAINSTABLE) on the same parameters against two separate tables and get an almost instantaneous answer (sub 10 ms); however, when I combine the two, the query takes 1.3 seconds, over 130 times slower!
Below are the queries (simplified for the purpose of this question).
Query 1:
SELECT
*
FROM
dbo.FooBar FB
INNER JOIN dbo.FooBalls FBS on FB.ID = FBS.ID
LEFT JOIN CONTAINSTABLE(dbo.FooBar, (Col1, Col2, Col3), @query) FBCONT ON FB.ID = FBCONT.[KEY]
WHERE
FBCONT.[KEY] IS NOT NULL
Query 2:
SELECT
*
FROM
dbo.FooBar FB
INNER JOIN dbo.FooBalls FBS on FB.ID = FBS.ID
LEFT JOIN CONTAINSTABLE(dbo.FooBalls, (Col1), @query) FBSCONT ON FBS.ID = FBSCONT.[KEY]
WHERE
FBSCONT.[KEY] IS NOT NULL
Query Combined:
SELECT
*
FROM
dbo.FooBar FB
INNER JOIN dbo.FooBalls FBS on FB.ID = FBS.ID
LEFT JOIN CONTAINSTABLE(dbo.FooBar, (Col1, Col2, Col3), @query) FBCONT ON FB.ID = FBCONT.[KEY]
LEFT JOIN CONTAINSTABLE(dbo.FooBalls, (Col1), @query) FBSCONT ON FBS.ID = FBSCONT.[KEY]
WHERE
(FBCONT.[KEY] IS NOT NULL OR FBSCONT.[KEY] IS NOT NULL)
Perhaps my research has missed something but can someone give me an indicator as to why having both clauses together reduces performance by over 130 times?
NOTES:
I've checked that the relevant indexes for joining exist, as verified by the speed of the individual queries.
There are actually more joins involved in the process; however, they are completely unrelated to the tables being queried, and again responses are under 10 ms when searching for results in 100,000+ records.
I tried replacing the CONTAINSTABLE with individual CONTAINS statements - performance was massively degraded as my research would lead me to expect.
A catalog has been set up that references ONLY the four columns from the two tables being queried
The @query parameter is set to NVARCHAR(50) at present. I've read that using NVARCHAR is faster as implicit conversions are not required.
I know I could do a dirty UNION ALL on both queries separately, but I'd prefer to write better queries if possible rather than hack it together. Additionally, UNION ALL would leave me with potential duplicates if the @query value appeared in two columns from separate tables linked to one record.
Any further suggestions would be greatly received.
Your question comments suggest you improved performance to a satisfactory level by rewriting an unrelated part of the query (not shown in the question).
This is fair enough if it works, but doesn't explain why the two separate queries and the combined query differ so significantly, when other unrelated parts of the query are kept constant.
It's difficult to say confidently without seeing a query plan and statistics results; however I can think of two possibilities based solely on reasoning about how the SQL queries are written:
One or both of the ID columns (from FooBar and FooBalls) may be non-unique in the row set after these two tables have been inner joined. Doing two, rather than one, join to CONTAINSTABLE result sets may thus be "breeding" rather more records than a single join does; larger result sets take longer to be passed back to the client and displayed. To test this: compare the row counts returned by the two separate queries, and compare these to the row counts of each separate query if the WHERE clauses are omitted. Larger row counts will typically suggest a longer query elapsed time (all other things being equal).
Each of the separate queries has been written with a left outer join, but the result set is then restricted to only include rows where the join has succeeded. This is effectively an inner join: SQL Server's query planner may well be identifying this fact and choosing an execution plan as if an inner join had been specified. Conversely, the combined query requires rows where either join (but not necessarily both) have succeeded, which is a true left join. The execution plan is likely to use different, slower, approaches for these joins. To test this: look at the execution plans, and compare to execution plans for the separate queries with inner joins requested instead of left joins.
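For the second test, here is a sketch of the combined query with both joins requested as inner joins; note this only returns rows matching both full-text searches, so it is for plan comparison rather than a drop-in replacement:
SELECT *
FROM dbo.FooBar FB
INNER JOIN dbo.FooBalls FBS ON FB.ID = FBS.ID
INNER JOIN CONTAINSTABLE(dbo.FooBar, (Col1, Col2, Col3), @query) FBCONT
    ON FB.ID = FBCONT.[KEY]
INNER JOIN CONTAINSTABLE(dbo.FooBalls, (Col1), @query) FBSCONT
    ON FBS.ID = FBSCONT.[KEY];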

SQL Server performance - Subselect or Inner Join?

I've been pondering which of these two statements might have higher performance (and why):
select * from formelement
where formid = (select id from form where name = 'Test')
or
select *
from formelement fe
inner join form f on fe.formid = f.id
where f.name = 'Test'
One form contains several form elements; one form element is always part of one form.
Thanks,
Dennis
Look at the execution plan; most likely it will be the same if you add the filtering to the join. That said, the join will return columns from both tables, while the subquery form will not.
I actually prefer EXISTS over those two
select * from formelement fe
where exists (select 1
              from form f
              where f.name = 'Test'
                and fe.formid = f.id)
The performance depends on the query plan choosen by the SQL Server Engine. The query plan depends on a lot of factors, including (but not limited to) the SQL, the exact table structure, the statistics of the tables, available indexes, etc.
Since your two queries are quite simple, my guess would be that they result in the same (or a very similar) execution plan, thus yielding comparable performance.
(For large, complicated queries, the exact wording of the SQL can make a difference, the book SQL Tuning by Dan Tow gives a lot of great advice on that.)

Join queries taking more execution time than their corresponding nested queries

I have two tables, Person_Organization and Person_Organization_other, and the nested query is:
SELECT
Person_Organization_id
FROM
Person_Organization_other
WHERE
company_name IN (SELECT company_name
FROM Person_Organization_other
WHERE Person_Organization_id IN (SELECT Person_Organization_Id
FROM Person_Organization
WHERE person_id = 117
AND delete_flag = 0)
)
The corresponding join query that I tried is:
SELECT
poo.Person_Organization_id
FROM
Person_Organization_other poo, Person_Organization_other poo1, Person_Organization po
WHERE
poo1.Person_Organization_id = po.Person_Organization_Id
AND po.person_id = 117
AND po.delete_flag = 0
AND poo.company_name = poo1.company_name
GROUP BY
poo.Person_Organization_id
However, the nested query turns out to take less time than its corresponding join query. I used a SQL Profiler trace to compare the execution times: around 30 ms for the nested query versus around 41 ms for the join query.
I was under the impression that, as a rule, nested queries are less performant and should be "flattened out" using joins.
Could someone explain what I am doing wrong?
regards
Nitin
You are using cross joins. Try inner joins.
select poo.Person_Organization_id
from Person_Organization po
inner join Person_Organization_other poo1
    on poo1.Person_Organization_id = po.Person_Organization_Id
inner join Person_Organization_other poo
    on poo.company_name = poo1.company_name
where po.person_id = 117 and po.delete_flag = 0
group by poo.Person_Organization_id
By separating your tables with commas, you are using the old implicit join syntax, effectively CROSS JOINs that the WHERE clause then filters. I would try explicit INNER JOINs between the tables and see if that helps performance.
The view that nested queries are less performant and should be flattened out using joins is a myth: inappropriate nested subqueries can certainly cause performance issues, but in many cases a subquery is just as good as a join.
In fact, SQL Server optimises every query it executes by reducing it to an execution tree, and queries that use a JOIN often end up with execution trees identical to those of equivalent statements written with nested queries.
In this case the execution times are really low anyway; the difference could just as easily be explained by caches not yet being warm.
My advice would be to use whatever syntax makes more sense to you - if you have a performance problem then by all means go back and check to see if a nested subquery is the cause of your problem, however I definitely wouldn't spend time worrying about "flattening out" queries that aren't causing problems.
The order of tables might also affect performance: the tables in your FROM clause should be listed in increasing order of number of rows.
