row_number over partition performance from Power BI - sql-server

I have a view in SQL Server 2019 whose query uses row_number over partition in a left join:
from
tableA
left join
(
(row_number over partition query)
) b on b.field=tableA.field
Run from SSMS, this query (used as a view) takes about 9 seconds.
If this view is used in another view, the query also finishes fine in SSMS but times out in Power BI.
A colleague of mine claims the issue is caused by the (row_number over partition query) and that it is not good for Power BI, but I don't see how that matters when the view lives in SQL Server. How can it have an impact on Power BI?
Verified that after replacing the view with a table, it works fine in Power BI.
Type of query from Power BI: "Import"
Thanks.
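For concreteness, the join pattern described in the question might look like the sketch below. All object names (dbo.tableB, payload, changed_at, dbo.vLatestPerField, dbo.LatestPerField) are hypothetical, and keeping only the first row of each partition is an assumption about what the row-numbered subquery is for. The last statement shows one possible way to materialize the view into a table, which is the variant reported above to work from Power BI.

```sql
-- Hypothetical view: latest tableB row per field, left-joined to tableA.
CREATE VIEW dbo.vLatestPerField
AS
SELECT a.field, b.payload, b.changed_at
FROM dbo.tableA AS a
LEFT JOIN
(
    SELECT field, payload, changed_at,
           ROW_NUMBER() OVER (PARTITION BY field ORDER BY changed_at DESC) AS rn
    FROM dbo.tableB
) AS b
    ON b.field = a.field
   AND b.rn = 1;   -- assumption: only the first row of each partition is wanted
GO

-- Materialize the view's result into a table (hypothetical name) for import.
SELECT field, payload, changed_at
INTO dbo.LatestPerField
FROM dbo.vLatestPerField;
```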

Related

Put query hint (OPTION) into view in SQL Server

I have an SQL query on a view using several joins that is occasionally running really slow - a lot slower than normal, making the query nearly unusable.
I copied the query out of the view and experimented and found a solution at https://dba.stackexchange.com/a/60180/52607 - if I add
OPTION (MERGE JOIN, HASH JOIN)
to the end of the query, it is running ~6x faster.
I now tried to adapt the OPTION to the original view, but SQL Server/SSMS tells me
Incorrect syntax near the keyword 'OPTION'.
How can I add this option to the view so that the resulting query of the view is just as fast?
(Adding the option to the query on the view did not result in any speedup. This looked like this:
select * from vMyView
where SomeDate >= CONVERT(Datetime, '2017.09.20')
OPTION (MERGE JOIN, HASH JOIN)
I think I would have to use this option directly for the vMyView - if possible.)
You could add a local hint in the joins in the view
select X, Y from tab1 inner merge JOIN tab2 on tab1.id = tab2.id
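A fuller sketch of that suggestion, assuming hypothetical tables dbo.tab1 and dbo.tab2 with columns id, X, Y and SomeDate, and a hypothetical view name dbo.vMyViewHinted. A join hint written inside the view is stored with the view definition, whereas OPTION (...) is only accepted at the end of a complete statement. Be aware that once a local join hint is present, SQL Server also enforces the join order as written, so this is worth testing against both fast and slow cases.

```sql
CREATE VIEW dbo.vMyViewHinted
AS
SELECT t1.X, t2.Y, t1.SomeDate
FROM dbo.tab1 AS t1
INNER MERGE JOIN dbo.tab2 AS t2   -- local join hint: use a merge join here
    ON t1.id = t2.id;
GO

-- Callers query the view as usual; no OPTION clause is needed.
SELECT X, Y
FROM dbo.vMyViewHinted
WHERE SomeDate >= CONVERT(datetime, '20170920');
```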

Sql Server strange execution plan choices

I have a query on SQL Server 2012 SP3 which is built dynamically through an application. I have noticed a case where it runs slowly due to an inefficient execution plan, and I am trying to figure out the problem.
In this case the query that is being built has the following form
Select some columns from
(SELECT TOP 1 1 AS NEW FROM tr) AS AL
JOIN
(select some columns from a view join some tables
where column = 'a' or column = 'b' or column = 'c'...) t5
ON 1=1 WHERE [t5].[ROW_NUMBER] BETWEEN 0+1 AND 0+20 ORDER BY [t5].[ROW_NUMBER]
The outer select is used for pagination. The inner select, labeled t5, always runs fast when executed on its own. Combined with the outer select for pagination, however, it can be very slow, depending on the number of values chosen in its WHERE clause and how selective it is (how few rows are fetched).
I have tried changing the query to improve performance, but when I do, I ruin the performance of the queries built by the application that are not selective (that fetch many rows).
From what I see, the execution plan depends on the values selected in the WHERE clause. Is there a way to help SQL Server choose the right execution plan so that it avoids reading useless rows?
I would appreciate any suggestion.

query generated by SSAS contains subqueries

I created a cube that for now consists of a simple partition that looks something like this:
select table1.col1, table1.col2, table1.col3 from table1
When I examine the query that SSAS executes using SQL Profiler, the resulting query looks like
select [dbo_table1].[col1],[dbo_table1].[col2],[dbo_table1].[col3] from
(
select table1.col1, table1.col2, table1.col3 from table1
)
For some other tables and seemingly similar queries, the resulting query gets wrapped twice, with another inner select. I'm trying to understand why this happens and whether there is a way to get the exact query I write to be executed.
The cube is built using VS 2012 Premium (11.0.3000.0) and my SQL Server is 2012 (11.0.3128)

SQL Server Pagination w/o row_number() or nested subqueries?

I have been fighting with this all weekend and am out of ideas. In order to have pages in my search results on my website, I need to return a subset of rows from a SQL Server 2005 Express database (i.e. start at row 20 and give me the next 20 records). In MySQL you would use the "LIMIT" keyword to choose which row to start at and how many rows to return.
In SQL Server I found ROW_NUMBER()/OVER, but when I try to use it it says "Over not supported". I am thinking this is because I am using SQL Server 2005 Express (free version). Can anyone verify if this is true or if there is some other reason an OVER clause would not be supported?
Then I found the old school version similar to:
SELECT TOP X * FROM TABLE WHERE ID NOT IN (SELECT TOP Y ID FROM TABLE ORDER BY ID) ORDER BY ID
where X = number per page and Y = which record to start on.
However, my queries are a lot more complex with many outer joins and sometimes ordering by something other than what is in the main table. For example, if someone chooses to order by how many videos a user has posted, the query might need to look like this:
SELECT TOP 50 iUserID, iVideoCount FROM MyTable LEFT OUTER JOIN (SELECT count(iVideoID) AS iVideoCount, iUserID FROM VideoTable GROUP BY iUserID) as TempVidTable ON MyTable.iUserID = TempVidTable.iUserID WHERE iUserID NOT IN (SELECT TOP 100 iUserID, iVideoCount FROM MyTable LEFT OUTER JOIN (SELECT count(iVideoID) AS iVideoCount, iUserID FROM VideoTable GROUP BY iUserID) as TempVidTable ON MyTable.iUserID = TempVidTable.iUserID ORDER BY iVideoCount) ORDER BY iVideoCount
The issue is in the subquery SELECT line: TOP 100 iUserID, iVideoCount
To use the "NOT IN" clause it seems I can only have 1 column in the subquery ("SELECT TOP 100 iUserID FROM ..."). But when I don't include iVideoCount in that subquery SELECT statement then the ORDER BY iVideoCount in the subquery doesn't order correctly so my subquery is ordered differently than my parent query, making this whole thing useless. There are about 5 more tables linked in with outer joins that can play a part in the ordering.
I am at a loss! The two methods above are the only two ways I can find to get SQL Server to return a subset of rows. I am about ready to return the whole result and loop through each record in PHP but only display the ones I want. That is such an inefficient way to do things that it is really my last resort.
Any ideas on how I can make SQL Server mimic MySQL's LIMIT clause in the above scenario?
SQL Server 2005 can page with Row_Number(), and SQL Server 2012 enhanced paging support with Order By ... Offset and Fetch Next. If you cannot use either of these solutions, you need to:
create a temp table with an identity column,
insert the data into the temp table with an ORDER BY clause,
use the temp table's identity column value just like the ROW_NUMBER() value.
I hope it helps.
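The temp-table steps above can be sketched as follows, reusing the table and column names from the question. The column types, the #Paged name, the ISNULL default of 0 for users without videos, and the iUserID tie-breaker are all assumptions made for illustration.

```sql
-- Page 3 at 50 rows per page: rows 101-150 when ordered by video count.
CREATE TABLE #Paged
(
    RowNum      INT IDENTITY(1, 1) PRIMARY KEY,
    iUserID     INT,
    iVideoCount INT NOT NULL
);

-- With INSERT ... SELECT ... ORDER BY, identity values follow the ORDER BY.
INSERT INTO #Paged (iUserID, iVideoCount)
SELECT m.iUserID, ISNULL(v.iVideoCount, 0)
FROM MyTable AS m
LEFT OUTER JOIN
(
    SELECT iUserID, COUNT(iVideoID) AS iVideoCount
    FROM VideoTable
    GROUP BY iUserID
) AS v
    ON m.iUserID = v.iUserID
ORDER BY ISNULL(v.iVideoCount, 0), m.iUserID;  -- tie-breaker keeps pages stable

-- RowNum now plays the role of ROW_NUMBER().
SELECT iUserID, iVideoCount
FROM #Paged
WHERE RowNum BETWEEN 101 AND 150
ORDER BY RowNum;

DROP TABLE #Paged;
```

On SQL Server 2012 or later, the same page could instead be requested directly by ending the ordered query with OFFSET 100 ROWS FETCH NEXT 50 ROWS ONLY.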

How to force SQL Server to process CONTAINS clauses before WHERE clauses?

I have a SQL query that uses both standard WHERE clauses and full text index CONTAINS clauses. The query is built dynamically from code and includes a variable number of WHERE and CONTAINS clauses.
In order for the query to be fast, it is very important that the full text index be searched before the rest of the criteria are applied.
However, SQL Server chooses to process the WHERE clauses before the CONTAINS clauses, which causes table scans and makes the query very slow.
I'm able to rewrite this using two queries and a temporary table. When I do so, the query executes 10 times faster. But I don't want to do that in the code that creates the query because it is too complex.
Is there a way to force SQL Server to process the CONTAINS before anything else? I can't force a plan (USE PLAN) because the query is built dynamically and varies a lot.
Note: I have the same problem on SQL Server 2005 and SQL Server 2008.
You can signal your intent to the optimiser like this
SELECT
*
FROM
(
SELECT *
FROM
WHERE
CONTAINS
) T1
WHERE
(normal conditions)
However, SQL is declarative: you say what you want, not how to do it. So the optimiser may decide to ignore the nesting above.
You can force the derived table with CONTAINS to be materialised before the classic WHERE clause is applied. I won't guarantee performance.
SELECT
*
FROM
(
SELECT TOP 2000000000
*
FROM
....
WHERE
CONTAINS
ORDER BY
SomeID
) T1
WHERE
(normal conditions)
Try doing it with 2 queries without temp tables:
SELECT *
FROM table
WHERE id IN (
SELECT id
FROM table
WHERE contains_criteria
)
AND further_where_clauses
As I noted above, this is NOT as clean a way to "materialize" the derived table as the TOP clause that @gbn proposed, but a loop join hint forces an order of evaluation, and has worked for me in the past (admittedly usually with two different tables involved). There are a couple of problems though:
The query is ugly.
You still don't get any guarantees that the other WHERE parameters don't get evaluated until after the join (I'll be interested to see what you get).
Here it is though, given that you asked:
SELECT OriginalTable.XXX
FROM (
SELECT XXX
FROM OriginalTable
WHERE
CONTAINS XXX
) AS ContainsCheck
INNER LOOP JOIN OriginalTable
ON ContainsCheck.PrimaryKeyColumns = OriginalTable.PrimaryKeyColumns
AND OriginalTable.OtherWhereConditions = OtherValues
