I have a Snowflake query with multiple CTEs that inserts into a table via a Talend job. It takes more than 90 minutes to execute. The CTEs cascade: one calls the next, which calls the next.
I want to improve the performance of the query. It is about 1,000 lines of code, so I can't paste it here. When I checked the query profile, it showed that the window functions and aggregate functions are what slow the query down.
For example, the slowest one is:
ROW_NUMBER() OVER (PARTITION BY LOWER(S.SUBSCRIPTIONID)
ORDER BY S.ISROWCURRENT DESC NULLS FIRST,
TO_NUMBER(S.STARTDATE) DESC NULLS FIRST,
IFF(S.ENDDATE IS NULL, '29991231', S.ENDDATE) DESC NULLS FIRST)
takes 7.3% of the time. Can you suggest an alternative way to improve the performance of the query please?
The problem is that 1000 lines are very hard for any query analyzer to optimize. It also makes troubleshooting a lot harder for you and for a future team member who inherits the code.
I recommend breaking the query up and applying these optimizations:
Use CREATE TEMPORARY TABLE AS instead of CTEs. Add an ORDER BY on the column you will join or filter on as you create the table. Temporary tables are easier for the optimizer to build and later use, and the ORDER BY helps Snowflake know what to optimize for in subsequent joins to other tables. They're also easier for troubleshooting.
In your example, see if you can persist this data as permanent columns so that Snowflake can skip the conversion portion and have better statistics on it: TO_NUMBER(S.STARTDATE) and IFF(S.ENDDATE IS NULL, '29991231', S.ENDDATE).
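A minimal sketch of that idea, assuming a table named SUBSCRIPTIONS (the table and column names are guesses based on the snippet above):
-- hypothetical names; persist the converted values once so later sorts and joins use plain columns
alter table subscriptions add column startdate_num number;
alter table subscriptions add column enddate_filled varchar;

update subscriptions
set startdate_num = to_number(startdate),
    enddate_filled = iff(enddate is null, '29991231', enddate);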
Alternatively to step 2, instead of sorting by startDate and endDate, see if you can add an IDENTITY, SEQUENCE, or populate an INTEGER column which you can use as the sortkey. You can also literally name this new column sortKey. Sorting an integer will be significantly faster than running a function on a DATETIME and then ordering by it.
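As a hedged sketch of that alternative (SUBSCRIPTIONS is again a placeholder name), you can compute the integer sortKey once up front, so downstream steps sort on a plain integer instead of re-evaluating the expressions:
create or replace table subscriptions_sorted as
select s.*,
       row_number() over (partition by lower(s.subscriptionid)
                          order by s.isrowcurrent desc nulls first,
                                   to_number(s.startdate) desc nulls first,
                                   iff(s.enddate is null, '29991231', s.enddate) desc nulls first) as sortkey
from subscriptions s;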
If any of the CTEs can be changed into materialized views, they will be pre-built and significantly faster.
Finally stage all of the data in a temporary table - ordered by the same columns that your target table was created in - before you insert it. This will make the insert step itself quicker and Snowflake will have an easier time handling a concurrent change to that table.
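A rough sketch of that staging step - every name here is a placeholder:
create or replace temporary table stage_final as
select *
from final_result                      -- the output of your last transformation step
order by subscriptionid, startdate;    -- the same columns the target table is ordered by

insert into target_table
select * from stage_final;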
Notes:
To create a temporary table:
create or replace temporary table table1 as select * from dual; After that, you refer to table1 in your code instead of the CTE.
Materialized views are documented here. They are an Enterprise Edition feature. The syntax is: create materialized view mymv as select col1, col2 from mytable;
Related
I am creating a Java function that needs to use a SQL query with a lot of joins before doing a full scan of its result. Instead of hard-coding a lot of joins I decided to create a view with this complex query. Then the Java function just uses the following query to get this result:
SELECT * FROM VW_####
So the program works fine, but I want to make it faster since this SELECT command takes a lot of time. After taking a look at its execution plan I created some indexes and made it roughly 30% faster, but I want to make it faster still.
The problem is that every operation in the execution plan has a cost between 0% and 4%, except one: a clustered-index insert that has roughly 50% of the execution cost. I think the system is using a temporary table to store the view's data, but an index on this view isn't useful for me because I need all rows from it.
So what can I do to optimize that insert into the CWT_PrimaryKey? I don't think I can turn off that index because it seems to be part of SQL Server's internals. I read somewhere that this operation can appear when you use cursors, but I don't think I am using any (or does the view use one?).
The command to create the view is something simple (no T-SQL, no OPTION, etc) like:
create view VW_#### as SELECTS AND JOINS HERE
And here is a picture of the problematic part from the execution plan: http://imgur.com/PO0ZnBU
EDIT: More details:
Well, the query that creates the problematic view is a big query that joins a lot of tables. Based on a single parameter, the Java client modifies the query string before creating it. This view represents a "data unit" from a legacy database migrated to SQL Server that didn't have any foreign or primary keys, so our team chose to follow this strategy. Because of that, the view has more than 50 columns and is made from the join of seven other views.
Main view's query (with a lot of Portuguese words): http://pastebin.com/Jh5vQxzA
The other views (VW_Sintese1 through VW_Sintese7) are created like this one, but without using extra views; they just join the tables that contain the data requested by the main view.
Then the Java client creates a PreparedStatement with the query "Select * from VW_Sintese####" and executes it using executeQuery, something like:
String query = "Select * from VW_Sintese####";
PreparedStatement ps = myConn.prepareStatement(query,ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
ResultSet rs = ps.executeQuery();
And then the program goes on until the end.
Thanks for the attention.
First: you should post the code of the view, along with whatever is using the view, because the rest of this answer depends on it.
Second: in SQL Server, the definition of a view is substituted into the queries that reference it. In other words, you created a view, but since (I'm assuming) it isn't an indexed view, it is the same as writing the original, long SELECT statement. SQL Server kind of just swaps it in within the DML statement.
From Microsoft's 'Querying Microsoft SQL Server 2012': T-SQL supports the following table expressions: derived tables, common table expressions (CTEs), views, inline table-valued functions.
And a direct quote:
It’s important to note that, from a performance standpoint, when SQL Server optimizes queries involving table expressions, it first unnests the table expression’s logic, and therefore interacts with the underlying tables directly. It does not somehow persist the table expression’s result in an internal work table and then interact with that work table. This means that table expressions don’t have a performance side to them—neither good nor bad—just no side.
This is a long way of reinforcing the first statement: please include the SQL code in the view and what you're actually using as the SELECT statement. Otherwise, we can't help much :) Cheers!
Edit: Okay, so you've created a view (no performance gain there) that does 4-5 LEFT JOINs onto the main view (again, you're not helping yourself much here by eliminating rows, etc.). If there are search arguments you can use to filter the result set down to fewer rows, you should have them in here. And lastly, you're ordering all of this at the top, so your query engine has to get those views, join them up into a massive SELECT statement, figure out the correct order, and (I'm guessing here) the result count is HUGE and SQL Server's engine is ordering it in some kind of temporary table.
The short answer: get less data (fewer columns and only the rows you need); don't order the results if the resultset is very large, just get the data to the client and then sort it there.
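Since the view definition isn't posted, here is only a hedged sketch of that "less data" idea; the column names and the filter are made up:
-- select only the columns the Java client actually uses, filter as early as possible,
-- and skip the server-side ORDER BY (sort the smaller result on the client instead)
SELECT v.Codigo, v.Descricao, v.DataCriacao     -- made-up column names
FROM VW_Sintese1 v
WHERE v.DataCriacao >= '20130101';              -- made-up search argument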
Again, if you want more help, you'll need to post table schemas and index strategies for all tables that are in the query (including the views that are joined) and you'll need to include all view definitions (including the views that are joined).
My question concerns Oracle 11g and the use of indexes in SQL queries.
In my database, there is a table that is structured as followed:
Table tab (
rowid NUMBER(11),
unique_id_string VARCHAR2(2000),
year NUMBER(4),
dynamic_col_1 NUMBER(11),
dynamic_col_1_text NVARCHAR2(2000)
) TABLESPACE tabspace_data;
I have created two indexes:
CREATE INDEX Index_dyn_col1 ON tab (dynamic_col_1, dynamic_col_1_text) TABLESPACE tabspace_index;
CREATE INDEX Index_unique_id_year ON tab (unique_id_string, year) TABLESPACE tabspace_index;
The table contains around 1 to 2 million records. I extract the data from it by executing the following SQL command:
SELECT distinct
"sub_select"."dynamic_col_1" "AS_dynamic_col_1","sub_select"."dynamic_col_1_text" "AS_dynamic_col_1_text"
FROM
(
SELECT "tab".* FROM "tab"
where "tab".year = 2011
) "sub_select"
Unfortunately, the query needs around 1 hour to execute, although I created both indexes described above.
The explain plan shows that Oracle uses a "Table Full Access", i.e. a full table scan. Why is the index not used?
As an experiment, I tested the following SQL command:
SELECT DISTINCT
"dynamic_col_1" "AS_dynamic_col_1", "dynamic_col_1_text" "AS_dynamic_col_1_text"
FROM "tab"
Even in this case, the index is not used and a full table scan is performed.
In my real database, the table contains more indexed columns like "dynamic_col_1" and "dynamic_col_1_text".
The whole index file has a size of about 50 GB.
A few more details:
The database is Oracle 11g installed on my local computer.
I use Windows 7 Enterprise 64bit.
The whole index is split over 3 dbf files with about 50GB size.
I would really be glad, if someone could tell me how to make Oracle use the index in the first query.
Because the first query is used by another program to extract the data from the database, it can hardly be changed. So it would be good to tweak the table instead.
Thanks in advance.
[01.10.2011: UPDATE]
I think I've found the solution for the problem. Both columns dynamic_col_1 and dynamic_col_1_text are nullable. After altering the table to prohibit "NULL"-values in both columns and adding a new index solely for the column year, Oracle performs a Fast Index Scan.
The advantage is that the query takes now about 5 seconds to execute and not 1 hour as before.
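For reference, the change described in the update is roughly the following (exact syntax may vary depending on existing data and constraints):
ALTER TABLE tab MODIFY (dynamic_col_1 NOT NULL, dynamic_col_1_text NOT NULL);
CREATE INDEX Index_year ON tab (year) TABLESPACE tabspace_index;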
Are you sure that an index access would be faster than a full table scan? As a very rough estimate, full table scans are 20 times faster than reading an index. If tab has more than 5% of its data in 2011, it's not surprising that Oracle would use a full table scan. And as @Dan and @Ollie mentioned, with year as the second column this will make the index even slower.
If the index really is faster, then the issue is probably bad statistics. There are hundreds of ways the statistics could be bad. Very briefly, here's what I'd look at first:
Run an explain plan with and without an index hint. Are the cardinalities off by 10x or more? Are the times off by 10x or more?
If the cardinality is off, make sure there are up-to-date stats on the table and index and that you're using a reasonable ESTIMATE_PERCENT (DBMS_STATS.AUTO_SAMPLE_SIZE is almost always the best for 11g); see the sketch after this list.
If the time is off, check your workload statistics.
Are you using parallelism? Oracle always assumes a near linear improvement for parallelism, but on a desktop with one hard drive you probably won't see any improvement at all.
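For the statistics point above, refreshing them could look roughly like this (the schema name is a placeholder):
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'MY_SCHEMA',
    tabname          => 'TAB',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE);  -- also gathers stats on the table's indexes
END;
/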
Also, this isn't really relevant to your problem, but you may want to avoid using quoted identifiers. Once you use them you have to use them everywhere, and it generally makes your tables and queries painful to work with.
Your index should be:
CREATE INDEX Index_year
ON tab (year)
TABLESPACE tabspace_index;
Also, your query could just be:
SELECT DISTINCT
dynamic_col_1 "AS_dynamic_col_1",
dynamic_col_1_text "AS_dynamic_col_1_text"
FROM tab
WHERE year = 2011;
If your index was created solely for this query, though, you could include the two fetched columns in it as well. Then the optimiser would not have to go to the table for the query data; it could retrieve it directly from the index, making your query more efficient again.
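For example, a covering index along those lines might look like this (the index name is made up):
CREATE INDEX Index_year_cover
ON tab (year, dynamic_col_1, dynamic_col_1_text)
TABLESPACE tabspace_index;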
Hope it helps...
I don't have an Oracle instance on hand so this is somewhat guesswork, but my inclination is to say it's because you have the compound index in the wrong order. If you had year as the first column in the index it might use it.
Your second test query:
SELECT DISTINCT
"dynamic_col_1" "AS_dynamic_col_1", "dynamic_col_1_text" "AS_dynamic_col_1_text"
FROM "tab"
would not use the index because you have no WHERE clause, so you're asking Oracle to read every row in the table. In that situation the full table scan is the faster access method.
Also, as other posters have mentioned, your index on YEAR has it in the second column. Oracle can use this index by performing a skip scan, but there is a performance hit for doing so, and depending on the size of your table Oracle may just decide to use the FTS again.
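If you want to give the optimizer that option, an index with YEAR as the leading column might look something like this (the index name is made up):
CREATE INDEX Index_year_uid ON tab (year, unique_id_string) TABLESPACE tabspace_index;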
I don't know if it's relevant, but I tested the following query:
SELECT DISTINCT
"dynamic_col_1" "AS_dynamic_col_1", "dynamic_col_1_text" "AS_dynamic_col_1_text"
FROM "tab"
WHERE "dynamic_col_1" = 123 AND "dynamic_col_1_text" = 'abc'
The explain plan for that query show that Oracle uses an index scan in this scenario.
The columns dynamic_col_1 and dynamic_col_1_text are nullable. Does this have an effect on the usage of the index?
Try this:
1) Create an index on year field (see Ollie answer).
2) And then use this query:
SELECT DISTINCT
dynamic_col_1
,dynamic_col_1_text
FROM tab
WHERE ID IN (SELECT ID FROM tab WHERE year=2011)
or
SELECT DISTINCT
dynamic_col_1
,dynamic_col_1_text
FROM tab
WHERE ID IN (SELECT ID FROM tab WHERE year=2011)
GROUP BY dynamic_col_1, dynamic_col_1_text
Maybe it will help you.
I have a fairly complex query in SQL Server running against a view, in the form:
SELECT *
FROM myview, foo, bar
WHERE shared=1 AND [joins and other stuff]
ORDER BY sortcode;
The query plan shows a Sort operation just before the final SELECT, which is what I would expect. There are only 35 matching records, and the query takes well under 2 seconds.
But if I add TOP 30, the query takes almost 3 minutes! Using SET ROWCOUNT is just as slow.
Looking at the query plan, it now appears to sort all 2+ million records in myview before the joins and filters.
This "sorting" is shown on the query plan as an Index Scan on the sortcode index, a Clustered Index Seek on the main table, and a Nested Loop between them, all before the joins and filters.
How can I force SQL Server to SORT just before TOP, like it does when TOP isn't specified?
I don't think the construction of myview is the issue, but just in case, it is something like this:
CREATE VIEW myview AS
SELECT columns..., sortcode, 0 as shared FROM mytable
UNION ALL
SELECT columns..., sortcode, 1 as shared FROM [anotherdb].dbo.mytable
The local mytable has a few thousand records, and mytable in the other database in the same MSSQL instance has a few million records. Both tables do have indexes on their respective sortcode column.
And so starts the unfortunate game of "trying to outsmart the optimizer (because it doesn't always know best)".
You can try putting the filtering portions into a subquery or CTE:
SELECT TOP 30 *
FROM
(SELECT *
FROM myview, foo, bar
WHERE shared=1 AND [joins and other stuff]) t
ORDER BY sortcode;
Which may be enough to force it to filter first (but the optimizer gets "smarter" with each release, and can sometimes see through such shenanigans). Or you might have to go as far as putting this code into a UDF. If you write the UDF as a multistatement table-valued function, with the filtering inside, and then query that UDF with your TOP x/ORDER BY, you've pretty well forced the querying order (because SQL Server is currently unable to optimize around multistatement UDFs).
Of course, thinking about it, introducing the UDF is just a way of hiding what we're really doing - create a temp table, use one query to populate it (based on WHERE filters), then another query to find the TOP x from the temp table.
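A minimal sketch of that temp-table approach, keeping the placeholder from the question (only myview's columns are selected here to avoid duplicate column names in the temp table):
SELECT myview.*
INTO #filtered
FROM myview, foo, bar
WHERE shared=1 AND [joins and other stuff];

SELECT TOP 30 *
FROM #filtered
ORDER BY sortcode;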
I am using SQL Server 2008 and I need to optimize my queries. For that purpose I am using the Database Engine Tuning Advisor.
My question is: can I check the performance of only one SQL query at a time, or more than one using a new session?
To analyze one query at a time, right-click it in the SSMS script window and choose the option "Analyze Query in DTA". For this workload, select the option "keep all existing PDS" to avoid loads of drop recommendations for indexes not used by the query under examination.
To do more than one, first capture a trace file with a representative workload sample; then you can analyse that with the DTA.
There are simple steps that you should follow when writing a SQL query:
1- Name the columns in the SELECT list instead of using *
2- Avoid subqueries
3- Avoid the IN operator
4- Use HAVING only as a filter on the result of GROUP BY
5- Do not save images in the database; save the image path instead. Saving images in the DB takes a lot of space and requires serialization every time an image is saved or retrieved.
6- Each table should have a primary key
7- Each table should have a minimum of one clustered index
8- Each table should have an appropriate number of non-clustered indexes; non-clustered indexes should be created on columns of the table based on the queries that run against it
9- The following priority order should be followed when any index is created: a) WHERE clause, b) JOIN clause, c) ORDER BY clause, d) SELECT clause
10- Do not use views; replace views with the original source tables
11- Triggers should not be used if possible; incorporate the logic of the trigger in a stored procedure
12- Remove any ad hoc queries and use stored procedures instead
13- Check that at least 30% of the hard disk is empty; it improves the performance a bit
14- If possible, move the logic of UDFs into stored procedures as well
15- Remove any unnecessary joins from the query
16- If a cursor is used in a query, see if there is any other way to avoid it (either by SELECT … INTO or INSERT … INTO, etc.) - see the sketch after this list
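For point 16, a minimal set-based sketch of replacing a row-by-row cursor - the table and column names are made up:
-- instead of opening a cursor over dbo.Orders and inserting one row at a time:
INSERT INTO dbo.OrderTotals (CustomerId, TotalAmount)
SELECT CustomerId, SUM(Amount)
FROM dbo.Orders
GROUP BY CustomerId;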
Which are more performant, CTE or Temporary Tables?
It depends.
First of all
What is a Common Table Expression?
A (non-recursive) CTE is treated very similarly to other constructs that can also be used as inline table expressions in SQL Server: derived tables, views, and inline table-valued functions. Note that whilst BOL says that a CTE "can be thought of as a temporary result set", this is a purely logical description. More often than not it is not materialized in its own right.
What is a temporary table?
This is a collection of rows stored on data pages in tempdb. The data pages may reside partially or entirely in memory. Additionally the temporary table may be indexed and have column statistics.
Test Data
CREATE TABLE T(A INT IDENTITY PRIMARY KEY, B INT , F CHAR(8000) NULL);
INSERT INTO T(B)
SELECT TOP (1000000) 0 + CAST(NEWID() AS BINARY(4))
FROM master..spt_values v1,
master..spt_values v2;
Example 1
WITH CTE1 AS
(
SELECT A,
ABS(B) AS Abs_B,
F
FROM T
)
SELECT *
FROM CTE1
WHERE A = 780
Notice that the execution plan contains no mention of CTE1. It just accesses the base tables directly and is treated the same as
SELECT A,
ABS(B) AS Abs_B,
F
FROM T
WHERE A = 780
Rewriting by materializing the CTE into an intermediate temporary table here would be massively counter productive.
Materializing the CTE definition of
SELECT A,
ABS(B) AS Abs_B,
F
FROM T
Would involve copying about 8GB of data into a temporary table then there is still the overhead of selecting from it too.
Example 2
WITH CTE2
AS (SELECT *,
ROW_NUMBER() OVER (ORDER BY A) AS RN
FROM T
WHERE B % 100000 = 0)
SELECT *
FROM CTE2 T1
CROSS APPLY (SELECT TOP (1) *
FROM CTE2 T2
WHERE T2.A > T1.A
ORDER BY T2.A) CA
The above example takes about 4 minutes on my machine.
Only 15 rows of the 1,000,000 randomly generated values match the predicate but the expensive table scan happens 16 times to locate these.
This would be a good candidate for materializing the intermediate result. The equivalent temp table rewrite took 25 seconds.
CREATE TABLE #T(A INT, B INT, F CHAR(8000) NULL, RN BIGINT); -- temp table for the intermediate result (mirrors T plus the RN column)

INSERT INTO #T
SELECT *,
       ROW_NUMBER() OVER (ORDER BY A) AS RN
FROM T
WHERE B % 100000 = 0
SELECT *
FROM #T T1
CROSS APPLY (SELECT TOP (1) *
FROM #T T2
WHERE T2.A > T1.A
ORDER BY T2.A) CA
Intermediate materialisation of part of a query into a temporary table can sometimes be useful even if it is only evaluated once - when it allows the rest of the query to be recompiled taking advantage of statistics on the materialized result. An example of this approach is in the SQL Cat article When To Break Down Complex Queries.
In some circumstances SQL Server will use a spool to cache an intermediate result, e.g. of a CTE, and avoid having to re-evaluate that sub tree. This is discussed in the (migrated) Connect item Provide a hint to force intermediate materialization of CTEs or derived tables. However, no statistics are created on this, and even if the number of spooled rows were to be hugely different from estimated, it is not possible for the in-progress execution plan to dynamically adapt in response (at least in current versions; Adaptive Query Plans may become possible in the future).
I'd say they are different concepts but not too different to say "chalk and cheese".
A temp table is good for re-use or to perform multiple processing passes on a set of data.
A CTE can be used either to recurse or simply to improve readability.
And, like a view or inline table-valued function, it can also be treated like a macro to be expanded in the main query.
A temp table is another table with some rules around scope
I have stored procs where I use both (and table variables too)
CTEs have their uses - when the data in the CTE is small and there is a strong readability improvement, as with recursive queries. However, their performance is certainly no better than table variables, and when one is dealing with very large tables, temporary tables significantly outperform CTEs. This is because you cannot define indexes on a CTE (a CTE is simply like a macro) when you have a large amount of data that requires joining with another table. If you are joining multiple tables with millions of rows of records in each, CTEs will perform significantly worse than temporary tables.
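To illustrate the indexing point above, a minimal sketch with made-up names:
-- a temp table lets you index the intermediate result; a CTE does not
SELECT OrderId, CustomerId, Amount
INTO #big_subset
FROM dbo.SomeLargeTable
WHERE Amount > 0;

CREATE INDEX IX_big_subset_CustomerId ON #big_subset (CustomerId);

-- subsequent joins to #big_subset can now use the index
SELECT c.Name, b.Amount
FROM #big_subset b
JOIN dbo.Customers c ON c.CustomerId = b.CustomerId;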
Temp tables are always on disk - so as long as your CTE can be held in memory, it would most likely be faster (like a table variable, too).
But then again, if the data load of your CTE (or temp table variable) gets too big, it'll be stored on disk, too, so there's no big benefit.
In general, I prefer a CTE over a temp table since it's gone after I used it. I don't need to think about dropping it explicitly or anything.
So, no clear answer in the end, but personally, I would prefer CTE over temp tables.
So the query I was assigned to optimize was written with two CTEs in SQL server. It was taking 28sec.
I spent two minutes converting them to temp tables and the query took 3 seconds
I added an index to the temp table on the field it was being joined on and got it down to 2 seconds
Three minutes of work and now it's running 12x faster, all by removing the CTEs. I personally will not use CTEs ever; they are tougher to debug as well.
The crazy thing is the CTEs were both only used once, and putting an index on them still proved to be 50% faster.
I've used both, but in massive complex procedures I have always found temp tables better to work with and more methodical. CTEs have their uses, but generally with small data.
For example, I've created sprocs that come back with the results of large calculations in 15 seconds, yet after converting that code to run in a CTE I have seen it run in excess of 8 minutes to achieve the same results.
A CTE won't take any physical space. It is just a result set we can join to.
Temp tables are temporary. We can create indexes and constraints on them just as with normal tables, but for that we need to define all the columns.
A temp table's scope is limited to the session.
EX:
Open two SQL query window
create table #temp(empid int, empname varchar(10))
insert into #temp
select 101,'xxx'
select * from #temp
Run this query in the first window,
then run the query below in the second window and you will see the difference.
select * from #temp
Late to the party, but...
The environment I work in is highly constrained, supporting some vendor products and providing "value-added" services like reporting. Due to policy and contract limitations, I am not usually allowed the luxury of separate table/data space and/or the ability to create permanent code [it gets a little better, depending upon the application].
IOW, I can't usually develop a stored procedure or UDFs or temp tables, etc. I pretty much have to do everything through MY application interface (Crystal Reports - add/link tables, set where clauses from w/in CR, etc.). One SMALL saving grace is that Crystal allows me to use COMMANDS (as well as SQL Expressions). Some things that aren't efficient through the regular add/link tables capability can be done by defining a SQL Command. I use CTEs through that and have gotten very good results "remotely". CTEs also help w/ report maintenance, not requiring that code be developed, handed to a DBA to compile, encrypt, transfer, install, and then require multiple-level testing. I can do CTEs through the local interface.
The down side of using CTEs w/ CR is that each report is separate. Each CTE must be maintained for each report. Where I can do SPs and UDFs, I can develop something that can be used by multiple reports, requiring only linking to the SP and passing parameters as if you were working on a regular table. CR is not really good at handling parameters into SQL Commands, so that aspect of the CR/CTE approach can be lacking. In those cases, I usually try to define the CTE to return enough data (but not ALL data), and then use the record selection capabilities in CR to slice and dice that.
So... my vote is for CTEs (until I get my data space).
One use where I found CTEs excelled performance-wise was where I needed to join a relatively complex query onto a few tables which had a few million rows each.
I used the CTE to first select the subset based on the indexed columns, cutting those tables down to a few thousand relevant rows each, and then joined the CTE to my main query. This dramatically reduced the runtime of my query.
Whilst the results of the CTE are not cached and table variables might have been a better choice, I really just wanted to try them out and found they fit the above scenario.
I just tested this - both the CTE and non-CTE versions (where the query was typed out for every union instance) took ~31 seconds. The CTE made the code much more readable though - it cut it down from 241 to 130 lines, which is very nice. The temp table, on the other hand, cut it down to 132 lines and took FIVE SECONDS to run. No joke. All of this testing was cached - the queries were all run multiple times beforehand.
This is a really open-ended question, and it all depends on how it's being used and the type of temp table (table variable or traditional temp table).
A traditional temp table stores the data in the temp DB, which does slow down the temp tables; however table variables do not.
From my experience in SQL Server, I found one scenario where a CTE outperformed temp tables.
I needed to use a data set (~100,000 rows) from a complex query just ONCE in my stored procedure.
The temp table was causing overhead, and my procedure was performing slowly (as temp tables are real materialized tables that exist in tempdb and persist for the life of the procedure).
On the other hand, with a CTE, the CTE persists only until the following query is run. So a CTE is a handy in-memory structure with limited scope. CTEs don't use tempdb by default.
This is one scenario where CTEs can really help simplify your code and outperform temp tables.
I used 2 CTEs, something like:
WITH CTE1(ID, Name, Display)
AS (SELECT ID,Name,Display from Table1 where <Some Condition>),
CTE2(ID,Name,<col3>) AS (SELECT ID, Name,<> FROM CTE1 INNER JOIN Table2 <Some Condition>)
SELECT CTE2.ID,CTE2.<col3>
FROM CTE2
GO