Predicting the cost of a Snowflake query before executing it - snowflake-cloud-data-platform

Is there a way to predict the cost of a query I'm going to execute on Snowflake before actually executing it? We would like to avoid high costs, so having the cost calculated after executing the query doesn't help much.
Other technologies such as BigQuery provide calculators to estimate the query's cost before executing, but I didn't find such an option for Snowflake.

There isn't a tool within Snowflake that will do this for you automatically unfortunately, you'd have to manually figure it out yourself.
For example, a naive way of calculating it could be: run the query on a 10% sample of the table(s) and then take the credit usage of that query and multiply it by 10.

Related

Quickly view the total sum of execution cost in execution plan

I have a large execution plan. Is there a way to quickly view the total sum of IO cost, CPU cost, etc. for the full plan without manually calculating each node?
SentryOne Plan Explorer is a great tool for analysing SQL Server execution plans. I don't work for them, I use this tool very often.
It can show the costs of individual operator or cumulative costs.

Can I use Plan Guides to optimize a slow performing Query?

A synchronization program is syncing data between our SQL server and an online database. Every 5 minutes the program runs query's on all tables, al in the format:
select max(ID) from table
After that, the program retreives information from the online database, using the max(ID) to retreive only newer records.
The query runs fast on small tables, but some tables have millions of records.
The performance can be boosted by using a Where statement:
select max(ID) from table where date >= dateadd(dd,-30,getdate())
Unfortunately it's an old program, which cannot be changed anymore.
(no supplier and no source code)
I read something about Plan Guides, which should give performance boosts to query's.
Can I use a plan guide to alter these query's so they run much faster??
I would try to stay clear of plan guides unless you have tried everything else to help SQL Server choose a better execution plan; reason, you are forcing SQL Server to choose maybe not the best execution plan and if your statistics are right, SQL Server does a heck of a job to provide you with a very good estimation plan.
Sorry if this is going to be long winded but SQL Server performance is based on statistics and if your statistics are off, your query will not perform optimally. This is where I would first start.
I would first run your query and generate an estimated execution plan and actual execution plan. If you have never worked with execution plans, here are a few sites to start you off to understand operators and how to interpret some of the more common operators; Rasanu Consulting (Joins), Microsoft (Joins), Simple Talk (Common Operatiors), Microsoft (All Operators). The execution plans are generated from statistics SQL Server stores on indexes and non-indexed columns. I am assuming your query is stored in a procedure where the initial execution plan is stored. Unfortunately, changes to the query in the stored proc can cause performance issues because of an outdated plan and the proc will need to be recompiled.
Based on the results of your execution plan, you may find a simple clustered index will help performance or find that fine tuning your where clause will result in better carnality estimates. Here are a few sites that do a really good job explaining statistics; Patrick Keisler, Itzik Ben-Gan, Microsoft
You may be wondering "what does this have to do with my question?" which would expect a simple yes or no answer but, effectively correcting query performance starts with an understanding statistics and execution plans.
Hope this helps!

How to fetch query execution statistics using Oracle DB?

I am new to database. I try to run a simple query on SQL Server 2014 and Oracle 12c.
This is the execution plan I get using SQL Server. It contains information about I/O cost and CPU cost in seconds.
However I can't find the same information using Oracle. The CPU cost shown in the execution plan is not based on execution time.
I want to do some comparison between the two databases. How I can obtain the same information in Oracle as in SQL Server? Besides, how I can know the cache hit ratio?
Thank you.
The cost estimate is in fact based on time.
It is a non-dimensionalised measurement that expresses the estimated time for the query to complete in terms of the equivalent number of logical reads, so if a logical read is expected to take 0.001 seconds then a cost of 12 is 0.012 seconds.
Although it is commonly stated that the cost between different queries cannot be compared, this was only definitively true in earlier versions. The difficulty in comparing query costs relates to how long single block and multiblock reads, writes and CPU operations take. This can depend on such a multitude of factors (other activity on the system, and activity immediately prior that affects the likelihood of blocks being cached by the instance or the i/o subsystem) that it is highly unlikely that you really expect to derive a time from a cost.
Cache hit ratios have been discredited for quite some time as a measurement of system efficiency. It is possible to improve the cache hit ratio to an arbitrary number by simply running particular types of highly inefficient queries.
Use the Oracle Database 12c: EM Express Performance Hub to get both estimates and actual values for queries and their operations. Regular explain plans are helpful, but they just show you what Oracle thinks will happen, not necessarily what will happen.
Specifically, use either the SQL Details (aggregate) or the SQL Monitor Details (last execution) information.
You're close, very close.
Run with AutoTrace.
I talk more about the feature here, or you can of course read up on the docs or the Help.

SQL Server 2008 R2 - optimization issue

I'm running into a performance issue with the current schema. So I built an equivalent schema to solve the issue.
I ran some tests on both schemas and the results are hard to understand. For the record, the data is the same.
I get the following from the Profiler when executing equivalent requests on the two schemas.
Old schema:
1,300,000 reads
5,000 CPU
4 seconds execution time
New schema:
30,000 reads
3,000 CPU
6 seconds execution time
The difference seems to be in the query plan used. The old schema has parallelism in the query plan. The new schema isn't using parallelism.
Has anyone faced similar situations (less IO/CPU but more execution time). How did you solve it?
Is there a way to force parallelism? I've played with query hints(http://msdn.microsoft.com/en-us/library/ms18171). I'm able to stop parallelism on the old schema but can't seem the query on the new schema to use parallelism.
Thanks in advance.
Louis,
Currently there is no way to force parallelism in SQL Server straight out of the box but Adam Machanic did some work to do that though.
http://whoisactive.com
Coming to your first question, yes we have seen cases like that too. Note that Parallelism is cpu bound and that's why you are seeing more cpu time but overall less execution time as you have multiple threads doing the work for you.
http://www.simple-talk.com/sql/learn-sql-server/understanding-and-using-parallelism-in-sql-server/
Make sure you have proper indexes in place and also stats are updated with full scan. In the long run it is best if Query Optimizer makes the decisions by itself but if you want to overwrite the QO plans then you may have to add lot more details. Schema, data and repro.
HTH

What is the usage of Execution Plan in SQL Server?

What is the usage of Execution Plan in SQL Server? When can these plans help me?
When your queries all run fast, all is good in the world and execution plans don't really matter that much. However, when something is running slow, they are very important. They are primarily used to help tune (speed up) slow SQL. Without execution plans, you'd just be guessing at what to change to make your SQL go faster.
Here is the simplest way they can help you. Take a slow query and do the following in a SQL Server Management Studio query window:
1) run the command:
SET SHOWPLAN_ALL ON
2) run your slow query
3) your query will not run, but the execution plan will be returned.
4) look through the PhysicalOp column output for the word SCAN within any text in this column, this is usually the part of the query that is causing the slowdown. Analyze your joins and index usage in regards to this row of the output and if you can eliminate the scan, you will usually improve the query speed.
There are may useful columns (TotalSubTreeCost, etc) in the output, you will become familiar with them as you learn how to read execution plans and tune your slow queries.
When you need to perform performance profiling on a specific query.
Have a look at SQL Server Query Execution Plan Analysis
When it comes time to analyze the
performance of a specific query, one
of the best methods is to view the
query execution plan. A query
execution plan outlines how the SQL
Server query optimizer actually ran
(or will run) a specific query. This
information if very valuable when it
comes time to find out why a specific
query is running slow.
It's helpful in identifying where bottlenecks are occurring in long running queries. You can make some quite impressive performance improvements simply by knowing how the server executes your complex query.
If I remember correctly it also identifies good candidates for indexing which is another way to increase performance.
These plans describe how SQL Server goes about executing your query. It's the result of a cost-based algorithm by the SQL Server query optimiser, which comes up with a plan for how to get to the end result in the expected best possible way.
It's useful because it will show you where time is being spent in the query, whether indexes are being used or not, what type of process is being done on those indexes (scan, seek) etc.
So if you have a poorly performing query, the execution plan will highlight what the costliest parts are and allow you to see what needs optimising (e.g. may be a missing index, may be an inefficiently written query resulting in an index scan instead of a seek).

Resources