Query Performance Optimization in Snowflake - snowflake-cloud-data-platform

In the Query Profile view, which of these should be specifically considered to improve overall query performance?
A. Bytes scanned
B. Partitions scanned out of total partitions
C. Percentage scanned from cache
D. Bytes sent over the network

They all contribute. Scanning more bytes takes longer, and if you have a slow network, retrieving the results will take longer.
However, partition pruning has a very noticeable impact. The more partitions you can eliminate and the more you can scan from cache, the faster the query.
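As a hedged sketch of how pruning can be influenced (the table and column names below are invented for illustration): clustering a large table on the column most queries filter by helps Snowflake skip micro-partitions, and the effect shows up in the Query Profile as "Partitions scanned" versus "Partitions total".

-- Cluster the table on the column most queries filter on, so Snowflake
-- can prune micro-partitions that cannot contain matching rows.
ALTER TABLE sales CLUSTER BY (order_date);

-- Check how well the table is clustered on that column.
SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(order_date)');

-- A query that filters on the clustering key can skip most partitions.
SELECT SUM(amount)
FROM sales
WHERE order_date >= '2020-01-01';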

Related

SQL Server high logical reads vs scan count

From a performance tuning perspective which one is more important?
Say a query reports 30 scans and 148 logical reads on a table with about 2 million records.
A modified version of the same query reports 1 scan with 1400 logical reads. The second query takes about 40 ms less CPU time to execute. Is the second query better?
I think so and this is my thesis:
In the first case, we have a high number of scans on a very large table. This is costly in CPU and server memory, since all the rows in the table have to be loaded into memory. Executing such a query thousands of times will be taxing on server resources.
In the second case, we have fewer scans even though we are accumulating a higher number of logical reads. Since logical reads effectively correspond to the number of pages being read from cache, the bottleneck here will be network bandwidth in getting the results back to the client. The actual work SQL Server has to do in this case is less.
What are your thoughts?
The logical read metrics are mostly irrelevant. You care about time elapsed, CPU time spent and disk resources used. Why would you care about logical reads? They are accounted for by looking at CPU time.
If you want your query to go faster measure wall clock time. If you want to use less resources measure CPU and physical IO.
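A minimal way to capture elapsed time, CPU time and I/O for a statement in SQL Server (the table and query below are just placeholders for the one you are tuning):

-- Report CPU time, elapsed time and I/O for the statements that follow.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Placeholder query; substitute the statement you are comparing.
SELECT COUNT(*) FROM dbo.MyLargeTable WHERE SomeColumn = 42;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;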

Are table-scan results held in memory negating the benefit of indexes?

Theoretical SQL Server 2008 question:
If a table-scan is performed on SQL Server with a significant amount of 'free' memory, will the results of that table scan be held in memory, thereby negating the efficiencies that may be introduced by an index on the table?
Update 1: The tables in question contain reference data with approx. 100 - 200 records per table (I do not know the average size of each row), so we are not talking about massive tables here.
I have spoken to the client about introducing a memcached / AppFabric Cache solution for this reference data, however that is out of scope at the moment and they are looking for a 'quick win' that is minimal risk.
Every page read in the scan will be read into the buffer pool and only released under memory pressure as per the cache eviction policy.
Not sure why you think that would negate the efficiencies that may be introduced by an index on the table though.
An index likely means that many fewer pages need to be read, and even if all pages are already in cache so that no physical reads are required, reducing the number of logical reads is still a good thing. Logical reads are not free: they still have overhead for locking and reading the pages.
Besides the performance problem (even when all pages are in memory a scan is still going to be many many times slower than an index seek on any table of significant size) there is an additional issue: contention.
The problem with scans is that any operation will have to visit every row. This means that any select will block behind any insert/update/delete (since it is guaranteed to visit the rows locked by those operations). The effect is basically serialization of operations and adds huge latency, as SELECTs now have to wait for DML to commit every time. Even under mild concurrency the effect is an overall sluggish and slow-to-respond table. With indexes present, operations only look at rows in the ranges of interest, and this, by virtue of simple probability, reduces the chance of conflict. The result is a much livelier, more responsive, low-latency system.
Full table scans also do not scale as the data grows. It's very simple: as more data is added to a table, a full table scan must process more data to complete and will therefore take longer. It will also produce more disk and memory requests, putting further strain on your equipment.
Consider a 1,000,000-row table on which a full table scan is performed. SQL Server reads data in the form of 8K data pages. Although the amount of data stored within each page can vary, let's assume that on average 50 rows of data fit in each of these 8K pages for our example. To perform a full scan of the data and read every row, 20,000 page reads are required (1,000,000 rows / 50 rows per page). That equates to about 156MB of data that has to be processed, just for this one query. Unless you have a really fast disk subsystem, it might take a while to retrieve and process all of that data. Now assume that this table doubles in size each year. Next year, the same query must read 312MB of data just to complete.
Please refer to this link: http://www.datasprings.com/resources/articles-information/key-sql-performance-situations-full-table-scan
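To make that concrete, here is a hedged sketch (the table and column names are invented): an index on the filtered column lets SQL Server seek to the few pages it needs instead of reading all 20,000.

-- Without an index on OrderDate, this predicate forces a scan of every
-- 8K page in the table.
SELECT OrderId, CustomerId, Amount
FROM dbo.Orders
WHERE OrderDate = '2012-06-01';

-- A nonclustered index on the predicate column turns the scan into a seek
-- that touches only the pages holding matching rows.
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
    ON dbo.Orders (OrderDate);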

SQL Server Profiler - Evaluating Reads. What is considered 'good' or 'bad'?

I'm profiling (SQL Server 2008) some of our views and queries to determine their efficiency with regards to CPU usage and Reads. I understand Reads are the number of logical disk reads in 8KB pages. But I'm having a hard time determining what I should be satisfied with.
For example, when I query one of our views, which in turn joins with another view and has three OUTER APPLYs with table-valued UDFs, I get a Reads value of 321 with a CPU value of 0. My first thought is that I should be happy with this. But how do I evaluate the value of 321? This tells me 2,629,632 bytes of data were logically read to satisfy the query (which returned a single row and 30 columns).
How would some of you go about determining if this is good enough, or requires more fine tuning? What criteria would you use?
Also, I'm curious what is included in the 2,629,632 bytes of logical data read. Does this include all the data contained in the 30 columns of the single row returned?
The 2.5MB includes all data in the 321 pages, including the other rows in the same pages as those retrieved for your query, as well as the index pages retrieved to find your data. Note that these are logical reads, not physical reads; a read from a cached page is much 'cheaper'. Take CPU and the profiler cost indicators into account as well when optimising.
Regarding how to determine an optimum 'target' for reads:
FWIW, I compare the actual reads with an optimum value, which I think of as the minimum number of pages needed to return the data in your query in a 'perfect' world.
e.g. if you calculate roughly 5 rows per page from table x, and your query returns 20 rows, the 'perfect' number of reads would be 4, plus some overhead of navigating indexes (assuming of course that the rows are clustered 'perfectly' for your query) - so utopia would be around say 5-10 pages.
For a performance critical query, you can use the actual reads vs 'utopian' reads to micro-optimise, e.g.:
Whether I can fit more rows per page in the cluster (table), e.g. replacing non-searched strings with varchar() not char, or using varchar not nvarchar() or using smaller integer types etc.
Whether the clustered index could be changed such that fewer pages would need to be fetched (e.g. if the 20 rows for the above query were scattered across different pages, then reads would be > 4)
Failing that (since you can only have one clustered index), whether covering indexes could replace the need to go to the table data (cluster) at all, since covering indexes fitting your query will have higher 'row' densities - see the sketch after this list
And for indexes, density improvements such as fill factors or narrower index keys can mean fewer index reads
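A hedged sketch of that covering-index idea (all names are invented):

-- For a query such as:
--   SELECT CustomerId, OrderDate, Amount
--   FROM dbo.Orders
--   WHERE CustomerId = @CustomerId;
-- the key column supports the seek, and the INCLUDEd columns let the query
-- be answered entirely from the index, with no lookups into the cluster.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_Covering
    ON dbo.Orders (CustomerId)
    INCLUDE (OrderDate, Amount);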
You might find this article useful
HTH!
321 reads with a CPU value of 0 sounds pretty good, but it all depends.
How often is this query run? Why are table-returning UDFs used instead of just doing joins? What is the context of database use (how many users, number of transactions per second, database size, is it OLTP or data warehousing)?
The extra data reads come from:
All the other data in the pages needed to satisfy the reads done in the execution plan. Note this includes clustered and nonclustered indexes. Examining the execution plan will give you a better idea of what exactly is being read. You'll see references to all sorts of indexes and tables, and whether a seek or scan was required. Note that a scan means every page in the whole index or table was read. That is why seeks are desirable over scans.
All the related data in tables INNER JOINed to in the views regardless of whether these JOINs are needed to give correct results for the query you're performing, since the optimizer doesn't know that these INNER JOINs will or won't exclude/include rows until it JOINs them.
If you provide the queries and execution plans, as requested, I would probably be able to give you better advice. Since you're using table-valued UDFs, I would also need to see the UDFs themselves, or at least the execution plans of the UDFs (which is only possible by tearing out their bodies and running them outside a function context, or by converting them to stored procedures).

SQL Server 2005 - Rowsize effect on query performance?

I'm trying to squeeze some extra performance from searching through a table with many rows.
My current reasoning is that if I can throw away some of the seldom-used columns from the searched table, thereby reducing the row size, the number of page splits and hence the IO should drop, giving a benefit when data starts to spill from memory.
Any good resource detailing such effects?
Any experiences?
Thanks.
Tuning the size of a row is only a major issue if the RDBMS is performing a full table scan; if your query can select the rows using only indexes, then the row size is less important (unless you are returning a very large number of rows, where the IO of returning the actual result is significant).
If you are doing a full table scan, or partial scans of large numbers of rows because you have predicates that are not using indexes, then row size can be a major factor. One example I remember: on a table of the order of 100,000,000 rows, splitting the largish 'data' columns into a different table from the columns used for querying resulted in an order-of-magnitude performance improvement on some queries.
I would only expect this to be a major factor in a relatively small number of situations.
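A hedged, simplified sketch of that kind of split (all names are invented): move the wide, rarely-searched columns into a 1:1 side table so the frequently scanned table stays narrow.

-- Narrow table holding only the columns used in search predicates.
CREATE TABLE dbo.Product
(
    ProductId  INT           NOT NULL PRIMARY KEY,
    Name       VARCHAR(100)  NOT NULL,
    CategoryId INT           NOT NULL,
    Price      DECIMAL(10,2) NOT NULL
);

-- Wide, seldom-read columns live in a 1:1 side table.
CREATE TABLE dbo.ProductDetail
(
    ProductId   INT            NOT NULL PRIMARY KEY
        REFERENCES dbo.Product (ProductId),
    Description VARCHAR(MAX)   NULL,
    ImageData   VARBINARY(MAX) NULL
);

-- Searches touch only the narrow table; the wide columns are joined in
-- just for the rows actually returned.
SELECT p.ProductId, p.Name, d.Description
FROM dbo.Product AS p
JOIN dbo.ProductDetail AS d ON d.ProductId = p.ProductId
WHERE p.CategoryId = 7;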
I don't know what else you have tried to increase performance; this seems like grasping at straws to me. That doesn't mean it isn't a valid approach. From my experience the benefit can be significant, it's just that it's usually dwarfed by other kinds of optimization.
However, what you are looking for are I/O statistics. There are several methods to gather them; a quite good introduction can be found here.
The SQL Server query plan optimizer is a very complex algorithm, and the decision of which index to use or which type of scan to perform depends on many factors: the query's output columns, the indexes available, the statistics available, the statistical distribution of your data values in the columns, the row count, and the row size.
So the only valid answer to your question is: It depends :)
Give some more information like what kind of optimization you have already done, what does the query plan looks like, etc.
Of course, when SQL Server decides to do a table scan (a clustered index scan if available), you can improve I/O performance by reducing the row size. But in that case you would increase performance far more dramatically by creating an adequate index (which is de facto a separate table with a smaller row size).
If the application is transactional then look at the indexes in use on the table. Table partitioning is unlikely to be much help in this situation.
If you have something like a data warehouse and are doing aggregate queries over a lot of data then you might get some mileage from partitioning.
If you are doing a join between two large tables that are not in a 1:M relationship the query optimiser may have to resolve the predicates on each table separately and then combine relatively large intermediate result sets or run a slow operator like nested loops matching one side of the join. In this case you may get a benefit from a trigger-maintained denormalised table to do the searches. I've seen good results obtained from denormalised search tables for complex screens on a couple of large applications.
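A hedged, much-simplified sketch of a trigger-maintained search table (names are invented; a real implementation would also handle UPDATE and DELETE):

-- Denormalised table holding the pre-joined columns the search screen needs.
CREATE TABLE dbo.OrderSearch
(
    OrderId      INT           NOT NULL PRIMARY KEY,
    CustomerName VARCHAR(100)  NOT NULL,
    OrderDate    DATETIME      NOT NULL,
    Amount       DECIMAL(10,2) NOT NULL
);

-- Keep it in sync as new orders are inserted.
CREATE TRIGGER trg_Orders_Insert ON dbo.Orders
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.OrderSearch (OrderId, CustomerName, OrderDate, Amount)
    SELECT i.OrderId, c.Name, i.OrderDate, i.Amount
    FROM inserted AS i
    JOIN dbo.Customer AS c ON c.CustomerId = i.CustomerId;
END;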
If you're interested in minimizing IO when reading data, you need to check whether indexes cover the query or not. To minimize IO you should select columns that are included in an index, or use indexes that cover all the columns used in the query; that way the optimizer will read data from the indexes and never touch the actual table rows.
If you're looking into this kind of detail, maybe you should also consider upgrading hardware, changing controllers, or adding more disks to have more spindles available for the query processor, allowing SQL Server to read more data at the same time.
Disk I/O is frequently the cause of bottlenecks in SQL Server systems. The I/O subsystem includes disks, disk controller cards, and the system bus. If disk I/O is consistently high, consider:
Move some database files to an additional disk or server.
Use a faster disk drive or a redundant array of inexpensive disks (RAID) device.
Add additional disks to a RAID array, if one already is being used.
Tune your application or database to reduce disk access operations.
Consider index coverage, better indexes, and/or normalization.
Microsoft SQL Server uses Microsoft Windows I/O calls to perform disk reads and writes. SQL Server manages when and how disk I/O is performed, but the Windows operating system performs the underlying I/O operations. Applications and systems that are I/O-bound may keep the disk constantly active.
Different disk controllers and drivers use different amounts of CPU time to perform disk I/O. Efficient controllers and drivers use less time, leaving more processing time available for user applications and increasing overall throughput.
The first thing I would do is ensure that your indexes have been rebuilt; if you are dealing with a huge amount of data and an index rebuild is not possible (from SQL Server 2005 onwards you can perform online rebuilds without locking everyone out), then ensure that your statistics are up to date (more on this later).
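A hedged sketch of those maintenance steps (the index and table names are placeholders; ONLINE rebuilds require Enterprise Edition):

-- Rebuild a fragmented index without taking it offline (SQL Server 2005+,
-- Enterprise Edition for ONLINE = ON).
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
    REBUILD WITH (ONLINE = ON);

-- Or just bring the statistics up to date if a full rebuild is too heavy.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;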
If your database contains representative data, then you can perform a simple measurement of the number of reads (logical and physical) that your query is using by doing the following:
SET STATISTICS IO ON
GO
-- Execute your query here
SET STATISTICS IO OFF
GO
On a well setup database server, there should be little or no physical reads (high physical reads often indicates that your server needs more RAM). How many logical reads are you doing? If this number is high, then you will need to look at creating indexes. The next step is to run the query and turn on the estimated execution plan, then rerun (clearing the cache first) displaying the actual execution plan. If these differ, then your statistics are out of date.
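On a test server (never in production), that comparison might look roughly like this; the table and query are placeholders:

-- Clear cached plans and data pages so the next run starts cold
-- (only ever do this on a test server).
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;

-- In Management Studio: display the estimated execution plan for the query,
-- then enable "Include Actual Execution Plan" and run it.
SELECT OrderId, Amount
FROM dbo.Orders
WHERE OrderDate >= '2012-01-01';

-- Large gaps between estimated and actual row counts usually mean the
-- statistics are stale.
UPDATE STATISTICS dbo.Orders;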
I think you're going to be farther ahead using standard optimization techniques first -- check your execution plan, profiler trace, etc. and see whether you need to adjust your indexes, create statistics etc. -- before looking at the physical structure of your table.

Costs vs Consistent gets

What does it indicate to see a query that has a low cost in the explain plan but a high consistent gets count in autotrace? In this case the cost was in the 100's and the CR's were in the millions.
The cost can represent two different things depending on version and whether you are running in cpu-based costing mode or not.
Briefly, the cost represents the amount of time that the optimizer expects the query to take to execute, but it is expressed in units of the amount of time that a single block read takes. For example, if Oracle expects a single block read to take 1ms and the query to take 20ms, then the cost equals 20.
Consistent gets do not match exactly with this for a number of reasons: the cost includes non-consistent (current) gets (eg reading and writing temp data), the cost includes CPU time, and a consistent get can be a multiblock read instead of a single block read and hence have a different duration. Oracle can also get the estimate of the cost completely wrong and it could end up requiring a lot more or less consistent gets than the estimate suggested.
A useful method that can help explain disconnects between the predicted execution plan and actual performance is "cardinality feedback". See this presentation: http://www.centrexcc.com/Tuning%20by%20Cardinality%20Feedback.ppt.pdf
At best, the cost is the optimizer's estimate of the number of I/O's that a query would perform. So, at best, the cost is only likely to be accurate if the optimizer has found a very good plan-- if the optimizer's estimate of the cost is correct and the plan is ideal, that generally means that you're never going to bother looking at the plan because that query is going to perform reasonably well.
Consistent gets, however, is an actual measure of the number of gets that a query actually performed. So that is a much more accurate benchmark to use.
Although there are many, many things that can influence the cost, and a few things that can influence the number of consistent gets, it is probably reasonable to expect that if you have a very low cost and a very high number of consistent gets that the optimizer is probably working with poor estimates of the cardinality of the various steps (the ROWS column in the PLAN_TABLE tells you the expected number of rows returned in each step). That may indicate that you have missing or outdated statistics, that you are missing some histograms, that your initialization parameters or system statistics are wrong in some way, or that the CBO has problems for some other reason estimating the cardinality of your results.
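One hedged way to put the estimated and actual row counts side by side (assuming Oracle 10g or later; the table and query are placeholders):

-- Run the statement once with runtime statistics collection enabled.
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM orders
WHERE order_date >= DATE '2012-01-01';

-- Then show the last plan with estimated (E-Rows) vs actual (A-Rows) rows.
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));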
What version of Oracle are you using?
