Say we have two servers making identical queries to the same database roughly once an hour, and the database is updated fairly rarely (every 30 minutes). Getting the result back fast is not important, but we would like the data warehouse to run for as short a time as possible.
Should we make sure that one of the queries completes before the other begins, so that the result is cached? Is Snowflake smart enough to realize when it is being asked to run two identical queries, and to do the work only once?
As referenced in the documentation, Snowflake query results are persisted for 24 hours. So if nothing about the query or the underlying data has changed, Snowflake does not regenerate the results. We have tested this in all our applications.
Below is the link; please check it and let me know if this helps.
https://docs.snowflake.com/en/user-guide/querying-persisted-results.html
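
For example, with result reuse enabled for the session, running the exact same statement twice should show the second run returning almost instantly from the persisted result, as long as the underlying data has not changed (the database, schema, and table names below are just placeholders):

ALTER SESSION SET USE_CACHED_RESULT = TRUE;   -- result reuse is on by default; shown here only for clarity

SELECT region, COUNT(*) FROM my_db.my_schema.orders GROUP BY region;   -- first run executes on the warehouse
SELECT region, COUNT(*) FROM my_db.my_schema.orders GROUP BY region;   -- identical text, unchanged data: served from the result cache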
Over the years our SQL Server database has accumulated a lot of data, which caused queries to run slowly and, in turn, the application to slow down significantly.
We eventually decided to archive certain data by storing it in a different data store and deleting it from SQL Server. Note that the data is spread over 22 tables (metadata). After deleting about 40% of the data we saw that certain queries were running significantly slower. Although a few queries improved slightly, a load test showed that response times and transactions per second had dropped significantly.
As part of the cleanup on the database side, we reorganized and rebuilt the indexes and made sure that the fragmentation after the deletes was within permissible limits (ideally < 30%, and it was well below that). After this we ran UPDATE STATISTICS with a full scan.
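
For reference, the maintenance pass we ran was roughly the following for each table (the table name here is a placeholder):

ALTER INDEX ALL ON dbo.YourTable REBUILD;        -- REORGANIZE instead where fragmentation was light
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

-- fragmentation check after the deletes
SELECT i.name AS index_name, ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.YourTable'), NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id;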
We identified a particular query which runs significantly slower after the deletes and saw that it had a different query plan compared to before the delete (we created a new database with the pre-delete data set to compare the differences), even though the table definitions (indexes, constraints, etc.) are the same for all tables in the query.
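
For anyone who wants to reproduce the comparison, one way to pull the cached plan for that query on each copy of the database is something like this (the LIKE filter is a placeholder for part of the query text):

SELECT TOP (5)
    st.text,
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_us,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE '%YourSlowQuery%'
ORDER BY qs.total_elapsed_time DESC;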
We ran the load tests again after these steps but still saw the same performance as before. I am not a SQL Server expert, but I am working closely with the DBAs to understand the underlying cause of the slowdown.
The DBAs are suggesting that we optimize the queries, but I am not convinced that would actually help: the same queries perform better against the larger (pre-delete) dataset, which suggests the deletes changed something about the tables that is causing the slowdown.
I would really appreciate any pointers or guidance on addressing this issue.
Thanks,
Karthik
I am creating a database for a medical system and have reached the point of building a notification feature using SQL jobs. The job's responsibility is to check some tables; for every entity it finds that needs to be notified about a change in certain data, it puts that entity's id into a table called Notification, and a trigger then prompts the app to check that table and send the notification.
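
Roughly the kind of job step I have in mind is this (all table and column names are just placeholders):

INSERT INTO dbo.Notification (EntityId, CreatedAt)
SELECT a.PatientId, GETDATE()
FROM dbo.Appointments AS a
WHERE a.LastModified >= DATEADD(SECOND, -10, GETDATE())
  AND NOT EXISTS (SELECT 1 FROM dbo.Notification AS n WHERE n.EntityId = a.PatientId);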
What I want to ask is: how many SQL jobs can a SQL Server instance handle?
Does the number of SQL jobs running in the background affect the performance of my application or of the database in one way or another?
NOTE: the SQL job will run every 10 seconds
I couldn't find any useful information online.
Thanks in advance.
This question really doesn't have enough background to get a definitive answer. What are the considerations?
Do the queries in your ten-second job actually complete in ten seconds, even when your DBMS is under its peak transactional workload? Obviously, if the job routinely doesn't complete in ten seconds, you'll get jobs piling up.
Do the queries in your job lock up tables and/or indexes so the transactional load can't run efficiently? (You should use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; as much as you can so database reads won't lock things unnecessarily.)
Do the queries in your job do a lot of rows' worth of inserts and updates, and so swamp the SQL Server transaction logs?
How big is your server? (CPU cores? RAM? IO capacity?) How big is your database?
If your project succeeds and you get many users, will your answers to the above questions remain the same? (Hint: no.)
You should spend some time on the execution plans for the queries in your job, and try to make them as efficient as possible. Add the necessary indexes and, if necessary, refactor the queries. SSMS will show you the execution plans and suggest appropriate indexes.
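
For example, this query against the standard missing-index DMVs lists the indexes SQL Server itself thinks the workload would benefit from; treat the output as hints to investigate, not as indexes to create blindly:

SELECT TOP (10)
    d.statement AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    gs.user_seeks,
    gs.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS gs
    ON gs.group_handle = g.index_group_handle
ORDER BY gs.user_seeks * gs.avg_user_impact DESC;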
If your job is doing things like deleting expired rows, you may want to build the expiration in your data model. For example, suppose your job does
DELETE FROM readings WHERE expiration_date <= GETDATE()
and your application does this, relying on your job to ensure it never sees expired readings:
SELECT something FROM readings
You can refactor your application query to say
SELECT something FROM readings WHERE expiration_date > GETDATE()
and then run your job overnight, at a quiet time, rather than every ten seconds.
A ten-second job is not the greatest idea in the world. If you can rework your application so it will function correctly with a ten-second, ten-minute, or twelve-hour job, you'll have a more resilient production system. At any rate, if something goes wrong with the job when your system is very busy, you'll have more than ten seconds to fix it.
We are running databases in SQL Server 2012 with multiple large datasets (some are in the 50M+ records range). The previous SQL developer designed the queries and optimized them but they still take 2+ hours to run.
He partially worked around this by creating a static table which gets updated every time the queries are run, so if the data hasn't changed, the query runs from this summarized table. The static table gets updated by performing checksums on the relevant tables and refreshing it when the checksums show the data has changed.
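
Conceptually, the checksum-based refresh is something like this (all object names here are made up; the real implementation spans several source tables):

DECLARE @current INT;
SELECT @current = CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM dbo.SourceTable;

IF NOT EXISTS (SELECT 1 FROM dbo.RefreshLog
               WHERE TableName = 'SourceTable' AND LastChecksum = @current)
BEGIN
    -- data changed since the last run: rebuild the summary and remember the new checksum
    TRUNCATE TABLE dbo.ReportSummary;
    INSERT INTO dbo.ReportSummary (CustomerId, TotalAmount)
    SELECT CustomerId, SUM(Amount) FROM dbo.SourceTable GROUP BY CustomerId;

    UPDATE dbo.RefreshLog SET LastChecksum = @current WHERE TableName = 'SourceTable';
END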
I'm trying to speed up the whole process. We have designed an in-house GUI so managers can run reports themselves, but I don't want them to waste two hours waiting for a report. I will be reviewing the indexes he was using to see if I can optimize them further, and tweak his code as well, but I suspect I might only get minimal performance improvement.
I like the idea of the static table for reporting, but I would like to have it updated more frequently (preferably nightly). Since the data can also change depending on tasks (for example, the team may be loading records overnight), I want to avoid any performance hits.
Any suggestions would be great. Thank you.
I am using SQL Server 2008 R2.
The process is actually like this:
First, about 2 million records are pulled from a remote server,
then a join is done locally,
the final result is thousands of records.
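
The shape of the SP is roughly like this (server, database, table, and column names are made up; whether the remote rows land in a temp table or not, the idea is the same):

-- pull the remote rows locally first
SELECT r.ReadingId, r.SiteId, r.Reading, r.ReadingDate
INTO #RemoteRows
FROM [REMOTESRV].[RemoteDb].dbo.Readings AS r
WHERE r.ReadingDate >= DATEADD(DAY, -30, GETDATE());

-- then join locally; the final result is only a few thousand rows
SELECT t.ReadingId, t.Reading, s.SiteName
FROM #RemoteRows AS t
JOIN dbo.Sites AS s ON s.SiteId = t.SiteId;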
The time cost varies from less than 1 minute to 30 minutes.
After I experienced the 30-minute delay, subsequent runs all seem to take only around 3 minutes.
It is the same data and the same SP.
What could cause this drastic difference?
Update
I deleted the SP, restarted the SQL Server service, and re-created the SP. The execution took only 50 seconds!
What's wrong?
The behaviour you describe seems extreme, but (if you exclude the client) there are a few logical places to look.
The first is the query execution on the database server. It's worth using the Query Analyzer tool to see if it's using any indices - by far the most common reason for variable performance of database queries is that the query is not using (the right) indices, and that therefore the impact of the query cache plays a big part. SQL Server will cache a lot of data, and the first run of your proc populates that cache; the second run is faster because it hits the cache. After a while, the cache goes stale, and running the proc slows down again.
The second possibility is that the database server is wobbly - it may just not be powerful enough to do all the work it's supposed to do. In that case, one moment you get lucky, have all the server resources to yourself; the next, someone else is running a query and yours slows down. That would make all queries slow, not just this one - so it doesn't sound likely.
Third possibility is networking weirdness - as Phil says, "thousands of records" is nothing too scary, but if they're big, and your network is saturated with pictures of kittens, it might have an impact. Again, that would manifest in general network slowness, and is unlikely to explain a delay of 30 minutes...
Fourth, is anything going on at the same time?
Fifth, does your SP use dynamically generated SQL statements? This would prevent the SP from being precompiled. If possible, separate such statements into child SPs.
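
If it helps, here are two quick things to try, relating to the first and fifth points above (the proc and table names are placeholders, and the DBCC commands should only ever be run on a test server, never in production):

-- cold vs. warm cache test (TEST server only)
CHECKPOINT;
DBCC DROPCLEANBUFFERS;   -- empty the data cache
DBCC FREEPROCCACHE;      -- empty the plan cache
EXEC dbo.YourProc;       -- cold-cache timing; run it again to get the warm-cache timing

-- if the proc builds dynamic SQL, parameterize it with sp_executesql so plans can be reused
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT Id, Amount FROM dbo.Orders WHERE CustomerId = @CustomerId';
EXEC sp_executesql @sql, N'@CustomerId INT', @CustomerId = 42;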
I'm tracking down an odd and massive performance problem in my SQL server installation. On my system, a particular stored procedure takes 2 minutes to execute; on a colleague's system it takes less than 1 second. We have similar databases/data and configurations, but there's obviously something very different.
I ran the SP in question through the Profiler on both systems and noticed something odd. On my system, I see 9 entries with the following properties:
The Duration is way high relative to other rows. I have values as high as 37,698 and as low as 1734. On the "fast" system the maximum duration (for the entire SP call) is 259.
They are executed for two databases related to the one that contains the SP I'm running. (This SP makes calls via Linked Servers to these two databases).
They are executions of one of the following system SPs:
sp_tables_info_90_rowset
sp_check_constbytable_rowset
sp_columns_90_rowset
sp_table_statistics2_rowset
sp_indexes_90_rowset
I can't find any Googleable documentation on what these are, why they would be so slow, or why they would run on one system but not the other. Does anyone know what they're all about?
Try manually updating statistics on that table.
UPDATE STATISTICS [TableName]
Then double-check that the database option AUTO_UPDATE_STATISTICS is set to TRUE. Even if it is, though, I've seen cases where adding large amounts of data to a table doesn't always cause the statistics to update in a timely way, and queries can be slow.
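
One way to check that setting, and to see when the statistics on the table were last updated (substitute the real table name):

SELECT name, is_auto_update_stats_on
FROM sys.databases
WHERE name = DB_NAME();

SELECT s.name AS stats_name, STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.TableName');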
I don't know the answer to your question. But to try to fix the problem you're having (which, I assume, is what you're actually interested in), the first thing I'd do is run a re-index on the tables you're querying. This frequently will fix any kind of slowness when the conditions are as you described (same database structure, different data/database, same query).
These are the work tables created when you have linked server calls. They are created automatically by the database engine in tempdb for temporary operations such as spooling.
Those SPs mean your query is hitting linked servers by using synonyms. This should be avoided whenever possible.
I'm not familiar with those specific procedures, but you can try running:
SELECT object_definition(object_id('Procedure Name'))
To get a better idea of what's going on under the hood.
Last index rebuild? Last statistics update?
Otherwise, these stored procs are used by the SQL Server client too... no? And they probably aren't the cause of these delays.