We are using partitioned views (SQL Server 2008 Standard, partitioned tables are not an option), and they work fine as far as the partition elimination goal is concerned: if we run a query against a partitioned view with a clause on the column we chose as the discriminator, the actual execution plan shows that only the table related to the specified discriminator value is hit. But we run into locking problems when there are concurrent INSERT or UPDATE statements, even when those statements are NOT hitting the table selected by the discriminator.
Analyzing the locks, I can see that even though the execution plan shows only the correct table being read, IS locks are still taken on ALL the tables in the partitioned view. Of course, if someone else has already taken an X lock on one of them, the whole query running against the partitioned view gets blocked on it, even though the table holding the X lock is not read at all.
Is this a limitation of partitioned views in general, or is there a way to avoid it while sticking with partitioned views? We created the partitioned view and the related objects following the SQL Server Books Online recommendations.
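For reference, this is roughly the shape of what we built; the table and column names here are just illustrative, not our real schema:

    -- Member tables, one per discriminator value, each with a CHECK constraint
    CREATE TABLE dbo.Orders_2007 (
        OrderYear int NOT NULL CHECK (OrderYear = 2007),
        OrderID   int NOT NULL,
        CONSTRAINT PK_Orders_2007 PRIMARY KEY (OrderYear, OrderID)
    );
    CREATE TABLE dbo.Orders_2008 (
        OrderYear int NOT NULL CHECK (OrderYear = 2008),
        OrderID   int NOT NULL,
        CONSTRAINT PK_Orders_2008 PRIMARY KEY (OrderYear, OrderID)
    );
    GO
    CREATE VIEW dbo.Orders
    AS
    SELECT OrderYear, OrderID FROM dbo.Orders_2007
    UNION ALL
    SELECT OrderYear, OrderID FROM dbo.Orders_2008;
    GO
    -- The plan for this reads only dbo.Orders_2008, yet IS locks still show up
    -- on both member tables in sys.dm_tran_locks.
    SELECT * FROM dbo.Orders WHERE OrderYear = 2008;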
Thanks
Wasp
This is by design. Avoid taking X locks on entire tables.
I have a SQL Server table with 160+ million records that gets continuous CRUD operations from the UI, batch jobs, etc., basically from multiple sources.
Currently I have partitioned the table on a column to get better performance.
I came across In-Memory tables, which can be used for tables with frequent updates; when updates come from multiple sources they don't take locks but maintain row versioning instead, so concurrent updates are handled better with this approach.
So what are my options in this case?
Partition the table, or create an In-Memory table?
From what I have read, SQL Server does not support In-Memory tables when the table is partitioned.
Which is the better option in this case, an In-Memory table or a partitioned table?
It depends.
In-memory tables look great in theory, but you really need to spend time learning the details in order to implement them correctly. You may find some of those details disturbing. For example:
there are no parallel inserts into in-memory tables, which makes creating rows slower compared to parallel inserts into a traditional disk-based table
not all index operations supported by disk-based indexes are available on in-memory table indexes
not all data types are supported
there are unsupported features and T-SQL constructs
you may need more RAM than you think
If you are ready to pay the price for using Hekaton, you may start by reading its white paper.
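If you do decide to test it, a minimal sketch looks something like this (it needs a memory-optimized filegroup first; database, path, and table names are just examples):

    -- One-time setup: the database needs a MEMORY_OPTIMIZED_DATA filegroup
    ALTER DATABASE MyDb ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;
    ALTER DATABASE MyDb ADD FILE (NAME = 'imoltp_file', FILENAME = 'C:\Data\imoltp_file')
        TO FILEGROUP imoltp_fg;
    GO
    -- A durable memory-optimized table; note the hash index is declared inline
    CREATE TABLE dbo.MyHotTable (
        Id      int NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
        Payload nvarchar(200) NOT NULL
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);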
Partitioning itself comes with benefits, but there is no guarantee it will heal your system. Only particular queries and scenarios can benefit from it. For example, if 99% of your workload touches the data in one partition, you may see no improvement at all. On the other hand, if your reports are based on historical data while your inserts/updates/deletes touch another partition, it will work out better.
Both technologies are good, but they need to be examined in detail and applied carefully. Often, folks believe that some new tech will solve their problems, when the problems can be solved just by applying some basic concepts.
For example, you said that you are performing CRUD over 160+ million records. Ask yourself:
is my table normalized - when data is stored in a normalized way you gain two things: first, you perform CRUD on only part of the data; second, the engine may read only the data that is needed for a particular query (without needing a supporting index)
are my T-SQL statements written well - row-by-agonizing-row processing, calling stored procedures in loops, and not processing the data in batches are common sources of slow queries
which are the blocking and deadlocking queries - for example, a single long-running query can block all your inserts - identify these types of issues first and try to resolve them with data pre-calculation (indexed views) or by creating covering indexes (which can be filtered and have included columns, too)
are readers and writers blocking each other - you can try different isolation levels to solve this type of issue - RCSI is the default isolation level in Azure SQL Database. You may need to add more RAM to the RAM disk used by your TempDB, but since you are looking at Hekaton anyway, RCSI will be easier to test (and roll back) than Hekaton or partitioning - a quick sketch of this and of a covering index follows below
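As a rough sketch of the last two points (database, table, and column names are placeholders):

    -- RCSI: readers stop blocking writers (and vice versa) under read committed
    ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
    GO
    -- A filtered covering index with included columns for one hot query pattern
    CREATE NONCLUSTERED INDEX IX_Orders_Open
        ON dbo.Orders (CustomerId)
        INCLUDE (OrderDate, TotalDue)
        WHERE Status = 'Open';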
I've been working in Oracle World for 3 weeks after working in SQL Server Land for more than 4 years. Right now I'm finding Oracle's lack of local temp tables baffling.
In lieu of a data warehouse for reporting I've often been responsible for putting together reports from large amounts of normalized data. I quickly learned that cramming all of the logic into one gigantic query (i.e. one with many joins, sub-queries, correlated sub-queries, unions, etc.) was a recipe for terrible performance. Properly breaking the process into smaller steps and utilizing indexed temp tables (that you could create and alter on the fly within a procedure or an ad-hoc script) was often exponentially faster.
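To be concrete, in SQL Server I would typically do something like this inside a procedure or ad-hoc script (table and column names invented for the example):

    -- Step 1: materialise an intermediate result set into a local temp table
    SELECT c.CustomerID, SUM(o.Amount) AS TotalAmount
    INTO #customer_totals
    FROM dbo.Orders AS o
    JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
    GROUP BY c.CustomerID;

    -- Step 2: index it on the fly, then join it into the next step of the report
    CREATE CLUSTERED INDEX cx_customer_totals ON #customer_totals (CustomerID);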
Enter Oracle... no local temp tables. I apparently can't even CREATE a global temporary table without being granted the permission to create permanent tables as well. I Googled "oracle temp table permission" and the first link returned is a forum question where the accepted answer starts with "As has been pointed out, it would be extremely unusual to want to have a user that could create global temporary tables but not permanent tables. I'm very hard-pressed to imagine a scenario where that would make sense." That's exactly what I could use in our prod environment. My SQL Server mind is blown.
I can almost accept having ONLY global temp tables to work with but is it really that unusual in Oracle to use them in this manner? What, if anything, can I do to implement some sort of similar step by step logic without using temp tables? How, within an ad-hoc script, can I save off and later reuse something similar to an indexed set of data? I'm obviously looking for something other than a subquery or a CTE.
I must be missing something...
Unfortunately we don't have such a privilege, but as a workaround you can revoke any quota on permanent tablespaces, so the user can't create any permanent table. Of course, in 11g with the deferred segment creation feature, users can still create a table, but they can't insert any rows into it. Because temp tables use temporary tablespaces, they won't have any problem.
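Roughly like this (Oracle syntax; the user and tablespace names are just examples):

    -- Take away the quota on the permanent tablespace so the user can't really use it
    ALTER USER report_user QUOTA 0 ON users;

    -- Global temporary tables live in the temporary tablespace, so they still work
    CREATE GLOBAL TEMPORARY TABLE gtt_stage (
        id  NUMBER,
        val VARCHAR2(100)
    ) ON COMMIT PRESERVE ROWS;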
Mohsen
When a record is updated in a SQL Server table, how does the database engine physically execute the request: is it an INSERT + DELETE, or an UPDATE operation?
As we know, the performance of a database and any statements depends on many variables. But I would like to know if some things can be generalized.
Is there a threshold (table size, query length, # records affected...) after which the database switches to one approach or the other upon UPDATEs?
If there are times when SQL Server physically performs an insert/delete when a logical update is requested, is there a system view or metric that would show this? I.e., if there is a running total of all the inserts, updates, and deletes that the database engine has performed since it was started, I would be able to figure out how it behaves after I issue a single UPDATE.
Is there any difference in the UPDATE statement's behavior depending on the SQL Server version (2008, 2012...)?
Many thanks.
Peter
An UPDATE on a base table without triggers is always a physical UPDATE; SQL Server has no such threshold. You can look up usage statistics, for example, in sys.dm_db_index_usage_stats.
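For example, a rough way to pull cumulative counts from that DMV (the numbers reset when the instance restarts, and note that user_updates lumps inserts, updates, and deletes together):

    SELECT OBJECT_NAME(s.object_id) AS table_name,
           SUM(s.user_updates) AS writes_inserts_updates_deletes,
           SUM(s.user_seeks + s.user_scans + s.user_lookups) AS reads
    FROM sys.dm_db_index_usage_stats AS s
    WHERE s.database_id = DB_ID()
    GROUP BY s.object_id;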
Update edits the existing row. If it was insert/delete, then you'd get update failures for duplicate keys.
INSERT, UPDATE, and DELETE can also each be granted as separate permissions. So a user could be allowed to update records but not insert or delete them, which also points to an UPDATE not being implemented as a delete plus an insert.
I have a database that contains data for many "clients". Currently, we insert tens of thousands of rows into multiple tables every so often using .Net SqlBulkCopy which causes the entire tables to be locked and inaccessible for the duration of the transaction.
As most of our business processes rely upon accessing data for only one client at a time, we would like to be able to load data for one client, while updating data for another client.
To make things more fun, all PKs, FKs and clustered indexes are on GUID columns (I am looking at changing this).
I'm looking at adding the ClientID into all tables, then partitioning on this. Would this give me the functionality I require?
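For illustration, what I have in mind is something like this (assuming ClientID ends up as an int; all names and boundary values are made up):

    -- One partition per range of client IDs (a real scheme would need more boundaries)
    CREATE PARTITION FUNCTION pf_ClientId (int)
        AS RANGE RIGHT FOR VALUES (100, 200, 300);

    CREATE PARTITION SCHEME ps_ClientId
        AS PARTITION pf_ClientId ALL TO ([PRIMARY]);

    -- Tables (and their clustered indexes) would then be created on the scheme
    CREATE TABLE dbo.ClientData (
        ClientID int NOT NULL,
        DataID   uniqueidentifier NOT NULL,
        CONSTRAINT PK_ClientData PRIMARY KEY CLUSTERED (ClientID, DataID)
    ) ON ps_ClientId (ClientID);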
I haven't used the built-in partitioning functionality of SQL Server, but it's something I am particularly interested in. My understanding is that this would solve your problem.
From this article:
"This allows you to operate on a partition, even with a performance-critical operation such as reindexing, without affecting the others."
And a great whitepaper on partitioning by Kimberly L Tripp is here. Well worth a read - I won't even try to paraphrase it - covers it all in a lot of detail.
Hope this helps.
Can you partition on Client ID: Yes, but partitioning is limited to 1,000 partitions, so that is 1,000 clients before it hits a hard limit. The only way to get around that is to start using partitioned views across multiple partitioned tables - it gets a bit messy.
Will it help your locking situation: In SQL 2005 the lock escalation path is row -> page -> table, but 2008 introduced a new level allowing row -> page -> partition -> table. So it might get around it, depending on your SQL version (which you haven't specified).
If 2008 is not an option, there are trace flags (TF 1211 / 1224) that turn off lock escalation, but I would not jump in and use them without some serious testing.
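For reference, the 2008 per-table setting and the trace flags look roughly like this (table name illustrative; treat this as something to test, not to drop straight into production):

    -- SQL 2008+: allow escalation to the partition level instead of the whole table
    ALTER TABLE dbo.ClientData SET (LOCK_ESCALATION = AUTO);

    -- Alternatively, the trace flags mentioned above (server-wide, use with care)
    DBCC TRACEON (1224, -1);   -- escalate only under memory pressure
    -- DBCC TRACEON (1211, -1); -- disables escalation entirely, even under pressure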
The partitioning feature also remains an Enterprise-and-up feature, which puts some people off.
The ideal way to perform a data load with partitioning while avoiding locks is to bring the data into a staging table and then switch it into a new partition - but this requires the data to be somewhat sequence-based (such as datetime), so that new data can be brought into an entirely new partition while older data is eventually removed (rolling the partition window).
Do you know if there's any way of doing this in SQL Server (2008)?
I'm working on a data warehouse loading process, so what I want to do is drop the indexes of the partition being loaded so I can perform a quick bulk load, and then rebuild the index at the partition level.
I think that in Oracle it's possible to achieve this, but maybe not in SQL Server.
thanks,
Victor
No, you can't drop a table's indexes for just a single partition. However, SQL 2008 provides a methodology for bulk loading that involves setting up a second table with exactly the same schema on the same filegroup, loading it, indexing it in precisely the same way, then "switching" it in for an existing, empty partition on the production table.
This is a highly simplified description, though. Here's the MSDN article for SQL 2008 on implementing this:
http://msdn.microsoft.com/en-us/library/ms191160.aspx
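Very roughly, the pattern looks something like this (all names, the file path, and the partition number are illustrative):

    -- Load into a staging table that matches the target's schema, indexes and filegroup
    BULK INSERT dbo.Sales_Staging FROM 'C:\loads\sales.dat' WITH (TABLOCK);

    -- Add the same constraints/indexes, plus a CHECK matching the target partition range,
    -- then switch the staging table in as (say) partition 5 of the live table
    ALTER TABLE dbo.Sales_Staging SWITCH TO dbo.Sales PARTITION 5;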
I know it wasn't possible in SQL 2005. I haven't heard of anything that would let you do this in 2008, but it could be there (I've read about 2008 but have not yet used it). The closest I could get was disabling the index, but if you disable a clustered index you can no longer access the table. Not all that useful, imho.
My solution for our Warehouse ETL project was to create a table listing all the indexes and indexing constraints (PKs, UQs). During ETL, we walk through the table (for the desired set of tables being loaded), drop the indexes/indexing constraints, load the data, then walk through the table again and recreate the indexes/constraints. Kind of ugly and a bit awkward, but once up and running it won't break--and has the added advantage of freshly built indexes (i.e. no fragmentation, and fillfactor can be 100). Adding/modifying/dropping indexes is also awkward, but not all that hard.
You could do it dynamically--read and store the indexes/constraints definitions from the target table, drop them, load data, then dynamically build and run the (re)create scripts from your stored data. But, if something crashes during the run, you are so dead. (That's why I settled on permanent tables.)
I find this to work very well with table partitioning, since you do all the work on "Loading" tables, and the live (dbo, for us) tables are untouched.
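For what it's worth, each step the scripts generate boils down to statements along these lines (index and table names here are made up):

    -- For nonclustered indexes, disable/rebuild is the simplest form of the pattern
    ALTER INDEX IX_Fact_Sales_Date ON dbo.Fact_Sales DISABLE;
    -- ... bulk load dbo.Fact_Sales here ...
    ALTER INDEX IX_Fact_Sales_Date ON dbo.Fact_Sales REBUILD WITH (FILLFACTOR = 100);
    -- (Disabling the clustered index would make the table unreadable, as noted above,
    --  which is why we script out and DROP/CREATE when the clustered key is involved.)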