Querying SQL Server tables while they are locked - sql-server

I have two databases on Azure, let's say DBLive and DBCrud.
DBLive contains the tables that provide the real data. There is one table for which the CRUD operations are not done on DBLive but on DBCrud. The table structure is replicated on DBCrud, so after doing our CRUD operations on DBCrud we synchronize to DBLive with a SqlBulkCopy.
Given that there are many records, the bulk copy takes time. During this time the app must still serve users data from DBLive (a SELECT query via EF6, let's call it STATUSQUERY), even if it is dirty, even if the rows look like they did before the bulk copy started. But while the synchronization runs, the query hangs and times out.
Initially I used SqlBulkCopyOptions.TableLock, then switched to an external transaction with IsolationLevel.ReadUncommitted. This still does not solve the problem: the SELECT query still hangs. I cannot use IsolationLevel.Snapshot because of DBA policy.
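For context, the dirty-read status query the question describes would look roughly like the sketch below in T-SQL (dbo.StatusTable and its columns are placeholders, not names from the question); the NOLOCK table hint is the per-table equivalent of the READ UNCOMMITTED isolation level.

```sql
-- Minimal sketch of the dirty-read attempt described above;
-- table and column names are placeholders.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT StatusId, StatusText, UpdatedAt
FROM   dbo.StatusTable;   -- equivalently: FROM dbo.StatusTable WITH (NOLOCK)
```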

Related

Disable transactions on SQL Server

I need some light shed on this. I am working with SQL Server 2008.
I have a database for my application. Each table has a trigger that stores all changes in one table, 'tbSysMasterLog', in another database on the same server. Yes, the application's log is stored in another database.
The problem is that before any insert/update/delete command on the application database, a transaction is started, and therefore the table in the log database is locked until the transaction is committed or rolled back. So anyone else who tries to write to any other table of the application is blocked as well.
So... is there any possible way to disable transactions on a particular database or on a particular table?
You cannot turn off the transaction log; everything gets logged. You can set the recovery model to "Simple", which limits how much log data is kept once the records are committed.
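If that is the route you take, switching the recovery model is a one-line change (the database name below is a placeholder):

```sql
-- Placeholder database name; this changes the recovery model,
-- it does not disable logging.
ALTER DATABASE MyAppDb SET RECOVERY SIMPLE;
```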
" the table of the log database is locked": why that?
Normally you log changes by inserting records. The insert of records should not lock the complete table, normally there should not be any contention in insertion.
If you do more than inserts, perhaps you should consider changing that. Perhaps you should look at the indices defined on log, perhaps you can avoid some of them.
It sounds from the question that you have a BEGIN TRANSACTION at the start of your triggers, and that you are logging to the other database before the COMMIT TRANSACTION.
Normally you do not need explicit transactions in SQL Server.
If you do need explicit transactions, you could put the data to be logged into variables, commit the transaction, and then insert it into your log table.
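A minimal sketch of that pattern (the table and column names other than tbSysMasterLog are invented for illustration): capture what you need inside the transaction, commit, then write the log row outside it.

```sql
-- Hypothetical example: log after the commit instead of inside the transaction.
DECLARE @LoggedId int, @LoggedAction nvarchar(50);

BEGIN TRANSACTION;
    UPDATE dbo.Customer
    SET    Name = N'New name'
    WHERE  CustomerId = 42;

    -- Capture what you want to log while still inside the transaction.
    SELECT @LoggedId = 42, @LoggedAction = N'UPDATE Customer';
COMMIT TRANSACTION;

-- The log insert no longer holds locks inside the application transaction.
INSERT INTO LogDb.dbo.tbSysMasterLog (RecordId, ActionTaken, LoggedAt)
VALUES (@LoggedId, @LoggedAction, SYSUTCDATETIME());
```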
Normally inserts are fast and can happen in parallel without locking. Certain things, like identity columns, do require ordering, but they are very lightweight structures; they can be avoided by generating GUIDs so that inserts are non-blocking. For something like your log table, though, an identity primary key column would give you a clear sequence that is probably helpful in working out the order of events.
Obviously, if you log after the transaction, the log may not be in the same order in which the transactions occurred, because different transactions take different amounts of time to commit.
We normally log into individual tables with a name similar to the master table, e.g. FooHistory or AuditFoo.
There are other options. A very lightweight method is to use a trace; this is what is used for performance tuning and gives you a copy of every statement run on the database (including those fired by triggers), and you can log it to a different database server. Logging to a different server is a good idea when you are tracing a heavily used server, since the volume of data is massive if you are tracing, say, 1,000 simultaneous sessions.
https://learn.microsoft.com/en-us/sql/tools/sql-server-profiler/save-trace-results-to-a-table-sql-server-profiler?view=sql-server-ver15
You can also trace to a file and then load it into a table (better performance), and script up starting, stopping, and loading the traces.
The load on the server receiving the trace log is minimal, and I have never had a locking problem on a server receiving a trace, so I am fairly sure something else is causing your locks.
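For the load-from-file step, a minimal sketch (the file path and target table are placeholders):

```sql
-- Sketch: import a trace file into a table for analysis.
SELECT *
INTO   LogDb.dbo.TraceImport
FROM   sys.fn_trace_gettable(N'C:\Traces\MyTrace.trc', DEFAULT);
```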

Detect Table Changes In A Database Without Modifications

I have a database ("DatabaseA") that I cannot modify in any way, but I need to detect the addition of rows to a table in it and then add a log record to a table in a separate database ("DatabaseB") along with some info about the user who added the row to DatabaseA. (So it needs to be event-driven, not merely a periodic scan of the DatabaseA table.)
I know that normally, I could add a trigger to DatabaseA and run, say, a stored procedure to add log records to the DatabaseB table. But how can I do this without modifying DatabaseA?
I have free rein to do whatever I like in DatabaseB.
EDIT in response to questions/comments ...
Databases A and B are MS SQL 2008/R2 databases (as tagged); users interact with the database via a proprietary Windows desktop application (not my own), and each user has a SQL login associated with their application session.
Any ideas?
Ok, so I have not put together a proof of concept, but this might work.
You can configure an Extended Events session on databaseB that watches all the procedures on databaseA that can insert into the table, or any SQL statements that run against the table on databaseA (using a LIKE '%your table name here%' filter).
This is a custom solution that writes the XE session output to a table:
https://github.com/spaghettidba/XESmartTarget
You could probably mimic that functionality by writing the XE events to a custom user table every minute or so using a SQL Server Agent job.
Your session would monitor databaseA and write the XE output to databaseB; you would then write a trigger so that, on each XE output write, it compares the two tables and, if there are differences, writes them to your log table. This would be a nonstop running process, but it is still a kind of periodic scan: the XE session only writes when the event happens, but the comparison still runs every couple of seconds.
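A hedged sketch of what such a session might look like (the table and file names are placeholders; on SQL Server 2008/R2 the file target is called package0.asynchronous_file_target rather than event_file):

```sql
-- Sketch: capture statements that touch a particular table in DatabaseA.
CREATE EVENT SESSION WatchTargetTable ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(
    ACTION (sqlserver.sql_text, sqlserver.username, sqlserver.client_app_name)
    WHERE sqlserver.like_i_sql_unicode_string(sqlserver.sql_text, N'%TargetTable%')
      AND sqlserver.database_name = N'DatabaseA'
)
ADD TARGET package0.event_file (SET filename = N'WatchTargetTable.xel')
WITH (STARTUP_STATE = ON);

ALTER EVENT SESSION WatchTargetTable ON SERVER STATE = START;
```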
I recommend you look at a data integration tool that can mine the transaction log for Change Data Capture events. We have recently been using StreamSets Data Collector for Oracle CDC, but it also supports SQL Server CDC. There are many other competing technologies, including Oracle GoldenGate and Informatica PowerExchange (not PowerCenter). We like StreamSets because it is open source and is designed to build real-time data pipelines between databases at the schema level. Until now we have used batch ETL tools like Informatica PowerCenter and Pentaho Data Integration. I can copy all the tables in a schema in near real time in one StreamSets pipeline, provided I have already deployed the DDL in the target. I use this approach between Oracle and Vertica. You can add additional columns to the target and populate them as part of the pipeline.
The only catch might be identifying which user made the change. I don't know whether that information is in the SQL Server transaction log. It seems probable, but I am not a SQL Server DBA.
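For reference, a minimal sketch of enabling SQL Server CDC on the source (the table name is a placeholder; as the next answer notes, this does modify the source database, which the question rules out, and it requires an edition and configuration that support CDC):

```sql
-- Sketch: enable Change Data Capture on the source database and table.
USE DatabaseA;
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'TargetTable',  -- placeholder table name
    @role_name     = NULL;
```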
I looked at both solutions provided by the time of writing this answer (see Dan Flippo and dfundaka) but found that the first, using Change Data Capture, required modification to the database, and the second, using Extended Events, wasn't really a complete answer, though it got me thinking of other options.
The option that seems cleanest, and doesn't require any database modification, is to use SQL Server Dynamic Management Views. These system views and functions expose server process history, in this case INSERTs and UPDATEs, for example sys.dm_exec_sql_text and sys.dm_exec_query_stats, which between them give the text and execution statistics of statements the server has run (and are, in fact, what Extended Events seems to be based on).
Though it's quite an involved process initially to extract the required information, the queries can be tuned and generalized to a degree.
There are restrictions on transaction history retention, etc but for the purposes of this particular exercise, this wasn't an issue.
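A rough illustration of the kind of query involved (the table name filter is a placeholder); note it only sees statements that are still in the plan cache:

```sql
-- Sketch: recent INSERT/UPDATE statements touching a given table, from the plan cache.
SELECT qs.last_execution_time,
       qs.execution_count,
       st.text
FROM   sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE  st.text LIKE N'%TargetTable%'                      -- placeholder table name
  AND (st.text LIKE N'%INSERT%' OR st.text LIKE N'%UPDATE%')
ORDER BY qs.last_execution_time DESC;
```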
I'm not going to select this answer as the correct one yet partly because it's a matter of preference as to how you approach the problem and also because I'm yet to provide a complete solution. Hopefully, I'll post back with that later. But if anyone cares to comment on this approach - good or bad - I'd be interested in your views.

PostgreSQL table lock

I have a PostgreSQL 9.2 database behind a pgBouncer connection pool on a Debian server.
In that database, regular users run queries over a few tables, and I have a cron process fetching data and inserting it into a table (using a PL/pgSQL function that does some validation before the insert).
The problem I have is that when the cron processes put a heavy load on that table (many inserts), the table gets locked and queries against it do not respond (or take a very long time to respond).
Is there any way to set priorities by stored procedure, by database user (the cron jobs and the user queries use different database users), or by statement type (SELECT having higher priority than INSERT)?
If there is no way to define such priorities in PostgreSQL, is there any workaround?
Inserts can wait, but the user queries should not...
The cron process creates and drops a pgBouncer connection per insert. If I reuse the same connection, the problem is worse (the queries take even longer).
Thanks in advance,
Claudio

Azure SQL Database trigger to insert audit info into Azure Table

I am working on a database auditing solution and was thinking of having SQL Server triggers capture changes and insert them into an auditing table. Since this is an Azure SQL Database and will be fairly large, I am concerned about the cost of a database that keeps growing because of auditing.
To cut down on the cost of auditing, I am considering storing the audit table (or tables) in Azure Table storage instead of Azure SQL Database. So the question becomes: how do I get the SQL Server trigger to push the changed data into Azure Tables?
The only thing I can come up with is to have an audit table (or tables) in the SQL database so the trigger can insert the rows locally, and then have a worker role pull rows from it every X seconds, move them to Azure Tables, and delete them from the SQL table so it doesn't grow large.
Is there a better way to do this integration? Can I somehow put a message in a queue from a trigger?
Azure SQL Database (formerly SQL Azure) doesn't support CLR (hence no EXTERNAL NAME trigger parameter), so there's no way for your triggers to do anything outside of T-SQL. If you want audit content to end up in Azure Table storage, you could take the approach you came up with (temporarily write to a SQL table, then periodically move the content to Table storage). There are other approaches you could take (and this gets opinion-based/subjective, which is frowned upon here), but going with the queue concept for a minute, since you asked about queues, here is what you could do with Azure Queues:
You could use an Azure queue to specify an item to insert/update in your SQL database. The queue-processing code would then be responsible for performing the update and writing to the Azure table. Since queue messages must be explicitly deleted after processing, you could simply repeat the processing of a message if something failed during execution (e.g. you write to SQL but fail before writing to Table storage): the message eventually becomes visible for reading again if you don't delete it before its timeout expires. As long as your operations are idempotent, you'd be OK with this pattern.
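For the staging-table half of either approach (the trigger writes locally; a worker or queue processor later drains the staging table into Azure Table storage), a hedged sketch with invented table and column names:

```sql
-- Hypothetical sketch: trigger records changes in a local staging table,
-- which a background process later moves to Azure Table storage and deletes.
CREATE TRIGGER trg_Orders_Audit
ON dbo.Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.AuditStaging (TableName, ChangeType, KeyValue, ChangedAt)
    SELECT 'Orders',
           CASE WHEN EXISTS (SELECT 1 FROM inserted) AND EXISTS (SELECT 1 FROM deleted)
                THEN 'UPDATE'
                WHEN EXISTS (SELECT 1 FROM inserted) THEN 'INSERT'
                ELSE 'DELETE'
           END,
           COALESCE(i.OrderId, d.OrderId),
           SYSUTCDATETIME()
    FROM inserted AS i
    FULL OUTER JOIN deleted AS d ON i.OrderId = d.OrderId;
END;
```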
A cheaper solution than worker roles would be a combination of Azure Scheduled Tasks (you can enable them for free to run every 15 minutes within Mobile Apps) and Azure Web Sites. Basically, the scheduled job runs every 15 minutes and makes an HTTP call to some code running in your Azure Web Site; that code does the same work you had outlined for your worker role.
Alternatively, use SQL Server system-versioned temporal tables to automatically handle writing audited records (i.e., changes) to corresponding history tables.
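A minimal sketch of the temporal-table option (table and column names are placeholders); SQL Server then maintains the history table automatically on every update and delete:

```sql
-- Sketch: a system-versioned temporal table with an auto-managed history table.
CREATE TABLE dbo.Orders
(
    OrderId    int IDENTITY PRIMARY KEY,
    CustomerId int NOT NULL,
    Amount     decimal(10, 2) NOT NULL,
    ValidFrom  datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.OrdersHistory));
```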

SQL Server 2005/8 Replication Transaction ID

I have a scenario where I'm using transactional replication to replicate multiple SQL Server 2005 databases (same instance) into a single remote database (different instance on a separate physical machine).
I am then performing some processing on the replicated data for reporting purposes. I'm using table-level triggers to identify changes, which kick off my post-processing code.
Up to this point everything is fine.
However, what I'd like to know is: where certain tables are created, updated, or deleted in the same transaction, is it possible to identify some sort of transaction ID from replication (or anywhere else) so that I don't perform the same post-processing multiple times for a single transaction?
Basic example: I have a TUser table and a TAddress table. If I were to create rows in both in a single transaction, they would be replicated across in a single transaction too. However, two triggers would fire in the replicated database, which at present causes my post-processing code to run twice. What I'd really like is a way to identify that these two changes arrived in the replicated database in the same transaction.
Is this possible in any way? Does an identifier like the one I've described exist, and is it accessible?
The short answer is no, there is nothing of the sort that you can rely on. The long answer, in summary, is that yes, such a thing exists, but it would not be recommended in any way for any use.
Given that replication is transactionally consistent, one approach you could consider would be pushing an identifier for the primary record (in this case TUser, since a TAddress is related to a TUser) onto a queue (ideally using something like Service Broker, or potentially a user-defined queue table) and then performing the post-processing by popping data off the queue and processing it separately.
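As a rough illustration of the user-defined-queue variant (Service Broker setup is more involved; the column names below are invented):

```sql
-- Hypothetical work queue populated by a trigger on the replicated TUser table.
CREATE TABLE dbo.PostProcessQueue
(
    QueueId  int IDENTITY PRIMARY KEY,
    UserId   int NOT NULL,
    QueuedAt datetime NOT NULL DEFAULT GETUTCDATE()
);
GO

CREATE TRIGGER trg_TUser_Enqueue
ON dbo.TUser
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.PostProcessQueue (UserId)
    SELECT UserId FROM inserted;
END;
```

A separate job then pops rows off dbo.PostProcessQueue, runs the post-processing once per queued key, and deletes the rows it has handled.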
Another possibility would be simply batch processing every 'x' amount of time: poll for new or updated records in the primary tables and post-process them in that manner. You'd need to track IDs, rowversions, or timestamps of what you've already processed for each primary table as metadata, and pull anything that hasn't yet been processed during each batch run.
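A sketch of that rowversion-watermark idea (the watermark table and the RowVer column on TUser are invented for illustration):

```sql
-- Hypothetical watermark table tracking the last processed rowversion per source table.
DECLARE @LastVersion binary(8);

SELECT @LastVersion = LastRowVersion
FROM   dbo.ProcessingWatermark
WHERE  TableName = 'TUser';

-- Pick up anything changed since the last batch run.
SELECT UserId, RowVer
FROM   dbo.TUser
WHERE  RowVer > @LastVersion
ORDER BY RowVer;

-- After successful post-processing, advance the watermark.
UPDATE dbo.ProcessingWatermark
SET    LastRowVersion = (SELECT MAX(RowVer) FROM dbo.TUser)
WHERE  TableName = 'TUser';
```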
Just a few thoughts, hope that helps.
