I'm not a DBA, but I'm having problems with a database that has been locking tables and creating chaos as a result. I got some information from the Oracle DBA; if someone could help me find the root of the problem, or point me to what I need to do, I'll put more information here:
I have a big report from Oracle, but I don't understand 95% of the data.
This looks like regular row-level locking. A row can only be modified by one user at a time. The data dictionary has information about who is blocked and who is doing the blocking:
--Who's blocking who?
select
    blocked_sql.sql_id          blocked_sql_id
   ,blocked_sql.sql_text        blocked_sql_text
   ,blocked_session.username    blocked_username
   ,blocking_sql.sql_id         blocking_sql_id
   ,blocking_sql.sql_text       blocking_sql_text
   ,blocking_session.username   blocking_username
from gv$sql blocked_sql
join gv$session blocked_session
    on  blocked_sql.sql_id = blocked_session.sql_id
    and blocked_sql.users_executing > 0
join gv$session blocking_session
    on  blocked_session.final_blocking_session = blocking_session.sid
    and blocked_session.final_blocking_instance = blocking_session.inst_id
left join gv$sql blocking_sql
    on blocking_session.sql_id = blocking_sql.sql_id;
If you understand the system it is usually easier to focus on "who" is doing the blocking instead of "what" is blocked. The above query only returns a few common columns, but there are dozens of other columns in those tables that might help identify the process.
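For example, here is a minimal sketch (assuming 11gR2 or later, where gv$session exposes final_blocking_session) that pulls a few identifying columns for just the blocking sessions; osuser, machine and program often reveal which application is responsible:
--Who is doing the blocking, and from where?
select inst_id, sid, serial#, username, osuser, machine, program, module, logon_time, sql_id
from gv$session
where (inst_id, sid) in
      (select final_blocking_instance, final_blocking_session
       from gv$session
       where final_blocking_session is not null);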
I'm building an API to allow remote systems to synchronise millions of rows with my source table (one-way).
How can I remotely check whether two SQL data rows are identical, without sending all the data between the servers? I have considered using a LastUpdated column, but I'm worried that one day someone might manually edit a record to fix a problem and the systems will then be out of sync forever.
I'm aware of the checksum(*) option, but there seem to be (largely theoretical) edge cases where the checksum for an updated row might be the same as before.
The idea is, that my API will provide a list of IDs (primary keys), followed by a "check" value which the remote systems can then use to find out which records they need to add or update. Hopefully this is a sensible approach to keeping the two systems in sync, but I can't find any "best practice" type articles for this.
Each table has a rowversion column and a rowversionOther column.
To find the rows that need a synch from A:
select b.ID
from tableA a
join tableB b
    on  a.ID = b.ID
    and b.rowversionOther != a.rowversion;
After a data synch you then need to update b.rowversionOther = a.rowversion, or you get into a vicious cycle (a sketch of that update follows below).
If both a and b are updated before the synch, you would need to determine which side is the master.
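A minimal sketch of that post-synch bookkeeping, using the tableA/tableB names from the query above (rowversionOther is assumed to be an ordinary binary(8) column, since a real rowversion column cannot be written to):
--Record the source version that was just copied so the next comparison comes back clean
UPDATE b
SET    b.rowversionOther = a.rowversion
FROM   tableB b
JOIN   tableA a
    ON a.ID = b.ID
WHERE  b.rowversionOther != a.rowversion;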
So I have two queries that I'm working on, one comes from an Oracle DB and the other SQL Server DB. I'm trying to use PowerBI via Power Query as the cross over between the two. Because of the size of the Oracle DB I'm having a problem with running it, so my thought is to use one query as a clause/sub-query of the other to limit the number of results.
Based on the logic of MSFT's M language I'd assume there's a way to use one query as a sub-query of the other, but I've yet to figure it out. Does anyone have any ideas on how to do this?
What I have learned to do is create a connection to the two underlying data sets, but not load them. Then you can merge them, but do not use the default Table.NestedJoin().
After Power Query generates this, change it to:
= Table.Join(dbo_SCADocument, {"VisitID"}, VISIT_KEY, {"VisitID"})
Also remove the trailing name. The reason is that it keeps query folding alive; for some reason, Table.NestedJoin() kills query folding. Note that if the two sources share column names other than the join key, it will fail.
It also brings back everything from both sources, but that is easy to alter. You will also need to turn off the formula firewall, since it does not allow you to join potentially sensitive data with non-sensitive data; this means setting your privacy level to ignore all.
I would attempt this using the Merge command. I'm more of a UI guy, so I would click the Merge button. This generates a Power Query (M) statement for Table.Join:
https://msdn.microsoft.com/en-us/library/mt260788.aspx
Setting Join Kind to Inner will restrict the output to matching rows.
I say attempt because your proposed query design will likely slow your query rather than improve it. I would expect Power Query to run both queries against the two servers, download all their data, and then attempt the join in memory.
I am using Entity Framework, and I am inserting records into our database which include a blob field. The blob field can be up to 5 MB of data.
When inserting a record into this table, does it lock the whole table?
So if you are querying any data from the table, will it block until the insert is done (I realise there are ways around this, but I am talking by default)?
How long will it take before it causes a deadlock? Will that time depend on how much load is on the server, e.g. if there is not much load, will it take longer to cause a deadlock?
Is there a way to monitor and see what is locked at any particular time?
If each thread is doing queries on single tables, is there then a case where blocking can occur? So isn't it the case that a deadlock can only occur if you have a query which has a join and is acting on multiple tables?
This is taking into account that most of my code is just a bunch of select statements, not heaps of long running transactions or anything like that.
Holy cow, you've got a lot of questions in here, heh. Here's a few answers:
When inserting a record into this table, does it lock the whole table?
Not by default, but if you use the TABLOCK hint or if you're doing certain kinds of bulk load operations, then yes.
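For illustration, a hedged sketch of what that looks like (the table and column names here are hypothetical):
-- The TABLOCK hint requests a table-level lock for the duration of the insert
INSERT INTO dbo.Documents WITH (TABLOCK) (Id, BlobData)
VALUES (1, 0x00);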
So if you are querying any data from the table will it block until the insert is done (I realise there are ways around this, but I am talking by default)?
This one gets a little trickier. If someone's trying to select data from a page in the table that you've got locked, then yes, you'll block 'em. You can work around that with things like the NOLOCK hint on a select statement or by using Read Committed Snapshot Isolation. For a starting point on how isolation levels work, check out Kendra Little's isolation levels poster.
How long will it take before it causes a deadlock? Will that time depend on how much load is on the server, e.g. if there is not much load will it take longer to cause a deadlock?
Deadlocks aren't based on time - they're based on dependencies. Say we've got this situation:
Query A is holding a bunch of locks, and to finish his query, he needs stuff that's locked by Query B
Query B is also holding a bunch of locks, and to finish his query, he needs stuff that's locked by Query A
Neither query can move forward (think Mexican standoff) so SQL Server calls it a draw, shoots somebody's query in the back, releases his locks, and lets the other query keep going. SQL Server picks the victim based on which one will be less expensive to roll back. If you want to get fancy, you can use SET DEADLOCK_PRIORITY LOW on particular queries to paint targets on their back, and SQL Server will shoot them first.
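For example, a minimal sketch of volunteering a session as the victim:
-- Mark this session as the preferred deadlock victim
SET DEADLOCK_PRIORITY LOW;
-- ... then run the query you are willing to sacrifice ...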
Is there a way to monitor and see what is locked at any particular time?
Absolutely - there are Dynamic Management Views (DMVs) you can query, like sys.dm_tran_locks, but the easiest way is to use Adam Machanic's free sp_WhoIsActive stored proc. It's a really slick replacement for sp_who that you can call like this:
sp_WhoIsActive @get_locks = 1
For each running query, you'll get a little XML that describes all of the locks it holds. There's also a Blocking column, so you can see who's blocking who. To interpret the locks being held, you'll want to check the Books Online descriptions of lock types.
If each thread is doing queries on single tables, is there then a case where blocking can occur? So isn't it the case that a deadlock can only occur if you have a query which has a join and is acting on multiple tables?
Believe it or not, a single query can actually deadlock itself, and yes, queries can deadlock on just one table. To learn even more about deadlocks, check out The Difficulty with Deadlocks by Jeremiah Peschka.
If you have direct control over the SQL, you can force row level locking using:
INSERT INTO MyTable WITH (ROWLOCK) (Id, BigColumn)
VALUES (...)
These two answers might be helpful:
Is it possible to force row level locking in SQL Server?
Locking a table with a select in Entity Framework
To view current held locks in Management Studio, look under the server, then under Management/Activity Monitor. It has a section for locks by object, so you should be able to see whether the inserts are really causing a problem.
Deadlock errors generally return quite quickly. Deadlock states do not occur as a result of a timeout error occurring while waiting for a lock. Deadlock is detected by SQL Server by looking for cycles in the lock requests.
The best answer I can come up with is: It depends.
The best way to check is to find your connection's SPID and run sp_lock <SPID> to check whether the lock mode is X on the TAB resource type. You can also verify the table name with SELECT OBJECT_NAME(objid). I also like to use the query below to check for locking.
SELECT RESOURCE_TYPE, RESOURCE_SUBTYPE, DB_NAME(RESOURCE_DATABASE_ID) AS [DATABASE], RESOURCE_DATABASE_ID AS DBID,
       RESOURCE_DESCRIPTION, RESOURCE_ASSOCIATED_ENTITY_ID, REQUEST_MODE, REQUEST_SESSION_ID,
       CASE WHEN RESOURCE_TYPE = 'OBJECT'
            THEN OBJECT_NAME(RESOURCE_ASSOCIATED_ENTITY_ID, RESOURCE_DATABASE_ID)
            ELSE '' END AS LOCKED_OBJECT
FROM SYS.DM_TRAN_LOCKS WITH (NOLOCK)
WHERE REQUEST_SESSION_ID = --SPID here
In SQL Server 2008 (and later) you can disable the lock escalation on the table and enforce a WITH (ROWLOCK) in your insert clause effectively forcing a rowlock. This can't be done prior to SQL Server 2008 (you can write WITH ROWLOCK, but SQL Server can choose to ignore it).
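A minimal sketch of both pieces (the table name is hypothetical):
-- SQL Server 2008+: stop row locks on this table from escalating to a table lock
ALTER TABLE dbo.MyBlobTable SET (LOCK_ESCALATION = DISABLE);
-- Then request row locks explicitly on the insert
INSERT INTO dbo.MyBlobTable WITH (ROWLOCK) (Id, BlobData)
VALUES (1, 0x00);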
I'm speaking in general terms here, and I don't have much experience with BLOBs, as I usually advise developers to avoid them, especially if larger than 1 MB.
I have a query that runs each night on a table with a bunch of records (200,000+). This application simply iterates over the results (using a DbDataReader in a C# app if that's relevant) and processes each one. The processing is done outside of the database altogether. During the time that the application is iterating over the results I am unable to insert any records into the table that I am querying for. The insert statements just hang and eventually timeout. The inserts are done in completely separate applications.
Does SQL Server lock the table down while a query is being done? This seems like an overly aggressive locking policy. I could understand how there could be a conflict between the query and newly inserted records, but I would be perfectly ok if records inserted after the query started were simply not included in the results.
Any ways to avoid this?
Update:
The WITH (NOLOCK) definitely did the trick. As some of you pointed out, this isn't the cleanest approach. I can't really query everything into memory given the amount of records and some of the columns in this table are binary (some records are actually about 1MB of total data).
The other suggestion was to query for batches of records at a time. This isn't a bad idea either, but it does bring up a new issue: database-independent queries. Right now the application can work with a variety of different databases (Oracle, MySQL, Access, etc.). Each database has its own way of limiting the rows returned in a query. But maybe this is better saved for another question?
Back on topic, the "WITH (NOLOCK)" hint is certainly SQL Server specific; is there any way to keep it out of my query, since it would prevent the query from working with other databases? Maybe I could somehow specify a parameter on the DbCommand object? Or can I specify the locking policy at the database level? That is, can I change some properties in SQL Server itself so that the table doesn't lock like this by default?
If you're using SQL Server 2005+, then how about giving the new MVCC snapshot isolation a try. I've had good results with it:
ALTER DATABASE YourDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE;  -- replace YourDatabase with your database name
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE YourDatabase SET MULTI_USER;
It will stop readers blocking writers and vice-versa. It eliminates many deadlocks, at very little cost.
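If you want to confirm the setting took effect, a quick check (the database name is a placeholder):
-- is_read_committed_snapshot_on returns 1 once the option is enabled
SELECT name, is_read_committed_snapshot_on, snapshot_isolation_state_desc
FROM sys.databases
WHERE name = 'YourDatabase';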
It depends what isolation level you are using. You might try doing your selects using the WITH (NOLOCK) hint; that will prevent the read locks, but it also means the data being read might change before the selecting transaction completes.
The first thing you could do is try to add the "WITH (NOLOCK)" to any tables you have in your query. This will "Tame down" the locking that SQL Server does. An example of using "NOLOCK" on a join is as follows...
SELECT COUNT(Users.UserID)
FROM Users WITH (NOLOCK)
JOIN UsersInUserGroups WITH (NOLOCK) ON
Users.UserID = UsersInUserGroups.UserID
Another option is to use a DataSet instead of a DataReader. A DataReader is a "fire hose" technique that stays connected to the tables while your program is processing, basically handling the table row by row through the hose. A DataSet uses a "disconnected" methodology where all the data is loaded into memory and then the connection is closed. Your program can then loop over the data in memory without having to worry about locking. However, if this is a really large amount of data, there may be memory issues.
Hope this helps.
If you add the WITH (NOLOCK) hint after a table name in the FROM clause it should make sure it doesn't lock, and it doesn't care about reading data that is locked. You might get "out of date" results if you are writing at the same time, but if you don't care about that then you should be fine.
I reckon your best way of avoiding this is to do it in SQL rather than in the application.
You can add a
WAITFOR DELAY '000:00:01'
at the end of each loop iteration to provide time for other processes to run - just make sure that you haven't initiated a TRANSACTION such that all other processes are locked out anyway
The query is performing a table lock, thus the inserts are failing.
It sounds to me like you're keeping a lock on the table while processing the results.
You should instead load them into an array or collection of some sort, and close the database connection.
Then process the array.
In addition, while you're doing your select use either:
WITH(NOLOCK) or WITH(READPAST)
I'm not a big fan of using lock hints as you could end up with dirty reads or other weirdness. A couple of other ideas:
Can you break the number of rows down so you don't grab 200k at a time? Is there a way to tell whether you've processed a row - a flag, a timestamp - that you could use in the query? Your query could be 'SELECT TOP 5000 ...', getting a different 5k each time; shorter queries mean shorter-lived locks. (A sketch of this follows below.)
If you can use smaller sets of rows, I like the DataSet vs. IDataReader idea. You will be loading data into memory and not consuming any SQL locks, but the amount of memory involved can cause other problems.
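A hedged sketch of that batching idea, assuming a hypothetical Processed flag column on a hypothetical table:
-- Grab a small, bounded batch; READPAST skips rows currently locked by writers
SELECT TOP (5000) Id, Payload
FROM dbo.BigTable WITH (READPAST)
WHERE Processed = 0
ORDER BY Id;
-- After processing the batch in the application, flag the rows so the next batch moves on:
-- UPDATE dbo.BigTable SET Processed = 1 WHERE Id IN (/* processed ids */);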
-Brian
You should be able to set the isolation level at the .NET level so that you don't have to include the WITH (NOLOCK) hint.
If you want to go with the batching option, you should be able to specify the Rowcount setting from the .NET level, which would tell the database to return only n records. By setting these at the .NET level they should remain database independent and work across all the platforms.
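For reference, a sketch of the session-level T-SQL those settings correspond to; in the application you would set them through the provider's isolation-level and row-limit options rather than sending these statements yourself:
-- Roughly the same effect as adding WITH (NOLOCK) to every table for this session
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- Cap the number of rows returned by subsequent statements (TOP is usually preferred today)
SET ROWCOUNT 5000;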
We've got a weird problem with joining tables from SQL Server 2005 and MS Access 2003.
There's a big table on the server and a rather small table locally in Access. The tables are joined via 3 fields, one of them a datetime field (containing a day; idea is to fetch additional data (daily) from the big server table to add data to the local table).
Up until the weekend this ran fine every day. Since yesterday we have experienced strange non-time-outs in Access with this query. Non-time-out means that the query runs forever with rather high network transfer, but no timeout occurs. Access doesn't even show the progress bar. A server trace tells us that the same query is executed over and over on the SQL server, without error but without result either. We've narrowed it down to what seems to be accessing a big server table with either a JOIN or a WHERE containing a date, but we're not really able to narrow it down further. We rebuilt indices already and are currently restoring backup data, but maybe someone here has any pointers of things we could try.
Thanks, Mike.
If you join a local table in Access to a linked table in SQL Server, and the query isn't really trivial according to specific limitations of joins to linked data, it's very likely that Access will pull the whole table from SQL Server and perform the join locally against the entire set. It's a known problem.
This doesn't directly address the question you ask, but how far are you from having all the data in one place (SQL Server)? IMHO you can expect the same type of performance problems to haunt you as long as you have some data in each system.
If it were all in SQL Server a pass-through query would optimize and use available indexes, etc.
Thanks for your quick answer!
The actual query is really huge; you won't be happy with it :)
However, we've narrowed it down to a simple:
SELECT * FROM server_table INNER JOIN local_table ON server_table.date = local_table.date;
The server_table is a big table (hard to say exactly; we've got 1.5 million rows in it, and test tables with 10 rows or so have worked) and local_table is a table with a single cell containing a date. This runs forever. It's not only slow; it just does nothing except, it seems, cause network traffic, and no timeout occurs (this is what I find so strange; normally you get a timeout, but this just keeps on running).
We've just found KB article 828169; seems to be our problem, we'll look into that. Thanks for your help!
Use the DATEDIFF function to compare the two dates as follows:
-- DATEDIFF returns 0 if the dates are identical based on the datepart parameter, in this case d
WHERE DATEDIFF(d, Column, OtherColumn) = 0
DATEDIFF is optimized for use with dates. Comparing the result of the CONVERT function on both sides of the equal (=) sign might result in a table scan if either of the dates is NULL.
Hope this helps,
Bill
Try another syntax ? Something like:
SELECT * FROM BigServerTable b WHERE b.DateFld in (SELECT DISTINCT s.DateFld FROM SmallLocalTable s)
The strange thing in your problem description is "Up until the weekend this ran fine every day".
That would mean the problem is really somewhere else.
Did you try creating a new blank Access db and importing everything from the old one ?
Or just refreshing all your links ?
Please post the query that is doing this; just because you have indexes doesn't mean that they will be used. If your WHERE or JOIN clause is not sargable, then the index will not be used.
Take this, for example:
WHERE CONVERT(varchar(49),Column,113) = CONVERT(varchar(49),OtherColumn,113)
That will not use an index.
Or this:
WHERE YEAR(Column) = 2008
Functions on the left side of the operator (meaning on the column itself) will make the optimizer do an index scan instead of a seek, because it doesn't know the outcome of that function until it evaluates it for every row. (A sargable rewrite is sketched below.)
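For instance, a sketch of a sargable rewrite of the YEAR() example above; a range predicate leaves the column untouched, so the optimizer can seek the index:
-- Equivalent to YEAR(Column) = 2008, but index-seek friendly
WHERE Column >= '20080101' AND Column < '20090101'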
We rebuilt indices already and are currently restoring backup data, but maybe someone here has any pointers of things we could try.
Access can kill many good things... have you looked into blocking at all?
Run
exec sp_who2
and look at the BlkBy column to see who is blocking what.
Just an idea, but in SQL Server you can attach your Access database and use the table there. You could then create a view on the server to do the join all in SQL Server. The solution proposed in the Knowledge Base article seems problematic to me, as it's a kludge (if LIKE works, then = ought to, also).
If my suggestion works, I'd say that it's a more robust solution in terms of maintainability.