Can SQL Server detect deadlocks involving joins with linked databases? - sql-server

I have code with queries that look like this:
INSERT INTO LinkedServer.DB.dbo.Table1 (column)
SELECT something
FROM LocalDB.dbo.Table2
WHERE something NOT IN (SELECT column FROM LinkedServer.DB.dbo.Table1)
There's a lot of other code in the ecosystem that could be touching those tables. Today, this SQL code hung for way longer than I expected it to. If there were a deadlock between Table1 and Table2, can SQL Server detect that if it's a linked server?
Also, what if LocalDB were only calling out to LinkedServer and not locking any of its own objects? Can it detect the deadlock if the affected objects are all on the remote server?
In this case, LocalDB is 2012 R2 and LinkedServer is 2008

As far as I understand, a deadlock between two different tables can happen only when resources are short at the server level. Are you trying to understand a deadlock between your INSERT and SELECT queries? If so, then yes, this can happen. Say you are inserting records into the table on the linked server while, at the same time, you are selecting data in LocalDB using data from the linked server. Because the insert has not yet completed, it can block the subquery that is trying to fetch data from the linked server and pass it to the LocalDB select query. In this case the blocking involves only Table1, which lives on the linked server. From the local point of view it looks as though Table2 is being blocked by Table1, but that is not what is actually happening. And yes, of course, the deadlock information will be available in the case I described, but it will be in the DMVs of the linked server.
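To look for the kind of blocking described above, one option is to query the dynamic management views on the linked server itself. This is a minimal sketch, assuming you can connect directly to the remote instance and hold VIEW SERVER STATE permission there. It lists blocked requests only and does not by itself prove that a deadlock occurred.

```sql
-- Run this on the linked (remote) instance, not on LocalDB.
-- Lists every request that is currently waiting on another session.
SELECT
    r.session_id,
    r.blocking_session_id,   -- session holding the conflicting lock
    r.wait_type,
    r.wait_time,             -- milliseconds spent waiting so far
    r.status,
    t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```

If the hung INSERT shows up here with a non-zero blocking_session_id, the wait is happening on the remote side.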

Related

SSIS, query Oracle table using ID's from SQL Server?

Here's the basic idea of what I want to do in SSIS:
I have a large query against a production Oracle database, and I need the following where clause that brings in a long list of ids from SQL Server. From there, the results are sent elsewhere.
select ...
from Oracle_table(s) --multi-join
where id in ([select distinct id from SQL_SERVER_table])
Alternatively, I could write the query this way:
select ...
from Oracle_table(s) --multi-join
...
join SQL_SERVER_table sst on sst.ID = Oracle_table.ID
Here are my limitations:
The Oracle query is large and cannot be run without the where id in (... clause
This means I cannot run the Oracle query, then join it against the IDs in another step. I tried this, and the DBAs killed the temp table after it became 3 TB in size.
I have 160k IDs
This means it is not practical to iterate through the IDs one by one. In the past, I have run against ~1000 IDs, using a comma-separated list. It runs relatively fast - a few minutes.
The main query is in Oracle, but the ids are in SQL Server
I do not have the ability to write to Oracle
I've found many questions like this.
None of the answers I have found have a solution to my limitations.
Similar question:
Query a database based on result of query from another database
To avoid loading all rows from the Oracle table, the only way is to apply the filter in the Oracle database engine. I don't think this can be achieved using SSIS, since you have more than 160,000 IDs in the SQL Server table, which cannot be efficiently loaded and passed to the Oracle SQL command:
Using Lookups and Merge Join will require loading all data from the Oracle database
Retrieving data from SQL Server, building a comma-separated string, and passing it to the Oracle SQL command cannot be done with too many IDs (160K).
The same issue using a Script Task.
Creating a Linked Server in SQL Server and Joining both tables will load all data from the Oracle database.
To solve your problem, you should search for a way to create a link to the SQL Server database from the Oracle engine.
Oracle Heterogeneous Services
I don't have much experience with Oracle databases. Still, after a little research, I found an Oracle equivalent of SQL Server's "Linked Servers", called "heterogeneous connectivity".
The query syntax should look like this:
select *
from Oracle_table
where id in (select distinct id from SQL_SERVER_table@sqlserverdsn)
You can refer to the following step-by-step guides to read more on how to connect to SQL Server tables from Oracle:
What is Oracle equivalent for Linked Server and can you join with SQL Server?
Making a Connection from Oracle to SQL Server - 1
Making a Connection from Oracle to SQL Server - 2
Heterogeneous Database connections - Oracle to SQL Server
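As a rough illustration of what those guides set up, the Oracle side usually ends with a database link that queries can reference with the `@` suffix. This is only a sketch: the link name `sqlserverdsn`, the login, the password, and the TNS alias are all placeholders. It also assumes a gateway to SQL Server is already configured and that someone with the CREATE DATABASE LINK privilege runs it.

```sql
-- Oracle SQL. All names and credentials below are hypothetical.
-- 'sqlserverdsn' in the USING clause is assumed to be a TNS alias
-- that points at an already configured gateway to SQL Server.
CREATE DATABASE LINK sqlserverdsn
    CONNECT TO "sql_login" IDENTIFIED BY "sql_password"
    USING 'sqlserverdsn';

-- Quick connectivity check through the link.
-- Quoted identifiers are often needed because SQL Server
-- object names are passed through case-sensitively.
SELECT COUNT(*) FROM "SQL_SERVER_table"@sqlserverdsn;
```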
Importing Data from SQL Server to a staging table in Oracle
Another approach is to use a Data Flow Task that imports IDs from SQL Server to a staging table in Oracle. Then use the staging table in your Oracle query. It would be better to create an index on the staging table. (If you do not have permission to write to the Oracle database, try to get permission to a separate staging database.)
Example of exporting data from SQL Server to Oracle:
Export SQL Server Data to Oracle using SSIS
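On the Oracle side, the staging approach could look roughly like the following. This is a sketch only: the table name `stg_ids`, the index name, and the NUMBER type for the ID are assumptions, and it presumes you have been granted a schema where you can create objects.

```sql
-- Oracle SQL. stg_ids and ix_stg_ids are hypothetical names.
-- 1) Staging table that the SSIS Data Flow Task fills with the 160k IDs.
CREATE TABLE stg_ids (
    id NUMBER NOT NULL
);

-- 2) Index so the join below does not scan the staging table repeatedly.
CREATE INDEX ix_stg_ids ON stg_ids (id);

-- 3) The large query then filters inside Oracle via a join.
SELECT o.*
FROM Oracle_table o
JOIN stg_ids s ON s.id = o.id;
```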
Minimizing the data load from the Oracle table
If none of the solutions above solves your issue, you can try minimizing the data loaded from the Oracle database as much as possible.
As an example, you can get the minimum and maximum IDs from the SQL Server table and store them in two variables. Then you can use both variables in the SQL command that loads the data from the Oracle table, like the following:
SELECT * FROM Oracle_Table WHERE ID >= @MinID and ID <= @MaxID
This will remove a lot of useless data from your operation. The bounds are inclusive so that the rows with the minimum and maximum IDs themselves are not dropped. If your ID column is a string, you can use other measures to filter the data, such as the string length or the first character.
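The two bounds can be read from SQL Server with a single query. This is a small sketch, assuming the ID column is named `id` as in the question. In SSIS, an Execute SQL Task with a single-row result set is one way to map the two columns to package variables.

```sql
-- T-SQL, run against SQL Server.
-- Returns one row holding the bounds used to narrow the Oracle query.
SELECT
    MIN(id) AS MinID,
    MAX(id) AS MaxID
FROM SQL_SERVER_table;
```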

SQL Server - Vertica Connection

I need to query an HP Vertica database from a SQL Server stored procedure. It is a join query, and if I use a linked server, it is going to fire as two separate selects and join them in SQL Server. Is there any way I can use ODBC to fire the join query at Vertica from T-SQL and get the processed result set back into a SQL table?
Is there any other approach you would suggest to achieve this?
You may need to use OPENQUERY syntax in SQL Server, to get the full query sent to Vertica for execution there... There are other possibilities, but we'd need much more detail about what you have in play (especially but not only your current query) to usefully discuss them.
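A pass-through query of this kind might look like the sketch below. The linked server name `VERTICA_LINK`, the Vertica tables, and the local target table are all hypothetical, since the question does not show the actual query.

```sql
-- T-SQL. VERTICA_LINK, the public.* tables and dbo.VerticaResults
-- are placeholder names for illustration.
-- The quoted string is sent to Vertica as-is, so the join and the
-- aggregation run there; only the result rows come back.
INSERT INTO dbo.VerticaResults (customer_id, total_amount)
SELECT customer_id, total_amount
FROM OPENQUERY(VERTICA_LINK,
    'SELECT c.customer_id, SUM(o.amount) AS total_amount
     FROM public.customers c
     JOIN public.orders o ON o.customer_id = c.customer_id
     GROUP BY c.customer_id');
```

Note that OPENQUERY accepts only a literal string for the query, not a variable, so a query with changing parameters would need dynamic SQL around it.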

What is the reasoning for using OPENQUERY within a tsql stored procedure?

I am currently reviewing some jobs that run stored procedures on a database. All of these stored procedures are connecting to a linked server(s). I am not too familiar with this functionality. I am at the moment attempting to determine why these were used versus just a normal query as the queries I am running seem to be pulling in the data.
I read this, which is MSDN's explanation of OPENQUERY:
http://technet.microsoft.com/en-us/library/ms188427.aspx
I also read this, which is a Stack Overflow question about why not to use it on a local server:
Why is using OPENQUERY on a local server bad?
My question is do you basically just use this when the stored procedure requires the embedded credentials of the linked server? Or are there more reasons for using OpenQuery that I am not aware of?
I can think of two advantages of using OPENQUERY. It can reduce the amount of data you need to transfer by doing the necessary filtering on the remote server. It can also allow the query optimizer on the remote server to choose the optimal execution plan when joining tables.
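To make the difference concrete, here is a sketch of the same request written both ways. The linked server, database, and table names are made up for illustration.

```sql
-- T-SQL. RemoteSrv, SalesDb and the tables are hypothetical names.

-- Four-part names: the local optimizer decides how much work to send
-- to the remote server, and it may pull rows across before joining.
SELECT o.order_id, c.customer_name
FROM RemoteSrv.SalesDb.dbo.Orders AS o
JOIN RemoteSrv.SalesDb.dbo.Customers AS c
    ON c.customer_id = o.customer_id
WHERE o.order_date >= '20240101';

-- OPENQUERY: the whole statement is executed remotely, so the join
-- and the filter are planned by the remote optimizer and only the
-- matching rows travel over the network.
SELECT order_id, customer_name
FROM OPENQUERY(RemoteSrv,
    'SELECT o.order_id, c.customer_name
     FROM SalesDb.dbo.Orders AS o
     JOIN SalesDb.dbo.Customers AS c
         ON c.customer_id = o.customer_id
     WHERE o.order_date >= ''20240101''');
```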
Another alternative is the REMOTE join hint.
I've had some luck using it, and Aaron Bertrand has a nice write-up about it here: http://www.mssqltips.com/sqlservertip/2765/revisit-your-use-of-the-sql-server-remote-join-hint/
Here is the MS documentation
REMOTE
Specifies that the join operation is performed on the site of the right table. This is useful when the left table is a local table and the right table is a remote table. REMOTE should be used only when the left table has fewer rows than the right table.
If the right table is local, the join is performed locally. If both tables are remote but from different data sources, REMOTE causes the join to be performed on the site of the right table. If both tables are remote tables from the same data source, REMOTE is not required.
REMOTE cannot be used when one of the values being compared in the join predicate is cast to a different collation using the COLLATE clause.
REMOTE can be used only for INNER JOIN operations.
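Following the documentation quoted above, a query using the hint could be sketched as follows. The table and server names are hypothetical. The small local table is on the left and the large remote table is on the right, which is the case the documentation recommends.

```sql
-- T-SQL. dbo.LocalSmall, LinkedServer, RemoteDb and RemoteBig
-- are placeholder names.
-- REMOTE asks for the join to be performed on the site of the
-- right-hand (remote) table; it is only valid with INNER JOIN.
SELECT l.id, r.payload
FROM dbo.LocalSmall AS l
INNER REMOTE JOIN LinkedServer.RemoteDb.dbo.RemoteBig AS r
    ON r.id = l.id;
```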

Access Upsize to Sqlserver not transfer data

I have an Access database with 11,000,000 records. I want to transfer these records to the same table in SQL Server 2008 using the Upsizing tool. The tool creates the database and tables correctly, but the table in SQL Server is empty and the data is not transferred.
Since you didn't mention receiving an error message, check the field types in the new SQL Server table to confirm they are compatible with their Access counterparts.
If it looks OK, start Access and create an ODBC link to the SQL Server table. Then create an Access "append query" to add data from the Access table to the SQL Server table.
INSERT INTO remote_table (field1, field2, field3)
SELECT field1, field2, field3
FROM local_table
WHERE date_field >= #2012-01-01# AND date_field < #2012-02-01#;
Note I imagined a WHERE clause which limits the number of rows to a reasonably small subset of the 11 million rows. Adjust as needed for your situation.
If that INSERT succeeds, repeat it with different WHERE conditions to append chunks of the data to SQL Server until you get it all transferred.
And if it fails, hopefully you will get an error message which explains why.
As noted here, in most cases it is a bad date, or simply a date outside the range SQL Server supports, that causes the failure. I would suggest you use the Access migration tool as opposed to the built-in tool. It does a MUCH better job.
You can find this utility here:
http://www.microsoft.com/en-us/download/details.aspx?id=28763
It tends to deal with dates and the other issues that prevent data uploads far better than the built-in upsizing tool.

SQL Server Service Broker to insert data on a SQL Server 2000

I have two servers, the first with SQL Server 2005 and the second with SQL Server 2000. I want to insert data from the 2005 server into the 2000 server, and I want to do it asynchronously (without distributed transactions, because SAVE TRANSACTION is used).
Once the information is inserted in the tables of the 2000 server, some instead-of triggers are fired to process this information.
In that scenario I decided to use Service Broker. So I have a Stored Procedure to insert information from one server to the other and it works perfectly.
But when I call this procedure from the procedure that processes messages on the target queue, it fails, and I don't know why!
I know the design works, because it succeeds when I use the same structure (queues and stored procedures) to copy from one database to another on the same SQL 2005 server.
So it fails only between machines. Does anyone know why, or how to get more information about the cause of the failure? Or how to insert the data asynchronously? (I can't use SQL Agent because I want to insert the information more often than once a minute.)
The usual issue is that the SSB procedure uses WAITFOR, and WAITFOR is incompatible with distributed transactions. The only solution is to get rid of the WAITFOR(RECEIVE) and use an ordinary RECEIVE instead.
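A queue reader that uses a plain RECEIVE could be sketched as below. This is not the asker's procedure: the procedure name, queue name, and column are hypothetical, and the message body is assumed to be Unicode text that fits in NVARCHAR(4000). No explicit transaction is opened, which avoids escalation to a distributed transaction but means a message is lost if the remote insert fails after the RECEIVE.

```sql
-- T-SQL. dbo.ProcessTargetQueue, dbo.TargetQueue and col1 are
-- placeholder names; adapt the message handling to your contract.
CREATE PROCEDURE dbo.ProcessTargetQueue
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @handle UNIQUEIDENTIFIER;
    DECLARE @type   NVARCHAR(256);
    DECLARE @body   VARBINARY(MAX);

    WHILE 1 = 1
    BEGIN
        -- Plain RECEIVE: returns immediately instead of waiting.
        RECEIVE TOP (1)
            @handle = conversation_handle,
            @type   = message_type_name,
            @body   = message_body
        FROM dbo.TargetQueue;

        -- Queue is empty: stop and let activation restart us later.
        IF @@ROWCOUNT = 0 BREAK;

        IF @type = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
            END CONVERSATION @handle;
        ELSE
            -- Forward the payload to the SQL Server 2000 instance.
            INSERT INTO LinkedServer.The2000Db.dbo.The2000Table (col1)
            VALUES (CAST(@body AS NVARCHAR(4000)));
    END
END;
```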
Consider using a linked server instead of service broker. With a linked server, you can:
insert into LinkedServer.The2000Db.dbo.The2000Table
(col1, col2, col3)
select col1, col2, col3
from The2005Table
