SQL Server Linked Server Update - terrible performance

In my SQL Server 2012 database, I have a linked server reference to a second SQL Server database that I need to pull records from and update accordingly.
I have the following update statement that I am trying to run:
UPDATE Linked_Tbl
SET Transferred = 1
FROM MyLinkedServer.dbo.MyTable Linked_Tbl
JOIN MyTable Local_Tbl
    ON Local_Tbl.LinkedId = Linked_Tbl.Id
JOIN MyOtherTable Local_Tbl2
    ON Local_Tbl.LocalId = Local_Tbl2.LocalId
I had to stop it after an hour because it was still executing.
I've read online that the best approach is to create a stored procedure on the linked server itself to execute the update, rather than running it over the wire.
The problems I have are:
I don't have the ability to create any procedures on the other server.
Even if I could create that procedure, I would need to pass all the Ids to it for the update, and I'm not sure how to do that efficiently with thousands of Ids (though this is obviously the smaller issue, since I can't create the procedure in the first place).
I'm hoping there are other solutions people may have managed to come up with given that it's often the case you don't have permissions to make changes to a different server.
Any ideas??

I am not sure whether it will give more performance, but you can try:
UPDATE Linked_Tbl
SET Transferred = 1
FROM OPENQUERY([MyLinkedServer], 'select Id, LocalId, Transferred from remotedb.dbo.MyTable') AS Linked_Tbl
JOIN MyTable Local_Tbl
    ON Local_Tbl.LinkedId = Linked_Tbl.Id
JOIN MyOtherTable Local_Tbl2
    ON Local_Tbl.LocalId = Local_Tbl2.LocalId
(Note: the original snippet used OPENDATASOURCE here, but passing a linked server name plus a query string is OPENQUERY's signature; OPENDATASOURCE takes a provider name and a connection string.)
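Another option that avoids creating anything on the remote server is to run the update remotely with EXEC ... AT, which only requires RPC Out to be enabled on the linked server definition. A rough sketch, with illustrative names; with thousands of Ids you would batch the list rather than send it all at once:
-- Sketch only: assumes RPC Out is enabled for MyLinkedServer and the
-- remote login may run ad-hoc batches. Names are illustrative.
DECLARE @Ids nvarchar(max);

-- Build a comma-separated Id list from local tables only (no remote access).
SELECT @Ids = STUFF((
    SELECT ',' + CAST(Local_Tbl.LinkedId AS varchar(20))
    FROM MyTable Local_Tbl
    JOIN MyOtherTable Local_Tbl2 ON Local_Tbl.LocalId = Local_Tbl2.LocalId
    FOR XML PATH('')), 1, 1, '');

-- The UPDATE statement executes entirely on the remote server.
EXEC ('UPDATE dbo.MyTable SET Transferred = 1 WHERE Id IN (' + @Ids + ')')
    AT MyLinkedServer;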

Related

Allow Windows AD group to own a SQL job

SQL Agent Jobs sit above the user level and require a login to be assigned as the owner, but a group login is not an accepted parameter. I need to use a Windows AD group as owner because I have different SQL users and some of them should see only specific jobs. For now I've created separate jobs for every user using SQLAgentUserRole, which is certainly not good, and the database is full of 1:1 jobs, each with a different owner to avoid seeing the other jobs.
The whole picture:
Let's say that I have 10 different jobs in the database. One of those jobs is named UserJob. I want specific users, when connecting to the database and expanding the jobs section, to see ONLY the job named "UserJob" and be able to start it. I don't need it via a stored procedure, etc. I just need to start the job via SSMS (right click, start job, enter parameters if needed). Thanks.
As per the docs, SSMS checks the user's membership in the following database roles to show the SQL Server Agent tree node:
SQLAgentUserRole
SQLAgentReaderRole
SQLAgentOperatorRole
I used SQL Server Profiler to find out what queries are executed when you first connect to a database in Object Explorer and expand various nodes.
For SQL Server Agent it runs SELECT * FROM msdb.dbo.sysjobs_view to list the jobs. This view can be modified.
Changes
1. Create a new database role in the msdb database. I called it "CustomJobRole".
2. Create a new Job (I assume you already have one); I called mine "TestJob".
3. Create a low-privilege user that should be able to see and run only "TestJob".
4. Add this user to "CustomJobRole" and to "SQLAgentReaderRole" and/or "SQLAgentOperatorRole" (see the docs linked above for details). A T-SQL sketch of these steps follows the view code below.
5. Modify sysjobs_view as follows (see comments in code):
ALTER VIEW sysjobs_view
AS
SELECT jobs.job_id,
svr.originating_server,
jobs.name,
jobs.enabled,
jobs.description,
jobs.start_step_id,
jobs.category_id,
jobs.owner_sid,
jobs.notify_level_eventlog,
jobs.notify_level_email,
jobs.notify_level_netsend,
jobs.notify_level_page,
jobs.notify_email_operator_id,
jobs.notify_netsend_operator_id,
jobs.notify_page_operator_id,
jobs.delete_level,
jobs.date_created,
jobs.date_modified,
jobs.version_number,
jobs.originating_server_id,
svr.master_server
FROM msdb.dbo.sysjobs as jobs
JOIN msdb.dbo.sysoriginatingservers_view as svr
ON jobs.originating_server_id = svr.originating_server_id
--LEFT JOIN msdb.dbo.sysjobservers js ON jobs.job_id = js.job_id
WHERE
-- Custom: Add Condition for your Custom Role and Job Name
( (ISNULL(IS_MEMBER(N'CustomJobRole'), 0) = 1) AND jobs.name = 'TestJob' )
OR (owner_sid = SUSER_SID())
OR (ISNULL(IS_SRVROLEMEMBER(N'sysadmin'), 0) = 1)
-- Custom: In order for users to be able to see and start Jobs they have to be members of SQLAgentReaderRole/SQLAgentOperatorRole
-- but these roles gives them ability to see all jobs so add an exclusion
OR ( ISNULL(IS_MEMBER(N'SQLAgentReaderRole'), 0) = 1 AND ISNULL( IS_MEMBER(N'CustomJobRole'), 0 ) = 0 )
OR ( (ISNULL(IS_MEMBER(N'TargetServersRole'), 0) = 1) AND
(EXISTS(SELECT * FROM msdb.dbo.sysjobservers js
WHERE js.server_id <> 0 AND js.job_id = jobs.job_id))) -- filter out local jobs
Note: the commented-out LEFT JOIN is original code and has nothing to do with the solution.
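For reference, steps 1, 3 and 4 might look like this in T-SQL (a sketch; the TestUser login/user name is illustrative):
USE msdb;
GO
-- Step 1: the custom role referenced by the modified view.
CREATE ROLE CustomJobRole;
GO
-- Steps 3 and 4: a low-privilege user (assuming a login named TestUser
-- already exists), added to the custom role and to SQLAgentReaderRole.
CREATE USER TestUser FOR LOGIN TestUser;
ALTER ROLE CustomJobRole ADD MEMBER TestUser;
ALTER ROLE SQLAgentReaderRole ADD MEMBER TestUser;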
Summary
This method is "hacky": it only modifies the job list for certain users and does not actually prevent them from running other jobs via code. In other words, it does not offer any security, just the convenience of a clean UI. Implementation is simple but obviously not scalable: the job name is hard-coded and a negative membership check is used (i.e. AND ISNULL( IS_MEMBER(N'CustomJobRole'), 0 ) = 0). IMO, it is still the simplest and most reliable (fewest side effects) method, though.
Tested on
SSMS v18.9.2 + SQL Server 2014 SP3
Editing Job Step Workaround
It is not possible to modify Job Step unless you are a Job Owner or a Sysadmin.
One, even more "hacky", way to work around this problem is to create a table that holds all input parameters and give users insert/update access to it. Your SP can then read its parameters from this table. It should be easy for users to right-click -> Edit the table and modify the data.
For the table structure I would recommend the following:
Assuming you have relatively few parameters, create one column per parameter. This way each parameter has the correct data type.
Add an AFTER INSERT/DELETE trigger to the table to ensure that the table always has exactly one row of data. A sketch of both follows.
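A minimal sketch, assuming two illustrative parameters (adjust the columns to your SP's actual parameters):
-- One typed column per SP parameter; names are illustrative.
CREATE TABLE dbo.JobParameters (
    StartDate date        NOT NULL,
    Region    varchar(50) NOT NULL
);
GO
-- Keep the table at exactly one row by rejecting inserts/deletes
-- that change the row count.
CREATE TRIGGER dbo.trg_JobParameters_OneRow
ON dbo.JobParameters
AFTER INSERT, DELETE
AS
BEGIN
    IF (SELECT COUNT(*) FROM dbo.JobParameters) <> 1
    BEGIN
        RAISERROR('JobParameters must contain exactly one row.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
GO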

Verifying Syntax of a Massive SQL Update Command

I'm new to SQL Server and am doing some cleanup of our transaction database. However, to accomplish the last step, I need to update a column in one table of one database with the value from another column in another table from another database.
I found a SQL update code snippet and rewrote it for our own needs, but would love someone to give it a once-over before I hit the execute button, since the update will literally affect hundreds of thousands of entries.
So here are the two databases:
Database 1: Movement
Table 1: ItemMovement
Column 1: LongDescription (datatype: text / up to 40 char)
Database 2: Item
Table 2: ItemRecord
Column 2: Description (datatype: text / up to 20 char)
Goal: set Column1 from Database 1 to the value of Column2 from Database 2.
Here is the code snippet:
update table1
set table1.longdescription = table2.description
from movement..itemmovement as table1
inner join item..itemrecord as table2 on table1.itemcode = table2.itemcode
where table1.longdescription <> table2.description
I added the last "where" line to prevent SQL from updating the column where it already matches the source table.
This should execute faster and just update the columns that have garbage. But as it stands, does this look like it will run? And lastly, is it a straightforward process, using SQL Server 2005 Express to just backup the entire Movement db before I execute? And if it messes up, just restore it?
Alternatively, is it even necessary to re-cast the tables as table1 and table2? Is it valid to execute a SQL query like this:
update movement..itemmovement
set itemmovement.longdescription = itemrecord.description
from movement..itemmovement
inner join item..itemrecord on itemmovement.itemcode = itemrecord.itemcode
where itemmovement.longdescription <> itemrecord.description
Many thanks in advance!
You don't necessarily need to alias your tables, but I recommend you do: it's faster to type and reduces the chance of a typo.
update m
set m.longdescription = i.description
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description
In the above query I have shortened the alias using m for itemmovement and i for itemrecord.
When a large number of records are to be updated and there's any question whether it will succeed, always make a copy in a test database (residing on a test server) and try it out there. In this case, one of the safest bets would be to create a new field first and call it longdescription_test. You can make it with SQL Server Management Studio Express (SSMS) or using the command below:
use movement;
alter table itemmovement add longdescription_test varchar(100);
The syntax says to alter the table itemmovement and add a new column called longdescription_test with a data type of varchar(100) (note that T-SQL's ALTER TABLE ... ADD takes no COLUMN keyword). If you create a new column using SSMS, in the background SSMS will run the same ALTER TABLE statement to create it.
You can then execute
update m
set m.longdescription_test = i.description
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description
Check data in longdescription_test randomly. You can actually do a spot check faster by running:
select * from movement..itemmovement
where longdescription <> longdescription_test
and longdescription_test is not null
If information in longdescription_test looks good, you can change your update statement to set m.longdescription = i.description and run the query again.
It is easier to just create a copy of your itemmovement table before you do the update. To make a copy, you can just do:
use movement;
select * into itemmovement_backup from itemmovement;
If update does not succeed as desired, you can truncate itemmovement and copy data back from itemmovement_backup.
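That restore might look like this (a sketch; it assumes no foreign keys block the TRUNCATE and that itemmovement has no identity column, otherwise SET IDENTITY_INSERT would be needed):
use movement;
-- throw away the bad rows and copy the originals back
truncate table itemmovement;
insert into itemmovement select * from itemmovement_backup;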
Zedfoxus provided a GREAT explanation of this and I appreciate it. It is an excellent reference for next time around. After reading over some syntax examples, I was confident enough to run the second SQL update query from my OP. Luckily, the data here is not exactly "live", so there was little risk of damaging anything, even during operating hours. Given the nature of the data, the update executed perfectly, updating all 345,000 entries!

Connection scoped temp tables across stored procedures

I'm working on a data virtualization solution. The user is able to write his own SQL queries as filters for a query I make, and I would like to avoid running this filter query every time I select something from the database (it will likely be a complex series of joins).
My idea was to use a # temp table at script level and keep the connection alive. This #temp table would then be selected from, but updated only when the user changes the filter. That way I can actually use it from stored procedures, and the table is scoped to that connection.
I got the idea from someone who suggested using dynamic SQL and ## global temp tables named with the connection's process ID, so that each connection gets a unique global temp table. That was meant to overcome the problem of sharing temp tables across stored procedures, but it seems a bit clumsy.
I did a quick test with the code below and it seemed to work fine:
-- Run script at connection open from some app
SELECT * INTO #test
FROM dataTable
-- Now we can use stored procedures with #test table
EXECUTE selectFromTempTable
EXECUTE updateTempTable @sqlFilterString
EXECUTE selectFromTempTable
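For reference, minimal sketches of the two procedures this test assumes; a #temp table created on a connection is visible to procedures executed from that connection:
-- Sketches only; real code needs column handling and filter validation.
CREATE PROCEDURE selectFromTempTable
AS
    SELECT * FROM #test;  -- resolves to the caller's #test at run time
GO
CREATE PROCEDURE updateTempTable
    @sqlFilter nvarchar(max)
AS
    DELETE FROM #test;
    -- Re-run the user's filter; beware of SQL injection with raw filter text.
    INSERT INTO #test
    EXEC (N'SELECT * FROM dataTable WHERE ' + @sqlFilter);
GO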
The only real problem I can see is that the connection has to be kept alive for the duration, which could be a few hours. A single user can have multiple connections running at the same time, and the number of users on a single database server would be at most around 20.
If it's a huge issue, I could make the application close and open connections as needed, so each user only has one connection open at a time; maybe even close it when not in use and reopen it when needed again, at the cost of waiting for the query to run.
Would this be bad practice, or would it kill any performance benefit from not running the filter query? This is on SQL Server 2008 and up.
I think I would create a permanent table, using the spid (process ID) as a key value. Each connection has its own process ID, so anyone can use it to identify their entries in the table:
create table filter(
    spid int,
    filternum int,
    filterstring varchar(255),
    <other cols> );
create unique index filterindx on filter(spid, filternum);
Then when a user creates filter entries:
delete from filter where spid = @@spid
insert into filter(spid, filternum, filterstring) select @@spid, 1, 'some sql thing'
insert into filter(spid, filternum, filterstring) select @@spid, 2, 'some other sql thing'
Then you can access each user's filter values by selecting where spid = @@spid, etc.
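For example, a stored procedure could read the current connection's filters like this (sketch):
-- @@SPID returns the session ID of the current connection.
SELECT filterstring
FROM filter
WHERE spid = @@SPID
ORDER BY filternum;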

SQL Server Linked Server Example Query

While in Management Studio, I am trying to run a query/do a join between two linked servers.
Is this a correct syntax using linked db servers:
select foo.id
from databaseserver1.db1.table1 foo,
databaseserver2.db1.table1 bar
where foo.name=bar.name
Basically, do you just prefix the database server name to the db.table reference?
The format should probably be:
<server>.<database>.<schema>.<table>
For example:
DatabaseServer1.db1.dbo.table1
Update: I know this is an old question and the answer I gave is correct; however, I think anyone else stumbling upon this should know a few things.
Namely, when querying against a linked server in a join situation, the ENTIRE table from the linked server will likely be downloaded to the server the query is executing on in order to perform the join. In the OP's case, both table1 from DB1 and table1 from DB2 will be transferred in their entirety to the server executing the query, presumably named DB3.
If you have large tables, this may result in an operation that takes a long time to execute. After all, it is now constrained by network transfer speeds, which are orders of magnitude slower than memory or even disk transfer speeds.
If possible, perform a single query against the remote server, without joining to a local table, to pull the data you need into a temp table. Then query off of that.
If that's not possible, then you need to look at the various things that can force SQL Server to load the entire table locally, for example using GETDATE() or certain joins. Other performance killers include not granting appropriate rights.
See http://thomaslarock.com/2013/05/top-3-performance-killers-for-linked-server-queries/ for some more info.
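A sketch of the "pull into a temp table first" approach, using the names from the question:
-- Pull just the needed columns from each remote server, one query apiece.
SELECT id, name INTO #foo FROM databaseserver1.db1.dbo.table1;
SELECT name INTO #bar FROM databaseserver2.db1.dbo.table1;
-- Join locally; no data moves over the wire during the join itself.
SELECT foo.id
FROM #foo AS foo
INNER JOIN #bar AS bar ON foo.name = bar.name;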
SELECT * FROM OPENQUERY([SERVER_NAME], 'SELECT * FROM DATABASE_NAME..TABLENAME')
This may help you.
For those having trouble with the other answers, try OPENQUERY.
Example:
SELECT * FROM OPENQUERY([LinkedServer], 'select * from [DBName].[schema].[tablename]')
If you still have issues with <server>.<database>.<schema>.<table>, enclose the server name in brackets: [server].
You need to specify the schema/owner (dbo by default) as part of the reference. Also, it would be preferable to use the newer (ANSI-92) join style.
select foo.id
from databaseserver1.db1.dbo.table1 foo
inner join databaseserver2.db1.dbo.table1 bar
on foo.name = bar.name
select * from [Server].[database].[schema].[tablename]
This is the correct way to call it.
Be sure to verify that the servers are linked before executing the query!
To check for linked servers call:
EXEC sys.sp_linkedservers
Right-click on a table and choose Script Table as > SELECT to get an example:
select name from drsql01.test.dbo.employee
where:
drsql01 is the server name (the linked server)
test is the database name
dbo is the schema (the default schema)
employee is the table name
I hope this helps you understand how to run a query against a linked server.
Usually direct queries should not be used against a linked server, because they make heavy use of the local server's tempdb: data is first retrieved into tempdb and then filtered. There are many threads about this. It is better to use OPENQUERY, because it passes the SQL to the source linked server, which returns only the filtered results, e.g.
SELECT *
FROM OPENQUERY(Linked_Server_Name , 'select * from TableName where ID = 500')
For what it's worth, I found the following syntax to work the best:
SELECT * FROM [LINKED_SERVER]...[TABLE]
I couldn't get the recommendations of others to work, using the database name. Additionally, this data source has no schema.
In sql-server(local) there are two ways to query data from a linked server(remote).
Distributed query (four part notation):
Might not work with all remote servers. If your remote server is MySQL then distributed query will not work.
Filters and joins might not work efficiently. If you have a simple query with a WHERE clause, sql-server (local) might first fetch the entire table from the remote server and then apply the WHERE clause locally. In the case of large tables this is very inefficient, since a lot of data will be moved from remote to local. However, this is not always the case: if the local server has access to the remote server's table statistics, it might be as efficient as using openquery. More details
On the positive side T-SQL syntax will work.
SELECT * FROM [SERVER_NAME].[DATABASE_NAME].[SCHEMA_NAME].[TABLE_NAME]
OPENQUERY
This is basically a pass-through. The query is fully processed on the remote server, thus making use of indexes and any other optimizations on the remote server, and effectively reducing the amount of data transferred from the remote to the local sql-server.
A minor drawback of this approach is that T-SQL syntax will not work if the remote server is anything other than sql-server.
SELECT * FROM OPENQUERY([SERVER_NAME], 'SELECT * FROM DATABASE_NAME.SCHEMA_NAME.TABLENAME')
Overall, OPENQUERY seems like a much better option in the majority of cases.
I managed to find out the column data types of a table on the linked server using OPENQUERY, and it worked:
SELECT * FROM OPENQUERY (LINKSERVERNAME, '
SELECT DATA_TYPE, COLUMN_NAME
FROM [DATABASENAME].INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME =''TABLENAME''
')
It works for me.
The following query works best; try it:
SELECT * FROM OPENQUERY([LINKED_SERVER_NAME], 'SELECT * FROM [DATABASE_NAME].[SCHEMA].[TABLE_NAME]')
It helps a lot when linking MySQL to MS SQL.
PostgreSQL:
You must provide a database name in the Data Source DSN.
Run Management Studio as Administrator
You must omit the DBName from the query:
SELECT * FROM OPENQUERY([LinkedServer], 'select * from schema."tablename"')
For MariaDB (and so probably MySQL), attempting to specify the schema using the three-dot syntax did not work, resulting in the error "invalid use of schema or catalog". The following solution worked:
In SSMS, go to Server Objects > Linked Servers > Providers > MSDASQL
Ensure that "Dynamic parameter", "Level zero only", and "Allow inprocess" are all checked
You can then query any schema and table using the following syntax:
SELECT TOP 10 *
FROM LinkedServerName...[SchemaName.TableName]
Source: SELECT * FROM MySQL Linked Server using SQL Server without OpenQuery
Have you tried adding quotes (") around the first name? Like:
select foo.id
from "databaseserver1".db1.table1 foo,
"databaseserver2".db1.table1 bar
where foo.name=bar.name

SQL Server COMPILE locks?

SQL Server 2000 here.
I'm trying to be an interim DBA, but don't know much about the mechanics of a database server, so I'm getting a little stuck. There's a client process that hits three views simultaneously. These three views query a remote server to pull back data.
What it looks like is that one of these queries will work, but the other two fail (the client process says it times out, so I'm guessing a lock can do that). The querying process holds a lock that sticks around until the SQL process is restarted (I got gutsy and tried to kill the spid once, but it wouldn't let go). Any queries to this database after the lock hang, and blame the first process for blocking them.
The process reports these locks:
spid  dbid  ObjId       IndId  Type  Resource   Mode   Status
53    17    0           0      DB               S      GRANT
53    17    1445580188  0      TAB              Sch-S  GRANT
53    17    1445580188  0      TAB   [COMPILE]  X      GRANT
I can't analyze that too well. Object 1445580188 is sp_bindefault, a system stored procedure in master. What's it hanging on to an exclusive lock for?
Here is the view code (names changed to protect the proprietary bits; they stayed consistent with aliases and whatnot, and everything else is exactly the same):
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER OFF
GO
ALTER view [dbo].[theView]
as
select
a.[column1] column_1
,b.[column2] column_2
,[column3]
,[column4]
,[column5]
,[column6]
,[column7]
,[column8]
,[column9]
,[column10]
,p.[column11]
,p.[column12]
FROM
[remoteServer].db1.dbo.[tableP] p
join [remoteServer].db2.dbo.tableA a on p.id2 = a.id
join [remoteServer].db2.dbo.tableB b on p.id = b.p_id
WHERE
isnumeric(b.code) = 1
GO
SET ANSI_NULLS OFF
GO
SET QUOTED_IDENTIFIER ON
GO
Take a look at this link. Are you sure it's the views that are blocking and not stored procedures? To find out, run the query below with the ObjId from your table above. There are things you can do to mitigate stored procedure recompiles. The biggest one is to avoid naming your stored procedures with an "sp_" prefix; see this article on page 10. Also avoid using if/else branches in the code; use WHERE clauses with CASE statements instead, as sketched below. I hope this helps.
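A sketch of the if/else-to-CASE idea (table and parameter names are illustrative):
-- Instead of branching:
--   IF @ByDate = 1 SELECT ... WHERE OrderDate = @Date
--   ELSE SELECT ... WHERE CustomerId = @Id
-- use a single statement, so there is one plan to compile:
SELECT *
FROM dbo.Orders
WHERE OrderDate = CASE WHEN @ByDate = 1 THEN @Date ELSE OrderDate END
  AND CustomerId = CASE WHEN @ByDate = 0 THEN @Id ELSE CustomerId END;
-- Note: the "ELSE column" trick assumes those columns are non-NULL.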
[Edit]:
I believe sp_binddefault/rule is used in conjunction with user defined types (UDT). Does your view make reference to any UDT's?
SELECT * FROM sys.Objects where object_id = 1445580188
Object 1445580188 is sp_bindefault in the master database, no? Also, it shows resource = "TAB" = table.
USE master
SELECT OBJECT_NAME(1445580188), OBJECT_ID('sp_bindefault')
USE mydb
SELECT OBJECT_NAME(1445580188)
If the 2nd query returns NULL, then the object is a work table.
I'm guessing it's a work table being generated to deal with the results locally.
The JOIN will happen locally and all data must be pulled across.
Now, I can't shed light on the compile lock: the view should be compiled already. This is complicated by the remote server access, and my experience with compile locks is all related to stored procs.