I have a database on server X containing my source data.
I also have a database on server Y that contains data I need to augment with data on server X.
Currently we have a nightly job on server Y that calls a stored procedure on server X, inserts the results into a table variable, then sends the data as XML to a stored procedure in the database on server Y.
Below is basically what the code looks like:
--Get data from source
DECLARE @MySourceData TABLE
(
[ColumnX] VARCHAR(50),
[ColumnY] VARCHAR(50)
);
INSERT INTO @MySourceData EXECUTE [ServerX].SourceDatabase.dbo.[pListData];
DECLARE @XmlData XML;
SELECT
@XmlData =
(
SELECT
[ColumnX]
,[ColumnY]
FROM
@MySourceData
FOR XML RAW ('Item'), ROOT('Items'), ELEMENTS, TYPE
)
--Send data to target
EXEC TargetDatabase.dbo.pImportData @XmlData;
This approach keeps any server names or database names within the SQL of the job step (which we think of as part of the configuration), and allows us to abide by our in-house development standard of using stored procedures for data access. While this particular solution only processes a few thousand records and the XML won't get that big, I worry about how poorly it might scale if we applied it in scenarios where the dataset was larger. I'm curious if others have better suggestions.
My desired end result is to simply be able to SELECT from a stored procedure. I've searched the Internet, and unfortunately the Internet says this can't be done: you first need to create a temp table to store the data. My problem is that you must define the columns in the temp table before executing the stored procedure. This is just time consuming. I simply want to take the data from the stored procedure and stick it into a temp table.
What is the FASTEST route to achieve this from a coding perspective? To put it simply, it's time consuming to first have to look up the fields returned by a stored procedure and then write them all out.
Is there some sort of tool that can just build the CREATE TABLE statement based on the stored procedure? See screenshot for clarification.
Most of the stored procedures I'm dealing with have 50+ fields. I don't look forward to defining each of these fields manually.
Here is a good SO post that got me this far, but it's not what I was hoping for; it still takes too much time. What are experienced SQL Server developers doing? I've only recently made the jump from Oracle to SQL Server, and from what I can tell, temp tables are a big deal in SQL Server.
You have several options to ease your task, though none of them is fully automatic. Be aware that they won't work if there's dynamic SQL in the procedure's code. You may be able to format the results from the functions below to increase the automation, letting you copy and paste the column definitions easily.
SELECT * FROM sys.dm_exec_describe_first_result_set_for_object(OBJECT_ID('report.MyStoredProcedureWithAnyColumns'), 0) ;
SELECT * FROM sys.dm_exec_describe_first_result_set(N'EXEC report.MyStoredProcedureWithAnyColumns', null, 0) ;
EXEC sp_describe_first_result_set @tsql = N'EXEC report.MyStoredProcedureWithAnyColumns';
GO
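Going one step further, a sketch (using the same placeholder procedure name as above) can concatenate that output into a paste-ready column list for a CREATE TABLE statement:
-- Build a copy/paste-ready column definition list from the first result set
-- (report.MyStoredProcedureWithAnyColumns is a placeholder name)
SELECT STUFF((
    SELECT ',' + QUOTENAME(name) + ' ' + system_type_name
    FROM sys.dm_exec_describe_first_result_set(N'EXEC report.MyStoredProcedureWithAnyColumns', NULL, 0)
    ORDER BY column_ordinal
    FOR XML PATH('')), 1, 1, '') AS ColumnList;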
If you don't mind a ##temp table and some dynamic SQL:
NOTE: As Luis Cazares correctly pointed out, the ##temp table runs the risk of collision due to concurrency concerns.
Example
Declare @SQL varchar(max) = 'Exec [dbo].[prc-App-Lottery-Search] ''8117'''
Declare @temp varchar(500) = '##myTempTable'
Set @SQL = '
If Object_ID(''tempdb..'+@temp+''') Is Not NULL Drop Table '+@temp+';
Create Table '+@temp+' ('+stuff((Select concat(',',quotename(Name),' ',system_type_name)
From sys.dm_exec_describe_first_result_set(@SQL,null,null) A
Order By column_ordinal
For XML Path ('')),1,1,'') +')
Insert '+@temp+' '+@SQL+'
'
Exec(@SQL)
Select * from ##myTempTable
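To reduce that collision risk, one variation (my own sketch, not part of the original answer) is to make the global name unique per session:
-- Make the global temp table name unique per session to avoid collisions
Declare @temp varchar(500) = '##myTempTable_' + cast(@@SPID as varchar(10))
-- The final SELECT must then be dynamic too, since the name is no longer fixed:
-- Exec('Select * from ' + @temp)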
The application I'm currently working with has different schema names for its tables; for example, Table1 can exist in multiple schemas, say A.Table1 and B.Table1. All my stored procedures are stored under dbo. I'm writing stored procedures like the one below using dynamic SQL. I'm currently using SQL Server 2008 R2, and it will soon be migrated to SQL Server 2012.
create procedure dbo.usp_GetDataFromTable1
@schemaname varchar(100),
@userid bigint
as
begin
declare @sql nvarchar(4000)
set @sql='select a.EmailID from '+@schemaname+'.Table1 a where a.ID=@user_id';
exec sp_executesql @sql, N'@user_id bigint', @user_id=@userid
end
Now my questions are:
1. Does this type of approach affect the performance of my stored procedure?
2. If performance is affected, how should I write procedures for this kind of scenario?
The best way around this would be a redesign, if at all possible.
You can even implement this retrospectively by adding a new column to replace the schema (for example, Profile), then merging all the tables from each schema into one in a single schema (e.g. dbo).
Then your procedure would appear as follows:
create procedure dbo.usp_GetDataFromTable1
@profile int,
@userid bigint
as
begin
select a.EmailID from dbo.Table1 a
where a.ID = @userid
and a.Profile = @profile
end
I have used an int for the profile column, but if you use a varchar you could even keep your schema name for the profile value, if that helps to make things clearer.
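If it helps to see the retrospective merge itself, here is a rough sketch (assuming dbo.Table1 already exists with the same columns as the per-schema tables, and that Profile values 1 and 2 stand in for schemas a and b):
-- One-off migration: tag each source schema's rows with their own Profile value
ALTER TABLE dbo.Table1 ADD Profile int NULL;
GO
INSERT INTO dbo.Table1 (ID, EmailID, Profile)
SELECT ID, EmailID, 1 FROM a.Table1;

INSERT INTO dbo.Table1 (ID, EmailID, Profile)
SELECT ID, EmailID, 2 FROM b.Table1;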
I would look at a provisioning approach, where you dynamically create the tables and stored procedures as part of some up-front process. I'm not 100% sure of your scenario, but perhaps this could be when you add a new user. Then, you can call these SPs by convention in the application.
For example, new user creation calls an SP which creates c.Table and a c.GetDetails SP.
Then in the app you can call c.GetDetails based on "c" being a property of the user definition.
This gets you around any security concerns from using dynamic SQL. It's still dynamic, but it is built once up front.
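A rough sketch of what that up-front provisioning step might look like (all names here are hypothetical):
-- Run once per new tenant/user; builds the schema, table and convention-named proc
CREATE PROCEDURE dbo.usp_ProvisionTenant
    @schemaname sysname
AS
BEGIN
    DECLARE @sql nvarchar(max);

    SET @sql = 'CREATE SCHEMA ' + QUOTENAME(@schemaname);
    EXEC sp_executesql @sql;

    SET @sql = 'CREATE TABLE ' + QUOTENAME(@schemaname) + '.Table1
                (ID bigint PRIMARY KEY, EmailID varchar(255))';
    EXEC sp_executesql @sql;

    SET @sql = 'CREATE PROCEDURE ' + QUOTENAME(@schemaname) + '.GetDetails
                    @userid bigint
                AS
                SELECT EmailID FROM ' + QUOTENAME(@schemaname) + '.Table1
                WHERE ID = @userid';
    EXEC sp_executesql @sql;
END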
A dynamic schema name with the same table structure is quite unusual, but you can still obtain what you want using something like this:
declare @sql nvarchar(4000)
declare @schemaName VARCHAR(20) = 'schema'
declare @tableName VARCHAR(20) = 'Table'
-- this will fail, as the whole string will be 'quoted' within [..]
-- declare @tableName VARCHAR(20) = 'Category; DELETE FROM TABLE x;'
set @sql='select * from ' + QUOTENAME(@schemaName) + '.' + QUOTENAME(@tableName)
PRINT @sql
-- @user_id is not used here, but it can be if the query needs it
exec sp_executesql @sql, N'@user_id bigint', @user_id=0
So QUOTENAME should keep you on the safe side regarding SQL injection.
1. Performance - dynamic SQL cannot benefit from some performance improvements (procedure-associated statistics or something similar, I think), so there is a performance risk.
However, for simple things that run on a rather small amount of data (tens of millions of rows at most) and on data that is not heavily changed (inserts and deletes), I don't think you will have noticeable problems.
2. Alternative - bukko has suggested a solution. Since all the tables have the same structure, they can be merged. If the merged table becomes huge, good indexing and partitioning should be able to reduce query execution times.
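For illustration, a minimal partitioning sketch on the Profile column from bukko's answer (the boundary values are made up):
-- Partition the merged table by Profile so each tenant's rows live in their
-- own partition (boundary values and filegroup placement are placeholders)
CREATE PARTITION FUNCTION pfProfile (int)
    AS RANGE LEFT FOR VALUES (1, 2, 3);

CREATE PARTITION SCHEME psProfile
    AS PARTITION pfProfile ALL TO ([PRIMARY]);

-- New tables can then be created ON psProfile(Profile)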
There is a workaround for this if you know which schemas you are going to be using. You stated that the schema name is created on signup; we use this approach on login. I have a view which I add unions to or remove them from on session startup/dispose. Example below.
CREATE VIEW [engine].[vw_Preferences]
AS
SELECT TOP (0) CAST (NULL AS NVARCHAR (255)) AS SessionID,
CAST (NULL AS UNIQUEIDENTIFIER) AS [PreferenceGUID],
CAST (NULL AS NVARCHAR (MAX)) AS [Value]
UNION ALL SELECT 'ZZZ_7756404F411B46138371B45FB3EA6ADB', * FROM ZZZ_7756404F411B46138371B45FB3EA6ADB.Preferences
UNION ALL SELECT 'ZZZ_CE67D221C4634DC39664975494DB53B2', * FROM ZZZ_CE67D221C4634DC39664975494DB53B2.Preferences
UNION ALL SELECT 'ZZZ_5D6FB09228D941AC9ECD6C7AC47F6779', * FROM ZZZ_5D6FB09228D941AC9ECD6C7AC47F6779.Preferences
UNION ALL SELECT 'ZZZ_5F76B619894243EB919B87A1E4408D0C', * FROM ZZZ_5F76B619894243EB919B87A1E4408D0C.Preferences
UNION ALL SELECT 'ZZZ_A7C5ED1CFBC843E9AD72281702FCC2B4', * FROM ZZZ_A7C5ED1CFBC843E9AD72281702FCC2B4.Preferences
The first SELECT TOP (0) row is a fallback, so I always have a default, static column definition for the view. You can select from the view and filter by a session id with:
SELECT PreferenceGUID, Value
FROM engine.vw_Preferences
WHERE SessionID = 'ZZZ_5D6FB09228D941AC9ECD6C7AC47F6779';
The interesting part here, though, is how the execution plan is generated when you have static values inside a view: the unions that would not produce results are not evaluated, leaving a basic execution plan without any joins or unions.
You can test this, and it is just as efficient as reading directly from the table (to within a margin of error so minor nobody would care). It is even possible to replace the write-back processes by using INSTEAD OF triggers and then building dynamic SQL in the background. The dynamic SQL is less efficient on writes, but it means you can update any table via the view, which is usually only possible with a single-table view.
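A sketch of what such an INSTEAD OF trigger could look like on the view above (simplified to single-row updates; the dynamic SQL routes the write to the schema named by SessionID):
-- Route updates on the view back to the correct tenant schema's table
CREATE TRIGGER [engine].[tr_vw_Preferences_Update]
ON [engine].[vw_Preferences]
INSTEAD OF UPDATE
AS
BEGIN
    DECLARE @schema sysname, @guid uniqueidentifier,
            @value nvarchar(max), @sql nvarchar(max);

    -- Simplified: assumes a single-row update
    SELECT @schema = SessionID, @guid = PreferenceGUID, @value = [Value]
    FROM inserted;

    SET @sql = N'UPDATE ' + QUOTENAME(@schema) + N'.Preferences
                 SET [Value] = @value
                 WHERE PreferenceGUID = @guid;';

    EXEC sp_executesql @sql,
        N'@value nvarchar(max), @guid uniqueidentifier',
        @value = @value, @guid = @guid;
END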
Dynamic SQL usually affects both performance and security, most of the time for the worse. However, since you can't parameterize identifiers, this is probably the only way for you, unless you are willing to duplicate your stored procedures for each schema:
create procedure dbo.usp_GetDataFromTable1
@schemaname varchar(100),
@userid bigint
as
begin
if @schemaname = 'a'
begin
select EmailID from a.Table1 where ID = @userid
end
else if @schemaname = 'b'
begin
select EmailID from b.Table1 where ID = @userid
end
end
The only reason I can think of for doing this is to support multiple tenants. You're close, but the approach you are taking is wrong.
There are 3 solutions for multi-tenancy which I'm aware of: Database per tenant, single database schema per tenant, or single database single schema (aka, tenant by row).
Two of these have already been mentioned by other users here. The one that hasn't really been detailed is schema per tenant, which is what it looks like you fall under. For this approach you need to change the way you see the database: the database at this point is just a container for schemas. Each schema can have its own design, stored procs, triggers, queues, functions, etc. The main goal is data isolation; you don't want tenant A seeing tenant B's stuff. The advantage of the schema-per-tenant approach is that you can be more flexible with tenant-specific database changes. It also lets you scale more easily than a database-per-tenant approach.
Answer: instead of writing dynamic SQL to account for the schema while running as the dbo user, you should create the same stored proc in each schema (create procedure example: schema_name.stored_proc_name). In order to run the stored proc for a schema, you'll need to impersonate a user that is tied to the schema in question. It would look something like this:
execute as user = 'tenantA'
exec sp_testing
revert --revert will take us back to the original user, most likely DBO in your case.
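For reference, each per-schema copy of the procedure might look like the sketch below (sp_testing and Table1 are placeholder names). Unqualified object names inside a procedure resolve to the procedure's own schema first, so each copy reads its own tenant's data:
CREATE PROCEDURE tenantA.sp_testing
AS
BEGIN
    -- Unqualified Table1 resolves to tenantA.Table1, the proc's own schema
    SELECT * FROM Table1;
END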
Aggregating data across all tenants is a little harder. The only solution I'm aware of is to run as the dbo user and UNION ALL the results across all schemas separately, which is kind of tedious if you have a ton of schemas.
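That tedious UNION ALL can at least be generated from sys.schemas; a sketch, run as dbo (the LIKE filter and the Table1/EmailID names are placeholders):
-- Build and run a UNION ALL across every tenant schema
DECLARE @sql nvarchar(max);

SELECT @sql = STUFF((
    SELECT ' UNION ALL SELECT ''' + s.name + ''' AS TenantSchema, EmailID FROM '
           + QUOTENAME(s.name) + '.Table1'
    FROM sys.schemas s
    WHERE s.name LIKE 'tenant%'
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 11, '');

EXEC sp_executesql @sql;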
I am trying to push binary data from SQL Server to an Oracle LONG RAW column. I have a linked server created on SQL Server that connects to the Oracle server. I have a stored procedure on the Oracle side that I am trying to call from SQL Server. I can't seem to get the binary to pass into the stored procedure. I've tried changing the from and to types; however, the data ultimately needs to end up in a LONG RAW column. I have control of the Oracle stored procedure and the SQL Server code, but I do not have control of the predefined Oracle table structure.
varbinary(max) -> long raw
ORA-01460: unimplemented or unreasonable conversion requested
varbinary(max) -> blob
PLS-00306: wrong number or types of arguments in call to 'ADDDOC'
varbinary -> long raw
No errors, but get data truncation or corruption
The varbinary(max) version does work if I set @doc = null.
Below are the Oracle procedure and the SQL Server code.
Oracle:
CREATE OR REPLACE
PROCEDURE ADDDOC (param1 IN LONG RAW)
AS
BEGIN
-- insert param1 into a LONG RAW column
DBMS_OUTPUT.PUT_LINE('TEST');
END ADDDOC;
SQL Server:
declare @doc varbinary(max)
select top 1 @doc = Document from Attachments
execute ('begin ADDDOC(?); end;', @doc) at ORACLE_DEV
-- tried this too, same error
--execute ('begin ADDDOC(utl_raw.cast_to_raw(?)); end;', @doc) at ORACLE_DEV
I've also tried creating the record in the Oracle Documents table then updating the LONG RAW field from SQL Server without invoking a stored procedure, but the query just seems to run and run and run and run...
--already created record and got the Id of the record I want to put the data in
--hard coding for this example
declare @attachmentId int, @documentId int
set @attachmentId = 1
set @documentId = 1
update ORACLE_DEV..MYDB.Documents
set Document = (select Document from Attachments where Id = @attachmentId)
where DocumentId=@documentId
As noted in the comments, LONG RAW is very difficult to work with; unfortunately, our vendor uses the datatype in their product and I have no choice but to work with it. I found that I could not pass binary data from SQL Server to an Oracle stored procedure parameter. I ended up creating a new record with a NULL value for the LONG RAW field, then using an OPENQUERY update to set the field from the VARBINARY(MAX) value. I did try an update with the four-part identifier, as noted in my code sample, but it took over 11 minutes for a single update; this new approach completes in less than 3 seconds. I am still using an Oracle stored procedure here because in my real-world scenario I create multiple records in multiple tables, run business logic that is not relevant here, and tie them together with the docId.
This feels more like a workaround than a solution, but it actually works with acceptable performance.
Oracle:
create or replace procedure ADDDOC(docId OUT Number)
as
begin
select docseq.nextval into docId from dual;
-- insert new row, but leave Document LONG RAW field null for now
insert into DOC (Id) values(docId);
end ADDDOC;
SQL Server:
declare @DocId float, @AttachmentID int, @Qry nvarchar(max)
set @AttachmentID = 123 -- hardcoded for example
execute('begin ADDDOC(?); end;', @DocId output) at ORACLE_DEV
-- write openquery sql that will update Oracle LONG RAW field from a SQL Server varbinary(max) field
set @Qry = '
update openquery(ORACLE_DEV, ''select Document from Documents where Id=' + cast(@DocId as varchar) + ''')
set Document = (select Document from Attachments where Id = ' + cast(@AttachmentID as varchar) + ')
'
execute sp_executesql @Qry
I am trying to create a generic update procedure. The point of this procedure is that we want to be able to track everything that happens in a table. If a record is updated, we need to be able to know who changed that record, what it was originally, what it became after the change, and when the change occurred. We only do this on our most important tables, where accountability is a must.
Right now, we do this through a combination of web server programming and SQL Server commands.
I need to take what we currently have, and make a SQL only version.
So, here are the requirements of what I need:
The original SP is called UpdateWithHistory. Right now it takes 4 parameters, all varchar (or nvarchar, it doesn't matter): the table name, the primary key field, the primary key value, and a comma-delimited list of fields and values in the format field='value',field1='value1'... etc.
In the background, we have a mapping table that we use to map the string table names to actual tables.
In the stored procedure, I have tried various combinations of OPENROWSET, exec(), select into, xml, and other methods. None seem to work.
So basically, I have to be able to dynamically generate a simple select statement (no joins or other complicated select stuff) from the 4 supplied parameters, then store the results of that query in a table. Since it is dynamic, I don't know the number of fields being queried, or what data types they will be.
I tried SELECT INTO, since that will automatically create a table with the appropriate fields and data types, but it doesn't work in conjunction with the EXEC command. I have also tried
exec sp_executesql @SQL, N'@output xml output', @resultXML output
where @resultXML is of the XML datatype and @SQL is the SQL command. @resultXML always ends up null, no matter what I do. I also tried the XML route because I know that FOR XML PATH always returns one column, but I can't use that in an INSERT INTO statement.
That statement output will be used to determine the original values before the update.
I figure once I get past this hurdle the rest will be a piece of cake. Anyone got any ideas?
So here is code for something that I finally got to work, although I don't want to use global tables, so I would gladly accept a different answer...
DECLARE @curRecordString varchar(max) = 'SELECT * into ##TEMP_TABLE FROM SOMEDB.dbo.' + @tbl + ' WHERE ' + @prikey + ' = ''' + @prival + ''' '
exec(@curRecordString)
Basically, as stated before, I need to dynamically build a SQL query, then store the result of running the query so that I can access it later. I would prefer to store it as the XML datatype, since I will later be using XQuery to parse and compare nodes. In the code above, I am using a global temp table (not ideal, I know) to store the result of the query so that the rest of my procedure can access the data.
Like I said, I don't like this approach, but hopefully someone else can come up with something better that will allow me to dynamically build a SQL query, run it, and store the results so that I can access them later in the stored procedure.
This is most definitely a hack, but...
DECLARE @s VARCHAR(MAX)
SET @s = 'SELECT (SELECT 1 as splat FOR XML PATH) a'
CREATE TABLE #save (x XML)
INSERT INTO #save
( x )
EXEC (@s)
SELECT * FROM #save s
DROP TABLE #save
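Since the question mentions XQuery, note that before the DROP TABLE you can parse the captured XML directly. The sample SELECT ... FOR XML PATH produces <row><splat>1</splat></row>, so for example:
-- Pull the captured value back out of the XML column
SELECT x.value('(/row/splat)[1]', 'int') AS splat
FROM #save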
How do I set the database name dynamically in a SQL Server stored procedure?
Sometimes, the use of SYNONYMs is a good strategy:
CREATE SYNONYM [schema.]name FOR [[[linkedserver.]database.]schema.]name
Then, refer to the object by its synonym in your stored procedure.
Altering where the synonym points IS a matter of dynamic SQL, but then your main stored procedures can be totally free of dynamic SQL. Create a table to manage all the objects you need to reference, and a stored procedure that switches all the desired synonyms to the right context.
This functionality is only available in SQL Server 2005 and up.
This method will NOT be suitable for frequent switching or for situations where different connections need to use different databases. I use it for a database that occasionally moves around between servers (it can run in the prod database or on the replication database and they have different names). After restoring the database to its new home, I run my switcheroo SP on it and everything is working in about 8 seconds.
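A minimal sketch of such a switcher (dbo.SynonymConfig is a hypothetical mapping table holding each synonym name and its target schema.object):
CREATE PROCEDURE dbo.usp_SwitchSynonyms
    @targetDb sysname
AS
BEGIN
    DECLARE @sql nvarchar(max);
    SET @sql = N'';

    -- Rebuild every managed synonym so it points at the target database
    SELECT @sql = @sql
        + 'IF OBJECT_ID(''' + SynonymName + ''', ''SN'') IS NOT NULL DROP SYNONYM ' + SynonymName + '; '
        + 'CREATE SYNONYM ' + SynonymName + ' FOR ' + QUOTENAME(@targetDb) + '.' + TargetObject + '; '
    FROM dbo.SynonymConfig;

    EXEC sp_executesql @sql;
END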
Stored Procedures are database specific. If you want to access data from another database dynamically, you are going to have to create dynamic SQL and execute it.
Declare @strSQL VarChar (MAX)
Declare @DatabaseNameParameter VarChar (100) = 'MyOtherDB'
SET @strSQL = 'SELECT * FROM ' + @DatabaseNameParameter + '.Schema.TableName'
You can use IF clauses to set @DatabaseNameParameter to the DB of your liking, then execute the statement (e.g. EXEC (@strSQL)) to get your results.
This is not dynamic SQL and works for stored procs
Declare @ThreePartName varchar (1000)
Declare @DatabaseNameParameter varchar (100)
SET @DatabaseNameParameter = 'MyOtherDB'
SET @ThreePartName = @DatabaseNameParameter + '.Schema.MyOtherSP'
EXEC @ThreePartName @p1, @p2... --Look! No brackets