SQL Server Destination vs OLE DB Destination - sql-server

I was using the OLE DB destination for bulk importing multiple flat files. After some tuning, I found the SQL Server destination to be 25-50% faster.
However, I am confused about this destination because there is contradictory information on the web: some sources advise against it, others suggest using it. Are there any serious pitfalls I should be aware of before I deploy it to production? Thanks

In this answer, I will try to provide information from the official SSIS documentation, and I will mention my personal experience with the SQL Server destination.
1. SQL Server Destination
According to the official SQL Server Destination documentation:
The SQL Server destination connects to a local SQL Server database and bulk loads data into SQL Server tables and views. You cannot use the SQL Server destination in packages that access a SQL Server database on a remote server. Instead, the packages should use the OLE DB destination.
The SQL Server destination offers the same high-speed insertion of data into SQL Server that the Bulk Insert task provides; however, by using the SQL Server destination, a package can apply transformations to column data before the data is loaded into SQL Server.
For loading data into SQL Server, you should consider using the SQL Server destination instead of the OLE DB destination
2. OLEDB Destination
According to the official OLEDB Destination documentation:
OLE DB Destination - fast load option: Load data into a table or view in the OLE DB destination and use the fast load option, which is optimized for bulk inserts.
3. OLEDB Destination vs SQL Server Destination
According to SQL Server Destination Vs OLE DB Destination - MSDN topic:
Donald Farmer, the former Group Program Manager for Integration Services said that you can get a 5 to 10% increase in performance using the SQL Server Destination.
In addition, referring to the following post by Matt Masson, a data integration specialist at Microsoft, where he answered the question:
Should I use the SQL Server Destination?
The Answer was
No
...
My recommendation is that if you need every bit of performance (a 10% perf increase on a 10 hour load can be significant), try out the SQL Server Destination to see how it works for you. However – keep in mind the following limitations of the SQL Server Destination:
You must have SSIS running on the same machine as the destination database
You must run the package as an administrator
It is very difficult to debug when things go wrong
Given these limitations, I recommend using the OLE DB Destination even if you are seeing a performance increase with the SQL Server Destination.
3.1. The Data Loading Performance Guide
(Update # 2019-03-25)
While searching for SSIS best practices, I found a very helpful Microsoft article that can be used as a reference:
The Data Loading Performance Guide
In this article, they compare all data load methods, including the SQL Server destination and the OLE DB destination, and they mention that:
SQL Server Destination: The SQL Server destination is the fastest way to bulk load data from an Integration Services data flow to SQL Server. This destination supports all the bulk load options of SQL Server – except ROWS_PER_BATCH.
Be aware that this destination requires shared memory connections to SQL Server. This means that it can only be used when Integration Services is running on the same physical computer as SQL Server.
OLE DB Destination: The OLE DB destination supports all of the bulk load options for SQL Server. However, to support ordered bulk load, some additional configuration is required. For more information, see “Sorted Input Data”. To use the bulk API, you have to configure this destination for “fast load”.
The OLE DB destination can use both TCP/IP and named pipes connections to SQL Server. This means that the OLE DB destination, unlike the SQL Server destination, can be run on a computer other than the bulk load target. Because Integration Services packages that use the OLE DB destination do not need to run on the SQL Server computer itself, you can scale out the ETL flow with workhorse servers.
3.2. Personal experience
(Update # 2019-03-25)
Since this question is used as a reference by many, and since I have gained more experience in this domain, I added this section to describe my personal experience using the SQL Server destination.
While the official documentation says that the SQL Server destination will increase performance, I do not recommend using this component at all, for several reasons:
It requires the destination server and the ETL server to be the same machine (it works only with a local SQL Server).
It often throws exceptions with meaningless error messages.
After testing on a huge volume of data, the performance difference with the OLE DB destination is negligible (tested on about 500 GB of data loaded in chunks; the time difference was less than one minute).
You can also refer to the following post (from #billinkc) to get more information about this topic:
Should SSIS packages and SQL database be on same server?
4. Conclusion
Based on the Microsoft articles, you can say that the SQL Server destination increases the performance of inserting data (it uses bulk insert), but it is designed for a specific case: a local SQL Server. The OLE DB destination is more general and is recommended in other cases; using the fast load data access mode (which also uses bulk insert) on the OLE DB destination will likewise increase the performance of the data load.
On the other hand, based on my experience and on many articles written by SSIS experts, using the SQL Server destination is not recommended at all, since it is not stable, it often throws exceptions, and the performance gain is negligible.
Additional Information
Recently, I published a detailed article about this topic. You can check it at:
SSIS OLE DB Destination Vs SQL Server Destination

To augment Hadi's fine answer, don't use the SQL Server Destination.
In my experience, the performance benefit does not outweigh the restriction that the package must be executed on the same machine as the destination database. It forces a processing architecture that may or may not be right for you today or a year from now. It's just too inflexible for my tastes.
The other, bigger reason I advocate avoiding the SQL Server Destination is the flat-out bugginess I've experienced with it. Same flat file to an empty table: round 1, it aborts with a vague error message (I can't recall the specifics) that something went wrong. Immediately restart the package and it works as expected.
Maybe you, most humble reader, can accept that trade-off of processing time against reprocessing time, but for me it hasn't been worth it since probably 2008.

Related

Optimizing OLE DB Destination for Fast load from Oracle to SQL Server for SSIS

I'm working with an SSIS package that imports from an Oracle table to a SQL Server table. For this, I had to put a Data Conversion transformation in between.
With the current setup, the OLE DB source retrieves the complete table, the rows are converted by the Data Conversion transformation, and they are then sent to the OLE DB destination.
The table I'm trying to import has around 7.3 million records with 53 columns.
I need to know how to set this up (or what changes to make to the current setup) to speed up this process as much as possible.
This package is going to run on a schedule as a SQL Server Agent job.
The last run inserted 78k records in 15 minutes; at this pace it is too slow.
I believe I have to tune the "Rows per batch" and "Maximum insert commit size" settings, but looking around I haven't found information about what values should work, and I've tried different settings here without finding any actual difference between them.
UPDATE: After a bit more testing, the delay comes from getting records from Oracle, not from inserting them into SQL Server. I need to check how I can improve this.
I think the main problem is not loading data into SQL Server; check the OLE DB provider you are using to extract data from Oracle.
There are many suggestions you can go with:
Use the Attunity connectors, which are the fastest ones available.
Make sure you are not using the old Microsoft OLE DB Provider for Oracle (part of MDAC). Use the Oracle Provider for OLE DB (part of ODAC) instead (see the connection-string sketch below).
If that doesn't work, try using an ODBC connection / ODBC source to read data from Oracle.
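To illustrate the provider difference mentioned above, the provider is picked through the connection manager's connection string. These values are only a minimal sketch; the TNS alias and credentials are placeholders:

    Legacy provider to avoid (Microsoft OLE DB Provider for Oracle, MDAC):
        Provider=MSDAORA;Data Source=MyTnsAlias;User ID=scott;Password=...;
    Oracle Provider for OLE DB (ODAC) instead:
        Provider=OraOLEDB.Oracle;Data Source=MyTnsAlias;User ID=scott;Password=...;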

Configuring SQL Server to Oracle initial data load in Goldengate

As I understand it, before setting up transaction replication in Oracle GoldenGate, we have to perform an initial data load. In my case, the source is SQL Server 2012 and the destination is Oracle 12, and both reside on the same system. My questions are:
1. What is the best way to set up the initial load? Should I use a SQL Server utility such as SSIS, or use GoldenGate's "direct bulk load" feature?
2. Though my source DB and destination DB reside on the same machine, do I still have to use two GoldenGate installations (one for the source and one for the destination) for transaction replication?
I used GoldenGate direct load for the MSSQL initial load; the database was huge and it went fine. The downside is that if a failure occurs, you'll need to truncate the target table and start the load from the beginning. As for multiple installations, in one environment I have both the target and source Oracle databases running on the same machine and using the same installation, so I think you'll be fine with just one.
Take a look at this link; it could be beneficial:
http://www.ateam-oracle.com/oracle-goldengate-heterogeneous-database-initial-load-using-oracle-goldengate/

Move data between different servers

I'm working on a project where I need to automatically run an insert statement to insert a result set - the problem is that I need it to go from a SQL Server over to a DB2 server. I can't create a file or script and then import it or run it on the other side. I need to insert or update the DB2 side from the SQL Server side.
Is this possible? I need this to run all by itself as part of a stored procedure in SQL Server.
You're looking for the linked server feature.
Typically linked servers are configured to enable the Database Engine to execute a Transact-SQL statement that includes tables in another instance of SQL Server, or another database product such as Oracle. Many types of OLE DB data sources can be configured as linked servers, including Microsoft Access and Excel. Linked servers offer the following advantages:
The ability to access data from outside of SQL Server.
The ability to issue distributed queries, updates, commands, and transactions on heterogeneous data sources across the enterprise.
The ability to address diverse data sources similarly.
(I believe most of the major RDBMSs have a similar feature)
For the most part, this essentially allows you to treat tables or sources in the other database as if they were part of the SQL Server instance - an INSERT statement should just work "normally".
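As a rough sketch, assuming the Microsoft OLE DB Provider for DB2 is installed and registered (the server, data source, catalog, schema, and table names below are all placeholders), the setup and the insert could look something like this:

    -- Create the linked server pointing at DB2 (provider and data source are placeholders)
    EXEC sp_addlinkedserver
        @server     = N'DB2LINK',
        @srvproduct = N'DB2',
        @provider   = N'DB2OLEDB',
        @datasrc    = N'MyDb2DataSource';

    -- Map SQL Server logins to a DB2 login
    EXEC sp_addlinkedsrvlogin
        @rmtsrvname  = N'DB2LINK',
        @useself     = 'FALSE',
        @rmtuser     = N'db2user',
        @rmtpassword = N'********';

    -- Insert a result set from SQL Server into the DB2 table using four-part naming
    INSERT INTO DB2LINK.MYCATALOG.MYSCHEMA.TARGET_TABLE (COL1, COL2)
    SELECT s.Col1, s.Col2
    FROM dbo.SourceTable AS s;

If four-part-name inserts turn out to be slow against DB2, the same result set can often be pushed through INSERT INTO OPENQUERY(...) instead, so that more of the work happens on the remote side.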
As mentioned, you can use a linked server on the SQL Server side to perform operations between the two servers. I haven't done much with running DML on DB2 from SQL Server, but in my experience SSIS performs far better than linked servers for transactions pulling data from DB2 to SQL Server using an OLE DB connection. You can read more about OLE DB connections in SSIS here, and you'll want to reference the DB2 documentation for the specific DB2 type (Mainframe, LUW, etc.) that's used for details on setting up the connection there. If you set up the SSIS catalog, you can run packages using SQL Server stored procedures, which you can either call directly or execute from an existing user stored procedure.
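On that last point, a package deployed to the SSIS catalog can be started from T-SQL through the SSISDB catalog procedures. A minimal sketch, assuming placeholder folder, project, and package names:

    DECLARE @execution_id BIGINT;

    -- Create an execution for the deployed package (names are placeholders)
    EXEC SSISDB.catalog.create_execution
        @folder_name      = N'MyFolder',
        @project_name     = N'MyProject',
        @package_name     = N'LoadFromDb2.dtsx',
        @use32bitruntime  = 0,
        @execution_id     = @execution_id OUTPUT;

    -- Optional: run synchronously so the calling procedure waits for completion
    EXEC SSISDB.catalog.set_execution_parameter_value
        @execution_id    = @execution_id,
        @object_type     = 50,
        @parameter_name  = N'SYNCHRONIZED',
        @parameter_value = 1;

    EXEC SSISDB.catalog.start_execution @execution_id;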

Fastest way to copy large amounts of data from Oracle to SQL Server

I need to copy large amounts of data from an Oracle database to a SQL Server database. What is the fastest way to do this?
I am looking at data that takes 60-70 GB of storage in Oracle. There are no particular restrictions on the method that I use. I can use SQL Server Management Studio, the SQL Server import/export program, a .NET app, the developer interface in Oracle, third-party tools, or ----. I just need to move the data as quickly as possible.
The data is geographically organized. The data for each state is updated separately in the Oracle database and can be moved over to SQL Server on its own, so the entire volume of data will rarely all be moved over at once.
So what suggestions would people have?
The fastest way to insert large amounts of data into SQL Server is with SQL Server bulk insert. Common bulk insert techniques are:
T-SQL BULK INSERT statement
BCP command-line utility
SSIS package OLE DB destination with the fast load option
ODBC bcp API from unmanaged code
OLE DB IRowsetFastLoad from unmanaged code
SqlBulkCopy from a .NET application
T-SQL BULK INSERT and the command-line BCP utility use a flat file source, so the implication is that you'll first need to export data to files. The other methods can use Oracle SELECT query results directly without the need for an intermediate file, which should perform better overall as long as source/destination network bandwidth and latency aren't a concern.
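For reference, a minimal sketch of the first option in the list above; the file path, table name, and format options are placeholders that would need to match your actual export:

    -- Bulk load a previously exported flat file into a staging table
    BULK INSERT dbo.StagingStateData
    FROM 'D:\exports\state_data.csv'
    WITH (
        FIELDTERMINATOR = ',',   -- column delimiter used by the export
        ROWTERMINATOR   = '\n',  -- row delimiter
        FIRSTROW        = 2,     -- skip the header row
        BATCHSIZE       = 100000,
        TABLOCK                  -- table lock for a minimally logged load
    );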
With SSIS, one would typically create a data flow task for each table to be copied, with an OLE DB source (Oracle) and an OLE DB destination (SQL Server). The Oracle source provider can be downloaded separately depending on the SSIS version; the latest is the Microsoft Connector v4.0 for Oracle. The SSMS Import Wizard can be used to generate an SSIS package for the task, which may be run immediately and/or saved and customized as desired. For example, you could create a package variable for the state to be copied and use that in the source SELECT query and in a target DELETE query prior to refreshing the data. That would allow the same package to be reused for any state.
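As a sketch of that per-state pattern (table, column, and variable names are hypothetical), the package would map a StateCode package variable to the ? parameter of two statements:

    -- Execute SQL Task against SQL Server: clear the state being refreshed
    DELETE FROM dbo.TargetTable WHERE StateCode = ?;

    -- OLE DB Source query against Oracle: pull only that state's rows
    SELECT * FROM APP_SCHEMA.SOURCE_TABLE WHERE STATE_CODE = ?;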
OLE DB IRowsetFastLoad or ODBC bcp calls should perform similarly to SSIS, but you might be able to eke out some additional performance gains with a lot of attention to detail. However, using these APIs is not trivial unless you are already familiar with C++ and the APIs.
SqlBulkCopy is fast (generally millions of rows per minute), which is good enough performance for most applications without the additional complexity of unmanaged code. It is best to use the Oracle managed provider for the source SELECT query rather than the ODBC or OLE DB provider in .NET code.
My recommendation is that you consider not only performance but also your existing skill set.
I actually used the "Microsoft SQL Server Migration Assistant (SSMA)" from MS once for this and it actually did what it promised to do:
SQL Server Migration Assistant for Oracle (documentation)
Microsoft SQL Server Migration Assistant v6.0 for Oracle (download)
SQL Server Migration Assistant (SSMA) Team's Blog
However, in my case it was not as fast as I would have expected for an 80 GB Oracle DB (4 hours or so), and I had to do some manual steps afterwards, but the application was developed in hell anyway (one table had 90+ columns and 100+ indices).

How can I migrate database from SQL Server 2008 to SQL Server 2000

I am replacing an Access application with a web app, but the client is using SQL Server 2000, and I am using SQL Server 2008.
So, I have the database redesigned, with foreign keys, but now I need to get the data on the client's system.
Part of the problem is that they have images that are over 32k, so osql failed as the command buffer filled up.
I should be able to use osql to import the new schema at least, and perhaps all of the data except for the images.
The Export wizard just wouldn't work, even though I tried both the native SQL driver and the OLE DB SQL driver.
Flat files seem like a bad choice, as I don't know whether they can handle the images.
So, what is a good way to copy a 330M database from 2008 -> 2000?
Not sure about performance or time needed, but you could always try a tool like
Red-Gate SQL Compare / SQL Data Compare
Apex SQL Diff / SQL Data Diff
These tools allow you to compare both the schema and the data of two databases, and to create synchronization scripts or synchronize online.
Marc
I set the image column to null, which reduced the size of the insert statements.
This enabled me to import the data into the target database.
