When I use Fast Load to SQL Server from an ETL tool like SSIS or Alteryx, I see the following running on SQL Server:
insert bulk <TableName>(Column1 datatype, Column2 datatype, ......., column datatype)
with (TABLOCK, ROWS_PER_BATCH = 418397672, CHECK_CONSTRAINTS)
How does SSIS/Alteryx make the above insert bulk call? Does SQL Server expose an API for ETL tools to call?
I found the documentation and syntax below, but not much detail.
The docs you cited seem pretty old; you may want to look at the latest version (even though I didn't find anything more useful there).
BULK INSERT is meant to import data from files ONLY and can be run from SSMS.
There is also an API that may or may not be supported/used by the different drivers; I've found it referenced as the Bulk Copy API for batch insert operations.
I didn't find any "global" page about it (like a list of drivers with yes/no support), so I think you will have to look into the documentation for your driver.
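To make this concrete, here is a minimal C# sketch of that bulk copy API via the .NET SqlBulkCopy class (connection string, table and column names are placeholders). SqlBulkCopyOptions.TableLock and CheckConstraints correspond to the TABLOCK and CHECK_CONSTRAINTS hints in your trace, and the load reaches the server as a TDS bulk-load request, which traces display as that insert bulk statement; SSIS and Alteryx fast-load destinations drive the equivalent facility in their OLE DB/ODBC drivers rather than building the statement as text.

    using System.Data;
    using System.Data.SqlClient;

    class BulkCopyApiSketch
    {
        static void Main()
        {
            // Rows to load; in an ETL tool these come from the source adapter.
            var table = new DataTable();
            table.Columns.Add("Column1", typeof(int));
            table.Columns.Add("Column2", typeof(string));
            table.Rows.Add(1, "example");

            var options = SqlBulkCopyOptions.TableLock          // surfaces as TABLOCK
                        | SqlBulkCopyOptions.CheckConstraints;   // surfaces as CHECK_CONSTRAINTS

            using (var connection = new SqlConnection("Server=.;Database=TargetDb;Integrated Security=true"))
            {
                connection.Open();
                using (var bulkCopy = new SqlBulkCopy(connection, options, null))
                {
                    bulkCopy.DestinationTableName = "dbo.TableName";  // placeholder
                    bulkCopy.BatchSize = 10000;                       // roughly what a fast-load "rows per batch" setting controls
                    bulkCopy.WriteToServer(table);                    // arrives at the server as an "insert bulk" request
                }
            }
        }
    }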
Related
I need to copy large amounts of data from an Oracle database to a SQL Server database. What is the fastest way to do this?
I am looking at data that takes 60-70 GB of storage in Oracle. There are no particular restrictions on the method I use. I can use SQL Server Management Studio, the SQL Server import/export program, a .NET app, the developer interface in Oracle, third-party tools, or anything else. I just need to move the data as quickly as possible.
The data is geographically organized. The data for each state is updated separately in the Oracle database and can be moved over to SQL Server on its own, so the entire volume of data will rarely be moved all at once.
So what suggestions would people have?
The fastest way to insert large amounts of data into SQL Server is with SQL Server bulk insert. Common bulk insert techniques are:
T-SQL BULK INSERT statement
BCP command-line utility
SSIS package OLE DB destination with the fast load option
ODBC bcp API from unmanaged code
OLE DB IRowsetFastLoad from unmanaged code
SqlBulkCopy from a .NET application
T-SQL BULK INSERT and the command-line BCP utility use a flat-file source, so the implication is that you'll need to export data to files first. The other methods can use Oracle SELECT query results directly without an intermediate file, which should perform better overall as long as source/destination network bandwidth and latency aren't a concern.
With SSIS, one would typically create a data flow task for each table to be copied, with an OLE DB source (Oracle) and an OLE DB destination (SQL Server). The Oracle source provider can be downloaded separately depending on the SSIS version; the latest is the Microsoft Connector v4.0 for Oracle. The SSMS import wizard can be used to generate an SSIS package for the task, which may be run immediately and/or saved and customized as desired. For example, you could create a package variable for the state to be copied and use it in the source SELECT query and in a target DELETE query prior to refreshing data. That would allow the same package to be reused for any state.
OLE DB IRowsetFastLoad or ODBC bcp calls should perform similarly to SSIS, but you might be able to eke out some additional performance gains with a lot of attention to detail. However, using these APIs is not trivial unless you are already familiar with C++ and the APIs.
SqlBulkCopy is fast (generally millions of rows per minute), which is good enough performance for most applications without the additional complexity of unmanaged code. It is best to use the Oracle managed provider for the source SELECT query rather than an ODBC or OLE DB provider in .NET code.
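A rough sketch of that SqlBulkCopy path, streaming the Oracle SELECT straight into the destination table without an intermediate file (connection strings, table names, and the Oracle.ManagedDataAccess package reference are assumptions to adapt to your environment):

    using System.Data.SqlClient;
    using Oracle.ManagedDataAccess.Client;   // Oracle managed provider (ODP.NET), assumed to be referenced

    class OracleToSqlServerCopy
    {
        static void Main()
        {
            using (var source = new OracleConnection("User Id=app;Password=secret;Data Source=ORCL"))
            using (var target = new SqlConnection("Server=.;Database=TargetDb;Integrated Security=true"))
            {
                source.Open();
                target.Open();

                // Copy one state at a time, mirroring the package-variable idea above.
                using (var cmd = new OracleCommand(
                    "SELECT * FROM app_schema.my_table WHERE state_code = :state", source))
                {
                    cmd.Parameters.Add("state", OracleDbType.Varchar2).Value = "CA";

                    using (var reader = cmd.ExecuteReader())
                    using (var bulkCopy = new SqlBulkCopy(target)
                    {
                        DestinationTableName = "dbo.MyTable",
                        BatchSize = 10000,
                        BulkCopyTimeout = 0     // no timeout for large copies
                    })
                    {
                        bulkCopy.WriteToServer(reader);   // streams rows; nothing is fully materialized in memory
                    }
                }
            }
        }
    }

A preceding DELETE (or partition switch) for the same state would complete the refresh pattern described above.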
My recommendation is to consider not only performance but also your existing skill set.
I once used the Microsoft SQL Server Migration Assistant (SSMA) from MS for this, and it actually did what it promised to do:
SQL Server Migration Assistant for Oracle (documentation)
Microsoft SQL Server Migration Assistant v6.0 for Oracle (download)
SQL Server Migration Assistant (SSMA) Team's Blog
However, in my case it was not as fast as I would have expected for an 80 GB Oracle DB (4 hours or so), and I had to do some manual steps afterwards, but the application was developed in hell anyway (one table had 90+ columns and 100+ indexes).
I need to transfer certain information out of our SQL Server database into an MS Access database. I've already got the Access table structure set up. I'm looking for a pure SQL solution; something I could run straight from SSMS without having to code anything in C# or VB.
I know this is possible if I set up an ODBC data source first. I'm wondering whether it is possible without the ODBC data source?
If you want a 'pure' SQL solution, my proposal would be to connect from your SQL Server to your Access database using OPENDATASOURCE.
You can then write your INSERT instructions using T-SQL. It will look like:
INSERT INTO OPENDATASOURCE('Microsoft.Jet.OLEDB.4.0', 'Data Source=myDatabaseName.mdb')...[myTableName]
(insert instructions here, e.g. a column list followed by a SELECT against your SQL Server tables)
The complexity of your INSERTs will depend on the differences between the SQL Server and Access databases. If tables and fields have the same names, it will be very easy. If the models are different, you might have to build specific queries to 'shape' your data before inserting it into your MS Access tables and fields. But even if it gets complex, it can be handled with 'pure SQL'.
Consider setting up your Access db as a linked server in SQL Server. I found instructions and posted them in an answer to another SO question. I haven't tried them myself, so don't know what challenges you may encounter.
But if you can link the Access db, I think you may then be able to execute an insert statement from within SQL Server to add your selected SQL Server data to the Access table.
Here's a nice solution for your question:
http://www.codeproject.com/Articles/13128/Exporting-Data-from-SQL-to-Access-in-Mdb-File
I was wondering if anyone knew how to reverse engineer an Access database. I would like to be able to generate the SQL code that is used to create the database tables and to insert all the records in the tables. In other words, I would like to create something similar to a MySQL dump file.
Any ideas would be great.
Thanks,
Jason
There's nothing built into Access that will generate the DDL for your tables.
There are many third party tools however (ERWin, ERStudio, Visio, etc) that can generate the DDL for you.
I don't know of anything that will generate the INSERT scripts for you. Access does, however, have plenty of export/import options if you just want to create a copy of your data and then use that as an import source.
It should be pointed out that there's nothing stopping you from writing some VBA code to loop through the TableDefs and create the DDL and INSERT scripts yourself.
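If you would rather not write VBA, the same loop can be driven from .NET through OleDb. This is only a rough sketch (the provider name, file path, and the deliberately crude type handling are assumptions), meant to show the shape of a dump-script generator rather than a finished tool:

    using System;
    using System.Data;
    using System.Data.OleDb;

    class AccessDdlSketch
    {
        static void Main()
        {
            // ACE provider for .accdb/.mdb; use Microsoft.Jet.OLEDB.4.0 on older installs.
            var connStr = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\MyDatabase.accdb";
            using (var conn = new OleDbConnection(connStr))
            {
                conn.Open();

                // User tables only (restriction "TABLE" filters out system tables and views).
                DataTable tables = conn.GetSchema("Tables", new[] { null, null, null, "TABLE" });
                foreach (DataRow table in tables.Rows)
                {
                    var tableName = (string)table["TABLE_NAME"];
                    DataTable columns = conn.GetSchema("Columns", new[] { null, null, tableName, null });

                    Console.WriteLine($"CREATE TABLE [{tableName}] (");
                    foreach (DataRow col in columns.Rows)
                    {
                        // DATA_TYPE is a numeric OleDbType code; mapping it to real SQL types
                        // (and generating the INSERT scripts) is left as an exercise.
                        Console.WriteLine($"    [{col["COLUMN_NAME"]}] /* OleDbType {col["DATA_TYPE"]} */,");
                    }
                    Console.WriteLine(");");
                }
            }
        }
    }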
One possible approach that may work is to upsize your tables to SQL Server and then have SQL Server generate the scripts for you.
Unfortunately, the resulting scripts would probably only be compatible with SQL Server, so you would then have to run them on SQL Server and pull the data in those tables back down to Access.
Access does support DDL, but unfortunately it does not have any tools or facilities built in to generate the scripts.
I'm not a good SQL programmer, I've got only the basics, but I've heard of some BCP thing for fast data loading. I've searched the internet and it seems to be a command-line-only utility, and not something you can use in code.
The thing is, I want to be able to make very fast inserts and updates in a SQL Server 2008 database. I would like to have a function in the database that would accept:
The name of the table I want to execute an insert/update operation against
The names of the columns I'll be feeding data to
The data in a CSV format or something that SQL can read stupid-fast
A flag indicating whether the function should perform an insert or an update operation
This function would then read the CSV string and generate the necessary code for inserting into/updating the table.
I would then write code in C# to call that function passing it the table name, column names, a list of objects serialized as a CSV string and the insert/update flag.
As you can see, this is intended to be both fast and generic, suitable for any project dealing with large amounts of data, and thus a candidate to my company's framework.
Am I thinking right? Is this a good idea? Can I use that BCP thing, and is it suitable for every case?
As you can see, I need some directions on this... thanks in advance for any help!
In C#, look at SqlBulkCopy; it uses the same kind of bulk-load mechanism that SSIS fast load relies on in the background.
For true bcp/BULK INSERT you'd need bulkadmin rights, which may not be allowed.
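As a sketch of that SqlBulkCopy route for the generic helper you describe (the method shape, connection string handling, and the untyped columns are assumptions made to keep it short), passing the table name and column names in at run time:

    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    static class BulkLoader
    {
        // Generic insert: caller supplies the destination table, the column names and the rows.
        public static void BulkInsert(string connectionString, string destinationTable,
                                      string[] columns, IEnumerable<object[]> rows)
        {
            var table = new DataTable();
            foreach (var column in columns)
                table.Columns.Add(column);          // untyped (string) columns keep the sketch short; typed columns are safer

            foreach (var row in rows)
                table.Rows.Add(row);

            using (var bulkCopy = new SqlBulkCopy(connectionString))
            {
                bulkCopy.DestinationTableName = destinationTable;
                foreach (var column in columns)
                    bulkCopy.ColumnMappings.Add(column, column);   // map by name so column order doesn't matter
                bulkCopy.WriteToServer(table);
            }
        }
    }

Note that SqlBulkCopy only inserts; for the update flag in your design, the usual pattern is to bulk load into a staging table and then run an UPDATE or MERGE against the real table.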
Have you considered using SQL Server Integration Services (SSIS)? It's designed to do exactly what you describe. It is very fast. You can insert data on a transactional basis, and you can set it up to run on a schedule. And much more.
Via a web service, remote computers will be sending sets of rows to insert into our central SQL Server.
What is the best way (performance wise) to insert these rows? There could be anywhere from 50-500 rows to insert each time.
I know I can do a bulk insert or format the data as XML and insert it that way, but I've never done this in an enterprise setting before.
Update:
Using WCF web services (or maybe WSE, not sure yet) and SQL Server 2008 Standard.
Unless you're running on a 10 year-old computer, 50-500 rows isn't very many; you could literally send over SQL statements and pipe them directly into the database and get great performance. Assuming you trust the services sending you data, of course :-)
If performance really is an issue sending over a bcp file is absolutely the fastest way to jam data in the database. It sounds from your question that you already know how to do this.
A mere 50-500 records does not constitute a "bulk insert". The bulk insert mechanism is designed for really massive imports of data, which are meant to be followed up immediately with a backup.
In web service I would simply pass the XML into SQL server. The specifics would be version dependent.
What kind of web service is this?
If it's .NET, usually the best way is to load the input into a DataTable, then shoot it up to the SQL Server using the SqlBulkCopy class.
50-500 rows shouldn't be a problem! There is no need for performance tuning! Use normal (prepared) SQL statements in your application.
Don't kill it with complexity and over-engineering.
When you need to insert more than 250,000 rows, you should think about scaling!
Don't turn off the constraints; you might kill the DB.
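A sketch of that plain prepared-statement approach for a 50-500 row payload (the table, columns and row type are placeholders): one transaction, one reused parameterized command.

    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    static class RowWriter
    {
        public static void InsertRows(string connectionString, IEnumerable<(int Id, string Name)> rows)
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (var tx = conn.BeginTransaction())
                using (var cmd = new SqlCommand(
                    "INSERT INTO dbo.MyTable (Id, Name) VALUES (@Id, @Name)", conn, tx))
                {
                    cmd.Parameters.Add("@Id", SqlDbType.Int);
                    cmd.Parameters.Add("@Name", SqlDbType.NVarChar, 100);
                    cmd.Prepare();                    // optional; SQL Server caches the plan either way

                    foreach (var (id, name) in rows)
                    {
                        cmd.Parameters["@Id"].Value = id;
                        cmd.Parameters["@Name"].Value = name;
                        cmd.ExecuteNonQuery();
                    }
                    tx.Commit();                      // one commit for the whole batch instead of one per row
                }
            }
        }
    }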
To echo all the other answers, 500 rows is no issue for SQL Server. If you do need to insert a large number of records, the fastest way is with the built-in BULK INSERT statement, which (I believe) uses the same bulk-load facility as SQL Server's bcp.exe command-line utility.
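For completeness, a minimal sketch of issuing BULK INSERT from C# (the file path, table name and format options are placeholders, and the file must be visible to the SQL Server machine, not the client):

    using System.Data.SqlClient;

    class BulkInsertExample
    {
        static void Main()
        {
            const string sql = @"
                BULK INSERT dbo.MyTable
                FROM 'C:\data\rows.csv'            -- path as seen by the SQL Server machine
                WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2, TABLOCK);";

            using (var conn = new SqlConnection("Server=.;Database=TargetDb;Integrated Security=true"))
            using (var cmd = new SqlCommand(sql, conn))
            {
                conn.Open();
                cmd.CommandTimeout = 0;   // large loads can exceed the 30-second default
                cmd.ExecuteNonQuery();    // requires bulkadmin / ADMINISTER BULK OPERATIONS permission
            }
        }
    }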