I am building an SSIS package in which I need to transfer some tables from an OData source into SQL Server.
So far I have implemented an "INSERT INTO" query on the SQL Server side for the tables I read from the OData source. Because there are 10+ tables, is there a way I can do a "SELECT INTO" query for a faster transfer of those tables in SSIS?
SSIS has no built-in operation to create a table on a destination based on a data set, which is what SELECT ... INTO does.
There is no easy tweak to do this either; SSIS is designed for static-metadata ETL, that is, performing operations between sources and destinations whose structures and data types are known and consistent. You might achieve what you need with custom scripts, but that would happen completely outside of SSIS.
If you already know the structure of the data you will be inserting, create the destination tables first (with CREATE TABLE) and then use SSIS to map the corresponding columns. If your destination tables are dynamic, you will have a hard time using regular SSIS operations to match each table's metadata, since that is set at design time.
If the problem isn't the tables' column data types but the speed of the operation (SELECT ... INTO benefits from minimal logging), then the fastest option when working with SQL Server is to use the bulk insert (fast load) option on the destination component. It will be faster than regular inserts, but usually still slower than performing a SELECT ... INTO directly in SQL.
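To make the two options concrete, here is a minimal T-SQL sketch; the table and column names are hypothetical and the data types would need to match the OData source metadata.

-- Pre-create the destination so SSIS can map the columns at design time
CREATE TABLE dbo.Customers_Stage
(
    CustomerId   INT           NOT NULL,
    CustomerName NVARCHAR(200) NULL,
    ModifiedOn   DATETIME2(0)  NULL
);

-- What SELECT ... INTO does outside of SSIS: creates and loads the table
-- in one minimally logged statement; SSIS has no built-in equivalent.
SELECT CustomerId, CustomerName, ModifiedOn
INTO   dbo.Customers_Copy
FROM   dbo.Customers_Stage;

In the data flow, the OLE DB destination pointing at dbo.Customers_Stage would then use the "Table or view - fast load" access mode to get the bulk insert behaviour mentioned above.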
Related
I am relatively new to SSIS and have to come up with an SSIS package for work such that certain tables are dynamically moved from one SQL Server database to another SQL Server database. I have the following constraints that need to be met:
Source and destination table names may differ, so a direct table copy does not work with the Transfer SQL Server Objects task.
Only certain columns may be transferred from the source table to the destination table.
This package needs to run every 5 minutes, so it has to be relatively fast.
The transfer must be dynamic, so that if there are new source tables the package does not need to be reconfigured with hard-coded values.
I have the following ideas for now:
Use the Transfer SQL Server Objects task, but I'm not sure whether the above requirements can be met, especially the selective transfer of tables and dynamic column mapping.
Use SqlBulkCopy in a script component to perform the migration.
I would appreciate it if anyone could give some direction on how I can go about meeting the requirements, and whether my existing ideas are feasible.
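As a rough illustration of the last requirement (no hard-coded values), one possible approach is to drive the package from a mapping table; everything below is a hypothetical sketch (object names invented, STRING_AGG requires SQL Server 2017+), not a finished solution.

-- Hypothetical mapping metadata: one row per column to transfer.
CREATE TABLE dbo.TransferColumnMap
(
    SourceTable  SYSNAME NOT NULL,
    DestTable    SYSNAME NOT NULL,
    SourceColumn SYSNAME NOT NULL,
    DestColumn   SYSNAME NOT NULL,
    ColumnOrder  INT     NOT NULL
);

-- Build one INSERT ... SELECT per table pair from the mapping; a Foreach Loop
-- or script object could then execute each generated statement.
SELECT  SourceTable,
        DestTable,
        N'INSERT INTO ' + QUOTENAME(DestTable) + N' ('
          + STRING_AGG(QUOTENAME(DestColumn), N', ') WITHIN GROUP (ORDER BY ColumnOrder)
          + N') SELECT '
          + STRING_AGG(QUOTENAME(SourceColumn), N', ') WITHIN GROUP (ORDER BY ColumnOrder)
          + N' FROM ' + QUOTENAME(SourceTable) AS TransferSql
FROM    dbo.TransferColumnMap
GROUP BY SourceTable, DestTable;

Adding a new table would then only mean adding rows to the mapping table rather than editing the package.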
We have around 5,000 tables in Oracle and the same 5,000 tables exist in SQL Server. Each table's columns change frequently, but at any point in time the source and destination columns are always the same. Creating 5,000 Data Flow Tasks is a big pain, and the mappings have to be redone every time a table definition changes, such as when a column is added or removed.
We tried SSMA (SQL Server Migration Assistant for Oracle), but it was very slow for transferring huge amounts of data, so we moved to SSIS.
I have followed the below approach in SSIS:
I created a staging table that holds the table name, the source query (Oracle) and the target query (SQL Server), used that table in an Execute SQL Task, and stored the result set as a full result set.
I created a Foreach Loop Container over that Execute SQL Task's result set, with an Object variable and three string variables: table name, source query and destination query.
For the Data Flow Task source I chose an OLE DB Source on the Oracle connection, with the data access mode set to SQL command from variable (passing the source query from the loop's mapped variable).
For the Data Flow Task destination I chose an OLE DB Destination on the SQL Server connection, again taking its target from a variable (passing the target query from the loop's mapped variable).
I loop this over all 5,000 tables, but it is not working. Can you please guide me on how to set this up dynamically for 5,000 tables from Oracle to SQL Server using SSIS? Any sample code/help would be greatly appreciated. Thanks in advance.
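For reference, a minimal sketch of the kind of control table described above; all names are hypothetical.

-- One row per table, holding the Oracle source query and the SQL Server target query.
CREATE TABLE dbo.EtlTableList
(
    TableName   SYSNAME       NOT NULL,
    SourceQuery NVARCHAR(MAX) NOT NULL,  -- run against Oracle by the OLE DB Source
    TargetQuery NVARCHAR(MAX) NOT NULL   -- used on the SQL Server side
);

INSERT INTO dbo.EtlTableList (TableName, SourceQuery, TargetQuery)
VALUES (N'EMPLOYEES',
        N'SELECT EMP_ID, EMP_NAME FROM HR.EMPLOYEES',
        N'SELECT EMP_ID, EMP_NAME FROM dbo.Employees');

-- The Execute SQL Task runs this and stores it as a Full result set,
-- which the Foreach Loop Container then shreds into the three variables.
SELECT TableName, SourceQuery, TargetQuery FROM dbo.EtlTableList;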
In SSIS, when thinking about a dynamic source or destination, you have to take into consideration that the only case where you can do that is when the metadata is well defined at run time. In your case:
Each table's columns change frequently, but at any point in time the source and destination columns are always the same.
You have to think about building packages programmatically rather than looping over tables.
That said, you can use loops if you can classify the tables into groups based on their metadata (column names, data types, ...); then you can create one package per group.
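As a sketch of that grouping idea on the SQL Server side (STRING_AGG requires SQL Server 2017+; names are illustrative), tables that share the same column signature could share one package or data flow:

SELECT  s.name + N'.' + t.name AS TableName,
        STRING_AGG(CAST(c.name + N':' + ty.name AS NVARCHAR(MAX)), N',')
            WITHIN GROUP (ORDER BY c.column_id) AS ColumnSignature
FROM    sys.tables  t
JOIN    sys.schemas s  ON s.schema_id = t.schema_id
JOIN    sys.columns c  ON c.object_id = t.object_id
JOIN    sys.types   ty ON ty.user_type_id = c.user_type_id
GROUP BY s.name, t.name;

Tables returning identical ColumnSignature values belong to the same metadata group.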
If you are familiar with C#, you can dynamically import tables without the need for SSIS. You can refer to the following project to learn more about reading from Oracle and importing into SQL Server using C#:
Github - SchemaMapper
Here are some links you can refer to for more information about creating packages programmatically and dynamic column mapping:
How to manage SSIS script component output columns and its properties programmatically
How to Map Input and Output Columns dynamically in SSIS?
Implementing Foreach Looping Logic in SSIS
I have more than 500 tables in SQL Server that I want to move to Dynamics 365, and I am using SSIS so far. The problem with SSIS is that the destination Dynamics CRM entity has to be specified along with its mappings, so it would be foolish to create a separate data flow per entity for hundreds of SQL Server source tables. Is there a better way to accomplish this?
I am new to SSIS and I don't feel this is the correct approach; I am just simulating SQL Server's Import/Export Wizard. Please let me know if there are better ways.
It's amazing how often this gets asked!
SSIS cannot have dynamic data flows because the pipeline's buffer metadata is fixed at design time (as opposed to execution time).
The only way you can re-use a data flow is if all the source-to-target mappings are the same, e.g. if you have two tables with exactly the same DDL structure.
One option (horrible, IMO) is to concatenate all the columns into one massive pipe-separated VARCHAR, write that into a custom two-column staging table at the destination, e.g. (table_name, column_dump), and then "unpack" it in your target system via a post-load SQL statement.
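A hedged sketch of that pattern (hypothetical names; the ordinal argument of STRING_SPLIT needs SQL Server 2022 or Azure SQL):

-- Two-column dump table (plus a row key so each row can be unpacked).
CREATE TABLE dbo.StagingDump
(
    row_id      BIGINT IDENTITY(1,1) PRIMARY KEY,
    table_name  SYSNAME       NOT NULL,
    column_dump NVARCHAR(MAX) NOT NULL   -- e.g. N'42|Jane Doe|2024-01-31'
);

-- Post-load "unpack" for one target table, assuming exactly three columns
-- and no pipe characters inside the data.
INSERT INTO dbo.Customers (CustomerId, CustomerName, ModifiedOn)
SELECT  MAX(CASE s.ordinal WHEN 1 THEN TRY_CAST(s.value AS INT) END),
        MAX(CASE s.ordinal WHEN 2 THEN s.value END),
        MAX(CASE s.ordinal WHEN 3 THEN TRY_CAST(s.value AS DATE) END)
FROM    dbo.StagingDump AS d
CROSS APPLY STRING_SPLIT(d.column_dump, '|', 1) AS s
WHERE   d.table_name = N'Customers'
GROUP BY d.row_id;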
I'd bite the bullet: put on your headphones and start churning out the SSIS data flows one by one. You'd be surprised how quickly you can bang them out!
ETL works that way: you have to map the source, the destination and the columns. If you want that to be dynamic, it is possible with an Execute SQL Task inside a Foreach Loop Container. Read more
But when using the KingswaySoft CRM destination connector this is a little tricky (it may or may not be possible), as it needs a very specific column mapping between source and destination.
That is even more true when the source schema comes from OLE DB; it is better to have a separate Data Flow Task for each table.
We are developing an Azure SQL Data Warehouse with SSIS. We have a two-phase process: 1) copy the data source to a staging table, 2) copy the staging table to a report table.
My question is: will SSIS actually pull the data through its own server, even when it knows the source and target are on the same OLE DB provider, or is it smart enough to issue something like "SELECT ... INTO ... FROM ..." on the server itself? This makes a difference to us, because Azure charges for data egress, we have a lot of similar copy steps in the DW, and the SSIS machine is the only one on-premises.
We could define a series of SQL statement tasks with nested queries, but managing TransactionOption across that many tasks is hard.
Thanks.
I do not think it will do this; SSIS was designed with so many hooks into the pipeline that trying to optimize by skipping it would be counter-intuitive.
You could, however, use T-SQL to do the SELECT ... INTO and keep the processing on the same server as the database engine.
If you need to switch between the two methods, you can add a parameter to your package and control which path runs via precedence constraints.
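For the in-database path, a minimal sketch of what an Execute SQL Task could run so the rows never leave Azure; the table and column names are hypothetical.

-- Creates and loads the report table in one statement on the server.
SELECT *
INTO   dbo.ReportSales
FROM   dbo.StagingSales;

-- If the report table already exists:
-- INSERT INTO dbo.ReportSales SELECT * FROM dbo.StagingSales;

-- On a dedicated SQL pool (the former Azure SQL Data Warehouse), CTAS is the
-- usual equivalent and also lets you pick the distribution:
-- CREATE TABLE dbo.ReportSales
-- WITH (DISTRIBUTION = HASH(SaleId))
-- AS SELECT * FROM dbo.StagingSales;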
I want to do a one-time load from a source Oracle DB to a destination Oracle DB.
It can't be done as a direct load/unload or import/export of the data, because the table structures and columns differ between source and destination, so it requires a fair amount of transformation.
My plan is to extract the data from the source DB in XML format and process the XML into the destination DB.
The data volume is also large (1 to 20+ million records or more in some tables), and the databases involved are Oracle (source) and Oracle (destination).
Please suggest some best practices or the best way to do this.
I'm not sure that I understand why you can't do a direct load.
If you create a database link on the destination database that points to the source database, you can put your ETL logic into SQL statements that SELECT from the source database and INSERT into the destination database. That avoids the need to write the data to a flat file, read that file back, parse the XML, etc., which would be slow and require a decent amount of coding. That way you can focus on the ETL logic and migrate the data as efficiently as possible.
You can write SQL (or PL/SQL) that loads directly from the old table structure on the old database to the new table structure on the new database.
INSERT INTO new_table( <<list of columns>> )
SELECT a.col1, a.col2, ... , b.colN, b.colN+1
FROM old_table_1@link_to_source a,
     old_table_2@link_to_source b
WHERE <<some join condition>>