I have a pretty heavy MDX query with many dimensions and measures. I need to export the results of that MDX query to a SQL Server table using an SSIS data flow (OLE DB Source --> SQL Command). I created an OLE DB Source, added "Format=Tabular" to its connection properties, and connected it to the OLE DB Destination. For a simple MDX query it runs fine.
But even for a simple query, every time I copy and paste it into the SQL Command window of the OLE DB Source and press "OK" or "Columns", SSIS appears to execute the whole query and only then return the metadata.
Is it possible to get just the metadata without fully executing the query? I need to pass the metadata to the destination. I will appreciate any help on extracting data from a cube into a SQL Server table. Thanks.
That's been our experience with SSIS. The way we've got round it is to use a dummy query held in a variable that returns the metadata to SSIS at design time. Then at runtime the variable is overwritten by another task (such as a Script Task) with the full query.
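As a sketch of that trick (cube, dimension, and measure names here are hypothetical; match them to your real query), the design-time value of the variable can be the real query restricted to a single member, so it returns the same columns almost instantly:

```sql
-- Design-time value of the variable: the full query cut down to a tiny slice,
-- so SSIS gets the same column metadata without scanning the cube.
SELECT { [Measures].[Sales Amount] } ON COLUMNS,
       { [Product].[Category].&[1] } ON ROWS
FROM   [MyCube]
-- At run time a Script Task replaces this with the full query,
-- e.g. [Product].[Category].MEMBERS on ROWS plus the other dimensions.
```

The only requirement is that the dummy and the full query produce exactly the same columns (names and types), otherwise the data flow will fail validation at runtime.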
We have around 5000 tables in Oracle, and the same 5000 tables exist in SQL Server. Each table's columns vary frequently, but at any point in time the source and destination columns are always the same. Creating 5000 Data Flow Tasks is a big pain, and the mapping has to be redone every time a table definition changes, such as when a column is added or removed.
We tried SSMA (SQL Server Migration Assistant for Oracle), but it is very slow for transferring huge amounts of data, so we moved to SSIS.
I have followed the below approach in SSIS:
- Created a staging table that holds the table name, the source query (Oracle), and the target query (SQL Server); used that table in an Execute SQL Task and stored the result as a Full result set.
- Created a Foreach Loop Container over that Execute SQL Task's result set, with the object variable and 3 variables: table name, source query, and destination query.
- In the Data Flow Task source, chose an OLE DB Source on the Oracle connection with data access mode "SQL command from variable" (passing the source query from the loop's mapped variable).
- In the Data Flow Task destination, chose an OLE DB Destination on the SQL Server connection, with its target likewise driven by a variable (passing the target query from the loop's mapped variable).
- Looped over all 5000 tables.
It is not working. Can you please guide me on how to create this dynamically for 5000 tables from Oracle to SQL Server using SSIS? Any sample code/help would be greatly appreciated. Thanks in advance.
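A sketch of the kind of staging (mapping) table described above; the table, column, and query values are purely illustrative:

```sql
CREATE TABLE dbo.TableMapping
(
    TableName   sysname       NOT NULL,  -- destination table name
    SourceQuery nvarchar(max) NOT NULL,  -- SELECT against Oracle
    TargetQuery nvarchar(max) NOT NULL   -- SELECT against SQL Server
);

INSERT INTO dbo.TableMapping (TableName, SourceQuery, TargetQuery)
VALUES (N'EMPLOYEES',
        N'SELECT EMPLOYEE_ID, FIRST_NAME FROM HR.EMPLOYEES',
        N'SELECT EMPLOYEE_ID, FIRST_NAME FROM dbo.EMPLOYEES');
```

The Execute SQL Task then does a `SELECT TableName, SourceQuery, TargetQuery FROM dbo.TableMapping` with ResultSet = Full result set, and the Foreach Loop maps the three columns onto the three variables.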
Using SSIS, when thinking about a dynamic source or destination, you have to take into consideration that this only works when the metadata is well defined at run time. In your case:
Each table's columns vary frequently, but at any point in time the source and destination columns are always the same.
You have to think about building packages programmatically rather than looping over tables.
Yes, you can use loops if you can classify the tables into groups based on their metadata (column names, data types, ...). Then you can create one package per group.
If you are familiar with C#, you can import the tables dynamically without needing SSIS at all. You can refer to the following project to learn more about reading from Oracle and importing into SQL Server using C#:
Github - SchemaMapper
Below are some links you can refer to for more information about creating packages programmatically and mapping columns dynamically:
How to manage SSIS script component output columns and its properties programmatically
How to Map Input and Output Columns dynamically in SSIS?
Implementing Foreach Looping Logic in SSIS
I need to run some analysis on my queries (specifically, finding all the tables each SSIS package calls).
Right now I'm opening every single SSIS package and every single step in it, and copying and pasting the table names out manually.
As you can imagine, it's very time-consuming and mind-numbing.
Is there a way to export all the queries automatically?
By the way, I'm using SQL Server 2012.
Retrieving the queries is not a simple process; you can work in two ways to achieve it:
Analyzing the .dtsx package XML content using regular expressions
SSIS packages (.dtsx) are XML files; you can read them as text files and use regular expressions to retrieve the queries (for example, search for all statements that start with the SELECT, UPDATE, DELETE, DROP, ... keywords).
There are some questions asking how to retrieve information from .dtsx files that you can refer to for ideas:
Reverse engineering SSIS package using C#
Automate Version number Retrieval from .Dtsx files
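A rough sketch of the regular-expression approach in Python (the pattern and folder path are illustrative; real packages often need a more careful pattern, e.g. to handle XML-escaped quotes and expressions):

```python
import re
from pathlib import Path

# Match statements starting with common T-SQL keywords inside the raw package XML.
# Stops at the next '<' or '"', which usually ends the attribute/element value.
SQL_PATTERN = re.compile(
    r"\b(SELECT|INSERT|UPDATE|DELETE|DROP|EXEC)\b[^<\"]*",
    re.IGNORECASE,
)

def extract_sql(dtsx_text: str) -> list[str]:
    """Return candidate SQL statements found in a .dtsx file's raw XML text."""
    return [m.group(0).strip() for m in SQL_PATTERN.finditer(dtsx_text)]

def extract_from_folder(folder: str) -> dict[str, list[str]]:
    """Scan every .dtsx file in a folder and map file name -> statements found."""
    return {p.name: extract_sql(p.read_text(encoding="utf-8", errors="ignore"))
            for p in Path(folder).glob("*.dtsx")}
```

Table names can then be pulled out of each statement with a second pass, for example by looking at the token following `FROM`, `JOIN`, `INSERT INTO`, or `UPDATE`.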
Using SQL Profiler
You can create and run a SQL Profiler trace on the SQL Server instance and filter on all T-SQL commands executed while the SSIS package runs. Some examples can be found in the following posts:
How to capture queries, tables and fields using the SQL Server Profiler
How to monitor just t-sql commands in SQL Profiler?
SSIS OLE DB Source Editor Data Access Mode: “SQL command” vs “Table or view”
Is there a way in SQL profiler to filter by INSERT statements?
Filter Events in a Trace (SQL Server Profiler)
You can also use Extended Events (which has more options than Profiler) to monitor the server and collect the SQL commands:
Getting Started with Extended Events in SQL Server 2012
Capturing queries run by user on SQL Server using extended events
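A minimal Extended Events session along those lines might look like this (the session name, file path, and application-name filter are illustrative; the filter depends on how your packages identify themselves when connecting):

```sql
-- Capture completed T-SQL batches, keeping the client app name and statement text.
CREATE EVENT SESSION [CaptureSsisSql] ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    (ACTION (sqlserver.client_app_name, sqlserver.sql_text)
     WHERE sqlserver.client_app_name LIKE N'%SSIS%')
ADD TARGET package0.event_file (SET filename = N'C:\Traces\CaptureSsisSql.xel');

ALTER EVENT SESSION [CaptureSsisSql] ON SERVER STATE = START;
```

If the packages issue parameterized calls, you would likely also add the `sqlserver.rpc_completed` event to catch those.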
You could create a schema for this specific project and then store all the SQL in views on that schema. That will help keep things tidy and help with issues like this.
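For example (schema, view, and table names are illustrative):

```sql
CREATE SCHEMA etl;
GO
CREATE VIEW etl.SourceOrders AS
SELECT OrderID, CustomerID, OrderDate
FROM   dbo.Orders;
GO
```

The packages then select only from views in the `etl` schema, and finding every table a package touches reduces to querying `sys.sql_expression_dependencies` for those views instead of opening each package.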
I'm using SSIS to connect to a SQL Server database and pull a table into another SQL Server database, using Visual Studio 2013 to manage the SSIS packages.
One of the tables I need to pull is huge, so I'd like to pull only data within a date range, i.e. data from Jan 1, 2016 and newer.
How do I do this via SSIS? I feel like there should be somewhere to add a WHERE clause or the equivalent.
Would it be easier to link the two databases? For security reasons I'm not sure that is an option.
Any insight would be great.
Thanks!
Try this:
- Use a Data Flow Task.
- Configure the source as an OLE DB Source and set the data access mode to "SQL command" or "SQL command from variable" (you can parameterize both).
- Add an OLE DB Destination, configure its data access mode as "Table or view - fast load", and set Rows per batch and Maximum insert commit size to 100000 (you'll need to test what fits your needs).
You can do this without a script. Create a data flow and set up your source via a SQL command, or create a stored procedure to handle your filtering.
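For instance (table and column names are hypothetical), the SQL command in the OLE DB Source could simply be:

```sql
SELECT *
FROM   dbo.BigTable
WHERE  CreatedDate >= '2016-01-01'   -- only rows from Jan 1 2016 onward
```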
Then create a destination and map the source columns to your destination.
Another option is to set the data access mode to "SQL command from variable". With this you can build the SQL query via variables and make the date filter dynamic with variable expressions.
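As a sketch, the SSIS expression on such a variable might look like this (table, column, and variable names are hypothetical):

```
"SELECT * FROM dbo.BigTable WHERE CreatedDate >= '"
    + (DT_WSTR, 30) @[User::StartDate] + "'"
```

Set EvaluateAsExpression = True on the variable so the query string is rebuilt whenever `User::StartDate` changes.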
I typically use stored procedures and pass in parameters for filtering, but if you are just starting with SSIS I would try the other two options first.
I need a bit of advice on how to solve the following task:
I have a source system based on IBM DB2 (IBMDA400) which has a lot of tables whose structure changes rapidly and daily. I must load specified tables from DB2 into a SQL Server 2008 R2 instance, so I thought SSIS would be the best choice.
My first attempt was just to add both data sources, drop all tables in SQL Server, and recreate them with a "Select * Into #Table From #Table". But I could not get this working because I could not connect the two OLE DB connections. I also tried this with an OPENROWSET statement, but the SQL Server does not allow that for security reasons and I am not allowed to change the setting.
My second try was to read the tables from the source manually, drop and recreate the tables with a Foreach Loop, and then load the data via a Data Flow Task. But I got stuck getting the metadata from the Execute SQL Task, so I don't get the column names and types.
I cannot believe this is so hard to achieve. Why is there no "create table if not exists" checkbox on the Data Flow Task?
Of course I searched for the problem here before, but could not find a solution.
Thanks in advance,
Pad
This is the solution I ended up with:
- Create a file/table which is used to select the source tables.
- Important: create a linked server on your SQL Server instance, or a working connection string for OPENROWSET (I was not able to do the latter, so I chose the linked server).
- Query the source file/table.
- Loop through the result set.
- Use variables and a Script Task to build your query.
- Drop the destination table.
- Build another query string with INSERT INTO TABLE FROM OPENROWSET (or, if you used a linked server, OPENQUERY).
- Execute this statement.
Done.
As I said above, I am not quite happy with this, but for now it should be OK. I will update this if I find another solution.
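The drop/rebuild steps above can be sketched as follows (the linked-server, library, and table names are illustrative; `SELECT ... INTO` recreates the destination with the source's current columns, which covers the drop-and-recreate requirement):

```sql
-- Drop the destination if it exists, then pull everything through the linked server.
IF OBJECT_ID(N'dbo.MYTABLE') IS NOT NULL
    DROP TABLE dbo.MYTABLE;

SELECT *
INTO   dbo.MYTABLE
FROM   OPENQUERY(DB2LINK, 'SELECT * FROM MYLIB.MYTABLE');
```

In the loop, the Script Task would substitute each table name from the selection table into this template before the Execute SQL Task runs it.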
Hi, we have an SSIS package which loads data from an SSAS cube, so we are using an MDX query inside the Data Flow Task to fetch the data. But when fetching data from the cube, the package takes a huge amount of time: it first validates and then executes, for a total of about 1 hour for some 1000 rows. Is there any option to optimize this? The MDX is only a SELECT for one measure and some 4 dimensions, and the same MDX returns results in seconds when run in a query window against the cube. So where is the problem?
Try creating a linked server on the database server that points to the cube server, and run an OPENQUERY command to execute the MDX. This data source can then be mapped to the destination. I have heard that it's usually the better way of obtaining results from a multidimensional source.
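A sketch of that pattern (the linked server `SSAS_LINKED` and the cube, dimension, and measure names are all hypothetical):

```sql
-- Run the MDX through the linked server and land the tabular result in SQL Server.
SELECT *
INTO   dbo.CubeExtract
FROM   OPENQUERY(SSAS_LINKED,
       'SELECT { [Measures].[Sales Amount] } ON COLUMNS,
               NON EMPTY [Product].[Category].MEMBERS ON ROWS
        FROM   [MyCube]');
```

Note that the MDX is passed as a single-quoted string, so any quotes inside it must be doubled.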
References - 1 and 2. I have also heard that this is a great tool. If you are open to using open-source code, try this out as well.