I have four script components in SSIS (set up as sources) that generate four output tables for further processing.
I need a component that lets me run a SQL command against these input tables as if they were tables in a SQL database.
Which component (or set of components) should I use to query my input and generate output in this scenario?
Thank you.
You can't!
There are a few components that let you "check" your data, such as Row Sampling or Row Count (which counts the rows passing through), but if you need to run a SQL query on a data set you just read from a text file, you will need to persist the data in a temporary table, do whatever you need to do with it, and then send it to the destination.
If you are on SQL Server 2012 and the query you want to run won't update the data, you can use "data taps". A data tap writes the data passing through a data flow path to a file on disk. It is handy, but it is more of a debugging mechanism; you don't want to have it in production and rely on it for an important task. Data taps are also only available when you are executing a package deployed to the SSISDB catalog.
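The "persist, then query" idea above can be illustrated outside SSIS with a minimal Python sketch. It assumes an in-memory SQLite database as a stand-in for the staging database, and the table names, columns, and rows are hypothetical; in a real package the staging tables would live in SQL Server and be filled by the data flow.

```python
import sqlite3

# Hypothetical rows standing in for the output of two script-component sources.
orders = [(1, "alice", 30.0), (2, "bob", 12.5), (3, "alice", 7.5)]
customers = [("alice", "NL"), ("bob", "DE")]

conn = sqlite3.connect(":memory:")  # stand-in for the staging database
conn.execute("CREATE TABLE stg_orders (id INTEGER, customer TEXT, amount REAL)")
conn.execute("CREATE TABLE stg_customers (name TEXT, country TEXT)")

# Persist the in-flight rows into staging tables.
conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", orders)
conn.executemany("INSERT INTO stg_customers VALUES (?, ?)", customers)

# Once persisted, the inputs can be queried like any other tables.
result = conn.execute(
    """
    SELECT c.country, SUM(o.amount) AS total
    FROM stg_orders AS o
    JOIN stg_customers AS c ON c.name = o.customer
    GROUP BY c.country
    ORDER BY c.country
    """
).fetchall()
print(result)
```

The query result would then be what gets sent on to the destination.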
I am moving data from a folder in Azure Data Lake to a SQL Server using Azure Data Factory (ADF).
The folder contains hundreds of .csv files. However, one inconsistent problem with these CSVs is that some (not all) have a final row that contains a special character, which causes the load to fail for any SQL table whose columns are not NVARCHAR(MAX). To get around this, I first use ADF to load the data into staging tables where all columns are set to NVARCHAR(MAX), then I insert the rows that do not contain a special character into tables that have the appropriate data types.
This is a weekly process covering over a terabyte of data, and moving it takes a very long time, so I am looking into ways to import directly into my final tables without a staging step.
I noticed that there is a 'pre-copy script' field that can execute before the load to SQL Server. I want to add code that will let me strip out special characters OR null rows before loading to SQL Server.
I am unsure how to approach this, since the CSVs would not be stored in a table, so SQL code wouldn't work. Any guidance on how I can use the pre-copy script to clean my data before loading it into SQL Server?
The pre-copy script is a script that you run against the database before copying new data in, not to modify the data you are ingesting.
I already answered this on another question, providing a possible solution using an intermediate table: Pre-copy script in data factory or on the fly data processing
Hope this helped!
You could consider a stored procedure: https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-sql-database#invoking-stored-procedure-for-sql-sink
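For reference, the row filtering the question asks about (dropping empty rows and rows containing a special character) can be sketched in Python. This is not something the pre-copy script can do; it only illustrates the filtering logic, wherever you are able to run a script over the files before the copy. The function name `clean_rows`, the sample content, and the rule "printable ASCII only" are assumptions; a real feed with legitimate non-ASCII text would need a narrower rule.

```python
import csv
import io


def clean_rows(lines, expected_columns):
    """Yield CSV rows that are complete, non-empty, and printable ASCII only."""
    for row in csv.reader(lines):
        if len(row) != expected_columns:
            continue  # blank line or malformed trailing row
        if all(cell.strip() == "" for cell in row):
            continue  # "null" row: right shape but no values
        if not all(cell.isascii() and cell.isprintable() for cell in row):
            continue  # contains a control or non-ASCII character
        yield row


# Hypothetical file content: two good rows, one empty row, and a final
# row holding only a control character (Ctrl-Z).
sample = "1,alice,30.5\n2,bob,12\n,,\n\x1a\n"
cleaned = list(clean_rows(io.StringIO(sample), expected_columns=3))
print(cleaned)
```

Because the function is a generator, it processes one row at a time and does not need to hold a whole file in memory.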
What I want to do is build a dynamic data pull from different SQL source servers (Server1, Server2, Server3, etc.) down to dynamic locations on my SQL Server (Dev, Prod) and into databases (database1, database2, etc.).
The tables will be dropped and recreated each time the package is run, so that I am sure I match the source servers if anything changes at the source (field names, data types, lengths, etc.).
I will still get the data to extract. I want to pull this down using a single data flow in a Foreach Loop.
I have a table that holds all the server names, databases, and tables, and I want to loop through that table and pull all the rows of the listed tables down to my server (server1.database1.table_x, server5.database3.table_y, etc.) so that I don't have to build a new data flow for each table.
To do this, I have already built the Foreach Loop with a SQL task that dumps its results into an object variable. The Foreach Loop then takes that object, which has 7 different fields (Source_Server_Name, Source_Server_Type_Driver, Source_Database, Source_Table, Source_Where_Clause, Source_Connection_String, then the destination fields), and puts each of those fields into a separate String variable for use inside the loop.
I can change the connections dynamically using the variables, but I can't figure out how to get the column mapping in the data flow to work.
Is there some kind of script task I can use to edit the backend XML that will create the column mapping for me so the metadata does not error out? Any help would be greatly appreciated :-)
This is the best illustrated example I could find of what I am doing; just remember that I need a different metadata setup for each table I pull down to my server.
http://sql-bi-dev.blogspot.com/2010/07/dynamic-database-connection-using-ssis.html
The solution I ended up using is BIML, which generates the package on the fly using dynamic SQL. Not pretty, but it works :-)
I have heard that it is possible to dynamically generate and publish the packages, but I would never go this route. I have done something similar using C# code, which can be run from an application via SQL Agent or from inside an SSIS package script task.
If you try this approach, look into SqlConnection and SqlCommand, then write code to build the SQL statements dynamically.
For example, run the CREATE TABLE statements with ExecuteNonQuery(), then use a data reader to read the input and pass that reader to SqlBulkCopy to write to the destination.
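The shape of that approach (read the source schema, rebuild the destination table, stream the rows across) can be sketched as follows. This is a Python illustration using in-memory SQLite databases as stand-ins for the source and destination servers, not the C# code described above; `copy_table` and the sample table are hypothetical, and a SQL Server version would read the schema from the catalog views and use SqlBulkCopy for the write.

```python
import sqlite3


def copy_table(src, dst, table):
    """Drop and recreate `table` in dst from src's current schema, then copy all rows."""
    # Read column names and declared types from the source.
    cols = src.execute(f'PRAGMA table_info("{table}")').fetchall()
    if not cols:
        raise ValueError(f"source table {table!r} not found")
    col_defs = ", ".join(f'"{c[1]}" {c[2]}' for c in cols)
    placeholders = ", ".join("?" for _ in cols)

    # Rebuild the destination so it always matches the source definition.
    dst.execute(f'DROP TABLE IF EXISTS "{table}"')
    dst.execute(f'CREATE TABLE "{table}" ({col_defs})')

    # Stream rows from the source cursor straight into the destination.
    reader = src.execute(f'SELECT * FROM "{table}"')
    dst.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    dst.commit()


# Hypothetical source table; in the scenario above the table names would
# come from the control table that drives the loop.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE table_x (id INTEGER, name TEXT)")
src.executemany("INSERT INTO table_x VALUES (?, ?)", [(1, "a"), (2, "b")])

copy_table(src, dst, "table_x")
copied = dst.execute("SELECT * FROM table_x ORDER BY id").fetchall()
print(copied)
```

Because the column list is rebuilt on every run, no fixed column mapping is needed; the trade-off is that constraints, indexes, and defaults are not carried over in this sketch.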
I am trying to copy data from views on a trusted SQL Server 2012 to tables on a local instance of SQL Server on a scheduled transfer. What would be the best practice for this situation?
Here are the options I have come up with so far:
Write an executable program in C# or VB to delete existing local table, query the data from remote database and then write results to tables in the local database. The executable would run on a scheduled task.
Use BCP to copy data to a file and then upload into local table.
Use SSIS
Note: The connection between local and remote SQL Server is very slow.
Since the transfers are scheduled, I suppose you want this data to be kept up to date.
My recommendation would be to use SSIS and schedule it using SQL Agent. If you wrote a C# program, I think the best outcome you would get is a program imitating SSIS. Moreover, SSIS makes it easy to amend the workflow at any time.
Either way, to keep such a program/package up to date, you will have to answer an important question: is the source table updatable, or is it like a log (inserts only)?
This question matters because it determines how you will fetch new changes from the source table. For example, if the table represents a log, you will most probably use the primary key to detect new records; if not, you might look for a column holding the update date/time. If you have the authority to alter the source table, you might add a timestamp column, which represents the row version (timestamp is a different type from datetime).
An SSIS package for this will mainly contain the following components:
1. Execute SQL Task to get the maximum value from the source table.
2. Execute SQL Task to get the last loaded value at the destination, which is where the load should start from. You can get this value either by selecting the maximum value from the destination table or, if the table is pretty large, by storing that value in another table (a configuration table, for example).
3. Data Flow which moves the data from the source table, starting after the value fetched in step 2 and ending at the value fetched in step 1.
4. Execute SQL Task for writing the new maximum value back to the configuration table, if you chose this technique.
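The components above amount to a watermark-based incremental load. A minimal Python sketch of the same logic, assuming in-memory SQLite databases in place of the remote and local servers, an insert-only source, and a positive integer key (the `events` table and `incremental_load` function are hypothetical):

```python
import sqlite3


def incremental_load(src, dst, table="events", key="id"):
    """Copy rows whose key is above the destination's current maximum."""
    # Maximum key currently in the source.
    src_max = src.execute(f"SELECT MAX({key}) FROM {table}").fetchone()[0]
    # Last key already loaded at the destination (0 when it is empty).
    last = dst.execute(f"SELECT COALESCE(MAX({key}), 0) FROM {table}").fetchone()[0]
    if src_max is None or src_max <= last:
        return 0  # nothing new to move

    # Move only the rows in the window (last, src_max].
    rows = src.execute(
        f"SELECT * FROM {table} WHERE {key} > ? AND {key} <= ?", (last, src_max)
    ).fetchall()
    placeholders = ", ".join("?" for _ in rows[0])
    dst.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    dst.commit()
    return len(rows)


src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, msg TEXT)")

src.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
first_run = incremental_load(src, dst)   # initial load
src.executemany("INSERT INTO events VALUES (?, ?)", [(4, "d"), (5, "e")])
second_run = incremental_load(src, dst)  # only the new rows
third_run = incremental_load(src, dst)   # nothing left to copy
print(first_run, second_run, third_run)
```

This sketch reads the watermark from the destination table itself, so the final write-back to a configuration table is not needed here. On a slow link the main benefit is that each run transfers only the new rows.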
BCP can be used to export the data to a file, which can then be compressed, transferred over the network, and imported into the local instance of SQL Server.
Also, with BCP the export can be split into smaller batches of data, which are easier to manage.
https://msdn.microsoft.com/en-us/library/ms191232.aspx
https://technet.microsoft.com/en-us/library/ms190923(v=sql.105).aspx
I have a stored procedure that I am going to run every weekend; it produces a result set that I need to export to an Excel file.
I want to automate this process, so I am going to create a SQL job that runs this stored procedure every weekend and sends the generated Excel file to my reporter.
For this I need the steps to export the result set data to an Excel file.
Also, is it possible to send that Excel file to a specific email address as part of the job itself?
So, you might try your luck on https://dba.stackexchange.com/, but in my experience a SQL Agent job running a stored procedure could be coaxed to return CSV or XML - and those could end up in Excel, but there are missing links. I think the missing links would involve programming and potentially 3rd party tools to avoid using Excel's COM API.
I'd strongly recommend looking into SQL Server Reporting Services. It is included free with your edition of SQL Server and includes the ability to
run reports on a schedule (subscriptions),
format the results as an Excel file
distribute the results via email
You'd take your query and use it as the data source for a "report" and use the report wizard to create a very simple table with the results.
Avoid page headers (or footers) that span columns; this will keep the Excel output cleaner.
References
Stack Overflow: reporting-services-export-to-excel-with-multiple-worksheets
Technet: Reporting Services
I have a complicated query that marshals data into a temporary table, which I then marshal into a further output temporary table before finally selecting from it to display on screen. The result is saved from the grid view to a text file, which gives me the file I need for processing off site.
What I want is for this query to be runnable and to create that file on the local disk without the operator having to change the "Results to" option or fiddle with anything.
What command or functionality might be available to me to do this?
I cannot install any stored procedures or similar on the server involved.
Since you can't do anything on the server, I would suggest writing an SSIS package. Create a data flow, and put your script in the source object. The destination object will then point to the file you want. You have a fair number of options for output.
The SSIS package can then be run by
A SQL job (assuming you are allowed even that)
A non-SQL job running a .bat file with a DTEXEC command
The DTEXECUI GUI.
Also, you can store your SSIS package in the instance or on any file share you choose.