Trying to convert an old xp_cmdshell process that creates output files from within a cursor. I broke it down into a function call to pull a list of dates and a stored procedure that processes an individual day's worth of data.
When I try to get this working in SSIS, I keep going around in circles. I have an OLE DB Source dataflow task that references the function stored in a variable. I can pass that output to an OLE DB command that runs a sqlcommand "exec ?=export_data ?".
It's at this point where I get stuck. I need to export each date as a separate file, but can't get the flat file destination to work. I tried editing the function call with a second (null) column "result" to contain the output from each command call, but all I get is the '0' returnvalue instead of the in-memory table. I was able to get the recordset destination to work, but couldn't progress from there.
I know there's some extra work to be done with the flat file destination to get the distinct files, but I'd be happy just to get one file with some relevant data at this point.
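For what it's worth, the "one file per date" part can be sketched outside SSIS to make the loop explicit. This is only an illustration under assumptions: fetch_rows is a hypothetical callable standing in for whatever returns the rows of "exec export_data" for one date, and the export_YYYYMMDD.csv naming is made up. In SSIS the equivalent is usually a loop over the dates with an expression on the Flat File Connection Manager's ConnectionString.

```python
import csv
from pathlib import Path


def export_per_date(dates, fetch_rows, out_dir):
    """Write one delimited file per date and return the paths written.

    fetch_rows is a hypothetical callable: it takes one date and returns
    an iterable of row tuples for that day.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for day in dates:
        # Build a distinct file name from the date being processed.
        target = out / f"export_{day:%Y%m%d}.csv"
        with target.open("w", newline="") as handle:
            csv.writer(handle).writerows(fetch_rows(day))
        written.append(target)
    return written
```

The point of the sketch is that the file name is computed inside the loop, once per date, rather than fixed before the loop starts.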
I have an SSIS package in which I use a Foreach Loop Container to loop through a folder and pick up a single .csv file.
The Container takes the file it finds and uses the file name for the ConnectionString of a Flat File Connection Manager.
Within the Container, I have a Data Flow Task to move row data from the .csv file (using the Flat File Connection Manager) into an OLEDB destination (this has another OLEDB Connection Manager it uses).
When I try to execute this container, it can grab the file name, load it into the Flat File Connection Manager, and begin to transfer row data; however, it continually errors out before moving any data - namely over two issues:
Error: 0xC02020A1 at Move Settlement File Data Into Temp Table, SettlementData_YYYYMM [1143]: Data conversion failed. The data conversion for column ""MONTHS_REMAIN"" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
Error: 0xC02020A1 at Move Settlement File Data Into Temp Table, Flat File Source [665]: Data conversion failed. The data conversion for column ""CUST_NAME"" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
In my research so far, I know that you can set what conditions to force an error-out failure and choose to ignore failures from Truncation in the Connection Manager; however, because the Flat File Connection Manager's ConnectionString is re-made each time the Container executes, it does not seem to hold on to those option settings. It also, in my experience, should be picking the largest value from the dataset when the Connection Manager chooses the OutputColumnWidth for each column, so I don't quite understand how it is truncating names there (the DB is set up as VARCHAR(255) so there's plenty of room there).
As for the failed data conversions, I also do not understand how that can happen when the column referenced is using simple Int values, and both the Connection Manager AND the receiving DB are using floats, which should encompass the Int data (am I unaware that you cannot convert Int into Float?).
It's been my experience that some .csv files don't play well in SSIS when going directly into a DB destination; so, would it be better to transform the .csv into a .xlsx file, which plays much nicer going into a DB, or is there something else I am missing to easily move massive amounts of data from a .csv file into a DB - OR, am I just being stupid and turning a trivial matter into something bigger than it is?
Note: The reason I am dynamically setting the file in the Flat File Connection Manager is that the .csv file will have a set name appended with the month/year it was produced as part of a repeating process, and so I use the constant part of the name to grab it regardless of the date info
EDIT:
Here is a screen cap of my Flat File Connection Manager previewing some of the data that it will try to pipe through. I noticed some of these rows have quotes around them, and wanted to make sure that wouldn't affect anything adversely - the column having issues is the MONTHS_REMAIN one
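On the quotes: one possible reading, offered as an assumption rather than a diagnosis, is that the file is quote-qualified but the connection manager has no text qualifier set. In that case the quotes are treated as data, and any comma inside a quoted name shifts the columns. The small sketch below (the sample row and the to_float helper are hypothetical) shows the difference between splitting on commas only and parsing with '"' as the qualifier.

```python
import csv


def to_float(text):
    """Convert text to float, returning None when it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


line = '"ACME, Inc.","12"'  # hypothetical row in the shape of the preview

# Without a text qualifier the quotes stay in the data and the embedded
# comma splits the name, so the columns shift and nothing parses as a number.
raw_fields = line.split(",")

# With '"' as the qualifier the quotes are stripped and the comma is kept.
qualified_fields = next(csv.reader([line], quotechar='"'))

print(raw_fields, [to_float(f) for f in raw_fields])
print(qualified_fields, to_float(qualified_fields[1]))
```

If this matches the file, setting the Text qualifier property of the Flat File Connection Manager to a double quote would be the first thing to try.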
Is it possible that one of the csv files in the suite you are processing is malformed? For instance, if one of the files had an extra column/comma, that could force a varchar value into an integer column, producing errors similar to the ones you have described. Have you tried using error row redirection to confirm that all of your csv files are formed correctly?
To use error row redirection, update your Flat File Source and adjust the Error Output settings to redirect rows. Your Flat File Source component will now have an extra red arrow which you can connect to a destination. Drag the red arrow from your source component to a new Conditional Split. Next, right-click the red line and add a data viewer. Now, when error rows are processed, they will flow over the red line into the data viewer so you can examine them. Finally, execute the package and wait for the data viewer to capture the errant rows for examination.
Do the data values captured by the data viewer look correct? Good luck!
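If you want to check the files before they reach the package, the same "extra column/comma" test can be sketched in a few lines outside SSIS. This is a minimal sketch, not part of the package: the function name and the sample data are made up, and it assumes a comma delimiter with double-quote qualifiers.

```python
import csv
import io


def find_malformed_rows(lines, expected_fields, delimiter=",", quotechar='"'):
    """Return (line_number, field_count) for rows with the wrong field count."""
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quotechar)
    bad = []
    for row in reader:
        if len(row) != expected_fields:
            # reader.line_num is the physical line on which the row ended.
            bad.append((reader.line_num, len(row)))
    return bad


# Hypothetical sample: the third line has an unquoted comma in the name.
sample = io.StringIO(
    'CUST_NAME,MONTHS_REMAIN\n'
    '"Smith, John",12\n'
    'Doe, Jane,7\n'
)
print(find_malformed_rows(sample, expected_fields=2))
```

Any row it reports is a candidate for the kind of shifted-column conversion error described above.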
I've chased my tail for a full 12 hours. Haven't found the right solution.
I'm locked into using SSIS. I have a SQL Server table with full paths and filenames already concatenated. Examples:
\\MydevServer1\C$\ABC\App_Data\Sample.pdf
\\MydevServer2\E$\Garth\App_Data\Morefiles.txt
\\MydevServer3\D$\Paths\App_Data\MySS.xlsx
etc.
I need to read each row of the table, get the path and filename and move that file to a new static destination directory.
The rows in the table will remain unchanged. I only use it as a source to locate the file to be moved.
I've tried:
1) Feeding a result set from an OLE DB Source to a Recordset Destination, then to an Object variable that feeds a Foreach Loop Container holding a File System Task. (Very problematic.)
2) Sending the table rows to a .csv file and reading each line of the .csv file using a Foreach Loop Container holding a File System Task.
3) Reading directly from the table rows using a Foreach Loop Container holding a File System Task. (Preferred.)
and many other scenarios.
I have viewed a hundred examples online, but most of them involve loading a table, or sending results to flat files, or moving files from one folder to another based on extension type, etc. I haven't found anything on configuring a file system task to read a table supplied path and move the file based on the table value as the source.
I'm rambling. :-)
Any insight or help will be appreciated. I'm not new to SSIS, but I sure feel like it right now.
Create two string variables to store the source and destination paths.
Use an Execute SQL Task with ResultSet set to "Full result set" to load the rows into a variable of the Object data type.
Use a Foreach Loop Container with the Foreach ADO Enumerator to go through each row of the recordset and set those two variables. (A For Loop Container only evaluates a counter expression and cannot enumerate a recordset.)
Inside the Foreach Loop Container, use a File System Task. Set IsSourcePathVariable = True and IsDestinationPathVariable = True, point SourceVariable and DestinationVariable at the two path variables, and set the Operation (copy, move, etc.).
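To make the loop logic concrete, here is a rough sketch of the same flow outside SSIS. It is an illustration only: the function name is made up, and it assumes the paths have already been read from the table into a list. It skips empty values, since an empty source path would give the move nothing to work with.

```python
import shutil
from pathlib import Path


def move_listed_files(source_paths, destination_dir):
    """Move each listed file into destination_dir; return (moved, skipped)."""
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    moved, skipped = [], []
    for raw in source_paths:
        path_text = (raw or "").strip()
        if not path_text:
            # A blank row would hand the move an empty source path.
            skipped.append(raw)
            continue
        source = Path(path_text)
        if not source.is_file():
            # The listed file does not exist (or is not reachable).
            skipped.append(raw)
            continue
        target = destination / source.name
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved, skipped
```

In the package, the loop variable plays the role of raw, and the File System Task plays the role of shutil.move.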
It appears I've been tail chasing due to the error, "Source is empty error".
This was caused by a blank first row in my recordset. I was searching for a fix to the Object variable is empty issue, when in reality the issue was that the Object variable couldn't find data right off the bat.
Insert shameful smug here.
Thanks to Anton for the help.
I am trying to merge a large number of files: about 40,000 Excel files, all in exactly the same format (columns etc.).
I tried running a merge command through CMD, which merged them up to a point, but I am unable to open the resulting CSV file because of its size.
What I am trying to find out is the best process for merging such a large number of files and then loading them into SQL Server.
Are there any tools for this, or is it something that would need to be custom built?
I don't know of a tool for that, but my first idea is this, assuming you are experienced with Transact-SQL:
open a command shell, change to the folder where your Excel files are stored, and enter the following command: dir *.xlsx /b > source.txt
This will create a text file named "source.txt", which contains the names (and only the names) of all your Excel files
import this file into a SQL Server table, e.g. one called "sourcefiles"
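create a new stored procedure that contains a cursor. The cursor should read your table "sourcefiles" row by row in a loop and store the name of the current Excel file in a variable, e.g. "@FileName"
inside the loop, run a SQL statement like this for each Excel file. OPENROWSET accepts only string literals, not variables, so the statement is built as a string (in an nvarchar(max) variable @sql declared before the loop) and executed dynamically:
SET @sql = N'INSERT INTO dbo.YourDatabaseTable
SELECT * FROM OPENROWSET(''Microsoft.ACE.OLEDB.12.0'',
''Excel 12.0 Xml;HDR=YES;Database=' + REPLACE(@FileName, '''', '''''') + N''',
''SELECT * FROM [YourWorkSheet$]'')';
EXEC sys.sp_executesql @sql;
let the cursor read the next row
Replace "YourDatabaseTable" and "YourWorkSheet" with your own names. The target table must already exist; SELECT * INTO would create it for the first file and fail for every later one.
@FileName must contain the full path to the Excel file.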
You may need to install the Microsoft.ACE.OLEDB.12.0 provider before executing the SQL command.
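As a way to see the per-file statements before wiring up the cursor, the steps above can be sketched as a small script that turns the listing into one statement per file. This is a sketch under assumptions: the function name and the C:\ExcelFiles folder are made up, and it uses INSERT INTO rather than SELECT ... INTO on the assumption that the target table already exists (SELECT ... INTO creates the table, so it would fail from the second file onward).

```python
from pathlib import PureWindowsPath


def build_import_statements(folder, file_names,
                            table="dbo.YourDatabaseTable",
                            sheet="YourWorkSheet"):
    """Return one INSERT ... OPENROWSET statement per listed Excel file."""
    statements = []
    for name in file_names:
        name = name.strip()
        if not name:
            continue  # skip blank lines in the listing
        # dir /b lists bare names, so prepend the folder to get a full path.
        full_path = str(PureWindowsPath(folder) / name)
        # Double single quotes so the path is safe inside a T-SQL literal.
        safe_path = full_path.replace("'", "''")
        statements.append(
            f"INSERT INTO {table}\n"
            "SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',\n"
            f"    'Excel 12.0 Xml;HDR=YES;Database={safe_path}',\n"
            f"    'SELECT * FROM [{sheet}$]');"
        )
    return statements


listing = ["Book1.xlsx\n", "\n", "O'Brien.xlsx\n"]  # stand-in for source.txt
for statement in build_import_statements(r"C:\ExcelFiles", listing):
    print(statement)
```

Printing the statements first makes it easy to run a handful by hand and confirm the provider and sheet name work before looping over all 40,000 files.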
Hope, this helps to think about your further steps
Michael
edit: have a look on this website for possible errors
I am new to developing SSIS packages and need your help to come up with a solution.
I have 10 different stored procedures whose output I have to export to a text file. All 10 procedures return the same set of columns (only the calling parameters differ).
I can't work out how to do this.
Could you please help me understand how to export the output of a stored procedure to a tab-delimited text file?
Please let me know how to build the SSIS package.
Thanks
This is very hard to do without putting in pictures for each step. I do not seem to be able to add pictures, so I will try to describe it in as much detail as I can.
You first have to set up a connection to the database where you are going to run the stored procedures. This means creating a connection manager with "New OLE DB Connection". You will need valid login information for the database to create this connection.
In the control flow I would set up an "Execute SQL Task". I would set the result set to "Full result set" and set the connection to the one you created in the prior step. To call a stored procedure from an Execute SQL Task, use something like "exec ? = dbo.usp_check_load_table_all @JobCode = ?, @TransId = ?, @Status = ?, @TurnStatusOff = ?". The first ? is the return code from the stored procedure. The others are the parameters the stored procedure runs with. Now, you are running 10 different stored procedures and I only know how to run one, but you could create ten packages, one to run each, and concatenate the files when they are done. In the parameter mapping you set the values for the variables to run with. Make sure to create a User::ReturnValue variable of type Long for the return code. The result set needs one entry, a User::Results variable of type Object.
You now add a Foreach Loop with the ADO enumerator, using User::Results as the source variable. This allows you to read each row one at a time. You must create user variables for the variable mappings to go into.
I would then add a Data Flow Task, put in a Derived Column, and set up each of the fields you want to write to the file from the User:: fields you created for the Foreach Loop.
I would create a flat file connection in the connection manager as a delimited file, tab delimited. You will need a file that looks like the output you desire, as you will need to map each field in the file.
Add a Flat File Destination under the Derived Column and point it at the flat file connection you just created. Now map each field to the output.
I hope this is helpful, as I was once new to SSIS myself.
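Since all ten procedures return the same columns, their rows can simply be appended to one tab-delimited file. The sketch below only illustrates that file layout outside SSIS: the function name, column names and rows are hypothetical stand-ins for the procedure output, and the header line is optional.

```python
import csv
import io


def write_tab_delimited(stream, column_names, rows):
    """Write a header line and data rows as tab-delimited text."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\r\n")
    writer.writerow(column_names)
    writer.writerows(rows)


# Hypothetical rows standing in for the result sets of several procedure calls.
result_sets = [
    [(1, "Alpha")],
    [(2, "Beta"), (3, "Gamma")],
]
buffer = io.StringIO()
all_rows = [row for result_set in result_sets for row in result_set]
write_tab_delimited(buffer, ["Id", "Name"], all_rows)
print(buffer.getvalue())
```

This is the shape the Flat File Destination should end up producing: one header, then the rows of every call, with a tab between fields.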
I am working on a generic SSIS package that receives a flat file, adds new columns to it, and generates a new flat file.
The problem I have is that the number of new columns varies based on a stored procedure XML parameter. I tried to use the "Execute Process Task" to call BCP, but the XML parameter is too long for the command line.
I searched the web and found that you cannot dynamically change an SSIS package at run time and that I would have to use a script task to generate the output. I started going down that path and found that you still have to let the script component know how many columns it will be receiving, and that is exactly what I do not know at design time.
I found a third party SSIS extension from CozyRoc, but I want to do it without any extensions.
Has anyone done something like this?
Thanks!
If the number of columns is not known until run time, then you will have to do something dynamic, and that means using a script task and/or a script component.
The workflow could be:
Parse the XML to get the number of rows
Save the number of rows in a package variable
Add columns to the flat file based on the variable
This is all possible using script tasks, although if there is no data flow involved, it might be easier to do the whole thing in an external Perl script or C# program and just call that from your package.
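As a rough idea of what the external-program route could look like, here is a minimal sketch. The XML layout (a columns element containing column elements with name and default attributes), the function name and the comma delimiter are all assumptions made for illustration; the real parameter will have its own shape.

```python
import csv
import io
import xml.etree.ElementTree as ET


def add_columns(in_stream, out_stream, columns_xml, delimiter=","):
    """Append one column per <column> element to every row of a flat file.

    Assumes the input has a header row; returns the number of columns added.
    """
    root = ET.fromstring(columns_xml)
    new_columns = [(c.get("name"), c.get("default", "")) for c in root.iter("column")]
    reader = csv.reader(in_stream, delimiter=delimiter)
    writer = csv.writer(out_stream, delimiter=delimiter, lineterminator="\n")
    header = next(reader)
    writer.writerow(header + [name for name, _ in new_columns])
    for row in reader:
        writer.writerow(row + [default for _, default in new_columns])
    return len(new_columns)


# Hypothetical XML parameter and input file.
xml_param = '<columns><column name="Region" default="N/A"/><column name="Batch" default="42"/></columns>'
source = io.StringIO("id,name\n1,Ann\n")
target = io.StringIO()
print(add_columns(source, target, xml_param))
print(target.getvalue())
```

Because the column list is read at run time, nothing about the output layout has to be known when the package is designed; the package only needs to call the program and pass it the file paths and the XML.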