SSIS: Flat File default length - sql-server

I have to import about 50 different types of files every day. Some of them have only a few columns, some include up to 250 columns.
The Flat File connection always defaults all columns to 50 chars.
Some columns can be way longer than 50 chars, and will of course end up in errors.
Currently I am doing a stupid search & replace with Notepad++: opening all SSIS packages and replacing:
DTS:MaximumWidth="50"
by
DTS:MaximumWidth="500"
This is an annoying workaround.
Is there any possibility to set a default length for flatfile string columns to a certain value?
I am developing in Microsoft Visual Studio Professional 2015 and SQL Server Data Tools 14.0.61021.0
Thanks!

I don't think there is a way to set such a default in SQL Server Data Tools.
But there are some workarounds:
Easiest solution: in the Flat File Connection Manager, on the Advanced tab, select all columns (using the Ctrl key) and change the data length property for them all in one edit. (Detailed in @MikeHoney's answer.)
You can use BIML (Business Intelligence Markup Language) to create the SSIS packages. If you're new to BIML, the BIML Script website has detailed tutorials.
You can create a small application that loops over the .dtsx files in a folder and replaces DTS:MaximumWidth="50" with DTS:MaximumWidth="500", using either the normal String.Replace function or regular expressions. (See my answer to "Automate Version number Retrieval from .Dtsx files" for an example of reading a .dtsx file using regular expressions.)
Function to read and replace the content of a .dtsx file (VB.Net)
Public Sub FixDTSX(ByVal strFile As String)
    Dim strContent As String = String.Empty
    ' Read the whole package XML into memory (Using closes the reader)
    Using sr As New System.IO.StreamReader(strFile)
        strContent = sr.ReadToEnd()
    End Using
    ' Widen every column that still has the default width of 50
    strContent = strContent.Replace("DTS:MaximumWidth=""50""", "DTS:MaximumWidth=""500""")
    ' Overwrite the original file (append = False)
    Using sw As New System.IO.StreamWriter(strFile, False)
        sw.Write(strContent)
    End Using
End Sub
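The folder loop described in this answer could be sketched roughly as follows. This is a Python sketch, not part of the original answer; the function name widen_columns and its parameters are made up for illustration. It works on raw bytes so the package's encoding and any byte order mark stay untouched, which assumes the file uses an ASCII-compatible encoding such as UTF-8. Back up the packages before running anything like this.

```python
import re
from pathlib import Path


def widen_columns(folder, old_width=50, new_width=500):
    """Rewrite every .dtsx file in `folder`; return the number of files changed."""
    # The closing quote is part of the pattern, so "50" never matches inside "500".
    pattern = re.compile(rb'DTS:MaximumWidth="%d"' % old_width)
    replacement = b'DTS:MaximumWidth="%d"' % new_width
    changed = 0
    for path in Path(folder).glob("*.dtsx"):
        data = path.read_bytes()
        new_data, count = pattern.subn(replacement, data)
        if count:
            # Only touch files that actually contained the old width.
            path.write_bytes(new_data)
            changed += 1
    return changed
```

Calling widen_columns(r"C:\packages") would process only the top-level folder; swapping glob for rglob would include subfolders.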

There is a way to achieve what you want using the standard Visual Studio SSDT UI, although it is quite obscure. AFAIK it works in every version of this editor since SQL Server 2005.
With the package open, from the Connection Managers pane, right-click your Flat File Connection and choose Edit. Then navigate to the Advanced page. Then multi-select the columns you want to change (e.g. shift-click a range or ctrl-click a specific set). Now the Properties appearing at the right will be applied to all the selected columns.
In my example, I set all the selected columns to a width of 255 this way.

Esteban,
I suggest you use the Object Model API which allows you to develop SSIS packages programmatically. Using that, you can make use of any .net code that allows you to gather data/metadata from text files. Also, the assumption is that, since you are using SSIS, you already might be familiar with writing code in C#/VB.Net
Now, if you are just starting with the Object Model API, there would be a huge learning curve (but it is worth learning it if SSIS is your day to day life). If you do not have the time to invest right now, I would recommend you to use a library I wrote (called Pegasus) which greatly simplifies how you can use the Object Model API; you can create your packages in an almost declarative fashion (using C#).
On Github, there is an example that shows how to create a package that loads any number of text files with differing schemas in a given folder. See here; specifically the method GenerateProjectToLoadTextFilesToSqlServerDatabase().
In the example, I use a third-party .NET library called lumenworks.framework to probe delimited files and get their metadata. Using this library, I get the names of the columns; I also infer data types and lengths by sampling the first 'n' rows. (In my code, I am only inferring ints, dates and strings; if you have more data types, add relevant code accordingly.) Or you can specify one specific data type and length (it looks like you want a string of 500 chars) for all your columns. [Or, in some cases, you might have this metadata available externally in an Excel file or config file.] Then I use this metadata to configure my text file connection managers programmatically.
You can download the code from GitHub and run the DataFlowExample by specifying where your source files are, and see how far it gets you.
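To make the sampling idea concrete, here is a minimal sketch in Python using only the standard library. It is not the Pegasus or lumenworks code; infer_columns, the ISO date format and the default sample size are assumptions chosen for illustration. A column keeps the type int or date only while every sampled non-empty value agrees, and otherwise falls back to string.

```python
import csv
from datetime import datetime
from itertools import islice


def _kind(value):
    """Classify one cell as 'int', 'date' or 'string'."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y-%m-%d")  # assumed date format
        return "date"
    except ValueError:
        return "string"


def infer_columns(path, sample_rows=100, delimiter=","):
    """Return [(name, kind, max_length)] from the header and the first rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        names = next(reader)
        kinds = [None] * len(names)
        lengths = [0] * len(names)
        for row in islice(reader, sample_rows):
            for i, value in enumerate(row[:len(names)]):
                lengths[i] = max(lengths[i], len(value))
                if value == "":
                    continue  # empty cells do not influence the type
                kind = _kind(value)
                # Disagreement between sampled values degrades the column to string.
                kinds[i] = kind if kinds[i] in (None, kind) else "string"
    return [(n, k or "string", l) for n, k, l in zip(names, kinds, lengths)]
```

Because only a sample is read, the reported maximum length is a lower bound; padding it generously before using it as a column width is the safer choice.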
Another recommendation would be Biml, but I am not sure if you can incorporate your own/third party full fledged C# code (not just snippets) into Biml workflow. If you can, then go with Biml.
Let me know if you have any questions.

Related

Reading Excel files into SQL Server not using OLEDB/ODBC

Is there a way to read Excel 2010/2013 files natively?
We are importing Excel files into SQL Server and have come across a specific issue: the Excel driver appears to decide the type of a destination data column by testing the contents of only the first 65K-odd rows.
This has only just started happening within the past 3 weeks. Before then we had managed to convince Excel of the error of its ways with a simple registry hack that forced it to read the entire set of rows.
The problem is that we have some datasets that contain, say 120,000 rows and these may have all numeric values for the first 80,000, then it will have some non-numeric yet vital information that we wish to retain.
Yes, the data is not correctly typed, we know.
Because the source data type has been determined by the Excel driver to be a float it promptly turns all our non-numeric values into NULLs - not very useful.
If there was some other way to read an Excel file not using the standard ODBC/OLEDB drivers that might help.
We have tried saving it into various other formats before importing but of course all these exports use the Excel driver which has the problem.
I think the closest we have got is to save it as XML (which is frankly huge at 800MB) and then shred it using standard xpath queries and some pretty dodgy workarounds to handle no doubt well-formed but still tricky variations on how column data is represented.
Edit: changed title to more closely reflect the issue
As well as the registry key, when connecting to your Excel file, have you tried setting the following:
;Extended Properties="IMEX=1"
See here
Also see this MSDN article
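For context, IMEX=1 normally sits inside the Extended Properties value next to the Excel version and header flag. The helper below is only a sketch of how such a connection string is commonly assembled for the ACE OLE DB provider; the function name and the example path are hypothetical. As far as I know, IMEX=1 only affects columns the driver already detects as mixed within the rows it scans, so it complements the row-scan registry setting rather than replacing it.

```python
def excel_connection_string(path, has_header=True, imex=True):
    """Build an ACE OLE DB connection string for an .xlsx workbook."""
    props = ["Excel 12.0 Xml", "HDR=" + ("YES" if has_header else "NO")]
    if imex:
        props.append("IMEX=1")  # import mode: read mixed-type columns as text
    joined = ";".join(props)
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f"Data Source={path};"
        f'Extended Properties="{joined}"'
    )


# Hypothetical workbook path, shown only to illustrate the call.
print(excel_connection_string(r"C:\data\book.xlsx"))
```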

Generating several similar SSIS packages (file data source to DB)

Is there a way to automatically generate SSIS packages? I need to create a lot of SSIS packages that just erase data from one table and import data from a text file. The file name matches table name and the column headers are in the first line of the file.
For more detailed information:
I am working on a project in which I have to separate two systems that are currently coupled (one system has direct access to the other's database). After the modifications, one system will provide data through txt files to be loaded in the other database.
We have to use SSIS to load data into the database from the text files.
The text files will be provided in CSV format with column headers in the first line.
The tables from both databases have matching column names, and all we need to do is clear the table and load data from the files.
I have more than one hundred tables with different number of columns. Do I need to create each package manually?
I'm familiar with 2 free options.
EzAPI might be a good place to start if you're a .NET-heavy shop or just really want to geek out with the API. This approach allows you to control pretty much the entire package generation, but at the cost of coding time. I find EzAPI generally easier than working with the base COM/.NET libraries for SSIS.
Biml is an interesting beast. Varigence will be happy to sell you a license to Mist but it's not needed. All you would need is BIDSHelper and then browse through BimlScript and look for a recipe that approximates your needs. Once you have that, click the context sensitive menu button in BIDSHelper and whoosh, it generates packages.
I did this just using VB. I passed in the table names as a command parameter and used VB to generate the insert and clear statements, and it worked a charm. I can try to dig it out tomorrow when I'm back in the office, but it was pretty simple. There didn't seem to be any other way to say "just get x and export it" or "just take y and import it into z", so VB it had to be. In fact, come to think of it, I think I actually used a small XML file to pass the table info for export and then determined the table name for import from the CSV file name. To be clear, this was only one package, but it could dynamically choose the number of imports/exports it did. Further clarification: this was VB within SSIS, as a processing step.

Using Excel with Silverlight app not writing new columns

I have a project as follows:
User uploads Excel file to server, server will return back with 2 new columns. User wants us to check prices being charged and we have file that holds average standard pricing.
In the desktop application I just finished, I use Microsoft.Office.Interop.Excel for manipulating the Excel file.
But this is not available in Silverlight. Reading is not the issue.
The issue is adding 2 new columns. Program reads excel file using oledb, and oledb is very light and is available in web.
But for creating 2 new columns, I use Microsoft.Office.Interop.Excel that Microsoft provides.
This is not available in web.
I need to work out how we can do this.
One possibility is to have the program on the server, waiting for a file, process the file, and email back to the user.
I just want to see if there is another way. I don't like this approach; it doesn't seem like the best one.
You have a few options for doing this with Silverlight. First, you can use the Excel XML format for the files which means adding a column is just an XML exercise. Second, if that doesn't work, you can upload the file to the server and run the same code you have in your desktop app to update the file. Once it is updated you can prompt the user to save the file back to their hard drive.
If you go the Excel XML route then you would need to create a web service to get the price data from your database out to the Silverlight on the client. Oledb won't work since you don't want to expose your database via oledb on the Internet.

Export queries from SSMS to files - not the results but the query

I want to export all my queries as individual files for purposes of putting them into mercurial source control, but I don't know how to export the individual queries as individual files without having to open each one, then save to the folder, then add into the project, or some equally convoluted process.
I wouldn't mind having to add each one individually, but how do I get them out of the database as individual files without opening them all and doing each one save as? Ostensibly I would like them named with the name they have in the database right now.
I could easily dump the whole lot into one long file using database tasks, but that's not really super helpful is it?
I have SSMS 2k5 and 2k8 (and VS 2k5, 2k8, 2010 to boot) to work with, any thoughts?
Right-click the database and select Generate Scripts. On the last page, under Script to file, you can choose a single file or one file per object.
When you script a database in SSMS, you have the option of one file per object.
SMO is useful if you write a small app to iterate through the objects.
Third party tools like Red Gate SQL Compare (there are other free tools) can script too
I would write a small C# program which extracts your database object via SMO and stores them in your filesystem the way you want.
It is rather easy to write stored procedures which fetch the definitions into the result as text. sp_helptext could be used as a start.
Then you can use PowerShell to write the output to the file system.
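The "one file per object" step could look roughly like the sketch below, written in Python instead of PowerShell. write_definitions and the schema.name.sql naming scheme are assumptions for illustration. The rows are expected to come from a query against sys.sql_modules, run through whatever database driver you already use; the driver call itself is left out here.

```python
import re
from pathlib import Path

# Catalog query returning (schema, name, definition) for procedures,
# views, functions and triggers.
MODULES_QUERY = """
SELECT OBJECT_SCHEMA_NAME(m.object_id), OBJECT_NAME(m.object_id), m.definition
FROM sys.sql_modules AS m
"""


def write_definitions(rows, out_dir):
    """Write each (schema, name, definition) row to its own .sql file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for schema, name, definition in rows:
        if definition is None:
            continue  # encrypted modules expose no definition
        # Replace characters that are not valid in Windows file names.
        safe = re.sub(r'[<>:"/\\|?*]', "_", f"{schema}.{name}")
        target = out / f"{safe}.sql"
        target.write_text(definition, encoding="utf-8")
        written.append(target)
    return written
```

The resulting folder can be added to the source-control repository as is, with each file named after the object it contains.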
It sounds as if this would fit rather well into the Really Simple Data Dictionary CodePlex project.

Simplest way to implement a database for a Song list type program (Using Visual C++)

Working on a school project, the program is supposed to read from a text file that has a record about a song in every line, fields separated by ";".
Anyways, I have no knowledge of databases, and I just want the quickest way to create a database from that text file. I will also need to change some of the fields of the records once in a while from the program, and the program needs to search through the database based on certain fields.
So far none of our projects kept a database, so when we closed the program, all the info was gone. Now I actually need to keep some info for the next time the program runs. What's the fastest way to accomplish this?
I also want to be able to keep some info about the software, like the path of the original text file for weekly updates. Where can I save info like that?
EDIT: it doesn't have to be an actual database, as long as I can search and edit it efficiently.
If you can use a SQL database, I'd suggest SQLite, a simple file-based database.
With SQLite, you can query, insert and update records by executing regular SQL statements.
Here you will find an introduction to the C++ interface. It's easy to embed SQLite support in an application because SQLite comes as a library, meaning a bunch of header files and one or two binary archives.
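To show what those SQL statements look like in practice, here is a small sketch using Python's built-in sqlite3 module instead of the C/C++ interface; the SQL itself would be the same from C++. The title;artist;year field layout and the load_songs name are assumptions, since the question does not list the actual fields.

```python
import sqlite3


def load_songs(db_path, lines):
    """Create the songs table if needed and insert one row per text line."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS songs (title TEXT, artist TEXT, year INTEGER)"
    )
    # Assumed layout: title;artist;year, one song per line; blank lines are skipped.
    rows = [line.strip().split(";") for line in lines if line.strip()]
    con.executemany("INSERT INTO songs (title, artist, year) VALUES (?, ?, ?)", rows)
    con.commit()
    return con


# ":memory:" keeps the demo self-contained; a file path makes the data persistent.
con = load_songs(":memory:", ["Hey Jude;The Beatles;1968", "Imagine;John Lennon;1971"])
con.execute("UPDATE songs SET year = 1970 WHERE title = ?", ("Imagine",))
print(con.execute("SELECT title, year FROM songs WHERE artist LIKE ?", ("%Lennon%",)).fetchall())
```

Passing a file name such as songs.db instead of ":memory:" is what makes the data survive between program runs, which is the part the question is really after.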
Your delimited text file is already a database. You can add records, delete records, and modify records using the standard text file routines provided by the standard C++ libraries.
Alternatively, you can import your textfile into SQL Server using BULK INSERT.
Finally, you can access your CSV (comma-delimited text) file using SQL queries. You need to find the correct connection string. See http://www.connectionstrings.com/textfile.

Resources