How to replace - (hyphens) with spaces in SSIS - database

Since we receive the data in the form of excels. We are finding and replacing hyphens with space in excel (manual changes in the excel). Now the requirement is to make it automatic, to replace hyphens with spaces for the entire table (all columns) through SSIS, before they are loaded into tables.
How can we do that?

Let's suppose that you have two columns :
This will you data entry in Excel Source component :
You can use a derived column and use REPLACE() function :

Related

Is it possible to generate both an FMT and a table from a CSV file?

So I need a way to import CSVs that vary in column names, column order, and number of columns. They will always be CSV and of course comma-delimited.
Is it possible to generate both FMT and a temp table creation script of a CSV file?
From what I can gather, you need one or the other. For example, you need the table to generate the FMT file using the bcp utility. And you need the FMT file to dynamically build a create script for a table.
Using just SQL and to dynamically load files text files there is no quick way to do this. I see one option:
Get the data into SQL Server as a single column (bcp it in or use
t-sql and openrowset to load, SSIS, etc...). Be sure to include in this table a second column that is an identity (I'll call it "row_nbr"). You will need this to find the first row to get column names from the header in the file.
Parse the first record "where row_nbr = 1" to get the header record. You will need a string parse function (find online, or create your own) to substring out each column name.
Build dynamic SQL statement to create a new table with the parsed
out number of fields you just found. Must calculate lengths and use
a generic "varchar" data type since you wont know how to type the
data. Use column names found above.
Once you have a table created with the correct number of adequately
sized columns, you can create the format file.
I assumed, in my answer, that you are comfortable with doing all these things, just shared the logical flow at a high level. I can add more if you need more detail.

Ignoring column from Excel file while importing to SQL Server

I have multiple Excel files that have the same format. I need to import them into SQL Server.
The issue I currently have is that there are two text columns that I need to ignore completely as they are free text and the character length for some rows exceeds what the server allows me to import which results in a truncation error.
Because I don't need these columns for my analysis, the table I'm importing to doesn't include these columns but for some reason the SSIS packages still picks up those columns and cuts the import job halfway through.
I tried using max character length for those columns which still results in the truncation error.
I need to create an SSIS package that ignores the two columns completely without deleting the columns from Excel.
You can specify which columns you need to ignore from the Edit Mappings dialog.
I have added the image for your reference:
If you just create the SSIS package in SSDT the Excel file can be queried to return only the required columns. In the package, create an Excel Connection Manager using the Excel file. Then on the Control Flow of the package add a Data Flow Task that has an Excel Source component in it. On this source, change the data access mode to SQL command and the file can then be queried similar to SQL. In the following example TabName is the name of the Excel tab containing the data that will be returned. If either the tab or any column names contain spaces they will need to be enclosed in square brackets, i.e. TabName would be [Tab Name].
Import/Export Wizard
Since you mentioned in the comments that you are using SQL Server Import/Export Wizard. You can solve that if you have a fixed columns (range) that you are looking to import (example: first 10 columns).
In Import/Export wizard, after selecting destination options you will be asked if you want to read from tables or query:
Select the query option, then use a simple select query and specify the columns range after the sheet name. As example:
SELECT * FROM [Sheet1$A:C]
The query above will read from the first 3 columns in Sheet1 since A:C represent the range between first column A and third column C.
Now, you can check the columns from the Edit Mappings dialog:
SSIS
You can use the same logic within SSIS package, just write the same SQL command in the Excel Source after changing the Access Mode to SQL Command.
The solution is simple. I needed to write a query that will exclude the columns. So instead of selecting "Copy data from one or more tables" you select "write a query" and exclude the columns you don't need. This one worked 100%

How can I make SSIS detect the record length when importing a flat file?

I wrote an SSIS package which imports data from a fixed record length flat file into a SQL table. Within a single file, the record length is constant, but different files may have different record lengths. Each record ends with a CR/LF. How can I make it detect where the end of the record is, and use that length when importing it?
You can use a script task. Pass a ReadWriteVariable into the script task. Let's call the ReadWriteVariable intLineLength. In the script task code, detect the location of the CR/LF and write it to intLineLength. Use the intLineLength ReadWriteVariable in following package steps to import the data.
Here is an article with some good examples: script-task-to-dynamically-build-package-variables
Hope this helps.
This may not work for everyone, but what I ended up doing was simply setting it to a delimited flat file and setting CR/LF as the row delimiter, and leaving the column delimiter and text qualifier blank. This won't work if you actually need to have it split out columns in the import, but I was already using a Derived Column task to do the actual column splitting, because my column positions are variable, so it worked fine.

SSIS 2008 R2 - How to load header and data row from CSV

I have a CSV file where there is a header row and data rows in the same file.
I want to get information from both rows during the same load.
What is the easiest way to do this?
i.e File Example - Import.CSV
2,11-Jul-2011
Mr,Bob,Smith,1-Jan-1984
Ms,Jane,Doe,23-Apr-1981
In the first row, there a a count of the number of rows and the date of transmission.
In the second and subsequent rows is the actual data, in this Title, FirstName, LastName, Birthdate
SQL Server Integration Services Conditional Split Transformation should do it.
I wonder what would You do with that info in the pipeline. However, there is only one solution to read it in one pass (take a look at notes/limitations at the end):
Create a data flow
Put File source component and set it the way You want
Add script task to count the number of rows
Put conditional split transformation where condition is mycounter=0
One path from condition split will be the first row of file (mycounter=0) and the other path will be the rest of the rows (2 in your example).
Note#1: file source can set only one metadata for each column in the source. This means that if your first column of data is string (Mr, Ms, ...) then You have to set it as string data type in the source. Otherwise, if You set it as integer (DT_Ix) it
will fail as soon as it encounters row with string data (Mr, Ms, ...) in the first column of file. This applies to all columns, not just the first one.
Note #2: SSIS will see only the number of columns You told it to. This means that You have to have the same number of columns in EACH row. Otherwise, You have ragged csv file and You need to take another approach - search the Internet. But those solutions also require different layout of csv.
Answers in the following links explain how to load parent-child data from a flat file into an SQL Server database when both parent and child rows exist in the same file next to each other.
How do I split flat file data and load into parent-child tables in database?
How to load a flat file with header and detail data into a database using SSIS package?

How to import variable record length CSV file using SSIS?

Has anyone been able to get a variable record length text file (CSV) into SQL Server via SSIS?
I have tried time and again to get a CSV file into a SQL Server table, using SSIS, where the input file has varying record lengths. For this question, the two different record lengths are 63 and 326 bytes. All record lengths will be imported into the same 326 byte width table.
There are over 1 million records to import.
I have no control of the creation of the import file.
I must use SSIS.
I have confirmed with MS that this has been reported as a bug.
I have tried several workarounds. Most have been where I try to write custom code to intercept the record and I cant seem to get that to work as I want.
I had a similar problem, and used custom code (Script Task), and a Script Component under the Data Flow tab.
I have a Flat File Source feeding into a Script Component. Inside there I use code to manipulate the incomming data and fix it up for the destination.
My issue was the provider was using '000000' as no date available, and another coloumn had a padding/trim issue.
You should have no problem importing this file. Just make sure when you create the Flat File connection manager, select Delimited format, then set SSIS column length to maximum file column length so it can accomodate any data.
It appears like you are using Fixed width format, which is not correct for CSV files (since you have variable length column), or maybe you've incorrectly set the column delimiter.
Same issue. In my case, the target CSV file has header & footer records with formats completely different than the body of the file; the header/footer are used to validate completeness of file processing (date/times, record counts, amount totals - "checksum" by any other name ...). This is a common format for files from "mainframe" environments, and though I haven't started on it yet, I expect to have to use scripting to strip off the header/footer, save the rest as a new file, process the new file, and then do the validation. Can't exactly expect MS to have that out-of-the box (but it sure would be nice, wouldn't it?).
You can write a script task using C# to iterate through each line and pad it with the proper amount of commas to pad the data out. This assumes, of course, that all of the data aligns with the proper columns.
I.e. as you read each record, you can "count" the number of commas. Then, just append X number of commas to the end of the record until it has the correct number of commas.
Excel has an issue that causes this kind of file to be created when converting to CSV.
If you can do this "by hand" the best way to solve this is to open the file in Excel, create a column at the "end" of the record, and fill it all the way down with 1s or some other character.
Nasty, but can be a quick solution.
If you don't have the ability to do this, you can do the same thing programmatically as described above.
Why can't you just import it as a test file and set the column delimeter to "," and the row delimeter to CRLF?

Resources