Import csv file via SSMS where text fields contain extra quotes - sql-server

I'm trying to import a customer CSV file via the SSMS Import Wizard. The file contains 1 million rows, and I'm having trouble importing rows where a field has extra quotes; the file has been populated freehand, so it could contain anything, e.g.:
Name, Address
"John","Liverpool"
"Paul",""New York"""
"Ringo","London|,"
"George","India"""
Before I press on looking into SSMS: should SSMS 2016 handle this now, or do I have to do it in SSIS? It's a one-off load to check something.

In SSMS Import/Export Wizard, when configuring the Flat File Source you have to set:
Text Qualifier = "
Column Delimiter = ,
This will import the file as the following:
Name Address
John Liverpool
Paul "New York""
Ringo London|,
George India
The remaining double quotes must be removed after the import using SQL, or you have to create an SSIS package manually in Visual Studio and add a transformation to clean the data.
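If you'd rather pre-clean the file before it ever reaches the wizard, here is a minimal Python sketch (my own suggestion, not part of the wizard workflow). It assumes every field is quoted and that the delimiter only ever appears between a closing and an opening qualifier, i.e. as the sequence `","`; it strips the qualifiers and drops any stray quotes left inside a field.

```python
def clean_line(line):
    """Split one messy CSV line on the qualifier-delimiter-qualifier
    sequence '","', strip the outer qualifiers, and drop any stray
    quotes left inside a field. Assumes every field is quoted."""
    parts = line.rstrip("\r\n").split('","')
    parts[0] = parts[0].lstrip('"')    # leading qualifier of first field
    parts[-1] = parts[-1].rstrip('"')  # trailing qualifier of last field
    return [p.replace('"', '') for p in parts]
```

With the sample rows above, `clean_line('"Paul",""New York"""')` yields `['Paul', 'New York']`, and the comma inside `"Ringo","London|,"` survives because only the full `","` sequence is treated as a field break.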

Related

Import a CSV comma-delimited file into a SQL Server table

I need to import a .CSV file into a SQL Server table and I'm having problems due to " appearing within the string.
I have found the problem: lines containing
,"32" Leather Bike Trs ",
never split into columns correctly.
I've been trying to solve this for hours; what am I missing here?
If it can't be done with the SSMS import wizard, can it be done in SSIS: import as one big column and then clean it with SQL or a C# script? What would be my next step to research?
Thanks.
Below is a sample (header plus one data line) to put into a CSV file to try.
"Company","Customer No","Store No","Store Name","Channel","POS Terminal No","Currency Code","Exchange Rate","Sales Order No","Date of Sales Order","Date of Transaction","Transaction No","Line No","Division Code","Item Category Code","Budget Group Description","Item Description","Item Status","Item Variant Season Code","Item No","Variant Code","Colour Code","Size","Original Price","Price","Quantity","Cost Amount","Net Amount","Value Including Tax","Discount Amount","Original Store No","Original POS Terminal No","Original Trans No","Original Line No","Original Sales Order No","Discount Code","Refund Code","Web Return Description"
"Motor City","","561","Outback","In-store","P12301","HKD","1","","","20160218","185","10000","MT","WW","Jeans","32" Leather Bike Trs ","In Stock","9902","K346T4","BK12","BK","12","180.00000000000000000000","149.00000000000000000000","1.00000000000000000000","34.12500000000000000000","135.45000000000000000000","149.00000000000000000000",".00000000000000000000","","","0","0","","","",""
You're right, the issue comes from a single " placed in your text. The fun fact is that if you had two "" in your text, SSMS could handle it (as many other tools can).
Maybe you should consider changing the text qualifier of your file before implementing an SSIS package?
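To illustrate both routes with a sketch using Python's csv module (the helper name is mine, not from the answer): either double the embedded quote so the field is valid RFC 4180, or switch to a qualifier that never appears in the data.

```python
import csv
import io

def write_row(fields, quotechar='"'):
    """Serialize one row with every field qualified; any embedded
    qualifier character is doubled automatically (RFC 4180 style)."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL, quotechar=quotechar,
               lineterminator="").writerow(fields)
    return buf.getvalue()
```

`write_row(['32" Leather Bike Trs'])` produces `"32"" Leather Bike Trs"`, which SSMS can parse; `write_row(['32" Leather Bike Trs'], quotechar='~')` produces `~32" Leather Bike Trs~`, the changed-qualifier route.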

Importing *.DAT file into SQL server

I am trying to import a *.DAT file (as a flat file source) into SQL Server using the SQL Server Import and Export Wizard. The file uses DC4 as its delimiter, which causes an error when the wizard tries to separate the columns and their respective data during import.
Are there any setting changes to be made during the importing process?
If you don't have to use the wizard, you can script it like:
BULK INSERT [your_database].[your_schema].[your_table]
FROM 'your file location.dat'
WITH (ROWTERMINATOR = '0x04'  -- DC4 char
     ,MAXERRORS = 0
     ,FIELDTERMINATOR = 'þ'
     ,TABLOCK
     ,CODEPAGE = 'RAW'
     );
The wizard uses SSIS under the hood. Instead of executing it directly, choose CrLF as the row delimiter, then choose to save the package as a file. Open the file and edit it with any text editor; it's a simple XML file.
It's not clear whether 0x04 is the column delimiter or the row delimiter. Assuming it's the row delimiter,
Replace all instances of
Delimiter="_x000D__x000A_"
with
Delimiter="_x0004_"
There are two instances: DTS:HeaderRowDelimiter and DTS:ColumnDelimiter.
Save the file and execute it with a double-click or "Open with: Execute Package Utility". I tested this solution on my PC using an account with limited permissions.
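The find-and-replace step on the saved package can also be scripted; here is a sketch (the function name is hypothetical, and it assumes the delimiter attributes appear exactly as the wizard writes them):

```python
def patch_delimiters(package_xml):
    """Replace the XML-escaped CR/LF delimiter (_x000D__x000A_) with
    DC4 (_x0004_) in the text of a saved wizard package. Both the
    header-row and column delimiter attributes are covered, because
    the same escaped value appears in each."""
    return package_xml.replace('Delimiter="_x000D__x000A_"',
                               'Delimiter="_x0004_"')
```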

Importing DAT file that contains double quotes within few fields

I'm using SQL Server 2014 Management Studio - Import and Export wizard to import a .DAT file into a SQL Server table.
However, while all of these records are text qualified with " double quotes, some of the field values have double quotes within them. SQL Server aborts the package every time it hits a row like that. Any solution?
"Field A"|"Field B"|"Field C"
"Value A"|"Another value "with" for B"|"Value C"
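One workaround (my own sketch, not a wizard feature) is to pre-split the file yourself: because every field is text-qualified and the delimiter is `|`, the three-character sequence `"|"` should only occur between fields, so double quotes embedded inside a field can be kept as data.

```python
def split_dat_line(line):
    """Split a fully-qualified, pipe-delimited line on the exact
    sequence '"|"', preserving any double quotes embedded in a field.
    Assumes '"|"' never occurs inside a field value."""
    parts = line.rstrip("\r\n").split('"|"')
    parts[0] = parts[0].lstrip('"')    # leading qualifier of first field
    parts[-1] = parts[-1].rstrip('"')  # trailing qualifier of last field
    return parts
```

The sample row above parses as `['Value A', 'Another value "with" for B', 'Value C']`, with the embedded quotes intact.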

Importing quote-escaped CSV into SQL Server via Import Data wizard into all nvarchar columns

I have a table of CSV data like such:
a | b | c | d | f
1: 12 Dave Larry $1234.0 FALSE
2: 324.0 Bob Gray $24.012 TRUE
3: 2000 John Stan $204.0
4: 9000 Stace Jill - FALSE
5: 850.0 Till $30 TRUE
A field such as a user's comments would include commas, so these are escaped via single- or double-quotes. Excel opens these just fine and can be used to cleanse or manipulate the data before importing.
The easiest thing for me from a migration standpoint was to just get the data into the SQL Server as varchars first, then use SQL to manipulate the data into its target destination format.
I did run into the following problems:
1) Trying to import the CSV can cause issues. SQL Server Management Studio's import expects a strictly formatted CSV, meaning something like a comments column or numbers formatted as currency in text could cause imports to fail.
2) When saving the CSV as XLS, SQL Server Management Studio still seems to try and be "smart" about how it interprets the data, regardless of however it was formatted. Sometimes, data cannot be converted to nvarchar or varchar even if you desire that, because the import utility already assumes the data is numeric. Tab-delimited can end up not working as well, especially for something like user comments.
What is an error-free method of importing CSV to SQL Server, making all columns varchar or nvarchar?
One solution is to use Excel's Data -> Text to Columns feature with "Delimited" selected and no delimiters checked. Excel only lets you do this one column at a time, but the resulting XLS reads into SQL Server just fine, as all nvarchars.
Revising this solution further, you can create the following macro and save it into PERSONAL.XLSB so that it is available in all future worksheets. By mapping the macro to a key combination, you can select a cell, and the macro will select its column and run the Text to Columns function for you:
Sub ColumnToNVarChar()
'
' ColumnToNVarChar Macro
' Convert a column in Excel to a format that SQL Server Management Studio's import process will interpret as nvarchar.
'
' Keyboard Shortcut: Ctrl+d
'
    ActiveCell.EntireColumn.Select
    Selection.TextToColumns Destination:=ActiveCell.EntireColumn, DataType:=xlDelimited, _
        TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
        Semicolon:=False, Comma:=False, Space:=False, Other:=False, FieldInfo _
        :=Array(1, 2), TrailingMinusNumbers:=True
End Sub
You then save this as an XLS file and SQL Server Management Studio's "Import Data" process will treat every column as an nvarchar. Usually nvarchar(255).
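If you'd rather skip Excel entirely, an alternative sketch (my own, with a hypothetical helper name) is to generate an all-NVARCHAR(255) staging table from the CSV header and load into that, mirroring the column types the wizard produces for the macro-converted XLS:

```python
import csv

def all_nvarchar_ddl(table, header_line):
    """Build a CREATE TABLE statement in which every column from the
    CSV header becomes NVARCHAR(255), so no value can be coerced to a
    numeric type during import."""
    columns = next(csv.reader([header_line]))
    column_defs = ",\n    ".join(f"[{c}] NVARCHAR(255)" for c in columns)
    return f"CREATE TABLE [{table}] (\n    {column_defs}\n);"
```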

Manual import into SQL Server 2000 of tab delimited text file does not format international characters

I have searched for this specific solution and while I have found similar queries, I have not found one that solves my issue. I am manually importing a tab-delimited text file of data that contains international characters in some fields.
This is one such character: Exhibit Hall C–D
It's either an em dash or an en dash between the C & D. It copies and pastes fine, but when the data is taken into SQL Server 2000, it ends up looking like this:
Exhibit Hall C–D
The field is nvarchar and like I said, I am doing the import manually through Enterprise Manager. Any ideas on how to solve this?
The problem is that the encoding between the import file and SQL Server is mismatched. The following approach worked for me in SQL Server 2000 importing into a database with the default encoding (SQL_Latin1_General_CP1_CI_AS):
Open the .csv/.tsv file with the free text editor Notepad++, and ensure that special characters appear normal to start with (if not, try Encoding|Encode in...)
Select Encoding|Convert to UCS-2 Little Endian
Save as a new .csv/.tsv file
In SQL Server Enterprise Manager, in the DTS Import/Export Wizard, choose the new file as the data source (source type: Text File)
If not automatically detected, choose File type: Unicode (in preview on this page, the unicode characters will still look like black blocks)
On the next page, Specify Column Delimiter, choose the correct delimiter. Once chosen, Unicode characters should appear correctly in the Preview pane
Complete import wizard
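The Notepad++ conversion step can also be done with a short script; here is a sketch (file paths and the source encoding are assumptions on my part):

```python
def encode_utf16le(text):
    """Encode text as UTF-16 LE with a byte-order mark -- the encoding
    Notepad++ labels 'UCS-2 Little Endian', which the DTS wizard
    detects as Unicode."""
    return "\ufeff".encode("utf-16-le") + text.encode("utf-16-le")

def convert_file(src_path, dst_path, src_encoding="utf-8"):
    """Read the original delimited file and write the re-encoded copy;
    choose that copy as the data source in the wizard."""
    with open(src_path, encoding=src_encoding) as f:
        data = encode_utf16le(f.read())
    with open(dst_path, "wb") as f:
        f.write(data)
```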
I would try using the bcp utility (http://technet.microsoft.com/en-us/library/ms162802(v=sql.90).aspx) with the -w parameter.
You may also want to check the text encoding of the input file.