Importing a *.DAT file into SQL Server - sql-server

I am trying to import a *.DAT file (as a flat file source) into SQL Server using the SQL Server Import and Export Wizard. The file uses DC4 as its delimiter, which causes an error when the wizard tries to separate the columns and their respective data during the import.
Are there any settings I should change during the import process?

If you don't have to use the wizard, you can script it like:
BULK INSERT [your_database].[your_schema].[your_table]
FROM 'your file location.dat'
WITH (ROWTERMINATOR = '0x14'    -- DC4 character (ASCII 20)
     ,FIELDTERMINATOR = 'þ'     -- thorn character separating the fields
     ,MAXERRORS = 0             -- abort the load on the first bad row
     ,TABLOCK                   -- table lock for a faster bulk load
     ,CODEPAGE = 'RAW'          -- no code page conversion; bytes load as-is
     )
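If you're unsure which control character the file actually uses, you can peek at the raw bytes from T-SQL before choosing terminators. A minimal sketch (the path is hypothetical; SINGLE_BLOB reads the whole file as one varbinary value):

SELECT CAST(BulkColumn AS varbinary(256)) AS first_bytes
FROM OPENROWSET(BULK 'C:\data\yourfile.dat', SINGLE_BLOB) AS f;

Look for 0x14 (DC4) and 0xFE (þ in single-byte code pages) in the hex output to confirm the delimiters before running the bulk load.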

The wizard uses SSIS under the hood. Instead of executing the package directly, choose CrLF as the row delimiter, then choose to save the package as a file. Open the file and edit it with any text editor; it's a simple XML file.
It's not clear whether 0x14 is the column delimiter or the row delimiter. Assuming it's the row delimiter,
replace all instances of
Delimiter="_x000D__x000A_"
with
Delimiter="_x0014_"
There are two instances: DTS:HeaderRowDelimiter and DTS:ColumnDelimiter.
Save the file and execute it with a double-click or "Open with: Execute Package Utility". I tested this solution on my PC using an account with limited permissions.

Related

Import csv file via SSMS where text fields contain extra quotes

I'm trying to import a customer CSV file via the SSMS Import Wizard. The file contains 1 million rows, and I'm having trouble importing rows where a field has extra quotes, e.g. the file has been populated freehand, so it could contain anything.
Name, Address
"John","Liverpool"
"Paul",""New York"""
"Ringo","London|,"
"George","India"""
Before I press on looking into SSIS: should SSMS 2016 handle this now, or do I have to do it in SSIS? It is a one-off load to check something.
In the SSMS Import/Export Wizard, when configuring the Flat File Source, you have to set:
Text Qualifier = "
Column Delimiter = ,
This will import the file as the following:
Name     Address
John     Liverpool
Paul     "New York""
Ringo    London|,
George   India
The remaining double quotes must be removed with SQL after the import is done, or you can create an SSIS package manually in Visual Studio and add a transformation to clean the data.
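For the leftover quotes, a post-load cleanup in T-SQL is usually enough. A minimal sketch, assuming the wizard loaded the rows into a staging table (the table and column names here are hypothetical):

UPDATE dbo.CustomerStaging
SET [Name]    = REPLACE([Name], '"', ''),
    [Address] = REPLACE([Address], '"', '');

This is a blunt instrument: it strips every remaining double quote, which is fine for freehand data like this but not if legitimate embedded quotes must be preserved.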

Error importing data from CSV with OpenRowset in SQL Server - Mysterious value of "S7"

I have a file dump which needs to be imported into SQL Server on a daily basis, and I have created a scheduled task to do this unattended. All the CSV files are delimited by ',' and are Windows CR/LF files encoded as UTF-8.
To import data from these CSV files, I mainly use OpenRowset. It worked well until I ran into a file containing the value "S7". If a file contains the value "S7", that column gets inferred as a numeric datatype during the OpenRowset import, which makes the other alphabetic values in the column fail to import, leaving only NULL values.
This is what I have tried so far:
Using IMEX=1: openrowset('Microsoft.ACE.OLEDB.15.0','text;IMEX=1;HDR=Yes;
Using text driver: OpenRowset('MSDASQL','Driver=Microsoft Access Text Driver (*.txt, *.csv);
Using Bulk Insert with or without a format file.
The interesting part is that if I use Bulk Insert, it gives me a warning about an unexpected end of file. To solve this, I tried various row terminators such as '0x0a', '\n', and '\r\n', as well as leaving it unspecified, but they all failed. Finally I managed to import some of the records using a row terminator of ',\n'. However, the original file contains roughly 1000 records and only 100 were imported, without any errors or warnings.
Any tips or helps would be much appreciated.
Edit 1:
The file ends with a newline character, as far as I can tell from Notepad++. I managed to import the files that gave the unexpected-end-of-file error by removing the last record from each of them. However, even with this method I still cannot import all the records; only a portion of them come through.
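One thing worth knowing about the ACE and Access Text drivers used above: they infer each column's type by sampling the first rows, and values that don't fit the guessed type come back as NULL, which matches the "S7" symptom. You can override the guess with a schema.ini file placed in the same folder as the CSV. A minimal sketch, assuming the file is C:\data\dump.csv and the problem column is the third of three (the path and column names are hypothetical; when you declare columns in schema.ini it is safest to declare all of them):

[dump.csv]
ColNameHeader=True
Format=CSVDelimited
Col1=Id Text
Col2=Amount Text
Col3=StatusCode Text

SELECT *
FROM OPENROWSET('MSDASQL',
    'Driver={Microsoft Access Text Driver (*.txt, *.csv)};DefaultDir=C:\data;',
    'SELECT * FROM dump.csv');

With every column declared as Text, nothing is inferred as numeric and values like "S7" survive; genuinely numeric columns can be CAST after loading.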

Tilde (~) Delimited File Read in SSIS

I'm trying to load a tilde (~) delimited .DAT file into a SQL Server DB using SSIS. When I use a Flat File Source to read the file, I don't see an option for a ~ delimiter. I'm pasting a row from my file below:
7318~97836: LRX PAIN MONTHLY DX~001~ALL OTHER NSAIDs~1043676~001~1043676~001~OSR~401~01~ORALS,SOL,TAB/CAP RE~156720~50MG~ANSAID~100 0170-07
Here, I need the data split into columns at each ~, i.e.
Column 1 should have '7318', and Column 2 should have '97836: LRX PAIN MONTHLY DX'.
Can someone help me with this? Can this be done using a Flat File Source, or do I need to use a Script Task?
Sure you can; you just need to configure the "Column delimiter" property in the "Flat File Connection Manager Editor". There are some predefined choices there, but you can click into the field and type any separator you want.
After that, click "Refresh" and then "OK".
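If the load doesn't have to run in SSIS, a plain T-SQL bulk load handles a custom delimiter just as well. A minimal sketch, assuming a target table that already matches the file's columns (the table and path names are hypothetical):

BULK INSERT dbo.LrxPainMonthly
FROM 'C:\data\yourfile.dat'
WITH (FIELDTERMINATOR = '~'   -- tilde as the column delimiter
     ,ROWTERMINATOR = '\n'    -- one record per line
     ,TABLOCK
     );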

Manual import into SQL Server 2000 of tab delimited text file does not format international characters

I have searched for this specific solution and while I have found similar queries, I have not found one that solves my issue. I am manually importing a tab-delimited text file of data that contains international characters in some fields.
This is one such character: Exhibit Hall C–D
It's either an em dash or an en dash between the C & D. It copies and pastes fine, but when the data is taken into SQL Server 2000, it ends up looking like this:
Exhibit Hall C–D
The field is nvarchar and like I said, I am doing the import manually through Enterprise Manager. Any ideas on how to solve this?
The problem is that the encoding between the import file and SQL Server is mismatched. The following approach worked for me in SQL Server 2000, importing into a database with the default collation (SQL_Latin1_General_CP1_CI_AS):
Open the .csv/.tsv file with the free text editor Notepad++, and ensure that special characters appear normal to start with (if not, try Encoding|Encode in...)
Select Encoding|Convert to UCS-2 Little Endian
Save as a new .csv/.tsv file
In SQL Server Enterprise Manager, in the DTS Import/Export Wizard, choose the new file as the data source (source type: Text File)
If not automatically detected, choose File type: Unicode (in preview on this page, the unicode characters will still look like black blocks)
On the next page, Specify Column Delimiter, choose the correct delimiter. Once chosen, Unicode characters should appear correctly in the Preview pane
Complete import wizard
I would try using the bcp utility ( http://technet.microsoft.com/en-us/library/ms162802(v=sql.90).aspx ) with the -w parameter.
You may also want to check the text encoding of the input file.
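If you can use T-SQL instead of the DTS wizard, BULK INSERT has a wide-character mode that is the counterpart of bcp's -w switch. A minimal sketch, assuming the UCS-2 file saved above and hypothetical table and path names:

BULK INSERT dbo.Events
FROM 'C:\data\events_ucs2.tsv'
WITH (DATAFILETYPE = 'widechar'   -- read the file as UTF-16/UCS-2
     ,FIELDTERMINATOR = '\t'
     ,ROWTERMINATOR = '\n'
     );

The target columns need to be nchar/nvarchar for the characters to survive the load.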

Talend: Write data to PostgreSQL database error

I am trying to write data from a .csv file to my PostgreSQL database. The connection is fine, but when I run my job I get the following error:
Exception in component tPostgresqlOutput_1
org.postgresql.util.PSQLException: ERROR: zero-length delimited identifier at or near """"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:328)
at talend_test.exporttoexcel_0_1.exportToExcel.tFileInputDelimited_1Process(exportToExcel.java:568)
at talend_test.exporttoexcel_0_1.exportToExcel.runJobInTOS(exportToExcel.java:1015)
at talend_test.exporttoexcel_0_1.exportToExcel.main(exportToExcel.java:886)
My job is very simple:
tFileInputDelimited -> tPostgresqlOutput
I think the error means that the double quotes should be single quotes ("" -> ''), but how can I edit this in Talend?
Or is there another reason?
Can anyone help me with this one?
Thanks!
If you are using the customer.csv file from the repository, then you have to change the properties of the customer file by clicking through Metadata -> File delimited -> customer in the Repository pane.
You should be able to right-click the customer file and then choose "Edit file delimited". On the third screen, if the file extension is .csv, you have to select CSV options under the escape char settings. Typical CSV files (as produced by Excel and other programs) use "\"" as the escape character and "\"" as the text enclosure.
You should also check that the encoding is set to UTF-8 in the file settings. You can then refresh the preview to view a sample of your file in table format. If this matches your expectations of the data, save the metadata entry and propagate the update to your jobs.
If your file is not in the repository, click the component that reads your file and make all of the above CSV configuration changes in the component's Basic settings.
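For reference, the PostgreSQL error itself means the generated SQL contained an empty quoted identifier ("" is a zero-length delimited identifier), which is typically what a stray quote or an empty column name in the header row produces once Talend builds the INSERT statement. A minimal repro, run directly against PostgreSQL (the table name is hypothetical):

-- An empty column name ends up as "" in the generated statement:
INSERT INTO customer ("") VALUES ('x');
-- ERROR: zero-length delimited identifier at or near """"

Once the escape/enclosure settings (or the header row) are fixed so that no column name comes through empty, the error disappears.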
