I have a text file (.txt) containing formatted text (just line breaks, carriage returns and tabs).
It also contains German language characters.
I want to use the BULK INSERT command in T-SQL to read the text file into one field of a database table.
I ran this command:
CREATE TABLE #MyTestTable (
MyData NVARCHAR(MAX)
)
BULK INSERT [#MyTestTable]
FROM 'D:\MyTextFile.txt'
SELECT * FROM #MyTestTable
The problem is that it reads each line of the text file into a new row in the Temp table. I want it to read the whole file (formatting and all) into one row.
Also the German language characters appear to be lost - replaced by a non-printable character default in the Results View.
Anyone any ideas how I can achieve this?
Thanks.
You can use the ROWTERMINATOR and CODEPAGE parameters. The default row terminator is '\r\n'. For CODEPAGE, you need to know the encoding of your raw file and the default collation of your database.
BULK INSERT [#MyTestTable]
FROM 'D:\MyTextFile.txt'
WITH (ROWTERMINATOR = '\0',
CODEPAGE = 'ACP')
Also see http://msdn.microsoft.com/en-us/library/ms188365.aspx
Use this:
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n'
Where | is your column delimiter.
Don't use BULK INSERT. It is made to take one record per line. You need to write code.
Handle the transition from your text file to Unicode (NVARCHAR) properly in code. BULK INSERT probably applied the standard code page, losing your characters.
This really cries out for a minor programming job: an hour of work or so, plus another for testing, plus however long it takes to run.
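A minimal sketch of the "write code" approach, in Python. It assumes the file is encoded as Windows-1252 (an assumption; check the real encoding of your file), and `read_whole_file` is a hypothetical helper name. It only covers reading the file into one Unicode string; passing that string as a single parameter to an INSERT is left to whichever database driver you use.

```python
from pathlib import Path


def read_whole_file(path: str, encoding: str = "cp1252") -> str:
    """Return the entire file as one Unicode string.

    The bytes are decoded explicitly, so line breaks, carriage returns and
    tabs are preserved exactly and no default code page is guessed.
    The cp1252 default is an assumption, not something the file guarantees.
    """
    return Path(path).read_bytes().decode(encoding)
```

The returned value can then be bound to an NVARCHAR(MAX) parameter, which keeps the whole file in one row.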
Related
I’m trying to export a table to Excel/CSV, but I’m having trouble with one long column that has been concatenated using “char(10) + char(13)” as the delimiter for new lines. When I copy all the data from SQL Server Management Studio and use “Save As” to create a CSV file, the output is broken: wherever a new line is used, the output stretches over more than one row and breaks the column positions.
I also tried the export wizard (I don’t know whether it makes a difference), but without success: the export keeps failing on the last step, with a warning of “potential lost conversion from nvarchar to longtext” and an error of “data conversion failed ..”.
To allow multiline fields in csv, those fields have to be enclosed in quotes:
123,"multiline
field",456
789,second record,147
If this is not the case in your generated csv you might have to tell the generator to quote the fields.
If the quotes are already there the csv is valid and any decent reader should take care of those multiline fields. Of course, if you open the file in Notepad you'll still see multiple lines per record, which is normal.
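As an illustration of a reader that handles quoted multiline fields, here is a small Python sketch using the standard `csv` module on the sample above. The helper name `parse_csv` is hypothetical.

```python
import csv
import io

# The sample from above: the second field of the first record spans two lines.
sample = '123,"multiline\nfield",456\n789,second record,147\n'


def parse_csv(text: str) -> list:
    # newline="" disables newline translation, so the csv module itself
    # decides which line breaks are inside quotes and which end a record.
    return list(csv.reader(io.StringIO(text, newline="")))


rows = parse_csv(sample)
print(len(rows))    # number of records, not number of physical lines
print(rows[0][1])   # the multiline field, line break included
```

The file has three physical lines but only two records, which is exactly what a quote-aware reader should report.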
To avoid such issues, you need to clean the data by replacing the carriage return (char(13)) and line feed (char(10)) in your SELECT statement using the following query:
SELECT replace(replace([ColumnName], char(10), ''), char(13), '')
FROM [dbo].[yourTableName]
I'm importing a CSV using the SQL Server bulk option; my SQL inputs are below.
MAXERRORS = 1000000,
CODEPAGE = 1251,
FIELDTERMINATOR = '~%',
ROWTERMINATOR = '0x0a',
ERRORFILE = 'C:\MyFile_BadData.log'
My problem is that BULK INSERT fails to load the last row of data.
Please also note that no errors were reported by the SQL bulk option.
If I add an empty newline to the end of the file, the load works without any issues.
My concern is that I cannot modify the CSV file. Any suggestions would be appreciated.
This happens when the last line doesn't end with the row terminator. Make sure the last line ends with the row terminator, then the last row will be imported.
If you can't change the export routine that generates the CSV, use PowerShell or something similar to add the row terminator to the CSV. If you can't change the original, copy it to a location where you can change it (include that step in your PowerShell script).
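The copy-then-append idea can be sketched in Python as well. This assumes the row terminator is a single line feed (matching ROWTERMINATOR = '0x0a' in the question); the function name and both paths are hypothetical.

```python
import shutil


def copy_with_trailing_terminator(src: str, dst: str, terminator: bytes = b"\n") -> bool:
    """Copy src to dst and make sure dst ends with the row terminator.

    Returns True if a terminator had to be appended. The original file is
    never modified.
    """
    shutil.copyfile(src, dst)
    with open(dst, "rb+") as f:
        f.seek(0, 2)                      # jump to the end of the file
        size = f.tell()
        if size == 0:
            return False                  # empty file: no last row to terminate
        f.seek(max(0, size - len(terminator)))
        if f.read() == terminator:
            return False                  # already terminated
        f.seek(0, 2)
        f.write(terminator)
        return True
```

BULK INSERT would then be pointed at the copy rather than at the original file.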
Exactly! The last character in the file must be some kind of special character; Line Break, Space, etc. Can you simply delete the last line and do the import? If you need an automated solution, create a small C# app to open the file, delete the last line, and then save the file. THEN, run your import process. This can all be controlled using the Windows Task Scheduler, so you can even run it as an overnight job, when you are far away from your computer.
https://www.digitalcitizen.life/how-create-task-basic-task-wizard
If you're on a Unix variant, you can also use this to add to the file without having to edit it by hand.
sed -i '' -e '$a\' fname.csv
Note that this will only add a newline if there isn't one. So running it multiple times will not affect a file that already has a trailing newline.
I have data in the csv file similar to this:
Name,Age,Location,Score
"Bob, B",34,Boston,0
"Mike, M",76,Miami,678
"Rachel, R",17,Richmond,"1,234"
While trying to BULK INSERT this data into a SQL Server table, I encountered two problems.
If I use FIELDTERMINATOR=',' then it splits the first (and sometimes the last) column
The last column is an integer column but it has quotes and comma thousand separator whenever the number is greater than 1000
Is there a way to import this data (using XML Format File or whatever) without manually parsing the csv file first?
I appreciate any help. Thanks.
You can parse the file with http://filehelpers.sourceforge.net/
With that result, use the approach described in "SQL Bulkcopy YYYYMMDD problem", or feed it straight into SqlBulkCopy.
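If a .NET library is not an option, the same parsing step can be sketched with Python's standard `csv` module (a swapped-in technique, not FileHelpers). The sketch assumes the four-column layout from the question and a pipe as the output delimiter; `normalize_csv` is a hypothetical name.

```python
import csv
import io


def normalize_csv(text: str, out_delimiter: str = "|") -> str:
    """Re-emit quoted CSV as unquoted, pipe-delimited text.

    Thousands separators are stripped from the last column so it can be
    loaded into an integer column.
    """
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO()
    # QUOTE_NONE with no escapechar makes the writer raise csv.Error if a
    # field contains the output delimiter, instead of silently corrupting it.
    writer = csv.writer(out, delimiter=out_delimiter,
                        quoting=csv.QUOTE_NONE, lineterminator="\n")
    writer.writerow(next(reader))         # header row, unchanged
    for row in reader:
        if not row:
            continue                      # skip blank lines
        name, age, location, score = row
        writer.writerow([name, age, location, score.replace(",", "")])
    return out.getvalue()
```

The output can be loaded with FIELDTERMINATOR = '|' and FIRSTROW = 2, since no field contains quotes or the delimiter any more.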
Use MySQL load data:
LOAD DATA LOCAL INFILE 'path-to-/filename.csv' INTO TABLE `sql_tablename`
CHARACTER SET 'utf8'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
IGNORE 1 LINES;
The OPTIONALLY ENCLOSED BY '\"' clause (escape character plus quote) keeps the quoted data in the first column together as one field.
IGNORE 1 LINES leaves out the row of field names.
The CHARACTER SET 'utf8' line is optional, but good to use if names have diacritics, as in José.
I am currently working on a project that requires data from a report generated by third party software to be inserted into a local SQL database. So far I have the data stored as a tab delimited .txt file and the following bulk insert SQL statement:
BULK INSERT ExampleTable
FROM 'c:\temp\Example.txt'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n'
)
GO
The two problems I am encountering are quotation marks around any value that includes its own comma, and dollar signs in every field that has a dollar amount.
For instance one of the columns of the table is a description field and some of the values come out looking like:
"this is an example description, some more information, I don't know why the author would use commas in the first place here"
I don't care about the description field nearly as much as the other fields that include dollar amounts. Each of these fields is already prefixed with a $ sign, so I have to define them as nvarchar instead of decimal or float, which would be A LOT more useful for reporting. Furthermore, when the dollar amount is greater than 1000, the field also contains a comma, and thus quotation marks, e.g. "$1,084.59"
I am familiar with SSMS, but I have never made a format or bcp file (the solutions I have found online).
Any help would be greatly appreciated.
You can use a format file, but only if your metadata remains constant, which it does not appear to be in your case. You state that the dollar amounts are enclosed in quotes only when they exceed 999 and the comma is inserted. A format file would allow you to define per column delimiters such as [,] or [","]. But if that delimiter is shifting throughout your file, you will have to pre-process the file. Text qualifiers themselves are not supported.
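One possible shape for the pre-processing step, sketched in Python: strip the optional quotes, the dollar sign and the thousands separators from a money field before loading. `parse_money` is a hypothetical helper, and the sketch assumes plain positive amounts like the ones in the question.

```python
from decimal import Decimal


def parse_money(raw: str) -> Decimal:
    """Convert values such as '$12.00' or '"$1,084.59"' to Decimal.

    Assumes a leading dollar sign, commas as thousands separators and
    optional surrounding double quotes; other formats are not handled.
    """
    cleaned = raw.strip().strip('"').replace("$", "").replace(",", "")
    return Decimal(cleaned)
```

For example, `parse_money('"$1,084.59"')` yields `Decimal("1084.59")`, which can be written back to the file and loaded into a decimal column.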
For reference:
CSV import in SQL Server 2008
http://jessesql.blogspot.com/2010/05/bulk-insert-csv-with-text-qualifiers.html
I don't see why, but ThiefMaster deleted my answer :-(
It was probably a mistake and he did not check the link, as this link is the full answer to your question. I will try again for the last time here...
Tip: if your CSV file doesn't have a consistent format, for example if IN THE SAME COLUMN some of the values are double-quoted and some are not, then this blog post will help you handle it in an easy way (using OPENROWSET in the last step makes it one simple query): http://ariely.info/Blog/tabid/83/EntryId/122/Using-Bulk-Insert-to-import-inconsistent-data-format-using-pure-T-SQL.aspx
There is a new WIKI at: http://social.technet.microsoft.com/wiki based on this blog if you prefer to read from Microsoft site.
I extracted some 10 tables in CSV with " as the text qualifier. Problem is my extract does not look right in Excel because of special characters in a few columns. Some columns are breaking into a new row when it should stay in the column.
I've been doing it manually using the Management Studio export feature, but what's the best way to extract the 10 tables to CSV with the double quote qualifier using a script?
Will I have to escape commas and double quotes? Best way to do this?
How should I handle newline codes in my columns? We need them for migration to a new system, but the PM wants to open the files and make modifications using Excel. Can they have it both ways?
I understand that much of the problem is that Excel is interpreting the file where a load utility into another database might not do anything special with new line, but what about double quotes and commas in the data, if I don't care about excel, must I escape that?
Many Thanks.
If you are using SQL Server 2005 or later, the export wizard will export the Excel file for you.
Right click the database, select Tasks-> Export Data...
Set the source to be the database.
Set the destination to Excel.
At the end of the wizard, select the option to create an SSIS package. You can then create a job to execute the package on a schedule or on demand.
I'd suggest never using commas for your delimiter - they show up too frequently in other places. Use a tab, since a tab isn't too easy to include in Excel tables.
Make sure you never start a field with a space unless you want that space in the field.
Try changing the line feeds in your text into the literal text \n. That is:
You might have:
0,1,"Line 1
Line 2", 3
I suggest you want:
0 1 "Line 1\nLine 2" 3
(assuming the spacing between lines are tabs)
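A sketch of that substitution in Python, with hypothetical helper names. Backslashes are doubled first so that the transformation can be reversed on the receiving side; note that all line-break styles come back as a plain line feed.

```python
def escape_newlines(value: str) -> str:
    """Replace real line breaks with the two-character text \\n."""
    value = value.replace("\\", "\\\\")      # protect existing backslashes
    for nl in ("\r\n", "\r", "\n"):          # CRLF first, then lone CR / LF
        value = value.replace(nl, "\\n")
    return value


def unescape_newlines(value: str) -> str:
    """Reverse escape_newlines."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append("\n" if nxt == "n" else nxt)
        else:
            out.append(ch)
    return "".join(out)
```

With this, each record stays on one physical line for Excel, and the target system can restore the line breaks during migration.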
Good luck
As far as I know, you cannot have a new line in CSV columns. If you know a column could contain a comma, double quotes or a new line, you can use this SQL statement to extract the value as valid CSV:
SELECT '"' + REPLACE(REPLACE(REPLACE(CAST([yourColumnName] AS VARCHAR(MAX)), '"', '""'), char(13), ''), char(10), '') + '"' FROM yourTable
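If the extract is produced by a script instead of a query, the same quote-doubling rule can be left to a CSV library. The Python sketch below quotes every field and doubles embedded double quotes; unlike the query above, it keeps line breaks inside the quoted field instead of stripping them, which is an assumption about what the target system wants. `to_csv` is a hypothetical name.

```python
import csv
import io


def to_csv(rows) -> str:
    """Serialize rows with every field quoted and embedded quotes doubled."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
    writer.writerows(rows)
    return buf.getvalue()
```

Commas need no escaping once the field is quoted; only the double quote itself is doubled.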