I am importing a large file of 9.5 million records into MongoDB using mongoimport. The CSV file was originally created for SQL Server import operations, and because the data in the SQL table itself contained quotes ("), the import into SQL was generating errors. That is why the file uses the pipe (|) as the text qualifier and as the column delimiter.
With mongoimport I can't find any option for setting the column delimiter or the text qualifier myself.
The screenshot contains the error message
You cannot. Since the file uses | as the delimiter, it is no longer valid CSV.
Your options are to export the data again as JSON, TSV, or proper CSV and then use mongoimport, or to do the import programmatically. The programmatic route is a perfectly valid option and will be fast, since mongo doesn't wait until each document is inserted.
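If you go the programmatic route, one simple option is to rewrite the file as TSV first and then run mongoimport with --type tsv --headerline. Below is a minimal sketch, assuming the pipe acts as the field separator and any qualifier pipes simply wrap each value; the file names are placeholders.

```csharp
// Minimal sketch: rewrite the pipe-delimited export as TSV so that
// mongoimport can load it with --type tsv --headerline.
// Naive: assumes values contain no tabs and no embedded pipes.
using System.IO;
using System.Linq;

class PipeToTsv
{
    static void Main()
    {
        // Hypothetical file names; point these at your real export and output.
        using (var reader = new StreamReader("sqlserver_export.csv"))
        using (var writer = new StreamWriter("mongo_import.tsv"))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Split on the pipe delimiter, then strip any pipe text qualifier
                // still wrapped around individual fields.
                var fields = line.Split('|').Select(f => f.Trim().Trim('|'));
                writer.WriteLine(string.Join("\t", fields));
            }
        }
    }
}
```

Once the file is tab-separated, mongoimport needs no custom delimiter settings at all.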
I have issues importing a binary .dat file that was generated with the command-line tool bcp. Besides the unhelpful error messages I get ("invalid field size for datatype", with no mention of which column or datatype), I have nothing to go on. I did not generate the bcp file, so there is that.
It's a large file and I have not found any means to view its contents!
Is it somehow possible to convert the binary .dat file into a flat text .csv file?
Is it possible to just extract the first couple of rows into clear text?
Thank you for any hints.
I have an Azure Logic App in which I execute SQL Select queries, convert the output of the queries into CSV files, and mail them. The SQL view contains column names with special characters such as +, %, () or whitespace, but after the CSV file is created from the query result, the special characters in the column names are replaced with unknown values. Example:
a column name "total(value + value)" is replaced in the CSV with
"total_x005c_value_x0027_value_x005c_". Please help me with this; I want the exact column names in the CSV file as well.
Thanks.
You get this because the source data in SQL Server is UCS-2, and the Logic App does not support decoding it.
The simplest way is to set custom headers when you create the CSV table.
However, if you have many columns that becomes hard to maintain, so you could use an Azure Function to convert the JSON data instead: pass the data to an HTTP-trigger Function, convert it there, return it to the Logic App, and then create the CSV.
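One possible shape for that conversion (a sketch only, with made-up class and method names): the _xNNNN_ pattern in the mangled headers matches XML-style name escaping, which System.Xml.XmlConvert.DecodeName can reverse, so an HTTP-triggered Function can parse the incoming JSON rows, rewrite the property names, and hand the result back to the Logic App.

```csharp
// Sketch: rewrite the property names of each row in a JSON array by
// reversing _xNNNN_ escapes (e.g. _x0028_ becomes "(").
// Drop the method into an HTTP-triggered Azure Function and return the
// result to the Logic App; the type and method names are illustrative.
using System.Xml;
using Newtonsoft.Json.Linq;

static class ColumnNameDecoder
{
    public static JArray DecodeColumnNames(string json)
    {
        var rows = JArray.Parse(json);
        var result = new JArray();

        foreach (JObject row in rows)
        {
            var fixedRow = new JObject();
            foreach (var prop in row.Properties())
            {
                // XmlConvert.DecodeName turns the escaped name back into the
                // original text with its special characters.
                fixedRow[XmlConvert.DecodeName(prop.Name)] = prop.Value;
            }
            result.Add(fixedRow);
        }
        return result;
    }
}
```

The Logic App can then feed the decoded JSON into its Create CSV table step as before.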
I'm working on a task of importing a CSV file into a SQL Server table. I'm using the bcp tool, as my data can be large. The issue I'm facing with bcp is that the table I'm going to import the CSV into can have a mix of data types like date, int, etc., and if I use bcp in native mode (-n), it needs a native bcp file as input, but what I have is a CSV file.
Is there any way to convert a CSV file into a bcp file? Or,
How can I import a CSV file into a SQL Server table given that my table columns can have any data type, not just character types?
Had all columns been of character type, I would have used the bcp tool with the -c option.
Actually... the safest thing to do when importing data, especially in bulk like this, is to import it into a staging table first, where all of the fields are strings/varchars. That allows you to scrub and validate the data and make sure it's safe for consumption. Once you've verified it, move/copy it to your production tables, converting it to the proper types as you go. That's typically what I do when dealing with imported data.
A CSV file is just a text file delimited by commas. With regard to importing text files, there is no such thing as a 'BCP file'. BCP has an option to work with native SQL data (unreadable to the human eye in a text editor), but the default is to work with text, the same as what you have in your CSV file. No conversion is needed; when working with textual data it's just an ASCII text file.
Whoever created the text file has already completed a conversion from their native datatypes into text. As others have suggested, you will save yourself some pain later if you just load the textual CSV data file into a "load" table of all VARCHAR fields. From that load table you can then manipulate the data into whatever datatypes you require in your final destination table. Better to do this than to make SQL do implicit conversions by having BCP insert data directly into the final destination table.
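As a sketch of that load-table approach (the connection string, file name, and table name are placeholders, and the CSV parsing is a deliberately naive Split that ignores quoted commas), you can bulk-load the raw text into an all-VARCHAR staging table with SqlBulkCopy and then convert types on the server afterwards, e.g. with an INSERT ... SELECT that CASTs or TRY_CONVERTs each column:

```csharp
// Sketch: load the raw CSV into an all-VARCHAR staging table, then convert
// types in T-SQL afterwards. All names below are placeholders.
using System.Data;
using System.Data.SqlClient;
using System.IO;

class CsvToStaging
{
    static void Main()
    {
        var table = new DataTable();

        using (var reader = new StreamReader("input.csv")) // hypothetical file
        {
            // First line holds the column names; create every column as text.
            foreach (var header in reader.ReadLine().Split(','))
                table.Columns.Add(header, typeof(string));

            string line;
            while ((line = reader.ReadLine()) != null)
                table.Rows.Add(line.Split(',')); // naive: no quoted commas
        }

        using (var bulk = new SqlBulkCopy("Server=.;Database=MyDb;Integrated Security=true"))
        {
            bulk.DestinationTableName = "dbo.Staging_Import"; // all-VARCHAR table
            bulk.WriteToServer(table);
        }
    }
}
```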
How can I load BAI2 file to SSIS?
BAI2 is an industry-standard format used by banks. Below is a truncated example:
01,021000021,CST_USER,110520,1610,1627,,,2/
02,CST_USER,089900137,1,110509,1610,,2/
03,000000370053368,USD,010,782711622,,,015,762000812,,,040,760753198,,/
88,043,760000052,,,045,760010026,,,050,760000040,,,055,760000045,,/
Use a Flat file connection manager
I think you can import these files using a flat file connection manager, because they are similar to comma-separated text. Try changing the row delimiter and column delimiter properties to find the appropriate ones.
From the example you mentioned, I think you should use:
, as Column delimiter
/ as Row delimiter
To learn more about how to interpret a BAI2 file check the following link:
EBS – How to interpret a BAI2 file
Based on this link:
The BAI2 file is a plain text file (.TXT Format), which contains values / texts one after the other.
Because the number of columns is not fixed across all rows, you must define only one column (DT_STR,4000) in the flat file connection manager and split the columns using a Script Component (a minimal sketch follows the links below):
SSIS ragged file not recognized CRLF
how to check column structure in ssis?
SSIS : Creating a flat file with different row formats
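A minimal sketch of such a Script Component follows; the input column name (Line) and the output column names (RecordCode, Field1, Field2) are placeholders that you would define yourself on the component's input and Output 0, with one output column per BAI2 field you need.

```csharp
// Sketch of the Script Component body. Paste into the component's ScriptMain;
// UserComponent and Input0Buffer are generated for you by SSIS. "Line" is the
// single DT_STR,4000 input column; the output columns are placeholders.
public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Strip the trailing "/" record terminator, then split the BAI2 fields.
        string[] fields = Row.Line.TrimEnd('/').Split(',');

        Row.RecordCode = fields[0];                                // 01, 02, 03, 88 ...
        Row.Field1 = fields.Length > 1 ? fields[1] : string.Empty;
        Row.Field2 = fields.Length > 2 ? fields[2] : string.Empty;
    }
}
```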
Helpful links
SQL SERVER – Import CSV File into Database Table Using SSIS
Importing Flat Files with Inconsistent Formatting Using SSIS
SSIS Lesson 2: First Package
Has anyone been able to get a variable record length text file (CSV) into SQL Server via SSIS?
I have tried time and again to get a CSV file into a SQL Server table using SSIS, where the input file has varying record lengths. For this question, the two different record lengths are 63 and 326 bytes. All records will be imported into the same 326-byte-wide table.
There are over 1 million records to import.
I have no control of the creation of the import file.
I must use SSIS.
I have confirmed with MS that this has been reported as a bug.
I have tried several workarounds. Most have involved trying to write custom code to intercept the records, and I can't seem to get that to work the way I want.
I had a similar problem and used custom code (a Script Task) and a Script Component under the Data Flow tab.
I have a Flat File Source feeding into a Script Component. Inside it I use code to manipulate the incoming data and fix it up for the destination.
My issue was that the provider was using '000000' to mean no date available, and another column had a padding/trim issue.
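For illustration, the per-row fix-ups described above can be as small as this; the column names (RawDate, CleanDate, PaddedText) and the yyMMdd date format are assumptions, not the original poster's actual schema.

```csharp
// Sketch only: goes inside the Script Component's generated ScriptMain class,
// where UserComponent and Input0Buffer are created for you by SSIS.
using System;
using System.Globalization;

public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // The provider sends "000000" to mean "no date available".
        if (Row.RawDate == "000000")
        {
            Row.CleanDate_IsNull = true;
        }
        else
        {
            // Assumed incoming format yyMMdd; adjust to the provider's real format.
            Row.CleanDate = DateTime.ParseExact(Row.RawDate, "yyMMdd",
                                                CultureInfo.InvariantCulture);
        }

        // Fix the padding/trim issue on the other column.
        Row.PaddedText = Row.PaddedText.Trim();
    }
}
```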
You should have no problem importing this file. Just make sure, when you create the Flat File connection manager, to select the Delimited format, then set the SSIS column lengths to the maximum column length in the file so they can accommodate any data.
It sounds like you are using the Fixed Width format, which is not correct for CSV files (since you have variable-length columns), or maybe you've set the column delimiter incorrectly.
Same issue. In my case, the target CSV file has header and footer records with formats completely different from the body of the file; the header/footer are used to validate completeness of file processing (date/times, record counts, amount totals - "checksum" by any other name ...). This is a common format for files from "mainframe" environments, and though I haven't started on it yet, I expect to have to use scripting to strip off the header/footer, save the rest as a new file, process the new file, and then do the validation. Can't exactly expect MS to have that out of the box (but it sure would be nice, wouldn't it?).
You can write a script task using C# to iterate through each line and pad it with the proper number of commas to fill the data out. This assumes, of course, that all of the data aligns with the proper columns.
That is, as you read each record you can count the number of commas, then append commas to the end of the record until it has the correct number.
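A minimal sketch of that pre-processing step, with placeholder file names and comma count, and no handling of quoted fields that themselves contain commas:

```csharp
// Sketch: pad short records with trailing commas so every line matches the
// widest layout before the file goes through the SSIS Flat File Source.
using System.IO;
using System.Linq;

class PadShortRecords
{
    // One less than the number of columns in the widest layout; the value
    // here is just a placeholder to adjust for your file.
    const int ExpectedCommas = 80;

    static void Main()
    {
        using (var reader = new StreamReader("input.csv"))   // hypothetical input
        using (var writer = new StreamWriter("padded.csv"))  // hypothetical output
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Naive comma count: assumes no quoted fields containing commas.
                int missing = ExpectedCommas - line.Count(c => c == ',');
                writer.WriteLine(missing > 0 ? line + new string(',', missing) : line);
            }
        }
    }
}
```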
Excel has an issue that causes this kind of file to be created when converting to CSV.
If you can do this "by hand" the best way to solve this is to open the file in Excel, create a column at the "end" of the record, and fill it all the way down with 1s or some other character.
Nasty, but can be a quick solution.
If you don't have the ability to do this, you can do the same thing programmatically as described above.
Why can't you just import it as a text file and set the column delimiter to "," and the row delimiter to CRLF?