SSIS import skipping some lines - sql-server

For some reason, I'm missing some lines when I import a CSV file that I exported using sqlplus, and I can't figure out why. The SSIS package gives no warnings or errors either.
I know this because when I load the CSV file into a separate analysis application, it gets the correct row counts and the correct totals for the numeric columns.
SSIS, however, doesn't capture all of the lines:
there are 6313052 rows in the CSV file, but SSIS imports only 6308607.
Any thoughts?
I've tried different code pages too (1250, 1252, and UTF-8), but they didn't seem to have any effect.
I checked out this link:
Why all the records are not being copied from CSV to SQL table in a SSIS package
and I've checked items 2, 3, and 4 from it.
For item 1, though, I'm using a Foreach Loop container as described on this site:
http://help.pragmaticworks.com/dtsxchange/scr/FAQ%20-%20How%20to%20loop%20through%20files%20in%20a%20specified%20folder,%20load%20one%20by%20one%20and%20move%20to%20archive%20folder%20using%20SSIS.htm
to loop through files in a folder and import them.
I've also thought about missing delimiters, but I exported the files myself using sqlplus like this:
select
trim(to_char(t1.D_DTM, 'yyyy-mm-dd hh24:mi:ss'))||','||
trim(t1.TECHNOLOGY)||','||
trim(t1.VOICEDATA)||','||
trim(t2.MRKT_NM)||','||
trim(t2.REGION_NM)||','||
trim(t2.CLUSTER_NM)||','||
trim(t3.BSC_NM)||','||
trim(t1.BTS_ID)||','||
trim(t1.CSCD_NM)||','||
trim(t1.SECT_SEQ_ID)||','||
trim(t1.BND_ID)||','||
trim(t1.FA_ID)||','||
trim(t1.TCE_DTCT_CALL_SETUP_F_CNT)||','||
trim(t1.MS_ACQ_CALL_SETUP_F_CNT)||','||
trim(t1.SIGN_CALL_SETUP_F_CNT)||','||
trim(t1.BAD_FRM_CALL_SETUP_F_CNT)||','||
trim(t1.REORG_ATT_CNT)||','||
trim(t1.TCE_CALL_SETUP_F_CNT)||','||
trim(t1.WCD_CALL_SETUP_F_CNT)||','||
trim(t1.L_CALL_SETUP_F_CNT)||','||
trim(t1.TRAF_FRM_MISS_CNT)||','||
trim(t1.BCP_TMO_CALL_SETUP_F_CNT)
from table
I'm not sure how I could miss delimiters if I export a file like that...
Also, I tested importing these files into MySQL using LOAD DATA INFILE, and that works fine and imports all of the data.
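One way to narrow down a row-count mismatch like this is to check, outside SSIS, whether every line really has the expected number of fields. The sketch below is Python rather than SSIS and is only a diagnostic aid. `EXPECTED_FIELDS` comes from the 22 concatenated columns in the SELECT above; the function names, the file name in the usage comment, and the extra check for double quotes (which a text qualifier setting might treat specially) are assumptions for illustration, not a confirmed cause.

```python
from collections import Counter

# The SELECT above concatenates 22 columns, so 22 fields per line are expected.
EXPECTED_FIELDS = 22


def field_count_histogram(lines, delimiter=","):
    """Map each observed fields-per-line value to the number of lines having it."""
    return Counter(line.rstrip("\r\n").count(delimiter) + 1 for line in lines)


def suspicious_lines(lines, expected=EXPECTED_FIELDS, delimiter=","):
    """Yield (line_number, field_count, text) for lines worth a closer look.

    A line is reported when its field count differs from `expected` or when it
    contains a double quote (an assumed, unconfirmed source of trouble).
    """
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        fields = text.count(delimiter) + 1
        if fields != expected or '"' in text:
            yield number, fields, text


# Hypothetical usage against an exported file (the file name is an assumption):
# with open("export.csv", newline="", encoding="utf-8", errors="replace") as handle:
#     for number, fields, text in suspicious_lines(handle):
#         print(number, fields, text[:80])
```

If every line reports exactly 22 fields and no quotes, the delimiters are probably not the cause, and the row terminators or the connection manager settings would be the next thing to inspect.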

Related

SSIS Ignore Blank Lines

I get the following SSIS error message when my source file has blank lines at the end of the file. I don't care about the blank lines, as they don't affect the overall goal of pumping data from a text file to a database table. I'd like to ignore this message or, if it's easier, configure SSIS to ignore blank lines.
<DTS:Column DTS:ID="96" DTS:IdentificationString="Flat File Source.Outputs[Flat File Source Error Output].Columns[Flat File Source Error Output Column]"/>
I found a similar question below, but the solution isn't an SSIS one; it preprocesses the text files, which would be my least favorite solution.
SSIS Import Multiple Files Ignore blank lines
If you want to exclude records with blank values, you can use a Conditional Split. Add it between your source file and your destination.
The expression can look like this:
ISNULL(Col1) && ISNULL(Col2) && ISNULL(Col3) ...
Name the output Remove Blank Lines. When you connect the Conditional Split to your destination, SSIS will ask which output of the split component to use. In this case, choose the Conditional Split Default Output to get all records except the blank ones.
You can enable a Data Viewer before and after the Conditional Split to see the filtered output.
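The routing described above can be sketched outside SSIS to make the logic explicit. This is a Python illustration, not SSIS code: `split_blank_rows` is a hypothetical name, and it models only the `ISNULL(...) && ISNULL(...)` test. Whether blank fields reach the split as NULL or as empty strings depends on the source settings, which the answer does not specify.

```python
def split_blank_rows(rows):
    """Partition rows the way the Conditional Split above is described.

    Rows where every column is None (mirroring ISNULL(Col1) && ISNULL(Col2) ...)
    go to the named "Remove Blank Lines" output. All other rows fall through to
    the default output, which is the one connected to the destination.
    """
    remove_blank_lines, default_output = [], []
    for row in rows:
        if all(value is None for value in row):
            remove_blank_lines.append(row)
        else:
            default_output.append(row)
    return remove_blank_lines, default_output
```

Note that a row with only some NULL columns still goes to the default output, because the condition requires every column to be NULL.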

NULLs when importing flat file - SQL Server

I am trying to import some data from a .csv file in SSMS using the "Import Flat File" option. However, not all the data is being copied.
Both data types are set to nvarchar(50).
The lines containing just a single value after the semicolon are imported. Lines with multiple values are imported as NULL. I've tried separating the values with forward slashes and commas. The result is still the same.
How can I get the values imported instead of these NULL values?
I tested on my local machine with your example and everything is ok.
Maybe there is an issue in your flat file. Have you checked in Notepad++ that the file is consistent (proper <CR><LF> line endings, for example)?
Also, during the import, is the result you see in the "preview" window correct?
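As an alternative to inspecting the file by eye, the line endings can be counted directly from the raw bytes. This is a minimal Python sketch; the function name and the file name in the usage comment are assumptions.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF contains one LF and one CR, so subtract them to get the bare ones.
    lf_only = data.count(b"\n") - crlf
    cr_only = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lf_only, "CR": cr_only}


# Hypothetical usage (the file name is an assumption):
# from pathlib import Path
# print(count_line_endings(Path("data.csv").read_bytes()))
```

A file that mixes more than one kind of line ending is a likely candidate for rows being merged or split during import.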

Error importing data from CSV with OpenRowset in SQL Server - Mysterious value of "S7"

I have a file dump that needs to be imported into SQL Server on a daily basis, and I have created a scheduled task to do this unattended. All CSV files are delimited by ',' and are Windows CR/LF files encoded in UTF-8.
To import data from these CSV files, I mainly use OpenRowset. It worked well until I ran into a file containing the value "S7". If the file contains the value "S7", that column is recognized as a numeric datatype during the OpenRowset import, which causes the alphabetic values in that column to fail to import, leaving only NULL values.
This is what I have tried so far:
Using IMEX=1: openrowset('Microsoft.ACE.OLEDB.15.0','text;IMEX=1;HDR=Yes;
Using text driver: OpenRowset('MSDASQL','Driver=Microsoft Access Text Driver (*.txt, *.csv);
Using Bulk Insert with or without a format file.
The interesting part is that if I use Bulk Insert, it gives me a warning about an unexpected end of file. To solve this, I tried various row terminators such as '0x0a', '\n', and '\r\n', as well as not specifying any, but they all failed. I finally managed to import some of the records using a row terminator of ',\n'. However, the original file contains about 1000 records and only 100 are imported, without any errors or warnings.
Any tips or help would be much appreciated.
Edit 1:
The file ends with a newline character, as I can tell from Notepad++. For files that give the unexpected end of file error, I managed to import them by removing the last record. However, even with this method I still cannot import all records; only some of them are imported.

Import Multiple Excel files into a table using SSIS

I have two Excel files named 'First' and 'Second' in the same location.
They have the same schema.
I used a Foreach Loop container and put a Data Flow Task inside it.
The data flow diagram looks like this:-
Here, I selected first excel file as the source....
My For Each Loop Container Editor:-
After running the SSIS package successfully the output came like this:-
It took data only from the First Excel file, and did so three times. I must have done something wrong, but I can't figure out what.
Check your Foreach Loop Editor:
Collection>Folder
Collection>Files
The Files setting should not contain a specific file name; to pick up multiple Excel files, use *.xlsx.
Edit:
Use a Script task to Debug. Map the value of ForEach to a variable and display it through Script task.
Edit the script task with below code.
MessageBox.Show(Dts.Variables["Variable"].Value.ToString());
Also, please check that your source Excel connection is configured correctly to use the values coming from the Foreach loop.
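The effect of the Files setting can be illustrated with a small sketch of wildcard matching. This is Python, not SSIS, and `files_for_loop` is a hypothetical helper; it only shows why a fixed file name yields one file while `*.xlsx` yields every workbook in the folder. The case-insensitive comparison is an assumption made to mimic Windows file names.

```python
from fnmatch import fnmatchcase


def files_for_loop(names, pattern):
    """Return the file names a wildcard pattern would select, sorted by name.

    Names and pattern are lowercased before matching so the comparison is
    case-insensitive (an assumption made to mimic Windows file names).
    """
    return sorted(
        name for name in names if fnmatchcase(name.lower(), pattern.lower())
    )


# With a fixed name the loop sees only one file; with a wildcard it sees both.
# Each iteration must then read the current loop value, not a hard-coded path.
folder = ["First.xlsx", "Second.xlsx", "notes.txt"]
one_file = files_for_loop(folder, "First.xlsx")
all_workbooks = files_for_loop(folder, "*.xlsx")
```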

SSIS read flat file skip first row

First of all, I did spend quite some time on research, and I know there are many related questions, though I can't find the right answer on this question.
I'm creating a SSIS package, which does the following:
1. Download and store the CSV file locally, using an HTTP connection.
2. Read in the CSV file and store it on SQL Server.
Due to the structure of my flat file, the flat file connection keeps giving me errors, both in SSIS as in the SQL Import Wizard.
The structure of the file is:
"name of file"
"columnA","columnB"
"valueA1","valueB1"
"valueA2","valueB2"
Hence the row delimiter is end of line {CR}{LF} and the column delimiter is a comma {,}, with text qualifier ".
I want to import only the values, not the name of the file or the column names.
I played around with the settings and got the right preview with the following settings (see image below)
- Header rows to skip: 0
- Column names in the first data row: no
- 2 self-configured columns (string with columnWidth = 255)
- Data rows to skip: 2
When I run the SSIS Package or SQL Import Wizard I get the following error:
[SSIS.Pipeline] Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The
PrimeOutput method on Flat File Source returned error code 0xC0202091.
The component returned a failure code when the pipeline engine called
PrimeOutput(). The meaning of the failure code is defined by the
component, but the error is fatal and the pipeline stopped executing.
There may be error messages posted before this with more information
about the failure.
I can't figure out what goes wrong and what I can do to make this import work.
If you want to skip the file name and the column names, you need to set Header rows to skip to 2. You should also check whether the file actually uses line feeds (LF) instead of CR+LF. Looking at the line breaks in a text editor isn't enough to tell the difference, as most editors display files with either CR+LF or LF correctly.
You can check the results of your settings by clicking on the "Preview" button in your flat file source. If the settings are correct, you'll see a grid with your data properly aligned. If not, you'll get an error, or the data will be wrong in some way, e.g. a very large number of columns, column names in the first data row, etc.
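The intended result of skipping two header rows can be checked outside SSIS with a short sketch. This is Python using the standard `csv` module, not SSIS; `read_values` is a hypothetical name, and the sample text reproduces the file structure shown in the question.

```python
import csv
import io
from itertools import islice


def read_values(text, rows_to_skip=2):
    """Parse comma-delimited, double-quote-qualified text and drop leading rows.

    The default of 2 skips the file-name row and the column-name row.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=",", quotechar='"')
    return list(islice(reader, rows_to_skip, None))


sample = (
    '"name of file"\r\n'
    '"columnA","columnB"\r\n'
    '"valueA1","valueB1"\r\n'
    '"valueA2","valueB2"\r\n'
)
values = read_values(sample)
```

Here `values` holds only the two data rows. The first row of the file has one field while the rest have two, which is one reason a parser that expects a fixed column layout can fail if that row is not skipped.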
