Excel Source SSIS - sql-server

I have an SSIS package with an Excel Source that reads an Excel table. I am currently using the Table or View data access mode, and it is literally reading every row in the worksheet: 1,048,576, which is the maximum.
The source worksheet has an Excel table on it named PSA_DATA. Why isn't this table in the Table or View drop-down? There is an option for the worksheet followed by _FilterDatabase, but this fails when I run the package even though it pulls the correct data when I press Preview. Wouldn't this make more sense than using the SQL Command and SELECT * FROM [fact_PSA$Ax:Bx]? The whole reason we use Named Ranges and Tables in Excel is because they are dynamic! Now I have to hard-code the range with row numbers every time?
What am I missing here? Is there an easier way? I just want to move an Excel table into a SQL table! Why doesn't the most ubiquitous piece of software in the world easily talk to the second most ubiquitous piece of software in the world!?!?!

If the sheet name is not shown in the Table or View combo box, using a SQL command is not a bad idea.
When using a SQL command to read from Excel, it is not necessary to specify a range; OLE DB takes the used range by default. Just use the following command:
SELECT * FROM [fact_PSA$]
Workaround
You can try reading your Excel file from a Script Task or a Script Component. The following links show how to do this:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/2d45f180-9fd0-4224-a298-cb99e2b2100a/how-to-read-the-contents-of-excel-file-through-ssis-script-task-without-the-headers?forum=sqlintegrationservices
https://msdn.microsoft.com/en-us/library/ms403358.aspx
http://billfellows.blogspot.com/2013/04/ssis-excel-source-via-script.html
Side Note: there are many links you can follow to import data from excel to SQL using SSIS:
http://www.sqlshack.com/using-ssis-packages-import-ms-excel-data-database/
https://www.mssqltips.com/sqlservertip/2770/importing-data-from-excel-using-ssis--part-1/
https://www.simple-talk.com/sql/ssis/moving-data-from-excel-to-sql-server-10-steps-to-follow/
https://www.simple-talk.com/sql/ssis/importing-excel-data-into-sql-server-via-ssis-questions-you-were-too-shy-to-ask/

I appreciate the links to work-arounds, but I didn't really get an answer to my question. Why can't we reference an EXCEL TABLE (not a worksheet) from the SSIS Excel Source???
I ended up using the SQL Command data access mode with this query:
SELECT * FROM [fact_PSA$A:W]
WHERE fact_PSA_ID IS NOT NULL
Somehow, using SQL stopped it from reading every possible row in the worksheet even though the range provided is set for "A:W" which is every row. I guess the "WHERE fact_PSA_ID" limits the rows read before it hits the SSIS source.
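For anyone who wants to generate this kind of command instead of typing it by hand, here is a minimal sketch in Python. It only builds the query text used above; the function name build_excel_query is made up for illustration, and the resulting string would still have to be pasted into (or assigned to a variable used by) the Excel Source.

```python
def build_excel_query(sheet, first_col, last_col, key_col):
    """Build an Excel Source SQL command for a column range.

    The WHERE clause skips rows whose key column is empty, which is the
    same filter used in the hand-written query above.
    """
    return (
        f"SELECT * FROM [{sheet}${first_col}:{last_col}] "
        f"WHERE [{key_col}] IS NOT NULL"
    )


if __name__ == "__main__":
    # Reproduces the query from the post for sheet fact_PSA, columns A..W.
    print(build_excel_query("fact_PSA", "A", "W", "fact_PSA_ID"))
```

Only the sheet name, the column letters, and the key column change between workbooks, so those are the only parameters.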

Related

Excel to SQL (SSIS) - Importing more than 1 file, every file has more than 1 sheet and the data from Excel starts from the 3rd row

Excel to SQL (SSIS) - Importing more than 1 file, every file has more than 1 sheet and the data from Excel starts from the 3rd row.
How would you build this the best way?
I know how to do each one separately, but together I got into a pickle.
Please help me, as I haven't found videos or sites regarding this.
Just to clarify -
The tables (in Excel) have the same design (each in a different sheet).
Some Excel files have 4 sheets; some have only 3.
Many thanks,
Eyal
Assuming that all of the Excel files to be imported are located in the same folder, you will first create a For-Each loop in your control flow. Here you will create a user variable that will be assigned the full path and file name of the Excel file being read (you'll need to define the .xls or .xlsx extension in the loop in order to limit it to reading only Excel files). The following link shows how to set up the first part.
How to read data from multiple Excel files with SQL Server Integration Services
Within this loop you will then create another For-Each loop that will loop through all of the worksheets in the Excel file currently being read. Apply the following link to perform the task of reading the rows and columns from each worksheet into the database table.
Use SSIS to import all of the worksheets from an Excel file
The outer loop will pick up the Excel file and the inner loop will read each worksheet, regardless of how many there are. The key is that the format of each worksheet must be the same. Also, in the Excel source of the data flow task, you can define from which line of each worksheet to begin reading. The process will continue until all of the Excel files have been read.
For tracking and auditing purposes, it is a good idea to include counters in the automated process to record the number of files read and the number of worksheets read for each. I also like to first import all of the records into staging tables, where any issues can be fixed and cleaning can be performed efficiently using SQL before populating the final production tables.
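To make the shape of the two nested loops concrete, here is a rough sketch in Python rather than SSIS. It is an illustration of the logic only, not the package itself. It assumes .xlsx files (which are zip packages with the sheet list in xl/workbook.xml); the function names and the default range A3:D1048576 (data starting on row 3, four columns) are hypothetical and would need to match your own layout.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace used by xl/workbook.xml inside an .xlsx package.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(xlsx_path):
    """Return the worksheet names of an .xlsx file, in workbook order."""
    with zipfile.ZipFile(xlsx_path) as package:
        root = ET.fromstring(package.read("xl/workbook.xml"))
    return [s.attrib["name"] for s in root.findall("m:sheets/m:sheet", NS)]


def plan_imports(folder, data_range="A3:D1048576"):
    """Outer loop over files, inner loop over sheets.

    Returns (file name, query) pairs; data_range is an assumed range that
    starts on row 3 so the first two rows of each sheet are skipped.
    """
    plan = []
    for path in sorted(Path(folder).glob("*.xlsx")):      # outer loop: files
        for name in sheet_names(path):                     # inner loop: sheets
            plan.append((path.name, f"SELECT * FROM [{name}${data_range}]"))
    return plan
```

Because the inner loop asks each file for its own sheet list, it does not matter whether a file has 3 sheets or 4.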
Hope this all helps.

Ignoring column from Excel file while importing to SQL Server

I have multiple Excel files that have the same format. I need to import them into SQL Server.
The issue I currently have is that there are two text columns that I need to ignore completely as they are free text and the character length for some rows exceeds what the server allows me to import which results in a truncation error.
Because I don't need these columns for my analysis, the table I'm importing to doesn't include them, but for some reason the SSIS package still picks up those columns and cuts the import job off halfway through.
I tried using max character length for those columns which still results in the truncation error.
I need to create an SSIS package that ignores the two columns completely without deleting the columns from Excel.
You can specify which columns you need to ignore from the Edit Mappings dialog.
I have added the image for your reference:
If you just create the SSIS package in SSDT the Excel file can be queried to return only the required columns. In the package, create an Excel Connection Manager using the Excel file. Then on the Control Flow of the package add a Data Flow Task that has an Excel Source component in it. On this source, change the data access mode to SQL command and the file can then be queried similar to SQL. In the following example TabName is the name of the Excel tab containing the data that will be returned. If either the tab or any column names contain spaces they will need to be enclosed in square brackets, i.e. TabName would be [Tab Name].
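A query of that kind can be sketched as follows. This is a small Python helper that only assembles the SQL text; the helper name select_columns and the column names Order ID and Amount are hypothetical placeholders, not names from the asker's file. Every column name is bracketed, which is harmless for names without spaces and required for names with them.

```python
def select_columns(tab, columns):
    """Build an Excel Source SQL command that returns only the listed columns.

    The tab is addressed as [TabName$]; column names are always bracketed
    so that names containing spaces are handled too.
    """
    column_list = ", ".join(f"[{c}]" for c in columns)
    return f"SELECT {column_list} FROM [{tab}$]"


if __name__ == "__main__":
    # Hypothetical columns: the two free-text columns are simply not listed.
    print(select_columns("TabName", ["Order ID", "Amount"]))
```

Columns that are left out of the list never enter the data flow, so their length cannot cause a truncation error.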
Import/Export Wizard
Since you mentioned in the comments that you are using the SQL Server Import/Export Wizard, you can solve this if you have a fixed column range that you want to import (for example, the first 10 columns).
In the Import/Export Wizard, after selecting the destination options, you will be asked whether you want to read from tables or from a query:
Select the query option, then use a simple SELECT query and specify the column range after the sheet name. For example:
SELECT * FROM [Sheet1$A:C]
The query above will read from the first 3 columns in Sheet1 since A:C represent the range between first column A and third column C.
Now, you can check the columns from the Edit Mappings dialog:
SSIS
You can use the same logic within an SSIS package: just write the same SQL command in the Excel Source after changing the data access mode to SQL Command.
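If the number of leading columns to keep is known but the letter is not obvious (for example the first 30 columns), the range can be computed. The following is a small sketch in Python; the function names are made up for illustration and the output is just the query text to paste into the wizard or the Excel Source.

```python
def column_letter(n):
    """Convert a 1-based column number to an Excel column letter (1 -> A, 27 -> AA)."""
    letters = ""
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def first_columns_query(sheet, count):
    """Build a query that reads only the first `count` columns of a sheet."""
    return f"SELECT * FROM [{sheet}$A:{column_letter(count)}]"


if __name__ == "__main__":
    # Reproduces the A:C example above for the first 3 columns of Sheet1.
    print(first_columns_query("Sheet1", 3))
```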
The solution is simple. I needed to write a query that will exclude the columns. So instead of selecting "Copy data from one or more tables" you select "write a query" and exclude the columns you don't need. This one worked 100%

Finding the column names from source assistant in SSIS

I am creating an SSIS package in which I have to move data from Excel to a table in SQL Server. The Excel file is the source (added through the Source Assistant) in the data flow task.
The number of columns in the Excel file won't change, but the column names will. So I have to find all the column names in the Excel file before inserting data.
Could you please help me on this?
Solution overview
In the Excel connection, do not treat the first row as column names, and use SQL command as the data access mode
Alias the output columns so that their names match your destination
Add a Script Task before the Data Flow Task that imports the data
Use the Script Task to open the Excel file and get the worksheet name and the header row
Build the query and store it in a variable
In the Data Flow Task, use the query stored above as the source (note that you have to set the Delay Validation property to true)
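The "build the query" step can be sketched like this. It is written in Python for brevity, whereas a real Script Task would be C# or VB.NET, so treat it as an outline of the string being built, not as the task itself. It relies on the fact that when the first row is not treated as headers, the provider names the columns F1, F2, and so on by position. The function name, the destination column names, and the range A2:B1048576 (skipping the header row) are assumptions for illustration.

```python
def build_alias_query(sheet, destination_columns, data_range):
    """Map positional Excel columns F1..Fn to fixed destination names.

    data_range should start below the header row (for example A2:...),
    because with headers disabled the header row would otherwise be
    read as data.
    """
    select_list = ", ".join(
        f"F{position} AS [{name}]"
        for position, name in enumerate(destination_columns, start=1)
    )
    return f"SELECT {select_list} FROM [{sheet}${data_range}]"


if __name__ == "__main__":
    # Hypothetical destination columns and range.
    print(build_alias_query("Sheet1", ["CustomerID", "Name"], "A2:B1048576"))
```

Since the aliases come from the destination table and not from the file, the data flow metadata stays the same no matter what the header row says.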
Detailed Solution
You can follow my answer at "Importing excel files having variable headers"; it solves a very similar case.

Exporting to Excel leaves one empty row after title (but only if it exports one column!)

I have the following problem. I'm exporting to an Excel 2003 file (has to be Excel 2003) from SQL Server through SSIS. It first creates a sheet through a SQL Task and then populates it with a SQL Data Flow. The Excel connection specifies that the first row has the column names.
The problem I have is that when the sheet only has one column, SSIS starts writing not in row 2, but row 3.
This is the SQL script that creates the sheet:
CREATE TABLE `Sheet1` (`Column` LongText)
And the script that populates it:
SELECT socialSecNum FROM Users
If I add a dummy column, with name ".", and in the DataFlow fill it with blanks, it doesn't skip that row, and starts writing in row 2.
The SQL Task script that creates the sheet in this case is:
CREATE TABLE `Sheet1` (`Column` LongText, `.` LongText)
It's the same SQL script that fills the Excel file in both screenshots. The output doesn't change, so there isn't a NULL value being inserted randomly at the beginning there.
What is going on? How do I avoid it? I can't have that "." column name there.
EDIT: Also note that it's not that the Excel files are dirty and that's why it leaves an empty row in row 2 because it thinks it's being used; the same file doesn't skip a row if I add a second column in the script.
EDIT2: I was asked to remove the pictures, sorry.
I was finally able to replicate your issue of a blank row and your fix with the extra column. In the end, I couldn't get rid of the blank row until I exported with the file actually open. Yes, that's right: I got no blank row when the file was open in Excel while the SSIS package wrote to it, which obviously is not a solution, or at least not a good one.
CRAZY....
In all of the testing I did (a lot), I got some inconsistent results using your SQL Task to create the table. If a worksheet with the same name already existed, sometimes it would overwrite what was there, but most of the time I would get a new worksheet with an extra 1 appended to the name. So when you are creating Sheet1 and it already exists, your table is created as Sheet11. Because you are deleting the workbook altogether, you probably aren't seeing any of that weird behavior.
A quick search on the internet showed that this is a common issue with the 97-2003 format. So, things you can do/try:
Switch to CSV but name the file with an .xls extension. It will still open in Excel, with no formatting etc., but the user may get a warning when opening the file.
Add the column, then add another SQL Task to drop it after you populate the sheet. I wasn't successful with this, but I don't do this in Excel, so I may just not know a certain command.
Add another SQL Task to delete null rows. Again, I wasn't successful with this, but I don't write queries against Excel very often.

How to export data to an Excel 2007 table using SSIS?

I have an Excel file (xlsx) containing a table:
Once I launched my SSIS task (successfully) to insert data into it, the data was actually appended after the table:
My expected result:
So I am looking for a way to insert into the table and expand it with the data. I hope someone could help me.
I would not use SSIS for this. You could set up Excel 2007 as a linked server and put data into Excel with regular T-SQL, or process the data with Excel VBA, getting it directly from SQL Server. As a matter of practical sanity, I would not ever use SSIS for anything.
Well, there is not much information about how you do it, but you should specify somehow that the first row should not be used as the header row (HDR=NO), something like:
INSERT INTO OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=D:\testing.xls;HDR=NO',
'SELECT * FROM [Sheet1$]')
SELECT * FROM dbo.SourceTable -- hypothetical source table; replace with your own query
I finally found an answer.
So I needed to generate excel reports with a lot of pivot charts linked to a main table.
But using a table was a bad idea. Instead, the pivot charts must be linked to a named range.
The last thing to know is that the error message "Invalid References" appears if the named range doesn't use the OFFSET function.
My named range formula is:
=OFFSET(Sheet!$A$1, 0, 0, COUNTA(Sheet!$A:$A), NUMBER_OF_COLUMNS)
Where Sheet is the name of the worksheet and NUMBER_OF_COLUMNS is the number of columns of the data.
That's it. I can now generate Excel reports without a single line of code, using only SSIS 2005.
