SSIS - Excel to SQL Server with changing column names - sql-server

I have an Excel sheet that changes column names based on the year and current week of the year, so for example 201901 would be the first week of 2019.
The Excel sheet that is sent to us daily automatically adjusts the column names based on the current date (up to 6 months), so currently (31/07/2019) the year and week show 201931 - 202011:
So the N column next week will be 201932 (the columns shift left basically).
I have tried aliasing the Excel source columns as 1, 2, 3, 4, etc., hoping to just get the data into SQL Server and then script a trigger there to rename the columns, but this doesn't work because of the column mapping SSIS requires.
It works fine until the column names change the following week.
A simple method would be to drop the table and just dump the file into a new table with the same name, but I can't see how to set this up in SSIS, since you need to map the column names (which unfortunately change).
Here is how the dataflow looks:
Ideally, for me, something like this would be perfect:
But I'm not sure how to achieve this outcome in SSIS.

I would suggest transforming the data. Currently, you have a "cross-table" (wide) format.
How about putting the Excel data in the form (RAG_week; CalenderWeek; Value_of_CalenderWeek)? To do this you can use an Excel macro which fills a new sheet in the Excel file. (Each cell becomes one record, a row of its own.) Next, you create a matching table on the SQL Server. Then you can create an SSIS package with a constant column mapping, simply appending the new data each week.
This affects how you evaluate the data later on, but it seems to be a far more stable approach.
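The reshaping step described above can be sketched outside Excel as well. The following is a minimal illustration in Python rather than an Excel macro, assuming the first column is the row key and every other column header is a year-week label; the function name `unpivot_weeks` and the sample values are hypothetical.

```python
def unpivot_weeks(header, rows, key_column="RAG_week"):
    """Turn each wide row into one (key, week, value) tuple per week cell."""
    key_index = header.index(key_column)
    long_rows = []
    for row in rows:
        for index, column_name in enumerate(header):
            if index == key_index:
                continue  # the key itself is not a week column
            long_rows.append((row[key_index], column_name, row[index]))
    return long_rows


# Hypothetical sample data; the week labels shift every week,
# but the output always has the same three columns.
header = ["RAG_week", "201931", "201932", "201933"]
rows = [["Red", 4, 7, 1], ["Green", 10, 12, 9]]

for record in unpivot_weeks(header, rows):
    print(record)
```

Because the output always has the same three fields (RAG_week, CalenderWeek, Value_of_CalenderWeek), the destination mapping never has to change when the week labels in the header do.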

Related

SSIS - How to take a specific cell in Excel and add it to an existing table as its own column for every row

I have an SSIS package which takes one sheet of an Excel file and creates a SQL table from it. This is simple because it's in tabular form and SSIS takes care of it.
However, I have another sheet in that Excel file which has a specific cell with a dollar value in it. I would like to take that specific cell's value and tack it onto the table created from the tabular data I mentioned above. In other words, I want to add a column named InsuranceTotal2018 to the table created above and fill every row in the table with the same value, taken from that specific cell.
Is this possible in SSIS? If so, how would one go about doing such?

Why is my Excel Plus source in SSIS outputting one fixed date for all date columns?

I have an Excel file that I want to import using SSIS. I have done this many times without a problem with other Excel files. However, this Excel file has three sheets in it. Two of them import the correct data just fine, BUT one of them imports a fixed value of 06-30-2019 for all date fields. The values are not being altered on the way to the table; they are already wrong at the source. When I preview the source, I see the fixed date already in the date fields, while the non-date fields come through fine. In the spreadsheet, that column has a different date in each row. How is this happening? I am using SSIS 2017 and Excel Plus, loading to a SQL Server 2017 table. How can I fix this?
In my Excel Sheet1:
col1 datefield
1 01-08-2019
2 05-06-2019
3 06-12-2019
4 07-25-2019
In my Excel source in SSIS, the preview shows the following, and that is also what it loads into the table:
col1 datefield
1 06-30-2019
2 06-30-2019
3 06-30-2019
4 06-30-2019
Well, after spending a great deal of time on this, I have come to realize that my Excel file lives in a shared folder, which we can only access via the link/path given to me. I was passing my Excel source the path to a local copy of the file (where I had saved it earlier) when I should have given it the shared folder path. I'm happy to tell you all that this solved the problem.
However, the question remains: why would the Excel Plus source do something like this, and why choose the particular date 06/30/2019? That date is in my file, but it is not the only one, and it isn't in the first row of my data.

How can I use variables and SQL code within an SSIS package?

I have an SSIS package I am building to take data from a .CSV file and load it into a table in a SQL Server database. The .CSV file has more columns than my table and I'm looking to filter out the data based on some of these columns that are not being inserted into the table.
I have year, kind, type, dollars as my column names in the .CSV file but I'm only pulling type and dollars into the DB. However, I can only pull those rows where the kind= "L" and year is the current year (with one major caveat). If the process is running in the first quarter of a given year (so month <= 3) it needs to use the previous year as my qualifier for what rows it pulls in from the .CSV file. For instance, say it is February 2015 when this package is running, I need it to pull only rows with a year of 2014 and kind="L" from my .CSV file. If it is September 2015 then it needs to pull in rows with a year of 2015 and kind="L".
Any idea what the best way of doing this is? Right now I have a conditional split in my package but I can only get it to say year==YEAR(GETDATE()) and this will not work for the first quarter. I'd need some sort of variable logic to say something like IF(currentmonth<=3 THEN #year = currentyear-1) ELSE (#year = currentyear) and then use the #year variable in the conditional split. Is this possible?
Any help is much appreciated!
Normally for this kind of workflow, I will import the entire CSV into a temporary table, and then have a separate SQL script or view which reads from the temporary table and applies whatever business logic is needed for the final view.
If you want the logic to be in the SSIS package, you can use a derived column component to declare a new boolean field, for example IncludeRowInOutput, and set it to something like
((month(getdate()) <= 3 and year = year(getdate()) - 1) or (month(getdate()) > 3 and year = year(getdate()))) and kind = 'L'
Then you can do the conditional split based on the IncludeRowInOutput field.
I'd normally be wary of using too many script components; I find that they are harder to debug and make the dataflow harder to understand.
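To make the first-quarter rule concrete, here is a small sketch in Python (not SSIS expression syntax) of the same filter. The column names year, kind, type and dollars come from the question; the helper names `qualifying_year` and `include_row` and the sample rows are hypothetical.

```python
from datetime import date


def qualifying_year(today):
    """In January to March use the previous year, otherwise the current year."""
    return today.year - 1 if today.month <= 3 else today.year


def include_row(row, today):
    """Mirror of the IncludeRowInOutput flag: kind must be L and the year must qualify."""
    return row["kind"] == "L" and int(row["year"]) == qualifying_year(today)


# Hypothetical rows as they might be read from the CSV file.
rows = [
    {"year": "2014", "kind": "L", "type": "A", "dollars": "100"},
    {"year": "2015", "kind": "L", "type": "B", "dollars": "250"},
    {"year": "2015", "kind": "X", "type": "C", "dollars": "75"},
]

run_date = date(2015, 2, 10)  # a run in February 2015 should pick 2014 rows
kept = [(r["type"], r["dollars"]) for r in rows if include_row(r, run_date)]
print(kept)
```

Passing the run date in as a parameter, rather than calling the system clock inside the function, makes the first-quarter case easy to check without waiting for January.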

Excel displays dates as times only by default

I use Excel a lot while analyzing data and most of that data comes from SQL Server Management Studio. So I'll execute a query and copy the result set using Copy With Headers and then paste it into Excel. Annoyingly, because SQL Server Management Studio always inserts a time of midnight, Excel insists on displaying the time only.
So the value is 2015-04-17, but SSMS copies it as 2015-04-17 00:00:00.000 and Excel displays it as 00:00.0. In most cases, our date fields don't contain times (or then the implicit midnight time) and I am not even a little bit interested in those. I want to see the dates.
I am aware that I can select the cells and then set their date format (selecting Short Date from the Ribbon does the trick) but this is something I have to do every single time. Does anyone know of a way to ensure the time is not copied by SSMS, or that the default display format in Excel includes the date?
It depends on the cell formatting. You can change the cell formatting before pasting the rows. For example, you can select the entire sheet, right-click the cells, click "Format Cells" (my Excel is in another language, so I hope I translated that right) and format all cells as text.
In your specific case you can do this just on the columns you need.

Pivot or Unpivot a dataset using Excel, SSIS or SQL Server

I have an Excel file with various market indexes, dates, and values. In this file there is a single column on the left containing the market index names, followed by many date columns to the right. I can perform this pivot in Excel, SSIS or SQL Server. I have the most recent versions of each program. I don't want to simply copy and paste special transpose in Excel. I would like the solution to be as automated as possible. I suspect loading this data into SSMS and using SQL would be the easiest. I can change the format of the dates if needed.
I have used pivot in both SSIS and SSMS, but always with fewer columns than this task requires, and I am not sure how to approach it in a way that allows for the large number of columns and for the number of columns (dates) to vary. Perhaps this requires dynamic SQL. The dates which comprise the bulk of the columns can potentially extend 100 rows or more. Note there may be null values if an index didn't have a value on a given day.
Input data
Here is the desired output format.
Here is the data loaded into SSMS 2012. The dates become the column headers. Same goal of transposing the dates and index names.
I would use the Excel Power Query Add-In for this. It has Pivot and Unpivot commands that you can use. The Power Query implementations of both commands are dynamic i.e. they adjust to variations in the input data.
For your scenario I would first select Name and Unpivot all other columns. That will transform each date column and value into a row. Then I would Pivot on the Name column, which will generate columns for each of your Name values.
To automate this process, you just need a few lines of VBA code or script code to open, refresh and save the Excel file (including Power Queries). There are lots of options for this, e.g.
http://southbaydba.com/2013/09/10/part-5-power-query-api-refreshing-data-indeed/
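The unpivot-then-pivot sequence described above can be illustrated independently of Power Query. This is a rough sketch in Python, assuming the first column is called Name and every other column header is a date; the function names and the sample index values are hypothetical, and it is not how Power Query itself is implemented.

```python
def unpivot(header, rows):
    """Yield one (name, date, value) record per date cell."""
    date_columns = header[1:]
    for row in rows:
        name = row[0]
        for date_label, value in zip(date_columns, row[1:]):
            yield name, date_label, value


def pivot_on_name(records):
    """Build one row per date with one column per distinct name."""
    names = []
    by_date = {}
    for name, date_label, value in records:
        if name not in names:
            names.append(name)
        by_date.setdefault(date_label, {})[name] = value
    new_header = ["Date"] + names
    new_rows = [
        [date_label] + [values.get(name) for name in names]
        for date_label, values in by_date.items()
    ]
    return new_header, new_rows


# Hypothetical input: names down the left, dates across the top, None for a missing value.
header = ["Name", "2015-01-02", "2015-01-05"]
rows = [["Index A", 100.5, 101.0], ["Index B", None, 55.2]]

new_header, new_rows = pivot_on_name(unpivot(header, rows))
print(new_header)
for row in new_rows:
    print(row)
```

Neither step hard-codes the date columns, so the same code handles a file with more or fewer dates, which is the property the answer relies on.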
