I am attempting to normalize data using SSIS in the following format:
SerialNumber Date R01 R02 R03 R04
-------------------------------------------
1 9/25/2011 9 6 1 2
1 9/26/2011 4 1 3 5
2 9/25/2011 7 3 2 1
2 9/26/2011 2 4 10 6
Each "R" column represents a reading for an hour. R01 is 12:00 AM, R02 is 1:00 AM, R03 is 2:00 AM and R04 is 3:00 AM. I would like to transform the data and store it in another table in this format (line breaks for readability):
SerialNumber Date Reading
-----------------------------------------
1 9/25/2011 12:00 AM 9
1 9/25/2011 1:00 AM 6
1 9/25/2011 2:00 AM 1
1 9/25/2011 3:00 AM 2
1 9/26/2011 12:00 AM 4
1 9/26/2011 1:00 AM 1
1 9/26/2011 2:00 AM 3
1 9/26/2011 3:00 AM 5
2 9/25/2011 12:00 AM 7
2 9/25/2011 1:00 AM 3
2 9/25/2011 2:00 AM 2
2 9/25/2011 3:00 AM 1
2 9/26/2011 12:00 AM 2
2 9/26/2011 1:00 AM 4
2 9/26/2011 2:00 AM 10
2 9/26/2011 3:00 AM 6
I am using the Unpivot transformation in an SSIS 2008 package to accomplish most of this, but I am having trouble adding the hour to the date based on the column the value came from. Is there a way to accomplish this in SSIS? Keep in mind that this is a small subset of the data, which is around 30 million records, so performance is an issue.
Thanks for the help.
Create an SSIS package, add a new Data Flow Task, and configure this DFT (Edit...)
Add a new data source
Add UNPIVOT component and configure it thus:
Add DATA CONVERSION component:
Temporary results:
Add DERIVED COLUMN component:
For the NewData derived column you can use this expression: DATEADD("HOUR",(Type == "R01" ? 0 : (Type == "R02" ? 1 : (Type == "R03" ? 2 : 3))),Date). The «boolean_expression» ? «when_true» : «when_false» operator works like the IIF() function (from VBA/VB) and is used here to calculate the number of hours to add: 0 hours for "R01", 1 hour for "R02", 2 hours for "R03", and otherwise 3 hours (for "R04").
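As a rough cross-check of the offset logic, the nested conditional can be mimicked outside SSIS. The Python sketch below is not SSIS code; the function name reading_timestamp is made up for illustration, and it assumes every unpivoted column name has the form R01, R02, ... so that the hour offset is the numeric suffix minus one.

```python
from datetime import datetime, timedelta


def reading_timestamp(date: datetime, column: str) -> datetime:
    """Return the reading time for an unpivoted column name such as "R03"."""
    # Assumption: the name is "R" followed by a 1-based hour number.
    hour_offset = int(column[1:]) - 1
    return date + timedelta(hours=hour_offset)


day = datetime(2011, 9, 25)
stamps = [reading_timestamp(day, c) for c in ("R01", "R02", "R03", "R04")]
```

Deriving the offset from the suffix avoids writing one conditional branch per hour column, which matters more once there are 24 of them.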
Results:
In Stata I need to create a new variable "changes in the board of directors" which indicates whether the same directors are observed in the same firm over time. Consider an example below:
clear
input dirid firmid year
1 10 2006
2 10 2006
3 10 2006
1 10 2007
2 10 2007
3 10 2007
1 10 2008
2 10 2008
3 10 2008
4 10 2008
3 10 2009
4 10 2009
end
Directors with IDs 1, 2, and 3 are in firm 10 in both 2006 and 2007, so there was no change in the board of directors from t-1 to t and the variable "changes in the board of directors" should be 0. In 2008, however, a new director (dirid = 4) joined the board, so there was a change and the variable should be 1. The same holds in 2009, because dirid 1 and 2 left the company. Any change, whether an entrance or an exit of directors, should be reported as 1 in the new binary variable.
Here's another way to do it. I think it should cope with directors leaving and later coming back.
clear
input dirid firmid year
1 10 2006
2 10 2006
3 10 2006
1 10 2007
2 10 2007
3 10 2007
1 10 2008
2 10 2008
3 10 2008
4 10 2008
3 10 2009
4 10 2009
end
bysort firmid year (dirid) : gen board = strofreal(dirid) if _n == 1
by firmid year : replace board = board[_n-1] + " " + strofreal(dirid) if _n > 1
by firmid year : replace board = board[_N]
by firmid : gen anychange = year != year[_n-1] & board != board[_n-1]
bysort firmid year (anychange) : replace anychange = anychange[_N]
sort firmid year dirid
list, sepby(firmid year)
+--------------------------------------------+
| dirid firmid year board anycha~e |
|--------------------------------------------|
1. | 1 10 2006 1 2 3 1 |
2. | 2 10 2006 1 2 3 1 |
3. | 3 10 2006 1 2 3 1 |
|--------------------------------------------|
4. | 1 10 2007 1 2 3 0 |
5. | 2 10 2007 1 2 3 0 |
6. | 3 10 2007 1 2 3 0 |
|--------------------------------------------|
7. | 1 10 2008 1 2 3 4 1 |
8. | 2 10 2008 1 2 3 4 1 |
9. | 3 10 2008 1 2 3 4 1 |
10. | 4 10 2008 1 2 3 4 1 |
|--------------------------------------------|
11. | 3 10 2009 3 4 1 |
12. | 4 10 2009 3 4 1 |
+--------------------------------------------+
See also [this paper][1] on concatenating rowwise.
[1]: https://journals.sagepub.com/doi/full/10.1177/1536867X20909698
clear
input dirid firmid year
1 10 2006
2 10 2006
3 10 2006
1 10 2007
2 10 2007
3 10 2007
1 10 2008
2 10 2008
3 10 2008
4 10 2008
3 10 2009
4 10 2009
end
bysort firmid year (dirid): gen n = _n
reshape wide n, i(firmid year) j(dirid)
egen all_directors = concat(n*)
bysort firmid (year): gen change = all_directors != all_directors[_n-1] & _n > 1
reshape long
drop if missing(n)
drop all_directors n
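For readers who want to check the expected flags independently of Stata, the comparison described in the question can be sketched in Python: collect the set of directors per firm and year, then compare each year's set with the previous observed year's. The function name board_changes is hypothetical, and treating the first observed year as "no change" is an assumption (the two Stata answers differ on that point).

```python
from collections import defaultdict

# Rows of (dirid, firmid, year), copied from the example data.
rows = [
    (1, 10, 2006), (2, 10, 2006), (3, 10, 2006),
    (1, 10, 2007), (2, 10, 2007), (3, 10, 2007),
    (1, 10, 2008), (2, 10, 2008), (3, 10, 2008), (4, 10, 2008),
    (3, 10, 2009), (4, 10, 2009),
]


def board_changes(rows):
    """Map (firmid, year) to 1 if the board differs from the firm's previous observed year."""
    boards = defaultdict(set)
    for dirid, firmid, year in rows:
        boards[(firmid, year)].add(dirid)

    changes = {}
    previous = {}  # firmid -> board seen in the most recent earlier year
    for firmid, year in sorted(boards):
        board = boards[(firmid, year)]
        # Assumption: a firm's first observed year is flagged 0.
        changes[(firmid, year)] = int(firmid in previous and board != previous[firmid])
        previous[firmid] = board
    return changes


changes = board_changes(rows)
```

Comparing sets rather than concatenated strings means the order of directors within a year does not matter.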
I have a table which contains info on customer purchases per year and month respectively. Here is a simplified version.
id   year   month   nb_purch
----------------------------
1    2001   1       1
1    2001   2       4
1    2001   3       7
...  ...    ...     ...
1    2001   12      3
1    2003   1       3
1    2003   2       2
1    2003   3       5
1    2003   4       7
...  ...    ...     ...
1    2003   12      3
2    2001   1       3
2    2001   2       2
2    2001   3       5
2    2001   4       7
Basically there are several constraints. The database contains only the years when the client has made a purchase. If the client has made a purchase within the year X then X will be divided into 12 rows according to months. The months with no purchases have the value 0.
What I am trying to do is retrieve the number of purchases within a certain "window", which is currently 3 years. For example, I want to retrieve the sum of nb_purch within the last 3 years counting back from March 2003. This means I need to add all values from March 2001 to March 2003.
SELECT SUM(nb_purch) OVER (PARTITION BY id ORDER BY year, month ASC ROWS BETWEEN 36 PRECEDING AND CURRENT ROW) AS LAST_3_YEARS FROM T
The issue I am facing is that the table does not contain all years. In my example of purchases between 2001 and 2003, if the year 2002 is missing, a row-based window reaches further back than intended and I get wrong results. I would like to avoid having to add all missing years and fill them with NULL values for each customer.
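One way to reason about the problem is to bound the window by calendar position instead of by row count. The Python sketch below only illustrates that idea; it is not a SQL answer. The function name purchases_in_window is hypothetical, the sample rows are just the ones visible in the table (the elided rows are left out), and the window length is a parameter because the exact bounds are an assumption.

```python
# (id, year, month, nb_purch) rows visible in the table; year 2002 is absent.
rows = [
    (1, 2001, 1, 1), (1, 2001, 2, 4), (1, 2001, 3, 7), (1, 2001, 12, 3),
    (1, 2003, 1, 3), (1, 2003, 2, 2), (1, 2003, 3, 5), (1, 2003, 4, 7),
    (2, 2001, 1, 3), (2, 2001, 2, 2), (2, 2001, 3, 5), (2, 2001, 4, 7),
]


def purchases_in_window(rows, cust_id, year, month, window_months=36):
    """Sum nb_purch for one customer over the window_months months ending at (year, month)."""
    end = year * 12 + (month - 1)      # absolute month index of the last month
    start = end - window_months + 1    # absolute month index of the first month
    total = 0
    for rid, ryear, rmonth, nb_purch in rows:
        index = ryear * 12 + (rmonth - 1)
        if rid == cust_id and start <= index <= end:
            total += nb_purch
    return total


last_36 = purchases_in_window(rows, 1, 2003, 3)
march_to_march = purchases_in_window(rows, 1, 2003, 3, window_months=25)
```

Because the bounds are month indexes, a missing year simply contributes nothing. The same index could serve as the ordering key of a value-based (RANGE) window frame in databases that support numeric RANGE offsets; whether yours does is not something the question settles.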
How can I calculate the difference between each week and two weeks earlier for a given measure in MDX?
WEEK MEASURE NEW_MEASURE
---- ------- -----------
1 10 NULL
2 5 NULL
3 20 10
4 10 5
5 40 20
The members below work, but only without the CASE statement, so I have to calculate it separately:
MEMBER [Measures].[12 Week temp]
AS
([Date].[Week Year].CurrentMember, [Measures].[Total Orders]) -
([Date].[Week Year].lag(13), [Measures].[Total Orders])
MEMBER [Measures].[12 Week]
AS
CASE WHEN [Measures].[12 Week temp] = [Measures].[Total Orders] THEN 0 ELSE [Measures].[12 Week temp] END
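The expected NEW_MEASURE column in the table can be reproduced with a plain lag difference. The Python sketch below is only a check of that arithmetic, not MDX; the function name diff_vs_lag is hypothetical, and None stands in for NULL.

```python
def diff_vs_lag(values, lag=2):
    """Return values[i] - values[i - lag], or None where no earlier value exists."""
    return [
        values[i] - values[i - lag] if i >= lag else None
        for i in range(len(values))
    ]


measure = [10, 5, 20, 10, 40]       # MEASURE for weeks 1..5 from the table
new_measure = diff_vs_lag(measure)  # [None, None, 10, 5, 20]
```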
I have this in my database:
Table Days:
IdDay NameDay
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday
7 Sunday
Table Time:
IdTime Time
1 9am
2 10am
3 11am
Table Work:
IdWork NameWork IdDay IdTime
1 cleaning 6 3
2 Studying 1 2
I am trying to edit a matrix in my ReportViewer:
Day [Time]
[NameDay]
I want to use code to fill my matrix with logic like this:
if database day = day in the matrix && database time = time in the matrix
put NameWork
Is there a way to do that?
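The lookup in that pseudocode amounts to indexing the Work rows by their (IdDay, IdTime) pair. The Python sketch below shows only that logic with the sample rows from the three tables; it is not ReportViewer code, and the variable names are made up for illustration.

```python
days = {1: "Monday", 2: "Tuesday", 3: "Wednesday", 4: "Thursday",
        5: "Friday", 6: "Saturday", 7: "Sunday"}
times = {1: "9am", 2: "10am", 3: "11am"}
# (IdWork, NameWork, IdDay, IdTime)
work = [(1, "cleaning", 6, 3), (2, "Studying", 1, 2)]

# Index the work rows by their (IdDay, IdTime) pair.
by_slot = {(id_day, id_time): name for _, name, id_day, id_time in work}

# One row per day, one column per time; empty string where nothing is scheduled.
matrix = {
    days[d]: {times[t]: by_slot.get((d, t), "") for t in times}
    for d in days
}
```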
I have the following scenario:
Table is _etblpricelistprices
Columns are as follows:
iPriceListNameID iPricelistNameID iStockID fExclPrice
1 1 1 10
2 2 1 20
3 3 1 30
4 4 1 40
5 5 1 100
6 6 1 200
7 7 1 300
8 8 1 400
9 1 2 1000
10 2 2 2000
11 3 2 3000
12 4 2 4000
13 5 2 50
14 6 2 40
15 7 2 30
16 8 2 20
There are only two stock items here, but there are a lot more in the DB. The first column is the PK, which auto-increments. The second column is the pricelist, which is split as follows: 1-4 is current pricing and 5-8 is future pricing. The third column is the stock item's ID, and the fourth column is the price of the item.
I need a script to update this table to swap the future and current pricing per item. Please help.
Observe, if you will, that swapping the iPricelistNameID values will achieve the same overall effect as swapping the fExclPrice values, and can be performed using a formula:
UPDATE _etblpricelistprices
SET
iPricelistNameID = CASE
WHEN iPricelistNameID > 4 THEN iPricelistNameID - 4
ELSE iPricelistNameID + 4
END
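The mapping in that CASE expression can be checked on the sample rows for stock item 1. The Python sketch below is only an illustration of the arithmetic, with a hypothetical helper name swapped_pricelist; it assumes, as in the question, that pricelists 1-4 and 5-8 pair up in order.

```python
def swapped_pricelist(pricelist_id: int) -> int:
    """Mirror the CASE expression: 5-8 map to 1-4 and 1-4 map to 5-8."""
    return pricelist_id - 4 if pricelist_id > 4 else pricelist_id + 4


# (iPricelistNameID, iStockID) -> fExclPrice for stock item 1 from the question.
prices = {(1, 1): 10, (2, 1): 20, (3, 1): 30, (4, 1): 40,
          (5, 1): 100, (6, 1): 200, (7, 1): 300, (8, 1): 400}

# Re-key every price under its swapped pricelist ID.
after = {(swapped_pricelist(pl), stock): price for (pl, stock), price in prices.items()}
```

Applying the mapping twice returns every ID to its original value, so running the update a second time would undo the swap.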