Inserting into table with special conditions - sql-server

I hope you can help me out a little with my SQL problem. I am using SQL Server 2008. I basically have two tables:
1. Table (MAIN TABLE) that consists of the following information:
id|QUANTITY|AMOUNT|DATE
example:
ID | mainQUANTITY | mainAMOUNT | DATE       | subQUANTITY | subAMOUNT
---------------------------------------------------------------------
1  | 200          | 1200       | 02.02.2016 | ?           | ?
2  | 500          | 700        | 20.03.2016 | ?           | ?
2. Table (SUB TABLE) that consists of the following information:
ID|subQUANTITY|subAMOUNT|DATE
-----------------------
1 | 280 | 1600 |07.02.2016
2 | 140 | 110 |22.02.2016
So my problem is this: I want to implement the following logic in SQL Server.
Check whether the DATE of the first row in the SUB TABLE is identical to the date of the first row of the MAIN TABLE or lies between 02.02.2016 and 20.03.2016 (the dates in the MAIN TABLE).
If it does, insert the QUANTITY and AMOUNT values of that SUB TABLE row into the subQUANTITY and subAMOUNT columns of the MAIN TABLE, until they reach the maximum given by mainQuantity and mainAMOUNT.
The residual amount and the remaining quantity should be inserted into the subQUANTITY and subAMOUNT columns of the next row.
Do the same for every following SUB TABLE row, always carrying the remaining quantity and amount over to the next row of the MAIN TABLE.
example:
first ROW: ID1 -> DATE 07.02.2016 -> check if it is equal or between 02.02.2016 and 20.03.2016. -> YES IT is. -> INSERT INTO MAINTABLE:
result:
ID | mainQUANTITY | mainAMOUNT | DATE       | subQUANTITY | subAMOUNT
---------------------------------------------------------------------
1  | 200          | 1200       | 02.02.2016 | 200         | 1200
2  | 500          | 700        | 20.03.2016 | 220         | 510
EDIT: I forgot to mention that I don't have one column for the date. I have two columns, a startingDate and an EndDate. The date condition logic is still the same: it should check whether a date is exactly equal to startingDate or lies between startingDate and EndDate.
I tried my best to explain it as simply as possible. I hope you understand the logic :)
I am not good enough at T-SQL to program this by myself, which is why I really hope that you can help me.
Thank you very much for your time and your effort.
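A minimal sketch of the allocation logic described in the question, written in Python rather than T-SQL just to pin down the arithmetic. It assumes the single DATE column from the example (not the startingDate/EndDate pair from the EDIT), treats the earliest and latest main dates as the valid window, and pools all qualifying sub rows before spreading them over the main rows. The function name and dictionary keys are made up for illustration.

```python
from datetime import date

def allocate(main_rows, sub_rows):
    """Greedily spread the pooled sub quantities and amounts over the main rows."""
    # Assumption: the valid window runs from the earliest to the latest main date.
    start = min(r["date"] for r in main_rows)
    end = max(r["date"] for r in main_rows)
    in_range = [s for s in sub_rows if start <= s["date"] <= end]

    # Pool everything that passed the date check.
    qty_left = sum(s["quantity"] for s in in_range)
    amt_left = sum(s["amount"] for s in in_range)

    result = []
    for row in sorted(main_rows, key=lambda r: r["id"]):
        # Each main row takes at most its own mainQUANTITY / mainAMOUNT.
        sub_qty = min(row["quantity"], qty_left)
        sub_amt = min(row["amount"], amt_left)
        qty_left -= sub_qty
        amt_left -= sub_amt
        result.append({**row, "sub_quantity": sub_qty, "sub_amount": sub_amt})
    return result

main_rows = [
    {"id": 1, "quantity": 200, "amount": 1200, "date": date(2016, 2, 2)},
    {"id": 2, "quantity": 500, "amount": 700, "date": date(2016, 3, 20)},
]
sub_rows = [
    {"id": 1, "quantity": 280, "amount": 1600, "date": date(2016, 2, 7)},
    {"id": 2, "quantity": 140, "amount": 110, "date": date(2016, 2, 22)},
]
allocated = allocate(main_rows, sub_rows)
for row in allocated:
    print(row["id"], row["sub_quantity"], row["sub_amount"])
```

With the sample data this reproduces the expected result table (200 / 1200 for ID 1, 220 / 510 for ID 2). In T-SQL the same idea would probably need running totals or a cursor, which is not shown here.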

Related

Cognos Calculate Variance between dates

I have a "Master data" list like this :
Description | Snapchot - Date | Value
-------------------------------------
XXX         | 2023-01-05      | 150
XXX         | 2023-01-05      | 100
XXX         | 2023-01-06      | 350
XXX         | 2023-01-07      | 200
My goal is to create a pivot table that calculates the difference from the day before:
| 2023-01-05 | 2023-01-06 | 2023-01-07 |
| Value | Diff. | Value | Diff. | Value | Diff. |
------------------------------------------------------
XXX | 250 | 0 | 350 | 100 | 200 | -150 |
My problem is that I don't know how to make my calculation between two time periods :
(Value of [Snapchot - Date]) - (Value of ([Snapchot - Date] - 1 day))
What I tried is to make a second query where I use this expression :
_add_days ([Snapchot - Date] ; -1)
This works, but there's a mismatch in my values. I don't get the correct values for every date that I have in my master list.
How can I create a query that gives me the value of the day before ? So that I can do :
[ValueCurrentDay] - [ValueDayBefore]
Based on the requirement, you may be looking for a running-difference. Have a look at the summary folder for the syntax of that summary function to see if it applies to your requirement.
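To make the intended arithmetic concrete, here is a small Python sketch (not Cognos syntax) of a running difference over the sample data: values are first summed per description and date, then each date is compared with the previous snapshot date for the same description. Note that it compares against the previous date present in the data, which is an assumption; a strict "calendar day before" rule would behave differently when days are missing. The function name is hypothetical.

```python
from collections import defaultdict

rows = [
    ("XXX", "2023-01-05", 150),
    ("XXX", "2023-01-05", 100),
    ("XXX", "2023-01-06", 350),
    ("XXX", "2023-01-07", 200),
]

def daily_diff(rows):
    """Return {(description, date): (value, diff_to_previous_snapshot)}."""
    # Step 1: aggregate the master list per description and snapshot date.
    totals = defaultdict(int)
    for description, snapshot_date, value in rows:
        totals[(description, snapshot_date)] += value

    # Step 2: walk the dates in order (ISO strings sort chronologically)
    # and subtract the previous snapshot's value for the same description.
    result = {}
    previous = {}
    for description, snapshot_date in sorted(totals):
        value = totals[(description, snapshot_date)]
        prev = previous.get(description)
        diff = 0 if prev is None else value - prev
        result[(description, snapshot_date)] = (value, diff)
        previous[description] = value
    return result

for key, (value, diff) in daily_diff(rows).items():
    print(key, value, diff)
```

For the sample data this gives 250 / 0, 350 / 100 and 200 / -150, matching the pivot table in the question.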

Data warehouse design - periodic snapshot with frequently changing dimension keys

Imagine a fact table with a summation of measures over a time period, say 1 hour.
Start Date | Measure 1 | Measure 2
-------------------------------------------
2018-09-08 00:00:00 | 5 | 10
2018-09-08 01:00:00 | 12 | 20
Ideally we want to maintain the grain such that each row is exactly 1 hour. However, each row references dimensions which might ‘break’ the grain. For instance:
Start Date | Measure 1 | Measure 2 | Dim 1
---------------------------------------------------
2018-09-08 00:00:00 | 5 | 10 | key 1
2018-09-08 01:00:00 | 12 | 20 | key 2
It is possible that the dimension value may change 30 minutes into the hour in which case, the above would be inaccurate and should be represented like this:
Start Date | Measure 1 | Measure 2 | Dim 1
---------------------------------------------------
2018-09-08 00:00:00 | 5 | 10 | val 1
2018-09-08 00:30:00 | 5 | 10 | val 2
2018-09-08 01:00:00 | 12 | 20 | val 2
In our scenario, the data needs to be sliced by at least 5 dimension keys with queries like:
sum(measure1) where dim1 = x and dim2 = y..
Is there a design pattern for this requirement? I have considered ‘periodic snapshots’ but I have not read anywhere about this kind of row splitting on dimension changes.
I can see only two options:
1. Store the dimension values that were most present on each row (e.g. if a dimension value was true for the majority of the time in the hour, use this value). This would lead to some loss of accuracy.
2. Split each row on every dimension change. This is complex in the ETL, creates more data and breaks the granularity rule in the fact table.
Option 2 is the current solution and serves the purpose but is harder to maintain. Is there a better way to do this, or other options?
By way of a real example, this system records production data in a manufacturing environment so the data is something like:
Line | Date | Crew | Product | Running Time (mins)
-----------------------------------------------------------------------
Line 1 | 2018-09-08 00:00:00 | Crew A | Product A | 60
As noted, the crew, product or any of the other dimensions may change multiple times within the hour.
You shouldn't need to split the time portion of your fact table, since you clearly want to report hourly data, but you should have one record per dimension value within each hour. If this is an aggregate of a transactional fact table, the process that loads the hourly table should group the records by each dimension key. So in your example above, you should have two records for the hour, like so:
Start Date | Measure 1 | Measure 2 | Dim 1
---------------------------------------------------
2018-09-08 00:00:00 | 5 | 10 | val 1
2018-09-08 01:00:00 | 5 | 10 | val 1
2018-09-08 01:00:00 | 12 | 10 | val 2
You will need to take into account the other measures as well and make sure they all go into the correct bucket (val 1 or val 2). I split them evenly in the example.
Now if you slice by hour 1 and by Dim 1 Value 2, you will only see 12 (measure 1), and if you slice on hour 1, dim 1 value 1, you will only see 5, and if you only slice on hour 1, you will see 17.
Remember, your grain is defined by the level of each dimension, not just the time dimension. HTH.
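As a rough illustration of the grouping the answer describes, the following Python sketch rolls transactional records up to one row per hour and dimension value. The event timestamps and the function name are hypothetical; only the resulting hourly values (5 and 12 for hour 1) come from the example above.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transactional records: (timestamp, Dim 1 value, Measure 1).
events = [
    (datetime(2018, 9, 8, 0, 10), "val 1", 5),
    (datetime(2018, 9, 8, 1, 5), "val 1", 5),
    (datetime(2018, 9, 8, 1, 40), "val 2", 12),
]

def load_hourly(events):
    """Aggregate to the grain (hour, Dim 1): one row per hour and dimension value."""
    buckets = defaultdict(int)
    for ts, dim1, measure1 in events:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(hour, dim1)] += measure1
    return dict(buckets)

hourly = load_hourly(events)
hour_1 = datetime(2018, 9, 8, 1)

# Slicing by hour and dimension value, or by hour alone.
print(hourly[(hour_1, "val 2")])
print(hourly[(hour_1, "val 1")])
print(sum(v for (h, _), v in hourly.items() if h == hour_1))
```

The time column stays at whole hours; the extra rows come only from the dimension key being part of the grain.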

Use an array formula to calculate datedif on criteria

Let's say I have a record of logged flights in a range, example below
[A] | [B] | [C][D][...] | [G]
1 Date | Mode | More Data.... | Days Since
2 1 May | Day | .... | Formula here
3 4 May | Night | .... | Formula here
4 6 May | Day | .... | Formula here
5 8 May | Night | .... | Formula here
I can use a formula to get the datedif between each row in column G, similar to
=DATEDIF(A2,A3,"d")
and copy it all the way down the column, but I'm guessing I need an array formula to go back and find the first row above the current row that has the same value in column B, and then get the DATEDIF (days) between those two dates. What would be the best way to go about that? I need the result to be the days between rows 5 and 3 (Night) and rows 4 and 2 (Day), and then copied down about 300 rows...
I was looking at another array formula for sorting rows and eliminating blanks, but I'm not sure how to adapt it to this scenario.
To get the difference in days you only have to subtract one date from the other.
The LOOKUP function can be used to find the previous match, so try this formula in G2, copied down:
=IFERROR(A2-LOOKUP(2,1/(B$1:B1=B2),A$1:A1),"")
Format the result cells as a number with no decimal places.
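For readers who want to check what the formula computes, here is the same "days since the previous row with the same mode" logic as a short Python sketch. The year 2023 is an arbitrary assumption (the question gives only day and month), and the function name is made up.

```python
from datetime import date

# Sample rows from the question; the year is assumed for illustration.
flights = [
    (date(2023, 5, 1), "Day"),
    (date(2023, 5, 4), "Night"),
    (date(2023, 5, 6), "Day"),
    (date(2023, 5, 8), "Night"),
]

def days_since_previous(flights):
    """For each row, days since the closest earlier row with the same mode."""
    last_seen = {}  # mode -> date of the most recent earlier row
    result = []
    for flight_date, mode in flights:
        prev = last_seen.get(mode)
        # None plays the role of the "" that IFERROR returns for a first occurrence.
        result.append(None if prev is None else (flight_date - prev).days)
        last_seen[mode] = flight_date
    return result

print(days_since_previous(flights))
```

The first Day and Night rows have no earlier match, so they stay empty; the later rows give 5 and 4 days.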

EXCEL Array SUM IF unique combinations of multiple criteria

Is it possible to create an array formula based on a table where it searches for distinct combinations of data before summing? Here is some sample data:
| A | B | C | D |
---------------------------------------------------
1 | 1 Jun | Charlie | D | 1.3 |
2 | 1 Jun | Charlie | N | 1.4 |
3 | 1 Jun | Dave | D | 1.3 |
4 | 2 Jun | Charlie | N | 0.6 |
5 | 2 Jun | Dave | D | 1.5 |
What I'd like the array formula to do is look at the table, ignore column B, find the distinct rows (by A, C and D, or more criteria) and sum column D only once per distinct combination. In this example, it should not sum Row 3 because on 1 Jun, "D" already has a 1.3.
PivotTables and VBA are out as I am designing this for the lowest level user on super restrictive computers.
EDIT: Picture of expected results, changed sample data to reflect results.
Kind of the data I am looking for.... I replaced the ID with the LName to visualize. I need to exclude individuals and get only aircraft hours by their modes. You can see toward the bottom of the image, 2 people are on one aircraft. I used MAX to keep it from SUMming it up and skewing the hours.
The levels on the left are by Date, Aircraft, Flight Number, then Person. I need to aggregate the hours down to the Flight Number of each date.
The original formula is
=SUM((FLTD_HRS)*(FLTD_DATE>=$D$3)*(FLTD_DATE<=$D$4)*(IF(FLTD_ACFT_MDS=CONFIG!$F$8,1,0)))
However, as it is, it does not get me unique flights. I'd prefer an array formula to a PivotTable.... XD
Create a table with the unique combos of A, C, and D. This works assuming the set of combinations is finite, or at least not changing, but you can also just add every combination you could possibly have and wrap the formula in an IF(row="","",x) so that unused rows show nothing.
You can add a new column (E) to your first table and then lookup the value in the new table or calculate in the new table.
E = SUMIFS(D$3:D$30,A$3:A$30,A3,C$3:C$30,C3,D$3:D$30,D3)
E1 = {=INDEX(A$3:E$30,MATCH(G3&H3&I3,A$3:A$30&C$3:C$30&D$3:D$30,0),5)}
E2 = SUMIFS(D$3:D$30,A$3:A$30,G3,C$3:C$30,H3,D$3:D$30,I3)
I ended up creating a calculation sheet to create a unique key for each flight by concatenating the fields I wanted to match and also eliminated the duplicates by searching for "VALID_PC" which searched for the highest rank of the person in the crew flying. The formulas I used in columns K and L were:
=IF(IFERROR(MATCH(H3,VALID_PC,0),"0"),A3&RIGHT(C3,5)&D3&E3,0)
and
=IFERROR(IF(A2=0,0,IF(COUNTIF($K$2:K3,K3)>1,0,1)),0)
This allowed me to do a simple {=SUM} on the main page over the rows where column L had a 1 in it.
I do appreciate the help and effort you all put in to help me.
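The "sum each distinct combination once" idea from this thread can be sketched in a few lines of Python. This is only an illustration of the logic using the sample table from the question (columns A, C and D form the key, column B is ignored); it is not a replacement for the worksheet formulas, and the function name is hypothetical.

```python
# Sample rows from the question: (A date, B name, C mode, D hours).
rows = [
    ("1 Jun", "Charlie", "D", 1.3),
    ("1 Jun", "Charlie", "N", 1.4),
    ("1 Jun", "Dave", "D", 1.3),
    ("2 Jun", "Charlie", "N", 0.6),
    ("2 Jun", "Dave", "D", 1.5),
]

def sum_unique(rows):
    """Sum column D once per distinct (A, C, D) combination, ignoring column B."""
    seen = set()
    total = 0.0
    for flight_date, _name, mode, hours in rows:
        key = (flight_date, mode, hours)
        if key not in seen:  # row 3 repeats ("1 Jun", "D", 1.3) and is skipped
            seen.add(key)
            total += hours
    return total

print(round(sum_unique(rows), 2))
```

The concatenated key in column K of the calculation sheet plays the same role as the tuple key here, and the COUNTIF check in column L corresponds to the `seen` set.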

How can I create a conditional SUMPRODUCT formula for a horizontal set of timeseries data

I'm attempting to use SUMPRODUCT on 2 sets of timeseries data that are formatted in the following way:
Array 1
| 01/01/2016 | 02/01/2016 | ...
Stock1 | Price1a | Price1b |
Stock2 | Price2a | Price2b |
Array 2
| 01/01/2016 | 02/01/2016 | ...
Stock1 | Volume1a | Volume1b |
Stock2 | Volume2a | Volume2b |
Such that for a given date, the sumproduct formula will perform price * volume for all stocks in that date.
Example:
For 01/01/2016, the formula will return Price1a * Volume1a + Price2a * Volume2a.
Appreciate any help or questions in case it is unclear.
I put the ranges on two different sheets and used this formula:
=SUMPRODUCT((INDEX($A:$C,2,MATCH(E1,$1:$1,0)):INDEX($A:$C,MATCH("ZZZZ",A:A),MATCH(E1,$1:$1,0)))*(INDEX(Sheet3!$A:$C,2,MATCH(E1,Sheet3!$1:$1,0)):INDEX(Sheet3!$A:$C,MATCH("ZZZZ",Sheet3!A:A),MATCH(E1,Sheet3!$1:$1,0))))
The INDEX/MATCH pairs define the extents of the arrays. The tables must have the same number of rows for this to work, and the prices and volumes must be in the same order.
In the first and third INDEX there is ,2,. The 2 is the starting row. If your starting row is different, then change it to your starting row number.
The MATCH("ZZZZ"... will find the last row with text in it, so make sure there is nothing below the tables.
Also, all the $C references need to change to the extents of your data. The range can be larger than your data without detriment, so if you just want to set it to something like $CC it will be fine.
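As a cross-check of what the formula is meant to return, here is a small Python sketch of the same calculation: locate the column for the requested date, then sum price times volume over all stocks. The numeric prices and volumes are invented placeholders, and the function name is hypothetical.

```python
# Header row shared by both tables, as in the question.
dates = ["01/01/2016", "02/01/2016"]

# Placeholder numbers standing in for Price1a, Price1b, ... and Volume1a, ...
prices = {"Stock1": [10.0, 11.0], "Stock2": [20.0, 21.0]}
volumes = {"Stock1": [100, 150], "Stock2": [200, 250]}

def sumproduct_for_date(target, dates, prices, volumes):
    """Sum price * volume over all stocks for the column matching `target`."""
    col = dates.index(target)  # plays the role of MATCH(E1,$1:$1,0)
    # Rows are paired by stock name, so row order does not matter here.
    return sum(prices[stock][col] * volumes[stock][col] for stock in prices)

print(sumproduct_for_date("01/01/2016", dates, prices, volumes))
print(sumproduct_for_date("02/01/2016", dates, prices, volumes))
```

Unlike the worksheet formula, this version pairs rows by stock name, so it does not depend on both tables listing the stocks in the same order.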

Resources