Extracting data from MS SQL Server-2008 referring multiple tables


I asked a similar question here earlier, and jpw helped me with the query. The situation is the same, with only a bit more detail added. I have four tables. The sample structure for three of them is given below:
I was helped to form the query below:
select
d.LOTQty,
ApprovedQty = count(d.SerialNo),
d.DispatchDate,
Installed = count(a.SerialNo) + count(r.SerialNo)
from
Despatch d
left join
Activation a
on d.SerialNo= a.SerialNo
and d.DispatchDate <= a.ActivationDate
and d.LOTQty = a.LOTQty
left join
Replaced r
on d.SerialNo= r.SerialNo
and d.DispatchDate <= r.ActivationDate
and (a.ActivationDate is null or a.ActivationDate < d.DispatchDate)
where
d.LOTQty = 15
group by
d.LOTQty, d.DispatchDate, d.STBModel
To explain: the above query matches the Despatch table's SerialNo against the Activation table. If a match is found, it compares the dates. Only the serial numbers with DispatchDate < ActivationDate are counted, while the others (those that didn't match, or whose DispatchDate > ActivationDate) are matched against Replaced using a similar date criterion. In the end we find 9 matches, i.e. 7 from Activation and 2 from Replaced, as below:
LotQty | ApprovedQty | DispatchDate | Installed
15 | 10 | 2013-8-7 | 9
I want to display two more columns here, DOA and Bounce, like this:
LotQty | ApprovedQty | DispatchDate | Installed | DOA | Bounce
15 | 10 | 2013-8-7 | 9 | 2 | 4
DOA and Bounce should be calculated from the difference between the FailedDate in the fourth table (Failed) and the respective Activation/Record date (henceforth act_rec_date) of the 9 matched SerialNos above. The structure of the Failed table and of the intermediate set of 9 matched SerialNos is shown below:
The Intermediate table doesn't physically exist; it is shown only for reference and clarity. It contains the SerialNos that were matched with the Activation and Replaced tables. Its act_rec_date field is the correspondingly matched Activation/Record date.
DOA & Bounce: match all 9 resulting SerialNos (i.e. the Intermediate table) with the Failed table. Where there is a match, calculate the difference between FailedDate and act_rec_date. If the difference is 0 to 10 days, count it under DOA; if it is more than 10 and up to 180 days, count it under Bounce. From Failed we find 6 matches: Product1 and Product2 fall under DOA, as their difference from act_rec_date is 0, and Product7, 8, 9 and 10 fall under Bounce, as their differences are 89, 54, 61 and 61 days. So, as shown above, DOA = 2 and Bounce = 4.
I want to build a query that gives me DOA and Bounce as well. I tried creating a temp table, dumping the resulting SerialNos and act_rec_date into it, and then matching the temp table against the Failed table. I couldn't get it working, and furthermore the query took around 7 minutes to execute.
P.S. My actual tables contain around 50k to 100k rows.

Continuing on the previous query I think the new columns could be added with a conditional aggregation in the select statement and another left join for the failed table.
This should work, but I'm sure the query can be improved:
select
d.LOTQty,
ApprovedQty = count(d.SerialNo),
d.DispatchDate,
Installed = count(a.SerialNo) + count(r.NewSerialNo),
DOA = sum(case when datediff(day, coalesce(a.ActivationDate,r.RecordDate), f.FailedDate) <= 10 then 1 else 0 end),
Bounce = sum(case when datediff(day, coalesce(a.ActivationDate,r.RecordDate), f.FailedDate) between 11 and 180 then 1 else 0 end)
from
Despatch d
left join
Activation a
on d.SerialNo= a.SerialNo
and d.DispatchDate <= a.ActivationDate
and d.LOTQty = a.LOTQty
left join
Replaced r
on d.SerialNo= r.NewSerialNo
and d.DispatchDate <= r.RecordDate
and (a.ActivationDate is null or a.ActivationDate < d.DispatchDate)
left join
Failed f
on (f.FailedSINo = a.SerialNo)
or (f.FailedSINo = r.NewSerialNo)
where
d.LOTQty = 15
group by
d.LOTQty, d.DispatchDate
Sample SQL Fiddle with test data
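The conditional aggregation idea can be tried in isolation. The following is only a minimal sketch, not the query above: it uses Python's built-in sqlite3 module in place of SQL Server, replaces DATEDIFF with a julianday() subtraction, and the tables installed and failed (with their columns and rows) are hypothetical stand-ins for the intermediate result and the Failed table. Unlike the `<= 10` test in the query above, the sketch also puts a lower bound of 0 on DOA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins: 'installed' plays the role of the intermediate
-- result (serial number + act_rec_date), 'failed' the Failed table.
CREATE TABLE installed (serial_no TEXT, act_rec_date TEXT);
CREATE TABLE failed (failed_serial_no TEXT, failed_date TEXT);
INSERT INTO installed VALUES
  ('P1', '2013-08-10'), ('P2', '2013-08-10'), ('P3', '2013-08-12'),
  ('P4', '2013-08-15'), ('P5', '2013-08-16');
INSERT INTO failed VALUES
  ('P1', '2013-08-10'),   -- 0 days after act_rec_date   -> DOA
  ('P2', '2013-08-20'),   -- 10 days after act_rec_date  -> DOA
  ('P3', '2013-11-09'),   -- 89 days after act_rec_date  -> Bounce
  ('P4', '2014-08-15');   -- 365 days after act_rec_date -> neither
""")

# One pass over the joined rows; each SUM(CASE ...) counts one bucket.
# Rows without a failure (P5) yield NULL differences and fall into ELSE 0.
doa, bounce = conn.execute("""
SELECT
  SUM(CASE WHEN julianday(f.failed_date) - julianday(i.act_rec_date)
                BETWEEN 0 AND 10 THEN 1 ELSE 0 END) AS doa,
  SUM(CASE WHEN julianday(f.failed_date) - julianday(i.act_rec_date) > 10
            AND julianday(f.failed_date) - julianday(i.act_rec_date) <= 180
           THEN 1 ELSE 0 END) AS bounce
FROM installed i
LEFT JOIN failed f ON f.failed_serial_no = i.serial_no
""").fetchone()

print(doa, bounce)
```

Because both counts come from a single scan of the joined rows, no temp table is needed for the bucketing step.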

Related

Find nearest row that matches condition in SQL Server

I have a SQL table with unique IDs, a date of service for a health care encounter, and whether this encounter was an emergency room visit (ed = 1) or a hospital admission (hosp = 1).
For each unique ID, I want to identify ED visits that occurred <= 1 calendar day from a hospital stay.
Thus I think I want to ask SQL first identify ED visits and then search up and down to find the nearest hospital admission and calculate the difference in dates (absolute value). I'm familiar with lag/lead and rownumber() functions, but can't quite seem to figure this out.
Any ideas would be much appreciated! Thank you!
Table looks like this for one illustrative ID:
id date ed hosp
1 2012-01-01 0 1
1 2012-01-05 1 0
1 2012-02-01 0 1
1 2012-02-03 1 0
1 2012-05-01 0 0
And I want to create a new column (ed_hosp_diff) that is the minimum absolute date difference (days) between each ED visit and the closest hospital stay, something like this:
id date ed hosp ed_hosp_diff
1 2012-01-01 0 1 null
1 2012-01-05 1 0 4
1 2012-02-01 0 1 null
1 2012-02-03 1 0 2
1 2012-05-01 0 0 null

So this doesn't get you the output table you show, but it meets the requirement you list:
For each unique ID, I want to identify ED visits that occurred <= 1
calendar day from a hospital stay.
Your output table doesn't really give you that - it includes rows for ED Visits that don't have a matching hospital admit, and has rows for hospital admits, etc. This SQL doesn't give you those, it just gives you the ED Visits that were followed by a hospital admit within one day.
It also doesn't give you matches with negative days - cases where the hospital visit is prior to the ED visit (in terms of healthcare analytics, that's usually a different thing than looking for ED Visits followed by an IP Admit). If you do want those, delete the last bit of logic in the WHERE clause for the main query.
SELECT
ID = e.id,
ED_DATE = e.date,
HOSP_DATE = h.date,
ED_HOSP_DIFF = DATEDIFF(dd, e.date, h.date)
FROM
Table1 AS e
JOIN
(
SELECT
id,
date
FROM
Table1
WHERE
hosp = 1
) AS h
ON
e.id = h.id
WHERE
e.ed = 1
AND
DATEDIFF(dd, e.date, h.date) <= 1
AND
DATEDIFF(dd, e.date, h.date) >= 0

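Use OUTER APPLY: for each record with ed = 1, find the minimum date difference to a hospital stay (hosp = 1) for the same id. Table1 stands in for your table name.
SELECT t.*, eh.ed_hosp_diff
FROM Table1 t
OUTER APPLY
(
SELECT ed_hosp_diff = MIN ( ABS ( DATEDIFF(DAY, t.date, x.date) ) )
FROM Table1 x
WHERE x.id = t.id
AND x.hosp = 1
AND t.ed = 1
) eh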
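As a rough way to check the nearest-admission logic against the sample rows from the question, here is a small sketch using Python's sqlite3 module. SQLite has no OUTER APPLY or DATEDIFF, so this assumes a correlated subquery and julianday() as substitutes; the table name visits is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (id INTEGER, date TEXT, ed INTEGER, hosp INTEGER);
INSERT INTO visits VALUES
  (1, '2012-01-01', 0, 1),
  (1, '2012-01-05', 1, 0),
  (1, '2012-02-01', 0, 1),
  (1, '2012-02-03', 1, 0),
  (1, '2012-05-01', 0, 0);
""")

# For every ED visit, take the smallest absolute day difference to any
# hospital stay of the same id; non-ED rows get NULL.
rows = conn.execute("""
SELECT t.id, t.date, t.ed, t.hosp,
       CASE WHEN t.ed = 1 THEN (
           SELECT MIN(ABS(julianday(x.date) - julianday(t.date)))
           FROM visits x
           WHERE x.id = t.id AND x.hosp = 1
       ) END AS ed_hosp_diff
FROM visits t
ORDER BY t.id, t.date
""").fetchall()

# julianday() differences are floats, so convert them to whole days.
diffs = [None if r[4] is None else int(r[4]) for r in rows]
print(diffs)
```

Filtering the result on ed_hosp_diff <= 1 would then give the ED visits within one calendar day of a hospital stay.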

Holiday Availability Calender - sum available days still left to sell over consecutive days

What I require is the min & max of the BaseDate where AvailabletoSell = 1 and there are 3 or more consecutive days still available to sell. However, the sum needs to be excluded if the property's changeover day starts on the same day as the BaseDate, as we are only interested in the gaps that we can't sell due to changeover restrictions. The data would have to be grouped by Code, as we have over 1,000 properties. BaseDates are for 2015 & 2016.
NB: Some properties have more than one ChangeoverDay; these are currently held comma-separated in one column, e.g. Saturday, Sunday
Example Data:-
DECLARE @sampleData TABLE (
Code VARCHAR(5) NOT NULL
, BaseDate DATE NOT NULL
, DayName VARCHAR(9) NOT NULL
, ChangeoverDay VARCHAR(8) NOT NULL
, AvailabletoSell BIT NOT NULL
);
INSERT INTO @sampleData VALUES
('PERCH','2015-05-06','Wednesday','Saturday',0),
('PERCH','2015-05-07','Thursday','Saturday',0),
('PERCH','2015-05-08','Friday','Saturday',0),
('PERCH','2015-05-09','Saturday','Saturday',1), -- Not this one as changeover day is the same as the BaseDate
('PERCH','2015-05-10','Sunday','Saturday',1),
('PERCH','2015-05-11','Monday','Saturday',1),
('PERCH','2015-05-12','Tuesday','Saturday',0),
('PERCH','2015-05-13','Wednesday','Saturday',0),
('PERCH','2015-05-14','Thursday','Saturday',1), -- This one = 3
('PERCH','2015-05-15','Friday','Saturday',1),
('PERCH','2015-05-16','Saturday','Saturday',1),
('PERCH','2015-05-17','Sunday','Saturday',0),
('PERCH','2015-05-18','Monday','Saturday',1), -- This one = 4
('PERCH','2015-05-19','Tuesday','Saturday',1),
('PERCH','2015-05-20','Wednesday','Saturday',1),
('PERCH','2015-05-21','Thursday','Saturday',1),
('PERCH','2015-05-22','Friday','Saturday',0),
('PERCH','2015-05-23','Saturday','Saturday',0),
('PERCH','2015-05-24','Sunday','Saturday',0),
('PERCH','2015-05-25','Monday','Saturday',0),
('PERCH','2015-05-26','Tuesday','Saturday',0),
('PERCH','2015-05-27','Wednesday','Saturday',1), -- Not this one, as only 2 consecutive days
('PERCH','2015-05-28','Thursday','Saturday',1),
('PERCH','2015-05-29','Friday','Saturday',0),
('PERCH','2015-05-30','Saturday','Saturday',0);
I would require the output as below:-
+-------+---------------+-------------+----------------------+
| Code | StartBaseDate | EndBaseDate | TotalAvailabletoSell |
+-------+---------------+-------------+----------------------+
| PERCH | 14/05/2015 | 16/05/2015 | 3 |
| PERCH | 18/05/2015 | 21/05/2015 | 4 |
+-------+---------------+-------------+----------------------+

This gives you what you want. But I feel there's a way to reduce the number of times it touches the table
WITH Groupings AS (
SELECT
Code
,LastChange
,MIN(BaseDate) AS StartBaseDate
,MAX(BaseDate) AS EndBaseDate
,COUNT(*) AS DaysInPeriod
FROM
@sampleData AS s1
CROSS APPLY (
SELECT
MAX(BaseDate) AS LastChange
FROM
@sampleData AS cv
WHERE
s1.BaseDate > cv.BaseDate
AND s1.AvailabletoSell != cv.AvailabletoSell
AND s1.Code = cv.Code
) AS cv
WHERE
s1.AvailabletoSell = 1
GROUP BY
Code
,LastChange
)
SELECT
g.Code
,g.StartBaseDate
,g.EndBaseDate
,CASE WHEN a.DayName = a.ChangeoverDay THEN DaysInPeriod - 1 ELSE DaysInPeriod END AS TotalAvailableToSell
FROM
Groupings AS g
INNER JOIN @sampleData AS a
ON a.BaseDate = g.StartBaseDate AND a.Code = g.Code
WHERE
CASE WHEN a.DayName = a.ChangeoverDay THEN DaysInPeriod - 1 ELSE DaysInPeriod END > 2
The logic is pretty much:
Find the last date where the AvailableToSell flag flipped before "this row"
Group into sets by those dates and count the rows in it
Decrement by 1 if the start date has DayName as the ChangeoverDay
I haven't accounted for your note about ChangeoverDay being a comma-separated field. There are plenty of resources on breaking that out into rows, which you could then join to. But I think you also need to spell out what should happen in that scenario when DayName is in the list of ChangeoverDays.
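The three steps above can be sketched end to end on a slice of the sample data. This is only an illustration under assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, so the CROSS APPLY becomes a correlated subquery, and the table and column names (sample_data, base_date and so on) are renamed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample_data (code TEXT, base_date TEXT, day_name TEXT,
                          changeover_day TEXT, available INTEGER);
INSERT INTO sample_data VALUES
  ('PERCH','2015-05-08','Friday','Saturday',0),
  ('PERCH','2015-05-09','Saturday','Saturday',1),
  ('PERCH','2015-05-10','Sunday','Saturday',1),
  ('PERCH','2015-05-11','Monday','Saturday',1),
  ('PERCH','2015-05-12','Tuesday','Saturday',0),
  ('PERCH','2015-05-13','Wednesday','Saturday',0),
  ('PERCH','2015-05-14','Thursday','Saturday',1),
  ('PERCH','2015-05-15','Friday','Saturday',1),
  ('PERCH','2015-05-16','Saturday','Saturday',1),
  ('PERCH','2015-05-17','Sunday','Saturday',0),
  ('PERCH','2015-05-18','Monday','Saturday',1),
  ('PERCH','2015-05-19','Tuesday','Saturday',1),
  ('PERCH','2015-05-20','Wednesday','Saturday',1),
  ('PERCH','2015-05-21','Thursday','Saturday',1),
  ('PERCH','2015-05-22','Friday','Saturday',0);
""")

periods = conn.execute("""
WITH flagged AS (
  -- Step 1: for each available day, the last earlier date where the flag differed
  SELECT s1.code, s1.base_date,
         (SELECT MAX(cv.base_date)
            FROM sample_data cv
           WHERE cv.code = s1.code
             AND cv.base_date < s1.base_date
             AND cv.available != s1.available) AS last_change
  FROM sample_data s1
  WHERE s1.available = 1
),
groupings AS (
  -- Step 2: days sharing the same last flip date form one consecutive run
  SELECT code, last_change,
         MIN(base_date) AS start_date,
         MAX(base_date) AS end_date,
         COUNT(*) AS days_in_period
  FROM flagged
  GROUP BY code, last_change
)
-- Step 3: subtract one day if the run starts on the changeover day, keep runs > 2
SELECT g.code, g.start_date, g.end_date,
       g.days_in_period
         - CASE WHEN a.day_name = a.changeover_day THEN 1 ELSE 0 END AS total
FROM groupings g
JOIN sample_data a ON a.code = g.code AND a.base_date = g.start_date
WHERE g.days_in_period
        - CASE WHEN a.day_name = a.changeover_day THEN 1 ELSE 0 END > 2
ORDER BY g.code, g.start_date
""").fetchall()

for row in periods:
    print(row)
```

The run starting on Saturday 2015-05-09 drops to 2 sellable days after the changeover adjustment, so only the two later runs remain.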

Netezza: Show dates even if 0 data for that day

I run this query through an ODBC connection in Excel for a refreshable report with data for every 4 weeks. I need to show every date in each of the 4 weeks, even if there is no data for that day, because this data is then linked to a graph. Is there a way to do this?
thanks.
Select b.INV_DT, sum( a.ORD_QTY) as Ordered, sum( a.SHIPPED_QTY) as Shipped
from fct_dly_invoice_detail a, fct_dly_invoice_header b, dim_invoice_customer c
where a.INV_HDR_SK = b.INV_HDR_SK
and b.DIM_INV_CUST_SK = c.DIM_INV_CUST_SK
and a.SRC_SYS_CD = 'ABC'
and a.NDC_NBR is not null
and b.inv_dt between CURRENT_DATE - 16 and CURRENT_DATE
and b.store_nbr in (2851, 2963, 3249, 3385, 3447, 3591, 3727, 4065, 4102, 4289, 4376, 4793, 5209, 5266, 5312, 5453, 5569, 5575, 5892, 6534, 6571, 7110, 9057, 9262, 9652, 9742, 10373, 12392, 12739, 13870
)
group by 1

The general purpose solution to this is to create a date dimension table, and then perform an outer join to that date dimension table on the INV_DT column.
There are tons of good resources you can search for on creating a good date dimension table, so I'll just create a quick and dirty (and trivial) example here. I highly recommend some research in that area if you'll be doing a lot of BI/reporting.
If our table we want to report from looks like this:
Table "TABLEZ"
Attribute | Type | Modifier | Default Value
-----------+--------+----------+---------------
AMOUNT | BIGINT | |
INV_DT | DATE | |
Distributed on random: (round-robin)
select * from tablez order by inv_dt
AMOUNT | INV_DT
--------+------------
1 | 2015-04-04
1 | 2015-04-04
1 | 2015-04-06
1 | 2015-04-06
(4 rows)
and our report looks like this:
SELECT inv_dt,
SUM(amount)
FROM tablez
WHERE inv_dt BETWEEN CURRENT_DATE - 5 AND CURRENT_DATE
GROUP BY inv_dt;
INV_DT | SUM
------------+-----
2015-04-04 | 2
2015-04-06 | 2
(2 rows)
We can create a date dimension table that contains a row for every date (or at least 1024 days in the past and 1024 days in the future, using the _v_vector_idx view in this example).
create table date_dim (date_dt date);
insert into date_dim select current_date - idx from _v_vector_idx;
insert into date_dim select current_date + idx +1 from _v_vector_idx;
Then our query would look like this:
SELECT d.date_dt,
SUM(amount)
FROM tablez a
RIGHT OUTER JOIN date_dim d
ON a.inv_dt = d.date_dt
WHERE d.date_dt BETWEEN CURRENT_DATE -5 AND CURRENT_DATE
GROUP BY d.date_dt;
DATE_DT | SUM
------------+-----
2015-04-01 |
2015-04-02 |
2015-04-03 |
2015-04-04 | 2
2015-04-05 |
2015-04-06 | 2
(6 rows)
If you actually needed a zero value instead of a NULL for the days where you had no data, you could use a COALESCE or NVL like this:
SELECT d.date_dt,
COALESCE(SUM(amount),0)
FROM tablez a
RIGHT OUTER JOIN date_dim d
ON a.inv_dt = d.date_dt
WHERE d.date_dt BETWEEN CURRENT_DATE -5 AND CURRENT_DATE
GROUP BY d.date_dt;
DATE_DT | COALESCE
------------+----------
2015-04-01 | 0
2015-04-02 | 0
2015-04-03 | 0
2015-04-04 | 2
2015-04-05 | 0
2015-04-06 | 2
(6 rows)
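The same pattern can be reproduced outside Netezza. The sketch below is an assumption-laden illustration: it uses SQLite through Python's sqlite3 module, fills date_dim from a Python date range instead of _v_vector_idx, and fixes the reporting window to 2015-04-01 through 2015-04-06 in place of CURRENT_DATE - 5 and CURRENT_DATE so the result is repeatable.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablez (amount INTEGER, inv_dt TEXT);
CREATE TABLE date_dim (date_dt TEXT);
INSERT INTO tablez VALUES
  (1, '2015-04-04'), (1, '2015-04-04'), (1, '2015-04-06'), (1, '2015-04-06');
""")

# Fill the date dimension with one row per calendar day in the window.
start, end = date(2015, 4, 1), date(2015, 4, 6)
days = [(start + timedelta(days=n)).isoformat()
        for n in range((end - start).days + 1)]
conn.executemany("INSERT INTO date_dim VALUES (?)", [(d,) for d in days])

# Drive the query from date_dim so every day appears, then turn the
# NULL sums of empty days into 0.
report = conn.execute("""
SELECT d.date_dt, COALESCE(SUM(a.amount), 0) AS total
FROM date_dim d
LEFT JOIN tablez a ON a.inv_dt = d.date_dt
WHERE d.date_dt BETWEEN '2015-04-01' AND '2015-04-06'
GROUP BY d.date_dt
ORDER BY d.date_dt
""").fetchall()

for row in report:
    print(row)
```

Writing the join as date_dim LEFT JOIN tablez is equivalent to the RIGHT OUTER JOIN above; the date table is the side whose rows are always kept.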

I agree with @ScottMcG that you need to get the list of dates. However, if you are in a situation where you aren't allowed to create a table, you can simplify things. All you need is a table that has at least 28 rows. Using your example, this should work.
select date_list.dt_nm, nvl(results.Ordered,0) as Ordered, nvl(results.Shipped,0) as Shipped
from
(select row_number() over(order by sub.arb_nbr)+ (current_date -28) as dt_nm
from (select rowid as arb_nbr
from fct_dly_invoice_detail b
limit 28) sub ) date_list left outer join
( Select b.INV_DT, sum( a.ORD_QTY) as Ordered, sum( a.SHIPPED_QTY) as Shipped
from fct_dly_invoice_detail a inner join
fct_dly_invoice_header b
on a.INV_HDR_SK = b.INV_HDR_SK
and a.SRC_SYS_CD = 'ABC'
and a.NDC_NBR is not null
and b.inv_dt between CURRENT_DATE - 16 and CURRENT_DATE
and b.store_nbr in (2851, 2963, 3249, 3385, 3447, 3591, 3727, 4065, 4102, 4289, 4376, 4793, 5209, 5266, 5312, 5453, 5569, 5575, 5892, 6534, 6571, 7110, 9057, 9262, 9652, 9742, 10373, 12392, 12739, 13870)
inner join
dim_invoice_customer c
on b.DIM_INV_CUST_SK = c.DIM_INV_CUST_SK
group by 1 ) results
on date_list.dt_nm = results.inv_dt

group by with 'pre-defined row'

Say I have the following PaymentTransaction table:
ID Amount PayMethodID
----------------------------
10254 100 1
15789 150 1
15790 200 0
16954 300 0
17864 400 1
19364 500 1
PayMethodID Desc
----------------------------
0 CASH
1 VISA
2 MASTER
3 AMEX
4 ETC
I can simply use a GROUP BY to group the rows under PayMethodID 1 and 0.
What I am trying to do is also show the PayMethodIDs that do not exist in the transaction table in the GROUP BY result.
My current result with a simple GROUP BY statement is:
PayMethodID TotalAmount
-------------------------
0 500
1 1150
Expected result (showing 0 if the PayMethodID does not exist in the transaction table):
PayMethodID TotalAmount
-------------------------
0 500
1 1150
2 0
3 0
4 0
This might be a simple and duplicated question, but I just can't find the right keyword to search for. I will remove this post if you can point me to a duplicate. Thanks.

You can use LEFT JOIN, so all rows from the leftmost table (TableA) will be shown whether or not they have matching values in the other table.
SELECT a.PayMethodID,
TotalAmount = ISNULL(SUM(b.Amount), 0)
FROM TableA AS a -- <== contains list of card type
LEFT JOIN TableB AS b -- <== contains the payment list
ON a.PayMethodID = b.PayMethodID
GROUP BY a.PayMethodID

A regular OUTER (LEFT) JOIN will give you all rows from the PayMethod table no matter if they exist in the PaymentTransaction table, the rest of the sums being NULL. You can then use a COALESCE to make the null rows zero;
SELECT pm.PayMethodID, COALESCE(SUM(pt.Amount), 0) TotalAmount
FROM PayMethod pm
LEFT JOIN PaymentTransaction pt
ON pm.PayMethodID = pt.PayMethodID
GROUP BY pm.PayMethodID
An SQLfiddle to test with.
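For a quick local check with the rows from the question, the same LEFT JOIN plus COALESCE can be run on SQLite through Python's sqlite3 module. This is only a sketch: SQLite stands in for SQL Server, and the Desc column is renamed to description here because DESC is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PayMethod (PayMethodID INTEGER, description TEXT);
CREATE TABLE PaymentTransaction (ID INTEGER, Amount INTEGER, PayMethodID INTEGER);
INSERT INTO PayMethod VALUES
  (0, 'CASH'), (1, 'VISA'), (2, 'MASTER'), (3, 'AMEX'), (4, 'ETC');
INSERT INTO PaymentTransaction VALUES
  (10254, 100, 1), (15789, 150, 1), (15790, 200, 0),
  (16954, 300, 0), (17864, 400, 1), (19364, 500, 1);
""")

# Every pay method is kept by the LEFT JOIN; methods without any
# transaction have a NULL sum, which COALESCE turns into 0.
totals = conn.execute("""
SELECT pm.PayMethodID, COALESCE(SUM(pt.Amount), 0) AS TotalAmount
FROM PayMethod pm
LEFT JOIN PaymentTransaction pt ON pm.PayMethodID = pt.PayMethodID
GROUP BY pm.PayMethodID
ORDER BY pm.PayMethodID
""").fetchall()

for row in totals:
    print(row)
```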

VLOOKUP-style range lookup in T-SQL

Here's a tricky problem I haven't quite been able to get my head around. I'm using SQL Server 2008, and I have a sparse range table that looks like this:
Range Profession
----- ----------
0 Office Worker
23 Construction
54 Medical
Then I have another table with values that are within these ranges. I'd like to construct a query which joins these two tables and gives me the Profession value that is less than or equal to the given value. So let's say my other table looks like this:
Value
29
1
60
Then I'd like my join to return:
Value Profession
----- ----------
29 Construction
1 Office Worker
60 Medical
(because 29>the 23 for Construction but <=the 54 for Medical)
Is there any way I can get SQL to bend to my will in this manner, short of actually blowing out the range table to include every possible value?
Thank you.

The easiest way to do this is to add another column to your sparse range table.
LowRange HighRange Profession
0 22 Office Worker
23 53 Construction
54 999999 Medical
Then use a query like this to get the range(table 2 is the one with the 29,1,60 values):
SELECT Table_2.Value, Table_1.Profession
FROM Table_1 INNER JOIN Table_2
ON Table_2.Value >= Table_1.LowRange
AND Table_2.Value <= Table_1.HighRange;

You could use CROSS APPLY:
select v.Value, p.Profession
from tblValues v
cross apply
(select top(1) pr.Profession
from tblProfessionRanges pr
where pr.Range <= v.Value ORDER BY pr.[Range] DESC) p
It should be faster than using max and doesn't need a max-range to be maintained.

I think I understand your problem. I created a table called professions with your values and a map_vals table with the look up values. Then I came up with this:
select p.range as range1, p.profession, v.value from professions p
inner join map_vals v ON v.value >= p.range
where p.range =
(select max(p3.range) from professions p3 where p3.range <= v.value)
order by v.value
which when given these values...
value
29
0
60
1
23
54
returns
range1 profession value
0 Office Worker 0
0 Office Worker 1
23 Construction 23
23 Construction 29
54 Medical 54
54 Medical 60
EDIT:
You could also use CROSS APPLY as shown by manfred-sorg but it requires an ORDER BY DESC or you will get the following:
select v.Value, p.Profession
from tblValues v
cross apply
(select top(1) pr.Profession
from tblProfessionRanges pr
where pr.Range <= v.Value) p
produces
Value Profession
----------- --------------------------------------------------
29 Office Worker
1 Office Worker
60 Office Worker
to get your desired result you need to change it to:
select v.Value, p.Profession
from tblValues v
cross apply
(select top(1) pr.Profession
from tblProfessionRanges pr
where pr.Range <= v.Value ORDER BY pr.[Range] DESC) p
Value Profession
----------- --------------------------------------------------
29 Construction
1 Office Worker
60 Medical
However, the sorting required here makes it less efficient than using MAX.
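To see the "largest range start not exceeding the value" lookup in action without SQL Server, here is a small sketch on SQLite through Python's sqlite3 module. It assumes a correlated subquery with ORDER BY ... DESC LIMIT 1 as the SQLite counterpart of CROSS APPLY with TOP(1); the names profession_ranges, range_start and lookup_values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profession_ranges (range_start INTEGER, profession TEXT);
CREATE TABLE lookup_values (value INTEGER);
INSERT INTO profession_ranges VALUES
  (0, 'Office Worker'), (23, 'Construction'), (54, 'Medical');
INSERT INTO lookup_values VALUES (29), (1), (60), (23);
""")

# For each value, pick the range row with the largest start that is still
# <= the value. The DESC ordering is what makes the first row the right one.
matches = conn.execute("""
SELECT v.value,
       (SELECT pr.profession
          FROM profession_ranges pr
         WHERE pr.range_start <= v.value
         ORDER BY pr.range_start DESC
         LIMIT 1) AS profession
FROM lookup_values v
ORDER BY v.value
""").fetchall()

for row in matches:
    print(row)
```

The value 23 is included to show that a value sitting exactly on a range boundary maps to the range that starts there.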
