Dear all, I have a select query that currently produces the following results:
DoctorName Team 1 2 3 4 5 6 7 ... 31 Visited
dr. As A x x ... 2 times
dr. Sc A x ... 1 times
dr. Gh B x ... 1 times
dr. Nd C ... x 1 times
Using the following query:
DECLARE @startDate DATE = '1/1/2010', @endDate DATE = '1/31/2010'
SELECT d.doctorname,
t.teamname,
MAX(CASE WHEN DAY(ca.visitdate) = 1 THEN 'x' ELSE NULL END) AS [1],
MAX(CASE WHEN DAY(ca.visitdate) = 2 THEN 'x' ELSE NULL END) AS [2],
MAX(CASE WHEN DAY(ca.visitdate) = 3 THEN 'x' ELSE NULL END) AS [3],
...
MAX(CASE WHEN DAY(ca.visitdate) = 31 THEN 'x' ELSE NULL END) AS [31],
COUNT(*) AS visited
FROM CACTIVITY ca
JOIN DOCTOR d ON d.id = ca.doctorid
JOIN TEAM t ON t.id = ca.teamid
WHERE ca.visitdate BETWEEN @startDate AND @endDate
GROUP BY d.doctorname, t.teamname
The problem is that I want the date columns to be dynamic. For example, if ca.visitdate BETWEEN '2/1/2012' AND '2/29/2012', the result should be:
DoctorName Team 1 2 3 4 5 6 7 ... 29 Visited
dr. As A x x ... 2 times
dr. Sc A x ... 1 times
dr. Gh B x ... 1 times
dr. Nd C ... x 1 times
Can somebody help me work out the number of days between two dates, and revise the query so that the MAX(CASE WHEN ... THEN 'x' END) column is repeated once for each of those days? Please help!
The basic rule in SQL is that any given constructed query will always return the same columns - in terms of how many there are, their names, and their types.
So you'd be looking at a dynamic SQL approach to construct the query on the fly, if you still want to go down that route. Otherwise, it might be worth looking at whether you can suppress empty columns at a higher level (is this going to some form of report processor - such as SQL reporting services or crystal reports?)
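Since the column list must be fixed when the query is parsed, the usual workaround is to build the statement as a string and execute it. A minimal sketch, assuming the table and column names from the question (the variable names are mine, and this is an untested outline, not production code):

```sql
DECLARE @startDate DATE = '2012-02-01', @endDate DATE = '2012-02-29';
DECLARE @cols NVARCHAR(MAX) = N'', @sql NVARCHAR(MAX);
DECLARE @d INT = 1;

-- Build one MAX(CASE ...) column per day in the range.
WHILE @d <= DATEDIFF(day, @startDate, @endDate) + 1
BEGIN
    SET @cols = @cols + N',
MAX(CASE WHEN DAY(ca.visitdate) = ' + CAST(@d AS NVARCHAR(2))
        + N' THEN ''x'' END) AS [' + CAST(@d AS NVARCHAR(2)) + N']';
    SET @d = @d + 1;
END;

SET @sql = N'SELECT d.doctorname, t.teamname' + @cols + N',
COUNT(*) AS visited
FROM CACTIVITY ca
JOIN DOCTOR d ON d.id = ca.doctorid
JOIN TEAM t ON t.id = ca.teamid
WHERE ca.visitdate BETWEEN @from AND @to
GROUP BY d.doctorname, t.teamname';

EXEC sp_executesql @sql, N'@from DATE, @to DATE', @from = @startDate, @to = @endDate;
```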
edit 1
You might want to add additional columns to your query such as:
CASE WHEN DATEPART(month,@StartDate) = DATEPART(month,DATEADD(day,28,@StartDate)) THEN 1 ELSE 0 END as ShowColumn29
(And similarly for the other numbers.) How you then use that in Reporting Services I'm a bit vague on, but I think you can add a hidden textbox somewhere on your report that binds to the ShowColumn29 value, and then set the visibility of the "29" column of the report to the value of this textbox.
Sorry - I'm not that good with reporting services, but hopefully you can play around with this sort of concept and make it work?
Related
I am working with a dashboard software that has limited features, i.e. no way to set the y-axis. Therefore this has to be worked into the code, which means the datetime field has to be alphanumeric. However, the table now sorts incorrectly: rather than 1, 2, 3 it sorts as 1, 10, 11, 12. I've looked on here for the answer, but there's nothing that works with an OVER function, which is necessary as it's for a line graph showing cumulative sales figures over a month.
It is MSSQL that I am using:
SELECT DATENAME(day, DATEADD(day,day(oh_datetime),-1)),
SUM(SUM((CASE oh_sot_id WHEN 1 THEN 1 WHEN 4 THEN -1 WHEN 2 THEN 0 WHEN 3 THEN 0 WHEN 6 THEN 0 WHEN 11 THEN 0 END) * oht_net)) over (ORDER BY day(oh_datetime)) AS 'Orders In($)',
SUM((CASE oh_cd_id WHEN 11728 THEN 1 END) * oht_net) AS 'Target($)'
FROM order_header_total
JOIN order_header ON oht_oh_id = oh_id
WHERE year(oh_datetime) = year(GETDATE()) AND month(oh_datetime) = month(GETDATE())
GROUP BY day(oh_datetime)
UNION SELECT 'YAxis','0','0'
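Not from the thread, but one common workaround for this string sort is to order by a numeric conversion of the day column: TRY_CAST (SQL Server 2012+) returns NULL for 'YAxis', and NULLs sort first ascending, so the padded row stays at the top. A small stand-in demonstration:

```sql
-- Stand-in data: string day values sort as 1, 10, 11, 2 by default.
-- Ordering by TRY_CAST restores numeric order and keeps 'YAxis' first;
-- the same ORDER BY can wrap the UNION query above as a derived table.
SELECT d.[Day]
FROM (VALUES ('1'),('10'),('2'),('YAxis'),('11')) AS d([Day])
ORDER BY TRY_CAST(d.[Day] AS INT);
-- returns: YAxis, 1, 2, 10, 11
```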
I have a SQL table with unique IDs, a date of service for a health care encounter, and whether this encounter was an emergency room visit (ed = 1) or a hospital admission (hosp = 1).
For each unique ID, I want to identify ED visits that occurred <= 1 calendar day from a hospital stay.
Thus I think I want SQL to first identify ED visits, then search up and down for the nearest hospital admission and calculate the absolute difference in dates. I'm familiar with the LAG/LEAD and ROW_NUMBER() functions, but can't quite seem to figure this out.
Any ideas would be much appreciated! Thank you!
Table looks like this for one illustrative ID:
id date ed hosp
1 2012-01-01 0 1
1 2012-01-05 1 0
1 2012-02-01 0 1
1 2012-02-03 1 0
1 2012-05-01 0 0
And I want to create a new column (ed_hosp_diff) that is the minimum absolute date difference (days) between each ED visit and the closest hospital stay, something like this:
id date ed hosp ed_hosp_diff
1 2012-01-01 0 1 null
1 2012-01-05 1 0 4
1 2012-02-01 0 1 null
1 2012-02-03 1 0 2
1 2012-05-01 0 0 null
So this doesn't get you the output table you show, but it meets the requirement you list:
For each unique ID, I want to identify ED visits that occurred <= 1
calendar day from a hospital stay.
Your output table doesn't really give you that - it includes rows for ED Visits that don't have a matching hospital admit, and has rows for hospital admits, etc. This SQL doesn't give you those, it just gives you the ED Visits that were followed by a hospital admit within one day.
It also doesn't give you matches with negative days - cases where the hospital visit is prior to the ED visit (in terms of healthcare analytics, that's usually a different thing than looking for ED Visits followed by an IP Admit). If you do want those, delete the last bit of logic in the WHERE clause for the main query.
SELECT
ID = e.id,
ED_DATE = e.date,
HOSP_DATE = h.date,
ED_HOSP_DIFF = DATEDIFF(dd, e.date, h.date)
FROM
Table1 AS e
JOIN
(
SELECT
id,
date
FROM
Table1
WHERE
hosp = 1
) AS h
ON
e.id = h.id
WHERE
e.ed = 1
AND
DATEDIFF(dd, e.date, h.date) <= 1
AND
DATEDIFF(dd, e.date, h.date) >= 0
Use OUTER APPLY to compute, for each record with ed = 1, the minimum date difference to a hospital stay:
SELECT *
FROM table t
OUTER APPLY
(
SELECT ed_hosp_diff = MIN ( ABS ( DATEDIFF(DAY, t.date, x.date) ) )
FROM table x
WHERE x.hosp = 1
AND t.ed = 1
) eh
I have the following table in SQL Server Express edition:
Time Device Value
0:00 1 2
0:01 2 3
0:03 3 5
0:03 1 3
0:13 2 5
0:22 1 7
0:34 3 5
0:35 2 6
0:37 1 5
The table is used to log the events of different devices reporting their latest values. What I'd like to do is prepare the data so that I can present the average value over time and eventually create a chart from it. I've manipulated this example data in Excel in the following way:
Time Average value
0:03 3,666666667
0:13 4,333333333
0:22 5,666666667
0:34 5,666666667
0:35 6
0:37 5,333333333
So, at time 0:03 I need to take latest data I have in the table and calculate the average. In this case it's (3+3+5)/3=3,67. At time 0:13 the steps would be repeated, and again at 0:22,...
As I'd like to keep everything within the SQL database (I wouldn't like to create a service with C# or similar which would grab the data and store it in some other table), I'd like to know the following:
is this the right approach or should I use some other concept of calculating the average for charting data preparation?
if yes, what's the best approach to implement it? Table view, function within the database, stored procedure (which would be called from the charting API)?
any suggestions on how to implement this?
Thank you in advance.
Mark
Update 1
In the mean time I got one idea how to approach to this problem. I'd kindly ask you for your comments on it and I'd still need some help in getting the problem resolved.
So, the idea is to crosstab the table like this:
Time Device1Value Device2Value Device3Value
0:00 2 NULL NULL
0:01 NULL 3 NULL
0:03 3 NULL 5
0:13 NULL 5 NULL
0:22 7 NULL NULL
0:34 NULL NULL 5
0:35 NULL 6 NULL
0:37 5 NULL NULL
The query for this to happen would be:
SELECT Time,
(SELECT Stock FROM dbo.Event WHERE Time = S.Time AND Device = 1) AS Device1Value,
(SELECT Stock FROM dbo.Event WHERE Time = S.Time AND Device = 2) AS Device2Value,
(SELECT Stock FROM dbo.Event WHERE Time = S.Time AND Device = 3) AS Device3Value
FROM dbo.Event S GROUP BY Time
What I'd still need to do is write a user-defined function, called within this query, that fills in the last available value in place of NULL (and leaves NULL where no earlier value exists). With this function I'd get the following results:
Time Device1Value Device2Value Device3Value
0:00 2 NULL NULL
0:01 2 3 NULL
0:03 3 3 5
0:13 3 5 5
0:22 7 5 5
0:34 7 5 5
0:35 7 6 5
0:37 5 6 5
And with these results I'd be able to calculate the average for each time by summing the 3 relevant columns and dividing by the count (in this case 3), treating NULL as 0.
Can anybody suggest how to create a user defined function for replacing NULL values with latest value?
Update 2
Thanks Martin.
This query worked, but it took almost 21 minutes to go through the 13.576 lines, which is far too long.
The final query I used was:
SELECT Time,
(SELECT TOP 1 Stock FROM dbo.Event e WHERE e.Time <= S.Time AND Device = 1 ORDER BY e.Time DESC) AS Device1Value,
(SELECT TOP 1 Stock FROM dbo.Event e WHERE e.Time <= S.Time AND Device = 2 ORDER BY e.Time DESC) AS Device2Value,
(SELECT TOP 1 Stock FROM dbo.Event e WHERE e.Time <= S.Time AND Device = 3 ORDER BY e.Time DESC) AS Device3Value
FROM dbo.Event S GROUP BY Time
but I've extended it to 10 devices.
I agree that this is not the best way to do it. Is there any other way to prepare the data for the average calculation? This one just takes too much processing.
Here's one way. It uses the "Quirky Update" approach to filling in the gaps. This relies on an undocumented behaviour so you may prefer to use a cursor for this.
DECLARE @SourceData TABLE([Time] TIME, Device INT, value FLOAT)
INSERT INTO @SourceData
SELECT '0:00',1,2 UNION ALL
SELECT '0:01',2,3 UNION ALL
SELECT '0:03',3,5 UNION ALL
SELECT '0:03',1,3 UNION ALL
SELECT '0:13',2,5 UNION ALL
SELECT '0:22',1,7 UNION ALL
SELECT '0:34',3,5 UNION ALL
SELECT '0:35',2,6 UNION ALL
SELECT '0:37',1,5
CREATE TABLE #tmpResults
(
[Time] Time primary key,
[1] FLOAT,
[2] FLOAT,
[3] FLOAT
)
INSERT INTO #tmpResults
SELECT [Time],[1],[2],[3]
FROM @SourceData
PIVOT ( MAX(value) FOR Device IN ([1],[2],[3])) AS pvt
ORDER BY [Time];
DECLARE @1 FLOAT, @2 FLOAT, @3 FLOAT
UPDATE #tmpResults
SET @1 = [1] = ISNULL([1],@1),
@2 = [2] = ISNULL([2],@2),
@3 = [3] = ISNULL([3],@3)
SELECT [Time],
(SELECT AVG(device)
FROM (SELECT [1] AS device
UNION ALL
SELECT [2]
UNION ALL
SELECT [3]) t) AS [Average value]
FROM #tmpResults
DROP TABLE #tmpResults
So, one of the possible solutions I found is far more efficient (less than a second for 14.574 lines). I haven't yet had time to review the results in detail, but at first glance it looks promising. This is the code for the 3-device example:
SELECT Time,
SUM(CASE MAC WHEN '1' THEN Stock ELSE 0 END) Device1Value,
SUM(CASE MAC WHEN '2' THEN Stock ELSE 0 END) Device2Value,
SUM(CASE MAC WHEN '3' THEN Stock ELSE 0 END) Device3Value
FROM dbo.Event
GROUP BY Time
ORDER BY Time
In any case I'll test the code provided by Martin to see if it makes any difference to the results.
I'm working on an ssis package to fix some data from a table. The table looks something like this:
CustID FieldID INT_VAL DEC_VAL VARCHAR_VAL DATE_VAL
1 1 23
1 2 500.0
1 3 David
1 4 4/1/05
1 5 52369871
2 1 25
2 2 896.23
2 3 Allan
2 4 9/20/03
2 5 52369872
I want to transform it into this:
CustID FirstName AccountNumber Age JoinDate Balance
1 David 52369871 23 4/1/05 500.0
2 Allan 52369872 25 9/20/03 896.23
Currently, my SSIS package is set up to pull in the data from the source table, do a conditional split on the field ID, and generate a derived column on each split. The part I'm stuck on is joining the data back together on CustID. However, the Merge Join transformation only allows you to join 2 datasets, and in the end I will need to join about 30. Is there a good way to do that without a long chain of merge joins?
That seems a bit awkward; why not just do it in a query?
select
CustID,
max(case when FieldID = 3 then VARCHAR_VAL else null end) as 'FirstName',
max(case when FieldID = 5 then INT_VAL else null end) as 'AccountNumber',
max(case when FieldID = 1 then INT_VAL else null end) as 'Age',
max(case when FieldID = 4 then DATE_VAL else null end) as 'JoinDate',
max(case when FieldID = 2 then DEC_VAL else null end) as 'Balance'
from
dbo.StagingTable
group by
CustID
If your source system is MSSQL, then you can use that query from SSIS or even create a view in the source database (if you're allowed to). If not, then copy the data directly to a staging table in MSSQL and query it from there.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Today, for the first time in 10 years of development with sql server I used a cross join in a production query. I needed to pad a result set to a report and found that a cross join between two tables with a creative where clause was a good solution. I was wondering what use has anyone found in production code for the cross join?
Update: the code posted by Tony Andrews is very close to what I used the cross join for. Believe me, I understand the implications of using a cross join and would not do so lightly. I was excited to have finally used it (I'm such a nerd) - sort of like the time I first used a full outer join.
Thanks to everyone for the answers! Here's how I used the cross join:
SELECT CLASS, [Trans-Date] as Trans_Date,
SUM(CASE TRANS
WHEN 'SCR' THEN [Std-Labor-Value]
WHEN 'S+' THEN [Std-Labor-Value]
WHEN 'S-' THEN [Std-Labor-Value]
WHEN 'SAL' THEN [Std-Labor-Value]
WHEN 'OUT' THEN [Std-Labor-Value]
ELSE 0
END) AS [LABOR SCRAP],
SUM(CASE TRANS
WHEN 'SCR' THEN [Std-Material-Value]
WHEN 'S+' THEN [Std-Material-Value]
WHEN 'S-' THEN [Std-Material-Value]
WHEN 'SAL' THEN [Std-Material-Value]
ELSE 0
END) AS [MATERIAL SCRAP],
SUM(CASE TRANS WHEN 'RWK' THEN [Act-Labor-Value] ELSE 0 END) AS [LABOR REWORK],
SUM(CASE TRANS
WHEN 'PRD' THEN [Act-Labor-Value]
WHEN 'TRN' THEN [Act-Labor-Value]
WHEN 'RWK' THEN [Act-Labor-Value]
ELSE 0
END) AS [ACTUAL LABOR],
SUM(CASE TRANS
WHEN 'PRD' THEN [Std-Labor-Value]
WHEN 'TRN' THEN [Std-Labor-Value]
ELSE 0
END) AS [STANDARD LABOR],
SUM(CASE TRANS
WHEN 'PRD' THEN [Act-Labor-Value] - [Std-Labor-Value]
WHEN 'TRN' THEN [Act-Labor-Value] - [Std-Labor-Value]
--WHEN 'RWK' THEN [Act-Labor-Value]
ELSE 0 END) -- - SUM([Std-Labor-Value]) -- - SUM(CASE TRANS WHEN 'RWK' THEN [Act-Labor-Value] ELSE 0 END)
AS [LABOR VARIANCE]
FROM v_Labor_Dist_Detail
where [Trans-Date] between @startdate and @enddate
--and CLASS = (CASE @class WHEN '~ALL' THEN CLASS ELSE @class END)
GROUP BY [Trans-Date], CLASS
UNION --REL 2/6/09 Pad result set with any missing dates for each class.
select distinct [Description] as class, cast([Date] as datetime) as [Trans-Date], 0,0,0,0,0,0
FROM Calendar_To_Fiscal cross join PRMS.Product_Class
where cast([Date] as datetime) between @startdate and @enddate and
not exists (select class FROM v_Labor_Dist_Detail vl where [Trans-Date] between @startdate and @enddate
and vl.[Trans-Date] = cast(Calendar_To_Fiscal.[Date] as datetime)
and vl.class= PRMS.Product_Class.[Description]
GROUP BY [Trans-Date], CLASS)
order by [Trans-Date], CLASS
A typical legitimate use of a cross join would be a report that shows e.g. total sales by product and region. If no sales were made of product P in region R then we want to see a row with a zero, rather than just not showing a row.
select r.region_name, p.product_name, sum(s.sales_amount)
from regions r
cross join products p
left outer join sales s on s.region_id = r.region_id
and s.product_id = p.product_id
group by r.region_name, p.product_name
order by r.region_name, p.product_name;
One use I've come across a lot is splitting records out into several records, mainly for reporting purposes.
Imagine a string where each character represents some event during the corresponding hour.
ID | Hourly Event Data
1 | -----X-------X-------X--
2 | ---X-----X------X-------
3 | -----X---X--X-----------
4 | ----------------X--X-X--
5 | ---X--------X-------X---
6 | -------X-------X-----X--
Now you want a report which shows how many events happened in each hour. Cross join the table with a table of IDs 1 to 24, then work your magic...
SELECT
[hours].id,
SUM(CASE WHEN SUBSTRING([data].string, [hours].id, 1) = 'X' THEN 1 ELSE 0 END)
FROM
[data]
CROSS JOIN
[hours]
GROUP BY
[hours].id
=>
1, 0
2, 0
3, 0
4, 2
5, 0
6, 2
7, 0
8, 1
9, 0
10, 2
11, 0
12, 0
13, 2
14, 1
15, 0
16, 1
17, 2
18, 0
19, 0
20, 1
21, 1
22, 3
23, 0
24, 0
I have different reports that prefilter the recordset (by various lines of business within the firm), but there were calculations that required percentages of revenue firm-wide. The recordsource had to contain the firm total instead of relying on calculating the overall sum in the report itself.
Example: The recordset has balances for each client and the Line of Business the client's revenue comes from. The report may only show 'retail' clients. There is no way to get a sum of the balances for the entire firm, but the report shows the percentage of the firm's revenue.
Since there are different balance fields, I felt it was less complicated to have a full join with the view that has the several balances (I can also reuse this view of firm totals) instead of multiple fields made up of subqueries.
Another one is an update statement where multiple records needed to be created (one record for each step in a preset workflow process).
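That last pattern (fanning each record out into one row per workflow step) can be sketched like this; the table and column names are invented for illustration:

```sql
-- Create one task row per new order per step in the preset workflow.
-- NewOrders and WorkflowSteps are hypothetical tables.
INSERT INTO WorkflowTask (OrderID, StepID, Status)
SELECT o.OrderID, s.StepID, 'Pending'
FROM NewOrders AS o
CROSS JOIN WorkflowSteps AS s;
```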
Here's one, where the CROSS JOIN substitutes for an INNER JOIN. This is useful and legitimate when there are no identical values between two tables on which to join. For example, suppose you have a table that contains version 1, version 2 and version 3 of some statement or company document, all saved in a SQL Server table so that you can recreate a document that is associated with an order, on the fly, long after the order, and long after your document was rewritten into a new version. But only one of the two tables you need to join (the Documents table) has a VersionID column. Here is a way to do this:
SELECT DocumentText, VersionID =
(
SELECT d.VersionID
FROM Documents d
CROSS JOIN Orders o
WHERE o.DateOrdered BETWEEN d.EffectiveStart AND d.EffectiveEnd
)
FROM Documents
I've used a CROSS JOIN recently in a report that we use for sales forecasting; the report needs to break out the amount of sales that a salesperson has done in each General Ledger account.
So in the report I do something to this effect:
SELECT gla.AccountN, s.SalespersonN
FROM
GLAccounts gla
CROSS JOIN Salesperson s
WHERE (gla.SalesAnalysis = 1 OR gla.AccountN = 47500)
This gives me every GL account for every sales person like:
SalesPsn AccountN
1000 40100
1000 40200
1000 40300
1000 48150
1000 49980
1000 49990
1005 40100
1005 40200
1005 40300
1054 48150
1054 49980
1054 49990
1078 40100
1078 40200
1078 40300
1078 48150
1078 49980
1078 49990
1081 40100
1081 40200
1081 40300
1081 48150
1081 49980
1081 49990
1188 40100
1188 40200
1188 40300
1188 48150
1188 49980
1188 49990
For charting (reports) where every grouping must have a record even if it is zero.
(e.g. RadCharts)
I had combinations of an insolvency field in my source data. There are 5 distinct types, but the data had combinations of 2 of these. So I created a lookup table of the 5 distinct values, then used a cross join in an insert statement to fill out the rest, like so:
insert into LK_Insolvency (code,value)
select a.code+b.code, a.value+' '+b.value
from LK_Insolvency a
cross join LK_Insolvency b
where a.code <> b.code -- this makes sure the cross product of a value with itself is not used, as this does not appear in the source data
I personally try to avoid Cartesian products in my queries. I suppose having a result set of every combination of your join could be useful, but usually if I end up with one, I know I have something wrong.