Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
Today, for the first time in 10 years of development with sql server I used a cross join in a production query. I needed to pad a result set to a report and found that a cross join between two tables with a creative where clause was a good solution. I was wondering what use has anyone found in production code for the cross join?
Update: the code posted by Tony Andrews is very close to what I used the cross join for. Believe me, I understand the implications of using a cross join and would not do so lightly. I was excited to have finally used it (I'm such a nerd) - sort of like the time I first used a full outer join.
Thanks to everyone for the answers! Here's how I used the cross join:
SELECT CLASS, [Trans-Date] as Trans_Date,
SUM(CASE TRANS
WHEN 'SCR' THEN [Std-Labor-Value]
WHEN 'S+' THEN [Std-Labor-Value]
WHEN 'S-' THEN [Std-Labor-Value]
WHEN 'SAL' THEN [Std-Labor-Value]
WHEN 'OUT' THEN [Std-Labor-Value]
ELSE 0
END) AS [LABOR SCRAP],
SUM(CASE TRANS
WHEN 'SCR' THEN [Std-Material-Value]
WHEN 'S+' THEN [Std-Material-Value]
WHEN 'S-' THEN [Std-Material-Value]
WHEN 'SAL' THEN [Std-Material-Value]
ELSE 0
END) AS [MATERIAL SCRAP],
SUM(CASE TRANS WHEN 'RWK' THEN [Act-Labor-Value] ELSE 0 END) AS [LABOR REWORK],
SUM(CASE TRANS
WHEN 'PRD' THEN [Act-Labor-Value]
WHEN 'TRN' THEN [Act-Labor-Value]
WHEN 'RWK' THEN [Act-Labor-Value]
ELSE 0
END) AS [ACTUAL LABOR],
SUM(CASE TRANS
WHEN 'PRD' THEN [Std-Labor-Value]
WHEN 'TRN' THEN [Std-Labor-Value]
ELSE 0
END) AS [STANDARD LABOR],
SUM(CASE TRANS
WHEN 'PRD' THEN [Act-Labor-Value] - [Std-Labor-Value]
WHEN 'TRN' THEN [Act-Labor-Value] - [Std-Labor-Value]
--WHEN 'RWK' THEN [Act-Labor-Value]
ELSE 0 END) -- - SUM([Std-Labor-Value]) -- - SUM(CASE TRANS WHEN 'RWK' THEN [Act-Labor-Value] ELSE 0 END)
AS [LABOR VARIANCE]
FROM v_Labor_Dist_Detail
where [Trans-Date] between #startdate and #enddate
--and CLASS = (CASE #class WHEN '~ALL' THEN CLASS ELSE #class END)
GROUP BY [Trans-Date], CLASS
UNION --REL 2/6/09 Pad result set with any missing dates for each class.
select distinct [Description] as class, cast([Date] as datetime) as [Trans-Date], 0,0,0,0,0,0
FROM Calendar_To_Fiscal cross join PRMS.Product_Class
where cast([Date] as datetime) between #startdate and #enddate and
not exists (select class FROM v_Labor_Dist_Detail vl where [Trans-Date] between #startdate and #enddate
and vl.[Trans-Date] = cast(Calendar_To_Fiscal.[Date] as datetime)
and vl.class= PRMS.Product_Class.[Description]
GROUP BY [Trans-Date], CLASS)
order by [Trans-Date], CLASS
A typical legitimate use of a cross join would be a report that shows e.g. total sales by product and region. If no sales were made of product P in region R then we want to see a row with a zero, rather than just not showing a row.
select r.region_name, p.product_name, sum(s.sales_amount)
from regions r
cross join products p
left outer join sales s on s.region_id = r.region_id
and s.product_id = p.product_id
group by r.region_name, p.product_name
order by r.region_name, p.product_name;
One use I've come across a lot is splitting records out into several records, mainly for reporting purposes.
Imagine a string where each character represents some event during the corresponding hour.
ID | Hourly Event Data
1 | -----X-------X-------X--
2 | ---X-----X------X-------
3 | -----X---X--X-----------
4 | ----------------X--X-X--
5 | ---X--------X-------X---
6 | -------X-------X-----X--
Now you want a report which shows how many events happened at what day. Cross join the table with a table of IDs 1 to 24, then work your magic...
SELECT
[hour].id,
SUM(CASE WHEN SUBSTRING([data].string, [hour].id, 1) = 'X' THEN 1 ELSE 0 END)
FROM
[data]
CROSS JOIN
[hours]
GROUP BY
[hours].id
=>
1, 0
2, 0
3, 0
4, 2
5, 0
6, 2
7, 0
8, 1
9, 0
10, 2
11, 0
12, 0
13, 2
14, 1
15, 0
16, 1
17, 2
18, 0
19, 0
20, 1
21, 1
22, 3
23, 0
24, 0
I have different reports that prefilter the recordset (by various lines of business within the firm), but there were calculations that required percentages of revenue firm-wide. The recordsource had to contain the firm total instead of relying on calculating the overall sum in the report itself.
Example: The recordset has balances for each client and the Line of Business the client's revenue comes from. The report may only show 'retail' clients. There is no way to get a sum of the balances for the entire firm, but the report shows the percentage of the firm's revenue.
Since there are different balance fields, I felt it was less complicated to have full join with the view that has several balances (I can also reuse this view of firm totals) instead of multiple fields made up sub queries.
Another one is an update statement where multiple records needed to be created (one record for each step in a preset workflow process).
Here's one, where the CROSS JOIN substitutes for an INNER JOIN. This is useful and legitimate when there are no identical values between two tables on which to join. For example, suppose you have a table that contains version 1, version 2 and version 3 of some statement or company document, all saved in a SQL Server table so that you can recreate a document that is associated with an order, on the fly, long after the order, and long after your document was rewritten into a new version. But only one of the two tables you need to join (the Documents table) has a VersionID column. Here is a way to do this:
SELECT DocumentText, VersionID =
(
SELECT d.VersionID
FROM Documents d
CROSS JOIN Orders o
WHERE o.DateOrdered BETWEEN d.EffectiveStart AND d.EffectiveEnd
)
FROM Documents
I've used a CROSS JOIN recently in a report that we use for sales forcasting, the report needs to break out the amount of sales that a sales person has done in each General Ledger account.
So in the report I do something to this effect:
SELECT gla.AccountN, s.SalespersonN
FROM
GLAccounts gla
CROSS JOIN Salesperson s
WHERE (gla.SalesAnalysis = 1 OR gla.AccountN = 47500)
This gives me every GL account for every sales person like:
SalesPsn AccountN
1000 40100
1000 40200
1000 40300
1000 48150
1000 49980
1000 49990
1005 40100
1005 40200
1005 40300
1054 48150
1054 49980
1054 49990
1078 40100
1078 40200
1078 40300
1078 48150
1078 49980
1078 49990
1081 40100
1081 40200
1081 40300
1081 48150
1081 49980
1081 49990
1188 40100
1188 40200
1188 40300
1188 48150
1188 49980
1188 49990
For charting (reports) where every grouping must have a record even if it is zero.
(e.g. RadCharts)
I had combinations of am insolvency field from my source data.
There are 5 distinct types but the data had combinations of 2 of these. So I created lookup table of the 5 distinct values then used a cross join for an insert statement to fill out the rest. like so
insert into LK_Insolvency (code,value)
select a.code+b.code, a.value+' '+b.value
from LK_Insolvency a
cross join LK_Insolvency b
where a.code <> b.code <--this makes sure the x product of the value with itself is not used as this does not appear in the source data.
I personally try to avoid cartesian product's in my queries. I suppose have a result set of every combination of your join could be useful, but usually if I end up with one, I know I have something wrong.
Related
I want to know who has the most friends from the app I own(transactions), which means it can be either he got paid, or paid himself to many other users.
I can't make the query to show me only those who have the max friends number (it can be 1 or many, and it can be changed so I can't use limit).
;with relationships as
(
select
paid as 'auser',
Member_No as 'afriend'
from Payments$
union all
select
member_no as 'auser',
paid as 'afriend'
from Payments$
),
DistinctRelationships AS (
SELECT DISTINCT *
FROM relationships
)
select
afriend,
count(*) cnt
from DistinctRelationShips
GROUP BY
afriend
order by
count(*) desc
I just can't figure it out, I've tried count, max(count), where = max, nothing worked.
It's a two columns table - "Member_No" and "Paid" - member pays the money, and the paid is the one who got the money.
Member_No
Paid
14
18
17
1
12
20
12
11
20
8
6
3
2
4
9
20
8
10
5
20
14
16
5
2
12
1
14
10
It's from Excel, but I loaded it into sql-server.
It's just a sample, there are 1000 more rows
It seems like you are massively over-complicating this. There is no need for self-joining.
Just unpivot each row so you have both sides of the relationship, then group it up by one side and count distinct of the other side
SELECT
-- for just the first then SELECT TOP (1)
-- for all that tie for the top place use SELECT TOP (1) WITH TIES
v.Id,
Relationships = COUNT(DISTINCT v.Other),
TotalTransactions = COUNT(*)
FROM Payments$ p
CROSS APPLY (VALUES
(p.Member_No, p.Paid),
(p.Paid, p.Member_No)
) v(Id, Other)
GROUP BY
v.Id
ORDER BY
COUNT(DISTINCT v.Other) DESC;
db<>fiddle
I am writing a course project (something like a program for the hotel manager) and I need a little help. I have tables Reservations and Rooms and I need to calculate the amount of payment after the client leaves the room ((End_date - Start_date) * price_per_day), but I'm having trouble getting the price_per_day from the table Rooms.
My query only works if there is one record in the Resertvation table, if there are 2 or more, I get an error "subquery returned more than 1 value" and I don’t know how to fix it (the problem is in this part of the query SELECT price_per_day FROM Rooms AS ro JOIN Reservations AS re ON ro.room_id = re.room_id)
I'm using visual studio 2019 + SQL Server Express LocalDB.
I will be grateful for any help or hint!
UPDATE Reservations
SET Amount_payable = (
DATEDIFF(day, CONVERT(datetime, Start_date, 104), CONVERT(datetime, End_date, 104) * (SELECT price_per_day FROM Rooms AS ro JOIN Reservations AS re ON ro.room_id = re.room_id))
)
WHERE Status = 'Archived'
Table Reservations
reservation_id customer_id room_id start_date end_date status Amount_payable
1 3 3 12.04.2020 05.06.2020 Archived 0
2 2 4 11.04.2020 30.05.2020 Active 0
Table Rooms
reservation_id room_id number_of_persons room_type price_per_day
0 1 3 Double 300
0 2 4 Triple 600
0 3 3 Studio 400
2 4 2 Single 444
you need slightly different approach to resolve the issue.
try the following:
UPDATE re
SET
Amount_payable = (DATEDIFF(day, CONVERT(DATETIME, Start_date, 104), CONVERT(DATETIME, End_date, 104)) * price_per_day)
FROM Reservations re
JOIN Rooms AS ro ON ro.room_id = re.room_id
WHERE STATUS = 'Archived';
Sample data in tblData:
RowID SID Staken DateTaken
---------------------------------------------
1 1 1 2014-09-15 14:18:11.997
2 1 1 2014-09-16 14:18:11.997
3 1 1 2014-09-17 14:18:11.997
I would like to get the daywise count of SIDs and also a cumulative sum like
Date ThisDayCount TotalCount
-----------------------------------
2014-09-15 1 1
2014-09-16 10 11
2014-09-17 30 41
This is what I have now in my stored procedure with the start & end date parameters. Is there a more elegant way to do this?
;WITH TBL AS
(
SELECT
CONVERT(date, asu.DateTaken) AS Date,
COUNT(*) AS 'ThisDayCount'
FROM
tblData asu
WHERE
asu.SID = 1
AND asu.STaken = 1
AND asu.DateTaken >= #StartDate
AND asu.DateTaken <= #EndDate
GROUP BY
CONVERT(date, asu.DateTaken)
)
SELECT
t1.Date, t1.ThisDayCount, SUM(t1.ThisDayCount) AS 'TotalCount'
FROM
TBL t1
INNER JOIN
TBL t2 ON t1.date >= t2.date
GROUP BY
t1.Date, t1.ThisDayCount
I am not aware of a more elegant way to do that, other than perhaps with a subquery for your running total. What you have is pretty elegant by T-SQL standards.
But, depending on how many records you have to process and what your indexes look like, this could be very slow. You don't say what the destination of this information is, but if it's any kind of report or web page, I'd consider doing the running total as part of the processing at the destination rather than in the database.
I have the following table in SQL Server Express edition:
Time Device Value
0:00 1 2
0:01 2 3
0:03 3 5
0:03 1 3
0:13 2 5
0:22 1 7
0:34 3 5
0:35 2 6
0:37 1 5
The table is used to log the events of different devices which are reporting their latest values. What I'd like to do is to prepare the data in a way that I'd present the average data through time scale and eventually create a chart using this data. I've manipulated this example data in Excel in the following way:
Time Average value
0:03 3,666666667
0:13 4,333333333
0:22 5,666666667
0:34 5,666666667
0:35 6
0:37 5,333333333
So, at time 0:03 I need to take latest data I have in the table and calculate the average. In this case it's (3+3+5)/3=3,67. At time 0:13 the steps would be repeated, and again at 0:22,...
As I'd like to leave the everything within the SQL table (I wouldn't like to create any service with C# or similar which would grab the data and store it into some other table)
I'd like to know the following:
is this the right approach or should I use some other concept of calculating the average for charting data preparation?
if yes, what's the best approach to implement it? Table view, function within the database, stored procedure (which would be called from the charting API)?
any suggestions on how to implement this?
Thank you in advance.
Mark
Update 1
In the mean time I got one idea how to approach to this problem. I'd kindly ask you for your comments on it and I'd still need some help in getting the problem resolved.
So, the idea is to crosstab the table like this:
Time Device1Value Device2Value Device3Value
0:00 2 NULL NULL
0:01 NULL 3 NULL
0:03 3 NULL 5
0:13 NULL 5 NULL
0:22 7 NULL NULL
0:34 NULL NULL 5
0:35 NULL 6 NULL
0:37 5 NULL NULL
The query for this to happen would be:
SELECT Time,
(SELECT Stock FROM dbo.Event WHERE Time = S.Time AND Device = 1) AS Device1Value,
(SELECT Stock FROM dbo.Event WHERE Time = S.Time AND Device = 2) AS Device2Value,
(SELECT Stock FROM dbo.Event WHERE Time = S.Time AND Device = 3) AS Device3Value
FROM dbo.Event S GROUP BY Time
What I'd still need to do is to write a user defined function and call it within this query which would write last available value in case of NULL and if the last available value doesn't exist it would leave NULL value. With this function I'd get the following results:
Time Device1Value Device2Value Device3Value
0:00 2 NULL NULL
0:01 2 3 NULL
0:03 3 3 5
0:13 3 5 5
0:22 7 5 5
0:34 7 5 5
0:35 7 6 5
0:37 5 6 5
And by having this results I'd be able to calculate the average for each time by only SUMing up the 3 relevant columns and dividing it by count (in this case 3). For NULL I'd use 0 value.
Can anybody suggest how to create a user defined function for replacing NULL values with latest value?
Update 2
Thanks Martin.
This query worked but it took almost 21 minutes to go through the 13.576 lines which is far too much.
The final query I used was:
SELECT Time,
(SELECT TOP 1 Stock FROM dbo.Event e WHERE e.Time <= S.Time AND Device = 1 ORDER BY e.Time DESC) AS Device1Value,
(SELECT TOP 1 Stock FROM dbo.Event e WHERE e.Time <= S.Time AND Device = 2 ORDER BY e.Time DESC) AS Device2Value,
(SELECT TOP 1 Stock FROM dbo.Event e WHERE e.Time <= S.Time AND Device = 3 ORDER BY e.Time DESC) AS Device3Value
FROM dbo.Event S GROUP BY Time
but I've extended it to 10 devices.
I agree that this is not the best way to do it. Is there any other way to prepare the data for the average calculation because this takes just too much of the processing.
Here's one way. It uses the "Quirky Update" approach to filling in the gaps. This relies on an undocumented behaviour so you may prefer to use a cursor for this.
DECLARE #SourceData TABLE([Time] TIME, Device INT, value FLOAT)
INSERT INTO #SourceData
SELECT '0:00',1,2 UNION ALL
SELECT '0:01',2,3 UNION ALL
SELECT '0:03',3,5 UNION ALL
SELECT '0:03',1,3 UNION ALL
SELECT '0:13',2,5 UNION ALL
SELECT '0:22',1,7 UNION ALL
SELECT '0:34',3,5 UNION ALL
SELECT '0:35',2,6 UNION ALL
SELECT '0:37',1,5
CREATE TABLE #tmpResults
(
[Time] Time primary key,
[1] FLOAT,
[2] FLOAT,
[3] FLOAT
)
INSERT INTO #tmpResults
SELECT [Time],[1],[2],[3]
FROM #SourceData
PIVOT ( MAX(value) FOR Device IN ([1],[2],[3])) AS pvt
ORDER BY [Time];
DECLARE #1 FLOAT, #2 FLOAT, #3 FLOAT
UPDATE #tmpResults
SET #1 = [1] = ISNULL([1],#1),
#2 = [2] = ISNULL([2],#2),
#3 = [3] = ISNULL([3],#3)
SELECT [Time],
(SELECT AVG(device)
FROM (SELECT [1] AS device
UNION ALL
SELECT [2]
UNION ALL
SELECT [3]) t) AS [Average value]
FROM #tmpResults
DROP TABLE #tmpResults
So one of the possible solutions which I found is far more efficient (less than a second for 14.574 lines). I haven't yet had time to review the results in details but on the first hand it looks promising. This is the code for the 3 device example:
SELECT Time,
SUM(CASE MAC WHEN '1' THEN Stock ELSE 0 END) Device1Value,
SUM(CASE MAC WHEN '2' THEN Stock ELSE 0 END) Device1Value,
SUM(CASE MAC WHEN '3' THEN Stock ELSE 0 END) Device1Value,
FROM dbo.Event
GROUP BY Time
ORDER BY Time
In any case I'll test the code provided by Martin to see if it makes any difference to the results.
Dear all, I have a select query that currently produces the following results:
DoctorName Team 1 2 3 4 5 6 7 ... 31 Visited
dr. As A x x ... 2 times
dr. Sc A x ... 1 times
dr. Gh B x ... 1 times
dr. Nd C ... x 1 times
Using the following query:
DECLARE #startDate = '1/1/2010', #enddate = '1/31/2010'
SELECT d.doctorname,
t.teamname,
MAX(CASE WHEN ca.visitdate = 1 THEN 'x' ELSE NULL END) AS 1,
MAX(CASE WHEN ca.visitdate = 2 THEN 'x' ELSE NULL END) AS 2,
MAX(CASE WHEN ca.visitdate = 3 THEN 'x' ELSE NULL END) AS 3,
...
MAX(CASE WHEN ca.visitdate = 31 THEN 'x' ELSE NULL END) AS 31,
COUNT(*) AS visited
FROM CACTIVITY ca
JOIN DOCTOR d ON d.id = ca.doctorid
JOIN TEAM t ON t.id = ca.teamid
WHERE ca.visitdate BETWEEN #startdate AND #enddate
GROUP BY d.doctorname, t.teamname
the problem is I want to make the column of date are dynamic for example if ca.visitdate BETWEEN '2/1/2012' AND '2/29/2012'
so the result will be :
DoctorName Team 1 2 3 4 5 6 7 ... 29 Visited
dr. As A x x ... 2 times
dr. Sc A x ... 1 times
dr. Gh B x ... 1 times
dr. Nd C ... x 1 times
Can somebody help me how to get numbers of days between two date and help me revised the query so it can looping MAX(CASE WHEN ca.visitdate = 1 THEN 'x' ELSE NULL END) AS 1 as many as numbers of days? Please please
The basic rule in SQL is that any given constructed query will always return the same columns - in terms of how many there are, their names, and their types.
So you'd be looking at a dynamic SQL approach to construct the query on the fly, if you still want to go down that route. Otherwise, it might be worth looking at whether you can suppress empty columns at a higher level (is this going to some form of report processor - such as SQL reporting services or crystal reports?)
edit 1
You might want to add additional columns to your query such as:
CASE WHEN DATEPART(month,#StartDate) = DATEPART(month,DATEADD(day,29,#StartDate)) THEN 1 ELSE 0 END as ShowColumn29
(And similarly for the other numbers). How you then use that in Reporting services, I#m a bit vague, but I think you can add a hidden textbox somewhere on your report that binds to the ShowColumn29 value, and then set the visibility of the "29" column of the report to the value of this textbox.
Sorry - I'm not that good with reporting services, but hopefully you can play around with this sort of concept and make it work?