MS SQL table: adding rows whenever the level changes by more than 1, so that every row has a difference of 1 between start_level and end_level - sql-server

(This is my first stack overflow question. So please let me know suggestions for posing a better question, if you cannot understand.)
I have a table of around 500 people (users) who are going up the stairs from floor x to floor y (0 <= x, max(y) = 50). A person can climb zero, one, or many levels in a single go; each go corresponds to a single row of the table, along with the time taken in seconds.
I want to find the average time taken to go from floor a to a+1, where a is any floor number. To do so, I intend to split every row of the table into rows that have start_level + 1 = end_level. The duration will be divided equally, as shown in the EXPECTED OUTPUT table for user b.
GIVEN TABLE INPUT
start_level end_level duration user
1 1 10 a
1 2 5 a
2 5 27 b
5 6 3 c
EXPECTED OUTPUT
start_level end_level duration user
1 1 10 a
1 2 5 a
2 3 27/3 b
3 4 27/3 b
4 5 27/3 b
5 6 3 c
Note: level jumps are in integers only.
After getting the expected output, I can simply compute sum(duration)/count(distinct users) per start_level to get the average time taken to go one floor up from each floor.
Any help is appreciated.

You can use a Numbers table to "create" the incremental steps. Here's my setup:
CREATE TABLE #floors
(
[start_level] INT,
[end_level] INT,
[duration] DECIMAL(10, 4),
[user] VARCHAR(50)
)
INSERT INTO #floors
([start_level],
[end_level],
[duration],
[user])
VALUES (1,1,10,'a'),
(1,2,5,'a'),
(2,5,27,'b'),
(5,6,3,'c')
Then, using a Numbers table and some LEFT JOIN/COALESCE logic:
-- Create a Numbers table covering floors 0 through 50
;WITH Numbers_CTE
AS (SELECT TOP (51) [Number] = ROW_NUMBER()
                                 OVER(
                                   ORDER BY (SELECT NULL)) - 1
    FROM   sys.columns)
SELECT [start_level] = COALESCE(n.[Number], f.[start_level]),
       [end_level] = COALESCE(n.[Number] + 1, f.[end_level]),
       [duration] = CASE
                      WHEN f.[end_level] = f.[start_level] THEN f.[duration]
                      ELSE f.[duration] / ( f.[end_level] - f.[start_level] )
                    END,
       f.[user]
FROM   #floors f
       LEFT JOIN Numbers_CTE n
              -- Number runs from start_level up to end_level - 1,
              -- so a jump of k floors yields exactly k one-floor rows
              ON n.[Number] >= f.[start_level]
                 AND n.[Number] < f.[end_level]
                 AND f.[end_level] - f.[start_level] > 1
Here are the logical steps:
LEFT JOIN the Numbers table for cases where end_level >= start_level + 2 (this has the effect of giving us multiple rows - one for each incremental step)
new start_level = If the LEFT JOIN "completes": take Number from the Numbers table, else: take the original start_level
new end_level = If the LEFT JOIN "completes": take Number + 1, else: take the original end_level
new duration = If end_level = start_level: take the original duration (to avoid divide by 0), else: take the average duration over end_level - start_level
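The question's follow-up step (the average time per floor) can be layered on top of the same expansion. What follows is only a sketch: it assumes the #floors temp table from the setup above, uses the formula the asker describes (sum(duration)/count(distinct users) per start_level), and the names Steps and avg_duration are made up for illustration.

```sql
-- Sketch only: expand multi-floor jumps into one-floor steps, then average per start_level.
-- Assumes the #floors temp table created in the setup above.
;WITH Numbers_CTE
AS (SELECT TOP (51) [Number] = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
    FROM   sys.columns),
Steps -- hypothetical name for the expanded one-floor rows
AS (SELECT [start_level] = COALESCE(n.[Number], f.[start_level]),
           [duration] = CASE
                          WHEN f.[end_level] = f.[start_level] THEN f.[duration]
                          ELSE f.[duration] / ( f.[end_level] - f.[start_level] )
                        END,
           f.[user]
    FROM   #floors f
           LEFT JOIN Numbers_CTE n
                  ON n.[Number] >= f.[start_level]
                     AND n.[Number] < f.[end_level]
                     AND f.[end_level] - f.[start_level] > 1)
SELECT [start_level],
       [avg_duration] = SUM([duration]) / COUNT(DISTINCT [user])
FROM   Steps
GROUP  BY [start_level]
ORDER  BY [start_level]
```

Note that rows where start_level = end_level (no climb) are still counted under their start_level here, because the formula in the question does not exclude them; filter them out in Steps if they should not contribute.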

Related

How to combine group by, join, COUNT, SUM and subquery clauses in sql

I am not sure how to write the SQL query for the following problem:
There are two tables, Worker and Product (one worker can make many products), which I describe in this link: https://docs.google.com/spreadsheets/d/1Yk2vKKmUEyuN-QfgTEbmF4suHFtuDkkrsUf-wqvOoKQ/edit?fbclid=IwAR3ipjwNrfhGXg3fCyAri4tD1Q4WqWuKVAqagvbsZg9Sn1myDwkWbWcl_6E#gid=0
The calculation of the total salary of a worker at month x is as follows
totalSalary = salaryPerMonth + SUM(salaryPerProduct * COUNT(pid))
I want to use join statement (regardless of INNER JOIN, LEFT, OR RIGHT JOIN) combined with group by clause to solve this problem but my statements are wrong.
Expect a specific SQL statement in this case.
I hope to be able to express my ideas in this photo
UPDATE: my picture quality is not good, so I will repost my picture at this link.
@phi nguyễn quốc - Welcome to StackOverflow. What you posted has the makings of a good question. It contains:
Brief summary of the issue
Table structure, sample data
Explanation of expected results
Code you've tried
It just needs a few modifications to conform to the guidelines and avoid being closed. A few tips on posting:
Help others to help you by including a Minimal, Reproducible Example. (With SQL questions, include table definitions and sample data.) That way, folks who want to help can spend their time answering your question instead of writing set-up code to replicate your tables, environment, etc.
Make it easy for others to be able to test your code. Always post code as text, not as an image.
Use collaborative tools like db<>fiddle for sharing
One example of how you might improve the question and avoid it being closed:
Issue:
I am trying to write a SQL query to calculate the total salary for workers for a given month X. There are two tables: [Worker] and [Product]. One worker can make many products.
[Worker]
wid  wname  salaryPerMonth  salaryPerProduct  phoneNumber
1    Mr A   500             5
2    Mr B   100             30
3    Mr C   200             20
[Product]
pid  pname      manufacturedDate  wid
1    Product A  2013-12-01        1
2    Product B  2013-12-09        1
3    Product C  2013-09-08        1
4    Product D  2013-01-30        2
5    Product E  2013-09-20        2
6    Product F  2013-12-23        3
The "Total Salary" of a worker for month X is calculated as follows:
SalaryPerMonth + ( SalaryPerProduct * Number of Products for Month )
Expected Results: (December 2013)
wid  wname  salaryPerMonth  salaryPerProduct  totalSalary  ** Formula
1    Mr A   500             5                 510          = 500 + (5*2)
2    Mr B   100             30                100          = 100 + (30*0)
3    Mr C   200             20                220          = 200 + (20*1)
Actual Results
I've tried this query
SELECT W.wid, W.wname, W.phoneNumber, W.salaryPerMonth, W.salaryPerProduct, (W.salaryPerMonth - SUM(W.salaryPerMonth*COUNT(p.pid))) AS Total
FROM Worker W INNER JOIN Product P ON p.Wid = W.wid
WHERE MONTH(P.manufacturedDate) = 12
GROUP BY W.wid, W.wname, W.phoneNumber, W.salaryPerMonth, W.salaryPerProduct
... but I am getting the error below:
Msg 130 Level 15 State 1 Line 1
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
Here is my db<>fiddle
CREATE TABLE Product (
pid int
, pname varchar(40)
, manufacturedDate date
, wid int
);
CREATE TABLE Worker (
wid int
, wname varchar(40)
, salaryPerMonth int
, salaryPerProduct int
, phoneNumber varchar(20)
)
INSERT INTO Product(pid, pname, manufacturedDate, wid)
VALUES
(1,'Product A','2013-12-01',1)
,(2,'Product B','2013-12-09',1)
,(3,'Product C','2013-09-08',1)
,(4,'Product D','2013-01-30',2)
,(5,'Product E','2013-09-20',2)
,(6,'Product F','2013-12-23',3)
;
INSERT INTO Worker (wid, wname, salaryPerMonth,salaryPerProduct)
VALUES
(1,'Mr A', 500, 5)
,(2, 'Mr B', 100, 30)
,(3,'Mr C', 200, 20)
;
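Msg 130 is raised because SUM(...) wraps COUNT(...), and T-SQL does not allow one aggregate directly inside another. One possible way around it, sketched here against the Worker and Product tables from the set-up script, is to drop the outer SUM and let COUNT do the work per worker. The month filter is placed in the join condition of a LEFT JOIN (an assumption about the intent, based on the expected results) so that workers with no products in December 2013 still appear with a count of 0.

```sql
-- Sketch only: total salary per worker for December 2013.
-- The date range is an assumption taken from the "Expected Results" heading.
SELECT W.wid,
       W.wname,
       W.salaryPerMonth,
       W.salaryPerProduct,
       totalSalary = W.salaryPerMonth + W.salaryPerProduct * COUNT(P.pid)
FROM   Worker W
       LEFT JOIN Product P
              ON P.wid = W.wid
                 -- filter in ON, not WHERE, so workers without products are kept
                 AND P.manufacturedDate >= '20131201'
                 AND P.manufacturedDate < '20140101'
GROUP  BY W.wid, W.wname, W.salaryPerMonth, W.salaryPerProduct
ORDER  BY W.wid;
```

COUNT(P.pid) ignores the NULL produced by the outer join, which is why Mr B ends up with his base salary only. Filtering on a date range rather than MONTH(...) = 12 also avoids mixing Decembers of different years.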

Need to generate from and to numbers based on the result set with a specified interval

I have below requirement.
Input is like as below.
Create table Numbers
(
Num int
)
Insert into Numbers
values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15)
Create table FromTo
(
FromNum int
,ToNum int
)
Select * From FromTo
Output should be as below.
FromNum ToNum
1 5
6 10
11 15
Actual Requirement is as below.
I need to load the values of a column, which holds thousands of records with differing numbers, into a table.
For example:
1,2,5,7,9,11,15,34,56,78,98,123,453,765 etc.
I need to load these into another table that has FROM and TO columns, using intervals of 5000. For example, if within the first 5000 the numbers only run up to 3000, my first row should have FromNum 1 and ToNum 3000. For the second row: if there is no data up to 10000 and the next number is 12312 (this becomes the second row's FromNum), ToNum should be at most FromNum + 5000, i.e. 17312. Here too, if the data does not run all the way to 17312, ToNum should be the last number present between 12312 and 17312.
Output should be as below.
FromNum ToNum
1 3205
1095806 1100805
1100808 1105806
1105822 1110820
Can you guys please help me with the solution for the above.
Thanks in advance.
What you may try in this situation is to group data and get the expected results:
DECLARE @interval int = 5
INSERT INTO FromTo (FromNum, ToNum)
SELECT MIN(Num) AS FromNum, MAX(Num) AS ToNum
FROM Numbers
GROUP BY (Num - 1) / @interval
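The grouping above uses fixed buckets (1-5, 6-10, ...), which fits the small example. The "actual requirement" in the question reads as if each range should instead start at the first number not yet covered. A rough sketch of that reading follows; it assumes each range spans FromNum to FromNum + interval - 1 (the question's wording and its sample output are not fully consistent on whether the upper bound is inclusive), and it reuses the Numbers and FromTo tables from the question.

```sql
-- Sketch only: ranges anchored at the first uncovered number.
-- Assumption: a range covers [FromNum, FromNum + @interval - 1].
DECLARE @interval int = 5000;
DECLARE @from int = (SELECT MIN(Num) FROM Numbers);

WHILE @from IS NOT NULL
BEGIN
    -- ToNum is the largest number actually present inside the current range
    INSERT INTO FromTo (FromNum, ToNum)
    SELECT @from, MAX(Num)
    FROM   Numbers
    WHERE  Num >= @from
           AND Num < @from + @interval;

    -- The next range starts at the first number beyond the current one (NULL ends the loop)
    SET @from = (SELECT MIN(Num) FROM Numbers WHERE Num >= @from + @interval);
END;
```

With the 15-row sample and an interval of 5, this produces the same three rows as the GROUP BY version; the two approaches only differ when there are gaps in the data.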

TSQL Least number of appearances

I want to find the "Balie" with the least number of "Maatschappijen" booked on it. So far I have this query, which displays all "Balies" and all the "Maatschappijen" with them. The wanted result is one "balienummer" record with the least number of "maatschappijen" booked on it.
Query
SELECT [Balie].[balienummer], [IncheckenBijMaatschappij].[balienummer], [IncheckenBijMaatschappij].[maatschappijcode]
FROM [Balie]
JOIN [IncheckenBijMaatschappij]
ON [Balie].[balienummer] = [IncheckenBijMaatschappij].[balienummer]
Query result
balienummer balienummer maatschappijcode
1 1 BA
1 1 TR
2 2 AF
2 2 NZ
3 3 KL
4 4 KL
LRS: https://www.dropbox.com/s/f2l9a874d5witpt/LRS_CasusGelreAirport.pdf
SELECT [Balie].[balienummer], count([IncheckenBijMaatschappij].[maatschappijcode])
FROM [Balie]
JOIN [IncheckenBijMaatschappij]
ON [Balie].[balienummer] = [IncheckenBijMaatschappij].[balienummer]
GROUP BY [Balie].[balienummer]
ORDER BY count([IncheckenBijMaatschappij].[maatschappijcode])
First record should be your answer.
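To return only that first record rather than reading it off the top of the list, TOP can be added. The variant below is a sketch with two assumptions that go beyond the answer: it uses a LEFT JOIN so that a Balie with no bookings at all counts as 0 instead of disappearing, and it uses WITH TIES so that several Balies sharing the minimum are all returned. The alias booked_count is made up for illustration.

```sql
-- Sketch only: the Balie (or Balies, on a tie) with the fewest Maatschappijen.
SELECT TOP (1) WITH TIES
       b.[balienummer],
       [booked_count] = COUNT(i.[maatschappijcode])
FROM   [Balie] b
       LEFT JOIN [IncheckenBijMaatschappij] i
              ON b.[balienummer] = i.[balienummer]
GROUP  BY b.[balienummer]
ORDER  BY COUNT(i.[maatschappijcode]);
```

With the query result shown in the question, balienummer 3 and 4 both have one booking, so WITH TIES returns both; remove WITH TIES if exactly one row is required and it does not matter which.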

TSQL - How to prevent optimisation in this query

I have a query analogous to:
update x
set x.y = (
select sum(x2.y)
from mytable x2
where x2.y < x.y
)
from mytable x
the point being, I'm iterating over rows and updating a field based on a subquery over those fields which are changing.
What I'm seeing is the subquery is being executed for each row before any updates occur, so the changed values for each row are not being picked up.
How can I force the subquery to be re-evaluated for each row of the update?
Is there a suitable table hint or something?
As an aside, I was doing the below and it did work, however since modifying my query somewhat (for logic purposes, not to try and solve this issue) this trick no longer works :(
declare @temp int
update x
set @temp = (
select sum(x2.y)
from mytable x2
where x2.y < x.y
),
x.y = @temp
from mytable x
I'm not particularly concerned about performance, this is a background task run over a few rows
It looks like the task is ill-defined, or other rules need to apply.
Let's look at an example. Say you have the values 4, 1, 2, 3, 1, 2.
SQL will update rows based on the original values, i.e. within a single UPDATE statement, newly calculated values are NOT mixed with the original values:
-- only original values used
4 -> 9 (1+2+3+1+2)
1 -> null
2 -> 2 (1+1)
3 -> 6 (1+2+1+2)
1 -> null
2 -> 2 (1+1)
Based on your request, you want the update of each row to take the newly calculated values into account. (Note that SQL does not guarantee the order in which rows are processed.)
Let's do this calculation by processing rows from top to bottom:
-- from top
4 -> 9 (1+2+3+1+2)
1 -> null
2 -> 1 (1)
3 -> 4 (1+1+2)
1 -> null
2 -> 1 (1)
Do the same in the other order, from bottom to top:
-- from bottom
4 -> 3 (2+1)
1 -> null
2 -> 1 (1)
3 -> 5 (2+2+1)
1 -> null
2 -> 2 (1+1)
As you can see, your expected result is inconsistent. To fix this, you need to correct the calculation rule, for instance by defining a strict order in which the rows are processed (date, id, ...).
Also, if you want to do some recursive processing, look at common table expressions (common_table_expression):
http://technet.microsoft.com/en-us/library/ms186243(v=sql.105).aspx
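If an explicit processing order is acceptable, one straightforward (if slow) option is to update one row per statement, so that each statement sees the changes made by the previous ones. The sketch below assumes a hypothetical id column that defines the order, and uses a temp table #mytable loaded with the example values; since the question says performance is not a concern, a plain loop is used.

```sql
-- Sketch only: row-by-row update in a defined order (id is a hypothetical ordering column).
CREATE TABLE #mytable (id INT PRIMARY KEY, y INT NULL);
INSERT INTO #mytable (id, y)
VALUES (1, 4), (2, 1), (3, 2), (4, 3), (5, 1), (6, 2);

DECLARE @id INT = (SELECT MIN(id) FROM #mytable);

WHILE @id IS NOT NULL
BEGIN
    -- Each UPDATE is its own statement, so it reads the values written by earlier iterations
    UPDATE x
    SET    x.y = (SELECT SUM(x2.y)
                  FROM   #mytable x2
                  WHERE  x2.y < x.y)
    FROM   #mytable x
    WHERE  x.id = @id;

    SET @id = (SELECT MIN(id) FROM #mytable WHERE id > @id);
END;

-- For this data the result matches the top-to-bottom walk: 9, NULL, 1, 4, NULL, 1
SELECT id, y FROM #mytable ORDER BY id;
```

Reversing the loop (start at MAX(id), step to the next lower id) reproduces the bottom-to-top numbers instead, which is exactly why the order has to be part of the rule.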

Populating a departure date field which is after the arrival date

Step 1
Arrival Date (Already generated) – 1.35 Million Times
Step 2
Randomise a number between 0 and 1
Step 3
Use the Randomised number produced above to create the script below
UPDATE BOOKINGS
SET DepartureDate
CASE WHEN RAND() Result = Between 0 and 0.3 = Departure Date will be 2 Nights Later
CASE WHEN RAND() Result = Between 0.3 and 0.4 = Departure Date will be 3 Nights Later
CASE WHEN RAND ()Result >0.4 = Departure Date will be either 1,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28 Nights Later
Do not use RAND() with a changing seed. It makes for terribly randomized data.
To get to your solution you need to create "buckets" of possible values. 3 days is supposed to happen in 10% of the cases; that makes it the smallest bucket, so we need ten buckets. 2 days goes into 3 buckets. The other values go into 2 buckets each. Then just use modulo to select one of the 10 buckets, like this:
CREATE TABLE dbo.booking(Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,days INT);
GO
INSERT INTO dbo.booking(days)
SELECT TOP(100000) 0 FROM sys.columns A,sys.columns B,sys.columns C,sys.columns D;
GO
UPDATE b
SET days = rndm.days
FROM dbo.booking b
CROSS APPLY (
SELECT days
FROM (VALUES(0,2),(1,2),(2,2),(3,3),(4,1),(5,1),(6,4),(7,4),(8,28),(9,28))dn(n,days)
WHERE n = ABS(CHECKSUM(NEWID(),b.Id))%10
)rndm;
GO
SELECT days,COUNT(1) cnt
FROM dbo.booking
GROUP BY days;
GO
EDIT: Updated code to not use case statement.
Just to let you know the final solution I used was:
UPDATE BOOKINGS
SET DepartureDate =
DATEADD(day,
CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0 and 0.3 THEN 2 ELSE
CASE WHEN Rand(CHECKSUM(NEWID())) BETWEEN 0.3 and 0.5 THEN 3 ELSE
Round(Rand(CHECKSUM(NEWID())) * 28,0) END END,ArrivalDate)
Thanks
Wayne
