Ugly table query - sql-server

I have inherited this table and am trying to optimize its queries. I am stuck on one query. Here is the table information:
RaterName - varchar(24) - name of the rater
TimeTaken - varchar(12) - is stored as 00:10:14:8
Year - char(4) - is stored as 2014
I need:
1. A distinct list of raters with, for each rater, the total count, sum(TimeTaken) and avg(TimeTaken) (for a given year)
2. sum(TimeTaken) and avg(TimeTaken) across all the raters (for a given year)
Here is the query that I have come up with for #1... I would like the sum and avg to be like hh:mm:ss. How can I do this?
SELECT
[RaterName]
, count(*) as TotalRatings
, SUM((DATEPART(hh,convert(datetime, timetaken, 101))*60)+DATEPART(mi,convert(datetime, timetaken, 101))+(DATEPART(ss,convert(datetime, timetaken, 101))/(60.0)))/60.0 as TotalTimeTaken
, AVG((DATEPART(hh,convert(datetime, timetaken, 101))*60)+DATEPART(mi,convert(datetime, timetaken, 101))+(DATEPART(ss,convert(datetime, timetaken, 101))/(60.0)))/60.0 as AverageTimeTaken
FROM
[dbo].[rating]
WHERE
year = '2014'
GROUP BY
RaterName
ORDER BY
RaterName
Output:
RaterName TotalRatings TotalTimeTaken AverageTimeTaken
================================================================
Rater1 257 21.113609 0.082154
Rater2 747 41.546106 0.055617
Rater3 767 59.257218 0.077258
Rater4 581 37.154163 0.063948
Can I incorporate #2 in this query or write a second query and drop group by from it?
On the front end, I am using C#.

WITH data ( raterName, timeTaken )
AS (
SELECT raterName,
DATEDIFF(MILLISECOND, CAST('00:00' AS TIME),
CAST(timeTaken AS TIME))
FROM rating
WHERE CAST([year] AS INT) = 2014
)
SELECT raterName, COUNT(*) AS totalRatings,
SUM(timeTaken) AS totalTimeTaken, avg(timeTaken) AS averageTimeTaken
FROM data
GROUP BY raterName
ORDER BY raterName;
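PS: If you don't want milliseconds, you can change MILLISECOND to SECOND or MINUTE in the DATEDIFF call.
EDIT: On your C# front end you can convert the milliseconds or seconds to a TimeSpan, whose ToString gives you the hh:mm:ss format (use TimeSpan.FromMilliseconds instead if the query returns milliseconds), i.e.: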
var ttt = TimeSpan.FromSeconds(totalTimeTaken).ToString();
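If the formatting has to happen in T-SQL rather than in C#, here is a minimal sketch. It assumes the total is available in whole seconds (for instance DATEDIFF(SECOND, ...) in the query above); @totalSeconds and its value are made up for illustration. Because a total can exceed 24 hours, the sketch builds the string by hand instead of casting to TIME.

```sql
-- Hypothetical input: a total expressed in whole seconds.
DECLARE @totalSeconds INT = 213326;

-- Hours are not capped at 23, so totals longer than a day still print correctly.
SELECT CAST(@totalSeconds / 3600 AS VARCHAR(10)) + ':'
     + RIGHT('0' + CAST((@totalSeconds % 3600) / 60 AS VARCHAR(2)), 2) + ':'
     + RIGHT('0' + CAST(@totalSeconds % 60 AS VARCHAR(2)), 2) AS TotalTimeTaken;
-- returns 59:15:26
```

The same expression can wrap SUM(timeTaken) or AVG(timeTaken) directly once those are computed in seconds.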

Related

SQL Server : update every 5 records with Past Months

I want to update 15 records for an employee ID: the first 5 records should get a date in June 2019, the next 5 records July 2019, and the last 5 records August 2019. Can anyone tell me how to write this type of query in SQL Server Management Studio v17.7? I've tried the query below, but I am unable to handle the next 5 rows.
Update TOP(5) emp.employee(nolock) set statusDate=GETDATE()-31 where EMPLOYEEID='XCXXXXXX';
To update only a certain number of rows of a table you will need to include a FROM clause and join a sub-query which limits the number of rows. I would suggest using OFFSET and FETCH instead of TOP so that you can skip a given number of rows.
You will also want to use the DATEADD function instead of directly subtracting a number from GETDATE(). Subtracting an integer from a DATETIME subtracts that many days, so your query goes back 31 days. If you intend to go back a month I would suggest subtracting a month rather than 31 days. Alternatively it might be easier to specify an exact date like '2019-06-01'.
For example:
TableA
- TableAID INT PK
- EmployeeID INT FK
- statusDate DATETIME
UPDATE TableA
SET statusDate = '2019-06-01'
FROM TableA
INNER JOIN
(
SELECT TableAID
FROM TableA
WHERE EmployeeID = ''
ORDER BY TableAID
OFFSET 0 ROWS
FETCH NEXT 5 ROWS ONLY
) T1 ON TableA.TableAID = T1.TableAID
Right now it looks like your original query is updating the table employee rather than a purchases table. You will want to replace my TableA with whichever table it is you're updating and replace TableAID with the PK field of it.
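The same pattern can be repeated for the remaining two batches by changing the OFFSET. This is only a sketch: it assumes SQL Server 2012 or later (where OFFSET ... FETCH is available) and the illustrative TableA layout from the example; @EmployeeID and its value are placeholders.

```sql
-- Placeholder employee; TableA(TableAID, EmployeeID, statusDate) is the
-- illustrative layout from the example, not a real table.
DECLARE @EmployeeID INT = 1;

-- Rows 6-10 for the employee get July 2019.
UPDATE A
SET statusDate = '2019-07-01'
FROM TableA AS A
INNER JOIN
(
    SELECT TableAID
    FROM TableA
    WHERE EmployeeID = @EmployeeID
    ORDER BY TableAID
    OFFSET 5 ROWS
    FETCH NEXT 5 ROWS ONLY
) AS T1 ON A.TableAID = T1.TableAID;

-- Rows 11-15 for the employee get August 2019.
UPDATE A
SET statusDate = '2019-08-01'
FROM TableA AS A
INNER JOIN
(
    SELECT TableAID
    FROM TableA
    WHERE EmployeeID = @EmployeeID
    ORDER BY TableAID
    OFFSET 10 ROWS
    FETCH NEXT 5 ROWS ONLY
) AS T1 ON A.TableAID = T1.TableAID;
```

Ordering by the primary key keeps the three batches disjoint, since statusDate itself changes between the updates.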
You can use a ROW_NUMBER to get a ranking by employee, then just update the first 15 rows.
;WITH EmployeeRowsWithRowNumbers AS
(
SELECT
T.*,
RowNumberByEmployee = ROW_NUMBER() OVER (
PARTITION BY
T.EmployeeID -- Generate a ranking by each different EmployeeID
ORDER BY
(SELECT NULL)) -- ... in no particular order (you should supply one if you have an ordering column)
FROM
emp.employee AS T
)
UPDATE E SET
statusDate = CASE
WHEN E.RowNumberByEmployee <= 5 THEN '2019-06-01'
WHEN E.RowNumberByEmployee BETWEEN 6 AND 10 THEN '2019-07-01'
ELSE '2019-08-01' END
FROM
EmployeeRowsWithRowNumbers AS E
WHERE
E.RowNumberByEmployee <= 15

T-SQL Max Weight Query

I have a table with these columns:
BatchNumber, BagNumber, BagWeight, CumulativeWeight
Each batch can have up to 30 bags and the other columns are self-explanatory.
What I need is a query which finds the maximum cumulative weight for each batch, here is what I have so far.
DECLARE @HighestBagNumber INT;
DECLARE @BatchNumber CHAR(8);
SET @BatchNumber = 37708;
SELECT @HighestBagNumber = MAX(BagNumber)
FROM FSD3BagLog
WHERE BatchNumber = @BatchNumber
SELECT BatchNumber, BagNumber, CumulativeWeight
FROM FSD3BagLog
WHERE BagNumber = @HighestBagNumber
AND BatchNumber = @BatchNumber
This works for one batch at a time but I need it to look at all batches in the table. As you might be able to tell, I am a total beginner so please be as critical as you want, it's all good.
Yes, GROUP BY seems right. With the proper storage model it should be:
SELECT BatchNumber, COUNT(BagNumber), SUM(BagWeight)
FROM FSD3BagLog
GROUP BY BatchNumber
Result: 100, 30, 600
(Where batch number = 100, there are 30 per batch, and weight of each bag = 20)
But based on your current working query, it looks like you are denormalizing the data and storing the cumulative weight as you go, probably using triggers or some other code that fires when the table is updated.
So if cumulative weight represents the total weight of a given batch, you can get rid of it and use the query above.
If cumulative weight is something else, such as the total of all bags up to a certain point in time, you can still get rid of it. In that case you would simply do something like:
SELECT BatchNumber, SUM(BagWeight) AS CumulativeWeight
FROM FSD3BagLog
WHERE ModifiedDate <= '2018-08-11 06:18:00'
GROUP BY BatchNumber
Assuming you store ModifiedDate as a column on your table, this will give you the cumulative weight of all bags per batch up to 6:18 AM on 2018-08-11.
Simple GROUP BY should do the job:
SELECT BatchNumber, MAX(CumulativeWeight)
FROM my_table
GROUP BY BatchNumber
with batches_ranked as
(
select BatchNumber, BagNumber,
CumulativeWeight = sum(BagWeight) over (partition by BatchNumber order by BagNumber),
[Rank] = row_number() over (partition by BatchNumber order by BagNumber desc)
from FSD3BagLog
)
select * from batches_ranked where [Rank] = 1
It sounds like you have CumulativeWeight stored in the table. If that is always increasing with BagNumber, then you can simplify the query to just:
select BatchNumber, max(BagNumber), max(CumulativeWeight)
from FSD3BagLog group by BatchNumber

Month Difference Without additional table

I am new to the SQL query world and got stuck on one requirement.
My query has toDate and fromDate input parameters, and based on business logic it returns a result like the one below.
Result:-
Month
Dec-16
Dec-16
Dec-16
Feb-17
Feb-17
Mar-17
Mar-17
Now the query needs to return data for each month. If we don't have data for a particular month (Jan-17 is missing in the result above), it should insert a row and return data for that month too.
You can use a calendar or dates table for this sort of thing.
Without a calendar table, you can generate an adhoc set of months using a common table expression with just this:
declare @fromdate date = '20161201';
declare @todate date = '20170301';
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, Months as (
select top (datediff(month, @fromdate, @todate)+1)
[Month]=convert(date,dateadd(month,row_number() over(order by (select 1))-1,@fromdate))
from n as deka cross join n as hecto cross join n as kilo
order by [Month]
)
/* your query here: */
select
d.[Month]
, sum_col = sum(t.col)
from Months d
left join tbl t
on d.[Month] = t.[Month]
group by d.[Month]
Number and Calendar table reference:
Generate a set or sequence without loops - 2 - Aaron Bertrand
The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden
Creating a Date Table/Dimension in SQL Server 2008 - David Stein
Calendar Tables - Why You Need One - David Stein
Creating a date dimension or calendar table in SQL Server - Aaron Bertrand
Solved Query:-
Declare @customDate DATETIME
declare @datafound integer
set @customDate = @fromDate
WHILE @customDate < @toDate
BEGIN
select @datafound = count(1) from #temp where datepart(month, MonthDate) = datepart(month, @customDate)
if @datafound = 0
select Format(@customDate,'MMM-yy') as Month
SET @customDate = DATEADD(month, 1,@customDate)
END;

SUM() column based on other columns

I have a table with sales plan data for every week, which consists of a few columns:
SAL_DTDGID -- which is date of every Sunday, for example 20160110, 20160117
SAL_MQuantity --sum of sales plan value
SAL_MQuantityYTD --sum of plans since first day of the year
SAL_CoreElement --sales plan data for few core elements
SAL_Site --unique identifier of place, where sale has happened
How do I compute SAL_MQuantityYTD as the sum of SAL_MQuantity from the first record in 2016 up to 'now', for every site and every core element?
Every site in SAL_Site has 52 rows, corresponding to the number of weeks in a year, along with 5 different SAL_CoreElement values.
Example:
SAL_DTDGID|SAL_MQuantity|SAL_MQuantityYTD|SAL_CoreElement|SAL_Site
20160110 |20000 |20000 |1 |1234
20160117 |10000 |30000 |1 |1234
20160124 |30000 |60000 |1 |1234
If something isn't clear I'll try to explain.
Not sure I completely understand your question, but this should allow you to recreate the running sum for SAL_MQuantityYTD. Replace #test with whatever your table/view is called.
SELECT *,
(SELECT SUM(SAL_MQuantity)
FROM #test T2
WHERE T2.SAL_DTDGID <= T1.SAL_DTDGID
AND T2.SAL_Site = T1.SAL_Site
AND T2.SAL_coreElement = T1.SAL_coreElement) AS RunningTotal
FROM #test T1
If you wanted to create the yearly figure then you could also use a correlated subquery like this
SELECT *,
(SELECT SUM(SAL_MQuantity)
FROM #test T2
WHERE cast(left(T2.SAL_DTDGID,4) as integer) = cast(left(T1.SAL_DTDGID,4) as integer)
AND T2.SAL_Site = T1.SAL_Site
AND T2.SAL_coreElement = T1.SAL_coreElement) AS RunningTotal
FROM #test T1
Edit: Just seen, basically the same answer, using a window function.
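For reference, a window-function version of the running total might look like the sketch below. It assumes SQL Server 2012 or later (ORDER BY inside an aggregate's OVER clause) and uses a hypothetical temp table holding the three sample rows from the question. Partitioning by the year part of SAL_DTDGID is an added assumption so that the total restarts each year.

```sql
-- Hypothetical temp table with the sample rows from the question.
CREATE TABLE #SalesPlan
(
    SAL_DTDGID      INT,
    SAL_MQuantity   INT,
    SAL_CoreElement INT,
    SAL_Site        INT
);

INSERT INTO #SalesPlan (SAL_DTDGID, SAL_MQuantity, SAL_CoreElement, SAL_Site)
VALUES (20160110, 20000, 1, 1234),
       (20160117, 10000, 1, 1234),
       (20160124, 30000, 1, 1234);

-- Running sum per site, core element and year, in date order.
-- SAL_DTDGID / 10000 is integer division, so 20160110 becomes 2016.
SELECT S.*,
       SUM(S.SAL_MQuantity) OVER (
           PARTITION BY S.SAL_Site, S.SAL_CoreElement, S.SAL_DTDGID / 10000
           ORDER BY S.SAL_DTDGID
           ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM #SalesPlan AS S
ORDER BY S.SAL_Site, S.SAL_CoreElement, S.SAL_DTDGID;

DROP TABLE #SalesPlan;
```

With the sample rows this yields 20000, 30000 and 60000, matching the SAL_MQuantityYTD column in the question.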
Let me explain an idea. Please try the query below.
Select A, B,
(Select SUM(SAL_MQuantity)
FROM [Your Table]
WHERE [your date column] between '20160101' AND '[Present date]') AS SAL_MQuantityYTD
FROM [Your Table]
My understanding from your question is that you want the YTD sum of SAL_MQuantity for each year (you can simply add a WHERE afterwards if you only want 2016), SAL_Site and SAL_CoreElement.
The code below should achieve that and will run on SQL Server 2008 R2 (I'm running 2005).
'##t1' is the temp table name I used to test, replace it with your table name.
Select distinct
sum (SAL_MQuantity) over (partition by
left (cast (cast (SAL_DTDGID as int) as varchar (8)),4)
, SAL_Site
, SAL_CoreElement
) as Sum_SAL_DTDGID
,left (cast (cast (SAL_DTDGID as int) as varchar (8)),4) as Time_Period
, SAL_Site
, SAL_CoreElement
from ##t1

Flatten/merge overlapping time intervals

I have a 'Service' table with millions of rows. Each row corresponds to a service provided by a staff member in a given date and time interval (each row has a unique ID). There are cases where a staff member might provide services in overlapping time frames. I need to write a query that merges overlapping time intervals and returns the data in the format shown below.
I tried grouping by StaffID and Date fields and getting the Min of BeginTime and Max of EndTime but that does not account for the non-overlapping time frames. How can I accomplish this? Again, the table contains several million records so a recursive CTE approach might have performance issues. Thanks in advance.
Service Table
ID StaffID Date BeginTime EndTime
1 101 2014-01-01 08:00 09:00
2 101 2014-01-01 08:30 09:30
3 101 2014-01-01 18:00 20:30
4 101 2014-01-01 19:00 21:00
Output
StaffID Date BeginTime EndTime
101 2014-01-01 08:00 09:30
101 2014-01-01 18:00 21:00
Here is another sample data set with a query proposed by a contributor.
http://sqlfiddle.com/#!6/bfbdc/3
The first two rows in the results set should be merged into one row (06:00-08:45) but it generates two rows (06:00-08:30 & 06:00-08:45)
I only came up with a CTE query, as the problem is that there may be a chain of overlapping times, e.g. record 1 overlaps with record 2, record 2 with record 3 and so on. This is hard to resolve without a CTE or some other kind of loop. Please give it a go anyway.
The first part of the CTE query gets the services that start a new group and do not have the same starting time as some other service (I need to have just one record that starts a group). The second part gets those that start a group but where there is more than one with the same start time; again, I need just one of them. The last part recursively builds up on the starting group, taking all overlapping services.
Here is SQLFiddle with more records added to demonstrate different kinds of overlapping and duplicate times.
I couldn't use ServiceID as it would have to be ordered in the same way as BeginTime.
;with flat as
(
select StaffID, ServiceDate, BeginTime, EndTime, BeginTime as groupid
from services S1
where not exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime <= S1.BeginTime and S2.EndTime <> S1.EndTime
and S2.EndTime > S1.BeginTime)
union all
select StaffID, ServiceDate, BeginTime, EndTime, BeginTime as groupid
from services S1
where exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime = S1.BeginTime and S2.EndTime > S1.EndTime)
and not exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime < S1.BeginTime
and S2.EndTime > S1.BeginTime)
union all
select S.StaffID, S.ServiceDate, S.BeginTime, S.EndTime, flat.groupid
from flat
inner join services S
on flat.StaffID = S.StaffID
and flat.ServiceDate = S.ServiceDate
and flat.EndTime > S.BeginTime
and flat.BeginTime < S.BeginTime and flat.EndTime < S.EndTime
)
select StaffID, ServiceDate, MIN(BeginTime) as begintime, MAX(EndTime) as endtime
from flat
group by StaffID, ServiceDate, groupid
order by StaffID, ServiceDate, begintime, endtime
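For comparison, the merge can also be sketched without recursion using window functions (a gaps-and-islands pattern, which is a different technique from the recursive CTE above). This assumes SQL Server 2012 or later for the window frame, and uses a hypothetical temp table loaded with the four sample rows from the question.

```sql
-- Hypothetical temp table with the sample rows from the question.
CREATE TABLE #Service
(
    ID        INT PRIMARY KEY,
    StaffID   INT,
    [Date]    DATE,
    BeginTime TIME(0),
    EndTime   TIME(0)
);

INSERT INTO #Service (ID, StaffID, [Date], BeginTime, EndTime)
VALUES (1, 101, '2014-01-01', '08:00', '09:00'),
       (2, 101, '2014-01-01', '08:30', '09:30'),
       (3, 101, '2014-01-01', '18:00', '20:30'),
       (4, 101, '2014-01-01', '19:00', '21:00');

WITH marked AS
(
    -- A row starts a new group when it begins at or after the latest
    -- EndTime seen so far for the same staff member and date.
    SELECT S.*,
           CASE WHEN S.BeginTime < MAX(S.EndTime) OVER (
                         PARTITION BY S.StaffID, S.[Date]
                         ORDER BY S.BeginTime, S.EndTime, S.ID
                         ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
                THEN 0 ELSE 1 END AS IsStart
    FROM #Service AS S
),
grouped AS
(
    -- The running count of group starts labels each merged interval.
    SELECT M.*,
           SUM(M.IsStart) OVER (
               PARTITION BY M.StaffID, M.[Date]
               ORDER BY M.BeginTime, M.EndTime, M.ID
               ROWS UNBOUNDED PRECEDING) AS GroupNo
    FROM marked AS M
)
SELECT StaffID, [Date], MIN(BeginTime) AS BeginTime, MAX(EndTime) AS EndTime
FROM grouped
GROUP BY StaffID, [Date], GroupNo
ORDER BY StaffID, [Date], BeginTime;

DROP TABLE #Service;
```

On the sample rows this returns 08:00-09:30 and 18:00-21:00 for staff 101. The ID column is included in both ORDER BY clauses only as a tie-breaker, so that rows with identical times are ordered the same way in both passes.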
Elsewhere I've answered a similar Date Packing question with
a geometric strategy. Namely, I interpret the date ranges
as a line, and utilize geometry::UnionAggregate to merge
the ranges.
Your question has two peculiarities though. First, it calls
for sql-server-2008. geometry::UnionAggregate is not
available there. However, download the Microsoft library at
https://github.com/microsoft/SQLServerSpatialTools and load
it in as a CLR assembly to your instance and you have it
available as dbo.GeometryUnionAggregate.
But the real peculiarity that has my interest is the concern
that you have several million rows to work with. So I thought
I'd repeat the strategy here but with an added technique to
improve its performance. This technique will work well if
you have a lot of your StaffID/date subsets that are the same.
First, let's build a numbers table. Swap this out with your favorite
way to do it.
select i = row_number() over (order by (select null))
into #numbers
from #services; -- where i put your data
Then convert the dates to floats and use those floats to create
geometrical points.
These points can then be turned into lines via STUnion and STEnvelope.
With your ranges now represented as geometric lines, merge them via
UnionAggregate. The resulting geometry object 'lines' might contain
multiple lines. But any overlapping lines turn into one line.
select s.StaffID,
s.Date,
linesWKT = geometry::UnionAggregate(line).ToString()
-- If you have SQLSpatialTools installed then:
-- linesWKT = dbo.GeometryUnionAggregate(line).ToString()
into #aggregateRangesToGeo
from #services s
cross apply (select
beginTimeF = convert(float, convert(datetime,beginTime)),
endTimeF = convert(float, convert(datetime,endTime))
) prepare
cross apply (select
beginPt = geometry::Point(beginTimeF, 0, 0),
endPt = geometry::Point(endTimeF, 0, 0)
) pointify
cross apply (select
line = beginPt.STUnion(endPt).STEnvelope()
) lineify
group by s.StaffID,
s.Date;
You have one 'lines' object for each staffId/date combo. But depending
on your dataset, there may be many 'lines' objects that are the same
between these combos. This may very well be true if staff are expected
to follow a routine and data is recorded to the nearest whatever.
So get a distinct listing of 'lines' objects. This should improve
performance.
From this, extract the individual lines inside 'lines'. Envelope the lines,
which ensures that the lines are stored only as their endpoints. Read the
endpoint x values and convert them back to their time representations.
Keep the WKT representation to join it back to the combos later on.
select lns.linesWKT,
beginTime = convert(time, convert(datetime, ap.beginTime)),
endTime = convert(time, convert(datetime, ap.endTime))
into #parsedLines
from (select distinct linesWKT from #aggregateRangesToGeo) lns
cross apply (select
lines = geometry::STGeomFromText(linesWKT, 0)
) geo
join #numbers n on n.i between 1 and geo.lines.STNumGeometries()
cross apply (select
line = geo.lines.STGeometryN(n.i).STEnvelope()
) ln
cross apply (select
beginTime = ln.line.STPointN(1).STX,
endTime = ln.line.STPointN(3).STX
) ap;
Now just join your parsed data back to the StaffId/Date combos.
select ar.StaffID,
ar.Date,
pl.beginTime,
pl.endTime
from #aggregateRangesToGeo ar
join #parsedLines pl on ar.linesWKT = pl.linesWKT
order by ar.StaffID,
ar.Date,
pl.beginTime;

Resources