Result comes with adding different columns - sql-server

This is the given table's data; I want the output like the one shown below.
slno name salary
-----------------------------
1 raj 5000.0000
2 laba 4000.0000
3 silu 3000.0000
4 jaya 6000.0000
5 papu 7000.0000
6 tikan 9000.0000
7 susanta 6000.0000
8 chiku 4500.0000
9 micky 5500.0000
10 susa 2500.0000
11 musa 6500.0000
12 pi 6500.0000
13 luna 7500.0000
14 tuna 9500.0000
15 tina 3500.0000
Desired output
slno name salary
----------------------
1 raj 5000.0000
2 laba 4000.0000
3 silu 3000.0000
4 jaya 6000.0000
5 papu 7000.0000
6-10 ---- 27500.0000 (total salary from 6-10)
6-15 ---- 61000.0000 (total salary from 6-15)

Try this:
create table #table_name (slno int, name varchar(20), salary float);
insert into #table_name (slno, name, salary) values
(1, 'raj', 5000.0000),
(2, 'laba', 4000.0000),
(3, 'silu', 3000.0000),
(4, 'jaya', 6000.0000),
(5, 'papu', 7000.0000),
(6, 'tikan', 9000.0000),
(7, 'susanta', 6000.0000),
(8, 'chiku', 4500.0000),
(9, 'micky', 5500.0000),
(10, 'susa', 2500.0000),
(11, 'musa', 6500.0000),
(12, 'pi', 6500.0000),
(13, 'luna', 7500.0000),
(14, 'tuna', 9500.0000),
(15, 'tina', 3500.0000);
select cast(slno as varchar(10)) [slno]
, name
, salary
from #table_name where slno <= 5
union all
select '6-10'
, '----'
, sum(salary)
from #table_name where slno between 6 and 10
union all
select '6-15'
, '----'
, sum(salary)
from #table_name where slno between 6 and 15
Result
slno name salary
----------------------
1 raj 5000
2 laba 4000
3 silu 3000
4 jaya 6000
5 papu 7000
6-10 ---- 27500
6-15 ---- 61000
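Note that UNION ALL does not guarantee row order. If the output must appear in the order shown, one option (a minimal sketch) is to carry a numeric sort key through the union and order by it:
select slno, name, salary
from (
    select cast(slno as varchar(10)) as slno, name, salary, slno as sort_key
    from #table_name where slno <= 5
    union all
    select '6-10', '----', sum(salary), 6
    from #table_name where slno between 6 and 10
    union all
    select '6-15', '----', sum(salary), 7
    from #table_name where slno between 6 and 15
) t
order by sort_key;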

Create a single time series master table using different tables from many different dates (SQL Server/SSMS18)

I tried to find answers by searching articles on the web and SO suggestions (e.g., INSERT, ALTER TABLE, MERGE, COALESCE, INSERT INTO SELECT). This suggestion using FULL JOIN or UNION ALL is close to what is needed, but the new fields added to the table need to be appended to their corresponding "id" and not become new records as shown (Table C): Creating table from two different tables sql
SSMS2018 will be used to create a time series using data from different tables. Each date has multiple tables with different fields. The field "id" is present in all tables (FYI: "id" is the company's id number).
Steps:
Step 1: combine all the fields into one new table for a given date.
Step 2: create a master table with the data for all dates and all fields (note: there may be new "id"s added or existing "id"s dropped across dates). The goal is to be able to analyze the values for each field across all dates, grouped by "id" (see example below).
Questions:
What SQL statement(s) in SSMS '18 are used to perform the steps above?
Is it possible and more efficient to use JOINs or another SQL function to perform Step 2?
Example:
Step 1: Append the fields in Table 2 to Table 1
Table 1
date id field1 field2
20191231 a1 4 4
20191231 b5 4 10
20191231 c9 2 9

Table 2
date id field5 field6
20191231 a1 9 5
20191231 b5 8 8
20191231 c9 9 10
Table 1 (revised)
date id field1 field2 field5 field6
20191231 a1 4 4 9 5
20191231 b5 4 10 8 8
20191231 c9 2 9 9 10
Step 2: Combine / Merge Table 1 (revised) with Table 4 (Table 4 was previously created using Step 1) to create a time series in "New Table"
Table 4
date id field1 field2 field5 field6
20190930 a1 1 7 0 7
20190930 b5 3 2 6 1
20190930 c9 5 10 4 6
20190930 d11 0 5 3 7
New Table
date id field1 field2 field5 field6
20190930 a1 1 7 0 7
20191231 a1 4 4 9 5
20190930 b5 3 2 6 1
20191231 b5 4 10 8 8
20190930 c9 5 10 4 6
20191231 c9 2 9 9 10
20190930 d11 0 5 3 7
20191231 d11 NULL NULL NULL NULL
Instead of "appending fields" from Table2 to Table1, etc. and then creating a main table, the relational way would be to convert variable lists of columns into variable rows with a fixed number of columns. This means 'unpivoting' each table directly into a normalized Test_Main table. The 'New Table' output could be produced by a query using conditional aggregation.
Data
drop table if exists #tTEST1;
go
select * INTO #tTEST1 from (values
('20191231', 'a1', 4, 4),
('20191231', 'b5', 4, 10),
('20191231', 'c9', 2, 9)) V(mdate, id, field1, field2);
drop table if exists #tTEST2;
go
select * INTO #tTEST2 from (values
('20191231', 'a1', 9, 5),
('20191231', 'b5', 8, 8),
('20191231', 'c9', 9, 10)) V(mdate, id, field5, field6);
drop table if exists #tTEST4;
go
select * INTO #tTEST4 from (values
('20191230', 'a1', 1, 7, 0, 7),
('20191230', 'b5', 3, 2, 6, 1),
('20191230', 'c9', 5, 10, 4, 6),
('20191230', 'd11', 0, 5, 3, 7)) V(mdate, id, field1, field2, field5, field6);
DDL of main table
drop table if exists #tTEST_Main;
go
create table #tTEST_Main(
id varchar(10) not null,
mdate date not null,
field_name varchar(100) not null,
series_val int not null,
constraint
unq_tm_id_m_fn unique(id, mdate, field_name));
Unpivoting queries to populate Test_Main table
insert #tTEST_Main(id, mdate, field_name, series_val)
select v.*
from #tTEST1 t1
cross apply
(values (id, mdate, 'field1', field1),
(id, mdate, 'field2', field2)) v(id, mdate, field_name, series_val);
insert #tTEST_Main(id, mdate, field_name, series_val)
select v.*
from #tTEST2 t2
cross apply
(values (id, mdate, 'field5', field5),
(id, mdate, 'field6', field6)) v(id, mdate, field_name, series_val);
insert #tTEST_Main(id, mdate, field_name, series_val)
select v.*
from #tTEST4 t4
cross apply
(values (id, mdate, 'field1', field1),
(id, mdate, 'field2', field2),
(id, mdate, 'field5', field5),
(id, mdate, 'field6', field6)) v(id, mdate, field_name, series_val);
Query to output "New Table" results
select id, mdate,
max(case when field_name='field1' then series_val else 0 end) field1,
max(case when field_name='field2' then series_val else 0 end) field2,
max(case when field_name='field5' then series_val else 0 end) field5,
max(case when field_name='field6' then series_val else 0 end) field6
from #tTEST_Main
group by id, mdate;
Output
id mdate field1 field2 field5 field6
a1 2019-12-30 1 7 0 7
a1 2019-12-31 4 4 9 5
b5 2019-12-30 3 2 6 1
b5 2019-12-31 4 10 8 8
c9 2019-12-30 5 10 4 6
c9 2019-12-31 2 9 9 10
d11 2019-12-30 0 5 3 7
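Note one difference from the desired "New Table": the query above only returns id/mdate combinations that exist in #tTEST_Main, and the else 0 turns missing fields into 0, so there is no NULL row for d11 on 2019-12-31. If those NULL rows are wanted, a sketch (cross join the distinct ids and dates, left join the unpivoted data, and drop the else 0):
select i.id, d.mdate,
    max(case when m.field_name='field1' then m.series_val end) field1,
    max(case when m.field_name='field2' then m.series_val end) field2,
    max(case when m.field_name='field5' then m.series_val end) field5,
    max(case when m.field_name='field6' then m.series_val end) field6
from (select distinct id from #tTEST_Main) i
cross join (select distinct mdate from #tTEST_Main) d
left join #tTEST_Main m on m.id = i.id and m.mdate = d.mdate
group by i.id, d.mdate;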

Function to return date of 4hrs minus from given date in SQL Server

I have a table Emp:
Create table Emp
(
empno int,
ename varchar(50),
doj varchar(30),
salary int
);
insert into Emp
values (1, 'raj', '2010-06-30 08:10:45', 5000),
(2, 'kiran', '2018-12-05 18:20:24', 40000),
(3, 'akbar', '2015-04-12 20:02:45', 9000),
(4, 'nitin', '2010-03-11 02:10:23', 3000),
(5, 'Rahul', '2013-12-03 13:23:30', 15000);
Emp table:
-------+------------------+--------------------------+-----------------
empno ename doj salary
-------+------------------+--------------------------+-----------------
1 raj 2010-06-30 08:10:45 5000
2 kiran 2018-12-05 18:20:24 40000
3 akbar 2015-04-12 20:02:45 9000
4 nitin 2010-03-11 02:10:23 3000
5 Rahul 2013-12-03 13:23:30 15000
-------+------------------+-------------------------+-----------------
Here I want to subtract 4 hours from doj and return the adjusted values.
I wrote this SQL query and it's working:
select
format(cast(doj as datetime) - cast('04:00' as datetime), 'yyyy-MM-dd HH:mm:ss') "4Hrs_Minus"
from emp;
Now I want to wrap this in a function that returns the same output as above:
Query output:
-------+-----------------------+---------------------
empno doj 4Hrs_Minus
-------+-----------------------+---------------------
1 2010-06-30 08:10:45 2010-06-30 04:10:45
2 2018-12-05 18:20:24 2018-12-05 14:20:24
3 2015-04-12 20:02:45 2015-04-12 16:02:45
4 2010-03-11 02:10:23 2010-03-10 22:10:23
5 2013-12-03 13:23:30 2013-12-03 09:23:30
-------+-----------------------+----------------------
SELECT empno, doj, DATEADD(hh, -4, CAST(doj AS datetime)) as [4Hrs_Minus] FROM Emp
Documentation:
https://learn.microsoft.com/en-us/sql/t-sql/functions/dateadd-transact-sql?view=sql-server-2017
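Since the question asks for a function, a minimal sketch of a scalar UDF wrapping DATEADD (the name dbo.fn_MinusFourHours is just an example, not from the original post):
CREATE FUNCTION dbo.fn_MinusFourHours (@d datetime)
RETURNS datetime
AS
BEGIN
    -- subtract 4 hours from the supplied datetime
    RETURN DATEADD(hour, -4, @d);
END
GO
-- usage: doj is stored as varchar(30), so cast it first
SELECT empno, doj, dbo.fn_MinusFourHours(CAST(doj AS datetime)) AS [4Hrs_Minus]
FROM Emp;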

Running total/ID groups based on specific value in TSQL

I have data that looks like ID and Col1, where the value 01 in Col1 denotes the start of a related group of rows lasting until the next 01.
Sample Data:
ID Col1
1 01
2 02
3 02
---------
4 01
5 02
6 03
7 03
----------
8 01
9 03
----------
10 01
I need to calculate GroupTotal, a running total of the '01' values in Col1, and also GroupID, an incrementing ID that resets at every instance of '01' in Col1. Row order must be preserved by ID.
Desired Results:
ID Col1 GroupTotal GroupID
1 01 1 1
2 02 1 2
3 02 1 3
----------------------------
4 01 2 1
5 02 2 2
6 03 2 3
7 03 2 4
----------------------------
8 01 3 1
9 03 3 2
----------------------------
10 01 4 1
I've been messing with OVER, PARTITION BY etc. and cannot crack either.
Thanks
I believe what the OP is saying is that the only data available is a table with the id and col1 data, and that the desired results is what is currently posted in the question.
If that is the case, you just need the following.
Sample Data Setup:
declare @grp_tbl table (id int, col1 int)
insert into @grp_tbl (id, col1)
values (1, 1),(2, 2),(3, 2),(4, 1),(5, 2),(6, 3),(7, 3),(8, 1),(9, 3),(10, 1)
Answer:
declare @max_id int = (select max(id) from @grp_tbl)
; with grp_cnt as
(
--getting the range of ids that are in each group
--and ranking them
select gt.id
, lead(gt.id - 1, 1, @max_id) over (order by gt.id asc) as id_max --max id in the group
, row_number() over (order by gt.id asc) as grp_ttl
from @grp_tbl as gt
where 1=1
and gt.col1 = 1
)
--ranking the range of ids inside each group
select gt.id
, gt.col1
, gc.grp_ttl as group_total
, row_number() over (partition by gc.grp_ttl order by gt.id asc) as group_id
from @grp_tbl as gt
left join grp_cnt as gc on gt.id between gc.id and gc.id_max
Final Results:
id col1 group_total group_id
1 1 1 1
2 2 1 2
3 2 1 3
4 1 2 1
5 2 2 2
6 3 2 3
7 3 2 4
8 1 3 1
9 3 3 2
10 1 4 1
If I understood correctly, this is what you want:
CREATE TABLE #tmp
([ID] int, [Col1] int, [GroupTotal] int, [GroupID] int)
;
INSERT INTO #tmp
([ID], [Col1], [GroupTotal], [GroupID])
VALUES
(1, 01, 1, 1),
(2, 02, 1, 2),
(3, 02, 1, 3),
(4, 01, 2, 1),
(5, 02, 2, 2),
(6, 03, 2, 3),
(7, 03, 2, 4),
(8, 01, 3, 1),
(9, 03, 3, 2),
(10, 01, 4, 1)
;
select *, row_number() over (partition by Grp order by ID) as GrpID From (
select ID, Col1, [GroupTotal],
sum(case when Col1 = 1 then 1 else 0 end) over (Order by ID) as Grp,
[GroupID]
from #tmp
) as t
The SUM with CASE handles the groups: 1 is added whenever Col1 = 1, and that running total is then used in the ROW_NUMBER to partition the groups.
I'm not really sure what you are after, but you are on the right track with partitioning functions. The following calculates a running total of GroupId by GroupTotal. I'm sure that's not what you want, but it shows how you can achieve it.
select *, SUM(GroupId) over (partition by grouptotal order by id)
from #tmp
order by grouptotal, id
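For completeness, if the real table only holds id and col1 (no precomputed GroupTotal/GroupID columns), both values can be derived in a single statement against the @grp_tbl sample above; a sketch:
;with g as (
    select id, col1,
        sum(case when col1 = 1 then 1 else 0 end) over (order by id) as group_total
    from @grp_tbl
)
select id, col1, group_total,
    row_number() over (partition by group_total order by id) as group_id
from g
order by id;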

SQL - Count and group records by month and field value from the last year

I need to count the total number of records in a table, 'a', where a field in 'a', say 'type', has a certain value, 'v'. From all these records where a.type = 'v', I need to group these twice: first by field 'b_id', and again by month. The date range for these records must be restricted to the last year from the current date.
I already have the totals for the 'b_id' field with ISNULL() as follows:
SELECT ISNULL((
SELECT COUNT(*)
FROM a
WHERE a.type = 'v'
AND b.b_id = a.b_id
), 0) AS b_totals
The data lies in table a, and is joined on table b. 'b_id' is the primary key for table b, and is found in table a (though it is not part of a's key). The key for a is irrelevant to the data I need to pull, but can be stated as "a_id" for simplicity.
How do I:
Restrict these records to the past twelve months from the current date.
Take the total for any and all values of b.id, and categorize them by month. This is in addition to the totals of b.id by year. The date is stored in field "date_occurred" in table 'a' as a standard date/time type.
The schema at the end should look something like this, assuming that the current month is October and the year is 2016:
b.id | b_totals | Nov. 2015 | Dec. 2015 | Jan. 2016 .... Oct. 2016
__________________________________________________________________
ID_1 1 0 0 0 1
ID_2 3 2 0 1 0
ID_3 5 1 1 3 0
EDIT: I should probably clarify that I'm counting the records in table 'a' where field 'type' has a certain value 'v'. From these records, I need to group them by building and then by month/date. I updated my ISNULL query to make this more clear, as well as the keys for a and b. "date_occurred" should be in table a, not b; that was a mistake/typo on my end.
If it helps, the best way I can describe the data from a high level without giving away any sensitive data:
'b' is a table of locations, and 'b.b_id' is the ID for each location
'a' is a table of events. The location for these events is found in 'a.b_id' and joined on 'b.b_id'. The date that each event occurred is in 'a.date_occurred'.
I need to restrict the type of events to a certain value. In this case, the type is field 'type.' This is the "where" clause in my ISNULL SQL query that gets the totals by location.
From all the events of this particular type, I need to count how many times this event occurred in the past year for each location. Once I have these totals from the past year, I need to count them by month.
Table structure:
The table structure of a is something like
a.a_id | a.b_id | a.type | a.date_occurred
Again, I do not need the ID's from a: just a series of counts based on type, b_id, and date_occurred.
EDIT 2: I restricted the totals of b_id to the past year with the following query:
SELECT ISNULL((
SELECT COUNT(*)
FROM a
WHERE a.type = 'v'
AND b.b_id = a.b_id
AND a.date_occurred BETWEEN DATEADD(yyyy, -1, GETDATE()) AND GETDATE()
), 0) AS b_totals
Now I need to do this with a PIVOT and the months.
In an attempt to make this sufficiently detailed from the absolute minimum of detail provided in the question I have created these 2 example tables with some data:
CREATE TABLE Bexample
([ID] int)
;
INSERT INTO Bexample
([ID])
VALUES
(1),
(2),
(3),
(4),
(5),
(6),
(7),
(8),
(9)
;
CREATE TABLE Aexample
([ID] int, [B_PK] int, [SOME_DT] datetime)
;
INSERT INTO Aexample
([ID], [B_PK], [SOME_DT])
VALUES
(1, 1, '2015-01-01 00:00:00'),
(2, 2, '2015-02-01 00:00:00'),
(3, 3, '2015-03-01 00:00:00'),
(4, 4, '2015-04-01 00:00:00'),
(5, 5, '2015-05-01 00:00:00'),
(6, 6, '2015-06-01 00:00:00'),
(7, 7, '2015-07-01 00:00:00'),
(8, 8, '2015-08-01 00:00:00'),
(9, 9, '2015-09-01 00:00:00'),
(10, 1, '2015-10-01 00:00:00'),
(11, 2, '2015-11-01 00:00:00'),
(12, 3, '2015-12-01 00:00:00'),
(13, 1, '2016-01-01 00:00:00'),
(14, 2, '2016-02-01 00:00:00'),
(15, 3, '2016-03-01 00:00:00'),
(16, 4, '2016-04-01 00:00:00'),
(17, 5, '2016-05-01 00:00:00'),
(18, 6, '2016-06-01 00:00:00'),
(19, 7, '2016-07-01 00:00:00'),
(20, 8, '2016-08-01 00:00:00'),
(21, 9, '2016-09-01 00:00:00'),
(22, 1, '2016-10-01 00:00:00'),
(23, 2, '2016-11-01 00:00:00'),
(24, 3, '2016-12-01 00:00:00')
;
Now, using those tables and data I can generate a result table like this:
id Nov 2015 Dec 2015 Jan 2016 Feb 2016 Mar 2016 Apr 2016 May 2016 Jun 2016 Jul 2016 Aug 2016 Sep 2016 Oct 2016
1 0 0 1 0 0 0 0 0 0 0 0 1
2 1 0 0 1 0 0 0 0 0 0 0 0
3 0 1 0 0 1 0 0 0 0 0 0 0
4 0 0 0 0 0 1 0 0 0 0 0 0
5 0 0 0 0 0 0 1 0 0 0 0 0
6 0 0 0 0 0 0 0 1 0 0 0 0
7 0 0 0 0 0 0 0 0 1 0 0 0
8 0 0 0 0 0 0 0 0 0 1 0 0
9 0 0 0 0 0 0 0 0 0 0 1 0
Using a query that needs both a "common table expression" (CTE) and "dynamic SQL" to produce that result:
"Dynamic SQL" is a query that generates SQL which is then executed. This is needed because the column names change month to month. For the dynamic SQL we declare 2 variables to hold the generated SQL: one stores the column names, which are used in 2 places, and the other holds the completed query. Note that instead of executing the result you may display the generated SQL while you develop your solution (see the comments near EXECUTE at the end of the query).
In addition to the example tables and data, we also have a "time series" of 12 months to consider. This is "dynamic" as it is calculated from today's date; I have assumed that if today is any day within November 2016, "the last 12 months" starts at 1 Nov 2015 and concludes at 31 Oct 2016 (i.e. 12 full months, no partial months).
The core of calculating this is here:
DATEADD(month,-12, DATEADD(month, DATEDIFF(month,0,GETDATE()), 0) )
which first locates the first day of the current month via DATEADD(month, DATEDIFF(month,0,GETDATE()), 0), then deducts a further 12 months from that date. With that as a start date, a "recursive CTE" is used to generate 12 rows, one for each of the past 12 full months.
The purpose of these 12 rows is to ensure that when we consider the actual table data there will be no gaps in the 12 columns. This is achieved by using the generated 12 rows as the "from table" in our query, and the "A" table is LEFT JOINED to the 12 monthly rows based on the year/month of the date column [some_dt].
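As a quick sanity check of that date arithmetic (assuming GETDATE() falls anywhere within November 2016), a sketch:
select DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0) as first_of_current_month -- 2016-11-01
    , DATEADD(month, -12, DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0)) as window_start; -- 2015-11-01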
So, we generate 12 rows and join the sample data to them, and that is used to generate the SQL necessary for a "PIVOT" of the data. Here it is useful to see an example of the generated SQL, which looks like this:
SELECT id, [Nov 2015],[Dec 2015],[Jan 2016],[Feb 2016],[Mar 2016],[Apr 2016],[May 2016],[Jun 2016],[Jul 2016],[Aug 2016],[Sep 2016],[Oct 2016] FROM
(
select
format([mnth],'MMM yyyy') colname
, b.id
, a.b_pk
from #mylist
cross join bexample b
left join aexample a on #mylist.mnth = DATEADD(month, DATEDIFF(month,0,a.some_dt), 0)
and b.id = a.b_pk
) sourcedata
pivot
(
count([b_pk])
FOR [colname] IN ([Nov 2015],[Dec 2015],[Jan 2016],[Feb 2016],[Mar 2016],[Apr 2016],[May 2016],[Jun 2016],[Jul 2016],[Aug 2016],[Sep 2016],[Oct 2016])
) p
So, hopefully you can see in that generated SQL code that the dynamically created 12 rows become 12 columns. Note that because we are executing "dynamic sql" the 12 rows we generated as a CTE need to be stored as a "temporary table" (#mylist).
The query to generate AND execute that SQL is this.
DECLARE @cols AS VARCHAR(MAX)
DECLARE @query AS VARCHAR(MAX)
;with mylist as (
select DATEADD(month,-12, DATEADD(month, DATEDIFF(month,0,GETDATE()), 0) ) as [mnth]
union all
select DATEADD(month,1,[mnth])
from mylist
where [mnth] < DATEADD(month,-1, DATEADD(month, DATEDIFF(month,0,GETDATE()), 0) )
)
select [mnth]
into #mylist
from mylist
SELECT @cols = STUFF((SELECT ',' + QUOTENAME(format([mnth],'MMM yyyy'))
FROM #mylist
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
SET @query = 'SELECT id, ' + @cols + ' FROM
(
select
format([mnth],''MMM yyyy'') colname
, b.id
, a.b_pk
from #mylist
cross join bexample b
left join aexample a on #mylist.mnth = DATEADD(month, DATEDIFF(month,0,a.some_dt), 0)
and b.id = a.b_pk
) sourcedata
pivot
(
count([b_pk])
FOR [colname] IN (' + @cols + ')
) p '
--select @query -- use select to inspect the generated sql
execute(@query) -- once satisfied that sql is OK, use execute
drop table #mylist
You can see this working at: http://rextester.com/VVGZ39193
I want to share another attempt at explaining the issues faced by your requirements.
To follow this you MUST understand this sample data. I have 2 tables, @Events (a) and @Locations (b). The column names should be easy to follow, I hope.
declare @Events table
( [id] int IDENTITY(1007,2)
, [b_id] int
, [date_occurred] datetime
, [type] varchar(20)
)
;
INSERT INTO @Events
([b_id], [date_occurred],[type])
VALUES
(1, '2015-01-11 00:00:00','v'),
(2, '2015-02-21 00:00:00','v'),
(3, '2015-03-11 00:00:00','v'),
(4, '2015-04-21 00:00:00','v'),
(5, '2015-05-11 00:00:00','v'),
(6, '2015-06-21 00:00:00','v'),
(1, '2015-07-11 00:00:00','v'),
(2, '2015-08-11 00:00:00','v'),
(3, '2015-09-11 00:00:00','v'),
(5, '2015-10-11 00:00:00','v'),
(5, '2015-11-21 00:00:00','v'),
(6, '2015-12-21 00:00:00','v'),
(1, '2016-01-21 00:00:00','v'),
(2, '2016-02-21 00:00:00','v'),
(3, '2016-03-21 00:00:00','v'),
(4, '2016-04-21 00:00:00','v'),
(5, '2016-05-21 00:00:00','v'),
(6, '2016-06-21 00:00:00','v'),
(1, '2016-07-11 00:00:00','v'),
(2, '2016-08-21 00:00:00','v'),
(3, '2016-09-21 00:00:00','v'),
(4, '2016-10-11 00:00:00','v'),
(5, '2016-11-11 00:00:00','v'),
(6, '2016-12-11 00:00:00','v');
declare @Locations table
([id] int, [name] varchar(13))
;
INSERT INTO @Locations
([id], [name])
VALUES
(1, 'Atlantic City'),
(2, 'Boston'),
(3, 'Chicago'),
(4, 'Denver'),
(5, 'Edgbaston'),
(6, 'Melbourne')
;
OK. So with that data we can easily create a set of counts using this query:
select
b.id
, b.name
, format(a.date_occurred,'yyyy MMM') mnth
, count(*) as cnt
FROM @Events a
inner join @Locations b ON b.id = a.b_id
WHERE a.type = 'v'
and a.date_occurred >= DATEADD(month,-12, DATEADD(month, DATEDIFF(month,0,GETDATE()), 0) )
group by
b.id
, b.name
, format(a.date_occurred,'yyyy MMM')
And that output looks like this:
id name          mnth     cnt
-- ------------- -------- ---
1 Atlantic City 2016 Jan 1
1 Atlantic City 2016 Jul 1
2 Boston 2016 Aug 1
2 Boston 2016 Feb 1
3 Chicago 2016 Mar 1
3 Chicago 2016 Sep 1
4 Denver 2016 Apr 1
4 Denver 2016 Oct 1
5 Edgbaston 2015 Nov 1
5 Edgbaston 2016 May 1
5 Edgbaston 2016 Nov 1
6 Melbourne 2015 Dec 1
6 Melbourne 2016 Dec 1
6 Melbourne 2016 Jun 1
So, with a "simple" query that is easy to pass parameters into, the output is BY ROWS and the column headings are FIXED.
Now do you see why transposing those rows into columns, with VARIABLE column headings, forces the use of dynamic SQL?
Your requirements, no matter how many words you throw at them, lead to complexity in the SQL.
You can run the above data/query here: https://data.stackexchange.com/stackoverflow/query/574718/count-and-group-records-by-month-and-field-value-from-the-last-year

SQL Server conditional subtotal query

Given the following table:
create table #T
(
user_id int,
project_id int,
datum datetime,
status varchar(10),
KM int
)
insert into #T values
(1, 1, '20160301 10:25', 'START', 1000),
(1, 1, '20160301 10:28', 'PASS', 1008),
(2, 2, '20160301 10:29', 'START', 2000),
(1, 1, '20160301 11:08', 'STOP', 1045),
(3, 3, '20160301 10:25', 'START', 3000),
(2, 2, '20160301 10:56', 'STOP', 2020),
(1, 4, '20160301 15:00', 'START', 1045),
(4, 5, '20160301 15:10', 'START', 400),
(1, 4, '20160301 15:10', 'PASS', 1060),
(1, 4, '20160301 15:20', 'PASS', 1080),
(1, 4, '20160301 15:30', 'STOP', 1080),
(4, 5, '20160301 15:40', 'STOP', 450),
(3, 3, '20160301 16:25', 'STOP', 3200)
I have to sum the length of a track between the START and STOP statuses for a given user and project.
The expected result would be this:
user_id project_id datum TOTAL_KM
----------- ----------- ---------- -----------
1 1 2016-03-01 45
1 4 2016-03-01 35
2 2 2016-03-01 20
3 3 2016-03-01 200
4 5 2016-03-01 50
How can I achieve this without using a cursor?
Performance is an issue (I have over 1 million records per month and we have to keep the data for several years).
Explanation:
We can ignore the records with the status "PASS". Basically we have to subtract the KM value of the START record from the STOP record for a given user and project.
There can be several hundred records between a START and a STOP (as shown in the sample data).
The date should be the date of the START (to cover deliveries that run past midnight).
I think I should have a SELECT with an OVER() clause but I don't know how to formulate my query to respect those conditions.
Any idea?
SELECT t.[user_id],
t.project_id,
cast(t.datum as date) as datum,
t1.KM- t.KM as KM
FROM #T t
INNER JOIN #T t1
ON t.[user_id]=t1.[user_id] and t.project_id = t1.project_id
WHERE t.[status] = 'START' and t1.[status] = 'STOP'
ORDER BY t.[user_id],
t.project_id,
cast(t.datum as date)
Output:
user_id project_id datum KM
----------- ----------- ---------- -----------
1 1 2016-03-01 45
1 4 2016-03-01 35
2 2 2016-03-01 20
3 3 2016-03-01 200
4 5 2016-03-01 50
(5 row(s) affected)
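Since the question mentions OVER(), a window-function alternative (a sketch, requiring SQL Server 2012+ and assuming START/STOP pairs alternate per user/project without nesting) pairs each START with the KM of the following STOP via LEAD:
;with s as (
    select [user_id], project_id, datum, [status], KM,
        lead(KM) over (partition by [user_id], project_id order by datum) as next_km,
        lead([status]) over (partition by [user_id], project_id order by datum) as next_status
    from #T
    where [status] in ('START', 'STOP') -- PASS rows can be ignored
)
select [user_id], project_id, cast(datum as date) as datum, next_km - KM as TOTAL_KM
from s
where [status] = 'START' and next_status = 'STOP'
order by [user_id], project_id;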
This could be achieved by a simple self join.
One example (this may not be the exact query, just the idea):
Select
a.[user_id],
a.project_id,
cast(b.datum as date) as StartDate,
a.KM - b.KM as TotalKM
From #T a
Join
(
Select [user_id], project_id, datum, KM From #T
Where status = 'START'
) b ON b.[user_id] = a.[user_id] and b.project_id = a.project_id
Where a.status = 'STOP'
