Select query to get the first row for each id when an id has multiple rows (without PARTITION BY) - sql-server

id date amount documentNo paperID
1 2015/10/15 500 1234 34
1 2015/10/15 100 1332 33
2 2015/10/13 200 1302 21
2 2015/10/13 400 1332 33
3 2015/11/23 500 1332 43
I should get the output as:
id date amount documentNo paperID
1 2015/10/15 500 1234 34
2 2015/10/13 200 1302 21
3 2015/11/23 500 1332 43
Please suggest a simple SELECT query that fetches only one row per id without using PARTITION BY. Note: the date remains the same for a particular id.

Try a null self-join. You compare each row with the other rows that share its id and date, but through an inequality (here on documentNo). The row with the lowest documentNo in each group finds no match, and that unmatched row is the one you keep.
See this SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE Table1
(`id` int, `date` datetime, `amount` int, `documentNo` int, `paperID` int)
;
INSERT INTO Table1
(`id`, `date`, `amount`, `documentNo`, `paperID`)
VALUES
(1, '2015-10-15 00:00:00', 500, 1234, 34),
(1, '2015-10-15 00:00:00', 100, 1332, 33),
(2, '2015-10-13 00:00:00', 200, 1302, 21),
(2, '2015-10-13 00:00:00', 400, 1332, 33),
(3, '2015-11-23 00:00:00', 500, 1332, 43)
;
Query 1:
SELECT
t1.*
FROM Table1 AS t1
LEFT OUTER JOIN Table1 AS t2 ON t1.id = t2.id
AND t1.date = t2.date
AND t2.documentNo < t1.documentNo
WHERE t2.id IS NULL
Results:
| id | date | amount | documentNo | paperID |
|----|----------------------------|--------|------------|---------|
| 1 | October, 15 2015 00:00:00 | 500 | 1234 | 34 |
| 2 | October, 13 2015 00:00:00 | 200 | 1302 | 21 |
| 3 | November, 23 2015 00:00:00 | 500 | 1332 | 43 |
EDIT: There are several approaches to this problem even without windowing functions such as row_number(); here is a previous answer covering some MySQL-specific alternatives.
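
The null self-join described above can be tried without a MySQL or SQL Server instance. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database as a stand-in for the real server; the variable name first_rows is illustrative and not part of the original answer.

```python
import sqlite3

# In-memory SQLite stands in for the real server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (id INT, date TEXT, amount INT, documentNo INT, paperID INT)"
)
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2015-10-15", 500, 1234, 34),
        (1, "2015-10-15", 100, 1332, 33),
        (2, "2015-10-13", 200, 1302, 21),
        (2, "2015-10-13", 400, 1332, 33),
        (3, "2015-11-23", 500, 1332, 43),
    ],
)

# Keep a row only when no other row with the same id and date has a
# smaller documentNo: the LEFT JOIN then finds nothing and t2.id is NULL.
first_rows = conn.execute(
    """
    SELECT t1.id, t1.date, t1.amount, t1.documentNo, t1.paperID
    FROM Table1 AS t1
    LEFT JOIN Table1 AS t2
           ON t1.id = t2.id
          AND t1.date = t2.date
          AND t2.documentNo < t1.documentNo
    WHERE t2.id IS NULL
    ORDER BY t1.id
    """
).fetchall()

for row in first_rows:
    print(row)
```

If two rows in a group share the same documentNo, both survive the filter, so the column used in the inequality should be unique within each id.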


Cursor within a cursor

I need your help.
I am trying to nest a series of cursors inside other cursors.
The tables I want to build the cursors on are shown below.
First, open a cursor over the first table and read each record.
Second, table 1 has the column "ID_ASI"; using that column, I want to open another cursor that searches a second table (IMAGE OF TABLE 2) for all rows with the same "ID_ASI".
Finally, for each "ID_ASI" found in the second step, open a new cursor that looks for all "ID_DOC" values that share that "ID_ASI".
For example,
in the second step, the cursor on the "ID_ASI" column finds 3 rows with the same "ID" (101); the third step then searches for all "ID_DOC" with the same "ID_ASI".
For example, "ID_ASI" 101 has 3 rows with "ID_DOC" 10, 2 more with value 20, and finally 2 more with value 30.
The complicated part is how to group them all in the same way, and how to put a cursor inside a cursor.
This would be the result.
Thank you for attention.
To me it looks like what you want is to join the tables together, something like
SELECT t2.*
FROM TABLE_1 t1
INNER JOIN TABLE_2 t2
ON t2.ID_ASI = t1.ID_ASI
ORDER BY t2.ID_ASI, t2.ID_DOC
SQLFiddle here
Best of luck.
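
To illustrate why a join can replace the nested cursors, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. Only the ID_ASI and ID_DOC columns are modelled, and the sample rows are an assumption borrowed from the schema setup shown in the other answer; the name joined is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE_1 (ID_ASI INT)")
conn.execute("CREATE TABLE TABLE_2 (ID_ASI INT, ID_DOC INT)")
conn.executemany("INSERT INTO TABLE_1 VALUES (?)", [(101,), (201,), (301,)])
conn.executemany(
    "INSERT INTO TABLE_2 VALUES (?, ?)",
    [(101, 10), (101, 20), (101, 30), (201, 23), (201, 23), (104, 10), (104, 20)],
)

# One set-based join does the work of "for each row in TABLE_1,
# open a cursor on TABLE_2 for the matching ID_ASI".
joined = conn.execute(
    """
    SELECT t2.ID_ASI, t2.ID_DOC
    FROM TABLE_1 AS t1
    INNER JOIN TABLE_2 AS t2 ON t2.ID_ASI = t1.ID_ASI
    ORDER BY t2.ID_ASI, t2.ID_DOC
    """
).fetchall()

for row in joined:
    print(row)
```

Rows of TABLE_2 whose ID_ASI does not appear in TABLE_1 (104 here) are dropped by the inner join, and ID_ASI 301 contributes nothing because it has no documents.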
Use a hierarchical query:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TABLE_1 ( ID_ASI ) AS
SELECT 101 FROM DUAL UNION ALL
SELECT 201 FROM DUAL UNION ALL
SELECT 301 FROM DUAL;
CREATE TABLE TABLE_2 (ID_ASI, ID_DOC, IMPORT, IMPORT_TO ) AS
SELECT 101, 10, NULL, 1000 FROM DUAL UNION ALL
SELECT 101, 20, NULL, 2000 FROM DUAL UNION ALL
SELECT 101, 30, NULL, 3000 FROM DUAL UNION ALL
SELECT 201, 23, NULL, 430 FROM DUAL UNION ALL
SELECT 201, 23, 430, NULL FROM DUAL UNION ALL
SELECT 104, 10, 500, NULL FROM DUAL UNION ALL
SELECT 104, 20, 2000, NULL FROM DUAL UNION ALL
SELECT 104, 10, 500, NULL FROM DUAL UNION ALL
SELECT 104, 30, 3000, NULL FROM DUAL;
Query 1:
SELECT *
FROM TABLE_2
START WITH ID_ASI IN ( SELECT ID_ASI FROM TABLE_1 )
CONNECT BY PRIOR ID_DOC = ID_DOC
AND PRIOR ID_ASI < ID_ASI
ORDER SIBLINGS BY 1, 2, 3, 4
Results:
| ID_ASI | ID_DOC | IMPORT | IMPORT_TO |
|--------|--------|--------|-----------|
| 101 | 10 | (null) | 1000 |
| 104 | 10 | 500 | (null) |
| 104 | 10 | 500 | (null) |
| 101 | 20 | (null) | 2000 |
| 104 | 20 | 2000 | (null) |
| 101 | 30 | (null) | 3000 |
| 104 | 30 | 3000 | (null) |
| 201 | 23 | 430 | (null) |
| 201 | 23 | (null) | 430 |

MS Access Transform / Pivot to SQL Server with grouping

I know similar questions have been asked before; however, the grouping is throwing me off and hopefully I can get some help. I currently have a working MS Access model that runs custom calculations against an Oracle connection. However, my data is now pushing the 2GB mark with the custom calculations, so I am trying SQL Server Express as an alternative and need a bit of help.
The database structure is from a 3rd party application, so I have to live with what I have - it's UGLY.
ID | ATRSTUDY | ENDTIME | NAME | COUNT
---+----------+--------------------+-------------+-------
1 | A | Jan 1, 18 00:15 | NorthBound | 10
2 | A | Jan 1, 18 00:15 | SouthBound | 20
3 | A | Jan 1, 18 00:15 | Both Dir | 30
4 | B | Jan 1, 18 00:15 | EastBound | 30
5 | B | Jan 1, 18 00:15 | WestBound | 40
5 | B | Jan 1, 18 00:15 | Both Dir | 70
My existing MS-Access SQL is:
TRANSFORM Sum(CountData_Local.Count) AS SumOfCount
SELECT CountData_Local.ATRSTUDY, DateValue([CountData_Local]![ENDTIME]) AS CNTDATE, CountData_Local.ENDTIME
FROM DataVariables, CountData_Local
GROUP BY CountData_Local.ATRSTUDY, DateValue([CountData_Local]![ENDTIME]), CountData_Local.ENDTIME
PIVOT IIf([NAME]="EastBound" Or [NAME]="NorthBound" Or [NAME]="First Direction","C1",IIf([NAME]="WestBound" Or [NAME]="SouthBound" Or [NAME]="Second Direction","C2",IIf([NAME]="Both Dir","TC")));
The end result I am trying to achieve is a pivot table that combines the 3 rows into one row as follows:
ATRSTUDY | CNTDATE | ENDTIME | C1 | C2 | TC
---------+-----------+---------+----+----+---
A | Jan 1, 18 | 00:15 |10 | 20 | 30
B | Jan 1, 18 | 00:15 |30 | 40 | 70
Thanks in advance!
Here is my SQL to date. I am still having a bit of trouble showing the date and time, and adding a filter for a date range, which is currently commented out.
Ideally, the date would be filtered between A and B.
Select atrstudy, endtime, sum(c1), sum(c2), sum(tc) from
(
with t as
(
select
sdata.oid,
sdata.atrstudy,
case upper(datatyp.name)
when 'EASTBOUND' then 'C1'
when 'NORTHBOUND' then 'C1'
when 'FIRST DIRECTION' then 'C1'
when 'WESTBOUND' then 'C2'
when 'SOUTHBOUND' then 'C2'
when 'SECOND DIRECTION' then 'C2'
when 'BOTH DIRECTIONS' then 'TC'
else datatyp.name
end as namedir,
/* trunc(datadtlbs.starttime) as starttime, */
endtime as endtime,
datadtlbs.count as counttot,
sdata.gcrecord
from roy.atrstudydata sdata
left outer join roy.atrstudydatadtl datadtl
on sdata.oid = datadtl.parent
join roy.atrstudydattyp datatyp
on sdata.direction = datatyp.oid
left outer join roy.atrstudydatadtlbs datadtlbs
on datadtl.oid = datadtlbs.oid
/* where trunc(datadtlbs.endtime) > '31-DEC-15' can remove this where clause*/)
select atrstudy, endtime, C1, C2, TC from t
pivot
(sum(COUNTtot) for NAMEdir in ('C1' as C1, 'C2' as C2, 'TC' as TC)))
group by atrstudy, endtime
order by atrstudy, endtime
MS SQL Server has PIVOT to do such things.
And CASE can be used to group those names as codes.
Example snippet:
--
-- Using a table variable for easy testing in the example
--
declare @Count_data table (id int identity(1,1) primary key, ATRSTUDY varchar(1), ENDTIME datetime, NAME varchar(30), [COUNT] int);
insert into @Count_data (ATRSTUDY, ENDTIME, NAME, [COUNT]) values
('A','2018-01-01T18:00:15','NorthBound',10),
('A','2018-01-01T18:00:15','SouthBound',20),
('A','2018-01-01T18:00:15','Both Directions',30),
('B','2018-01-01T18:00:15','EastBound',30),
('B','2018-01-01T18:00:15','WestBound',40),
('B','2018-01-01T18:00:15','Both Directions',70);
select ATRSTUDY, ENDDATE, ENDTIME, [C1], [C2], [TC]
from (
select
d.ATRSTUDY,
cast(d.ENDTIME as date) as ENDDATE,
left(cast(d.ENDTIME as time),5) as ENDTIME,
(case -- if the Name column has a Case-Insensitive collation then lower or upper case won't matter.
when d.Name in ('eastbound', 'northbound', 'first direction') then 'C1'
when d.Name in ('westbound', 'southbound', 'second direction') then 'C2'
when d.Name like 'both dir%' then 'TC'
else d.Name
end) as ColName,
d.[count]
From @Count_data d
) as q
pivot (
sum([count])
for ColName in ([C1], [C2], [TC])
) as pvt
order by ATRSTUDY, ENDDATE, ENDTIME;
Output:
ATRSTUDY ENDDATE ENDTIME C1 C2 TC
-------- ---------- ------- -- -- --
A 2018-01-01 18:00 10 20 30
B 2018-01-01 18:00 30 40 70
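
For comparison, the same reshaping can be done without PIVOT by using conditional aggregation, which is a different technique from the one in the answer above. This is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory database; the table name count_data, the column name cnt and the variable pivoted are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE count_data (atrstudy TEXT, endtime TEXT, name TEXT, cnt INT)")
conn.executemany(
    "INSERT INTO count_data VALUES (?, ?, ?, ?)",
    [
        ("A", "2018-01-01 18:00:15", "NorthBound", 10),
        ("A", "2018-01-01 18:00:15", "SouthBound", 20),
        ("A", "2018-01-01 18:00:15", "Both Directions", 30),
        ("B", "2018-01-01 18:00:15", "EastBound", 30),
        ("B", "2018-01-01 18:00:15", "WestBound", 40),
        ("B", "2018-01-01 18:00:15", "Both Directions", 70),
    ],
)

# Each SUM(CASE ...) only adds up the rows that belong to its output column,
# so one GROUP BY produces the C1 / C2 / TC columns in a single pass.
pivoted = conn.execute(
    """
    SELECT atrstudy,
           date(endtime) AS enddate,
           substr(time(endtime), 1, 5) AS endtime_hhmm,
           SUM(CASE WHEN lower(name) IN ('eastbound', 'northbound', 'first direction')
                    THEN cnt END) AS c1,
           SUM(CASE WHEN lower(name) IN ('westbound', 'southbound', 'second direction')
                    THEN cnt END) AS c2,
           SUM(CASE WHEN lower(name) LIKE 'both dir%' THEN cnt END) AS tc
    FROM count_data
    GROUP BY atrstudy, date(endtime), substr(time(endtime), 1, 5)
    ORDER BY atrstudy
    """
).fetchall()

for row in pivoted:
    print(row)
```

The SUM(CASE ...) form is plain SQL, so the query text should carry over to SQL Server once the SQLite date helpers are swapped for the CAST expressions used above.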

SQL Server get two days difference and days count from date range

I am using SQL Server 2010.
I have a table in the database with records as shown below :
Id | EmpName | JoinDate | ResignedDate
---+----------+-------------------------+--------------
1 | Govind | 2014-04-02 00:00:00.000 | 2014-04-02
2 | Aravind | 2014-04-05 00:00:00.000 | 2014-04-05
3 | Aravind | 2014-04-07 00:00:00.000 | 2014-04-10
4 | Aravind | 2014-04-10 00:00:00.000 | 2014-04-11
5 | Aravind | 2014-04-14 00:00:00.000 | 2014-04-16
Now, I want to display the difference in days between the two dates (JoinDate, ResignedDate), and for each difference, how many rows have it.
Sample output:
DateDifferent Count
------------- -----
0 2
1 1
2 1
3 1
Here is my sample query:
entityManager.createNativeQuery(SELECT
DATEDIFF(day, e.joinedDate , e.resignedDate),
COUNT(DATEDIFF(day, e.joinedDate , e.resignedDate)))
FROM
Employee e
GROUP BY
DATEDIFF(e.joinedDate , e.resignedDate) ORDER BY (DATEDIFF(e.joinedDate , e.resignedDate)));
This query works well in the MSSQL query browser, but when I use it as a JPA native query (Java code) it does not work.
Can anyone help me?
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE Employee
([Id] int, [EmpName] varchar(7), [JoinDate] datetime, [ResignedDate] datetime)
;
INSERT INTO Employee
([Id], [EmpName], [JoinDate], [ResignedDate])
VALUES
(1, 'Govind', '2014-04-02 00:00:00', '2014-04-02 00:00:00'),
(2, 'Aravind', '2014-04-05 00:00:00', '2014-04-05 00:00:00'),
(3, 'Aravind', '2014-04-07 00:00:00', '2014-04-10 00:00:00'),
(4, 'Aravind', '2014-04-10 00:00:00', '2014-04-11 00:00:00'),
(5, 'Aravind', '2014-04-14 00:00:00', '2014-04-16 00:00:00')
;
Query 1:
SELECT
DATEDIFF(DAY, JoinDate, ResignedDate) AS DateDifferent
, COUNT(DATEDIFF(DAY, JoinDate, ResignedDate)) as FrequencyOf
FROM Employee
GROUP BY DATEDIFF(DAY, JoinDate, ResignedDate)
ORDER BY DateDifferent
Note! You may use column aliases (e.g. DateDifferent) in the ORDER BY clause
Results:
| DateDifferent | FrequencyOf |
|---------------|-------------|
| 0 | 2 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
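
The grouping logic of the query can be checked independently of JPA. Below is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; SQLite has no DATEDIFF, so the day difference is computed with julianday(), which is a substitution made for this sketch only. The variable name histogram is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Id INT, EmpName TEXT, JoinDate TEXT, ResignedDate TEXT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1, "Govind", "2014-04-02 00:00:00", "2014-04-02 00:00:00"),
        (2, "Aravind", "2014-04-05 00:00:00", "2014-04-05 00:00:00"),
        (3, "Aravind", "2014-04-07 00:00:00", "2014-04-10 00:00:00"),
        (4, "Aravind", "2014-04-10 00:00:00", "2014-04-11 00:00:00"),
        (5, "Aravind", "2014-04-14 00:00:00", "2014-04-16 00:00:00"),
    ],
)

# Group by the day difference itself, then count the rows in each group.
# The same expression is repeated in GROUP BY, as in the answer above.
histogram = conn.execute(
    """
    SELECT CAST(julianday(ResignedDate) - julianday(JoinDate) AS INTEGER) AS DateDifferent,
           COUNT(*) AS FrequencyOf
    FROM Employee
    GROUP BY CAST(julianday(ResignedDate) - julianday(JoinDate) AS INTEGER)
    ORDER BY DateDifferent
    """
).fetchall()

for row in histogram:
    print(row)
```

The point to carry back to the native query is that the expression in GROUP BY has to match the one in the SELECT list exactly, including the first argument of DATEDIFF on SQL Server.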

Migrating SQL Server point history table to period history table

I am trying to clean up a not so useful history table by changing its format.
For the way the history table is used, what matters is the period during which each row was valid.
The current situation:
Unit | Value | HistoryOn |
----------------------------------------
1 | 123 | 2013-01-05 14:16:00
1 | 234 | 2013-01-07 12:12:00
2 | 325 | 2013-01-04 14:12:00
1 | 657 | 2013-02-04 17:11:00
3 | 132 | 2013-04-02 13:00:00
The problem that arises here is that as this table grows it will become increasingly resource hungry when I want to know what status all of my containers had during a certain period. (say I want to know the value for all units on a specific date)
My solution is to create a table in this format:
Unit | value | HistoryStart | HistoryEnd |
---------------------------------------------------------------------
1 | 123 | 2013-01-05 14:16:00 | 2013-01-07 12:11:59
1 | 234 | 2013-01-07 12:12:00 | 2013-02-04 17:10:59
1 | 657 | 2013-02-04 17:11:00 | NULL
2 | 325 | 2013-01-04 14:12:00 | NULL
3 | 132 | 2013-04-02 13:00:00 | NULL
Note that the NULL value in HistoryEnd here indicates that the row is still representative of the current status.
I have tried to make use of a left join on the table itself using the HistoryOn field. This had the unfortunate side effect of cascading in an undesired manner.
SQL Query used:
SELECT *
FROM webhistory.Units u1 LEFT JOIN webhistory.Units u2 on u1.Unit = u2.Unit
AND u1.HistoryOn < u2.HistoryOn
WHERE u1.Unit = 1
The result of the query is as follows:
Unit | Value | HistoryOn | Unit | Value | HistoryOn |
-------------------------------------------------------------------------------------
1 | 657 | 2013-02-04 17:11:00 | NULL | NULL | NULL
1 | 234 | 2013-01-07 12:12:00 | 1 | 657 | 2013-02-04 17:11:00
1 | 123 | 2013-01-05 14:16:00 | 1 | 657 | 2013-02-04 17:11:00
1 | 123 | 2013-01-05 14:16:00 | 1 | 234 | 2013-01-07 12:12:00
This effect is incremental because each entry will join on all the entries that are newer than itself instead of only the first entry that comes after it.
Sadly, I have not yet been able to come up with a good query to solve this, and I would like insights or suggestions that could help me with this migration problem.
Maybe I'm missing something, but this seems to work:
CREATE TABLE #webhist(
Unit int,
Value int,
HistoryOn datetime
)
INSERT INTO #webhist VALUES
(1, 123, '2013-01-05 14:16:00'),
(1, 234, '2013-01-07 12:12:00'),
(2, 325, '2013-01-04 14:12:00'),
(1, 657, '2013-02-04 17:11:00'),
(3, 132, '2013-04-02 13:00:00')
SELECT
u1.Unit
,u1.Value
,u1.HistoryOn AS HistoryStart
,u2.HistoryOn AS HistoryEnd
FROM #webhist u1
OUTER APPLY (
SELECT TOP 1 *
FROM #webhist u2
WHERE u1.Unit = u2.Unit AND u1.HistoryOn < u2.HistoryOn
ORDER BY HistoryOn
) u2
DROP TABLE #webhist
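
The idea behind the OUTER APPLY above is "for each row, take only the earliest later row of the same unit". A minimal sketch of that idea, assuming Python's built-in sqlite3 module with an in-memory database: SQLite has no OUTER APPLY, so a correlated MIN() subquery is swapped in, and the table name webhist and variable periods are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE webhist (Unit INT, Value INT, HistoryOn TEXT)")
conn.executemany(
    "INSERT INTO webhist VALUES (?, ?, ?)",
    [
        (1, 123, "2013-01-05 14:16:00"),
        (1, 234, "2013-01-07 12:12:00"),
        (2, 325, "2013-01-04 14:12:00"),
        (1, 657, "2013-02-04 17:11:00"),
        (3, 132, "2013-04-02 13:00:00"),
    ],
)

# MIN() over the later rows of the same unit picks exactly one successor,
# which avoids the cascading matches of the plain LEFT JOIN.
# It yields NULL when no later row exists, i.e. the row is still current.
periods = conn.execute(
    """
    SELECT u1.Unit,
           u1.Value,
           u1.HistoryOn AS HistoryStart,
           (SELECT MIN(u2.HistoryOn)
              FROM webhist AS u2
             WHERE u2.Unit = u1.Unit
               AND u2.HistoryOn > u1.HistoryOn) AS HistoryEnd
    FROM webhist AS u1
    ORDER BY u1.Unit, u1.HistoryOn
    """
).fetchall()

for row in periods:
    print(row)
```

As in the OUTER APPLY query, HistoryEnd here is the start of the next row, so it is best read as an exclusive end; subtracting one second would reproduce the layout shown in the question.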
First data sample
create table Data(
Unit int,
Value int,
HistoryOn datetime)
insert into Data
select 1,123,'2013-01-05 14:16:00'
union select 1 , 234 , '2013-01-07 12:12:00'
union select 2 , 325 , '2013-01-04 14:12:00'
union select 1 , 657 , '2013-02-04 17:11:00'
union select 3 , 132 , '2013-04-02 13:00:00'
I created a function to calculate HistoryEnd.
Note that I named the table Data.
CREATE FUNCTION dbo.fnHistoryEnd
(
@Unit as int,
@HistoryOn as datetime
)
RETURNS datetime
AS
BEGIN
-- Declare the return variable here
DECLARE @HistoryEnd as datetime
-- One second before the next entry for the same unit; stays NULL if there is none
select top 1 @HistoryEnd=dateadd(s,-1,d.HistoryOn )
from Data d
where d.HistoryOn>@HistoryOn and d.Unit=@Unit
order by d.HistoryOn asc
RETURN @HistoryEnd
END
GO
Then, the query is trivial
select *,dbo.fnHistoryEnd(a.Unit,a.HistoryOn) from Data a
order by Unit, HistoryOn
EDIT
Don't forget the ORDER BY clause in the subquery. Without it, TOP 1 may return any later row rather than the next one. Compare the two queries below:
CREATE TABLE #webhist(
Unit int,
Value int,
HistoryOn datetime
)
INSERT INTO #webhist VALUES
(1, 234, '2013-01-07 12:12:00'),
(2, 325, '2013-01-04 14:12:00'),
(1, 657, '2013-02-04 17:11:00'),
(3, 132, '2013-04-02 13:00:00'),
(1, 123, '2013-01-05 14:16:00')
select *, (select top 1 historyon from #webhist u2 where u2.historyon > u1.historyon and u1.unit = u2.unit) from #webhist u1;
select *, (select top 1 historyon from #webhist u2 where u2.historyon > u1.historyon and u1.unit = u2.unit order by u2.HistoryOn) from #webhist u1;
drop table #webhist

How to compute the moving average over the last n hours

I am trying to efficiently compute (using SQL Server 2008) the moving average of the ProductCount over a period of 24 hours. For every single row in the Product table, I'd like to know the average ProductCount (for that given product) over the last 24 hours. One problem with our data is that not all the dates/hours are present (see example below). If a TimeStamp is missing, it means that the ProductCount was 0.
I have a table with millions of rows with a Date, Product and Count. Below is a simplified example of the data I have to deal with.
Any idea on how to achieve that?
EDIT: One other piece of data that I need is the MIN and MAX ProductCount for the period (i.e. 24h). Computing the MIN/MAX is a bit trickier because of the missing values...
+---------------------+-------------+--------------+
| Date | ProductName | ProductCount |
+---------------------+-------------+--------------+
| 2012-01-01 00:00:00 | Banana | 15000 |
| 2012-01-01 01:00:00 | Banana | 16000 |
| 2012-01-01 02:00:00 | Banana | 17000 |
| 2012-01-01 05:00:00 | Banana | 12000 |
| 2012-01-01 00:00:00 | Apple | 5000 |
| 2012-01-01 05:00:00 | Apple | 6000 |
+---------------------+-------------+--------------+
SQL
CREATE TABLE ProductInventory (
[Date] DATETIME,
[ProductName] NVARCHAR(50),
[ProductCount] INT
)
INSERT INTO ProductInventory VALUES ('2012-01-01 00:00:00', 'Banana', 15000)
INSERT INTO ProductInventory VALUES ('2012-01-01 01:00:00', 'Banana', 16000)
INSERT INTO ProductInventory VALUES ('2012-01-01 02:00:00', 'Banana', 17000)
INSERT INTO ProductInventory VALUES ('2012-01-01 05:00:00', 'Banana', 12000)
INSERT INTO ProductInventory VALUES ('2012-01-01 00:00:00', 'Apple', 5000)
INSERT INTO ProductInventory VALUES ('2012-01-01 05:00:00', 'Apple', 6000)
Well, the fact that you need to calculate the average for every hour actually makes this simpler, since you just need to SUM the product count and divide it by a fixed number (24). So I think that this will get the results you want (though in this particular case, a cursor may actually be faster):
SELECT A.*, B.ProductCount/24 DailyMovingAverage
FROM ProductInventory A
OUTER APPLY ( SELECT SUM(ProductCount) ProductCount
FROM ProductInventory
WHERE ProductName = A.ProductName
AND [Date] BETWEEN DATEADD(HOUR,-23,A.[Date]) AND A.[Date]) B
I added to Lamak's answer to include min/max:
SELECT *
FROM ProductInventory A
OUTER APPLY (
SELECT
SUM(ProductCount) / 24 AS DailyMovingAverage,
MAX(ProductCount) AS MaxProductCount,
CASE COUNT(*) WHEN 24 THEN MIN(ProductCount) ELSE 0 END AS MinProductCount
FROM ProductInventory
WHERE ProductName = A.ProductName
AND [Date] BETWEEN DATEADD(HOUR, -23, A.[Date]) AND A.[Date]) B
To account for missing records, check that there were indeed 24 records in the last 24 hours before using MIN(ProductCount), and return 0 otherwise.
Working SQL Fiddle, with a bunch (bushel?) of Oranges added to show the MinProductCount working
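
The window logic can also be tried locally. This is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory database and assuming each (ProductName, Date) pair is unique; SQLite has no OUTER APPLY, so a self-join with GROUP BY is used in its place, and the variable name stats is illustrative. The sketch divides by 24.0 because dividing an integer sum by 24 truncates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductInventory (Date TEXT, ProductName TEXT, ProductCount INT)"
)
conn.executemany(
    "INSERT INTO ProductInventory VALUES (?, ?, ?)",
    [
        ("2012-01-01 00:00:00", "Banana", 15000),
        ("2012-01-01 01:00:00", "Banana", 16000),
        ("2012-01-01 02:00:00", "Banana", 17000),
        ("2012-01-01 05:00:00", "Banana", 12000),
        ("2012-01-01 00:00:00", "Apple", 5000),
        ("2012-01-01 05:00:00", "Apple", 6000),
    ],
)

# For every row a, aggregate the rows b of the same product whose timestamp
# falls in the 24 hourly slots ending at a.Date. Missing hours count as 0,
# so the sum is divided by 24 rather than by the number of rows found,
# and the minimum is 0 unless all 24 slots are present.
stats = conn.execute(
    """
    SELECT a.Date, a.ProductName, a.ProductCount,
           SUM(b.ProductCount) / 24.0 AS DailyMovingAverage,
           MAX(b.ProductCount) AS MaxProductCount,
           CASE COUNT(*) WHEN 24 THEN MIN(b.ProductCount) ELSE 0 END AS MinProductCount
    FROM ProductInventory AS a
    JOIN ProductInventory AS b
      ON b.ProductName = a.ProductName
     AND b.Date BETWEEN datetime(a.Date, '-23 hours') AND a.Date
    GROUP BY a.ProductName, a.Date, a.ProductCount
    ORDER BY a.ProductName, a.Date
    """
).fetchall()

for row in stats:
    print(row)
```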
