MS SQL Server 2005 - Grouping Similar Data - sql-server

I am trying to find a solution to the following type of groupings:
My data:
Formula # | Date
1         | 2016-01-02 12:05:00
1         | 2016-01-02 12:07:00
2         | 2016-01-02 12:10:00
2         | 2016-01-02 12:15:00
3         | 2016-01-02 12:25:00
3         | 2016-01-02 12:30:00
3         | 2016-01-02 12:50:00
3         | 2016-01-02 12:55:00
2         | 2016-01-02 13:05:00
2         | 2016-01-02 13:25:00
2         | 2016-01-02 13:40:00
And I am trying to get a result like this:
Formula | Count | Start Date          | End Date
1       | 2     | 2016-01-02 12:05:00 | 2016-01-02 12:07:00
2       | 2     | 2016-01-02 12:10:00 | 2016-01-02 12:15:00
3       | 4     | 2016-01-02 12:25:00 | 2016-01-02 12:55:00
2       | 3     | 2016-01-02 13:05:00 | 2016-01-02 13:40:00
I've tried various things, and while I can roll up the similar formula numbers, I cannot get the results into the format I've listed. I'm also not sure how to get the starting and ending dates of each group of rows.
Any thoughts or help would be greatly appreciated.

The query below groups by formula and returns the count along with the minimum and maximum dates. Because it groups on Formula alone, nothing in it distinguishes the two separate runs of Formula 2, so they are collapsed into one row.
SELECT
[Formula]
,COUNT([Formula]) AS [COUNT]
,MIN([Date]) AS [MIN_DATE]
,MAX([Date]) AS [MAX_DATE]
FROM
#test_table
GROUP BY
[Formula]
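
If the order of the rows by Date is taken as the signal that separates the two runs of Formula 2, one common way to keep them apart is the "gaps and islands" pattern: subtract a per-formula row number from an overall row number, and group on the difference as well. The sketch below is an illustration only, not the answerer's method. It runs the SQL through Python's sqlite3 module so that it is self-contained, which assumes a bundled SQLite of 3.25 or later for window functions. ROW_NUMBER() OVER (...) and WITH are also available in SQL Server 2005, so the query shape should carry over, but that has not been tested here. The table name test_table and the alias grp are placeholders.

```python
import sqlite3

# Sample rows from the question: (Formula, Date)
rows = [
    (1, "2016-01-02 12:05:00"), (1, "2016-01-02 12:07:00"),
    (2, "2016-01-02 12:10:00"), (2, "2016-01-02 12:15:00"),
    (3, "2016-01-02 12:25:00"), (3, "2016-01-02 12:30:00"),
    (3, "2016-01-02 12:50:00"), (3, "2016-01-02 12:55:00"),
    (2, "2016-01-02 13:05:00"), (2, "2016-01-02 13:25:00"),
    (2, "2016-01-02 13:40:00"),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server (assumption)
conn.execute("CREATE TABLE test_table (Formula INTEGER, [Date] TEXT)")
conn.executemany("INSERT INTO test_table VALUES (?, ?)", rows)

query = """
WITH numbered AS (
    SELECT Formula, [Date],
           -- overall position minus position within the formula:
           -- the difference stays constant inside one consecutive run
           ROW_NUMBER() OVER (ORDER BY [Date])
         - ROW_NUMBER() OVER (PARTITION BY Formula ORDER BY [Date]) AS grp
    FROM test_table
)
SELECT Formula,
       COUNT(*)    AS cnt,
       MIN([Date]) AS start_date,
       MAX([Date]) AS end_date
FROM numbered
GROUP BY Formula, grp
ORDER BY start_date
"""

islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

With the sample data this returns four rows, with Formula 2 appearing twice (once for each run), which matches the result asked for in the question.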

Related

Calculate the rolling time since an event for each record in SQL

I am working on a query to calculate the time since the latest preventative maintenance (PM) for each piece of equipment on a rolling calendar. The goal is to build a dataframe to analyze when a repair will take place.
Below is a table of the query results that I have with the added column "Time Since Last PM" which is the column that I want.
Call this Table 1:
EquipmentNumber |Year |weeknumber |Current Week Repair|Time Since Last PM
2069186 |2018 |10 |1 |5
2069186 |2018 |21 |1 |1
1626930 |2018 |09 |1 |21
1626930 |2019 |03 |1 |15
The preventative maintenance data comes from a query/table that is set up like the following i.e. Table 2.
Equipment Number |Year |WeekNumber
2069186 |2018 |5
2069186 |2018 |20
1626930 |2017 |40
1626930 |2018 |40
So for my final query (Table 1), the "Time Since Last PM" for the first record of equipment 2069186 should be the difference between week 5 of 2018 in Table 2 and week 10 of 2018 in Table 1. For the second record of equipment 2069186, it should be the difference between week 21 of 2018 in Table 1 and week 20 of 2018 in Table 2. For equipment 1626930, the first record should be the difference between week 9 of 2018 in Table 1 and week 40 of 2017 in Table 2, and the second record should be the difference between week 3 of 2019 in Table 1 and week 40 of 2018 in Table 2. In other words, I want the difference between the current record in Table 1 and the latest PM before that record's year and week.
The queries for Table 1 and Table 2 are very basic.
-- Table 1
SELECT DISTINCT
EquipmentNumber,
Year,
weeknumber,
[Current Week Repair]
FROM TableA;
-- Table 2
SELECT
EquipmentNumber,
DatePart(Year, Created) as Year,
DatePart(Week, Created) as WeekNumber
FROM TableB;
Any suggestions on the proper way to join the two tables and make the calculation? Or is there a way to utilize temp tables and variables that anyone can suggest? Any input would be appreciated!
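You can use OUTER APPLY to look up the latest PM before each record (t1 and t2 stand for Table 1 and Table 2):
SELECT t1.*, t1.WeekNumber - ca.WeekNumber AS WeekDiff
FROM t1
OUTER APPLY (
-- latest PM strictly before this record's year and week
SELECT TOP 1 t2.Year, t2.WeekNumber
FROM t2
WHERE t2.EquipmentNumber = t1.EquipmentNumber
AND (
(t2.Year < t1.Year)
OR
(t2.Year = t1.Year AND t2.WeekNumber < t1.WeekNumber)
)
ORDER BY t2.Year DESC, t2.WeekNumber DESC
) AS ca
Note that t1.WeekNumber - ca.WeekNumber is only correct when both rows fall in the same year. When the PM falls in an earlier year, calculating the week difference is trickier, and you will need the number of weeks in the previous year.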
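
As a rough illustration of the cross-year week arithmetic, the sketch below turns each (Year, WeekNumber) pair into a single running week index and subtracts. It assumes every year has exactly 52 weeks, which is a simplification: DATEPART(week, ...) can return 53 for dates at the end of a year, so real data may need a per-year week count or a calendar table. SQLite, run through Python's sqlite3 module, stands in for SQL Server here. Because SQLite has no APPLY, the lookup is written as a LEFT JOIN with MAX instead of OUTER APPLY with TOP 1. The table names t1 and t2 follow the answer, and the column weeks_since_pm is a made-up name.

```python
import sqlite3

WEEKS_PER_YEAR = 52  # assumption: ignores years where week 53 occurs

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server (assumption)
conn.executescript("""
CREATE TABLE t1 (EquipmentNumber INTEGER, Year INTEGER, WeekNumber INTEGER);
CREATE TABLE t2 (EquipmentNumber INTEGER, Year INTEGER, WeekNumber INTEGER);
-- repair records (Table 1 in the question)
INSERT INTO t1 VALUES (2069186, 2018, 10), (2069186, 2018, 21),
                      (1626930, 2018, 9),  (1626930, 2019, 3);
-- preventative maintenance records (Table 2 in the question)
INSERT INTO t2 VALUES (2069186, 2018, 5),  (2069186, 2018, 20),
                      (1626930, 2017, 40), (1626930, 2018, 40);
""")

query = """
SELECT r.EquipmentNumber, r.Year, r.WeekNumber,
       -- running week index of the repair minus that of the latest earlier PM
       (r.Year * :w + r.WeekNumber) - MAX(p.Year * :w + p.WeekNumber)
           AS weeks_since_pm
FROM t1 AS r
LEFT JOIN t2 AS p
       ON p.EquipmentNumber = r.EquipmentNumber
      AND p.Year * :w + p.WeekNumber < r.Year * :w + r.WeekNumber
GROUP BY r.EquipmentNumber, r.Year, r.WeekNumber
ORDER BY r.EquipmentNumber DESC, r.Year, r.WeekNumber
"""

result = conn.execute(query, {"w": WEEKS_PER_YEAR}).fetchall()
for row in result:
    print(row)
```

On the sample data this gives 5, 1, 21 and 15 weeks, the values listed in the question's "Time Since Last PM" column. A record with no earlier PM would get NULL.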

Why does adding TOP 10 to SELECT DISTINCT time out when SELECT DISTINCT does not?

When I run a basic SELECT DISTINCT query (NOTE: this occurs only when using a View not a Table)
SELECT DISTINCT
[DateField]
FROM MyData
I quickly (< 1 sec) get my 190 rows back. But when I add TOP 10 (either before or after the DISTINCT) it runs until it times out (5 minutes +).
SELECT DISTINCT TOP 10
[DateField]
FROM MyData
This also happens if I put the SELECT DISTINCT query as a subquery (just testing alternatives).
SELECT TOP 10
FOO.[DateField]
FROM
(SELECT DISTINCT [DateField]
FROM MyData) FOO
Here is a sample of the data returned by the SELECT DISTINCT query.
DateField
2016-12-01 00:00:00.000
2016-09-01 00:00:00.000
2016-11-01 00:00:00.000
2017-11-29 00:00:00.000
2017-07-01 00:00:00.000
2016-08-01 00:00:00.000
2017-04-24 00:00:00.000
2016-03-01 00:00:00.000
2017-03-01 00:00:00.000
2016-07-01 00:00:00.000
2016-02-01 00:00:00.000
2016-04-01 00:00:00.000
2017-01-01 00:00:00.000
2016-06-01 00:00:00.000
2016-05-01 00:00:00.000
2018-02-28 00:00:00.000
Thanks in advance!

Query to show difference over time in the data

Let's say I have a table that shows attendance at a lecture. The table is very simple: it contains only the date of the lecture and the attendees.
2016-10-10 | Adam
2016-10-10 | Mike
2016-10-10 | David
2016-10-11 | Adam
2016-10-14 | Adam
2016-10-14 | David
What I would like is a query that shows, for each lecture, what percentage of its attendees were also present at the previous lecture. Can this be done in an efficient way?
The expected result would be something like this:
2016-10-11 | 1.00
2016-10-14 | 0.50
2016-10-10 would be left out since it does not have a previous lecture.
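
One possible way to compute this, sketched under a few assumptions: the table is called attendance with columns LectureDate and Attendee (both names are made up here), each attendee appears at most once per lecture, and "previous lecture" means the latest earlier date present in the table. The SQL is run through Python's sqlite3 module so the sketch is self-contained. It uses only a CTE, a correlated subquery and joins, so a similar query should be expressible in SQL Server, though that has not been tested here.

```python
import sqlite3

rows = [
    ("2016-10-10", "Adam"), ("2016-10-10", "Mike"), ("2016-10-10", "David"),
    ("2016-10-11", "Adam"),
    ("2016-10-14", "Adam"), ("2016-10-14", "David"),
]

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names
conn.execute("CREATE TABLE attendance (LectureDate TEXT, Attendee TEXT)")
conn.executemany("INSERT INTO attendance VALUES (?, ?)", rows)

query = """
WITH lectures AS (
    -- one row per lecture, paired with the date of the lecture before it
    SELECT d.LectureDate,
           (SELECT MAX(p.LectureDate)
            FROM attendance AS p
            WHERE p.LectureDate < d.LectureDate) AS PrevDate
    FROM (SELECT DISTINCT LectureDate FROM attendance) AS d
)
SELECT l.LectureDate,
       -- attendees also found at the previous lecture / all attendees
       ROUND(1.0 * COUNT(prev.Attendee) / COUNT(*), 2) AS share_returning
FROM lectures AS l
JOIN attendance AS cur
       ON cur.LectureDate = l.LectureDate
LEFT JOIN attendance AS prev
       ON prev.LectureDate = l.PrevDate
      AND prev.Attendee = cur.Attendee
WHERE l.PrevDate IS NOT NULL   -- drops the first lecture
GROUP BY l.LectureDate
ORDER BY l.LectureDate
"""

shares = conn.execute(query).fetchall()
for row in shares:
    print(row)
```

COUNT(prev.Attendee) counts only the rows where the LEFT JOIN found a match, so dividing it by COUNT(*) gives the share of the current attendees who were also at the previous lecture.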

Adding rows for recordsets that have incomplete date ranges

This one is tricky for me, can't figure it out.
We have a system where we calculate inventory numbers on the fly. If there's a month where a customer doesn't have any orders, there's no record for that month but the beginning inventory is still calculated and displayed as a rolling calculation based on the previous transaction data.
I'm now pulling this data but need to "fill in the blanks" so to speak.
For example, the table has the following fields:
MonthYear DATETIME
WarehouseID INT
Quantity DECIMAL(18,2)
If I put all this in a temporary table to do calculations, I'll end up with something like this:
2010-01-01 00:00:00.000 135 1000.00
2010-04-01 00:00:00.000 135 2000.00
2010-07-01 00:00:00.000 135 3000.00
2010-06-01 00:00:00.000 235 1000.00
2010-07-01 00:00:00.000 235 2000.00
2011-02-01 00:00:00.000 135 1000.00
2011-03-01 00:00:00.000 135 2450.00
etc., etc.
What I need to do is for each warehouse, if a record exists for that year, add a blank row for any months that aren't in the table.
In the example above for warehouse 135 I need to add a record for 02, 03, 05, 08, etc.
Is there any easier way of doing this rather than with cursors and loops?
Thanks.
Search for recursive CTEs: use one to generate the full list of months, then LEFT JOIN your data to it. Sorry for the quick answer, I have to run.
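
A minimal sketch of that idea, with several assumptions: the table is named inventory (a made-up name), months are stored as 'YYYY-MM-01' text, and SQLite, run through Python's sqlite3 module, stands in for SQL Server. In SQL Server the CTE would be written with WITH alone (there is no RECURSIVE keyword), and the date handling would use its own date functions rather than string concatenation.

```python
import sqlite3

rows = [
    ("2010-01-01", 135, 1000.00), ("2010-04-01", 135, 2000.00),
    ("2010-07-01", 135, 3000.00), ("2010-06-01", 235, 1000.00),
    ("2010-07-01", 235, 2000.00), ("2011-02-01", 135, 1000.00),
    ("2011-03-01", 135, 2450.00),
]

conn = sqlite3.connect(":memory:")
# Hypothetical table name; dates kept as text for the SQLite stand-in
conn.execute(
    "CREATE TABLE inventory (MonthYear TEXT, WarehouseID INTEGER, Quantity REAL)"
)
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?)", rows)

query = """
WITH RECURSIVE months(m) AS (
    -- recursive CTE producing the month numbers 1..12
    SELECT 1
    UNION ALL
    SELECT m + 1 FROM months WHERE m < 12
),
warehouse_years AS (
    -- every (warehouse, year) pair that has at least one record
    SELECT DISTINCT WarehouseID, substr(MonthYear, 1, 4) AS yr
    FROM inventory
)
SELECT wy.WarehouseID,
       wy.yr || '-' || printf('%02d', mo.m) || '-01' AS month_start,
       inv.Quantity                     -- NULL marks a filled-in month
FROM warehouse_years AS wy
CROSS JOIN months AS mo
LEFT JOIN inventory AS inv
       ON inv.WarehouseID = wy.WarehouseID
      AND inv.MonthYear = wy.yr || '-' || printf('%02d', mo.m) || '-01'
ORDER BY wy.WarehouseID, month_start
"""

filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

Each warehouse and year combination that has any record comes back with all twelve months, and the months that had no record carry a NULL quantity.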

SQL Server Distinct on Earliest Date in a Timestamp Column

I have a table called 'Audit' in SQL Server 2005 like this:
Name | Last Logged On Date
--------| -----------------------
Joe | 2012-02-01 00:00:00.000
Joe | 2012-02-02 00:00:00.000
Bloggs | 2012-03-01 00:00:00.000
Bloggs | 2012-03-02 00:00:00.000
I want to get only one row per person, showing the first time that person logged on.
So in other words, I want to return:
Name | First Logged On Date
--------| -----------------------
Joe | 2012-02-01 00:00:00.000
Bloggs | 2012-03-01 00:00:00.000
How would I achieve this?
Help!!!
If I understand your question correctly, this should work for you:
SELECT Name, MIN([Last Logged On Date]) AS [First Logged On Date]
FROM Audit
GROUP BY Name
