How to select Second Last Row in mySql? - sql-server

I want to retrieve the 2nd last row result and I have seen this question:
How can I retrieve second last row?
but it uses ORDER BY, which does not work in my case: the Emp_Number column also contains the number of rows and a date-time stamp, so the data gets mixed up if I use ORDER BY.
The rows 22 and 23 contain the total number of rows (excluding row 21 and 22) and the time and day it got entered respectively.
I used this query which returns the required result 21 but if this number increases it will cause an error.
SELECT TOP 1 *
FROM(
SELECT TOP 2 *
FROM DAT_History
ORDER BY Emp_Number ASC
) t
ORDER BY Emp_Number desc
Is there any way to get the 2nd last row value without using the Order By function?

There is no guarantee that the count will be returned in the one-but-last row, as there is no definite order defined. Even if those records were written in the correct order, the engine is free to return the records in any order, unless you specify an order by clause. But apparently you don't have a column to put in that clause to reproduce the intended order.
I propose these solutions:
1. Return the minimum of those values that represent positive integers
select min(Emp_Number * 1)
from DAT_history
where Emp_Number not regexp '[^0-9]'
See SQL Fiddle
This will obviously fail when the count is larger than the smallest employee number. But seeing the sample data, that would represent a number of records that is maybe not expected...
2. Count the records, ignoring the 2 aggregated records
select count(*)-2
from DAT_history
See SQL Fiddle
3. Relying on correct order without order by
As explained at the start, you cannot rely on the order, but if for some reason you still want to rely on this, you can use a variable to number the rows in a sub query, and then pick out the one that has been attributed the one-but-last number:
select Emp_Number * 1
from (select Emp_Number,
@rn := @rn + 1 rn
from DAT_history,
(select @rn := 0) init
) numbered
where rn = @rn - 1
See SQL Fiddle
The * 1 is added to convert the text to a number data type.
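To make the difference between options 1 and 2 concrete, here is a minimal Python sketch of the same logic. The sample values are hypothetical (the original data is not shown here): five employee numbers stored as text, followed by the row count and a timestamp.

```python
import re

# Hypothetical Emp_Number values stored as text: five employee numbers,
# then the two "aggregated" rows (row count and timestamp).
emp_numbers = [
    "9888001", "9888002", "9888003", "9888004", "9888005",
    "5", "2016-03-01 10:15:00",
]

# Option 1: smallest value that consists of digits only.
# Breaks as soon as the count exceeds the smallest employee number.
numeric = [int(v) for v in emp_numbers if re.fullmatch(r"[0-9]+", v)]
count_via_min = min(numeric)

# Option 2: total number of rows minus the two aggregated rows.
count_via_total = len(emp_numbers) - 2

print(count_via_min, count_via_total)
```

Option 2 does not depend on the stored values at all, which is why it keeps working however large the count grows.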

This is not a perfect solution. I am making some assumptions for this. Check if this could work for you.
;WITH cte
AS (SELECT emp_number,
Row_number()
OVER (
ORDER BY emp_number ASC) AS rn
FROM dat_history
WHERE Isdate(emp_number) = 0) --Omit date entries
SELECT emp_number
FROM cte
WHERE rn = 1 -- select the minimum entry, assuming it would be the count and assuming count might not exceed the emp number range of 9888000

Related

Is there a way you can produce an output like this in T-SQL

I have a column which I translate the values using a case statements and I get numbers like this below. There are multiple columns I need to produce the result like this and this is just one column.
How do you produce the output as a whole like this below.
The 12 is the total numbers counting from top to bottom
49 is the sum.
4.08 is the average, 49/12.
1 is how many 1's are there in the output list above. As you can see there is only one 1 in the output above
8.33% is the division and percentage comes from 1/12 * 100
and so on. Is there a way to produce this output below?
drop table test111
create table test111
(
Q1 nvarchar(max)
);
INSERT INTO TEST111(Q1)
VALUES('Strongly Agree')
,('Agree')
,('Disagree')
,('Strongly Disagree')
,('Strongly Agree')
,('Agree')
,('Disagree')
,('Neutral');
SELECT
CASE WHEN [Q1] = 'Strongly Agree' THEN 5
WHEN [Q1] = 'Agree' THEN 4
WHEN [Q1] = 'Neutral' THEN 3
WHEN [Q1] = 'Disagree' THEN 2
WHEN [Q1] = 'Strongly Disagree' THEN 1
END AS 'Test Q1'
FROM test111
I have to make a few assumptions here, but it looks like you want to treat an output column like a column in a spreadsheet. You have 12 numbers. You then have a blank "separator" row. Then a row with the number 12 (which is the count of how many numbers you have). Then a row with the number 49, which is the sum of those 12 numbers. Then the 4.08 row, which is roughly the average, and so on.
Some of these outputs can be provided by cube or rollup, but neither is a complete solution.
If you wanted to produce this output directly from TSQL, you would need to have multiple select statements and combine the results of all of those statements using union all. First you would have a select just to get the numbers. Then you would have a second select which outputs a "blank". Then another select which is providing a count. Then another select which is providing a sum. And so on.
You would also no longer be able to output actual numbers, since a "blank" is not a number. Visually it's best represented as an empty string. But now your output column has to be of datatype char or varchar.
You also have to make sure rows come out in the correct order for presentation. So you need a column to order by. You would have to add some kind of ordering column "manually" to each of the select statements, so when you union them all together you can tell SQL in what order the output should be provided.
So the answer to "can it be done?" is technically "yes". But if you think that seems like a whole lot of laborious and inefficient TSQL work, you'd be right.
The real solution here is to change your approach. SQL should not be concerned with "output formatting". What you should do is just return the actual data (your 12 numbers) from SQL, and then do all of the additional presentation (like adding a blank row, adding a count row, etc), in the code of the program that is calling SQL to get that data.
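As a rough illustration of doing the presentation in the calling program, the following Python sketch takes the eight answers from the question's INSERT statement, maps them with the same rules as the CASE expression, and builds the summary rows in code. The variable names and the two-decimal formatting are assumptions, not part of the original post.

```python
# Same mapping as the CASE expression in the question.
scores = {
    "Strongly Agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}

# The eight rows inserted into test111 in the question.
answers = [
    "Strongly Agree", "Agree", "Disagree", "Strongly Disagree",
    "Strongly Agree", "Agree", "Disagree", "Neutral",
]

values = [scores[a] for a in answers]   # what the SQL query would return

count = len(values)
total = sum(values)
average = total / count
ones = values.count(1)                  # how many 1's appear
ones_pct = ones / count * 100

# Presentation: the numbers, a blank separator row, then the summary rows.
lines = [str(v) for v in values] + [
    "",
    str(count),
    str(total),
    f"{average:.2f}",
    str(ones),
    f"{ones_pct:.2f}%",
]
print("\n".join(lines))
```

The query stays a plain SELECT of numbers, and the blank row and mixed formatting never have to be forced into a single varchar column.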
I must say, this is one of the strangest T-SQL requirements I've seen, and is really best left to the presentation layer.
It is possible using GROUPING SETS though. We can use it to get an extra rollup row that aggregates the whole table.
Once you have the rollup, you need to unpivot the totalled row (identified by GROUPING() = 1) to get your final result. We can do this using CROSS APPLY.
This is impossible without a row-identifier. I have added ROW_NUMBER, but any primary or unique key will do.
WITH YourTable AS (
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rn,
CASE WHEN [Q1] = 'Strongly Agree' THEN 5
WHEN [Q1] = 'Agree' THEN 4
WHEN [Q1] = 'Neutral' THEN 3
WHEN [Q1] = 'Disagree' THEN 2
WHEN [Q1] = 'Strongly Disagree' THEN 1
END AS TestQ1
FROM test111
),
RolledUp AS (
SELECT
rn,
TestQ1,
grouping = GROUPING(TestQ1),
count = COUNT(*),
sum = SUM(TestQ1),
avg = AVG(TestQ1 * 1.0),
one = COUNT(CASE WHEN TestQ1 = 1 THEN 1 END),
onePct = COUNT(CASE WHEN TestQ1 = 1 THEN 1 END) * 1.0 / COUNT(*)
FROM YourTable
GROUP BY GROUPING SETS(
(rn, TestQ1),
()
)
)
SELECT v.TestQ1
FROM RolledUp r
CROSS APPLY (
SELECT r.TestQ1, 0 AS ordering
WHERE r.grouping = 0
UNION ALL
SELECT v.value, v.ordering
FROM (VALUES
(NULL , 1),
(r.count , 2),
(r.sum , 3),
(r.avg , 4),
(r.one , 5),
(r.onePct, 6)
) v(value, ordering)
WHERE r.grouping = 1
) v
ORDER BY
v.ordering,
r.rn;
db<>fiddle

Calculate a Recursive Rolling Average in SQL Server

We are attempting to calculate a rolling average and have tried to convert numerous SO answers to solve the problem. To this point we are still unsuccessful.
What we've tried:
Here are some of the SO answers we have considered.
SQL Server: How to get a rolling sum over 3 days for different customers within same table
SQL Query for 7 Day Rolling Average in SQL Server
T-SQL calculate moving average
Our latest attempt has been to modify one of the solutions (#4) found here.
https://www.red-gate.com/simple-talk/sql/t-sql-programming/calculating-values-within-a-rolling-window-in-transact-sql/
Example:
Here is an example in SQL Fiddle: http://sqlfiddle.com/#!6/4570a/17
In the fiddle, we are still trying to get the SUM to work right but ultimately we are trying to get the average.
The end goal
Using the Fiddle example, we need to find the difference between Value1 and ComparisonValue1 and present it as Diff1. When a row has no Value1 available, we need to estimate it by taking the average of the last two Diff1 values and then add it to the ComparisonValue1 for that row.
With the correct query, the result would look like this:
GroupID Number ComparisonValue1 Diff1 Value1
5 10 54.78 2.41 57.19
5 11 55.91 2.62 58.53
5 12 55.93 2.78 58.71
5 13 56.54 2.7 59.24
5 14 56.14 2.74 58.88
5 15 55.57 2.72 58.29
5 16 55.26 2.73 57.99
Question: is it possible to calculate this average when it could potentially factor into the average of the following rows?
Update:
Added a VIEW to the Fiddle schema to simplify the final query.
Updated the query to include the new rolling average for Diff1 (column Diff1Last2Avg). This rolling average works great until we run into nulls in the Value1 column. This is where we need to insert the estimate.
Updated the query to include the estimate that should be used when there is no Value1 (column Value1Estimate). This is working great and would be perfect if we could use the estimate in place of NULL in the Value1 column. Since the Diff1 column reflects the difference between Value1 (or its estimate) and ComparisonValue1, including the Estimate would fill in all the NULL values in Diff1. This in turn would continue to allow the Estimates of future rows to be calculated. It gets confusing at this point, but still hacking away at it. Any ideas?
Credit for the idea goes to this answer: https://stackoverflow.com/a/35152131/6305294 from @JesúsLópez
I have included comments in the code to explain it.
UPDATE
I have corrected the query based on comments.
I have swapped numbers in minuend and subtrahend to get difference as a positive number.
Removed Diff2Ago column.
Results of the query now exactly match your sample output.
;WITH cte AS
(
-- This is similar to your ItemWithComparison view
SELECT i.Number, i.Value1, i2.Value1 AS ComparisonValue1,
-- Calculated Differences; NULL will be returned when i.Value1 is NULL
CONVERT( DECIMAL( 10, 3 ), i.Value1 - i2.Value1 ) AS Diff
FROM Item AS i
LEFT JOIN [Group] AS G ON g.ID = i.GroupID
LEFT JOIN Item AS i2 ON i2.GroupID = g.ComparisonGroupID AND i2.Number = i.Number
WHERE NOT i2.Id IS NULL
),
cte2 AS(
/*
Start with the first number
Note if you do not have at least 2 consecutive numbers (in cte) with non-NULL Diff value and therefore Diff1Ago or Diff2Ago are NULL then everything else will not work;
You may need to add additional logic to handle these cases */
SELECT TOP 1 -- start with the 1st number (see ORDER BY)
a.Number, a.Value1, a.ComparisonValue1, a.Diff, b.Diff AS Diff1Ago
FROM cte AS a
-- "1 number ago"
LEFT JOIN cte AS b ON a.Number - 1 = b.Number
WHERE NOT a.Value1 IS NULL
ORDER BY a.Number
UNION ALL
SELECT b.Number, b.Value1, b.ComparisonValue1,
( CASE
WHEN NOT b.Value1 IS NULL THEN b.Diff
ELSE CONVERT( DECIMAL( 10, 3 ), ( a.Diff + a.Diff1Ago ) / 2.0 )
END ) AS Diff,
a.Diff AS Diff1Ago
FROM cte2 AS a
INNER JOIN cte AS b ON a.Number + 1 = b.Number
)
SELECT *, ( CASE WHEN Value1 IS NULL THEN ComparisonValue1 + Diff ELSE Value1 END ) AS NewValue1
FROM cte2 OPTION( MAXRECURSION 0 );
Limitations:
This solution works well only when you need to consider a small number of preceding values.
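The row-by-row dependency that the recursive CTE handles can be sketched in Python as a simple loop. The numbers come from the expected output in the question; which rows lack Value1 is an assumption (rows 13 to 16 are treated as missing here), and the estimates are rounded to two decimals to match that output rather than the DECIMAL(10, 3) used in the query.

```python
from decimal import Decimal

# (Number, ComparisonValue1, Value1); None marks a missing Value1.
# Assumption: rows 13..16 are the ones without a Value1.
rows = [
    (10, Decimal("54.78"), Decimal("57.19")),
    (11, Decimal("55.91"), Decimal("58.53")),
    (12, Decimal("55.93"), Decimal("58.71")),
    (13, Decimal("56.54"), None),
    (14, Decimal("56.14"), None),
    (15, Decimal("55.57"), None),
    (16, Decimal("55.26"), None),
]


def fill_estimates(rows):
    """Return (Number, ComparisonValue1, Diff1, Value1) with gaps estimated."""
    diffs = []
    result = []
    for number, comparison, value in rows:
        if value is not None:
            diff = value - comparison
        else:
            # Average of the last two diffs, which may themselves be estimates.
            # Needs at least two earlier rows, like the SQL version.
            diff = ((diffs[-1] + diffs[-2]) / 2).quantize(Decimal("0.01"))
            value = comparison + diff
        diffs.append(diff)
        result.append((number, comparison, diff, value))
    return result


filled = fill_estimates(rows)
for row in filled:
    print(*row)
```

Each estimate feeds into the next one, which is why a plain window function over the stored values is not enough and the query has to recurse.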

Recursive Decaying Average in Sql Server 2012

I need to calculate a decaying average (cumulative moving?) of a set of values. The last value in the series is 50% weight, with the decayed average of all the prior series as the other 50% weight, recursively.
I came up with a CTE query that produces correct results, but it depends on a sequential row number. I'm wondering if there is a better way to do this in SQL 2012, maybe with the new windowing functions for Over(), or something like that?
In the live data, the rows are ordered by time. I can use an SQL view and ROW_NUMBER() to generate the necessary Row field for my CTE approach, but if there is a more efficient way to do this, I would like to keep this as efficient as possible.
I have a sample table with 2 columns: Row int, and Value Float. I have 6 sample data values of 1,2,3,4,4,4. The correct result should be 3.78125.
My solution is:
;WITH items AS (
SELECT TOP 1
Row, Value, Value AS Decayed
FROM Sample Order By Row
UNION ALL
SELECT v.Row, v.Value, Decayed * .5 + v.Value *.5 AS Decayed
FROM Sample v
INNER JOIN items itms ON itms.Row = v.Row-1
)
SELECT top 1 Decayed FROM items order by Row desc
This correctly produces 3.78125 with the test data. My question is: Is there a more efficient and/or simpler way to do this in SQL 2012, or is this about the only way to do it? Thanks.
One possible alternative would be
WITH T AS
(
SELECT
Value * POWER(5E-1, ROW_NUMBER()
OVER (ORDER BY Row DESC)
/* first row decays less so special cased */
-IIF(LEAD(Value) OVER (ORDER BY Row DESC) IS NULL,1,0))
as x
FROM Sample
)
SELECT SUM(x)
FROM T
SQL Fiddle
Or for the updated question using 60%/40%
WITH T AS
(
SELECT IIF(LEAD(Value) OVER (ORDER BY Row DESC) IS NULL, 1,0.6)
* Value
* POWER(4E-1, ROW_NUMBER() OVER (ORDER BY Row DESC) -1)
as x
FROM Sample
)
SELECT SUM(x)
FROM T
SQL Fiddle
Both of the above perform a single pass through the data and can potentially use an index on Row INCLUDE(Value) to avoid a sort.
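A quick way to convince yourself that the weighted sum is equivalent to the recursive definition is to compute both for the sample data. This Python sketch covers the 50%/50% case only; the function names are made up for illustration.

```python
values = [1, 2, 3, 4, 4, 4]  # sample data from the question, ordered by Row


def decayed_recursive(values):
    """Recursive definition: each new value gets 50%, the history gets 50%."""
    decayed = values[0]
    for v in values[1:]:
        decayed = decayed * 0.5 + v * 0.5
    return decayed


def decayed_weighted_sum(values):
    """Closed form: weight 0.5^rn, counting rn from the newest row.

    The oldest row is special-cased (exponent rn - 1) because it enters the
    recursion as a starting value and is halved one time fewer.
    """
    n = len(values)
    total = 0.0
    for rn, v in enumerate(reversed(values), start=1):
        exponent = rn - 1 if rn == n else rn
        total += v * 0.5 ** exponent
    return total


print(decayed_recursive(values), decayed_weighted_sum(values))
```

Both functions give 3.78125 for the sample data, the result stated in the question.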

Gaps in recurring series of a group with datetime [duplicate]

We have a table with following data
Id,ItemId,SeqNumber,DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00
Id is an identity column.
The sequence for an ItemId starts from 0 and goes till 255 and then resets to 0. All this information is stored in a table called Item. The order of sequence number is determined by the DateTimeTrx but such data can enter any time into the system. The expected output is as shown below-
ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 16:30:00,2
We need to get those rows one before and one after the missing sequence. In the above example for ItemId 300 - the record with sequence 1 has entered first (2011-12-03 10:30:00) and then 255(2011-12-03 16:30:00), hence the missing number here is 2. So 1 is previous and 255 is next and 2 is the first missing number. Coming to ItemId 100, the record with sequence 255 has entered first (2011-12-02 09:05:00) and then 1 (2011-12-02 09:10:00), hence 255 is previous and then 1, hence 0 is the first missing number.
In the above expected result, MissingNumber column is the first occurring missing number just to illustrate the example.
We will not have a case where we would have a complete series reset at one time i.e. it can be either a series rundown from 255 to 0 as in for itemid 100 or 0 to 255 as in ItemId 300. Hence we need to identify sequence missing when in ascending order (0,1,...255) or either in descending order (254,254,0,2) etc.
How can we accomplish this in T-SQL?
Could work like this:
;WITH b AS (
SELECT *
,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
FROM tbl
), x AS (
SELECT
b.Id
,b.ItemId AS prev_Itm
,b.SeqNumber AS prev_Seq
,c.ItemId AS next_Itm
,c.SeqNumber AS next_Seq
FROM b
JOIN b c ON c.rn = b.rn + 1 -- next row
WHERE c.ItemId = b.ItemId -- only with same ItemId
AND c.SeqNumber <> (b.SeqNumber + 1)%256 -- Seq cycles modulo 256
)
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM x
UNION ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM x
ORDER BY Id, PrevNext DESC
Produces exactly the requested result.
See a complete working demo on data.SE.
This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.
Edit2: Answer to updated question:
I updated the CTE in the query above to match your latest version - or so I think.
Use those columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.
The explanation to your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I have SeqNumber in the ORDER BY additionally to break ties left by identical DateTimeTrx. I edited the query above.
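The core of the query, comparing each row with the next one per ItemId and testing the sequence modulo 256, can be sketched in Python with the sample rows from the question. The function name and the tuple layout are assumptions for illustration; the fourth element of each result is the first missing number.

```python
# (Id, ItemId, SeqNumber, DateTimeTrx) as listed in the question.
# ISO-style timestamps sort correctly as plain strings.
rows = [
    (1, 100, 254, "2011-12-01 09:00:00"),
    (2, 100, 1, "2011-12-01 09:10:00"),
    (3, 200, 7, "2011-12-02 11:00:00"),
    (4, 200, 5, "2011-12-02 10:00:00"),
    (5, 100, 255, "2011-12-01 09:05:00"),
    (6, 200, 3, "2011-12-02 09:00:00"),
    (7, 300, 0, "2011-12-03 10:00:00"),
    (8, 300, 255, "2011-12-03 11:00:00"),
    (9, 300, 1, "2011-12-03 10:30:00"),
]


def find_gaps(rows):
    """Return (ItemId, previous Seq, next Seq, first missing Seq) per gap."""
    # Same ordering as the ROW_NUMBER() in the query:
    # ItemId, DateTimeTrx, then SeqNumber to break ties.
    ordered = sorted(rows, key=lambda r: (r[1], r[3], r[2]))
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        expected = (prev[2] + 1) % 256  # sequence wraps from 255 to 0
        if prev[1] == nxt[1] and nxt[2] != expected:
            gaps.append((prev[1], prev[2], nxt[2], expected))
    return gaps


for gap in find_gaps(rows):
    print(gap)
```

For the sample rows this reports the gaps 255 to 1 for ItemId 100, 3 to 5 and 5 to 7 for ItemId 200, and 1 to 255 for ItemId 300.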

selecting previous and next rows in mysql - how?

I can't figure out how to select a previous/next row IF the current row does not have any numeric identifiers.
With numeric value I always use 2 queries:
SELECT min(customer_id)
FROM customers
WHERE `customer_id` < 10
GROUP BY customer_status
ORDER BY customer_name ASC
LIMIT 1;
SELECT max(customer_id)
FROM customers
WHERE `customer_id` > 10
GROUP BY customer_status
ORDER BY customer_name DESC
LIMIT 1;
However, I don't have "customer_id" anymore and only "customer_name". When I query the DB and sort by this column, I get:
Ab
Bb
Cc
Dd
Ee
Let's assume my current customer's name is "Cc". I want to be able to select "Bb" and "Dd" from the DB. How? :)
Rows do not have an inherent order; MySQL stores the rows in whatever order it wants. It's called clustering. You use LIMIT to grab subsets of a result set. LIMIT 10 returns rows 1 to 10. LIMIT 10,10 returns rows 11 to 20 (the first number is the offset, the second is the row count), and so on. Row 1 refers to the position of the row in the result set; since the rows in the tables are more like a "cloud", there is no order until you build a result set with an ORDER BY clause.
I'd select the previous one with...
SELECT MAX(customer_name)
FROM customers
WHERE `customer_name` < 'Cc'
LIMIT 1;
and the next one with...
SELECT MIN(customer_name)
FROM customers
WHERE `customer_name` > 'Cc'
LIMIT 1;
You were nearly there, I think.
Edit: Removed superfluous ORDER BY statements as suggested by Col. Shrapnel.
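The idea behind the two queries (largest name below the current one, smallest name above it) can be checked with a small Python sketch using the names from the question. Note that Python compares strings by code point, whereas MySQL uses the column's collation, so results may differ for mixed case or accented names.

```python
names = ["Ab", "Bb", "Cc", "Dd", "Ee"]  # customer_name values from the question


def neighbours(names, current):
    """Return (previous, next) relative to current; None when there is none."""
    previous = max((n for n in names if n < current), default=None)
    following = min((n for n in names if n > current), default=None)
    return previous, following


print(neighbours(names, "Cc"))
```

The stored order of the names does not matter here, which mirrors the point above that a table has no order of its own.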
