SQL Case Then: remember and evaluate based on last group result - sql-server

Example: I have the following CASE expression (see "SQL: Order by column, then by substring mix asc and desc"). My warehouse rows and locations are stored in one column, as shown below, and the CASE alternates the direction of the rows, going ASC for odd rows and DESC for even rows:
select *
from #temp
order
by substring(id,1,2),
case
when substring(id,1,2)%2=0 then row_number() over (partition by substring(id,1,2) order by SUBSTRING(id,4,3) desc)
else row_number() over (partition by substring(id,1,2) order by SUBSTRING(id,4,3) asc)
end
01-001-A-01
01-002-A-02
01-003-A-03
01-004-A-01
01-005-A-03
02-001-A-01
02-002-A-02
02-003-A-03
02-004-A-01
02-005-A-03
03-001-A-01
03-002-A-02
03-003-A-03
03-004-A-01
03-005-A-03
Now I would like to add the following. Suppose I pick an order from row 1 and there is nothing to pick from row 2, so I go to row 3. I don't want to walk back down the aisle to the beginning of row 3 (03-001), because I'm already close to 03-005. So I would like my results to always alternate between ASC and DESC: after 01-005-A-03, if there are no results for 02, I want 03-005-A-03 next. In this case row 3 should be DESC and row 4 ASC (that is, always do the opposite of the previous group).
This is how it should look if no result begins with 02-XXX-X-XX:
01-001-A-01
01-002-A-02
01-003-A-03
01-004-A-01
01-005-A-03
03-005-A-03
03-004-A-01
03-003-A-03
03-002-A-02
03-001-A-01

Change the ORDER BY clause to:
order by
substring(id,1,2),
case
when dense_rank() over (order by substring(id,1,2)) % 2 = 0
then row_number() over (partition by substring(id,1,2) order by SUBSTRING(id,4,3) desc)
else row_number() over (partition by substring(id,1,2) order by SUBSTRING(id,4,3) asc)
end
Instead of taking substring(id,1,2) % 2, use dense_rank() to get a gap-free numbering of the groups that are actually present, and take the modulo of that.
Note: the original query would fail if the first segment is not purely numeric.
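As a rough end-to-end sketch of the dense_rank() approach above, the script below builds a small #temp table in which aisle 02 has nothing to pick. The sample rows and the varchar(20) column type are assumptions made for illustration; only the ORDER BY expression comes from the answer.

```sql
-- Assumed sample data: aisle 02 is empty, so aisle 03 is the second group.
create table #temp (id varchar(20));

insert into #temp (id) values
    ('01-001-A-01'), ('01-002-A-02'), ('01-003-A-03'), ('01-004-A-01'), ('01-005-A-03'),
    ('03-001-A-01'), ('03-002-A-02'), ('03-003-A-03'), ('03-004-A-01'), ('03-005-A-03');

select id
from #temp
order by
    substring(id, 1, 2),
    case
        -- dense_rank() numbers the aisles that are present as 1, 2, 3, ... without gaps,
        -- so aisle 03 is group 2 here and is walked in descending order.
        when dense_rank() over (order by substring(id, 1, 2)) % 2 = 0
            then row_number() over (partition by substring(id, 1, 2)
                                    order by substring(id, 4, 3) desc)
        else row_number() over (partition by substring(id, 1, 2)
                                order by substring(id, 4, 3) asc)
    end;

drop table #temp;
```

With this data the 01 rows come back from 01-001 to 01-005, followed by the 03 rows from 03-005 down to 03-001, because the direction depends on the position of the aisle among the aisles present rather than on the aisle number itself.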

Related

SQL syntax for complex GROUP BY with OVER statement: calculating Gini coefficient for multiple sets

I want to calculate the Gini coefficient for a number of sets, contained in a two-column table (here called #cits) that holds a value and a set ID. I have been experimenting with different Gini coefficient calculations, described here (StackExchange query) and here (StackOverflow question with some good replies). Both examples only calculate one coefficient for one table, whereas I would like to do it with a GROUP BY clause.
The #cits table contains two columns, c and cid, being the value and set-ID respectively.
Here is my current try (incomplete):
select count(c) as numC,
sum(c) as totalC,
(select row_number() over(order by c asc, cid) id, c from #cits) as a
from #cits group by cid
Selecting numC and totalC works well, of course, but the next line is giving me a headache. I can see that the syntax is wrong, but I can't figure out how to assign the row_number() per c per cid.
EDIT:
Based on the suggestions, I used partition, like so:
select cid,sumC = sum(a.id * a.c)
into #srep
from (
select cid,row_number() over (partition by cid order by c asc) id,
c
from #cits
) as a
group by a.cid
select count(c) as numC,
sum(c) as totalC, b.sumC
into #gtmp
from #cits a
join #srep b
on a.cid = b.cid
group by a.cid,b.sumC
select
gini = 2 * sumC / (totalC * numC) - (numC - 1) / numC
from #gtmp
This almost works. I get a result, but it is >1, which is unexpected, as the Gini-coefficient should be between 0 and 1. As stated in the comments, I would have preferred a one-query solution as well, but it is not a major issue at all.
You can "partition" the data so that row numbering starts over for each ID, but I'm not sure this is what you're after.
I'm assuming you want the CID displayed, since you are grouping by it.
select count(c) as numC
, sum(c) as totalC
, row_number() over(partition by cID order by c asc) as a
, cid
from #cits group by cid
Note you don't need the subquery.
Yeah this isn't likely right.
output
NumC TotalC A CID
24 383 1 1
15 232 1 2
If I'm understanding correctly, you need numC and totalC for each C in a cid set, as well as the position of the c inside of that set. This should get you what you need:
select
rn.cid,
rn.c,
row_number() over (partition by rn.cid order by rn.c) as id,
agg.numC,
agg.totalC
from #cits rn
left outer join
(
select
cid,
count(c) as numC,
sum(c) as totalC
from #cits
group by cid
) agg
on rn.cid = agg.cid
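Building on the partitioned row numbering above, here is a hedged single-query sketch of a per-set Gini coefficient. It assumes the usual formula for values sorted in ascending order, G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, that c is numeric and non-negative, and that every set has a non-zero total. Note that this formula uses (n + 1) rather than (n - 1), and that the arithmetic is forced to decimal so that integer division does not truncate the second term; both points are possible reasons for a result above 1.

```sql
-- Sketch only: #cits(c, cid) is the table from the question.
select
    ranked.cid,
    gini = 2.0 * sum(1.0 * ranked.rn * ranked.c)
               / (count(*) * sum(1.0 * ranked.c))   -- fails if a set sums to 0
           - (count(*) + 1.0) / count(*)
from (
    select cid,
           c,
           -- position of each value inside its own set, smallest first
           row_number() over (partition by cid order by c asc) as rn
    from #cits
) as ranked
group by ranked.cid;
```

For a set whose values are all equal the expression evaluates to 0, which is a quick sanity check for the formula.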

Creating unique identifier column(1 or zero) Rank () SQL SERVER

I am trying to create a column in SQL Server that shows 1 or 0 (zero). I have a column of customer numbers that appear more than once. The first occurrence of each customer number should show 1, and every repeat should show 0 (zero). How can I create this?
CustNumber Unique
25122134 1
25122134 0
25122134 0
25122136 1
25122136 0
The solutions I am considering and trying out now are RANK() and DENSE_RANK().
declare @test table
(
CustNumber int
)
insert into @test values
(25122134),
(25122134),
(25122134),
(25122136),
(25122136)
select
* ,
-- each CustNumber in partition has the same rank, but different row_number
case when (row_number() over (partition by CustNumber order by CustNumber)) = 1
then 1 else 0 end as [Unique]
-- the 1st is unique, the rest (2..n) are not
from @test
order by CustNumber, [Unique] desc
-- unique in each group should be displayed first
You don't want RANK because that, by definition, produces the same output for identical inputs.
ROW_NUMBER() and a simple CASE expression should do it:
;WITH Numbered as (
SELECT CustNumber,
ROW_NUMBER() OVER (PARTITION BY CustNumber
ORDER BY CustNumber) as rn --Unusual - pick a real column if you have a preference
FROM YourUnnamedTable
)
SELECT CustNumber,CASE WHEN rn = 1 THEN 1 ELSE 0 END as [Unique]
FROM Numbered

GROUP BY doesn't contain specific column

I have the following statement in MSSQL
SELECT a, b, MAX(t)
FROM table
GROUP BY a, b
What I want is just to show c and d columns for each specific row in the result. How can I do that?
It sounds like you're looking for ROW_NUMBER() or RANK() (the former will ignore ties, the latter will include them), something like:
;With Ranked as (
SELECT a,b,c,d,t,
ROW_NUMBER() OVER (PARTITION BY a,b
ORDER BY t desc) as rn
FROM table
)
SELECT * from Ranked where rn = 1
This returns one row for each unique combination of the a,b columns, choosing the other values such that they come from the row with the highest t value (and, as I say, this variant ignores ties).
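For comparison, a minimal sketch of the RANK() variant mentioned above, which keeps ties: if several rows in the same a,b group share the highest t, all of them are returned. SomeTable is a placeholder name standing in for the table in the question.

```sql
-- SomeTable and the columns a, b, c, d, t are placeholders from the question.
;with Ranked as (
    select a, b, c, d, t,
           -- rows tied on the highest t within an (a, b) group all get rank 1
           rank() over (partition by a, b order by t desc) as rnk
    from SomeTable
)
select a, b, c, d, t
from Ranked
where rnk = 1;
```

Which variant fits depends on whether exactly one row per group is required (ROW_NUMBER) or every row carrying the maximum should be shown (RANK).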

Do I need to use the dreaded sql server loop/ cursor for the result set I need?

I need a sql server result set that "breaks" on a column value, but if I order by this column in a ranking function, the order I really need is lost. This is best explained by example. The query I'm currently experimenting with is:
select RANK() over(partition by Symbol, Period order by TradeDate desc)
SymbSmaOverUnderGroup, Symbol, TradeDate, Period, Value, Low, LowMinusVal,
LMVSign
from #smasAndLow3
and it returns:
Rnk  Symbol  TradeDate  Period  Value  Low    LowMinusVal  LMVSign
1    A       9/6/12     5       37.09  36.71  -.38         U
2    A       9/5/12     5       37.03  36.62  -.41         U
3    A       9/4/12     5       37.07  36.71  -.36         U
4    A       8/31/12    5       37.15  37.30  .15          O
5    A       8/30/12    5       37.22  37.40  .18          O
6    A       8/29/12    5       37.00  36.00  -1.00        U
7    A       8/28/12    5       37.10  37.00  -.10         U
The rank I need here is: 1,1,1,2,2,3,3. So I need to partition by Symbol, Period, and I need to start a new partition on LMVSign (which only contains the values U, O, and E), but it's essential that I order by TradeDate desc. Unless I'm mistaken, partitioning or ordering by LMVSign will make it impossible to sort on the date column. I hope this makes sense. I'm working like mad to do this without a cursor, but I can't get it to work.. thanks in advance.
UPDATE after clarification: I think that you are entering the world of islands and gaps. If your requirement is to group rows by Symbol, Period and LMVSign in descending TradeDate order, starting a new rank whenever any one of these columns changes, you might use this (based on Itzik Ben-Gan's solution to islands and gaps).
; with islandsAndGaps as
(
select *,
-- Create groups. The important part is the order by:
-- within a run of equal LMVSign values both sequences advance together,
-- so their difference stays constant. The difference itself is not ordered.
row_number() over (partition by Symbol, Period
order by TradeDate)
- row_number() over (partition by Symbol, Period
order by LMVSign, TradeDate) grp
from Table1
),
grouped as
(
select *,
-- To order the groups we use the last date in each group.
-- The partition repeats Symbol and Period and adds LMVSign and grp,
-- because grp is only unique within one LMVSign value.
max(TradeDate) over(partition by Symbol, Period, LMVSign, grp) DateGroup
from islandsAndGaps
)
-- now we can get the rank, restarting for each Symbol and Period
select dense_rank() over (partition by Symbol, Period order by DateGroup desc) Rnk,
*
from grouped
order by TradeDate desc
Take a look at Sql Fiddle.
OLD answer:
Partition by restarts ranking. I think that you need order by:
dense_rank() over (order by Symbol, Period, LMVSign desc) Rnk
and then you should use TradeDate in order by:
order by Rnk, TradeDate desc
If you need it as a number, add another column:
row_number() over (order by Symbol, Period, LMVSign desc, TradeDate desc) rn

How do I select last 5 rows in a table without sorting?

I want to select the last 5 records from a table in SQL Server without arranging the table in ascending or descending order.
This is just about the most bizarre query I've ever written, but I'm pretty sure it gets the "last 5" rows from a table without ordering:
select *
from issues
where issueid not in (
select top (
(select count(*) from issues) - 5
) issueid
from issues
)
Note that this makes use of SQL Server 2005's ability to pass a value into the "top" clause - it doesn't work on SQL Server 2000.
If you have an index on id, and the id values are contiguous (no gaps), this will be very fast:
SELECT * FROM [MyTable] WHERE [id] > (SELECT MAX([id]) - 5 FROM [MyTable])
The way your question is phrased makes it sound like you think you have to physically resort the data in the table in order to get it back in the order you want. If so, this is not the case, the ORDER BY clause exists for this purpose. The physical order in which the records are stored remains unchanged when using ORDER BY. The records are sorted in memory (or in temporary disk space) before they are returned.
Note that the order in which records are returned is not guaranteed without an ORDER BY clause. So, while any of the suggestions here may work, there is no reason to think they will continue to work, nor can you prove that they work in all cases with your current database. This is by design: I assume it gives the database engine the freedom to do as it will with the records in order to obtain the best performance when no explicit order is specified.
Assuming you wanted the last 5 records sorted by the field Name in ascending order, you could do something like this, which should work in either SQL 2000 or 2005:
select Name
from (
select top 5 Name
from MyTable
order by Name desc
) a
order by Name asc
You need to count the number of rows in the table (say we have 12 rows),
then subtract 5 from that count (now we are at 7),
and select the rows where index_column > 7:
select * from users
where user_id >
( (select COUNT(*) from users) - 5)
You can then order them ASC or DESC.
But when using this code
select TOP 5 * from users order by user_id DESC
it will not be easy to reorder the result.
select * from table limit 5 offset (select count(*) from table) - 5;
Without an order, this is impossible. What defines the "bottom"? The following will select 5 rows according to how they are stored in the database.
SELECT TOP 5 * FROM [TableName]
Well, the "last five rows" are actually the last five rows depending on your clustered index. Your clustered index, by definition, is the way that the rows are ordered. So you really can't get the "last five rows" without some order. You can, however, get the last five rows as it pertains to the clustered index.
SELECT TOP 5 * FROM MyTable
ORDER BY MyCLusteredIndexColumn1 DESC, MyCLusteredIndexColumn2 DESC, ..., MyCLusteredIndexColumnN DESC
To get the last 5 records, you can use this (it assumes an identity column with no gaps):
SELECT *
FROM [Table Name]
WHERE ID <= IDENT_CURRENT('Table Name')
AND ID > IDENT_CURRENT('Table Name') - 5
If you know how many rows there will be in total you can use the ROW_NUMBER() function.
Here's an example from MSDN (http://msdn.microsoft.com/en-us/library/ms186734.aspx)
USE AdventureWorks;
GO
WITH OrderedOrders AS
(
SELECT SalesOrderID, OrderDate,
ROW_NUMBER() OVER (ORDER BY OrderDate) AS 'RowNumber'
FROM Sales.SalesOrderHeader
)
SELECT *
FROM OrderedOrders
WHERE RowNumber BETWEEN 50 AND 60;
In SQL Server 2012 you can do this:
Declare @Count1 int ;
Select @Count1 = Count(*)
FROM [Log] AS L
SELECT
*
FROM [Log] AS L
ORDER BY L.id
OFFSET @Count1 - 5 ROWS
FETCH NEXT 5 ROWS ONLY;
Try this, if you don't have a primary key or identity column:
select [Stu_Id],[Student_Name] ,[City] ,[Registered],
RowNum = row_number() OVER (ORDER BY (SELECT 0))
from student
ORDER BY RowNum desc
You can retrieve them from memory.
So first you get the rows in a DataSet, and then get the last 5 out of the DataSet.
There is a handy trick that works in some databases for ordering in database order,
SELECT * FROM TableName ORDER BY true
Apparently, this can work in conjunction with any of the other suggestions posted here to leave the results in "order they came out of the database" order, which in some databases, is the order they were last modified in.
select *
from table
order by empno desc -- empno is the primary key
fetch first 5 rows only
To retrieve the last 5 rows in MySQL,
this query works perfectly:
SELECT * FROM (SELECT * FROM recharge ORDER BY sno DESC LIMIT 5)sub ORDER BY sno ASC
or
select sno from(select sno from recharge order by sno desc limit 5) as t where t.sno order by t.sno asc
When the number of rows in the table is less than 5, the answers of Matt Hamilton and msuvajac are incorrect,
because a TOP N row count value may not be negative.
A great example can be found Here.
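One possible way to guard the NOT IN / TOP approach against tables with fewer than 5 rows is to clamp the TOP argument at zero. This is a sketch under the assumption that the issues table and issueid column from the earlier answer are used and that issueid is never NULL. As with the original, TOP without ORDER BY picks rows in an arbitrary order, so "last" is not guaranteed.

```sql
select *
from issues
where issueid not in (
    select top (
        -- never pass a negative value to TOP: skip 0 rows when the table has 5 or fewer
        (select case when count(*) > 5 then count(*) - 5 else 0 end from issues)
    ) issueid
    from issues
);
```

When the table holds 5 rows or fewer, the inner query returns nothing and the outer query returns every row.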
I am using this code:
select * from tweets where placeID = '$placeID' and id > (
(select count(*) from tweets where placeID = '$placeID')-2)
In SQL Server, it does not seem possible without using ordering in the query.
This is what I have used.
SELECT *
FROM
(
SELECT TOP 5 *
FROM [MyTable]
ORDER BY Id DESC /*Primary Key*/
) AS T
ORDER BY T.Id ASC; /*Primary Key*/
DECLARE @MYVAR NVARCHAR(100)
DECLARE @step int
SET @step = 0;
DECLARE MYTESTCURSOR CURSOR
DYNAMIC
FOR
SELECT col FROM [dbo].[table]
OPEN MYTESTCURSOR
FETCH LAST FROM MYTESTCURSOR INTO @MYVAR
print @MYVAR;
WHILE @step < 10
BEGIN
FETCH PRIOR FROM MYTESTCURSOR INTO @MYVAR
print @MYVAR;
SET @step = @step + 1;
END
CLOSE MYTESTCURSOR
DEALLOCATE MYTESTCURSOR
Thanks to @Apps Tawale. Based on his answer, here's another (my) version
to select the last 5 records without an identity column:
select top 5 *,
RowNum = row_number() OVER (ORDER BY (SELECT 0))
from [dbo].[ViewEmployeeMaster]
ORDER BY RowNum desc
Admittedly, it still has an ORDER BY, but only on RowNum :)
Note(1): The above query will reverse the order of what we get when we run the main select query.
So, to preserve the original order, we can wrap it like this:
select *, RowNum2 = row_number() OVER (ORDER BY (SELECT 0))
from (
select top 5 *, RowNum = row_number() OVER (ORDER BY (SELECT 0))
from [dbo].[ViewEmployeeMaster]
ORDER BY RowNum desc
) as t1
order by RowNum2 desc
Note(2): Without an identity column, the query takes a bit of time in case of large data
Get the count of that table
select count(*) from TABLE
select top count * from TABLE where 'primary key row' NOT IN (select top (count-5) 'primary key row' from TABLE)
If you do not want to arrange the table in ascending or descending order. Use this.
select * from table limit 5 offset (select count(*) from table) - 5;
