How to get MAX value of a version-number (varchar) column in T-SQL - sql-server

I have a table defined like this:
Column:  Version      Message
Type:    varchar(20)  varchar(100)
----------------------------------
Row 1:   2.2.6        Message 1
Row 2:   2.2.7        Message 2
Row 3:   2.2.12       Message 3
Row 4:   2.3.9        Message 4
Row 5:   2.3.15       Message 5
I want to write a T-SQL query that gets the message for the MAX version number, where the "Version" column represents a software version number. That is, 2.2.12 is greater than 2.2.7, 2.3.15 is greater than 2.3.9, and so on. Unfortunately, I can't think of an easy way to do that without using CHARINDEX or some other complicated split-like logic. Running this query:
SELECT MAX(Version) FROM my_table
will yield the erroneous result:
2.3.9
When it should really be 2.3.15. Any bright ideas that don't get too complex?

One solution would be to use a table-valued split function to split the versions into rows and then combine them back into columns so that you can do something like:
Select TOP 1 Major, Minor, Build
From ( ...derived crosstab query )
Order By Major Desc, Minor Desc, Build Desc
Actually, another way is to use the PARSENAME function, which was meant to split object names:
Select TOP 1 Z.Version
From my_table As Z
Order By Cast(Parsename( Z.Version , 3 ) As Int) Desc
, Cast(Parsename( Z.Version , 2 ) As Int) Desc
, Cast(Parsename( Z.Version , 1 ) As Int) Desc
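To check the PARSENAME approach against the sample rows from the question, here is a minimal self-contained sketch. The table variable @versions is only a stand-in for my_table (an assumption for illustration). It also assumes every version has exactly three dot-separated, purely numeric parts: PARSENAME returns NULL for a missing part or for more than four parts, and the cast fails on non-digits.

```sql
-- @versions is a hypothetical stand-in for my_table from the question.
DECLARE @versions TABLE (Version varchar(20), Message varchar(100));

INSERT INTO @versions (Version, Message) VALUES
    ('2.2.6',  'Message 1'),
    ('2.2.7',  'Message 2'),
    ('2.2.12', 'Message 3'),
    ('2.3.9',  'Message 4'),
    ('2.3.15', 'Message 5');

-- PARSENAME numbers the parts from the right: 1 = build, 2 = minor, 3 = major.
SELECT TOP (1) Version, Message
FROM @versions
ORDER BY CAST(PARSENAME(Version, 3) AS int) DESC,
         CAST(PARSENAME(Version, 2) AS int) DESC,
         CAST(PARSENAME(Version, 1) AS int) DESC;
-- Returns: 2.3.15 | Message 5
```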

Does it have to be efficient on a large table? I suggest you create an indexed, persisted computed column that transforms the version into a format that ranks correctly, and use the computed column in your queries. Otherwise you'll always scan the table end to end.
If the table is small, it doesn't matter. Then you can rank just in time, using a split function or (ab)using PARSENAME as Thomas suggested.
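One possible shape for the indexed, persisted computed column idea is sketched below. Treat it as an untested sketch under assumptions: the column names VerMajor, VerMinor, VerBuild and the index name are made up for illustration, every version is assumed to have three numeric parts, and it relies on SQL Server treating PARSENAME plus an integer CAST as deterministic, which I believe is the case but which you should verify on your version.

```sql
-- Assumes my_table(Version varchar(20), Message varchar(100)) already exists.
-- VerMajor, VerMinor, VerBuild and IX_my_table_Version are hypothetical names.
ALTER TABLE my_table ADD
    VerMajor AS CAST(PARSENAME(Version, 3) AS int) PERSISTED,
    VerMinor AS CAST(PARSENAME(Version, 2) AS int) PERSISTED,
    VerBuild AS CAST(PARSENAME(Version, 1) AS int) PERSISTED;
GO  -- batch separator (SSMS/sqlcmd), so the new columns are visible below

CREATE INDEX IX_my_table_Version
    ON my_table (VerMajor DESC, VerMinor DESC, VerBuild DESC)
    INCLUDE (Version, Message);
GO

-- The sort keys match the index keys, so the optimizer can read the first
-- index row instead of scanning and sorting the whole table.
SELECT TOP (1) Version, Message
FROM my_table
ORDER BY VerMajor DESC, VerMinor DESC, VerBuild DESC;
```

Three integer columns are used here instead of one padded string so that no maximum number of digits per part has to be chosen up front. Note that an insert with a non-numeric part would fail once the columns are persisted.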

Related

SQL Server : Row Number without ordering

I want to create a Select statement that ranks the column as is without ordering.
Currently, the table is in the following order:
ITEM_Description1
ITEM_Description2
ITEM_StockingType
ITEM_RevisionNumber
I do not want the results to be numerical in any way, nor depend on the VariableID numbers, but with ROW_Number(), I have to choose something. Does anyone know how I can have the results look like this?
Row| VariableName
---------------------
1 | ITEM_Description1
2 | ITEM_Description2
3 | ITEM_StockingType
4 | ITEM_RevisionNumber
My code for an example is shown below.
SELECT
VariableName,
ROW_NUMBER() OVER (ORDER BY VariableID) AS RowNumber
FROM
SeanVault.dbo.TempVarIDs
Using ORDER BY (SELECT NULL) will give you the results you're looking for.
SELECT
VariableName,
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM
SeanVault.dbo.TempVarIDs
Your problem seems to be with this sentence:
Currently, the table is in the following order:
No, your table is NOT implicitly ordered!!
Although it might look like this...
The only way to enforce the result set's sort order is an ORDER BY clause on the outermost SELECT.
If you want to maintain the sort order of your inserts, you can use a column like ID INT IDENTITY (which will automatically increase a sequence counter).
Using GETDATE() on insert will not solve this, as multiple row inserts might get the same DateTime value.
You do not have to show this in your output of course...
Your table has no inherent order. Even if you get that order 100 times in a row, there is no guarantee it will be that order the 101st time.
You can add an identity column to the table.
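As a rough illustration of the identity-column suggestion, the sketch below records insert order explicitly and then numbers the rows by it. The table variable @vars and the column name InsertOrder are assumptions made up for this example, not part of the original table.

```sql
-- Hypothetical table variable; InsertOrder is an assumed column name.
DECLARE @vars TABLE (
    InsertOrder  int IDENTITY(1, 1) PRIMARY KEY,
    VariableName varchar(50) NOT NULL
);

-- One row per statement, so the identity values follow the statement order.
INSERT INTO @vars (VariableName) VALUES ('ITEM_Description1');
INSERT INTO @vars (VariableName) VALUES ('ITEM_Description2');
INSERT INTO @vars (VariableName) VALUES ('ITEM_StockingType');
INSERT INTO @vars (VariableName) VALUES ('ITEM_RevisionNumber');

-- The outer ORDER BY is what actually fixes the order of the result set.
SELECT ROW_NUMBER() OVER (ORDER BY InsertOrder) AS [Row],
       VariableName
FROM @vars
ORDER BY InsertOrder;
```

The identity column does not need to appear in the output; it only has to exist so that there is something meaningful to order by.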

Highlight Duplicate Values in a NetSuite Saved Search

I am looking for a way to highlight duplicates in a NetSuite saved search. The duplicates are in a column called "ACCOUNT" populated with text values.
NetSuite permits adding fields (columns) to the search using a stripped down version of SQL Server. It also permits conditional highlighting of entire rows using the same code. However I don't see an obvious way to compare values between rows of data.
Although duplicates can be grouped together in a summary report and identified by a count of 2 or more, I want to show duplicate lines separately and highlight each.
The closest thing I found was a clever formula that calculates a running total here:
sum/* comment */({amount})
OVER(PARTITION BY {name}
ORDER BY {internalid}
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
I wonder if it's possible to sort results by the field being checked for duplicates and adapt this code to identify changes in the "ACCOUNT" field between a row and the previous row.
Any ideas? Thanks!
This post has been edited. I have left the progression as a learning experience about NetSuite.
Original - plain SQL way - not suitable for NetSuite
Does something like this meet your needs? The test data assumes looking for duplicates on id1 and id2. Note: This does not work in NetSuite as it supports limited SQL functions. See comments for links.
declare @table table (id1 int, id2 int, value int);
insert @table values
(1,1,11),
(1,2,12),
(1,3,13),
(2,1,21),
(2,2,22),
(2,3,23),
(1,3,1313);
--select * from @table order by id1, id2;
select t.*,
case when dups.id1 is not null then 1 else 0 end is_dup --identify dups when there is a matching dup record
from @table t
left join ( --subquery to find duplicates
select id1, id2
from @table
group by id1, id2
having count(1) > 1
) dups
on dups.id1 = t.id1
and dups.id2 = t.id2
order by t.id1, t.id2;
First Edit - NetSuite target but in SQL.
This was a SQL test based on the example syntax provided in the question, since I do not have NetSuite to test against. It gives a value greater than 1 on each duplicate row using a similar syntax. Note: This gives the appropriate answer in SQL Server, but not in NetSuite.
select t.*,
sum(1) over (partition by id1, id2)
from @table t
order by t.id1, t.id2;
Second Edit - Working NetSuite version
After some back and forth here is a version that works in NetSuite:
sum/* comment */(1) OVER(PARTITION BY {name})
This will also give a value greater than 1 on any row that is a duplicate.
Explanation
This works by summing the value 1 on each row included in the partition. The partition column(s) should be what you consider a duplicate. If only one column makes a duplicate (e.g. user ID) then use as above. If multiple columns make a duplicate (e.g. first name, last name, city) then use a comma-separated list in the partition. SQL will basically group the rows by the partition and add up the 1s in the sum/* comment */(1). The example provided in the question sums an actual column. By summing 1 instead we will get the value 1 when there is only 1 ID in the partition. Anything higher is a duplicate. I guess you could call this field duplicate count.

SQL Get Second Record

I am looking to retrieve only the second (duplicate) record from a data set. For example in the following picture:
Inside the UnitID column there are two separate records for 105. I only want the returned data set to contain the second 105 record. Additionally, I want this query to return the second record for all duplicates, not just 105.
I have tried everything I can think of, although I am not that experienced, and I cannot figure it out. Any help would be greatly appreciated.
You need to use GROUP BY for this.
Here's an example (I can't read your first column name, so I'm calling it JobUnitK):
SELECT MAX(JobUnitK), Unit
FROM JobUnits
WHERE DispatchDate = 'oct 4, 2015'
GROUP BY Unit
HAVING COUNT(*) > 1
I'm assuming JobUnitK is your ordering/id field. If it's not, just replace MAX(JobUnitK) with MAX(FieldIOrderWith).
Use the RANK function. Rank the rows OVER (PARTITION BY UnitId), ordered by your ordering column, and pick the rows with rank 2.
For reference -
https://msdn.microsoft.com/en-IN/library/ms176102.aspx
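The RANK suggestion can be sketched as follows. The sample rows are invented for illustration, and JobUnitKeyID is assumed to be a unique, increasing key. If the ordering column had ties, RANK would give tied rows the same rank, so the filter could return more or fewer rows than expected.

```sql
-- Hypothetical sample data; JobUnitKeyID is assumed unique and increasing.
DECLARE @JobUnits TABLE (JobUnitKeyID int PRIMARY KEY, UnitID int NOT NULL);

INSERT INTO @JobUnits (JobUnitKeyID, UnitID) VALUES
    (1, 101), (2, 105), (3, 105), (4, 107), (5, 107), (6, 109);

WITH Ranked AS (
    SELECT JobUnitKeyID,
           UnitID,
           -- Rank restarts at 1 for every UnitID.
           RANK() OVER (PARTITION BY UnitID ORDER BY JobUnitKeyID) AS UnitRank
    FROM @JobUnits
)
SELECT JobUnitKeyID, UnitID
FROM Ranked
WHERE UnitRank = 2   -- the second record of each duplicated UnitID
ORDER BY UnitID;
-- Returns: (3, 105) and (5, 107)
```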
Assuming SQL Server 2005 and up, you can use the Row_Number windowing function:
WITH DupeCalc AS (
SELECT
DupID = Row_Number() OVER (PARTITION BY UnitID ORDER BY JobUnitKeyID),
*
FROM JobUnits
WHERE DispatchDate = '20151004'
)
SELECT *
FROM DupeCalc
WHERE DupID >= 2
ORDER BY UnitID Desc
;
This is better than a solution that uses Max(JobUnitKeyID) for multiple reasons:
There could be more than one duplicate, in which case you have to use Min(JobUnitKeyID) per UnitID and join back on the UnitID where JobUnitKeyID <> MinJobUnitKeyID.
Either way, using Min or Max requires you to join back to the same data (which will be inherently slower).
If the ordering key you use turns out to be non-unique, you won't be able to pull the right number of rows with either one.
If the ordering key consists of multiple columns, the query using Min or Max explodes in complexity.

SQL Server 2005 SELECT TOP 1 from VIEW returns LAST row

I have a view that may contain more than one row, looking like this:
[rate] | [vendorID]
8374 1234
6523 4321
5234 9374
In a SPROC, I need to set a param equal to the value of the first column from the first row of the view. something like this:
DECLARE @rate int;
SET @rate = (select top 1 rate from vendor_view where vendorID = 123)
SELECT @rate
But this ALWAYS returns the LAST row of the view.
In fact, if I simply run the subselect by itself, I only get the last row.
With 3 rows in the view, TOP 2 returns the FIRST and THIRD rows in order. With 4 rows, it's returning the top 3 in order. Yet still top 1 is returning the last.
DERP?!?
This works..
DECLARE @rate int;
CREATE TABLE #temp (vRate int)
INSERT INTO #temp (vRate) (select rate from vendor_view where vendorID = 123)
SET @rate = (select top 1 vRate from #temp)
SELECT @rate
DROP TABLE #temp
.. but can someone tell me why the first behaves so fudgely and how to do what I want? As explained in the comments, there is no meaningful column by which I can do an order by. Can I force the order in which rows are inserted to be the order in which they are returned?
[EDIT] I've also noticed that: select top 1 rate from ([view definition select]) also returns the correct values time and again.[/EDIT]
That is by design.
If you don't specify how the query should be sorted, the database is free to return the records in any order that is convenient. There is no natural order for a table that is used as default sort order.
What the order will actually be depends on how the query is planned, so you can't even rely on the same query giving a consistent result over time, as the database will gather statistics about the data and may change how the query is planned based on that.
To get the record that you expect, you simply have to specify how you want them sorted, for example:
select top 1 rate
from vendor_view
where vendorID = 123
order by rate
I ran into this problem on a query that had worked for years. We upgraded SQL Server and all of a sudden, an unordered select top 1 was not returning the final record in a table. We simply added an order by to the select.
My understanding is that, if no ORDER BY is provided, SQL Server will generally return the results in the order of the clustered index or of whatever index the engine picks. But this is not a guarantee of a certain order.
If you don't have something to order off of, you need to add it. Either add a date inserted column and default it to GETDATE() or add an identity column. It won't help you historically, but it addresses the issue going forward.
While it doesn't necessarily make sense that the results of the query should be consistent, in this particular instance they are so we decided to leave it 'as is'. Ultimately it would be best to add a column, but this was not an option. The application this belongs to is slated to be discontinued sometime soon and the database server will not be upgraded from SQL 2005. I don't necessarily like this outcome, but it is what it is: until it breaks it shall not be fixed. :-x

SQL Server ORDER BY date and nulls last

I am trying to order by date. I want the most recent dates coming in first. That's easy enough, but there are many records that are null and those come before any records that have a date.
I have tried a few things with no success:
ORDER BY ISNULL(Next_Contact_Date, 0)
ORDER BY ISNULL(Next_Contact_Date, 999999999)
ORDER BY coalesce(Next_Contact_Date, 99/99/9999)
How can I order by date and have the nulls come in last? The data type is smalldatetime.
smalldatetime has range up to June 6, 2079 so you can use
ORDER BY ISNULL(Next_Contact_Date, '2079-06-05T23:59:00')
if no legitimate records will have that date.
If this is not an assumption you fancy relying on, a more robust option is sorting on two columns.
ORDER BY CASE WHEN Next_Contact_Date IS NULL THEN 1 ELSE 0 END, Next_Contact_Date
Neither of the above suggestions can use an index to avoid a sort, however, and both give similar-looking plans.
One other possibility, if such an index exists, is
SELECT 1 AS Grp, Next_Contact_Date
FROM T
WHERE Next_Contact_Date IS NOT NULL
UNION ALL
SELECT 2 AS Grp, Next_Contact_Date
FROM T
WHERE Next_Contact_Date IS NULL
ORDER BY Grp, Next_Contact_Date
According to Itzik Ben-Gan, author of T-SQL Fundamentals for MS SQL Server 2012, "By default, SQL Server sorts NULL marks before non-NULL values. To get NULL marks to sort last, you can use a CASE expression that returns 1 when the" Next_Contact_Date column is NULL, "and 0 when it is not NULL. Non-NULL marks get 0 back from the expression; therefore, they sort before NULL marks (which get 1). This CASE expression is used as the first sort column." The Next_Contact_Date column "should be specified as the second sort column. This way, non-NULL marks sort correctly among themselves." Here is the solution query for your example for MS SQL Server 2012 (and SQL Server 2014):
ORDER BY
CASE
WHEN Next_Contact_Date IS NULL THEN 1
ELSE 0
END, Next_Contact_Date;
Equivalent code using IIF syntax:
ORDER BY
IIF(Next_Contact_Date IS NULL, 1, 0),
Next_Contact_Date;
order by -cast([Next_Contact_Date] as bigint) desc
If your SQL doesn't support NULLS FIRST or NULLS LAST, the simplest way to do this is to use the value IS NULL expression:
ORDER BY Next_Contact_Date IS NULL, Next_Contact_Date
to put the nulls at the end (NULLS LAST) or
ORDER BY Next_Contact_Date IS NOT NULL, Next_Contact_Date
to put the nulls at the front. This doesn't require knowing the type of the column and is easier to read than the CASE expression.
EDIT: Alas, while this works in other SQL implementations like PostgreSQL and MySQL, it doesn't work in MS SQL Server. I didn't have a SQL Server to test against and relied on Microsoft's documentation and testing with other SQL implementations. According to Microsoft, value IS NULL is an expression that should be usable just like any other expression. And ORDER BY is supposed to take expressions just like any other statement that takes an expression. But it doesn't actually work.
The best solution for SQL Server therefore appears to be the CASE expression.
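Putting the CASE approach together with the original requirement (most recent dates first, NULLs last) might look like the sketch below. The table variable @contacts and its rows are invented for illustration; only the column name comes from the question.

```sql
-- Hypothetical sample data; the column name comes from the question.
DECLARE @contacts TABLE (Next_Contact_Date smalldatetime NULL);

INSERT INTO @contacts (Next_Contact_Date) VALUES
    ('20150301'), (NULL), ('20150715'), (NULL), ('20141231');

SELECT Next_Contact_Date
FROM @contacts
ORDER BY CASE WHEN Next_Contact_Date IS NULL THEN 1 ELSE 0 END,  -- NULLs last
         Next_Contact_Date DESC;                                 -- newest first
-- Order: 2015-07-15, 2015-03-01, 2014-12-31, NULL, NULL
```

Changing DESC to ASC on the second sort column reverses the dates but keeps the NULL rows at the end, because the CASE expression is evaluated first.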
A bit late, but maybe someone finds it useful.
For me, ISNULL was out of question due to the table scan. UNION ALL would need me to repeat a complex query, and due to me selecting only the TOP X it would not have been very efficient.
If you are able to change the table design, you can:
Add another field, just for sorting, such as Next_Contact_Date_Sort.
Create a trigger that fills that field with a large (or small) value, depending on what you need:
CREATE TRIGGER FILL_SORTABLE_DATE ON YOUR_TABLE AFTER INSERT, UPDATE AS
BEGIN
SET NOCOUNT ON;
IF (UPDATE(Next_Contact_Date)) BEGIN
-- NULL dates get the last day smalldatetime can hold so they sort last; other dates are copied as-is.
-- (An unquoted 99/99/9999 is integer division and evaluates to 0, not a far-future date.)
UPDATE t
SET Next_Contact_Date_Sort = IIF(i.Next_Contact_Date IS NULL, '20790606', i.Next_Contact_Date)
FROM YOUR_TABLE t
JOIN inserted i ON t.key1 = i.key1 AND t.key2 = i.key2
END
END
Use desc and multiply by -1 if necessary. Example for ascending int ordering with nulls last:
select *
from
(select null v union all select 1 v union all select 2 v) t
order by -t.v desc
I know this is old but this is what worked for me
Order by Isnull(Date,'12/31/9999')
I think I found a way to show nulls in the end and still be able to use indexes for sorting.
The idea is super simple: create a computed column based on the existing column, and put an index on it.
ALTER TABLE dbo.Users
ADD [FirstNameNullLast]
AS (case when [FirstName] IS NOT NULL AND (ltrim(rtrim([FirstName]))<>N'' OR [FirstName] IS NULL) then [FirstName] else N'ZZZZZZZZZZ' end) PERSISTED
So, we are creating a persisted computed column in which all blank and NULL values are replaced by 'ZZZZZZZZZZ'. This means that if we sort on that column, all the NULL or blank values appear at the end.
Now we can use it in our new index.
Like this:
CREATE NONCLUSTERED INDEX [IX_Users_FirstNameNullLast] ON [dbo].[Users]
(
[FirstNameNullLast] ASC
)
So, this is an ordinary nonclustered index. We can change it however we want, e.g. include extra columns, add more indexed columns, change the sort order, etc.
I know this is an old thread, but in SQL Server NULLs always sort lower than non-NULL values. So it's only necessary to order descending.
In your case Order by Next_Contact_Date Desc should be enough.
Source: order by with nulls- LearnSql
