alphanumeric sorting in sql

alphanumeric sorting in sql - sql-server

Hi I am currently doing a project in which the database needs to sort the lot numbers
prefix is nvarchar
lotnum is int
suffix is nvarchar
I have managed to convert the lot number
code i used is
Select (case when prefix is null then '' else prefix end) +
CONVERT ( nvarchar , ( lotnumber ) ) +(case when suffix is null then '' else suffix end)
(values in the database are a1a,1a,1,2,100)
when I order by lotnumber I get
a1a
1a
1
2
100
then prefix to the order by
and get this result
1
a1a
1a
2
100
I have added the suffix as well and returns the same result
I need to order it as follows
1
1a
2
100
a1a
Please could someone help me on this

Have you tried ordering by all three columns?
ORDER BY prefix, lotnum, suffix
By the way, I can see you're using SQL Server. To make things more portable, I'd recommend to use COALESCE and CAST instead of a CASE/WHEN and CONVERT for prefix and lotnum. Full query may look like this.
SELECT
COALESCE(prefix, '')
+ CAST(lotnum AS NVARCHAR)
+ COALESCE(suffix, '') AS lot_number
FROM
YourTable
ORDER BY
COALESCE(prefix, '')
,lotnum
,COALESCE(suffix, '')

Related

!IsNullOrEmpty equivalent in SQL server

I'm creating an SSRS report and during writing the query I look up the code logic for the data that needs to be retrieved in query. There is lots of usage of !String.IsNullOrEmpty method so I want to know what is the shortest and best way to do the equivalent check in SQL server?
WHERE t.Name IS NOT NULL OR T.Name != ''
or....
WHERE LEN(t.Name) > 0
which one is correct? Or is there any other alternative?

There is no built-in equivalent of IsNullOrEmpty() in T-SQL, largely due to the fact that trailing blanks are ignored when comparing strings. One of your options:
where len(t.Name) > 0
would be enough as len() ignores trailing spaces too. Unfortunately it can make the expression non-SARGable, and any index on the Name column might not be used. However, this one should do the trick:
where t.Name > ''
P.S. For the sake of completeness, the datalength() function takes all characters into account; keep in mind however that it returns the number of bytes, not characters, so for any nvarchar value the result will be at least double of what you might expect (and with supplementary characters / surrogate pairs the number should be even higher, if my memory serves).

If the desired result is the simplest possible one-liner then:
WHERE NULLIF(Name, '') IS NOT NULL
However, from a performance point of view following alternative is SARGable, therefore indexes potentially can be used to spot and filter out records with such values
WHERE Name IS NOT NULL AND Name != ''
An example:
;WITH cte AS (
SELECT 1 AS ID, '' AS Name UNION ALL
SELECT 2, ' ' UNION ALL
SELECT 3, NULL UNION ALL
SELECT 4, 'abc'
)
SELECT * FROM cte
WHERE Name IS NOT NULL AND Name != ''
Results to:
ID Name
---------
4 abc

Yes, you can use WHERE LEN(t.Name)>0.
You can also verify as below:
-- Count the Total number of records
SELECT COUNT(1) FROM tblName as t
-- Count the Total number of 'NULL' or 'Blank' records
SELECT COUNT(1) FROM tblName as t WHERE ISNULL(t.Name,'')= ''
-- Count the Total number of 'NOT NULL' records
SELECT COUNT(1) FROM tblName as t WHERE LEN(t.Name)>0
Thanks.

How to merge multiple values from multiple columns into one row?

I have this table:
I need to convert it to (with parenthesis as well):
row_nbr - row_label - default_order
10 - TOTAL ACCOUNTABLE GROSS - (1, 3)
12 - DEDUCTIBLE TERMS - (3)
20 - TOTAL DEDUCTIBLE TERMS - (3)
34 - AMOUNT DUE (UNRECOUPED) - (4)
36 - ACCOUNTABLE GROSS - (2)
41 - TOTAL CONTINGENT COMPENSATION - (3)
I could have more than twice of the same row_nbr.
In this case the 10 is there twice, but I could have 3 10's, 4 12's, etc.
I kind of started the pivot table but honestly, even by looking at the Microsoft site, I cannot for the life of me figure this out.
select row_nbr, row_label, default_order
from #temp
pivot
(
max(row_nbr)
for default_order in (default_order)
) piv;
Anyone care to help?
Thanks.

As #Vinit says, you can use the string_agg function in 2017, but if you're at least on 2005 you can use a horrible, torturous XML generator:
SELECT row_nbr
,row_label
,default_order = '(' +
STUFF(
(SELECT ', ' + CAST(default_order AS VARCHAR(10))
FROM #temp
WHERE row_nbr = t.row_nbr
ORDER BY default_order
FOR XML PATH('') ,
ROOT('MyString'),
TYPE ).value('/MyString[1]', 'varchar(max)'), 1, 2, '')
+ ')'
FROM #temp t;
You can read more about it in this blog post

PIVOTS are definitely wonky to get a grip on. Fortunately, in this case, while you could use one as an intermediate step, it's not necessary. PIVOT will take each value and put it into a corresponding distinct column, and what you're wanting is a single column, with them all concatenated together. Like I said, you could do the pivot, then just concatenate all the generated columns together, but that's way more work than is needed.
On 2014, the easiest way to do this is using FOR XML. Russell Fox's answer pretty much covers how that technique works (although there are a few variants on how you can do that should you so choose).
If you know definitively the values are all integers, you can save a bit of typing and omit the type and value operators, as those are only necessary when you have to escape certain XML characters in string fields
select
row_nbr,
row_label,
default_order,
stuff
(
(
select concat(',', default_order)
from #temp i
where i.row_nbr = o.row_nbr
for xml path('')
), 1, 1, ''
)
from #temp o

Expression to find multiple spaces in string

We handle a lot of sensitive data and I would like to mask passenger names using only the first and last letter of each name part and join these by three asterisks (***),
For example: the name 'John Doe' will become 'J***n D***e'
For a name that consists of two parts this is doable by finding the space using the expression:
LEFT(CardHolderNameFromPurchase, 1) +
'***' +
CASE WHEN CHARINDEX(' ', PassengerName) = 0
THEN RIGHT(PassengerName, 1)
ELSE SUBSTRING(PassengerName, CHARINDEX(' ', PassengerName) -1, 1) +
' ' +
SUBSTRING(PassengerName, CHARINDEX(' ', PassengerName) +1, 1) +
'***' +
RIGHT(PassengerName, 1)
END
However, the passenger name can have more than two parts, there is no real limit to it. How should can I find the indices of all spaces within an expression? Or should I maybe tackle this problem in a different way?
Any help or pointer is much appreciated!

This solution does what you want it to, but is really the wrong approach to use when trying to hide personally identifiable data, as per Gordon's explanation in his answer.
SQL:
declare #t table(n nvarchar(20));
insert into #t values('John Doe')
,('JohnDoe')
,('John Doe Two')
,('John Doe Two Three')
,('John O''Neill');
select n
,stuff((select ' ' + left(s.item,1) + '***' + right(s.item,1)
from dbo.fn_StringSplit4k(t.n,' ',null) as s
for xml path('')
),1,1,''
) as mask
from #t as t;
Output:
+--------------------+-------------------------+
| n | mask |
+--------------------+-------------------------+
| John Doe | J***n D***e |
| JohnDoe | J***e |
| John Doe Two | J***n D***e T***o |
| John Doe Two Three | J***n D***e T***o T***e |
| John O'Neill | J***n O***l |
+--------------------+-------------------------+
String splitting function based on Jeff Moden's Tally Table approach:
create function [dbo].[fn_StringSplit4k]
(
#str nvarchar(4000) = ' ' -- String to split.
,#delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,#num as int = null -- Which value to return, null returns all.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in #str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest #str length.
,t(t) as (select top (select len(isnull(#str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(#str,''),t,1) = #delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(#delimiter,isnull(#str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(#str,s,l) as item
from l
) a
where rn = #num
or #num is null;
GO

If you consider PassengerName as sensitive information, then you should not be storing it in clear text in generally accessible tables. Period.
There are several different options.
One is to have reference tables for sensitive information. Any table that references this would have an id rather than the name. Viola. No sensitive information is available without access to the reference table, and that would be severely restricted.
A second method is a reversible compression algorithm. This would allow the the value to be gibberish, but with the right knowledge, it could be transformed back into a meaningful value. Typical methods for this are the public key encryption algorithms devised by Rivest, Shamir, and Adelman (RSA encoding).
If you want to do first and last letters of names, I would be really careful about Asian names. Many of them consist of two or three letters, when written in Latin script. That isn't much hiding. SQL Server does not have simple mechanisms to do this. You can write a user-defined function with a loop to manager the process. However, I view this as the least secure and least desirable approach.

This uses Jeff Moden's DelimitedSplit8K, as well as the new functionality in SQL Server 2017 STRING_AGG. As I don't know what version you're using, I've just gone "whole hog" and assumed you're using the latest version.
Jeff's function is invaluable here, as it returns the ordinal position, something which Microsoft have foolishly omitted from their own function, STRING_SPLIT (and didn't add in 2017 either). Ordinal position is key here, so we can't make use of the built in function.
WITH VTE AS(
SELECT *
FROM (VALUES ('John Doe'),('Jane Bloggs'),('Edgar Allan Poe'),('Mr George W. Bush'),('Homer J Simpson')) V(FullName)),
Masking AS (
SELECT *,
ISNULL(STUFF(Item, 2, LEN(item) -2,'***'), Item) AS MaskedPart
FROM VTE V
CROSS APPLY dbo.delimitedSplit8K(V.Fullname, ' '))
SELECT STRING_AGG(MaskedPart,' ') AS MaskedFullName
FROM Masking
GROUP BY Fullname;
Edit: Nevermind, OP has commented they are using 2008, so STRING_AGG is out of the question. #iamdave, however, has posted an answer which is very similar to my own, just do it the "old fashioned XML way".

Depending on your version of SQL Server, you may be able to use the built-in string split to rows on spaces in the name, do your string formatting, and then roll back up to name level using an XML path.
create table dataset (id int identity(1,1), name varchar(50));
insert into dataset (name) values
('John Smith'),
('Edgar Allen Poe'),
('One Two Three Four');
with split as (
select id, cs.Value as Name
from dataset
cross apply STRING_SPLIT (name, ' ') cs
),
formatted as (
select
id,
name,
left(name, 1) + '***' + right(name, 1) as out
from split
)
SELECT
id,
(SELECT ' ' + out
FROM formatted b
WHERE a.id = b.id
FOR XML PATH('')) [out_name]
FROM formatted a
GROUP BY id
Result:
id out_name
1 J***n S***h
2 E***r A***n P***e
3 O***e T***o T***e F***r

You can do that using this function.
create function [dbo].[fnMaskName] (#var_name varchar(100))
RETURNS varchar(100)
WITH EXECUTE AS CALLER
AS
BEGIN
declare #var_part varchar(100)
declare #var_return varchar(100)
declare #n_position smallint
set #var_return = ''
set #n_position = 1
WHILE #n_position<>0
BEGIN
SET #n_position = CHARINDEX(' ', #var_name)
IF #n_position = 0
SET #n_position = LEN(#var_name)
SET #var_part = SUBSTRING(#var_name, 1, #n_position)
SET #var_name = SUBSTRING(#var_name, #n_position+1, LEN(#var_name))
if #var_part<>''
SET #var_return = #var_return + stuff(#var_part, 2, len(#var_part)-2, replicate('*',len(#var_part)-2)) + ' '
END
RETURN(#var_return)
END

how to select data row from a comma separated value field

My question is not exactly but similar to this question
How to SELECT parts from a comma-separated field with a LIKE statement
but i have not seen any answer there. So I am posting my question again.
i have the following table
╔════════════╦═════════════╗
║ VacancyId ║ Media ║
╠════════════╬═════════════╣
║ 1 ║ 32,26,30 ║
║ 2 ║ 31, 25,20 ║
║ 3 ║ 21,32,23 ║
╚════════════╩═════════════╝
I want to select data who has media id=30 or media=21 or media= 40
So in this case the output will return the 1st and the third row.
How can I do that ?
I have tried media like '30' but that does not return any value. Plus i just dont need to search for one string in that field .
My database is SQL Server
Thank you

It's never good to use the comma separated values to store in database if it is feasible try to make separate tables to store them as most probably this is 1:n relationship.
If this is not feasible then there are following possible ways you can do this,
If your number of values to match are going to stay same, then you might want to do the series of Like statement along with OR/AND depending on your requirement.
Ex.-
WHERE
Media LIKE '%21%'
OR Media LIKE '%30%'
OR Media LIKE '%40%'
However above query will likely to catch all the values which contains 21 so even if columns with values like 1210,210 will also be returned. To overcome this you can do following trick which is hamper the performance as it uses functions in where clause and that goes against making Seargable queries.
But here it goes,
--Declare valueSearch variable first to value to match for you can do this for multiple values using multiple variables.
Declare #valueSearch = '21'
-- Then do the matching in where clause
WHERE
(',' + RTRIM(Media) + ',') LIKE '%,' + #valueSearch + ',%'
If the number of values to match are going to change then you might want to look into FullText Index and you should thinking about the same.
And if you decide to go with this after Fulltext Index you can do as below to get what you want,
Ex.-
WHERE
CONTAINS(Media, '"21" OR "30" OR "40"')

The best possible way i can suggest is first you have do comma separated value to table using This link and you will end up with table looks like below.
SELECT * FROM Table
WHERE Media in('30','28')
It will surely works.

You can use this, but the performance is inevitably poor. You should, as others have said, normalise this structure.
WHERE
',' + media + ',' LIKE '%,21,%'
OR ',' + media + ',' LIKE '%,30,%'
Etc, etc...

If you are certain that any Media value containing the string 30 will be one you wish to return, you just need to include wildcards in your LIKE statement:
SELECT *
FROM Table
WHERE Media LIKE '%30%'
Bear in mind though that this would also return a record with a Media value of 298,300,302 for example, so if this is problematic for you, you'll need to consider a more sophisticated method, like:
SELECT *
FROM Table
WHERE Media LIKE '%,30,%'
OR Media LIKE '30,%'
OR Media LIKE '%,30'
OR Media = '30'
If there might be spaces in the strings (as per in your question), you'll also want to strip these out:
SELECT *
FROM Table
WHERE REPLACE(Media,' ','') LIKE '%,30,%'
OR REPLACE(Media,' ','') LIKE '30,%'
OR REPLACE(Media,' ','') LIKE '%,30'
OR REPLACE(Media,' ','') = '30'
Edit: I actually prefer Coder of Code's solution to this:
SELECT *
FROM Table
WHERE ',' + LTRIM(RTRIM(REPLACE(Media,' ',''))) + ',' LIKE '%,30,%'
You mention that would wish to search for multiple strings in this field, which is also possible:
SELECT *
FROM Table
WHERE Media LIKE '%30%'
OR Media LIKE '%28%'
SELECT *
FROM Table
WHERE Media LIKE '%30%'
AND Media LIKE '%28%'

I agree not a good idea comma seperated values stored like that. Bu if you have to;
I think using inline function is will give better performance;
Select VacancyId, Media from (
Select 1 as VacancyId, '32,26,30' as Media
union all
Select 2, '31,25,20'
union all
Select 3, '21,32,23'
) asa
CROSS APPLY dbo.udf_StrToTable(Media, ',') tbl
where CAST(tbl.Result as int) in (30,21,40)
Group by VacancyId, Media
Output is;
VacancyId Media
----------- ---------
1 32,26,30
3 21,32,23
and our inline function script is;
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[udf_StrToTable]') and xtype in (N'FN', N'IF', N'TF'))
drop function [dbo].udf_StrToTable
GO
CREATE FUNCTION udf_StrToTable (#List NVARCHAR(MAX), #Delimiter NVARCHAR(1))
RETURNS TABLE
With Encryption
AS
RETURN
( WITH Split(stpos,endpos)
AS(
SELECT 0 AS stpos, CHARINDEX(#Delimiter,#List) AS endpos
UNION ALL
SELECT CAST(endpos+1 as int), CHARINDEX(#Delimiter,#List,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) as inx,
SUBSTRING(#List,stpos,COALESCE(NULLIF(endpos,0),LEN(#List)+1)-stpos) Result
FROM Split
)
GO

This solution uses a RECURSIVE CTE to identify the position of each comma within the string then uses SUBSTRING to return all strings between the commas.
I've left some unnecessary code in place to help you get you head round what it's doing. You can strip it down to provide exactly what you need.
DROP TABLE #TMP
CREATE TABLE #TMP(ID INT, Vals CHAR(100))
INSERT INTO #TMP(ID,VALS)
VALUES
(1,'32,26,30')
,(2,'31, 25,20')
,(3,'21,32,23')
;WITH cte
AS
(
SELECT
ID
,VALS
,0 POS
,CHARINDEX(',',VALS,0) REM
FROM
#TMP
UNION ALL
SELECT ID,VALS,REM,CHARINDEX(',',VALS,REM+1)
FROM
cte c
WHERE CHARINDEX(',',VALS,REM+1) > 0
UNION ALL
SELECT ID,VALS,REM,LEN(VALS)
FROM
cte c
WHERE POS+1 < LEN(VALS) AND CHARINDEX(',',VALS,REM+1) = 0
)
,cte_Clean
AS
(
SELECT ID,CAST(REPLACE(LTRIM(RTRIM(SUBSTRING(VALS,POS+1,REM-POS))),',','') AS INT) AS VAL FROM cte
WHERE POS <> REM
)
SELECT
ID
FROM
cte_Clean
WHERE
VAL = 32
ORDER BY ID

Can i use a case statement to convert a varchar to decimal and use that in my where clause?

I have a column which I want to convert to decimal so I can then use it to compare in my where clause. I want to make sure all values from the column are greater or equal to 1.3. I converted the column successfully in the select statement but when attempting to do the same convert in the where clause I get the following error:
Arithmetic overflow error converting varchar to data type numeric.
I am using SQL Server 2008.
SELECT ID,
CASE
WHEN ISNUMERIC(USER_3) = 1
THEN Convert(varchar(50), CONVERT(decimal(14,2), USER_3))
END AS KG_M
FROM PART
WHERE USER_3 IS NOT NULL
AND CASE
WHEN ISNUMERIC(USER_3) = 1
THEN Convert(varchar(50), CONVERT(decimal(14,2), USER_3))
END >= 1.3

Sure, why not? Here's a self-contained example:
select a.ID
, b.KG_M
from (values
(1, N'12345678')
, (2, N'ABCDEFGH')
) as a (ID, USER_3)
cross apply (values(
case IsNumeric(a.USER_3)
when 1 then Convert(varchar(50), Convert(decimal(14, 2), a.USER_3))
else a.USER_3
end
)) as b (KG_M)
where b.KG_M >= '1.3';
We simply use the APPLY operator to contain our calculation for reuse later.

You need to choose one way to convert. I would use the native type for comparison, decimal.
SELECT * FROM
(
SELECT ID, KG_M=CAST(USER_3 AS decimal(14,2))
FROM PART
WHERE
ISNUMERIC(USER_3) = 1
)AS X
WHERE
X.KG_M >= 1.3
Allow strings that are not numbers in outoput
SELECT * FROM
(
SELECT
ID,
USER_3_AsDecimal=CASE WHEN ISNUMERIC(USER_3) THEN CAST(USER_3 AS decimal(14,2)) ELSE NULL END,
USER_3
FROM PART
WHERE
NOT USER_3 IS NULL
)AS X
WHERE
X.USER_3_AsDecimal IS NULL
OR
X.USER_3_AsDecimal >= 1.3

The problem was a syntax error, the case in the where clause was a success the entire time.
"you should use >= '1.3' since you are converting to varchar" credit to #Lamak in comments