Find all special characters in a column in SQL Server 2008 - sql-server

I need to find the occurrence of all special characters in a column in SQL Server 2008. So, I don't care about A, B, C ... 8, 9, 0, but I do care about !, @, &, etc.
The easiest way to do so, in my mind, would exclude A, B, C, ... 8, 9, 0, but if I wrote a statement to exclude those, I would miss entries that had ! and A. So, it seems to me that I would have to get a list of every non-alphabet / non-number character, then run a SELECT with a LIKE and Wildcard qualifiers.
Here is what I would run:
SELECT Col1
FROM TABLE
WHERE Col1 LIKE ('!', '@', '#', '$', '%'....)
However, I don't think you can run multiple qualifiers, can you? Is there a way I could accomplish this?

Negatives are your friend here:
SELECT Col1
FROM TABLE
WHERE Col1 like '%[^a-Z0-9]%'
Which says that you want any rows where Col1 consists of any number of characters, then one character not in the set a-Z0-9, and then any number of characters.
If you have a case-sensitive collation, it's important that you use a range that includes both upper and lower case A, a, Z and z, which is what I've given (originally I had it the wrong way around: a comes before A, and Z comes after z).
Or, to put it another way, you could have written your original WHERE as:
Col1 LIKE '%[!@#$%]%'
But, as you observed, you'd need to know all of the characters to include in the [].
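As a minimal sketch of the negated-class approach: the table variable @samples and its values are made up for illustration, and the range is spelled out as a-zA-Z0-9 rather than relying on how a-Z sorts under your collation. PATINDEX can also report where the first special character sits in each matching row:

```sql
-- Hypothetical sample data; @samples and its values are illustrative only.
DECLARE @samples TABLE (Col1 varchar(50));
INSERT INTO @samples (Col1)
VALUES ('ABC123'), ('A!B'), ('under_score'), ('plain');

-- Keep only rows with at least one character outside a-z, A-Z, 0-9,
-- and show the position and value of the first such character.
SELECT Col1,
       PATINDEX('%[^a-zA-Z0-9]%', Col1) AS FirstSpecialPos,
       SUBSTRING(Col1, PATINDEX('%[^a-zA-Z0-9]%', Col1), 1) AS FirstSpecialChar
FROM @samples
WHERE Col1 LIKE '%[^a-zA-Z0-9]%';
```

With a typical Latin1_General collation this should return 'A!B' (position 2, '!') and 'under_score' (position 6, '_'). It only reports the first special character per row, and under non-binary collations the letter ranges may also cover accented letters, so check the behaviour against your own collation.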

The following Transact-SQL script works for all languages (international). The solution is not to check for alphanumeric characters but to check that the string contains no special characters.
DECLARE @teststring nvarchar(max)
SET @teststring = 'Test''Me'
SELECT 'IS ALPHANUMERIC: ' + @teststring
WHERE @teststring NOT LIKE '%[-!#%&+,./:;<=>@`{|}~"()*\\\_\^\?\[\]\'']%' ESCAPE '\'

Select * from TableName Where ColumnName LIKE '%[^A-Za-z0-9, ]%'
This will give you all the rows that contain any special character. Commas and spaces are in the excluded set, so they are not treated as special here.

select count(*) from dbo.tablename where address_line_1 LIKE '%[\'']%' ESCAPE '\'

Related

Is there a way to find values that contain only 0's and a symbol of any length?

I want to find strings of any length that contain only 0's and a symbol such as a / a . or a -
Examples include 0__0 and 000/00/00000 and .00000
Considering this sample data:
CREATE TABLE dbo.things(thing varchar(255));
INSERT dbo.things(thing) VALUES
('0__0'),('000/00/00000'),('00000'),('0123456');
Try the following, which locates the first position of any character that is NOT a 0, a period, a forward slash, or an underscore. PATINDEX returns 0 if the pattern is not found.
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0./_]%', thing) = 0;
Results:
thing
0__0
000/00/00000
00000
The opposite:
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0./_]%', thing) > 0;
Results:
thing
0123456
Example db<>fiddle
I can see a way of doing this, but it wouldn't perform well if you use it as a search criterion.
We are going to use the TRANSLATE function (available from SQL Server 2017) to replace the allowed characters, or symbols as you've called them, with zeros, and then remove the zeros. If the result is an empty string, there are two possible cases: either the value contained only zeros and allowed characters, or it was already an empty string.
So, by checking for this and for non-empty strings, we can determine whether a value matches your criteria.
-- Test scenario
create table #example (something varchar(200) )
insert into #example(something) values
--Example cases from Stack Overflow
('0__0'),('000/00/00000'),('.00000'),
-- With something not allowed (don't know, just put a number)
('1230__0'),('000/04560/00000'),('.00000789'),
-- Just not allowed characters, zero, blank, and NULL
('1234567489'),('0'), (''),(null)
-- Shows the data, with a column to check if it matches your criteria
select *
from #example e
cross apply (
select case when
-- If it *must* have at least a zero
e.something like '%0%' and
-- Eliminates zeroes
replace(
-- Replaces the allowed characters with zero
translate(
e.something
,'_./'
,'000'
)
,'0'
,''
) = ''
then cast(1 as bit)
else cast(0 as bit)
end as doesItMatch
) as criteria(doesItMatch)
I really discourage you from using this as a search criteria.
-- Queries the table over this criteria.
-- This is going to compute over your entire table, so it can get very CPU intensive
select *
from #example e
where
-- If it *must* have at least a zero
e.something like '%0%' and
-- Eliminates zeroes
replace(
-- Replaces the allowed characters with zero
translate(
e.something
,'_./'
,'000'
)
,'0'
,''
) = ''
If you must use this as a search criterion, and it will be a common filter in your application, I suggest you create a new bit column to flag whether a row matches, and index it. That way the extra computational effort is spread across the inserts/updates/deletes, and the search queries won't overload the database.
The code can be seen executing here, on DB Fiddle.
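One way to get such a flag without maintaining it by hand is a persisted computed column. This is only a sketch: dbo.example, its columns and the index name are hypothetical, and it swaps the TRANSLATE/REPLACE test for an equivalent pair of LIKE checks, so treat it as a starting point rather than this answer's own implementation.

```sql
-- Hypothetical table standing in for your real one.
CREATE TABLE dbo.example (
    id int IDENTITY(1,1) PRIMARY KEY,
    something varchar(200) NULL,
    -- 1 when the value has at least one zero and nothing except 0 _ . /
    onlyZerosAndSymbols AS CAST(
        CASE WHEN something LIKE '%0%'
              AND something NOT LIKE '%[^0_./]%'
             THEN 1 ELSE 0 END AS bit) PERSISTED
);

-- Index the stored flag so searches can seek on it.
CREATE INDEX IX_example_onlyZerosAndSymbols
    ON dbo.example (onlyZerosAndSymbols);

-- The search filters on the stored flag instead of recomputing it per row.
SELECT id, something
FROM dbo.example
WHERE onlyZerosAndSymbols = 1;
```

The flag is computed when a row is inserted or updated, which is where the answer suggests moving the cost. Whether the index actually helps depends on how selective the flag is in your data.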
What I got from the question is that the strings must contain both 0 and any combination of the special characters in the string.
If you have SQL Server 2017 and above, you can use translate() to replace multiple characters with a space and compare this with the empty string. Also you can use LIKE to enforce that both a 0 and any combination of the special character(s) appear at least once:
DECLARE @temp TABLE (val varchar(100))
INSERT INTO @temp VALUES
('0__0'), ('000/00/00000'), ('.00000'), ('w0hee/'), ('./')
SELECT *
FROM @temp
WHERE val LIKE '%0%' --must have at least one zero somewhere
AND val LIKE '%[_/.]%' --must have at least one special character somewhere
AND TRANSLATE(val, '0./_', '    ') = '' --four spaces, one per translated character; an all-space string compares equal to ''
Creates output:
val
0__0
000/00/00000
.00000
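If TRANSLATE is not available (it needs SQL Server 2017 or later), the same three conditions can be sketched with LIKE alone by using a negated character class for the "nothing else" check. The extra row '00000' is an illustrative addition to the sample data, not part of the original question.

```sql
-- Same sample data as above, plus '00000' (zeros only) as an extra illustrative row.
DECLARE @temp TABLE (val varchar(100));
INSERT INTO @temp (val) VALUES
('0__0'), ('000/00/00000'), ('.00000'), ('w0hee/'), ('./'), ('00000');

SELECT val
FROM @temp
WHERE val LIKE '%0%'            -- at least one zero
  AND val LIKE '%[_/.]%'        -- at least one of the allowed symbols
  AND val NOT LIKE '%[^0_/.]%'; -- no character outside 0 _ / .
```

This should return the same three rows: 'w0hee/' is rejected for its letters, './' for having no zero, and '00000' for having no symbol.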

Return words in between specific phrases in string in T-SQL

My column Details returns a big message such as the one below, and the only thing I want to extract is the number 874659.29. This number varies among rows, but it will always come between ,"CashAmount": and a comma (,).
There will be only one ,"CashAmount": but several commas after it.
dhfgdh&%^&%,"CashAmount":874659.29,"Hasdjhf"&^%^%
Therefore, I was wondering if I could use anything to only show the number in my output column.
Thanks in advance!
Here is another option for this just using some string manipulation.
declare @Details varchar(100) = 'dhfgdh&%^&%,"CashAmount":874659.29,"Hasdjhf"&^%^%'
select left(substring(@Details, CHARINDEX('CashAmount":', @Details) + 12 /*12 is the length of CashAmount":*/, LEN(@Details))
, charindex(',', substring(@Details, CHARINDEX('CashAmount":', @Details) + 12, LEN(@Details))) - 1)
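The same string manipulation can be applied to a whole column. The following is a sketch under assumptions: the table variable @messages and its second row are made up for illustration, and CROSS APPLY is used only to name the text after the marker once instead of repeating the SUBSTRING expression.

```sql
-- Hypothetical rows; the second one has no CashAmount and is filtered out.
DECLARE @messages TABLE (Details varchar(200));
INSERT INTO @messages (Details) VALUES
('dhfgdh&%^&%,"CashAmount":874659.29,"Hasdjhf"&^%^%'),
('no amount in this message');

SELECT LEFT(t.tail, CHARINDEX(',', t.tail + ',') - 1) AS CashAmount
FROM @messages AS m
CROSS APPLY (
    -- Everything that follows the "CashAmount": marker.
    SELECT SUBSTRING(m.Details,
                     CHARINDEX('"CashAmount":', m.Details) + LEN('"CashAmount":'),
                     LEN(m.Details)) AS tail
) AS t
WHERE m.Details LIKE '%"CashAmount":%';
```

Appending a comma inside CHARINDEX means a number that happens to be the last thing in the string is still returned whole. The result is still a string, so cast it if you need a numeric type.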
You could use one of the split string functions as described here..
declare @string varchar(max)
set @string='dhfgdh&%^&%,"CashAmount":874659.29,"Hasdjhf"&^%^%'
select b.val from
[dbo].[SplitStrings_Numbers](@string,',')a
cross apply
(
select isnumeric(replace(a.item,'"CashAmount":','')),replace(a.item,'"CashAmount":','')
) b(chk,val)
where b.chk=1
Output:
874659.29
The above will work only if the number comes after "CashAmount": and before a comma, and if it doesn't contain any special characters.
If your number does contain special characters, you can use TRY_PARSE and check for NULL.

sql like statement picking up unexpected results

I have a simple table like the following
id, target
-----------
1, test_1
2, test_2
3, test_3
4, testable
I have a simple query like so:
select * from my_table where target like 'test_%'
What I'm expecting are the first 3 records but I'm getting all 4 records
See SQLFiddle example here
Underscore is a pattern matching character. Try this:
select * from my_table where target like 'test[_]%'
_ is also a wildcard. You can escape it like:
... like 'test\_%' escape '\'
The underscore character _ as you've used it is a wildcard for a single character, hence it returns 4 rows. Try using [_] instead of _.
To illustrate..
CREATE TABLE #tmp (val varchar(10))
INSERT INTO #tmp (val)
VALUES ('test_1'), ('test_2'), ('test_3'), ('testing')
-- This returns all four
SELECT * FROM #tmp WHERE val LIKE 'test_%'
-- This returns the three test_ rows
SELECT * FROM #tmp WHERE val LIKE 'test[_]%'
The underscore is a wildcard character that says "match any single character", just like the % is a wildcard that says "match any 0 or more characters". If you're familiar with Regular Expressions, the underscore character is equivalent to the dot there. You'll need to properly escape the underscore to match that character literally.

valid record in SQL query

I have a table with a few columns, and one of the columns is DockNumber. I have to display the dock numbers if they conform to a particular format:
The first five characters are numbers, followed by a - and then 5 more characters. The last but one character should be an alpha.
12345-678V9
How can I check in SQL that the first 5 characters are numbers, there is a hyphen, the next 3 are numbers, and the last but one is an alpha?
Building on @gbn's answer, this checks that the length is 11 (in case @val is not a char(11) or varchar(11)) and also checks that the second-to-last character is alpha:
DECLARE @val VARCHAR(20)
SET @val = '12345-678V9'
SELECT CASE WHEN LEN(@val) = 11 AND @val LIKE '[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][A-Z][0-9]'
THEN 'isMatch'
ELSE 'isNotMatch'
END AS [Valid]
You can use this, although you will have to figure out how to fit it into your query:
SELECT CASE WHEN
CAST(ISNUMERIC(LEFT(@Str, 5)) AS int) + CASE WHEN SUBSTRING(@Str, 6, 1) = '-' THEN 1 ELSE 0 END + CASE WHEN SUBSTRING(@Str, 10, 1) LIKE '[a-z]' THEN 1 ELSE 0 END = 3
THEN 'Matched'
ELSE 'NotMatched'
END
Regular Expressions can be your friend.
LIKE '[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][A-Z][0-9]'
Now, this allows lower case a-z too. You'd need to coerce the collation if you wanted upper case only:
Value COLLATE Latin1_General_BIN
LIKE '[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][A-Z][0-9]' COLLATE Latin1_General_BIN
PATINDEX is probably the ideal solution.
Select ...
From Table
Where PatIndex('[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][A-Z][0-9]', DockNumber) > 0
Where DockNumber Like '[0-9][0-9][0-9][0-9][0-9][-][0-9][0-9][0-9][a-z][0-9]'
should work, but I would suggest using a regular expression in code. Much easier if it is possible.
The regex should be '^\d{5}-\d{3}[A-Z]\d$', because without ^ and $ it would find longer strings that contain that sequence (122 12345-678V9 34).
Use a rule:
CREATE RULE pattern_rule
AS
@value LIKE '[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][A-Z][0-9]'
Then bind the rule to the column.
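Rules are marked as deprecated in Microsoft's documentation, which points to CHECK constraints instead. As a sketch of that alternative (the table dbo.Docks and the constraint name are hypothetical), the same pattern of five digits, a hyphen, three digits, a letter and a digit can be enforced when the table is created:

```sql
-- Hypothetical table; only the constraint matters here.
CREATE TABLE dbo.Docks (
    DockNumber varchar(11) NOT NULL
        CONSTRAINT CK_Docks_DockNumber_Format
        CHECK (DockNumber LIKE '[0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][A-Z][0-9]')
);

-- Accepted: matches the 12345-678V9 shape.
INSERT INTO dbo.Docks (DockNumber) VALUES ('12345-678V9');

-- Rejected with a constraint violation: the ninth character must be a digit
-- and the tenth a letter.
-- INSERT INTO dbo.Docks (DockNumber) VALUES ('12345-67899');
```

Under a case-insensitive collation the [A-Z] range also accepts lower-case letters. A constraint validates new data on the way in, whereas the query-based answers filter rows that already exist.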

SQL Server: sort a column numerically if possible, otherwise alpha

I am working with a table that comes from an external source, and cannot be "cleaned". There is a column which is an nvarchar(20) and contains an integer about 95% of the time, but occasionally contains an alpha. I want to use something like
select * from sch.tbl order by cast(shouldBeANumber as integer)
but this throws an error on the odd "3A" or "D" or "SUPERCEDED" value.
Is there a way to say "sort it like a number if you can, otherwise just sort by string"? I know there is some sloppiness in that statement, but that is basically what I want.
Lets say for example the values were
7,1,5A,SUPERCEDED,2,5,SECTION
I would be happy if these were sorted in any of the following ways (because I really only need to work with the numeric ones)
1,2,5,7,5A,SECTION,SUPERCEDED
1,2,5,5A,7,SECTION,SUPERCEDED
SECTION,SUPERCEDED,1,2,5,5A,7
5A,SECTION,SUPERCEDED,1,2,5,7
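"I really only need to work with the numeric ones"
This will give you only the numeric ones, sorted numerically:
SELECT
*
FROM YourTable
WHERE ISNUMERIC(YourColumn)=1
-- Cast so that 10 sorts after 2. ISNUMERIC also accepts values such as '$' or '1e5', which would fail this cast.
ORDER BY CASE WHEN ISNUMERIC(YourColumn)=1 THEN CAST(YourColumn AS int) END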
select
*
from
sch.tbl
order by
case isnumeric(shouldBeANumber)
when 1 then cast(shouldBeANumber as integer)
else 0
end
Provided that your numbers are not more than 100 characters long:
WITH chars AS
(
SELECT 1 AS c
UNION ALL
SELECT c + 1
FROM chars
WHERE c <= 99
),
rows AS
(
SELECT '1,2,5,7,5A,SECTION,SUPERCEDED' AS mynum
UNION ALL
SELECT '1,2,5,5A,7,SECTION,SUPERCEDED'
UNION ALL
SELECT 'SECTION,SUPERCEDED,1,2,5,5A,7'
UNION ALL
SELECT '5A,SECTION,SUPERCEDED,1,2,5,7'
)
SELECT rows.*
FROM rows
ORDER BY
(
SELECT SUBSTRING(mynum, c, 1) AS [text()]
FROM chars
WHERE SUBSTRING(mynum, c, 1) BETWEEN '0' AND '9'
FOR XML PATH('')
) DESC
SELECT
(CASE ISNUMERIC(shouldBeANumber)
WHEN 1 THEN
-- CONCAT requires SQL Server 2012 or later
RIGHT(CONCAT('00000000',shouldBeANumber), 8)
ELSE
shouldBeANumber
END) AS stringSortSafeAlpha
FROM sch.tbl
ORDER BY
stringSortSafeAlpha
This will add leading zeros to all shouldBeANumber values that truly are numbers and leave all remaining values alone. This way, when you sort, you can use an alpha sort but still get the correct values (with an alpha sort, "100" would be less than "50", but if you change "50" to "050", it works fine). Note, for this example, I added 8 leading zeros, but you only need enough leading zeros to cover the largest possible integer in your column.
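To combine the ideas above, here is a sketch that sorts digit-only values numerically first and everything else alphabetically after them. It assumes the numeric values are non-negative integers that fit in a bigint. The table variable @t is illustrative, and the digit-only LIKE test is swapped in for ISNUMERIC so that values like '$' or '1e5' never reach the cast.

```sql
-- Sample values taken from the question; @t is a stand-in for sch.tbl.
DECLARE @t TABLE (shouldBeANumber nvarchar(20));
INSERT INTO @t (shouldBeANumber) VALUES
(N'7'), (N'1'), (N'5A'), (N'SUPERCEDED'), (N'2'), (N'5'), (N'SECTION');

SELECT t.shouldBeANumber
FROM @t AS t
CROSS APPLY (
    -- 1 only when the value is non-empty and contains nothing but digits.
    SELECT CASE WHEN t.shouldBeANumber <> N''
                 AND t.shouldBeANumber NOT LIKE N'%[^0-9]%'
                THEN 1 ELSE 0 END AS isInt
) AS f
ORDER BY
    CASE WHEN f.isInt = 1 THEN 0 ELSE 1 END,                          -- numbers first
    CASE WHEN f.isInt = 1 THEN CAST(t.shouldBeANumber AS bigint) END, -- numeric order
    t.shouldBeANumber;                                                -- then alphabetical
```

For the sample values this should come back as 1, 2, 5, 7, 5A, SECTION, SUPERCEDED, which is the first of the orderings the question said would be acceptable.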

Resources