Regex for a serial number in T-SQL - sql-server

I'm a novice at regexes and am currently trying to come up with a simple regex that searches for a serial number in the following format: 0217103200XX, where each "X" can be any numeric digit. I'm using SQL Server Management Studio to pass the regex as a parameter to a stored procedure. I'm not sure if the syntax is any different from other programming languages. I have the following regex as a reference:
(?:2328\d\d(?:0[1-9]|[1-4]\d|5[0-3])\d{4})
Any suggestions are appreciated.
UPDATE:
I'm actually using this in a SQL Query and not in a .Net application. The format is as follows:
USE [MyDB]
EXEC MyStoredProcedure @regex = '(?:2328\d\d(?:0[1-9]|[1-4]\d|5[0-3])\d{4})'

Use LIKE: there is no native RegEx in SQL Server
LIKE '0217103200[0-9][0-9]'

As OMG Ponies stated - SQL Server does not natively support regex (need to use SQLCLR for 2005+, or xp_cre).
If I have understood your question, you could use PATINDEX to find the serial numbers:
Select *
From dbo.MyTable
Where PATINDEX('0217103200[0-9][0-9]', SerialNumberColumn) > 0
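As a quick check of how these patterns behave, here is a minimal sketch run against a few made-up values (the sample serial numbers and the alias v are assumptions, not data from the question, and the VALUES constructor assumes SQL Server 2008 or later). Without % wildcards, both LIKE and PATINDEX match only when the whole string fits the pattern; wrapping the pattern in % lets PATINDEX find it anywhere in the string.

```sql
-- Hypothetical sample data; only the pattern comes from the answers above.
SELECT v.SerialNumber,
       CASE WHEN v.SerialNumber LIKE '0217103200[0-9][0-9]'
            THEN 1 ELSE 0 END                               AS LikeMatch,
       PATINDEX('0217103200[0-9][0-9]', v.SerialNumber)     AS PatIndexWholeString,
       PATINDEX('%0217103200[0-9][0-9]%', v.SerialNumber)   AS PatIndexAnywhere
FROM (VALUES ('021710320042'),        -- exact match
             ('0217103200A1'),        -- letter where a digit is required
             ('SN-021710320042-X')    -- serial number embedded in a longer string
     ) AS v(SerialNumber);
-- 021710320042       -> LikeMatch 1, PatIndexWholeString 1, PatIndexAnywhere 1
-- 0217103200A1       -> LikeMatch 0, PatIndexWholeString 0, PatIndexAnywhere 0
-- SN-021710320042-X  -> LikeMatch 0, PatIndexWholeString 0, PatIndexAnywhere 4
```

If the column holds only the serial number, the un-wildcarded form is the stricter and safer choice.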

Related

What is regexp_replace equivalent in SQL Server

I have this piece of code in Oracle which I need to convert into SQL Server to get the same behavior. I have used the REPLACE function. It seems to be working but I just wanted to make sure.
REGEXP_REPLACE(
phonenumber,
'([[:digit:]]{3})([[:digit:]]{3})([[:digit:]]{4})',
'(\1)\2-\3'
) phonenumber
As Martin said in his answer, SQL Server does not have built-in RegEx functionality (and while it has not been suggested here, just to be clear: no, the [...] wildcard of LIKE and PATINDEX is not RegEx). If your data has little to no variation then yes, you can use some combination of T-SQL functions: REPLACE, SUBSTRING, LEFT, RIGHT, CHARINDEX, PATINDEX, FORMATMESSAGE, CONCAT, and maybe one or two others.
However, if the data / input has even a moderate level of complexity, then the built-in T-SQL functions will at best be cumbersome, and at worst useless. In such cases it's possible to do actual RegEx via SQLCLR (as long as you aren't using Azure SQL Database Single DB or SQL Server 2017+ via AWS RDS), which is (restricted) .NET code running within SQL Server. You can either code your own / find examples here on S.O. or elsewhere, or try a pre-done library such as the one I created, SQL# (SQLsharp), the Free version of which contains several RegEx functions. Please note that SQLCLR, being .NET, is not a POSIX-based RegEx, and hence does not use POSIX character classes (meaning: you will need to use \d for "digits" instead of [:digit:]).
The level of complexity needed in this particular situation is unclear as the example code in the question implies that the data is simple and uniform (i.e. 1112223333) but the example data shown in a comment on the question appears to indicate that there might be dashes and/or spaces in the data (i.e. xxx- xxx xxxx).
If the data truly is uniform, then stick with the pure T-SQL solution provided by @MartinSmith. But, if the data is of sufficient complexity, then please consider the RegEx example below, using a SQLCLR function found in the Free version of my SQL# library (as mentioned earlier), that easily handles the 3 variations of input data and more:
SELECT SQL#.RegEx_Replace4k(tmp.phone,
N'\(?(\d{3})\)?[ .-]*(\d{3})[ .-]*(\d{4})', N'($1)$2-$3',
-1, -- count (-1 == unlimited)
1, -- start at
N'') -- RegEx options
FROM (VALUES (N'8885551212'),
(N'123- 456 7890'),
(N'(777) 555- 4653')
) tmp([phone]);
returns:
(888)555-1212
(123)456-7890
(777)555-4653
The RegEx pattern allows for:
0 or 1 (
3 decimal digits
0 or 1 )
0 or more of space, ., or -
3 decimal digits
0 or more of space, ., or -
4 decimal digits
NOTE
It was mentioned that the newer Language Extensions might be a better choice than SQLCLR. Language Extensions allow calling R / Python / Java code, hosted outside of SQL Server, via the sp_execute_external_script stored procedure. As the Tutorial: Search for a string using regular expressions (regex) in Java page shows, external scripts are actually not a good choice for many / most uses of RegEx in SQL Server. The main problems are:
Unlike with SQLCLR, the only interface for external scripts is a stored procedure. This means that you can't use any of that functionality inline in a query (SELECT, WHERE, etc).
With external scripts, you pass in the query, work on the results in the external language, and pass back a static result set. This means that compiled code now has to be more specialized (i.e. tightly-coupled) to the particular usage. Changing how the query uses RegEx and/or what columns are returned now requires editing, compiling, testing, and deploying the R / Python / Java code in addition to (and coordinated with!) the T-SQL changes.
I'm sure external scripts are absolutely wonderful, and a better choice than SQLCLR, in certain scenarios. But they certainly do not lend themselves well to the highly varied, and often ad hoc, nature of how RegEx is used (like many / most other functions).
SQL Server does not have native regex support. You would need to use CLR (or, as @Lukasz Szozda points out in the comments, one of the newer Language Extensions).
If I have understood the regex correctly, though, it matches strings of 10 digits, assigns the first 3 to group 1, the second 3 to group 2, and the last 4 to group 3, and then uses the back references in the expression (\1)\2-\3.
You can use built-in string functions to do this, as below:
SELECT CASE
WHEN phonenumber LIKE REPLICATE('[0-9]', 10)
THEN FORMATMESSAGE('(%s)%s-%s',
LEFT(phonenumber, 3),
SUBSTRING(phonenumber, 4, 3),
RIGHT(phonenumber, 4))
ELSE phonenumber
END
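To see which inputs the built-in string functions handle, the same expression can be run against a couple of sample values. This is only a sketch: the sample rows and the alias tmp are made up for illustration, and FORMATMESSAGE with a literal format string assumes SQL Server 2016 or later.

```sql
-- Hypothetical sample rows; the CASE expression is the one from the answer above.
SELECT tmp.phonenumber AS original,
       CASE
           WHEN tmp.phonenumber LIKE REPLICATE('[0-9]', 10)
               THEN FORMATMESSAGE('(%s)%s-%s',
                                  LEFT(tmp.phonenumber, 3),
                                  SUBSTRING(tmp.phonenumber, 4, 3),
                                  RIGHT(tmp.phonenumber, 4))
           ELSE tmp.phonenumber   -- anything that is not exactly 10 digits is left unchanged
       END AS formatted
FROM (VALUES ('8885551212'),
             ('123- 456 7890')) AS tmp(phonenumber);
-- 8885551212     -> (888)555-1212
-- 123- 456 7890  -> 123- 456 7890 (unchanged)
```

The second row shows the limitation discussed above: once dashes or spaces appear in the data, the pure T-SQL version passes the value through untouched.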
You can write a SQL function using CLR that wraps the standard .NET regex. I have written one, and you can use it. It looks like this:
DECLARE @SourceText NVARCHAR(MAX) = N'My first line <br /> My second line';
DECLARE @RegexPattern NVARCHAR(MAX) = N'([<]br\s*/[>])';
DECLARE @Replacement NVARCHAR(MAX) = N'';
DECLARE @IsCaseSensitive BIT = 0;
SELECT regex.Replace(@SourceText, @RegexPattern, @Replacement, @IsCaseSensitive);

I need a command that acts exactly like the Translate command in SQL Server

I need to scramble the data as follows: if a person's name is William, the output should be Jihhiar. This can be done using the TRANSLATE command in Oracle SQL, but in SQL Server the TRANSLATE command does not work for me.
Hence I need help identifying the equivalent of TRANSLATE in SQL Server.
Use REPLACE
select REPLACE(TableName.PersonName, 'William', 'Jihhiar')
Not sure why you say TRANSLATE doesn't work:
SELECT TRANSLATE('William','Wlm','Jhr');
This returns the value 'Jihhiar'.
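One possible explanation for the difference in experience: TRANSLATE was introduced in SQL Server 2017, so it is not available on older versions. A minimal sketch of a workaround for those versions is to nest one REPLACE per character. This is an assumption about the asker's situation, not something stated in the question.

```sql
-- SQL Server 2017 and later: one-to-one character mapping in a single call.
SELECT TRANSLATE('William', 'Wlm', 'Jhr') AS translated;

-- Earlier versions: nest one REPLACE per character.
-- Caveat: the REPLACE calls run one after another, so this only matches
-- TRANSLATE when no replacement character is also a later source character.
SELECT REPLACE(REPLACE(REPLACE('William', 'W', 'J'), 'l', 'h'), 'm', 'r') AS replaced;

-- Both statements return 'Jihhiar'.
```

Unlike the single REPLACE of the whole name shown earlier, this maps characters, so it applies to any name rather than only to 'William'.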

Fuzzy matching SQL Syntax

How can we write something in SQL that performs matching similar to the SSIS Fuzzy Matching component?
What options do we have available using SQL Server features and SQL syntax?
Thanks,
You can use the full text indexing feature of SQL server, together with the associated functions CONTAINS, RANK, etc.
The easiest way to do fuzzy matching in T-SQL would be using SOUNDEX() and DIFFERENCE().
For example
select
soundex('SQL') as 'four-character (SOUNDEX) code' -- Returns S240
, soundex('Sequel') as 'four-character (SOUNDEX) code' -- Returns S240
, difference('SQL', 'Sequel') as '0: weak or no similarity. 4: strong similarity or the same values.' -- Returns 4
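The same two functions can be used to filter rows rather than just compare two literals. The following is a rough sketch only: the sample names, the variable @search, and the threshold of 3 are all assumptions chosen for illustration, and a suitable threshold depends on the data.

```sql
-- Hypothetical search term and sample names.
DECLARE @search VARCHAR(50) = 'Smith';

SELECT n.name,
       SOUNDEX(n.name)             AS soundex_code,
       DIFFERENCE(n.name, @search) AS similarity   -- 0 (weak) to 4 (strong)
FROM (VALUES ('Smith'), ('Smyth'), ('Schmidt'), ('Jones')) AS n(name)
WHERE DIFFERENCE(n.name, @search) >= 3             -- keep only close phonetic matches
ORDER BY similarity DESC, n.name;
```

SOUNDEX is a phonetic code designed around English pronunciation, so this is a much coarser match than the SSIS Fuzzy Lookup component provides.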

MS SQL server - convert HEX string to integer

This answer to what looks like the same question:
Convert integer to hex and hex to integer
..does not work for me.
I am not able to go from a HEX string to an integer using MS SQL Server 2005 CAST or CONVERT. Am I missing something trivial? I have searched extensively, and the best I can find are long-winded user functions to go from a hex string value to something that looks like a decimal int. Surely there is a simple way to do this directly in a query using built-in functions rather than writing a user function?
Thanks
Edit to include examples:
select CONVERT(INT, 0x89)
works as expected, but
select CONVERT(INT, '0x' + substring(msg, 66, 2)) from sometable
gets me:
"Conversion failed when converting the varchar value '0x89' to data type int."
An extra explicit CAST:
select CONVERT(INT, CAST('0x89' AS VARBINARY))
executes, but returns 813185081.
Substituting 'Int', 'Decimal', etc for 'Varbinary' results in an error. In general, strings that appear to be numeric are interpreted as numeric if required, but not in this case, and there does not appear to be a CAST that recognizes HEX. I would like to think there is something simple and obvious and I've just missed it.
Microsoft SQL Server Management Studio Express 9.00.3042.00
Microsoft SQL Server 2005 - 9.00.3080.00 (Intel X86) Sep 6 2009 01:43:32 Copyright (c) 1988-2005 Microsoft Corporation Express Edition with Advanced Services on Windows NT 5.1 (Build 2600: Service Pack 3)
To sum up: I want to take a hex string which is a value in a table, and display it as part of a query result as a decimal integer, using only system defined functions, not a UDF.
Thanks for giving some more explicit examples. As far as I can tell from the documentation and Googling, this is not possible in MSSQL 2005 without a UDF or other procedural code. In MSSQL 2008 the CONVERT() function's style parameter now supports binary data, so you can do it directly like this:
select convert(int, convert(varbinary, '0x89', 1))
In previous versions, your choices are:
Use a UDF (TSQL or CLR; CLR might actually be easier for this)
Wrap the SELECT in a stored procedure (but you'll probably still have the equivalent of a UDF in it anyway)
Convert it in the application front end
Upgrade to MSSQL 2008
If converting the data is only for display purposes, the application might be the easiest solution: data formatting usually belongs there anyway. If you must do it in a query, then a UDF is easiest but the performance may not be great (I know you said you preferred not to use a UDF but it's not clear why). I'm guessing that upgrading to MSSQL 2008 just for this probably isn't realistic.
Finally, FYI the version number you included is the version of Management Studio, not the version number of your server. To get that, query the server itself with select @@version or select serverproperty('ProductVersion').
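To make the difference concrete, here is a small sketch comparing the conversions discussed above (it assumes SQL Server 2008 or later for the style argument). Without a style, the string '0x89' is converted character by character into the bytes 0x30 0x78 0x38 0x39, which is why the result is 813185081 rather than 137.

```sql
SELECT
    -- No style: the four characters '0', 'x', '8', '9' become bytes 0x30783839.
    CONVERT(INT, CAST('0x89' AS VARBINARY))      AS CharacterBytesAsInt,  -- 813185081
    -- Style 1: the string is parsed as hex digits and must start with '0x'.
    CONVERT(INT, CONVERT(VARBINARY, '0x89', 1))  AS HexWithPrefix,        -- 137
    -- Style 2: the string is parsed as hex digits with no '0x' prefix.
    CONVERT(INT, CONVERT(VARBINARY, '89', 2))    AS HexWithoutPrefix;     -- 137
```

Style 2 would avoid the '0x' + concatenation when the hex digits come straight from a substring of a column, as in the question.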

SQL Server Management Studio - using multiple filters in table list?

In Management Studio, you can right click on the tables group to create a filter for the table list. Has anyone figured out a way to include multiple tables in the filter? For example, I'd like all tables with "br_*" and "tbl_*" to show up.
Anyone know how to do this?
No, you can't do this. When we first got Management Studio I tried every possible combination of everything you could think of: _, %, *, ", ', &&, &, and, or, |, ||, etc...
You might be able to roll your own add-in for SSMS that would allow you to do what you are looking for:
The Black Art of Writing a SQL Server Management Studio 2005 Add-In
Extend Functionality in SQL Server 2005 Management Studio with Add-ins
The first one is specifically for searching and displaying all schema objects with a given name so you might be able to expand upon that for what you are looking for.
I'm using SQL Server Management Studio v17.1 and it has a SQL injection bug in its filter construction, so you can actually escape the default
tbl.name like '%xxx%'
and write your own query (with some limitations). For example, to filter tables that end with "_arch", "_hist", or "_purge", I used the following filter value:
_arch') and RIGHT(tbl.name, 5) != N'purge' and RIGHT(tbl.name, 4) != N'hist' and not(tbl.name like N'bbb
You can use SQL Server Profiler to see the constructed query and adjust it as needed.
I'm not sure whether the same bug is present in previous SQL Server Management Studio versions or when it will be fixed, but for now I'm happy with the result.
I've used Toad for SQL Server (freeware version) which has very nice filtering options.
At first it looks like it could use a CONTAINS query (e.g. "br_*" OR "tbl_*"), but it doesn't seem to. It seems to only support a value that is then passed into a LIKE clause (e.g. 'app' becomes '%app%').
The "sql injection" method still works (v17.5), but with a twist:
zzzz' or charindex('pattern1', name) > 0 or charindex('pattern2', name) > 0 or name like 'zzzz
(I used the 'zzzz' to bypass the '%')
It doesn't work if '_' or '%' is used in the patterns (or anywhere in your code), because it will automatically be replaced by '[_]' or '[%]' before evaluation.
As others have said, you cannot do this in SQL Server Management Studio (up to and including 2014).
The following query will give you a filtered list of tables, if this is all you need:
SELECT
CONCAT(TABLE_SCHEMA, '.', TABLE_NAME) AS TABLE_SCHEMA_AND_NAME,
TABLE_SCHEMA,
TABLE_NAME
FROM
INFORMATION_SCHEMA.TABLES
WHERE
TABLE_SCHEMA IN ('X', 'Y', 'Z') -- schemas go here
ORDER BY
TABLE_SCHEMA,
TABLE_NAME;
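The same catalog view can also be filtered by name, which is closer to what the question asks for. A minimal sketch, assuming the "br_" and "tbl_" prefixes from the question; the underscore is wrapped in brackets because a bare _ in LIKE matches any single character.

```sql
SELECT
    TABLE_SCHEMA,
    TABLE_NAME
FROM
    INFORMATION_SCHEMA.TABLES
WHERE
    TABLE_TYPE = 'BASE TABLE'            -- exclude views
    AND (TABLE_NAME LIKE 'br[_]%'        -- names starting with br_
         OR TABLE_NAME LIKE 'tbl[_]%')   -- names starting with tbl_
ORDER BY
    TABLE_SCHEMA,
    TABLE_NAME;
```

This runs in a query window rather than in the Object Explorer filter, so it lists the tables but does not change what the tree displays.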
The SQL injection method still works (somewhat) as of SSMS 2017 v17.8.1, although it puts brackets around the % symbol, so it will interpret those literally.
If you're using the Name->Contains filter, Profiler shows:
... AND dtb.name LIKE N'%MyDatabase1%')
So, in the Name->Contains field: MyDatabase1') OR (dtb.name LIKE 'MyDatabase2 should do it for simple cases.
This is old, I know, but it's good to know that it works if you enter just the filter text. Skip * or % or any other standard search characters; just enter br_ or tbl_ or whatever you want to filter on.
You're in luck, I just conquered that feat, although my success is small: you can filter by schema, which allows you to see more than one table, but you have to type the filter text in each time you want to change it.
