Is CHAR(14) not allowed in SQL Server T-SQL patindex range? - sql-server

What's the problem with CHAR(13) or perhaps CHAR(14) in TSQL patindex?
As soon as I include CHAR(14) in a pattern, I get no records found.
Searching for an answer, I just found my own question (unanswered) from 2009 (here: http://www.sqlservercentral.com/Forums/Topic795063-338-1.aspx).
Here is another simple test, to show what I mean:
/* PATINDEX TEST */
DECLARE @msg NVARCHAR(255)
SET @msg = 'ABC' + NCHAR(13) + NCHAR(9) + 'DEF'
DECLARE @unwanted NVARCHAR(50)
-- unwanted chars in a "chopped up" string
SET @unwanted = N'%[' + NCHAR(1) + '-' + NCHAR(13) + NCHAR(14) + '-' + NCHAR(31) + ']%'
SELECT patindex(@unwanted, @msg)
-- Result: 4
-- Now let the unwanted string include the whole range from 1 to 31
SET @unwanted = '%['+NCHAR(1)+'-'+NCHAR(31)+']%' -- As soon as Char(14) is included, we get no match with patindex!
SELECT patindex(@unwanted, @msg)
-- Result: 0

It is permitted.
Bear in mind, however, that ranges are based on collation sort order, not character codes, so in your default collation these characters may sort in a position you do not expect.
What is your database's default collation?
What does the following return?
;WITH CTE(N) AS
(
SELECT 1 UNION ALL
SELECT 9 UNION ALL
SELECT 13 UNION ALL
SELECT 14 UNION ALL
SELECT 31
)
SELECT N
FROM CTE
ORDER BY NCHAR(N)
For me it returns
N
-----------
1
14
31
9
13
So both characters 9 and 13 are outside the range 1-31. Hence
'ABC' + NCHAR(13) + NCHAR(9) + 'DEF' NOT LIKE N'%['+NCHAR(1)+N'-'+NCHAR(31)+N']%'
Which explains the results in your question. Character 14 doesn't enter into it.
You can use a binary collate clause to get it to sort more as you were expecting. e.g.
SELECT patindex(@unwanted COLLATE Latin1_General_100_BIN, @msg)
Returns 4 in the second query too.
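To see why the binary collation helps, the same sort-order check can be repeated with an explicit COLLATE clause. This is only a sketch, assuming Latin1_General_100_BIN2 is available on your instance; the derived table v is just illustrative.

```sql
-- Sort the same character codes under a binary collation.
-- A binary collation compares nchar values by code point,
-- so the rows should come back in numeric order: 1, 9, 13, 14, 31.
SELECT v.N
FROM (VALUES (1), (9), (13), (14), (31)) AS v(N)
ORDER BY NCHAR(v.N) COLLATE Latin1_General_100_BIN2;
```

With that ordering, 9 and 13 fall inside the range 1-31, which is what the original pattern assumed.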

Related

How to convert TIMESTAMP values to VARCHAR in T-SQL as SSMS does?

I am trying to convert a TIMESTAMP field in a table to a string so that it can be printed or executed as part of dynamic SQL. SSMS is able to do it, so there must be a built-in method to do it. However, I can't get it to work using T-SQL.
The following correctly displays a table result:
SELECT TOP 1 RowVersion FROM MyTable
It shows 0x00000000288D17AE. However, I need the result to be part of a larger string.
DECLARE @res VARCHAR(MAX) = (SELECT TOP 1 'test' + CONVERT(BINARY(8), RowVersion) FROM MyTable)
PRINT(@res)
This yields an error: The data types varchar and binary are incompatible in the add operator
DECLARE @res VARCHAR(MAX) = (SELECT TOP 1 'test' + CONVERT(VARCHAR(MAX), RowVersion) FROM MyTable)
PRINT(@res)
This results in garbage characters: test (®
In fact, the spaces are just null characters and terminate the string for the purpose of running dynamic SQL using EXEC().
DECLARE @sql VARCHAR(MAX) = 'SELECT TOP 1 ''test'' + CONVERT(VARCHAR(MAX), RowVersion) FROM MyTable'
EXEC (@sql)
This just displays a table result with the word "test". Everything after "test" in the dynamic SQL is cut off because the CONVERT function returns terminating null characters first.
Obviously, what I want the resultant string to be is "test0x00000000288D17AE" or even the decimal equivalent, which in this case would be "test680335278".
Any ideas would be greatly appreciated.
SELECT 'test' + CONVERT(NVARCHAR(MAX), CONVERT(BINARY(8), RowVersion), 1)
The trick is passing 1 as the style argument to CONVERT, per the documentation. (Pass 2 to omit the 0x.)
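A self-contained way to try this without MyTable is to use a literal in place of the RowVersion column. The variable @rv below is only a stand-in for the column value from the question; the second column shows one possible route to the decimal form by going through BIGINT.

```sql
-- @rv stands in for the RowVersion column (value taken from the question).
DECLARE @rv BINARY(8) = 0x00000000288D17AE;

SELECT
    'test' + CONVERT(VARCHAR(MAX), @rv, 1)              AS HexText,     -- test0x00000000288D17AE
    'test' + CONVERT(VARCHAR(20), CONVERT(BIGINT, @rv)) AS DecimalText; -- test680335278
```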
As mentioned in the comments, the undocumented function master.sys.fn_varbintohexstr will convert binary to string such that you could then concatenate with some other string value:
DECLARE @binary BINARY(8)
SELECT @binary = CAST(1234567890 AS BINARY(8))
SELECT @binary AS BinaryValue,
LEFT(master.sys.fn_varbintohexstr(@binary),2) + UPPER(RIGHT(master.sys.fn_varbintohexstr(@binary),LEN(master.sys.fn_varbintohexstr(@binary))-2)) AS VarcharValue,
'test' + LEFT(master.sys.fn_varbintohexstr(@binary),2) + UPPER(RIGHT(master.sys.fn_varbintohexstr(@binary),LEN(master.sys.fn_varbintohexstr(@binary))-2)) AS ConcatenatedVarcharValue
I split off the first two characters and did not apply the UPPER function to them, to reproduce exactly the format in which a binary value is displayed.
Results:
/--------------------------------------------------------------------\
| BinaryValue | VarcharValue | ConcatenatedVarcharValue |
|--------------------+--------------------+--------------------------|
| 0x00000000499602D2 | 0x00000000499602D2 | test0x00000000499602D2 |
\--------------------------------------------------------------------/
Have a look at this:
SELECT
substring(replace(replace(replace(replace(cast(CAST(GETDATE() AS datetime2) as
varchar(50)),'-',''),' ',''),':',''),'.',''),1,18)

SQL concat / + operator behaving strangely?

So basically to explain my situation I have a program where a user can select code numbers, that are alpha numeric. These codes are stored in my SQL database as datatype char.
When they select all the codes they want, the program then sends a few parameters(the codes being one of them). The codes are strung together and look something like this:
',01,1,A3' etc. etc. with commas separating the codes. I have the comma in front, but changing the comma to the back does not change anything.
The @reasonCode variable holds the reason codes strung together.
In my where clause I have a statement that is this:
(@reasonCode = 'ALL') OR
((@reasonCode <> 'ALL' AND (charindex(',' + ro_reason_code, @reasonCode) > 0)))
Basically I want to restrict my results to just those that have the specific reason codes the user selected (among other parameters). I am trying to achieve that by stringing together the codes, separated by commas, and then searching through them using charindex.
However I am running into an issue. Here are the results using a few different variations of reason codes:
',1' = 625 records (correct number)
',01' = 1015 (correct number)
',01,1' = 1640 (correct number)
',1,01' = 1015 (for whatever reason it isn't picking up the 1 reason codes)
That is my issue right there.
When I put the 1 in front of the 01, it doesn't pick up the 1 reason codes. But if I do it flip-flopped it works fine...
Any ideas as to why this happens?
(I have tried also using the concat function and get the same results, and also tried forcing everything to be char datatype.)
In the end I would like the same result, regardless if it is ,01,1 or ,1,01.
I'm pretty sure this is because you said you're using the char type instead of varchar. Try replacing your charindex expression with this:
charindex(',' + rtrim(ro_reason_code), @reasonCode)
When I used a type of char(2) in the table and char(16) for @reasonCode, I could reproduce your result, and I found that adding the rtrim fixed the problem. Unfortunately I can't explain exactly what's going on here: why having ',1' at the end of the string works without the trim whereas having it at the beginning does not. Hopefully someone can provide a more in-depth answer that gets into the "why," but I thought I'd still post this for the time being to get you running.
Reproduction:
-- Forgive the "hackish" way of populating this table. I'm assuming sysobjects has >=1015 records.
declare @Code table (ro_reason_code char(2));
insert @Code select top 625 '1' from sysobjects;
insert @Code select top 1015 '01' from sysobjects;
declare @reasonCode char(16);
set @reasonCode = ',1,01';
select count(1) from @Code where @reasonCode = 'ALL' or charindex(',' + ro_reason_code, @reasonCode) > 0; -- Result: 1015
select count(1) from @Code where @reasonCode = 'ALL' or charindex(',' + rtrim(ro_reason_code), @reasonCode) > 0; -- Result: 1640
set @reasonCode = ',01,1';
select count(1) from @Code where @reasonCode = 'ALL' or charindex(',' + ro_reason_code, @reasonCode) > 0; -- Result: 1640
select count(1) from @Code where @reasonCode = 'ALL' or charindex(',' + rtrim(ro_reason_code), @reasonCode) > 0; -- Result: 1640
Because you are using char, which is a fixed length field, your data is stored padded out to the length of the field. So '1' is stored as '1 '
DECLARE @Code CHAR(2)
SET @Code = '1'
SELECT '''' + @Code + ''''
-- Prints '1 '
For that reason, when you add ',' to the value, you now have ',1 ' (notice the trailing whitespace)
DECLARE @Code CHAR(2)
SET @Code = '1'
SELECT '''' + ',' + @Code + ''''
-- Prints ',1 '
Now if you're comparing against another char field, it will also be padded with whitespace if the character data is shorter than the length of the field. So what appears to be ',11,1' is actually something like ',11,1 ' which does match the pattern of ',1 '
BUT, when you reverse the order, ',1,11' becomes ',1,11 ' which does not match the pattern of ',1 '
Unrelated
I just want to point out there is a subtle issue with the implementation. By only appending the leading comma, you may get false positives depending on your data. For example, ,2 will match the pattern ,25.
,2 does match 1,11,25,A01
You need to put a comma on both sides of both the search term and the string being searched.
CHARINDEX( ',' + RTRIM(ro_reason_code) + ',',
',' + RTRIM(@reasonCode) + ',') > 0
So to illustrate the difference it becomes
,2, does not match ,1,11,25,A01,
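The false positive is easy to check in isolation. This is a minimal sketch with made-up values; @reasonCode is declared as VARCHAR here so that padding plays no part.

```sql
-- Illustrative values only; VARCHAR avoids the CHAR padding issue discussed above.
DECLARE @reasonCode VARCHAR(50) = ',1,11,25,A01';

SELECT
    CHARINDEX(',2', @reasonCode)        AS LeadingCommaOnly, -- 6: false positive inside ',25'
    CHARINDEX(',2,', @reasonCode + ',') AS BothSides;        -- 0: no match once both sides are delimited
```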

Concatenate the result of an ordered String_Split in a variable

In a SQL Server database I use, the database name is something like StackExchange.Audio.Meta, StackExchange.Audio, or StackOverflow. By sheer luck this is also the URL of a website. I only need to split it on the dots and reverse it: meta.audio.stackexchange. Add http:// and .com and I'm done. Obviously StackOverflow doesn't need any reversing.
Using the SQL Server 2016 string_split function I can easily split and reorder its result:
select value
from string_split(db_name(),'.')
order by row_number() over( order by (select 1)) desc
This gives me
| Value |
-----------------
| Meta |
| Audio |
| StackExchange |
As I need to have the url in a variable I hoped to concatenate it using this answer so my attempt looks like this:
declare @revname nvarchar(150)
select @revname = coalesce(@revname +'.','') + value
from string_split(db_name(),'.')
order by row_number() over( order by (select 1)) desc
However this only returns me the last value, StackExchange. I already noticed the warnings on that answer that this trick only works for certain execution plans as explained here.
The problem seems to be caused by the order by clause. Without it I get all values, but in the wrong order. I tried adding the ltrim and rtrim functions as suggested in the Microsoft article, as well as a subquery, but so far without luck.
Is there a way I can nudge the Sql Server 2016 Query Engine to concatenate the ordered result from that string_split in a variable?
I do know I can use for XML or even a plain cursor to get the result I need but I don't want to give up this elegant solution yet.
As I'm running this on the Stack Exchange Data Explorer I can't use functions, as we lack the permission to create those. I can do Stored procedures but I hoped I could evade those.
I prepared a SEDE Query to experiment with. The database names to expect have no dots (StackOverflow), one dot (StackOverflow.Meta), or two dots (StackExchange.Audio.Meta); the full list of databases is here
I think you are over-complicating things. You could use PARSENAME:
SELECT 'http://' + PARSENAME(db_name(),1) +
ISNULL('.' + PARSENAME(db_name(),2),'') + ISNULL('.'+PARSENAME(db_name(),3),'')
+ '.com'
This is exactly why I have the Presentation Sequence (PS) in my split function. People often scoff at using a UDF for such items, but it is generally a one-time hit to parse something for later consumption.
Select * from [dbo].[udf-Str-Parse]('meta.audio.stackexchange','.')
Returns
Key_PS Key_Value
1 meta
2 audio
3 stackexchange
The UDF
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@delimeter varchar(10))
--Usage: Select * from [dbo].[udf-Str-Parse]('meta.audio.stackexchange','.')
-- Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
-- Select * from [dbo].[udf-Str-Parse]('id26,id46|id658,id967','|')
Returns @ReturnTable Table (Key_PS int IDENTITY(1,1) NOT NULL , Key_Value varchar(max))
As
Begin
Declare @intPos int,@SubStr varchar(max)
Set @IntPos = CharIndex(@delimeter, @String)
Set @String = Replace(@String,@delimeter+@delimeter,@delimeter)
While @IntPos > 0
Begin
Set @SubStr = Substring(@String, 0, @IntPos)
Insert into @ReturnTable (Key_Value) values (@SubStr)
Set @String = Replace(@String, @SubStr + @delimeter, '')
Set @IntPos = CharIndex(@delimeter, @String)
End
Insert into @ReturnTable (Key_Value) values (@String)
Return
End
Probably less elegant solution but it takes only a few lines and works with any number of dots.
;with cte as (--build xml
select 1 num, cast('<str><s>'+replace(db_name(),'.','</s><s>')+'</s></str>' as xml) str
)
,x as (--make table from xml
select row_number() over(order by num) rn, --add numbers to sort later
t.v.value('.[1]','varchar(50)') s
from cte cross apply cte.str.nodes('str/s') t(v)
)
--combine into string
select STUFF((SELECT '.' + s AS [text()]
FROM x
order by rn desc --in reverse order
FOR XML PATH('')
), 1, 1, '' ) name
Is there a way I can nudge the Sql Server 2016 Query Engine to concatenate the ordered result from that string_split in a variable?
You can just use CONCAT:
DECLARE @URL NVARCHAR(MAX)
SELECT @URL = CONCAT(value, '.', @URL) FROM STRING_SPLIT(DB_NAME(), '.')
SET @URL = CONCAT('http://', LOWER(@URL), 'com');
The reversal is accomplished by the order of parameters to CONCAT. Here's an example.
It changes StackExchange.Garage.Meta to http://meta.garage.stackexchange.com.
This can be used to split and reverse strings in general, but note that it does leave a trailing delimiter. I'm sure you could add some logic or a COALESCE in there to make that not happen.
Also note that vNext will be adding STRING_AGG.
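For what it's worth, on versions that have both STRING_AGG and the optional ordinal argument of STRING_SPLIT (SQL Server 2022 or Azure SQL Database, as far as I know), the reversal can be written with an explicit ordering instead of relying on the variable-assignment trick. This is a sketch under that assumption; @name is a stand-in for DB_NAME().

```sql
-- Assumes SQL Server 2022+ / Azure SQL, where STRING_SPLIT accepts enable_ordinal = 1.
-- @name is a hypothetical stand-in for DB_NAME().
DECLARE @name NVARCHAR(128) = N'StackExchange.Audio.Meta';

SELECT N'http://'
     + LOWER(STRING_AGG(value, N'.') WITHIN GROUP (ORDER BY ordinal DESC))
     + N'.com' AS Url  -- http://meta.audio.stackexchange.com
FROM STRING_SPLIT(@name, N'.', 1);
```

Because the order comes from the ordinal column, this does not depend on the execution plan the way the concatenating SELECT does.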
To answer the 'X' of this XY problem, and to address the HTTPS switch (especially for Meta sites) and some other site name changes, I've written the following SEDE query which outputs all site names in the format used on the network site list.
SELECT name,
LOWER('https://' +
IIF(PATINDEX('%.Mathoverflow%', name) > 0,
IIF(PATINDEX('%.Meta', name) > 0, 'meta.mathoverflow.net', 'mathoverflow.net'),
IIF(PATINDEX('%.Ubuntu%', name) > 0,
IIF(PATINDEX('%.Meta', name) > 0, 'meta.askubuntu.com', 'askubuntu.com'),
IIF(PATINDEX('StackExchange.%', name) > 0,
CASE SUBSTRING(name, 15, 200)
WHEN 'Audio' THEN 'video'
WHEN 'Audio.Meta' THEN 'video.meta'
WHEN 'Beer' THEN 'alcohol'
WHEN 'Beer.Meta' THEN 'alcohol.meta'
WHEN 'CogSci' THEN 'psychology'
WHEN 'CogSci.Meta' THEN 'psychology.meta'
WHEN 'Garage' THEN 'mechanics'
WHEN 'Garage.Meta' THEN 'mechanics.meta'
WHEN 'Health' THEN 'medicalsciences'
WHEN 'Health.Meta' THEN 'medicalsciences.meta'
WHEN 'Moderators' THEN 'communitybuilding'
WHEN 'Moderators.Meta' THEN 'communitybuilding.meta'
WHEN 'Photography' THEN 'photo'
WHEN 'Photography.Meta' THEN 'photo.meta'
WHEN 'Programmers' THEN 'softwareengineering'
WHEN 'Programmers.Meta' THEN 'softwareengineering.meta'
WHEN 'Vegetarian' THEN 'vegetarianism'
WHEN 'Vegetarian.Meta' THEN 'vegetarianism.meta'
WHEN 'Writers' THEN 'writing'
WHEN 'Writers.Meta' THEN 'writing.meta'
ELSE SUBSTRING(name, 15, 200)
END + '.stackexchange.com',
IIF(PATINDEX('StackOverflow.%', name) > 0,
CASE SUBSTRING(name, 15, 200)
WHEN 'Br' THEN 'pt'
WHEN 'Br.Meta' THEN 'pt.meta'
ELSE SUBSTRING(name, 15, 200)
END + '.stackoverflow.com',
IIF(PATINDEX('%.Meta', name) > 0,
'meta.' + SUBSTRING(name, 0, PATINDEX('%.Meta', name)) + '.com',
name + '.com'
)
)
)
)
) + '/'
)
FROM sys.databases WHERE database_id > 5

Extract Min Date from a string with several dates using SQL Server

I am trying to extract the min date from a varchar string.
The data in the field looks like this
QTY DIFFERENCE - PO LINE 6. 147 ON PO / 192 ON INVOICE
5/18/2016 4:18:52 PM by ROOFING\ebuchanan
ANDREW SANTORI ISSUED THIS PO, PLEASE SEND TO HIS QUE
5/21/2016 9:48:42 AM by ROOFING\knaylor
RE-ROUTED TO ATS
Using this code
SELECT
UISeq,
LEFT(SUBSTRING(Notes, PATINDEX('%[0-9/]%', Notes), 8000),
PATINDEX('%[^0-9/]%', SUBSTRING(Notes, PATINDEX('%[0-9/]%', Notes), 8000) + 'X') -1) as 'MaxDate'
FROM
bAPUI
WHERE
Notes IS NOT NULL
ORDER BY
UISeq
I get this result from the record above
6
I also get
01/01/2000
On other fields
How do I correct the code to only return the Min date within each record field?
UISeq MinDate
2 3
3 5
13 4/1/2016
15 1
17
18 4/15/2016
19 3
20 4/15/2016
40 05/22/16
43 05/22/16
54 5/18/16
John's post is beyond my current ability
I have created the function, here is the code to extract the data
Declare @Str varchar(max);
Select @Str as Notes, Min(Key_Value)
from bAPUI, [dbo].[SA-udf-Str-Parse](replace(@Str,char(13),' '),' ')
Where Key_Value like '%/%'
and len(Key_Value)>=10
What I am not understanding is how to get the bAPUI.Notes table/field into the select statement.
The following uses a string parser udf. Perhaps in your data, or even just in the example, there were chr(13)'s, so I had to perform a replace(), there could be other extended characters that may need to be trapped.
Declare @Str varchar(max)
Set @Str='QTY DIFFERENCE - PO LINE 6. 147 ON PO / 192 ON INVOICE
5/18/2016 4:18:52 PM by ROOFING\ebuchanan
ANDREW SANTORI ISSUED THIS PO, PLEASE SEND TO HIS QUE
5/21/2016 9:48:42 AM by ROOFING\knaylor
RE-ROUTED TO ATS'
Select * from [dbo].[udf-Str-Parse](replace(@Str,char(13),' '),' ')
Where Key_Value like '%/%'
and len(Key_Value)>=10
Returns
Key_PS Key_Value
13 5/18/2016
28 5/21/2016
While with a quick change
Select Min(Key_Value) from [dbo].[udf-Str-Parse](replace(@Str,char(13),' '),' ')
Where Key_Value like '%/%'
and len(Key_Value)>=10
Returns
5/18/2016
There are millions of variations but here is mine.
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@delimeter varchar(10))
--Usage: Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
-- Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
-- Select * from [dbo].[udf-Str-Parse]('id26,id46|id658,id967','|')
Returns @ReturnTable Table (Key_PS int IDENTITY(1,1) NOT NULL , Key_Value varchar(500))
As
Begin
Declare @intPos int,@SubStr varchar(500)
Set @IntPos = CharIndex(@delimeter, @String)
Set @String = Replace(@String,@delimeter+@delimeter,@delimeter)
While @IntPos > 0
Begin
Set @SubStr = Substring(@String, 0, @IntPos)
Insert into @ReturnTable (Key_Value) values (@SubStr)
Set @String = Replace(@String, @SubStr + @delimeter, '')
Set @IntPos = CharIndex(@delimeter, @String)
End
Insert into @ReturnTable (Key_Value) values (@String)
Return
End
So to apply to your data
Select UISeq
,MinDate=(Select Min(Key_Value) from [dbo].[udf-Str-Parse](replace(Notes,char(13),' '),' ') Where Key_Value like '%/%' and len(Key_Value)>=10)
FROM bAPUI
WHERE Notes IS NOT NULL
ORDER BY UISeq
I have no idea how this will perform on a large dataset
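One caveat worth checking: Min(Key_Value) compares the tokens as text, so '12/1/2015' would sort before '5/18/2016' only by accident of its first character. A possible refinement, assuming SQL Server 2012 or later and US-style month/day/year tokens, is to convert each token with TRY_CONVERT before taking the minimum. The token list below is made up for illustration.

```sql
-- Hypothetical tokens standing in for the Key_Value column of the parser.
-- TRY_CONVERT returns NULL for tokens that are not dates, and MIN ignores NULLs.
SELECT MIN(TRY_CONVERT(date, v.Token, 101)) AS MinDate  -- 2015-12-01
FROM (VALUES ('5/18/2016'), ('5/21/2016'), ('12/1/2015'), ('PO')) AS v(Token);
```

Style 101 tells the conversion to read the text as mm/dd/yyyy rather than relying on the session's date format.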
Super quick draft - Use CHARINDEX And LEFT to retrieve all characters up to the first space, then convert that text to a DATE, then use MIN to select the EARLIEST date.
select @str as string
,left(@str,CHARINDEX(' ',@str)) -- Get the position of the first space, then select all characters up to the space
,MIN(convert(date,left(@str,CHARINDEX(' ',@str)))) -- Convert the selected characters to a date and then use MIN to select earliest date

Search Entire Database To find extended ascii codes in sql

We have issues with extended ascii codes getting in our database (128-155)
Is there any way to search the entire database and display any occurrences of these characters, along with the tables and columns where they are located?
Hope that makes sense.
I have the script to search entire DB, but having trouble with opening line.
DECLARE @SearchStr nvarchar(100)
SET @SearchStr != between char(32) and char(127)
I have this originally that works, but I need to extend the range I'm looking for.
SET @SearchStr = '|' + char(9) + '|' + char(10) + '|' + char(13)
Thanks
It's very unclear what your data looks like, but this might help you to get started:
declare @TestData table (String nvarchar(100))
insert into @TestData select N'abc'
insert into @TestData select N'def'
insert into @TestData select char(128)
insert into @TestData select char(155)
declare @SearchPattern nvarchar(max) = N'%['
declare @i int = 128
while @i <= 155
begin
set @SearchPattern += char(@i)
set @i += 1
end
set @SearchPattern += N']%'
select @SearchPattern
select String
from @TestData
where String like @SearchPattern
Of course you'll need to add some code to loop over every table and column that you want to query (see this question), and it's possible that this code will behave differently on different collations.
... where dodgyColumn is your column with questionable data ....
WHERE(patindex('%[' + char(127) + '-' + char(255) + ']%', dodgyColumn COLLATE Latin1_General_BIN2) > 0)
This works for us, to identify extended ASCII characters in our otherwise normal ASCII data (characters, numbers, punctuation, dollar and percent signs, etc.)
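The same idea can be tried on a couple of literal rows before pointing it at real tables. This is only a sketch: it uses NCHAR so the character codes do not depend on the database code page, and the two sample strings are made up.

```sql
-- Two made-up rows: plain ASCII, and one containing U+00E9 (code 233).
-- Under a BIN2 collation the range is evaluated by code point,
-- so only the second row should be returned.
SELECT v.s
FROM (VALUES (N'plain ascii'), (N'caf' + NCHAR(233))) AS v(s)
WHERE PATINDEX(N'%[' + NCHAR(128) + N'-' + NCHAR(255) + N']%',
               v.s COLLATE Latin1_General_BIN2) > 0;
```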
