Using SQL WHERE LIKE to filter when have consecutive numbers - sql-server

Trying to convert a PostgreSQL view to SQL Server (2016) view.
I have a table with a column named filename, with the following data:
R24bMP1.png
MP3.png
R28.jpg
I002.jpg
App_1472669569054.jpg
Test_1575753047890.png
So, I like to filter all rows the filename must contains 13 consecutives numbers, in the example, only App_1472669569054.jpg and Test_1575753047890.png.
In PostgreSQL, I can use regex to do this:
SELECT * FROM table WHERE filename ~ '\d{13}'.
Tried in SQL Server with:
SELECT * FROM table WHERE filename LIKE '%[0-9]{13}%', but got no results. The only way that worked is:
SELECT * FROM table WHERE filename LIKE '%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%'
After that, I need to get only the number part of filename, in the example, the returned value must be:
1472669569054
1575753047890
I know I can use CLR with SQL Server, but I like to known if is possible to filter without CLR in this case.

As Martin Smith pointed out:
'%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%'
can be reduced to:
'%' + replicate('[0-9]',13) + '%'
You were most of the way there:
declare #filename varchar(128) = 'Test_1575753047890.png'
select test=substring(#filename
,patindex('%' + replicate('[0-9]',13) + '%',#filename)
,13
)
returns: 1575753047890
So for your table it would look like:
select test=substring([filename]
,patindex('%' + replicate('[0-9]',13) + '%',[filename])
,13
)
from t
where patindex('%' + replicate('[0-9]',13) + '%',[filename]) > 0

Related

SQL Server geography: Reduce size (decimal precision) of WKT text

For my farming app, a stored proc retrieves paddock/field boundaries that are stored as SQL Server geography data type, for display on the user's mobile device.
SQL Server's .ToString() and .STAsText() functions render each vertex as a lat/long pair, to 15 decimal places. From this answer, 15 decimals defines a location to within the width of an atom! To the nearest metre would be good enough for me.
The resulting overly-precise payload is extremely large and too slow for use on larger farms.
From my SQL Server geography data, I would like to produce WKT that is formatted to 4 or 5 decimals. I could find no built-in methods, but my best leads are:
Postgis and Google Cloud "BigQuery" have the ST_SNAPTOGRID function, which would be perfect, and
Regex might be useful, e.g. this answer, but SQL Server doesn't seem to have a regex replace.
I would think this is a common problem: is there a simple solution?
By stepping through #iamdave's excellent answer, and using that same approach, it looks like we need split only on periods.... I think we can ignore all parantheses and commas, and ignore the POLYGON prefix (which means it'll work across other GEOGRAPHY types, such as MULTIPOLYGON.)
i.e. Each time we find a period, grab only the next 4 characters after it, and throw away any numbers after that (until we hit a non-number.)
This works for me (using #iamdave's test data):
DECLARE #wkt NVARCHAR(MAX), #wktShort NVARCHAR(MAX);
DECLARE #decimalPlaces int = 4;
SET #wkt = 'POLYGON((-121.973669 37.365336,-121.97367 37.365336,-121.973642 37.365309,-121.973415 37.365309,-121.973189 37.365309,-121.973002 37.365912,-121.972815 37.366515,-121.972796 37.366532,-121.972776 37.366549,-121.972627 37.366424,-121.972478 37.366299,-121.972422 37.366299,-121.972366 37.366299,-121.972298 37.366356,-121.97223 37.366412,-121.97215 37.366505,-121.97207 37.366598,-121.971908 37.366794,-121.971489 37.367353,-121.971396 37.367484,-121.971285 37.36769,-121.971173 37.367897,-121.971121 37.368072,-121.971068 37.368248,-121.971028 37.36847,-121.970987 37.368692,-121.970987 37.368779,-121.970987 37.368866,-121.970949 37.368923,-121.970912 37.36898,-121.970935 37.36898,-121.970958 37.36898,-121.970975 37.368933,-121.970993 37.368887,-121.971067 37.368807,-121.97114 37.368726,-121.971124 37.368705,-121.971108 37.368685,-121.971136 37.368698,-121.971163 37.368712,-121.97134 37.368531,-121.971516 37.368351,-121.971697 37.368186,-121.971878 37.368021,-121.972085 37.367846,-121.972293 37.36767,-121.972331 37.367629,-121.972369 37.367588,-121.972125 37.367763,-121.97188 37.367938,-121.971612 37.36815,-121.971345 37.368362,-121.971321 37.36835,-121.971297 37.368338,-121.971323 37.368298,-121.97135 37.368259,-121.971569 37.368062,-121.971788 37.367865,-121.971977 37.367716,-121.972166 37.367567,-121.972345 37.367442,-121.972524 37.367317,-121.972605 37.367272,-121.972687 37.367227,-121.972728 37.367227,-121.972769 37.367227,-121.972769 37.367259,-121.972769 37.367291,-121.972612 37.367416,-121.972454 37.367542,-121.972488 37.367558,-121.972521 37.367575,-121.972404 37.367674,-121.972286 37.367773,-121.972194 37.367851,-121.972101 37.367928,-121.972046 37.36799,-121.971991 37.368052,-121.972008 37.368052,-121.972025 37.368052,-121.972143 37.367959,-121.972261 37.367866,-121.972296 37.367866,-121.972276 37.36794,-121.972221 37.36798,-121.972094 37.368097,-121.971966 37.368214,-121.971956 37.368324,-121.971945 37.368433,-121.971907 37.368753,-121.971868 37.369073,-121.97184 37.369578,-121.971812 37.370083,-121.971798 37.370212,-121.971783 37.370342,-121.971542 37.370486,-121.971904 37.370324,-121.972085 37.37028,-121.972266 37.370236,-121.972559 37.370196,-121.972852 37.370155,-121.973019 37.370155,-121.973186 37.370155,-121.973232 37.370136,-121.973279 37.370116,-121.973307 37.370058,-121.973336 37.370001,-121.973363 37.369836,-121.973391 37.369671,-121.973419 37.369227,-121.973446 37.368784,-121.973429 37.368413,-121.973413 37.368041,-121.973361 37.367714,-121.973308 37.367387,-121.973285 37.367339,-121.973262 37.36729,-121.973126 37.3673,-121.972989 37.36731,-121.973066 37.36728,-121.973144 37.367251,-121.973269 37.367237,-121.973393 37.367223,-121.973443 37.367158,-121.973493 37.367093,-121.973518 37.36702,-121.973543 37.366947,-121.973582 37.366618,-121.973622 37.366288,-121.97366 37.365826,-121.973698 37.365363,-121.973669 37.365336))';
-- Split on '.', then get the next N decimals, and find the index of the first non-number.
-- Then recombine the fragments, skipping the unwanted numbers.
WITH points AS (
SELECT value, LEFT(value, #decimalPlaces) AS decimals, PATINDEX('%[^0-9]%', value) AS indx
FROM STRING_SPLIT(#wkt, '.')
)
SELECT #wktShort = STRING_AGG(IIF(indx < #decimalPlaces, '', decimals) + SUBSTRING(value, indx, LEN(value)), '.')
FROM points;
Comparing original vs shortened, we can see that each number is truncated to 4dp:
SELECT #wkt AS Text UNION ALL SELECT #wktShort;
Edit
I believe I may have misunderstood your question and you are looking to transmit the WKT rather than the binary representations of the polygons? If that is the case, my answer below still shows you how to chop off some decimal places (without rounding them). Just don't wrap the stuff(...) FOR XML in a STGeomFromText and you have the amended WKT.
When working with geography data types, it can be handy to maintain a very detailed 'master' version, from which you generate and persist less detailed versions based on your requirements.
An easy way to generate these reduced complexity polygons is to use the helpfully named Reduce function, which I think will actually help you in this situation.
If you would prefer to go down the route of reducing the number of decimal places you will either have to write a custom CLR function or enter the wonderful world of SQL Server string manipulation!
SQL Query
declare #DecimalPlaces int = 4; -- Specify the desired number of lat/long decimals
with g as(
select p.g -- Original polygon, for comparison purposes
,geography::STGeomFromText('POLYGON((' -- stripped apart and then recreated polygon from text, using a custom string split function. You won't be able to use the built in STRING_SPLIT here as it doesn't guarantee sort order.
+ stuff((select ', ' + left(s.item,charindex('.',s.item,0) + #DecimalPlaces) + substring(s.item,charindex(' ',s.item,0),charindex('.',s.item,charindex(' ',s.item,0)) - charindex(' ',s.item,0) + 1 + #DecimalPlaces)
from dbo.fn_StringSplitMax(replace(replace(p.g.STAsText(),'POLYGON ((',''),'))',''),', ',null) as s
for xml path(''), type).value('.', 'NVARCHAR(MAX)') -- STUFF and FOR XML mimics GROUP_CONCAT functionality seen in other SQL languages, to recombine shortened Points back into a Polygon string
,1,2,''
)
+ '))', 4326).MakeValid() as x -- Remember to make the polygon valid again, as you have been messing with the Point data
from(values(geography::STGeomFromText('POLYGON((-121.973669 37.365336,-121.97367 37.365336,-121.973642 37.365309,-121.973415 37.365309,-121.973189 37.365309,-121.973002 37.365912,-121.972815 37.366515,-121.972796 37.366532,-121.972776 37.366549,-121.972627 37.366424,-121.972478 37.366299,-121.972422 37.366299,-121.972366 37.366299,-121.972298 37.366356,-121.97223 37.366412,-121.97215 37.366505,-121.97207 37.366598,-121.971908 37.366794,-121.971489 37.367353,-121.971396 37.367484,-121.971285 37.36769,-121.971173 37.367897,-121.971121 37.368072,-121.971068 37.368248,-121.971028 37.36847,-121.970987 37.368692,-121.970987 37.368779,-121.970987 37.368866,-121.970949 37.368923,-121.970912 37.36898,-121.970935 37.36898,-121.970958 37.36898,-121.970975 37.368933,-121.970993 37.368887,-121.971067 37.368807,-121.97114 37.368726,-121.971124 37.368705,-121.971108 37.368685,-121.971136 37.368698,-121.971163 37.368712,-121.97134 37.368531,-121.971516 37.368351,-121.971697 37.368186,-121.971878 37.368021,-121.972085 37.367846,-121.972293 37.36767,-121.972331 37.367629,-121.972369 37.367588,-121.972125 37.367763,-121.97188 37.367938,-121.971612 37.36815,-121.971345 37.368362,-121.971321 37.36835,-121.971297 37.368338,-121.971323 37.368298,-121.97135 37.368259,-121.971569 37.368062,-121.971788 37.367865,-121.971977 37.367716,-121.972166 37.367567,-121.972345 37.367442,-121.972524 37.367317,-121.972605 37.367272,-121.972687 37.367227,-121.972728 37.367227,-121.972769 37.367227,-121.972769 37.367259,-121.972769 37.367291,-121.972612 37.367416,-121.972454 37.367542,-121.972488 37.367558,-121.972521 37.367575,-121.972404 37.367674,-121.972286 37.367773,-121.972194 37.367851,-121.972101 37.367928,-121.972046 37.36799,-121.971991 37.368052,-121.972008 37.368052,-121.972025 37.368052,-121.972143 37.367959,-121.972261 37.367866,-121.972296 37.367866,-121.972276 37.36794,-121.972221 37.36798,-121.972094 37.368097,-121.971966 37.368214,-121.971956 37.368324,-121.971945 37.368433,-121.971907 37.368753,-121.971868 37.369073,-121.97184 37.369578,-121.971812 37.370083,-121.971798 37.370212,-121.971783 37.370342,-121.971542 37.370486,-121.971904 37.370324,-121.972085 37.37028,-121.972266 37.370236,-121.972559 37.370196,-121.972852 37.370155,-121.973019 37.370155,-121.973186 37.370155,-121.973232 37.370136,-121.973279 37.370116,-121.973307 37.370058,-121.973336 37.370001,-121.973363 37.369836,-121.973391 37.369671,-121.973419 37.369227,-121.973446 37.368784,-121.973429 37.368413,-121.973413 37.368041,-121.973361 37.367714,-121.973308 37.367387,-121.973285 37.367339,-121.973262 37.36729,-121.973126 37.3673,-121.972989 37.36731,-121.973066 37.36728,-121.973144 37.367251,-121.973269 37.367237,-121.973393 37.367223,-121.973443 37.367158,-121.973493 37.367093,-121.973518 37.36702,-121.973543 37.366947,-121.973582 37.366618,-121.973622 37.366288,-121.97366 37.365826,-121.973698 37.365363,-121.973669 37.365336))', 4326))) as p(g)
)
-- select various versions of the polygons into the same column for overlay comparison in SSMS
select 'Original' as l
,g
from g
union all
select 'Short' as l
,x
from g
union all
select 'Original Reduced' as l
,g.Reduce(10)
from g
union all
select 'Short Reduced' as l
,x.Reduce(10)
from g;
Output
What is interesting to note here is the difference in length of the geog binary representation (simple count of characters as displayed). As I mentioned above, simply using the Reduce function may do what you need, so you will want to test various approaches to see how best you cut down on your data transfer.
+------------------+--------------------+------+
| l | g | Len |
+------------------+--------------------+------+
| Original | 0xE6100000010484...| 4290 |
| Short | 0xE6100000010471...| 3840 |
| Original Reduced | 0xE6100000010418...| 834 |
| Short Reduced | 0xE610000001041E...| 1184 |
+------------------+--------------------+------+
Visual Comparison
String Split Function
As polygon data can be bloody huge, you will need a string splitter that can handle more then 4k or 8k characters. In my case I tend to opt for an xml based approach:
create function [dbo].[fn_StringSplitMax]
(
#str nvarchar(max) = ' ' -- String to split.
,#delimiter as nvarchar(max) = ',' -- Delimiting value to split on.
,#num as int = null -- Which value to return.
)
returns table
as
return
with s as
( -- Convert the string to an XML value, replacing the delimiter with XML tags
select convert(xml,'<x>' + replace((select #str for xml path('')),#delimiter,'</x><x>') + '</x>').query('.') as s
)
select rn
,item -- Select the values from the generated XML value by CROSS APPLYing to the XML nodes
from(select row_number() over (order by (select null)) as rn
,n.x.value('.','nvarchar(max)') as item
from s
cross apply s.nodes('x') as n(x)
) a
where rn = #num
or #num is null;

MS SQL REPLACE based on 1 character to the left of the $

I am not a SQL expert so please forgive me if this is SQL 101 :).
In a select statement there are 2 replace functions. They look for a Servername and it's admin share d$ by it's UNC path. Example '\SERVERNAME\d$'
It then replaces '\SERVERNAME\d$' with 'D:'.
Here is the query currently:
select Replace(p.Path,'\\SERVERNAME\d$','D:') as searchpath
,p.path as fullpath
,s.ShareName
,s.SharePath
,p.Member
,p.Access
From Paths As p
Left Outer Join Shares as s on
Replace(p.Path,'\\SERVERNAME\d$','D:') Like s.SharePath + '\%'
Up until now it has always been d$.
Today my needs have changed and I need the query to find ANY servername UNC path admin share regardless of share letter (c$, d$, e$, f$...etc) and replace it with it's respective drive letter (D:, E:, F:... etc).
My thought is replace function could find the $ and look one character to the left of it to get the proper share letter, then use that for the replace. The issue I have, not being a SQL professional, is that I know SQL can likley do what I need it to do...I just don't know how to get there. I've googled and found some examples, but haven't had any luck in getting them to work.
Any help would be greatly appreciated.
You can use a combination of STUFF, PATINDEX, LEN to get what you want.
Sample Query
DECLARE #ReplaceChar VARCHAR(100) = '[prefixcharacters]\\SERVERNAME\d$[postcharacter]'
DECLARE #SearchString VARCHAR(100) = '\\SERVERNAME\_$'
SELECT
STUFF(#ReplaceChar,PATINDEX('%' + #SearchString + '%',#ReplaceChar),LEN(#SearchString),
UPPER(SUBSTRING(#ReplaceChar,PATINDEX('%' + #SearchString + '%',#ReplaceChar) + LEN(#SearchString) - 2,1)) + ':') as searchpath
WHERE PATINDEX('%' + #SearchString + '%',#ReplaceChar) > 0
Output
[prefixcharacters]D:[postcharacter]
Alternate Query
You can shorten the query if you want to get the previous character before $ as per your title. Something like this
DECLARE #ReplaceChar VARCHAR(100) = '[prefixcharacters]\\SERVERNAME\d$[postcharacter]'
DECLARE #SearchString VARCHAR(100) = '\\SERVERNAME\_$'
SELECT
STUFF(#ReplaceChar,
PATINDEX('%'+#SearchString+'%',#ReplaceChar),
LEN(#SearchString),
UPPER(SUBSTRING(#ReplaceChar,CHARINDEX('$',#ReplaceChar) -1,1)) + ':')
WHERE PATINDEX('%'+#SearchString+'%',#ReplaceChar) > 0
In this query
STUFF replaces your pattern with with the character before $ + ':'
Start of pattern is identified by PATINDEX('%'+#SearchString+'%',#ReplaceChar)
D is identified by getting the charindex of '$' and then getting the previous character using SUBSTRING
What about ¸
select Replace(SUBSTRING(p.path, 14, Len(#spath)-14),'$',':') as searchpath
,p.path as fullpath
,s.ShareName
,s.SharePath
,p.Member
,p.Access
From Paths As p
Left Outer Join Shares as s on
Replace(SUBSTRING(p.path, 14, Len(#spath)-14),'$',':') Like s.SharePath + '\%
select as searchpath
DECLARE #str nvarchar (100)
SET #str = '\\SERVERNAME\d$'
IF #str LIKE '\\SERVERNAME\_$'
SET #str = UPPER(SUBSTRING(#str, 14, 1)) + ':'
SELECT #str
Starting from previous, something like
select UPPER(SUBSTRING(p.path, 14, 1)) + ':' as searchpath
,p.path as fullpath
,s.ShareName
,s.SharePath
,p.Member
,p.Access
From Paths As p
Left Outer Join Shares as s on
SUBSTRING(p.path, 14, 1) + ':' Like s.SharePath + '\%'
I am no mysql expert either :)
Based on the logic you mentioned in the last part of the question, I have used concat and substring to get to the drive letter in the column.
Hope this helps
select replace(path, concat(substring(path, 1, locate('$', path) - 2), substring(path, locate('$', path) - 1, 1) , '$'), concat(substring(path, locate('$', path) - 1, 1) , ':')) as searchpath ...
The remaining part of the query would be the same.

Number 0 is not saving to database as a prefix in SQL Server of CHAR data type column

I am trying to insert an value as '019393' into a table with a CHAR(10) column.
It is inserting only '19393' into the database
I am implementing this feature in a stored procedure, doing some manipulation like incrementing that number by 15 and saving it back with '0' as the prefix
I am using SQL Server database
Note: I tried CASTING that value as VARCHAR before saving to the database, but even that did not get the solution
Code
SELECT
#fromBSB = fromBSB, #toBSB = toBSB, #type = Type
FROM
[dbo].[tbl_REF_SpecialBSBRanges]
WHERE
CAST(#inputFromBSB AS INT) BETWEEN fromBSB AND toBSB
SET #RETURNVALUE = #fromBSB
IF(#fromBSB = #inputFromBSB)
BEGIN
PRINT 'Starting Number is Equal';
DELETE FROM tbl_REF_SpecialBSBRanges
WHERE Type = #type AND fromBSB = #fromBSB AND toBSB = #toBSB
INSERT INTO [tbl_REF_SpecialBSBRanges] ([Type], [fromBSB], [toBSB])
VALUES(#type, CAST('0' + #fromBSB + 1 AS CHAR), #toBSB)
INSERT INTO [tbl_REF_SpecialBSBRanges] ([Type], [fromBSB], [toBSB])
VALUES(#inputBSBName, #inputFromBSB, #inputToBSB)
END
Okay, without knowing the column datatypes, I would suggest trying this:
Change from
CAST('0'+#fromBSB+1 AS CHAR)
To
'0'+CAST(#fromBSB+1 AS CHAR(10))
But if the columns are integers this won't make a difference.

Oracle split text into multiple rows

Inside a varchar2 column I have text values like :
aaaaaa. fgdfg.
bbbbbbbbbbbbbb ccccccccc
dddddd ddd dddddddddddd,
asdasdasdll
sssss
if i do select column from table where id=... i get the whole text in a single row, normally.
But i would like to get the result in multiple rows, 5 for the example above.
I have to use just one select statement, and the delimiters will be new line or carriage return (chr(10), chr(13) in oracle)
Thank you!
Like this, maybe (but it all depends on the version of oracle you are using):
WITH yourtable AS (SELECT REPLACE('aaaaaa. fgdfg.' ||chr(10)||
'bbbbbbbbbbbbbb ccccccccc ' ||chr(13)||
'dddddd ddd dddddddddddd,' ||chr(10)||
'asdasdasdll ' ||chr(13)||
'sssss '||chr(10),chr(13),chr(10)) AS astr FROM DUAL)
SELECT REGEXP_SUBSTR ( astr, '[^' ||chr(10)||']+', 1, LEVEL) data FROM yourtable
CONNECT BY LEVEL <= LENGTH(astr) - LENGTH(REPLACE(astr, chr(10))) + 1
see: Comma Separated values in Oracle
The answer by Kevin Burton contains a bug if your data contains empty lines.
The adaptation below, based on the solution invented here, works. Check that post for an explanation on the issue and the solution.
WITH yourtable AS (SELECT REPLACE('aaaaaa. fgdfg.' ||chr(10)||
'bbbbbbbbbbbbbb ccccccccc ' ||chr(13)||
chr(13)||
'dddddd ddd dddddddddddd,' ||chr(10)||
'asdasdasdll ' ||chr(13)||
'sssss '||chr(10),chr(13),chr(10)) AS astr FROM DUAL)
SELECT REGEXP_SUBSTR ( astr, '([^' ||chr(10)||']*)('||chr(10)||'|$)', 1, LEVEL, null, 1) data FROM yourtable
CONNECT BY LEVEL <= LENGTH(astr) - LENGTH(REPLACE(astr, chr(10))) + 1;

How to do hit-highlighting of results from a SQL Server full-text query

We have a web application that uses SQL Server 2008 as the database. Our users are able to do full-text searches on particular columns in the database. SQL Server's full-text functionality does not seem to provide support for hit highlighting. Do we need to build this ourselves or is there perhaps some library or knowledge around on how to do this?
BTW the application is written in C# so a .Net solution would be ideal but not necessary as we could translate.
Expanding on Ishmael's idea, it's not the final solution, but I think it's a good way to start.
Firstly we need to get the list of words that have been retrieved with the full-text engine:
declare #SearchPattern nvarchar(1000) = 'FORMSOF (INFLECTIONAL, " ' + #SearchString + ' ")'
declare #SearchWords table (Word varchar(100), Expansion_type int)
insert into #SearchWords
select distinct display_term, expansion_type
from sys.dm_fts_parser(#SearchPattern, 1033, 0, 0)
where special_term = 'Exact Match'
There is already quite a lot one can expand on, for example the search pattern is quite basic; also there are probably better ways to filter out the words you don't need, but it least it gives you a list of stem words etc. that would be matched by full-text search.
After you get the results you need, you can use RegEx to parse through the result set (or preferably only a subset to speed it up, although I haven't yet figured out a good way to do so). For this I simply use two while loops and a bunch of temporary table and variables:
declare #FinalResults table
while (select COUNT(*) from #PrelimResults) > 0
begin
select top 1 #CurrID = [UID], #Text = Text from #PrelimResults
declare #TextLength int = LEN(#Text )
declare #IndexOfDot int = CHARINDEX('.', REVERSE(#Text ), #TextLength - dbo.RegExIndexOf(#Text, '\b' + #FirstSearchWord + '\b') + 1)
set #Text = SUBSTRING(#Text, case #IndexOfDot when 0 then 0 else #TextLength - #IndexOfDot + 3 end, 300)
while (select COUNT(*) from #TempSearchWords) > 0
begin
select top 1 #CurrWord = Word from #TempSearchWords
set #Text = dbo.RegExReplace(#Text, '\b' + #CurrWord + '\b', '<b>' + SUBSTRING(#Text, dbo.RegExIndexOf(#Text, '\b' + #CurrWord + '\b'), LEN(#CurrWord) + 1) + '</b>')
delete from #TempSearchWords where Word = #CurrWord
end
insert into #FinalResults
select * from #PrelimResults where [UID] = #CurrID
delete from #PrelimResults where [UID] = #CurrID
end
Several notes:
1. Nested while loops probably aren't the most efficient way of doing it, however nothing else comes to mind. If I were to use cursors, it would essentially be the same thing?
2. #FirstSearchWord here to refers to the first instance in the text of one of the original search words, so essentially the text you are replacing is only going to be in the summary. Again, it's quite a basic method, some sort of text cluster finding algorithm would probably be handy.
3. To get RegEx in the first place, you need CLR user-defined functions.
It looks like you could parse the output of the new SQL Server 2008 stored procedure sys.dm_fts_parser and use regex, but I haven't looked at it too closely.
You might be missing the point of the database in this instance. Its job is to return the data to you that satisfies the conditions you gave it. I think you will want to implement the highlighting probably using regex in your web control.
Here is something a quick search would reveal.
http://www.dotnetjunkies.com/PrintContent.aspx?type=article&id=195E323C-78F3-4884-A5AA-3A1081AC3B35
Some details:
search_kiemeles=replace(lcase(search),"""","")
do while not rs.eof 'The search result loop
hirdetes=rs("hirdetes")
data=RegExpValueA("([A-Za-zöüóőúéáűíÖÜÓŐÚÉÁŰÍ0-9]+)",search_kiemeles) 'Give back all the search words in an array, I need non-english characters also
For i=0 to Ubound(data,1)
hirdetes = RegExpReplace(hirdetes,"("&NoAccentRE(data(i))&")","<em>$1</em>")
Next
response.write hirdetes
rs.movenext
Loop
...
Functions
'All Match to Array
Function RegExpValueA(patrn, strng)
Dim regEx
Set regEx = New RegExp ' Create a regular expression.
regEx.IgnoreCase = True ' Set case insensitivity.
regEx.Global = True
Dim Match, Matches, RetStr
Dim data()
Dim count
count = 0
Redim data(-1) 'VBSCript Ubound array bug workaround
if isnull(strng) or strng="" then
RegExpValueA = data
exit function
end if
regEx.Pattern = patrn ' Set pattern.
Set Matches = regEx.Execute(strng) ' Execute search.
For Each Match in Matches ' Iterate Matches collection.
count = count + 1
Redim Preserve data(count-1)
data(count-1) = Match.Value
Next
set regEx = nothing
RegExpValueA = data
End Function
'Replace non-english chars
Function NoAccentRE(accent_string)
NoAccentRE=accent_string
NoAccentRE=Replace(NoAccentRE,"a","§")
NoAccentRE=Replace(NoAccentRE,"á","§")
NoAccentRE=Replace(NoAccentRE,"§","[aá]")
NoAccentRE=Replace(NoAccentRE,"e","§")
NoAccentRE=Replace(NoAccentRE,"é","§")
NoAccentRE=Replace(NoAccentRE,"§","[eé]")
NoAccentRE=Replace(NoAccentRE,"i","§")
NoAccentRE=Replace(NoAccentRE,"í","§")
NoAccentRE=Replace(NoAccentRE,"§","[ií]")
NoAccentRE=Replace(NoAccentRE,"o","§")
NoAccentRE=Replace(NoAccentRE,"ó","§")
NoAccentRE=Replace(NoAccentRE,"ö","§")
NoAccentRE=Replace(NoAccentRE,"ő","§")
NoAccentRE=Replace(NoAccentRE,"§","[oóöő]")
NoAccentRE=Replace(NoAccentRE,"u","§")
NoAccentRE=Replace(NoAccentRE,"ú","§")
NoAccentRE=Replace(NoAccentRE,"ü","§")
NoAccentRE=Replace(NoAccentRE,"ű","§")
NoAccentRE=Replace(NoAccentRE,"§","[uúüű]")
end function

Resources