Need to remove data from a substring - sql-server

I haven't the foggiest idea how to remove a substring from my column. I have been looking here for a few days, and everyone seems to want to remove data from the end, not the beginning.
Column data: /data/data/data.com --data=nameiwant2keep
Column name: column1
Table name: table1
Thank you for any help.

Assuming you wanted to keep just the nameiwant2keep part, and there won't be any other parameters in the column data, you can search for the index of the = sign and take the substring from the next character to the end of the string:
UPDATE
table1
SET
column1 = SUBSTRING(column1, CHARINDEX('=', column1, 0) + 1, LEN(column1))
Try this for proof:
DECLARE @column1 varchar(max)
SET @column1 = '/data/data/data.com --data=nameiwant2keep'
SELECT
SUBSTRING(@column1, CHARINDEX('=', @column1, 0) + 1, LEN(@column1))
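To see why the `+ 1` is needed and what happens to rows that contain no `=` sign, here is a small Python model of the same index arithmetic. The helpers `charindex` and `substring` are hypothetical stand-ins that mimic the 1-based behaviour of the T-SQL functions for positive start positions; this is a sketch for illustration, not SQL Server code.

```python
def charindex(needle: str, haystack: str) -> int:
    """Mimic T-SQL CHARINDEX: 1-based position of needle, or 0 if absent."""
    return haystack.find(needle) + 1  # str.find returns -1 when not found


def substring(value: str, start: int, length: int) -> str:
    """Mimic T-SQL SUBSTRING for start >= 1 (1-based start, length in characters)."""
    return value[start - 1:start - 1 + length]


def keep_after_equals(value: str) -> str:
    # Same expression as in the UPDATE, written with the helper functions.
    return substring(value, charindex("=", value) + 1, len(value))


print(keep_after_equals("/data/data/data.com --data=nameiwant2keep"))  # nameiwant2keep
# With no '=', charindex returns 0, the start becomes 1, and the value is unchanged.
print(keep_after_equals("no equals sign here"))  # no equals sign here
```

Rows without an `=` come back unchanged from this expression. If you would rather skip them explicitly, a `WHERE column1 LIKE '%=%'` clause on the UPDATE should do it.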

Related

How can I CUT a specific string part to another column in SQL?

I have about 500 records in a table with an nvarchar column.
I want to cut a part of that data into another column. By "cut" I mean deleting it from the original column and adding it to the target column.
All the data that has to be cut is contained within brackets. The bracketed text may occur anywhere in the string.
For example, ColumnA has: SomeTest Data [I want to cut this], and I want to move [I want to cut this] (but without the brackets) to ColumnB.
How do I achieve this?
UPDATE
I eventually figured it out. The problem was that I didn't escape my brackets.
What I have now (and works):
UPDATE TableA
SET TargetColumn = substring(SourceColumn,charindex('[',SourceColumn)+1,charindex(']',SourceColumn)-charindex('[',SourceColumn)-1),
SourceColumn = substring(SourceColumn, 0, charindex('[',SourceColumn))
where TableA.SourceColumn like '%\[%\]%' ESCAPE '\'
An UPDATE statement along these lines would do it:
CREATE TABLE #Test
(
StringToCut VARCHAR(100)
,CutValue VARCHAR(100)
)
INSERT #Test
VALUES
('SomeTest Data 1 [I want to cut this 1] More Testing',NULL),
('SomeTest Data 2 [I want to cut this 2]',NULL),
('SomeTest Data 3 [I want to cut this 3] Additional Test',NULL),
('[I want to cut this 4] last test',NULL)
SELECT * FROM #Test
--Populate CutValue column based on starting position of '[' and ending position of ']'
UPDATE #Test
SET CutValue = SUBSTRING(StringToCut,CHARINDEX('[',StringToCut),(CHARINDEX(']',StringToCut)-CHARINDEX('[',StringToCut)))
--Remove the '[' ']'
UPDATE #Test
SET CutValue = REPLACE(CutValue,'[','')
UPDATE #Test
SET CutValue = REPLACE(CutValue,']','')
--Remove the bracketed text, including '[' and ']', from StringToCut (text after ']' is kept)
UPDATE #Test
SET StringToCut = LEFT(StringToCut,CHARINDEX('[',StringToCut)-1) + LTRIM(RIGHT(StringToCut,LEN(StringToCut)-CHARINDEX(']',StringToCut)))
SELECT * FROM #Test
DROP TABLE #Test
You left some questions unanswered.
How do you want to handle NULL values? I am leaving them NULL.
Where should the 'cut' string go? I am assuming "at the end"
What do you do if you find nested brackets? [[cut me]]
Do you need to remove any surrounding spaces? For example, does "The cat [blah] sleeps" become "The cat  sleeps" with two spaces before "sleeps"?
To make the operation atomic, you'll want to use a single UPDATE.
Here is a sample script to get you started.
--build a table variable with sample data
declare @t table(ikey int, sourcecolumn nvarchar(100), targetcolumn nvarchar(100));
insert into @t
select 0,'SomeTest Data [I want to cut this]','Existing Data For Row 1'
union select 1,'SomeTest [cut this too] Data2','Existing Data For Row 2'
union select 2,'[also cut this please] SomeTest Data3',null
union select 3,null,null
union select 4,null,''
union select 5,'Nested bracket example [[[within nested brackets]]] Other data',null
union select 6,'Example with no brackets',null
union select 7,'No brackets, and empty string in target',''
--show "before"
select * from @t order by ikey
--cut and paste
update @t
set
targetcolumn =
isnull(targetcolumn,'') +
case when 0 < isnull(charindex('[',sourcecolumn),0) and 0 < isnull(charindex(']',sourcecolumn),0)
then substring(sourcecolumn,charindex('[',sourcecolumn)+1,charindex(']',sourcecolumn)-charindex('[',sourcecolumn)-1)
else ''
end
,sourcecolumn =
case when sourcecolumn is null
then null
else substring(sourcecolumn,0,charindex('[',sourcecolumn)) + substring(sourcecolumn,charindex(']',sourcecolumn)+1,len(sourcecolumn))
end
--'[' is a wildcard character in LIKE, so a literal '[' is written as [[]
where sourcecolumn like '%[[]%'
and sourcecolumn like '%]%'
--show "after"
select * from @t order by ikey
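The open questions listed above (NULLs, nested brackets, rows without brackets) are easier to reason about with a tiny model of the cut-and-paste step. The Python sketch below mirrors the arithmetic of the single UPDATE: it cuts from the first `[` to the first `]` and appends the cut text to the target. The function name `cut_bracketed` is hypothetical, and leaving a row untouched when `]` comes before `[` is an assumption of this sketch, not something the script above handles.

```python
from typing import Optional, Tuple


def cut_bracketed(source: Optional[str], target: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Move the text between the first '[' and the first ']' from source to the end of target."""
    if source is None:
        return source, target  # NULL source rows are left untouched
    start = source.find("[")
    end = source.find("]")
    if start == -1 or end == -1 or end < start:
        return source, target  # assumption: no usable bracket pair means nothing to cut
    cut = source[start + 1:end]
    remaining = source[:start] + source[end + 1:]
    return remaining, (target or "") + cut


print(cut_bracketed("SomeTest [cut this too] Data2", "Existing Data For Row 2"))
# ('SomeTest  Data2', 'Existing Data For Row 2cut this too')
print(cut_bracketed("Nested [[[within]]] Other", None))
# ('Nested ]] Other', '[[within')
```

The second call shows what "first `[` to first `]`" does to nested brackets: the leftover `]]` stays in the source and the extra `[[` travels to the target, so nested input needs its own rule if it can occur in your data. The first call shows the double space left behind when the bracketed text sits in the middle of a sentence.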
And another one, in a single UPDATE statement:
CREATE TABLE #Test
(
StringToCut VARCHAR(50)
,CutValue VARCHAR(50)
)
INSERT #Test
VALUES
('SomeTest Data 1 [I want to cut this 1]',NULL),
('SomeTest Data 2 [I want to cut this 2]',NULL),
('SomeTest Data 3 [I want to cut this 3]',NULL),
('SomeTest Data 4 [I want to cut this 4]',NULL)
UPDATE #Test
SET CutValue =
SUBSTRING(StringToCut, CHARINDEX('[', StringToCut)+1, CHARINDEX(']', StringToCut) - CHARINDEX('[', StringToCut) - 1)
SELECT * FROM #Test

Most effective way to check sub-string exists in comma-separated string in SQL Server

I have a comma-separated list column available which has values like
Product1, Product2, Product3
I need to search whether the given product name exists in this column.
I used this SQL and it is working fine.
Select *
from ProductsList
where productname like '%Product1%'
This query is working very slowly. Is there a more efficient way I can search for a product name in the comma-separated list to improve the performance of the query?
Please note I have to search comma separated list before performing any other select statements.
You can use a user-defined function to split the comma-separated string into rows:
Create FUNCTION [dbo].[BreakStringIntoRows] (@CommadelimitedString varchar(max))
RETURNS @Result TABLE (Column1 VARCHAR(max))
AS
BEGIN
DECLARE @IntLocation INT
WHILE (CHARINDEX(',', @CommadelimitedString, 0) > 0)
BEGIN
SET @IntLocation = CHARINDEX(',', @CommadelimitedString, 0)
INSERT INTO @Result (Column1)
--LTRIM and RTRIM to ensure blank spaces are removed
SELECT RTRIM(LTRIM(SUBSTRING(@CommadelimitedString, 0, @IntLocation)))
SET @CommadelimitedString = STUFF(@CommadelimitedString, 1, @IntLocation, '')
END
INSERT INTO @Result (Column1)
SELECT RTRIM(LTRIM(@CommadelimitedString))--LTRIM and RTRIM to ensure blank spaces are removed
RETURN
END
Declare @productname Nvarchar(max)
set @productname='Product1,Product2,Product3'
select * from product where [productname] in(select * from [dbo].[BreakStringIntoRows](@productname))
Felix is right and the 'right answer' is to normalize your table. However, maybe you have 500k lines of code that expect this column to exist as it is. In that case, your next best (non-destructive) answer is:
Create a table to hold the normalized data:
CREATE TABLE ProductsList2 (ProductId INT, ProductName VARCHAR(100)) -- the length 100 is an assumption; VARCHAR with no length would hold only 1 character
Create a TRIGGER that on UPDATE/INSERT/DELETE maintains ProductsList2 by splitting the string 'Product1,Product2,Product3' into three records.
Index your new table.
Query against your new table:
SELECT *
FROM ProductsList
WHERE ProductId IN (SELECT x.ProductId
FROM ProductsList2 x
WHERE x.ProductName = 'Product1')
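Besides speed, splitting the list changes what counts as a match. A pattern such as `'%Product1%'` is a substring test, so it also hits rows that only contain `Product10`, whereas comparing against the split values matches whole names. The Python sketch below models the two behaviours; the function names are hypothetical, and the comparison is case-sensitive here, which may differ from your column's collation.

```python
def substring_match(csv_value: str, product: str) -> bool:
    """Model of LIKE '%product%': true for any substring hit."""
    return product in csv_value


def token_match(csv_value: str, product: str) -> bool:
    """Split on commas, trim spaces, and compare whole names only."""
    return product in [name.strip() for name in csv_value.split(",")]


row = "Product10, Product2, Product3"
print(substring_match(row, "Product1"))  # True, although only Product10 is listed
print(token_match(row, "Product1"))      # False
print(token_match(row, "Product2"))      # True
```

If the original LIKE query has to stay for a while, it is worth checking whether product names that are prefixes of other names exist in the data.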

How to return list of duplicate words and the count of instances in a table

I basically have a table with a column. Let's call the column 'Summary'.
Suppose 'Summary' looks like this: I went to the park to find a dog. The dog was not there. I left because there was no dog.
I want to be able to return a list that gives me the duplicate words and the hit count of how many times each one appeared. I won't know in advance which words are duplicates, so I cannot hard-code them into the SQL query.
I need the results to be "Dog" - 3, "The" - 2, "I" - 2.
I can't post images, so I cannot post a table.
This is not necessarily a very efficient way of achieving the result you are looking for, but this will output a list of words that have a count of 2 or more in the specified summary:
DECLARE @summary NVARCHAR(MAX)
SET @summary = N'I went to the park to find a dog. The dog was not there. I left because there was no dog.'
SET NOCOUNT ON
DECLARE @PosA INT
DECLARE @Word NVARCHAR(MAX)
-- A temporary table to hold matches
CREATE TABLE dbo.#WordList
(
Word NVARCHAR(MAX),
WordCount INT
)
SET @PosA = 0
WHILE (LEN(@summary) > 0)
BEGIN
-- Find the position of the word end
SET @PosA = CHARINDEX(' ', @summary)
IF (@PosA = 0)
SET @PosA = LEN(@summary) + 1
-- Extract the word and shorten the summary text
SET @Word = SUBSTRING(@summary, 0, @PosA)
IF (@PosA < LEN(@summary))
SET @summary = SUBSTRING(@summary, @PosA + 1, LEN(@summary) - @PosA)
ELSE
SET @summary = ''
-- Strip punctuation
SET @Word = REPLACE(REPLACE(@Word, '.', ''), ',', '')
-- Add or create the word
IF EXISTS ( SELECT TOP 1 1 FROM dbo.#WordList WHERE Word = @Word)
UPDATE dbo.#WordList
SET WordCount = WordCount + 1
WHERE (Word = @Word)
ELSE
INSERT INTO dbo.#WordList (Word, WordCount)
VALUES (@Word, 1)
END
-- Get results
SELECT *
FROM dbo.#WordList
WHERE (WordCount > 1)
ORDER BY Word
--- Tidy up
DROP TABLE dbo.#WordList
Effectively, split the summary text by each space and then remove punctuation from the resulting word. The resulting words are stored in the #WordList temporary table, with the count incremented as appropriate.
Finally the results are returned at the end.
Note that you may wish to improve the punctuation removal as I only added full-stops and commas for the purposes of this answer.
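As a cross-check of what the loop should produce for the sample sentence, here is the same split, strip, and count logic as a short Python sketch. The function name `duplicate_words` is hypothetical, and lower-casing the words is an assumption that stands in for a case-insensitive collation (so that "The" and "the" count as one word).

```python
from collections import Counter


def duplicate_words(summary: str) -> dict:
    """Return {word: count} for words that appear more than once."""
    words = []
    for raw in summary.split(" "):
        # Strip full stops and commas, as in the T-SQL loop.
        word = raw.replace(".", "").replace(",", "").lower()
        if word:  # skip empty pieces produced by repeated spaces
            words.append(word)
    counts = Counter(words)
    return {word: n for word, n in counts.items() if n > 1}


summary = "I went to the park to find a dog. The dog was not there. I left because there was no dog."
print(duplicate_words(summary))
# {'i': 2, 'to': 2, 'the': 2, 'dog': 3, 'was': 2, 'there': 2}
```

Note that the sample sentence contains more duplicates than the three listed in the question: "to", "was", and "there" also appear twice.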
I think that for each row, you'll need to split the summary column into separate rows. Then you can run a SELECT on that result set, counting each value. Here's a link to a bunch of nice Split functions:
Split functions
They are pretty old, but still very effective. I think a table-valued function (TVF) like this should get you going:
CREATE FUNCTION dbo.Split (@sep char(1), @s varchar(512))
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT 1, 1, CHARINDEX(@sep, @s)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS s
FROM Pieces
)
DECLARE @summaries TABLE (id int, summary nvarchar(max))
INSERT @summaries values
(1,N'I went to the park to find a dog. The dog was not there. I left because there was no dog.')
SELECT id, word, COUNT(*) c
FROM @summaries
CROSS APPLY (SELECT CAST('<a>'+REPLACE(summary,' ','</a><a>')+'</a>' AS xml) xml1 ) t1
CROSS APPLY (SELECT n.value('.','varchar(max)') AS word FROM xml1.nodes('a') x(n) ) t2
GROUP BY id, word
HAVING COUNT(*) > 1

Select just first line of chars up to CR/LF from a text column

Is it possible to select or substring just the first line of chars in a SQL Server text column, to then prepend as the first line of chars in another text field in another table?
If you are running SQL Server 2005 or higher:
In the LEFT command, use CHARINDEX on CHAR(13) to find the position of the first carriage return character, as in the following example:
declare @a table(id int identity(1,1) not null, lines text); --Source
declare @b table(id int identity(1,1) not null, lines text); --Target
insert into @a(lines) values ('1111111'+char(13)+char(10)+'222222')
insert into @b(lines) values ('aaaaa');
update b
set lines=LEFT(cast(a.lines as varchar(max)),CHARINDEX(char(13),cast(a.lines as varchar(max)),1)-1)+cast(b.lines as varchar(max))
from @a a
join @b b on a.id=b.id;
select * from @b;
I suggest also updating your TEXT data types to varchar(max), if possible. varchar(max) is much more robust.
Yes, do a substring or left till the first newline of the text field.
You could easily assign this via subquery for use in insert or update statements.
SELECT ( CASE WHEN CHARINDEX(CHAR(13), action_Item.Description) = 0
THEN action_Item.Description
ELSE SUBSTRING(action_Item.Description, 0,
CHARINDEX(CHAR(13), action_Item.Description))
END ) AS [Description] FROM action_Item
Here I am selecting the first line of the "Description" field from a table called "action_Item".
DECLARE @crlf char(2);
SET @crlf = CHAR(13) + CHAR(10);
UPDATE table1
SET fieldToPrepend = LEFT(table2.fieldWithCRLF, CHARINDEX(@crlf, table2.fieldWithCRLF, 0) - 1) + table1.fieldToPrepend
FROM table1
INNER JOIN table2
ON table1.sharedKey = table2.sharedKey
WHERE CHARINDEX(@crlf, table2.fieldWithCRLF, 0) > 0
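The answers above differ mainly in how they treat a value that has no line break at all: the CASE expression returns the whole value, the WHERE clause skips the row, and a bare `LEFT(..., CHARINDEX(...) - 1)` would be handed a length of -1. The Python sketch below models the guarded behaviour; `first_line` and `prepend_first_line` are hypothetical names, and treating either CR or LF as the end of the first line is an assumption of this sketch.

```python
def first_line(text: str) -> str:
    """Return everything before the first CR or LF, or the whole text if there is none."""
    for i, ch in enumerate(text):
        if ch in "\r\n":
            return text[:i]
    return text


def prepend_first_line(source: str, target: str) -> str:
    # Plain concatenation, as in the UPDATE statements above (no separator is added).
    return first_line(source) + target


print(repr(first_line("1111111\r\n222222")))                   # '1111111'
print(repr(first_line("single line")))                         # 'single line'
print(repr(prepend_first_line("1111111\r\n222222", "aaaaa")))  # '1111111aaaaa'
```

If the prepended text should sit on its own line in the target, a line break would have to be concatenated between the two parts; none of the statements above does that.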

Search a varchar field that contains all words from another string

I'm trying to write a small stored procedure without needing to add freetext indexing just for this (SQL Server 2008).
Basically, I want to find all records where a certain field contains all the words from a parameter.
So if the field contains "This is a test field" and the parameter to my SP is "this test field", it would return the record, as it would if the parameter were "field this test".
The table is very small (4,000 records) and load will be low, so efficiency is not a big deal. Right now the only solution I can think of is to split both strings with a table-valued function and go from there.
Any simpler ideas?
Thanks!
If efficiency is not a big problem, why not go with a bit of dynamic SQL? Something like:
create procedure myproc (@var varchar(100))
as
set @var = '%' + replace(@var, ' ', '%') + '%'
exec ('select * from mytable where myfield like '''+ @var + '''')
Here is a solution using recursive CTEs. This actually uses two separate recursions. The first one splits the strings into tokens and the second one recursively filters the records using each token.
declare
@searchString varchar(max),
@delimiter char;
select
@searchString = 'This is a test field'
,@delimiter = ' '
declare @tokens table(pos int, string varchar(max))
;WITH Tokens(pos, start, stop) AS (
SELECT 1, 1, CONVERT(int, CHARINDEX(@delimiter, @searchString))
UNION ALL
SELECT pos + 1, stop + 1, CONVERT(int, CHARINDEX(@delimiter, @searchString, stop + 1))
FROM Tokens
WHERE stop > 0
)
INSERT INTO @tokens
SELECT pos,
SUBSTRING(@searchString, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS string
FROM Tokens
OPTION (MAXRECURSION 25000) ;
;with filter(ind, myfield) as (
select 1,myfield from mytable where myfield like '%'+(select string from @tokens where pos = 1)+'%'
union all
select ind + 1, myfield from filter where myfield like '%'+(select string from @tokens where pos = ind + 1)+'%'
)
select * from filter where ind = (select COUNT(1) from @tokens)
This took me about 15 seconds to search a table of 10k records for the search string 'this is a test field'. The more words in the string, the longer it takes.
Edit
If you want a fuzzy search, i.e. one that returns closely matching results even if there wasn't an exact match, you could modify the last line in the query to be:
select * from (select max(ind) as ind, myfield from filter group by myfield) t order by ind desc
'ind' would give you the number of words from the search string found in myfield.
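When comparing the approaches, it helps to separate two different matching rules. A single LIKE pattern built by joining the words with `%` (as in `'%this%test%field%'`) requires the words to appear in that order, while the question asks for all words in any order. The Python sketch below models both rules so the difference can be tested on sample input; the function names are hypothetical, and the case-insensitive comparison is an assumption that stands in for a case-insensitive collation.

```python
import re


def contains_all_words(field: str, search: str) -> bool:
    """True when every word of search occurs somewhere in field, in any order.

    Like a LIKE '%word%' test, this is a substring match, so "is" also matches inside "this".
    """
    haystack = field.lower()
    return all(word in haystack for word in search.lower().split())


def matches_in_order(field: str, search: str) -> bool:
    """Model of LIKE '%w1%w2%...%': the words must appear in the given order."""
    pattern = ".*".join(re.escape(word) for word in search.split())
    return re.search(pattern, field, re.IGNORECASE | re.DOTALL) is not None


field = "This is a test field"
print(contains_all_words(field, "this test field"))  # True
print(contains_all_words(field, "field this test"))  # True
print(matches_in_order(field, "this test field"))    # True
print(matches_in_order(field, "field this test"))    # False
```

The last line is the case from the question ("field this test") that an ordered pattern misses, so an order-independent check needs one test per word rather than one combined pattern.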
