Finding a string in XML column using sql server - sql-server

I have a table with an XML column.
I need to search for a substring in that XML column, across all of its nodes and values. The search should be case-insensitive.
The structure of the XML is different in each row.
I used the query below to do that:
select * from TableName Where Cast(xmlcolumn as varchar(max) ) like '%searchString%'
This works for rows with short XML, but it cannot handle very long XML: only part of the data is searched.
Please suggest another way to achieve this.

If this is a one-time task, I would use the exist() XML method:
DECLARE @Table1 TABLE (
ID INT IDENTITY PRIMARY KEY,
CommentAsXML XML
)
INSERT @Table1 (CommentAsXML)
VALUES (N'<root><item /><item type="Reg">0001</item><item type="Inv">B007</item><item type="Cus">A0001</item><item type="Br">F0001</item></root>')
INSERT @Table1 (CommentAsXML)
VALUES (N'<root><item /><item type="Reg">0005</item><parent><child>B007</child></parent><item type="Br">F0005</item></root>')
INSERT @Table1 (CommentAsXML)
VALUES (N'<root><item /><item type="Reg">0005</item></root>')
-- The following query searches for B007 within the inner text of all XML elements:
SELECT *
FROM @Table1 t
WHERE t.CommentAsXML.exist('//*[lower-case(text()[1]) eq "b007"]') = 1
Results:
ID CommentAsXML
-- ------------------------------------------------------------------------------------------------------------------------------
1 <root><item type="Reg">0001</item><item type="Inv">B007</item><item type="Cus">A0001</item><item type="Br">F0001</item></root>
2 <root><item type="Reg">0005</item><parent><child>B007</child></parent><item type="Br">F0005</item></root>
Also, if you want to search for some text in the values of XML attributes, the following XQuery could be used:
SELECT *
FROM @Table1 t
WHERE t.CommentAsXML.exist('//@*[lower-case(.) eq "reg"]') = 1
Note: in both cases, the string constants (e.g. "reg") must be written in lower case, because the XML values are lower-cased before the comparison.
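The queries above match whole values with eq, while the question asks for a substring search. A minimal sketch of a substring variant, assuming the XQuery contains() function is combined with lower-case() in the same way; the @Docs table variable and its rows are made up for illustration:

```sql
-- Self-contained sketch; @Docs and its sample rows are hypothetical.
DECLARE @Docs TABLE (ID INT IDENTITY PRIMARY KEY, Doc XML);

INSERT @Docs (Doc) VALUES
    (N'<root><item type="Inv">xB007y</item></root>'),
    (N'<root><item type="Reg">0005</item></root>');

-- The search constant is written in lower case, as in the answer above.
SELECT d.ID, d.Doc
FROM @Docs AS d
WHERE d.Doc.exist('//text()[contains(lower-case(.), "b007")]') = 1   -- element text
   OR d.Doc.exist('//@*[contains(lower-case(.), "b007")]') = 1;      -- attribute values
```

With this sample data only the first row should qualify, since "b007" occurs inside the lower-cased text "xb007y". Because the search runs on the XML value itself, no cast to varchar(max) is involved.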

Related

Is it possible to perform a regex pattern match in a SQL Server varchar(max) column and return the match?

I have a log table that has some records that have this type of pattern:
.... "RefundId":"re_1ABasdf234234343434", "..."....
I want to extract and return the value of the RefundId in a column in a select statement, is this possible?
If there is only one Refund_ID for each row then you can use something like this:
--Create table
create table T1 (
T1_id int identity(1,1) primary key clustered,
Log_Data varchar(max) null
)
--Insert test data
insert T1(Log_Data)
values('.... "RefundId":"re_1ABasdf234234343434", "..."....'),
(' "RefundId":"JHHJJHJHJHJJHJH", "..."....'),
(''),
(null)
--Get some results
select *,
left(substring(Log_Data, patindex('%"RefundId":"%', Log_Data) + 12, 20000000),
patindex('%"%', substring(Log_Data, patindex('%"RefundId":"%', Log_Data) + 12, 20000000))
+ case when patindex('%"%', substring(Log_Data, patindex('%"RefundId":"%', Log_Data) + 12, 20000000)) > 0
then -1 else 0 end
) as Refund_ID
from T1
If there are multiple Refund_IDs for each value then you will have to find a different method.
You can use the keyword LIKE
SELECT RefundId
FROM MyTable
WHERE RefundId LIKE 'some pattern'

Sybase - how do I return the first value that exists from a condition in SQL?

Say I'm trying to return some results where a column in a table matches a condition I set. But I only want to return the first result from a list of possible values in the condition. Is there a quick and easy way to do that? I'm thinking that I can use coalesce somehow, but not sure how I can structure it.
Something like:
select identifier,purpose from table
where identifier = 'letters'
and purpose = coalesce('A','B','C')
group by purpose
So if purpose A isn't in the table, I only want purpose B to show up. If B isn't there either, I want C to show up. If none of them are there, I would ideally like a null or no results to be returned. I'd rather not write several case statements where, if A is null, I look to B, and if B is null, I look to C. Is there a syntactically quick way to do this?
Edit: I also want this to work if I have multiple identifiers I list, such as:
select identifier,purpose from table
where identifier in ('letters1', 'letters2')
and purpose = coalesce('A','B','C')
group by purpose
where I return two results if they exist - one purpose for each identifier, with the purpose in the order of importance for A first, then B, then C, or null if none exist.
Unfortunately, my reasoning for coalesce doesn't work above: none of the arguments are null, so coalesce('A','B','C') always evaluates to 'A' and my query only returns rows with purpose 'A', without the fallback I intend. I want to avoid using temp tables if possible.
Sybase ASE does not have support for the row_number() function (else this would be fairly simple), so one option would be to use a #temp table to simulate (to some extent) row_number() functionality.
Some sample data:
create table mytab
(identifier varchar(30)
,purpose varchar(30)
)
go
insert mytab values ('letters1','A')
insert mytab values ('letters1','B')
insert mytab values ('letters1','C')
insert mytab values ('letters2','A')
insert mytab values ('letters2','B')
insert mytab values ('letters2','C')
go
The #temp table is created with an identity column plus a 2nd column to hold the items you wish to prioritize; priority is determined by the order in which the rows are inserted into the #temp table.
create table #priority
(id smallint identity
,purpose varchar(30))
go
insert #priority (purpose)
select 'A' -- first priority
union all
select 'B' -- second priority
union all
select 'C' -- last priority
go
select * from #priority order by id
go
id purpose
------ -------
1 A
2 B
3 C
We'll use a derived table to find the highest priority purpose (ie, minimal id value). We then join this minimal id back to #priority to generate the final result set:
select dt.identifier,
p.purpose
from (-- join mytab with #priority, keeping only the minimal priority id of the rows that exist:
select m.identifier,
min(p.id) as min_id
from mytab m
join #priority p
on p.purpose = m.purpose
group by m.identifier) dt
-- join back to #priority to convert min(id) into the actual purpose:
join #priority p
on p.id = dt.min_id
order by 1
go
Some test runs with different sets of mytab data:
/* contents of mytab:
insert mytab values ('letters1','A')
insert mytab values ('letters1','B')
insert mytab values ('letters1','C')
insert mytab values ('letters2','A')
insert mytab values ('letters2','B')
insert mytab values ('letters2','C')
*/
identifier purpose
---------- -------
letters1 A
letters2 A
/* contents of mytab:
--insert mytab values ('letters1','A')
--insert mytab values ('letters1','B')
insert mytab values ('letters1','C')
--insert mytab values ('letters2','A')
insert mytab values ('letters2','B')
insert mytab values ('letters2','C')
*/
identifier purpose
---------- -------
letters1 C
letters2 B
Returning NULL if a row does not exist is not going to be easy since generating a NULL requires existence of a row ... somewhere ... with which to associate the NULL.
One idea would be to expand on the #temp table idea by creating another #temp table (eg, #identifiers) with the list of desired identifier values you wish to search on. You could then make use of a left (outer) join from #identifiers to mytab to ensure you always generate a result record for each identifier.
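A minimal sketch of that idea, assuming the mytab and #priority tables from above. The #identifiers table and its three rows are hypothetical; 'letters3' is included only to show an identifier with no matching rows:

```sql
-- hypothetical list of identifiers to report on
create table #identifiers
(identifier varchar(30))
go
insert #identifiers values ('letters1')
insert #identifiers values ('letters2')
insert #identifiers values ('letters3')   -- assumed to have no rows in mytab
go
select i.identifier,
       p.purpose                          -- NULL when no prioritized purpose exists
from   #identifiers i
-- same derived table as before: minimal priority id per identifier
left join (select m.identifier,
                  min(p.id) as min_id
           from   mytab m
           join   #priority p
           on     p.purpose = m.purpose
           group by m.identifier) dt
on     dt.identifier = i.identifier
-- convert min(id) back into the purpose; stays NULL if dt has no row
left join #priority p
on     p.id = dt.min_id
order by 1
go
```

Both joins are outer joins so that an identifier without any prioritized purpose still produces one row, with a NULL purpose.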

T-SQL Check if list has values, select and Insert into Table

I'm quite new to T-SQL and currently struggling with an insert statement in my stored procedure: I use as a parameter in the stored procedure a list of ids of type INT.
If the list is NOT empty, I want to store the ids into the table Delivery.
To pass the list of ids, I use a table type:
CREATE TYPE tIdList AS TABLE
(
ID INT NULL
);
GO
Maybe you know a better way to pass a list of ids into a stored procedure?
However, my procedure looks as follows:
-- parameter
@DeliveryModelIds tIdList READONLY
...
DECLARE @StoreId INT = 1;
-- Delivery
IF EXISTS (SELECT * FROM @DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID FROM @DeliveryModelIds;
If the list has values, I want to store them in the DB together with the StoreId, which is always 1.
If I insert the DeliveryModelIds 3, 7, 5, the result in table Delivery should look like this:
DeliveryId | StoreId | DeliveryModelId
1 | 1 | 3
2 | 1 | 7
3 | 1 | 5
Do you have an idea on how to solve this issue?
THANKS !
You can add @StoreId to the select that feeds your insert.
...
IF EXISTS (SELECT * FROM @DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID, @StoreId FROM @DeliveryModelIds;
Additionally, if you only want to insert DeliveryModelId values that do not already exist in the target table, you can use not exists() in the where clause like so:
...
IF EXISTS (SELECT * FROM @DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT dmi.ID, @StoreId
FROM @DeliveryModelIds dmi
where not exists (
select 1
from [MyDB].[Delivery] i
where i.StoreId = @StoreId
and i.DeliveryModelId = dmi.ID
);
You need to modify the INSERT statement to:
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID, 1 FROM @DeliveryModelIds;
This way you select the literal 1 along with the ID field, so the select supplies a value for both target columns.

Most effective way to check sub-string exists in comma-separated string in SQL Server

I have a comma-separated list column available which has values like
Product1, Product2, Product3
I need to search whether the given product name exists in this column.
I used this SQL and it is working fine.
Select *
from ProductsList
where productname like '%Product1%'
This query is working very slowly. Is there a more efficient way I can search for a product name in the comma-separated list to improve the performance of the query?
Please note I have to search comma separated list before performing any other select statements.
You can use a user-defined function to split the comma-separated string into rows:
CREATE FUNCTION [dbo].[BreakStringIntoRows] (@CommadelimitedString varchar(max))
RETURNS @Result TABLE (Column1 VARCHAR(max))
AS
BEGIN
DECLARE @IntLocation INT
-- Peel off one item per iteration until no comma is left
WHILE (CHARINDEX(',', @CommadelimitedString, 0) > 0)
BEGIN
SET @IntLocation = CHARINDEX(',', @CommadelimitedString, 0)
INSERT INTO @Result (Column1)
--LTRIM and RTRIM to ensure blank spaces are removed
SELECT RTRIM(LTRIM(SUBSTRING(@CommadelimitedString, 1, @IntLocation - 1)))
-- Remove the item just read, including its trailing comma
SET @CommadelimitedString = STUFF(@CommadelimitedString, 1, @IntLocation, '')
END
-- Whatever remains is the last item
INSERT INTO @Result (Column1)
SELECT RTRIM(LTRIM(@CommadelimitedString))--LTRIM and RTRIM to ensure blank spaces are removed
RETURN
END
DECLARE @productname NVARCHAR(max)
SET @productname = 'Product1,Product2,Product3'
SELECT * FROM product WHERE [productname] IN (SELECT Column1 FROM [dbo].[BreakStringIntoRows](@productname))
Felix is right and the 'right answer' is to normalize your table. However, maybe you have 500k lines of code that expect this column to exist as it is. So your next best (non-destructive) answer is:
Create a table to hold the normalized data:
CREATE TABLE ProductsList2 (ProductId INT, ProductName VARCHAR(100)) -- the length 100 is an assumption; a bare VARCHAR column would hold only 1 character
Create a TRIGGER that on UPDATE/INSERT/DELETE maintains ProductsList2 by splitting the string 'Product1,Product2,Product3' into three records.
Index your new table.
Query against your new table:
SELECT *
FROM ProductsList
WHERE ProductId IN (SELECT x.ProductId
FROM ProductsList2 x
WHERE x.ProductName = 'Product1')
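The trigger keeps ProductsList2 in step with later changes, but the existing rows still have to be loaded once. A minimal sketch of that initial load, assuming SQL Server 2016 or later (for the built-in STRING_SPLIT function), that ProductsList has ProductId and productname columns as the queries above suggest, and that ProductsList2.ProductName is declared wide enough for a full product name. The index name is made up:

```sql
-- One-off load: one row per product name found in each comma-separated list
INSERT INTO ProductsList2 (ProductId, ProductName)
SELECT pl.ProductId,
       LTRIM(RTRIM(s.value))              -- strip the blanks that follow each comma
FROM ProductsList AS pl
CROSS APPLY STRING_SPLIT(pl.productname, ',') AS s;

-- Hypothetical index name; lets the lookup by ProductName seek instead of scan
CREATE INDEX IX_ProductsList2_ProductName
    ON ProductsList2 (ProductName, ProductId);
```

On older versions, the BreakStringIntoRows function shown earlier could be used with CROSS APPLY in place of STRING_SPLIT.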

sql server merge with multiple insert when not matched

I'm using MERGE in my query and doing an INSERT in the WHEN NOT MATCHED THEN clause, but I would then like to get the identity of the inserted row and do another INSERT into some other table. The query for now is:
ALTER PROCEDURE [dbo].[BulkMergeOffers]
@data ImportDataType READONLY
AS
SET NOCOUNT ON;
DECLARE @cid int = 0
MERGE dbo.oferta AS target
USING @data AS source
ON (target.nr_oferty = source.nr_oferty)
WHEN NOT MATCHED THEN
INSERT (nr_oferty,rynek,typ_transakcji, typ_nieruchomosci,cena,powierzchnia, rok_budowy, wojewodztwo, miasto, dzielnica, ulica, opis, wspolrzedne, film, zrodlo, KontaktStore, data, forma_wlasnosci, stan_techniczny, liczba_pokoi, liczba_peter, pietro, material, kuchnia, pow_dzialki, typ_dzialki, woda,gaz, prad,sila, przeznaczenie,lokal_dane)
VALUES (source.nr_oferty,source.rynek,source.typ_transakcji, source.typ_nieruchomosci,source.cena,source.powierzchnia, source.rok_budowy, source.wojewodztwo, miasto, source.dzielnica, source.ulica, source.opis, source.wspolrzedne, source.film, source.zrodlo, source.KontaktStore, source.data, source.forma_wlasnosci, source.stan_techniczny, source.liczba_pokoi, source.liczba_peter, source.pietro, source.material, source.kuchnia, source.pow_dzialki, source.typ_dzialki, source.woda,source.gaz, source.prad,source.sila, source.przeznaczenie,source.lokal_dane);
So as you can see, I need to insert some values into the target table based on the source data, then take the inserted identity and insert it into another table, also based on some source data. Something like this, just after the first insert:
SET @cid = SCOPE_IDENTITY();
if source.photo is not null
begin
insert into dbo.photos(offerID, file) values (@cid, source.photo);
end
But I can't put it together: I no longer have access to the source at that point, and the if statement shows the error:
"the multi-part identifier
source.photo can not be bound"
even though the column is there. Just for clarity, ImportDataType is a table-valued parameter.
Please HELP
If you don't need the WHEN MATCHED part of the MERGE statement in your query, there's no real reason to use MERGE. You could use INSERT with an outer join or a NOT EXISTS condition.
In either case, you can use the OUTPUT clause to retrieve the inserted identity value and pass it on to a second query.
I've extended your example:
<stored procedure header - unchanged>
--declare a table variable to hold the inserted values data
DECLARE @newData TABLE
(nr_oferty int
,newid int
) -- I'm guessing the datatype for both columns
MERGE dbo.oferta AS target
USING @data AS source
ON (target.nr_oferty = source.nr_oferty)
WHEN NOT MATCHED THEN
INSERT (nr_oferty,rynek,typ_transakcji, typ_nieruchomosci,cena,powierzchnia, rok_budowy, wojewodztwo, miasto, dzielnica, ulica, opis, wspolrzedne, film, zrodlo, KontaktStore, data, forma_wlasnosci, stan_techniczny, liczba_pokoi, liczba_peter, pietro, material, kuchnia, pow_dzialki, typ_dzialki, woda,gaz, prad,sila, przeznaczenie,lokal_dane)
VALUES (source.nr_oferty,source.rynek,source.typ_transakcji, source.typ_nieruchomosci,source.cena,source.powierzchnia, source.rok_budowy, source.wojewodztwo, miasto, source.dzielnica, source.ulica, source.opis, source.wspolrzedne, source.film, source.zrodlo, source.KontaktStore, source.data, source.forma_wlasnosci, source.stan_techniczny, source.liczba_pokoi, source.liczba_peter, source.pietro, source.material, source.kuchnia, source.pow_dzialki, source.typ_dzialki, source.woda,source.gaz, source.prad,source.sila, source.przeznaczenie,source.lokal_dane)
OUTPUT inserted.nr_oferty, inserted.<tableId> INTO @newData;
-- replace <tableId> with the name of the identity column in dbo.oferta
-- [file] is bracketed because FILE is a reserved keyword in T-SQL
insert into dbo.photos(offerID, [file])
SELECT nd.newid, pt.photo
FROM @data AS pt
JOIN @newData AS nd
ON nd.nr_oferty = pt.nr_oferty
WHERE pt.photo IS NOT NULL
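For comparison, the plain INSERT with NOT EXISTS mentioned at the start of this answer could look roughly like the sketch below. It is not a drop-in replacement: the column list is shortened to three columns for readability, `id` is a hypothetical name for the identity column of dbo.oferta, @insertedOffers is a made-up table variable, and the int types are guesses, as in the MERGE version. It is meant to run inside the same procedure, where @data is available:

```sql
-- Sketch only: shortened column list; `id` is a hypothetical identity column name.
DECLARE @insertedOffers TABLE (nr_oferty int, new_id int);

-- Insert only the offers that are not in dbo.oferta yet and capture their new ids
INSERT INTO dbo.oferta (nr_oferty, rynek, cena)
OUTPUT inserted.nr_oferty, inserted.id INTO @insertedOffers (nr_oferty, new_id)
SELECT s.nr_oferty, s.rynek, s.cena
FROM @data AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.oferta AS t
                  WHERE t.nr_oferty = s.nr_oferty);

-- Second insert: join the captured ids back to the source rows that carry a photo
INSERT INTO dbo.photos (offerID, [file])
SELECT n.new_id, s.photo
FROM @data AS s
JOIN @insertedOffers AS n
  ON n.nr_oferty = s.nr_oferty
WHERE s.photo IS NOT NULL;
```

The join back to @data is needed because the OUTPUT clause of a plain INSERT can only return columns of the inserted rows, not other columns of the source.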
