how to select data row from a comma separated value field - sql-server

My question is similar, though not identical, to this question:
How to SELECT parts from a comma-separated field with a LIKE statement
but I did not see an answer there, so I am posting my own question.
I have the following table:
╔════════════╦═════════════╗
║ VacancyId ║ Media ║
╠════════════╬═════════════╣
║ 1 ║ 32,26,30 ║
║ 2 ║ 31, 25,20 ║
║ 3 ║ 21,32,23 ║
╚════════════╩═════════════╝
I want to select the rows whose Media contains media id 30, 21, or 40.
So in this case the query should return the first and the third row.
How can I do that?
I have tried Media LIKE '30', but that does not return anything. Also, I need to search for more than one value in that field.
My database is SQL Server.
Thank you

It's never a good idea to store comma-separated values in a database. If it is feasible, create a separate table to store them, as this is most probably a 1:n relationship.
If that is not feasible, there are the following possible ways to do this.
If the number of values to match is going to stay the same, you can use a series of LIKE conditions combined with OR/AND, depending on your requirement.
Ex.-
WHERE
Media LIKE '%21%'
OR Media LIKE '%30%'
OR Media LIKE '%40%'
However, the query above will match any value that contains 21, so rows with values like 1210 or 210 will also be returned. To overcome this you can use the following trick, though it will hamper performance because it applies functions to the column in the WHERE clause, which prevents a SARGable query.
But here it goes:
-- Declare a variable holding the value to match; for multiple values, use multiple variables.
DECLARE @valueSearch VARCHAR(10) = '21';
-- Then do the matching in the WHERE clause; REPLACE strips spaces so that '31, 25,20' is handled too.
WHERE
(',' + REPLACE(Media, ' ', '') + ',') LIKE '%,' + @valueSearch + ',%'
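For what it's worth, the delimiter-wrapping trick is easy to try outside SQL Server. Below is a rough sketch in Python using the built-in sqlite3 module. SQLite is only a stand-in here (it concatenates with || rather than +), and the table name Vacancy, the extra row 4, and the helper vacancies_with are assumptions made up for the illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vacancy (VacancyId INTEGER, Media TEXT)")
conn.executemany(
    "INSERT INTO Vacancy VALUES (?, ?)",
    [(1, "32,26,30"), (2, "31, 25,20"), (3, "21,32,23"), (4, "210,1210")],
)


def vacancies_with(media_ids):
    """Return VacancyIds whose Media list contains any of media_ids as a whole item."""
    # Wrap the space-stripped list in commas so every item looks like ',item,'.
    clause = " OR ".join(
        "',' || REPLACE(Media, ' ', '') || ',' LIKE ?" for _ in media_ids
    )
    params = [f"%,{m},%" for m in media_ids]
    rows = conn.execute(
        f"SELECT VacancyId FROM Vacancy WHERE {clause} ORDER BY VacancyId", params
    )
    return [r[0] for r in rows]


matches = vacancies_with([30, 21, 40])
print(matches)
```

Row 4 is there only to show that 210 and 1210 are not mistaken for 21 once the commas are part of the pattern.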
If the number of values to match is going to change, you might want to look into a Full-Text Index.
If you decide to go that way, then once the Full-Text Index is in place you can get what you want as below:
Ex.-
WHERE
CONTAINS(Media, '"21" OR "30" OR "40"')

The best approach I can suggest is to first convert the comma-separated values into table rows, using this link. You can then query the resulting table like this:
SELECT * FROM Table
WHERE Media in('30','28')
This should work.

You can use this, but the performance is inevitably poor. You should, as others have said, normalise this structure.
WHERE
',' + media + ',' LIKE '%,21,%'
OR ',' + media + ',' LIKE '%,30,%'
Etc, etc...

If you are certain that any Media value containing the string 30 will be one you wish to return, you just need to include wildcards in your LIKE statement:
SELECT *
FROM Table
WHERE Media LIKE '%30%'
Bear in mind though that this would also return a record with a Media value of 298,300,302 for example, so if this is problematic for you, you'll need to consider a more sophisticated method, like:
SELECT *
FROM Table
WHERE Media LIKE '%,30,%'
OR Media LIKE '30,%'
OR Media LIKE '%,30'
OR Media = '30'
If there might be spaces in the strings (as in your question), you'll also want to strip these out:
SELECT *
FROM Table
WHERE REPLACE(Media,' ','') LIKE '%,30,%'
OR REPLACE(Media,' ','') LIKE '30,%'
OR REPLACE(Media,' ','') LIKE '%,30'
OR REPLACE(Media,' ','') = '30'
Edit: I actually prefer Coder of Code's solution to this:
SELECT *
FROM Table
WHERE ',' + LTRIM(RTRIM(REPLACE(Media,' ',''))) + ',' LIKE '%,30,%'
You mention that you wish to search for multiple strings in this field, which is also possible:
-- rows containing either value
SELECT *
FROM Table
WHERE Media LIKE '%30%'
OR Media LIKE '%28%'
-- rows containing both values
SELECT *
FROM Table
WHERE Media LIKE '%30%'
AND Media LIKE '%28%'

I agree that storing comma-separated values like that is not a good idea. But if you have to,
I think using an inline function will give better performance:
Select VacancyId, Media from (
Select 1 as VacancyId, '32,26,30' as Media
union all
Select 2, '31,25,20'
union all
Select 3, '21,32,23'
) asa
CROSS APPLY dbo.udf_StrToTable(Media, ',') tbl
where CAST(tbl.Result as int) in (30,21,40)
Group by VacancyId, Media
Output is;
VacancyId Media
----------- ---------
1 32,26,30
3 21,32,23
and our inline function script is;
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[udf_StrToTable]') and xtype in (N'FN', N'IF', N'TF'))
drop function [dbo].udf_StrToTable
GO
CREATE FUNCTION udf_StrToTable (@List NVARCHAR(MAX), @Delimiter NVARCHAR(1))
RETURNS TABLE
With Encryption
AS
RETURN
( WITH Split(stpos,endpos)
AS(
SELECT 0 AS stpos, CHARINDEX(@Delimiter,@List) AS endpos
UNION ALL
SELECT CAST(endpos+1 as int), CHARINDEX(@Delimiter,@List,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) as inx,
SUBSTRING(@List,stpos,COALESCE(NULLIF(endpos,0),LEN(@List)+1)-stpos) Result
FROM Split
)
GO

This solution uses a recursive CTE to identify the position of each comma within the string, then uses SUBSTRING to return the strings between the commas.
I've left some unnecessary code in place to help you get your head round what it's doing. You can strip it down to provide exactly what you need.
DROP TABLE #TMP
CREATE TABLE #TMP(ID INT, Vals CHAR(100))
INSERT INTO #TMP(ID,VALS)
VALUES
(1,'32,26,30')
,(2,'31, 25,20')
,(3,'21,32,23')
;WITH cte
AS
(
SELECT
ID
,VALS
,0 POS
,CHARINDEX(',',VALS,0) REM
FROM
#TMP
UNION ALL
SELECT ID,VALS,REM,CHARINDEX(',',VALS,REM+1)
FROM
cte c
WHERE CHARINDEX(',',VALS,REM+1) > 0
UNION ALL
SELECT ID,VALS,REM,LEN(VALS)
FROM
cte c
WHERE POS+1 < LEN(VALS) AND CHARINDEX(',',VALS,REM+1) = 0
)
,cte_Clean
AS
(
SELECT ID,CAST(REPLACE(LTRIM(RTRIM(SUBSTRING(VALS,POS+1,REM-POS))),',','') AS INT) AS VAL FROM cte
WHERE POS <> REM
)
SELECT
ID
FROM
cte_Clean
WHERE
VAL = 32
ORDER BY ID
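If the recursion is hard to follow, the same idea can be sketched outside SQL. This is only an illustration in Python, not a translation of the query: pos and rem loosely mirror the POS and REM columns, and split_by_positions is a name I made up.

```python
def split_by_positions(vals, delimiter=","):
    """Split vals by walking from one delimiter position to the next."""
    items = []
    pos = 0  # start of the current item
    while True:
        rem = vals.find(delimiter, pos)  # position of the next delimiter, -1 if none
        if rem == -1:
            # No delimiter left: the rest of the string is the last item.
            items.append(vals[pos:].strip())
            break
        items.append(vals[pos:rem].strip())
        pos = rem + 1
    return items


rows = {1: "32,26,30", 2: "31, 25,20", 3: "21,32,23"}
ids_with_32 = sorted(
    row_id for row_id, vals in rows.items()
    if 32 in (int(item) for item in split_by_positions(vals))
)
print(ids_with_32)
```

Converting each item to an integer before comparing is what makes the stray space in row 2 harmless, just as the CAST does in cte_Clean.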

Related

SQL to split a column values into rows in Netezza

I have data in the below way in a column. The data within the column is separated by two spaces.
4EG C6CC C6DE 6MM C6LL L3BC C3
I need to split it as shown below. I tried using REGEXP_SUBSTR to do it, but it looks like it's not in the SQL toolkit. Any suggestions?
1. 4EG
2. C6CC
3. C6DE
4. 6MM
5. C6LL
6. L3BC
7. C3
This has been answered here: http://nz2nz.blogspot.com/2016/09/netezza-transpose-delimited-string-into.html?m=1
Please note the comment at the bottom about the best-performing way to use the array functions. I have measured the use of regexp_extract_all_sp() versus repeated regex matches, and the benefit can be quite large.
The examples from nz2nz.blogspot.com are hard to follow. I was able to piece together this method:
with
n_rows as (--update on your end
select row_number() over(partition by 1 order by some_field) as seq_num
from any_table_with_more_rows_than_delimited_values
)
, find_values as ( -- fake data
select 'A' as id, '10,20,30' as orig_values
union select 'B', '5,4,3,2,1'
)
select
id,
seq_num,
orig_values,
array_split(orig_values, ',') as array_list,
get_value_varchar(array_list, seq_num) as value
from
find_values
cross join n_rows
where
seq_num <= regexp_match_count(orig_values, ',') + 1 -- one row for each value in list
order by
id,
seq_num
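The logic of the cross join may be easier to see outside SQL. Here is a small Python sketch of the same idea, using the fake data from the query; the function name explode and the range of sequence numbers are my own assumptions.

```python
def explode(rows, seq_nums, delimiter=","):
    """Pair each delimited string with sequence numbers, one output row per value."""
    out = []
    for row_id, orig_values in sorted(rows.items()):
        parts = orig_values.split(delimiter)
        for seq_num in seq_nums:
            # Same filter as the WHERE clause: keep seq_num up to (delimiter count + 1).
            if seq_num <= orig_values.count(delimiter) + 1:
                out.append((row_id, seq_num, parts[seq_num - 1]))
    return out


find_values = {"A": "10,20,30", "B": "5,4,3,2,1"}
result = explode(find_values, range(1, 11))
for row in result:
    print(row)
```

As in the SQL, the source of sequence numbers only needs to be at least as long as the longest list; extra numbers are filtered out.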

Expression to find multiple spaces in string

We handle a lot of sensitive data and I would like to mask passenger names using only the first and last letter of each name part and join these by three asterisks (***),
For example: the name 'John Doe' will become 'J***n D***e'
For a name that consists of two parts this is doable by finding the space using the expression:
LEFT(CardHolderNameFromPurchase, 1) +
'***' +
CASE WHEN CHARINDEX(' ', PassengerName) = 0
THEN RIGHT(PassengerName, 1)
ELSE SUBSTRING(PassengerName, CHARINDEX(' ', PassengerName) -1, 1) +
' ' +
SUBSTRING(PassengerName, CHARINDEX(' ', PassengerName) +1, 1) +
'***' +
RIGHT(PassengerName, 1)
END
However, a passenger name can have more than two parts; there is no real limit. How can I find the indices of all spaces within an expression? Or should I tackle this problem in a different way?
Any help or pointer is much appreciated!
This solution does what you want it to, but is really the wrong approach to use when trying to hide personally identifiable data, as per Gordon's explanation in his answer.
SQL:
declare @t table(n nvarchar(20));
insert into @t values('John Doe')
,('JohnDoe')
,('John Doe Two')
,('John Doe Two Three')
,('John O''Neill');
select n
,stuff((select ' ' + left(s.item,1) + '***' + right(s.item,1)
from dbo.fn_StringSplit4k(t.n,' ',null) as s
for xml path('')
),1,1,''
) as mask
from @t as t;
Output:
+--------------------+-------------------------+
| n | mask |
+--------------------+-------------------------+
| John Doe | J***n D***e |
| JohnDoe | J***e |
| John Doe Two | J***n D***e T***o |
| John Doe Two Three | J***n D***e T***o T***e |
| John O'Neill | J***n O***l |
+--------------------+-------------------------+
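To make the masking rule itself explicit (split on spaces, keep the first and last letter of each part, rejoin), here is a minimal Python sketch. It is not part of the SQL solution, and mask_name is a name chosen for the example.

```python
def mask_name(full_name):
    """Mask each space-separated part as first letter + '***' + last letter."""
    parts = full_name.split()
    return " ".join(part[0] + "***" + part[-1] for part in parts)


for name in ["John Doe", "JohnDoe", "John Doe Two Three", "John O'Neill"]:
    print(name, "->", mask_name(name))
```

Note that a one-letter part such as "J" comes out as "J***J", so the mask hides very little for short names.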
String splitting function based on Jeff Moden's Tally Table approach:
create function [dbo].[fn_StringSplit4k]
(
@str nvarchar(4000) = ' ' -- String to split.
,@delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,@num as int = null -- Which value to return, null returns all.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
) a
where rn = @num
or @num is null;
GO
If you consider PassengerName as sensitive information, then you should not be storing it in clear text in generally accessible tables. Period.
There are several different options.
One is to have reference tables for sensitive information. Any table that references this would have an id rather than the name. Voilà. No sensitive information is available without access to the reference table, and that would be severely restricted.
A second method is a reversible encryption algorithm. This would allow the value to be stored as gibberish, but with the right knowledge it could be transformed back into a meaningful value. Typical methods for this are the public key encryption algorithms devised by Rivest, Shamir, and Adleman (RSA encoding).
If you want to keep the first and last letters of names, I would be really careful about Asian names. Many of them consist of two or three letters when written in Latin script. That isn't much hiding. SQL Server does not have simple mechanisms to do this. You can write a user-defined function with a loop to manage the process. However, I view this as the least secure and least desirable approach.
This uses Jeff Moden's DelimitedSplit8K, as well as the new functionality in SQL Server 2017 STRING_AGG. As I don't know what version you're using, I've just gone "whole hog" and assumed you're using the latest version.
Jeff's function is invaluable here, as it returns the ordinal position, something which Microsoft have foolishly omitted from their own function, STRING_SPLIT (and didn't add in 2017 either). Ordinal position is key here, so we can't make use of the built in function.
WITH VTE AS(
SELECT *
FROM (VALUES ('John Doe'),('Jane Bloggs'),('Edgar Allan Poe'),('Mr George W. Bush'),('Homer J Simpson')) V(FullName)),
Masking AS (
SELECT *,
ISNULL(STUFF(Item, 2, LEN(item) -2,'***'), Item) AS MaskedPart
FROM VTE V
CROSS APPLY dbo.delimitedSplit8K(V.Fullname, ' '))
SELECT STRING_AGG(MaskedPart,' ') AS MaskedFullName
FROM Masking
GROUP BY Fullname;
Edit: Never mind, the OP has commented that they are using 2008, so STRING_AGG is out of the question. @iamdave, however, has posted an answer which is very similar to my own; it just does it the "old-fashioned XML way".
Depending on your version of SQL Server, you may be able to use the built-in string split to rows on spaces in the name, do your string formatting, and then roll back up to name level using an XML path.
create table dataset (id int identity(1,1), name varchar(50));
insert into dataset (name) values
('John Smith'),
('Edgar Allen Poe'),
('One Two Three Four');
with split as (
select id, cs.Value as Name
from dataset
cross apply STRING_SPLIT (name, ' ') cs
),
formatted as (
select
id,
name,
left(name, 1) + '***' + right(name, 1) as out
from split
)
SELECT
id,
(SELECT ' ' + out
FROM formatted b
WHERE a.id = b.id
FOR XML PATH('')) [out_name]
FROM formatted a
GROUP BY id
Result:
id out_name
1 J***n S***h
2 E***r A***n P***e
3 O***e T***o T***e F***r
You can do that using this function.
create function [dbo].[fnMaskName] (@var_name varchar(100))
RETURNS varchar(100)
WITH EXECUTE AS CALLER
AS
BEGIN
declare @var_part varchar(100)
declare @var_return varchar(100)
declare @n_position smallint
set @var_return = ''
set @n_position = 1
WHILE @n_position<>0
BEGIN
SET @n_position = CHARINDEX(' ', @var_name)
IF @n_position = 0
SET @n_position = LEN(@var_name)
SET @var_part = SUBSTRING(@var_name, 1, @n_position)
SET @var_name = SUBSTRING(@var_name, @n_position+1, LEN(@var_name))
if @var_part<>''
SET @var_return = @var_return + stuff(@var_part, 2, len(@var_part)-2, replicate('*',len(@var_part)-2)) + ' '
END
RETURN(@var_return)
END

SSRS multi-select parameter can't capture values with comma

I have a multi-select parameter whose list contains values with commas, and my dataset uses a split function on the parameter (since it's in an SP), so my WHERE clause looks like this:
WHERE [CURRENTPRODATT_DIV_NAME] IN (SELECT VALUE FROM DBO.FnSplit(@ProductDivision,','))
If, for example, "SMART-UPS 1,5KVA" and "BACK-UPS" are ticked in the multi-select parameter, the first value is treated as two different values, "SMART-UPS 1" and "5KVA". So my split function returns this result:
Row Value
1 SMART-UPS 1
2 5KVA
3 BACK-UPS
And since "SMART-UPS 1" and "5KVA" were not a valid value, I will not get the records under "SMART-UPS 1,5KVA".
Can someone give ideas on how to solve this problem? Any response will be greatly appreciated.
The answer to this one can be quite simple: just use a different separator!
(In this case I use ";", but you could use almost any symbol that does not appear in your list of values.)
When passing the parameters, set the parameter value in the Dataset Properties to
=Join(Parameters!ProductDivision.Value, ";")
Then adjust the WHERE clause in your SQL code to
WHERE [CURRENTPRODATT_DIV_NAME] IN (SELECT VALUE FROM DBO.FnSplit(@ProductDivision,';'))
This should work for you.
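To see why the separator matters, here is a tiny Python sketch of the join-then-split round trip. It only mimics what the report and the split function do; join_selection and split_param are hypothetical helpers, not SSRS or SQL Server APIs.

```python
def join_selection(values, separator=";"):
    """Join the ticked values, refusing a separator that occurs inside a value."""
    if any(separator in value for value in values):
        raise ValueError(f"separator {separator!r} appears in a value")
    return separator.join(values)


def split_param(param, separator):
    """Stand-in for the split function: one item per separated value."""
    return param.split(separator)


selected = ["SMART-UPS 1,5KVA", "BACK-UPS"]
broken = split_param(",".join(selected), ",")
fixed = split_param(join_selection(selected, ";"), ";")
print(broken)
print(fixed)
```

The check in join_selection is the one thing worth carrying over: whichever separator you pick, make sure it cannot occur in the data.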
You should store the values in a separate table. Show the description and make the value of the list-items their respective IDs. Then, you'll never have a comma in your string parameter.
If that's not an option:
Before building your string (within your application), replace all comma values with something that you know won't appear naturally.
For example, if you've selected the following values:
1. SPK,5
2. Joe
3. Dave,Smith
You can iterate through those items and replace the commas with something like ten asterisks: "**********"
So your final string would be "SPK**********5,Joe,Dave**********Smith". You would also want to pass the replacement string as a second parameter (so you can change your app without having to modify your SP).
Then, you can use the following logic:
WHERE [CURRENTPRODATT_DIV_NAME] IN (SELECT REPLACE(VALUE, @SecondParameter, ',') FROM DBO.FnSplit(@ProductDivision,','))
This way, everything will be split correctly, since you've removed the commas. Then, when you select from that list, you just replace the funny character string (in this example, ten asterisks) with the original comma.
For the record, you should 100% do what I initially suggested. This is a terrible approach, but it will get you out of a jam.
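For illustration, the placeholder round trip can be sketched in a few lines of Python. This assumes the placeholder never occurs in real data; encode and decode are names I made up, standing in for the application side and the REPLACE-after-split side respectively.

```python
PLACEHOLDER = "*" * 10  # assumption: ten asterisks never appear in a real value


def encode(values, placeholder=PLACEHOLDER):
    """Application side: hide commas inside values, then join with commas."""
    return ",".join(value.replace(",", placeholder) for value in values)


def decode(param, placeholder=PLACEHOLDER):
    """Procedure side: split on commas, then restore the hidden commas."""
    return [item.replace(placeholder, ",") for item in param.split(",")]


selected = ["SPK,5", "Joe", "Dave,Smith"]
encoded = encode(selected)
print(encoded)
print(decode(encoded))
```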
So my guess is that the string you pass looks like param,value,param,value and you want only the values from it.
If you know that the values can never start with 'PARAMETER', you could do it like this:
declare @params varchar(max) = 'PARAMETER1,AAA,PARAMETER2,BBB'
SELECT VALUE FROM dbo.FnSplit(@params,',')
WHERE VALUE NOT LIKE 'PARAMETER%'
The output of this query is
----------
AAA
BBB
Edit:
In case a value can itself be 'PARAMETER...' and you know the string is always of the form param,value,param,value, you could use a CTE:
declare @params varchar(max) = 'PARAMETER1,AAA,PARAMETER2,BBB';
WITH VALUE AS
(
SELECT VALUE, ROW_NUMBER() OVER ( ORDER BY (SELECT 1)) as RN
FROM dbo.FnSplit(@params,',')
)
SELECT * FROM VALUE
WHERE RN % 2 = 0
What you do here is split the string and record the row number of each item; since the values always come second in each pair, they always have an even row number.
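The "even row number" idea is just "take every second item". A short Python sketch, assuming the split preserves the original order (which ROW_NUMBER() OVER (ORDER BY (SELECT 1)) does not strictly guarantee in SQL Server); values_only is a hypothetical helper.

```python
def values_only(params, delimiter=","):
    """Return the items at even 1-based positions of a param,value,... string."""
    items = params.split(delimiter)
    return items[1::2]  # 0-based index 1, 3, 5, ... equals row numbers 2, 4, 6, ...


print(values_only("PARAMETER1,AAA,PARAMETER2,BBB"))
```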
You could pass the parameters as a table parameter:
CREATE TABLE #t(ID INT IDENTITY(1,1), CURRENTPRODATT_DIV_NAME VARCHAR(100));
INSERT INTO #t(CURRENTPRODATT_DIV_NAME)
VALUES ('SMART-UPS 1,5KVA'), ('BACK-UPS');
DECLARE @t TABLE (val VARCHAR(100));
INSERT INTO @t(val) VALUES ('SMART-UPS 1,5KVA'),('BACK-UPS');
SELECT *
FROM #t
WHERE CURRENTPRODATT_DIV_NAME IN (SELECT val FROM @t);
LiveDemo
Output:
╔════╦═════════════════════════╗
║ ID ║ CURRENTPRODATT_DIV_NAME ║
╠════╬═════════════════════════╣
║ 1 ║ SMART-UPS 1,5KVA ║
║ 2 ║ BACK-UPS ║
╚════╩═════════════════════════╝

Adding number to text rows sql server

I have columns named id and item, with stored values like:
id item
1 value
2 value
3 value
and so on. There are 192 rows. These values are used in different places in the system, and I need to find a specific value in the database so that I can rename it.
Is there a way to append a number to each row, for example value_01, value_02, etc.?
I know how to do it in C, but I have no idea how to do it in SQL Server.
Edited:
@lad2025
In the system I have columns whose names are stored in the database.
The names are the same, for example:
In app Apple I have a table named Apple
In app Storage I also have a table named Apple
I need to change the column name Apple in app Storage to something different, but I don't know which of the Apple values in the database it is, so I want to add identifiers to the strings to find the right one. So I need to update the database values in order to see them in the system.
SQLFiddleDemo
DECLARE @pad INT = 3;
SELECT
[id],
[item] = [item] + '_' + RIGHT(REPLICATE('0', @pad) + CAST([id] AS NVARCHAR(10)), @pad)
FROM your_table;
This will produce result like:
value_001
value_010
value_192
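The padding expression works by prepending a run of zeros and then keeping only the rightmost characters. A minimal Python sketch of that same trick, with numbered as a made-up helper name:

```python
def numbered(item, row_id, pad=3):
    """Append row_id to item, left-padded with zeros to pad digits."""
    # Prepend pad zeros, then keep the rightmost pad characters (like RIGHT in T-SQL).
    return item + "_" + ("0" * pad + str(row_id))[-pad:]


for row_id in (1, 10, 192):
    print(numbered("value", row_id))
```

One thing to keep in mind: just like RIGHT, this truncates ids that have more digits than the pad width, so pick a width that covers your largest id.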
EDIT:
After reading your comments it is not clear what you want to achieve, but check:
SqlFiddleDemo2
DECLARE @pad INT = 3;
;WITH cte AS
(
SELECT *,
[rn] = ROW_NUMBER() OVER (PARTITION BY item ORDER BY item)
FROM your_table
)
SELECT
[id],
[item] = [item] + '_' + RIGHT(REPLICATE('0', @pad) + CAST([rn] AS NVARCHAR(10)), @pad)
FROM cte
WHERE item = 'value'; /* You can comment it if needed */

T-SQL trying to determine the largest string from a set of concatenated strings in a database

I have two tables. One has an Order number, and details about the order:
CREATE TABLE #Order ( OrderID int )
and the second contains comments about the order:
CREATE TABLE #OrderComments ( OrderID int,
Comment VarChar(500) )
Order ID Comments
~~~~~~~~ ~~~~~~~~
1 Loved this item!
1 Could use some work
1 I've had better
2 Try the veal
I'm tasked with determining the maximum length of the output, then returning output like the following:
Order ID Comments Length
~~~~~~~~ ~~~~~~~~ ~~~~~~
1 Loved this item! | Could use some work | I've had better 56
2 Try the veal 12
So, in this example, if this is all of the data, I'm looking for "56".
The main purpose is to determine the maximum length of all comments when appended together, including the | delimiter. This will be used when constructing the table this output will be put into, to determine if we can get the data within the 8,060 size limit for a row or if we need to use varchar(max) or text to hold the data.
I have tried a couple of subqueries that can generate this output to variables, but I haven't found one yet that could generate the above output. If I could get that, then I could just do a SELECT TOP 1 ... ORDER BY 3 DESC to get the number I'm looking for.
To find the length of the longest string you would get by trimming and concatenating all the (non-null) comments belonging to an OrderId with a delimiter of length three, you can use:
SELECT TOP(1) SUM(LEN(Comment)) + 3* (COUNT(Comment) - 1) AS Length
FROM OrderComments
GROUP BY OrderId
ORDER BY Length DESC
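As a sanity check on the arithmetic (sum of the comment lengths plus three characters for each delimiter between them), here is a small Python sketch using the sample data from the question. The function name concatenated_lengths is an assumption for the example.

```python
from collections import defaultdict

comments = [
    (1, "Loved this item!"),
    (1, "Could use some work"),
    (1, "I've had better"),
    (2, "Try the veal"),
]


def concatenated_lengths(rows, delimiter=" | "):
    """Length per order of its non-null comments joined by delimiter."""
    grouped = defaultdict(list)
    for order_id, comment in rows:
        if comment is not None:
            grouped[order_id].append(comment.strip())
    return {
        order_id: sum(len(c) for c in items) + len(delimiter) * (len(items) - 1)
        for order_id, items in grouped.items()
    }


lengths = concatenated_lengths(comments)
print(lengths)
print(max(lengths.values()))
```

The result agrees with actually joining the strings and measuring them, which is the point: you can get the maximum length without building the concatenation first.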
To actually do the concatenation you can use XML PATH as demonstrated in many other answers on this site.
WITH O AS
(
SELECT DISTINCT OrderID
FROM #Order
)
SELECT O.OrderID,
LEFT(y.Comments, LEN(y.Comments) - 1) AS Comments
FROM O
CROSS APPLY (SELECT ltrim(rtrim(Comment)) + ' | '
FROM #OrderComments oc
WHERE oc.OrderID = O.OrderID
AND Comment IS NOT NULL
FOR XML PATH(''), TYPE) x (Comments)
CROSS APPLY (SELECT x.Comments.value('.', 'VARCHAR(MAX)')) y(Comments)
All you need is STUFF function and XML PATH
Check out this sql fiddle
http://www.sqlfiddle.com/#!3/65cc6/5
