Remove string portion from inconsistent string of comma-separated values

SQL Server 2017 on Azure.
Given a field called Categories in a table called dbo.sources:
ID Categories
1 ABC01, FFG02, ERERE, CC201
2 GDF01, ABC01, GREER, DS223
3 DSF12, GREER
4 ABC01
5 NULL
What is the syntax for a query that would remove ABC01 from any record where it exists, but keep the other codes in the string?
Results would be:
ID Categories
1 FFG02, ERERE, CC201
2 GDF01, GREER, DS223
3 DSF12, GREER
4 NULL
5 NULL

By normalising and then denormalising your data, you can do this:
USE Sandbox;
GO
CREATE TABLE dbo.Sources (ID int,
Categories varchar(MAX));
INSERT INTO dbo.Sources
VALUES (1,'ABC01,FFG02,ERERE,CC201'), --I assume you don't really have the space
(2,'GDF01,ABC01,GREER,DS223'),
(3,'DSF12,GREER'),
(4,'ABC01'),
(5,NULL);
GO
DECLARE @Source varchar(5) = 'ABC01'; --Value to remove
WITH CTE AS(
SELECT S.ID,
STRING_AGG(NULLIF(SS.[value],@Source),',') WITHIN GROUP(ORDER BY S.ID) AS Categories --NULLIF turns the unwanted code into NULL, which STRING_AGG skips
FROM dbo.Sources S
CROSS APPLY STRING_SPLIT(S.Categories,',') SS
GROUP BY S.ID)
UPDATE S
SET Categories = C.Categories
FROM dbo.Sources S
JOIN CTE C ON S.ID = C.ID;
GO
SELECT ID,
Categories
FROM dbo.Sources
GO
DROP TABLE dbo.Sources;
Although this seems like overkill compared to REPLACE, it shows why normalising the data is a far better idea in the first place, and how simple it is to do.
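To illustrate the point about normalising, here is a minimal sketch. It assumes a hypothetical child table (a table variable named @SourceCategories, which is not part of the original schema) holding one row per ID and code. Removing a code then becomes a plain DELETE, and the comma-separated string is rebuilt only for display.

```sql
-- Hypothetical normalised layout: one row per (source, category) pair.
DECLARE @SourceCategories TABLE (
    SourceID int        NOT NULL,
    Category varchar(5) NOT NULL,
    PRIMARY KEY (SourceID, Category)
);

INSERT INTO @SourceCategories (SourceID, Category)
VALUES (1,'ABC01'),(1,'FFG02'),(1,'ERERE'),(1,'CC201'),
       (2,'GDF01'),(2,'ABC01'),(2,'GREER'),(2,'DS223'),
       (3,'DSF12'),(3,'GREER'),
       (4,'ABC01');

-- Removing a code no longer needs any string manipulation.
DELETE FROM @SourceCategories
WHERE Category = 'ABC01';

-- Rebuild the delimited form only when it has to be shown.
SELECT SourceID,
       STRING_AGG(Category, ', ') WITHIN GROUP (ORDER BY Category) AS Categories
FROM @SourceCategories
GROUP BY SourceID;
```

In this sketch an ID whose only code was removed (ID 4) simply has no rows left; a LEFT JOIN from the parent table would be needed to show it with a NULL list.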

You can use Replace as follows:
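-- Wrap the list in ', ' delimiters so only the whole code ABC01 is matched,
-- then trim the leftover separators; an empty result becomes NULL.
-- This assumes the codes are separated by ', ' as in the question.
UPDATE dbo.Sources
SET Categories = NULLIF(TRIM(', ' FROM REPLACE(', ' + Categories + ', ', ', ABC01, ', ', ')), '')
WHERE ', ' + Categories + ', ' LIKE '%, ABC01, %'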

Related

Why is TRY_PARSE so slow?

I have this query that basically returns (right now) only 10 rows as results:
select *
FROM Table1 as o
inner join Table2 as t on t.Field1 = o.Field2
where Code = 123456 and t.FakeData is not null
Now, if I want to parse the field FakeData (which, unfortunately, can contain different types of data, from dates to surnames and so on; it is an nvarchar(70)) for display and/or filtering:
select *, TRY_PARSE(t.FakeData as date USING 'en-GB') as RealDate
FROM Table1 as o
inner join Table2 as t on t.Field1 = o.Field2
where Code = 123456 and t.FakeData is not null
The query now takes about 10 times longer to execute.
What am I doing wrong? How can I speed it up?
I can't edit the database; I'm just a customer who reads the data.

The TSQL documentation for TRY_PARSE makes the following observation:
Keep in mind that there is a certain performance overhead in parsing the string value.
NB: I am assuming your typical date format would be dd/mm/yyyy.
The following is something of a shot in the dark that might help. By progressively assessing whether the nvarchar value is a candidate date, it is possible to reduce the number of calls to that function. Note that a value established in one apply can then be referenced in a subsequent apply:
CREATE TABLE mytable(
FakeData NVARCHAR(60) NOT NULL
);
INSERT INTO mytable(FakeData) VALUES (N'oiwsuhd ouhw dcouhw oduch woidhc owihdc oiwhd cowihc');
INSERT INTO mytable(FakeData) VALUES (N'9603200-0297r2-0--824');
INSERT INTO mytable(FakeData) VALUES (N'12/03/1967');
INSERT INTO mytable(FakeData) VALUES (N'12/3/2012');
INSERT INTO mytable(FakeData) VALUES (N'3/3/1812');
INSERT INTO mytable(FakeData) VALUES (N'ohsw dciuh iuh pswiuh piwsuh cpiuwhs dcpiuhws ipdcu wsiu');
select
t.FakeData, oa3.RealDate
from mytable as t
outer apply (
select len(FakeData) as fd_len
) oa1
outer apply (
select case when oa1.fd_len > 10 then 0
when len(replace(FakeData,'/','')) + 2 = oa1.fd_len then 1
else 0
end as is_candidate
) oa2
outer apply (
select case when oa2.is_candidate = 1 then TRY_PARSE(t.FakeData as date USING 'en-GB') end as RealDate
) oa3
FakeData                                                  RealDate
--------------------------------------------------------  ----------
oiwsuhd ouhw dcouhw oduch woidhc owihdc oiwhd cowihc      null
9603200-0297r2-0--824                                     null
12/03/1967                                                1967-03-12
12/3/2012                                                 2012-03-12
3/3/1812                                                  1812-03-03
ohsw dciuh iuh pswiuh piwsuh cpiuwhs dcpiuhws ipdcu wsiu  null
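Another option that may be worth testing, offered as an assumption rather than something measured here: TRY_PARSE relies on the .NET CLR, whereas TRY_CONVERT is a native function, and conversion style 103 corresponds to the British dd/mm/yyyy format. A minimal sketch, using a table variable (@mytable, a stand-in for the real table) with a few of the sample values plus one made-up non-date value:

```sql
-- Stand-in for the real table; the last value is a made-up non-date string.
DECLARE @mytable TABLE (FakeData nvarchar(60) NOT NULL);

INSERT INTO @mytable (FakeData)
VALUES (N'9603200-0297r2-0--824'),
       (N'12/03/1967'),
       (N'12/3/2012'),
       (N'3/3/1812'),
       (N'some surname');

-- Style 103 = dd/mm/yyyy; values that cannot be converted yield NULL instead of an error.
SELECT t.FakeData,
       TRY_CONVERT(date, t.FakeData, 103) AS RealDate
FROM @mytable AS t;
```

Whether this is actually faster on your data is something to measure against the original query.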

SQL- use an attribute to group activities and use the group as parameter

I have a table that looks like this:
ActivityID  Time Used  Activity Type  Activity Category ID  Activity Category
123456      30         A              1                     X
765432      120        B              2                     Y
876462      65         C              3                     Z
h52635      76         D              3                     Z
hsgs62      187        E              1                     X
I would like to use the Activity Category as a parameter (@ActivityCategory) to filter my report later, which means the filter values should be X;Y;Z.
When I choose one Activity Category, the sum of "Time Used" should appear.
My question is: how should I build the query, to be able to group the activities with the same Activity Category together and use the Category XYZ as a parameter?

Something like this perhaps:
-- Sample data
DECLARE @table TABLE (ActivityId INT, TimeUsed INT, ActivityCategory CHAR(1));
INSERT @table VALUES(123,20,'X'), (129,50,'Y'), (254,30,'Y'), (991,10,'Z');
-- Parameter
DECLARE @ActivityCategory VARCHAR(100) = 'X,Y';
SELECT t.ActivityCategory, TimeUsed = SUM(t.TimeUsed)
FROM @table AS t
CROSS APPLY STRING_SPLIT(@ActivityCategory,',') AS s -- You will need a string splitter function
WHERE t.ActivityCategory = s.value
GROUP BY t.ActivityCategory;
Returns:
ActivityCategory TimeUsed
---------------- -----------
X 20
Y 80

Alan's answer is good, but I'd personally use a temp table and a join for performance reasons. The table being queried might be very large, in which case a join to a temp table would be more performant than CROSS APPLY.
The easiest way to pass multi-value parameters in and out of your query is a comma-separated list. Indeed, if you are using Report Server / SSRS then that is how the "Multiple Value" box in the user interface will deliver the users' selections into a varchar parameter.
--Declare and set parameter
DECLARE @ActivityCategories varchar(MAX)
SET @ActivityCategories = 'X,Y,Z'
--Convert individual parameter values to a temp table
DROP TABLE IF EXISTS #ParameterValues
CREATE TABLE #ParameterValues (ActivityCategory varchar(10) NOT NULL PRIMARY KEY CLUSTERED)
INSERT INTO #ParameterValues WITH(TABLOCK)
SELECT value
FROM STRING_SPLIT(@ActivityCategories,',')
GROUP BY value
ORDER BY value
--Join on temp table to filter by parameter values
SELECT a.ActivityID,
a.TimeUsed,
a.ActivityType,
a.ActivityCategoryID,
a.ActivityCategory
FROM dbo.YourTable a
INNER JOIN #ParameterValues b ON a.ActivityCategory = b.ActivityCategory
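Since the question asks for the sum of "Time Used" per category, the same join can be combined with GROUP BY. A minimal sketch, assuming a hypothetical temp table #Activities loaded with the sample rows from the question (the real table and column names may differ):

```sql
-- Hypothetical stand-in for the real table, filled with the question's sample rows.
DROP TABLE IF EXISTS #Activities;
CREATE TABLE #Activities (
    ActivityID       varchar(10) NOT NULL,
    TimeUsed         int         NOT NULL,
    ActivityCategory varchar(10) NOT NULL
);
INSERT INTO #Activities (ActivityID, TimeUsed, ActivityCategory)
VALUES ('123456', 30, 'X'), ('765432', 120, 'Y'), ('876462', 65, 'Z'),
       ('h52635', 76, 'Z'), ('hsgs62', 187, 'X');

-- Multi-value parameter as a comma-separated list.
DECLARE @ActivityCategories varchar(MAX) = 'X,Z';

DROP TABLE IF EXISTS #ParameterValues;
CREATE TABLE #ParameterValues (ActivityCategory varchar(10) NOT NULL PRIMARY KEY CLUSTERED);
INSERT INTO #ParameterValues (ActivityCategory)
SELECT DISTINCT value
FROM STRING_SPLIT(@ActivityCategories, ',');

-- One row per selected category with its total time.
SELECT a.ActivityCategory,
       SUM(a.TimeUsed) AS TotalTimeUsed
FROM #Activities AS a
INNER JOIN #ParameterValues AS b ON a.ActivityCategory = b.ActivityCategory
GROUP BY a.ActivityCategory;
```

For this sample data and the parameter 'X,Z', the totals are 217 for X (30 + 187) and 141 for Z (65 + 76).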

Check records from same table with unmatched values in multiple rows based on date

I want to select the unmatched combination from the same table:
Table 1
So as per the above table, #Id combination (#3,#4,#5) is missing for date 15-Sep-2018 and #Id combination (#8,#9,#10) is completely different for 15-Sep-2018 as compared to 14-Sep-2018.
So I want to select such IDs [ #Id combination (#8,#9,#10) ] and print them.
How do I find these with a query?

When you say "Combination #8" you actually just mean Server 3 + License 1? Something like this?
declare @daycount int
select @daycount = count(distinct [date]) from table1
select ServerID, LicenseID
from table1
group by ServerID, LicenseID
having count(*) != @daycount
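The original table was posted as an image, so the following is only a sketch with made-up rows (the values in @table1 are hypothetical) showing how the query above behaves: a server/license pair that does not appear on every date is returned.

```sql
-- Hypothetical data: two dates, three server/license combinations.
DECLARE @table1 TABLE ([date] date, ServerID int, LicenseID int);

INSERT INTO @table1 ([date], ServerID, LicenseID)
VALUES ('2018-09-14', 1, 1), ('2018-09-15', 1, 1),  -- present on both days
       ('2018-09-14', 2, 1),                        -- missing on 15-Sep
       ('2018-09-15', 3, 1);                        -- only appears on 15-Sep

DECLARE @daycount int;
SELECT @daycount = COUNT(DISTINCT [date]) FROM @table1;

-- Combinations that are not present on every date, with the days they were seen.
SELECT ServerID,
       LicenseID,
       MIN([date]) AS FirstSeen,
       MAX([date]) AS LastSeen
FROM @table1
GROUP BY ServerID, LicenseID
HAVING COUNT(*) <> @daycount;
```

With these rows, the pairs (2, 1) and (3, 1) are returned and (1, 1) is not. The approach assumes each combination appears at most once per date.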

T-SQL Grouping Sets of Information

I have a problem which my limited SQL knowledge is keeping me from understanding.
First the problem:
I have a database which I need to run a report on; it contains configurations of a user's entitlements. The report needs to show a distinct list of these configurations and a count against each one.
So a line in my DB looks like this:
USER_ID SALE_ITEM_ID SALE_ITEM_NAME PRODUCT_NAME CURRENT_LINK_NUM PRICE_SHEET_ID
37715 547 CultFREE CultPlus 0 561
The above line is one row of a user's configuration; for every user ID there can be 1-5 of these lines. So the definition of a configuration is multiple rows of data sharing a common user ID, with variable attributes.
I need to get a distinct list of these configurations across the whole table, leaving me just one configuration set for every case where more than one user has that configuration, together with a count of the users that have it.
Hope this is clear?
Any ideas?!?!
I have tried various group by's and unions, also the grouping sets function to no avail.
I will be very grateful if anyone can give me some pointers!

Ouch that hurt ...
Ok so problem:
a row represents a configurable line
users may be linked to more than 1 row of configuration
configuration rows when grouped together form a configuration set
we want to figure out all of the distinct configuration sets
we want to know what users are using them.
Solution (it's a bit messy but the idea is there; copy and paste into SQL Server Management Studio) ...
-- ok so i imported the data to a table named SampleData ...
-- 1. import the data
-- 2. add a new column
-- 3. select all the values of the config in to the new column (Configuration_id)
--UPDATE [dbo].[SampleData]
--SET [Configuration_ID] = CONCAT(SALE_ITEM_ID, '|', SALE_ITEM_NAME, '|', [PRODUCT_NAME], '|', [CURRENT_LINK_NUM], '|', [PRICE_SHEET_ID])
-- 4. i then selected just the distinct values of those and found 6 distinct Configuration_id's
--SELECT DISTINCT [Configuration_ID] FROM [dbo].[SampleData]
-- 5. to make them a bit easier to read and work with i gave them int values instead
-- for me it was easy to do this manually but you might wanna do some trickery here to autonumber them or something
-- basic idea is to run the step 4 statement but select into a new table then add a new primary key column and set identity spec on it
-- that will generate u a bunch of incremental numbers for your config id's so u can then do something like ...
--UPDATE sd
--SET Configuration_ID = (SELECT ID FROM TempConfigTable WHERE Config_ID = sd.Configuration_ID)
--FROM [dbo].[SampleData] sd
-- at this point you have all your existing rows with a unique ident for the values combined in each row.
-- so for example in my dataset i have several rows where only the user_id has changed but all look like this ...
--SALE_ITEM_ID SALE_ITEM_NAME PRODUCT_NAME CURRENT_LINK_NUM PRICE_SHEET_ID Configuration_ID
--54101 TravelFREE TravelPlus 0 56101 1
-- now you have a config id you can start to work on building sets up ...
-- each user is now matched with 1 or more config id
-- 6. we use a CTE (common table expression) to link the possibles (keeps the join small) ...
--WITH Temp (ConfigID)
--AS
--(
-- SELECT DISTINCT SD.Configuration_Id --SD2.Configuration_Id, SD3.Configuration_Id, SD4.Configuration_Id, SD5.Configuration_Id,
-- FROM [dbo].[SampleData] SD
--)
-- this extracts all the possible combinations using the CTE
-- on the basis of what you told me, max rows per user is 6, in the result set i have i only have 5 distinct configs
-- meaning i gain nothing by doing a 6th join.
-- cross joins basically give you every combination of unique values from the 2 tables but we joined back on the same table
-- so its every possible combination of Temp + Temp (ConfigID + ConfigID) ... per cross join so with 5 joins its every combination of
-- Temp + Temp + Temp + Temp + Temp .. good job temp only has 1 column with 5 values in it
-- 7. uncomment both this and the CTE above ... need to use them together
--SELECT DISTINCT T.ConfigID C1, T2.ConfigID C2, T3.ConfigID C3, T4.ConfigID C4, T5.ConfigID C5
--INTO [SETS]
--FROM Temp T
--CROSS JOIN Temp T2
--CROSS JOIN Temp T3
--CROSS JOIN Temp T4
--CROSS JOIN Temp T5
-- notice the INTO clause ... this dumps me out a new [SETS] table in my db
-- if i go add a primary key to this and set its ident spec i now have unique set id's
-- for each row in the table.
--SELECT *
--FROM [dbo].[SETS]
-- now here's where it gets interesting ... row 1 defines a set as being config id 1 and nothing else
-- row 2 defines set 2 as being config 1 and config 2 and nothing else ... and so on ...
-- the problem here of course is that 1,2,1,1,1 is technically the same set as 1,1,1,2,1 from our point of view
-- ok lets assign a set to each userid ...
-- 8. first we pull the distinct id's out ...
--SELECT DISTINCT USER_ID usr, null SetID
--INTO UserSets
--FROM SampleData
-- now we need to do bit a of operating on these that's a bit much for a single update or select so ...
-- 9. process findings in a loop
DECLARE @currentUser int
DECLARE @set int
-- while there's a userid not linked to a set
WHILE EXISTS (SELECT 1 FROM UserSets WHERE SetId IS NULL)
BEGIN
    -- pick the next userid that has no set yet
    SELECT TOP 1 @currentUser = usr FROM UserSets WHERE SetId IS NULL
    -- figure out a set to link it to
    SET @set = (
        SELECT TOP 1 ID
        FROM [SETS]
        -- shouldn't really do this ... basically need to refactor in to a table variable then compare to that
        -- that way the table lookup on ur main data is only 1 per User_id
        WHERE C1 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser)
        AND C2 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser)
        AND C3 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser)
        AND C4 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser)
        AND C5 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser)
    )
    -- hopefully that worked
    IF(@set IS NOT NULL)
    BEGIN
        -- tell the usersets table
        UPDATE UserSets SET SetId = @set WHERE usr = @currentUser
        SET @set = null
    END
    ELSE -- something went wrong ... set to 0 to prevent endless loop but any userid linked to set 0 is a problem u need to look at
        UPDATE UserSets SET SetId = 0 WHERE usr = @currentUser
    -- and round we go again ... until we are done
END
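As a possible set-based alternative to the loop above (not the approach this answer takes), each user's rows can be collapsed into one ordered signature string and the signatures counted. This is a minimal sketch that assumes SQL Server 2017 or later for STRING_AGG. The table variable @SampleData and the rows for users 37716 and 37717 are made up for illustration; the choice of columns in the signature is also an assumption.

```sql
-- Hypothetical sample: the first row is from the question, the rest are invented.
DECLARE @SampleData TABLE (
    [USER_ID]        int,
    SALE_ITEM_ID     int,
    SALE_ITEM_NAME   varchar(50),
    PRODUCT_NAME     varchar(50),
    CURRENT_LINK_NUM int,
    PRICE_SHEET_ID   int
);

INSERT INTO @SampleData
VALUES (37715, 547,   'CultFREE',   'CultPlus',   0, 561),
       (37716, 547,   'CultFREE',   'CultPlus',   0, 561),
       (37717, 547,   'CultFREE',   'CultPlus',   0, 561),
       (37717, 54101, 'TravelFREE', 'TravelPlus', 0, 56101);

-- Build one signature per user; the fixed ordering makes equal sets produce equal strings.
WITH UserSignature AS (
    SELECT [USER_ID],
           STRING_AGG(CONCAT(SALE_ITEM_ID, '|', CURRENT_LINK_NUM, '|', PRICE_SHEET_ID), ';')
               WITHIN GROUP (ORDER BY SALE_ITEM_ID, CURRENT_LINK_NUM, PRICE_SHEET_ID) AS ConfigSignature
    FROM @SampleData
    GROUP BY [USER_ID]
)
-- Count how many users share each distinct configuration set.
SELECT ConfigSignature,
       COUNT(*) AS UserCount
FROM UserSignature
GROUP BY ConfigSignature;
```

With these rows, two users share the single-row configuration and one user has the two-row configuration. On versions without STRING_AGG the same idea would need a different string-concatenation technique.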

SELECT
USER_ID,
SALE_ITEM_ID, ETC...,
COUNT(*) WhateverYouWantToNameCount
FROM TableNAme
GROUP BY USER_ID, SALE_ITEM_ID, ETC...

Select data from one table where a field is greater than that of another field in another table

I want to be able to select data from TableA where Field1 is greater than Field2 in TableB.
In my head I imagine it to be something like this:
Select TableA.*
from TableA
Join TableB
On TableA.PK = TableB.FK
WHERE TableA.Field1 > TableB.Field2
I am using SQL Server 2005, and TableA.Field1 and TableB.Field2 look like:
2004102881010 - data type - varchar
My PK and FK look like:
0908232 - data type - nvarchar
The problem is that when this query is run, ALL the data is displayed and not just the rows where Field1 is greater.
Cheers:)

Seems to be working correctly for this demo code. Perhaps I'm not understanding the problem or data.
;
with TABLEA (PK, Field1) AS
(
-- Sample row that is filtered out
SELECT CAST('0908232' AS nvarchar(10)), CAST('2004102881010' AS varchar(50))
-- This is bigger than what's in B
UNION ALL SELECT CAST('0908232' AS nvarchar(10)), CAST('2005102881010' AS varchar(50))
)
, TABLEB(FK, Field2) AS
(
-- This matches row 1 above and will be excluded
SELECT CAST('0908232' AS nvarchar(10)), CAST('2004102881010' AS varchar(50))
)
SELECT TableA.*
FROM TableA
INNER JOIN TableB
ON TableA.PK = TableB.FK
WHERE TableA.Field1 > TableB.Field2
Results
PK Field1
0908232 2005102881010

This seems like a problem with missing zeroes:
20041028*0*81010
There is nothing wrong with your query; the problem is your data.
Consider 2001-01-01 01:01:01. Without leading zeroes this would be stored as: 200111111
It should be stored as: 20010101010101

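Comparison operators (>, <) used on strings (varchars, nvarchars, etc.) compare character by character rather than numerically. For example, '9' > '11' is true. You might try doing a data type conversion (to bigint, because values such as 2004102881010 are too large for int)...
WHERE CAST(TableA.Field1 AS bigint) > CAST(TableB.Field2 AS bigint)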
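The difference between the two kinds of comparison can be seen with a small standalone query. This is only a sketch; the column aliases are made up for illustration.

```sql
-- As strings, '9' sorts after '11' because '9' > '1' at the first character.
-- As numbers, 9 is less than 11.
SELECT CASE WHEN '9' > '11'
            THEN 'true' ELSE 'false' END AS StringComparison,
       CASE WHEN CAST('9' AS bigint) > CAST('11' AS bigint)
            THEN 'true' ELSE 'false' END AS NumericComparison;
```

Under typical collations the first column is 'true' and the second is 'false', which is why string values of differing lengths can compare in an unexpected order.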
