SQL Server : split value based on delimiter and match at run-time - sql-server

I have 2 simplified tables (all columns are varchar). Some rows in T1_TAB for F2 contain multiple values separated by ;, some do not have separators at all as shown below (sometimes ; might also appear at the beginning and/or at the end). F2 in T2_TAB always has a single value.
I need to be able to pull rows from either table based on a single selection from one table and a match on the F2 columns.
T1_TAB
F0 | F2
--------------
1 ;30
2 ;10;20;30
3 ;20;30;
4 10
T2_TAB
F1 | F2
-------------
100 10
200 20
300 30
I can do:
SELECT T1.F0
FROM T1_TAB T1
LEFT JOIN T2_TAB T2
ON T1.F2 LIKE '%' + T2.F2 + '%'
WHERE T2.F1 = '200'
This would give the result:
2
3
Now, I need to do the opposite. For instance:
Based on the condition WHERE T1.F0 = 3, I need to pull from T2 the rows with F1 equal to 200 and 300 respectively. I guess I need to somehow split ;20;30; on ";" and loop to match each value separately at run-time, disregarding blank tokens.

You can create a function which converts the semicolon-separated string into a table of values:
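create function dbo.CreateTableFromList(@Values varchar(1000))
returns @table table (id int)
as
begin
    declare @p int = 1, @q int = 1, @n int = len(@Values)
    -- <= (not <) so that a single-character final token, as in '1;2', is not skipped
    while @p <= @n begin
        set @q = charindex(';', @Values, @p)
        if @q = 0 set @q = @n + 1
        -- skip blank tokens produced by leading, trailing, or doubled delimiters
        if @q > @p
            insert into @table values (cast(substring(@Values, @p, @q - @p) as int))
        set @p = @q + 1
    end
    return
end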
and then
select T2.F1
from T2_TAB T2
where T2.F2 in (
    select ID
    from T1_TAB T1
    cross apply dbo.CreateTableFromList(T1.F2)
    where T1.F0 = '3'
)
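If a helper function is not wanted, the same match can be sketched with a delimiter-anchored LIKE. This is only an illustration under stated assumptions: the table variables @T1_TAB and @T2_TAB are stand-ins for the question's tables, loaded with the question's sample rows. Wrapping both sides in ";" means blank tokens are harmless and a value such as 20 cannot match inside a longer token such as 200.

```sql
-- Stand-in table variables holding the sample data from the question.
DECLARE @T1_TAB TABLE (F0 varchar(10), F2 varchar(100));
DECLARE @T2_TAB TABLE (F1 varchar(10), F2 varchar(100));

INSERT INTO @T1_TAB (F0, F2) VALUES ('1', ';30'), ('2', ';10;20;30'), ('3', ';20;30;'), ('4', '10');
INSERT INTO @T2_TAB (F1, F2) VALUES ('100', '10'), ('200', '20'), ('300', '30');

-- ';' + list + ';' guarantees every token is surrounded by delimiters,
-- so '%;20;%' only matches the whole token 20.
SELECT T2.F1
FROM @T1_TAB AS T1
JOIN @T2_TAB AS T2
    ON ';' + T1.F2 + ';' LIKE '%;' + T2.F2 + ';%'
WHERE T1.F0 = '3'
ORDER BY T2.F1;
-- Returns 200 and 300 for the sample data.
```

A LIKE with a leading wildcard cannot use an index seek, so this suits small tables; for larger ones the split-function approach is likely the better fit.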

Related

Compare the two tables and update the value in a Flag column

I have two tables and the values like this
`create table InputLocationTable(SKUID int,InputLocations varchar(100),Flag varchar(100))
create table Location(SKUID int,Locations varchar(100))
insert into InputLocationTable(SKUID,InputLocations) values(11,'Loc1, Loc2, Loc3, Loc4, Loc5, Loc6')
insert into InputLocationTable(SKUID,InputLocations) values(12,'Loc1, Loc2')
insert into InputLocationTable(SKUID,InputLocations) values(13,'Loc4,Loc5')
insert into Location(SKUID,Locations) values(11,'Loc3')
insert into Location(SKUID,Locations) values(11,'Loc4')
insert into Location(SKUID,Locations) values(11,'Loc5')
insert into Location(SKUID,Locations) values(11,'Loc7')
insert into Location(SKUID,Locations) values(12,'Loc10')
insert into Location(SKUID,Locations) values(12,'Loc1')
insert into Location(SKUID,Locations) values(12,'Loc5')
insert into Location(SKUID,Locations) values(13,'Loc4')
insert into Location(SKUID,Locations) values(13,'Loc2')
insert into Location(SKUID,Locations) values(13,'Loc2')`
I need to get the output by matching the SKUIDs from each table and update the value in the Flag column as shown in the screenshot. I have tried something like this code:
`SELECT STUFF((select ','+ Data.C1
FROM
(select
n.r.value('.', 'varchar(50)') AS C1
from InputLocation as T
cross apply (select cast('<r>'+replace(replace(Location,'&','&amp;'), ',', '</r><r>')+'</r>' as xml)) as S(XMLCol)
cross apply S.XMLCol.nodes('r') as n(r)) DATA
WHERE data.C1 NOT IN (SELECT Location
FROM Location) for xml path('')),1,1,'') As Output`
But I am not convinced by the output, and I am also trying to avoid the xml path code because its performance is not the best for this job. I need the output to look like the screenshot below. Any help would be greatly appreciated.
I think you need to first look at why you think the XML approach is not performing well enough for your needs, as it has actually been shown to perform very well for larger input strings.
If you only need to handle input strings of up to either 4000 or 8000 characters (non max nvarchar and varchar types respectively), you can utilise a tally table contained within an inline table valued function which will also perform very well. The version I use can be found at the end of this post.
Utilising this function we can split out the values in your InputLocations column, though we still need to use for xml to concatenate them back together for your desired format:
-- Define data
declare @InputLocationTable table (SKUID int,InputLocations varchar(100),Flag varchar(100));
declare @Location table (SKUID int,Locations varchar(100));
insert into @InputLocationTable(SKUID,InputLocations) values (11,'Loc1, Loc2, Loc3, Loc4, Loc5, Loc6'),(12,'Loc1, Loc2'),(13,'Loc4,Loc5'),(14,'Loc1');
insert into @Location(SKUID,Locations) values (11,'Loc3'),(11,'Loc4'),(11,'Loc5'),(11,'Loc7'),(12,'Loc10'),(12,'Loc1'),(12,'Loc5'),(13,'Loc4'),(13,'Loc2'),(13,'Loc2'),(14,'Loc1');
-- Query
-- Derived table splits out the values held within the InputLocations column
with i as
(
    select i.SKUID
          ,i.InputLocations
          ,s.item as Loc
    from @InputLocationTable as i
        cross apply dbo.fn_StringSplit4k(replace(i.InputLocations,' ',''),',',null) as s
)
select il.SKUID
      ,il.InputLocations
      ,isnull('Add ' -- The split Locations are then matched to those already in @Location and those not present are concatenated together.
              + stuff((select ', ' + i.Loc
                       from i
                           left join @Location as l
                               on i.SKUID = l.SKUID
                               and i.Loc = l.Locations
                       where il.SKUID = i.SKUID
                           and l.SKUID is null
                       for xml path('')
                      )
                     ,1,2,''
                     )
             ,'No Flag') as Flag
from @InputLocationTable as il
order by il.SKUID;
Output:
+-------+------------------------------------+----------------------+
| SKUID | InputLocations | Flag |
+-------+------------------------------------+----------------------+
| 11 | Loc1, Loc2, Loc3, Loc4, Loc5, Loc6 | Add Loc1, Loc2, Loc6 |
| 12 | Loc1, Loc2 | Add Loc2 |
| 13 | Loc4,Loc5 | Add Loc5 |
| 14 | Loc1 | No Flag |
+-------+------------------------------------+----------------------+
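For comparison, here is a hedged sketch of the same flag logic without for xml, assuming SQL Server 2017 or later, where the built-in STRING_SPLIT and STRING_AGG functions are available. The table variables mirror the sample data above; the OUTER APPLY alias m and the column name Missing are names chosen here for illustration. Neither function guarantees the order of the values, so the items inside Flag may come out in a different order than in the output above.

```sql
-- Same sample data as above (the Flag column is computed, so it is omitted here).
DECLARE @InputLocationTable TABLE (SKUID int, InputLocations varchar(100));
DECLARE @Location TABLE (SKUID int, Locations varchar(100));

INSERT INTO @InputLocationTable (SKUID, InputLocations)
VALUES (11, 'Loc1, Loc2, Loc3, Loc4, Loc5, Loc6'), (12, 'Loc1, Loc2'), (13, 'Loc4,Loc5'), (14, 'Loc1');
INSERT INTO @Location (SKUID, Locations)
VALUES (11, 'Loc3'), (11, 'Loc4'), (11, 'Loc5'), (11, 'Loc7'), (12, 'Loc10'), (12, 'Loc1'),
       (12, 'Loc5'), (13, 'Loc4'), (13, 'Loc2'), (13, 'Loc2'), (14, 'Loc1');

SELECT il.SKUID
      ,il.InputLocations
      ,ISNULL('Add ' + m.Missing, 'No Flag') AS Flag
FROM @InputLocationTable AS il
OUTER APPLY (
    -- Split the list, keep only values with no matching row in @Location,
    -- and glue them back together. With nothing missing, STRING_AGG yields NULL.
    SELECT STRING_AGG(LTRIM(RTRIM(s.value)), ', ') AS Missing
    FROM STRING_SPLIT(il.InputLocations, ',') AS s
    WHERE NOT EXISTS (SELECT 1
                      FROM @Location AS l
                      WHERE l.SKUID = il.SKUID
                        AND l.Locations = LTRIM(RTRIM(s.value)))
) AS m
ORDER BY il.SKUID;
```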
For nvarchar input (I have different functions for varchar and max type input) this is my version of the string splitting function linked above:
create function [dbo].[fn_StringSplit4k]
(
     @str nvarchar(4000) = ' '        -- String to split.
    ,@delimiter as nvarchar(1) = ','  -- Delimiting value to split on.
    ,@num as int = null               -- Which value in the list to return. NULL returns all.
)
returns table
as
return
    -- Start tally table with 10 rows.
    with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
    -- Select the same number of rows as characters in @str as incremental row numbers.
    -- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
    ,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
    -- Return the position of every value that follows the specified delimiter.
    ,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
    -- Return the start and length of every value, to use in the SUBSTRING function.
    -- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
    ,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
    select rn
          ,item
    from(select row_number() over(order by s) as rn
               ,substring(@str,s,l) as item
         from l
        ) a
    where rn = @num
        or @num is null;
go
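A short usage sketch, assuming the function above has been created in the dbo schema of the current database; the literal input string is made up for illustration:

```sql
-- All values, numbered in the order they appear in the string.
select rn, item
from dbo.fn_StringSplit4k(N'Loc1,Loc2,Loc3', N',', null);
-- rn  item
-- 1   Loc1
-- 2   Loc2
-- 3   Loc3

-- Only the second value.
select item
from dbo.fn_StringSplit4k(N'Loc1,Loc2,Loc3', N',', 2);
-- item
-- Loc2
```

The function does not trim whitespace, which is why the query earlier strips spaces with replace before splitting.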

How to Capture text from DB column in SQL Server 2012?

I have a column notes with a length of more than 80,0000 characters.
As per the transformation rule, I have to write a SQL script which will capture the notes column in the steps below:
First 300 characters in Column_A
Next 300 characters in Column_B
Next 300 characters in Column_C
and so on.
So I am looking for an output like the one below,
for every client ID, up to the end of the notes column.
Ouch! That's quite a complex requirement. You will need to combine a number of skills to solve this one.
Firstly you need to create additional rows. One way to achieve this is via recursion. In the example below I've calculated how many rows are required for each Client Id. I've then used recursion to create them.
You also need to break each row into three 300-character blocks. In my example I've used three 3-character blocks instead, so it's easier to read. But the principle will scale up. Using SUBSTRING and the record number you can calculate the starting point for each column.
I've created some sample records in a CTE called Raw. This allows anyone to follow the example, which is up on Stack Data Exchange (link below).
Example
DECLARE @ColumnWidth INT = 3; -- Use to adjust required length of columns A, B and C.
DECLARE @ColumnCount INT = 3; -- Use to adjust number of output columns.
WITH [Raw] AS
(
    /* This CTE creates sample records for us to experiment with.
     * The note column contains each letter of the alphabet, repeated
     * 3 times. The repetition will help us validate the result set.
     *
     * Using ceiling, to round up, the field length (@ColumnWidth),
     * the number of fields (@ColumnCount) and the number of characters (LEN)
     * we can calculate how many rows are required.
     */
    SELECT
        r.ClientId,
        r.Note,
        CEILING(CAST(LEN(r.Note) AS DECIMAL(18, 8)) / (@ColumnWidth * @ColumnCount)) AS RecordsRequired
    FROM
        (
            VALUES
                (1, 'aaabbbcccdddeeefffggghhhiiijjjkkklllmmmnnnooopppqqqrrrssstttuuuvvvwwwxxxyyyzz'),
                (2, 'aaabbbcccdddeeefffggghhhiiijjjkkklll'),
                (3, 'aaabbbcccdddeeefffggghhhiiijjjkkklllmmmnnno'),
                (4, 'aaabbbcccdddeeefffggghhhiiijjjkkklllmmmnnnoooppp'),
                (5, 'aaabbbcccdddeeefffggghhhiiijjj'),
                (6, 'aaabbbcccdd')
        ) AS r(ClientId, Note)
),
MultiRow AS
(
    /* This CTE uses recursion to return multiple rows for
     * each original row.
     * The number returned matches the RecordsRequired value
     * from the Raw CTE.
     */
    SELECT
        1 AS RecordNumber,
        RecordsRequired,
        ClientId,
        Note
    FROM
        [Raw]
    UNION ALL
    -- Keep repeating each record until the number of required rows has been returned.
    SELECT
        RecordNumber + 1 AS RecordNumber,
        RecordsRequired,
        ClientId,
        Note
    FROM
        MultiRow
    WHERE
        RecordNumber < RecordsRequired
)
/* Each record returned by the MultiRow CTE is numbered: 1, 2, 3 etc.
 * Using this we can extract blocks of text from the original Note column.
 */
SELECT
    ClientId,
    SUBSTRING(Note, ((@ColumnWidth * @ColumnCount) * RecordNumber) - ((@ColumnWidth * 3) -1), @ColumnWidth) AS Column_A,
    SUBSTRING(Note, ((@ColumnWidth * @ColumnCount) * RecordNumber) - ((@ColumnWidth * 2) -1), @ColumnWidth) AS Column_B,
    SUBSTRING(Note, ((@ColumnWidth * @ColumnCount) * RecordNumber) - ((@ColumnWidth * 1) -1), @ColumnWidth) AS Column_C
FROM
    MultiRow
ORDER BY
    ClientId, RecordNumber
;
Here is how you can do this:
DECLARE @c TABLE(ID INT, Notes VARCHAR(26))
INSERT INTO @c VALUES
(1, 'abcdefghijklmnopqrstuvwxyz'),
(2, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')
DECLARE @size INT = 26
DECLARE @chunk INT = 5
;WITH tally AS(SELECT 1 s1, @chunk + 1 s2, 2*@chunk + 1 s3
               UNION ALL
               SELECT s3 + @chunk, s3 + 2*@chunk, s3 + 3*@chunk FROM tally
               WHERE s3 < @size)
SELECT c.ID,
       SUBSTRING(Notes, t.s1, @chunk) A,
       SUBSTRING(Notes, t.s2, @chunk) B,
       SUBSTRING(Notes, t.s3, @chunk) C
FROM @c c
CROSS JOIN tally t
ORDER BY c.ID, t.s1
Output:
ID A B C
1 abcde fghij klmno
1 pqrst uvwxy z
2 ABCDE FGHIJ KLMNO
2 PQRST UVWXY Z
Description:
The tally CTE returns the starting positions that are passed to the SUBSTRING function. For the above configuration it returns:
s1 s2 s3
1 6 11
16 21 26
It is a recursive CTE that produces one row per output row, each holding the three starting positions (one per column). The rest should be easy to understand.
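When the notes differ in length, the fixed @size no longer fits every row. The following is a sketch of one way to adapt the idea, not part of the original answer: it derives @size from the longest note, joins each row only to the starting positions it actually needs, and lifts the default limit of 100 recursion levels, which a note of several hundred thousand characters split into 900-character rows would exceed. The second sample note is shortened here on purpose.

```sql
DECLARE @c TABLE (ID int, Notes varchar(max));
INSERT INTO @c (ID, Notes) VALUES
(1, 'abcdefghijklmnopqrstuvwxyz'),
(2, 'ABCDEFGHIJKL');

DECLARE @chunk int = 5;                                  -- characters per column
DECLARE @size int = (SELECT MAX(LEN(Notes)) FROM @c);    -- longest note drives the tally

;WITH tally AS (
    -- s1 is the start of column A on each output row; B and C follow at fixed offsets.
    SELECT 1 AS s1
    UNION ALL
    SELECT s1 + 3 * @chunk FROM tally WHERE s1 + 3 * @chunk <= @size
)
SELECT c.ID,
       SUBSTRING(c.Notes, t.s1, @chunk)              AS A,
       SUBSTRING(c.Notes, t.s1 + @chunk, @chunk)     AS B,
       SUBSTRING(c.Notes, t.s1 + 2 * @chunk, @chunk) AS C
FROM @c AS c
JOIN tally AS t
    ON t.s1 <= LEN(c.Notes)       -- skip rows that would be entirely empty for this note
ORDER BY c.ID, t.s1
OPTION (MAXRECURSION 0);          -- remove the default cap of 100 recursion levels
-- ID  A      B      C
-- 1   abcde  fghij  klmno
-- 1   pqrst  uvwxy  z
-- 2   ABCDE  FGHIJ  KL
```

LEN ignores trailing spaces, so a note that ends in spaces would need DATALENGTH or a similar adjustment.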
The required result can be obtained using simple looping
/* Declare a temporary table for storing the results */
DECLARE @Result_TABLE AS TABLE
(
     CustomerId BIGINT
    ,ColA VARCHAR(300)
    ,ColB VARCHAR(300)
    ,ColC VARCHAR(300)
)
DECLARE @CustomerCount INT --To store customer count
DECLARE @IteratorForCustomers INT = 1 --To iterate over the customers
/* Get count of customers */
SELECT @CustomerCount = COUNT (1) FROM Customers
DECLARE @CustomerId BIGINT --To store customer id in looping
DECLARE @TempNote VARCHAR(MAX) -- To store customer note of each customer in looping
/* Loop for all customers */
WHILE (@IteratorForCustomers <= @CustomerCount)
BEGIN
    ;WITH CTE AS
    (
        SELECT
             ROW_NUMBER() OVER (ORDER BY CustomerID ) AS RowId
            ,CustomerId
            ,Customer_Note
        FROM Customers
    )
    SELECT
         @CustomerId = a.CustomerId
        ,@TempNote = a.Customer_Note
    FROM CTE a
    WHERE
        RowId = @IteratorForCustomers
    /* Loop for generating each row with three columns */
    WHILE (LEN(@TempNote) > 0)
    BEGIN
        INSERT INTO @Result_TABLE
        VALUES
        (
            @CustomerId,SUBSTRING(@TempNote,1,300),SUBSTRING(@TempNote,301,300),SUBSTRING(@TempNote,601,300)
        )
        /* Drop the 900 characters just written; NULL ends the inner loop */
        SET @TempNote = CASE WHEN LEN(@TempNote) > 900 THEN SUBSTRING(@TempNote,901,LEN(@TempNote)-900)
                        ELSE NULL END
    END
    SET @IteratorForCustomers = @IteratorForCustomers + 1
END
SELECT * FROM @Result_TABLE

How to Pivot one Row into One Column

Can someone please help me out.
I've looked around and can't find something similar to what I need to do. Basically,
I have a table that will need to be pivoted, it is coming from a flat file that loads all columns as one comma delimited column. I will need to break out the columns into their respective order before the pivot and I've got procedures that do this beautifully. However, the crux of this table is that I need to edit the headers before I can continue.
I need help to pivot the information in the first column and put it another table I've created. Therefore, I need this
ID Column01
1 Express,,,Express,,,HyperMakert,,WebStore,Web
To End up like this....
New_ID New_Col
1 Express
2
3
4 Express
5
6
7 HyperMarket
8
9 WebStore
10 Web
Please note that I need to include the '' blank columns from the original row.
I looked at the links below, but they were not helpful:
SQL Server : Transpose rows to columns
Efficiently convert rows to columns in sql server
Mysql query to dynamically convert rows to columns
There are many methods of splitting string in SQL Server you can find on the web, some are really complicated but some are just simple. I like the way of using dynamic query. It's just short and simple (not sure about the performance but I believe it would be not too bad):
declare @s varchar(max)
-- save the Column01 string/text into @s variable
select @s = Column01 from test where ID = 1
-- build the query string
set @s = 'select row_number() over (order by current_timestamp) as New_ID, c as New_Col from (values ('''
    + replace(@s, ',', '''),(''') + ''')) v(c)'
insert newTable exec(@s)
go
select * from newTable
Sqlfiddle Demo
The use of values() clause above is some kind of anonymous table, here is a simple example of such usage (so that you can understand it better). The anonymous table in the following example has just 1 column, the table name is v and the column name is c. Each row has just 1 cell and should be wrapped in a pair of parentheses (). The rows are separated by commas and follow after values. Here is the code:
-- note about the outside (...) wrapping values ....
select * from (values ('a'),('b'),('c'), ('d')) v(c)
The result will be:
c
------
1 a
2 b
3 c
4 d
Just try running that code and you'll understand how useful it is.
You may want to use a tally table here. See http://www.sqlservercentral.com/articles/T-SQL/62867/
declare @parameter varchar(4000);
set @parameter = 'Express,,,Express,,,HyperMakert,,WebStore,Web';
set @parameter = ',' + @parameter + ','; -- add commas; the semicolon is required before WITH
with
e1 as(select 1 as N union all select 1), -- 2 rows
e2 as(select 1 as N from e1 as a, e1 as b), -- 4 rows
e3 as(select 1 as N from e2 as a, e2 as b), -- 16 rows
e4 as(select 1 as N from e3 as a, e3 as b), -- 256 rows
e5 as(select 1 as N from e4 as a, e4 as b), -- 65536 rows
tally as (select row_number() over(order by N) as N from e5
)
select
    substring(@parameter, N+1, charindex(',', @parameter, N+1) - N-1)
from tally
where
    N < len(@parameter)
    and substring(@parameter, N, 1) = ','
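The query above returns only the values. A sketch of how the New_ID column from the question could be added, assuming the position of each delimiter (N) is an acceptable ordering key; the column aliases come from the question, the sample string is spelled HyperMarket as in the expected output, and the smaller 256-row tally is a simplification that is only enough for strings up to 256 characters:

```sql
DECLARE @parameter varchar(4000) = 'Express,,,Express,,,HyperMarket,,WebStore,Web';
SET @parameter = ',' + @parameter + ',';   -- wrap in commas so every token has a delimiter on both sides

WITH
e1 AS (SELECT 1 AS N UNION ALL SELECT 1),          -- 2 rows
e2 AS (SELECT 1 AS N FROM e1 AS a, e1 AS b),       -- 4 rows
e3 AS (SELECT 1 AS N FROM e2 AS a, e2 AS b),       -- 16 rows
e4 AS (SELECT 1 AS N FROM e3 AS a, e3 AS b),       -- 256 rows
tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM e4)
SELECT ROW_NUMBER() OVER (ORDER BY N) AS New_ID,   -- tokens numbered by delimiter position
       SUBSTRING(@parameter, N + 1, CHARINDEX(',', @parameter, N + 1) - N - 1) AS New_Col
FROM tally
WHERE N < LEN(@parameter)
  AND SUBSTRING(@parameter, N, 1) = ','
ORDER BY New_ID;
-- Returns 10 rows; the blank tokens come back as empty strings in rows 2, 3, 5, 6 and 8.
```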

How to find difference of two row values and store it as third row in the same table?

I want to find the difference between two rows of a table and insert the difference as a third row in the same table. If the value in the first row is less than the value in the second row, the difference should appear within parentheses instead of with a negative sign.
Ex:
Name S1 S2 S3 S4
xxx 98 70 50 85
xxx1 50 90 35 105
Diff 48 (20) 15 (20)
Kindly say whether this is possible, perhaps in the front end, that is, after the values are stored in a GridView. In the GridView, however, the columns and rows are transposed, so there the difference of two columns would be stored as a third, new column.
Here's some code which will do it:
SELECT
Name
,S1
,S2
FROM
(
select
x.Name as xName
,CONVERT(varchar(99), x.S1) as xS1 -- Have to convert everything since you require negatives as ()
,CONVERT(varchar(99), x.S2) as xS2
,1 as xOrder -- To ensure the rows are presented as requested
,x1.Name as x1Name
,CONVERT(varchar(99), x1.S1) as x1S1
,CONVERT(varchar(99), x1.S2) as x1S2
,2 as x1Order
,case
when SIGN(x.S1 - x1.S1) = -1
then '(' + CONVERT(varchar(99), ABS(x.S1 - x1.S1)) + ')' -- ABS drops the minus sign inside the parentheses
else CONVERT(varchar(99), x.S1 - x1.S1)
end as d1
,case
when SIGN(x.S2 - x1.S2) = -1
then '(' + CONVERT(varchar(99), ABS(x.S2 - x1.S2)) + ')'
else CONVERT(varchar(99), x.S2 - x1.S2)
end as d2
-- etc for S3 and S4
,3 as DiffOrder
from
(select * from dbo.Data where Name = 'xxx') as x
inner join
(select * from dbo.Data where Name = 'xxx1') as x1
on x1.Name = x.Name + '1' -- I suspect your row matching condition will have
-- to be more sophisticated than this in reality
) as Data
-- The cross apply does an UNPIVOT on the cheap
CROSS APPLY ( VALUES
(xName, xS1, xS2, xOrder),
(x1Name, x1S1, x1S2, x1Order),
('Diff', d1, d2, DiffOrder)
) -- Close paren from the CROSS APPLY
as ca(Name, S1, S2, SortOrder)
order by ca.SortOrder;
It would be much, much better to have your presentation layer perform the formatting of negatives.
Because you require negatives presented in this way the columns have to be CONVERTed to character types. This adds complexity to the SQL. It may also mean they are no longer right-aligned. Again, good presentation software should be able to deal with that for you better than SQL can.
I suspect that in your actual system you will have many row-pairs. You haven't said what the rules to match them are. You will have to add this into the code. Make sure you change the sorting to account for this also.
The code could be reformatted using CTEs, if you find that layout simpler to understand and maintain.
I have no idea how you intend to insert varchar into a numeric column. I assume s1-s4 are numeric as they should be.
Your data is transposed, so the best approach is to rotate it 90 degrees first to do the calculation, then turn the result back to display it correctly:
declare @t table(Name varchar(4), S1 int, S2 int, S3 int, S4 int)
insert @t values('xxx',98,70,50,85),
('xxx1',50,90,35,105)
--Diff 48 (20) 15 (20)
-- this next part can be put into a view
;with cte as
(
    select
        sum(case when Name = 'xxx' then Value else -Value end) Total, Col
    from @t t1
    UNPIVOT
    (Value FOR Col IN
    ([S1], [S2], [S3], [S4]) ) AS unpvt
    group by Col
), cte2 as
(
    select case when Total < 0 then
        '(' + cast(-Total as varchar(10))+')'
        else cast(Total as varchar(10)) end TotalVarchar, Col
    from cte
)
select Name,
    cast(S1 as varchar(20)) S1, cast(S2 as varchar(20)) S2,
    cast(S3 as varchar(20)) S3, cast(S4 as varchar(20)) S4
from @t
union all
select 'Diff' Name, [S1], [S2], [S3], [S4]
from
    cte2
PIVOT (max(TotalVarchar) FOR [Col] IN ([S1], [S2], [S3], [S4])) AS pvt
Result:
Name S1 S2 S3 S4
xxx 98 70 50 85
xxx1 50 90 35 105
Diff 48 (20) 15 (20)

How do I get the "Next available number" from an SQL Server? (Not an Identity column)

Technologies: SQL Server 2008
So I've tried a few options that I've found on SO, but nothing really provided me with a definitive answer.
I have a table with two columns, (Transaction ID, GroupID) where neither has unique values. For example:
TransID | GroupID
-----------------
23 | 4001
99 | 4001
63 | 4001
123 | 4001
77 | 2113
2645 | 2113
123 | 2113
99 | 2113
Originally, the groupID was just chosen at random by the user, but now we're automating it. Thing is, we're keeping the existing DB without any changes to the existing data(too much work, for too little gain)
Is there a way to query "GroupID" on table "GroupTransactions" for the next available value of GroupID > 2000?
From the question, I think you're after the next available value, which may not be the same as max+1. In that case:
Start with a list of integers, and look for those that aren't there in the groupid column, for example:
;WITH CTE_Numbers AS (
    SELECT n = 2001
    UNION ALL
    SELECT n + 1 FROM CTE_Numbers WHERE n < 4000
)
SELECT top 1 n
FROM CTE_Numbers num
WHERE NOT EXISTS (SELECT 1 FROM MyTable tab WHERE num.n = tab.groupid)
ORDER BY n
OPTION (MAXRECURSION 0) -- the default limit of 100 recursion levels is too low for this range
Note: you need to tweak the 2001/4000 values in the CTE to allow for the range you want. I assumed the name of your table to be MyTable.
select max(groupid) + 1 from GroupTransactions
The following will find the next gap above 2000:
SELECT MIN(t.GroupID)+1 AS NextID
FROM GroupTransactions t (updlock)
WHERE NOT EXISTS
(SELECT NULL FROM GroupTransactions n WHERE n.GroupID=t.GroupID+1 AND n.GroupID>2000)
AND t.GroupID>2000
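One detail worth noting is that a gap query of this shape only looks after GroupIDs that already exist, so it cannot return 2001 itself when no row holds 2001. A hedged sketch of a variant that seeds the candidate list with 2000 follows; the table variable @GroupTransactions is a stand-in loaded with the question's sample rows, and locking hints are left out for brevity:

```sql
DECLARE @GroupTransactions TABLE (TransID int, GroupID int);
INSERT INTO @GroupTransactions (TransID, GroupID) VALUES
(23, 4001), (99, 4001), (63, 4001), (123, 4001),
(77, 2113), (2645, 2113), (123, 2113), (99, 2113);

-- Candidates are every existing GroupID above 2000 plus the seed 2000;
-- the answer is the smallest candidate whose successor is not taken.
SELECT MIN(c.GroupID) + 1 AS NextID
FROM (SELECT GroupID FROM @GroupTransactions WHERE GroupID > 2000
      UNION
      SELECT 2000) AS c
WHERE NOT EXISTS (SELECT 1
                  FROM @GroupTransactions AS n
                  WHERE n.GroupID = c.GroupID + 1);
-- NextID = 2001 for the sample data, because no row uses 2001.
```

In a concurrent system the read and the subsequent insert would still need to run in one transaction with suitable locking, as the updlock hint in the query above suggests.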
There are always many ways to do everything. I resolved this problem by doing like this:
declare @i int = null
declare @t table (i int)
insert into @t values (1)
insert into @t values (2)
--insert into @t values (3)
--insert into @t values (4)
insert into @t values (5)
--insert into @t values (6)
--get the first missing number
select @i = min(RowNumber)
from (
    select ROW_NUMBER() OVER(ORDER BY i) AS RowNumber, i
    from (
        --select distinct in case a number is in there multiple times
        select distinct i
        from @t
        --start after 0 in case there are negative or 0 number
        where i > 0
    ) as a
) as b
where RowNumber <> i
--if there are no missing numbers or no records, get the max record
if @i is null
begin
    select @i = isnull(max(i),0) + 1 from @t
end
select @i
In my situation I have a system that generates message numbers, or file/case/reservation numbers, sequentially from 1 every year. But in some situations a number does not get used (the user was testing or practicing, or for whatever other reason) and the number is deleted.
You can use a where clause to filter by year if all entries are in the same table, and make it dynamic (my example is hardcoded). If you archive your yearly data, this is not needed. The sub-query parts for mID and mID2 must be identical.
The "union select 0 as seq" in mID is there in case your table is empty; it is the base seed number. It can be anything, e.g. 3000000 or {prefix}0000. The field is an integer. If you omit it, the query will not work on an empty table, and when ID 1 is missing it will give you the ID after the first existing one (if the first number is 4, the value returned will be 5).
This query is very quick (hint: the field must be indexed); it was tested on a table of 100,000+ rows. I found that using a domain aggregate gets slower as the table grows.
If you remove the "top 1" you will get a list of 'next numbers', but not all the missing numbers in a sequence; i.e. if you have 1 2 4 7 the result will be 3 5 8.
set @newID = (select top 1 mID.seq + 1 as seq from
(select a.[msg_number] as seq from [tblMSG] a --where a.[msg_date] between '2023-01-01' and '2023-12-31'
union select 0 as seq ) as mID
left outer join
(Select b.[msg_number] as seq from [tblMSG] b --where b.[msg_date] between '2023-01-01' and '2023-12-31'
) as mID2 on mID.seq + 1 = mID2.seq where mID2.seq is null order by mID.seq)
-- Next: a statement to insert a row with @newID immediately in tblMSG (in a transaction block).
-- Then the row can be updated by your app.
