Need to insert breaks in strings of every column with ' ' or ',' - sql-server

I have a table that has one column but over 100,000 rows:
Col_Name
qwchijhuirhxnihdiuyfnx
dhjhfiurhncnmxmzjcoinrds
xnbxknsiiuncirnxknrxnxz
I need to insert a '.' or '$' or some marker after every 3rd character
Example of result needed:
Col_Name
qwc.hij.hui.rhx.nih.diu.yfn.x
dhj.hfi.urh.ncn.mxm.zjc.oin.rds.
xnb.xkn.sii.unc.irn.xkn.rxn.xz
I originally solved this with:
INSERT INTO New_Table
(
c1
,c2
,c3
)
SELECT
substring(CAST(Col_Name AS VARCHAR(MAX)),1,3) as C1
,substring(CAST(Col_Name AS VARCHAR(MAX)),4,3) as C2
,substring(CAST(Col_Name AS VARCHAR(MAX)),7,3) as C3
From Table_Name
This causes problems later in the script, so the data must remain in one column. It could be inserted into a new table, as long as that new table has just one column.

Here's a SQL Fiddle starting point you can refactor, using a function and a WHILE loop: http://sqlfiddle.com/#!6/ab6dd/1/0
If you need speed, you may be able to do something more efficient with regular expressions or SQLCLR.
CREATE FUNCTION dbo.dotify (@input varchar(MAX))
RETURNS varchar(MAX)
AS
BEGIN
    DECLARE @output varchar(MAX) = ''
    declare @index int = 1 -- SUBSTRING positions are 1-based
    declare @length int
    set @length = len(@input)
    while @index <= @length
    begin
        -- copy the current character
        SET @output = @output + substring(@input, @index, 1)
        -- append the marker after every 3rd character
        if (@index % 3) = 0
        BEGIN
            SET @output = @output + '.'
        END
        set @index = @index + 1
    end
    return(@output)
END
GO
select TOP 10000 col_name, dbo.dotify(col_name) FROM old_table
You can use TOP to limit the processing time to a few seconds so you can easily profile efficiency changes you make.
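As a point of comparison for that profiling, here is a minimal set-based sketch that avoids the WHILE loop. It is an illustration under stated assumptions, not the answer's method: it assumes old_table(col_name) as used above, that col_name is never NULL, and it borrows sys.all_objects purely as a convenient row source for the numbers 1..N (so strings longer than the number of rows in that catalog view would be cut short).

```sql
-- Sketch: insert '.' after every 3rd character without a loop.
-- Assumptions: old_table(col_name) exists, col_name is NOT NULL, and
-- sys.all_objects has at least as many rows as the longest string.
SELECT o.col_name,
       dotted = (
           SELECT SUBSTRING(o.col_name, n.i, 1)
                  + CASE WHEN n.i % 3 = 0 THEN '.' ELSE '' END
           FROM (
               -- generate the positions 1..LEN(col_name) for this row
               SELECT TOP (LEN(o.col_name))
                      ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
               FROM sys.all_objects
           ) AS n
           ORDER BY n.i
           FOR XML PATH(''), TYPE
       ).value('.', 'varchar(max)')
FROM old_table AS o;
```

Like dotify, this appends the marker after the last character when the length is a multiple of 3. Whether it is actually faster than the scalar function depends on the data, so it is worth measuring both with the TOP approach described above.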

Related

Can I assign value to variable or parameter after the execution code is defined?

I have a fairly large script, shrunk and simplified for this question. The overall principle is that some code needs to be run several times with only small adjustments for each iteration. The script is built as a major loop containing several subloops. Today the whole select statement is hard-coded in each loop. My thought was to write the select statement once and let only the parts that differ be changed inside the loop. The purpose is easier maintenance.
Example of the script:
declare
    @i1 int,
    @i2 int,
    @t nvarchar(50),
    @v nvarchar(50),
    @s nvarchar(max)
set @i1 = 1
while @i1 < 3
begin
    if @i1 = 1
    begin
        set @i2 = 1
        set @t = 'Ansokningsomgang'
        set @s = '
select ' + @v + '_Ar,count(*) as N
from (
select left(' + @v + ',4) as ' + @v + '_Ar
from Vinnova_' + @t + '
) a
group by ' + @v + '_Ar
order by ' + @v + '_Ar
'
        while @i2 < 4
        begin
            if @i2 = 1
            begin
                set @v = 'diarienummer'
                exec sp_executesql
                    @stmt = @s,
                    @params = N'@tab as nvarchar(50), @var as nvarchar(50)',
                    @tab = @t, @var = @v
            end
            else if @i2 = 2
            begin
                set @v = 'utlysning_diarienummer'
                exec sp_executesql
                    @stmt = @s,
                    @params = N'@tab as nvarchar(50), @var as nvarchar(50)',
                    @tab = @t, @var = @v
            end
            else if @i2 = 3
            begin
                set @v = 'utlysning_program_diarienummer'
                exec sp_executesql
                    @stmt = @s,
                    @params = N'@tab as nvarchar(50), @var as nvarchar(50)',
                    @tab = @t, @var = @v
            end
            set @i2 = @i2 + 1
        end
    end
    else
        print('Nr: ' + cast(@i1 as char))
    set @i1 = @i1 + 1
end
This script doesn't work. It runs through but produces no output. If I assign @v above the assignment of @s it works, but then I need to assign @s again every time I change the value of @v, and then there is no point in doing this.
@i1 iterates far more times than what is shown here.
The else branch of "if @i1" doesn't exist in the real script. It stands in for a bunch of subloops that run for every value that is allowed for @i1 in this example.
I also tried to just execute @s like:
exec(@s)
in every loop. Same result.
So what am I missing?
The database engine is MS SQL Server.
Your parallel-structured tables are not 'normalized' to any degree,
and you are now suffering the consequences. Typically, the best
approach is to make the data more normalized before you take any
other action.
Dynamic SQL could make this task easier, and it is okay as long as
it's an ad-hoc task that you use to begin building permanent tables,
with the aim of making your various parallel tables obsolete. It is
not okay if it is part of a regular process, because someone could
enter malicious code into one of your table values and do some
damage. This is particularly true in your case because your use of
the left function implies that your columns are character-based.
Here's some code to put your data in a more normal form. It can be
made more normal after this, so it is only the first step. But it
gets you to the point where using the data for your purpose is far
easier, and so will hopefully motivate you to redesign.
-- plug in the parallel tables you want to normalize
declare @tablesToNormalize table (id int identity(1,1), tbl sysname);
insert @tablesToNormalize values ('Ansokningsomgang'), ('Ansokningsomgang2');
-- create a table that will hold the restructured data
create table ##normalized (
    tbl sysname,
    rowKey int, -- optional, but needed if restructure is permanent
    col sysname,
    category varchar(50),
    value varchar(50)
);
-- create template code to restructure and insert a table's data
-- into the normalized table (notice the use of @tbl as a string,
-- not as a variable)
declare @templateSql nvarchar(max) = '
insert ##normalized
select tbl = ''Vinnova_@tbl'',
rowKey = t.somePrimaryKey, -- optional, but needed if restructure is permanent
ap.col,
category = left(ap.value, 4),
ap.value
from Vinnova_@tbl t
cross apply (values
(''diarienummer'', diarienummer),
(''utlysning_diarienummer'', utlysning_diarienummer),
(''utlysning_program_diarienummer'', utlysning_program_diarienummer)
-- ... and so on (much better than writing a nested loop for every row)
) ap (col, value)
';
-- loop the table names and run the template (notice the 'replace' function)
declare @id int = 1, @tbl sysname, @sql nvarchar(max);
while @id <= (select max(id) from @tablesToNormalize)
begin
    set @tbl = (select tbl from @tablesToNormalize where id = @id);
    -- substitute the literal text '@tbl' in the template with the table name
    set @sql = replace(@templateSql, '@tbl', @tbl);
    exec (@sql);
    set @id += 1; -- advance to the next table, otherwise the loop never ends
end
Now that your data is in a more normal form, code for your purpose
is much simpler, and the output far cleaner.
select tbl, col, category, n = count(value)
from ##normalized
group by tbl, col, category
order by tbl, col, category;
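As for why the script in the question produces no output: the statement string is concatenated while the column variable is still NULL, and in T-SQL concatenating NULL with + yields NULL by default, so the string that gets executed is most likely NULL. The same template-and-replace idea used above also covers the "write the statement once" goal. The following is a minimal sketch, assuming the table and column names from the question; the {v} and {t} placeholder tokens and the variable names are hypothetical.

```sql
-- Sketch: define the statement once with placeholder tokens, then
-- substitute the current values right before each execution.
-- {v} and {t} are illustrative tokens, not anything SQL Server defines.
DECLARE @template nvarchar(max) = N'
select {v}_Ar, count(*) as N
from (
    select left({v}, 4) as {v}_Ar
    from Vinnova_{t}
) a
group by {v}_Ar
order by {v}_Ar';

DECLARE @t sysname = N'Ansokningsomgang';
DECLARE @v sysname = N'diarienummer';   -- set inside the loop in the real script

-- Build the concrete statement only after @v and @t have values.
DECLARE @s nvarchar(max) = REPLACE(REPLACE(@template, N'{v}', @v), N'{t}', @t);

PRINT @s;                      -- inspect the generated statement first
-- EXEC sys.sp_executesql @s;  -- then run it once it looks right
```

Because the names are spliced into the statement text, they should come only from a fixed, trusted list, for the injection reasons given in the answer.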

How to find hardcoded values defined in text using MS SQL query without using functions

Requirement: I have a table that stores queries (SQL programs). I need to search this table for queries that contain hardcoded values for a particular column (name), as shown below:
SELECT
*
FROM TABLE1 AC
WHERE
AC.name = 'hardcoded_value1'
UNION
SELECT
*
FROM TABLE2 BC
WHERE BC.name = 'hardcoded_value2'
I have done this using a function and it works fine. But the requirement has a constraint that doesn't allow the use of any function or stored procedure.
Below is the function definition for reference:
CREATE OR ALTER FUNCTION [dbo].[GetConstantValue](@QueryID INT)
RETURNS
@Constantvalue TABLE
(
    name_ NVARCHAR(2000)
)
AS
BEGIN
    Declare @Query NVARCHAR(max) = (SELECT code FROM QUERY_TABLE WHERE ID = @QueryID)
    Declare @StartIndex int = 0, @EndIndex int = 0, @Count int = 0, @ConstStr nvarchar(max) = ''
    WHILE @Count <= LEN(@Query)
    BEGIN
        IF SUBSTRING(@Query, @Count, 1) = CHAR(39) -- CHAR(39) is a single quote
        BEGIN
            IF @StartIndex <> 0
            BEGIN
                -- closing quote: append the literal to the '|'-separated list
                SET @ConstStr = @ConstStr + CASE WHEN LEN(@ConstStr)>0 THEN '|' ELSE '' END + SUBSTRING(@Query, @StartIndex+1, @Count-(@StartIndex+1))
                SET @StartIndex = 0
                SET @EndIndex = 0
            END
            ELSE
                -- opening quote: only track it if "name =" appears just before it
                IF SUBSTRING(@Query, @Count-20, 20) LIKE '%name%[=]%'
                    SET @StartIndex = @Count
        END
        SET @Count = @Count + 1
    END
    INSERT INTO @Constantvalue
    SELECT Value FROM string_split(@ConstStr,'|')
    RETURN
END
Please suggest a way to achieve this in the main query itself, without making any function calls.

Convert a SQL function into a stored procedure

I am having trouble converting a UDF into a stored procedure.
Here is what I've got: this is the stored procedure that calls the function (I am using it to search for and remove all UNICODE characters that are not between 32 and 126):
ALTER PROCEDURE [dbo].[spRemoveUNICODE]
    @FieldList varchar(250) = '',
    @Multiple int = 0,
    @TableName varchar(100) = ''
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @SQL VARCHAR(MAX), @counter INT = 0
    IF @Multiple > 0
    BEGIN
        DECLARE @Field VARCHAR(100)
        SELECT splitdata
        INTO #TempValue
        FROM dbo.fnSplitString(@FieldList,',')
        WHILE (SELECT COUNT(*) FROM #TempValue) >= 1
        BEGIN
            DECLARE @Column VARCHAR(100) = (SELECT TOP 1 splitdata FROM #TempValue)
            SET @SQL = 'UPDATE ' + @TableName + ' SET ' + @Column + ' = dbo.RemoveNonASCII(' + @Column + ')'
            EXEC (@SQL)
            --print @SQL
            SET @counter = @counter + 1
            PRINT @column + ' was checked for ' + @counter + ' rows.'
            DELETE FROM #TempValue
            WHERE splitdata = @Column
        END
    END
    ELSE IF @Multiple = 0
    BEGIN
        SET @SQL = 'UPDATE ' + @TableName + ' SET ' + @FieldList + ' = dbo.RemoveNonASCII(' + @FieldList + ')'
        EXEC (@SQL)
        --print @SQL
        SET @counter = @counter + 1
        PRINT @column + ' was checked for ' + @counter + ' rows.'
    END
END
And here is the UDF that I created to help with the update (RemoveNonASCII):
ALTER FUNCTION [dbo].[RemoveNonASCII]
(@nstring nvarchar(max))
RETURNS varchar(max)
AS
BEGIN
    -- Variables
    DECLARE @Result varchar(max) = '', @nchar nvarchar(1), @position int
    -- T-SQL statements to compute the return value
    set @position = 1
    while @position <= LEN(@nstring)
    BEGIN
        set @nchar = SUBSTRING(@nstring, @position, 1)
        if UNICODE(@nchar) between 32 and 127
            set @Result = @Result + @nchar
        set @position = @position + 1
        set @Result = REPLACE(@Result,'))','')
        set @Result = REPLACE(@Result,'?','')
    END
    if (@Result = '')
        set @Result = null
    -- Return the result
    RETURN @Result
END
I've been trying to convert it into a stored procedure. I want to track how many rows actually get updated when this is run. Right now it just says that all rows, however many I run this on, are updated. I want to know if say only half of them had bad characters. The stored procedure is already set up so that it tells me which column it is looking at, I want to include how many rows were updated. Here is what I've tried so far:
DECLARE @Result varchar(max) = '', @nchar nvarchar(1), @position int, @nstring nvarchar(max), @counter int = 0, @CountRows int = 0, @Length int
--select Notes from #Temp where Notes is not null order by Notes OFFSET @counter ROWS FETCH NEXT 1 ROWS ONLY
set @nstring = (select Notes from #Temp where Notes is not null order by Notes OFFSET @counter ROWS FETCH NEXT 1 ROWS ONLY)
set @Length = LEN(@nstring)
if @Length = 0 set @Length = 1
-- Add the T-SQL statements to compute the return value here
set @position = 1
while @position <= @Length
BEGIN
    print @counter
    print @CountRows
    select @nstring
    set @nchar = SUBSTRING(@nstring, @position, 1)
    if UNICODE(@nchar) between 32 and 127
    begin
        print unicode(@nchar)
        set @Result = @Result + @nchar
        set @counter = @counter + 1
    end
    if UNICODE(@nchar) not between 32 and 127
    begin
        set @CountRows = @CountRows + 1
    end
    set @position = @position + 1
END
print 'Rows found with invalid UNICODE: ' + convert(varchar,@CountRows)
Right now I'm purposely creating a temp table, adding a bunch of notes, and then adding in a bunch of invalid characters.
I created a list of 700+ Notes and then updated 2 of them with some invalid characters (outside the 32 - 127 range). A few are NULL and a few are not NULL but have nothing in them. What happens is that I get 0 updates.
Rows found with invalid UNICODE: 0
Though it does see that the UNICODE for the one that it pulls is 32.
Obviously I'm missing something; I just don't see what it is.
Here is a set-based solution to handle your bulk replacements. Instead of a slow scalar function, this uses an inline table-valued function. These are far faster than their scalar ancestors. I am using a tally table here. I keep this as a view on my system like this.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
If you are interested in tally tables, here is an excellent article on the topic: http://www.sqlservercentral.com/articles/T-SQL/62867/
create function RemoveNonASCII
(
    @SearchVal nvarchar(max)
) returns table as
RETURN
with MyValues as
(
    select substring(@SearchVal, N, 1) as MyChar
        , t.N
    from cteTally t
    where N <= len(@SearchVal)
        and UNICODE(substring(@SearchVal, N, 1)) between 32 and 127
)
select distinct MyResult = STUFF((select MyChar + ''
        from MyValues mv2
        order by mv2.N
        --for xml path('')), 1, 0, '')
        FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'), 1, 0, '')
from MyValues mv
;
Now instead of being forced to call this every single row you can utilize cross apply. The performance benefit of just this portion of your original question should be pretty huge.
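To connect this back to the row count the question asks for, here is a minimal sketch of how the inline function might be applied with CROSS APPLY. It assumes the function above was created as dbo.RemoveNonASCII and uses the #Temp table and Notes column from the question; the variable name @RowsChanged is hypothetical.

```sql
-- Sketch: update only the rows whose cleaned value differs, then report the count.
-- Assumes dbo.RemoveNonASCII is the inline table-valued function shown above
-- and #Temp(Notes) is the test table from the question.
DECLARE @RowsChanged int;

UPDATE t
SET    t.Notes = x.MyResult
FROM   #Temp AS t
CROSS APPLY dbo.RemoveNonASCII(t.Notes) AS x
-- compare under a binary collation so any differing character counts as a change
WHERE  t.Notes COLLATE Latin1_General_BIN2 <> x.MyResult COLLATE Latin1_General_BIN2;

SET @RowsChanged = @@ROWCOUNT;  -- read immediately after the UPDATE
PRINT 'Rows with characters removed: ' + CONVERT(varchar(10), @RowsChanged);
```

Two caveats with this sketch: rows where Notes is NULL, empty, or contains no characters in the 32 to 127 range return no row from the function and are therefore skipped by CROSS APPLY, and the cteTally view above tops out at 10,000 rows, so longer strings would need a larger tally source.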
I also alluded to your string splitter being a potential performance issue. Here is an excellent article with a number of very fast set-based string splitters: http://sqlperformance.com/2012/07/t-sql-queries/split-strings
The last step here would be to eliminate the first loop in your procedure. This can also be done, but I am not entirely certain what your code is doing there. I will look closer and see what I can find out. In the meantime, work through this and feel free to ask questions about any parts you don't understand.
Here is what I've got working based on the great help from Sean Lange:
How I call the Stored Procedure:
exec spRemoveUNICODE @FieldList='Notes,Notes2,Notes3,Notes4,Notes5', @Multiple=1, @TableName='#Temp'
The #Temp table is created:
create table #Temp (ID int,Notes nvarchar(Max),Notes2 nvarchar(max),Notes3 nvarchar(max),Notes4 nvarchar(max),Notes5 nvarchar(max))
Then I fill it with comments from 5 fields from a couple of different tables that range in length from NULL to blank (but not null) to 5000 characters.
I then insert some random characters like this:
update #Temp
set Notes2 = SUBSTRING(Notes2,1,LEN(Notes2)/2) + N'￿㹊潮Ņ᯸ࢹᖈư㹨ƶ槹鎤⻄ƺ綐ڌ⸀ƺ삸)䀤ƍ샄)Ņᛡ鎤ꗘᖃᒨ쬵Ğᘍ鎤ᐜᏰ>֔υ赸Ƹ쳰డ촜)鉀௿촜)쮜)Ἡ屰山舰霡ࣆ 耏Аం畠Ư놐ᓜતᏛ֔Ꮫ֨Ꮫ꯼ᓜƒ 邰఍厰ఆ邰఍드)抉鎤듄)繟Ĺ띨)᯸ࢹ䮸ࣉ᯸ࢹ䮸ࣉ샰)ԌƏ￿

How to insert comma-separated values into a column of a SQL table in the same order as they are passed?

I have written the code below to insert comma-separated values into tempTble. It works, but I need the values to be inserted in the same order in which I pass them to the query. Instead, numbers are arranged in numerical order and strings in alphabetical order. For example, I pass '7,6,5,1,2,Jack,Ana,Micky' but it is inserted into the column in the order '1,2,5,6,7,Ana,Jack,Micky'.
Can you please provide an answer for this?
Thank you in advance.
ALTER PROCEDURE [dbo].[usp_GetValuesFromBillingSystem]
(
    @BillingSystemCode VARCHAR(max)
)
AS
BEGIN
    DECLARE @planID varchar(max) = Null ;
    SET @planID = @BillingSystemCode
    DECLARE @tempTble Table (planID varchar(50) NULL);
    while len(@planID) > 0
    begin
        insert into @tempTble (planID) values(left(@planID, charindex(',', @planID + ',')-1))
        set @planID = stuff(@planID, 1, charindex(',', @planID + ','), '')
    end
    select * from @tempTble
END
www.aspdotnet-suresh.com/2013/07/sql-server-split-function-example-in.html
CREATE FUNCTION dbo.Split(@String nvarchar(4000), @Delimiter char(1))
RETURNS @Results TABLE (Items nvarchar(4000))
AS
BEGIN
    DECLARE @INDEX INT
    DECLARE @SLICE nvarchar(4000)
    -- HAVE TO SET TO 1 SO IT DOESNT EQUAL ZERO FIRST TIME IN LOOP
    SELECT @INDEX = 1
    WHILE @INDEX != 0
    BEGIN
        -- GET THE INDEX OF THE FIRST OCCURENCE OF THE SPLIT CHARACTER
        SELECT @INDEX = CHARINDEX(@Delimiter, @STRING)
        -- NOW PUSH EVERYTHING TO THE LEFT OF IT INTO THE SLICE VARIABLE
        IF @INDEX != 0
            SELECT @SLICE = LEFT(@STRING, @INDEX - 1)
        ELSE
            SELECT @SLICE = @STRING
        -- PUT THE ITEM INTO THE RESULTS SET
        INSERT INTO @Results(Items) VALUES(@SLICE)
        -- CHOP THE ITEM REMOVED OFF THE MAIN STRING
        SELECT @STRING = RIGHT(@STRING, LEN(@STRING) - @INDEX)
        -- BREAK OUT IF WE ARE DONE
        IF LEN(@STRING) = 0 BREAK
    END
    RETURN
END
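Neither the loop in the question nor a split function guarantees the order in which rows come back: a SELECT without ORDER BY has no defined order in SQL Server. One way to preserve the passing order is to record it at insert time and sort on it. The following is a minimal sketch built on the loop from the question; the seq column is a hypothetical addition.

```sql
-- Sketch: keep the input order by numbering rows as they are inserted.
-- The seq column is an assumption added for illustration.
DECLARE @planID varchar(max) = '7,6,5,1,2,Jack,Ana,Micky';
DECLARE @tempTble TABLE (
    seq    int IDENTITY(1,1) PRIMARY KEY,  -- records insertion order
    planID varchar(50) NULL
);

WHILE LEN(@planID) > 0
BEGIN
    -- take everything before the first comma (or the whole rest of the string)
    INSERT INTO @tempTble (planID)
    VALUES (LEFT(@planID, CHARINDEX(',', @planID + ',') - 1));
    -- remove that item and its comma from the front of the string
    SET @planID = STUFF(@planID, 1, CHARINDEX(',', @planID + ','), '');
END

-- An explicit ORDER BY is what actually fixes the output order.
SELECT planID FROM @tempTble ORDER BY seq;
```

The same idea applies to a split function: have it return a position column alongside each item and order by that column in the calling query.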

SQL single row to multiple rows separated by comma

I would like to get a list of data from SQL Server including the data separated by comma in new rows.
Note: I have only a single column, named DATA.
I have a value like
DATA
new
old,yes,now
ok,for
no
My required output is:
DATA
new
old
yes
now
ok
for
no
What you need is a split function. See if this works for you:
CREATE FUNCTION FNC_SPLIT(@MYSTR VARCHAR(500), @DELIMITER CHAR(1))
RETURNS @MYTBL TABLE (idx smallint, value varchar(8000))
AS
BEGIN
    DECLARE @RET VARCHAR(500)
    DECLARE @INDEX INT
    DECLARE @COUNTER smallint
    --Get the first position of delimiter in the main string
    SET @INDEX = CHARINDEX(@DELIMITER, @MYSTR)
    SET @COUNTER = 0
    --Loop if delimiter exists in the main string
    WHILE @INDEX > 0
    BEGIN
        --extract the result substring before the delimiter found
        SET @RET = SUBSTRING(@MYSTR, 1, @INDEX-1)
        --set mainstring right part after the delimiter found
        SET @MYSTR = SUBSTRING(@MYSTR, @INDEX+1, LEN(@MYSTR) - @INDEX)
        --increase the counter
        SET @COUNTER = @COUNTER + 1
        --add the result substring to the table
        INSERT INTO @MYTBL (idx, value)
        VALUES (@COUNTER, @RET)
        --Get the next position of delimiter in the main string
        SET @INDEX = CHARINDEX(@DELIMITER, @MYSTR)
    END
    --if no delimiter is found then simply add the mainstring to the table
    IF @INDEX = 0
    BEGIN
        SET @COUNTER = @COUNTER + 1
        INSERT INTO @MYTBL (idx, value)
        VALUES (@COUNTER, @MYSTR)
    END
    RETURN
END
GO
declare @table table(dt varchar(100));
insert into @table values
('DATA'),
('new'),
('old,yes,now'),
('ok,for');
select * from @table
select value from @table t cross apply dbo.FNC_SPLIT(t.dt,',')
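On SQL Server 2016 or later, with the database at compatibility level 130 or higher, the built-in STRING_SPLIT function can stand in for a hand-written splitter. A minimal sketch, assuming the sample values from the question:

```sql
-- Sketch: same CROSS APPLY pattern, using the built-in STRING_SPLIT
-- (requires SQL Server 2016+ and compatibility level 130 or higher).
DECLARE @table TABLE (dt varchar(100));
INSERT INTO @table VALUES ('new'), ('old,yes,now'), ('ok,for'), ('no');

SELECT s.value
FROM @table AS t
CROSS APPLY STRING_SPLIT(t.dt, ',') AS s;
```

This form of STRING_SPLIT does not promise any particular order for the returned rows, so FNC_SPLIT with its idx column remains the better fit when the position of each item matters.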

Resources