Search Entire Database To find extended ascii codes in sql

Search Entire Database To find extended ascii codes in sql - sql-server

We have issues with extended ascii codes getting in our database (128-155)
Is there anyway to search the entire database and display the results of any of these characters that may be in there and where they are located within the tables and columns.
Hope that makes sense.
I have the script to search entire DB, but having trouble with opening line.
DECLARE #SearchStr nvarchar(100)
SET #SearchStr != between char(32) and char(127)
I have this originally that works, but I need to extend the range I'm looking for.
SET #SearchStr = '|' + char(9) + '|' + char(10) + '|' + char(13)
Thanks

It's very unclear what your data looks like, but this might help you to get started:
declare #TestData table (String nvarchar(100))
insert into #TestData select N'abc'
insert into #TestData select N'def'
insert into #TestData select char(128)
insert into #TestData select char(155)
declare #SearchPattern nvarchar(max) = N'%['
declare #i int = 128
while #i <= 155
begin
set #SearchPattern += char(#i)
set #i += 1
end
set #SearchPattern += N']%'
select #SearchPattern
select String
from #TestData
where String like #SearchPattern
Of course you'll need to add some code to loop over every table and column that you want to query (see this question), and it's possible that this code will behave differently on different collations.

... where dodgyColumn is your column with questionable data ....
WHERE(patindex('%[' + char(127) + '-' + char(255) + ']%', dodgyColumn COLLATE Latin1_General_BIN2) > 0)
This works for us, to identify extended ASCII characters in our otherwise normal ASCII data (characters, numbers, punctuation, dollar and percent signs, etc.)

Related

What is the encode(<columnName>, 'escape') PostgreSQL equivalent in SQL Server?

In the same vein as this question, what is the equivalent in SQL Server to the following Postgres statement?
select encode(some_field, 'escape') from only some_table

As you were told already, SQL-Server is not the best with such issues.
The most important advise to avoid such issues is: Use the appropriate data type to store your values. Storing binary data as a HEX-string is running against this best practice. But there are some workarounds:
I use the HEX-string taken from the linked question:
DECLARE #str VARCHAR(100)='0x61736461640061736461736400';
--here I use dynamically created SQL to get the HEX-string as a real binary:
DECLARE #convBin VARBINARY(MAX);
DECLARE #cmd NVARCHAR(MAX)=N'SELECT #bin=' + #str;
EXEC sp_executeSql #cmd
,N'#bin VARBINARY(MAX) OUTPUT'
,#bin=#convBin OUTPUT;
--This real binary can be converted to a VARCHAR(MAX).
--Be aware, that in this case the input contains 00 as this is an array.
--It is possible to split the input at the 00s, but this is going to far...
SELECT #convBin AS HexStringAsRealBinary
,CAST(#convBin AS VARCHAR(MAX)) AS CastedToString; --You will see the first "asda" only
--If your HEX-string is not longer than 10 bytes there is an undocumented function:
--You'll see, that the final AA is cut away, while a shorter string would be filled with zeros.
SELECT sys.fn_cdc_hexstrtobin('0x00112233445566778899AA')
SELECT CAST(sys.fn_cdc_hexstrtobin(#str) AS VARCHAR(100));
UPDATE: An inlinable approach
The following recursive CTE will read the HEX-string character by character.
Furthermore it will group the result and return two rows in this case.
This solution is very specific to the given input.
DECLARE #str VARCHAR(100)='0x61736461640061736461736400';
WITH recCTE AS
(
SELECT 1 AS position
,1 AS GroupingKey
,SUBSTRING(#str,3,2) AS HEXCode
,CHAR(SUBSTRING(sys.fn_cdc_hexstrtobin('0x' + SUBSTRING(#str,3,2)),1,1)) AS TheLetter
UNION ALL
SELECT r.position+1
,r.GroupingKey + CASE WHEN SUBSTRING(#str,2+(r.position)*2+1,2)='00' THEN 1 ELSE 0 END
,SUBSTRING(#str,2+(r.position)*2+1,2)
,CHAR(SUBSTRING(sys.fn_cdc_hexstrtobin('0x' + SUBSTRING(#str,2+(r.position)*2+1,2)),1,1)) AS TheLetter
FROM recCTE r
WHERE position<LEN(#str)/2
)
SELECT r.GroupingKey
,(
SELECT x.TheLetter AS [*]
FROM recCTE x
WHERE x.GroupingKey=r.GroupingKey
AND x.HEXCode<>'00'
AND LEN(x.HEXCode)>0
ORDER BY x.position
FOR XML PATH(''),TYPE
).value('.','varchar(max)')
FROM recCTE r
GROUP BY r.GroupingKey;
The result
1 asdad
2 asdasd
Hint: Starting with SQL Server 2017 there is STRING_AGG(), which would reduce the final SELECT...

If you need this functionality, it's going to be up to you to implement it. Assuming you just need the escape variant, you can try to implement it as a T-SQL UDF. But pulling strings apart, working character by character and building up a new string just isn't a T-SQL strength. You'd be looking at a WHILE loop to count over the length of the input byte length, SUBSTRING to extract the individual bytes, and CHAR to directly convert the bytes that don't need to be octal encoded.1
If you're going to start down this route (and especially if you want to support the other formats), I'd be looking at using the CLR support in SQL Server, to create the function in a .NET language (C# usually preferred) and use the richer string manipulation functionality there.
Both of the above assume that what you're really wanting is to replicate the escape format of encode. If you just want "take this binary data and give me a safe string to represent it", just use CONVERT to get the binary hex encoded.
1Here's my attempt at it. I'd suggest a lot of testing and tweaking before you use it in anger:
create function Postgresql_encode_escape (#input varbinary(max))
returns varchar(max)
as
begin
declare #i int
declare #len int
declare #out varchar(max)
declare #chr int
select #i = 1, #out = '',#len = DATALENGTH(#input)
while #i <= #len
begin
set #chr = SUBSTRING(#input,#i,1)
if #chr > 31 and #chr < 128
begin
set #out = #out + CHAR(#chr)
end
else
begin
set #out = #out + '\' +
RIGHT('000' + CONVERT(varchar(3),
(#chr / 64)*100 +
((#chr / 8)%8)*10 +
(#chr % 8))
,3)
end
set #i = #i + 1
end
return #out
end

MS SQL - CONTAINS full-text search w/ variable number of values, not using dynamic sql

I've created a full-text indexed column on a table.
I have a stored procedure to which I may pass the value of a variable "search this text". I want to search for "search", "this" and "text" within the full-text column. The number of words to search would be variable.
I could use something like
WHERE column LIKE '%search%' OR column LIST '%this%' OR column LIKE '%text%'
But that would require me to use dynamic SQL, which I'm trying to avoid.
How can I use my full-text search to find each of the words, presumably using CONTAINS, and without converting the whole stored procedure to dynamic SQL?

If you say you definitely have SQL Table Full Text Search Enabled, Then you can use query like below.
select * from table where contains(columnname,'"text1" or "text2" or "text3"' )
See link below for details
Full-Text Indexing Workbench

So I think I came up with a solution. I created the following scalar function:
CREATE FUNCTION [dbo].[fn_Util_CONTAINS_SearchString]
(
#searchString NVARCHAR(MAX),
#delimiter NVARCHAR(1) = ' ',
#ANDOR NVARCHAR(3) = 'AND'
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
IF #searchString IS NULL OR LTRIM(RTRIM(#searchString)) = '' RETURN NULL
-- trim leading/trailing spaces
SET #searchString = LTRIM(RTRIM(#searchString))
-- remove double spaces (prevents empty search terms)
WHILE CHARINDEX(' ', #searchString) > 0
BEGIN
SET #searchString = REPLACE(#searchString,' ',' ')
END
-- reformat
SET #searchString = REPLACE(#searchString,' ','" ' + #ANDOR + ' "') -- replace spaces with " AND " (quote) AND (quote)
SET #searchString = ' "' + #searchString + '" ' -- surround string with quotes
RETURN #searchString
END
I can get my results:
DECLARE #ftName NVARCHAR (1024) = dbo.fn_Util_CONTAINS_SearchString('value1 value2',default,default)
SELECT * FROM Table WHERE CONTAINS(name,#ftName)
I would appreciate any comments/suggestions.

For your consideration.
I understand your Senior wants to avoid dynamic SQL, but it is my firm belief that Dynamic SQL is NOT evil.
In the example below, you can see that with a few parameters (or even defaults), and a 3 lines of code, you can:
1) Dynamically search any source
2) Return desired or all elements
3) Rank the Hit rate
The SQL
Declare #SearchFor varchar(max) ='Daily,Production,default' -- any comma delim string
Declare #SearchFrom varchar(150) ='OD' -- table or even a join statment
Declare #SearchExpr varchar(150) ='[OD-Title]+[OD-Class]' -- Any field or even expression
Declare #ReturnCols varchar(150) ='[OD-Nr],[OD-Title]' -- Any field(s) even with alias
Set #SearchFor = 'Sign(CharIndex('''+Replace(Replace(Replace(#SearchFor,' , ',','),', ',''),',',''','+#SearchExpr+'))+Sign(CharIndex(''')+''','+#SearchExpr+'))'
Declare #SQL varchar(Max) = 'Select * from (Select Distinct'+#ReturnCols+',Hits='+#SearchFor+' From '+#SearchFrom + ') A Where Hits>0 Order by Hits Desc'
Exec(#SQL)
Returns
OD-Nr OD-Title Hits
3 Daily Production Summary 2
6 Default Settings 1
I should add that my search string is comma delimited, but you can change to space.
Another note CharIndex can be substanitally faster that LIKE. Take a peek at
http://cc.davelozinski.com/sql/like-vs-substring-vs-leftright-vs-charindex

Generate column name dynamically in sql server

Please look at the below query..
select name as [Employee Name] from table name.
I want to generate [Employee Name] dynamically based on other column value.
Here is the sample table
s_dt dt01 dt02 dt03
2015-10-26
I want dt01 value to display as column name 26 and dt02 column value will be 26+1=27

I'm not sure if I understood you correctly. If I'am going into the wrong direction, please add comments to your question to make it more precise.
If you really want to create columns per sql you could try a variation of this script:
DECLARE #name NVARCHAR(MAX) = 'somename'
DECLARE #sql NVARCHAR(MAX) = 'ALTER TABLE aps.tbl_Fabrikkalender ADD '+#name+' nvarchar(10) NULL'
EXEC sys.sp_executesql #sql;
To retrieve the column name from another query insert the following between the above declares and fill the placeholders as needed:
SELECT #name = <some colum> FROM <some table> WHERE <some condition>

You would need to dynamically build the SQL as a string then execute it. Something like this...
DECLARE #s_dt INT
DECLARE #query NVARCHAR(MAX)
SET #s_dt = (SELECT DATEPART(dd, s_dt) FROM TableName WHERE 1 = 1)
SET #query = 'SELECT s_dt'
+ ', NULL as dt' + RIGHT('0' + CAST(#s_dt as VARCHAR), 2)
+ ', NULL as dt' + RIGHT('0' + CAST((#s_dt + 1) as VARCHAR), 2)
+ ', NULL as dt' + RIGHT('0' + CAST((#s_dt + 2) as VARCHAR), 2)
+ ', NULL as dt' + RIGHT('0' + CAST((#s_dt + 3) as VARCHAR), 2)
+ ' FROM TableName WHERE 1 = 1)
EXECUTE(#query)
You will need to replace WHERE 1 = 1 in two places above to select your data, also change TableName to the name of your table and it currently puts NULL as the dynamic column data, you probably want something else there.
To explain what it is doing:
SET #s_dt is selecting the date value from your table and returning only the day part as an INT.
SET #query is dynamically building your SELECT statement based on the day part (#s_dt).
Each line is taking #s_dt, adding 0, 1, 2, 3 etc, casting as VARCHAR, adding '0' to the left (so that it is at least 2 chars in length) then taking the right two chars (the '0' and RIGHT operation just ensure anything under 10 have a leading '0').

It is possible to do this using dynamic SQL, however I would also consider looking at the pivot operators to see if they can achieve what you are after a lot more efficiently.
https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx

T-SQL - Merge all columns from source to target table w/o listing all the columns

I'm trying to merge a very wide table from a source (linked Oracle server) to a target table (SQL Server 2012) w/o listing all the columns. Both tables are identical except for the records in them.
This is what I have been using:
TRUNCATE TABLE TargetTable
INSERT INTO TargetTable
SELECT *
FROM SourceTable
When/if I get this working I would like to make it a procedure so that I can pass into it the source, target and match key(s) needed to make the update. For now I would just love to get it to work at all.
USE ThisDatabase
GO
DECLARE
#Columns VARCHAR(4000) = (
SELECT COLUMN_NAME + ','
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'TargetTable'
FOR XML PATH('')
)
MERGE TargetTable AS T
USING (SELECT * FROM SourceTable) AS S
ON (T.ID = S.ID AND T.ROWVERSION = S.ROWVERSION)
WHEN MATCHED THEN
UPDATE SET #Columns = S.#Columns
WHEN NOT MATCHED THEN
INSERT (#Columns)
VALUES (S.#Columns)
Please excuse my noob-ness. I feel like I'm only half way there, but I don't understand some parts of SQL well enough to put it all together. Many thanks.

As previously mentioned in the answers, if you don't want to specify the columns , then you have to write a dynamic query.
Something like this in your case should help:
DECLARE
#Columns VARCHAR(4000) = (
SELECT COLUMN_NAME + ','
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'TargetTable'
FOR XML PATH('')
)
DECLARE #MergeQuery NVARCHAR(MAX)
DECLARE #UpdateQuery VARCHAR(MAX)
DECLARE #InsertQuery VARCHAR(MAX)
DECLARE #InsertQueryValues VARCHAR(MAX)
DECLARE #Col VARCHAR(200)
SET #UpdateQuery='Update Set '
SET #InsertQuery='Insert ('
SET #InsertQueryValues=' Values('
WHILE LEN(#Columns) > 0
BEGIN
SET #Col=left(#Columns, charindex(',', #Columns+',')-1);
IF #Col<> 'ID' AND #Col <> 'ROWVERSION'
BEGIN
SET #UpdateQuery= #UpdateQuery+ 'TargetTable.'+ #Col + ' = SourceTable.'+ #Col+ ','
SET #InsertQuery= #InsertQuery+#Col + ','
SET #InsertQueryValues=#InsertQueryValues+'SourceTable.'+ #Col+ ','
END
SET #Columns = stuff(#Columns, 1, charindex(',', #Columns+','), '')
END
SET #UpdateQuery=LEFT(#UpdateQuery, LEN(#UpdateQuery) - 1)
SET #InsertQuery=LEFT(#InsertQuery, LEN(#InsertQuery) - 1)
SET #InsertQueryValues=LEFT(#InsertQueryValues, LEN(#InsertQueryValues) - 1)
SET #InsertQuery=#InsertQuery+ ')'+ #InsertQueryValues +')'
SET #MergeQuery=
N'MERGE TargetTable
USING SourceTable
ON TargetTable.ID = SourceTable.ID AND TargetTable.ROWVERSION = SourceTable.ROWVERSION ' +
'WHEN MATCHED THEN ' + #UpdateQuery +
' WHEN NOT MATCHED THEN '+#InsertQuery +';'
Execute sp_executesql #MergeQuery
If you want more information about Merge, you could read the this excellent article

Don't feel bad. It takes time. Merge has interesting syntax. I've actually never used it. I read Microsoft's documentation on it, which is very helpful and even has examples. I think I covered everything. I think there may be a slight amount of tweaking you might have to do, but I think it should work.
Here's the documentation for MERGE:
https://msdn.microsoft.com/en-us/library/bb510625.aspx
As for your code, I commented pretty much everything to explain it and show you how to do it.
This part is to help write your merge statement
USE ThisDatabase --This says what datbase context to use.
--Pretty much what database your querying.
--Like this: database.schema.objectName
GO
DECLARE
#SetColumns VARCHAR(4000) = (
SELECT CONCAT(QUOTENAME(COLUMN_NAME),' = S.',QUOTENAME(COLUMN_NAME),',',CHAR(10)) --Concat just says concatenate these values. It's adds the strings together.
--QUOTENAME adds brackets around the column names
--CHAR(10) is a line break for formatting purposes(totally optional)
FROM INFORMATION_SCHEMA.COLUMNS
--WHERE TABLE_NAME = 'TargetTable'
FOR XML PATH('')
) --This uses some fancy XML trick to get your Columns concatenated into one row.
--What really is in your table is a column of your column names in different rows.
--BTW If the columns names in both tables are identical, then this will work.
DECLARE #Columns VARCHAR(4000) = (
SELECT QUOTENAME(COLUMN_NAME) + ','
FROM INFORMATION_SCHEMA.COLUMNS
--WHERE TABLE_NAME = 'TargetTable'
FOR XML PATH('')
)
SET #Columns = SUBSTRING(#Columns,0,LEN(#Columns)) -- this gets rid off the comma at the end of your list
SET #SetColumns = SUBSTRING(#SetColumns,0,LEN(#SetColumns)) --same thing here
SELECT #SetColumns --Your going to want to copy and paste this into your WHEN MATCHED statement
SELECT #Columns --Your going to want to copy this into your WHEN NOT MATCHED statement
GO
Merge Statement
Especially look at my notes on ROWVERSION.
MERGE INTO TargetTable AS T
USING SourceTable AS S --Don't really need to write SELECT * FROM since you need the whole table anyway
ON (T.ID = S.ID AND T.[ROWVERSION] = S.[ROWVERSION]) --These are your matching parameters
--One note on this, if ROWVERSION is different versions of the same data you don't want to have RowVersion here
--Like lets say you have ID 1 ROWVERSION 2 in your source but only version 1 in your targetTable
--If you leave T.ID =S.ID AND T.ROWVERSION = S.ROWVERSION, then it will insert the new ROWVERSION
--So you'll have two versions of ID 1
WHEN MATCHED THEN --When TargetTable ID and ROWVERSION match in the matching parameters
--Update the values in the TargetTable
UPDATE SET /*Copy and Paste #SetColumnss here*/
--Should look like this(minus the "--"):
--Col1 = S.Col1,
--Col2 = S.Col2,
--Col3 = S.Col3,
--Etc...
WHEN NOT MATCHED THEN --This says okay there are no rows with the existing ID, now insert a new row
INSERT (col1,col2,col3) --Copy and paste #Columns in between the parentheses. Should look like I show it. Note: This is insert into target table so your listing the target table columns
VALUES (col1,col2,col3) --Same thing here. This is the list of source table columns

T-Sql function to convert a varchar - in this instance someone's name - from upper to title case?

Does anyone have in their back pocket a function that can achieve this?

Found this here :-
create function ProperCase(#Text as varchar(8000))
returns varchar(8000)
as
begin
declare #Reset bit;
declare #Ret varchar(8000);
declare #i int;
declare #c char(1);
select #Reset = 1, #i=1, #Ret = '';
while (#i <= len(#Text))
select #c= substring(#Text,#i,1),
#Ret = #Ret + case when #Reset=1 then UPPER(#c) else LOWER(#c) end,
#Reset = case when #c like '[a-zA-Z]' then 0 else 1 end,
#i = #i +1
return #Ret
end
Results from this:-
select dbo.propercase('ALL UPPERCASE'); -- All Uppercase
select dbo.propercase('MiXeD CaSe'); -- Mixed Case
select dbo.propercase('lower case'); -- Lower Case
select dbo.propercase('names with apostrophe - mr o''reilly '); -- Names With Apostrophe - Mr O'Reilly
select dbo.propercase('names with hyphen - mary two-barrels '); -- Names With Hyphen - Mary Two-Barrels

I'd do this outside of TSQL, in the calling code tbh.
e.g. if you're using .NET, it's just a case of using TextInfo.ToTitleCase.
That way, you leave your formatting code outside of TSQL (standard "let the caller decide how to use/format the data" approach).

This kind of function is better done on the application side, as it will perform relatively poorly in SQL.
With SQL-Server 2005 and above you could write a CLR function that does that and call it from your SQL. Here is an article on how to do this.

If you really want to do this in T-SQL and without a loop, see Tony Rogerson's article "Turning stuff into "Camel Case" without loops"
I haven't tried it... that's what client code it for :-)

No cursors, no while loops, no (inline) sub-queries
-- ===== IF YOU DON'T HAVE A NUMBERS TABLE =================
--CREATE TABLE Numbers (
-- Num INT NOT NULL PRIMARY KEY CLUSTERED WITH(FILLFACTOR = 100)
--)
--INSERT INTO Numbers
--SELECT TOP(11000)
-- ROW_NUMBER() OVER (ORDER BY (SELECT 1))
--FROM master.sys.all_columns a
-- CROSS JOIN master.sys.all_columns b
DECLARE #text VARCHAR(8000) = 'my text to make title-case';
DECLARE #result VARCHAR(8000);
SET #result = UPPER(LEFT(#text, 1));
SELECT
#result +=
CASE
WHEN SUBSTRING(#text, Num - 1, 1) IN (' ', '-') THEN UPPER(SUBSTRING(#text, Num, 1))
ELSE SUBSTRING(#text, Num, 1)
END
FROM Numbers
WHERE Num > 1 AND Num <= LEN(#text);
PRINT #result;

Will any given row only contain a firstname or a lastname that you wish to convert or will it contain full names separated by spaces? Also, are there any other rules you wish to what characters it should "upper" or lower"?
If you can guarantee that it's only first and last names and you aren't dealing with any specialized capitalization such as after an apostrophe, might this do what you're looking for?
SELECT -- Initial of First Name
UPPER(LEFT(FullName, 1))
-- Rest of First Name
+ SUBSTRING(LOWER(FullName), 2, CHARINDEX(' ', FullName, 0) - 2)
-- Space between names
+ ' '
-- Inital of last name
+ UPPER(SUBSTRING(FullName, CHARINDEX(' ', FullName, 0) + 1, 1))
-- Rest of last name
+ SUBSTRING(LOWER(FullName), CHARINDEX(' ', FullName, 0) + 2, LEN(FullName) - CHARINDEX(' ', FullName, 0) + 2)
FROM Employee