I tried the following query (test data) and found SPACES trimmed automatically.
DECLARE @Employees TABLE(EmpID INT IDENTITY(1,1), FirstName VARCHAR(10), LastName VARCHAR(10))
INSERT INTO @Employees VALUES
('Mani',' '),
('Muthu','Kumar'),
('Ram','Prasath'),
('Elango',''),
('Prabhu',' ')
DECLARE @Name VARCHAR(10) = '  ' -- 2 Spaces
-- SELECT LEN(@Name) -- 0
-- Returned rows with empty LastName
SELECT * FROM @Employees WHERE LastName = @Name
-- Update - Multiple spaces
UPDATE @Employees SET LastName = NULLIF(LastName, '  ') -- All empty last name updated
SELECT * FROM @Employees
In the above example, I searched for rows whose LastName is 2 spaces, but it returned all rows with an empty LastName. I checked the length of the parameter value with LEN, and it shows 0 even though the value actually contains spaces.
I tried to update only the rows whose LastName contains more than one space (2 spaces in the example), but it updated all the records with an empty LastName.
How is the automatic trim happening?
I am using SQL Server 2012.
1) LEN is documented as:
Returns the number of characters of the specified string expression, excluding trailing blanks.
(my emphasis)
and 2) How SQL Server compares strings with trailing spaces is documented:
The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them. The padding directly affects the semantics of WHERE and HAVING clause predicates and other Transact-SQL string comparisons. For example, Transact-SQL considers the strings 'abc' and 'abc ' to be equivalent for most comparison operations.
...
(And so, in fact, technically what happens is not trimming but padding. Bonus points if you can actually work out a way to demonstrate this difference)
There are two ways to work around this behaviour. One is to use DATALENGTH, which counts bytes and includes trailing spaces. The other is to append a trailing non-space character as a sentinel before measuring or comparing.
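As a rough illustration of both workarounds, the sketch below reuses the two-space @Name value from the question. The column aliases and the '|' sentinel are choices made for this example only.

```sql
DECLARE @Name VARCHAR(10) = '  ';  -- two spaces

-- Measuring: LEN ignores trailing blanks, DATALENGTH and the sentinel do not.
SELECT LEN(@Name)           AS len_chars,     -- 0
       DATALENGTH(@Name)    AS len_bytes,     -- 2 (one byte per VARCHAR character here)
       LEN(@Name + '|') - 1 AS len_sentinel;  -- 2

-- Comparing: the shorter operand is padded, so '  ' = '' is true.
-- With a sentinel appended, the spaces are no longer trailing.
SELECT CASE WHEN @Name = ''             THEN 'equal' ELSE 'different' END AS plain_compare,     -- equal
       CASE WHEN @Name + '|' = '' + '|' THEN 'equal' ELSE 'different' END AS sentinel_compare;  -- different
```

Note that DATALENGTH returns bytes, so for NVARCHAR values the result is twice the number of characters.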
Related
I've tried to use the CONCAT function on some fields in a table, in order to get a string that I need to compare with a field from a different table.
However, when I use the function it seems to randomly add spaces between the fields, and then I cannot use the result for the comparison.
I've tried:
SELECT CONCAT([STC_GL-STC].[ZZGL_Desc_Group_5D],'-',
[STC_GL-STC].[ZZCostCentreGroup],'-')
AS RESULT
FROM [STC_GL-STC];
As an example of result:
'Compras - RM -MATERIA PRIMA -'
(Please note the blank spaces before the second and third hyphens.)
I would need to obtain:
'Compras - RM-MATERIA PRIMA-'
I've checked the values in the fields and there are no blank spaces at the end of the fields ZZGL_Desc_Group_5D and ZZCostCentreGroup.
I've also tried:
SELECT CONCAT_WS('-',[ZZGL_Desc_Group_5D],[ZZCostCentreGroup]) AS RESULT
FROM [STC_GL-STC]
With same result.
And finally I tried to remove blank spaces using RTRIM and LTRIM using the following:
SELECT CONCAT(LTRIM(RTRIM([STC_GL-STC].[ZZGL_Desc_Group_5D])),
LTRIM(RTRIM('-')),
LTRIM(RTRIM([STC_GL-STC].[ZZCostCentreGroup]))) AS RESULT
FROM [STC_GL-STC]
ORDER BY RESULT ASC;
Even with the LTRIM and RTRIM functions applied to those fields, I still get the same result.
How can I get rid of this behaviour and of the blank spaces? Is there another way to build that string?
Kind Regards and many thanks in advance,
A long time ago I created a UDF to remove whitespace.
It is based on the 'magic' of the XML xs:token data type.
udf
/*
1. All invisible TAB, Carriage Return, and Line Feed characters will be replaced with spaces.
2. Then leading and trailing spaces are removed from the value.
3. Further, contiguous occurrences of more than one space will be replaced with a single space.
*/
CREATE FUNCTION dbo.udf_tokenize(@input VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
RETURN (SELECT CAST('<r><![CDATA[' + @input + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','VARCHAR(MAX)'));
END
Test harness
-- DDL and sample data population, start
DECLARE @mockTbl TABLE (ID INT IDENTITY(1,1), col_1 VARCHAR(100), col_2 VARCHAR(100));
INSERT INTO @mockTbl (col_1, col_2)
VALUES (' FL ', ' Miami')
, (' FL ', ' Fort Lauderdale ')
, (' NY ', ' New York ')
, (' NY ', '')
, (' NY ', NULL);
-- DDL and sample data population, end
SELECT *
, col_1n = dbo.udf_tokenize(col_1)
, col_2n = dbo.udf_tokenize(col_2)
, CONCAT_WS('-', dbo.udf_tokenize(col_1), dbo.udf_tokenize(col_2)) AS RESULT
FROM @mockTbl;
I want to replace all occurrences of a particular single-character string (e.g. '^' or ',') when creating a view that is based on a single table, but it does not replace the character in all of the data rows. I can see this when I query the newly created view. All fields have the varchar data type.
This is a specific example where the desired character does not get replaced: MAINTENANCE¿ENHANCED
I tried the following and none worked.
SELECT REPLACE('MAINTENANCE¿ENHANCED',',','')
SELECT REPLACE('MAINTENANCE¿ENHANCED',char(33),'')
SELECT REPLACE(N'MAINTENANCE¿ENHANCED',',','')
SELECT REPLACE('MAINTENANCE¿ENHANCED',N',','')
SELECT REPLACE(CAST('MAINTENANCE¿ENHANCED' as NVARCHAR(50)),N',','')
SELECT REPLACE(CAST('MAINTENANCE¿ENHANCED' as VARCHAR(50)),N',','')
SELECT REPLACE(TRY_CAST('MAINTENANCE¿ENHANCED' as VARCHAR(50)),N',','')
SELECT REPLACE(CONVERT(VARCHAR(50),'MAINTENANCE¿ENHANCED'), N',','')
I also ran a simple test: I copied the comma from the string in which it needs to be replaced (see my code below).
if ',' = '‚' print 1 -- DOES NOT return TRUE. 1st comma is the one I typed in the second argument of the REPLACE function, the 2nd comma is the one copied from the string above.
if ',' = ',' print 1 -- RETURNs TRUE. Both of the commas that I typed in the second argument of the REPLACE function.
Apparently the issue is that the comma in my data source is not treated as equal to a regular comma, even though the queries below show that both are varchar. ( https://blog.sqlauthority.com/2013/12/15/sql-server-how-to-identify-datatypes-and-properties-of-variable )
-- comma from the string
DECLARE @myVar VARCHAR(100)
SET @myVar = '‚'
SELECT SQL_VARIANT_PROPERTY(@myVar,'BaseType') BaseType,
SQL_VARIANT_PROPERTY(@myVar,'Precision') Precisions,
SQL_VARIANT_PROPERTY(@myVar,'Scale') Scale,
SQL_VARIANT_PROPERTY(@myVar,'TotalBytes') TotalBytes,
SQL_VARIANT_PROPERTY(@myVar,'Collation') Collation,
SQL_VARIANT_PROPERTY(@myVar,'MaxLength') MaxLengths
-- regular comma
SET @myVar = ','
SELECT SQL_VARIANT_PROPERTY(@myVar,'BaseType') BaseType,
SQL_VARIANT_PROPERTY(@myVar,'Precision') Precisions,
SQL_VARIANT_PROPERTY(@myVar,'Scale') Scale,
SQL_VARIANT_PROPERTY(@myVar,'TotalBytes') TotalBytes,
SQL_VARIANT_PROPERTY(@myVar,'Collation') Collation,
SQL_VARIANT_PROPERTY(@myVar,'MaxLength') MaxLengths
Partially this can be resolved using this code below
select Stuff('MAINTENANCE¿ENHANCED', PatIndex('%[^a-z0-9]%', 'MAINTENANCE¿ENHANCED'), 1, '')
OUTPUT
-- the comma is replaced. That is what is expected.
MAINTENANCEÿENHANCED
But it does not work if I have more than one comma, regardless of whether I copy it from the data source or type it in myself.
('‚MAINTENANCE¿ENHANCED')
select Stuff('‚MAINTENANCE¿ENHANCED', PatIndex('%[^a-z0-9]%', '‚MAINTENANCE¿ENHANCED'), 1, '')
select REPLACE(Stuff('‚MAINTENANCE¿ENHANCED', PatIndex('%[^a-z0-9]%', '‚MAINTENANCE¿ENHANCED'), 1, ''),',','')
OUTPUT
-- the comma is back again. This is the issue: only the first comma is replaced.
AINTENANCE¿ENHANCED
P.S.
Please refer to my answer below, where I resolved all the issues described above, except that I still need to figure out how to keep special characters such as question marks, parentheses, etc. from being removed.
PARTIALLY this can be resolved using this code below that I got from here
use MyDB;
go
drop function if exists [dbo].[RemoveNonAlphaCharacters]
go
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^ a-z0-9]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
SELECT MyDB.dbo.RemoveNonAlphaCharacters(', (/!:\£&^?-:;|\)?%$"é觰àçò*MAIN,2TENANCE¿ENHANCED 123 asds %[ ..')
I got this from
How to strip all non-alphabetic characters from string in SQL Server?
OUTPUT
éèàçòMAIN2TENANCEÃÂENHANCED 123 asds
The issue here is that it removes all non-alphanumeric characters, such as &^?-:;|)? ]% ;:_|!"
I could not figure out how to pass a regular expression that preserves all characters in the printable section of the ASCII table (except for the comma, which needs to be replaced); see the example above and the links below.
https://www.rexegg.com/regex-quickstart.html
http://www.asciitable.com/
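One possible direction, offered as a sketch rather than a tested fix for the data above: LIKE and PATINDEX character ranges follow the collation, so forcing a binary collation makes the range from space to tilde cover the printable ASCII characters (codes 32 to 126). The sample string is hypothetical, CHAR(130) stands in for the look-alike comma, and a single-byte code page such as 1252 is assumed.

```sql
-- Hypothetical sample: an ordinary comma plus one non-ASCII look-alike character.
DECLARE @Temp VARCHAR(1000) = 'MAIN,2TENANCE' + CHAR(130) + 'ENHANCED (ok?) 100%';

-- Strip every character outside the printable ASCII range ' ' .. '~'.
-- The binary collation makes the range follow character codes, not dictionary order.
WHILE PATINDEX('%[^ -~]%', @Temp COLLATE Latin1_General_BIN) > 0
    SET @Temp = STUFF(@Temp, PATINDEX('%[^ -~]%', @Temp COLLATE Latin1_General_BIN), 1, '');

-- The ordinary comma is printable ASCII, so it is removed explicitly.
SET @Temp = REPLACE(@Temp, ',', '');

SELECT @Temp AS cleaned;
```

Under those assumptions, punctuation such as parentheses, question marks, and percent signs should survive, while accented letters are removed along with the look-alike comma because they also fall outside the ASCII range.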
I have a column with the name of a person in the following format: "LAST NAME, FIRST NAME"
Only Upper Cases Allowed
Space after comma optional
I would like to use a regular expression like: [A-Z]+,[ ]?[A-Z]+ but I do not know how to do this in T-SQL. In Oracle, I would use REGEXP_LIKE, is there something similar for SQL Server 2016?
I need something like the following:
UPDATE table
SET is_correct_format = 'YES'
WHERE REGEXP_LIKE(table.name,'[A-Z]+,[ ]?[A-Z]+');
First, case sensitivity depends on the collation of the DB, though with LIKE you can specify case comparisons. With that... here is some Boolean logic to take care of the cases you stated. Though, you may need to add additional clauses if you discover some bogus input.
declare @table table (Person varchar(64), is_correct_format varchar(3) default 'NO')
insert into @table (Person)
values
('LowerCase, Here'),
('CORRECTLY, FORMATTED'),
('CORRECTLY,FORMATTEDTWO'),
('ONLY FIRST UPPER, LowerLast'),
('WEGOT, FormaNUMB3RStted'),
('NoComma Formatted'),
('CORRECTLY, TWOCOMMA, A'),
(',COMMA FIRST'),
('COMMA LAST,'),
('SPACE BEFORE COMMA , GOOD'),
(' SPACE AT BEGINNING, GOOD')
update @table
set is_correct_format = 'YES'
where
Person not like '%[^A-Z, ]%' --check for non-letter characters, excluding comma and spaces
and len(replace(Person,' ','')) = len(replace(replace(Person,' ',''),',','')) + 1 --make sure there is only one comma
and charindex(',',Person) <> 1 --make sure the comma isn't at the beginning
and charindex(',',Person) <> len(Person) --make sure the comma isn't at the end
and substring(Person,charindex(',',Person) - 1,1) <> ' ' --make sure there isn't a space before comma
and left(Person,1) <> ' ' --check preceding spaces
and UPPER(Person) = Person collate Latin1_General_CS_AS --check collation for CI default (only upper cases)
select * from @table
The tsql equivalent could look like this. I'm not vouching for the efficiency of this solution.
declare @table as table(name varchar(20), is_Correct_format varchar(5))
insert into @table(name) Values
('Smith, Jon')
,('se7en, six')
,('Billy bob')
UPDATE @table
SET is_correct_format = 'YES'
WHERE
replace(name, ', ', ',x')
like (replicate('[a-z]', charindex(',', name) - 1)
+ ','
+ replicate('[a-z]', len(name) - charindex(',', name)) )
select * from @table
The optional space is hard to solve, so since it's next to a legal character I'm just replacing with another legal character when it's there.
TSQL does not provide the kind of 'repeating pattern' of * or + in regex, so you have to count the characters and construct the pattern that many times in your search pattern.
I split the string at the comma, counted the alphas before and after, and built a search pattern to match.
Clunky, but doable.
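A different way to sketch the same check, without counting characters, is to normalise the optional space away and then apply three LIKE tests under a binary collation, so that [A-Z] means uppercase ASCII letters only. The table variable, sample rows, and the norm alias are made up for this example, and Latin1_General_BIN is an assumed choice of collation.

```sql
DECLARE @people TABLE (name VARCHAR(64));
INSERT INTO @people (name)
VALUES ('SMITH, JON'), ('SMITH,JON'), ('Smith, Jon'),
       ('SMITH JON'), (',JON'), ('SMITH,'), ('SMITH, JON, JR');

SELECT p.name,
       CASE WHEN n.norm LIKE '[A-Z]%,[A-Z]%'   -- starts with a letter; a comma is directly followed by a letter
             AND n.norm NOT LIKE '%[^A-Z,]%'   -- nothing other than uppercase letters and commas
             AND n.norm NOT LIKE '%,%,%'       -- no second comma
            THEN 'YES' ELSE 'NO' END AS is_correct_format
FROM @people AS p
-- Turn the optional ", " into "," and switch to a binary collation for the LIKE tests.
CROSS APPLY (SELECT REPLACE(p.name, ', ', ',') COLLATE Latin1_General_BIN AS norm) AS n;
-- Only 'SMITH, JON' and 'SMITH,JON' come back as YES.
```

The trade-off is that any character outside A-Z, including accented uppercase letters, is rejected; whether that is acceptable depends on the data.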
Issue
I want to write a query that will select all from a table where my string value is equal to two columns concatenated together.
This is plain English version:
@MYSTRING varchar(50)
SELECT ALL FROM [FFLOCNP] WHERE COLUMN1 + COLUMN2 = @MYSTRING
I have tried to use COALESCE, but I have never used it before and it returns an error:
@CODE varchar(50)
SELECT * FROM [dbo].[FFLOCNP] WHERE COALESCE([LOCTRY], '') || COALESCE([LOCLCN], '') = @CODE
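The error most likely comes from ||, which is not the string concatenation operator in the SQL Server versions discussed here; use + instead. ISNULL (or COALESCE) then protects the concatenation against NULL values.
The query below may help:
SELECT * FROM [FFLOCNP] WHERE ISNULL(COLUMN1,'') + ISNULL(COLUMN2,'') = @MYSTRING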
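On SQL Server 2012 and later, CONCAT is another option worth considering, since it treats NULL arguments as empty strings. The sketch below uses an inline VALUES list as a stand-in for the real table, reusing the LOCTRY and LOCLCN column names from the question; the sample values are invented.

```sql
DECLARE @CODE VARCHAR(50) = 'AB12';

SELECT t.LOCTRY, t.LOCLCN
FROM (VALUES ('AB', '12'),    -- 'AB' + '12' matches
             (NULL, 'AB12'),  -- NULL is treated as '', so this matches too
             ('AB', NULL)     -- yields 'AB', no match
     ) AS t(LOCTRY, LOCLCN)
WHERE CONCAT(t.LOCTRY, t.LOCLCN) = @CODE;
-- Returns the first two rows.
```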
Be careful when using ISNULL instead of COALESCE. ISNULL limits the returned value to the data type of its first argument. In the example below, column V1 is implicitly typed as nvarchar(1), because its longest value is only one character long. ISNULL(V1, [param2]) will therefore always return a string of at most one character, regardless of the length of the second argument. In your case ISNULL works, because you only want to replace NULL with an empty string. If you want to replace NULL with a longer string, use COALESCE instead: it returns the full replacement string, because its result type is not tied to the first argument. Apart from this, COALESCE is standard SQL, whereas ISNULL is specific to SQL Server, so COALESCE gives more portable code.
WITH CTE_SRC AS
(
SELECT
[V1]
,[V2]
FROM
(VALUES
(N'A', N'BB')
,(NULL, N'BB')
,(N'A', NULL)
) T([V1],[V2])
)
SELECT
ISNULL([V1], '1234') AS [ISNULL]
,COALESCE([V1], '123') AS [COALESCE]
FROM
CTE_SRC
Result
ISNULL COALESCE
------ --------
A      A
1      123
A      A
This is in SQL Server 2005. I have a varchar column and some rows contain a trailing space, e.g. 'abc ', 'def '.
I tried removing the trailing space with this command:
update thetable
set thecolumn = rtrim(thecolumn)
But the trailing space remains. I tried to find them using:
select *
from thetable
where thecolumn <> rtrim(thecolumn)
But it returned nothing.
Are there settings I am not aware of that influence the trailing-space check?
EDIT:
I know that there is a trailing space because, when I copy and paste the value from the SSMS grid into the editor, it has a trailing space.
Check if the spaces that are not removed have the ASCII code 32.
Try this to replace "hard space" with "soft space":
update thetable set thecolumn = rtrim(replace(thecolumn, char(160), char(32)))
Are you certain that it is a space (ascii 32) character? You can get odd behavior with other "non-visible" characters. Try running
select ascII(right(theColumn, 1))
from theTable
and see what you get.
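To make the diagnosis concrete, here is a small sketch with invented sample rows (the table variable stands in for thetable). It shows how a real trailing space differs from other invisible trailing characters when inspected with ASCII, LEN, and DATALENGTH.

```sql
DECLARE @thetable TABLE (thecolumn VARCHAR(20));
INSERT INTO @thetable (thecolumn) VALUES ('abc ');                       -- real space
INSERT INTO @thetable (thecolumn) VALUES ('def' + CHAR(9));              -- tab
INSERT INTO @thetable (thecolumn) VALUES ('ghi' + CHAR(13) + CHAR(10));  -- CR + LF

SELECT thecolumn,
       ASCII(RIGHT(thecolumn, 1)) AS last_char_code,  -- 32, 9, 10
       LEN(thecolumn)             AS len_chars,       -- 3, 4, 5 (only trailing spaces are ignored)
       DATALENGTH(thecolumn)      AS bytes_stored     -- 4, 4, 5
FROM @thetable;
```

Any last_char_code other than 32 means RTRIM will leave that character in place, which would explain a "space" that survives the update.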
Use this Function:
Create Function [dbo].[FullTrim] (@strText varchar(max))
Returns varchar(max) as
Begin
-- Removes leading and trailing spaces and collapses runs of spaces into one
Declare @ch1 char,@ch2 char
Declare @i int,@LenStr int
Declare @Result varchar(max)
Set @i=1
Set @LenStr=len(@strText)
Set @Result=''
While @i<=@LenStr
Begin
Set @ch1=SUBSTRING(@strText,@i,1)
Set @ch2=SUBSTRING(@strText,@i+1,1)
-- Skip a space that is followed by another space, or that would start the result
if ((@ch1=' ' and @ch2=' ') or (len(@Result)=0 and @ch1=' '))
Set @i+=1
Else
Begin
Set @Result+=@ch1
Set @i+=1
End
End
Return @Result
End
In SQL, CHAR(n) columns are right-padded with spaces to their length.
Also string comparison operators (and most functions too) do not take the trailing spaces into account.
DECLARE @t TABLE (c CHAR(10), vc VARCHAR(10))
INSERT
INTO @t
VALUES ('a ', 'a ')
SELECT LEN(c), LEN(vc), c + vc
FROM @t
--
1 1 "a         a "
Please run this query:
SELECT *
FROM thetable
WHERE thecolumn + '|' <> RTRIM(thecolumn) + '|'
and see if it finds something.
It sounds like either:
1) Whatever you are using to view the values is inserting the trailing space (or the appearance thereof; try a fixed-width font like Consolas).
2) The column is CHAR, not VARCHAR. In that case, the column will be padded with spaces up to the length of the column, e.g. inserting 'abc' into char(4) will always result in 'abc '
3) You are somehow not committing the updates, not updating the right column, or other form of user error. The update statement itself looks correct on the face of it.
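The second possibility is easy to check with a small sketch. The table variable and column names here are illustrative only; the columns are declared NOT NULL so that the CHAR column is padded regardless of the ANSI_PADDING setting.

```sql
DECLARE @t TABLE (c CHAR(4) NOT NULL, vc VARCHAR(4) NOT NULL);
INSERT INTO @t (c, vc) VALUES ('abc', 'abc');

-- RTRIM runs, but the CHAR column is padded back to its declared length on storage.
UPDATE @t SET c = RTRIM(c), vc = RTRIM(vc);

SELECT DATALENGTH(c)  AS c_bytes,   -- 4
       DATALENGTH(vc) AS vc_bytes,  -- 3
       '[' + c + ']'  AS c_shown,   -- [abc ]
       '[' + vc + ']' AS vc_shown   -- [abc]
FROM @t;
```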
I had the same issue with the RTRIM() and LTRIM() functions.
In my case the problem was caused by LF and CR characters.
Solution
DECLARE @test NVARCHAR(100)
SET @test = 'Declaration status '
SET @test = REPLACE(REPLACE(@test, CHAR(10), ''), CHAR(13), '')