FOR XML PATH always adds trailing space to value - sql-server

Using the FOR XML PATH structure to create a list of values,
I find that (annoyingly) it always adds a trailing space to selected values.
This ruins my attempts at providing my own delimiters - the trailing space is added after the column and delimiters have been concatenated.
For example:
SELECT country + '-' FROM countryTable...
results in the following string:
china- france- england-
Has anyone else seen this, and is there a way to stop it?
I don't think TRIM() will work, as that would be applied before the extra space is inserted...
I'm using SQL Server 2016.
Thanks

Ok, thanks to John C and his sample query I found the culprit.
I had an AS [data()] clause after the column name/delimiter.
Removing that removed the trailing space.
I don't know how/why but it did...

I suspect the problem is the data inside the country column: what if each value in the country column has a leading space? FOR XML PATH by itself does not add any space to the data.
Try this:
SELECT RTRIM(LTRIM(country)) + '-' FROM countryTable...

You may have leading/trailing spaces and/or CRLFs. Perhaps this will help:
Declare @countryTable table (country varchar(100))
Insert Into @countryTable values
(' china'), -- leading space
(char(13)+'france'), -- leading char(13)
(char(10)+'england') -- leading char(10)
Select Value=Stuff((Select Distinct '-' + ltrim(rtrim(replace(replace(country,char(13),''),char(10),'')))
From @countryTable
Where 1=1
For XML Path ('')),1,1,'')
Returns
Value
china-england-france

About FOR XML PATH ... AS [data()], to add to this, from the Microsoft documentation:
If the path specified as column name is data(), the value is treated as an atomic value in the generated XML. A space character is added to the XML if the next item in the serialization is also an atomic value. This is useful when you are creating list typed element and attribute values.
When you write ... AS something instead, then something is used as the opening/closing markup tag for each selected value.
Also, it is possible to concatenate several fields from each row in the SELECT clause. Values of types other than string must first be converted to a string type with CAST.
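To see the [data()] effect in isolation, here is a minimal sketch. It assumes a throwaway table variable @countryTable (a hypothetical stand-in for the asker's countryTable) with three clean values. Row order is not guaranteed without an ORDER BY, so the order of the names may differ.

```sql
-- Hypothetical sample data standing in for countryTable.
DECLARE @countryTable TABLE (country varchar(100));
INSERT INTO @countryTable VALUES ('china'), ('france'), ('england');

-- Column aliased AS [data()]: each row is treated as an atomic value,
-- so a space is placed between consecutive values.
SELECT country + '-' AS [data()]
FROM @countryTable
FOR XML PATH('');

-- No column name at all: the values are inlined as text with
-- nothing inserted between them.
SELECT country + '-'
FROM @countryTable
FOR XML PATH('');
```

Comparing the two result strings side by side should show that the extra spaces come from the alias, not from the data.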

Related

Remove special character from customerId column in SQL Server

I want remove special character from CustomerId column.
Currently CustomerId column contains ¬78254782 and I want to remove the ¬ character.
Could you please help me with that ?
You can apply the T-SQL REPLACE function:
SELECT REPLACE(CustomerId, '¬', '') FROM Customers;
SQL Server does not really have any regex support, which is what you would probably want to be using here. That being said, you could try using the enhanced LIKE operator as follows:
UPDATE yourTable
SET CustomerId = RIGHT(CustomerId, LEN(CustomerId) - 1)
WHERE CustomerId LIKE '[^A-Za-z0-9]%';
Here we are phrasing the condition of the first character being special using [^A-Za-z0-9], followed by anything else. In that case, we substring off the first character in the update.
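Before running the UPDATE, it may be worth previewing what it would change. This is only a sketch: the table variable @Customers and the column alias Cleaned are made up for illustration, and how the character ranges are evaluated depends on the collation in use.

```sql
-- Hypothetical sample data; the real table is Customers / yourTable.
DECLARE @Customers TABLE (CustomerId nvarchar(20));
INSERT INTO @Customers VALUES (N'¬78254782'), (N'12345678');

-- Show the original value next to what the UPDATE would write.
SELECT CustomerId,
       CASE
           WHEN CustomerId LIKE '[^A-Za-z0-9]%'
               THEN RIGHT(CustomerId, LEN(CustomerId) - 1)  -- drop first character
           ELSE CustomerId                                  -- leave untouched
       END AS Cleaned
FROM @Customers;
```

Only rows whose first character is not a letter or digit should come back changed.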

Remove lines recursively from a SQL TEXT column

I have a SQL Text column which has a block of text and has multiple lines which are not relevant and I need to remove only those lines.
Example - This is all one column value:
Header_ID askdjfhklasjdhfklajhfwoi fhweiohrognfk
ABC
SECTION_ID asdfhkwjehfi efjhewiu1382204 3904834
123
SECTION_ID deihefgjkahf dfjsdhfkl edfashldfkljh
So basically I need to remove all lines that start with Header_ID or Section_ID, and the output text I need is just
ABC
123
The only thing constant about these lines is the first word it starts with and depending on that I need to remove the whole line.
Here is a solution. Details about how it works are below. Note this solution needs MSSQL 2017+ to work.
-- Place the raw string value as varchar data in a variable so it is convenient to work with:
declare @rawValue varchar(max) = 'Header_ID askdjfhklasjdhfklajhfwoi fhweiohrognfk
ABC
SECTION_ID asdfhkwjehfi efjhewiu1382204 3904834
123
SECTION_ID deihefgjkahf dfjsdhfkl edfashldfkljh';
-- Perform multiple operations on the raw value and save the result to another variable:
declare @convertedValue varchar(max) =
(
select string_agg(value, char(13) + char(10))
from string_split(@rawValue, char(10))
where value not like 'header_id%' and value not like 'section_id%'
);
-- Display converted value.
select @convertedValue;
The magic begins with the string_split() function which produces a table value. It detects the line feed character, char(10), and splits the multi-line string into a table with each line from the string in a separate row.
Next, we filter out the rows from the table that we don't want. These rows begin with the known substrings header_id and section_id. This is accomplished in the where clause.
Lastly, for the output, we use string_agg() and aggregate the remaining rows (the lines we do want) back into a string with the individual values delimited by a combination of the carriage return char(13) and line feed char(10) characters.
Since I am using SQL Server 2016 and not 2017, what I did to resolve the issue was this: first break all the data into multiple rows (CROSS APPLY with the split function) using CHAR(13) as the delimiter, then keep only the rows that do not start with Header_ID, Section_ID etc., and finally rebuild the text block using STUFF.
Thanks again @otto for the resolution.
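For anyone else on SQL Server 2016, the split / filter / STUFF idea described above might look roughly like this. It is a sketch, not the exact query I ran: it splits on CHAR(10) and strips CHAR(13) rather than splitting on CHAR(13), it assumes database compatibility level 130 so that STRING_SPLIT is available, and the names @crlf, line and convertedValue are just illustrative. STRING_SPLIT does not promise to return rows in their original order, so check the line order on real data.

```sql
DECLARE @crlf char(2) = CHAR(13) + CHAR(10);
DECLARE @rawValue varchar(max) =
    'Header_ID askdjfhklasjdhfklajhfwoi fhweiohrognfk' + @crlf +
    'ABC' + @crlf +
    'SECTION_ID asdfhkwjehfi efjhewiu1382204 3904834' + @crlf +
    '123' + @crlf +
    'SECTION_ID deihefgjkahf dfjsdhfkl edfashldfkljh';

SELECT STUFF(
    (
        -- Re-join the surviving lines, each prefixed with CRLF.
        SELECT @crlf + s.line
        FROM (
            -- One row per line; drop the CR left over from CRLF endings.
            SELECT REPLACE(value, CHAR(13), '') AS line
            FROM STRING_SPLIT(@rawValue, CHAR(10))
        ) AS s
        WHERE s.line NOT LIKE 'Header_ID%'
          AND s.line NOT LIKE 'SECTION_ID%'
        FOR XML PATH(''), TYPE
    ).value('.', 'varchar(max)'),   -- undo XML entity encoding
    1, 2, '') AS convertedValue;    -- remove the leading CRLF
```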

How can I make LIKE match a number or empty string inside square brackets in T-SQL?

Is it possible to have a LIKE clause with one character number or an empty string?
I have a field in which I will write a LIKE clause (as a string). I will apply it later with an expression in the WHERE clause: ... LIKE tableX.FormatField .... It must contain a number (a single character or an empty string).
Something like [0-9 ]. Where the space bar inside square brackets means an empty string.
I have a table in which I have a configuration for parameters - TblParam with field DataFormat. I have to validate a value from another table, TblValue, with field ValueToCheck. The validation is made by a query. The part for the validation looks like:
... WHERE TblValue.ValueToCheck LIKE TblParam.DataFormat ...
For the configuration value, I need an expression for one numeric character or an empty string. Something like [0-9'']. Because of the automatic nature of the check, I need a single expression (without AND or OR operators) which can fit the query (see the example above). The same mechanism is used for other types of checks, so it has to fit my check engine.
I am almost sure that I can not use [0-9''], but is there another suitable solution?
Actually, I have difficulty validating a version string: 1.0.1.2 or 1.0.2. It can contain 2-3 dots (.) and numbers.
I am pretty sure it is not possible, as '' is not even a character.
select ascii(''); returns null.
'' = ' '; is true
'' is null; is false
If you want exactly 0-9 or '' (and not ' '), then you need to do something like this (which is more efficient than LIKE):
where col in ('1','2','3','4','5','6','7','8','9','0') or (col = '' and DATALENGTH(col) = 0)
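A quick way to convince yourself of the padding behaviour, and of why the DATALENGTH check is needed, is a small test like the following. The table variable @t and the column aliases are made up for the demonstration.

```sql
-- '' and ' ' compare as equal, but their stored lengths differ.
SELECT
    CASE WHEN '' = ' ' THEN 'equal' ELSE 'different' END AS padded_comparison,
    DATALENGTH('')  AS bytes_in_empty_string,
    DATALENGTH(' ') AS bytes_in_single_space;

-- Hypothetical test rows: empty string, single space, digit, letter.
DECLARE @t TABLE (col varchar(1));
INSERT INTO @t VALUES (''), (' '), ('5'), ('a');

-- Keep a single digit or a truly empty string; the single space
-- is rejected because its DATALENGTH is 1, not 0.
SELECT col, DATALENGTH(col) AS bytes
FROM @t
WHERE col LIKE '[0-9]'
   OR (col = '' AND DATALENGTH(col) = 0);
```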
That's a tricky one... As far as I can tell, there isn't a way to do it with only one like clause. You need to do like '[0-9]' OR like ''.
You could accomplish this by having a second column in your TableX that indicates either a second pattern or whether to include blanks.
If I correctly understand your question, you need something that catches an empty string. Try to use the nullif() function:
create table t1 (a nvarchar(1))
insert t1(a) values('')
insert t1(a) values('1')
insert t1(a) values('2')
insert t1(a) values('a')
-- must select first three
select a from t1 where a like '[0-9]' or nullif(a,'') is null
It returns exactly three records: '', '1' and '2'.
A more convenient method with only one range clause is:
select a from t1 where isnull(nullif(a,''),0) like '[0-9]'

How can I use LTRIM/RTRIM to search and replace leading/trailing spaces?

I'm in the process of trying to clear out leading and trailing spaces from an NVARCHAR(MAX) column that is filled with prices (using NVARCHAR due to data importing from multiple operating systems with odd characters).
At this point I have a t-sql command that can remove the leading/trailing spaces from static prices. However, when it comes to leveraging this same command to remove all prices, I'm stumped.
Here's the static script I used to remove a specific price:
UPDATE *tablename* set *columnname* = LTRIM(RTRIM(2.50)) WHERE cost = '2.50 ';
Here's what I've tried to remove all the trailing spaces:
UPDATE *tablename* set *columnname* LIKE LTRIM(RTRIM('[.]')) WHERE cost LIKE '[.] ';
I've also tried different variations of the % for random characters but at this point I'm spinning my wheels.
What I'm hoping to achieve is to run one simple command that takes off all the leading and trailing spaces in each cell of this column without modifying any of the actual column data.
To remove spaces from the left/right, use LTRIM/RTRIM. What you need is
UPDATE *tablename*
SET *columnname* = LTRIM(RTRIM(*columnname*));
which works on ALL the rows. To avoid updating rows that don't need it, keep the same update and add a WHERE clause that only matches rows starting or ending with a space:
UPDATE [tablename]
SET [columnname] = LTRIM(RTRIM([columnname]))
WHERE 32 in (ASCII([columnname]), ASCII(REVERSE([columnname])));
Note: 32 is the ascii code for the space character.
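If you want to check which rows that WHERE clause picks up before updating anything, a preview along these lines may help. It is a sketch using a made-up table variable @prices; the column name cost is taken from the question.

```sql
-- Hypothetical sample rows: trailing space, leading space, clean value.
DECLARE @prices TABLE (cost nvarchar(max));
INSERT INTO @prices VALUES (N'2.50 '), (N' 3.75'), (N'4.00');

-- ASCII() looks at the first character, so ASCII(REVERSE(...))
-- effectively looks at the last one; 32 is the space character.
SELECT cost,
       LTRIM(RTRIM(cost)) AS trimmed
FROM @prices
WHERE 32 IN (ASCII(cost), ASCII(REVERSE(cost)));
```

Only the first two rows should be returned; the clean value is skipped. Keep in mind that LTRIM/RTRIM only remove ordinary spaces, so other whitespace-like characters from the import would need separate handling.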
To remove spaces... please use LTRIM/RTRIM
LTRIM(String)
RTRIM(String)
The String parameter that is passed to the functions can be a column name, a variable, a literal string or the output of a user defined function or scalar query.
SELECT LTRIM(' spaces at start')
SELECT RTRIM(FirstName) FROM Customers
Read more: http://rockingshani.blogspot.com/p/sq.html#ixzz33SrLQ4Wi
LTRIM and RTRIM functions:
The LTRIM function removes leading spaces and the RTRIM function removes trailing spaces from a string. Nest the two to remove both types of spaces.
select LTRIM(RTRIM(' SQL Server '))
output:
SQL Server
I understand this question is for SQL Server 2012, but in the same scenario on SQL Server 2017 or Azure SQL you can use TRIM directly, as below:
UPDATE *tablename*
SET *columnname* = trim(*columnname*);
SELECT RTRIM(' Author ') AS Name;
The output will be without any trailing spaces:
Name
------
' Author'
The LTRIM function removes leading spaces and the RTRIM function removes trailing spaces from a string.
Nesting the two removes both types of spaces, that is, the spaces before and after the string.
SELECT LTRIM(RTRIM(REVERSE(' NEXT LEVEL EMPLOYEE ')))

How can I make SQL Server return FALSE for comparing varchars with and without trailing spaces?

If I deliberately store trailing spaces in a VARCHAR column, how can I force SQL Server to see the data as mismatch?
SELECT 'foo' WHERE 'bar' = 'bar '
I have tried:
SELECT 'foo' WHERE LEN('bar') = LEN('bar ')
One method I've seen floated is to append a specific character to the end of every string then strip it back out for my presentation... but this seems pretty silly.
Is there a method I've overlooked?
I've noticed that it does not apply to leading spaces so perhaps I run a function which inverts the character order before the compare.... problem is that this makes the query unSARGable....
From the docs on LEN (Transact-SQL):
Returns the number of characters of the specified string expression, excluding trailing blanks. To return the number of bytes used to represent an expression, use the DATALENGTH function
Also, from the support page on How SQL Server Compares Strings with Trailing Spaces:
SQL Server follows the ANSI/ISO SQL-92 specification on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them.
Update: I deleted my code using LIKE (which does not pad spaces during comparison) and DATALENGTH() since they are not foolproof for comparing strings
This has also been asked in a lot of other places as well for other solutions:
SQL Server 2008 Empty String vs. Space
Is it good practice to trim whitespace (leading and trailing)
Why would SqlServer select statement select rows which match and rows which match and have trailing spaces
You could try something like this:
declare @a varchar(10), @b varchar(10)
set @a='foo'
set @b='foo '
select @a, @b, DATALENGTH(@a), DATALENGTH(@b)
Sometimes the dumbest solution is the best:
SELECT 'foo' WHERE 'bar' + 'x' = 'bar ' + 'x'
So basically append any character to both strings before making the comparison.
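To make the difference between these approaches concrete, here is a small side-by-side check. The column aliases are only there for readability.

```sql
SELECT
    -- Plain comparison pads the shorter string, so this is 1.
    CASE WHEN 'bar' = 'bar ' THEN 1 ELSE 0 END             AS plain_equals,
    -- LEN ignores trailing blanks: 3.
    LEN('bar ')                                            AS len_result,
    -- DATALENGTH counts the stored bytes: 4.
    DATALENGTH('bar ')                                     AS datalength_result,
    -- Appending a character makes the space significant, so this is 0.
    CASE WHEN 'bar' + 'x' = 'bar ' + 'x' THEN 1 ELSE 0 END AS sentinel_equals;
```

The last column is the "append a character" trick: once the space is no longer at the end, it is compared like any other character.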
After some searching, the simplest solution I found was on Anthony Bloesch's WebLog.
Just append some text (a single char is enough) to the end of the data:
SELECT 'foo' WHERE 'bar' + 'BOGUS_TXT' = 'bar ' + 'BOGUS_TXT'
Also works for 'WHERE IN'
SELECT <columnA>
FROM <tableA>
WHERE <columnA> + 'BOGUS_TXT' in ( SELECT <columnB> + 'BOGUS_TXT' FROM <tableB> )
The approach I’m planning to use is to use a normal comparison which should be index-keyable (“sargable”) supplemented by a DATALENGTH (because LEN ignores the whitespace). It would look like this:
DECLARE @testValue VARCHAR(MAX) = 'x';
SELECT t.Id, t.Value
FROM dbo.MyTable t
WHERE t.Value = @testValue AND DATALENGTH(t.Value) = DATALENGTH(@testValue)
It is up to the query optimizer to decide the order of filters, but it should choose to use an index for the data lookup if that makes sense for the table being tested and then further filter down the remaining result by length with the more expensive scalar operations. However, as another answer stated, it would be better to avoid these scalar operations altogether by using an indexed calculated column. The method presented here might make sense if you have no control over the schema, or if you want to avoid creating the calculated columns, or if creating and maintaining the calculated columns is considered more costly than the worse query performance.
I've only really got two suggestions. One would be to revisit the design that requires you to store trailing spaces - they're always a pain to deal with in SQL.
The second (given your SARG-able comments) would be to add a computed column to the table that stores the length, and add this column to appropriate indexes. That way, at least, the length comparison should be SARG-able.
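As a rough sketch of that second suggestion: the table name dbo.MyTable comes from the earlier answer, while the column ValueByteLength, the index name and the varchar(100) size are assumptions made up for illustration. Whether the optimizer actually seeks on the index depends on your data and statistics.

```sql
-- Hypothetical schema: a persisted computed column holding the byte length.
CREATE TABLE dbo.MyTable
(
    Id              int IDENTITY(1,1) PRIMARY KEY,
    Value           varchar(100) NOT NULL,
    ValueByteLength AS DATALENGTH(Value) PERSISTED
);

-- Index both the value and its length so both predicates can use it.
CREATE INDEX IX_MyTable_Value_Length
    ON dbo.MyTable (Value, ValueByteLength);

-- The equality still pads trailing spaces, but the length predicate
-- rejects rows whose stored length differs from the search value.
DECLARE @testValue varchar(100) = 'x';
SELECT t.Id, t.Value
FROM dbo.MyTable AS t
WHERE t.Value = @testValue
  AND t.ValueByteLength = DATALENGTH(@testValue);
```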

Resources