Good evening,
I need to replace a "part of a string" in a SQL column (Column3), I'm using the REPLACE build-in function to accomplish this, but I need to ADD a leading number (1) to the original string (Column2), and I keep getting "String or binary data would be truncated."
UPDATE [database].[dbo].[Table1]
SET [Column3] = REPLACE(Column3, Column2, ('1' + Column2))
One example:
Column2: "0200"
Here is an example of what COLUMN3 string looks like:
Column3: "TEST DATA 0200"
Then after it gets replaced we need to show it like this: "TEST DATA 10200"
Notice the number now includes a leading "1"
HELP PLEASE!!!
Maybe this REPLACE function replaces in a recursive loop until it blows the size of the string. Let's say Column3 is a varchar(10), and contains "abcd". Column2 contains "b". Then, it will replace all occurrences of Column2 in Column3 for '1' + Column2.
First replacement ('b' for '1b'):
"a1bcd"
Second replacement:
"a11bcd"
It keeps going...
"a111bcd"
"a1111bcd"
"a11111bcd"
"a111111bcd"
Next time it will try to put a 11 character string on a varchar(10).
You should define your own function (using T-SQL) to run the string only once and, after replace, continue from after the replacement and not from the right next character.
The error message you are getting indicates that the resulting string is too big to fit in the "Data" column. You need to increase the size of "Data" to ensure the result will fit.
Try running this SELECT (instead of the UPDATE) to see what is happening
SELECT
REPLACE(Column3, Column2, ('1' + Column2)) AS Result,
DATALENGTH(REPLACE(Column3, Column2, ('1' + Column2))) AS ResultSize
FROM
[database].[dbo].[Table1]
Related
I have column values that are in between starting of the underscore() and ending with an underscore().
I am trying to see how to extract a value between 2 underscores (_). for example, "xxxx_Whats your number 23345_xxxxx".
I want to discard everything before and after underscore(_).
Any help is greatly appreciated.
REGEXP_SUBSTRING, using a grouping match ( ) and turning on sub-matches 'e', and selecting the first match.. then stating you want to see an underscore, and then many not underscores, and then a underscore.
select
column1,
regexp_substr(column1, '_([^_]*)_',1,1,'e')
from values
('xxxx_Whats your number 23345_xxxxx')
gives:
COLUMN1
REGEXP_SUBSTR(COLUMN1, '([^]*)_',1,1,'E')
xxxx_Whats your number 23345_xxxxx
Whats your number 23345
hmm, you mention discard before and after, thus if you want to include the underscore you will need to move them into the grouping brackets:
select
column1,
regexp_substr(column1, '_([^_]*)_',1,1,'e') as exclude_underscore,
regexp_substr(column1, '(_[^_]*_)',1,1,'e') as include_underscore
from values
('xxxx_Whats your number 23345_xxxxx'),
('has no first underscore_xxxxx'),
('xxx_has no last underscore'),
('nothing between__the underscores');
COLUMN1
EXCLUDE_UNDERSCORE
INCLUDE_UNDERSCORE
xxxx_Whats your number 23345_xxxxx
Whats your number 23345
_Whats your number 23345_
has no first underscore_xxxxx
null
null
xxx_has no last underscore
null
null
nothing between__the underscores
__
then you might also want, atleast 1 character between the underscrores, and thus should change the * to a + or a {n,}
I want to find strings of any length that contain only 0's and a symbol such as a / a . or a -
Examples include 0__0 and 000/00/00000 and .00000
Considering this sample data:
CREATE TABLE dbo.things(thing varchar(255));
INSERT dbo.things(thing) VALUES
('0__0'),('000/00/00000'),('00000'),('0123456');
Try the following, which locates the first position of any character that is NOT a 0, a decimal, a forward slash, or an underscore. PATINDEX returns 0 if the pattern is not found.
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0^.^/^_]%', thing) = 0;
Results:
thing
0__0
000/00/00000
00000
The opposite:
SELECT thing FROM dbo.things
WHERE PATINDEX('%[^0^.^/^_]%', thing) > 0;
Results:
thing
0123food456
Example db<>fiddle
I can see a way of doing this... But it's something that wouldn't perform well, if you think about using it as a search criteria.
We are going to use a translate function on SQL Server, to replace the allowed characters, or symbols as you've said, with a zero. And then, eliminates the zeroes. If the result is an empty string, then there are two cases, or it only had zeroes and allowed characters, or it already was an empty string.
So, checking for this and for non-empty strings, we can define if it matches your criteria.
-- Test scenario
create table #example (something varchar(200) )
insert into #example(something) values
--Example cases from Stack Overflow
('0__0'),('000/00/00000'),('.00000'),
-- With something not allowed (don't know, just put a number)
('1230__0'),('000/04560/00000'),('.00000789'),
-- Just not allowed characters, zero, blank, and NULL
('1234567489'),('0'), (''),(null)
-- Shows the data, with a column to check if it matches your criteria
select *
from #example e
cross apply (
select case when
-- If it *must* have at least a zero
e.something like '%0%' and
-- Eliminates zeroes
replace(
-- Replaces the allowed characters with zero
translate(
e.something
,'_./'
,'000'
)
,'0'
,''
) = ''
then cast(1 as bit)
else cast(0 as bit)
end as doesItMatch
) as criteria(doesItMatch)
I really discourage you from using this as a search criteria.
-- Queries the table over this criteria.
-- This is going to compute over your entire table, so it can get very CPU intensive
select *
from #example e
where
-- If it *must* have at least a zero
e.something like '%0%' and
-- Eliminates zeroes
replace(
-- Replaces the allowed characters with zero
translate(
e.something
,'_./'
,'000'
)
,'0'
,''
) = ''
If you must use this as a search criteria, and this will be a common filter on your application, I suggest you create a new bit column, to flag if it matches this, and index it. Thus, the increase in computational effort would be spread on the inserts/updates/deletes, and the search queries won't overloading the database.
The code can be seen executing here, on DB Fiddle.
What I got from the question is that the strings must contain both 0 and any combination of the special characters in the string.
If you have SQL Server 2017 and above, you can use translate() to replace multiple characters with a space and compare this with the empty string. Also you can use LIKE to enforce that both a 0 and any combination of the special character(s) appear at least once:
DECLARE #temp TABLE (val varchar(100))
INSERT INTO #temp VALUES
('0__0'), ('000/00/00000'), ('.00000'), ('w0hee/'), ('./')
SELECT *
FROM #temp
WHERE val LIKE '%0%' --must have at least one zero somewhere
AND val LIKE '%[_/.]%' --must have at least one special character(s) somewhere
AND TRANSLATE(val, '0./_', ' ') = '' --translated zeros and sp characters to spaces equivalent to an empty string
Creates output:
val
0__0
000/00/00000
.00000
I am using SQL to compare two columns and return TRUE/FALSE if they are equal.
In some cases, the two columns contain exactly the same string (no spaces or anything) but I am still getting false.
What may the reason for this be?
I am using this code:
CASE WHEN column1 = column2 THEN 0 ELSE 1 END AS [check]
The values are different despite the displayed value.
Using T-SQL, run a query like this to see the exact difference in the underlying raw values:
SELECT
column1
, CAST(column1 AS varbinary(MAX)) AS column1Binary
, column2
, CAST(column2 AS varbinary(MAX)) AS column2Binary
FROM dbo.YourTable;
This will reveal underlying differences like tabs or subtle character differences.
In fact, a likely explanation for what you are seeing is that one/both of the strings has leading and/or trailing whitespace. On SQL Server you may try:
CASE WHEN LTRIM(column1) = LTRIM(column2) THEN 0 ELSE 1 END AS [check]
If the above does not detect the problematical records, then try checking the length:
CASE WHEN LEN(column1) = LEN(column2) THEN 0 ELSE 1 END AS [check2]
In my local table, I am try to check if an Oracle Number column called JOBNUMBER has a value that exists in a string parameter. Technically I am passing in the string as a stored procedure nvarchar2 parameter, but for simplicity, I hardcoded the string in my Query below:
SELECT FIRST_NAME, JOB_NUMBER
FROM JOBTABLE
WHERE TO_CHAR(JOB_NUMBER) IN ('00052, 00048');
When Oracle runs the query above, it returns no values even though 00052 is a number value in the table column for JOB_NUMBER. I'm thinking that it checks for the whole string ('00052, 00048') in JOB_NUMBER and can't find it, so it returns no values. The string will contain different values each time, and there will several numbers (of type string) in that string.
Does anyone know how to do this?
The trick is to keep the leading zeroes of the number when comparing to the string, then looping through the string to compare. Here a CTE is used is to simulate creating a numeric job number and a string to search. The TO_CHAR function makes sure to preserve the leading zeroes and the FM format removes the leading space that TO_CHAR leaves for the sign. CONNECT BY loops through the elements for the count of the delimiter + 1 times, keeping the count in the value in 'LEVEL'. This value is used in REGEXP_SUBSTR to iterate through the elements to compare the converted numeric value to each element to see if a match is found. Note this regular expression allows for NULL elements should you need to know which item in the list is your match.
SQL> with tbl(job_nbr_in, job_str_in) as (
select 00052, '00052, 00048' from dual
)
select --level element_nbr,
to_char(job_nbr_in, 'FM00000') search_for, job_str_in in_string,
regexp_substr(job_str_in, '(.*?)(, |$)', 1, level, NULL, 1) found
from tbl
where to_char(job_nbr_in, 'FM00000') = regexp_substr(job_str_in, '(.*?)(, |$)', 1, level, NULL, 1)
connect by level <= regexp_count(job_str_in, ',')+1;
SEARCH_FOR IN_STRING FOUND
---------- ------------ ------------
00052 00052, 00048 00052
If you are not sure if you will always have a space after the comma, remove spaces with REPLACE and adjust the delimiter in REGEXP_SUBSTR:
with tbl(job_nbr_in, job_str_in) as (
select 00052, '00052, 00048' from dual
)
select to_char(job_nbr_in, 'FM00000') search_for, job_str_in in_string,
regexp_substr(replace(job_str_in, ' '), '(.*?)(,|$)', 1, level, NULL, 1) found
from tbl
where to_char(job_nbr_in, 'FM00000') = regexp_substr(replace(job_str_in, ' '), '(.*?)(,|$)', 1, level, NULL, 1)
connect by level <= regexp_count(job_str_in, ',')+1;
Someone asked here how to get only values which are a number :
So , if the table is :
DECLARE #Table TABLE(
Col nVARCHAR(50)
)
INSERT INTO #Table SELECT 'ABC'
INSERT INTO #Table SELECT '234.62'
INSERT INTO #Table SELECT '10:10:10:10'
INSERT INTO #Table SELECT 'France'
INSERT INTO #Table SELECT '2'
then - the desired results are :
234.62
2
But when I tested this query :
SELECT * FROM #Table WHERE Col LIKE '%[0-9.]%' --expected to see only 234.62
it showed :
234.62
10:10:10:10
2
Question #1
How come 10:10:10:10 , 2 satisfies the condition ?
Question #2
I saw this answer here which does work
SELECT * FROM #Table WHERE Col NOT LIKE '%[^0-9.]%'
But I don't understand why this works. AFAIU - it selects all values which are not like (not(has number) and not( has dot)) which is ===>(de morgan)===> not like ( has number or has dot)
Can someone please shed light ?
nb I already know that isnumeric can be used also , but it's unsafe (+). also valid wildcards are %,_,[],[^]
Any particular use of [set] within a LIKE expression is a check against one character in the target string.
So, LIKE '%[0-9.]%' says - % - match 0-to-many arbitrary characters, then [0-9.] match one character in the set 0-9., and then % match 0-to-many arbitrary characters. Paraphrased, it says "match any string that contains at least one character in the set 0-9.". So, 10:10:10:10 can be matched as 0 arbitrary characters, then 1 matches [0-9.], and then 0:10:10:10 matches the final %.
LIKE '%[^0-9.]%' says - % - match 0-to-many arbitrary characters, then [^0-9.] match one character not in the set 0-9., and then % match 0-to-many arbitrary characters. Paraphrased, it says "match any string that contains at least one character outside of the set 0-9.. So when we apply the NOT to the front of that, we are saying "match any string that doesn't contain at least one character outside of the set 0-9." or "match strings that only contain characters in the set 0-9..
Essentially, the double-negative is a way to make an assertion about all characters in the string.