SQL-manipulating strings - sql-server

I'll try to make this clear.
Let's say I have a table with two columns, issue_number and issue_text. I need to grab two strings out of the issue_text column. The first can be hard-coded with CASE expressions, since there are only so many types of issue that can be logged (I know this isn't the best way):
case
when issue_text like '%error%' then 'error'
else 'not found'
end as error_type
The issue_text value is formatted mostly the same way every time: an error, more info, then an incident number, which is the end of the string.
e.g. "Can't add address. Ref Number: 9999999"
The problem I'm having is that the number is not always the same number of characters away from the error message.
I was wondering whether there is a way to access the substring that caused the match in the LIKE clause, something like another CASE expression using a regex (which I know isn't well supported in SQL):
case
when issue_text like '%[0-9 .]%' then (the substring match from like '%[0-9 .]%')
else 00000
end as issue_number
I am restricted to solving this and parsing these strings in SQL Server Management Studio; otherwise, yes, I'd use .NET or something similar.

Declare @YourTable table (ID int,issue_text varchar(150))
Insert Into @YourTable values
(1,'Can''t add address. Ref Number: 9999999'),
(2,'error')
Select ID
,Issue = Left(issue_text,PatIndex('%:%',issue_text+':')-1)
,IssueNo = substring(issue_text,PatIndex('%:%',issue_text+':')+2,25)
From @YourTable
Returns
ID  Issue                          IssueNo
1   Can't add address. Ref Number  9999999
2   error

If there's always a space just before the number and the number is the last part of the string you can do
RIGHT(issue_text, CHARINDEX(' ', REVERSE(issue_text)) - 1)
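To make the index arithmetic in that expression easier to follow, here is a minimal sketch in Python rather than T-SQL. The function name `trailing_token` is hypothetical, and returning an empty string when there is no space is an assumption made for the sketch; in T-SQL, CHARINDEX would return 0 and RIGHT would be asked for a length of -1, which it rejects.

```python
def trailing_token(issue_text: str) -> str:
    """Mimic RIGHT(s, CHARINDEX(' ', REVERSE(s)) - 1) for a single string."""
    reversed_text = issue_text[::-1]
    # CHARINDEX is 1-based and returns 0 when the character is not found.
    pos = reversed_text.find(" ") + 1
    if pos == 0:
        # Assumption for this sketch: no space means no reference number.
        return ""
    # Take the last (pos - 1) characters, i.e. everything after the final space.
    return issue_text[len(issue_text) - (pos - 1):]


print(trailing_token("Can't add address. Ref Number: 9999999"))
```

In the real query, a row such as 'error' (no space at all) would need a guard, for example a CASE that checks CHARINDEX first.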

Related

Oracle ROWTOCOL Function oddities

I have a requirement to pull data in a specific format and I'm struggling slightly with the ROWTOCOL function and was hoping a fresh pair of eyes might be able to help.
I'm using 10g Oracle DB (10.2) so LISTAGG which appears to do what I need to achieve is not an option.
I need to aggregate a number of usernames into a string delimited with a '$', but I also need to concatenate another column to build up email addresses.
select
rowtocol('select username_id from username where user_id = '||s.user_id|| 'order by USERNAME_ID asc','#'||d.domain_name||'$')
from username s, domain d
where s.user_id = d.user_id
(I've simplified the query specific to just this function as the actual query is quite large and all works except for this particular function.)
In the DOMAIN table I have a number of domains such as 'hotmail.com', 'gmail.com', etc.
I need to concatenate the username, a '#' symbol and the domain, with the results delimited by a '$',
such as:
joe.bloggs#gmail.com$joeblogs#gmail.com$joe_bloggs#gmail.com
I've battled with this and got close, but in reverse:
gmail.com$joe.bloggs#gmail.com$joeblogs#gmail.com$joe_bloggs
I've also noticed that if I play around with the delimiter (,'#'||d.domain_name||'$') it tends to drop the first character; as can be seen above, the preceding '#' has been dropped from the first email address.
Can anyone offer any suggestions as to how to get this working?
Many Thanks in advance!
Assuming you're using the rowtocol function from OTN, and have tables something like:
create table username (user_id number, username_id varchar2(20));
create table domain (user_id number, domain_name varchar2(20));
insert into username values (1, 'joe.bloggs');
insert into username values (1, 'joebloggs');
insert into username values (1, 'joe_bloggs');
insert into domain values (1, 'gmail.com');
Then your original query gets three rows back:
gmail.com$joe.bloggs
gmail.com$joe_bloggs#gmail.com$joebloggs
gmail.com$joe_bloggs#gmail.com$joebloggs
You're passing the data from each of your user IDs to a separate call to rowtocol, which isn't really what you want. You can get the result I think you're after by reversing it; pass the main query that joins the two tables as the select argument to the function, and have that passed query do the username/domain concatenation - that is a separate step to the string aggregation:
select
rowtocol('select s.username_id || ''#'' || d.domain_name from username s join domain d on d.user_id = s.user_id', '$')
from dual;
which gets a single result:
joe.bloggs#gmail.com$joe_bloggs#gmail.com$joebloggs#gmail.com
Whether that fits into your larger query, which you haven't shown, is a separate question. You might need to correlate it with the rest of your query.
There are other ways to do string aggregation in Oracle, but this function is one way, and you already have it installed. I'd look at alternatives though, such as ThomasG's answer, which I think make it a bit clearer what's going on.
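The point that building each address and joining the addresses are two separate steps can be sketched outside the database. This is Python, not Oracle SQL; the sample rows mirror the test data above, and the function name `aggregate_emails` is made up for illustration.

```python
# (username_id, domain_name) pairs, as the joined query would return them.
rows = [
    ("joe.bloggs", "gmail.com"),
    ("joe_bloggs", "gmail.com"),
    ("joebloggs", "gmail.com"),
]


def aggregate_emails(pairs, delimiter="$"):
    # Step 1: build each address from its own row.
    emails = [f"{username}#{domain}" for username, domain in pairs]
    # Step 2: aggregate the finished addresses with the delimiter.
    return delimiter.join(emails)


print(aggregate_emails(rows))
```

Putting the domain into the delimiter, as the original query did, mixes the two steps, which is why the pieces came out in the wrong places.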
As Alex told you in comments, this ROWTOCOL isn't a standard function so if you don't show its code, there's nothing we can do to fix it.
However you can accomplish what you want in Oracle 10 using the XMLAGG built-in function.
Try this:
SELECT
rtrim (xmlagg (xmlelement (e, s.username_id || '#' || d.domain_name || '$')).extract ('//text()'), '$') whatever
FROM username s
INNER JOIN domain d ON s.user_id = d.user_id

How to delete the rows that contain string?

I want to replace a string in a SQL script. I'm using this query:
update [master].[dbo].[Test]
set Student_ID = REPLACE (Student_ID , '1|2_', '1|2_345_')
I think my query is correct, but I'm getting this error: "String or binary data would be truncated. The statement has been terminated."
I think this happens because the Student_ID column contains unwanted data like "hdhvjf124rgrthrt". How do I delete the rows that contain values like "hdhvjf124rgrthrt"?
"String or binary data would be truncated. The statement has been terminated."
This exception usually means that your database field isn't long enough.
I think that after the replace, the resulting string is longer than the column allows. Try increasing the column size and running the script again.
Try this:
update [master].[dbo].[Test]
set Student_ID = REPLACE (Student_ID , '1|2_', '1|2_345_')
WHERE Student_ID like '%' + '1|2_' + '%'
You can delete those records with:
DELETE FROM [master].[dbo].[Test]
WHERE Student_ID LIKE '%hdhvjf124rgrthrt%'
I don't think that's the problem, though. I think Student_ID is too short to hold the result of the replace (since you are adding characters). What is the defined length of Student_ID, and how long are the actual values?
You must have some data that already has a length of 252 or more.
When you replace '1|2_' with '1|2_345_' you add 4 extra characters, or more if the old pattern occurs more than once, which pushes those values over the 255 length.
Try running a query to see:
SELECT LEN(Student_ID) FROM [master].[dbo].[Test] ORDER BY 1 DESC
That's your truncation problem.
However, you asked how to do the DELETE, and @HoneyBadger has already shown you that.
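The length arithmetic can be checked with a small Python sketch. The 255-character width is the figure guessed in the answer above, not something the question states, and `would_truncate` is a hypothetical helper name.

```python
COLUMN_WIDTH = 255  # assumed width; the question never gives the real one


def would_truncate(value, old="1|2_", new="1|2_345_", width=COLUMN_WIDTH):
    """Return True if replacing `old` with `new` makes `value` too long for the column."""
    return len(value.replace(old, new)) > width


# A 252-character value with one match grows by 4 characters to 256.
long_value = "1|2_" + "x" * 248
print(len(long_value), would_truncate(long_value))
```

Each extra occurrence of the old pattern adds another 4 characters, so a shorter value with several matches can overflow too.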

How to design an errors table for data validations in a star schema

I am working in SQL Server 2008. I have been tasked with writing a stored procedure to do some data validations on external data before we move it into our star schema data warehouse environment. One type of test requested is domain integrity / reference lookup from our external fact data tables to our dimension tables. To do this, I use the following technique:
SELECT
some_column
FROM some_fact_table
LEFT JOIN some_dimension_table
ON
some_fact_table.some_column = some_dimension_table.lookup_column
WHERE
some_fact_table.some_column IS NOT NULL
AND
some_dimension_table.lookup_column IS NULL
The SELECT clause will match the column definition for an errors table that I will eventually move the output into via SSIS. So, the SELECT clause actually looks like:
SELECT
primary_key,
'some_column' AS Offending_Column,
'not found in lookup' AS Error_Message,
some_column AS Offending_Value
But, because the fact tables are very large, we want to minimize the number of times that we have to select from it. Hence, I have just 1 query for each fact table to check each column in question, which looks something like:
SELECT
primary_key,
'col1|col2|col3' AS Potentially_Offending_Columns,
'not found in lookup|not found in lookup|not found in lookup' AS Error_Messages,
col1 + '|' + col2 + '|' + col3 AS Potentially_Offending_Values
FROM fact_table
LEFT JOIN dim_table1
ON
fact_table.col1 = dim_table1.lookup_column
LEFT JOIN dim_table2
ON
fact_table.col2 = dim_table2.lookup_column
LEFT JOIN dim_table3
ON
fact_table.col3 = dim_table3.lookup_column
WHERE
dim_table1.lookup_column IS NULL
OR
dim_table2.lookup_column IS NULL
OR
dim_table3.lookup_column IS NULL
This has some problems:
(1) If any of the source columns is NULL in a row, the string concatenation in Potentially_Offending_Values results in NULL. If I wrap each column in ISNULL (swapping the NULLs for something like an empty string), I can't tell whether the test failed because of a true empty string in the source or because a NULL was swapped for an empty string.
(2) If just one of the columns fails the lookup, the error message still reads 'not found in lookup|not found in lookup|not found in lookup', i.e., I can't tell which of the columns actually failed.
(3) The Potentially_Offending_Columns column in the output is always static, so I can't tell which columns failed just by looking at it.
So, in effect, I am having some design problems with my errors table. Is there a standard way of outputting to an errors table in this situation? Or, if not, what do I need to fix to make the output readable and useful?
I don't know what your data looks like, but instead of using an empty string with ISNULL, couldn't you return the word FAIL or something else that's meaningful to you? You could also use a CASE WHEN per column to build your 'not found in lookup' column, testing the joined dimension column so that only the lookups that actually failed produce a message:
CASE WHEN dim_table1.lookup_column IS NULL THEN 'not found in lookup' ELSE '' END + '|' +
CASE WHEN dim_table2.lookup_column IS NULL THEN 'not found in lookup' ELSE '' END + '|' +
CASE WHEN dim_table3.lookup_column IS NULL THEN 'not found in lookup' ELSE '' END AS Error_Messages,
ISNULL(col1,'FAIL') + '|' + ISNULL(col2,'FAIL') + '|' + ISNULL(col3,'FAIL') AS Potentially_Offending_Values
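As a rough illustration of the per-column CASE idea, here is a small runnable sketch using Python's sqlite3 module. It is not SQL Server: SQLite concatenates with || where T-SQL uses +, the table contents are invented, and only two of the columns are shown. It also assumes, as in the question's first query, that a NULL fact value should not count as a failed lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_table1 (lookup_column TEXT);
    CREATE TABLE dim_table2 (lookup_column TEXT);
    CREATE TABLE fact_table (primary_key INTEGER, col1 TEXT, col2 TEXT);
    INSERT INTO dim_table1 VALUES ('A');
    INSERT INTO dim_table2 VALUES ('X');
    INSERT INTO fact_table VALUES
        (1, 'A', 'X'),   -- both lookups succeed
        (2, 'B', 'X'),   -- col1 fails
        (3, 'A', NULL),  -- NULL source value, not treated as a failure
        (4, 'B', 'Y');   -- both fail
""")

query = """
SELECT f.primary_key,
       CASE WHEN f.col1 IS NOT NULL AND d1.lookup_column IS NULL THEN 'col1' ELSE '' END
       || '|' ||
       CASE WHEN f.col2 IS NOT NULL AND d2.lookup_column IS NULL THEN 'col2' ELSE '' END
       AS offending_columns
FROM fact_table f
LEFT JOIN dim_table1 d1 ON f.col1 = d1.lookup_column
LEFT JOIN dim_table2 d2 ON f.col2 = d2.lookup_column
WHERE (f.col1 IS NOT NULL AND d1.lookup_column IS NULL)
   OR (f.col2 IS NOT NULL AND d2.lookup_column IS NULL)
ORDER BY f.primary_key
"""

# Each failing row lists only the columns whose lookup actually failed.
failures = conn.execute(query).fetchall()
print(failures)
```

The fact table is still scanned once, but the output now shows which lookups failed for each row instead of a static list.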

Pl/Sql array inside a statement

I'm trying to prepare a function, so I've started this sql sketch to figure out how to manage my situation:
DECLARE
x XMLType;
begin
x := XMLType('<?xml version="1.0"?>
<ROWSET>
<ROW>
<START_DATETIME>29/05/2015 14:23:00</START_DATETIME>
</ROW>
<ROW>
<START_DATETIME>29/05/2015 17:09:00</START_DATETIME>
</ROW>
</ROWSET>');
FOR r IN (
SELECT ExtractValue(Value(p),'/ROW/START_DATETIME/text()') as deleted
FROM TABLE(XMLSequence(Extract(x,'/ROWSET/ROW'))) p
) LOOP
-- do whatever you want with r.name, r.state, r.city
-- dbms_output.put_line( 'TO_DATE('''|| r.deleted ||''', '''|| 'DD/MM/YYYY HH24:MI:SS'')');
dbms_output.put_line( ''''|| r.deleted ||'''');
DELETE FROM MYTABLE a WHERE a.START_DATETIME not in (''''|| r.deleted || '''');
END LOOP;
END;
I've tried different ways to perform the query after the loop has filled the variable, but it gives me a conversion error:
00000 - "a non-numeric character was found where a numeric was expected"
*Cause: The input data to be converted using a date format model was
incorrect. The input data did not contain a number where a number was
required by the format model.
*Action: Fix the input data or the date format model to make sure the
elements match in number and type. Then retry the operation.
Can anybody help me?
thanks!
You're wrapping a string in explicit single quotes; that is making the quotes part of the string itself, which you don't want.
You need to convert the string to a data type, which you are sort of doing in a commented-out section - in that case you do need the extra quotes for your dbms_output() to make it a text literal, and to end up as a valid to_date() call; so you end up with output from that:
TO_DATE('29/05/2015 14:23:00', 'DD/MM/YYYY HH24:MI:SS')
For your delete, though, you just need:
DELETE FROM MYTABLE a
WHERE a.START_DATETIME not in (to_date(r.deleted, 'DD/MM/YYYY HH24:MI:SS'));
The reference to r.deleted is already a string, so you refer to it directly, with no additional quotes.
You only have a single value though, so at that point in the loop using not in is not necessary and you can use != instead:
DELETE FROM MYTABLE
WHERE START_DATETIME != to_date(r.deleted, 'DD/MM/YYYY HH24:MI:SS');
Your title mentions an array, so perhaps you really intend to put all the values from the XML into a (schema-level type) table collection and then use that in the not in clause, so it removes everything except the dates in your XML. Doing it individually like this will effectively delete everything in the table if there is more than one date in the XML, which also suggests you either want to use an array, and/or actually meant in or = to only remove those.
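The difference between deleting one value at a time and deleting against the whole set can be seen in a small sketch. It uses Python's sqlite3 module rather than PL/SQL, stores the dates as plain text, and the table contents and function names are invented for illustration.

```python
import sqlite3

keep = ["29/05/2015 14:23:00", "29/05/2015 17:09:00"]


def remaining_after(delete_strategy):
    """Build a fresh three-row table, run the strategy, return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (start_datetime TEXT)")
    conn.executemany(
        "INSERT INTO mytable VALUES (?)",
        [(keep[0],), (keep[1],), ("30/05/2015 09:00:00",)],
    )
    delete_strategy(conn)
    rows = conn.execute("SELECT start_datetime FROM mytable ORDER BY 1")
    return [row[0] for row in rows]


def one_at_a_time(conn):
    # Like the loop in the question: each pass removes every row except one value,
    # so the second pass removes the row that the first pass kept.
    for value in keep:
        conn.execute("DELETE FROM mytable WHERE start_datetime != ?", (value,))


def all_at_once(conn):
    # A single statement that compares against the whole set of values.
    placeholders = ", ".join("?" for _ in keep)
    conn.execute(
        f"DELETE FROM mytable WHERE start_datetime NOT IN ({placeholders})", keep
    )


print(remaining_after(one_at_a_time))
print(remaining_after(all_at_once))
```

With two or more values the loop empties the table, while the set-based delete leaves exactly the listed rows.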
Incidentally, extractValue() is deprecated, so it would be better to use XMLQuery or XMLTable, e.g.:
FOR r IN (
SELECT *
FROM XMLTable('/ROWSET/ROW/START_DATETIME'
PASSING x COLUMNS deleted VARCHAR2(19) PATH '.')
) LOOP

SQL mobile number validation

I have a SQL database in which I would like to filter out all the valid mobile numbers.
I currently use the following:
WHERE pn.PhoneNumber LIKE '+[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
OR pn.PhoneNumber LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
OR pn.PhoneNumber LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
OR pn.PhoneNumber LIKE '[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]'
However, I still get numbers such as 0000000, 0, 0000, etc.
Some of the numbers aren't Irish mobiles either, as they don't begin with 08.
To fix this, if I wanted the number to begin with 087, would I just use [0][8][7] instead of the leading [0-9] groups?
Try testing this. It returns numbers that start with 087 and have a length of 10:
select * from table where mobile_number like '087%' and LEN(mobile_number)=10
I would create a table containing all the prefixes that I was interested in and then use that to do the validation.
Something like ....
Create table Allowed ( Prefix VARCHAR(10) )
insert into allowed values ( '071' );
insert into allowed values ( '072' );
insert into allowed values ( '+44' );
select count(prefix) as OK
from allowed
where REPLACE( pn.phonenumber, ' ', '') like prefix || '%'
You can still do the numeric validation separately, or combine the regexp part into the suffix added above.
I know this is out of date but just developed code for a UK Mobile Number that someone might find useful. It checks with or without a space, hyphen etc after the first 5 numbers and returns a blank if the number isn't valid - I need to upload records to a third party who reject records with invalid mobile numbers but accept blanks.
Mobile = CASE
    WHEN MobileTel LIKE '07[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%' THEN left(MobileTel,11)
    WHEN MobileTel LIKE '07[0-9][0-9][0-9][^0-9][0-9][0-9][0-9][0-9][0-9][0-9]%' THEN (LEFT(MobileTel,5)+substring(MobileTel,7,6))
    ELSE ''
END
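For comparison, the same two-pattern logic can be sketched in Python with regular expressions. This is only an illustration of what the CASE expression does, not a replacement for it; `normalise_uk_mobile` is a hypothetical name, and like the LIKE patterns (which end in %) it only checks the start of the string.

```python
import re

# '07' followed by nine digits, anything allowed afterwards.
PLAIN = re.compile(r"07[0-9]{9}")
# '07' plus three digits, one non-digit separator, then six digits.
SEPARATED = re.compile(r"(07[0-9]{3})[^0-9]([0-9]{6})")


def normalise_uk_mobile(raw: str) -> str:
    """Return the 11-digit number, or an empty string if neither pattern matches."""
    match = PLAIN.match(raw)
    if match:
        return match.group(0)
    match = SEPARATED.match(raw)
    if match:
        # Drop the separator, as LEFT(...,5) + SUBSTRING(...,7,6) does.
        return match.group(1) + match.group(2)
    return ""


print(normalise_uk_mobile("07123 456789"))
```

Values such as 0000000 fall through to the empty string because they match neither pattern.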
