How to split a non-delimited string in T-SQL - sql-server

I need to split a string value that has no delimiter. I work in banking and I am selecting a GL account number and need to separate the account number from the account branch number. The issue is that both values are passed as one long string: 10 digits for the account number and 4 for the account branch. For example, 01234567891234 needs to be changed to 0123456789.1234.
Everything I find says to use CHARINDEX or SUBSTRING. From my understanding, both require a character to search for. If anyone can provide another function and some example code, that would be great. Thanks.

You can do something simple like
left(str, 10) + '.' + right(str, 4)
if you know it'll always be a 14 character string

You could also use the STUFF function, as below:
declare @accNo varchar(14) = '01234567891234'
select stuff(@accNo,11,0,'.')
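To compare the two suggestions side by side, here is a minimal sketch run against a table variable. The names @gl and RawAccount are made up for illustration, and it assumes every value is exactly 14 characters, as described in the question. It also shows that SUBSTRING works from positions alone and does not need a character to search for.

```sql
-- Hypothetical sample data; @gl and RawAccount are illustrative names only.
DECLARE @gl TABLE (RawAccount varchar(14));
INSERT INTO @gl (RawAccount) VALUES ('01234567891234'), ('98765432100001');

SELECT RawAccount,
       LEFT(RawAccount, 10) + '.' + RIGHT(RawAccount, 4) AS ViaLeftRight,
       STUFF(RawAccount, 11, 0, '.')                     AS ViaStuff,      -- insert '.' before position 11, deleting nothing
       SUBSTRING(RawAccount, 1, 10)                      AS AccountNumber, -- position-based, no search character
       SUBSTRING(RawAccount, 11, 4)                      AS BranchNumber
FROM @gl
WHERE LEN(RawAccount) = 14;  -- guard: skip rows that do not have the expected length
```

Both ViaLeftRight and ViaStuff should give the same dotted value; STUFF is shorter, while LEFT/RIGHT makes the two parts available separately.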

Related

How to take apart information between hyphens in SQL Server

How would I take apart a column that contains string:
92873-987dsfkj80-2002-04-11
20392-208kj48384-2008-01-04
Data would look like this:
Filename Yes/No Key
Abidabo Yes 92873-987dsfkj80-2002-04-11
Bibiboo No 20392-208kj48384-2008-01-04
Want it to look like this:
Filename Yes/No Key
Abidabo Yes 92873-987dsfkj80-20020411
Bibiboo No 20392-208kj48384-20080104
whereby I would like to concat the dates in the end as 20020411 and 20080104. From the right side, the information is the same always. From the left it is not, otherwise I could have concatenated it. It is not an import issue.
As mentioned in the comments already, storing data like this is a bad idea. However, you can obtain the dates from those strings by using a RIGHT function like so:
SELECT RIGHT('20392-208kj48384-2008-01-04', 10)
Output:
2008-01-04
Depending on the SQL Server version you are using, you can use STRING_SPLIT, which requires COMPATIBILITY_LEVEL 130. You can also build your own user-defined function to split the contents of a field and manipulate it as you need. You can find some useful examples of SPLIT functions in this thread:
Split function equivalent in T-SQL?
Assuming I'm correct and the date part is always on the right side of the string, you can simply use RIGHT and CAST to get the date (assuming, again, that the date is represented as yyyy-mm-dd):
SELECT CAST(RIGHT(YourColumn, 10) As Date)
FROM YourTable
However, Panagiotis is correct in his comment - You shouldn't store data like that. Each column in the database should hold only a single point of data, be it string, number or date.
Update following your comment and the updated question:
SELECT LEFT(YourColumn, LEN(YourColumn) - 10) + REPLACE(RIGHT(YourColumn, 10), '-', '')
FROM YourTable
will return the desired results.
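As a rough check of that query, here is a sketch that runs it against the two sample rows from the question. The table variable @files and its column names are assumptions for illustration; it relies on the date always being the last 10 characters.

```sql
-- Hypothetical table variable mirroring the sample data in the question.
DECLARE @files TABLE ([Filename] varchar(50), YesNo varchar(3), [Key] varchar(100));
INSERT INTO @files ([Filename], YesNo, [Key])
VALUES ('Abidabo', 'Yes', '92873-987dsfkj80-2002-04-11'),
       ('Bibiboo', 'No',  '20392-208kj48384-2008-01-04');

SELECT [Filename],
       YesNo,
       -- keep everything except the last 10 characters, then append the date without hyphens
       LEFT([Key], LEN([Key]) - 10) + REPLACE(RIGHT([Key], 10), '-', '') AS [Key]
FROM @files;
```

Only the trailing date loses its hyphens; the hyphens in the left part are untouched because REPLACE is applied to the RIGHT(..., 10) piece alone.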

T-SQL Check string pattern

Just curious: is there an easier way to check a string pattern than the following method?
Example: the AccountNumber attribute should allow exactly 10 digits as its value, like 0123456789.
So I wrote the query like this:
@input like '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
I am just wondering whether there is an alternative way to write this query. For values that require exactly 100 digits, nobody wants to count while pasting [0-9] over and over, right? I noticed there is something in C# like ^(\d{10})$, but I cannot find such a matching method in T-SQL. Does a similar method exist?
Your logic is fine. You can also write this as:
where len(AccountNumber) = 10 and AccountNumber not like '%[^0-9]%'
That is, the length is 10 and it contains no characters that are not digits.
You could use
@input like REPLICATE('[0-9]',10) COLLATE Latin1_General_100_BIN2
The explicit collate clause is because in some collations the range will match things that aren't strictly digits.
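Putting the REPLICATE idea into a runnable form, here is a small sketch. The variable @digits is an assumption added for illustration, so the required length can be changed in one place (for example to 100).

```sql
DECLARE @input  varchar(200) = '0123456789';
DECLARE @digits int = 10;  -- hypothetical parameter: required number of digits

SELECT CASE
         -- REPLICATE builds '[0-9][0-9]...' with one character class per required digit
         WHEN @input LIKE REPLICATE('[0-9]', @digits) COLLATE Latin1_General_100_BIN2
         THEN 'valid'
         ELSE 'invalid'
       END AS Result;
```

Because the pattern has no % wildcard, it matches only strings of exactly @digits characters, so no separate LEN check is needed.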

Matching Regular Expressions In SQL Server

I am trying to extract the id of an Android app from its URL, but I am getting extra characters.
I am using the REPLACE function in SQL Server. Below are two sample URLs, each followed by the result I currently get:
https://play.google.com/store/apps/details?id=com.flipkart.android&hl=en com.flipkart.android
https://play.google.com/store/apps/details?hl=en_US&id=com.surveysampling.mobile.quickthoughts&referrer=mat_click_id%3Df1901cef59f79b1542d05a1fdfa67202-20150429-5128 en_US&id=com.surveysampling.mobile.quickthoughts&r
I am doing this right now:
SELECT
SUBSTRING(REPLACE(PREVIEW, '&hl=en',''), CHARINDEX('?', PREVIEW) + 4 , 50)
FROM OFFERS_TABLE;
For the first URL I get com.flipkart.android, which is correct, but for the second I get en_US&id=com.surveysampling.mobile.quickthoughts&r.
I want to remove en_US&id= from the start of it and &r from its end.
Can someone point me to a post or URL that I can refer to?
What you are actually trying to do is extract the string that follows id= up to the next &, which is the separator for variables in a URL. With that condition in mind, I came up with the following regex.
Regex: (?<=id=)[^&]*
Explanation: It uses a lookbehind assertion, so the match must be preceded by id=, and it then runs until the first & is found.
It seems like you've made some assumptions about lengths. The &r is appearing because that is where 50 characters ends. You are also getting the en_US because you assumed 4 characters at the beginning, but your second string has more. Perhaps you can split on & and then look for the variable that begins with id=.
It seems like a function like this would help.
http://www.sqlservercentral.com/blogs/querying-microsoft-sql-server/2013/09/19/how-to-split-a-string-by-delimited-char-in-sql-server/
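As an alternative to a full split function, the "find the variable that begins with id=" idea can be sketched with CHARINDEX and SUBSTRING alone. This is only a sketch: the table variable @offers stands in for OFFERS_TABLE, and it assumes the parameter is named exactly id and appears at most once.

```sql
-- Hypothetical stand-in for OFFERS_TABLE, loaded with the two sample URLs.
DECLARE @offers TABLE (PREVIEW varchar(500));
INSERT INTO @offers (PREVIEW) VALUES
 ('https://play.google.com/store/apps/details?id=com.flipkart.android&hl=en'),
 ('https://play.google.com/store/apps/details?hl=en_US&id=com.surveysampling.mobile.quickthoughts&referrer=mat_click_id%3Df1901cef59f79b1542d05a1fdfa67202-20150429-5128');

SELECT o.PREVIEW,
       CASE WHEN p.StartPos > 0
            -- the value starts 4 characters after '&id=' and ends just before the next '&'
            THEN SUBSTRING(q.Query, p.StartPos + 4,
                           CHARINDEX('&', q.Query + '&', p.StartPos + 4) - (p.StartPos + 4))
       END AS AppId
FROM @offers AS o
-- treat '?' as '&' so that id is found whether it is the first parameter or a later one
CROSS APPLY (SELECT REPLACE(o.PREVIEW, '?', '&') AS Query) AS q
CROSS APPLY (SELECT CHARINDEX('&id=', q.Query) AS StartPos) AS p;
```

Searching for '&id=' rather than 'id=' avoids matching a parameter whose name merely ends in id, and appending '&' to the searched string covers the case where id is the last parameter. Rows without an id parameter come back as NULL.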

T-SQL Regex for social security number (SQL Server 2008 R2)

I need to find invalid social security numbers in a varchar field in a SQL Server 2008 database table. (Valid SSNs are defined as being in the format ###-##-####. It doesn't matter what the numbers are, as long as they are in that "3-digit dash 2-digit dash 4-digit" pattern.)
I do have a working regex:
SELECT *
FROM mytable
WHERE ssn NOT LIKE '[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]'
That does find the invalid SSNs in the column, but I know (okay - I'm pretty sure) that there is a way to shorten that to indicate that the previous pattern can have x iterations.
I thought this would work:
'[0-9]{3}-[0-9]{2}-[0-9]{4}'
But it doesn't.
Is there a shorter regex than the one above in the select, or not? Or perhaps there is, but T-SQL/SQL Server 2008 doesn't support it!?
If you plan to get a shorter variant of your LIKE expression, then the answer is no.
In T-SQL, you can only use the following wildcards in the pattern:
%
Any string of zero or more characters.
WHERE title LIKE '%computer%' finds all book titles with the word computer anywhere in the book title.
_ (underscore)
Any single character.
WHERE au_fname LIKE '_ean' finds all four-letter first names that end with ean (Dean, Sean, and so on).
[ ]
Any single character within the specified range ([a-f]) or set ([abcdef]).
WHERE au_lname LIKE '[C-P]arsen' finds author last names ending with arsen and starting with any single character between C and P, for example Carsen, Larsen, Karsen, and so on. In range searches, the characters included in the range may vary depending on the sorting rules of the collation.
[^]
Any single character not within the specified range ([^a-f]) or set ([^abcdef]).
So, your LIKE statement is already the shortest possible expression. No limiting quantifiers (those like {min,max}) can be used, nor shorthand classes like \d.
If you were using MySQL, you could use a richer set of regex utilities, but that is not the case here.
I suggest using another solution like this:
-- Use `REPLICATE` if you really want to use a number to repeat
DECLARE @rgx nvarchar(max) = REPLICATE('#', 3) + '-' +
    REPLICATE('#', 2) + '-' +
    REPLICATE('#', 4);
-- or use your simple format string instead (declare @rgx only once):
-- DECLARE @rgx nvarchar(max) = '###-##-####';
-- then use this to get your final `LIKE` string.
SET @rgx = REPLACE(@rgx, '#', '[0-9]');
And you can also use something like '_' for characters then replace it with [A-Z] and so on.
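To show how the generated pattern would be used for the original problem, here is a minimal sketch. The table variable @people and its sample values are made up for illustration; the real query would run against mytable.

```sql
-- Build the LIKE pattern from a readable format string.
DECLARE @fmt     varchar(20)  = '###-##-####';
DECLARE @pattern varchar(100) = REPLACE(@fmt, '#', '[0-9]');

-- Hypothetical sample rows standing in for mytable.
DECLARE @people TABLE (ssn varchar(20));
INSERT INTO @people (ssn)
VALUES ('123-45-6789'), ('123456789'), ('12-345-6789'), ('abc-de-fghi');

-- Rows whose ssn does not have the 3-2-4 digit shape.
SELECT ssn
FROM @people
WHERE ssn NOT LIKE @pattern;
```

Note that rows where ssn is NULL are not returned by NOT LIKE, so add OR ssn IS NULL if missing values should also count as invalid.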

String manipulation in SQL Server-- adding placeholder characters

I'm a little green when it comes to SQL Server string manipulation functions. If I have a string with six characters in it, say:
DECLARE @p_MyStringVariable VARCHAR (100)
SET @p_MyStringVariable = 'FANFFF'
And I want to insert, say, the letter 'M' in the first and seventh positions of the final string and assign that to another VARCHAR variable to read 'MFANFFMF', how can I best do that? And am I correct in reading that SQL Server strings are indexed starting from one, instead of zero? I'm thinking of the SUBSTRING() function, for instance.
(Note that some strings will be up to 100 characters in length, thus the VARCHAR(100) declaration above, even for a six-character string)
Thanks much for your help.
You could also take a look at the STUFF function.
SELECT STUFF(STUFF(@p_MyStringVariable,1,0,'M'),7,0,'M')
Yes, SQL Server indexes varchar etc columns starting at one.
To insert at specific points, use STUFF (and see Joe's answer for examples)
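For completeness, here is the nested STUFF call as a self-contained sketch that assigns the result to a second variable, as the question asks. The name @result is an assumption; it is declared two characters wider than the input to leave room for the inserted letters.

```sql
DECLARE @p_MyStringVariable varchar(100) = 'FANFFF';
DECLARE @result varchar(102);  -- hypothetical target variable: input length plus two inserted characters

-- Inner STUFF inserts 'M' at position 1 (giving 'MFANFFF');
-- outer STUFF then inserts 'M' at position 7 of that new string.
SET @result = STUFF(STUFF(@p_MyStringVariable, 1, 0, 'M'), 7, 0, 'M');

SELECT @result                  AS Result,     -- 'MFANFFMF'
       SUBSTRING(@result, 1, 1) AS FirstChar;  -- position 1 is the first character
```

One thing to watch: the second position refers to the string after the first insert, and STUFF returns NULL when the start position is beyond the length of the string, so shorter inputs need a length check first.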
