Comma delimited string to array using regular expressions

I have a string: strString = first,last,,4443334444
I want to use a regular expression to split this string into an array.
I'm using this regular expression [\""].+?[\""]|[^,]+ but it is ignoring the empty field after the word last.
So, my array looks something like this:
[0] => first
[1] => last
[2] => 4443334444
instead of:
[0] => first
[1] => last
[2] =>
[3] => 4443334444
I would like to keep the empty element.
Any help would be appreciated.

You may use
"[^"\\]*(?:\\.[^"\\]*)*"|[^,]+|(?<=^|,)(?=$|,)
See the regex demo
The expression consists of
"[^"\\]*(?:\\.[^"\\]*)*" - a double quoted string literal with escape sequence support
| - or
[^,]+ - 1 or more characters other than ,
| - or
(?<=^|,)(?=$|,) - any empty string that is either between commas, or between the start/end of string and a comma.
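To try the three alternatives outside a regex tester, here is a minimal Python sketch. It assumes Python's re module, which only accepts fixed-width lookbehinds, so (?<=^|,) is rewritten as the equivalent (?:^|(?<=,)). The name split_fields is made up for illustration.

```python
import re

# Same three alternatives as above. Python's re rejects the variable-width
# lookbehind (?<=^|,), so it is spelled (?:^|(?<=,)) here (an adaptation).
FIELD = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"|[^,]+|(?:^|(?<=,))(?=$|,)')


def split_fields(line):
    """Return every field of a comma-delimited line, including empty ones."""
    return FIELD.findall(line)


print(split_fields("first,last,,4443334444"))
# ['first', 'last', '', '4443334444']
```

Quoted fields are returned with their surrounding quotes, since the pattern matches the whole literal.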

A couple of issues with your expression.
First, [\""] is redundant; use ["] or, better, " (without the character class) instead.
Second, your actual problem is due to the + quantifier, which requires at least one character (there is none between the commas, so empty fields are never matched).
Third, this is probably some CSV output, so why not use explode() or similar functions?
If you insist on using a regular expression, you might get along with:
".*?"|[^,]*
See a demo on regex101.com.

Not sure if there's a way to get the element between the two commas, since I don't know of a regex for it. The best I could come up with is:
str.match(/(?:[^,]+)|,,/g)
=> ["first", "last", ",,", "4443334444"]
But you'll need to translate the ",," into an empty string.
Is there a reason why you're using regex? Does your language have a .split() function? https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split
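As a point of comparison with the regex approaches, here is a short Python sketch of the non-regex route suggested above. It assumes plain str.split for unquoted data and the standard csv module when fields may be quoted; the helper names and the quoted sample line are hypothetical.

```python
import csv


def split_plain(line):
    # A plain split keeps the empty field between two consecutive commas.
    return line.split(",")


def split_csv(line):
    # csv.reader also understands double-quoted fields that contain commas.
    return next(csv.reader([line]))


print(split_plain("first,last,,4443334444"))
# ['first', 'last', '', '4443334444']
print(split_csv('"Doe, Jane",last,,4443334444'))
# ['Doe, Jane', 'last', '', '4443334444']
```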


Regex empty string or number [duplicate]

I have the following Regular Expression which matches an email address format:
^[\w\.\-]+@([\w\-]+\.)+[a-zA-Z]+$
This is used for validation with a form using JavaScript. However, this is an optional field. Therefore how can I change this regex to match an email address format, or an empty string?
From my limited regex knowledge, I think \b matches an empty string, and | means "Or", so I tried to do the following, but it didn't work:
^[\w\.\-]+@([\w\-]+\.)+[a-zA-Z]+$|\b
To match pattern or an empty string, use
^$|pattern
Explanation
^ and $ are the beginning and end of the string anchors respectively.
| is used to denote alternates, e.g. this|that.
References
regular-expressions.info/Anchors and Alternation
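The ^$|pattern idea can be sketched in Python as follows. This is only an illustration: it assumes the e-mail pattern from the question with @ as the separator, and the function name is made up.

```python
import re

# "^$" accepts the empty string; the second alternative is the e-mail
# pattern from the question. Note that in Python, $ also matches just
# before a trailing newline.
EMAIL_OR_EMPTY = re.compile(r"^$|^[\w.\-]+@([\w\-]+\.)+[a-zA-Z]+$")


def is_valid_optional_email(value):
    return EMAIL_OR_EMPTY.match(value) is not None


print(is_valid_optional_email(""))                  # True
print(is_valid_optional_email("user@example.com"))  # True
print(is_valid_optional_email("not-an-email"))      # False
```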
On \b
\b in most flavors is a "word boundary" anchor. It is a zero-width match, i.e. an empty string, but it only matches those strings at very specific places, namely at the boundaries of a word.
That is, \b is located:
Between consecutive \w and \W (either order):
i.e. between a word character and a non-word character
Between ^ and \w
i.e. at the beginning of the string if it starts with \w
Between \w and $
i.e. at the end of the string if it ends with \w
References
regular-expressions.info/Word Boundaries
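To see where \b matches, the following Python sketch lists the boundary positions in a sample string. The helper name is hypothetical. It also shows why \b cannot stand in for "empty string": with no word character present, there is no boundary to match.

```python
import re


def boundary_positions(text):
    """Return the indexes at which the zero-width \\b anchor matches."""
    return [m.start() for m in re.finditer(r"\b", text)]


print(boundary_positions("hi there"))  # [0, 2, 3, 8]
print(boundary_positions(""))          # [] (no word character, no boundary)
```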
On using regex to match e-mail addresses
This is not trivial depending on specification.
Related questions
What is the best regular expression for validating email addresses?
Regexp recognition of email address hard?
How far should one take e-mail address validation?
An alternative would be to place your regexp in non-capturing parentheses. Then make that expression optional using the ? quantifier, which will look for 0 (i.e. empty string) or 1 instances of the non-captured group.
For example:
/(?: some regexp )?/
In your case the regular expression would look something like this:
/^(?:[\w\.\-]+@([\w\-]+\.)+[a-zA-Z]+)?$/
No | "or" operator necessary!
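The optional-group form can be wrapped in a small helper. The Python sketch below assumes a made-up function make_optional that wraps any pattern as ^(?:pattern)?$; it is an illustration of the idea, not part of any library.

```python
import re


def make_optional(pattern):
    """Compile a regex matching either the empty string or the whole pattern."""
    return re.compile(r"^(?:" + pattern + r")?$")


optional_number = make_optional(r"\d+")
optional_email = make_optional(r"[\w.\-]+@(?:[\w\-]+\.)+[a-zA-Z]+")

print(bool(optional_number.match("")))      # True
print(bool(optional_number.match("123")))   # True
print(bool(optional_number.match("12a")))   # False
print(bool(optional_email.match("a@b.co"))) # True
```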
Here is the Mozilla documentation for JavaScript Regular Expression syntax.
I'm not sure why you'd want to validate an optional email address, but I'd suggest you use
^$|^[^@\s]+@[^@\s]+$
meaning
^$ empty string
| or
^ beginning of string
[^@\s]+ any character but @ or whitespace
@
[^@\s]+
$ end of string
You won't stop fake emails anyway, and this way you won't stop valid addresses.
\b matches a word boundary. I think you can use ^$ for empty string.
^$ did not work for me if there were multiple patterns in regex.
Another solution:
/(pattern1)(pattern2)?/g
"pattern2" is optional. If empty, not matched.
? matches (pattern2) between zero and one times.
Tested here ("m" is there for multi-line example purposes): https://regex101.com/r/mezfvx/1

Escaping square brackets when using LIKE operator in sql [duplicate]

I am trying to filter items with a stored procedure using like. The column is a varchar(15). The items I am trying to filter have square brackets in the name.
For example: WC[R]S123456.
If I do a LIKE 'WC[R]S123456' it will not return anything.
I found some information on using the ESCAPE keyword with LIKE, but how can I use it to treat the square brackets as a regular string?
LIKE 'WC[[]R]S123456'
or
LIKE 'WC\[R]S123456' ESCAPE '\'
Should work.
Let's say you want to match the literal its[brac]et.
You don't need to escape the ] as it has special meaning only when it is paired with [.
Therefore escaping [ suffices to solve the problem. You can escape [ by replacing it with [[].
I needed to exclude names that started with an underscore from a query, so I ended up with this:
WHERE b.[name] not like '\_%' escape '\' -- use \ as the escape character
Here is what I actually used:
like 'WC![R]S123456' ESCAPE '!'
The ESCAPE keyword is used if you need to search for special characters like % and _, which are normally wildcards. If you specify ESCAPE, SQL Server treats a wildcard character that is preceded by the escape character as a literal character.
Here's a good article with some more examples
SELECT columns FROM table WHERE
column LIKE '%[[]SQL Server Driver]%'
-- or
SELECT columns FROM table WHERE
column LIKE '%\[SQL Server Driver]%' ESCAPE '\'
According to documentation:
You can use the wildcard pattern matching characters as literal
characters. To use a wildcard character as a literal character,
enclose the wildcard character in brackets.
You need to escape these three characters %_[:
'5%' LIKE '5[%]' -- true
'5$' LIKE '5[%]' -- false
'foo_bar' LIKE 'foo[_]bar' -- true
'foo$bar' LIKE 'foo[_]bar' -- false
'foo[bar' LIKE 'foo[[]bar' -- true
'foo]bar' LIKE 'foo]bar' -- true
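When the search term is built in application code, the bracket rule above can be applied programmatically. The Python sketch below is one way to do it under the SQL Server semantics described here; the function name bracket_escape is hypothetical. A single regex pass is used so that brackets added for one character are not escaped again.

```python
import re


def bracket_escape(value):
    """Wrap each of the LIKE wildcards %, _ and [ in square brackets."""
    return re.sub(r"[\[%_]", lambda m: "[" + m.group(0) + "]", value)


print(bracket_escape("WC[R]S123456"))  # WC[[]R]S123456
print(bracket_escape("foo_bar"))       # foo[_]bar
print(bracket_escape("5%"))            # 5[%]
```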
If you would need to escape special characters like '_' (underscore), as it was in my case, and you are not willing/not able to define an ESCAPE clause, you may wish to enclose the special character with square brackets '[' and ']'.
This explains the meaning of the "weird" string '[[]' - it just embraces the '[' character with square brackets, effectively escaping it.
My use case was to specify the name of a stored procedure with underscores in it as a filter criteria for the Profiler. So I've put string '%name[_]of[_]a[_]stored[_]procedure%' in a TextData LIKE field and it gave me trace results I wanted to achieve.
Here is a good example from the documentation:
LIKE (Transact-SQL) - Using Wildcard Characters As Literals
There is a problem in that while
LIKE 'WC[[]R]S123456'
and
LIKE 'WC\[R]S123456' ESCAPE '\'
both work for SQL Server, neither works for Oracle.
It seems that there isn't any ISO/IEC 9075 way to recognize a pattern involving a left bracket.
Instead of '\' or another character on the keyboard, you can also use special characters that aren't on the keyboard. Depending on your use case this might be necessary, if you don't want user input to accidentally be used as an escape character.
Use the following.
To search for user input literally, use ESCAPE; this requires the following replacements for all special characters (the list below covers all of those in SQL Server).
The single quote (') is not included here, since it does not affect the LIKE clause; it is a matter of string concatenation.
Replacements for "-", "^" and "]" are not required, because we are escaping "[".
String FormattedString = "UserString".Replace("ð","ðð").Replace("_", "ð_").Replace("%", "ð%").Replace("[", "ð[");
Then the SQL query should look like the following. (In a parameterised query, the wildcards can be added to the string after the above replacement.)
To search for an exact string:
like 'FormattedString' ESCAPE 'ð'
To search for values that end with the string:
like '%FormattedString' ESCAPE 'ð'
To search for values that start with the string:
like 'FormattedString%' ESCAPE 'ð'
To search for values that contain the string:
like '%FormattedString%' ESCAPE 'ð'
And so on for other pattern matching. But direct user input needs to be formatted as mentioned above.
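The replacement chain and the parameterised query can be sketched in Python as follows. Several things here are assumptions made only so that the example runs anywhere: it uses '!' rather than 'ð' as the escape character, the names escape_like, find_containing and the items table are made up, and it runs against an in-memory SQLite database instead of SQL Server. SQLite does not treat [ as a wildcard, so the demo exercises only % and _.

```python
import sqlite3


def escape_like(value, esc="!"):
    """Prefix the escape character itself, then _, % and [, with esc."""
    for ch in (esc, "_", "%", "["):  # the escape character must go first
        value = value.replace(ch, esc + ch)
    return value


def find_containing(conn, user_text):
    # Wildcards are added after escaping; the pattern is passed as a parameter.
    pattern = "%" + escape_like(user_text) + "%"
    sql = "SELECT name FROM items WHERE name LIKE ? ESCAPE '!' ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)",
                 [("100%_sure",), ("100xxsure",)])

print(escape_like("50%_[x]!"))             # 50!%!_![x]!!
print(find_containing(conn, "100%_sure"))  # ['100%_sure']
```

Without the escaping step, the pattern %100%_sure% would also match 100xxsure, because % and _ would act as wildcards.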

Matching Regular Expressions In SQL Server

I am trying to extract the id of an Android app from its URL, but I am getting extra characters.
I am using the REPLACE function in SQL Server. Below are two sample URLs, each followed by the value I currently get:
https://play.google.com/store/apps/details?id=com.flipkart.android&hl=en com.flipkart.android
https://play.google.com/store/apps/details?hl=en_US&id=com.surveysampling.mobile.quickthoughts&referrer=mat_click_id%3Df1901cef59f79b1542d05a1fdfa67202-20150429-5128 en_US&id=com.surveysampling.mobile.quickthoughts&r
I am doing this right now:
SELECT
SUBSTRING(REPLACE(PREVIEW, '&hl=en',''), CHARINDEX('?', PREVIEW) + 4 , 50)
FROM OFFERS_TABLE;
For the 1st I am getting com.flipkart.android, which is correct, but for the 2nd I am getting en_US&id=com.surveysampling.mobile.quickthoughts&r.
I want to remove en_US&id from the start of it and &r from its end.
Can someone point me to a post or URL that I can refer to?
What you are actually trying to do is extract the string that follows id= up to the next &, which is the separator between variables in a URL. Given this condition, I came up with the following regex.
Regex: (?<=id=)[^&]*
Explanation: It uses a lookbehind assertion, so the match starts right after id= and continues until the first & is found.
Regex101 Demo
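To check the pattern against the two sample URLs from the question, here is a minimal Python sketch. The function name is made up, and the code only illustrates the regex itself, not how it would be run inside SQL Server. Note that the lookbehind would also fire after a parameter whose name merely ends in id, such as uid=.

```python
import re

# Lookbehind for "id=", then everything up to the next "&" (or end of string).
ID_PARAM = re.compile(r"(?<=id=)[^&]*")


def extract_app_id(url):
    match = ID_PARAM.search(url)
    return match.group(0) if match else None


url1 = "https://play.google.com/store/apps/details?id=com.flipkart.android&hl=en"
url2 = ("https://play.google.com/store/apps/details?hl=en_US"
        "&id=com.surveysampling.mobile.quickthoughts"
        "&referrer=mat_click_id%3Df1901cef59f79b1542d05a1fdfa67202-20150429-5128")

print(extract_app_id(url1))  # com.flipkart.android
print(extract_app_id(url2))  # com.surveysampling.mobile.quickthoughts
```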
It seems like you've made some assumptions about lengths. The &r is appearing because that is where the 50 characters end. You are also getting the en_US because you assumed 4 characters at the beginning, but your second string has more. Perhaps you can split on & and then look for the variable that begins with id=.
It seems like a function like this would help.
http://www.sqlservercentral.com/blogs/querying-microsoft-sql-server/2013/09/19/how-to-split-a-string-by-delimited-char-in-sql-server/
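The split-on-& idea can be sketched in Python as below. This only illustrates the logic, which would still have to be ported to T-SQL with a split function such as the one linked above; the function name and the example.com URLs are hypothetical.

```python
def extract_id_by_split(url):
    """Split the query string on '&' and return the value of the id variable."""
    _, _, query = url.partition("?")
    for pair in query.split("&"):
        if pair.startswith("id="):
            return pair[len("id="):]
    return None


print(extract_id_by_split(
    "https://play.google.com/store/apps/details?hl=en_US"
    "&id=com.surveysampling.mobile.quickthoughts&referrer=x"))
# com.surveysampling.mobile.quickthoughts
```

Because each variable is compared from its start, a parameter such as uid= is not mistaken for id=, and no length assumptions are needed.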

select first two characters of values in a concatenated string

I am trying to create a formula field that checks a string that is a series of concatenated values separated by a comma. I want to check the first two characters of each comma separated value in the string. For example, the string pattern could be: abcd,efgh,ijkl,mnop,qrst,uvwx
In my formula I'd like to check if the first two characters are 'ab','ef'
If so, I would return true, else false.
Thanks.
To do this properly, you need to use a regular expression. Unfortunately the REGEX function is not available in formula fields. It is, however, available in formulas in Validation Rules and in Workflow Rules. You can, therefore, specify the below formula in either of a Validation or Workflow Rule:
OR(
  AND(
    NOT(
      BEGINS( KXENDev__Languages__c, "ab" )
    ),
    NOT(
      BEGINS( KXENDev__Languages__c, "ef" )
    )
  ),
  REGEX( KXENDev__Languages__c , ".*,(?!ab|ef).*")
)
If it's a Validation Rule, you're done -- this formula will create an error if any of the entries do not start with "ab" or "ef". If it's a Workflow Rule, then you can add a Field Update to it to update some field with False when this formula is true (if this formula is true then there is at least one item that doesn't start with ab or ef, so that would make your field False).
Some may ask "What's with the BEGINS statements? Couldn't you have done this all with one REGEX?" Yes, I probably could, but that makes for an increasingly complex REGEX statement, and these are quite difficult to debug in Salesforce.com, so I prefer to keep my REGEXes in Salesforce.com as simple as possible.
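Since the formula is hard to debug inside Salesforce, its logic can be tried out elsewhere first. The Python sketch below mirrors it under the assumption that REGEX has to match the whole field value (hence fullmatch); the function name is made up. It returns True when at least one entry does not start with ab or ef.

```python
import re

# Some comma is followed by an entry that starts with neither "ab" nor "ef".
INVALID_LATER_ENTRY = re.compile(r".*,(?!ab|ef).*")


def has_invalid_entry(value):
    first_ok = value.startswith(("ab", "ef"))  # the two BEGINS checks
    return (not first_ok) or INVALID_LATER_ENTRY.fullmatch(value) is not None


print(has_invalid_entry("abcd,efgh"))       # False
print(has_invalid_entry("abcd,efgh,ijkl"))  # True
print(has_invalid_entry("ijkl,abcd"))       # True
```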
I suggest you search for ',ab' and ',ef' using the CONTAINS function. But first you need to reimplement the method that composes this string so that it puts ',' before the first substring. In the end, the returned string should look like ',abcd,efgh,ijkl,mnop,qrst,uvwx'.
If you are not able to reimplement the method that composes this string, use LEFT([your string goes here], 2) to check the first two characters.
