Informatica Code to SQL Code - sql-server

I am translating the following Informatica expression to SQL. I am running into some issues and would appreciate help with this code:
SUBSTR(COV_REINS_CONCAT_BK,INSTR(COV_REINS_CONCAT_BK,'|',1,3) +1,2)
That is, I am looking for the equivalent code to produce the same results in SQL Server.
I appreciate anyone's help!

SUBSTR's equivalent is SUBSTRING.
INSTR's equivalent is CHARINDEX, but it has the first 2 parameters reversed, and does not support the 4th parameter (occurrence).
The expression returns 2 characters after the third occurrence of | (pipe).
Example: It will return 'FG' for 'A|BC|DE|FGH'.
So the translation will be:
SUBSTRING(COV_REINS_CONCAT_BK,
          1 + CHARINDEX('|', COV_REINS_CONCAT_BK,
          1 + CHARINDEX('|', COV_REINS_CONCAT_BK,
          1 + CHARINDEX('|', COV_REINS_CONCAT_BK))),
          2)
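To sanity-check the nesting, the same logic can be sketched outside the database. The Python below is only an illustration: the helper names charindex, substring and two_after_third_pipe are made up for this sketch, and they mimic the 1-based behaviour of CHARINDEX and SUBSTRING only for the simple case where the value contains at least three pipes.

```python
def charindex(needle: str, haystack: str, start: int = 1) -> int:
    """Hypothetical stand-in for T-SQL CHARINDEX: 1-based position, 0 if absent."""
    # str.find is 0-based and returns -1 when nothing is found, so shift by one.
    return haystack.find(needle, max(start, 1) - 1) + 1


def substring(text: str, start: int, length: int) -> str:
    """Hypothetical stand-in for T-SQL SUBSTRING, assuming start >= 1."""
    return text[start - 1:start - 1 + length]


def two_after_third_pipe(value: str) -> str:
    # Each search starts one character after the previous pipe,
    # exactly like the nested 1 + CHARINDEX(...) calls.
    first = charindex("|", value)
    second = charindex("|", value, first + 1)
    third = charindex("|", value, second + 1)
    return substring(value, third + 1, 2)


print(two_after_third_pipe("A|BC|DE|FGH"))  # FG
```

Values with fewer than three pipes are not covered by this sketch; what the two systems return in that case should be checked separately.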

Related

Snowflake and Regular Expressions - issue when implementing known good expression in SF

I'm looking for some assistance in debugging a REGEXP_REPLACE() statement.
I have been using an online regular expressions editor to build expressions, and then the SF regexp_* functions to implement them. I've attempted to remain consistent with the SF regex implementation, but I'm seeing an inconsistency in the returned results that I'm hoping someone can explain :)
My intent is to replace commas within the text (excluding commas inside double-quoted text) with a new delimiter (#^#).
Sample text string:
"Foreign Corporate Name Registration","99999","Valuation Research",,"Active Name",02/09/2020,"02/09/2020","NEVADA","UNITED STATES",,,"123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES","123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES",,,,,,,,,,,,
RegEx command and Substitution (working in regex101.com):
([("].*?["])*?(,)
\1#^#
regex101.com Result:
"Foreign Corporate Name Registration"#^#"99999"#^#"Valuation Research"#^##^#"Active Name"#^#02/09/2020#^#"02/09/2020"#^#"NEVADA"#^#"UNITED STATES"#^##^##^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^##^##^##^##^##^##^##^##^##^##^##^#
When I try and implement this same logic in SF using REGEXP_REPLACE(), I am using the following statement:
SELECT TOP 500
A.C1
,REGEXP_REPLACE((A."C1"),'([("].*?["])*?(,)','\\1#^#') AS BASE
FROM
"<Warehouse>"."<database>"."<table>" AS A
This statement returns the result for BASE:
"Foreign Corporate Name Registration","99999","Valuation Research",,"Active Name",02/09/2020,"02/09/2020","NEVADA","UNITED STATES",,,"123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES","123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES"#^##^##^##^##^##^##^##^##^##^##^##^#
As you can see when comparing the results, the SF result set is only replacing commas at the tail-end of the text.
Can anyone tell me why regex101.com and SF return different results for the same expression? Is my expression non-compliant with the SF implementation of RegEx, and if so, why?
Many many thanks for your time and effort reading this far!
Happy Wednesday,
Casey.
The use of .*? to achieve lazy matching is limited to PCRE, which Snowflake does not support. To see this, in regex101.com, change your "flavor" to anything other than PCRE (PHP); you will see that your ([("].*?["])*?(,) regex no longer achieves what you are expecting.
I believe that this will work for your purposes:
REGEXP_REPLACE(A.C1,'("[^"]*")*,','\\1#^#')
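The pattern itself can be tried outside Snowflake. The sketch below uses Python's re module, not Snowflake's engine, so it only shows what the expression is meant to do; the function name swap_delimiter and the short sample line are invented for the illustration.

```python
import re

# An optional run of complete double-quoted fields, followed by one comma.
PATTERN = re.compile(r'("[^"]*")*,')


def swap_delimiter(line: str, new_delim: str = "#^#") -> str:
    """Replace field-separating commas, leaving commas inside quotes alone."""
    # Group 1 is None when the comma is not preceded by a quoted field.
    return PATTERN.sub(lambda m: (m.group(1) or "") + new_delim, line)


print(swap_delimiter('"a,b","c",,d,'))
```

In this Python sketch, a quoted field that ends the line without a trailing comma is not protected, because the pattern needs a comma after the closing quote. The sample row in the question ends with commas, so it is not affected.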

Matching Regular Expressions In SQL Server

I am trying to extract the id of an Android app from its URL, but I am getting extra characters.
I am using the REPLACE function in SQL Server. Below are two sample URLs, each followed by the result I currently get:
https://play.google.com/store/apps/details?id=com.flipkart.android&hl=en com.flipkart.android
https://play.google.com/store/apps/details?hl=en_US&id=com.surveysampling.mobile.quickthoughts&referrer=mat_click_id%3Df1901cef59f79b1542d05a1fdfa67202-20150429-5128 en_US&id=com.surveysampling.mobile.quickthoughts&r
I am doing this right now:
SELECT
SUBSTRING(REPLACE(PREVIEW, '&hl=en',''), CHARINDEX('?', PREVIEW) + 4 , 50)
FROM OFFERS_TABLE;
For the first URL I get com.flipkart.android, which is correct, but for the second I get en_US&id=com.surveysampling.mobile.quickthoughts&r.
I want to remove en_US&id= from the start and &r from the end.
Can someone point me to a post or URL I can refer to?
What you are actually trying to do is extract the string that follows id= up to the next &, which is the separator between variables in a URL. Based on this condition I came up with the following regex.
Regex: (?<=id=)[^&]*
Explanation: It uses a lookbehind assertion to require that the match is preceded by id=, then matches everything up to (but not including) the first &.
Regex101 Demo
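A quick way to check the expression against both sample URLs is a small Python sketch. This only exercises the regex; it does not show how to run it inside SQL Server, and the function name extract_app_id is made up for the illustration.

```python
import re

# Lookbehind for "id=", then everything up to the next "&".
ID_PATTERN = re.compile(r"(?<=id=)[^&]*")


def extract_app_id(url):
    """Return the text after the first "id=" up to the next "&", or None."""
    match = ID_PATTERN.search(url)
    return match.group(0) if match else None


url1 = "https://play.google.com/store/apps/details?id=com.flipkart.android&hl=en"
url2 = ("https://play.google.com/store/apps/details?hl=en_US"
        "&id=com.surveysampling.mobile.quickthoughts"
        "&referrer=mat_click_id%3Df1901cef59f79b1542d05a1fdfa67202-20150429-5128")

print(extract_app_id(url1))
print(extract_app_id(url2))
```

Note that the lookbehind accepts any "id=", including the tail of a longer parameter name, so it relies on the real id parameter appearing first.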
It seems like you've made some assumptions about lengths. The &r appears because that is where the 50-character limit falls. You are also getting the en_US because you assumed 4 characters at the beginning, but your second string has more. Perhaps you can split on & and then look for the variable that begins with id=.
It seems like a function like this would help:
http://www.sqlservercentral.com/blogs/querying-microsoft-sql-server/2013/09/19/how-to-split-a-string-by-delimited-char-in-sql-server/
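The split-and-search idea can be sketched in a few lines of Python to show the intended logic before porting it to a T-SQL split function. The function name app_id_from_url is hypothetical.

```python
def app_id_from_url(url):
    """Split the query string on "&" and return the value of the id variable."""
    # Everything after the first "?" is the query string.
    _, _, query = url.partition("?")
    for part in query.split("&"):
        name, _, value = part.partition("=")
        if name == "id":  # exact name match, so "mat_click_id" is ignored
            return value
    return None


print(app_id_from_url(
    "https://play.google.com/store/apps/details?hl=en_US"
    "&id=com.surveysampling.mobile.quickthoughts&referrer=x"
))
```

Because the variable name is compared exactly, this does not depend on the position of id in the URL or on fixed lengths.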

Regular expression in SQL Server 2005+

I am stuck with a regular expression in SQL Server 2005+: I need a regular expression to validate a first name (allowing only letters, whitespace, and a dot).
I tried with below query
SELECT PATINDEX('%[A-Z]%[a-z]%[.]%','John H. Wilson') as VALIDFIRSTNAME
But this also fails in some cases. I'm unable to find a clear regular expression. Any assistance would be very much appreciated.
I have used PATINDEX to recognise the given pattern. If the string doesn't match the pattern, it should return 0; otherwise it should return 1 or more.
Thanks in advance.
You can't match the mentioned condition with the available patterns.
Try to follow this way.
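For reference, the rule from the question (only letters, whitespace, and a dot) can be written as a search for any character outside the allowed set. The Python sketch below illustrates that rule only; the name is_valid_first_name is made up, and treating an empty string as invalid is an assumption. A negated character class such as '%[^A-Za-z. ]%' in PATINDEX may express a similar check in T-SQL, but that is untested here and its range behaviour depends on the collation.

```python
import re

# Any character that is NOT a letter, whitespace, or a dot.
INVALID_CHAR = re.compile(r"[^A-Za-z\s.]")


def is_valid_first_name(name: str) -> bool:
    """True when the name is non-empty and contains only allowed characters."""
    return bool(name) and INVALID_CHAR.search(name) is None


print(is_valid_first_name("John H. Wilson"))
print(is_valid_first_name("J0hn"))
```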

SQL Server 2008 XPath Equivalent to preceding-sibling axis locator

I have some MathML that contains tags that identify various function calls (though this scenario can apply to any XML).
A sample would be:
<math>
  <apply>
    <ci>IIF</ci>
    <apply>
      <eq />
      <apply>
        <ci>DDOutputB60</ci>
        <ci>Index</ci>
      </apply>
      <cn>0</cn>
    </apply>
    <cn>0</cn>
    <apply>
      <ci>DDOutputB60</ci>
      <ci>Index</ci>
    </apply>
  </apply>
</math>
As you can see, this particular sample identifies two function calls - but both are to the same function (DDOutputB60)
I am attempting to write some SQL to list the DISTINCT functions but need to do this in the XPath and not in a wrapper SELECT statement that selects DISTINCT from the result set.
(As an aside, the reason for this is that this is the member SQL for a recursive CTE and DISTINCT or GROUP BY are not allowed)
I am led to believe that the following is valid XPath that will select distinct values but is not supported in SQL Server 2008:
COLUMN.nodes('(//ci[not(text() = preceding-sibling::ci/text())])')
Can anyone suggest an XPath equivalent that will work in SQL Server 2008?
Perhaps the >> or << node comparison operators might help, but I'm not an expert. Yet.
Thanks in advance.
@Ravi: the desired output would be the result of the .nodes(...) SQL function, which I assume would look in this case like:
<ci>DDOutputB60</ci>
It would be a single node result as the duplicate has been removed.
Perhaps the >> or << node comparison operators might help, but I'm not an expert. Yet.
Yes, this is what you need.
Here are examples in these two answers:
Getting the following sibling in an XPath when “following” axis is not supported
Get followin sibling in SQL Server XPath
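The filter being asked for amounts to "keep a ci node only if no earlier ci node in document order has the same text". The Python sketch below shows that logic with the standard library's ElementTree. It is not SQL Server XQuery, and the function name distinct_ci is invented for the illustration. In the sample, the repeated ci elements sit in different apply parents, so the comparison has to look at all earlier nodes, not only at siblings.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<math>
  <apply>
    <ci>IIF</ci>
    <apply>
      <eq />
      <apply><ci>DDOutputB60</ci><ci>Index</ci></apply>
      <cn>0</cn>
    </apply>
    <cn>0</cn>
    <apply><ci>DDOutputB60</ci><ci>Index</ci></apply>
  </apply>
</math>
"""


def distinct_ci(xml_text: str) -> list:
    """Return the text of each ci element once, in first-seen document order."""
    root = ET.fromstring(xml_text)
    seen = set()
    result = []
    for node in root.iter("ci"):  # iter() walks the tree in document order
        text = (node.text or "").strip()
        if text not in seen:
            seen.add(text)
            result.append(text)
    return result


print(distinct_ci(SAMPLE))
```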

Why is this data type conversion failing but not failing?

Ok, so I'm wracking my brain on this one...
These two queries, though they appear the same, are apparently different in some fashion. When run against a database in SQL Server Management Studio, the top one results in an error (Conversion failed when converting from a character string to uniqueidentifier.), whereas the bottom one runs just fine. Any ideas as to why that would be?
SELECT CONVERT(UNIQUEIDENTIFIER,'459B621C-A49A-49Cl-900F-AB14D61841E2');
SELECT CONVERT(UNIQUEIDENTIFIER,'459B621C-A49A-49C1-900F-AB14D61841E2');
Could it be a character encoding issue?
Thanks
There is a difference. The first one contains a lowercase letter l (49Cl), which is not a hexadecimal digit; the second has the digit 1 in that position (49C1).
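Look-alike characters such as l and 1 are hard to spot by eye, so a small check for non-hexadecimal characters can help. The Python sketch below is one way to do it; the function names non_hex_chars and is_valid_guid are made up, and Python's uuid.UUID accepts some formats that SQL Server may not, so treat it as a rough check only.

```python
import string
import uuid


def non_hex_chars(guid: str) -> list:
    """Return (index, character) pairs that are neither hex digits nor dashes."""
    allowed = set(string.hexdigits) | {"-"}
    return [(i, ch) for i, ch in enumerate(guid) if ch not in allowed]


def is_valid_guid(guid: str) -> bool:
    """True when Python's uuid module can parse the string."""
    try:
        uuid.UUID(guid)
        return True
    except ValueError:
        return False


bad = "459B621C-A49A-49Cl-900F-AB14D61841E2"
good = "459B621C-A49A-49C1-900F-AB14D61841E2"

print(non_hex_chars(bad))   # [(17, 'l')]
print(non_hex_chars(good))  # []
```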
