I am the local admin for an obscure CRM-ATS that we will keep using until we migrate to SFDC in 18 months. It has a report builder (basically beta) that is not well documented but appears very powerful. I can build custom expressions within a report, but I can't figure out the syntax for all the operators.
Does anyone recognize this code or operator list, and can you point me to the underlying language so I can look up the syntax for building these expressions?
IF(GREATER_OR_EQUAL(DATE_DIFF(NOW(); JobCurrentStep.StepTime); 14); SUBSTRING("14Days+"; 1); SUBSTRING("<14 Days"; 1))
DATE_DIFF(StepsLinkedPeop.StepStartTime; StepsLinkedPeop.StepEndTime)
COUNT_DISTINCT(People.Person)
IF(LIKE(LinkedJobs.JobClientNameSBD; "MSP"); COUNT_DISTINCT(People.Person); 9)
COUNT_DISTINCT(People.Person)
IF(GREATER_OR_EQUAL(DATE_DIFF(StepChangesJour.StepStartTime; NOW()); 30); COUNT_DISTINCT(People.Person); 0)
COUNT_DISTINCT(LinkedPeople.Applicant)
COUNT(LinkedPeople.Applicant)
DATE_DIFF(StepChangesJour.StepEndTime; StepChangesJour.StepStartTime)
GREATER_OR_EQUAL(DATE_DIFF(StepChangesJour.StepEndTime; StepChangesJour.StepStartTime); 7)
DATE_DIFF(NOW(); JobCurrentStep.StepTime)
IF(GREATER_OR_EQUAL(DATE_DIFF(NOW(); JobCurrentStep.StepTime); 500); SUBSTRING("Greater than 2 Weeks"; 1); SUBSTRING("Recent"; 1))
Here are the available operators:
AVG
CONCAT
COUNT
COUNT_DISTINCT
DATE_ADD_DAYS
DATE_ADD_SECONDS
DATE_DIFF
DATE_DIFF_IN_SECONDS
DATE_DIFF_IN_YEARS
DATE_FORMAT
DIVISION
EQUALS
GREATER
GREATER_OR_EQUAL
GROUP_CONCAT
GROUP_CONCAT_DISTINCT
GROUP_CONCAT_DISTINCT_WITH_HYPHEN
GROUP_CONCAT_DISTINCT_WITH_PIPES
HOUR_DIFF
IF
IF_NULL
IN
INET_NTOA
LIKE
LITERAL_NULL
LOCATE
LOGGED_USER_ID
LOGGED_USER_PERSON_ID
LOGGED_USER_TIMEZONE
MAX
MIN
MINUS
MULTIPLY
NOW
PCT
PLUS
REPLACE
ROUND
SUBSTRING
SUBSTRING_INDEX
SUM
SUM_DISTINCT
TO_DATETIME
TO_INT
TRIM
TRUNCATE
WORKING_DAYS
I can't say for certain, since I have no experience with the language, but this looks like ABAP to me: the high-level language created by the German software company SAP for its business applications.
Many of the operators you've mentioned look like MySQL function names or keywords. Notable examples include:
COUNT_DISTINCT, which looks a lot like COUNT(DISTINCT)
GROUP_CONCAT
INET_NTOA
LIKE
SUBSTRING_INDEX
However, many of the functions you've identified do not appear in MySQL; in particular, basic operators like PLUS, MULTIPLY and EQUALS are not functions in MySQL, and LOGGED_USER_ID and WORKING_DAYS do not appear in MySQL either. Additionally, the function call syntax you've described doesn't match what MySQL uses.
If I had to guess, I'd say you're looking at something custom that "compiles" expressions into MySQL queries.
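To make that guess concrete, here is a minimal Python sketch of how such a "compiler" might turn the semicolon-separated expressions into MySQL text. Everything in it is an assumption: the mapping tables (`INFIX`, `RENAMED`), the `to_mysql` name, and the idea that `DATE_DIFF` maps to MySQL's `DATEDIFF` are guesses for illustration, not the product's actual rules.

```python
import re

# Guessed mappings only; the real product's translation rules are unknown.
INFIX = {"PLUS": "+", "MINUS": "-", "MULTIPLY": "*", "DIVISION": "/",
         "EQUALS": "=", "GREATER": ">", "GREATER_OR_EQUAL": ">="}
RENAMED = {"DATE_DIFF": "DATEDIFF"}

# String literal, identifier (dots allowed), number, or punctuation.
TOKEN = re.compile(r'\s*("[^"]*"|[A-Za-z_][\w.]*|\d+(?:\.\d+)?|[();])')

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def compile_expr(tokens):
    """Consume one expression from the front of `tokens`; return MySQL text."""
    if not tokens:
        raise ValueError("unexpected end of expression")
    tok = tokens.pop(0)
    if tok.startswith('"'):
        # Double-quoted literal becomes a single-quoted SQL string.
        return "'" + tok[1:-1].replace("'", "''") + "'"
    if tok in {"(", ")", ";"}:
        raise ValueError(f"unexpected {tok!r}")
    if not tokens or tokens[0] != "(":
        return tok  # column reference or number
    tokens.pop(0)  # consume "("
    args = []
    while tokens and tokens[0] != ")":
        args.append(compile_expr(tokens))
        if tokens and tokens[0] == ";":
            tokens.pop(0)
    if not tokens:
        raise ValueError("missing closing parenthesis")
    tokens.pop(0)  # consume ")"
    if tok in INFIX and len(args) == 2:
        return f"({args[0]} {INFIX[tok]} {args[1]})"
    if tok == "COUNT_DISTINCT" and len(args) == 1:
        return f"COUNT(DISTINCT {args[0]})"
    return f"{RENAMED.get(tok, tok)}({', '.join(args)})"

def to_mysql(expression):
    return compile_expr(tokenize(expression))

print(to_mysql("COUNT_DISTINCT(People.Person)"))
print(to_mysql("GREATER_OR_EQUAL(DATE_DIFF(NOW(); JobCurrentStep.StepTime); 14)"))
```

An expression with an unbalanced parenthesis raises `ValueError`, which is also a quick way to spot typos in a report expression before pasting it into the builder.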
I'm looking for some assistance in debugging a REGEXP_REPLACE() statement.
I have been using an online regular expressions editor to build expressions, and then the SF regexp_* functions to implement them. I've attempted to remain consistent with the SF regex implementation, but I'm seeing an inconsistency in the returned results that I'm hoping someone can explain :)
My intent is to replace commas within the text (excluding commas within double-quoted text) with a new delimiter (#^#).
Sample text string:
"Foreign Corporate Name Registration","99999","Valuation Research",,"Active Name",02/09/2020,"02/09/2020","NEVADA","UNITED STATES",,,"123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES","123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES",,,,,,,,,,,,
RegEx command and Substitution (working in regex101.com):
([("].*?["])*?(,)
\1#^#
regex101.com Result:
"Foreign Corporate Name Registration"#^#"99999"#^#"Valuation Research"#^##^#"Active Name"#^#02/09/2020#^#"02/09/2020"#^#"NEVADA"#^#"UNITED STATES"#^##^##^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^##^##^##^##^##^##^##^##^##^##^##^#
When I try and implement this same logic in SF using REGEXP_REPLACE(), I am using the following statement:
SELECT TOP 500
A.C1
,REGEXP_REPLACE((A."C1"),'([("].*?["])*?(,)','\\1#^#') AS BASE
FROM
"<Warehouse>"."<database>"."<table>" AS A
This statement returns the result for BASE:
"Foreign Corporate Name Registration","99999","Valuation Research",,"Active Name",02/09/2020,"02/09/2020","NEVADA","UNITED STATES",,,"123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES","123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES"#^##^##^##^##^##^##^##^##^##^##^##^#
As you can see when comparing the results, the SF result set is only replacing commas at the tail-end of the text.
Can anyone tell me why regex101.com and SF return different results for the same statement? Is my expression non-compliant with the SF implementation of RegEx, and if so, can you tell me why?
Many many thanks for your time and effort reading this far!
Happy Wednesday,
Casey.
The use of .*? to achieve lazy matching is limited to PCRE, which Snowflake does not support. To see this, in regex101.com, change your "flavor" to anything other than PCRE (PHP); you will see that your ([("].*?["])*?(,) regex no longer achieves what you are expecting.
I believe that this will work for your purposes:
REGEXP_REPLACE(A.C1,'("[^"]*")*,','\\1#^#')
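As a rough check outside Snowflake, the suggested pattern can be tried in Python. This is only a sketch: Python's `re` module is not Snowflake's engine (it does support lazy quantifiers), so it shows what the pattern does rather than how Snowflake evaluates it. The `swap_delimiter` name and the shortened sample line are made up for illustration.

```python
import re

# Pattern from the answer: zero or more quoted fields, then the comma to replace.
PATTERN = re.compile(r'("[^"]*")*,')

def swap_delimiter(line, delimiter="#^#"):
    # A function replacement keeps the delimiter text literal and treats an
    # unmatched group 1 (a comma with no quoted field before it) as empty.
    return PATTERN.sub(lambda m: (m.group(1) or "") + delimiter, line)

sample = '"Valuation, Research","99999",,02/09/2020,"WI"'
print(swap_delimiter(sample))
# "Valuation, Research"#^#"99999"#^##^#02/09/2020#^#"WI"
```

The comma inside the quoted field survives because `[^"]*` consumes the whole quoted field before the pattern looks for the separating comma; no lazy matching is involved.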
I have this piece of code in Oracle which I need to convert into SQL Server to get the same behavior. I have used the REPLACE function. It seems to be working but I just wanted to make sure.
REGEXP_REPLACE(
phonenumber,
'([[:digit:]]{3})([[:digit:]]{3})([[:digit:]]{4})',
'(\1)\2-\3'
) phonenumber
As Martin said in his answer, SQL Server does not have built-in RegEx functionality (and while it has not been suggested here, just to be clear: no, the [...] wildcard of LIKE and PATINDEX is not RegEx). If your data has little to no variation then yes, you can use some combination of T-SQL functions: REPLACE, SUBSTRING, LEFT, RIGHT, CHARINDEX, PATINDEX, FORMATMESSAGE, CONCAT, and maybe one or two others.
However, if the data / input has even a moderate level of complexity, then the built-in T-SQL functions will at best be cumbersome, and at worst useless. In such cases it's possible to do actual RegEx via SQLCLR (as long as you aren't using Azure SQL Database Single DB or SQL Server 2017+ via AWS RDS), which is (restricted) .NET code running within SQL Server. You can either code your own / find examples here on S.O. or elsewhere, or try a pre-done library such as the one I created, SQL# (SQLsharp), the Free version of which contains several RegEx functions. Please note that SQLCLR, being .NET, is not a POSIX-based RegEx, and hence does not use POSIX character classes (meaning: you will need to use \d for "digits" instead of [:digit:]).
The level of complexity needed in this particular situation is unclear as the example code in the question implies that the data is simple and uniform (i.e. 1112223333) but the example data shown in a comment on the question appears to indicate that there might be dashes and/or spaces in the data (i.e. xxx- xxx xxxx).
If the data truly is uniform, then stick with the pure T-SQL solution provided by @MartinSmith. But, if the data is of sufficient complexity, then please consider the RegEx example below, using a SQLCLR function found in the Free version of my SQL# library (as mentioned earlier), that easily handles the 3 variations of input data and more:
SELECT SQL#.RegEx_Replace4k(tmp.phone,
N'\(?(\d{3})\)?[ .-]*(\d{3})[ .-]*(\d{4})', N'($1)$2-$3',
-1, -- count (-1 == unlimited)
1, -- start at
N'') -- RegEx options
FROM (VALUES (N'8885551212'),
(N'123- 456 7890'),
(N'(777) 555- 4653')
) tmp([phone]);
returns:
(888)555-1212
(123)456-7890
(777)555-4653
The RegEx pattern allows for:
0 or 1 (
3 decimal digits
0 or 1 )
0 or more of space, ., or -
3 decimal digits
0 or more of space, ., or -
4 decimal digits
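The pattern breakdown above can be tried without SQL Server. The sketch below uses Python's `re` module as a stand-in for the .NET engine; the `format_phone` name is made up for illustration, and the only syntax change is that Python writes backreferences as `\1` where .NET uses `$1`.

```python
import re

# Same pattern as in the SQL# call above.
PHONE = re.compile(r'\(?(\d{3})\)?[ .-]*(\d{3})[ .-]*(\d{4})')

def format_phone(raw):
    # Rebuild the number as (AAA)BBB-CCCC from the three captured groups.
    return PHONE.sub(r'(\1)\2-\3', raw)

for raw in ['8885551212', '123- 456 7890', '(777) 555- 4653']:
    print(format_phone(raw))
# (888)555-1212
# (123)456-7890
# (777)555-4653
```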
NOTE
It was mentioned that the newer Language Extensions might be a better choice than SQLCLR. Language Extensions allow calling R / Python / Java code, hosted outside of SQL Server, via the sp_execute_external_script stored procedure. As the Tutorial: Search for a string using regular expressions (regex) in Java page shows, external scripts are actually not a good choice for many / most uses of RegEx in SQL Server. The main problems are:
Unlike with SQLCLR, the only interface for external scripts is a stored procedure. This means that you can't use any of that functionality inline in a query (SELECT, WHERE, etc).
With external scripts, you pass in the query, work on the results in the external language, and pass back a static result set. This means that compiled code now has to be more specialized (i.e. tightly-coupled) to the particular usage. Changing how the query uses RegEx and/or what columns are returned now requires editing, compiling, testing, and deploying the R / Python / Java code in addition to (and coordinated with!) the T-SQL changes.
I'm sure external scripts are absolutely wonderful, and a better choice than SQLCLR, in certain scenarios. But they certainly do not lend themselves well to the highly varied, and often ad hoc, nature of how RegEx is used (like many / most other functions).
SQL Server does not have native regex support. You would need to use CLR (or, as @Lukasz Szozda points out in the comments, one of the newer Language Extensions).
If I have understood the regex correctly, though, it matches strings of 10 digits, assigns the first 3 to group 1, the second 3 to group 2, and the last 4 to group 3, and then uses the back references in the expression (\1)\2-\3.
You can use built-in string functions to do this, as below:
SELECT CASE
WHEN phonenumber LIKE REPLICATE('[0-9]', 10)
THEN FORMATMESSAGE('(%s)%s-%s',
LEFT(phonenumber, 3),
SUBSTRING(phonenumber, 4, 3),
RIGHT(phonenumber, 4))
ELSE phonenumber
END
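For comparison, the logic of the CASE expression above can be sketched in Python. The `format_if_ten_digits` name is hypothetical, and this mirrors the T-SQL version rather than the Oracle one.

```python
import re

TEN_DIGITS = re.compile(r'[0-9]{10}')

def format_if_ten_digits(phonenumber):
    # fullmatch mirrors LIKE REPLICATE('[0-9]', 10): exactly ten digits, nothing else.
    if TEN_DIGITS.fullmatch(phonenumber):
        return f"({phonenumber[:3]}){phonenumber[3:6]}-{phonenumber[-4:]}"
    # Anything else is returned unchanged, like the ELSE branch.
    return phonenumber

print(format_if_ten_digits('1112223333'))     # (111)222-3333
print(format_if_ten_digits('xxx- xxx xxxx'))  # xxx- xxx xxxx
```

One difference worth keeping in mind: the Oracle REGEXP_REPLACE call rewrites a ten-digit run wherever it occurs inside a longer string, while this check (like the CASE expression) only reformats values that consist of exactly ten digits.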
You can write a SQL function using CLR that wraps the standard .NET regex. I have written one that you can use. It looks like this:
DECLARE @SourceText NVARCHAR(MAX) = N'My first line <br /> My second line';
DECLARE @RegexPattern NVARCHAR(MAX) = N'([<]br\s*/[>])';
DECLARE @Replacement NVARCHAR(MAX) = N'';
DECLARE @IsCaseSensitive BIT = 0;
SELECT regex.Replace(@SourceText, @RegexPattern, @Replacement, @IsCaseSensitive);
Entirely by accident today I was running a SQL statement to filter some items by date. For simplicity's sake, we'll say I used
SELECT *
FROM [TableName]
WHERE [RecordCreated] >+ '2016-04-10'
Only after the statement ran did I realise I had used >+ instead of >=. I was confused, as I would have expected an error.
I tried a couple of other variations such as
>- -- Throws an error
<+ -- Ran successfully
<- -- Throws an error
The count of rows returned was exactly the same whether I used >= or >+
After searching online I couldn't find any documentation that covered this syntax directly, only when the two operators are used apart.
The RecordCreated column is a datetime.
Is this just a nicety in syntax for a possible common mistake or is it potentially trying to cast the date as a numeric value?
This seems to be a bug with the '+' operator.
As per the update from the Microsoft team:
After some investigation, this behavior is by design since + is an
unary operator. So the parser accepts "+ <expression>", and the '+' is
simply ignored in this case. Changing this behavior has lot of
backward compatibility implications so we don't intend to change it &
the fix will introduce unnecessary changes for application code.
You can find a really good answer by RGO to his own question here.
The results shouldn't match ">=" and "<=", but ">" and "<". I just checked, and the row count differs by 2: the first and last items are removed.
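Python's grammar happens to treat `>+` the same way, so it can serve as an analogy for what the answers above describe. This is Python, not T-SQL, and the `describe` helper is made up for illustration; it only shows that `>+` parses as the `>` operator followed by a unary plus, which is why the comparison behaves like `>` rather than `>=`.

```python
import ast

def describe(expression):
    """Return the comparison operator and the right-hand node type Python parses."""
    node = ast.parse(expression, mode="eval").body
    return type(node.ops[0]).__name__, type(node.comparators[0]).__name__

# ">+" is not one operator: it is ">" applied to the unary-plus expression "+5".
print(describe("created >+ 5"))  # ('Gt', 'UnaryOp')

# ">=" really is a single greater-or-equal operator.
print(describe("created >= 5")[0])  # GtE

# So the two differ exactly at the boundary value.
print(5 >+ 5, 5 >= 5)  # False True
```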
I was writing a query against a table today on a SQL Server 2000 box, and while writing the query in Query Analyzer, to my surprise I noticed the word LineNo was converted to blue text.
It appears to be a reserved word according to MSDN documentation, but I can find no information on it, just speculation that it might be a legacy reserved word that doesn't do anything.
I have no problem escaping the field name, but I'm curious -- does anyone know what "LineNo" in T-SQL is actually used for?
OK, this is completely undocumented, and I had to figure it out via trial and error, but it sets the line number for error reporting. For example:
LINENO 25

SELECT * FROM NON_EXISTENT_TABLE
The above will give you an error message indicating an error at line 27 (instead of line 3, which is what you get if you turn the LINENO line into a single-line comment, e.g., by prefixing it with two hyphens):
Msg 208, Level 16, State 1, Line 27
Invalid object name 'NON_EXISTENT_TABLE'.
This is related to similar mechanisms in programming languages, such as the #line preprocessor directives in Visual C++ and Visual C# (which are documented, by the way).
How is this useful, you may ask? Well, one use of this is to help SQL code generators, which generate code from some higher-level (than SQL) language and/or perform macro expansion, tie generated code lines back to user code lines.
P.S., It is not a good idea to rely on undocumented features, especially when dealing with a database.
Update: This explanation is still correct up to and including the current version of SQL Server, which at the time of this writing is SQL Server 2008 R2 Cumulative Update 5 (10.50.1753.0).
Depending on where you use it, you can always use [LineNo]. For example:
select LnNo [LineNo] from OrderLines.
What you want is for users to just type in their search criteria just like they would in Google. Some words, maybe some quoted phrases, maybe a few operators, and have it just work.
A .Net solution is available here:
http://ewbi.blogs.com/develops/2007/05/normalizing_sql.html
I am looking for a pure T-SQL version that also supports WHERE conditions (or a VBScript/JavaScript one).
Example: "dog" food price:20..45
should look like this (for mssql):
select * from table t join containstable(desc, '"dog" and food*') k on k.key=t.id
where t.price between 20 and 45
Operators: and, or, near, "", not, * , etc.
I don't see how you could have this functionality short of writing a complete parser that is programmed with the table relationships and column datatypes that exist on your database.
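For a sense of what even the smallest version of such a parser involves, here is a Python sketch that handles only the example from the question: quoted phrases, bare words (turned into prefix terms), and `column:low..high` ranges. The `parse_search` name and the `RANGE_COLUMNS` whitelist are assumptions for illustration; the operators (and, or, near, not) and any knowledge of table relationships or column datatypes are deliberately left out.

```python
import re

# Hypothetical whitelist of numeric columns that may be range-filtered.
RANGE_COLUMNS = {"price"}

TOKEN = re.compile(
    r'"(?P<phrase>[^"]*)"'                              # "quoted phrase"
    r'|(?P<column>\w+):(?P<low>\d+)\.\.(?P<high>\d+)'   # column:low..high
    r'|(?P<word>\S+)'                                   # bare word
)

def parse_search(query):
    """Return (full-text search condition, WHERE clause fragment)."""
    terms, ranges = [], []
    for match in TOKEN.finditer(query):
        if match.group("phrase") is not None:
            # Keep only word characters and spaces inside a phrase.
            phrase = re.sub(r"[^\w ]", "", match.group("phrase")).strip()
            if phrase:
                terms.append(f'"{phrase}"')
        elif match.group("column") is not None:
            column = match.group("column")
            if column not in RANGE_COLUMNS:
                raise ValueError(f"column not allowed: {column}")
            low, high = int(match.group("low")), int(match.group("high"))
            ranges.append(f"t.{column} between {low} and {high}")
        else:
            # Strip punctuation so user input cannot break out of the literal.
            word = re.sub(r"\W", "", match.group("word"))
            if word:
                terms.append(f"{word}*")
    return " and ".join(terms), " and ".join(ranges)

print(parse_search('"dog" food price:20..45'))
# ('"dog" and food*', 't.price between 20 and 45')
```

The first element would go into the containstable search condition and the second into the WHERE clause. The whitelist and the integer conversion are there because both fragments end up inside SQL text; real code would also want parameterized queries where possible.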