I have an inherited SQL statement with the following WHERE clause:
WHERE
(Users_1.SecurityLevel IN ('Accounts', 'General manager'))
AND (PurchaseOrders.Approval = 1)
AND (PurchaseOrders.QuotedAmount = 0)
AND (Users_1.StaffNumber = ISNULL(ServiceRequests.POC_UserID, PurchaseOrders.Approval_UserID))
OR
(Users_1.SecurityLevel IN ('Accounts', 'General manager'))
AND (PurchaseOrders.QuotedAmount = 0)
AND (ServiceRequests.POC = 1)
AND (Users_1.StaffNumber = ISNULL(ServiceRequests.POC_UserID, PurchaseOrders.Approval_UserID))
OR
(ISNULL(ISNULL(PurchaseOrders.InvoiceNumber, ServiceRequests.InvoiceNumber), '!#') <> '!#')
AND (Users_1.StaffNumber = ISNULL(ServiceRequests.POC_UserID, PurchaseOrders.Approval_UserID))
I'm trying to figure out the order of operations when things are not nicely bracketed.
How are the AND and OR operators ordered in the above example?
Is there an easy rule so that I can put brackets around things to make it more readable?
I'm looking for something like "BODMAS" when it comes to the mathematical order of operations, for SQL WHERE clause operators.
Thanks
The page on Operator Precedence tells you:
When a complex expression has multiple operators, operator precedence determines the sequence in which the operations are performed. The order of execution can significantly affect the resulting value.
And that AND has a higher precedence than OR.
However, the quoted statement about the order of execution is not strictly correct. In SQL, you tell the system what you want, not how to do it, and the optimizer is free to re-order operations, provided that the same logical result is produced.
So, whilst operator precedence tells you how the operators are logically combined, it does not, in fact, control the order in which each piece of logic is actually performed. This means that idioms which may be safe in other languages because of guarantees of execution order are not in fact safe in SQL. E.g. a check such as:
<String can be parsed as an int> && <convert the string to an int and compare to 20>
can be perfectly safe in languages such as C#. The same logic in SQL is not safe, since the optimizer may choose to perform the string-to-int conversion before it evaluates whether the string can be parsed as an int, and so it can throw an error about a failed conversion. (Of course, it can also work as you expected and not produce an error.)
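To make the logical grouping concrete, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in engine, not SQL Server; both give AND a higher precedence than OR. The helper name evaluate is made up for this illustration. It only shows how the operators are grouped, not the order in which an optimizer actually evaluates them.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in SQL engine (assumption:
# the original question is about SQL Server, which groups AND/OR the same way).
conn = sqlite3.connect(":memory:")


def evaluate(expression):
    """Return the value (1 or 0) of a constant SQL boolean expression."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]


unbracketed = evaluate("1 OR 0 AND 0")    # no brackets at all
and_first = evaluate("1 OR (0 AND 0)")    # AND grouped first
or_first = evaluate("(1 OR 0) AND 0")     # OR grouped first

# The unbracketed form matches the "AND first" grouping.
print(unbracketed, and_first, or_first)   # prints: 1 1 0
```

Applied to the WHERE clause in the question, each run of conditions joined by AND forms one group, and the three groups are then combined with OR, so brackets can be placed around each AND group without changing the result.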
I am surprised! The statement below is valid in SQL Server:
SELECT +'ABCDEF'
Has SQL Server defined + as a Unary operator for string types?
Here is my own answer to this question (Please also see the update at the end):
No, there is no such unary operator defined for string expressions. It is possible that this is a bug.
Explanation:
The given statement is valid and it generates the below result:
(No column name)
----------------
ABCDEF
(1 row(s) affected)
which is equivalent to doing the SELECT statement without using the + sign:
SELECT 'ABCDEF'
The fact that it compiles without errors, and indeed executes successfully, gives the impression that + is operating as a unary operator on the given string. However, the official T-SQL documentation makes no mention of such an operator. The section entitled "String Operators" lists + in two string operations, + (String Concatenation) and += (String Concatenation), but neither is a unary operation. The section entitled "Unary Operators" introduces three operators, only one of which is the + (Positive) operator. Even for this one, the only one that seems relevant, it soon becomes clear that it has nothing to do with non-numeric string values, since its description explicitly states that it applies only to numeric values: "Returns the value of a numeric expression (a unary operator)".
Perhaps this operator is there to accept string values that can be evaluated as numbers, such as the one used here:
SELECT +'12345'+1
When the above statement is executed, it outputs a number that is the sum of the given string evaluated as a number and the numeric value added to it, which is 1 here but could obviously be any other amount:
(No column name)
----------------
12346
(1 row(s) affected)
However, I doubt this explanation is correct, as it raises the questions below:
Firstly, if we accept that this explanation is true, then we can conclude that expressions such as +'12345' are evaluated to numbers. If so, why can these numbers appear in string-related functions such as DATALENGTH, LEN, etc.? A statement such as this:
SELECT DATALENGTH(+'12345')
is quite valid and returns the following:
(No column name)
----------------
5
(1 row(s) affected)
which means +'12345' is being evaluated as a string, not a number. How can this be explained?
Secondly, similar statements with the - operator, such as this:
SELECT -'ABCDE'
or even this:
SELECT -'12345'
generate the error below:
Invalid operator for data type. Operator equals minus, type equals varchar.
Why, then, is no error generated in the similar case where the + operator is wrongly applied to a non-numeric string value?
So, these two questions prevent me from accepting the explanation that this is the same + (unary) operator that the documentation introduces for numeric values. As there is no mention of it anywhere else, it could be that it was not deliberately added to the language. It may be a bug.
The problem looks more severe when we see that no error is generated for statements such as this one either:
SELECT ++++++++'ABCDE'
I do not know whether any other programming languages accept this sort of statement. If there are, it would be nice to know for what purpose they use a + (unary) operator applied to a string. I cannot imagine any use!
UPDATE
Here it says this has been a bug in earlier versions but it won't be fixed because of backward compatibility:
After some investigation, this behavior is by design since + is an unary operator. So the parser accepts "+ , and the '+' is simply ignored in this case.
Changing this behavior has lot of backward compatibility implications so we don't intend to change it & the fix will introduce unnecessary changes for application code.
I am converting some views from Netezza into another DBMS.
I keep running into this operator /=/, which I imagine is some sort of equality operator.
However, I have searched this site and the official docs, but I cannot find a definition of how this operator works.
What does /=/ mean in Netezza?
EDIT:
I am seeing it in case statements.
Here is an example:
CASE WHEN (A_TABLE.A_COL /=/ 'ONE'::VARCHAR) THEN 'ONE'::VARCHAR
WHEN (A_TABLE.A_COL /=/ 'TWO'::VARCHAR) THEN 'TWO'::VARCHAR
WHEN (A_TABLE.A_COL /=/ 'THREE'::VARCHAR) THEN 'THREE'::VARCHAR
WHEN (A_TABLE.A_COL /=/ 'FOUR'::VARCHAR) THEN 'FOUR'::VARCHAR
ELSE 'OTHER'::VARCHAR END
It is quite a powerful feature, often used in JOIN statements and, as here, in CASE.
It's an operator that tells the database to match NULL in one value to NULL in another. Normally all functions and operators return NULL if one of the arguments is NULL, and since NULL is not TRUE you will not find a match.
This whole tri-state logic surrounding NULL can be quite confusing at times and has clearly been invented in the wrinkled minds of mathematicians, but this special /=/ operator has a behavior that is quite easy to wrap your brain around.
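As a rough illustration of the NULL-matching behavior described above, the sketch below compares ordinary equality with a NULL-safe comparison. It is not Netezza: it assumes SQLite (through Python's sqlite3 module), whose IS operator behaves like = except that NULL matches NULL. The helper name compare is made up for this example. Whether the target DBMS offers an equivalent operator (for example IS NOT DISTINCT FROM) is something to check in its own documentation.

```python
import sqlite3

# SQLite is only a stand-in here; Netezza itself is not involved.
conn = sqlite3.connect(":memory:")


def compare(operator, left, right):
    """Evaluate `left <operator> right` in SQLite; None stands for SQL NULL."""
    # `operator` is one of the fixed keywords used below, never user input.
    return conn.execute(f"SELECT ? {operator} ?", (left, right)).fetchone()[0]


plain_null = compare("=", None, None)     # NULL = NULL yields NULL, not TRUE
safe_null = compare("IS", None, None)     # NULL-safe: NULL matches NULL
safe_equal = compare("IS", "ONE", "ONE")  # behaves like = for non-NULL values
safe_mixed = compare("IS", "ONE", None)   # a value never matches NULL

print(plain_null, safe_null, safe_equal, safe_mixed)  # prints: None 1 1 0
```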
Using AppEngine appstats I profiled my queries, and noticed that although the docs say a query costs one read, queries using ndb.OR (or .IN which expands to OR), cost n reads (n equals the number of OR clauses).
eg:
votes = (Vote.query(ndb.OR(Vote.object == keys[0], Vote.object == keys[1]))
.filter(Vote.user_id == user_id)
.fetch(keys_only=True))
This query costs 2 reads (it matches 0 entities). If I replace the ndb.OR with Vote.object.IN, the number of reads equals the length of the list I pass in.
This behavior kind of contradicts the docs.
I was wondering if anyone else experienced the same, and if this is a bug in AE, docs, or my understanding.
Thanks.
The query docs for ndb are not particularly explicit, but this paragraph is your best answer:
In addition to the native operators, the API supports the != operator,
combining groups of filters using the Boolean OR operation, and the IN
operation, which test for equality to one of a list of possible values
(like Python's 'in' operator). These operations don't map 1:1 to the
Datastore's native operations; thus they are a little quirky and slow,
relatively. They are implemented using in-memory merging of result
streams. Note that p != v is implemented as "p < v OR p > v". (This
matters for repeated properties.)
It comes from this doc: https://developers.google.com/appengine/docs/python/ndb/queries
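The "in-memory merging of result streams" mentioned in the quoted paragraph can be pictured with a small simulation. This is a sketch in plain Python, not ndb code: INDEX, equality_query and in_query are hypothetical names, and the real datastore billing rules are not modeled. It only shows why an IN or OR over n values plausibly turns into n separate sub-queries, even when nothing matches.

```python
import heapq

# Hypothetical stand-in for a datastore index: (object_key, vote_id) rows.
INDEX = [("k1", 101), ("k1", 102), ("k2", 201), ("k3", 301)]


def equality_query(object_key, counter):
    """Run one equality sub-query and return its matching ids, sorted."""
    counter["queries"] += 1
    return sorted(vote_id for key, vote_id in INDEX if key == object_key)


def in_query(object_keys):
    """Emulate IN/OR: one sub-query per value, then merge the streams in memory."""
    counter = {"queries": 0}
    streams = [equality_query(key, counter) for key in object_keys]
    merged = []
    for vote_id in heapq.merge(*streams):
        if not merged or merged[-1] != vote_id:  # drop duplicates across streams
            merged.append(vote_id)
    return merged, counter["queries"]


results, queries_issued = in_query(["k1", "k2"])
print(results, queries_issued)            # prints: [101, 102, 201] 2

# Even when nothing matches, one sub-query is still issued per value.
empty_results, empty_queries = in_query(["x", "y"])
print(empty_results, empty_queries)       # prints: [] 2
```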
I am stuck trying to find the meaning of a plus sign in a WHERE clause. Does anyone have any ideas on this one? I've been stuck on it for a bit. The query itself is pretty simple and works the same with or without the plus sign. I'd like to remove it unless it's there for a reason.
SELECT userID from tblUser WHERE + userName = SYSTEM_USER
Added note: This is in SQL Server 2008, not Oracle, nor did it come from an Oracle migration... As mentioned below, there is an older join notation for Oracle that uses the +, generally postfixed to some of the criteria.
The unary + operator is simply a no-op. This is explained in the documentation for this operator, which is here:
Although a unary plus can appear before any numeric expression, it
performs no operation on the value returned from the expression.
Specifically, it will not return the positive value of a negative
expression. To return positive value of a negative expression, use the
ABS function.
I actually believe this remark is a wee little bit misleading. I think the unary plus operator will convert a string argument to a number. When applied to a constant string filled with digits, this could actually be beneficial as a way of encouraging the compiler to use an index on a numeric field.
It looks like the plus operator in the where clause is used for left or right outer joins.
You don't need it in your case, but you can read up on them here.
The reason your query works the same either way is that the data is only coming from a single table. The join is superfluous.
A quick search also led me to this answer, which states that using the + method for joins is not recommended.
Update
Since you're using Microsoft SQL Server 2008, this is my best guess:
The '+' operator is used for string concatenation.
I understand that Lucene's AND (&&), OR (||) and NOT (!) operators are shorthands for REQUIRED, OPTIONAL and EXCLUDE respectively, which is why one can't treat them as boolean operators (adhering to boolean algebra).
I have been trying to construct a simple OR expression, as follows
q = +(field1:value1 OR field2:value2)
with a match on either field1 or field2. But since OR merely marks a clause as optional, for documents where both field1:value1 and field2:value2 match, the query returns a score that reflects a match on both clauses.
How do I enforce short-circuiting in this context? In other words, how to implement short-circuiting as in boolean algebra where an expression A || B || C returns true if A is true without even looking into whether B or C could be true.
Strictly speaking, no, there is no short-circuiting boolean logic. If a document is found for one term, you can't simply tell it not to check for the other. Lucene is an inverted index, so it doesn't really check documents for matches directly. If you search for A OR B, it finds A and gets all the documents which have indexed that value. Then it finds B in the index and gets the list of all documents containing it (this is simplifying somewhat, but I hope it gets the point across). It doesn't really make sense for it to not check the documents in which A is found. Further, for the query provided, all the matches on a document still need to be enumerated in order to compute a correct score.
However, you did mention scores! I suspect what you are really trying to get at is that if one query term in a set is found, the score should not be compounded with the other elements. That is, for (A OR B), the score is either score-A or score-B, rather than score-A + score-B or some such (sorry if I am making a wrong assumption here, of course).
That is what DisjunctionMaxQuery is for. Adding each subquery to it will produce a score equal to the maximum of the scores of all subqueries, rather than a combination of them (a plain BooleanQuery adds the scores of the matching clauses together).
In Solr, you should learn about the DisMaxQParserPlugin and its more recent incarnation, the ExtendedDisMax, which, if I'm close to the mark here, should serve you very well.
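The difference in scoring can be sketched with a simplified model. This is plain Python, not the Lucene API: boolean_or_score and dismax_score are hypothetical helper names, and real Lucene scores involve more factors than shown. The tie_breaker parameter mirrors the tie-breaker multiplier that DisjunctionMaxQuery accepts; with the default of 0.0 only the best clause counts.

```python
def boolean_or_score(clause_scores):
    """Simplified model of an OR over optional clauses: matching scores add up."""
    return sum(clause_scores)


def dismax_score(clause_scores, tie_breaker=0.0):
    """Simplified model of a disjunction-max score.

    The best clause score is kept, and the remaining scores contribute only
    through the tie-breaker multiplier.
    """
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    others = sum(clause_scores) - best
    return best + tie_breaker * others


# A document that matches both field1:value1 and field2:value2
# (the two scores are made-up numbers for illustration).
scores = [0.75, 0.5]

print(boolean_or_score(scores))                # prints: 1.25
print(dismax_score(scores))                    # prints: 0.75
print(dismax_score(scores, tie_breaker=0.5))   # prints: 1.0
```

With the disjunction-max model, a document matching both clauses scores the same as one matching only the stronger clause, which is the closest thing to the short-circuit behavior asked about.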