Database query with 'and' 'or' and negate condition - database

I am trying to understand why the first expression works as expected and the latter doesn't.
a1 or a2 or a3 ---> works
I want to get all the other parts that are not in the first expression.
match'a*' and (not'a1' or not'a2' or not'a3') ---> doesn't work
The same expression works as expected if all the conditions are joined with 'and'. Why is that? I was sure my logic was good.
Is it possible that when an expression is negated, the operators joining the conditions must be negated as well? Even if so, if I do
match'a*' and not'a1' ---> works
It works fine for one object but not more than one.

# -like does the wildcard test; -match takes a regex, and the regex 'a*' matches every string
'a1' -like 'a*' -and (
'a1' -notlike 'a1' -or # false - move on to next evaluation
'a1' -notlike 'a2' -or # true - found true, so whole evaluation is true
'a1' -notlike 'a3'
)
'a1' -like 'a*' -and (
'a1' -notlike 'a1' -and # false - stop evaluating and return false since all must be true to return true
'a1' -notlike 'a2' -and
'a1' -notlike 'a3'
)

The technical reason why not'a1' or not'a2' or not'a3' fails comes in two parts.
One, the overall condition is
match'a*' and (Result of the comparisons.)
Two, the comparisons evaluated to True.
Use a1 as input.
a1 not'a1' = False
a1 not'a2' = True
a1 not'a3' = True
Remember that operators only work on two values at a time. In PowerShell, operators evaluate from left to right. And the result from one comparison is used as the left-hand side of later comparisons.
Logical OR is True when either statement is True.
a1 not'a1' [False] or a1 not'a2' [True] = True
(a1 not'a1' or a1 not'a2' [True]) or a1 not'a3' [True] = True
If a1 is your input then match'a*' evaluates to True and your comparisons evaluate to True.
True and (True)
So a1 is accepted for
match'a*' and (not'a1' or not'a2' or not'a3')
The reason why not'a1' and not'a2' and not'a3' succeeds is:
Logical AND is True when both statements are True.
a1 not'a1' [False] and a1 not'a2' [True] = False
(a1 not'a1' and a1 not'a2' [False]) and a1 not'a3' [True] = False
If a1 is your input then match'a*' evaluates to True and your comparisons evaluate to False.
True and (False)
So a1 is rejected for
match'a*' and (not'a1' and not'a2' and not'a3')
You need to practice manually evaluating conditions.
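The manual evaluation above can be mirrored in a few lines of code. This is a minimal sketch in Python rather than PowerShell; `startswith("a")` is an assumed stand-in for the match'a*' test, and the variable names are made up for illustration.

```python
from functools import reduce

value = "a1"

# Each comparison on its own: is the value not 'a1', not 'a2', not 'a3'?
comparisons = [value != name for name in ("a1", "a2", "a3")]

# Fold the results left to right, two operands at a time.
or_result = reduce(lambda left, right: left or right, comparisons)
and_result = reduce(lambda left, right: left and right, comparisons)

starts_with_a = value.startswith("a")  # assumed stand-in for match'a*'

print(comparisons)                    # [False, True, True]
print(starts_with_a and or_result)    # True  -> a1 is accepted by the 'or' form
print(starts_with_a and and_result)   # False -> a1 is rejected by the 'and' form
```

Changing `value` to another input and rerunning is a quick way to check a hand evaluation.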
References
about_Logical_Operators - PowerShell | Microsoft Docs
Update 01
With respect to
match'a*' and (not'a1' or not'a2' or not'a3')
it is correct to say that it "will basically include everything because it's likely that any value will evaluate to True in at least 2 of those evaluations", because any value will be not'a1' or not'a2'.
match'a*' and (not'a1' or not'a2' or not'a3')
For example, if I input a1 then not'a1' is False. not'a2' is True.
match'a*' and (False or True or not'a3')
So the program will evaluate False OR True.
Logical OR is True when either statement is True.
So False OR True is evaluated to True.
match'a*' and (True or not'a3')
not'a3' is True.
match'a*' and (True or True)
So the program will evaluate True OR True.
True OR True is evaluated to True.
match'a*' and (True)
match'a*' is True.
True and (True)
So the program will evaluate True AND True.
Logical AND is True when both statements are True.
So True AND True is evaluated to True.
True
The final result when using a1 as input is True.
match'a*' and (not'a1' or not'a2' or not'a3')
If I input a2 then not'a1' is True. not'a2' is False.
match'a*' and (True or False or not'a3')
So the program will evaluate True OR False.
True OR False is evaluated to True.
match'a*' and (True or not'a3')
not'a3' is True.
match'a*' and (True or True)
So the program will evaluate True OR True.
True OR True is evaluated to True.
match'a*' and (True)
match'a*' is True.
True and (True)
So the program will evaluate True AND True.
True AND True is evaluated to True.
True
The final result when using a2 as input is True.
a3
match'a*' and (not'a1' or not'a2' or not'a3')
match'a*' and (True or True or not'a3')
match'a*' and (True or not'a3')
match'a*' and (True or False)
match'a*' and (True)
True and (True)
True
a4
match'a*' and (not'a1' or not'a2' or not'a3')
match'a*' and (True or True or not'a3')
match'a*' and (True or not'a3')
match'a*' and (True or True)
match'a*' and (True)
True and (True)
True
Because that statement uses logical OR comparisons, there is no input that will cause them all to be False. When at least one of the statements is True, that causes the whole clause (not'a1' or not'a2' or not'a3') to evaluate to True at the end.
Only the first condition can fail the test.
b1
match'a*' and (not'a1' or not'a2' or not'a3')
match'a*' and (True or True or not'a3')
match'a*' and (True or not'a3')
match'a*' and (True or True)
match'a*' and (True)
False and (True)
False
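The traces for a1, a2, a3, a4 and b1 can be reproduced in one pass. The following is a rough Python sketch, not PowerShell; `or_form` and `and_form` are hypothetical helper names, and `startswith("a")` is assumed to stand in for the wildcard test.

```python
EXCLUDED = ("a1", "a2", "a3")

def or_form(value):
    # match'a*' and (not'a1' or not'a2' or not'a3')
    return value.startswith("a") and any(value != name for name in EXCLUDED)

def and_form(value):
    # match'a*' and (not'a1' and not'a2' and not'a3')
    return value.startswith("a") and all(value != name for name in EXCLUDED)

inputs = ["a1", "a2", "a3", "a4", "b1"]
kept_by_or = [v for v in inputs if or_form(v)]
kept_by_and = [v for v in inputs if and_form(v)]

print(kept_by_or)   # ['a1', 'a2', 'a3', 'a4']
print(kept_by_and)  # ['a4']
```

In the 'or' form only the prefix test ever removes anything, while the 'and' form also removes the three excluded names.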
When you say
anything that equals 'a1', 'a2', or 'a3'.
or
anything that is not 'a1', 'a2', or 'a3'.
or
anything that is not equal to 'a1' AND not equal to 'a2' AND not equal to 'a3'.
you are making confusing generalizations of a concrete process.
My advice is to evaluate the statements individually while following precedence as I've done above. That's what the computer will do. It will always give the right answer without guessing.

How to think about NOT (a1 OR a2)
First, a little theory: NOT, AND and OR are logical operators. A logical operator is a symbol or word used to connect two or more expressions. As mathematical constructs, they have rules associated with them.
By saying NOT (a1 or a2), what we mean is that we are searching for a value that is neither a1 nor a2. And here the confusion enters. Pay attention to how I constructed the sentence with neither and nor. As people, we are used to this line of thinking, which leads us to write NOT a1 OR NOT a2. Although our natural logical intention is correct, from the perspective of mathematics (where this problem belongs), the formal expression is wrong.
Since a1 and a2 are sets, we are "looking for an element that does not belong to set a1 and, at the same time, does not belong to set a2". What we are searching for lies outside the union of both sets. This line of thinking helps to construct the proper expression NOT a1 AND NOT a2; or, put differently, we are "looking for an element that is not a member of a1 or of a2", which means NOT (a1 or a2).
Going back to NOT a1 OR NOT a2: for the OR operator to yield true, it is enough for only one of the operands to yield true. So if an element is NOT a1, the statement returns true regardless of whether the second operand is true or not; in particular, if the element is a member of set a2, which does not intersect a1, the statement still returns true. But this is not what we want. This is why we need the AND operator, which requires both operands to be true before it returns true.
If you want to know more about the formal mathematics related to this issue, this is called De Morgan's law, and the Wikipedia article on it can help you understand how it is derived. It also has good graphical illustrations that help to understand the logic behind it.
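One way to convince yourself is to check every combination of truth values. This is a small Python sketch (the names `is_a1` and `is_a2` are illustrative and simply mean "the value equals a1" and "the value equals a2"):

```python
from itertools import product

rows = []
for is_a1, is_a2 in product([False, True], repeat=2):
    neither = not (is_a1 or is_a2)             # NOT (a1 OR a2)
    and_of_nots = (not is_a1) and (not is_a2)  # NOT a1 AND NOT a2
    or_of_nots = (not is_a1) or (not is_a2)    # NOT a1 OR NOT a2
    rows.append((is_a1, is_a2, neither, and_of_nots, or_of_nots))
    print(is_a1, is_a2, neither, and_of_nots, or_of_nots)

# De Morgan: the first two forms agree on every row.
print(all(row[2] == row[3] for row in rows))  # True
```

The OR-of-NOTs column is False only in the row where both `is_a1` and `is_a2` are True. A single value cannot equal both a1 and a2, so in practice that form is always True.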

Related

Julia, integer vs boolean results from selection of instances in two arrays

I've got my 3D array called Pop. I want to find out how many times two different conditions are met. They both work for me independently, but I can't put the two together.
Pop[end, :, 1] .== 3
works ok, produces an integer vector of 1's and 0's which is correct. Also
Pop[end-1, :, 1] .== 4
works, again returns integer vector, however when I put the two together as:
count(Pop[end, :, 1] .== 3 && Pop[end-1, :, 1] .== 4)
I get this error:
ERROR: TypeError: non-boolean (BitArray{1}) used in boolean context
Which sort of helps: I can see that the two arrays cannot be combined in a boolean way. What is wrong with my syntax for getting the count of the number of times both conditions are met? Simple, I know, but I can't get it! Thx. J
&& is the short-circuiting boolean AND, which means that if the first term is false, the rest isn't evaluated (see documentation). It also means it only works on single Bool values and cannot be broadcast over an array.
& is the bitwise AND operator (documentation), which is the one you want here, because it can be broadcast over arrays with the syntax .&, the same way you use .== . Wrap each comparison in parentheses, as in count((Pop[end, :, 1] .== 3) .& (Pop[end-1, :, 1] .== 4)), because & binds more tightly than ==.
julia> [true, true, false, false] .& [true, false, true, false]
4-element BitVector:
1
0
0
0
Update
In Julia 1.7+, the short-circuiting operators && and || can also be dotted to participate in broadcast fusion as .&& and .|| (#39594):
julia> [true, true, false, false] .&& [true, false, true, false]
4-element BitVector:
1
0
0
0
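For readers more at home outside Julia, the same "compare element-wise, combine element-wise, then count" pattern can be sketched in plain Python. The two lists below are made-up stand-ins for Pop[end, :, 1] and Pop[end-1, :, 1], not data from the question.

```python
# Hypothetical sample rows standing in for the two array slices.
last_row = [3, 3, 1, 3]
prev_row = [4, 2, 4, 4]

# Element-wise conditions, analogous to `.== 3` and `.== 4`.
is_three = [x == 3 for x in last_row]
is_four = [x == 4 for x in prev_row]

# Combine pairwise, analogous to `.&`, then count the True entries.
both = [a and b for a, b in zip(is_three, is_four)]

print(both)       # [True, False, False, True]
print(sum(both))  # 2
```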

Check if array contains more nils than other values

Given an array with only odd counts:
[1,nil,nil]
[1,nil,Module,nil,2]
[1,Class.new,nil]
I would like to determine if there are more nils or more non-nils. The approach I used was to make everything either true or false first, and then to determine if there are more true or false values:
[ 1,nil,nil,nil,2,3].collect {|val| !!!val }.max
#=> ArgumentError: comparison of TrueClass with false failed
The max method does not want to play nice with booleans. How can I accomplish this?
Now this might not be the best approach to determine whether there are more nils or non-nils, but this is the approach that I used.
Given an array with only odd counts
If by that you mean that an array will never hold an equal number of truthy and falsey values, then, first of all, [] is not a valid input.
And here's the solution:
def truthy?(array)
falsey, truthy = array.partition(&:!)
truthy.size > falsey.size
end
You can go with oneliner if you prefer:
def truthy?(array)
array.partition(&:!).max_by(&:size).any?
end
Spec:
truthy?([1,nil,nil]) #=> false
truthy?([1,nil,nil,nil,2]) #=> false
truthy?([1,4,nil]) #=> true
truthy?([1,nil,nil]) #=> false
truthy?([1,nil,Module,nil,2]) #=> true
truthy?([1,Class.new,nil]) #=> true
It uses
Enumerable#partition method;
BasicObject#! method.
If you indeed intended to only calculate nils, not falsey values (as it was stated in the OP):
def more_nils?(array)
array.partition(&:nil?).max_by(&:size).none?
end
Spec:
more_nils?([1,nil,nil]) #=> true
more_nils?([1,nil,nil,nil,2]) #=> true
more_nils?([1,4,nil]) #=> false
more_nils?([1,nil,nil]) #=> true
more_nils?([1,nil,Module,nil,2]) #=> false
more_nils?([1,Class.new,nil]) #=> false
It uses Object#nil? method.
Inspired by @pjs's answer:
array.sum { |el| el.nil? ? -1 : 1 }.negative?
Even simpler (from @SagarPandya's comment):
array.count(nil) > array.compact.count
A fairly straightforward solution would be:
def truthy?(ary)
ary.map { |bool| bool ? 1 : -1 }.sum > 0
end
Map entries to +/-1 based on their truthiness, sum, and see whether the sum is positive or negative.
This can deal with empty arrays, it returns false in that case.
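The same +/-1 idea, sketched in Python for comparison. Note the assumption: Python treats 0, empty strings and empty containers as falsy too, whereas in Ruby only nil and false are falsy. `mostly_truthy` is a made-up name.

```python
def mostly_truthy(values):
    # +1 for each truthy entry, -1 for each falsy one, then check the sign.
    return sum(1 if value else -1 for value in values) > 0

print(mostly_truthy([1, None, None]))             # False
print(mostly_truthy([1, None, object, None, 2]))  # True
print(mostly_truthy([]))                          # False
```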
Here is another one:
if array.size > 2*array.compact.size
# We have more nil than non-nil
end
Assuming that falsy values are nil and false, and everything else is truthy (as conditional statements do), you can leverage Object#itself with Array#select.
irb(main):013:0> ary = [1,nil,nil,false,2]
=> [1, nil, nil, false, 2]
irb(main):014:0> ary.select(&:itself).length
=> 2
irb(main):015:0> ary.reject(&:itself).length
=> 3

Pairwise comparison inside array in Julia

Suppose we have a 6-element array in Julia, for example, Int64[1,1,2,3,3,4]. If we want to compare two arrays elementwise, we know we can use ".=="; but my goal is to do all the pairwise comparisons inside the above array: if the elements (i,j) of a pair are equal, I set the result to 1 (or true), and if they are different, I set it to 0. All the pairwise comparisons are stored in a 6x6 matrix. Is it possible to do that in Julia without a for loop? Thank you.
You can use the fact that broadcasting will compare rows to columns to simply do a comparison between the array and its transpose:
julia> A = [1,1,2,3,3,4]
6-element Array{Int64,1}:
1
1
2
3
3
4
julia> A .== A'
6×6 BitArray{2}:
true true false false false false
true true false false false false
false false true false false false
false false false true true false
false false false true true false
false false false false false true
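To make explicit what the broadcast computes, here is the same 6x6 comparison written out as a nested comprehension. This is a plain-Python sketch for illustration only; `pairwise` is a name chosen here.

```python
A = [1, 1, 2, 3, 3, 4]

# Entry [i][j] is 1 when A[i] == A[j], else 0.
pairwise = [[int(a == b) for b in A] for a in A]

for row in pairwise:
    print(row)
# [1, 1, 0, 0, 0, 0]
# [1, 1, 0, 0, 0, 0]
# [0, 0, 1, 0, 0, 0]
# [0, 0, 0, 1, 1, 0]
# [0, 0, 0, 1, 1, 0]
# [0, 0, 0, 0, 0, 1]
```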

Final logical value of boolean array in ruby

Let's say I have an array that looks like:
[true, true, false]
I am passing an operator along with the array, which may be AND, OR or XOR, and I want to calculate the logical value of the array based on that operator.
For example, for the array [true, true, false] and the operator AND, the operator should be applied in sequence across all n elements:
Steps: true AND true -> true, true AND false -> false
Therefore the output should be false.
The array can hold any number of boolean values.
The best and easiest way to do this is using reduce:
def logical_calculation(arr, op)
op=='AND' ? arr.reduce(:&) : op=='OR' ? arr.reduce(:|) : arr.reduce(:^)
end
Another way is to use inject (an alias of reduce) with a lookup table:
OPS = { "AND" => :&, "OR" => :|, "XOR" => :^ }
def logical_calculation(array, op)
array.inject(&OPS[op])
end
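The lookup-table approach carries over to other languages. As a rough Python sketch using `functools.reduce` and the `operator` module (the function name mirrors the Ruby one purely for readability):

```python
from functools import reduce
import operator

# Map each operator name to a two-argument function.
OPS = {"AND": operator.and_, "OR": operator.or_, "XOR": operator.xor}

def logical_calculation(values, op):
    # Fold the list left to right with the chosen operator.
    return reduce(OPS[op], values)

print(logical_calculation([True, True, False], "AND"))  # False
print(logical_calculation([True, True, False], "OR"))   # True
print(logical_calculation([True, True, False], "XOR"))  # False
```

An unknown operator name raises a KeyError here, which may or may not be the behaviour you want.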

How can I find periodically appearing NA values in an 3D array (along dimension time) with R

I have a time series (monthly values over several years) of spatial data (originally NetCDF) in an array. If there are more than 2 consecutive NAs for the same month (e.g. Januaries), I want to exclude this pixel (now a cell in the matrix of one time step) completely from further analysis by setting it to NA in all time steps.
As far as I know, "time.series" is only valid for vectors or matrices (a maximum of two dimensions).
One workaround I can see (but have not managed to implement) is:
Re-sorting the array so that the order isn't purely chronological anymore but sorted by month (Jan 2001, Jan 2002, Jan 2003, Feb 2001, Feb 2002, Feb 2003, ...) would already help a lot. But it would leave the case that pixels get set to NA if e.g. Jan 2002, Jan 2003 and Feb 2001 are NA.
Any help would be really appreciated. Please ask if my question is unclear - it's my first one - I tried my best.
edit:
My actual dataset is a global satellite-based radiation dataset. Pixels affected by e.g. periodically appearing clouds (during the rainy season, in the same month every year) should not be considered any further. I also have some other criteria that eliminate pixels. Only this one criterion is missing.
# create any array with scattered NAs
set.seed (10)
array <- replicate(48, replicate(10, rnorm(20)))
na_pixels <- array((sample(c(1, NA), size = 7200, replace = TRUE, prob = c(0.95, 0.05))), dim = c(20,10,48))
na_array <- array * na_pixels
dimnames(na_array) <- list(NULL, NULL, as.character(seq(as.Date("2001-01-01"), as.Date("2004-12-01"), "month")))
#I want to test several conditions that would make a pixel not usable for me
#in the end I want to retrieve a mask of usable "pixels".
#what I am doing already is:
mask <- apply(na_array, MARGIN = c(1,2), FUN=function(x){
#check if more than 5% of a pixel's time steps are NA
if (sum(is.na(x)) > (length(x)*0.05)){
mask_val <- 0
}
#check if more than 5 consecutive time steps are NA
else if (max(with(rle(is.na(x)), lengths[values])) > 5){
mask_val <- 0
}
#this is the missing part
else if (...more than 2 januaries or 2 feburaries or... are NA){#check for periodically appearing NAs
mask_val <- 0
}
else {
mask_val <- 1
}
return(mask_val)
})
It is probably more convenient (if the necessary memory is available) to convert your 3D array into a 'long' "data.frame":
as.data.frame(as.table(na_array))
# Var1 Var2 Var3 Freq
#1 A A 2001-01-01 0.01874617
#2 B A 2001-01-01 -0.18425254
#3 C A 2001-01-01 -1.37133055
# ...........................
#9598 R J 2004-12-01 NA
#9599 S J 2004-12-01 -1.11411416
#9600 T J 2004-12-01 0.01435433
Instead of relying on as.table and as.data.frame coercions, it could be done manually and more efficiently:
dat = data.frame(i = rep_len(seq_len(dim(na_array)[1]), prod(dim(na_array))),
j = rep_len(rep(seq_len(dim(na_array)[2]), each = dim(na_array)[1]), prod(dim(na_array))),
date = rep(as.Date(dimnames(na_array)[[3]]), each = prod(dim(na_array)[1:2])) ,
month = rep(format(as.Date(dimnames(na_array)[[3]]), "%b"), each = prod(dim(na_array)[1:2])),
isNA = c(is.na(na_array)))
dat
# i j date month isNA
#1 1 1 2001-01-01 Jan FALSE
#2 2 1 2001-01-01 Jan FALSE
#3 3 1 2001-01-01 Jan FALSE
#4 4 1 2001-01-01 Jan TRUE
# ..............
#9597 17 10 2004-12-01 Dec FALSE
#9598 18 10 2004-12-01 Dec TRUE
#9599 19 10 2004-12-01 Dec FALSE
#9600 20 10 2004-12-01 Dec FALSE
Where i: row in na_array, j: column in na_array, date: 3rd dim of na_array, month: month of the date column (as it will be needed later), isNA: whether the value of na_array is NA.
And building the three conditions:
cond1 = aggregate(isNA ~ i + j, dat, function(x) sum(x) > (dim(na_array)[3] * 0.05))
(A more efficient way to create cond1 is rowSums(is.na(na_array), dims = 2) > (dim(na_array)[3] * 0.05)).
cond2 = aggregate(isNA ~ i + j, dat, function(x) any(with(rle(x), lengths[values]) > 5))
And to compute cond3, first find the number of missing values per "month" per each 'cell' (i.e. [i, j]) ("month" is a variable created/extracted from the dimnames(na_array)[[3]] when creating the 'long' "data.frame" dat in the beginning):
NA_per_month = aggregate(isNA ~ i + j + month, dat, function(x) sum(x))
Having the number of NAs per "month" for each [i, j], we build cond3 by checking if each [i, j] contains any "month" with more than 2 NAs:
cond3 = aggregate(isNA ~ i + j, NA_per_month, function(x) any(x > 2))
(It's trivial to replace aggregate in the above 'group-by' operations by any other available).
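To make the three checks concrete for a single pixel, here is a rough Python sketch rather than R. It assumes NA is represented as None, that `months` gives the calendar month of each time step, and that the thresholds match the ones used above; the function name `pixel_conditions` is hypothetical. Like cond3 above, it counts NAs per calendar month rather than checking that they fall in consecutive years.

```python
from collections import Counter
from itertools import groupby

def pixel_conditions(series, months, max_na_fraction=0.05, max_run=5, max_per_month=2):
    is_na = [value is None for value in series]

    # cond1: too large a share of the time steps is missing.
    cond1 = sum(is_na) > len(series) * max_na_fraction

    # cond2: longest run of consecutive missing time steps exceeds max_run.
    longest_run = max((len(list(run)) for flag, run in groupby(is_na) if flag), default=0)
    cond2 = longest_run > max_run

    # cond3: some calendar month is missing more than max_per_month times.
    na_per_month = Counter(month for month, flag in zip(months, is_na) if flag)
    cond3 = any(count > max_per_month for count in na_per_month.values())

    return cond1, cond2, cond3

months = [i % 12 + 1 for i in range(48)]   # 4 years of monthly steps
series = [1.0] * 48
for january_index in (0, 12, 24):          # three missing Januaries
    series[january_index] = None

print(pixel_conditions(series, months))        # (True, False, True)
print(pixel_conditions([1.0] * 48, months))    # (False, False, False)
```

With 48 time steps and a 5% limit, three NAs already trip cond1, so cond3 only adds information when the thresholds are chosen so that it can fire on its own.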
Perhaps we could avoid creating a 'long' "data.frame" and operate on na_array directly. For example, calculating cond1 with the rowSums version is much more efficient and straightforward. cond2, too, could be computed by an apply on na_array. But cond3 becomes much more straightforward with a 'long' "data.frame" than with a 3D array. So, with efficiency in mind, it is generally better to work with the structure the data already has; if that becomes cumbersome enough, we should probably restructure the data once and do the calculations in the new layout.
To get the final result, allocate a "matrix" of appropriate size:
ans = matrix(NA, dim(na_array)[1], dim(na_array)[2])
and fill in after ORing the conditions:
ans[cbind(cond1$i, cond1$j)] = cond1$isNA | cond2$isNA | cond3$isNA
ans
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] TRUE TRUE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
# [2,] TRUE FALSE FALSE FALSE TRUE TRUE FALSE TRUE FALSE FALSE
# [3,] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
# [4,] FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
# [5,] FALSE FALSE TRUE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
# [6,] FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
# [7,] FALSE FALSE TRUE TRUE TRUE FALSE FALSE TRUE TRUE FALSE
# [8,] TRUE TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE
# [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#[10,] TRUE FALSE TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
#[11,] FALSE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE
#[12,] TRUE TRUE TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
#[13,] FALSE TRUE TRUE FALSE TRUE FALSE FALSE TRUE FALSE TRUE
#[14,] FALSE FALSE TRUE FALSE TRUE FALSE FALSE TRUE FALSE TRUE
#[15,] TRUE TRUE TRUE TRUE FALSE TRUE FALSE FALSE TRUE FALSE
#[16,] FALSE FALSE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
#[17,] TRUE FALSE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE
#[18,] FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE TRUE TRUE
#[19,] FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE
#[20,] TRUE FALSE TRUE TRUE FALSE TRUE TRUE FALSE FALSE TRUE
@alexis_laz: Yes, this works now. Unfortunately I realised that ans[cbind(cond1$i, cond1$j)] = cond1$isNA | cond2$isNA | cond3$isNA is not working. I get the error: number of items to replace is not a multiple of replacement length. I think it only takes cond1 for the replacement. (I am sorry for my example dataset, which gives FALSE in all cases for cond2 and cond3, but it should still check the OR in the code, even though the result will look the same as cond1.) I came up with the following code, which works but is definitely not nice or efficient, because I am not too familiar with boolean operations. Perhaps you could optimize my code or edit your line (as my real dataset is huge, I would be grateful for any optimization). In the end I need all TRUE conditions (meaning NA) to be 0 and all FALSE conditions to be 1. That's why I already did this in my code here.
ans = matrix(NA, dim(na_array)[1], dim(na_array)[2])
cond1_bool <- ans
cond1_bool[cbind(cond1$i, cond1$j)] = cond1$isNA
cond2_bool <- ans
cond2_bool[cbind(cond2$i, cond2$j)] = cond2$isNA
cond3_bool <- ans
cond3_bool[cbind(cond3$i, cond3$j)] = cond3$isNA
ans_bool <- ans
ans_bool[which(cond1_bool == T|cond2_bool == T|cond3_bool == T)] <- 0
ans_bool[which(is.na(ans_bool))] <- 1
