Multiple boost of each matching value in one field - Solr

I have a multivalued field with the following values:
"itm_field_skills":[1, 2]
Now I have the following query:
q=itm_field_skills:(1+OR+2)^5
I get a result, but the score is 5.
I want to make a search request that boosts each matching value so that the score is 10.

Absolute score values aren't something you can rely on. Your query does not mean that your score will be 5 or 10, just that those terms are five/ten times more important than other parts of your query.
If you look at the output of debugQuery, you'll see that the boost (5) is applied separately to each term, where each term's score is the product boost × idf × tfNorm, and the per-term scores are then summed: 1.2343608 + 3.5824406 = 4.8168015.
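For example, sending the request below with debugQuery=true (Solr's standard explain parameter) produces the breakdown that follows:

q=itm_field_skills:(1+OR+2)^5&debugQuery=true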
4.8168015 = sum of:
  1.2343608 = weight(..) [SchemaSimilarity], result of:
    1.2343608 = score(doc=0,freq=1.0 = termFreq=1.0), product of:
      5.0 = boost  <----
      0.3254224 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
        6.0 = docFreq
        8.0 = docCount
      0.7586207 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
        1.0 = termFreq=1.0
        1.2 = parameter k1
        0.75 = parameter b
        1.125 = avgFieldLength
        2.0 = fieldLength
  3.5824406 = weight(..) [SchemaSimilarity], result of:
    3.5824406 = score(doc=0,freq=1.0 = termFreq=1.0), product of:
      5.0 = boost  <----
      0.9444616 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
        3.0 = docFreq
        8.0 = docCount
      0.7586207 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
        1.0 = termFreq=1.0
        1.2 = parameter k1
        0.75 = parameter b
        1.125 = avgFieldLength
        2.0 = fieldLength

Related

Defining variables explicitly vs accessing arrays

I am implementing the Runge-Kutta-Fehlberg method with adaptive step-size (RK45). I define and call my Butcher tableau in a notebook with
module FehlbergTableau
using StaticArrays
export A, B, CH, CT

A = @SVector [ 0 , 2/9 , 1/3 , 3/4 , 1 , 5/6 ]
B = @SMatrix [ 0       0         0       0      0
               2/9     0         0       0      0
               1/12    1/4       0       0      0
               69/128  -243/128  135/64  0      0
               -17/12  27/4      -27/5   16/15  0
               65/432  -5/16     13/16   4/27   5/144 ]
CH = @SVector [ 47/450 , 0 , 12/25 , 32/225 , 1/30 , 6/25 ]
CT = @SVector [ -1/150 , 0 , 3/100 , -16/75 , -1/20 , 6/25 ]
end
using .FehlbergTableau
If I code the algorithm for RK45 straightforwardly as
function infinitesimal_flow(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k1 = Δt * J∇H( t0 + Δt*A[1], x0 )
    k2 = Δt * J∇H( t0 + Δt*A[2], x0 + B[2,1]*k1 )
    k3 = Δt * J∇H( t0 + Δt*A[3], x0 + B[3,1]*k1 + B[3,2]*k2 )
    k4 = Δt * J∇H( t0 + Δt*A[4], x0 + B[4,1]*k1 + B[4,2]*k2 + B[4,3]*k3 )
    k5 = Δt * J∇H( t0 + Δt*A[5], x0 + B[5,1]*k1 + B[5,2]*k2 + B[5,3]*k3 + B[5,4]*k4 )
    k6 = Δt * J∇H( t0 + Δt*A[6], x0 + B[6,1]*k1 + B[6,2]*k2 + B[6,3]*k3 + B[6,4]*k4 + B[6,5]*k5 )
    TE = CT[1]*k1 + CT[2]*k2 + CT[3]*k3 + CT[4]*k4 + CT[5]*k5 + CT[6]*k6
    xt = x0 + CH[1]*k1 + CH[2]*k2 + CH[3]*k3 + CH[4]*k4 + CH[5]*k5 + CH[6]*k6
    norm(TE), xt
end
and compare it with the more compact implementation
function infinitesimal_flow_2(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k = MMatrix{N,6}(0.0I)
    TE = zero(x0); xt = x0
    for i = 1:6
        # EDIT: this is wrong! there should be a new variable here, as pointed
        # out by Lutz Lehmann: xs = x0
        for j = 1:i-1
            # xs += B[i,j] * k[:,j]
            x0 += B[i,j] * k[:,j]  # wrong
        end
        k[:,i] = Δt * J∇H(t0 + Δt*A[i], x0)
        TE += CT[i]*k[:,i]
        xt += CH[i]*k[:,i]
    end
    norm(TE), xt
end
Then the first function, which defines variables explicitly, is much faster:
J∇H(t::Float64, X::SVector{N,Float64}) where N = @SVector [ -X[2]^2, X[1] ]
x0 = SVector{2}([0.0, 1.0])
infinitesimal_flow(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)
infinitesimal_flow_2(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)

@btime infinitesimal_flow($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 19.387 ns (0 allocations: 0 bytes)
@btime infinitesimal_flow_2($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 50.985 ns (0 allocations: 0 bytes)
I cannot find a type instability or anything to justify the lag, and for more complex tableaus it is mandatory that I use the algorithm in loop form. What am I doing wrong?
P.S.: The bottleneck in infinitesimal_flow_2 is the line k[:,i] = Δt * J∇H(t0 + Δt*A[i], x0).
Each stage of the RK method computes its evaluation point directly from the base point of the RK step. This is explicit in the first method. In the second method you would have to reset the point computation in each stage, such as in
for i = 1:6
    xs = x0
    for j = 1:i-1
        xs += B[i,j] * k[:,j]
    end
    k[:,i] = Δt * J∇H(t0 + Δt*A[i], xs)
    ...
The slightest error in the step computation can catastrophically throw off the step-size controller, forcing the step size to fall towards zero and thus the effort to increase drastically. An example is the 4101 error in RKF45.
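Putting the fix together, here is a minimal corrected sketch of the loop version (the name infinitesimal_flow_2_fixed and the explicit imports are mine; everything else combines the question's loop with the fix above):

using StaticArrays
using LinearAlgebra  # provides norm and the uniform scaling 0.0I

function infinitesimal_flow_2_fixed(A::SVector{6,Float64}, B::SMatrix{6,5,Float64},
        CH::SVector{6,Float64}, CT::SVector{6,Float64},
        t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k = MMatrix{N,6}(0.0I)
    TE = zero(x0); xt = x0
    for i = 1:6
        xs = x0                    # fresh copy: every stage starts from the base point
        for j = 1:i-1
            xs += B[i,j] * k[:,j]  # accumulate earlier stages into the copy, not into x0
        end
        k[:,i] = Δt * J∇H(t0 + Δt*A[i], xs)
        TE += CT[i]*k[:,i]         # truncation-error estimate
        xt += CH[i]*k[:,i]         # solution update
    end
    norm(TE), xt
end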

How do I return a value that's dependent on another value?

I have a table of products, and a table of rates. Each product has a set of different rates, and each set has a headline rate. How do I return the headline rate for each product?
Here's an example of the tables:

Products pp

Id   Product
--------------
P1   Product1
P2   Product2
P3   Product3

Rates rr

Id   Productid   Headlinetier   Tier1   Tier2   Tier3
------------------------------------------------------
1    P1          3              0.1     0.2     0.3
2    P2          1              0.4     0.5     0.6
3    P3          2              0.7     0.8     0.9
How do I get the following results?
pp.Product   rr.Headlinerate
----------------------------
P1           0.3
P2           0.4
P3           0.8
You need to join the tables and use a CASE expression to choose between the 3 tiers:
select
    p.product,
    case r.headlinetier
        when 1 then r.tier1
        when 2 then r.tier2
        when 3 then r.tier3
    end headlinerate
from products p inner join rates r
    on r.productid = p.id
If your version is SQL Server 2012+ you can use choose():
select
    p.product,
    choose(r.headlinetier, r.tier1, r.tier2, r.tier3) headlinerate
from products p inner join rates r
    on r.productid = p.id

Very large "fieldLength" values in Solr BM25

I have encountered an issue in the calculation of the fieldLength value in Solr 6. I am using BM25 as the similarity measure. When I index a set of documents, the fieldLength values for these documents are badly wrong: for a title field containing only 9 words, fieldLength stores a value of "5.6493154E19", which is entirely incorrect. When I re-index an individual document, the score is corrected and the fieldLength value becomes "10.24".
When I then re-index the whole corpus, the values are corrupted again and the fieldLength value is back to "5.6493154E19".
Original fieldLength value stored:
4.641637E-19 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
  1.0 = termFreq=1.0
  1.2 = parameter k1
  0.75 = parameter b
  10.727212 = avgFieldLength
  5.6493154E19 = fieldLength
After re-indexing an individual document:
1.0189644 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
  1.0 = termFreq=1.0
  1.2 = parameter k1
  0.75 = parameter b
  10.72807 = avgFieldLength
  10.24 = fieldLength
After re-indexing the whole corpus:
4.641637E-19 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:
  1.0 = termFreq=1.0
  1.2 = parameter k1
  0.75 = parameter b
  10.727212 = avgFieldLength
  5.6493154E19 = fieldLength
Any ideas on where the problem is?

C arithmetic precedence

I stumbled upon a question about arithmetic precedence in a test, and I cannot wrap my head around its answer.
float x = 5 % 3 * + 2 - 4.5 / 5 * 2 + 2;
My "understanding" right now is that multiplication must take place first before division and modulus, yet when I try using that approach, the answer is 6.55 instead of 4.20. I tried playing around with the expression (adding brackets here and there), and it turns out that 5 % 3 takes place first before everything else. I just don't understand why since, according to the precedence table I was provided, that shouldn't be the case. Could someone clear this up for me?
Refer to the C operator precedence documentation.
The precedence of the multiplication, division, and remainder operators is higher than that of addition and subtraction.
When multiplication, division, or remainder operators follow one another, they are left-associative, meaning they are evaluated one by one in the given order.
In your example, 5 % 3 is performed first, then the multiplication (by +2), then the division 4.5 / 5, then the multiplication of that result by 2; only after all that are the subtraction and addition performed.
Your C code:
x = 5 % 3 * + 2 - 4.5 / 5 * 2 + 2;
First, unary plus and unary minus have the highest precedence:
x = 5 % 3 * (+ 2) - 4.5 / 5 * 2 + 2;
Second, multiplication, division, and remainder have the same precedence, associated from left to right:
x = ((5 % 3) * (+ 2)) - ((4.5 / 5) * 2) + 2;
Last, addition and subtraction have the same precedence, associated from left to right:
x = ((((5 % 3) * (+ 2)) - ((4.5 / 5) * 2)) + 2);
Now we evaluate the expression:
x = (((2 * (+ 2)) - ((4.5 / 5) * 2)) + 2);
x = (((2 * 2) - ((4.5 / 5) * 2)) + 2);
x = ((4 - ((4.5 / 5) * 2)) + 2);
x = ((4 - (0.9 * 2)) + 2);
x = ((4 - 1.8) + 2);
x = (2.2 + 2);
x = 4.2;
You can refer to this link for more detail:
http://www.difranco.net/compsci/C_Operator_Precedence_Table.htm
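To double-check the evaluation, a minimal C program (my own sketch) prints the value:

#include <stdio.h>

int main(void) {
    /* 5 % 3 -> 2, then 2 * +2 -> 4 (integer arithmetic);
       4.5 / 5 -> 0.9, then 0.9 * 2 -> 1.8 (double arithmetic);
       finally 4 - 1.8 + 2 -> 4.2, stored in the float. */
    float x = 5 % 3 * + 2 - 4.5 / 5 * 2 + 2;
    printf("%.2f\n", x); /* prints 4.20 */
    return 0;
}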

Database calculations are wrong

Here is what I'm setting:
result = price / (case when tax = 0 then @tax1h / 100 else @tax2 / 100 end + 1)
These are the values:
price = 17.5
tax = 1
tax2 = 6
The expected result is 17.5 / (6 / 100 + 1) = 16.5, yet the query returns 17.5. Why is this happening, and how do I solve it?
Integer division:
select (6 / 100 + 1)
The result of the above is 1.
However, the result of:
select (6 / 100.0 + 1)
Is 1.06.
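One way to fix the original expression is to force decimal arithmetic with a decimal literal in the divisor; a sketch reusing the variable names from the question:

result = price / (case when tax = 0 then @tax1h / 100.0 else @tax2 / 100.0 end + 1)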
