I want to store a document which would have a field containing a date. What would be the best way to represent it and which type should I use in its "search definition" ?
I have looked for a "date" type in the documentation (https://docs.vespa.ai/documentation/reference/search-definitions-reference.html#field_types) but haven't found one.
The best way to represent a date in Vespa is to use a unix timestamp.
The field definition in the search definition can be defined as:
# Creation date
field createdate type long {
    indexing: summary | attribute
    attribute: fast-search
}
Note that 'attribute' makes the value searchable, and 'attribute: fast-search' improves search performance.
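On the feeding side, the date has to be turned into a number before it goes into the long field. Here is a minimal Python sketch, assuming the timestamp is stored in whole seconds (milliseconds would work just as well, as long as you are consistent). The helper name to_unix_timestamp and the document_fields dict are only for illustration and are not part of any Vespa API.

```python
from datetime import datetime, timezone


def to_unix_timestamp(dt: datetime) -> int:
    """Return whole seconds since 1970-01-01T00:00:00Z for an aware datetime."""
    if dt.tzinfo is None:
        # A naive datetime would be interpreted in the machine's local zone,
        # which makes the stored value depend on where the feeder runs.
        raise ValueError("expected a timezone-aware datetime")
    return int(dt.timestamp())


# Hypothetical creation date; the field name matches the definition above.
created = datetime(2018, 3, 12, 12, 0, 0, tzinfo=timezone.utc)
document_fields = {"createdate": to_unix_timestamp(created)}
```

Because the value is a plain number, range comparisons on the field behave like date comparisons.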
Related
What is the fl parameter I have to use to get all fields in a document except for "field1" in Solr?
Right now it is not possible to use the fl parameter to name fields to exclude from the results. You have to list all the fields you want and leave field1 out of the list. Another possibility is to use the glob (wildcard) syntax that fl supports, as you can see in the official documentation: https://solr.apache.org/guide/solr/latest/query-guide/common-query-parameters.html#fl-field-list-parameter
A different solution is to not store the field in Solr at all. Whether that is possible clearly depends on how you use the field.
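If the list of fields is known on the client side, the "everything except field1" list can be built there. A minimal sketch, assuming you already have the field names at hand; build_fl and the field names other than field1 are made up for the example.

```python
def build_fl(all_fields, excluded):
    """Return a comma-separated fl value containing every field not excluded."""
    excluded = set(excluded)
    return ",".join(name for name in all_fields if name not in excluded)


# Hypothetical field names; only field1 comes from the question.
schema_fields = ["id", "field1", "title", "description"]
fl = build_fl(schema_fields, ["field1"])
```

The resulting string can be passed as the fl parameter of the request.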
I have Solr 7.2.1, and in my managed-schema.xml file I have a field of type "pDate" that represents a date object.
Now I also need to index the time of day, but I saw that I can't search for the time with the "pDate" field type. If I query Solr for my_date_field:[2018-03-12T00:00:00.000Z TO *] it works; if I instead search for [2018-03-12T12:00:00.000Z TO *] I can't find any results.
So, basically, which type is better for this? Is the field type the origin of the problem?
Solr Ref Guide says: "Solr’s date fields (DatePointField, DateRangeField and the deprecated TrieDateField) represent "dates" as a point in time with millisecond precision." So the field can not be the origin of your problem.
Check the format "12/03/2018T12:00:00.000Z". Is it really correct? I only know dates formatted like "2018-03-12T12:00:00.000Z". See here: Date Formatting
Also find the document in Solr Admin UI and inspect the JSON response. What is the value of my_date_field?
I found the solution:
I had to set the $SOLR_TIMEZONE variable in the "solr.in" config file, and the correct value for me was "Europe/Rome", not "CEST".
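To see why a timezone setting can make a range query miss documents, it helps to look at what a local time becomes once it is written in the UTC form Solr expects. A small Python sketch, assuming a fixed offset of +01:00 for the local time (an assumption for the example, not a statement about what the original setup used); to_solr_date is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone


def to_solr_date(dt: datetime) -> str:
    """Format an aware datetime as a UTC string like 2018-03-12T12:00:00.000Z."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


# Noon local time with an assumed +01:00 offset.
local_noon = datetime(2018, 3, 12, 12, 0, 0, tzinfo=timezone(timedelta(hours=1)))
solr_value = to_solr_date(local_noon)  # "2018-03-12T11:00:00.000Z"
```

A document indexed with this value falls outside [2018-03-12T12:00:00.000Z TO *] even though it was created at noon local time.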
I would like to order my search results according to some fields.
For example:
- title
- description
- some field
I would like to order by title and description. I tried qf, but this does not work in all cases because it searches only in the specified fields. I would like to specify a list of fields, but I don't want to exclude the other fields.
Sort by score is the default sort if you don't specify anything. Perhaps you are looking to boost matches against specific fields. This can be done using eDisMax and specifying boosts in the list of query fields (qf).
For example qf=title^10 description^3 otherfield1 otherfield2
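With eDisMax the per-field boosts go in the qf parameter, and the values need URL encoding when sent over HTTP. A minimal Python sketch of building such a request; the search text "some text" is a placeholder, and the field names are the ones from the example above.

```python
from urllib.parse import parse_qs, urlencode

params = {
    "q": "some text",      # placeholder user query
    "defType": "edismax",  # select the eDisMax query parser
    # Every listed field is searched; the ^N suffix raises its weight.
    "qf": "title^10 description^3 otherfield1 otherfield2",
}
query_string = urlencode(params)

# Decoding again shows the value Solr would receive for qf.
decoded = parse_qs(query_string)
```

Fields listed without a boost are still searched, so no field has to be excluded in order to favor title and description.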
I am reviewing the similarity calculations performed by the DefaultSimilarity class in Lucene invoked by Solr. Specifically, I am not clear about field normalization as to how its calculated when the Solr query doesn't reference a specific field.
norm(t,d) = doc.getBoost() · lengthNorm · ∏ f.getBoost(), with the product taken over every field f in d named as t
where
doc.getBoost() = document's boost specified at index time
f.getBoost() = field's boost specified at index time
lengthNorm = number of terms/tokens in the field
My question is, if a solr query is specified as -
/select?q=indian cricket&rows=5&wt=json
without reference to a specific field in schema.xml, how is norm(t,d) calculated? Is it calculated for every field the term is found in? If so, how are these combined?
Thanks in advance for your insights!
Terms without a field name will use the defaultSearchField setting from the schema, the df (default field) query parameter, or the qf (query fields) parameter (if using (e)dismax), and the terms will be prefixed with the field name. Each field/term combination for each queried field will then be used to evaluate the norm.
Use the debugQuery feature of Solr to see each scored part and how it affects the score.
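As a sketch of what such a request could look like, the snippet below adds df and debugQuery to the query from the question. The host, the core name "mycore" and the default field "text" are assumptions for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical Solr location and core name.
base_url = "http://localhost:8983/solr/mycore/select"

params = {
    "q": "indian cricket",
    "df": "text",          # assumed default field for unqualified terms
    "rows": 5,
    "wt": "json",
    "debugQuery": "true",  # ask Solr to include the score explanation
}
url = base_url + "?" + urlencode(params)

# Decode the query part again to check what will be sent.
sent = parse_qs(urlsplit(url).query)
```

The debug section of the response lists the parsed query with field prefixes and, per document, the factors that went into the score, including the field norm.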
I have a structure of names and dates, that are stored in a log table. They represent user actions and when they did that.
Name Date
John A. 2013-04-01
Leev B. 2013-04-02
Anse E. 2013-04-03
I need to index that data, keeping the relation between name and date.
I've already tried to concatenate the fields, using a separator ($):
"John A.$2013-04-01"
"Leev B.$2013-04-02"
"Anse E.$2013-04-03"
It worked fine, but from now on users can search by a portion of the name, without typing it completely, and use a range for the date. So an ordinary search would be:
fq = log_user_date:["John*2013-12-01" TO "John*2013-12-31"]
It happens that Apache Solr cannot handle a query range with a wildcard in the middle of it.
Is there a better solution for indexing "key value" data?
Why are you concatenating the name and the date with a delimiter? You can instead keep two separate fields: a string field name and a date field date in Solr and do a query like
q=*:*&fq=name:John*&fq=date:[2013-12-01T00:00:00Z TO 2013-12-31T23:59:59Z]
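The two filters can be generated from a name prefix and a pair of dates. A minimal Python sketch, assuming the fields are called name and date as above; day_range_filter is a made-up helper, and the range is built to cover the last day up to 23:59:59Z.

```python
from datetime import date


def day_range_filter(field: str, start: date, end: date) -> str:
    """Build a Solr range filter from the start of `start` to the end of `end` (UTC)."""
    return f"{field}:[{start.isoformat()}T00:00:00Z TO {end.isoformat()}T23:59:59Z]"


# One fq per condition; Solr intersects them.
filters = [
    "name:John*",
    day_range_filter("date", date(2013, 12, 1), date(2013, 12, 31)),
]
```

Keeping the name and the date in separate fields is what allows the wildcard and the range to be used in the same request.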