Elasticsearch dynamic mapping compared to Solr dynamic field - solr

In Solr I can define a dynamic field and tie it to a particular data type. In the following example all fields in an indexed document whose names end with "_dt" will be indexed as a long.
<dynamicField name="*_dt" stored="true" indexed="true" type="long" multiValued="true"/>
In ElasticSearch, knowing the name of the field, I can use the "properties" sub-node in "mappings" to index a field to a particular type.
"properties": {
"msh_datetimeofmessage_hl7_dt": {
"type": "date",
"format": "YYYYMMddHHmmss"
},
I tried the following and attempted using a template, unsuccessfully.
"properties": {
"*_dt": {
"type": "date",
"format": "YYYYMMddHHmmss"
},
Does ElasticSearch provide the same functionality as Solr as described above?
Thanks in advance.

I think you may be looking for functionality provided by dynamic templates. Unless I am mistaken, your mapping would look something like this (mostly borrowed from the linked page).
PUT /my_index
{
"mappings": {
"my_type": {
"dynamic_templates": [
{ "my_date_template": {
"match": "*_dt",
"mapping": {
"type": "date",
"format": "YYYYMMddHHmmss"
}
}}
]
}}}
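To get a feel for which field names a "match" pattern such as "*_dt" would pick up, here is a minimal Python sketch. It only approximates the wildcard with the standard fnmatch module (which also understands ? and [...], and that may not behave the same way in Elasticsearch), and every field name except the first is hypothetical.

```python
from fnmatch import fnmatchcase

# Only the first name comes from the question; the others are hypothetical.
field_names = [
    "msh_datetimeofmessage_hl7_dt",
    "pid_patientname",
    "evn_recorded_dt",
]

def fields_matching(pattern, names):
    """Return the names that a simple '*' wildcard pattern would select."""
    return [name for name in names if fnmatchcase(name, pattern)]

# Fields that the "*_dt" template would be expected to map as dates.
date_fields = fields_matching("*_dt", field_names)
print(date_fields)
```

Fields that do not match any template fall back to the default dynamic mapping.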

Related

pivot facet query does not show nested result

I want to use a pivot facet query with Solr to get counts of documents by specific 'type' in each 'region'. I run the following query:
http://localhost:8983/solr/alfresco/select?facet.pivot=ns:region,ns:type&facet=true&indent=on&q=TYPE:ns\:caseFile&rows=0&start=0&wt=json
I expect to see the number of documents of each specific 'type' in each 'region'. But I get 'region' counts only:
....
"_pivot_mappings_": {
"ns:region,ns:type": "text#s__lt#{http://xxx.eu/model/1.0}region,text#s__lt#{http://xxx.eu/model/1.0}type"
},
"facet.pivot": "ns:region,ns:type",
...
"facet_counts": {
"facet_intervals": {},
"facet_pivot": {
"ns:region,ns:type": [
{
"field": "ns:region",
"count": 479,
"value": "{en}hk"
},
{
"field": "ns:region",
"count": 120,
"value": "{en}gk"
},
{
"field": "ns:region",
"count": 5,
"value": "{en}oc"
},
{
"field": "ns:region",
"count": 2,
"value": "{en}dep"
}
]
},
"facet_queries": {},
"facet_fields": {},
"facet_heatmaps": {},
"facet_ranges": {}
},
Pivot facets are documented to produce the results I expect, but I was unable to get nested counts like the ones shown here.
Are there any limitations in document model or index itself that prevent getting results I expect? Or is the query wrong? Is there anything I can check?
Try "group", but the field must be single-valued.
The problem is that the fields I wanted to use for faceting are simply not "facetable". It is not that they are not visible in the Solr document; that's OK, because the fields are not stored in the index.
What I have learned so far: in order to be facetable, a field should not be tokenized and it needs to have docValues="true" in the Solr schema. With these changes, faceting started to work as expected. Since the Solr schema is built automatically by Alfresco, there is a "facetable" marker for Alfresco properties (fields). Once it is turned on, fields that were previously Indexed and Tokenized become Indexed, DocValues, Omit Norms, Omit Term Frequencies & Positions. Problem solved.
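One thing that can be checked from the client side is whether the response contains a nested level at all. The sketch below assumes the usual pivot response shape, where each entry may carry a "pivot" list with the next level; the 'type' values and their counts are hypothetical, only the region value and its count come from the question.

```python
def flatten_pivot(entries, path=()):
    """Turn nested pivot entries into (value_path, count) rows."""
    rows = []
    for entry in entries:
        current = path + (entry["value"],)
        children = entry.get("pivot")
        if children:
            # Descend into the next pivot level.
            rows.extend(flatten_pivot(children, current))
        else:
            rows.append((current, entry["count"]))
    return rows

# Shape seen in the question: no nested "pivot" key.
flat_only = [{"field": "ns:region", "value": "{en}hk", "count": 479}]

# Hypothetical shape once the fields are facetable.
nested = [{
    "field": "ns:region", "value": "{en}hk", "count": 479,
    "pivot": [
        {"field": "ns:type", "value": "{en}a", "count": 400},
        {"field": "ns:type", "value": "{en}b", "count": 79},
    ],
}]

print(flatten_pivot(flat_only))
print(flatten_pivot(nested))
```

Rows whose path has only one element indicate that the second pivot field produced no buckets.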

Solr grouping results

So I have a solr result with the following JSON:
{
"grouped": {
"manu_exact": {
"matches": 2,
"groups": [
{
"groupValue": "SOLR1000",
"doclist": {
"numFound": 2,
"start": 0,
"docs": [
{
"id": "SOLR1000",
"name": "Solr, the Enterprise Search Server",
"date":"March 1, 2018 03:00:00",
"status":"Cancel"
},
{
"id": "SOLR1000",
"name": "Solr, the Enterprise Search Server",
"date":"March 1, 2018 01:00:00",
"status":"New"
}
]
}
},
{
"groupValue": "VS1GB400C3",
"doclist": {
"numFound": 2,
"start": 0,
"docs": [
{
"id": "VS1GB400C3",
"name": "Retail",
"date":"March 4, 2018 04:00:00",
"status":"Shipped"
},
{
"id": "VS1GB400C3",
"name": "Retail",
"date":"March 4, 2018 02:00:00",
"status":"New"
}
]
}
}
]
}
}
}
The field definitions of the relevant fields are:
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="date" type="tdate" indexed="true" stored="true"/>
<field name="status" type="string" indexed="true" stored="true"/>
The field types of "string" and "tdate" types are as follows:
<fieldType name="string" class="solr.StrField"
positionIncrementGap="100"/>
<fieldType name="tdate" class="solr.TrieDateField" precisionStep="6"
positionIncrementGap="0"/>
The above query is generated by the following grouping parameter:
group=true&group.field=id&group.sort=date desc&group.limit=10
I wish to do the following:
Run a query that returns only those groups that have not been cancelled.
Basically, the sorted group gives the timeline of the status of the product. I only want to retrieve the documents whose latest status is not "Cancel".
This is done by sorting the group by the date field and checking the status of the first document.
In the above example, I do not want to include the first group, because the status field of its latest document has the value "Cancel".
However, I want the second group to be included, because the status field of its latest document has the value "Shipped" and not "Cancel".
Any ideas how to do this?
Any help would be much appreciated.
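For illustration, the rule described above can be applied on the client after the grouped response comes back. This is only a post-filtering sketch in Python, not a Solr-side query: it assumes each group is already sorted with group.sort=date desc, so the first document is the latest one, and it means that the match counts and paging reported by Solr will not reflect the filtering. The function name is hypothetical.

```python
def groups_not_cancelled(response, group_key):
    """Keep groups whose first (latest) document is not in status 'Cancel'."""
    kept = []
    for group in response["grouped"][group_key]["groups"]:
        docs = group["doclist"]["docs"]
        # Compare case-insensitively, since "Cancel" and "cancel" both appear.
        if docs and docs[0]["status"].lower() != "cancel":
            kept.append(group)
    return kept

# Trimmed copy of the response from the question.
response = {"grouped": {"manu_exact": {"matches": 2, "groups": [
    {"groupValue": "SOLR1000", "doclist": {"numFound": 2, "start": 0, "docs": [
        {"id": "SOLR1000", "date": "March 1, 2018 03:00:00", "status": "Cancel"},
        {"id": "SOLR1000", "date": "March 1, 2018 01:00:00", "status": "New"},
    ]}},
    {"groupValue": "VS1GB400C3", "doclist": {"numFound": 2, "start": 0, "docs": [
        {"id": "VS1GB400C3", "date": "March 4, 2018 04:00:00", "status": "Shipped"},
        {"id": "VS1GB400C3", "date": "March 4, 2018 02:00:00", "status": "New"},
    ]}},
]}}}

kept_values = [g["groupValue"] for g in groups_not_cancelled(response, "manu_exact")]
print(kept_values)
```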

Wrong facet results with special characters in facet field

I have implemented Solr search and faceting for e-commerce stores, and I am facing a weird issue with the facet results when a filter is applied. This happens only when the facet field contains a special character (e.g. a bracket); otherwise everything works fine.
I have implemented this using SolrNet. I checked doing raw queries into Solr directly and found that this issue might be in the Solr itself and not related to SolrNet.
Example:
I have a number of products and filters like the following:
RAM (GB)
2 GB
4 GB
8 GB
Memory (GB)
4 GB
8 GB
16 GB
Each facet option has some products in it, so the issue is not about facet.mincount. I have also applied the tagging properly.
Now, one of these facets works fine, while the other one doesn't seem to work with brackets in the facet field.
Here is my schema where I define facet fields.
<dynamicField name="f_*" type="string" indexed="true" stored="true" multiValued="true" required="false" />
<dynamicField name="pa_*" type="string" indexed="true" stored="true" multiValued="true" required="false" />
Facet works fine when I do query for field starting as pa_, but not with f_.
Query I am doing, into Solr:
../select?indent=on&wt=json&facet.field={!ex%3Dpa_RAM(GB)}pa_RAM(GB)&fq={!tag%3Dpa_RAM\(GB\)}pa_RAM\(GB\):2%2BGB&q=CategoryID:(1+OR+2+OR+3+OR+4)&start=0&rows=10&defType=edismax&facet.mincount=1&facet=true&spellcheck.collate=true
Image1
This works fine as expected.
Another query:
../select?indent=on&wt=json&facet.field={!ex%3Df_Memory(GB)}f_Memory(GB)&fq={!tag%3Df_Memory\(GB\)}f_Memory\(GB\):4%2BGB&q=CategoryID:(1+OR+2+OR+3+OR+4)&start=0&rows=10&defType=edismax&facet.mincount=1&facet=true&spellcheck.collate=true
Gives following result:
Image 2
This doesn't work. However, if I remove the special characters from the query and the indexed data, it works fine.
Moreover, the returned facet option is the selected one on which I added filter tag. All other facet options are not returned by Solr.
I am unable to figure out why this happens and how to fix it.
Any clue \ idea will be great!
Please refer to this query and the images. (It is not the right way or a perfect solution.)
../select?indent=on&wt=json&facet.field={!ex%3Df_Memory(GB)}f_Memory(GB)&fq={!tag%3Df_Memory(GB)}f_Memory\(GB\):4%2BGB&q=CategoryID:(1+OR+2+OR+3+OR+4)&start=0&rows=10&defType=edismax&facet.mincount=1&facet=true&spellcheck.collate=true&fl=Id,Name,f_Memory(GB)
Reference link: Local Parameters for Faceting
Please help me!
Special characters in Solr queries (the q and fq parameters) must be escaped if you need to search for them literally; otherwise the query parser assumes their special meaning. (See "Escaping special characters" in the Solr documentation.)
In your example, the + character is not escaped in fq:
{!tag=f_Memory\(GB\)}f_Memory\(GB\):4+GB
Those escaping rules do not apply to local parameters, i.e. everything between {! and }.
In your example, you escaped ( and ) in the tag label. As a result, the label defined as {!tag=f_Memory\(GB\)} in the filter is different from the one referenced by {!ex=f_Memory(GB)} in the facet field, so the filter is not excluded during faceting and only the matching documents are used to build the facets.
You should write filter as:
{!tag=f_Memory(GB)}f_Memory\(GB\):4\+GB
and facet as
{!ex=f_Memory(GB)}f_Memory(GB)
to obtain what you're looking for.
Example of full correct request:
../select?indent=on&wt=json&facet.field={!ex%3Df_Memory(GB)}f_Memory(GB)&fq={!tag%3Df_Memory(GB)}f_Memory\(GB\):4\%2BGB&q=CategoryID:(1+OR+2+OR+3+OR+4)&start=0&rows=10&defType=edismax&facet.mincount=1&facet=true&spellcheck.collate=true
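If the request is built in code, the two rules can be kept apart explicitly: escape the field name and value in the query part, but leave the tag and ex labels raw. The following is a minimal Python sketch of that idea; the helper names are hypothetical, and the set of escaped characters is an assumption modeled on the query-syntax special characters, so check it against the documentation for your Solr version.

```python
from urllib.parse import urlencode

# Assumed set of query-syntax special characters to backslash-escape.
SOLR_SPECIAL = set('\\+-!():^[]"{}~*?|&;/')

def escape_solr(text):
    """Backslash-escape special characters in a field name or term."""
    return "".join(
        "\\" + ch if ch in SOLR_SPECIAL or ch.isspace() else ch
        for ch in text
    )

def tagged_filter(field, value):
    """Build an fq: raw label inside {!tag=...}, escaped query after it."""
    return "{!tag=" + field + "}" + escape_solr(field) + ":" + escape_solr(value)

def excluding_facet(field):
    """Build a facet.field that excludes the filter tagged with the same label."""
    return "{!ex=" + field + "}" + field

field, value = "f_Memory(GB)", "4+GB"
params = [
    ("q", "*:*"),
    ("fq", tagged_filter(field, value)),
    ("facet", "true"),
    ("facet.field", excluding_facet(field)),
    ("wt", "json"),
]
# urlencode percent-encodes '+', '\', '{', '}' and similar characters.
query_string = urlencode(params)
print(query_string)
```

Letting the URL encoder handle the percent-encoding avoids mixing it up with the backslash escaping, which is where the unescaped + slipped through in the original request.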
Simple real example I tested locally:
This is data in core:
Request:
http://localhost:8983/solr/test/select?q=*%3A*&fl=id%2Cf_*%2Cpa_*&wt=json&indent=true
Response:
{
"responseHeader": {
"status": 0,
"QTime": 1,
"params": {
"q": "*:*",
"indent": "true",
"fl": "id,f_*,pa_*",
"wt": "json",
"_": "1474529614808"
}
},
"response": {
"numFound": 2,
"start": 0,
"docs": [
{
"id": "1",
"f_Memory(GB)": [
"4+GB"
],
"pa_RAM(GB)": [
"2+GB",
"4GB",
"8GB"
]
},
{
"id": "2",
"f_Memory(GB)": [
"8+GB"
],
"pa_RAM(GB)": [
"4GB"
]
}
]
}
}
Working faceting:
Request:
http://localhost:8983/solr/test/select?q=*%3A*&fq=%7B!tag%3Df_Memory(GB)%7Df_Memory%5C(GB%5C)%3A4%5C%2BGB&fl=id%2Cf_*%2Cpa_*&wt=json&indent=true&facet=true&facet.field=%7B!ex%3Df_Memory(GB)%7Df_Memory(GB)
Response:
{
"responseHeader": {
"status": 0,
"QTime": 2,
"params": {
"q": "*:*",
"facet.field": "{!ex=f_Memory(GB)}f_Memory(GB)",
"indent": "true",
"fl": "id,f_*,pa_*",
"fq": "{!tag=f_Memory(GB)}f_Memory\\(GB\\):4\\+GB",
"wt": "json",
"facet": "true",
"_": "1474530054207"
}
},
"response": {
"numFound": 1,
"start": 0,
"docs": [
{
"id": "1",
"f_Memory(GB)": [
"4+GB"
],
"pa_RAM(GB)": [
"2+GB",
"4GB",
"8GB"
]
}
]
},
"facet_counts": {
"facet_queries": {},
"facet_fields": {
"f_Memory(GB)": [
"4+GB",
1,
"8+GB",
1
]
},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {},
"facet_heatmaps": {}
}
}

How to sum multivalued field with facet In Solr 5.2

I use JSON Facet API.
When I send a facet request such as the one below:
facet: {
depth1: {
"method": "enum",
"limit" : 30,
"field" : "_srg9jrens_texts",
"type" : "terms",
"sort" : "index asc",
"facet" : {
"stats" : "sum(_45qotu8ef_doubles)"
},
"mincount" : 1
}
}
It responds with the error message "can not use FieldCache on multivalued field: _45qotu8ef_doubles".
As you can see, the field '_45qotu8ef_doubles' is multivalued.
schema.xml
...
<dynamicField name="*_doubles" type="double" indexed="true" stored="true" multiValued="true"/>
...
I need help solving this problem, especially since I have to keep the field multiValued.
Please help!
I ended up using nested documents instead of a multivalued field.

solr data import handler not able to index data

data-config.xml is as follows
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/solrdata" user="root" password="root" />
<document name="cars">
<entity name="car" query="SELECT color FROM solrdata.car ">
<field column="color" name="color" />
</entity>
</document>
</dataConfig>
schema.xml is as follows
<field name="color" type="string" indexed="true" stored="true" />
I checked in debug mode: it is fetching the data but is not able to process it.
output of debug mode is as follows:
{
"responseHeader": {
"status": 0,
"QTime": 312
},
"initArgs": [
"defaults",
[
"config",
"data-config.xml"
]
],
"command": "full-import",
"mode": "debug",
"documents": [
{
"COLOR": [
"red"
]
},
{
"COLOR": [
"silver"
]
},
{
"COLOR": [
"oii"
]
}
],
"verbose-output": [],
"status": "idle",
"importResponse": "",
"statusMessages": {
"Total Requests made to DataSource": "1",
"Total Rows Fetched": "3",
"Total Documents Skipped": "0",
"Full Dump Started": "2013-03-07 15:49:14",
"Total Documents Processed": "0",
"Total Documents Failed": "3",
"Time taken": "0:0:0.281"
},
"WARNING": "This response format is experimental. It is likely to change in the future."
}
You need a uniqueKey for each document to identify it uniquely (it can be considered similar to the primary key in databases).
Modify your entity in data-config.xml as follows:
<entity name="car" query="SELECT color,id FROM solrdata.car ">
<field column="id" name="id" />
<field column="color" name="color" />
</entity>
Note: The field id is your primaryKey for the table car.
In your schema.xml file, add the following line,
<field name="id" type="string" indexed="true" stored="true" required="true" />
Also, make sure that the following text,
<uniqueKey>id</uniqueKey>
in your schema.xml is not commented out.
Now, restart your Solr Web-application and do a Full-import.
I can read from the import handler response:
"Total Documents Failed": "3"
Looks to me as if your query has some problems or the loaded schema doesn't "match" the DIH output. A <uniqueKey> field is not required although highly recommended. But the missing unique key declaration should lead to that error.
Have a look at the "logging" page on the admin console. If the data import handler has problems with the query then you'll find a log entry there.
And don't forget to refresh the schema and the DIH config file if you've applied any changes while the solr instance is running.
Some special fields, such as _version_, should not be in your field list (fl="*" includes all fields).
Try each field individually with the corresponding id
fl="id,color"
