Exact match with special characters in solr - solr

I am using Solr 4.6.1 and have a problem searching for strings that contain special characters. Here is an example:
if I search for the string "choose:", results containing <choose> come first, and results with the exact match <choose:> come at the end of the result set.
Please tell me what I have to do to solve this problem.
"params": {
"lowercaseOperators": "true",
"indent": "true",
"q": "type:service AND tags:\"choose:\"",
"qf": "tags^8",
"_": "1406201797319",
"stopwords": "true",
"wt": "json",
"defType": "edismax"
}

If you search against a StrField, only exact matches will count. You can then score these matches higher, using qf=exact^8 text (if using dismax or edismax as your query parser). In standard Lucene syntax you can search for exact:"choose:"^8 OR text:"choose:" to score the exact matches higher.
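The two variants in the answer above can be sketched as request parameters. This is only a minimal illustration: it assumes the schema has a StrField named exact and an analyzed field named text (the names used in the answer), and the helper names quote_phrase, edismax_params and lucene_query are hypothetical.

```python
from urllib.parse import urlencode


def quote_phrase(text):
    """Wrap text in double quotes, escaping backslashes and quotes inside it."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def edismax_params(user_query):
    # Assumption: "exact" is a StrField and "text" is an analyzed field.
    return {
        "defType": "edismax",
        "q": quote_phrase(user_query),
        "qf": "exact^8 text",  # exact matches weigh 8 times more than text matches
        "wt": "json",
    }


def lucene_query(user_query):
    # Standard Lucene syntax: boost the exact field explicitly.
    phrase = quote_phrase(user_query)
    return "exact:" + phrase + "^8 OR text:" + phrase


print(urlencode(edismax_params("choose:")))
print(lucene_query("choose:"))
```

Either form only changes the ranking if the exact field really is populated, for example through a copy field from tags.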

Related

Can we use Phonetic Analyzer and Synonym maps together in the index of Azure Cognitive Search?

I am trying to enable both the phonetic analyzer and synonym maps in my search index. But when I use both, synonym mapping does not work.
If I remove the phonetic analyzer as part of the index creation, then the synonyms are working fine.
Also, the synonyms work fine with the inbuilt analyzers like en.microsoft.
My Index field:
{
"name": "content",
"type": "Edm.String",
"facetable": false,
"filterable": false,
"key": false,
"retrievable": true,
"searchable": true,
"sortable": false,
"analyzer": "my_standard",
"indexAnalyzer": null,
"searchAnalyzer": null,
"synonymMaps": [
"euc-synonymmap"
]
}
Analyzer definition:
"analyzers":[
{
"name":"my_standard",
"@odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
"tokenizer":"standard_v2",
"tokenFilters":[ "lowercase", "asciifolding", "phonetic" ]
}
]
Synonym map:
{
"format":"solr",
"synonyms" : "features,characteristic,property,detail,facet,factor\n
configure,setup,install,launch,arange\n
issue,problem,controversy,affair\n
troubleshoot,fix,correct,fine-tune,over haul\n
extension,postponement,postpone,delay,addition,add-on\n
computer,desktop,system,laptop,mainframe,machine,PC,Workstation\n
temp,temporary,momentary,short lived"
}
Search query:
{
"search": "Lane acount postponement",
"top":"5",
"highlight":"content"
}
Suppose I have a document whose content contains the terms 'LAN', 'account' and 'Extension', and it has been indexed in the Azure index (with the phonetic analyzer and synonyms). When I pass the search query "Lane acount postponement", the phonetic analyzer matches the term 'Lane' to 'LAN' and 'acount' to 'account'. It also highlights these terms (LAN, account) in the document's content, since I am using hit highlighting in the search query.
As you can see in the synonym map definition, I added extension as a synonym for postponement. But the search neither matches nor highlights the term 'extension'.
I just need to know whether the phonetic analyzer and synonym maps can be used together in a search index.
Please clarify. Thank you in advance.

Solr edismax query syntax error "Query Field '_text_' is not a valid field name"

I had originally created in my solr schema 3 copy fields:
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field": {"source":"company_name","dest":"_text_"}}' http://my-instance/solr/listing/schema
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field": {"source":"address","dest":"_text_"}}' http://my-instance/solr/listing/schema
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field": {"source":"city","dest":"_text_"}}' http://my-instance/solr/listing/schema
However, I have recently removed these from the schema and am now composing queries in a slightly different format. For more advanced queries we need edismax.
However, even with edismax turned on, I'm receiving the error below from the Solr query parser. Did I break something by deleting the copy fields?
/solr/listing/select?debugQuery=on&defType=edismax&q=*%3A*&stopwords=true
{
"responseHeader": {
"zkConnected": true,
"status": 400,
"QTime": 1,
"params": {
"q": "*:*",
"defType": "edismax",
"debugQuery": "on",
"stopwords": "true"
}
},
"error": {
"metadata": [
"error-class",
"org.apache.solr.common.SolrException",
"root-error-class",
"org.apache.solr.common.SolrException"
],
"msg": "org.apache.solr.search.SyntaxError: Query Field '_text_' is not a valid field name",
"code": 400
}
}
As per the comments, the '_text_' field remains in 3 places in the config:
"/update/extract":{
"startup":"lazy",
"name":"/update/extract",
"class":"solr.extraction.ExtractingRequestHandler",
"defaults":{
"lowernames":"true",
"fmap.content":"_text_"}}
"spellchecker":{
"name":"default",
"field":"_text_",
"initParams":[{
"path":"/update/**,/query,/select,/tvrh,/elevate,/spell,/browse",
"defaults":{"df":"_text_"}}]
As per the comment on my question (I'm still on the learning path of solr):
Although they have been deprecated for quite some time, Solr still has
support for Schema based configuration of a <defaultSearchField/>
(which is superseded by the df parameter) and <solrQueryParser defaultOperator="OR"/> (which is superseded by the q.op parameter).
If you have these options specified in your Schema, you are strongly
encouraged to replace them with request parameters (or request
parameter defaults) as support for them may be removed from future
Solr release.
For our purposes and as we are using the edismax query parser we needed to specify the query fields that we wanted to use.
This post is over two years old, so I'm not sure this will help.
Since you are using "defType": "edismax",
try "q.alt": "*:*" instead of "q": "*:*". This should fix the issue.
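A minimal sketch of such a request, combining the q.alt suggestion with explicit query fields so that edismax does not fall back to the default _text_ field. The field names company_name, address and city come from the question; the helper name match_all_params is hypothetical, and whether this resolves the error on a given Solr version is an assumption.

```python
from urllib.parse import urlencode


def match_all_params(query_fields):
    """Build edismax parameters that match all documents via q.alt."""
    return {
        "defType": "edismax",
        "q.alt": "*:*",                 # used by (e)dismax when q is absent
        "qf": " ".join(query_fields),   # explicit fields instead of df=_text_
        "wt": "json",
    }


params = match_all_params(["company_name", "address", "city"])
url = "/solr/listing/select?" + urlencode(params)
print(url)
```

Note that the "df":"_text_" default in initParams still applies to other parsers and handlers, so it may also need updating.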

Solr returns different result for each letter change

I am trying to search for products that have "camel" in their display names. All of the indexing has been done. The problem is:
When I search for "camel" I get 1 product:
"name": "CHANEL HYDRA BEAUTY CAMELLIA WATER CREAM ILLUMINATING HYDRATING FLUID 30ML"
But when I search for "CAMELL", I get 3 products from Solr:
{
"name": "CLE DE PEAU Lipstick #5 Camellia"
},
{
"name": "CHANEL HYDRA BEAUTY CAMELLIA WATER CREAM ILLUMINATING HYDRATING FLUID 30ML"
},
{
"name": "HERA Rouge Holic Shine No.315 Camellia Orange"
}
When I search for CAMEL, I should get these 3 as well. Why isn't it working?
The issue was fixed after setting the wildcard flag to true on the indexed properties in Hybris. Thanks to everyone for your help and ideas.
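To illustrate why a wildcard (prefix) search helps here, the following is a deliberately simplified model in plain Python. It is not Solr's or Hybris's actual analysis chain and does not reproduce the exact result counts reported above; it only contrasts whole-token matching with prefix matching. The function names are hypothetical.

```python
def tokenize(name):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return name.lower().split()


def whole_token_match(term, name):
    # Matches only when the term equals a complete token.
    return term.lower() in tokenize(name)


def prefix_match(term, name):
    # Matches when any token starts with the term, like a trailing wildcard.
    return any(token.startswith(term.lower()) for token in tokenize(name))


names = [
    "CLE DE PEAU Lipstick #5 Camellia",
    "CHANEL HYDRA BEAUTY CAMELLIA WATER CREAM ILLUMINATING HYDRATING FLUID 30ML",
    "HERA Rouge Holic Shine No.315 Camellia Orange",
]

print([n for n in names if whole_token_match("camel", n)])
print([n for n in names if prefix_match("camel", n)])
```

In this model "camel" is not a complete token in any name, so only the prefix variant finds the three products.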

How to handle multi-word/phrase synonyms in Azure Search

According to the article https://azure.microsoft.com/pl-pl/blog/azure-search-synonyms-public-preview/ I should be able to use multi-word/phrase synonyms in synonymMaps:
Multi-word synonyms
In many full text search engines, support for synonyms is limited to single words. Our team has engineered a solution that allows Azure Search to support multi-word synonyms. This allows for phrase queries (“”) to function properly while using synonyms. If someone has mapped ‘hot tub’ to ‘whirlpool bath’ and they then search for “large hot tub,” Azure Search will return matches which contain both “large hot tub” and “large whirlpool bath.”
However, in my case I got match on sub words.
My synonymMap looks like:
{"name":"map",
"format":"solr",
"synonyms":"Gastroenterology (acute and chronic),vomiting, diarrhoea, weight loss\n"}
And I have documents in search index which contains medicine disciplines like Gastroenterology (acute and chronic).
What I receive after ?search="vomiting" is:
{
"@search.score": 1.0405536,
"@search.highlights": {
"disciplines/name": [
"<em>Acute</em> <em>and</em> <em>chronic</em> ear disease",
"<em>Acute</em> <em>and</em> <em>chronic</em> skin disease",
"<em>Gastroenterology</em> (<em>acute</em> <em>and</em> <em>chronic</em>)",
"Haematology (<em>acute</em> <em>and</em> <em>chronic</em>)",
"Respiratory medicine (<em>acute</em> <em>and</em> <em>chronic</em>)"
],
And I am expecting:
{
"@search.score": 1.0405536,
"@search.highlights": {
"disciplines/services/translatedName": [
"<em>Gastroenterology (acute and chronic)</em>"
],
Am I doing something wrong?
I tried shortening the main term to a single word such as Gastroenterology, but some of them simply cannot be shortened.
Providing quotes like synonyms => "Gastroenterology (acute and chronic)" also does not work.
UPDATED
I was wondering why I thought there was a problem.
Well, in the question I provided:
{"name":"map",
"format":"solr",
"synonyms":"Gastroenterology (acute and chronic),vomiting, diarrhoea, weight loss\n"}
But I am actually using:
{"name":"map",
"format":"solr",
"synonyms":"Gastroenterology (acute and chronic),vomiting, diarrhoea, weight loss
=> Gastroenterology (acute and chronic)\n"}
In that case I have 4 results:
"@odata.count": 4,
"value": [
{
"@search.score": 1.0137179,
"@search.highlights": {
"disciplines/services/translatedName": [
"<em>Acute</em> <em>and</em> <em>chronic</em> ear disease",
"<em>Acute</em> <em>and</em> <em>chronic</em> skin disease",
"<em>Gastroenterology</em> (<em>acute</em> <em>and</em> <em>chronic</em>)",
"Haematology (<em>acute</em> <em>and</em> <em>chronic</em>)",
"Respiratory medicine (<em>acute</em> <em>and</em> <em>chronic</em>)"
],
"equipment/translatedName": [
"Emergency <em>and</em> crictial care",
"In house skin <em>and</em> ear cyology"
],
"disciplines/translatedName": [
"Anaesthesia <em>and</em> analgesia",
"Emergency <em>and</em> critical care"
]
},
...
{
"@search.score": 0.33542877,
"@search.highlights": {
"disciplines/services/translatedName": [
"<em>Chronic</em> pain management"
],
"disciplines/translatedName": [
"Anaesthesia <em>and</em> analgesia"
]
},
...
{
"@search.score": 0.13757591,
"@search.highlights": {
"equipment/translatedName": [
"Emergency <em>and</em> crictial care"
],
"disciplines/translatedName": [
"Emergency <em>and</em> critical care"
]
},
...
{
"@search.score": 0.07112321,
"@search.highlights": {
"disciplines/services/translatedName": [
"<em>Chronic</em> pain management"
]
},
Could you explain to me how it works in that case?
Azure Search does support multi-word synonyms and the result in your case is as expected. There are a couple of things to be called out here.
First, ?search="vomiting" will return docs that match 'vomiting' or its specified synonyms anywhere within the document. The multi-word synonym Gastroenterology (acute and chronic) in the collection disciplines/name matches your query, so the document is returned.
The second thing, which is probably the source of confusion, is the highlighting. Azure Search doesn't currently support phrase highlighting. When used with a phrase query, it highlights the individual terms in the phrase. Since the matching document also had those individual terms elsewhere, all of them were highlighted. See "Azure search highlights for phrases with double quotes" for more details.
So, the multi-word synonym expansion and search are functioning as expected. You can test this by indexing one test document that contains just Gastroenterology (acute and chronic) and another that contains just acute and chronic. The query should return only the first document.
If you have a strict requirement on highlighting phrases, you'll have to do some client-side processing after retrieving the search results.
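One possible shape for that client-side processing is sketched below. It assumes the default <em> highlight tags and that the client knows which phrase it wants highlighted; the function name highlight_phrase is hypothetical and not part of any Azure SDK.

```python
import re

EM_TAG = re.compile(r"</?em>")


def highlight_phrase(fragment, phrase):
    """Drop per-term <em> tags, then wrap whole-phrase matches in <em>."""
    plain = EM_TAG.sub("", fragment)
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return pattern.sub(lambda m: "<em>" + m.group(0) + "</em>", plain)


fragments = [
    "<em>Acute</em> <em>and</em> <em>chronic</em> ear disease",
    "<em>Gastroenterology</em> (<em>acute</em> <em>and</em> <em>chronic</em>)",
]
phrase = "Gastroenterology (acute and chronic)"

for fragment in fragments:
    print(highlight_phrase(fragment, phrase))
```

Fragments that contain only some of the terms come back without any highlighting, so the client can filter them out if desired.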

Solr Faceting - simple example complaining about asterisk

I'm doing the most basic of solr queries with faceting.
q=*:*&facet=true&facet.field=year
And I'm getting an error as follows:
{
"responseHeader": {
"status": 400,
"QTime": 1,
"params": {
"indent": "true",
"q": "*:*&facet=true&facet.field=year",
"_": "1443134591151",
"wt": "json"
}
},
"error": {
"msg": "undefined field *",
"code": 400
}
}
This query is straight out of the online tutorials. Why is solr complaining?
It appears that what you have done is gone to the Solr Admin panel and in the query section you have put
*:*&facet=true&facet.field=year
after the q. What you need to do is put *:* after the q, and facet=true&facet.field=year under Raw Query Parameters.
The error says that you have an "undefined field". Is the "year" field defined in your schema? Also, can you give details about how you are querying the data, such as which client? I assume that q=*:* is working and the issue is only with faceting.
You've put it into the wrong line in the solr admin.
Just take the same line, and paste it into the Raw query line instead of the query line.
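The difference between the two ways of entering the query can be seen by building the request parameters in code. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from urllib.parse import urlencode, parse_qs

# What the admin UI sends when everything is typed into the q box:
# the whole string becomes the value of q, so Solr tries to parse it as a query.
wrong = {"q": "*:*&facet=true&facet.field=year", "wt": "json"}

# What was intended: q and the facet options as separate parameters.
right = {"q": "*:*", "facet": "true", "facet.field": "year", "wt": "json"}

print(urlencode(wrong))
print(urlencode(right))

# Parsing the encoded strings back shows which parameters Solr would see.
print(sorted(parse_qs(urlencode(wrong))))
print(sorted(parse_qs(urlencode(right))))
```

In the first case the ampersands are percent-encoded into the q value, which matches the "q" shown in the error response above.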
