I am integrating Solr with Liferay and I want to implement a smart next-word suggester. For example, if the titles of my documents are the following:
Solr is best search engine in world
Solr is implemented on lucene search engine
Solr are lucene used by 80% of developers as search engine in world
lucene doesn't require separate installation on app server for make search implementation
So if I type Solr, I should get the following results:
solr
solr lucene
Solr search
solr engine
solr world
etc.
If I type lucene, I should get the following results:
lucene
lucene search
lucene engine
lucene world
etc.
I tried lots of examples and they work, but I am facing the following problems:
Suggestions work if I start from the prefix Solr. If I start typing any middle word, it doesn't work.
I am getting the complete sentence, not the next best matching word.
Please help. Thanks in advance :)
You can build a "most popular other words" list for the field by using facets. Add an fq with the word you're searching for (to limit the results to the subset of documents that match your word), facet on the field, and ignore the facet entry that has the same name as that word. Do this for each word you're searching for (i.e. add fq=title:solr to the query and ignore solr in the generated facet list).
The next step would be fq=title:solr AND title:lucene, skipping both solr and lucene in the list of facets.
This assumes that you're tokenizing your field appropriately (with the StandardTokenizer or WhitespaceTokenizer, for example) so that you get one token for each word.
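The facet approach above can be sketched roughly as follows. This is only a minimal sketch: it builds the request parameters and post-processes a facet list, and it does not talk to Solr. It assumes the field is called title as in the answer, that the indexed tokens are lowercased, and that the facet counts arrive in Solr's default flat JSON layout (term, count, term, count, ...). The function names and the sample counts are made up for illustration.

```python
from urllib.parse import urlencode


def suggestion_params(typed_words, field="title", limit=10):
    """Build query parameters that facet on `field`, restricted to
    documents containing every word typed so far."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),              # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.mincount", "1"),
        # ask for a few extra entries, since the typed words get dropped
        ("facet.limit", str(limit + len(typed_words))),
    ]
    for word in typed_words:
        # One fq per word; Solr intersects them, like title:solr AND title:lucene.
        # Assumption: words are plain tokens without query-syntax characters.
        params.append(("fq", f"{field}:{word.lower()}"))
    return params


def next_words(facet_list, typed_words, limit=10):
    """Drop the already typed words from a flat [term, count, ...] facet list."""
    typed = {w.lower() for w in typed_words}
    terms = facet_list[0::2]        # even positions hold the terms
    return [t for t in terms if t not in typed][:limit]


if __name__ == "__main__":
    print(urlencode(suggestion_params(["Solr"])))
    # Abridged, made-up facet counts for documents matching title:solr
    sample = ["solr", 3, "engine", 3, "search", 3, "lucene", 2, "world", 2]
    print(next_words(sample, ["Solr"]))
```

Prepending the typed word to each returned term gives candidates such as "solr engine" or "solr search". Common words like "is" or "in" would probably show up in the facet list too, unless the field's analysis chain removes stop words.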
AnalyzingInfixLookupFactory will give you suggestions when you type a middle word as well. I hope that answers your first question.
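For illustration, a request against such a suggester might be built like this. It is a sketch under several assumptions: a suggester using AnalyzingInfixLookupFactory has already been configured in solrconfig.xml under the hypothetical name titleSuggester, it is exposed on a /suggest handler, and the core URL is a placeholder. Nothing is sent over the network here. As far as I know, this lookup returns whole field values, so it addresses the first problem (matching middle words) but not the second one.

```python
from urllib.parse import urlencode


def suggest_url(base_url, typed, dictionary="titleSuggester", count=5):
    """Build a suggester request URL (handler path and dictionary are assumed)."""
    params = {
        "suggest": "true",
        "suggest.dictionary": dictionary,
        "suggest.q": typed,
        "suggest.count": str(count),
        "wt": "json",
    }
    return f"{base_url}/suggest?{urlencode(params)}"


def suggestion_terms(response, dictionary, typed):
    """Pull the suggested terms out of a parsed JSON suggester response."""
    entry = response.get("suggest", {}).get(dictionary, {}).get(typed, {})
    return [s["term"] for s in entry.get("suggestions", [])]


if __name__ == "__main__":
    # Placeholder core URL, not taken from the question
    print(suggest_url("http://localhost:8983/solr/mycore", "luc"))
```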
Related
I'm adding search functionality to my website with Solr. I created a Solr core to store articles (id, title, content). Because the text in the content field is very long, I want to limit each result to a few words around the keyword whenever I get query results (like Google results).
My Solr version: 6.1.0
Thanks!
Use highlighting; this feature is exactly what you are asking for.
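A highlighting request could look roughly like the sketch below. The hl, hl.fl, hl.snippets, hl.fragsize and hl.simple.pre/post parameters are standard Solr highlighting parameters; the field name content comes from the question, while the function names, the snippet count and the fragment size are assumptions chosen for illustration. The sketch only builds parameters and reads a parsed JSON response; it does not contact Solr.

```python
def highlight_params(keyword, field="content", snippets=2, fragsize=150):
    """Parameters asking Solr for short fragments around `keyword`."""
    return {
        "q": f"{field}:{keyword}",
        "hl": "true",
        "hl.fl": field,                 # field to take snippets from
        "hl.snippets": str(snippets),   # max fragments per document
        "hl.fragsize": str(fragsize),   # approximate fragment length in chars
        "hl.simple.pre": "<b>",         # markup placed around the matched term
        "hl.simple.post": "</b>",
        "wt": "json",
    }


def snippets_by_id(response, field="content"):
    """Map each document id to its highlighted fragments for `field`."""
    highlighting = response.get("highlighting", {})
    return {doc_id: fields.get(field, []) for doc_id, fields in highlighting.items()}
```

The fragments come back in a separate highlighting section keyed by document id, so the full content field can be left out of the fl list if only the snippets are displayed.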
Does anybody know how to have Lucene and Solr together in the same Sitecore installation?
Sitecore states that it is possible here:
https://doc.sitecore.net/sitecore_experience_platform/setting_up__maintaining/search_and_indexing/indexing/using_solr_or_lucene
You can mix Lucene and Solr, and, for example, use Solr for xDB and
Lucene for content search at the same time. If an index is small, it
is much easier to manage as a Lucene index because there is little to
no overhead to set it up.
But there is no reference on how to configure it.
Any advice is welcome.
Cheers!
In other words, your analytics indexes will be using SOLR and your content search indexes will be using Lucene.
To configure your analytic indexes to use SOLR, you can check the following documentation from Sitecore: https://doc.sitecore.net/sitecore_experience_platform/setting_up__maintaining/xdb/configuring_servers/configure_a_processing_server#_Solr_configuration
By default, Sitecore is already configured to use Lucene for Content Search, so no change is required for this.
However, I am not sure that SOLR and Lucene can both be used for Content Search (or both for xDB) at the same time, because of how they are configured. For example, Content Search makes use of the master, web and core index configurations. If you decide to use SOLR for Content Search, you will need to disable the Lucene configuration file in the Include folder.
Thanks
I am learning Solr and want to use it for stemming words. I'll be passing a word to Solr and it should send the stemmed word back. I know how to configure a Solr core for different stemming patterns, and I am also able to view the stemmed words in the analyzer (Solr admin UI), but I am not sure how to achieve this using Java code. I am able to index and query using the Java API.
I am using solr-5.3.0.
If you just need to stem words, I would recommend not using the whole of Solr. Just use the code it uses for stemming, or something similar. E.g. you can use
org.apache.lucene.analysis.en.PorterStemmer.stem(String)
Unfortunately, PorterStemmer has package-level access, so I would just copy it from the sources, or you can search the Internet for other stemmer implementations. I hope that helps.
Good luck!
I'm moving a search from ColdFusion 9 Verity to ColdFusion 10 Solr, but I'm getting some weird results.
For example, if I search for "Fishing and Camping England" (including the quotation marks) on Verity, I get 7 results, and as you'd expect the results contain the correct phrase "Fishing and Camping England".
But when I search on Solr, I get 1 result, and it's a result I didn't get back previously. The context shows:
about fish! Camping England and
If I search the Solr collection using different search terms, the results/documents I want are actually there. Is there something strange with Solr and search terms in quotation marks? I looked on the Adobe site for Solr terms, and it seems it should be fine. But it's not! I get the same strange results on our local development server and our remote server.
For this example I changed the actual search words, but I hope you get the idea.
There is a difference between how the Verity and Solr search engines work. Verity is a classic search engine, whereas Solr is modern; Solr is more robust and faster. Raymond Camden has explained it well on his blog.
To resolve the difference in Solr's results, you have to choose a proper search syntax that returns the desired result. Solr supports multiple search syntaxes for finding matching results. Here are some examples of Solr search syntax.
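As a hedged illustration of the kinds of syntax involved, the sketch below builds three standard Lucene/Solr query forms: an exact phrase, a proximity (sloppy phrase) query, and a query that requires every term. The helper names are hypothetical, and the sketch only produces query strings. Whether a phrase matches exactly also depends on how the collection analyzes text (stemming, stop words), which this sketch does not control.

```python
def phrase(text):
    """Exact phrase query: wrap in double quotes, escaping \\ and \"."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def proximity(text, slop):
    """Sloppy phrase: the terms may be up to `slop` positions apart."""
    return f"{phrase(text)}~{slop}"


def all_terms(text):
    """Require every whitespace-separated term, in any order."""
    return " ".join(f"+{word}" for word in text.split())


if __name__ == "__main__":
    print(phrase("Fishing and Camping England"))
    print(proximity("Camping England", 2))
    print(all_terms("Camping England"))
```

Comparing the result counts of these forms against the same collection is one way to see whether the quoted phrase or the analysis of individual terms is responsible for the unexpected match.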
I'm looking into a search solution that will identify strings (company names) and use these strings for search and facets in Solr.
I'm new to Nutch and Solr, so I wonder whether this is best done in Nutch or in Solr. One solution would be to write a parser in Nutch that identifies the strings in question and then indexes the name of the company, which is later mapped to a Solr value. I'm not sure how, but I guess this could also be done inside Solr directly from the text?
Does it make sense to do this string identification in Nutch or in Solr and is there some functionality in Solr or Nutch that could help me here?
Thanks.
You could embed an NER library (see OpenNLP, LingPipe, GATE) into a custom parser, generate new fields and create an indexing filter accordingly. This is not particularly difficult, and the advantage compared to doing this on the SOLR side is that you'd gain from the scalability of MapReduce (NLP tasks are often CPU-hungry).
See Behemoth for an example of how to embed GATE in mapreduce
Nutch works with Solr by indexing the crawled data into Solr via the Solr HTTP API. You trigger the indexing by calling the solrindex command. See this page for details on how to set this up.
To be able to extract the company names, I would add the necessary code in Solr. I would use an UpdateRequestProcessor. It lets you add an extra step to the indexing process that adds extra fields to the document being indexed. Your UpdateRequestProcessor would examine the document sent to Solr by Nutch, extract the company names from the text and add them as new fields in the document. Solr would then index the document plus the fields that you add.
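The logic such a processor would carry out can be sketched as below. A real UpdateRequestProcessor is a Java class plugged into Solr's update chain (the work would happen in its processAdd method); this Python sketch only illustrates the enrichment step on a plain dictionary. It swaps in a simple name-list lookup instead of a real NER library, and the company list, the content field and the companies target field are all hypothetical.

```python
import re

# Hypothetical gazetteer; a real setup might use an NER library instead
KNOWN_COMPANIES = ["Acme Corp", "Globex", "Initech"]


def extract_companies(text, companies=KNOWN_COMPANIES):
    """Return the known company names that occur in `text` (case-insensitive)."""
    found = []
    for name in companies:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found


def process_add(doc, text_field="content", target_field="companies"):
    """Return a copy of `doc` with the extracted names added as a new field."""
    enriched = dict(doc)
    enriched[target_field] = extract_companies(doc.get(text_field, ""))
    return enriched


if __name__ == "__main__":
    doc = {"id": "1", "content": "Initech signed a deal with acme corp."}
    print(process_add(doc))
```

If the target field is declared as a multi-valued string field in the schema, it can then be used for faceting on company names.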