I'm adding search functionality to my website with Solr. I created a Solr core to store articles (id, title, content). Because the text in the content field is very long, I want to limit each result to a few words around the keyword whenever I run a query (like Google's results).
My Solr version: 6.1.0
Thanks!
Use highlighting; this feature is exactly what you are asking for.
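As a rough sketch of what such a request could look like, the snippet below builds a select URL with Solr's standard highlighting parameters (`hl`, `hl.fl`, `hl.fragsize`, `hl.snippets`, `hl.simple.pre`, `hl.simple.post`). The core name `articles`, the host and port, and the helper name `build_highlight_query` are assumptions for illustration; the `content` field comes from the question. The keyword is not escaped here, so it is only suitable for plain words.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, keyword, field="content",
                          fragsize=150, snippets=2):
    """Build a Solr select URL that asks for highlighted snippets.

    base_url is the core URL, e.g. http://localhost:8983/solr/articles
    (hypothetical host and core name).
    """
    params = {
        "q": f"{field}:({keyword})",   # search the long text field
        "fl": "id,title",              # do not return the full content
        "hl": "true",                  # turn highlighting on
        "hl.fl": field,                # field to take snippets from
        "hl.fragsize": fragsize,       # approximate snippet size in characters
        "hl.snippets": snippets,       # max snippets per document
        "hl.simple.pre": "<em>",       # marker placed before each match
        "hl.simple.post": "</em>",     # marker placed after each match
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_highlight_query("http://localhost:8983/solr/articles", "solr"))
```

The snippets come back in a separate `highlighting` section of the response, keyed by document id, rather than inside the documents themselves.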
Related
I'm moving a search from ColdFusion 9 Verity to ColdFusion 10 Solr, but I'm getting some weird results.
For example, if I search for "Fishing and Camping England", including the quotation marks, on Verity I get 7 results, and as you'd expect the results contain the correct phrase "Fishing and Camping England".
But when I search on Solr, I get 1 result, and it's a result I didn't get back previously. The context shows:
about fish! Camping England and
If I search the Solr collection using different search terms, the results/documents I want are actually there. Is there something strange with Solr and search terms in quotation marks? I looked on the Adobe site for Solr terms, and it seems it should be fine. But it's not! I get the same strange results on our local development server and our remote server.
For this example I changed the actual search words, but I hope you get the idea.
There is a difference in how the Verity and Solr search engines work. Verity is a classic search engine, whereas Solr is modern. Solr is more robust and fast. Raymond Camden has explained it well on his blog.
To deal with the difference in Solr's results, you have to choose a search syntax that returns the results you want. Solr supports multiple search syntaxes for finding matches. Here are some examples of Solr search syntax.
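To make the syntax point a bit more concrete, here is a small sketch that builds phrase queries in the standard Lucene/Solr query syntax: a quoted phrase, optionally with a proximity ("slop") value and a field prefix. The helper name `phrase_query` and the field name `contents` are made up for illustration. Note that how strictly a phrase matches may also depend on how the field is analyzed at index time (stemming, stop words), which this snippet does not address.

```python
def phrase_query(phrase, field=None, slop=0):
    """Return a Lucene/Solr phrase query string.

    slop > 0 produces a proximity query such as "a b"~2, which allows
    the words to be up to that many positions apart.
    """
    # Escape backslashes and double quotes so the phrase stays one token group.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    query = f'"{escaped}"'
    if slop > 0:
        query += f"~{slop}"
    return f"{field}:{query}" if field else query


if __name__ == "__main__":
    print(phrase_query("Fishing and Camping England"))
    print(phrase_query("Fishing and Camping England", slop=2))
    print(phrase_query("Fishing and Camping England", field="contents"))
```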
When I search for particular content, the results show the file that contains it. How can I show the line in which that content appears?
I know Alfresco uses Lucene; can I use the Lucene highlighter? If yes, how do I use the Lucene highlighter in Alfresco?
What about Solr, can I use that?
Using 4.2.e without modifications means that you're using Solr.
AFAIK there is no add-on that adds hit highlighting to Alfresco's Solr search subsystem.
It's on the roadmap.
There are quite a few posts about hit highlighting in Alfresco based on Lucene.
Alfresco 5.2 seems to have this feature. The searched-for string is highlighted with context in the search results.
Hey, so I started researching Solr and have a couple of questions about how it works. I know the schema defines what is stored and indexed in the Solr application. But I'm confused as to how Solr knows that the "content" is the content of the site or that the url is the URL.
My main goal is to extract phone numbers from websites, and I want Solr to nicely spit out 1234567890.
You need to define it in Solr's schema.xml by declaring all the fields and their field types. You can then query Solr on any field.
Refer to this: http://wiki.apache.org/solr/SchemaXml
Solr will not automatically index content from a website. You need to tell it how to index your content. Solr only knows the content you tell it to know. Extracting phone numbers sounds pretty simple so writing an update script or finding one online should not be an issue. Good luck!
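A minimal sketch of what the extraction part of such an update script might look like is below. It assumes 10-digit numbers written in common North American styles, and the document field names (`id`, `url`, `phone`) are hypothetical; they would have to match whatever you declare in schema.xml. Sending the resulting document to Solr is left out.

```python
import re

# Assumption: 10-digit numbers such as (123) 456-7890, 123.456.7890, 1234567890.
PHONE_PATTERN = re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")


def extract_phone_numbers(text):
    """Return unique phone numbers found in text, normalized to digits only."""
    numbers = []
    for match in PHONE_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())  # strip punctuation and spaces
        if digits not in numbers:
            numbers.append(digits)
    return numbers


def build_document(url, page_text):
    """Build a dict shaped like a Solr document (field names are hypothetical)."""
    return {"id": url, "url": url, "phone": extract_phone_numbers(page_text)}


if __name__ == "__main__":
    print(build_document("http://example.com",
                         "Call (123) 456-7890 or 987.654.3210."))
```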
I have a single Solr instance for indexing content from multiple sites.
While indexing, I populate a Website field so that I can run a faceted search on that field for each particular website, and that works OK.
However, if I use Solr's MLT feature I get results from all websites, and I want to narrow the MLT results down to a single website.
Is it possible to define a facet for Solr MLT, or is there a better way to achieve the same thing?
If Solr supports that, is it also available in SolrNet?
Solr 3.1 doesn't support filters on the MoreLikeThis component (issue here). You have to use the MoreLikeThis handler, but this handler is not currently implemented in SolrNet (issue here). Update: available as of 0.4.0 beta 1.
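For illustration, a request to the MoreLikeThis handler restricted to one site could be built roughly like this. It is only a sketch: it assumes the handler is registered at `/mlt` in solrconfig.xml, that `id` is the unique key, and that `title` and `content` are the fields to compare; the helper name and the host are made up. The `Website` field is the one from the question.

```python
from urllib.parse import urlencode


def build_mlt_query(base_url, doc_id, website,
                    fields=("title", "content"), rows=5):
    """Build a MoreLikeThis handler URL filtered to a single website."""
    params = {
        "q": f"id:{doc_id}",              # the document to find similar ones for
        "mlt.fl": ",".join(fields),       # fields used to compute similarity
        "mlt.mintf": 1,                   # min term frequency in the source doc
        "mlt.mindf": 1,                   # min number of docs containing a term
        "fq": f'Website:"{website}"',     # filter query: keep one site only
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/mlt?{urlencode(params)}"


if __name__ == "__main__":
    print(build_mlt_query("http://localhost:8983/solr", "42", "example.com"))
```

The `fq` filter is applied to the similar documents, so the site restriction does not need faceting at all.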
I'm looking into a search solution that will identify strings (company names) and use these strings for search and facets in Solr.
I'm new to Nutch and Solr so I wonder if this is best done in Nutch or in Solr. One solution would be to generate a Parser in Nutch that identifies the strings in question and then index the name of the company, later mapped to a Solr value. I'm not sure on how, but I guess this could also be done inside Solr directly from the text?
Does it make sense to do this string identification in Nutch or in Solr and is there some functionality in Solr or Nutch that could help me here?
Thanks.
You could embed an NER library (see OpenNLP, LingPipe, GATE) into a custom parser, generate new fields and create an indexing filter accordingly. This is not particularly difficult, and the advantage compared to doing this on the Solr side is that you'd gain from the scalability of MapReduce (NLP tasks are often CPU-hungry).
See Behemoth for an example of how to embed GATE in MapReduce.
Nutch works with Solr by sending the crawled data to Solr via the Solr HTTP API. You trigger the indexing by calling the solrindex command. See this page for details on how to set this up.
To extract the company names, I would add the necessary code in Solr. I would use an UpdateRequestProcessor. It lets you add an extra step to the indexing process that adds extra fields to the document being indexed. Your UpdateRequestProcessor would examine the document sent to Solr by Nutch, extract the company names from the text and add them as new fields in the document. Solr would then index the document plus the fields that you add.
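The enrichment step itself can be sketched independently of where it runs. The snippet below is plain Python, not an actual UpdateRequestProcessor (those are written in Java and plugged into Solr's update chain); it only illustrates the idea of examining a document's text and adding a new field. The company list, the field names `content` and `company`, and the simple name matching (standing in for a real NER library) are all assumptions.

```python
import re

# Hypothetical list of known company names; a real setup might use NER instead.
KNOWN_COMPANIES = ["Acme Corp", "Globex", "Initech"]


def extract_companies(text, companies=KNOWN_COMPANIES):
    """Return the known company names that appear in text (case-insensitive)."""
    found = []
    for name in companies:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found


def enrich_document(doc, text_field="content", target_field="company"):
    """Return a copy of doc with an extra field listing detected companies."""
    enriched = dict(doc)
    enriched[target_field] = extract_companies(doc.get(text_field, ""))
    return enriched


if __name__ == "__main__":
    doc = {"id": "1", "content": "Initech hired staff from acme corp."}
    print(enrich_document(doc))
```

The added `company` field could then be declared in the schema and used for search and facets.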