I found an interesting article on Solr facet functions, which are available in Heliosearch.
Is the same functionality available in native Solr?
That kind of functionality will come with SOLR 5.1. Yonik, the developer of Heliosearch, joined a big SOLR company. Development on Heliosearch will not continue, but Yonik is porting the changes to SOLR. See the following thread for details:
https://groups.google.com/forum/#!topic/heliosearch/ji466TddEDY
I don't think there are facet/aggregate functions like that. The closest thing I've found is the stats component:
https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-CombiningStatsComponentWithPivots
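For reference, a stats request is just a normal select query with a couple of extra parameters. Here is a minimal sketch that builds such a URL; the base URL, the collection name and the field names are made up for illustration, and only `stats=true` and `stats.field` come from the stats component itself:

```python
from urllib.parse import urlencode


def stats_url(base_url, collection, query, stats_fields):
    """Build a select URL that asks Solr for stats on the given fields.

    base_url and collection are placeholders; adjust them to your setup.
    """
    params = [
        ("q", query),
        ("rows", 0),          # we only want the stats, not the documents
        ("wt", "json"),
        ("stats", "true"),    # switch the stats component on
    ]
    # stats.field may be repeated, once per field
    params += [("stats.field", name) for name in stats_fields]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, collection and fields
    print(stats_url("http://localhost:8983/solr", "products", "*:*", ["price", "popularity"]))
```

The response should then contain min, max, sum, mean and similar values per requested field, which covers the simpler aggregation cases.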
I saw there was an article in the Apache wiki on OpenNLP for Solr.
Is it valid for the current Solr version, 5.3.1?
No, if you have a look at LUCENE-2899, you'll see that the code discussed was never added to trunk. You'll have to download/patch/update the code yourself if you're going to have it native to Solr.
It's probably a better idea to do all the NLP stuff outside of Solr, then index the result in a form suited for the task you're trying to solve.
Yes. It's better to keep it outside.
Here is a small project I tried.
https://github.com/john77eipe/DeepQA
ElasticSearch has percolator for prospective search. Does SOLR have a similar feature where you define your query upfront? If not, is there an effective way of implementing this myself on top of the existing SOLR features?
Besides what BunkerMentality said, it is not hard to build your own percolator. What you need:
Are the queries you want to run easy to model in plain Lucene syntax? If so, you are good; if not, you need to convert them to plain Lucene queries. Build them once and keep them in memory as Lucene queries.
When a doc arrives:
build a MemoryIndex containing only that single doc
run all your queries on the index
I have done this for a system ingesting millions of docs a day and it worked fine.
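To illustrate only the control flow of the steps above, here is a small Python sketch. It does not use Lucene: the "stored queries" are plain sets of required and forbidden terms, and the "single-document index" is just the set of tokens of the incoming doc. All class and function names are made up. In a real setup the stored queries would be Lucene Query objects and the per-document index a MemoryIndex.

```python
import re
from dataclasses import dataclass


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for a Lucene analyzer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass(frozen=True)
class StoredQuery:
    query_id: str
    must: frozenset                  # every one of these terms has to occur
    must_not: frozenset = frozenset()  # none of these terms may occur

    def matches(self, doc_terms):
        return self.must <= doc_terms and not (self.must_not & doc_terms)


class Percolator:
    """Keeps the queries in memory and matches each arriving doc against all of them."""

    def __init__(self):
        self._queries = []

    def register(self, query_id, must, must_not=()):
        self._queries.append(
            StoredQuery(
                query_id,
                frozenset(t.lower() for t in must),
                frozenset(t.lower() for t in must_not),
            )
        )

    def percolate(self, doc_text):
        doc_terms = tokenize(doc_text)  # the throwaway "index" holding one doc
        return [q.query_id for q in self._queries if q.matches(doc_terms)]


if __name__ == "__main__":
    p = Percolator()
    p.register("solr-news", ["solr"])
    p.register("solr-without-es", ["solr"], must_not=["elasticsearch"])
    print(p.percolate("Solr compared with ElasticSearch"))
```

Running every query against every doc is linear in the number of stored queries, so with a large query set you would probably want to pre-filter the candidates first.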
It's listed as an open new feature, SOLR-4587, on Solr JIRA but it doesn't seem like any work has started on it yet.
There is a link in the comments there to a separate project called Luwak that seems to implement some features similar to percolator.
If it is still relevant, you can use this
It's a Solr update processor that is based on Luwak.
With ElasticSearch, an app can point to the alias of an index, instead of the index directly, which makes it easy to switch the index the app uses.
Tire, the equivalent of Sunspot for ES, allows me to interact with aliases.
I can't find anything regarding aliases with Sunspot. How do you handle them in your apps which use Sunspot?
I do not know anything about Sunspot, but as far as Solr is concerned, there was a core alias feature until version 3.1. It was removed with SOLR-1637 and "really, really" removed with SOLR-6169 in version 4.9.
But with the advent of SolrCloud this feature has been re-introduced with a better/different implementation SOLR-4497 in Solr 4.2.
Unfortunately, when skimming through the Sunspot reference I could not find a word about SolrCloud or aliasing. Perhaps those features have not been adopted by the Sunspot developers? As stated, I do not know Sunspot, so perhaps they name it differently.
Most likely you will have to get your hands dirty and manage SolrCloud, and consequently the aliases, not through the API Sunspot offers but through Solr's own admin interface.
Sources of information
There is this old wiki page that covers SolrCloud. It has a small, separate section about creating aliases.
The official reference also has a section about collection aliases.
The people at Cloudera who donated the feature to Solr have also written a blog post about it.
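In SolrCloud an alias is created through the Collections API with the CREATEALIAS action. Here is a minimal sketch that only builds the request URL; the host, alias and collection names are made up, and you would send the URL with whatever HTTP client you already use:

```python
from urllib.parse import urlencode


def create_alias_url(solr_base_url, alias, collections):
    """Build a Collections API URL that points `alias` at the given collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),  # comma-separated list
    }
    return f"{solr_base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical names: the app always queries "products", the alias decides what is behind it
    print(create_alias_url("http://localhost:8983/solr", "products", ["products_v2"]))
```

The app keeps querying the alias name, and calling CREATEALIAS again with the same name and a different collection repoints it, which gives the same switch-over behaviour as an ES alias.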
Hello Friends,
I want to know whether there is any way to use the Solr Data Import Handler with Cassandra.
A reference site or an example would be a great help.
Thanks
I'd have a look at DataStax's page on Cassandra integration with Solr. Also look at this GitHub repository; it's a library for Cassandra and Solr.
That GitHub library is old. Only DataStax Enterprise integrates Cassandra with Solr, but it's not free.
You can look at the Stargate-core solution for Cassandra, but it uses Lucene.
Another one is Stratio Cassandra, which again uses Lucene.
Hope this helps
Regards
Asit
I am currently using Apache Solr to build a search engine. The queries in Solr are in the field:value format. Now I want to use a part-of-speech tagger to separate the subject, verb and object and search for the values in each field. For example, if I input "Who likes Starbucks" then I need some code to give me "q=subject:*&verb=likes&object=starbucks". Is there any library that can handle this job? Thank you!
I think several people have used UIMA for this; see the Solr wiki.
There are a number of POS taggers. Here is another StackOverflow posting about this: What is a good Java library for Parts-Of-Speech tagging?
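Whichever tagger you pick, the last step is turning its output into a fielded query. Here is a minimal sketch of that step only. It assumes the tagger has already produced a subject/verb/object triple, that the index has fields with those names, and that each value is a single plain token (no escaping of Lucene special characters is done). The function name is made up.

```python
def build_fielded_query(triple, fields=("subject", "verb", "object")):
    """Turn a subject/verb/object triple into a Lucene-style fielded query.

    Parts the tagger could not resolve (None or empty) become a `field:*` wildcard.
    """
    clauses = []
    for name in fields:
        value = triple.get(name)
        clauses.append(f"{name}:{value.lower() if value else '*'}")
    return " AND ".join(clauses)


if __name__ == "__main__":
    # "Who likes Starbucks": the subject is the unknown being asked for
    print(build_fielded_query({"subject": None, "verb": "likes", "object": "Starbucks"}))
```

The resulting string would go into the q parameter as one query rather than being spread over separate request parameters.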