Can I use Solr just to search an existing Lucene index? - solr

I use Lucene locally to index documents, and I know Lucene pretty well. I have never used Solr, but I want to run a web search on top of a Lucene index, so I'm now looking into it.
Can I install Solr on, say, EC2 and then, instead of indexing documents through Solr, index them locally with Lucene directly and just copy the Lucene index from my machine to EC2, where Solr will use it for search?
I'm assuming it's possible as long as I keep the index on disk, but I would like to be sure.
Thanks!

It's certainly possible; you only need to make sure the index keeps exactly the structure that the Solr schema defines. However, it also means that your configuration would live in two completely separate places: each time you change an analyzer in Lucene, for example, you would need to make the same change in Solr's XML configuration. I'm not sure what benefit Solr would bring in such a use case.
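To illustrate why the analyzer has to stay in sync on both sides, here is a minimal sketch in plain Java. It does not use the Lucene or Solr APIs: the two "analyzers" and the tiny inverted index are hypothetical stand-ins. The index is built with a lowercasing analyzer, and the same query then either hits or misses depending on which analyzer processes it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

class AnalyzerMismatchDemo {
    // Toy "analyzers" (assumptions, not Lucene classes): text -> list of terms.
    static final Function<String, List<String>> LOWERCASING =
        text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));
    static final Function<String, List<String>> CASE_PRESERVING =
        text -> Arrays.asList(text.split("\\s+"));

    // Build a term -> document ids map using the given analyzer.
    static Map<String, Set<Integer>> index(List<String> docs,
                                           Function<String, List<String>> analyzer) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : analyzer.apply(docs.get(id))) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // Analyze the query, then return ids of documents containing every query term.
    static Set<Integer> search(Map<String, Set<Integer>> postings, String query,
                               Function<String, List<String>> analyzer) {
        Set<Integer> result = null;
        for (String term : analyzer.apply(query)) {
            Set<Integer> hits = postings.getOrDefault(term, Collections.<Integer>emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? Collections.<Integer>emptySet() : result;
    }

    public static void main(String[] args) {
        // "Indexing side": terms are stored lowercased.
        Map<String, Set<Integer>> postings =
            index(Arrays.asList("Apache Solr", "Apache Lucene"), LOWERCASING);
        // Same analyzer at query time: the document is found.
        System.out.println(search(postings, "Lucene", LOWERCASING));     // prints [1]
        // Different analyzer at query time: the term never matches.
        System.out.println(search(postings, "Lucene", CASE_PRESERVING)); // prints []
    }
}
```

The same kind of silent miss is what you would risk if the analyzer used by your local Lucene indexer and the one declared for the field in Solr drifted apart.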

Related

Solr or Lucene as a single application

Hello, I already have a working application for searching a database that holds about 50M indexed documents. Is there a way to run everything together? I mean, I don't want to talk to Solr over HTTP. What should I do: is it better to use Lucene or EmbeddedSolrServer? Or maybe you have another solution?
I already have something like the first diagram, and I want to do this in a single process.
And if I go with Lucene, can I use my indexes from Solr?
solr-5.2.1
Tomcat v8.0
It is not recommended to deploy both the application and Solr in the same Tomcat.
If Solr crashes, there is a chance the application goes down with it, so it's always better to run Solr independently. Embedding Solr is also not recommended.
The simplest, safest, way to use Solr is via Solr's standard HTTP interfaces. Embedding Solr is less flexible, harder to support, not as well tested, and should be reserved for special circumstances.
For reference: http://wiki.apache.org/solr/EmbeddedSolr
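As a rough sketch of what "the standard HTTP interface" means in practice, a query is just a GET against a core's select handler. The snippet below only builds such a URL with the JDK; the base URL, the core name `mycore`, and the field name `title` are assumptions for illustration, so adjust them to your own setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SolrSelectUrl {
    // Build a select-handler URL; q is URL-encoded, rows limits the result size.
    static String build(String baseUrl, String core, String query, int rows) {
        try {
            String q = URLEncoder.encode(query, "UTF-8");
            return baseUrl + "/" + core + "/select?q=" + q + "&rows=" + rows + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical local Solr with a core named "mycore".
        System.out.println(build("http://localhost:8983/solr", "mycore", "title:lucene", 10));
        // prints http://localhost:8983/solr/mycore/select?q=title%3Alucene&rows=10&wt=json
    }
}
```

Any HTTP client can then fetch that URL, which is what keeps the application and Solr in separate processes.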
It depends. If you want to use parts of the Solr feature set (Solr adds quite a few features on top of Lucene) and you go with plain Lucene, you'll end up reimplementing features that you would otherwise get for free.
You can use EmbeddedSolr to have Solr internal to your application, and then use the EmbeddedSolrServer client in SolrJ to talk to it. The rest of your application would still use Solr as if it were a remote instance.
The problem with EmbeddedSolr is that you'll run into scalability issues as the index grows, since you'll have a harder time scaling onto multiple servers and separating concerns.

Can we use both Solr and Lucene together in Sitecore

Sitecore 7.5 introduces support for using multiple providers in the search index.
Can we use both Solr and Lucene in a Sitecore application?
Yes, you can. As of Sitecore 7.5 you can have different indexes configured to use different search providers. You could, for example, have one (larger) index use Solr and some smaller indexes just run on Lucene.
Previously, you had to choose one, and all your indexes ran under that provider. That limitation is no longer present.

How Lucene works with Neo4j

I am new to Neo4j and Solr/Lucene. I have read that we can use Lucene queries in Neo4j. How does this work, and what is the use of a Lucene query in Neo4j?
I also need a suggestion. I need to write an application to search and analyse data. Which would help me more, Neo4j or Solr?
Neo4j uses Lucene as part of its legacy indexing. Right now, Neo4j supports several kinds of indexes, like creating labels on nodes, and indexes on node properties.
But before Neo4j supported those new features, it primarily used (and still uses) Lucene for indexing. Most developers would create Lucene indexes on particular node properties, which let them use Lucene's query syntax to find nodes within a Cypher query.
For example, if you created an index according to the documentation, you could then search the index for particular values like this:
IndexHits<Node> hits = actors.get( "name", "Keanu Reeves" );
Node reeves = hits.getSingle();
It's Lucene behind the scenes that's actually doing that finding.
In Cypher, it might look like this:
start n=node:node_auto_index('name:M* OR name:N*')
return n;
In this case, you're searching a particular index for all nodes that have a name property that starts with either an "M" or an "N". What's inside that single-quoted expression is just a query in the Lucene query syntax.
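To make the meaning of 'name:M* OR name:N*' concrete, here is a plain-Java sketch of the predicate it expresses. This is not how Neo4j or Lucene evaluates it (Lucene looks terms up in the index, and analysis such as lowercasing can change what matches); the class, method, and sample names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

class PrefixQuerySketch {
    // Keep the names that start with at least one of the given prefixes,
    // i.e. the OR of several prefix conditions on the same field.
    static List<String> namesStartingWith(Collection<String> names, String... prefixes) {
        return names.stream()
            .filter(name -> Arrays.stream(prefixes).anyMatch(name::startsWith))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Morpheus", "Neo", "Trinity");
        System.out.println(namesStartingWith(names, "M", "N")); // prints [Morpheus, Neo]
    }
}
```

The index does the same selection without scanning every node, which is the point of running the query through Lucene.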
OK, so that's how Neo4j uses Lucene. In recent versions, I only use these "legacy indexes" for fulltext indexing, which is where Lucene's strength is. If I just want fast equality checks (where name="Neo"), then I use regular Neo4j schema indexes.
As for Solr, I haven't seen it used in conjunction with Neo4j. Maybe someone will jump in and provide a counter-example, but I usually think of Solr as running on top of a big Lucene index, and in the case of Neo4j, Neo4j itself sits in the middle, so I'm not sure running Solr would be a good fit.
As for your need to write an application to search and analyze data, I can't give you a recommendation: either Neo4j or Solr might help, depending on your application and what you want to do. In general, use Neo4j when you need to express and search graphs. Use Solr when you need to organize and search large volumes of text documents.

Is there something equivalent to Solr's UpdateRequestProcessor in Elasticsearch?

I want to create a plugin that adds a new field to the document before it gets indexed. In Solr there is a specific component for this purpose: UpdateRequestProcessor.
Is there something similar for Elasticsearch?
Although some rivers support scripting to modify documents that are going to be indexed, that would definitely slow indexing down and is not supported within Elasticsearch itself.
Doing this work on the client side is the way to go.
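A minimal sketch of what "on the client side" could look like: enrich the document in your own code, then send the enriched version to Elasticsearch with whatever client you already use. The field names `body` and `word_count` and the word-count logic are assumptions chosen purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ClientSideEnrichment {
    // Return a copy of the document with a derived field added,
    // leaving the caller's map untouched.
    static Map<String, Object> withWordCount(Map<String, Object> doc) {
        Map<String, Object> enriched = new LinkedHashMap<>(doc);
        String body = String.valueOf(doc.getOrDefault("body", "")).trim();
        int count = body.isEmpty() ? 0 : body.split("\\s+").length;
        enriched.put("word_count", count); // hypothetical field name
        return enriched;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("body", "hello search world");
        System.out.println(withWordCount(doc)); // prints {body=hello search world, word_count=3}
    }
}
```

Because the extra field is computed before the index request is made, nothing inside Elasticsearch has to change.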
I just built a tool that allows you to use Solr's UpdateRequestProcessor in Elasticsearch.

What to be aware of when querying an index with Elasticsearch when indexing with Solr?

As part of a refactoring project I'm moving our querying end to Elasticsearch. The goal is to eventually move the indexing end to ES as well, but that is pretty involved and the indexing part is running stably, so it has lower priority.
This leads to a situation where a Lucene index is created and populated using Solr and queried using Elasticsearch. To my understanding this should be possible, since ES and Solr both create Lucene-compatible indexes.
Just to be sure: besides some housekeeping in ES to point it to the correct index, is there any unforeseen trouble I should be aware of when doing this?
You are correct: a Lucene index is part of an Elasticsearch index. However, an Elasticsearch index also contains Elasticsearch-specific index metadata, which would have to be recreated. The trickiest part of that metadata is the mapping, which has to match the Solr schema precisely for every field you care about, and that might not be easy for some data types. Moreover, Elasticsearch expects to find certain internal fields in the index. For example, it cannot function without the _uid field indexed and stored for every record.
In the end, even if you overcome all these hurdles, you might end up with a fairly brittle solution, and you would not be able to take advantage of many advanced Elasticsearch features. I would suggest looking into migrating the indexing portion first.
Have you seen ElasticSearch Mock Solr Plugin? I think it might help you in the migration process.
