Data Import Handler in sunspot rails - solr

I have some tables in MySQL. I want to import some of the data using the Data Import Handler and index it with Solr. Is that possible? From this link I saw that we need to make some modifications to the data-config.xml file. Is it possible to use DataImportHandler with Sunspot Rails?
If so, at what point does the data import take place when I run the Rails application? I ask because I believe it is not possible to issue DIH commands such as full-import and delta-import from a Rails application.
Please help me resolve this, since I'm a little confused about the interaction between DIH and Solr and the flow of those programs.

Related

Can I query a new handler not in solr config?

I am using Solr 5.4.0.
I want to create a new handler in Solr, say "X". This handler is not defined in the Solr config, but can I define it at run time and include it in the query using the qt parameter?
In the same way that we can override the bq, qf, etc. parameters of an already existing handler from the Solr config, is there support for creating a new handler while issuing the Solr query as well?
I do not remember being able to create additional request handlers via an API in Solr 5.4. You may be able to modify or XInclude a file on the filesystem and reload the core, but that's a bit hacky.
In the latest versions of Solr, you do have:
the Config API, to override solrconfig.xml;
the Request Parameters API, which allows you to define parameter sets that you can apply with the useParams configuration or as a query parameter in the URL.
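To make the parameter-set idea a bit more concrete, here is a minimal Python sketch that only builds the request body and URLs; it does not contact Solr. The base URL, the collection name "mycollection" and the parameter-set name "x" are assumptions made up for illustration, and the exact behaviour depends on your Solr version, so check the reference guide before relying on it.

```python
import json
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed local Solr URL
COLLECTION = "mycollection"                # hypothetical collection name


def params_endpoint():
    """URL that Request Parameters API commands are POSTed to."""
    return f"{SOLR_BASE}/{COLLECTION}/config/params"


def paramset_payload(name, params):
    """Build the JSON body of a 'set' command that stores a named parameter set."""
    return json.dumps({"set": {name: params}})


def query_url(q, paramset):
    """Build a /select URL that applies a stored parameter set via useParams."""
    query = urlencode({"q": q, "useParams": paramset})
    return f"{SOLR_BASE}/{COLLECTION}/select?{query}"


if __name__ == "__main__":
    # Only prints what would be sent; nothing is posted to Solr here.
    print(params_endpoint())
    print(paramset_payload("x", {"defType": "edismax", "qf": "title^2 body"}))
    print(query_url("solr", "x"))
```

This does not create a new request handler; it only stores a named bundle of parameters that an existing handler can pick up at query time.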

How to set Data Import Handler and Scheduler using solrJ Client

I am new to Solr search, and I have completed a simple search.
Now I want to index documents directly from a database and set up a scheduler or trigger that updates the index whenever something changes in the DB.
I know that I can do this with DataImportHandler, but I can't understand its flow.
Can you tell me which steps I should start with?
Or can anyone just give me some pointers?
I want to do all of this using the SolrJ client.
This task requires many parts to work together. Work through https://wiki.apache.org/solr/DataImportHandler
DataImportHandler is a Solr component, which means that it runs inside the Solr instance. All you have to do is configure Solr and then run the DIH through the Dataimport screen.
SolrJ, on the other hand, is an API that makes it easy for Java applications to talk to Solr. So you can write your own applications that create, modify, search and delete documents in Solr.
Try implementing simple edit and delete functions on a button click event:
send the id with the URL to a servlet and do your JDBC operation there.
After that has been committed successfully, call your data import command from SolrJ and redirect to your index page.
That's it.
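Since DIH runs inside Solr, a data import command is just an HTTP request to the DIH handler, so any HTTP client can trigger it once the database change is committed. A rough Python sketch of that step follows (the same request could be sent from SolrJ or a servlet). The base URL, the core name "mycore" and the handler path /dataimport are assumptions; the path is whatever name the handler was registered under in solrconfig.xml.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_BASE = "http://localhost:8983/solr"  # assumed local Solr URL
CORE = "mycore"                            # hypothetical core name
DIH_COMMANDS = {"full-import", "delta-import", "status", "reload-config", "abort"}


def dih_url(command, **options):
    """Build the URL for a DIH command, e.g. delta-import with clean=false."""
    if command not in DIH_COMMANDS:
        raise ValueError(f"unknown DIH command: {command}")
    query = {"command": command, "wt": "json"}
    query.update(options)
    # Assumes the handler is registered as /dataimport in solrconfig.xml.
    return f"{SOLR_BASE}/{CORE}/dataimport?{urlencode(query)}"


def run_dih(command, **options):
    """Send the command to Solr and return the raw response text."""
    with urlopen(dih_url(command, **options), timeout=10) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the URL only; call run_dih(...) against a real Solr instance.
    print(dih_url("delta-import", clean="false", commit="true"))
```

For scheduling, the same URL can be requested periodically by whatever scheduler you already have, since the import itself still executes inside Solr.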

Solr Data Import Scheduling from MySQL

I am a newbie to Solr, and I'm trying to schedule an import from MySQL to Solr.
The full-import functionality works fine when I import data from the Solr admin, but when I try to configure the data import following the documentation provided on the Apache site, I am not able to move forward.
Just after adding the listener to Solr's web.xml file, my Solr goes down and does not come back up. Also, as mentioned in the documentation, there should be a dataimport.properties file under the solr.home/conf/ path, but I don't have that path either in my Solr 4.1.
I'm the author of the scheduling component.
Take a look here for more info: https://github.com/mbonaci/solr-data-import-scheduler

bootstrap solr on tomcat with compositeID routing

We are upgrading Solr 4.0 to Solr 4.3.1 on Tomcat 7.
We would like to use the "compositeId" router. It seems that there are two ways to do that:
1. using collections API to create a new collection by passing "numShards";
2. Passing "numShards" in bootstrap process.
For 1, we have a large amount of existing index data that we don't want to reindex. Hence, we can't create new collections.
The SolrCloud wiki uses Jetty examples, where it is possible to pass the "numShards" parameter. Is it possible to do this in Tomcat?
This is what currently happens in Solr 4.3.1 on Tomcat 7 with the default bootstrap: Solr reads "solr.xml" to find all Solr cores and bootstraps all of them. However, the hash range of each Solr core shows "null" in "clusterstate.json" in ZooKeeper, which results in the "implicit" router being used.
Thanks!
When you want to set up a collection with Solr running in Tomcat (with ZooKeeper running separately), you should use the Collections API: you can specify the number of shards (numShards) and other parameters when calling the CREATE action.
With Solr 4.3.1 there's also a nice option that allows splitting existing shards. Please check out SPLITSHARD in the Collections API.
https://cwiki.apache.org/confluence/display/solr/Collections+API
http://wiki.apache.org/solr/SolrCloud (some points about collections API are also there)
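As a rough illustration of the two Collections API calls mentioned above, the Python sketch below only builds the request URLs. The Tomcat base URL, the collection name "products" and the shard name "shard1" are assumptions for illustration. The sketch does not check whether SPLITSHARD accepts a shard whose hash range shows "null", so try it on a test cluster first.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8080/solr"  # assumed Tomcat URL for the Solr webapp


def collections_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = {"action": action}
    query.update(params)
    return f"{SOLR_BASE}/admin/collections?{urlencode(query)}"


def create_collection_url(name, num_shards, replication_factor=1):
    """CREATE a collection, passing numShards explicitly."""
    return collections_url(
        "CREATE",
        name=name,
        numShards=num_shards,
        replicationFactor=replication_factor,
    )


def split_shard_url(collection, shard):
    """SPLITSHARD an existing shard of a collection."""
    return collections_url("SPLITSHARD", collection=collection, shard=shard)


if __name__ == "__main__":
    # "products" and "shard1" are made-up names for this example.
    print(create_collection_url("products", 2))
    print(split_shard_url("products", "shard1"))
```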

How to add data to the solr's schema

I am trying to add new data to Solandra according to the Solr schema, but I can't find any example of this. My ultimate goal is to integrate Solandra with django-solr.
My understanding of inserting and updating in Solr, based on the original Solr and django-solr, is that the new data is sent over HTTP to the appropriate path, for example:
http://localhost:8983/solandra/wikipedia/update/json
However, when I access the URL, the browser keeps telling me HTTP ERROR: 404.
Can you help me understand the steps to add new data and delete data in the Solandra environment?
I also had a look at the reuters-demo, but the procedure for inserting data is handled inside reutersimporter.jar, and I can't see its source. So please help me understand how the system works in terms of inserting and deleting data.
Thank you.
Since you are using the JSON update handler, this UpdateJSON page on the Solr Wiki has some good examples of inserting data using the JSON handler via curl. Also, the Indexing Data section of the Solr Tutorial shows how you can insert data using the post.jar file that is included with the Solr source.
Are you creating the Solr schema.xml and solrconfig.xml and posting them to Solandra? If you add the JSON handler, then this should work. The reuters-demo uses SolrJ. django-solr should work as well.
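For reference, the curl examples on the UpdateJSON page send a POST with a JSON body to the update URL. A minimal Python sketch of the same add and delete requests is below. It uses the URL from the question, and it assumes that Solandra accepts the same add/delete command format as stock Solr and that the JSON handler is configured; the field names "id" and "title" are made up and must match your schema.

```python
import json
from urllib.request import Request, urlopen

# URL taken from the question; it assumes the JSON update handler is configured.
UPDATE_URL = "http://localhost:8983/solandra/wikipedia/update/json"


def add_payload(doc):
    """JSON body that adds (or overwrites) a single document."""
    return json.dumps({"add": {"doc": doc}})


def delete_payload(doc_id):
    """JSON body that deletes the document with the given id."""
    return json.dumps({"delete": {"id": doc_id}})


def update_request(body):
    """Wrap a JSON body in a POST request for the update handler."""
    return Request(
        UPDATE_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(body):
    """Send the update to the server and return the raw response text."""
    with urlopen(update_request(body), timeout=10) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the payloads only; use send(...) against a running server.
    print(add_payload({"id": "doc1", "title": "hypothetical title"}))
    print(delete_payload("doc1"))
```

Changes only become visible to searches after a commit, which in stock Solr can be requested by sending {"commit": {}} to the same URL.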

Resources