Normally we index documents in Solr from a browser. Can we do it automatically by writing a batch job or Java code?
Please give me some pointers if this is possible.
You can use the DataImportHandler, which can import from a lot of different sources such as databases or XML files: https://wiki.apache.org/solr/DataImportHandler
If you have specific requirements that are not satisfied by the DataImportHandler, you can implement your own indexer using a Solr client API:
https://cwiki.apache.org/confluence/display/solr/Client+APIs
If you want to work with Solr programmatically, take a look at SolrJ, which is an API that will do what you're asking for.
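If it helps, here's a minimal sketch of indexing a document with SolrJ (this uses the 6.x-style builder; the URL, core name, and fields are placeholders for your setup):

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) throws Exception {
        // URL and core name are placeholders; point this at your Solr instance
        SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("title", "Indexed from a Java batch job");
        solr.add(doc);

        solr.commit(); // make the document visible to searches
        solr.close();
    }
}
```

Run something like this from a scheduled batch job (cron, Quartz, etc.) to index without touching a browser.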
You can use a web debugging proxy such as Fiddler to view the HTTP request that is generated when you trigger the data import via a web browser. Then send the same request from your Java code.
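For example, if the browser is triggering the DataImportHandler, the replayed request can be as simple as a GET; the host, core, and handler path below are assumptions, so copy the exact URL the proxy shows you:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class DataImportTrigger {
    public static void main(String[] args) throws Exception {
        // Assumed DIH endpoint; replace with the request captured in Fiddler
        URL url = new URL("http://localhost:8983/solr/mycore/dataimport?command=full-import");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        System.out.println("Solr responded with HTTP " + conn.getResponseCode());
        conn.disconnect();
    }
}
```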
I am using the Nutch REST API to run Nutch crawls on a separate server. I would like to retrieve the crawled data back to my local machine. Is there a way I can use the Nutch dump functionality to dump the data and retrieve it via the API, or am I better off indexing the data into Solr and retrieving it from there?
Thanks for your help.
Currently, the REST API doesn't provide such functionality. The main purpose of the REST API is to configure and launch your crawl jobs; at its core, it allows you to set up the configuration of a new crawl job and manage it (to some extent).
The transfer of the crawled data is up to you. That being said, I do have a couple of recommendations:
If you're sending the data into Solr/ES (or any other indexer), I would recommend getting the data directly from there. Both Solr and ES already provide a REST API, with the additional benefit that you can filter which data to "copy over" (see the sketch after this list).
If you're running Nutch in distributed mode (i.e. in a Hadoop cluster), try to use the Hadoop libraries to copy the data to the destination.
If none of this applies, then relying on something else like rsync or similar might be worth considering.
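As a rough sketch of the first option, here's how you might pull the crawled documents out of Solr with SolrJ (the host, core, and field names are assumptions about your index):

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class CrawlDataExport {
    public static void main(String[] args) throws Exception {
        // Host, core, and field names are placeholders for your Nutch index
        SolrClient solr = new HttpSolrClient.Builder("http://remote-host:8983/solr/nutch").build();

        SolrQuery query = new SolrQuery("*:*");
        query.setRows(100); // page through bigger result sets with start/rows or cursorMark
        QueryResponse response = solr.query(query);

        for (SolrDocument doc : response.getResults()) {
            System.out.println(doc.getFieldValue("url") + " -> " + doc.getFieldValue("title"));
        }
        solr.close();
    }
}
```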
I would like to validate queries against a schema before actually executing them.
Is there an official API which will give me access to the schema, or will I have to parse the Solr configuration XML myself?
The usual trick for finding these resources is to open the Admin interface with the developer network tool running in your browser, then navigate to the resource you're looking for while watching which requests your browser performs. Since the frontend is purely JavaScript-based and runs in your browser, it accesses everything through the API exposed by Solr.
You'll have to parse something, probably in JSON or XML format. For my older 4.10.2 installation, it is available as:
/solr/corename/admin/file?file=schema.xml&contentType=text/xml;charset=utf-8
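Fetching that resource from Java is then straightforward; a minimal sketch (host and core name are placeholders, and the path may differ on other Solr versions):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class SchemaFetcher {
    public static void main(String[] args) throws Exception {
        // Path observed via the dev-tools trick above; host and core are placeholders
        URL url = new URL("http://localhost:8983/solr/corename/admin/file"
                + "?file=schema.xml&contentType=text/xml;charset=utf-8");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // feed this into the XML parser of your choice
            }
        }
    }
}
```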
I'm building a simple Angular application with a small administrator panel for updating the content (a .json document). I'm looking for a way to edit the JSON document from the administrator panel.
I can manipulate the memory-loaded JSON but I can't save it. Is there a way to put the JSON file in some kind of cloud database and connect to it without setting up a server or backend for my application?
I want my application to be easily deployable on any FTP host, so I can't set up a Node server or install something like CouchDB.
Any ideas are appreciated.
You could use a provider like Parse. It's free (up to a limit of requests per month) and has a nice JavaScript SDK that will get you up and running quickly. https://parse.com/
Also, check out this query builder to aid in retrieving your data from Parse. It's built as an Angular service for easy integration. https://github.com/dpollot/parse-query
EDIT
Parse also offers hosting, for free.
Is it possible to get the server load information of a web server deployed on ServiceMix / Fuse ESB?
I don't want to use JConsole; instead I'd like to get the information by running a Java program and writing the values into a text file.
Could someone point me to some code that I can run on my machine?
Cheers,
Kunal
You can also install Jolokia in ServiceMix, which exposes a REST interface over JMX. This makes it much easier for non-Java developers and other programming languages to access the metrics: it's just an HTTP call to get the data.
http://www.jolokia.org/
We use this library for the http://hawt.io management console so we can get the data easily from a modern HTML5 web console.
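As an illustration, reading the system load average through Jolokia's read endpoint is a single HTTP call; the host and port below are assumptions, so adjust them to wherever Jolokia is exposed in your ServiceMix:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class JolokiaLoadReader {
    public static void main(String[] args) throws Exception {
        // Jolokia read syntax: /jolokia/read/<mbean>/<attribute>; host/port assumed
        URL url = new URL("http://localhost:8181/jolokia/read/"
                + "java.lang:type=OperatingSystem/SystemLoadAverage");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON payload containing the attribute value
            }
        }
    }
}
```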
I won't write the code you ask for, but..
Everything in JConsole is accessed through JMX. And everything in JMX is accessible via code as well (basic tutorial here).
So just locate the value(s) you are interested in using JConsole, then extract them using the JMX API in code.
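As a rough illustration of that approach (not a complete solution), this sketch connects over JMX, reads the OS load average, and appends it to a text file; the service URL and file name are assumptions, so use whatever JMX port your container actually exposes:

```java
import java.io.FileWriter;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import javax.management.JMX;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class LoadLogger {
    public static void main(String[] args) throws Exception {
        // Assumed JMX service URL; check your ServiceMix/Karaf config for the real one
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            OperatingSystemMXBean os = JMX.newMXBeanProxy(mbsc,
                    new ObjectName(ManagementFactory.OPERATING_SYSTEM_MXBEAN_NAME),
                    OperatingSystemMXBean.class);
            try (FileWriter out = new FileWriter("load.txt", true)) {
                out.write("systemLoadAverage=" + os.getSystemLoadAverage() + "\n");
            }
        }
    }
}
```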
I am working on a POC where I have to display faceted search results on a web page. Can anybody suggest what setup I need in order to display them? I would prefer Java technologies. Just to mention, I have SolrCloud running on a remote server.
I would like to know:
1. Should I use an MVC framework?
2. How will my local application interact with the remote Solr server?
3. How do I send queries through Java code, and what technology should I use to display faceted search results?
Any example of how someone else has done this would also be very helpful.
Please help me on this.
Thanks,
One of the quickest ways to create your POC is by using the VelocityResponseWriter. This response writer is bundled in the Solr distribution; it's basically a series of Apache Velocity templates that are very easy to customize.
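For the Java query side of the question, here's a minimal SolrJ sketch of a faceted query (the host, collection, and facet field are placeholders; CloudSolrClient pointed at your ZooKeeper ensemble is the cloud-aware alternative to a fixed node URL):

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetedSearchDemo {
    public static void main(String[] args) throws Exception {
        // Host and collection are placeholders for your SolrCloud setup
        SolrClient solr = new HttpSolrClient.Builder(
                "http://remote-host:8983/solr/mycollection").build();

        SolrQuery query = new SolrQuery("*:*");
        query.setFacet(true);
        query.addFacetField("category"); // placeholder facet field

        QueryResponse response = solr.query(query);
        for (FacetField facet : response.getFacetFields()) {
            System.out.println(facet.getName() + ":");
            facet.getValues().forEach(count ->
                    System.out.println("  " + count.getName() + " (" + count.getCount() + ")"));
        }
        solr.close();
    }
}
```

You can then render the same facet counts with the Velocity templates, or feed them to whatever MVC view layer you pick.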