How to integrate Solr with a web application

After reading many Solr books and articles all over the net, I now have an idea of how powerful this server is.
But how do I integrate it into a real application, for example a web site written in PHP?
Right now, I understand that Solr returns results as XML, JSON, etc. So, to integrate it into a web application, is the "simple" approach to convert that output and render it in a page, or are there other techniques that avoid this step?
In my case, I have to develop a search engine that scans many documents and finds results.
My idea was:
Use Solr to build an index and search documents
Use a web application to show the result
Looking around the net, I haven't found anything that explains how to integrate Solr into a real application. Everything I read is about "How to use Solr... with Solr...", and nothing covers a real integration.
Does anyone have useful resources on how to integrate Solr into a real application, with some clean examples?

Edit: It looks like Apache maintains its own list of recommended client APIs, and the recommended tool for PHP is Google's library (though they refer to it as SolPHP). Given this, I imagine that this is the best place to start.
A Solr library for the programming language you're using could save you some of the trouble of implementing the integration. For instance, if your site is written in PHP, you could try Google's Solr library for PHP.
I have done most of my Solr work in Java, so I have used SolrJ quite a bit. It is a well-supported tool because it comes from Apache in parallel with the Solr product itself.
If you are working in another language, you are likely to find libraries available for it. How much time they save you will depend on the quality of the library itself.

When I used Solr in my project, only my application server (Tomcat) communicated with the Solr server. I wrote a class that executes GET requests against the Solr server based on input provided by the end user. When Solr returns XML/JSON to the application server, you can parse it and process it like any other business data (for example, render an *.html page). To sum up, the web browser never communicates directly with Solr; everything goes through the application server:
WebBrowser -> GET to application server -> GET to Solr server
show *.html <- parse XML/JSON, render *.html <- return XML/JSON
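The flow above can be sketched in a few lines. This is only a minimal illustration in Python using the standard library, not the class from my project: the function names, the core name and the "title" field are assumptions made up for the example. It relies on Solr's standard /select handler and on the JSON response layout, where matching documents sit under response.docs.

```python
import json
from html import escape
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(solr_base, core, user_query, rows=10):
    """Build the GET URL the application server sends to Solr."""
    # urlencode takes care of escaping whatever the end user typed.
    params = urlencode({"q": user_query, "rows": rows, "wt": "json"})
    return f"{solr_base}/{core}/select?{params}"


def parse_docs(solr_json_text):
    """Extract the list of matching documents from a Solr JSON response."""
    payload = json.loads(solr_json_text)
    return payload["response"]["docs"]


def render_results(docs, title_field="title"):
    """Render the documents as an HTML list (title_field is a hypothetical schema field)."""
    items = "".join(
        f"<li>{escape(str(doc.get(title_field, doc.get('id', ''))))}</li>"
        for doc in docs
    )
    return f"<ul>{items}</ul>"


def search(solr_base, core, user_query):
    """Browser -> application server -> Solr: this call is the last hop."""
    with urlopen(build_select_url(solr_base, core, user_query), timeout=5) as resp:
        return parse_docs(resp.read().decode("utf-8"))
```

A page handler would then call something like render_results(search("http://localhost:8983/solr", "docs", user_input)), assuming Solr runs on its default port and a core named "docs" exists. Note that the user input is passed as the q parameter unchanged, so Solr query syntax typed by the user is interpreted by Solr.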

Related

Apache Solr: Can Apache Solr be used as a third-party system for indexing and searching documents from different websites?

I am working on implementing a research web application or portal that integrates different research portals or websites using an open source platform called search kit. The web application will act as a central point of access to research publications on different research portals. To do this, I also need to implement a third-party system that does the following:
Searches the other research portals for documents based on a user query and displays the results to users in my web application.
Indexes the documents.
Can be used by system administrators to configure the web application, so that they can add, remove, or modify the URLs of the websites Solr pulls documents from.
Displays the results to the user in one standard format.
My question is: can Apache Solr be used to implement the third-party system? If not, what open source platform or approach would you recommend for implementing it?
In general, Solr seems like a good fit, but you might need some custom code (apart from configuration) here and there. Going through the points:
Querying is one of the main features of Solr, so this is definitely possible.
Indexing is handled by Solr.
There was a component for Solr called "Data Import Handler" that supported indexing from URLs (see the docs). However, this was removed from the main Solr distribution, and was moved to a separate package. This package doesn't seem to be actively maintained though, so you will probably run into some problems if you decide to use it. The alternative is to develop your document-pulling code yourself.
Solr can return the results in multiple formats, but it still might not support the exact format you want. In that case, you need to build your own transformation on top of the result from Solr.
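If you end up writing the document-pulling code yourself, the indexing half can stay small. Below is a rough Python sketch, under the assumption that you post JSON to Solr's /update handler; the function names, the core name and the field names ("source_url", "content") are hypothetical and would have to match your own schema. The request is only built here, not sent, so that the part that fetches pages from the configured URLs can be plugged in separately.

```python
import json
from urllib.request import Request


def page_to_doc(url, text):
    """Turn one pulled page into a Solr document (field names are assumptions)."""
    return {"id": url, "source_url": url, "content": text}


def build_index_request(solr_base, core, docs):
    """Build a POST request that sends a JSON array of documents to /update."""
    # commit=true makes the documents searchable right after the request.
    url = f"{solr_base}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then a matter of passing the request to urllib.request.urlopen. The list of URLs to pull from could live in whatever configuration store the administrators edit, which keeps that part outside Solr.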

Choose Lucene or Solr

We need to integrate a search engine into our platform, a catalog management application in SharePoint. The information is stored in multiple databases and in a file store (doc, ppt, pdf, ...). Our development platform is ASP.NET, and we have done some preliminary work with Lucene and found it to be good. However, we have just learned about Solr.
We would like to keep using Lucene, but we need to justify that choice over Solr.
Any help is appreciated.
And sorry for my English.
Lucene is a full-text search library used to add search functionality to an application. It can't be used as an application by itself. Solr is a complete search engine built around Lucene that provides Lucene's search functionality plus additional features. Solr is a web application that can be used by itself without any development around it.
If you need a search engine that your application can call, I recommend using Solr.

Can we use Kibana with Apache Solr without using Elasticsearch?

How can I integrate Kibana with Apache Solr instead of Elasticsearch?
If it cannot be done, what are the alternatives to Kibana for Solr?
At LucidWorks, we have ported Kibana to work with Solr and released it as open source.
If you want a bundled package, you can download that at http://www.lucidworks.com/lucidworks-silk/.
Our port for Kibana for Solr is bundled with Solr 4.7.0 and can be used as a query engine to build dashboards from indexes within the bundled Solr instance and/or located on other Solr instances.
The source code is available at https://github.com/LucidWorks/banana.
We have also included Solr Output Writer for LogStash with that bundle; however, you can use any ETL and indexing mechanism to get time series data into Solr. Links to this github repository are available on the LucidWorks link above.
HUE is an alternative search UI for Solr, though it is not as good as Kibana for search at the moment.
You can use SiLK for sure but you are better off using the fully integrated dashboards module that comes with Lucidworks Fusion. Fusion will save you a ton of time and make it easier to focus on the search stuff that matters - like building a recommender engine, creating data-driven user experience, driving data enrichment with entity recognition and integrating with Big Data software like Hadoop.

How does a webapp save files to its server?

I'm building a webapp where one can develop documents within the web browser (e.g., something like Zoho's document tool, or Google Docs). In my case, I have a set of arrays that store different paragraphs and other pieces of information, along with parallel arrays that store metadata on the paragraphs themselves.
The entire webapp is written in jQuery and associated libraries / plugins.
Is there an elegant way for me to save this as a file on the server itself? So far, I've been advised to use a hidden form to POST the arrays to the server and store them in a NoSQL database of some sort. This feels a bit painful to me, and I'm wondering if (1) there is a more elegant approach, or (2) there is a library / framework that automates some of the sending / POSTing / saving.
Thank you!
You would need to create services that live on the server itself. These services would be methods such as (just as a simple example)
SaveDocument(User, Document)
GetDocument(User, Document)
You would need to configure your web app to call these services and pass in the required parameters. As for how to do this, you could write the services in any number of languages (Java using Java EE or C# using WCF, to name a few, but you can also do this in Python, Ruby, etc.) and then generate WSDL interfaces to the services that any number of other languages could call.
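To make the two service methods a bit more concrete, here is a rough sketch of the server-side part in Python. It only illustrates the idea and leaves out the web service layer itself: the class name, the one-JSON-file-per-document layout and the "paragraphs" / "metadata" keys are all assumptions, chosen to mirror the parallel arrays described in the question.

```python
import json
from pathlib import Path


class DocumentStore:
    """Saves each document as a JSON file under root/<user>/<name>.json."""

    def __init__(self, root):
        self.root = Path(root)

    def _path(self, user, name):
        # Reject empty names and anything that could point outside the root folder.
        for part in (user, name):
            if not part or "/" in part or "\\" in part or part in (".", ".."):
                raise ValueError(f"invalid name: {part!r}")
        return self.root / user / f"{name}.json"

    def save_document(self, user, name, paragraphs, metadata):
        """Counterpart of SaveDocument(User, Document)."""
        path = self._path(user, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        payload = {"paragraphs": paragraphs, "metadata": metadata}
        path.write_text(json.dumps(payload), encoding="utf-8")

    def get_document(self, user, name):
        """Counterpart of GetDocument(User, Document)."""
        return json.loads(self._path(user, name).read_text(encoding="utf-8"))
```

The browser side would then send the arrays as JSON in an AJAX POST to whatever endpoint wraps save_document, which avoids the hidden form.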
There are lots of resources available on the web that cover this, so pick a language you want to learn, or are already proficient in and google around on how to develop web services in that language.
Good luck!

Language/Framework support for Interacting With CouchDB

I am interested in knowing if there are any server-side web application frameworks which integrate nicely with CouchDB? Does anyone have any experience in doing this? It seems like a dynamic language would be well-suited for playing with the JSON, but I am more interested in hearing about how it would fit in with the framework and the application's design.
Two frameworks that I would suggest for CouchDB are Ruby on Rails and Django. Both have a small file you can include that allows for easy interaction with CouchDB. For Ruby/Rails, this gives you the ability to write code that looks like this (code snippets yanked from here):
# Create the database
server = Couch::Server.new("localhost", "5984")
server.put("/foo/", "")
# Insert a new document into the database
doc = <<-JSON
{"type":"comment","body":"First Post!"}
JSON
server.put("/foo/document_id", doc)
# Get the document back later
res = server.get("/foo/document_id")
json = res.body
puts json
Python/Django lets you do the same with a relatively minimal amount of work (see here). Neither of these works at the web framework level, but both require a minimal amount of work to set up and are pretty easy to get going in Rails and Django. The Django approach still requires some packages to be installed, so if you can't do that for some reason, the Rails approach is the way to go.
Another good how-to for Python and Django can be found here (also lifted from the CouchDB FAQ).
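For comparison, the same three steps can be expressed in Python with nothing but the standard library, since CouchDB is driven over plain HTTP. This is just a sketch, not code from the linked how-tos: the helper names are made up, it assumes CouchDB listening on localhost:5984, and it assumes no authentication, which newer CouchDB releases may not allow. The requests are only built here; pass one to urllib.request.urlopen to actually send it.

```python
import json
from urllib.request import Request

COUCH = "http://localhost:5984"  # assumed default CouchDB address


def create_database(db):
    """PUT /<db> creates the database."""
    return Request(f"{COUCH}/{db}", method="PUT")


def put_document(db, doc_id, doc):
    """PUT /<db>/<doc_id> with a JSON body inserts a new document."""
    return Request(
        f"{COUCH}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def get_document(db, doc_id):
    """GET /<db>/<doc_id> reads the document back as JSON."""
    return Request(f"{COUCH}/{db}/{doc_id}", method="GET")
```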
The only web framework that dedicates itself to CouchDB is currently CouchDBKit for Python.
Check out the official wiki page that lists how to get started in your language:
http://wiki.apache.org/couchdb/Basics
Pick the language and framework that suits you best and then use one of the light CouchDB libraries with it.
It seems that things are moving quite quickly at the moment for CouchDB. I'm sure there will be more frameworks out there soon with CouchDB support. I'm currently looking into building one for PHP.
I have had good success with jcouchdb for Java, CouchApp for JavaScript, and CouchDBKit for Python. All of these are actively developed, open source, well designed, and easy to enhance if they are missing something you really need. I have submitted patches and feature enhancements for both jcouchdb and CouchApp.
Actually, you don't really need such a framework. Instead, you can just write the whole web application in CouchDB. It allows you to generate HTML pages, or any other XML-derived format, and you can even use HTML templates. I consider this a good choice, because JavaScript is a rich and flexible language. On top of that, you avoid the overhead of a connection between the database and your web application.
For more details, check out: http://books.couchdb.org/relax/design-documents/shows
There's also a related question: Using CouchDB to serve HTML
Depending on what you want to build, CouchApp may be something to look at. It's specially designed for writing apps with CouchDB:
https://github.com/jchris/couchapp/wiki/manual
