Search Request to Apache Solr other than browser - solr

I am running a search query against a local Apache Solr server from the browser and can see the results.
I want to run the same query on the production server.
Since the Tomcat port is blocked on production, I cannot test the query results in the browser.
Is there any other way to run the query and see the results?

Solr is a Java web application: if you can't reach the port it listens on, you can't reach Solr itself. There is no other way to retrieve data from a remote location. In production, Solr is usually put behind an Apache proxy that protects the whole instance and exposes only the contexts you need, in your case solr/select, for example, to run queries.
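A browser is not required as long as some HTTP client can reach the port, for example a script run on the production host itself or one that goes through the proxied solr/select context. The following is a minimal sketch of that idea, not a definitive client: the base URL, port 8080, the core name core1, and the query are placeholders, and run_query is only defined here, not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, core, query, rows=10):
    """Build a Solr /select URL that asks for a JSON response."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def run_query(url, timeout=10):
    """Send the query and return the parsed JSON body (needs a reachable Solr)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Placeholder host, port, and core; adjust to the real deployment.
url = build_select_url("http://localhost:8080/solr", "core1", "title:solr")
print(url)
```

Calling run_query(url) from a machine that is allowed through the firewall returns the same data the browser would show.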

Related

relationships among zookeeper, solrcloud, and http requests

I am relatively new to this, so I am trying to understand the relationships among Zookeeper, SolrCloud, and HTTP requests.
My understanding is:
Zookeeper (accessible through port 2181) keeps the config files for SolrCloud,
and all HTTP requests go to the SolrCloud instance directly rather than through Zookeeper.
So Zookeeper, in this particular case, is not used to route (API) requests? I do not really think that should be the case, but based on the tutorials on the official Solr site, it seems all requests need to go through Solr's port 8983.
Solr uses Zookeeper to keep its clusterstate (which servers have which cores / shards / parts of the complete collection) as well as configuration files and anything else that should be available throughout the cluster.
The request itself is made to Solr, and Solr uses the information from Zookeeper in the background to route it internally to the correct location. A cloud-aware client (such as SolrJ) can instead query Zookeeper itself and then contact the correct Solr server directly, without having Solr route the request internally. In SolrJ this is implemented as CloudSolrClient (CloudSolrServer in older versions of SolrJ), as opposed to the regular SolrServer, which contacts the Solr instance you reference and lets that instance route the request from there.
If you look at the documentation of CloudSolrClient, you can see that it takes the Zookeeper information as its argument, and not the Solr Server address. SolrJ makes a ZK request to Zookeeper, retrieves the clusterstate, then makes the HTTP request directly to the servers hosting the shard or collection.
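To make the cloud-aware flow concrete, here is a rough Python sketch of the second half of it: once a client holds the clusterstate, it can choose a live server on its own. This is not how SolrJ is implemented. The dictionary below is a hand-written, simplified stand-in for what Zookeeper stores, and the collection, host, and core names are placeholders.

```python
import random

# Simplified, hand-written stand-in for the clusterstate kept in Zookeeper.
CLUSTER_STATE = {
    "techproducts": {
        "shards": {
            "shard1": {"replicas": {
                "core_node1": {"base_url": "http://localhost1:8983/solr",
                               "core": "techproducts_shard1_replica1",
                               "state": "active"},
                "core_node2": {"base_url": "http://localhost2:8983/solr",
                               "core": "techproducts_shard1_replica2",
                               "state": "down"},
            }},
            "shard2": {"replicas": {
                "core_node3": {"base_url": "http://localhost2:8983/solr",
                               "core": "techproducts_shard2_replica1",
                               "state": "active"},
            }},
        }
    }
}


def active_replica_urls(cluster_state, collection):
    """Return the core URLs of all replicas currently marked active."""
    urls = []
    for shard in cluster_state[collection]["shards"].values():
        for replica in shard["replicas"].values():
            if replica["state"] == "active":
                urls.append(f"{replica['base_url']}/{replica['core']}")
    return sorted(urls)


def pick_server(cluster_state, collection, rng=random):
    """Pick one active replica to send the HTTP request to."""
    return rng.choice(active_replica_urls(cluster_state, collection))


print(active_replica_urls(CLUSTER_STATE, "techproducts"))
```

A non-cloud-aware client skips this step entirely: it sends the request to whichever Solr node it was configured with and lets that node do the routing.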

Solr Cloud Managed Resources

I am implementing Solr Cloud for the first time. I've worked with normal Solr and have that down pretty well, but I'm not finding a lot on what you can and can't do with Solr Cloud. So my question is about Managed Resources. I know you can CRUD stop words and synonyms using the new RESTful api in solr. However with the cloud do I need to CRUD my changes to each individual solr server in the cloud, or do I send them to a different url that sends them through to each server? I'm new to cloud and zookeeper. I have not found anything in the solr wiki about working with the managed resources in the cloud setup. Any advice would be helpful.
In SolrCloud, configuration and other files such as stopwords are stored and maintained by Zookeeper, which means you do not need to send updates to each server individually.
Once you have SolrCloud running, and before putting in any data, you create a collection. Each collection has its own set of resources (its config folder).
So, for example, if you have a collection called techproducts with 2 servers, localhost1 and localhost2, either of the commands below will work on the same resource, regardless of which server you send it to.
curl "http://localhost1:8983/solr/techproducts/schema/analysis/synonyms/english"
curl "http://localhost2:8983/solr/techproducts/schema/analysis/synonyms/english"
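The two curl commands above only read the resource. For the update half of CRUD, a hedged sketch of adding a synonym mapping with a PUT to the same managed-resource URL could look like the following. The host, collection, and the sample mapping are placeholders, and the request is only built here, not sent. As far as I know, managed-resource changes only take effect once the collection is reloaded, so treat that as a step to verify against your Solr version.

```python
import json
from urllib.request import Request, urlopen


def synonyms_update_request(solr_url, collection, mappings, name="english"):
    """Build a PUT request that adds synonym mappings to a managed resource."""
    url = f"{solr_url.rstrip('/')}/{collection}/schema/analysis/synonyms/{name}"
    body = json.dumps(mappings).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


# Placeholder server and mapping; any node of the collection should do.
req = synonyms_update_request("http://localhost1:8983/solr", "techproducts",
                              {"mad": ["angry", "upset"]})
print(req.get_method(), req.full_url)

# Against a live cluster the request would be sent with: urlopen(req)
```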

Integrating Solr with web application

Context:
I have a web application that serves content via RESTful web services
I need to provide a search functionality
This is what I have in mind. Am I on the right track or way off ?
Index seed client:
This component will poll the Application at repeated intervals for data
(I have a WS which returns an XML response)
And then post the XML to an EMS
Queue Listener:
The Queue Listener will convert the domain XML into Solr doc
And then post the document to Solr to be indexed
Search client:
The client will make a search request to my web application with query parameters
The web application will forward the request to Solr
Solr returns search results to my web application
My web application returns the result back to the client
Alternate flow ?
The search client talks to Solr directly and does the search.
Suggestions?
Searching will depend on which Solr server implementation you choose. If you use EmbeddedSolrServer, clients will need to query through your web application, which then calls Solr. If you are using HttpSolrServer, then clients can query Solr directly.
It also depends on how you want to return the results.
As solr documents?
Or your own interpretation of a solr document?
The latter would have to be served by your web application.
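If the web application returns its own interpretation of a Solr document, the mapping step might look roughly like this. It is only a sketch: the output shape (total, hits) and the field names id and title are invented for illustration, and the sample input is a hand-written imitation of a Solr JSON response.

```python
def to_search_results(solr_response, fields=("id", "title")):
    """Reduce a Solr JSON response to the fields the application exposes."""
    body = solr_response.get("response", {})
    hits = [{name: doc.get(name) for name in fields}
            for doc in body.get("docs", [])]
    return {"total": body.get("numFound", 0), "hits": hits}


# Hand-written sample in the general shape of a Solr JSON response.
sample = {
    "responseHeader": {"status": 0},
    "response": {
        "numFound": 1,
        "start": 0,
        "docs": [{"id": "42", "title": "Intro to Solr", "_version_": 1}],
    },
}
print(to_search_results(sample))
```

Keeping this mapping in the web application means clients never depend on Solr's own response format.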

Can I update an application configuration file without restarting the Jetty server?

I have Solr running on a Jetty server, and I'd like to be able to update a configuration file and have my application pick up the changes without restarting the entire server. Specifically, I'm looking for something similar to touch web.xml in Tomcat. Is that possible, and if so, how do I do it?
EDIT:
Specifically, I want to update application-specific information in an external file, and I want to get the application to load this new data all without stopping and starting the server.
There are several ways to achieve this (assuming you're thinking of general config reloading). You can have a daemon thread poll the file's last-modified timestamp and trigger a reload when it changes. Or you can check the timestamp on each configuration value lookup, if lookups don't happen too often. But my preferred way would be to expose a "reload configuration" operation, either through JMX or through a URL that is accessible only from the "inside".
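The check-the-timestamp-on-each-lookup variant can be sketched in a few lines. This is an illustration in Python rather than the Java a Jetty application would use, and it assumes a JSON config file; the class name, file name, and keys are made up.

```python
import json
import os
import tempfile


class ReloadingConfig:
    """Re-read a JSON config file whenever its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._values = {}

    def get(self, key, default=None):
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:  # first lookup, or the file was changed
            with open(self.path, encoding="utf-8") as fh:
                self._values = json.load(fh)
            self._mtime = mtime
        return self._values.get(key, default)


# Small demo on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "app-config.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"rows": 10}, fh)
    config = ReloadingConfig(path)
    first = config.get("rows")

    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"rows": 25}, fh)
    # Push the mtime forward so the change is visible even on coarse clocks.
    os.utime(path, ns=(0, os.stat(path).st_mtime_ns + 1_000_000_000))
    second = config.get("rows")

print(first, second)
```

The cost is one stat call per lookup, which is why this variant only suits values that are not read in a hot path.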
If you are running Solr 4+ and are talking about schema.xml and solrconfig.xml, then you want 'Reload Core', which is in the Web Admin UI under core/collection management. You can also trigger it from a URL.
In the Solr admin UI, go to Core Admin and reload the relevant core. If you are using SolrCloud it is even easier: just reload the configuration via Zookeeper. The changes will be visible only after the copy has completed.
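The reload can also be triggered by URL, through the CoreAdmin API for a single core or the Collections API for a SolrCloud collection. A small sketch that builds both URLs follows; the host, port, and the names core1 and techproducts are placeholders.

```python
from urllib.parse import urlencode


def core_reload_url(solr_url, core):
    """CoreAdmin API URL that reloads a single core."""
    query = urlencode({"action": "RELOAD", "core": core})
    return f"{solr_url.rstrip('/')}/admin/cores?{query}"


def collection_reload_url(solr_url, collection):
    """Collections API URL that reloads a SolrCloud collection."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{solr_url.rstrip('/')}/admin/collections?{query}"


print(core_reload_url("http://localhost:8983/solr", "core1"))
print(collection_reload_url("http://localhost:8983/solr", "techproducts"))
```

Requesting either URL with any HTTP client (curl, a browser, a script) has the same effect as pressing the reload button in the admin UI.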

Solr server security

I have a Solr server that returns search results to users via Ajax requests. The requests all look like this: http://abc.com/core1/select?q=...
Now I just realized this exposes my search server to potential "bad" guys. I can enable basic authentication on the Jetty Solr server, but that would also block legitimate users from calling the search server.
My question is: what is the common strategy to fix this? Should I use the SolrJ Java client library from my Tomcat web server to run the search first and then return the results to users, and also firewall off access to the search server completely? Is there any other way around it?
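For illustration only, the setup the question proposes (the web server sits between users and a firewalled Solr) usually includes a step that decides which user-supplied parameters are passed on. A hedged sketch of such a filter follows, in Python rather than SolrJ; the allowed parameter set, the row cap, and the defaults are assumptions, not a recommendation from this thread.

```python
ALLOWED_PARAMS = {"q", "start", "rows"}  # assumed allowlist
MAX_ROWS = 50                            # assumed cap on page size


def safe_solr_params(client_params):
    """Keep only allowlisted parameters and clamp the page size."""
    params = {k: v for k, v in client_params.items() if k in ALLOWED_PARAMS}
    params.setdefault("q", "*:*")
    try:
        rows = int(params.get("rows", 10))
    except (TypeError, ValueError):
        rows = 10
    params["rows"] = max(0, min(rows, MAX_ROWS))
    params["wt"] = "json"  # the server, not the user, picks the format
    return params


print(safe_solr_params({"q": "solr", "rows": "100000", "qt": "/update"}))
```

Only the filtered parameters would then be forwarded to Solr from the web server, so users never address the search server themselves.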
