SolrJ and Custom Solr Handler

I am trying to implement a simple custom request handler in Solr 7.3. I needed some clarifications on the methods available via the Solr Java API.
As per my understanding, I have extended my Java class from "SearchHandler" and then overridden the "handleRequestBody" method. I am trying to understand the flow from the beginning. Here is a sample query in the browser:
http://localhost:8983/solr/customcollection/customhandler?q=John&fl=id,last_name&maxRows=10
1) Once you enter the above query in the browser and press "return", the Solr customhandler will be triggered. It will look for the necessary jars in which the handler class is packaged.
2) Once it finds the main class, it will execute the following method, which is overridden from the "SearchHandler" parent class:
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse resp) throws Exception
3) The SolrQueryRequest req object will hold all the Solr parameters on the query, in this case q, fl and maxRows.
4) Using the following code I unpack these parameters.
SolrParams params = req.getParams();
String q = params.get(CommonParams.Q);
String fl = params.get(CommonParams.FL);
String rows = params.get(CommonParams.ROWS);
5) I create a SolrClient object that lets me connect to my SolrCloud:
String zkHostString = "localhost:5181";
SolrClient solr = new CloudSolrClient.Builder().withZkHost(zkHostString).build();
6) Here is where I need help:
a) How do I use the unpacked Solr parameters from the original query and make a call through the "solr" object to return results?
b) How do I make use of the "resp" object?
c) Most of the examples that I found on the internet show how to print the results to STDOUT. However, since I am using a custom handler, I would like to return the results to the user (in this case, the Solr Admin UI or the browser).
Any help is truly appreciated.
Thanks
public class SolrQueryTest extends org.apache.solr.handler.component.SearchHandler {

    String zkHostString = "localhost:5181";
    SolrClient solr = new CloudSolrClient.Builder().withZkHost(zkHostString).build();

    private static final Logger log = Logger.getLogger(SolrQueryTest.class.getName());

    @Override
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse resp) throws Exception {
        SolrParams params = req.getParams();
        String q = params.get(CommonParams.Q);
        String rows = params.get(CommonParams.ROWS);

        SolrQuery query = new SolrQuery(q);
        query.setShowDebugInfo(true);
        query.set("indent", "true");

        // need to know how to call Solr using the above query parameters
        // once the response is received, how to send it back to the browser and NOT STDOUT
    }
}
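For reference, here is a minimal sketch of one way the handler body could be filled in, following the same pattern as the related RequestHandler question further down: instead of opening a separate CloudSolrClient back to the cluster, the handler forwards the unpacked parameters to the core's own /select handler and attaches the result to the outgoing response, which Solr then serializes back to the browser. The parameter defaults and the "results" key are assumptions, not a tested implementation.

import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.core.SolrCore;
import org.apache.solr.handler.component.SearchHandler;
import org.apache.solr.request.LocalSolrQueryRequest;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;

public class SolrQueryTest extends SearchHandler {

    @Override
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse resp) throws Exception {
        // 1) Unpack the parameters that arrived on the original request.
        SolrParams params = req.getParams();
        ModifiableSolrParams queryParams = new ModifiableSolrParams();
        queryParams.set(CommonParams.Q, params.get(CommonParams.Q, "*:*"));
        queryParams.set(CommonParams.FL, params.get(CommonParams.FL, "*"));
        queryParams.set(CommonParams.ROWS, params.get(CommonParams.ROWS, "10"));

        // 2) Run the query against the core this handler is registered on.
        SolrCore core = req.getCore();
        SolrQueryRequest localReq = new LocalSolrQueryRequest(core, queryParams);
        try {
            SolrQueryResponse localRsp = new SolrQueryResponse();
            core.execute(core.getRequestHandler("/select"), localReq, localRsp);

            // 3) Anything added to resp is serialized back to the client
            //    (browser / Solr Admin UI), not printed to STDOUT.
            resp.add("results", localRsp.getValues().get("response"));
        } finally {
            localReq.close();
        }
    }
}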

Related

SOLRJ giving me strange error when trying to add a pdf to a new core. "You must type correct path"

I am starting to update an ancient Solr app to 9.1, along with the SolrJ indexer. When I try to add a document, I am getting
Exception in thread "main" org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at http://my.host:8983/solr/qmap:
Searching for Solr
You must type the correct path
Solr will respond
I can see the qmap core in the Solr admin and Solr is running.
Code is:
public class DocumentIndexer {

    private final String fileToIndex;
    private final ConcurrentUpdateHttp2SolrClient solrClient;
    private final Http2SolrClient http2Client;

    public DocumentIndexer(String solrUrl, String fileToIndex) {
        this.fileToIndex = fileToIndex;
        http2Client = new Http2SolrClient.Builder().build();
        solrClient = new ConcurrentUpdateHttp2SolrClient.Builder(solrUrl, http2Client).build();
    }

    public void indexDocuments() throws IOException, SolrServerException {
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        req.addFile(new File(fileToIndex), "application/xml");
        req.setParam("id", fileToIndex);
        req.process(solrClient);
        solrClient.commit(true, true);
    }
}
Simple enough: /update/extract was not defined in the solrconfig. Recreating the core using the sample_techproducts_configs configset as a template supplies it, or alternatively the /update/extract path can be defined in the existing solrconfig.
Also, req.setParam("id", fileToIndex) needs to be changed to req.setParam("literal.id", fileToIndex).
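With both fixes applied, the indexing request would look roughly like this (same classes as in the question; only the handler path registered in solrconfig and the literal.id parameter change):

ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
req.addFile(new File(fileToIndex), "application/xml");
// Pass the id as a literal so the extracting handler stores it on the indexed document.
req.setParam("literal.id", fileToIndex);
req.process(solrClient);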

Solr: Custom RequestHandler within the Solr api (solrj) that makes a query call back to the server

I'm looking for some advice on a specific issue that is holding us back.
I'm trying to create a custom RequestHandler within the Solr api (solrj) that makes a query call back to the server.
I'm not finding any good, runnable examples of this online. Possibly I'm approaching this wrong. Any advice would be appreciated.
Here is one of my attempts at coding this.
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    SolrCore solrServerCore = req.getCore();
    SolrRequestHandler handler = solrServerCore.getRequestHandler("/select");

    ModifiableSolrParams params = new ModifiableSolrParams();
    params.add("q", "*:*");

    SolrQueryRequest localReq = new LocalSolrQueryRequest(solrServerCore, params);
    SolrQueryResponse localRsp = new SolrQueryResponse();
    solrServerCore.execute(handler, localReq, localRsp);

    // !!! Not returning a structured response
    System.out.println(localRsp.toString());
    System.out.println(localRsp.getReturnFields());
    System.out.println(localRsp.getValues().toString());
}

Solr updates: which is faster, SolrJ or curl?

I have a case where I have a set of fields to be updated in Solr. The input I receive is in the form of a map, the key being the field name and the value being the updated value.
My doubt is whether, in such a scenario, I should use curl to update the doc, or SolrJ, where I have to convert the map to a SolrInputDocument and then call add. Will the first approach be faster than the second?
You can convert the map to a SolrInputDocument.
curl goes over the plain HTTP URL; SolrJ requests go through HttpSolrServer (I'm not quite sure about that).
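A minimal sketch of that conversion (the helper name and the commit call are just for illustration):

import java.util.Map;
import org.apache.solr.common.SolrInputDocument;

// Convert the incoming field/value map into a SolrInputDocument.
public static SolrInputDocument toSolrInputDocument(Map<String, Object> fieldMap) {
    SolrInputDocument doc = new SolrInputDocument();
    for (Map.Entry<String, Object> entry : fieldMap.entrySet()) {
        // For a partial (atomic) update of an existing document, each value can
        // instead be wrapped in a modifier map such as {"set": value}.
        doc.setField(entry.getKey(), entry.getValue());
    }
    return doc;
}

// Usage: client.add(toSolrInputDocument(fieldMap)); client.commit();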
However, in my experience, I would strongly recommend ConcurrentUpdateSolrServer, which is meant for updates.
SolrJ's ConcurrentUpdateSolrServer class is described like this:
ConcurrentUpdateSolrServer buffers all added documents and writes them
into open HTTP connections. This class is thread safe. Params from
UpdateRequest are converted to http request parameters. When params
change between UpdateRequests a new HTTP request is started. Although
any SolrServer request can be made with this implementation, it is
only recommended to use ConcurrentUpdateSolrServer with /update
requests. The class HttpSolrServer is better suited for the query
interface.
Below is an example way to get a client instance:
/**
 * Method lookUpConcurrentUpdateSolrServerInstance.
 *
 * @param baseURL String
 * @param collection String
 * @return ConcurrentUpdateSolrClient
 * @throws SolrServerException
 * @throws IOException
 * @throws Exception
 */
public static ConcurrentUpdateSolrClient lookUpConcurrentUpdateSolrServerInstance(String baseURL, String collection)
        throws SolrServerException, IOException, Exception {
    ConcurrentUpdateSolrClient solrServer = null;
    if (StringUtils.isNotBlank(baseURL) && StringUtils.isNotBlank(collection)) {
        // queue size and thread count passed as integers
        solrServer = new ConcurrentUpdateSolrClient(baseURL + collection, 10, 10);
        checkServerStatus(solrServer);
        return solrServer;
    }
    return solrServer;
}
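For completeness, calling the helper and pushing documents through it might look like this (the URL and collection name are placeholders):

ConcurrentUpdateSolrClient client = lookUpConcurrentUpdateSolrServerInstance("http://localhost:8983/solr/", "collection1");
client.add(toSolrInputDocument(fieldMap)); // reuse the map-to-document conversion above
client.commit();
client.close();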

Storing the Cursor for App Engine Pagination

I'm trying to implement pagination using App Engine's RPC and GWT (it's an app engine connected project).
How can I pass both the query results and the web-safe cursor object to the GWT client from the RPC?
I've seen examples using a servlet but I want to know how to do it without a servlet.
I've considered caching the cursor on the server using memcache but I'm not sure if that's appropriate or what should be used as the key (session identifier I would assume, but I'm not sure how those are handled on App Engine).
Links to example projects would be fantastic, I've been unable to find any.
OK, so the best way to do this is to store the cursor as a string on the client.
To do this you have to create a transportable wrapper class that holds the results list and the cursor string, so you can pass it back to the client via RequestFactory. To do that you create a normal POJO and then a proxy for it.
Here's what the code looks like for the POJO:
public class OrganizationResultsWrapper {
public List<Organization> list;
public String webSafeCursorString;
public List<Organization> getList() {
return list;
}
public void setList(List<Organization> list) {
this.list = list;
}
public String getWebSafeCursorString() {
return this.webSafeCursorString;
}
public void setWebSafeCursorString(String webSafeCursorString) {
this.webSafeCursorString = webSafeCursorString;
}
}
For the proxy:
@ProxyFor(OrganizationResultsWrapper.class)
public interface OrganizationResultsWrapperProxy extends ValueProxy{
List<OrganizationProxy> getList();
void setList(List<OrganizationProxy> list);
String getWebSafeCursorString();
void setWebSafeCursorString(String webSafeCursorString);
}
Set up your service and RequestFactory to use the POJO and proxy respectively:
// service class method
@ServiceMethod
public OrganizationResultsWrapper getOrganizations(String webSafeCursorString) {
return dao.getOrganizations(webSafeCursorString);
}
// request factory method
Request<OrganizationResultsWrapperProxy> getOrganizations(String webSafeCursorString);
Then make sure to run the RPC wizard so that your validation process runs; otherwise you'll get a request context error on the server.
Here's the implementation in my data access class:
public OrganizationResultsWrapper getOrganizations(String webSafeCursorString) {
List<Organization> list = new ArrayList<Organization>();
OrganizationResultsWrapper resultsWrapper = new OrganizationResultsWrapper();
Query<Organization> query = ofy().load().type(Organization.class).limit(50);
if (webSafeCursorString != null) {
query = query.startAt(Cursor.fromWebSafeString(webSafeCursorString));
}
QueryResultIterator<Organization> iterator = query.iterator();
while (iterator.hasNext()) {
list.add(iterator.next());
}
resultsWrapper.setList(list);
resultsWrapper.setWebSafeCursorString(iterator.getCursor().toWebSafeString());
return resultsWrapper;
}
A second option would be to save the webSafeCursorString in memcache, as you already mentioned.
My idea looks like this:
The client always sends a request like "getMyObjects(Object... myParams, int maxResults, String clientPaginationString)". The clientPaginationString is uniquely created as shown below.
The server receives the request and looks in memcache for a webSafeCursorString under the key clientPaginationString.
If the server finds nothing, it creates the query and saves the webSafeCursorString into memcache with the clientPaginationString as the key, then returns the results.
If the server finds the webSafeCursorString, it restarts the query with it and returns the results (see the sketch below).
The problems are how to clean the memcache and how to find a unique clientPaginationString:
A unique clientPaginationString could be the current userId + the params of the current query + a timestamp. This should work just fine.
I really can't think of an easy way to clean the memcache; however, I think we do not have to clean it at all. We could store all the webSafeCursorStrings and timestamp+params+userId entries in a WebSafeCursor class that contains a map, store all of this in memcache, and clean that class once in a while (timestamps older than...).
One improvement I can think of is to save the webSafeCursorString in memcache with a key created on the server (userSessionId + serviceName + serviceMethodName + params). However, it is important that the client indicates whether it wants a new query (the memcache entry is overwritten) or the next page of results (the webSafeCursorString is fetched from memcache). A reload of the page should work; a second tab in the browser would be a problem, I think.
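Roughly, the server-side lookup could look like this (a sketch only; the method signature, key scheme, and one-hour expiry are assumptions layered on the data access code above):

import com.google.appengine.api.memcache.Expiration;
import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public OrganizationResultsWrapper getOrganizations(String clientPaginationString, int maxResults) {
    MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();

    // Look up the cursor stored for this client's pagination key, if any.
    String webSafeCursorString = (String) memcache.get(clientPaginationString);

    Query<Organization> query = ofy().load().type(Organization.class).limit(maxResults);
    if (webSafeCursorString != null) {
        // Continue the previous page instead of starting over.
        query = query.startAt(Cursor.fromWebSafeString(webSafeCursorString));
    }

    List<Organization> list = new ArrayList<Organization>();
    QueryResultIterator<Organization> iterator = query.iterator();
    while (iterator.hasNext()) {
        list.add(iterator.next());
    }

    // Remember where this page ended so the next call with the same key continues here.
    String nextCursor = iterator.getCursor().toWebSafeString();
    memcache.put(clientPaginationString, nextCursor, Expiration.byDeltaSeconds(3600));

    OrganizationResultsWrapper resultsWrapper = new OrganizationResultsWrapper();
    resultsWrapper.setList(list);
    resultsWrapper.setWebSafeCursorString(nextCursor);
    return resultsWrapper;
}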
What would you say?

Solrj Select All

I am having issues selecting everything in my 25 document Solr (3.6) index via Solrj (running Tomcat).
public static void main(String[] args) throws MalformedURLException, SolrServerException {
SolrServer solr = new HttpSolrServer("http://localhost:8080/solr");
ModifiableSolrParams parameters = new ModifiableSolrParams();
parameters.set("?q", "*:*");
parameters.set("wt", "json");
QueryResponse response = solr.query(parameters);
System.out.println(response);
}
The result I get is:
{responseHeader={status=0,QTime=0,params={?q=*:*,wt=javabin,version=2}},response={numFound=0,start=0,docs=[]}}
Also, if I take the "?" out of parameters.set("?q", "*:*"); I have to kill the program or else it times out. The same happens if I replace the
"*:*"
with just
"*"
Also, I have tried parameters.set("qt", "/select"); to no avail.
How do you select all and actually get results through Solrj?
I am not sure why this works but after failing on a hundred ideas, this one took:
public static void main(String[] args) throws MalformedURLException, SolrServerException {
SolrServer solr = new HttpSolrServer("http://localhost:8080/solr");
ModifiableSolrParams parameters = new ModifiableSolrParams();
parameters.set("q", "*:*"); //query everything thanks to user1452132!
parameters.set("facet", true);//without this I cant select all
parameters.set("fl", "id");//send back just the id values
parameters.set("wt", "json");//Id like this in json format please
QueryResponse response = solr.query(parameters);
System.out.println(response);
}
Hope this helps someone out there.
You should be using "q" as the parameter; the following is the right syntax:
parameters.set("q", "*:*");
The reason the "?q" version returns is that there is no query to run, so it returns fast.
First, please test through the browser. You can also set the number of rows to return, so that you are not returning a large result set.
parameters.set("rows", 5);
Once the Solr query returns, you have to paginate through the results; with a large collection you won't be able to retrieve all of them in one go.
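For example, a sketch of paging through the full result set with start/rows on the 3.x API, using the solr client from the snippet above (page size and field list are arbitrary choices):

SolrQuery query = new SolrQuery("*:*");
query.setFields("id");
query.setRows(100); // page size

int start = 0;
long numFound;
do {
    query.setStart(start);
    QueryResponse response = solr.query(query);
    SolrDocumentList page = response.getResults();
    numFound = page.getNumFound();
    for (SolrDocument doc : page) {
        System.out.println(doc.getFieldValue("id"));
    }
    start += page.size();
} while (start < numFound);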
I think you should also specify your core when creating the SolrServer object, i.e., write
SolrServer solr = new HttpSolrServer("http://localhost:8080/solr/collection1");
where collection1 is the name of the core that you want to use.
