Query construction in custom Solr component plugin

I have developed a Solr component that expands the user's query and adds additional clauses to it. For this expansion we make requests to external REST APIs. The query expansion logic lives mainly in the prepare() method. Everything works as expected in standalone mode. When we deploy the plugin in a SolrCloud environment, however, each shard calls the external REST API for query expansion.
My question: can we make only one call to the external REST API, since every shard sends the same request to the external service? How can we modify our component to make only one call per search request?

In the prepare() method, right before your external API call, you can check ResponseBuilder.isDistrib. This boolean is true for a request that is about to be distributed. You can then use this information to decide whether the current node can simply execute the external request or whether you need to designate one of the SolrCloud hosts to do this job.
How to determine the SolrCloud host to use for the external API? You could...
Hardwire one of the hosts into the component and check whether the local host is the hardwired one. That would unbalance the host load, though.
Have an arbitrary criterion that every component can check for itself, like hosts 1-10 firing the external request when the current minute equals the number of the host.
Even roll a die in the search frontend, pass the chosen hostname to Solr via a query parameter, and have the component check this parameter (get it from ResponseBuilder.req.getParams()) against its local hostname.
You can get really creative there.
After you get an answer from the external API, you can use modifyRequest() to pass the results on to all other hosts.
Please read more in the Solr Wiki.
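The query-parameter option can be sketched without any Solr classes. This is only an illustration of the hostname comparison, not Solr API: the class ExpansionGate and the parameter name expansionHost are hypothetical, and in a real component the parameters would come from ResponseBuilder.req.getParams(). The sketch also assumes that a request without the parameter should behave as before and call the external API.

```java
import java.util.Collections;
import java.util.Map;

final class ExpansionGate {
    // Hypothetical parameter name; the search frontend would set it to the chosen host.
    static final String HOST_PARAM = "expansionHost";

    // Returns true if this node is the one that should call the external API.
    static boolean shouldCallExternalApi(Map<String, String> params, String localHost) {
        String chosenHost = params.get(HOST_PARAM);
        if (chosenHost == null) {
            // Assumption: no host was designated, so fall back to the old behavior.
            return true;
        }
        return chosenHost.equalsIgnoreCase(localHost);
    }

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap(HOST_PARAM, "solr-2");
        System.out.println(shouldCallExternalApi(params, "solr-2")); // the designated host
        System.out.println(shouldCallExternalApi(params, "solr-1")); // any other host
    }
}
```

Comparing host names case-insensitively is a design choice here; whether the local name is a short name or a fully qualified one has to match whatever the frontend sends.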

The approach you can use looks like this:
Rewrite your component as a requestHandler (or just a wrapper around the standard requestHandler).
Make your requestHandler aware of a special flag that triggers (or suppresses) your custom logic. What I mean looks like this (I know it is not fancy):
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
        ...
        // Only a request that does not carry the flag yet triggers the external call.
        if (req.getParams().get("apiCallWasSent") == null) {
            makeApiCall(req, rsp);
            // SolrParams has no mutators, so copy into ModifiableSolrParams before setting the flag.
            ModifiableSolrParams params = new ModifiableSolrParams(req.getParams());
            params.set("apiCallWasSent", "true");
            req.setParams(params);
        }
        ...
        super.handleRequestBody(req, rsp);
    }
In my opinion, those additional query clauses, and everything else related to the query itself, should be handled by your own QParserPlugin. But a component can handle those clauses as well.
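The effect of the apiCallWasSent flag can be illustrated with a small Solr-free simulation. It assumes that the parameters set on the node that first receives the request are the ones forwarded to the shards; apart from the flag name, every name here (OnceOnlyExpansion, callExternalApi, the "OR synonym" clause) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class OnceOnlyExpansion {
    static final String FLAG = "apiCallWasSent";

    // Counts how often the (simulated) external service was contacted.
    final AtomicInteger apiCalls = new AtomicInteger();

    // Stand-in for the external REST call; returns the expanded query.
    String callExternalApi(String query) {
        apiCalls.incrementAndGet();
        return query + " OR synonym";
    }

    // Expands the query only if no earlier handler has done so already.
    Map<String, String> handle(Map<String, String> params) {
        if (params.containsKey(FLAG)) {
            return params; // already expanded upstream, nothing to do
        }
        Map<String, String> expanded = new HashMap<>(params);
        expanded.put("q", callExternalApi(params.get("q")));
        expanded.put(FLAG, "true");
        return expanded;
    }

    public static void main(String[] args) {
        OnceOnlyExpansion handler = new OnceOnlyExpansion();
        Map<String, String> incoming = new HashMap<>();
        incoming.put("q", "solr");
        Map<String, String> forwarded = handler.handle(incoming);
        for (int shard = 0; shard < 3; shard++) {
            handler.handle(forwarded); // each shard sees the flag and skips the call
        }
        System.out.println(handler.apiCalls.get()); // prints 1
    }
}
```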

Related

ESB + Camel Calling multiple web services based on the response from the previous call

I'm using an ESB and Camel to provide an endpoint to my mobile apps. From there, I need to call multiple web services in such a way that the response from the previous call determines whether the next one should be called, and I need to pass the same request parameters to each call.
Additionally, I need to save those responses in the database.
I would like to know the best pattern for implementing this use case with Camel.
There are many ways to do it. Just think about how you would write such logic in pure Java, then move it to Camel. From an action-flow perspective there is no difference: you have a condition, so you need an IF or SWITCH operation, and that's it.
The straightforward way:
After calling the previous service, the body holds a response with an attribute that is the decision factor for the next call. So use Camel's "choice-when-otherwise" structure (the analog of a Java "switch" statement), and in "when" use any of the available ways to check a condition on the body ("simple", "xpath", "xquery", etc.).
If the logic for identifying the next call is more complex, create a custom processor that identifies the next call and sets a special exchange property, then go to the same "choice-when-otherwise" block.
For that case you can, for example, keep a map of <"previous-result","next-call"> pairs, or do it however you like.
Your route may then look like this (I use Spring):
    <cml:to uri="previous_uri"/>
    <!-- sets the Exchange property "next_call" based on the result of the previous call -->
    <cml:process ref="my_selector"/>
    <cml:choice>
        <cml:when>
            <cml:simple>${exchangeProperty.next_call} == 'SERVICE1'</cml:simple>
            <cml:to uri="next_service1_uri"/>
            ... process Service1 result logic ...
        </cml:when>
        <cml:when>
            <cml:simple>${exchangeProperty.next_call} == 'SERVICE2'</cml:simple>
            <cml:to uri="next_service2_uri"/>
            ... process Service2 result logic ...
        </cml:when>
        ... and so on ...
    </cml:choice>
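Following the advice to think the logic through in pure Java first, the map of <"previous-result","next-call"> pairs could look roughly like the sketch below. It uses no Camel classes; ServiceChain and all service names and result values are hypothetical, and each "service" is just a function from request to response.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ServiceChain {
    // Service name -> the call itself (request in, response out).
    private final Map<String, Function<String, String>> services = new HashMap<>();
    // Previous result -> name of the service to call next.
    private final Map<String, String> nextCallByResult = new HashMap<>();

    void register(String name, Function<String, String> service) {
        services.put(name, service);
    }

    void route(String previousResult, String nextService) {
        nextCallByResult.put(previousResult, nextService);
    }

    // Calls services starting with `first`, passing the same request to each one.
    // Stops when a result has no mapped successor or the successor is unknown.
    // Note: a cyclic mapping would loop forever; this sketch does not guard against it.
    List<String> run(String first, String request) {
        List<String> responses = new ArrayList<>();
        String next = first;
        while (next != null) {
            Function<String, String> service = services.get(next);
            if (service == null) {
                break;
            }
            String response = service.apply(request);
            responses.add(response); // the place where a response would be saved to the database
            next = nextCallByResult.get(response);
        }
        return responses;
    }
}
```

In the Camel route, the lookup in nextCallByResult corresponds to what the selector processor does when it sets the "next_call" property, and the loop body corresponds to one "when" branch.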

Handle Scenarios when exposing route as a restlet service

I have used REST servlet binding to expose a route as a service.
I have used employeeClientBean as a POJO that wraps the actual call to the employee REST service, basically acting as a service client.
So, based on the method name passed, I call the respective method of the employee REST service through employeeClientBean.
I want to know how I can handle the scenarios described in the comments in the block of code below.
I am new to Camel, but felt POJO binding is better, as it does not couple us to Camel-specific APIs like Exchange and Processor, or require any specific components.
But I am not sure how I can handle these scenarios and return appropriate JSON responses to the user of the route service.
Can someone help me with this?
    public void configure() throws Exception {
        restConfiguration().component("servlet").bindingMode(RestBindingMode.json)
            .dataFormatProperty("prettyPrint", "true")
            .contextPath("camelroute/rest").port(8080);
        rest("/employee").description("Employee Rest Service")
            .consumes("application/json").produces("application/json")
            .get("/{id}").description("Find employee by id").outType(Employee.class)
                .to("bean:employeeClientBean?method=getEmployeeDetails(${header.id})")
                // How to handle and return a response to the user of the route service for the following scenarios for get/{id}:
                // 1. The passed id is not valid as per the system
                // 2. Failure to return details due to some issues
            .post().description("Create a new Employee ").type(Employee.class)
                .to("bean:employeeClientBean?method=createEmployee");
                // How to handle and return the correct response to the user of the route service for the following scenarios:
                // 1. The employee being created already exists in the system
                // 2. Some of the employee fields passed violate the constraints on them
                // 3. Failure to create an employee due to some server-side issue (e.g. a DB failure)
    }
I fear you are putting Camel to bad use. As per the Apache documentation, the REST module supports Consumer implementations, e.g. reading from a REST endpoint, but NOT writing back to a caller.
For your use case you might want to switch framework. Syntactically, Ratpack goes in that direction.

How to capture original endpoint URI within an expression (Recipient List EIP)

I'm attempting to use the Recipient List EIP to dynamically generate the consumer endpoint URI during runtime based on configuration entries in a database (http://camel.apache.org/how-to-use-a-dynamic-uri-in-to.html). I've got a number of routes that I want to handle this way so I'd like to build something that can handle multiple routes generically.
Therefore, my idea is to keep an in memory map of these URI values keyed on some type of identifying information (original endpoint URI seems like a logical choice) which would be updated if/when the database is updated to keep the routes in sync, and prevent having to go to the database for every exchange. Using the RouteBuilder, I am setting up the route with the recipient list and Bean expression.
from(endpointUri).recipientList(bean(MyBean.class, "getUri"));
I know that I can capture various objects such as the exchange, body, headers (as long as I know the name), etc using the Bean binding for the getUri method. Is it possible to somehow get the original endpoint URI value so that I can use it as a key to fetch the correct consumer endpoint?
The Exchange interface has getFromEndpoint() method which returns an Endpoint. The Endpoint interface has getEndpointUri() method which returns a String. Perhaps that's what you need? If that's not sufficient, you could set header value(s) at some point and then subsequently retrieve them later in your route.
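The in-memory map from the question could then be keyed on that string. The sketch below shows only the map itself, without Camel classes: UriRegistry, refresh and getUri are hypothetical names, and in the real bean the key would be obtained via exchange.getFromEndpoint().getEndpointUri(). The URI Camel reports may differ in form from the string passed to from(), so it is worth logging it once before relying on it as a key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class UriRegistry {
    // Original (from) endpoint URI -> target endpoint URI for the recipient list.
    private final Map<String, String> targets = new ConcurrentHashMap<>();

    // Replaces the current entries with the latest values read from the database.
    // Values must be non-null; ConcurrentHashMap rejects null keys and values.
    void refresh(Map<String, String> latest) {
        targets.keySet().retainAll(latest.keySet()); // drop entries that were removed
        targets.putAll(latest);                      // add or update the rest
    }

    // Looks up the target for the route an exchange came from.
    String getUri(String fromEndpointUri) {
        String target = targets.get(fromEndpointUri);
        if (target == null) {
            throw new IllegalStateException("No target configured for " + fromEndpointUri);
        }
        return target;
    }
}
```

Failing loudly on a missing key is a design choice; returning a default or dead-letter URI would be an equally reasonable alternative.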

ZK MVC: passing attributes in a request of two zul pages

I am trying to pass an object from one zul page to another, where:
Both zul pages have their own composers.
I want to set the value of the object in the first zul's composer.
And I want to get it in the second zul's composer.
I have tried Executions.sendRedirect(), but it clears the value of the object; forward() throws an exception saying "use sendRedirect instead to process user request".
Your problem is scope.
In ZK, like other web application frameworks, you have access to a number of different scopes: webapp, desktop, page, session, request, etc. If you have two different pages served from two different URLs, those will have distinct request scopes.
When moving from one request to another, you can pass information on the request URL:
Executions.sendRedirect("page2.zul?myId=1234")
Then, in the composer on page2, you can retrieve this value from the Execution:
Execution execution = Executions.getCurrent();
execution.getParameter("myId");
This is standard HTTP query string business so you're limited to text and numbers here. For passing things like database ids though, this can be quite convenient.
A more robust solution might be to leverage some of ZK's other scopes. For example, you could put your object in the user's Session scope. Refer to my answer to ZK session variable with a menu for implementation details. Note that when using the Session, you are no longer limited to text but can store an actual Object.
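For the query-string variant, values other than plain ids should be URL-encoded before they go into the redirect URL. A minimal sketch using only the JDK follows; RedirectUrls and withParam are hypothetical helper names, and the resulting string is what would be handed to Executions.sendRedirect().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class RedirectUrls {
    // Builds "page?name=value" with both name and value URL-encoded.
    static String withParam(String page, String name, String value) {
        try {
            return page + "?" + URLEncoder.encode(name, "UTF-8")
                    + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withParam("page2.zul", "myId", "1234")); // prints page2.zul?myId=1234
    }
}
```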

Getting IDs of added documents after import operation is complete

I'm trying to setup a Solr dataimport.EventListener to call a SOAP service with the IDs of the documents which have been added in the update event. I have a class which implements org.apache.solr.handler.dataimport.EventListener and I thought that the result of getAllEntityFields() would yield a collection of document IDs. Unfortunately, the result of the method yields an empty list. Even more confusing is that context.getSolrCore().getName() yields an empty string rather than the actual core name. So it seems I am not quite on the right path here.
The current setup is the following:
Whenever a certain sproc is called in SQL, it puts a message in a queue. This queue has a listener on it which initiates a program which reads the queue and calls other sprocs. After the sprocs are complete, a delta or full import operation is performed on Solr. Immediately after, a method is called to update a cache. However, because the import operation on Solr may not have yet been completed before this update method is called the cache may be updated with "stale" data.
I was hoping to use a dataimport EventListener to call the method which updates the cache since my other options seem far too complex (e.g. polling the dataimport URL to determine when to call the update method or using a queue to list document IDs which need to be updated and have the EventListener call a method on a service to receive this queue and update the cache). I'm having a bit of a hard time finding documentation or examples. Does anyone have any ideas on how I should approach the problem?
From what I understand, you are trying to update your cache as and when the documents are added. Depending on which version of Solr you are running, you can do one of the following.
Solr 4.0 provides a script transformer that lets you do this.
http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer
With prior versions of Solr, you can chain one handler on top of another, as answered in the following post.
Solr and custom update handler
