I have a case where I have a set of fields to be updated in Solr. The input I receive is a map, the key being the field name and the value the updated value.
My question: in such a scenario, should I use curl to update the document, or SolrJ, where I have to convert the map to a SolrInputDocument and then call add? Will the first approach be faster than the second?
You can convert the map to a SolrInputDocument.
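A minimal sketch of that conversion (the class and method names here are mine, not from SolrJ itself):

```java
import java.util.Map;
import org.apache.solr.common.SolrInputDocument;

public class MapToDoc {
    /** Builds a SolrInputDocument from a field-name -> value map. */
    public static SolrInputDocument toSolrInputDocument(Map<String, Object> fields) {
        SolrInputDocument doc = new SolrInputDocument();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            // For a partial (atomic) update of an existing document, wrap the value
            // as Collections.singletonMap("set", value) instead of setting it directly.
            doc.setField(entry.getKey(), entry.getValue());
        }
        return doc;
    }
}
```

Whichever transport you pick, the resulting document can then be passed to the client's add call.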
curl talks to the HTTP API directly; the requests end up being handled much like HttpSolrServer requests (I'm not quite sure about the internals).
However, in my experience I would strongly recommend ConcurrentUpdateSolrServer, which is meant for updates. SolrJ's ConcurrentUpdateSolrServer class is described as follows:
ConcurrentUpdateSolrServer buffers all added documents and writes them
into open HTTP connections. This class is thread safe. Params from
UpdateRequest are converted to http request parameters. When params
change between UpdateRequests a new HTTP request is started. Although
any SolrServer request can be made with this implementation, it is
only recommended to use ConcurrentUpdateSolrServer with /update
requests. The class HttpSolrServer is better suited for the query
interface.
Below is an example way to get a Solr server instance:
/**
 * Method lookUpConcurrentUpdateSolrServerInstance.
 *
 * @param baseURL String
 * @param collection String
 * @return ConcurrentUpdateSolrClient
 * @throws SolrServerException
 * @throws IOException
 * @throws Exception
 */
public static ConcurrentUpdateSolrClient lookUpConcurrentUpdateSolrServerInstance(String baseURL, String collection)
        throws SolrServerException, IOException, Exception {
    ConcurrentUpdateSolrClient solrServer = null;
    if (StringUtils.isNotBlank(baseURL) && StringUtils.isNotBlank(collection)) {
        // queue size and thread count; tune both for your indexing load
        solrServer = new ConcurrentUpdateSolrClient(baseURL + collection, 10, 10);
        checkServerStatus(solrServer);
    }
    return solrServer;
}
I am trying to implement a simple custom request handler in Solr 7.3. I needed some clarifications on the methods available via the Solr Java API.
As per my understanding, I have extended my Java Class with "SearchHandler" and then overridden the "handleRequestBody" method. I am trying to understand the flow from the beginning. Here is a sample query in the browser.
http://localhost:8983/solr/customcollection/customhandler?q=John&fl=id,last_name&maxRows=10
1) Once you enter the above query in the browser and press return, the Solr customhandler will be triggered. Solr will look up the necessary jars from which the handler was created.
2) Once it finds the main class, it will execute the following method, which is overridden from the "SearchHandler" parent class.
public void handleRequestBody(SolrRequest req, SolrResponse resp) throws Exception
3) The SolrRequest req object will hold all the Solr parameters from the query, in this case q, fl and maxRows.
4) Using the following code I unpack these parameters.
SolrParams params = req.getParams();
String q = params.get(CommonParams.Q);
String fl = params.get(CommonParams.FL);
String rows = params.get(CommonParams.ROWS);
5) I create a Solr object that lets me connect to my Solr Cloud:
String zkHostString = "localhost:5181";
SolrClient solr = new CloudSolrClient.Builder().withZkHost(zkHostString).build();
6) Here is where I need help:
a) How do I use the unpacked Solr parameters from the original query and make a call to the "solr" object to return results?
b) How do I make use of the "resp" object?
c) Most of the examples that I found on the internet show how to print the results to STDOUT. However, since I am using a custom handler, I would like to return the results to the user (in this case, the Solr Admin UI or the browser).
Any help is truly appreciated.
Thanks
public class SolrQueryTest extends org.apache.solr.handler.component.SearchHandler {

    String zkHostString = "localhost:5181";
    SolrClient solr = new CloudSolrClient.Builder().withZkHost(zkHostString).build();

    private static final Logger log = Logger.getLogger(SolrQueryTest.class.getName());

    public void handleRequestBody(SolrRequest req, SolrResponse resp) throws Exception {
        SolrParams params = req.getParams();
        String q = params.get(CommonParams.Q);
        String rows = params.get(CommonParams.ROWS);

        SolrQuery query = new SolrQuery(q);
        query.setShowDebugInfo(true);
        query.set("indent", "true");

        // need to know how to call Solr using the above query parameters
        // once the response is received, how to send it back to the browser and NOT STDOUT
    }
}
I'm looking for some advice on a specific issue that is holding us back.
I'm trying to create a custom RequestHandler within the Solr API (SolrJ) that makes a query call back to the server.
I'm not finding any good, runnable examples of this online. Possibly I'm approaching this wrong. Any advice would be appreciated.
Here is one of my attempts to coding this.
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception
{
    SolrCore solrServerCore = req.getCore();
    SolrRequestHandler handler = solrServerCore.getRequestHandler("/select");

    ModifiableSolrParams params = new ModifiableSolrParams();
    params.add("q", "*:*");

    // local request/response objects, named so they don't shadow the method parameters
    SolrQueryRequest selectReq = new LocalSolrQueryRequest(solrServerCore, params);
    SolrQueryResponse selectRsp = new SolrQueryResponse();
    solrServerCore.execute(handler, selectReq, selectRsp);

    //!!! Not returning a structured response
    System.out.println(selectRsp.toString());
    System.out.println(selectRsp.getReturnFields());
    System.out.println(selectRsp.getValues().toString());
}
Apache Camel 2.12.1
I have a route setup like this:
public void configure() throws Exception {
from("direct:start")
.process(new AuthorizationHeaderProcessor(configureCreds()))
.to(httpSourceEndpoint)
.process(new GenerateSQLFromMessageProcessor())
.enrich("jdbc:dataSource", new DBAggregator())
//...do things with result...
1) HttpSourceEndPoint is a GET request to some URL.
2) I then want to use the result of this to generate the SQL as per GenerateSQLFromMessageProcessor, which is provided as input to the JDBC route.
My problem is, within the DBAggregator, the parameters that come through are:
oldExchange = the raw SQL string that was sent to the JDBC call
newExchange = the result set from the DB query
i.e. there is no sign of the original message that came from the HTTP source endpoint, which is what I would have expected aggregation to give me. How am I supposed to combine the two streams? Was it the GenerateSQLFromMessageProcessor call that consumed the original message? If so, should you specify the SQL in a bean for an enrich?
EDIT
So setting in the header like this:
public void configure() throws Exception {
from("direct:start")
.process(new AuthorizationHeaderProcessor(configureCreds()))
.to(httpSourceEndpoint)
.setHeader("sql", new BeanExpression(MySQLBean.class, "methodToGenerateSQL"))
.enrich("jdbc:dataSource", new DBAggregator())
//...do things with result...
results in my aggregator looking like this:
public class DBAggregator implements AggregationStrategy {
@Override
public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
Here I have:
oldExchange = the resulting SQL string that methodToGenerateSQL generated
newExchange = the result set from the SQL query
The problem is I do not have access to the original message that came from httpSourceEndpoint.
As this is an aggregator I would have expected oldExchange to be the incoming message, not just an SQL string.
After all, it is an aggregator, and yet I have effectively lost the incoming message - this is not "enriching"!
Thanks,
Mr Tea
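One way to keep the original message available through the enrich, sketched under my own assumptions (the "OriginalBody" header name and the route placement are mine, not from this thread): set a header right after httpSourceEndpoint, e.g. `.setHeader("OriginalBody", body())`, before the body is replaced by the SQL string, then read it back in the strategy:

```java
import org.apache.camel.Exchange;
import org.apache.camel.processor.aggregate.AggregationStrategy;

public class DBAggregator implements AggregationStrategy {

    public static final String ORIGINAL_BODY = "OriginalBody"; // hypothetical header name

    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        // oldExchange is the exchange as it entered enrich(): its body is the SQL
        // string, but headers set earlier in the route are still present.
        Object original = oldExchange.getIn().getHeader(ORIGINAL_BODY);
        Object dbResult = newExchange.getIn().getBody();
        // Combine however suits you; here the original message becomes the body
        // again and the DB result rides along in a header.
        oldExchange.getIn().setBody(original);
        oldExchange.getIn().setHeader("DbResult", dbResult);
        return oldExchange;
    }
}
```

Since enrich() hands the pre-enrichment exchange to the strategy as oldExchange, anything stashed in a header or exchange property before the enrich survives the body replacement.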
I want to upload a JPG file and a JSON-serialized Java object. On the server I am using Apache CXF, on the client I am integration testing with rest-assured.
My server code looks like:
@POST
@Path("/document")
@Consumes(MediaType.MULTIPART_FORM_DATA)
public Response storeTravelDocument(
@Context UriInfo uriInfo,
@Multipart(value = "document") JsonBean bean,
@Multipart(value = "image") InputStream pictureStream)
throws IOException
{}
My client code looks like:
given().
multiPart("document", new File("./data/json.txt"), "application/json").
multiPart("image", new File("./data/image.txt"), "image/jpeg").
expect().
statusCode(Response.Status.CREATED.getStatusCode()).
when().
post("/document");
Everything works fine when I read the JSON part from a file, as in the first multiPart line. However, when I want to serialize the JSON instance I run into problems. I tried many variants, but none worked.
I thought this variant should work: on the client
JsonBean json = new JsonBean();
json.setVal1("Value 1");
json.setVal2("Value 2");
given().
contentType("application/json").
formParam("document", json).
multiPart("image", new File("./data/image.txt"), "image/jpeg").
...
and on the server
public Response storeTravelDocument(
@Context UriInfo uriInfo,
@FormParam(value = "document") JsonBean bean,
@Multipart(value = "image") InputStream pictureStream)
but no. Can anyone tell me how it should be?
Try a different approach (it worked for me); I am not sure if it is suitable in your case.
Make JsonBean a JAXB entity, that is, add @XmlRootElement above the class definition.
Then, instead of formParam
given().
contentType("application/json").
body(bean). //bean is your JsonBean
multiPart("image", new File("./data/image.txt"), "image/jpeg").
then
public Response storeTravelDocument(
@Context UriInfo uriInfo,
JsonBean bean, //should be deserialized properly
@Multipart(value = "image") InputStream pictureStream)
I've never tried that with an @Multipart part, but hopefully it will work.
Multipart/form-data follows the rules of multipart MIME data streams, see w3.org. This means that each part of the request forms a part in the stream. Rest-assured already supports simple fields (strings), files and streams, but not object serialization into a part. After asking on the mailing list, Johan Haleby (the author of rest-assured) suggested adding an issue. The issue has already been accepted, see issue 166.
The server will stay as it is:
@POST
@Path("/document")
@Consumes(MediaType.MULTIPART_FORM_DATA)
public Response storeTravelDocument(
@Context UriInfo uriInfo,
@Multipart(value = "document") JsonBean bean,
@Multipart(value = "image") InputStream pictureStream)
throws IOException
{}
The client code will look like:
given().
multiPartObject("document", objectToSerialize, "application/json").
multiPart("image", new File("./data/image.txt"), "image/jpeg").
expect().
statusCode(Response.Status.CREATED.getStatusCode()).
when().
post("/document");
Maybe the name "multiPartObject" will change. We will see once it is implemented.
We have an instance of Solr, where we've found that turning on autoCommit in the solrconfig.xml actually may serve our needs well. However there are some instances and some batch operations where we want to temporarily disable autocommit. I have not been able to find anything, but I'm wondering if anyone knew if via SolrJ you could disable autocommit for a certain process, and then re-enable it?
You can't disable and enable autocommit as it's configured in solrconfig.xml. However, you can leave it disabled in solrconfig.xml and use commitWithin for those add commands that need autocommit.
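For instance, SolrJ's SolrClient.add overload takes a commitWithin time in milliseconds (the URL, collection and field values below are placeholders):

```java
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinExample {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();
        SolrInputDocument doc = new SolrInputDocument();
        doc.setField("id", "1");
        // Ask Solr to make this add visible within 10 seconds,
        // even with autoCommit left off in solrconfig.xml.
        client.add(doc, 10_000);
        client.close();
    }
}
```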
Answering because this is the first result for "solr disable autocommit".
This is now possible with the new config API, which allows you to override some properties set in solrconfig.xml without reloading the core.
Solrj does not implement that new API yet.
You should not disable autocommits, see this article.
If you want to do a bulk indexing of many documents at once, set updateHandler.autoCommit.openSearcher=false and disable autoSoftCommits:
/**
 * Disables the autoSoftCommit feature.
 * Use {@link #reenableAutoSoftCommit()} to reenable.
 * @throws IOException network error.
 * @throws SolrServerException solr error.
 */
public void disableAutoSoftCommit() throws SolrServerException, IOException
{
// Solrj does not support the config API yet.
String command = "{\"set-property\": {" +
"\"updateHandler.autoSoftCommit.maxDocs\": -1," +
"\"updateHandler.autoSoftCommit.maxTime\": -1" +
"}}";
GenericSolrRequest rq = new GenericSolrRequest(SolrRequest.METHOD.POST, "/config", null);
ContentStream content = new ContentStreamBase.StringStream(command);
rq.setContentStreams(Collections.singleton(content));
rq.process(solrClient);
}
/**
 * Undo {@link #disableAutoSoftCommit()}.
 * @throws IOException network error.
 * @throws SolrServerException solr error.
 */
public void reenableAutoSoftCommit() throws SolrServerException, IOException
{
// Solrj does not support the config API yet.
String command = "{\"unset-property\": [" +
"\"updateHandler.autoSoftCommit.maxDocs\"," +
"\"updateHandler.autoSoftCommit.maxTime\"" +
"]}";
GenericSolrRequest rq = new GenericSolrRequest(SolrRequest.METHOD.POST, "/config", null);
ContentStream content = new ContentStreamBase.StringStream(command);
rq.setContentStreams(Collections.singleton(content));
rq.process(solrClient);
}
You can see the overridden properties at http://localhost:8983/solr/<core>/config/overlay