I'm reading the Sunspot documentation and found that Sunspot is based on the RSolr library.
Is there any way to get the connection so I can perform a low-level request like this?
Pseudo-code:
solr = Sunspot.connection
response = solr.get 'select', :params => {:q => '*:*'}
Not as of the current version (1.3.2). Well, you can, but you'd have to instance_eval a few objects to reach the underlying RSolr object. Patches are welcome at http://github.com/sunspot/sunspot; a good accessor method for the RSolr connection would be a useful addition.
How can I query Solr, using the HTTP API, for information about a collection? I'm not talking about the collection's indexes, which I could query using the COLSTATUS command. I'm just talking about the basic details of a collection, which you can see when you click on a collection in the Solr web admin page, such as config name.
When wondering where information shown in the web interface comes from, the easiest way is to open your browser's developer tools and go to the Network section. Since the interface is a small JavaScript application, it uses the available REST API in the background, the same one you'd query yourself.
Extensive collection information can be retrieved by querying:
/solr/admin/collections?action=CLUSTERSTATUS&wt=json
(Any _ parameter is just present for cache busting).
This will return a list of all the collections present and their metadata, such as which config set they use and what shards the collection consists of. This is the same API endpoint that the web interface uses.
"collections":{
"aaaaa":{
"pullReplicas":"0",
"replicationFactor":"1",
"shards":{"shard1":{
"range":"80000000-7fffffff",
"state":"active",
"replicas":{"core_node2":{
"core":"aaaaa_shard1_replica_n1",
"base_url":"http://...:8983/solr",
"node_name":"...:8983_solr",
"state":"down",
"type":"NRT",
"force_set_state":"false",
"leader":"true"}}}},
"router":{"name":"compositeId"},
"maxShardsPerNode":"1",
"autoAddReplicas":"false",
"nrtReplicas":"1",
"tlogReplicas":"0",
"znodeVersion":7,
"configName":"_default"},
...
}
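As a rough illustration of calling the endpoint above from code, here is a minimal sketch using only the JDK (Java 11+). The class and method names are hypothetical, the base URL http://localhost:8983 is an assumption about where Solr is reachable, and the optional collection parameter of CLUSTERSTATUS is used to limit the output to one collection.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class ClusterStatusExample {

    // Builds the CLUSTERSTATUS URL. baseUrl is an assumption, e.g. "http://localhost:8983".
    // If collection is null or empty, the status of all collections is requested.
    static String clusterStatusUrl(String baseUrl, String collection) {
        String url = baseUrl + "/solr/admin/collections?action=CLUSTERSTATUS&wt=json";
        if (collection != null && !collection.isEmpty()) {
            url += "&collection=" + URLEncoder.encode(collection, StandardCharsets.UTF_8);
        }
        return url;
    }

    // Sends a plain GET request and returns the raw JSON response body.
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Without arguments, only print the URL; pass a base URL to actually query Solr.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8983";
        String url = clusterStatusUrl(baseUrl, "aaaaa");
        System.out.println(url);
        if (args.length > 0) {
            System.out.println(fetch(url));
        }
    }
}
```

The configName value can then be read from the returned JSON with whatever JSON library you already use.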
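Please try the code below (SolrJ 7.x/8.x).
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

String getConfigName(String collectionName) throws Exception {
    // Provide the list of ZooKeeper instances (host:port strings)
    List<String> zkHosts = Arrays.asList(/* your ZooKeeper hosts */);
    // Get the SolrCloud client (no ZooKeeper chroot); it is closed automatically
    try (CloudSolrClient cloudSolrClient = new CloudSolrClient.Builder(zkHosts, Optional.empty()).build()) {
        cloudSolrClient.connect();
        // Read the config name for the collection from ZooKeeper
        return cloudSolrClient.getZkStateReader().readConfigName(collectionName);
    }
}
Please handle the exception(s) as appropriate on your end.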
Using Camel 2.19.3 ...
I want to read from a TOPIC (IBM-MQ). I set both a
"durableSubscriptionName" and a client ID.
from ("jms:topic:TEST/TOPIC1?durableSubscriptionName=TestSubscription1&clientId=101021&exchangePattern=InOnly")
However, the DefaultJmsMessageContainerFactory gives me an error:
JMWCC0101: The clientID cannot be null
I've tried the same configuration using Spring JmsTemplate directly, and by
setting the clientId on the connection, and that works.
Do I need to specify a custom "connectionFactory"? Looking at the code for
DefaultJmsMessageContainerFactory, it looks like it should handle setting
the clientID on the underlying connection.
Any thoughts on what I should look for?
What worked for us was to assign a client-id to the Connection Factory, not to the Camel JMS Component, nor to the specific consumer. That level of granularity is all we need in our use-case.
Since we use IBM Liberty, we added a property in server.xml, but there are probably other ways to accomplish the same thing.
<jmsConnectionFactory ..... >
<properties.wmqJms ... clientId="99999" ... />
</jmsConnectionFactory>
We are using Azure Search and need to implement a retry strategy as well as store the IDs of failed documents, as described.
Is there any documentation or sample showing how to implement a RetryPolicy strategy in Azure Search?
Thanks
This is what I used:
private async Task<DocumentIndexResult> IndexWithExponentialBackoffAsync(IndexBatch<IndexModel> indexBatch)
{
    return await Policy
        .Handle<IndexBatchException>()
        .WaitAndRetryAsync(5, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)), (ex, span) =>
        {
            indexBatch = ((IndexBatchException)ex).FindFailedActionsToRetry(indexBatch, x => x.Id);
        })
        .ExecuteAsync(async () => await _searchClient.IndexAsync(indexBatch));
}
It uses the Polly library to handle exponential backoff. In this case I use a model, IndexModel, that has an ID field named Id.
If you'd like to log or store the IDs of the failed attempts, you can do that in the WaitAndRetryAsync callback, like this:
((IndexBatchException)ex).IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key).<Do something here>
There is currently no sample showing how to properly retry on IndexBatchException. However, there is a method you can use to make it easier to implement: IndexBatchException.FindFailedActionsToRetry. This method extracts the IDs of failed documents from the IndexBatchException, correlates them with the actions in a given batch, and returns a new batch containing only the failed actions that need to be retried.
Regarding the rest of the retry logic, you might find this code in the ClientRuntime library useful. You will need to tweak the parameters based on the characteristics of your load. The important thing to remember is that you should use exponential backoff before retrying to help your service recover, since otherwise your requests may be throttled.
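To make the backoff idea concrete, here is a small language-neutral sketch written in Java with only the JDK. It is not the Azure Search SDK or Polly API; the class name, method names, and the base/cap parameters are all hypothetical, and the delay doubles per attempt starting from the base value.

```java
import java.util.concurrent.Callable;

public class BackoffExample {

    // Delay before the retry that follows the given failed attempt (1-based):
    // baseMillis * 2^(attempt - 1), capped at capMillis.
    // Assumes a modest baseMillis (for example 1000) so the shift cannot overflow.
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis << Math.min(attempt - 1, 30);
        return Math.min(delay, capMillis);
    }

    // Runs the action, retrying on any exception with exponential backoff.
    // Rethrows the last exception once maxAttempts is exhausted.
    static <T> T retry(Callable<T> action, int maxAttempts, long baseMillis, long capMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis(attempt, baseMillis, capMillis));
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        // Print the delay schedule for five attempts with a 1 s base and a 60 s cap.
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt + ": " + delayMillis(attempt, 1000, 60000) + " ms");
        }
    }
}
```

In a real indexing loop, the action would resubmit only the failed documents, which is the part FindFailedActionsToRetry handles in the .NET SDK.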
Solr has an Admin UI where we can check every collection that was deployed to SolrCloud. For example, I can see whether a slice/shard in a collection is up or not, as mentioned in the URL below.
Our production environment doesn't provide access to this Admin UI for security reasons. I need to provide an API to get the status of every collection, its shards, and each shard's replicas. I am using the Solr APIs to do that:
http://lucene.apache.org/solr/4_7_2/solr-solrj/index.html
CloudSolrServer server = new CloudSolrServer(<zk quorum>);
ZkStateReader reader = server.getZkStateReader();
Collection<Slice> slices = reader.getClusterState().getSlices(collection);
Iterator<Slice> iter = slices.iterator();
while (iter.hasNext()) {
    Slice slice = iter.next();
    System.out.println(slice.getName());
    System.out.println(slice.getState());
}
The code above always returns Active as the shard state, even when its replica is shown as down in the UI. I assume it returns only the state of the shard, not the state of the shard's leader or replicas.
How can I get the replica status through the Solr APIs? Is there an API for this?
And which API does the Solr Admin UI use to get the status of a shard's replicas/leader?
Thanks
The code is not looking at replica status. Here is a version that prints the replica status:
CloudSolrServer server = new CloudSolrServer(zknodesurlstring);
server.setDefaultCollection("mycollection");
server.connect();
ZkStateReader reader = server.getZkStateReader();
Collection<Slice> slices = reader.getClusterState().getSlices("mycollection");
for (Slice slice : slices) {
    // Shard-level name and state, printed once per shard
    System.out.println(slice.getName());
    System.out.println(slice.getState());
    // Replica-level state, which is what the Admin UI shows as down
    for (Replica replica : slice.getReplicas()) {
        System.out.println("replica state for " + replica.getStr("core") + " : " + replica.getStr("state"));
    }
}
Check http://{ipaddress}:{port}/solr/admin/info/system
Look at the Solr log when browsing the web interface. Since the web interface is purely a client side application, you can see which endpoints on the Solr server it queries to retrieve information about the current state of the cluster.
The response format used to create the graph is probably pretty straightforward (since it's parsed in the web interface).
This also works for the other information displayed in the Admin interface.
You can use Solr's Ping API to check the health status of all replicas for a given collection.
Request format: http://localhost:8983/solr/Collection-Name/admin/ping?distrib=true&wt=xml
This command pings all replicas of the given collection.
In Java:
public boolean isActive(final String collectionName) throws SolrServerException, IOException {
    SolrPing ping = new SolrPing();
    ping.getParams().add("distrib", "true"); // make it a distributed request against the collection
    SolrPingResponse response = ping.process(solrClient, collectionName);
    return response.getStatus() == 0;
}
How can I execute a RealTime Get request from the SolrJ client?
I specifically need to retrieve uncommitted documents in order to check the _version_ field for optimistic concurrency.
Since the RealTime Get is implemented with an alternate requestHandler, you would just need to use the setRequestHandler() method on SolrQuery passing "/get" as the handler name.
Please see the testRealTimeGet() method in this SolrExampleTests.java file from the Solr source for a full example.
Here is the snippet from that file:
SolrQuery q = new SolrQuery();
q.setRequestHandler("/get");
q.set("id", "DOCID");
q.set("fl", "id,name,aaa:[value v=aaa]");
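Since the /get handler is also reachable over plain HTTP, the equivalent request can be sketched as a URL. This is a minimal illustration using only the JDK (Java 10+); the class and method names are hypothetical, and the base URL and collection name are assumptions. The fl parameter is restricted here to id and _version_, which is the field needed for optimistic concurrency.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class RealTimeGetUrl {

    // Builds a RealTime Get URL for a single document ID.
    // baseUrl (e.g. "http://localhost:8983") and collection are assumptions about your setup.
    static String build(String baseUrl, String collection, String id) {
        return baseUrl + "/solr/" + collection + "/get"
                + "?id=" + URLEncoder.encode(id, StandardCharsets.UTF_8)
                + "&fl=id,_version_&wt=json";
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983", "mycollection", "DOCID"));
    }
}
```

The _version_ value in the response can then be sent back with the update so that Solr rejects it if the document changed in the meantime.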