WRRCSR42 Create solr cluster - solr

I followed the instructions in Getting started with the Retrieve and Rank service to create a Solr cluster; however, I received the following message: WRRCSR42:The requesting service instance may not create any more free solar clusters(current limit:1)
My questions: What does this message mean, and what should I do to get the cluster ID?
Thank you,

The error tells you that you've already created a Solr Cluster. IBM Watson R&R only provides one free cluster.
To retrieve the list of existing clusters, you can use the same endpoint as when you attempt to create the cluster, but issue a regular GET request instead of a POST request.
https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters
The response lists your existing Solr clusters and includes your solr_cluster_id.
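As a rough illustration of the GET request described above, here is a minimal Python sketch. It assumes the service accepts HTTP basic authentication with the username and password from your service credentials; the function name and the placeholder credentials are made up for this example. It only builds the request, so nothing is sent until you pass it to urllib.request.urlopen.

```python
import base64
import urllib.request

# Endpoint quoted in the answer above.
BASE_URL = "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters"


def build_list_clusters_request(username, password):
    """Build (but do not send) a GET request that lists existing Solr clusters.

    Assumption: the service uses HTTP basic auth with the service credentials.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        BASE_URL,
        headers={"Authorization": f"Basic {token}"},
        method="GET",  # same endpoint as cluster creation, but GET instead of POST
    )


# Hypothetical placeholder credentials; replace with your own before sending.
request = build_list_clusters_request("USERNAME", "PASSWORD")
print(request.get_method(), request.full_url)
```

To actually send it, call urllib.request.urlopen(request) and read the JSON body, which should contain the solr_cluster_id mentioned above.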

Related

SOLR CLOUD - Uploading LTR features

We have a SOLR Cloud cluster with 4 nodes. Collections are created with 4 shards and 2 replicas.
I was using REST endpoints (pointing to a single instance for all operations) to create features and models:
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/model-store
When I call the REST endpoints to fetch the existing features and models:
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/feature-store
http://{{SOLRCLOUD-HOST}}:8983/solr/{{ACTIVE_INDEX_NAME}}/schema/model-store
Sometimes I see the feature/model I created, and other times the response says they don't exist.
At this point, when I restart my cluster, the GET calls always return the created features and models.
A couple of questions:
1. Like config sets, is there a way to upload features and models without using the REST endpoint?
2. Is a restart required after uploading features and models?
3. Should the feature/model upload be executed against all collections in the cluster? (Assume I have more than one collection with the same data created for different purposes. Please don't ask why, I have them.)
4. Are the features/models I created available to collections created in the future with the same config set? I ask because the uploaded feature/model shows up inside the config set as _schema_model-store.json and _schema_feature-store.json.
Please advise. Thanks!
Did you find any answers?
I was stuck with the feature store not being available on all shards. Your suggestion of restarting Solr helped. Is that the permanent solution?
To answer your Question #3:
You need to upload the features/models for each collection, since the collection is part of the upload URL. Notice the "techproducts" in the feature upload example from the Solr documentation:
curl -XPUT 'http://localhost:8983/solr/techproducts/schema/feature-store' --data-binary "@/path/myFeatures.json" -H 'Content-type:application/json'
Just reload the collection to make the feature and model JSON files available on all shards of the collection. A restart of Solr is not required.
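The per-collection upload followed by a reload can be sketched as a list of HTTP calls. This is only an illustration: the helper names are hypothetical, the base URL is an assumption taken from the curl example, and the reload uses the Collections API RELOAD action. The code only builds the URLs and does not contact Solr.

```python
from urllib.parse import urlencode


def store_url(base_url, collection, store):
    """URL of the LTR feature-store or model-store for one collection."""
    return f"{base_url}/{collection}/schema/{store}"


def reload_collection_url(base_url, collection):
    """Collections API call that reloads a collection on all of its shards."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{base_url}/admin/collections?{query}"


def ltr_rollout_steps(base_url, collections):
    """List the (method, url) calls needed to upload features and models
    to every collection and then reload each one, instead of restarting Solr."""
    steps = []
    for collection in collections:
        steps.append(("PUT", store_url(base_url, collection, "feature-store")))
        steps.append(("PUT", store_url(base_url, collection, "model-store")))
        steps.append(("GET", reload_collection_url(base_url, collection)))
    return steps


# Assumed base URL; adjust host and port for your cluster.
for method, url in ltr_rollout_steps("http://localhost:8983/solr", ["techproducts"]):
    print(method, url)
```

Each PUT would carry the features or model JSON file as its body, as in the curl command above.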

Can I use Stackdriver Trace with a PHP application in GKE?

I want to check, every day, the RPC latency of each endpoint of a CakePHP application running in a GKE cluster. From reading the documentation, I found this is possible using the Google PHP client or a Zipkin server, but I don't know how easy either would be to introduce into our app, and both seem tough to me.
In addition, I noticed that the GKE cluster configuration has a Stackdriver Trace option, which is currently disabled on our cluster. Can we trace spans if we enable it?
Could you give me some advice?
I succeeded in sending traces to GCP's Trace API from the PHP client via REST. The traces set through the PHP client parameters were visible, but my endpoint for the Trace API has since stopped working, and I don't know why. Maybe it is not well supported yet, since the documentation has many ambiguous expressions. So I ended up monitoring server responses with BigQuery, fluentd, and Data Studio instead. That seems to be the best solution, because the time span can be set automatically from table names suffixed with yyyymmdd, and we can watch arbitrary metrics with custom queries or calculated fields.

Getting 403 not authorized when indexing documents on Retrieve and Rank

I am suddenly getting a 403 error when I try to POST an update to the Retrieve and Rank service. This code is under development but it has been working up until yesterday. The failure occurs only when doing a POST to /v1/solr_clusters/{solr_cluster_id}/solr/{collection_name}/update, and it fails the same way whether I do it via my program, the Swagger API documentation, or cURL. All other operations to this service that I've tried work fine when using the same credentials that I'm using with this POST. The error message I'm getting back is
Error: WRRCSH004: Service [1d111267-76b7-417a-98bd-4e9a58072ef9] is not authorized for cluster [sc262b05e8_dcf5_40b4_b662_ae85058ff07f]!. I don't know where the identifier (1d111267-76b7-417a-98bd-4e9a58072ef9) is coming from; that's not the userid I'm sending in.
Looking into your issue, it appears your Bluemix organization has multiple service instances. The 403 issue you were seeing is because you're trying to access a Solr cluster using credentials from one of your instances against a cluster in the other instance. The 1d111267-76b7-417a-98bd-4e9a58072ef9 represents one of these service instances, but the issue is that the cluster you're trying to access is not part of that instance. A good way to test this is to ensure you're using the same credentials that generate the 403 but simply try to list the Solr clusters you have created by doing a GET against https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/.
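A small sketch of that check in Python: given the body returned by the GET, see whether the cluster from the error message is listed for these credentials. The response shape used here (a top-level "clusters" array whose entries carry "solr_cluster_id") is an assumption, and the function name and the sample body are made up for illustration.

```python
import json


def cluster_visible(list_response_body, cluster_id):
    """Return True if cluster_id appears in the cluster listing.

    Assumption: the listing looks like
    {"clusters": [{"solr_cluster_id": "..."}, ...]}.
    """
    payload = json.loads(list_response_body)
    listed_ids = {c.get("solr_cluster_id") for c in payload.get("clusters", [])}
    return cluster_id in listed_ids


# Hypothetical response body for an instance that owns a different cluster.
sample_body = '{"clusters": [{"solr_cluster_id": "sc_example_other_cluster"}]}'
print(cluster_visible(sample_body, "sc262b05e8_dcf5_40b4_b662_ae85058ff07f"))
```

If the cluster ID is missing from the listing, the credentials most likely belong to a different service instance than the one that owns the cluster.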
As for the 500 issue, I wasn't able to see anything on our end. If you're still experiencing that I would suggest posting another question and we can look into things again.
Thanks,
-Scott

Replicating Schemaless SOLR Index

I have an index on a Schemaless solr instance. To allow the application to query some of the fields that are in this index, I have to register these fields using the schema REST API http://localhost:8983/solr/schema/fields.
All works fine in isolation. I can also replicate the index to slaves without problem. However, I am unable to query the replicated index using the fields that were registered via the schema REST API.
That means, if I register the field "button" using the API, I can query using this field on master, but I cannot query on slave. I get error message 400 undefined field button.
Now, I also tried to register this field on the slave in the same way I registered it on the master using the schema REST API. This fails with the message: 400 This IndexSchema is not mutable.
Any idea how this should be addressed?
I presume that when the schema is well defined, the schema.xml can be replicated. But what happens with fields created via the REST API?
I am using SOLR 4.10.3
I have not fully validated that this is the solution to this problem, but my gut feeling tells me that it is. The SOLR master was running SOLR 4.8.0 and the SOLR Slave was running SOLR 4.10.3. It looks like the slave did not completely like the index replicated from 4.8.0. So I downgraded the slave to 4.8.0 and everything works fine.

Query solr cluster for state of nodes

I'm trying to tweak our system status check to see the state of the Solr nodes in our SolrCloud. I'm facing the following problems:
We send a query to each of the Solr nodes separately. If we get a response and the status of the response is 0, we assume the node is running. Unfortunately, we've seen cases in which the node is recovering or even down and select queries are still handled.
In the hope of preventing this, we've added a check which sends a ping request to Solr. If the status returned by this request reads 'OK', we assume the node is up. Unfortunately, even with this request, the check won't fail if the node is recovering or down.
My question is: What is the correct way to check the status of a node in SolrCloud?
If you are using SolrCloud, it's recommended to maintain an explicit ZooKeeper ensemble as well, because the ZooKeeper ensemble maintains the current status of each node and each shard in the SolrCloud. This status is also reflected in the SolrCloud admin window.
Go to the Admin window. Click on "Cloud".
Then click on "Tree" to get a tree view of your SolrCloud architecture.
Click /clusterstate.json to view the SolrCloud status.
This JSON file (clusterstate.json) holds the SolrCloud status information. If you are running an explicit ZooKeeper ensemble, the following steps retrieve the SolrCloud status.
Go to the path "zookeeper/installation/directory/bin"
Execute ./zkCli.sh -server ZK_IP:ZK_PORT (E.g ./zkCli.sh -server localhost:2181)
Execute get /clusterstate.json
You'll find the SolrCloud status.
Note: ZK_IP - The host IP where ZooKeeper is running.
ZK_PORT - ZooKeeper's client port.
You actually don't want /clusterstate.json, as this only covers the case where collections are already present. From ZooKeeper you need /live_nodes.
Because ZooKeeper is the authority on which Solr nodes are members of the SolrCloud cluster, it follows that you should go to it first to discover which members are accessible. This is how all SolrCloud clients work, and it is probably the best way to approach the problem.
/live_nodes contains a file for each live Solr node, regardless of what collections exist or where the replicas are located.
Once you have resolved /live_nodes... you can call clusterstatus on any Solr instance with the address and port from one of the live-nodes.
http://localhost:8983/solr/admin/collections?action=clusterstatus&wt=json
clusterstatus provides a detailed overview of Solr nodes, collections, replicas, etc. Everything you would want to know.
As a final note, it's very wise to set SOLR_HOST inside of solr.in.sh configuration (/etc/default/solr.in.sh) - by default 'localhost' is used to reference the solr node. Setting this value to the public address you want the Solr node identified by will prevent ZooKeeper from returning the address "localhost" to clients when attempting to reach a Solr Node.
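To make the clusterstatus check concrete, here is a minimal Python sketch that takes the already-parsed JSON from the clusterstatus call and reports replicas that are either not "active" or hosted on a node missing from live_nodes. The function name and the sample data are hypothetical, and the nesting used (cluster, collections, shards, replicas, each replica with "state" and "node_name") is my understanding of the response layout, so verify it against your Solr version.

```python
def unhealthy_replicas(cluster_status):
    """Return (collection, shard, replica, node, state) for every replica
    that is not active or whose node is not listed in live_nodes."""
    cluster = cluster_status.get("cluster", {})
    live_nodes = set(cluster.get("live_nodes", []))
    problems = []
    for coll_name, coll in cluster.get("collections", {}).items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                node = replica.get("node_name")
                state = replica.get("state")
                # A replica can still read "active" after its node died,
                # so the state is cross-checked against live_nodes.
                if state != "active" or node not in live_nodes:
                    problems.append((coll_name, shard_name, replica_name, node, state))
    return problems


# Hypothetical sample shaped like a clusterstatus response.
sample_status = {
    "cluster": {
        "live_nodes": ["10.0.0.1:8983_solr", "10.0.0.2:8983_solr"],
        "collections": {
            "products": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"state": "active", "node_name": "10.0.0.1:8983_solr"},
                            "core_node2": {"state": "recovering", "node_name": "10.0.0.2:8983_solr"},
                            "core_node3": {"state": "active", "node_name": "10.0.0.3:8983_solr"},
                        }
                    }
                }
            }
        },
    }
}

for problem in unhealthy_replicas(sample_status):
    print(problem)
```

An empty result means every replica is active on a live node; a status check could alert on anything else.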

Resources