Replicating Schemaless SOLR Index

I have an index on a schemaless Solr instance. To allow the application to query some of the fields in this index, I have to register those fields using the Schema REST API at http://localhost:8983/solr/schema/fields.
Everything works fine in isolation, and I can also replicate the index to slaves without problems. However, I am unable to query the replicated index using the fields that were registered via the Schema REST API.
That is, if I register the field "button" using the API, I can query on that field on the master, but not on the slave; there I get the error "400 undefined field button".
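For reference, the registration is roughly this kind of call in Solr 4.x (the core name collection1 and the field type here are assumptions for illustration):
curl -X PUT -H 'Content-type:application/json' --data-binary '{"type":"text_general","stored":true}' http://localhost:8983/solr/collection1/schema/fields/button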
Now, I also tried to register this field on the slave in the same way I registered it on the master, via the Schema REST API. This fails with the message "400 This IndexSchema is not mutable".
Any idea how this should be addressed?
I presume that when the schema is fully defined up front, schema.xml can simply be replicated. But what happens with fields created via the REST API?
I am using Solr 4.10.3.

I have not fully validated that this is the solution, but my gut feeling tells me it is. The Solr master was running Solr 4.8.0 and the slave was running Solr 4.10.3, and it looks like the slave did not entirely like the index replicated from 4.8.0. So I downgraded the slave to 4.8.0, and everything works fine.

Related

How to make config changes take effect in Solr 7.3

We are using solr.SynonymFilterFactory with synonyms.txt in Solr at query time. I realized that there was an error in synonyms.txt, corrected it and uploaded the new file. I can see the modified synonyms.txt from the Admin UI, but it looks like queries are still using the old synonyms.txt. I am executing test queries from the Admin UI with debugQuery=true and can see which synonyms get used. How can this be fixed? It is a production environment with 3 nodes using ZooKeeper for management.
You'll need to reload your core for the changes to take effect.
In a single-node Solr you can do that from the Admin UI: go to Core Admin, select your core, and hit Reload. This will slow down some queries momentarily, but it shouldn't drop queries or connections.
You can also reload the core via the API:
curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=your-core'
I am not sure how this works on an environment with 3 nodes, though.
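If the three nodes form a SolrCloud collection (which a ZooKeeper-managed setup usually is), the Collections API should offer the equivalent: a RELOAD action that reloads every replica of the collection. The collection name below is a placeholder:
curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=your-collection'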

How to set Data Import Handler and Scheduler using solrJ Client

I am new to Solr search; I have completed a simple search.
Now I want to index documents directly from a database, and I want to set up a scheduler or trigger that updates the index whenever there is a change in the DB.
I know that I can do this with the DataImportHandler, but I can't understand its flow.
Can you tell me which steps I should start this process with, or can anyone give me some pointers?
I want to do all of this using the SolrJ client.
This task requires many parts to work together. Work through https://wiki.apache.org/solr/DataImportHandler
DataImportHandler is a Solr component, which means that it runs inside the Solr instance. All you have to do is configure Solr and then run the DIH through the Dataimport screen.
SolrJ, on the other hand, is an API that makes it easy for Java applications to talk to Solr, so you can write your own applications that create, modify, search and delete documents in Solr.
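For instance, a minimal SolrJ sketch that indexes a single document (the host and the core name mycore are assumptions):
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/mycore");
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "1");
doc.addField("title", "hello world");
server.add(doc);    // send the document to Solr
server.commit();    // make it visible to searches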
One approach: implement simple edit and delete functions on a button click event, send the id with that URL to a servlet, and do your JDBC operation there. Once that has committed successfully, call your data import command from SolrJ and redirect to your index page (see the sketch below). That's it.
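A minimal sketch of triggering a DIH import from SolrJ might look like this (the host, the core name mycore and the /dataimport handler path are assumptions; DIH itself must already be configured in solrconfig.xml):
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

SolrServer server = new HttpSolrServer("http://localhost:8983/solr/mycore");
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("command", "full-import"); // or "delta-import" for incremental updates
params.set("commit", "true");         // commit once the import finishes
QueryRequest request = new QueryRequest(params);
request.setPath("/dataimport");       // path of the DIH request handler
server.request(request);              // the import itself runs asynchronously inside Solr
Note that DIH has no built-in scheduler, so the usual approach is to invoke this command on a schedule from your own code (for example a Quartz job or a cron-driven script) after your JDBC changes have committed.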

Solr Cloud : no servers hosting shard

We have a cluster of standalone Solr cores (Solr 4.3) for which we had built some custom plugins. I'm now trying to prototype converting the cluster to a Solr Cloud cluster. This is how I am trying to deploy the cores (in 4.7.2).
Start Solr with embedded ZooKeeper:
java -DzkRun -Djetty.port=8985 -jar start.jar
Upload a config into ZooKeeper (the same config as the standalone cores):
zkcli.bat -zkhost localhost:9985 -cmd upconfig -confdir myconfig -confname myconfig
Create a new collection (mycollection) of 2 shards using the Collections API:
http://localhost:8985/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=1&maxShardsPerNode=2&collection.configName=myconfig
So at this point I have two shards under my solr directory with the appropriate core.properties.
But when I go to http://localhost:8985/solr/#/~cloud, I see that the two shards' status is "Down", when they are supposed to be active by default.
And when I try to index documents in them using SolrJ (via the CloudSolrServer API), I get the error "No live SolrServers available to handle this request". I restarted Solr, but the issue remains.
// Connect through ZooKeeper so SolrJ can route requests to live nodes
private CloudSolrServer cloudSolr;
cloudSolr = new CloudSolrServer(zkHOST);
cloudSolr.setZkClientTimeout(zkClientTimeout);
cloudSolr.setDefaultCollection(collectionName);
cloudSolr.connect();
cloudSolr.add(doc);  // fails with "No live SolrServers available"
What am I doing wrong? I did a lot of digging around and saw an old Jira issue saying that Solr Cloud shards won't become active until there are some documents in the index. If that is the reason, that's kind of a catch-22, isn't it?
So anyway, I also tried adding some test documents manually and committed, to see if things improved. Now the shard statistics page correctly gives me the numDocs count, but when I try to query I get "no servers hosting shard". I next tried passing shards.tolerant=true as a query parameter and searching, but no cigar: it says 0 documents found.
Any help would be appreciated. My main objective is to rebuild the old standalone cores using SolrCloud and to test whether our custom request handlers still work as expected. And at this point, I can't even index documents into the 4.7 Solr Cloud collection I have created.

Search using SOLR is not up to date

I am writing an application in which I present search capabilities based on SOLR 4.
I am facing a strange behaviour: during massive indexing, search requests don't always "see" newly indexed data. It seems like the index reader is not getting refreshed frequently enough, and only after I manually reload the core from the Solr Core Admin window do the expected results return.
I am indexing my data using JsonUpdateRequestHandler.
Is it a matter of configuration? Do I need to configure Solr to reopen its index reader more frequently somehow?
Changes to the index are not available until they are committed.
For SolrJ, do
HttpSolrServer server = new HttpSolrServer(host);
server.commit();
For XML, either send <commit/> or add ?commit=true to the URL, e.g. http://localhost:8983/solr/update?commit=true
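Since the question mentions massive indexing, the configuration-side alternative is to let Solr commit automatically. A sketch of the relevant settings inside the <updateHandler> section of solrconfig.xml (the times are illustrative, not recommendations):
<autoCommit>
  <maxTime>60000</maxTime>              <!-- hard commit every 60 s: flushes changes to disk -->
  <openSearcher>false</openSearcher>    <!-- do not open a new searcher on hard commits -->
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>               <!-- soft commit every 5 s: makes new docs searchable -->
</autoSoftCommit>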

Solr not indexing

I have set up Solr to index data from an Oracle DB through the DIH handler. Through the Solr admin I can see that the DB connection is successful and that data is retrieved from the DB, but nothing is added to the index: the message is "0 documents added", even though I can see that 9 records were returned.
The fields in the schema and in db-data-config.xml are one and the same.
Please suggest anything I should look for.
Did you do a full import by hitting http://HOST:PORT/solr/CORE/dataimport?command=full-import? Then the commit should happen by default. You can also try committing on full import explicitly by hitting http://HOST:PORT/solr/CORE/dataimport?command=full-import&commit=true.
Hit http://HOST:PORT/solr/CORE/select?q=*:* and check if you get 9 docs back.
However, if you are running a delta import, then there is a possibility that no documents were changed and you may see 0 docs added/deleted.
If you want to delete the existing Solr index before starting, hit http://HOST:PORT/solr/CORE/update?stream.body=<delete><query>*:*</query></delete>&commit=true and then do a full import & verify.
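You can also check what the last DIH run actually did with http://HOST:PORT/solr/CORE/dataimport?command=status, which reports counters such as rows fetched and documents processed/skipped; comparing those is a quick way to see whether rows are being fetched but not turned into indexed documents.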
