Solr - Unique Key Exception - solr

Is there any way Solr can throw an exception back, either in the status or in the exception message, for an update request that has an existing unique key? Right now, Solr just sends back a successful update response with status 0 even though it is not adding the document. I need to be able to tell from the client side whether a document was not added because of a duplicate unique key.
Thanks!

If a document with the same unique id exists, Solr just replaces the existing doc with the new one. It is by design and, as far as I know, there is no way to change it.
You can query Solr before you update/add a doc, so that you are not adding it again, but that is not really transactional (Solr is not a database). It would work if you are the only one updating Solr and the changes are serialized.
If you have this stringent requirement on not adding existing ids, you could use an intermediary database, load it, and reindex Solr from that.
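One option not covered above, offered as a hedged sketch rather than a confirmed fix for this setup: Solr's optimistic concurrency feature treats a negative `_version_` value on an incoming document as "this document must not exist yet" and rejects the update with HTTP 409 (Conflict) if it does. This assumes the update log is enabled and the schema has a `_version_` field. The helper names and the `id`/`title` fields are hypothetical; the code only builds the request body and interprets a status code, and it does not contact Solr.

```python
import json


def build_create_only_body(doc):
    """Return a JSON body for Solr's /update handler that asks Solr to
    accept the document only if no document with the same id exists."""
    guarded = dict(doc)          # copy, so the caller's dict is not modified
    guarded["_version_"] = -1    # negative version: the document must not exist
    return json.dumps([guarded])


def was_duplicate(http_status):
    """Assumption: Solr answers a failed version check with 409 Conflict."""
    return http_status == 409


body = build_create_only_body({"id": "doc-1", "title": "example"})
```

The body would be POSTed to the core's `/update` handler with `Content-Type: application/json`; the client can then treat a 409 response as "not added because the id already exists".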

Related

Does the AzureSearch query key change if the indexer is reset or query parameters changed?

I am planning to use Azure Search and have the exposed API invoked from a client application. I expect the indexer and the fields returned from Azure Search to change over time.
I wanted to check whether the Azure Search API access key might change, and what steps we need to take to ensure that it stays static.
This is critical, as distributing any new key to client devices could be challenging.
Azure Search indexers won't change query keys, even if you reset the indexer. The only API that can remove query keys is Query Keys - Delete.

How to know when a put on Cloud Datastore in App Engine reaches Milestone B?

I have an application that uses Cloud Datastore via App Engine to save data.
I need to refresh the clients when an object is put in the database. To do this, after the object is put, the server sends a sync message to the clients. The clients read the sync message and send a query to the server. The server runs a query to return the new result.
The problem is that when the query runs, the put object doesn't appear in the query results. Reading the documentation, I suppose the reason is that the put hasn't reached Milestone B yet (see https://cloud.google.com/appengine/articles/transaction_isolation), because the object does appear on a later call.
How can I know when a put reaches "Milestone B"? If it isn't possible to know, how can I implement this logic (refresh clients after a put)?
You can ensure up-to-date query results by using an ancestor query, or, if you know the key of the specific entity you need to retrieve, you can fetch it by key rather than using a query.
This page discusses the trade-offs of using ancestor queries.
The data do not appear in the result of your query because the indexes have not been updated yet.
There is some latency before the indexes will be updated and unfortunately there is no way to know when this will happen.
The only way to handle this case is to use the entity's key, which is the only index guaranteed to be updated as soon as the entity is stored.
https://cloud.google.com/appengine/docs/java/datastore/entities
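The difference between a lookup by key and a query can be illustrated with a toy model. This is not the Datastore API; `ToyStore` and its methods are hypothetical and only mimic the behaviour described above, where a put is visible by key immediately but shows up in query results only after the indexes catch up.

```python
class ToyStore:
    """Toy model: key lookups see a put at once, queries only after indexing."""

    def __init__(self):
        self._entities = {}   # key -> entity, updated by put() immediately
        self._index = {}      # what queries can see
        self._pending = []    # keys written but not yet indexed

    def put(self, key, entity):
        self._entities[key] = entity
        self._pending.append(key)

    def get(self, key):
        """Lookup by key: reads the entity storage directly."""
        return self._entities.get(key)

    def query(self, kind):
        """Query: reads the index, which may lag behind the puts."""
        return [e for e in self._index.values() if e["kind"] == kind]

    def apply_index_updates(self):
        """Simulates the index catching up at some later, unknown time."""
        for key in self._pending:
            self._index[key] = self._entities[key]
        self._pending.clear()


store = ToyStore()
store.put("task-1", {"kind": "Task", "name": "demo"})
by_key = store.get("task-1")       # already visible
by_query = store.query("Task")     # still empty until the index is updated
```

Under this model, one way to refresh clients right after a put is to include the new entity's key in the sync message, so that clients fetch it by key instead of relying on a query.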

Changes to Solr schema.xml do not update after stopping and restarting Solr

I am new to Solr and want to make my own schema.xml, so I added some fields, then stopped Solr and restarted it. In the Solr admin UI, I can see the changes under the schema option, but the content of the schema browser doesn't change. When I try to index a document, I get an error saying that the field I just added to the schema does not exist. The content of the schema browser is not the same as the schema file.
Changing the schema of a core doesn't change the documents you already have there, which is why they look the same even after you restart the Solr service. You need to re-upload the documents with the new fields specified (if they are required fields) after you make a schema change to get these new fields for existing documents.
From here, I went to the path of my core instance to make the changes:
/usr/local/Cellar/solr@7.7/7.7.3_1/server/solr/drupal
Then I was able to confirm the changes by clicking on Files and scrolling to where I made the change.

Add metadata from database to Solr Index created by Nutch

I have a bespoke CMS that needs to be searchable in Solr. Currently, I am using Nutch to crawl the pages based on a seed list generated from the CMS itself.
I need to be able to add metadata stored in the CMS database to the document indexed in Solr. So, the thought here is that the page text (html generated by the CMS) is crawled via Nutch and the metadata is added to the Solr document where the unique ID (in this instance, the URL) is the same.
As such, the metadata from the DB can be used for facets / filtering etc while full-text search and ranking is handled via the document added by Nutch.
Is this pattern possible? Is there any way to update the fields expected from the CMS DB after Nutch has added it to Solr?
Solr has the ability to partially update a document, provided that all your document fields are stored. See this. This way, you can define several fields for your document that are not originally filled by Nutch; after Nutch adds the document to Solr, you can update those fields with your database values.
Even so, I think there is one major problem to be solved. Whenever Nutch recrawls a page, it replaces the entire document in Solr, so your updated fields are lost. Even the first time, you must make sure that Nutch has added the document before the fields are updated. To solve this, I think you need to write a plugin for Nutch or a special request handler for Solr to know when updates are happening.
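As a minimal sketch of the partial update described above, the request body for Solr's atomic update uses the `set` modifier on each field to be changed. It assumes the unique key field is named `id` and holds the page URL; the `category` and `author` fields and the function name are hypothetical. The code only builds the JSON body and does not send it.

```python
import json


def build_atomic_update(url, metadata):
    """Build a JSON body that sets metadata fields on an already indexed
    document without resending the crawled page text."""
    doc = {"id": url}                      # assumed uniqueKey: the page URL
    for field, value in metadata.items():
        doc[field] = {"set": value}        # "set" replaces the field's value
    return json.dumps([doc])


body = build_atomic_update(
    "https://example.com/page-1",          # hypothetical URL
    {"category": "news", "author": "cms-user"},
)
```

The body would be POSTed to the core's `/update` handler with `Content-Type: application/json`. Because a recrawl replaces the whole document, the same update would have to be reapplied after each Nutch indexing run.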

How can I put my custom URLs to SOLR results

I am trying to index my database data with SOLR, and I have successfully indexed it.
What I need is:
I need to put a URL with every result.
The URL for each result item will be different.
Each result item needs to append its item_id (which is available as a field) to its URL.
I am very new to SOLR configuration and SOLR queries, so please help me implement a better search result XML.
Thanks in advance.
You can store URL in an additional field (stored=true indexed=false) and then simply retrieve it when you're searching.
If the URLs differ only in the ID/primary key, it is even better to compose them yourself by appending the document ID to some fixed URL.
That would include altering your page which displays search results.
What kind of application is your Solr integrated with?
Where are those documents of yours stored? In a db? How do you get to them through your application?
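Composing the URL from the ID, as suggested above, could look like the following sketch. The base URL is a placeholder and the function names are hypothetical; only the `item_id` field comes from the question. The result documents are assumed to be plain dictionaries, as a client would get after parsing a Solr response.

```python
from urllib.parse import quote

BASE_URL = "https://example.com/items/"   # hypothetical fixed part of the URL


def result_url(item_id, base_url=BASE_URL):
    """Append the percent-encoded item_id to the fixed base URL."""
    return base_url + quote(str(item_id), safe="")


def add_urls(docs):
    """Return copies of the result documents with a 'url' entry added."""
    return [dict(doc, url=result_url(doc["item_id"])) for doc in docs]


results = add_urls([{"item_id": 42, "title": "First"}])
```

This keeps the index unchanged and moves the URL logic into the page that displays the search results.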
