Can we map the custom index tags of a blob document inside an Azure Storage container to Azure Cognitive Search? - azure-cognitive-search

We tag the documents uploaded to the storage container so that they can be searched by a particular tag in Cognitive Search. We found that custom metadata properties can be linked to a search index field with the same name, but this does not work for custom tags. Is there any way to map the custom tags to a search index field through an indexer?

Related

Getting a more friendly result highlight on Azure Search for PDFs

I'm indexing PDFs into an Azure Cognitive Search index and created a search page.
I'm using the highlights feature, but the returned content is pretty ugly. Sometimes it even returns the content from the footer of the files. It removes spaces and it looks like a mess.
To index the files I'm using the default configuration. The 'content' field uses a Standard Analyzer. I have configured an Indexer connected to a Data Source that points to an Azure Storage folder.
What is the best approach to display a more user-friendly excerpt of the file?

How to filter any title from my Firebase Database

How to search for any item from the Firebase Realtime Database in a RecyclerView
This is my Firebase Database. When a user types a word into the edit text, my app searches the database for items whose title matches.
If I use a query selector, I have to specify the child, which means I can't search my entire database.
Asking for help:
How can I filter or search for any title in my Firebase database?

How to specify a particular document to be searched in IBM Watson discovery service

IBM Watson Discovery service
I want to get the set of keywords in a particular document in a collection using the Discovery service. I tried the URL below:
https://watson-api-explorer.mybluemix.net/discovery/api/v1/environments/{environment secret key}/collections/{collection secret key}/query?passages=true&count=10&highlight=true&version=2017-11-07
But, it is fetching from all documents in that collection. How can I specify a particular document to be searched?
Note that a collection can hold many documents, so unless you put an id field in the query, the query searches across all documents in the collection you specified.
According to the IBM Watson Discovery expert @Anish Mathur, you can query for a particular document using a field query.
So something like
environments/{id envir}/collections/{id coll}/query?query=id:{document_id}
See the Official API Reference for query with WDS.
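
To make the field query above concrete, here is a minimal Python sketch that only builds the request URL. The function name and the placeholder base URL are hypothetical; the version date is the one from the question's URL, and the path layout follows that same URL. Authentication and the actual HTTP call are left out.

```python
from urllib.parse import urlencode


def build_document_query_url(base_url, environment_id, collection_id,
                             document_id, version="2017-11-07"):
    """Build a Discovery query URL restricted to a single document.

    base_url is an assumption: pass the API root of your own instance,
    e.g. something ending in /discovery/api/v1.
    """
    params = {
        "version": version,
        # Field query on the document id, as suggested in the answer.
        "query": f"id:{document_id}",
    }
    path = (f"{base_url}/environments/{environment_id}"
            f"/collections/{collection_id}/query")
    # urlencode percent-encodes the colon in "id:..." as %3A.
    return f"{path}?{urlencode(params)}"


if __name__ == "__main__":
    # All identifiers below are placeholders, not real ids.
    print(build_document_query_url(
        "https://example.invalid/discovery/api/v1",
        "my-environment-id", "my-collection-id", "my-document-id"))
```

The other parameters from the question (passages, count, highlight) can be added to the params dictionary in the same way.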

Can I use .NET SDK to create Azure Table datasource for Azure Search index?

I would like to configure a datasource including, among other things, a field mapping for a JSON string containing a collection.
Yes, you can: get the latest Azure Search SDK from nuget.org and use the DataSource.AzureTableStorage method.
You no longer need to use field mappings to populate string collection fields. Azure Search will do this for you automatically as long as the strings are formatted as a JSON array of strings. However, should you need field mappings for other reasons, they are available as the FieldMappings property of the Indexer class.
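
As an illustration of what "formatted as a JSON array of strings" means for the value stored in the table column, here is a small Python sketch. It is not part of the .NET SDK; both helper names are hypothetical and only show the expected shape of the string.

```python
import json


def to_string_collection(values):
    """Serialize values as a JSON array of strings, e.g. '["red", "green"]'."""
    return json.dumps([str(v) for v in values])


def is_json_string_array(text):
    """Return True if text parses as a JSON array whose items are all strings."""
    try:
        parsed = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(parsed, list) and all(isinstance(x, str) for x in parsed)


if __name__ == "__main__":
    column_value = to_string_collection(["red", "green"])
    print(column_value, is_json_string_array(column_value))
```

A comma-separated value such as red,green would not pass this check, so a column stored that way would not fit the format described above.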

Add metadata from database to Solr Index created by Nutch

I have a bespoke CMS that needs to be searchable in Solr. Currently, I am using Nutch to crawl the pages based on a seed list generated from the CMS itself.
I need to be able to add metadata stored in the CMS database to the document indexed in Solr. So, the thought here is that the page text (html generated by the CMS) is crawled via Nutch and the metadata is added to the Solr document where the unique ID (in this instance, the URL) is the same.
As such, the metadata from the DB can be used for facets / filtering etc while full-text search and ranking is handled via the document added by Nutch.
Is this pattern possible? Is there any way to update the fields expected from the CMS DB after Nutch has added it to Solr?
Solr has the ability to partially update a document, provided that all your document fields are stored. See this. This way, you can define several fields for your document that are not originally filled by Nutch; after Nutch has added the document to Solr, you can update those fields with your database values.
That said, I think there is one major problem to be solved. Whenever Nutch recrawls a page, it replaces the entire document in Solr, so your updated fields are lost. Even the first time, you must make sure that Nutch adds the document first and that the fields are updated only afterwards. To solve this, I think you need to write a plugin for Nutch or a special request handler for Solr that knows when updates are happening.
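
For reference, a partial (atomic) update in Solr is expressed by wrapping each new field value in a "set" modifier, keyed by the document's unique ID. The Python sketch below only builds that JSON payload. The helper name and the example field names (category, author) are hypothetical, and it assumes the unique key field is called id and holds the page URL, as described in the question.

```python
import json


def build_atomic_update(doc_id, metadata, id_field="id"):
    """Build one Solr atomic-update document.

    Each metadata field is wrapped in {"set": value} so that Solr
    replaces only that field instead of the whole document.
    """
    doc = {id_field: doc_id}
    for field, value in metadata.items():
        doc[field] = {"set": value}
    return doc


if __name__ == "__main__":
    # Hypothetical CMS metadata for a page that Nutch has already indexed.
    update = build_atomic_update(
        "http://example.com/page",
        {"category": "news", "author": "jane"},
    )
    # This JSON body would be POSTed to the collection's update handler.
    print(json.dumps([update], indent=2))
```

Because a recrawl overwrites the whole document, a payload like this would have to be sent again after each Nutch update of the same URL.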
