Data security with Azure Cognitive Search

I have a client that does not want the documents to live in Azure permanently. They are OK with moving files up to Azure to be indexed, but after indexing they want the files removed from Azure and the search results to point to their on-prem storage.
Is this use case possible with Azure Cognitive Search?

You can push any data you want into a search index via the service's REST API, as long as you have network connectivity to the search service.
I'm not sure why your client doesn't want to store documents in Azure, but you should make sure they're aware that the ingested document data exists in the search index independently of any source data. That is, if they're concerned about their data being stored in Azure, keep in mind that the indexed data will always be stored in Azure, since that's how the search service works.
If you're asking whether it's possible to point an Azure Search indexer at a data source that is not hosted in Azure, then no, that's not generally supported. There are some third-party organizations (e.g. Accenture, BA Insight) that will host a connector to a non-Azure data source on your behalf, though.
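To make the push model concrete, here is a minimal sketch using the azure-search-documents Python SDK. The service endpoint, index name ("docs-index"), field names, and the idea of storing an on-prem file path next to the extracted text are illustrative assumptions, not details from the question; as noted above, whatever text you push this way does remain in the index.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Placeholder endpoint, index name, and key -- replace with your own.
search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="docs-index",
    credential=AzureKeyCredential("<admin-key>"),
)

# Push only the extracted text plus a pointer back to the on-prem file, so the
# file itself does not have to remain in Azure after you have processed it.
documents = [
    {
        "id": "doc-001",
        "content": "Extracted text of the document goes here...",
        "onPremPath": r"\\fileserver\contracts\doc-001.pdf",
    }
]
result = search_client.upload_documents(documents=documents)
print([(r.key, r.succeeded) for r in result])
```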

Related

Auto sync Azure SQL DB data to Azure Search index

I want to sync any DML operations on an Azure SQL DB to an Azure Search index immediately.
I have gone through this question:
How does Auto-indexing/sync of Azure SQL DB with Azure Search Works?
That question has no answer, and it was posted almost 5 years ago.
With the integrated change tracking policy in place, do we have an auto-sync feature by any means now?
A Function App does not have a SQL trigger available.
I don't want to run a while-true loop, use a timer, or call the indexer whenever data gets updated.
Please suggest whether there is a better approach or any built-in feature.
Azure Functions don't have a SQL trigger, but Logic Apps do: https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-sqlazure Logic Apps can also trigger functions and custom APIs, so you should be able to trigger an indexing operation from a SQL operation this way. Do keep in mind, however, that the indexing process itself may be delayed or take time once triggered, so your index may not immediately reflect the changes, depending on the payload.
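If you wire this up through Logic Apps, the function or custom API it calls only needs to kick off an on-demand indexer run. Here is a minimal sketch with the azure-search-documents Python SDK; the service endpoint, key, and indexer name "azuresql-indexer" are placeholders, not values from the question.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexerClient

indexer_client = SearchIndexerClient(
    endpoint="https://<your-service>.search.windows.net",
    credential=AzureKeyCredential("<admin-key>"),
)

# Queue an on-demand run of an existing indexer. The call returns once the run
# is accepted, so the index may still lag the SQL change for a short while.
indexer_client.run_indexer("azuresql-indexer")
```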
There are two ways you can integrate Azure SQL data with Azure Search.
Built-in indexer. You have built-in indexer support for Azure SQL. It supports incremental updating of the index, but with a limited refresh rate. Currently, you can run incremental indexing at most every 5 minutes. See Connect to and index Azure SQL content using an Azure Cognitive Search indexer.
Push API. To support immediate updates, you have to push data via the Push API. In this case you only create the index, not an indexer. The code that pushes content to Azure SQL is also responsible for pushing that content to Azure Search. Check out this example: Tutorial: Optimize indexing with the push API.
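As a rough sketch of the push approach (not an official sample), the code below uses the azure-search-documents Python SDK and would be called from the same code path that performs the DML against Azure SQL; the index name and field names are assumptions.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="products-index",
    credential=AzureKeyCredential("<admin-key>"),
)

def on_row_upserted(row: dict) -> None:
    # merge_or_upload updates the document if the key already exists,
    # otherwise it inserts a new one -- mirroring an INSERT/UPDATE in SQL.
    search_client.merge_or_upload_documents(documents=[{
        "id": str(row["ProductId"]),
        "name": row["Name"],
        "description": row["Description"],
    }])

def on_row_deleted(product_id: int) -> None:
    # Deletes only need the key field of the index.
    search_client.delete_documents(documents=[{"id": str(product_id)}])
```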

How to update data in an Azure SQL database using Stream Analytics?

How can I update or delete data in an Azure SQL DB using Azure Stream Analytics?
Currently, Azure Stream Analytics (ASA) only supports inserting (appending) rows to SQL outputs (Azure SQL Database and Azure Synapse Analytics).
You should consider using workarounds to enable UPDATE, UPSERT, or MERGE on SQL databases, with Azure Functions as the intermediary layer.
You can find more information about such workarounds in this MS article.
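As a loose sketch of that intermediary layer, the Azure Function below assumes a Stream Analytics job configured with an Azure Functions output (which POSTs a JSON array of events) and a hypothetical dbo.DeviceState table; the table, columns, and connection-string setting are made up for illustration.

```python
import os

import azure.functions as func
import pyodbc

MERGE_SQL = """
MERGE dbo.DeviceState AS target
USING (SELECT ? AS DeviceId, ? AS Value) AS source
ON target.DeviceId = source.DeviceId
WHEN MATCHED THEN UPDATE SET target.Value = source.Value
WHEN NOT MATCHED THEN INSERT (DeviceId, Value) VALUES (source.DeviceId, source.Value);
"""

def main(req: func.HttpRequest) -> func.HttpResponse:
    events = req.get_json()  # Stream Analytics posts a JSON array of events
    conn = pyodbc.connect(os.environ["SQL_CONNECTION_STRING"])
    cursor = conn.cursor()
    for event in events:
        # MERGE provides the upsert semantics that ASA's SQL output lacks.
        cursor.execute(MERGE_SQL, event["DeviceId"], event["Value"])
    conn.commit()
    conn.close()
    return func.HttpResponse(status_code=200)
```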
First, we need to know what Azure Stream Analytics is.
An Azure Stream Analytics job consists of an input, a query, and an output. Stream Analytics ingests data from Azure Event Hubs, Azure IoT Hub, or Azure Blob Storage. The query, which is based on the SQL query language, can be used to easily filter, sort, aggregate, and join streaming data over a period of time. You can also extend this SQL language with JavaScript and C# user-defined functions (UDFs). You can easily adjust the event ordering options and the duration of time windows when performing aggregation operations through simple language constructs and/or configurations.
Azure Stream Analytics now natively supports Azure SQL Database as a source of reference data input. Developers can author a query to extract the dataset from Azure SQL Database, and configure a refresh interval for scenarios that require slowly changing reference datasets.
That means that you cannot update or delete data in an Azure SQL DB using Azure Stream Analytics.
Azure Stream Analytics is not a database management tool.
Hope this helps.

Azure Search support for indexing Image/Binary data types in SQL Server: is it possible, or are there any alternatives?

We have a requirement to search a SQL table that contains document data in an Image/binary column. We are trying to do this with both Elasticsearch and Azure Search. We were able to proceed with Elasticsearch, but hit a roadblock with Azure Search, as indexing these data types is not possible through the indexer.
Can anybody help us? Is there any possibility of achieving this with Azure Search?
Please see my response to your question on MSDN.
In short, currently, in order to use Azure Search's built-in document extraction capabilities, the files need to be stored in Azure Blob Storage. Then, you can use the blob indexer.
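Assuming the binary content is first copied out of SQL Server into a blob container, the data source and blob indexer could then be created, for example, with the azure-search-documents Python SDK. The container, index, and indexer names below are placeholders, and the target index is assumed to already exist.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexer,
    SearchIndexerDataContainer,
    SearchIndexerDataSourceConnection,
)

indexer_client = SearchIndexerClient(
    endpoint="https://<your-service>.search.windows.net",
    credential=AzureKeyCredential("<admin-key>"),
)

# Point a data source at the blob container that now holds the files.
data_source = SearchIndexerDataSourceConnection(
    name="docs-blob-ds",
    type="azureblob",
    connection_string="<storage-connection-string>",
    container=SearchIndexerDataContainer(name="docs"),
)
indexer_client.create_data_source_connection(data_source)

# The blob indexer cracks PDFs, Office documents, etc. and populates the index.
indexer = SearchIndexer(
    name="docs-blob-indexer",
    data_source_name="docs-blob-ds",
    target_index_name="docs-index",
)
indexer_client.create_indexer(indexer)
```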

How to get a notification from an Azure SQL database on insert

I need to get a notification or call a web service whenever a row is inserted into a specific table in my Azure SQL database. I have been searching the web for a good solution, but I haven't found any.
I tried to call a web app service in Azure, but this is not allowed from Azure SQL databases.
I looked at Azure Logic Apps, but the SQL Server connector has been removed.
How do I get notified when a row is inserted?
Although this is not natively supported in SQL Azure, there are a few different options you can consider.
1) Modify the calling code to insert the row into the table and also write a message to an Azure Storage queue. You can have a separate process that drains messages from the queue and invokes the web service, so that these actions are loosely coupled (see the sketch after this list).
2) Enable change tracking on the specific table so that your app can discover the latest changes (i.e., inserts) to the table. This feature is well documented if you search the Azure SQL docs.
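Here is a bare-bones sketch of option 1, where the insert path also drops a JSON message on an Azure Storage queue and a separate worker drains it; the queue name, message shape, and webhook URL are invented for the example.

```python
import json
import os

import requests
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string(
    os.environ["STORAGE_CONNECTION_STRING"], queue_name="row-inserts"
)

# Drain the queue and notify the web service; run this in a WebJob, worker
# process, or a queue-triggered Azure Function.
for message in queue.receive_messages():
    payload = json.loads(message.content)
    requests.post("https://example.com/api/row-inserted", json=payload, timeout=10)
    queue.delete_message(message)  # only remove the message once the call succeeded
```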

Azure Search from existing database

I have an existing SQL Server database that uses Full Text Search and Semantic search for the UI's primary searching capability. The tables used in the search contain around 1 million rows of data.
I'm looking at using Azure Search to replace this; however, my database relies upon the Full-Text-enabled tables for its core functionality. I'd like to use Azure Search for the "searching", but still have my current table structure in place to be able to edit records and display the detail record when something has been found.
My plan to implement this is to:
Create the Azure Search indexes
Push all of the searchable data from the Full-Text-enabled table in SQL Server to Azure Search
Have Azure Search return the IDs of documents that match the search criteria
Query the existing database to fetch the rows with those IDs to display on the front end
When some data in the existing database changes, schedule an update in Azure Search to ensure the data stays in sync
Is this a good approach? How do hybrid implementations work where your existing data is in an on-prem database but you want to take advantage of Azure Search?
Overall, your approach seems reasonable. A couple of pointers that might be useful:
Azure SQL now has support for Full-Text Search, so if moving to Azure SQL is an option for you and you still want to use Azure Search, you can use the Azure SQL indexer. Or you can run SQL Server on IaaS VMs and configure the indexer using the instructions here.
With an on-prem SQL Server, you might be able to use the Azure Data Factory sink for Azure Search to sync data.
I actually just went through this process, almost exactly. Instead of SQL Server, we are using a different backend data store.
Foremost, we wrote an application to sync all existing data. Pretty simple.
For new documents being added, we made the choice to sync to Azure Search synchronously rather than async. We made this choice because we measured excellent performance when adding to and updating the index. 50-200 ms response time and no failures over hundreds of thousands of records. We couldn't justify the additional cost of building and maintaining workers, durable queues, etc. Caveat: Our web service is located in the same Azure region as the Azure Search instance. If your SQL Server is on-prem, you could experience longer latencies.
We ended up storing about 80% of each record in Azure Search. Obviously, the more you store in Azure Search, the less likely you'll have to perform a worst-case serial "double query."
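For what it's worth, the "double query" outlined in the question (Azure Search returns matching IDs, the existing database returns the detail rows) could look roughly like this in Python; the index name, field names, table, and connection string are assumptions for illustration.

```python
import pyodbc
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="catalog-index",
    credential=AzureKeyCredential("<query-key>"),
)

# Step 1: ask Azure Search only for the IDs of the matching documents.
results = search_client.search(search_text="wireless keyboard", select=["id"], top=50)
ids = [doc["id"] for doc in results]

# Step 2: hydrate the detail records from the existing SQL Server tables.
if ids:
    placeholders = ",".join("?" * len(ids))
    conn = pyodbc.connect("<on-prem-sql-connection-string>")
    cursor = conn.cursor()
    cursor.execute(
        f"SELECT * FROM dbo.Products WHERE ProductId IN ({placeholders})", ids
    )
    rows = cursor.fetchall()
    conn.close()
```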
