I developed a custom skillset that is called by an indexer within an Azure Cognitive Search Resource. It is possible to reuse this custom skillset with multiple indexers that are defined in a single Azure Cognitive Search Resource.
Is it possible to call that same skillset from an indexer that is in a different Azure Search Service Resource?
The MSFT Docs say that 'As a high-level resource, you can design a skillset once, and then reference it in multiple indexers.' - but it is unclear to me if or how you would reuse the skillset with an indexer that is in a different Azure Search Service Resource.
You can re-use the same skillset definition, but you'll need to create a new skillset instance for the other search service.
Yes, you can take the JSON definition of your skillset and reuse it with multiple indexers in the same service, or with an indexer in a different service.
Here's an example of how I've done this at https://github.com/liamca/covid19search/tree/master/AzureCognitiveSearchService. This folder contains a Jupyter notebook to set up a Cognitive Search service, and the various pieces (skillset, indexer, index, etc.) are stored as .json files and reused each time you create a service.
If you typically use the Azure portal "Import data" experience, there isn't an easy way to use your skillset JSON during that workflow. A workaround is to select a single skill during "Import data"; after the wizard completes, open the skillset that was created with that single skill, paste your custom skillset into the Skillset Definition (JSON), and click "Save".
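If you'd rather script the copy, the skillsets REST API can do it end to end. Here's a minimal sketch in Python, assuming placeholder service names, admin keys, and api-version: it reads the skillset definition from one search service and creates it on the other.

import requests

API_VERSION = '2020-06-30'  # assumption; use a version both services support

def get_skillset(service, admin_key, name):
    url = f'https://{service}.search.windows.net/skillsets/{name}?api-version={API_VERSION}'
    resp = requests.get(url, headers={'api-key': admin_key})
    resp.raise_for_status()
    return resp.json()

def put_skillset(service, admin_key, name, definition):
    # Drop response metadata before re-creating the skillset elsewhere
    definition.pop('@odata.context', None)
    definition.pop('@odata.etag', None)
    url = f'https://{service}.search.windows.net/skillsets/{name}?api-version={API_VERSION}'
    resp = requests.put(url, json=definition, headers={'api-key': admin_key})
    resp.raise_for_status()

# 'service-a', 'service-b', the keys, and the skillset name are placeholders
skillset = get_skillset('service-a', 'ADMIN_KEY_A', 'my-custom-skillset')
put_skillset('service-b', 'ADMIN_KEY_B', 'my-custom-skillset', skillset)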
For storage and VMs, I can check the current quota usage with the following PowerShell commands:
Get-AzureRmStorageUsage
Get-AzureRmVMUsage
Is there a similar thing for Azure Search? Either via PowerShell or the portal is OK.
In addition to monitoring usage in the portal, you could also get it via Service Statistics.
The Service Statistics request is constructed using HTTP GET and returns the current usage and limits of your Azure Search service's resources (indexes, indexers, documents, data sources, storage, and so on).
GET https://[service name].search.windows.net/servicestats?api-version=[api-version]
Content-Type: application/json
api-key: [admin key]
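For example, the same request in Python (service name, admin key, and api-version are placeholders):

import requests

service = 'YOUR_SERVICE_NAME'   # placeholder
url = f'https://{service}.search.windows.net/servicestats?api-version=2020-06-30'
resp = requests.get(url, headers={'api-key': 'YOUR_ADMIN_KEY'})  # admin key required
resp.raise_for_status()
# The response includes current usage and limits per resource type
print(resp.json()['counters'])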
How many S2 or S3 services can I create in a given region?
You can create multiple services within a subscription. Each one can be provisioned at a specific tier. You're limited only by the number of services allowed at each tier. For example, you could create up to 12 services at the Basic tier and another 12 services at the S1 tier within the same subscription. Please refer to this article.
There is an easy way to check your quota usage for an Azure search service, via the portal:
If you open the Overview tab for your search service, the portal shows the quota of resources for your search service and how much of each quota has been used up.
https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity
If you know your service tier, the article above lists the corresponding limits.
We are a SaaS product, and we would like to have per-user data exports that will be used with various analytical (BI) tools like Tableau or Power BI. Instead of managing all those exports manually, we thought of using a cloud database such as AWS Redshift (which would be part of our service). But then it is not clear how a user would access those databases naturally, unless we do some kind of SSO integration with AWS.
So - what is the best practice for exporting data for analytics use in SaaS products?
In this case you can build your security into your backend API layer.
First, set up processes to load your data into Redshift, then make sure that only your backend API server/cluster has access to Redshift (e.g., by placing Redshift in a VPC with no external IP access).
Now that you have your data, you can authenticate the user as usual through your backend service. When a user requests a download through the backend API, the backend can build a query that extracts from Redshift only the data allowed by that user's security role; a sketch of this follows below. To make this possible, you may need to build some kind of security column into your Redshift data model.
I am assuming getting data into Redshift is not a problem.
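As a rough sketch of that extract path (assuming psycopg2, a hypothetical events table with a tenant_id security column, and placeholder connection details):

import psycopg2

# The cluster is reachable only from inside the VPC, per the setup above
conn = psycopg2.connect(host='my-cluster.example.redshift.amazonaws.com',
                        port=5439, dbname='analytics',
                        user='api_service', password='SECRET')

def export_for_tenant(tenant_id):
    """Return only the rows the authenticated user's tenant may see."""
    with conn.cursor() as cur:
        # Parameterized query: tenant_id is never interpolated into SQL text
        cur.execute('SELECT * FROM events WHERE tenant_id = %s', (tenant_id,))
        return cur.fetchall()

# Called after the backend API has authenticated the user and resolved
# their tenant from the session
rows = export_for_tenant(42)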
What you are looking for, if I understand correctly, is an OEM solution.
The problem is how to mimic the security model you have in place for your SaaS offering.
That depends on how complex your security model is.
If it is as simple as authenticating the user, after which they have access to all tenant data (or data that can easily be filtered per user), things are simple for you: trusted authentication will let you authenticate the user, and user filtering will let you show them everything they have access to.
But here is the kicker: if your security model is really complex, it can become really difficult to mimic it within these products.
For integrating Tableau, this link will help:
https://tableau.github.io/embedding-playbook/#
As for Power BI, I am not a fan of this product. I tried to embed a view in one of my applications, and data refresh was a big issue.
It's almost as if they want you to be an Azure shop for real-time reporting. (I like GCP more.)
If you create the APIs and populate datasets yourself, they have tight restrictions like 1 MB/sec.
In other cases, datasets can be refreshed only 8 times.
I gave up on them.
Very recently I got a call from Sisense, and from an OEM perspective they seemed promising as well. You might want to try them.
I'd like to bring entities from another system (e.g., Salesforce) into the Watson chatbot in order to allow the user to interact with them. For example, rather than explicitly defining "Customer" and then building a list of customers in Watson, I'd like it to integrate with Salesforce and bring through all active Customer records, each as its own entity. Is it possible to dynamically update Watson's entity list based on a table in another system?
There is no built-in functionality for that, but it can be done. Watson Assistant / Watson Conversation has an API and SDK support for adding entities. I have used that technique as part of the EgoBot project.
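As a rough illustration of that technique (not the EgoBot code itself), here is a minimal sketch using the ibm-watson Python SDK; the API key, service URL, workspace ID, entity name, and customer list are placeholders, and in practice the names would come from the Salesforce API.

from ibm_watson import AssistantV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('YOUR_API_KEY')            # placeholder
assistant = AssistantV1(version='2020-04-01', authenticator=authenticator)
assistant.set_service_url('YOUR_SERVICE_URL')               # placeholder

WORKSPACE_ID = 'YOUR_WORKSPACE_ID'                          # placeholder

def sync_customers(customer_names):
    """Replace the @customer entity with the current external records."""
    try:
        # Delete and recreate so removed customers disappear as well
        assistant.delete_entity(workspace_id=WORKSPACE_ID, entity='customer')
    except Exception:
        pass  # entity may not exist yet
    assistant.create_entity(
        workspace_id=WORKSPACE_ID,
        entity='customer',
        values=[{'value': name} for name in customer_names])

# In practice this list would come from the Salesforce API
sync_customers(['Acme Corp', 'Globex', 'Initech'])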
However, I would recommend integrating Salesforce as a regular backend. Here is a tutorial on how to use a database as a backend. Another option, depending on what you want to accomplish, is to look at the Salesforce to Watson integration. There is also a Salesforce / Watson SDK for that.
I am a newbie on the Microsoft Azure platform. I want to create multiple databases dynamically. (We are developing a multi-tenant model, so each organization should have its own database; whenever an organization registers with our system, we need to create a new database dynamically.) Please provide some insights on this.
By using Azure Resource Manager Templates you can reliably deploy the whole infrastructure required by each organisation. So if they need a webserver, database and middleware servers, you can define the whole thing in a template and reliably deploy that for every client.
(from the above link)
You can deploy, manage, and monitor all of the resources for your solution as a group, rather than handling these resources individually.
You can repeatedly deploy your solution throughout the development lifecycle and have confidence your resources are deployed in a consistent state.
You can use declarative templates to define your deployment.
You can define the dependencies between resources so they are deployed in the correct order.
You can apply access control to all services in your resource group because Role-Based Access Control (RBAC) is natively integrated into the management platform.
You can apply tags to resources to logically organize all of the resources in your subscription.
You can clarify billing for your organization by viewing the rolled-up costs for the entire group or for a group of resources sharing the same tag.
The link above has a lot of resources for learning how to use templates as well as the syntax and usage.
There are a large number of templates on the Azure ARM Template GitHub page, including some pre-existing templates to get you started deploying SQL Server to Azure (there are also MySQL and PostgreSQL templates if you prefer).
Plus many others that you can work through to get accustomed to how they work.
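To tie templates back to the "create a database when an organization registers" workflow, here is a minimal sketch assuming the azure-identity and azure-mgmt-resource Python SDKs; the subscription ID, resource group, template file, and parameter name are placeholders.

import json
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

credential = DefaultAzureCredential()
client = ResourceManagementClient(credential, 'YOUR_SUBSCRIPTION_ID')

with open('tenant-template.json') as f:     # hypothetical per-tenant ARM template
    template = json.load(f)

def provision_tenant(tenant_name):
    """Deploy the per-tenant template whenever an organization registers."""
    client.deployments.begin_create_or_update(
        'tenants-rg',                        # placeholder resource group
        f'deploy-{tenant_name}',
        {
            'properties': {
                'mode': 'Incremental',
                'template': template,
                'parameters': {'tenantName': {'value': tenant_name}},
            }
        },
    ).result()                               # block until the deployment finishes

provision_tenant('contoso')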
You can use the Azure SQL Database REST API to do so; it's as simple as sending a PUT request to a URL of the form:
https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}/providers/microsoft.sql/servers/{server-name}/databases/{database-name}?api-version={api-version}
Check out these links for more details
https://msdn.microsoft.com/en-us/library/azure/mt163571.aspx
https://msdn.microsoft.com/en-us/library/azure/mt163685.aspx
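For illustration, the same PUT request from Python (subscription, resource group, server, database, token, and api-version are placeholders; check the linked docs for the current request body schema):

import requests

url = ('https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}'
       '/providers/microsoft.sql/servers/{server}/databases/{db}'
       '?api-version=2014-04-01').format(
           sub='YOUR_SUBSCRIPTION_ID', rg='YOUR_RESOURCE_GROUP',
           server='YOUR_SERVER', db='tenant-db-001')

body = {'location': 'West US', 'properties': {'edition': 'Basic'}}
headers = {'Authorization': 'Bearer YOUR_ACCESS_TOKEN'}  # Azure AD token

resp = requests.put(url, json=body, headers=headers)
resp.raise_for_status()
print(resp.status_code)  # 201 when the database is created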
I have an application that uses AngularJS and Elasticsearch as the database. The Elasticsearch version is 1.3.1, so dynamic scripting is enabled by default. Users can add data to Elasticsearch from the application. When searching, how do I avoid injection, i.e., script injection in Elasticsearch queries?
It depends on how the JSON is built. If it's assembled by string interpolation, something like "{query: {match: "%s"}}", then it's possible to pass a string that injects extra text into the query or a script.
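The usual fix is to build the query as a data structure and let the client serialize it, so user input can never break out of the value position. A minimal sketch, assuming the elasticsearch Python client and hypothetical index/field names:

from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])

def search_items(user_input):
    # user_input stays a plain value; the serializer escapes quotes and
    # braces, so it can never become query syntax or a script
    query = {'query': {'match': {'title': user_input}}}
    return es.search(index='items', body=query)

results = search_items('widgets" OR 1==1')  # treated as literal text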
Also check whether your Elasticsearch port is open to everybody; if it is, you should close it.
You should use Groovy sandboxed scripting and limit which libraries can be used.