Azure Search: use a single index on multiple data sources

I have multiple Azure tables, across multiple Azure storage accounts, that have the exact same format. Is it possible to configure several data sources in Azure Search to feed a single index, so that a search on that index returns results aggregated from all the data sources (Azure tables)?
So far, each time I configure a new data source, I must also create a new index (with a new index name). Attempting to reuse an existing index name results in the error "Another index with this name already exists".
Thank you for any help or pointers you might provide.

Yes, it's possible, but we don't currently support it in the Azure Portal.
When you go through the "Import data" flow in the portal, it creates a data source, an indexer and an index for you.
If you want more sources for that index, you need to create additional data sources and indexers, with the new indexers pointing at the existing index. Since the portal can't do this yet, use the .NET SDK (if you're using .NET), call the REST API directly from your app, or use any tool that can make HTTP requests, such as PowerShell, curl or Fiddler.
The documentation that describes the indexer-related REST APIs is here:
https://msdn.microsoft.com/en-us/library/azure/dn946891.aspx
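If it helps, here is a minimal C# sketch of that REST flow: it creates a second data source and a second indexer whose targetIndexName is the existing index. The service name, admin key, connection string, resource names and API version are all placeholder assumptions, not values from the question.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class AddSourceToExistingIndex
{
    static async Task Main()
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Add("api-key", "<admin-api-key>");
        var baseUri = "https://<service-name>.search.windows.net";

        // 1. A second data source pointing at another storage account's table.
        var dataSource = @"{
            ""name"": ""table-datasource-2"",
            ""type"": ""azuretable"",
            ""credentials"": { ""connectionString"": ""<storage-connection-string-2>"" },
            ""container"": { ""name"": ""mytable"" }
        }";
        await Put(client, baseUri + "/datasources/table-datasource-2?api-version=2015-02-28", dataSource);

        // 2. A second indexer that targets the EXISTING index.
        var indexer = @"{
            ""name"": ""table-indexer-2"",
            ""dataSourceName"": ""table-datasource-2"",
            ""targetIndexName"": ""my-existing-index""
        }";
        await Put(client, baseUri + "/indexers/table-indexer-2?api-version=2015-02-28", indexer);
    }

    static async Task Put(HttpClient client, string uri, string json)
    {
        var response = await client.PutAsync(uri, new StringContent(json, Encoding.UTF8, "application/json"));
        Console.WriteLine(uri + " -> " + (int)response.StatusCode);
    }
}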

Related

How to parse an Excel worksheet/table and map Excel columns to a data entity in Dynamics 365 Operations

What actions/connectors could one use to parse an Excel file and map columns from an Excel table to an external store? In this case I wish to create a record in Dynamics 365 Operations using an OData entity.
Thanks
Why exactly are the obvious RapidStart Services not an option for you?
If you definitely need to use OData, I suggest exposing a publishable OData page via web services. You can then implement the parser in any environment/language you prefer and submit each new record to that page via a RESTful web-service call (which nearly every framework supports), thereby getting the record into your productive environment.
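As a rough illustration of that last step, here is a minimal C# sketch that submits one parsed row to a Dynamics 365 Operations OData entity. The instance URL, entity set name, field names and bearer token are placeholders I made up, not details from the question.

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class ODataRecordSubmitter
{
    static async Task Main()
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", "<access-token>");

        // One record parsed from an Excel row, mapped to the entity's fields.
        var record = @"{ ""CustomerAccount"": ""C-0001"", ""Name"": ""Contoso"" }";

        var response = await client.PostAsync(
            "https://<your-instance>.dynamics.com/data/<EntitySetName>",
            new StringContent(record, Encoding.UTF8, "application/json"));

        Console.WriteLine((int)response.StatusCode);
    }
}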

Creating an Indexer in Azure Search: Schema identifying Edm.Complex types in one subscription but not another

I’m seeing some inconsistent behavior between two different subscriptions when trying to create an Indexer for an Azure Search service through the Azure Portal.
I was able to successfully build the indexers in our test/ppe subscription (Microsoft) using a Cosmos DB collection as a source and the data was matched 1:1 using Edm.ComplexType and Collection(Edm.ComplexType):
[Screenshot: complex types showing up in the dropdown]
After I verified things were working as expected, I moved to our prod subscription (AME.GBL) to do the same; however, this ability to add/edit complex types seems to be missing:
[Screenshot: complex types not showing up in the dropdown]
Is there a reason why this ability to add complex types is available in one of our subscriptions but not the other?
Is there a feature gate in place for the ability to add/edit complex types for an index, and if so, would it be possible to manually override it for a given subscription ID?
Thank you!
Bradley - You are correct that the Complex Type detection feature in the portal's 'Import data' workflow is currently only available via the internal Microsoft portal experience. However, this feature will be deployed publicly worldwide very soon; we are targeting 5/1/2019 for global availability via the portal. If you need to create this index with complex types beforehand, you can use the API to set the index schema with the complex types directly.
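If it helps, here is a sketch of what such a call might look like: an index definition with Edm.ComplexType and Collection(Edm.ComplexType) fields declared directly, sent as a PUT to the indexes endpoint. The service name, index name, field names and API version are illustrative assumptions.

PUT https://<service-name>.search.windows.net/indexes/my-index?api-version=2019-05-06

{
  "name": "my-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "address", "type": "Edm.ComplexType", "fields": [
        { "name": "city", "type": "Edm.String", "searchable": true },
        { "name": "zip", "type": "Edm.String", "filterable": true }
    ]},
    { "name": "phoneNumbers", "type": "Collection(Edm.ComplexType)", "fields": [
        { "name": "number", "type": "Edm.String" }
    ]}
  ]
}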

How to find Uniqueness in Azure Usage Records from Billing APIs

I am building an Azure chargeback solution, and for that I am pulling Azure usage data from the Azure Billing REST APIs for multiple subscriptions and different dates. I need to store this in a custom MS SQL database per the customer's requirements. I get back a variety of usage records from Azure.
Problem: From these usage records, I am not able to find any combination of columns that gives me a unique key identifying a usage record for a particular subscription and a specific date. The only column I see differing is Quantity, but even that can be duplicated. E.g. if there are two VMs of type A1, with no data or applications on them, in the same cloud service, they will have exactly the same usage quantity. I do not get the exact name of the VM, or of any other resource, via the Usage APIs.
One custom solution (ineffective): I could append a counter or unique ID to the usage records, but the next time I fetch the data the order may shuffle, or new data may be introduced, breaking the logic for uniqueness. Any logic I build to check whether data is missing in the DB will be buggy if the order of the returned usage records changes (for a specific subscription and a specific date).
I am sure Microsoft stores this data in some database, yet I can't find a unique ID to identify a usage record among the many records returned by the Billing API. Maybe I am missing something here.
I will appreciate any help or pointers on this.
When you call the Usage API, set the ShowDetails parameter to true: &showDetails=true
MSDN Doc
This populates the instanceData field in the returned JSON with the unique URI of the resource, which includes its name. For example:
Website:
"instanceData": "{\"Microsoft.Resources\":{\"resourceUri\":\"/subscriptions/xxx-xxxx/resourceGroups/mygoup/providers/Microsoft.Web/sites/WebsiteName\",\"...
Virtual Machine:
"instanceData": "{\"Microsoft.Resources\":{\"resourceUri\":\"/subscriptions/xxx-xxxx/resourceGroups/TESTINGBillGroup/providers/Microsoft.Compute/virtualMachines/TestingBillVM\",\...
If ShowDetails is false, all your resources are aggregated on the server side by resource type; all your websites, for example, will show up as one entry.
The resource URI, date range and meter ID together form a unique key, as far as I know.
NOTE: If you are using the legacy API, your VMs will be aggregated under the cloud service hosting them.
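Put together, a minimal C# sketch of that approach might look like the following. The Usage Aggregates URL shape, the property names (instanceData, usageStartTime, usageEndTime, meterId) and the token handling are my assumptions about that era's API, so verify them against the documentation.

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class UsageKeyBuilder
{
    static async Task Main()
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", "<aad-access-token>");

        var url = "https://management.azure.com/subscriptions/<subscription-id>" +
                  "/providers/Microsoft.Commerce/UsageAggregates" +
                  "?api-version=2015-06-01-preview" +
                  "&reportedStartTime=<start>&reportedEndTime=<end>" +
                  "&aggregationGranularity=Daily&showDetails=true";

        var json = JObject.Parse(await client.GetStringAsync(url));
        foreach (var item in json["value"])
        {
            var props = item["properties"];

            // instanceData is itself a JSON string embedded in the payload.
            var instance = JObject.Parse((string)props["instanceData"]);
            var resourceUri = (string)instance["Microsoft.Resources"]["resourceUri"];

            // Candidate unique key: resource URI + usage window + meter ID.
            var uniqueKey = string.Join("|",
                resourceUri,
                (string)props["usageStartTime"],
                (string)props["usageEndTime"],
                (string)props["meterId"]);

            Console.WriteLine(uniqueKey);
        }
    }
}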

Checking securely whether an entry was uploaded/exists in a table, using C# and AWS IAM

I need to check, using C#, whether a value has been successfully uploaded to a table. The table uses a hash and a range key. Currently I issue a QueryRequest, get a QueryResponse/QueryResult back, and check whether the result is null. The problem with this is that the entire table entry (i.e. all fields) is passed back to the program. This is not sufficiently secure.
I have looked at AWS IAM access policies, but I cannot seem to restrict GetItem at field level, only at table level.
Any suggestions for an IAM access policy that only lets users get the hash/range from a table?
I don't think that this is possible via IAM. However, one way to approximate it is to encrypt all fields except for the hash/range.
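A minimal C# sketch of that approximation, with illustrative table/attribute names and key handling: non-key attributes are AES-encrypted before PutItem, so a caller whose IAM role can still GetItem the full item only learns the hash/range unless it also holds the encryption key.

using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

class EncryptedPut
{
    static void Main()
    {
        byte[] key = Convert.FromBase64String("<256-bit-key-base64>");
        var client = new AmazonDynamoDBClient();

        client.PutItem(new PutItemRequest
        {
            TableName = "MyTable",
            Item = new Dictionary<string, AttributeValue>
            {
                ["HashKey"]  = new AttributeValue("user-42"),      // stays readable
                ["RangeKey"] = new AttributeValue("2019-01-01"),   // stays readable
                ["Secret"]   = new AttributeValue(Encrypt("sensitive value", key))
            }
        });
    }

    static string Encrypt(string plaintext, byte[] key)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV();
            using (var enc = aes.CreateEncryptor())
            {
                byte[] data = Encoding.UTF8.GetBytes(plaintext);
                byte[] cipher = enc.TransformFinalBlock(data, 0, data.Length);
                // Prepend the IV so decryption is self-contained.
                return Convert.ToBase64String(aes.IV) + ":" + Convert.ToBase64String(cipher);
            }
        }
    }
}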

Data security in result sets from Elastic Search, Solr or similar

I need to add full-text search capabilities to my existing database. The first instinct, of course, is to turn to something like Solr or Elasticsearch. The blocking point I have reached is how to securely display the results returned by the underlying search engine (let's assume Solr or Elasticsearch for now; any other solution or engine that addresses the point is also appreciated).
The tricky context: my system contains, for example, Personal Profile records that are to be indexed. One of the fields in a personal profile is the manager's feedback. Normally that field is visible only to the employee's direct manager and the higher hierarchy, i.e. a 'manager' from another branch cannot see it. However, I want that field to be searchable via full-text search, but only by people who can actually see it.
Now suppose I query Solr for 'stupid' (the query string) and it returns N documents. When returning them to the end user, I can remove the 'manager's feedback' field because the end user is not the manager of the people concerned; but the mere presence of a document in the result set is already evidence that 'stupid' matched somewhere in it, possibly in the very field the user must not see.
The question is: what is a workable approach to handle this use case? Is it possible to plug a home-grown security filter for outputs into Solr/ES?
Caveats:
Filtering out only fields does not work, because of the scenario described above.
Filtering out complete documents will not work either, because the search engine does not tell you which fields matched, so there is no way to manually filter the result set by field (http://elasticsearch-users.115913.n3.nabble.com/Best-way-to-return-which-field-matched-td2713071.html).
Even if that did work, removing documents from the result set would spoil the facets returned by the engine (e.g. the number of matches by department): I would have to either recalculate the facets manually, or they would no longer match the manually filtered records and would reveal exactly what I do not want to show end users.
In Solr you can create multiValued fields. In your case you can use one to store denormalized values from the organization structure.
In the described scenario you would create a multi-valued field ouId (organization unit ID) and store the employee's ouId together with all parent ouIds; in other words, you save the allowed ouIds into this field.
At search time you then use a filter query (the fq parameter) to filter by the manager's ouId.
Example:
..&fq=ouId:12
where 12 is organization unit id of selected manager.
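For completeness, a hypothetical schema.xml fragment for such a field (the field name mirrors the answer; the type name is an assumption about your schema):

<!-- Every organization unit id allowed to see this document. -->
<field name="ouId" type="string" indexed="true" stored="true" multiValued="true"/>

A full query then combines the user's search terms with the security filter, e.g. q=stupid&fq=ouId:12. Because fq restricts the document set before scoring and faceting, documents outside the manager's subtree never appear in the results and never inflate the facet counts, which also addresses the facet caveat above.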
Maybe this is helpful for you: https://github.com/salyh/elasticsearch-security-plugin It adds document-level security to Elasticsearch. Quoting its description:
"Currently for user based authentication and authorization Kerberos/SPNEGO and NTLM are supported through the 3rd-party library waffle (only on Windows servers). For UNIX servers Kerberos/SPNEGO is supported through Tomcat's built-in SPNEGO valve (works with any Kerberos implementation; for authorization either Active Directory or generic LDAP is supported). PKI/SSL client certificate authentication is also supported (CLIENT-CERT method). SSL/TLS is also supported without client authentication.
You can use this plugin also without Kerberos/NTLM/PKI, but then only host-based authentication is available.
As of now two security modules are implemented:
Actionpathfilter: Restricts actions against Elasticsearch on a coarse-grained level, like who is allowed to READ, WRITE or even ADMIN REST API calls
Document level security (dls): Restricts actions on document level, like who is allowed to query for which fields within a document"
