Creating an Indexer in Azure Search: schema identifying Edm.ComplexType in one subscription but not another

I’m seeing some inconsistent behavior between two different subscriptions when trying to create an Indexer for an Azure Search service through the Azure Portal.
I was able to successfully build the indexers in our test/PPE subscription (Microsoft) using a Cosmos DB collection as the source, and the data was matched 1:1 using Edm.ComplexType and Collection(Edm.ComplexType):
[Screenshot: complex types showing up in the dropdown]
After I verified things were working as expected, I moved to our prod subscription (AME.GBL) to do the same; however, this ability to add/edit complex types seems to be missing:
[Screenshot: complex types not showing up in the dropdown]
Is there a reason why this ability to add complex types is available in one of our subscriptions but not the other?
Is there a feature gate in place for this ability to add/edit complex types for an index and, if there is, would it be possible to manually override it for a given subscription ID?
Thank you!

Bradley - You are correct that in this case the complex type detection feature in the portal's 'Import Data' workflow is currently only available via the internal Microsoft portal experience. However, this feature will be deployed publicly worldwide very soon; we are targeting 5/1/2019 for global availability via the portal. If you need to create this index with complex types beforehand, you can use the API to set the index schema with the complex types directly.

Related

Azure AD provisioning requires two runs to succeed with custom app

We've created an application using SCIM 2 SDK from PingIdentity for provisioning with Azure AD. Custom mapping is set up and working.
However, when the user is CREATED, all of the fields are included in the import, but only a few of them are included in the provisioning step and sent to our application. Provisioning needs to run a second time on that user (an UPDATE) for all the fields to be included. Among other things, this means that first and last name are not split; it only sends the displayName (which ends up as the first name on our end).
For some users in normal provisioning, it can take days between the create and update runs so we're missing data for a long time.
Does anyone know how we can test for what's causing this and solve it, so all the fields are included in the initial CREATE run for a user?
Here are the attribute mapping settings: https://imgur.com/ypfAAmD
And an example log of when the user is created with only basic fields: https://imgur.com/iOXACJh
vs. when the user is updated with all the other fields: https://imgur.com/UqDNyCv
I'm a product manager at Microsoft that works on the provisioning service and our SCIM client.
The behavior you're seeing occurs when attributes that are not part of the SCIM core schema are included with "short" names. Attributes not defined in the SCIM core schema (RFC 7643) should use full URN syntax; something to the effect of urn:ietf:params:scim:schemas:extension:appName:2.0:User:attributeName is commonly used by other implementations. The shaky behavior you're seeing, where the AAD provisioning service fails to send these attribute values via a POST but later includes them in a PATCH, comes down to different code paths in the AAD provisioning service: the PATCH code happens to handle this differently than the POST code. This is purely by chance, however, and isn't an intentional design choice. At some future point I'm hoping we'll make this more consistent and disallow incorrectly structured attribute names entirely.
If you adjust your attribute names to align with the guidance in the SCIM spec's schema RFC and provide the attributes with fully defined URNs, you should see consistent behavior that works on both POST and PATCH.

Microsoft Graph API - Azure AD Connect - extensionAttribute

When I try querying extensionAttribute with Graph API (Hybrid Exchange), I cannot get any value.
E.g., if I try https://graph.microsoft.com/v1.0/users/<userid or upn>?$select=extensionAttribute2, I cannot see the value even though I know it's there.
Do you know how to get it properly (or a workaround)?
Thank you
Are these values synced to Azure Active Directory? All properties for the AAD user can be found in the Microsoft Graph API docs here: https://learn.microsoft.com/en-us/graph/api/resources/user?view=graph-rest-1.0
It sounds like these are being synced from an AAD Connect environment, so it's most likely you are trying to get the onPremisesExtensionAttributes.
Per the description:
Contains extensionAttributes 1-15 for the user. Note that the individual extension attributes are neither selectable nor filterable. For an onPremisesSyncEnabled user, this set of properties is mastered on-premises and is read-only. For a cloud-only user (where onPremisesSyncEnabled is false), these properties may be set during creation or update.
I suggest taking a more thorough look through the documentation on this. In addition, since you mentioned Exchange, note that the custom attributes from Exchange are the same as these extension attributes. For more info see: https://github.com/microsoftgraph/microsoft-graph-docs/issues/5950
There is also a separate sort of "extension attribute" that I figured I would include in the answer as well: the Microsoft Graph has its own extensibility story, and the docs on it can be found here: https://learn.microsoft.com/en-us/graph/extensibility-overview
If you see information on these extensions, know that this is separate from the on-prem extensions.

Profile Management with Identity Server

So from what I have read on IdentityServer, I should be storing details about the user, such as first name and last name, inside claims. How would a web application then be able to access the claim information? Since the User Info endpoint requires a valid access token representing the user, I suppose I would need to make an API that returned the profile information of other users? Is this the right way to do it? (Use case: a web page needs to display contact details that are stored in the claims of another user.)
Also, what would be the way to store and retrieve multi-language profile information in the claims? For example, a user can have a name/title in multiple languages. I'm thinking of a [LanguageCode]_[ClaimType] (fr_first_name) naming convention and either adding all languages to just the profile IdentityResource or creating separate resources per language.
Your best bet is to set up a project using the IdentityServer4 QuickstartUI example and review that code to better understand how it all works. As of version 4, IdentityServer is focused only on the sign-in/sign-out process and the various flows around authentication. They provide a basic EF-driven persistence model, and they also support the ASP.NET Core Identity persistence model (also EF-driven), but neither of those is meant to be production-ready code.
Basically, persistence of user details is considered your responsibility. That being said, the cookies used for ASP.NET Core authentication greatly restrict how much data you can/should store as claims. The best model is to keep the "real" identity provider (IDP) claims as claims and not add new claims to that list: copy what you need into a separate user-data table you manage completely, and use the unique claims identifier (almost always the subject id) as the key to your user data. This also makes it easier to migrate a user to another IDP (for example, you'll know the user details for "Bob", and he can re-associate his user data away from his Facebook OIDC auth to his Google auth).
Basic persistence isn't too difficult (it's only 12 or 13 SQL statements), but it's a lot more than will fit in a Stack Overflow answer. I blogged about a non-EF approach here -- also not production-ready code (for example, it has ad-hoc SQL to keep things simple), but it should get you started.

Azure search: use a single index on multiple data sources

I have multiple Azure tables across multiple Azure storage accounts that have the exact same format. Is it possible to configure several data sources in Azure Search to use a single index, so that a search on this index would return results aggregated from all data sources (Azure tables)?
So far, each time I configure a new data source and the corresponding indexer, I must create a new index (with a new index name). Attempting to reuse an existing index name results in an error stating "Another index with this name already exists".
Thank you for any help or pointer you might provide.
Yes, it's possible, but we don't currently support it in the Azure Portal.
When you go through the "import data" flow in the portal, it'll create a data source, indexer and index for you.
If you want more sources for that index, you need to create new data sources and indexers, with the new indexers pointing at the existing index. Since the portal can't do that part yet, you can do it using the .NET SDK (if you're using .NET), directly using the REST API from your app, or using any tool that can make HTTP requests, such as PowerShell, curl or Fiddler.
The documentation that describes the indexer-related REST APIs is here:
https://msdn.microsoft.com/en-us/library/azure/dn946891.aspx

Data security in result sets from Elasticsearch, Solr, or similar

I need to add full-text search capabilities to my existing database. Of course the first turn is to something like Solr or Elasticsearch. And the blocking point I've got to is: how to securely display results returned from the underlying search engine (let's think about Solr or Elasticsearch for now; however, any other solution or engine that hits the point is also appreciated).
The tricky context is that my system has, for example, Personal Profile records that are to be indexed. One of the fields in a personal profile is the manager's feedback. Normally that field is visible only to the employee's direct manager and higher in the hierarchy, i.e. a 'manager' from another branch will not be able to see it. However, I want that field to be searchable via full-text search, but only by people who can actually see it.
Now I query Solr for 'stupid' (that is the query string) and it returns me N documents. When returning them to the end user I'll remove the 'manager's feedback' field, because the end user is not the manager of the given people; but the mere presence of a document in the result set is already evidence of who the 'stupid' guys are…
The question is: what is a workable approach to handle that use case? Is it possible to plug into Solr/ES with a home-grown security filter for outputs?
Caveats:
- Filtering out only fields does not work, because of the scenario mentioned above.
- Filtering out complete documents will not work either, because:
  - the search engine does not tell you which fields matched, so there is no way to manually filter the result set by field: http://elasticsearch-users.115913.n3.nabble.com/Best-way-to-return-which-field-matched-td2713071.html
  - even if that did work, removing documents from the result set would spoil the facets returned by the engine (e.g. number of matches by department); I'd have to either recalculate facets manually, or they would not match the manually filtered records and would reveal exactly what I do not want to show end users.
In Solr you can create multiValued fields. In your case you can use one to store denormalized values of the organization structure.
In the described scenario you would create a multiValued field ouId (organization unit id) and store the employee's ouId plus all parent ouIds. In other words, you save the allowed ouIds into this field.
In the search scenario you then use a filter query (the fq parameter), filtering by the manager's ouId.
Example:
..&fq=ouId:12
where 12 is the organization unit id of the selected manager.
Maybe this is helpful for you: https://github.com/salyh/elasticsearch-security-plugin. It adds document-level security to Elasticsearch.
"Currently for user based authentication and authorization Kerberos/SPNEGO and NTLM are supported through 3rd party library waffle (only on windows servers). For UNIX servers Kerberos/SPNEGO is supported through tomcat build in SPNEGO Valve (Works with any Kerberos implementation. For authorization either Active Directory and generic LDAP is supported). PKI/SSL client certificate authentication is also supported (CLIENT-CERT method). SSL/TLS is also supported without client authentication.
You can use this plugin also without Kerberos/NTLM/PKI but then only host based authentication is available.
As of now two security modules are implemented:
Actionpathfilter: Restrict actions against Elasticsearch on a coarse-grained level, like who is allowed to READ, WRITE or even ADMIN rest api calls
Document level security (dls): Restrict actions on document level like who is allowed to query for which fields within a document"
