Using search.in with all - azure-cognitive-search

Using search.in with all - azure-cognitive-search

Follwing statement find all profiles that has Facebook or twitter and this works:
$filter=SocialAccounts/any(x: search.in(x, 'Facebook,Twitter'))
But I cant find any samples for finding all that has both Facebook and twitter. I tried:
$filter=SocialAccounts/all(x: search.in(x, 'Facebook,Twitter'))
But this is not valid query.

Azure Search does not support the type of ‘all’ filter that you’re looking for. Using search.in with ‘all’ would be equivalent to using OR, but Azure Search can only handle AND in the body of an ‘all’ lambda (which is equivalent to OR in the body of an ‘any’ lambda).
You might try a workaround like this:
$filter=tags/any(t: t eq 'Facebook') and tags/any(t: t eq 'Twitter')
However, this isn't actually equivalent to using all with search.in. The query as expressed using all is matching documents where every social account is strictly either Facebook or Twitter. If any other social account is present, the document won’t match. The workaround doesn’t have this property. A document must have at least Facebook and Twitter in order to match, but not exclusively those. This is certainly a valid scenario; it just isn't the same as using all with search.in, which was the original question.
No matter how you try to rewrite the query, you won’t be able to express an equivalent to the all query. This is a limitation due to the way Azure Search stores collections of strings and other primitive types in the inverted index.
Please vote on user voice to help prioritize:
https://feedback.azure.com/forums/263029-azure-search/suggestions/37166749-efficient-way-to-express-a-true-all
A possible workaround is to use the new Complex Types feature, which does allow more expressive filters inside lambda expressions. For example, if you model tags as objects with a single value property instead of as a collection of strings, you should be able to execute a filter like this:
$filter=tags/all(t: search.in(t/value, 'Facebook,Twitter'))
In the REST API, you'd define tags like this:
{
"name": "myindex",
"fields": [
...
{
"name": "tags",
"type": "Collection(Edm.ComplexType)",
"fields": [
{ "name": "value", "type": "Edm.String", "filterable": true }
]
}
]
}
Note that this feature is in preview at the time of this writing, but will be generally available (and publicly documented) soon.

Related

Azure AD show group name in id token instead of group id

My id token has group (as role) ids only
"roles": [
"729b24b5-c527-440e-9ef6-81a04415e7ba",
"8d4f9343-10c3-43a2-9efe-34cfd740d020",
"81715416-9be4-43d7-807a-d5ccc9420cf7",
"1b5e6d7b-0ee0-4212-a5b9-cd5c3ca07a4a"
],
Even set to sAMAccountName
Any idea to return the group names instead?

If you are expecting group names in the claims of ID/Access/SAML token, unfortunately currently that is not supported due to some limitations. You would only have the object ids (guid) of the groups in the claim for AAD managed groups.
If you absolutely need group names for your purpose, consider a separate Graph API call to list group memberships of a user.
Also feel free to upvote on the feature request of group names in claims here.
Please refer to this similar question

I know this question is rather old already, but I experienced the same issue in summer 2021 still when trying to create a new .NET API project. Probably people stumbling across it in 2021 could make use of my solution:
I decided to create a NuGet package that resolves the group names using Microsoft Graph for applications using .NETs Microsoft.Identity.Web package (mostly ASP.NET Core applications).
Feel free to take a look into https://github.com/peterwurzinger/AuthOida if that's applicable to your use case.

This is relatively old, but there's answer to that.
You can translate your groups to names by adjusting your application manifest.
Go to your application manifest via Applications and SSO options, you should see there "Manifest option" it should return a JSON which you can modify.
The important bit is in the optionalClaims, you need to add to your groups.additionalProperties section cloud_displayname option like this:
...
"optionalClaims": {
"idToken": [
{
"name": "preferred_username",
"source": null,
"essential": false,
"additionalProperties": []
},
{
"name": "groups",
"source": null,
"essential": false,
"additionalProperties": [
"sam_account_name",
"emit_as_roles",
"cloud_displayname"
]
}
],
...

How to ensure all items in a collection match a filter in Azure Cognitive Search

I have Azure Cognitive Search running, and my index is working as expected.
We are trying to add a security filter into the search, based on the current users permissions.
The users permissions are coming to me in as IEnumerable, but I am currently selecting just a string[] and passing that into my filter, then do a string.join, which looks like this.
permission1, permission2, permission3, permission4
In our SQL database, we have a view that is where the index is getting it's data from. There is a column on the view called RequiredPermissions, it is a Collection(Edm.string) in the index, and the data looks like this.
[ 'permission1', 'permission2', 'permission3' ]
The requirement is that for a record to return in the results, a user's permissions must contain all of the RequiredPermissions for that record.
So if we have a user with the following permissions
permission1, permission3, permission5
And we have the following records
Id, SearchText, Type, Permissions
1, abc, User, [ 'permission1', 'permission2' ]
2, abc.pdf, Document, [ 'permission1' ]
3, abc, Thing, [ 'permission1', 'permission3' ]
4, abc, Stuff, [ 'permission3', 'permission4' ]
If the user searched for 'abc' and these four results would come back, I need to $filter results that do not have the proper permissions. So I would expect the following results
Id, Returned, Reason
1, no, the user does not have permission2
2, yes, the user has permission1 and nothing else is needed
3, yes, the user has both permission1 and permission3
4, no, the user does not have permission4
If I run the following filter, then I get back anything that has permission1 or permission3, which is not acceptable, since the user should not see items Id 1 or 4
RequiredPermissions/any(role: search.in(role, 'permission1, permission3', ','))
If I run this filter, then I get nothing back, everything is rejected, because no records have permission5, and the user has it
RequiredPermissions/all(role: not search.in(role, 'permission1, permission3', ','))
If I try to run the search using 'all' and without the 'not' I get the following error
RequiredPermissions/all(role: search.in(role, 'permission1, permission3', ','))
Invalid expression: Invalid lambda expression. Found a test for equality or inequality where the opposite was expected in a lambda expression that iterates over a field of type Collection(Edm.String). For 'any', please use expressions of the form 'x eq y' or 'search.in(...)'. For 'all', please use expressions of the form 'x ne y', 'not (x eq y)', or 'not search.in(...)'.\r\nParameter name: $filter
So it seems that I cannot use the 'not' with 'any', and I must use the 'not' with 'all'
What I wish for is a way to say that a user has all the permissions in their list that is in the RequiredPermissions column.
I am currently just working in Postman using the RestApi to solve this, but I will eventually move this into .Net.

Your scenario can't be implemented with Collection(Edm.String) due to the limitations on how all and any work on such collections (documented here).
Fortunately, there is an alternative. You can model permissions as a collection of complex types, which allows you to use all the way that you need to implement your permissions model. Here is a JSON example of how the field would be defined:
{
"name": "test",
"fields": [
{ "name": "Id", "type": "Edm.String", "key": true },
{ "name": "RequiredPermissions", "type": "Collection(Edm.ComplexType)", "fields": [{ "name": "Name", "type": "Edm.String" }] }
]
}
Here is a JSON example of what a document would look like with its permissions defined:
{ "#search.action": "upload", "Id": "1", "RequiredPermissions": [{"Name": "permission1"}, {"Name": "permission2"}] }
Here is how you could construct a filter that has the desired effect:
RequiredPermissions/all(perm: search.in(perm/Name, 'permission1,permission3,permission5'))
While this works, you are strongly advised to test the performance of this solution with a realistic set of data. Under the hood, all is executed as a negated any, and negated queries can sometimes perform poorly with the type of inverted indexes used by a search engine.
Also, please be aware that there is currently a limit on the number of elements in all complex collections across a document. This limit is currently 3000. So if RequiredPermissions were the only complex collection in your index, this means you could have at most 3000 permissions defined per document.

HATEOAS and forms driven by the API

I'm trying to apply HATEOAS to the existing application and I'm having trouble with modeling a form inputs that would be driven by the API response.
The app is allowing to search & book connections between two places. First endpoint allows for searching the connections GET /connections?from={lat,lon}&to={lat,lon}&departure={dateTime} and returns following payload (response body).
[
{
"id": "aaa",
"carrier": "Fast Bus",
"price": 3.20,
"departure": "2019-04-05T12:30"
},
{
"id": "bbb",
"carrier": "Airport Bus",
"price": 4.60,
"departure": "2019-04-05T13:30"
},
{
"id": "ccc",
"carrier": "Slow bus",
"price": 1.60,
"departure": "2019-04-05T11:30"
}
]
In order to make an order for one of connections, the client needs to make a POST /orders request with one of following payloads (request body):
email required
{
"connectionId": "aaa",
"email": "passenger#example.org"
}
email & flight number required (carrier handles only aiprort connections)
{
"connectionId": "bbb",
"email": "passenger#example.org",
"flightNumber": "EA1234"
}
phone number required
{
"connectionId": "ccc",
"phoneNumber": "+44 111 222 333"
}
The payload is different, because different connections may be handled by different carriers and each of them may require some different set of information to provide. I would like to inform the API client, what fields are required when creating an order. The question I have is how do I do this with HATEOAS?
I checked different specs and this is what I could tell from reading the specs:
HAL & HAL-FORMS There are "_templates" but, there is no URI in the template itself. It’s presumed to operate on the self link, which in my case would be /connections... not /orders.
JSON-LD I couldn't find anything about forms or templates support.
JSON-API I couldn't find anything about forms or templates support.
Collection+JSON There is at most one "template" per document, therefore it's presumed that all elements of the collection have the same fields which is not the case in my app.
Siren Looks like the "actions" would fit my use case, but the project seems dead and there are no supporting libraries for many major languages.
CPHL The project seems dead, very little documentation and no libraries.
Ion There is nice support for forms, but I couldn't find any supporting libraries. Looks like it's just a spec for now.
Is such a common problem as having forms driven by the API still unsolved with spec and tooling?

In your example, it appears that Connections are resources. It's not completely clear if Orders are truly resources. I'm guessing probably yes, but to have an Order you need a Client and Connection. So, to create an Order you will need to expose a collection, likely from the Client or Connection, possibly both.
I think the disconnect is from thinking along the lines of "now that we've got a list of available connections, the client can select one and create an Order." That's perfectly valid, but it's remote procedure call (RPC) thinking, not REST. Neither is objectively better than the other, except in the context of a particular set of project requirements, and generally they shouldn't be mixed together.
With an RPC mindset, a create order method is defined (e.g. using OpenAPI) and any clients are expected to use some out-of-band information to determine the correct form required (i.e. by reading the OpenAPI spec).
With a REST/HATEOAS mindset, the correct approach would be to expose a Orders collection from Connection. Each Connection in the collection has a self link and a Orders collection (link or object, as defined by app requirements). Each item of Order has a self link, and that is where the affordances are specified. An Order is a known type (even with REST/HATEOAS the client and service have to at least agree on a shared vocabulary) that the client presumably knows how to define. That vocabulary can be defined using any mechanism that works -- json-ld, XSD, etc.
HATEOAS requires that the result contains everything the client needs to update the state. There can be no out-of-band information (other than the shared vocabulary). So, to solve your issue, you either need to expose a collection of Orders from Connection or you need to allow an Order to be created by posting to Connection. If the latter seems like a bit of a hack, it probably is.
For example, in HAL-Forms, I would do something like:
{
"connections": [{
"id": "aaa",
"carrier": "Fast Bus",
"price": 3.20,
"departure": "2019-04-05T12:30"
"_links": {
"self": { ... }, // link to this connection
"orders": {} // link to collection of orders for this connection
}
},
, ...],
"_links": {
"self": { ... } // link to the collection
},
"_templates": { ... } // post/put/patch/delete connection
}
Clients would follow the links to orders and from there would get the _templates collection that contains the instructions for managing the Order resources. The Order POST would likely require a connection identifier and client information. The HAL-Forms Spec defines a regex property that can be used to specify the type of data to supply for any particular form element. Since you have reached the order by navigating through a specific connection, you would be able to specify in your _templates for that order exactly which fields are required. e.g. /orders?connectionType=aaa would return a different set of required properties than /orders?connectionType=bbb but both use the same self link of /orders?connectionType={type} and you'd validate it on POST/PUT/PATCH.
I should note that the Spring-HATEOAS goes beyond the HAL-Forms spec and allows for multiple _links and _templates. See this GitHub issue.
It may look like HATEOAS/REST requires quite a bit more work than a simple OpenAPI/RPC API and it does. But what you are giving up in simplicity, you are gaining in flexibility and resilience, assuming well-designed clients. Which approach is correct depends on a lot of factors, most of them not technical (team skills, expected consumers, how much control you have over clients, maintenance, etc.).

Search for multiple id:s in cloudant

I am using node-red to communicate with cloudant and for each time my flow runs I might have different amount of id:s coming in msg.payload. Later I want to use these id:s to display all the relevant objects. Is it possible to search for multiple id:s in some way? Or do you have any other solution? Can't find anything about this online atm

It looks like Node-RED supports querying by _id, a search index, or all documents. When you use _id there does not seem to be a way to specify more than one ID. You can use a search index, however, to query for multiple IDs.
Create a search index in Cloudant similar to the following:
{
"_id": "_design/allDocSearch",
"views": {},
"language": "javascript",
"indexes": {
"byId": {
"analyzer": "standard",
"index": "function (doc) {\n index(\"id\", doc._id);\n}"
}
}
}
This corresponds to the following when using the Cloudant dashboard:
design doc = allDocSearch
index name = byId
index function =
function (doc) {
index("name", doc.name);
}
To search for multiple IDs your query would look something like this:
id:"1" OR id:"2"
In Node-Red set up your Cloudant node to point to the appropriate database, specify a "Search by" of search index, and configure your design document and index name (in this case it would be allDocSearch/byId).
You can test with a simple inject node with a payload similar to the search query above: id:"1" OR id:"2"

My custom slot type is taking on unexpected values

I noticed something strange when testing my interaction model with the Alexa skills kit.
I defined a custom slot type, like so:
CAR_MAKERS Mercedes | BMW | Volkswagen
And my intent scheme was something like:
{
"intents": [
{
"intent": "CountCarsIntent",
"slots": [
{
"name": "CarMaker",
"type": "CAR_MAKERS"
},
...
with sample utterances such as:
CountCarsIntent Add {Amount} cars to {CarMaker}
Now, when testing in the developer console, I noticed that I can write stuff like:
"Add three cars to Ford"
And it will actually parse this correctly! Even though "Ford" was never mentioned in the interaction model! The lambda request is:
"request": {
"type": "IntentRequest",
...
"intent": {
"name": "CountCarsIntent",
"slots": {
"CarMaker": {
"name": "ExpenseCategory",
"value": "whatever"
},
...
This really surprises me, because the documentation on custom slot types is pretty clear about the fact that the slot can only take the values which are listed in the interaction model.
Now, it seems that values are also parsed dynamically! Is this a new feature, or am I missing something?

Actually that is normal (and good, IMO). Alexa uses the word list that you provide as a guide, not a definitive list.
If it didn't have this flexibility then there would be no way to know if users were using words that you weren't expecting. This way you can learn and improve your list and handling.

Alexa treat the provided slot values as 'Samples'. Hence slot values which are not mentioned in interaction model will also get mapped.
When you create a custom slot type, a key concept to understand is
that this is training data for Alexa’s NLP (natural language
processing). The values you provide are NOT a strict enum or array
that limit what the user can say. This has two implications
1) words and phrases not in your slot values will be passed to you,
2) your code needs to perform any validation you require if what’s
said is unknown.
Since you know the acceptable values for that slot, always perform a slot-value validation on your code. In this way when you get something other than a valid car manufacturer or something which you don't support, you can always politely respond back like
"Sorry I didn't understand, can you repeat"
or
"Sorry we dont have in our list. can you please
select something from [give some samples from your list]"
More info here

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight