I am working with Rasa NLU to extract intents and entities from Arabic text. I have my own entities (such as places, organizations, and people), and I want to add them without any intent.
I just want to add them as entities along with their types.
How can I do this?
Are you using .md or .json as your training data format? I cannot think of a way to pass an entity in Markdown format without defining an intent. In JSON format, however, you may be able to pass a text without an intent by leaving the value for the dictionary key "intent" empty. See the example below.
The documentation https://rasa.com/docs/nlu/dataformat/ says that intent is an optional field, so it should work.
{
"text": "show me chinese restaurants",
"intent": "",
"entities": [
{
"start": 8,
"end": 15,
"value": "chinese",
"entity": "cuisine"
}
]
}
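I would try setting "intent" to an empty string, setting it to null, or omitting the key altogether. Note that a key with nothing after the colon is not valid JSON, and that JSON has null rather than None.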
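If the training file is generated by a script, a minimal sketch along these lines could build entity-only examples and compute the character offsets automatically. The helper name make_example is hypothetical, the "rasa_nlu_data" / "common_examples" wrapper follows the legacy Rasa NLU JSON layout, and whether your Rasa version accepts an example with no "intent" key is an assumption you should verify.

```python
import json


def make_example(text, entity_value, entity_type):
    """Build one training example that carries an entity but no intent.

    Hypothetical helper: the offsets are derived from the first occurrence
    of entity_value inside text (str.index raises ValueError if absent).
    """
    start = text.index(entity_value)
    return {
        "text": text,
        # Assumption: the "intent" key is simply omitted, since the
        # documentation cited above describes it as optional.
        "entities": [
            {
                "start": start,
                "end": start + len(entity_value),
                "value": entity_value,
                "entity": entity_type,
            }
        ],
    }


example = make_example("show me chinese restaurants", "chinese", "cuisine")
training_data = {"rasa_nlu_data": {"common_examples": [example]}}

# ensure_ascii=False keeps non-ASCII text (e.g. Arabic) readable in the file.
print(json.dumps(training_data, indent=2, ensure_ascii=False))
```

Computing "start" and "end" from the text avoids hand-counted offsets, which are easy to get wrong in right-to-left scripts.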
What I've gathered is that new posts are published by POSTing a JSON-LD Activity Streams object of type Note to an actor's outbox.
{"@context": "https://www.w3.org/ns/activitystreams",
"type": "Note",
"to": ["https://chatty.example/ben/"],
"attributedTo": "https://social.example/alyssa/",
"content": "Say, did you finish reading that book I lent you?"}
The server will then wrap it in an activity of type Create.
{"@context": "https://www.w3.org/ns/activitystreams",
"type": "Create",
"id": "https://social.example/alyssa/posts/a29a6843-9feb-4c74-a7f7-081b9c9201d3",
"to": ["https://chatty.example/ben/"],
"actor": "https://social.example/alyssa/",
"object": {"type": "Note",
"id": "https://social.example/alyssa/posts/49e2d03d-b53a-4c4c-a95c-94a6abf45a19",
"attributedTo": "https://social.example/alyssa/",
"to": ["https://chatty.example/ben/"],
"content": "Say, did you finish reading that book I lent you?"}}
I fail to see the usefulness of this, as the wrapping activity doesn't seem to add any useful data to the wrapped note. Even worse, it seems like it might introduce a fair bit of redundancy to the responses (in this basic example from the official page, actor and attributedTo, as well as the two to fields, have exactly the same purpose). Is this perhaps done just for consistency, as there are a few other activity types that are applied to notes, and for newly created posts having just a plain object (or a collection of plain objects) as a response would not fit this way of doing things?
Also, why are other activity types (e.g., Like) able to simply reference notes by id, while Create activities enclose that data directly? Is that required or is there a specific reason for it?
{"@context": "https://www.w3.org/ns/activitystreams",
"type": "Like",
"id": "https://social.example/alyssa/posts/5312e10e-5110-42e5-a09b-934882b3ecec",
"to": ["https://chatty.example/ben/"],
"actor": "https://social.example/alyssa/",
"object": "https://chatty.example/ben/p/51086"}
ActivityStreams is a protocol for synchronising data between different databases and software systems.
Some activities carry the full object (as with Notes) because those objects can be created or updated, and an activity stream reader must know the changes in order to display the correct data to its users.
Some operations, such as Like, have a named cancel operation, such as Unlike, so they don't need the Create/Update container. Because they operate on existing data, they only need the unique ID of the resource concerned.
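To make the difference concrete, here is a rough sketch of what a server might do in each case. The function names wrap_in_create and make_like, the base_url parameter, and the ID scheme are all hypothetical; this is an illustration of the shapes shown in the question, not a reference implementation.

```python
import uuid

CONTEXT = "https://www.w3.org/ns/activitystreams"


def wrap_in_create(note, base_url):
    """Wrap a bare Note in a Create activity (hypothetical helper).

    The Note did not exist before, so the activity has to carry its full
    content; addressing and authorship are copied up to the wrapper.
    """
    obj = {key: value for key, value in note.items() if key != "@context"}
    obj["id"] = f"{base_url}posts/{uuid.uuid4()}"
    return {
        "@context": CONTEXT,
        "type": "Create",
        "id": f"{base_url}posts/{uuid.uuid4()}",
        "to": list(note.get("to", [])),
        "actor": note["attributedTo"],
        "object": obj,
    }


def make_like(actor, object_id, to, base_url):
    """Build a Like: the target already exists, so its ID is enough."""
    return {
        "@context": CONTEXT,
        "type": "Like",
        "id": f"{base_url}posts/{uuid.uuid4()}",
        "to": list(to),
        "actor": actor,
        "object": object_id,
    }


note = {
    "@context": CONTEXT,
    "type": "Note",
    "to": ["https://chatty.example/ben/"],
    "attributedTo": "https://social.example/alyssa/",
    "content": "Say, did you finish reading that book I lent you?",
}
activity = wrap_in_create(note, "https://social.example/alyssa/")
like = make_like(
    "https://social.example/alyssa/",
    "https://chatty.example/ben/p/51086",
    ["https://chatty.example/ben/"],
    "https://social.example/alyssa/",
)
```

The duplicated actor/attributedTo and to fields fall out of the copy step in wrap_in_create, which is where the redundancy noted in the question comes from.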
I noticed something strange when testing my interaction model with the Alexa skills kit.
I defined a custom slot type, like so:
CAR_MAKERS Mercedes | BMW | Volkswagen
And my intent scheme was something like:
{
"intents": [
{
"intent": "CountCarsIntent",
"slots": [
{
"name": "CarMaker",
"type": "CAR_MAKERS"
},
...
with sample utterances such as:
CountCarsIntent Add {Amount} cars to {CarMaker}
Now, when testing in the developer console, I noticed that I can write stuff like:
"Add three cars to Ford"
And it will actually parse this correctly! Even though "Ford" was never mentioned in the interaction model! The lambda request is:
"request": {
"type": "IntentRequest",
...
"intent": {
"name": "CountCarsIntent",
"slots": {
"CarMaker": {
"name": "ExpenseCategory",
"value": "whatever"
},
...
This really surprises me, because the documentation on custom slot types is pretty clear about the fact that the slot can only take the values which are listed in the interaction model.
Now, it seems that values are also parsed dynamically! Is this a new feature, or am I missing something?
Actually that is normal (and good, IMO). Alexa uses the word list that you provide as a guide, not a definitive list.
If it didn't have this flexibility then there would be no way to know if users were using words that you weren't expecting. This way you can learn and improve your list and handling.
Alexa treats the provided slot values as 'samples'. Hence slot values that are not mentioned in the interaction model will also get mapped.
When you create a custom slot type, a key concept to understand is
that this is training data for Alexa’s NLP (natural language
processing). The values you provide are NOT a strict enum or array
that limit what the user can say. This has two implications
1) words and phrases not in your slot values will be passed to you,
2) your code needs to perform any validation you require if what’s
said is unknown.
Since you know the acceptable values for that slot, always validate the slot value in your code. That way, when you get something other than a valid car manufacturer, or one that you don't support, you can respond politely with something like
"Sorry, I didn't understand. Can you repeat that?"
or
"Sorry, we don't have that in our list. Can you please select something from [give some samples from your list]?"
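A minimal sketch of such a check, written as plain Python over the request dictionary a Lambda handler receives. The function names are hypothetical, the supported list mirrors the CAR_MAKERS values from the question, and the response wording is only an example; it does not use the ASK SDK.

```python
# Values taken from the CAR_MAKERS slot type in the question.
SUPPORTED_CAR_MAKERS = ["Mercedes", "BMW", "Volkswagen"]


def get_slot_value(event, slot_name):
    """Return the raw slot value from an IntentRequest, or None if absent."""
    slots = event.get("request", {}).get("intent", {}).get("slots", {})
    return slots.get(slot_name, {}).get("value")


def check_car_maker(event):
    """Return (is_valid, text): the canonical maker name, or a reprompt."""
    value = get_slot_value(event, "CarMaker")
    if not value:
        return False, "Sorry, I didn't understand. Can you repeat that?"
    # Compare case-insensitively, since recognised speech may differ in case.
    by_lower = {maker.lower(): maker for maker in SUPPORTED_CAR_MAKERS}
    canonical = by_lower.get(value.strip().lower())
    if canonical is None:
        samples = ", ".join(SUPPORTED_CAR_MAKERS)
        return False, (
            f"Sorry, we don't have {value} in our list. "
            f"Please choose one of: {samples}."
        )
    return True, canonical


event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "CountCarsIntent",
            "slots": {"CarMaker": {"name": "CarMaker", "value": "Ford"}},
        },
    }
}
print(check_car_maker(event))
```

Logging the rejected values is also worthwhile, since they show which makers users actually ask for.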
As a simple exercise, I wanted to take some test data from a little app of mine, which produces a user record in JSON, and turn it into JSON-LD. Testing in the JSON-LD.org playground helps a bit, but I don't know if I'm doing it right.
The original is:
[
{
"Id": 1,
"Username": "Dave",
"Colour": "green"
}
]
So I have a person, who has a username, an ID and an associated colour.
What I've got so far is:
{
"@context": {
"name": "http://schema.org/name",
"Colour": {
"@id": "http://dbpedia.org/ontology/Colour",
"@type": "http://schema.org/Text",
"@language": "en"
}
},
"@type": "http://schema.org/Person",
"@Id": "http://example.com/player/1",
"sameAs" : "https://www.facebook.com/DaveAlger",
"Id": 1,
"name": "David Alger",
"Username": "Dave",
"Colour": "green"
}
So I'm declaring that it's a @type of Person, and giving it a URI as its @id.
I'm also using the "sameAs" idea, which I saw in a blog post once, but I'm unclear whether it is supported out of the box.
Then I've tried to create a @context. Here I've added a name and given it a reference. I've tried to create something for "colour" too. I'm not sure whether pointing to a DBpedia reference about "colour" and specifying a @type and @language is good or not.
I suppose the final thing is "username", but that feels so deeply internal to a site that it doesn't make sense to "Link" it at all.
I'm aware this data is perhaps not even worth linking, this is very much a learning exercise for me.
I don’t think that http://dbpedia.org/ontology/Colour should be used like that. It’s a class, not a property. The property that has http://dbpedia.org/ontology/Colour as range is http://dbpedia.org/ontology/colour. (That said, I’m not sure if you really intend that the person should have a colour, instead of something related to this person.)
If you want to provide the language of the colour strings, you should not specify the datatype; @language is sufficient (if a value is typed, it can’t have a language anymore; by using @language, it’s implied that the value is a string).
You are using @Id for specifying the node’s URI, but it must be @id.
The properties sameAs, Id and Username are not defined in your @context.
If you intend to use Schema.org’s sameAs property, you could define it similarly to what you did with name, but you should specify that the value is a URI:
"sameAs": {
"@id": "http://schema.org/sameAs",
"@type": "@id"
},
For Username, you could use FOAF’s nick property, or maybe Schema.org’s alternateName property.
No idea which property you could use for Id (depends on your case if this is useful for others at all, or if this is only relevant for your internal system).
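Putting these points together, the corrected document might look roughly like the sketch below, built as a Python dictionary so it can be checked and serialised. Mapping Username to FOAF's nick is just one of the options mentioned above, the internal Id is left out because no property was chosen for it, and undefined_terms is a hypothetical helper.

```python
import json

document = {
    "@context": {
        "name": "http://schema.org/name",
        # The value of sameAs is a URI, so it is typed as @id.
        "sameAs": {"@id": "http://schema.org/sameAs", "@type": "@id"},
        # Assumption: FOAF's nick is used for the site-internal username.
        "Username": "http://xmlns.com/foaf/0.1/nick",
        # Lower-case "colour" is the property; @language replaces @type.
        "Colour": {"@id": "http://dbpedia.org/ontology/colour", "@language": "en"},
    },
    "@id": "http://example.com/player/1",
    "@type": "http://schema.org/Person",
    "sameAs": "https://www.facebook.com/DaveAlger",
    "name": "David Alger",
    "Username": "Dave",
    "Colour": "green",
}


def undefined_terms(doc):
    """List top-level keys that are neither keywords nor defined in @context."""
    context = doc.get("@context", {})
    return sorted(
        key for key in doc if not key.startswith("@") and key not in context
    )


print(undefined_terms(document))
print(json.dumps(document, indent=2))
```

The printed JSON can be pasted into the JSON-LD playground to see how each term expands.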
I am using the @RepositoryRestResource annotation to expose Spring Data JPA repositories as a RESTful service. It works great. However, I am struggling with referencing a specific entity within an Angular app.
As is known, Spring Data REST doesn't serialise the @Id of the entity, but the HAL response contains links to entities (_links.self, _embedded.projects[]._links.self), as in the following example:
{
"_links": {
"self": {
"href": "http://localhost:8080/api/projects{?page,size,sort}",
"templated": true
}
},
"_embedded": {
"projects": [
{
"name": "Sample Project",
"description": "lorem ipsum",
"_links": {
"self": {
"href": "http://localhost:8080/api/projects/1f888ada-2c90-48bc-abbe-762d27842124"
}
}
},
...
My Angular application needs to put some kind of reference to a specific project entity in the URL, like http://localhost/angular-app/#/projects/{id}. I don't think using the href is a good idea. The UUID (@Id) seems better, but it is not explicitly listed as a field. This is where I got stuck. After reading tons of articles I came up with two ideas, but I don't consider either of them perfect:
Idea 1:
Explicitly enable serialisation of the @Id field and just use it to reference the object.
Caveat: exposes database-specific internals to the front end.
Idea 2:
Keep the @Id field internal and create an extra "business identifier" field that can be used to identify a specific object.
Caveat: an extra field in the table (wasted space).
I would appreciate your comments on this. Maybe I am just unnecessarily hesitant to implement either of these ideas, or maybe there is a better one.
To give you another option, there is a special wrapper for Angular+Spring Data Rest that could probably help you out:
https://github.com/guylabs/angular-spring-data-rest
I'm pretty new to Solr, I'm trying to add a multi-value field with boost values defined for each value, all defined via JSON. In other words, I'd like this to work:
[{ "id": "ID1000",
"tag": [
{ "boost": 1, "value": "A test value" },
{ "boost": 2, "value": "A boosted value" } ]
}]
I know how to do that in XML (multiple <field name = 'tag' boost = '...'>), but the JSON code above doesn't work; the server says "Error parsing JSON field value. Unexpected OBJECT_START". Is this a Solr limitation or a bug?
PS: I fixed the originally-missing ']' and that's not the problem.
EDIT: It seems the way to go should be payloads (http://wiki.apache.org/solr/Payloads), but I couldn't get them to work on Solr (I followed this: http://sujitpal.blogspot.co.uk/2011/01/payloads-with-solr.html). Leaving the question open to see if someone can help further.
Found the following sentence in the Solr Relevancy FAQ, in the Query Elevation Component section:
An Index-time boost on a value of a multiValued field applies to all values for that field.
I do not think adding an individual boost to each value in the multivalued field is going to work. I know that the XML format will allow it, but I would guess that Solr may only apply the boost from the last value added to the field.
So based on that I would change the JSON to the following and see if that works.
[
{
"id": "ID1000",
"tag": {
"boost": 2,
"value": [ "A test value", "A boosted value"]
}
}
]
The JSON seems to be invalid: it is missing a closing ]. The corrected version is:
[
{
"id": "ID1000",
"tag": [
{
"boost": 1,
"value": "A test value"
},
{
"boost": 2,
"value": "A boosted value"
}
]
}
]
You hit an edge case. You can have the boosts on single values and you can have an array of values. But not one inside another (from my reading of Solr 4.1 source code)
That might be something to create as an enhancement request.
If you are generating that JSON by hand, you can try:
"tag": { "boost": 1, "value": "A test value" },
"tag": { "boost": 2, "value": "A boosted value" }
I believe Solr will merge the values then. But if you are generating it via a framework, it will most likely disallow or override multiple object property names (tag here).
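As an illustration of the framework caveat, Python's standard json module accepts repeated keys when parsing but keeps only the last one in the resulting dictionary, so a round trip through it would silently drop the first tag. Whether Solr itself merges the repeated keys is the answerer's belief and is not verified here.

```python
import json

hand_written = (
    '{"id": "ID1000",'
    ' "tag": {"boost": 1, "value": "A test value"},'
    ' "tag": {"boost": 2, "value": "A boosted value"}}'
)

# A plain dict keeps only the last value seen for a repeated key.
parsed = json.loads(hand_written)
print(parsed["tag"])

# object_pairs_hook=list preserves every key/value pair, in order, so the
# duplicates stay visible (nested objects also become lists of pairs).
pairs = json.loads(hand_written, object_pairs_hook=list)
print([key for key, _ in pairs])
```

This is why the duplicate-key form usually has to be written by hand or with string templates rather than produced by a JSON serialiser.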
The error has nothing to do with boosting.
I get the same error with a very simple json doc.
No luck solving it.
See "Solr errors when trying to parse a collection: Error parsing JSON field value. Unexpected OBJECT_START".
I hit the same error message. Actually, the error message was misleading. The real underlying error was that two of the fields required by schema.xml in the Solr configuration were missing from the JSON payload.
An error message along the lines of "required fields are missing in the document" would have been more helpful here. You might want to check whether some required fields are missing from your JSON payload.
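One way to rule this out before posting is a small client-side check, sketched below. The REQUIRED_FIELDS set is purely hypothetical and should be replaced by the fields marked required="true" in your own schema.xml; the function name is also made up for this example.

```python
# Hypothetical: substitute the fields your schema.xml marks as required.
REQUIRED_FIELDS = {"id", "title"}


def missing_required(docs, required=REQUIRED_FIELDS):
    """Map each document index to the sorted list of required fields it lacks.

    A field counts as missing when it is absent, None, or an empty
    string or list. Documents with nothing missing are not reported.
    """
    problems = {}
    for index, doc in enumerate(docs):
        missing = sorted(
            field for field in required if doc.get(field) in (None, "", [])
        )
        if missing:
            problems[index] = missing
    return problems


docs = [
    {
        "id": "ID1000",
        "tag": {"boost": 2, "value": ["A test value", "A boosted value"]},
    }
]
print(missing_required(docs))
```

Running this on the payload first separates schema problems from genuine JSON parsing problems.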