How can I get Watson to recognize two different dates upon user input? - ibm-watson

If a user enters a sentence containing two dates (a check-in date followed by a check-out date), Watson for some reason uses the first date for both the $checkin and $checkout variables, even though it detects the second date.
You can refer to the "dialog node" screenshot to see how the nodes are set up.
How can I get Watson to recognize that the first date is the check-in date and the second one is the check-out date? Is there a way to tell Watson that, once the first date has been used, a second detected date should fill the next slot?
I've found something about the #sys-date range_link entity, but the documentation is not detailed.

This is easy to do, but comes with issues you need to be aware of.
Slots allow you to define variables as they are read from the user's input. For example, the slot configuration in the screenshot will generate the response shown after it.
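In outline, using the question's variable names (the prompts are only illustrative), the two slots would be:

    Check for @sys-date -> Save it as $checkin  -> If not present, ask: "What date are you checking in?"
    Check for @sys-date -> Save it as $checkout -> If not present, ask: "What date are you checking out?"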
The issue is that you are assuming people will give the information in the same order that you need it. If they don't, this will fail.
You can mitigate this by shaping how the user may ask the question. For example:
"Please let me know where you are leaving and going to"
The person is more likely to respond with the exit date first.
BETA
You can enable the beta #sys-date in the options, but it is likely to change and doesn't fully work as you'd expect, so I wouldn't recommend relying on it until it is final.
You first need to check for range_link. This will tell you if it detected that two dates are connected to each other.
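As a sketch, reusing the same filter syntax as the expressions below, a node condition that tests whether both linked roles were detected could look like this (the exact beta property names may differ, so verify against what the entity actually contains):

    entities['sys-date'].filter("d", "d.role.type == 'date_from'").size() > 0 && entities['sys-date'].filter("d", "d.role.type == 'date_to'").size() > 0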
Then you can do the following:
From Date: <? entities['sys-date'].filter("d", "d.role.type == 'date_from'")[0]?.value ?>
To Date: <? entities['sys-date'].filter("d", "d.role.type == 'date_to'")[0]?.value ?>
This finds the exact record that has the role of date_from and returns its value; likewise for date_to.
You end up with something like the conversation shown in the screenshot.

Related

How to send an Email notification by day end using Rules with all the nodes published that day?

I am trying to achieve an email notification. The condition is that it should go out at the end of the day with the list of content published that day.
I have tried a couple of things using Rules, but got stuck along the way.
Any help?
I tried using rules, and I created a rule like so:
Events:
After updating existing content of type (content type name)
Cron maintenance tasks are performed
Condition: Data to compare: [node:field-img-status], Data value: Approve
When I try to add a second condition to check whether the node was published within the last 24 hours, I am unable to achieve it. When I add strtotime("-1 day"), I get an error like:
Wrong date format. Specify the date in the format 2017-05-10 08:17:18.
I tried date('Y-m-d h:i:s',strtotime("-1 day")) but I did not succeed.
Now I am trying one more method to achieve it, using Views Rules, as suggested in this answer to the question 'How to create a Drupal rule to check (on cron) a date field and if passed set field "status" to "ended"?'.
Below is a blueprint of how I'd get this to work ...
Step 1: Create a single email for each node that was published
Create a view (using Views) of all the nodes that were published in the last 24 hours. Make sure to include a column in that view for each piece of data about the node that you want included in your email later on.
Use Rules to create a rule with a Rules Action that consists of a "Rules Loop", in which the "list items" are the list of nodes that you want included in your email later on. To create this Rules Loop, use the Views Rules module combined with a Views display type of "Views Rules" for the view that you created. Refer to my answer to "How to pass arguments to a view from Rules?" for far more details on how to use the Views Rules module.
For each list item in the Rules Loop of the previous step, you have access to all the data for each column in the view you created. Using these data, you can add an additional Rules Action (within the same Rules Loop) to send an appropriate email about the node being processed.
Step 2: Group all emails in a single email
Obviously, the previous step creates a single email for each node that was published in the last 24 hours. If you only have a few such nodes, that may not be a real issue to worry about. But if you have dozens (or more) of such nodes, you might want to consolidate all those emails into a single email that contains the complete list of nodes in its body.
A possible way to implement such consolidation is similar to what is shown in the Rules example included in my answer to "How to concatenate all token values of a list in a single field within a Rules loop?". In your case, you could make it work like so:
Before the start of your loop, add a new Rules variable that will later be used as part of the email body. Say you name the variable nodes_list_var_for_email_body.
Within your loop, for each iteration, prepend or append the value of each "list item" to that variable nodes_list_var_for_email_body.
Move the Rules Action that sends the email outside your loop, after the loop has completed, and fine-tune the configuration of your (new) "send an email" Rules Action. When doing so, you'll be able to select the token for nodes_list_var_for_email_body to include anywhere in your email body.
Step 3: Schedule the daily execution of your rule
Use the Rules Once per Day module to schedule the daily execution of your rule. Refer to my answer to "How to limit the execution of a rule for sending an email to only run once in a day?" for far more details about this module.
Voilà, that's it ...
This is how I would achieve this:
Make a view which lists all nodes created today.
Create an endpoint in a custom module (check out: https://api.drupal.org/api/drupal/modules%21system%21system.api.php/function/hook_menu/7.x).
The endpoint would call this view and grab the node list (e.g. with views_get_view_result: https://api.drupal.org/api/views/views.module/function/views_get_view_result/7.x-3.x), loop through the list, compose the email, and send it.
Then I would set a cron job to call that endpoint at the end of every day.
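A minimal sketch of such a module for Drupal 7 (the module name mymodule, the path, the view machine name published_today, the field alias node_title, and the recipient address are all assumptions to adapt):

    // Implements hook_menu(): expose an endpoint that a cron job can hit.
    function mymodule_menu() {
      $items['mymodule/daily-digest'] = array(
        'page callback' => 'mymodule_send_daily_digest',
        'access callback' => TRUE, // In practice, protect this with a key or permission check.
        'type' => MENU_CALLBACK,
      );
      return $items;
    }

    // Page callback: load the view result, build the body, send a single email.
    function mymodule_send_daily_digest() {
      $rows = views_get_view_result('published_today');
      $lines = array();
      foreach ($rows as $row) {
        $lines[] = $row->node_title; // Property name depends on the fields in your view.
      }
      $params = array('body' => implode("\n", $lines));
      drupal_mail('mymodule', 'daily_digest', 'editor@example.com', language_default(), $params);
      return 'Digest sent.';
    }

    // Implements hook_mail(): set the subject and body for the 'daily_digest' key.
    function mymodule_mail($key, &$message, $params) {
      if ($key == 'daily_digest') {
        $message['subject'] = t('Content published today');
        $message['body'][] = $params['body'];
      }
    }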

Extract Array of Values for Watson Dialog Variables

In the DevPost Watson Developer Challenge for Conversational Applications post, I saw that Watson is (maybe) able to analyze the following phrase: "I want to visit Tokyo, Sydney, Manchester, and Reykjavik during a trip that takes 30 days".
Is there a better way to extract that array of locations without having to predefine a maximum number of location variables (i.e. set location1 - 5) and manually specify various grammar items like $ (Locations)={location1} * (Locations)={location2} * (Locations)={location3} * (Locations)={location4}, as in the Pizza example dialog? I would like to follow up with a comment such as "That's a lot" if there are more than 4 locations, or "Sure" if fewer.
You could try something like AlchemyAPI or Relationship Extraction to identify all of the locations, and then simply add them to the user profile in Dialog. But today, the best way to do this within a broader conversation is to do it the same way the pizza sample does, as you outlined above.
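If you do go the entity-extraction route, a rough Python sketch of what such a call looked like (the REST endpoint and parameters follow AlchemyAPI's documentation of the time; the key is a placeholder and the service has since been retired):

    import requests

    # Ask AlchemyAPI for named entities in the utterance, then keep the cities.
    resp = requests.get(
        'http://access.alchemyapi.com/calls/text/TextGetRankedNamedEntities',
        params={
            'apikey': 'YOUR_API_KEY',
            'text': 'I want to visit Tokyo, Sydney, Manchester, and Reykjavik '
                    'during a trip that takes 30 days',
            'outputMode': 'json',
        },
    )
    entities = resp.json().get('entities', [])
    locations = [e['text'] for e in entities if e['type'] == 'City']
    print("That's a lot" if len(locations) > 4 else "Sure")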

Magento 1.9.2 order ID duplicate entry if previously canceled

I have a very annoying problem with the Magento 1.9.2 upgrade (previously 1.7.0.2).
From time to time I get this error message in the log file:
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'OR1010702' for key 'UNQ_SALES_FLAT_ORDER_INCREMENT_ID', query was: INSERT INTO `sales_flat_order`...
It happens with every payment method, but only if the customer has ordered before and their last order is Canceled or On Hold. It works fine when it's their first order or if the previous one is Processing, Complete, or Pending.
For example, a new customer goes through the checkout process and is redirected to PayPal or another payment solution's page. If he uses his credit card correctly, no problem; but if he cancels his payment (the order is created with status Canceled), he won't be able to process it again. He will stay on the Magento checkout page and get a JavaScript prompt error (something like: Payment issue, try later).
Everything was fine with Magento 1.7, but since the 1.9.2 upgrade I don't know what to do to get the order process working in every case.
If someone has an idea... Thanks very much in advance!
I finally found the answer here: http://blog.adin.pro/2014-12-02/magento-sqlstate23000-integrity-constraint-violation-1062-duplicate-entry-000000254-for-key-unq_sales_flat_order_increment_id/
I had to override /app/code/local/Adin/Sales/Model/Resource/Sales/Quote.php in order to remove the (int) cast from this line: $bind = array(':increment_id' => (int)$orderIncrementId);
Although this is an old question, I have found a solution which may help someone else with this problem.
This problem happens when the conditions below are met:
Customer has canceled a previous order
Customer is trying again with a new order (with or without cookies cleared)
Your setup has a custom invoice number which is not an int
Why is the third point the important one? Because Magento already has a solution covering the first two conditions. Check the function reserveOrderId() in app\code\core\Mage\Sales\Model\Quote.php and the function isOrderIncrementIdUsed() in app\code\core\Mage\Sales\Model\Resource\Quote.php.
To cover a custom invoice number which contains characters, you need to change the line
$bind = array(':increment_id' => (int)$orderIncrementId);
to
$bind = array(':increment_id' => $orderIncrementId);
in the function isOrderIncrementIdUsed() in app\code\core\Mage\Sales\Model\Resource\Quote.php
note: the change is the removal of the (int) cast
I hope you understand the problem now, but to be more precise: Magento tries to parse your invoice number 'OR1010702' into an int, which it can't ((int)'OR1010702' evaluates to 0 in PHP), so this method returns false and hence no new increment ID is generated. Once you stop casting to int, Magento is able to see that the old increment ID is already used, assign a new order id, and it works fine.
note: you should not touch any file inside the core directory; rather, copy the file with its directory structure into app\code\local and make the changes there
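For reference, the overridden method in your app\code\local copy would then look roughly like this (a sketch based on the 1.9 core; verify it against the exact version you run):

    public function isOrderIncrementIdUsed($orderIncrementId)
    {
        $adapter = $this->_getReadAdapter();
        // No (int) cast, so string increment IDs like 'OR1010702' match correctly.
        $bind    = array(':increment_id' => $orderIncrementId);
        $select  = $adapter->select();
        $select->from($this->getTable('sales/order'), 'entity_id')
            ->where('increment_id = :increment_id');
        $entity_id = $adapter->fetchOne($select, $bind);

        return $entity_id > 0;
    }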

How to make datastore keys mapreduce-friendly(-er)?

Edit: see my answer below. The problem was in our code. MR works fine; it may have a status-reporting problem, but at least the input readers work fine.
I have run an experiment several times now and I am sure that mapreduce (or DatastoreInputReader) has odd behavior. I suspect this might have something to do with key ranges and splitting them, but that is just my guess.
Anyway, here's the setup we have:
we have an NDB model called "AdGroup"; when creating new entities of this model, we use the same id returned from AdWords (it's an integer), but we use it as a string: AdGroup(id=str(adgroupId))
we have 1,163,871 of these entities in our datastore (that's what the "Datastore Admin" page tells us - I know it's not an entirely accurate number, but we don't create/delete adgroups very often, so we can say for sure that the number is 1.1 million or more)
mapreduce is started (from another pipeline) like this:
yield mapreduce_pipeline.MapreducePipeline(
    job_name='AdGroup-process',
    mapper_spec='process.adgroup_mapper',
    reducer_spec='process.adgroup_reducer',
    input_reader_spec='mapreduce.input_readers.DatastoreInputReader',
    mapper_params={
        'entity_kind': 'model.AdGroup',
        'shard_count': 120,
        'processing_rate': 500,
        'batch_size': 20,
    },
)
So, I tried to run this mapreduce several times today without changing anything in the code and without making changes to the datastore. Every time I ran it, the mapper-calls counter had a different value, ranging from 450,000 to 550,000.
Correct me if I'm wrong, but considering that I use the very basic DatastoreInputReader - mapper-calls should be equal to the number of entities. So it should be 1.1 million or more.
Note: the reason why I noticed this issue in the first place is because our marketing guys started complaining that "it's been 4 days after we added new adgroups and they still don't show up in your app!".
Right now, I can think of only one workaround - write all keys of all adgroups into a blobstore file (one per line) and then use BlobstoreLineInputReader. The writing to blob part would have to be written in a way that does not utilize DatastoreInputReader, of course. Should I go with this for now, or can you suggest something better?
Note: I have also tried using DatastoreKeyInputReader with the same code - the results were similar - mapper-calls were between 450,000 and 550,000.
So, finally, the questions. Is it important how you generate ids for your entities? Is it better to use int ids instead of str ids? In general, what can I do to make it easier for mapreduce to find and map all of my entities?
PS: I'm still in the process of experimenting with this, I might add more details later.
After further investigation we found that the error was actually in our code, so mapreduce works as expected (the mapper is called for every single datastore entity).
Our code was calling some Google services functions that were sometimes failing (the wonderful cryptic ApplicationError messages). Due to these failures, MR tasks were being retried. However, we had set a limit on taskqueue retries. MR neither detected nor reported this in any way - it was still showing "success" on the status page for all shards. That is why we thought everything was fine with our code and that something was wrong with the input reader.
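For anyone hitting the same trap, here is a rough sketch of how the mapper could guard against this (call_google_service stands in for our real function):

    import logging
    from google.appengine.runtime.apiproxy_errors import ApplicationError

    def adgroup_mapper(entity):
        try:
            result = call_google_service(entity)  # stand-in for the flaky external call
        except ApplicationError:
            # Log and skip this entity instead of raising; an unhandled exception
            # makes the task retry until the taskqueue limit silently drops it.
            logging.exception('Service call failed for AdGroup %s', entity.key.id())
            return
        yield (str(entity.key.id()), result)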

Searching through descriptions

There's a movie whose name I can't remember. It's about a carnival or amusement park with a horror house and a bunch of teens who are murdered one by one by something in a clown's mask. I saw this movie, and its sequel, about 20 years ago, but can't remember it exactly. (And I've also forgotten its title.) As a result, I started wondering about how to solve something technical.
Assume that I have a database with the story plot and other data of each and every movie published (something like the IMDb), and an edit field where a user can just enter a description in plain text. The system would then start analysing this text to find the movie(s) that match the description.
For example (a different movie), I enter this in the edit field: "Some movie about an Egyptian king who attacks a bunch of Indians on horseback, but he's badly wounded and his horse dies as he loses this battle."
The system should then report the movie "Alexander" from 2004 as the answer, but possibly a few more. (Even allowing a few errors in the description.)
To create such a system, where a description gets analysed to find a matching record by searching through descriptions, what techniques would I need for something as complex as that? Not that I want to build something like that right now; it's more out of curiosity, in case I ever want to pick up an interesting new project.
(I wanted to award extra points to those who recognise the movie I mentioned at the beginning. But one Google attempt later, I found it myself!)
Btw, it's not the search engine itself that interests me, but analysing the description to get to something a search engine will understand! With the example movie, it was human logic that helped me find the title. (And it's annoying that this movie isn't for sale in the Netherlands.) Human logic will always be a requirement, but the interesting part is analysing the user input, which comes in the form of a story or description, with possible errors.
You should check out document classification.
A few document classification techniques:
Naive Bayes classifier
tf–idf
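For instance, a small sketch of the tf–idf route in Python (using scikit-learn; the toy summaries are made-up stand-ins for a real plot database):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Toy stand-ins for a real plot-summary database.
    summaries = [
        "Alexander the Great wages a long campaign through Persia and India "
        "and is badly wounded in battle.",
        "Teens at a carnival are stalked one by one by a killer in a clown mask.",
    ]
    query = ["Egyptian king attacks Indians on horseback, badly wounded, his horse dies"]

    vectorizer = TfidfVectorizer(stop_words='english')
    doc_matrix = vectorizer.fit_transform(summaries)  # one tf-idf vector per summary
    query_vec = vectorizer.transform(query)           # project the query into the same space

    # Rank the summaries by cosine similarity to the query description.
    scores = cosine_similarity(query_vec, doc_matrix)[0]
    best = scores.argmax()
    print(summaries[best], scores[best])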
From what I can tell from your own comments, Google is the technique to be used. ;-) But, honestly, I think more or less any search engine would do.
Edit: heh, you removed your comment, but I do remember you mentioned Google as the one deserving extra points.
Edit+: well, you mentioned Google again, but I don't want to remove my first edit. ;-)
Pure speculation: would something trivial work, such as taking every word of more than 4 letters in the description ("Egyptian", "Indian", "horse", "battle", etc.) and fuzzy-matching it against a database of such summaries? Perhaps with some normalisation, e.g. king == leader == emperor?
Hmmm ... young man, girlfriend, swimming pool, mother, wedding - does that get us to The Graduate? Well, I guess with a small amount of specifics ("Robinson") it might.
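A toy Python sketch of that keyword idea (the synonym table and cutoffs are arbitrary illustrations):

    import difflib

    SYNONYMS = {'king': 'leader', 'emperor': 'leader'}  # toy normalisation table

    def keywords(text):
        # Keep words longer than 4 letters, normalised via the synonym table.
        words = (w.strip('.,!?"').lower() for w in text.split())
        return {SYNONYMS.get(w, w) for w in words if len(w) > 4}

    def score(description, summary):
        # Count description keywords that fuzzy-match a keyword in the summary.
        summary_kw = list(keywords(summary))
        return sum(
            1 for kw in keywords(description)
            if difflib.get_close_matches(kw, summary_kw, n=1, cutoff=0.8)
        )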
You can do lots of interesting stuff with the imdb keyword search:
http://akas.imdb.com/keyword/carnival/clown/murder/
You can specify multiple keywords; it suggests movies and more keywords which appear in a similar context to your given keywords.
The data contained in IMDb is publicly available for non-commercial use and can be downloaded as text files. You could build a database from it.
