Calling functions from plain text descriptions - artificial-intelligence

I have an app which has common maths functions behind the scenes:
add(x, y)
multiply(x, y)
square(x)
The interface is a simple google- style text field. I want the user to be able to enter a plain text description -
'2*3'
'2 times 3'
'multiply 2 and 3'
'take the product of 2 and 3'
and get a answer mathematical answer
Question is, how should I map the text descriptions to the functions ? I'm guessing I need to
tokenise the text
identify key tokens (function names, arguments)
try and map token combinations to function signatures
However I'm guessing this is already a 'solved problem' in the machine learning space. Should I be using Natural Language Processing ? Plain text search ? Something else ?
All ideas gratefully received, plus implementation suggestions [I'm using Python/AppEngine; I know about NLTK and Whoosh]
[PS I understand Google does this already, at least for the first two queries on the list. I'm guessing they also go it statistically, having a very large amount of search data. I don't have a large amount of data available, so will need an alternative approach].

After you tokenise the text, you need parsing to get a syntax tree of your natural language phrase. Once you have this, you can map the parse tree to a mathematical expression, and then evaluate the expression. I do not think this is a solved problem. I would start with several templates, say the first two, and experiment. The larger the domain of possible descriptions, the harder the task is.

I would recommend some tool for provide grammar/patterns on text like SimpleParse for python http://www.ibm.com/developerworks/linux/library/l-simple.html. As java programmer I would prefer GATE or graph-expression.

Related

Ruby: Hash, Arrays and Objects for storage information

I am learning Ruby, reading few books, tutorials, foruns and so one... so, I am brand new to this.
I am trying to develop a stock system so I can learn doing.
My questions are the following:
I created the following to store transactions: (just few parts of the code)
transactions.push type: "BUY", date: Date.strptime(date.to_s, '%d/%m/%Y'), quantity: quantity, price: price.to_money(:BRL), fees: fees.to_money(:BRL)
And one colleague here suggested to create a Transaction class to store this.
So, for the next storage information that I had, I did:
#dividends_from_stock << DividendsFromStock.new(row["Approved"], row["Value"], row["Type"], row["Last Day With"], row["Payment Day"])
Now, FIRST question: which way is better? Hash in Array or Object in Array? And why?
This #dividends_from_stock is returned by the method 'dividends'.
I want to find all the dividends that were paid above a specific date:
puts ciel3.dividends.find_all {|dividend| Date.parse(dividend.last_day_with) > Date.parse('12/05/2014')}
I get the following:
#<DividendsFromStock:0x2785e60>
#<DividendsFromStock:0x2785410>
#<DividendsFromStock:0x2784a68>
#<DividendsFromStock:0x27840c0>
#<DividendsFromStock:0x1ec91f8>
#<DividendsFromStock:0x2797ce0>
#<DividendsFromStock:0x2797338>
#<DividendsFromStock:0x2796990>
Ok with this I am able to spot (I think) all the objects that has date higher than the 12/05/2014. But (SECOND question) how can I get the information regarding the 'value' (or other information) stored inside the objects?
Generally it is always better to define classes. Classes have names. They will help you understand what is going on when your program gets big. You can always see the class of each variable like this: var.class. If you use hashes everywhere, you will be confused because these calls will always return Hash. But if you define classes for things, you will see your class names.
Define methods in your classes that return the information you need. If you define a method called to_s, Ruby will call it behind the scenes on the object when you print it or use it in an interpolation (puts "Some #{var} here").
You probably want a first-class model of some kind to represent the concept of a trade/transaction and a list of transactions that serves as a ledger.
I'd advise steering closer to a database for this instead of manipulating toy objects in memory. Sequel can be a pretty simple ORM if used minimally, but ActiveRecord is often a lot more beginner friendly and has fewer sharp edges.
Using naked hashes or arrays is good for prototyping and seeing if something works in principle. Beyond that it's important to give things proper classes so you can relate them properly and start to refine how these things fit together.
I'd even start with TransactionHistory being a class derived from Array where you get all that functionality for free, then can go and add on custom things as necessary.
For example, you have a pretty gnarly interface to DividendsFromStock which could be cleaned up by having that format of row be accepted to the initialize function as-is.
Don't forget to write a to_s or inspect method for any custom classes you want to be able to print or have a look at. These are usually super simple to write and come in very handy when debugging.
thank you!
I will answer my question, based on the information provided by tadman and Ilya Vassilevsky (and also B. Seven).
1- It is better to create a class, and the objects. It will help me organize my code, and debug. Localize who is who and doing what. Also seems better to use with DB.
2- I am a little bit shamed with my question after figure out the solution. It is far simpler than I was thinking. Just needed two steps:
willpay = ciel3.dividends.find_all {|dividend| Date.parse(dividend.last_day_with) > Date.parse('10/09/2015')}
willpay.each do |dividend|
puts "#{ciel3.code} has approved #{dividend.type} on #{dividend.approved} and will pay by #{dividend.payment_day} the value of #{dividend.value.format} per share, for those that had the asset on #{dividend.last_day_with}"
puts
end

What is the comprehension expression in AngularJS?

I have a few questions buzzing in my head about the comprehension expression:
What is the data structure which it defines?
Was it adapted from some other language?
Where is it used in AngularJS? Does this API exist for select elements only?
From the docs:
ngOptions - comprehension_expression - in one of the following forms:
for array data sources:
label for value in array
select as label for value in array
label group by group for value in array
select as label group by group for value in array track by trackexpr
for object data sources:
label for (key , value) in object
select as label for (key , value) in object
label group by group for (key, value) in object
select as label group by group for (key, value) in object
Comprehension expression is just a string formatted in a special way to be recognized by select directive.
There's no magic behind it, just several formats of it because there are quite a few ways to process and represent your collection (data structure of your model, item/item property selection as scope's model, some other options regarding labels, grouping etc.). When you consider all these options it is not that strange for allowing complex expressions.
Let's say you have such code:
<select
ng-model="color"
ng-options="c.name group by c.shade for c in colors"></select>
In order to ditch the comprehension expression and use attributes, you would write something like this:
<select
ng-model="color"
ng-data-type="object"
ng-data="colors"
ng-select="c"
ng-label="c.name"
ng-group-by="c.shade"></select>
The attribute approach might get ugly once you expand your API. Besides, with comprehension expression it's much easier to use filters.
While in one way it's true to say that a "comprehension_expression" is "just a string" as package says, on the other hand, source code is just a string. Programming languages are just strings.
A SQL SELECT statement
– which could very well be part of the inspiration for the syntax and features of the "comprehension_expression" (but it's not obvious that it is, because it's not mentioned in the docs– perhaps if I dug into some developer conversations I might be able to find out) –
is just a string.
Sure they're just strings, but they have structure, which relates to the problem they are trying to solve. And the question is, is the structure adequately described? Is its pattern, how it relates to the problem at hand, made clear? Is its relationship to other structures that other people have designed apparent?
While the "comprehension_expression" is just a string, on the other hand, its complexity almost comes to it being a sort of sub-language in its own right.
But the way it is portrayed in the docs (https://docs.angularjs.org/api/ng/directive/ngOptions) does reflect the attitude that it is "just a string with some formatting". It is tucked away in the documentation for ng-options as the type of the ng-options directive. To some extent, it is not an entity in its own right, it is a second-class citizen.
The way the different formats are listed can give one a strange feeling, like it's sort of ad-hoc, without any pattern relating the different possible formats (although there is a pattern if you look closely). Without a formal grammar with a regular structure, it makes you wonder if they really covered all the possible options. Compared to, say, the MySQL documentation for the SQL SELECT statement: https://dev.mysql.com/doc/refman/5.7/en/select.html
Obviously such formal syntax can be quite intimidating and is maybe not necessary for the case of the "comprehension_expression", on the other hand it can be reassuring to know it is precisely defined.
I suspect the asker of the question was somewhat unsettled by how casually the "comprehension_expression" was mentioned in the docs; it can seem like a sort of floating, ghost-like entity, just mentioned briefly but not given its own page etc.
It might be worth it having its own page, being treated as an entity in its own right, because then that invites discussion as to the design of this "sub-language". How did it come about? What are the reasons for the different features of the "sub-language"? Which features, and thus syntaxes, conflict with each other? Why can this feature be used together with that feature but not another feature? Are there inspirations from e.g. SQL, in the design of this "sub-language"?
Otherwise it seems to be an invention out of the blue, unrelated to other DSLs of its kind.
In a blog post on ng-options,
https://www.undefinednull.com/2014/08/11/a-brief-walk-through-of-the-ng-options-in-angularjs/
Mr. Shidhin links to a little discussion
https://groups.google.com/forum/#!topic/angular/4EDe8xIbjLU
Where just this issue is discussed. "Matt Hughes" also expresses the opinion that "Seems like a lot of additional complexity for one directive."
Perhaps this is not that big a deal. I just wanted to put it out there though.

WPF Binding.StringFormat: C vs. '{}{0:C}'

I am trying to globalize a simple WPF app. In several questions and/or answers on SO I see one of the following two settings on the binding:
StringFormat=C
StringFormat='{}{0:C}'
What is the difference between these? Are there certain conditions where you would use one over the other?
My understanding is that there is no difference, one is just shorthand and the other explicit. The only condition I can think of where being explicit is beneficial is when you want more control over the format. For example:
StringFormat=Total: {0:C}
Other than that, I'd say keep it simple; XAML is already verbose and shorthand syntax is welcome.
Maybe read up string formatting?
http://msdn.microsoft.com/en-us/library/dd465121.aspx
You can use {0:C} in a format string where you are filling in a value:
decimal value = 123.456;
Console.WriteLine("Your account balance is {0:C2}.", value);
while, you use the C as a plain format:
Console.WriteLine(("Your account balance is " + decimal.Parse("24.3200").ToString("C"));
they are functionally equivalent as far as the output. It's just a different way to format the data based on the context of how your using it.
http://msdn.microsoft.com/en-us/library/dwhawy9k.aspx#CFormatString

'8 Ball' Program

I've tried looking on google but i guess I can just not get the right search phrases to find what I want. If you are familiar with the afterNET IRC server, there is a command '.8' which is an 8 ball. It answers more than just yes/no questions tho. It gives you a variety of answers based on certain words you use in your question, like when, where, color, etc
I'd like to make something like this but have no idea where to start. I've recently studied DFA (Deterministic Finite Automata), is that where I should start? I understand I don't want to be scripting out every possible combination of words people use, but it would be nice to have a system that feels sorta realistic (like the 8ball program on the IRC server), and is expandable for more 'words' whenever I want.
Thanks for any help/links!
You may be giving most 8ball implementations more credit than they deserve. I think the point is that the questions are for yes/no answers, so the provided answers only have to cover a fairly predictable set of possibilities.
Most 8ball scripts that I am aware of (example) will just use an array and a random number to grab an answer.
Magic 8ball bots are very popular on irc as they are very easy to implement - simply respond to text with a given marker (in this case ".8") and respond with a random answer.
I have never heard of a magic 8ball using a deterministic approach, Cleverbot style. Actually, trying that, I'm not even sure how deterministic that is as most of the responses are also totally random and unrelated to what I was saying.
// our answers array
String[] answers = [ "yes", "no", "for sure", "unlikely", "most certainly", "definitely not" ];
public String ask8Ball() {
// rand returns a float between 0>=res>1, the (int) cast rounds down
int index = (int)(java.lang.Math.random() * 7);
return answers[index];
}

MD5 code kata and BDD

I was thinking to implement MD5 as a code kata and wanted to use BDD to drive the design (I am a BDD newb).
However, the only test I can think of starting with is to pass in an empty string, and the simplest thing that will work is embedding the hash in my program and returning that.
The logical extension of this is that I end up embedding the hash in my solution for every test and switching on the input to decide what to return. Which of course will not result in a working MD5 program.
One of my difficulties is that there should only be one public function:
public static string MD5(input byte[])
And I don't see how to test the internals.
Is my approach completely flawed or is MD5 unsuitable for BDD?
I believe you chose a pretty hard exercise for a BDD code-kata. The thing about code-kata, or what I've understood about it so far, is that you somehow have to see the problem in small incremental steps, so that you can perform these steps in red, green, refactor iterations.
For example, an exercise of finding an element position inside an array, might be like this:
If array is empty, then position is 0, no matter the needle element
Write test. Implementation. Refactor
If array is not empty, and element does not exist, position is -1
Write test. Implementation. Refactor
If array is not empty, and element is the first in list, position is 1
Write test. Implementation. Refactor
I don't really see how to break the MD5 algorithm in that kind of steps. But that may be because I'm not really an algorithm guy. If you better understand the steps involved in the MD5 algorithm, then you may have better chances.
It depends on what you mean with unsuitable... :-) It is suitable if you want to document a few examples that describes your implementation. It should also be possible to have the algorithm emerge from your specifciation if you add one more character for each test.
By just adding a switch statement you're just trying to "cheat the system". Using BDD/TDD does not mean you have to implement stupid things. Also the fact that you have hardcoded hash values as well as a switch statement in your code are clear code smells and should be refactored and removed. That is how your algorithm should emerge because when you see the hard coded values you first remove them (by calculating the value) and then you see that they are all the same so you remove the switch statement.
Also if your question is about finding good katas I would recommend lokking in the Kata catalogue.

Resources