In order to develop a translation system for a constructed language, I am looking for a method that is easy to implement. I turn quite naturally to algorithms based on rules encoded by an "expert".
Thus, I am looking for references (code, explanatory documents...) on naive translation algorithms, for example those from the 1960s.
Note: I know that the results will probably be rough.
When working with translations, use books and a dictionary, with one language as the reference. For example, from the book Moby Dick:
"Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world."
I need to translate this Spanish text:
"¿Que paises forman parte del mundo?"
Choose a section of the reference text, called the "window" or "scope", that statistically contains the most matching words, synonyms, and antonyms. The section starts and ends at words that have a direct dictionary translation. Then find the same section in the Spanish edition of the book. In this example:
start : call -> Llamame. => end: world -> mundo
After processing, the result looks something like:
how many places about of the world
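The direct dictionary step on its own can be sketched in Python as follows. This is only a minimal illustration: the LEXICON entries and the function name translate_direct are made up for the example, and it does none of the window matching against a reference book described above.

```python
import re

# Hypothetical toy lexicon; a real system would load a full bilingual dictionary.
LEXICON = {
    "que": "which",
    "paises": "countries",
    "forman": "form",
    "parte": "part",
    "del": "of the",
    "mundo": "world",
}

def translate_direct(sentence, lexicon):
    """Translate word by word; unknown words are kept and marked with '*'."""
    words = re.findall(r"\w+", sentence.lower())
    return " ".join(lexicon.get(word, "*" + word) for word in words)

print(translate_direct("¿Que paises forman parte del mundo?", LEXICON))
# which countries form part of the world
```

Word order and inflection are not handled at all here; those are exactly the places where hand-written rules would have to be added.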
You could find more information here
I am learning Ruby by reading a few books, tutorials, forums and so on, so I am brand new to this.
I am trying to develop a stock system so I can learn by doing.
My questions are the following:
I created the following to store transactions (just a few parts of the code):
transactions.push type: "BUY", date: Date.strptime(date.to_s, '%d/%m/%Y'), quantity: quantity, price: price.to_money(:BRL), fees: fees.to_money(:BRL)
One colleague here suggested creating a Transaction class to store this.
So, for the next piece of information I had to store, I did:
@dividends_from_stock << DividendsFromStock.new(row["Approved"], row["Value"], row["Type"], row["Last Day With"], row["Payment Day"])
Now, FIRST question: which way is better? Hash in Array or Object in Array? And why?
This @dividends_from_stock is returned by the method 'dividends'.
I want to find all the dividends that were paid after a specific date:
puts ciel3.dividends.find_all {|dividend| Date.parse(dividend.last_day_with) > Date.parse('12/05/2014')}
I get the following:
#<DividendsFromStock:0x2785e60>
#<DividendsFromStock:0x2785410>
#<DividendsFromStock:0x2784a68>
#<DividendsFromStock:0x27840c0>
#<DividendsFromStock:0x1ec91f8>
#<DividendsFromStock:0x2797ce0>
#<DividendsFromStock:0x2797338>
#<DividendsFromStock:0x2796990>
OK, with this I am able to spot (I think) all the objects that have a date later than 12/05/2014. But (SECOND question) how can I get the 'value' (or other information) stored inside the objects?
Generally it is always better to define classes. Classes have names. They will help you understand what is going on when your program gets big. You can always see the class of each variable like this: var.class. If you use hashes everywhere, you will be confused because these calls will always return Hash. But if you define classes for things, you will see your class names.
Define methods in your classes that return the information you need. If you define a method called to_s, Ruby will call it behind the scenes on the object when you print it or use it in an interpolation (puts "Some #{var} here").
You probably want a first-class model of some kind to represent the concept of a trade/transaction and a list of transactions that serves as a ledger.
I'd advise steering closer to a database for this instead of manipulating toy objects in memory. Sequel can be a pretty simple ORM if used minimally, but ActiveRecord is often a lot more beginner friendly and has fewer sharp edges.
Using naked hashes or arrays is good for prototyping and seeing if something works in principle. Beyond that it's important to give things proper classes so you can relate them properly and start to refine how these things fit together.
I'd even start with TransactionHistory being a class derived from Array where you get all that functionality for free, then can go and add on custom things as necessary.
For example, you have a pretty gnarly interface to DividendsFromStock which could be cleaned up by having that format of row be accepted by the initialize method as-is.
Don't forget to write a to_s or inspect method for any custom classes you want to be able to print or have a look at. These are usually super simple to write and come in very handy when debugging.
thank you!
I will answer my question, based on the information provided by tadman and Ilya Vassilevsky (and also B. Seven).
1- It is better to create a class and use objects. It will help me organize and debug my code, and keep track of who is who and what is doing what. It also seems a better fit for use with a DB.
2- I am a little embarrassed by my question after figuring out the solution. It is far simpler than I was thinking. It just needed two steps:
willpay = ciel3.dividends.find_all {|dividend| Date.parse(dividend.last_day_with) > Date.parse('10/09/2015')}
willpay.each do |dividend|
  puts "#{ciel3.code} has approved #{dividend.type} on #{dividend.approved} and will pay by #{dividend.payment_day} the value of #{dividend.value.format} per share, for those that had the asset on #{dividend.last_day_with}"
  puts
end
I am using Hunspell to stem words for a SOLR instance. For the most part, it seems to be working well.
I'm using the OpenOffice dic/aff files.
However, there are some notable word exceptions, and I'd like to be able to remove these as candidates for stemming.
A great example is "skier", which stems to "sky" because of the following:
in the .dic file
sky/MDRSGZ
relevant rule in the .aff file
SFX R y ier [^aeiou]y
Is there any way to indicate that skier and only skier should be left alone?
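To make the cause concrete, here is a small Python sketch of how a single SFX line is applied to a stem. It is only an illustration of the rule format (strip, add, condition), not Hunspell's actual implementation, and the function name apply_suffix_rule is made up.

```python
import re

def apply_suffix_rule(stem, strip, add, condition):
    """Return the derived form, or None if the rule does not apply to the stem."""
    # The condition is a regex-like pattern that must match the end of the stem.
    if not re.search("(?:" + condition + ")$", stem):
        return None
    if strip != "0":  # "0" means "strip nothing" in .aff files
        if not stem.endswith(strip):
            return None
        stem = stem[: -len(strip)]
    return stem + add

# SFX R y ier [^aeiou]y
print(apply_suffix_rule("sky", "y", "ier", "[^aeiou]y"))   # skier
print(apply_suffix_rule("play", "y", "ier", "[^aeiou]y"))  # None
```

Because "sky" carries the R flag, "skier" is generated as one of its forms, which is why the stemmer maps it back to "sky".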
Yeah, this is a very common thing. Just remove the "R":
sky/MDSGZ
But you may then want to add "skier", and any other forms of it, back in on a separate line:
skier/MS
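If there are many such exceptions, the edit can be scripted. The following Python sketch assumes single-character flags as in the entries above; the helper name drop_flag and the "wind" entry are made up for the example. Note that the first line of a real .dic file is a word count, which may need adjusting when entries are added.

```python
def drop_flag(dic_lines, word, flag):
    """Remove one affix flag from a word's entry, leaving other lines unchanged."""
    result = []
    for line in dic_lines:
        entry, _, flags = line.partition("/")
        if entry == word:
            flags = flags.replace(flag, "")
            line = entry + "/" + flags if flags else entry
        result.append(line)
    return result

lines = ["sky/MDRSGZ", "wind/MDRSGZ"]  # hypothetical excerpt of a .dic file
patched = drop_flag(lines, "sky", "R") + ["skier/MS"]
print(patched)  # ['sky/MDSGZ', 'wind/MDRSGZ', 'skier/MS']
```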
I have had to make numerous changes to this file, and now really wish there was a better option.
For example
Butter -> Butt
Corner -> Corn
Easter -> East
And then another one that is really confusing,
Wind == Wound
On my site, before we fixed it, if you searched for wind, as in "wind power", you ended up with a bunch of bruises and bloody wounds.
That is because "wound", as in "I wound the clock", stemmed to "wind".
We also decided to remove all RE prefixes, because of things like
remarkable -> mark
remove -> move
reset -> set
restore -> store
So if you know of a dictionary that is better suited to this, please let me know. (I think the main problem is that this dictionary is intended more for spell checking than for stemming.)
I would be willing to start and/or contribute to a git project for a real stemming dictionary to replace this spelling dictionary for everyone out there using this.
Have you tried FreeLing? It is open source.
A demo page is here:
http://nlp.lsi.upc.edu/freeling/demo/demo.php
When I pick English and PoS tagging, I get the following result:
you wound the clock?
you wind the clock?
PRP VBD DT NN ?
Also, "skier" and "wind power" all get the noun stems. It is a great stemmer and analyzer.
I'm not sure about licensing. The download page:
http://devel.cpl.upc.edu/freeling/downloads?order=time&desc=1
I want 1-wide streams of stuff to be able to path-find while being considerate of the other stream using the same 2-wide path.
Let's say I have a map like this ("0"s are where it cannot go, "-"s are where it can, "1" and "A" are the starting points, and "2" and "B" are the destinations):
000000000000
0000000001A0
000000000--0
0B200------0
0--00------0
0------00000
0------00000
000000000000
If I have "A" path-find to "B" with the A* algorithm, it would block the path from "1" to "2" ("=" is the path):
000000000000
0000000001A0
000000000-=0
0B200======0
0=-00=-----0
0=====-00000
0------00000
000000000000
Yes I could path-find "1" to "2" then make the AB path but that won't always work. Case in point is this:
00000000000000000000
000000000000000001A0
00000000000000000--0
0B200------00------0
0--00------00------0
0------00------00000
0------00------00000
00000000000000000000
The A* path-finding from "1" to "2" blocks the path for "A" to "B"
000000000000000001A0
00000000000000000=-0
0B200------00=====-0
0-=00=====-00=-----0
0-====-00=====-00000
0------00------00000
00000000000000000000
"A" to "B" blocks "1" to "2"
000000000000000001A0
00000000000000000-=0
0B200------00======0
0=-00=====-00=-----0
0=====-00=====-00000
0------00------00000
00000000000000000000
Additional clarification: "A", "B", "1", and "2" can be anywhere in a user-created map. There will be anywhere from 1 to 10 paths going at the same time, starting and stopping separately, though the AI only needs to take the other current paths into account. It also needs to happen live, so it cannot take even seconds to compute.
So how can I make an AI smart enough not to block another path? Right now I'm using A*, so is there an improvement to it, or should I use an entirely new AI system? (Either works for me.)
If I have understood correctly, you are searching for Cooperative Pathfinding. In the last decade, many solutions for this problem have been proposed. You can find a nice summary of them in this paper.
I'll give you a small recap:
Local Repair A*: Each agent searches for a route to the destination using the A* algorithm, ignoring all other agents except for its current neighbors. The agents then begin to follow their routes, until a collision is imminent. Whenever an agent is about to move into an occupied position it instead recalculates the remainder of its route. A bit of "brute-forcing", it is not really state of the art but it is "easy" to implement and the current industry standard in video-games. Unfortunately I'm not able to find the pseudo-code for the algorithm. :(
Cooperative A*: A new algorithm for solving the Cooperative Pathfinding problem. The task is decoupled into a series of single agent searches. The individual searches are performed in three dimensional space-time, and take account of the planned routes of other agents. A wait move is included in the agent's action set, to enable it to remain stationary.
Hierarchical Cooperative A*: As before but in a hierarchical way.
Windowed Hierarchical Cooperative A*: The state of the art at the time (2005 I think). There is an interesting demo on the internet with Java source code and everything. To understand why WHCA* is better go to page 3 of the paper.
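To give a feel for the Cooperative A* idea from the recap above, here is a minimal Python sketch: each agent runs A* over (cell, time) states and avoids cells that earlier agents have reserved. It is a simplified illustration, not the paper's algorithm: the function names, the fixed planning order, the max_t horizon, and the assumption that an agent stays on its goal cell after arriving are all choices made for this example.

```python
import heapq

def parse(rows):
    """Return the set of walkable (row, col) cells; '0' marks a wall."""
    return {(r, c) for r, row in enumerate(rows)
            for c, ch in enumerate(row) if ch != "0"}

def spacetime_astar(free, start, goal, reserved, max_t=200):
    """A* over (cell, time) states; `reserved` holds (cell, t) pairs of other agents."""
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, t, cell, path = heapq.heappop(frontier)
        # Accept the goal only if nobody has reserved it from now on.
        if cell == goal and all((goal, u) not in reserved
                                for u in range(t, max_t + 1)):
            return path
        if (cell, t) in seen or t >= max_t:
            continue
        seen.add((cell, t))
        r, c = cell
        # Four moves plus a "wait" move that keeps the agent in place.
        for nxt in [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1), cell]:
            if nxt not in free or (nxt, t + 1) in reserved:
                continue
            # Conservative check against swapping places head-on.
            if nxt != cell and (nxt, t) in reserved and (cell, t + 1) in reserved:
                continue
            heapq.heappush(frontier,
                           (t + 1 + h(nxt), t + 1, nxt, path + [nxt]))
    return None

def plan_all(free, agents, max_t=200):
    """Plan agents in order; each finished path is reserved for later agents."""
    reserved, paths = set(), {}
    for name, (start, goal) in agents.items():
        path = spacetime_astar(free, start, goal, reserved, max_t)
        paths[name] = path
        if path is None:
            continue
        for t, cell in enumerate(path):
            reserved.add((cell, t))
        # The agent is assumed to stay on its goal cell after arriving.
        for t in range(len(path), max_t + 1):
            reserved.add((path[-1], t))
    return paths

# First map from the question; letters and digits count as walkable cells.
grid = [
    "000000000000",
    "0000000001A0",
    "000000000--0",
    "0B200------0",
    "0--00------0",
    "0------00000",
    "0------00000",
    "000000000000",
]
paths = plan_all(parse(grid), {"A": ((1, 10), (3, 1)), "1": ((1, 9), (3, 2))})
for name, path in paths.items():
    print(name, path)
```

Because the search includes time, the second agent can follow or wait instead of being blocked outright. The result still depends on the planning order, which is one of the weaknesses the hierarchical and windowed variants try to address.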
I hope this can be enough to start exploring this field by yourself if you need. :)
As with most problems - finding the actual constraints can help to identify the search space you are looking at.
It is not clear from the example problems whether both paths can take wildly different routes or whether the paths have to travel alongside each other. Do you know if there will always be a solution where all the paths on the map can be routed simultaneously?
If you simply require a road to support n-wide gaps (for n parallel paths), this seems like a simple tweak to the search space/problem representation, which could probably be done with A*.
Also, you have mentioned streams: is behaviour over time a dimension of the problem? Could there be an option for time-sharing (alternate use) of a narrow gap between multiple streams? Or perhaps shorter convoys of stream elements that can path-find on their own?
I need to implement a Minesweeper solver. I have started to implement a rule-based agent.
I have implemented certain rules. I have a heuristic function for choosing the best-matching rule for the cell currently being treated (using information about its surrounding cells). So for each chosen cell, it can decide whether to open the 8 surrounding cells, mark them, or do nothing. That is, at the moment the agent takes a revealed cell as input and decides what to do with the surrounding cells, but it does not yet know how to decide which cell to treat.
My question is, what algorithm should I implement for deciding which cell to treat?
Suppose that, for the first move, the agent reveals a corner cell (or some other cell, according to some rule for the first move). What should it do after that?
I understand that I need to implement some kind of search. I know many search algorithms (BFS, DFS, A* and others), so that is not the problem; I just do not understand how I can use these searches here.
I need to implement it following the principles of Artificial Intelligence: A Modern Approach.
BFS, DFS, and A* are probably not appropriate here. Those algorithms are good if you are trying to plan out a course of action when you have complete knowledge of the world. In Minesweeper, you don't have such knowledge.
Instead, I would suggest trying to use some of the logical inference techniques from Section III of the book, particularly using SAT or the techniques from Chapter 10. This will let you draw conclusions about where the mines are using facts like "one of the following eight squares is a mine, and exactly two of the following eight squares are mines." Doing this at each step will help you identify where the mines are, or realize that you must guess before continuing.
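As a rough illustration of this kind of inference, here is a Python sketch that is much simpler than a full SAT encoding: each revealed number becomes a constraint "these hidden cells contain exactly n mines", and two rules are applied until nothing new follows. The function name infer and the constraint format are assumptions made for this example.

```python
def infer(constraints):
    """Derive certain facts from (hidden_cells, mine_count) constraints.

    Returns (safe_cells, mine_cells).
    """
    safe, mines = set(), set()
    todo = {(frozenset(cells), n) for cells, n in constraints}
    changed = True
    while changed:
        changed = False
        # Substitute everything already known into each constraint.
        current = set()
        for cells, n in todo:
            n -= len(cells & mines)
            cells = cells - mines - safe
            if cells:
                current.add((cells, n))
        for cells, n in current:
            if n == 0:               # no mines left: every cell is safe
                safe |= cells
                changed = True
            elif n == len(cells):    # as many mines as cells: all are mines
                mines |= cells
                changed = True
        # Subset rule: if A is inside B, then B - A holds the remaining mines.
        derived = {(b - a, nb - na)
                   for a, na in current for b, nb in current if a < b}
        if not derived <= current:
            changed = True
        todo = current | derived
    return safe, mines

# A "1" touching two hidden cells and a neighbouring "1" touching those two
# plus a third: the third cell must be safe.
print(infer([({(0, 0), (0, 1)}, 1), ({(0, 0), (0, 1), (0, 2)}, 1)]))
# ({(0, 2)}, set())
```

When both returned sets are empty, no move is certain under these two rules and the agent has to guess, or fall back on a stronger method such as SAT.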
Hope this helps!
I ported this (with a bit of help). Here is the link to it working: http://robertleeplummerjr.github.io/smartSweepers.js/ . Here is the project: https://github.com/robertleeplummerjr/smartSweepers.js
Have fun!
I have an app which has common maths functions behind the scenes:
add(x, y)
multiply(x, y)
square(x)
The interface is a simple Google-style text field. I want the user to be able to enter a plain text description -
'2*3'
'2 times 3'
'multiply 2 and 3'
'take the product of 2 and 3'
and get a mathematical answer.
The question is, how should I map the text descriptions to the functions? I'm guessing I need to
tokenise the text
identify key tokens (function names, arguments)
try and map token combinations to function signatures
However, I'm guessing this is already a 'solved problem' in the machine learning space. Should I be using Natural Language Processing? Plain text search? Something else?
All ideas gratefully received, plus implementation suggestions [I'm using Python/AppEngine; I know about NLTK and Whoosh]
[PS I understand Google does this already, at least for the first two queries on the list. I'm guessing they also do it statistically, having a very large amount of search data. I don't have a large amount of data available, so I will need an alternative approach.]
After you tokenise the text, you need parsing to get a syntax tree of your natural language phrase. Once you have this, you can map the parse tree to a mathematical expression, and then evaluate the expression. I do not think this is a solved problem. I would start with several templates, say the first two, and experiment. The larger the domain of possible descriptions, the harder the task is.
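As a starting point for the template idea, here is a minimal Python 3 sketch that matches a handful of fixed phrasings with regular expressions and dispatches to the functions from the question. The TEMPLATES list and the evaluate function are made up for this example, and it does not build a real parse tree.

```python
import re

def add(x, y):
    return x + y

def multiply(x, y):
    return x * y

def square(x):
    return x * x

NUM = r"(-?\d+(?:\.\d+)?)"  # one captured integer or decimal number

# Each template pairs a phrasing with the function it maps to.
TEMPLATES = [
    (re.compile(rf"^{NUM}\s*\*\s*{NUM}$"), multiply),
    (re.compile(rf"^{NUM}\s+times\s+{NUM}$"), multiply),
    (re.compile(rf"^multiply\s+{NUM}\s+(?:and|by)\s+{NUM}$"), multiply),
    (re.compile(rf"^take\s+the\s+product\s+of\s+{NUM}\s+and\s+{NUM}$"), multiply),
    (re.compile(rf"^{NUM}\s*\+\s*{NUM}$"), add),
    (re.compile(rf"^add\s+{NUM}\s+(?:and|to)\s+{NUM}$"), add),
    (re.compile(rf"^(?:square\s+|the\s+square\s+of\s+){NUM}$"), square),
]

def evaluate(text):
    """Return the result for a recognised phrasing, or None if nothing matches."""
    query = " ".join(text.lower().split())  # normalise case and whitespace
    for pattern, func in TEMPLATES:
        match = pattern.match(query)
        if match:
            return func(*(float(group) for group in match.groups()))
    return None

for query in ["2*3", "2 times 3", "multiply 2 and 3", "take the product of 2 and 3"]:
    print(query, "->", evaluate(query))
```

Each new phrasing needs its own template, which is where this approach stops scaling and a proper grammar or parser becomes worthwhile.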
I would recommend a tool for defining grammars/patterns over text, such as SimpleParse for Python: http://www.ibm.com/developerworks/linux/library/l-simple.html. As a Java programmer, I would prefer GATE or graph-expression.