I was wondering if there's any open-source tool that I can use to design an AI which would read sentences from a file, understand their structure( by breaking them into their basic components) and then report back its components in detail.
I will provide it some sets of words belonging to different components of sentences(like set of prepositions, set of verbs, set of adjectives, etc) so as to help it determine different components.
I have an elaborate plan for it but my question is whether there's a tool available or do I have to program it from the scratch?
You are looking for a Parts of Speech Tagger, of which there are many. They are actually not all that hard to write (I made a simple one during grad school) but robust taggers do take a fair amount of work.
Here is one that is a part of the popular NLTK Package for Python.
Just as an aside, there is a lot more to Natural Language understanding that POS, but POS tags could be part of the feature vector that you feed into a larger ML/AI algorithm.
Related
I'd like to write an application, which would imitate a player in an online game.
About the game: it is a strategy, where you can:
train your army (you have to have enough resources, then click on a unit, click train)
build buildings (mines, armories, houses,...)
attack enemies (select a unit, select an enemy, click attack)
transport resources between buildings
make researches (economics, military, technologic,...)
This is a simplified list and is just an example. Main thing is, that you have to do a lot of clicking, if you want to advance...
I allready have the 'navigational' part of the application (I used Watin library - http://watin.sourceforge.net/). That means, that I can use high level objects and manipulate them, for example:
Soldiers soldiers = Navigator.GetAllSoldiers();
soldiers.Move(someLocation);
Now I'd like to take the next step - write a kind of AI, which would simulate my gaming style. For this I have two ideas (and I don't like either of them):
login to the game and then follow a bunch of if statements in a loop (check if someone is attacking me, check if I can build something, check if I can attack somebody, loop)
design a kind of scripting language and write a compiler for it. This way I could write simple scripts and run them (Login(); CheckForAnAttack(); BuildSomething(); ...)
Any other ideas?
PS: some might take this as cheating and it probably is, but I look on this as a learning project and it will never be published or reselled.
A bunch of if statements is the best option if the strategy is not too complicated. However, this solution does not scale very well.
Making a scripting language (or, domain specific language as one would call that nowadays) does not buy you much. You are not going to have other people create AI agents are you? You can better use your programming language for that.
If the strategy gets more involved, you may want to look at Bayesian Belief Networks or Decision Graphs. These are good at looking for the best action in an uncertain environment in a structured and explicit way. If you google on these terms you'll find plenty of information and libraries to use.
Sounds like you want a finite state machine. I've used them to various degrees of success in coding bots. Depending on the game you're botting you could be better off coding an AI that learns, but it sounds like yours is simple enough not to need that complexity.
Don't make a new language, just make a library of functions you can call from your state machine.
Most strategy game AIs use a "hierarchical" approach, much in the same way you've already described: define relatively separate domains of action (i.e. deciding what to research is mostly independent from pathfinding), and then create an AI layer to handle just that domain. Then have a "top-level" AI layer that directs the intermediate layers to perform tasks.
How each of those intermediate layers work (and how your "general" layer works) can each determined separately. You might come up with something rather rigid and straightforward for the "What To Research" layer (based on your preferences), but you may need a more complicated approach for the "General" layer (which is likely directing and responding to inputs of the other layers).
Do you have the sourcecode behind the game? If not, it's going to be kind of hard tracing the positions of each CPU you're (your computer in your case) is battling against. You'll have to develop some sort of plugin that can do it because from the sound of it, you're dealing with some sort of RTS of some sort; That requires the evaluation of a lot of different position scenarios between a lot of different CPUs.
If you want to simulate your movements, you could trace your mouse using some WinAPI quite easily. You can also record your screen as you play (which probably won't help much, but might be of assistance if you're determined enough.).
To be brutally honest, what you're trying to do is damn near impossible for the type of game you're playing with. You didn't seem to think this through yet. Programming is a useful skill, but it's not magic.
Check out some stuff (if you can find any) on MIT Battlecode. It might be up your alley in terms of programming for this sort of thing.
First of all I must point out that this project(which only serves educational purposes), is too large for a single person to complete within a reasonable amount of time. But if you want the AI to imitate your personal playing style, another alternative that comes to mind are neural networks: You play the game a lot(really a lot) and record all moves you make and feed that data to such a network, and if all goes well, the AI should play roughly the same as you do. But I'm afraid this is just a third idea you won't like, because it would take a tremendeous amount of time to get it perfect.
I've been studying hierachial reinforcement learning problems, and while a lot of papers propose interesting ways for learning a policy, they all seem to assume they know in advance a graph structure describing the actions in the domain. For example, The MAXQ Method for Hierarchial Reinforcement Learning by Dietterich describes a complex graph of actions and sub-tasks for a simple Taxi domain, but not how this graph was discovered. How would you learn the hierarchy of this graph, and not just the policy?
In Dietterich's MAXQ, the graph is constructed manually. It's considered to be a task for the system designer, in the same way that coming up with a representation space and reward functions are.
Depending on what you're trying to achieve, you might want to automatically decompose the state space, learn relevant features, or transfer experience from simple tasks to more complex ones.
I'd suggest you just start reading papers that refer to the MAXQ one you linked to. Without knowing what exactly what you want to achieve, I can't be very prescriptive (and I'm not really on top of all the current RL research), but you might find relevant ideas in the work of Luo, Bell & McCollum or the papers by Madden & Howley.
This paper describes one approach that is a good starting point:
N. Mehta, S. Ray, P. Tadepalli, and T. Dietterich. Automatic Discovery and Transfer of MAXQ Hierarchies. In International Conference on Machine Learning, 2008.
http://web.engr.oregonstate.edu/~mehtane/papers/hi-mat.pdf
Say there is this agent out there moving about doing things. You don't know its internal goals (task graph). How do you infer its goals?
In way way, this is impossible. Just as it is impossible for me to know what goal you had mind when you put that box down: maybe you were tired, maybe you saw a killer bee, maybe you had to pee....
You are trying to model an agent's internal goal structure. In order to do that you need some sort of guidance as to what are the set of possible goals and how these are represented by actions. In the research literature this problem has been studied under the terms "plan recognition" and also with the use of POMDP (partially observable markov decision process), but both of these techniques assume you do know something about the other agent's goals.
If you don't know anything about its goals, all you can do is either infer one of the above models (This is what we humans do. I assume others have the same goals I do. I never think, "Oh, he dropped his laptop, he must be ready to lay an egg" cse, he's a human.) or model it as a black box: a simple state-to-actions function then add internal states as needed (hmmmm, someone must have written a paper on this, but I don't know who).
I'm looking for a program or library that I could use for experimenting with board games (chess mostly, but not necessarily -- other similarly complex board games
are OK too). I'll test different game-playing algorithms.
This is what I need:
I'd like, if possible, to make my program play against
players like gnuchess and crafty, but also against itself
and against a human player;
It's OK if my player-program can communicate with the
"server" via TCP, but it would be even nicer if it had
a C interface (not C++, because then I'd have to write
a wrapper);
I may want to change the game rules (initial position of
pieces, number of pieces, and even movement rules);
Flexible (it's OK if the library/server validates
chess moves, for example, but I'd like such feature to be
optional because I will want to turn it off for some
experiments);
Free (I may want to get into the source code and maybe
change a few bits).
I'd be grateful if anyone could point me to such a library/server...
Thanks a lot!
P.S.: I wanted to include a "board-games" tag, but it seems that I'd need more reputation for that...
P.S. 2: I'd like to accept two answers (they're complementary). It's a pity StackOverflow doesn't allow that.
VASSAL is a cross-platform engine for playing board and card games over the internet. It is designed for allowing humans to play each other, but it is extensible enough you could add an AI player.
It is open-source and extremely customizable, people have created original games using it.
The XBoard protocol is the standard used between chess engines and graphic board front ends. It is plain text: as far as I can say there is no library.
Although seems complicated, the implementation is quite straightforward: a really small subset is needed in order to develop an usable application. The doc usually refers to the chess engine, but the same applies to the client side (reversing the side).
Hypothetically you can have the same connectivity of XBoard/Winboard, depending on how much protocol has been implemented. If you need some code to inspect, other than the classic Eboard and Xboard, there are a lot of examples around the web, and I mean really a lot of (it is a list of chess engines, but someone of them, such as babychess, is also a GUI frontend).
I'm not sure something like that exists yet.. by the way most of these topics are quite easy to develop by yourself:
vs player: just implement imput (you can use something simple like ncurses)
vs CPU: these games are called perfect information games and you can easily structure an AI with simple algorithms like minmax trees or negmax
to allow changing of rules it's more simple to hard code them (since every game could have its really different rules
for TCP support you should need to codify moves and separate GUI part from server part
If you are not going to test it with a large amount of different games projecting something really extendible (like an engine) will be a waste of time. Just concentrate on modifiable parts and program them wisely..
Actually some parts don't need to be generic: plan a good game protocol and then just care about events like unallowed move and similars..
I run a website that allows users to write blog-post, I would really like to summarize the written content and use it to fill the <meta name="description".../>-tag for example.
What methods can I employ to automatically summarize/describe the contents of user generated content?
Are there any (preferably free) methods out there that have solved this problem?
(I've seen other websites just copy the first 100 or so words but this strikes me as a sub-optimal solution.)
Think of the task of summarization as a challenge to 'select the most important sentences' from the document.
The method described in The Automatic Creation of Literature Abstracts by H.P. Luhn (1958) describes a naive method that actually performs quite well. Try giving it a shot.
If your website is in Python coding this algorithm using the NLTK (Natural Language Toolkit) is a fun task.
Make it predictable.
From a users perspective simply using the first paragraph is not bad at all.
Using any automation is bound to fall flat in some cases. So I suggest to display
the first paragraph (maybe truncating at some point) as a summary and offer the ability to override that by an optional field.
I might try using mechanical Turk or any number of other crowdsourcing options.
Another item to check out, a SourceForge project, AutoSummary Semantic Analysis Engine
Not a trivial task... You should look for articles or books on "extractive summarization"
A few starters could be:
Books:
Natural Language Processing with Python
Foundations of Statistical Natural Language Processing
Articles:
Language independent extractive summarization
Extractive summarization: how to identify the gist of a text
Extractive Summarization using Inter- and Intra- Event Relevance
Yahoo has a free API for this:
http://developer.yahoo.com/search/content/V1/termExtraction.html
Apple's patent 6424362 - Auto-summary of document content contains sample code which might be useful...
This borders on artificial intelligence so there's not going to be an "easy" solution out there, but there are products that target this problem.
Check out Copernic Summarizer, for one.
Noun phrases typically tend to be important elements of a sentence. Picking sentence(s) with a high density of noun phrases could yield a good summary. You could get noun phrases using a POS tagger.
For a good summary, it is desirable that it is a meaningful sentence. Reading a broken sentence is slightly jarring.
Alternatively, when the author posts the article, the author can highlight what are the keywords that can be used in the description which can then be automatically put in the meta description tag.
I would like to know if there are any tools that can help me model C applications i.e. Functional programming.
E.g. I'm currently building a shared library.
But to communicate my design visually, I need something like UML. I would like to do this so that the person reviewing my design need not read through 100s of pages of functions, variables and so on.
I have read about UML for C, which I'm considering.
If there is anything better out there, please let me know.
The bottom line is to visualize the design of C applications and modules without reading through 100s of pages of text, because it takes time and is difficult for the reviewers.
Any help in this area from the experts here would be much appreciated.
Thanks.
A well written text documentation brings you a far. Much further than any UML diagram could ever achieve.
You should split this in two parts:
What do you want to say?
What's the best way to saying it?
Whatever formalism you use to answer the second part, you should be sure it's not ambigous.
The good of UML is that a lot of semantic is already defined by the language so you don't have to include a definition of what those boxes, lines and arrows mean in a collaboration diagram.
But most importantly, documenting something means create a path for others to understand the subject you are documenting. A very precise description that offers no clue on how to read it is as good as none. So, use UML, Finite state machines, ER diagrams, plain English, whatever you want but be sure to include a logical path that your "readers" can follow to understand what's going on.
I had a friend that was a fan of "preciseness at all cost" and it would ask us to go through all the details before some sort of meaning would emerge.
I once ask him to do this experiment: on his next trip to an unknown city, he would have to carry the most precise map he could get. Much better, he would have to carry a 1:1 map of the city with every single detail exactly reported in scale. That way he couldn't get lost!
He declined but I would love to see him trying to use that map. Just even folding it! :)
Whatever you like. It's not a standard but many devs use it and understand it. If it does help you to communicate with other people and document your work -> its for you. If it just takes too much time and you think it's not effective, drop it. Also, don't bother with all details, as long as it resembles UML and your team can work with it, it's fine.
It's meant to help you, not waste you time.