Any business examples of using Markov chains? - artificial-intelligence

What business cases are there for using Markov chains? I've seen the toy example of a Markov chain applied to someone's blog to write a fake post, but I'd like some practical examples, e.g. uses in business, stock market prediction, or the like.
Edit: Thanks to all who gave examples, I upvoted each one as they were all useful.
Edit2: I selected the answer with the most detail as the accepted answer. All answers I upvoted.

The obvious one: Google's PageRank.
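To make the connection concrete: PageRank treats a "random surfer" clicking links as a Markov chain over pages, and a page's rank is its long-run visit probability. Below is a minimal sketch using power iteration. The four-page `toy_web` graph is made up for illustration, and the damping factor of 0.85 is the commonly quoted value, taken here as an assumption; this is not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on the random-surfer Markov chain.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web used only for illustration.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

Page "c" collects links from every other page, so it ends up with the highest rank; page "d" has no inbound links and only keeps the random-jump share.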

Hidden Markov models are based on a Markov chain and extensively used in speech recognition and especially bioinformatics.

I've seen spam email that was clearly generated using a Markov chain -- certainly that qualifies as a "business use". :)

There is a class of optimization methods based on Markov Chain Monte Carlo (MCMC) methods. These have been applied to a wide variety of practical problems, for example signal & image processing applications to data segmentation and classification. Speech & image recognition, time series analysis, lots of similar examples come out of computer vision and pattern recognition.

We use log-file chain-analysis to derive and promote secondary and tertiary links to otherwise-unrelated documents in our help-system (a collection of 10m docs).
This is especially helpful in bridging otherwise separate taxonomies. e.g. SQL docs vs. IIS docs.

I know AccessData uses them in their forensic password-cracking tools. It lets you explore the more likely password phrases first, resulting in faster password recovery (on average).

Markov chains are used by search companies like Bing to infer the relevance of documents from the sequence of clicks made by users on the results page. The underlying user behaviour in a typical query session is modeled as a Markov chain, with particular behaviours as state transitions...
For example, if a document is relevant, a user may still examine more documents (but with a smaller probability); if it is not relevant, he may examine more documents (with a much larger probability).

There are some commercial Ray Tracing systems that implement Metropolis Light Transport (invented by Eric Veach, basically he applied metropolis hastings to ray tracing), and also Bi-Directional- and Importance-Sampling- Path Tracers use Markov-Chains.
The technique names above are googlable; I omitted further explanation for the sake of this thread.

We plan to use it for predictive text entry on a handheld device for data entry in an industrial environment. In a situation with a reasonable vocabulary size, transitions to the next word can be suggested based on frequency. Our initial testing suggests that this will work well for our needs.
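A rough sketch of what frequency-based next-word suggestion can look like, assuming a first-order (bigram) chain over words. The function names and the sample `log` entries are hypothetical and are not taken from the poster's system.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    transitions = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            transitions[current][following] += 1
    return transitions


def suggest(transitions, word, k=3):
    """Return up to k most frequent successors of `word`."""
    counts = transitions.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]


# Hypothetical data-entry history.
log = [
    "replace pump seal",
    "replace pump motor",
    "replace pump seal",
    "inspect valve seal",
]
model = train_bigrams(log)
suggestions = suggest(model, "pump")
```

With a small, domain-specific vocabulary the transition table stays tiny, which is why this kind of model fits on a handheld device.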

IBM has CELM. Check out this link:
http://www.research.ibm.com/journal/rd/513/labbi.pdf

I recently stumbled on a blog example of using markov chains for creating test data...
http://github.com/emelski/code.melski.net/blob/master/markov/main.cpp

A Markov model is a way of describing a process that goes through a series of states.
HMMs can be applied in many fields where the goal is to recover a data sequence that is not immediately observable (but depends on some other data on that sequence).
Common applications include:
Crypt-analysis, Speech recognition, Part-of-speech tagging, Machine translation, Stock Prediction, Gene prediction, Alignment of bio-sequences, Gesture Recognition, Activity recognition, Detecting browsing pattern of a user on a website.

Markov Chains can be used to simulate user interaction, e.g. when browsing a service.
A friend of mine wrote his diploma thesis on plagiarism detection using Markov Chains (he said the input data must be whole books to succeed).
It may not be very 'business' but Markov Chains can be used to generate fictitious geographical and person names, especially in RPG games.
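For the name-generation case, here is a minimal sketch assuming a character-level chain of order 2: each next letter is drawn according to how often it followed the previous two letters in the training names. The seed names and all identifiers are invented for illustration.

```python
import random
from collections import defaultdict

START, END = "^", "$"


def train_names(names, order=2):
    """Map each `order`-letter context to the letters seen after it."""
    table = defaultdict(list)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(len(padded) - order):
            table[padded[i:i + order]].append(padded[i + order])
    return table


def generate_name(table, rng, order=2, max_len=12):
    """Walk the chain from the start context until END or max_len."""
    state, letters = START * order, []
    while len(letters) < max_len:
        nxt = rng.choice(table[state])
        if nxt == END:
            break
        letters.append(nxt)
        state = state[1:] + nxt
    return "".join(letters).capitalize()


# Hypothetical fantasy-style seed names.
seed_names = ["Aldric", "Alwyn", "Eldrin", "Elric", "Aldwin", "Edric"]
table = train_names(seed_names)
name = generate_name(table, random.Random(0))
```

Storing successors as a list with repeats makes `rng.choice` sample in proportion to observed frequency without any explicit probability table.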

Markov Chains are used in life insurance, particularly in the permanent disability model. There are 3 states:
0 - The life is healthy
1 - The life becomes disabled
2 - The life dies
In a permanent disability model the insurer may pay some sort of benefit if the insured becomes disabled and/or the life insurance benefit when the insured dies. The insurance company would then likely run a monte carlo simulation based on this Markov Chain to determine the likely cost of providing such an insurance.
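A minimal sketch of such a simulation, under loudly stated assumptions: the annual transition probabilities are invented, benefits are not discounted, the disability benefit is paid for each year spent disabled, and the death benefit is paid once. Real actuarial models use age-dependent rates and discounting.

```python
import random

HEALTHY, DISABLED, DEAD = 0, 1, 2

# Hypothetical annual transition probabilities (each row sums to 1).
# Disability is permanent, so there is no DISABLED -> HEALTHY transition.
TRANSITIONS = {
    HEALTHY: [(HEALTHY, 0.96), (DISABLED, 0.03), (DEAD, 0.01)],
    DISABLED: [(DISABLED, 0.95), (DEAD, 0.05)],
}


def simulate_policy_cost(rng, years, disability_benefit, death_benefit):
    """Simulate one insured life and return the total benefits paid."""
    state, cost = HEALTHY, 0.0
    for _ in range(years):
        targets, weights = zip(*TRANSITIONS[state])
        state = rng.choices(targets, weights=weights)[0]
        if state == DEAD:
            cost += death_benefit  # paid once, then the policy ends
            break
        if state == DISABLED:
            cost += disability_benefit  # paid for each year spent disabled
    return cost


def expected_cost(n_runs=20000, seed=1, **policy):
    """Monte Carlo estimate of the average cost per policy."""
    rng = random.Random(seed)
    total = sum(simulate_policy_cost(rng, **policy) for _ in range(n_runs))
    return total / n_runs


estimate = expected_cost(years=1, disability_benefit=1000.0, death_benefit=10000.0)
```

For a one-year horizon the exact answer is 0.03 * 1000 + 0.01 * 10000 = 130, which gives a quick sanity check on the simulation before trusting it for longer horizons.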

Related

What do we mean by "controllable actions" in a POMDP?

I have some questions related to POMDPs.
What do we mean by controllable actions in a partially observable Markov decision process? Or no controllable actions in hidden Markov states?
When computing policies through value or policy iteration, could we say that the POMDP is an expert system (because we model the environment)? While, when using Q-learning, it is a more flexible system in terms of intelligence or adaptability to a changing environment?
Actions
Controllable actions are the results of choices that the decision maker makes. In the classic POMDP tiger problem, there is a tiger hidden behind one of two doors. At each time step, the decision maker can choose to listen or to open one of the doors. The actions in this scenario are {listen, open left door, open right door}. The transition function from one state to another depends on both the previous state and the action chosen.
In a hidden Markov model (HMM), there are no actions for the decision maker. In the tiger problem context, this means the participant can only listen without opening doors. In this case, the transition function only depends on the previous state, since there are no actions.
For more details on the tiger problem, see Kaelbling, Littman, and Cassandra's 1998 POMDP paper, Section 5.1. There's also a more introductory walk-through available in this tutorial.
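To illustrate how the action enters the transition function, here is a small belief-update sketch for the tiger problem. The 0.85 listening accuracy is the value usually quoted for this problem and should be treated as an assumption here, as should the rule that opening a door resets the tiger's position uniformly at random. Function and state names are illustrative only.

```python
STATES = ("tiger-left", "tiger-right")
LISTEN_ACCURACY = 0.85  # assumed value for illustration


def transition(state, action):
    """Distribution over next states; note that it depends on the action."""
    if action == "listen":
        return {state: 1.0}  # listening does not move the tiger
    # Assumed reset: after a door is opened the tiger is re-placed at random.
    return {s: 0.5 for s in STATES}


def observation_prob(observation, state, action):
    """Probability of hearing `observation` given the resulting state."""
    if action != "listen":
        return 0.5  # observations after opening a door carry no information
    correct = (observation == "hear-left") == (state == "tiger-left")
    return LISTEN_ACCURACY if correct else 1.0 - LISTEN_ACCURACY


def update_belief(belief, action, observation):
    """Bayes filter: predict with the transition, correct with the observation."""
    predicted = {s: 0.0 for s in STATES}
    for s, p in belief.items():
        for s2, t in transition(s, action).items():
            predicted[s2] += p * t
    weighted = {s: observation_prob(observation, s, action) * predicted[s] for s in STATES}
    total = sum(weighted.values())
    return {s: w / total for s, w in weighted.items()}


uniform = {"tiger-left": 0.5, "tiger-right": 0.5}
after_one = update_belief(uniform, "listen", "hear-left")
after_two = update_belief(after_one, "listen", "hear-left")
after_open = update_belief(after_two, "open-right", "hear-left")
```

If you delete the `action` argument and keep only the listen branch, what remains is an HMM filter: the transition depends on the previous state alone.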
Adaptability
The basic intuition in your question is correct, but can be refined. POMDPs are a class of models, whereas Q-learning is a solution technique. The basic difference in your question is between model-based and model-free approaches. POMDPs are model-based, although the partial observability allows for additional uncertainty. Reinforcement learning can be applied in a model-free context, with Q-learning. The model-free approach will be more flexible for non-stationary problems. That being said, depending on the complexity of the problem, you could incorporate the non-stationarity into the model itself and treat it as an MDP.
There's a very thorough discussion on these non-stationary modelling trade-offs in the answer to this question.
Lastly, it is correct that POMDPs can be considered expert systems. Mazumdar et al (2017) have suggested treating Markov decision processes (MDPs) as expert systems.
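As an illustration of the model-free side, here is a minimal tabular Q-learning sketch. The corridor environment, the hyperparameters, and all names are assumptions made for this example. The point is that the learner only calls `step` and never reads the transition or reward model, which is what distinguishes it from value or policy iteration.

```python
import random
from collections import defaultdict

GOAL = 4            # states are 0..4, the episode ends at state 4
ACTIONS = (-1, +1)  # move left / move right


def step(state, action):
    """Hypothetical corridor environment; the learner treats it as a black box."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            explore = rng.random() < epsilon
            action = rng.choice(ACTIONS) if explore else greedy(q, state, rng)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward the bootstrapped target.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
```

Because the update keeps running on fresh experience, the table can track a drifting environment, at the cost of never having an explicit model to inspect.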

Learning the Structure of a Hierarchical Reinforcement Task

I've been studying hierarchical reinforcement learning problems, and while a lot of papers propose interesting ways for learning a policy, they all seem to assume they know in advance a graph structure describing the actions in the domain. For example, The MAXQ Method for Hierarchical Reinforcement Learning by Dietterich describes a complex graph of actions and sub-tasks for a simple Taxi domain, but not how this graph was discovered. How would you learn the hierarchy of this graph, and not just the policy?
In Dietterich's MAXQ, the graph is constructed manually. It's considered to be a task for the system designer, in the same way that coming up with a representation space and reward functions are.
Depending on what you're trying to achieve, you might want to automatically decompose the state space, learn relevant features, or transfer experience from simple tasks to more complex ones.
I'd suggest you just start reading papers that refer to the MAXQ one you linked to. Without knowing exactly what you want to achieve, I can't be very prescriptive (and I'm not really on top of all the current RL research), but you might find relevant ideas in the work of Luo, Bell & McCollum or the papers by Madden & Howley.
This paper describes one approach that is a good starting point:
N. Mehta, S. Ray, P. Tadepalli, and T. Dietterich. Automatic Discovery and Transfer of MAXQ Hierarchies. In International Conference on Machine Learning, 2008.
http://web.engr.oregonstate.edu/~mehtane/papers/hi-mat.pdf
Say there is this agent out there moving about doing things. You don't know its internal goals (task graph). How do you infer its goals?
In a way, this is impossible. Just as it is impossible for me to know what goal you had in mind when you put that box down: maybe you were tired, maybe you saw a killer bee, maybe you had to pee....
You are trying to model an agent's internal goal structure. In order to do that you need some sort of guidance as to what are the set of possible goals and how these are represented by actions. In the research literature this problem has been studied under the terms "plan recognition" and also with the use of POMDP (partially observable markov decision process), but both of these techniques assume you do know something about the other agent's goals.
If you don't know anything about its goals, all you can do is either infer one of the above models (this is what we humans do: I assume others have the same goals I do. I never think, "Oh, he dropped his laptop, he must be ready to lay an egg", 'cause he's a human) or model it as a black box: a simple state-to-actions function, then add internal states as needed (hmmmm, someone must have written a paper on this, but I don't know who).

Do you use Styrofoam balls to model your systems? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
[Objective-C]
Do you still use Styrofoam balls to model your systems, where each ball represents a class?
Tom Love: We do, actually. We've also done a 3D animation version of it, which we found to be nowhere near as useful as the Styrofoam balls. There's something about a physical, conspicuous structure hanging from the ceiling right in the middle of a development project that's regularly updated to provide not only the structure of the system that you're building, but also the current status of each one of the classes.
We've done it on 19 projects the last time I've counted. One of them was 1,856 classes, which is big - actually, probably bigger than it should be. It was a big commercial project, so it needed to be somewhat big.
Masterminds of Programming
It is the first time I've read or heard about using styrofoam balls to model classes.
Is that a commonly used technique? And, how does that sort of modeling help us to design better the system?
If you have any photos to share which can show us how the classes are represented it'd be great!
Update: So, it seems that the material most people use is paper. Styrofoam balls are actually oddballs, not a commonly used technique.
Notable techniques:
"paper plates and string" modeling, NealB
Post-it Notes on a whiteboard, Jason
Class-Responsibility-Collaboration cards, duffymo
Sheets of ruled paper taped to the wall, AMissico
Thank you all for the very good answers.
I found a couple of styrofoam models for:
Windows 95
and
Lotus Notes
(if that helps)
Actually, here's a Tom Love case study that shows a couple of his models.
This model may represent the least expensive CASE tool on the market -- materials cost $20.35. It was more useful than any CASE tools I have ever used.
We used it in three important ways.
It fixed the number of classes that we would deliver in the finished application and we did not allow new ones to be added, unless existing ones could be removed.
It was a very useful way to publicly document which classes had been code reviewed (blue ribbons) and tested (green ribbons).
It helped everyone understand what was being built and how much time and effort it takes to do testing, documentation and code reviews.
Edit: photo of object model
http://img686.imageshack.us/img686/82/stryrofoamobjectmodel.jpg
The styrofoam ball model appears to date back to the mid 1990s - a time when CASE (Computer-Aided Software Engineering) systems were all the rage. At that time CASE systems promised significant benefits but were dismally slow, buggy, unstable, overextended and downright awkward to use. Basically, long on potential but short on delivery.
I remember having a conversation with an analyst working on a different project from mine. Her team had become so frustrated with their CASE system that they trashed it and resorted to "paper plates and string" modeling. They reserved a meeting room, removed all the furniture, and laid out their process model using labeled paper plates with strings (representing data flows) connecting them. She claimed it was much more useful than the CASE system it replaced.
I suspect that the styrofoam ball model had similar roots.
Using styrofoam balls or paper plates fostered design "buy-in". If a team finds something to rally around it naturally creates a common design focus. It is simple, concrete and minimal - using it requires a lot of face to face interaction and discussion. And that is where the value comes from. I suspect if you brought a new person into the project and told them to bring themselves up-to-speed by reviewing the "model" they would be "dead in the water". However, walk them through the "model" and a real conversation would occur where all the information needed to perform on the project would be imparted very quickly and efficiently.
Do I think styrofoam balls could become a sustainable modeling tool? No, I don't. They would be a real pain to keep up to date in a changing environment. They convey little information. There are better tools available today. And most importantly, if the team you are working with doesn't "buy" it, and they probably won't, it will look really stupid - kind of like a sports team mascot, a rallying point only if the team "buys it".
No, we don't do this. And in my 30-odd year history in the IT industry, I've never heard of anyone doing this.
The only way this could help you design better systems is by:
keeping the class count down since it's hard to build the styrofoam model; and
minimising changes, since updating it would be a serious pain in the rear end.
Other than those two dubious features, I can't see this as being very useful. I'd almost conclude it was some sort of prank. Far better to do some real work, I think.
Seriously, if we tried to model our application with styro coffee cups and straws, our bosses would be calling the men in white coats.
Post-it Notes on a whiteboard seem to be popular in the circles I travel in. Objects go on the Post-Its, and you rearrange them until you get your relationships the way you want em.
And then there are the Color Modeling people who use a 4-pack of colored Post-Its and assign an archetype to each color. It doesn't sound like this is much of an improvement, but standing across a room looking at it, you can tell where there are missing features or unidentified objects in the system.
There is one application to this that I think we tend to forget-- using tools to articulate an architecture comes naturally to us after years in the industry, but there are valuable, albeit less technically-minded, stakeholders who may not grasp vital concepts as readily. It would sometimes be a lifesaver to point to a cluster of balls and say, "This is the Language Processing Model, and if I implement the feature you want, it will have consequences here, here, and here. You can see that there are a lot of balls connected there".
Architects, be they designing buildings or systems, might rely on those tangible models to indoctrinate the check writers into the process.
And I thought that UML was useless. The styrofoam ball model makes UML look positively elegant by comparison.
Ward Cunningham's CRC card idea is more useful, even cheaper, and still retains that tactile quality that Dr. Love was after.
I had never heard of the idea until I read this question. It deserves an up vote for originality. And the "Windows" and "Lotus Notes" pictures are priceless.
Sheets of ruled paper taped to the wall, where each sheet is a component, class, entity, or whatever is needed. Everyone has a pencil.
Everyone can write on them, "fleshing" out the model during the design meetings: meeting notes, implementation notes, new classes, removed classes, reasons why you do not have a particular class, and so on. After the design meeting, the principal designer takes them down and rewrites them, again "fleshing" them out, in pen, as "rough-draft" versions. The designer can then make decisions based on the notes on each sheet, create new sheets for any additional components, generate topics for the next meeting, note any discrepancies, note any design / implementation details needed for coding, or whatever else they need to do.
Repeat the meetings until everyone is satisfied. Pencil is new stuff, pen is previous items. Once everyone is happy, the designer creates the working-draft, and posts where everyone can see and initial, in pen, their acceptance of the "working-draft".
Nothing is final. Pen versions are "latest" versions. Pencil versions are "work-in-progress" or "draft" versions.
Simple, fast, flexible, no wasting time on the computer, with high visibility. Working man's Wiki.
No. My team does not do this.
And I am badly tempted to mock with image macros. But I'm contemplating that the idea is silly enough that it is self-mocking.

Training Hidden Markov Models without Tagged Corpus Data

For a linguistics course we implemented Part of Speech (POS) tagging using a hidden markov model, where the hidden variables were the parts of speech. We trained the system on some tagged data, and then tested it and compared our results with the gold data.
Would it have been possible to train the HMM without the tagged training set?
In theory you can do that. In that case you would use the Baum-Welch algorithm. It is described very well in Rabiner's HMM Tutorial.
However, having applied HMMs to part of speech, the error you get with the standard form will not be so satisfying. It is a form of expectation maximization which only converges to local maxima. Rule based approaches beat HMMs hands down, iirc.
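For reference, here is a compact sketch of one Baum-Welch re-estimation step in plain Python, assuming a discrete-emission HMM and a single short observation sequence (no scaling, so it would underflow on long sequences). The toy `obs` sequence of word ids and the starting parameters are invented. Note that the learned hidden states are just numbered clusters; nothing in the procedure names them "noun" or "verb".

```python
def forward(obs, pi, A, B):
    """alpha[t][i] = P(o_0..o_t, state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)])
    return alpha


def backward(obs, A, B):
    """beta[t][i] = P(o_{t+1}..o_{T-1} | state_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n)) for i in range(n)])
    return beta


def baum_welch_step(obs, pi, A, B):
    """One EM update; also returns the likelihood of the *input* parameters."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = sum(alpha[-1])
    # E-step: posterior state and transition occupancies.
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    # M-step: re-estimate parameters from expected counts.
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return gamma[0], new_A, new_B, likelihood


# Hypothetical untagged data: word ids 0..2, two hidden states.
obs = [0, 1, 0, 1, 2, 2, 0, 1, 0, 1, 2, 2, 2, 0, 1, 0, 1]
pi = [0.5, 0.5]
A = [[0.6, 0.4], [0.3, 0.7]]
B = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
likelihoods = []
for _ in range(10):
    pi, A, B, likelihood = baum_welch_step(obs, pi, A, B)
    likelihoods.append(likelihood)
```

The likelihood never decreases from one step to the next, but where it ends up depends on the starting parameters, which is the local-maximum issue mentioned above.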
I believe the natural language toolkit NLTK for python has an HMM implementation for that exact purpose.
My NLP course was a couple of years ago, but I believe without tagging the HMM could help determine the symbol emission/state transition probabilities of n-grams (i.e. what are the odds of "world" occurring after "hello"), but not parts-of-speech. It needs the tagged corpus to learn how the POS interrelate.
If I'm way off on this let me know in the comments!

How to automatically excerpt user generated content?

I run a website that allows users to write blog posts. I would really like to summarize the written content and use it to fill the <meta name="description".../>-tag, for example.
What methods can I employ to automatically summarize/describe the contents of user generated content?
Are there any (preferably free) methods out there that have solved this problem?
(I've seen other websites just copy the first 100 or so words but this strikes me as a sub-optimal solution.)
Think of the task of summarization as a challenge to 'select the most important sentences' from the document.
The method described in The Automatic Creation of Literature Abstracts by H.P. Luhn (1958) describes a naive method that actually performs quite well. Try giving it a shot.
If your website is in Python, coding this algorithm using the NLTK (Natural Language Toolkit) is a fun task.
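A rough sketch of the "select the most important sentences" idea, using only the standard library rather than NLTK. This is a simplified frequency-based variant in the spirit of Luhn, not his exact cluster-based scoring; the stopword list, the regex sentence splitter, and the function names are assumptions for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = frozenset(
    "a an and are as at be by for from in is it of on or that the this to was with".split()
)


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences=1):
    sentences = split_sentences(text)
    words_per_sentence = [
        [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOPWORDS]
        for s in sentences
    ]
    freq = Counter(w for words in words_per_sentence for w in words)

    def score(words):
        # Average document-wide frequency of the sentence's significant words.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(words_per_sentence[i]),
        reverse=True,
    )
    chosen = sorted(ranked[:max_sentences])  # restore original order
    return " ".join(sentences[i] for i in chosen)


post = (
    "Markov chains model sequences. I had coffee today. "
    "Markov chains predict the next state in sequences."
)
summary = summarize(post)
```

Averaging by sentence length keeps long sentences from winning just by containing more words.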
Make it predictable.
From a user's perspective, simply using the first paragraph is not bad at all.
Using any automation is bound to fall flat in some cases. So I suggest displaying the first paragraph (maybe truncating at some point) as a summary and offering the ability to override that with an optional field.
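A minimal sketch of that predictable fallback: use the author's override if present, otherwise the first paragraph, truncated at a word boundary. The 155-character limit, the blank-line paragraph convention, and the function names are assumptions, not requirements of the meta tag.

```python
import html


def excerpt(body, override=None, max_chars=155):
    """Return the override if given, else the first paragraph, truncated."""
    text = (override or "").strip()
    if not text:
        paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
        # Collapse internal whitespace so the excerpt is a single line.
        text = " ".join(paragraphs[0].split()) if paragraphs else ""
    if len(text) > max_chars:
        # Cut at the last space before the limit, leaving room for "...".
        cut = text[:max_chars - 3].rsplit(" ", 1)[0]
        text = cut.rstrip(" ,;:") + "..."
    return text


def meta_description(body, override=None, max_chars=155):
    """Render the excerpt as an HTML-escaped meta description tag."""
    content = html.escape(excerpt(body, override, max_chars), quote=True)
    return '<meta name="description" content="%s"/>' % content


tag = meta_description("First para here.\n\nSecond para.")
```

Escaping with `html.escape(..., quote=True)` matters here because the text is user generated and ends up inside an attribute value.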
I might try using mechanical Turk or any number of other crowdsourcing options.
Another item to check out, a SourceForge project, AutoSummary Semantic Analysis Engine
Not a trivial task... You should look for articles or books on "extractive summarization"
A few starters could be:
Books:
Natural Language Processing with Python
Foundations of Statistical Natural Language Processing
Articles:
Language independent extractive summarization
Extractive summarization: how to identify the gist of a text
Extractive Summarization using Inter- and Intra- Event Relevance
Yahoo has a free API for this:
http://developer.yahoo.com/search/content/V1/termExtraction.html
Apple's patent 6424362 - Auto-summary of document content contains sample code which might be useful...
This borders on artificial intelligence so there's not going to be an "easy" solution out there, but there are products that target this problem.
Check out Copernic Summarizer, for one.
Noun phrases typically tend to be important elements of a sentence. Picking sentence(s) with a high density of noun phrases could yield a good summary. You could get noun phrases using a POS tagger.
For a good summary, it is desirable that it is a meaningful sentence. Reading a broken sentence is slightly jarring.
Alternatively, when the author posts the article, the author can highlight what are the keywords that can be used in the description which can then be automatically put in the meta description tag.

Resources