View Full Version : Explanation of how Markov Chaining Works and How it Relates to Search
randfish
02-24-2005, 03:28 PM
I've been reading quite a few white papers and research projects that mention Markov Chaining as a method used in constructing an algorithm or using it for link analysis, etc.
Unfortunately, I haven't been able to dig up a resource that I can understand on the subject of Markov Chaining itself. Could one of the experts here possibly help me to better understand this technique?
Thanks much!
orion
02-24-2005, 06:07 PM
Hi, Rand
Here is a very simple example of Markov Theory applied to music. Not really high-level math at all. It also comes with a nice source code.
http://jmusic.ci.qut.edu.au/jmtutorial/Markov1.html
Hope this help in some way.
Orion
Michael Martinez
02-24-2005, 06:15 PM
I have not studied Markov Chains, so I am only basing this on the previous example. That model could be applied to an analysis of Web document relationships, where a normalization could be determined for relationships between documents of type A and documents of type B.
That is, the probability that a document which is topically about subject A may point to a document which is topically about subject B would be determined through random sampling.
It could be useful for detecting documents which deviate from the norm by an unacceptable degree.
The model would also be useful for developing document clusters and determining which documents (in a group) are primary.
A Markov chain describes a system whose state changes over time. Its not completely random but rather a distribution of probabilities.
If you like symbols - if not ignore it'll tell you nothing more:
"A homogeneous Markov chain is defined by an initial distribution x and a Markov kernel P.
Path = sequence (x0, x1, …, xn).
The probability of a path can be computed as a product of probabilities for each step i. "
Markov chains are used all the time in networking, bioinformatics, entropy, financial markets... There's lots of different kinds of Markov chains like hidden markov, continuous-time markov or even the exotically named Markov Chain Monte Carlo. And then there are loads of different theorems as well.
They are said to be a discrete-time stochastic process, and belong to the maths arena which we always borrow heavily from.
"discrete time" means that the time index t has a finite or countably infinite number of values;
It comes from the markov property which says that if the present state predicts future states as well as the history of past and present states , the process is said to be memoryless. So we can say simply that the assumption is that the future depends only on the present, not the past.
In the computational linguistic application, words get clustered together to reduce the number of parameters. Each word can be predicted according to a conditional distribution. So this is how we can estimate which words are likely to appear next, as the word does not depend on the history of all the words found, but rather on the present ones.
This a large and very complex area, to get a full comprehension you start looking into Maximum Likelihood Estimators, discounting, good-turing smoothing, random walks,... and all of the other probalistic theories. It really depends what you're trying to do with it. It is used enourmously in cross-language application, NLP applications, summarization, IR evaluation. spam recognition applications, statistical machine translation... that kind of thing.
What we know about linguistics and the web:
Words are distributed very unevenly (Zipf, Benford, self-triggerability laws)
Links form hubs which are based around topics.
These are used like water is in tea or coffee. They are everywhere. Don't be suprised if you see them popping up in any IR or NLP paper. They are efficient, but again as with everything, not ideal.
randfish
02-24-2005, 07:24 PM
Thanks for your help guys.
It appears to me that the early uses for Markov Chaining were to predict and emulate the 'random walk' of a surfer on the web for algos like PageRank. Lately though, I see them used for topics like clustering based on semantics rather than links (i.e. Clusty) and being broken down to small chains in order to process larger groups of information.
I admit, I'm still struggling with the concepts of chaining itself and the construction of matrices. I wish that I had the opportunity to have a teacher present this in class, rather than trying to learn it 5+ years after my last math class in college.
Perhaps I can get some further insight during SES... in my yellow shoes :)
Also, yes, Google does use a Markov Chain, it's called PageRank.
I wrote more about it on the blog. Good topic in fact Randfish.
claus
02-25-2005, 10:22 AM
I haven't been able to dig up a resource that I can understand on the subject of Markov Chaining itself. Could one of the experts here possibly help me to better understand this technique?
In another forum i gave this straightforward explanation, it might be helpful to get a basic grasp of things
imagine a map from city A to city B,C,D... with lots of roads and cities inbetween. Assign probabilities to passing each of the cities inbetween and calculate where you're most likely to end up. That's basically it
Its not quite like that Claus, its much more linear.
claus
02-27-2005, 12:39 PM
*lol* thats the danger of providing generalized examples to a mixed audience - while it will help the understanding at some levels it will be outright misleading at others :)
Okay, you do have constraints - you can't just follow any road, as there's a transition matrix among the cities that determine which step you can take next given where you are. Also, some cities might be absorbing, so you'll never leave those once you get there, while others are transitional, meaning that you will pass. Another word for cities is "states" (fun?). Anyway, it might as well be web pages, that's just technicalities - as soon as you scratch the surface it sounds a bit confusing, which it really isn't when you work with it.
But.. perhaps i'm just not good at explaining this kind of stuff, as i tend to avoid the jargon and ignore some of the specifics in order to simplify. And, then there are different types of these models as well...
Anyway, i just found another interesting use of Markov chains (http://instruct1.cit.cornell.edu/courses/eceprojectsland/STUDENTPROJ/2002to2003/lil2/) ... engineers... duh... what could one expect from these people, really ;)
Marcia
02-27-2005, 01:04 PM
>>interesting use of Markov chains
Claus, what is that supposed to be? The one on the third shelf up looks more like a mutated chicken than a hamster.
claus
02-27-2005, 01:52 PM
Another word for cities is "states" (fun?).
That isn't really so, as the "states" is something a little bit different, ie. not the cities themselves but "your placement" at each "run" of the model, like, say each round in a game of cards - or an iteration or whathavewe... Anyway, i guess i've proven that there are limits to what can be stated in a simple fashion. I did try, though ;)
>> chicken
At second sight that creature doesn't really look healthy. There you go - there's apparently a 17% probability that engineer-made music is bad for your hair *lol*
>>interesting use of Markov chains
Claus, what is that supposed to be? The one on the third shelf up looks more like a mutated chicken than a hamster.
That confused me until I had a look, then I laughed really hard!