Markov Chain Monte Carlo
One of the goals of my current study of statistics is understanding Markov Chain Monte Carlo (MCMC).
I would like to write about one of the popular MCMC algorithms, the Metropolis–Hastings algorithm. It's a kind of sampling algorithm that accepts or rejects proposed samples.
Let $P(x)$ be the probability distribution of the target distribution that you would like to draw samples from.
Let $G(x' \mid x)$ be the probability distribution of the proposal distribution that you already know how to draw samples from.
1. Draw a sample $x'$ from $G(x' \mid x)$ (the previous sample is $x$).
2. Accept the sample with probability $A(x', x) = \min\left(1, \frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)}\right)$. Otherwise, reject it and keep $x$.
3. Repeat steps 1 and 2 until the samples' probability distribution is close enough to the target.

This means that you always accept the sample when $P(x')\,G(x \mid x') \ge P(x)\,G(x' \mid x)$, and that you accept it with probability $\frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)}$ when it's not.
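The steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes a one-dimensional target given by its log-density and a symmetric Gaussian random-walk proposal, so the $G$ terms cancel in the acceptance ratio; the function and variable names are my own.

```python
import math
import random

def metropolis_hastings(log_p, n_samples, x0=0.0, step=1.0):
    """Random-walk Metropolis-Hastings: the proposal G(x'|x) is a Gaussian
    centered at x, which is symmetric, so A = min(1, P(x')/P(x))."""
    x = x0
    samples = []
    for _ in range(n_samples):
        x_new = x + random.gauss(0.0, step)        # step 1: draw x' from G(x'|x)
        # step 2: accept with probability min(1, P(x')/P(x)), done in log space
        if math.log(random.random()) < log_p(x_new) - log_p(x):
            x = x_new                              # accept x'
        samples.append(x)                          # on rejection, keep x
    return samples

# Example: sample from a standard normal target (log-density up to a constant)
random.seed(0)
samples = metropolis_hastings(lambda x: -0.5 * x * x, 50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))
```

With enough samples, the empirical mean and variance should approach the target's values (0 and 1 here), which is step 3 in practice.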
Let’s think about why this algorithm works.
The reversible Markov chain that I explained in the previous post plays an important role in this algorithm.
Let's assume that samples from the target probability distribution are generated by a reversible Markov chain which has the stationary distribution $P(x)$.
Since the Markov chain is reversible, each transition ($x \to x'$) of the Markov chain fulfils the following detailed balance condition, where $T(x' \mid x)$ is the transition probability:

$$P(x)\,T(x' \mid x) = P(x')\,T(x \mid x')$$
When $\frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)} \ge 1$ is true, we always accept the sample.
If the proposal $G$ alone satisfied detailed balance, we would have $P(x)\,G(x' \mid x) = P(x')\,G(x \mid x')$, but it's not the case here: $P(x)\,G(x' \mid x) \le P(x')\,G(x \mid x')$. It means the proposal distribution causes the chain to move from $x$ to $x'$ too rarely and from $x'$ to $x$ too often. To compensate for this, whenever we get $x'$ from $G$, we accept the sample $x'$.
On the other hand, when $\frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)} < 1$, we accept the sample with the probability $\frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)}$. In this case:

$$P(x')\,G(x \mid x') < P(x)\,G(x' \mid x)$$

It means the proposal distribution causes the chain to move from $x$ to $x'$ too often and from $x'$ to $x$ too rarely.
We need to adjust the transition by using an acceptance distribution $A(x', x)$, so that the combined transition $T(x' \mid x) = G(x' \mid x)\,A(x', x)$ satisfies detailed balance:

$$P(x)\,G(x' \mid x)\,A(x', x) = P(x')\,G(x \mid x')\,A(x, x')$$
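This adjustment can be checked numerically. The sketch below uses a hypothetical 3-state chain (the target $P$ and the proposal matrix $G$ are made-up values chosen only for illustration): it builds $T(x' \mid x) = G(x' \mid x)\,A(x', x)$ with the Metropolis–Hastings acceptance and verifies detailed balance for every pair of states.

```python
import itertools

# Hypothetical 3-state example: target P and an arbitrary proposal matrix G
P = [0.2, 0.3, 0.5]
G = [[0.1, 0.6, 0.3],
     [0.4, 0.2, 0.4],
     [0.5, 0.3, 0.2]]  # G[x][x_new] = probability of proposing x_new from x

def A(x_new, x):
    """Metropolis-Hastings acceptance probability A(x', x)."""
    return min(1.0, (P[x_new] * G[x_new][x]) / (P[x] * G[x][x_new]))

def T(x_new, x):
    """Adjusted off-diagonal transition: T(x'|x) = G(x'|x) * A(x', x)."""
    return G[x][x_new] * A(x_new, x)

# Detailed balance holds for every ordered pair: P(x) T(x'|x) == P(x') T(x|x')
for x, x_new in itertools.permutations(range(3), 2):
    lhs = P[x] * T(x_new, x)
    rhs = P[x_new] * T(x, x_new)
    assert abs(lhs - rhs) < 1e-12
print("detailed balance verified")
```

Note that $G$ by itself does not satisfy detailed balance here; it is exactly the acceptance factor $A$ that repairs it.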
This is my intuitive understanding of the Metropolis–Hastings algorithm.
A more formal explanation from Wikipedia is as follows:
The detailed balance condition $P(x)\,T(x' \mid x) = P(x')\,T(x \mid x')$ can be re-written as

$$\frac{T(x' \mid x)}{T(x \mid x')} = \frac{P(x')}{P(x)}$$
We draw a sample from the proposal distribution $G$ and accept it with a certain probability $A$ to simulate the target distribution $P$, so the transition splits into two steps: $T(x' \mid x) = G(x' \mid x)\,A(x', x)$.
Let's insert this into the previous equation:

$$\frac{A(x', x)}{A(x, x')} = \frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)}$$
Then we need to choose an acceptance distribution which fulfils this condition. The choice used in the Metropolis–Hastings algorithm is as follows:

$$A(x', x) = \min\left(1, \frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)}\right)$$
This fulfils the previous condition because:

When $\frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)} \ge 1$, then $A(x', x) = 1$ and $A(x, x') = \frac{P(x)\,G(x' \mid x)}{P(x')\,G(x \mid x')}$, so $\frac{A(x', x)}{A(x, x')} = \frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)}$.

When $\frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)} < 1$ (so the reverse ratio is $> 1$), then $A(x', x) = \frac{P(x')\,G(x \mid x')}{P(x)\,G(x' \mid x)}$ and $A(x, x') = 1$, which gives the same ratio.
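This case analysis can also be checked numerically. In the sketch below (the helper name `acceptance` and the random inputs are my own, standing in for arbitrary positive values of $P$ and $G$), the min-based acceptance reproduces the required ratio in both cases.

```python
import random

random.seed(1)

def acceptance(p_new, p_old, g_back, g_fwd):
    """A(x', x) = min(1, P(x')G(x|x') / (P(x)G(x'|x)))."""
    return min(1.0, (p_new * g_back) / (p_old * g_fwd))

# For random positive values, A(x',x)/A(x,x') always equals
# P(x')G(x|x') / (P(x)G(x'|x)), covering both the >= 1 and < 1 cases.
for _ in range(1000):
    p_x, p_xn, g_fwd, g_back = (random.uniform(0.1, 1.0) for _ in range(4))
    ratio = acceptance(p_xn, p_x, g_back, g_fwd) / acceptance(p_x, p_xn, g_fwd, g_back)
    assert abs(ratio - (p_xn * g_back) / (p_x * g_fwd)) < 1e-12
print("ratio identity verified")
```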