# Python辅导Jupyter | CS 369 2019 Assignment HMM & MCMC

## Assignment 4 : HMM & MCMC (13 + 2 points)

In this assignment you will write code to implement

• The forward and backward algorithms for hidden Markov models,
• The MCMC algorithm to sample the posterior distribution.

Plus 2 extra points for well commented code.

Due: Wednesday 12th June at 23:59

### Instruction

Write your submission in a Jupyter notebook and write any code in Python 3. You should submit the .ipynb file. The primary document the markers will look at is the .ipynb file.

In your report, include explanations of what you are doing and comments in the code. Where you are making mathematical derivations, show your working.

### Question 1 HMM

In Assignment 2, we introduced a simple HMM to model secondary structure in proteins with states ?,?,? and symbols ?,?,?. Refer to that assignment for more detail about the model and its meaning. It has :

transition probabilities a =

⎛⎝⎜⎜⎜⎜????0.950.03330.05?0.010.91670.05?0.040.050.90⎞⎠⎟⎟⎟⎟

and emission probabilites e =

⎛⎝⎜⎜⎜⎜????0.350.550.10?0.550.150.10?0.100.300.80⎞⎠⎟⎟⎟⎟

### i. Simulate a HMM (2 points)

Write a function simulate_HMM to simulate a state and symbol sequence pair (?,?) of length 100. Print ? and ? to output. Make sure that ? has at least two runs of ?s (you can set the random seed using numpy.random.seed() until you are happy with your sequence). You will use this simulated pair for the remainder of this question.

### ii. The forward algorithm (2 points)

Write a function to implement the forward algorithm to return a forward table matrix ? and the probability ?(?) of the observing sequence ?. You need to print ?(?) and make explicit if it is in a log space. ? is simulated by Question i. For forward algorithm examples, see Lecture 16 & Tutorial 6.

[ ]

### iii. The backward algorithm (2 points)

Write a function to implement the backward algorithm to return a matrix ? and the probability ?(?) of the observing sequence ?. You need to print ?(?) and make explicit if it is in a log space. ? is simulated by Question i. For a backwards algorithm example, see Tutorial 6.

### iv. Test (1 point)

Calculate the matrix P where the (?,?)th entry of P is ?(?,?)=??(??=?|?), the posterior probability of being in state ? at step ?. Notice that ?(?,?) is not a log probability in the below equation. ?(?,?) and ?(?,?) are from the anwsers of above questions.

?(?,?)=??(??=?|?)=??(?)??(?)?(?)

Print the matrix P, and show that your method is correct by checking the column sums of P are all 1.

### v. Plot (1 point)

Make a plot of ??(??=?|?) against the index ? (that is, plot ? on the horizontal axis and ??(??=?|?) on the vertical axis). To embed the plot in your report, include in the first cell of your notebook the command %matplotlib inline

### vi. Report (1 point)

By comparing your plot to the true state sequence ?, discuss whether or not you think you could accurately find the ?-helices in a given protein sequence using this model. E-mail: vipdue@outlook.com  微信号:vipnxx 