Reading questions

If you are unsure about any of these concepts, post a question on the discussion forum!

  • What is the loss function used to train a CRF?
  • What form does the gradient of the (regularized) loss with respect to the parameters take?
  • How is stochastic gradient descent performed?
  • What kind of inference is required for learning?
  • What is pseudolikelihood?
  • What is a maximum-entropy Markov model?