Reading questions
If you are unsure about any of these concepts, post a question on the
discussion forum!
- What is the loss function used to train a CRF?
- What form does the gradient of the (regularized) loss with respect to the parameters take?
- How is stochastic gradient descent performed?
- What kind of inference is required for learning?
- What is pseudolikelihood?
- What is a maximum-entropy Markov model?