Reading questions

If you are unsure about any of these concepts, post a question on the discussion forum!

  • What is the loss function used to train a neural network for classification?
  • What is the chain rule?
  • What is backpropagation?
  • What is a flow graph?
  • What are the convergence conditions for (stochastic) gradient descent?
  • What is Newton's method?
  • What is gradient checking based on the finite difference approximation?
  • What is the best way to initialize the parameters of a feedforward neural network?
  • What is early stopping?
  • What is weight decay?
  • What is a training epoch?
  • What is momentum?
  • What is the time-constant of decay for the learning rate (or decrease constant in assignment #1)?
  • How do you perform a grid search over the hyper-parameters?
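As a concrete illustration of gradient checking with the finite difference approximation, here is a minimal sketch. It compares an analytic gradient against a central-difference estimate for a simple squared-error loss on a linear model; the loss and model are illustrative choices, not tied to any particular assignment.

```python
import numpy as np

def loss(w, x, y):
    # Squared-error loss for a linear model: L = 0.5 * (w.x - y)^2
    return 0.5 * (np.dot(w, x) - y) ** 2

def analytic_grad(w, x, y):
    # Analytic gradient: dL/dw = (w.x - y) * x
    return (np.dot(w, x) - y) * x

def numeric_grad(f, w, eps=1e-6):
    # Central finite differences: (f(w + eps*e_i) - f(w - eps*e_i)) / (2*eps)
    g = np.zeros_like(w)
    for i in range(w.size):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        g[i] = (f(wp) - f(wm)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
w = rng.normal(size=3)
x = rng.normal(size=3)
y = 1.0

ga = analytic_grad(w, x, y)
gn = numeric_grad(lambda w_: loss(w_, x, y), w)
# Relative error between the two gradients; a correct implementation
# should give a very small value (limited only by floating-point precision).
rel_err = np.abs(ga - gn).max() / (np.abs(ga).max() + np.abs(gn).max())
print(rel_err)
```

The same check applies to a full network: evaluate the loss twice per parameter with a small perturbation and compare against the backpropagated gradient.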
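To illustrate a grid search over hyper-parameters, here is a minimal sketch: it enumerates the Cartesian product of candidate values and keeps the configuration with the lowest validation error. The grid values and the `validation_error` function are stand-ins; in practice you would train a model for each configuration and measure its error on a held-out validation set.

```python
from itertools import product

# Hypothetical grid of candidate hyper-parameter values (illustrative only).
grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "weight_decay": [0.0, 1e-4, 1e-2],
}

def validation_error(config):
    # Stand-in for training a model and measuring validation error;
    # a made-up smooth function keeps the example runnable.
    return (config["learning_rate"] - 0.01) ** 2 + config["weight_decay"]

best = None
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    err = validation_error(config)
    if best is None or err < best[0]:
        best = (err, config)

print(best[1])  # -> {'learning_rate': 0.01, 'weight_decay': 0.0}
```

Note that the number of configurations grows multiplicatively with each hyper-parameter, which is why grid search is usually restricted to a few values per axis.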