Neural Networks II

00:00:00 Neural Networks II
00:01:09 Mini-batch stochastic gradient descent
00:03:55 Finding an effective learning rate
00:06:15 Using a learning schedule
00:07:35 Complex loss surfaces and local minima
00:09:12 Adding momentum to gradient descent
00:12:50 Adaptive optimizers (RMSProp and Adam)
00:15:08 Local minima are rarely a problem
00:15:21 Activation functions (sigmoid, tanh, and ReLU)
00:19:35 Weight initialization techniques (Xavier/Glorot and He)
00:21:15 Feature scaling (normalization and standardization)
00:23:28 Batch normalization for training stability
00:28:26 Regularization (early stopping, L1, L2, and dropout)
00:33:11 DEMO: building a basic deep learning model for NLP
00:56:19 Deep learning is about learning representations
00:58:18 Sensible defaults when building deep learning models

References and Links