# Forward Prop
Forward prop is fairly straightforward. The important thing to keep in mind is the matrix dimensions.
### Matrix Dimensions
The weights matrix for layer $l$ has dimensions (size of current layer x size of previous layer):
$w \in \boldsymbol{M}_{n^{l}\times n^{l-1}}(\mathbb{R})$
The bias matrix has dimensions:
$b \in \boldsymbol{M}_{n^l\times 1}(\mathbb{R})$
The output of each layer is a matrix of size $n^l\times m$, where $m$ is the number of samples.
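As a quick sanity check on these shapes, here is a minimal NumPy sketch of one layer's forward pass (the layer sizes and the sigmoid activation are arbitrary choices for illustration):

```python
import numpy as np

# Hypothetical sizes: n_prev = 4 inputs, n_l = 3 units, m = 5 samples
n_prev, n_l, m = 4, 3, 5

W = np.random.randn(n_l, n_prev)     # (n^l x n^{l-1})
b = np.zeros((n_l, 1))               # (n^l x 1)
A_prev = np.random.randn(n_prev, m)  # previous layer's output, (n^{l-1} x m)

Z = W @ A_prev + b         # b broadcasts across the m columns -> (n^l x m)
A = 1 / (1 + np.exp(-Z))   # sigmoid activation, same shape (n^l x m)

print(W.shape, b.shape, Z.shape, A.shape)  # (3, 4) (3, 1) (3, 5) (3, 5)
```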
### Gradient Descent in a Neural Network
Backpropagation computes the gradients using the chain rule from calculus, similar to how it was done for logistic regression; gradient descent then uses those gradients to update the weights and biases.
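As a hedged sketch of what one update step looks like for a single sigmoid output layer with binary cross-entropy loss (the same setup as logistic regression; the sizes and learning rate are arbitrary illustration values):

```python
import numpy as np

np.random.seed(0)
n_prev, n_l, m = 4, 1, 5
W = np.random.randn(n_l, n_prev) * 0.01
b = np.zeros((n_l, 1))
A_prev = np.random.randn(n_prev, m)
Y = np.random.randint(0, 2, size=(n_l, m))  # binary labels

# Forward pass
Z = W @ A_prev + b
A = 1 / (1 + np.exp(-Z))

# Backward pass: for sigmoid + binary cross-entropy, the chain rule
# collapses dL/dA * dA/dZ to (A - Y), exactly as in logistic regression
dZ = A - Y
dW = (dZ @ A_prev.T) / m                   # same shape as W: (n^l x n^{l-1})
db = dZ.sum(axis=1, keepdims=True) / m     # same shape as b: (n^l x 1)

# Gradient descent update (learning rate is an arbitrary choice here)
lr = 0.1
W -= lr * dW
b -= lr * db
```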
### Initialization of a Neural Network
This topic will be covered in more detail later. The key point for now: don't initialize the weights with zeros. If all weights start at zero, every unit in a layer computes the same output and receives the same gradient, so they never learn different features. Biases, however, can safely be initialized to zeros.
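A minimal sketch of this, assuming small Gaussian initialization with a 0.01 scale (a common simple choice; better schemes come up later):

```python
import numpy as np

def initialize(n_prev, n_l):
    # Small random weights break the symmetry: with all-zero weights,
    # every unit in the layer would compute the same output and receive
    # the same gradient, so they would never differentiate
    W = np.random.randn(n_l, n_prev) * 0.01
    b = np.zeros((n_l, 1))  # zeros are fine for biases
    return W, b

W, b = initialize(4, 3)
print(W.shape, b.shape)  # (3, 4) (3, 1)
```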