# Bayesian Inference
Also called "probabilistic programming" because everything in Bayesian inference is a probability distribution.
$
Pr(A|B) = \frac{Pr(B|A)Pr(A)}{Pr(B)}
$
https://en.wikipedia.org/wiki/Bayes%27_theorem
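A quick worked example of Bayes' theorem applied to a hypothetical diagnostic test (all numbers below are made up for illustration):

```python
# Hypothetical diagnostic-test example of Bayes' theorem.
p_disease = 0.01            # Pr(A): prior probability of disease
p_pos_given_disease = 0.95  # Pr(B|A): test sensitivity
p_pos_given_healthy = 0.05  # false-positive rate

# Pr(B): total probability of a positive test (law of total probability)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Pr(A|B) = Pr(B|A) * Pr(A) / Pr(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # ~0.16: weaker evidence than the 95% sensitivity suggests
```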
$
Pr(\theta|y) \propto \left(\prod_{i=1}^{N}Pr(y_i|\theta)\right)Pr(\theta)
$
where $\theta$ is the unknown and $y$ is the data
In English: the probability of $\theta$ given $y$ is proportional to what we know about $\theta$ before we observe the data (the **prior**, $Pr(\theta)$) times the information from the data (the **likelihood**, $Pr(y_i|\theta)$).
$Pr(\theta|y)$ is called the **posterior**.
$
Pr(\theta|y) = \frac{\left(\prod_{i=1}^{N}Pr(y_i|\theta)\right)Pr(\theta)}{\int_\theta \left(\prod_{i=1}^N Pr(y_i|\theta)\right)Pr(\theta)\,d\theta}
$
The denominator is the numerator integrated over all values of the unknown $\theta$; it is called the **marginal likelihood**. It is the tough part that makes Bayesian inference hard: $\theta$ can be high-dimensional, and the integral usually cannot be computed analytically.
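One way to sidestep the integral is to approximate it numerically. Below is a minimal sketch, assuming a hypothetical coin-flip (Bernoulli) model with a flat prior, that computes the posterior on a discrete grid of $\theta$ values:

```python
import numpy as np

y = np.array([1, 0, 1, 1, 0, 1, 1, 1])      # hypothetical data: 1 = heads
theta_grid = np.linspace(0.001, 0.999, 1000)  # grid of candidate theta values

prior = np.ones_like(theta_grid)  # flat prior Pr(theta)

# likelihood: product over observations of Pr(y_i | theta)
likelihood = np.array([np.prod(theta**y * (1 - theta)**(1 - y))
                       for theta in theta_grid])

numerator = likelihood * prior
d_theta = theta_grid[1] - theta_grid[0]
# discrete approximation of the denominator (the marginal likelihood)
posterior = numerator / (numerator.sum() * d_theta)
```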
### Bayes in Python
PyMC3 (the PyMC project dates back to 2003), built on Theano.
pymc.io
Fitting any model with the Bayesian approach involves three steps (see the PyMC3 sketch after this list):
1. specify the model
2. calculate posterior distribution
3. check the model
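A minimal sketch of these three steps in PyMC3, assuming hypothetical coin-flip data and a Beta prior (not from the original notes):

```python
import numpy as np
import pymc3 as pm

data = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # hypothetical coin flips

# 1. specify the model: prior + likelihood
with pm.Model() as model:
    theta = pm.Beta('theta', alpha=1.0, beta=1.0)  # prior Pr(theta)
    y = pm.Bernoulli('y', p=theta, observed=data)  # likelihood Pr(y_i|theta)

    # 2. calculate the posterior distribution (approximated via MCMC sampling)
    trace = pm.sample(2000, tune=1000)

# 3. check the model: inspect the chains and posterior summaries
pm.traceplot(trace)
print(pm.summary(trace))
```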
**Likelihood principle** - all the information relevant to the unknown parameters in your model is contained in the likelihood function
**Law of likelihood** - the extent to which the evidence supports one parameter value or hypothesis over another is given by the ratio of their likelihoods (the likelihood ratio, or Bayes factor)
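A small sketch of the law of likelihood, using the same hypothetical Bernoulli data to compare two candidate values of $\theta$:

```python
import numpy as np
from scipy import stats

y = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # hypothetical coin flips

def likelihood(theta, y):
    # product over observations of Pr(y_i | theta) for a Bernoulli model
    return np.prod(stats.bernoulli.pmf(y, theta))

# likelihood ratio comparing theta = 0.7 against theta = 0.5
ratio = likelihood(0.7, y) / likelihood(0.5, y)
print(ratio)  # ~2.7 > 1: the data favor theta = 0.7 over theta = 0.5
```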