## Maximum Likelihood Estimation
Maximum likelihood estimation (MLE) follows from the assumption that our data consists of independent and identically distributed (i.i.d.) observations from a population. Our goal is to find the parameter value θ that maximizes the likelihood of observing our data.
The likelihood is the joint probability (or density) of observing the data given a specific value of the parameter.
It is often convenient to work with the log-likelihood instead of the likelihood, both for computational simplicity and for numerical stability.
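In standard notation, for i.i.d. observations $x_1, \dots, x_n$ with density $f(x; \theta)$:

$$
L(\theta) = \prod_{i=1}^{n} f(x_i; \theta), \qquad
\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta), \qquad
\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} \ell(\theta).
$$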
The MLE can be shown to be a consistent estimator, but it may be biased (e.g., the MLE of the variance of a normal distribution). Operationally it can be computationally expensive, but it has a useful invariance property: the MLE of any function of the parameters $g(\theta)$ is $g(\hat{\theta})$, i.e., the MLE is invariant to transformations of the parameters.
For a normal distribution, MLE and the method of moments (MoM) give the same results.
https://www.statlect.com/fundamentals-of-statistics/normal-distribution-maximum-likelihood
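As derived at the link above, the normal MLEs have closed forms, and the variance estimator illustrates the bias mentioned earlier:

$$
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{\mu})^2, \qquad
\mathbb{E}[\hat{\sigma}^2] = \frac{n-1}{n}\,\sigma^2 \ne \sigma^2.
$$

By the invariance property, the MLE of the standard deviation is $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$, and both estimators coincide with the method-of-moments estimators.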
![[Pasted image 20220130194405.png]]
## Maximum Likelihood in Python
Numerically, the MLE is found by optimization: we maximize the likelihood with respect to the distribution parameters, in practice by maximizing the log-likelihood for numerical stability. `scipy.stats.norm` has a `fit` method that performs MLE:
```python
from scipy.stats import norm
from sklearn.datasets import make_blobs

# Make 1D Gaussian data with a given number of clusters
x, x_label = make_blobs(n_samples=1000, n_features=1, centers=3,
                        cluster_std=0.15, random_state=0)

# Sample statistics for the first cluster
x[x_label == 0].mean()  # --> 0.9726
x[x_label == 0].std()   # --> 0.14658 (.std() defaults to ddof=0, which is the MLE)

# norm.fit returns the MLEs of loc (mean) and scale (std),
# matching the sample mean and biased sample std above
loc, scale = norm.fit(x[x_label == 0])  # loc --> 0.9726, scale --> 0.14658
```
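For distributions without a closed-form MLE, the same fit can be done by direct numerical optimization of the negative log-likelihood. A minimal sketch, assuming the `x` and `x_label` arrays from the block above and using `scipy.optimize.minimize` (the names `neg_log_likelihood` and `data` are just illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_likelihood(params, data):
    """Negative log-likelihood of a normal distribution."""
    loc, scale = params
    if scale <= 0:  # keep the optimizer inside the valid parameter region
        return np.inf
    return -norm.logpdf(data, loc=loc, scale=scale).sum()

data = x[x_label == 0].ravel()
result = minimize(neg_log_likelihood, x0=[0.0, 1.0],
                  args=(data,), method="Nelder-Mead")
result.x  # --> approximately [0.9726, 0.14658], matching norm.fit
```

Nelder-Mead is used here because it tolerates the `np.inf` penalty for invalid scales; a gradient-based method would need a smooth reparameterization (e.g., optimizing `log(scale)`).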
[[Gaussian Mixture Model]]
## Non-Parametric Methods
[[Kernel Density Estimation]]