## Maximum Likelihood Estimation

Maximum likelihood estimation (MLE) follows from the assumption that our data consists of independent and identically distributed observations from a population. The goal is to find the parameter value θ that maximizes the likelihood of observing our data, i.e. the joint probability of the data given a specific parameter value. The likelihood is often converted to the log-likelihood for computational simplicity.

MLE can be shown to be a consistent estimator, but it may be biased (e.g. the MLE of the variance of a normal distribution). It can be computationally expensive, but it has a useful property: the MLE of any function of the parameters is that function applied to the MLE of the parameters, i.e. MLE is invariant to transformations.

For a normal distribution, MLE and the method of moments give the same results: https://www.statlect.com/fundamentals-of-statistics/normal-distribution-maximum-likelihood

![[Pasted image 20220130194405.png]]

# Maximum Likelihood in Python

MLE is computed by optimizing the likelihood function over the distribution parameters; in practice the log-likelihood is maximized instead, for numerical stability. `scipy.stats.norm` has a `fit` method that performs MLE (a sketch of the same fit done by direct optimization is at the end of this note):

```python
from scipy.stats import norm
from sklearn.datasets import make_blobs

# Make 1D Gaussian data with a given number of clusters
x, x_label = make_blobs(n_samples=1000, n_features=1, centers=3,
                        cluster_std=0.15, random_state=0)

# Sample statistics of the first cluster
x[x_label == 0].mean()  # --> 0.9726
x[x_label == 0].std()   # --> 0.14658

# MLE of the Gaussian parameters matches the sample mean and std
loc, scale = norm.fit(x[x_label == 0])  # loc --> 0.9726, scale --> 0.14658
```

[[Gaussian Mixture Model]]

# Non-Parametric Methods

[[Kernel Density Estimation]]
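# MLE by Direct Optimization

The `norm.fit` call above hides the optimization step. As a minimal sketch (assuming the same `make_blobs` data as above; the helper name `neg_log_likelihood` and the choice of Nelder-Mead are my own), the Gaussian parameters can be recovered by minimizing the negative log-likelihood with `scipy.optimize.minimize`:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm
from sklearn.datasets import make_blobs

# Same synthetic data as in the note above (assumption)
x, x_label = make_blobs(n_samples=1000, n_features=1, centers=3,
                        cluster_std=0.15, random_state=0)
data = x[x_label == 0].ravel()

def neg_log_likelihood(params):
    """Negative Gaussian log-likelihood; minimizing it maximizes the likelihood."""
    loc, scale = params
    if scale <= 0:  # keep the scale parameter in its valid range
        return np.inf
    return -np.sum(norm.logpdf(data, loc=loc, scale=scale))

# Nelder-Mead is derivative-free and tolerates the inf penalty above
result = minimize(neg_log_likelihood, x0=[0.0, 1.0], method="Nelder-Mead")
loc_mle, scale_mle = result.x

# Should agree (up to optimizer tolerance) with scipy's built-in MLE
loc_fit, scale_fit = norm.fit(data)
print(loc_mle, scale_mle)  # ~0.9726, ~0.14658
print(loc_fit, scale_fit)
```

For the normal distribution the optimum also has a closed form (the sample mean and standard deviation), which is why the fitted values match the sample statistics computed earlier.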