class: title-slide <br> <br> .right-panel[ <br> # Estimation ### Dr. Babak Shahbaba ] --- ### Parameter estimation - We are interested in population mean and population variance, denoted as `\(\mu\)` and `\(\sigma^{2}\)` respectively, of a random variable. - These quantities are unknown in general. - We refer to these unknown quantities **parameters**. - In the previous lecture, you learned about statistical methods for parameter **estimation**. - Estimation refers to the process of guessing the unknown value of a parameter (e.g., population mean) using the observed data. --- ### Point estimation vs. interval estimation - Sometimes we only provide a single value as our estimate: sample mean for population mean and sample variance for population variance. - This is called **point estimation**. - We use `\(\hat{\mu}\)` and `\(\hat{\sigma}^{2}\)` to denote the point estimates for `\(\mu\)` and `\(\sigma^{2}\)`. - Point estimates do not reflect our uncertainty. - To address this issue, we can present our estimates in terms of a range of possible values (as opposed to a single value). - This is called **interval estimation**. --- ### Confidence intervals for the population mean - When the population variance `\(\sigma^{2}\)` is known, the 95% confidence interval for `\(\mu\)` is obtained as follows: `$$\begin{equation*} \bigl[\bar{x} - 2 \times\sigma/ \sqrt{n},\ \bar{x} + 2 \times\sigma/\sqrt{n}\,\bigr] \end{equation*}$$` - Or in general, we use the `\(z\)`-critical value for the confidence level of `\(c\)`: `$$\begin{eqnarray*} [\bar{x} - z_{crit}\times \sigma / \sqrt{n}, \ \bar{x} + z_{crit} \times \sigma / \sqrt{n}] \end{eqnarray*}$$` - To find the `\(z\)`-critical value given confidence level `\(c\)`, we use the standard normal distribution to find the value whose upper tail probability is `\((1-c)/2\)`. --- --- ### Interpretation of confidence interval <img src="img/CI_SBP.png" width="40%" style="display: block; margin: auto;" /> --- ### Standard error - When the population variance, `\(\sigma^{2}\)`, is unknown (almost always), we need to estimate `\(\sigma^{2}\)` along with the population mean `\(\mu\)`. - For this, we use the sample variance `\(s^{2}\)`. - As a result, the standard deviation for the sample mean is estimated to be `\(s/\sqrt{n}\)`. - We refer to `\(s/\sqrt{n}\)` as the **standard error** of the sample mean. --- ### Confidence Interval When the Population Variance Is Unknown - To find confidence intervals for the population mean when the population variance is unknown, we follow similar steps as described above, but - instead of `\(\sigma/\sqrt{n}\)` we use `\(\mathit{SE} = s/\sqrt{n}\)`, - instead of `\(z_{\mathrm{crit}}\)` based on the standard normal distribution, we use `\(t_{\mathrm{crit}}\)` obtained from a `\(t\)`-distribution with `\(n-1\)` degrees of freedom. - The confidence interval for the population mean at `\(c\)` confidence level is `$$\begin{equation*} \bigl[\bar{x} - t_{\mathrm{crit}}\times s / \sqrt{n}, \ \bar{x} + t_{\mathrm{crit}} \times s / \sqrt{n}\,\bigr], \end{equation*}$$` --- ### Margin of error - We can write the confidence interval as `$$\begin{equation*} \bar{x} \pm t_{\mathrm{crit}}\times SE \end{equation*}$$` - The term `\(t_{\mathrm{crit}}\times SE\)` is called the **margin of error** for the given confidence level. - It is common to present interval estimates for a given confidence level as `$$\begin{equation*} \textrm{Point estimate} \pm\textrm{Margin of error.} \end{equation*}$$`