class: center, middle, inverse, title-slide # Ec140 - Variance and Sampling ### Fernando Hoces la Guardia ### 06/27/2022 --- <style type="text/css"> .remark-slide-content { font-size: 30px; padding: 1em 1em 1em 1em; } </style> # Housekeeping - Updated Syllabus - Fixed dates on PS1. Due this Friday 5pm on Gradescope. - Unofficial Course Capture! (second attempt!) - Finish Ch 1 of MM by the end of the week. --- count:true # Today's Lecture - Variance and Standard Deviation - Expectation and Standard Deviation of the Sample Mean - Law of Large Numbers, Central Limit Theorem, and Sampling --- count:true # Variance and Standard Deviation 1/N (Sample) .font80[ .pull-left[ - Random variables -> probabilities -> distributions -> data -> mean/expectation - Let's look at another data set: ] ] .font80[ .pull-right[
] ] --- count:true # Variance and Standard Deviation 1/N (Sample) .font80[ .pull-left[ - Random variables -> probabilities -> distributions -> data -> mean/expectation - Let's look at another data set: $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ - Based on this data set, which one should we watch? - In addition to the mean, what other summary statistic (from data to one number) would you like to communicate? - Let's draw the data ] ] .font80[ .pull-right[
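The chart that belongs here is omitted; in its place, a minimal sketch of how a sample mean is computed from eight ratings. The numbers are illustrative stand-ins (chosen so the mean matches the 84.5 on the left), not the actual data:

```python
# Illustrative ratings for eight episodes -- stand-ins, not the actual data
x = [78, 81, 84, 84, 85, 86, 88, 90]

# The sample mean: add up the observations and divide by the sample size
x_bar = sum(x) / len(x)
print(x_bar)  # 84.5
```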
] ] --- count:true # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ ] ] .font80[ .pull-right[
] ] --- count:true # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ ] ] .font80[ .pull-right[
] ] --- count:true # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-right[
] ] .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ $$ `\begin{equation} \frac{ \sum_{1:8}\left( x - \overline{X} \right) }{8} = ? \\ \frac{ \sum_{1:8}\left( y - \overline{Y} \right) }{8} = ? \end{equation}` $$ ] ] --- count:true # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-right[
] ] .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ $$ `\begin{equation} \frac{ \sum_{1:8}\left( x - \overline{X} \right) }{8} = 0 \\ \frac{ \sum_{1:8}\left( y - \overline{Y} \right) }{8} = 0 \end{equation}` $$ ] ] --- # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ $$ `\begin{equation} \frac{ \sum_{1:8}\left( x - \overline{X} \right) }{8} = 0 \\ \frac{ \sum_{1:8}\left( y - \overline{Y} \right) }{8} = 0 \end{equation}` $$ ] ] .font50[ .pull-right[
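This cancellation is not special to these ratings: for any data set, the raw deviations from its own sample mean sum to zero. A quick check with illustrative numbers (not the actual data):

```python
# Illustrative ratings -- any vector of numbers would behave the same way
x = [78, 81, 84, 84, 85, 86, 88, 90]
x_bar = sum(x) / len(x)

# Deviations above the mean exactly offset deviations below it
total = sum(xi - x_bar for xi in x)
print(total)  # 0.0 (up to floating-point rounding in general)
```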
] ] --- # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ $$ `\begin{equation} \frac{ \sum_{1:8}\left( x - \overline{X} \right) }{8} = 0 \\ \frac{ \sum_{1:8}\left( y - \overline{Y} \right) }{8} = 0 \end{equation}` $$ $$ `\begin{equation} \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} = 36.2 \\ \frac{ \sum_{1:8}\left( y - \overline{Y} \right)^2 }{8} = 171.9 \end{equation}` $$ ] ] .font50[ .pull-right[
.font150[ - These represent the sample variances of HP and GoT ratings - But what about the units? ] ] ] --- # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ $$ `\begin{equation} s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} = 36.2 \\ s^{2}_{Y} = \frac{ \sum_{1:8}\left( y - \overline{Y} \right)^2 }{8} = 171.9 \end{equation}` $$ $$ `\begin{equation} s_{X} = \sqrt{ \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} } = 6 \\ s_{Y} = \sqrt{ \frac{ \sum_{1:8}\left( y - \overline{Y} \right)^2 }{8} } = 13.1 \end{equation}` $$ ] ] .font50[ .pull-right[
.font150[ - Due to a minor technicality we divide by `\(N-1\)` instead of `\(N\)` (not relevant for the course). - `\(s^{2}_{X}\)` and `\(s_{X}\)` correspond to the sample variance and sample standard deviation of a random variable `\(X\)`. ] ] ] --- count: true # Variance and Standard Deviation 2/N (Sample) .font80[ .pull-left[ $$ `\begin{equation} \overline{X} = 84.5 \\ \overline{Y} = 89.2 \end{equation}` $$ $$ `\begin{equation} s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8- 1} = 41.4 \\ s^{2}_{Y} = \frac{ \sum_{1:8}\left( y - \overline{Y} \right)^2 }{8 - 1} = 196.5 \end{equation}` $$ $$ `\begin{equation} s_{X} = \sqrt{ \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8 - 1} } = 6.4 \\ s_{Y} = \sqrt{ \frac{ \sum_{1:8}\left( y - \overline{Y} \right)^2 }{8 - 1} } = 14 \end{equation}` $$ ] ] .font50[ .pull-right[
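The two denominators can be compared directly; a minimal sketch with illustrative stand-in numbers (not the HP or GoT ratings):

```python
# Illustrative ratings -- stand-ins, not the actual data
x = [78, 81, 84, 84, 85, 86, 88, 90]
n = len(x)
x_bar = sum(x) / n

# Sum of squared deviations from the sample mean
ss = sum((xi - x_bar) ** 2 for xi in x)

var_n  = ss / n         # divide by N, as on the earlier slides
var_n1 = ss / (n - 1)   # divide by N-1, the usual sample variance
sd_n1  = var_n1 ** 0.5  # sample standard deviation, back in rating units
```

Dividing by `\(N-1\)` always gives a slightly larger number, which corrects the tendency of the `\(N\)` version to understate the population variance.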
.font150[ - Due to a minor technicality we divide by `\(N-1\)` instead of `\(N\)` (not relevant for the course). - `\(s^{2}_{X}\)` and `\(s_{X}\)` correspond to the sample variance and standard deviation. ] ] ] --- # Variance and Standard Deviation 3/N (Population) Let's focus on the formulas for the mean and sample variance of Harry Potter only. For now, I will continue using `\(N\)` (8) in the denominator of the variance to illustrate the following concept. .font90[ .pull-left[ $$ `\begin{equation} \overline{X} = \frac{ \sum_{1:8}{x} }{8} = 84.5 \\ \end{equation}` $$ $$ `\begin{equation} s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} = 36.2 \\ \end{equation}` $$ ] ] .font80[ .pull-right[ ] ] --- # Variance and Standard Deviation 4/N (Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} } = 84.5 \\ \end{equation}` $$ $$ `\begin{equation} s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} = 36.2 \\ \end{equation}` $$ ] ] .font90[ .pull-right[ .center[ Population ] ] ] --- # Variance and Standard Deviation 4/N (Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} = \sum_{1:8} x \frac{1}{8} } = 84.5 \\ \end{equation}` $$ $$ `\begin{equation} s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} = 36.2 \\ \end{equation}` $$ ] ] .font90[ .pull-right[ .center[ Population ] ] ] --- count:true # Variance and Standard Deviation 4/N (Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} = \sum_{1:8} x \frac{1}{8} = \\ \sum_{1:8} x \times prop(x) } = 84.5 \\ \end{equation}` $$ $$ `\begin{equation} s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} = 36.2 \\ \end{equation}` $$ ] ] .font90[ .pull-right[ .center[ Population ] ] ] --- count:true # Variance and Standard Deviation 4/N
(Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} = \sum_{1:8} x \frac{1}{8} = \\ \sum_{1:8} x \times prop(x) } = 84.5 \\ \end{equation}` $$ $$ `\begin{equation} s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} = 36.2 \\ \end{equation}` $$ ] ] .font90[ .pull-right[ .center[ Population ] $$ `\begin{equation} \color{#FD5F00}{ \mathop{\mathbb{E}}(X)\equiv \sum_{x}x f(x) }\\ \end{equation}` $$ ] ] --- count:true # Variance and Standard Deviation 4/N (Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} = \sum_{1:8} x \frac{1}{8} = \\ \sum_{1:8} x \times prop(x) } = 84.5 \\ \end{equation}` $$ $$ `\begin{equation} \color{#007935}{ s^{2}_{X} = \frac{ \sum_{1:8}\left( x - \overline{X} \right)^2 }{8} } = 36.2 \\ \end{equation}` $$ ] ] .font90[ .pull-right[ .center[ Population ] $$ `\begin{equation} \color{#FD5F00}{ \mathop{\mathbb{E}}(X)\equiv \sum_{x}x f(x) }\\ \end{equation}` $$ ] ] --- count:true # Variance and Standard Deviation 4/N (Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} = \sum_{1:8} x \frac{1}{8} = \\ \sum_{1:8} x \times prop(x) } = 84.5 \\ \end{equation}` $$ $$ `\begin{equation} \color{#007935}{ s^{2}_{X} = \frac{ \sum_{1:8} g(x) }{8} } = 36.2 \\ \end{equation}` $$ ] ] .font90[ .pull-right[ .center[ Population ] $$ `\begin{equation} \color{#FD5F00}{ \mathop{\mathbb{E}}(X)\equiv \sum_{x}x f(x) }\\ \end{equation}` $$ ] ] --- count:true # Variance and Standard Deviation 4/N (Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} = \sum_{1:8} x \frac{1}{8} = \\ \sum_{1:8} x \times prop(x) } \\ \end{equation}` $$ $$ `\begin{equation} \color{#007935}{ s^{2}_{X} = \frac{ \sum_{1:8}{g(x)} }{8} = \sum_{1:8} g(x) \frac{1}{8} = \\ \sum_{1:8} 
g(x) \times prop(x) } \\ \end{equation}` $$ ] ] -- .font90[ .pull-right[ .center[ Population ] $$ `\begin{equation} \color{#FD5F00}{ \mathop{\mathbb{E}}(X)\equiv \sum_{x}x f(x) }\\ \end{equation}` $$ $$ `\begin{equation} \color{#007935}{ \mathop{\mathbb{E}}\left( g(x) \right) = \\ \mathop{\mathbb{E}}\left( (X - E(X))^2 \right) = \sum_{x} (x - E(X))^2 f(x) }\\ \end{equation}` $$ ] ] --- count:true # Variance and Standard Deviation 4/N (Population) .font90[ .pull-left[ .center[ Sample ] $$ `\begin{equation} \color{#FD5F00}{ \overline{X} = \frac{ \sum_{1:8}{x} }{8} = \sum_{1:8} x \frac{1}{8} = \\ \sum_{1:8} x \times prop(x) } \\ \end{equation}` $$ $$ `\begin{equation} \color{#007935}{ s^{2}_{X} = \frac{ \sum_{1:8}{g(x)} }{8} = \sum_{1:8} g(x) \frac{1}{8} = \\ \sum_{1:8} g(x) \times prop(x) } \\ \end{equation}` $$ ] ] .font90[ .pull-right[ .center[ Population ] $$ `\begin{equation} \color{#FD5F00}{ \mathop{\mathbb{E}}(X)\equiv \sum_{x}x f(x) }\\ \end{equation}` $$ $$ `\begin{equation} \color{#007935}{ \mathop{\mathbb{E}}\left( g(x) \right) = \\ \mathop{\mathbb{E}}\left( (X - E(X))^2 \right) = \sum_{x} (x - E(X))^2 f(x) }\\ \end{equation}` $$ Usually `\(E(X)\)` is denoted `\(\mu\)`, so you might see: $$ `\begin{equation} \color{#007935}{ \mathop{\mathbb{E}}\left( ( X - \mu )^2 \right) = \sum_{x} (x - \mu)^2 f(x) }\\ \end{equation}` $$ ] ] --- count:true # Variance and Standard Deviation 5/N (Done!) You now know what the variance and standard deviation are and where they come from! .font200[ $$ `\begin{equation} Var(X) = \sigma^2 = \mathop{\mathbb{E}}\left( ( X - \mu )^2 \right) \\ SD(X) = \sigma = \sqrt{ \mathop{\mathbb{E}}\left( ( X - \mu )^2 \right) } \end{equation}` $$ ] --- # Variance Random variables `\(\color{#e64173}{X}\)` and `\(\color{#9370DB}{Y}\)` share the same population mean, but are distributed differently.
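The same-mean, different-spread picture can be sketched by simulation (the parameter values below are assumptions, chosen only for illustration):

```python
import random

random.seed(3)
n = 100_000

# Same population mean (0), different standard deviations (1 vs. 3)
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 3) for _ in range(n)]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)

print(round(mean(x), 1), round(mean(y), 1))  # both sample means near 0
print(round(var(x), 1), round(var(y), 1))    # sample variances near 1 and 9
```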
<img src="04_sampling_files/figure-html/unnamed-chunk-14-1.svg" style="display: block; margin: auto;" /> --- # Variance ## Rule 1 `\(\mathop{\text{Var}}(X) = 0 \iff X\)` is a constant. - If a random variable never deviates from its mean, then it has zero variance. - If a random variable is always equal to its mean, then it's a (not-so-random) constant. --- # Variance ## Rule 2 For any constants `\(a\)` and `\(b\)`, `\(\mathop{\text{Var}}(aX + b) = a^2\mathop{\text{Var}}(X)\)`. -- ## Example Suppose `\(X\)` is the high temperature in degrees Celsius in Eugene during August. If `\(Y\)` is the temperature in degrees Fahrenheit, then `\(Y = 32 + \frac{9}{5} X\)`. .hi-purple[What is] `\(\color{#9370DB}{\mathop{\text{Var}}(Y)}\)`.hi-purple[?] -- - `\(\mathop{\text{Var}}(Y) = (\frac{9}{5})^2 \mathop{\text{Var}}(X) = \color{#9370DB}{\frac{81}{25} \mathop{\text{Var}}(X)}\)`. --- # Variance ## Rule 3 For constants `\(a\)` and `\(b\)`, $$ \mathop{\text{Var}} (aX + bY) = a^2 \mathop{\text{Var}}(X) + b^2 \mathop{\text{Var}}(Y) + 2ab\mathop{\text{Cov}}(X, Y). $$ -- - If `\(X\)` and `\(Y\)` are uncorrelated, then `\(\mathop{\text{Var}} (X + Y) = \mathop{\text{Var}}(X) + \mathop{\text{Var}}(Y)\)` - If `\(X\)` and `\(Y\)` are uncorrelated, then `\(\mathop{\text{Var}} (X - Y) = \mathop{\text{Var}}(X) + \mathop{\text{Var}}(Y)\)` --- name:sample-mean # Expectation and Variance of the Sample Mean - Time for a subtle but very important change of focus. - Until now we have been talking about the expectation and variance of a random variable. Now we are going to focus on the expectation and variance of the **mean of a collection of random variables**. - Wait. We said last class that the expectation is like the mean. So basically you want to focus on the mean of the mean? What do we even mean (!)? - A combination of random variables is also a random variable (e.g., remember how a Binomial random variable was a summation of Bernoullis?).
In particular, a summation of random variables `\(Y_1, Y_2, Y_3, ..., Y_n\)` is also a random variable, and the sample size is a constant. Hence, `\(\overline{Y}=\frac{ \sum_{n} Y}{n}\)` is also a random variable. --- # Expectation and Variance of the Sample Mean - This is potentially confusing, as before we would have one random variable X, from which we would sample a collection of values `\(\{x_1, x_2, ... , x_n \}\)`, and with this we could compute the mean `\(\overline{X}\)`. - But now we will have to imagine that we do this sampling multiple times. To help with the transition (and because it will also help with future notation), I will use the letter `\(Y_{\text{number } i}\)` to denote random variable number `\(i\)` (where `\(i\)` is used to represent any given number) or `\(Y_{i}\)` for short. - This is hard to imagine if one sample corresponds to one survey that costs millions of dollars and takes months or years to carry out, but think about it as a thought exercise. Believing in the multiverse in this case helps with the thought exercise :) --- # Expectation and Variance of the Sample Mean - Before we start combining random variables, we need to make two important assumptions: **independence** and **identically distributed**. - **Independence:** Two (or more) random variables are independent when knowing one random variable provides no information about the value of the other. A bit more formally, if two random variables `\(X\)` and `\(Y\)` are independent, then `\(P(X=x \& Y=y) = P(X=x)P(Y=y)\)`. A nice shorthand is to think of "independence as multiplication". - **Identically Distributed:** Two (or more) random variables are identically distributed if they have the same probability distribution (or density) function.
As a consequence these random variables have the same expected value, let's call it `\(\mu_{Y}\)`, and the same standard deviation `\(\sigma_{Y}\)`. - A common abbreviation for these two assumptions is to say that a collection of random variables is **i.i.d.** --- # Expectation of the Sample Mean - The expected value of the sample mean `\((\overline{Y})\)` is, at first glance, nothing too surprising: $$ `\begin{equation} \mathop{\mathbb{E}}(\overline{Y}) = \frac{1}{n}\sum \mathop{\mathbb{E}}(Y_i)\\ \mathop{\mathbb{E}}(\overline{Y}) = \frac{1}{n}\sum \mu_{Y} = \frac{n \mu_{Y}}{n}\\ \mathop{\mathbb{E}}(\overline{Y}) = \mu_Y \end{equation}` $$ (The first equality comes from Rules 2 and 3 of expectation. The second equality comes from identical means, and the third from summing `\(n\)` times the same constant) --- # The Standard Deviation of the Sample Mean - The formula for the variance and standard deviation of the sample mean `\((\overline{Y})\)` is less straightforward: $$ Var(\overline{Y}) = \frac{\sigma^{2}_{Y}}{n} $$ $$ SD(\overline{Y}) = \frac{ \sigma_{Y}}{\sqrt{n}} $$ - Unlike the expectation, the standard deviation of the sample mean is not the same as the standard deviation of a single random variable. Moreover, it shrinks (to zero) as the sample size increases. --- # Exact v. Approximate Approaches - We just examined the expectation and variance of the sample mean `\((\overline{Y})\)` using theoretical properties of `\(E()\)` and `\(Var()\)`; these results hold true *regardless* of the sample size `\(n\)`. But at the same time they answer a highly hypothetical question (what is the population mean of the sample mean?). - In addition to this "exact" derivation, we can also ask what happens with `\(\overline{Y}\)` when its sample size `\((n)\)` increases. This "approximate" approach is referred to as the asymptotic properties of `\(\overline{Y}\)` (but either term is fine).
- In econometrics we make extensive use of the following two approximations: --- # Law of Large Numbers (LLN) - Under general conditions of independence (and finite variance), `\(\overline{Y}\)` will be near its expected value `\((\mu_Y)\)` with arbitrarily high probability when `\(n\)` is large `\((\overline{Y} \overset{p}{\to} \mu_{Y})\)` <!-- implications for the real world: - counting - surveying hard questions --> <img src="04_sampling_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" /> - Let's roll some dice in [Seeing Theory](https://seeing-theory.brown.edu/basic-probability/index.html#section2) to get a better idea. --- # Law of Large Numbers (LLN): Observations - In practical terms `\(n\)` doesn't have to be too large. `\(n=25-35\)` tends to be enough. In social sciences we tend to work with much more than that. - As `\(n\)` grows the standard deviation of the sample mean drops to zero. In the example above: `\(SD(\overline{Y_{10}}) = 0.16\)`, `\(SD(\overline{Y_{100}}) = 0.05\)`, `\(SD(\overline{Y_{1000}}) = 0.02\)`, `\(SD(\overline{Y_{10000}}) \approx 0\)`. --- # Central Limit Theorem (CLT) - Under general conditions of independence (and finite variance), the **distribution** of `\(\overline{Y}\)` is approximately `\(N(\mu_{Y}, \frac{\sigma_{Y}^{2}}{n})\)` when `\(n\)` is large. - This is true **for any** type of distribution (not only normal) of the underlying `\(Y_{i}\)`. - This is very hard to believe, so we are going to spend some significant time in [Seeing Theory](https://seeing-theory.brown.edu/probability-distributions/index.html#section3) simulating different scenarios (probably over more than one session, too). - In real life the key assumption is that of independence. If observations are obtained at random, a procedure called *random sampling*, then independence is achieved. - Random sampling is necessary so the LLN and CLT can be used. --- # Acknowledgments [TO DO] - LLN simulation blog - Seeing theory
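--- # Appendix: Simulating the Sample Mean The results above can be sketched with a minimal simulation, using fair die rolls as the underlying `\(Y_i\)` (the sample size and number of repetitions below are arbitrary illustrative choices):

```python
import random

random.seed(1)

n, reps = 30, 20_000      # sample size and number of repeated samples (arbitrary)
mu = 3.5                  # E(Y) for one fair die roll
sigma = (35 / 12) ** 0.5  # SD(Y) for one fair die roll

# Draw `reps` samples of size n; record the sample mean of each
means = [sum(random.randint(1, 6) for _ in range(n)) / n for _ in range(reps)]

grand_mean = sum(means) / reps
sd_of_means = (sum((m - grand_mean) ** 2 for m in means) / reps) ** 0.5

print(round(grand_mean, 2))   # close to mu = 3.5
print(round(sd_of_means, 2))  # close to sigma / sqrt(n), about 0.31
```

Increasing `n` shrinks the spread of `means` like `\(1/\sqrt{n}\)` (the LLN at work), and a histogram of `means` looks approximately normal even though a single die roll is uniform (the CLT).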