You are playing dice with a friend. Each player chooses a number between 1 and 6, and if the die lands on your number, you win! What would be a good number to choose? The die may not be fair: it could be landing on 5 or 6 most of the time, and the spread between the outcomes could be very small. In that case it would be worthwhile picking the high numbers. Parameters such as the mean and the variance are useful because they tell you which values are most likely and how spread out the data are. There is more risk when the variance is high. We will explore how to find these parameters for one variable and also for two or more variables.
Let us look at a simple example of rolling a die. You can draw the sample space for rolling a die:
| Die | X1 = 1 | X2 = 2 | X3 = 3 | X4 = 4 | X5 = 5 | X6 = 6 |
| Frequency | f(x1) | f(x2) | f(x3) | f(x4) | f(x5) | f(x6) |
| Probability | P(X = x1) | P(X = x2) | P(X = x3) | P(X = x4) | P(X = x5) | P(X = x6) |
A fair die follows a uniform distribution. If there are 60 rolls, then the theoretical frequency of each outcome is \(f(x_{i}) = 10\). Furthermore, the probability of each outcome is:
\(P(X = x_{i}) = {1 \over 6}\)
Let us firstly remind ourselves of how to calculate the population mean for one variable.
\( \mu= {\Sigma_{i=1}^{6} x_{i}f(x_{i}) \over N}\)
The probability of each outcome is the following:
\(P(X = x_{i}) = {f(x_{i}) \over N}\)
N is the total frequency. It is calculated by \(N = \Sigma_{i=1}^{6} f(x_{i})\)
Therefore the population mean can be expressed as :
\( \mu= \Sigma_{i=1}^{6} x_{i} P(X = x_{i})\)
This is also called the expected value.
\( \mathbb{E}[X] = \Sigma_{i=1}^{6} x_{i} P(X = x_{i})\)
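As a minimal sketch of the formula above, the expected value can be computed directly from a table of probabilities. The helper name `expected_value` and the loaded-die probabilities are assumptions chosen purely for illustration; exact fractions are used so that no rounding appears.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return the sum of x * P(X = x) for a dict mapping outcome -> probability."""
    return sum(x * p for x, p in pmf.items())

# Fair die: every face has probability 1/6.
fair_die = {x: Fraction(1, 6) for x in range(1, 7)}

# Hypothetical loaded die that favours 5 and 6 (illustrative numbers only).
loaded_die = {1: Fraction(1, 12), 2: Fraction(1, 12), 3: Fraction(1, 12),
              4: Fraction(1, 12), 5: Fraction(1, 3), 6: Fraction(1, 3)}

print(expected_value(fair_die))    # 7/2
print(expected_value(loaded_die))  # 9/2
```

The loaded die has a higher expected value, which matches the intuition that the high numbers are the better choice when the die favours them.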
Let us remind ourselves of how to calculate the population variance for one variable.
\( \sigma^{2}= {\Sigma_{i=1}^{6} [x_{i}-\mu]^{2}f(x_{i}) \over N}\)
The probability of each outcome is the following:
\(P(X = x_{i}) = {f(x_{i}) \over N}\)
Therefore the population variance can be expressed as:
\( \sigma^{2}= \Sigma_{i=1}^{6} [x_{i}-\mu]^{2} P(X = x_{i})\)
This is also an expected value: the expected value of the squared deviation from the mean.
\( \mathbb{E}[[X-\mu]^{2}] = \Sigma_{i=1}^{6} [x_{i}-\mu]^{2} P(X = x_{i})\)
You now have an alternative expression for the variance.
\( \mathbb{Var}[X] = \mathbb{E}[[X-\mu]^{2}] \)
You can also provide another formula for the variance. Firstly, expand the bracket.
\( \mathbb{Var}[X] = \mathbb{E}[X^{2}-2\mu X + \mu^{2}] \)
Let us express this in terms of \(\Sigma\)
\(\mathbb{Var}[X] = \Sigma_{i=1}^{6} [x_{i}^{2}-2\mu x_{i} + \mu^{2}]P(X = x_{i})\)
You can split the terms.
\(\mathbb{Var}[X] = \Sigma_{i=1}^{6} x_{i}^{2}P(X = x_{i})-\Sigma_{i=1}^{6}2\mu x_{i}P(X = x_{i}) + \Sigma_{i=1}^{6}\mu^{2}P(X = x_{i})\)
You can take out the constants.
\(\mathbb{Var}[X] = \Sigma_{i=1}^{6} x_{i}^{2}P(X = x_{i})-2\mu\Sigma_{i=1}^{6} x_{i}P(X = x_{i}) + \mu^{2}\Sigma_{i=1}^{6}P(X = x_{i})\)
You can now express the variance in terms of expected values. Also remember that the total probability is 1, so \(\Sigma_{i=1}^{6}P(X = x_{i}) = 1\).
\(\mathbb{Var}[X] = \mathbb{E}[X^{2}]-2\mu \mathbb{E}[X] + \mu^{2}\)
Recall that \(\mathbb{E}[X] = \mu \), so \(-2\mu \mathbb{E}[X] + \mu^{2} = -2\mu^{2} + \mu^{2} = -\mu^{2}\)
\(\mathbb{Var}[X] = \mathbb{E}[X^{2}]-\mu^{2}\)
Alternatively, if you prefer
\(\mathbb{Var}[X] = \mathbb{E}[X^{2}]-\mathbb{E}[X]^{2}\)
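The two expressions for the variance can be checked against each other numerically. The sketch below assumes a fair six-sided die; the helper name `expectation` is hypothetical and simply evaluates \(\Sigma g(x_{i})P(X = x_{i})\).

```python
from fractions import Fraction

# Fair die: every face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """Return the sum of g(x) * P(X = x) over all outcomes."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expectation(die)

# Definition: Var[X] = E[(X - mu)^2]
var_definition = expectation(die, lambda x: (x - mu) ** 2)

# Alternative formula: Var[X] = E[X^2] - E[X]^2
var_shortcut = expectation(die, lambda x: x ** 2) - mu ** 2

print(var_definition, var_shortcut)  # 35/12 35/12
```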
We can find the covariance for two variables.
Let us first remind ourselves of how to calculate the covariance. Here X takes six values, such as the faces of a die, and Y takes two values, such as the sides of a coin.
The covariance can be expressed as:
\( \mathbb{Cov}[X,Y]= \Sigma_{i=1}^{6} \Sigma_{j=1}^{2}[x_{i}-\mu_{x}][y_{j}-\mu_{y}]P(X = x_{i}, Y = y_{j})\)
This can also be expressed in terms of expected values.
\( \mathbb{Cov}[X,Y]= \mathbb{E}[(X-\mu_{x})(Y-\mu_{y})]\)
Let us find an alternative expression for the covariance. Firstly we will write it in terms of \(\Sigma\).
\( \mathbb{Cov}[X,Y]= \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}-\mu_{x}][y_{j}-\mu_{y}]P(X = x_{i}, Y = y_{j})\)
Now you can expand the expression.
\begin{array}{ll} \mathbb{Cov}[X,Y] = & \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}y_{j}P(X = x_{i}, Y = y_{j}) \\ & -x_{i}\mu_{y}P(X = x_{i}, Y = y_{j}) \\ & - \mu_{x}y_{j}P(X = x_{i}, Y = y_{j}) \\ & +\mu_{x}\mu_{y}P(X = x_{i}, Y = y_{j})] \\ \end{array}
Next we can split up the terms.
\begin{array}{ll} \mathbb{Cov}[X,Y] = & \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} x_{i}y_{j}P(X = x_{i}, Y = y_{j}) \\ & -\Sigma_{i=1}^{6}\Sigma_{j=1}^{2}x_{i}\mu_{y}P(X = x_{i}, Y = y_{j}) \\ & -\Sigma_{i=1}^{6}\Sigma_{j=1}^{2}\mu_{x}y_{j}P(X = x_{i}, Y = y_{j}) \\ & +\Sigma_{i=1}^{6}\Sigma_{j=1}^{2}\mu_{x}\mu_{y}P(X = x_{i}, Y = y_{j}) \\ \end{array}
You can take out the constants and separate the "i"s and the "j"s.
\begin{array}{ll} \mathbb{Cov}[X,Y] = & \Sigma_{i=1}^{6}x_{i}\Sigma_{j=1}^{2}y_{j}P(X = x_{i}, Y = y_{j}) \\ & -\mu_{y}\Sigma_{i=1}^{6}x_{i}\Sigma_{j=1}^{2}P(X = x_{i}, Y = y_{j}) \\ & -\mu_{x}\Sigma_{j=1}^{2}y_{j}\Sigma_{i=1}^{6}P(X = x_{i}, Y = y_{j}) \\ & +\mu_{x}\mu_{y}\Sigma_{i=1}^{6}\Sigma_{j=1}^{2}P(X = x_{i}, Y = y_{j}) \\ \end{array}
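You need to recall some important points. Summing the joint probability over one variable gives the marginal probability of the other, so \(\Sigma_{j=1}^{2}P(X = x_{i}, Y = y_{j}) = P(X = x_{i})\) and \(\Sigma_{i=1}^{6}P(X = x_{i}, Y = y_{j}) = P(Y = y_{j})\). The total probability is \(\Sigma_{i=1}^{6}\Sigma_{j=1}^{2}P(X = x_{i}, Y = y_{j}) = 1\).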
These points can be applied to the covariance.
\( \mathbb{Cov}[X,Y]= \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} x_{i}y_{j}P(X = x_{i}, Y = y_{j}) - \mu_{y}\Sigma_{i=1}^{6}x_{i}P(X = x_{i}) - \mu_{x}\Sigma_{j=1}^{2}y_{j}P(Y = y_{j}) + \mu_{x}\mu_{y}\)
Now, you can express everything in terms of expected values.
\( \mathbb{Cov}[X,Y]= \mathbb{E}[XY] - \mu_{y}\mathbb{E}[X] - \mu_{x}\mathbb{E}[Y] + \mu_{x}\mu_{y}\)
Since \(\mathbb{E}[X] = \mu_{x}\) and \(\mathbb{E}[Y] = \mu_{y}\), each of the two middle terms equals \(-\mu_{x}\mu_{y}\). Finally, we have an alternative expression for the covariance.
\( \mathbb{Cov}[X,Y]= \mathbb{E}[XY] - \mu_{x}\mu_{y}\)
Equally, it can be expressed as:
\( \mathbb{Cov}[X,Y]= \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] \)
Let us use a tangible example such as rolling a die and throwing a coin. This is the sample space.
| Coin/Die | X1 = 1 | X2 = 2 | X3 = 3 | X4 = 4 | X5 = 5 | X6 = 6 |
| Y1 = Head (H) | P(H,1) | P(H,2) | P(H,3) | P(H,4) | P(H,5) | P(H,6) |
| Y2 = Tails (T) | P(T,1) | P(T,2) | P(T,3) | P(T,4) | P(T,5) | P(T,6) |
\(\mathbb{E}[XY] = \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} X_{i}Y_{j}P(X_{i},Y_{j})\)
Since rolling a die and throwing a coin are independent, \(P(X_{i},Y_{j}) = P(X_{i})P(Y_{j})\)
\(\mathbb{E}[XY] = \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} X_{i}Y_{j}P(X_{i})P(Y_{j})\)
Now you split the "i"s and the "j"s
\(\mathbb{E}[XY] = \Sigma_{i=1}^{6}X_{i}P(X_{i})\Sigma_{j=1}^{2} Y_{j}P(Y_{j})\)
You should be able to spot two expected values
\(\mathbb{E}[XY] = \mathbb{E}[X]\mathbb{E}[Y]\)
Substituting this into the alternative expression for the covariance, we can see that the covariance is 0.
\(\mathbb{Cov}[X,Y] = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] = 0\)
This is a property of independence. We have shown that the covariance of two independent variables is 0.
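The result can be checked with a short sketch that builds the joint table for the die and the coin and evaluates \(\mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y]\). Coding heads as 1 and tails as 0 is an assumption made here so that the coin has numeric values; the function name `covariance` and the dependent pair at the end are hypothetical and included only for contrast.

```python
from fractions import Fraction
from itertools import product

# Marginal distributions. Coding heads as 1 and tails as 0 is an assumption
# made here so that products such as x * y are defined.
die = {x: Fraction(1, 6) for x in range(1, 7)}
coin = {1: Fraction(1, 2), 0: Fraction(1, 2)}

# Independence: the joint probability is the product of the marginals.
joint = {(x, y): px * py
         for (x, px), (y, py) in product(die.items(), coin.items())}

def covariance(joint_pmf):
    """Cov[X, Y] = E[XY] - E[X]E[Y] for a dict {(x, y): probability}."""
    e_x = sum(x * p for (x, _), p in joint_pmf.items())
    e_y = sum(y * p for (_, y), p in joint_pmf.items())
    e_xy = sum(x * y * p for (x, y), p in joint_pmf.items())
    return e_xy - e_x * e_y

print(covariance(joint))  # 0

# Hypothetical dependent pair for contrast: Y = 1 when the die shows an even number.
dependent = {(x, 1 if x % 2 == 0 else 0): Fraction(1, 6) for x in range(1, 7)}
print(covariance(dependent))  # 1/4
```

In the second case Y is determined by X, so the joint probability no longer factorises and the covariance is not zero.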
You may need to calculate the expected value of the sample mean or the variance of the sample mean.
\( \bar{X}={\Sigma_{i=1}^{n} X_{i}\over n }\)
Let us firstly remind ourselves of some key properties of expected values:
Property 1
\(\mathbb{E}[X_{1}+X_{2} + ... + X_{n}] = \mathbb{E}[X_{1}]+\mathbb{E}[X_{2}] + ... + \mathbb{E}[X_{n}]\)
You can simply show that:
\(\mathbb{E}[X+Y]= \mathbb{E}[X] + \mathbb{E}[Y] \)
To make it as tangible as possible, let X be the outcome of throwing a die and Y be the outcome of tossing a coin.
\(\mathbb{E}[X+Y] = \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [X_{i}+Y_{j}]P(X_{i},Y_{j})\)
You can expand the expression.
\(\mathbb{E}[X+Y] = \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} X_{i}P(X_{i},Y_{j}) + \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} Y_{j}P(X_{i},Y_{j}) \)
We will split the "i"s and the "j"s
\(\mathbb{E}[X+Y] = \Sigma_{i=1}^{6}X_{i}\Sigma_{j=1}^{2} P(X_{i},Y_{j}) + \Sigma_{j=1}^{2} Y_{j} \Sigma_{i=1}^{6}P(X_{i},Y_{j}) \)
We will recall the marginal probabilities
\(\mathbb{E}[X+Y] = \Sigma_{i=1}^{6}X_{i}P(X = x_{i}) + \Sigma_{j=1}^{2} Y_{j}P(Y = y_{j}) \)
You should be able to spot two expected values
\(\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]\)
By the same procedure we can show that:
\(\mathbb{E}[\Sigma_{i=1}^{n}X_{i}] = \Sigma_{i=1}^{n}\mathbb{E}[X_{i}]\)
\(\mathbb{E}[Σ_{i} X_{i}] = Σ_{i} \mathbb{E}[X_{i}]\)
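Property 1 can be verified for the die and the coin by summing over the joint table and comparing with the sum of the two separate expected values. As a sketch, the coin is again coded as heads = 1 and tails = 0, which is an assumption for illustration only.

```python
from fractions import Fraction
from itertools import product

die = {x: Fraction(1, 6) for x in range(1, 7)}
coin = {1: Fraction(1, 2), 0: Fraction(1, 2)}  # assumption: heads = 1, tails = 0

# Joint table for the independent die and coin.
joint = {(x, y): px * py
         for (x, px), (y, py) in product(die.items(), coin.items())}

# Expected values from the marginal distributions.
e_x = sum(x * p for x, p in die.items())
e_y = sum(y * p for y, p in coin.items())

# Expected value of the sum, taken over the joint distribution.
e_sum = sum((x + y) * p for (x, y), p in joint.items())

print(e_sum, e_x + e_y)  # 4 4
```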
You also need to be convinced that :
Property 2
\( \mathbb{E}[aX] = a\mathbb{E}[X]\)
Here a is a constant and X is a random variable.
\( \mathbb{E}[aX] = \Sigma_{i=1}^{6}ax_{i}P(X = x_{i})\)
Using properties of series you can take out the constant.
\( \mathbb{E}[aX] = a\Sigma_{i=1}^{6}x_{i}P(X = x_{i})\)
This leads to:
\( \mathbb{E}[aX] = a\mathbb{E}[X]\)
Now, you can show that the expected value of the sample mean is the population mean.
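\(\mathbb{E}[\bar{X}] = \mathbb{E}[{\Sigma_{i=1}^{n} X_{i}\over n }]\)
You can take the constant out - property 2
\(\mathbb{E}[\bar{X}] ={1\over n } \mathbb{E}[\Sigma_{i=1}^{n} X_{i}]\)
Now you can use property 1
\(\mathbb{E}[\bar{X}] ={1\over n }\Sigma_{i=1}^{n}\mathbb{E}[X_{i}]\)
Each \(X_{i}\) is drawn from the same population, so \(\mathbb{E}[X_{i}] = \mu\).
\(\mathbb{E}[\bar{X}] ={1\over n }\Sigma_{i=1}^{n}\mu = {n\mu \over n}\)
\(\mathbb{E}[\bar{X}] =\mu\)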
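For a small sample this result can be confirmed by listing every possible sample. The sketch below assumes n = 3 independent rolls of a fair die, so each of the 216 ordered samples has probability \((1/6)^{3}\); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
n = 3  # assumed sample size, kept small so every sample can be enumerated

# Each ordered sample of n independent fair rolls is equally likely.
p_sample = Fraction(1, 6) ** n

# E[sample mean] = sum over all samples of (sample mean) * P(sample)
e_sample_mean = sum(Fraction(sum(rolls), n) * p_sample
                    for rolls in product(faces, repeat=n))

print(e_sample_mean)  # 7/2
```

The answer equals the population mean of a fair die, 7/2, as the derivation predicts.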
Let us firstly remind ourselves of some key properties of the variance:
Property 3
\(\mathbb{Var}[X_{1}+X_{2} + ... + X_{n}] = \mathbb{Var}[X_{1}]+\mathbb{Var}[X_{2}] + ... + \mathbb{Var}[X_{n}]\)
This is only true for independent variables.
You can simply show that:
\(\mathbb{Var}[X+Y]= \mathbb{Var}[X] + \mathbb{Var}[Y] \)
To make it as tangible as possible, let X be the outcome of throwing a die and Y be the outcome of tossing a coin.
\(\mathbb{Var}[X+Y] = \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}- \mathbb{E}[X + Y]]^{2}P(X = x_{i},Y = y_{j})\)
You can expand the expression.
\begin{array}{ll} \mathbb{Var}[X+Y] = & \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]^{2}P(X = x_{i},Y = y_{j}) \\ & - 2\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]\mathbb{E}[X + Y]P(X = x_{i},Y = y_{j}) \\ & + \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} \mathbb{E}[X + Y]^{2}P(X = x_{i},Y = y_{j}) \\ \end{array}
We have split \(\mathbb{Var}[X+Y]\) into three terms.
You can expand the first term to show that:
\(\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]^{2}P(X = x_{i},Y = y_{j}) = \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}^{2}+2x_{i}y_{j}+y_{j}^{2}]P(X=x_{i},Y=y_{j})\)
You can split up the series.
\begin{array}{ll} \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]^{2}P(X = x_{i},Y =y_{j})= & \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} x_{i}^{2} P(X = x_{i},Y =y_{j}) \\ & +2\Sigma_{i=1}^{6} \Sigma_{j=1}^{2} x_{i}y_{j} P(X = x_{i},Y =y_{j}) \\ & +\Sigma_{i=1}^{6} \Sigma_{j=1}^{2} y_{j}^{2} P(X = x_{i},Y =y_{j}) \\ \end{array}
You can express the first part in terms of expected values.
\(\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]^{2}P(X = x_{i},Y = y_{j}) = \mathbb{E}[X^{2}] + 2\mathbb{E}[XY]+\mathbb{E}[Y^{2}]\)
Similarly we can show for the second part:
By property 1, which holds whether or not X and Y are independent:
\(\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]\)
This expected value is a constant, so it can be taken out of the sum.
\begin{array}{l} -2\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]\mathbb{E}[X + Y]P(X = x_{i},Y =y_{j}) = \\ -2\mathbb{E}[X + Y]\Sigma_{i=1}^{6} \Sigma_{j=1}^{2} [x_{i}+y_{j}]P(X = x_{i},Y = y_{j}) \\ \end{array}
This can be expanded.
\begin{array}{l} -2\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]\mathbb{E}[X + Y]P(X = x_{i},Y =y_{j}) = \\ -2\mathbb{E}[X + Y][\Sigma_{i=1}^{6} \Sigma_{j=1}^{2} x_{i}P(X = x_{i},Y = y_{j})+\Sigma_{i=1}^{6} \Sigma_{j=1}^{2} y_{j}P(X=x_{i},Y=y_{j})] \\ \end{array}
Again we need to split the "i"s and the "j"s.
\begin{array}{l} -2\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]\mathbb{E}[X + Y]P(X = x_{i},Y =y_{j}) = \\ -2\mathbb{E}[X + Y][\Sigma_{i=1}^{6} x_{i} \Sigma_{j=1}^{2} P(X = x_{i},Y = y_{j})+\Sigma_{j=1}^{2}y_{j} \Sigma_{i=1}^{6}P(X=x_{i},Y=y_{j})] \\ \end{array}
Also, spot the marginal probabilities.
\begin{array}{l} -2\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} [x_{i}+y_{j}]\mathbb{E}[X + Y]P(X = x_{i},Y =y_{j}) = \\ -2\mathbb{E}[X + Y][\Sigma_{i=1}^{6} x_{i} P(X = x_{i})+\Sigma_{j=1}^{2}y_{j} P(Y=y_{j})] \\ \end{array}
You should also be seeing the expected values.
\(-2\Sigma_{i=1}^{6} \Sigma_{j=1}^{2} [x_{i}+y_{j}]\mathbb{E}[X + Y]P(X = x_{i},Y = y_{j}) = -2(\mathbb{E}[X] + \mathbb{E}[Y])(\mathbb{E}[X] +\mathbb{E}[Y]) \)
This can be expanded to become:
\(-2\Sigma_{i=1}^{6} \Sigma_{j=1}^{2}[x_{i}+y_{j}]\mathbb{E}[X + Y]P(X = x_{i},Y = y_{j}) =-2(\mathbb{E}[X]^{2} + 2\mathbb{E}[X]\mathbb{E}[Y] +\mathbb{E}[Y]^{2}) \)
Finally, we can show for the third part:
\(\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} \mathbb{E}[X + Y]^{2}P(X = x_{i},Y = y_{j}) = \mathbb{E}[X + Y]^{2} \Sigma_{i=1}^{6}\Sigma_{j=1}^{2} P(X = x_{i},Y = y_{j})\)
The constant has been taken out and now you should be able to recall that total probability = 1.
\(\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} \mathbb{E}[X + Y]^{2}P(X = x_{i},Y = y_{j}) = \mathbb{E}[X + Y]^{2}\)
By property 1, \(\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]\)
\(\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} \mathbb{E}[X + Y]^{2}P(X = x_{i},Y = y_{j}) = (\mathbb{E}[X] + \mathbb{E}[Y])^{2}\)
After expansion you can show that:
\(\Sigma_{i=1}^{6}\Sigma_{j=1}^{2} \mathbb{E}[X + Y]^{2}P(X = x_{i},Y = y_{j}) = \mathbb{E}[X]^{2}+ 2\mathbb{E}[X]\mathbb{E}[Y]+\mathbb{E}[Y]^{2}\)
Putting all three parts together we have:
\begin{array}{llll} \mathbb{Var}[X+Y]= & \mathbb{E}[X^{2}] & +2\mathbb{E}[XY] & + \mathbb{E}[Y^{2}]\\ & -2\mathbb{E}[X]^{2} & -4\mathbb{E}[X] \mathbb{E}[Y] & -2\mathbb{E}[Y]^{2} \\ & +\mathbb{E}[X]^{2} & +2\mathbb{E}[X]\mathbb{E}[Y] & +\mathbb{E}[Y]^{2} \\ \end{array}
After simplification:
\(\mathbb{Var}[X+Y]= [\mathbb{E}[X^{2}] - \mathbb{E}[X]^{2}] +[\mathbb{E}[Y^{2}]- \mathbb{E}[Y]^{2}] + 2[\mathbb{E}[XY]-\mathbb{E}[X]\mathbb{E}[Y]] \)
This can be expressed in terms of the variance of X, the variance of Y and the covariance.
\(\mathbb{Var}[X+Y]= \mathbb{Var}[X] + \mathbb{Var}[Y] + 2\mathbb{Cov}[X,Y] \)
For independent variables the covariance is 0.
\(\mathbb{Var}[X+Y]= \mathbb{Var}[X] + \mathbb{Var}[Y] \)
It can be further shown that:
\(\mathbb{Var}[\Sigma_{i=1}^{n} X_{i}]= \Sigma_{i=1}^{n} \mathbb{Var}[X_{i}] \) - property 3
This is only true for independent variables.
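The role of the covariance term is easiest to see with a pair of variables that are not independent. The sketch below uses a hypothetical example, chosen only for illustration: X is a fair die and Y is 1 when the die shows an even number and 0 otherwise. It computes each quantity from the joint table using the \(\mathbb{E}[X^{2}]-\mathbb{E}[X]^{2}\) style of formula, and compares \(\mathbb{Var}[X+Y]\) with \(\mathbb{Var}[X] + \mathbb{Var}[Y] + 2(\mathbb{E}[XY]-\mathbb{E}[X]\mathbb{E}[Y])\).

```python
from fractions import Fraction

# Hypothetical dependent pair: Y = 1 if the die X is even, else 0.
joint = {(x, 1 if x % 2 == 0 else 0): Fraction(1, 6) for x in range(1, 7)}

def expect(g):
    """E[g(X, Y)] taken over the joint table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)

var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2
cov_xy = expect(lambda x, y: x * y) - e_x * e_y

# Variance of the sum, computed directly from its own distribution.
var_sum = expect(lambda x, y: (x + y) ** 2) - expect(lambda x, y: x + y) ** 2

print(var_sum)                       # 11/3
print(var_x + var_y + 2 * cov_xy)    # 11/3
print(var_x + var_y)                 # 19/6
```

Leaving out the covariance term gives 19/6 rather than 11/3, so the shortcut \(\mathbb{Var}[X] + \mathbb{Var}[Y]\) is only safe when the covariance is 0.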
Property 4
\(\mathbb{Var}[aX] = a^{2}\mathbb{Var}[X] \)
Let Y = aX.
\(\mathbb{Var}[Y] = \mathbb{E}[[Y-\mu_{y}]^{2}] = \Sigma_{i=1}^{n} [y_{i}-\mu_{y}]^{2}P(Y = y_{i}) \)
\(\mu_{y} = \mathbb{E}[Y] = \Sigma_{i=1}^{n} y_{i}P(Y = y_{i}) \)
Let us substitute Y = aX into the mean.
\(\mu_{y} = \mathbb{E}[Y] = \Sigma_{i=1}^{n} ax_{i}P(aX = ax_{i}) \)
We next simplify and take out the constant. For \(a \neq 0\), the event \(aX = ax_{i}\) is the same as the event \(X = x_{i}\).
\(\mu_{y} = \mathbb{E}[Y] = a\Sigma_{i=1}^{n} x_{i}P(X = x_{i}) \)
This becomes:
\(\mu_{y} = \mathbb{E}[Y] = a\mathbb{E}[X] = a\mu_{x} \)
Let us substitute Y = aX into the variance.
\(\mathbb{Var}[Y] = \Sigma_{i=1}^{n} [ax_{i}-a\mu_{x}]^{2}P(aX = ax_{i}) \)
Let us further simplify and take out the constant.
\(\mathbb{Var}[Y] = a^{2}\Sigma_{i=1}^{n} [x_{i}-\mu_{x}]^{2}P(X = x_{i}) \)
We have shown that:
\(\mathbb{Var}[aX] = a^2\mathbb{Var}[X]\)
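Property 4 can be checked by scaling every outcome of a fair die by a constant. The value a = 3 and the helper name `variance` are assumptions chosen for this sketch.

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}

def variance(pmf):
    """Var[X] = sum of (x - mu)^2 * P(X = x)."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

a = 3  # illustrative constant

# Distribution of Y = aX: same probabilities, outcomes multiplied by a.
scaled = {a * x: p for x, p in die.items()}

print(variance(scaled), a ** 2 * variance(die))  # 105/4 105/4
```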
Now you can find the variance of the sample mean.
\(\mathbb{Var}[\bar{X}] = \mathbb{Var}[{\Sigma_{i=1}^{n} X_{i}\over n }]\)
You can take the constant out - property 4
\(\mathbb{Var}[\bar{X}] ={1\over n^{2}} \mathbb{Var}[\Sigma_{i=1}^{n} X_{i}]\)
Since the variables are independent - property 3.
\(\mathbb{Var}[\bar{X}] ={1\over n^{2}} \Sigma_{i=1}^{n}\mathbb{Var}[X_{i}]\)
All \(X_{i}\) follow the same distribution, you can call it X.
\(\mathbb{Var}[\bar{X}] ={1\over n^{2}} \Sigma_{i=1}^{n}\mathbb{Var}[X] = {n\mathbb{Var}[X] \over n^{2}}\)
This simplifies to:
\(\mathbb{Var}[\bar{X}] ={\mathbb{Var}[X]\over n}\)
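As a final check, the variance of the sample mean can be computed by enumerating every possible sample and compared with \(\mathbb{Var}[X] / n\). The sketch assumes n = 3 independent rolls of a fair die; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
n = 3  # assumed sample size, small enough to enumerate all 6**n samples

# Population variance of a single fair die.
mu = Fraction(sum(faces), 6)
var_x = sum((x - mu) ** 2 for x in faces) / 6

# Sample mean of every ordered sample; each has probability (1/6)**n.
p_sample = Fraction(1, 6) ** n
means = [Fraction(sum(rolls), n) for rolls in product(faces, repeat=n)]

e_mean = sum(m * p_sample for m in means)
var_mean = sum((m - e_mean) ** 2 * p_sample for m in means)

print(var_mean, var_x / n)  # 35/36 35/36
```

Averaging more rolls shrinks the spread of the sample mean, which is why larger samples give more reliable estimates of the population mean.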