Variance
In probability and statistics, variance is the expected value of the squared deviation of a random variable from its mean value. Informally, variance measures how far a set of (random) numbers is spread out from its mean value.
Variance is equal to the square of the standard deviation, which is another central measure of spread.
Variance is symbolically represented by σ², s², or Var(X).
The formula for variance is given by:
\(\mathrm{Var}(X) = E[(X - \mu)^{2}]\)
Definition
Variance is a measure of how data points differ from the mean. In layman's terms, variance is a measure of how far a set of data (numbers) is spread out from its mean (average) value.
Variance is the expected squared deviation of the values from their mean. It is therefore directly tied to the standard deviation of the given data set, which is its positive square root.
The larger the variance, the more the data is scattered about its mean; the smaller the variance, the less it is scattered. For this reason, variance is called a measure of the spread of data about the mean.
For the purpose of solving questions, the formula for variance is given by:
\(\mathrm{Var}(X) = E[(X - \mu)^{2}]\)
In words, variance is the expectation of the squared deviation of a random variable from its mean value. Here,
X = Random variable
μ = E(X), the mean of X, so the above equation may also be expressed as,
\(\mathrm{Var}(X) = E[(X - E(X))^{2}]\)
\(\mathrm{Var}(X) = E[X^{2} - 2X\,E(X) + (E(X))^{2}]\)
\(\mathrm{Var}(X) = E(X^{2}) - 2\,E(X)\,E(X) + (E(X))^{2}\) (by linearity of expectation, since E(X) is a constant)
\(\mathrm{Var}(X) = E(X^{2}) - (E(X))^{2}\)
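The identity derived above can be checked numerically. The following is a minimal sketch, assuming a fair six-sided die as the random variable (the die, the `expectation` helper, and the variable names are illustrative assumptions, not part of the original text). Exact fractions are used so that the two forms can be compared without rounding error.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)


def expectation(func):
    """Return E[func(X)] for the discrete distribution defined above."""
    return sum(prob * func(x) for x in outcomes)


mean = expectation(lambda x: x)                            # E(X)
var_definition = expectation(lambda x: (x - mean) ** 2)    # E[(X - E(X))^2]
var_shortcut = expectation(lambda x: x ** 2) - mean ** 2   # E(X^2) - (E(X))^2

print(mean, var_definition, var_shortcut)  # prints: 7/2 35/12 35/12
```

Both forms give the same value, 35/12, as the derivation predicts.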
The variance of a random variable is also equal to the covariance of that variable with itself. Symbolically,
Var(X) = Cov(X, X)
Formula
As we know already, the variance is the square of standard deviation, i.e.,
Variance = (Standard deviation)² = σ²
Hence, squaring each of the following standard deviation formulas gives the corresponding variance:
Population standard deviation σ = \(\sqrt{\frac{\sum (X-\mu )^{2}}{N}}\) and
Sample standard deviation s = \(\sqrt{\frac{\sum (x-\overline{x})^{2}}{n-1}}\)
Where X (or x) = Value of Observations
μ = Population mean of all Values
n = Number of observations in the sample set
\(\bar{x}\) = Sample mean
N = Total number of values in the population
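As a rough illustration of the difference between the population and sample formulas, the sketch below uses Python's standard `statistics` module, where `pvariance` and `pstdev` divide by N and `variance` and `stdev` divide by n − 1. The data set is a made-up example chosen for round numbers, not taken from the text.

```python
import statistics

# Assumed example data (not from the article); its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)   # sum of squared deviations / N
pop_sd = statistics.pstdev(data)       # square root of the population variance
samp_var = statistics.variance(data)   # sum of squared deviations / (n - 1)
samp_sd = statistics.stdev(data)       # square root of the sample variance

print("population:", pop_var, pop_sd)
print("sample:    ", samp_var, samp_sd)
```

For this data the squared deviations sum to 32, so the population variance is 32/8 = 4 (standard deviation 2), while the sample variance is 32/7, which is slightly larger because of the smaller divisor.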
Properties
The variance Var(X) of a random variable X has the following properties.
- Var(X + C) = Var(X), where C is a constant.
- Var(CX) = C²·Var(X), where C is a constant.
- Var(aX + b) = a²·Var(X), where a and b are constants.
- If X₁, X₂, …, Xₙ are n independent random variables, then
Var(X₁ + X₂ + … + Xₙ) = Var(X₁) + Var(X₂) + … + Var(Xₙ).
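The properties above can be spot-checked on a small example. This is only a sketch under stated assumptions: X is taken to be a fair six-sided die and Y a fair four-sided die (both hypothetical), the constants a and b are arbitrary, and independence is modelled by treating every (x, y) pair as equally likely. Exact fractions keep the comparisons free of rounding error.

```python
from fractions import Fraction
from itertools import product
from statistics import pvariance

# Assumed example: equally likely outcomes of two independent fair dice.
x = [Fraction(v) for v in (1, 2, 3, 4, 5, 6)]   # six-sided die
y = [Fraction(v) for v in (1, 2, 3, 4)]         # four-sided die
a, b = 3, 10                                    # arbitrary constants

var_x = pvariance(x)
var_y = pvariance(y)

shift_ok = pvariance([v + b for v in x]) == var_x                 # Var(X + C) = Var(X)
scale_ok = pvariance([a * v for v in x]) == a ** 2 * var_x        # Var(CX) = C^2 Var(X)
affine_ok = pvariance([a * v + b for v in x]) == a ** 2 * var_x   # Var(aX + b) = a^2 Var(X)

# Independence: every (x, y) pair occurs with the same probability.
sums = [xi + yi for xi, yi in product(x, y)]
sum_ok = pvariance(sums) == var_x + var_y                         # Var(X + Y) = Var(X) + Var(Y)

print(shift_ok, scale_ok, affine_ok, sum_ok)
```

A check on one example does not prove the properties, but it shows how each rule plays out on concrete numbers.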
Now let’s have a look at the relationship between Variance and Standard Deviation.
Variance and Standard Deviation
Standard deviation is the positive square root of the variance. The symbols σ and s are used to represent the population and sample standard deviations, respectively.
Like variance, standard deviation is a measure of how spread out the data is. It is obtained by taking the square root of the variance of the data set.
How to Calculate Variance
Variance can be calculated easily by following the steps given below:
- Find the mean of the given data set, i.e., calculate the average of the given values.
- Subtract the mean from each value and square each difference.
- Find the average of these squared differences; the result is the variance.
Say x₁, x₂, x₃, x₄, …, xₙ are the given values.
The mean of all these values is:
x̄ = (x₁ + x₂ + x₃ + … + xₙ)/n
Now subtract the mean from each value of the given data set and square each difference.
(x₁ − x̄)², (x₂ − x̄)², (x₃ − x̄)², …, (xₙ − x̄)²
Find the average of the above values to get the (population) variance.
Var(X) = [(x₁ − x̄)² + (x₂ − x̄)² + (x₃ − x̄)² + … + (xₙ − x̄)²]/n
Hence, the variance is calculated.
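The three steps above translate almost directly into code. The following is a minimal sketch; the function name `population_variance` and the sample input are illustrative assumptions. It divides by n, so it computes the population variance.

```python
def population_variance(values):
    """Compute the variance by averaging the squared deviations from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one value")

    # Step 1: find the mean of the data set.
    mean = sum(values) / n

    # Step 2: subtract the mean from each value and square the difference.
    squared_deviations = [(v - mean) ** 2 for v in values]

    # Step 3: average the squared deviations.
    return sum(squared_deviations) / n


print(population_variance([1, 2, 3, 4, 5]))  # prints: 2.0
```

For the input 1, 2, 3, 4, 5 the mean is 3, the squared deviations are 4, 1, 0, 1, 4, and their average is 10/5 = 2.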
Example of Variance
Let’s say the heights (in mm) are 610, 450, 160, 420, 310.
The mean and variance are interrelated: the first step is to find the mean, which is done as follows,
Mean = (610 + 450 + 160 + 420 + 310)/5 = 390
So the mean is 390 mm.
To calculate the variance, compute the difference of each value from the mean, square it, and then find the average of these squares.
So for this particular case the variance is:
= (220² + 60² + (−230)² + 30² + (−80)²)/5
= (48400 + 3600 + 52900 + 900 + 6400)/5
= 112200/5
Final answer: Variance = 22440 mm²
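The heights example can be reproduced with Python's standard `statistics` module, and taking the square root ties the result back to the standard deviation. This is a sketch only; the variable names are assumptions.

```python
import math
import statistics

heights_mm = [610, 450, 160, 420, 310]

mean = statistics.mean(heights_mm)            # 390 mm
variance = statistics.pvariance(heights_mm)   # 22440 mm^2 (population variance)
std_dev = math.sqrt(variance)                 # roughly 149.8 mm

print(mean, variance, round(std_dev, 1))
```

The standard deviation, about 149.8 mm, is in the same units as the heights, whereas the variance is in squared units.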
Problem & Solution
Example: Find the variance of the numbers 3, 8, 6, 10, 12, 9, 11, 10, 12, 7.
Solution:
Given,
3, 8, 6, 10, 12, 9, 11, 10, 12, 7
Step 1: Compute the mean of the 10 values given.
Mean = (3+8+6+10+12+9+11+10+12+7) / 10 = 88 / 10 = 8.8
Step 2: Make a table with three columns: one for the X values, the second for the deviations, and the third for the squared deviations. Since the data is not stated to be a sample, we use the formula for population variance, and the mean is denoted by μ.
Value X | X – μ | (X – μ)² |
3 | -5.8 | 33.64 |
8 | -0.8 | 0.64 |
6 | -2.8 | 7.84 |
10 | 1.2 | 1.44 |
12 | 3.2 | 10.24 |
9 | 0.2 | 0.04 |
11 | 2.2 | 4.84 |
10 | 1.2 | 1.44 |
12 | 3.2 | 10.24 |
7 | -1.8 | 3.24 |
Total | 0 | 73.6 |
Step 3: Divide the total of the squared deviations by the number of values.
σ² = \(\frac{\sum (X-\mu )^{2}}{N}\)
= 73.6 / 10
= 7.36
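The table and the final division can also be generated programmatically. The sketch below rebuilds the three columns for the same ten values and then applies the population variance formula; the variable names and the print layout are illustrative assumptions.

```python
values = [3, 8, 6, 10, 12, 9, 11, 10, 12, 7]

# Step 1: mean of the ten values.
mu = sum(values) / len(values)

# Step 2: one row per value: (X, X - mu, (X - mu)^2).
rows = [(x, x - mu, (x - mu) ** 2) for x in values]

print("    X |  X - mu | (X - mu)^2")
for x, deviation, squared in rows:
    print(f"{x:>5} | {deviation:>7.1f} | {squared:>10.2f}")

total_deviation = sum(deviation for _, deviation, _ in rows)
total_squared = sum(squared for _, _, squared in rows)

# Step 3: population variance = sum of squared deviations / N.
variance = total_squared / len(values)
print("variance =", round(variance, 2))
```

Up to floating-point rounding, the deviations sum to 0 and the squared deviations sum to 73.6, matching the totals row of the table.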
Points to Remember
- In statistics, variance is used to understand how the numbers within a data set relate to each other, instead of using broader mathematical methods such as organising the numbers of the data set into quartiles.
- Variance treats all deviations from the mean the same, regardless of their direction. As a result, the squared deviations cannot sum to zero and give the false appearance of no variability in the given data set.
- One disadvantage of variance is that it gives added weight to extreme values, i.e., the numbers that are far from the mean. Squaring these numbers can skew the result.
- Another disadvantage of variance is that it can sometimes involve complex calculations.
Note: If all the data values in a set are identical, then their variance is zero (0).
Frequently Asked Questions – FAQs
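What is variance in statistics?
Variance is the expected value of the squared deviation of a random variable from its mean. It measures how far a set of numbers is spread out from its mean value.
What is the symbol of variance?
Variance is symbolically represented by σ², s², or Var(X).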
What is the formula to find variance?
\(\mathrm{Var}(X) = E[(X - \mu)^{2}]\)
Where Var (X) is the variance
E denotes the expected value
X is the random variable and μ is the mean