Covariance and Its Major Flaws

Sandhya Krishnan
2 min read · Oct 29, 2022

Covariance measures the direction of the linear relationship between two variables, that is, whether one variable tends to go up (down) when the other goes up (down).

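The sample covariance formula is:

cov(X, Y) = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1)

where x̄ and ȳ are the sample means of X and Y, and n is the number of points.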

Technically, it is the average of the products of each variable's deviation from its mean (the sample version divides by n − 1 rather than n). For a set of points:

  • If X and Y are both greater than, or both less than, their respective means, the point contributes to positive covariance.
  • If X is greater than its mean and Y is less than its mean, or vice versa, the point contributes to negative covariance.
  • If X or Y equals its mean, the point contributes zero to the covariance.
Image by author
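The three cases above can be checked with a small sketch in plain Python. The function name `sample_covariance` and the data are assumptions made only for illustration; the function returns the per-point contributions alongside the covariance so that the sign of each one is visible.

```python
def sample_covariance(xs, ys):
    """Return (sample covariance, per-point contributions), using n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Each point contributes (x - x_mean) * (y - y_mean) to the sum.
    contributions = [(x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)]
    return sum(contributions) / (n - 1), contributions


# Hypothetical data, chosen so that all three cases appear.
xs = [1, 2, 3, 4, 5]   # mean 3
ys = [1, 5, 3, 2, 4]   # mean 3
cov, contributions = sample_covariance(xs, ys)

for x, y, c in zip(xs, ys, contributions):
    if c > 0:
        label = "positive"
    elif c < 0:
        label = "negative"
    else:
        label = "zero"
    print(f"({x}, {y}) -> {label} contribution")

print("sample covariance:", cov)
```

The point (3, 3) sits exactly on both means, so it adds nothing, while (2, 5) and (4, 2) lie on opposite sides of the two means and pull the covariance down.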

If two variables are independent, their covariance is 0. However, a covariance of 0 does not imply that the variables are independent.

This can be illustrated with the points (1,3), (1,8), (3,5) and (3,6), whose means are 2 for X and 5.5 for Y. The contributions of the red points cancel each other, the contributions of the blue points cancel each other, and the total covariance is zero.

Image by author
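The cancellation for these four points can be verified directly. The sketch below uses only the points given in the text; the variable names are illustrative. It also prints the spread of Y for each value of X, which differs between X = 1 and X = 3, so Y is not independent of X even though the covariance is zero (assuming the four points are treated as the whole data set).

```python
points = [(1, 3), (1, 8), (3, 5), (3, 6)]
n = len(points)

x_mean = sum(x for x, _ in points) / n   # 2.0
y_mean = sum(y for _, y in points) / n   # 5.5

# Per-point products of deviations from the means.
contributions = [(x - x_mean) * (y - y_mean) for x, y in points]
covariance = sum(contributions) / (n - 1)

print(contributions)  # [2.5, -2.5, -0.5, 0.5]
print(covariance)     # 0.0

# The range of Y depends on X, so the variables are not independent.
spread_by_x = {}
for x_value in (1, 3):
    ys = [y for x, y in points if x == x_value]
    spread_by_x[x_value] = max(ys) - min(ys)
print(spread_by_x)    # {1: 5, 3: 1}
```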

Because covariance can take any value in (−∞, +∞) and depends on the units of the variables, it does not tell us:

  • Whether the line representing the relationship is steep or not.
  • Whether the points lie close to that line or not.
  • How strong the relationship is in relative terms.

For example, a covariance of 250 on its own does not tell us whether X and Y have a weak or a strong relationship.
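One way to see why the raw number is hard to interpret is to change the units of one variable. In the sketch below the height and weight values are made up for illustration, and `sample_covariance` is a hypothetical helper. The same data expressed in centimetres instead of metres gives a covariance about 100 times larger, although the relationship itself has not changed.

```python
def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical measurements of the same five people.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 58, 63, 72, 80]

# Same heights, expressed in metres.
heights_m = [h / 100 for h in heights_cm]

cov_cm = sample_covariance(heights_cm, weights_kg)
cov_m = sample_covariance(heights_m, weights_kg)

print("covariance with heights in cm:", cov_cm)
print("covariance with heights in m: ", cov_m)
print("ratio:", cov_cm / cov_m)  # about 100, up to floating-point rounding
```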

Still, covariance is an important quantity because it is:

  • The first step in calculating correlation.
  • Used in Principal Component Analysis (PCA).

Correlation is a scaled version of covariance, obtained by dividing the covariance by the product of the two standard deviations, and it takes values in [-1, 1].
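As a minimal sketch of that scaling, the code below computes the Pearson correlation from the sample covariance. The helper names and the data are assumptions for illustration only. Rescaling X changes the covariance but leaves the correlation unchanged.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the variance of X
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)


# Hypothetical data with a noisy upward trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 4, 8, 9]
xs_scaled = [x * 1000 for x in xs]  # same data in different units

r = correlation(xs, ys)
r_scaled = correlation(xs_scaled, ys)

print("covariance:          ", sample_covariance(xs, ys))
print("covariance (scaled): ", sample_covariance(xs_scaled, ys))
print("correlation:         ", r)
print("correlation (scaled):", r_scaled)
```

Python 3.10 and later also provide `statistics.covariance` and `statistics.correlation` in the standard library, which can replace the hand-written helpers.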
