Basic Stats Cheatsheet
Variance
- Youtube Video
- Measures how far the data points are spread out from the mean.
- If most points are close to the mean, the variance is low.
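As a rough illustration of the two bullets above, the sketch below compares two made-up data sets that share the same mean. The helper name `population_variance` and the choice to divide by $n$ (rather than $n - 1$) are assumptions made for this example.

```python
def population_variance(values):
    """Average squared distance of each value from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


tight = [9, 10, 10, 11]   # points close to the mean of 10
spread = [2, 6, 14, 18]   # same mean, but points far from it

var_tight = population_variance(tight)
var_spread = population_variance(spread)
print(var_tight, var_spread)  # the tighter data set gives the smaller variance
```

Both lists average to 10, so the difference in variance comes only from how far the points sit from that mean.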
Standard Deviation
- The positive square root of the variance.
- $\sigma = \sqrt{\frac{\sum(x - x_{mean})^2}{n}}$
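A minimal sketch of the formula above on illustrative data, next to the two conventions in Python's standard `statistics` module. The formula divides by $n$ (population form); the $n - 1$ (sample) form is shown only for comparison.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean = 5

# Population form, matching the formula: divide by n, then take the square root.
mean = sum(data) / len(data)
sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

# Standard-library equivalents for both conventions.
sigma_lib = statistics.pstdev(data)  # divides by n
s_lib = statistics.stdev(data)       # divides by n - 1 (sample form)

print(sigma, sigma_lib, s_lib)
```

The sample form comes out slightly larger than the population form because it divides by a smaller number.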
Covariance
- Youtube video
- Used to analyze the linear relationship between two variables: how do they behave as pairs?
- A positive value indicates a direct (increasing) linear relationship: if one goes up, the other goes up.
- A negative value indicates an inverse relationship: if one goes up, the other goes down.
- Difference between covariance and correlation: covariance gives the type (direction) of the association, not its strength. Correlation describes the strength of the relationship.
Sample Covariance
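\[\frac{\sum(x - x_{mean})(y - y_{mean})}{n - 1}\]
The slope $m$ of the least-squares line $y = mx + b$ is closely related to the covariance: it is the sample covariance divided by the sample variance of $x$, so the two always share the same sign (the intercept $b$ does not affect it).
\[m = \frac{\sum(x - x_{mean})(y - y_{mean})}{\sum(x - x_{mean})^2}\]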
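The sample covariance can be sketched in a few lines of Python. The function name `sample_covariance` and the data are made up for illustration. The slope is computed the usual least-squares way, as covariance divided by the variance of $x$, which is why it always carries the covariance's sign.

```python
def sample_covariance(xs, ys):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)


xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 6, 8, 10]     # rises with x
ys_down = [10, 8, 6, 4, 2]   # falls as x rises

cov_up = sample_covariance(xs, ys_up)      # positive: direct relationship
cov_down = sample_covariance(xs, ys_down)  # negative: inverse relationship

# Least-squares slope: covariance of (x, y) over the variance of x.
slope_up = cov_up / sample_covariance(xs, xs)

print(cov_up, cov_down, slope_up)
```

Note that `sample_covariance(xs, xs)` is simply the sample variance of `xs`.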
Correlation
- Correlation is always between -1 and 1.
- Correlation is standardized, so values are comparable across data sets.
- Covariance is not standardized; it only gives the direction.
- Spurious correlation: two completely unrelated factors that appear mathematically correlated but have no sensible connection in real life, e.g. dog barks vs. the moon's phase.
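To see why standardization matters, the sketch below re-expresses one variable in different units. The heights and weights are invented for illustration, and `cov` and `corr` are hypothetical helper names. The covariance changes with the units, while the correlation stays the same.

```python
import math


def cov(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)


def corr(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))


height_m = [1.5, 1.6, 1.7, 1.8, 1.9]       # made-up heights in meters
weight_kg = [50, 58, 63, 74, 80]           # made-up weights in kilograms
height_cm = [h * 100 for h in height_m]    # same heights, in centimeters

cov_m, cov_cm = cov(height_m, weight_kg), cov(height_cm, weight_kg)
r_m, r_cm = corr(height_m, weight_kg), corr(height_cm, weight_kg)

print(cov_m, cov_cm)  # covariance scales by 100 with the unit change
print(r_m, r_cm)      # correlation is unchanged
```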
Pearson correlation coefficient $r$
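\[r = \frac{\text{Covariance}(x, y)}{\text{standard deviation}(x) \cdot \text{standard deviation}(y)}\]
where the standard deviations $S_x$ and $S_y$ are the sample standard deviations of $x$ and $y$:
\[S_x = \sqrt{\frac{\sum(x - x_{mean})^2}{n - 1}}\]
and $S_y$ is defined the same way for $y$. Here $r$ is the coefficient of correlation and $n$ is the number of elements under consideration.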
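A step-by-step sketch of the coefficient, assuming the sample ($n - 1$) form for both the covariance and the standard deviations. The function name `pearson_r` and the data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson r: sample covariance over the product of sample std deviations."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_mean) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys) / (n - 1))
    return covariance / (s_x * s_y)


xs = [1, 2, 3, 4, 5]
r_perfect_up = pearson_r(xs, [3, 5, 7, 9, 11])   # points on a rising line
r_perfect_down = pearson_r(xs, [9, 7, 5, 3, 1])  # points on a falling line
r_noisy = pearson_r(xs, [2, 1, 4, 3, 5])         # rising trend with scatter

print(r_perfect_up, r_perfect_down, r_noisy)
```

Points lying exactly on a line give $r$ at the ends of the range, while scatter around the trend pulls $r$ toward 0.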