
Correlation and Regression

What is Correlation?

Correlation in statistics is a way to measure the relation between two variables numerically. For example, atmospheric pressure is negatively correlated with altitude.

The correlation between two variables can be positive, negative, or zero, and the coefficient always lies between -1 and 1. Correlation is a metric that describes the strength of a relation: the closer the value is to 1, the stronger the positive relation, and the closer it is to -1, the stronger the negative relation. A value near 0 indicates little or no linear relation. Before drawing conclusions from a correlation, we first need to look at the most common technique, developed by Karl Pearson. Below we will explore how to calculate the correlation coefficient and the conditions that must hold when using it.

Pearson's Correlation Coefficient

All three formulas express the same quantity, Pearson's correlation coefficient (r).
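For reference, here are three commonly used and algebraically equivalent ways of writing r for n paired observations. The notation is an assumption made for this note: \bar{x} and \bar{y} are the sample means, s_x and s_y the sample standard deviations, and cov(x, y) the sample covariance. These are standard textbook forms and may not match the exact layout of the formulas this section refers to.

```latex
% Form 1: covariance divided by the product of the standard deviations
r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y}

% Form 2: deviations from the means
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}

% Form 3: computational form using raw sums
r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{\left(n \sum x_i^2 - \left(\sum x_i\right)^2\right)
                \left(n \sum y_i^2 - \left(\sum y_i\right)^2\right)}}
```

Form 1 reduces to Form 2 because the 1/(n-1) factors in the covariance and in the two standard deviations cancel. Expanding the deviations in Form 2 and multiplying numerator and denominator by n gives Form 3.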

For a given dataset (or sample), we can use the formula to calculate Pearson's coefficient and see the relationship between two variables. Let's explore the dataset below, which includes the variables x and y.

Applying the formulas to the dataset above, we find Pearson's correlation coefficient (r) to be:
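The calculation can be sketched in a few lines of Python. This is a minimal illustration using the deviations-from-the-mean form of r. The function name `pearson_r` and the x and y values are hypothetical and chosen only for demonstration. They are not the article's dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of the deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for illustration only (not the article's dataset)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

For these illustrative values r is about 0.77, a fairly strong positive relation. In practice a library routine would usually be preferred over a hand-written function.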

When and how to use Correlation Coefficient?

What is Regression?

Regression analysis uses independent variables (X, X1, X2, ...) to explain the changes in a dependent variable (Y). It can be called a foundation of machine learning techniques, because regression lets us predict the target variable (Y) for given data points (X, X1, X2, ...).

The simplest form of regression is linear regression. As the name suggests, it fits a line to data points in a two-dimensional (x, y) Cartesian coordinate system. The technique is called least squares: it chooses the line that minimizes the sum of the squared residuals.
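As a rough sketch of the least squares idea, the snippet below fits a line y = slope * x + intercept using the closed-form solution for a single independent variable. The helper names `fit_line` and `sum_squared_residuals` and the data values are hypothetical, introduced only for this illustration.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    slope = sxy / sxx
    # The least squares line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical values for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
ssr_fit = sum_squared_residuals(x, y, slope, intercept)
# Any other line, e.g. one with a slightly different slope, has a larger sum
ssr_other = sum_squared_residuals(x, y, 0.5, intercept)
print(slope, intercept, ssr_fit, ssr_other)
```

Comparing `ssr_fit` with `ssr_other` shows what "least squares" means in practice: the fitted line gives the smallest sum of squared residuals of any line through these points.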

Here is a sample dataset whose mean y value is 13.875.

Since linear regression is essentially about explaining the variance in y with the independent variables, let's first explore the variance around the mean.
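The variation around the mean can be computed as follows. This is a small sketch with hypothetical y values, not the sample whose mean is 13.875. The function name `variation_around_mean` is made up for this example, and dividing by n rather than n - 1 is an assumption. Use n - 1 if the unbiased sample variance is wanted.

```python
def variation_around_mean(ys):
    """Return the mean, the sum of squares around it, and the variance."""
    n = len(ys)
    mean_y = sum(ys) / n
    # Sum of squared distances from each y value to the mean
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    # Assumption: divide by n, so the variance is the average squared distance
    var_mean = ss_mean / n
    return mean_y, ss_mean, var_mean

# Hypothetical values for illustration only
y = [2, 4, 5, 4, 5]
mean_y, ss_mean, var_mean = variation_around_mean(y)
print(mean_y, ss_mean, var_mean)
```

This quantity serves as the baseline for regression: a fitted line is useful to the extent that the variation around the line is smaller than the variation around the mean.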
