This is part 2 of Introduction to Dimensionality Reduction. In this blog post, we will cover several mathematical prerequisites that one must know before trying to understand machine learning.
Mean Vector
The sample mean vector is a vector each of whose elements is the sample mean of one of the random variables – that is, the arithmetic average of the observed values of that variable.
Let's say we have two vectors
x1 = [2.2, 4.2, ...]
x2 = [1.2, 3.2, ...]
x_mean = (1/2)(x1 + x2)
       = 0.5 * [3.4, 7.4, ...]
       = [1.7, 3.7, ...]
So essentially we summed the elements at the i-th index of the first array with those at the corresponding index of the second array, and divided by the number of arrays.
So every array can be considered as a vector, with each of its indices acting as one of the dimensions.
If we plot these arrays as points in a multidimensional space, they form a scattered cloud of points (a 3D scatter plot in the case of three-dimensional vectors).
Geometrically, the mean vector of that scattered group is its centroid: the point that sits at the center of the cloud.
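As a quick illustration, here is a minimal NumPy sketch (the data values are made up) that computes the mean vector of a small dataset by averaging each coordinate over all the points:

```python
import numpy as np

# Each row is one observation (a vector); each column is one variable/dimension.
# The values are hypothetical, chosen only for illustration.
X = np.array([
    [2.2, 4.2, 1.0],
    [1.2, 3.2, 5.0],
    [3.0, 1.6, 2.4],
])

# The mean vector averages over the rows, giving one sample mean per column.
mean_vector = X.mean(axis=0)
print(mean_vector)  # [2.1333... 3.  2.8]
```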
Data preprocessing: Column Standardization
Column standardization is a type of feature normalization where we transform each feature (column) of the data so that its mean becomes 0 and its standard deviation becomes 1.
feature | f1 | f2 | f3 | f4 |
---|---|---|---|---|
X=1 | 10 | a1 | 2 | 3 |
X=2 | 20 | a2 | 1 | 4 |
X=3 | 30 | a3 | 4 | 4 |
... | ... | ... | ... | ... |
X=n | x | an | y | z |
Let a1, a2, ..., an represent the n values of a feature fj.
Let's say we apply column standardization to these values and obtain a new, standardized feature fj'.
Let a1', a2', ..., an' represent the n values of the standardized feature fj'.
Then the mean of the standardized values is
mean(a1', a2', ..., an') = 0
and their standard deviation is
std(a1', a2', ..., an') = 1
The way we do it is by subtracting the mean from each element and dividing by the standard deviation. On doing so, the resulting values have mean 0 and standard deviation 1.
So how do we obtain a1', a2', ..., an'?
Let the mean of a1, a2, ..., an be a_mean
and the standard deviation of a1, a2, ..., an be a_std.
Then
ai' = (ai - a_mean) / a_std
Geometrically speaking, we shift the distribution so that it is centered at the origin (which is why this step is also called mean centering) and then rescale each axis so that the spread along it becomes 1. We may need to squish or expand the data along an axis depending on whether the standard deviation of the original data is greater or less than 1.
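Putting the formula above into code, here is a minimal NumPy sketch (the feature values are hypothetical) that column-standardizes a small matrix and verifies that every column ends up with mean 0 and standard deviation 1:

```python
import numpy as np

# Hypothetical feature matrix: rows are points x1..xn, columns are features f1..f4.
X = np.array([
    [10.0, 1.5, 2.0, 3.0],
    [20.0, 2.5, 1.0, 4.0],
    [30.0, 3.5, 4.0, 4.0],
])

# Column standardization: subtract each column's mean and divide by its standard deviation.
col_mean = X.mean(axis=0)
col_std = X.std(axis=0)
X_std = (X - col_mean) / col_std

# After standardization, every column has mean ~0 and standard deviation ~1.
print(X_std.mean(axis=0))  # close to [0. 0. 0. 0.]
print(X_std.std(axis=0))   # [1. 1. 1. 1.]
```

In practice, scikit-learn's StandardScaler performs this same per-column transformation.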
Covariance Matrix
Let's say we have a matrix X
features | f1 | f2 | f3 | f4 |
---|---|---|---|---|
X=1 | x11 | x12 | x13 | x14 |
X=2 | x21 | x22 | x23 | x24 |
X=3 | x31 | x32 | x33 | x34 |
We can define its covariance matrix S as
features | f1 | f2 | f3 | f4 |
---|---|---|---|---|
f1 | s11 | s12 | s13 | s14 |
f2 | s21 | s22 | s23 | s24 |
f3 | s31 | s32 | s33 | s34 |
f4 | s41 | s42 | s43 | s44 |
where x(i,j) is the entry in the i-th row and j-th column of X, and s(i,j) is the entry in the i-th row and j-th column of S. Each entry s(i,j) is the covariance between features fi and fj, i.e. s(i,j) = Cov(fi, fj).
The dimensions of X here are n × d, where n is the number of points and d is the number of dimensions (or features).
The covariance matrix, on the other hand, is of size d × d. Hence the covariance matrix is always a square matrix.
Covariance has two useful properties:
1) Cov(X, Y) = Cov(Y, X)
2) Cov(X, X) = Var(X)
The first property means s(i,j) = s(j,i), i.e. the covariance matrix is always symmetric.
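Both properties are easy to check numerically. Here is a small NumPy sketch with made-up values (note that np.cov uses the n-1 convention by default, so we pass ddof=1 to np.var for the comparison):

```python
import numpy as np

# Two made-up 1-D variables, used only to check the properties numerically.
x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([3.0, 1.0, 5.0, 2.0])

# np.cov(x, y) returns the 2x2 covariance matrix of x and y.
cov_xy = np.cov(x, y)[0, 1]
cov_yx = np.cov(y, x)[0, 1]
print(np.isclose(cov_xy, cov_yx))                          # True: Cov(X, Y) == Cov(Y, X)
print(np.isclose(np.cov(x, x)[0, 1], np.var(x, ddof=1)))   # True: Cov(X, X) == Var(X)
```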
Another interesting property: for a column-standardized data matrix X, the covariance matrix is
S = (1/n) (Xᵀ X)
We will leave the proof as an exercise; it follows directly from the definition of covariance once every column of X has mean 0.
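To see this in action, here is a minimal NumPy sketch with a hypothetical data matrix: we column-standardize X, form (1/n)·XᵀX, and compare the result against NumPy's own covariance computation:

```python
import numpy as np

# Hypothetical data matrix: n points (rows) and d features (columns).
X = np.array([
    [10.0, 1.5, 2.0, 3.0],
    [20.0, 2.5, 1.0, 4.0],
    [30.0, 3.5, 4.0, 4.0],
    [25.0, 0.5, 3.0, 5.0],
])
n, d = X.shape

# Column-standardize X (mean 0 and standard deviation 1 per column).
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Covariance matrix as (1/n) * X^T X; its shape is d x d.
S = (X_std.T @ X_std) / n
print(S.shape)              # (4, 4)
print(np.allclose(S, S.T))  # True: the covariance matrix is symmetric

# Cross-check against NumPy's covariance (ddof=0 matches the 1/n convention).
print(np.allclose(S, np.cov(X_std, rowvar=False, ddof=0)))  # True
```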
Stay tuned for the next post on dimensionality reduction, where we will describe a classical technique called Principal Component Analysis.