Calculating principal component analysis (PCA) step by step using a simple dataset

Published On Jul 30, 2022

In this video, I explain what principal component analysis (PCA) is and how to calculate it step by step, both manually and in Matlab (no prior knowledge of the language is necessary). I use a simple, intuitive 3D dataset and apply PCA to transform it into a 2D dataset. This example is suitable for biologists and non-biologists alike.
---------------------------
The Matlab code that I used:
% Dataset: 4 observations (rows) by 3 variables (columns)
A = double([1 0 0; 0 1 0; 0 0 1; 0 0 0]);

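% Calculating PCA manually, step by step
[n,m] = size(A)    % n observations (rows), m variables (columns)
AMean = mean(A)    % mean of each column

% Center the data by subtracting the column means
B = (A-repmat(AMean, [n 1]))

% Sample covariance matrix (m-by-m)
C = (B'*B)./(size(B,1)-1)

% Columns of V_eigenvectors are the principal directions; the diagonal of
% D_eigenvalues holds the variance along each direction
[V_eigenvectors, D_eigenvalues] = eig(C)

round(V_eigenvectors,2)
round(D_eigenvalues,2)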
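
% Sketch (not part of the original code): share of the total variance
% carried by each eigenvector. This assumes the eigenvalues on the diagonal
% of D_eigenvalues are the component variances; they are sorted in
% descending order first because eig does not promise that ordering.
% The name explainedPct is illustrative.
explainedPct = 100 * sort(diag(D_eigenvalues), 'descend') / trace(D_eigenvalues)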

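% Using Matlab's built-in pca function (Statistics and Machine Learning Toolbox).
% coeff1: principal component coefficients, SCORE1: data projected onto the
% components, V1: variance of each component (the eigenvalues of C)
[coeff1, SCORE1, V1] = pca(A, 'Algorithm','eig')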
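
% Sketch (not part of the original code): projecting the centered data onto
% the two eigenvectors with the largest eigenvalues to get the 2D dataset.
% The names order, W and scores2D are illustrative.
[~, order] = sort(diag(D_eigenvalues), 'descend');
W = V_eigenvectors(:, order(1:2));   % 3-by-2 projection matrix
scores2D = B * W                     % 4-by-2: each row is one observation in 2D
% Individual columns may differ from SCORE1(:,1:2) in sign, and because the
% two largest eigenvalues of this dataset are equal, the two axes may also
% be rotated within the same plane.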

% A set of vectors is said to be "mutually orthogonal" if the dot product of
% any pair of distinct vectors in the set is 0
V_eigenvectors(:,1)'*V_eigenvectors(:,2)
V_eigenvectors(:,1)'*V_eigenvectors(:,3)
V_eigenvectors(:,2)'*V_eigenvectors(:,3)

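% Plotting the eigenvectors as line segments from the origin.
% Each e is a 2-by-3 matrix: row 1 is the origin, row 2 is the eigenvector.
e1 = [zeros(size(V_eigenvectors(:,1))), V_eigenvectors(:,1)]'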
e1 = e1/norm(e1)

e2 = [zeros(size(V_eigenvectors(:,2))), V_eigenvectors(:,2)]'
e2 = e2/norm(e2)

e3 = [zeros(size(V_eigenvectors(:,3))), V_eigenvectors(:,3)]'
e3 = e3/norm(e3)

plot3(e1(:,1), e1(:,2), e1(:,3), 'k', 'LineWidth',2)
hold on;
plot3(e2(:,1), e2(:,2), e2(:,3), 'b', 'LineWidth',2)
plot3(e3(:,1), e3(:,2), e3(:,3), 'g', 'LineWidth',2)
---------------------------
This video supports the paper:
https://www.nature.com/articles/s4159...