Image Processing
This project uses face image data from the Olivetti Research Laboratory (ORL). The ORL data set is useful because all images are of a standard size and there is little variance in the lighting and positioning of the face. Eigenfaces requires standard dimension images and works best when the lighting, positioning and expression of the faces are standard.
After Image files are loaded they must be transformed. Each image is transfromed from a l × w matrix of greyscale values to a single column vector of length lw. Next:
[h,w,n] = size(imgs);
d = h*w;
% vectorize images
x = reshape(imgs,[d n]);
x = double(x);
%subtract mean
mean_matrix = mean(x,2);
x = bsxfun(@minus, x, mean_matrix);
% calculate covariance
s = cov(x');
% obtain eigenvalue & eigenvector
[V,D] = eig(s);
eigval = diag(D);
Using Eigenfaces
Note that the eigenface representations in the right image weren't created using all the obtained eigenvectors (AKA principal components). The image used a weighted sum of only the 100 most dominant eigenvectors. The rest of the eigenvectors contributed sufficiently little information as to be ignored. Igoring these vectors allows this algorithm to greatly compress the information needing to be stored and In this way this algorithm is working much like the PCA algorithm, a threshold is determined for which eigenvectors are kept for analysis/identifcation and the rest can safely be ignored allowing for less space and computational requirements.
Images are stored as weights of the dominant eigenvectors and images can be added by storing new weights without recalculating any matrix decompositions. Comparisons of a new image to identify it with an existing one is just a matter of comparing the weights. The Euclidean distance between a new image's weights and the existing weight vector is used to compare. A threshold for determining whether a given comparison must be decided and should be based on experimentation with the individual dataset.
Conclusion
The Eigenfaces algorithm gives a neat demonstration of some linear algebra principles and can be used to explain Principle Component Analysis or Singular Value Decomposition. However, with the modern ubiquity of processing power, storage and machine learning techniques it is certainly not the most powerful or robust algorithm for identification tasks.
Please see my repo for complete matlab code.