Image Processing

This project uses face image data from the Olivetti Research Laboratory (ORL). The ORL data set is useful because all images are of a standard size and there is little variance in the lighting and positioning of the face. Eigenfaces requires standard dimension images and works best when the lighting, positioning and expression of the faces are standard.

After Image files are loaded they must be transformed. Each image is transfromed from a l × w matrix of greyscale values to a single column vector of length lw. Next:

  • Each image is concatenated into a matrix X whose columns are the image vectors;
  • The mean image vector is subtracted from each row of X;
  • the covariance matrix S of X is calculated; and
  • The eigendecomposition of S is found.
                [h,w,n] = size(imgs);
                d = h*w;
                % vectorize images
                x = reshape(imgs,[d n]);
                x = double(x);
                %subtract mean
                mean_matrix = mean(x,2);
                x = bsxfun(@minus, x, mean_matrix);
                % calculate covariance
                s = cov(x');
                % obtain eigenvalue & eigenvector
                [V,D] = eig(s);
                eigval = diag(D);
    The mean face and the first 15 dominant eiganvectors
    Stages of Eigenfaces

    Although the resulting image is not recognizable it still contains enough information for face identification.

    Using Eigenfaces

    Note that the eigenface representations in the right image weren't created using all the obtained eigenvectors (AKA principal components). The image used a weighted sum of only the 100 most dominant eigenvectors. The rest of the eigenvectors contributed sufficiently little information as to be ignored. Igoring these vectors allows this algorithm to greatly compress the information needing to be stored and In this way this algorithm is working much like the PCA algorithm, a threshold is determined for which eigenvectors are kept for analysis/identifcation and the rest can safely be ignored allowing for less space and computational requirements.

    Images are stored as weights of the dominant eigenvectors and images can be added by storing new weights without recalculating any matrix decompositions. Comparisons of a new image to identify it with an existing one is just a matter of comparing the weights. The Euclidean distance between a new image's weights and the existing weight vector is used to compare. A threshold for determining whether a given comparison must be decided and should be based on experimentation with the individual dataset.

    Matrix-Vector product
    fig:Calculating the weights vector of a new image

    The Eigenfaces algorithm gives a neat demonstration of some linear algebra principles and can be used to explain Principle Component Analysis or Singular Value Decomposition. However, with the modern ubiquity of processing power, storage and machine learning techniques it is certainly not the most powerful or robust algorithm for identification tasks.

    Please see my repo for complete matlab code.