Coursera: Machine Learning (Week 8) [Assignment Solution] - Andrew NG

 

 Recommended Courses:

1.K-means clustering and PCA

Don't just copy paste the code for the sake of completion. 
Make sure you understand the code first.

In this exercise, you will implement the K-means clustering algorithm and apply it to compress an image. In the second part, you will use principal component analysis to find a low-dimensional representation of face images. Before starting on the programming exercise, we strongly recommend watching the video lectures and completing the review questions for the associated topics.
It consist of the following files:
  • ex7.m - Octave/MATLAB script for the first exercise on K-means
  • ex7 pca.m - Octave/MATLAB script for the second exercise on PCA
  • ex7data1.mat - Example Dataset for PCA
  • ex7data2.mat - Example Dataset for K-means
  • ex7faces.mat - Faces Dataset
  • bird small.png - Example Image
  • displayData.m - Displays 2D data stored in a matrix
  • drawLine.m - Draws a line over an exsiting figure
  • plotDataPoints.m - Initialization for K-means centroids
  • plotProgresskMeans.m - Plots each step of K-means as it proceeds
  • runkMeans.m - Runs the K-means algorithm
  • submit.m - Submission script that sends your solutions to our servers
  • [*] pca.m - Perform principal component analysis
  • [*] projectData.m - Projects a data set into a lower dimensional space
  • [*] recoverData.m - Recovers the original data from the projection
  • [*] findClosestCentroids.m - Find closest centroids (used in K-means)
  • [*] computeCentroids.m - Compute centroid means (used in K-means)
  • [*] kMeansInitCentroids.m - Initialization for K-means centroids

* indicates files you will need to complete

pca.m :

function [U, S] = pca(X)
  %PCA Run principal component analysis on the dataset X
  %   [U, S, X] = pca(X) computes eigenvectors of the covariance matrix of X
  %   Returns the eigenvectors U, the eigenvalues (on diagonal) in S
  %
  
  % Useful values
  [m, n] = size(X);
  
  % You need to return the following variables correctly.
  U = zeros(n);
  S = zeros(n);
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: You should first compute the covariance matrix. Then, you
  %               should use the "svd" function to compute the eigenvectors
  %               and eigenvalues of the covariance matrix. 
  %
  % Note: When computing the covariance matrix, remember to divide by m (the
  %       number of examples).
  %
  % DIMENSIONS:
  %    X = m x n
  
  Sigma = (1/m)*(X'*X); % n x n
  [U, S, V] = svd(Sigma);
  
  % =========================================================================
end

projectData.m :

function Z = projectData(X, U, K)
  %PROJECTDATA Computes the reduced data representation when projecting only 
  %on to the top k eigenvectors
  %   Z = projectData(X, U, K) computes the projection of 
  %   the normalized inputs X into the reduced dimensional space spanned by
  %   the first K columns of U. It returns the projected examples in Z.
  %
  
  % You need to return the following variables correctly.
  Z = zeros(size(X, 1), K);
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: Compute the projection of the data using only the top K 
  %               eigenvectors in U (first K columns). 
  %               For the i-th example X(i,:), the projection on to the k-th 
  %               eigenvector is given as follows:
  %                    x = X(i, :)';
  %                    projection_k = x' * U(:, k);
  %
  % DIMENSIONS:
  %    X = m x n
  %    U = n x n
  %    U_reduce = n x K
  %    K = scalar
  
  U_reduce = U(:,[1:K]);   % n x K
  Z = X * U_reduce;        % m x k
  
  % =============================================================
end


recoverData.m :

function X_rec = recoverData(Z, U, K)
  %RECOVERDATA Recovers an approximation of the original data when using the 
  %projected data
  %   X_rec = RECOVERDATA(Z, U, K) recovers an approximation the 
  %   original data that has been reduced to K dimensions. It returns the
  %   approximate reconstruction in X_rec.
  %
  
  % You need to return the following variables correctly.
  X_rec = zeros(size(Z, 1), size(U, 1));
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: Compute the approximation of the data by projecting back
  %               onto the original space using the top K eigenvectors in U.
  %
  %               For the i-th example Z(i,:), the (approximate)
  %               recovered data for dimension j is given as follows:
  %                    v = Z(i, :)';
  %                    recovered_j = v' * U(j, 1:K)';
  %
  %               Notice that U(j, 1:K) is a row vector.
  %               
  % DIMENSIONS: 
  %    Z = m x K
  %    U = n x n
  %    U_reduce = n x k
  %    K = scalar
  %    X_rec = m x n
  
  U_reduce = U(:,1:K);   % n x k
  X_rec = Z * U_reduce'; % m x n
  
  % =============================================================
end

findClosestCentroids.m :

function idx = findClosestCentroids(X, centroids)
  %FINDCLOSESTCENTROIDS computes the centroid memberships for every example
  %   idx = FINDCLOSESTCENTROIDS (X, centroids) returns the closest centroids
  %   in idx for a dataset X where each row is a single example. idx = m x 1
  %   vector of centroid assignments (i.e. each entry in range [1..K])
  %
  
  % Set K
  K = size(centroids, 1); % K x 1 == 3 x 1
  
  % You need to return the following variables correctly.
  idx = zeros(size(X,1), 1);  % m x 1 == 300 x 1
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: Go over every example, find its closest centroid, and store
  %               the index inside idx at the appropriate location.
  %               Concretely, idx(i) should contain the index of the centroid
  %               closest to example i. Hence, it should be a value in the
  %               range 1..K
  %
  % Note: You can use a for-loop over the examples to compute this.
  %
  % DIMENSIONS:
  %    centroids = K x no. of features = 3 x 2
  
  for i = 1:size(X,1)
      temp = zeros(K,1);
      for j = 1:K
          temp(j)=sqrt(sum((X(i,:)-centroids(j,:)).^2));
      end
      [~,idx(i)] = min(temp);
  end
  
  % =============================================================
end

computeCentroids.m :

function centroids = computeCentroids(X, idx, K)
  %COMPUTECENTROIDS returns the new centroids by computing the means of the
  %data points assigned to each centroid.
  %   centroids = COMPUTECENTROIDS(X, idx, K) returns the new centroids by
  %   computing the means of the data points assigned to each centroid. It is
  %   given a dataset X where each row is a single data point, a vector
  %   idx of centroid assignments (i.e. each entry in range [1..K]) for each
  %   example, and K, the number of centroids. You should return a matrix
  %   centroids, where each row of centroids is the mean of the data points
  %   assigned to it.
  %
  
  % Useful variables
  [m n] = size(X);
  
  % You need to return the following variables correctly.
  centroids = zeros(K, n);
  
  
  % ====================== YOUR CODE HERE ======================
  % Instructions: Go over every centroid and compute mean of all points that
  %               belong to it. Concretely, the row vector centroids(i, :)
  %               should contain the mean of the data points assigned to
  %               centroid i.
  %
  % Note: You can use a for-loop over the centroids to compute this.
  %
  % DIMENSIONS:
  %    X =  m x n
  %    centroids = K x n
  
  %% %%%%%% WORKING: SOLUTION1 %%%%%%%%%
  % for i = 1:K
  %     idx_i = find(idx==i);       %indexes of all the input which belongs to cluster j
  %     centroids(i,:)=(1/length(idx_i))*sum(X(idx_i,:)); %calculating mean manually
  % end
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  
  %% %%%%%% WORKING: SOLUTION 2 %%%%%%%%
  for i = 1:K
      idx_i = find(idx==i);       %indexes of all the input which belongs to cluster j
      centroids(i,:) = mean(X(idx_i,:)); % calculating mean using built-in function
  end
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % =============================================================
end
  • 一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一一
                         Machine Learning Coursera-All weeks solutions [Assignment + Quiz]   click here
                                                                                &
                         Coursera Google Data Analytics Professional Quiz Answers   click here


Have no concerns to ask doubts in the comment section. I will give my best to answer it.
If you find this helpful kindly comment and share the post.
This is the simplest way to encourage me to keep doing such work.


Thanks & Regards,
- Wolf

Comments

Popular posts from this blog

Coursera Google Data Analytics Professional Quiz Answers of all 8 courses.Data Analytics Professional Certificate

Coursera Google Data Analytics Professional Data Analysis with R Programming (Week 5) Quiz Answer-Documentation and reports.