
Coursera: Machine Learning (Week 8) Quiz - Unsupervised Learning | Andrew NG

 



1. Unsupervised Learning

Don't just copy and paste for the sake of completion. The solutions uploaded here are for reference only. They are meant to unblock you if you get stuck somewhere. Make sure you understand the material first.

  1. For which of the following tasks might K-means clustering be a suitable algorithm?
    Select all that apply.

    •  Given a set of news articles from many different news websites, find out what are the main topics covered.
    •  Given historical weather records, predict if tomorrow’s weather will be sunny or rainy.

    •  From the user usage patterns on a website, figure out what different groups of users exist.

    •  Given many emails, you want to determine if they are Spam or Non-Spam emails.

    •  Given a database of information about your users, automatically group them into different market segments.
    •  Given sales data from a large number of products in a supermarket, figure out which products tend to form coherent groups (say are frequently purchased together) and thus should be put on the same shelf.

    •  Given sales data from a large number of products in a supermarket, estimate future sales for each of these products.
  2. Suppose we have three cluster centroids $\mu_1$, $\mu_2$, and $\mu_3$.
    Furthermore, we have a training example $x^{(i)}$. After a cluster assignment
    step, what will $c^{(i)}$ be?

    •  $c^{(i)}$ = 1   ANSWER
    •  $c^{(i)}$ is not assigned

    •  $c^{(i)}$ = 2

    •  $c^{(i)}$ = 3
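As a rough illustration of the cluster assignment step asked about above, here is a minimal Python sketch. The centroid and example values are hypothetical placeholders (they are not the values from the quiz), and the helper names `squared_distance` and `assign_cluster` are made up for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign_cluster(x, centroids):
    """Return the 1-based index of the centroid closest to x."""
    distances = [squared_distance(x, mu) for mu in centroids]
    return distances.index(min(distances)) + 1


# Hypothetical values, chosen only for illustration.
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
x = (1.0, 1.0)

# Squared distances are 2, 32 and 82, so the first centroid is closest.
c = assign_cluster(x, centroids)
print(c)  # prints 1
```

The assignment step always picks exactly one centroid per example, so an example is never left "not assigned".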
  3. K-means is an iterative algorithm, and two of the following steps are repeatedly carried out in its inner-loop. Which two?

    •  Move the cluster centroids, where the centroids $\mu_k$ are updated.

    •  The cluster assignment step, where the parameters $c^{(i)}$ are updated.

    •  Using the elbow method to choose K.

    •  Feature scaling, to ensure each feature is on a comparable scale to the others.

    •  The cluster centroid assignment step, where each cluster centroid $\mu_i$ is assigned (by setting $c^{(i)}$) to the closest training example $x^{(i)}$.

    •  Move each cluster centroid $\mu_k$, by setting it to be equal to the closest training example $x^{(i)}$.

    •  Test on the cross-validation set.

    •  Randomly initialize the cluster centroids.
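To make the inner loop concrete, here is a minimal pure-Python sketch of the two steps K-means alternates between: cluster assignment and moving the centroids. The function names, the toy data and the choice to leave an empty cluster's centroid where it is are all assumptions for illustration, not something the quiz specifies.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign_step(points, centroids):
    """Cluster assignment: index (0-based) of the closest centroid per point."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(x, centroids[k]))
        for x in points
    ]


def move_step(points, assignments, centroids):
    """Move centroids: each centroid becomes the mean of its assigned points."""
    updated = []
    for k, old in enumerate(centroids):
        members = [x for x, c in zip(points, assignments) if c == k]
        if not members:
            updated.append(old)  # assumption: keep the centroid of an empty cluster
            continue
        updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return updated


def kmeans(points, centroids, max_iters=100):
    """Alternate the two steps until the centroids stop moving (max_iters >= 1)."""
    for _ in range(max_iters):
        assignments = assign_step(points, centroids)
        updated = move_step(points, assignments, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assignments


# Toy data: two obvious groups, with the initial centroids given by hand.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
final_centroids, final_assignments = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
print(final_centroids)    # prints [(0.0, 1.0), (10.0, 1.0)]
print(final_assignments)  # prints [0, 0, 1, 1]
```

Initialization, feature scaling and choosing K all happen outside this loop, which is why they are not inner-loop steps.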

  4. Suppose you have an unlabeled dataset $\{x^{(1)}, \ldots, x^{(m)}\}$. You run K-means with 50 different random initializations, and obtain 50 different clusterings of the data.
    What is the recommended way for choosing which one of these 50 clusterings to use?
    •  Use the elbow method.
    •  Plot the data and the cluster centroids, and pick the clustering that gives the most “coherent” cluster centroids.
    •  Manually examine the clusterings, and pick the best one.
    •  Compute the distortion function $J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$, and pick the one that minimizes this.
    •  The only way to do so is if we also have labels $y^{(i)}$ for our data.
    •  Always pick the final (50th) clustering found, since by that time it is more likely to have converged to a good solution.

    •  The answer is ambiguous, and there is no good way of choosing.
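One way to picture the "run many initializations, keep the lowest distortion" idea is the sketch below. It is a minimal pure-Python version under stated assumptions: the helper names (`run_kmeans`, `distortion`, `best_of_n`), the seed and the toy data are all hypothetical, and centroids are initialized by sampling K training examples.

```python
import random


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def run_kmeans(points, k, rng, max_iters=100):
    """One K-means run, initialized from k randomly chosen training examples."""
    centroids = rng.sample(points, k)
    for _ in range(max_iters):
        assign = [min(range(k), key=lambda j: sq_dist(x, centroids[j])) for x in points]
        updated = []
        for j in range(k):
            members = [x for x, c in zip(points, assign) if c == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[j])  # assumption: leave empty clusters alone
        if updated == centroids:
            break
        centroids = updated
    return centroids


def distortion(points, centroids):
    """Mean squared distance from each point to its closest centroid."""
    return sum(min(sq_dist(x, mu) for mu in centroids) for x in points) / len(points)


def best_of_n(points, k, n_inits=50, seed=0):
    """Run K-means n_inits times and keep the run with the lowest distortion."""
    rng = random.Random(seed)
    best_cost, best_centroids = None, None
    for _ in range(n_inits):
        centroids = run_kmeans(points, k, rng)
        cost = distortion(points, centroids)
        if best_cost is None or cost < best_cost:
            best_cost, best_centroids = cost, centroids
    return best_cost, best_centroids


# Toy data with two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
best_cost, best_centroids = best_of_n(points, k=2)
print(best_cost, best_centroids)
```

No labels are needed for this comparison, since the distortion is computed from the data and the centroids alone.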

  5. Which of the following statements are true? Select all that apply.

    •  On every iteration of K-means, the cost function $J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$ (the distortion function) should either stay the same or decrease; in particular, it should not increase.
    •  A good way to initialize K-means is to select K (distinct) examples from the training set and set the cluster centroids equal to these selected examples.
    •  K-Means will always give the same results regardless of the initialization of the centroids.

    •  Once an example has been assigned to a particular centroid, it will never be reassigned to another different centroid.

    •  For some datasets, the “right” or “correct” value of K (the number of clusters) can be ambiguous, and hard even for a human expert looking carefully at the data to decide.
    •  The standard way of initializing K-means is setting $\mu_1 = \cdots = \mu_K$ to be equal to a vector of zeros.
    •  If we are worried about K-means getting stuck in bad local optima, one way to ameliorate (reduce) this problem is if we try using multiple random initializations.
    •  Since K-Means is an unsupervised learning algorithm, it cannot overfit the data, and thus it is always better to have as large a number of clusters as is computationally feasible.
    •  For each of the clusterings, compute $\frac{1}{m}\sum_{i=1}^{m} \|x^{(i)} - \mu_{c^{(i)}}\|^2$, and pick the one that minimizes this.
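The claim that the distortion never increases can be checked empirically with a small sketch like the one below. It records the cost after every assignment step and every move-centroid step on toy data; the function names, the data and the deliberately poor starting centroids are assumptions made for illustration.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def cost(points, centroids, assign):
    """Distortion: mean squared distance from each point to its assigned centroid."""
    return sum(sq_dist(x, centroids[c]) for x, c in zip(points, assign)) / len(points)


def kmeans_cost_history(points, centroids, iters=10):
    """Run K-means for a fixed number of iterations, recording the cost after each step."""
    k = len(centroids)
    history = []
    for _ in range(iters):
        # Cluster assignment step: minimizes the cost over the assignments.
        assign = [min(range(k), key=lambda j: sq_dist(x, centroids[j])) for x in points]
        history.append(cost(points, centroids, assign))
        # Move centroid step: minimizes the cost over the centroids.
        moved = []
        for j in range(k):
            members = [x for x, c in zip(points, assign) if c == j]
            if members:
                moved.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                moved.append(centroids[j])  # assumption: leave empty clusters alone
        centroids = moved
        history.append(cost(points, centroids, assign))
    return history


# Toy data, with both starting centroids placed in the same group on purpose.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
history = kmeans_cost_history(points, [(0.0, 0.0), (1.0, 0.0)])

# Each recorded cost should be no larger than the one before it.
non_increasing = all(b <= a + 1e-12 for a, b in zip(history, history[1:]))
print(non_increasing)  # prints True
```

If the cost ever goes up between steps, that usually points to a bug in the implementation rather than a property of the data.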

      Feel free to ask questions in the comment section. I will do my best to answer them.
      If you find this helpful, kindly comment and share the post.
      This is the simplest way to encourage me to keep doing such work.


      Thanks & Regards,
      - Wolf
