K-Means Clustering in R and Tableau

Objective: To use the k-means clustering algorithm to identify clusters in a dataset, both in R and Tableau.

Dataset #1:

The dataset for this assignment is the Old Faithful geyser dataset. It contains the duration of each eruption (in minutes) and the waiting time (in minutes) between eruptions for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. The dataset has 272 observations of those 2 variables.

Create a report in a word processor that includes:

  1. A brief introduction (in one to two paragraphs) to k-means clustering (Write in your own words without copying and pasting from the internet.)
  2. Create a scatter plot with the 272 observations.
  3. Can you identify clusters forming in your scatter plot?
  4. Circle the clusters you have identified.
  5. Provide a screenshot of your scatter plot in the report. Please make sure the plot is appropriately labeled.
  6. Perform a K-Means analysis on the dataset.
  7. A description of how you identified the optimal number of clusters.
  8. The results of your K-Means clustering analysis in R and Tableau.
  9. An analysis of the clustered data.
  10. List each question before answering it.
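
Items 6 and 7 above can be illustrated with a minimal sketch of k-means and the elbow heuristic. It is written in Python only to stay self-contained and is not the required R or Tableau deliverable. The ten points are hypothetical values loosely shaped like (eruption, waiting) pairs, not rows from the dataset, and the helper names (`kmeans`, `best_wcss`) are illustrative. In R, the built-in `kmeans()` function reports the comparable quantity as `tot.withinss`.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))

def kmeans(points, k, seed=0, max_iterations=100):
    """Lloyd's algorithm. Returns (centroids, labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster where it is
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    wcss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wcss

def best_wcss(points, k, restarts=10):
    """Lowest WCSS over several random starts, to reduce the risk of a poor local optimum."""
    return min(kmeans(points, k, seed=s)[2] for s in range(restarts))

# Hypothetical (eruption minutes, waiting minutes) pairs, NOT taken from the real dataset.
points = [
    (1.8, 54), (1.9, 51), (2.0, 52), (2.2, 55), (1.7, 50),
    (4.1, 80), (4.5, 82), (4.3, 78), (4.6, 85), (4.0, 79),
]

# Elbow heuristic: compute WCSS for several k and look for the point of diminishing returns.
wcss_by_k = {k: best_wcss(points, k) for k in range(1, 5)}
for k, wcss in wcss_by_k.items():
    print(f"k={k}  WCSS={wcss:.2f}")
```

On this toy data the within-cluster sum of squares drops sharply from k = 1 to k = 2 and only slightly afterwards, which is the "elbow" that would suggest two clusters. Because waiting time spans a much larger range than eruption duration, standardizing both variables before clustering is a common choice.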

Dataset #2:

The dataset for this assignment is the Iris dataset. The dataset contains measurements of sepal length, sepal width, petal length, and petal width for 150 iris flowers, which are labeled as belonging to one of three species: Iris setosa, Iris versicolor, or Iris virginica.

Create a report in a word processor that includes:

  1. Perform a K-Means analysis on the dataset.
  2. A description of how you identified the optimal number of clusters.
  3. The results of your K-Means clustering analysis in R and Tableau.
  4. An analysis of the clustered data.
  5. What are the characteristics of each cluster?
  6. Are the clusters well-separated?
  7. List each question before answering it.
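
Items 5 and 6 above can be approached with per-cluster feature means and the silhouette width, sketched below. This is one possible approach, not the only accepted one, and it is written in Python only to stay self-contained. The nine points and their labels are hypothetical (petal length, petal width) values, not rows from the Iris dataset, and the function names are illustrative.

```python
from math import dist
from statistics import mean

def cluster_means(points, labels):
    """Mean of every feature per cluster: a compact profile of each cluster."""
    profiles = {}
    for lab in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == lab]
        profiles[lab] = tuple(mean(col) for col in zip(*members))
    return profiles

def silhouette_widths(points, labels):
    """Silhouette width of each point; assumes at least two clusters."""
    widths = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster.
        own = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
               if l == lab and j != i]
        if not own:
            widths.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = mean(own)
        # b: mean distance to the members of the closest other cluster.
        b = min(mean(dist(p, q) for q, l in zip(points, labels) if l == other)
                for other in set(labels) if other != lab)
        widths.append((b - a) / max(a, b))
    return widths

# Hypothetical (petal length cm, petal width cm) values with assumed cluster labels.
points = [
    (1.4, 0.2), (1.5, 0.2), (1.3, 0.3),
    (4.5, 1.5), (4.7, 1.4), (4.4, 1.3),
    (5.1, 1.9), (5.5, 2.1), (4.9, 1.8),
]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

profiles = cluster_means(points, labels)
widths = silhouette_widths(points, labels)
# Average silhouette per cluster: closer to 1 means tighter and better separated.
per_cluster = {lab: mean(w for w, l in zip(widths, labels) if l == lab)
               for lab in sorted(set(labels))}

for lab in profiles:
    print(f"cluster {lab}: means={profiles[lab]}  silhouette={per_cluster[lab]:.2f}")
print(f"overall silhouette: {mean(widths):.2f}")
```

A silhouette width near 1 indicates a point sits well inside its own cluster, values near 0 indicate overlap with a neighboring cluster, and negative values suggest a likely misassignment. In this toy example the cluster of small petals scores higher than the two clusters of larger petals, which lie close to each other.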

Submission:

  1. Create two folders, one for each problem.
  2. For each problem, place your report, R file, and Tableau file in that problem's folder.
  3. When you are ready to submit, place both folders inside a single folder and zip that folder.
  4. Upload the zipped folder to Canvas.