Clustering Methods. Nathaniel E. Helwig. Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 27-Mar-2017



Copyright © 2017 by Nathaniel E. Helwig

Outline of Notes

1) Similarity and Dissimilarity
   Defining Similarity
   Distance Measures
2) Hierarchical Clustering
   Overview
   Linkage Methods
   States Example
3) Non-Hierarchical Clustering
   Overview
   K Means Clustering
   States Example

Purpose of Clustering Methods

Clustering methods attempt to group (or cluster) objects based on some rule defining the similarity (or dissimilarity) between the objects.

Distinction between clustering and classification/discrimination:
  Clustering: the group labels are not known a priori
  Classification: the group labels are known (for a training sample)

The typical goal in clustering is to discover the natural groupings present in the data.

Similarity and Dissimilarity

Defining Similarity (Between Objects): What Does it Mean for Objects to be Similar?

Let $x = (x_1, \ldots, x_p)$ and $y = (y_1, \ldots, y_p)$ denote two arbitrary vectors.

Problem: We want some rule that measures the closeness or similarity between $x$ and $y$.

How we define closeness (or similarity) will determine how we group the objects into clusters.
  Rule 1: Pearson correlation between $x$ and $y$
  Rule 2: Euclidean distance between $x$ and $y$
  Rule 3: Number of matches, i.e., $\sum_{j=1}^{p} 1_{\{x_j = y_j\}}$
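As a rough illustration of the three rules, the R sketch below evaluates each of them on the same pair of vectors. The vectors x and y hold made-up values (they are not data from these notes), and the variable names are chosen only for this example.

```r
# Two made-up vectors of length p = 5 (illustrative values only)
x <- c(1, 2, 3, 4, 5)
y <- c(1, 2, 3, 5, 7)

# Rule 1: Pearson correlation (larger = more similar)
r.xy <- cor(x, y)

# Rule 2: Euclidean distance (smaller = more similar)
d.xy <- sqrt(sum((x - y)^2))

# Rule 3: number of matching coordinates (larger = more similar)
m.xy <- sum(x == y)

c(correlation = r.xy, euclidean = d.xy, matches = m.xy)
```

Note that Rules 1 and 3 measure similarity while Rule 2 measures dissimilarity, so the direction of "closer" differs between them.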

Card Clustering with Different Similarity Rules

Figure: Figure 12.1 from Applied Multivariate Statistical Analysis, 6th Ed (Johnson & Wichern).

Defining a Proper Distance

A metric (or distance) on a set $X$ is a function $d : X \times X \to [0, \infty)$.

Let $d(\cdot, \cdot)$ denote some distance measure between objects $P$ and $Q$, and let $R$ denote some intermediate object.

A proper distance measure satisfies the following properties:
  1. $d(P, Q) = d(Q, P)$ [symmetry]
  2. $d(P, Q) \geq 0$ for all $P, Q$ [non-negativity]
  3. $d(P, Q) = 0$ if and only if $P = Q$ [identity of indiscernibles]
  4. $d(P, Q) \leq d(P, R) + d(R, Q)$ [triangle inequality]

Distances define the similarity (or dissimilarity) between objects.
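The four properties can be checked numerically for a given distance. The sketch below does this for the Euclidean distance on a small set of randomly generated points; the data are made up for illustration, and a numerical check on one point set is of course not a proof.

```r
# Made-up data: 6 objects measured on 3 variables
set.seed(1)
X <- matrix(rnorm(6 * 3), nrow = 6)
D <- as.matrix(dist(X))   # 6 x 6 Euclidean distance matrix

# properties 1-3: symmetry, non-negativity, zero only on the diagonal
symmetric   <- isTRUE(all.equal(D, t(D)))
nonnegative <- all(D >= 0)
zero.diag   <- all(diag(D) == 0) && all(D[row(D) != col(D)] > 0)

# property 4: d(P,Q) <= d(P,R) + d(R,Q) for every triple (P, Q, R)
triangle <- TRUE
n <- nrow(D)
for (p in 1:n) for (q in 1:n) for (r in 1:n) {
  # small tolerance guards against floating point rounding
  if (D[p, q] > D[p, r] + D[r, q] + 1e-12) triangle <- FALSE
}

c(symmetric = symmetric, nonnegative = nonnegative,
  zero.diag = zero.diag, triangle = triangle)
```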

Visualization of the Triangle Inequality

Figure: From https://en.wikipedia.org/wiki/triangle_inequality

Minkowski Metric (and its Special Cases)

The Minkowski Metric is defined as
$$d_m(x, y) = \left( \sum_{j=1}^{p} |x_j - y_j|^m \right)^{1/m}$$
where setting $m \geq 1$ defines a true distance metric.

Setting $m = 1$ gives the Manhattan distance (city block)
$$d_1(x, y) = \sum_{j=1}^{p} |x_j - y_j|$$

Setting $m = 2$ gives the Euclidean distance
$$d_2(x, y) = \left( \sum_{j=1}^{p} [x_j - y_j]^2 \right)^{1/2}$$

Setting $m = \infty$ gives the Chebyshev distance
$$d_\infty(x, y) = \max_j |x_j - y_j|$$
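A minimal R sketch of the Minkowski metric and its three special cases follows. The helper name minkowski and the two example points are assumptions made for illustration; the dist() calls at the end use base R's own method names for the same distances.

```r
# Minkowski distance between two numeric vectors (hypothetical helper)
minkowski <- function(x, y, m = 2) {
  if (is.infinite(m)) return(max(abs(x - y)))   # Chebyshev limit, m = Inf
  sum(abs(x - y)^m)^(1 / m)
}

# made-up points in the plane
x <- c(0, 0)
y <- c(3, 4)

d1   <- minkowski(x, y, m = 1)     # Manhattan: 3 + 4
d2   <- minkowski(x, y, m = 2)     # Euclidean: sqrt(9 + 16)
dinf <- minkowski(x, y, m = Inf)   # Chebyshev: max(3, 4)
c(manhattan = d1, euclidean = d2, chebyshev = dinf)

# the same special cases via base R's dist()
dist(rbind(x, y), method = "manhattan")
dist(rbind(x, y), method = "euclidean")
dist(rbind(x, y), method = "maximum")
```

For other values of m, dist(..., method = "minkowski", p = m) computes the same quantity, where dist()'s argument p plays the role of m in the notes.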

Hierarchical Clustering

Two Approaches to Hierarchical Clustering

Hierarchical clustering uses a series of successive mergers or divisions to group N objects based on some distance.

Agglomerative Hierarchical Clustering (bottom up)
  1. Begin with N clusters (each object is its own cluster)
  2. Merge the most similar objects
  3. Repeat 2 until all objects are in the same cluster

Divisive Hierarchical Clustering (top down)
  1. Begin with 1 cluster (all objects together)
  2. Split the most dissimilar objects
  3. Repeat 2 until all objects are in their own cluster

Dissimilarity between Objects (and Clusters?)

Our input for hierarchical clustering is an $N \times N$ dissimilarity matrix
$$D = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1N} \\ d_{21} & d_{22} & \cdots & d_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ d_{N1} & d_{N2} & \cdots & d_{NN} \end{pmatrix}$$
where $d_{uv} = d(X_u, X_v)$ is the distance between objects $X_u$ and $X_v$.

We know how to define dissimilarity between objects (i.e., $d_{uv}$), but how do we define dissimilarity between clusters of objects?

Measuring Inter-Cluster Distance (Dissimilarity)

Let $C_X = \{X_1, \ldots, X_m\}$ and $C_Y = \{Y_1, \ldots, Y_n\}$ denote two clusters.
  $X_j$ is the j-th object in cluster $C_X$ for $j = 1, \ldots, m$
  $Y_k$ is the k-th object in cluster $C_Y$ for $k = 1, \ldots, n$

To quantify the distance between two clusters, we could use:

Single Linkage: minimum (or nearest neighbor) distance
$$d(C_X, C_Y) = \min_{j,k} d(X_j, Y_k)$$

Complete Linkage: maximum (or furthest neighbor) distance
$$d(C_X, C_Y) = \max_{j,k} d(X_j, Y_k)$$

Average Linkage: average (across all pairs) distance
$$d(C_X, C_Y) = \frac{1}{mn} \sum_{j=1}^{m} \sum_{k=1}^{n} d(X_j, Y_k)$$
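To make the three linkage rules concrete, the sketch below computes each of them by hand for two tiny clusters. The cluster coordinates are made up for illustration, and Euclidean distance is assumed for d.

```r
# Two small made-up clusters of points on a line in the plane
CX <- rbind(c(0, 0), c(1, 0))            # m = 2 objects
CY <- rbind(c(4, 0), c(5, 0), c(6, 0))   # n = 3 objects

# m x n matrix of pairwise Euclidean distances d(X_j, Y_k):
# rows index objects in CX, columns index objects in CY
m <- nrow(CX)
D <- as.matrix(dist(rbind(CX, CY)))[1:m, -(1:m)]

single   <- min(D)    # nearest neighbor distance
complete <- max(D)    # furthest neighbor distance
average  <- mean(D)   # mean over all m * n pairs

c(single = single, complete = complete, average = average)
```

Here the pairwise distances are 4, 5, 6 (from the first point of CX) and 3, 4, 5 (from the second), so the three linkages give 3, 6, and 4.5 respectively.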

Visualizing the Different Linkage Methods

Figure: Figure 12.2 from Applied Multivariate Statistical Analysis, 6th Ed (Johnson & Wichern).

States Example: Dissimilarity Matrix

# look at states data
> ?state.x77
> vars <- c("Income","Illiteracy","Life Exp","HS Grad")
> head(state.x77[,vars])
           Income Illiteracy Life Exp HS Grad
Alabama      3624        2.1    69.05    41.3
Alaska       6315        1.5    69.31    66.7
Arizona      4530        1.8    70.55    58.1
Arkansas     3378        1.9    70.66    39.9
California   5114        1.1    71.71    62.6
Colorado     4884        0.7    72.06    63.9
> apply(state.x77[,vars], 2, mean)
    Income Illiteracy   Life Exp    HS Grad
 4435.8000     1.1700    70.8786    53.1080
> apply(state.x77[,vars], 2, sd)
     Income  Illiteracy    Life Exp     HS Grad
614.4699392   0.6095331   1.3423936   8.0769978

# create distance (raw and standardized)
> distraw <- dist(state.x77[,vars])
> diststd <- dist(scale(state.x77[,vars]))
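Because Income has a much larger standard deviation than the other three variables, the raw Euclidean distances are driven almost entirely by Income. The sketch below is one way to see this: it correlates the raw distances with distances computed from Income alone. The object names distinc and r.raw.inc are introduced only for this illustration.

```r
# state.x77 ships with base R (datasets package)
vars <- c("Income", "Illiteracy", "Life Exp", "HS Grad")

distraw <- dist(state.x77[, vars])           # raw distances, all four variables
distinc <- dist(state.x77[, "Income"])       # distances using Income only
diststd <- dist(scale(state.x77[, vars]))    # distances on standardized variables

# correlation between the raw distances and the Income-only distances
r.raw.inc <- cor(as.vector(distraw), as.vector(distinc))
# same comparison for the standardized distances
r.std.inc <- cor(as.vector(diststd), as.vector(distinc))

c(raw = r.raw.inc, standardized = r.std.inc)
```

A correlation near 1 for the raw distances indicates that the other three variables contribute very little, which is the usual motivation for standardizing with scale() before clustering.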

States Example: HCA via Three Linkage Methods

# hierarchical clustering (raw data)
> hcrawsl <- hclust(distraw, method="single")
> hcrawcl <- hclust(distraw, method="complete")
> hcrawal <- hclust(distraw, method="average")

# hierarchical clustering (standardized data)
> hcstdsl <- hclust(diststd, method="single")
> hcstdcl <- hclust(diststd, method="complete")
> hcstdal <- hclust(diststd, method="average")
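A fitted tree can also be cut into a fixed number of groups to obtain cluster labels. This step is not shown in the notes; the sketch below uses base R's cutree() on the complete-linkage fit for the standardized data, with k = 4 chosen arbitrarily for illustration.

```r
# rebuild the standardized distances and the complete-linkage tree
vars <- c("Income", "Illiteracy", "Life Exp", "HS Grad")
diststd <- dist(scale(state.x77[, vars]))
hcstdcl <- hclust(diststd, method = "complete")

# cut the tree into k = 4 groups (k is an arbitrary choice here)
grp <- cutree(hcstdcl, k = 4)

table(grp)                     # number of states in each group
split(names(grp), grp)[[1]]    # names of the states in group 1
```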

States Example: Results for Raw Data

Figure: Three cluster dendrograms of the 50 states (vertical axis: Height) for the raw-data distances distraw, one per linkage method: hclust (*, "single"), hclust (*, "complete"), and hclust (*, "average").

plot(hcrawsl)
plot(hcrawcl)
plot(hcrawal)

States Example: Results for Standardized Data

Figure: Three cluster dendrograms of the 50 states (vertical axis: Height) for the standardized-data distances diststd, one per linkage method: hclust (*, "single"), hclust (*, "complete"), and hclust (*, "average").

plot(hcstdsl)
plot(hcstdcl)
plot(hcstdal)

States Example: Standardized Data w/ Complete Link

Figure: Cluster dendrogram of the 50 states (vertical axis: Height, 0 to 6) for diststd with hclust (*, "complete").

Non-Hierarchical Clustering

Non-Hierarchical Clustering: Definition

Non-hierarchical clustering partitions a set of N objects into K distinct groups based on some distance (or dissimilarity).

The number of clusters K can be known a priori or can be estimated as a part of the procedure.

Regardless, we need to start with some initial partition or seed points which define cluster centers.
  Try many different randomly generated seed points

K Means: Clustering via Distance to Centroids

K means clustering refers to the algorithm:
  1. Partition the N objects into K distinct clusters $C_1, \ldots, C_K$
  2. For each $i = 1, \ldots, N$:
     2a. Assign object $X_i$ to the cluster $C_k$ that has the closest centroid (mean)
     2b. Update cluster centroids if $X_i$ is reassigned to a new cluster
  3. Repeat 2 until all objects remain in the same cluster

Note: we could replace step 1 with "Define K seed points giving the centroids of clusters $C_1, \ldots, C_K$".

It is good to use MANY random starts of the above algorithm.
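The steps above can be sketched in a few lines of R. Note that this sketch is a batch (Lloyd-style) variant: it reassigns all objects first and then updates every centroid, rather than updating centroids after each single reassignment as in step 2b. It starts from seed points (the alternative to step 1 mentioned in the note). The function name simple.kmeans and the simulated data are assumptions made for illustration; in practice the built-in kmeans() would be used.

```r
# Batch (Lloyd-style) K means sketch; simple.kmeans is a hypothetical helper
simple.kmeans <- function(X, centers, max.iter = 100) {
  K <- nrow(centers)
  cluster <- rep(0L, nrow(X))
  for (iter in seq_len(max.iter)) {
    # squared Euclidean distance from every object to every centroid (N x K)
    d2 <- sapply(seq_len(K), function(k) rowSums(sweep(X, 2, centers[k, ])^2))
    # assign each object to the cluster with the closest centroid
    new.cluster <- max.col(-d2, ties.method = "first")
    # stop when no object changes cluster
    if (all(new.cluster == cluster)) break
    cluster <- new.cluster
    # update each centroid to the mean of its current members
    for (k in seq_len(K)) {
      if (any(cluster == k)) {
        centers[k, ] <- colMeans(X[cluster == k, , drop = FALSE])
      }
    }
  }
  list(cluster = cluster, centers = centers)
}

# made-up data: two well-separated groups of 20 points each
set.seed(1)
X <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 10), ncol = 2))

# seed points: one row taken from each group
fit <- simple.kmeans(X, centers = X[c(1, 21), ])
table(fit$cluster)
```

With poorly chosen seeds the algorithm can converge to a worse partition, which is why many random starts are recommended.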

States Example: K Means on Raw Data

# look at states data
> ?state.x77
> vars <- c("Income","Illiteracy","Life Exp","HS Grad")
> apply(state.x77[,vars], 2, mean)
    Income Illiteracy   Life Exp    HS Grad
 4435.8000     1.1700    70.8786    53.1080

# fit k means for k = 2, ..., 10 (raw data)
> kmlist <- vector("list", 9)
> for(k in 2:10){
+   set.seed(1)
+   kmlist[[k-1]] <- kmeans(state.x77[,vars], k, nstart=5000)
+ }

States Example: Scree Plot for Raw Data

Figure: Scree Plot: Raw Data (horizontal axis: # Clusters, 2 to 10; vertical axis: SSW / SST).

tot.withinss <- sapply(kmlist, function(x) x$tot.withinss)
plot(2:10, tot.withinss / kmlist[[1]]$totss, type="b", xlab="# Clusters",
     ylab="SSW / SST", main="Scree Plot: Raw Data")
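The ratio plotted on the scree plot can be reproduced by hand, which makes clear what the tot.withinss and totss components of a kmeans() fit contain. The sketch below does this for a single fit; K = 3 and nstart = 25 are arbitrary choices made to keep the example fast, not the settings used in the notes.

```r
vars <- c("Income", "Illiteracy", "Life Exp", "HS Grad")
X <- state.x77[, vars]

set.seed(1)
km <- kmeans(X, centers = 3, nstart = 25)

# SST: sum of squared distances of all objects to the grand mean
sst <- sum(scale(X, scale = FALSE)^2)

# SSW: sum of squared distances of each object to its own cluster centroid
ssw <- sum((X - km$centers[km$cluster, ])^2)

# the hand-computed ratio should agree with the one from the fit
c(by.hand = ssw / sst, from.fit = km$tot.withinss / km$totss)
```

SSW / SST can only decrease as K grows, so the scree plot is read by looking for the point where adding clusters stops producing a large drop.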

States Example: Cluster Plot for Raw Data

Figure: Four panels titled "K=3 Clusters: Raw Data", "K=4 Clusters: Raw Data", "K=5 Clusters: Raw Data", and "K=6 Clusters: Raw Data".

States Example: K Means on Standardized Data

# look at states data
> ?state.x77
> vars <- c("Income","Illiteracy","Life Exp","HS Grad")
> apply(state.x77[,vars], 2, mean)
    Income Illiteracy   Life Exp    HS Grad
 4435.8000     1.1700    70.8786    53.1080

# fit k means for k = 2, ..., 10 (standardized data)
> Xs <- scale(state.x77[,vars])
> kmlist.std <- vector("list", 9)
> for(k in 2:10){
+   set.seed(1)
+   kmlist.std[[k-1]] <- kmeans(Xs, k, nstart=5000)
+ }

States Example: Scree Plot for Standardized Data

Figure: Scree Plot: Std. Data (horizontal axis: # Clusters, 2 to 10; vertical axis: SSW / SST).

tot.withinss.std <- sapply(kmlist.std, function(x) x$tot.withinss)
plot(2:10, tot.withinss.std / kmlist.std[[1]]$totss, type="b", xlab="# Clusters",
     ylab="SSW / SST", main="Scree Plot: Std. Data")

States Example: Cluster Plot for Standardized Data

Figure: Four panels titled "K=3 Clusters: Std. Data", "K=4 Clusters: Std. Data", "K=5 Clusters: Std. Data", and "K=6 Clusters: Std. Data".