| Using this allocation rule, additional (new) samples can be classified into the predefined groups, and the corresponding probability of misclassification can be estimated (Figure 1). | | Using this allocation rule, additional (new) samples can be classified into the predefined groups, and the corresponding probability of misclassification can be estimated (Figure 1). |
− | Discriminant analysis requires the definition of a "distance" between any two groups. A widely used measure is the Mahalanobis distance (see Davis, 1986, for further details). | + | Discriminant analysis requires the definition of a "distance" between any two groups. A widely used measure is the Mahalanobis distance (see Davis, 1986, for further details)<ref name=Davis_1986 />. |
| In conclusion, although cluster analysis aims at an unsupervised classification, it is best when applied with some supervision and a prior idea of what natural or physical clusters could be. Cluster analysis can then prove to be a remarkable corroboratory tool, allowing prior speculations to be checked and quantified. | | In conclusion, although cluster analysis aims at an unsupervised classification, it is best when applied with some supervision and a prior idea of what natural or physical clusters could be. Cluster analysis can then prove to be a remarkable corroboratory tool, allowing prior speculations to be checked and quantified. |
− | [[File:Esbensen_etal__multivariate-data-analysis__Fig_1.png|thumb|{{figure_number|1}}Plot of two bivariate distributions, showing overlap between groups a and b along both variables ''x''<sub>1</sub> and ''x''<sub>2</sub>. Groups can be distinguished by projecting members of the two groups onto the discriminant function line. (From Davis, 1986)<ref name=davis_1986 />. ]] | + | [[File:Esbensen_etal__multivariate-data-analysis__Fig_1.png|thumb|{{figure_number|1}}Plot of two bivariate distributions, showing overlap between groups a and b along both variables ''x''<sub>1</sub> and ''x''<sub>2</sub>. Groups can be distinguished by projecting members of the two groups onto the discriminant function line. (From Davis, 1986)<ref name=Davis_1986 />. ]] |
| [[File:Esbensen_etal__multivariate-data-analysis__Fig_2.png|thumb|{{figure_number|2}}Dendrogram (by aggregation). Starting from n samples, combine the two most similar samples (here 2 and 3). Then, combine the two nearest groups by either joining two samples or aggregating a third sample to the previous group of two (1 is aggregated to 2 and 3). At the next step, 4 and 5 constitute a new group, which is then aggregated to the former group (1, 2, 3). The aggregation process stops when there is only one group left. In the last step, group (1, 2, 3, 4, 5) is aggregated to group (6, 7, 8, 9).]] | | [[File:Esbensen_etal__multivariate-data-analysis__Fig_2.png|thumb|{{figure_number|2}}Dendrogram (by aggregation). Starting from n samples, combine the two most similar samples (here 2 and 3). Then, combine the two nearest groups by either joining two samples or aggregating a third sample to the previous group of two (1 is aggregated to 2 and 3). At the next step, 4 and 5 constitute a new group, which is then aggregated to the former group (1, 2, 3). The aggregation process stops when there is only one group left. In the last step, group (1, 2, 3, 4, 5) is aggregated to group (6, 7, 8, 9).]] |
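
The aggregation procedure described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `agglomerate`, the sample coordinates, and the choice of single linkage (distance between the closest pair of members) as the measure of "nearest groups" are all assumptions made for illustration; the caption does not specify a linkage rule. Samples are indexed from 0 here, so index 1 corresponds to sample 2 in the caption.

```python
from itertools import combinations
from math import dist


def agglomerate(points):
    """Merge groups pairwise until one is left; return the merge history.

    `points` is a list of coordinate tuples. Each history entry is a pair
    (group_a, group_b) of tuples of 0-based sample indices.
    """
    # Start with one group per sample.
    groups = [(i,) for i in range(len(points))]
    history = []

    def linkage(a, b):
        # Single linkage (an assumption): distance between the closest
        # pair of samples, one taken from each group.
        return min(dist(points[i], points[j]) for i in a for j in b)

    # Stop when only one group is left.
    while len(groups) > 1:
        # Find the two nearest groups and merge them.
        a, b = min(combinations(groups, 2), key=lambda pair: linkage(*pair))
        groups.remove(a)
        groups.remove(b)
        groups.append(tuple(sorted(a + b)))
        history.append((a, b))
    return history


# Hypothetical one-dimensional samples chosen to mimic the first steps
# of the dendrogram: samples 2 and 3 are the most similar pair.
samples = [(0.0,), (1.0,), (1.2,), (5.0,), (6.5,)]
for group_a, group_b in agglomerate(samples):
    print(group_a, "+", group_b)
```

With these hypothetical coordinates, the merge order follows the caption: samples 2 and 3 are joined first, sample 1 is aggregated to them, samples 4 and 5 form a new group, and the two groups are combined last. A different linkage rule (complete or average, for instance) could produce a different order on other data.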