Hierarchical clustering

Author

Cox Lab

Published

June 27, 2024

1 General

Type: Matrix Analysis
Heading: Clustering/PCA
Source code: not public.

2 Brief description

This activity performs hierarchical clustering of rows and/or columns and produces a heat map of the clustered matrix. Clustering can be performed with a choice of distances and linkages. To display the data as a heat map without clustering, deselect both row and column clustering.

3 Parameters

3.1 Row tree

If checked, rows will be clustered and a tree (dendrogram) is generated (default: checked).

3.1.1 Distance

Distance measure used for clustering the rows (default: Euclidean). It can be selected from a predefined list:

Euclidean
L1
Maximum
Lp
Pearson correlation
Spearman correlation
Cosine
Canberra

3.1.2 Linkage

Linkage method applied when clustering the rows (default: Average). It can be selected from a predefined list:

Average
Complete
Single
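
The three linkages differ only in how the distance between two clusters is derived from the pairwise distances of their members. The following sketch uses the standard definitions (mean, largest and smallest pairwise distance) inside a naive agglomerative loop. It is not the software's implementation, which is not public; the function names and the use of the Euclidean distance are assumptions made for illustration.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def linkage_distance(cluster_a, cluster_b, method="average", dist=euclidean):
    # All pairwise distances between members of the two clusters.
    pairwise = [dist(a, b) for a in cluster_a for b in cluster_b]
    if method == "average":
        return sum(pairwise) / len(pairwise)
    if method == "complete":
        return max(pairwise)
    if method == "single":
        return min(pairwise)
    raise ValueError("unknown linkage: " + method)

def agglomerate(points, method="average", dist=euclidean):
    # Start with one cluster per point; ids 0..n-1 are the leaves.
    clusters = {i: [p] for i, p in enumerate(points)}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                d = linkage_distance(clusters[a], clusters[b], method, dist)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Merge the pair into a new cluster and record the tree node.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges

points = [[0.0], [1.0], [4.0], [6.0]]
for method in ("average", "complete", "single"):
    print(method, agglomerate(points, method))
```

Each entry of the returned list is one node of the dendrogram: the two merged cluster ids and the height at which they join. On this toy input only the height of the final merge depends on the linkage.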

3.1.3 Constraint

Constraint from the input data that the clustering should preserve (default: None). It can be selected from a predefined list:

None
Preserve order
Preserve order (periodic)
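
The source does not describe how the constraints act on the clustering. One plausible reading, stated here purely as an assumption, is that "Preserve order" allows only clusters that are neighbors in the input order to be merged, and that the periodic variant additionally treats the last and first elements as neighbors. Under that assumption, the set of merge candidates at each step could be sketched as follows (the function and constraint names are hypothetical):

```python
def candidate_merges(n_clusters, constraint="none"):
    """Index pairs of clusters that may be merged next.

    Clusters are assumed to be kept in input order, indexed 0..n_clusters-1.
    """
    if constraint == "none":
        # Unconstrained: every pair of clusters is a candidate.
        return [(i, j) for i in range(n_clusters) for j in range(i + 1, n_clusters)]
    if constraint not in ("order", "periodic"):
        raise ValueError("unknown constraint: " + constraint)
    # Assumed reading of "Preserve order": only adjacent clusters may merge.
    pairs = [(i, i + 1) for i in range(n_clusters - 1)]
    if constraint == "periodic" and n_clusters > 2:
        # Assumed reading of the periodic variant: the ends wrap around.
        pairs.append((n_clusters - 1, 0))
    return pairs

print(candidate_merges(4, "none"))
print(candidate_merges(4, "order"))
print(candidate_merges(4, "periodic"))
```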

3.1.4 Preprocess with k-means

Specifies whether the data should be pre-processed using k-means before applying clustering and generating a heat map (default: checked).

3.1.5 Number of clusters

This parameter is only relevant if the parameter “Preprocess with k-means” is checked. It defines the number of clusters that will be created by the k-means algorithm (default: 300).
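
The idea behind the pre-processing step can be sketched as follows: k-means first reduces the rows to at most the chosen number of centroids, and the hierarchical clustering then operates on those centroids instead of on every row. The sketch below is a plain Lloyd-style k-means. The initialization, the fixed iteration count and the function name are assumptions made for illustration, since the source does not specify them.

```python
import random

def kmeans(rows, k, iterations=20, seed=0):
    # Assumed initialization: k distinct rows chosen at random.
    rng = random.Random(seed)
    k = min(k, len(rows))
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        for i, row in enumerate(rows):
            labels[i] = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])),
            )
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if labels[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

rows = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, labels = kmeans(rows, k=2)
print(sorted(centroids), labels)
```

With the default of 300 clusters, a matrix with many thousands of rows would be reduced to at most 300 centroids before the tree is built, which keeps the pairwise distance computations of the hierarchical step manageable.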

3.2 Column tree

If checked, columns will be clustered and a tree (dendrogram) is generated (default: checked).

3.2.1 Distance

Distance measure used for clustering the columns (default: Euclidean). It can be selected from a predefined list:

Euclidean
L1
Maximum
Lp
Pearson correlation
Spearman correlation

3.2.2 Linkage

Linkage method applied when clustering the columns (default: Average). It can be selected from a predefined list:

Average
Complete
Single

3.2.3 Constraint