Hierarchical Clustering Heatmap in Python

Technique overview

Cluster rows and columns to reveal structure in gene expression, similarity matrices, and other multivariate datasets.

Hierarchical clustering heatmaps reorder rows and columns so structure in a matrix becomes visible. They are common in gene expression, proteomics, metabolomics, similarity matrices, and high-dimensional assay screens. A useful clustered heatmap depends on analysis choices: row scaling, distance metric, and linkage method can each change the apparent clusters, and annotation choices change how those clusters are read. The figure should make those choices explicit.

Python libraries: scipy, numpy, matplotlib, seaborn

Mathematical Foundation

The method computes a pairwise distance between rows (and, separately, between columns), builds a linkage tree from those distances, and reorders the heatmap by the leaf order of each tree:

distance(row_i, row_j) -> linkage tree -> reordered heatmap
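As one concrete instance of this pipeline, the choices used in the code on this page (row z-scores, correlation distance, average linkage) can be written out as follows. The symbols are introduced here for illustration and are not part of the original notation: x_{ik} is the value of row i in column k, \bar{x}_i and s_i are the mean and standard deviation of row i, r_{ij} is the Pearson correlation between rows i and j, and A and B are two clusters being considered for merging.

```latex
z_{ik} = \frac{x_{ik} - \bar{x}_i}{s_i}, \qquad
d(i, j) = 1 - r_{ij}, \qquad
D(A, B) = \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a, b)
```

At each step the two clusters with the smallest D(A, B) are merged, and the sequence of merges forms the dendrogram. Other metrics or linkage rules replace d and D respectively.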

Parameter breakdown

  • distance metric: Euclidean, correlation, cosine, or another dissimilarity measure
  • linkage: Rule for merging clusters, such as average, complete, or ward
  • z-score scaling: Row-wise normalization to emphasize patterns over absolute magnitude
  • dendrogram: Tree showing hierarchical relationships
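These parameters interact. The following is a minimal sketch on made-up random data (the matrix `X` and the seed are assumptions for illustration only) showing two consequences of row z-scoring: correlation distance between rows is unchanged by it, and on z-scored rows the squared Euclidean distance is a fixed multiple of the correlation distance, so both metrics rank row pairs identically.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 6))  # toy matrix: 10 rows, 6 columns (made-up data)

# Row-wise z-score using the sample standard deviation (ddof=1, the pandas default)
Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

d_corr_raw = pdist(X, metric="correlation")  # 1 - Pearson r on the raw rows
d_corr = pdist(Z, metric="correlation")      # 1 - Pearson r on the z-scored rows
d_eucl = pdist(Z, metric="euclidean")

n_cols = X.shape[1]
# Correlation distance ignores per-row shifts and positive rescaling
unchanged_by_scaling = np.allclose(d_corr_raw, d_corr)
# For z-scored rows: d_eucl**2 == 2 * (n_cols - 1) * d_corr
same_ranking = np.allclose(d_eucl**2, 2 * (n_cols - 1) * d_corr)
print(unchanged_by_scaling, same_ranking)
```

In practice this means that with `metric="correlation"` the row z-score mainly affects the heatmap colors, while with Euclidean distance it also changes which rows cluster together.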

When to use this technique

Use clustered heatmaps for exploratory pattern discovery in matrices where both samples and features may have meaningful groupings.

Implementation Code

The core data processing logic. Copy this block and replace the sample data with your measurements.

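import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import pdist

# Sample data: 24 genes x 6 samples, built as three blocks of 8 genes
np.random.seed(4)
matrix = np.vstack([
    np.random.normal(1.2, 0.35, (8, 6)),
    np.random.normal(-0.8, 0.35, (8, 6)),
    np.random.normal(0.2, 0.35, (8, 6)),
])
# Shift the last three samples by a block-specific offset so each block has its own pattern
matrix[:, 3:] += np.repeat([0.7, -0.4, 0.2], 8)[:, None]
df = pd.DataFrame(matrix, index=[f"gene_{i+1}" for i in range(24)],
                  columns=[f"sample_{i+1}" for i in range(6)])

# Row z-score: center each gene on its mean and scale by its standard deviation
row_z = df.sub(df.mean(axis=1), axis=0).div(df.std(axis=1), axis=0)
# Average-linkage tree built on the correlation distance between rows
row_linkage = linkage(pdist(row_z.values, metric="correlation"), method="average")
# The leaf order of the tree is the row order used in the heatmap
row_order = leaves_list(row_linkage)
print(row_z.index[row_order].tolist()[:5])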
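The block above orders rows only. As a hedged sketch of the remaining step, the same approach can be applied to the transposed matrix to order the columns, and both leaf orders can then be used to build the reordered matrix that a heatmap would display. The names `col_linkage`, `col_order`, and `ordered` are introduced here for illustration; the sample data is regenerated so the block runs on its own.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import pdist

# Same sample data as in the implementation code
np.random.seed(4)
matrix = np.vstack([
    np.random.normal(1.2, 0.35, (8, 6)),
    np.random.normal(-0.8, 0.35, (8, 6)),
    np.random.normal(0.2, 0.35, (8, 6)),
])
matrix[:, 3:] += np.repeat([0.7, -0.4, 0.2], 8)[:, None]
df = pd.DataFrame(matrix, index=[f"gene_{i+1}" for i in range(24)],
                  columns=[f"sample_{i+1}" for i in range(6)])
row_z = df.sub(df.mean(axis=1), axis=0).div(df.std(axis=1), axis=0)

# Rows: distances between genes; columns: distances between samples (transpose)
row_linkage = linkage(pdist(row_z.values, metric="correlation"), method="average")
col_linkage = linkage(pdist(row_z.values.T, metric="correlation"), method="average")

row_order = leaves_list(row_linkage)
col_order = leaves_list(col_linkage)

# Reordered matrix: same values, rows and columns permuted by dendrogram leaf order
ordered = row_z.iloc[row_order, col_order]
print(ordered.columns.tolist())
print(ordered.shape)
```

Reordering is a permutation only: no values change, which is why the scaling, metric, and linkage choices should be reported alongside the figure.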

Visualization Code

Complete seaborn and matplotlib code for a publication-ready figure. Copy, paste into your notebook, and adjust labels to match your data.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Sample data: 24 genes x 6 samples, built as three blocks of 8 genes
np.random.seed(4)
matrix = np.vstack([
    np.random.normal(1.2, 0.35, (8, 6)),
    np.random.normal(-0.8, 0.35, (8, 6)),
    np.random.normal(0.2, 0.35, (8, 6)),
])
matrix[:, 3:] += np.repeat([0.7, -0.4, 0.2], 8)[:, None]
df = pd.DataFrame(matrix, index=[f"gene_{i+1}" for i in range(24)],
                  columns=[f"sample_{i+1}" for i in range(6)])
# Row z-score so colors show each gene's pattern rather than its absolute level
row_z = df.sub(df.mean(axis=1), axis=0).div(df.std(axis=1), axis=0)

# clustermap clusters rows and columns with the given linkage method and metric
cluster = sns.clustermap(
    row_z,
    method="average",
    metric="correlation",
    cmap="vlag",
    center=0,  # diverging colormap centered on z = 0
    figsize=(7, 8),
    dendrogram_ratio=(0.18, 0.12),
    cbar_kws={"label": "Row z-score"},
)
# .figure is the preferred attribute in seaborn >= 0.11.2; older versions expose it as .fig
cluster.figure.suptitle("Hierarchical Clustering Heatmap", y=1.02)
cluster.savefig("hierarchical_clustering_heatmap.png", dpi=300, bbox_inches="tight")
plt.show()

Add Sample Group Color Annotations

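Column color bars make experimental groups visible without crowding the heatmap labels. This snippet reuses df, row_z, and the imports from the visualization code above.

# One group label per sample, indexed by the column names of the matrix
sample_group = pd.Series(["control", "control", "control", "treated", "treated", "treated"], index=df.columns)
palette = {"control": "#888888", "treated": "#9240ff"}
# Map each sample's group to a color; clustermap draws these as a bar above the columns
col_colors = sample_group.map(palette)
sns.clustermap(row_z, method="average", metric="correlation",
               col_colors=col_colors, cmap="vlag", center=0)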

Common Errors and How to Fix Them

Clusters are driven by absolute abundance only

Why: Unscaled rows with large magnitudes dominate the distance calculation.

Fix: Use row z-score normalization when the goal is pattern clustering across samples.
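A small sketch of this failure mode, using three made-up rows (the values are assumptions chosen for illustration): rows a and b share a rising pattern at very different abundance, while row c has the opposite pattern at the same abundance as a.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

X = np.array([
    [10.0, 11.0, 12.0, 13.0],      # a: rising pattern, low abundance
    [100.0, 110.0, 120.0, 130.0],  # b: rising pattern, high abundance
    [13.0, 12.0, 11.0, 10.0],      # c: falling pattern, low abundance
])

# Euclidean distances on the raw values
raw = squareform(pdist(X, metric="euclidean"))

# Euclidean distances after row-wise z-scoring
Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)
scaled = squareform(pdist(Z, metric="euclidean"))

# Unscaled: a sits closer to c (similar magnitude) than to b (same pattern)
print(raw[0, 2] < raw[0, 1])        # True
# Z-scored: a sits closer to b (same pattern) than to c (opposite pattern)
print(scaled[0, 1] < scaled[0, 2])  # True
```

If absolute abundance is the quantity of interest, leaving rows unscaled is a legitimate choice; the point is that the scaling decision should match the question.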

Too many labels overlap

Why: A heatmap with hundreds of rows cannot show every label legibly.

Fix: Hide row labels, label selected features, or split the matrix into cluster-specific panels.
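For the first option, passing `yticklabels=False` to `sns.clustermap` hides the row labels. For the last option, one possible sketch is to cut the row dendrogram into a few flat clusters and keep one sub-matrix per cluster, each small enough to plot with legible labels. The random matrix, the name `n_panels`, and its value are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
# Made-up matrix: 300 features x 8 samples, only to illustrate the bookkeeping
data = pd.DataFrame(
    rng.normal(size=(300, 8)),
    index=[f"feature_{i+1}" for i in range(300)],
    columns=[f"sample_{i+1}" for i in range(8)],
)
row_z = data.sub(data.mean(axis=1), axis=0).div(data.std(axis=1), axis=0)

row_linkage = linkage(pdist(row_z.values, metric="correlation"), method="average")

n_panels = 4  # hypothetical choice; "maxclust" yields at most this many clusters
cluster_ids = fcluster(row_linkage, t=n_panels, criterion="maxclust")

# One sub-matrix per flat cluster; each can be drawn as its own smaller heatmap
panels = {cid: row_z.loc[cluster_ids == cid] for cid in np.unique(cluster_ids)}
for cid, panel in panels.items():
    print(cid, panel.shape)
```

Because this data is pure noise, the resulting clusters carry no meaning; with real data the panel sizes will depend on the structure of the tree.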

Changing linkage changes the conclusion

Why: Cluster structure is exploratory and sensitive to both the distance metric and the linkage method.

Fix: Report the metric and linkage method, and test whether major clusters are robust.
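One way to test robustness, sketched here under stated assumptions, is to rebuild the tree with several linkage methods and compare (a) the cophenetic correlation, which measures how faithfully each tree preserves the original distances, and (b) whether pairs of rows end up in the same flat cluster. The two-group toy data and the helper `co_membership` are made up for illustration. Ward linkage is left out because SciPy documents it as correctly defined only for Euclidean distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(2)
# Made-up data: two groups of 10 rows with opposite patterns across 6 columns
pattern = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
X = np.vstack([
    pattern + rng.normal(0, 0.3, (10, 6)),
    -pattern + rng.normal(0, 0.3, (10, 6)),
])
dists = pdist(X, metric="correlation")

def co_membership(labels):
    """Boolean matrix: True where two rows share a cluster (ignores label numbering)."""
    return labels[:, None] == labels[None, :]

results = {}
for method in ("average", "complete", "single"):
    tree = linkage(dists, method=method)
    coph_corr, _ = cophenet(tree, dists)  # correlation of tree distances with dists
    labels = fcluster(tree, t=2, criterion="maxclust")
    results[method] = (float(coph_corr), labels)

# Fraction of row pairs on which each method agrees with average linkage
reference = co_membership(results["average"][1])
agreement = {
    method: float((co_membership(labels) == reference).mean())
    for method, (_, labels) in results.items()
}
for method, (coph_corr, _) in results.items():
    print(method, round(coph_corr, 3), round(agreement[method], 3))
```

Agreement near 1.0 across methods suggests the major clusters do not hinge on the linkage choice; lower values are a signal to report the sensitivity rather than a single clustering.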

Quick Info

Domain: Multivariate
Typical audience: Bioinformaticians, systems biologists, and data scientists exploring similarity patterns in high-dimensional measurements
