Python Scatter Plot Tutorial: matplotlib, seaborn, and plotly (2026)

Scatter plots are the workhorse of scientific figures: two continuous variables, one relationship. But making a scatter plot that is honest, readable, and journal-ready takes more than a single ax.scatter() call. This guide walks through the most useful variations in matplotlib, with notes on when each one applies.
Marker size
Encode a third continuous variable by varying s (size) across points — bubble chart style.
Regression line
np.polyfit gives you slope and intercept in one line; add R² to the legend or caption.
Overplotting
With >500 points, reduce alpha or switch to a 2-D histogram or hexbin plot.
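The marker-size and overplotting notes above can be sketched in one two-panel figure. This is a minimal illustration on synthetic data, not a recommended default: the third variable w, the scaling factor 30, and gridsize=30 are all assumptions chosen for readability. Note that s sets marker area in points², so scaling s linearly with w keeps area proportional to the value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)

# Small sample for the bubble chart: w is a hypothetical third variable
x = rng.uniform(0, 10, 60)
y = rng.uniform(0, 10, 60)
w = rng.uniform(1, 5, 60)
sizes = 30 * w  # s is marker AREA in points^2, so area scales with w

# Dense sample where individual markers would overlap heavily
xd = rng.normal(5, 1.5, 5000)
yd = 1.2 * xd + rng.normal(0, 1.0, 5000)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 5))

# Left: bubble chart, semi-transparent so overlaps stay visible
ax1.scatter(x, y, s=sizes, color="#6b21a8", alpha=0.6,
            edgecolors="white", linewidths=0.4)
ax1.set_xlabel("X")
ax1.set_ylabel("Y")
ax1.set_title("Marker size encodes w")

# Right: hexbin counts points per hexagon; mincnt=1 hides empty bins
hb = ax2.hexbin(xd, yd, gridsize=30, cmap="plasma", mincnt=1)
fig.colorbar(hb, ax=ax2, label="Points per bin")
ax2.set_xlabel("X")
ax2.set_ylabel("Y")
ax2.set_title("Hexbin for dense data")

plt.tight_layout()
```

If size carries meaning, the caption or a size legend should say what it represents, just as a colourbar does for colour.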
1. Basic scatter plot
Start with ax.scatter(x, y). Set s (point size) and use white edgecolors to separate overlapping points slightly.
import matplotlib.pyplot as plt
import numpy as np
rng = np.random.default_rng(42)
x = rng.normal(5, 1.5, 80)
y = 1.2 * x + rng.normal(0, 1.0, 80)
fig, ax = plt.subplots(figsize=(6, 5))
ax.scatter(x, y, color="#6b21a8", s=40, edgecolors="white", linewidths=0.4)
ax.set_xlabel("Variable X")
ax.set_ylabel("Variable Y")
ax.set_title("Basic Scatter Plot")
plt.tight_layout()
2. Encode a third variable with a colour map
Pass a numeric array to c and a cmap name. Always add a colourbar with a label so readers can decode the mapping.
import matplotlib.pyplot as plt
import numpy as np
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 120)
y = rng.uniform(0, 10, 120)
z = np.sin(x) * np.cos(y) # third dimension encoded as color
fig, ax = plt.subplots(figsize=(6, 5))
sc = ax.scatter(x, y, c=z, cmap="plasma", s=50, edgecolors="none")
plt.colorbar(sc, ax=ax, label="sin(x)·cos(y)")
ax.set_xlabel("X")
ax.set_ylabel("Y")
plt.tight_layout()
3. Add a regression line
Overlay a least-squares fit with np.polyfit. Report slope, intercept, and R² either in the legend or in the figure caption.
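np.polyfit returns only the coefficients, so R² has to be computed separately. A minimal sketch, assuming the same synthetic data used in this section; the names y_hat, ss_res, ss_tot, r2, and label are illustrative, not part of any library API.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(5, 1.5, 80)
y = 1.2 * x + rng.normal(0, 1.0, 80)

# Least-squares line: polyfit returns the highest-degree coefficient first
m, b = np.polyfit(x, y, 1)

# R^2 = 1 - (residual sum of squares / total sum of squares)
y_hat = m * x + b
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# A string that could be passed as label= to ax.plot for the fitted line
label = f"y = {m:.2f}x + {b:.2f}, R² = {r2:.2f}"
print(label)
```

For a straight-line fit with an intercept, this value equals the squared Pearson correlation, which makes np.corrcoef(x, y)[0, 1] ** 2 a quick cross-check.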
import matplotlib.pyplot as plt
import numpy as np
rng = np.random.default_rng(42)
x = rng.normal(5, 1.5, 80)
y = 1.2 * x + rng.normal(0, 1.0, 80)
m, b = np.polyfit(x, y, 1)
xline = np.linspace(x.min(), x.max(), 200)
fig, ax = plt.subplots(figsize=(6, 5))
ax.scatter(x, y, color="#6b21a8", s=40, edgecolors="white", linewidths=0.4, label="Data")
ax.plot(xline, m * xline + b, color="#f59e0b", lw=2, label=f"y = {m:.2f}x + {b:.2f}")
ax.set_xlabel("Variable X")
ax.set_ylabel("Variable Y")
ax.legend(frameon=False)
plt.tight_layout()
4. Error bars on scatter points
For experimental data with measurement uncertainty, use ax.errorbar instead of ax.scatter. Set fmt to control the marker style and capsize to set the cap length in points.
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.8, 6.2, 8.1, 10.3])
xerr = np.array([0.2, 0.3, 0.2, 0.4, 0.3])
yerr = np.array([0.5, 0.4, 0.6, 0.5, 0.7])
fig, ax = plt.subplots(figsize=(6, 5))
ax.errorbar(x, y, xerr=xerr, yerr=yerr,
            fmt="o", color="#6b21a8", ecolor="gray",
            elinewidth=1.5, capsize=4, ms=7)
ax.set_xlabel("Variable X")
ax.set_ylabel("Variable Y")
plt.tight_layout()
Common mistakes
- Overplotting: when many points land on top of each other, use alpha=0.3 or switch to a hexbin plot.
- Unlabelled axes: always include the variable name and its units on both axes.
- Trendline without uncertainty: show a 95% confidence band around regression lines in small samples.
- Misleading aspect ratio: stretching one axis exaggerates or hides correlation.
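One way to draw the 95% confidence band mentioned in the list above is a percentile bootstrap, sketched here with NumPy only. This is an assumption about method rather than the only option: an analytic band based on the t-distribution is the more common textbook choice. The sample size of 30, n_boot = 2000, and all variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(5, 1.5, 30)  # deliberately small sample
y = 1.2 * x + rng.normal(0, 1.0, 30)

m, b = np.polyfit(x, y, 1)
xline = np.linspace(x.min(), x.max(), 200)

# Percentile bootstrap: refit on resampled (x, y) pairs, collect fitted lines
n_boot = 2000
boot_lines = np.empty((n_boot, xline.size))
for i in range(n_boot):
    idx = rng.integers(0, x.size, x.size)  # indices drawn with replacement
    mi, bi = np.polyfit(x[idx], y[idx], 1)
    boot_lines[i] = mi * xline + bi

# Pointwise 2.5th and 97.5th percentiles give the band edges
lower, upper = np.percentile(boot_lines, [2.5, 97.5], axis=0)

fig, ax = plt.subplots(figsize=(6, 5))
ax.scatter(x, y, color="#6b21a8", s=40, edgecolors="white",
           linewidths=0.4, label="Data")
ax.plot(xline, m * xline + b, color="#f59e0b", lw=2, label="Least-squares fit")
ax.fill_between(xline, lower, upper, color="#f59e0b", alpha=0.25,
                label="95% bootstrap band")
ax.set_xlabel("Variable X")
ax.set_ylabel("Variable Y")
ax.legend(frameon=False)
plt.tight_layout()
```

The band describes uncertainty in the fitted line, not where new observations will fall; a prediction interval would be wider.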
Experimental Physicist & Photonics Researcher
Hands-on experience in silicon photonics, semiconductor fabrication (DRIE/ICP-RIE), optical simulation, and data-driven analysis. Built Plotivy to help researchers focus on discoveries instead of data struggles.