The Complete ggplot2 Scatter Plot Guide: Custom Themes & Fit Lines

Scatter plots are one of the most common ways to examine correlations, distributions, and trends between two numeric variables in scientific analysis. In R's ggplot2 package, they are built with the `geom_point()` geometry layer.
In This Guide
1. Basic Scatter Plot (geom_point)
2. Mapping Colors and Shapes
3. Adding Fit Lines (geom_smooth)
4. Customizing Point Aesthetics
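The snippets in this guide refer to a data frame `df` with `temperature`, `activity`, and `genotype` columns, but the guide never defines it. Here is a minimal sketch for following along. It assumes simulated values: the genotype labels, slope, offsets, and noise level are all made up for illustration and should be replaced with your own measurements.

```r
# Simulated example data; every number here is hypothetical.
set.seed(42)
n <- 120

df <- data.frame(
  temperature = runif(n, min = 15, max = 35),
  genotype    = factor(rep(c("WT", "mutA", "mutB"), length.out = n))
)

# Assumed linear response to temperature, shifted by a genotype-specific offset
offset <- unname(c(WT = 0, mutA = 5, mutB = -5)[as.character(df$genotype)])
df$activity <- 10 + 1.5 * df$temperature + offset + rnorm(n, sd = 4)

str(df)
```

Any data frame with two numeric columns and one categorical column works the same way; only the column names inside `aes()` need to change.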
1. Basic Scatter Plot with geom_point
Use `geom_point()` to draw data coordinates as points:
R / ggplot2
library(ggplot2)
ggplot(df, aes(x = temperature, y = activity)) +
  geom_point()

2. Mapping Colors and Shapes to Groups
Map a grouping variable to the `color` and `shape` aesthetics inside `aes()` so that each group gets its own color and marker:
R / ggplot2
ggplot(df, aes(x = temperature, y = activity, color = genotype, shape = genotype)) +
  geom_point(size = 2.5) +
  scale_color_brewer(palette = "Set1")

3. Adding Fit Lines with geom_smooth
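The group mapping from the previous section carries over to fit lines. When `color` is mapped inside `ggplot(aes())`, `geom_smooth()` inherits it and fits one line per group. When `color` is mapped only inside `geom_point()`, the smoother sees no grouping and fits a single pooled line. The sketch below contrasts the two, assuming a simulated two-genotype data frame whose values are made up for illustration:

```r
library(ggplot2)

# Simulated data (hypothetical values) with a genotype-specific slope
set.seed(1)
df <- data.frame(
  temperature = rep(seq(15, 35, length.out = 30), times = 2),
  genotype    = rep(c("WT", "mutant"), each = 30)
)
slope <- ifelse(df$genotype == "WT", 1.5, 0.8)
df$activity <- 10 + slope * df$temperature + rnorm(nrow(df), sd = 3)

# color mapped in ggplot(): geom_smooth() inherits it and fits one line per genotype
p_grouped <- ggplot(df, aes(x = temperature, y = activity, color = genotype)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# color mapped only in geom_point(): geom_smooth() fits a single pooled line
p_pooled <- ggplot(df, aes(x = temperature, y = activity)) +
  geom_point(aes(color = genotype)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black")
```

Print `p_grouped` or `p_pooled` to draw them. Which one is appropriate depends on whether the question is about each group's trend or the overall trend.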
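Add a linear regression line (`method = "lm"`) or a local LOESS curve (`method = "loess"`) with `geom_smooth()`. Setting `se = TRUE` draws the confidence band around the fit (95% by default):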
R / ggplot2
ggplot(df, aes(x = temperature, y = activity)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE, color = "red", fill = "gray80")

4. Customizing Point Aesthetics
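Set the `shape` argument, which uses the same numeric codes as base R's `pch`, to control the marker. Shape 21 is a filled circle with a separate border: `fill` sets the interior and `color` sets the outline, which keeps overlapping points legible: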
R / ggplot2
ggplot(df, aes(x = temperature, y = activity)) +
  geom_point(shape = 21, fill = "royalblue", color = "white", size = 3, stroke = 0.5)
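Because shapes 21 to 25 have a separate interior and border, the interior can also be mapped to a group while the border stays fixed. The sketch below shows one way to do that, mapping `fill` inside `aes()` and pairing it with `scale_fill_brewer()`. The data frame is simulated and its values are made up for illustration.

```r
library(ggplot2)

# Simulated data (hypothetical values)
set.seed(7)
df <- data.frame(
  temperature = runif(60, min = 15, max = 35),
  genotype    = rep(c("WT", "mutant"), each = 30)
)
df$activity <- 10 + 1.5 * df$temperature + rnorm(60, sd = 4)

# fill varies by genotype; the white border and stroke width are constant
p_filled <- ggplot(df, aes(x = temperature, y = activity, fill = genotype)) +
  geom_point(shape = 21, color = "white", size = 3, stroke = 0.5) +
  scale_fill_brewer(palette = "Set1")
```

With shapes 0 to 20, `fill` has no effect and `color` controls the whole marker, so use `scale_fill_*()` only with the fillable shapes.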
Experimental Physicist & Photonics Researcher
Hands-on experience in silicon photonics, semiconductor fabrication (DRIE/ICP-RIE), optical simulation, and data-driven analysis. Built Plotivy to help researchers focus on discoveries instead of data struggles.