How to Create Box Plots with Jitter Points in R ggplot2

Box plots display the median, quartiles, and range of a distribution. However, box plots can hide sample size and local distribution details. Overlaying jittered raw data points solves this, showing every individual observation.
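As a rough illustration of what a box plot alone leaves out, the sketch below simulates a hypothetical data frame `df`. The column names `group` and `measurement` match the snippets in this tutorial; the values themselves are invented. Group A is bimodal with 100 observations and group B has only 8, yet the five-number summary that a box plot draws reveals neither fact.

```r
# Hypothetical example data (invented values, not from a real experiment).
set.seed(42)
df <- data.frame(
  group = rep(c("A", "B"), times = c(100, 8)),
  measurement = c(
    rnorm(50, mean = 8, sd = 0.7),   # group A, first mode
    rnorm(50, mean = 12, sd = 0.7),  # group A, second mode
    rnorm(8, mean = 10, sd = 2)      # group B, only 8 observations
  )
)

# Five-number summary per group: min, lower hinge, median, upper hinge, max.
# This is essentially all the information a box plot encodes.
summaries <- tapply(df$measurement, df$group, fivenum)
print(summaries)

# Sample size per group: not visible in a plain box plot.
n <- table(df$group)
print(n)
```

Plotting every observation on top of the box makes both the gap in group A and the small sample in group B visible at a glance.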
In This Tutorial
1. Basic Box Plot (geom_boxplot)
2. Adding Raw Points (geom_jitter)
3. Avoiding Outlier Duplication
4. Violin Overlay Combinations
1. Basic Box Plot with geom_boxplot
Create standard box plots to visualize distribution quartiles:
R / ggplot2
library(ggplot2)

ggplot(df, aes(x = group, y = measurement)) +
  geom_boxplot(fill = "gray95", width = 0.5)
2. Adding Raw Points with geom_jitter
Overlay the individual observations on the boxes. Set `width` inside `geom_jitter` to limit the horizontal spread so that points stay within their own group.
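Two details of the jitter are worth a quick sketch before the tutorial's own snippet. When `width` or `height` is omitted, `geom_jitter()` defaults each to 40% of the resolution of the data, so an unset `height` can also move points vertically, and the random offsets change on every render unless a seed is fixed. The example below is one possible way to handle both, assuming a simulated `df` with invented integer scores; it passes `position_jitter()` with `height = 0` and a `seed`.

```r
library(ggplot2)

# Hypothetical data: integer-valued scores, where vertical jitter
# would visibly move points off their true values.
set.seed(42)
df <- data.frame(
  group = rep(c("Control", "Treated"), each = 30),
  measurement = c(rpois(30, lambda = 5), rpois(30, lambda = 8))
)

p <- ggplot(df, aes(x = group, y = measurement)) +
  geom_boxplot(fill = "gray95", width = 0.5) +
  geom_jitter(
    color = "#4f46e5", alpha = 0.6, size = 1.5,
    # height = 0 leaves y values untouched; seed makes the
    # horizontal offsets identical on every render.
    position = position_jitter(width = 0.15, height = 0, seed = 1)
  )

print(p)
```

Note that `width` and `height` go inside `position_jitter()` here; supplying them to `geom_jitter()` together with a `position` argument is an error.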
R / ggplot2
ggplot(df, aes(x = group, y = measurement)) +
  geom_boxplot(fill = "gray95", width = 0.5) +
  geom_jitter(color = "#4f46e5", alpha = 0.6, width = 0.15, size = 1.5)
3. Avoiding Outlier Duplication
Crucial Tip: Disable Boxplot Outliers
When you overlay `geom_jitter()`, every outlier is drawn twice: once by `geom_boxplot()` as an outlier marker and once as a jittered point. Set `outlier.shape = NA` in the boxplot layer to hide the boxplot's own markers; the observations still appear in the jitter layer.
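To see which observations are affected, recall that `geom_boxplot()` by default marks as outliers the points lying more than 1.5 times the interquartile range beyond the edges of the box. The base R sketch below applies the same rule with `boxplot.stats()` to a small invented vector. Base R computes the box edges slightly differently from ggplot2, so treat it as an approximation rather than an exact replica of what ggplot2 draws.

```r
# Invented measurements with one extreme value.
x <- c(4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 12.0)

# coef = 1.5 is the conventional whisker rule (also ggplot2's default).
flagged <- boxplot.stats(x, coef = 1.5)$out
print(flagged)  # 12 is the only value beyond the whisker limits
```

Each flagged value would appear once at the group's center as a boxplot outlier marker and once more at a horizontally shifted position from the jitter layer.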
R / ggplot2
ggplot(df, aes(x = group, y = measurement)) +
  geom_boxplot(outlier.shape = NA, fill = "gray95", width = 0.5) +
  geom_jitter(color = "#4f46e5", alpha = 0.6, width = 0.15)
4. Violin Overlay Combinations
Another common layout in scientific figures places a narrow box plot inside a violin plot, so the density shape and the quartiles are visible together:
R / ggplot2
ggplot(df, aes(x = group, y = measurement)) +
  geom_violin(fill = "gray90", color = "gray60", alpha = 0.5) +
  geom_boxplot(width = 0.15, fill = "white", outlier.shape = NA)
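The layers from the sections above can also be combined into one figure. The following is a minimal, self-contained sketch rather than a definitive recipe: the data frame is simulated with invented values, and the jitter width, labels, and theme are illustrative choices. Layer order matters, since later layers are drawn on top, so the violin comes first and the raw points last.

```r
library(ggplot2)

# Hypothetical data with the column names used throughout the tutorial.
set.seed(1)
df <- data.frame(
  group = rep(c("Control", "Treated"), each = 40),
  measurement = c(rnorm(40, mean = 10, sd = 2), rnorm(40, mean = 12, sd = 3))
)

p <- ggplot(df, aes(x = group, y = measurement)) +
  # 1. Violin in the background shows the density shape.
  geom_violin(fill = "gray90", color = "gray60", alpha = 0.5) +
  # 2. Narrow box plot inside; its outlier markers are hidden
  #    because the jitter layer already shows every observation.
  geom_boxplot(width = 0.15, fill = "white", outlier.shape = NA) +
  # 3. Raw observations on top; height = 0 keeps y values exact.
  geom_jitter(color = "#4f46e5", alpha = 0.6, size = 1.2,
              width = 0.08, height = 0) +
  labs(x = NULL, y = "Measurement") +
  theme_classic()

print(p)
```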
Experimental Physicist & Photonics Researcher
Hands-on experience in silicon photonics, semiconductor fabrication (DRIE/ICP-RIE), optical simulation, and data-driven analysis. Built Plotivy to help researchers focus on discoveries instead of data struggles.