Density Plot
Chart overview
Density plots provide a smooth, continuous estimate of the probability density function of a random variable.
Key points
- Unlike histograms, they don't depend on bin width or bin placement. Smoothness is controlled by a bandwidth parameter instead, which usually gives a cleaner view of the underlying distribution shape.
- Density plots are excellent for comparing distributions between groups and identifying multimodal patterns in your data.
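To make the key points concrete, here is a minimal sketch of what a Gaussian kernel density estimate computes: the average of one bell curve per observation, each with width equal to the bandwidth. The helper `gaussian_kde_manual`, the sample data, and the grid are illustrative assumptions. The comparison relies on SciPy's default (Scott's rule) bandwidth being `kde.factor` times the sample standard deviation.

```python
import numpy as np
from scipy import stats

def gaussian_kde_manual(samples, grid, bandwidth):
    """1-D Gaussian KDE: average of Gaussian kernels centred on each sample."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance between every grid point and every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (samples.size * bandwidth)

rng = np.random.default_rng(0)
samples = rng.normal(loc=1100, scale=150, size=300)  # hypothetical scores
grid = np.linspace(400, 1800, 701)

# Reuse SciPy's default bandwidth so the two estimates are comparable
kde = stats.gaussian_kde(samples)
bandwidth = kde.factor * samples.std(ddof=1)

manual = gaussian_kde_manual(samples, grid, bandwidth)
reference = kde(grid)

# A density should integrate to roughly 1 over a grid that covers the data
area = np.sum(manual) * (grid[1] - grid[0])
print(f"bandwidth: {bandwidth:.1f}")
print(f"max abs difference vs scipy: {np.max(np.abs(manual - reference)):.2e}")
print(f"area under curve: {area:.3f}")
```

No bins appear anywhere in the calculation; the only tuning knob is `bandwidth`.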
Example Visualization
[Image: example density plot]
Create This Chart Now
Generate publication-ready density plots with AI in seconds. No coding required – just describe your data and let AI do the work.
Example prompt
"Create a density plot (KDE) showing the distribution of 'SAT Scores' for 1000 students across 3 school districts. Generate realistic data: District A (urban, mean=1100, sd=150), District B (suburban, mean=1200, sd=120), District C (private, mean=1300, sd=100). Shade the area under each curve with semi-transparent fills using distinct colors. Add vertical dashed lines for each district's mean with annotations. Include a vertical line at the national average (1060). Add a rug plot showing individual observations (alpha=0.1). X-axis: 'SAT Score (400-1600)', Y-axis: 'Density'. Add legend with district names and sample sizes. Title: 'SAT Score Distribution by School District'."
How to create this chart in 30 seconds
Upload Data
Drag & drop your Excel or CSV file. Plotivy securely processes it in your browser.
AI Generation
Our AI analyzes your data and generates the Density Plot code automatically.
Customize & Export
Tweak the design with natural language, then export as high-res PNG, SVG or PDF.
Python Code Example
# === IMPORTS ===
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
# === USER-EDITABLE PARAMETERS ===
# Change: Customize data generation parameters
districts = ['A', 'B', 'C'] # Short labels for DataFrame
district_labels = ['District A', 'District B', 'District C'] # Full labels for legend/annotations
means = [1100, 1200, 1300] # Mean SAT scores for each district
stds = [150, 120, 100] # Standard deviations for each district
n_samples = [333, 333, 334] # Number of students per district (total ~1000)
national_avg = 1060 # National average SAT score
score_min, score_max = 400, 1600 # SAT score range for clipping and x-limits
# Change: Customize plot appearance
title = 'SAT Score Distribution by School District' # Insight-driven title (customize with computed stats if desired)
x_label = 'SAT Score (400-1600)'
y_label = 'Density'
figsize = (12, 7)
title_fontsize = 18
label_fontsize = 16
colors = ['#1f77b4', '#ff7f0e', '#2ca02c'] # Hex colors for districts (matplotlib default qualitative)
alpha_fill = 0.3 # Transparency for shaded areas under KDE curves
alpha_rug = 0.1 # Transparency for rug plot ticks
rug_size = 3 # Marker size for rug plot
linewidth_kde = 2.5 # Line width for KDE curves
linewidth_vline = 2 # Line width for mean lines
# Change: Analysis toggles
np.random.seed(42) # Change: Set to None for random data each run
# === Data Generation ===
# Generate realistic SAT score data for each district
data_list = []
for i in range(len(districts)):
    scores = np.random.normal(means[i], stds[i], n_samples[i])
    scores = np.clip(scores, score_min, score_max)  # Clip to realistic SAT range
    temp_df = pd.DataFrame({'District': districts[i], 'SAT Score': scores})
    data_list.append(temp_df)
df = pd.concat(data_list, ignore_index=True)
# === Print Relevant Statistics ===
print("Generated Data Summary:")
for i, dist in enumerate(districts):
    scores_dist = df[df['District'] == dist]['SAT Score']
    actual_mean = scores_dist.mean()
    actual_std = scores_dist.std()
    actual_n = len(scores_dist)
    print(f"{district_labels[i]}: n={actual_n}, mean={actual_mean:.0f}, std={actual_std:.0f}")
print(f"National Average: {national_avg}")
print(f"Total students: {len(df)}")
print(f"SAT Score range: {df['SAT Score'].min():.0f} - {df['SAT Score'].max():.0f}")
# === Create Plot ===
fig, ax = plt.subplots(figsize=figsize)
# Compute x-range for smooth KDE evaluation
x_range = np.linspace(score_min, score_max, 500)
# Plot KDE curves, fills, and mean lines for each district
max_density = 0
mean_values = []
for i, dist in enumerate(districts):
    scores = df[df['District'] == dist]['SAT Score'].values
    kde = stats.gaussian_kde(scores)  # Default bandwidth (Scott's rule)
    y_density = kde(x_range)
    max_density = max(max_density, np.max(y_density))
    # KDE line and shaded fill
    ax.plot(x_range, y_density, color=colors[i], linewidth=linewidth_kde,
            label=f"{district_labels[i]} (n={n_samples[i]})")
    ax.fill_between(x_range, 0, y_density, color=colors[i], alpha=alpha_fill)
    # Vertical dashed line for district mean
    mean_val = scores.mean()
    mean_values.append(mean_val)
    ax.axvline(mean_val, color=colors[i], linestyle='--', linewidth=linewidth_vline, alpha=0.8)
# National average line
ax.axvline(national_avg, color='#313233', linestyle='-', linewidth=linewidth_vline + 0.5,
           alpha=0.9, label=f'National Average ({national_avg})')
# Rug plot: individual data points as ticks at bottom
rug_offset = -0.02 * max_density # Small offset below y=0
for i, dist in enumerate(districts):
    scores = df[df['District'] == dist]['SAT Score'].values
    ax.scatter(scores, np.full_like(scores, rug_offset),
               marker='|', color=colors[i], alpha=alpha_rug, s=rug_size)
# Mean labels vertically aligned on vertical dashed lines
y_ann = 0.90 * max_density
for i in range(len(districts)):
    mean_val = mean_values[i]
    ax.text(mean_val, y_ann, f'{mean_val:.0f}',
            ha='center', va='center',
            bbox=dict(boxstyle='round,pad=0.4', facecolor='white', alpha=0.9, ec=colors[i]),
            fontsize=11, fontweight='bold')
# Styling and labels
ax.set_xlabel(x_label, fontsize=label_fontsize)
ax.set_ylabel(y_label, fontsize=label_fontsize)
ax.set_title(title, fontsize=title_fontsize, pad=20)
ax.tick_params(labelsize=label_fontsize)
ax.set_xlim(score_min, score_max)
ax.set_ylim(-0.05 * max_density, 1.15 * max_density)
ax.grid(True, alpha=0.3, linestyle=':')
ax.legend(loc='upper left', fontsize=12, framealpha=0.9)
# Layout adjustment to prevent clipping (tight_layout recomputes the subplot
# margins, so a preceding subplots_adjust call would be overridden)
plt.tight_layout()
plt.show()
fig  # Final expression: notebook-style environments render the figure object
Console Output
District Statistics:
District A (Urban): mean=1099, sd=151, median=1101
District B (Suburban): mean=1199, sd=120, median=1200
District C (Private): mean=1301, sd=100, median=1303
Performance Gap:
District C vs A: 202 points higher
District B vs A: 100 points higher
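The console output above reports a median and a performance gap per district. As a hedged sketch, those figures could be derived with a pandas `groupby` like the one below. The `params` dictionary and the use of NumPy's `default_rng` are assumptions made for illustration, so the printed values will not match the output above digit for digit.

```python
import numpy as np
import pandas as pd

# Assumed setup: same means, standard deviations, and group sizes as the example,
# but drawn from a different generator, so exact values will differ
rng = np.random.default_rng(42)
params = {'A': (1100, 150, 333), 'B': (1200, 120, 333), 'C': (1300, 100, 334)}
df = pd.concat(
    [pd.DataFrame({'District': name,
                   'SAT Score': np.clip(rng.normal(mean, sd, n), 400, 1600)})
     for name, (mean, sd, n) in params.items()],
    ignore_index=True,
)

# One row per district: sample size, mean, standard deviation, median
summary = df.groupby('District')['SAT Score'].agg(['count', 'mean', 'std', 'median'])
print(summary.round(0))

# Gap between each district's mean and District A's mean
gaps = summary['mean'] - summary.loc['A', 'mean']
for name in ['C', 'B']:
    print(f"District {name} vs A: {gaps[name]:.0f} points higher")
```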
Common Use Cases
- Comparing distributions between groups
- Identifying multimodal distributions
- Visualizing probability density
- Smoothed frequency analysis
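One of the use cases above, identifying multimodal distributions, can be roughly automated by counting prominent peaks in the KDE curve. The sketch below uses a hypothetical two-group sample. The 10% prominence threshold is an arbitrary assumption, not a statistical test, and the result depends on the bandwidth.

```python
import numpy as np
from scipy import stats
from scipy.signal import find_peaks

rng = np.random.default_rng(1)
# Hypothetical bimodal sample: two well-separated groups
samples = np.concatenate([
    rng.normal(900, 60, 500),
    rng.normal(1300, 60, 500),
])

grid = np.linspace(samples.min(), samples.max(), 500)
density = stats.gaussian_kde(samples)(grid)

# Keep only peaks that stand out by at least 10% of the tallest peak
# (assumed threshold; tune it for your data)
peaks, _ = find_peaks(density, prominence=0.1 * density.max())
modes = grid[peaks]

print(f"modes found: {len(modes)}")
print("approximate mode locations:", np.round(modes))
```

A larger bandwidth can merge nearby modes and a smaller one can create spurious ones, so treat the peak count as a prompt for closer inspection.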
Pro Tips
Adjust the bandwidth to control smoothing: too small a value produces spurious bumps, too large a value hides real modes
Use fill with transparency for overlapping distributions
Add rug plots to show individual data points
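The three tips can be combined in one short sketch. It assumes SciPy's `gaussian_kde`, where a scalar `bw_method` is used directly as the bandwidth factor. The sample data, the factor values, and the `roughness` measure are illustrative choices, not recommendations.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(2)
samples = rng.normal(1200, 120, 300)  # hypothetical scores
grid = np.linspace(700, 1700, 400)

fig, ax = plt.subplots(figsize=(8, 4))
roughness = {}
# A scalar bw_method sets the bandwidth factor; omitting it selects Scott's rule
for factor, color in zip([0.1, 0.3, 1.0], ['#d62728', '#1f77b4', '#2ca02c']):
    density = stats.gaussian_kde(samples, bw_method=factor)(grid)
    ax.plot(grid, density, color=color, label=f'bw factor = {factor}')
    # Semi-transparent fill keeps overlapping curves readable
    ax.fill_between(grid, 0, density, color=color, alpha=0.2)
    # Total variation of the curve: larger values mean a wigglier estimate
    roughness[factor] = np.abs(np.diff(density)).sum()

# Rug: one tick per observation, drawn just below the curves
ax.plot(samples, np.full_like(samples, -0.0002), '|', color='black', alpha=0.3)
ax.set_xlabel('Score')
ax.set_ylabel('Density')
ax.legend()
```

Call `plt.show()` or `fig.savefig(...)` to view the result. The smallest factor should trace individual clusters of ticks in the rug, while the largest flattens the curve.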
Scientific Chart Selection Cheat Sheet
Not sure whether to use a Violin Plot, Box Plot, or Ridge Plot? Download our single-page reference mapping the most-used scientific chart types, exactly when to use them, and the core Matplotlib/Seaborn functions.