Density Plot

Chart overview

Density plots provide a smooth, continuous estimate of the probability density function of a random variable, typically computed with kernel density estimation (KDE).

Key points

  • Unlike histograms, they don't depend on bin size or bin placement, so they give a cleaner view of the underlying distribution shape. The smoothness of the curve does depend on the chosen bandwidth.
  • Density plots are excellent for comparing distributions between groups and identifying multimodal patterns in your data.
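To make the idea of a smooth density estimate concrete, here is a minimal sketch of a Gaussian kernel density estimate written with NumPy only. The function name `gaussian_kde_manual`, the synthetic sample, and the normal-reference bandwidth rule (1.06 × sample standard deviation × n^(-1/5)) are assumptions chosen for illustration, not something this page prescribes.

```python
import numpy as np


def gaussian_kde_manual(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point of `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Average the kernels and rescale so the curve integrates to 1
    return kernel.sum(axis=1) / (samples.size * bandwidth)


# Hypothetical sample: 500 synthetic scores (illustrative values only)
rng = np.random.default_rng(0)
samples = rng.normal(loc=1200, scale=120, size=500)
grid = np.linspace(400, 2000, 801)

# Normal-reference rule of thumb for the bandwidth (an assumption here)
bandwidth = 1.06 * samples.std(ddof=1) * samples.size ** (-1 / 5)

density = gaussian_kde_manual(samples, grid, bandwidth)

# A density should integrate to roughly 1 over a range that covers the data
area = np.sum(density) * (grid[1] - grid[0])
print(f"bandwidth={bandwidth:.1f}, area under curve={area:.4f}")
```

Each observation contributes one small bell curve, and the estimate is their average, which is why no bins are involved. The bandwidth plays the role that bin width plays in a histogram.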

Example Visualization

Density plot showing distribution of test scores with shaded area

Generate publication-ready density plots with AI in seconds. No coding required – just describe your data and let AI do the work.

Example AI Prompt

"Create a density plot (KDE) showing the distribution of 'SAT Scores' for 1000 students across 3 school districts. Generate realistic data: District A (urban, mean=1100, sd=150), District B (suburban, mean=1200, sd=120), District C (private, mean=1300, sd=100). Shade the area under each curve with semi-transparent fills using distinct colors. Add vertical dashed lines for each district's mean with annotations. Include a vertical line at the national average (1060). Add a rug plot showing individual observations (alpha=0.1). X-axis: 'SAT Score (400-1600)', Y-axis: 'Density'. Add legend with district names and sample sizes. Title: 'SAT Score Distribution by School District'."

How to create this chart in 30 seconds

  1. Upload Data: Drag & drop your Excel or CSV file. Plotivy securely processes it in your browser.
  2. AI Generation: Our AI analyzes your data and generates the Density Plot code automatically.
  3. Customize & Export: Tweak the design with natural language, then export as high-res PNG, SVG or PDF.

Python Code Example

example.py
# === IMPORTS ===
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# === USER-EDITABLE PARAMETERS ===
# Change: Customize data generation parameters
districts = ['A', 'B', 'C']  # Short labels for DataFrame
district_labels = ['District A', 'District B', 'District C']  # Full labels for legend/annotations
means = [1100, 1200, 1300]  # Mean SAT scores for each district
stds = [150, 120, 100]  # Standard deviations for each district
n_samples = [333, 333, 334]  # Number of students per district (total ~1000)
national_avg = 1060  # National average SAT score
score_min, score_max = 400, 1600  # SAT score range for clipping and x-limits

# Change: Customize plot appearance
title = 'SAT Score Distribution by School District'  # Insight-driven title (customize with computed stats if desired)
x_label = 'SAT Score (400-1600)'
y_label = 'Density'
figsize = (12, 7)
title_fontsize = 18
label_fontsize = 16
colors = ['#1f77b4', '#ff7f0e', '#2ca02c']  # Hex colors for districts (matplotlib default qualitative)
alpha_fill = 0.3  # Transparency for shaded areas under KDE curves
alpha_rug = 0.1  # Transparency for rug plot ticks
rug_size = 3  # Marker area in points^2 for rug ticks (scatter's `s`); increase if ticks are hard to see
linewidth_kde = 2.5  # Line width for KDE curves
linewidth_vline = 2  # Line width for mean lines

# Change: Analysis toggles
np.random.seed(42)  # Change: Set to None for random data each run

# === Data Generation ===
# Generate realistic SAT score data for each district
data_list = []
for i in range(len(districts)):
    scores = np.random.normal(means[i], stds[i], n_samples[i])
    scores = np.clip(scores, score_min, score_max)  # Clip to realistic SAT range
    temp_df = pd.DataFrame({'District': districts[i], 'SAT Score': scores})
    data_list.append(temp_df)
df = pd.concat(data_list, ignore_index=True)

# === Print Relevant Statistics ===
print("Generated Data Summary:")
for i, dist in enumerate(districts):
    scores_dist = df[df['District'] == dist]['SAT Score']
    actual_mean = scores_dist.mean()
    actual_std = scores_dist.std()
    actual_n = len(scores_dist)
    print(f"{district_labels[i]}: n={actual_n}, mean={actual_mean:.0f}, std={actual_std:.0f}")
print(f"National Average: {national_avg}")
print(f"Total students: {len(df)}")
print(f"SAT Score range: {df['SAT Score'].min():.0f} - {df['SAT Score'].max():.0f}")

# === Create Plot ===
fig, ax = plt.subplots(figsize=figsize)

# Compute x-range for smooth KDE evaluation
x_range = np.linspace(score_min, score_max, 500)

# Plot KDE curves, fills, and mean lines for each district
max_density = 0
mean_values = []
for i, dist in enumerate(districts):
    scores = df[df['District'] == dist]['SAT Score'].values
    kde = stats.gaussian_kde(scores)
    y_density = kde(x_range)
    max_density = max(max_density, np.max(y_density))
    
    # KDE line and shaded fill
    ax.plot(x_range, y_density, color=colors[i], linewidth=linewidth_kde,
            label=f"{district_labels[i]} (n={n_samples[i]})")
    ax.fill_between(x_range, 0, y_density, color=colors[i], alpha=alpha_fill)
    
    # Vertical dashed line for district mean
    mean_val = scores.mean()
    mean_values.append(mean_val)
    ax.axvline(mean_val, color=colors[i], linestyle='--', linewidth=linewidth_vline, alpha=0.8)

# National average line
ax.axvline(national_avg, color='#313233', linestyle='-', linewidth=linewidth_vline + 0.5,
           alpha=0.9, label=f'National Average ({national_avg})')

# Rug plot: individual data points as ticks at bottom
rug_offset = -0.02 * max_density  # Small offset below y=0
for i, dist in enumerate(districts):
    scores = df[df['District'] == dist]['SAT Score'].values
    ax.scatter(scores, np.full_like(scores, rug_offset),
               marker='|', color=colors[i], alpha=alpha_rug, s=rug_size)

# Mean labels vertically aligned on vertical dashed lines
y_ann = 0.90 * max_density
for i in range(len(districts)):
    mean_val = mean_values[i]
    ax.text(mean_val, y_ann, f'{mean_val:.0f}',
            ha='center', va='center',
            bbox=dict(boxstyle='round,pad=0.4', facecolor='white', alpha=0.9, ec=colors[i]),
            fontsize=11, fontweight='bold')

# Styling and labels
ax.set_xlabel(x_label, fontsize=label_fontsize)
ax.set_ylabel(y_label, fontsize=label_fontsize)
ax.set_title(title, fontsize=title_fontsize, pad=20)
ax.tick_params(labelsize=label_fontsize)
ax.set_xlim(score_min, score_max)
ax.set_ylim(-0.05 * max_density, 1.15 * max_density)
ax.grid(True, alpha=0.3, linestyle=':')
ax.legend(loc='upper left', fontsize=12, framealpha=0.9)

# Layout adjustment to prevent clipping (tight_layout recomputes the subplot margins)
plt.tight_layout()

plt.show()


Console Output

District Statistics:
District A (Urban): mean=1099, sd=151, median=1101
District B (Suburban): mean=1199, sd=120, median=1200
District C (Private): mean=1301, sd=100, median=1303

Performance Gap:
District C vs A: 202 points higher
District B vs A: 100 points higher

Common Use Cases

  1. Comparing distributions between groups
  2. Identifying multimodal distributions
  3. Visualizing probability density
  4. Smoothed frequency analysis

Pro Tips

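  • Adjust the bandwidth to control smoothing: too small a value exaggerates noise, too large a value can hide real modes.
  • Use fills with transparency so overlapping distributions stay visible.
  • Add rug plots to show individual data points.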
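The bandwidth tip can be sketched with `scipy.stats.gaussian_kde`, the estimator used in the example above, through its `bw_method` argument. The bimodal sample, the three settings, and the `count_peaks` helper below are hypothetical and chosen only for illustration. Note that a scalar `bw_method` is a multiplier applied to the sample standard deviation, not an absolute bandwidth in data units.

```python
import numpy as np
from scipy import stats

# Hypothetical bimodal sample: two well-separated groups
rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(0, 1, 500), rng.normal(6, 1, 500)])
grid = np.linspace(-5, 11, 400)


def count_peaks(y):
    """Count strict interior local maxima of a sampled curve."""
    return int(np.sum((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])))


# A scalar bw_method is used directly as the KDE's scaling factor;
# "scott" is SciPy's default rule.
settings = {"narrow": 0.1, "default": "scott", "wide": 2.0}

peaks = {}
factors = {}
for name, bw in settings.items():
    kde = stats.gaussian_kde(samples, bw_method=bw)
    factors[name] = kde.factor
    peaks[name] = count_peaks(kde(grid))
    print(f"{name}: factor={kde.factor:.3f}, peaks={peaks[name]}")
```

With a very wide bandwidth the two groups merge into a single hump, while a very narrow one may add spurious bumps. Comparing a few settings side by side is a reasonable way to check that an apparent mode is not an artifact of the smoothing.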
