Name: Plotivy
Author: Francesco Villasmunta

Every published figure starts as a messy CSV. This guide walks through the complete workflow - from organizing your raw data to submitting publication-ready figures - with reproducible Python code at each step.

What You'll Learn

0.Live Code: Publication Multi-Panel

1.Phase 1 - Data Collection & Organization

2.Phase 2 - Cleaning & Validation

3.Phase 3 - Analysis & Statistics

4.Phase 4 - Visualization & Figures

5.Phase 5 - Export & Submission

0. Live Code: Publication Multi-Panel

The end result of the workflow: a 4-panel publication figure with time series, dose-response, correlation, and distribution panels. Edit the code and re-run.

Live Code Editor

Code EditorPython

Loading editor...

Live Preview

Preparing preview

Running once automatically on first load

Open this example in the data visualization tool

Learn by Experimenting

This is a safe playground for learning! Try changing:

• Colors: Modify color values to see different palettes
• Numbers: Adjust sizes, positions, or data ranges
• Labels: Update titles, axis names, or legends

Edit the code, run it, then open the full data visualization tool to continue with your own dataset.

1. Phase 1 - Data Collection & Organization

Folder structure template

project/
├── data/
│   ├── raw/          # Never modify raw data files
│   ├── processed/    # Cleaned, transformed data
│   └── metadata/     # Experiment logs, protocols
├── figures/
│   ├── exploratory/  # Quick plots for yourself
│   └── publication/  # Final figures for the paper
├── scripts/
│   ├── 01_clean.py
│   ├── 02_analyze.py
│   └── 03_visualize.py
└── README.md

Never modify raw data

Keep originals in data/raw/. All transformations go to data/processed/.

Use descriptive filenames

experiment_2025-01-15_dosage-response.csv, not data2.csv.

Include metadata

Record units, instrument settings, operator, date in a companion file.

Version everything

Use Git. Even for data analysis scripts. Especially for data analysis scripts.

2. Phase 2 - Cleaning & Validation

Essential cleaning operations

import pandas as pd

df = pd.read_csv("data/raw/experiment.csv")

# 1. Check for missing values
print(df.isnull().sum())

# 2. Remove duplicates
df = df.drop_duplicates()

# 3. Fix data types
df["date"] = pd.to_datetime(df["date"])
df["concentration"] = pd.to_numeric(df["concentration"], errors="coerce")

# 4. Outlier detection (IQR method)
Q1, Q3 = df["value"].quantile([0.25, 0.75])
IQR = Q3 - Q1
mask = (df["value"] >= Q1 - 1.5*IQR) & (df["value"] <= Q3 + 1.5*IQR)
df_clean = df[mask]

# 5. Save cleaned data (never overwrite raw!)
df_clean.to_csv("data/processed/experiment_clean.csv", index=False)

Validation checklist

No missing values in key columns
Data types are correct (numeric, datetime, categorical)
Value ranges are physically plausible
Sample sizes match experiment protocol
Units are consistent across all measurements

3. Phase 3 - Analysis & Statistics

t-test

Compare 2 groups

Normal distribution, equal variance

ANOVA

Compare 3+ groups

Normal distribution, homoscedasticity

Pearson r

Linear correlation

Bivariate normality

Mann-Whitney U

Non-normal 2-group

Ordinal data, independent

Chi-square

Categorical association

Expected count >= 5

Linear regression

Predict Y from X

Linearity, independence

4. Phase 4 - Visualization & Figures

One message per figure

Each figure should communicate exactly one key finding.

Label everything

Axis labels with units, legend, title, and panel letters (A, B, C).

Use consistent styling

Same fonts, colors, and line widths across all figures in your paper.

Match journal specs

Width (single/double column), DPI (300+), font size (8-12 pt), file format.

Include statistical annotations

p-values, confidence intervals, R-squared on the figure itself.

5. Phase 5 - Export & Submission

Export code template

# Save in multiple formats for different uses
fig.savefig("figures/publication/figure_1.pdf",
            dpi=600, bbox_inches="tight", facecolor="white")
fig.savefig("figures/publication/figure_1.tiff",
            dpi=300, bbox_inches="tight", facecolor="white")
fig.savefig("figures/publication/figure_1.png",
            dpi=600, bbox_inches="tight", facecolor="white")

# PDF  -> Vector, best for journal submission
# TIFF -> Required by some journals (Nature, Science)
# PNG  -> For presentations and web

Reproducibility checklist

All scripts run from raw data to final figures without manual steps
Package versions recorded (pip freeze, conda env export)
Random seeds set for any stochastic operations
README explains how to reproduce every figure
Data and code deposited in a DOI-backed repository

Chart gallery

Chart Types for Your Publication

Every chart type used in scientific publications, ready to generate.

Browse all chart types →

Scatter plot of height vs weight colored by gender with regression line

Statistical•matplotlib, seaborn

From the chart gallery•Correlation analysis between metrics

Scatterplot

Displays values for two variables as points on a Cartesian coordinate system.

Sample code / prompt

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
import pandas as pd

# Generate sample data
np.random.seed(42)
n_samples = 200
height = np.random.normal(170, 8, n_samples)
weight = height * 0.6 + np.random.normal(0, 8, n_samples) - 50

View chart details Generate this chart

Bar chart comparing average scores across 5 groups with error bars

Comparison•matplotlib, seaborn

From the chart gallery•Comparing performance across categories

Bar Chart

Compares categorical data using rectangular bars with heights proportional to values.

Sample code / prompt

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

# Generate performance scores for 5 treatment groups
np.random.seed(42)
groups = ['Control', 'Treatment A', 'Treatment B', 'Treatment C', 'Treatment D']
n_samples = 30

View chart details Generate this chart

Line graph with error bars showing 95% confidence intervals

Statistical•matplotlib

From the chart gallery•Scientific data presentation

Error Bars

Graphical representations of the variability of data indicating error or uncertainty in measurements.

Sample code / prompt

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Generate bacterial growth data with replicates
np.random.seed(42)
time_points = np.array([0, 4, 8, 12, 18, 24])
mean_values = np.array([10, 25, 80, 250, 600, 800])

# Generate 5 replicates per time point with noise

View chart details Generate this chart

Box and whisker plot comparing gene expression across 4 genotypes with significance brackets

Distribution•seaborn, matplotlib

From the chart gallery•Comparing experimental groups in scientific research

Box and Whisker Plot

Displays data distribution using quartiles, median, and outliers in a standardized format.

Sample code / prompt

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

# Generate gene expression data for 4 genotypes
np.random.seed(42)
genotypes = ['WT', 'KO1', 'KO2', 'Mutant']
n_per_group = 20

View chart details Generate this chart

Correlation heatmap with diverging color scale and coefficient annotations

Statistical•seaborn, matplotlib

From the chart gallery•Correlation analysis between variables

Heatmap

Represents data values as colors in a two-dimensional matrix format.

Sample code / prompt

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create correlation matrix for financial metrics
metrics = ['Revenue', 'Profit', 'Expenses', 'ROI', 'Customers', 'AOV', 'Marketing', 'Employees']
correlation_data = np.array([
    [1.00, 0.85, -0.45, 0.72, 0.88, 0.65, 0.72, 0.55],
    [0.85, 1.00, -0.78, 0.92, 0.75, 0.58, 0.63, 0.48],

View chart details Generate this chart

Multi-line graph showing temperature trends for 3 cities over a year

Time Series•matplotlib, seaborn

From the chart gallery•Stock price tracking over time

Line Graph

Displays data points connected by straight line segments to show trends over time.

Sample code / prompt

import matplotlib.pyplot as plt
import numpy as np

# Generate temperature data for 3 major US cities over 12 months
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
nyc = [30, 32, 40, 52, 65, 75, 82, 81, 74, 63, 50, 38]
miami = [65, 66, 70, 76, 82, 87, 90, 90, 87, 80, 72, 66]
chicago = [25, 27, 35, 48, 62, 72, 80, 79, 71, 60, 45, 32]

# Create figure with enhanced styling

View chart details Generate this chart

Skip Straight to Phase 4

Upload your cleaned data and let AI generate the Python code for your publication figures.

Try Plotivy Free

Tags:#research workflow#data analysis#reproducibility#publication

Found this helpful? Share it with your network.

Francesco Villasmunta

Experimental Physicist & Photonics Researcher

Hands-on experience in silicon photonics, semiconductor fabrication (DRIE/ICP-RIE), optical simulation, and data-driven analysis. Built Plotivy to help researchers focus on discoveries instead of data struggles.

More about the author

Visualize your own data

Apply the techniques from this article to your own datasets. Upload CSV, Excel, or paste data directly.

Start Analyzing - Free

Menu

What You'll Learn

0. Live Code: Publication Multi-Panel

Learn by Experimenting

1. Phase 1 - Data Collection & Organization

Folder structure template

Never modify raw data

Use descriptive filenames

Include metadata

Version everything

2. Phase 2 - Cleaning & Validation

Essential cleaning operations

Validation checklist

3. Phase 3 - Analysis & Statistics

t-test

ANOVA

Pearson r

Mann-Whitney U

Chi-square

Linear regression

4. Phase 4 - Visualization & Figures

One message per figure

Label everything

Use consistent styling

Match journal specs

Include statistical annotations

5. Phase 5 - Export & Submission

Export code template

Reproducibility checklist

Chart Types for Your Publication

Scatterplot

Bar Chart

Error Bars

Box and Whisker Plot

Heatmap

Line Graph

Skip Straight to Phase 4

Visualize your own data

Related Articles

Publication-Ready Figures in 10 Minutes

Best AI Tools for Data Analysis