What you will find in this guide

Why Python?

What Python gives you that spreadsheets cannot

Bridging bench and code

Start from the analysis you need, not the language

What each technique covers

Stats, curve fitting, signal processing, PCA

Technique reference

10 techniques with use-case descriptions

Common mistakes

What trips up most biologists starting with Python

Getting started

Practical first steps for your first figure

Why this guide uses Python - and why you do not have to write it

Python is the standard for reproducible research. PLOTIVY puts it within reach.

The tools in this guide - pandas, scipy, matplotlib - are widely used across research labs because they produce results that are:

Reproducible: every figure is a script you can re-run after a revision request, with identical fonts, colors, and annotations
Flexible: nonlinear models - Michaelis-Menten, dose-response, Gaussian - that spreadsheets and many GUI tools fit poorly or not at all
Auditable: every calculation is explicit in the code: nothing hidden in a cell formula or a GUI option you cannot find again
Publication-ready: export at any DPI, control font sizes, axis ranges, and significance brackets precisely

The friction is the implementation. PLOTIVY generates and executes the Python code for you in the browser - upload your CSV, describe the analysis, and the code is generated and run instantly. Your results remain a real Python script you can inspect, copy, and run independently.

This guide covers the concepts behind the code so you can understand and verify every step of the analysis.

Why Python?

Spreadsheet tools work for basic summaries, but they hit a wall when you need nonlinear curve fitting, automated significance annotations, or a reproducible script you can re-run after a revision request.

Python fills that gap with a small set of libraries that are widely used in computational biology: pandas for data handling, scipy for statistics and curve fitting, and matplotlib or plotly for visualization.

Key insight: Every figure generated from Python code can be regenerated exactly - same fonts, colors, and statistical annotations. When a reviewer asks you to change the font size or add a missing p-value, you edit one line and re-run.
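As a rough illustration of the "edit one line and re-run" idea, the sketch below keeps the font size and export resolution in two constants at the top of the script. The bar heights, labels, and the names FONT_SIZE, DPI, and make_figure are placeholders invented for this example, not output from any real experiment.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt

FONT_SIZE = 10  # change this one value to restyle every label in the figure
DPI = 300       # export resolution requested by the journal


def make_figure(font_size=FONT_SIZE):
    """Draw a small bar chart from placeholder values (illustrative only)."""
    plt.rcParams.update({"font.size": font_size})
    fig, ax = plt.subplots(figsize=(3, 3))
    ax.bar(["Control", "Treated"], [1.0, 1.8], yerr=[0.1, 0.2], capsize=4)
    ax.set_ylabel("Relative expression")
    return fig


fig = make_figure()

# An in-memory buffer stands in for a file path such as "figure1.png".
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=DPI, bbox_inches="tight")
```

Re-running the script after changing FONT_SIZE or DPI regenerates the same figure with only that property altered.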

Bridging the Gap Between Bench and Code

Going from knowing your experiment inside-out to producing a publication-quality figure should not require weeks of tutorials. Biology labs generate more quantitative data than ever - flow cytometry with millions of events, plate reader matrices across dozens of conditions, RNA-seq pipelines with thousands of differentially expressed genes.

The most effective approach is to start from the analysis you need rather than from the language itself. If you need to compare a treatment group against a control, you need a t-test. If you are fitting a calibration curve for an ELISA, you need linear regression. Each task maps directly to a specific technique on this page.

Loading a CSV, running a t-test, and generating a boxplot with significance brackets can be accomplished in fewer than 20 lines. That first successful figure is the foundation for everything else.
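The workflow described above might look roughly like the following sketch. The inline CSV text, the column names group and value, and the bracket offsets are assumptions made for illustration; with real data you would call pd.read_csv on your own file. Welch's t-test is used here as one reasonable default, not as the only valid choice.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

# Placeholder data standing in for pd.read_csv("your_file.csv")
csv_text = """group,value
control,4.1
control,3.8
control,4.4
control,4.0
control,4.3
treated,5.2
treated,5.6
treated,4.9
treated,5.4
treated,5.1
"""
df = pd.read_csv(io.StringIO(csv_text))

control = df.loc[df["group"] == "control", "value"].to_numpy()
treated = df.loc[df["group"] == "treated", "value"].to_numpy()

# Welch's t-test: does not assume equal variances in the two groups
t_stat, p_value = stats.ttest_ind(control, treated, equal_var=False)

fig, ax = plt.subplots(figsize=(3, 4))
ax.boxplot([control, treated])
ax.set_xticks([1, 2])
ax.set_xticklabels(["Control", "Treated"])
ax.set_ylabel("Signal (a.u.)")

# Significance bracket drawn just above the highest data point
top = df["value"].max()
y0 = top + 0.2
ax.plot([1, 1, 2, 2], [y0, y0 + 0.1, y0 + 0.1, y0], color="black", linewidth=1)
ax.text(1.5, y0 + 0.12, f"p = {p_value:.2g}", ha="center", va="bottom")
ax.set_ylim(top=y0 + 0.6)
```

Calling fig.savefig with a file name and a dpi value writes the figure to disk.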

What Each Technique Covers

Statistical Comparison

The t-test and ANOVA pages walk through parametric group comparisons, including how to add significance brackets and p-value annotations directly on your figure. This is the formatting reviewers expect - and that most GUI tools make surprisingly difficult to control.

Curve Fitting

Curve fitting appears in nearly every quantitative biology workflow. Standard curves for ELISA and qPCR rely on linear regression. Drug screening depends on dose-response modeling with Hill or four-parameter logistic fits. Enzyme kinetics requires Michaelis-Menten fitting.

Each page provides the scipy.optimize.curve_fit pattern adapted to the specific model, with guidance on initial parameter guesses, error estimation, and how to overlay the fitted curve on your raw data.
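A minimal sketch of that curve_fit pattern, using Michaelis-Menten as the model, is shown below. The substrate concentrations and rates are simulated placeholders, and the heuristics for the initial guesses (maximum observed rate for Vmax, the substrate concentration nearest half-maximal rate for Km) are one common choice rather than a prescribed method.

```python
import numpy as np
from scipy.optimize import curve_fit


def michaelis_menten(s, vmax, km):
    """Reaction rate as a function of substrate concentration."""
    return vmax * s / (km + s)


# Simulated placeholder data: true Vmax = 12, true Km = 8, plus small noise
rng = np.random.default_rng(0)
substrate = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0])
rate = michaelis_menten(substrate, 12.0, 8.0) + rng.normal(0, 0.2, substrate.size)

# Initial guesses taken from the data itself
vmax0 = rate.max()
km0 = substrate[np.argmin(np.abs(rate - vmax0 / 2))]

# Both parameters are constrained to be non-negative
popt, pcov = curve_fit(
    michaelis_menten, substrate, rate, p0=[vmax0, km0], bounds=(0, np.inf)
)
perr = np.sqrt(np.diag(pcov))  # one-standard-deviation parameter errors

print(f"Vmax = {popt[0]:.2f} +/- {perr[0]:.2f}")
print(f"Km   = {popt[1]:.2f} +/- {perr[1]:.2f}")

# Dense grid for drawing the fitted curve over the raw data points
s_fine = np.linspace(0, substrate.max(), 200)
fitted = michaelis_menten(s_fine, *popt)
```

Plotting substrate against rate as markers and s_fine against fitted as a line gives the usual overlay of the fit on the raw data.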

Dimensionality Reduction

PCA is increasingly common as biology embraces omics data. Whether you are working with metabolomics profiles, transcriptomic count matrices, or multi-panel flow cytometry, PCA helps identify sample groupings, batch effects, and outliers before any deeper analysis. The PCA page shows how to compute and plot principal components with explained variance, colored by experimental condition.
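To show what "principal components with explained variance" means in practice, here is a small sketch that computes PCA from a singular value decomposition with numpy alone. The data matrix is simulated (12 samples, 50 features, with the treated group shifted in the first 10 features), and the features are only mean-centered; real omics data usually needs normalization or scaling first, which this sketch does not cover.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated placeholder matrix: rows are samples, columns are features
n_per_group, n_features = 6, 50
control = rng.normal(0, 1, (n_per_group, n_features))
treated = rng.normal(0, 1, (n_per_group, n_features))
treated[:, :10] += 3.0  # group effect in the first 10 features
X = np.vstack([control, treated])
conditions = np.array(["control"] * n_per_group + ["treated"] * n_per_group)

# PCA via SVD of the mean-centered matrix
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
scores = U * S  # coordinates of each sample on each principal component
explained_variance_ratio = S**2 / np.sum(S**2)

print(f"PC1 explains {explained_variance_ratio[0]:.1%} of the variance")
print(f"PC2 explains {explained_variance_ratio[1]:.1%} of the variance")

pc1 = scores[:, 0]
for cond in ("control", "treated"):
    print(f"{cond}: mean PC1 score = {pc1[conditions == cond].mean():.2f}")
```

A scatter plot of scores[:, 0] against scores[:, 1], colored by the conditions array, gives the familiar PCA figure.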

Signal Processing

Savitzky-Golay smoothing, peak detection, and Gaussian fitting serve biologists working with biosensor output, chromatography traces, or spectroscopy data. These pages address how to clean noisy signals without distorting peak shapes and how to identify peaks programmatically when manual inspection is impractical.
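The smoothing and peak detection steps can be sketched with scipy.signal as follows. The trace is simulated (two Gaussian peaks plus noise), and the window length, polynomial order, prominence, and distance values are assumptions tuned to this toy signal; they would need adjusting for real traces.

```python
import numpy as np
from scipy.signal import find_peaks, savgol_filter

# Simulated placeholder trace: two Gaussian peaks at x = 3 and x = 7 plus noise
rng = np.random.default_rng(2)
x = np.linspace(0, 10, 1001)
clean = (
    1.0 * np.exp(-((x - 3.0) ** 2) / (2 * 0.2**2))
    + 0.6 * np.exp(-((x - 7.0) ** 2) / (2 * 0.3**2))
)
noisy = clean + rng.normal(0, 0.03, x.size)

# Savitzky-Golay filter: local cubic fits over a 31-point window.
# A window much wider than the peaks would flatten them.
smoothed = savgol_filter(noisy, window_length=31, polyorder=3)

# Keep only peaks that stand out from their surroundings by at least 0.2
# and are separated by at least 50 samples.
peak_idx, properties = find_peaks(smoothed, prominence=0.2, distance=50)
peak_positions = x[peak_idx]
peak_heights = smoothed[peak_idx]

for pos, height in zip(peak_positions, peak_heights):
    print(f"peak at x = {pos:.2f}, height = {height:.2f}")
```

Comparing smoothed against noisy on the same axes is a quick way to check that the chosen window has not distorted the peak shapes.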

Diagnostic Evaluation

The ROC curve page covers diagnostic evaluation for biomarker research, clinical study endpoints, and any classification task where you need to report sensitivity, specificity, and AUC with confidence intervals.
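As a sketch of how sensitivity, specificity, and AUC with a confidence interval relate to each other, the example below builds a ROC curve with numpy and estimates a percentile bootstrap interval. The biomarker scores are simulated, the helper names roc_curve_points and auc are invented for this example, and the bootstrap is one of several accepted ways to obtain an interval.

```python
import numpy as np


def roc_curve_points(labels, scores):
    """Return (fpr, tpr) at every distinct score threshold.

    tpr is sensitivity; 1 - fpr is specificity.
    """
    order = np.argsort(-scores, kind="stable")
    labels_sorted = labels[order]
    scores_sorted = scores[order]
    # Keep the last index of each run of tied scores
    distinct = np.append(np.diff(scores_sorted) != 0, True)
    tps = np.cumsum(labels_sorted == 1)[distinct]
    fps = np.cumsum(labels_sorted == 0)[distinct]
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr


def auc(labels, scores):
    """Area under the ROC curve by the trapezoidal rule."""
    fpr, tpr = roc_curve_points(labels, scores)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))


# Simulated placeholder data: 40 healthy (0) and 40 diseased (1) samples
rng = np.random.default_rng(3)
labels = np.array([0] * 40 + [1] * 40)
scores = np.concatenate([rng.normal(1.0, 1.0, 40), rng.normal(2.5, 1.0, 40)])

auc_hat = auc(labels, scores)

# Percentile bootstrap: resample subjects with replacement
boot = []
for _ in range(1000):
    idx = rng.integers(0, labels.size, labels.size)
    if labels[idx].min() == labels[idx].max():
        continue  # skip resamples that contain only one class
    boot.append(auc(labels[idx], scores[idx]))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"AUC = {auc_hat:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```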

Techniques for Biologists

T-Test Visualization

Compare means between two groups with significance brackets

When: Comparing treatment vs control

ANOVA Visualization

Compare three or more groups with post-hoc pairwise tests

When: Multiple treatment groups

Linear Regression

Build calibration and standard curves with R-squared

When: ELISA, qPCR, protein assays

Dose-Response Curve

Fit Hill and 4PL models, extract EC50 and IC50

When: Drug screening, toxicology

Michaelis-Menten Fitting

Enzyme kinetics: determine Km and Vmax from rate data

When: Enzymology assays

PCA Visualization

Dimensionality reduction for high-dimensional omics data

When: Metabolomics, transcriptomics

ROC Curve

Evaluate diagnostic biomarker performance with AUC

When: Biomarker studies

Savitzky-Golay Smoothing

Smooth noisy biosensor traces while preserving peak shape

When: Spectrophotometry

Peak Detection

Identify peaks in electrophysiology and chromatography data

When: LC-MS, electrophysiology

Gaussian Fitting

Fit bell-shaped peaks to experimental distributions

When: Flow cytometry, spectroscopy

Common Mistakes to Avoid

Running a t-test without checking normality first. Use a Shapiro-Wilk test or Q-Q plot when n is small.
Using R-squared alone to evaluate nonlinear fits. Always inspect residual plots for systematic patterns.
Providing poor initial parameter guesses to curve_fit. The optimizer may fail to converge or settle on a meaningless local minimum.
Omitting error bar definitions in figure captions. Reviewers expect you to state SD, SEM, or CI explicitly.
Exporting figures from a spreadsheet at screen resolution. Python exports at any DPI you specify.
Forgetting multiple comparison correction after ANOVA. Most journals expect a post-hoc correction such as Tukey or Bonferroni.
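Two of the checks in this list, normality testing and multiple comparison correction, can be sketched in a few lines of scipy. The three groups are simulated placeholders, and Bonferroni-adjusted pairwise t-tests are used here only because they are simple to write out by hand; Tukey's test is an equally common choice.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Simulated placeholder measurements for three treatment groups
rng = np.random.default_rng(4)
groups = {
    "control": rng.normal(10.0, 1.0, 8),
    "low_dose": rng.normal(10.5, 1.0, 8),
    "high_dose": rng.normal(13.0, 1.0, 8),
}

# 1. Shapiro-Wilk normality check per group (a small p suggests non-normality)
for name, values in groups.items():
    w_stat, p_norm = stats.shapiro(values)
    print(f"{name}: Shapiro-Wilk W = {w_stat:.3f}, p = {p_norm:.3f}")

# 2. Omnibus one-way ANOVA across all groups
f_stat, p_anova = stats.f_oneway(*groups.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.2g}")

# 3. Pairwise t-tests with Bonferroni correction:
#    multiply each raw p-value by the number of comparisons, capped at 1
pairs = list(combinations(groups, 2))
adjusted = {}
for a, b in pairs:
    _, p_raw = stats.ttest_ind(groups[a], groups[b])
    adjusted[(a, b)] = min(1.0, p_raw * len(pairs))
    print(f"{a} vs {b}: adjusted p = {adjusted[(a, b)]:.2g}")
```

The adjusted p-values, not the raw ones, are the values to annotate on the figure.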

Practical Advice for Getting Started

Start with the technique closest to your current experiment. Copy the working example, replace the sample data with your own CSV, and adjust labels. A working figure is more motivating than an abstract tutorial.

Once you have one result, the pattern generalizes: load data with pandas, run the analysis with scipy or statsmodels, and plot with matplotlib or plotly.

Starting with PLOTIVY: upload your CSV, describe the technique you need, and the analysis runs immediately - no environment to configure. The generated code is yours to keep, inspect, and extend.

Ready to Analyze Your Data?

Upload your dataset and generate publication-quality figures with AI-powered analysis. No installation required.
