What you will find in this guide
- Why Python? What Python gives you that spreadsheets cannot
- Bridging bench and code: Start from the analysis you need, not the language
- What each technique covers: Stats, curve fitting, signal processing, PCA
- Technique reference: 10 techniques with use-case descriptions
- Common mistakes: What trips up most biologists starting with Python
- Getting started: Practical first steps for your first figure
Why this guide uses Python - and why you do not have to write it
Python is the standard for reproducible research. PLOTIVY puts it within reach.
The tools in this guide - pandas, scipy, matplotlib - are widely used across research labs because they produce results that are:
- Reproducible: every figure comes from a script you can re-run after a revision request, with identical fonts, colors, and annotations
- Flexible: fit nonlinear models (Michaelis-Menten, dose-response, Gaussian) that spreadsheets and many GUI tools handle poorly or not at all
- Auditable: every calculation is explicit in the code, with nothing hidden in a cell formula or a GUI option you cannot find again
- Publication-ready: export at any DPI, control font sizes, axis ranges, and significance brackets precisely
The friction is the implementation. PLOTIVY generates and executes the Python code for you in the browser: upload your CSV, describe the analysis, and see the result without setting up an environment. The output remains a real Python script you can inspect, copy, and run independently.
This guide covers the concepts behind the code so you can understand and verify every step of the analysis.
Why Python?
Spreadsheet tools work for basic summaries, but they hit a wall when you need nonlinear curve fitting, automated significance annotations, or a reproducible script you can re-run after a revision request.
Python fills that gap with a small set of libraries widely used in computational biology: pandas for data handling, scipy for statistics and curve fitting, and matplotlib or plotly for visualization.
Key insight: Every figure generated from Python code can be regenerated exactly - same fonts, colors, and statistical annotations. When a reviewer asks you to change the font size or add a missing p-value, you edit one line and re-run.
Bridging the Gap Between Bench and Code
Getting from knowing your experiment inside-out to producing a publication-quality figure should not require weeks of tutorials. Biology labs generate more quantitative data than ever: flow cytometry with millions of events, plate reader matrices across dozens of conditions, RNA-seq pipelines with thousands of differentially expressed genes.
The most effective approach is to start from the analysis you need rather than from the language itself. If you need to compare a treatment group against a control, you need a t-test. If you are fitting a calibration curve for an ELISA, you need linear regression. Each task maps directly to a specific technique on this page.
Loading a CSV, running a t-test, and generating a boxplot with significance brackets can be accomplished in fewer than 20 lines. That first successful figure is the foundation for everything else.
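As a rough illustration of that first figure, the sketch below loads a small table, runs a two-sample t-test, and draws a boxplot with a significance bracket. The column names (group, value) and the numbers are made up for the example, and the inline CSV text stands in for your own file. Welch's variant of the t-test is used here as one reasonable default, not as the only correct choice.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

# Stand-in for pd.read_csv("your_file.csv"); columns and values are hypothetical.
csv_text = """group,value
control,4.1
control,3.8
control,4.4
control,4.0
control,4.3
treated,5.2
treated,5.6
treated,4.9
treated,5.4
treated,5.1
"""
df = pd.read_csv(io.StringIO(csv_text))

control = df.loc[df["group"] == "control", "value"].to_numpy()
treated = df.loc[df["group"] == "treated", "value"].to_numpy()

# Welch's t-test (equal_var=False) does not assume equal group variances.
t_stat, p_value = stats.ttest_ind(control, treated, equal_var=False)

fig, ax = plt.subplots(figsize=(3, 4))
ax.boxplot([control, treated])
ax.set_xticks([1, 2])
ax.set_xticklabels(["control", "treated"])
ax.set_ylabel("value (arbitrary units)")

# Significance bracket drawn just above the highest data point.
top = df["value"].max()
y, h = top + 0.2, 0.1
ax.plot([1, 1, 2, 2], [y, y + h, y + h, y], color="black", linewidth=1)
ax.text(1.5, y + h, f"p = {p_value:.2g}", ha="center", va="bottom")
ax.set_ylim(top=y + 0.6)

# Export at print resolution; pass a filename such as "figure.png"
# instead of the buffer to write the image to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=300, bbox_inches="tight")
```

To adapt it, replace the inline text with pd.read_csv on your own file and rename the two columns to match your data.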
What Each Technique Covers
Statistical Comparison
The t-test and ANOVA pages walk through parametric group comparisons, including how to add significance brackets and p-value annotations directly on your figure. This is the formatting reviewers expect - and that most GUI tools make surprisingly difficult to control.
Curve Fitting
Curve fitting appears in nearly every quantitative biology workflow. Standard curves for ELISA and qPCR often rely on linear regression over the assay's linear range. Drug screening depends on dose-response modeling with Hill or four-parameter logistic fits. Enzyme kinetics requires Michaelis-Menten fitting.
Each page provides the scipy.optimize.curve_fit pattern adapted to the specific model, with guidance on initial parameter guesses, error estimation, and how to overlay the fitted curve on your raw data.
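A minimal sketch of that pattern for Michaelis-Menten kinetics might look like the following. The substrate concentrations and rates are synthetic, and the rule used for the initial guesses (Vmax from the largest observed rate, Km from the concentration whose rate is closest to half of it) is one simple heuristic rather than a prescribed method. Standard errors are taken from the diagonal of the covariance matrix that curve_fit returns.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit


def michaelis_menten(s, vmax, km):
    """Reaction rate as a function of substrate concentration."""
    return vmax * s / (km + s)


# Synthetic data: hypothetical concentrations (uM) and noisy rates.
rng = np.random.default_rng(0)
substrate = np.array([1, 2, 5, 10, 20, 50, 100, 200], dtype=float)
rate = michaelis_menten(substrate, 10.0, 15.0) + rng.normal(0, 0.2, substrate.size)

# Data-driven initial guesses: Vmax near the largest observed rate,
# Km near the concentration whose rate is closest to half of that.
vmax_guess = rate.max()
km_guess = substrate[np.argmin(np.abs(rate - vmax_guess / 2))]

# Bounds keep both parameters non-negative.
popt, pcov = curve_fit(
    michaelis_menten, substrate, rate,
    p0=[vmax_guess, km_guess], bounds=(0, np.inf),
)
vmax_fit, km_fit = popt
vmax_err, km_err = np.sqrt(np.diag(pcov))  # one-sigma standard errors

# Residuals are worth plotting to look for systematic patterns.
residuals = rate - michaelis_menten(substrate, *popt)

# Overlay the fitted curve on the raw data.
s_smooth = np.linspace(0, substrate.max(), 200)
fig, ax = plt.subplots()
ax.scatter(substrate, rate, label="data")
ax.plot(s_smooth, michaelis_menten(s_smooth, *popt),
        label=f"fit: Vmax={vmax_fit:.2f}, Km={km_fit:.2f}")
ax.set_xlabel("substrate (uM)")
ax.set_ylabel("rate")
ax.legend()
```

The same structure carries over to Hill or four-parameter logistic models: only the model function, the initial guesses, and the bounds change.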
Dimensionality Reduction
PCA is increasingly common as biology embraces omics data. Whether you are working with metabolomics profiles, transcriptomic count matrices, or multi-panel flow cytometry, PCA helps identify sample groupings, batch effects, and outliers before any deeper analysis. The PCA page shows how to compute and plot principal components with explained variance, colored by experimental condition.
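One way to sketch this without extra dependencies is to compute PCA from a singular value decomposition in numpy, as below. The data matrix is synthetic (12 samples by 50 features, with a shift added to the first 10 features of the treated group), and the condition labels are hypothetical. Libraries such as scikit-learn offer a ready-made PCA class; the SVD route is shown only because it makes each step explicit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic matrix: rows are samples, columns are features (e.g. metabolites).
rng = np.random.default_rng(1)
n_per_group, n_features = 6, 50
control = rng.normal(0, 1, (n_per_group, n_features))
treated = rng.normal(0, 1, (n_per_group, n_features))
treated[:, :10] += 4.0  # treated group differs in the first 10 features
X = np.vstack([control, treated])
condition = np.array(["control"] * n_per_group + ["treated"] * n_per_group)

# Center each feature; scaling to unit variance is also common when
# features are measured in different units.
X_centered = X - X.mean(axis=0)

# PCA via SVD: rows of Vt are the principal axes, U * S are the sample scores.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
scores = U * S
explained = S**2 / np.sum(S**2)  # fraction of variance per component

fig, ax = plt.subplots()
for name in ["control", "treated"]:
    mask = condition == name
    ax.scatter(scores[mask, 0], scores[mask, 1], label=name)
ax.set_xlabel(f"PC1 ({explained[0]:.0%} of variance)")
ax.set_ylabel(f"PC2 ({explained[1]:.0%} of variance)")
ax.legend()
```

In this synthetic case the two conditions separate along PC1. With real data, coloring by batch instead of condition is a quick check for batch effects.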
Signal Processing
Savitzky-Golay smoothing, peak detection, and Gaussian fitting serve biologists working with biosensor output, chromatography traces, or spectroscopy data. These pages address how to clean noisy signals without distorting peak shapes and identify peaks programmatically when manual inspection is impractical.
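The smoothing and peak-detection steps can be sketched with scipy.signal as follows. The trace is synthetic (two Gaussian peaks plus noise), and the window length, polynomial order, and prominence threshold are illustrative values that would need tuning for a real instrument. As a rule of thumb, a window much wider than the narrowest peak will flatten it.

```python
import numpy as np
from scipy.signal import find_peaks, savgol_filter


def gaussian(x, amplitude, center, width):
    """Bell-shaped peak used to build the synthetic trace."""
    return amplitude * np.exp(-((x - center) ** 2) / (2 * width**2))


# Synthetic trace: two peaks at x = 3 and x = 7 with added noise.
rng = np.random.default_rng(2)
x = np.linspace(0, 10, 1001)
clean = gaussian(x, 1.0, 3.0, 0.2) + gaussian(x, 0.6, 7.0, 0.3)
noisy = clean + rng.normal(0, 0.05, x.size)

# Savitzky-Golay fits a low-order polynomial in a sliding window.
# Here the 51-point window spans 0.5 x-units, close to the width of
# the narrower peak, so the peak shape is largely preserved.
smoothed = savgol_filter(noisy, window_length=51, polyorder=3)

# Prominence filters out small noise wiggles that survive smoothing.
peaks, properties = find_peaks(smoothed, prominence=0.2)
peak_positions = x[peaks]
peak_heights = smoothed[peaks]

for pos, height in zip(peak_positions, peak_heights):
    print(f"peak at x = {pos:.2f}, height = {height:.2f}")
```

Comparing the smoothed peak heights against the raw trace is a simple way to check that the chosen window is not distorting the peaks.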
Diagnostic Evaluation
The ROC curve page covers diagnostic evaluation for biomarker research, clinical study endpoints, and any classification task where you need to report sensitivity, specificity, and AUC with confidence intervals.
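The quantities named above can be computed directly in numpy, as in the sketch below. The biomarker scores are synthetic, the helper names (roc_curve_points, auc_trapezoid, bootstrap_auc_ci) are hypothetical, and the percentile bootstrap is just one of several accepted ways to put a confidence interval on AUC. Choosing the operating point with Youden's J is likewise an assumption made for illustration.

```python
import numpy as np


def roc_curve_points(labels, scores):
    """Return (fpr, tpr) arrays, starting at (0, 0), over all thresholds."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.unique(scores)[::-1]  # from strictest to most lenient
    tpr = np.array([(scores[labels] >= t).mean() for t in thresholds])
    fpr = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for AUC."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, labels.size, labels.size)
        if labels[idx].all() or not labels[idx].any():
            continue  # a resample needs both classes to define a ROC curve
        aucs.append(auc_trapezoid(*roc_curve_points(labels[idx], scores[idx])))
    low, high = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(low), float(high)


# Synthetic biomarker: 40 controls and 40 cases with shifted score distributions.
rng = np.random.default_rng(4)
labels = np.array([0] * 40 + [1] * 40)
scores = np.concatenate([rng.normal(0.0, 1.0, 40), rng.normal(1.5, 1.0, 40)])

fpr, tpr = roc_curve_points(labels, scores)
auc = auc_trapezoid(fpr, tpr)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)

# Operating point that maximizes Youden's J = sensitivity + specificity - 1.
best = np.argmax(tpr - fpr)
sensitivity, specificity = tpr[best], 1 - fpr[best]
print(f"AUC = {auc:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

For real studies, report which interval method and which threshold rule you used, since both affect the numbers.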
Techniques for Biologists
- T-Test Visualization: Compare means between two groups with significance brackets. When: Comparing treatment vs control
- ANOVA Visualization: Compare three or more groups with post-hoc pairwise tests. When: Multiple treatment groups
- Linear Regression: Build calibration and standard curves with R-squared. When: ELISA, qPCR, protein assays
- Dose-Response Curve: Fit Hill and 4PL models, extract EC50 and IC50. When: Drug screening, toxicology
- Michaelis-Menten Fitting: Enzyme kinetics: determine Km and Vmax from rate data. When: Enzymology assays
- PCA Visualization: Dimensionality reduction for high-dimensional omics data. When: Metabolomics, transcriptomics
- ROC Curve: Evaluate diagnostic biomarker performance with AUC. When: Biomarker studies
- Savitzky-Golay Smoothing: Smooth noisy biosensor traces while preserving peak shape. When: Spectrophotometry
- Peak Detection: Identify peaks in electrophysiology and chromatography data. When: LC-MS, electrophysiology
- Gaussian Fitting: Fit bell-shaped peaks to experimental distributions. When: Flow cytometry, spectroscopy
Common Mistakes to Avoid
- Running a t-test without checking normality first - use a Shapiro-Wilk test or Q-Q plot when n is small.
- Using R-squared alone to evaluate nonlinear fits. Always inspect residual plots for systematic patterns.
- Providing poor initial parameter guesses to curve_fit - the optimizer may fail to converge or settle on a biologically meaningless local minimum. Estimate starting values from the data.
- Omitting error bar definitions in figure captions. Reviewers require you to state SD, SEM, or CI explicitly.
- Exporting figures from a spreadsheet at screen resolution. Python exports at any DPI you specify.
- Forgetting multiple comparison correction after ANOVA - Tukey or Bonferroni is expected in most journals.
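Two of the checks in this list, normality testing and multiple comparison correction, can be sketched together as below. The three groups are synthetic and their names are hypothetical. Bonferroni-adjusted pairwise t-tests are shown because they need nothing beyond basic scipy; Tukey's HSD is available as scipy.stats.tukey_hsd in recent SciPy versions and through statsmodels.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Synthetic measurements for three hypothetical treatment groups.
rng = np.random.default_rng(3)
groups = {
    "control": rng.normal(10.0, 1.0, 8),
    "low_dose": rng.normal(10.5, 1.0, 8),
    "high_dose": rng.normal(13.0, 1.0, 8),
}

# Shapiro-Wilk per group: a small p-value suggests departure from normality.
normality_p = {}
for name, values in groups.items():
    _, p = stats.shapiro(values)
    normality_p[name] = p

# One-way ANOVA across all groups.
f_stat, anova_p = stats.f_oneway(*groups.values())

# Pairwise t-tests with Bonferroni correction: multiply each raw
# p-value by the number of comparisons and cap the result at 1.
pairs = list(combinations(groups, 2))
raw_p = {}
adjusted_p = {}
for a, b in pairs:
    _, p = stats.ttest_ind(groups[a], groups[b])
    raw_p[(a, b)] = p
    adjusted_p[(a, b)] = min(p * len(pairs), 1.0)

print(f"ANOVA: F = {f_stat:.2f}, p = {anova_p:.2g}")
for pair, p in adjusted_p.items():
    print(f"{pair[0]} vs {pair[1]}: Bonferroni-adjusted p = {p:.2g}")
```

Whichever correction you use, name it in the figure caption along with the error bar definition.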
Practical Advice for Getting Started
Start with the technique closest to your current experiment. Copy the working example, replace the sample data with your own CSV, and adjust labels. A working figure is more motivating than an abstract tutorial.
Once you have one result, the pattern generalizes: load data with pandas, run the analysis with scipy or statsmodels, and plot with matplotlib or plotly.
Starting with PLOTIVY: upload your CSV, describe the technique you need, and the analysis runs immediately - no environment to configure. The generated code is yours to keep, inspect, and extend.
Ready to Analyze Your Data?
Upload your dataset and generate publication-quality figures with AI-powered analysis. No installation required.