
Baseline Correction in Python for Spectroscopy

Technique overview

Remove baseline drift from spectroscopy and chromatography traces before peak analysis or quantitative fitting.

Baseline drift can distort peak height, peak area, and fitted parameters in Raman, FTIR, UV-Vis, fluorescence, and chromatography data. A strong baseline can make a peak appear larger than it is or hide small features entirely. Baseline correction should be treated as a preprocessing model with assumptions, not as a cosmetic adjustment. This workflow compares polynomial fitting and asymmetric least squares so the corrected trace can be interpreted and reproduced.



Mathematical Foundation

Baseline correction treats the measured trace as a signal of interest plus a slowly varying background. The background is estimated from the trace and subtracted:

corrected_signal = raw_signal - estimated_baseline

Parameter breakdown

raw_signal: Measured intensity or detector response
estimated_baseline: Slowly varying background estimated from the trace
lambda: Smoothness penalty for asymmetric least squares (lam in the code)
p: Asymmetry weight for asymmetric least squares; small values keep the baseline below peaks
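
For reference, the asymmetric least squares baseline is commonly written as the minimizer of a weighted, penalized least-squares objective. The sketch below uses symbols that do not appear in the text above and are introduced here as assumptions: y_i for raw_signal, z_i for estimated_baseline, and w_i for the per-point weight. It follows the usual formulation attributed to Eilers and Boelens and matches the weight update used in the implementation code on this page.

```latex
\hat{z} = \arg\min_{z} \; \sum_{i=1}^{n} w_i \,(y_i - z_i)^2
        \;+\; \lambda \sum_{i=1}^{n-2} \left(z_i - 2 z_{i+1} + z_{i+2}\right)^2,
\qquad
w_i =
\begin{cases}
p,     & y_i > z_i \\
1 - p, & y_i < z_i
\end{cases}
```

With the weights held fixed, the minimizer solves the linear system (W + lambda * D^T D) z = W y, where W is the diagonal matrix of weights and D is the second-difference operator. The weights are then recomputed from the new z and the solve is repeated for a fixed number of iterations.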

When to use this technique

Use baseline correction before peak fitting, peak integration, or comparing spectra when background drift changes across samples.


Implementation Code

The core data processing logic. Copy this block and replace the sample data with your measurements.

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def asymmetric_least_squares(y, lam=1e5, p=0.01, n_iter=10):
    """Estimate a slowly varying baseline with asymmetric least squares."""
    y = np.asarray(y, dtype=float)
    length = len(y)
    # Second-difference operator: penalizes curvature in the baseline
    diff = sparse.diags([1, -2, 1], [0, 1, 2], shape=(length - 2, length))
    penalty = lam * (diff.T @ diff)
    weights = np.ones(length)
    for _ in range(n_iter):
        W = sparse.spdiags(weights, 0, length, length)
        # Solve (W + lam * D^T D) z = W y for the current weights
        baseline = spsolve(W + penalty, weights * y)
        # Points above the baseline (likely peaks) get weight p, points below get 1 - p
        weights = p * (y > baseline) + (1 - p) * (y < baseline)
    return baseline

# Sample data: quadratic baseline + two Gaussian-shaped peaks + noise (fixed seed)
x = np.linspace(400, 1800, 800)
baseline_true = 0.0000004 * (x - 1100) ** 2 + 0.18
peaks = 0.8 * np.exp(-((x - 820) / 22) ** 2) + 0.55 * np.exp(-((x - 1320) / 35) ** 2)
raw = baseline_true + peaks + np.random.default_rng(5).normal(0, 0.025, size=x.size)

# Replace x and raw with your own measurements, then tune lam and p
baseline = asymmetric_least_squares(raw, lam=1e6, p=0.01)
corrected = raw - baseline
print(f"Baseline range: {baseline.min():.3f} to {baseline.max():.3f}")

Visualization Code

Complete matplotlib code for a publication-ready figure. Copy, paste into your notebook, and adjust labels to match your data.

import numpy as np
import matplotlib.pyplot as plt
from scipy import sparse
from scipy.sparse.linalg import spsolve

def asymmetric_least_squares(y, lam=1e6, p=0.01, n_iter=10):
    y = np.asarray(y, dtype=float)
    length = len(y)
    D = sparse.diags([1, -2, 1], [0, 1, 2], shape=(length - 2, length))
    weights = np.ones(length)
    for _ in range(n_iter):
        W = sparse.spdiags(weights, 0, length, length)
        baseline = spsolve(W + lam * D.T @ D, weights * y)
        weights = p * (y > baseline) + (1 - p) * (y < baseline)
    return baseline

rng = np.random.default_rng(5)
x = np.linspace(400, 1800, 800)
baseline_true = 0.0000004 * (x - 1100) ** 2 + 0.18
peaks = 0.8 * np.exp(-((x - 820) / 22) ** 2) + 0.55 * np.exp(-((x - 1320) / 35) ** 2)
raw = baseline_true + peaks + rng.normal(0, 0.025, size=x.size)
baseline = asymmetric_least_squares(raw)
corrected = raw - baseline

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6), sharex=True)
ax1.plot(x, raw, color="#888888", lw=1, label="Raw spectrum")
ax1.plot(x, baseline, color="#9240ff", lw=2, label="Estimated baseline")
ax1.set_ylabel("Intensity")
ax1.set_title("Spectroscopy Baseline Correction")
ax1.legend(frameon=False)

ax2.plot(x, corrected, color="#111111", lw=1.1)
ax2.axhline(0, color="#9240ff", lw=1, ls=":")
ax2.set_xlabel(r"Wavenumber (cm$^{-1}$)")
ax2.set_ylabel("Corrected intensity")
ax1.spines[["top", "right"]].set_visible(False)
ax2.spines[["top", "right"]].set_visible(False)
plt.tight_layout()
plt.savefig("baseline_correction_spectroscopy.png", dpi=300, bbox_inches="tight")
plt.show()

Polynomial Baseline for Peak-Free Regions

When you can identify baseline-only regions, a low-order polynomial fit is transparent and easy to report.

mask = ((x < 650) | ((x > 1000) & (x < 1150)) | (x > 1550))
coeff = np.polyfit(x[mask], raw[mask], deg=2)
poly_baseline = np.polyval(coeff, x)
poly_corrected = raw - poly_baseline
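
Because the sample data is synthetic, the true baseline is known, so the two methods can be compared directly. The sketch below is one way to do that, assuming the same synthetic spectrum and peak-free mask as above. The names als_baseline and rmse are helpers introduced here for illustration and are not part of the original code.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def als_baseline(y, lam=1e6, p=0.01, n_iter=10):
    """Asymmetric least squares baseline (same scheme as the implementation code)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = sparse.diags([1, -2, 1], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        W = sparse.spdiags(w, 0, n, n)
        z = spsolve((W + penalty).tocsc(), w * y)
        w = p * (y > z) + (1 - p) * (y < z)
    return z

def rmse(estimate, truth):
    """Root-mean-square error between an estimated and a reference baseline."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

# Same synthetic spectrum as in the implementation code
rng = np.random.default_rng(5)
x = np.linspace(400, 1800, 800)
baseline_true = 4e-7 * (x - 1100) ** 2 + 0.18
peaks = 0.8 * np.exp(-((x - 820) / 22) ** 2) + 0.55 * np.exp(-((x - 1320) / 35) ** 2)
raw = baseline_true + peaks + rng.normal(0, 0.025, size=x.size)

# Polynomial baseline fitted on peak-free regions only
mask = (x < 650) | ((x > 1000) & (x < 1150)) | (x > 1550)
poly_baseline = np.polyval(np.polyfit(x[mask], raw[mask], deg=2), x)

# ALS baseline fitted on the full trace
als = als_baseline(raw, lam=1e6, p=0.01)

poly_rmse = rmse(poly_baseline, baseline_true)
als_rmse = rmse(als, baseline_true)
print(f"Polynomial RMSE vs true baseline: {poly_rmse:.4f}")
print(f"ALS RMSE vs true baseline:        {als_rmse:.4f}")
```

The polynomial has an advantage here because the synthetic baseline is itself quadratic and the peak-free regions are known exactly. Real spectra rarely offer either, so this comparison is a sanity check on synthetic data rather than a ranking of the two methods.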

Common Errors and How to Fix Them

Baseline cuts through peaks

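Why: The baseline model is too flexible (lambda too small) or the asymmetry parameter p is too high, so points on the peaks pull the baseline upward.

Fix: Increase lambda for a smoother baseline and reduce p so that points above the baseline carry less weight in the fit.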
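
One way to see the effect of this fix is to sweep lambda and p on the synthetic spectrum and measure how far the estimated baseline is lifted above the true baseline at a peak center. The sketch below does that. The grid values and the "lift" diagnostic are illustrative assumptions, not recommended settings, and als_baseline is a helper name introduced here.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def als_baseline(y, lam, p, n_iter=10):
    """Asymmetric least squares baseline (same scheme as the implementation code)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = sparse.diags([1, -2, 1], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        W = sparse.spdiags(w, 0, n, n)
        z = spsolve((W + penalty).tocsc(), w * y)
        w = p * (y > z) + (1 - p) * (y < z)
    return z

# Same synthetic spectrum as in the implementation code
rng = np.random.default_rng(5)
x = np.linspace(400, 1800, 800)
baseline_true = 4e-7 * (x - 1100) ** 2 + 0.18
peaks = 0.8 * np.exp(-((x - 820) / 22) ** 2) + 0.55 * np.exp(-((x - 1320) / 35) ** 2)
raw = baseline_true + peaks + rng.normal(0, 0.025, size=x.size)

# Index of the sample closest to the center of the first peak
peak_idx = int(np.argmin(np.abs(x - 820)))

# Lift = estimated baseline minus true baseline under the peak.
# A large positive lift means the baseline is cutting through the peak.
results = {}
for lam in (1e3, 1e5, 1e7):
    for p in (0.001, 0.05):
        z = als_baseline(raw, lam=lam, p=p)
        results[(lam, p)] = float(z[peak_idx] - baseline_true[peak_idx])

for (lam, p), lift in results.items():
    print(f"lam={lam:.0e}  p={p:<5}  lift under 820 peak: {lift:+.3f}")
```

On real data the true baseline is unknown, so the same sweep is usually judged by overlaying each estimated baseline on the raw trace.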

Negative corrected intensities are interpreted as real dips

Why: Baseline subtraction can shift noise below zero.

Fix: Inspect raw and corrected traces together and avoid over-interpreting small negative residuals.
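
A simple check is to compare negative values against the noise level measured in a peak-free region. The sketch below assumes the synthetic spectrum and peak-free mask from the polynomial example. The 3-sigma cutoff is a common convention used here as an assumption, not a rule from this guide.

```python
import numpy as np

# Same synthetic spectrum as in the implementation code
rng = np.random.default_rng(5)
x = np.linspace(400, 1800, 800)
baseline_true = 4e-7 * (x - 1100) ** 2 + 0.18
peaks = 0.8 * np.exp(-((x - 820) / 22) ** 2) + 0.55 * np.exp(-((x - 1320) / 35) ** 2)
raw = baseline_true + peaks + rng.normal(0, 0.025, size=x.size)

# Polynomial baseline fitted on peak-free regions, then subtracted
mask = (x < 650) | ((x > 1000) & (x < 1150)) | (x > 1550)
corrected = raw - np.polyval(np.polyfit(x[mask], raw[mask], deg=2), x)

# Noise level from peak-free points only; ddof=3 accounts for the 3 fitted coefficients
noise_sigma = float(np.std(corrected[mask], ddof=3))

# Flag only points that fall well below what noise alone would explain
threshold = -3.0 * noise_sigma
suspicious = corrected < threshold

print(f"Estimated noise sigma: {noise_sigma:.4f}")
print(f"Points below {threshold:.4f}: {int(suspicious.sum())} of {x.size}")
```

Negative values that stay within a few noise sigmas of zero are consistent with noise and should not be read as absorption dips or real features.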

Baseline parameters are not reported

Why: Preprocessing choices affect peak areas and fitted results.

Fix: Report the baseline method, lambda, p, polynomial degree, and excluded regions.
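
One lightweight way to keep these settings with the results is to store them in a small machine-readable record next to the exported figure. The sketch below uses only the standard library. The field names are hypothetical, and the values are the ones used in the examples on this page.

```python
import json

# Hypothetical record layout; field names are illustrative, not a standard
baseline_report = {
    "method": "asymmetric_least_squares",
    "als": {"lam": 1e6, "p": 0.01, "n_iter": 10},
    "polynomial": {
        "degree": 2,
        # Regions (in cm^-1) left out of the polynomial fit because they contain peaks
        "excluded_regions": [[650, 1000], [1150, 1550]],
    },
    "libraries": ["numpy", "scipy"],
}

report_json = json.dumps(baseline_report, indent=2)
print(report_json)
```

The same text can be written to a file alongside the figure so that the preprocessing can be reproduced later.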


Python Libraries

numpy, scipy, matplotlib

Quick Info

Domain: Signal Processing
Typical Audience: Spectroscopists and analytical chemists cleaning Raman, FTIR, UV-Vis, or chromatography data before peak quantification
