What you will find in this guide

The analysis pipeline

Import, baseline, smooth, detect, fit, visualize

Noise reduction and preprocessing

Savitzky-Golay filtering and baseline correction

Peak identification and fitting

Detection, Gaussian/Lorentzian models, deconvolution

Gaussian vs Lorentzian vs Voigt comparison

Which line shape to choose for your technique

Multivariate spectral analysis

PCA for spectral classification and mapping

Technique reference

6 core techniques with application descriptions

Why this guide uses Python - and why you do not have to write it

Python is the standard for spectroscopy analysis. PLOTIVY removes the implementation friction.

scipy, numpy, and matplotlib provide every step of the spectroscopy pipeline - and produce results that are:

Reproducible: re-run with different smoothing parameters or an additional peak component by changing one line after a review
Precise: smoothing window length, number of Gaussian components, baseline polynomial degree are explicit parameters in the script
Auditable: every fit result, every FWHM, every baseline subtraction is a documented step - nothing hidden in a GUI
Publication-ready: export at 600 DPI, control axis labels and residual panels precisely, match journal formatting requirements

The friction is writing the code correctly. PLOTIVY generates and executes the Python code for you in the browser - upload your spectrum, describe the analysis, and the code is generated and run instantly. Your results remain a real Python script you can inspect, copy, and run independently.

This guide covers the concepts behind the code - what each function computes, why parameter choices matter - so you can verify and defend every step of your analysis.

The Spectroscopy Data Analysis Pipeline

Spectroscopy generates some of the most information-dense data in experimental science. A single Raman spectrum may contain hundreds of peaks encoding molecular fingerprints. An FTIR time series can track chemical reactions across thousands of wavenumber channels simultaneously. UV-Vis absorbance measurements underpin quantitative assays across chemistry, biology, and materials science.

In every case, the raw instrument output requires processing before it becomes a publishable result. The standard workflow follows a consistent sequence:

1. Import the raw spectrum
2. Correct the baseline
3. Reduce noise with smoothing
4. Identify peaks
5. Fit quantitative models to those peaks
6. Produce a figure that communicates the result clearly

Key insight: Python has become the dominant language for this pipeline because scipy, numpy, and matplotlib provide every component - from signal filtering to nonlinear curve fitting - in a single, reproducible script.
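The first five steps can be sketched in one short script. This is a minimal illustration on a simulated spectrum, not a reference implementation: the lorentzian helper, the peak-free regions used for the baseline, and every numeric parameter are assumptions chosen for the synthetic data.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.signal import find_peaks, savgol_filter


def lorentzian(x, amplitude, center, gamma):
    """Lorentzian line; gamma is the half-width at half-maximum."""
    return amplitude * gamma**2 / ((x - center) ** 2 + gamma**2)


# 1. "Import": a simulated spectrum stands in for a measured file.
rng = np.random.default_rng(0)
x = np.linspace(400.0, 1800.0, 1401)            # 1 cm^-1 spacing
background = 20.0 + 0.01 * (x - 400.0)          # slowly varying baseline
y = lorentzian(x, 100.0, 1000.0, 8.0) + background + rng.normal(0.0, 1.0, x.size)

# 2. Baseline: fit a straight line to regions assumed to be peak-free.
peak_free = (x < 800.0) | (x > 1200.0)
coeffs = np.polyfit(x[peak_free], y[peak_free], deg=1)
y_corrected = y - np.polyval(coeffs, x)

# 3. Smooth: used here only to make peak detection robust.
y_smooth = savgol_filter(y_corrected, window_length=11, polyorder=3)

# 4. Detect: a prominence threshold well above the noise level.
peaks, _ = find_peaks(y_smooth, prominence=20.0)

# 5. Fit: model the baseline-corrected (unsmoothed) data,
#    seeded with the detected position and height.
p0 = [y_smooth[peaks[0]], x[peaks[0]], 5.0]
popt, pcov = curve_fit(lorentzian, x, y_corrected, p0=p0)
fit_center = popt[1]
fit_fwhm = 2.0 * abs(popt[2])

print(f"center = {fit_center:.2f}, FWHM = {fit_fwhm:.2f}")
```

Fitting the unsmoothed, baseline-corrected data is one reasonable design choice: smoothing helps detection, but the fit itself already averages over noise.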

When to Use This Spectroscopy Workflow

Use this workflow when a raw spectrum needs to become a defensible result: a peak table, a calibration curve, a PCA score plot, or a manuscript figure. The same structure applies whether the input is a single Raman spectrum, an FTIR series, UV-Vis absorbance data, or an NMR line shape that needs quantitative fitting.

Noise Reduction and Signal Preprocessing

Raw spectra almost always contain noise. Detector electronics, thermal fluctuations, and photon statistics introduce random variation that obscures weak features and complicates peak fitting.

The challenge is removing noise without distorting the peaks you care about - a smoothing filter that broadens a narrow Raman line or shifts its center frequency defeats the purpose of the measurement.

Savitzky-Golay Filtering

Savitzky-Golay filtering fits a local polynomial to successive windows of the spectrum. It preserves peak shape, position, and relative height far better than a moving average of the same width, while still averaging out high-frequency noise. Two parameters control the behavior:

Window length: wider for broad FTIR absorption bands; shorter for narrow Raman lines
Polynomial order: higher order preserves more peak character but reduces noise suppression
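The effect of the window length can be seen with scipy.signal.savgol_filter on a simulated peak. This is a sketch under stated assumptions: the peak, the noise level, and both window lengths are illustrative values, not recommendations for any instrument.

```python
import numpy as np
from scipy.signal import savgol_filter

# Simulated Gaussian peak (sigma = 2 units, i.e. 20 samples) plus noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 100.0, 1001)
clean = np.exp(-0.5 * ((x - 50.0) / 2.0) ** 2)
noisy = clean + rng.normal(0.0, 0.05, x.size)

# Window shorter than the peak width: noise drops, shape is retained.
smooth_narrow = savgol_filter(noisy, window_length=15, polyorder=3)

# Window much wider than the peak: the peak is flattened and broadened.
smooth_wide = savgol_filter(noisy, window_length=201, polyorder=3)


def rms_error(estimate, reference):
    """Root-mean-square deviation from the noise-free signal."""
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))


print("RMS error, raw:   ", rms_error(noisy, clean))
print("RMS error, narrow:", rms_error(smooth_narrow, clean))
print("peak height, narrow:", smooth_narrow.max())
print("peak height, wide:  ", smooth_wide.max())
```

window_length must be odd and larger than polyorder. A common starting point is a window somewhat narrower than the narrowest peak of interest, then adjusting while watching the peak height.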

Baseline Correction

Fluorescence background in Raman spectra, scattering contributions in UV-Vis, and detector drift in NMR all produce slowly varying baselines that shift peak intensities.

Important: Always perform baseline correction before fitting. Fitting an uncorrected spectrum typically inflates peak areas and can bias center positions.
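One simple approach, shown here as a sketch rather than the only method, is to fit a low-order polynomial to regions known to be peak-free and subtract it. The spectrum is simulated and noise-free for clarity. The peak-free regions and the polynomial degree are assumptions that have to be chosen per dataset; other schemes, such as asymmetric least squares, are not covered here.

```python
import numpy as np
from scipy.integrate import trapezoid

# Simulated spectrum: one Gaussian peak on a curved baseline.
x = np.linspace(200.0, 2000.0, 901)
peak = 50.0 * np.exp(-0.5 * ((x - 1100.0) / 15.0) ** 2)
baseline_true = 30.0 + 0.02 * (x - 200.0) - 5e-6 * (x - 200.0) ** 2
y = peak + baseline_true

# Fit the baseline using only points assumed to contain no peak signal.
peak_free = (x < 1000.0) | (x > 1200.0)
baseline_poly = np.polynomial.Polynomial.fit(x[peak_free], y[peak_free], deg=2)
y_corrected = y - baseline_poly(x)

# Compare integrated area over the peak window before and after correction.
window = (x >= 1000.0) & (x <= 1200.0)
area_raw = trapezoid(y[window], x[window])
area_corrected = trapezoid(y_corrected[window], x[window])
area_expected = 50.0 * 15.0 * np.sqrt(2.0 * np.pi)   # analytic Gaussian area

print(f"raw area:       {area_raw:.1f}")
print(f"corrected area: {area_corrected:.1f}")
print(f"expected area:  {area_expected:.1f}")
```

In this simulation the uncorrected area is several times the true peak area, which is the inflation the note above warns about.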

Peak Identification and Quantitative Fitting

Once the spectrum is clean, the next step is identifying which features are real peaks and which are residual noise. Manual inspection works for simple spectra with a handful of well-separated lines, but Raman maps with thousands of spectra demand automated detection.

The scipy.signal.find_peaks function is the standard tool. The Peak Detection technique page covers how to tune its parameters - minimum height, prominence, and minimum distance between peaks - for different spectral types.
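A minimal sketch of the three parameters named above, applied to a simulated spectrum. The threshold values are assumptions tuned to this synthetic noise level; real data will need different numbers.

```python
import numpy as np
from scipy.signal import find_peaks


def gaussian(x, amplitude, center, sigma):
    return amplitude * np.exp(-0.5 * ((x - center) / sigma) ** 2)


# Simulated spectrum with three peaks and noise (sigma_noise = 0.2).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1000.0, 2001)               # 0.5 units per sample
y = (
    gaussian(x, 10.0, 300.0, 5.0)
    + gaussian(x, 4.0, 600.0, 5.0)
    + gaussian(x, 7.0, 800.0, 5.0)
    + rng.normal(0.0, 0.2, x.size)
)

peaks, properties = find_peaks(
    y,
    height=1.0,       # ignore maxima below this absolute intensity
    prominence=2.0,   # require the peak to stand out from its surroundings
    distance=20,      # minimum separation between peaks, in samples
)

peak_positions = x[peaks]
for pos, prom in zip(peak_positions, properties["prominences"]):
    print(f"peak at {pos:.1f}, prominence {prom:.2f}")
```

Note that distance is given in samples, not in axis units, and that prominence is usually the most effective single filter against noise spikes on a peak's shoulder.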

Multi-peak deconvolution - separating overlapping lines into individual components - is one of the most common tasks in Raman and fluorescence analysis. Set up composite models with multiple Gaussian or Lorentzian terms and use detected peak positions as initial guesses.
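As an illustration of that setup, the sketch below fits a sum of two Lorentzians with scipy.optimize.curve_fit, seeding the fit from find_peaks. The data are simulated, and the helper names (lorentzian, two_lorentzians), the initial width guess, and the prominence threshold are assumptions for this example.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.signal import find_peaks


def lorentzian(x, amplitude, center, gamma):
    """Lorentzian line; gamma is the half-width at half-maximum."""
    return amplitude * gamma**2 / ((x - center) ** 2 + gamma**2)


def two_lorentzians(x, a1, c1, g1, a2, c2, g2):
    """Composite model: sum of two Lorentzian components."""
    return lorentzian(x, a1, c1, g1) + lorentzian(x, a2, c2, g2)


# Simulated overlapping doublet with noise.
rng = np.random.default_rng(3)
x = np.linspace(1200.0, 1800.0, 1201)
true_params = np.array([100.0, 1450.0, 15.0, 70.0, 1500.0, 15.0])
y = two_lorentzians(x, *true_params) + rng.normal(0.0, 1.0, x.size)

# Detected maxima provide initial guesses for amplitude and center.
peaks, _ = find_peaks(y, prominence=15.0)
p0 = [y[peaks[0]], x[peaks[0]], 10.0, y[peaks[1]], x[peaks[1]], 10.0]

# Non-negative bounds keep amplitudes and widths physically meaningful.
popt, pcov = curve_fit(two_lorentzians, x, y, p0=p0, bounds=(0.0, np.inf))
perr = np.sqrt(np.diag(pcov))                    # 1-sigma parameter uncertainties

for name, value, err in zip(["a1", "c1", "g1", "a2", "c2", "g2"], popt, perr):
    print(f"{name} = {value:.2f} +/- {err:.2f}")
```

When components overlap so strongly that only one maximum is visible, peak detection cannot supply both guesses, and the initial positions have to come from prior knowledge of the sample.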

For quantitative spectroscopy, the Beer-Lambert law establishes a linear relationship between absorbance and concentration. The Linear Regression page provides the full workflow: measure standards, fit a calibration line with error bars, compute R-squared, and use the model to predict unknown concentrations with uncertainty estimates.
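A calibration of this kind can be sketched with scipy.stats.linregress. The standards below are simulated from an assumed slope with added noise, not measured values. The uncertainty on the unknown uses the standard inverse-prediction formula for a single reading and assumes constant noise across the calibration range.

```python
import numpy as np
from scipy.stats import linregress

# Simulated calibration standards (illustrative, not measured data).
rng = np.random.default_rng(5)
conc = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])        # concentration units
absorbance = 0.05 * conc + rng.normal(0.0, 0.003, conc.size)

fit = linregress(conc, absorbance)
r_squared = fit.rvalue**2

# Residual standard deviation and spread of the standards.
n = conc.size
residuals = absorbance - (fit.intercept + fit.slope * conc)
s_y = np.sqrt(np.sum(residuals**2) / (n - 2))
s_xx = np.sum((conc - conc.mean()) ** 2)

# Predict an unknown from one absorbance reading, with its standard error.
a_unknown = 0.25
c_unknown = (a_unknown - fit.intercept) / fit.slope
c_std = (s_y / abs(fit.slope)) * np.sqrt(
    1.0 + 1.0 / n + (a_unknown - absorbance.mean()) ** 2 / (fit.slope**2 * s_xx)
)

print(f"slope = {fit.slope:.4f} +/- {fit.stderr:.4f}")
print(f"R^2 = {r_squared:.4f}")
print(f"unknown concentration = {c_unknown:.2f} +/- {c_std:.2f}")
```

The prediction is most precise near the middle of the calibration range, which is why the last term grows as the unknown's absorbance moves away from the mean of the standards.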

Gaussian vs Lorentzian vs Voigt

Choosing the wrong line shape is one of the most common errors in spectral fitting. Use this table to select the correct model for your technique.

Line Shape	Best For	Avoid For	Key Notes
Gaussian	Fluorescence emission, UV-Vis absorption bands, inhomogeneous broadening	Raman lines (Lorentzian is more accurate)	FWHM = 2.355σ; faster decay in the tails
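Lorentzian	Raman peaks, NMR lines, homogeneous broadening	Broad fluorescence backgrounds	Heavier tails than Gaussian; FWHM = 2γ when γ is the half-width at half-maximum (some references define γ as the full width instead)
Voigt	FTIR absorption bands, X-ray diffraction peaks	Simple spectra where extra parameters are not justified	Convolution of a Gaussian and a Lorentzian, so the most general of the three; adds one extra width parameter (both σ and γ); scipy.special.voigt_profile evaluates it directly, and a pseudo-Voigt is a cheaper approximation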
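The width relations and tail behavior in the table can be checked numerically. In this sketch γ is taken as the Lorentzian half-width at half-maximum, which is the convention scipy.special.voigt_profile uses; the fwhm helper and the chosen σ and γ are assumptions for illustration.

```python
import numpy as np
from scipy.special import voigt_profile


def gaussian(x, sigma):
    return np.exp(-0.5 * (x / sigma) ** 2)


def lorentzian(x, gamma):
    return gamma**2 / (x**2 + gamma**2)


def fwhm(x, y):
    """Numerical full width at half maximum of a single symmetric peak."""
    above_half = x[y >= y.max() / 2.0]
    return above_half[-1] - above_half[0]


x = np.linspace(-50.0, 50.0, 200001)             # fine grid, 0.0005 spacing
sigma, gamma = 2.0, 2.0

fwhm_gauss = fwhm(x, gaussian(x, sigma))         # expect 2*sqrt(2 ln 2)*sigma
fwhm_lorentz = fwhm(x, lorentzian(x, gamma))     # expect 2*gamma
fwhm_voigt = fwhm(x, voigt_profile(x, sigma, gamma))

# Tail comparison at 5 half-widths from the center (both peaks have height 1).
tail_gauss = gaussian(10.0, sigma)
tail_lorentz = lorentzian(10.0, gamma)

print(f"Gaussian FWHM:   {fwhm_gauss:.4f}")
print(f"Lorentzian FWHM: {fwhm_lorentz:.4f}")
print(f"Voigt FWHM:      {fwhm_voigt:.4f}")
print(f"tail ratio Lorentzian/Gaussian: {tail_lorentz / tail_gauss:.0f}")
```

The Voigt width comes out larger than either component alone but smaller than their sum, which is why reported FWHM values depend on the model chosen.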

Multivariate Spectral Analysis

When you have many spectra - from different samples, different positions on a surface map, or different time points - univariate peak analysis may not capture the full picture. Principal Component Analysis (PCA) treats each spectrum as a point in a high-dimensional space and finds the directions of maximum variance.

PCA is widely used in Raman mapping to distinguish chemical phases across a surface, in NIR spectroscopy for material classification, and in process analytical technology (PAT) for monitoring manufacturing.

What PCA tells you about your spectra

Score plots: which samples (or spatial positions) cluster together chemically
Loading vectors: which spectral features (wavenumbers) drive the separation
Explained variance: how many components are needed to describe most of the variation
Outliers: samples that are chemically distinct from the rest of the dataset
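These quantities can be computed with a plain singular value decomposition. The sketch below uses NumPy only and simulated spectra from two hypothetical chemical phases; the band positions, noise level, and group sizes are assumptions. A dedicated PCA implementation would give equivalent scores and loadings, up to sign.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(400.0, 1800.0, 700)              # wavenumber axis


def band(center, width):
    return np.exp(-0.5 * ((x - center) / width) ** 2)


# Two simulated phases: they share a band at 1600 and differ at 1000 vs 1400.
n_per_group = 20
phase_a = band(1000.0, 20.0) + 0.3 * band(1600.0, 30.0)
phase_b = band(1400.0, 20.0) + 0.3 * band(1600.0, 30.0)
spectra = np.vstack([
    phase_a + rng.normal(0.0, 0.02, (n_per_group, x.size)),
    phase_b + rng.normal(0.0, 0.02, (n_per_group, x.size)),
])                                               # rows = spectra, columns = channels

# PCA via SVD of the mean-centered data matrix.
centered = spectra - spectra.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

scores = U * S                                   # sample coordinates (score plot)
loadings = Vt                                    # one loading vector per component
explained = S**2 / np.sum(S**2)                  # explained variance ratio

pc1_a = scores[:n_per_group, 0]
pc1_b = scores[n_per_group:, 0]
top_channel = x[np.argmax(np.abs(loadings[0]))]

print(f"PC1 explains {explained[0]:.1%} of the variance")
print(f"PC1 scores: phase A mean {pc1_a.mean():.2f}, phase B mean {pc1_b.mean():.2f}")
print(f"largest PC1 loading near {top_channel:.0f}")
```

The shared band at 1600 does not appear in the first loading vector because it does not vary between samples: PCA describes variation, not intensity.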

Spectroscopy Techniques

Gaussian Fitting

Fit emission and absorption lines to quantify peak parameters

Type: Raman, fluorescence, UV-Vis

Lorentzian Fitting

Fit homogeneous peaks in NMR and Raman with heavier tails

Type: NMR, Raman, lifetime-broadened peaks

Peak Detection

Automated peak identification across full spectra

Type: All spectroscopy

Savitzky-Golay Smoothing

Noise reduction that preserves peak shape and position

Type: FTIR, Raman, NMR

Linear Regression

Beer-Lambert calibration curves for quantitative analysis

Type: UV-Vis quantitative analysis

PCA Visualization

Spectral classification, mapping, and outlier detection

Type: Raman mapping, NIR

Common Mistakes to Avoid

Smoothing too aggressively: a window that is too wide broadens peaks and shifts their centers.
Skipping baseline subtraction before fitting - fluorescence background in Raman inflates peak integrated areas.
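Defaulting to a Gaussian model for Raman lines: lifetime-broadened Raman peaks are usually closer to Lorentzian, or Voigt once instrument broadening matters. The choice affects the reported FWHM.
Providing initial guesses far from the true peak center - curve_fit can converge to a local minimum or fail to converge at all.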
Comparing FWHM values across instruments with different spectral resolutions without correcting for the instrument function.
Running PCA on raw spectra without baseline correction - the first principal component will reflect baseline variation, not chemistry.
