How Machine Learning Models Estimate Earthquake Probability

A deep dive into the per-band algorithm competition system Talivio uses to generate daily seismic risk forecasts across 10 global regions.

Earthquake probability forecasting is one of the hardest problems in applied geophysics. Unlike weather forecasting, seismic forecasting has to work from sparse, noisy catalogs and complex fault interactions. Talivio's approach is to treat each magnitude band as an independent binary classification problem, then run an algorithm competition to find the best model for each band.

The Band Architecture

Rather than forecasting a single "will an earthquake happen?" probability, Talivio separates the problem into four distinct magnitude windows:

M 4.0–5.0: Frequent, moderate events — 102 core features, ~3× negative sampling
M 5.0–6.0: Felt events — 118 features, ~5× negative sampling
M 6.0–7.0: Damaging events — 120 features, ~10× negative sampling
M 7.0+: Major/great earthquakes — 125 features, ~20× negative sampling

Each band has a different class imbalance. Large earthquakes are extremely rare, requiring aggressive negative sampling to train a useful classifier.
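The per-band sampling ratios can be sketched as a simple downsampling step. This is a minimal illustration, not Talivio's code: it assumes "~N× negative sampling" means keeping about N negative examples per positive, and the band keys and the function name sample_training_indices are hypothetical.

```python
import random

# Negatives kept per positive, taken from the "~N×" figures listed above.
# Reading "N×" as a negatives-to-positives ratio is an assumption.
NEGATIVE_RATIO = {"M4-5": 3, "M5-6": 5, "M6-7": 10, "M7+": 20}


def sample_training_indices(labels, ratio, seed=0):
    """Keep every positive and at most ratio * n_positives random negatives."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(negatives), ratio * len(positives))
    kept = rng.sample(negatives, n_keep)
    return sorted(positives + kept)


# Toy example: 5 event windows among 1,000 region-days.
labels = [1 if i % 200 == 0 else 0 for i in range(1000)]
idx = sample_training_indices(labels, NEGATIVE_RATIO["M7+"])
print(len(idx))  # 105: 5 positives + 100 sampled negatives
```

Fixing the seed keeps the sampled negatives reproducible between runs, which matters when models from different runs are compared.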

Algorithm Competition

For each band, five algorithms compete: LightGBM, Random Forest, Gradient Boosting, ExtraTrees, and Calibrated Logistic Regression. After training, each model is evaluated on a held-out test set. The model with the highest ROC-AUC becomes the "champion" for that band.
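To make the selection rule concrete, here is a small sketch that computes ROC-AUC from held-out scores and picks the top model. The labels, scores, and function names are made up for illustration. In practice a library routine such as scikit-learn's roc_auc_score would likely replace the hand-written version.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (the Mann-Whitney U formulation)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pick_champion(y_true, scores_by_model):
    """Return (name, auc) of the model with the highest held-out ROC-AUC."""
    aucs = {name: roc_auc(y_true, s) for name, s in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs[best]


# Hypothetical held-out labels and model scores for one band.
y_test = [0, 0, 1, 0, 1, 0]
scores = {
    "lightgbm":      [0.1, 0.2, 0.8, 0.3, 0.7, 0.1],
    "random_forest": [0.2, 0.6, 0.5, 0.3, 0.7, 0.1],
}
print(pick_champion(y_test, scores))  # ('lightgbm', 1.0)
```

The pairwise loop is quadratic in the test-set size, so it suits small illustrations rather than full catalogs.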

When a new training run completes, the newly trained model (the challenger) replaces the champion only if its AUC exceeds the champion's. This champion/challenger pattern keeps a weaker model from being promoted during continuous retraining.
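The promotion rule itself is only a few lines. The sketch below assumes each run is summarized by a name and a held-out AUC. ModelRecord, promote, and the run names are hypothetical, not part of Talivio's pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    name: str
    auc: float  # ROC-AUC on the held-out test set


def promote(champion, challenger):
    """Return the record that should serve forecasts after a training run.

    The challenger wins only on a strictly higher AUC, so a tie or a worse
    run leaves the current champion in place.
    """
    if champion is None or challenger.auc > champion.auc:
        return challenger
    return champion


champion = ModelRecord("extra_trees_run_41", 0.81)
champion = promote(champion, ModelRecord("lightgbm_run_42", 0.79))  # kept
champion = promote(champion, ModelRecord("lightgbm_run_43", 0.84))  # replaced
print(champion.name)  # lightgbm_run_43
```

The comparison is only fair when both AUCs come from the same held-out set. If the test set changes between runs, the champion would need to be re-scored first.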

Feature Engineering

The core set of 102 features combines temporal patterns (b-value, Omori law decay, seismicity rate changes over 7/30/90 days), mechanical signals (Coulomb stress, GNSS strain accumulation), and spatial features (fault proximity, depth distribution). Higher bands add wavelet-decomposed frequency features that capture long-period energy buildup invisible to simpler statistical models.
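As an illustration of two of the temporal features, the sketch below computes a b-value with the standard Aki-Utsu maximum-likelihood estimator and event rates over trailing 7/30/90-day windows. The mini-catalog, the completeness magnitude m_c, the 0.1 bin width, and the feature names are all assumptions. Talivio's actual definitions may differ.

```python
import math


def b_value(magnitudes, m_c, bin_width=0.1):
    """Aki-Utsu maximum-likelihood b-value for events at or above m_c.

    bin_width is the catalog's magnitude rounding; 0.1 is an assumption.
    """
    mags = [m for m in magnitudes if m >= m_c]
    if len(mags) < 2:
        return float("nan")
    mean_excess = sum(mags) / len(mags) - (m_c - bin_width / 2)
    return math.log10(math.e) / mean_excess


def rate_features(event_days, today, windows=(7, 30, 90)):
    """Events per day over each trailing window ending at `today`."""
    return {
        f"rate_{w}d": sum(today - w < d <= today for d in event_days) / w
        for w in windows
    }


# Hypothetical mini-catalog of (day index, magnitude). Six events are far
# too few for a stable b-value; this is for illustration only.
catalog = [(3, 4.1), (40, 4.6), (71, 4.0), (85, 5.2), (88, 4.3), (90, 4.4)]
feats = rate_features([d for d, _ in catalog], today=90)
feats["b_value"] = b_value([m for _, m in catalog], m_c=4.0)
# Short-term rate relative to the 90-day background rate.
feats["rate_ratio_7_90"] = feats["rate_7d"] / feats["rate_90d"]
print(feats)
```

A ratio well above 1 indicates recent activity running ahead of the background rate, which is one simple way to express a "seismicity rate change".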

Probability Calibration

Raw classifier outputs are calibrated using Platt scaling (sigmoid calibration) to convert scores into reliable probabilities. Of all the forecasts issued at 0.15, roughly 15% should be followed by an event in the target band. This calibration is essential for honest uncertainty reporting.
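Platt scaling amounts to fitting a one-dimensional logistic regression on held-out scores. The sketch below fits the two sigmoid parameters by plain gradient descent on log loss. It omits the label smoothing used in Platt's original method, and the scores, outcomes, and function names are invented for illustration.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = _sigmoid(a * s + b) - y  # derivative of log loss w.r.t. z
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw classifier score to a calibrated probability."""
    return _sigmoid(a * score + b)


# Hypothetical raw scores from a held-out calibration set, with outcomes.
raw = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
hit = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
a, b = fit_platt(raw, hit)
probs = [calibrate(s, a, b) for s in raw]
# Mean calibrated probability next to the observed event rate.
print(sum(probs) / len(probs), sum(hit) / len(hit))
```

Once the fit has converged, the mean calibrated probability matches the observed event rate on the calibration set, which is a quick sanity check. With scikit-learn, CalibratedClassifierCV with method="sigmoid" provides the same kind of calibration.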

"A forecast that admits uncertainty is more useful than one that claims false precision." — The core principle behind Talivio's calibrated outputs.

Daily validation against USGS event catalogs marks each forecast as a hit or a miss, feeding back into the retraining pipeline. Over time, the champion/challenger system adapts to shifting seismic patterns, which is crucial in regions like Kahramanmaraş, where the stress field changed dramatically after the February 2023 earthquakes.