Earthquake probability forecasting is one of the hardest problems in applied geophysics. Unlike weather forecasters, seismic forecasters work with sparse, noisy catalogs and complex fault interactions. Talivio's approach is to treat each magnitude band as an independent binary classification problem, then run an algorithm competition to find the best model for each band.
The Band Architecture
Rather than forecasting a single "will an earthquake happen?" probability, Talivio separates the problem into four distinct magnitude windows:
- M 4.0–5.0: Frequent, moderate events — 102 core features, ~3× negative sampling
- M 5.0–6.0: Felt events — 118 features, ~5× negative sampling
- M 6.0–7.0: Damaging events — 120 features, ~10× negative sampling
- M 7.0+: Major/great earthquakes — 125 features, ~20× negative sampling
Class imbalance differs from band to band. Large earthquakes are extremely rare, so the higher bands need more aggressive negative sampling to train a useful classifier.
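The sampling step can be sketched as follows. This is a minimal illustration, not Talivio's actual code: it assumes that "~N× negative sampling" means keeping roughly N negative examples per positive, and the band labels and the `sample_negatives` helper are hypothetical names.

```python
import random

# Negatives kept per positive in each band (ratios from the list above;
# the dictionary keys are illustrative labels, not real identifiers).
NEG_RATIO = {"M4-5": 3, "M5-6": 5, "M6-7": 10, "M7+": 20}

def sample_negatives(labels, ratio, seed=0):
    """Return sorted indices of every positive plus up to `ratio` negatives per positive."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), ratio * len(pos))
    kept = rng.sample(neg, n_keep)  # sampling without replacement
    return sorted(pos + kept)

# Toy example: 4 positive windows among 1000 negative ones.
labels = [1] * 4 + [0] * 1000
idx = sample_negatives(labels, NEG_RATIO["M7+"])
print(len(idx))  # 4 positives + 80 negatives = 84
```

Fixing the seed keeps the sampled training set reproducible between runs, which matters when models from different runs are compared.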
Algorithm Competition
For each band, five algorithms compete: LightGBM, Random Forest, Gradient Boosting, ExtraTrees, and Calibrated Logistic Regression. After training, each model is evaluated on a held-out test set. The model with the highest ROC-AUC becomes the "champion" for that band.
When a new training run completes, the best new model (the challenger) replaces the champion only if its ROC-AUC exceeds the champion's. This champion/challenger pattern guards against performance regressions during continuous retraining.
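A minimal sketch of the selection and promotion logic, using only the standard library. The `roc_auc` and `select_champion` helpers, the model names used as dictionary keys, and the toy scores are illustrative assumptions. ROC-AUC is computed here through its pairwise (Mann-Whitney) definition rather than a library call.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def select_champion(champion_auc, challenger_aucs):
    """Return (name, auc) of the challenger to promote, or (None, champion_auc) to keep the champion."""
    name, best = max(challenger_aucs.items(), key=lambda kv: kv[1])
    if best > champion_auc:  # strict: a tie keeps the current champion
        return name, best
    return None, champion_auc

# Toy held-out labels and model scores for one band.
y_test = [0, 0, 1, 1]
challenger_aucs = {
    "lightgbm": roc_auc(y_test, [0.1, 0.4, 0.35, 0.8]),
    "random_forest": roc_auc(y_test, [0.2, 0.6, 0.3, 0.5]),
}
promoted = select_champion(0.70, challenger_aucs)
kept = select_champion(0.80, challenger_aucs)
print(promoted, kept)
```

The comparison is only meaningful if champion and challenger are scored on the same held-out data. Whether the pipeline guarantees that is not stated here, so treat it as a precondition of this sketch.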
Feature Engineering
The 102-feature core set combines temporal patterns (b-value, Omori law decay, seismicity rate changes over 7/30/90 days), mechanical signals (Coulomb stress, GNSS strain accumulation), and spatial features (fault proximity, depth distribution). Higher bands add wavelet-decomposed frequency features that capture long-period energy buildup invisible to simpler statistical models.
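Two of the temporal features can be illustrated with a short sketch. The b-value uses the standard Aki maximum-likelihood estimator with Utsu's bin-width correction, and the trailing rates count events per day over the 7/30/90-day windows. How Talivio actually computes these features is not specified, so the function names, the completeness magnitude `mc`, and the 0.1 magnitude bin width are assumptions.

```python
import math

def b_value(mags, mc, bin_width=0.1):
    """Aki maximum-likelihood b-value for magnitudes >= mc, with Utsu's bin-width correction."""
    above = [m for m in mags if m >= mc]
    if len(above) < 2:
        raise ValueError("need at least two events at or above mc")
    mean_mag = sum(above) / len(above)
    return math.log10(math.e) / (mean_mag - (mc - bin_width / 2))

def trailing_rates(event_days, now, windows=(7, 30, 90)):
    """Events per day in each trailing window (now - w, now], with times in days."""
    return {w: sum(1 for t in event_days if now - w < t <= now) / w for w in windows}

# Toy catalog: magnitudes and occurrence times (in days) for one region.
mags = [4.0, 4.1, 4.3, 4.6, 5.2]
event_days = [1, 50, 85, 95, 99]

b = b_value(mags, mc=4.0)
rates = trailing_rates(event_days, now=100)
print(round(b, 3), rates)
```

A rate-change feature could then be derived as a ratio between windows, for example `rates[7] / rates[90]`, guarded against division by zero.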
Probability Calibration
Raw classifier outputs are calibrated with Platt scaling (sigmoid calibration) so that scores can be read as probabilities. For a well-calibrated model, roughly 15% of the forecasts issued at 0.15 should be followed by a qualifying event. This calibration is essential for honest uncertainty reporting.
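As a rough illustration of Platt scaling, the sketch below fits a sigmoid p = 1 / (1 + exp(-(a*s + b))) to raw scores by gradient descent on the log loss. It is a simplified version: Platt's original method also smooths the target labels, and a production pipeline would more likely use a library routine. The function names, learning rate, step count, and toy data are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * s + b) by batch gradient descent on mean log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Toy held-out raw scores and outcomes (deliberately not perfectly separable).
scores = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
calibrated = [sigmoid(a * s + b) for s in scores]
print(round(a, 3), round(b, 3))
```

The parameters should be fitted on data the classifier was not trained on. Otherwise the calibration inherits the classifier's overconfidence on its own training set.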
"A forecast that admits uncertainty is more useful than one that claims false precision." — The core principle behind Talivio's calibrated outputs.
Daily validation against USGS event catalogs marks each forecast as hit or miss, feeding back into the retraining pipeline. Over time, the champion/challenger system adapts to shifting seismic patterns — crucial in regions like Kahramanmaraş where the post-2023 stress field changed dramatically.
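One way to turn the daily hit/miss record into a calibration check is a reliability table: group past forecasts by probability bin and compare the mean forecast probability with the observed hit frequency in each bin. The sketch below is an assumption about how such a check could look, not a description of Talivio's validation code. It treats an outcome of 1 as "a qualifying event occurred in the forecast window", and the `reliability_table` name is hypothetical.

```python
from collections import defaultdict

def reliability_table(probs, outcomes, n_bins=10):
    """Map bin index -> (mean forecast probability, observed hit frequency, count)."""
    bins = defaultdict(list)
    for p, y in zip(probs, outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[k].append((p, y))
    table = {}
    for k in sorted(bins):
        ps, ys = zip(*bins[k])
        table[k] = (sum(ps) / len(ps), sum(ys) / len(ys), len(ps))
    return table

# Toy history: 20 forecasts at 0.15 (3 hits) and 10 forecasts at 0.85 (8 hits).
probs = [0.15] * 20 + [0.85] * 10
outcomes = [1] * 3 + [0] * 17 + [1] * 8 + [0] * 2

table = reliability_table(probs, outcomes)
for k, (mean_p, freq, count) in table.items():
    print(k, round(mean_p, 2), round(freq, 2), count)
```

In a well-calibrated system the first two columns stay close in every bin with enough forecasts. For the highest magnitude bands the counts will be small, so the observed frequencies are noisy and should be read with care.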