Talivio uses per-region band ML models trained via an expanding-window backtest (2000–present). For each seismic region and magnitude band, a 4-algorithm competition (LightGBM, RandomForest, ExtraTrees, CalibratedLR) selects the best model using walk-forward cross-validation with hard same-region temporal negatives — preventing geographic classification leakage. Regional ROC-AUC values range from 0.62 to 0.99 across 10 global regions. CSEP-compatible forecast export for independent validation. Weekly automatic retraining.
The regional band model architecture trains a separate ML model for each (region, magnitude band) pair using an expanding-window backtest. Hard same-region negatives (temporal: same coordinates, different time; spatial: offset within the same seismic zone) are used to prevent geographic leakage. A 4-algorithm competition (LightGBM, RandomForest, ExtraTrees, CalibratedLR) selects the winning algorithm per band via cross-validated AUC. The global prediction endpoint retains the 102-feature ensemble (Coulomb stress, ETAS, GR b-value, multi-scale SRA, H3 spatial connectivity) for backwards-compatible API access.
Each band has a different feature dimensionality, data density, and dominant physical mechanism. In sparse bands (M6+) the seismic cycle and physical parameters take precedence, while in dense bands (M4–5) statistical precursors carry more weight.
Highest event rate; serves as a precursor indicator for larger events. Micro-cluster density and foreshock patterns are key signals.
Most frequent damaging class; high training data availability. Foreshock density and quiescence anomaly are the dominant signals in this band.
Magnitude range capable of causing damage. The 24-month linear regression of the b-value trend is the leading precursor signal in this band.
Potential for serious structural damage and casualties. Seismic cycle analysis with inter-event normalisation plays the decisive role.
Least training data; hardest prediction class. Physical parameters (slip rate, last rupture date, fault length) gain greater weight.
Features are constructed as a total of 29 core features across nine categories. Coulomb stress transfer, the ETAS statistical model and SRA were added in v4.0; IERS 2010 solid earth tides and FFT spectral features were added in v5.0.
| Feature | Physical Meaning | Source |
|---|---|---|
| Fault distance | Distance to nearest active fault segment (km) | GEM Global Active Faults |
| Coupling ratio | Fault coupling ratio (0=creeping, 1=fully locked); elastic energy accumulation indicator | Fault DB + GNSS |
| Slip rate | Long-term slip rate of the fault (mm/yr); rate of energy accumulation | GEM / AFAD Fault DB |
| Cross-fault velocity | GNSS cross-fault differential velocity (mm/yr); plate motion difference across fault | GNSS microservice |
| Seismic gap | Time elapsed since last major earthquake (years); identifies mature segments | Historical catalogue |
| Last rupture | Time since last surface rupture; basis of seismic cycle calculation | Historical catalogue |
| Feature | Physical Meaning | Source |
|---|---|---|
| Cumulative ΔCFS | Cumulative Coulomb stress effect of historical M≥5.5 earthquakes (bar) — Okada (1992) dislocation model | Computation + USGS/AFAD |
| Recent ΔCFS | Largest individual Coulomb stress change in the recent period (bar) — triggering potential | Computation |
| Feature | Physical Meaning | Source |
|---|---|---|
| ETAS density | ETAS model estimated instantaneous seismicity rate (events/day/km²) — Ogata (1988) MLE | Computation |
| Background rate | Background seismicity rate μ (excluding clustering); tectonic loading indicator | Ogata (1988) |
| Triggering ratio | Triggered/total ratio; high value = active aftershock sequence | Computation |
| Feature | Physical Meaning | Source |
|---|---|---|
| Short-term rate | Regional earthquake count over the short term; short-term activity | USGS + AFAD |
| Medium-term rate | Earthquake count over the medium term; medium-term activity | USGS + AFAD |
| Long-term rate | Earthquake count over the long term; long-term background | USGS + AFAD |
| Activity ratio | Short/long-term activity ratio (SRA). Detection of uplift or quiescence | Computation |
| Quiescence Z-score | Deviation of activity from historical mean. Anomaly detection (Habermann, 1988) | Computation |
| Feature | Physical Meaning | Source |
|---|---|---|
| b-value | Gutenberg-Richter b-value (Aki 1965 MLE); low b = high stress | Seismic catalogue |
| b-value trend | Time-series slope of b-value; decreasing trend signals stress increase | Computation |
| Precursor density | Density of small nearby earthquakes (foreshock density) | USGS + AFAD |
| Moment deficit | Accumulated/expected seismic moment ratio; ≥1 indicates energy surplus | Fault DB + Catalogue |
| Feature | Physical Meaning | Source |
|---|---|---|
| GNSS deformation | Crustal deformation rate computed via spatial interpolation from GPS stations (nstrain/yr) | NGL Nevada Geodetic Lab |
| Tidal index | Combined Moon+Sun tidal stress (normalised 0–1) | Ephemeris computation |
| Feature | Physical Meaning | Source |
|---|---|---|
| Volumetric strain | Volumetric strain computation based on IERS 2010 conventions (nanostrain). Typical amplitude: 50–120 ns | Computation (IERS 2010) |
| Tidal Coulomb | Coulomb projection of tidal stress onto the fault plane (kPa). Depends on fault geometry | Computation |
| Feature | Physical Meaning | Source |
|---|---|---|
| Dominant period | Spectral dominant period of the seismicity time series; decrease = quiescence signal | Computation |
| Spectral centroid | Weighted mean of the frequency distribution | Computation |
| Spectral entropy | Shannon spectral entropy (0–1 normalised); rising disorder = breakdown of harmonic order | Computation |
| Spectral slope | Log-log spectral slope; steepening = red shift, quiescence | Computation |
| Energy ratio | Low/high frequency energy ratio; increase = precursor signal | Computation |
Fault locking, slip deficit, and stress transfer from neighbouring earthquakes. Weighted average of three sub-components.
ETAS model seismicity density estimate and short/long-term activity ratio (SRA). Operational forecasting standard.
b-value decrease, seismic quiescence, and foreshock density. Weighted combination of four sub-signals.
Moon–Sun tidal stress; triggering potential when fault is near threshold. Scientifically small (~1–3%) but statistically significant.
Fault segments with ΔCFS > +0.1 bar (10 kPa) receive an automatic uplift in the mechanical component. King et al. (1994) triggering threshold.
If Mechanical ≥ 0.6, Statistical ≥ 0.5, Precursor ≥ 0.4, and Trigger ≥ 0.6 are simultaneously met, the system raises the Critical Window flag.
The earthquake catalogue is divided into spatial cells via a hierarchical hexagonal grid. Spatial features are computed using a multi-scale neighbourhood structure. Temporal leakage is prevented by leak-free binary search (only past events are used).
spatial_neighbor_rate_k1/k2: 90-day earthquake activity rate in near and extended neighbour cells.
spatial_max_mag_k1/k2: Maximum earthquake magnitude within 90 days in neighbouring cells.
spatial_activity_trend: Last 30-day vs. 90-day activity ratio — detection of acceleration or deceleration.
spatial_cluster_density: Ratio of active cells in the centre + K1 neighbours — measure of spatial clustering.
spatial_fault_smoothed_risk: Distance-weighted fault cell seismicity using exp(-dist/decay_km).
spatial_strain_gradient: Standard deviation of the activity difference between the centre cell and its neighbours.
In v8.0 the MTL architecture was replaced by a 4-model soft-voting ensemble. Each model contributes weighted probability votes. The ensemble mitigates individual model weaknesses and produces more robust, well-calibrated predictions across all 5 bands.
Primary model: n_estimators=300, max_depth=6, num_leaves=31. Gradient boosting with L1+L2 regularisation. Highest individual discriminative power.
n_estimators=250, max_depth=8. Bagging-based diversity reduces variance. Robust to outliers and noisy features.
Extremely Randomised Trees. Random split selection adds decorrelation. Fastest training; lowest weight but adds unique diversity.
sklearn GBM with n_estimators=200, max_depth=5. Classical sequential boosting provides a different bias-variance tradeoff vs LightGBM.
CalibratedClassifierCV (method='isotonic') converts ensemble output to a reliable probability. See current ECE in backtest metrics above.
Daily CSEP-compatible grid forecasts (ASCII/CSV/XML) are generated for independent validation via cseptesting.org. GR-Poisson + ML risk overlay.
The multi-layer stacking ensemble used prior to v7.0. Retained here for reference.
| Algorithm | Version | Strengths | Band Advantage |
|---|---|---|---|
| XGB XGBoost | 2.x | Histogram-based splitting, L1/L2 regularisation, GPU support, missing value handling | M4–5, M5–6 (dense data) |
| LightGBM LightGBM | 4.x | DART regularisation, leaf-wise growth, scale_pos_weight, fastest training | M4–5 (speed + accuracy) |
| CB CatBoost | 1.x | Ordered boosting (target leakage prevention), native categorical features, low overfitting | M5–6, M6–7 |
| ET ExtraTrees | sklearn 1.x | Extra random splits; low variance, fast training | M6–7 (sparse data) |
| RF Random Forest | sklearn 1.x | Bagging, interpretability, Gini importance ranking, SHAP values | All bands (baseline) |
The platform draws on 8 different data sources. The historical catalogue is merged from multiple sources; after magnitude homogenisation (ML→Mw, Ulusay et al. 2004), aftershocks are separated using Gardner-Knopoff (1974) declustering. Record priority: AFAD > USGS > ISC > ISC-GEM > Ambraseys.
According to elastic rebound theory, a fault segment accumulates elastic energy proportional to the regional slip rate since the last major earthquake. McCann et al. (1979) classify mature and recently ruptured segments using the seismic gap criterion.
McCann et al. (1979): fault segments from which a long time has passed since the last major earthquake are flagged as "probable gaps". Quiet segment = accumulated stress.
Low b (b < 0.8) signals high crustal stress and an approaching large event (Wiemer & Wyss, 2002; Scholz, 1968). b = 1.0 is the long-term regional average.
Characteristic earthquake period ~250 years; westward-migrating rupture sequence (1939–1999). The Marmara segment has not had a major rupture for >300 years.
Mean recurrence interval ~300 years; the 2023 Kahramanmaraş earthquakes (Mw 7.8 + Mw 7.6) were simultaneous ruptures of multiple segments of this fault.
The PyEphem library uses VSOP87 analytical planetary theory to compute the positions of the Moon and Sun with nano-radian precision. Tidal stress is projected onto fault planes using Boussinesq load theory.
During New Moon and Full Moon periods (±5-day window) the Moon-Sun-Earth alignment maximises tidal stress. These periods are flagged as trigger pressure windows.
Near perigee the Moon's gravitational effect increases by 14% (1/r³ dependence). Perigee + Full Moon coincidence (Supermoon) produces the strongest tidal pressure.
The 18.6-year lunar orbital inclination oscillation (Saros cycle) creates a small (~1%) modulation in tidal stress; its effect is observed in long-term statistics.
Normalised: (F_Moon/r³_Moon + F_Sun/r³_Sun) × syzygy_factor. Syzygy factor: 1 + 0.2 × cos(2π × lunar_phase). Range: ~27 MPa (minimum) – ~65 MPa (maximum).
When a large earthquake occurs, stress redistributes in the surrounding crust. The Coulomb Failure Stress (CFS) change quantifies how close neighbouring faults are brought to failure. This mechanism has successfully explained many earthquake sequences including the 1994 Northridge and 1999 İzmit events.
Deformation and stress field produced by slip on a rectangular fault plane in a semi-infinite, homogeneous, isotropic elastic medium are computed analytically. Inputs: source fault geometry, elastic parameters (μ=3.2×10¹⁰ Pa, ν=0.25).
Slip amount from magnitude: log₁₀(AD) = -4.80 + 0.69·M (metres). Rupture length: log₁₀(RLD) = -2.44 + 0.59·M (km). Rupture width: log₁₀(RW) = -1.01 + 0.32·M (km).
ΔCFS > +0.1 bar (10 kPa) triggering potential (Stein, 1999). Cumulative ΔCFS effect of all M≥5.5 earthquakes in the past 50 years is computed and projected onto target fault segments.
1999 İzmit (Mw 7.6) → Düzce segment ΔCFS: +3.2 bar → Düzce earthquake 87 days later. 1992 Landers → Big Bear triggering. 2023 Maraş multi-segment rupture.
Epidemic-Type Aftershock Sequence (ETAS) is a point process that models seismic activity as the sum of the background seismicity rate and the triggering potential of each event. Used by USGS, Italy INGV, and Japan JMA for operational earthquake forecasting.
The official U.S. Operational Aftershock Forecasting system is ETAS-based. Aftershock probability forecasts are published within 1 hour of each major event.
5 free parameters {K, α, c, p, μ} are estimated via Maximum Likelihood Estimation. K: triggering capacity, α: magnitude scaling, c/p: Omori decay rate.
SRA = R_7day / (R_1yr/52). SRA>2: anomalous increase. SRA<0.3: seismic quiescence — Habermann (1988) potential precursor anomaly.
With M≥1.5 completeness, AFAD-driven ETAS parameterization uses 10× more events. Background rate estimation (μ) and b-value computation improve significantly.
Solid Earth tidal stress from the Moon and Sun is computed physically using Love numbers (h₂=0.6078, l₂=0.0847, k₂=0.2980) based on IERS 2010 conventions. Provides more accurate volumetric strain and Coulomb projection than the legacy 1/r³ approximation.
h₂=0.6078, l₂=0.0847, k₂=0.2980. Defines the elastic response of the crust to tidal forces. Degree-2 is the dominant component (Petit & Luzum, 2010).
Typical amplitude: 50–120 nanostrain. Maximum at New/Full Moon. Moon contribution 68%, Sun contribution 32%.
Tidal stress is projected onto Coulomb stress based on fault geometry (strike, dip). Standard crustal shear modulus used (Agnew, 2015).
Ide et al. (2016) Nature Geoscience: large earthquakes occur statistically more frequently during high tidal stress periods. Particularly on shallow thrust faults.
Extreme Value Theory (EVT)-motivated asymmetric loss functions heavily penalize missing rare but catastrophic events like M7+ (False Negatives). Used as custom objectives in XGBoost and LightGBM.
Baseline weight. Frequent earthquakes; false alarms and misses balanced.
Miss penalty increases. Earthquakes capable of causing damage.
FN far more costly; FP more acceptable. Serious damage potential.
False alarms are acceptable; missing an M7.5 is not (Coles, 2001).
The Molchan diagram is the internationally accepted method for scientific evaluation of earthquake forecast models. It measures model skill via the tradeoff between alarm rate and miss rate.
Excellent. Outstanding forecast skill. Far superior to random model performance.
Very Good. Statistically strong signal.
Good. Sufficient skill for operational use.
Each magnitude band evaluated with separate Molchan analysis. p < 0.05 means model is significantly different from random (Molchan, 1991).
Model performance is measured with a walk-forward expanding window backtest from 1990 to present. A 30-day purge gap prevents data leakage. 15 folds (2-year test windows) are used. The system reports per-region, per-band metrics (M3–4/M4–5/M5–6/M6–7/M7+) validated via expanding-window backtest.
We compare two backtest methodologies: the Global Model (v1, trivial negatives from tectonically quiet zones) and the Regional Model (v2, hard negatives from the same seismic region). The regional approach forces the model to learn temporal prediction rather than geographic classification.
Expanding window 1996-2026. Negatives from stable zones (Central Asia, Sahara, etc.).
| Band | AUC | F1 | Assessment |
|---|---|---|---|
| M4-5 | 0.750 | 0.519 | Realistic |
| M5-6 | 0.948 | 0.716 | Inflated |
| M6-7 | 0.992 | 0.889 | Inflated |
| M7+ | 0.494 | 0.333 | Insufficient data |
10 regions, 100km radius, expanding window 2000–2026. Hard negatives from same seismic region. All active regions use per-region grid-search tuned params. Auto-updated from model registry.
| Region | M4-5 AUC | Hit% | M4-5 Yrs | M5-6 AUC | M5-6 Yrs | M6-7 |
|---|---|---|---|---|---|---|
| Loading... | ||||||
Loading regional summary...
The v1 model's near-perfect M5-6 and M6-7 scores are caused by trivial geographic separation between positives (active fault zones) and negatives (stable continental interiors). The model learns "is this location on a fault?" rather than "will this fault rupture soon?".
| Feature | Importance | Type |
|---|---|---|
| cross_fault_velocity_mm_yr | 0.186 | Geographic |
| strike_sin | 0.155 | Geographic |
| b_value | 0.106 | Seismological |
| strike_cos | 0.093 | Geographic |
| distance_to_fault_norm | 0.086 | Geographic |
4 of the top 5 features are geographic/fault geometry features — confirming the model classifies location, not timing. The v2 regional model eliminates this bias by using negatives from the same fault system, forcing temporal discrimination.
| Year | Train | Test (pos/neg) | AUC | Acc | Hit Rate |
|---|---|---|---|---|---|
| 2008 | <2008 | 2 / 8 | 0.188 | 50.0% | 0.0% |
| 2010 | <2010 | 3 / 12 | 0.417 | 66.7% | 0.0% |
| 2011 | <2011 | 2 / 8 | 0.563 | 80.0% | 0.0% |
| 2013 | <2013 | 3 / 12 | 0.750 | 73.3% | 0.0% |
| 2016 | <2016 | 3 / 12 | 0.639 | 80.0% | 66.7% |
| 2018 | <2018 | 2 / 6 | 0.417 | 75.0% | 50.0% |
| 2019 | <2019 | 5 / 20 | 0.930 | 88.0% | 80.0% |
| 2020 | <2020 | 3 / 12 | 0.361 | 66.7% | 33.3% |
| 2025 | <2025 | 8 / 32 | 0.820 | 77.5% | 75.0% |
Overall: AUC=0.619 (11 testable years, expanding-window, hard same-region negatives). 2019 shows AUC=0.930 (pre-Marmara seismicity buildup); 2025 AUC=0.820 (recent activity spike). Istanbul has sparse M4-5 events (median 3/yr), making per-year estimates high-variance.
| Year | Pos/Neg | AUC | Hit% | Year | Pos/Neg | AUC | Hit% |
|---|---|---|---|---|---|---|---|
| 2000 | 3/12 | 0.444 | 0% | 2015 | 3/12 | 0.667 | 33% |
| 2002 | 2/8 | 0.063 | 0% | 2017 | 3/12 | 0.417 | 0% |
| 2003 | 2/8 | 0.438 | 0% | 2018 | 4/16 | 0.531 | 50% |
| 2007 | 4/16 | 0.719 | 75% | 2019 | 2/8 | 0.375 | 0% |
| 2011 | 2/8 | 0.125 | 0% | 2020 | 2/8 | 0.188 | 0% |
| 2012 | 9/36 | 0.710 | 44% | 2021 | 2/8 | 0.625 | 50% |
| 2013 | 6/24 | 0.868 | 33% | 2022 | 5/20 | 0.520 | 20% |
| 2014 | 2/8 | 0.500 | 50% | 2023 | 60/50 | 0.716 | 78% |
| 2024 | 17/50 | 0.734 | 71% | ||||
Maraş 2023–2024 (post-Kahramanmaraş M7.8 sequence): AUC=0.716/0.734 — the model correctly identified elevated risk in the aftershock-rich period. 2013 shows AUC=0.868 (Doğanyol aftershock sequence). Strategy: neg_ratio=2, temporal_ratio=0.6 (grid-search tuned). Overall AUC=0.796 over 19 test years.
How does Talivio's regional model compare to published prospective earthquake forecast benchmarks tested through the Collaboratory for the Study of Earthquake Predictability (CSEP)? Below we compare against the best RELM (Regional Earthquake Likelihood Models) results from the California testing center and ETH Zurich's Switzerland testing center.
| Model / Study | Region | Best AUC | Area Skill Score | Negative Strategy | Validation |
|---|---|---|---|---|---|
| FCN Deep Learning (GJI 2024) | California | — | 0.882 | Same-region | Pseudo-prospective |
| ETAS Italy / OEF-Italy (Taroni 2023) | Italy | — | 0.70 | Same-region | Prospective 10yr |
| STEP (Gerstenberger et al. 2005) | California | ~0.65 | — | Random cells | Prospective (RELM) |
| ETAS (Helmstetter et al. 2007) | California | ~0.68 | — | Seismicity rate | Prospective (CSEP) |
| EEPAS (Rhoades & Evison 2004) | New Zealand | ~0.70 | — | Precursory swarms | Prospective (CSEP) |
| DeVries et al. 2018 (Google Brain) | Japan | 0.849 | — (inflated) | Geographically distinct | Retrospective |
| Talivio (this work) | 10 regions | — | — | Hard same-region temporal + spatial |
Expanding-window no future leakage |
ASS = Area Skill Score (Molchan diagram diagonal area). FCN 0.882 and ETAS Italy 0.70 from published papers. Talivio ASS computed live from prospective forecasts — grows as more forecasts are verified.