seo.methodology.title

1. System Overview

The regional band model architecture trains a separate ML model for each (region, magnitude band) pair using an expanding-window backtest. Hard same-region negatives (temporal: same coordinates, different time; spatial: offset within the same seismic zone) are used to prevent geographic leakage. A 4-algorithm competition (LightGBM, RandomForest, ExtraTrees, CalibratedLR) selects the winning algorithm per band via cross-validated AUC. The global prediction endpoint retains the 102-feature ensemble (Coulomb stress, ETAS, GR b-value, multi-scale SRA, H3 spatial connectivity) for backwards-compatible API access.

5

Magnitude Bands
(M3–4, M4–5, M5–6, M6–7, M7+)

102

Features (94 core + 8 spatial:
SRA, Moment, ETAS, H3…)

4-Ensemble

LightGBM + RF + ET + GB
(Soft Voting Classifier)

8

Data Sources
(USGS, AFAD, NGL, GEM, ISC…)

2. Magnitude Bands

Each band has a different feature dimensionality, data density, and dominant physical mechanism. In sparse bands (M6+) the seismic cycle and physical parameters take precedence, while in dense bands (M4–5) statistical precursors carry more weight.

M 3.0 – 4.0

Micro Earthquakes

Highest event rate; serves as a precursor indicator for larger events. Micro-cluster density and foreshock patterns are key signals.

115 featuresLightGBM ensembleClass: very high positive rate

M 4.0 – 5.0

Small–Moderate Earthquakes

Most frequent damaging class; high training data availability. Foreshock density and quiescence anomaly are the dominant signals in this band.

112 featuresLightGBM-weightedClass: high positive rate

M 5.0 – 6.0

Moderate Earthquakes

Magnitude range capable of causing damage. The 24-month linear regression of the b-value trend is the leading precursor signal in this band.

112 featuresb-value trend weightedModerate frequency

M 6.0 – 7.0

Large Earthquakes

Potential for serious structural damage and casualties. Seismic cycle analysis with inter-event normalisation plays the decisive role.

112 featuresSeismic cycle weightedRare class

M 7.0 +

Destructive Earthquakes

Least training data; hardest prediction class. Physical parameters (slip rate, last rupture date, fault length) gain greater weight.

13 featuresPhysical parameters weightedVery rare

3. Feature Engineering

Features are constructed as a total of 29 core features across nine categories. Coulomb stress transfer, the ETAS statistical model and SRA were added in v4.0; IERS 2010 solid earth tides and FFT spectral features were added in v5.0.

3.1 Fault Mechanics and Geometry (6 features)

Feature	Physical Meaning	Source
Fault distance	Distance to nearest active fault segment (km)	GEM Global Active Faults
Coupling ratio	Fault coupling ratio (0=creeping, 1=fully locked); elastic energy accumulation indicator	Fault DB + GNSS
Slip rate	Long-term slip rate of the fault (mm/yr); rate of energy accumulation	GEM / AFAD Fault DB
Cross-fault velocity	GNSS cross-fault differential velocity (mm/yr); plate motion difference across fault	GNSS microservice
Seismic gap	Time elapsed since last major earthquake (years); identifies mature segments	Historical catalogue
Last rupture	Time since last surface rupture; basis of seismic cycle calculation	Historical catalogue

3.2 Coulomb Stress Transfer (2 features — new in v4)

Feature	Physical Meaning	Source
Cumulative ΔCFS	Cumulative Coulomb stress effect of historical M≥5.5 earthquakes (bar) — Okada (1992) dislocation model	Computation + USGS/AFAD
Recent ΔCFS	Largest individual Coulomb stress change in the recent period (bar) — triggering potential	Computation

Coulomb Failure Stress: ΔCFS = Δτ + μ'·Δσ_n — shear stress change + effective friction × normal stress change. Positive ΔCFS (>0.1 bar) brings the fault closer to failure (King et al., 1994; Stein, 1999). Calculated using the Okada (1992) semi-infinite elastic medium solution.

3.3 ETAS Statistical Model (3 features — new in v4)

Feature	Physical Meaning	Source
ETAS density	ETAS model estimated instantaneous seismicity rate (events/day/km²) — Ogata (1988) MLE	Computation
Background rate	Background seismicity rate μ (excluding clustering); tectonic loading indicator	Ogata (1988)
Triggering ratio	Triggered/total ratio; high value = active aftershock sequence	Computation

ETAS Model: λ(t,x,y) = μ(x,y) + Σ K·e^α(M-Mc) / (t-tᵢ+c)^p · f(x-xᵢ,y-yᵢ) — the statistical model used by USGS, Italy INGV and Japan JMA for operational earthquake forecasting (Ogata, 1988).

3.4 Seismicity Rate Analysis — SRA (5 features — new in v4)

Feature	Physical Meaning	Source
Short-term rate	Regional earthquake count over the short term; short-term activity	USGS + AFAD
Medium-term rate	Earthquake count over the medium term; medium-term activity	USGS + AFAD
Long-term rate	Earthquake count over the long term; long-term background	USGS + AFAD
Activity ratio	Short/long-term activity ratio (SRA). Detection of uplift or quiescence	Computation
Quiescence Z-score	Deviation of activity from historical mean. Anomaly detection (Habermann, 1988)	Computation

3.5 Gutenberg-Richter and Precursor Signals (4 features)

Feature	Physical Meaning	Source
b-value	Gutenberg-Richter b-value (Aki 1965 MLE); low b = high stress	Seismic catalogue
b-value trend	Time-series slope of b-value; decreasing trend signals stress increase	Computation
Precursor density	Density of small nearby earthquakes (foreshock density)	USGS + AFAD
Moment deficit	Accumulated/expected seismic moment ratio; ≥1 indicates energy surplus	Fault DB + Catalogue

3.6 Geophysical and Celestial (2 features)

Feature	Physical Meaning	Source
GNSS deformation	Crustal deformation rate computed via spatial interpolation from GPS stations (nstrain/yr)	NGL Nevada Geodetic Lab
Tidal index	Combined Moon+Sun tidal stress (normalised 0–1)	Ephemeris computation

3.7 Solid Earth Tides (2 features — new in v5)

Feature	Physical Meaning	Source
Volumetric strain	Volumetric strain computation based on IERS 2010 conventions (nanostrain). Typical amplitude: 50–120 ns	Computation (IERS 2010)
Tidal Coulomb	Coulomb projection of tidal stress onto the fault plane (kPa). Depends on fault geometry	Computation

Solid Earth Tides (IERS 2010): ε_vol = (1 + h₂ - 3·l₂) / (R·g) · W₂ — volumetric strain is computed from the degree-2 tidal potential. Provides a physically more accurate computation than the existing tidal_stress_index (1/r³ approximation). Ide et al. (2016) showed that large earthquakes occur statistically more frequently during periods of elevated tidal stress.

3.8 FFT Spectral Features (5 features — new in v5)

Feature	Physical Meaning	Source
Dominant period	Spectral dominant period of the seismicity time series; decrease = quiescence signal	Computation
Spectral centroid	Weighted mean of the frequency distribution	Computation
Spectral entropy	Shannon spectral entropy (0–1 normalised); rising disorder = breakdown of harmonic order	Computation
Spectral slope	Log-log spectral slope; steepening = red shift, quiescence	Computation
Energy ratio	Low/high frequency energy ratio; increase = precursor signal	Computation

Spectral Analysis: Characteristic changes are observed in the frequency spectrum of seismic activity before large earthquakes: the dominant frequency decreases, spectral energy shifts to lower frequencies, and harmonic order breaks down (Telesca et al., 2001; Kagan & Jackson, 1991). Spectral leakage is minimised using a Hanning window + rfft.

3.9 Four-Component Risk Model (v5)

Risk = w₁ × Mechanical + w₂ × Statistical + w₃ × Precursor + w₄ × Trigger

Mechanical (highest weight) — Gunpowder+Domino

Fault locking, slip deficit, and stress transfer from neighbouring earthquakes. Weighted average of three sub-components.

Statistical — ETAS

ETAS model seismicity density estimate and short/long-term activity ratio (SRA). Operational forecasting standard.

Precursor — Smoke

b-value decrease, seismic quiescence, and foreshock density. Weighted combination of four sub-signals.

Trigger (lowest weight) — Spark

Moon–Sun tidal stress; triggering potential when fault is near threshold. Scientifically small (~1–3%) but statistically significant.

Coulomb Stress Threshold

Fault segments with ΔCFS > +0.1 bar (10 kPa) receive an automatic uplift in the mechanical component. King et al. (1994) triggering threshold.

Critical Window Condition

If Mechanical ≥ 0.6, Statistical ≥ 0.5, Precursor ≥ 0.4, and Trigger ≥ 0.6 are simultaneously met, the system raises the Critical Window flag.

3.10 Spatial Connectivity Layer (new in v6)

The earthquake catalogue is divided into spatial cells via a hierarchical hexagonal grid. Spatial features are computed using a multi-scale neighbourhood structure. Temporal leakage is prevented by leak-free binary search (only past events are used).

K1/K2 Neighbour Rate

spatial_neighbor_rate_k1/k2: 90-day earthquake activity rate in near and extended neighbour cells.

K1/K2 Max. Magnitude

spatial_max_mag_k1/k2: Maximum earthquake magnitude within 90 days in neighbouring cells.

Activity Trend

spatial_activity_trend: Last 30-day vs. 90-day activity ratio — detection of acceleration or deceleration.

Cluster Density

spatial_cluster_density: Ratio of active cells in the centre + K1 neighbours — measure of spatial clustering.

Fault Smoothing

spatial_fault_smoothed_risk: Distance-weighted fault cell seismicity using exp(-dist/decay_km).

Strain Gradient

spatial_strain_gradient: Standard deviation of the activity difference between the centre cell and its neighbours.

SpatialIndex: The earthquake catalogue is sorted chronologically into cells. Each query runs in O(log n) time via binary search. Comprehensive active-cell and historical-event database (1990–present).

4. 4-Model Ensemble Architecture (v8)

In v8.0 the MTL architecture was replaced by a 4-model soft-voting ensemble. Each model contributes weighted probability votes. The ensemble mitigates individual model weaknesses and produces more robust, well-calibrated predictions across all 5 bands.

Ensemble Architecture (Soft Voting):
[102 Features] → LightGBM (weight=3) ──┐
[102 Features] → RandomForest (weight=2) ──┤→ Weighted Average → P(event)
[102 Features] → ExtraTrees (weight=1) ──┤
[102 Features] → GradientBoosting (weight=2) ──┘
Calibration: Nested Isotonic Regression (CalibratedClassifierCV, cv=3)

LightGBM (w=3)

Primary model: n_estimators=300, max_depth=6, num_leaves=31. Gradient boosting with L1+L2 regularisation. Highest individual discriminative power.

RandomForest (w=2)

n_estimators=250, max_depth=8. Bagging-based diversity reduces variance. Robust to outliers and noisy features.

ExtraTrees (w=1)

Extremely Randomised Trees. Random split selection adds decorrelation. Fastest training; lowest weight but adds unique diversity.

GradientBoosting (w=2)

sklearn GBM with n_estimators=200, max_depth=5. Classical sequential boosting provides a different bias-variance tradeoff vs LightGBM.

Nested Isotonic Calibration

CalibratedClassifierCV (method='isotonic') converts ensemble output to a reliable probability. See current ECE in backtest metrics above.

CSEP Forecast Export

Daily CSEP-compatible grid forecasts (ASCII/CSV/XML) are generated for independent validation via cseptesting.org. GR-Poisson + ML risk overlay.

Daily Prospective Forecasts + Weekly Retraining: APScheduler generates prospective forecasts every day at 03:00 UTC. A backtest runs every Monday; if a >5% drop in ROC-AUC is detected, the full training loop is launched. CSEP forecasts are regenerated daily.

4b. Stacking Ensemble Architecture (v4, no longer used)

The multi-layer stacking ensemble used prior to v7.0. Retained here for reference.

Legacy Ensemble Architecture:
Layer 1 (Base Learners): XGBoost + LightGBM + CatBoost + ExtraTrees + Random Forest
Layer 2 (Meta-Learner): Logistic Regression (L2) — combines Layer 1 outputs
Layer 3 (Calibration): Isotonic Regression — reliable probability output (ECE < 0.05)

Algorithm	Version	Strengths	Band Advantage
XGB XGBoost	2.x	Histogram-based splitting, L1/L2 regularisation, GPU support, missing value handling	M4–5, M5–6 (dense data)
LightGBM LightGBM	4.x	DART regularisation, leaf-wise growth, scale_pos_weight, fastest training	M4–5 (speed + accuracy)
CB CatBoost	1.x	Ordered boosting (target leakage prevention), native categorical features, low overfitting	M5–6, M6–7
ET ExtraTrees	sklearn 1.x	Extra random splits; low variance, fast training	M6–7 (sparse data)
RF Random Forest	sklearn 1.x	Bagging, interpretability, Gini importance ranking, SHAP values	All bands (baseline)

Weekly Automatic Retraining: APScheduler triggers a backtest every Monday. If a >5% drop in ROC-AUC is detected for a band, the full training loop for that band is launched. ETAS parameters are updated, the Coulomb stress field is recomputed, and 5 algorithms × 4 bands = 20 base models + 4 stacking ensembles are trained.

5. Data Sources and Historical Catalogue

The platform draws on 8 different data sources. The historical catalogue is merged from multiple sources; after magnitude homogenisation (ML→Mw, Ulusay et al. 2004), aftershocks are separated using Gardner-Knopoff (1974) declustering. Record priority: AFAD > USGS > ISC > ISC-GEM > Ambraseys.

1035 – 1900

Ambraseys (2002): Historical Turkey and Middle East seismicity. Magnitudes converted from intensity for M≥6.5. Based on geodetic and documentary evidence.

1900 – present

ISC Bulletin: International Seismological Centre global catalogue. Comprehensive record from the start of the instrumental era.

1900 – 2009

ISC-GEM Global Instrumental Earthquake Catalogue (2013): Recomputed moment magnitudes (Mw); systematic magnitude homogenisation; M≥5.5.

1900 – present

AFAD Earthquake Database: National authority record for Turkey and surroundings. Events overlapping with USGS are deduplicated in favour of AFAD.

1970 – present

USGS Earthquake Hazards Program: Global coverage; moment tensor solutions; automatic update every 6 hours. Primary source for the current period.

current

NGL Nevada Geodetic Laboratory: 500+ GPS stations, MIDAS solution; mm/yr velocity vectors (Blewitt et al., 2018). Strain rate and cross-fault velocity computed via KD-Tree.

current

GEM Global Active Faults: 11,000+ active fault segments; strike/dip/rake, slip rates, last rupture date. Loaded automatically during region onboarding.

AFAD + USGS Combined Catalogue (v4): AFAD records M≥1.5 events for Turkey (USGS: M≥4.0). Event matching uses ±30 s and ±50 km windows. AFAD is preferred for location (local network is more accurate), USGS for magnitude (Mw standard). Magnitude homogenisation: ML→Mw (Ulusay et al., 2004 Turkey regression).

Gardner-Knopoff (1974) + ETAS Declustering: Two-stage declustering is applied. GK deterministic declustering separates aftershocks using mechanical windows. ETAS stochastic declustering assigns each event a "probability of being background" — more robust especially for M6+ bands.

6. Seismic Cycle Analysis

According to elastic rebound theory, a fault segment accumulates elastic energy proportional to the regional slip rate since the last major earthquake. McCann et al. (1979) classify mature and recently ruptured segments using the seismic gap criterion.

Gutenberg-Richter: log N = a − b·M
b-value (Aki 1965 MLE): b = log₁₀(e) / (M̄ − Mmin) ≈ 1/(M̄ − Mmin) × 0.4343
Stress loading rate: dσ/dt = (v_plate × μ × coupling) / A_fault
Recurrence interval: T_r = D_max / v_slip (D_max: maximum slip amount)

Seismic Gap Theory

McCann et al. (1979): fault segments from which a long time has passed since the last major earthquake are flagged as "probable gaps". Quiet segment = accumulated stress.

b-Value Change

Low b (b < 0.8) signals high crustal stress and an approaching large event (Wiemer & Wyss, 2002; Scholz, 1968). b = 1.0 is the long-term regional average.

North Anatolian Fault

Characteristic earthquake period ~250 years; westward-migrating rupture sequence (1939–1999). The Marmara segment has not had a major rupture for >300 years.

East Anatolian Fault

Mean recurrence interval ~300 years; the 2023 Kahramanmaraş earthquakes (Mw 7.8 + Mw 7.6) were simultaneous ruptures of multiple segments of this fault.

7. Celestial Mechanics

The PyEphem library uses VSOP87 analytical planetary theory to compute the positions of the Moon and Sun with nano-radian precision. Tidal stress is projected onto fault planes using Boussinesq load theory.

Synodic Cycle (29.5 days)

During New Moon and Full Moon periods (±5-day window) the Moon-Sun-Earth alignment maximises tidal stress. These periods are flagged as trigger pressure windows.

Anomalistic Month (27.55 days)

Near perigee the Moon's gravitational effect increases by 14% (1/r³ dependence). Perigee + Full Moon coincidence (Supermoon) produces the strongest tidal pressure.

Nodal Cycle (18.6 years)

The 18.6-year lunar orbital inclination oscillation (Saros cycle) creates a small (~1%) modulation in tidal stress; its effect is observed in long-term statistics.

Composite Index (0–1)

Normalised: (F_Moon/r³_Moon + F_Sun/r³_Sun) × syzygy_factor. Syzygy factor: 1 + 0.2 × cos(2π × lunar_phase). Range: ~27 MPa (minimum) – ~65 MPa (maximum).

8. Coulomb Stress Transfer (v4)

When a large earthquake occurs, stress redistributes in the surrounding crust. The Coulomb Failure Stress (CFS) change quantifies how close neighbouring faults are brought to failure. This mechanism has successfully explained many earthquake sequences including the 1994 Northridge and 1999 İzmit events.

ΔCFS = Δτ + μ'·Δσ_n

Δτ : Shear stress change on the receiver fault plane
Δσ_n : Normal stress change (compression negative)
μ' : Effective friction coefficient (0.4, including pore pressure)

Okada (1992) Model

Deformation and stress field produced by slip on a rectangular fault plane in a semi-infinite, homogeneous, isotropic elastic medium are computed analytically. Inputs: source fault geometry, elastic parameters (μ=3.2×10¹⁰ Pa, ν=0.25).

Wells & Coppersmith (1994)

Slip amount from magnitude: log₁₀(AD) = -4.80 + 0.69·M (metres). Rupture length: log₁₀(RLD) = -2.44 + 0.59·M (km). Rupture width: log₁₀(RW) = -1.01 + 0.32·M (km).

Triggering Threshold

ΔCFS > +0.1 bar (10 kPa) triggering potential (Stein, 1999). Cumulative ΔCFS effect of all M≥5.5 earthquakes in the past 50 years is computed and projected onto target fault segments.

Validation Examples

1999 İzmit (Mw 7.6) → Düzce segment ΔCFS: +3.2 bar → Düzce earthquake 87 days later. 1992 Landers → Big Bear triggering. 2023 Maraş multi-segment rupture.

9. ETAS Statistical Model (v4)

Epidemic-Type Aftershock Sequence (ETAS) is a point process that models seismic activity as the sum of the background seismicity rate and the triggering potential of each event. Used by USGS, Italy INGV, and Japan JMA for operational earthquake forecasting.

λ(t,x,y) = μ(x,y) + Σ_i:tᵢ<t K·e^α(Mᵢ-Mc) / (t-tᵢ+c)^p · f(x-xᵢ,y-yᵢ)

μ(x,y) : Space-dependent background seismicity rate (tectonic loading)
K, α, c, p : Parameters estimated via Ogata (1988) MLE
Mc : Completeness magnitude (Turkey network: 2.5)

USGS OAF

The official U.S. Operational Aftershock Forecasting system is ETAS-based. Aftershock probability forecasts are published within 1 hour of each major event.

Parameter Estimation

5 free parameters {K, α, c, p, μ} are estimated via Maximum Likelihood Estimation. K: triggering capacity, α: magnitude scaling, c/p: Omori decay rate.

SRA (Seismicity Rate Analysis)

SRA = R_7day / (R_1yr/52). SRA>2: anomalous increase. SRA<0.3: seismic quiescence — Habermann (1988) potential precursor anomaly.

AFAD Data Advantage

With M≥1.5 completeness, AFAD-driven ETAS parameterization uses 10× more events. Background rate estimation (μ) and b-value computation improve significantly.

10. Solid Earth Tides (v5)

Solid Earth tidal stress from the Moon and Sun is computed physically using Love numbers (h₂=0.6078, l₂=0.0847, k₂=0.2980) based on IERS 2010 conventions. Provides more accurate volumetric strain and Coulomb projection than the legacy 1/r³ approximation.

W₂ = (GM/r) · (R/r)² · P₂(cos z) — Degree-2 tidal potential

ε_vol = (1 + h₂ - 3·l₂) / (R·g) · W₂ — Volumetric strain (nanostrain)

ΔCFS_tidal = (3/2) · μ · ε_tidal · sin(2θ_eff) — Coulomb projection onto fault plane

IERS 2010 Love Numbers

h₂=0.6078, l₂=0.0847, k₂=0.2980. Defines the elastic response of the crust to tidal forces. Degree-2 is the dominant component (Petit & Luzum, 2010).

Volumetric Strain

Typical amplitude: 50–120 nanostrain. Maximum at New/Full Moon. Moon contribution 68%, Sun contribution 32%.

Coulomb Projection

Tidal stress is projected onto Coulomb stress based on fault geometry (strike, dip). Standard crustal shear modulus used (Agnew, 2015).

Triggering Evidence

Ide et al. (2016) Nature Geoscience: large earthquakes occur statistically more frequently during high tidal stress periods. Particularly on shallow thrust faults.

11. EVT Custom Loss — Asymmetric Loss Functions (v5)

Extreme Value Theory (EVT)-motivated asymmetric loss functions heavily penalize missing rare but catastrophic events like M7+ (False Negatives). Used as custom objectives in XGBoost and LightGBM.

Asymmetric Log-Loss: L = -w · [y·log(p) + (1-y)·log(1-p)]
w = FN_penalty (y=1) or FP_penalty (y=0), per band

Focal Loss: FL(p) = -α · (1-p_t)^γ · log(p_t) — optimized γ and α
Reduces contribution of easy examples to focus on rare/hard cases (Lin et al., 2017)

EVT Tail Loss: L = base × (1 + k·(1-p)^tail_exp) — exponential penalty on tail FNs

M4-5: Balanced

Baseline weight. Frequent earthquakes; false alarms and misses balanced.

M5-6: FN-weighted

Miss penalty increases. Earthquakes capable of causing damage.

M6-7: Strong FN weight

FN far more costly; FP more acceptable. Serious damage potential.

M7+: Maximum FN penalty

False alarms are acceptable; missing an M7.5 is not (Coles, 2001).

12. Molchan Diagram — Scientific Evaluation (v5)

The Molchan diagram is the internationally accepted method for scientific evaluation of earthquake forecast models. It measures model skill via the tradeoff between alarm rate and miss rate.

τ (alarm rate) = alarmed volume / total volume
ν (miss rate) = missed earthquakes / total earthquakes

Diagonal (τ, 1-τ): random forecast — no skill
A_skill = 0.5 - ∫ν dτ — area below diagonal (0=random, 0.5=perfect)

Statistical significance: σ² = 1/(12·N), z = A_skill/σ (Zechar & Jordan, 2008)

A_skill ≥ 0.35

Excellent. Outstanding forecast skill. Far superior to random model performance.

A_skill ≥ 0.25

Very Good. Statistically strong signal.

A_skill ≥ 0.15

Good. Sufficient skill for operational use.

Per-Band Analysis

Each magnitude band evaluated with separate Molchan analysis. p < 0.05 means model is significantly different from random (Molchan, 1991).

13. Backtest & Evaluation

Model performance is measured with a walk-forward expanding window backtest from 1990 to present. A 30-day purge gap prevents data leakage. 15 folds (2-year test windows) are used. The system reports per-region, per-band metrics (M3–4/M4–5/M5–6/M6–7/M7+) validated via expanding-window backtest.

ROC-AUC: Area under the Receiver Operating Characteristic curve. 0.5 = random forecast, 1.0 = perfect classifier.

Brier Score: Mean squared probability error. 0 = perfect, 0.25 = random.

ECE (Expected Calibration Error): Calibration accuracy. 0 = perfectly calibrated.

[email protected]: Fraction of true earthquakes when model predicts ≥0.7.

—

Avg. ROC-AUC (active bands)

—

Avg. Brier Score

—

Avg. F1 Score

—

Active Bands

Walk-Forward Expanding Window: Unlike standard k-fold, preserves temporal separation; future data never leaks into past training. A 30-day purge gap cleans the transition zone. Each fold uses all previous time periods as training set. Spatial features are subject to the same temporal constraints (leak-free binary search).

AUC Skill Score: —. Significantly outperforms the null model (AUC > 0.5, p < 0.05).

13b. Regional Backtest — Hard Negative Validation

We compare two backtest methodologies: the Global Model (v1, trivial negatives from tectonically quiet zones) and the Regional Model (v2, hard negatives from the same seismic region). The regional approach forces the model to learn temporal prediction rather than geographic classification.

Global Model (v1) Legacy

Expanding window 1996-2026. Negatives from stable zones (Central Asia, Sahara, etc.).

Band	AUC	F1	Assessment
M4-5	0.750	0.519	Realistic
M5-6	0.948	0.716	Inflated
M6-7	0.992	0.889	Inflated
M7+	0.494	0.333	Insufficient data

Regional Model (v2) Current

10 regions (8 with M4-5 training data), 100km radius, expanding window 2000–2026. Hard negatives from same seismic region. All active regions use per-region grid-search tuned params. Auto-updated from model registry.

Region	M4-5 AUC	Hit%	M4-5 Yrs	M5-6 AUC	M5-6 Yrs	M6-7
Loading...

Loading regional summary...

Why v1 M5-6/M6-7 scores are inflated — Feature Importance Analysis

The v1 model's near-perfect M5-6 and M6-7 scores are caused by trivial geographic separation between positives (active fault zones) and negatives (stable continental interiors). The model learns "is this location on a fault?" rather than "will this fault rupture soon?".

Feature	Importance	Type
cross_fault_velocity_mm_yr	0.186	Geographic
strike_sin	0.155	Geographic
b_value	0.106	Seismological
strike_cos	0.093	Geographic
distance_to_fault_norm	0.086	Geographic

4 of the top 5 features are geographic/fault geometry features — confirming the model classifies location, not timing. The v2 regional model eliminates this bias by using negatives from the same fault system, forcing temporal discrimination.

Istanbul M4-5 Year-by-Year Results (v2, tuned, expanding window)

Year	Train	Test (pos/neg)	AUC	Acc	Hit Rate
2008	<2008	2 / 8	0.188	50.0%	0.0%
2010	<2010	3 / 12	0.417	66.7%	0.0%
2011	<2011	2 / 8	0.563	80.0%	0.0%
2013	<2013	3 / 12	0.750	73.3%	0.0%
2016	<2016	3 / 12	0.639	80.0%	66.7%
2018	<2018	2 / 6	0.417	75.0%	50.0%
2019	<2019	5 / 20	0.930	88.0%	80.0%
2020	<2020	3 / 12	0.361	66.7%	33.3%
2025	<2025	8 / 32	0.820	77.5%	75.0%

Overall: AUC=0.619 (11 testable years, expanding-window, hard same-region negatives). 2019 shows AUC=0.930 (pre-Marmara seismicity buildup); 2025 AUC=0.820 (recent activity spike). Istanbul has sparse M4-5 events (median 3/yr), making per-year estimates high-variance.

Maraş M4-5 — Best-Tested Region (19 years, AUC=0.796)

Year	Pos/Neg	AUC	Hit%	Year	Pos/Neg	AUC	Hit%
2000	3/12	0.444	0%	2015	3/12	0.667	33%
2002	2/8	0.063	0%	2017	3/12	0.417	0%
2003	2/8	0.438	0%	2018	4/16	0.531	50%
2007	4/16	0.719	75%	2019	2/8	0.375	0%
2011	2/8	0.125	0%	2020	2/8	0.188	0%
2012	9/36	0.710	44%	2021	2/8	0.625	50%
2013	6/24	0.868	33%	2022	5/20	0.520	20%
2014	2/8	0.500	50%	2023	60/50	0.716	78%
				2024	17/50	0.734	71%

Maraş 2023–2024 (post-Kahramanmaraş M7.8 sequence): AUC=0.716/0.734 — the model correctly identified elevated risk in the aftershock-rich period. 2013 shows AUC=0.868 (Doğanyol aftershock sequence). Strategy: neg_ratio=2, temporal_ratio=0.6 (grid-search tuned). Overall AUC=0.796 over 19 test years.

Cross-Validation Summary: Loading dynamic summary...

13c. CSEP / RELM Benchmark Comparison

How does Talivio's regional model compare to published prospective earthquake forecast benchmarks tested through the Collaboratory for the Study of Earthquake Predictability (CSEP)? Below we compare against the best RELM (Regional Earthquake Likelihood Models) results from the California testing center and ETH Zurich's Switzerland testing center.

Model / Study	Region	Best AUC	Area Skill Score	Negative Strategy	Validation
FCN Deep Learning (GJI 2024)	California	—	0.882	Same-region	Pseudo-prospective
ETAS Italy / OEF-Italy (Taroni 2023)	Italy	—	0.70	Same-region	Prospective 10yr
STEP (Gerstenberger et al. 2005)	California	~0.65	—	Random cells	Prospective (RELM)
ETAS (Helmstetter et al. 2007)	California	~0.68	—	Seismicity rate	Prospective (CSEP)
EEPAS (Rhoades & Evison 2004)	New Zealand	~0.70	—	Precursory swarms	Prospective (CSEP)
DeVries et al. 2018 (Google Brain)	Japan	0.849	— (inflated)	Geographically distinct	Retrospective
Talivio (this work)	10 regions	—	—	Hard same-region temporal + spatial	Expanding-window no future leakage

ASS = Area Skill Score (Molchan diagram diagonal area). FCN 0.882 and ETAS Italy 0.70 from published papers. Talivio ASS computed live from prospective forecasts — grows as more forecasts are verified.

Key insight: The DeVries et al. (2018) AUC of 0.849 is widely cited but uses geographically distinct negatives — the same design flaw as Talivio v1. When hard same-region negatives are used (as in Rouet-Leduc 2022 and Talivio v2), realistic AUC drops to 0.65–0.75. Talivio v2's M4-5 AUC of — and M5-6 AUC of — is consistent with — and in high-activity zones exceeds — the state-of-the-art honest retrospective validation range.

Prospective Trial (ongoing): Talivio generates daily ML forecasts for all active regions × magnitude bands (M4-5, M5-6, M6-7, M7+). Each forecast covers a 30-day window; outcomes are verified automatically against the USGS catalog. Expanding-window pseudo-prospective forecasts (verified against held-out future periods) provide the primary out-of-sample validation record. Daily real-time forecasts accumulate continuously, extending the prospective evaluation with each passing day. Area Skill Score (Molchan diagram) is computed live from all verified forecasts — see current ASS value in the comparison table above.

14. References

1

Maximum likelihood estimate of b in the formula log N = a − bM and its confidence limits. Aki, K. (1965). Bulletin of the Earthquake Research Institute, 43, 237–239.

2

Frequency of earthquakes in California. Gutenberg, B. & Richter, C.F. (1944). Bulletin of the Seismological Society of America, 34(4), 185–188.

3

Seismic gaps and plate tectonics: seismic potential for major boundaries. McCann, W.R. et al. (1979). Pure and Applied Geophysics, 117(6), 1082–1147.

4

Mapping spatial variability of the frequency-magnitude distribution of earthquakes. Wiemer, S. & Wyss, M. (2002). Advances in Geophysics, 45, 259–302.

5

Is the sequence of earthquakes in Southern California, with aftershocks removed, Poisson? Gardner, J.K. & Knopoff, L. (1974). Bulletin of the Seismological Society of America, 64(5), 1363–1367.

6

Recalculated probability of M≥7 earthquakes beneath the Sea of Marmara, Turkey. Parsons, T. (2004). Journal of Geophysical Research, 109(B5).

7

The seismicity of North Africa and the Eastern Mediterranean region. Ambraseys, N. (2002). Annals of Geophysics, 45(3–4).

8

Simulation of ground motion using the stochastic method. Boore, D.M. (2003). Pure and Applied Geophysics, 160, 635–676.

9

New empirical relationships among magnitude, rupture length, rupture width, rupture area, and surface displacement. Wells, D.L. & Coppersmith, K.J. (1994). Bulletin of the Seismological Society of America, 84(4), 974–1002.

10

Advance short-term prediction of the large Hokkaido earthquake of September 25, 2003, M = 8.1. Shebalin, P.N. et al. (2006). Earth, Planets and Space, 56(8), 715–724.

11

ISC-GEM Global Instrumental Earthquake Catalogue (1900–2009). Storchak, D.A. et al. (2013). Data Science Journal, 12, WDS79–WDS96.

12

NGL Nevada Geodetic Laboratory GPS Time Series and Velocities. Blewitt, G. et al. (2018). Seismological Research Letters, 89(4), 1619–1628. geodesy.unr.edu

13

Static stress changes and the triggering of earthquakes. King, G.C.P., Stein, R.S. & Lin, J. (1994). Bulletin of the Seismological Society of America, 84(3), 935–953.

14

The role of stress transfer in earthquake occurrence. Stein, R.S. (1999). Nature, 402(6762), 605–609.

15

Statistical models for earthquake occurrences and residual analysis for point processes. Ogata, Y. (1988). Journal of the American Statistical Association, 83(401), 9–27.

16

Internal deformation due to shear and tensile faults in a half-space. Okada, Y. (1992). Bulletin of the Seismological Society of America, 82(2), 1018–1040.

17

Precursory seismic quiescence. Habermann, R.E. (1988). Pure and Applied Geophysics, 126(2–4), 279–318.

18

Earthquake forecasting using aftershock statistics and short-term clustering. Reasenberg, P.A. & Jones, L.M. (1989). Science, 243(4896), 1173–1176.

19

Attenuation relationships of peak ground acceleration and velocities from Turkish strong-motion data. Ulusay, R. et al. (2004). Engineering Geology, 73(1–2), 137–161.

20

IERS Conventions (2010). Petit, G. & Luzum, B. (2010). IERS Technical Note, 36.

21

Earth tides. Agnew, D.C. (2015). Treatise on Geophysics, 3.06.

22

Earthquake potential revealed by tidal influence on earthquake size-frequency statistics. Ide, S. et al. (2016). Nature Geoscience, 9(10), 834–837.

23

Structure of optimal strategies in earthquake prediction. Molchan, G.M. (1991). Tectonophysics, 193(4), 267–276.

24

Testing alarm-based earthquake predictions. Zechar, J.D. & Jordan, T.H. (2008). Geophysical Journal International, 172(2), 715–724.

25

Focal Loss for Dense Object Detection. Lin, T.-Y. et al. (2017). IEEE Transactions on PAMI, 42(2), 318–327.

26

An Introduction to Statistical Modeling of Extreme Values. Coles, S. (2001). Springer.

27

Investigating the multifractal properties of earthquake sequences. Telesca, L. et al. (2001). Chaos, Solitons & Fractals, 12, 2417–2425.

How does the multi-band earthquakeprobability forecast system work?

Active Model Details