Traffic Accident Detection & Tracking on Google Maps–Style Data

Built a Google Maps-style traffic incident pipeline that (1) detects accidents from noisy crowdsourced reports, (2) tracks whether an incident persists using travel-time behavior, and (3) declares clearance when traffic returns to normal. Implemented and compared classical hypothesis testing, sequential decision-making, time-series forecasting, and changepoint methods in Python.

  • GitHub: TODO (paste repo link)
  • Demo: TODO (optional)

Notes

What I built

  • Log parsing + feature engineering: Parsed car telemetry at Point A / Point B and joined it with accident report logs; computed per-car travel durations and aggregated signals into 10-minute intervals.
  • Accident detection (reports): Implemented binomial hypothesis testing per interval (H₀: baseline report rate ≈ 1%) and visualized detected windows.
  • Accident detection (telemetry): Implemented a likelihood ratio test on travel-time distributions to detect incident regimes even when reports are noisy or missing.
  • Persistence + clearance (online):
    • Sequential estimation: Compared a running mean against an exponential moving average (EMA) with different forgetting factors to balance responsiveness and stability.
    • SPRT: Sequential Probability Ratio Test with explicit error controls (α=β=0.05) to emit “Accident detected” and “Cleared” events.
  • Time-series forecasting: Built rolling AR / ARMA / ARIMA forecasts over 10-minute mean durations (resampled) to predict expected travel times and highlight sustained deviations from the forecast.
  • Changepoint detection: Prepared 1037 travel-duration observations and applied CUSUM, Page–Hinkley (online drift), and PELT (offline segmentation) to estimate incident start/clear boundaries.
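
The per-interval binomial test on reports can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the (cars, reports) input layout, the one-sided alternative, the significance level of 0.05, and the function names are all assumptions made for the example.

```python
from math import comb

def binomial_tail(k: int, n: int, p0: float) -> float:
    """One-sided p-value P(X >= k) for X ~ Binomial(n, p0)."""
    # Exact sum; fine for a few hundred cars per interval (assumed scale).
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

def flag_intervals(intervals, p0=0.01, alpha=0.05):
    """intervals: list of (n_cars, n_reports) per 10-minute window (assumed layout).

    Returns True where the report count is significantly above the
    baseline rate p0, i.e. where H0 is rejected at level alpha.
    """
    return [binomial_tail(k, n, p0) < alpha for n, k in intervals]

# Hypothetical counts: two quiet intervals, then two with a burst of reports.
intervals = [(200, 2), (180, 1), (220, 12), (210, 9)]
flags = flag_intervals(intervals)
print(flags)
```

With a 1% baseline, two reports out of 200 cars is unremarkable, while a dozen reports out of 220 is far in the tail, so only the last two intervals are flagged.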
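
To illustrate the running mean versus EMA trade-off, here is a small sketch. It assumes the convention where the forgetting factor weights the previous estimate (a smaller factor forgets faster); the project may have used the opposite convention, and the sample durations are made up.

```python
def running_mean(xs):
    """Mean of all observations seen so far, updated one sample at a time."""
    out, total = [], 0.0
    for n, x in enumerate(xs, start=1):
        total += x
        out.append(total / n)
    return out

def ema(xs, forgetting):
    """Exponential moving average: est = forgetting * est + (1 - forgetting) * x.

    Assumed convention: `forgetting` in [0, 1) is the weight on the past,
    so values near 1 are smooth but slow, values near 0 are fast but noisy.
    """
    out, est = [], xs[0]  # initialise with the first observation
    for x in xs:
        est = forgetting * est + (1 - forgetting) * x
        out.append(est)
    return out

# Hypothetical travel durations (minutes): normal traffic, then a slowdown.
durations = [5.0] * 4 + [10.0] * 4
slow = ema(durations, forgetting=0.9)
fast = ema(durations, forgetting=0.5)
overall = running_mean(durations)
```

After the jump to 10 minutes, the fast EMA is close to the new level within four samples, the slow EMA has moved only part of the way, and the running mean sits halfway because it never discounts old data.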
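
The SPRT step might look something like the sketch below. It assumes Gaussian travel durations with a known standard deviation and two fixed means (normal versus accident); those modelling choices, the parameter values, and the reset-after-decision behaviour are assumptions for illustration. Only α = β = 0.05 and the two event labels come from the description above.

```python
from math import log

def sprt_events(durations, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Emit (index, label) events from a repeated SPRT on travel durations.

    H0: mean mu0 (normal traffic), H1: mean mu1 (accident), known sigma.
    Wald thresholds: accept H1 when the log-likelihood ratio reaches
    log((1 - beta) / alpha), accept H0 when it falls to log(beta / (1 - alpha)).
    """
    upper = log((1 - beta) / alpha)
    lower = log(beta / (1 - alpha))
    llr, in_accident, events = 0.0, False, []
    for t, x in enumerate(durations):
        # Gaussian log-likelihood ratio increment for one observation.
        llr += (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            if not in_accident:
                events.append((t, "Accident detected"))
                in_accident = True
            llr = 0.0  # restart the test after each decision (assumed design)
        elif llr <= lower:
            if in_accident:
                events.append((t, "Cleared"))
                in_accident = False
            llr = 0.0
    return events

# Hypothetical stream: 5-minute trips, a stretch of 9-minute trips, then recovery.
stream = [5.0] * 5 + [9.0] * 5 + [5.0] * 5
events = sprt_events(stream, mu0=5.0, mu1=9.0, sigma=2.0)
print(events)
```

Tightening α and β widens the gap between the two thresholds, which reduces false events at the cost of needing more observations before each decision.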

Key takeaways

  • Combining noisy human reports with behavioral telemetry yields more reliable incident detection than either source alone.
  • Sequential methods (EMA/SPRT/Page–Hinkley) are well-suited for real-time monitoring; offline methods (PELT) provide cleaner retrospective boundaries.
  • Thresholds (confidence levels, penalties, CUSUM h) directly trade off false alarms vs missed incidents, and should be tuned to the product’s risk tolerance.
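
The threshold trade-off can be seen in a one-sided CUSUM sketch. The baseline mean, slack k, and the three values of h below are hypothetical, chosen only to show how the same data produce an earlier alarm, a later alarm, or no alarm at all.

```python
def cusum_upward(xs, mu0, k, h):
    """Index of the first upward-shift alarm, or None if none is raised.

    S_t = max(0, S_{t-1} + x_t - mu0 - k); alarm when S_t > h.
    k is the slack absorbed per sample, h is the decision threshold.
    """
    s = 0.0
    for t, x in enumerate(xs):
        s = max(0.0, s + (x - mu0 - k))
        if s > h:
            return t
    return None

# Hypothetical durations (minutes): baseline near 5, shift to 8 at index 5.
xs = [5.0, 5.5, 4.5, 5.0, 5.0, 8.0, 8.0, 8.0, 8.0]
alarms = {h: cusum_upward(xs, mu0=5.0, k=0.5, h=h) for h in (2.0, 4.0, 10.0)}
print(alarms)
```

A small h alarms on the first slow sample (quick, but more prone to false alarms on noisy data), a moderate h waits one more sample, and a large h misses this short incident entirely.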