feat(alerts): add traffic spike detection with configurable thresholds

Introduce a traffic_spike alert type that monitors system-wide and
per-master traffic levels against configurable thresholds stored in the
database.

- Add AlertThresholdConfig model for persistent threshold configuration
- Implement GET/PUT /admin/alerts/thresholds endpoints for threshold management
- Add traffic spike detection in alert detector cron job:
  - Global QPS monitoring across all masters
  - Per-master RPM/TPM checks with minimum sample thresholds
  - Per-master RPD/TPD checks for daily limits
- Use warning severity at threshold, critical at 2x threshold
- Include metric metadata (value, threshold, window) in alert details
- Update API documentation with new endpoints and alert type
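
The severity rule and alert details described above can be sketched as
follows. This is a minimal illustration, not the code in this commit: the
names classifySpike, spikeDetails, and the Severity constants are
hypothetical, the detail keys are assumed, and treating "at threshold" as
an inclusive comparison (>=) is also an assumption.

```go
package main

import "fmt"

// Severity is a hypothetical stand-in for the project's AlertSeverity type.
type Severity string

const (
	SeverityNone     Severity = "" // no alert raised
	SeverityWarning  Severity = "warning"
	SeverityCritical Severity = "critical"
)

// classifySpike maps a measured value to a severity:
// warning at the threshold, critical at twice the threshold.
// A non-positive threshold is treated as "check disabled" (assumption).
func classifySpike(value, threshold float64) Severity {
	if threshold <= 0 {
		return SeverityNone
	}
	switch {
	case value >= 2*threshold:
		return SeverityCritical
	case value >= threshold:
		return SeverityWarning
	default:
		return SeverityNone
	}
}

// spikeDetails builds the metric metadata attached to an alert.
// The key names are assumptions chosen for illustration.
func spikeDetails(metric string, value, threshold float64, window string) map[string]any {
	return map[string]any{
		"metric":    metric,
		"value":     value,
		"threshold": threshold,
		"window":    window,
	}
}

func main() {
	const threshold = 100.0
	for _, v := range []float64{50, 100, 250} {
		sev := classifySpike(v, threshold)
		if sev == SeverityNone {
			fmt.Printf("rpm=%v: no alert\n", v)
			continue
		}
		fmt.Printf("rpm=%v: %s %v\n", v, sev, spikeDetails("rpm", v, threshold, "1m"))
	}
}
```

Keeping the classification in a pure function like this makes the 2x rule
easy to test independently of the cron job and the database.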
This commit is contained in:
zenfun
2025-12-31 15:56:17 +08:00
parent 85d91cdd2e
commit ba54abd424
6 changed files with 563 additions and 3 deletions

@@ -16,6 +16,7 @@ const (
 	AlertTypeKeyDisabled AlertType = "key_disabled"
 	AlertTypeKeyExpired AlertType = "key_expired"
 	AlertTypeProviderDown AlertType = "provider_down"
+	AlertTypeTrafficSpike AlertType = "traffic_spike"
 )
 
 // AlertSeverity defines the severity level of an alert