Operational observability

Monitor health, performance, and risk in real time

Abhackus exposes Prometheus metrics and supports alerting on availability, latency, error rates, and risky authentication patterns.

HTTP Metrics

`abhackus_http_requests_total`: request count broken down by method, route, and status.

Latency

`abhackus_http_request_duration_seconds`: request duration histogram, used to derive p95 and p99 latency.
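
The two metrics above can be turned into request-rate and latency-percentile series with Prometheus recording rules. The following is a minimal sketch rather than the project's shipped rules: the group and rule names are hypothetical, and it assumes the counter carries a `route` label and the histogram exposes the standard `_bucket` series with an `le` label.

```yaml
# Hypothetical recording rules; names and the 5m window are illustrative assumptions.
groups:
  - name: abhackus-http-sketch
    rules:
      # Requests per second for each route, averaged over the last 5 minutes.
      - record: route:abhackus_http_requests:rate5m
        expr: sum by (route) (rate(abhackus_http_requests_total[5m]))
      # p95 latency in seconds across all routes.
      - record: job:abhackus_http_request_duration_seconds:p95_5m
        expr: |
          histogram_quantile(
            0.95,
            sum by (le) (rate(abhackus_http_request_duration_seconds_bucket[5m]))
          )
      # p99 latency in seconds across all routes.
      - record: job:abhackus_http_request_duration_seconds:p99_5m
        expr: |
          histogram_quantile(
            0.99,
            sum by (le) (rate(abhackus_http_request_duration_seconds_bucket[5m]))
          )
```

The aggregation keeps the `le` label because `histogram_quantile` needs the bucket boundaries to interpolate a percentile.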

Baseline Alerts

API down, high 5xx rate, high p95 latency, and login failure bursts.
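
As an illustration of what these four baseline alerts might look like as Prometheus rules, here is a hedged sketch. Every threshold, `for` duration, and alert name below is an assumption, as are the `status` and `route` label names and the use of HTTP 401 to signal a failed login; the `job` value matches the scrape config shown later on this page, and the `severity` values follow the routing levels listed there.

```yaml
# Hypothetical alert rules; thresholds and names are assumptions, not shipped values.
groups:
  - name: abhackus-baseline-sketch
    rules:
      - alert: AbhackusApiDown
        # The scrape target has failed for 2 minutes.
        expr: up{job="abhackus-rest-api"} == 0
        for: 2m
        labels:
          severity: critical
      - alert: AbhackusHigh5xxRate
        # More than 5% of requests returned 5xx over 5 minutes (assumed threshold).
        expr: |
          sum(rate(abhackus_http_requests_total{status=~"5.."}[5m]))
          /
          sum(rate(abhackus_http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: high
      - alert: AbhackusHighP95Latency
        # p95 latency above 1 second for 10 minutes (assumed threshold).
        expr: |
          histogram_quantile(
            0.95,
            sum by (le) (rate(abhackus_http_request_duration_seconds_bucket[5m]))
          ) > 1
        for: 10m
        labels:
          severity: medium
      - alert: AbhackusLoginFailureBurst
        # More than 20 failed logins in 5 minutes (assumed threshold and status code).
        expr: |
          sum(increase(abhackus_http_requests_total{route="/api/auth/login", status="401"}[5m])) > 20
        labels:
          severity: high
```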

Formal SLOs

Availability
Monthly target >= 99.5%

Latency p95
Target <= 700ms

5xx Error Ratio
Target <= 1%
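
To make the availability target concrete: a 30-day month has 43,200 minutes, so a 99.5% target leaves an error budget of 0.5%, which is 216 minutes (3.6 hours) of unavailability. The sketch below shows one way such targets could be tracked with recording rules. It assumes availability is measured from the scrape `up` series and that the 5xx ratio is evaluated over the same 30-day window; the rule names are hypothetical, and the project's own file, deploy/observability/prometheus-slo-rules.yml, may define these differently.

```yaml
# Hypothetical SLO recording rules; measurement method and windows are assumptions.
groups:
  - name: abhackus-slo-sketch
    rules:
      # Fraction of successful scrapes over 30 days (target >= 0.995).
      - record: job:abhackus_availability:ratio_30d
        expr: avg_over_time(up{job="abhackus-rest-api"}[30d])
      # Share of the availability error budget still unspent:
      # 1 means untouched, 0 means exhausted, negative means the SLO is breached.
      - record: job:abhackus_availability:budget_remaining_30d
        expr: (job:abhackus_availability:ratio_30d - 0.995) / (1 - 0.995)
      # Fraction of requests answered with 5xx over 30 days (target <= 0.01).
      - record: job:abhackus_http_5xx:ratio_30d
        expr: |
          sum(increase(abhackus_http_requests_total{status=~"5.."}[30d]))
          /
          sum(increase(abhackus_http_requests_total[30d]))
```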

Key endpoints and headers

GET /api/health
GET /api/metrics (requires a technical bearer token)
x-correlation-id response header

Reference files

deploy/observability/prometheus-alerts.yml
deploy/observability/prometheus-slo-rules.yml
deploy/observability/grafana-dashboard-abhackus.json
deploy/observability/README-observability.org
.gitlab-ci.yml (coverage gate + artifacts)

Connect Prometheus + Grafana + Alertmanager

# 1) Prometheus scrape config
scrape_configs:
  - job_name: abhackus-rest-api
    metrics_path: /api/metrics
    authorization:
      type: Bearer
      credentials: '<ABHACKUS_METRICS_TOKEN>'
    static_configs:
      - targets: ['127.0.0.1:8080']

# 2) Load alert rules
rule_files:
  - /etc/prometheus/prometheus-alerts.yml

# 3) Grafana
# - datasource: Prometheus
# - panels for request rate, 5xx rate, and p95/p99 latency
# - dashboards by critical routes (/api/auth/login, /api/accounting/*)

# 4) Alertmanager
# - route critical/high to on-call Telegram/Email
# - route medium to ops channel
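
The routing described in step 4 could be sketched as an Alertmanager config like the one below. This is an assumption-laden example, not the project's configuration: the receiver names, addresses, and SMTP host are placeholders, and it assumes alerts carry a `severity` label with the values `critical`, `high`, and `medium`. Only email receivers are shown; a Telegram receiver would use Alertmanager's `telegram_configs` section instead.

```yaml
# Hypothetical Alertmanager routing; all names and addresses are placeholders.
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'

route:
  # Fallback receiver for anything not matched below.
  receiver: ops-channel
  group_by: ['alertname', 'severity']
  routes:
    # Critical and high severity alerts page the on-call rotation.
    - matchers:
        - 'severity=~"critical|high"'
      receiver: on-call
    # Medium severity alerts go to the ops channel.
    - matchers:
        - 'severity="medium"'
      receiver: ops-channel

receivers:
  - name: on-call
    email_configs:
      - to: 'oncall@example.com'
  - name: ops-channel
    email_configs:
      - to: 'ops@example.com'
```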

CI also runs a k6 stress and performance suite (`perf-k6`) against `PERF_TARGET_URL` and exports JSON summary artifacts.
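
For orientation, a GitLab CI job of this kind might look roughly like the sketch below. Only the job name `perf-k6` and the `PERF_TARGET_URL` variable come from this page; the stage, image tag, script path, and artifact file name are hypothetical, and the actual definition in .gitlab-ci.yml may differ.

```yaml
# Hypothetical job definition; stage, script path, and file names are assumptions.
perf-k6:
  stage: test
  image:
    name: grafana/k6:latest
    # Clear the image entrypoint so GitLab can run the script lines in a shell.
    entrypoint: [""]
  script:
    # Pass the target URL to the test script and write a JSON summary.
    - k6 run -e PERF_TARGET_URL="$PERF_TARGET_URL" --summary-export=k6-summary.json perf/load-test.js
  artifacts:
    # Keep the summary even when thresholds fail the job.
    when: always
    paths:
      - k6-summary.json
```

Inside the k6 script, the value passed with `-e` is available as `__ENV.PERF_TARGET_URL`.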