feat: incident correlation overhaul, signal-based auto-resolve, token fixes

Correlator
- Raise fast-path idle gate 30 → 90 min (tg_fast_path_idle_minutes)
- Fix disambiguate always-commits bug: run _call_fits_incident on winner
  before committing; fall through to new-incident creation if it fails
- Add unit-continuity path (path 1.5): matches all_active by shared unit IDs
  with a reassignment guard, bridges calls past the idle gate
- Add tag-based incident_type inference (_TAG_TYPE_HINTS) as GPT fallback,
  rescuing tagged calls that would have been dropped (616 observed orphans)
- Add master/child incident model: _create_master_incident, _demote_to_child,
  _add_child_to_master; new incidents stamped incident_type="master"
- Add cross-system parent detection (_find_cross_system_parent): two-signal
  scoring (road overlap=0.4, embedding≥0.78=0.3, proximity=0.3, threshold=0.5)
  wired into create-if-new path; creates master shell on first cross-system
  match
- Add maybe_resolve_parent: auto-resolves master when all children close;
  called from upload pipeline (LLM closure) and summarizer stale sweep
- Add signal-based auto-resolve via units_active/units_cleared tracking: GPT
  now extracts cleared_units per scene; _update_incident moves units between
  active/cleared lists and resolves the incident when active empties; stored
  on call doc for re-correlation sweep reuse
- Add _create_incident initialization of units_active/units_cleared fields

Re-correlation sweep
- Add corr_sweep_count + MAX_SWEEP_ATTEMPTS=3: orphans get 3 attempts then
  are tombstoned as corr_path="unlinked", ending the re-sweep loop
  (previously hammering each orphan 29-31 times per shift)

Intelligence extraction
- Add cleared_units to GPT prompt schema and rules
- Extract and propagate cleared_units per scene; merge across scenes; store
  on call doc for re-correlation sweep

Token management
- Fix token release bug: remove release_token call on discord_connected=False
  in MQTT checkin (transient Discord drops were orphaning bots mid-shift)
- Add PUT /tokens/{id}/prefer/{system_id} endpoint: lock a bot token to a
  system; pass _none as system_id to clear; stored bidirectionally on both
  token and system documents
- discord_join handler resolves preferred_token_id from system doc and passes
  system_name in MQTT payload
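
Illustrative sketch (not the code in this commit): the two scoring and tracking rules described under Correlator can be pictured roughly as below. The weights, the 0.78 similarity cutoff, the 0.5 threshold, and the units_active/units_cleared field names come from the message above. Everything else is a hypothetical stand-in for illustration: the Incident dataclass, the function names, the "resolved" status string, the `>=` comparison against the threshold, and the choice to move a previously cleared unit back to active when it is mentioned again.

```python
from dataclasses import dataclass, field

# Weights and cutoffs quoted in the commit message.
ROAD_OVERLAP_WEIGHT = 0.4
EMBEDDING_WEIGHT = 0.3
PROXIMITY_WEIGHT = 0.3
EMBEDDING_MIN_SIMILARITY = 0.78
PARENT_SCORE_THRESHOLD = 0.5


def is_cross_system_parent(road_overlap: bool,
                           embedding_similarity: float,
                           is_nearby: bool) -> bool:
    """Sum the weights of the signals that fired and compare to the threshold.

    No single weight reaches 0.5, so at least two signals must agree.
    Using >= for the comparison is an assumption.
    """
    score = 0.0
    if road_overlap:
        score += ROAD_OVERLAP_WEIGHT
    if embedding_similarity >= EMBEDDING_MIN_SIMILARITY:
        score += EMBEDDING_WEIGHT
    if is_nearby:
        score += PROXIMITY_WEIGHT
    return score >= PARENT_SCORE_THRESHOLD


@dataclass
class Incident:
    """Hypothetical stand-in for an incident document."""
    units_active: list[str] = field(default_factory=list)
    units_cleared: list[str] = field(default_factory=list)
    status: str = "active"


def apply_unit_signals(incident: Incident,
                       units: list[str],
                       cleared_units: list[str]) -> Incident:
    """Move units between the active and cleared lists for one call."""
    cleared_now = set(cleared_units)
    for unit in units:
        if unit in cleared_now:
            continue  # the unit cleared itself on this same call
        if unit in incident.units_cleared:
            # Assumption: a cleared unit heard again is back on the incident.
            incident.units_cleared.remove(unit)
        if unit not in incident.units_active:
            incident.units_active.append(unit)
    for unit in cleared_units:
        if unit in incident.units_active:
            incident.units_active.remove(unit)
        if unit not in incident.units_cleared:
            incident.units_cleared.append(unit)
    # Resolve only when clearance emptied the active list, not when
    # no units were ever tracked.
    if incident.units_cleared and not incident.units_active:
        incident.status = "resolved"
    return incident
```

In this sketch an incident with E-14 and Unit 7 active stays open after Unit 7 reports 10-8 and resolves only once E-14 also clears. Road overlap alone (0.4) does not link two systems, while road overlap plus a 0.80 embedding similarity (0.7) does.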
2026-05-10 19:49:05 -04:00
parent 7e1b01a275
commit 8b660d8e10
9 changed files with 509 additions and 23 deletions
@@ -29,6 +29,7 @@ Response format — a JSON object with a "scenes" array. Each scene:
  location: most specific location string found, or empty string
  vehicles: list of vehicle descriptions mentioned
  units: list of unit IDs or officer numbers explicitly mentioned
+  cleared_units: list of unit IDs that explicitly signal back-in-service or available in this recording
  severity: one of "minor" | "moderate" | "major" | "unknown"
  resolved: true if this scene explicitly signals incident closure, false otherwise
  reassignment: true if dispatch is actively pulling a unit away from their current assignment to respond to a new, different call — e.g. "Baker, can you clear and respond to...", "Adam, break from that and go to...". False if the unit is simply reporting in, updating status, or continuing their current assignment.
@@ -42,6 +43,7 @@ Rules:
 - incident_type: let the talkgroup channel be your primary signal. Use "fire" ONLY if the talkgroup is clearly a fire/rescue channel OR the transcript explicitly describes active fire, smoke, flames, or structure fire activation. Police or EMS referencing a fire scene → use "police" or "ems". When uncertain, prefer "other" over "fire".
 - ten_codes: interpret radio codes using the department reference provided below. Do not guess codes not listed.
 - resolved: true only when the scene explicitly signals "Code 4", "all clear", "10-42", "in custody", "patient transported", "fire out", "GOA", "negative contact", "scene clear".
+- cleared_units: only include units that explicitly stated their own back-in-service status in this recording (e.g. "Unit 7, 10-8", "Baker-1 available", "E-14 back in service", or the department ten-code for available/back-in-service listed above). Silence or absence of a unit is NOT clearance. A scene-wide Code 4 belongs in resolved=true, not here — cleared_units is for individual unit availability signals only.
 - reassignment: only true when a unit is explicitly being pulled to a completely new call or location. A unit going en route to their first dispatch is NOT a reassignment. Routine status updates, acknowledgements, and scene updates are NOT reassignments.
 - transcript_corrected: fix only clear STT/vocoder errors (e.g. "Several" → "10-4", misheard street names, garbled unit IDs). Keep all radio language as-is — do NOT decode codes into plain English. Return null if accurate.

@@ -129,6 +131,7 @@ async def extract_scenes(
        location:           Optional[str]  = scene.get("location") or None
        vehicles:           list[str]      = scene.get("vehicles") or []
        units:              list[str]      = scene.get("units") or []
+        cleared_units:      list[str]      = scene.get("cleared_units") or []
        severity:           str            = scene.get("severity") or "unknown"
        resolved:           bool           = bool(scene.get("resolved", False))
        reassignment:       bool           = bool(scene.get("reassignment", False))
@@ -160,6 +163,7 @@ async def extract_scenes(
            "location_coords":      location_coords,
            "vehicles":             vehicles,
            "units":                units,
+            "cleared_units":        cleared_units,
            "severity":             severity,
            "resolved":             resolved,
            "reassignment":         reassignment,
@@ -175,6 +179,7 @@ async def extract_scenes(
    all_tags     = list(dict.fromkeys(t for s in processed for t in s["tags"]))
    all_units    = list(dict.fromkeys(u for s in processed for u in s["units"]))
    all_vehicles = list(dict.fromkeys(v for s in processed for v in s["vehicles"]))
+    all_cleared  = list(dict.fromkeys(u for s in processed for u in s["cleared_units"]))

    updates: dict = {"tags": all_tags, "severity": primary["severity"]}
    if primary["location"]:
@@ -183,6 +188,8 @@ async def extract_scenes(
        updates["location_coords"] = primary["location_coords"]
    if all_units:
        updates["units"] = all_units
+    if all_cleared:
+        updates["cleared_units"] = all_cleared
    if all_vehicles:
        updates["vehicles"] = all_vehicles
    if primary["embedding"]: