feat: incident correlation overhaul, signal-based auto-resolve, token fixes

Correlator
- Raise fast-path idle gate 30 → 90 min (tg_fast_path_idle_minutes)
- Fix disambiguate always-commits bug: run _call_fits_incident on winner
  before committing; fall through to new-incident creation if it fails
- Add unit-continuity path (path 1.5): matches all_active by shared unit
  IDs with a reassignment guard, bridges calls past the idle gate
- Add tag-based incident_type inference (_TAG_TYPE_HINTS) as GPT fallback,
  rescuing tagged calls that would have been dropped (616 observed orphans)
- Add master/child incident model: _create_master_incident, _demote_to_child,
  _add_child_to_master; new incidents stamped incident_type="master"
- Add cross-system parent detection (_find_cross_system_parent): two-signal
  scoring (road overlap=0.4, embedding≥0.78=0.3, proximity=0.3, threshold=0.5)
  wired into create-if-new path; creates master shell on first cross-system match
- Add maybe_resolve_parent: auto-resolves master when all children close;
  called from upload pipeline (LLM closure) and summarizer stale sweep
- Add signal-based auto-resolve via units_active/units_cleared tracking:
  GPT now extracts cleared_units per scene; _update_incident moves units
  between active/cleared lists and resolves the incident when active empties;
  stored on call doc for re-correlation sweep reuse
- Add _create_incident initialization of units_active/units_cleared fields

Re-correlation sweep
- Add corr_sweep_count + MAX_SWEEP_ATTEMPTS=3: orphans get 3 attempts
  then are tombstoned as corr_path="unlinked", ending the re-sweep loop
  (previously hammering each orphan 29-31 times per shift)

Intelligence extraction
- Add cleared_units to GPT prompt schema and rules
- Extract and propagate cleared_units per scene; merge across scenes;
  store on call doc for re-correlation sweep

Token management
- Fix token release bug: remove release_token call on discord_connected=False
  in MQTT checkin (transient Discord drops were orphaning bots mid-shift)
- Add PUT /tokens/{id}/prefer/{system_id} endpoint: lock a bot token to a
  system; pass _none as system_id to clear; stored bidirectionally on both
  token and system documents
- discord_join handler resolves preferred_token_id from system doc and passes
  system_name in MQTT payload
This commit is contained in:
Logan
2026-05-10 19:49:05 -04:00
parent 7e1b01a275
commit 8b660d8e10
9 changed files with 509 additions and 23 deletions
@@ -46,9 +46,15 @@ async def _run_sweep_pass() -> None:
("status", "==", "ended"),
("ended_at", ">=", cutoff),
])
# corr_path="unlinked" is written after MAX_SWEEP_ATTEMPTS failures.
# Allows a few retries so a welfare-check call can link to an escalation
# incident that is created a few minutes later, without sweeping 30× forever.
MAX_SWEEP_ATTEMPTS = 3
orphans = [
c for c in recent_ended
if not c.get("incident_ids") and not c.get("incident_id")
and not c.get("corr_path") # skip calls already exhausted
and c.get("corr_sweep_count", 0) < MAX_SWEEP_ATTEMPTS
]
if not orphans:
@@ -87,6 +93,7 @@ async def _recorrelate_orphan(call: dict) -> bool:
incident_type = call.get("incident_type"),
location = call.get("location"),
location_coords= call.get("location_coords"),
cleared_units = call.get("cleared_units") or [],
reference_time = started_at, # anchor window to when the call happened
create_if_new = False, # never create — link-only
)
@@ -97,6 +104,15 @@ async def _recorrelate_orphan(call: dict) -> bool:
f"Re-correlation: linked orphaned call {call_id} → incident {incident_id}"
)
return True
# Increment the attempt counter. Once MAX_SWEEP_ATTEMPTS is reached the
# orphan filter above will stop picking this call up, and we write
# corr_path="unlinked" as a permanent tombstone.
attempts = call.get("corr_sweep_count", 0) + 1
update: dict = {"corr_sweep_count": attempts}
if attempts >= 3:
update["corr_path"] = "unlinked"
await fstore.doc_set("calls", call_id, update)
return False