feat: incident correlation overhaul, signal-based auto-resolve, token fixes
Correlator
- Raise fast-path idle gate 30 → 90 min (tg_fast_path_idle_minutes)
- Fix disambiguate always-commits bug: run _call_fits_incident on winner
before committing; fall through to new-incident creation if it fails
- Add unit-continuity path (path 1.5): matches all_active by shared unit
IDs with a reassignment guard, bridges calls past the idle gate
- Add tag-based incident_type inference (_TAG_TYPE_HINTS) as GPT fallback,
rescuing tagged calls that would have been dropped (616 observed orphans)
- Add master/child incident model: _create_master_incident, _demote_to_child,
_add_child_to_master; new incidents stamped incident_type="master"
- Add cross-system parent detection (_find_cross_system_parent): two-signal
scoring (road overlap=0.4, embedding≥0.78=0.3, proximity=0.3, threshold=0.5)
wired into create-if-new path; creates master shell on first cross-system match
- Add maybe_resolve_parent: auto-resolves master when all children close;
called from upload pipeline (LLM closure) and summarizer stale sweep
- Add signal-based auto-resolve via units_active/units_cleared tracking:
GPT now extracts cleared_units per scene; _update_incident moves units
between active/cleared lists and resolves the incident when active empties;
stored on call doc for re-correlation sweep reuse
- Add _create_incident initialization of units_active/units_cleared fields
Re-correlation sweep
- Add corr_sweep_count + MAX_SWEEP_ATTEMPTS=3: orphans get 3 attempts
then are tombstoned as corr_path="unlinked", ending the re-sweep loop
(previously hammering each orphan 29-31 times per shift)
Intelligence extraction
- Add cleared_units to GPT prompt schema and rules
- Extract and propagate cleared_units per scene; merge across scenes;
store on call doc for re-correlation sweep
Token management
- Fix token release bug: remove release_token call on discord_connected=False
in MQTT checkin (transient Discord drops were orphaning bots mid-shift)
- Add PUT /tokens/{id}/prefer/{system_id} endpoint: lock a bot token to a
system; pass _none as system_id to clear; stored bidirectionally on both
token and system documents
- discord_join handler resolves preferred_token_id from system doc and passes
system_name in MQTT payload
This commit is contained in:
@@ -109,10 +109,11 @@ class MQTTHandler:
|
||||
updates["status"] = "online"
|
||||
await fstore.doc_update("nodes", node_id, updates)
|
||||
|
||||
# Release any orphaned Discord token when the node explicitly reports disconnected
|
||||
if payload.get("discord_connected") is False:
|
||||
from app.routers.tokens import release_token
|
||||
await release_token(node_id)
|
||||
# NOTE: discord_connected in checkins is informational only — do NOT release the
|
||||
# token here. The bot watchdog reconnects on transient Discord drops, so a single
|
||||
# checkin with discord_connected=False during a brief reconnect window would
|
||||
# incorrectly free the token while the bot is still active. Token release is
|
||||
# handled exclusively by the discord_leave command and the node offline sweeper.
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Status update
|
||||
|
||||
Reference in New Issue
Block a user