server-26

Author	SHA1	Message	Date
Logan	3d51db80d0	Improve extraction accuracy with speaker role inference Add a SPEAKER ROLES section to the GPT-4o-mini prompt teaching it to distinguish dispatch voice (names a unit then gives assignment + address) from unit voice (opens with own callsign + brief status). Applied to location attribution (dispatch-provided address beats unit position report) and unit extraction (dispatched units vs. acknowledging units). No extra API calls — purely prompt-level reasoning on the existing transcript.	2026-06-01 01:17:49 -04:00
Logan	b77d2cce36	Fix over-correlation: geocoding precision, thin path ambiguity, skip_reason propagation - Geocoding: reject GEOMETRIC_CENTER/APPROXIMATE results — vague location strings (regions, city centroids) were resolving to node-area coords and creating false proximity matches that merged unrelated incidents - Thin path: on dispatch channels with multiple active incidents, skip attachment rather than guessing — "10-4" with 3 active incidents is genuinely ambiguous - Short transcripts (≤5 words) now write skip_reason="transcript_too_short" to the call doc, matching garbage transcript behavior - upload.py no-scenes fallback now checks skip_reason before running correlation — flagged calls (garbage, too short) no longer attach via thin path - Update Server README to reflect current project purpose, goals, and pipeline	2026-05-31 23:51:46 -04:00
Logan	f774be12b8	Fix correlation over-merge, thin-call hallucination, and geocoding accuracy - Cap unit-continuity path at 20 min idle (unit_continuity_max_idle_minutes) - Block time_fallback and unit-continuity matching on reassignment calls - Expand reassignment detection to cover unit-initiated self-reassignment - Skip GPT extraction entirely for transcripts ≤5 words (prevents hallucinated tags/units) - Reduce geocode_max_km from 75 to 40 to reject far-out-of-area results - Include county in geocoding query for tighter jurisdiction anchoring	2026-05-26 02:20:15 -04:00
Logan	7d6e97fd4a	fix: improve geocoding specificity and increase distance threshold for repeater systems geocode_max_km: 25 → 75 km. The node is a physical receiver, not the system boundary; digital repeaters extend coverage well beyond 25km (North White Plains at 35.5km from the Yorktown node is a legitimate Westchester County location). Query now fully qualified: "High Street" → "High Street, Yorktown, New York". Added _get_node_state() which reverse-geocodes the node position once (cached) using Google Maps to get the state name, appended alongside the municipality. Generic street names (High Street, Main Street) no longer resolve to wrong-country results.	2026-05-25 14:49:02 -04:00
Logan	0279a82b10	feat: replace Nominatim geocoding with Google Maps API; add TOC map improvements Switch geocoding from Nominatim to Google Maps Geocoding API for accurate local place name resolution (bounds-biased, with 25km distance rejection guard). Remove the now-unused _get_node_place reverse-geocoder and _node_place_cache. Map page (TOC improvements): - Weather radar tiles auto-refresh every 5 minutes via radarEpoch key cycling - Google Maps traffic overlay added to LayersControl - Live 24h clock overlay at bottom-left for situational awareness - Incident sidebar cards now show age (time since dispatch) and unit count	2026-05-25 13:27:19 -04:00
Logan	0db09d6bf7	fix: reject geocode results outside node jurisdiction Nominatim's viewbox is advisory (bounded=0), so ambiguous place names like "Pinebrook" can resolve to locations 30-40km away in the wrong town. Added a post-geocode distance gate: results farther than geocode_max_km (default 25km) from the node are discarded with a warning log rather than written to the incident. Also logs distance on successful geocodes for easier audit. New config setting: geocode_max_km (float, default 25.0)	2026-05-25 13:09:10 -04:00
Logan	7dd090e8b2	fix: raise garbage-transcript threshold to avoid false positives on plate reads Phonetic run threshold 5 → 12: a plate spellout ("Foxtrot Alpha Uniform Lima Kilo...") produces 6–8 consecutive phonetic words, triggering false positives and blocking intelligence extraction on legitimate calls. 12 is safely above any real spellout (~8 max) while still catching the full-alphabet hallucination (26 words). Also writes skip_reason="garbage_transcript" to the call doc and surfaces it in the admin correlation debug endpoint.	2026-05-25 03:31:43 -04:00
Logan	92c9d8effc	fix: garbage transcript detection, county geocoding, dispatch channel detection - intelligence.py: detect Whisper phonetic-alphabet hallucinations before sending to GPT; skip extraction entirely to prevent fake units/tags corrupting correlation - intelligence.py: upgrade node reverse-geocode from zoom=5 (state) to zoom=10 (county) and include county in address queries so common street names (e.g. "East Main Street") resolve to the correct county - incident_correlator.py: add "patched" and "primary" to dispatch channel regex so patched trunking channels are treated as shared backbones - incident_correlator.py: add 20-min idle gate for tactical channel default so a reused frequency can't absorb a new unrelated incident	2026-05-24 01:30:40 -04:00
Logan	1071bcd3e8	fix: map overlay clicks, layer overlap, fan spacing, geocoding radius - Move incident panel to left side (was topright, conflicting with LayersControl) - Move legend to bottom-right, raise auto-fit button to clear it - Tighten fan card step 7→5px for closer grouping - Geocoding: remove bounded=1 hard clip, widen bias radius 0.1°→0.5° (~55km) so addresses like "34 Carlton Drive" resolve outside the node's immediate area	2026-05-24 00:20:11 -04:00
Logan	8b660d8e10	feat: incident correlation overhaul, signal-based auto-resolve, token fixes Correlator - Raise fast-path idle gate 30 → 90 min (tg_fast_path_idle_minutes) - Fix disambiguate always-commits bug: run _call_fits_incident on winner before committing; fall through to new-incident creation if it fails - Add unit-continuity path (path 1.5): matches all_active by shared unit IDs with a reassignment guard, bridges calls past the idle gate - Add tag-based incident_type inference (_TAG_TYPE_HINTS) as GPT fallback, rescuing tagged calls that would have been dropped (616 observed orphans) - Add master/child incident model: _create_master_incident, _demote_to_child, _add_child_to_master; new incidents stamped incident_type="master" - Add cross-system parent detection (_find_cross_system_parent): two-signal scoring (road overlap=0.4, embedding≥0.78=0.3, proximity=0.3, threshold=0.5) wired into create-if-new path; creates master shell on first cross-system match - Add maybe_resolve_parent: auto-resolves master when all children close; called from upload pipeline (LLM closure) and summarizer stale sweep - Add signal-based auto-resolve via units_active/units_cleared tracking: GPT now extracts cleared_units per scene; _update_incident moves units between active/cleared lists and resolves the incident when active empties; stored on call doc for re-correlation sweep reuse - Add _create_incident initialization of units_active/units_cleared fields Re-correlation sweep - Add corr_sweep_count + MAX_SWEEP_ATTEMPTS=3: orphans get 3 attempts then are tombstoned as corr_path="unlinked", ending the re-sweep loop (previously hammering each orphan 29-31 times per shift) Intelligence extraction - Add cleared_units to GPT prompt schema and rules - Extract and propagate cleared_units per scene; merge across scenes; store on call doc for re-correlation sweep Token management - Fix token release bug: remove release_token call on discord_connected=False in MQTT checkin (transient Discord drops were orphaning bots mid-shift) - Add PUT /tokens/{id}/prefer/{system_id} endpoint: lock a bot token to a system; pass _none as system_id to clear; stored bidirectionally on both token and system documents - discord_join handler resolves preferred_token_id from system doc and passes system_name in MQTT payload	2026-05-10 19:49:05 -04:00
Logan	7e1b01a275	Updates to reduce firestore calls to try and stay in free tier ### Firestore read reductions 1. `doc_get_cached()` in `firestore.py` — new 5-min TTL cache One place, benefits everything. System and node config documents almost never change during a monitoring session. 2. System doc: 4 reads → 1 per call \| Before \| After \| \|---\|---\| \| `upload.py` — `doc_get("systems")` for ai_flags \| `doc_get_cached` \| \| `transcription.py` — `get_vocabulary()` → `doc_get("systems")` \| cache hit \| \| `intelligence.py` — `get_vocabulary()` → `doc_get("systems")` \| cache hit \| \| `intelligence.py` — `doc_get("systems")` again for ten_codes \| eliminated (reads same cached doc) \| 3. Node doc: cached in `_on_call_start` and `intelligence.py` The node is read every call event to get `assigned_system_id` and lat/lon for geocoding. Both now use the cache — node assignments and positions essentially never change at runtime. 4. Node sweeper: 30s → 90s interval The sweeper was doing a full node collection scan 3× more often than necessary — the offline threshold is already 90s. Cuts sweeper reads by 66%. 5. Vocabulary induction: scans all-time calls → last 7 days Previously fetched every ended call for a system (could be thousands). Now scoped to the last 7 days. > Note: The vocabulary induction query `(system_id == X, ended_at >= cutoff)` needs a Firestore > composite index on `(system_id ASC, ended_at ASC)`. When the induction loop first fires it will log > an error with a Firebase Console link to create it in one click.	2026-05-04 02:05:00 -04:00
Logan	e704df1a62	# `app/internal/incident_correlator.py` - `correlate_call` — added units and vehicles optional params; when provided (per-scene from intelligence extraction), they take priority over the merged call-document values, preventing multi-scene unit contamination - Cross-TGID correlation path (2.5) — new path between location and slow paths: when a call shares 2+ unit IDs with a recent same-system, same-type incident AND embedding similarity ≥ 0.85, it links them — catches multi-talkgroup pursuits like the bicycle search that split across dispatch/tactical/geographic channels # `app/internal/intelligence.py` - `reassignment` field — added to the GPT-4o-mini prompt schema and rules; `true` when dispatch is actively pulling a unit to a new, different call (not a status update or en route acknowledgement); returned in every processed scene dict - Tag location rule — added explicit instruction to the prompt: tags must describe what happened, not where; place names, road names, and talkgroup names are explicitly forbidden as tags # `app/routers/upload.py` - Both scene correlation call sites (`_run_extraction_pipeline` and `_run_intelligence_pipeline`) now pass `units=corr_units` where `corr_units = [] if scene.get("reassignment") else scene.get("units") `— suppresses unit overlap matching when a unit is being reassigned to a new call, preventing chaining into their previous incident - Both sites also pass `vehicles=scene.get("vehicles")` (per-scene vehicles, from the multi-scene units fix) # `app/config.py` - `embedding_cross_tg_threshold: float = 0.85` — threshold for the new cross-TGID path	2026-05-04 01:33:03 -04:00
Logan	317f9d2a9d	Updates to intel and correlation	2026-04-23 01:26:41 -04:00
Logan	338b946ba3	Start to learn vocab from talkgroups to improve accuracy of STT	2026-04-21 22:17:30 -04:00
Logan	6612e4b683	Big updates	2026-04-21 01:51:23 -04:00
Logan	788afca339	Update geocoding intel	2026-04-19 23:27:51 -04:00
Logan	ba43796c51	Updates, big updates incident_correlator.py — full rewrite: always runs on every call, fetches all active incidents cross-type, fast path collects all talkgroup matches and disambiguates by unit/vehicle overlap → location proximity → embedding, new location proximity path, slow path requires location corroboration, "Auto:" stripped from titles, "auto-generated" tag added, units/vehicles now accumulated on update intelligence.py — resolved field in GPT schema, returned as 5th value upload.py — both pipelines unpack 5-tuple, always call correlate, auto-resolve on resolved=True summarizer.py — stale sweep runs each tick, resolves incidents idle for 90+ minutes config.py — correlation_window_hours=2, embedding_similarity_threshold=0.93, location_proximity_km=0.5, incident_auto_resolve_minutes=90	2026-04-19 22:53:53 -04:00
Logan	303c5b13cf	big ui and intel updates	2026-04-19 16:48:55 -04:00
Logan	03212fca51	Move to GPT for API consistency	2026-04-19 08:18:55 -04:00
Logan	1e3d691dbd	Intel update	2026-04-19 08:00:09 -04:00
Logan	10aabf4fb2	Change models	2026-04-13 01:43:10 -04:00
Logan	616c06f09c	stt updates and intelligence updates	2026-04-13 00:01:19 -04:00
Logan	7b6fd640d9	Update intelligence	2026-04-12 23:33:44 -04:00
Logan	3b3a136d04	Massive update	2026-04-11 13:44:08 -04:00
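
Commits 92c9d8effc and 7dd090e8b2 describe a garbage-transcript check that counts consecutive phonetic-alphabet words and skips extraction when the run reaches a threshold (raised from 5 to 12). The log does not show the code in intelligence.py, so the following is only a minimal sketch of that idea. The names `NATO_ALPHABET`, `longest_phonetic_run`, and `is_garbage_transcript` are hypothetical, and treating a run equal to the threshold as garbage is an assumption.

```python
import re

# Standard NATO spelling alphabet. The word list actually used by the project
# is not shown in the commit log, so this list is an assumption.
NATO_ALPHABET = (
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel",
    "india", "juliett", "kilo", "lima", "mike", "november", "oscar", "papa",
    "quebec", "romeo", "sierra", "tango", "uniform", "victor", "whiskey",
    "xray", "yankee", "zulu",
)
# Accept the common "juliet" spelling as well.
NATO_WORDS = frozenset(NATO_ALPHABET) | {"juliet"}

# Threshold taken from commit 7dd090e8b2 (5 -> 12).
PHONETIC_RUN_THRESHOLD = 12


def longest_phonetic_run(transcript: str) -> int:
    """Return the length of the longest run of consecutive phonetic words."""
    text = transcript.lower().replace("x-ray", "xray")
    longest = current = 0
    for word in re.findall(r"[a-z]+", text):
        if word in NATO_WORDS:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # any non-phonetic word breaks the run
    return longest


def is_garbage_transcript(transcript: str,
                          threshold: int = PHONETIC_RUN_THRESHOLD) -> bool:
    """Flag transcripts whose phonetic run reaches the threshold (assumed >=)."""
    return longest_phonetic_run(transcript) >= threshold
```

With this sketch, a plate spellout such as "Foxtrot Alpha Uniform Lima Kilo Echo Tango" produces a run of 7 and passes, while a full-alphabet hallucination produces a run of 26 and is flagged, which matches the reasoning given in the commit message.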

24 Commits
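
Several of the geocoding commits above (0db09d6bf7, 7d6e97fd4a, f774be12b8, b77d2cce36) revolve around a post-geocode gate: discard results farther than geocode_max_km from the node, and reject imprecise Google Maps results typed GEOMETRIC_CENTER or APPROXIMATE. The log does not show the implementation, so this is only a sketch of how such a gate could look. The function names `haversine_km` and `accept_geocode` are hypothetical, the great-circle (haversine) distance is an assumed choice of metric, and the 40 km default is taken from f774be12b8.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

# Google Geocoding API location_type values that commit b77d2cce36 rejects.
REJECTED_LOCATION_TYPES = frozenset({"GEOMETRIC_CENTER", "APPROXIMATE"})


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accept_geocode(node_lat: float, node_lon: float,
                   lat: float, lon: float,
                   location_type: str,
                   geocode_max_km: float = 40.0) -> bool:
    """Return True only for precise results within geocode_max_km of the node."""
    if location_type in REJECTED_LOCATION_TYPES:
        return False  # vague match such as a region or city centroid
    return haversine_km(node_lat, node_lon, lat, lon) <= geocode_max_km
```

Checking the precision type before the distance keeps centroid-style results out even when they happen to fall near the node, which is the false-proximity case b77d2cce36 describes.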