diff --git a/docs/ADR/ADR-0008-adaptive-query-routing-intent-history.md b/docs/ADR/ADR-0008-adaptive-query-routing-intent-history.md index 844c3d3..bf70fd4 100644 --- a/docs/ADR/ADR-0008-adaptive-query-routing-intent-history.md +++ b/docs/ADR/ADR-0008-adaptive-query-routing-intent-history.md @@ -77,6 +77,81 @@ Queries containing `"you are a direct and concise assistant"` (a system-injected --- +## Routing Contract + +This section is normative. Any reimplementation of the classifier or the graph must satisfy all rules below. Rules are ordered by priority — a higher-priority rule always wins. + +### RC-01 — Fast-path override (priority: highest) + +If the query contains a known platform-injected prefix, the system **MUST** classify it as `PLATFORM` without invoking any LLM. + +``` +∀ q : query + contains(q, known_platform_prefix) → route(q) = PLATFORM +``` + +Current registered prefixes (see `_PLATFORM_PATTERNS` in `graph.py`): +- `"you are a direct and concise assistant"` + +Adding a new prefix requires a code change to `_PLATFORM_PATTERNS` and a corresponding update to this list. + +### RC-02 — Platform data signal (priority: high) + +If the query contains any of the following signals, the classifier **MUST** output `PLATFORM` regardless of conversation history: + +- Usage percentages (e.g. `"20%"` in the context of project/account usage) +- Account metrics or consumption figures +- Quota, limit, or billing data + +This rule is enforced via `` in the classifier prompt. It cannot be overridden by history. + +### RC-03 — Intent history scoping (priority: medium) + +The classifier **MUST** use `classify_history` only to resolve ambiguous pronoun or deictic references (`"this"`, `"esto"`, `"lo anterior"`, `"that function"`). It **MUST NOT** use history to predict or bias the type of the current message. + +``` +classify(q, history) ≠ f(dominant_type(history)) +classify(q, history) = f(intent(q), resolve_references(q, history)) +``` + +### RC-04 — RAG bypass (priority: medium) + +Query types that bypass Elasticsearch retrieval: + +| Type | RAG | Justification | +|---|---|---| +| `RETRIEVAL` | Yes | Requires documentation context | +| `CODE_GENERATION` | Yes | Requires syntax examples | +| `CONVERSATIONAL` | No | Reformulates prior answer already in context | +| `PLATFORM` | No | Data is injected via `extra_context`, not retrieved | + +A `PLATFORM` or `CONVERSATIONAL` query that triggers a retrieval step is a contract violation. + +### RC-05 — Model assignment (priority: medium) + +``` +route(q) ∈ {RETRIEVAL, CODE_GENERATION} → model = OLLAMA_MODEL_NAME +route(q) ∈ {CONVERSATIONAL, PLATFORM} → model = OLLAMA_MODEL_NAME_CONVERSATIONAL + ?? OLLAMA_MODEL_NAME # fallback if unset +``` + +Changing which types map to which model slot requires updating this contract. + +### RC-06 — History growth bound (priority: low) + +`classify_history` per session **MUST** be bounded. The classifier reads at most the last 6 entries. The store may grow unbounded in memory but the classifier input is always capped. + +### Contract violations to monitor + +| Symptom | Violated rule | +|---|---| +| Platform query hits Elasticsearch | RC-04 | +| `qwen3:1.7b` used for a `PLATFORM` response | RC-05 | +| Platform prefix query triggers LLM classifier | RC-01 | +| Classifier output mirrors dominant history type | RC-03 | + +--- + ## Consequences ### Positive