[DOC] ADR-0008: add formal routing contract (RC-01 to RC-06)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent d7baccd8f0
commit 4b9e1ff4ca
@@ -77,6 +77,81 @@ Queries containing `"you are a direct and concise assistant"` (a system-injected
---
## Routing Contract
This section is normative. Any reimplementation of the classifier or the graph must satisfy all rules below. Rules are ordered by priority — a higher-priority rule always wins.
### RC-01 — Fast-path override (priority: highest)
If the query contains a known platform-injected prefix, the system **MUST** classify it as `PLATFORM` without invoking any LLM.
```
∀ q : query
  contains(q, known_platform_prefix) → route(q) = PLATFORM
```
Current registered prefixes (see `_PLATFORM_PATTERNS` in `graph.py`):
- `"you are a direct and concise assistant"`
Adding a new prefix requires a code change to `_PLATFORM_PATTERNS` and a corresponding update to this list.
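
The fast-path check can be sketched as a plain substring test that runs before any model call. This is a minimal sketch, not the code in `graph.py`: it assumes `_PLATFORM_PATTERNS` is a tuple of lowercase substrings and that matching is case-insensitive, and the function name `fast_path_route` is hypothetical.

```python
from typing import Optional

# Assumed shape: lowercase substrings mirroring the registered prefix list.
_PLATFORM_PATTERNS = ("you are a direct and concise assistant",)


def fast_path_route(query: str) -> Optional[str]:
    """Return "PLATFORM" if the query contains a registered prefix, else None.

    No LLM is called here; None means the caller should fall through
    to the regular classifier.
    """
    normalized = query.lower()  # case-insensitive matching is an assumption
    if any(pattern in normalized for pattern in _PLATFORM_PATTERNS):
        return "PLATFORM"
    return None
```

Keeping the patterns in a single tuple means a new prefix is a one-line code change, which matches the update procedure described above.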
### RC-02 — Platform data signal (priority: high)
If the query contains any of the following signals, the classifier **MUST** output `PLATFORM` regardless of conversation history:
- Usage percentages (e.g. `"20%"` in the context of project/account usage)
- Account metrics or consumption figures
- Quota, limit, or billing data
This rule is enforced via `<platform_priority_rule>` in the classifier prompt. It cannot be overridden by history.
### RC-03 — Intent history scoping (priority: medium)
The classifier **MUST** use `classify_history` only to resolve ambiguous pronoun or deictic references (`"this"`, `"esto"`, `"lo anterior"`, `"that function"`). It **MUST NOT** use history to predict or bias the type of the current message.
```
classify(q, history) ≠ f(dominant_type(history))
classify(q, history) = f(intent(q), resolve_references(q, history))
```
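
The ADR does not say how this scoping is enforced. One possible approach, sketched here purely as an illustration, is to expose history to the classifier only when the query contains a deictic marker. The function name `history_for_classifier`, the marker list (taken from the examples in this rule), and the representation of history as a list of strings are all assumptions.

```python
import re
from typing import List

# Markers taken from the RC-03 examples; a real list would likely be longer.
_DEICTIC = re.compile(r"\b(this|that|esto|lo anterior)\b", re.IGNORECASE)


def history_for_classifier(query: str, history: List[str]) -> List[str]:
    """Expose history only when the query has a reference to resolve.

    With no deictic marker the classifier sees an empty history, so its
    output cannot be biased by the dominant type of earlier turns.
    """
    return list(history) if _DEICTIC.search(query) else []
```

A keyword gate like this is coarse (a word such as "this" is not always a reference), so it would complement prompt-level instructions rather than replace them.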
### RC-04 — RAG bypass (priority: medium)
Whether each query type runs Elasticsearch retrieval (RAG) or bypasses it:
| Type | RAG | Justification |
|---|---|---|
| `RETRIEVAL` | Yes | Requires documentation context |
| `CODE_GENERATION` | Yes | Requires syntax examples |
| `CONVERSATIONAL` | No | Reformulates prior answer already in context |
| `PLATFORM` | No | Data is injected via `extra_context`, not retrieved |
A `PLATFORM` or `CONVERSATIONAL` query that triggers a retrieval step is a contract violation.
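
The table can be encoded as a lookup that the graph consults before the retrieval step. This is a minimal sketch; the names `USES_RAG` and `should_retrieve` are hypothetical and do not come from the codebase.

```python
# Route types from the RC-04 table mapped to whether they use retrieval.
USES_RAG = {
    "RETRIEVAL": True,
    "CODE_GENERATION": True,
    "CONVERSATIONAL": False,
    "PLATFORM": False,
}


def should_retrieve(route: str) -> bool:
    """Return True if the route runs the Elasticsearch retrieval step."""
    # A KeyError on an unknown type is deliberate: a new type has to be
    # added to the table (and to the contract) before it can be routed.
    return USES_RAG[route]
```

Failing loudly on an unknown type avoids silently defaulting a new route into or out of retrieval.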
### RC-05 — Model assignment (priority: medium)
```
route(q) ∈ {RETRIEVAL, CODE_GENERATION}  → model = OLLAMA_MODEL_NAME
route(q) ∈ {CONVERSATIONAL, PLATFORM}    → model = OLLAMA_MODEL_NAME_CONVERSATIONAL
                                                   ?? OLLAMA_MODEL_NAME  # fallback if unset
```
Changing which types map to which model slot requires updating this contract.
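
The mapping and its fallback can be sketched as follows. This assumes `OLLAMA_MODEL_NAME` and `OLLAMA_MODEL_NAME_CONVERSATIONAL` are environment variables (the ADR does not say where they are defined) and that an empty value counts as unset; the function name `model_for` is hypothetical.

```python
import os
from typing import Mapping

_MAIN_SLOT = {"RETRIEVAL", "CODE_GENERATION"}
_CONVERSATIONAL_SLOT = {"CONVERSATIONAL", "PLATFORM"}


def model_for(route: str, env: Mapping[str, str] = os.environ) -> str:
    """Pick the model name for a route according to RC-05."""
    main_model = env["OLLAMA_MODEL_NAME"]  # assumed to be always set
    if route in _CONVERSATIONAL_SLOT:
        # `or` also covers an empty string, treated here as unset.
        return env.get("OLLAMA_MODEL_NAME_CONVERSATIONAL") or main_model
    if route in _MAIN_SLOT:
        return main_model
    raise ValueError(f"unknown route type: {route!r}")
```

Passing `env` as a parameter keeps the function testable without touching the real process environment.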
### RC-06 — History growth bound (priority: low)
The classifier's view of `classify_history` per session **MUST** be bounded: it reads at most the last 6 entries. The underlying store may grow without bound in memory, but the classifier input is always capped.
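
The cap amounts to slicing the stored history before it reaches the classifier. A minimal sketch, with the hypothetical names `MAX_CLASSIFIER_HISTORY` and `classifier_window`:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_CLASSIFIER_HISTORY = 6  # cap stated in RC-06


def classifier_window(history: Sequence[T]) -> List[T]:
    """Return at most the last MAX_CLASSIFIER_HISTORY entries, oldest first."""
    return list(history[-MAX_CLASSIFIER_HISTORY:])
```

The store itself is left untouched; only the copy handed to the classifier is truncated.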
### Contract violations to monitor
| Symptom | Violated rule |
|---|---|
| Platform query hits Elasticsearch | RC-04 |
| `qwen3:1.7b` used for a `PLATFORM` response | RC-05 |
| Platform prefix query triggers LLM classifier | RC-01 |
| Classifier output mirrors dominant history type | RC-03 |
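
Two of these symptoms can be detected mechanically from a per-request trace. The sketch below assumes such a trace exists; the `RequestTrace` record and its field names are hypothetical and do not correspond to anything in the codebase.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Hypothetical per-request record; field names are illustrative."""
    route: str
    had_platform_prefix: bool
    llm_classifier_called: bool
    retrieval_called: bool


def violated_rules(trace: RequestTrace) -> List[str]:
    """Return the IDs of the contract rules this request violated."""
    violations = []
    # RC-01: a platform-prefixed query must never reach the LLM classifier.
    if trace.had_platform_prefix and trace.llm_classifier_called:
        violations.append("RC-01")
    # RC-04: PLATFORM and CONVERSATIONAL must bypass retrieval.
    if trace.route in {"PLATFORM", "CONVERSATIONAL"} and trace.retrieval_called:
        violations.append("RC-04")
    return violations
```

RC-03 and RC-05 are left out because checking them needs information this record does not carry: the distribution of classifier outputs over a session and the configured model names, respectively.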
---
## Consequences
### Positive