
# PRD-0002: Editor Context Injection for VS Code Extension
**Date:** 2026-03-19
**Status:** Implemented
**Requested by:** Rafael Ruiz (CTO)
**Purpose:** Validate the VS Code extension with real users
**Related ADR:** ADR-0001 (gRPC interface), ADR-0002 (two-phase streaming)
---
## Problem
The Brunix Assistance Engine previously received only two inputs from the client: a `query` (the user's question) and a `session_id` (for conversation continuity). It had no awareness of what the user was looking at in their editor when they asked the question.
This created a fundamental limitation for a coding assistant: questions like "how do I handle the error here?" or "what does this function return?" could not be answered correctly without knowing what "here" and "this function" referred to. The assistant was forced to treat every question as a general AVAP documentation query, even when the user's intent was clearly anchored to specific code in their editor.
For the VS Code extension validation, the CEO needed to demonstrate that the assistant behaves as a genuine coding assistant — one that understands the user's current context — not just a documentation search tool.
---
## Solution
The gRPC contract has been extended to allow the VS Code extension to send four optional context fields alongside every query. These fields are transported in the standard OpenAI `user` field as a JSON string when using the HTTP proxy, and as dedicated proto fields when calling gRPC directly.
**Transport format via HTTP proxy (`/v1/chat/completions`):**
```json
{
  "model": "brunix",
  "messages": [{"role": "user", "content": "que hace este código?"}],
  "stream": true,
  "session_id": "uuid",
  "user": "{\"editor_content\":\"<base64>\",\"selected_text\":\"<base64>\",\"extra_context\":\"<base64>\",\"user_info\":{\"dev_id\":1,\"project_id\":2,\"org_id\":3}}"
}
```
**Fields:**
- **`editor_content`** (base64) — full content of the active file open in the editor. Gives the assistant awareness of the complete code the user is working on.
- **`selected_text`** (base64) — text currently selected in the editor, if any. The most precise signal of user intent — if the user has selected a block of code before asking a question, that block is almost certainly what the question is about.
- **`extra_context`** (base64) — free-form additional context (e.g., file path, language identifier, cursor position, open diagnostic errors). Extensible without requiring proto changes.
- **`user_info`** (JSON object) — client identity metadata: `dev_id`, `project_id`, `org_id`. Not base64 — sent as a JSON object nested within the `user` JSON string.
All four fields are optional. If none are provided, the assistant behaves exactly as it does today — full backward compatibility.
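For illustration, the client-side packing could look like the sketch below. This is not the actual extension code (the extension itself is out of view here); `_b64` and `build_user_field` are hypothetical helper names, but the field names and encoding follow the contract above:

```python
import base64
import json


def _b64(text: str) -> str:
    """UTF-8 encode then base64 encode, matching the contract above."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_user_field(editor_content: str, selected_text: str,
                     extra_context: str, user_info: dict) -> str:
    """Pack the four optional context fields into the OpenAI `user` string."""
    return json.dumps({
        "editor_content": _b64(editor_content),
        "selected_text": _b64(selected_text),
        "extra_context": _b64(extra_context),
        "user_info": user_info,  # plain JSON object, deliberately not base64
    })


payload = build_user_field(
    editor_content="addVar('x', 1)",
    selected_text="addVar",
    extra_context="file: demo.avap",
    user_info={"dev_id": 1, "project_id": 2, "org_id": 3},
)
```

Note that `user_info` stays a nested JSON object while the three text fields are base64-encoded, exactly as the field descriptions above specify.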
---
## User experience
**Scenario 1 — Question about selected code:**
The user selects a `try() / exception() / end()` block in their editor and asks "why is this not catching my error?". The assistant detects via the classifier that the question refers explicitly to the selected code, injects `selected_text` into the generation prompt, and answers specifically about that block — not about error handling in general.
**Scenario 2 — Question about the open file:**
The user has a full AVAP function open and asks "what HTTP status codes can this return?". The classifier detects the question refers to editor content, injects `editor_content` into the generation prompt, and reasons about the `_status` assignments in the function.
**Scenario 3 — General question (unchanged behaviour):**
The user asks "how does addVar work?" without selecting anything or referring to the editor. The classifier sets `use_editor_context: False`. The assistant behaves exactly as before — retrieval-augmented response from the AVAP knowledge base, no editor content injected.
---
## Scope
**In scope:**
- Add `editor_content`, `selected_text`, `extra_context`, `user_info` fields to `AgentRequest` in `brunix.proto`
- Decode base64 fields (`editor_content`, `selected_text`, `extra_context`) in `server.py` before propagating to graph state
- Parse `user_info` as opaque JSON string — available in state for future use, not yet consumed by the graph
- Parse the `user` field in `openai_proxy.py` as a JSON object containing all four context fields
- Propagate all fields through the server into the graph state (`AgentState`)
- Extend the classifier (`CLASSIFY_PROMPT_TEMPLATE`) to output two tokens: query type and editor context signal (`EDITOR` / `NO_EDITOR`)
- Set `use_editor_context: bool` in `AgentState` based on classifier output
- Use `selected_text` as the primary anchor for query reformulation only when `use_editor_context` is `True`
- Inject `selected_text` and `editor_content` into the generation prompt only when `use_editor_context` is `True`
- Fix reformulator language — queries must be rewritten in the original language, never translated
**Out of scope:**
- Changes to `EvaluateRAG` — the golden dataset does not include editor-context queries; this feature does not affect embedding or retrieval evaluation
- Consuming `user_info` fields (`dev_id`, `project_id`, `org_id`) in the graph — available in state for future routing or personalisation
- Evaluation of the feature impact via EvaluateRAG — a dedicated golden dataset with editor-context queries is required for that measurement; it is future work
---
## Technical design
### Proto changes (`brunix.proto`)
```protobuf
message AgentRequest {
  string query = 1;           // unchanged
  string session_id = 2;      // unchanged
  string editor_content = 3;  // base64-encoded full editor file content
  string selected_text = 4;   // base64-encoded currently selected text
  string extra_context = 5;   // base64-encoded free-form additional context
  string user_info = 6;       // JSON string: {"dev_id":…,"project_id":…,"org_id":…}
}
```
Fields 1 and 2 are unchanged. Fields 3–6 are optional — absent fields default to empty string in proto3. All existing clients remain compatible without modification.
### AgentState changes (`state.py`)
```python
from typing import Annotated, TypedDict

from langgraph.graph.message import add_messages


class AgentState(TypedDict):
    # Core fields
    messages: Annotated[list, add_messages]
    session_id: str
    query_type: str
    reformulated_query: str
    context: str

    # Editor context fields (PRD-0002)
    editor_content: str   # decoded from base64
    selected_text: str    # decoded from base64
    extra_context: str    # decoded from base64
    user_info: str        # JSON string — {"dev_id":…,"project_id":…,"org_id":…}

    # Set by classifier — True only when user explicitly refers to editor code
    use_editor_context: bool
```
### Server changes (`server.py`)
Base64 decoding applied to `editor_content`, `selected_text` and `extra_context` before propagation. `user_info` passed as-is (plain JSON string). Helper function:
```python
import base64


def _decode_b64(value: str) -> str:
    """Decode a base64 field; empty or undecodable input yields ""."""
    try:
        return base64.b64decode(value).decode("utf-8") if value else ""
    except Exception:
        logger.warning("[base64] decode failed")
        return ""
```
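A round-trip sketch of how the decoded fields could be mapped into the graph state (`build_context_state` is a hypothetical helper for illustration; the actual wiring in `server.py` may differ, but the state keys follow `AgentState`):

```python
import base64
import logging

logger = logging.getLogger(__name__)


def _decode_b64(value: str) -> str:
    """Decode a base64 field; empty or undecodable input yields ""."""
    try:
        return base64.b64decode(value).decode("utf-8") if value else ""
    except Exception:
        logger.warning("[base64] decode failed")
        return ""


def build_context_state(request) -> dict:
    """Hypothetical helper: map request fields onto AgentState keys."""
    return {
        "editor_content": _decode_b64(request.editor_content),
        "selected_text": _decode_b64(request.selected_text),
        "extra_context": _decode_b64(request.extra_context),
        "user_info": request.user_info or "",  # plain JSON string, not base64
    }
```

The `except Exception` guard means a malformed client payload degrades to an empty field rather than failing the request, which preserves the backward-compatibility guarantee above.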
### Proxy changes (`openai_proxy.py`)
The `user` field is parsed as a JSON object. `_parse_editor_context` extracts all four fields:
```python
import json
from typing import Optional


def _parse_editor_context(user: Optional[str]) -> tuple[str, str, str, str]:
    if not user:
        return "", "", "", ""
    try:
        ctx = json.loads(user)
        if isinstance(ctx, dict):
            return (
                ctx.get("editor_content", "") or "",
                ctx.get("selected_text", "") or "",
                ctx.get("extra_context", "") or "",
                json.dumps(ctx.get("user_info", {})),
            )
    except (json.JSONDecodeError, TypeError):
        pass
    return "", "", "", ""
```
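Because the parser swallows parse failures, a plain-string `user` (e.g. an OpenAI-style end-user id) degrades to empty context rather than raising — the behaviour the acceptance criteria below require. A self-contained check, with the function reproduced so the example runs on its own:

```python
import json
from typing import Optional


def _parse_editor_context(user: Optional[str]) -> tuple[str, str, str, str]:
    # Reproduced from above so this example is self-contained.
    if not user:
        return "", "", "", ""
    try:
        ctx = json.loads(user)
        if isinstance(ctx, dict):
            return (
                ctx.get("editor_content", "") or "",
                ctx.get("selected_text", "") or "",
                ctx.get("extra_context", "") or "",
                json.dumps(ctx.get("user_info", {})),
            )
    except (json.JSONDecodeError, TypeError):
        pass
    return "", "", "", ""


# Plain string or absent `user`: no error, no context.
plain = _parse_editor_context("some-end-user-id")
absent = _parse_editor_context(None)
# Well-formed context object: fields extracted, user_info re-serialised.
full = _parse_editor_context('{"selected_text": "cGFzcw==", "user_info": {"dev_id": 1}}')
```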
`session_id` is now read exclusively from the dedicated `session_id` field — no longer falls back to `user`.
### Classifier changes (`prompts.py` + `graph.py`)
`CLASSIFY_PROMPT_TEMPLATE` now outputs two tokens separated by a space:
- First token: `RETRIEVAL`, `CODE_GENERATION`, or `CONVERSATIONAL`
- Second token: `EDITOR` or `NO_EDITOR`
`EDITOR` is set only when the user message explicitly refers to the editor code or selected text using expressions like "this code", "este codigo", "fix this", "que hace esto", "explain this", etc.
`_parse_query_type` returns `tuple[str, bool]`. Both `classify` nodes (in `build_graph` and `build_prepare_graph`) set `use_editor_context` in the state.
### Reformulator changes (`prompts.py` + `graph.py`)
Two fixes applied:
**Mode-aware reformulation:** The reformulator receives `[MODE: X]` prepended to the query. In `RETRIEVAL` mode it compresses the query without expanding AVAP commands. In `CODE_GENERATION` mode it applies the command mapping. In `CONVERSATIONAL` mode it returns the query as-is.
**Language preservation:** The reformulator never translates. Queries in Spanish are rewritten in Spanish. Queries in English are rewritten in English. This fix was required because the BM25 retrieval is lexical — a Spanish chunk ("AVAP es un DSL...") cannot be found by an English query ("AVAP stand for").
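The mode dispatch can be sketched as follows. `reformulate` and `rewrite_with_llm` are hypothetical names for illustration; the real rewriting (including compression, command mapping, and language preservation) lives in the LLM prompt, not in Python:

```python
def reformulate(query: str, query_type: str, rewrite_with_llm) -> str:
    """Dispatch sketch: conversational queries pass through; others get a mode tag."""
    if query_type == "CONVERSATIONAL":
        return query  # returned as-is, never sent to the reformulator
    return rewrite_with_llm(f"[MODE: {query_type}] {query}")
```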
### Generator changes (`graph.py`)
`_build_generation_prompt` injects `editor_content` and `selected_text` into the prompt only when `use_editor_context` is `True`. Priority hierarchy when injected:
1. `selected_text` — highest priority, most specific signal
2. `editor_content` — file-level context
3. RAG-retrieved chunks — knowledge base context
4. `extra_context` — free-form additional context
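The hierarchy above could be assembled roughly as follows. The section headers and function name are illustrative, not the actual `_build_generation_prompt` wording; only the ordering and the `use_editor_context` gate reflect the design:

```python
def build_generation_context(state: dict) -> str:
    """Assemble prompt context sections in priority order (sketch)."""
    if not state.get("use_editor_context"):
        return state.get("context", "")  # RAG chunks only: pre-PRD-0002 behaviour
    sections = []
    if state.get("selected_text"):
        sections.append("## Selected code (primary focus)\n" + state["selected_text"])
    if state.get("editor_content"):
        sections.append("## Active file\n" + state["editor_content"])
    if state.get("context"):
        sections.append("## Knowledge base\n" + state["context"])
    if state.get("extra_context"):
        sections.append("## Additional context\n" + state["extra_context"])
    return "\n\n".join(sections)
```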
---
## Validation
**Acceptance criteria:**
- A query explicitly referring to selected code (`selected_text` non-empty, classifier returns `EDITOR`) produces a response grounded in that specific code.
- A general query (`use_editor_context: False`) produces a response identical in quality to the pre-PRD-0002 system — no editor content injected, no regression.
- A query in Spanish retrieves Spanish chunks correctly — the reformulator preserves the language.
- Existing gRPC clients that do not send the new fields work without modification.
- The `user` field in the HTTP proxy can be a plain string or absent — no error raised.
**Future measurement:**
Once the extension is validated and the embedding model is selected (ADR-0005), a dedicated golden dataset of editor-context queries should be built and added to `EvaluateRAG` to measure the quantitative impact of this feature.
---
## Impact on parallel workstreams
**Embedding evaluation (ADR-0005 / MrHouston):** No impact. The BEIR benchmarks and EvaluateRAG runs for embedding model selection use the existing golden dataset, which contains no editor-context queries. The two workstreams are independent.
**RAG architecture evolution:** This feature is additive. It does not change the retrieval infrastructure, the Elasticsearch index, or the embedding pipeline. It extends the graph with additional input signals that improve response quality for editor-anchored queries.