# Changelog
All notable changes to the **Brunix Assistance Engine** will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---
## [1.5.1] - 2026-03-18
### Added
- DOCS: Created `docs/ARCHITECTURE.md` — full technical architecture reference covering component inventory, request lifecycle, LangGraph workflow, hybrid RAG pipeline, streaming design, evaluation pipeline, infrastructure layout, session memory, observability, and security boundaries.
- DOCS: Created `docs/API_REFERENCE.md` — complete gRPC API contract documentation with method descriptions, message type tables, error handling, and `grpcurl` client examples for all three RPCs (`AskAgent`, `AskAgentStream`, `EvaluateRAG`).
- DOCS: Created `docs/RUNBOOK.md` — operational playbook with health checks, startup/shutdown procedures, tunnel management, and incident playbooks for all known failure modes.
- DOCS: Created `SECURITY.md` — security policy covering transport security, authentication, secrets management, container security, data privacy, known limitations table, and vulnerability reporting process.
- DOCS: Created `docs/AVAP_CHUNKER_CONFIG.md` — full reference for `avap_config.json`: lexer fields, all 4 block definitions with regex breakdown, all 10 statement categories with ordering rationale, all 14 semantic tags with detection patterns, a worked example showing chunks produced from real AVAP code, and a step-by-step guide for adding new constructs.
### Changed
- DOCS: Fully rewrote `README.md` project structure tree — now reflects all files accurately including `openai_proxy.py`, `entrypoint.sh`, `golden_dataset.json`, `SECURITY.md`, `docs/ARCHITECTURE.md`, `docs/API_REFERENCE.md`, `docs/RUNBOOK.md`, `docs/adr/`, `avap_chunker.py`, `avap_config.json`, `ingestion/chunks.jsonl`, and `src/config.py`.
- DOCS: Added `Knowledge Base Ingestion` section to `README.md` documenting both ingestion pipelines in full: Pipeline A (Chonkie — `elasticsearch_ingestion.py`) with flow diagram, CLI usage, and chunk field table; Pipeline B (AVAP Native — `avap_chunker.py` + `avap_ingestor.py`) with flow diagram, chunk type table, semantic tags reference, and ingestor env vars.
- DOCS: Replaced minimal `Testing & Debugging` section with complete documentation of all three gRPC methods (`AskAgent`, `AskAgentStream`, `EvaluateRAG`) including expected responses, multi-turn example, and verdict thresholds.
- DOCS: Added `HTTP Proxy` section documenting all 7 HTTP endpoints (4 OpenAI + 3 Ollama), streaming vs non-streaming routing, `session_id` extension, and proxy env vars table.
- DOCS: Fixed `API Contract (Protobuf)` section — corrected `grpc_tools.protoc` paths and added reference to `docs/API_REFERENCE.md`.
- DOCS: Fixed remaining stale reference to `scripts/generate_mbpp_avap.py` in Dataset Generation section.
- DOCS: Added Documentation Index table to `README.md` linking all documentation files.
- DOCS: Updated `CONTRIBUTING.md` — added Section 9 (Architecture Decision Records) and updated PR checklist and doc policy table.
- ENV: Added missing variable documentation to `README.md`: `ELASTICSEARCH_USER`, `ELASTICSEARCH_PASSWORD`, `ELASTICSEARCH_API_KEY`, `ANTHROPIC_API_KEY`, `ANTHROPIC_MODEL`.
---
## [1.5.0] - 2026-03-12
### Added
- IMPLEMENTED:
  - `scripts/pipelines/flows/translate_mbpp.py`: pipeline to generate a synthetic dataset from the MBPP dataset.
  - `scripts/tasks/prompts.py`: module containing prompts for pipelines.
  - `scripts/tasks/chunk.py`: module containing functions related to chunk management.
  - `synthetic_datasets`: folder containing generated synthetic datasets.
  - `src/config.py`: environment variables configuration file.
### Changed
- REFACTORED: `scripts/pipelines/flows/elasticsearch_ingestion.py` now uses `docs/LRM` or `docs/samples` documents instead of pre-chunked files.
- RENAMED: `docs/AVAP Language: Core Commands & Functional Specification` to `docs/avap_language_github_docs`.
- REMOVED: `Makefile` file.
- REMOVED: `scripts/start-tunnels.sh` script.
- DEPENDENCIES: `requirements.txt` updated with new libraries required by the new modules.
- MOVED: `scripts/generate_mbap.py` into `scripts/flows/generate_mbap.py`.
## [1.4.0] - 2026-03-10
### Added
- **Dataset Generation Suite**: Added `scripts/generate_mbpp_avap.py` to automate the creation of synthetic AVAP training data.
- **MBPP-style Benchmarking**: Support for generating structured JSON datasets with code solutions and Python-based validation tests (`test_list`).
- **LRM Integration**: The generator now performs grounded synthesis using the `avap.md` Language Reference Manual.
- **Anthropic Claude 3.5 Sonnet Integration**: Orchestration logic for high-fidelity code generation via API.
### Changed
- **README.md**: Added comprehensive documentation for the Evaluation & Dataset Generation pipeline.
- **Project Structure**: Integrated `evaluation/` directory for synthetic dataset storage.
### Security
- Added explicit policy to avoid committing real Anthropic API keys, enforcing the use of environment variables.
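
A minimal sketch of what this policy implies for client code, assuming the key is exposed through the `ANTHROPIC_API_KEY` variable documented in `README.md`. The helper name `load_anthropic_api_key` is hypothetical and is not taken from the repository:

```python
import os


def load_anthropic_api_key(env=None):
    """Return the Anthropic API key from the environment.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise in tests. The function name is illustrative only.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        # Fail fast instead of falling back to a hardcoded value.
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it in the environment "
            "instead of committing a key to the repository."
        )
    return key
```

Failing at startup when the variable is missing keeps a placeholder or committed key from ever being used as a silent fallback.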
## [1.3.0] - 2026-03-05
### Added
- IMPLEMENTED:
  - `Docker/src/utils/emb_factory`: factory modules created for embedding model generation.
  - `Docker/src/utils/llm_factory`: factory modules created for LLM generation.
  - `Docker/src/graph.py`: workflow graph orchestration module added.
  - `Docker/src/prompts.py`: centralized prompt definitions added.
  - `Docker/src/state.py`: shared state management module added.
  - `scripts/pipelines/flows/elasticsearch_ingestion.py`: pipeline to populate the Elasticsearch vector database.
  - `ingestion/docs`: folder containing all chunked AVAP documents.
### Changed
- REFACTORED: `server.py` updated to integrate the new graph/state/prompt and utils-based architecture.
- REFACTORED: `docker-compose.yaml` now uses fully parameterized environment variables instead of hardcoded service URLs and credentials.
- DEPENDENCIES: `requirements.txt` updated with new libraries required by the new modules.
## [1.2.0] - 2026-03-03
### Added
- GOVERNANCE: Introduced `CONTRIBUTING.md` as the single source of truth for all contribution standards, covering GitFlow, infrastructure policy, repository standards, environment variables, changelog, documentation, and incident reporting.
- GOVERNANCE: Added `.github/pull_request_template.md` enforcing a mandatory structured checklist on every PR — including explicit sign-off on environment variables, changelog, and documentation.
- DOCS: Added Environment Variables reference table to `README.md`. All variables must be registered here. PRs introducing undocumented variables will be rejected.
- DOCS: Updated project structure map in `README.md` to reflect new governance files.
### Changed
- PROCESS: Pull Requests that introduce new environment variables without documentation, omit required changelog entries, or skip required documentation updates are now formally non-mergeable per `CONTRIBUTING.md`.
---
## [1.1.0] - 2026-02-16
### Added
- IMPLEMENTED: Strict repository structure enforcement to separate development environment from production runtime.
- SECURITY: Added `.dockerignore` to prevent leaking sensitive source files and local configurations into the container.
### Changed
- REFACTORED: Dockerfile build logic to optimize build context and reduce image footprint.
- ARCHITECTURE: Moved application entry point to `/app` and eliminated the redundant root `/workspace` directory for enhanced security.
### Fixed
- RESOLVED: Non-production files were being bundled into the Docker image. Excluding them improves deployment speed and container isolation.
---
## [1.0.0] - 2026-02-09
### Added
- **System Architecture:** Implementation of the triple-layer stack (Engine, Vector DB, Observability).
- **Core Engine:** Deployment of the `brunix-assistance-engine` using **Python 3.11**, **LangChain**, and **LangGraph** for agentic workflows.
- **Communication Layer:** Established **gRPC** as the primary high-performance interface (Port 50051/50052).
- **Knowledge Base:** Integration of **Elasticsearch 8.12** (`brunix-vector-db`) for AVAP technology RAG support.
- **Observability Framework:** Deployment of **Langfuse** and **PostgreSQL** for full trace audit and cost management.
- **Security:** Initial network isolation within Docker (`avap-network`) and production-ready secret management design.