# Brunix Assistance Engine

The **Brunix Assistance Engine** is a high-performance, gRPC-powered AI orchestration service. It serves as the core intelligence layer for the Brunix ecosystem, integrating advanced RAG (Retrieval-Augmented Generation) capabilities with real-time observability.

This project is a strategic joint development:

* **[101OBEX Corp](https://101obex.com):** Infrastructure, System Architecture, and the proprietary **AVAP Technology** stack.
* **[MrHouston](https://mrhouston.net):** Advanced LLM Fine-tuning, Model Training, and Prompt Engineering.

---

## System Architecture (Hybrid Dev Mode)

The engine runs locally for development but connects to the production-grade infrastructure in the **Vultr Cloud (Devaron Cluster)** via secure `kubectl` tunnels.

```mermaid
graph TD
    subgraph Local_Workstation [Developer]
        BE[Brunix Assistance Engine - Docker]
        KT[Kubectl Port-Forward Tunnels]
    end

    subgraph Vultr_K8s_Cluster [Production - Devaron Cluster]
        OL[Ollama Light Service - LLM]
        EDB[(Elasticsearch Vector DB)]
        PG[(Postgres - Langfuse Data)]
        LF[Langfuse UI - Web]
    end

    BE -- localhost:11434 --> KT
    BE -- localhost:9200 --> KT
    BE -- localhost:5432 --> KT

    KT -- Secure Link --> OL
    KT -- Secure Link --> EDB
    KT -- Secure Link --> PG

    Developer -- Browser --> LF
```

---

## Project Structure

```text
├── README.md               # System documentation & dev guide
├── changelog               # Version tracking and release history
├── pyproject.toml          # Python project configuration
├── docs/
│   ├── AVAP Language: ...  # AVAP DSL documentation
│   │   └── AVAP.md
│   ├── developer.avapfr... # Documents on the developer web page
│   ├── LRM/                # AVAP LRM documentation
│   │   └── avap.md
│   └── samples/            # AVAP code samples
├── Docker/
│   ├── protos/
│   │   └── brunix.proto    # Protocol Buffers: the source of truth for the API
│   ├── src/
│   │   ├── graph.py        # Workflow graph orchestration
│   │   ├── prompts.py      # Centralized prompt definitions
│   │   ├── server.py       # gRPC server & RAG orchestration
│   │   ├── state.py        # Shared state management
│   │   └── utils/          # Utility modules
│   ├── Dockerfile          # Container definition for the engine
│   ├── docker-compose.yaml # Local orchestration for the dev environment
│   ├── requirements.txt    # Python dependencies for Docker
│   └── .dockerignore       # Docker ignore rules
├── scripts/
│   └── pipelines/
│       ├── flows/          # Processing pipelines
│       └── tasks/          # Modules used by the flows
└── src/
    ├── config.py           # Environment variable configuration
    └── utils/
        ├── emb_factory.py  # Embedding model factory
        └── llm_factory.py  # LLM model factory
```

---

## Data Flow & RAG Orchestration

The following diagram illustrates the sequence of a single `AskAgent` request, detailing the retrieval and generation phases through the secure tunnel.

```mermaid
sequenceDiagram
    participant U as External Client (gRPCurl/App)
    participant E as Brunix Engine (Local Docker)
    participant T as Kubectl Tunnel
    participant V as Vector DB (Vultr)
    participant O as Ollama Light (Vultr)

    U->>E: AskAgent(query, session_id)
    Note over E: Start Langfuse Trace

    E->>T: Search Context (Embeddings)
    T->>V: Query Index [avap_manuals]
    V-->>T: Return Relevant Chunks
    T-->>E: Contextual Data

    E->>T: Generate Completion (Prompt + Context)
    T->>O: Stream Tokens (qwen2.5:1.5b)

    loop Token Streaming
        O-->>T: Token
        T-->>E: Token
        E-->>U: gRPC Stream Response {text, avap_code}
    end

    Note over E: Close Langfuse Trace
```

---

## Development Setup

### 1. Prerequisites

* **Docker & Docker Compose**
* **gRPCurl** (`brew install grpcurl`)
* **Access Credentials:** Ensure the file `./ivar.yaml` (Kubeconfig) is present in the root directory.

### 2. Observability Setup (Langfuse)

The engine uses Langfuse for end-to-end tracing and performance monitoring.

1. Access the dashboard: **http://45.77.119.180**
2. Create a project and generate API keys in **Settings**.
3. Configure your local `.env` file using the reference table below.

### 3. Environment Variables Reference

> **Policy:** Every environment variable used by the engine must be documented in this table. Any PR that introduces a new variable without a corresponding entry here will be rejected. See [CONTRIBUTING.md](./CONTRIBUTING.md#5-environment-variables-policy) for full details.

Create a `.env` file in the project root with the following variables:

```env
PYTHONPATH=${PYTHONPATH}:/home/...
ELASTICSEARCH_URL=http://host.docker.internal:9200
ELASTICSEARCH_LOCAL_URL=http://localhost:9200
ELASTICSEARCH_INDEX=avap-docs-test
POSTGRES_URL=postgresql://postgres:postgres@localhost:5432/langfuse
LANGFUSE_HOST=http://45.77.119.180
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
OLLAMA_URL=http://host.docker.internal:11434
OLLAMA_LOCAL_URL=http://localhost:11434
OLLAMA_MODEL_NAME=qwen2.5:1.5b
OLLAMA_EMB_MODEL_NAME=qwen3-0.6B-emb:latest
HF_TOKEN=hf_...
HF_EMB_MODEL_NAME=Qwen/Qwen3-Embedding-0.6B
```

| Variable | Required | Description | Example |
|---|---|---|---|
| `PYTHONPATH` | No | Path pointing to the project root | `${PYTHONPATH}:/home/...` |
| `ELASTICSEARCH_URL` | Yes | Elasticsearch endpoint for vector/context retrieval from inside Docker | `http://host.docker.internal:9200` |
| `ELASTICSEARCH_LOCAL_URL` | Yes | Elasticsearch endpoint for vector/context retrieval when running locally | `http://localhost:9200` |
| `ELASTICSEARCH_INDEX` | Yes | Elasticsearch index name used by the engine | `avap-docs-test` |
| `POSTGRES_URL` | Yes | PostgreSQL connection string used by the service | `postgresql://postgres:postgres@localhost:5432/langfuse` |
| `LANGFUSE_HOST` | Yes | Langfuse server endpoint (Devaron Cluster) | `http://45.77.119.180` |
| `LANGFUSE_PUBLIC_KEY` | Yes | Langfuse project public key for tracing and observability | `pk-lf-...` |
| `LANGFUSE_SECRET_KEY` | Yes | Langfuse project secret key | `sk-lf-...` |
| `OLLAMA_URL` | Yes | Ollama endpoint for text generation/embeddings from inside Docker | `http://host.docker.internal:11434` |
| `OLLAMA_LOCAL_URL` | Yes | Ollama endpoint for text generation/embeddings when running locally | `http://localhost:11434` |
| `OLLAMA_MODEL_NAME` | Yes | Ollama model name for generation | `qwen2.5:1.5b` |
| `OLLAMA_EMB_MODEL_NAME` | Yes | Ollama embeddings model name | `qwen3-0.6B-emb:latest` |
| `HF_TOKEN` | Yes | Hugging Face secret token | `hf_...` |
| `HF_EMB_MODEL_NAME` | Yes | Hugging Face embeddings model name | `Qwen/Qwen3-Embedding-0.6B` |

> Never commit real secret values. Use placeholder values when sharing configuration examples.
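
As a minimal sketch, this is how the engine's configuration module (`src/config.py`) might consume these variables. The variable names come from the table above; the helper itself is illustrative, not the actual implementation:

```python
import os

def require_env(name: str) -> str:
    """Return a required environment variable or fail fast with a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Optional variables can fall back to sensible local defaults.
ELASTICSEARCH_LOCAL_URL = os.environ.get("ELASTICSEARCH_LOCAL_URL", "http://localhost:9200")
OLLAMA_MODEL_NAME = os.environ.get("OLLAMA_MODEL_NAME", "qwen2.5:1.5b")
```

Failing fast on missing required variables surfaces a misconfigured `.env` at startup instead of mid-request.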

### 4. Infrastructure Tunnels

Open a terminal and establish the connection to the Devaron Cluster:

```bash
# 1. AI Model Tunnel (Ollama)
kubectl port-forward --address 0.0.0.0 svc/ollama-light-service 11434:11434 -n brunix --kubeconfig ./kubernetes/kubeconfig.yaml &

# 2. Knowledge Base Tunnel (Elasticsearch)
kubectl port-forward --address 0.0.0.0 svc/brunix-vector-db 9200:9200 -n brunix --kubeconfig ./kubernetes/kubeconfig.yaml &

# 3. Observability DB Tunnel (PostgreSQL)
kubectl port-forward --address 0.0.0.0 svc/brunix-postgres 5432:5432 -n brunix --kubeconfig ./kubernetes/kubeconfig.yaml &
```

### 5. Launch the Engine

```bash
docker-compose up -d --build
```

---

## Testing & Debugging

The service is exposed on port `50052` with **gRPC Reflection** enabled.

### Streaming Query Example

```bash
grpcurl -plaintext \
  -d '{"query": "Hola Brunix, ¿qué es AVAP?", "session_id": "dev-test-123"}' \
  localhost:50052 \
  brunix.AssistanceEngine/AskAgent
```
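
The same call can be sketched in Python once the stubs are generated from `protos/brunix.proto`. The service and method names come from this README; the stub module and request message names below are assumptions, so check the generated code for the real identifiers:

```python
def stream_answer(stub, request) -> str:
    """Consume the server-streaming AskAgent response and join the text chunks."""
    return "".join(chunk.text for chunk in stub.AskAgent(request))

def main() -> None:
    # Requires `pip install grpcio` and stubs generated with grpc_tools.protoc.
    # Module and message names are assumptions based on the proto file name.
    import grpc
    import brunix_pb2
    import brunix_pb2_grpc

    channel = grpc.insecure_channel("localhost:50052")
    stub = brunix_pb2_grpc.AssistanceEngineStub(channel)
    request = brunix_pb2.AskRequest(query="Hola Brunix, ¿qué es AVAP?",
                                    session_id="dev-test-123")
    print(stream_answer(stub, request))
```

Keeping the stream-consuming logic in a separate function makes it easy to unit-test against a fake stub without a live server.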

---

## API Contract (Protobuf)

To update the communication interface, modify `protos/brunix.proto` and re-generate the stubs:

```bash
python -m grpc_tools.protoc -I./protos --python_out=./src --grpc_python_out=./src ./protos/brunix.proto
```
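
For orientation, a minimal contract consistent with the calls shown in this README might look like the sketch below. The actual `Docker/protos/brunix.proto` is the source of truth; the message names here are assumptions (the service, method, and field names come from the examples and diagrams above):

```protobuf
syntax = "proto3";

package brunix;

service AssistanceEngine {
  // Server-streaming call, matching the AskAgent flow in the diagrams above.
  rpc AskAgent (AskRequest) returns (stream AskReply);
}

message AskRequest {
  string query = 1;
  string session_id = 2;
}

message AskReply {
  string text = 1;
  string avap_code = 2;
}
```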

---

## Dataset Generation & Evaluation

The engine includes a specialized benchmarking suite to evaluate the model's proficiency in **AVAP syntax**. This is achieved through a synthetic data generator that creates problems in the MBPP (Mostly Basic Python Problems) style, but tailored for the AVAP Language Reference Manual (LRM).

### 1. Synthetic Data Generator

The script `scripts/generate_mbpp_avap.py` leverages Claude 3.5 Sonnet to produce high-quality, executable code examples and validation tests.

**Key Features:**

* **LRM Grounding:** Uses the provided `avap.md` as the source of truth for syntax and logic.
* **Validation Logic:** Generates a `test_list` with Python regex assertions to verify the state of the AVAP stack after execution.
* **Balanced Categories:** Covers 14 domains including ORM, Concurrency (`go/gather`), HTTP handling, and Cryptography.

### 2. Usage

Ensure you have the `anthropic` library installed and your API key configured:

```bash
pip install anthropic
export ANTHROPIC_API_KEY="your-sk-ant-key"
```

Run the generator, specifying the path to your LRM and the desired output:

```bash
python scripts/generate_mbpp_avap.py \
  --lrm ingestion/docs/avap.md \
  --output evaluation/mbpp_avap.json \
  --problems 300
```

### 3. Dataset Schema

The generated JSON follows this structure:

| Field | Type | Description |
| :--- | :--- | :--- |
| `task_id` | Integer | Unique identifier for the benchmark. |
| `text` | String | Natural language description of the problem (Spanish). |
| `code` | String | The reference AVAP implementation. |
| `test_list` | Array | Python `re.match` expressions to validate execution results. |
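
As a sketch, checking an execution result against a task's `test_list` could look like this. The entry below is invented purely for illustration (real entries come from the generated JSON), and the field names follow the schema above:

```python
import re

# Invented example entry following the schema table above.
entry = {
    "task_id": 1,
    "text": "Suma dos números y deja el resultado en la pila.",
    "code": "...",  # reference AVAP implementation (elided)
    "test_list": [r"^5$"],  # regex applied to the observed stack state
}

def validate(result: str, patterns: list[str]) -> bool:
    """True when the execution result matches every regex in test_list."""
    return all(re.match(p, result) for p in patterns)
```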

### 4. Integration in RAG

These generated examples are used to:

1. **Fine-tune** the local models (`qwen2.5:1.5b`) or others via the MrHouston pipeline.
2. **Evaluate** the "Zero-Shot" performance of the engine before deployment.
3. **Provide Few-Shot examples** in the RAG prompt orchestration (`src/prompts.py`).
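
For illustration, wiring dataset entries into a few-shot block might look like the hypothetical helper below; the real prompt assembly lives in `src/prompts.py`:

```python
def build_few_shot_block(entries: list[dict]) -> str:
    """Format (problem, solution) pairs for injection into the RAG prompt."""
    sections = [
        f"### Problem\n{e['text']}\n### AVAP Solution\n{e['code']}"
        for e in entries
    ]
    return "\n\n".join(sections)
```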

---

## Repository Standards & Architecture

### Docker & Build Context

To maintain production-grade security and image efficiency, this project enforces a strict separation between development files and the production runtime:

* **Production Root:** All executable code must reside in the `/app` directory within the container.
* **Exclusions:** The root `/workspace` directory is deprecated. No development artifacts, local logs, or non-essential source files (e.g., `.git`, `tests/`, `docs/`) should be bundled into the final image.
* **Compliance:** All Pull Requests must verify that the `Dockerfile` context is optimized using the provided `.dockerignore`.

*Failure to comply with these architectural standards will result in PR rejection.*
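
A `.dockerignore` consistent with these exclusions might look like the following. This is an illustrative fragment, not the canonical file (which lives in `Docker/.dockerignore`); the `.env` entry follows the secrets policy above:

```text
.git
docs/
tests/
*.log
.env
```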

For the full set of contribution standards, see [CONTRIBUTING.md](./CONTRIBUTING.md).

---

## Security & Intellectual Property

* **Data Privacy:** All LLM processing and vector searches are conducted within a private Kubernetes environment.
* **Proprietary Technology:** This repository contains the **AVAP Technology** stack (101OBEX) and specialized training logic (MrHouston). Unauthorized distribution is prohibited.

---