Contributing to Brunix Assistance Engine

This document is the single source of truth for all contribution standards in the Brunix Assistance Engine repository. All contributors — regardless of seniority or role — are expected to read, understand, and comply with these guidelines before opening any Pull Request.


Table of Contents

  1. Development Workflow (GitFlow)
  2. Infrastructure Standards
  3. Repository Standards
  4. Pull Request Requirements
  5. Ingestion Files Policy
  6. Environment Variables Policy
  7. Changelog Policy
  8. Documentation Policy
  9. Architecture Decision Records (ADRs)
  10. Product Requirements Documents (PRDs)
  11. Research & Experiments Policy
  12. Incident & Blockage Reporting

1. Development Workflow (GitFlow)

Branch Strategy

| Branch type | Naming convention | Purpose |
|---|---|---|
| Feature | `*-dev` | Active development; volatile, no CI validation |
| Main | `online` | Production-ready, fully validated |

  • Feature branches (*-dev) are volatile environments. No validation tests or infrastructure deployments are performed on these branches.
  • Official validation only occurs after a documented Pull Request is merged into online.
  • Developer responsibility: Code must be stable and functional against the authorized environment before a PR is opened. Do not use the PR review process as a debugging step.
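
The naming convention can be checked mechanically before a branch is pushed. The following is a minimal sketch, not part of the repository tooling; the function name `classify_branch` and the sample branch names are hypothetical.

```python
from fnmatch import fnmatchcase


def classify_branch(name):
    """Map a branch name to its type under the branch strategy described here."""
    if name == "online":
        return "main"
    # Feature branches follow the *-dev pattern (case-sensitive match).
    if fnmatchcase(name, "*-dev"):
        return "feature"
    return "unknown"


for branch in ("online", "retrieval-dev", "hotfix"):
    print(branch, "->", classify_branch(branch))
```

A branch reported as "unknown" matches neither convention and would probably need to be renamed before a PR is opened.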

2. Infrastructure Standards

The project provides a validated, shared environment (Devaron Cluster, Vultr) including Ollama, Elasticsearch, and PostgreSQL.

  • Authorized environment only. The use of parallel, unauthorized infrastructures — external EC2 instances, ad-hoc local setups, non-replicable environments — is strictly prohibited for official development.
  • No siloed environments. Isolated development creates technical debt and incompatibility risks that directly impact delivery timelines.
  • All infrastructure access must go through the kubectl port-forward tunnels documented in the README.

3. Repository Standards

IDE Agnosticism

The online branch must remain neutral to any individual's development environment. The following must not be committed under any circumstance:

  • .devcontainer/
  • .vscode/
  • Any local IDE or editor configuration files

The .gitignore automates exclusion of these artifacts. Ensure your local environment is fully decoupled from the production-ready source code.
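
As a complement to .gitignore, a contributor could scan the list of files in a change for the directories named above. This is a sketch under stated assumptions: the helper `find_forbidden_paths` and the sample paths are hypothetical, and only the two directories listed in this section are checked, not every possible editor configuration file.

```python
from pathlib import PurePosixPath

# Directories named in this section that must never be committed.
FORBIDDEN_DIRS = {".devcontainer", ".vscode"}


def find_forbidden_paths(paths):
    """Return the paths that sit inside a forbidden IDE directory."""
    flagged = []
    for path in paths:
        # Split into components so nested copies (e.g. tools/.vscode/) are caught too.
        parts = PurePosixPath(path).parts
        if any(part in FORBIDDEN_DIRS for part in parts):
            flagged.append(path)
    return flagged


# Hypothetical list of changed files, for illustration only.
changed = ["src/server.py", ".vscode/settings.json", "tools/.devcontainer/devcontainer.json"]
print(find_forbidden_paths(changed))
```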

Security & Least Privilege

  • Never use root as remoteUser in any shared dev environment configuration.
  • All configurations must comply with the Principle of Least Privilege.
  • Using root in shared environments introduces unacceptable supply chain risk.

Docker & Build Context

  • All executable code must reside in /app within the container.
  • The /workspace root directory is deprecated — do not reference it.
  • Every PR must verify that .dockerignore keeps the Docker build context minimal.

PRs that violate these architectural standards will be rejected without review.


4. Pull Request Requirements

A PR is not ready for review unless all applicable items in the following checklist are satisfied. Reviewers are authorized to close PRs that do not meet these standards and request resubmission.

PR Checklist

Code & Environment

  • Tested locally against the authorized Devaron Cluster (no unauthorized infrastructure used)
  • No IDE or environment configuration files committed (.vscode, .devcontainer, etc.)
  • No root user configurations introduced
  • Dockerfile and .dockerignore comply with build context standards

Ingestion Files (see Section 5)

  • No ingestion files were added or modified
  • New or modified ingestion files are committed to the repository under ingestion/ or data/

Environment Variables (see Section 6)

  • No new environment variables were introduced
  • New environment variables are documented in the .env reference table in README.md

Changelog (see Section 7)

  • No changelog entry required (internal refactor, comment/typo fix, zero behavioral change)
  • Changelog updated with correct version bump and date

Documentation (see Section 8)

  • No documentation update required (internal change, no impact on setup or API)
  • README.md or relevant docs updated to reflect this change
  • If a significant architectural decision was made, an ADR was created in docs/ADR/
  • If a new user-facing feature was introduced, a PRD was created in docs/product/
  • If an experiment was conducted, results were documented in research/

5. Ingestion Files Policy

All files used to populate the vector knowledge base — source documents, AVAP manuals, structured data, or ingestion scripts — must be committed to the repository.

Rules

  • Ingestion files must reside in a dedicated directory (e.g., ingestion/ or data/) within the repository.
  • Any PR that introduces new knowledge base content or modifies existing ingestion pipelines must include the corresponding source files.
  • Files containing sensitive content that cannot be committed in plain form must be flagged for discussion before proceeding. Encryption, redaction, or a separate private submodule are all valid solutions — committing to an external or local-only location is not.

Why this matters

The Elasticsearch vector index is only as reliable as the source material that feeds it. Ingestion files that exist only on a local machine or external location cannot be audited, rebuilt, or validated by the team. A knowledge base populated from untracked files is a non-reproducible dependency — and a risk to the entire RAG pipeline.


6. Environment Variables Policy

This is a critical requirement. Every environment variable introduced in a PR must be documented before the PR can be merged.

Rules

  • Any new variable added to the codebase (.env, docker-compose.yaml, server.py, or any config file) must be declared in the .env reference table in README.md.
  • The documentation must include: variable name, purpose, whether it is required or optional, and an example value.
  • Variables that contain secrets must use placeholder values (e.g., your-secret-key-here) — never commit real values.

Required format in README.md

| Variable | Required | Description | Example |
|---|---|---|---|
| `LANGFUSE_PUBLIC_KEY` | Yes | Langfuse project public key for tracing | `pk-lf-...` |
| `LANGFUSE_SECRET_KEY` | Yes | Langfuse project secret key | `sk-lf-...` |
| `LANGFUSE_HOST`       | Yes | Langfuse server endpoint | `http://45.77.119.180` |
| `NEW_VARIABLE`        | Yes | Description of what it does | `example-value` |
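
Because the reference table has a fixed shape, it can be compared against the variables a PR actually uses. The sketch below assumes the first cell of each row holds the variable name in backticks, as in the format above; the function names and the `used` list are hypothetical, and collecting the variables from the codebase is left out.

```python
import re

# Matches a table row whose first cell is a backticked name, e.g. | `LANGFUSE_HOST` | ...
ROW_PATTERN = re.compile(r"^\|\s*`([A-Z][A-Z0-9_]*)`\s*\|", re.MULTILINE)


def documented_variables(readme_text):
    """Return the set of variable names declared in the README reference table."""
    return set(ROW_PATTERN.findall(readme_text))


def undocumented_variables(used, readme_text):
    """Return the used variable names that the README table does not declare."""
    return sorted(set(used) - documented_variables(readme_text))


readme = """\
| Variable | Required | Description | Example |
|---|---|---|---|
| `LANGFUSE_PUBLIC_KEY` | Yes | Langfuse project public key for tracing | `pk-lf-...` |
| `LANGFUSE_HOST`       | Yes | Langfuse server endpoint | `http://localhost` |
"""
used = ["LANGFUSE_HOST", "NEW_VARIABLE"]  # hypothetical variables referenced by a PR
print(undocumented_variables(used, readme))
```

Any name the function returns would need a row in the table before the PR can be merged.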

Why this matters

An undocumented environment variable silently breaks the setup for every other developer on the team. It also makes the service non-reproducible, which is a direct violation of the infrastructure standards in Section 2. There are no exceptions to this policy.


7. Changelog Policy

The changelog file tracks all notable changes and follows Semantic Versioning.

When a changelog entry IS required

| Change type | Label to use |
|---|---|
| New feature or capability | Added |
| Change to existing behavior, API, or interface | Changed |
| Bug fix | Fixed |
| Security patch or security-related change | Security |
| Breaking change or deprecation | Deprecated / Removed |

When a changelog entry is NOT required

  • Typo or comment fixes only
  • Internal refactors with zero behavioral or interface change
  • Tooling/CI updates with no user-visible impact

If in doubt, add an entry.

Format

New entries go under [Unreleased] at the top of the file. When a PR merges, [Unreleased] is renamed to the new version with its date:

## [Unreleased]

### Added
- LABEL: Description of the new feature or capability.

### Changed
- LABEL: Description of what changed and the rationale.

### Fixed
- LABEL: Description of the bug resolved.

Use uppercase short labels for scanability: ENGINE:, API:, PROTO:, DOCKER:, INFRA:, SECURITY:, ENV:, CONFIG:, DOCS:, FEATURE:.
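
Entry labels can be linted before a PR is opened. The following is a minimal sketch that treats the labels listed above as a closed set, which is an assumption: the policy does not say whether other labels are allowed. The function name `invalid_entries` is hypothetical.

```python
import re

# Labels listed in this section; treating them as exhaustive is an assumption.
ALLOWED_LABELS = {
    "ENGINE", "API", "PROTO", "DOCKER", "INFRA",
    "SECURITY", "ENV", "CONFIG", "DOCS", "FEATURE",
}
# An entry looks like "- LABEL: Description".
ENTRY_PATTERN = re.compile(r"^- ([A-Z]+): \S")


def invalid_entries(changelog_text):
    """Return changelog bullet lines that lack a recognised uppercase label."""
    bad = []
    for line in changelog_text.splitlines():
        if not line.startswith("- "):
            continue  # headings and blank lines are not entries
        match = ENTRY_PATTERN.match(line)
        if match is None or match.group(1) not in ALLOWED_LABELS:
            bad.append(line)
    return bad


sample = """\
## [Unreleased]

### Added
- ENGINE: Description of the new feature or capability.
- added something without a label
"""
print(invalid_entries(sample))
```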


8. Documentation Policy

When documentation MUST be updated

Update README.md (or the relevant doc file) if the PR includes any of the following:

  • Changes to project structure (new files, directories, removed components)
  • Changes to setup, installation, or environment configuration
  • New or modified API endpoints or Protobuf definitions (brunix.proto)
  • New, modified, or removed environment variables
  • Changes to infrastructure tunnels or Kubernetes service names
  • New dependencies or updated dependency versions
  • Changes to security, access, or repository standards

When documentation is NOT required

  • Internal implementation changes with no impact on setup, usage, or API
  • Fixes that do not alter any documented behavior

Documentation files in this repository

| File | Purpose |
|---|---|
| `README.md` | Setup guide, env vars reference, quick start |
| `CONTRIBUTING.md` | Contribution standards (this file) |
| `SECURITY.md` | Security policy and vulnerability reporting |
| `docs/ARCHITECTURE.md` | Deep technical architecture reference |
| `docs/API_REFERENCE.md` | Complete gRPC API contract and examples |
| `docs/RUNBOOK.md` | Operational playbooks and incident response |
| `docs/AVAP_CHUNKER_CONFIG.md` | `avap_config.json` reference: blocks, statements, semantic tags |
| `docs/ADR/` | Architecture Decision Records |
| `docs/product/` | Product Requirements Documents |
| `research/` | Experiment results, benchmarks, datasets |

PRs that change user-facing behavior or setup without updating documentation will be rejected.


9. Architecture Decision Records (ADRs)

Architecture Decision Records document significant technical decisions — choices that have lasting consequences on the codebase, infrastructure, or development process.

When to write an ADR

Write an ADR when a PR introduces or changes:

  • A fundamental technology choice (communication protocol, storage backend, framework, model)
  • A design pattern that other components will follow
  • A deliberate trade-off with known consequences
  • A decision that future engineers might otherwise reverse without understanding the rationale

When NOT to write an ADR

  • Implementation details within a single module
  • Bug fixes
  • Dependency version bumps
  • Configuration changes
  • New user-facing features (use a PRD instead)

ADR format

ADRs live in docs/ADR/ and follow this naming convention:

ADR-XXXX-short-title.md

Where XXXX is a zero-padded sequential number (e.g., ADR-0006-new-decision.md).
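
The next free number can be derived from the files already in docs/ADR/. This sketch assumes four-digit numbers and lowercase, hyphen-separated titles; the title style is inferred from the example above rather than stated by the policy, and the file names in the demo are hypothetical.

```python
import re

# ADR-XXXX-short-title.md with a four-digit number and a lowercase kebab-case title (assumed).
ADR_NAME = re.compile(r"^ADR-(\d{4})-[a-z0-9]+(?:-[a-z0-9]+)*\.md$")


def next_adr_number(filenames):
    """Return the next zero-padded ADR number given the existing file names."""
    numbers = []
    for name in filenames:
        match = ADR_NAME.match(name)
        if match:
            numbers.append(int(match.group(1)))
    # Start at 0001 when no ADR exists yet.
    return f"{max(numbers, default=0) + 1:04d}"


existing = ["ADR-0001-grpc-interface.md", "ADR-0002-two-phase-streaming.md", "notes.md"]
print(next_adr_number(existing))
```

Files that do not match the pattern, such as notes.md here, are ignored rather than treated as errors.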

Each ADR must contain:

# ADR-XXXX: Title

**Date:** YYYY-MM-DD
**Status:** Proposed | Under Evaluation | Accepted | Deprecated | Superseded by ADR-YYYY
**Deciders:** Names or roles

## Context
What problem are we solving? What forces are at play?

## Decision
What did we decide?

## Rationale
Why this option over alternatives? Include a trade-off analysis.

## Consequences
What are the positive and negative results of this decision?

Existing ADRs

| ADR | Title | Status |
|---|---|---|
| ADR-0001 | gRPC as the Primary Communication Interface | Accepted |
| ADR-0002 | Two-Phase Streaming Design for AskAgentStream | Accepted |
| ADR-0003 | Hybrid Retrieval (BM25 + kNN) with RRF Fusion | Accepted |
| ADR-0004 | Claude as the RAGAS Evaluation Judge | Accepted |
| ADR-0005 | Embedding Model Selection — BGE-M3 vs Qwen3-Embedding-0.6B | Under Evaluation |

10. Product Requirements Documents (PRDs)

Product Requirements Documents capture user-facing features — what is being built, why it is needed, and how it will be validated. Every feature that modifies the public API, the gRPC contract, or the user experience of any client (VS Code extension, OpenAI-compatible proxy, etc.) requires a PRD before implementation begins.

When to write a PRD

Write a PRD when a PR introduces or changes:

  • A new capability visible to any external consumer (extension, API client, proxy)
  • A change to the gRPC contract (brunix.proto)
  • A change to the HTTP proxy endpoints or behavior
  • A feature requested by product or business stakeholders

When NOT to write a PRD

  • Internal architectural changes (use an ADR instead)
  • Bug fixes with no change in user-visible behavior
  • Infrastructure or tooling changes

PRD format

PRDs live in docs/product/ and follow this naming convention:

PRD-XXXX-short-title.md

Each PRD must contain:

# PRD-XXXX: Title

**Date:** YYYY-MM-DD
**Status:** Proposed | Implemented
**Requested by:** Name / role
**Related ADR:** ADR-XXXX (if applicable)

## Problem
What user or business problem does this solve?

## Solution
What are we building?

## Scope
What is in scope and explicitly out of scope?

## Technical design
Key implementation decisions.

## Validation
How do we know this works? Acceptance criteria.

## Impact on parallel workstreams
Does this affect any ongoing experiment or evaluation?

Existing PRDs

| PRD | Title | Status |
|---|---|---|
| PRD-0001 | OpenAI-Compatible HTTP Proxy | Implemented |
| PRD-0002 | Editor Context Injection for VS Code Extension | Proposed |

11. Research & Experiments Policy

All scientific experiments, benchmark results, and dataset evaluations conducted by the research team must be documented and committed to the repository under research/.

Rules

  • Every experiment must have a corresponding result file in research/ before any engineering decision based on that experiment is considered valid.
  • Benchmark scripts, evaluation notebooks, and raw results must be committed alongside a summary README that explains the methodology, datasets used, metrics, and conclusions.
  • Experiments that inform an ADR must be referenced from that ADR with a direct path to the result files.
  • The golden dataset used by EvaluateRAG (Docker/src/golden_dataset.json) is a production artifact. Any modification requires explicit approval from the CTO and a new baseline EvaluateRAG run before the change is merged.

Directory structure

research/
  embeddings/       ← embedding model benchmarks (BEIR, MTEB)
  experiments/      ← RAG architecture experiments
  datasets/         ← synthetic datasets and golden datasets
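
The summary README rule can be checked from a list of committed paths. The sketch below assumes each experiment lives in its own subdirectory (research/<area>/<experiment>/) and that the summary file is named README.md; neither detail is specified by this policy, and the paths in the demo are hypothetical.

```python
from pathlib import PurePosixPath


def experiments_missing_readme(paths):
    """Return experiment directories under research/ that have files but no README.md."""
    experiments = set()
    with_readme = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        # Only consider research/<area>/<experiment>/<file...> (assumed layout).
        if len(parts) < 4 or parts[0] != "research":
            continue
        experiment = "/".join(parts[:3])
        experiments.add(experiment)
        if len(parts) == 4 and parts[3] == "README.md":
            with_readme.add(experiment)
    return sorted(experiments - with_readme)


committed = [
    "research/embeddings/model-a/README.md",
    "research/embeddings/model-a/results.json",
    "research/experiments/fusion-test/run.ipynb",
]
print(experiments_missing_readme(committed))
```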

Why this matters

An engineering decision based on an experiment that is not reproducible, not committed, or not peer-reviewable has no scientific validity. All decisions with impact on the production system must be traceable to documented, committed evidence.


12. Incident & Blockage Reporting

If you encounter a technical blockage (connection timeouts, service downtime, tunnel failures):

  1. Immediate notification — Report via the designated Slack channel at the moment of detection. Do not wait until end of day.
  2. GitHub Issue must include:
    • The exact command executed
    • Full terminal output (complete error logs)
    • Current status of all kubectl tunnels
  3. Resolution — If the error is not reproducible by the CTO/DevOps team, a 5-minute live debugging session will be scheduled to identify local network or configuration issues.
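
The three items required in the GitHub Issue can be gathered with a small helper so nothing is paraphrased or truncated. This is a sketch, not an official reporting tool: the function names and the section headings in the generated text are hypothetical, and the tunnel status is passed in as free text because this document does not define a command for collecting it.

```python
import shlex
import subprocess


def run_and_capture(command):
    """Run a command (list of arguments) and return its exit code and combined output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, (result.stdout + result.stderr).strip()


def format_issue_body(command, exit_code, output, tunnel_status):
    """Assemble the details the GitHub Issue must include."""
    return "\n".join([
        "Command executed:",
        shlex.join(command),
        "",
        f"Exit code: {exit_code}",
        "",
        "Full terminal output:",
        output or "(no output)",
        "",
        "kubectl tunnel status:",
        tunnel_status,
    ])
```

A typical use would be to call run_and_capture with the failing command, then pass its results to format_issue_body together with a description of which tunnels are currently up.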

See docs/RUNBOOK.md for full incident playbooks and escalation paths.


These standards exist to protect the integrity of the Brunix Assistance Engine and to ensure every member of the team can work confidently and efficiently. They are not bureaucratic overhead — they are the foundation of a reliable, scalable engineering practice.

— Rafael Ruiz, CTO, AVAP Technology