# Contributing to Brunix Assistance Engine

> This document is the single source of truth for all contribution standards in the Brunix Assistance Engine repository. All contributors — regardless of seniority or role — are expected to read, understand, and comply with these guidelines before opening any Pull Request.

---

## Table of Contents

1. [Development Workflow (GitFlow)](#1-development-workflow-gitflow)
2. [Infrastructure Standards](#2-infrastructure-standards)
3. [Repository Standards](#3-repository-standards)
4. [Pull Request Requirements](#4-pull-request-requirements)
5. [Ingestion Files Policy](#5-ingestion-files-policy)
6. [Environment Variables Policy](#6-environment-variables-policy)
7. [Changelog Policy](#7-changelog-policy)
8. [Documentation Policy](#8-documentation-policy)
9. [Architecture Decision Records (ADRs)](#9-architecture-decision-records-adrs)
10. [Incident & Blockage Reporting](#10-incident--blockage-reporting)

---

## 1. Development Workflow (GitFlow)

### Branch Strategy

| Branch type | Naming convention | Purpose |
|---|---|---|
| Feature | `*-dev` | Active development — volatile, no CI validation |
| Main | `online` | Production-ready, fully validated |

- **Feature branches** (`*-dev`) are volatile environments. No validation tests or infrastructure deployments are performed on these branches.
- **Official validation** only occurs after a documented Pull Request is merged into `online`.
- **Developer responsibility:** Code must be stable and functional against the authorized environment before a PR is opened. Do not use the PR review process as a debugging step.
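
The naming convention in the table above can be checked mechanically. The following is a minimal sketch, assuming `*-dev` is meant as a shell-style glob; the function name `classify_branch` is hypothetical and not part of the repository.

```python
from fnmatch import fnmatchcase


def classify_branch(name: str) -> str:
    """Map a branch name onto the branch types listed in the table above."""
    if name == "online":
        return "main"
    # Assumption: "*-dev" is read as a glob, so any name ending in "-dev" qualifies.
    if fnmatchcase(name, "*-dev"):
        return "feature"
    return "unknown"


if __name__ == "__main__":
    # The branch names below are illustrative only.
    for branch in ("online", "retrieval-dev", "hotfix"):
        print(branch, "->", classify_branch(branch))
```

A check like this could run in a local hook before pushing, so that misnamed branches are caught before a PR is opened.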

---

## 2. Infrastructure Standards

The project provides a validated, shared environment (Devaron Cluster, Vultr) including Ollama, Elasticsearch, and PostgreSQL.

- **Authorized environment only.** The use of parallel, unauthorized infrastructures — external EC2 instances, ad-hoc local setups, non-replicable environments — is strictly prohibited for official development.
- **No siloed environments.** Isolated development creates technical debt and incompatibility risks that directly impact delivery timelines.
- All infrastructure access must be established via the documented `kubectl` port-forward tunnels defined in the [README](./README.md#3-infrastructure-tunnels).

---

## 3. Repository Standards

### IDE Agnosticism

The `online` branch must remain neutral to any individual's development environment. The following **must not** be committed under any circumstance:

- `.devcontainer/`
- `.vscode/`
- Any local IDE or editor configuration files

The `.gitignore` automates exclusion of these artifacts. Ensure your local environment is fully decoupled from the production-ready source code.
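
As a complement to `.gitignore`, a contributor could scan a list of changed paths before opening a PR. This is a sketch only: the helper `find_forbidden_paths` is hypothetical, and it assumes a path is forbidden when any of its components is exactly `.devcontainer` or `.vscode`. Other editor configuration files would need to be added to the set by hand.

```python
from pathlib import PurePosixPath

# Directory names called out in this section; matching by exact name is an assumption.
FORBIDDEN_DIRS = {".devcontainer", ".vscode"}


def find_forbidden_paths(paths):
    """Return the paths that sit inside a forbidden IDE/editor directory."""
    flagged = []
    for raw in paths:
        parts = PurePosixPath(raw).parts
        if any(part in FORBIDDEN_DIRS for part in parts):
            flagged.append(raw)
    return flagged


if __name__ == "__main__":
    # Illustrative paths, e.g. as printed by `git diff --name-only`.
    changed = ["app/server.py", ".vscode/settings.json", "tools/.devcontainer/devcontainer.json"]
    print(find_forbidden_paths(changed))
```

An empty result means none of the listed directories appear in the change set.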

### Security & Least Privilege

- Never use `root` as `remoteUser` in any shared dev environment configuration.
- All configurations must comply with the **Principle of Least Privilege**.
- Using root in shared environments introduces unacceptable supply chain risk.

### Docker & Build Context

- All executable code must reside in `/app` within the container.
- The `/workspace` root directory is **deprecated** — do not reference it.
- Every PR must verify that `.dockerignore` keeps the `Dockerfile` build context minimal.

> **PRs that violate these architectural standards will be rejected without review.**

---

## 4. Pull Request Requirements

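A PR is not ready for review unless **all applicable items** in the following checklist are satisfied. Where a group offers both a "no change" item and an "updated" item, check whichever one describes your PR. Reviewers are authorized to close PRs that do not meet these standards and request resubmission.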

### PR Checklist

**Code & Environment**
- [ ] Tested locally against the authorized Devaron Cluster (no unauthorized infrastructure used)
- [ ] No IDE or environment configuration files committed (`.vscode`, `.devcontainer`, etc.)
- [ ] No `root` user configurations introduced
- [ ] `Dockerfile` and `.dockerignore` comply with build context standards

**Ingestion Files** *(see [Section 5](#5-ingestion-files-policy))*
- [ ] No ingestion files were added or modified
- [ ] New or modified ingestion files are committed to the repository under `ingestion/` or `data/`

**Environment Variables** *(see [Section 6](#6-environment-variables-policy))*
- [ ] No new environment variables were introduced
- [ ] New environment variables are documented in the `.env` reference table in `README.md`

**Changelog** *(see [Section 7](#7-changelog-policy))*
- [ ] No changelog entry required (internal refactor, comment/typo fix, zero behavioral change)
- [ ] Changelog updated with correct version bump and date

**Documentation** *(see [Section 8](#8-documentation-policy))*
- [ ] No documentation update required (internal change, no impact on setup or API)
- [ ] `README.md` or relevant docs updated to reflect this change
- [ ] If a significant architectural decision was made, an ADR was created in `docs/adr/`

---

## 5. Ingestion Files Policy

All files used to populate the vector knowledge base — source documents, AVAP manuals, structured data, or ingestion scripts — **must be committed to the repository.**

### Rules

- Ingestion files must reside in a dedicated directory (e.g., `ingestion/` or `data/`) within the repository.
- Any PR that introduces new knowledge base content or modifies existing ingestion pipelines must include the corresponding source files.
- Files containing sensitive content that cannot be committed in plain form must be flagged for discussion before proceeding. Encryption, redaction, or a separate private submodule are all valid solutions; keeping the files in an external or local-only location is not.

### Why this matters

The Elasticsearch vector index is only as reliable as the source material that feeds it. Ingestion files that exist only on a local machine or external location cannot be audited, rebuilt, or validated by the team. A knowledge base populated from untracked files is a non-reproducible dependency — and a risk to the entire RAG pipeline.

---

## 6. Environment Variables Policy

This is a critical requirement. **Every environment variable introduced in a PR must be documented before the PR can be merged.**

### Rules

- Any new variable added to the codebase (`.env`, `docker-compose.yaml`, `server.py`, or any config file) must be declared in the `.env` reference table in `README.md`.
- The documentation must include: variable name, purpose, whether it is required or optional, and an example value.
- Variables that contain secrets must use placeholder values (e.g., `your-secret-key-here`) — never commit real values.

### Required format in README.md

```markdown
| Variable | Required | Description | Example |
|---|---|---|---|
| `LANGFUSE_PUBLIC_KEY` | Yes | Langfuse project public key for tracing | `pk-lf-...` |
| `LANGFUSE_SECRET_KEY` | Yes | Langfuse project secret key | `sk-lf-...` |
| `LANGFUSE_HOST`       | Yes | Langfuse server endpoint | `http://45.77.119.180` |
| `NEW_VARIABLE`        | Yes | Description of what it does | `example-value` |
```
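
One way to catch an undocumented variable before review is to compare the names in a `.env` file against the first column of the README table. The sketch below assumes simple `NAME=value` lines in `.env` and the backtick-quoted table format shown above; the function names are hypothetical. It does not inspect `docker-compose.yaml` or `server.py`.

```python
import re

# Matches a table row whose first cell is a backtick-quoted variable name.
TABLE_ROW = re.compile(r"^\|\s*`([A-Z][A-Z0-9_]*)`\s*\|")


def documented_variables(readme_text):
    """Collect variable names from the first column of the README table."""
    names = set()
    for line in readme_text.splitlines():
        match = TABLE_ROW.match(line)
        if match:
            names.add(match.group(1))
    return names


def declared_variables(env_text):
    """Collect variable names from NAME=value lines, skipping comments and blanks."""
    names = set()
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        names.add(line.split("=", 1)[0].strip())
    return names


def undocumented(env_text, readme_text):
    """Return the declared variables that are missing from the README table."""
    return sorted(declared_variables(env_text) - documented_variables(readme_text))


if __name__ == "__main__":
    env = "LANGFUSE_HOST=http://localhost\nNEW_VARIABLE=example-value\n"
    readme = "| `LANGFUSE_HOST`       | Yes | Langfuse server endpoint | `http://...` |"
    print(undocumented(env, readme))
```

Any name in the returned list still needs a row in the reference table before the PR can be merged.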

### Why this matters

An undocumented environment variable silently breaks the setup for every other developer on the team. It also makes the service non-reproducible, which is a direct violation of the infrastructure standards in Section 2. There are no exceptions to this policy.

---

## 7. Changelog Policy

The `changelog` file tracks all notable changes and follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

### When a changelog entry IS required

| Change type | Label to use |
|---|---|
| New feature or capability | `Added` |
| Change to existing behavior, API, or interface | `Changed` |
| Bug fix | `Fixed` |
| Security patch or security-related change | `Security` |
| Breaking change or deprecation | `Deprecated` / `Removed` |

### When a changelog entry is NOT required

- Typo or comment fixes only
- Internal refactors with zero behavioral or interface change
- Tooling/CI updates with no user-visible impact

**If in doubt, add an entry.**

### Format

New entries go at the top of the file, above the previous version:

```
## [X.Y.Z] - YYYY-MM-DD

### Added
- LABEL: Description of the new feature or capability.

### Changed
- LABEL: Description of what changed and the rationale.

### Fixed
- LABEL: Description of the bug resolved.
```

Use uppercase short labels for scannability: `API:`, `DOCKER:`, `INFRA:`, `SECURITY:`, `ENV:`, `CONFIG:`.
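
The format above lends itself to a quick lint. The following is a rough sketch, not an official tool: `check_release_block` is a hypothetical helper, and it assumes any all-uppercase word followed by a colon counts as a valid label.

```python
import re
from datetime import date

HEADER = re.compile(r"^## \[(\d+)\.(\d+)\.(\d+)\] - (\d{4}-\d{2}-\d{2})$")
# Assumption: a label is one uppercase word followed by ": " and some text.
ENTRY = re.compile(r"^- [A-Z]+: \S.*$")


def check_release_block(lines):
    """Return a list of problems found in one release block of the changelog."""
    problems = []
    header = HEADER.match(lines[0]) if lines else None
    if header is None:
        problems.append("first line is not a '## [X.Y.Z] - YYYY-MM-DD' header")
    else:
        try:
            date.fromisoformat(header.group(4))
        except ValueError:
            problems.append("header date is not a real calendar date")
    for number, line in enumerate(lines[1:], start=2):
        if line.startswith("- ") and not ENTRY.match(line):
            problems.append(f"line {number}: entry lacks an uppercase 'LABEL: ' prefix")
    return problems


if __name__ == "__main__":
    block = ["## [1.2.0] - 2024-05-01", "", "### Added", "- added a thing"]
    for problem in check_release_block(block):
        print(problem)
```

An empty list means the block matches the expected header and entry shape; the check does not verify that the version bump itself is correct.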

---

## 8. Documentation Policy

### When documentation MUST be updated

Update `README.md` (or the relevant doc file) if the PR includes any of the following:

- Changes to project structure (new files, directories, removed components)
- Changes to setup, installation, or environment configuration
- New or modified API endpoints or Protobuf definitions (`brunix.proto`)
- New, modified, or removed environment variables
- Changes to infrastructure tunnels or Kubernetes service names
- New dependencies or updated dependency versions
- Changes to security, access, or repository standards

### When documentation is NOT required

- Internal implementation changes with no impact on setup, usage, or API
- Fixes that do not alter any documented behavior

### Documentation files in this repository

| File | Purpose |
|---|---|
| `README.md` | Setup guide, env vars reference, quick start |
| `CONTRIBUTING.md` | Contribution standards (this file) |
| `SECURITY.md` | Security policy and vulnerability reporting |
| `docs/ARCHITECTURE.md` | Deep technical architecture reference |
| `docs/API_REFERENCE.md` | Complete gRPC API contract and examples |
| `docs/RUNBOOK.md` | Operational playbooks and incident response |
| `docs/AVAP_CHUNKER_CONFIG.md` | `avap_config.json` reference — blocks, statements, semantic tags |
| `docs/adr/` | Architecture Decision Records |

> **PRs that change user-facing behavior or setup without updating documentation will be rejected.**

---

## 9. Architecture Decision Records (ADRs)

Architecture Decision Records document **significant technical decisions** — choices that have lasting consequences on the codebase, infrastructure, or development process.

### When to write an ADR

Write an ADR when a PR introduces or changes:

- A fundamental technology choice (communication protocol, storage backend, framework)
- A design pattern that other components will follow
- A deliberate trade-off with known consequences
- A decision that future engineers might otherwise reverse without understanding the rationale

### When NOT to write an ADR

- Implementation details within a single module
- Bug fixes
- Dependency version bumps
- Configuration changes

### ADR format

ADRs live in `docs/adr/` and follow this naming convention:

```
ADR-XXXX-short-title.md
```

Where `XXXX` is a zero-padded sequential number (e.g., `ADR-0005-new-decision.md`).
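
Picking the next number by hand is error-prone, so the file name could be derived from the existing ones. This is an illustrative sketch: `next_adr_filename` is a hypothetical helper, and the lowercase hyphenated title is an assumption inferred from the example name above rather than a stated rule.

```python
import re

# Assumption: short titles are lowercase words or digits joined by hyphens.
ADR_NAME = re.compile(r"^ADR-(\d{4})-[a-z0-9]+(?:-[a-z0-9]+)*\.md$")


def next_adr_filename(existing, short_title):
    """Build the next sequential ADR file name from the names already present."""
    numbers = [int(m.group(1)) for name in existing if (m := ADR_NAME.match(name))]
    next_number = max(numbers, default=0) + 1
    # Collapse anything that is not a lowercase letter or digit into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", short_title.lower()).strip("-")
    return f"ADR-{next_number:04d}-{slug}.md"


if __name__ == "__main__":
    # Illustrative file names; in practice this would be a listing of docs/adr/.
    present = ["ADR-0001-first-decision.md", "ADR-0004-fourth-decision.md", "README.md"]
    print(next_adr_filename(present, "New decision"))  # ADR-0005-new-decision.md
```

Files that do not match the naming pattern are ignored when computing the next number.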

Each ADR must contain:

```markdown
# ADR-XXXX: Title

**Date:** YYYY-MM-DD
**Status:** Proposed | Accepted | Deprecated | Superseded by ADR-YYYY
**Deciders:** Names or roles

## Context
What problem are we solving? What forces are at play?

## Decision
What did we decide?

## Rationale
Why this option over alternatives? Include a trade-off analysis.

## Consequences
What are the positive and negative results of this decision?
```

### Existing ADRs

| ADR | Title | Status |
|---|---|---|
| [ADR-0001](docs/adr/ADR-0001-grpc-primary-interface.md) | gRPC as the Primary Communication Interface | Accepted |
| [ADR-0002](docs/adr/ADR-0002-two-phase-streaming.md) | Two-Phase Streaming Design for AskAgentStream | Accepted |
| [ADR-0003](docs/adr/ADR-0003-hybrid-retrieval-rrf.md) | Hybrid Retrieval (BM25 + kNN) with RRF Fusion | Accepted |
| [ADR-0004](docs/adr/ADR-0004-claude-eval-judge.md) | Claude as the RAGAS Evaluation Judge | Accepted |

---

## 10. Incident & Blockage Reporting

If you encounter a technical blockage (connection timeouts, service downtime, tunnel failures):

1. **Immediate notification** — Report via the designated Slack channel at the moment of detection. Do not wait until end of day.
2. **GitHub Issue must include:**
   - The exact command executed
   - Full terminal output (complete error logs)
   - Current status of all `kubectl` tunnels
3. **Resolution** — If the error is not reproducible by the CTO/DevOps team, a 5-minute live debugging session will be scheduled to identify local network or configuration issues.
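
The three items required in the GitHub Issue can be gathered in one step. Below is a minimal sketch under stated assumptions: `build_issue_body` is a hypothetical helper, the section headings it emits are not a mandated template, and the tunnel status is passed in as text the reporter collects separately.

```python
import shlex
import subprocess

# Built indirectly so this snippet can itself be embedded in Markdown.
FENCE = "`" * 3


def build_issue_body(command, tunnel_status):
    """Run a command and format it, its full output, and tunnel status as Markdown."""
    # Note: a command that does not exist raises FileNotFoundError before any output.
    result = subprocess.run(command, capture_output=True, text=True)
    output = (result.stdout + result.stderr).rstrip()
    return "\n".join([
        "### Command executed",
        f"{FENCE}\n{shlex.join(command)}\n{FENCE}",
        f"### Full terminal output (exit code {result.returncode})",
        f"{FENCE}\n{output}\n{FENCE}",
        "### kubectl tunnel status",
        tunnel_status,
    ])


if __name__ == "__main__":
    import sys

    # A stand-in for the failing command; replace with the real invocation.
    failing = [sys.executable, "-c", "import sys; sys.exit('connection timed out')"]
    print(build_issue_body(failing, "(paste the status of each tunnel here)"))
```

Capturing both output streams and the exit code keeps the report complete even when the command fails without printing to standard output.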

See [`docs/RUNBOOK.md`](docs/RUNBOOK.md) for full incident playbooks and escalation paths.

---

*These standards exist to protect the integrity of the Brunix Assistance Engine and to ensure every member of the team can work confidently and efficiently. They are not bureaucratic overhead — they are the foundation of a reliable, scalable engineering practice.*

*— Rafael Ruiz, CTO, AVAP Technology*