271 Commits

Author SHA1 Message Date
acano e035472b14 created chunks for new emb model (harrier) 2026-04-06 12:45:49 +02:00
pseco 56349184fb Add evaluation results for AVAP knowledge models and update evaluation notebook
- Created a new JSON file containing evaluation results for the AVAP knowledge models, including scores for faithfulness, answer relevancy, context recall, and context precision.
- Updated the evaluation notebook to use a new embedding model and fixed execution counts for code cells.
2026-04-06 11:55:33 +02:00
pseco 00a7cc727d Refactor code structure and remove redundant code blocks for improved readability and maintainability 2026-04-06 11:20:21 +02:00
pseco 26ffcc54d9 Refactor code structure for improved readability and maintainability 2026-03-31 13:57:15 +02:00
pseco 4f7367d2d4 Merge branch 'online' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-31 11:21:21 +02:00
pseco 8f501d3e52 Refactor code structure for improved readability and maintainability 2026-03-31 11:16:03 +02:00
rafa-ruiz 1e9a6508f9 Golden dataset 2026-03-31 01:48:00 -07:00
rafa-ruiz aa138783f3 Golden dataset 2026-03-31 01:40:53 -07:00
rafa-ruiz 6ee8583894 update 2026-03-31 01:40:23 -07:00
pseco cd656b08a8 Update default dataset path in validate_synthetic_dataset.py to point to new output location 2026-03-30 10:04:28 +02:00
pseco 04fa15ff1e Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-27 14:14:05 +01:00
pseco 0cf2fc3aa7 Remove detailed print statements from fill rate analysis and retain only essential output 2026-03-27 14:14:00 +01:00
acano 8df0b59f65 Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-27 14:13:17 +01:00
acano e4f76f3fab Add newline at the end of generate_mbap_v2.py for better file formatting 2026-03-27 14:10:58 +01:00
pseco 344230c2cf Refactor code structure for improved readability and maintainability 2026-03-27 14:09:18 +01:00
acano d074ce32cc Merge branch 'online' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-27 14:08:18 +01:00
acano bae58a7fed Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-27 14:04:31 +01:00
acano f747c140c8 Enhance generate_mbap_v2.py with new reward mechanism and GoldPool integration
- Added GoldPool class to manage a top-K pool of high-reward examples.
- Implemented compute_reward function to calculate composite rewards based on execution coverage, novelty, and test quality.
- Introduced call_api_reward function for API calls in the new reward mode.
- Updated main function to support new reward mode with adjustable weights for ECS, novelty, and test quality.
- Enhanced dataset saving functionality to include reward statistics.
- Refactored existing code for improved readability and consistency.
2026-03-27 14:04:21 +01:00
pseco d2d223baea Add new JSON dataset for email validation API task and initialize validated dataset 2026-03-27 11:21:41 +01:00
Rafael Ruiz 3e47c15966 Merge pull request #63 from BRUNIX-AI/mrh-online-dev-partial
Add BEIR analysis notebooks and evaluation pipeline for embedding models
2026-03-26 09:33:54 -07:00
pseco 668f6d006b Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-26 17:18:49 +01:00
pseco febf955a62 Add new JSON output files for candidate F reward statistics and MBPP tasks
- Created `candidate_F_reward_10_coverage_stats.json` with coverage statistics including total cells, filled cells, fill rate, and node type frequency.
- Added `mbpp_avap.json` containing 14 tasks with descriptions, code implementations, test inputs, and expected test results for various endpoints and functionalities.
2026-03-26 17:18:45 +01:00
acano c6b57849cd Merge branch 'online' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-26 17:02:27 +01:00
acano b94f3382b3 Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-26 17:00:34 +01:00
izapata 4deda83a8e Add BEIR analysis notebooks and evaluation pipeline for embedding models
- Created `n00 Beir Analysis_cosqa.ipynb` for analyzing CoSQA dataset with BEIR.
- Created `n00 first Analysis.ipynb` for initial analysis using Ragas and Ollama embeddings.
- Implemented `evaluate_embeddings_pipeline.py` to evaluate embedding models across CodexGlue, CoSQA, and SciFact benchmarks.
- Added adapters for Ollama and HuggingFace embeddings to ensure compatibility with BEIR.
- Included functions to load datasets and evaluate models with detailed metrics.
2026-03-26 16:53:20 +01:00
Rafael Ruiz a55d4bbf5e Merge pull request #62 from BRUNIX-AI/mrh-online-dev-partial
Update Embedding model PDF and enhance documentation
2026-03-26 08:36:53 -07:00
rafa-ruiz fe43cd6fa9 scripts documentation 2026-03-26 07:51:01 -07:00
acano ba03aa3b92 Add ADR-0006: Code Indexing Improvements with evaluation strategies and alternatives 2026-03-26 15:41:12 +01:00
izapata 08c5aded35 fix(docs): improve formatting and readability in ADR-0005 for embedding model selection 2026-03-26 15:32:23 +01:00
acano 1f0d31b7b3 Delete obsolete Jupyter notebooks for BEIR analysis and first analysis, removing unused code and dependencies. 2026-03-26 15:20:44 +01:00
acano 591a839c2a Refactor code structure for improved readability and maintainability 2026-03-26 15:13:54 +01:00
acano 3d3237aef6 chore(changelog): update changelog for version 1.6.2 with embedding model selection PDF changes 2026-03-26 15:11:07 +01:00
izapata 76250a347b feat(docs): typo fix 2026-03-26 10:30:12 +01:00
izapata 64d487e20d chore: update changelog for version 1.6.2 and enhance README.md documentation 2026-03-26 10:25:34 +01:00
izapata e4a8e5b85d chore: update Embedding model selection PDF with new content 2026-03-26 10:19:24 +01:00
izapata 669d0b47a0 chore(embeddings): update Embedding model selection PDF 2026-03-26 10:15:49 +01:00
acano d50f33c707 Merge branch 'online' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-26 09:37:57 +01:00
pseco 1ee5f21c7c Add BEIR analysis notebooks and evaluation pipeline for embedding models
- Created `n00 Beir Analysis_cosqa.ipynb` for analyzing CoSQA dataset with BEIR.
- Created `n00 first Analysis.ipynb` for initial analysis with embeddings.
- Implemented `evaluate_embeddings_pipeline.py` to evaluate embedding models across CodexGlue, CoSQA, and SciFact benchmarks.
- Added adapters for Ollama and HuggingFace embeddings to ensure compatibility with BEIR.
- Enhanced error handling and data normalization in embedding processes.
- Included functionality to load datasets from local cache or download if not present.
2026-03-26 09:37:37 +01:00
rafa-ruiz ccd9073a52 feat(dataset): add ADR-0006 and scaffold reward algorithm pipeline 2026-03-25 22:19:19 -07:00
pseco 0d2cdd2190 Refactor AVAP dataset generation prompts and add synthetic data generation notebook
- Introduced a new notebook for generating synthetic datasets for AVAP, including loading AVAP and MBPP data, and creating prompts for LLM interactions.
2026-03-25 17:07:00 +01:00
pseco d7f895804c Refactor code structure for improved readability and maintainability 2026-03-25 10:53:38 +01:00
pseco 71eb85cc89 Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-25 10:50:13 +01:00
pseco 0b309bfa69 feat: add evaluation results for bge-m3 and qwen3-0.6B-emb models 2026-03-25 10:46:02 +01:00
acano b2e5d06d96 Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-25 10:41:26 +01:00
acano 21bc6fc3f0 feat: add embedding evaluation results and task processing notebook 2026-03-25 10:40:49 +01:00
acano da483c51bb created code_indexing_improvements research 2026-03-25 10:37:53 +01:00
acano fe90548b8b added ast tree metadata 2026-03-25 10:36:18 +01:00
acano dc8230c872 feat: add ANTHROPIC_API_KEY and ANTHROPIC_MODEL to docker-compose environment 2026-03-25 10:30:00 +01:00
acano bd542bb14d Continued ADR-0005 and created ADR-0006 2026-03-25 10:27:41 +01:00
acano 1442a632c9 fixed avap examples (not coherent with official avap bnf rules) 2026-03-25 10:26:47 +01:00