Commit Graph

12 Commits

Author SHA1 Message Date
acano 5f21544e0b Refactor Elasticsearch ingestion pipeline and add MBPP generation script
- Updated `elasticsearch_ingestion.py` to streamline document processing and ingestion into Elasticsearch.
- Introduced `generate_mbap.py` for generating benchmark problems in AVAP language from a provided LRM.
- Created `prompts.py` to define prompts for converting Python problems to AVAP.
- Enhanced chunk processing in `chunk.py` to support markdown and AVAP documents.
- Added `OllamaEmbeddings` class in `embeddings.py` for handling embeddings with an Ollama model.
- Updated dependencies in `uv.lock` to include new packages and versions.
2026-03-11 17:17:44 +01:00
pseco d04c149e66 working on scratches BNF and parsing 2026-03-11 12:28:35 +01:00
acano 0ed7dfc653 Merge branch 'online' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-11 09:57:14 +01:00
rafa-ruiz 35ca56118d feat: add MBPP-style dataset generator and evaluation docs 2026-03-10 13:37:19 -07:00
acano 745ce07805 Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev 2026-03-10 14:36:17 +01:00
acano bf3c7f36d8 feat(chunk): enhance file reading and processing logic
- Updated `read_files` function to return a list of dictionaries containing 'content' and 'title' keys.
- Added logic to handle concatenation of file contents and improved handling of file prefixes.
- Introduced `get_chunk_docs` function to chunk document contents using `SemanticChunker`.
- Added `convert_chunks_to_document` function to convert chunked content into `Document` objects.
- Integrated logging for chunking process.
- Updated dependencies in `uv.lock` to include `chonkie` and other related packages.
2026-03-10 14:36:09 +01:00
pseco a9bf84fa79 feat: Add synthetic dataset generation for AVAP using MBPP dataset
- Implemented a new script `translate_mbpp.py` to generate synthetic datasets using various LLM providers.
- Integrated the `get_prompt_mbpp` function in `prompts.py` to create prompts tailored for AVAP language conversion.
2026-03-09 17:43:07 +01:00
rafa-ruiz 7839793eff docs: align function syntax and cleanup docker config 2026-03-05 11:57:29 -08:00
rafa-ruiz 8379033900 Sample AVAP code 2026-03-04 20:21:27 -08:00
rafa-ruiz 1c9ee8d5dd docs(core): add official AVAP documentation in Markdown (iii) 2026-03-04 18:44:22 -08:00
rafa-ruiz 0113b32f8a docs(core): add official AVAP documentation in Markdown (ii) 2026-03-04 18:31:50 -08:00
rafa-ruiz 2d66266fd8 docs(core): add official AVAP documentation in Markdown 2026-03-04 18:25:15 -08:00
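
Commit bf3c7f36d8 describes a three-step flow in `chunk.py`: `read_files` returns dictionaries with 'content' and 'title' keys, `get_chunk_docs` chunks the contents, and `convert_chunks_to_document` wraps the chunks in `Document` objects. The following is only a rough sketch of that flow, not the repository's code. The function signatures, the in-memory file mapping, the `Document` dataclass, and the `max_chars` parameter are all assumptions made for illustration, and a naive paragraph packer stands in for `SemanticChunker`.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for the `Document` type named in the commit."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def read_files(files: dict[str, str]) -> list[dict]:
    """Return one {'content', 'title'} dict per file.

    Takes an in-memory {filename: text} mapping (an assumption) so the
    sketch runs without touching the filesystem.
    """
    return [{"content": text, "title": name} for name, text in files.items()]


def get_chunk_docs(docs: list[dict], max_chars: int = 80) -> list[dict]:
    """Split each document into chunks, keeping the title with every chunk.

    Naive paragraph packing is used here in place of `SemanticChunker`.
    """
    chunks = []
    for doc in docs:
        current = ""
        for para in doc["content"].split("\n\n"):
            para = para.strip()
            if not para:
                continue
            candidate = f"{current}\n\n{para}" if current else para
            if current and len(candidate) > max_chars:
                # The running chunk is full: emit it and start a new one.
                chunks.append({"content": current, "title": doc["title"]})
                current = para
            else:
                current = candidate
        if current:
            chunks.append({"content": current, "title": doc["title"]})
    return chunks


def convert_chunks_to_document(chunks: list[dict]) -> list[Document]:
    """Wrap each chunk dict in a Document, storing the title as metadata."""
    return [
        Document(page_content=c["content"], metadata={"title": c["title"]})
        for c in chunks
    ]


# Placeholder inputs; the file names and contents are made up.
files = {
    "intro.md": "AVAP basics.\n\nVariables and types.",
    "sample.avap": "placeholder source",
}
documents = convert_chunks_to_document(
    get_chunk_docs(read_files(files), max_chars=20)
)
```

Carrying the title through every step keeps each chunk traceable to its source file once it is indexed, which is one plausible reason for the 'content'/'title' dictionary shape the commit mentions.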