Commit Graph

4 Commits

Author SHA1 Message Date
acano 752bf9c7d9 Update Elasticsearch index version and modify imports in ingestion and translation scripts
- Changed Elasticsearch index from "avap-docs-test-v3" to "avap-docs-test-v4" in elasticsearch_ingestion.py.
- Removed unused import SystemMessage from langchain_core.messages in translate_mbpp.py.
- Added import for Lark in chunk.py to support new functionality.
2026-03-19 11:30:00 +01:00
pseco c7adab24a6 working on synthetic dataset 2026-03-17 11:02:06 +01:00
acano 9425db9b6c Refactor project structure: move prompts module to tasks directory and update references 2026-03-12 10:20:34 +01:00
pseco a9bf84fa79 feat: Add synthetic dataset generation for AVAP using MBPP dataset
- Implemented a new script `translate_mbpp.py` to generate synthetic datasets using various LLM providers.
- Integrated the `get_prompt_mbpp` function in `prompts.py` to create prompts tailored for AVAP language conversion.
2026-03-09 17:43:07 +01:00