assistance-engine

Commit Graph

Author	SHA1	Message	Date
pseco	3ac432567b	BNF extraction pipeline from avap.md	2026-03-11 11:29:19 +01:00
acano	0ed7dfc653	Merge branch 'online' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev	2026-03-11 09:57:14 +01:00
acano	2ad09cc77f	feat: Update dependencies and enhance Elasticsearch ingestion pipeline - Added new dependencies including chonkie and markdown-it-py to requirements.txt. - Refactored the Elasticsearch ingestion script to read and concatenate documents from specified folders. - Implemented semantic chunking for documents using the chonkie library. - Removed the old elasticsearch_ingestion_from_docs.py script as its functionality has been integrated into the main ingestion pipeline. - Updated README.md to reflect new project structure and environment variables. - Added a new changelog entry for version 1.4.0 detailing recent changes and enhancements.	2026-03-11 09:50:51 +01:00
rafa-ruiz	35ca56118d	feat: add MBPP-style dataset generator and evaluation docs	2026-03-10 13:37:19 -07:00
acano	745ce07805	Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev	2026-03-10 14:36:17 +01:00
acano	bf3c7f36d8	feat(chunk): enhance file reading and processing logic - Updated `read_files` function to return a list of dictionaries containing 'content' and 'title' keys. - Added logic to handle concatenation of file contents and improved handling of file prefixes. - Introduced `get_chunk_docs` function to chunk document contents using `SemanticChunker`. - Added `convert_chunks_to_document` function to convert chunked content into `Document` objects. - Integrated logging for chunking process. - Updated dependencies in `uv.lock` to include `chonkie` and other related packages.	2026-03-10 14:36:09 +01:00
pseco	a9bf84fa79	feat: Add synthetic dataset generation for AVAP using MBPP dataset - Implemented a new script `translate_mbpp.py` to generate synthetic datasets using various LLM providers. - Integrated the `get_prompt_mbpp` function in `prompts.py` to create prompts tailored for AVAP language conversion.	2026-03-09 17:43:07 +01:00
pseco	f6bfba5561	Merge branch 'mrh-online-dev' of github.com:BRUNIX-AI/assistance-engine into mrh-online-dev	2026-03-09 15:04:23 +01:00
pseco	4afba7d89d	working on scrappy	2026-03-09 15:00:07 +01:00
acano	6d856ba691	Add chunk.py for processing and replacing JavaScript references with Avap - Implemented `replace_javascript_with_avap` function to handle text replacement. - Created `read_concat_files` function to read and concatenate files with a specified prefix, replacing JavaScript markers. - Added functionality to read files from a specified directory and process their contents.	2026-03-09 13:21:18 +01:00
acano	a4267e1b60	feat: implement Elasticsearch ingestion pipeline and embedding factories	2026-03-05 16:26:22 +01:00
acano	d951868200	refactor: Simplify Elasticsearch ingestion by removing chunk management module and integrating document building directly	2026-03-05 16:23:27 +01:00
acano	51f42c52b3	refactor: Remove unused uuid import from chunks.py and update changelog for refactoring changes	2026-03-05 11:27:27 +01:00
acano	1549069f5a	feat: Add Elasticsearch ingestion pipeline and document chunking functionality - Implemented `elasticsearch_ingestion` function to handle document ingestion into Elasticsearch. - Created `build_chunks_from_folder` function to read and clean text files, generating document chunks. - Added logging for better traceability during the ingestion process. - Updated `uv.lock` to include `boto3` as a new dependency.	2026-03-04 18:21:01 +01:00
pseco	9575af3ff0	working on dual index	2026-03-03 12:01:03 +01:00
pseco	a5952c1a4d	working on agent in docker	2026-03-02 12:41:27 +01:00
pseco	36bd3b32a6	generate working schema	2026-02-16 17:58:18 +01:00
izapata	7cdaf5a0c5	feat: update README and add start-tunnels.sh script for infrastructure setup	2026-02-16 14:50:55 +01:00
acano	03116be719	chore: add .gitkeep files to notebooks and scripts directories	2026-02-11 18:06:16 +01:00

19 Commits