- Implemented parser for executing AVAP files within a Docker container (parser v1.py).
- Created a script to send AVAP code to a local server and handle responses (parser v2.py).
- Introduced a mock MBAP test harness to validate AVAP code against expected outputs (mbap_tester.py).
- Added transformation logic to convert AVAP code into Python-like syntax for testing purposes.
- Enhanced error handling and output formatting in the testing harness.
- Updated `elasticsearch_ingestion.py` to streamline document processing and ingestion into Elasticsearch.
- Introduced `generate_mbap.py` for generating benchmark problems in AVAP language from a provided LRM.
- Created `prompts.py` to define prompts for converting Python problems to AVAP.
- Enhanced chunk processing in `chunk.py` to support markdown and AVAP documents.
- Added `OllamaEmbeddings` class in `embeddings.py` for handling embeddings with Ollama model.
- Updated dependencies in `uv.lock` to include new packages and versions.
- Added new dependencies including chonkie and markdown-it-py to requirements.txt.
- Refactored the Elasticsearch ingestion script to read and concatenate documents from specified folders.
- Implemented semantic chunking for documents using the chonkie library.
- Removed the old elasticsearch_ingestion_from_docs.py script as its functionality has been integrated into the main ingestion pipeline.
- Updated README.md to reflect new project structure and environment variables.
- Added a new changelog entry for version 1.4.0 detailing recent changes and enhancements.
- Implemented code to utilize OllamaEmbeddings for embedding documents.
- Included example usage with sample text inputs.
- Demonstrated response handling from the Ollama LLM.
- Noted deprecation warning for the Ollama class in LangChain.
- Updated `read_files` function to return a list of dictionaries containing 'content' and 'title' keys.
- Added logic to handle concatenation of file contents and improved handling of file prefixes.
- Introduced `get_chunk_docs` function to chunk document contents using `SemanticChunker`.
- Added `convert_chunks_to_document` function to convert chunked content into `Document` objects.
- Integrated logging for chunking process.
- Updated dependencies in `uv.lock` to include `chonkie` and other related packages.
- Implemented `replace_javascript_with_avap` function to handle text replacement.
- Created `read_concat_files` function to read and concatenate files with a specified prefix, replacing JavaScript markers.
- Added functionality to read files from a specified directory and process their contents.
- Implemented `elasticsearch_ingestion` function to handle document ingestion into Elasticsearch.
- Created `build_chunks_from_folder` function to read and clean text files, generating document chunks.
- Added logging for better traceability during the ingestion process.
- Updated `uv.lock` to include `boto3` as a new dependency.
- Set execution counts to null for initial cells in langgraph_agent_simple.ipynb
- Update execution counts for subsequent cells to maintain order
- Change output stream name from stdout to stderr for error handling
- Capture and log detailed error messages for failed Langfuse client authentication
Update uv.lock to manage accelerate dependency
- Remove accelerate from main dependencies
- Add accelerate to dev dependencies with version specification
- Adjust requires-dist section to reflect changes in dependency management
- Created a new Jupyter notebook for analyzing BEIR dataset with CosQA using Ollama embeddings.
- Implemented a custom embedding class to integrate LangChain's OllamaEmbeddings with BEIR.
- Added data loading and evaluation logic for the CosQA dataset.
- Updated `uv.lock` to remove unnecessary dependencies (`mteb` and `polars`) and incremented revision number.