Enhance generate_mbap_v2.py with new reward mechanism and GoldPool integration

- Added GoldPool class to manage a top-K pool of high-reward examples.
- Implemented compute_reward function to calculate composite rewards based on execution coverage, novelty, and test quality.
- Introduced call_api_reward function for API calls in the new reward mode.
- Updated main function to support new reward mode with adjustable weights for ECS, novelty, and test quality.
- Enhanced dataset saving functionality to include reward statistics.
- Refactored existing code for improved readability and consistency.
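
For reviewers, since the diff itself is not shown here: a minimal sketch of how a top-K pool and a weighted composite reward could fit together. This is not the code from generate_mbap_v2.py. The signatures, the default weights (0.5 / 0.3 / 0.2), the assumption that each score lies in [0, 1], and the min-heap layout are all illustrative assumptions; only the names GoldPool and compute_reward come from this commit message.

```python
import heapq
from itertools import count


def compute_reward(ecs, novelty, test_quality,
                   w_ecs=0.5, w_novelty=0.3, w_quality=0.2):
    """Weighted sum of three scores (assumed here to lie in [0, 1]).

    The default weights are placeholders, not the values used in the commit.
    """
    return w_ecs * ecs + w_novelty * novelty + w_quality * test_quality


class GoldPool:
    """Keeps the k highest-reward examples seen so far (hypothetical layout)."""

    def __init__(self, k):
        if k <= 0:
            raise ValueError("k must be positive")
        self.k = k
        self._heap = []          # min-heap of (reward, tiebreak, example)
        self._tiebreak = count() # unique counter so examples are never compared

    def add(self, example, reward):
        """Insert the example if it belongs in the top k; return True if kept."""
        entry = (reward, next(self._tiebreak), example)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
            return True
        if reward > self._heap[0][0]:
            # Evict the current lowest-reward entry in one step.
            heapq.heapreplace(self._heap, entry)
            return True
        return False

    def top(self):
        """Return (reward, example) pairs, highest reward first."""
        return [(r, ex) for r, _, ex in sorted(self._heap, reverse=True)]


if __name__ == "__main__":
    pool = GoldPool(k=2)
    for name, scores in {"a": (0.2, 0.1, 0.3),
                         "b": (0.9, 0.8, 0.7),
                         "c": (0.5, 0.5, 0.5)}.items():
        pool.add(name, compute_reward(*scores))
    print(pool.top())
```

A min-heap keeps each insertion at O(log k) and makes the weakest pool member available at index 0, which is the only element an incoming example has to beat.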
acano 2026-03-27 14:04:21 +01:00
parent c6b57849cd
commit f747c140c8
1 changed file with 654 additions and 133 deletions

File diff suppressed because it is too large