assistance-engine/output/evaluation_bge-m3-latest_av...

{
  "generated_at": "2026-04-06T10:56:13.462829",
  "embedding_model": "bge-m3:latest",
  "index_name": "avap-knowledge-v2-bge",
  "result": {
    "status": "ok",
    "questions_evaluated": 5,
    "elapsed_seconds": 48.6,
    "judge_model": "claude-sonnet-4-20250514",
    "index": "avap-knowledge-v2-bge",
    "category_filter": "all",
    "scores": {
      "faithfulness": 0.6536,
      "answer_relevancy": 0.8496,
      "context_recall": 0.5758,
      "context_precision": 0.6333
    },
    "global_score": 0.6781,
    "verdict": "ACCEPTABLE",
    "details": [
      {
        "id": "GD-001",
        "category": "RETRIEVAL",
        "question": "What is AVAP and what is it designed for?",
        "answer_preview": "AVAP (Advanced Virtual API Programming) is an Advanced Virtual API Programming language that is Turing Complete. It is designed to be used in the orquestation of microservices and I/O operations, with a focus on security, concurrency, and determinism.\n\nThe language is described as strictly oriented ",
        "n_chunks": 5
      },
      {
        "id": "GD-002",
        "category": "RETRIEVAL",
        "question": "How does AVAP handle conditional logic? What commands are used and how are blocks closed?",
        "answer_preview": "AVAP handles conditional logic using the `if()` command, which can be invoked in two modes:\n\n1. **Mode 1 (Structured Comparison):** The syntax is `if(variable, value, comparison_operator)`. This mode evaluates a comparison between the variable and the value provided as strings representing operators",
        "n_chunks": 5
      },
      {
        "id": "GD-003",
        "category": "CODE_GENERATION",
        "question": "Write an AVAP script that reads a 'password' parameter, generates a SHA-256 hash of it, and returns the hash.",
        "answer_preview": "```avap\nencodeSHA256(password, hashed_password)\n```\n\nThis script takes a 'password' parameter, generates its SHA-256 hash using `encodeSHA256`, and stores the result in `hashed_password`.",
        "n_chunks": 5
      },
      {
        "id": "GD-004",
        "category": "CODE_GENERATION",
        "question": "Show an AVAP script that loops from 1 to 5, builds a JSON object with each iteration index as a key, and returns it.",
        "answer_preview": "```avap\nx = 0\n\nwhile x < 5:\n y = x + 1\n \n # Build the JSON object for this iteration\n json_obj = {\n \"index\": y,\n \"value\": x * 2\n }\n \n # Return the current iteration's JSON object\n return json_obj\n\nx += 1\n```",
        "n_chunks": 5
      },
      {
        "id": "GD-005",
        "category": "RETRIEVAL",
        "question": "How does AVAP support external HTTP calls? What commands are available and how is timeout handled?",
        "answer_preview": "AVAP supports external HTTP calls through the `RequestPost` and `RequestGet` functions. These functions allow for POST and GET requests, respectively, to be executed on a specified URL with optional query parameters, headers, body data, and destination file.\n\nTimeout handling in AVAP is managed by t",
        "n_chunks": 5
      }
    ]
  }
}
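
The report does not say how `global_score` is derived from the four entries in `scores`. The sketch below checks one plausible reading: that it is the unweighted arithmetic mean of the four metrics, rounded to four decimals. That aggregation rule is an assumption inferred from the numbers, not something the report confirms, and the helper name `unweighted_global_score` is hypothetical.

```python
from statistics import mean

# Metric values copied from the "scores" object in the report.
scores = {
    "faithfulness": 0.6536,
    "answer_relevancy": 0.8496,
    "context_recall": 0.5758,
    "context_precision": 0.6333,
}

# Value of "global_score" as reported.
reported_global_score = 0.6781


def unweighted_global_score(metric_scores):
    """Assumed aggregation (hypothetical): plain mean, rounded to 4 decimals."""
    return round(mean(list(metric_scores.values())), 4)


recomputed = unweighted_global_score(scores)

# True when the reported value is consistent with the assumed aggregation.
matches = abs(recomputed - reported_global_score) < 1e-9
print(recomputed, matches)
```

With these inputs the recomputed value agrees with the reported one, which is consistent with an unweighted mean but does not rule out other weightings that happen to give the same result. The thresholds behind the `verdict` field are not stated in the report, so they are not modeled here.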