Commit fb6b3ca

slister1001 and Copilot authored
[Evaluation] Recover partial red team results when Foundry execution raises (#45541)
* [Evaluation] Recover partial red team results when Foundry execution raises

  When orchestrator.execute() raises (e.g., ConnectTimeout on 1 of 50 objectives), attempt to recover partial results from the orchestrator before falling back to the empty-result error path.

  Previously, any single objective failure caused the entire risk category's results to be discarded (data_file set to empty string, 0 results returned). Now, completed objectives are processed through the normal FoundryResultProcessor pipeline and included in the final output.

  The error is demoted from ERROR to WARNING when partial results are available, since it is not a total failure. The original full-failure path is preserved when get_attack_results() returns empty.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review comments: add debug logging, structured partial_failure info

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Apply black formatting

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
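The recovery flow described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the SDK's implementation: `FakeOrchestrator` and `run_category` are hypothetical names invented here, and only the `execute()` / `get_attack_results()` call shape and the `red_team_info` entry fields are taken from the commit.

```python
import asyncio
import logging

logger = logging.getLogger("partial_recovery_sketch")


class FakeOrchestrator:
    """Hypothetical stand-in: completes objectives in order, then raises at one index."""

    def __init__(self, objectives, fail_at=None):
        self._objectives = objectives
        self._fail_at = fail_at
        self._results = []

    async def execute(self):
        for index, objective in enumerate(self._objectives):
            if index == self._fail_at:
                raise TimeoutError(f"timed out on {objective}")
            self._results.append({"objective": objective, "outcome": "completed"})

    def get_attack_results(self):
        # Returns whatever finished before the failure.
        return list(self._results)


async def run_category(orchestrator, risk_value, red_team_info):
    """Return True when results (full or partial) should go on to processing."""
    try:
        await orchestrator.execute()
        return True
    except Exception as exc:
        partial_results = []
        try:
            partial_results = orchestrator.get_attack_results()
        except Exception:
            logger.debug("Failed to recover partial results for %s", risk_value, exc_info=True)

        entry = {"data_file": "", "status": "failed", "error": str(exc), "asr": 0.0}
        if partial_results:
            # Not a total failure: warn and mark the entry as partial.
            logger.warning("Partial failure for %s: %s. Recovered %d results.",
                           risk_value, exc, len(partial_results))
            entry["status"] = "partial_failure"
            entry["partial_failure"] = True
        else:
            logger.error("Error executing attacks for %s: %s", risk_value, exc)
        red_team_info.setdefault("Foundry", {})[risk_value] = entry
        return bool(partial_results)


def demo():
    info = {}
    # Fails on the third objective, so two results are recoverable.
    partial = asyncio.run(run_category(FakeOrchestrator(["a", "b", "c"], fail_at=2), "violence", info))
    # Fails on the first objective, so nothing is recoverable.
    total = asyncio.run(run_category(FakeOrchestrator(["a"], fail_at=0), "self_harm", info))
    return info, partial, total
```

The nested `try` around `get_attack_results()` matters in this sketch: if recovery itself raises, the code falls through to the original full-failure path instead of masking the first exception.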
1 parent 7455f91 commit fb6b3ca

1 file changed

Lines changed: 38 additions & 12 deletions

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/_foundry/_execution_manager.py

@@ -162,19 +162,45 @@ async def execute_attacks(
                     include_baseline=include_baseline,
                 )
             except Exception as e:
-                self.logger.error(f"Error executing attacks for {risk_value}: {e}")
-                # Use "Foundry" as fallback strategy name to match expected structure
-                if "Foundry" not in red_team_info:
-                    red_team_info["Foundry"] = {}
-                red_team_info["Foundry"][risk_value] = {
-                    "data_file": "",
-                    "status": "failed",
-                    "error": str(e),
-                    "asr": 0.0,
-                }
-                continue
+                # Attempt to recover partial results before giving up.
+                # partial_results is used only as a truthiness check here;
+                # FoundryResultProcessor re-retrieves results via orchestrator.get_attack_results().
+                partial_results = []
+                try:
+                    partial_results = orchestrator.get_attack_results()
+                except Exception:
+                    self.logger.debug("Failed to recover partial results for %s", risk_value, exc_info=True)
+
+                if partial_results:
+                    self.logger.warning(
+                        f"Partial failure executing attacks for {risk_value}: {e}. "
+                        f"Recovered {len(partial_results)} partial results."
+                    )
+                    # Record partial failure in structured output so callers
+                    # relying on red_team_info can observe it.
+                    if "Foundry" not in red_team_info:
+                        red_team_info["Foundry"] = {}
+                    red_team_info["Foundry"][risk_value] = {
+                        "data_file": "",
+                        "status": "partial_failure",
+                        "error": str(e),
+                        "partial_failure": True,
+                        "asr": 0.0,
+                    }
+                else:
+                    self.logger.error(f"Error executing attacks for {risk_value}: {e}")
+                    # No results recoverable — use empty fallback
+                    if "Foundry" not in red_team_info:
+                        red_team_info["Foundry"] = {}
+                    red_team_info["Foundry"][risk_value] = {
+                        "data_file": "",
+                        "status": "failed",
+                        "error": str(e),
+                        "asr": 0.0,
+                    }
+                    continue
 
-            # Process results
+            # Process results (handles both full success and partial recovery)
             result_processor = FoundryResultProcessor(
                 scenario=orchestrator,
                 dataset_config=dataset_config,
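
Because the change records a distinct `"partial_failure"` status in `red_team_info`, a caller could group risk categories by outcome. The helper below is a hypothetical sketch, not part of the SDK: `summarize_statuses` and the sample risk category names are assumptions, and only the nested `red_team_info[strategy][risk_value]` shape and the two status strings come from the diff.

```python
from collections import defaultdict


def summarize_statuses(red_team_info):
    """Group (strategy, risk_value) pairs by the entry's "status" field."""
    summary = defaultdict(list)
    for strategy, categories in red_team_info.items():
        for risk_value, entry in categories.items():
            summary[entry.get("status", "unknown")].append((strategy, risk_value))
    return dict(summary)


# Illustrative data mirroring the entry shapes written in the except block.
SAMPLE = {
    "Foundry": {
        "violence": {
            "data_file": "",
            "status": "partial_failure",
            "error": "ConnectTimeout",
            "partial_failure": True,
            "asr": 0.0,
        },
        "self_harm": {
            "data_file": "",
            "status": "failed",
            "error": "ConnectTimeout",
            "asr": 0.0,
        },
    }
}
```

Calling `summarize_statuses(SAMPLE)` separates categories that still carry recoverable results from those that failed outright, which is the distinction the new status is meant to expose.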
