benchmark evidence
tau2-bench Banking Knowledge
Banking knowledge tasks evaluated through the tau2-bench agent scaffold and declared retrieval configuration.
no matched results
No public rows from this source currently map to an active callable model.
what this result means
This benchmark contributes direct public evidence. Read its scope before generalizing the result.
A win here is a win on tau2-bench Banking Knowledge only. Naming a general winner on a broad task page requires independent corroboration.
source record
category: agentic
metric: accuracy
matched models: 0
latest source date: unavailable
direction: higher is better