Pull requests: sierra-research/tau2-bench
Pull requests list
fix(telecom): use deterministic bill ID for state replay consistency
#148
opened Jan 23, 2026 by
CormickKneey
3 of 4 tasks
update:tongagents tau2bench result summit
leaderboard submission
#145
opened Jan 16, 2026 by
tongagents-tau2
feat: Add hospitality domain - Berkeley Hot Pot restaurant simulation
enhancement
New feature or request
new domain
Proposal for a new domain
#144
opened Jan 16, 2026 by
binleiwang
Enhancement: Add vacation rental domain with host preference adaptation
enhancement
New feature or request
new domain
Proposal for a new domain
#142
opened Jan 15, 2026 by
wuTims
submission: Add amity-sigma-v3r results from Amity
leaderboard submission
#137
opened Jan 12, 2026 by
touchaponk
7 tasks done
feat: add new healthcare domain
enhancement
New feature or request
new domain
Proposal for a new domain
#136
opened Jan 11, 2026 by
eliot-gtn
Add gpt-oss-120b leaderboard submission
leaderboard submission
#134
opened Jan 9, 2026 by
radiangle
Create issue template and internal tracking
#124
opened Dec 26, 2025 by
christian-templeton
feat: add new hotel domain implementation with 42 tasks
new domain
Proposal for a new domain
#114
opened Dec 15, 2025 by
imaronsu
Submission: Claude Sonnet 4.5 with extended(interleaved) thinking and trajectories
#80
opened Nov 9, 2025 by
shivanibokadia-vl
Submission: Claude Sonnet 4.5 with extended(interleaved) thinking and trajectories
#73
opened Oct 31, 2025 by
Hrithik2212
Catch exceptions in validating agent_msg to handle failures gracefully
#43
opened Sep 10, 2025 by
GaganM
User Action Omission: Simulating Human Errors in Support Scenarios
enhancement
New feature or request
#41
opened Sep 9, 2025 by
alexyoung23j
Fix Problem 23: Clarify description about multiple bookings requirement
task enhancement
Task is correct but can be improved