Commit cc97b34

chore: Update branding from τ²-bench to τ-bench across the application (#70)
* chore: Update branding from τ²-bench to τ-bench across the application

- Changed title and references in index.html, App.jsx, DocsContent.jsx, Leaderboard.jsx, Results.jsx, and TrajectoryVisualizer.jsx to reflect the new branding.
- Updated submitting organization in submission.json from "Qwen" to "Alibaba".

* chore: Add mention of tau-bench in README

---------

Co-authored-by: Victor Barres <victor@sierra.ai>
1 parent 9df0df8 commit cc97b34

8 files changed: +24 -22 lines changed


README.md

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,8 @@ The τ²-bench leaderboard is now live at **[taubench.com](https://taubench.com)

 $\tau^2$-bench implements a simulation framework for evaluating customer service agents across various domains.

+**$\tau^2$-bench is the new iteration of the original $\tau$-bench**, featuring code fixes and an additional telecom domain.
+
 Each domain specifies:
 - a policy that the agent must follow
 - a set of tools that the agent can use

web/leaderboard/index.html

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 <meta charset="UTF-8" />
 <link rel="icon" type="image/png" href="./sierra-logo.png" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-<title>τ²-bench</title>
+<title>τ-bench</title>
 <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
 </head>
 <body>

web/leaderboard/public/submissions/qwen3-max_qwen_2024_09_23/submission.json

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
 "model_name": "Qwen3-Max",
 "model_organization": "Qwen",
-"submitting_organization": "Qwen",
+"submitting_organization": "Alibaba",
 "submission_date": "2024-09-23",
 "contact_info": {
 "email": "wzhao@cs.cornell.edu",

web/leaderboard/src/App.jsx

Lines changed: 9 additions & 9 deletions
@@ -93,7 +93,7 @@ function App() {
 <div className="nav-container">
 <div className="nav-logo">
 <div className="logo-main" onClick={() => navigateTo('home')}>
-<span className="tau-symbol">τ²</span>
+<span className="tau-symbol">τ</span>
 <span className="bench-text">-bench</span>
 </div>
 <a href="https://sierra.ai" target="_blank" rel="noopener noreferrer" className="logo-attribution">
@@ -120,8 +120,8 @@ function App() {
 <div className="notification-container">
 <span className="notification-badge">NEW</span>
 <span className="notification-text">
-We have updated to τ²-bench. If you are seeking the original τ-bench please
-<a href="https://github.com/sierra-research/tau-bench" target="_blank" rel="noopener noreferrer" className="notification-link"> click here</a>.
+τ-bench now supports telecom domain, introduced by the{' '}
+<a href="https://arxiv.org/abs/2506.07982" target="_blank" rel="noopener noreferrer" className="notification-link">τ²-bench paper</a>.
 </span>
 </div>
 </div>
@@ -135,19 +135,19 @@ function App() {
 <div className="hero-content-vertical">
 <div className="hero-title-section">
 <h1 className="hero-main-title">
-<span className="tau-symbol">τ²</span>
+<span className="tau-symbol">τ</span>
 <span className="bench-text">-bench</span>
 </h1>
 </div>

 <div className="hero-image-section">
-<img src={`${import.meta.env.BASE_URL}traj.png`} alt="Sample τ²-bench Trajectories" className="trajectory-image" />
+<img src={`${import.meta.env.BASE_URL}traj.png`} alt="Sample τ-bench Trajectories" className="trajectory-image" />
 </div>

 <div className="hero-description-section">
 <p className="hero-description">
 Benchmarking AI agents in collaborative real-world scenarios.
-τ²-bench challenges agents to coordinate, guide, and assist users
+τ-bench challenges agents to coordinate, guide, and assist users
 in achieving shared objectives across complex enterprise domains.
 </p>
 <div className="hero-actions">
@@ -189,23 +189,23 @@ function App() {
 <a href="https://sierra.ai/resources/research/tau-squared-bench" target="_blank" rel="noopener noreferrer" className="news-item">
 <div className="news-icon">🎉</div>
 <div className="news-text">
-<strong>τ²-bench leaderboard released: Track model performance and submit your results across retail, airline, and telecom domains</strong>
+<strong>τ-bench leaderboard released: Track model performance and submit your results across retail, airline, and telecom domains</strong>
 <span>October 3, 2025</span>
 </div>
 <div className="news-arrow"></div>
 </a>
 <a href="https://openai.com/gpt-5" target="_blank" rel="noopener noreferrer" className="news-item">
 <div className="news-icon">🏆</div>
 <div className="news-text">
-<strong>GPT-5 achieves state-of-the-art performance on τ²-bench, setting new records with 96% on telecom, 82% on retail, and 63% on airline</strong>
+<strong>GPT-5 achieves state-of-the-art performance on τ-bench, setting new records with 96% on telecom, 82% on retail, and 63% on airline</strong>
 <span>January 15, 2025</span>
 </div>
 <div className="news-arrow"></div>
 </a>
 <a href="https://sierra.ai/resources/research/tau-squared-bench" target="_blank" rel="noopener noreferrer" className="news-item">
 <div className="news-icon">🚀</div>
 <div className="news-text">
-<strong>τ²-bench launched: Evaluating agents in dual-control environments, testing coordination and collaboration with tool-accessing user simulators</strong>
+<strong>τ-bench launched: Evaluating agents in dual-control environments, testing coordination and collaboration with tool-accessing user simulators</strong>
 <span>June 11, 2025</span>
 </div>
 <div className="news-arrow"></div>

web/leaderboard/src/components/DocsContent.jsx

Lines changed: 1 addition & 1 deletion
@@ -197,7 +197,7 @@ const DocsContent = ({ domain }) => {
 <div className="docs-note">
 <h3>📖 About This Documentation</h3>
 <p>
-This documentation represents the agent policy and domain specifications for the {domain} domain in τ²-bench.
+This documentation represents the agent policy and domain specifications for the {domain} domain in τ-bench.
 Agents are evaluated based on their adherence to these policies and their ability to successfully complete tasks within this domain.
 </p>
 <div className="docs-links">

web/leaderboard/src/components/Leaderboard.jsx

Lines changed: 6 additions & 6 deletions
@@ -400,7 +400,7 @@ const Leaderboard = () => {
 if (isLoading) {
 return (
 <div className="leaderboard-container">
-<h2 className="leaderboard-title">τ²-bench Leaderboard</h2>
+<h2 className="leaderboard-title">τ-bench Leaderboard</h2>
 <div className="loading-state">
 <div className="loading-spinner"></div>
 <p>Loading leaderboard data...</p>
@@ -412,7 +412,7 @@ const Leaderboard = () => {
 if (loadError) {
 return (
 <div className="leaderboard-container">
-<h2 className="leaderboard-title">τ²-bench Leaderboard</h2>
+<h2 className="leaderboard-title">τ-bench Leaderboard</h2>
 <div className="error-state">
 <p>Error loading leaderboard data: {loadError}</p>
 <button onClick={loadSubmissionData} className="retry-button">
@@ -426,7 +426,7 @@ const Leaderboard = () => {
 if (Object.keys(passKData).length === 0) {
 return (
 <div className="leaderboard-container">
-<h2 className="leaderboard-title">τ²-bench Leaderboard</h2>
+<h2 className="leaderboard-title">τ-bench Leaderboard</h2>
 <div className="empty-state">
 <p>No leaderboard data available.</p>
 </div>
@@ -436,7 +436,7 @@ const Leaderboard = () => {

 return (
 <div className="leaderboard-container">
-<h2 className="leaderboard-title">τ²-bench Leaderboard</h2>
+<h2 className="leaderboard-title">τ-bench Leaderboard</h2>

 {/* Combined Controls Row */}
 <div className="leaderboard-controls">
@@ -669,8 +669,8 @@ const Leaderboard = () => {
 {model.organization === 'DeepSeek' && (
 <img src={`${import.meta.env.BASE_URL}DeepSeek_logo_icon.png`} alt="DeepSeek" className="logo-img" />
 )}
-{model.organization === 'Alibaba' && (
-<img src={`${import.meta.env.BASE_URL}qwen-color.png`} alt="Alibaba" className="logo-img" />
+{(model.organization === 'Alibaba' || model.organization === 'Qwen') && (
+<img src={`${import.meta.env.BASE_URL}qwen-color.png`} alt="Qwen" className="logo-img" />
 )}
 {model.organization === 'Google' && (
 <img src={`${import.meta.env.BASE_URL}Google__G__logo.svg.png`} alt="Google" className="logo-img" />
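Review note: the last Leaderboard.jsx hunk adds an `'Alibaba' || 'Qwen'` alias to a chain of per-organization conditionals. As a minimal sketch only (not part of this commit), the same mapping could be kept in a lookup table so that aliases become extra entries instead of extra conditions. The names `ORG_LOGOS` and `logoFor` are hypothetical, and `baseUrl` stands in for `import.meta.env.BASE_URL`; the file names, alt text, and the Alibaba/Qwen aliasing are taken from the diff.

```javascript
// Hypothetical sketch: map organization names to logo assets.
// File names and alt text come from the diff; the table and helper are assumed.
const ORG_LOGOS = new Map([
  ['DeepSeek', { file: 'DeepSeek_logo_icon.png', alt: 'DeepSeek' }],
  ['Alibaba', { file: 'qwen-color.png', alt: 'Qwen' }],
  ['Qwen', { file: 'qwen-color.png', alt: 'Qwen' }],
  ['Google', { file: 'Google__G__logo.svg.png', alt: 'Google' }],
]);

// Returns { src, alt } for a known organization, or null for an unknown one.
function logoFor(organization, baseUrl = '/') {
  const entry = ORG_LOGOS.get(organization);
  return entry ? { src: `${baseUrl}${entry.file}`, alt: entry.alt } : null;
}
```

In the component, this might be consumed as `const logo = logoFor(model.organization, import.meta.env.BASE_URL)` and rendered as a single `<img>` when `logo` is non-null.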

web/leaderboard/src/components/Results.jsx

Lines changed: 2 additions & 2 deletions
@@ -102,7 +102,7 @@ const Results = () => {
 <div className="contributions-section">
 <h2>Key Contributions</h2>
 <p className="results-subtitle" style={{textAlign: 'center', marginBottom: '32px'}}>
-τ²-bench introduces four fundamental advances in agent evaluation methodology
+τ-bench introduces four fundamental advances in agent evaluation methodology
 </p>
 <div className="contributions-grid">
 <div className="contribution-card">
@@ -628,7 +628,7 @@ const Results = () => {
 <div className="container">
 {/* Header */}
 <div className="results-header">
-<h1>τ²-Bench Research Analysis</h1>
+<h1>τ-Bench Research Analysis</h1>
 <p className="results-subtitle">
 Comprehensive evaluation of conversational agents in dual-control collaborative environments.
 Detailed analysis of coordination challenges, reasoning bottlenecks, and simulation quality.

web/leaderboard/src/components/TrajectoryVisualizer.jsx

Lines changed: 2 additions & 2 deletions
@@ -303,9 +303,9 @@ const TrajectoryVisualizer = () => {
 return (
 <div className="trajectory-visualizer">
 <div className="visualizer-header">
-<h2>τ²-bench Visualizer</h2>
+<h2>τ-bench Visualizer</h2>
 <p className="visualizer-description">
-Explore τ²-bench dataset: view conversation trajectories showing AI agent interactions with users,
+Explore τ-bench dataset: view conversation trajectories showing AI agent interactions with users,
 or examine the underlying task definitions that drive these conversations across airline, retail, and telecom domains.
 </p>
