Post by Braintrust

Name: Run a full chess eval without writing a single line of code using the Braintrust CLI. - Take a CSV
Uploaded: 2026-06-29T16:23:39.768Z
Channel: Braintrust


Run a full chess eval without writing a single line of code using the Braintrust CLI.
- Take a CSV of chess puzzles and make a dataset.
- Write a prompt to solve mate in 2 puzzles, and upload it to the project.
- Then write a scorer that compares the output to the expected answer.

The eval found that GPT‑5 with no reasoning scored about 25% on the chess puzzles, and with low reasoning it scored about 15%.

Learn more in the Braintrust docs → https://lnkd.in/eDXrETtR
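
For readers wondering what "compares the output to the expected answer" could look like in practice, here is a minimal sketch in plain Python. It is not taken from the video or from the Braintrust API: the function names `normalize_move` and `score_first_move` are hypothetical, and it assumes both the model output and the expected answer are a single move in standard algebraic notation (for example `Qh7+`).

```python
import re


def normalize_move(move: str) -> str:
    """Reduce a move string to a comparable form (assumed SAN input)."""
    move = move.strip()
    # Drop a leading move number such as "1." or "1..." if present.
    move = re.sub(r"^\d+\.+\s*", "", move)
    # Drop trailing check, mate, and annotation marks: + # ! ?
    return move.rstrip("+#!?")


def score_first_move(output: str, expected: str) -> float:
    """Return 1.0 on an exact match after normalization, else 0.0.

    Comparison stays case-sensitive because case is meaningful in SAN
    (for example, "bxc3" is a pawn capture and "Bxc3" a bishop capture).
    """
    return 1.0 if normalize_move(output) == normalize_move(expected) else 0.0
```

Averaging this score across all puzzles in the dataset gives the kind of percentage quoted above. A stricter scorer could require the full mating line instead of only the first move.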
