Jonathan Rahn | AI Lab Lead, Drees & Sommer | GitHub | HuggingFace
This work explores transformer-based strategic reasoning through chess as a testbed, demonstrating that language models can develop sophisticated game-playing capabilities without traditional search algorithms. In collaboration with LAION, we’ve developed a progression of models that challenge fundamental assumptions about how AI systems learn strategic thinking.
The core hypothesis: complex strategic reasoning can emerge from next-token prediction when models are trained on appropriately structured strategic data.
Active development integrates Reinforcement Learning with Verifiable Rewards (RLVR) via Group Relative Policy Optimization (GRPO) for enhanced reasoning capabilities.
- Repo (Transformers & TRL): jorahn/RookWorld-TRL
- Repo (PyTorch): jorahn/RookWorld-RLVR
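A minimal GRPO sketch with TRL against a verifiable reward (move legality checked with python-chess). The `P: <FEN>` prompt layout, the assumption that completions end with a UCI move, and the hub id are illustrative placeholders, not the actual training setup in the repos above.

```python
import chess
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

def legal_move_reward(prompts, completions, **kwargs):
    """Verifiable reward: 1.0 if the completion ends with a legal UCI move
    for the prompted position, else 0.0. The "P: <FEN>" prompt layout and
    move-as-last-token convention are assumptions for this sketch."""
    rewards = []
    for prompt, completion in zip(prompts, completions):
        board = chess.Board(prompt.removeprefix("P:").strip())
        tokens = completion.split()
        if not tokens:
            rewards.append(0.0)
            continue
        try:
            move = chess.Move.from_uci(tokens[-1])
            rewards.append(1.0 if move in board.legal_moves else 0.0)
        except ValueError:
            rewards.append(0.0)
    return rewards

prompts = [
    "P: " + chess.STARTING_FEN,
    "P: rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq - 0 1",
]
train_dataset = Dataset.from_dict({"prompt": prompts})

args = GRPOConfig(
    output_dir="rookworld-grpo",
    per_device_train_batch_size=8,
    num_generations=8,          # group size for relative advantage estimation
    max_completion_length=64,
)
trainer = GRPOTrainer(
    model="jrahn/RookWorld-LM-124M",  # hypothetical hub id; substitute the released checkpoint
    reward_funcs=legal_move_reward,
    args=args,
    train_dataset=train_dataset,
)
trainer.train()
```

Because move legality (and later, game outcomes) can be checked programmatically, the reward needs no learned reward model, which is what makes the RLVR setup attractive for chess.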
Key breakthrough: Unified chess policy and world model in a single transformer architecture.
Post: ROOK: REASONING OVER ORGANIZED KNOWLEDGE
- Collaboration: Jenia Jitsev (LAION/JSC), Qi Sun (Tokyo Tech/Sakana AI)
- Multi-task Performance:
  - 32.1% Checkmate-in-One accuracy (vs. ChessGPT-Base: 26.5%)
  - 99.9% environment simulation accuracy
  - 26.2% overall action accuracy
- Model: RookWorld-LM 124M
- Dataset: rookworld_7m
- Significance: Enables closed-loop self-play without external engines
- Interactive Demo: RookWorld Space
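A minimal closed-loop sketch of the unified model acting alternately as policy and environment. The `P:`/`A:` prompt prefixes and the hub id are assumptions for illustration; the model card documents the exact schema.

```python
from transformers import pipeline

# One transformer serves both roles; no external chess engine is needed
# to roll out self-play games.
generate = pipeline("text-generation", model="jrahn/RookWorld-LM-124M")  # hypothetical hub id

def policy_move(fen: str) -> str:
    # Policy task: produce a reasoning trace and a move for the side to play.
    out = generate(f"P: {fen}", max_new_tokens=128, do_sample=False,
                   return_full_text=False)
    return out[0]["generated_text"]

def simulate_environment(fen: str, move: str) -> str:
    # World-model (arbiter) task: predict the resulting position and game status
    # for a (state, action) pair.
    out = generate(f"A: {fen}+{move}", max_new_tokens=128, do_sample=False,
                   return_full_text=False)
    return out[0]["generated_text"]
```

Alternating these two calls yields self-play rollouts entirely inside the model, which is the basis for the closed-loop training mentioned above.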
Implementation of chain-of-thought reasoning for chess, structured as position analysis → candidate evaluation → move selection (see the annotation sketch after the list below).
- Dataset: rook_40m (6B tokens, generated on Tsubame 4.0)
- Architecture: GPT-2 with custom chess tokenization
- Performance: 22.2% action accuracy with comprehensive reasoning traces
- Technical Details: LAION Research Note
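As a rough illustration of how such reasoning traces can be produced with an engine annotator, the sketch below scores candidate moves with Stockfish via python-chess and flattens candidates, evaluations, and the selected move into one training string. The field layout is an assumption for illustration, not the released ROOK schema.

```python
import chess
import chess.engine

def build_cot_sample(fen: str, engine_path: str = "stockfish") -> str:
    """Build one illustrative chain-of-thought training string for a position."""
    board = chess.Board(fen)
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        # Candidate evaluation: score the top 5 moves with a fixed search budget.
        infos = engine.analyse(board, chess.engine.Limit(depth=20), multipv=5)
    candidates = [info["pv"][0].uci() for info in infos]
    evals = [info["score"].pov(board.turn).score(mate_score=10000) / 100
             for info in infos]
    best = candidates[0]
    # Position -> candidate moves -> evaluations -> selected move, flattened
    # into a single autoregressive target string (layout is assumed).
    return (f"P: {fen} "
            f"M: {' '.join(candidates)} "
            f"E: {' '.join(f'{e:.2f}' for e in evals)} "
            f"B: {best}")
```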
Reproduction of Google DeepMind’s “Grandmaster-Level Chess Without Search” methodology using a LLaMA-based decoder.
- Performance: 49% action accuracy, 57% on Checkmate-in-One
- Achievement: Demonstrated searchless chess AI feasibility with minimal parameters
- Model: Available on HuggingFace
Initial exploration using BERT-based position evaluation with custom FEN encoders. Established baseline performance and identified key challenges in chess representation for transformer architectures.
- Dataset: yolochess_lichess-elite_2211
- Architecture: DeBERTa v2 with FEN tokenization
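A minimal sketch of such an encoder-classifier over character-level FEN tokens. The vocabulary, model sizes, and the move-class head are illustrative assumptions, not the yolochess training configuration.

```python
import torch
from transformers import DebertaV2Config, DebertaV2ForSequenceClassification

# Crude character-level FEN vocabulary (id 0 reserved for padding).
FEN_ALPHABET = sorted(set("pnbrqkPNBRQK/12345678 wKQkqabcdefgh-0123456789"))
char2id = {ch: i + 1 for i, ch in enumerate(FEN_ALPHABET)}

def encode_fen(fen: str, max_len: int = 90) -> torch.Tensor:
    ids = [char2id[c] for c in fen][:max_len]
    ids += [0] * (max_len - len(ids))
    return torch.tensor([ids])

config = DebertaV2Config(
    vocab_size=len(char2id) + 1,
    hidden_size=256,
    num_hidden_layers=6,
    num_attention_heads=8,
    num_labels=1968,  # e.g. one class per possible UCI move (assumed label space)
)
model = DebertaV2ForSequenceClassification(config)

fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
logits = model(input_ids=encode_fen(fen)).logits
predicted_move_class = int(logits.argmax(dim=-1))
```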
- Unified world modeling: Simultaneous policy and environment simulation in transformers
- Strategic tokenization: Custom representations for structured game states
- Multi-task scaling: Consistent performance improvements with unified training objectives
- Large-scale annotation: 40M+ positions annotated with Stockfish 16.1 on supercomputing infrastructure
- Multi-format datasets: Support for classification, autoregressive, and multi-task learning
- Reproducible pipelines: Full data generation code and methodology documentation
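For example, the datasets can be streamed directly from the Hugging Face Hub; the hub paths below are placeholders, the exact ids are linked in the project sections above.

```python
from datasets import load_dataset

# Hypothetical hub paths; use the dataset ids linked above.
rook = load_dataset("jrahn/rook_40m", split="train", streaming=True)            # autoregressive CoT text
rookworld = load_dataset("jrahn/rookworld_7m", split="train", streaming=True)   # mixed policy + arbiter samples

print(next(iter(rook)))  # inspect one annotated sample
```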
All models, datasets, and code are publicly available, contributing to the democratization of strategic AI research.
Background spans neuroinformatics (University of Lübeck), games industry applications, business economics & management (Witten/Herdecke University, IPADE Mexico DF), and AI/ML consulting. Active contributor to the HuggingFace ecosystem (transformers, datasets, evaluate) and open-source frameworks including keras-rl and custom implementations such as keras-wide-n-deep. Current work at Drees & Sommer focuses on building the AI Lab and exploring applications in construction and real estate optimization.
The RookWorld results suggest that:
- Search-free strategic AI is viable with appropriate training data
- Unified architectures can efficiently handle multiple strategic reasoning tasks
- Chain-of-thought training improves both performance and interpretability
- Language model paradigms apply effectively to structured strategic domains
These findings have implications beyond chess for any domain requiring sequential decision-making under uncertainty.