Research Overview
This work explores transformer-based strategic reasoning through chess as a testbed, demonstrating that language models can develop sophisticated game-playing capabilities without traditional search algorithms. In collaboration with LAION, we've developed a progression of models that challenge fundamental assumptions about how AI systems learn strategic thinking.
Core hypothesis: Complex strategic reasoning can emerge from next-token prediction when models are trained on appropriately structured strategic data.
The ROOK Project Evolution
RookWorld-RLVR (2025) - RL Fine-Tuning with Verifiable Rewards (Current)
Active development integrating GRPO (Group Relative Policy Optimization) with verifiable rewards (RLVR) for enhanced reasoning capabilities.
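Chess makes reward verification cheap: a sampled move either is or isn't legal in the given position. A minimal sketch of the GRPO idea under that assumption (the reward rule and function names here are illustrative, not the project's actual implementation):

```python
from statistics import mean, pstdev

def verifiable_reward(completion: str, legal_moves: set) -> float:
    """Binary verifiable reward: 1.0 if the sampled move is legal, else 0.0."""
    return 1.0 if completion.strip() in legal_moves else 0.0

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantage: normalize each reward within its sampling group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions sampled for the same position prompt (illustrative):
legal = {"e2e4", "d2d4", "g1f3"}
samples = ["e2e4", "e7e5", "d2d4", "a1a9"]
rewards = [verifiable_reward(s, legal) for s in samples]
advantages = grpo_advantages(rewards)  # legal moves get positive advantage
```

GRPO needs no learned value network: advantages come from comparing each completion against the mean reward of its own sampling group.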
RookWorld-LM (2024) - Unified Agent+Environment
124M params: Unified chess policy and world model in a single transformer architecture.
Post: ROOK: REASONING OVER ORGANIZED KNOWLEDGE
Multi-task Performance:
- 🏆 32.1% Checkmate-in-One accuracy - outperforms ChessGPT-Base (26.5%) with 24x fewer parameters (124M vs 3B, Feng et al. NeurIPS'23)
- 99.9% environment simulation accuracy
- 26.2% overall action accuracy
Browser-based inference using ONNX Runtime (WASM). Downloads 285MB model for client-side generation.
Significance: Enables closed-loop self-play without external engines
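The closed loop above can be sketched as a single model alternating between two roles; the `P:`/`A:` prompt prefixes and the scripted `generate` stand-in are illustrative assumptions, not RookWorld-LM's actual prompt schema:

```python
def self_play(generate, start_state, max_plies=2):
    """Alternate policy ("P:") and environment ("A:") calls on one model."""
    state, history = start_state, []
    for _ in range(max_plies):
        move = generate(f"P: {state}")          # policy role: propose a move
        state = generate(f"A: {state} {move}")  # environment role: next position
        history.append((move, state))
    return history

# Scripted stand-in for transformer inference so the loop is runnable:
script = {
    "P: start": "e2e4",
    "A: start e2e4": "pos1",
    "P: pos1": "e7e5",
    "A: pos1 e7e5": "pos2",
}
trace = self_play(script.__getitem__, "start")
```

Because one transformer plays both roles, the loop needs no external engine to propose moves or to validate and advance the game state.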
ROOK-LM (2024) - Chain-of-Thought Reasoning
124M params: Implementation of chain-of-thought reasoning traces for chess, following position analysis → candidate evaluation → move selection.
- Dataset: rook_40m (6B tokens, generated on Tsubame 4.0)
- Architecture: GPT-2 with custom chess tokenization
- Performance: 22.2% action accuracy with comprehensive reasoning traces
- Technical Details: LAION Research Note
Browser-based inference using ONNX Runtime (WASM). Downloads 285MB model for client-side generation.
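A trace following analysis → evaluation → selection can be consumed programmatically; this parser assumes illustrative `M:`/`E:`/`B:` field markers, not the dataset's actual serialization:

```python
import re

def parse_trace(trace: str) -> dict:
    """Split a reasoning trace into candidate moves, evaluations, and the chosen move."""
    m = re.match(r"M:\s*(.+?)\s*E:\s*(.+?)\s*B:\s*(\S+)", trace)
    if m is None:
        raise ValueError("malformed trace")
    cands, evals, best = m.groups()
    return {
        "candidates": cands.split(),
        "evals": [float(x) for x in evals.split()],
        "best": best,
    }

out = parse_trace("M: e2e4 d2d4 g1f3 E: 0.3 0.2 0.2 B: e2e4")
```

Structured traces like this make the model's intermediate reasoning both trainable and inspectable, which is what links chain-of-thought training to interpretability.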
ROOK-CLF (2024) - Decoder-based Behavioral Cloning
9M params: Reproduction of Google DeepMind's "Grandmaster-Level Chess Without Search" methodology using a LLaMA-based decoder.
- Performance: 49% action accuracy, 57% on Checkmate-in-One
- Achievement: Demonstrated searchless chess AI feasibility with minimal parameters
- Model: Available on HuggingFace
Browser-based inference using ONNX Runtime (WASM/WebGPU). Downloads 9MB quantized model for client-side generation.
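Behavioral cloning here frames move prediction as classification: every move string maps to a fixed class index and the model predicts that index from the position. A sketch of such an action vocabulary, using a simplified 64×63 from/to enumeration (ignoring promotions; not the paper's exact action space):

```python
# Enumerate all (from-square, to-square) pairs as a fixed class vocabulary.
FILES, RANKS = "abcdefgh", "12345678"
SQUARES = [f + r for r in RANKS for f in FILES]
ACTIONS = [src + dst for src in SQUARES for dst in SQUARES if src != dst]
ACTION_TO_ID = {a: i for i, a in enumerate(ACTIONS)}

def encode_action(uci: str) -> int:
    """Map a UCI move string to its class index."""
    return ACTION_TO_ID[uci]

def decode_action(idx: int) -> str:
    """Inverse mapping: class index back to UCI move string."""
    return ACTIONS[idx]
```

With a fixed action vocabulary, the network reduces to a standard classifier head over a few thousand classes, which is why such small parameter counts suffice.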
LAION Strategic Game Dataset (2023) - Dataset Engineering
Contributed to the LAION Strategic Game Dataset project, a community effort to enhance AI models' strategic planning capabilities through game-based synthetic datasets. Developed the chess-to-text transformation tools used for dataset generation.
- Contribution: Chess dataset generation and transformation pipeline
- Code: chess-to-text repository
- Project Scale: 3.2 billion chess games, 608 billion moves via Stockfish self-play
- Impact: Foundation work that evolved into the ROOK project research
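The core of a chess-to-text pipeline is serializing engine-annotated positions into plain-text records a language model can train on. A minimal sketch; the field names and layout are illustrative, not the repository's actual format:

```python
def to_text(fen: str, cp_eval: int, best_move: str) -> str:
    """Serialize one engine-annotated position as a plain-text training line.

    cp_eval is the engine evaluation in centipawns from the side to move.
    """
    return f"FEN: {fen} EVAL: {cp_eval:+d} BEST: {best_move}"

line = to_text(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    35,
    "e2e4",
)
```

Keeping records as flat text lines lets the same dataset feed autoregressive, classification, and multi-task training setups without schema changes.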
YoloChess (2022) - Encoder-based Behavioral Cloning
87M params (custom DeBERTaV2-base, vocab size 500): Initial exploration using BERT-style position evaluation with a custom FEN encoder. Established baseline performance and identified key challenges in representing chess positions for transformer architectures.
- Dataset: yolochess_lichess-elite_2211
- Architecture: DeBERTa v2 with FEN tokenization
- W&B Logs: View W&B Training Logs
Technical Contributions
Novel Architectures
- Unified world modeling: Simultaneous policy and environment simulation in transformers
- Strategic tokenization: Custom representations for structured game states
- Multi-task scaling: Consistent performance improvements with unified training objectives
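Strategic tokenization for chess can be as compact as a character-level vocabulary over FEN symbols, which is one way the small custom vocabularies above become feasible. A minimal round-trip sketch (the character inventory is illustrative, not any model's actual vocabulary):

```python
# Character inventory covering piece letters, digits, files, and FEN syntax:
CHARS = "pnbrqkPNBRQK/12345678 abcdefghw-09"
STOI = {c: i for i, c in enumerate(dict.fromkeys(CHARS))}  # dedupe, then index
ITOS = {i: c for c, i in STOI.items()}

def tokenize(fen: str) -> list:
    """Encode a FEN string as a list of token ids."""
    return [STOI[c] for c in fen]

def detokenize(ids) -> str:
    """Decode token ids back to the FEN string."""
    return "".join(ITOS[i] for i in ids)

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
ids = tokenize(start)
```

A domain-specific vocabulary this small wastes no embedding capacity on natural-language subwords and guarantees every board state tokenizes losslessly.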
Dataset Engineering
- Large-scale annotation: 40M+ positions annotated with Stockfish 16.1 on supercomputing infrastructure
- Multi-format datasets: Support for classification, autoregressive, and multi-task learning
- Reproducible pipelines: Full data generation code and methodology documentation
Open Science Impact
All models, datasets, and code are publicly available, contributing to the democratization of strategic AI research.
Research Context
Background spans neuro-informatics (University of LĂĽbeck), games industry applications, business economics & management (Witten/Herdecke University, IPADE Mexico DF), and AI/ML consulting. Active contributor to the HuggingFace ecosystem (transformers, datasets, evaluate) and open-source frameworks including keras-rl and custom implementations such as keras-wide-n-deep. Current work at Drees & Sommer, building the AI Lab and exploring applications in construction and real estate optimization.
Research Implications
The RookWorld results suggest that:
- Search-free strategic AI is viable with appropriate training data
- Unified architectures can efficiently handle multiple strategic reasoning tasks
- Chain-of-thought training improves both performance and interpretability
- Language model paradigms apply effectively to structured strategic domains
These findings have implications beyond chess for any domain requiring sequential decision-making under complex constraints.