En PEFTssant


En PEFTssant is a project that fine-tunes a large language model (Mistral 7B) to play chess using parameter-efficient fine-tuning (PEFT) techniques, specifically LoRA.

The project uses the KingBase chess dataset, converting it into a JSON file of individual chess moves. The data is processed and split into training and test sets, then used to fine-tune Mistral 7B with the HuggingFace Alignment Handbook as a template. A custom Jinja chat template formats each data point, and several training and LoRA hyperparameters (alpha, r, number of epochs, packing) were tuned.
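A minimal sketch of how a Jinja chat template might format one data point. The template text, the field names (`moves`, `next_move`), and the instruction wording are assumptions for illustration, not the project's actual template.

```python
# Hypothetical Jinja chat template for a single (move-prefix, next-move) pair.
# The real template shipped with the project may differ.
from jinja2 import Template

CHAT_TEMPLATE = Template(
    "<s>[INST] Given the chess game so far, play the next move: "
    "{{ moves | join(' ') }} [/INST] {{ next_move }}</s>"
)

sample = {"moves": ["e4", "e5", "Nf3"], "next_move": "Nc6"}
prompt = CHAT_TEMPLATE.render(**sample)
```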

For evaluation, a loop replays 100 games from the test set in parallel. The model suggests the next move given a prefix of moves, with a retry mechanism (up to 8 retries) for invalid moves. Evaluation was capped at 100 moves per game because runtime grows quadratically with game length (each prompt contains the full move history so far). Results were compared against Stockfish to measure how performance degrades as games progress.
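The retry loop above can be sketched with python-chess doing the legality checking. Here `suggest` stands in for a call into the fine-tuned model, and the exact retry policy is an assumption based on the description.

```python
import chess

def next_valid_move(board: chess.Board, suggest, max_retries: int = 8):
    """Query the model (via `suggest`) for a SAN move; retry on invalid output."""
    for _ in range(max_retries):
        candidate = suggest(board)  # hypothetical call into the fine-tuned LLM
        try:
            # parse_san raises ValueError on unparsable or illegal moves
            return board.parse_san(candidate)
        except ValueError:
            continue  # invalid suggestion: ask again
    return None  # gave up after max_retries
```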

Read the paper

GitHub

Technical Details

  • Model: Mistral 7B fine-tuned with LoRA (PEFT)
  • Dataset: KingBase chess dataset
  • Framework: HuggingFace Alignment Handbook
  • Dependencies: torch==2.2.2, flash-attn==2.5.2
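The tuned LoRA parameters mentioned above map onto a `peft` `LoraConfig`. The numeric values below are illustrative defaults only, not the project's tuned settings.

```python
from peft import LoraConfig

# Illustrative values; the project tuned alpha, r, epochs and packing
# separately, and its final settings are not reproduced here.
lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank adapters
    lora_alpha=32,                        # LoRA scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in Mistral 7B
    task_type="CAUSAL_LM",
)
```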

Usage

Step 1: Follow setup instructions from the HuggingFace Alignment Handbook.

Step 2: Run dataparser/chess/extract.py and dataparser/chess/split_data.py to create chessdata.json and the test/train prompts.
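A game-level train/test split along the lines of `split_data.py` might look like the sketch below; the function name, test fraction, and seed are assumptions.

```python
import random

def split_games(games, test_frac=0.1, seed=0):
    """Shuffle games and split them into train/test lists."""
    games = list(games)
    random.Random(seed).shuffle(games)      # deterministic shuffle
    cut = int(len(games) * (1 - test_frac))
    return games[:cut], games[cut:]
```

Splitting at the level of whole games (rather than individual moves) keeps every prefix of a test game out of training.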

Step 3: Edit and run scripts/launch.sh depending on your GPU setup and desired SFT recipe.

Step 4: Run scripts/generate_new.py or scripts/generate_retry.py to generate valid moves.

Step 5: To play a game against the bot, run scripts/play_chess.py. In mode 0 the bot plays White, in mode 1 it plays Black, and in mode 2 you alternate moves with the bot.
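The three modes can be sketched as a turn predicate. `is_bot_turn` is a hypothetical helper, and the mode-2 interpretation (the bot answers each of your moves) is an assumption based on the description above.

```python
import chess

def is_bot_turn(mode: int, board: chess.Board) -> bool:
    """Mode 0: bot plays White; mode 1: bot plays Black; mode 2: alternate."""
    if mode == 0:
        return board.turn == chess.WHITE
    if mode == 1:
        return board.turn == chess.BLACK
    return board.ply() % 2 == 1  # mode 2: bot replies to each of your moves
```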