En PEFTssant


En PEFTssant is a project that fine-tunes a large language model (Mistral 7B) to play chess using parameter-efficient fine-tuning (PEFT) techniques, specifically LoRA.

The project uses the KingBase chess dataset, converting it into a JSON file of individual chess moves. The data is processed and split into training and test sets, then used to fine-tune Mistral 7B with the HuggingFace Alignment Handbook as a template. A custom Jinja chat template formats each data point, and several training and LoRA hyperparameters (alpha, r, number of epochs, packing) were tuned.
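A minimal sketch of how a Jinja chat template might format one data point. The template text, the field names (`moves`, `next_move`), and the instruction wording are assumptions for illustration, not the project's actual template.

```python
# Hypothetical Jinja chat template for a single (move-prefix, next-move) pair.
# The real template shipped with the project may differ.
from jinja2 import Template

CHAT_TEMPLATE = Template(
    "<s>[INST] Given the chess game so far, play the next move: "
    "{{ moves | join(' ') }} [/INST] {{ next_move }}</s>"
)

sample = {"moves": ["e4", "e5", "Nf3"], "next_move": "Nc6"}
prompt = CHAT_TEMPLATE.render(**sample)
```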

For evaluation, a loop replays 100 games from the test set in parallel. The model suggests the next move given a prefix of moves, with a retry mechanism (up to 8 retries) for invalid moves. Evaluation was capped at 100 moves per game because runtime grows quadratically with game length (each prompt contains the full move history so far). Results were compared against Stockfish to measure how performance degrades as games progress.
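The retry loop above can be sketched with python-chess doing the legality checking. Here `suggest` stands in for a call into the fine-tuned model, and the exact retry policy is an assumption based on the description.

```python
import chess

def next_valid_move(board: chess.Board, suggest, max_retries: int = 8):
    """Query the model (via `suggest`) for a SAN move; retry on invalid output."""
    for _ in range(max_retries):
        candidate = suggest(board)  # hypothetical call into the fine-tuned LLM
        try:
            # parse_san raises ValueError on unparsable or illegal moves
            return board.parse_san(candidate)
        except ValueError:
            continue  # invalid suggestion: ask again
    return None  # gave up after max_retries
```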

Read the paper

GitHub

Technical Details

  • Model: Mistral 7B fine-tuned with LoRA (PEFT)
  • Dataset: KingBase chess dataset
  • Framework: HuggingFace Alignment Handbook
  • Dependencies: torch==2.2.2, flash-attn==2.5.2
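The tuned LoRA parameters mentioned above map onto a `peft` `LoraConfig`. The numeric values below are illustrative defaults only, not the project's tuned settings.

```python
from peft import LoraConfig

# Illustrative values; the project tuned alpha, r, epochs and packing
# separately, and its final settings are not reproduced here.
lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank adapters
    lora_alpha=32,                        # LoRA scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in Mistral 7B
    task_type="CAUSAL_LM",
)
```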

Usage

Step 1: Follow setup instructions from the HuggingFace Alignment Handbook.

Step 2: Run dataparser/chess/extract.py and dataparser/chess/split_data.py to create chessdata.json and the test/train prompts.
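A game-level train/test split along the lines of `split_data.py` might look like the sketch below; the function name, test fraction, and seed are assumptions.

```python
import random

def split_games(games, test_frac=0.1, seed=0):
    """Shuffle games and split them into train/test lists."""
    games = list(games)
    random.Random(seed).shuffle(games)      # deterministic shuffle
    cut = int(len(games) * (1 - test_frac))
    return games[:cut], games[cut:]
```

Splitting at the level of whole games (rather than individual moves) keeps every prefix of a test game out of training.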

Step 3: Edit and run scripts/launch.sh depending on your GPU setup and desired SFT recipe.

Step 4: Run scripts/generate_new.py or scripts/generate_retry.py to generate valid moves.

Step 5: To play a game against the bot, run scripts/play_chess.py. In mode 0 the bot plays White, in mode 1 it plays Black, and in mode 2 you alternate moves with the bot.
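The three modes can be sketched as a turn predicate. `is_bot_turn` is a hypothetical helper, and the mode-2 interpretation (the bot answers each of your moves) is an assumption based on the description above.

```python
import chess

def is_bot_turn(mode: int, board: chess.Board) -> bool:
    """Mode 0: bot plays White; mode 1: bot plays Black; mode 2: alternate."""
    if mode == 0:
        return board.turn == chess.WHITE
    if mode == 1:
        return board.turn == chess.BLACK
    return board.ply() % 2 == 1  # mode 2: bot replies to each of your moves
```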