En PEFTssant
Getting Started
En PEFTssant fine-tunes a large language model (Mistral 7B) to play chess using parameter-efficient fine-tuning (PEFT), specifically LoRA.
The project uses the Kingbase chess dataset, converting it into a JSON file of individual chess moves. The data is processed, split into training and test sets, and used to fine-tune Mistral 7B with the HuggingFace Alignment Handbook as a template. A custom Jinja chat template was applied to each data point, and several LoRA and training hyperparameters (alpha, r, epochs, packing) were tuned.
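The chat template itself is not reproduced in this README; a minimal Jinja sketch in the style of Mistral's instruction format (hypothetical, for illustration only) could look like:

```jinja
{# Hypothetical template: wraps each data point in Mistral-style [INST] tags #}
{% for message in messages %}
{%- if message['role'] == 'user' -%}
<s>[INST] {{ message['content'] }} [/INST]
{%- else -%}
{{ message['content'] }}</s>
{%- endif -%}
{% endfor %}
```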
For evaluation, a loop simulates 100 games from the test set in parallel. The model suggests the next move given a prefix of moves, with a retry mechanism (up to 8 retries) for invalid moves. Each game is capped at 100 moves because execution time grows quadratically with game length. Results were compared against Stockfish to measure performance degradation per move.
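The game-simulation loop can be sketched as follows. This is a minimal stand-alone sketch, not the project's actual code: `suggest_move` and `is_valid` are hypothetical stubs standing in for the fine-tuned model and a legality check, while the move cap and retry count come from the description above.

```python
import random

MAX_MOVES = 100    # evaluation cap per game (from the README)
MAX_RETRIES = 8    # retries allowed when the model emits an invalid move

def suggest_move(prefix):
    """Hypothetical stub for the fine-tuned model's next-move suggestion."""
    return random.choice(["e4", "e5", "Nf3", "Nc6", "??"])

def is_valid(move, prefix):
    """Hypothetical stub for the legality check ('??' models a bad move)."""
    return move != "??"

def play_game(seed=0):
    """Simulate one capped game, retrying invalid moves up to MAX_RETRIES."""
    random.seed(seed)
    moves = []
    while len(moves) < MAX_MOVES:
        for _ in range(MAX_RETRIES):
            candidate = suggest_move(moves)
            if is_valid(candidate, moves):
                moves.append(candidate)
                break
        else:
            break  # all retries invalid: abandon the game early
    return moves

game = play_game()
print(len(game))
```

In the real pipeline, each accepted move would also be scored against Stockfish to measure per-move degradation.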
Technical Details
- Model: Mistral 7B fine-tuned with LoRA (PEFT)
- Dataset: Kingbase chess dataset
- Framework: HuggingFace Alignment Handbook
- Dependencies: torch==2.2.2, flash-attn==2.5.2
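The repository's exact training recipe is not shown here; a hedged sketch of an Alignment Handbook-style LoRA SFT config, with purely hypothetical values for the tuned parameters mentioned above (alpha, r, epochs, packing), might look like:

```yaml
# Hypothetical values; key names follow the Alignment Handbook's LoRA recipes
model_name_or_path: mistralai/Mistral-7B-v0.1
use_peft: true
lora_r: 16           # LoRA rank (r)
lora_alpha: 32       # LoRA alpha
num_train_epochs: 3  # epochs
packing: true        # pack short examples into one training sequence
```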
Usage
Step 1: Follow setup instructions from the HuggingFace Alignment Handbook.
Step 2: Run dataparser/chess/extract.py and dataparser/chess/split_data.py to create chessdata.json and the test/train prompts.
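The extraction step can be pictured with a simplified sketch. This is not the actual `extract.py`; it only illustrates the idea of stripping PGN move numbers and result tokens to get a JSON record of bare moves, assuming space-separated movetext.

```python
import json
import re

def extract_moves(pgn_movetext):
    """Strip move numbers (e.g. '1.') and result tokens from PGN movetext,
    leaving only the SAN moves. A simplified sketch of what an extractor
    like dataparser/chess/extract.py might do."""
    tokens = pgn_movetext.split()
    return [t for t in tokens
            if not re.match(r"^\d+\.$", t)               # move numbers
            and t not in ("1-0", "0-1", "1/2-1/2", "*")]  # results

game = "1. e4 e5 2. Nf3 Nc6 1-0"
records = [{"moves": extract_moves(game)}]
print(json.dumps(records))
```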
Step 3: Edit and run scripts/launch.sh depending on your GPU setup and desired SFT recipe.
Step 4: Run scripts/generate_new.py or scripts/generate_retry.py to generate valid moves.
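The retry behavior of the second script can be sketched as a small wrapper. This is a hypothetical stand-in, not the script's actual code: `generate` and `legal_moves` are placeholder arguments for the model's sampler and the set of legal moves in the current position.

```python
def generate_valid_move(generate, legal_moves, retries=8):
    """Sample up to `retries` candidate moves and return the first legal one,
    or None if every attempt is invalid (a sketch of the retry mechanism)."""
    for _ in range(retries):
        candidate = generate()
        if candidate in legal_moves:
            return candidate
    return None  # caller decides how to handle a still-invalid move

# Toy usage with a deterministic stand-in generator
candidates = iter(["Zz9", "e4"])
move = generate_valid_move(lambda: next(candidates), {"e4", "d4"})
print(move)
```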
Step 5: To play a game against the bot, run scripts/play_chess.py. Mode 0 has the bot play white, mode 1 has it play black, and mode 2 lets you alternate moves with the bot.
