Chess Dataset Generator

Procedural generation of chessboard image datasets from legal positions.
COMPLETED PROJECT
Project Information
Type: Dataset Generation Tool
Status: active
Language: Python
Libraries: python-chess / Pillow
Output: PNG Dataset + Metadata
Repository: GitHub

Overview

Chess Dataset Generator is a Python-based tool for procedurally generating chessboard image datasets from random legal chess positions.

The project renders chessboards using custom sprite sets and exports structured metadata alongside generated images.

It was built as a lightweight experiment for creating synthetic chess datasets for visualization and machine learning tasks.

Features

  • Generate random legal chess positions
  • Render chessboards from FEN strings
  • Export image datasets with metadata
  • Configurable board resolution and sprite sets
  • Deterministic dataset generation using seeds
  • Automatic metadata generation for each position

Dataset Generation

Legal chess positions are generated using python-chess and converted into rendered chessboard images using Pillow.
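One plausible way to obtain such positions, shown here as an assumption rather than the project's confirmed method, is to play a random number of random legal moves from the starting position. The function name `random_legal_position` and the `max_plies` parameter are hypothetical; the python-chess calls (`chess.Board`, `legal_moves`, `push`, `is_game_over`) are part of the library.

```python
import random

import chess


def random_legal_position(rng: random.Random, max_plies: int = 60) -> chess.Board:
    """Return a position reached by random legal moves from the start.

    Hypothetical helper: every position it returns is reachable in a real
    game, so it is legal by construction.
    """
    board = chess.Board()
    plies = rng.randint(0, max_plies)  # how many half-moves to play
    for _ in range(plies):
        if board.is_game_over():
            break  # checkmate, stalemate, etc.: no further legal moves
        moves = list(board.legal_moves)
        board.push(rng.choice(moves))
    return board


if __name__ == "__main__":
    board = random_legal_position(random.Random(42))
    print(board.fen())
```

Passing the random number generator in explicitly, instead of using the global `random` module, keeps the output reproducible for a given seed.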

Each generated sample includes both the rendered board image and structured metadata describing the game state.
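The pairing of image and metadata could be stored as two files that share a numeric stem. The sketch below assumes that layout (`000007.png` next to `000007.json`); the project's actual file naming and metadata format are not stated here, and `write_sample` is a hypothetical helper.

```python
import json
from pathlib import Path


def write_sample(out_dir: Path, index: int, png_bytes: bytes, metadata: dict) -> tuple[Path, Path]:
    """Write one sample as <index>.png plus a JSON sidecar (assumed layout)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = f"{index:06d}"  # zero-padded so files sort in generation order
    image_path = out_dir / f"{stem}.png"
    meta_path = out_dir / f"{stem}.json"
    image_path.write_bytes(png_bytes)
    # Store the image file name in the record so the JSON is usable on its own.
    record = {"image": image_path.name, **metadata}
    meta_path.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
    return image_path, meta_path
```

Sorting the JSON keys keeps the files byte-identical across runs, which helps when comparing two generated datasets.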

The generator supports configurable board sizes, sprite themes, and dataset output structure.
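As a rough illustration of rendering at a configurable size, the sketch below draws the piece-placement field of a FEN string with Pillow. The colors, the `render_fen` name, and the `sprites` mapping (FEN piece letter to a Pillow image) are assumptions made for the example, not the project's interface.

```python
from PIL import Image

LIGHT = (240, 217, 181)  # assumed light-square color
DARK = (181, 136, 99)    # assumed dark-square color


def render_fen(fen: str, sprites: dict, square_px: int = 64) -> Image.Image:
    """Draw the piece-placement field of a FEN string onto an 8x8 board."""
    size = 8 * square_px
    img = Image.new("RGB", (size, size), LIGHT)
    # Paint the dark squares; row 0 is rank 8, so a8 (row 0, col 0) stays light.
    for row in range(8):
        for col in range(8):
            if (row + col) % 2 == 1:
                x, y = col * square_px, row * square_px
                img.paste(DARK, (x, y, x + square_px, y + square_px))
    # Walk the placement field: digits skip empty squares, letters are pieces.
    for row, rank in enumerate(fen.split()[0].split("/")):
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)
                continue
            sprite = sprites[ch].convert("RGBA").resize((square_px, square_px))
            # Use the sprite's own alpha channel as the paste mask.
            img.paste(sprite, (col * square_px, row * square_px), sprite)
            col += 1
    return img
```

Swapping the `sprites` mapping changes the piece theme without touching the drawing code, and `square_px` controls the output resolution.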

Metadata

The metadata for each sample describes the full board state and the game conditions at that position:

  • FEN representation
  • Current player turn
  • Move counter
  • Castling rights
  • En passant availability
  • Check status
  • Game-over state
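The fields listed above map fairly directly onto python-chess board attributes. The sketch below is one possible shape for such a record; the key names and the `board_metadata` function are assumptions, while the `chess.Board` methods used are part of the library.

```python
import json

import chess


def board_metadata(board: chess.Board) -> dict:
    """Collect a JSON-serializable description of the game state."""
    # Only report an en passant square when the capture is actually legal.
    ep = chess.square_name(board.ep_square) if board.has_legal_en_passant() else None
    return {
        "fen": board.fen(),
        "turn": "white" if board.turn == chess.WHITE else "black",
        "fullmove_number": board.fullmove_number,
        "castling": {
            "white_kingside": board.has_kingside_castling_rights(chess.WHITE),
            "white_queenside": board.has_queenside_castling_rights(chess.WHITE),
            "black_kingside": board.has_kingside_castling_rights(chess.BLACK),
            "black_queenside": board.has_queenside_castling_rights(chess.BLACK),
        },
        "en_passant": ep,
        "is_check": board.is_check(),
        "is_game_over": board.is_game_over(),
    }


if __name__ == "__main__":
    print(json.dumps(board_metadata(chess.Board()), indent=2))
```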

Customization

The generator supports multiple configuration options for controlling visual output and dataset structure.

  • Custom chess piece sprite sets
  • Adjustable image resolution
  • Different board themes and layouts
  • Custom position distribution logic
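Options like these are often gathered into a single configuration object. The dataclass below is purely illustrative: every field name and default is an assumption, not the project's actual configuration schema.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class GeneratorConfig:
    """Hypothetical settings bundle for one dataset run."""

    sprite_dir: Path = Path("sprites/default")  # folder holding one piece theme
    square_px: int = 64                         # edge length of one square
    light_color: tuple = (240, 217, 181)
    dark_color: tuple = (181, 136, 99)
    white_at_bottom: bool = True                # board orientation
    max_plies: int = 60                         # upper bound on random moves

    def __post_init__(self) -> None:
        # Reject values that would produce an empty or meaningless image.
        if self.square_px <= 0:
            raise ValueError("square_px must be positive")
        if self.max_plies < 0:
            raise ValueError("max_plies must not be negative")

    @property
    def image_size(self) -> tuple:
        """Pixel dimensions of the rendered 8x8 board."""
        return (8 * self.square_px, 8 * self.square_px)
```

For example, `GeneratorConfig(square_px=32).image_size` evaluates to `(256, 256)`. Freezing the dataclass prevents settings from changing partway through a run.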

Screenshots

[Figure: Dataset Metadata]
[Figure: Example Chessboard]

Applications

  • Machine learning dataset generation
  • Computer vision experiments
  • Synthetic chessboard image generation
  • Chess visualization systems
  • Data augmentation workflows

Development Notes

The project explored procedural dataset generation and reproducible synthetic image pipelines.

Development focused on deterministic generation, configurable rendering, and metadata consistency.
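One common way to make generation deterministic, offered here as an assumption about how such a pipeline could work, is to derive an independent random number generator for each sample from a base seed and the sample index. The `sample_rng` helper is hypothetical.

```python
import hashlib
import random


def sample_rng(base_seed: int, index: int) -> random.Random:
    """Return a generator that depends only on (base_seed, index)."""
    # Hash the pair so neighboring indices get unrelated seeds.
    digest = hashlib.sha256(f"{base_seed}:{index}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


if __name__ == "__main__":
    # Sample 3 gets the same random stream no matter which other samples
    # were generated before it, or in what order.
    first = sample_rng(1234, 3).random()
    again = sample_rng(1234, 3).random()
    print(first == again)
```

With this scheme a single sample can be regenerated on its own, and a run can be split across processes without changing the resulting dataset.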
