PyMinMaximus Part 4: Creating an Opening Book

Welcome back to the PyMinMaximus series! In Part 1, we built move generation, in Part 2 we added minimax search, and in Part 3 we created a sophisticated evaluation function. PyMinMaximus can now play decent chess, understanding both tactics and positional concepts.

But we have a glaring weakness: PyMinMaximus wastes time “thinking” about well-known opening positions that have been analyzed for centuries. While PyMinMaximus calculates for 5 seconds to decide between 1.e4 and 1.d4, it could instantly play proven opening theory and save that time for complex middlegame positions.

In this post, we’ll solve this problem by implementing opening books from scratch. We will learn:

  • The purpose of an opening book
  • Polyglot, the common opening book format
  • Zobrist hashing to uniquely identify the board and positions
  • Reading a PGN file
  • How to create a custom opening book
  • Integrating into our PyMinMaximus engine

By the end, PyMinMaximus will play the opening like a grandmaster.

What is an Opening Book?

An opening book is a database of known opening positions with recommended moves. Instead of searching millions of positions in move 1, we simply look up “After 1.e4, what do strong players play?” and get instant answers.

Benefits:

  • Saves computation – no need to search book positions
  • Plays proven moves – benefit from centuries of opening theory
  • Adds variety – can randomize between good moves
  • Stronger play – knows opening traps and best lines
  • Time management – saves time for critical positions

Example: The Value of Opening Books

Without a book, PyMinMaximus might:

  • Search 500,000 nodes to choose 1.e2e4
  • Take 3-5 seconds per move in the opening
  • Play unusual or inferior opening lines
  • Walk into known opening traps

With a book, PyMinMaximus:

  • Plays 1.e2e4 instantly (0.001 seconds)
  • Saves 20-30 seconds in the opening phase
  • Plays mainline theory with proper move orders
  • Avoids known tactical pitfalls

That’s a massive practical advantage!

The Polyglot Opening Book Format

The standard file format for chess opening books is Polyglot, created by Fabien Letouzey. It’s a binary format that’s simple, efficient, and widely supported.

File Structure

A Polyglot book (.bin file) is a series of 16-byte entries, each containing:

Bytes 0-7:   Position hash (64-bit Zobrist hash, big-endian)

Bytes 8-9:   Move (16-bit encoded move, big-endian)

Bytes 10-11: Weight (16-bit unsigned integer, big-endian)

Bytes 12-15: Learn (32-bit value, usually 0, big-endian)

We will go into these in much more detail throughout this post, but generally, the position hash is a way to uniquely store any board state using only 64 bits. The move is a UCI move encoded into 16 bits. Weight helps us to determine the probability of a move (helpful if we have multiple possibilities from the current position). We will not use learn, but it is intended to allow an engine to self-modify the book.

The file is sorted by position hash for efficient binary search.

Why This Format?

  • Simple: Just 16 bytes per entry
  • Fast: Binary search finds positions in O(log n) time
  • Compact: No overhead, just raw data
  • Portable: Big-endian format works everywhere

Move Encoding

Let’s start with move encoding, which is fairly simple. Moves are encoded in 16 bits:

Bits 0-5:   to_square (0-63)

Bits 6-11:  from_square (0-63)

Bits 12-14: promotion piece (0=none, 1=knight, 2=bishop, 3=rook, 4=queen)

Bit 15:    unused

Example: e2e4 is encoded as:

  • from_square: e2 = 12 (row 1, col 4)
  • to_square: e4 = 28 (row 3, col 4)
  • Encoded: 28 | (12 << 6) = 28 | 768 = 796

Zobrist Hashing: Position Fingerprints

Next, let’s encode the board. To look up positions in the book, we need a unique fingerprint for each position. This is where Zobrist hashing comes in.

The Concept

Zobrist hashing creates a 64-bit hash using XOR operations:

  1. Generate random 64-bit numbers for each piece on each square
  2. XOR together the numbers for all pieces on the board
  3. XOR in additional bits for castling, en passant, and side to move

The beauty: incremental updates are cheap – just XOR in the changes when making a move. If you need to unmake a move, XOR the same move again.

As a simple example of how this works, if the current board is 10 and we add move 3, we simply XOR 3 again to remove it.

10 ^ 3 ^ 3 = 10

Why Zobrist Hashing?

  • Unique: Different positions (almost always) get different hashes
  • Fast: Just XOR operations
  • Incremental: Update hash when making moves without recalculating
  • Standard: Used in transposition tables and opening books

Implementation

Let’s implement Zobrist hashing from scratch:

import random
from constants import *

class ZobristHash:
    """
    Zobrist hashing for chess positions.
    Creates unique 64-bit fingerprints for positions.
    """
    
    def __init__(self, seed=42):
        """
        Initialize random number tables for Zobrist hashing.
        
        We need random numbers for:
        - Each piece type on each square
        - Castling rights
        - En passant files
        - Side to move
        """
        random.seed(seed)
        
        # Random numbers for each piece on each square
        # [piece_type][color][square]
        self.piece_keys = [
            [[self._rand64() for _ in range(64)] for _ in range(2)] 
            for _ in range(7)  # EMPTY, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING
        ]
        
        # Random numbers for castling rights
        self.castle_keys = [self._rand64() for _ in range(4)]  # K, Q, k, q
        
        # Random numbers for en passant files (8 files)
        self.ep_keys = [self._rand64() for _ in range(8)]
        
        # Random number for side to move (black)
        self.side_key = self._rand64()
    
    def _rand64(self):
        """Generate a random 64-bit number."""
        return random.randint(0, (1 << 64) - 1)
    
    def hash_position(self, board):
        """
        Compute Zobrist hash for a position.
        
        Returns:
            64-bit integer representing the position
        """
        hash_value = 0
        
        # Hash all pieces on the board
        for row in range(8):
            for col in range(8):
                piece = board.board[row][col]
                
                if piece != EMPTY:
                    piece_type = piece & 7
                    color = 1 if (piece & 24) == BLACK else 0
                    square = row * 8 + col
                    
                    hash_value ^= self.piece_keys[piece_type][color][square]
        
        # Hash castling rights
        if board.castling_rights['K']:
            hash_value ^= self.castle_keys[0]
        if board.castling_rights['Q']:
            hash_value ^= self.castle_keys[1]
        if board.castling_rights['k']:
            hash_value ^= self.castle_keys[2]
        if board.castling_rights['q']:
            hash_value ^= self.castle_keys[3]
        
        # Hash en passant square (only the file matters)
        if board.en_passant_square:
            file = board.en_passant_square[1]
            hash_value ^= self.ep_keys[file]
        
        # Hash side to move (only if Black to move)
        if board.to_move == BLACK:
            hash_value ^= self.side_key
        
        return hash_value

Testing Zobrist Hashing

Let’s verify our implementation with a few simple tests:

import unittest
from move import Move
from zobrist import ZobristHash

class TestZobrist(unittest.TestCase):
    def test_zobrist(self):
        """Test Zobrist hashing properties."""
        from board import Board
        
        zobrist = ZobristHash()
        
        # Test 1: Same position = same hash
        print("Test 1: Consistency")
        board1 = Board()
        hash1 = zobrist.hash_position(board1)
        hash2 = zobrist.hash_position(board1)
        print(f"  Hash 1: {hash1}")
        print(f"  Hash 2: {hash2}")
        print(f"  Same: {hash1 == hash2} ✓")
        
        # Test 2: Different positions = different hashes
        print("\nTest 2: Uniqueness")
        board2 = Board()
        board2.make_move(Move(1, 4, 3, 4))  # e4
        hash3 = zobrist.hash_position(board2)
        print(f"  Starting: {hash1}")
        print(f"  After e4: {hash3}")
        print(f"  Different: {hash1 != hash3} ✓")
        
        # Test 3: Transpositions = same hash
        print("\nTest 3: Transpositions")
        # Reach same position via different move orders
        board3 = Board()
        board3.make_move(Move(1, 4, 3, 4))  # e4
        board3.make_move(Move(6, 4, 4, 4))  # e5
        
        board4 = Board()
        board4.make_move(Move(1, 4, 3, 4))  # e4
        board4.make_move(Move(6, 4, 4, 4))  # e5
        
        hash4 = zobrist.hash_position(board3)
        hash5 = zobrist.hash_position(board4)
        print(f"  Hash via path 1: {hash4}")
        print(f"  Hash via path 2: {hash5}")
        print(f"  Same: {hash4 == hash5} ✓")


if __name__ == "__main__":
    unittest.main()

Read PGN Files

Next, we need the ability to read chess PGN (portable game notation) files. This will allow us to read games and create an opening book.

PGN Games

We start by defining a simple game structure. We mainly need a list of UCI moves. We will also include a header and chess notation lists (in case we need them in the future).

import re
from board import Board
from move import Move
from constants import *

class PGNGame:
    """Represents a single chess game from a PGN file."""
    
    def __init__(self):
        self.headers = {}
        self.moves = []  # List of move strings in SAN format
        self.uci_moves = []  # List of moves in UCI format
        self.result = "*"
    
    def __str__(self):
        """String representation of the game."""
        header_str = "\n".join([f"[{key} \"{value}\"]" for key, value in self.headers.items()])
        moves_str = " ".join(self.moves)
        return f"{header_str}\n\n{moves_str} {self.result}"

PGN Parser

Next, we need a class to handle the parsing of a file. We split a file into games, and then we process each game. Let’s quickly outline the methods:

  • parse_file: opens a file and sends to parse_string
  • parse_string: splits the file into individual games(_split_games) and parses the individual games (_parse_game)
  • _split_games: processes each line to find the header of a new game
  • _parse_game: gets the moves from _parse_move_text and converts them to UCI format through _convert_to_uci
  • _parse_move_text: extracts the individual moves
  • _convert_to_uci: converts SAN moves to UCI
  • _san_to_uci: converts an individual SAN move to UCI
  • _move_to_uci and _uci_to_move: converts to and from our Move class
class PGNParser:
    """Parser for PGN files."""
    
    def __init__(self):
        self.games = []
    
    def parse_file(self, filename):
        """Parse a PGN file and extract all games."""
        with open(filename, 'r', encoding='utf-8') as f:
            content = f.read()
        
        return self.parse_string(content)
    
    def parse_string(self, pgn_string):
        """Parse PGN content from a string."""
        self.games = []
        
        # Split into individual games
        # Games are separated by blank lines after the movetext
        game_texts = self._split_games(pgn_string)
        
        for game_text in game_texts:
            if game_text.strip():
                game = self._parse_game(game_text)
                if game:
                    self.games.append(game)
        
        return self.games
    
    def _split_games(self, content):
        """Split PGN content into individual games."""
        games = []
        current_game = []
        in_headers = False
        
        lines = content.split('\n')
        
        for line in lines:
            stripped = line.strip()
            
            # Check if this is a header line
            if stripped.startswith('['):
                if current_game and not in_headers:
                    # We've hit a new game
                    games.append('\n'.join(current_game))
                    current_game = []
                in_headers = True
                current_game.append(line)
            elif stripped:
                in_headers = False
                current_game.append(line)
            elif current_game:
                # Blank line - could be end of game or just spacing
                current_game.append(line)
        
        # Don't forget the last game
        if current_game:
            games.append('\n'.join(current_game))
        
        return games
    
    def _parse_game(self, game_text):
        """Parse a single game from text."""
        game = PGNGame()
        
        lines = game_text.split('\n')
        movetext_lines = []
        
        # Parse headers
        for line in lines:
            line = line.strip()
            if line.startswith('['):
                # Parse header
                match = re.match(r'\[(\w+)\s+"(.*)"\]', line)
                if match:
                    key, value = match.groups()
                    game.headers[key] = value
            elif line:
                # This is part of the movetext
                movetext_lines.append(line)
        
        # Parse movetext
        movetext = ' '.join(movetext_lines)
        game.moves, game.result = self._parse_movetext(movetext)
        
        # Convert to UCI
        game.uci_moves = self._convert_to_uci(game)
        
        return game
    
    def _parse_movetext(self, movetext):
        """Parse the movetext section and extract moves."""
        moves = []
        
        # Remove comments in braces
        movetext = re.sub(r'\{[^}]*\}', '', movetext)
        
        # Remove comments in semicolons
        movetext = re.sub(r';[^\n]*', '', movetext)
        
        # Remove variations (recursive parentheses is complex, so we'll do simple)
        while '(' in movetext:
            movetext = re.sub(r'\([^()]*\)', '', movetext)
        
        # Extract result
        result_match = re.search(r'(1-0|0-1|1/2-1/2|\*)\s*$', movetext)
        result = result_match.group(1) if result_match else "*"
        
        # Remove result from movetext
        movetext = re.sub(r'(1-0|0-1|1/2-1/2|\*)\s*$', '', movetext)
        
        # Remove move numbers (e.g., "1.", "23...")
        movetext = re.sub(r'\d+\.+', '', movetext)
        
        # Remove NAGs (Numeric Annotation Glyphs like $1, $2, etc.)
        movetext = re.sub(r'\$\d+', '', movetext)
        
        # Remove annotation symbols (!, ?, !!, ??, !?, ?!)
        movetext = re.sub(r'[!?]+', '', movetext)
        
        # Split into individual moves
        tokens = movetext.split()
        
        for token in tokens:
            token = token.strip()
            if token and not token.startswith('(') and not token.endswith(')'):
                moves.append(token)
        
        return moves, result
    
    def _convert_to_uci(self, game):
        """Convert SAN moves to UCI format."""
        board = Board()
        
        # Set up initial position if FEN is provided
        if 'FEN' in game.headers:
            board.from_fen(game.headers['FEN'])
        
        uci_moves = []
        
        for san_move in game.moves:
            try:
                uci_move = self._san_to_uci(board, san_move)
                if uci_move:
                    uci_moves.append(uci_move)
                    
                    # Make the move on the board
                    move = self._uci_to_move(board, uci_move)
                    if move:
                        board.make_move(move)
                else:
                    print(f"Warning: Could not convert move '{san_move}'")
                    break
            except Exception as e:
                print(f"Error converting move '{san_move}': {e}")
                break
        
        return uci_moves
    
    def _san_to_uci(self, board, san):
        """Convert a single SAN move to UCI format."""
        # Generate all legal moves
        legal_moves = board.generate_legal_moves()
        
        # Store original for debugging
        original_san = san
        
        # Clean up the SAN notation
        san = san.replace('+', '').replace('#', '')
        
        # Handle castling
        if san in ['O-O', 'O-O-O']:
            for move in legal_moves:
                if move.is_castling:
                    if san == 'O-O' and move.to_col == 6:
                        return self._move_to_uci(move)
                    elif san == 'O-O-O' and move.to_col == 2:
                        return self._move_to_uci(move)
        
        # Parse the SAN move
        piece_type = None
        from_file = None
        from_rank = None
        to_square = None
        promotion = None
        is_capture = 'x' in san
        
        # Remove capture indicator for parsing
        san = san.replace('x', '')
        
        # Check for promotion
        if '=' in san:
            san, promo_char = san.split('=')
            promotion_map = {'Q': QUEEN, 'R': ROOK, 'B': BISHOP, 'N': KNIGHT}
            promotion = promotion_map.get(promo_char.upper())
        
        # Determine piece type
        if san[0].isupper():
            piece_map = {'K': KING, 'Q': QUEEN, 'R': ROOK, 'B': BISHOP, 'N': KNIGHT}
            piece_type = piece_map.get(san[0])
            san = san[1:]  # Remove piece character
        else:
            piece_type = PAWN
        
        # Extract destination square (last 2 characters)
        if len(san) >= 2:
            to_square = san[-2:]
            san = san[:-2]
        else:
            return None
        
        # Extract disambiguation info (file and/or rank)
        if san:
            for char in san:
                if char.isalpha():
                    from_file = ord(char) - ord('a')
                elif char.isdigit():
                    from_rank = int(char) - 1
        
        # Convert destination square to row/col
        if len(to_square) != 2 or not to_square[0].isalpha() or not to_square[1].isdigit():
            return None
            
        to_col = ord(to_square[0]) - ord('a')
        to_row = int(to_square[1]) - 1
        
        # Validate coordinates
        if not (0 <= to_col < 8 and 0 <= to_row < 8):
            return None
        
        # Find matching legal move
        candidates = []
        for move in legal_moves:
            # Check if destination matches
            if move.to_row != to_row or move.to_col != to_col:
                continue
            
            # Check piece type
            piece = board.board[move.from_row][move.from_col]
            if (piece & 7) != piece_type:
                continue
            
            # Check disambiguation
            if from_file is not None and move.from_col != from_file:
                continue
            if from_rank is not None and move.from_row != from_rank:
                continue
            
            # Check promotion
            if promotion is not None:
                if move.promotion != promotion:
                    continue
            elif move.promotion is not None:
                # SAN doesn't have promotion but move does
                continue
            
            candidates.append(move)
        
        # If we have exactly one candidate, use it
        if len(candidates) == 1:
            return self._move_to_uci(candidates[0])
        elif len(candidates) > 1:
            # Multiple candidates - shouldn't happen with proper disambiguation
            # Try to pick the most likely one (capture if it's a capture move)
            if is_capture:
                for move in candidates:
                    target = board.board[move.to_row][move.to_col]
                    if target != EMPTY:
                        return self._move_to_uci(move)
            return self._move_to_uci(candidates[0])
        
        return None
    
    def _move_to_uci(self, move):
        """Convert a Move object to UCI notation."""
        files = 'abcdefgh'
        
        from_square = files[move.from_col] + str(move.from_row + 1)
        to_square = files[move.to_col] + str(move.to_row + 1)
        
        uci = from_square + to_square
        
        if move.promotion:
            promo_chars = {QUEEN: 'q', ROOK: 'r', BISHOP: 'b', KNIGHT: 'n'}
            uci += promo_chars.get(move.promotion, 'q')
        
        return uci
    
    def _uci_to_move(self, board, uci):
        """Convert UCI notation to a Move object."""
        files = 'abcdefgh'
        
        from_col = files.index(uci[0])
        from_row = int(uci[1]) - 1
        to_col = files.index(uci[2])
        to_row = int(uci[3]) - 1
        
        promotion = None
        if len(uci) > 4:
            promo_map = {'q': QUEEN, 'r': ROOK, 'b': BISHOP, 'n': KNIGHT}
            promotion = promo_map.get(uci[4].lower())
        
        # Check if it's castling
        piece = board.board[from_row][from_col]
        is_castling = ((piece & 7) == KING) and abs(to_col - from_col) == 2
        
        # Check if it's en passant
        is_en_passant = False
        if (piece & 7) == PAWN:
            if board.en_passant_square and (to_row, to_col) == board.en_passant_square:
                is_en_passant = True
        
        return Move(from_row, from_col, to_row, to_col, 
                   promotion=promotion, 
                   is_castling=is_castling,
                   is_en_passant=is_en_passant)

Creating Custom Opening Books

Now let’s create our own opening books. We create a BookCreator class. The BookCreator class has a dictionary of dictionaries to store positions. We use the collection defaultdict to allow us to gracefully handle positions that are not in the dictionary yet. The class has a few key methods:

  • add_position: takes a fen and UCI move and stores them in the position dictionary
  • _encode_move: converts a UCI move into a 16 bit number (see above for the calculation)
  • save_book: saves the positions to a file. We use struct to pack position, move, weight, and learn into 16 bytes and save to the file.
  • encode_games: parses a PGN game file and then stores positions and moves into the position dictionary
# book_creator.py

import struct
from collections import defaultdict
from zobrist import ZobristHash
from board import Board
from constants import *
import pgn

class BookCreator:
    """
    Create Polyglot opening books from manual position entry.
    Build your own opening repertoire!
    """
    
    def __init__(self):
        self.positions = defaultdict(lambda: defaultdict(int))
        self.zobrist = ZobristHash()
    
    def add_position(self, fen, move_uci, weight=1):
        """
        Manually add a position and move to the book.
        
        Args:
            fen: FEN string of position
            move_uci: Move in UCI format (e.g., "e2e4")
            weight: Weight/frequency of this move (higher = more likely)
        """
        # Load position from FEN
        board = Board()
        board.from_fen(fen)
        
        # Get position hash
        position_hash = self.zobrist.hash_position(board)
        
        # Add move with weight
        self.positions[position_hash][move_uci] += weight
    
    def _encode_move(self, move_uci):
        """
        Encode UCI move to Polyglot format.
        
        Args:
            move_uci: Move string like "e2e4" or "e7e8q"
        
        Returns:
            16-bit encoded move
        """
        # Parse UCI move
        from_col = ord(move_uci[0]) - ord('a')
        from_row = int(move_uci[1]) - 1
        to_col = ord(move_uci[2]) - ord('a')
        to_row = int(move_uci[3]) - 1
        
        # Calculate square numbers
        from_square = from_row * 8 + from_col
        to_square = to_row * 8 + to_col
        
        # Check for promotion
        promo = 0
        if len(move_uci) == 5:
            promo_map = {'n': 1, 'b': 2, 'r': 3, 'q': 4}
            promo = promo_map.get(move_uci[4].lower(), 0)
        
        # Encode: to | (from << 6) | (promo << 12)
        return to_square | (from_square << 6) | (promo << 12)
    
    def save_book(self, output_path: str):
        """
        Save the book in Polyglot binary format.
        
        Args:
            output_path: Path for output .bin file
        """
        entries = []
        
        # Convert all positions and moves to binary entries
        for position_hash, moves in self.positions.items():
            for move_uci, weight in moves.items():
                move_int = self._encode_move(move_uci)
                entries.append((position_hash, move_int, weight, 0))
        
        # Sort entries by position hash (required for binary search)
        entries.sort(key=lambda e: e[0])
        
        # Write to binary file
        with open(output_path, 'wb') as f:
            for position_hash, move_int, weight, learn in entries:
                # Pack as big-endian: Q (8 bytes), H (2 bytes), H (2 bytes), I (4 bytes)
                packed = struct.pack('>QHHI', position_hash, move_int, weight, learn)
                f.write(packed)
        
        print(f"\n{'='*60}")
        print(f"Book saved to: {output_path}")
        print(f"Unique positions: {len(self.positions)}")
        print(f"Total entries: {len(entries)}")
        print(f"{'='*60}")

    def encode_games(self,pgn_file):
        parser = pgn.PGNParser()
        games = parser.parse_file(pgn_file)

        for current_game in games:
            board = Board()

            for index, move in enumerate(current_game.uci_moves):
                if index > 20:
                    break

                self.add_position(board.to_fen(),move)
                board.push_uci(move)

Reading Polyglot Books

Now let’s implement the book reader:

# opening_book.py

import struct
import random
from typing import Optional, List, Tuple
from zobrist import ZobristHash
from constants import *

class OpeningBook:
    """
    Reader for Polyglot opening book format (.bin files).
    """
    
    def __init__(self, book_path: Optional[str] = None):
        """
        Initialize opening book.
        
        Args:
            book_path: Path to Polyglot .bin file (None = no book)
        """
        self.book_path = book_path
        self.book_enabled = book_path is not None
        self.max_book_ply = 20  # Stay in book for first 20 plies
        self.zobrist = ZobristHash()
        
        if self.book_enabled:
            try:
                # Test that we can open the book
                with open(book_path, 'rb') as f:
                    # Try reading one entry
                    entry = f.read(16)
                    if len(entry) == 16:
                        print(f"✓ Opening book loaded: {book_path}")
                    else:
                        raise ValueError("Invalid book format")
            except Exception as e:
                print(f"✗ Could not load opening book: {e}")
                self.book_enabled = False
    
    def _decode_move(self, move_int):
        """
        Decode Polyglot move format.
        
        Move format (16 bits):
        - Bits 0-5: to_square (0-63)
        - Bits 6-11: from_square (0-63)
        - Bits 12-14: promotion piece (0=none, 1=knight, 2=bishop, 3=rook, 4=queen)
        
        Returns:
            Tuple of (from_row, from_col, to_row, to_col, promotion)
        """
        to_square = move_int & 0x3F
        from_square = (move_int >> 6) & 0x3F
        promo = (move_int >> 12) & 0x7
        
        # Convert square numbers to row/col
        to_row = to_square // 8
        to_col = to_square % 8
        from_row = from_square // 8
        from_col = from_square % 8
        
        # Convert promotion code to piece type
        promotion = None
        if promo == 1:
            promotion = KNIGHT
        elif promo == 2:
            promotion = BISHOP
        elif promo == 3:
            promotion = ROOK
        elif promo == 4:
            promotion = QUEEN
        
        return (from_row, from_col, to_row, to_col, promotion)
    
    def _find_entries(self, position_hash):
        """
        Find all book entries for a position using binary search.
        
        The book file is sorted by hash, so we use binary search
        to find the first matching entry, then read all consecutive matches.
        
        Returns:
            List of (move_tuple, weight) entries
        """
        if not self.book_enabled:
            return []
        
        entries = []
        
        try:
            with open(self.book_path, 'rb') as f:
                # Get file size and entry count
                f.seek(0, 2)  # Seek to end
                file_size = f.tell()
                entry_count = file_size // 16
                
                if entry_count == 0:
                    return []
                
                # Binary search for first occurrence of this hash
                left, right = 0, entry_count - 1
                first_match = -1
                
                while left <= right:
                    mid = (left + right) // 2
                    f.seek(mid * 16)
                    
                    # Read hash from entry
                    entry_hash = struct.unpack('>Q', f.read(8))[0]
                    
                    if entry_hash < position_hash:
                        left = mid + 1
                    elif entry_hash > position_hash:
                        right = mid - 1
                    else:
                        # Found a match, but keep searching left for first occurrence
                        first_match = mid
                        right = mid - 1
                
                # If no match found, return empty
                if first_match == -1:
                    return []
                
                # Read all consecutive entries with matching hash
                f.seek(first_match * 16)
                
                while f.tell() < file_size:
                    entry_data = f.read(16)
                    if len(entry_data) < 16:
                        break
                    
                    # Unpack entry: Q=8 bytes hash, H=2 bytes move, H=2 bytes weight, I=4 bytes learn
                    entry_hash, move_int, weight, learn = struct.unpack('>QHHI', entry_data)
                    
                    # Stop when we hit a different position
                    if entry_hash != position_hash:
                        break
                    
                    # Decode move and add to list
                    move_tuple = self._decode_move(move_int)
                    entries.append((move_tuple, weight))
        
        except Exception as e:
            print(f"Book lookup error: {e}")
            return []
        
        return entries
    
    def get_book_move(self, board, move_number: int, 
                      selection_mode: str = "weighted") -> Optional[str]:
        """
        Get a move from the opening book.
        
        Args:
            board: Current board position
            move_number: Current move number (for max_book_ply check)
            selection_mode: How to select from multiple book moves
                - "best": Choose highest weighted move
                - "weighted": Probabilistic selection based on weights
                - "random": Random selection (ignores weights)
        
        Returns:
            Move in UCI format (e.g., "e2e4") or None if not in book
        """
        if not self.book_enabled:
            return None
        
        # Check if we're still in book range
        ply = (move_number - 1) * 2 + (0 if board.to_move == WHITE else 1)
        if ply >= self.max_book_ply:
            return None
        
        # Get position hash
        position_hash = self.zobrist.hash_position(board)
        
        # Find all book entries for this position
        entries = self._find_entries(position_hash)
        
        if not entries:
            return None
        
        # Select move based on mode
        if selection_mode == "best":
            # Choose highest weighted move
            move_tuple, weight = max(entries, key=lambda e: e[1])
        
        elif selection_mode == "weighted":
            # Probabilistic selection based on weights
            total_weight = sum(w for _, w in entries)
            if total_weight == 0:
                # All weights are zero, choose randomly
                move_tuple, weight = random.choice(entries)
            else:
                # Weighted random selection
                r = random.uniform(0, total_weight)
                cumulative = 0
                move_tuple = entries[0][0]
                for m, w in entries:
                    cumulative += w
                    if r <= cumulative:
                        move_tuple = m
                        break
        
        else:  # random
            move_tuple, weight = random.choice(entries)
        
        # Convert move tuple to UCI format
        from_row, from_col, to_row, to_col, promotion = move_tuple
        
        files = 'abcdefgh'
        uci = f"{files[from_col]}{from_row + 1}{files[to_col]}{to_row + 1}"
        
        # Add promotion piece if present
        if promotion:
            promo_chars = {KNIGHT: 'n', BISHOP: 'b', ROOK: 'r', QUEEN: 'q'}
            uci += promo_chars.get(promotion, '')
        
        return uci
    
    def is_in_book(self, board) -> bool:
        """Check if current position is in the book."""
        if not self.book_enabled:
            return False
        
        position_hash = self.zobrist.hash_position(board)
        entries = self._find_entries(position_hash)
        return len(entries) > 0
    
    def get_book_moves_info(self, board) -> List[Tuple[str, int]]:
        """
        Get all book moves with their weights for the current position.
        
        Returns:
            List of (move_uci, weight) tuples, sorted by weight descending
        """
        if not self.book_enabled:
            return []
        
        position_hash = self.zobrist.hash_position(board)
        entries = self._find_entries(position_hash)
        
        result = []
        files = 'abcdefgh'
        
        for move_tuple, weight in entries:
            from_row, from_col, to_row, to_col, promotion = move_tuple
            uci = f"{files[from_col]}{from_row + 1}{files[to_col]}{to_row + 1}"
            
            if promotion:
                promo_chars = {KNIGHT: 'n', BISHOP: 'b', ROOK: 'r', QUEEN: 'q'}
                uci += promo_chars.get(promotion, '')
            
            result.append((uci, weight))
        
        # Sort by weight, highest first
        result.sort(key=lambda x: -x[1])
        
        return result

Understanding Book Selection Strategies

We implemented three selection modes:

1. Best Move Selection

book_move = book.get_book_move(board, 1, “best”)

  • Always chooses the highest-weighted move
  • Most consistent/predictable
  • Good for maximizing strength
  • Can be too repetitive

2. Weighted Random Selection

book_move = book.get_book_move(board, 1, “weighted”)

  • Probabilistic selection based on weights
  • More variety in play
  • Still favors stronger moves
  • Recommended for most cases

3. Random Selection

book_move = book.get_book_move(board, 1, “random”)

  • Completely random (ignores weights)
  • Maximum variety
  • Can play weaker book moves
  • Good for fun/casual games

Testing the Opening Book

Let’s verify everything works. We created two test books in the folder books (kasparov.bin and carlsen.bin).

# tests/test_opening_book.py

import unittest
from opening_book import OpeningBook
from board import Board

class TestOpeningBook(unittest.TestCase):
    def setUp(self):
        self.book = OpeningBook('books/kasparov.bin')
        self.board = Board()

    def test_opening_book(self):
        self.assertIsNotNone(self.book)

        self.assertTrue(self.book.is_in_book(self.board))

        self.assertTrue(len(self.book.get_book_moves_info(self.board)) == 3)

    def test_get_move(self):
        move = self.book.get_book_move(self.board,1,"best")
        self.assertEqual(move,'e2e4')

if __name__ == "__main__":
    unittest.main()

Integrate into Search

Integration into our search engine is fairly easy. We initialize the book when we initialize search. We then add a short block of code to the find_best_move_alphabeta function. This code checks whether the current board position is in the book.

  # search.py
  
  class SearchEngine(board):
  
  # other methods from parts 1-3
  
    def find_best_move_alphabeta(self, depth):
        """
        Find the best move using alpha-beta pruning.
        """
        # Check opening book
        if self.book.is_in_book(self.board):
            moves = self.board.generate_legal_moves()
            book_move = self.book.get_book_move(self.board,len(self.board.move_stack))
            for move in moves:
                if str(move) == book_move:
                    return move, 1000
                    
        # remainder of the method that we previously implemented

What We’ve Accomplished

In this part, we’ve transformed PyMinMaximus into a complete chess player:

Opening books – Instant play from proven theory
Zobrist hashing – Efficient position identification
Polyglot format – Standard book reading/writing
Book creation – Custom opening repertoires

Complete Code Structure

Your final project structure:

pyminmaximus/
├── constants.py        # Piece and color constants
├── move.py             # Move class
├── board.py            # Board representation
├── evaluation.py       # Position evaluation
├── search.py           # Search with book integration
├── zobrist.py          # Zobrist hashing
├── opening_book.py     # Polyglot book reader
├── book_creator.py     # Custom book creation
├── pgn.py # PGN (chess game) format reader
├── tests/
├────── test_board.py
├────── test_evaluation.py
├────── test_opening_book.py
├────── test_search.py
├────── test_zobrist.py

Next Steps

In Part 5 we will cover:

  • End Game Tablebases
  • How Retrograde Analysis Works
  • Integrating the tablebase into our search

With our opening book, we can start strong. In our next post, let’s also end strong!


Questions about Zobrist hashing or Polyglot? Built your own opening repertoire? Share in the comments!

Leave a Reply

Your email address will not be published. Required fields are marked *