Puzzle Implementations (puxle.puzzles)

Puzzle implementations for PuXle.

This module contains implementations of various classic puzzles optimized for JAX-based computation. All puzzle classes inherit from the base Puzzle class and provide parallelized environments for AI research and reinforcement learning.

RubiksCube

class puxle.puzzles.rubikscube.RubiksCube[source]

Bases: Puzzle

N×N×N Rubik’s Cube environment.

Each face is stored as a 1-D array of size * size sticker values. Two representation modes are supported:

  • Color embedding (default): values in [0, 5] (3 bits/sticker).

  • Tile-ID mode: unique IDs in [0, 6·size²) (8 bits/sticker), useful for puzzles where individual tile identity matters.

Actions encode (axis, slice_index, direction) triplets and follow either QTM (quarter-turn metric, excludes whole-cube rotations) or UQTM (includes center-slice moves on odd-sized cubes).

The class also exposes the 24 global rotational symmetries of the cube via state_symmetries() for symmetry-aware hashing or data augmentation.

Parameters:
  • size (int) – Edge length of the cube (default 3).

  • initial_shuffle (int) – Number of random moves for scrambling (default 26).

  • color_embedding (bool) – If True (default), store 6-colour values; otherwise store unique tile IDs.

  • metric (str) – "QTM" (default) or "UQTM".

define_state_class()[source]

Return the @state_dataclass class used for puzzle states.

Subclasses must implement this method. The returned class should use FieldDescriptor to declare its fields.

Return type:

PuzzleState

Returns:

A @state_dataclass class describing the puzzle state.

__init__(size=3, initial_shuffle=26, color_embedding=True, metric='QTM', **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:
  • size (int)

  • initial_shuffle (int)

  • color_embedding (bool)

  • metric (str)

size: int
index_grid: Array | ndarray | bool | number
convert_tile_to_color_embedding(tile_faces)[source]

Convert faces expressed with tile identifiers (0..6*tile_count-1) into color embedding (0..5). Accepts shapes (6, tile_count), (6, size, size) or flat.
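As a rough illustration of this conversion, the sketch below assumes tile IDs are numbered face by face, so integer division by tile_count recovers the face colour. That numbering and the helper name tile_to_color are assumptions made for illustration; they are not taken from the PuXle source.

```python
def tile_to_color(tile_faces, tile_count):
    """Map flat tile IDs in [0, 6 * tile_count) to colours in [0, 5].

    Hypothetical layout: face f owns IDs f*tile_count .. (f+1)*tile_count - 1.
    """
    return [tile_id // tile_count for tile_id in tile_faces]


size = 3
tile_count = size * size
solved_tiles = list(range(6 * tile_count))        # every sticker has a unique ID
solved_colors = tile_to_color(solved_tiles, tile_count)
```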

Return type:

Array

Parameters:

tile_faces (ndarray | Array | bool | number)

get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Return type:

Callable

Returns:

A function (state: State, **kwargs) -> str.

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_target_state(key=None)[source]
Return type:

State

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_actions(solve_config, state, action, filled=True)[source]

Returns the next state and cost for a given action. Action decoding:

  • clockwise: action % 2

  • axis: (action // 2) % 3

  • index: index_grid[action // 6]
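The decoding rule can be tried out in plain Python. In this sketch the helper name decode_action and the index_grid value [0, 2] (outer slices of a 3x3x3 cube) are assumptions chosen for illustration.

```python
def decode_action(action, index_grid):
    """Split a flat action index into (axis, slice_index, clockwise)."""
    clockwise = action % 2
    axis = (action // 2) % 3
    slice_index = index_grid[action // 6]
    return axis, slice_index, clockwise


# Assumed index_grid for a 3x3x3 cube without centre-slice moves.
index_grid = [0, 2]
decoded = [decode_action(a, index_grid) for a in range(6 * len(index_grid))]
```

Each of the 12 action indices maps to a distinct triplet, so the encoding is a bijection for this index_grid.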

Return type:

tuple[State, Union[Array, ndarray, bool, number]]

state_symmetries(state)[source]

Return all 24 global rotational symmetries of a cube state.

The result is a batched State whose leading dimension is 24. This is useful for symmetry-aware hashing / canonicalization or data augmentation.
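The count of 24 can be checked independently of PuXle: the rotations of a cube are exactly the 3×3 signed permutation matrices with determinant +1. The sketch below only enumerates those matrices; it does not reproduce how state_symmetries() permutes stickers, and the helper names are hypothetical.

```python
from itertools import permutations, product


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def cube_rotations():
    """Enumerate signed permutation matrices with determinant +1."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            matrix = [
                [signs[r] if c == perm[r] else 0 for c in range(3)]
                for r in range(3)
            ]
            if det3(matrix) == 1:      # keep proper rotations, drop reflections
                rotations.append(matrix)
    return rotations


rotations = cube_rotations()
```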

Return type:

State

Parameters:

state (State)

is_solved(solve_config, state)[source]

Return True if the state is a target state. If the puzzle has multiple target states, return True when the state satisfies any of the target conditions. For example, Sokoban has multiple target states: the boxes must be on their target positions, but the player’s position may differ.

Return type:

bool

property inverse_action_map: Array | None

Defines the inverse action mapping for Rubik’s Cube. A rotation in one direction (e.g., clockwise) is inverted by a rotation in the opposite direction (counter-clockwise) on the same axis and slice.

Actions are generated from a meshgrid of (axis, index, clockwise), with clockwise being the fastest-changing dimension. This means actions are interleaved as [cw, ccw, cw, ccw, …]. The inverse of action 2k (cw) is 2k+1 (ccw), and vice versa.
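Because paired actions differ only in their lowest bit, the permutation can be written with a single XOR. This is a minimal sketch of that pairing; the function name and the action count of 12 are illustrative assumptions.

```python
def rubiks_inverse_action_map(action_size):
    """Pair action 2k with its opposite-direction partner 2k + 1."""
    # Flipping the lowest bit swaps 2k <-> 2k + 1.
    return [a ^ 1 for a in range(action_size)]


inverse = rubiks_inverse_action_map(12)
```

Applying the map twice returns every action to itself, which is the property an inverse-action permutation needs.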

action_to_string(action)[source]

Return a string representation of the action. Actions are encoded as (axis, index, clockwise), where:

  • axis: 0 = x-axis, 1 = y-axis, 2 = z-axis

  • index: slice index (0 or 2 for a 3x3x3 cube)

  • clockwise: 0 = counterclockwise, 1 = clockwise

For cubes larger than 3x3x3, internal slice rotations are named with layer numbers (e.g., L2, R2 for 4x4x4 cube).

Return type:

str

Parameters:

action (int)

get_img_parser()[source]

Return a callable that renders a State as an image array.

Return type:

Callable

class puxle.puzzles.rubikscube.RubiksCubeRandom[source]

Bases: RubiksCube

An extension of RubiksCube that generates the state with random moves.

property fixed_target: bool

Whether the target state is fixed and does not change. The default is only_target; redefine this property if the target state is not fixed.

__init__(size=3, initial_shuffle=26, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:
  • size (int)

  • initial_shuffle (int)

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

SlidePuzzle

class puxle.puzzles.slidepuzzle.SlidePuzzle[source]

Bases: Puzzle

N×N sliding tile puzzle (15-puzzle generalisation).

The board is a flat array of size² values where 0 represents the blank tile. Actions move the blank in four directions (←, →, ↑, ↓). Only solvable permutations are generated.
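Solvability of a sliding puzzle is commonly decided with an inversion-parity test. The sketch below shows that standard test under the assumption that the goal is 1, 2, …, size² − 1 followed by the blank; PuXle's actual goal layout and generation code are not shown here and may differ.

```python
def is_solvable(board, size):
    """Parity test for a size x size sliding puzzle; 0 is the blank.

    Assumes the goal is 1, 2, ..., size*size - 1 followed by the blank.
    """
    tiles = [t for t in board if t != 0]
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    if size % 2 == 1:
        # Odd width: blank moves never change inversion parity.
        return inversions % 2 == 0
    # Even width: the blank's row (counted from the bottom, 1-based) matters too.
    blank_row_from_bottom = size - board.index(0) // size
    return (inversions + blank_row_from_bottom) % 2 == 1


solved = list(range(1, 16)) + [0]
swapped = solved[:13] + [15, 14, 0]    # classic unsolvable 14/15 swap
```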

State packing uses ceil(log₂(size²)) bits per tile via xtructure.
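As a worked example of the bit-width formula, the snippet below computes the width for a few board sizes in plain Python. It illustrates the arithmetic only and does not call xtructure; the helper name is hypothetical.

```python
def bits_per_tile(size):
    """Smallest bit width that can hold every value in [0, size * size)."""
    # For size >= 2 this equals ceil(log2(size * size)), without float rounding.
    return (size * size - 1).bit_length()


widths = {size: bits_per_tile(size) for size in (3, 4, 5)}
packed_bits_15_puzzle = 16 * widths[4]   # 16 cells at 4 bits each
```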

Parameters:

size (int) – Edge length of the grid (default 4 → 15-puzzle).

define_state_class()[source]

Return the @state_dataclass class used for puzzle states.

Subclasses must implement this method. The returned class should use FieldDescriptor to declare its fields.

Return type:

PuzzleState

Returns:

A @state_dataclass class describing the puzzle state.

__init__(size=4, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:

size (int)

size: int
get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Return type:

Callable

Returns:

A function (state: State, **kwargs) -> str.

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_actions(solve_config, state, action, filled=True)[source]

Returns the next state and the cost of the move.

Return type:

tuple[State, Union[Array, ndarray, bool, number]]

is_solved(solve_config, state)[source]

Return True if the state is a target state. If the puzzle has multiple target states, return True when the state satisfies any of the target conditions. For example, Sokoban has multiple target states: the boxes must be on their target positions, but the player’s position may differ.

Return type:

bool

action_to_string(action)[source]

Return a human-readable name for the given action index.

Override in subclasses to provide meaningful names (e.g., "R" for right, "U'" for counter-clockwise).

Parameters:

action (int) – Integer action index in [0, action_size).

Return type:

str

Returns:

String representation of the action.

property inverse_action_map: Array | None

Defines the inverse action mapping for the Slide Puzzle. The actions are moving the blank tile [R, L, D, U]. The inverse is therefore [L, R, U, D].

get_img_parser()[source]

Return a callable that renders a State as an image array.

Return type:

Callable

class puxle.puzzles.slidepuzzle.SlidePuzzleHard[source]

Bases: SlidePuzzle

An extension of SlidePuzzle that generates the hardest state for the puzzle.

__init__(size=4, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:

size (int)

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

class puxle.puzzles.slidepuzzle.SlidePuzzleRandom[source]

Bases: SlidePuzzle

An extension of SlidePuzzle that generates a random state for the puzzle.

property fixed_target: bool

Whether the target state is fixed and does not change. The default is only_target; redefine this property if the target state is not fixed.

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

Sokoban

class puxle.puzzles.sokoban.Sokoban[source]

Bases: Puzzle

Sokoban (box-pushing) puzzle using the Boxoban dataset.

Each cell is one of four base types (empty / wall / player / box) packed in 2 bits. The board is fixed at 10×10 and levels are loaded from pre-packed .npy files shipped with the puxle.data subpackage.
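To make the 2-bit packing concrete, here is a small pure-Python sketch that packs a 100-cell board into 25 bytes and unpacks it again. The bit order within each byte and the helper names are assumptions; the on-disk layout of the .npy files may differ.

```python
EMPTY, WALL, PLAYER, BOX = 0, 1, 2, 3   # the four base types, as in Sokoban.Object


def pack_cells(cells):
    """Pack 2-bit cell codes into bytes, four cells per byte (low bits first)."""
    padded = list(cells) + [0] * (-len(cells) % 4)
    out = bytearray()
    for i in range(0, len(padded), 4):
        byte = 0
        for offset, cell in enumerate(padded[i:i + 4]):
            byte |= (cell & 0b11) << (2 * offset)
        out.append(byte)
    return bytes(out)


def unpack_cells(packed, count):
    """Inverse of pack_cells: recover the first `count` cell codes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]


board = [WALL] * 10 + [EMPTY, PLAYER, BOX] + [EMPTY] * 87   # 10 x 10 = 100 cells
packed = pack_cells(board)
```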

This puzzle is not reversible in the standard sense — inverse neighbours are computed via a dedicated pull-move implementation in get_inverse_neighbours().

Two solve conditions are supported:

  • ALL_BOXES_ON_TARGET (default): only box positions must match.

  • ALL_BOXES_ON_TARGET_AND_PLAYER_ON_TARGET: both box and player positions must match the goal.
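The difference between the two conditions can be sketched with plain position sets. The enum below mirrors Sokoban.SolveCondition, while the function signature and the (row, col) tuples are hypothetical simplifications of the packed board state.

```python
from enum import Enum


class SolveCondition(Enum):
    ALL_BOXES_ON_TARGET = 0
    ALL_BOXES_ON_TARGET_AND_PLAYER_ON_TARGET = 1


def is_solved(boxes, player, goal_boxes, goal_player, condition):
    """Compare positions given as (row, col) tuples; boxes are sets of tuples."""
    boxes_match = set(boxes) == set(goal_boxes)
    if condition is SolveCondition.ALL_BOXES_ON_TARGET:
        return boxes_match
    return boxes_match and player == goal_player


goal_boxes = {(2, 3), (4, 4)}
boxes_only = is_solved(
    {(4, 4), (2, 3)}, (1, 1), goal_boxes, (5, 5),
    SolveCondition.ALL_BOXES_ON_TARGET,
)
with_player = is_solved(
    {(4, 4), (2, 3)}, (1, 1), goal_boxes, (5, 5),
    SolveCondition.ALL_BOXES_ON_TARGET_AND_PLAYER_ON_TARGET,
)
```

Here the boxes match the goal but the player does not, so only the default condition reports the state as solved.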

Parameters:
  • size (int) – Board edge length (must be 10).

  • solve_condition (SolveCondition) – Which condition defines a solved state.

class Object[source]

Bases: Enum

EMPTY = 0
WALL = 1
PLAYER = 2
BOX = 3
TARGET = 4
PLAYER_ON_TARGET = 5
BOX_ON_TARGET = 6
TARGET_PLAYER = 7
class SolveCondition[source]

Bases: Enum

ALL_BOXES_ON_TARGET = 0
ALL_BOXES_ON_TARGET_AND_PLAYER_ON_TARGET = 1
define_state_class()[source]

Defines the state class for Sokoban using xtructure.

Return type:

PuzzleState

__init__(size=10, solve_condition=SolveCondition.ALL_BOXES_ON_TARGET, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

size: int = 10
solve_condition: SolveCondition = None
property is_reversible: bool

Indicates whether the puzzle is fully reversible through the inverse_action_map. This is true if an inverse_action_map is provided. Puzzles with custom, non-symmetric inverse logic (like Sokoban) should override this to return False.

property fixed_target: bool

Whether the target state is fixed and does not change. The default is only_target; redefine this property if the target state is not fixed.

data_init()[source]

Hook for loading datasets or heavy resources during init.

Called before define_state_class(). Override in puzzles that require external data (e.g., Sokoban level files).

get_data(key)[source]

Optionally sample or return puzzle-specific data used by get_inits.

Parameters:

key (PRNGKey) – Optional JAX PRNG key for stochastic data selection.

Return type:

tuple[Union[Array, ndarray, bool, number], Union[Array, ndarray, bool, number]]

Returns:

Puzzle-specific data (e.g., a Sokoban level index) or None.

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

is_solved(solve_config, state)[source]

Return True if the state is a target state. If the puzzle has multiple target states, return True when the state satisfies any of the target conditions. For example, Sokoban has multiple target states: the boxes must be on their target positions, but the player’s position may differ.

Return type:

bool

action_to_string(action)[source]

Return a human-readable name for the given action index.

Override in subclasses to provide meaningful names (e.g., "R" for right, "U'" for counter-clockwise).

Parameters:

action (int) – Integer action index in [0, action_size).

Return type:

str

Returns:

String representation of the action.

get_solve_config_string_parser()[source]

Return a callable that renders a SolveConfig as a string.

The default implementation delegates to get_string_parser() on solve_config.TargetState. Override when the solve config contains fields beyond TargetState.

Return type:

Callable

Returns:

A function (solve_config: SolveConfig) -> str.

get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Return type:

Callable

Returns:

A function (state: State, **kwargs) -> str.

get_actions(solve_config, state, action, filled=True)[source]

Returns the next state and cost for a given action.

Return type:

tuple[State, Union[Array, ndarray, bool, number]]

get_img_parser()[source]

Return a callable that renders a State as an image array.

Return type:

Callable

solve_config_to_state_transform(solve_config, key=None)[source]

Transform the solve config into a state.

Return type:

State

hindsight_transform(solve_config, state)[source]

Transform the state into a solve config.

Return type:

SolveConfig

get_inverse_neighbours(solve_config, state, filled=True)[source]

Returns possible previous states and their associated costs. In Sokoban, inverse moves correspond to ‘pulling’ a box or simply moving back to the previous position if no box is involved. If an inverse move is not possible, it returns the original state with an infinite cost.
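A pull move for a single direction might look like the following sketch. It works on sets of (row, col) positions rather than the packed board, and the unit cost, the direction convention, and the function name pull are assumptions; only the "original state with infinite cost" behaviour comes from the description above.

```python
import math


def pull(walls, boxes, player, direction, size=10):
    """Undo a forward move made in `direction` (a (row, col) delta).

    Returns ((boxes, player), cost); an impossible inverse move returns the
    original state with infinite cost.
    """
    dr, dc = direction
    prev = (player[0] - dr, player[1] - dc)     # where the player stood before
    ahead = (player[0] + dr, player[1] + dc)    # where a pushed box would now be
    in_bounds = 0 <= prev[0] < size and 0 <= prev[1] < size
    if not in_bounds or prev in walls or prev in boxes:
        return (boxes, player), math.inf
    if ahead in boxes:
        # Pull the box back onto the square the player is leaving.
        boxes = (boxes - {ahead}) | {player}
    return (boxes, prev), 1.0


boxes = frozenset({(5, 6)})
(pulled_boxes, pulled_player), pull_cost = pull(set(), boxes, (5, 5), (0, 1))
(_, stuck_player), stuck_cost = pull({(5, 4)}, boxes, (5, 5), (0, 1))
```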

Return type:

tuple[State, Union[Array, ndarray, bool, number]]

class puxle.puzzles.sokoban.SokobanHard[source]

Bases: Sokoban

data_init()[source]

Hook for loading datasets or heavy resources during init.

Called before define_state_class(). Override in puzzles that require external data (e.g., Sokoban level files).

LightsOut

class puxle.puzzles.lightsout.LightsOut[source]

Bases: Puzzle

Lights Out puzzle on an N×N grid.

Pressing a button toggles it and its four orthogonal neighbours. The goal is to turn all lights off. Each action is its own inverse, so inverse_action_map is the identity.
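The toggle rule and the self-inverse property can be sketched on a flat 0/1 list. The row-major action indexing (action = row * size + col) and the helper name press are assumptions for illustration.

```python
def press(board, size, action):
    """Toggle cell `action` and its orthogonal neighbours on a flat 0/1 board."""
    row, col = divmod(action, size)
    new_board = list(board)
    for dr, dc in ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < size and 0 <= c < size:
            new_board[r * size + c] ^= 1
    return new_board


size = 3
off = [0] * (size * size)
after_centre = press(off, size, 4)              # press the centre button
restored = press(after_centre, size, 4)         # pressing again undoes it
```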

The board is stored as 1-bit-per-cell via xtructure bitpacking. A GF(2) solvability check is available via board_is_solvable().
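One common way to run such a check is Gaussian elimination over GF(2) on the button-to-cell toggle matrix. The sketch below is a generic version of that technique using Python integers as bit rows; it is not the code behind board_is_solvable(), and the names toggle_mask and is_solvable_gf2 are hypothetical.

```python
def toggle_mask(size, row, col):
    """Bitmask of the cells flipped by pressing (row, col)."""
    mask = 0
    for dr, dc in ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < size and 0 <= c < size:
            mask |= 1 << (r * size + c)
    return mask


def is_solvable_gf2(board, size):
    """Check whether A x = board has a solution over GF(2)."""
    n = size * size
    # Row i lists the presses that affect cell i, plus the target bit at position n.
    rows = [
        toggle_mask(size, i // size, i % size) | ((board[i] & 1) << n)
        for i in range(n)
    ]
    rank = 0
    for col in range(n):
        pivot = next((i for i in range(rank, n) if rows[i] >> col & 1), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(n):
            if i != rank and rows[i] >> col & 1:
                rows[i] ^= rows[rank]
        rank += 1
    # Rows past the rank have all-zero coefficients; a set target bit means 0 = 1.
    return all(not (row >> n & 1) for row in rows[rank:])


corner_only = [1] + [0] * 24                       # 5x5 board, single lit corner
pressed = toggle_mask(5, 0, 0) ^ toggle_mask(5, 2, 3)
reachable = [(pressed >> i) & 1 for i in range(25)]
```

A board produced by pressing buttons from the all-off state is solvable by construction, whereas the 5×5 board with only one corner lit is not.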

Parameters:
  • size (int) – Edge length of the grid (default 7).

  • initial_shuffle (int) – Number of random presses for scrambling (default 8).

define_state_class()[source]

Defines the state class for LightsOut using xtructure.

Return type:

PuzzleState

__init__(size=7, initial_shuffle=8, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:
  • size (int)

  • initial_shuffle (int)

size: int
get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Return type:

Callable

Returns:

A function (state: State, **kwargs) -> str.

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_target_state(key=None)[source]
Return type:

State

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_actions(solve_config, state, action, filled=True)[source]

This function returns the next state and cost for a given action.

Return type:

tuple[State, Array]

is_solved(solve_config, state)[source]

Return True if the state is a target state. If the puzzle has multiple target states, return True when the state satisfies any of the target conditions. For example, Sokoban has multiple target states: the boxes must be on their target positions, but the player’s position may differ.

Return type:

bool

action_to_string(action)[source]

Return a string representation of the action.

Return type:

str

Parameters:

action (int)

property inverse_action_map: Array | None

Defines the inverse action mapping for LightsOut. Each action (flipping a tile) is its own inverse.

classmethod board_is_solvable(board, size)[source]
Return type:

bool

is_state_solvable(state)[source]
Return type:

bool

Parameters:

state (State)

get_img_parser()[source]

Return a callable that renders a State as an image array.

Return type:

Callable

class puxle.puzzles.lightsout.LightsOutRandom[source]

Bases: LightsOut

An extension of LightsOut that generates a random state for the puzzle.

property fixed_target: bool

Whether the target state is fixed and does not change. The default is only_target; redefine this property if the target state is not fixed.

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

TowerOfHanoi

class puxle.puzzles.hanoi.TowerOfHanoi[source]

Bases: Puzzle

Tower of Hanoi puzzle with variable pegs.

Move all disks from the first peg to the last peg, obeying three rules:

  1. Only one disk may be moved at a time.

  2. A move takes the topmost disk from one peg and places it on another.

  3. No disk may be placed on top of a smaller disk.

Each peg is stored as a fixed-length array of shape (num_disks + 1,) whose first element is the current disk count and subsequent elements hold disk sizes (smallest at index 1 = top).

Actions encode ordered (from_peg, to_peg) pairs, giving num_pegs × (num_pegs − 1) possible moves (invalid moves yield infinite cost).
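The peg layout and move rule can be sketched with plain lists. The ordering of the action pairs, the zero padding, the unit move cost, and the helper names are all assumptions for illustration; only the [count, top, …] layout and the infinite cost for invalid moves come from the description above.

```python
import math
from itertools import permutations


def hanoi_actions(num_pegs):
    """All ordered (from_peg, to_peg) pairs; this ordering is an assumption."""
    return list(permutations(range(num_pegs), 2))


def move(pegs, from_peg, to_peg):
    """Apply one move on pegs stored as [count, top, ..., bottom, 0 padding]."""
    src, dst = pegs[from_peg], pegs[to_peg]
    if src[0] == 0:
        return pegs, math.inf                     # nothing to move
    disk = src[1]
    if dst[0] > 0 and dst[1] < disk:
        return pegs, math.inf                     # larger disk onto a smaller one
    new_pegs = [list(p) for p in pegs]
    new_pegs[from_peg] = [src[0] - 1] + src[2:] + [0]
    new_pegs[to_peg] = [dst[0] + 1, disk] + dst[1:-1]
    return new_pegs, 1.0


start = [[3, 1, 2, 3], [0, 0, 0, 0], [0, 0, 0, 0]]
after_first, cost_first = move(start, 0, 2)
after_illegal, cost_illegal = move(after_first, 0, 2)   # disk 2 onto disk 1
```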

Parameters:
  • size (int) – Number of disks (default 5).

  • num_pegs – Number of pegs (default 3).

num_pegs: int = 3
define_state_class()[source]

Defines the state class for Tower of Hanoi using xtructure.

Return type:

PuzzleState

__init__(size=5, **kwargs)[source]

Initialize the Tower of Hanoi puzzle

Parameters:
  • size (int) – The number of disks in the puzzle.

num_disks: int
max_disk_value: int
get_string_parser()[source]

Returns a function to convert a state to a string representation

get_img_parser()[source]

Returns a function to convert a state to an image representation

Return type:

Callable

get_initial_state(solve_config, key=None, data=None)[source]

Generate the initial state for the puzzle with all disks on the first peg

Return type:

State

Parameters:

solve_config (SolveConfig)

get_solve_config(key=None, data=None)[source]

Create the solving configuration (target state) - all disks on third peg

Return type:

SolveConfig

get_actions(solve_config, state, action, filled=True)[source]

Get the next state by performing the action (moving a disk).

Return type:

tuple[State, Array]

is_solved(solve_config, state)[source]

Check if the current state matches the target state

Return type:

bool

action_to_string(action)[source]

Return a string representation of the action

Return type:

str

Parameters:

action (int)

Maze

class puxle.puzzles.maze.Maze[source]

Bases: Puzzle

Randomly-generated 2-D maze puzzle.

A size × size boolean grid is generated via randomised depth-first search (True = wall, False = path). The player position is a 2-element uint16 coordinate. Four actions (←, →, ↑, ↓) are available; illegal moves (into walls or out of bounds) incur infinite cost.
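A randomised depth-first carve can be sketched in pure Python as follows. This version places cells on odd coordinates and uses Python's random module rather than JAX PRNG keys; both choices, and the function name, are assumptions and not necessarily what Maze does internally.

```python
import random


def generate_maze(size, seed=0):
    """Carve a maze with iterative randomised DFS; True = wall, False = path."""
    rng = random.Random(seed)
    grid = [[True] * size for _ in range(size)]
    grid[1][1] = False
    stack = [(1, 1)]
    while stack:
        r, c = stack[-1]
        # Unvisited cells two steps away, staying inside the outer wall.
        options = [
            (r + dr, c + dc, r + dr // 2, c + dc // 2)
            for dr, dc in ((-2, 0), (2, 0), (0, -2), (0, 2))
            if 0 < r + dr < size - 1 and 0 < c + dc < size - 1 and grid[r + dr][c + dc]
        ]
        if not options:
            stack.pop()               # dead end: backtrack
            continue
        nr, nc, wr, wc = rng.choice(options)
        grid[wr][wc] = False          # knock down the wall between the two cells
        grid[nr][nc] = False
        stack.append((nr, nc))
    return grid


maze = generate_maze(7)
```

With an odd size the carve reaches every odd-coordinate cell and leaves a solid outer wall, which is one reason odd sizes give well-formed mazes.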

The maze layout is stored inside the SolveConfig so that both the target position and wall configuration travel together.

This puzzle is reversible: each direction has a clear inverse (left ↔ right, up ↔ down).

Parameters:

size (int) – Edge length of the square grid (default 23; should be odd for well-formed DFS mazes).

define_solve_config_class()[source]

Return the @state_dataclass class used for goal/solve configuration.

The default implementation creates a SolveConfig with a single TargetState field. Override this when the goal representation requires additional fields (e.g., a goal mask for PDDL domains).

Return type:

PuzzleState

Returns:

A @state_dataclass class describing the solve configuration.

define_state_class()[source]

Return the @state_dataclass class used for puzzle states.

Subclasses must implement this method. The returned class should use FieldDescriptor to declare its fields.

Return type:

PuzzleState

Returns:

A @state_dataclass class describing the puzzle state.

__init__(size=23, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:

size (int)

size: int
get_solve_config_string_parser()[source]

Return a callable that renders a SolveConfig as a string.

The default implementation delegates to get_string_parser() on solve_config.TargetState. Override when the solve config contains fields beyond TargetState.

Return type:

Callable

Returns:

A function (solve_config: SolveConfig) -> str.

get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Return type:

Callable

Returns:

A function (state: State, **kwargs) -> str.

get_initial_state(solve_config, key=Array([0, 0], dtype=uint32), data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_solve_config(key=Array([0, 128], dtype=uint32), data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_actions(solve_config, state, action, filled=True)[source]

Returns the next state and cost for a given action.

Return type:

tuple[State, Array]

is_solved(solve_config, state)[source]

Return True if the state is a target state. If the puzzle has multiple target states, return True when the state satisfies any of the target conditions. For example, Sokoban has multiple target states: the boxes must be on their target positions, but the player’s position may differ.

Return type:

bool

action_to_string(action)[source]

Return a human-readable name for the given action index.

Override in subclasses to provide meaningful names (e.g., "R" for right, "U'" for counter-clockwise).

Parameters:

action (int) – Integer action index in [0, action_size).

Return type:

str

Returns:

String representation of the action.

property inverse_action_map: Array | None

Defines the inverse action mapping for the Maze. Actions are [L, R, U, D], so the inverse is [R, L, D, U].
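As an index permutation, the inverse of [L, R, U, D] is [1, 0, 3, 2]. The sketch below pairs that map with a simple step function on a boolean wall grid; the (row, col) deltas, the unit cost, and the function name step are assumptions for illustration.

```python
import math

MOVES = [(0, -1), (0, 1), (-1, 0), (1, 0)]   # L, R, U, D as (row, col) deltas (assumed)
INVERSE_ACTION_MAP = [1, 0, 3, 2]            # R, L, D, U


def step(walls, pos, action):
    """Move on a boolean grid (True = wall); illegal moves cost infinity."""
    size = len(walls)
    r, c = pos[0] + MOVES[action][0], pos[1] + MOVES[action][1]
    if not (0 <= r < size and 0 <= c < size) or walls[r][c]:
        return pos, math.inf
    return (r, c), 1.0


walls = [
    [True, True, True],
    [False, False, True],
    [True, True, True],
]
moved, cost = step(walls, (1, 0), 1)                     # step right
back, _ = step(walls, moved, INVERSE_ACTION_MAP[1])      # undo with the inverse action
blocked, blocked_cost = step(walls, (1, 1), 1)           # wall to the right
```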

get_img_parser()[source]

Return a callable that renders a State as an image array.

Return type:

Callable

get_solve_config_img_parser()[source]

Return a callable that renders a SolveConfig as an image array.

The default implementation delegates to get_img_parser() on solve_config.TargetState. Override when the solve config contains fields beyond TargetState.

Return type:

Callable

Returns:

A function (solve_config: SolveConfig) -> jnp.ndarray.

TSP (Traveling Salesman Problem)

class puxle.puzzles.tsp.TSP[source]

Bases: Puzzle

Travelling Salesman Problem (TSP) as a sequential-visit puzzle.

size cities are uniformly sampled in the unit square. The agent starts at a random city and must visit every remaining city exactly once, minimising total Euclidean distance (including the return to the start city).

State consists of a packed visited-mask (1 bit / city) and the index of the current city. The action space equals the number of cities; visiting an already-visited city yields infinite cost.
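A single transition under this state layout could be sketched as below, with the visited mask held in a Python integer. Charging the return leg on the move that completes the tour, passing the start city explicitly, and the name tsp_step are assumptions; the description above does not say how the return distance is accounted for.

```python
import math


def tsp_step(points, start, visited_mask, current, action):
    """Visit city `action`; revisiting a city costs infinity.

    Assumption: the closing leg back to `start` is charged on the move that
    completes the tour.
    """
    if visited_mask >> action & 1:
        return (visited_mask, current), math.inf
    cost = math.dist(points[current], points[action])
    new_mask = visited_mask | (1 << action)
    if new_mask == (1 << len(points)) - 1:
        cost += math.dist(points[action], points[start])
    return (new_mask, action), cost


points = [(0.0, 0.0), (0.6, 0.0), (0.6, 0.8)]   # three cities in the unit square
state = (0b001, 0)                               # start at city 0, already visited
state, c1 = tsp_step(points, 0, *state, 1)
state, c2 = tsp_step(points, 0, *state, 2)       # last city: includes the return leg
_, c3 = tsp_step(points, 0, *state, 1)           # city 1 was already visited
```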

This puzzle is not reversible.

Parameters:

size (int) – Number of cities (default 16).

define_state_class()[source]

Defines the state class for TSP using xtructure.

Return type:

PuzzleState

define_solve_config_class()[source]

Defines the solve config class for TSP using xtructure.

Return type:

PuzzleState

__init__(size=16, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:

size (int)

size: int

get_solve_config_string_parser()[source]

Return a callable that renders a SolveConfig as a string.

The default implementation delegates to get_string_parser() on solve_config.TargetState. Override when the solve config contains fields beyond TargetState.

Return type:

Callable

Returns:

A function (solve_config: SolveConfig) -> str.

get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Returns:

A function (state: State, **kwargs) -> str.

get_initial_state(solve_config, key=Array([0, 0], dtype=uint32), data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_actions(solve_config, state, action, filled=True)[source]

Return the next state and cost for a given action (the index of the next point to visit). Moving to an already-visited point has infinite cost.
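
A plain-Python sketch of the transition rule, under stated assumptions: tsp_step is a hypothetical helper, the state is left unchanged on an invalid move, and the closing leg back to the start city is not modelled here.

```python
import math


def tsp_step(points, visited, current, action):
    """Return (visited, current, cost) after trying to move to city `action`."""
    if visited[action]:
        # Revisiting is penalised with infinite cost (state kept as-is here).
        return visited, current, math.inf
    cost = math.dist(points[current], points[action])
    new_visited = list(visited)
    new_visited[action] = True
    return new_visited, action, cost


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
visited = [True, False, False]  # the start city counts as visited

visited1, current1, cost1 = tsp_step(points, visited, 0, 1)
visited2, current2, cost2 = tsp_step(points, visited1, current1, 0)
```

The first call moves from city 0 to city 1 along a 3-4-5 triangle; the second tries to return to city 0, which is already visited.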

Return type:

tuple[State, Union[Array, ndarray, bool, number]]

is_solved(solve_config, state)[source]

TSP is solved when all points have been visited.

Return type:

bool

action_to_string(action)[source]

Return a string representation of the action.

Return type:

str

Parameters:

action (int)

get_solve_config_img_parser()[source]

Return a callable that renders a SolveConfig as an image array.

The default implementation delegates to get_img_parser() on solve_config.TargetState. Override when the solve config contains fields beyond TargetState.

Return type:

Callable

Returns:

A function (solve_config: SolveConfig) -> jnp.ndarray.

get_img_parser()[source]

This function returns an img_parser that visualizes the TSP problem. It draws all the points scaled to fit into the image, highlights the start point in green, marks visited points in blue and unvisited in red, and outlines the current point with a black border. If all points are visited, it draws a line from the current point back to the start point.

PancakeSorting

class puxle.puzzles.pancake.PancakeSorting[source]

Bases: Puzzle

Pancake Sorting (prefix-reversal) puzzle.

A stack of size distinctly-sized pancakes must be sorted so that the largest is at the bottom (ascending order from top). The only allowed operation is a prefix flip: choose a position k and reverse the top k + 1 pancakes.

Every flip is its own inverse, so inverse_action_map is the identity permutation.

The state is a 1-D uint8 permutation of [1 .. size].

Parameters:

size (int) – Number of pancakes in the stack (default 35).
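
The self-inverse property of a prefix flip can be checked with a short plain-Python sketch. The helper flip is hypothetical and takes the number of pancakes to reverse directly; how an action index maps to that count is not covered here.

```python
def flip(stack, count):
    """Reverse the top `count` pancakes; index 0 is the top of the stack."""
    return stack[:count][::-1] + stack[count:]


stack = [3, 1, 4, 2, 5]
once = flip(stack, 3)   # reverses the prefix [3, 1, 4]
twice = flip(once, 3)   # repeating the same flip restores the stack
```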

define_state_class()[source]

Defines the state class for PancakeSorting using xtructure.

Return type:

PuzzleState

__init__(size=35, **kwargs)[source]

Initialize the Pancake Sorting puzzle.

Parameters:

size (int) – The number of pancakes in the stack.

size: int

get_string_parser()[source]

Returns a function to convert a state to a string representation.

get_img_parser()[source]

Returns a function to convert a state to an image representation.

Return type:

Callable

get_initial_state(solve_config, key=None, data=None)[source]

Generate a random initial state for the puzzle.

Return type:

State

Parameters:

solve_config (SolveConfig)

get_solve_config(key=None, data=None)[source]

Create the solving configuration (target state).

Return type:

SolveConfig

get_actions(solve_config, state, action, filled=True)[source]

Return the next state and cost after flipping the stack at the position determined by action, where flip_pos = action + 1.
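
JIT-compiled code generally needs fixed array shapes, so a flip is often expressed as an index remapping rather than a variable-length slice. The plain-Python sketch below mirrors that idea; whether PuXle implements the flip this way is an assumption, and flip_by_index and its `last` argument are hypothetical names.

```python
def flip_by_index(stack, last):
    """Reverse positions 0..last (inclusive) using a fixed-length index remap."""
    n = len(stack)
    # Position i reads from last - i inside the prefix and from i outside it.
    source = [last - i if i <= last else i for i in range(n)]
    return [stack[j] for j in source]


stack = [2, 4, 1, 3]
flipped = flip_by_index(stack, 2)
restored = flip_by_index(flipped, 2)
```

The output always has the same length as the input, which is the property a traced array computation relies on.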

Return type:

tuple[State, Array]

is_solved(solve_config, state)[source]

Check if the current state matches the target state (sorted).

Return type:

bool

action_to_string(action)[source]

Return a string representation of the action.

Return type:

str

Parameters:

action (int)

property inverse_action_map: Array | None

Defines the inverse action mapping for PancakeSorting. Each action (flipping a prefix of the stack) is its own inverse.

TopSpin

class puxle.puzzles.topspin.TopSpin[source]

Bases: Puzzle

Top Spin puzzle on a circular track.

n_discs numbered tokens sit on a ring. Three actions are available:

  • Shift left (action 0): rotate the entire ring one position counter-clockwise.

  • Shift right (action 1): rotate the ring one position clockwise.

  • Reverse turnstile (action 2): reverse the first turnstile_size tokens in the array.

The goal is the sorted permutation [1, 2, …, n_discs].

Inverse action map: left ↔ right; reverse is self-inverse.

Parameters:
  • size (int) – Number of tokens on the ring, referred to as n_discs above (default 20).

  • turnstile_size (int) – Number of tokens covered by the turnstile (default 4).
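
The three actions can be sketched on a plain Python list. This is an illustration, not the library's implementation: top_spin_step is a hypothetical helper, and mapping "shift left" to moving the first token to the end of the array is an assumed direction convention.

```python
def top_spin_step(ring, action, turnstile_size=4):
    """Apply one Top Spin action to a list of tokens."""
    if action == 0:  # shift left: the first token wraps to the end (assumed)
        return ring[1:] + ring[:1]
    if action == 1:  # shift right: the last token wraps to the front
        return ring[-1:] + ring[:-1]
    if action == 2:  # reverse the first turnstile_size tokens
        return ring[:turnstile_size][::-1] + ring[turnstile_size:]
    raise ValueError(f"unknown action {action}")


ring = [1, 2, 3, 4, 5, 6]
left = top_spin_step(ring, 0)
right = top_spin_step(ring, 1)
spun = top_spin_step(ring, 2)
```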

define_state_class()[source]

Defines the state class for TopSpin using xtructure.

Return type:

PuzzleState

__init__(size=20, turnstile_size=4, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

Parameters:
  • size (int)

  • turnstile_size (int)

n_discs: int

turnstile_size: int

get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Return type:

Callable

Returns:

A function (state: State, **kwargs) -> str.

get_solve_config(key=None, data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_initial_state(solve_config, key=None, data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_actions(solve_config, state, action, filled=True)[source]

Returns the next state and cost for a given action.

Return type:

tuple[State, Union[Array, ndarray, bool, number]]

is_solved(solve_config, state)[source]

Return True if the state is a target state. When a puzzle has multiple target states, return True if the state satisfies any of the target conditions. For example, Sokoban has multiple target states: every box must be on its target position, but the player's position may differ.

Return type:

bool

action_to_string(action)[source]

Return a human-readable name for the given action index.

Override in subclasses to provide meaningful names (e.g., "R" for right, "U'" for counter-clockwise).

Parameters:

action (int) – Integer action index in [0, action_size).

Return type:

str

Returns:

String representation of the action.

property inverse_action_map: Array | None

Defines the inverse action mapping for TopSpin:

  • Shift Left (0) <-> Shift Right (1).

  • Reverse Turnstile (2) is its own inverse.
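
One use of such a map is undoing a move sequence: reverse the sequence and replace each action by its inverse. The sketch below assumes the map is the index array [1, 0, 2]; the helpers step and undo are hypothetical and operate on plain lists rather than PuXle states.

```python
INVERSE = [1, 0, 2]  # left <-> right, reverse is self-inverse


def step(ring, action, k=4):
    """Toy Top Spin transition (assumed shift direction, see comments)."""
    if action == 0:
        return ring[1:] + ring[:1]        # shift left
    if action == 1:
        return ring[-1:] + ring[:-1]      # shift right
    return ring[:k][::-1] + ring[k:]      # reverse the first k tokens


def undo(actions):
    """Action sequence that reverts `actions` when applied afterwards."""
    return [INVERSE[a] for a in reversed(actions)]


start = [1, 2, 3, 4, 5, 6]
actions = [0, 2, 1, 1, 2]

state = start
for a in actions:
    state = step(state, a)
scrambled = state

for a in undo(actions):
    state = step(state, a)
restored = state
```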

get_img_parser()[source]

Return a callable that renders a State as an image (NumPy/JAX array).

Returns:

A function (state: State, **kwargs) -> jnp.ndarray producing an (H, W, 3) RGB image.

DotKnot

class puxle.puzzles.dotknot.DotKnot[source]

Bases: Puzzle

Dot-and-Knot path-connection puzzle.

On a size × size grid, pairs of same-coloured dots must be connected by moving them toward each other. When two dots of the same colour meet they merge into a path segment. The puzzle is solved when no unmerged dots remain (and the board is non-empty).

Cell encoding (4 bits per cell via xtructure bitpacking):

  • 0: empty.

  • 1 .. 2·color_num: dot endpoints (two per colour).

  • > 2·color_num: path segments.

Four directional actions move the lowest-indexed available dot.

This puzzle is not reversible.

Parameters:
  • size (int) – Edge length of the grid (default 10; must be ≥ 4).

  • color_num (int) – Number of dot colours (default 4).
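
The cell encoding and the solved condition can be sketched in plain Python. classify and looks_solved are hypothetical helpers working on nested lists; how path values map back to colours is not stated above and is not modelled.

```python
def classify(value, color_num=4):
    """Map a cell value to its documented category."""
    if value == 0:
        return "empty"
    if value <= 2 * color_num:
        return "dot"   # values 1 .. 2 * color_num are dot endpoints
    return "path"      # larger values are path segments


def looks_solved(board, color_num=4):
    """True when no unmerged dots remain and the board is not empty."""
    cells = [value for row in board for value in row]
    has_dot = any(classify(v, color_num) == "dot" for v in cells)
    non_empty = any(v != 0 for v in cells)
    return non_empty and not has_dot


only_paths = [[0, 9], [9, 0]]
with_dots = [[1, 0], [0, 2]]
blank = [[0, 0], [0, 0]]
```

With the default color_num = 4, dot endpoints occupy values 1 to 8, leaving 9 to 15 for path segments within the 4-bit range.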

define_solve_config_class()[source]

Return the @state_dataclass class used for goal/solve configuration.

The default implementation creates a SolveConfig with a single TargetState field. Override this when the goal representation requires additional fields (e.g., a goal mask for PDDL domains).

Return type:

PuzzleState

Returns:

A @state_dataclass class describing the solve configuration.

define_state_class()[source]

Return the @state_dataclass class used for puzzle states.

Subclasses must implement this method. The returned class should use FieldDescriptor to declare its fields.

Return type:

PuzzleState

Returns:

A @state_dataclass class describing the puzzle state.

__init__(size=10, color_num=4, **kwargs)[source]

Initialise the puzzle.

Subclass constructors must call super().__init__(**kwargs) after setting action_size and any instance attributes needed by define_state_class() / data_init().

This method:

  1. Calls data_init() for optional dataset loading.

  2. Builds State and SolveConfig classes.

  3. JIT-compiles core methods (get_neighbours, is_solved, etc.).

  4. Validates action_size and pre-computes the inverse-action permutation.

Raises:

ValueError – If action_size is still None after subclass init.

size: int

get_solve_config_string_parser()[source]

Return a callable that renders a SolveConfig as a string.

The default implementation delegates to get_string_parser() on solve_config.TargetState. Override when the solve config contains fields beyond TargetState.

Return type:

Callable

Returns:

A function (solve_config: SolveConfig) -> str.

get_string_parser()[source]

Return a callable that renders a State as a human-readable string.

Return type:

Callable

Returns:

A function (state: State, **kwargs) -> str.

get_initial_state(solve_config, key=Array([0, 128], dtype=uint32), data=None)[source]

Build and return the initial (scrambled) state for a given goal.

Parameters:
  • solve_config (SolveConfig) – The goal configuration for this episode.

  • key – Optional JAX PRNG key for random scrambling.

  • data – Optional puzzle-specific data from get_data().

Return type:

State

Returns:

A State instance representing the starting position.

get_solve_config(key=Array([0, 128], dtype=uint32), data=None)[source]

Build and return a goal / solve configuration.

Parameters:
  • key – Optional JAX PRNG key for stochastic goal generation.

  • data – Optional puzzle-specific data from get_data().

Return type:

SolveConfig

Returns:

A SolveConfig instance describing the puzzle objective.

get_actions(solve_config, state, action, filled=True)[source]

This function returns the next state and cost for a given action.

Return type:

tuple[State, Array]

is_solved(solve_config, state)[source]

Return True if the state is a target state. When a puzzle has multiple target states, return True if the state satisfies any of the target conditions. For example, Sokoban has multiple target states: every box must be on its target position, but the player's position may differ.

Return type:

bool

action_to_string(action)[source]

Return a human-readable name for the given action index.

Override in subclasses to provide meaningful names (e.g., "R" for right, "U'" for counter-clockwise).

Parameters:

action (int) – Integer action index in [0, action_size).

Return type:

str

Returns:

String representation of the action.

get_solve_config_img_parser()[source]

Return a callable that renders a SolveConfig as an image array.

The default implementation delegates to get_img_parser() on solve_config.TargetState. Override when the solve config contains fields beyond TargetState.

Return type:

Callable

Returns:

A function (solve_config: SolveConfig) -> jnp.ndarray.

get_img_parser()[source]

Return a callable that renders a State as an image (NumPy/JAX array).

Return type:

Callable

Room

class puxle.puzzles.room.Room[source]

Bases: Maze

Maze variant with a fixed 3×3 grid of rectangular rooms.

Each room has an internal dimension of room_dim × room_dim. The total grid size must have the form 3·N + 2 where N ≥ 1; if an invalid size is given, the nearest valid size is used instead.

Doors between adjacent rooms are opened using a randomised Kruskal-based spanning-tree algorithm to guarantee full connectivity. Additional doors may be opened with probability prob_open_extra_door.
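
A plain-Python sketch of randomised Kruskal over the 3×3 room layout. It is an illustration under assumptions, not the library's code: open_doors is a hypothetical helper, rooms are identified by (row, col) pairs, and the extra-door decision is made in the same pass as the spanning-tree construction.

```python
import random


def open_doors(prob_open_extra_door=1.0, seed=0):
    """Return the set of opened doors between rooms of a 3x3 layout."""
    rng = random.Random(seed)
    rooms = [(r, c) for r in range(3) for c in range(3)]
    # Candidate doors: one per pair of horizontally or vertically adjacent rooms.
    doors = [((r, c), (r, c + 1)) for r in range(3) for c in range(2)]
    doors += [((r, c), (r + 1, c)) for r in range(2) for c in range(3)]
    rng.shuffle(doors)

    parent = {room: room for room in rooms}

    def find(room):
        while parent[room] != room:
            parent[room] = parent[parent[room]]  # path halving
            room = parent[room]
        return room

    opened = set()
    for a, b in doors:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b   # spanning-tree door: always opened
            opened.add((a, b))
        elif rng.random() < prob_open_extra_door:
            opened.add((a, b))        # extra door, opened with the given probability
    return opened


tree_only = open_doors(prob_open_extra_door=0.0)
all_open = open_doors(prob_open_extra_door=1.0)
```

Nine rooms need eight doors for a spanning tree, and the layout has twelve candidate doors in total, so a probability of 0.0 opens eight doors and 1.0 opens all twelve.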

Inherits movement logic and inverse-action map from Maze.

Parameters:
  • size (int) – Total grid edge length (default 11, giving room_dim = 3).

  • prob_open_extra_door (float) – Probability of opening non-spanning-tree doors (default 1.0 = open all).
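
The size adjustment can be sketched as snapping to the closest value of the form 3·N + 2, which matches the default (11 = 3·3 + 2). The helper nearest_valid_size is hypothetical, and both the lower bound N ≥ 1 and the rounding rule are assumptions; the library's exact rule is not stated here.

```python
def nearest_valid_size(size):
    """Snap a grid size to the closest 3 * N + 2 with N >= 1.

    Returns (valid_size, room_dim).
    """
    # Integer form of round((size - 2) / 3); ties cannot occur because
    # consecutive valid sizes are 3 apart.
    room_dim = max(1, (size - 1) // 3)
    return 3 * room_dim + 2, room_dim


default = nearest_valid_size(11)
snapped_up = nearest_valid_size(10)
snapped_down = nearest_valid_size(9)
too_small = nearest_valid_size(3)
```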

__init__(size=11, prob_open_extra_door=1.0, **kwargs)[source]

Initialize with a specified size, calculating room dimension and adjusting to the nearest valid size if necessary.

Parameters:
  • size (int)

  • prob_open_extra_door (float)

room_dim: int