Benchmarks (puxle.benchmark)

Benchmark infrastructure for evaluating puzzle-solving algorithms against known-optimal solutions.

BenchmarkSample

Benchmark (Base)

class puxle.benchmark.benchmark.Benchmark[source]

Bases: ABC, Generic[StateT, SolveConfigT]

Abstract base class for a benchmark dataset.

Subclasses must implement:

  • build_puzzle()

  • load_dataset()

  • sample_ids()

  • get_sample(sample_id)

The base class lazily constructs and caches the puzzle and dataset properties, and provides a generic verify_solution() that checks both validity (is the final state solved?) and optimality (is the solution cost ≤ the optimal cost?).
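The subclassing contract and the lazy caching described above can be sketched without importing puxle. Everything in this block is a hypothetical stand-in: SketchBenchmark, ToySample and CountdownBenchmark are illustrative names, the private attributes _puzzle and _dataset are an assumption about how caching might be done, and the sample field names other than optimal_action_sequence are guesses.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Hashable, Iterable, Optional, Sequence


@dataclass
class ToySample:
    """Hypothetical stand-in for BenchmarkSample (field names partly assumed)."""
    state: int
    solve_config: int
    optimal_action_sequence: Optional[Sequence[str]]


class SketchBenchmark(ABC):
    """Stand-in base class: builds puzzle and dataset once, on first access."""

    def __init__(self) -> None:
        self._puzzle: Any = None
        self._dataset: Any = None

    @property
    def puzzle(self) -> Any:
        if self._puzzle is None:          # first access: construct and cache
            self._puzzle = self.build_puzzle()
        return self._puzzle

    @property
    def dataset(self) -> Any:
        if self._dataset is None:         # first access: load and cache
            self._dataset = self.load_dataset()
        return self._dataset

    @abstractmethod
    def build_puzzle(self) -> Any: ...

    @abstractmethod
    def load_dataset(self) -> Any: ...

    @abstractmethod
    def sample_ids(self) -> Iterable[Hashable]: ...

    @abstractmethod
    def get_sample(self, sample_id: Hashable) -> ToySample: ...


class CountdownBenchmark(SketchBenchmark):
    """Toy benchmark: reach 0 from n by repeating the action 'dec'."""

    def __init__(self) -> None:
        super().__init__()
        self.load_calls = 0               # counts how often the dataset is really loaded

    def build_puzzle(self) -> str:
        return "countdown-puzzle"

    def load_dataset(self) -> dict:
        self.load_calls += 1
        return {"a": 2, "b": 3}

    def sample_ids(self) -> Iterable[Hashable]:
        return list(self.dataset.keys())

    def get_sample(self, sample_id: Hashable) -> ToySample:
        n = self.dataset[sample_id]
        return ToySample(state=n, solve_config=0, optimal_action_sequence=["dec"] * n)


bench = CountdownBenchmark()
samples = [bench.get_sample(sid) for sid in bench.sample_ids()]
```

Although the dataset property is read three times here, load_dataset() runs only once, which is the point of caching an expensive load.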

__init__()[source]
Return type:

None

property puzzle: Puzzle

Return the puzzle used for this benchmark, constructing it lazily.

abstractmethod build_puzzle()[source]

Instantiate the puzzle that defines this benchmark.

Return type:

Puzzle

property dataset: Any

Load the dataset on demand and cache the result.

abstractmethod load_dataset()[source]

Return the raw dataset object backing the benchmark.

Return type:

Any

abstractmethod sample_ids()[source]

Return iterable sample identifiers available in the dataset.

Return type:

Iterable[Hashable]

abstractmethod get_sample(sample_id)[source]

Fetch the state, solve configuration and optimal action sequence for a sample.

Return type:

BenchmarkSample[StateT, SolveConfigT] (both type variables are bound to PuzzleState)

Parameters:

sample_id (Hashable)

verify_solution(sample, states=None, action_sequence=None)[source]

Verify that a solution is valid and optimal for the given sample.

If action_sequence or states are provided, they are treated as the candidate solution. Otherwise, verifies sample.optimal_action_sequence.

Returns:

  • True: if valid (solved) and the solution cost does not exceed the optimal cost (<= optimal cost).

  • False: if invalid (not solved) or suboptimal (> optimal cost).

  • None: if valid (solved) but the sample has no optimal information to compare against.

Return type:

bool | None

Parameters:
  • sample (BenchmarkSample[StateT, SolveConfigT])

  • states (Sequence[StateT] | None)

  • action_sequence (Sequence[str] | None)
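The three-valued result of verify_solution() can be illustrated with a small stand-alone function. This is a minimal sketch, not the library's code: verify_sketch, step and solved are hypothetical names, the toy puzzle is an integer countdown, and solution cost is assumed to be the number of actions (unit cost per action).

```python
from typing import Callable, Optional, Sequence


def verify_sketch(
    start: int,
    is_solved: Callable[[int], bool],
    apply_action: Callable[[int, str], int],
    candidate: Sequence[str],
    optimal: Optional[Sequence[str]],
) -> Optional[bool]:
    """Return True, False or None following the rules described above."""
    state = start
    for action in candidate:
        state = apply_action(state, action)
    if not is_solved(state):
        return False                      # invalid: final state is not solved
    if optimal is None:
        return None                       # valid, but nothing to compare against
    return len(candidate) <= len(optimal)  # valid; True only if not longer than optimal


# Toy puzzle: reach 0 from an integer using "dec" (-1) or "noop" (no change).
def step(state: int, action: str) -> int:
    return state - 1 if action == "dec" else state


def solved(state: int) -> bool:
    return state == 0


optimal_result = verify_sketch(2, solved, step, ["dec", "dec"], ["dec", "dec"])
suboptimal_result = verify_sketch(2, solved, step, ["dec", "noop", "dec"], ["dec", "dec"])
unsolved_result = verify_sketch(2, solved, step, ["dec"], ["dec", "dec"])
unknown_result = verify_sketch(2, solved, step, ["dec", "dec"], None)
```

Callers should compare the result with `is True`, `is False` and `is None` rather than relying on truthiness, since None and False mean different things here.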

DeepCubeA Utilities

class puxle.benchmark._deepcubea.DeepCubeAUnpickler[source]

Bases: Unpickler

Unpickler that recreates missing DeepCubeA environment classes on the fly.

find_class(module, name)[source]

Return an object from a specified module.

If necessary, the module will be imported. Subclasses may override this method (e.g. to restrict unpickling of arbitrary classes and functions).

This method is called whenever a class or a function object is needed. Both arguments passed are str objects.

Return type:

Any

Parameters:

  • module (str)

  • name (str)

puxle.benchmark._deepcubea.load_deepcubea(handle)[source]

Helper that loads a DeepCubeA pickle with the compatible unpickler.

Return type:

Any

Parameters:

handle (IO[bytes])
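The idea of recreating missing classes during unpickling can be sketched with the standard library alone. The block below is an illustration of the general technique, not the DeepCubeAUnpickler implementation: PlaceholderUnpickler, load_with_placeholders and the legacy_env module are hypothetical, and the real class may build its placeholders differently. As with any pickle loading, this should only be used on trusted files.

```python
import io
import pickle
import sys
import types
from typing import IO, Any


class PlaceholderUnpickler(pickle.Unpickler):
    """Synthesize an empty class when the pickled one cannot be imported."""

    def find_class(self, module: str, name: str) -> Any:
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # Create a bare class that carries the original module and name.
            return type(name, (), {"__module__": module})


def load_with_placeholders(handle: IO[bytes]) -> Any:
    """Load a pickle, tolerating classes whose defining module is unavailable."""
    return PlaceholderUnpickler(handle).load()


# Demo: pickle an instance of a class living in a temporary module,
# then remove that module so the class is "missing" at load time.
legacy = types.ModuleType("legacy_env")


class State:
    pass


State.__module__ = "legacy_env"
State.__qualname__ = "State"
legacy.State = State
sys.modules["legacy_env"] = legacy

obj = State()
obj.tiles = [1, 2, 3]
payload = pickle.dumps(obj)
del sys.modules["legacy_env"]

restored = load_with_placeholders(io.BytesIO(payload))
```

The restored object is an instance of a freshly created placeholder class, but its attributes (here tiles) are recovered intact, which is usually all a dataset loader needs.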

puxle.benchmark._deepcubea.load_deepcubea_dataset(dataset_path, dataset_name, package_resource, fallback_dir)[source]

Helper to load a DeepCubeA dataset from various possible locations.

Return type:

dict[str, Any]

Parameters:
  • dataset_path (Path | None)

  • dataset_name (str)

  • package_resource (str)

  • fallback_dir (Path)
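Loading "from various possible locations" typically means trying candidate paths in a fixed order. The sketch below shows one plausible order (explicit path first, then the fallback directory); the actual search order used by load_deepcubea_dataset is not documented here, the package_resource lookup is deliberately left out, and resolve_dataset_file is a hypothetical helper name.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def resolve_dataset_file(
    dataset_path: Optional[Path],
    dataset_name: str,
    fallback_dir: Path,
) -> Path:
    """Assumed lookup order: explicit path, then fallback_dir / dataset_name."""
    candidates: List[Path] = []
    if dataset_path is not None:
        # Accept either a direct file path or a directory containing the file.
        candidates.append(dataset_path)
        candidates.append(dataset_path / dataset_name)
    candidates.append(fallback_dir / dataset_name)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"{dataset_name} not found; tried: {tried}")


with tempfile.TemporaryDirectory() as tmp:
    fallback = Path(tmp)
    (fallback / "toy.pkl").write_bytes(b"")
    found_name = resolve_dataset_file(None, "toy.pkl", fallback).name
    explicit_name = resolve_dataset_file(fallback, "toy.pkl", Path("unused")).name
    try:
        resolve_dataset_file(None, "missing.pkl", fallback)
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True
```

Listing every attempted location in the error message makes a missing dataset much easier to diagnose.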

RubiksCube Benchmarks

SlidePuzzle Benchmarks

LightsOut Benchmarks

class puxle.benchmark.lightsout_deepcubea.LightsOutDeepCubeABenchmark[source]

Bases: Benchmark

Benchmark that exposes the DeepCubeA LightsOut dataset.

__init__(dataset_path=None, dataset_name='size7-deepcubeA.pkl', size=None)[source]
Parameters:
  • dataset_path (str | Path | None)

  • dataset_name (str)

  • size (int | None)

Return type:

None

build_puzzle()[source]

Instantiate the puzzle that defines this benchmark.

Return type:

LightsOut

load_dataset()[source]

Return the raw dataset object backing the benchmark.

Return type:

dict[str, Any]

sample_ids()[source]

Return iterable sample identifiers available in the dataset.

Return type:

Iterable[Hashable]

get_sample(sample_id)[source]

Fetch the state, solve configuration and optimal action sequence for a sample.

Return type:

BenchmarkSample

Parameters:

sample_id (Hashable)

verify_solution(sample, states=None, action_sequence=None)[source]

Verify that a solution is valid for the given sample. For 7x7 Lights Out, any solution without duplicate moves is considered optimal.
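One way to see why the duplicate-free rule is reasonable: pressing a cell twice cancels out and presses commute, so a solution is determined by the set of cells pressed an odd number of times. On a 7x7 board the press-to-toggle map is invertible over GF(2), so that set is unique for each solvable state, and a solution with no repeated move is exactly that set. The sketch below checks the invertibility numerically. It is an independent illustration, not the library's verification code; press_masks, gf2_rank and has_no_duplicates are hypothetical helpers, and treating actions as hashable values is an assumption.

```python
from typing import Hashable, List, Sequence


def press_masks(n: int) -> List[int]:
    """Bitmask of the cells toggled by pressing each cell of an n x n board."""
    masks = []
    for r in range(n):
        for c in range(n):
            mask = 0
            for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    mask |= 1 << (rr * n + cc)
            masks.append(mask)
    return masks


def gf2_rank(rows: List[int]) -> int:
    """Rank over GF(2) of bitmask rows, using an XOR basis."""
    basis: List[int] = []
    for row in rows:
        for b in basis:
            row = min(row, row ^ b)   # clears b's leading bit from row if it is set
        if row:
            basis.append(row)
    return len(basis)


def has_no_duplicates(actions: Sequence[Hashable]) -> bool:
    """True if no action appears more than once in the sequence."""
    return len(set(actions)) == len(actions)


rank_7 = gf2_rank(press_masks(7))   # full rank means each solvable state has one press set
rank_5 = gf2_rank(press_masks(5))   # for contrast: the 5x5 board is rank-deficient
```

The same shortcut would not be safe on board sizes whose toggle matrix is singular, such as 5x5, where several distinct press sets solve the same state.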

Return type:

bool | None

Parameters: