xtructure.hashtable package

Submodules

xtructure.hashtable.hashtable module

Hash table implementation using Cuckoo hashing technique for efficient state storage and lookup. This module provides functionality for hashing Xtructurables and managing collisions.

class xtructure.hashtable.hashtable.CuckooIdx(index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None), table_index: FieldDescriptor(dtype=<class 'jax.numpy.uint8'>, fill_value=255, intrinsic_shape=(), fill_value_factory=None, validator=None))[source]

Bases: object

property at
property bytes

Convert entire state tree to flattened byte array.

check_invariants()
classmethod default(shape: Tuple[int, ...] = ()) T
default_dtype = (<class 'jax.numpy.uint32'>, <class 'jax.numpy.uint8'>)
default_shape = ((), ())
property dtype: dtype

Get dtypes of all fields in the dataclass

flatten()
from_tuple()
hash(seed=0)

Main hash function that converts state to uint32 lanes and hashes them.

hash_with_uint32ed(seed=0)

Main hash function that converts state to uint32 lanes and hashes them. Returns both hash value and its uint32 representation.

index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None)
is_xtructed = True
classmethod load(path: str) T

Loads an instance from a .npz file.

padding_as_batch(batch_shape: tuple[int, ...])
classmethod random(shape=(), key=None)
replace(**kwargs)
reshape(new_shape: tuple[int, ...]) T
save(path: str)

Saves the instance to a .npz file.

property shape: shape

Returns a namedtuple containing the batch shape (if present) and the shapes of all fields. If a field is itself a xtructure_dataclass, its shape is included as a nested namedtuple.

str(**kwargs)
property structured_type: StructuredType
table_index: FieldDescriptor(dtype=<class 'jax.numpy.uint8'>, fill_value=255, intrinsic_shape=(), fill_value_factory=None, validator=None)
to_tuple()
property uint32ed

Convert pytree to uint32 array.

class xtructure.hashtable.hashtable.HashIdx(index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None))[source]

Bases: object

property at
property bytes

Convert entire state tree to flattened byte array.

check_invariants()
classmethod default(shape: Tuple[int, ...] = ()) T
default_dtype = (<class 'jax.numpy.uint32'>,)
default_shape = ((),)
property dtype: dtype

Get dtypes of all fields in the dataclass

flatten()
from_tuple()
hash(seed=0)

Main hash function that converts state to uint32 lanes and hashes them.

hash_with_uint32ed(seed=0)

Main hash function that converts state to uint32 lanes and hashes them. Returns both hash value and its uint32 representation.

index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None)
is_xtructed = True
classmethod load(path: str) T

Loads an instance from a .npz file.

padding_as_batch(batch_shape: tuple[int, ...])
classmethod random(shape=(), key=None)
replace(**kwargs)
reshape(new_shape: tuple[int, ...]) T
save(path: str)

Saves the instance to a .npz file.

property shape: shape

Returns a namedtuple containing the batch shape (if present) and the shapes of all fields. If a field is itself a xtructure_dataclass, its shape is included as a nested namedtuple.

str(**kwargs)
property structured_type: StructuredType
to_tuple()
property uint32ed

Convert pytree to uint32 array.

class xtructure.hashtable.hashtable.HashTable(seed: int, capacity: int, _capacity: int, cuckoo_table_n: int, size: int, table: Xtructurable, table_idx: Array | ndarray | bool | number, fingerprints: Array | ndarray | bool | number)[source]

Bases: object

Cuckoo Hash Table Implementation

This implementation uses multiple hash functions (specified by n_table) to resolve collisions. Each item can be stored in one of n_table possible positions.

seed

Initial seed for hash functions

Type:

int

capacity

User-specified capacity

Type:

int

_capacity

Actual internal capacity (larger than specified to handle collisions)

Type:

int

size

Current number of items in table

Type:

int

table

The actual storage for states

Type:

xtructure.core.protocol.Xtructurable

table_idx

Indices tracking which hash function was used for each entry

Type:

jax.Array | numpy.ndarray | numpy.bool | numpy.number

static build(dataclass: Xtructurable, seed: int, capacity: int, cuckoo_table_n: int = 2, hash_size_multiplier: int = 2) HashTable[source]

Initialize a new hash table with specified parameters.

Parameters:
  • dataclass – Example Xtructurable to determine the structure

  • seed – Initial seed for hash functions

  • capacity – Desired capacity of the table

Returns:

Initialized HashTable instance

capacity: int
cuckoo_table_n: int
fingerprints: Array | ndarray | bool | number
from_tuple()
insert(input: Xtructurable) tuple[HashTable, bool, HashIdx][source]

insert the state in the table

Returns (table, inserted?, flat_idx).

lookup(input: Xtructurable) tuple[HashIdx, bool][source]

Find a state in the hash table.

Returns a tuple of (HashIdx, found) where HashIdx.index is the flat index into table.table, and found indicates existence.

lookup_cuckoo(input: Xtructurable) tuple[CuckooIdx, bool, Array | ndarray | bool | number][source]

Finds the state in the hash table using Cuckoo hashing.

Parameters:
  • table – The HashTable instance.

  • input – The Xtructurable state to look up.

Returns:

  • idx (CuckooIdx): Index information for the slot examined.

  • found (bool): True if the state was found, False otherwise.

  • fingerprint (uint32): Hash fingerprint of the probed state (internal use).

If not found, idx indicates the first empty slot encountered during the Cuckoo search path where an insertion could occur.

Return type:

A tuple (idx, found, fingerprint)

lookup_parallel(inputs: Xtructurable) tuple[HashIdx, Array | ndarray | bool | number][source]

Finds the state in the hash table using Cuckoo hashing.

Returns (HashIdx, found_mask) per input.

parallel_insert(inputs: Xtructurable, filled: Array | ndarray | bool | number = None, unique_key: Array | ndarray | bool | number = None)[source]

Parallel insertion of multiple states into the hash table.

Parameters:
  • table – Hash table instance

  • inputs – States to insert

  • filled – Boolean array indicating which inputs are valid

  • unique_key – Optional key array for determining priority among duplicate states. When provided, among duplicate states, only the one with the smallest key value will be marked as unique in unique_filled mask.

Returns:

Tuple of (updated_table, updatable, unique_filled, idx)

replace(**kwargs)
seed: int
size: int
table: Xtructurable
table_idx: Array | ndarray | bool | number
to_tuple()
xtructure.hashtable.hashtable.get_new_idx_byterized(input: Xtructurable, modulus: int, seed: int) tuple[Array | ndarray | bool | number, Array | ndarray | bool | number, Array | ndarray | bool | number, Array | ndarray | bool | number][source]

Calculate new index and return uint32ed representation of input state. Similar to get_new_idx but also returns the uint32ed representation for equality comparison.

xtructure.hashtable.hashtable.get_new_idx_from_uint32ed(input_uint32ed: Array | ndarray | bool | number, modulus: int, seed: int) tuple[Array | ndarray | bool | number, Array | ndarray | bool | number, Array | ndarray | bool | number][source]

Calculate new index for input state using the hash function from its uint32ed representation and reduce it modulo the provided table capacity.

Module contents

class xtructure.hashtable.HashIdx(index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None))[source]

Bases: object

property at
property bytes

Convert entire state tree to flattened byte array.

check_invariants()
classmethod default(shape: Tuple[int, ...] = ()) T
default_dtype = (<class 'jax.numpy.uint32'>,)
default_shape = ((),)
property dtype: dtype

Get dtypes of all fields in the dataclass

flatten()
from_tuple()
hash(seed=0)

Main hash function that converts state to uint32 lanes and hashes them.

hash_with_uint32ed(seed=0)

Main hash function that converts state to uint32 lanes and hashes them. Returns both hash value and its uint32 representation.

index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None)
is_xtructed = True
classmethod load(path: str) T

Loads an instance from a .npz file.

padding_as_batch(batch_shape: tuple[int, ...])
classmethod random(shape=(), key=None)
replace(**kwargs)
reshape(new_shape: tuple[int, ...]) T
save(path: str)

Saves the instance to a .npz file.

property shape: shape

Returns a namedtuple containing the batch shape (if present) and the shapes of all fields. If a field is itself a xtructure_dataclass, its shape is included as a nested namedtuple.

str(**kwargs)
property structured_type: StructuredType
to_tuple()
property uint32ed

Convert pytree to uint32 array.

class xtructure.hashtable.HashTable(seed: int, capacity: int, _capacity: int, cuckoo_table_n: int, size: int, table: Xtructurable, table_idx: Array | ndarray | bool | number, fingerprints: Array | ndarray | bool | number)[source]

Bases: object

Cuckoo Hash Table Implementation

This implementation uses multiple hash functions (specified by n_table) to resolve collisions. Each item can be stored in one of n_table possible positions.

seed

Initial seed for hash functions

Type:

int

capacity

User-specified capacity

Type:

int

_capacity

Actual internal capacity (larger than specified to handle collisions)

Type:

int

size

Current number of items in table

Type:

int

table

The actual storage for states

Type:

xtructure.core.protocol.Xtructurable

table_idx

Indices tracking which hash function was used for each entry

Type:

jax.Array | numpy.ndarray | numpy.bool | numpy.number

static build(dataclass: Xtructurable, seed: int, capacity: int, cuckoo_table_n: int = 2, hash_size_multiplier: int = 2) HashTable[source]

Initialize a new hash table with specified parameters.

Parameters:
  • dataclass – Example Xtructurable to determine the structure

  • seed – Initial seed for hash functions

  • capacity – Desired capacity of the table

Returns:

Initialized HashTable instance

capacity: int
cuckoo_table_n: int
fingerprints: Array | ndarray | bool | number
from_tuple()
insert(input: Xtructurable) tuple[HashTable, bool, HashIdx][source]

insert the state in the table

Returns (table, inserted?, flat_idx).

lookup(input: Xtructurable) tuple[HashIdx, bool][source]

Find a state in the hash table.

Returns a tuple of (HashIdx, found) where HashIdx.index is the flat index into table.table, and found indicates existence.

lookup_cuckoo(input: Xtructurable) tuple[CuckooIdx, bool, Array | ndarray | bool | number][source]

Finds the state in the hash table using Cuckoo hashing.

Parameters:
  • table – The HashTable instance.

  • input – The Xtructurable state to look up.

Returns:

  • idx (CuckooIdx): Index information for the slot examined.

  • found (bool): True if the state was found, False otherwise.

  • fingerprint (uint32): Hash fingerprint of the probed state (internal use).

If not found, idx indicates the first empty slot encountered during the Cuckoo search path where an insertion could occur.

Return type:

A tuple (idx, found, fingerprint)

lookup_parallel(inputs: Xtructurable) tuple[HashIdx, Array | ndarray | bool | number][source]

Finds the state in the hash table using Cuckoo hashing.

Returns (HashIdx, found_mask) per input.

parallel_insert(inputs: Xtructurable, filled: Array | ndarray | bool | number = None, unique_key: Array | ndarray | bool | number = None)[source]

Parallel insertion of multiple states into the hash table.

Parameters:
  • table – Hash table instance

  • inputs – States to insert

  • filled – Boolean array indicating which inputs are valid

  • unique_key – Optional key array for determining priority among duplicate states. When provided, among duplicate states, only the one with the smallest key value will be marked as unique in unique_filled mask.

Returns:

Tuple of (updated_table, updatable, unique_filled, idx)

replace(**kwargs)
seed: int
size: int
table: Xtructurable
table_idx: Array | ndarray | bool | number
to_tuple()