xtructure.hashtable package
Submodules
xtructure.hashtable.hashtable module
Hash table implementation using Cuckoo hashing technique for efficient state storage and lookup. This module provides functionality for hashing Xtructurables and managing collisions.
- class xtructure.hashtable.hashtable.CuckooIdx(index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None), table_index: FieldDescriptor(dtype=<class 'jax.numpy.uint8'>, fill_value=255, intrinsic_shape=(), fill_value_factory=None, validator=None))[source]
Bases:
object- property at
- property bytes
Convert entire state tree to flattened byte array.
- check_invariants()
- classmethod default(shape: Tuple[int, ...] = ()) T
- default_dtype = (<class 'jax.numpy.uint32'>, <class 'jax.numpy.uint8'>)
- default_shape = ((), ())
- property dtype: dtype
Get dtypes of all fields in the dataclass
- flatten()
- from_tuple()
- hash(seed=0)
Main hash function that converts state to uint32 lanes and hashes them.
- hash_with_uint32ed(seed=0)
Main hash function that converts state to uint32 lanes and hashes them. Returns both hash value and its uint32 representation.
- index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None)
- is_xtructed = True
- classmethod load(path: str) T
Loads an instance from a .npz file.
- padding_as_batch(batch_shape: tuple[int, ...])
- classmethod random(shape=(), key=None)
- replace(**kwargs)
- reshape(new_shape: tuple[int, ...]) T
- save(path: str)
Saves the instance to a .npz file.
- property shape: shape
Returns a namedtuple containing the batch shape (if present) and the shapes of all fields. If a field is itself a xtructure_dataclass, its shape is included as a nested namedtuple.
- str(**kwargs)
- property structured_type: StructuredType
- table_index: FieldDescriptor(dtype=<class 'jax.numpy.uint8'>, fill_value=255, intrinsic_shape=(), fill_value_factory=None, validator=None)
- to_tuple()
- property uint32ed
Convert pytree to uint32 array.
- class xtructure.hashtable.hashtable.HashIdx(index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None))[source]
Bases:
object- property at
- property bytes
Convert entire state tree to flattened byte array.
- check_invariants()
- classmethod default(shape: Tuple[int, ...] = ()) T
- default_dtype = (<class 'jax.numpy.uint32'>,)
- default_shape = ((),)
- property dtype: dtype
Get dtypes of all fields in the dataclass
- flatten()
- from_tuple()
- hash(seed=0)
Main hash function that converts state to uint32 lanes and hashes them.
- hash_with_uint32ed(seed=0)
Main hash function that converts state to uint32 lanes and hashes them. Returns both hash value and its uint32 representation.
- index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None)
- is_xtructed = True
- classmethod load(path: str) T
Loads an instance from a .npz file.
- padding_as_batch(batch_shape: tuple[int, ...])
- classmethod random(shape=(), key=None)
- replace(**kwargs)
- reshape(new_shape: tuple[int, ...]) T
- save(path: str)
Saves the instance to a .npz file.
- property shape: shape
Returns a namedtuple containing the batch shape (if present) and the shapes of all fields. If a field is itself a xtructure_dataclass, its shape is included as a nested namedtuple.
- str(**kwargs)
- property structured_type: StructuredType
- to_tuple()
- property uint32ed
Convert pytree to uint32 array.
- class xtructure.hashtable.hashtable.HashTable(seed: int, capacity: int, _capacity: int, cuckoo_table_n: int, size: int, table: Xtructurable, table_idx: Array | ndarray | bool | number, fingerprints: Array | ndarray | bool | number)[source]
Bases:
objectCuckoo Hash Table Implementation
This implementation uses multiple hash functions (specified by n_table) to resolve collisions. Each item can be stored in one of n_table possible positions.
- seed
Initial seed for hash functions
- Type:
int
- capacity
User-specified capacity
- Type:
int
- _capacity
Actual internal capacity (larger than specified to handle collisions)
- Type:
int
- size
Current number of items in table
- Type:
int
- table
The actual storage for states
- table_idx
Indices tracking which hash function was used for each entry
- Type:
jax.Array | numpy.ndarray | numpy.bool | numpy.number
- static build(dataclass: Xtructurable, seed: int, capacity: int, cuckoo_table_n: int = 2, hash_size_multiplier: int = 2) HashTable[source]
Initialize a new hash table with specified parameters.
- Parameters:
dataclass – Example Xtructurable to determine the structure
seed – Initial seed for hash functions
capacity – Desired capacity of the table
- Returns:
Initialized HashTable instance
- capacity: int
- cuckoo_table_n: int
- fingerprints: Array | ndarray | bool | number
- from_tuple()
- insert(input: Xtructurable) tuple[HashTable, bool, HashIdx][source]
insert the state in the table
Returns (table, inserted?, flat_idx).
- lookup(input: Xtructurable) tuple[HashIdx, bool][source]
Find a state in the hash table.
Returns a tuple of (HashIdx, found) where HashIdx.index is the flat index into table.table, and found indicates existence.
- lookup_cuckoo(input: Xtructurable) tuple[CuckooIdx, bool, Array | ndarray | bool | number][source]
Finds the state in the hash table using Cuckoo hashing.
- Parameters:
table – The HashTable instance.
input – The Xtructurable state to look up.
- Returns:
idx (CuckooIdx): Index information for the slot examined.
found (bool): True if the state was found, False otherwise.
fingerprint (uint32): Hash fingerprint of the probed state (internal use).
If not found, idx indicates the first empty slot encountered during the Cuckoo search path where an insertion could occur.
- Return type:
A tuple (idx, found, fingerprint)
- lookup_parallel(inputs: Xtructurable) tuple[HashIdx, Array | ndarray | bool | number][source]
Finds the state in the hash table using Cuckoo hashing.
Returns (HashIdx, found_mask) per input.
- parallel_insert(inputs: Xtructurable, filled: Array | ndarray | bool | number = None, unique_key: Array | ndarray | bool | number = None)[source]
Parallel insertion of multiple states into the hash table.
- Parameters:
table – Hash table instance
inputs – States to insert
filled – Boolean array indicating which inputs are valid
unique_key – Optional key array for determining priority among duplicate states. When provided, among duplicate states, only the one with the smallest key value will be marked as unique in unique_filled mask.
- Returns:
Tuple of (updated_table, updatable, unique_filled, idx)
- replace(**kwargs)
- seed: int
- size: int
- table: Xtructurable
- table_idx: Array | ndarray | bool | number
- to_tuple()
- xtructure.hashtable.hashtable.get_new_idx_byterized(input: Xtructurable, modulus: int, seed: int) tuple[Array | ndarray | bool | number, Array | ndarray | bool | number, Array | ndarray | bool | number, Array | ndarray | bool | number][source]
Calculate new index and return uint32ed representation of input state. Similar to get_new_idx but also returns the uint32ed representation for equality comparison.
- xtructure.hashtable.hashtable.get_new_idx_from_uint32ed(input_uint32ed: Array | ndarray | bool | number, modulus: int, seed: int) tuple[Array | ndarray | bool | number, Array | ndarray | bool | number, Array | ndarray | bool | number][source]
Calculate new index for input state using the hash function from its uint32ed representation and reduce it modulo the provided table capacity.
Module contents
- class xtructure.hashtable.HashIdx(index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None))[source]
Bases:
object- property at
- property bytes
Convert entire state tree to flattened byte array.
- check_invariants()
- classmethod default(shape: Tuple[int, ...] = ()) T
- default_dtype = (<class 'jax.numpy.uint32'>,)
- default_shape = ((),)
- property dtype: dtype
Get dtypes of all fields in the dataclass
- flatten()
- from_tuple()
- hash(seed=0)
Main hash function that converts state to uint32 lanes and hashes them.
- hash_with_uint32ed(seed=0)
Main hash function that converts state to uint32 lanes and hashes them. Returns both hash value and its uint32 representation.
- index: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), fill_value_factory=None, validator=None)
- is_xtructed = True
- classmethod load(path: str) T
Loads an instance from a .npz file.
- padding_as_batch(batch_shape: tuple[int, ...])
- classmethod random(shape=(), key=None)
- replace(**kwargs)
- reshape(new_shape: tuple[int, ...]) T
- save(path: str)
Saves the instance to a .npz file.
- property shape: shape
Returns a namedtuple containing the batch shape (if present) and the shapes of all fields. If a field is itself a xtructure_dataclass, its shape is included as a nested namedtuple.
- str(**kwargs)
- property structured_type: StructuredType
- to_tuple()
- property uint32ed
Convert pytree to uint32 array.
- class xtructure.hashtable.HashTable(seed: int, capacity: int, _capacity: int, cuckoo_table_n: int, size: int, table: Xtructurable, table_idx: Array | ndarray | bool | number, fingerprints: Array | ndarray | bool | number)[source]
Bases:
objectCuckoo Hash Table Implementation
This implementation uses multiple hash functions (specified by n_table) to resolve collisions. Each item can be stored in one of n_table possible positions.
- seed
Initial seed for hash functions
- Type:
int
- capacity
User-specified capacity
- Type:
int
- _capacity
Actual internal capacity (larger than specified to handle collisions)
- Type:
int
- size
Current number of items in table
- Type:
int
- table
The actual storage for states
- table_idx
Indices tracking which hash function was used for each entry
- Type:
jax.Array | numpy.ndarray | numpy.bool | numpy.number
- static build(dataclass: Xtructurable, seed: int, capacity: int, cuckoo_table_n: int = 2, hash_size_multiplier: int = 2) HashTable[source]
Initialize a new hash table with specified parameters.
- Parameters:
dataclass – Example Xtructurable to determine the structure
seed – Initial seed for hash functions
capacity – Desired capacity of the table
- Returns:
Initialized HashTable instance
- capacity: int
- cuckoo_table_n: int
- fingerprints: Array | ndarray | bool | number
- from_tuple()
- insert(input: Xtructurable) tuple[HashTable, bool, HashIdx][source]
insert the state in the table
Returns (table, inserted?, flat_idx).
- lookup(input: Xtructurable) tuple[HashIdx, bool][source]
Find a state in the hash table.
Returns a tuple of (HashIdx, found) where HashIdx.index is the flat index into table.table, and found indicates existence.
- lookup_cuckoo(input: Xtructurable) tuple[CuckooIdx, bool, Array | ndarray | bool | number][source]
Finds the state in the hash table using Cuckoo hashing.
- Parameters:
table – The HashTable instance.
input – The Xtructurable state to look up.
- Returns:
idx (CuckooIdx): Index information for the slot examined.
found (bool): True if the state was found, False otherwise.
fingerprint (uint32): Hash fingerprint of the probed state (internal use).
If not found, idx indicates the first empty slot encountered during the Cuckoo search path where an insertion could occur.
- Return type:
A tuple (idx, found, fingerprint)
- lookup_parallel(inputs: Xtructurable) tuple[HashIdx, Array | ndarray | bool | number][source]
Finds the state in the hash table using Cuckoo hashing.
Returns (HashIdx, found_mask) per input.
- parallel_insert(inputs: Xtructurable, filled: Array | ndarray | bool | number = None, unique_key: Array | ndarray | bool | number = None)[source]
Parallel insertion of multiple states into the hash table.
- Parameters:
table – Hash table instance
inputs – States to insert
filled – Boolean array indicating which inputs are valid
unique_key – Optional key array for determining priority among duplicate states. When provided, among duplicate states, only the one with the smallest key value will be marked as unique in unique_filled mask.
- Returns:
Tuple of (updated_table, updatable, unique_filled, idx)
- replace(**kwargs)
- seed: int
- size: int
- table: Xtructurable
- table_idx: Array | ndarray | bool | number
- to_tuple()