xtructure.core.xtructure_numpy.dataclass_ops.unique_ops package

Submodules

xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.benchmark_unqiue_ops module

Microbenchmark for unique_mask implementations.

class xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.benchmark_unqiue_ops.DummyData(id: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), bits=None, packed_bits=None, unpacked_dtype=None, unpacked_intrinsic_shape=None, fill_value_factory=None, validator=None), category: FieldDescriptor(dtype=<class 'jax.numpy.uint8'>, fill_value=255, intrinsic_shape=(), bits=None, packed_bits=None, unpacked_dtype=None, unpacked_intrinsic_shape=None, fill_value_factory=None, validator=None), sub_id: FieldDescriptor(dtype=<class 'jax.numpy.uint16'>, fill_value=65535, intrinsic_shape=(), bits=None, packed_bits=None, unpacked_dtype=None, unpacked_intrinsic_shape=None, fill_value_factory=None, validator=None))

Bases: object

allclose(b: Any, rtol: float = 1e-05, atol: float = 1e-08, equal_nan: bool = False) bool | Array

Returns True if two arrays are element-wise equal within a tolerance.

astype(dtype: Any, copy: bool = False, device: Any = None) T

Copy of the array, cast to a specified type.

property at
property batch_shape
block() Any

Assemble an nd-array from nested lists of blocks.

broadcast_to(shape: Sequence[int]) T

Broadcast an array to a new shape.

property bytes

Convert entire state tree to flattened byte array.

category: FieldDescriptor(dtype=<class 'jax.numpy.uint8'>, fill_value=255, intrinsic_shape=(), bits=None, packed_bits=None, unpacked_dtype=None, unpacked_intrinsic_shape=None, fill_value_factory=None, validator=None)
check_invariants()
column_stack() Any

Stack 1-D arrays as columns into a 2-D array.

classmethod default(shape: Tuple[int, ...] = ()) T
default_dtype = (<class 'jax.numpy.uint32'>, <class 'jax.numpy.uint8'>, <class 'jax.numpy.uint16'>)
default_shape = ((), (), ())
dstack(dtype: Any = None) Any

Stack arrays in sequence depth wise (along third axis).

property dtype: dtype

Get the dtypes of all fields in the dataclass.

equal(y: Any) T

Return (x == y) element-wise.

expand_dims(axis: int) T

Insert a new axis into every field.

flatten() T

Flatten the batch dimensions of a dataclass instance.

flip(axis: int | Sequence[int] | None = None) T

Reverse the order of elements in an array along the given axis.

from_tuple()
hash(seed=0)

Main hash function that converts state to uint32 lanes and hashes them.

hash_pair(seed=0)

Hash function that returns two 32-bit hashes.

hash_pair_with_uint32ed(seed=0)

Hash function that returns two 32-bit hashes and the uint32 lanes.

hash_with_uint32ed(seed=0)

Main hash function that converts state to uint32 lanes and hashes them. Returns both hash value and its uint32 representation.

hstack(dtype: Any = None) Any

Stack arrays in sequence horizontally (column wise).

id: FieldDescriptor(dtype=<class 'jax.numpy.uint32'>, fill_value=4294967295, intrinsic_shape=(), bits=None, packed_bits=None, unpacked_dtype=None, unpacked_intrinsic_shape=None, fill_value_factory=None, validator=None)
is_xtructed = True
isclose(b: Any, rtol: float = 1e-05, atol: float = 1e-08, equal_nan: bool = False) T

Returns a boolean array where two arrays are element-wise equal within a tolerance.

classmethod load(path: str) T

Loads an instance from a .npz file.

moveaxis(source: int | Sequence[int], destination: int | Sequence[int]) T

Move axes of an array to new positions.

property ndim: int

Return number of batch dimensions for structured instances.

not_equal(y: Any) T

Return (x != y) element-wise.

pad(pad_width: int | tuple[int, ...] | tuple[tuple[int, int], ...], mode: str = 'constant', **kwargs) T

Pad xtructure dataclasses using a jnp.pad compatible interface.

classmethod random(shape=(), key=None)
replace(**kwargs)
reshape(new_shape: tuple[int, ...] | int, *args: int) T

Reshape the batch dimensions of a dataclass instance.

Accepts the new shape either as a tuple, reshape(instance, (2, 3)), or as separate integer arguments, reshape(instance, 2, 3). A dimension given as -1 is inferred from the remaining dimensions.
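The two call styles and the -1 inference can be illustrated with a small shape-resolution helper. This is only a sketch: normalize_shape and its batch_size argument are hypothetical names introduced here, not part of xtructure, and the real method may validate its arguments differently.

```python
from math import prod


def normalize_shape(batch_size, new_shape, *args):
    """Resolve tuple-style or varargs-style shapes, including one -1, to a tuple.

    Hypothetical helper for illustration; not xtructure's implementation.
    """
    # Accept either a single tuple or a run of separate integers.
    if isinstance(new_shape, int):
        dims = (new_shape, *args)
    else:
        if args:
            raise TypeError("pass either one tuple or separate integers, not both")
        dims = tuple(new_shape)

    if dims.count(-1) > 1:
        raise ValueError("only one dimension can be inferred")

    if -1 in dims:
        # Product of the explicitly given dimensions.
        known = prod(d for d in dims if d != -1)
        if known == 0 or batch_size % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        dims = tuple(batch_size // known if d == -1 else d for d in dims)

    if prod(dims) != batch_size:
        raise ValueError("new shape does not match the number of elements")
    return dims
```

With a batch of six elements, normalize_shape(6, (2, 3)), normalize_shape(6, 2, 3) and normalize_shape(6, -1, 3) all resolve to the same target shape.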

roll(shift: int | Sequence[int], axis: int | Sequence[int] | None = None) T

Roll array elements along a given axis.

rot90(k: int = 1, axes: tuple[int, int] = (0, 1)) T

Rotate an array by 90 degrees in the plane specified by axes.

save(path: str, *, packed: bool = True)

Saves the instance to a .npz file.

property shape: shape

Returns a namedtuple containing the batch shape (if present) and the shapes of all fields. If a field is itself a xtructure_dataclass, its shape is included as a nested namedtuple.

squeeze(axis: int | tuple[int, ...] | None = None) T

Remove axes of length one from every field.

str(**kwargs)
property structured_type: StructuredType
sub_id: FieldDescriptor(dtype=<class 'jax.numpy.uint16'>, fill_value=65535, intrinsic_shape=(), bits=None, packed_bits=None, unpacked_dtype=None, unpacked_intrinsic_shape=None, fill_value_factory=None, validator=None)
swapaxes(axis1: int, axis2: int) T

Swap two batch axes.

to_tuple()
transpose(axes: tuple[int, ...] | None = None) T

Transpose batch dimensions of every field.

property uint32ed

Convert pytree to uint32 array.

vstack(dtype: Any = None) Any

Stack arrays in sequence vertically (row wise).

xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.benchmark_unqiue_ops.main() None
xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.benchmark_unqiue_ops.run_bench(sizes: List[int], duplication_rates: List[float], with_cost: bool, with_filled: bool, skewed: bool, trials: int, warmup: int, seed: int) None

xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.legacy_unique_ops module

Legacy unique_mask implementation for comparison.

xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.legacy_unique_ops.unique_mask_legacy(val: Xtructurable, key: Array | None = None, filled: Array | None = None, key_fn: Callable[[Any], Array] | None = None, batch_len: int | None = None, return_index: bool = False, return_inverse: bool = False) Array | tuple

Legacy implementation using jnp.unique + scatter reduction.
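The general shape of a "unique + scatter reduction" deduplication can be sketched as follows. The sketch uses NumPy stand-ins (np.unique and np.minimum.at) rather than JAX, assumes a non-empty 2-D integer key array, and the function name unique_mask_via_unique is hypothetical. It illustrates the technique only and is not the library's code.

```python
import numpy as np


def unique_mask_via_unique(keys, cost):
    """Boolean mask keeping, per distinct key row, the lowest-cost row.

    keys: (n, k) integer array, n >= 1. cost: (n,) float array.
    Illustrative sketch only; ties on cost keep the smallest row index.
    """
    n = keys.shape[0]
    cost = np.asarray(cost, dtype=float)

    # Group id per row: rows with identical keys share an id.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # guard against a 2-D inverse in some NumPy versions
    num_groups = int(inverse.max()) + 1

    # Scatter-min of the cost into one slot per group.
    best_cost = np.full(num_groups, np.inf)
    np.minimum.at(best_cost, inverse, cost)

    # Second scatter-min picks the smallest index among rows that hit the best cost.
    idx = np.arange(n)
    candidate = np.where(cost == best_cost[inverse], idx, n)
    winner = np.full(num_groups, n)
    np.minimum.at(winner, inverse, candidate)

    return winner[inverse] == idx
```

The two scatter passes (one over costs, one over indices) are what the sort-based variant avoids by ordering duplicates once.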

xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.optimized_unique_ops module

Optimized unique_mask implementation using wide hashing and lexsort.

xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.optimized_unique_ops.unique_mask(val: Xtructurable, key: Array | None = None, filled: Array | None = None, key_fn: Callable[[Any], Array] | None = None, batch_len: int | None = None, return_index: bool = False, return_inverse: bool = False, size: int | None = None, fill_value: int | None = None) Array | tuple

Mask or index information for selecting unique states.

Optimized implementation using wide hashing and lexsort. Any multi-column key is reduced to a fixed-width 128-bit representation, which minimizes sorting passes and comparison overhead while keeping the collision probability near zero.

Parameters:
  • val – Xtructurable dataclass to deduplicate.

  • key – Optional cost array (e.g. priority). If provided, the item with the lowest key among duplicates is selected.

  • filled – Optional boolean mask indicating valid items. Invalid items are treated as non-existent (never selected).

  • key_fn – Function to generate hash/comparison keys from val.

  • batch_len – Explicit batch length (optional).

  • return_index – Whether to return indices of unique items.

  • return_inverse – Whether to return inverse indices.

  • size – Optional static size for returned unique indices (required for JIT).

  • fill_value – Value to fill padding with when size is specified.

Returns:

Mask (bool array) or tuple (mask, index, inverse).
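The sort-based selection described above, including the roles of key (cost) and filled, can be sketched with NumPy's lexsort. This is a simplified illustration under stated assumptions: it sorts the raw key columns directly instead of a 128-bit hash, it runs on NumPy rather than JAX, and the function name unique_mask_via_lexsort is hypothetical.

```python
import numpy as np


def unique_mask_via_lexsort(keys, cost=None, filled=None):
    """Boolean mask keeping one valid, lowest-cost row per distinct key row.

    keys: (n, k) integer array. cost: optional (n,) array. filled: optional (n,) bool.
    Illustrative sketch; the wide-hashing step is replaced by sorting raw columns.
    """
    n, k = keys.shape
    cost = np.zeros(n) if cost is None else np.asarray(cost, dtype=float)
    filled = np.ones(n, dtype=bool) if filled is None else np.asarray(filled, dtype=bool)
    idx = np.arange(n)

    # np.lexsort uses the LAST key as the primary one, so list keys from least
    # to most significant: index, cost, invalid flag, then the key columns.
    sort_keys = (idx, cost, ~filled) + tuple(keys[:, c] for c in range(k - 1, -1, -1))
    order = np.lexsort(sort_keys)

    # After sorting, a row opens a new group when it differs from its predecessor.
    sorted_keys = keys[order]
    is_first = np.ones(n, dtype=bool)
    is_first[1:] = np.any(sorted_keys[1:] != sorted_keys[:-1], axis=1)

    # Scatter the flags back to the original row order.
    mask = np.zeros(n, dtype=bool)
    mask[order] = is_first

    # Groups made up only of invalid rows select nothing.
    return mask & filled
```

Placing the invalid flag ahead of the cost in the sort order means an invalid row can never shadow a valid duplicate, even when the invalid row has a lower cost.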

Module contents

Deduplication utilities for dataclass batches.

xtructure.core.xtructure_numpy.dataclass_ops.unique_ops.unique_mask(val: Xtructurable, key: Array | None = None, filled: Array | None = None, key_fn: Callable[[Any], Array] | None = None, batch_len: int | None = None, return_index: bool = False, return_inverse: bool = False, size: int | None = None, fill_value: int | None = None) Array | tuple

Mask or index information for selecting unique states.

Optimized implementation using wide hashing and lexsort. Any multi-column key is reduced to a fixed-width 128-bit representation, which minimizes sorting passes and comparison overhead while keeping the collision probability near zero.

Parameters:
  • val – Xtructurable dataclass to deduplicate.

  • key – Optional cost array (e.g. priority). If provided, the item with the lowest key among duplicates is selected.

  • filled – Optional boolean mask indicating valid items. Invalid items are treated as non-existent (never selected).

  • key_fn – Function to generate hash/comparison keys from val.

  • batch_len – Explicit batch length (optional).

  • return_index – Whether to return indices of unique items.

  • return_inverse – Whether to return inverse indices.

  • size – Optional static size for returned unique indices (required for JIT).

  • fill_value – Value to fill padding with when size is specified.

Returns:

Mask (bool array) or tuple (mask, index, inverse).