API Reference¶
Complete programmatic interface for the Quantum Encoding Atlas. All public classes and functions are documented below with their signatures, parameters, and return types.
Encodings¶
All encodings inherit from BaseEncoding and share a unified interface.
Base Class¶
BaseEncoding
¶
Bases: ABC
Abstract base class for all quantum data encodings.
All encoding implementations must inherit from this class and implement the required abstract methods.
Parameters¶
n_features : int
    Number of classical features to encode.
**kwargs : Any
    Additional encoding-specific parameters.
Examples¶
>>> from encoding_atlas import AngleEncoding
>>> encoding = AngleEncoding(n_features=4, rotation='Y')
>>> encoding.n_qubits
4
Notes¶
This class uses __slots__ for memory efficiency. Subclasses should also define __slots__ listing their own instance attributes to maintain this optimization. If a subclass does not define __slots__, it will have a __dict__ and the memory benefit is partially lost.
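A minimal sketch of the slotting pattern, using a hypothetical stand-in for BaseEncoding (the real class defines more attributes and abstract methods):

```python
from abc import ABC

class BaseEncoding(ABC):
    """Illustrative stand-in for the documented base class."""
    __slots__ = ("n_features",)

    def __init__(self, n_features: int):
        self.n_features = n_features

class SlottedEncoding(BaseEncoding):
    # Declaring __slots__ here too keeps instances dict-free.
    __slots__ = ("rotation",)

    def __init__(self, n_features: int, rotation: str = "Y"):
        super().__init__(n_features)
        self.rotation = rotation

enc = SlottedEncoding(4)
# Fully slotted classes allocate no per-instance __dict__.
assert not hasattr(enc, "__dict__")
```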
Source code in src/encoding_atlas/core/base.py
properties
property
¶
properties: EncodingProperties
Compute and return encoding properties.
This property uses thread-safe lazy initialization with double-checked locking to ensure safe access in multi-threaded environments while minimizing lock contention for subsequent accesses.
Returns¶
EncodingProperties
    Computed properties of this encoding.
Notes¶
Thread Safety: The first access in a multi-threaded context will acquire a lock to ensure only one thread computes the properties. Subsequent accesses bypass the lock entirely for optimal performance.
The double-checked locking pattern
- First check without lock (fast path for initialized case)
- Acquire lock
- Second check inside lock (handles race condition)
- Compute if still None
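The four steps above can be sketched as follows; the class name and cached value are illustrative, not the library's actual code:

```python
import threading

class LazyProperties:
    """Sketch of thread-safe lazy initialization via double-checked locking."""

    def __init__(self):
        self._props = None
        self._lock = threading.Lock()
        self.compute_calls = 0  # illustrative counter, not in the real class

    @property
    def properties(self):
        # 1. First check without the lock (fast path once initialized).
        if self._props is None:
            # 2. Acquire the lock.
            with self._lock:
                # 3. Second check inside the lock (handles the race).
                if self._props is None:
                    # 4. Compute only if still None.
                    self.compute_calls += 1
                    self._props = {"n_qubits": 4}
        return self._props

lp = LazyProperties()
threads = [threading.Thread(target=lambda: lp.properties) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert lp.compute_calls == 1  # computed exactly once despite 8 threads
```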
get_circuit
abstractmethod
¶
Generate quantum circuit for the given data.
Parameters¶
x : ArrayLike
    Input data of shape (n_features,) for a single sample.
backend : {'pennylane', 'qiskit', 'cirq'}, default='pennylane'
    Target quantum computing framework.
Returns¶
CircuitType
    Circuit in the specified backend's format.
Source code in src/encoding_atlas/core/base.py
Properties¶
EncodingProperties
dataclass
¶
EncodingProperties(n_qubits: int, depth: int, gate_count: int, single_qubit_gates: int, two_qubit_gates: int, parameter_count: int, is_entangling: bool, simulability: Literal['simulable', 'conditionally_simulable', 'not_simulable'], expressibility: Optional[float] = None, entanglement_capability: Optional[float] = None, trainability_estimate: Optional[float] = None, noise_resilience_estimate: Optional[float] = None, notes: str = '')
Properties computed for a quantum encoding.
Attributes¶
n_qubits : int
    Number of qubits required.
depth : int
    Circuit depth (number of time steps).
gate_count : int
    Total number of gates.
single_qubit_gates : int
    Number of single-qubit gates.
two_qubit_gates : int
    Number of two-qubit gates.
parameter_count : int
    Number of data-dependent parameters.
is_entangling : bool
    Whether the encoding creates entanglement.
simulability : str
    Classical simulability status.
__post_init__
¶
Validate properties after initialization.
Source code in src/encoding_atlas/core/properties.py
to_dict
¶
Convert properties to dictionary.
Source code in src/encoding_atlas/core/properties.py
Encoding Classes¶
Angle Encoding¶
AngleEncoding
¶
Bases: BaseEncoding
Angle encoding using single-qubit rotation gates.
AngleEncoding maps classical features directly to rotation angles of single-qubit gates. Each feature is encoded on a dedicated qubit using a rotation gate around a specified axis (X, Y, or Z).
This encoding creates product states (no entanglement), making it:

- Classically simulable (efficient classical simulation possible)
- Free from barren plateaus (good trainability)
- Hardware-efficient (only single-qubit gates)
The circuit structure for each repetition is:
|0⟩ ─ Rₐ(x₀) ─
|0⟩ ─ Rₐ(x₁) ─
|0⟩ ─ Rₐ(x₂) ─
...
where Rₐ is the rotation gate for axis a ∈ {X, Y, Z}.
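Since each feature rotates its own qubit, the encoded state is a tensor product of single-qubit states. A NumPy sketch (not the library's implementation) of RY-based encoding from |0⟩:

```python
import numpy as np

def ry_state(x: float) -> np.ndarray:
    # RY(x)|0> = cos(x/2)|0> + sin(x/2)|1>, a real-valued amplitude pair.
    return np.array([np.cos(x / 2), np.sin(x / 2)])

features = [0.1, 0.2, 0.3]
state = np.array([1.0])
for x in features:
    # Product state: tensor (Kronecker) product of per-qubit states.
    state = np.kron(state, ry_state(x))

assert state.shape == (2 ** len(features),)
assert np.isclose(np.linalg.norm(state), 1.0)  # valid quantum state
```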
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer.
    Each feature requires one qubit, so this also determines the number
    of qubits in the circuit.
rotation : {"X", "Y", "Z"}, default="Y"
    The axis of rotation to use for encoding:
- "X": Uses RX gates, encoding features in the YZ plane
- "Y": Uses RY gates (default), encoding features in the XZ plane
- "Z": Uses RZ gates, encoding features as phases
The Y rotation is commonly used as it creates real-valued amplitudes
when starting from |0⟩, and can represent any point on the Bloch
sphere's XZ great circle.
reps : int, default=1
Number of times to repeat the encoding layer. Higher values increase
the effective rotation angle by a factor of reps. Must be at least 1.
With reps > 1, the effective rotation for feature xᵢ becomes:
Rₐ(xᵢ)^reps = Rₐ(reps · xᵢ)
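This identity holds because rotations about a fixed axis compose additively, Rₐ(θ₁)Rₐ(θ₂) = Rₐ(θ₁ + θ₂); a quick NumPy check:

```python
import numpy as np

def ry(theta: float) -> np.ndarray:
    # Matrix of the single-qubit RY rotation.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

x, reps = 0.7, 3
# Applying RY(x) three times equals a single RY(3x).
assert np.allclose(np.linalg.matrix_power(ry(x), reps), ry(reps * x))
```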
Attributes¶
rotation : str
    The rotation axis used ("X", "Y", or "Z").
reps : int
    Number of encoding layer repetitions.
n_features : int
    Number of classical features (inherited from BaseEncoding).
n_qubits : int
    Number of qubits, equal to n_features.
Examples¶
Create a basic angle encoding with default Y rotation:
>>> from encoding_atlas import AngleEncoding
>>> import numpy as np
>>> enc = AngleEncoding(n_features=4)
>>> enc.n_qubits
4
>>> enc.rotation
'Y'
Generate a PennyLane circuit:
>>> x = np.array([0.1, 0.2, 0.3, 0.4])
>>> circuit = enc.get_circuit(x, backend='pennylane')
>>> callable(circuit)
True
Use X rotation for different encoding behavior:
>>> enc_x = AngleEncoding(n_features=4, rotation='X')
>>> qiskit_circuit = enc_x.get_circuit(x, backend='qiskit')
>>> qiskit_circuit.num_qubits
4
Use multiple repetitions to increase rotation range:
>>> enc_reps = AngleEncoding(n_features=2, reps=3)
>>> enc_reps.depth
3
>>> enc_reps.properties.gate_count
6
Access encoding properties:
>>> enc = AngleEncoding(n_features=4)
>>> props = enc.properties
>>> props.is_entangling
False
>>> props.simulability
'simulable'
References¶
.. [1] Schuld, M., & Petruccione, F. (2018). "Supervised Learning with Quantum Computers." Springer.
.. [2] LaRose, R., & Coyle, B. (2020). "Robust data encodings for quantum classifiers." Physical Review A, 102(3), 032420.
See Also¶
AmplitudeEncoding : Encodes features in state amplitudes (logarithmic qubits).
IQPEncoding : Adds entanglement via ZZ interactions.
DataReuploading : Re-uploads data multiple times with trainable layers.
PauliFeatureMap : Configurable Pauli rotations with entanglement.
Notes¶
Expressivity: Angle encoding creates product states, which limits its expressivity compared to entangling encodings. However, this simplicity is advantageous for trainability and classical simulation.
Rotation Axis Choice:
- RY is most common as it creates real amplitudes from |0⟩
- RX creates complex amplitudes with imaginary components
- RZ only adds phases to |0⟩ (no population change from |0⟩)
Scaling Considerations: The rotation gates have period 2π (for RX, RY) or 4π (for full Bloch sphere coverage). Input features should be scaled appropriately to utilize the full encoding range.
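A common preprocessing step (assumed here, not part of the documented API) is a min-max rescale into [0, π], so RY angles sweep from |0⟩ to |1⟩ without wrapping:

```python
import numpy as np

def scale_to_range(x, lo=0.0, hi=np.pi):
    # Min-max rescale; assumes x contains at least two distinct values.
    x = np.asarray(x, dtype=float)
    return lo + (x - x.min()) * (hi - lo) / (x.max() - x.min())

scaled = scale_to_range(np.array([-3.0, 0.0, 5.0]))
assert np.isclose(scaled.min(), 0.0)
assert np.isclose(scaled.max(), np.pi)
```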
Comparison with Other Encodings:
+-------------------+----------+--------+------------+---------------+
| Encoding          | Qubits   | Depth  | Entangling | Simulability  |
+===================+==========+========+============+===============+
| AngleEncoding     | n        | reps   | No         | Simulable     |
+-------------------+----------+--------+------------+---------------+
| BasisEncoding     | n        | 1      | No         | Simulable     |
+-------------------+----------+--------+------------+---------------+
| AmplitudeEncoding | log₂(n)  | O(2^n) | Yes        | Not simulable |
+-------------------+----------+--------+------------+---------------+
| IQPEncoding       | n        | O(n²)  | Yes        | Not simulable |
+-------------------+----------+--------+------------+---------------+
| ZZFeatureMap      | n        | O(n²)  | Yes        | Not simulable |
+-------------------+----------+--------+------------+---------------+
Initialize the angle encoding.
Parameters¶
n_features : int
    Number of classical features to encode.
rotation : {"X", "Y", "Z"}, default="Y"
    Rotation axis for the encoding gates.
reps : int, default=1
    Number of encoding layer repetitions.
Raises¶
ValueError
    If rotation is not one of "X", "Y", "Z".
ValueError
    If reps is less than 1.
ValueError
    If n_features is less than 1 (raised by parent class).
Amplitude Encoding¶
AmplitudeEncoding
¶
Bases: BaseEncoding
Amplitude encoding maps classical features to quantum state amplitudes.
AmplitudeEncoding provides exponential data compression by encoding n classical features into the amplitudes of a log₂(n)-qubit quantum state. This is one of the most powerful quantum data encodings in terms of compression, but comes with significant circuit depth requirements.
The encoding prepares a quantum state where each computational basis state |i⟩ has amplitude proportional to the i-th feature value:
|0⟩ → (1/‖x‖) Σᵢ xᵢ|i⟩
The circuit structure requires state preparation gates:
|0⟩⊗ⁿ ─── [State Preparation Unitary U] ─── |ψ(x)⟩
where U|0⟩⊗ⁿ = |ψ(x)⟩ and the state preparation uses O(2^n) gates for n qubits in the general case.
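The amplitude map itself is just L2 normalization; a NumPy sketch:

```python
import numpy as np

# |0> -> (1/||x||) * sum_i x_i |i> : features become statevector amplitudes.
x = np.array([1.0, 2.0, 3.0, 4.0])
amplitudes = x / np.linalg.norm(x)

# A valid quantum state has unit norm; |a_i|^2 are measurement probabilities.
assert np.isclose(np.sum(amplitudes ** 2), 1.0)
```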
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer.
    The number of qubits is ⌈log₂(n_features)⌉, with zero-padding if
    n_features is not a power of 2.
normalize : bool, default=True
    Whether to normalize input data to create a valid quantum state.
    If True (default), input vectors are divided by their L2 norm.
    If False, the user must ensure inputs are already normalized.
.. warning::
Setting normalize=False with unnormalized data may cause
undefined behavior or errors in quantum backends.
Attributes¶
normalize : bool
    Whether automatic normalization is enabled.
n_features : int
    Number of classical features (inherited from BaseEncoding).
n_qubits : int
    Number of qubits: max(1, ⌈log₂(n_features)⌉).
depth : int
    Circuit depth: O(2^n_qubits) for general state preparation.
Examples¶
Create a basic amplitude encoding for 4 features (requires 2 qubits):
>>> from encoding_atlas import AmplitudeEncoding
>>> import numpy as np
>>> enc = AmplitudeEncoding(n_features=4)
>>> enc.n_qubits
2
>>> enc.n_features
4
Encode a feature vector (automatically normalized):
>>> x = np.array([1.0, 2.0, 3.0, 4.0])
>>> circuit = enc.get_circuit(x, backend='pennylane')
>>> callable(circuit)
True
Single feature requires 1 qubit (padded to 2 amplitudes):
>>> enc_single = AmplitudeEncoding(n_features=1)
>>> enc_single.n_qubits
1
Generate circuits for different backends:
>>> qiskit_circuit = enc.get_circuit(x, backend='qiskit')
>>> qiskit_circuit.num_qubits
2
>>> cirq_circuit = enc.get_circuit(x, backend='cirq')
>>> len(cirq_circuit.all_qubits())
2
Non-power-of-two features are zero-padded:
>>> enc_5 = AmplitudeEncoding(n_features=5)
>>> enc_5.n_qubits  # ceil(log2(5)) = 3, so 8 amplitudes
3
Batch processing multiple samples:
>>> X = np.random.randn(10, 4)
>>> circuits = enc.get_circuits(X, backend='pennylane')
>>> len(circuits)
10
Access encoding properties:
>>> props = enc.properties
>>> props.is_entangling
True
>>> props.simulability
'not_simulable'
References¶
.. [1] Schuld, M., & Petruccione, F. (2018). "Supervised Learning with Quantum Computers." Springer.
.. [2] Mottonen, M., et al. (2004). "Transformation of quantum states using uniformly controlled rotations."
See Also¶
AngleEncoding : Linear scaling (n features = n qubits), no entanglement.
BasisEncoding : Encodes discrete data in computational basis states.
IQPEncoding : Adds entanglement with quadratic feature interactions.
DataReuploading : Re-encodes data multiple times with trainable layers.
Notes¶
Compression vs. Circuit Depth Tradeoff: While amplitude encoding provides exponential compression (n features in log₂(n) qubits), it requires O(2^n) gates for state preparation. This means the encoding circuit depth scales linearly with the number of features, not logarithmically.
Qubit Calculation Edge Case: For n_features=1, the formula log₂(1) = 0 would yield 0 qubits. The implementation enforces a minimum of 1 qubit, padding the single feature to a 2-element state vector [x, 0] → |0⟩ after normalization.
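The qubit-count rule, including this edge case, reduces to one line (helper name is illustrative):

```python
import math

def n_qubits_for(n_features: int) -> int:
    # ceil(log2(n)) with a floor of 1 qubit for the n_features=1 edge case.
    return max(1, math.ceil(math.log2(n_features)))

assert [n_qubits_for(n) for n in (1, 2, 4, 5, 8)] == [1, 1, 2, 3, 3]
```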
Normalization Behavior:

- If normalize=True: Input vector x is divided by ‖x‖ before encoding
- If normalize=False: Input is used as-is (must be pre-normalized)
- Zero-norm vectors will cause errors (division by zero)
Backend-Specific Implementation:

- PennyLane: Uses qml.AmplitudeEmbedding for optimal performance
- Qiskit: Uses QuantumCircuit.initialize() with automatic decomposition
- Cirq: Uses a custom unitary gate constructed via Gram-Schmidt
Qubit Ordering Conventions: Quantum computing frameworks differ in how they map statevector indices to qubit labels. This library adopts the MSB (most significant bit) convention used by PennyLane and Cirq, where qubit 0 is the leftmost (most significant) bit.
- PennyLane — MSB natively. qml.state() returns a statevector where index i encodes the computational basis state with qubit 0 as the most significant bit. No conversion needed.
- Cirq — MSB natively. Simulator.simulate() returns a statevector with LineQubit(0) as the most significant bit. No conversion needed.
- Qiskit — LSB natively. Statevector(circuit).data returns amplitudes where qubit 0 is the least significant bit. _to_qiskit() compensates for this by bit-reversing the amplitude indices before calling initialize(), and the analysis module's _reverse_qubit_order() reverses the statevector when reading results back. The net effect is transparent MSB semantics for the caller.
Example: For 4 features (2 qubits), amplitude vector [a₀, a₁, a₂, a₃] maps to quantum state: a₀|00⟩ + a₁|01⟩ + a₂|10⟩ + a₃|11⟩
In the MSB convention used by all backends after conversion:
- Index 0 (binary 00) → qubit 0 = 0, qubit 1 = 0
- Index 1 (binary 01) → qubit 0 = 0, qubit 1 = 1
- Index 2 (binary 10) → qubit 0 = 1, qubit 1 = 0
- Index 3 (binary 11) → qubit 0 = 1, qubit 1 = 1
This consistent ordering ensures that circuits generated by different backends produce equivalent quantum states, enabling reliable cross- backend testing and backend-agnostic algorithm development.
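The LSB-to-MSB conversion described for Qiskit amounts to permuting statevector indices by their bit-reversed values; a generic NumPy sketch (illustrative, not the library's _to_qiskit):

```python
import numpy as np

def reverse_qubit_order(state: np.ndarray) -> np.ndarray:
    # Map each index to the index with its binary representation reversed.
    n = int(np.log2(len(state)))
    perm = [int(format(i, f"0{n}b")[::-1], 2) for i in range(len(state))]
    return state[perm]

# Amplitude at index 1 = |01> in MSB order (qubit 0 = 0, qubit 1 = 1).
state = np.array([0.0, 1.0, 0.0, 0.0])
flipped = reverse_qubit_order(state)
# Binary 01 reversed is 10, so the amplitude lands at index 2.
assert flipped.tolist() == [0.0, 0.0, 1.0, 0.0]
```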
Hardware Considerations: Due to the deep circuit depth, amplitude encoding may accumulate significant errors on NISQ hardware. Consider using shallower encodings (AngleEncoding, IQPEncoding) for near-term applications.
Initialize the amplitude encoding.
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer.
normalize : bool, default=True
    Whether to automatically normalize input vectors.
Raises¶
ValueError
    If n_features is not a positive integer (raised by parent class).
TypeError
    If normalize is not a boolean.
Examples¶
>>> enc = AmplitudeEncoding(n_features=8)
>>> enc.n_qubits
3
>>> enc = AmplitudeEncoding(n_features=4, normalize=False)
>>> enc.normalize
False
Basis Encoding¶
BasisEncoding
¶
Bases: BaseEncoding
Basis encoding for binary/discrete data into computational basis states.
BasisEncoding provides the simplest quantum data encoding by mapping binary classical data directly to computational basis states. Each qubit represents one classical bit, with X gates applied to encode 1s.
This encoding creates pure computational basis states (no superposition), making it:
- Trivially simulable: Classical computers can efficiently track basis states
- Hardware efficient: Only single-qubit X gates, no entanglement
- Deterministic: Measurement always yields the encoded bit string
- Minimal depth: Circuit depth of 1 regardless of feature count
The circuit structure is:
|0⟩ ─── X^{x₀} ─── |x₀⟩
|0⟩ ─── X^{x₁} ─── |x₁⟩
|0⟩ ─── X^{x₂} ─── |x₂⟩
...
where X^{xᵢ} means "apply X gate if xᵢ = 1, otherwise identity".
Parameters¶
n_features : int
    Number of binary features to encode. Must be a positive integer.
    Each feature requires one qubit, so this also determines the number
    of qubits in the circuit.
threshold : float, default=0.5
    Binarization threshold for continuous inputs. Values strictly greater
    than this threshold are mapped to 1 (X gate applied), while values
    less than or equal to the threshold are mapped to 0 (no gate).
Common threshold choices:
- 0.5: Standard for data normalized to [0, 1]
- 0.0: Treat negative values as 0, positive as 1
- Custom: Match your data's natural decision boundary
Attributes¶
n_features : int
    Number of binary features (inherited from BaseEncoding).
n_qubits : int
    Number of qubits, equal to n_features.
depth : int
    Circuit depth, always 1 for basis encoding.
threshold : float
    Binarization threshold. Values > threshold become 1.
Examples¶
Create a basic basis encoding for 4 binary features:
>>> from encoding_atlas import BasisEncoding
>>> import numpy as np
>>> enc = BasisEncoding(n_features=4)
>>> enc.n_qubits
4
>>> enc.depth
1
>>> enc.threshold
0.5
Encode binary data directly:
>>> x_binary = np.array([1, 0, 1, 1])
>>> circuit = enc.get_circuit(x_binary, backend='pennylane')
>>> callable(circuit)
True
Continuous data is automatically binarized at threshold 0.5:
>>> x_continuous = np.array([0.8, 0.2, 0.6, 0.9])  # -> [1, 0, 1, 1]
>>> circuit = enc.get_circuit(x_continuous, backend='pennylane')
Custom threshold for data with different scales:
>>> enc_custom = BasisEncoding(n_features=4, threshold=0.0)
>>> enc_custom.threshold
0.0
>>> # Now: positive values -> 1, non-positive values -> 0
>>> x_signed = np.array([-0.5, 0.5, -0.1, 0.1])  # -> [0, 1, 0, 1]
>>> circuit = enc_custom.get_circuit(x_signed, backend='pennylane')
Generate circuits for different backends:
>>> x = np.array([1, 0, 1, 0])
>>> qiskit_circuit = enc.get_circuit(x, backend='qiskit')
>>> qiskit_circuit.num_qubits
4
>>> cirq_circuit = enc.get_circuit(x, backend='cirq')
>>> len(cirq_circuit.all_qubits())
4
Batch processing multiple samples:
>>> X = np.array([[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 1, 1]])
>>> circuits = enc.get_circuits(X, backend='pennylane')
>>> len(circuits)
3
Access encoding properties:
>>> props = enc.properties
>>> props.is_entangling
False
>>> props.simulability
'simulable'
>>> props.depth
1
Analyze actual resource usage for specific data:
>>> enc = BasisEncoding(n_features=4)
>>> enc.properties.gate_count  # Theoretical worst-case
4
>>> enc.actual_gate_count([1, 0, 0, 0])  # Actual for sparse data
1
>>> enc.binarize([0.8, 0.2, 0.6, 0.4])  # Inspect binarization
array([1, 0, 1, 0])
References¶
.. [1] Nielsen, M. A., & Chuang, I. L. (2010). "Quantum Computation and Quantum Information." Cambridge University Press.
.. [2] Schuld, M., & Petruccione, F. (2018). "Supervised Learning with Quantum Computers." Springer.
See Also¶
AngleEncoding : Encodes continuous features as rotation angles.
AmplitudeEncoding : Encodes features in state amplitudes (logarithmic qubits).
IQPEncoding : Adds entanglement via ZZ interactions for richer expressivity.
Notes¶
Binarization Behavior: This implementation automatically binarizes continuous inputs using the configured threshold (default 0.5):
- x > threshold → 1 (X gate applied)
- x ≤ threshold → 0 (no gate applied)
For pre-binarized data (0s and 1s), this threshold has no effect. The threshold can be customized at construction time to match your data's natural decision boundary.
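The rule is one line of NumPy; this mirrors the documented behavior (strictly greater maps to 1) but is a sketch, not the library's binarize method:

```python
import numpy as np

def binarize(x, threshold=0.5):
    # Strictly greater than threshold -> 1; less than or equal -> 0.
    return (np.asarray(x) > threshold).astype(int)

assert binarize([0.8, 0.2, 0.6, 0.5]).tolist() == [1, 0, 1, 0]  # 0.5 maps to 0
assert binarize([-0.5, 0.5, -0.1, 0.1], threshold=0.0).tolist() == [0, 1, 0, 1]
```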
Comparison with Other Encodings:
+-------------------+----------+-------+------------+------------+
| Encoding          | Qubits   | Depth | Entangling | Data Type  |
+===================+==========+=======+============+============+
| BasisEncoding     | n        | 1     | No         | Binary     |
+-------------------+----------+-------+------------+------------+
| AngleEncoding     | n        | 1     | No         | Continuous |
+-------------------+----------+-------+------------+------------+
| AmplitudeEncoding | log₂(n)  | O(2ⁿ) | Yes        | Continuous |
+-------------------+----------+-------+------------+------------+
| IQPEncoding       | n        | O(n)  | Yes        | Continuous |
+-------------------+----------+-------+------------+------------+
When to Use Basis Encoding:
- Your data is naturally binary or categorical
- You need a simple baseline encoding for comparison
- Circuit depth must be minimal (NISQ hardware)
- You're implementing algorithms that work with marked basis states (Grover's search, QAOA for combinatorial problems)
When NOT to Use Basis Encoding:
- Your data is continuous and pattern-rich (use AngleEncoding or IQP)
- You need amplitude-based information processing
- Your algorithm requires superposition or entanglement from the encoding
Cirq Backend Note:
Cirq's circuit.all_qubits() only returns qubits that have operations.
For sparse inputs (many zeros), this may return fewer qubits than expected.
Always use encoding.n_qubits to determine the actual qubit count
required for your circuit, not circuit.all_qubits().
Understanding Gate Counts (Important):
BasisEncoding provides two ways to analyze gate counts:
1. Theoretical (worst-case): properties.gate_count returns n_features, assuming all features binarize to 1. This is computed once at construction time and cached. Use this for:
   - Capacity planning and resource allocation
   - Worst-case guarantees
   - Comparing encoding algorithms theoretically
   - Documentation and specifications
2. Actual (data-dependent): actual_gate_count(x) returns the exact count of X gates for specific input data. This equals the number of 1s after binarization. Use this for:
   - Accurate hardware resource estimation
   - Benchmarking with real datasets
   - Cost analysis for sparse data
   - Debugging unexpected behavior
Example showing the difference:
>>> enc = BasisEncoding(n_features=100)
>>> enc.properties.gate_count  # Theoretical: always 100
100
>>> sparse_data = np.zeros(100)
>>> sparse_data[0] = 1  # Only one feature is "on"
>>> enc.actual_gate_count(sparse_data)  # Actual: only 1 gate
1
For sparse binary data, the actual gate count can be significantly lower than the theoretical maximum, which matters for:
- Hardware execution time (fewer gates = faster circuits)
- Error rates (fewer gates = less decoherence)
- Classical simulation cost (sparser circuits = easier to simulate)
Initialize the basis encoding.
Parameters¶
n_features : int
    Number of binary features to encode. Each feature requires one qubit,
    so this determines both the input dimension and the number of qubits
    in the circuit.
threshold : float, default=0.5
    Binarization threshold for continuous inputs. Values strictly greater
    than this threshold are mapped to 1 (X gate applied), while values
    less than or equal are mapped to 0 (no gate).
Must be a finite real number. Common choices:
- 0.5 for data normalized to [0, 1]
- 0.0 for signed data (negative -> 0, positive -> 1)
Raises¶
ValueError
    If n_features is not a positive integer (raised by parent class).
TypeError
    If threshold is not a numeric type (int or float).
ValueError
    If threshold is NaN or infinite.
Examples¶
>>> enc = BasisEncoding(n_features=4)
>>> enc.n_qubits
4
>>> enc.depth
1
>>> enc.threshold
0.5
>>> enc = BasisEncoding(n_features=8)
>>> enc.properties.gate_count  # Worst case: all X gates
8
>>> enc = BasisEncoding(n_features=4, threshold=0.0)
>>> enc.threshold
0.0
IQP Encoding¶
IQPEncoding
¶
IQPEncoding(n_features: int, reps: int = _DEFAULT_REPS, entanglement: Literal['full', 'linear', 'circular'] = _DEFAULT_ENTANGLEMENT)
Bases: BaseEncoding
IQP encoding using Hadamard gates and ZZ interactions.
IQPEncoding implements Instantaneous Quantum Polynomial circuits for quantum machine learning. It creates highly entangled quantum states by combining Hadamard gates, single-qubit Z rotations, and two-qubit ZZ interactions parameterized by classical input features.
This encoding is notable for its provable classical hardness: simulating IQP circuits is believed to be intractable for classical computers under standard complexity-theoretic assumptions, making it a candidate for demonstrating quantum advantage in machine learning.
The circuit structure for each repetition is:
|0⟩ ─ H ─ RZ(2x₀) ─╭─────╮─╭─────╮─
|0⟩ ─ H ─ RZ(2x₁) ─│ ZZ │─│ │─
|0⟩ ─ H ─ RZ(2x₂) ─╰─────╯─│ ZZ │─
... ╰─────╯
where ZZ(xᵢxⱼ) interactions are applied according to the entanglement topology (full, linear, or circular connectivity).
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer.
    Each feature requires one qubit, so this also determines the number
    of qubits in the circuit.
reps : int, default=2
    Number of times to repeat the encoding layers. Higher values create
    deeper circuits with stronger entanglement but may face trainability
    issues. Must be at least 1.
entanglement : {"full", "linear", "circular"}, default="full"
    Topology of ZZ interactions between qubits:
- "full": All-to-all connectivity. Every pair (i, j) with i < j has a
ZZ interaction. Creates n(n-1)/2 entangling gates per layer.
Maximum expressivity but highest gate count.
- "linear": Nearest-neighbor connectivity. Only pairs (i, i+1) have
ZZ interactions. Creates n-1 entangling gates per layer.
Hardware-friendly for linear qubit topologies.
- "circular": Nearest-neighbor with periodic boundary. Like linear,
but adds a ZZ interaction between qubits n-1 and 0.
Creates n entangling gates per layer.
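The three topologies can be sketched as a generic pair-generating helper (illustrative, not a library function):

```python
def entanglement_pairs(n: int, topology: str):
    # Qubit pairs receiving a ZZ interaction, per the three topologies.
    if topology == "full":
        return [(i, j) for i in range(n) for j in range(i + 1, n)]
    if topology == "linear":
        return [(i, i + 1) for i in range(n - 1)]
    if topology == "circular":
        return [(i, i + 1) for i in range(n - 1)] + [(n - 1, 0)]
    raise ValueError(f"unknown topology: {topology}")

assert len(entanglement_pairs(4, "full")) == 6      # n(n-1)/2
assert len(entanglement_pairs(4, "linear")) == 3    # n-1
assert len(entanglement_pairs(4, "circular")) == 4  # n
```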
Attributes¶
reps : int
    Number of encoding layer repetitions.
entanglement : str
    The entanglement topology ("full", "linear", or "circular").
n_features : int
    Number of classical features (inherited from BaseEncoding).
n_qubits : int
    Number of qubits, equal to n_features.
Examples¶
Create a basic IQP encoding with default settings:
>>> from encoding_atlas import IQPEncoding
>>> import numpy as np
>>> enc = IQPEncoding(n_features=4)
>>> enc.n_qubits
4
>>> enc.entanglement
'full'
Generate a PennyLane circuit:
>>> x = np.array([0.1, 0.2, 0.3, 0.4])
>>> circuit = enc.get_circuit(x, backend='pennylane')
>>> callable(circuit)
True
Use linear entanglement for hardware-friendly circuits:
>>> enc_linear = IQPEncoding(n_features=4, entanglement='linear')
>>> props = enc_linear.properties
>>> props.two_qubit_gates < enc.properties.two_qubit_gates
True
Generate Qiskit and Cirq circuits:
>>> qiskit_circuit = enc.get_circuit(x, backend='qiskit')
>>> qiskit_circuit.num_qubits
4
>>> cirq_circuit = enc.get_circuit(x, backend='cirq')
>>> len(cirq_circuit.all_qubits())
4
Access encoding properties:
>>> props = enc.properties
>>> props.is_entangling
True
>>> props.simulability
'not_simulable'
References¶
.. [1] Havlíček, V., et al. (2019). "Supervised learning with quantum-enhanced feature spaces." Nature, 567(7747), 209-212.
.. [2] Bremner, M. J., et al. (2016). "Average-case complexity versus approximate simulation of commuting quantum computations." Physical Review Letters, 117(8), 080501.
.. [3] McClean, J. R., et al. (2018). "Barren plateaus in quantum neural network training landscapes." Nature Communications, 9, 4812.
See Also¶
ZZFeatureMap : Similar encoding with different phase conventions.
AngleEncoding : Simpler encoding without entanglement.
PauliFeatureMap : Generalized Pauli-based feature map.
DataReuploading : Re-uploads data with trainable intermediate layers.
Warnings¶
Full Entanglement with Many Features: When using entanglement='full'
with more than 12 features, a warning is issued because the circuit
complexity may exceed practical limits for NISQ devices. The number of
two-qubit gates scales as O(n²), which can lead to:
- Excessive circuit depth and gate errors
- Hardware connectivity constraints (all-to-all required)
- Potential trainability issues (barren plateaus)
Consider using entanglement='linear' or entanglement='circular'
for large feature counts.
Notes¶
Classical Hardness: The output distribution of IQP circuits is provably hard to sample from classically under the assumption that the polynomial hierarchy does not collapse [2]_. This provides theoretical motivation for potential quantum advantage in machine learning.
Feature Interactions: The ZZ(xᵢxⱼ) gates encode products of features, allowing the quantum kernel to capture pairwise correlations. This is analogous to polynomial feature expansion in classical machine learning, but executed in superposition.
Entanglement Topology Trade-offs:
+----------+--------------+--------------+-------------------+
| Topology | Gate Scaling | Connectivity | Best For          |
+==========+==============+==============+===================+
| full     | O(n²)        | All-to-all   | Max expressivity  |
+----------+--------------+--------------+-------------------+
| linear   | O(n)         | Nearest-only | NISQ devices      |
+----------+--------------+--------------+-------------------+
| circular | O(n)         | Ring         | Periodic problems |
+----------+--------------+--------------+-------------------+
Trainability: Deeper circuits (higher reps) with full entanglement may exhibit barren plateaus, where gradients become exponentially small [3]_. Mitigation strategies include:
- Using fewer repetitions (reps=1 or 2)
- Using sparser entanglement (linear or circular)
- Layer-wise training approaches
- Careful parameter initialization
Gate Decomposition: The ZZ interaction is implemented using the
standard decomposition: ZZ(θ) = CNOT · RZ(2θ) · CNOT. This requires
2 CNOT gates per interaction, which dominates the circuit's two-qubit
gate count.
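The decomposition can be checked numerically, assuming the phase convention ZZ(θ) = exp(−iθ Z⊗Z) (conventions differ between frameworks):

```python
import numpy as np

theta = 0.37
Z = np.diag([1.0, -1.0])
# Target gate: ZZ(theta) = exp(-i * theta * Z (x) Z), which is diagonal.
zz_gate = np.diag(np.exp(-1j * theta * np.diag(np.kron(Z, Z))))

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
# RZ(2*theta) on the target qubit.
rz = np.diag([np.exp(-1j * theta), np.exp(1j * theta)])
decomposed = CNOT @ np.kron(np.eye(2), rz) @ CNOT

assert np.allclose(decomposed, zz_gate)  # two CNOTs per ZZ interaction
```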
Initialize the IQP encoding.
Parameters¶
n_features : int
    Number of classical features to encode.
reps : int, default=2
    Number of encoding layer repetitions.
entanglement : {"full", "linear", "circular"}, default="full"
    Topology of ZZ interactions between qubits.
Raises¶
ValueError
    If reps is less than 1.
ValueError
    If entanglement is not one of "full", "linear", "circular".
ValueError
    If n_features is less than 1 (raised by parent class).
Warns¶
UserWarning
    If using full entanglement with more than 12 features, as the circuit
    complexity may exceed practical NISQ device limits.
ZZ Feature Map¶
ZZFeatureMap
¶
ZZFeatureMap(n_features: int, reps: int = _DEFAULT_REPS, entanglement: Literal['full', 'linear', 'circular'] = _DEFAULT_ENTANGLEMENT)
Bases: BaseEncoding
ZZ Feature Map encoding following Qiskit conventions.
ZZFeatureMap implements the second-order Pauli-Z expansion feature map commonly used in quantum kernel methods and QSVM. It creates entangled quantum states by applying Hadamard gates, phase rotations on single qubits, and ZZ entangling interactions between pairs of qubits.
This encoding follows the Qiskit convention for ZZFeatureMap, using the phase formula 2(π - xᵢ)(π - xⱼ) for two-qubit interactions, which creates a different kernel geometry compared to the standard IQP encoding.
The circuit structure for each repetition is:
|0⟩ ─ H ─ P(2x₀) ─╭──────────────╮─╭──────────────╮─
|0⟩ ─ H ─ P(2x₁) ─│ ZZ(φ₀₁) │─│ │─
|0⟩ ─ H ─ P(2x₂) ─╰──────────────╯─│ ZZ(φ₁₂) │─
... ╰──────────────╯
where φᵢⱼ = 2(π - xᵢ)(π - xⱼ) and ZZ interactions are applied according to the specified entanglement topology.
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer.
    Each feature requires one qubit, so this also determines the number
    of qubits in the circuit.
reps : int, default=2
    Number of times to repeat the encoding layers. Higher values create
    deeper circuits that may capture more complex feature relationships.
    Must be at least 1.
entanglement : {"full", "linear", "circular"}, default="full"
    Topology of ZZ interactions between qubits:
- "full": All-to-all connectivity. Every pair (i, j) with i < j has a
ZZ interaction. Creates n(n-1)/2 entangling gates per layer.
- "linear": Nearest-neighbor connectivity. Only pairs (i, i+1) have
ZZ interactions. Creates n-1 entangling gates per layer.
- "circular": Nearest-neighbor with periodic boundary. Like linear,
but adds a ZZ interaction between qubits n-1 and 0 when n > 2.
Attributes¶
reps : int
    Number of encoding layer repetitions.
entanglement : str
    The entanglement topology ("full", "linear", or "circular").
n_features : int
    Number of classical features (inherited from BaseEncoding).
n_qubits : int
    Number of qubits, equal to n_features.
Use Cases¶
Quantum Kernel Methods (QSVM): ZZFeatureMap is the canonical choice for quantum kernel estimation in supervised learning. The (π-x) phase convention creates kernels that can separate data classes that are not linearly separable in the original feature space.
Quantum Neural Networks: Use as a feature embedding layer before variational ansatz circuits. The entanglement creates correlations that parameterized layers can learn to exploit.
Benchmarking: Standard circuit for comparing quantum machine learning approaches across different platforms and simulators.
Research: Studying the effect of different phase conventions on kernel geometry and quantum advantage in machine learning tasks.
Limitations¶
Scalability: Full entanglement creates O(n²) ZZ pairs, leading to
deep circuits for large feature counts. For n > 10 features, consider
using entanglement='linear' or entanglement='circular'.
Hardware Connectivity: Full entanglement requires all-to-all qubit connectivity, which is not available on most near-term quantum hardware. SWAP gates will be inserted during transpilation, increasing circuit depth.
Noise Sensitivity: Deep circuits with many two-qubit gates are more susceptible to decoherence and gate errors on NISQ devices.
Classical Simulation: Unlike product-state encodings (e.g., AngleEncoding), ZZFeatureMap creates entangled states that cannot be efficiently simulated classically for large qubit counts.
Resource Analysis¶
The circuit resources scale as follows per repetition:
Full entanglement (entanglement='full'):
- ZZ pairs: n(n-1)/2
- CNOT gates: n(n-1) per rep
- Example: n=10 → 45 pairs, 90 CNOTs per rep
Linear entanglement (entanglement='linear'):
- ZZ pairs: n-1
- CNOT gates: 2(n-1) per rep
- Example: n=10 → 9 pairs, 18 CNOTs per rep
Circular entanglement (entanglement='circular'):
- ZZ pairs: n (for n > 2)
- CNOT gates: 2n per rep
- Example: n=10 → 10 pairs, 20 CNOTs per rep
Use gate_count_breakdown() for detailed gate counts, or
get_entanglement_pairs() to inspect the connectivity.
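Assuming the standard two-CNOT decomposition of each ZZ interaction, the per-repetition CNOT counts above reduce to a one-line formula. This is an illustrative sketch; `cnots_per_rep` is our name, not a library function:

```python
def cnots_per_rep(n: int, entanglement: str = "full") -> int:
    """CNOTs per repetition, assuming 2 CNOTs per ZZ interaction."""
    if entanglement == "full":
        pairs = n * (n - 1) // 2
    elif entanglement == "linear":
        pairs = n - 1
    else:  # circular: wrap-around edge only exists for n > 2
        pairs = n if n > 2 else n - 1
    return 2 * pairs

print(cnots_per_rep(10, "full"))      # 90
print(cnots_per_rep(10, "linear"))    # 18
print(cnots_per_rep(10, "circular"))  # 20
```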
Entanglement Topology Trade-offs¶
+------------+------------------+----------------+------------------+
| Topology   | Connectivity     | ZZ Pairs       | Hardware         |
+============+==================+================+==================+
| full       | All-to-all       | n(n-1)/2       | Requires SWAP    |
|            | (maximum)        | (quadratic)    | gates on most HW |
+------------+------------------+----------------+------------------+
| linear     | Nearest-neighbor | n-1            | Native on most   |
|            | (chain)          | (linear)       | superconducting  |
+------------+------------------+----------------+------------------+
| circular   | Nearest-neighbor | n (for n > 2)  | Native on ring   |
|            | + wrap-around    | (linear)       | topologies       |
+------------+------------------+----------------+------------------+
Recommendation: Use entanglement='linear' for NISQ hardware
experiments to minimize SWAP overhead. Use entanglement='full'
for simulation studies or when maximum expressivity is needed.
Examples¶
Create a basic ZZ Feature Map with default settings:

```python
>>> from encoding_atlas import ZZFeatureMap
>>> import numpy as np
>>> enc = ZZFeatureMap(n_features=4)
>>> enc.n_qubits
4
>>> enc.entanglement
'full'
```

Generate a PennyLane circuit:

```python
>>> x = np.array([0.5, 1.0, 1.5, 2.0])
>>> circuit = enc.get_circuit(x, backend='pennylane')
>>> callable(circuit)
True
```

Use linear entanglement for hardware-friendly circuits:

```python
>>> enc_linear = ZZFeatureMap(n_features=4, entanglement='linear')
>>> enc_linear.properties.two_qubit_gates
6
```

Generate a Qiskit circuit (compatible with Qiskit's ZZFeatureMap):

```python
>>> qiskit_circuit = enc.get_circuit(x, backend='qiskit')
>>> qiskit_circuit.num_qubits
4
```

Batch processing with parallel execution:

```python
>>> X = np.random.randn(100, 4)
>>> circuits = enc.get_circuits(X, backend='pennylane', parallel=True)
>>> len(circuits)
100
```

Inspect entanglement connectivity:

```python
>>> enc.get_entanglement_pairs()
[(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Get a detailed gate count breakdown:

```python
>>> breakdown = enc.gate_count_breakdown()
>>> breakdown['cnot']
24
>>> breakdown['total']
52
```

Access encoding properties:

```python
>>> props = enc.properties
>>> props.is_entangling
True
>>> props.simulability
'not_simulable'
```
References¶
.. [1] Havlíček, V., et al. (2019). "Supervised learning with quantum-enhanced feature spaces." Nature, 567(7747), 209-212.
.. [2] Qiskit Development Team. "ZZFeatureMap documentation."
See Also¶
IQPEncoding : Similar IQP-style encoding with direct xᵢxⱼ interactions.
PauliFeatureMap : Generalized Pauli-based feature map.
AngleEncoding : Simpler encoding without entanglement.
DataReuploading : Re-uploads data with trainable intermediate layers.
Notes¶
Phase Convention: The (π - x) convention in ZZ interactions means:
- Features at x = π have zero interaction strength
- Features at x = 0 or x = 2π have maximum interaction
- This creates a kernel with different geometry than standard IQP
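A two-line sketch of this convention (`phi` is our name for the pairwise phase, not a library function):

```python
import numpy as np

def phi(xi: float, xj: float) -> float:
    """Pairwise ZZ phase: 2 * (pi - xi) * (pi - xj)."""
    return 2.0 * (np.pi - xi) * (np.pi - xj)

print(phi(np.pi, 0.7))  # features at x = pi -> zero interaction strength
print(phi(0.0, 0.0))    # features at x = 0 -> maximal strength, 2*pi^2
```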
Comparison with IQPEncoding:
- ZZFeatureMap uses P(2x) phase gates; IQP uses RZ(2x)
- ZZFeatureMap uses (π-xᵢ)(π-xⱼ) for ZZ; IQP uses xᵢxⱼ
- Both create classically hard-to-simulate circuits
- Choice depends on data distribution and kernel requirements
Qiskit Compatibility: This implementation follows Qiskit conventions and produces equivalent circuits to qiskit.circuit.library.ZZFeatureMap.
Thread Safety: This class is thread-safe for circuit generation.
Multiple threads can safely call get_circuit() or get_circuits()
concurrently on the same encoding instance. The encoding object is not
modified during circuit generation, and input validation creates
defensive copies to prevent data races.
Circular Entanglement Edge Case: For n_features=2, circular entanglement produces the same connectivity as linear (only one pair). The wrap-around edge (n-1, 0) is only added when n > 2 to avoid duplicate pairs.
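The pair-generation rule, including the n=2 edge case, can be sketched in plain Python. Here `make_pairs` is our illustrative stand-in for what `get_entanglement_pairs()` returns, not library code:

```python
def make_pairs(n: int, entanglement: str = "full") -> list[tuple[int, int]]:
    """Return the ZZ interaction pairs for one layer."""
    if entanglement == "full":
        # every pair (i, j) with i < j: n(n-1)/2 pairs
        return [(i, j) for i in range(n) for j in range(i + 1, n)]
    pairs = [(i, i + 1) for i in range(n - 1)]  # nearest neighbours
    if entanglement == "circular" and n > 2:
        pairs.append((n - 1, 0))  # wrap-around edge, skipped for n = 2
    return pairs

print(make_pairs(4, "full"))      # 6 pairs
print(make_pairs(4, "circular"))  # 4 pairs, includes (3, 0)
print(make_pairs(2, "circular"))  # same as linear: only (0, 1)
```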
Initialize the ZZ Feature Map encoding.
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer. Each feature requires one qubit, so this also determines the number of qubits in the circuit.
reps : int, default=2
    Number of encoding layer repetitions. Must be at least 1.
entanglement : {"full", "linear", "circular"}, default="full"
    Topology of ZZ interactions between qubits.
Raises¶
ValueError
    If reps is not a positive integer.
ValueError
    If entanglement is not one of "full", "linear", "circular".
ValueError
    If n_features is less than 1 (raised by parent class).
Examples¶
```python
>>> enc = ZZFeatureMap(n_features=4)
>>> enc.n_qubits
4
>>> enc.reps
2
>>> enc.entanglement
'full'
```

```python
>>> enc = ZZFeatureMap(n_features=8, reps=3, entanglement='linear')
>>> len(enc.get_entanglement_pairs())
7
```
Pauli Feature Map¶
PauliFeatureMap
¶
PauliFeatureMap(n_features: int, reps: int = _DEFAULT_REPS, paulis: list[str] | None = None, entanglement: Literal['full', 'linear', 'circular'] = _DEFAULT_ENTANGLEMENT)
Bases: BaseEncoding
Configurable Pauli feature map encoding for quantum machine learning.
PauliFeatureMap encodes classical data into quantum states using a configurable set of Pauli rotation gates. This provides flexibility to tailor the encoding to specific problem structures.
The circuit structure for each repetition consists of:
1. A Hadamard layer creating uniform superposition
2. Pauli rotation layers as specified by the paulis parameter
For single-qubit Paulis (e.g., "Z", "Y", "X"), rotations R_P(2φ(x_i)) are applied where φ is a feature mapping function.
For two-qubit Paulis (e.g., "ZZ", "YY", "XX"), entangling rotations exp(-i φ(x_i, x_j) P⊗P) are applied using the specified entanglement pattern.
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer.
reps : int, default=2
    Number of repetitions (layers) of the feature map circuit. Higher values increase expressibility but also circuit depth. Must be at least 1.
paulis : list[str] or None, default=None
    List of Pauli strings specifying which rotations to apply.
    Valid single-qubit terms: "X", "Y", "Z".
    Valid two-qubit terms: "XX", "YY", "ZZ", "XY", "XZ", "YZ".
    If None, defaults to ["Z", "ZZ"] (equivalent to ZZFeatureMap).
entanglement : {"full", "linear", "circular"}, default="full"
    Pattern for two-qubit interactions:

    - "full": All pairs (i, j) where i < j
    - "linear": Consecutive pairs (i, i+1)
    - "circular": Linear plus (n-1, 0)
Attributes¶
reps : int
    Number of circuit repetitions.
paulis : list[str]
    List of Pauli terms used in the encoding.
entanglement : str
    Entanglement pattern for two-qubit gates.
Examples¶
Create a basic Pauli feature map with default settings:

```python
>>> from encoding_atlas import PauliFeatureMap
>>> import numpy as np
>>> enc = PauliFeatureMap(n_features=4)
>>> enc.n_qubits
4
>>> enc.paulis
['Z', 'ZZ']
```

Create a custom feature map with Y and YY rotations:

```python
>>> enc = PauliFeatureMap(n_features=4, paulis=["Y", "YY"], reps=3)
>>> x = np.array([0.1, 0.2, 0.3, 0.4])
>>> circuit = enc.get_circuit(x, backend='pennylane')
```

Use linear entanglement for hardware-friendly circuits:

```python
>>> enc_full = PauliFeatureMap(n_features=8, entanglement="full")
>>> enc_linear = PauliFeatureMap(n_features=8, entanglement="linear")
>>> enc_linear.properties.two_qubit_gates < enc_full.properties.two_qubit_gates
True
```

Batch processing with parallel execution:

```python
>>> X = np.random.randn(100, 4)
>>> circuits = enc.get_circuits(X, backend='pennylane', parallel=True)
>>> len(circuits)
100
```

Inspect entanglement connectivity:

```python
>>> enc = PauliFeatureMap(n_features=4)
>>> enc.get_entanglement_pairs()
[(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Get a detailed gate count breakdown:

```python
>>> breakdown = enc.gate_count_breakdown()
>>> breakdown['cnot']
24
>>> breakdown['total']
52
```
References¶
.. [1] Havlíček, V., et al. (2019). "Supervised learning with quantum-enhanced feature spaces." Nature, 567(7747), 209-212.
See Also¶
ZZFeatureMap : Specialized version with fixed Z and ZZ paulis.
IQPEncoding : Similar encoding with IQP-style structure.
AngleEncoding : Simpler encoding without entanglement.
Warnings¶
Full Entanglement with Many Features: When using entanglement='full'
with more than 12 features, a warning is issued because the circuit
complexity may exceed practical limits for NISQ devices. The number of
two-qubit gates scales as O(n²), which can lead to:
- Excessive circuit depth and gate errors
- Hardware connectivity constraints (all-to-all required)
- Potential trainability issues (barren plateaus)
Consider using entanglement='linear' or entanglement='circular'
for large feature counts.
Notes¶
Feature Interactions: The two-qubit Pauli gates encode products of features using the (π - xᵢ)(π - xⱼ) convention, allowing the quantum kernel to capture pairwise correlations.
Entanglement Topology Trade-offs:
+------------+--------------+--------------+-------------------+
| Topology   | Gate Scaling | Connectivity | Best For          |
+============+==============+==============+===================+
| full       | O(n²)        | All-to-all   | Max expressivity  |
+------------+--------------+--------------+-------------------+
| linear     | O(n)         | Nearest-only | NISQ devices      |
+------------+--------------+--------------+-------------------+
| circular   | O(n)         | Ring         | Periodic problems |
+------------+--------------+--------------+-------------------+
Pauli Term Trade-offs:
+-------+-------------------+------------------------------------------+
| Pauli | Gate Overhead     | Notes                                    |
+=======+===================+==========================================+
| Z     | Minimal           | Native on most hardware                  |
+-------+-------------------+------------------------------------------+
| X     | Low (1 H gate)    | Requires basis change                    |
+-------+-------------------+------------------------------------------+
| Y     | Medium (H + S)    | Requires two basis change gates          |
+-------+-------------------+------------------------------------------+
| ZZ    | 2 CNOTs           | Standard for IQP/ZZFeatureMap            |
+-------+-------------------+------------------------------------------+
| XX    | 2 CNOTs + 4 H     | Basis change adds 4 single-qubit gates   |
+-------+-------------------+------------------------------------------+
| YY    | 2 CNOTs + 8 gates | Most overhead due to Y-basis changes     |
+-------+-------------------+------------------------------------------+
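The "2 CNOTs" entry for ZZ can be verified numerically: the two-qubit rotation exp(-iθ Z⊗Z) equals CNOT · (I⊗RZ(2θ)) · CNOT. A NumPy sketch of this identity, not library code:

```python
import numpy as np

theta = 0.37
I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
RZ = np.diag([np.exp(-1j * theta), np.exp(1j * theta)])  # RZ(2*theta)

# exact two-qubit rotation exp(-i * theta * Z (x) Z), diagonal in the Z basis
ZZ = np.kron(Z, Z)
exact = np.diag(np.exp(-1j * theta * np.diag(ZZ)))

# two-CNOT decomposition: CNOT . (I (x) RZ(2*theta)) . CNOT
decomposed = CNOT @ np.kron(I2, RZ) @ CNOT

print(np.allclose(exact, decomposed))  # True
```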
Trainability: Deeper circuits (higher reps) with full entanglement and many Pauli terms may exhibit barren plateaus, where gradients become exponentially small. Mitigation strategies include:
- Using fewer repetitions (reps=1 or 2)
- Using sparser entanglement (linear or circular)
- Selecting fewer Pauli terms
- Layer-wise training approaches
Thread Safety: This class is thread-safe for circuit generation.
Multiple threads can safely call get_circuit() or get_circuits()
concurrently on the same encoding instance.
Initialize the Pauli feature map encoding.
Parameters¶
n_features : int
    Number of classical features to encode.
reps : int, default=2
    Number of repetitions of the feature map circuit.
paulis : list[str] or None, default=None
    List of Pauli strings. Defaults to ["Z", "ZZ"].
entanglement : {"full", "linear", "circular"}, default="full"
    Entanglement pattern for two-qubit gates.
Raises¶
ValueError
    If any parameter is invalid.
TypeError
    If paulis is not a list or None.
Warns¶
UserWarning
    If using full entanglement with more than 12 features, as the circuit complexity may exceed practical NISQ device limits.
Data Re-uploading¶
DataReuploading
¶
Bases: BaseEncoding
Data re-uploading quantum feature map with high expressivity.
DataReuploading implements a quantum encoding strategy where classical data is encoded multiple times throughout the circuit, interleaved with entangling layers. This multi-layer approach creates quantum states with rich Fourier spectra, enabling expressive quantum kernels and serving as a foundation for variational quantum classifiers.
The key insight is that re-uploading data at multiple circuit depths increases the number of accessible Fourier frequencies, similar to how depth increases expressivity in classical neural networks.
The circuit structure repeats L times:
|0⟩ ─ RY(x₀) ─╭────╮─ RY(x₀) ─╭────╮─ ... ─
|0⟩ ─ RY(x₁) ─│CNOT│─ RY(x₁) ─│CNOT│─ ... ─
|0⟩ ─ RY(x₂) ─│ │─ RY(x₂) ─│ │─ ... ─
... ╰────╯ ╰────╯
Each layer consists of data encoding (RY rotations) followed by entanglement (CNOT ladder), repeated n_layers times.
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer. If n_features > n_qubits, features are cyclically mapped to qubits.
n_layers : int, default=3
    Number of re-uploading layers. Higher values increase the number of accessible Fourier frequencies but also circuit depth. Must be at least 1. More layers provide richer feature representations.
n_qubits : int, optional
    Number of qubits in the circuit. If not specified, defaults to n_features. Can be set lower than n_features to use fewer qubits with cyclic feature mapping.
Attributes¶
n_layers : int
    Number of re-uploading layers.
n_features : int
    Number of classical features (inherited from BaseEncoding).
n_qubits : int
    Number of qubits in the circuit.
Examples¶
Create a basic data re-uploading encoding:

```python
>>> from encoding_atlas import DataReuploading
>>> import numpy as np
>>> enc = DataReuploading(n_features=4)
>>> enc.n_qubits
4
>>> enc.n_layers
3
```

Generate a PennyLane circuit:

```python
>>> x = np.array([0.1, 0.2, 0.3, 0.4])
>>> circuit = enc.get_circuit(x, backend='pennylane')
>>> callable(circuit)
True
```

Use fewer qubits than features (cyclic mapping):

```python
>>> enc_compact = DataReuploading(n_features=8, n_qubits=4)
>>> enc_compact.n_qubits
4
```

Increase layers for higher expressivity:

```python
>>> enc_deep = DataReuploading(n_features=4, n_layers=10)
>>> enc_deep.depth
30
```

Generate circuits for different backends:

```python
>>> qiskit_circuit = enc.get_circuit(x, backend='qiskit')
>>> qiskit_circuit.num_qubits
4
```

Access encoding properties:

```python
>>> props = enc.properties
>>> props.is_entangling
True
```
References¶
.. [1] Pérez-Salinas, A., et al. (2020). "Data re-uploading for a universal quantum classifier." Quantum, 4, 226.
.. [2] Schuld, M., et al. (2021). "Effect of data encoding on the expressive power of variational quantum machine learning models."
See Also¶
AngleEncoding : Single-layer encoding without re-uploading.
IQPEncoding : Entangling encoding without data re-uploading.
HardwareEfficientEncoding : Similar structure optimized for hardware.
PauliFeatureMap : Alternative multi-layer encoding approach.
Notes¶
Feature Map vs. Universal Approximation: This class implements a fixed feature map that provides high expressivity through repeated data encoding. For full universal function approximation as described in [1]_, trainable parameters must be added between data uploads. Use this encoding as the feature map layer in a variational circuit, then add trainable rotations to achieve universal approximation capability.
Fourier Expressivity: With L encoding layers, the quantum state can represent functions with Fourier frequencies up to ±L. More layers enable the representation of higher-frequency components. The specific Fourier coefficients are determined by the fixed circuit structure.
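The frequency claim is easiest to see on a single qubit: L applications of RY(x) compose to RY(Lx), so the expectation ⟨Z⟩ = cos(Lx) contains a frequency-L component. A quick NumPy check, illustrative only:

```python
import numpy as np

def ry(x: float) -> np.ndarray:
    """Single-qubit RY(x) rotation matrix."""
    c, s = np.cos(x / 2), np.sin(x / 2)
    return np.array([[c, -s], [s, c]])

x, L = 0.8, 5
state = np.array([1.0, 0.0])   # |0>
for _ in range(L):             # re-upload x in every layer
    state = ry(x) @ state

z_expect = state @ np.diag([1.0, -1.0]) @ state
print(z_expect, np.cos(L * x))  # both equal cos(L*x): a frequency-L term
```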
Trainability Considerations: While data re-uploading helps with expressivity, very deep circuits may still face trainability challenges. The interleaved structure generally provides better gradient flow than purely entangling circuits.
Feature Mapping: When n_features > n_qubits, feature xᵢ is encoded on qubit (i mod n_qubits). This allows compact representations but may create interference between features mapped to the same qubit.
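A one-line illustration of the cyclic rule (purely illustrative):

```python
n_features, n_qubits = 8, 4
# feature i is encoded on qubit (i mod n_qubits)
mapping = {i: i % n_qubits for i in range(n_features)}
print(mapping)  # features 4-7 reuse qubits 0-3
```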
Gate Set Choice: This implementation uses RY rotations for data encoding and a CNOT ladder for entanglement. The original single-qubit formulation in [1]_ uses general SU(2) (U3) rotations with three parameters per gate, and the multi-qubit extension uses CZ gates with varying entanglement topologies. The RY + CNOT variant used here is a common simplification adopted by many QML frameworks; it is less expressive per gate but sufficient for most feature map applications. The Fourier frequency analysis from [2]_ applies regardless of the specific rotation axis.
Comparison with Classical Deep Learning: Data re-uploading is analogous to residual connections in deep neural networks, where information about the input is preserved and refined through layers.
Initialize the data re-uploading encoding.
Parameters¶
n_features : int
    Number of classical features to encode.
n_layers : int, default=3
    Number of re-uploading layers for expressivity.
n_qubits : int, optional
    Number of qubits. Defaults to n_features if not specified.
Raises¶
ValueError
    If n_layers is less than 1.
ValueError
    If n_qubits is less than 1 (when specified).
ValueError
    If n_features is less than 1 (raised by parent class).
Warns¶
UserWarning
    If n_layers exceeds the deep circuit threshold (default 10), as very deep circuits may face trainability challenges due to barren plateaus.
Examples¶
```python
>>> enc = DataReuploading(n_features=4)
>>> enc.n_qubits
4
>>> enc.n_layers
3
```

```python
>>> enc = DataReuploading(n_features=4, n_layers=5, n_qubits=2)
>>> enc.n_qubits
2
>>> enc.n_layers
5
```
Hardware Efficient Encoding¶
HardwareEfficientEncoding
¶
HardwareEfficientEncoding(n_features: int, reps: int = 2, rotation: Literal['X', 'Y', 'Z'] = 'Y', entanglement: Literal['linear', 'circular', 'full'] = 'linear')
Bases: BaseEncoding
Hardware-efficient encoding optimized for NISQ devices.
HardwareEfficientEncoding implements a quantum data encoding that uses only native gates and respects physical qubit connectivity constraints. This makes it particularly suitable for near-term quantum devices where gate errors and limited connectivity are significant concerns.
The encoding uses simple alternating layers of single-qubit rotations (parameterized by input features) and entangling CNOT gates following a linear or circular topology.
The circuit structure for each repetition is:
|0⟩ ─ Rₐ(x₀) ─╭────╮─ Rₐ(x₀) ─╭────╮─
|0⟩ ─ Rₐ(x₁) ─│CNOT│─ Rₐ(x₁) ─│CNOT│─
|0⟩ ─ Rₐ(x₂) ─│ │─ Rₐ(x₂) ─│ │─
... ╰────╯ ╰────╯
where Rₐ is the chosen rotation gate (RX, RY, or RZ) and CNOT gates connect neighboring qubits according to the entanglement topology.
Parameters¶
n_features : int
    Number of classical features to encode. Must be a positive integer. Each feature requires one qubit, so this also determines the number of qubits in the circuit.
reps : int, default=2
    Number of times to repeat the encoding layers. Higher values increase circuit expressivity but also depth and accumulated errors. Must be at least 1.
rotation : {"X", "Y", "Z"}, default="Y"
    Axis of rotation for single-qubit gates:
- "X": Uses RX gates, commonly native on superconducting qubits
- "Y": Uses RY gates (default), creates real-valued amplitudes
- "Z": Uses RZ gates, often "virtual" gates with zero error
The optimal choice depends on the target hardware's native gate set.
entanglement : {"linear", "circular", "full"}, default="linear"
    Topology of CNOT entangling gates:
- "linear": Nearest-neighbor connectivity. CNOT gates connect pairs
(0,1), (1,2), ..., (n-2, n-1). Creates n-1 entangling gates per layer.
Matches linear qubit arrays common in superconducting devices.
- "circular": Linear connectivity plus periodic boundary. Adds a CNOT
between qubits n-1 and 0. Creates n entangling gates per layer.
Matches circular/ring qubit topologies.
- "full": All-to-all connectivity. CNOT gates connect every pair of
qubits. Creates n(n-1)/2 entangling gates per layer (O(n²) scaling).
Matches ion trap devices (IonQ, Quantinuum) with global connectivity.
Attributes¶
reps : int
    Number of encoding layer repetitions.
rotation : str
    The rotation axis ("X", "Y", or "Z").
entanglement : str
    The entanglement topology ("linear", "circular", or "full").
n_features : int
    Number of classical features (inherited from BaseEncoding).
n_qubits : int
    Number of qubits, equal to n_features.
Examples¶
Create a basic hardware-efficient encoding:

```python
>>> from encoding_atlas import HardwareEfficientEncoding
>>> import numpy as np
>>> enc = HardwareEfficientEncoding(n_features=4)
>>> enc.n_qubits
4
>>> enc.rotation
'Y'
```

Generate a PennyLane circuit:

```python
>>> x = np.array([0.1, 0.2, 0.3, 0.4])
>>> circuit = enc.get_circuit(x, backend='pennylane')
>>> callable(circuit)
True
```

Use RZ rotations (often "virtual" gates with zero error):

```python
>>> enc_rz = HardwareEfficientEncoding(n_features=4, rotation='Z')
>>> enc_rz.rotation
'Z'
```

Use circular entanglement for ring topologies:

```python
>>> enc_circ = HardwareEfficientEncoding(n_features=4, entanglement='circular')
>>> enc_circ.properties.two_qubit_gates
8
```

Generate circuits for different backends:

```python
>>> qiskit_circuit = enc.get_circuit(x, backend='qiskit')
>>> qiskit_circuit.num_qubits
4
>>> cirq_circuit = enc.get_circuit(x, backend='cirq')
>>> len(cirq_circuit.all_qubits())
4
```

Access encoding properties:

```python
>>> props = enc.properties
>>> props.is_entangling
True
>>> props.trainability_estimate
0.8
```
References¶
.. [1] Kandala, A., et al. (2017). "Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets." Nature.
.. [2] Cerezo, M., et al. (2021). "Variational quantum algorithms." Nature Reviews Physics.
See Also¶
AngleEncoding : Simpler encoding without entanglement.
DataReuploading : Re-uploads data for universal approximation.
IQPEncoding : Higher expressivity with ZZ interactions.
ZZFeatureMap : Qiskit-style feature map encoding.
Notes¶
Hardware Native Gates: Most superconducting quantum computers support RX, RY, RZ rotations and CNOT (or CZ) as native gates. Ion trap devices typically support different gate sets. Choose the rotation type that matches your target hardware.
Virtual RZ Gates: On many superconducting devices, RZ gates are implemented virtually (by updating the phase tracking) with essentially zero error. Using rotation='Z' can significantly reduce circuit errors.
Depth Considerations: Each repetition adds one rotation layer and one entanglement layer. For n qubits with linear entanglement:

- Total single-qubit gates: reps × n
- Total two-qubit gates: reps × (n-1)
- Circuit depth: approximately 2 × reps
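These counts can be tabulated with a small helper; the name `hwe_resources` and the dict layout are ours for illustration, not part of the library API:

```python
def hwe_resources(n: int, reps: int, entanglement: str = "linear") -> dict:
    """Approximate resource counts for a hardware-efficient encoding."""
    two_qubit_per_layer = {"linear": n - 1,
                           "circular": n,
                           "full": n * (n - 1) // 2}[entanglement]
    return {
        "single_qubit_gates": reps * n,
        "two_qubit_gates": reps * two_qubit_per_layer,
        "approx_depth": 2 * reps,  # one rotation + one entangling layer per rep
    }

print(hwe_resources(n=4, reps=2))
```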
Expressivity vs. Trainability: Hardware-efficient encodings typically have good trainability (gradients don't vanish as quickly as random circuits) but may have lower expressivity than IQP-style encodings. This trade-off often favors hardware-efficient designs on NISQ devices.
Initialize the hardware-efficient encoding.
Parameters¶
n_features : int
    Number of classical features to encode.
reps : int, default=2
    Number of encoding layer repetitions.
rotation : {"X", "Y", "Z"}, default="Y"
    Rotation axis for single-qubit gates.
entanglement : {"linear", "circular", "full"}, default="linear"
    Topology for entangling CNOT gates.
Raises¶
ValueError
    If reps is less than 1.
ValueError
    If rotation is not one of "X", "Y", "Z".
ValueError
    If entanglement is not one of "linear", "circular", "full".
ValueError
    If n_features is less than 1 (raised by parent class).
Analysis Module¶
analysis
¶
Analysis tools for quantum encoding properties.
This module provides quantitative analysis capabilities for quantum encodings, including resource counting, simulability analysis, expressibility computation, entanglement capability measurement, and trainability estimation.
Analysis Functions¶
The main analysis functions are:
- :func:`count_resources` : Count computational resources (gates, depth, qubits)
- :func:`get_resource_summary` : Quick resource summary from encoding properties
- :func:`get_gate_breakdown` : Detailed gate-by-gate breakdown
- :func:`compare_resources` : Compare resources across multiple encodings
- :func:`estimate_execution_time` : Estimate circuit execution time
- :func:`check_simulability` : Determine if classical simulation is efficient
- :func:`get_simulability_reason` : Get a concise simulability explanation
- :func:`is_clifford_circuit` : Check if encoding uses only Clifford gates
- :func:`is_matchgate_circuit` : Check if encoding uses only matchgate operations
- :func:`estimate_entanglement_bound` : Estimate upper bound on entanglement entropy
- :func:`compute_expressibility` : Measure Hilbert space coverage
- :func:`compute_entanglement_capability` : Measure entanglement generation
- :func:`estimate_trainability` : Detect barren plateau risk
Utility Functions¶
Lower-level utilities for custom analysis:
- :func:`simulate_encoding_statevector` : Simulate circuit to get statevector
- :func:`compute_fidelity` : Compute fidelity between pure states
- :func:`compute_purity` : Compute purity of a density matrix
- :func:`partial_trace_single_qubit` : Compute reduced density matrix
Examples¶
Basic usage:

```python
from encoding_atlas import AngleEncoding, IQPEncoding
from encoding_atlas.analysis import count_resources, check_simulability

# Compare resources between encodings
enc_simple = AngleEncoding(n_features=4)
enc_complex = IQPEncoding(n_features=4, reps=2)

res_simple = count_resources(enc_simple)
res_complex = count_resources(enc_complex)

print(f"Simple encoding: {res_simple['gate_count']} gates")
print(f"Complex encoding: {res_complex['gate_count']} gates")
```

Advanced analysis:

```python
from encoding_atlas.analysis import simulate_encoding_statevector
import numpy as np

enc = AngleEncoding(n_features=2)
x = np.array([0.5, 1.0])
state = simulate_encoding_statevector(enc, x)
print(f"State dimension: {len(state)}")
```
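For intuition, the fidelity and purity utilities listed above amount to a few lines of NumPy. This is an illustrative sketch; the library's `compute_fidelity` and `compute_purity` may differ in validation and broadcasting:

```python
import numpy as np

def fidelity(psi: np.ndarray, phi: np.ndarray) -> float:
    """|<psi|phi>|^2 for pure states."""
    return abs(np.vdot(psi, phi)) ** 2

def purity(rho: np.ndarray) -> float:
    """Tr(rho^2): 1 for pure states, 1/d for the maximally mixed state."""
    return float(np.real(np.trace(rho @ rho)))

psi = np.array([1.0, 0.0])
print(fidelity(psi, psi))     # 1.0
print(purity(np.eye(2) / 2))  # 0.5 (maximally mixed qubit)
```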
See Also¶
encoding_atlas.core.properties : Encoding property dataclasses.
encoding_atlas.core.exceptions : Analysis-related exceptions.
EntanglementResult
¶
Bases: TypedDict
Result dictionary from entanglement capability computation.
Attributes¶
entanglement_capability : float
Average entanglement measure in [0, 1]. Higher values indicate
the encoding creates more entangled states on average.
entanglement_samples : NDArray[np.floating]
Array of entanglement measure values for each sampled input.
Shape: (n_samples,). The measure type depends on the measure
parameter used (Meyer-Wallach or Scott).
std_error : float
Standard error of the mean, computed as std / sqrt(n_samples).
Useful for assessing the reliability of the estimate.
n_samples : int
Number of random samples used in the computation.
per_qubit_entanglement : NDArray[np.floating]
Average linear entropy contribution from each qubit.
Shape: (n_qubits,). Useful for identifying which qubits
are most entangled. Note: Only meaningful for Meyer-Wallach
measure; returns zeros for Scott measure.
measure : str
The entanglement measure used: "meyer_wallach" or "scott".
scott_k : int | None
The k value used for Scott measure, or None if Meyer-Wallach was used.
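The Meyer-Wallach measure behind these fields can be sketched with NumPy; this is an illustrative re-implementation, not the library's code, and `meyer_wallach` is our name:

```python
import numpy as np

def meyer_wallach(psi: np.ndarray, n_qubits: int) -> float:
    """Q = 2 * (1 - mean_k Tr(rho_k^2)) for a pure n-qubit state."""
    psi = psi.reshape((2,) * n_qubits)
    purities = []
    for k in range(n_qubits):
        m = np.moveaxis(psi, k, 0).reshape(2, -1)  # qubit k vs the rest
        rho_k = m @ m.conj().T                     # reduced density matrix
        purities.append(np.real(np.trace(rho_k @ rho_k)))
    return 2.0 * (1.0 - float(np.mean(purities)))

product = np.kron([1, 0], [1, 0]).astype(complex)          # |00>
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # (|00>+|11>)/sqrt(2)
print(meyer_wallach(product, 2))  # 0.0: product state, no entanglement
print(meyer_wallach(bell, 2))     # 1.0: maximally entangled
```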
ExpressibilityResult
¶
Bases: TypedDict
Result dictionary from expressibility computation.
This TypedDict defines the structure of the result returned by
:func:`compute_expressibility` when return_distributions=True.
Attributes¶
expressibility : float
    Normalized expressibility score in [0, 1]. Higher values indicate the encoding's fidelity distribution is closer to Haar-random, meaning it can explore more of the Hilbert space.
kl_divergence : float
    Raw Kullback-Leibler divergence from the encoding's fidelity distribution to the Haar distribution. Lower values indicate higher expressibility.
fidelity_distribution : NDArray[np.floating]
    Histogram of sampled fidelities (probability for each bin). Shape: (n_bins,). Values sum to approximately 1.
haar_distribution : NDArray[np.floating]
    Theoretical Haar-random fidelity distribution evaluated at the same bin centers. Shape: (n_bins,).
bin_edges : NDArray[np.floating]
    Edges of the histogram bins. Shape: (n_bins + 1,). Range is [0, 1] for fidelity values.
n_samples : int
    Number of fidelity samples used in the computation.
n_bins : int
    Number of histogram bins used.
convergence_estimate : float
    Bootstrap estimate of the standard error of the KL divergence. Smaller values indicate more reliable results.
mean_fidelity : float
    Mean of the sampled fidelities. Useful for understanding the typical overlap between encoded states.
std_fidelity : float
    Standard deviation of the sampled fidelities.
Examples¶
```python
result = compute_expressibility(enc, return_distributions=True, seed=42)
print(f"Expressibility: {result['expressibility']:.4f}")
print(f"KL Divergence: {result['kl_divergence']:.4f}")
print(f"Mean Fidelity: {result['mean_fidelity']:.4f}")
```
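To make the expressibility pipeline concrete, the following NumPy sketch reproduces its core for Haar-random states: sample pairwise fidelities, histogram them, and compare against the exact Haar bin probabilities derived from the CDF 1 - (1-F)^(N-1). This illustrates the method under our own assumptions (sample counts, binning), not the library's implementation:

```python
import numpy as np

rng = np.random.default_rng(42)
n_qubits, n_samples, n_bins = 2, 4000, 20
N = 2 ** n_qubits

def haar_state() -> np.ndarray:
    """Haar-random pure state via a normalized complex Gaussian vector."""
    v = rng.normal(size=N) + 1j * rng.normal(size=N)
    return v / np.linalg.norm(v)

# sample fidelities between independent pairs of random states
fids = np.array([abs(np.vdot(haar_state(), haar_state())) ** 2
                 for _ in range(n_samples)])

edges = np.linspace(0.0, 1.0, n_bins + 1)
p, _ = np.histogram(fids, bins=edges)
p = p / p.sum()

# exact Haar probability per bin from the CDF 1 - (1 - F)^(N - 1)
q = (1 - edges[:-1]) ** (N - 1) - (1 - edges[1:]) ** (N - 1)

kl = float(np.sum(p[p > 0] * np.log(p[p > 0] / q[p > 0])))
print(kl)  # close to 0: Haar sampling matches the Haar distribution
```

For an actual encoding, the fidelities would come from pairs of encoded states rather than `haar_state()`, and a larger KL signals a less expressive encoding.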
SimulabilityResult
¶
Bases: TypedDict
Result of simulability analysis.
Attributes¶
is_simulable : bool
    True if the encoding circuit can be efficiently simulated classically.
simulability_class : str
    One of "simulable", "conditionally_simulable", "not_simulable".
reason : str
    Human-readable explanation of the classification.
details : dict[str, Any]
    Additional analysis details including:
- is_entangling: Whether circuit creates entanglement
- is_clifford: Whether circuit uses only Clifford gates
- is_matchgate: Whether circuit uses only matchgate operations
- entanglement_pattern: Description of entanglement structure
- two_qubit_gate_count: Number of two-qubit gates
- n_qubits: Number of qubits
- declared_simulability: Simulability from encoding properties
recommendations : list[str]
    Suggestions for simulation approaches.
TrainabilityResult
¶
Bases: TypedDict
Result of trainability estimation.
This TypedDict defines the structure of the detailed result returned
by :func:`estimate_trainability` when return_details=True.
Attributes¶
trainability_estimate : float
    Overall trainability score in the range [0, 1], where higher values indicate better trainability. This score is derived from the gradient variance using a logistic transformation.
gradient_variance : float
    Average variance of parameter gradients computed over successful samples only. This is the primary metric for barren plateau detection. Values near zero indicate a barren plateau.
barren_plateau_risk : {"low", "medium", "high"}
    Categorical assessment of barren plateau risk based on gradient variance and circuit size:
- "low": Variance is healthy, training should proceed normally
- "medium": Variance is borderline, training may be slow
- "high": Variance is very low, likely barren plateau
float
Estimated effective parameter dimension (HEURISTIC METRIC), computed
as the count of parameters with variance exceeding 1% of the mean
variance. This indicates how many parameters contribute meaningfully
to the optimization landscape. Note: This is not a standard metric
in barren plateau literature; it is provided for diagnostic purposes.
For rigorous analysis, examine per_parameter_variance directly.
n_samples : int
Total number of random parameter samples requested for the estimate.
n_successful_samples : int
Number of samples where gradient computation succeeded. This is the
actual sample size used for variance computation. Equal to
n_samples - n_failed_samples.
per_parameter_variance : NDArray[np.floating]
Array of variance values for each individual parameter, computed
over successful samples only. This can reveal if specific parameters
are more trainable than others.
n_failed_samples : int
Number of samples where gradient computation failed. High values
may indicate issues with the encoding or simulation backend.
Failed samples are excluded from variance computation to ensure
unbiased statistical estimates.
check_simulability
¶
check_simulability(encoding: BaseEncoding, detailed: bool = True) -> SimulabilityResult
Check whether an encoding circuit is classically simulable.
Analyzes the encoding's circuit structure to determine if it can be efficiently simulated on a classical computer. This analysis considers entanglement structure, gate set, and circuit topology.
Parameters¶
encoding : BaseEncoding
The encoding instance to analyze.
detailed : bool, default=True
If True, include detailed analysis in the result. If False, return a minimal result with just the classification.
Returns¶
SimulabilityResult
Dictionary containing:
- ``is_simulable`` : bool
True if efficiently classically simulable.
- ``simulability_class`` : str
One of "simulable", "conditionally_simulable", "not_simulable".
- ``reason`` : str
Human-readable explanation of the classification.
- ``details`` : dict
Additional analysis details (if detailed=True):
- is_entangling: Whether circuit creates entanglement
- is_clifford: Whether circuit uses only Clifford gates
- entanglement_pattern: Description of entanglement structure
- two_qubit_gate_count: Number of two-qubit gates
- n_qubits: Number of qubits
- declared_simulability: Simulability from encoding properties
- ``recommendations`` : list[str]
Suggestions for simulation approaches.
Raises¶
AnalysisError
If encoding is not a valid BaseEncoding instance or analysis fails.
Examples¶
Check a non-entangling encoding:
>>> from encoding_atlas import AngleEncoding
>>> enc = AngleEncoding(n_features=4)
>>> result = check_simulability(enc)
>>> print(result['simulability_class'])
simulable
>>> print(result['reason'])
Encoding produces only product states (no entanglement)
Check an entangling encoding:
>>> from encoding_atlas import IQPEncoding
>>> enc = IQPEncoding(n_features=4, reps=2)
>>> result = check_simulability(enc)
>>> print(result['simulability_class'])
not_simulable
>>> print(result['recommendations'])
['Use statevector simulation for small instances (< 20 qubits)', ...]
Quick check without details:
>>> result = check_simulability(enc, detailed=False)
>>> if result['is_simulable']:
...     print("Can use classical simulation")
Notes¶
Simulability Classes:
- simulable: Circuit can always be efficiently simulated. Examples: Clifford circuits, product state circuits.
- conditionally_simulable: Circuit may be simulable depending on input data, circuit parameters, or specific structure. Examples: circuits with linear/circular entanglement topology.
- not_simulable: Circuit is believed to be hard to simulate classically. Examples: IQP circuits, random circuits with high entanglement.
Limitations:
This analysis is based on known theoretical results and heuristics. It may be conservative (classifying some simulable circuits as not_simulable) but aims not to falsely claim simulability.
See Also¶
is_clifford_circuit : Check if circuit uses only Clifford gates.
estimate_entanglement_bound : Estimate entanglement entropy bound.
count_resources : Get detailed gate counts.
compare_resources
¶
compare_resources(encodings: list[BaseEncoding], metrics: list[str] | None = None, *, include_names: bool = True) -> dict[str, list[Any]]
Compare resources across multiple encodings.
Provides side-by-side comparison of resource metrics for a list of encodings. Useful for encoding selection and benchmarking.
Parameters¶
encodings : list[BaseEncoding]
List of encodings to compare. Must all be valid BaseEncoding instances.
metrics : list[str], optional
Specific metrics to compare. If None, compares all standard metrics. Available metrics:
- ``"n_qubits"``: Number of qubits
- ``"depth"``: Circuit depth
- ``"gate_count"``: Total gates
- ``"single_qubit_gates"``: Single-qubit gate count
- ``"two_qubit_gates"``: Two-qubit gate count
- ``"parameter_count"``: Parameterized gate count
- ``"two_qubit_ratio"``: Ratio of two-qubit gates
- ``"gates_per_qubit"``: Average gates per qubit
include_names : bool, default=True
If True, include encoding names in the result.
Returns¶
dict[str, list[Any]]
Dictionary mapping metric names to lists of values
(one per encoding). If include_names=True, includes
an "encoding_name" key with encoding class names.
Raises¶
AnalysisError
If any encoding is invalid.
ValueError
If encodings list is empty.
Examples¶
>>> from encoding_atlas import AngleEncoding, IQPEncoding
>>> from encoding_atlas.analysis import compare_resources
>>> encodings = [
...     AngleEncoding(n_features=4),
...     IQPEncoding(n_features=4, reps=1),
...     IQPEncoding(n_features=4, reps=2),
... ]
>>> comparison = compare_resources(encodings)
>>> print(comparison['encoding_name'])
['AngleEncoding', 'IQPEncoding', 'IQPEncoding']
>>> print(comparison['gate_count'])
[4, 26, 52]
Compare specific metrics:
>>> comparison = compare_resources(
...     encodings,
...     metrics=['gate_count', 'two_qubit_gates'],
... )
>>> print(comparison.keys())
dict_keys(['encoding_name', 'gate_count', 'two_qubit_gates'])
Notes¶
For data-dependent encodings, this function returns worst-case
(maximum) values from encoding properties. Use :func:count_resources
with specific input data for actual counts.
See Also¶
count_resources : Get resources for a single encoding.
get_resource_summary : Quick summary for a single encoding.
compute_all_parameter_gradients
¶
compute_all_parameter_gradients(encoding: BaseEncoding, x: NDArray[floating[Any]], observable: Literal['computational', 'global_z', 'local_z', 'pauli_z'] = 'computational', backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane') -> FloatArray
Compute gradients for all parameters using parameter-shift rule.
Parameters¶
encoding : BaseEncoding
The encoding instance.
x : NDArray[np.floating]
Input data vector.
observable : {"computational", "global_z", "local_z", "pauli_z"}, default="computational"
Observable to measure. See :func:compute_parameter_gradient for details.
backend : {"pennylane", "qiskit", "cirq"}, default="pennylane"
Simulation backend.
Returns¶
FloatArray
Array of gradients, one per parameter.
Examples¶
>>> from encoding_atlas import AngleEncoding
>>> import numpy as np
>>> enc = AngleEncoding(n_features=3)
>>> x = np.array([0.5, 1.0, 1.5])
>>> gradients = compute_all_parameter_gradients(enc, x)
>>> print(f"Gradients shape: {gradients.shape}")
Gradients shape: (3,)
compute_entanglement_capability
¶
compute_entanglement_capability(encoding: BaseEncoding, n_samples: int = ..., input_range: tuple[float, float] = ..., seed: int | None = ..., backend: Literal['pennylane', 'qiskit', 'cirq'] = ..., measure: Literal['meyer_wallach', 'scott'] = ..., scott_k: int | None = ..., return_details: Literal[False] = ..., verbose: bool = ...) -> float
compute_entanglement_capability(encoding: BaseEncoding, n_samples: int = ..., input_range: tuple[float, float] = ..., seed: int | None = ..., backend: Literal['pennylane', 'qiskit', 'cirq'] = ..., measure: Literal['meyer_wallach', 'scott'] = ..., scott_k: int | None = ..., return_details: Literal[True] = ..., verbose: bool = ...) -> EntanglementResult
compute_entanglement_capability(encoding: BaseEncoding, n_samples: int = _DEFAULT_N_SAMPLES, input_range: tuple[float, float] = _DEFAULT_INPUT_RANGE, seed: int | None = None, backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane', measure: Literal['meyer_wallach', 'scott'] = 'meyer_wallach', scott_k: int | None = None, return_details: bool = False, verbose: bool = False) -> Union[float, EntanglementResult]
Compute the entanglement capability of a quantum encoding.
Entanglement capability quantifies the average amount of entanglement produced by the encoding across random inputs. Higher values indicate the encoding creates more entangled states on average.
Parameters¶
encoding : BaseEncoding
The encoding instance to analyze. Must implement get_circuit().
n_samples : int, default=1000
Number of random inputs to sample for averaging. Higher values
give more accurate estimates but increase computation time.
Must be at least 10.
input_range : tuple[float, float], default=(0, 2*pi)
Range for sampling random input values. Features are sampled
uniformly from this range.
seed : int, optional
Random seed for reproducibility. If None, uses system entropy.
backend : {"pennylane", "qiskit", "cirq"}, default="pennylane"
Backend for circuit simulation.
measure : {"meyer_wallach", "scott"}, default="meyer_wallach"
Entanglement measure to use:
- ``"meyer_wallach"``: Average single-qubit purity deficit.
Faster and most commonly used. Equivalent to Scott measure with k=1.
- ``"scott"``: Scott measure with configurable k-body subsystems.
More detailed but slower. Use ``scott_k`` to specify subsystem size.
scott_k : int, optional
Subsystem size for Scott measure. Only used when measure="scott". Must satisfy 1 <= scott_k <= n_qubits - 1.
- If None (default): automatically selects min(2, n_qubits - 1). For 2-qubit systems this falls back to k=1 (Meyer-Wallach).
- If specified: uses the given k value, raising an error if invalid.
Ignored when measure="meyer_wallach".
return_details : bool, default=False
If True, return the full result dictionary with statistics. If False, return only the entanglement capability score.
verbose : bool, default=False
If True, log progress during computation.
Returns¶
float or EntanglementResult
If return_details=False: Entanglement capability in [0, 1].
If ``return_details=True``: Dictionary containing:
- ``entanglement_capability``: float in [0, 1]
- ``entanglement_samples``: per-sample entanglement values
- ``std_error``: standard error of mean
- ``n_samples``: number of samples used
- ``per_qubit_entanglement``: average entanglement per qubit
- ``measure``: the measure used ("meyer_wallach" or "scott")
- ``scott_k``: the k value used (None for Meyer-Wallach)
Raises¶
InsufficientSamplesError
If n_samples < 10.
SimulationError
If circuit simulation fails.
ValueError
If encoding has < 2 qubits (entanglement requires multiple qubits).
ValidationError
If encoding is invalid or input parameters are malformed.
Warns¶
UserWarning
If n_samples < 100, results may be unreliable.
Examples¶
Basic usage:
>>> from encoding_atlas import IQPEncoding
>>> enc = IQPEncoding(n_features=4, reps=2, entanglement="full")
>>> ent = compute_entanglement_capability(enc, seed=42)
>>> print(f"Entanglement: {ent:.4f}")
Compare different entanglement patterns:
>>> enc_linear = IQPEncoding(n_features=4, reps=2, entanglement="linear")
>>> enc_full = IQPEncoding(n_features=4, reps=2, entanglement="full")
>>> print(f"Linear: {compute_entanglement_capability(enc_linear, seed=42):.4f}")
>>> print(f"Full: {compute_entanglement_capability(enc_full, seed=42):.4f}")
Get detailed statistics:
>>> result = compute_entanglement_capability(
...     enc, n_samples=500, seed=42, return_details=True
... )
>>> print(f"Mean: {result['entanglement_capability']:.4f}")
>>> print(f"Std Error: {result['std_error']:.4f}")
>>> print(f"Per-qubit: {result['per_qubit_entanglement']}")
Notes¶
For non-entangling encodings (e.g., AngleEncoding), the entanglement capability will be exactly 0.0, as all generated states are product states.
The computation time scales linearly with n_samples and exponentially
with the number of qubits (due to statevector simulation).
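The sampling loop behind this estimate can be sketched with plain NumPy. The `estimate_capability` helper and the closed-form toy measure below are illustrative assumptions, not the library's code:

```python
import numpy as np

def estimate_capability(entanglement_of, n_features, n_samples=500,
                        input_range=(0.0, 2 * np.pi), seed=42):
    """Monte-Carlo average of a per-input entanglement value, plus the
    standard error that return_details=True reports."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(*input_range, size=(n_samples, n_features))
    vals = np.array([entanglement_of(x) for x in xs])
    return vals.mean(), vals.std(ddof=1) / np.sqrt(n_samples)

# Toy stand-in: the state cos(x)|00> + sin(x)|11> has Meyer-Wallach
# value sin^2(2x) in closed form, so no simulation is needed here.
mean, se = estimate_capability(lambda x: np.sin(2 * x[0]) ** 2, n_features=1)
print(f"capability ~ {mean:.3f} +/- {se:.3f}")
```

The real function replaces the lambda with a statevector simulation of the encoding circuit, which is where the exponential cost in qubit count enters.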
See Also¶
compute_meyer_wallach : Compute MW measure for a single state.
compute_scott_measure : Compute Scott measure for a single state.
compute_expressibility : Related metric for Hilbert space coverage.
compute_expressibility
¶
compute_expressibility(encoding: BaseEncoding, n_samples: int = ..., n_bins: int | None = ..., input_range: tuple[float, float] = ..., seed: int | None = ..., backend: Literal['pennylane', 'qiskit', 'cirq'] = ..., return_distributions: Literal[False] = ..., verbose: bool = ...) -> float
compute_expressibility(encoding: BaseEncoding, n_samples: int = ..., n_bins: int | None = ..., input_range: tuple[float, float] = ..., seed: int | None = ..., backend: Literal['pennylane', 'qiskit', 'cirq'] = ..., return_distributions: Literal[True] = ..., verbose: bool = ...) -> ExpressibilityResult
compute_expressibility(encoding: BaseEncoding, n_samples: int = _DEFAULT_N_SAMPLES, n_bins: int | None = None, input_range: tuple[float, float] = _DEFAULT_INPUT_RANGE, seed: int | None = None, backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane', return_distributions: bool = False, verbose: bool = False) -> float | ExpressibilityResult
Compute the expressibility of a quantum encoding.
Expressibility quantifies how well the encoding can explore the Hilbert space compared to Haar-random states. Higher expressibility (closer to 1.0) indicates the encoding can generate a more diverse set of quantum states, which is generally desirable for variational quantum algorithms.
The computation works by:
- Sampling random input pairs and computing their quantum states
- Computing fidelities between state pairs
- Building a histogram of fidelities
- Comparing to the theoretical Haar-random distribution via KL divergence
- Normalizing to a [0, 1] score
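The steps above can be sketched end-to-end with plain NumPy. `expressibility_sketch` below is an illustrative reimplementation, not the library's internals; it samples Haar-random states directly, so the resulting KL divergence should come out near zero:

```python
import numpy as np

def haar_state(dim: int, rng) -> np.ndarray:
    """Draw a Haar-random pure state of the given dimension."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def expressibility_sketch(sample_state, dim, n_samples=2000, n_bins=40, seed=0):
    """Sample state pairs, compute fidelities, histogram them,
    and compare against the Haar distribution via KL divergence."""
    rng = np.random.default_rng(seed)
    fids = np.array([abs(np.vdot(sample_state(rng), sample_state(rng))) ** 2
                     for _ in range(n_samples)])
    p, edges = np.histogram(fids, bins=n_bins, range=(0.0, 1.0))
    p = p / p.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    q = (dim - 1) * (1.0 - centers) ** (dim - 2)  # Haar reference density
    q = q / q.sum()
    eps = 1e-12  # smoothing to avoid log(0)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

dim = 2 ** 3  # three qubits
kl = expressibility_sketch(lambda rng: haar_state(dim, rng), dim)
print(f"KL vs Haar: {kl:.3f}")  # near zero: the sampler is itself Haar
```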
Parameters¶
encoding : BaseEncoding
The encoding instance to analyze. Must have n_features and
n_qubits attributes and implement get_circuit().
n_samples : int, default=5000
Number of random input pairs to sample. Higher values give more
accurate results but increase computation time. A minimum of 1000
samples is recommended for reliable results.
n_bins : int or None, default=None
Number of bins for the fidelity histogram. If None (default),
automatically set to min(75, n_samples) so that small sample
counts don't require manually tuning the bin count. If provided
explicitly, must satisfy 10 ≤ n_bins ≤ n_samples.
Rule of thumb: n_bins ≈ √n_samples.
input_range : tuple[float, float], default=(0, 2π)
Range for sampling random input values. The default covers the
full rotation range for typical quantum gates.
seed : int, optional
Random seed for reproducibility. If None, results will vary
between runs. Always specify a seed for reproducible analysis.
backend : {"pennylane", "qiskit", "cirq"}, default="pennylane"
Backend to use for circuit simulation.
- ``"pennylane"``: Uses PennyLane's default.qubit simulator (recommended)
- ``"qiskit"``: Uses Qiskit's Statevector class
- ``"cirq"``: Uses Cirq's Simulator for statevector simulation
return_distributions : bool, default=False
If True, return the full result dictionary including fidelity distributions, bin edges, and statistics. If False, return only the expressibility score (float).
verbose : bool, default=False
If True, log progress information during computation.
Returns¶
float or ExpressibilityResult
If return_distributions=False: Expressibility score in [0, 1].
Higher values indicate more expressive encodings.
If ``return_distributions=True``: Dictionary containing:
- **expressibility**: float in [0, 1], higher = more expressive
- **kl_divergence**: raw KL divergence value
- **fidelity_distribution**: sampled fidelity histogram
- **haar_distribution**: theoretical Haar distribution
- **bin_edges**: histogram bin edges
- **n_samples**: number of samples used
- **n_bins**: number of histogram bins
- **convergence_estimate**: bootstrap standard error estimate
- **mean_fidelity**: mean of sampled fidelities
- **std_fidelity**: standard deviation of sampled fidelities
Raises¶
InsufficientSamplesError
If n_samples < 10 (too few for meaningful statistics).
SimulationError
If circuit simulation fails for the given backend.
AnalysisError
If encoding is invalid or computation fails.
ValueError
If explicit n_bins < 10 or n_bins > n_samples, or if
input_range[0] >= input_range[1].
Warns¶
UserWarning
If n_samples < 100 (results may be unreliable).
UserWarning
If encoding has > 10 qubits (computation may be slow).
Examples¶
Basic usage with default parameters:
>>> from encoding_atlas import AngleEncoding
>>> enc = AngleEncoding(n_features=4)
>>> expr = compute_expressibility(enc, seed=42)
>>> print(f"Expressibility: {expr:.4f}")
Expressibility: ...
Get full result with distributions:
>>> result = compute_expressibility(enc, return_distributions=True, seed=42)
>>> print(f"KL Divergence: {result['kl_divergence']:.4f}")
>>> print(f"Samples used: {result['n_samples']}")
Compare two encodings:
>>> from encoding_atlas import IQPEncoding
>>> enc1 = AngleEncoding(n_features=4)  # Non-entangling
>>> enc2 = IQPEncoding(n_features=4, reps=2)  # Entangling
>>> expr1 = compute_expressibility(enc1, n_samples=1000, seed=42)
>>> expr2 = compute_expressibility(enc2, n_samples=1000, seed=42)
>>> print(f"AngleEncoding: {expr1:.4f}")
>>> print(f"IQPEncoding: {expr2:.4f}")

IQP typically shows higher expressibility due to entanglement.
Notes¶
Computational Complexity
The complexity is O(n_samples × simulation_cost), where simulation_cost scales as O(2^n_qubits). For encodings with more than 10 qubits, consider reducing n_samples or using approximate methods.
Interpretation
- Expressibility ≈ 0: Encoding produces states in a narrow region of Hilbert space (e.g., product states only)
- Expressibility ≈ 1: Encoding produces states distributed like Haar-random states (explores full Hilbert space)
Normalization Formula
The expressibility score is normalized from KL divergence using::
expr = 1.0 - min(1.0, kl_divergence / MAX_KL)
This maps KL divergence to [0, 1] where higher values indicate distributions closer to Haar-random (more expressive).
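As a minimal sketch of this mapping (MAX_KL is a library-internal constant; the value 10.0 below is only a placeholder assumption):

```python
def normalize_expressibility(kl_divergence: float, max_kl: float = 10.0) -> float:
    """Map a KL divergence onto [0, 1]; higher means closer to Haar."""
    return 1.0 - min(1.0, kl_divergence / max_kl)

print(normalize_expressibility(0.0))   # identical to Haar -> 1.0
print(normalize_expressibility(25.0))  # clipped at the cap -> 0.0
```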
Statistical Reliability
The convergence_estimate in the result dictionary provides a
bootstrap estimate of the standard error. Values < 0.05 generally
indicate reliable results.
See Also¶
compute_fidelity_distribution : Compute fidelity distribution only.
compute_haar_distribution : Compute theoretical Haar distribution.
compute_entanglement_capability : Related metric for entanglement.
References¶
.. [1] Sim, S., Johnson, P. D., & Aspuru-Guzik, A. (2019). "Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms." Advanced Quantum Technologies, 2(12), 1900070.
compute_fidelity
¶
Compute the fidelity between two pure quantum states.
The fidelity between two pure states is defined as:
F(|ψ⟩, |φ⟩) = |⟨ψ|φ⟩|²
This measures how "close" two quantum states are, with F = 1 for identical states (up to global phase) and F = 0 for orthogonal states.
Parameters¶
state1 : StatevectorType
First statevector.
state2 : StatevectorType
Second statevector. Must have the same dimension as state1.
Returns¶
float
Fidelity value in the range [0, 1].
Raises¶
ValueError
If states have different dimensions.
ValidationError
If states are invalid.
Examples¶
Identical states have fidelity 1:
>>> import numpy as np
>>> state = np.array([1, 0, 0, 0], dtype=complex)
>>> fidelity = compute_fidelity(state, state)
>>> print(f"Fidelity: {fidelity}")
Fidelity: 1.0
Orthogonal states have fidelity 0:
>>> state1 = np.array([1, 0], dtype=complex)  # |0⟩
>>> state2 = np.array([0, 1], dtype=complex)  # |1⟩
>>> fidelity = compute_fidelity(state1, state2)
>>> print(f"Fidelity: {fidelity}")
Fidelity: 0.0
States differing by a global phase have fidelity 1:
>>> state1 = np.array([1, 0], dtype=complex)
>>> state2 = np.array([1j, 0], dtype=complex)  # |0⟩ with phase
>>> fidelity = compute_fidelity(state1, state2)
>>> print(f"Fidelity: {fidelity:.6f}")
Fidelity: 1.000000
Notes¶
The implementation uses :func:numpy.vdot which conjugates the first
argument, computing ⟨state1|state2⟩ = Σᵢ state1ᵢ* · state2ᵢ.
For mixed states (density matrices), use the more general formula: F(ρ, σ) = (Tr√(√ρ σ √ρ))²
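A minimal NumPy sketch matching the documented behavior (an illustrative reimplementation, not the library's source):

```python
import numpy as np

def fidelity(state1: np.ndarray, state2: np.ndarray) -> float:
    """F = |<psi|phi>|^2; np.vdot conjugates its first argument."""
    if state1.shape != state2.shape:
        raise ValueError("states must have the same dimension")
    return float(abs(np.vdot(state1, state2)) ** 2)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
print(fidelity(ket0, ket0))       # identical states -> 1.0
print(fidelity(ket0, ket1))       # orthogonal states -> 0.0
print(fidelity(ket0, 1j * ket0))  # global phase ignored -> 1.0
```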
See Also¶
compute_purity : Compute purity of a density matrix.
compute_fidelity_distribution
¶
compute_fidelity_distribution(encoding: BaseEncoding, n_samples: int = _DEFAULT_N_SAMPLES, input_range: tuple[float, float] = _DEFAULT_INPUT_RANGE, seed: int | None = None, backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane') -> NDArray[floating[Any]]
Compute the fidelity distribution for an encoding.
Samples random input pairs and computes the fidelity between their corresponding quantum states. This is useful for analyzing the encoding's behavior without computing the full expressibility.
Parameters¶
encoding : BaseEncoding
The encoding to analyze.
n_samples : int, default=5000
Number of random input pairs to sample.
input_range : tuple[float, float], default=(0, 2π)
Range for random input sampling.
seed : int, optional
Random seed for reproducibility.
backend : {"pennylane", "qiskit", "cirq"}, default="pennylane"
Backend for circuit simulation.
- ``"pennylane"``: Uses PennyLane's default.qubit simulator (recommended)
- ``"qiskit"``: Uses Qiskit's Statevector class
- ``"cirq"``: Uses Cirq's Simulator for statevector simulation
Returns¶
NDArray[np.floating]
Array of fidelity values, shape (n_samples,), each in [0, 1].
Raises¶
InsufficientSamplesError
If n_samples < 10.
SimulationError
If circuit simulation fails.
AnalysisError
If encoding is invalid.
Examples¶
>>> from encoding_atlas import IQPEncoding
>>> enc = IQPEncoding(n_features=4, reps=1)
>>> fidelities = compute_fidelity_distribution(enc, n_samples=1000, seed=42)
>>> print(f"Mean fidelity: {fidelities.mean():.4f}")
>>> print(f"Std fidelity: {fidelities.std():.4f}")
Notes¶
The fidelity F = |⟨ψ₁|ψ₂⟩|² measures the overlap between two quantum states. F = 1 means identical states (up to global phase), F = 0 means orthogonal states.
For Haar-random states in high dimensions, most pairs have F ≈ 0 because random high-dimensional vectors are typically nearly orthogonal.
See Also¶
compute_expressibility : Full expressibility computation.
compute_haar_distribution : Theoretical Haar distribution.
compute_gradient_variance
¶
compute_gradient_variance(encoding: BaseEncoding, n_samples: int = _DEFAULT_N_SAMPLES, input_range: tuple[float, float] = _DEFAULT_INPUT_RANGE, observable: Literal['computational', 'pauli_z', 'global_z'] = 'computational', seed: int | None = None, backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane') -> float
Compute the average variance of parameter gradients.
This is the core metric used for trainability estimation. It measures how much the gradients vary across random parameter initializations. Low variance indicates a flat optimization landscape (barren plateau).
This function is a convenience wrapper that calls :func:estimate_trainability
and extracts only the gradient variance.
Parameters¶
encoding : BaseEncoding
The encoding to analyze.
n_samples : int, default=500
Number of random parameter samples.
input_range : tuple[float, float], default=(0, 2π)
Range for sampling random parameters.
observable : {"computational", "pauli_z", "global_z"}, default="computational"
Observable for gradient computation.
seed : int, optional
Random seed for reproducibility.
backend : {"pennylane", "qiskit", "cirq"}, default="pennylane"
Backend for gradient computation.
Returns¶
float
Average gradient variance across parameters. Values close to zero indicate a barren plateau.
Examples¶
>>> from encoding_atlas import AngleEncoding
>>> enc = AngleEncoding(n_features=4)
>>> variance = compute_gradient_variance(enc, n_samples=100, seed=42)
>>> print(f"Gradient variance: {variance:.2e}")
See Also¶
estimate_trainability : Full trainability analysis with risk assessment.
detect_barren_plateau : Categorize variance into risk levels.
compute_haar_distribution
¶
compute_haar_distribution(n_qubits: int, fidelity_values: NDArray[floating[Any]]) -> NDArray[floating[Any]]
Compute the Haar-random fidelity distribution.
For a d-dimensional Hilbert space (d = 2^n_qubits), the probability density of fidelities between Haar-random states is:
P_Haar(F) = (d - 1)(1 - F)^(d - 2)
This is derived from the fact that Haar-random states are uniformly distributed on the complex unit sphere.
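An illustrative implementation of this formula, including the log-space evaluation mentioned in the Notes below (a sketch, not the library's source):

```python
import numpy as np

def haar_pdf(n_qubits: int, fidelity_values: np.ndarray) -> np.ndarray:
    """P_Haar(F) = (d-1)(1-F)^(d-2), normalized over the supplied bins."""
    if n_qubits < 1:
        raise ValueError("n_qubits must be >= 1")
    d = 2 ** n_qubits
    if d == 2:
        # Single qubit: the fidelity distribution is uniform on [0, 1].
        p = np.ones_like(fidelity_values, dtype=float)
    else:
        # Log-space evaluation avoids underflow of (1-F)^(d-2) for large d.
        with np.errstate(divide="ignore"):
            p = np.exp(np.log(d - 1) + (d - 2) * np.log1p(-fidelity_values))
    return p / p.sum()

centers = np.linspace(0.005, 0.995, 100)  # evenly spaced bin centers
p = haar_pdf(n_qubits=4, fidelity_values=centers)
print(f"sum = {p.sum():.4f}, peak at F = {centers[p.argmax()]:.3f}")
```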
Parameters¶
n_qubits : int
Number of qubits (determines Hilbert space dimension d = 2^n_qubits).
fidelity_values : NDArray[np.floating]
Array of fidelity values at which to evaluate the distribution. Should be in [0, 1].
Returns¶
NDArray[np.floating]
Probability density values at each fidelity point, normalized to sum to 1 (assuming fidelity_values are evenly spaced bin centers).
Raises¶
ValueError
If n_qubits < 1.
Examples¶
>>> import numpy as np
>>> fidelities = np.linspace(0, 1, 100)
>>> P_haar = compute_haar_distribution(n_qubits=4, fidelity_values=fidelities)
>>> print(f"P_Haar shape: {P_haar.shape}")
>>> print(f"P_Haar sum: {P_haar.sum():.4f}")  # Should be close to 1
Notes¶
Physical Interpretation
The Haar distribution represents what we would observe if we picked two states uniformly at random from the Hilbert space. For large d:
- Most pairs have very low fidelity (nearly orthogonal)
- High fidelity pairs are exponentially rare
- The distribution is concentrated near F = 0
Numerical Considerations
For large d (many qubits), the term (1-F)^(d-2) can cause underflow. We use log-space computation for d > 100 to maintain numerical stability.
Single Qubit Special Case
For d = 2 (single qubit), P_Haar(F) = 1 (uniform distribution), because the Bloch sphere has special geometry where all fidelities are equally likely.
References¶
.. [1] Życzkowski, K., & Sommers, H.-J. (2001). "Induced measures in the space of mixed quantum states." Journal of Physics A, 34(35), 7111.
compute_linear_entropy
¶
Compute the linear entropy of a density matrix.
The linear entropy is a measure of mixedness defined as:
S_L(ρ) = 1 - Tr(ρ²) = 1 - γ(ρ)
where γ(ρ) is the purity.
Properties
- For a pure state: S_L = 0
- For a maximally mixed state of dimension d: S_L = 1 - 1/d
- In general: 0 ≤ S_L ≤ 1 - 1/d
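Both the purity and the linear entropy are one-liners over the density matrix; a minimal sketch (illustrative helpers, not the library's implementations):

```python
import numpy as np

def purity(rho: np.ndarray) -> float:
    """gamma(rho) = Tr(rho^2)."""
    return float(np.trace(rho @ rho).real)

def linear_entropy(rho: np.ndarray) -> float:
    """S_L(rho) = 1 - Tr(rho^2) = 1 - purity."""
    return 1.0 - purity(rho)

rho_pure = np.array([[1, 0], [0, 0]], dtype=complex)       # |0><0|
rho_mixed = np.array([[0.5, 0], [0, 0.5]], dtype=complex)  # I/2
print(linear_entropy(rho_pure))   # pure state -> 0.0
print(linear_entropy(rho_mixed))  # maximally mixed, d=2 -> 1 - 1/2 = 0.5
```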
Parameters¶
density_matrix : DensityMatrixType
Density matrix (must be square).
Returns¶
float
Linear entropy value in the range [0, 1 - 1/d].
Examples¶
Pure state has zero linear entropy:
>>> import numpy as np
>>> rho_pure = np.array([[1, 0], [0, 0]], dtype=complex)
>>> entropy = compute_linear_entropy(rho_pure)
>>> print(f"Linear entropy: {entropy}")
Linear entropy: 0.0
Maximally mixed state has maximum linear entropy:
>>> rho_mixed = np.array([[0.5, 0], [0, 0.5]], dtype=complex)
>>> entropy = compute_linear_entropy(rho_mixed)
>>> print(f"Linear entropy: {entropy}")
Linear entropy: 0.5
See Also¶
compute_purity : Compute purity (1 - linear entropy).
compute_von_neumann_entropy : Compute von Neumann entropy.
compute_meyer_wallach
¶
Compute the Meyer-Wallach entanglement measure for a pure state.
The Meyer-Wallach measure Q(|psi>) quantifies global entanglement as the average linear entropy across all single-qubit reduced states:
Q(|psi>) = (2/n) * sum_i (1 - Tr(rho_i^2))
where rho_i is the reduced density matrix of qubit i.
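The measure can be sketched with a reshape-based partial trace (an illustrative reimplementation; the library's source may differ):

```python
import numpy as np

def meyer_wallach(statevector: np.ndarray, n_qubits: int) -> float:
    """Q = (2/n) * sum_i (1 - Tr(rho_i^2)) over single-qubit reductions."""
    if statevector.size != 2 ** n_qubits:
        raise ValueError("statevector length must be 2**n_qubits")
    psi = statevector.reshape([2] * n_qubits)
    total = 0.0
    for i in range(n_qubits):
        m = np.moveaxis(psi, i, 0).reshape(2, -1)  # qubit i as row index
        rho_i = m @ m.conj().T                     # trace out the rest
        total += 1.0 - float(np.trace(rho_i @ rho_i).real)
    return 2.0 * total / n_qubits

ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1.0 / np.sqrt(2)
product = np.zeros(8, dtype=complex)
product[0] = 1.0
print(f"{meyer_wallach(ghz, 3):.4f}")      # maximally entangled -> 1.0000
print(f"{meyer_wallach(product, 3):.4f}")  # product state -> 0.0000
```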
Parameters¶
statevector : NDArray[np.complexfloating]
Pure state vector, shape (2^n_qubits,).
n_qubits : int
Number of qubits in the system.
Returns¶
float
Meyer-Wallach measure in [0, 1].
- 0 = completely separable (product state)
- 1 = maximally entangled
Raises¶
ValueError
If statevector length doesn't match 2^n_qubits.
ValidationError
If statevector is invalid (NaN, zero norm, etc.).
Examples¶
GHZ state: (|000> + |111>) / sqrt(2) has MW = 1:
>>> import numpy as np
>>> ghz = np.zeros(8, dtype=complex)
>>> ghz[0] = ghz[7] = 1.0 / np.sqrt(2)
>>> mw = compute_meyer_wallach(ghz, n_qubits=3)
>>> print(f"GHZ Meyer-Wallach: {mw:.4f}")
GHZ Meyer-Wallach: 1.0000
Product state |000> has MW = 0:
>>> product = np.zeros(8, dtype=complex)
>>> product[0] = 1.0
>>> mw = compute_meyer_wallach(product, n_qubits=3)
>>> print(f"Product Meyer-Wallach: {mw:.4f}")
Product Meyer-Wallach: 0.0000
Bell state (|00> + |11>) / sqrt(2) has MW = 1:
>>> bell = np.zeros(4, dtype=complex)
>>> bell[0] = bell[3] = 1.0 / np.sqrt(2)
>>> mw = compute_meyer_wallach(bell, n_qubits=2)
>>> print(f"Bell Meyer-Wallach: {mw:.4f}")
Bell Meyer-Wallach: 1.0000
Notes¶
The Meyer-Wallach measure is invariant under local unitary operations. It reaches its maximum value of 1 for maximally entangled states like the GHZ state (|00...0> + |11...1>) / sqrt(2).
See Also¶
compute_meyer_wallach_with_breakdown : Also returns per-qubit contributions.
compute_scott_measure : Generalization to k-body reduced states.
compute_meyer_wallach_with_breakdown
¶
compute_meyer_wallach_with_breakdown(statevector: StatevectorType, n_qubits: int) -> tuple[float, FloatArray]
Compute Meyer-Wallach measure with per-qubit breakdown.
This function returns both the total Meyer-Wallach measure and the contribution from each qubit, which is useful for understanding which qubits are most entangled.
Parameters¶
statevector : NDArray[np.complexfloating]
Pure state vector, shape (2^n_qubits,).
n_qubits : int
Number of qubits in the system.
Returns¶
mw_value : float
Total Meyer-Wallach measure in [0, 1].
per_qubit_entropy : NDArray[np.floating]
Linear entropy (1 - purity) for each qubit's reduced state.
Shape: (n_qubits,).
Raises¶
ValueError
If statevector length doesn't match 2^n_qubits, or if
n_qubits < 1.
ValidationError
If statevector is invalid.
Examples¶
>>> import numpy as np
>>> # W state: (|001> + |010> + |100>) / sqrt(3)
>>> w_state = np.zeros(8, dtype=complex)
>>> w_state[1] = w_state[2] = w_state[4] = 1.0 / np.sqrt(3)
>>> mw, per_qubit = compute_meyer_wallach_with_breakdown(w_state, n_qubits=3)
>>> print(f"Total MW: {mw:.4f}")
>>> print(f"Per-qubit entropies: {per_qubit}")
Notes¶
The per-qubit breakdown is useful for:
- Identifying which qubits contribute most to entanglement
- Detecting asymmetric entanglement structures
- Debugging encoding circuits
For the GHZ state, all qubits have equal linear entropy (= 0.5). For the W state, all qubits also have equal linear entropy (= 4/9 ≈ 0.44).
compute_parameter_gradient
¶
compute_parameter_gradient(encoding: BaseEncoding, x: NDArray[floating[Any]], param_index: int, observable: Literal['computational', 'global_z', 'local_z', 'pauli_z'] = 'computational', backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane') -> float
Compute gradient of expectation value with respect to one parameter.
Uses the parameter-shift rule for quantum gradients:
∂⟨O⟩/∂θ = (⟨O⟩_{θ+π/2} - ⟨O⟩_{θ-π/2}) / 2
This is exact for gates of the form U(θ) = exp(-iθG/2) where G² = I, which includes all standard rotation gates (RX, RY, RZ).
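The rule is easy to verify against a closed-form case: for RY(θ)|0⟩ the expectation ⟨Z⟩ = cos θ, so the gradient should equal -sin θ. A self-contained check, independent of the library:

```python
import numpy as np

def ry(theta: float) -> np.ndarray:
    """RY(theta) = exp(-i * theta * Y / 2) as a 2x2 matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def expval_z(theta: float) -> float:
    """<Z> for RY(theta)|0>, which equals cos(theta)."""
    psi = ry(theta) @ np.array([1, 0], dtype=complex)
    probs = np.abs(psi) ** 2
    return float(probs[0] - probs[1])

def parameter_shift(theta: float) -> float:
    """(f(theta + pi/2) - f(theta - pi/2)) / 2 — exact for rotation gates."""
    return (expval_z(theta + np.pi / 2) - expval_z(theta - np.pi / 2)) / 2

theta = 0.7
print(f"shift rule: {parameter_shift(theta):.6f}")  # matches -sin(theta)
print(f"analytic:   {-np.sin(theta):.6f}")
```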
Parameters¶
encoding : BaseEncoding
The encoding instance.
x : NDArray[np.floating]
Input data vector.
param_index : int
Index of the parameter to differentiate with respect to.
observable : {"computational", "global_z", "local_z", "pauli_z"}, default="computational"
Observable to measure:
- ``"computational"``: Probability of |0...0⟩ state, i.e., ⟨0...0|ρ|0...0⟩
- ``"global_z"``: Expectation value of Z⊗Z⊗...⊗Z (tensor product of Z on
all qubits). Eigenvalue is (-1)^(number of 1s in bitstring).
- ``"local_z"``: Expectation value of Z on first qubit only, ⟨Z₀⟩.
Computed as P(0 on qubit 0) - P(1 on qubit 0) = 2·P(0) - 1.
- ``"pauli_z"``: **Deprecated alias for "global_z"**. For clarity,
use ``"global_z"`` for the global Z-string observable or ``"local_z"``
for the single-qubit Z observable.
backend : {"pennylane", "qiskit", "cirq"}, default="pennylane"
Simulation backend.
Returns¶
float
Gradient of the expectation value.
Raises¶
ValidationError
If param_index is out of range.
ValueError
If observable is not a recognized option.
Examples¶
>>> from encoding_atlas import AngleEncoding
>>> import numpy as np
>>> enc = AngleEncoding(n_features=2)
>>> x = np.array([0.5, 1.0])
>>> grad = compute_parameter_gradient(enc, x, param_index=0)
>>> print(f"Gradient: {grad:.6f}")
Notes¶
The parameter-shift rule requires two circuit evaluations per gradient
component. For full gradient vectors, use :func:compute_all_parameter_gradients.
Observable Semantics
The distinction between "global_z" and "local_z" is important:
- "global_z" measures the parity of all qubits simultaneously. The expectation value is ⟨Z⊗Z⊗...⊗Z⟩ where each Z acts on a different qubit. This captures global correlations in the encoded state.
- "local_z" measures only the first qubit (MSB in computational basis ordering), ignoring the state of other qubits. This is useful for local sensitivity analysis.
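Both observables can be computed directly from a statevector's probabilities; the helpers below are illustrative sketches, not the library's code:

```python
import numpy as np

def global_z(statevector: np.ndarray) -> float:
    """<Z x Z x ... x Z>: eigenvalue (-1)^popcount(b) per basis state b."""
    probs = np.abs(statevector) ** 2
    signs = np.array([(-1) ** bin(b).count("1") for b in range(len(probs))])
    return float(np.dot(probs, signs))

def local_z(statevector: np.ndarray, n_qubits: int) -> float:
    """<Z_0> on the first qubit (MSB): 2 * P(qubit 0 = 0) - 1."""
    probs = np.abs(statevector) ** 2
    p0 = probs[: len(probs) // 2].sum()  # MSB = 0 is the first half
    return float(2.0 * p0 - 1.0)

bell = np.zeros(4, dtype=complex)
bell[0] = bell[3] = 1.0 / np.sqrt(2)       # (|00> + |11>) / sqrt(2)
print(f"global_z: {global_z(bell):.4f}")   # even parity in both branches
print(f"local_z:  {local_z(bell, 2):.4f}")  # first qubit maximally mixed
```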
See Also¶
compute_all_parameter_gradients : Compute gradients for all parameters.
compute_purity
¶
Compute the purity of a density matrix.
The purity of a quantum state ρ is defined as:
γ(ρ) = Tr(ρ²)
Properties
- For a pure state: γ = 1
- For a maximally mixed state of dimension d: γ = 1/d
- In general: 1/d ≤ γ ≤ 1
Parameters¶
density_matrix : DensityMatrixType
Density matrix (must be square). Can be any valid quantum state, including mixed states.
Returns¶
float
Purity value in the range [1/d, 1] where d is the dimension.
Raises¶
ValidationError
If the density matrix is not square or contains invalid values.
Examples¶
Pure state |0⟩ has purity 1:
```python
>>> import numpy as np
>>> rho_pure = np.array([[1, 0], [0, 0]], dtype=complex)
>>> purity = compute_purity(rho_pure)
>>> print(f"Purity: {purity}")
Purity: 1.0
```
Maximally mixed state has purity 1/d:
```python
>>> rho_mixed = np.array([[0.5, 0], [0, 0.5]], dtype=complex)
>>> purity = compute_purity(rho_mixed)
>>> print(f"Purity: {purity}")
Purity: 0.5
```
Notes¶
The purity is related to the linear entropy by: S_L(ρ) = 1 - γ(ρ)
and to the participation ratio by:
PR = 1/γ(ρ)
which measures the "effective dimension" of the state.
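These relations are straightforward to check numerically. The following is a minimal NumPy sketch of the purity formula and its derived quantities, not the library's `compute_purity`:

```python
import numpy as np

def purity(rho: np.ndarray) -> float:
    """gamma(rho) = Tr(rho @ rho); the real part guards against tiny imaginary noise."""
    return float(np.real(np.trace(rho @ rho)))

rho_pure = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|, purity 1
rho_mixed = np.eye(2, dtype=complex) / 2               # maximally mixed, purity 1/d = 0.5

linear_entropy = 1.0 - purity(rho_mixed)               # S_L = 1 - gamma
participation_ratio = 1.0 / purity(rho_mixed)          # PR = 1/gamma, the "effective dimension"
```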
See Also¶
compute_linear_entropy : Compute linear entropy from purity. compute_fidelity : Compute fidelity between pure states.
compute_scott_measure
¶
Compute the Scott entanglement measure.
The Scott measure generalizes Meyer-Wallach to k-body reduced states, averaging the linear entropy over all k-qubit subsystems.
Parameters¶
statevector : NDArray[np.complexfloating]
Pure state vector.
n_qubits : int
Number of qubits.
k : int, default=2
Size of subsystems to consider. Must satisfy
1 <= k <= n_qubits - 1.
- k=1: Equivalent to Meyer-Wallach measure.
- k=2: Pairwise entanglement (default).
- Higher k: Captures higher-order entanglement structure.
The measure is well-defined for any proper subsystem size.
Note that ``Q_k`` and ``Q_{n-k}`` probe complementary subsystems
and therefore carry related (but not identical) information.
Returns¶
float Scott measure in [0, 1].
Raises¶
ValueError
If k < 1, k >= n_qubits, or other inputs are invalid.
ValidationError
If statevector is invalid.
Examples¶
```python
>>> import numpy as np
>>> # GHZ state on 3 qubits
>>> ghz = np.zeros(8, dtype=complex)
>>> ghz[0] = ghz[7] = 1.0 / np.sqrt(2)
>>> scott_1 = compute_scott_measure(ghz, n_qubits=3, k=1)
>>> scott_2 = compute_scott_measure(ghz, n_qubits=3, k=2)
>>> print(f"Scott k=1: {scott_1:.4f}")  # Same as Meyer-Wallach
Scott k=1: 1.0000
>>> print(f"Scott k=2: {scott_2:.4f}")
Scott k=2: 0.6667
```
Notes¶
The Scott measure averages the normalized linear entropy of all
:math:`\binom{n}{k}` reduced density matrices of size k. The
mathematical definition is valid for any 1 <= k <= n - 1, where
n is the number of qubits [1]_.
Computational cost scales with :math:`\binom{n}{k}`, the number of
k-qubit subsystems. For large n, choosing k close to 1 or
close to n - 1 is cheaper than choosing k near n / 2.
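A reference implementation of this definition can be written with tensor reshapes. The normalization (2^k/(2^k − 1)) · mean(1 − Tr ρ_S²) used below is an assumption that reproduces the GHZ values quoted above, but it is a sketch, not necessarily the library's internals:

```python
import numpy as np
from itertools import combinations

def scott_measure(state, n_qubits, k=2):
    """Average normalized linear entropy over all k-qubit reductions."""
    psi = state.reshape([2] * n_qubits)
    subsets = list(combinations(range(n_qubits), k))
    total = 0.0
    for keep in subsets:
        rest = [q for q in range(n_qubits) if q not in keep]
        # Reduced density matrix of the kept qubits via reshape + matmul.
        m = np.transpose(psi, list(keep) + rest).reshape(2 ** k, -1)
        rho = m @ m.conj().T
        total += 1.0 - np.real(np.trace(rho @ rho))
    d = 2 ** k
    return (d / (d - 1)) * total / len(subsets)

ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1.0 / np.sqrt(2)
print(scott_measure(ghz, 3, k=1))  # ≈ 1.0, Meyer-Wallach for GHZ
print(scott_measure(ghz, 3, k=2))  # ≈ 0.6667
```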
References¶
.. [1] Scott, A. J. (2004). "Multipartite entanglement, quantum-error-correcting codes, and entangling power of quantum evolutions." Physical Review A, 69(5), 052330.
compute_von_neumann_entropy
¶
Compute the von Neumann entropy of a density matrix.
The von Neumann entropy is the quantum analog of Shannon entropy:
S(ρ) = -Tr(ρ log₂ ρ) = -Σᵢ λᵢ log₂(λᵢ)
where λᵢ are the eigenvalues of ρ.
Properties
- For a pure state: S = 0
- For a maximally mixed state of dimension d: S = log₂(d)
- In general: 0 ≤ S ≤ log₂(d)
Parameters¶
density_matrix : DensityMatrixType Density matrix (must be square).
Returns¶
float Von Neumann entropy in bits (log base 2).
Examples¶
Pure state has zero entropy:
```python
>>> import numpy as np
>>> rho_pure = np.array([[1, 0], [0, 0]], dtype=complex)
>>> entropy = compute_von_neumann_entropy(rho_pure)
>>> print(f"von Neumann entropy: {entropy:.6f}")
von Neumann entropy: 0.000000
```
Maximally mixed state has maximum entropy:
```python
>>> rho_mixed = np.array([[0.5, 0], [0, 0.5]], dtype=complex)
>>> entropy = compute_von_neumann_entropy(rho_mixed)
>>> print(f"von Neumann entropy: {entropy:.6f}")
von Neumann entropy: 1.000000
```
Notes¶
The computation uses the eigenvalue decomposition of the density matrix. Eigenvalues that are zero or negative (due to numerical errors) are filtered out to avoid log(0) issues.
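The eigenvalue decomposition with zero-filtering described above can be sketched as follows (a minimal NumPy illustration, not the package function):

```python
import numpy as np

def von_neumann_entropy(rho: np.ndarray, tol: float = 1e-12) -> float:
    """S(rho) = -sum_i lambda_i log2(lambda_i), dropping eigenvalues <= tol."""
    eigvals = np.linalg.eigvalsh(rho)   # Hermitian input => real eigenvalues
    eigvals = eigvals[eigvals > tol]    # filter zeros/negatives to avoid log(0)
    return float(-np.sum(eigvals * np.log2(eigvals)))

print(von_neumann_entropy(np.diag([1.0, 0.0])))   # pure state: 0.0
print(von_neumann_entropy(np.diag([0.5, 0.5])))   # maximally mixed: 1.0
```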
See Also¶
compute_linear_entropy : Compute linear entropy (simpler approximation). compute_purity : Compute state purity.
count_resources
¶
count_resources(encoding: BaseEncoding, x: NDArray[floating[Any]] | None = None, *, detailed: Literal[False] = False) -> ResourceCountSummary
count_resources(encoding: BaseEncoding, x: NDArray[floating[Any]] | None = None, *, detailed: Literal[True]) -> DetailedGateBreakdown
count_resources(encoding: BaseEncoding, x: NDArray[floating[Any]] | None = None, *, detailed: bool = False) -> ResourceCountSummary | DetailedGateBreakdown
Count the computational resources required by an encoding.
Analyzes the encoding circuit structure to determine gate counts, circuit depth, and other resource metrics. This analysis is performed analytically (without simulation) for most encodings.
Parameters¶
encoding : BaseEncoding
The encoding instance to analyze. Must be a valid encoding that
inherits from :class:BaseEncoding.
x : NDArray[np.floating], optional
Input data vector. Required only for data-dependent encodings
(e.g., BasisEncoding) where gate count depends on input values.
For most encodings, this can be omitted.
detailed : bool, default=False
If True, return detailed gate-by-gate breakdown.
If False, return summary with totals and derived metrics.
Returns¶
ResourceCountSummary or DetailedGateBreakdown
If detailed=False (default): Dictionary with summary metrics including:
- ``n_qubits``: Number of qubits
- ``depth``: Circuit depth
- ``gate_count``: Total gates
- ``single_qubit_gates``: Single-qubit gate count
- ``two_qubit_gates``: Two-qubit gate count
- ``parameter_count``: Number of parameterized gates
- ``two_qubit_ratio``: Fraction of gates that are two-qubit
- ``gates_per_qubit``: Average gates per qubit
- ``encoding_name``: Name of the encoding class
- ``is_data_dependent``: Whether counts depend on input data
If ``detailed=True``: Dictionary with per-gate-type counts including:
- ``rx``, ``ry``, ``rz``: Rotation gate counts
- ``h``: Hadamard count
- ``x``, ``y``, ``z``: Pauli gate counts
- ``cnot``, ``cz``, ``swap``: Two-qubit gate counts
- ``total_single_qubit``, ``total_two_qubit``, ``total``
- ``encoding_name``: Name of the encoding class
Raises¶
AnalysisError If encoding is not a valid BaseEncoding instance or if resource counting fails. ValidationError If x is required but not provided (for data-dependent encodings), or if x has invalid shape or values.
Examples¶
Basic resource counting:
```python
>>> from encoding_atlas import AngleEncoding, IQPEncoding
>>> from encoding_atlas.analysis import count_resources
>>> enc = AngleEncoding(n_features=4)
>>> res = count_resources(enc)
>>> print(f"Qubits: {res['n_qubits']}, Depth: {res['depth']}")
Qubits: 4, Depth: 1
```
Comparing two encodings:
```python
>>> enc_simple = AngleEncoding(n_features=4)
>>> enc_complex = IQPEncoding(n_features=4, reps=2)
>>> res_simple = count_resources(enc_simple)
>>> res_complex = count_resources(enc_complex)
>>> print(f"Simple: {res_simple['gate_count']} gates")
>>> print(f"Complex: {res_complex['gate_count']} gates")
```
Detailed breakdown:
```python
>>> breakdown = count_resources(enc_complex, detailed=True)
>>> print(f"CNOT gates: {breakdown['cnot']}")
>>> print(f"RZ gates: {breakdown['rz']}")
```
Data-dependent encoding:
```python
>>> from encoding_atlas import BasisEncoding
>>> import numpy as np
>>> enc = BasisEncoding(n_features=4)
>>> x = np.array([0.1, 0.9, 0.3, 0.8])  # Will be binarized
>>> res = count_resources(enc, x=x)
>>> print(f"X gates (depends on input): {res['gate_count']}")
```
Notes¶
For most encodings, resource counts are computed analytically from encoding parameters (n_features, reps, entanglement pattern, etc.). This is fast and deterministic.
For data-dependent encodings like BasisEncoding, the gate count depends on the specific input values (e.g., number of 1s in binary input determines number of X gates).
See Also¶
get_resource_summary : Get summary from encoding properties. get_gate_breakdown : Get detailed gate-by-gate breakdown. compare_resources : Compare resources between multiple encodings. check_simulability : Check if encoding is classically simulable.
create_rng
¶
Create a numpy random number generator.
Parameters¶
seed : int, optional Random seed for reproducibility. If None, uses system entropy.
Returns¶
Generator NumPy random number generator instance.
Examples¶
```python
>>> import numpy as np
>>> rng = create_rng(42)
>>> values = rng.random(5)
>>> # Same seed produces same values
>>> rng2 = create_rng(42)
>>> values2 = rng2.random(5)
>>> np.allclose(values, values2)
True
```
detect_barren_plateau
¶
detect_barren_plateau(gradient_variance: float, n_qubits: int, n_params: int) -> Literal['low', 'medium', 'high']
Detect barren plateau risk based on gradient variance.
Compares observed gradient variance to theoretical scaling for barren plateaus (exponential decay with qubit count) and returns a categorical risk assessment.
The thresholds are adjusted based on system size to account for the natural scaling of gradient variance with qubit count.
Parameters¶
gradient_variance : float Observed gradient variance from gradient sampling. Must be non-negative. n_qubits : int Number of qubits in the circuit. Used to adjust thresholds based on expected scaling. n_params : int Number of parameters in the circuit. Currently used for logging but may be used for more sophisticated detection in future versions.
Returns¶
{"low", "medium", "high"} Barren plateau risk level:
- "low": Variance is healthy, gradient-based training should work
- "medium": Variance is borderline, may need careful tuning
- "high": Variance is very low, likely barren plateau
Notes¶
Risk Threshold Calibration:
The risk thresholds are calibrated based on literature values:
- McClean et al. (2018) showed variance scales as O(2⁻ⁿ) for random circuits with global cost functions
- Cerezo et al. (2021) showed local cost functions have better scaling: O(1/n) for certain circuit architectures
The thresholds are adjusted as follows:
- Base high-risk threshold: 1e-6
- Base medium-risk threshold: 1e-3
- Size adjustment: thresholds scale with 2^(-n) to account for natural variance decay
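The threshold scheme above can be sketched as follows. The exact scaling applied by the library may differ, so treat `classify_plateau_risk` as a hypothetical illustration of one plausible reading:

```python
def classify_plateau_risk(gradient_variance: float, n_qubits: int,
                          base_high: float = 1e-6,
                          base_medium: float = 1e-3) -> str:
    """Categorical risk from variance, with size-adjusted thresholds (sketch)."""
    # Scale thresholds by 2^-n to track the natural decay of gradient
    # variance with qubit count, per the calibration described above.
    scale = 2.0 ** (-n_qubits)
    if gradient_variance < base_high * scale:
        return "high"
    if gradient_variance < base_medium * scale:
        return "medium"
    return "low"
```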
Practical Interpretation:
- "low" risk: Proceed with standard optimization
- "medium" risk: Consider using:
- Adaptive learning rates
- Better parameter initialization (e.g., identity-initialized)
- Noise injection during training
- "high" risk: Fundamental changes needed:
- Reduce circuit depth
- Use local cost functions
- Try hardware-efficient ansatze
- Consider layer-wise training
Examples¶
```python
>>> risk = detect_barren_plateau(
...     gradient_variance=1e-2,
...     n_qubits=4,
...     n_params=16
... )
>>> print(f"Risk level: {risk}")
Risk level: low
>>> risk = detect_barren_plateau(
...     gradient_variance=1e-8,
...     n_qubits=10,
...     n_params=40
... )
>>> print(f"Risk level: {risk}")
Risk level: high
```
See Also¶
estimate_trainability : Complete trainability analysis. compute_gradient_variance : Compute gradient variance.
estimate_entanglement_bound
¶
estimate_entanglement_bound(encoding: BaseEncoding, n_samples: int = 100, seed: int | None = None, backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane') -> float
Estimate upper bound on entanglement entropy.
Samples random inputs and estimates the maximum entanglement entropy across the middle bipartition. Low entanglement suggests the circuit may be simulable using tensor network methods.
The entanglement entropy is computed as the von Neumann entropy of the reduced density matrix for one half of the system:
S = -Tr(ρ_A log₂ ρ_A)
where ρ_A is obtained by tracing out the other half.
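For a pure state, this bipartite entropy can be computed directly from the Schmidt coefficients of the middle cut. A minimal NumPy sketch (the helper name `bipartition_entropy` is illustrative, not part of the package API):

```python
import numpy as np

def bipartition_entropy(state: np.ndarray, n_qubits: int) -> float:
    """Von Neumann entropy (in bits) across the middle bipartition, via SVD."""
    k = n_qubits // 2
    m = state.reshape(2 ** k, -1)               # split: first k qubits vs the rest
    s = np.linalg.svd(m, compute_uv=False)      # singular values = Schmidt coefficients
    p = s ** 2                                  # squared coefficients are probabilities
    p = p[p > 1e-12]                            # drop numerical zeros before log
    return float(-np.sum(p * np.log2(p)))

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
print(bipartition_entropy(bell, 2))             # maximally entangled pair: 1.0 bit
```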
Parameters¶
encoding : BaseEncoding The encoding to analyze. n_samples : int, default=100 Number of random inputs to sample. More samples give more reliable estimates but take longer. seed : int, optional Random seed for reproducibility. backend : {"pennylane", "qiskit", "cirq"}, default="pennylane" Backend to use for circuit simulation.
- ``"pennylane"``: Uses PennyLane's default.qubit simulator (recommended)
- ``"qiskit"``: Uses Qiskit's Statevector class
- ``"cirq"``: Uses Cirq's Simulator for statevector simulation
Returns¶
float Estimated maximum entanglement entropy (in bits). For n qubits, the maximum possible value is n/2 bits (achieved by maximally entangled states).
Raises¶
AnalysisError If encoding is not valid or simulation fails. ValueError If an unknown backend is specified.
Warnings¶
UserWarning If the encoding has more than 25 qubits, a warning is issued about memory usage for simulation.
Notes¶
This is a statistical estimate based on random sampling. The true maximum may be higher for adversarial inputs not sampled.
For non-entangling encodings, this will return 0.0.
For tensor network simulability, entanglement entropy bounded by O(log n) suggests efficient MPS simulation is possible.
Examples¶
```python
>>> from encoding_atlas import AngleEncoding, IQPEncoding
>>> # Non-entangling encoding has zero entanglement
>>> enc = AngleEncoding(n_features=4)
>>> entropy = estimate_entanglement_bound(enc, n_samples=50, seed=42)
>>> entropy < 0.01  # Essentially zero
True
>>> # Entangling encoding has non-zero entanglement
>>> enc = IQPEncoding(n_features=4, reps=2)
>>> entropy = estimate_entanglement_bound(enc, n_samples=50, seed=42)
>>> entropy > 0.1
True
>>> # Using Cirq backend
>>> entropy_cirq = estimate_entanglement_bound(enc, n_samples=50, seed=42, backend="cirq")
>>> entropy_cirq > 0.1
True
```
See Also¶
check_simulability : Full simulability analysis. compute_entanglement_capability : Related entanglement measure.
estimate_execution_time
¶
estimate_execution_time(encoding: BaseEncoding, *, single_qubit_gate_time_us: float = _DEFAULT_SINGLE_QUBIT_GATE_TIME_US, two_qubit_gate_time_us: float = _DEFAULT_TWO_QUBIT_GATE_TIME_US, measurement_time_us: float = _DEFAULT_MEASUREMENT_TIME_US, include_measurement: bool = True, parallelization_factor: float = 0.5) -> dict[str, float]
Estimate execution time for an encoding circuit.
Provides a rough estimate of circuit execution time based on gate counts and typical gate times. This is useful for:
- Comparing relative execution times between encodings
- Planning batch processing
- Evaluating hardware constraints
Parameters¶
encoding : BaseEncoding The encoding to analyze. single_qubit_gate_time_us : float, default=0.02 Time for single-qubit gates in microseconds. Default is 20 ns, typical for superconducting qubits. two_qubit_gate_time_us : float, default=0.2 Time for two-qubit gates in microseconds. Default is 200 ns, typical for superconducting qubits. measurement_time_us : float, default=1.0 Time for measurement in microseconds. include_measurement : bool, default=True Whether to include measurement time in the estimate. parallelization_factor : float, default=0.5 Fraction of gates assumed to run in parallel. 0 = fully serial, 1 = fully parallel. Default of 0.5 assumes moderate parallelism.
Returns¶
dict[str, float] Dictionary containing:
- ``serial_time_us``: Time assuming no parallelization
- ``estimated_time_us``: Time with parallelization factor
- ``single_qubit_time_us``: Time for single-qubit gates only
- ``two_qubit_time_us``: Time for two-qubit gates only
- ``measurement_time_us``: Measurement time (if included)
- ``parallelization_factor``: The factor used
Examples¶
```python
>>> from encoding_atlas import IQPEncoding
>>> from encoding_atlas.analysis import estimate_execution_time
>>> enc = IQPEncoding(n_features=4, reps=2)
>>> times = estimate_execution_time(enc)
>>> print(f"Estimated time: {times['estimated_time_us']:.2f} μs")
```
With custom gate times (trapped ions):
```python
>>> times = estimate_execution_time(
...     enc,
...     single_qubit_gate_time_us=1.0,    # 1 μs
...     two_qubit_gate_time_us=100.0,     # 100 μs
... )
```
Notes¶
This is a rough estimate based on gate counts and does not account for:
- Circuit topology and routing overhead
- Hardware-specific gate implementations
- Error correction overhead
- Classical control latency
For accurate timing, run the circuit on actual hardware or use a detailed hardware simulator.
The parallelization factor accounts for the fact that independent gates can run in parallel. A factor of 0.5 means the effective time is 50% of the fully serial time.
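Under these assumptions the estimate reduces to simple arithmetic. The sketch below applies the parallelization factor to gate time only and leaves measurement time unscaled, which is one plausible reading of the description (the library may apply the factor differently):

```python
def estimate_time_us(n_1q: int, n_2q: int, t_1q: float = 0.02,
                     t_2q: float = 0.2, t_meas: float = 1.0,
                     parallel: float = 0.5, include_meas: bool = True) -> dict:
    """Rough execution-time estimate from gate counts (microseconds)."""
    gate_serial = n_1q * t_1q + n_2q * t_2q
    meas = t_meas if include_meas else 0.0
    return {
        "serial_time_us": gate_serial + meas,
        # parallel=0.5 -> gate time halved relative to fully serial execution
        "estimated_time_us": gate_serial * (1.0 - parallel) + meas,
    }

# e.g. 8 single-qubit + 6 two-qubit gates with measurement
times = estimate_time_us(8, 6)
```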
estimate_trainability
¶
estimate_trainability(encoding: BaseEncoding, n_samples: int = ..., input_range: tuple[float, float] = ..., observable: Literal['computational', 'pauli_z', 'global_z'] = ..., seed: int | None = ..., backend: Literal['pennylane', 'qiskit', 'cirq'] = ..., return_details: Literal[False] = ..., verbose: bool = ...) -> float
estimate_trainability(encoding: BaseEncoding, n_samples: int = ..., input_range: tuple[float, float] = ..., observable: Literal['computational', 'pauli_z', 'global_z'] = ..., seed: int | None = ..., backend: Literal['pennylane', 'qiskit', 'cirq'] = ..., return_details: Literal[True] = ..., verbose: bool = ...) -> TrainabilityResult
estimate_trainability(encoding: BaseEncoding, n_samples: int = _DEFAULT_N_SAMPLES, input_range: tuple[float, float] = _DEFAULT_INPUT_RANGE, observable: Literal['computational', 'pauli_z', 'global_z'] = 'computational', seed: int | None = None, backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane', return_details: bool = False, verbose: bool = False) -> float | TrainabilityResult
Estimate the trainability of a quantum encoding.
Trainability is estimated by analyzing the variance of gradients with respect to encoding parameters. Low variance (especially exponentially decaying with system size) indicates barren plateaus and poor trainability.
This function samples random parameter initializations, computes gradients using the parameter-shift rule, and analyzes the variance of these gradients to detect barren plateaus.
Parameters¶
encoding : BaseEncoding
The encoding instance to analyze. Must be a valid encoding that
implements the get_circuit() method.
n_samples : int, default=500
Number of random parameter initializations to sample. Higher
values give more accurate estimates but take longer to compute.
Minimum value is 10, recommended minimum is 50.
input_range : tuple[float, float], default=(0, 2π)
Range for sampling random input/parameter values as (min, max).
The default range covers full rotation angles for quantum gates.
observable : {"computational", "pauli_z", "global_z"}, default="computational"
Observable to use for gradient computation:
- ``"computational"``: Probability of |0...0⟩ state (local)
- ``"pauli_z"``: Expectation value of Z on first qubit (local)
- ``"global_z"``: Expectation value of Z⊗Z⊗...⊗Z (global)
Local observables are less prone to false positives for barren
plateau detection and are recommended for most use cases.
seed : int, optional Random seed for reproducibility. If None, uses system entropy. For reproducible results, always specify a seed. backend : {"pennylane", "qiskit", "cirq"}, default="pennylane" Backend for circuit simulation and gradient computation:
- ``"pennylane"``: Uses PennyLane's default.qubit simulator
- ``"qiskit"``: Uses Qiskit's Statevector simulator
- ``"cirq"``: Uses Cirq's Simulator for statevector simulation
return_details : bool, default=False If True, return full result dictionary with detailed statistics. If False, return only the trainability estimate as a float. verbose : bool, default=False If True, log progress information during computation.
Returns¶
float or TrainabilityResult
If return_details=False: Trainability estimate in [0, 1],
where higher values indicate better trainability.
If ``return_details=True``: Dictionary containing:
- ``trainability_estimate``: float in [0, 1]
- ``gradient_variance``: variance computed over successful samples only
- ``barren_plateau_risk``: "low", "medium", or "high"
- ``effective_dimension``: estimated effective parameter dimension
- ``n_samples``: total number of samples requested
- ``n_successful_samples``: samples used for variance computation
- ``per_parameter_variance``: variance for each parameter
- ``n_failed_samples``: number of failed gradient computations
Raises¶
InsufficientSamplesError
If n_samples < 10.
SimulationError
If too many gradient computations fail (> 20% of samples).
AnalysisError
If the encoding is invalid or cannot be analyzed.
Warnings¶
UserWarning
If n_samples < 50, a warning is issued about potential
statistical unreliability.
Notes¶
Interpretation:
The trainability score should be interpreted as a relative measure for comparing encodings, not an absolute guarantee. A score of 0.8 means the encoding appears more trainable than one with score 0.5, but actual training success depends on many other factors.
Barren Plateau Risk Levels:
- "low" (variance ≥ 1e-3): Training should proceed normally
- "medium" (1e-6 ≤ variance < 1e-3): May need careful tuning
- "high" (variance < 1e-6): Likely barren plateau, consider:
- Using fewer qubits
- Shallower circuits
- Local cost functions
- Parameter initialization strategies
Computational Cost:
Each sample requires computing gradients for all parameters, which involves 2 × n_parameters circuit evaluations (parameter-shift rule). Total circuit evaluations = 2 × n_parameters × n_samples.
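The cost arithmetic above works out as follows for a small example (hypothetical counts chosen for illustration):

```python
# Parameter-shift rule: two circuit evaluations per gradient component.
n_parameters = 4      # e.g. one encoding parameter per feature
n_samples = 500       # random initializations sampled
total_evaluations = 2 * n_parameters * n_samples
print(total_evaluations)  # 4000 circuit evaluations in total
```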
Failed Sample Handling:
Gradient computation may fail for certain parameter configurations
due to numerical instabilities or backend issues. Failed samples are
completely excluded from variance computation to ensure unbiased
statistical estimates. If more than 20% of samples fail, a
:exc:SimulationError is raised. The number of successful samples
used is reported in n_successful_samples when return_details=True.
Best Practices:
- Use at least 100 samples for reliable estimates
- Compare encodings using the same seed for fair comparison
- Use local observables ("computational" or "pauli_z") unless specifically studying global cost functions
- For publication, report both trainability score and gradient variance with confidence intervals
Examples¶
Basic usage:
```python
>>> from encoding_atlas import AngleEncoding
>>> enc = AngleEncoding(n_features=4)
>>> train = estimate_trainability(enc, seed=42)
>>> print(f"Trainability: {train:.4f}")
```
Detailed analysis:
```python
>>> result = estimate_trainability(
...     enc, n_samples=200, seed=42, return_details=True
... )
>>> print(f"Variance: {result['gradient_variance']:.2e}")
>>> print(f"Risk level: {result['barren_plateau_risk']}")
```
Comparing encodings:
```python
>>> from encoding_atlas import IQPEncoding
>>> enc1 = AngleEncoding(n_features=4)
>>> enc2 = IQPEncoding(n_features=4, reps=2)
>>> t1 = estimate_trainability(enc1, seed=42)
>>> t2 = estimate_trainability(enc2, seed=42)
>>> print(f"Angle: {t1:.4f}, IQP: {t2:.4f}")
```
See Also¶
compute_gradient_variance : Compute only gradient variance. detect_barren_plateau : Detect barren plateau from variance.
generate_random_parameters
¶
generate_random_parameters(encoding_or_n_features: BaseEncoding | int, n_samples: int = 1, param_min: float = _DEFAULT_PARAM_MIN, param_max: float = _DEFAULT_PARAM_MAX, seed: int | None = None) -> FloatArray
Generate random parameter vectors for encoding analysis.
Parameters¶
encoding_or_n_features : BaseEncoding or int
Either an encoding instance (n_features is extracted
automatically) or an integer specifying the number of features
directly.
n_samples : int, default=1
Number of parameter vectors to generate.
param_min : float, default=0.0
Minimum parameter value.
param_max : float, default=2π
Maximum parameter value.
seed : int, optional
Random seed for reproducibility.
Returns¶
FloatArray
Random parameters of shape (n_samples, n_features) if
n_samples > 1, or (n_features,) if n_samples == 1.
Examples¶
Generate parameters from an encoding:
```python
>>> from encoding_atlas import AngleEncoding
>>> enc = AngleEncoding(n_features=4)
>>> params = generate_random_parameters(enc, n_samples=10, seed=42)
>>> print(params.shape)
(10, 4)
```
Generate parameters by specifying n_features directly:
```python
>>> params = generate_random_parameters(4, seed=42)
>>> print(params.shape)
(4,)
```
With custom range:
```python
>>> import numpy as np
>>> params = generate_random_parameters(
...     4, n_samples=10, param_min=-np.pi, param_max=np.pi
... )
```
get_gate_breakdown
¶
get_gate_breakdown(encoding: BaseEncoding, x: NDArray[floating[Any]] | None = None) -> DetailedGateBreakdown
Get detailed gate-by-gate breakdown for an encoding.
This is a convenience function that calls :func:count_resources
with detailed=True.
Parameters¶
encoding : BaseEncoding The encoding to analyze. x : NDArray[np.floating], optional Input data for data-dependent encodings. Required for encodings like BasisEncoding where gate count depends on input values.
Returns¶
DetailedGateBreakdown Dictionary with counts for each gate type.
Raises¶
AnalysisError If encoding is invalid. ValidationError If x is required but not provided or has invalid values.
Examples¶
```python
>>> from encoding_atlas import IQPEncoding
>>> from encoding_atlas.analysis import get_gate_breakdown
>>> enc = IQPEncoding(n_features=4, reps=2)
>>> breakdown = get_gate_breakdown(enc)
>>> print(f"CNOT gates: {breakdown['cnot']}")
>>> print(f"Hadamard gates: {breakdown['h']}")
>>> print(f"Total: {breakdown['total']}")
```
See Also¶
count_resources : Full resource counting with summary option. get_resource_summary : Quick summary from cached properties.
get_resource_summary
¶
get_resource_summary(encoding: BaseEncoding) -> ResourceCountSummary
Get a quick resource summary from encoding properties.
This is a lightweight alternative to :func:count_resources that
uses cached encoding properties when available. It always returns
theoretical (worst-case) values, even for data-dependent encodings.
Parameters¶
encoding : BaseEncoding The encoding to summarize.
Returns¶
ResourceCountSummary Dictionary with resource metrics. See Warnings section for details on which fields contain approximations.
Raises¶
AnalysisError If encoding is not a valid BaseEncoding instance.
Warnings¶
Approximated Fields: The following fields are approximations based on aggregate property values, not actual gate-by-gate circuit analysis:
- ``cnot_count``: Approximated as ``two_qubit_gates``. This assumes all two-qubit gates are CNOTs, which may overcount if the encoding uses CZ or SWAP gates instead.
- ``cz_count``: Always returns 0 (requires detailed circuit analysis).
- ``t_gate_count``: Always returns 0 (requires detailed circuit analysis).
- ``hadamard_count``: Always returns 0 (requires detailed circuit analysis).
- ``rotation_gates``: Approximated as ``single_qubit_gates``. This assumes all single-qubit gates are rotation gates (RX, RY, RZ), which may overcount if the encoding uses non-rotation gates like H, X, Y, Z, S, T.
For accurate gate-type counts, use :func:count_resources with
detailed=True, which performs actual gate-by-gate analysis via
the encoding's gate_count_breakdown() method when available.
Examples¶
```python
>>> from encoding_atlas import IQPEncoding
>>> from encoding_atlas.analysis import get_resource_summary
>>> enc = IQPEncoding(n_features=4, reps=2)
>>> summary = get_resource_summary(enc)
>>> print(f"Qubits: {summary['n_qubits']}, Depth: {summary['depth']}")
```
For accurate gate counts, prefer count_resources with detailed=True:
```python
>>> from encoding_atlas.analysis import count_resources
>>> detailed = count_resources(enc, detailed=True)
>>> print(f"Actual Hadamard gates: {detailed['h']}")
```
Notes¶
This function uses the properties attribute of the encoding,
which is typically cached after first access. This makes it very
fast for repeated calls.
For data-dependent encodings, this returns worst-case (maximum)
gate counts. Use :func:count_resources with input data for
actual counts.
See Also¶
count_resources : Full resource counting with detailed options. get_gate_breakdown : Get detailed gate-by-gate breakdown.
get_simulability_reason
¶
get_simulability_reason(encoding: BaseEncoding) -> str
Get a concise explanation of why an encoding is or isn't simulable.
This is a convenience function that provides a quick summary without
the full details returned by :func:check_simulability.
Parameters¶
encoding : BaseEncoding The encoding to analyze.
Returns¶
str Human-readable explanation of simulability status.
Raises¶
AnalysisError If encoding is not a valid BaseEncoding instance.
Examples¶
```python
>>> from encoding_atlas import AngleEncoding, IQPEncoding
>>> enc = AngleEncoding(n_features=4)
>>> print(get_simulability_reason(enc))
Simulable: Encoding produces only product states (no entanglement)
>>> enc = IQPEncoding(n_features=4, reps=2)
>>> reason = get_simulability_reason(enc)
>>> print(reason.startswith("Not simulable:"))
True
```
See Also¶
check_simulability : Get full simulability analysis with details.
is_clifford_circuit
¶
is_clifford_circuit(encoding: BaseEncoding) -> bool
Check if an encoding uses only Clifford gates.
Clifford circuits can be efficiently simulated using the stabilizer formalism, as proven by the Gottesman-Knill theorem. The Clifford gate set includes: H, S, CNOT, CZ, and Pauli gates (X, Y, Z).
Parameters¶
encoding : BaseEncoding The encoding to check.
Returns¶
bool True if the circuit is believed to use only Clifford gates.
Raises¶
AnalysisError If encoding is not a valid BaseEncoding instance.
Notes¶
This is a conservative check based on encoding properties and known gate sets. Some encodings may be classified as non-Clifford even if specific parameter choices yield Clifford circuits.
For example, RZ(π/2) is equivalent to S (a Clifford gate), but general RZ(θ) rotations are non-Clifford.
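This equivalence is easy to verify numerically, since gates that differ only by a global phase are physically identical (a standalone NumPy check, not a package function):

```python
import numpy as np

def rz(theta: float) -> np.ndarray:
    """Standard RZ(theta) = diag(e^{-i theta/2}, e^{i theta/2})."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

S = np.diag([1, 1j])  # phase gate, a Clifford gate

def equal_up_to_phase(a: np.ndarray, b: np.ndarray, tol: float = 1e-9) -> bool:
    """True if a = e^{i phi} * b for some global phase phi."""
    idx = np.unravel_index(np.argmax(np.abs(b)), b.shape)
    phase = a[idx] / b[idx]              # candidate global phase
    return bool(np.allclose(a, phase * b, atol=tol))

print(equal_up_to_phase(rz(np.pi / 2), S))   # True: RZ(pi/2) ~ S, Clifford
print(equal_up_to_phase(rz(0.3), S))         # False: generic RZ is non-Clifford
```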
Examples¶
```python
>>> from encoding_atlas import AngleEncoding, BasisEncoding
>>> # AngleEncoding uses parameterized rotations (non-Clifford in general)
>>> enc = AngleEncoding(n_features=4)
>>> is_clifford_circuit(enc)
False
```
See Also¶
check_simulability : Full simulability analysis.
is_matchgate_circuit
¶
is_matchgate_circuit(encoding: BaseEncoding) -> bool
Check if an encoding uses only matchgate operations.
Matchgate circuits with nearest-neighbor connectivity on a line topology can be efficiently simulated classically in polynomial time. This is based on the fact that matchgates preserve fermionic parity and can be mapped to free-fermion systems.
Parameters¶
encoding : BaseEncoding The encoding to check.
Returns¶
bool True if the circuit is believed to use only matchgate operations with appropriate (nearest-neighbor) topology.
Raises¶
AnalysisError If encoding is not a valid BaseEncoding instance.
Notes¶
What are Matchgates?
Matchgates are a class of two-qubit gates that act on the computational basis states in a specific way that preserves particle number parity. Common matchgates include:
- iSWAP: Swaps |01⟩ ↔ |10⟩ with a phase
- fSWAP (fermionic SWAP): Used in fermionic simulations
- Givens rotations: Parameterized rotations in the {|01⟩, |10⟩} subspace
Simulability Conditions
Matchgate circuits are classically simulable when:
- All two-qubit gates are matchgates
- Qubits are arranged in a line topology
- Two-qubit gates act only on nearest neighbors
Limitations
This check is heuristic and may be conservative. It identifies known matchgate-based encodings by name and checks for linear entanglement patterns. Some matchgate circuits may not be detected if they don't follow recognized naming conventions.
References¶
.. [1] Jozsa, R., & Miyake, A. (2008). "Matchgates and classical simulation of quantum circuits." Proc. R. Soc. A, 464(2100), 3089-3106.
.. [2] Terhal, B. M., & DiVincenzo, D. P. (2002). "Classical simulation of noninteracting-fermion quantum circuits." Physical Review A, 65(3).
Examples¶
```python
>>> from encoding_atlas import AngleEncoding
>>> enc = AngleEncoding(n_features=4)
>>> is_matchgate_circuit(enc)
False
```
See Also¶
is_clifford_circuit : Check for Clifford-only circuits. check_simulability : Full simulability analysis.
partial_trace_single_qubit
¶
partial_trace_single_qubit(statevector: StatevectorType, n_qubits: int, keep_qubit: int) -> DensityMatrixType
Compute the partial trace, keeping only a single qubit.
This function traces out all qubits except the specified one, returning the reduced density matrix for that qubit.
Parameters¶
statevector : StatevectorType
Full system statevector of shape (2^n_qubits,).
n_qubits : int
Total number of qubits in the system.
keep_qubit : int
Index of the qubit to keep (0-indexed). Qubit 0 is the most
significant bit in the computational basis ordering.
Returns¶
DensityMatrixType
Reduced density matrix for the kept qubit, shape (2, 2).
This is a valid density matrix: Hermitian, positive semidefinite,
with trace 1.
Raises¶
ValueError
If keep_qubit is out of range [0, n_qubits-1].
ValidationError
If the statevector is invalid.
Examples¶
For a separable state |00⟩, each qubit's reduced state is |0⟩:
```python
>>> import numpy as np
>>> state_00 = np.array([1, 0, 0, 0], dtype=complex)  # |00⟩
>>> rho_0 = partial_trace_single_qubit(state_00, n_qubits=2, keep_qubit=0)
>>> print(rho_0)
[[1.+0.j 0.+0.j]
 [0.+0.j 0.+0.j]]
```
For a Bell state (|00⟩ + |11⟩)/√2, each qubit is maximally mixed:
```python
>>> bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
>>> rho_0 = partial_trace_single_qubit(bell, n_qubits=2, keep_qubit=0)
>>> print(np.round(rho_0, 3))
[[0.5+0.j 0. +0.j]
 [0. +0.j 0.5+0.j]]
```
Notes¶
The convention used is that qubit 0 is the most significant bit. For a 2-qubit system:
- Index 0: |00⟩ (qubit 0 = 0, qubit 1 = 0)
- Index 1: |01⟩ (qubit 0 = 0, qubit 1 = 1)
- Index 2: |10⟩ (qubit 0 = 1, qubit 1 = 0)
- Index 3: |11⟩ (qubit 0 = 1, qubit 1 = 1)
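With this convention, the reduced state can be reproduced in a few lines of NumPy. The sketch below is an illustrative reimplementation, not the package's actual code:

```python
import numpy as np

def partial_trace_keep_one(statevector, n_qubits, keep_qubit):
    """Illustrative single-qubit partial trace (qubit 0 = most significant bit)."""
    # Reshape the flat statevector into an n_qubits-way tensor; with qubit 0
    # as the most significant bit, tensor axis k corresponds to qubit k.
    psi = np.asarray(statevector, dtype=complex).reshape((2,) * n_qubits)
    # Bring the kept qubit's axis to the front and flatten the traced-out rest.
    psi = np.moveaxis(psi, keep_qubit, 0).reshape(2, -1)
    # Tracing out the environment reduces |psi><psi| to psi @ psi^dagger.
    return psi @ psi.conj().T

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
print(np.round(partial_trace_keep_one(bell, n_qubits=2, keep_qubit=0), 3))  # I/2
```

The reshape/moveaxis approach generalizes directly to tracing out arbitrary subsets of qubits.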
See Also¶
partial_trace_subsystem : Trace out arbitrary qubits.
compute_purity : Compute purity of a density matrix.
partial_trace_subsystem
¶
partial_trace_subsystem(statevector: StatevectorType, n_qubits: int, keep_qubits: Sequence[int]) -> DensityMatrixType
Compute partial trace, keeping a specified subsystem.
This function traces out all qubits not in the keep list, returning the reduced density matrix for the kept subsystem.
Parameters¶
statevector : StatevectorType
Full system statevector of shape (2^n_qubits,).
n_qubits : int
Total number of qubits in the system.
keep_qubits : Sequence[int]
Indices of qubits to keep (0-indexed). Must be a non-empty
sequence of unique integers in range [0, n_qubits-1].
Returns¶
DensityMatrixType
Reduced density matrix for the kept subsystem, shape
(2^len(keep_qubits), 2^len(keep_qubits)).
Raises¶
ValueError
If keep_qubits is empty, contains duplicates, or has
out-of-range indices.
ValidationError
If the statevector is invalid.
Examples¶
Example 1: Keep the first two qubits of a 3-qubit product state

>>> import numpy as np
>>> state = np.zeros(8, dtype=complex)  # |000⟩ state
>>> state[0] = 1.0
>>> rho_01 = partial_trace_subsystem(state, n_qubits=3, keep_qubits=[0, 1])
>>> print(rho_01.shape)
(4, 4)
>>> print(np.round(rho_01, 3))  # result is |00⟩⟨00| (pure state)
[[1.+0.j 0.+0.j 0.+0.j 0.+0.j]
 [0.+0.j 0.+0.j 0.+0.j 0.+0.j]
 [0.+0.j 0.+0.j 0.+0.j 0.+0.j]
 [0.+0.j 0.+0.j 0.+0.j 0.+0.j]]
Example 2: GHZ state partial trace (keeping the first qubit)

For the GHZ state (|000⟩ + |111⟩)/√2, tracing out qubits 1 and 2 leaves qubit 0 in a maximally mixed state:

>>> ghz = np.zeros(8, dtype=complex)
>>> ghz[0] = 1.0 / np.sqrt(2)  # |000⟩
>>> ghz[7] = 1.0 / np.sqrt(2)  # |111⟩
>>> rho_0 = partial_trace_subsystem(ghz, n_qubits=3, keep_qubits=[0])
>>> print(np.round(rho_0, 3))  # single qubit is maximally mixed (I/2)
[[0.5+0.j 0. +0.j]
 [0. +0.j 0.5+0.j]]
Example 3: Non-contiguous qubit selection

Keep qubits 0 and 2, tracing out qubit 1:

>>> state = np.zeros(8, dtype=complex)  # |000⟩ state
>>> state[0] = 1.0
>>> rho_02 = partial_trace_subsystem(state, n_qubits=3, keep_qubits=[0, 2])
>>> print(rho_02.shape)
(4, 4)
>>> print(np.round(np.diag(rho_02).real, 3))  # |00⟩⟨00| for qubits 0 and 2
[1. 0. 0. 0.]
Example 4: W state partial trace

The W state (|001⟩ + |010⟩ + |100⟩)/√3 has a different entanglement structure than GHZ. Tracing out one qubit leaves a mixed state:

>>> w_state = np.zeros(8, dtype=complex)
>>> w_state[1] = 1.0 / np.sqrt(3)  # |001⟩
>>> w_state[2] = 1.0 / np.sqrt(3)  # |010⟩
>>> w_state[4] = 1.0 / np.sqrt(3)  # |100⟩
>>> rho_01 = partial_trace_subsystem(w_state, n_qubits=3, keep_qubits=[0, 1])
>>> print(f"Trace: {np.trace(rho_01).real:.3f}")  # trace = 1, but not maximally mixed
Trace: 1.000
Notes¶
The ordering of qubits in the output follows the order specified
in keep_qubits. For example, if keep_qubits=[1, 0], the
first index of the output density matrix corresponds to qubit 1.
Qubit Indexing Convention
Qubit 0 is the most significant bit in the computational basis:
- For 3 qubits: |q0 q1 q2⟩
- Index 0 = |000⟩, Index 1 = |001⟩, Index 4 = |100⟩, Index 7 = |111⟩
Use Cases
- Computing entanglement entropy of subsystems
- Analyzing local properties of entangled states
- Verifying separability of quantum states
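As a sketch of the first use case, the von Neumann entanglement entropy can be computed directly from a reduced density matrix such as the ones returned by this function. The helper below is illustrative, not part of the package API:

```python
import numpy as np

def entanglement_entropy(rho):
    # Von Neumann entropy S = -Tr(rho log2 rho), via the eigenvalues of rho.
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerical zeros before taking the log
    return float(-np.sum(evals * np.log2(evals)))

# One half of a Bell (or GHZ) pair is maximally mixed -> 1 bit of entanglement.
print(entanglement_entropy(np.eye(2) / 2))  # -> 1.0
```

A pure reduced state (i.e., a separable system) gives entropy 0, which is one way to check the third use case as well.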
simulate_encoding_statevector
¶
simulate_encoding_statevector(encoding: BaseEncoding, x: NDArray[floating[Any]], backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane') -> StatevectorType
Simulate an encoding circuit and return the resulting statevector.
This function executes a quantum circuit defined by the encoding with the given input data and returns the final quantum state as a complex vector in the computational basis.
Parameters¶
encoding : BaseEncoding
The encoding instance to simulate.
x : NDArray[np.floating]
Input data vector of shape (n_features,). The values are used
as parameters for the encoding circuit.
backend : {"pennylane", "qiskit", "cirq"}, default="pennylane"
The quantum simulation backend to use.
- ``"pennylane"``: Uses PennyLane's default.qubit simulator
- ``"qiskit"``: Uses Qiskit's Statevector class
- ``"cirq"``: Uses Cirq's Simulator for statevector simulation
Returns¶
StatevectorType
Complex statevector of shape (2^n_qubits,) representing the
final quantum state in the computational basis. The state is
normalized (sum of |amplitude|² = 1).
Raises¶
SimulationError
If the simulation fails due to backend errors, missing dependencies, or invalid circuit structure.
ValidationError
If the encoding or input data is invalid.
ValueError
If an unknown backend is specified.
Warnings¶
UserWarning
If the number of qubits exceeds _SIMULATION_QUBIT_WARNING_THRESHOLD
(15 by default), a warning is issued about memory usage.
Examples¶
Simulate an angle encoding circuit:

>>> from encoding_atlas import AngleEncoding
>>> import numpy as np
>>> enc = AngleEncoding(n_features=3)
>>> x = np.array([0.1, 0.2, 0.3])
>>> state = simulate_encoding_statevector(enc, x)
>>> print(f"State shape: {state.shape}")
State shape: (8,)
>>> print(f"State norm: {np.linalg.norm(state):.6f}")
State norm: 1.000000

Using the Qiskit backend:

>>> state_qiskit = simulate_encoding_statevector(enc, x, backend="qiskit")
>>> fidelity = np.abs(np.vdot(state, state_qiskit))**2  # equivalent up to global phase
>>> print(f"Cross-backend fidelity: {fidelity:.6f}")
Cross-backend fidelity: 1.000000
Notes¶
The statevector is ordered in the computational basis with the standard convention where qubit 0 is the most significant bit:
- Index 0 corresponds to |00...0⟩
- Index 1 corresponds to |00...1⟩
- Index 2^n - 1 corresponds to |11...1⟩
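A small illustrative helper makes this convention concrete: with qubit 0 as the most significant bit, a basis label maps to its index by reading the bitstring as a binary integer.

```python
def basis_index(bits):
    # With qubit 0 as the most significant bit, |q0 q1 ... q_{n-1}> maps to
    # the integer whose binary expansion is the bitstring itself.
    return int("".join(str(b) for b in bits), 2)

print(basis_index([0, 0, 0]))  # |000> -> 0
print(basis_index([1, 0, 0]))  # |100> -> 4
print(basis_index([1, 1, 1]))  # |111> -> 7
```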
Memory Requirements
Memory grows exponentially with qubit count:
- Statevector: 2^n × 16 bytes (e.g., 15 qubits → 0.5 MB, 20 qubits → 16 MB)
- Density matrix (for analysis): 2^(2n) × 16 bytes (e.g., 15 qubits → 16 GB)
Analysis operations like entanglement capability create full density matrices, so their practical limit is around 13-15 qubits. For larger systems, consider tensor network methods or sampling-based approaches.
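The scaling above can be sanity-checked with a short helper (illustrative only, not package API):

```python
def statevector_bytes(n_qubits: int) -> int:
    # One complex128 amplitude is 16 bytes; a statevector has 2**n of them.
    return (2 ** n_qubits) * 16

def density_matrix_bytes(n_qubits: int) -> int:
    # A density matrix is a 2**n x 2**n array of complex128 entries.
    return (2 ** (2 * n_qubits)) * 16

print(statevector_bytes(20) // 2**20)     # MB for a 20-qubit statevector -> 16
print(density_matrix_bytes(15) // 2**30)  # GB for a 15-qubit density matrix -> 16
```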
See Also¶
simulate_encoding_statevectors_batch : Simulate multiple inputs at once.
compute_fidelity : Compute fidelity between two statevectors.
simulate_encoding_statevectors_batch
¶
simulate_encoding_statevectors_batch(encoding: BaseEncoding, X: NDArray[floating[Any]], backend: Literal['pennylane', 'qiskit', 'cirq'] = 'pennylane') -> list[StatevectorType]
Simulate encoding circuits for multiple input vectors.
This is a convenience function that applies :func:simulate_encoding_statevector to each row of a 2D input array.
Parameters¶
encoding : BaseEncoding
The encoding instance to simulate.
X : NDArray[np.floating]
Input data array of shape (n_samples, n_features).
backend : {"pennylane", "qiskit"}, default="pennylane"
The quantum simulation backend to use.
Returns¶
list[StatevectorType]
List of statevectors, one for each input sample.
Raises¶
SimulationError
If simulation fails for any input.
ValidationError
If input shape is incorrect.
Examples¶
>>> from encoding_atlas import AngleEncoding
>>> import numpy as np
>>> enc = AngleEncoding(n_features=2)
>>> X = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
>>> states = simulate_encoding_statevectors_batch(enc, X)
>>> print(f"Number of states: {len(states)}")
Number of states: 3
validate_encoding_for_analysis
¶
validate_encoding_for_analysis(encoding: BaseEncoding) -> None
Validate that an encoding is suitable for analysis operations.
This function performs comprehensive validation to ensure the encoding can be safely used in analysis functions. It checks the encoding type, configuration, and derived properties.
Parameters¶
encoding : BaseEncoding
The encoding instance to validate.
Raises¶
AnalysisError
If the encoding is not a valid BaseEncoding instance.
ValidationError
If the encoding has invalid configuration (e.g., zero features or qubits).
Examples¶
>>> from encoding_atlas import AngleEncoding
>>> enc = AngleEncoding(n_features=4)
>>> validate_encoding_for_analysis(enc)  # no exception raised

>>> validate_encoding_for_analysis("not an encoding")
Traceback (most recent call last):
    ...
AnalysisError: Expected BaseEncoding instance, got str
Notes¶
This function is called internally by analysis functions. You generally don't need to call it directly unless implementing custom analysis.
validate_statevector
¶
validate_statevector(statevector: NDArray[Any], expected_qubits: int | None = None, check_normalization: bool = True, tolerance: float = 1e-10) -> StatevectorType
Validate and normalize a statevector.
Parameters¶
statevector : NDArray
The statevector to validate.
expected_qubits : int, optional
If provided, verify the statevector has dimension 2^expected_qubits.
check_normalization : bool, default=True
If True, verify the statevector is normalized and renormalize if needed.
tolerance : float, default=1e-10
Tolerance for normalization check.
Returns¶
StatevectorType
Validated (and possibly renormalized) statevector as complex128.
Raises¶
ValidationError
If the statevector has invalid shape or values.
NumericalInstabilityError
If the statevector has near-zero norm.
Examples¶
>>> import numpy as np
>>> state = np.array([1, 0, 0, 0], dtype=complex)
>>> validated = validate_statevector(state, expected_qubits=2)
>>> validated.dtype
dtype('complex128')
Guide Module¶
recommend_encoding
¶
recommend_encoding(n_features: int, n_samples: int = 500, task: Literal['classification', 'regression'] = 'classification', hardware: str = 'simulator', priority: Literal['accuracy', 'trainability', 'speed', 'noise_resilience'] = 'accuracy', *, data_type: Literal['continuous', 'binary', 'discrete'] = 'continuous', symmetry: Literal['rotation', 'cyclic', 'permutation_pairs', 'general'] | None = None, trainable: bool = False, problem_structure: Literal['combinatorial', 'physics_simulation', 'time_series'] | None = None, feature_interactions: Literal['polynomial', 'custom_pauli'] | None = None) -> Recommendation
Recommend an encoding based on problem characteristics.
The recommendation is produced in two phases:
- Hard filter — encodings whose structural preconditions are not satisfied (data type, feature count, symmetry, trainability) are eliminated.
- Soft scoring — remaining candidates are scored on priority match, problem structure, task type, hardware suitability, and feature count. The highest-scoring encoding becomes the primary recommendation.
Parameters¶
n_features : int
Number of input features (must be >= 1).
n_samples : int
Number of training samples (default 500, must be >= 1).
task : {"classification", "regression"}
Machine learning task type. Classification boosts kernel-method
encodings; regression boosts universal-approximation encodings.
hardware : str
Target hardware ("simulator", "ibm", "ionq", etc.).
priority : {"accuracy", "trainability", "speed", "noise_resilience"}
Optimization priority.
data_type : {"continuous", "binary", "discrete"}
Nature of the input features.
symmetry : {"rotation", "cyclic", "permutation_pairs", "general"} | None
Known symmetry in the data. None means no known symmetry.
trainable : bool
Whether the encoding should have learnable parameters.
problem_structure : {"combinatorial", "physics_simulation", "time_series"} | None
Domain structure of the problem.
feature_interactions : {"polynomial", "custom_pauli"} | None
Desired feature interaction type.
Returns¶
Recommendation
The recommended encoding with explanation, alternatives, and confidence score.
Raises¶
ValueError
If any parameter value is outside its valid set.
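The two-phase scheme can be sketched with a toy candidate table. The encoding names, preconditions, and scores below are illustrative assumptions, not the package's real rule set:

```python
# Toy candidate table: structural preconditions plus per-priority scores.
CANDIDATES = {
    "angle":    {"min_features": 1, "trainable": False,
                 "scores": {"speed": 3, "accuracy": 1}},
    "iqp":      {"min_features": 2, "trainable": False,
                 "scores": {"speed": 1, "accuracy": 3}},
    "reupload": {"min_features": 1, "trainable": True,
                 "scores": {"speed": 1, "accuracy": 2}},
}

def toy_recommend(n_features, priority, trainable=False):
    # Phase 1 (hard filter): drop candidates whose preconditions fail.
    viable = {name: meta for name, meta in CANDIDATES.items()
              if n_features >= meta["min_features"]
              and meta["trainable"] == trainable}
    if not viable:
        raise ValueError("no encoding satisfies the hard constraints")
    # Phase 2 (soft scoring): rank survivors by the stated priority.
    return max(viable, key=lambda n: viable[n]["scores"].get(priority, 0))

print(toy_recommend(n_features=4, priority="accuracy"))                 # -> iqp
print(toy_recommend(n_features=4, priority="accuracy", trainable=True)) # -> reupload
```

Note how the hard filter is absolute (a failed precondition eliminates a candidate outright), while the soft score only orders the survivors.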
Source code in src/encoding_atlas/guide/recommender.py
Recommendation
dataclass
¶
Encoding recommendation result.
Attributes¶
encoding_name : str
Canonical name of the recommended encoding (matches registry keys).
explanation : str
Human-readable rationale for the recommendation.
alternatives : list[str]
Up to three runner-up encoding names, ranked by score.
confidence : float
Confidence in the recommendation, in [0, 1].
get_matching_encodings
¶
get_matching_encodings(requirements: list[str], constraints: list[str] | None = None, *, n_features: int | None = None, data_type: str = 'continuous', symmetry: str | None = None, trainable: bool = False) -> list[str]
Get encodings matching requirements, constraints, and hard filters.
This performs a two-phase check for each encoding:
- Hard filter — eliminates encodings whose structural preconditions are not met (data type, feature count, symmetry, trainability).
- Soft match — among survivors, selects those whose best_for tags overlap with requirements and whose avoid_when tags do not overlap with constraints.
Parameters¶
requirements : list[str]
Tags the encoding should be good at (matched against best_for).
constraints : list[str] | None
Tags the encoding should not be associated with (matched against
avoid_when). None means no soft constraints.
n_features : int | None
Number of input features for hard-filter checks.
data_type : str
Data type for hard-filter checks.
symmetry : str | None
Symmetry type for hard-filter checks.
trainable : bool
Trainable flag for hard-filter checks.
Returns¶
list[str]
Encoding names that pass both hard and soft filters.
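The soft-match step amounts to simple tag-set overlap. An illustrative sketch (the tag names are made up):

```python
def soft_match(best_for, avoid_when, requirements, constraints):
    # Keep an encoding if it is good at something we require and is not
    # flagged for any of our stated constraints.
    wanted = bool(set(best_for) & set(requirements))
    blocked = bool(set(avoid_when) & set(constraints or []))
    return wanted and not blocked

print(soft_match(best_for=["high_dimensional"], avoid_when=["noisy_hardware"],
                 requirements=["high_dimensional"], constraints=["noisy_hardware"]))  # False
print(soft_match(best_for=["high_dimensional"], avoid_when=["noisy_hardware"],
                 requirements=["high_dimensional"], constraints=[]))                  # True
```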
Source code in src/encoding_atlas/guide/rules.py
Exceptions¶
exceptions
¶
Custom exceptions for encoding_atlas.
This module defines the exception hierarchy for the encoding_atlas package.
All custom exceptions inherit from :class:EncodingError, which itself
inherits from Python's built-in :class:Exception.
Exception Hierarchy¶
Exception
└── EncodingError (base for all encoding_atlas exceptions)
├── ValidationError (input validation failures)
├── BackendError (quantum backend issues)
├── RegistryError (encoding registry issues)
└── AnalysisError (analysis operation failures)
├── SimulationError (circuit simulation failures)
├── ConvergenceError (iterative computation failures)
├── NumericalInstabilityError (numerical issues)
└── InsufficientSamplesError (sampling issues)
Examples¶
Catching all encoding-related exceptions:
>>> try:
...     result = some_encoding_operation()
... except EncodingError as e:
...     print(f"Encoding operation failed: {e}")
Catching specific analysis exceptions:
>>> from encoding_atlas.core.exceptions import SimulationError
>>> try:
...     statevector = simulate_encoding(encoding, x)
... except SimulationError as e:
...     print(f"Simulation failed: {e}")
AnalysisError
¶
Bases: EncodingError
Base exception for analysis operations.
This exception class serves as the base for all analysis-related
errors in the encoding_atlas.analysis module. It inherits from
:class:EncodingError to maintain the package's exception hierarchy.
Use this exception for general analysis failures that don't fit into more specific categories. For specific failure modes, use the specialized subclasses.
Parameters¶
message : str
Human-readable description of the error.
details : dict, optional
Additional context about the error (e.g., parameter values, intermediate results).
Attributes¶
details : dict
Additional context about the error, if provided.
Examples¶
raise AnalysisError("Analysis failed for encoding") Traceback (most recent call last): ... AnalysisError: Analysis failed for encoding
With additional details:
>>> raise AnalysisError(
...     "Invalid encoding for analysis",
...     details={"encoding_type": "CustomEncoding", "n_qubits": 0}
... )
BackendError
¶
Bases: EncodingError
Raised when quantum backend operations fail.
This exception indicates a failure in the underlying quantum computing framework (PennyLane, Qiskit, or Cirq). Common causes include missing backend installations, invalid circuit operations, or backend-specific limitations.
Examples¶
raise BackendError("Qiskit backend not available") Traceback (most recent call last): ... BackendError: Qiskit backend not available
ConvergenceError
¶
ConvergenceError(message: str, iterations: int | None = None, target_tolerance: float | None = None, achieved_tolerance: float | None = None, details: dict | None = None)
Bases: AnalysisError
Raised when iterative computation fails to converge.
This exception indicates that an iterative algorithm did not reach the desired convergence criteria within the allowed iterations or time. Common causes include:
- Insufficient sample count
- Poor initial conditions
- Numerically unstable problem
- Convergence criteria too strict
Parameters¶
message : str
Human-readable description of the convergence failure.
iterations : int, optional
Number of iterations attempted before failure.
target_tolerance : float, optional
The convergence tolerance that was not achieved.
achieved_tolerance : float, optional
The best tolerance achieved before stopping.
details : dict, optional
Additional context about the failure.
Attributes¶
iterations : int or None
Number of iterations attempted.
target_tolerance : float or None
The target convergence tolerance.
achieved_tolerance : float or None
The best tolerance achieved.
details : dict
Additional context about the failure.
Examples¶
>>> raise ConvergenceError(
...     "Expressibility estimation did not converge",
...     iterations=10000,
...     target_tolerance=1e-4,
...     achieved_tolerance=1e-2
... )
EncodingError
¶
Bases: Exception
Base exception for all encoding_atlas errors.
This is the root exception class for the encoding_atlas package. All custom exceptions in this package inherit from this class, allowing users to catch all encoding-related errors with a single except clause.
Parameters¶
message : str
Human-readable description of the error.
*args : tuple
Additional positional arguments passed to Exception.
Examples¶
raise EncodingError("Something went wrong") Traceback (most recent call last): ... EncodingError: Something went wrong
InsufficientSamplesError
¶
InsufficientSamplesError(message: str, requested_samples: int | None = None, minimum_samples: int | None = None, metric: str | None = None, details: dict | None = None)
Bases: AnalysisError
Raised when sample count is too low for reliable results.
This exception indicates that a statistical analysis cannot provide reliable results with the given number of samples. This is distinct from a convergence failure — here, the algorithm recognizes upfront that the sample count is inadequate.
Parameters¶
message : str
Human-readable description of the sampling issue.
requested_samples : int, optional
The number of samples requested by the user.
minimum_samples : int, optional
The minimum number of samples required for reliable results.
metric : str, optional
The metric being computed (e.g., "expressibility").
details : dict, optional
Additional context about the sampling requirements.
Attributes¶
requested_samples : int or None
The number of samples requested.
minimum_samples : int or None
The minimum required samples.
metric : str or None
The metric being computed.
details : dict
Additional context about the failure.
Examples¶
>>> raise InsufficientSamplesError(
...     "Too few samples for expressibility estimation",
...     requested_samples=10,
...     minimum_samples=100,
...     metric="expressibility"
... )
Notes¶
When this exception is raised, the recommended action is to
increase the sample count. The minimum_samples attribute
provides guidance on the minimum viable sample count.
NumericalInstabilityError
¶
NumericalInstabilityError(message: str, value: float | None = None, operation: str | None = None, details: dict | None = None)
Bases: AnalysisError
Raised when numerical instability is detected.
This exception indicates that a computation encountered numerical issues that compromise the accuracy or validity of results. Common causes include:
- Division by near-zero values
- Overflow or underflow in intermediate calculations
- Loss of precision in floating-point operations
- Ill-conditioned matrices
Parameters¶
message : str
Human-readable description of the numerical issue.
value : float, optional
The problematic value that triggered the error.
operation : str, optional
The operation that encountered the issue.
details : dict, optional
Additional context about the numerical issue.
Attributes¶
value : float or None
The problematic value, if applicable.
operation : str or None
The operation that failed, if specified.
details : dict
Additional context about the failure.
Examples¶
>>> raise NumericalInstabilityError(
...     "Division by near-zero in fidelity computation",
...     value=1e-320,
...     operation="compute_fidelity"
... )
Notes¶
This exception is distinct from Python's built-in arithmetic exceptions (ZeroDivisionError, OverflowError) because it catches cases where the computation technically succeeds but produces unreliable results due to floating-point limitations.
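The distinction can be illustrated with a hedged sketch: dividing by a near-zero norm raises no built-in exception, yet the result is unreliable, which is the situation this exception is meant to flag. The `safe_normalize` helper below is hypothetical, not part of the package API (it raises a plain ValueError where the library would raise NumericalInstabilityError):

```python
import numpy as np

def safe_normalize(vec, min_norm=1e-12):
    norm = np.linalg.norm(vec)
    if norm < min_norm:
        # Plain division would "succeed" here but produce garbage amplitudes;
        # detect the instability up front instead of returning bad data.
        raise ValueError(f"near-zero norm {norm:.3e} in normalization")
    return vec / norm

print(safe_normalize(np.array([3.0, 4.0])))  # -> [0.6 0.8]
```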
RegistryError
¶
Bases: EncodingError
Raised when encoding registry operations fail.
This exception indicates a failure in the encoding registration or lookup system. Common causes include duplicate registrations or requests for non-existent encodings.
Examples¶
>>> raise RegistryError("Encoding 'CustomEncoding' not found in registry")
Traceback (most recent call last):
    ...
RegistryError: Encoding 'CustomEncoding' not found in registry
SimulationError
¶
Bases: AnalysisError
Raised when quantum circuit simulation fails.
This exception indicates that a quantum circuit could not be successfully simulated. Common causes include:
- Missing backend dependencies (PennyLane, Qiskit not installed)
- Invalid circuit structure
- Memory limitations for large qubit counts
- Backend-specific errors
Parameters¶
message : str
Human-readable description of the simulation failure.
backend : str, optional
The backend that failed (e.g., "pennylane", "qiskit").
details : dict, optional
Additional context about the failure.
Attributes¶
backend : str or None
The backend that failed, if specified.
details : dict
Additional context about the failure.
Examples¶
raise SimulationError("PennyLane simulation failed: device error") Traceback (most recent call last): ... SimulationError: PennyLane simulation failed: device error
With backend information:
>>> raise SimulationError(
...     "Simulation failed",
...     backend="qiskit",
...     details={"n_qubits": 20, "error": "memory exceeded"}
... )
ValidationError
¶
Bases: EncodingError
Raised when input validation fails.
This exception indicates that input data does not meet the requirements of an encoding or analysis function. Common causes include incorrect shapes, invalid data types, or out-of-range values.
Examples¶
raise ValidationError("Expected 4 features, got 3") Traceback (most recent call last): ... ValidationError: Expected 4 features, got 3