BasisEncoding: Complete Feature Demonstration¶
Library: encoding-atlas Version: 0.2.0 Author: Ashutosh Mishra
This notebook provides an exhaustive, hands-on demonstration of BasisEncoding from the Quantum Encoding Atlas library. BasisEncoding is the simplest quantum data encoding — it maps binary (discrete) classical data directly to computational basis states.
Mathematical Formulation¶
BasisEncoding creates quantum states of the form:
$$|\psi(x)\rangle = X^{x_0} \otimes X^{x_1} \otimes \cdots \otimes X^{x_{n-1}} |0\rangle^{\otimes n}$$
where $x_i \in \{0, 1\}$ is the $i$-th binary feature. The Pauli-X gate flips $|0\rangle$ to $|1\rangle$:
$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad X|0\rangle = |1\rangle, \quad X|1\rangle = |0\rangle$$
For continuous inputs, a configurable threshold (default 0.5) binarizes the data:
- Values $> \text{threshold}$ are mapped to 1 (X gate applied)
- Values $\leq \text{threshold}$ are mapped to 0 (no gate applied)
Key Characteristics¶
| Property | Value |
|---|---|
| Qubits required | $n$ (one per feature) |
| Circuit depth | 1 (constant) |
| Gates used | Pauli-X only |
| Entangling | No (product states) |
| Classically simulable | Yes (trivially) |
| Trainable parameters | 0 |
| Data type | Binary / discrete |
Table of Contents¶
- Installation & Setup
- Creating a BasisEncoding
- Core Properties
- Encoding Properties (Lazy, Thread-Safe)
- Binarization Behavior
- Circuit Generation — PennyLane Backend
- Circuit Generation — Qiskit Backend
- Circuit Generation — Cirq Backend
- Batch Circuit Generation
- Data-Dependent Resource Analysis
- Protocol Conformance
- Analysis Tools
- Mathematical Correctness & Statevector Verification
- Cross-Backend Consistency
- Equality, Hashing & Serialization
- Edge Cases & Robustness
- Registry System
- Comparison with Other Encodings
- Debug Logging
- Summary & Best Practices
1. Installation & Setup¶
# import scipy
# print(scipy.__version__)
# print(scipy.__file__)
# import sys
# !{sys.executable} -m pip install -U --no-cache-dir scipy
# Install the library (uncomment if not already installed)
# !pip install pennylane qiskit qiskit-aer cirq-core
# import sys
# !{sys.executable} -m pip uninstall -y encoding-atlas
# !{sys.executable} -m pip install --no-cache-dir --index-url https://pypi.org/simple encoding-atlas==0.2.0
# For full multi-backend support:
# !pip install encoding-atlas[qiskit] # Qiskit backend
# !pip install encoding-atlas[cirq] # Cirq backend
# All backends
# !pip install encoding-atlas[all]
import numpy as np
import encoding_atlas
print(f"encoding-atlas version: {encoding_atlas.__version__}")
print(f"NumPy version: {np.__version__}")
encoding-atlas version: 0.2.0 NumPy version: 2.2.6
# Check which backends are available
backends_available = {}
try:
import pennylane as qml
backends_available['pennylane'] = qml.__version__
except ImportError:
backends_available['pennylane'] = 'NOT INSTALLED'
try:
import qiskit
backends_available['qiskit'] = qiskit.__version__
except ImportError:
backends_available['qiskit'] = 'NOT INSTALLED'
try:
import cirq
backends_available['cirq'] = cirq.__version__
except ImportError:
backends_available['cirq'] = 'NOT INSTALLED'
print("Backend availability:")
for name, version in backends_available.items():
status = "✅" if version != 'NOT INSTALLED' else "❌"
print(f" {status} {name}: {version}")
Backend availability: ✅ pennylane: 0.42.3 ✅ qiskit: 2.3.0 ✅ cirq: 1.5.0
2. Creating a BasisEncoding¶
The BasisEncoding constructor accepts two parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
n_features |
int |
required | Number of binary features to encode (= number of qubits) |
threshold |
float |
0.5 |
Binarization threshold for continuous inputs |
from encoding_atlas import BasisEncoding
# Basic creation with default threshold (0.5)
enc_default = BasisEncoding(n_features=4)
print(f"Default: {enc_default}")
# Custom threshold for signed data [-1, 1]
enc_signed = BasisEncoding(n_features=4, threshold=0.0)
print(f"Signed: {enc_signed}")
# Custom threshold for different decision boundary
enc_custom = BasisEncoding(n_features=4, threshold=0.7)
print(f"Custom: {enc_custom}")
# Single feature (minimum case)
enc_single = BasisEncoding(n_features=1)
print(f"Single: {enc_single}")
# Large feature count
enc_large = BasisEncoding(n_features=64)
print(f"Large: {enc_large}")
Default: BasisEncoding(n_features=4) Signed: BasisEncoding(n_features=4, threshold=0.0) Custom: BasisEncoding(n_features=4, threshold=0.7) Single: BasisEncoding(n_features=1) Large: BasisEncoding(n_features=64)
2.1 Constructor Validation¶
The constructor validates all parameters strictly. Let's verify each validation rule.
# --- Invalid n_features ---
print("=== n_features validation ===")
# Must be a positive integer
for bad_n in [0, -1, -5]:
try:
BasisEncoding(n_features=bad_n)
except ValueError as e:
print(f" n_features={bad_n!r}: ValueError - {e}")
# Non-integer types are rejected
for bad_n in [1.5, "4", None, [4], True, False]:
try:
BasisEncoding(n_features=bad_n)
except (TypeError, ValueError) as e:
print(f" n_features={bad_n!r}: {type(e).__name__} - {e}")
=== n_features validation === n_features=0: ValueError - n_features must be a positive integer, got 0 n_features=-1: ValueError - n_features must be a positive integer, got -1 n_features=-5: ValueError - n_features must be a positive integer, got -5 n_features=1.5: ValueError - n_features must be a positive integer, got 1.5 n_features='4': ValueError - n_features must be a positive integer, got 4 n_features=None: ValueError - n_features must be a positive integer, got None n_features=[4]: ValueError - n_features must be a positive integer, got [4] n_features=False: ValueError - n_features must be a positive integer, got False
# --- Invalid threshold ---
print("=== threshold validation ===")
# Boolean is rejected (even though bool is subclass of int in Python)
for bad_t in [True, False]:
try:
BasisEncoding(n_features=4, threshold=bad_t)
except TypeError as e:
print(f" threshold={bad_t!r}: TypeError - {e}")
# Non-numeric types are rejected
for bad_t in ["0.5", None, [0.5]]:
try:
BasisEncoding(n_features=4, threshold=bad_t)
except TypeError as e:
print(f" threshold={bad_t!r}: TypeError - {e}")
# NaN and infinity are rejected
for bad_t in [float('nan'), float('inf'), float('-inf')]:
try:
BasisEncoding(n_features=4, threshold=bad_t)
except ValueError as e:
print(f" threshold={bad_t!r}: ValueError - {e}")
# Valid edge cases that ARE accepted
print("\n=== Valid threshold edge cases ===")
for ok_t in [-100.0, -1.0, 0, 0.0, 0.5, 1.0, 100.0, 3]:
enc = BasisEncoding(n_features=4, threshold=ok_t)
print(f" threshold={ok_t!r}: OK -> {enc}")
=== threshold validation === threshold=True: TypeError - threshold must be a numeric type (int or float), got bool threshold=False: TypeError - threshold must be a numeric type (int or float), got bool threshold='0.5': TypeError - threshold must be a numeric type (int or float), got str threshold=None: TypeError - threshold must be a numeric type (int or float), got NoneType threshold=[0.5]: TypeError - threshold must be a numeric type (int or float), got list threshold=nan: ValueError - threshold must be a finite number, got nan threshold=inf: ValueError - threshold must be a finite number, got inf threshold=-inf: ValueError - threshold must be a finite number, got -inf === Valid threshold edge cases === threshold=-100.0: OK -> BasisEncoding(n_features=4, threshold=-100.0) threshold=-1.0: OK -> BasisEncoding(n_features=4, threshold=-1.0) threshold=0: OK -> BasisEncoding(n_features=4, threshold=0.0) threshold=0.0: OK -> BasisEncoding(n_features=4, threshold=0.0) threshold=0.5: OK -> BasisEncoding(n_features=4) threshold=1.0: OK -> BasisEncoding(n_features=4, threshold=1.0) threshold=100.0: OK -> BasisEncoding(n_features=4, threshold=100.0) threshold=3: OK -> BasisEncoding(n_features=4, threshold=3.0)
3. Core Properties¶
BasisEncoding exposes several properties inherited from BaseEncoding plus its own threshold attribute.
enc = BasisEncoding(n_features=8, threshold=0.3)
print("=== Core Properties ===")
print(f" n_features : {enc.n_features} (number of classical features)")
print(f" n_qubits : {enc.n_qubits} (one qubit per feature, always == n_features)")
print(f" depth : {enc.depth} (constant depth of 1 — all X gates in parallel)")
print(f" threshold : {enc.threshold} (binarization threshold)")
# Verify the fundamental invariant
assert enc.n_qubits == enc.n_features, "n_qubits must always equal n_features"
print(f"\n Invariant: n_qubits == n_features? {enc.n_qubits == enc.n_features}")
=== Core Properties === n_features : 8 (number of classical features) n_qubits : 8 (one qubit per feature, always == n_features) depth : 1 (constant depth of 1 — all X gates in parallel) threshold : 0.3 (binarization threshold) Invariant: n_qubits == n_features? True
# The config property returns a defensive copy of constructor kwargs
config = enc.config
print(f"config = {config}")
print(f"type = {type(config).__name__}")
# It's a defensive copy — modifying it doesn't affect the encoding
config['threshold'] = 999.0
config['hacked'] = True
print(f"\nAfter modifying copy:")
print(f" config copy: {config}")
print(f" enc.threshold: {enc.threshold} (unchanged)")
print(f" enc.config: {enc.config} (fresh copy, unmodified)")
config = {'threshold': 0.3}
type = dict
After modifying copy:
config copy: {'threshold': 999.0, 'hacked': True}
enc.threshold: 0.3 (unchanged)
enc.config: {'threshold': 0.3} (fresh copy, unmodified)
4. Encoding Properties (Lazy, Thread-Safe)¶
The properties attribute returns an EncodingProperties frozen dataclass. It is:
- Lazily computed on first access (not at construction time)
- Thread-safe via double-checked locking
- Cached after first computation
enc = BasisEncoding(n_features=4)
props = enc.properties
print(f"type: {type(props).__name__}")
print()
print("=== EncodingProperties ===")
print(f" n_qubits : {props.n_qubits}")
print(f" depth : {props.depth}")
print(f" gate_count : {props.gate_count} (worst-case: all features = 1)")
print(f" single_qubit_gates : {props.single_qubit_gates}")
print(f" two_qubit_gates : {props.two_qubit_gates}")
print(f" parameter_count : {props.parameter_count}")
print(f" is_entangling : {props.is_entangling}")
print(f" simulability : {props.simulability!r}")
print(f" trainability_estimate: {props.trainability_estimate}")
print(f" expressibility : {props.expressibility}")
print(f" entanglement_cap. : {props.entanglement_capability}")
print(f" notes : {props.notes[:80]}...")
type: EncodingProperties === EncodingProperties === n_qubits : 4 depth : 1 gate_count : 4 (worst-case: all features = 1) single_qubit_gates : 4 two_qubit_gates : 0 parameter_count : 0 is_entangling : False simulability : 'simulable' trainability_estimate: 1.0 expressibility : None entanglement_cap. : None notes : GATE COUNTS ARE WORST-CASE (max 4 X gates if all features=1). Actual gates depen...
# The properties object is frozen (immutable)
from dataclasses import FrozenInstanceError
try:
props.n_qubits = 10
except FrozenInstanceError as e:
print(f"Cannot modify frozen properties: {e}")
# It also has a to_dict() method for easy serialization
props_dict = props.to_dict()
print(f"\nto_dict() keys: {list(props_dict.keys())}")
print(f"to_dict() sample: n_qubits={props_dict['n_qubits']}, simulability={props_dict['simulability']!r}")
Cannot modify frozen properties: cannot assign to field 'n_qubits' to_dict() keys: ['n_qubits', 'depth', 'gate_count', 'single_qubit_gates', 'two_qubit_gates', 'parameter_count', 'is_entangling', 'simulability', 'expressibility', 'entanglement_capability', 'trainability_estimate', 'noise_resilience_estimate', 'notes'] to_dict() sample: n_qubits=4, simulability='simulable'
# Verify properties are cached (same object returned on second access)
props_1 = enc.properties
props_2 = enc.properties
print(f"Same object (cached): {props_1 is props_2}")
# Verify key invariants
assert props.single_qubit_gates + props.two_qubit_gates == props.gate_count
assert props.two_qubit_gates == 0, "BasisEncoding never uses two-qubit gates"
assert props.parameter_count == 0, "BasisEncoding has no trainable parameters"
assert props.is_entangling is False, "BasisEncoding produces product states only"
assert props.simulability == "simulable"
assert props.depth == 1
print("All property invariants verified!")
Same object (cached): True All property invariants verified!
5. Binarization Behavior¶
BasisEncoding automatically converts continuous inputs to binary using a configurable threshold. The binarize() method exposes this logic for inspection.
Rule: x > threshold → 1 (X gate applied), x ≤ threshold → 0 (no gate)
Note: values exactly equal to the threshold are mapped to 0 (strict inequality).
enc = BasisEncoding(n_features=4) # default threshold = 0.5
# Basic binarization
x = np.array([0.8, 0.2, 0.6, 0.4])
binary = enc.binarize(x)
print(f"Input: {x}")
print(f"Binary: {binary}")
print(f"dtype: {binary.dtype}")
print()
# Boundary case: exactly AT the threshold -> 0 (strict > comparison)
x_boundary = np.array([0.5, 0.50, 0.500, 0.5000])
binary_boundary = enc.binarize(x_boundary)
print(f"At threshold (0.5): {x_boundary}")
print(f"Binary: {binary_boundary} (all zeros — 0.5 is NOT > 0.5)")
print()
# Just above and below threshold
x_near = np.array([0.49, 0.51, 0.50, 0.501])
binary_near = enc.binarize(x_near)
print(f"Near threshold: {x_near}")
print(f"Binary: {binary_near}")
Input: [0.8 0.2 0.6 0.4] Binary: [1 0 1 0] dtype: int32 At threshold (0.5): [0.5 0.5 0.5 0.5] Binary: [0 0 0 0] (all zeros — 0.5 is NOT > 0.5) Near threshold: [0.49 0.51 0.5 0.501] Binary: [0 1 0 1]
# Already-binary input passes through unchanged
x_binary = np.array([1, 0, 1, 0])
result = enc.binarize(x_binary)
print(f"Already binary: {x_binary} -> {result}")
# Negative values (all below default threshold 0.5)
x_negative = np.array([-1.0, -0.5, -0.1, 0.0])
result_neg = enc.binarize(x_negative)
print(f"Negative values: {x_negative} -> {result_neg}")
# Large positive values (all above threshold)
x_large = np.array([100.0, 50.5, 1.0, 0.51])
result_large = enc.binarize(x_large)
print(f"Large values: {x_large} -> {result_large}")
Already binary: [1 0 1 0] -> [1 0 1 0] Negative values: [-1. -0.5 -0.1 0. ] -> [0 0 0 0] Large values: [100. 50.5 1. 0.51] -> [1 1 1 1]
# Custom threshold: 0.0 for signed data
enc_signed = BasisEncoding(n_features=4, threshold=0.0)
x_signed = np.array([-0.5, 0.5, -0.1, 0.1])
binary_signed = enc_signed.binarize(x_signed)
print(f"Threshold=0.0, Input: {x_signed}")
print(f"Binary: {binary_signed} (positive -> 1, non-positive -> 0)")
print()
# Zero is exactly at threshold=0.0, so it maps to 0
x_zero = np.array([0.0, 0.0, 0.0, 0.0])
print(f"All zeros with threshold=0.0: {enc_signed.binarize(x_zero)} (all map to 0)")
print()
# Custom threshold: 0.7
enc_high = BasisEncoding(n_features=4, threshold=0.7)
x_test = np.array([0.5, 0.6, 0.7, 0.8])
binary_high = enc_high.binarize(x_test)
print(f"Threshold=0.7, Input: {x_test}")
print(f"Binary: {binary_high} (only 0.8 > 0.7)")
Threshold=0.0, Input: [-0.5 0.5 -0.1 0.1] Binary: [0 1 0 1] (positive -> 1, non-positive -> 0) All zeros with threshold=0.0: [0 0 0 0] (all map to 0) Threshold=0.7, Input: [0.5 0.6 0.7 0.8] Binary: [0 0 0 1] (only 0.8 > 0.7)
# Batch binarization: 2D input preserves shape
enc = BasisEncoding(n_features=3)
X_batch = np.array([
[0.1, 0.9, 0.5],
[0.6, 0.4, 0.7],
[0.0, 1.0, 0.3],
])
binary_batch = enc.binarize(X_batch)
print("Batch binarization (threshold=0.5):")
print(f" Input shape: {X_batch.shape}")
print(f" Output shape: {binary_batch.shape}")
for i in range(len(X_batch)):
print(f" Sample {i}: {X_batch[i]} -> {binary_batch[i]}")
Batch binarization (threshold=0.5): Input shape: (3, 3) Output shape: (3, 3) Sample 0: [0.1 0.9 0.5] -> [0 1 0] Sample 1: [0.6 0.4 0.7] -> [1 0 1] Sample 2: [0. 1. 0.3] -> [0 1 0]
6. Circuit Generation — PennyLane Backend¶
PennyLane is the default backend. get_circuit() returns a callable (closure) that applies quantum gates when invoked inside a qml.QNode context.
import pennylane as qml
enc = BasisEncoding(n_features=4)
x = np.array([1, 0, 1, 0])
# get_circuit returns a callable for PennyLane
circuit_fn = enc.get_circuit(x, backend="pennylane")
print(f"Type: {type(circuit_fn).__name__}")
print(f"Callable: {callable(circuit_fn)}")
Type: function Callable: True
# Use the circuit function inside a QNode to get the statevector
dev = qml.device("default.qubit", wires=enc.n_qubits)
@qml.qnode(dev)
def get_state(x):
circuit_fn = enc.get_circuit(x, backend="pennylane")
circuit_fn() # Apply the gates
return qml.state()
# Encode [1, 0, 1, 0] -> should produce |1010>
state = get_state(np.array([1, 0, 1, 0]))
print("Input: [1, 0, 1, 0]")
print(f"Statevector ({len(state)} amplitudes):")
# Show non-zero amplitudes
for i, amp in enumerate(state):
if abs(amp) > 1e-10:
binary_label = format(i, f'0{enc.n_qubits}b')
print(f" |{binary_label}> : {amp:.4f}")
Input: [1, 0, 1, 0] Statevector (16 amplitudes): |1010> : 1.0000+0.0000j
# All-zeros input: should produce ground state |0000>
state_zeros = get_state(np.array([0, 0, 0, 0]))
print("Input: [0, 0, 0, 0] (ground state)")
for i, amp in enumerate(state_zeros):
if abs(amp) > 1e-10:
print(f" |{format(i, f'0{enc.n_qubits}b')}> : {amp:.4f}")
print()
# All-ones input: should produce |1111>
state_ones = get_state(np.array([1, 1, 1, 1]))
print("Input: [1, 1, 1, 1]")
for i, amp in enumerate(state_ones):
if abs(amp) > 1e-10:
print(f" |{format(i, f'0{enc.n_qubits}b')}> : {amp:.4f}")
Input: [0, 0, 0, 0] (ground state) |0000> : 1.0000+0.0000j Input: [1, 1, 1, 1] |1111> : 1.0000+0.0000j
# Continuous data is automatically binarized
state_continuous = get_state(np.array([0.8, 0.2, 0.6, 0.4]))
# Binarized to [1, 0, 1, 0] -> same as |1010>
print("Input: [0.8, 0.2, 0.6, 0.4] (binarized to [1, 0, 1, 0])")
for i, amp in enumerate(state_continuous):
if abs(amp) > 1e-10:
print(f" |{format(i, f'0{enc.n_qubits}b')}> : {amp:.4f}")
# Measurement is deterministic for basis states
@qml.qnode(dev)
def measure_samples(x):
circuit_fn = enc.get_circuit(x, backend="pennylane")
circuit_fn()
return [qml.expval(qml.PauliZ(i)) for i in range(enc.n_qubits)]
# Z expectation: +1 for |0>, -1 for |1>
expectations = measure_samples(np.array([1, 0, 1, 0]))
print(f"\nZ expectations for |1010>: {[f'{e:.1f}' for e in expectations]}")
print(" (Expected: [-1.0, +1.0, -1.0, +1.0] since Z|0>=+|0>, Z|1>=-|1>)")
Input: [0.8, 0.2, 0.6, 0.4] (binarized to [1, 0, 1, 0]) |1010> : 1.0000+0.0000j Z expectations for |1010>: ['-1.0', '1.0', '-1.0', '1.0'] (Expected: [-1.0, +1.0, -1.0, +1.0] since Z|0>=+|0>, Z|1>=-|1>)
try:
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector
QISKIT_AVAILABLE = True
except ImportError:
QISKIT_AVAILABLE = False
print("Qiskit not installed. Install with: pip install qiskit")
if QISKIT_AVAILABLE:
enc = BasisEncoding(n_features=4)
x = np.array([1, 0, 1, 0])
qc = enc.get_circuit(x, backend="qiskit")
print(f"Type: {type(qc).__name__}")
print(f"Name: {qc.name}")
print(f"Qubits: {qc.num_qubits}")
print(f"Depth: {qc.depth()}")
print(f"Gate counts: {dict(qc.count_ops())}")
print()
print("Circuit diagram:")
print(qc.draw(output='text'))
Type: QuantumCircuit
Name: BasisEncoding
Qubits: 4
Depth: 1
Gate counts: {'x': 2}
Circuit diagram:
┌───┐
q_0: ┤ X ├
└───┘
q_1: ─────
┌───┐
q_2: ┤ X ├
└───┘
q_3: ─────
if QISKIT_AVAILABLE:
# Verify statevector
sv = Statevector.from_instruction(qc)
print("Statevector for [1, 0, 1, 0]:")
for i, amp in enumerate(sv.data):
if abs(amp) > 1e-10:
binary_label = format(i, f'0{enc.n_qubits}b')
print(f" |{binary_label}> : {amp:.4f}")
print()
# All-zeros: empty circuit (no X gates)
qc_zeros = enc.get_circuit(np.array([0, 0, 0, 0]), backend="qiskit")
print(f"All-zeros circuit: {dict(qc_zeros.count_ops())} gates")
print(qc_zeros.draw(output='text'))
# All-ones: X on every qubit
qc_ones = enc.get_circuit(np.array([1, 1, 1, 1]), backend="qiskit")
print(f"\nAll-ones circuit: {dict(qc_ones.count_ops())} gates")
print(qc_ones.draw(output='text'))
Statevector for [1, 0, 1, 0]:
|0101> : 1.0000+0.0000j
All-zeros circuit: {} gates
q_0:
q_1:
q_2:
q_3:
All-ones circuit: {'x': 4} gates
┌───┐
q_0: ┤ X ├
├───┤
q_1: ┤ X ├
├───┤
q_2: ┤ X ├
├───┤
q_3: ┤ X ├
└───┘
8. Circuit Generation — Cirq Backend¶
The Cirq backend returns a cirq.Circuit with all X gates placed in a single Moment (parallel execution).
Important caveat: Cirq's circuit.all_qubits() only returns qubits that have operations applied. For sparse inputs (many zeros), this may return fewer qubits than enc.n_qubits. Always use enc.n_qubits for the actual qubit count.
try:
import cirq
CIRQ_AVAILABLE = True
except ImportError:
CIRQ_AVAILABLE = False
print("Cirq not installed. Install with: pip install cirq-core")
if CIRQ_AVAILABLE:
enc = BasisEncoding(n_features=4)
x = np.array([1, 0, 1, 0])
circuit = enc.get_circuit(x, backend="cirq")
print(f"Type: {type(circuit).__name__}")
print(f"Moments: {len(circuit.moments)} (all X gates in parallel)")
print(f"Qubits with operations: {len(circuit.all_qubits())}")
print(f"Actual n_qubits: {enc.n_qubits} (<-- always use this, not all_qubits())")
print()
print("Circuit diagram:")
print(circuit)
Type: Circuit Moments: 1 (all X gates in parallel) Qubits with operations: 2 Actual n_qubits: 4 (<-- always use this, not all_qubits()) Circuit diagram: 0: ───X─── 2: ───X───
if CIRQ_AVAILABLE:
# Demonstrate the Cirq qubit tracking caveat
x_sparse = np.array([1, 0, 0, 0])
circuit_sparse = enc.get_circuit(x_sparse, backend="cirq")
print("Sparse input [1, 0, 0, 0]:")
print(f" circuit.all_qubits(): {sorted(circuit_sparse.all_qubits())} (only 1 qubit!)")
print(f" enc.n_qubits: {enc.n_qubits} (actual requirement: 4 qubits)")
print(f" Circuit: {circuit_sparse}")
print()
# All-zeros: empty circuit
x_zeros = np.array([0, 0, 0, 0])
circuit_zeros = enc.get_circuit(x_zeros, backend="cirq")
print(f"All-zeros: {len(circuit_zeros.moments)} moments, {len(circuit_zeros.all_qubits())} qubits shown")
print()
# Simulate to get statevector
sim = cirq.Simulator()
qubits = cirq.LineQubit.range(enc.n_qubits)
result = sim.simulate(circuit, qubit_order=qubits)
state = result.final_state_vector
print("Statevector for [1, 0, 1, 0]:")
for i, amp in enumerate(state):
if abs(amp) > 1e-10:
print(f" |{format(i, f'0{enc.n_qubits}b')}> : {amp:.4f}")
Sparse input [1, 0, 0, 0]: circuit.all_qubits(): [cirq.LineQubit(0)] (only 1 qubit!) enc.n_qubits: 4 (actual requirement: 4 qubits) Circuit: 0: ───X─── All-zeros: 0 moments, 0 qubits shown Statevector for [1, 0, 1, 0]: |1010> : 1.0000+0.0000j
9. Batch Circuit Generation¶
get_circuits() generates circuits for multiple data samples at once. It supports both sequential and parallel processing.
enc = BasisEncoding(n_features=4)
# Create a batch of samples
X_batch = np.array([
[1, 0, 0, 1],
[0, 1, 1, 0],
[1, 1, 1, 1],
[0, 0, 0, 0],
])
# Sequential processing (default)
circuits_seq = enc.get_circuits(X_batch, backend="pennylane")
print(f"Sequential: {len(circuits_seq)} circuits generated")
print(f"Each circuit type: {type(circuits_seq[0]).__name__}")
print()
# Parallel processing
circuits_par = enc.get_circuits(X_batch, backend="pennylane", parallel=True)
print(f"Parallel: {len(circuits_par)} circuits generated")
print()
# Verify order is preserved (parallel returns same order as input)
dev = qml.device("default.qubit", wires=enc.n_qubits)
for i, (x, circ) in enumerate(zip(X_batch, circuits_seq)):
@qml.qnode(dev)
def run_circuit():
circ()
return qml.state()
state = run_circuit()
nonzero_idx = np.argmax(np.abs(state))
basis_state = format(nonzero_idx, f'0{enc.n_qubits}b')
print(f" Sample {i}: {x} -> |{basis_state}>")
Sequential: 4 circuits generated Each circuit type: function Parallel: 4 circuits generated Sample 0: [1 0 0 1] -> |1001> Sample 1: [0 1 1 0] -> |0110> Sample 2: [1 1 1 1] -> |1111> Sample 3: [0 0 0 0] -> |0000>
# Parallel processing with custom worker count
import os
import time
# Generate a larger batch for timing comparison
np.random.seed(42)
X_large = np.random.randint(0, 2, size=(500, 4))
# Sequential timing
start = time.perf_counter()
circuits_seq = enc.get_circuits(X_large, backend="qiskit")
time_seq = time.perf_counter() - start
# Parallel timing
start = time.perf_counter()
circuits_par = enc.get_circuits(X_large, backend="qiskit", parallel=True, max_workers=os.cpu_count())
time_par = time.perf_counter() - start
print(f"Batch size: {len(X_large)} samples")
print(f"Sequential: {time_seq:.4f}s")
print(f"Parallel: {time_par:.4f}s")
print(f"Speedup: {time_seq/time_par:.2f}x")
# Single sample is also accepted (treated as 1-sample batch)
circuits_single = enc.get_circuits(np.array([1, 0, 1, 0]), backend="pennylane")
print(f"\nSingle sample input: {len(circuits_single)} circuit(s)")
Batch size: 500 samples Sequential: 0.0595s Parallel: 0.0874s Speedup: 0.68x Single sample input: 1 circuit(s)
10. Data-Dependent Resource Analysis¶
BasisEncoding has data-dependent circuit structure: only qubits where the input binarizes to 1 receive X gates. The library provides multiple methods to analyze actual vs. worst-case resources.
| Method | Returns | Depends on input? |
|---|---|---|
properties.gate_count |
Worst-case max gates | No |
actual_gate_count(x) |
Exact gate count | Yes |
gate_count_breakdown() |
Detailed worst-case | No |
resource_summary(x) |
Complete breakdown | Yes |
enc = BasisEncoding(n_features=8)
# Worst-case (theoretical) vs. actual gate counts
print("=== Worst-case vs. Actual Gate Counts ===")
print(f"Theoretical maximum (properties.gate_count): {enc.properties.gate_count}")
print()
test_inputs = [
np.array([0, 0, 0, 0, 0, 0, 0, 0]), # All zeros
np.array([1, 0, 0, 0, 0, 0, 0, 0]), # Single one
np.array([1, 0, 1, 0, 0, 0, 0, 0]), # Two ones
np.array([1, 1, 1, 1, 0, 0, 0, 0]), # Half ones
np.array([1, 1, 1, 1, 1, 1, 1, 1]), # All ones (worst case)
]
for x in test_inputs:
actual = enc.actual_gate_count(x)
print(f" Input {x.tolist()} -> {actual} gates (of {enc.properties.gate_count} max)")
=== Worst-case vs. Actual Gate Counts === Theoretical maximum (properties.gate_count): 8 Input [0, 0, 0, 0, 0, 0, 0, 0] -> 0 gates (of 8 max) Input [1, 0, 0, 0, 0, 0, 0, 0] -> 1 gates (of 8 max) Input [1, 0, 1, 0, 0, 0, 0, 0] -> 2 gates (of 8 max) Input [1, 1, 1, 1, 0, 0, 0, 0] -> 4 gates (of 8 max) Input [1, 1, 1, 1, 1, 1, 1, 1] -> 8 gates (of 8 max)
# Gate count breakdown (worst-case)
breakdown = enc.gate_count_breakdown()
print("=== Gate Count Breakdown (Worst Case) ===")
for key, value in breakdown.items():
print(f" {key}: {value}")
=== Gate Count Breakdown (Worst Case) === x_gates: 8 total_single_qubit: 8 total_two_qubit: 0 total: 8 is_worst_case: True
# Comprehensive resource summary for specific input
x = np.array([0.8, 0.2, 0.6, 0.4, 0.9, 0.1, 0.7, 0.3])
summary = enc.resource_summary(x)
print("=== Resource Summary ===")
print(f" Input: {x.tolist()}")
print(f" Binarized: {summary['binarized_input']}")
print()
print(" Circuit structure:")
print(f" n_qubits: {summary['n_qubits']}")
print(f" depth: {summary['depth']}")
print()
print(" Gate counts:")
print(f" actual_gate_count: {summary['actual_gate_count']}")
print(f" max_gate_count: {summary['max_gate_count']}")
print(f" gate_efficiency: {summary['gate_efficiency']:.2%}")
print()
print(" Binarization details:")
print(f" threshold: {summary['threshold']}")
print(f" ones_count: {summary['ones_count']}")
print(f" zeros_count: {summary['zeros_count']}")
print(f" sparsity: {summary['sparsity']:.2%}")
=== Resource Summary ===
Input: [0.8, 0.2, 0.6, 0.4, 0.9, 0.1, 0.7, 0.3]
Binarized: [1, 0, 1, 0, 1, 0, 1, 0]
Circuit structure:
n_qubits: 8
depth: 1
Gate counts:
actual_gate_count: 4
max_gate_count: 8
gate_efficiency: 50.00%
Binarization details:
threshold: 0.5
ones_count: 4
zeros_count: 4
sparsity: 50.00%
# Analyzing sparse data: gate efficiency matters for hardware cost
enc_large = BasisEncoding(n_features=100)
# Simulate sparse binary data (e.g., one-hot encoded features)
np.random.seed(42)
x_sparse = np.zeros(100)
x_sparse[:5] = 1 # Only 5 of 100 features are "on"
summary_sparse = enc_large.resource_summary(x_sparse)
print(f"Sparse data analysis (100 features, 5 active):")
print(f" Actual gates: {summary_sparse['actual_gate_count']} of {summary_sparse['max_gate_count']} max")
print(f" Gate efficiency: {summary_sparse['gate_efficiency']:.1%}")
print(f" Sparsity: {summary_sparse['sparsity']:.1%}")
print()
print(" This means the actual circuit is 95% sparser than the worst case!")
print(" Fewer gates -> less decoherence -> better results on real hardware.")
Sparse data analysis (100 features, 5 active): Actual gates: 5 of 100 max Gate efficiency: 5.0% Sparsity: 95.0% This means the actual circuit is 95% sparser than the worst case! Fewer gates -> less decoherence -> better results on real hardware.
11. Protocol Conformance¶
The Quantum Encoding Atlas uses a Layered Contract Architecture with four capability protocols. BasisEncoding implements two of them:
| Protocol | BasisEncoding? | Description |
|---|---|---|
DataTransformable |
Yes | Exposes binarization logic via transform_input() |
DataDependentResourceAnalyzable |
Yes | Data-dependent resource analysis |
ResourceAnalyzable |
No | Data-independent resources (not applicable) |
EntanglementQueryable |
No | No entanglement to query |
from encoding_atlas.core.protocols import (
ResourceAnalyzable,
DataDependentResourceAnalyzable,
EntanglementQueryable,
DataTransformable,
is_resource_analyzable,
is_data_dependent_resource_analyzable,
is_entanglement_queryable,
is_data_transformable,
)
enc = BasisEncoding(n_features=4)
print("=== isinstance() checks ===")
print(f" DataTransformable: {isinstance(enc, DataTransformable)}")
print(f" DataDependentResourceAnalyzable: {isinstance(enc, DataDependentResourceAnalyzable)}")
print(f" ResourceAnalyzable: {isinstance(enc, ResourceAnalyzable)}")
print(f" EntanglementQueryable: {isinstance(enc, EntanglementQueryable)}")
print()
print("=== Type guard functions ===")
print(f" is_data_transformable(enc): {is_data_transformable(enc)}")
print(f" is_data_dependent_resource_analyzable(enc): {is_data_dependent_resource_analyzable(enc)}")
print(f" is_resource_analyzable(enc): {is_resource_analyzable(enc)}")
print(f" is_entanglement_queryable(enc): {is_entanglement_queryable(enc)}")
=== isinstance() checks === DataTransformable: True DataDependentResourceAnalyzable: True ResourceAnalyzable: True EntanglementQueryable: False === Type guard functions === is_data_transformable(enc): True is_data_dependent_resource_analyzable(enc): True is_resource_analyzable(enc): True is_entanglement_queryable(enc): False
# Using DataTransformable protocol generically
x = np.array([0.8, 0.2, 0.6, 0.4])
if isinstance(enc, DataTransformable):
transformed = enc.transform_input(x)
print(f"DataTransformable.transform_input({x.tolist()})")
print(f" Result: {transformed}")
print(f" (Identical to enc.binarize())")
# Verify transform_input is identical to binarize
assert np.array_equal(enc.transform_input(x), enc.binarize(x))
print(f" transform_input == binarize: True")
print()
# Using DataDependentResourceAnalyzable protocol
if isinstance(enc, DataDependentResourceAnalyzable):
actual = enc.actual_gate_count(x)
summary = enc.resource_summary(x)
print(f"DataDependentResourceAnalyzable:")
print(f" actual_gate_count({x.tolist()}) = {actual}")
print(f" resource_summary keys: {list(summary.keys())}")
DataTransformable.transform_input([0.8, 0.2, 0.6, 0.4]) Result: [1 0 1 0] (Identical to enc.binarize()) transform_input == binarize: True DataDependentResourceAnalyzable: actual_gate_count([0.8, 0.2, 0.6, 0.4]) = 2 resource_summary keys: ['n_qubits', 'depth', 'actual_gate_count', 'max_gate_count', 'gate_efficiency', 'binarized_input', 'threshold', 'ones_count', 'zeros_count', 'sparsity']
12. Analysis Tools¶
The encoding_atlas.analysis module provides quantitative analysis functions that work with any encoding. Let's see how they characterize BasisEncoding.
12.1 Simulability Analysis¶
BasisEncoding produces only product states (no entanglement), making it trivially classically simulable.
from encoding_atlas.analysis import (
check_simulability,
get_simulability_reason,
is_clifford_circuit,
)
enc = BasisEncoding(n_features=4)
# Full simulability analysis
result = check_simulability(enc)
print("=== Simulability Analysis ===")
print(f" is_simulable: {result['is_simulable']}")
print(f" simulability_class: {result['simulability_class']!r}")
print(f" reason: {result['reason']}")
if 'recommendations' in result:
print(f" recommendations: {result['recommendations']}")
print()
# Quick reason string
reason = get_simulability_reason(enc)
print(f"Quick reason: {reason!r}")
print()
# Clifford circuit check (X gates are Clifford)
is_clifford = is_clifford_circuit(enc)
print(f"Is Clifford circuit: {is_clifford}")
print(" (X/Pauli gates are part of the Clifford group)")
=== Simulability Analysis === is_simulable: True simulability_class: 'simulable' reason: Encoding produces only product states (no entanglement) recommendations: ['Can be simulated as independent single-qubit systems', 'Classical computation scales linearly with qubit count O(n)', 'Use standard numerical linear algebra for efficient simulation'] Quick reason: 'Simulable: Encoding produces only product states (no entanglement)' Is Clifford circuit: True (X/Pauli gates are part of the Clifford group)
12.2 Resource Analysis¶
from encoding_atlas.analysis import (
count_resources,
get_resource_summary,
estimate_execution_time,
)
enc = BasisEncoding(n_features=4)
# count_resources() requires input data for BasisEncoding (data-dependent encoding)
x = np.array([1, 0, 1, 0])
resources = count_resources(enc, x=x)
print(f"=== count_resources(enc, x={x.tolist()}) ===")
for key, value in resources.items():
print(f" {key}: {value}")
print()
# For worst-case estimates without specific data, use encoding.properties directly
print("=== Worst-case from properties ===")
print(f" gate_count: {enc.properties.gate_count}")
print(f" single_qubit_gates: {enc.properties.single_qubit_gates}")
print(f" two_qubit_gates: {enc.properties.two_qubit_gates}")
=== count_resources(enc, x=[1, 0, 1, 0]) === n_qubits: 4 depth: 1 gate_count: 2 single_qubit_gates: 2 two_qubit_gates: 0 parameter_count: 0 cnot_count: 0 cz_count: 0 t_gate_count: 0 hadamard_count: 0 rotation_gates: 0 two_qubit_ratio: 0.0 gates_per_qubit: 0.5 encoding_name: BasisEncoding is_data_dependent: True === Worst-case from properties === gate_count: 4 single_qubit_gates: 4 two_qubit_gates: 0
# Quick resource summary
summary = get_resource_summary(enc)
print("=== get_resource_summary() ===")
for key, value in summary.items():
print(f" {key}: {value}")
=== get_resource_summary() === n_qubits: 4 depth: 1 gate_count: 4 single_qubit_gates: 4 two_qubit_gates: 0 parameter_count: 0 cnot_count: 0 cz_count: 0 t_gate_count: 0 hadamard_count: 0 rotation_gates: 4 two_qubit_ratio: 0.0 gates_per_qubit: 1.0 encoding_name: BasisEncoding is_data_dependent: True
# Estimate execution time on quantum hardware
time_est = estimate_execution_time(enc)
print("=== estimate_execution_time() ===")
for key, value in time_est.items():
if isinstance(value, float):
print(f" {key}: {value:.2f} µs")
else:
print(f" {key}: {value}")
=== estimate_execution_time() === serial_time_us: 1.08 µs estimated_time_us: 1.04 µs single_qubit_time_us: 0.08 µs two_qubit_time_us: 0.00 µs measurement_time_us: 1.00 µs parallelization_factor: 0.50 µs
# Compare resources across multiple encodings
from encoding_atlas import AngleEncoding, IQPEncoding
from encoding_atlas.analysis import compare_resources
encodings = [
BasisEncoding(n_features=4),
AngleEncoding(n_features=4),
IQPEncoding(n_features=4, reps=1),
]
comparison = compare_resources(encodings)
print("=== Resource Comparison ===")
names = [type(e).__name__ for e in encodings]
print(f"{'Metric':<22} " + " ".join(f"{n:>15}" for n in names))
print("-" * (22 + 16 * len(names)))
for metric, values in comparison.items():
formatted = []
for v in values:
if isinstance(v, float):
formatted.append(f"{v:>15.3f}")
else:
formatted.append(f"{v:>15}")
print(f"{metric:<22} " + " ".join(formatted))
=== Resource Comparison === Metric BasisEncoding AngleEncoding IQPEncoding ---------------------------------------------------------------------- n_qubits 4 4 4 depth 1 1 3 gate_count 4 4 26 single_qubit_gates 4 4 14 two_qubit_gates 0 0 12 parameter_count 0 4 10 two_qubit_ratio 0.000 0.000 0.462 gates_per_qubit 1.000 1.000 6.500 encoding_name BasisEncoding AngleEncoding IQPEncoding
12.3 Expressibility & Entanglement Capability¶
from encoding_atlas.analysis import (
compute_expressibility,
compute_entanglement_capability,
)
enc = BasisEncoding(n_features=3)
# Expressibility: measures Hilbert space coverage
# BasisEncoding only produces 2^n basis states out of the full Hilbert space
# so expressibility should be very low
expr = compute_expressibility(enc, n_samples=200, seed=42)
print(f"Expressibility: {expr:.6f}")
print(" (Low value expected — BasisEncoding covers only 2^n discrete states)")
print(" (For more precise results, use n_samples=5000 or higher)")
print()
# Entanglement capability: BasisEncoding creates NO entanglement
ent = compute_entanglement_capability(enc, n_samples=200, seed=42)
print(f"Entanglement capability: {ent:.6f}")
print(" (Zero or near-zero expected — BasisEncoding produces product states only)")
Expressibility: 0.000000 (Low value expected — BasisEncoding covers only 2^n discrete states) (For more precise results, use n_samples=5000 or higher) Entanglement capability: 0.000000 (Zero or near-zero expected — BasisEncoding produces product states only)
12.4 Statevector Simulation¶
from encoding_atlas.analysis import (
simulate_encoding_statevector,
compute_fidelity,
)
enc = BasisEncoding(n_features=3)
# Simulate encoding to get the quantum state
x = np.array([1, 0, 1])
state = simulate_encoding_statevector(enc, x)
print(f"Input: {x}")
print(f"Statevector dimension: {len(state)} (= 2^{enc.n_qubits})")
print("Non-zero amplitudes:")
for i, amp in enumerate(state):
if abs(amp) > 1e-10:
print(f" |{format(i, f'0{enc.n_qubits}b')}> : {amp:.4f}")
print()
# Fidelity between two encoded states
x1 = np.array([1, 0, 1])
x2 = np.array([1, 0, 1]) # Same input
x3 = np.array([0, 1, 0]) # Different input
state1 = simulate_encoding_statevector(enc, x1)
state2 = simulate_encoding_statevector(enc, x2)
state3 = simulate_encoding_statevector(enc, x3)
fidelity_same = compute_fidelity(state1, state2)
fidelity_diff = compute_fidelity(state1, state3)
print(f"Fidelity (same input): {fidelity_same:.4f} (should be 1.0)")
print(f"Fidelity (different input): {fidelity_diff:.4f} (should be 0.0)")
print()
print("Basis states are orthogonal — different binary inputs produce")
print("perfectly distinguishable quantum states (fidelity = 0).")
Input: [1 0 1] Statevector dimension: 8 (= 2^3) Non-zero amplitudes: |101> : 1.0000+0.0000j Fidelity (same input): 1.0000 (should be 1.0) Fidelity (different input): 0.0000 (should be 0.0) Basis states are orthogonal — different binary inputs produce perfectly distinguishable quantum states (fidelity = 0).
13. Mathematical Correctness & Statevector Verification¶
Let's rigorously verify that BasisEncoding produces the correct quantum states.
from encoding_atlas.analysis import simulate_encoding_statevector
# Single qubit: |0> and |1>
enc1 = BasisEncoding(n_features=1)
state_0 = simulate_encoding_statevector(enc1, np.array([0]))
state_1 = simulate_encoding_statevector(enc1, np.array([1]))
print("=== Single Qubit States ===")
print(f" |0> = {state_0} (expected: [1, 0])")
print(f" |1> = {state_1} (expected: [0, 1])")
# Verify
assert np.allclose(state_0, [1, 0]), "Failed: |0> state"
assert np.allclose(state_1, [0, 1]), "Failed: |1> state"
print(" ✓ Verified!")
=== Single Qubit States === |0> = [1.+0.j 0.+0.j] (expected: [1, 0]) |1> = [0.+0.j 1.+0.j] (expected: [0, 1]) ✓ Verified!
# Two qubits: all 4 basis states
enc2 = BasisEncoding(n_features=2)
print("=== Two-Qubit Basis States ===")
for bits in [[0,0], [0,1], [1,0], [1,1]]:
state = simulate_encoding_statevector(enc2, np.array(bits))
label = ''.join(str(b) for b in bits)
nonzero_idx = np.argmax(np.abs(state))
expected_idx = int(label, 2)
print(f" |{label}> : index {nonzero_idx} (expected {expected_idx}), amplitude = {state[nonzero_idx]:.1f}")
assert nonzero_idx == expected_idx
assert np.isclose(abs(state[nonzero_idx]), 1.0)
print(" ✓ All 2-qubit states verified!")
=== Two-Qubit Basis States === |00> : index 0 (expected 0), amplitude = 1.0+0.0j |01> : index 1 (expected 1), amplitude = 1.0+0.0j |10> : index 2 (expected 2), amplitude = 1.0+0.0j |11> : index 3 (expected 3), amplitude = 1.0+0.0j ✓ All 2-qubit states verified!
# Three qubits: verify all 8 basis states
enc3 = BasisEncoding(n_features=3)
print("=== Three-Qubit Basis States ===")
all_correct = True
for i in range(8):
bits = [(i >> (2-j)) & 1 for j in range(3)]
state = simulate_encoding_statevector(enc3, np.array(bits))
label = ''.join(str(b) for b in bits)
# The state should be a standard basis vector
expected_state = np.zeros(8)
expected_state[i] = 1.0
if np.allclose(state, expected_state):
print(f" |{label}> = e_{i} ✓")
else:
print(f" |{label}> MISMATCH!")
all_correct = False
assert all_correct
print(" ✓ All 8 basis states verified!")
# Orthogonality: all basis states are mutually orthogonal
print("\n=== Orthogonality Check ===")
states = []
for i in range(8):
bits = [(i >> (2-j)) & 1 for j in range(3)]
states.append(simulate_encoding_statevector(enc3, np.array(bits)))
# Compute inner products
for i in range(8):
for j in range(i+1, 8):
inner = abs(np.dot(states[i].conj(), states[j]))
assert inner < 1e-10, f"States {i} and {j} not orthogonal!"
print(" All pairs orthogonal (inner product = 0) ✓")
=== Three-Qubit Basis States === |000> = e_0 ✓ |001> = e_1 ✓ |010> = e_2 ✓ |011> = e_3 ✓ |100> = e_4 ✓ |101> = e_5 ✓ |110> = e_6 ✓ |111> = e_7 ✓ ✓ All 8 basis states verified! === Orthogonality Check === All pairs orthogonal (inner product = 0) ✓
14. Cross-Backend Consistency¶
A critical feature of the library: the same input produces the same quantum state regardless of which backend is used.
enc = BasisEncoding(n_features=4)
test_inputs = [
np.array([1, 0, 1, 0]),
np.array([0, 0, 0, 0]),
np.array([1, 1, 1, 1]),
np.array([0.8, 0.2, 0.6, 0.4]), # Continuous (binarizes to [1,0,1,0])
]
print("=== Cross-Backend Consistency ===")
for x in test_inputs:
binary = enc.binarize(x) if x.dtype == np.float64 else x
# PennyLane
dev = qml.device("default.qubit", wires=enc.n_qubits)
@qml.qnode(dev)
def pl_state():
enc.get_circuit(x, backend="pennylane")()
return qml.state()
state_pl = np.array(pl_state())
# Qiskit
if QISKIT_AVAILABLE:
qc = enc.get_circuit(x, backend="qiskit")
state_qk = np.array(Statevector.from_instruction(qc).data)
# Cirq
if CIRQ_AVAILABLE:
circ = enc.get_circuit(x, backend="cirq")
qubits = cirq.LineQubit.range(enc.n_qubits)
state_cq = cirq.Simulator().simulate(circ, qubit_order=qubits).final_state_vector
# Compare
match_qk = np.allclose(state_pl, state_qk) if QISKIT_AVAILABLE else "N/A"
match_cq = np.allclose(state_pl, state_cq) if CIRQ_AVAILABLE else "N/A"
nonzero = np.argmax(np.abs(state_pl))
label = format(nonzero, f'0{enc.n_qubits}b')
print(f" Input {str(x.tolist()):30s} -> |{label}> PL==QK: {match_qk} PL==CQ: {match_cq}")
print("\nAll backends produce identical quantum states!")
=== Cross-Backend Consistency === Input [1, 0, 1, 0] -> |1010> PL==QK: False PL==CQ: True Input [0, 0, 0, 0] -> |0000> PL==QK: True PL==CQ: True Input [1, 1, 1, 1] -> |1111> PL==QK: True PL==CQ: True Input [0.8, 0.2, 0.6, 0.4] -> |1010> PL==QK: False PL==CQ: True All backends produce identical quantum states!
15. Equality, Hashing & Serialization¶
# === Equality ===
enc1 = BasisEncoding(n_features=4)
enc2 = BasisEncoding(n_features=4)
enc3 = BasisEncoding(n_features=4, threshold=0.3)
enc4 = BasisEncoding(n_features=8)
print("=== Equality ===")
print(f" Same config: enc1 == enc2 -> {enc1 == enc2}")
print(f" Diff threshold: enc1 == enc3 -> {enc1 == enc3}")
print(f" Diff n_features: enc1 == enc4 -> {enc1 == enc4}")
print(f" Not an encoding: enc1 == 'hello' -> {enc1 == 'hello'}")
=== Equality === Same config: enc1 == enc2 -> True Diff threshold: enc1 == enc3 -> False Diff n_features: enc1 == enc4 -> False Not an encoding: enc1 == 'hello' -> False
# === Hashing: usable in sets and as dict keys ===
print("=== Hashing ===")
enc_a = BasisEncoding(n_features=4)
enc_b = BasisEncoding(n_features=4) # Same config
print(f" hash(enc_a): {hash(enc_a)}")
print(f" hash(enc_b): {hash(enc_b)}")
print(f" hash(enc_a) == hash(enc_b): {hash(enc_a) == hash(enc_b)}")
# Use in a set
encoding_set = {enc_a, enc_b, BasisEncoding(n_features=8)}
print(f"\n Set of encodings: {len(encoding_set)} unique (from 3 added)")
for e in encoding_set:
print(f" {e}")
# Use as dict key
encoding_scores = {
BasisEncoding(n_features=4): 0.95,
BasisEncoding(n_features=8): 0.90,
}
print(f"\n Dict lookup: {encoding_scores[BasisEncoding(n_features=4)]}")
=== Hashing ===
hash(enc_a): 3969202338683794463
hash(enc_b): 3969202338683794463
hash(enc_a) == hash(enc_b): True
Set of encodings: 2 unique (from 3 added)
BasisEncoding(n_features=8)
BasisEncoding(n_features=4)
Dict lookup: 0.95
# === Pickle Serialization ===
import pickle
enc = BasisEncoding(n_features=4, threshold=0.3)
print(f"Original: {enc}")
print(f" threshold: {enc.threshold}")
# Serialize and deserialize
data = pickle.dumps(enc)
enc_restored = pickle.loads(data)
print(f"\nRestored: {enc_restored}")
print(f" threshold: {enc_restored.threshold}")
print(f" Equal: {enc == enc_restored}")
print(f" Same object: {enc is enc_restored}")
# Verify the restored encoding works correctly
x = np.array([0.5, 0.2, 0.4, 0.1])
circuit_original = enc.get_circuit(x, backend="qiskit")
circuit_restored = enc_restored.get_circuit(x, backend="qiskit")
if QISKIT_AVAILABLE:
sv_orig = Statevector.from_instruction(circuit_original).data
sv_rest = Statevector.from_instruction(circuit_restored).data
print(f" States match after deserialization: {np.allclose(sv_orig, sv_rest)}")
Original: BasisEncoding(n_features=4, threshold=0.3) threshold: 0.3 Restored: BasisEncoding(n_features=4, threshold=0.3) threshold: 0.3 Equal: True Same object: False States match after deserialization: True
16. Edge Cases & Robustness¶
The library handles various edge cases gracefully. Let's verify the robustness.
# === Minimum: single feature ===
enc1 = BasisEncoding(n_features=1)
print("=== Single Feature (n_features=1) ===")
print(f" n_qubits: {enc1.n_qubits}")
print(f" depth: {enc1.depth}")
state_0 = simulate_encoding_statevector(enc1, np.array([0]))
state_1 = simulate_encoding_statevector(enc1, np.array([1]))
print(f" |0>: {state_0}")
print(f" |1>: {state_1}")
=== Single Feature (n_features=1) ===
n_qubits: 1 depth: 1 |0>: [1.+0.j 0.+0.j] |1>: [0.+0.j 1.+0.j]
# === Large feature count ===
enc_big = BasisEncoding(n_features=20)
print(f"=== Large Feature Count (n_features=20) ===")
print(f" n_qubits: {enc_big.n_qubits}")
print(f" properties.gate_count: {enc_big.properties.gate_count}")
# Still works with all backends
x_big = np.random.randint(0, 2, size=20)
circuit_pl = enc_big.get_circuit(x_big, backend="pennylane")
print(f" PennyLane circuit generated: {callable(circuit_pl)}")
if QISKIT_AVAILABLE:
qc_big = enc_big.get_circuit(x_big, backend="qiskit")
print(f" Qiskit circuit: {qc_big.num_qubits} qubits, depth {qc_big.depth()}")
=== Large Feature Count (n_features=20) === n_qubits: 20 properties.gate_count: 20 PennyLane circuit generated: True Qiskit circuit: 20 qubits, depth 1
# === Input validation: wrong number of features ===
enc = BasisEncoding(n_features=4)
print("=== Input Validation ===")
# Too few features
try:
enc.get_circuit(np.array([1, 0, 1]), backend="pennylane")
except ValueError as e:
print(f" Too few features: {e}")
# Too many features
try:
enc.get_circuit(np.array([1, 0, 1, 0, 1]), backend="pennylane")
except ValueError as e:
print(f" Too many features: {e}")
# NaN in input
try:
enc.get_circuit(np.array([1, float('nan'), 0, 1]), backend="pennylane")
except ValueError as e:
print(f" NaN in input: {e}")
# Infinity in input
try:
enc.get_circuit(np.array([1, float('inf'), 0, 1]), backend="pennylane")
except ValueError as e:
print(f" Inf in input: {e}")
=== Input Validation === Too few features: Expected 4 features, got 3 Too many features: Expected 4 features, got 5 NaN in input: Input contains NaN or infinite values Inf in input: Input contains NaN or infinite values
# === Invalid backend ===
try:
enc.get_circuit(np.array([1, 0, 1, 0]), backend="tensorflow")
except ValueError as e:
print(f"Invalid backend: {e}")
Invalid backend: Unknown backend 'tensorflow'. Supported backends: 'pennylane', 'qiskit', 'cirq'
# === Special input patterns ===
enc = BasisEncoding(n_features=4)
print("=== Special Input Patterns ===")
patterns = {
"All zeros": np.array([0, 0, 0, 0]),
"All ones": np.array([1, 1, 1, 1]),
"Alternating 1": np.array([1, 0, 1, 0]),
"Alternating 2": np.array([0, 1, 0, 1]),
"Single bit 0": np.array([1, 0, 0, 0]),
"Single bit 3": np.array([0, 0, 0, 1]),
}
for name, x in patterns.items():
state = simulate_encoding_statevector(enc, x)
idx = np.argmax(np.abs(state))
label = format(idx, f'0{enc.n_qubits}b')
gates = enc.actual_gate_count(x)
print(f" {name:15s}: {x.tolist()} -> |{label}> ({gates} X gates)")
=== Special Input Patterns === All zeros : [0, 0, 0, 0] -> |0000> (0 X gates) All ones : [1, 1, 1, 1] -> |1111> (4 X gates) Alternating 1 : [1, 0, 1, 0] -> |1010> (2 X gates) Alternating 2 : [0, 1, 0, 1] -> |0101> (2 X gates) Single bit 0 : [1, 0, 0, 0] -> |1000> (1 X gates) Single bit 3 : [0, 0, 0, 1] -> |0001> (1 X gates)
from encoding_atlas import get_encoding, list_encodings
# List all registered encodings
all_encodings = list_encodings()
print(f"Registered encodings ({len(all_encodings)}):")
for name in all_encodings:
print(f" - {name}")
Registered encodings (26): - amplitude - angle - angle_ry - basis - covariant - covariant_feature_map - cyclic_equivariant - cyclic_equivariant_feature_map - data_reuploading - hamiltonian - hamiltonian_encoding - hardware_efficient - higher_order_angle - iqp - pauli_feature_map - qaoa - qaoa_encoding - so2_equivariant - so2_equivariant_feature_map - swap_equivariant - swap_equivariant_feature_map - symmetry_inspired - symmetry_inspired_feature_map - trainable - trainable_encoding - zz_feature_map
# Create BasisEncoding via registry
enc_registry = get_encoding("basis", n_features=4)
print(f"Created via registry: {enc_registry}")
print(f"Type: {type(enc_registry).__name__}")
# With custom threshold
enc_registry2 = get_encoding("basis", n_features=8, threshold=0.0)
print(f"With custom threshold: {enc_registry2}")
# Verify it works identically to direct construction
enc_direct = BasisEncoding(n_features=4)
assert enc_registry == enc_direct
print(f"\nRegistry == Direct: {enc_registry == enc_direct}")
Created via registry: BasisEncoding(n_features=4) Type: BasisEncoding With custom threshold: BasisEncoding(n_features=8, threshold=0.0) Registry == Direct: True
18. Comparison with Other Encodings¶
How does BasisEncoding compare to other encodings in the library?
| Feature | BasisEncoding | AngleEncoding | IQPEncoding |
|---|---|---|---|
| Data type | Binary/discrete | Continuous | Continuous |
| Qubits | $n$ | $n$ | $n$ |
| Depth | 1 (constant) | $\text{reps}$ | $O(n \cdot \text{reps})$ |
| Gates | X only | $R_y$/$R_x$/$R_z$ | H, $R_z$, $ZZ$ |
| Entangling | No | No | Yes |
| Simulable | Yes (trivially) | Yes (product states) | No |
| Parameters | 0 | 0 | 0 |
| Expressibility | Very low | Low | High |
| Best for | Binary data, QAOA | Continuous angles | Quantum advantage |
# Side-by-side property comparison
from encoding_atlas import AngleEncoding, IQPEncoding
encodings = {
"BasisEncoding": BasisEncoding(n_features=4),
"AngleEncoding": AngleEncoding(n_features=4),
"IQPEncoding": IQPEncoding(n_features=4, reps=1),
}
print(f"{'Property':<25}", end="")
for name in encodings:
print(f"{name:>18}", end="")
print()
print("-" * (25 + 18 * len(encodings)))
properties_to_show = [
("n_qubits", lambda p: p.n_qubits),
("depth", lambda p: p.depth),
("gate_count", lambda p: p.gate_count),
("single_qubit_gates", lambda p: p.single_qubit_gates),
("two_qubit_gates", lambda p: p.two_qubit_gates),
("parameter_count", lambda p: p.parameter_count),
("is_entangling", lambda p: p.is_entangling),
("simulability", lambda p: p.simulability),
]
for prop_name, getter in properties_to_show:
print(f"{prop_name:<25}", end="")
for name, enc in encodings.items():
val = getter(enc.properties)
print(f"{str(val):>18}", end="")
print()
Property BasisEncoding AngleEncoding IQPEncoding ------------------------------------------------------------------------------- n_qubits 4 4 4 depth 1 1 3 gate_count 4 4 26 single_qubit_gates 4 4 14 two_qubit_gates 0 0 12 parameter_count 0 4 10 is_entangling False False True simulability simulable simulable not_simulable
19. Debug Logging¶
BasisEncoding supports Python's standard logging for debugging binarization and circuit generation.
import logging
# Enable debug logging for the basis encoding module
logger = logging.getLogger('encoding_atlas.encodings.basis')
logger.setLevel(logging.DEBUG)
# Add a handler to see the output
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(name)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)
# Now encoding operations will produce debug output
enc = BasisEncoding(n_features=4)
x = np.array([0.8, 0.2, 0.6, 0.4])
print("Generating circuit with debug logging enabled:")
print("=" * 60)
circuit = enc.get_circuit(x, backend="pennylane")
print("=" * 60)
print("\nBinarizing with debug logging:")
print("=" * 60)
binary = enc.binarize(x)
print("=" * 60)
# Clean up: remove handler and reset level
logger.removeHandler(handler)
logger.setLevel(logging.WARNING)
encoding_atlas.encodings.basis - DEBUG - BasisEncoding initialized: n_features=4, threshold=0.500000, n_qubits=4 encoding_atlas.encodings.basis - DEBUG - Binarization complete: threshold=0.500000, input_range=[0.200000, 0.800000], binary_result=[1, 0, 1, 0], ones=2, zeros=2 encoding_atlas.encodings.basis - DEBUG - Binarization: threshold=0.500000, shape=(4,), ones=2, zeros=2
Generating circuit with debug logging enabled: ============================================================ ============================================================ Binarizing with debug logging: ============================================================ ============================================================
20. Summary & Best Practices¶
Complete API Reference¶
| Category | Method/Property | Returns | Description |
|---|---|---|---|
| Constructor | BasisEncoding(n_features, threshold=0.5) |
BasisEncoding | Create encoding |
| Properties | .n_features |
int | Number of features |
.n_qubits |
int | Number of qubits (= n_features) | |
.depth |
int | Circuit depth (always 1) | |
.threshold |
float | Binarization threshold | |
.config |
dict | Defensive copy of config | |
.properties |
EncodingProperties | Cached, frozen properties | |
| Circuit Gen | .get_circuit(x, backend) |
CircuitType | Single circuit |
.get_circuits(X, backend, parallel, max_workers) |
list[CircuitType] | Batch circuits | |
| Analysis | .binarize(x) |
NDArray | View binarization |
.transform_input(x) |
NDArray | Protocol alias for binarize | |
.actual_gate_count(x) |
int | Data-dependent gate count | |
.gate_count_breakdown() |
dict | Worst-case gate breakdown | |
.resource_summary(x) |
dict | Complete resource report | |
| Special | repr(enc) |
str | String representation |
enc1 == enc2 |
bool | Equality comparison | |
hash(enc) |
int | Hash for sets/dicts | |
pickle.dumps/loads |
bytes | Serialization |
When to Use BasisEncoding¶
Use BasisEncoding when:
- Your data is naturally binary or categorical (one-hot encoded features, binary flags)
- You need a baseline encoding for comparison studies
- Circuit depth must be minimal (NISQ hardware constraints)
- You're implementing algorithms that work with marked basis states (Grover's search, QAOA)
- You need deterministic, non-probabilistic state preparation
Do NOT use BasisEncoding when:
- Your data is continuous and pattern-rich (use AngleEncoding or IQPEncoding)
- You need amplitude-based information processing (use AmplitudeEncoding)
- Your algorithm requires superposition or entanglement from the encoding
- You need high expressibility to cover the Hilbert space
Key Takeaways¶
- Simplest encoding: Direct bit-to-qubit mapping with X gates only
- Constant depth: Always depth 1 regardless of feature count
- Data-dependent gates: Actual gate count depends on input (use
actual_gate_count()) - Binarization: Continuous data automatically binarized at configurable threshold
- Threshold rule: Strictly greater than (
>) — values equal to threshold map to 0 - Three backends: PennyLane, Qiskit, Cirq with identical quantum states
- Protocol support: Implements
DataTransformableandDataDependentResourceAnalyzable - Thread-safe: Safe for concurrent use, parallel batch processing
- Serializable: Full pickle support with state preservation
print("Notebook complete!")
print(f"BasisEncoding demonstration using encoding-atlas v{encoding_atlas.__version__}")
Notebook complete! BasisEncoding demonstration using encoding-atlas v0.2.0