Encoding Atlas — Guide & Recommendation System¶
This notebook provides a comprehensive walkthrough of the encoding_atlas.guide module, which helps users select the best quantum data encoding for their problem.
The guide module has three components:
| Component | Purpose |
|---|---|
| rules.py | Knowledge base of 16 encodings with hard constraints and soft tags |
| recommender.py | Two-phase recommendation engine (hard filter → soft scoring) |
| decision_tree.py | Deterministic, interpretable decision tree for encoding selection |
Table of Contents
- Setup & Imports
- The Knowledge Base — ENCODING_RULES
- Hard Constraint Filtering — get_matching_encodings()
- The Recommender — recommend_encoding()
- The Decision Tree — EncodingDecisionTree
- All 16 Encodings: Reachability Proof
- Decision Priority Hierarchy
- Parameter Deep-Dives
- Edge Cases & Robustness
- Recommender vs Decision Tree Comparison
- Real-World Scenarios
- Confidence Analysis
- Connecting Recommendations to Actual Encodings
- Summary
1. Setup & Imports ¶
# Core guide components
from encoding_atlas.guide import (
recommend_encoding,
Recommendation,
EncodingDecisionTree,
ENCODING_RULES,
get_matching_encodings,
)
# Internal helpers (for educational deep-dives)
from encoding_atlas.guide.rules import _passes_hard_constraints, EncodingRule
from encoding_atlas.guide.recommender import (
_compute_score,
_score_to_confidence,
_generate_explanation,
)
# Registry for connecting recommendations to actual encoding classes
from encoding_atlas import get_encoding, list_encodings
import math
print("All imports successful!")
print(f"Number of encodings in knowledge base: {len(ENCODING_RULES)}")
print(f"Encodings in registry: {len(list_encodings())}")
All imports successful!
Number of encodings in knowledge base: 16
Encodings in registry: 26
2. The Knowledge Base — ENCODING_RULES ¶
The knowledge base is a dictionary mapping 16 canonical encoding names to their rule entries. Each entry is an EncodingRule TypedDict with 11 fields that describe when an encoding should (or shouldn't) be used.
# All 16 canonical encoding names
print("All 16 encodings in the knowledge base:")
print("=" * 50)
for i, name in enumerate(sorted(ENCODING_RULES.keys()), 1):
print(f" {i:2d}. {name}")
assert len(ENCODING_RULES) == 16, "Expected exactly 16 encodings"
All 16 encodings in the knowledge base:
==================================================
   1. amplitude
   2. angle
   3. basis
   4. cyclic_equivariant
   5. data_reuploading
   6. hamiltonian
   7. hardware_efficient
   8. higher_order_angle
   9. iqp
  10. pauli_feature_map
  11. qaoa
  12. so2_equivariant
  13. swap_equivariant
  14. symmetry_inspired
  15. trainable
  16. zz_feature_map
2.1 EncodingRule Schema¶
Each rule entry contains these 11 fields:
# Show the schema (field names and types)
print("EncodingRule TypedDict fields:")
print("=" * 60)
for field, field_type in EncodingRule.__annotations__.items():
print(f" {field:25s} : {field_type}")
EncodingRule TypedDict fields:
============================================================
best_for : ForwardRef('list[str]')
avoid_when : ForwardRef('list[str]')
max_features : ForwardRef('int | None')
simulable : ForwardRef('bool')
requires_data_type : ForwardRef('list[str] | None')
requires_symmetry : ForwardRef('str | None')
requires_n_features : ForwardRef('int | None')
requires_even_features : ForwardRef('bool')
requires_trainable : ForwardRef('bool')
qubit_scaling : ForwardRef("Literal['linear', 'logarithmic']")
circuit_depth : ForwardRef("Literal['constant', 'shallow', 'moderate', 'deep']")
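For orientation, the printed annotations correspond to a TypedDict along these lines — a reconstruction from the output above, not the module source (the field comments are interpretive):
from typing import Literal, TypedDict

class EncodingRuleSketch(TypedDict):
    """Sketch of the EncodingRule schema, rebuilt from its annotations."""
    best_for: list[str]                    # soft tags the encoding suits
    avoid_when: list[str]                  # soft tags that argue against it
    max_features: int | None               # hard upper bound on n_features
    simulable: bool                        # classically simulable circuit family
    requires_data_type: list[str] | None   # e.g. ['binary', 'discrete']
    requires_symmetry: str | None          # e.g. 'cyclic', 'rotation'
    requires_n_features: int | None        # exact feature count, if any
    requires_even_features: bool           # n_features must be even
    requires_trainable: bool               # only eligible when trainable=True
    qubit_scaling: Literal["linear", "logarithmic"]
    circuit_depth: Literal["constant", "shallow", "moderate", "deep"]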
# Inspect a specific encoding's complete rule entry
example_name = "iqp"
print(f"Complete rule entry for '{example_name}':")
print("=" * 60)
for key, value in ENCODING_RULES[example_name].items():
print(f" {key:25s} = {value}")
Complete rule entry for 'iqp':
============================================================
  best_for                  = ['quantum_advantage', 'expressibility', 'kernel_methods']
  avoid_when                = ['many_features', 'noisy_hardware', 'nisq_hardware']
  max_features              = 12
  simulable                 = False
  requires_data_type        = None
  requires_symmetry         = None
  requires_n_features       = None
  requires_even_features    = False
  requires_trainable        = False
  qubit_scaling             = linear
  circuit_depth             = moderate
2.2 Exploring Encoding Categories¶
Let's categorize all 16 encodings by their properties.
# Group by simulability
simulable = [n for n, r in ENCODING_RULES.items() if r["simulable"]]
non_simulable = [n for n, r in ENCODING_RULES.items() if not r["simulable"]]
print("Classically simulable encodings:")
for name in sorted(simulable):
print(f" - {name}")
print(f"\nNon-simulable encodings ({len(non_simulable)}):")
for name in sorted(non_simulable):
print(f" - {name}")
Classically simulable encodings:
  - angle
  - basis
  - higher_order_angle

Non-simulable encodings (13):
  - amplitude
  - cyclic_equivariant
  - data_reuploading
  - hamiltonian
  - hardware_efficient
  - iqp
  - pauli_feature_map
  - qaoa
  - so2_equivariant
  - swap_equivariant
  - symmetry_inspired
  - trainable
  - zz_feature_map
# Group by circuit depth
print("Encodings by circuit depth:")
print("=" * 50)
for depth in ["constant", "shallow", "moderate", "deep"]:
names = sorted(n for n, r in ENCODING_RULES.items() if r["circuit_depth"] == depth)
print(f"\n {depth.upper()}:")
for name in names:
print(f" - {name}")
Encodings by circuit depth:
==================================================
CONSTANT:
- angle
- basis
SHALLOW:
- hardware_efficient
- higher_order_angle
MODERATE:
- cyclic_equivariant
- iqp
- pauli_feature_map
- qaoa
- so2_equivariant
- swap_equivariant
- symmetry_inspired
- trainable
- zz_feature_map
DEEP:
- amplitude
- data_reuploading
- hamiltonian
# Group by qubit scaling
print("Encodings by qubit scaling:")
print("=" * 50)
for scaling in ["linear", "logarithmic"]:
names = sorted(n for n, r in ENCODING_RULES.items() if r["qubit_scaling"] == scaling)
print(f"\n {scaling.upper()} (n_qubits ~ {'n_features' if scaling == 'linear' else 'log2(n_features)'}):")
for name in names:
print(f" - {name}")
Encodings by qubit scaling:
==================================================
LINEAR (n_qubits ~ n_features):
- angle
- basis
- cyclic_equivariant
- data_reuploading
- hamiltonian
- hardware_efficient
- higher_order_angle
- iqp
- pauli_feature_map
- qaoa
- so2_equivariant
- swap_equivariant
- symmetry_inspired
- trainable
- zz_feature_map
LOGARITHMIC (n_qubits ~ log2(n_features)):
- amplitude
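To make the two scaling regimes concrete, here is a quick comparison for 20 features; the ceil(log2 n) formula for the logarithmic case is an assumption consistent with the amplitude-encoding explanations shown in section 4.12:
# Qubits needed for 20 features under each scaling rule
n = 20
print(f"linear      : {n} qubits")
print(f"logarithmic : {math.ceil(math.log2(n))} qubits")  # 2**5 = 32 >= 20 amplitudes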
# Encodings with hard constraints
print("Encodings with HARD constraints:")
print("=" * 60)
for name, rules in sorted(ENCODING_RULES.items()):
constraints = []
if rules["requires_data_type"] is not None:
constraints.append(f"data_type in {rules['requires_data_type']}")
if rules["requires_symmetry"] is not None:
constraints.append(f"symmetry = '{rules['requires_symmetry']}'")
if rules["requires_n_features"] is not None:
constraints.append(f"n_features = {rules['requires_n_features']}")
if rules["requires_even_features"]:
constraints.append("n_features must be even")
if rules["requires_trainable"]:
constraints.append("trainable = True")
if rules["max_features"] is not None:
constraints.append(f"max_features = {rules['max_features']}")
if constraints:
print(f"\n {name}:")
for c in constraints:
print(f" - {c}")
# Encodings without any hard constraints
unconstrained = [
name for name, rules in ENCODING_RULES.items()
if rules["requires_data_type"] is None
and rules["requires_symmetry"] is None
and rules["requires_n_features"] is None
and not rules["requires_even_features"]
and not rules["requires_trainable"]
and rules["max_features"] is None
]
print(f"\nEncodings with NO hard constraints (always eligible): {sorted(unconstrained)}")
Encodings with HARD constraints:
============================================================
basis:
- data_type in ['binary', 'discrete']
cyclic_equivariant:
- symmetry = 'cyclic'
data_reuploading:
- max_features = 8
higher_order_angle:
- max_features = 10
iqp:
- max_features = 12
pauli_feature_map:
- max_features = 12
so2_equivariant:
- symmetry = 'rotation'
- n_features = 2
- max_features = 2
swap_equivariant:
- symmetry = 'permutation_pairs'
- n_features must be even
symmetry_inspired:
- symmetry = 'general'
trainable:
- trainable = True
zz_feature_map:
- max_features = 12
Encodings with NO hard constraints (always eligible): ['amplitude', 'angle', 'hamiltonian', 'hardware_efficient', 'qaoa']
# All best_for tags across all encodings
all_best_for_tags = set()
for rules in ENCODING_RULES.values():
all_best_for_tags.update(rules["best_for"])
print(f"All unique best_for tags ({len(all_best_for_tags)}):")
for tag in sorted(all_best_for_tags):
owners = [n for n, r in ENCODING_RULES.items() if tag in r["best_for"]]
print(f" {tag:30s} -> {owners}")
All unique best_for tags (40):
  2d_rotation                    -> ['so2_equivariant']
  balanced                       -> ['zz_feature_map']
  binary_data                    -> ['basis']
  combinatorial                  -> ['basis', 'qaoa']
  compression                    -> ['amplitude']
  custom_pauli                   -> ['pauli_feature_map']
  cyclic_symmetry                -> ['cyclic_equivariant']
  exponential_compression        -> ['amplitude']
  expressibility                 -> ['iqp', 'data_reuploading', 'hamiltonian']
  feature_interactions           -> ['higher_order_angle', 'pauli_feature_map']
  graph_optimization             -> ['qaoa']
  heuristic_symmetry             -> ['symmetry_inspired']
  inductive_bias                 -> ['symmetry_inspired']
  kernel_methods                 -> ['iqp', 'zz_feature_map', 'pauli_feature_map']
  many_features                  -> ['amplitude']
  native_gates                   -> ['hardware_efficient']
  nisq_hardware                  -> ['angle', 'hardware_efficient']
  noise_resilience               -> ['hardware_efficient']
  optimization                   -> ['trainable']
  paired_features                -> ['swap_equivariant']
  periodic_data                  -> ['cyclic_equivariant']
  permutation_pairs              -> ['swap_equivariant']
  physics_simulation             -> ['hamiltonian']
  polynomial_features            -> ['higher_order_angle']
  product_states                 -> ['angle', 'higher_order_angle']
  qaoa_structure                 -> ['qaoa']
  qnn                            -> ['trainable']
  quantum_advantage              -> ['iqp']
  research                       -> ['pauli_feature_map']
  rigorous_equivariance          -> ['so2_equivariant', 'cyclic_equivariant', 'swap_equivariant']
  rotation_equivariance          -> ['so2_equivariant']
  simplicity                     -> ['angle', 'basis']
  speed                          -> ['angle', 'basis']
  standard_benchmark             -> ['zz_feature_map']
  symmetry_general               -> ['symmetry_inspired']
  task_specific                  -> ['trainable']
  time_evolution                 -> ['hamiltonian']
  time_series                    -> ['data_reuploading']
  trainability                   -> ['data_reuploading', 'trainable']
  universal_approximation        -> ['data_reuploading']
# All avoid_when tags across all encodings
all_avoid_tags = set()
for rules in ENCODING_RULES.values():
all_avoid_tags.update(rules["avoid_when"])
print(f"All unique avoid_when tags ({len(all_avoid_tags)}):")
for tag in sorted(all_avoid_tags):
owners = [n for n, r in ENCODING_RULES.items() if tag in r["avoid_when"]]
print(f" {tag:30s} -> {owners}")
All unique avoid_when tags (20):
  continuous_data                -> ['basis']
  continuous_features_only       -> ['qaoa']
  feature_interactions           -> ['angle', 'basis']
  limited_depth                  -> ['data_reuploading']
  many_features                  -> ['higher_order_angle', 'iqp', 'zz_feature_map', 'so2_equivariant']
  need_entanglement              -> ['angle', 'basis', 'higher_order_angle']
  nisq_hardware                  -> ['iqp', 'data_reuploading', 'amplitude', 'hamiltonian']
  no_optimization_budget         -> ['trainable']
  noisy_hardware                 -> ['iqp']
  non_2d_data                    -> ['so2_equivariant']
  non_paired_data                -> ['swap_equivariant']
  non_periodic_data              -> ['cyclic_equivariant']
  odd_features                   -> ['swap_equivariant']
  quantum_advantage              -> ['angle', 'higher_order_angle', 'hardware_efficient']
  rigorous_equivariance          -> ['symmetry_inspired']
  shallow_circuits               -> ['amplitude']
  simplicity                     -> ['pauli_feature_map', 'hamiltonian', 'trainable']
  simulator_only                 -> ['hardware_efficient']
  speed                          -> ['data_reuploading', 'amplitude', 'qaoa', 'hamiltonian', 'symmetry_inspired']
  very_noisy_hardware            -> ['zz_feature_map', 'pauli_feature_map']
3. Hard Constraint Filtering — get_matching_encodings() ¶
The get_matching_encodings() function performs a two-phase check:
- Hard filter — eliminates encodings whose structural preconditions fail
- Soft match — among survivors, selects those whose best_for tags overlap with requirements and whose avoid_when tags don't overlap with constraints
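Read literally, the two phases suggest logic along the lines of this sketch (parameter names mirror the public calls used below; the real implementation may differ in details such as result ordering):
def get_matching_encodings_sketch(requirements, constraints=None, data_type="continuous",
                                  symmetry=None, n_features=None, trainable=False):
    """Two-phase sketch: hard filter, then best_for/avoid_when tag matching."""
    constraints = constraints or []
    matches = []
    for name, rules in ENCODING_RULES.items():
        # Phase 1: hard filter on structural preconditions
        if not _passes_hard_constraints(rules, data_type=data_type, symmetry=symmetry,
                                        n_features=n_features, trainable=trainable):
            continue
        # Phase 2a: at least one requirement tag must appear in best_for
        if not any(tag in rules["best_for"] for tag in requirements):
            continue
        # Phase 2b: no constraint tag may appear in avoid_when
        if any(tag in rules["avoid_when"] for tag in constraints):
            continue
        matches.append(name)
    return matches

# The sketch should agree with the real function (up to ordering)
assert set(get_matching_encodings_sketch(["kernel_methods"])) == set(get_matching_encodings(["kernel_methods"]))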
3.1 Basic Tag-Based Matching¶
# Find encodings best for speed
speed_encodings = get_matching_encodings(["speed"])
print(f"Encodings good for 'speed': {speed_encodings}")
assert "angle" in speed_encodings, "angle should match 'speed'"
Encodings good for 'speed': ['angle']
# Find encodings for kernel methods
kernel_encodings = get_matching_encodings(["kernel_methods"])
print(f"Encodings good for 'kernel_methods': {kernel_encodings}")
assert "iqp" in kernel_encodings
assert "zz_feature_map" in kernel_encodings
assert "pauli_feature_map" in kernel_encodings
Encodings good for 'kernel_methods': ['iqp', 'zz_feature_map', 'pauli_feature_map']
# Multiple requirement tags (any match works)
multi_tag = get_matching_encodings(["speed", "quantum_advantage"])
print(f"Encodings matching 'speed' OR 'quantum_advantage': {multi_tag}")
Encodings matching 'speed' OR 'quantum_advantage': ['angle', 'iqp']
3.2 Adding Soft Constraints (avoid_when filtering)¶
# Without constraints: IQP matches quantum_advantage
without_constraint = get_matching_encodings(["quantum_advantage"])
print(f"Without constraints: {without_constraint}")
assert "iqp" in without_constraint
# With constraints: filter out encodings that should be avoided on noisy hardware
with_constraint = get_matching_encodings(
["quantum_advantage"],
constraints=["noisy_hardware"],
)
print(f"With 'noisy_hardware' constraint: {with_constraint}")
assert "iqp" not in with_constraint, "IQP should be filtered out (avoid_when has 'noisy_hardware')"
Without constraints: ['iqp']
With 'noisy_hardware' constraint: []
# Constraints filter out encodings whose avoid_when overlaps
# Data reuploading avoids 'speed', so constraining on 'speed' excludes it
trainability_fast = get_matching_encodings(
["trainability"],
constraints=["speed"],
)
print(f"Trainable but not slow: {trainability_fast}")
assert "data_reuploading" not in trainability_fast, (
"data_reuploading avoid_when has 'speed'"
)
Trainable but not slow: []
3.3 Hard Constraint Parameters¶
# data_type filtering: binary data eliminates most encodings
binary_speed = get_matching_encodings(["speed"], data_type="binary")
print(f"Speed encodings for binary data: {binary_speed}")
assert "basis" in binary_speed, "basis handles binary data and has 'speed' tag"
Speed encodings for binary data: ['angle', 'basis']
# Symmetry filtering: only symmetry-aware encodings pass
cyclic_match = get_matching_encodings(
["cyclic_symmetry"],
symmetry="cyclic",
)
print(f"Cyclic symmetry encodings: {cyclic_match}")
assert "cyclic_equivariant" in cyclic_match
Cyclic symmetry encodings: ['cyclic_equivariant']
# n_features filtering: large feature counts eliminate bounded encodings
large_features = get_matching_encodings(
["expressibility"],
n_features=20,
)
print(f"Expressive encodings for 20 features: {large_features}")
# IQP has max_features=12, so it's excluded
assert "iqp" not in large_features, "IQP max_features=12, should be excluded for 20 features"
# data_reuploading has max_features=8
assert "data_reuploading" not in large_features
Expressive encodings for 20 features: ['hamiltonian']
# trainable filtering: trainable encoding requires opt-in
without_trainable = get_matching_encodings(["trainability"])
print(f"Trainability without trainable=True: {without_trainable}")
assert "trainable" not in without_trainable, "'trainable' encoding should not appear without opt-in"
with_trainable = get_matching_encodings(["trainability"], trainable=True)
print(f"Trainability with trainable=True: {with_trainable}")
assert "trainable" in with_trainable, "'trainable' encoding should appear with opt-in"
Trainability without trainable=True: ['data_reuploading']
Trainability with trainable=True: ['data_reuploading', 'trainable']
3.4 Empty and Non-existent Tags¶
# Empty requirements -> no matches
empty = get_matching_encodings([])
print(f"Empty requirements: {empty}")
assert empty == [], "No requirements should return empty list"
# Non-existent tag -> no matches
nonexistent = get_matching_encodings(["this_tag_does_not_exist"])
print(f"Non-existent tag: {nonexistent}")
assert nonexistent == [], "Unknown tags should return empty list"
Empty requirements: []
Non-existent tag: []
3.5 Direct Hard Constraint Checking with _passes_hard_constraints()¶
The internal _passes_hard_constraints() function checks whether a specific encoding's rule entry satisfies the user's structural requirements. It returns True if all 6 hard constraints pass.
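Its behaviour in the cells below is consistent with a checker of this shape — an assumed sketch, not the module source:
def passes_hard_constraints_sketch(rules, *, data_type="continuous", symmetry=None,
                                   n_features=None, trainable=False):
    """Six hard checks; feature-count checks are skipped when n_features is None."""
    if rules["requires_data_type"] is not None and data_type not in rules["requires_data_type"]:
        return False  # 1. required data type (e.g. basis: binary/discrete)
    if rules["requires_symmetry"] is not None and symmetry != rules["requires_symmetry"]:
        return False  # 2. required symmetry (e.g. cyclic_equivariant: 'cyclic')
    if rules["requires_trainable"] and not trainable:
        return False  # 3. trainable opt-in
    if n_features is not None:
        if rules["requires_n_features"] is not None and n_features != rules["requires_n_features"]:
            return False  # 4. exact feature count (so2_equivariant: 2)
        if rules["requires_even_features"] and n_features % 2 != 0:
            return False  # 5. even feature count (swap_equivariant)
        if rules["max_features"] is not None and n_features > rules["max_features"]:
            return False  # 6. feature-count upper bound
    return True

# Sanity checks mirroring the cells below
assert passes_hard_constraints_sketch(ENCODING_RULES["basis"], data_type="binary") is True
assert passes_hard_constraints_sketch(ENCODING_RULES["iqp"], n_features=13) is False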
# Basis encoding requires binary or discrete data
basis_rules = ENCODING_RULES["basis"]
print(f"Basis with binary data: {_passes_hard_constraints(basis_rules, data_type='binary')}")
print(f"Basis with discrete data: {_passes_hard_constraints(basis_rules, data_type='discrete')}")
print(f"Basis with continuous data: {_passes_hard_constraints(basis_rules, data_type='continuous')}")
assert _passes_hard_constraints(basis_rules, data_type="binary") is True
assert _passes_hard_constraints(basis_rules, data_type="discrete") is True
assert _passes_hard_constraints(basis_rules, data_type="continuous") is False
Basis with binary data: True
Basis with discrete data: True
Basis with continuous data: False
# SO2 equivariant: requires exactly 2 features AND rotation symmetry
so2_rules = ENCODING_RULES["so2_equivariant"]
print(f"SO2 with 2 features + rotation: {_passes_hard_constraints(so2_rules, n_features=2, symmetry='rotation')}")
print(f"SO2 with 3 features + rotation: {_passes_hard_constraints(so2_rules, n_features=3, symmetry='rotation')}")
print(f"SO2 with 2 features + no symmetry: {_passes_hard_constraints(so2_rules, n_features=2, symmetry=None)}")
print(f"SO2 with 2 features + cyclic: {_passes_hard_constraints(so2_rules, n_features=2, symmetry='cyclic')}")
assert _passes_hard_constraints(so2_rules, n_features=2, symmetry="rotation") is True
assert _passes_hard_constraints(so2_rules, n_features=3, symmetry="rotation") is False
assert _passes_hard_constraints(so2_rules, n_features=2, symmetry=None) is False
assert _passes_hard_constraints(so2_rules, n_features=2, symmetry="cyclic") is False
SO2 with 2 features + rotation: True
SO2 with 3 features + rotation: False
SO2 with 2 features + no symmetry: False
SO2 with 2 features + cyclic: False
# Swap equivariant: requires even features AND permutation_pairs symmetry
swap_rules = ENCODING_RULES["swap_equivariant"]
print(f"Swap with 4 features + permutation_pairs: {_passes_hard_constraints(swap_rules, n_features=4, symmetry='permutation_pairs')}")
print(f"Swap with 3 features + permutation_pairs: {_passes_hard_constraints(swap_rules, n_features=3, symmetry='permutation_pairs')}")
print(f"Swap with 6 features + permutation_pairs: {_passes_hard_constraints(swap_rules, n_features=6, symmetry='permutation_pairs')}")
assert _passes_hard_constraints(swap_rules, n_features=4, symmetry="permutation_pairs") is True
assert _passes_hard_constraints(swap_rules, n_features=3, symmetry="permutation_pairs") is False # odd!
assert _passes_hard_constraints(swap_rules, n_features=6, symmetry="permutation_pairs") is True
Swap with 4 features + permutation_pairs: True
Swap with 3 features + permutation_pairs: False
Swap with 6 features + permutation_pairs: True
# When n_features is None, feature-count checks are skipped
print(f"SO2 with n_features=None + rotation: {_passes_hard_constraints(so2_rules, n_features=None, symmetry='rotation')}")
assert _passes_hard_constraints(so2_rules, n_features=None, symmetry="rotation") is True
# Max features check
iqp_rules = ENCODING_RULES["iqp"]
print(f"IQP max_features={iqp_rules['max_features']}")
print(f"IQP with 12 features: {_passes_hard_constraints(iqp_rules, n_features=12)}")
print(f"IQP with 13 features: {_passes_hard_constraints(iqp_rules, n_features=13)}")
assert _passes_hard_constraints(iqp_rules, n_features=12) is True
assert _passes_hard_constraints(iqp_rules, n_features=13) is False
SO2 with n_features=None + rotation: True
IQP max_features=12
IQP with 12 features: True
IQP with 13 features: False
# Comprehensive: check which of the 16 encodings pass the default hard constraints (continuous, no symmetry, not trainable)
print("Encodings that pass default hard constraints (continuous, no symmetry, not trainable):")
for name, rules in sorted(ENCODING_RULES.items()):
passes = _passes_hard_constraints(rules, n_features=4, data_type="continuous", symmetry=None, trainable=False)
status = "PASS" if passes else "FAIL"
print(f" {name:25s} -> {status}")
Encodings that pass default hard constraints (continuous, no symmetry, not trainable):
  amplitude                 -> PASS
  angle                     -> PASS
  basis                     -> FAIL
  cyclic_equivariant        -> FAIL
  data_reuploading          -> PASS
  hamiltonian               -> PASS
  hardware_efficient        -> PASS
  higher_order_angle        -> PASS
  iqp                       -> PASS
  pauli_feature_map         -> PASS
  qaoa                      -> PASS
  so2_equivariant           -> FAIL
  swap_equivariant          -> FAIL
  symmetry_inspired         -> FAIL
  trainable                 -> FAIL
  zz_feature_map            -> PASS
4. The Recommender — recommend_encoding() ¶
The recommend_encoding() function is the main public API. It takes up to 10 parameters and returns a Recommendation dataclass with:
- encoding_name: The top recommended encoding
- explanation: Human-readable rationale
- alternatives: Up to 3 runner-up encoding names
- confidence: Score in [0, 1] indicating certainty
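These fields are consistent with a plain dataclass of this shape (a sketch; the defaults are assumptions):
from dataclasses import dataclass, field

@dataclass
class RecommendationSketch:
    encoding_name: str                                     # top pick
    explanation: str                                       # human-readable rationale
    alternatives: list[str] = field(default_factory=list)  # up to 3 runner-ups
    confidence: float = 0.0                                # certainty in [0, 1]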
4.1 Basic Usage¶
# Simplest call: just n_features
rec = recommend_encoding(n_features=4)
print(f"Recommendation for 4 features (all defaults):")
print(f" Encoding: {rec.encoding_name}")
print(f" Explanation: {rec.explanation}")
print(f" Alternatives: {rec.alternatives}")
print(f" Confidence: {rec.confidence:.3f}")
# Verify the result is valid
assert rec.encoding_name in ENCODING_RULES
assert isinstance(rec.explanation, str) and len(rec.explanation) > 0
assert len(rec.alternatives) <= 3
assert 0.0 <= rec.confidence <= 1.0
assert rec.encoding_name not in rec.alternatives
Recommendation for 4 features (all defaults):
  Encoding: iqp
  Explanation: IQP encoding creates highly entangled states with provable classical simulation hardness, well-suited for kernel methods
  Alternatives: ['data_reuploading', 'zz_feature_map', 'pauli_feature_map']
  Confidence: 0.740
# Using all 5 original positional parameters (backward compatibility)
rec = recommend_encoding(4, 500, "classification", "simulator", "accuracy")
print(f"5-arg call: {rec.encoding_name} (confidence: {rec.confidence:.3f})")
assert rec.encoding_name in ENCODING_RULES
5-arg call: iqp (confidence: 0.740)
# The Recommendation dataclass fields
rec = recommend_encoding(n_features=6, priority="accuracy", hardware="ibm")
print(f"Recommendation object type: {type(rec).__name__}")
print(f" encoding_name : {rec.encoding_name!r}")
print(f" explanation : {rec.explanation!r}")
print(f" alternatives : {rec.alternatives!r}")
print(f" confidence : {rec.confidence!r}")
Recommendation object type: Recommendation
  encoding_name : 'zz_feature_map'
  explanation   : 'ZZ Feature Map provides standard pairwise feature interactions via (pi-x_i)(pi-x_j) phase encoding for kernel methods'
  alternatives  : ['iqp', 'pauli_feature_map', 'angle']
  confidence    : 0.63
4.2 All 10 Parameters Explained¶
The recommender accepts 10 parameters:
| # | Parameter | Type | Default | Description |
|---|---|---|---|---|
| 1 | n_features | int | (required) | Number of input features |
| 2 | n_samples | int | 500 | Number of training samples |
| 3 | task | str | "classification" | ML task type |
| 4 | hardware | str | "simulator" | Target hardware |
| 5 | priority | str | "accuracy" | Optimization priority |
| 6 | data_type | str | "continuous" | Nature of input features (keyword-only) |
| 7 | symmetry | str \| None | None | Known data symmetry (keyword-only) |
| 8 | trainable | bool | False | Learnable parameters desired (keyword-only) |
| 9 | problem_structure | str \| None | None | Domain structure (keyword-only) |
| 10 | feature_interactions | str \| None | None | Desired interaction type (keyword-only) |
# Demonstrate every parameter
rec = recommend_encoding(
n_features=6,
n_samples=1000,
task="classification",
hardware="ibm",
priority="noise_resilience",
data_type="continuous",
symmetry=None,
trainable=False,
problem_structure=None,
feature_interactions=None,
)
print(f"Full parameter call: {rec.encoding_name}")
print(f" Explanation: {rec.explanation}")
print(f" Confidence: {rec.confidence:.3f}")
Full parameter call: hardware_efficient
  Explanation: Hardware-efficient encoding minimises gate decomposition overhead on real quantum devices
  Confidence: 0.650
4.3 Priority Parameter¶
# Each priority value steers the recommendation differently
print("Effect of priority parameter (n_features=4):")
print("=" * 60)
for priority in ["accuracy", "speed", "noise_resilience", "trainability"]:
rec = recommend_encoding(n_features=4, priority=priority)
print(f" {priority:20s} -> {rec.encoding_name:25s} (confidence: {rec.confidence:.3f})")
Effect of priority parameter (n_features=4):
============================================================
  accuracy             -> iqp                       (confidence: 0.740)
  speed                -> angle                     (confidence: 0.600)
  noise_resilience     -> hardware_efficient        (confidence: 0.600)
  trainability         -> data_reuploading          (confidence: 0.565)
# Speed -> angle encoding
rec_speed = recommend_encoding(n_features=4, priority="speed")
assert rec_speed.encoding_name == "angle", f"Expected 'angle' for speed, got '{rec_speed.encoding_name}'"
# Noise resilience -> hardware_efficient
rec_noise = recommend_encoding(n_features=4, priority="noise_resilience")
assert rec_noise.encoding_name == "hardware_efficient", f"Expected 'hardware_efficient', got '{rec_noise.encoding_name}'"
# Trainability -> data_reuploading
rec_train = recommend_encoding(n_features=4, priority="trainability")
assert rec_train.encoding_name == "data_reuploading", f"Expected 'data_reuploading', got '{rec_train.encoding_name}'"
print("All priority-based recommendations correct!")
All priority-based recommendations correct!
4.4 Data Type Parameter¶
# Binary data -> basis encoding
rec_binary = recommend_encoding(n_features=4, data_type="binary")
print(f"Binary data: {rec_binary.encoding_name} (confidence: {rec_binary.confidence:.3f})")
assert rec_binary.encoding_name == "basis"
assert rec_binary.confidence >= 0.75, "Binary->basis should have high confidence"
# Discrete data -> also basis encoding
rec_discrete = recommend_encoding(n_features=4, data_type="discrete")
print(f"Discrete data: {rec_discrete.encoding_name} (confidence: {rec_discrete.confidence:.3f})")
assert rec_discrete.encoding_name == "basis"
# Continuous data (default) -> depends on other params
rec_continuous = recommend_encoding(n_features=4, data_type="continuous")
print(f"Continuous data: {rec_continuous.encoding_name} (confidence: {rec_continuous.confidence:.3f})")
Binary data: basis (confidence: 0.850)
Discrete data: basis (confidence: 0.850)
Continuous data: iqp (confidence: 0.740)
4.5 Symmetry Parameter¶
# All 4 symmetry options + None
print("Effect of symmetry parameter:")
print("=" * 70)
symmetry_cases = [
{"symmetry": "rotation", "n_features": 2},
{"symmetry": "cyclic", "n_features": 4},
{"symmetry": "permutation_pairs", "n_features": 4},
{"symmetry": "general", "n_features": 4},
{"symmetry": None, "n_features": 4},
]
for case in symmetry_cases:
rec = recommend_encoding(**case)
print(f" symmetry={str(case['symmetry']):20s} n_features={case['n_features']} -> {rec.encoding_name:25s} (confidence: {rec.confidence:.3f})")
Effect of symmetry parameter:
======================================================================
  symmetry=rotation             n_features=2 -> so2_equivariant           (confidence: 0.866)
  symmetry=cyclic               n_features=4 -> cyclic_equivariant        (confidence: 0.800)
  symmetry=permutation_pairs    n_features=4 -> swap_equivariant          (confidence: 0.800)
  symmetry=general              n_features=4 -> symmetry_inspired         (confidence: 0.800)
  symmetry=None                 n_features=4 -> iqp                       (confidence: 0.740)
# Verify each symmetry maps to its encoding
assert recommend_encoding(n_features=2, symmetry="rotation").encoding_name == "so2_equivariant"
assert recommend_encoding(n_features=4, symmetry="cyclic").encoding_name == "cyclic_equivariant"
assert recommend_encoding(n_features=4, symmetry="permutation_pairs").encoding_name == "swap_equivariant"
assert recommend_encoding(n_features=4, symmetry="general").encoding_name == "symmetry_inspired"
print("All symmetry-based recommendations correct!")
All symmetry-based recommendations correct!
4.6 Trainable Parameter¶
# trainable=True -> trainable encoding
rec_trainable = recommend_encoding(n_features=4, trainable=True)
print(f"trainable=True: {rec_trainable.encoding_name} (confidence: {rec_trainable.confidence:.3f})")
assert rec_trainable.encoding_name == "trainable"
# trainable=False (default) -> trainable encoding never appears
rec_not_trainable = recommend_encoding(n_features=4, trainable=False)
print(f"trainable=False: {rec_not_trainable.encoding_name}")
assert rec_not_trainable.encoding_name != "trainable"
assert "trainable" not in rec_not_trainable.alternatives
trainable=True: trainable (confidence: 0.750)
trainable=False: iqp
4.7 Problem Structure Parameter¶
# Combinatorial -> QAOA
rec_comb = recommend_encoding(n_features=4, problem_structure="combinatorial")
print(f"combinatorial: {rec_comb.encoding_name} (confidence: {rec_comb.confidence:.3f})")
assert rec_comb.encoding_name == "qaoa"
# Physics simulation -> Hamiltonian
rec_phys = recommend_encoding(n_features=4, problem_structure="physics_simulation")
print(f"physics_simulation: {rec_phys.encoding_name} (confidence: {rec_phys.confidence:.3f})")
assert rec_phys.encoding_name == "hamiltonian"
# Time series -> no dedicated problem_structure rule; handled as a scoring bonus
# (data_reuploading's best_for includes the 'time_series' tag)
rec_ts = recommend_encoding(n_features=4, problem_structure="time_series")
print(f"time_series: {rec_ts.encoding_name} (confidence: {rec_ts.confidence:.3f})")
assert rec_ts.encoding_name in ENCODING_RULES
combinatorial: qaoa (confidence: 0.710)
physics_simulation: hamiltonian (confidence: 0.810)
time_series: data_reuploading (confidence: 0.868)
4.8 Feature Interactions Parameter¶
# Polynomial -> higher_order_angle
rec_poly = recommend_encoding(n_features=4, feature_interactions="polynomial")
print(f"polynomial: {rec_poly.encoding_name} (confidence: {rec_poly.confidence:.3f})")
assert rec_poly.encoding_name == "higher_order_angle"
# Custom Pauli -> pauli_feature_map
rec_pauli = recommend_encoding(n_features=4, feature_interactions="custom_pauli")
print(f"custom_pauli: {rec_pauli.encoding_name} (confidence: {rec_pauli.confidence:.3f})")
assert rec_pauli.encoding_name == "pauli_feature_map"
polynomial: higher_order_angle (confidence: 0.730)
custom_pauli: pauli_feature_map (confidence: 0.830)
4.9 Hardware Parameter¶
# Hardware affects scoring: real hardware penalizes deep circuits
print("Effect of hardware parameter (n_features=4, priority='accuracy'):")
print("=" * 60)
for hw in ["simulator", "ibm", "ionq", "google"]:
rec = recommend_encoding(n_features=4, hardware=hw)
print(f" {hw:15s} -> {rec.encoding_name:25s} (confidence: {rec.confidence:.3f})")
Effect of hardware parameter (n_features=4, priority='accuracy'):
============================================================
  simulator       -> iqp                       (confidence: 0.740)
  ibm             -> iqp                       (confidence: 0.660)
  ionq            -> iqp                       (confidence: 0.660)
  google          -> iqp                       (confidence: 0.660)
4.10 Feature Count and Accuracy Fallback¶
When priority='accuracy' and no other strong signal is present, the feature count determines the recommendation, as sketched in code after this list:
- ≤ 4 features → IQP
- 5-8 features → ZZ Feature Map
- > 8 features → Amplitude
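The fallback is simple enough to restate directly — a sketch inferred from the sweep below, not the recommender source:
def accuracy_fallback_sketch(n_features):
    """Feature-count fallback used when priority='accuracy' dominates."""
    if n_features <= 4:
        return "iqp"
    if n_features <= 8:
        return "zz_feature_map"
    return "amplitude"

# Spot-check the boundaries against the real recommender
for n in [4, 5, 8, 9]:
    assert recommend_encoding(n_features=n, priority="accuracy").encoding_name == accuracy_fallback_sketch(n)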
print("Accuracy-priority feature count fallback:")
print("=" * 60)
for n in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16, 32, 64, 100]:
rec = recommend_encoding(n_features=n, priority="accuracy")
print(f" n_features={n:3d} -> {rec.encoding_name:20s} (confidence: {rec.confidence:.3f})")
Accuracy-priority feature count fallback:
============================================================
  n_features=  1 -> iqp                  (confidence: 0.740)
  n_features=  2 -> iqp                  (confidence: 0.740)
  n_features=  3 -> iqp                  (confidence: 0.740)
  n_features=  4 -> iqp                  (confidence: 0.740)
  n_features=  5 -> zz_feature_map       (confidence: 0.630)
  n_features=  6 -> zz_feature_map       (confidence: 0.630)
  n_features=  7 -> zz_feature_map       (confidence: 0.630)
  n_features=  8 -> zz_feature_map       (confidence: 0.630)
  n_features=  9 -> amplitude            (confidence: 0.635)
  n_features= 10 -> amplitude            (confidence: 0.635)
  n_features= 16 -> amplitude            (confidence: 0.635)
  n_features= 32 -> amplitude            (confidence: 0.635)
  n_features= 64 -> amplitude            (confidence: 0.635)
  n_features=100 -> amplitude            (confidence: 0.635)
# Verify the accuracy fallback ranges
assert recommend_encoding(n_features=4, priority="accuracy").encoding_name == "iqp"
assert recommend_encoding(n_features=6, priority="accuracy").encoding_name == "zz_feature_map"
assert recommend_encoding(n_features=16, priority="accuracy").encoding_name == "amplitude"
print("Accuracy fallback ranges verified!")
Accuracy fallback ranges verified!
4.11 Alternatives¶
# Alternatives are the next 3 best-scoring encodings
rec = recommend_encoding(n_features=4, priority="accuracy")
print(f"Primary: {rec.encoding_name}")
print(f"Alternatives ({len(rec.alternatives)}):")
for i, alt in enumerate(rec.alternatives, 1):
print(f" {i}. {alt}")
# Alternatives are always valid encodings
for alt in rec.alternatives:
assert alt in ENCODING_RULES, f"Alternative '{alt}' not in ENCODING_RULES"
# Primary is never in alternatives
assert rec.encoding_name not in rec.alternatives
# At most 3 alternatives
assert len(rec.alternatives) <= 3
Primary: iqp
Alternatives (3):
  1. data_reuploading
  2. zz_feature_map
  3. pauli_feature_map
# All alternatives must pass the same hard constraints
test_params = [
{"n_features": 4, "data_type": "binary"},
{"n_features": 2, "symmetry": "rotation"},
{"n_features": 4, "symmetry": "cyclic"},
{"n_features": 4, "trainable": True},
]
for params in test_params:
rec = recommend_encoding(**params)
for alt in rec.alternatives:
rules = ENCODING_RULES[alt]
assert _passes_hard_constraints(
rules,
n_features=params.get("n_features"),
data_type=params.get("data_type", "continuous"),
symmetry=params.get("symmetry"),
trainable=params.get("trainable", False),
), f"Alternative '{alt}' violates hard constraints for params {params}"
print("All alternatives pass hard constraint validation!")
All alternatives pass hard constraint validation!
4.12 Explanations¶
# Every encoding has a meaningful explanation
print("Explanations for all 16 encodings:")
print("=" * 80)
# Trigger params that make each encoding the primary recommendation
trigger_params = {
"angle": dict(n_features=4, priority="speed"),
"basis": dict(n_features=4, data_type="binary"),
"higher_order_angle": dict(n_features=4, feature_interactions="polynomial"),
"iqp": dict(n_features=4, priority="accuracy"),
"zz_feature_map": dict(n_features=6, priority="accuracy"),
"pauli_feature_map": dict(n_features=4, feature_interactions="custom_pauli"),
"data_reuploading": dict(n_features=4, priority="trainability"),
"hardware_efficient": dict(n_features=4, priority="noise_resilience"),
"amplitude": dict(n_features=16, priority="accuracy"),
"qaoa": dict(n_features=4, problem_structure="combinatorial"),
"hamiltonian": dict(n_features=4, problem_structure="physics_simulation"),
"trainable": dict(n_features=4, trainable=True),
"symmetry_inspired": dict(n_features=4, symmetry="general"),
"so2_equivariant": dict(n_features=2, symmetry="rotation"),
"cyclic_equivariant": dict(n_features=4, symmetry="cyclic"),
"swap_equivariant": dict(n_features=4, symmetry="permutation_pairs"),
}
for name, params in sorted(trigger_params.items()):
rec = recommend_encoding(**params)
assert rec.encoding_name == name, f"Expected '{name}' but got '{rec.encoding_name}'"
print(f"\n {name}:")
print(f" {rec.explanation}")
Explanations for all 16 encodings:
================================================================================
amplitude:
Amplitude encoding provides exponential compression (4 qubits for 16 features)
angle:
Angle encoding provides O(1) depth with simple rotations, ideal for speed
basis:
Basis encoding directly maps binary/discrete features to computational basis states
cyclic_equivariant:
Cyclic equivariant encoding guarantees rigorous Z_n cyclic shift symmetry with ring-topology circuits
data_reuploading:
Data re-uploading achieves universal approximation capability through repeated data encoding with entanglement layers
hamiltonian:
Hamiltonian encoding applies Trotterised time evolution under a data-dependent Hamiltonian for physics-inspired ML
hardware_efficient:
Hardware-efficient encoding minimises gate decomposition overhead on real quantum devices
higher_order_angle:
Higher-order angle encoding captures polynomial feature interactions (order-k products) without entanglement
iqp:
IQP encoding creates highly entangled states with provable classical simulation hardness, well-suited for kernel methods
pauli_feature_map:
Pauli Feature Map enables configurable Pauli-string rotation structures for custom feature interactions
qaoa:
QAOA-inspired encoding uses cost-mixer layer structure suited for combinatorial and graph-structured problems
so2_equivariant:
SO(2) equivariant encoding guarantees mathematically rigorous 2D rotational equivariance for the 2-feature input
swap_equivariant:
Swap equivariant encoding guarantees rigorous S_2 pair-swap symmetry over feature pairs
symmetry_inspired:
Symmetry-inspired encoding provides a heuristic symmetry-aware inductive bias for the given problem
trainable:
Trainable encoding interleaves data rotations with learnable parameter layers for task-specific optimisation
zz_feature_map:
ZZ Feature Map provides standard pairwise feature interactions via (pi-x_i)(pi-x_j) phase encoding for kernel methods
# Amplitude encoding explanation includes qubit count
rec_amp = recommend_encoding(n_features=16, priority="accuracy")
print(f"Amplitude explanation: {rec_amp.explanation}")
assert "4 qubits" in rec_amp.explanation, "Should mention 4 qubits for 16 features"
assert "16 features" in rec_amp.explanation, "Should mention 16 features"
rec_amp32 = recommend_encoding(n_features=32, priority="accuracy")
print(f"Amplitude (32 features): {rec_amp32.explanation}")
assert "5 qubits" in rec_amp32.explanation, "Should mention 5 qubits for 32 features"
Amplitude explanation: Amplitude encoding provides exponential compression (4 qubits for 16 features)
Amplitude (32 features): Amplitude encoding provides exponential compression (5 qubits for 32 features)
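The quoted qubit counts match ceil(log2(n_features)), the standard qubit requirement for holding n_features amplitudes — a quick consistency check (assuming that is the formula the explanation generator uses):
# ceil(log2(n)) qubits suffice to store n amplitudes
for n in [16, 32, 100]:
    print(f"n_features={n:3d} -> {math.ceil(math.log2(n))} qubits")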
4.13 Fallback Behavior¶
When no encoding passes all hard constraints, the recommender falls back to angle encoding with low confidence.
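Before walking through why the fallback is unreachable, here is a direct check of the guard condition — a sketch of the logic, not the recommender source:
# Sketch: the recommender's guard. If the hard filter left no candidates,
# recommend_encoding() would fall back to angle with confidence 0.3.
candidates = [
    name for name, rules in ENCODING_RULES.items()
    if _passes_hard_constraints(rules, n_features=4, data_type="continuous")
]
print(f"Candidates surviving the default hard filter: {len(candidates)}")
assert "angle" in candidates  # angle always survives, so the guard never fires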
# Can anything trigger the fallback?
# - binary data + rotation symmetry + 2 features: basis passes (it ignores
#   symmetry) and so2_equivariant passes (it ignores data_type) -> no fallback.
# - symmetry='rotation' + n_features=3: so2_equivariant fails, but encodings
#   with no symmetry requirement never check symmetry, so they still pass.
# For the fallback to trigger, ALL 16 encodings must fail their hard
# constraints - impossible today, since angle has no hard constraints at all.
# The fallback is a safety net for future constraint expansions.
# We can verify the fallback exists by checking the code path:
print("The fallback returns angle encoding with confidence 0.3.")
print("Since 'angle' has no hard constraints, it always passes,")
print("making the fallback unreachable in the current rule set.")
print("This is by design - it's a safety net for extensibility.")
# Let's verify angle always passes
angle_rules = ENCODING_RULES["angle"]
assert angle_rules["requires_data_type"] is None
assert angle_rules["requires_symmetry"] is None
assert angle_rules["requires_n_features"] is None
assert angle_rules["requires_even_features"] is False
assert angle_rules["requires_trainable"] is False
assert angle_rules["max_features"] is None
print("\nAngle encoding has zero hard constraints -> always eligible.")
The fallback returns angle encoding with confidence 0.3.
Since 'angle' has no hard constraints, it always passes,
making the fallback unreachable in the current rule set.
This is by design - it's a safety net for extensibility.

Angle encoding has zero hard constraints -> always eligible.
5. The Decision Tree — EncodingDecisionTree ¶
The decision tree provides a deterministic, interpretable encoding selection path. Unlike the recommender (which scores and ranks), the tree follows a single path and returns exactly one encoding.
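The tree is a nested dict of {'question': ..., 'options': {...}} nodes whose leaves are encoding-name strings (inspected in the cells below), so traversal reduces to a loop like this sketch; the mapping from decide()'s keyword arguments to option keys is the part the real method adds:
def walk_tree_sketch(node, choose):
    """Follow one root-to-leaf path; `choose` maps a question to an option key."""
    while isinstance(node, dict):
        node = node["options"][choose(node["question"])]
    return node  # leaf: an encoding name

# Answer every question with a fixed choice (option keys as printed in 5.1)
answers = {
    "What is your data type?": "continuous",
    "Does your data have a known symmetry?": "none",
    "Do you want trainable encoding parameters?": "no",
    "What is the problem structure?": "none / general",
    "Do you need specific feature interactions?": "none",
    "What is your optimisation priority?": "speed",
}
print(walk_tree_sketch(EncodingDecisionTree().tree, lambda q: answers[q]))  # -> angle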
5.1 Tree Construction and Structure¶
tree = EncodingDecisionTree()
# The tree is a nested dictionary
print(f"Tree type: {type(tree.tree).__name__}")
print(f"Root question: {tree.tree['question']}")
print(f"Root options: {list(tree.tree['options'].keys())}")
Tree type: dict
Root question: What is your data type?
Root options: ['binary', 'discrete', 'continuous']
# Recursive tree visualization
def print_tree(node, indent=0, prefix=""):
"""Pretty-print the decision tree."""
if isinstance(node, str):
print(f"{' ' * indent}{prefix}-> [{node}]")
return
if isinstance(node, dict) and "question" in node:
print(f"{' ' * indent}{prefix}{node['question']}")
for option_name, child in node["options"].items():
print_tree(child, indent + 4, f"{option_name}: ")
print("Complete Decision Tree:")
print("=" * 80)
print_tree(tree.tree)
Complete Decision Tree:
================================================================================
What is your data type?
binary: -> [basis]
discrete: -> [basis]
continuous: Does your data have a known symmetry?
rotation (2D, n_features=2): -> [so2_equivariant]
cyclic: -> [cyclic_equivariant]
permutation_pairs (even n_features): -> [swap_equivariant]
general (heuristic): -> [symmetry_inspired]
none: Do you want trainable encoding parameters?
yes: -> [trainable]
no: What is the problem structure?
combinatorial / graph: -> [qaoa]
physics simulation: -> [hamiltonian]
time_series / periodic: -> [data_reuploading]
none / general: Do you need specific feature interactions?
polynomial (no entanglement): -> [higher_order_angle]
custom Pauli strings: -> [pauli_feature_map]
none: What is your optimisation priority?
speed: -> [angle]
noise_resilience: -> [hardware_efficient]
trainability: -> [data_reuploading]
accuracy: How many features?
few (<= 4): -> [iqp]
medium (5-8): -> [zz_feature_map]
many (> 8): -> [amplitude]
# Verify all 16 encodings appear as leaves
def collect_leaves(node):
"""Recursively collect all leaf strings from the tree."""
if isinstance(node, str):
return [node]
if isinstance(node, dict) and "options" in node:
leaves = []
for child in node["options"].values():
leaves.extend(collect_leaves(child))
return leaves
return []
leaves = collect_leaves(tree.tree)
unique_leaves = set(leaves)
print(f"Total leaf nodes: {len(leaves)}")
print(f"Unique encoding names: {len(unique_leaves)}")
print(f"\nAll leaves are valid encodings: {all(l in ENCODING_RULES for l in leaves)}")
# Check every encoding appears
missing = set(ENCODING_RULES.keys()) - unique_leaves
assert not missing, f"Encodings missing from tree: {missing}"
print(f"All 16 encodings present in tree: True")
Total leaf nodes: 18
Unique encoding names: 16

All leaves are valid encodings: True
All 16 encodings present in tree: True
5.2 Using decide()¶
tree = EncodingDecisionTree()
# Basic usage with keyword arguments
result = tree.decide(n_features=4, priority="accuracy")
print(f"decide(n_features=4, priority='accuracy') -> {result}")
# Default parameters
result_default = tree.decide()
print(f"decide() with all defaults -> {result_default}")
assert result_default in ENCODING_RULES
decide(n_features=4, priority='accuracy') -> iqp
decide() with all defaults -> iqp
# The decide() method accepts 7 keyword arguments:
print("decide() parameters:")
print(" data_type: 'continuous' (default), 'binary', 'discrete'")
print(" n_features: int (default 4)")
print(" symmetry: None (default), 'rotation', 'cyclic', 'permutation_pairs', 'general'")
print(" trainable: False (default), True")
print(" priority: 'accuracy' (default), 'speed', 'noise_resilience', 'trainability'")
print(" problem_structure: None (default), 'combinatorial', 'physics_simulation'")
print(" feature_interactions: None (default), 'polynomial', 'custom_pauli'")
decide() parameters:
  data_type: 'continuous' (default), 'binary', 'discrete'
  n_features: int (default 4)
  symmetry: None (default), 'rotation', 'cyclic', 'permutation_pairs', 'general'
  trainable: False (default), True
  priority: 'accuracy' (default), 'speed', 'noise_resilience', 'trainability'
  problem_structure: None (default), 'combinatorial', 'physics_simulation'
  feature_interactions: None (default), 'polynomial', 'custom_pauli'
6. All 16 Encodings: Reachability Proof ¶
Every one of the 16 encodings must be reachable as the primary recommendation from both the recommender and the decision tree. Here we prove it.
tree = EncodingDecisionTree()
# Trigger parameters for each encoding
TRIGGER_PARAMS = {
"angle": dict(n_features=4, priority="speed"),
"basis": dict(n_features=4, data_type="binary"),
"higher_order_angle": dict(n_features=4, feature_interactions="polynomial"),
"iqp": dict(n_features=4, priority="accuracy"),
"zz_feature_map": dict(n_features=6, priority="accuracy"),
"pauli_feature_map": dict(n_features=4, feature_interactions="custom_pauli"),
"data_reuploading": dict(n_features=4, priority="trainability"),
"hardware_efficient": dict(n_features=4, priority="noise_resilience"),
"amplitude": dict(n_features=16, priority="accuracy"),
"qaoa": dict(n_features=4, problem_structure="combinatorial"),
"hamiltonian": dict(n_features=4, problem_structure="physics_simulation"),
"trainable": dict(n_features=4, trainable=True),
"symmetry_inspired": dict(n_features=4, symmetry="general"),
"so2_equivariant": dict(n_features=2, symmetry="rotation"),
"cyclic_equivariant": dict(n_features=4, symmetry="cyclic"),
"swap_equivariant": dict(n_features=4, symmetry="permutation_pairs"),
}
print(f"{'Encoding':<25} {'Recommender':^15} {'Decision Tree':^15} {'Match':^7}")
print("=" * 65)
all_pass = True
for encoding_name, params in sorted(TRIGGER_PARAMS.items()):
rec_result = recommend_encoding(**params).encoding_name
tree_result = tree.decide(**params)
match = rec_result == tree_result == encoding_name
status = "OK" if match else "FAIL"
if not match:
all_pass = False
print(f" {encoding_name:<23} {rec_result:^15} {tree_result:^15} {status:^7}")
assert all_pass, "Not all encodings are reachable!"
print(f"\nAll 16 encodings reachable and consistent between recommender and tree!")
Encoding                  Recommender          Decision Tree        Match
=================================================================
 amplitude                amplitude            amplitude            OK
 angle                    angle                angle                OK
 basis                    basis                basis                OK
 cyclic_equivariant       cyclic_equivariant   cyclic_equivariant   OK
 data_reuploading         data_reuploading     data_reuploading     OK
 hamiltonian              hamiltonian          hamiltonian          OK
 hardware_efficient       hardware_efficient   hardware_efficient   OK
 higher_order_angle       higher_order_angle   higher_order_angle   OK
 iqp                      iqp                  iqp                  OK
 pauli_feature_map        pauli_feature_map    pauli_feature_map    OK
 qaoa                     qaoa                 qaoa                 OK
 so2_equivariant          so2_equivariant      so2_equivariant      OK
 swap_equivariant         swap_equivariant     swap_equivariant     OK
 symmetry_inspired        symmetry_inspired    symmetry_inspired    OK
 trainable                trainable            trainable            OK
 zz_feature_map           zz_feature_map       zz_feature_map       OK

All 16 encodings reachable and consistent between recommender and tree!
7. Decision Priority Hierarchy ¶
Both the recommender and decision tree follow a 7-level priority hierarchy:
- Data type — binary/discrete → basis
- Symmetry — rotation/cyclic/permutation_pairs/general → equivariant encodings
- Trainable — True → trainable encoding
- Problem structure — combinatorial/physics_simulation → QAOA/Hamiltonian
- Feature interactions — polynomial/custom_pauli → higher_order_angle/pauli_feature_map
- Priority — speed/noise_resilience/trainability → angle/hardware_efficient/data_reuploading
- Feature count — (accuracy fallback) ≤ 4/5-8/>8 → IQP/ZZ Feature Map/Amplitude
Higher levels always override lower levels.
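The cascade can be restated as straight-line code; this sketch mirrors the seven levels above and is illustrative only (decide() is authoritative and additionally guards symmetry branches by feature count, as section 9.2 shows):
# Sketch of the 7-level priority cascade
def hierarchy_sketch(data_type="continuous", n_features=4, symmetry=None,
                     trainable=False, priority="accuracy",
                     problem_structure=None, feature_interactions=None):
    if data_type in ("binary", "discrete"):                        # level 1: data type
        return "basis"
    if symmetry is not None:                                       # level 2: symmetry
        return {"rotation": "so2_equivariant", "cyclic": "cyclic_equivariant",
                "permutation_pairs": "swap_equivariant", "general": "symmetry_inspired"}[symmetry]
    if trainable:                                                  # level 3: trainable
        return "trainable"
    if problem_structure == "combinatorial":                       # level 4: problem structure
        return "qaoa"
    if problem_structure == "physics_simulation":
        return "hamiltonian"
    if problem_structure == "time_series":
        return "data_reuploading"
    if feature_interactions == "polynomial":                       # level 5: feature interactions
        return "higher_order_angle"
    if feature_interactions == "custom_pauli":
        return "pauli_feature_map"
    if priority in ("speed", "noise_resilience", "trainability"):  # level 6: priority
        return {"speed": "angle", "noise_resilience": "hardware_efficient",
                "trainability": "data_reuploading"}[priority]
    if n_features <= 4:                                            # level 7: accuracy fallback
        return "iqp"
    return "zz_feature_map" if n_features <= 8 else "amplitude"

# Spot-check against the real tree
assert hierarchy_sketch(priority="speed") == EncodingDecisionTree().decide(priority="speed") == "angle"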
tree = EncodingDecisionTree()
# Priority hierarchy demonstration: symmetry overrides problem_structure
result = tree.decide(symmetry="cyclic", problem_structure="combinatorial", n_features=4)
print(f"symmetry='cyclic' + problem_structure='combinatorial' -> {result}")
assert result == "cyclic_equivariant", "Symmetry (level 2) should override problem_structure (level 4)"
# Data type overrides everything
result = tree.decide(data_type="binary", symmetry="cyclic", trainable=True, priority="speed")
print(f"data_type='binary' + symmetry='cyclic' + trainable=True + priority='speed' -> {result}")
assert result == "basis", "Data type (level 1) should override all other levels"
# Trainable overrides problem_structure
result = tree.decide(trainable=True, problem_structure="combinatorial", n_features=4)
print(f"trainable=True + problem_structure='combinatorial' -> {result}")
assert result == "trainable", "Trainable (level 3) should override problem_structure (level 4)"
# Problem structure overrides feature interactions
result = tree.decide(problem_structure="physics_simulation", feature_interactions="polynomial", n_features=4)
print(f"problem_structure='physics_simulation' + feature_interactions='polynomial' -> {result}")
assert result == "hamiltonian", "Problem structure (level 4) should override feature_interactions (level 5)"
# Feature interactions overrides priority
result = tree.decide(feature_interactions="custom_pauli", priority="speed", n_features=4)
print(f"feature_interactions='custom_pauli' + priority='speed' -> {result}")
assert result == "pauli_feature_map", "Feature interactions (level 5) should override priority (level 6)"
print("\nAll priority hierarchy tests passed!")
symmetry='cyclic' + problem_structure='combinatorial' -> cyclic_equivariant
data_type='binary' + symmetry='cyclic' + trainable=True + priority='speed' -> basis
trainable=True + problem_structure='combinatorial' -> trainable
problem_structure='physics_simulation' + feature_interactions='polynomial' -> hamiltonian
feature_interactions='custom_pauli' + priority='speed' -> pauli_feature_map

All priority hierarchy tests passed!
8. Parameter Deep-Dives ¶
8.1 n_samples Effect¶
# n_samples feeds into the scoring (smaller datasets slightly favour simulable
# encodings); for this configuration the top pick and confidence are unchanged
print("Effect of n_samples on scoring (n_features=4, priority='accuracy'):")
print("=" * 60)
for n_samples in [10, 50, 100, 500, 1000, 10000]:
rec = recommend_encoding(n_features=4, n_samples=n_samples)
print(f" n_samples={n_samples:6d} -> {rec.encoding_name:20s} (confidence: {rec.confidence:.3f})")
Effect of n_samples on scoring (n_features=4, priority='accuracy'):
============================================================
  n_samples=    10 -> iqp                  (confidence: 0.740)
  n_samples=    50 -> iqp                  (confidence: 0.740)
  n_samples=   100 -> iqp                  (confidence: 0.740)
  n_samples=   500 -> iqp                  (confidence: 0.740)
  n_samples=  1000 -> iqp                  (confidence: 0.740)
  n_samples= 10000 -> iqp                  (confidence: 0.740)
8.2 task Parameter¶
# The task parameter (classification vs regression)
for task in ["classification", "regression"]:
rec = recommend_encoding(n_features=4, task=task)
print(f"task='{task}' -> {rec.encoding_name} (confidence: {rec.confidence:.3f})")
task='classification' -> iqp (confidence: 0.740)
task='regression' -> iqp (confidence: 0.700)
8.3 Combined Parameters¶
# Hardware + priority combined
print("Hardware + Priority combinations (n_features=4):")
print("=" * 70)
for hw in ["simulator", "ibm"]:
for priority in ["accuracy", "speed", "noise_resilience"]:
rec = recommend_encoding(n_features=4, hardware=hw, priority=priority)
print(f" hw={hw:10s} priority={priority:20s} -> {rec.encoding_name:25s} ({rec.confidence:.3f})")
Hardware + Priority combinations (n_features=4):
======================================================================
  hw=simulator  priority=accuracy             -> iqp                       (0.740)
  hw=simulator  priority=speed                -> angle                     (0.600)
  hw=simulator  priority=noise_resilience     -> hardware_efficient        (0.600)
  hw=ibm        priority=accuracy             -> iqp                       (0.660)
  hw=ibm        priority=speed                -> angle                     (0.650)
  hw=ibm        priority=noise_resilience     -> hardware_efficient        (0.650)
9. Edge Cases & Robustness ¶
9.1 Extreme Feature Counts¶
# n_features=1 (single feature)
rec = recommend_encoding(n_features=1)
print(f"n_features=1: {rec.encoding_name} (confidence: {rec.confidence:.3f})")
assert rec.encoding_name in ENCODING_RULES
# n_features=2 (minimal multi-feature)
rec = recommend_encoding(n_features=2)
print(f"n_features=2 (no symmetry): {rec.encoding_name} (confidence: {rec.confidence:.3f})")
assert rec.encoding_name != "so2_equivariant", "SO2 should not appear without symmetry='rotation'"
# Very large
rec = recommend_encoding(n_features=100)
print(f"n_features=100: {rec.encoding_name} (confidence: {rec.confidence:.3f})")
assert rec.encoding_name == "amplitude", "Very large features should pick amplitude"
rec = recommend_encoding(n_features=1000)
print(f"n_features=1000: {rec.encoding_name} (confidence: {rec.confidence:.3f})")
assert rec.encoding_name == "amplitude"
n_features=1: iqp (confidence: 0.740)
n_features=2 (no symmetry): iqp (confidence: 0.740)
n_features=100: amplitude (confidence: 0.635)
n_features=1000: amplitude (confidence: 0.635)
# Decision tree with extreme values
tree = EncodingDecisionTree()
result = tree.decide(n_features=1)
print(f"Tree n_features=1: {result}")
assert result in ENCODING_RULES
result = tree.decide(n_features=100)
print(f"Tree n_features=100: {result}")
assert result == "amplitude"
result = tree.decide(n_features=1000)
print(f"Tree n_features=1000: {result}")
assert result == "amplitude"
Tree n_features=1: iqp
Tree n_features=100: amplitude
Tree n_features=1000: amplitude
9.2 Symmetry with Wrong Feature Counts¶
tree = EncodingDecisionTree()
# rotation symmetry with n_features != 2 -> SO2 requires exactly 2
result = tree.decide(symmetry="rotation", n_features=5)
print(f"rotation + n_features=5: {result} (not so2_equivariant)")
assert result != "so2_equivariant", "SO2 requires exactly 2 features"
# permutation_pairs with odd features -> swap requires even
result = tree.decide(symmetry="permutation_pairs", n_features=3)
print(f"permutation_pairs + n_features=3: {result} (not swap_equivariant)")
assert result != "swap_equivariant", "Swap requires even features"
# permutation_pairs with odd features -> also check recommender
rec = recommend_encoding(n_features=3, symmetry="permutation_pairs")
print(f"Recommender: permutation_pairs + n_features=3: {rec.encoding_name}")
assert rec.encoding_name != "swap_equivariant"
assert "swap_equivariant" not in rec.alternatives
# rotation with n_features=3 -> recommender
rec = recommend_encoding(n_features=3, symmetry="rotation")
print(f"Recommender: rotation + n_features=3: {rec.encoding_name}")
assert rec.encoding_name != "so2_equivariant"
assert "so2_equivariant" not in rec.alternatives
print("\nAll wrong-feature-count symmetry cases handled correctly!")
rotation + n_features=5: zz_feature_map (not so2_equivariant)
permutation_pairs + n_features=3: iqp (not swap_equivariant)
Recommender: permutation_pairs + n_features=3: iqp
Recommender: rotation + n_features=3: iqp

All wrong-feature-count symmetry cases handled correctly!
9.3 Conflicting Constraints¶
# Binary data + rotation symmetry + 2 features
# basis needs binary, so2 needs rotation+2features
# Both pass their own hard constraints, scoring decides the winner
rec = recommend_encoding(n_features=2, data_type="binary", symmetry="rotation")
print(f"binary + rotation + 2 features: {rec.encoding_name}")
print(f" Alternatives: {rec.alternatives}")
print(f" Confidence: {rec.confidence:.3f}")
# The result must not violate any hard constraints
rules = ENCODING_RULES[rec.encoding_name]
if rules["requires_data_type"] is not None:
assert "binary" in rules["requires_data_type"]
print(" Result respects hard constraints: True")
binary + rotation + 2 features: basis
  Alternatives: ['so2_equivariant', 'iqp', 'data_reuploading']
  Confidence: 0.850
  Result respects hard constraints: True
# trainable + speed priority -> trainable wins (higher priority level)
rec = recommend_encoding(n_features=4, trainable=True, priority="speed")
print(f"trainable=True + priority='speed': {rec.encoding_name}")
# In the recommender, trainable gets a strong score boost (+0.40),
# and angle also gets a speed bonus. Which wins depends on the total score.
assert rec.encoding_name in ENCODING_RULES
trainable=True + priority='speed': trainable
# Multiple symmetry-like parameters at once
rec = recommend_encoding(
n_features=4,
symmetry="general",
problem_structure="combinatorial",
feature_interactions="polynomial",
)
print(f"general symmetry + combinatorial + polynomial: {rec.encoding_name}")
print(f" Alternatives: {rec.alternatives}")
# symmetry_inspired requires symmetry='general', so it should be in candidates
assert rec.encoding_name in ENCODING_RULES
general symmetry + combinatorial + polynomial: symmetry_inspired Alternatives: ['higher_order_angle', 'qaoa', 'iqp']
9.4 Feature Count Boundaries¶
# Test exact boundary values for accuracy fallback
tree = EncodingDecisionTree()
# Boundary: n_features=4 (should be IQP, <= 4)
assert tree.decide(n_features=4, priority="accuracy") == "iqp"
# Boundary: n_features=5 (should be zz_feature_map, 5-8 range)
assert tree.decide(n_features=5, priority="accuracy") == "zz_feature_map"
# Boundary: n_features=8 (should be zz_feature_map, 5-8 range)
assert tree.decide(n_features=8, priority="accuracy") == "zz_feature_map"
# Boundary: n_features=9 (should be amplitude, > 8)
assert tree.decide(n_features=9, priority="accuracy") == "amplitude"
print("Feature count boundaries (decision tree):")
for n in range(1, 12):
result = tree.decide(n_features=n, priority="accuracy")
print(f" n={n:2d} -> {result}")
print("\nAll boundary tests passed!")
Feature count boundaries (decision tree): n= 1 -> iqp n= 2 -> iqp n= 3 -> iqp n= 4 -> iqp n= 5 -> zz_feature_map n= 6 -> zz_feature_map n= 7 -> zz_feature_map n= 8 -> zz_feature_map n= 9 -> amplitude n=10 -> amplitude n=11 -> amplitude All boundary tests passed!
9.5 Max Features Enforcement¶
# Encodings with max_features limits should never appear when exceeded
limited_encodings = [
("iqp", 12),
("zz_feature_map", 12),
("pauli_feature_map", 12),
("data_reuploading", 8),
("higher_order_angle", 10),
("so2_equivariant", 2), # requires exactly 2, also max_features=2
]
print("Max features enforcement:")
print("=" * 60)
for name, max_val in limited_encodings:
# At the limit: should still be eligible (may or may not be recommended)
rec_at = recommend_encoding(n_features=max_val)
at_limit = name in [rec_at.encoding_name] + rec_at.alternatives
# Above the limit: must NEVER appear
rec_over = recommend_encoding(n_features=max_val + 5)
over_limit = name in [rec_over.encoding_name] + rec_over.alternatives
print(f" {name:25s} (max={max_val:2d}): at_limit_eligible={at_limit}, above_limit_excluded={not over_limit}")
assert not over_limit, f"{name} appeared with n_features={max_val + 5} (max={max_val})!"
Max features enforcement: ============================================================ iqp (max=12): at_limit_eligible=True, above_limit_excluded=True zz_feature_map (max=12): at_limit_eligible=True, above_limit_excluded=True pauli_feature_map (max=12): at_limit_eligible=True, above_limit_excluded=True data_reuploading (max= 8): at_limit_eligible=True, above_limit_excluded=True higher_order_angle (max=10): at_limit_eligible=False, above_limit_excluded=True so2_equivariant (max= 2): at_limit_eligible=False, above_limit_excluded=True
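The same limit can be probed directly through the hard-constraint gate. The sketch below reuses the _passes_hard_constraints helper exactly as it is called in Section 9.6, checking iqp just at and just above its max_features=12 limit.
# Probe the max_features gate directly for one limited encoding (iqp, max=12).
iqp_rules = ENCODING_RULES["iqp"]
for n in (12, 13):
    passes = _passes_hard_constraints(iqp_rules, n_features=n, data_type="continuous")
    print(f"  iqp with n_features={n}: passes hard constraints = {passes}")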
9.6 Hard Constraints Are Never Violated¶
# Comprehensive check: for many parameter combinations, the primary recommendation
# and all alternatives must satisfy their own hard constraints
test_scenarios = [
dict(n_features=4),
dict(n_features=4, priority="speed"),
dict(n_features=4, priority="noise_resilience"),
dict(n_features=4, priority="trainability"),
dict(n_features=4, data_type="binary"),
dict(n_features=4, data_type="discrete"),
dict(n_features=2, symmetry="rotation"),
dict(n_features=4, symmetry="cyclic"),
dict(n_features=4, symmetry="permutation_pairs"),
dict(n_features=4, symmetry="general"),
dict(n_features=4, trainable=True),
dict(n_features=4, problem_structure="combinatorial"),
dict(n_features=4, problem_structure="physics_simulation"),
dict(n_features=4, feature_interactions="polynomial"),
dict(n_features=4, feature_interactions="custom_pauli"),
dict(n_features=16),
dict(n_features=100),
dict(n_features=3, symmetry="permutation_pairs"), # odd -> swap excluded
dict(n_features=3, symmetry="rotation"), # n!=2 -> SO2 excluded
]
violations = []
for params in test_scenarios:
rec = recommend_encoding(**params)
all_names = [rec.encoding_name] + rec.alternatives
for name in all_names:
rules = ENCODING_RULES[name]
passes = _passes_hard_constraints(
rules,
n_features=params.get("n_features"),
data_type=params.get("data_type", "continuous"),
symmetry=params.get("symmetry"),
trainable=params.get("trainable", False),
)
if not passes:
violations.append((params, name))
if violations:
print("VIOLATIONS FOUND:")
for params, name in violations:
print(f" {name} recommended for {params}")
else:
print(f"No hard constraint violations across {len(test_scenarios)} scenarios!")
assert len(violations) == 0
No hard constraint violations across 19 scenarios!
10. Recommender vs Decision Tree Comparison ¶
| Feature | Recommender | Decision Tree |
|---|---|---|
| Output | Ranked list + confidence | Single encoding |
| Method | Hard filter + soft scoring | Deterministic if/elif chain |
| Alternatives | Up to 3 | None |
| Confidence | 0.50-0.95 | N/A |
| Explanation | Template-based | N/A |
| Parameters | 10 | 7 |
tree = EncodingDecisionTree()
# For canonical trigger parameters, both should agree
print(f"{'Parameters':<55} {'Recommender':<20} {'Tree':<20} {'Agree':>5}")
print("=" * 105)
test_cases = [
dict(n_features=4, priority="speed"),
dict(n_features=4, priority="accuracy"),
dict(n_features=4, priority="noise_resilience"),
dict(n_features=4, priority="trainability"),
dict(n_features=4, data_type="binary"),
dict(n_features=6, priority="accuracy"),
dict(n_features=16, priority="accuracy"),
dict(n_features=2, symmetry="rotation"),
dict(n_features=4, symmetry="cyclic"),
dict(n_features=4, symmetry="permutation_pairs"),
dict(n_features=4, symmetry="general"),
dict(n_features=4, trainable=True),
dict(n_features=4, problem_structure="combinatorial"),
dict(n_features=4, problem_structure="physics_simulation"),
dict(n_features=4, feature_interactions="polynomial"),
dict(n_features=4, feature_interactions="custom_pauli"),
]
all_agree = True
for params in test_cases:
rec_result = recommend_encoding(**params).encoding_name
tree_result = tree.decide(**params)
agree = rec_result == tree_result
if not agree:
all_agree = False
params_str = str(params)
print(f" {params_str:<53} {rec_result:<20} {tree_result:<20} {'Yes' if agree else 'NO':>5}")
print(f"\nAll cases agree: {all_agree}")
Parameters Recommender Tree Agree
=========================================================================================================
{'n_features': 4, 'priority': 'speed'} angle angle Yes
{'n_features': 4, 'priority': 'accuracy'} iqp iqp Yes
{'n_features': 4, 'priority': 'noise_resilience'} hardware_efficient hardware_efficient Yes
{'n_features': 4, 'priority': 'trainability'} data_reuploading data_reuploading Yes
{'n_features': 4, 'data_type': 'binary'} basis basis Yes
{'n_features': 6, 'priority': 'accuracy'} zz_feature_map zz_feature_map Yes
{'n_features': 16, 'priority': 'accuracy'} amplitude amplitude Yes
{'n_features': 2, 'symmetry': 'rotation'} so2_equivariant so2_equivariant Yes
{'n_features': 4, 'symmetry': 'cyclic'} cyclic_equivariant cyclic_equivariant Yes
{'n_features': 4, 'symmetry': 'permutation_pairs'} swap_equivariant swap_equivariant Yes
{'n_features': 4, 'symmetry': 'general'} symmetry_inspired symmetry_inspired Yes
{'n_features': 4, 'trainable': True} trainable trainable Yes
{'n_features': 4, 'problem_structure': 'combinatorial'} qaoa qaoa Yes
{'n_features': 4, 'problem_structure': 'physics_simulation'} hamiltonian hamiltonian Yes
{'n_features': 4, 'feature_interactions': 'polynomial'} higher_order_angle higher_order_angle Yes
{'n_features': 4, 'feature_interactions': 'custom_pauli'} pauli_feature_map pauli_feature_map Yes
All cases agree: True
# Key difference: recommender provides alternatives and confidence
params = dict(n_features=4, priority="accuracy")
rec = recommend_encoding(**params)
tree_result = tree.decide(**params)
print(f"Recommender output for {params}:")
print(f" Primary: {rec.encoding_name}")
print(f" Alternatives: {rec.alternatives}")
print(f" Confidence: {rec.confidence:.3f}")
print(f" Explanation: {rec.explanation}")
print(f"\nDecision tree output for {params}:")
print(f" Result: {tree_result}")
print(f" (No alternatives, no confidence, no explanation)")
Recommender output for {'n_features': 4, 'priority': 'accuracy'}:
Primary: iqp
Alternatives: ['data_reuploading', 'zz_feature_map', 'pauli_feature_map']
Confidence: 0.740
Explanation: IQP encoding creates highly entangled states with provable classical simulation hardness, well-suited for kernel methods
Decision tree output for {'n_features': 4, 'priority': 'accuracy'}:
Result: iqp
(No alternatives, no confidence, no explanation)
11. Real-World Scenarios ¶
Let's walk through realistic use cases to demonstrate how the recommendation system guides encoding selection.
def show_recommendation(title, **params):
"""Helper to display a recommendation nicely."""
rec = recommend_encoding(**params)
print(f"Scenario: {title}")
print(f" Parameters: {params}")
print(f" Recommended: {rec.encoding_name}")
print(f" Confidence: {rec.confidence:.3f}")
print(f" Explanation: {rec.explanation}")
if rec.alternatives:
print(f" Alternatives: {', '.join(rec.alternatives)}")
print()
# Scenario 1: Image classification with many features
show_recommendation(
"Image classification (784 pixel features)",
n_features=784,
n_samples=60000,
task="classification",
priority="accuracy",
)
Scenario: Image classification (784 pixel features)
Parameters: {'n_features': 784, 'n_samples': 60000, 'task': 'classification', 'priority': 'accuracy'}
Recommended: amplitude
Confidence: 0.635
Explanation: Amplitude encoding provides exponential compression (10 qubits for 784 features)
Alternatives: hamiltonian, angle, hardware_efficient
# Scenario 2: Small binary classification on real hardware
show_recommendation(
"Binary features on IBM hardware",
n_features=5,
n_samples=200,
task="classification",
hardware="ibm",
priority="noise_resilience",
data_type="binary",
)
Scenario: Binary features on IBM hardware
Parameters: {'n_features': 5, 'n_samples': 200, 'task': 'classification', 'hardware': 'ibm', 'priority': 'noise_resilience', 'data_type': 'binary'}
Recommended: basis
Confidence: 0.850
Explanation: Basis encoding directly maps binary/discrete features to computational basis states
Alternatives: hardware_efficient, angle, higher_order_angle
# Scenario 3: Molecular simulation with physics structure
show_recommendation(
"Molecular dynamics simulation",
n_features=8,
n_samples=1000,
task="regression",
priority="accuracy",
problem_structure="physics_simulation",
)
Scenario: Molecular dynamics simulation
Parameters: {'n_features': 8, 'n_samples': 1000, 'task': 'regression', 'priority': 'accuracy', 'problem_structure': 'physics_simulation'}
Recommended: hamiltonian
Confidence: 0.810
Explanation: Hamiltonian encoding applies Trotterised time evolution under a data-dependent Hamiltonian for physics-inspired ML
Alternatives: zz_feature_map, iqp, data_reuploading
# Scenario 4: Graph optimization (e.g., MaxCut)
show_recommendation(
"Graph MaxCut optimization",
n_features=6,
priority="accuracy",
problem_structure="combinatorial",
)
Scenario: Graph MaxCut optimization
Parameters: {'n_features': 6, 'priority': 'accuracy', 'problem_structure': 'combinatorial'}
Recommended: qaoa
Confidence: 0.710
Explanation: QAOA-inspired encoding uses cost-mixer layer structure suited for combinatorial and graph-structured problems
Alternatives: zz_feature_map, iqp, data_reuploading
# Scenario 5: 2D rotation-equivariant task (e.g., compass data)
show_recommendation(
"2D rotation-equivariant (compass bearing)",
n_features=2,
n_samples=500,
task="classification",
symmetry="rotation",
)
Scenario: 2D rotation-equivariant (compass bearing)
Parameters: {'n_features': 2, 'n_samples': 500, 'task': 'classification', 'symmetry': 'rotation'}
Recommended: so2_equivariant
Confidence: 0.866
Explanation: SO(2) equivariant encoding guarantees mathematically rigorous 2D rotational equivariance for the 2-feature input
Alternatives: iqp, data_reuploading, zz_feature_map
# Scenario 6: Fast prototyping on simulator
show_recommendation(
"Quick prototype on simulator",
n_features=8,
n_samples=100,
hardware="simulator",
priority="speed",
)
Scenario: Quick prototype on simulator
Parameters: {'n_features': 8, 'n_samples': 100, 'hardware': 'simulator', 'priority': 'speed'}
Recommended: angle
Confidence: 0.600
Explanation: Angle encoding provides O(1) depth with simple rotations, ideal for speed
Alternatives: iqp, zz_feature_map, pauli_feature_map
# Scenario 7: Task-specific trainable encoding for quantum neural network
show_recommendation(
"Quantum neural network with trainable parameters",
n_features=6,
n_samples=2000,
task="classification",
trainable=True,
)
Scenario: Quantum neural network with trainable parameters
Parameters: {'n_features': 6, 'n_samples': 2000, 'task': 'classification', 'trainable': True}
Recommended: trainable
Confidence: 0.750
Explanation: Trainable encoding interleaves data rotations with learnable parameter layers for task-specific optimisation
Alternatives: zz_feature_map, iqp, data_reuploading
# Scenario 8: Cyclic time-series data
show_recommendation(
"Cyclic time-series sensor data",
n_features=6,
n_samples=1000,
symmetry="cyclic",
)
Scenario: Cyclic time-series sensor data
Parameters: {'n_features': 6, 'n_samples': 1000, 'symmetry': 'cyclic'}
Recommended: cyclic_equivariant
Confidence: 0.800
Explanation: Cyclic equivariant encoding guarantees rigorous Z_n cyclic shift symmetry with ring-topology circuits
Alternatives: zz_feature_map, iqp, data_reuploading
# Scenario 9: Paired features with swap symmetry
show_recommendation(
"Paired sensor features (swap symmetry)",
n_features=6,
n_samples=500,
symmetry="permutation_pairs",
)
Scenario: Paired sensor features (swap symmetry)
Parameters: {'n_features': 6, 'n_samples': 500, 'symmetry': 'permutation_pairs'}
Recommended: swap_equivariant
Confidence: 0.800
Explanation: Swap equivariant encoding guarantees rigorous S_2 pair-swap symmetry over feature pairs
Alternatives: zz_feature_map, iqp, data_reuploading
# Scenario 10: Custom Pauli-string research
show_recommendation(
"Research: custom Pauli-string interactions",
n_features=4,
task="classification",
feature_interactions="custom_pauli",
)
Scenario: Research: custom Pauli-string interactions
Parameters: {'n_features': 4, 'task': 'classification', 'feature_interactions': 'custom_pauli'}
Recommended: pauli_feature_map
Confidence: 0.830
Explanation: Pauli Feature Map enables configurable Pauli-string rotation structures for custom feature interactions
Alternatives: iqp, data_reuploading, zz_feature_map
12. Confidence Analysis ¶
The confidence score maps raw suitability scores to a human-interpretable value in [0.50, 0.95].
12.1 Confidence Bands¶
# Score-to-confidence mapping
print("Raw Score -> Confidence mapping:")
print("=" * 40)
for score_x10 in range(0, 11):
score = score_x10 / 10
conf = _score_to_confidence(score)
band = "HIGH" if conf >= 0.85 else "MEDIUM" if conf >= 0.65 else "LOWER"
print(f" score={score:.1f} -> confidence={conf:.3f} ({band})")
Raw Score -> Confidence mapping: ======================================== score=0.0 -> confidence=0.500 (LOWER) score=0.1 -> confidence=0.550 (LOWER) score=0.2 -> confidence=0.600 (LOWER) score=0.3 -> confidence=0.650 (MEDIUM) score=0.4 -> confidence=0.750 (MEDIUM) score=0.5 -> confidence=0.850 (HIGH) score=0.6 -> confidence=0.870 (HIGH) score=0.7 -> confidence=0.890 (HIGH) score=0.8 -> confidence=0.910 (HIGH) score=0.9 -> confidence=0.930 (HIGH) score=1.0 -> confidence=0.950 (HIGH)
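The printed values are consistent with a simple piecewise-linear curve. The reconstruction below is only a fit to the table above, not the module's actual implementation; we print it next to _score_to_confidence for comparison.
# Hedged reconstruction: a piecewise-linear fit to the printed mapping above.
# This is NOT the module's implementation, just an approximation of its shape.
def approx_confidence(score: float) -> float:
    if score <= 0.3:
        return 0.50 + 0.5 * score          # gentle rise: 0.50 -> 0.65
    if score <= 0.5:
        return 0.65 + 1.0 * (score - 0.3)  # steep rise:  0.65 -> 0.85
    return 0.85 + 0.2 * (score - 0.5)      # plateau:     0.85 -> 0.95

for s in [i / 10 for i in range(11)]:
    print(f"  score={s:.1f}: actual={_score_to_confidence(s):.3f}, fit={approx_confidence(s):.3f}")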
# Monotonicity: higher scores always produce equal or higher confidence
scores = [i / 100 for i in range(101)]
confidences = [_score_to_confidence(s) for s in scores]
is_monotonic = all(confidences[i] <= confidences[i + 1] for i in range(len(confidences) - 1))
print(f"Confidence is monotonically non-decreasing: {is_monotonic}")
assert is_monotonic
# Range check
print(f"Min confidence: {min(confidences):.3f} (at score=0.0)")
print(f"Max confidence: {max(confidences):.3f} (at score=1.0)")
assert min(confidences) >= 0.50
assert max(confidences) <= 0.95
Confidence is monotonically non-decreasing: True Min confidence: 0.500 (at score=0.0) Max confidence: 0.950 (at score=1.0)
12.2 Confidence for Different Scenarios¶
# High confidence: strong signal (hard constraint match)
rec_binary = recommend_encoding(n_features=4, data_type="binary")
print(f"Binary data (strong signal): {rec_binary.encoding_name} -> confidence={rec_binary.confidence:.3f}")
assert rec_binary.confidence >= 0.75
rec_so2 = recommend_encoding(n_features=2, symmetry="rotation")
print(f"SO2 equivariant (strong signal): {rec_so2.encoding_name} -> confidence={rec_so2.confidence:.3f}")
assert rec_so2.confidence >= 0.85
# Lower confidence: weaker signals
rec_generic = recommend_encoding(n_features=6)
print(f"Generic 6 features (weak signal): {rec_generic.encoding_name} -> confidence={rec_generic.confidence:.3f}")
assert rec_generic.confidence <= 0.80
Binary data (strong signal): basis -> confidence=0.850 SO2 equivariant (strong signal): so2_equivariant -> confidence=0.866 Generic 6 features (weak signal): zz_feature_map -> confidence=0.630
12.3 Scoring Internals¶
# Peek inside the scoring: compare two encodings for the same scenario
params = dict(
n_features=4,
n_samples=500,
task="classification",
hardware="simulator",
priority="speed",
data_type="continuous",
symmetry=None,
trainable=False,
problem_structure=None,
feature_interactions=None,
)
print(f"Scores for priority='speed' (n_features=4):")
print("=" * 50)
scores = {}
for name in sorted(ENCODING_RULES.keys()):
rules = ENCODING_RULES[name]
# Only score if passes hard constraints
if _passes_hard_constraints(rules, n_features=4, data_type="continuous"):
score = _compute_score(name, rules, **params)
scores[name] = score
for name, score in sorted(scores.items(), key=lambda x: x[1], reverse=True):
conf = _score_to_confidence(score)
print(f" {name:25s}: score={score:.4f} -> confidence={conf:.3f}")
Scores for priority='speed' (n_features=4): ================================================== angle : score=0.2000 -> confidence=0.600 iqp : score=0.0700 -> confidence=0.535 pauli_feature_map : score=0.0700 -> confidence=0.535 zz_feature_map : score=0.0700 -> confidence=0.535 data_reuploading : score=0.0300 -> confidence=0.515 higher_order_angle : score=0.0300 -> confidence=0.515 amplitude : score=0.0000 -> confidence=0.500 hamiltonian : score=0.0000 -> confidence=0.500 hardware_efficient : score=0.0000 -> confidence=0.500 qaoa : score=0.0000 -> confidence=0.500
# Score clamping: scores are always in [0, 1]
print("Score clamping verification:")
for name, rules in ENCODING_RULES.items():
score = _compute_score(
name, rules,
n_features=4, n_samples=500, task="classification",
hardware="simulator", priority="accuracy", data_type="continuous",
symmetry=None, trainable=False, problem_structure=None,
feature_interactions=None,
)
assert 0.0 <= score <= 1.0, f"{name} has out-of-range score: {score}"
print("All scores in [0, 1] range!")
Score clamping verification: All scores in [0, 1] range!
# Hard precondition bonus demonstration
# basis encoding with binary data gets +0.50 data_type bonus
basis_binary_score = _compute_score(
"basis", ENCODING_RULES["basis"],
n_features=4, n_samples=500, task="classification",
hardware="simulator", priority="accuracy", data_type="binary",
symmetry=None, trainable=False, problem_structure=None,
feature_interactions=None,
)
angle_binary_score = _compute_score(
"angle", ENCODING_RULES["angle"],
n_features=4, n_samples=500, task="classification",
hardware="simulator", priority="accuracy", data_type="binary",
symmetry=None, trainable=False, problem_structure=None,
feature_interactions=None,
)
print(f"Basis with binary data: score={basis_binary_score:.4f}")
print(f"Angle with binary data: score={angle_binary_score:.4f}")
print(f"Basis scores higher: {basis_binary_score > angle_binary_score}")
assert basis_binary_score > angle_binary_score
Basis with binary data: score=0.5000 Angle with binary data: score=0.0000 Basis scores higher: True
# Hardware penalty for deep circuits
amp_sim = _compute_score(
"amplitude", ENCODING_RULES["amplitude"],
n_features=16, n_samples=500, task="classification",
hardware="simulator", priority="accuracy", data_type="continuous",
symmetry=None, trainable=False, problem_structure=None,
feature_interactions=None,
)
amp_hw = _compute_score(
"amplitude", ENCODING_RULES["amplitude"],
n_features=16, n_samples=500, task="classification",
hardware="ibm", priority="accuracy", data_type="continuous",
symmetry=None, trainable=False, problem_structure=None,
feature_interactions=None,
)
print(f"Amplitude on simulator: score={amp_sim:.4f}")
print(f"Amplitude on IBM: score={amp_hw:.4f}")
print(f"Hardware penalty applied: {amp_sim > amp_hw}")
assert amp_sim > amp_hw, "Deep circuits should be penalized on real hardware"
Amplitude on simulator: score=0.2700 Amplitude on IBM: score=0.0400 Hardware penalty applied: True
12.4 Explanation Generation¶
# Every encoding has a non-empty explanation
print("Explanation templates:")
print("=" * 80)
for name in sorted(ENCODING_RULES.keys()):
explanation = _generate_explanation(
name, ENCODING_RULES[name], priority="accuracy", n_features=4
)
assert isinstance(explanation, str) and len(explanation) > 0
print(f" {name:25s}: {explanation}")
Explanation templates: ================================================================================ amplitude : Amplitude encoding provides exponential compression (2 qubits for 4 features) angle : Angle encoding provides O(1) depth with simple rotations, ideal for accuracy basis : Basis encoding directly maps binary/discrete features to computational basis states cyclic_equivariant : Cyclic equivariant encoding guarantees rigorous Z_n cyclic shift symmetry with ring-topology circuits data_reuploading : Data re-uploading achieves universal approximation capability through repeated data encoding with entanglement layers hamiltonian : Hamiltonian encoding applies Trotterised time evolution under a data-dependent Hamiltonian for physics-inspired ML hardware_efficient : Hardware-efficient encoding minimises gate decomposition overhead on real quantum devices higher_order_angle : Higher-order angle encoding captures polynomial feature interactions (order-k products) without entanglement iqp : IQP encoding creates highly entangled states with provable classical simulation hardness, well-suited for kernel methods pauli_feature_map : Pauli Feature Map enables configurable Pauli-string rotation structures for custom feature interactions qaoa : QAOA-inspired encoding uses cost-mixer layer structure suited for combinatorial and graph-structured problems so2_equivariant : SO(2) equivariant encoding guarantees mathematically rigorous 2D rotational equivariance for the 2-feature input swap_equivariant : Swap equivariant encoding guarantees rigorous S_2 pair-swap symmetry over feature pairs symmetry_inspired : Symmetry-inspired encoding provides a heuristic symmetry-aware inductive bias for the given problem trainable : Trainable encoding interleaves data rotations with learnable parameter layers for task-specific optimisation zz_feature_map : ZZ Feature Map provides standard pairwise feature interactions via (pi-x_i)(pi-x_j) phase encoding for kernel methods
13. Connecting Recommendations to Actual Encodings ¶
The canonical names returned by the recommender and decision tree match the encoding registry keys. This means you can directly instantiate the recommended encoding using get_encoding().
# Map canonical names to registry names
registry_names = list_encodings()
print(f"Registry has {len(registry_names)} entries (including aliases):")
for name in registry_names:
print(f" - {name}")
Registry has 26 entries (including aliases): - amplitude - angle - angle_ry - basis - covariant - covariant_feature_map - cyclic_equivariant - cyclic_equivariant_feature_map - data_reuploading - hamiltonian - hamiltonian_encoding - hardware_efficient - higher_order_angle - iqp - pauli_feature_map - qaoa - qaoa_encoding - so2_equivariant - so2_equivariant_feature_map - swap_equivariant - swap_equivariant_feature_map - symmetry_inspired - symmetry_inspired_feature_map - trainable - trainable_encoding - zz_feature_map
# Verify every canonical name in ENCODING_RULES exists in the registry
print("Canonical name -> Registry lookup:")
print("=" * 50)
for name in sorted(ENCODING_RULES.keys()):
found = name in registry_names
# Some names may use aliases
aliases = [r for r in registry_names if r.startswith(name.split('_')[0])]
status = "DIRECT" if found else f"via alias: {aliases}"
print(f" {name:25s} -> {status}")
Canonical name -> Registry lookup: ================================================== amplitude -> DIRECT angle -> DIRECT basis -> DIRECT cyclic_equivariant -> DIRECT data_reuploading -> DIRECT hamiltonian -> DIRECT hardware_efficient -> DIRECT higher_order_angle -> DIRECT iqp -> DIRECT pauli_feature_map -> DIRECT qaoa -> DIRECT so2_equivariant -> DIRECT swap_equivariant -> DIRECT symmetry_inspired -> DIRECT trainable -> DIRECT zz_feature_map -> DIRECT
# End-to-end: get a recommendation and instantiate the encoding
rec = recommend_encoding(n_features=4, priority="accuracy")
print(f"Recommendation: {rec.encoding_name}")
# Instantiate the recommended encoding
encoding = get_encoding(rec.encoding_name, n_features=4)
print(f"Encoding class: {type(encoding).__name__}")
print(f"Encoding n_qubits: {encoding.n_qubits}")
print(f"Encoding n_features: {encoding.n_features}")
Recommendation: iqp Encoding class: IQPEncoding Encoding n_qubits: 4 Encoding n_features: 4
# Demonstrate end-to-end for multiple scenarios
scenarios = [
("Binary classification", dict(n_features=4, data_type="binary"), dict(n_features=4)),
("Speed priority", dict(n_features=6, priority="speed"), dict(n_features=6)),
("Many features", dict(n_features=16, priority="accuracy"), dict(n_features=16)),
("2D rotation", dict(n_features=2, symmetry="rotation"), dict(n_features=2)),
]
print("End-to-end: recommend -> instantiate:")
print("=" * 70)
for title, rec_params, enc_params in scenarios:
rec = recommend_encoding(**rec_params)
try:
encoding = get_encoding(rec.encoding_name, **enc_params)
print(f" {title:25s} -> {rec.encoding_name:20s} -> {type(encoding).__name__} (n_qubits={encoding.n_qubits})")
except Exception as e:
print(f" {title:25s} -> {rec.encoding_name:20s} -> Error: {e}")
End-to-end: recommend -> instantiate: ====================================================================== Binary classification -> basis -> BasisEncoding (n_qubits=4) Speed priority -> angle -> AngleEncoding (n_qubits=6) Many features -> amplitude -> AmplitudeEncoding (n_qubits=4) 2D rotation -> so2_equivariant -> SO2EquivariantFeatureMap (n_qubits=2)
14. Summary ¶
What We Covered¶
- Knowledge Base (ENCODING_RULES): 16 encodings with 11 typed fields each, including hard constraints and soft tags
- Hard Constraint Filtering: 6 binary constraint gates that eliminate structurally invalid encodings
- Tag-Based Matching (get_matching_encodings()): Combining hard filters with soft best_for/avoid_when tag matching
- The Recommender (recommend_encoding()): 10-parameter API producing ranked recommendations with confidence scores
- The Decision Tree (EncodingDecisionTree): Deterministic 7-level tree for interpretable encoding selection
- Reachability: All 16 encodings are reachable as the primary recommendation from both systems
- Priority Hierarchy: Higher-level decision criteria always override lower levels
- Parameter Deep-Dives: Every parameter's effect on the recommendation
- Edge Cases: Extreme feature counts, wrong symmetry+feature combos, conflicting constraints
- Consistency: Recommender and decision tree agree for canonical trigger parameters
- Real-World Scenarios: 10 practical use cases demonstrating the system
- Confidence Analysis: Scoring internals, confidence bands, and monotonicity
- Registry Integration: End-to-end from recommendation to instantiated encoding
Key Design Properties¶
- Safety: Hard constraints are never violated — no encoding is ever recommended when its preconditions fail
- Completeness: All 16 encodings are reachable from both the recommender and decision tree
- Backward Compatibility: The original 5-positional-arg API still works unchanged
- Interpretability: Every recommendation comes with an explanation and confidence score
- Extensibility: Adding a 17th encoding requires only adding an entry to ENCODING_RULES, a leaf in the decision tree, and an explanation template (a hypothetical sketch of such a rule entry follows below)
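To make the extensibility point concrete, here is a hypothetical sketch of what a 17th rule entry could look like. The tags and values are invented for illustration only; a real addition would also need a decision-tree leaf and an explanation template, which are not shown here.
# Hypothetical sketch only: an invented rule entry showing the shape of an
# ENCODING_RULES addition. All tag names and values here are illustrative.
hypothetical_rule = {
    "best_for": ["small_datasets", "kernel_methods"],  # invented soft tags
    "avoid_when": ["many_features"],                   # invented soft tag
    "max_features": 10,
    "simulable": True,
    "requires_data_type": None,
    "requires_symmetry": None,
    "requires_n_features": None,
    "requires_even_features": False,
    "requires_trainable": False,
    "qubit_scaling": "linear",
    "circuit_depth": "shallow",
}
print(f"Hypothetical rule entry defines {len(hypothetical_rule)} fields")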
print("Notebook completed successfully!")
print(f"All {len(ENCODING_RULES)} encodings covered.")
print("No errors encountered.")
Notebook completed successfully! All 16 encodings covered. No errors encountered.