Basis Encoding¶
The simplest quantum data encoding — a direct one-to-one map from classical bits to computational basis states.
╔══════════════════════════════════════════════════════════════════════╗
║ ║
║ |ψ(x)⟩ = |x₀ x₁ ... xₙ₋₁⟩ where xᵢ ∈ {0, 1} ║
║ ║
║ "n binary features → n qubits → one basis state" ║
║ ║
╚══════════════════════════════════════════════════════════════════════╝
1. The Core Idea¶
Each classical bit is stored in a dedicated qubit. A 0 leaves the
qubit in |0⟩; a 1 flips it to |1⟩ with a Pauli-X gate. The result is
always a single computational basis state — no superposition, no
entanglement.
Classical Bits (4 features) Quantum State (4 qubits)
┌──────────────────────────┐ ┌──────────────────────────┐
│ │ │ │
│ x₀ = 1 │ │ |1⟩ ← X applied │
│ x₁ = 0 │ Encode │ |0⟩ ← untouched │
│ x₂ = 1 ──────────────────────────► │ |1⟩ ← X applied │
│ x₃ = 0 │ │ |0⟩ ← untouched │
│ │ │ │
└──────────────────────────┘ └──────────────────────────┘
bit string [1, 0, 1, 0] state |1010⟩
Think of it as writing binary data onto a quantum register — the quantum version of loading bits into a classical register.
2. The Mathematics¶
State Definition¶
Given a binary vector x = (x₀, x₁, ..., xₙ₋₁) with xᵢ ∈ {0, 1}:
where X^{xᵢ} means:
The Pauli-X Gate¶
┌ ┐
X = │ 0 1 │ X|0⟩ = |1⟩
│ 1 0 │ X|1⟩ = |0⟩
└ ┘
It is the quantum analogue of a classical NOT gate.
Resulting Quantum State¶
The output is always a single computational basis state with coefficient 1. No superposition is created:
Input: x = [1, 0, 1, 1]
|ψ⟩ = |1011⟩ = |1⟩ ⊗ |0⟩ ⊗ |1⟩ ⊗ |1⟩
Statevector: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
↑
index 11 (binary 1011)
Measurement: |1011⟩ with probability 1.000 (deterministic)
3. Circuit Structure¶
The circuit is the simplest possible — parallel X gates on selected qubits, all in a single layer (depth = 1):
x = [1, 0, 1, 0]
q₀: |0⟩ ───[X]─── |1⟩ ← x₀ = 1, flip
q₁: |0⟩ ─────────── |0⟩ ← x₁ = 0, no gate
q₂: |0⟩ ───[X]─── |1⟩ ← x₂ = 1, flip
q₃: |0⟩ ─────────── |0⟩ ← x₃ = 0, no gate
Result: |1010⟩
Depth: 1
Gates: 2 (only the X gates that fire)
For the all-ones case (worst case):
x = [1, 1, 1, 1]
q₀: |0⟩ ───[X]─── |1⟩
q₁: |0⟩ ───[X]─── |1⟩
q₂: |0⟩ ───[X]─── |1⟩
q₃: |0⟩ ───[X]─── |1⟩
Result: |1111⟩
Depth: 1
Gates: 4 (maximum for n=4)
For the all-zeros case (best case):
x = [0, 0, 0, 0]
q₀: |0⟩ ─────────── |0⟩
q₁: |0⟩ ─────────── |0⟩
q₂: |0⟩ ─────────── |0⟩
q₃: |0⟩ ─────────── |0⟩
Result: |0000⟩
Depth: 1 (the `depth` property always returns 1)
Gates: 0 (no gates needed!)
The gate count is data-dependent: it equals the number of 1s in the input. For sparse binary data, the circuit is almost empty.
4. Binarization of Continuous Data¶
Real-world data is often continuous. Basis encoding handles this via a threshold rule (default threshold = 0.5):
┌──────────────────────────────────────────────────────────────────┐
│ BINARIZATION RULE │
│ │
│ x > threshold → 1 (X gate applied) │
│ x ≤ threshold → 0 (no gate) │
│ │
│ Note: values exactly AT the threshold map to 0 │
└──────────────────────────────────────────────────────────────────┘
Example¶
Continuous input: [0.8, 0.2, 0.6, 0.4, 0.5, 0.51]
Threshold: 0.5
↓ ↓ ↓ ↓ ↓ ↓
Binarized: [ 1, 0, 1, 0, 0, 1 ]
>0.5 ≤0.5 >0.5 ≤0.5 =0.5 >0.5
Custom Thresholds¶
┌─────────────────────────────────────────────────────────────┐
│ Threshold │ Use Case │ Behavior │
├────────────┼───────────────────────┼────────────────────────┤
│ 0.5 │ Data in [0, 1] │ Standard split │
│ 0.0 │ Signed data │ Negative→0, Positive→1 │
│ 0.7 │ Conservative "on" │ Only strong signals→1 │
│ custom │ Domain-specific │ Match decision boundary│
└─────────────────────────────────────────────────────────────┘
Example with threshold = 0.0:
Input: [-0.5, 0.5, -0.1, 0.1, 0.0]
Binary: [ 0, 1, 0, 1, 0 ]
neg pos neg pos =0
5. Example Walkthrough¶
Encode the binary string [1, 0, 1] using 3 qubits:
Step 1: Input
─────────────
x = [1, 0, 1] (already binary, no binarization needed)
Step 2: Apply gates
────────────────────
q₀: |0⟩ ─ X ─ |1⟩ (x₀ = 1 → apply X)
q₁: |0⟩ ───── |0⟩ (x₁ = 0 → identity)
q₂: |0⟩ ─ X ─ |1⟩ (x₂ = 1 → apply X)
Step 3: Output state
─────────────────────
|ψ⟩ = |101⟩ = |1⟩ ⊗ |0⟩ ⊗ |1⟩
Statevector (8 entries for 3 qubits):
|000⟩ ░░░░░░░░░░░░░░░░░░░░ 0.0
|001⟩ ░░░░░░░░░░░░░░░░░░░░ 0.0
|010⟩ ░░░░░░░░░░░░░░░░░░░░ 0.0
|011⟩ ░░░░░░░░░░░░░░░░░░░░ 0.0
|100⟩ ░░░░░░░░░░░░░░░░░░░░ 0.0
|101⟩ ████████████████████ 1.0 ← the encoded state
|110⟩ ░░░░░░░░░░░░░░░░░░░░ 0.0
|111⟩ ░░░░░░░░░░░░░░░░░░░░ 0.0
Measurement: |101⟩ with probability 1 (deterministic)
6. Key Properties¶
┌─────────────────────────────────────────────────────────────────────┐
│ BASIS ENCODING PROPERTIES │
├──────────────────────┬──────────────────────────────────────────────┤
│ Qubits required │ n (one per feature, linear scaling) │
│ Circuit depth │ 1 (constant, all gates parallel) │
│ Gate count │ n (worst-case; actual is 0 to n, │
│ │ data-dependent — see actual_gate_count) │
│ Two-qubit gates │ 0 (never, no entanglement) │
│ Parameter count │ 0 (no trainable parameters) │
│ Entangling? │ No (always produces product states) │
│ Simulability │ Trivially classically simulable │
│ Trainability │ N/A (nothing to train) │
│ Expressibility │ Minimal (only 2ⁿ basis states reachable) │
│ Normalization │ Not needed (data is binary) │
│ Data-dependent │ Gate count only (which X gates fire) │
│ Deterministic? │ Yes (measurement always returns input) │
└──────────────────────┴──────────────────────────────────────────────┘
Why Trivially Simulable?¶
A classical computer can predict the output of basis encoding
in O(n) time — just read the input bits. There is:
• No superposition (exactly one basis state has amplitude 1)
• No entanglement (state is a tensor product of single qubits)
• No interference (no amplitudes to combine)
The output is always: |x₀ x₁ ... xₙ₋₁⟩ with probability 1.
The 2ⁿ Reachable States¶
For n = 3, basis encoding can produce exactly 2³ = 8 states:
[0,0,0] → |000⟩ [1,0,0] → |100⟩
[0,0,1] → |001⟩ [1,0,1] → |101⟩
[0,1,0] → |010⟩ [1,1,0] → |110⟩
[0,1,1] → |011⟩ [1,1,1] → |111⟩
These 8 states form an orthonormal basis for the 3-qubit Hilbert
space, but basis encoding can ONLY reach these 8 points — it cannot
produce any superposition state like (|000⟩ + |111⟩)/√2.
7. Data-Dependent Gate Counts¶
Unlike most encodings where the circuit structure is fixed, basis encoding's gate count depends on the input data:
┌──────────────────────────────────────────────────────────────────┐
│ GATE COUNT ANALYSIS │
│ │
│ Theoretical maximum (worst case): n (all features = 1) │
│ Theoretical minimum (best case): 0 (all features = 0) │
│ Actual for specific data: Σ xᵢ (count of 1s) │
│ │
└──────────────────────────────────────────────────────────────────┘
n_features = 8
Input Binarized X Gates Depth Sparsity
──────────────────── ────────────── ─────── ───── ────────
[1,1,1,1,1,1,1,1] [1,1,1,1,1,1,1,1] 8 1 0%
[1,0,1,0,1,0,1,0] [1,0,1,0,1,0,1,0] 4 1 50%
[1,0,0,0,0,0,0,0] [1,0,0,0,0,0,0,0] 1 1 88%
[0,0,0,0,0,0,0,0] [0,0,0,0,0,0,0,0] 0 1 100%
For sparse binary data (common in practice), the actual hardware
cost is much lower than the worst-case bound.
8. Comparison with Other Encodings¶
┌───────────────────┬──────────┬─────────┬────────────┬──────────────┐
│ Encoding │ Qubits │ Depth │ Entangling │ Data Type │
├───────────────────┼──────────┼─────────┼────────────┼──────────────┤
│ Basis ★ │ n │ 1 │ No │ Binary │
│ Angle │ n │ 1 │ No │ Continuous │
│ Amplitude │ log₂n │ O(2ⁿ) │ Yes │ Continuous │
│ IQP │ n │ O(n) │ Yes │ Continuous │
│ ZZ Feature Map │ n │ O(n²) │ Yes │ Continuous │
└───────────────────┴──────────┴─────────┴────────────┴──────────────┘
Visual comparison (n = 4 features):
BASIS ENCODING ANGLE ENCODING IQP ENCODING
4 qubits, depth 1 4 qubits, depth 1 4 qubits, depth ~4
No entanglement No entanglement Entangling
|0⟩─[X]── |0⟩─[RY(θ₀)]── |0⟩─[H]─[RZ]─●───────
|0⟩────── |0⟩─[RY(θ₁)]── |0⟩─[H]─[RZ]─●──●────
|0⟩─[X]── |0⟩─[RY(θ₂)]── |0⟩─[H]─[RZ]────●──●─
|0⟩────── |0⟩─[RY(θ₃)]── |0⟩─[H]─[RZ]───────●─
Simplest possible Same width, rotations Entangling ZZ layers
Binary data only Continuous features Rich feature mixing
9. Strengths and Limitations¶
STRENGTHS LIMITATIONS
┌───────────────────────────┐ ┌───────────────────────────────┐
│ │ │ │
│ ✓ Simplest encoding │ │ ✗ Binary data only │
│ One gate type (X) │ │ Continuous info is lost │
│ Zero entanglement │ │ via thresholding │
│ │ │ │
│ ✓ Minimal circuit depth │ │ ✗ Linear qubit scaling │
│ Always depth 1 │ │ n features need n qubits │
│ Ideal for NISQ │ │ (no compression) │
│ │ │ │
│ ✓ Deterministic output │ │ ✗ Minimal expressibility │
│ Measurement always │ │ Only 2ⁿ states reachable │
│ returns the input │ │ out of continuous Hilbert │
│ │ │ space │
│ ✓ Hardware efficient │ │ │
│ No two-qubit gates │ │ ✗ No quantum advantage │
│ No calibration issues │ │ Trivially simulable — │
│ │ │ a classical computer does │
│ ✓ Zero error from │ │ this just as well │
│ encoding itself │ │ │
│ (only hardware noise) │ │ ✗ No entanglement │
│ │ │ Cannot leverage quantum │
│ │ │ correlations │
└───────────────────────────┘ └───────────────────────────────┘
10. Use Cases¶
Best suited for
┌──────────────────────────────┐
│ │
┌─────────────────┤ Combinatorial Optimization │ QAOA, VQE encode
│ │ (QAOA / VQE) │ problem solutions
│ ├──────────────────────────────┤ as bit strings
│ │ │
├─────────────────┤ Database Search │ Grover's algorithm
│ │ (Grover's Algorithm) │ marks basis states
│ ├──────────────────────────────┤
│ │ │
├─────────────────┤ Discrete / Categorical │ One-hot vectors,
│ │ Data Classification │ binary flags,
│ ├──────────────────────────────┤ feature presence
│ │ │
├─────────────────┤ Quantum Memory (QRAM) │ Address encoding
│ │ │ for quantum RAM
│ ├──────────────────────────────┤
│ │ │
└─────────────────┤ Encoding Baseline │ Simplest reference
│ for Benchmarking │ to compare against
└──────────────────────────────┘
11. Resource Scaling¶
n_features │ n_qubits │ Max X Gates │ Depth │ Two-Qubit │ Simulable
───────────┼──────────┼─────────────┼───────┼───────────┼──────────
1 │ 1 │ 1 │ 1 │ 0 │ Yes
2 │ 2 │ 2 │ 1 │ 0 │ Yes
4 │ 4 │ 4 │ 1 │ 0 │ Yes
8 │ 8 │ 8 │ 1 │ 0 │ Yes
16 │ 16 │ 16 │ 1 │ 0 │ Yes
64 │ 64 │ 64 │ 1 │ 0 │ Yes
256 │ 256 │ 256 │ 1 │ 0 │ Yes
1024 │ 1024 │ 1024 │ 1 │ 0 │ Yes
Depth is ALWAYS 1. The only scaling dimension is qubit count,
which grows linearly with features.
12. Backend Implementations¶
┌─────────────────────────────────────────────────────────────────┐
│ BACKEND COMPARISON │
├──────────────┬──────────────────────────────────────────────────┤
│ │ │
│ PennyLane │ Manual X gates via qml.PauliX │
│ │ • Returns callable closure │
│ │ • Big-endian (wire 0 = MSB) │
│ │ • Feature x[i] → wire i (direct mapping) │
│ │ │
├──────────────┼──────────────────────────────────────────────────┤
│ │ │
│ Qiskit │ QuantumCircuit with qc.x(i) │
│ │ • Little-endian (qubit 0 = LSB) │
│ │ • Feature x[i] → qubit i (direct mapping) │
│ │ • LSB→MSB conversion handled by analysis module │
│ │ │
├──────────────┼──────────────────────────────────────────────────┤
│ │ │
│ Cirq │ cirq.X on LineQubits │
│ │ • All X gates in a single Moment (parallel) │
│ │ • Big-endian (LineQubit(0) = MSB) │
│ │ • Feature x[i] → qubit i (direct mapping) │
│ │ • ⚠ all_qubits() may return fewer qubits │
│ │ for sparse inputs (only active qubits) │
│ │ │
└──────────────┴──────────────────────────────────────────────────┘
Qubit ordering: All backends use direct mapping (feature x[i] →
qubit i). PennyLane and Cirq are both big-endian (qubit 0 = MSB),
so no conversion is needed. Qiskit is little-endian (qubit 0 =
LSB), but the analysis module's _simulate_qiskit automatically
converts statevectors to big-endian for cross-backend consistency.
13. When To Use (and When Not To)¶
┌──────────────────────────────────────────────────────────────────┐
│ USE BASIS ENCODING WHEN: │
│ │
│ • Data is naturally binary or categorical │
│ • You need the shallowest possible circuit (NISQ hardware) │
│ • Your algorithm works with marked basis states (Grover, QAOA) │
│ • You want a simple baseline encoding for comparisons │
│ • Hardware error budget is tight (fewer gates = less error) │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ DO NOT USE BASIS ENCODING WHEN: │
│ │
│ • Data is continuous and pattern-rich (use Angle or IQP) │
│ • You need superposition or interference effects │
│ • You need entanglement from the encoding layer │
│ • You need amplitude-level information processing │
│ • You want sub-linear qubit scaling (use Amplitude encoding) │
└──────────────────────────────────────────────────────────────────┘
14. Orthogonality of Encoded States¶
A unique property of basis encoding: every pair of distinct inputs maps to orthogonal quantum states.
⟨ψ(x) | ψ(y)⟩ = δ_{x,y} (Kronecker delta)
Inputs States Inner Product
────── ────── ─────────────
[0,0] , [0,1] |00⟩ , |01⟩ ⟨00|01⟩ = 0 (orthogonal)
[0,0] , [1,0] |00⟩ , |10⟩ ⟨00|10⟩ = 0 (orthogonal)
[0,1] , [1,0] |01⟩ , |10⟩ ⟨01|10⟩ = 0 (orthogonal)
[1,0] , [1,0] |10⟩ , |10⟩ ⟨10|10⟩ = 1 (identical)
This means:
• Every distinct bit string is perfectly distinguishable
• No information overlap between different encodings
• Kernel matrix = Identity (for distinct inputs)
This perfect distinguishability is both a strength (no confusion between inputs) and a limitation (no notion of "similarity" between nearby bit strings — [0,0,0,1] and [0,0,1,0] are equally "far" from each other as [0,0,0,0] and [1,1,1,1]).
References¶
-
Nielsen, M. A., & Chuang, I. L. (2010). "Quantum Computation and Quantum Information." Cambridge University Press. Chapter 1.
-
Schuld, M., & Petruccione, F. (2018). "Supervised Learning with Quantum Computers." Springer. Chapter 4: Quantum Feature Maps.
-
Weigold, M., et al. (2020). "Data encoding patterns for quantum computing." IEEE International Conference on Quantum Computing and Engineering (QCE).
-
LaRose, R., & Coyle, B. (2020). "Robust data encodings for quantum classifiers." Physical Review A, 102(3), 032420.