Discretized Amplitude Encoding: Beyond 2^n?

April 3, 2026

The Intuition

An $n$-qubit state has the form:

$$|\psi\rangle = \sum_{i=0}^{2^n - 1} \alpha_i |i\rangle, \quad \sum_{i=0}^{2^n - 1} |\alpha_i|^2 = 1$$

We typically encode information by choosing which basis states $|i\rangle$ have non-zero amplitudes. But the amplitudes $\alpha_i \in \mathbb{C}$ are continuous values, so why not encode information in their structure?

Idea: Apply an FFT to the amplitude vector, discretize the frequency-domain coefficients into bins $[0, 0.01), [0.01, 0.02), \ldots$, and encode data in this discretized pattern. Could this let us pack more information into fewer qubits?
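A quick numpy sketch of what this would mean classically (the 0.01 bin width is the one above; the example state and binning rule are illustrative):

```python
import numpy as np

# Toy version of the idea: FFT the amplitude vector of a 3-qubit state,
# then snap frequency-domain magnitudes into 0.01-wide bins.
rng = np.random.default_rng(0)
amps = rng.normal(size=8) + 1j * rng.normal(size=8)
amps /= np.linalg.norm(amps)           # valid state: sum |alpha_i|^2 = 1

freq = np.fft.fft(amps, norm="ortho")  # unitary FFT preserves the norm
bins = np.floor(np.abs(freq) / 0.01)   # the discretized "pattern" per coefficient

print(bins)
```

The question the rest of this post answers is whether that pattern is ever accessible to a measurement.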

The Mathematical Reality

Problem 1: The Born Rule

When we measure in the computational basis, we get outcome ii with probability:

$$P(i) = |\alpha_i|^2 = |\langle i | \psi \rangle|^2$$

We observe probabilities, not amplitudes. The phase information $\arg(\alpha_i)$ is lost. To reconstruct the amplitude distribution, we need:

  • Many copies of $|\psi\rangle$ (the no-cloning theorem forbids copying an unknown state)
  • Or full quantum state tomography ($O(2^{2n})$ measurements for full reconstruction)
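A minimal numerical illustration of the phase loss: two states with identical magnitudes but different signs produce identical Born-rule statistics (the example amplitudes are arbitrary):

```python
import numpy as np

# Computational-basis measurement discards phase: two states differing
# only in arg(alpha_i) give the same P(i) = |alpha_i|^2.
a = np.array([0.6, 0.8])
b = np.array([0.6, -0.8])            # same magnitudes, different phase

p_a, p_b = np.abs(a) ** 2, np.abs(b) ** 2
assert np.allclose(p_a, p_b)         # indistinguishable by basis sampling
```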

Problem 2: The Holevo Bound

Even if we could prepare a state with rich amplitude structure, Holevo's theorem tells us the accessible classical information is bounded:

$$\chi \leq S(\rho) = -\text{Tr}(\rho \log \rho)$$

For a pure state $\rho = |\psi\rangle\langle\psi|$, we have $S(\rho) = 0$. The accessible information comes from the ensemble of states we prepare, not from a single state's amplitude structure.

Translation: You can't extract arbitrarily many bits from a single quantum state by clever amplitude engineering.
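To make this concrete, here is a small numpy check (the helper function is mine) that a pure state has zero von Neumann entropy while a maximally mixed qubit saturates the one-bit ceiling:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), in bits, via eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]       # drop numerically-zero eigenvalues
    return float(-np.sum(evals * np.log2(evals)) + 0.0)

psi = np.array([0.6, 0.8])
pure = np.outer(psi, psi.conj())       # rho = |psi><psi|
mixed = np.eye(2) / 2                  # maximally mixed qubit

print(von_neumann_entropy(pure))       # 0 bits: one pure state, no ensemble info
print(von_neumann_entropy(mixed))      # 1 bit: the Holevo ceiling per qubit
```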

Problem 3: State Preparation Complexity

Preparing a state with arbitrary amplitudes $\{\alpha_i\}$ requires a circuit of depth $O(2^n)$ in the worst case. If our goal is efficiency, this defeats the purpose.

Where This Does Work

Interestingly, similar ideas already exist:

1. Quantum Fourier Transform (QFT)

The QFT maps:

$$|j\rangle \to \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} e^{2\pi ijk/2^n} |k\rangle$$

This is encoding information in the frequency domain of the amplitudes! It appears in Shor's algorithm, phase estimation, and more.
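As a sanity check, the QFT column for a basis state $|j\rangle$ is exactly a unitary inverse DFT in numpy's sign convention (the specific $n$ and $j$ here are arbitrary):

```python
import numpy as np

# QFT|j> has amplitudes e^{2*pi*i*j*k/2^n} / sqrt(2^n); numpy's ifft with
# norm="ortho" uses the same sign convention and normalization.
n, j = 3, 5
N = 2 ** n
e_j = np.zeros(N)
e_j[j] = 1.0

qft_col = np.exp(2j * np.pi * j * np.arange(N) / N) / np.sqrt(N)
assert np.allclose(qft_col, np.fft.ifft(e_j, norm="ortho"))
```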

2. Amplitude Amplification

Grover's algorithm and amplitude amplification manipulate amplitudes to increase the probability of measuring the correct answer:

$$|\psi\rangle \to Q^k|\psi\rangle, \quad \text{where } Q \text{ amplifies target amplitudes}$$
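A toy classical simulation of one Grover iteration (target index and problem size chosen arbitrarily) shows the target probability jumping after a single step:

```python
import numpy as np

# One Grover iteration Q = (inversion about the mean) * oracle on N = 8 items.
N, target = 8, 3
psi = np.full(N, 1 / np.sqrt(N))        # uniform superposition

def grover_step(psi):
    psi = psi.copy()
    psi[target] *= -1                   # oracle: flip the target's sign
    return 2 * psi.mean() - psi         # diffusion: inversion about the mean

p0 = abs(psi[target]) ** 2              # 1/8 before
psi = grover_step(psi)
p1 = abs(psi[target]) ** 2              # ~0.78 after one iteration
print(p0, p1)
```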

3. Quantum Signal Processing (QSP)

QSP can implement polynomial transformations $P(\cos\theta)$ on amplitude distributions. This is essentially the idea sketched above: structured manipulation of amplitudes.
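One concrete instance: with all QSP phase angles set to zero, the standard signal operator $W(a)$ composes to Chebyshev polynomials, which a few lines of numpy can verify (the degree and signal value are arbitrary):

```python
import numpy as np

# QSP signal operator W(a) = [[a, i*s], [i*s, a]] with s = sqrt(1 - a^2),
# a = cos(theta). With trivial phases, (W^d)[0, 0] = T_d(a), the degree-d
# Chebyshev polynomial of the first kind.
a = 0.7
s = 1j * np.sqrt(1 - a ** 2)
W = np.array([[a, s], [s, a]])

top_left = np.linalg.matrix_power(W, 3)[0, 0]
T3 = 4 * a ** 3 - 3 * a                 # T_3(a) in closed form
assert np.isclose(top_left, T3)
```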

A Possible Hybrid Approach?

Here's where it might get interesting:

Amplitude Discretization for Noise-Resilient Encoding

Instead of maximizing information density, what if we discretize amplitudes to create error-correcting structure?

$$\alpha_i \in \{\epsilon_1, \epsilon_2, \ldots, \epsilon_m\} \quad \text{(a discrete amplitude alphabet)}$$

Apply FFT, then constrain frequency-domain coefficients to discrete values. This might give:

  • Reduced sensitivity to amplitude damping
  • Natural compression (like JPEG for quantum states)
  • Easier verification (finite precision arithmetic)

The trade-off: We sacrifice some information capacity for robustness.
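A hypothetical sketch of such an encoder (the alphabet values and noise level are made up): snap each magnitude to the nearest allowed value and renormalize, so noise that stays within a bin is decoded away:

```python
import numpy as np

# Illustrative discrete-alphabet "decoder": project amplitude magnitudes
# onto a small alphabet, keep phases, renormalize.
alphabet = np.array([0.0, 0.35, 0.61])

def snap(amps):
    mags = np.abs(amps)
    idx = np.argmin(np.abs(mags[:, None] - alphabet[None, :]), axis=1)
    snapped = alphabet[idx] * np.exp(1j * np.angle(amps))
    return snapped / np.linalg.norm(snapped)

clean = snap(np.array([0.6, 0.35, 0.35, 0.6], dtype=complex))
noisy = clean + 0.02 * np.random.default_rng(1).normal(size=4)
assert np.allclose(snap(noisy), clean, atol=1e-6)   # noise within a bin is absorbed
```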

Practical Implication: Vocabulary Encoding

Question: If we can't beat the information bounds, do we need as many qubits as classical bits to encode a vocabulary of size $V$?

Short answer: Yes, fundamentally.

To encode one of VV distinct tokens:

Classical: $\lceil \log_2 V \rceil$ bits

Quantum: $\lceil \log_2 V \rceil$ qubits

Here's why:

Distinguishability Requires Orthogonal States

To reliably distinguish between $V$ different tokens, we need $V$ mutually orthogonal quantum states. An $n$-qubit system has at most $2^n$ orthogonal basis states. Therefore:

$$2^n \geq V \implies n \geq \log_2 V$$

No amplitude tricks change this—orthogonality is a geometric constraint in Hilbert space.
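In code, the bound is just a ceiling log (the helper name is mine):

```python
import math

# Smallest n with 2^n >= V: one orthogonal basis state per token.
def qubits_for_vocab(V):
    return math.ceil(math.log2(V))

print(qubits_for_vocab(50_000))   # 16, the same as the classical bit count
```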

Where Quantum Wins (Not in Storage)

The advantage of quantum computing is not in data compression. It's in:

  1. Superposition: A token embedding can be in superposition:

    $$|\text{token}\rangle = \sum_{i=1}^{V} \alpha_i |i\rangle$$

    This lets us process multiple possibilities in parallel.

  2. Entanglement: Token relationships can be encoded non-locally:

    $$|\text{bigram}\rangle = \sum_{i,j} \alpha_{ij} |i\rangle \otimes |j\rangle$$

    With $n$ qubits, you can represent entangled states whose shortest classical description requires up to $2^n$ parameters.

  3. Interference: Quantum algorithms use constructive/destructive interference to amplify correct answers—this is where speedup comes from, not from packing more data per qubit.
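Point 2 can be checked directly: a bigram amplitude matrix $\alpha_{ij}$ factors as $a_i b_j$ exactly when it has rank 1, and a Bell-like bigram does not (the states below are illustrative):

```python
import numpy as np

# Separable bigram: alpha_ij = a_i * b_j, so the matrix has rank 1.
product = np.outer([0.6, 0.8], [0.8, 0.6])
# Entangled (Bell-like) bigram over a 2-token vocabulary: rank 2,
# so no pair of independent token states reproduces it.
bell = np.array([[1.0, 0.0], [0.0, 1.0]]) / np.sqrt(2)

assert np.linalg.matrix_rank(product) == 1
assert np.linalg.matrix_rank(bell) == 2
```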

The Trap for Quantum NLP

If you're building a quantum transformer or language model, you might think: "Can I use fewer qubits than classical bits for my vocabulary?"

No. For a vocabulary of 50k tokens, you need:

  • Classical: $\lceil \log_2(50000) \rceil = 16$ bits
  • Quantum: $\lceil \log_2(50000) \rceil = 16$ qubits

But: The quantum version can process superpositions of tokens, entangle positional encodings with semantic embeddings in ways classical systems can't, and potentially offer speedups in attention mechanisms through Grover-like searches.

The win is in computation, not representation.

The Verdict

Can we use discretized amplitudes to encode more than $n$ bits on $n$ qubits?

No. The Holevo bound and measurement constraints prevent extracting more classical information than the system's entropy allows.

But: We can use amplitude structure for:

  • Algorithmic advantage (QFT, QSP already do this)
  • Noise-resilient encoding (discretization as a form of "quantization")
  • Quantum machine learning (amplitude patterns as features)

The amount of information we can extract is fundamentally capped at $n$ bits for $n$ qubits, but how we structure the $2^n$ amplitudes still matters for computation and error correction.

Next: Testing Discretized QSP

I'm curious whether variational circuits can learn to prepare "quantized amplitude" states that are more robust to noise. Might be worth implementing a small experiment:

  1. Parameterized circuit to prepare $|\psi(\theta)\rangle$
  2. Constraint: amplitudes must lie in discrete bins
  3. Measure noise resilience vs. standard amplitude encoding

Worth exploring whether discretization helps or hurts in practice.
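A single-qubit toy version of steps 1 through 3 (the bin grid, encoding angle, and damping strength are all placeholder choices), assuming amplitude damping as the noise model:

```python
import numpy as np

bins = np.linspace(0, 1, 5)                 # allowed |alpha| values (toy grid)

def prepare(theta):
    # step 1: parameterized single-qubit state [cos(theta/2), sin(theta/2)]
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def to_bin(amp):
    # step 2: which discrete bin does this amplitude magnitude land in?
    return int(np.argmin(np.abs(bins - abs(amp))))

def damp(psi, gamma):
    # step 3 noise model: no-jump branch of amplitude damping (Kraus K0),
    # renormalized — shrinks the |1> amplitude by sqrt(1 - gamma)
    out = np.array([psi[0], np.sqrt(1 - gamma) * psi[1]])
    return out / np.linalg.norm(out)

theta = 2 * np.arcsin(0.75)                 # encode |alpha_1| = 0.75 (bin 3)
encoded = to_bin(prepare(theta)[1])
decoded = to_bin(damp(prepare(theta), 0.05)[1])
print(encoded, decoded)                     # mild damping leaves the bin intact
```

Scaling this up means sweeping gamma until the decoded bin flips, and comparing that threshold against an un-discretized encoding.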