nature.com

Circuit-Based Quantum Random Access Memory for Classical Data - Scientific Reports

️Rhee, June-Koo Kevin
️Fri Mar 08 2019

A quantum superposition state prepared with respect to classical data for quantum computation can be referred to as quantum database (QDB). In most general form, a QDB can be expressed as

$$|\psi {\rangle }_{{\rm{QDB}}}=\sum _{l=0}^{M-1}\,{b}_{l}|{\overrightarrow{d}}^{(l)}\rangle ,$$

(1)

where ${\overrightarrow{d}}^{(l)}={d}_{0}^{(l)}{d}_{1}^{(l)}\ldots {d}_{n-1}^{(l)}\in {\{0,1\}}^{n}$ denotes a string of quantum bits in the computational basis to represent a classical data, and M is the number of data. For big data applications, a typical quantum representation of ${\overrightarrow{d}}^{(l)}$ can be decomposed as $|{\overrightarrow{d}}^{(l)}\rangle =|{\overrightarrow{x}}^{(l)}\rangle |l\rangle $, where ${\overrightarrow{x}}^{(l)}$ and l denote a data entry and the corresponding label, respectively^7,10. The probability amplitude, b_l, can encode continuous data as required in some applications^10,15, or represent the normalized occurrence of the data entry ${\overrightarrow{x}}^{(l)}$. For M numbers of n-bit data, n + m qubits are sufficient to realize the QDB, where n and $m=\lceil {\mathrm{log}}_{2}(M)\rceil $ qubits encode the data and the label, respectively. The number of label qubits can be reduced by labelling only the data that appears more than once. The label qubits are unnecessary when all data entries, ${\overrightarrow{x}}^{(l)}$, are unique.

Flip-flop QRAM

The FF-QRAM is used to generate a QDB as follows. Consider a quantum computer with an (n + m)-qubit bus state to encode a big data class database. The bus qubit state can be arbitrary, and it defines which computational basis states, |j〉_B, are accessed with probability amplitudes, ${\psi }_{j}$. A QRAM operation on the bus qubit superposes a set of classical data D as

$${\rm{QRAM}}(D)\,\sum _{j}\,{\psi }_{j}|j{\rangle }_{B}|0{\rangle }_{R}\equiv \sum _{l}\,{\psi }_{l}|{\overrightarrow{d}}^{(l)}{\rangle }_{B}|{b}_{l}{\rangle }_{R},$$

(2)

where the subscript B (R) indicates the bus (register) qubit, and the register qubit can include the probability amplitudes for encoding the analog data. The FF-QRAM is implemented systematically with standard quantum circuit elements, which include the classically-controlled Pauli X gate, $\bar{c}X$, and the n-qubit controlled rotation gate, CⁿR_p(θ). The $\bar{c}X$ flips the target qubit only when the classical control bit is zero. The CⁿR_p(θ) gate rotates the target qubit by θ around the p-axis of the Bloch sphere only if all n control qubits are 1.

The underlying idea of the FF-QRAM model is depicted in Fig. 1, describing the procedure to superpose two independent bit strings ${\overrightarrow{d}}^{(l)}$ and ${\overrightarrow{d}}^{(l+1)}$ with target probability amplitudes in the bus qubit state, $|\psi {\rangle }_{B}$. In this example, the label qubits, $|l\rangle $, are omitted without loss of generality to deliver the main idea. The initial state can be expressed with a focus on ${\overrightarrow{d}}^{(l)}$ as

$$|{\psi }_{0}{\rangle }_{l}={\psi }_{{\overrightarrow{d}}^{(l)}}|{\overrightarrow{d}}^{(l)}\rangle |0{\rangle }_{R}+\sum _{j\ne {\overrightarrow{d}}^{(l)}}\,{\psi }_{j}|j\rangle |0{\rangle }_{R},$$

(3)

where $|{\psi }_{s}{\rangle }_{l}$ denotes the state of the (n + 1) qubits in the process of writing the lth data entry, observed at the sth step in Fig. 1. The $\bar{c}X$ gates controlled by ${\overrightarrow{d}}^{(l)}$ rearrange the computational basis states of the bus qubit so that $|{\overrightarrow{d}}^{(l)}\rangle $ becomes $|1{\rangle }^{\otimes n}$, and the rest of the quantum bits flip accordingly:

$$|{\psi }_{1}{\rangle }_{l}={\psi }_{{\overrightarrow{d}}^{(l)}}|1{\rangle }^{\otimes n}|0{\rangle }_{R}+\sum _{|\overline{j\oplus {\overrightarrow{d}}^{(l)}}\rangle \ne |1{\rangle }^{\otimes n}}\,{\psi }_{j}|\overline{j\oplus {\overrightarrow{d}}^{(l)}}\rangle |0{\rangle }_{R}.$$

(4)

The overline in the last term indicates that the bit flip occurs if the control bit is 0. After step 1, the controlled qubit rotation, ${C}^{n}{R}_{y}({\theta }^{(l)})$, denoted as θ^(l) in the figure, is applied to the register qubit. The quantum state at step 2 becomes

$$|{\psi }_{2}{\rangle }_{l}={\psi }_{{\overrightarrow{d}}^{(l)}}|1{\rangle }^{\otimes n}|{\theta }^{(l)}{\rangle }_{R}+\sum _{|\overline{j\oplus {\overrightarrow{d}}^{(l)}}\rangle \ne |1{\rangle }^{\otimes n}}\,{\psi }_{j}|\overline{j\oplus {\overrightarrow{d}}^{(l)}}\rangle |0{\rangle }_{R},$$

(5)

where $|\theta \rangle =\,\cos \,\theta |0\rangle +\,\sin \,\theta |1\rangle $. The $\bar{c}X$ gates conditioned on ${\overrightarrow{d}}^{(l)}$ are applied again to revert the bus state:

$$|{\psi }_{3}{\rangle }_{l}={\psi }_{{\overrightarrow{d}}^{(l)}}|{\overrightarrow{d}}^{(l)}\rangle |{\theta }^{(l)}{\rangle }_{R}+\sum _{j\ne {\overrightarrow{d}}^{(l)}}\,{\psi }_{j}|j\rangle |0{\rangle }_{R}.$$

(6)

The second round registers the next data ${\overrightarrow{d}}^{(l+\mathrm{1)}}$ and θ^(l+1):

$$|{\psi }_{4}{\rangle }_{l,l+1}={\psi }_{{\overrightarrow{d}}^{(l)}}|{\overrightarrow{d}}^{(l)}\rangle |{\theta }^{(l)}{\rangle }_{R}+{\psi }_{{\overrightarrow{d}}^{(l+1)}}|{\overrightarrow{d}}^{(l+1)}\rangle |{\theta }^{(l+1)}{\rangle }_{R}+\sum _{j\ne {\overrightarrow{d}}^{(l)},{\overrightarrow{d}}^{(l+1)}}\,{\psi }_{j}|j\rangle |0{\rangle }_{R}.$$

(7)

This process can be repeated as many times as the number of data entries. In this way, M data entries can be registered with non-uniform weights to generate a state,

$$\sum _{l=0}^{M-1}\,{\psi }_{{\overrightarrow{d}}^{(l)}}|{\overrightarrow{d}}^{(l)}\rangle \,{[\cos {\theta }^{(l)}|0\rangle }_{R}+\,\sin \,{\theta }^{(l)}|1{\rangle }_{R}]+\sum _{j\notin \{{\overrightarrow{d}}^{(l)}\}}\,{\psi }_{j}|j\rangle |0{\rangle }_{R}.$$

(8)

Finally, the queried QDB derived from Eq. (1) can be obtained by selecting an appropriate angle θ^(l) to match the desired probability amplitude b_l, and post-selecting the measurement outcome $\mathrm{|1}{\rangle }_{R}$. The probability to measure $\mathrm{|1}{\rangle }_{R}$ is

$$P(1)=\sum _{l=0}^{M-1}\,|{\psi }_{{\overrightarrow{d}}^{(l)}}\,\sin \,{\theta }^{(l)}{|}^{2}.$$

(9)

The post-selection increases the total runtime by a factor of ~1/P(1), which is data dependent. In some instances, such as in the distance-based quantum classifier¹⁰, the post-selection success probability can be improved by pre-processing the classical data so that θ^(l) is close to kπ/2 for all l where k is an odd-integer.

For the quantum state containing data as equal superposition, the controlled rotations can be replaced with the controlled-NOT gate. Then, the classical data is encoded only in the digital form. If the bus qubit is not in the basis state that corresponds to a data entry ${\overrightarrow{d}}^{(j)}$, then the jth data entry cannot be written in the queried QDB. Moreover, when the same bit string appears more than once, the register qubit accumulates the rotation.

Updating desired data entries of the existing quantum database using our scheme is straightforward. The update can be done by inserting the QDB state as the bus qubit and addressing only the target basis states that are to be updated with the selective flip-flop process.

It is important to note that the post-selection process is not always necessary. For example, the post-selection is not needed when all or some of the bus qubit states are addressed to write or modify the binary data to generate the transformation,

$${\rm{QRAM}}(D)\,\sum _{j}\,{\psi }_{j}|j{\rangle }_{B}|{D}_{j}^{0}{\rangle }_{R}=\sum _{j}\,{\psi }_{j}|j{\rangle }_{B}|{D}_{j}{\rangle }_{R},$$

(10)

where ${D}_{j}^{0}$, ${D}_{j}\in \{0,1\}$. With r register qubits, this process can be easily generalized to encode r-bit data.

In most of the big data applications, real-valued data is encoded in the probability amplitude as discussed thus far. However, if desired, our method can also encode complex probability amplitudes by using a controlled rotation around an arbitrary axis.

In addition to O(Mn) flip-register-flop steps, the total FF-QRAM cost must include the resource overhead for the register operations. In fact, the number of elementary gates needed for this step can dominate the runtime of the entire QRAM process. Thus efficient realization of CⁿR_p(θ) is critical for the practicality of our scheme. Though the optimal circuit depth reduction can be carried out based on the naturally available set of gates in a specific experimental setup, and is beyond the primary scope of this paper, we briefly mention some examples on how to implement CⁿR_p(θ) here. If energy splittings between all pairs of the computational basis states are distinct, then in principle, a resonant pulse at the frequency corresponding to the energy difference between $|1{\rangle }^{\otimes n}|0\rangle $ and $\mathrm{|1}{\rangle }^{\otimes n+1}$ can realize the desired CⁿR_p(θ). But this condition becomes exponentially challenging to satisfy in practice as the number of qubits increases. On the other hand, we can decompose the controlled rotation as ${C}^{n}{R}_{y}(\theta )={{\rm{C}}}^{n}{\rm{NOT}}$ R_y(θ/2) CⁿNOT ${R}_{y}^{\dagger }(\theta /\mathrm{2)}$. The CⁿNOT gate can be further decomposed into 2n − 3 Toffoli gates with n − 2 ancilla qubits prepared in $\mathrm{|0}\rangle $ (see Methods). A Toffoli gate can be realized by applying a frequency-selective on-resonance pulse as described above if a set of three qubits is fully addressable while decoupled from the rest of the qubits in the system. Alternatively, a Toffoli gate can be decomposed into five two-qubit gates without requiring ancilla qubits¹⁶. Other methods for implementing CⁿNOT using O(n) number of elementary gates and ancillary space are discussed in refs^16,17,18. The circuit optimization in terms of Clifford and T gates can be performed using the techniques presented in refs^19,20.

We investigate the robustness of the FF-QRAM shown in Fig. 1 under imperfections using a simple but relevant error model. We assume a typical depolarizing error, in which the state at each time step becomes the maximally mixed classical state with probability ε, and remains unchanged with probability $1-\varepsilon $. Here, we use the Toffoli gate as an example to count the number of time steps, while further gate decomposition and optimization can be required depending on the experimental setup as mentioned above. When implementing the CⁿNOT, 2n − 1 qubits undergo $2\lceil {\mathrm{log}}_{2}(n)\rceil -1$ time steps. Therefore, the success probability after writing M classical bit strings of length n with arbitrary probability amplitudes is ${(1-\varepsilon )}^{O(Mn{\mathrm{log}}_{2}(n))}$. As an illustrative example, solid lines in Fig. 2 show the individual error rate at each time step necessary for writing M classical bit strings with arbitrary probability amplitudes, assuming $n={\mathrm{log}}_{2}(M)$ without loss of generality, with the success probability p_s of the total QRAM process. A milder assumption that the imperfect CⁿR_p(θ) operation causes independent errors on n + 1 qubits yields a better success probability, ${(1-\varepsilon )}^{O(Mn)}$. This case is plotted as dashed lines with open symbols in the figure. The Methods section elaborates on how the number of noisy operations is counted.

Since our scheme is based on the quantum circuit model, fault-tolerant quantum error correction techniques^21,22,23 can be employed to enhance the accuracy. In contrast, if quantum error correction is applied to BB-QRAM, all routing components are activated at a physical level and make the scheme equivalent to the conventional fanout architecture²⁴. In addition, depending on the physical setup, the quantum circuit can be further optimized using various gate decomposition techniques^16,19,25,26.

Application to quantum support vector machine

As an example, we demonstrate the FF-QRAM process for preparing a training data state in the quantum support vector machine. The classified training examples, ${\overrightarrow{x}}^{(i)}\in {{\mathbb{R}}}^{N}$, need to be encoded in the quantum format as

$$|\chi \rangle =\frac{1}{\sqrt{{\sum }_{i=0}^{M-1}\,|{\overrightarrow{x}}^{(i)}{|}^{2}}}\,\sum _{i=0}^{M-1}\,\sum _{k=0}^{N-1}\,{x}_{k}^{(i)}|k\rangle |i\rangle ,$$

(11)

where M denotes the number of training data samples⁸. The quantum circuit for preparing this QDB is depicted in Fig. 3 for a particular training data with an index i, where without loss of generality, M and N are assumed to be powers of two. An equal superposition state for log₂(M) and log₂(N) qubits is used as the bus qubit for providing the computational basis states k and i, respectively. The jth element of ${\overrightarrow{x}}^{(i)}$ determines the angle ${\theta }_{j}^{(i)}$ for the y-axis rotation of the register qubit. Note that the gates shaded in gray are included only to show the full flip-flop process for addressing a specific computational basis state, but they can be omitted in the implementation. The state shown in Eq. (11) is obtained after post-selecting the measurement outcome $\mathrm{|1}{\rangle }_{R}$ on the register qubit. This scheme constructs the QDB shown in Eq. (11) using $O({\mathrm{log}}_{2}(MN))$ hardware resources and O(MN) flip-register-flop operations.

Quantum forking

Here, we introduce a concept of quantum forking (QF) with which a qubit can undergo independent processes in superposition. This can be utilized as a means to reduce the number of QRAM queries in certain applications. Let us consider a quantum state $|{{\rm{\Psi }}}_{0}\rangle =|0\rangle |{\rm{\Phi }}\rangle |a\rangle $ with an n-qubit QDB state $|{\rm{\Phi }}\rangle $ generated by a QRAM process and an arbitrary n-qubit state $|a\rangle $, where $|{{\rm{\Psi }}}_{s}\rangle $ denotes the state at step s in Fig. 4(a). An n-qubit swap gate between $|{\rm{\Phi }}\rangle $ and $|a\rangle $ controlled by a qubit in $(|0\rangle +|1\rangle )/\sqrt{2}$ forms an entangled state,

$$|{{\rm{\Psi }}}_{1}\rangle =\frac{1}{\sqrt{2}}\,(|0\rangle |{\rm{\Phi }}\rangle |a\rangle +|1\rangle |a\rangle |{\rm{\Phi }}\rangle ).$$

(12)

In other words, the QDB is encoded in the first n-qubit data block if the control qubit is 0, and in the second n-qubit data block if the control qubit is 1. Then by applying two unitary evolutions activated by different computational basis states of the control qubit to each n-qubit block, $|{\rm{\Phi }}\rangle $ forks into two different states in superposition:

$$|{{\rm{\Psi }}}_{2}\rangle =\frac{1}{\sqrt{2}}\,(|0\rangle {U}_{1}|{\rm{\Phi }}\rangle |a\rangle +|1\rangle |a\rangle {U}_{2}|{\rm{\Phi }}\rangle ).$$

(13)

Evidently, it is not possible to create correlations between $|{{\rm{\Phi }}}_{1}\rangle ={U}_{1}|{\rm{\Phi }}\rangle $ and $|{{\rm{\Phi }}}_{2}\rangle ={U}_{2}|{\rm{\Phi }}\rangle $ via linear operations. Nonetheless, QF can speedup certain tasks, such as ensemble averaging²⁷ and the inner product calculation. Here we focus on the inner product evaluation problem as an example. The inner product between $|{{\rm{\Phi }}}_{1}\rangle $ and $|{{\rm{\Phi }}}_{2}\rangle $ can be evaluated by preparing these two states individually by making queries to the QRAM and performing the swap test²⁸. Alternatively, starting from the state shown in Eq. (13), another controlled swap followed by a Hadamard operation on the control qubit yields the state

$$|{{\rm{\Psi }}}_{3}\rangle =\frac{1}{2}[|0\rangle \,(|{{\rm{\Phi }}}_{1}\rangle +|{{\rm{\Phi }}}_{2}\rangle )+|1\rangle \,(|{{\rm{\Phi }}}_{1}\rangle -|{{\rm{\Phi }}}_{2}\rangle )]\,|a\rangle .$$

(14)

Finally, the probability of measuring the control qubit in $\mathrm{|0}\rangle $ is given by $P(0)=[1+{\rm{Re}}(\langle {{\rm{\Phi }}}_{1}|{{\rm{\Phi }}}_{2}\rangle )]/2$. This procedure only reveals the real part of the inner product. The imaginary part can be evaluated by adding a phase gate to the control qubit in front of the final Hadamard gate. Since the ancilla qubit is in an arbitrary state, QRAM is used only for preparing $|{\rm{\Phi }}\rangle $ once. The ancilla qubit can even be in the maximally mixed state, and we assume that the cost of preparing such a state is negligible. This method consumes O(n) additional gates, but reduces the number of QRAM queries by a factor of ~1/2. Note that the conventional swap test can only estimate the magnitude of the inner product. Thus the QF based approach not only reduces the number of QRAM queries, but also allows for the determination of the sign of the inner product. This is a consequence of an important property of the QF circuit; since different unitary operators can be applied to each branch (subspace), the global phase that a unitary operator introduces become distinguishable. Clearly, the quantum circuit shown on the left side of Fig. 4(a) can be rewritten more compactly without the controlled swap gates and the ancilla qubit by applying both controlled unitary operators directly to the data qubit as shown on the right side. The quantum circuit on the left illustrates the general quantum forking framework that can also be adapted for other applications by replacing the swap test with other measurement schemes²⁷.

Generalizing above idea, a quantity such as ${\sum }_{1\le i,j\le d}\,{\rm{Re}}(\langle {{\rm{\Phi }}}_{i}|{{\rm{\Phi }}}_{j}\rangle )$ can be evaluated by repeating the modified swap test for which only one QRAM state preparation is needed. The modified swap test based on QF requires a control qudit of dimension d (or log₂(d) qubits), and O(nd) gates.

The quantum forking for implementing three different unitary processes in superposition is depicted in Fig. 4(b). A qutrit is used as the control, and H₃ represents the qutrit Hadamard operation for preparing the equal superposition of the three computational basis states. This circuit produces an entangled state $(|0\rangle |{{\rm{\Phi }}}_{1}\rangle |a\rangle |a\rangle +|1\rangle |a\rangle |{{\rm{\Phi }}}_{2}\rangle |a\rangle +|2\rangle |a\rangle |a\rangle |{{\rm{\Phi }}}_{3}\rangle )$/$\sqrt{3}$.