Efficient quantum measurement of Pauli operators
Abstract
Estimating the expectation value of an observable is a fundamental task in quantum computation. Unfortunately, it is often impossible to obtain such estimates directly, as the computer is restricted to measuring in a fixed computational basis. One common solution splits the observable into a weighted sum of Pauli operators and measures each separately, at the cost of many measurements. An improved version first groups mutually commuting Pauli operators together and then measures all operators within each group simultaneously. The effectiveness of this depends on two factors. First, to enable simultaneous measurement, circuits are required to rotate each group to the computational basis. In our work, we present two efficient circuit constructions that suitably rotate any group of commuting qubit Pauli operators using at most and twoqubit gates respectively. Second, metrics that justifiably measure the effectiveness of a grouping are required. In our work, we propose two natural metrics that operate under the assumption that measurements are distributed optimally among groups. Motivated by our new metrics, we introduce SORTED INSERTION, a grouping strategy that is explicitly aware of the weighting of each Pauli operator in the observable. Our methods are numerically illustrated in the context of the Variational Quantum Eigensolver, where the observables in question are molecular Hamiltonians. As measured by our metrics, SORTED INSERTION outperforms four conventional greedy colouring algorithms that seek the minimum number of groups.
I Introduction
Estimating the expectation value of an observable on a quantum state is a fundamental task in quantum mechanical experiments. However, often there is no natural way to measure directly and some indirect protocol is required. In particular, this is true of current quantum computers that can only measure each qubit in the computational basis defined, by convention, as eigenstates of the Pauli operator.
One naive protocol, therefore, is to decompose into a weighted sum of Pauli operators (or Paulis) and then measure each separately. An extended version assembles the Paulis into commuting subsets, or “groups”, given by
(1) 
for some . All Paulis in a group can then be measured at the same time, as any set of commuting Paulis can be simultaneously diagonalised by a single unitary , so
(2) 
where is diagonal in the computational basis. To measure all operators in simultaneously, we first apply the unitary “rotation” to , then measure in the computational basis which yields a bitstring, , of measurements on each qubit, and finally, for each , classically postprocess according to to infer .
Expectation estimation features prominently as the quantum subroutine of the Variational Quantum Eigensolver (VQE) algorithm Peruzzo et al. (2014), which has emerged as a leading candidate for exhibiting quantum advantage in the Noisy Intermediate Scale Quantum era Preskill (2018). VQE is a hybrid quantumclassical algorithm designed to find the ground state Peruzzo et al. (2014); McClean et al. (2016); O’Malley et al. (2016); Kandala et al. (2017a); McArdle et al. (2018); Ryabinkin et al. (2019); Romero et al. (2018); Wang et al. (2019), or energy spectra McClean et al. (2017); Santagati et al. (2018); Colless et al. (2018); Heya et al. (2019); Jones et al. (2019); Higgott et al. (2019), of a physical or chemical system. The observable in question is a Hamiltonian on qubits. In the context of quantum chemistry, readily decomposes into a weighted sum of Paulis via, for example, the JordanWigner Jordan and Wigner (1928), BravyiKitaev Bravyi et al. (2017), or VerstraeteCirac Verstraete and Cirac (2005) transformations.
The paper that introduced VQE Peruzzo et al. (2014) proposed measuring according to the naive protocol above. However, this can be inefficient. For example, a secondquantised chemical Hamiltonian on qubits decomposes into a very large number of Paulis that scales as . To remedy this problem, McClean et al. McClean et al. (2016) proposed the extended protocol. The authors also argued using a toy example that, due to covariances between Paulis, optimally grouping might not be the same as minimising the number of groups, . However, Ref. McClean et al. (2016) did not propose strategies to obtain the commuting groups , neither did it show how to construct the rotation that enables simultaneous measurement.
Recently, a series of papers Verteletskyi et al. (2019); Jena et al. (2019); Yen et al. (2019); Huggins et al. (2019); Gokhale et al. (2019) ^{1}^{1}1We mention that both the grouping and rotation construction problems have been addressed on an adhoc basis by experimentalists since Kandala et al. Kandala et al. (2017b). We refer readers to Table 2 of Ref. Gokhale et al. (2019) for a good summary. We also mention that recent Ref. Izmaylov et al. (2019) and the less recent Ref. Izmaylov et al. (2019) allow for grouping of noncommuting Paulis. We found it interesting that such approaches are feasible, but found it hard to compare them to our work on a likeforlike basis. have appeared that together make good progress on both the grouping strategy and rotation construction problems. Our paper is in this same arena and attacks both problems.
First, we contribute two new methods for constructing Clifford rotation circuits that enable simultaneous measurement of arbitrary , i.e., a group containing arbitrary commuting Paulis. Like Ref. Gokhale et al. (2019), we approached the problem via the stabiliser formalism but have gone further to consider the case when has any number of independent Paulis. We show that the number of twoqubit gates in can be reduced in a way that scales with . This is important because it is atypical for actual groupings to have exactly independent Paulis and reducing the number of twoqubit gates is important, especially in the nearterm Blais et al. (2007); Laing,Anthony et al. (2010); Barends et al. (2014); Ballance et al. (2016); Allen et al. (2017); Linke et al. (2017); Wendin (2017); Reagor et al. (2018); Schäfer et al. (2018); Webb et al. (2018); Levine et al. (2018); He et al. (2019b); Huang et al. (2019); Blumel et al. (2019); He et al. (2019a). As far as we are aware, ours is the first paper to consider the case explicitly. Also, we emphasise the role classical postprocessing can play in saving quantum resources.
More specifically, we introduce constructions “CZ” and “CNOT”. The CZconstruction builds on work by Van den Nest, Dehaene, and De Moor Van den Nest et al. (2004) in the graphstate literature to yield with a number of twoqubit gates, or “size”, at most
(3) 
The CNOTconstruction builds on our CZconstruction, and work by Aaronson and Gottesman Aaronson and Gottesman (2004) and Patel, Markov, and Hayes Patel et al. (2008) to yield with size at most
(4) 
We stress that and are worstcase upper bounds. In practice, numerical simulations are needed to determine whether the CZ or CNOTconstruction is actually more efficient. We note that, in the case of , our methods produce a twoqubit gate count scaling no worse than the previous best of Gokhale et al. (2019). Other works, such as Ref. (Jena et al., 2019, Appendix A), prove only that a Clifford exists, or demonstrate a worstcase gate count scaling that is worse than (Yen et al., 2019, Appendix B), or present a method that cannot be used for arbitrary Huggins et al. (2019).
When considering the grouping strategy problem, we contribute two new but natural metrics, and , that quantify the performance of any given grouping. and measure the ratio between the number of measurements required in the ungrouped case versus the grouped case to attain a fixed level of accuracy. The key novelty in these two metrics is that they assume measurements on the groups are distributed optimally to maximise accuracy, following Refs. Wecker et al. (2015); Rubin et al. (2018); Romero et al. (2018). The difference between them is that is statedependent but is designed to approximate over the uniform spherical measure. Therefore, is more suitable for use given some knowledge of the underlying state , while is more suitable otherwise.
We find it useful to prove that, for all , breaking a single commuting group into two never improves nor . More formally, let group be the disjoint union of groups and ; then
(5) 
This result is in direct contrast to the conclusion of the aforementioned toy example used by McClean et al. McClean et al. (2016), and analysed in full in Ref. (Gokhale et al., 2019, Sec. 10.1), that breaking a group can be advantageous. The reason for the discrepancy is that we assume measurements are distributed optimally, whereas they assume measurements are distributed uniformly.
Informed by the mathematical form of , we contribute our new grouping strategy SORTED INSERTION. Unlike strategies used previously Verteletskyi et al. (2019); Jena et al. (2019); Gokhale et al. (2019), SORTED INSERTION is explicitly aware of the coefficients in the Pauli decomposition of an observable . We present data showing SORTED INSERTION outperforming all four conventional greedy colouring algorithms that we tried, as measured by the metric . Our data also challenges the assumption that minimising the number of groups is optimal, as groupings with the smallest number of groups do not typically perform the best with respect to . Note that this does not logically contradict Eq. (5). We quantify the performance of SORTED INSERTION using the metric for molecules ranging in size from hydrogen H, which requires two qubits, to hydrogen selenide HSe, which requires 38, finding that it leads to a 10 to 60 fold improvement in the number of measurements required. Note that we are defining a single measurement to consist of a measurement of all qubits, and so the number of measurements equals the number of ansatz state preparations.
Finally, we run SORTED INSERTION alongside our CZconstruction on molecules requiring up to 38 qubits to calculate the actual number of twoqubit gates required for real molecular systems. Our numerical results show that the typical number of twoqubit gates is fewer than the worstcase by a factor of 3.5.
Ii Rotation constructions
In this section, we assume familiarity of the reader with the stabiliser formalism, especially the bit binary representation of qubit Paulis Calderbank et al. (1997); Gottesman (1997); Nielsen and Chuang (2011); Aaronson and Gottesman (2004). We follow the convention that the upper and lower halves of the binary matrix encode and operators respectively. This representation is reviewed in Appendix A. We also reserve symbols and for the identity and allzero matrices respectively.
Our starting point is a commuting group, , of Paulis which can be represented as a binary matrix . By Gaussian elimination on , we can form a matrix representing a set of independent Paulis drawn from where . Our goal is to transform , using certain allowed transformations, into a matrix where
(6) 
Let denote the circuit consisting of and transformations in the order they were applied from . Then applying to any state , measuring in the computational basis, and classically postprocessing allows us to measure on simultaneously.
The allowed set of transformations on a binary matrix is, where ranges over all columns, ranges over all rows, and addition is :

1q and 2q, one and twoqubit quantum row operations, specifically:

CZ on qubits and :
,
. 
CNOT on controlqubit and targetqubit :
,
. 
Hadamard (H) on qubit :
. 
Phase (P) on qubit :
.


cpp, classical postprocessing:

Products of eventual singlequbit computationalbasis measurements:
rightmultiply by invertible matrix. 
Relabelling of qubits and :
,
.


ext, basis extension.

Addition of further stabiliser:
append new column .

In the near term, operations in have different costs that can be justifiably ranked as “”. In the first inequality, cost can refer to either fidelity or gatetime Blais et al. (2007); Laing,Anthony et al. (2010); Barends et al. (2014); Ballance et al. (2016); Allen et al. (2017); Linke et al. (2017); Wendin (2017); Reagor et al. (2018); Schäfer et al. (2018); Webb et al. (2018); Levine et al. (2018); He et al. (2019b); Huang et al. (2019); Blumel et al. (2019); He et al. (2019a). Therefore, we have aimed to minimise the number of twoqubit gates, or “2qsize”, in the resulting from our constructions. This means, for example, we never perform the row swap using a twoqubit SWAP.
In presenting our constructions, we shall refer to the commutativity condition, preserved under , given by
(7) 
where is the matrix encoding the Paulis and
(8) 
We ignore any changes in sign of stabilisers under as this can be easily accounted for by classical postprocessing. Readers interested in this and other details are referred to Appendix B, where we work through our CZconstruction with a specific example.
ii.1 Czconstruction
Important to our first approach is the special class of stabiliser states known as graph states. Consider any graph on vertices. The graph state is then defined by independent stabiliser generators
(9) 
where is the set of neighbours of vertex in . The binary representation of these stabilisers is
(10) 
where , an symmetric matrix with s on its diagonal, is exactly the adjacency matrix of .
It is wellknown that where is a product of CZ gates and H is the Hadamard gate. More specifically, applies CZ between qubits and if and only if vertex neighbours in . Van den Nest, Dehaene, and De Moor Van den Nest et al. (2004) tell us that any stabiliser state can be transformed to a graph state by a product of singlequbit Clifford gates. It is therefore clear that we can transform any to via using at most twoqubit (CZ) gates, as this is the maximum number of edges on an vertex graph. The interesting question is whether we can do better by exploiting the potential low rank of .
Our answer is in the affirmative and we now present an explicit and efficient algorithm that constructs with at most twoqubit gates.
In Fig. 1, we illustrate the sequence of reductions that allow us to reach , and so , from . We now describe the salient aspects of each step:

Following (Aaronson and Gottesman, 2004, Lemma 6), we can apply Hadamard gates so that has rank . By cpp rowswaps (relabelling of qubits), we can ensure that the first rows of have fullrank.

Since the upper submatrix of has fullrank, cpp column operations can reduce it to .

We can directly verify that the extension to is valid by Eq. (7). Clearly has full columnrank . The sparsity of our chosen extension shall play a crucial role in the reduced 2qsize of when in both the CZ and CNOTconstructions.

Column operations can eliminate , then Phase gates can ensure has zeros on its diagonal. Importantly, represents a graph state .

. Hadamard and CZ gates can implement this final reduction as discussed above. The maximum number of CZ gates required to map to equals the maximum number of offdiagonal s in the upper half of . When , this is . When , this is due to sparsity of the upper half of which traces back to the step from .
Note that in step , we can first try to reduce the upper half of by singlequbit gates before applying CZ. One way to do this is to reduce the number of edges in the graph whose adjacency matrix is specified by the upper half of by the socalled “local complementation” operation Van den Nest et al. (2004); Hein et al. (2004); Bouchet (1993). This corresponds precisely to reducing the number of CZ gates in our CZconstruction.
ii.2 Cnotconstruction
We start from above which we reached without using twoqubit gates. Now, instead of using one block of CZ gates, we reduce to as shown in Fig. 2, using three blocks of CNOT gates:

. Note that must be symmetric by the commutativity condition given in Eq. (7). Then, following Ref. (Aaronson and Gottesman, 2004, Lemma 7), we can eliminate using singlequbit and CNOT gates. This is accomplished by noting that any symmetric binary can be Cholesky decomposed as , with diagonal and invertible.

. Reduce to by column operations, then add s on the top diagonal by phase gates.

. Now, the upper matrix can be blockCholesky decomposed as
(11) where
(12) (13) Next, we apply CNOT gates corresponding to . The number of CNOT gates required here equals the number of row operations required to reduce to . We find this is at most via arguments of Patel, Markov, and Hayes Patel et al. (2008). The proof is given in Appendix C.

. Multiply by on the right.

. is a symmetric matrix and so can be again eliminated via the Cholesky decomposition using CNOT gates.
Note that in the three steps , , and , we have used blocks of CNOT gates. The method we used to synthesise these blocks is sizeoptimal (Patel et al., 2008, Lemma 1), but we could have alternatively used methods in Ref. Jiang et al. (2019), that built on Ref. Moore and Nilsson (1998), to achieve optimal spacedepth tradeoff, where space refers to extra ancilla qubits.
To end our discussion of constructing rotation circuits, we briefly mention a third, ancillabased construction with size at most . This construction is wellknown in the context of syndrome measurement Nielsen and Chuang (2011) in quantum error correction but does not seem to have been mentioned in the context of measuring a Pauli decomposition of an observable, as in VQE. To measure commuting Paulis , this “ancillaconstruction” uses ancilla and involves consecutive blocks of generalisedCNOT gates, each targeted at a different ancilla. The controls in block are activated or deactivated by the or eigenstates of the singlequbit Paulis forming ^{2}^{2}2So corresponds to standard control on , corresponds to control on , corresponds to control on , and corresponds to not having a generalisedCNOT. These generalisedCNOT gates can be implemented by the standard CNOT conjugated by in case of , or in case of .. singlequbit measurements are performed on the ancilla at the end of each block to exactly give measurements of . Unfortunately, this construction requires extra ancilla qubits and has worse worstcase size than both of our constructions. However, it does serve as a simple way to see, a priori, that a size scaling of is possible.
Iii Grouping strategies
Now that we have demonstrated the construction of a rotation circuit for a group of generally commuting Paulis, we would like to quantify the advantage offered, in terms of use of the quantum computer, in assembling operators into such groups. We have a Hamiltonian, , of the form
(14) 
where is the number of groups of mutually commuting operators, is the number of operators in group , is the th Pauli operator in the th group and is its coefficient.
Given , let and be the minimal number of measurements required to attain an accuracy in the ungrouped and grouped (as per Eq. 14) cases respectively. Finding is a special case of finding . To find , we can solve the constrained optimisation problem that asks how a given number of measurements should be distributed in order to maximise accuracy. Following Ref. Wecker et al. (2015); Rubin et al. (2018); Romero et al. (2018), we can use Lagrange multipliers to find
(15) 
where
(16) 
Since is just evaluated with every operator in its own group, we have
(17) 
where
(18) 
The ratio , defined as
(19) 
therefore acts as a natural metric for the performance of a particular grouping under the assumption that measurements are distributed optimally. We prove as Claim 1 that combining two groups into one is always better with respect to .
Claim 1.
Consider two groups and of mutually commuting Paulis, where each Pauli is in at most one group. Suppose that it is possible to combine and into a single commuting group, called . Let and denote the metric, as defined in Eq. (19), for the two groups separated and combined respectively. Then
(20) 
Proof.
As is a strictly increasing function for and the numerator of is independent of the grouping, it is sufficient to consider only the size of denominator of
(21) 
for the two cases. The variance of a single commuting group can be written as
(22) 
where is the Hermitian and positive semidefinite covariance matrix for the state and is a vector of the coefficients . is defined to have elements
(23) 
where the covariance is defined as
(24) 
Let us now turn our attention to the single and two group comparison. We define coefficient vectors and , both of size , which contain the coefficients of the operators contained within groups and respectively. If a Pauli operator is not present within , the corresponding coefficient in will be zero, and similarly for and . The combined group therefore has coefficients . The contribution of the two separate groups to (22) is
(25) 
where is the covariance matrix of the full set of operators contained within . The contribution due to the single group is
(26) 
Because is positive semidefinite, we can define the semiinner product (Conway, 1994, Example 1.1) and use the CauchySchwarz inequality to find
(27) 
Equality holds if and only if there exist and , such that not both are equal to 0 and (Conway, 1994, Example 1.4). Therefore, by comparison, (25) (III) and so
(28) 
∎
Claim 1 shows that it is impossible to mitigate covariances by splitting groups and using the optimal measurement strategy. This is in contrast to Refs. McClean et al. (2016); Gokhale et al. (2019), who showed that it is possible using a suboptimal measurement strategy. In Appendix D, we redo precisely their example using the optimal measurement strategy. Note that Claim 1 does not imply that the minimum number of groups is best, simply that it is never better to break a single group into two.
If all of the variances going into are replaced by their expectation values over the uniform spherical measure (see Ref. (Watrous, 2018, Ch. 7)), we obtain another metric, , given by
(29) 
The derivation of is given in Appendix E. The same proof as in Claim 1 can be used to show that breaking a group into two is never better when measured by ; the only difference is the covariance matrix must be replaced by its expectation over the uniform spherical measure.
is a particularly useful metric because it approximates over the uniform spherical measure, but can be calculated analytically. A good grouping strategy maximises by minimising the denominator. This is achieved by taking advantage of the concavity of the square root by placing the operators with the largest coefficients in the same groups. Physically, this represents the optimal measurement scheme being able to direct many measurements towards a few groups with large variances. In the next paragraph, we propose a simple strategy for grouping operators motivated by this idea.
Given , the strategy is to take each operator ordered by the absolute value of the coefficient, check if it can be placed in an existing group and, if not, start a new group. The groups are checked in order of creation. This is of complexity at worst, where we recall that is the number of qubits and is the number of Pauli terms in the Hamiltonian. We have named this strategy SORTED INSERTION.
Greedy colouring algorithms, as implemented in Ref. Verteletskyi et al. (2019), require pregenerating the commutation graph which takes the same number of operations as the worst case scenario for SORTED INSERTION. The colouring algorithms then run on the graph adding their own complexity – see Table 1. Therefore, SORTED INSERTION’s worst case complexity is bounded by the best case complexity of greedy colouring algorithms, such as those we will compare it to in Sec. IV.
Colouring Algorithm  Time Complexity 
Largest First  
Connected Sequential d.f.s.  
DSATUR  
Independent Set 
Molecule  qubits  Paulis  Grouping  Ratios ,  Rotation Circuit size  


min  mean  max  theory max  true max  mean  
H  2  4  2  2.00  1.50  1.09  1.93  4.60  1.76  0  0  0  
H  4  59  10  5.90  3.50  3.76  11.92  33.04  10.25  6  3  0.80  
LiH  10  630  41  15.37  6.85  19.60  24.91  34.74  23.97  45  18  5.29  
OH  10  630  38  16.58  7.29  6.32  8.90  12.86  8.51  45  17  5.63  
HF  10  630  39  16.15  6.97  6.07  8.57  12.27  8.21  45  16  5.74  
HO  12  1085  51  21.27  9.04  7.68  11.27  16.96  10.67  66  26  7.37  
BH  14  1584  66  24.00  10.36  17.21  20.93  32.13  20.05  91  26  9.56  
NH  14  3608  118  30.58  11.34  12.65  15.96  26.93  15.31  91  28  10.26  
CH  16  3887  123  31.60  13.39  16.96  21.63  29.33  20.27  120  45  16.75 
Iv Application to VQE
Molecule  Largest First  Connected Sequential d.f.s.  Independent Set  DSATUR  SORTED INSERTION  


H  2  1.76  2  1.76  2  1.76  2  1.76  2  1.76  
H  10  4.86  10  10.25  10  10.30  9  4.10  10  10.25  
LiH  39  23.87  45  23.33  30  5.72  29  10.47  41  23.97  
OH  40  8.27  41  8.41  21  3.00  28  3.23  37  8.51  
HF  38  8.05  41  8.07  21  2.80  28  3.23  38  8.21  
HO  57  2.98  55  10.66  42  3.87  51  3.18  51  10.66  
BH  66  4.80  85  18.70  60  7.85  72  4.11  68  20.05  
NH  124  6.50  174  13.97  126  4.03  137  2.92  117  15.31  
CH  122  5.84  176  18.93  114  9.88  110  4.90  125  20.27  
O  62  13.62  85  19.95  42  6.79  52  7.91  67  20.23  
N  62  15.00  86  21.15  39  8.37  49  5.80  78  22.10  
CO  124  20.70  155  20.67  89  6.03  106  4.55  128  21.31  
HCl  117  2.16  141  10.29  98  3.52  104  2.04  123  10.36  
NaH  121  8.78  181  12.40  149  3.44  145  3.65  135  12.90  
HS  122  8.81  180  12.45  147  3.80  145  3.66  147  11.60 
In this section, we present numerical results of the grouping method discussed in Sec. III, alongside the CZconstruction of Sec. II.1 to construct the rotation circuits for given commuting groups. In particular, we have applied our methods to the Hamiltonians of simple molecules so as to demonstrate their use in the context of VQE. The full results are given in Table 4, with a subset shown in Table 2. In all cases, we used OpenFermion McClean et al. (2017) to obtain Hamiltonians in the STO3G basis, at approximately the equilibrium geometry of the molecules, with the symmetry conserving BravyiKitaev transformation Bravyi and Kitaev (2002); Bravyi et al. (2017). In order to reduce the number of twoqubit gates required, we considered qubits on which all operators in a group locally commute separately – a onequbit rotation per locally commuting qubit is all that is required to do so.
In Fig. 3(a), we plot the average group size against the number of qubits, , for the molecular Hamiltonians. We can see that the average group size increases with increasing , and the increase does not appear to be slowing down. We therefore conclude that our sorting method works well on systems of at least size . However, the key advantage of assembling a Hamiltonian into groups of mutually commuting operators is a reduction in the number of measurements required to obtain an energy expectation to a certain level of accuracy, and group size alone does not directly quantify this reduction. For a given Hamiltonian and quantum state, the reduction is instead given by , as in Eq. (19).
We therefore calculated the value of for 100 different quantum states, generated using 100 random sets of ansatz parameters with a hardware efficient ansatz of depth 1, for the nine smallest molecular systems. We show the mean, minimum and maximum values for each molecule. In practice, the value of can at best be obtained approximately by making measurements on the quantum computer and so cannot be used to determine the expected advantage of a particular grouping a priori. The metric , given by Eq. (29), on the other hand, depends only on the coefficients of the terms in the Hamiltonian. From Table 2, we can see that closely approximates the average of over many ansatz parameters, but can be calculated analytically without the need for simulations. In Fig. 3(b), we show as a function of the number of qubits for our full selection of molecules. We can see that it is highly molecule dependent, with systems of similar size having very different values.
The reduction in number of measurements required comes at the cost of applying additional quantum gates before the qubits are measured, the most costly of which are twoqubit gates. For the CZconstruction, we demonstrated in Sec. II.1 that the maximum number of additional twoqubit gates required for a group with independent terms is . We would like to know, in practice, how many additional twoqubit gates are required at a maximum, as this is the quantum resource that is most limiting. Assuming for a given Hamiltonian that at least one group has rank , obtaining a measurement of all terms in a Hamiltonian on qubits may therefore require applying an additional gates in a single circuit. However, for the molecules we have considered, we find that the largest number of twoqubit gates required is in fact far lower than this, typically by a factor of approximately 3.5, as can be seen in Fig. 3(c).
Given the close relationship between the average value of and the value of , we propose using as a metric for the quality of a grouping method and compare different methods of grouping the operators with this metric in mind. The results are shown in Table 3, along with the number of groups of operators, , that each method produces. Out of the methods, “Independent Set” was best at approximating the minimum clique cover – it found the cover with the fewest cliques in all but one case. However, it appears that the minimum clique cover does not necessarily result in the fewest measurements, with “Independent Set” only performing best once with respect to . Overall, our SORTED INSERTION was best at maximising , performing best or joint best in all but two cases.
V Conclusion
We have addressed two problems related to the efficient measurement of Pauli operators on a quantum computer. The first is how to synthesise rotation circuits that enable mutually commuting Paulis to be measured simultaneously, and the second is how to assemble a set of Paulis into groups in which they mutually commute.
We have contributed two rotation circuit constructions CZ and CNOT. The CZconstruction results in a maximum of twoqubit gates while the CNOTconstruction results in a maximum of . On grouping Pauli operators, we contribute two natural metrics, and , that justifiably measure the effectiveness of a grouping. We also contribute a grouping strategy motivated by that we call SORTED INSERTION.
We have applied our theoretical work to the task of estimating energies of molecules in the context of VQE. We find that, for the CZconstruction, the largest number of twoqubit gates required is typically less than the theoretical worstcase by a factor of approximately 3.5. Comparison to other grouping methods shows that while SORTED INSERTION does not normally result in the smallest number of groups, it nearly always results in the best value of .
Vi Acknowledgement
We thank Oscar Higgott for informing us of the ancillabased simultaneous measurement method and Ref. (Dennis et al., 2002, Fig. 14).
References
 Improved simulation of stabilizer circuits. Phys. Rev. A 70, pp. 052328. External Links: Document, Link Cited by: §I, 1st item, 1st item, §II.
 Optimal control of two qubits via a single cavity drive in circuit quantum electrodynamics. Phys. Rev. A 95, pp. 042325. External Links: Document, Link Cited by: §I, §II.
 On economical construction of the transitive closure of an oriented graph. Soviet Mathematics Doklady, pp. 1209–10. Cited by: Appendix C.
 Highfidelity quantum logic gates using trappedion hyperfine qubits. Phys. Rev. Lett. 117, pp. 060504. External Links: Document, Link Cited by: §I, §II.
 Superconducting quantum circuits at the surface code threshold for fault tolerance. Nature 508, pp. 500 EP –. External Links: Link Cited by: §I, §II.
 Quantuminformation processing with circuit quantum electrodynamics. Phys. Rev. A 75, pp. 032329. External Links: Document, Link Cited by: §I, §II.
 Poweroptimal, stabilized entangling gate between trappedion qubits. arXiv eprints. External Links: 1905.09292 Cited by: §I, §II.
 Recognizing locally equivalent graphs. Discrete Mathematics 114 (1), pp. 75 – 86. External Links: ISSN 0012365X, Document, Link Cited by: §II.1.
 Fermionic Quantum Computation. Annals of Physics 298 (1), pp. 210–226. External Links: Document, quantph/0003137 Cited by: Table 4, §IV.
 Tapering off qubits to simulate fermionic Hamiltonians. arXiv eprints. External Links: 1701.08213 Cited by: Table 4, §I, §IV.
 Quantum error correction and orthogonal geometry. Phys. Rev. Lett. 78, pp. 405–408. External Links: Document, Link Cited by: Appendix A, §II.
 Computation of molecular spectra on a quantum processor with an errorresilient algorithm. Phys. Rev. X 8, pp. 011021. External Links: Document, Link Cited by: §I.
 A course in functional analysis. Graduate Texts in Mathematics, Springer New York. External Links: ISBN 9780387972459, LCCN 97122669, Link Cited by: §III.
 Topological quantum memory. Journal of Mathematical Physics 43 (9), pp. 4452–4505. External Links: Document, quantph/0110143 Cited by: §VI.
 Minimizing State Preparations in Variational Quantum Eigensolver by Partitioning into Commuting Families. arXiv eprints. External Links: 1907.13623 Cited by: Appendix D, Appendix E, §I, §I, §I, §I, §I, §III, footnote 1.
 Stabilizer codes and quantum error correction. Ph.D. Thesis, California Institute of Technology. Cited by: §II.
 Exploring network structure, dynamics, and function using NetworkX. In Proceedings of the 7th Python in Science Conference, G. Varoquaux, T. Vaught, and J. Millman (Eds.), pp. 11 – 15. Cited by: Table 3.
 A twoqubit gate between phosphorus donor electrons in silicon. Nature 571 (7765), pp. 371–375. External Links: Document, ISBN 14764687, Link Cited by: §I, §II.
 A twoqubit gate between phosphorus donor electrons in silicon. Nature 571 (7765), pp. 371–375. External Links: Document, ISBN 14764687, Link Cited by: §I, §II.
 Multiparty entanglement in graph states. Phys. Rev. A 69, pp. 062311. External Links: Document, Link Cited by: §II.1.
 Subspace Variational Quantum Simulator. arXiv eprints. External Links: 1904.08566 Cited by: §I.
 Variational Quantum Computation of Excited States. Quantum 3, pp. 156. External Links: Document, Link, ISSN 2521327X Cited by: §I.
 Fidelity benchmarks for twoqubit gates in silicon. Nature 569 (7757), pp. 532–536. External Links: Document, ISBN 14764687, Link Cited by: §I, §II.
 Efficient and Noise Resilient Measurements for Quantum Chemistry on NearTerm Quantum Computers. arXiv eprints. External Links: 1907.13117 Cited by: §I, §I.
 Unitary partitioning approach to the measurement problem in the Variational Quantum Eigensolver method. arXiv eprints. External Links: 1907.09040 Cited by: footnote 1.
 Revising the measurement process in the variational quantum eigensolver: is it possible to reduce the number of separately measured operators?. Chem. Sci. 10, pp. 3746–3755. External Links: Document, Link Cited by: footnote 1.
 Pauli Partitioning with Respect to Gate Sets. arXiv eprints. External Links: 1907.07859 Cited by: §I, §I, §I.
 Optimal SpaceDepth TradeOff of CNOT Circuits in Quantum Logic Synthesis. arXiv eprints. External Links: 1907.05087 Cited by: §II.2.
 Variational quantum algorithms for discovering Hamiltonian spectra. Phys. Rev. A 99, pp. 062304. External Links: Document, Link Cited by: §I.
 Über das paulische Äquivalenzverbot. Zeitschrift für Physik 47 (9), pp. 631–651. External Links: ISSN 00443328, Document, Link Cited by: §I.
 Hardwareefficient variational quantum eigensolver for small molecules and quantum magnets. Nature 549, pp. 242 EP –. External Links: Link Cited by: §I.
 Hardwareefficient variational quantum eigensolver for small molecules and quantum magnets. Nature 549, pp. 242 EP –. External Links: Link Cited by: footnote 1.
 Classical coloring of graphs. Cited by: Table 1.
 Highfidelity operation of quantum photonic circuits. Applied Physics Letters 97 (21), pp. 211109. External Links: Document, Link, https://doi.org/10.1063/1.3497087 Cited by: §I, §II.
 Highfidelity control and entanglement of Rydbergatom qubits. Phys. Rev. Lett. 121, pp. 123603. External Links: Document, Link Cited by: §I, §II.
 Experimental comparison of two quantum computing architectures. Proceedings of the National Academy of Sciences 114 (13), pp. 3305–3310. External Links: Document, ISSN 00278424, Link, https://www.pnas.org/content/114/13/3305.full.pdf Cited by: §I, §II.
 Quantum computational chemistry. arXiv eprints. External Links: 1808.10402 Cited by: §I.
 Hybrid quantumclassical hierarchy for mitigation of decoherence and determination of excited states. Physical Review A 95 (4), pp. 042308. Cited by: §I.
 The theory of variational hybrid quantumclassical algorithms. New Journal of Physics 18 (2), pp. 023023. Cited by: Appendix D, §I, §I, §I, §III.
 OpenFermion: The Electronic Structure Package for Quantum Computers. arXiv eprints. External Links: 1710.07629 Cited by: Table 4, §IV.
 Parallel Quantum Computation and Quantum Codes. arXiv eprints. External Links: quantph/9808027 Cited by: §II.2.
 Quantum computation and quantum information: 10th anniversary edition. 10th edition, Cambridge University Press, New York, NY, USA. External Links: ISBN 1107002176, 9781107002173 Cited by: §II.2, §II.
 Scalable quantum simulation of molecular energies. Phys. Rev. X 6, pp. 031007. External Links: Document, Link Cited by: §I.
 Psi4 1.1: an opensource electronic structure program emphasizing automation, advanced libraries, and interoperability. Journal of Chemical Theory and Computation 13 (7), pp. 3185–3197. Note: doi: 10.1021/acs.jctc.7b00174 External Links: Document, ISBN 15499618, Link Cited by: Table 4.
 Optimal synthesis of linear reversible circuits. Quantum Info. Comput. 8 (3), pp. 282–294. External Links: ISSN 15337146, Link Cited by: Appendix C, Appendix C, §I, 3rd item, §II.2.
 A variational eigenvalue solver on a photonic quantum processor. Nature communications 5, pp. 4213. Cited by: §I, §I.
 Quantum computing in the NISQ era and beyond. Quantum 2, pp. 79. External Links: Document, Link, ISSN 2521327X Cited by: §I.
 Demonstration of universal parametric entangling gates on a multiqubit lattice. Science Advances 4 (2). External Links: Document, Link, https://advances.sciencemag.org/content/4/2/eaao3603.full.pdf Cited by: §I, §II.
 Strategies for quantum computing molecular energies using the unitary coupled cluster ansatz. Quantum Science and Technology 4 (1), pp. 014008. Cited by: §I, §I, §III.
 Application of fermionic marginal constraints to hybrid quantum algorithms. New Journal of Physics 20 (5), pp. 053020. External Links: Document, Link Cited by: §I, §III.
 Constrained variational quantum eigensolver: quantum computer search engine in the fock space. Journal of Chemical Theory and Computation 15 (1), pp. 249–255. Note: doi: 10.1021/acs.jctc.8b00943 External Links: Document, ISBN 15499618, Link Cited by: §I.
 Witnessing eigenstates for quantum simulation of hamiltonian spectra. Science Advances 4 (1). External Links: Document, Link Cited by: §I.
 Fast quantum logic gates with trappedion qubits. Nature 555, pp. 75 EP –. External Links: Link Cited by: §I, §II.
 Graphical description of the action of local Clifford transformations on graph states. Phys. Rev. A 69, pp. 022316. External Links: Document, Link Cited by: §I, §II.1, §II.1.
 Mapping local Hamiltonians of fermions to local Hamiltonians of spins. Journal of Statistical Mechanics: Theory and Experiment 2005 (09). Cited by: §I.
 Measurement Optimization in the Variational Quantum Eigensolver Using a Minimum Clique Cover. arXiv eprints. External Links: 1907.03358 Cited by: §I, §I, §III.
 Accelerated variational quantum eigensolver. Phys. Rev. Lett. 122, pp. 140504. External Links: Document, Link Cited by: §I.
 The theory of quantum information. Cambridge University Press. External Links: Document Cited by: Appendix E, §III.
 Resilient entangling gates for trapped ions. Phys. Rev. Lett. 121, pp. 180501. External Links: Document, Link Cited by: §I, §II.
 Progress towards practical quantum variational algorithms. Phys. Rev. A 92, pp. 042303. External Links: Document, Link Cited by: §I, §III.
 Quantum information processing with superconducting circuits: a review. Reports on Progress in Physics 80 (10), pp. 106001. External Links: Document, Link Cited by: §I, §II.
 Measuring all compatible operators in one series of a singlequbit measurements using unitary transformations. arXiv eprints. External Links: 1907.09386 Cited by: §I, §I.
Appendix A Binary representation
The Pauli group on qubits is a group of elements defined by
(30) 
The binary representation, first introduced by Calderbank, Rains, Shor, and Sloane Calderbank et al. (1997), is a representation of as binary vectors. In this representation, Paulis differing only in phase are represented in the same way.
Singlequbit Paulis are represented by 2dimensional binary vectors, so that
(31)  
An qubit Pauli
(32) 
is then represented by the dimensional binary vector
(33) 
In this representation, two qubit Paulis with binary vectors and commute if and only if
(34) 
where denotes the matrix
(35) 
Given a set of qubit Paulis, we can write down a corresponding binary matrix where each column represents a Pauli. Then, from Eq. (34), we deduce that all Paulis in mutually commute if and only if
(36) 
which recovers Eq. (7) in the main text. We say that the set of Paulis is independent if the matrix has rank .
We shall often find it helpful to write in terms of its upper half and lower half , separated by a horizontal line for visualaid, i.e.,
(37) 
The conjugation action of quantum gates on can be represented as transformations to the matrix . For example, we document the transformations on that represent four common quantum gates. In the following, addition is and ranges over all columns .

CZ on qubits and :
,
. 
CNOT on controlqubit and targetqubit :
,
. 
Hadamard (H) on qubit :
. 
Phase (P) on qubit :
.
These rules can be directly verified by conjugating by the listed gates. They are also reproduced in Sec. II of the main text.
Appendix B Czconstruction example
We walk through our CZconstruction for a specific example. In this example, we would like to obtain measurements simultaneously of a set of six fourqubit Paulis given by
(38)  
We can represent these Paulis in a matrix with
(39) 
By Gaussian elimination, we find the reduced row echelon form of to be
(40) 
The pivot columns are numbers 1, 2 and 4 which tells us that , and are the three independent Paulis from which the remaining Paulis in can be constructed. Therefore, we can write , where