Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update classical_shadows.ipynb and ime_time_evo.ipynb #180

Merged
merged 6 commits into from
Aug 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/source/tutorial.rst
Original file line number Diff line number Diff line change
Expand Up @@ -25,5 +25,6 @@ Jupyter Tutorials
tutorials/barren_plateaus.ipynb
tutorials/qaoa_portfolio_optimization.ipynb
tutorials/imag_time_evo.ipynb
tutorials/classical_shadows.ipynb
tutorials/sklearn_svc.ipynb
tutorials/qcloud_sdk_demo.ipynb
79 changes: 38 additions & 41 deletions docs/source/tutorials/classical_shadows.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -32,27 +32,19 @@
"source": [
"Let's first briefly review the classical shadows in Pauli basis. For an $n$-qubit quantum state $\\rho$, we randomly perform Pauli projection measurement on each qubit and obtain a snapshot like $\\{1,-1,-1,1,\\cdots,1,-1\\}$. This process is equivalent to apply a random unitary $U$ to $\\rho$ and measure in computational basis to obtain $|b\\rangle=|s_1\\cdots s_n\\rangle,\\ s_j\\in\\{0,1\\}$:\n",
"$$\n",
"\\begin{equation}\n",
" \\rho\\rightarrow U\\rho U^{\\dagger}\\xrightarrow{measure}|b\\rangle\\langle b|,\n",
"\\end{equation}\n",
"\\rho\\rightarrow U\\rho U^{\\dagger}\\xrightarrow{measure}|b\\rangle\\langle b|,\n",
"$$\n",
"where $U=\\bigotimes_{j=1}^{n}u_j$, $u_i\\in\\{H, HS^{\\dagger}, \\mathbb{I}\\}$ correspond to the projection measurements of Pauli $X$, $Y$, $Z$ respectively. Then we reverse the operation to get the equivalent measurement result on $\\rho$:\n",
"$$\n",
"\\begin{equation}\n",
" \\rho\\xrightarrow{measure}U^{\\dagger}|b\\rangle\\langle b| U.\n",
"\\end{equation}\n",
"\\rho\\xrightarrow{measure}U^{\\dagger}|b\\rangle\\langle b| U.\n",
"$$\n",
"Moreover, we perform $N$ random measurements and view their average as a quantum channel:\n",
"$$\n",
"\\begin{equation}\n",
" \\mathbb{E}\\left[U^{\\dagger}|b\\rangle\\langle b|U\\right]=\\mathcal{M}(\\rho),\n",
"\\end{equation}\n",
"\\mathbb{E}\\left[U^{\\dagger}|b\\rangle\\langle b|U\\right]=\\mathcal{M}(\\rho),\n",
"$$\n",
"we can invert the channel to get the approximation of $\\rho$:\n",
"$$\n",
"\\begin{equation}\n",
" \\rho=\\mathbb{E}\\left[\\mathcal{M}^{-1}(U^{\\dagger}|b\\rangle\\langle b|U)\\right].\n",
"\\end{equation}\n",
"\\rho=\\mathbb{E}\\left[\\mathcal{M}^{-1}(U^{\\dagger}|b\\rangle\\langle b|U)\\right].\n",
"$$\n",
"We call each $\\rho_i=\\mathcal{M}^{-1}(U_i^{\\dagger}|b_i\\rangle\\langle b_i|U_i)$ a shadow snapshot state and their ensemble $S(\\rho;N)=\\{\\rho_i|i=1,\\cdots,N\\}$ classical shadows."
],
Expand All @@ -65,41 +57,39 @@
"source": [
"In Pauli basis, we have a simple expression of $\\mathcal{M}^{-1}$:\n",
"$$\n",
"\\begin{eqnarray}\n",
" \\rho_i&=&\\mathcal{M}^{-1}(U_i^{\\dagger}|b_i\\rangle\\langle b_i|U_i)=\\bigotimes_{j=1}^{n}3u_{ij}^{\\dagger}|s_{ij}\\rangle\\langle s_{ij}|u_{ij}-\\mathbb{I},\\\\\n",
" \\rho&=&\\frac{1}{N}\\sum_{i=1}^{N}\\rho_i\\ .\n",
"\\end{eqnarray}\n",
"\\begin{split}\n",
" \\rho_i&=\\mathcal{M}^{-1}(U_i^{\\dagger}|b_i\\rangle\\langle b_i|U_i)=\\bigotimes_{j=1}^{n}3u_{ij}^{\\dagger}|s_{ij}\\rangle\\langle s_{ij}|u_{ij}-\\mathbb{I},\\\\\n",
" \\rho&=\\frac{1}{N}\\sum_{i=1}^{N}\\rho_i\\ .\n",
"\\end{split}\n",
"$$\n",
"For an observable Pauli string $O=\\bigotimes_{j=1}^{n}P_j,\\ P_j\\in\\{\\mathbb{I}, X, Y, Z\\}$, we can directly use $\\rho$ to calculate $\\langle O\\rangle=\\text{Tr}(O\\rho)$. In practice, we will divide the classical shadows into $K$ parts to calculate the expectation values independently and take the median to avoid the influence of outliers:\n",
"$$\n",
"\\begin{equation}\n",
" \\langle O\\rangle=\\text{median}\\{\\langle O_{(1)}\\rangle\\cdots\\langle O_{(K)}\\rangle\\},\n",
"\\end{equation}\n",
"\\langle O\\rangle=\\text{median}\\{\\langle O_{(1)}\\rangle\\cdots\\langle O_{(K)}\\rangle\\},\n",
"$$\n",
"where\n",
"$$\n",
"\\begin{equation}\n",
" \\begin{split}\n",
" \\langle O_{(k)}\\rangle&=\\frac{1}{\\lceil N/K\\rceil}\\sum_{i=(k-1)\\lceil N/K\\rceil+1}^{k\\lceil N/K\\rceil}\\text{Tr}\\left[\\bigotimes_{j=1}^{n}P_j(3u_{ij}^{\\dagger}|s_{ij}\\rangle\\langle s_{ij}|u_{ij}-\\mathbb{I})\\right]\\\\\n",
" &=\\frac{1}{\\lceil N/K\\rceil}\\sum_{i=(k-1)\\lceil N/K\\rceil+1}^{k\\lceil N/K\\rceil}\\prod_{j=1}^n\\text{Tr}\\left[3P_j u_{ij}^{\\dagger}|s_{ij}\\rangle\\langle s_{ij}|u_{ij}\\right].\n",
" \\end{split}\n",
"\\end{equation}\n",
"\\begin{split}\n",
" \\langle O_{(k)}\\rangle&=\\frac{1}{\\lceil N/K\\rceil}\\sum_{i=(k-1)\\lceil N/K\\rceil+1}^{k\\lceil N/K\\rceil}\\text{Tr}\\left[\\bigotimes_{j=1}^{n}P_j(3u_{ij}^{\\dagger}|s_{ij}\\rangle\\langle s_{ij}|u_{ij}-\\mathbb{I})\\right]\\\\\n",
" &=\\frac{1}{\\lceil N/K\\rceil}\\sum_{i=(k-1)\\lceil N/K\\rceil+1}^{k\\lceil N/K\\rceil}\\prod_{j=1}^n\\text{Tr}\\left[3P_j u_{ij}^{\\dagger}|s_{ij}\\rangle\\langle s_{ij}|u_{ij}\\right].\n",
"\\end{split}\n",
"$$"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## Setup"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": true,
"ExecuteTime": {
"end_time": "2023-08-09T04:51:21.649753Z",
"start_time": "2023-08-09T04:51:21.598273400Z"
}
},
"outputs": [
{
"data": {
Expand All @@ -120,7 +110,14 @@
"\n",
"tc.set_backend(\"jax\")\n",
"tc.set_dtype(\"complex128\")"
]
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-09T04:51:21.649753Z",
"start_time": "2023-08-09T04:51:21.598273400Z"
}
}
},
{
"cell_type": "markdown",
Expand All @@ -136,10 +133,10 @@
"source": [
"We first set the number of qubits $n$ and the number of repeated measurements $r$ on each Pauli string. Then from the target observable Pauli strings $\\{O_i|i=1,\\cdots,M\\}$ (0, 1, 2, and 3 correspond to $\\mathbb{I}$, $X$, $Y$, and $Z$, respectively), the error $\\epsilon$ and the rate of failure $\\delta$, we can use ``shadow_bound`` function to get the total number of snapshots $N$ and the number of equal parts $K$ to split the shadow snapshot states to compute the median of means:\n",
"$$\n",
"\\begin{eqnarray}\n",
" K&=&2\\log(2M/\\delta),\\\\\n",
" N&=&K\\frac{34}{\\epsilon^2}\\max_{1\\le i\\le M}\\left\\|O_i-\\frac{\\text{Tr}(O_i)}{2^n}\\mathbb{I}\\right\\|^2_{\\text{shadow}}=K\\frac{34}{\\epsilon^2}3^{\\max_{1\\le i\\le M}k_i},\n",
"\\end{eqnarray}\n",
"\\begin{split}\n",
" K&=2\\log(2M/\\delta),\\\\\n",
" N&=K\\frac{34}{\\epsilon^2}\\max_{1\\le i\\le M}\\left\\|O_i-\\frac{\\text{Tr}(O_i)}{2^n}\\mathbb{I}\\right\\|^2_{\\text{shadow}}=K\\frac{34}{\\epsilon^2}3^{\\max_{1\\le i\\le M}k_i},\n",
"\\end{split}\n",
"$$\n",
"where $k_i$ is the number of nontrivial Pauli matrices in $O_i$. Please refer to the Theorem S1 and Lemma S3 in [Huang, Kueng and Preskill (2020)](https://www.nature.com/articles/s41567-020-0932-7) for the details of proof. It should be noted that ``shadow_bound`` has a certain degree of overestimation of $N$, and so many measurements are not really needed in practice. Moreover, ``shadow_bound`` is not jitable and no need to jit."
],
Expand Down Expand Up @@ -589,10 +586,10 @@
"source": [
"On the other hand, for the second order Renyi entropy, we have another method to calculate it in polynomial time by random measurements:\n",
"$$\n",
"\\begin{eqnarray}\n",
" S_2&=&-\\log\\left(\\text{Tr}(\\rho_A^2)\\right),\\\\\n",
" \\text{Tr}(\\rho_A^2)&=&2^k\\sum_{b,b'\\in\\{0,1\\}^k}(-2)^{-H(b,b')}\\overline{P(b)P(b')},\n",
"\\end{eqnarray}\n",
"\\begin{split}\n",
" R_A^2&=-\\log\\left(\\text{Tr}(\\rho_A^2)\\right),\\\\\n",
" \\text{Tr}(\\rho_A^2)&=2^k\\sum_{b,b'\\in\\{0,1\\}^k}(-2)^{-H(b,b')}\\overline{P(b)P(b')},\n",
"\\end{split}\n",
"$$\n",
"where $A$ is the $k$-d reduced system, $H(b,b')$ is the Hamming distance between $b$ and $b'$, $P(b)$ is the probability for measuring $\\rho_A$ and obtaining the outcomes $b$ thus we need a larger $r$ to obtain a good enough priori probability, and the overline means the average on all random selected Pauli strings. Please refer to [Brydges, et al. (2019)](https://www.science.org/doi/full/10.1126/science.aau4963) for more details. We can use ``renyi_entropy_2`` to implement this method, but it is not jitable because we need to build the dictionary based on the bit strings obtained by measurements, which is a dynamical process. Compared with ``entropy_shadow``, it cannot filter out non-negative eigenvalues, so the accuracy is slightly worse."
],
Expand Down
49 changes: 24 additions & 25 deletions docs/source/tutorials/imag_time_evo.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -25,34 +25,26 @@
"\n",
"However, exact imaginary-time evolution takes exponentially more space and time to execute on a classical computer. On the other hand, it is non-unitary and also cannot be implemented on a quantum computer. Therefore, we consider the approximate imaginary-time evolution, also known as the quantum natural gradient (QNG), that is, at each step we adjust the parameters in the quantum circuit to realize the imaginary-time evolution in a very short time $\\tau$. Let the parameters in the circuit be $\\boldsymbol{\\theta}$ and the output of the circuit is $|\\psi\\rangle$, define\n",
"$$\n",
"\\begin{eqnarray}\n",
" |\\psi_\\tau\\rangle&=&e^{-\\tau\\hat{H}}|\\psi\\rangle\\approx|\\psi\\rangle-\\tau\\hat{H}|\\psi\\rangle,\\\\\n",
" |\\psi'\\rangle&=&|\\psi(\\boldsymbol{\\theta}-\\tau\\boldsymbol{\\delta})\\rangle\\approx|\\psi\\rangle-\\tau\\sum_j\\boldsymbol{\\delta}_j|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle,\n",
"\\end{eqnarray}\n",
"\\begin{split}\n",
" |\\psi_\\tau\\rangle&=e^{-\\tau\\hat{H}}|\\psi\\rangle\\approx|\\psi\\rangle-\\tau\\hat{H}|\\psi\\rangle,\\\\\n",
" |\\psi'\\rangle&=|\\psi(\\boldsymbol{\\theta}-\\tau\\boldsymbol{\\delta})\\rangle\\approx|\\psi\\rangle-\\tau\\sum_j\\boldsymbol{\\delta}_j|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle,\n",
"\\end{split}\n",
"$$\n",
"and the overlap is\n",
"$$\n",
"\\begin{equation}\n",
" O=\\sqrt{\\frac{\\langle\\psi_\\tau\\rangle|\\psi'\\rangle\\langle\\psi'|\\psi_\\tau\\rangle}{\\langle\\psi_\\tau\\rangle|\\psi_\\tau\\rangle\\langle\\psi'|\\psi'\\rangle}}.\n",
"\\end{equation}\n",
"O=\\sqrt{\\frac{\\langle\\psi_\\tau\\rangle|\\psi'\\rangle\\langle\\psi'|\\psi_\\tau\\rangle}{\\langle\\psi_\\tau\\rangle|\\psi_\\tau\\rangle\\langle\\psi'|\\psi'\\rangle}}.\n",
"$$\n",
"Let $\\partial|O|^2/\\partial\\boldsymbol{\\delta}=0$, we can get $\\boldsymbol{A\\delta=C}$, then the update method is as follows\n",
"$$\n",
"\\begin{equation}\n",
" \\boldsymbol{\\theta}^{n+1}=\\boldsymbol{\\theta}^{n}-\\tau\\boldsymbol{\\delta}=\\boldsymbol{\\theta}^{n}-\\tau\\boldsymbol{A}^{-1}\\boldsymbol{C},\n",
"\\end{equation}\n",
"\\boldsymbol{\\theta}^{n+1}=\\boldsymbol{\\theta}^{n}-\\tau\\boldsymbol{\\delta}=\\boldsymbol{\\theta}^{n}-\\tau\\boldsymbol{A}^{-1}\\boldsymbol{C},\n",
"$$\n",
"where $\\boldsymbol{A}$ is the quantum Fisher information matrix and the matrix element\n",
"$$\n",
"\\begin{equation}\n",
" \\boldsymbol{A}_{ij}=\\Re\\left[\\frac{\\langle\\partial_{\\boldsymbol{\\theta}_{i}}\\psi|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}-\\frac{\\langle\\partial_{\\boldsymbol{\\theta}_{i}}\\psi|\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\frac{\\langle\\psi|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\right];\n",
"\\end{equation}\n",
"\\boldsymbol{A}_{ij}=\\Re\\left[\\frac{\\langle\\partial_{\\boldsymbol{\\theta}_{i}}\\psi|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}-\\frac{\\langle\\partial_{\\boldsymbol{\\theta}_{i}}\\psi|\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\frac{\\langle\\psi|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\right];\n",
"$$\n",
"$\\boldsymbol{C}$ is the gradient vector of energy versus parameters and the vector element\n",
"$$\n",
"\\begin{equation}\n",
" \\boldsymbol{C}_j=\\Re\\left[\\frac{\\langle\\psi|\\hat{H}|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}-\\frac{\\langle\\psi|\\hat{H}|\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\frac{\\langle\\psi|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\right].\n",
"\\end{equation}\n",
"\\boldsymbol{C}_j=\\Re\\left[\\frac{\\langle\\psi|\\hat{H}|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}-\\frac{\\langle\\psi|\\hat{H}|\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\frac{\\langle\\psi|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle}{\\langle\\psi|\\psi\\rangle}\\right].\n",
"$$\n",
"Since $|\\psi\\rangle$ is represented by quantum circuit and naturally normalized, the second term of $\\boldsymbol{C}$ vanishes. And because the global phase is not important, we can add a $U(1)$ gauge to make the second term of $\\boldsymbol{A}$ vanishes. Related theoretical work can refer to [Yuan, Endo, Zhao, Li and Benjamin](https://doi.org/10.22331/q-2019-10-07-191) and [Stokes, Izaac, Killoran and Carleo](https://doi.org/10.22331/q-2020-05-25-269), which will also show how to measure $\\boldsymbol{A}$ and $\\boldsymbol{C}$ in circuit."
]
Expand All @@ -75,6 +67,15 @@
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## Setup"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
Expand Down Expand Up @@ -127,12 +128,10 @@
"\n",
"Let the set of clauses in the NAE3SAT be $\\mathcal{C}$. In each clause, there are three literals and each literal is represented by a spin. Spins up ($s=1$, $\\text{bit}=0$) and down ($s=-1$, $\\text{bit}=1$) represent false and true respectively. For the clause $(s_i,\\ s_j,\\ s_k)\\in\\mathcal{C}$, $s_i,\\ s_j,\\ s_k$ cannot be 1 or -1 at the same time. The Hamiltonian of the NAE3SAT is as follows\n",
"$$\n",
"\\begin{equation}\n",
" \\begin{split}\n",
" \\hat{H}_C&=\\sum_{(i,j,k)\\in\\mathcal{C}}\\left[(s_i+s_j+s_k)^2-1\\right]/2\\\\\n",
" &=\\sum_{(i,j,k)\\in\\mathcal{C}}(s_i s_j+s_j s_k+s_k s_i)+|\\mathcal{C}|,\n",
" \\end{split}\n",
"\\end{equation}\n",
"\\begin{split}\n",
" \\hat{H}_C&=\\sum_{(i,j,k)\\in\\mathcal{C}}\\left[(s_i+s_j+s_k)^2-1\\right]/2\\\\\n",
" &=\\sum_{(i,j,k)\\in\\mathcal{C}}(s_i s_j+s_j s_k+s_k s_i)+|\\mathcal{C}|,\n",
"\\end{split}\n",
"$$\n",
"where $|\\mathcal{C}|$ is the number of clauses in $\\mathcal{C}$. When all clauses are true, $\\hat{H}_C$ takes the minimum value 0, and the corresponding bit string is the solution of the NAE3SAT."
],
Expand Down Expand Up @@ -414,7 +413,7 @@
{
"cell_type": "markdown",
"source": [
"We use two methods to calculate $\\boldsymbol{\\delta}$, one is to calculate directly according to the expressions of $\\boldsymbol{A}$ and $\\boldsymbol{C}$, and the other is to call the existing API to calculate $\\boldsymbol{A}$ and $\\boldsymbol{C}$. The former only needs to calculate the $|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle$ once, while the latter needs to calculate that twice, but the code of the latter is more concise. In each method, we set the parameter fixed_global_phase to decide whether to fix the global phase, that is, whether the second term of $\\boldsymbol{A}$ vanishes.\n",
"We use two methods to calculate $\\boldsymbol{\\delta}$, one is to calculate directly according to the expressions of $\\boldsymbol{A}$ and $\\boldsymbol{C}$, and the other is to call the existing API to calculate $\\boldsymbol{A}$ and $\\boldsymbol{C}$. The former only needs to calculate the $|\\partial_{\\boldsymbol{\\theta}_{j}}\\psi\\rangle$ once, while the latter needs to calculate that twice, but the code of the latter is more concise. In each method, we set the parameter ``fixed_global_phase`` to decide whether to fix the global phase, that is, whether the second term of $\\boldsymbol{A}$ vanishes.\n",
"\n",
"Then we choose the existing optimizer, SGD, to implement the update step. Since compared with naive gradient descent, the approximate imaginary-time evolution has been corrected on the update step size, the adaptive optimizer improved for the naive gradient descent such as Adam is not suitable for the approximate imaginary-time evolution. When update by the adaptive optimizer, the loss function fluctuates greatly. On the other hand, the update method of SGD without momentum is naive update, which is convenient for comparison with the exact imaginary-time evolution."
],
Expand Down Expand Up @@ -576,7 +575,7 @@
{
"cell_type": "markdown",
"source": [
"We first show the overlap between the final states obtained by different methods. The final states obtained by different methods but with the same parameter of fixed_global_phase are almost the same, which are also close to the exact final state. And the final states obtained by the same method but with the different parameter of fixed_global_phase has a global phase difference."
"We first show the overlap between the final states obtained by different methods. The final states obtained by different methods but with the same parameter of ``fixed_global_phase`` are almost the same, which are also close to the exact final state. And the final states obtained by the same method but with the different parameter of ``fixed_global_phase`` has a global phase difference."
],
"metadata": {
"collapsed": false
Expand Down Expand Up @@ -915,7 +914,7 @@
{
"cell_type": "markdown",
"source": [
"We also use two methods to calculate $\\boldsymbol{\\delta}$, but make some changes in the method of directly calling the API and the update method. When calculating $\\boldsymbol{A}$, we call qng2 instead of qng, and when calculating $\\boldsymbol{C}$, we call dynamics_rhs instead of calculating the energy gradient by value_and_grad. For the update method, we do not call the existing optimizer but directly adopt the naive update method."
"We also use two methods to calculate $\\boldsymbol{\\delta}$, but make some changes in the method of directly calling the API and the update method. When calculating $\\boldsymbol{A}$, we call ``qng2`` instead of ``qng``, and when calculating $\\boldsymbol{C}$, we call ``dynamics_rhs`` instead of calculating the energy gradient by ``value_and_grad``. For the update method, we do not call the existing optimizer but directly adopt the naive update method."
],
"metadata": {
"collapsed": false
Expand Down
Loading
Loading