Skip to content

GabrielCostaBatista/efficient-qpt

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

93 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Efficient Quantum Process Tomography (QPT) with Pauli and SIC Strings

License Python

This repository contains the source code for investigating Quantum Process Tomography (QPT) using Pauli Strings (and SIC Strings). The goal of this project is to develop and analyze efficient strategies for reconstructing quantum processes (unitary maps) by optimizing the selection of preparation and measurement bases (either Pauli Strings or SIC Strings).


⚠️ Attribution & Credits

  • Core Research & Simulations: Developed by Gabriel Batista.
  • Pauli Rank Analysis: The notebooks in the HeuristicStrings and PauliRank/ directories were developed by Carlos Couto and are included here for completeness in finding heuristic strings and analyzing the rank properties of the Pauli bases. These notebooks were used to find heuristic string sets.

πŸ“‹ Table of Contents


πŸ”­ Problem Statement

Quantum Process Tomography (QPT) is the procedure for characterizing the dynamics of a quantum system. However, QPT scales exponentially with the number of qubits, making it impractical for larger systems.

This project attempts to find more efficient tomography strategies by:

  1. Leveraging Pauli Strings and SIC Strings: Using Pauli bases or Symmetric Informationally Complete (SIC) bases to probe the system.
  2. Optimizing Measurement Strategies: Comparing different heuristics to each other and to random baselines.
  3. Noise Models: Running both ideal (no-noise) simulations and realistic shot-noise simulations.
  4. Machine Learning Reconstruction: Using gradient-based optimization to reconstruct the unitary matrix from measurement probabilities.

πŸ“‚ Project Structure

The repository is organized to separate simulation logic, utilities, and data artifacts.

.
β”œβ”€β”€ data/                       # Simulation data (Git-ignored because of large files)
β”‚   β”œβ”€β”€ pickle/                 # Raw data dictionaries (PyTorch tensors moved to CPU)
β”‚   └── plots/                  # Generated figures from src/plotting/
β”œβ”€β”€ HeuristicStrings/           # Tools to generate sequences of Pauli/SICs according to the correlation heuristic (Credit: Carlos Couto)
β”‚   β”œβ”€β”€ corrmat_paulis_sics.zip # WinRAR ZIP archive (4,496 KB)
β”‚   β”œβ”€β”€ HeuristicStrings.ipynb  # Jupyter notebook
β”‚   β”œβ”€β”€ PauliStrings.nb         # Wolfram Notebook (Not documented but included for completeness sake)
β”‚   └── PauliStrings_SIC.nb     # Wolfram Notebook (Not documented but included for completeness sake)
β”œβ”€β”€ PauliRank/                  # Rank analysis tools (Credit: Carlos Couto)
β”‚   └── PauliStrings_Rank.ipynb
β”œβ”€β”€ src/                        # Source code
β”‚   β”œβ”€β”€ simulations/            # Executable scripts for running experiments
β”‚   β”‚   β”œβ”€β”€ num_ps_heuristic.py
β”‚   β”‚   β”œβ”€β”€ num_ps_random.py
β”‚   β”‚   └── ...
β”‚   β”œβ”€β”€ plotting/               # Scripts to visualize data/pickle files
β”‚   β”œβ”€β”€ converters/             # Utilities for data format conversion
β”‚   └── utils/                  # Core reusable libraries
β”‚       β”œβ”€β”€ quantum_utils.py    # Quantum simulator & Pauli string logic
β”‚       β”œβ”€β”€ ml_utils.py         # Optimization loop & loss functions
β”‚       └── data_utils.py       # File I/O and tensor management
└── requirements.txt            # Python dependencies

βš™οΈ Installation

Prerequisites

  • Python 3.10 or higher.
  • A working Python environment (Virtualenv or Conda recommended).
  • CUDA support (optional) if you wish to run simulations on GPU.

Steps

  1. Clone the Repository

    git clone https://github.com/GabrielCostaBatista/efficient-qpt.git
    cd efficient-qpt
  2. Install Dependencies It is recommended to create a virtual environment first.

    python -m venv venv
    source venv/bin/activate  # On Windows use: venv\Scripts\activate

    Install the required packages:

    pip install -r requirements.txt

πŸš€ Usage

Running Simulations

The simulation scripts are located in src/simulations/. Each script corresponds to a different strategy for selecting Pauli strings (e.g., Random, Heuristic).

Standard Run:

python src/simulations/num_ps_heuristic.py

Configured Run (CLI arguments):

python src/simulations/num_ps_heuristic.py \
  --heuristic_file inputs/heuristic_strings/2_18_ps \
  --num_qubits 2 \
  --num_runs 30 \
  --num_shots 1000 \
  --min_strings 1 \
  --step_size 1

Note: Data is automatically saved to data/pickle/. The code will automatically create these directories if they do not exist.

Configuration Variables

Most main simulation scripts support command-line arguments via argparse. The DEFAULT_* globals at the top of each script are fallback defaults used when a CLI argument is not provided.

To inspect available options for a given script, use:

python src/simulations/num_ps_heuristic.py --help

Common parameters exposed by CLI flags (with defaults taken from each script):

Variable Description Default
NUM_QUBITS Number of qubits in the system (e.g., 2, 3) 2
NUM_RUNS Number of independent simulation runs to average 30
NOISE Boolean (True/False) to enable shot noise simulation True
NUM_SHOTS Number of shots per measurement if noise is enabled 1000
MIN_STATE_STRINGS Minimum number of Pauli/SIC strings to use for reconstruction 1
MAX_STATE_STRINGS Max number of Pauli/SIC strings to use for reconstruction -
SS_STEP_SIZE State String Step Size (by how much to increase the number of state strings when iterating from MIN_STATE_STRINGS to MAX_STATE_STRINGS) 1
SHUFFLE_STRINGS Boolean (True/False) to enable shuffling of the state strings before each run True
USE_PREVIOUS_GUESS Boolean (True/False) to use previous optimization result as start for next step True
CHECKPOINTING_FREQUENCY Frequency of saving checkpoints (in number of runs) 1

Plotting Results

Once data is generated in data/pickle/, you can generate visualizations:

python src/plotting/num_ps_heuristic.py

This will save figures to data/plots/.


πŸ§ͺ Simulations (one by one)

This section documents each simulation script in src/simulations/ and briefly explains its purpose. Use the global configuration variables at the top of each script (e.g., NUM_QUBITS, NUM_RUNS, NOISE, NUM_SHOTS) to control behavior. All scripts save outputs to data/pickle/ and human-readable summaries to data/readable/.

Main simulations

  • num_ps_heuristic.py - Script for evaluating a Pauli String heuristic
    Runs simulation for Pauli Strings selected using an heuristic. Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs, and records reconstruction fidelity and other relevant information.

  • num_ps_random.py - Script for evaluating Random Pauli Strings
    Runs simulation for Pauli Strings selected uniformly at random. Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs with different random Pauli Strings, and records reconstruction fidelity and other relevant information.

  • num_ps_symmetric.py - Symmetry-Avoidance Heuristic
    A variation of the heuristic strategy that prioritizes selecting "non-symmetric" Pauli strings first. It aims to maximize information gain by avoiding redundant symmetric measurements.

    Note: If you plan to use this script for a large number of qubits consider taking a look at the string generation method as there are more efficient ways of generating them.

  • num_sic_heuristic.py - Script for evaluating a SIC String heuristic
    Runs simulation for SIC Strings (Symmetric Informationally Complete) selected using an heuristic. Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs, and records reconstruction fidelity and other relevant information.

  • num_sic_random.py - Script for evaluating Random SIC Strings
    Runs simulation for SIC Strings selected uniformly at random (from the complete set). Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs, and records reconstruction fidelity and other relevant information.

Legacy simulations

Legacy simulations are old simulations that were used to test ideas. These scripts can be used to understand which ideas were tested. These scripts rely on an old (but still supported) way of saving data (in .txt files) and may not support all of the most recent features available in the main simulations.

  • correlations.py - Correlation Matrix Analysis
    Computes and prints the correlation matrix for a set of random unitaries. Used to analyze the statistical independence of different Pauli string measurements.

  • rank_heuristic.py - Rank-based Analysis
    Loads pre-computed sets of Pauli strings from JSON files (generated by PauliRank/ notebooks) which have specific algebraic rank properties. Benchmarks these specific sets to correlate algebraic rank with reconstruction fidelity.

  • variation_fidelity_distance.py / variation_fidelity_distance_2.py - Marginal Improvement Analysis
    Investigates the "marginal utility" of adding a single Pauli string to a fixed base set. It correlates the improvement in fidelity with the geometric distance of the new string from the existing set. Version 2 uses a "minimum distance" metric.

  • variation_fidelity_symmetric.py / variation_fidelity_symmetric_2.py - Symmetry Impact Analysis
    Investigates the impact of adding symmetric vs. non-symmetric Pauli strings to a base set. Version 2 further distinguishes between "Total Symmetric" (global) and "Qubit-Wise Symmetric" (local) relationships to isolate their specific contributions to fidelity.

Notes:

  • If you add a new simulation, follow existing I/O and config conventions so results are automatically saved and reproducible.
  • To generate plots from a run, ensure its .pkl files are present in data/pickle/ before invoking the plotting scripts.

πŸ’Ύ Data Flow

  1. Simulator (src/simulations/): Runs the simulations to generate data on how good is the recovery of a unitary depending on the set of strings that is used.
  2. Checkpointing: Data is saved periodically to data/pickle/ as .pkl files.
    • Note: The data_utils.py module automatically moves PyTorch tensors to CPU before pickling to ensure you can open the data on any machine (even one without a GPU).

πŸ“Š Main Results

Disclaimer: These results indicate general trends, however, in some cases they might lack the statistical robustness that one would expect in a paper.


Exponentially small % of the total set of Pauli Strings needed to reconstruct the unitary

There are a total of $18^{n}$ Pauli Strings (6 choices for preparation and 3 for measurement per qubit) and $12^{n}$ SIC Strings (4 choices for preparation and 3 for measurement per qubit). Despite this exponentially large full set, the unitary can be reconstructed from only an exponentially small fraction of it (although the absolute size of the sufficient subset still grows exponentially).

Using a random selection of Pauli Strings (noiseless), a fidelity β‰₯ 0.99 is first reached on average at:

Qubits Strings needed Total strings ($18^n$) Fraction
1 ~9 18 ~50%
2 ~13 324 ~4.0%
3 ~26 5832 ~0.45%

Note: with 1000 shots per string (noisy), convergence is slower and may not reach the same plateau within the same number of strings.

With noise (1000 shots per string):

1 Qubit 2 Qubits 3 Qubits

Without noise (infinite shots):

1 Qubit 2 Qubits 3 Qubits

The Correlation Heuristic

This heuristic builds a set of Pauli strings that are as mutually uncorrelated as possible, using two layers of logic:

  • Greedy construction: Starting from an initial string or set, the algorithm grows the set one string at a time. At each step it scans all remaining candidates and picks whichever one is least correlated with the strings already in the set. Here, "least correlated" means the sum of its correlations with every current member is as small as possible. This is greedy in the sense that it never reconsiders past choices, it just always takes the locally best option available.
  • Random restarts: Because a greedy approach is sensitive to where you start the algorithm repeats the construction many times from different random starting points and keeps whichever run produced the most uncorrelated final set. More iterations means a better chance of stumbling onto a good starting point, at the cost of more computation.

The key limitation worth noting is that greedily minimising correlation one string at a time does not guarantee the globally optimal set. It's possible that accepting a slightly more correlated string early on would open the door to much better choices later. The random restarts partially compensate for this, but don't fully solve it.

We define the correlation between two different strings as measuring how similar their measurement outcome distributions are, averaged over all possible quantum circuits (unitaries drawn from the Haar measure). This computation was done analytically, leading to an expression which solely depends on the preparation and measurement gates.

Inside the HeuristicStrings directory you can find the code to generate sets of strings following this heuristic, as well as undocumented Mathematica notebooks to generate the correlation matrices and string indices.


Strategy Comparisons

The following plots compares different string-selection strategies against their respective random baselines. Each panel shows average fidelity (Β± standard error) as a function of the number of strings used, for 1, 2, and 3 qubits.

PS Random vs SIC Random

Compares the two random baselines: Pauli Strings drawn uniformly at random vs SIC Strings drawn uniformly at random.

With noise (1000 shots):

1 Qubit 2 Qubits 3 Qubits

Without noise:

1 Qubit 2 Qubits 3 Qubits

PS Random vs PS (Correlation) Heuristic

Compares the Pauli String random baseline against the rank-based heuristic set (strings chosen to minimise the correlation).

With noise (1000 shots):

1 Qubit 2 Qubits 3 Qubits

Without noise:

1 Qubit 2 Qubits 3 Qubits

PS Random vs PS Symmetric (Symmetry-Avoidance Heuristic)

Compares the Pauli String random baseline against the symmetry-avoidance heuristic, where non-symmetric strings (those that differ in more than just preparation sign) are prioritized first.

With noise (1000 shots):

1 Qubit 2 Qubits 3 Qubits

Without noise:

1 Qubit 2 Qubits 3 Qubits

SIC Random vs SIC (Correlation) Heuristic

Compares the SIC String random baseline against the SIC heuristic set (strings chosen to minimise correlation within the SIC basis).

With noise (1000 shots):

1 Qubit 2 Qubits 3 Qubits

Without noise:

1 Qubit 2 Qubits 3 Qubits

Rank Heuristic

The rank of the Pauli measurement matrix can be used as a proxy for information gain: higher-rank configurations tend to yield better reconstructions for the same number of strings.

The trend is clearest at intermediate string counts (e.g. 8 strings), where rank differences are large enough to matter but the reconstruction has not yet converged. The rank heuristic outperforms a random baseline:

1 Qubit 2 Qubits 3 Qubits

Note: these rank-heuristic plots use the legacy text-based data and noiseless simulations. Beyond that the simulations were performed with less runs.


Conclusion

Although the heuristics (and SIC strings) seem to outperform random selections of strings for a single qubit, this advantage disappears when the number of qubits increases. One possible explanation for this phenomena is that what it matters is avoiding doing terrible choices of strings and not making good ones. When the number of qubits increases the ammount of total strings increases exponentially and possibily the % of bad choices reduces exponentially, which makes the choice of a bad set exponentially unlikely.

An interesting future direction to test this hypothesis is to optimise for the worst possible set of strings and check if it performs significantly worst than random.


βš–οΈ License

The source code in this repository is available under the MIT License.

What this means:

  • βœ… You can use, copy, modify, and distribute this software for any purpose.
  • βœ… You can use it in commercial or private projects.
  • πŸ“ You simply need to include the original copyright notice and license text in any copy of the software/source.

See the LICENSE file for details.

About

For researching the best strategies for Quantum Process Tomography with Pauli and SIC Strings

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors