Efficient Quantum Process Tomography (QPT) with Pauli and SIC Strings

This repository contains the source code for investigating Quantum Process Tomography (QPT) using Pauli Strings (and SIC Strings). The goal of this project is to develop and analyze efficient strategies for reconstructing quantum processes (unitary maps) by optimizing the selection of preparation and measurement bases (either Pauli Strings or SIC Strings).

⚠️ Attribution & Credits

Core Research & Simulations: Developed by Gabriel Batista.
Pauli Rank Analysis: The notebooks in the HeuristicStrings and PauliRank/ directories were developed by Carlos Couto and are included here for completeness in finding heuristic strings and analyzing the rank properties of the Pauli bases. These notebooks were used to find heuristic string sets.

📋 Table of Contents

Problem Statement
Project Structure
Installation
Usage
- Running Simulations
- Plotting Results
Simulations (one by one)
Data Flow
Main Results
License

🔭 Problem Statement

Quantum Process Tomography (QPT) is the procedure for characterizing the dynamics of a quantum system. However, QPT scales exponentially with the number of qubits, making it impractical for larger systems.

This project attempts to find more efficient tomography strategies by:

Leveraging Pauli Strings and SIC Strings: Using Pauli bases or Symmetric Informationally Complete (SIC) bases to probe the system.
Optimizing Measurement Strategies: Comparing different heuristics to each other and to random baselines.
Noise Models: Running both ideal (no-noise) simulations and realistic shot-noise simulations.
Machine Learning Reconstruction: Using gradient-based optimization to reconstruct the unitary matrix from measurement probabilities.

📂 Project Structure

The repository is organized to separate simulation logic, utilities, and data artifacts.

.
├── data/                       # Simulation data (Git-ignored because of large files)
│   ├── pickle/                 # Raw data dictionaries (PyTorch tensors moved to CPU)
│   └── plots/                  # Generated figures from src/plotting/
├── HeuristicStrings/           # Tools to generate sequences of Pauli/SICs according to the correlation heuristic (Credit: Carlos Couto)
│   ├── corrmat_paulis_sics.zip # WinRAR ZIP archive (4,496 KB)
│   ├── HeuristicStrings.ipynb  # Jupyter notebook
│   ├── PauliStrings.nb         # Wolfram Notebook (Not documented but included for completeness sake)
│   └── PauliStrings_SIC.nb     # Wolfram Notebook (Not documented but included for completeness sake)
├── PauliRank/                  # Rank analysis tools (Credit: Carlos Couto)
│   └── PauliStrings_Rank.ipynb
├── src/                        # Source code
│   ├── simulations/            # Executable scripts for running experiments
│   │   ├── num_ps_heuristic.py
│   │   ├── num_ps_random.py
│   │   └── ...
│   ├── plotting/               # Scripts to visualize data/pickle files
│   ├── converters/             # Utilities for data format conversion
│   └── utils/                  # Core reusable libraries
│       ├── quantum_utils.py    # Quantum simulator & Pauli string logic
│       ├── ml_utils.py         # Optimization loop & loss functions
│       └── data_utils.py       # File I/O and tensor management
└── requirements.txt            # Python dependencies

⚙️ Installation

Prerequisites

Python 3.10 or higher.
A working Python environment (Virtualenv or Conda recommended).
CUDA support (optional) if you wish to run simulations on GPU.

Steps

Clone the Repository

git clone https://github.com/GabrielCostaBatista/efficient-qpt.git
cd efficient-qpt

Install Dependencies It is recommended to create a virtual environment first.

python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate

Install the required packages:

pip install -r requirements.txt

🚀 Usage

Running Simulations

The simulation scripts are located in src/simulations/. Each script corresponds to a different strategy for selecting Pauli strings (e.g., Random, Heuristic).

Standard Run:

python src/simulations/num_ps_heuristic.py

Configured Run (CLI arguments):

python src/simulations/num_ps_heuristic.py \
  --heuristic_file inputs/heuristic_strings/2_18_ps \
  --num_qubits 2 \
  --num_runs 30 \
  --num_shots 1000 \
  --min_strings 1 \
  --step_size 1

Note: Data is automatically saved to data/pickle/. The code will automatically create these directories if they do not exist.

Configuration Variables

Most main simulation scripts support command-line arguments via argparse. The DEFAULT_* globals at the top of each script are fallback defaults used when a CLI argument is not provided.

To inspect available options for a given script, use:

python src/simulations/num_ps_heuristic.py --help

Common parameters exposed by CLI flags (with defaults taken from each script):

Variable	Description	Default
`NUM_QUBITS`	Number of qubits in the system (e.g., 2, 3)	`2`
`NUM_RUNS`	Number of independent simulation runs to average	`30`
`NOISE`	Boolean (`True`/`False`) to enable shot noise simulation	`True`
`NUM_SHOTS`	Number of shots per measurement if noise is enabled	`1000`
`MIN_STATE_STRINGS`	Minimum number of Pauli/SIC strings to use for reconstruction	`1`
`MAX_STATE_STRINGS`	Max number of Pauli/SIC strings to use for reconstruction	-
`SS_STEP_SIZE`	State String Step Size (by how much to increase the number of state strings when iterating from `MIN_STATE_STRINGS` to `MAX_STATE_STRINGS`)	`1`
`SHUFFLE_STRINGS`	Boolean (`True`/`False`) to enable shuffling of the state strings before each run	`True`
`USE_PREVIOUS_GUESS`	Boolean (`True`/`False`) to use previous optimization result as start for next step	`True`
`CHECKPOINTING_FREQUENCY`	Frequency of saving checkpoints (in number of runs)	`1`

Plotting Results

Once data is generated in data/pickle/, you can generate visualizations:

python src/plotting/num_ps_heuristic.py

This will save figures to data/plots/.

🧪 Simulations (one by one)

This section documents each simulation script in src/simulations/ and briefly explains its purpose. Use the global configuration variables at the top of each script (e.g., NUM_QUBITS, NUM_RUNS, NOISE, NUM_SHOTS) to control behavior. All scripts save outputs to data/pickle/ and human-readable summaries to data/readable/.

Main simulations

num_ps_heuristic.py - Script for evaluating a Pauli String heuristic
Runs simulation for Pauli Strings selected using an heuristic. Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs, and records reconstruction fidelity and other relevant information.
num_ps_random.py - Script for evaluating Random Pauli Strings
Runs simulation for Pauli Strings selected uniformly at random. Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs with different random Pauli Strings, and records reconstruction fidelity and other relevant information.
num_ps_symmetric.py - Symmetry-Avoidance Heuristic
A variation of the heuristic strategy that prioritizes selecting "non-symmetric" Pauli strings first. It aims to maximize information gain by avoiding redundant symmetric measurements.

Note: If you plan to use this script for a large number of qubits consider taking a look at the string generation method as there are more efficient ways of generating them.
num_sic_heuristic.py - Script for evaluating a SIC String heuristic
Runs simulation for SIC Strings (Symmetric Informationally Complete) selected using an heuristic. Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs, and records reconstruction fidelity and other relevant information.
num_sic_random.py - Script for evaluating Random SIC Strings
Runs simulation for SIC Strings selected uniformly at random (from the complete set). Sweeps the number of state strings from MIN_STATE_STRINGS to MAX_STATE_STRINGS, performs NUM_RUNS independent runs, and records reconstruction fidelity and other relevant information.

Legacy simulations

Legacy simulations are old simulations that were used to test ideas. These scripts can be used to understand which ideas were tested. These scripts rely on an old (but still supported) way of saving data (in .txt files) and may not support all of the most recent features available in the main simulations.

correlations.py - Correlation Matrix Analysis
Computes and prints the correlation matrix for a set of random unitaries. Used to analyze the statistical independence of different Pauli string measurements.
rank_heuristic.py - Rank-based Analysis
Loads pre-computed sets of Pauli strings from JSON files (generated by PauliRank/ notebooks) which have specific algebraic rank properties. Benchmarks these specific sets to correlate algebraic rank with reconstruction fidelity.
variation_fidelity_distance.py / variation_fidelity_distance_2.py - Marginal Improvement Analysis
Investigates the "marginal utility" of adding a single Pauli string to a fixed base set. It correlates the improvement in fidelity with the geometric distance of the new string from the existing set. Version 2 uses a "minimum distance" metric.
variation_fidelity_symmetric.py / variation_fidelity_symmetric_2.py - Symmetry Impact Analysis
Investigates the impact of adding symmetric vs. non-symmetric Pauli strings to a base set. Version 2 further distinguishes between "Total Symmetric" (global) and "Qubit-Wise Symmetric" (local) relationships to isolate their specific contributions to fidelity.

Notes:

If you add a new simulation, follow existing I/O and config conventions so results are automatically saved and reproducible.
To generate plots from a run, ensure its .pkl files are present in data/pickle/ before invoking the plotting scripts.

💾 Data Flow

Simulator (src/simulations/): Runs the simulations to generate data on how good is the recovery of a unitary depending on the set of strings that is used.
Checkpointing: Data is saved periodically to data/pickle/ as .pkl files.
- Note: The data_utils.py module automatically moves PyTorch tensors to CPU before pickling to ensure you can open the data on any machine (even one without a GPU).

📊 Main Results

Disclaimer: These results indicate general trends, however, in some cases they might lack the statistical robustness that one would expect in a paper.

Exponentially small % of the total set of Pauli Strings needed to reconstruct the unitary

There are a total of $18^{n}$ Pauli Strings (6 choices for preparation and 3 for measurement per qubit) and $12^{n}$ SIC Strings (4 choices for preparation and 3 for measurement per qubit). Despite this exponentially large full set, the unitary can be reconstructed from only an exponentially small fraction of it (although the absolute size of the sufficient subset still grows exponentially).

Using a random selection of Pauli Strings (noiseless), a fidelity ≥ 0.99 is first reached on average at:

Qubits	Strings needed	Total strings ($18^n$)	Fraction
1	~9	18	~50%
2	~13	324	~4.0%
3	~26	5832	~0.45%

Note: with 1000 shots per string (noisy), convergence is slower and may not reach the same plateau within the same number of strings.

With noise (1000 shots per string):

1 Qubit	2 Qubits	3 Qubits

Without noise (infinite shots):

1 Qubit	2 Qubits	3 Qubits

The Correlation Heuristic

This heuristic builds a set of Pauli strings that are as mutually uncorrelated as possible, using two layers of logic:

Greedy construction: Starting from an initial string or set, the algorithm grows the set one string at a time. At each step it scans all remaining candidates and picks whichever one is least correlated with the strings already in the set. Here, "least correlated" means the sum of its correlations with every current member is as small as possible. This is greedy in the sense that it never reconsiders past choices, it just always takes the locally best option available.
Random restarts: Because a greedy approach is sensitive to where you start the algorithm repeats the construction many times from different random starting points and keeps whichever run produced the most uncorrelated final set. More iterations means a better chance of stumbling onto a good starting point, at the cost of more computation.

The key limitation worth noting is that greedily minimising correlation one string at a time does not guarantee the globally optimal set. It's possible that accepting a slightly more correlated string early on would open the door to much better choices later. The random restarts partially compensate for this, but don't fully solve it.

We define the correlation between two different strings as measuring how similar their measurement outcome distributions are, averaged over all possible quantum circuits (unitaries drawn from the Haar measure). This computation was done analytically, leading to an expression which solely depends on the preparation and measurement gates.

Inside the HeuristicStrings directory you can find the code to generate sets of strings following this heuristic, as well as undocumented Mathematica notebooks to generate the correlation matrices and string indices.

Strategy Comparisons

The following plots compares different string-selection strategies against their respective random baselines. Each panel shows average fidelity (± standard error) as a function of the number of strings used, for 1, 2, and 3 qubits.

PS Random vs SIC Random

Compares the two random baselines: Pauli Strings drawn uniformly at random vs SIC Strings drawn uniformly at random.

With noise (1000 shots):

1 Qubit	2 Qubits	3 Qubits

Without noise:

1 Qubit	2 Qubits	3 Qubits

PS Random vs PS (Correlation) Heuristic

Compares the Pauli String random baseline against the rank-based heuristic set (strings chosen to minimise the correlation).

With noise (1000 shots):

1 Qubit	2 Qubits	3 Qubits

Without noise:

1 Qubit	2 Qubits	3 Qubits

PS Random vs PS Symmetric (Symmetry-Avoidance Heuristic)

Compares the Pauli String random baseline against the symmetry-avoidance heuristic, where non-symmetric strings (those that differ in more than just preparation sign) are prioritized first.

With noise (1000 shots):

1 Qubit	2 Qubits	3 Qubits

Without noise:

1 Qubit	2 Qubits	3 Qubits

SIC Random vs SIC (Correlation) Heuristic

Compares the SIC String random baseline against the SIC heuristic set (strings chosen to minimise correlation within the SIC basis).

With noise (1000 shots):

1 Qubit	2 Qubits	3 Qubits

Without noise:

1 Qubit	2 Qubits	3 Qubits

Rank Heuristic

The rank of the Pauli measurement matrix can be used as a proxy for information gain: higher-rank configurations tend to yield better reconstructions for the same number of strings.

The trend is clearest at intermediate string counts (e.g. 8 strings), where rank differences are large enough to matter but the reconstruction has not yet converged. The rank heuristic outperforms a random baseline:

1 Qubit	2 Qubits	3 Qubits

Note: these rank-heuristic plots use the legacy text-based data and noiseless simulations. Beyond that the simulations were performed with less runs.

Conclusion

Although the heuristics (and SIC strings) seem to outperform random selections of strings for a single qubit, this advantage disappears when the number of qubits increases. One possible explanation for this phenomena is that what it matters is avoiding doing terrible choices of strings and not making good ones. When the number of qubits increases the ammount of total strings increases exponentially and possibily the % of bad choices reduces exponentially, which makes the choice of a bad set exponentially unlikely.

An interesting future direction to test this hypothesis is to optimise for the worst possible set of strings and check if it performs significantly worst than random.

⚖️ License

The source code in this repository is available under the MIT License.

What this means:

✅ You can use, copy, modify, and distribute this software for any purpose.
✅ You can use it in commercial or private projects.
📝 You simply need to include the original copyright notice and license text in any copy of the software/source.

See the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 93 Commits
HeuristicStrings		HeuristicStrings
PauliRank		PauliRank
inputs/heuristic_strings		inputs/heuristic_strings
plots_readme		plots_readme
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Efficient Quantum Process Tomography (QPT) with Pauli and SIC Strings

⚠️ Attribution & Credits

📋 Table of Contents

🔭 Problem Statement

📂 Project Structure

⚙️ Installation

Prerequisites

Steps

🚀 Usage

Running Simulations

Configuration Variables

Plotting Results

🧪 Simulations (one by one)

Main simulations

Legacy simulations

💾 Data Flow

📊 Main Results

Exponentially small % of the total set of Pauli Strings needed to reconstruct the unitary

The Correlation Heuristic

Strategy Comparisons

PS Random vs SIC Random

PS Random vs PS (Correlation) Heuristic

PS Random vs PS Symmetric (Symmetry-Avoidance Heuristic)

SIC Random vs SIC (Correlation) Heuristic

Rank Heuristic

Conclusion

⚖️ License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Efficient Quantum Process Tomography (QPT) with Pauli and SIC Strings

⚠️ Attribution & Credits

📋 Table of Contents

🔭 Problem Statement

📂 Project Structure

⚙️ Installation

Prerequisites

Steps

🚀 Usage

Running Simulations

Configuration Variables

Plotting Results

🧪 Simulations (one by one)

Main simulations

Legacy simulations

💾 Data Flow

📊 Main Results

Exponentially small % of the total set of Pauli Strings needed to reconstruct the unitary

The Correlation Heuristic

Strategy Comparisons

PS Random vs SIC Random

PS Random vs PS (Correlation) Heuristic

PS Random vs PS Symmetric (Symmetry-Avoidance Heuristic)

SIC Random vs SIC (Correlation) Heuristic

Rank Heuristic

Conclusion

⚖️ License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages