Proposed Method
Forward and Inverse Problems Let $\mathbf{u} \in \mathbb{C}^N$ be the object of interest in vectorized form and $\mathbf{f}_{j} \in \mathbb{Z}_{+}^M$ be the diffraction pattern collected at the $j^\text{th}$ scan location by the far-field detector. Under the Poisson measurement noise assumption, the relationship between the object and the measured diffraction patterns is modeled by the following forward problem (observation model): \begin{equation} \mathbf{f}_{j} \sim \mathcal{P} \left( \left| \mathbf{F} \text{diag}(\mathbf{w}_j)\mathbf{S}_j \mathbf{u} \right|^2 \right) \quad \text{for} \quad j = 1, \dots, J, \label{eq:original-observation-model} \end{equation} where $\mathcal{P}(\lambda)$ denotes the Poisson distribution with rate $\lambda > 0$, applied element-wise to its vector argument; the matrix $\mathbf{F} \in \mathbb{C}^{M \times M}$ represents the two-dimensional discrete Fourier transform; the operator $\text{diag}: \mathbb{C}^M \to \mathbb{C}^{M \times M}$ maps a given vector to a diagonal matrix whose diagonal entries are given by the input vector; the vector $\mathbf{w}_j \in \mathbb{C}^M$ is the complex probe illuminating the object at the $j^\text{th}$ scan location; $\mathbf{S}_j \in \mathbb{R}^{M \times N}$ is a binary matrix extracting the $\sqrt{M} \times \sqrt{M}$ patch located at the $j^\text{th}$ scan location; and $J \in \mathbb{Z}_{++}$ is the number of scan locations in the trajectory. For this forward problem, the inverse problem refers to the task of estimating the underlying object from the measured diffraction patterns $\mathbf{f} \triangleq \begin{bmatrix} \mathbf{f}_1^\top , \dots, \mathbf{f}_J^\top \end{bmatrix}^\top$.
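To make the observation model concrete, the following NumPy sketch simulates \eqref{eq:original-observation-model} for a single scan location. The flat probe, patch geometry, and photon scaling below are illustrative placeholders, not the exact settings used in our experiments.

```python
import numpy as np

def forward_patch(u, w, top, left):
    """Simulate one diffraction pattern f_j ~ Poisson(|F diag(w_j) S_j u|^2).

    u: complex object of shape (H, W); w: complex probe of shape (m, m);
    (top, left): corner of the j-th scan patch (the action of S_j).
    """
    m = w.shape[0]
    patch = u[top:top + m, left:left + m]   # S_j u
    exit_wave = w * patch                   # diag(w_j) S_j u
    farfield = np.fft.fft2(exit_wave)       # F (2-D DFT)
    rate = np.abs(farfield) ** 2            # Poisson rate |.|^2
    return np.random.poisson(rate)          # noisy diffraction pattern

# toy usage: a random complex object and a flat probe
rng = np.random.default_rng(0)
u = rng.random((64, 64)) * np.exp(1j * rng.random((64, 64)))
w = 10.0 * np.ones((16, 16), dtype=complex)  # probe amplitude sets photon count
f_j = forward_patch(u, w, top=8, left=8)
```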
Bayesian Inversion with Generative Models for Ptychography Bayesian inversion aims to solve the inverse problem by calculating the posterior distribution of the underlying object given the observed diffraction patterns, $p_{u | f}(\cdot | \mathbf{f})$. This requires defining a likelihood model $p_{f | u}(\mathbf{f} | \cdot)$ and specifying a prior distribution $p_u$ that represents our a priori knowledge about the object. Unfortunately, specifying the prior distribution is challenging in practice since it is difficult to express our a priori knowledge about the object in a mathematically tractable form. This often results in overly simplified prior distributions that fail to capture the complex structure of real-world objects (see [1] for an illustrative example).
Inspired by a common approach presented in the literature (e.g., [2], [3], [4], [5], [6]), we aim to circumvent this problem by representing the prior distribution of the underlying object through a deep latent generative model. In the rest of this paper, we assume that we have access to a trained deep latent generative model $G : \mathbb{R}^Z \to \mathbb{C}^N$, which is trained using a dataset containing many object samples, and that the generative model $G$ is capable of generating the type of objects we would like to reconstruct, i.e., $\mathbf{u} = G(\mathbf{z})$ where $\mathbf{z} \sim p_z$ is the latent variable of the generator. Under this assumption, we substitute the surrogate expression $G(\mathbf{z})$ for the object $\mathbf{u}$ in the forward problem in \eqref{eq:original-observation-model} and aim to perform Bayesian inversion over the latent variable $\mathbf{z}$ by generating samples from the posterior distribution of the latent variable given the diffraction patterns, $p_{z | f}(\cdot | \mathbf{f})$. By Bayes’ theorem, we can decompose the posterior distribution as follows: \begin{equation} p_{z | f}(\mathbf{z} | \mathbf{f}) = \frac{p_{f | z}(\mathbf{f} | \mathbf{z}) p_z(\mathbf{z})}{p_f(\mathbf{f})}, \end{equation} where \begin{equation} p_{f | z}\left(\mathbf{f}|\mathbf{z}\right) = \prod_{j=1}^J \prod_{m=1}^M \frac{ \left[ | {\mathbf{A}}_j G(\mathbf{z}) |^2 \right]_m^{ \left[ \mathbf{f}_j \right]_m } }{ \left[ \mathbf{f}_j \right]_m! } e^{ - \left[ | {\mathbf{A}}_j G(\mathbf{z}) |^2 \right]_m } \quad \text{and} \quad \mathbf{A}_j \triangleq \mathbf{F} \text{diag}(\mathbf{w}_j)\mathbf{S}_j.\end{equation}
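In practice, only the log of this likelihood is needed, and the factorial term is a constant in $\mathbf{z}$. A minimal sketch of the log-likelihood, assuming the forward operators $\mathbf{A}_j$ are available as callables, is given below; the small `eps` guards the logarithm and is a numerical convenience, not part of the model.

```python
import numpy as np

def log_likelihood(G_z, A_ops, f, eps=1e-12):
    """Poisson log-likelihood log p(f | z), up to the constant -sum_j log(f_j!).

    G_z:   generated object G(z), complex array
    A_ops: list of callables, A_ops[j](u) computes F diag(w_j) S_j u
    f:     list of measured diffraction patterns f_j
    """
    ll = 0.0
    for A_j, f_j in zip(A_ops, f):
        rate = np.abs(A_j(G_z)) ** 2             # Poisson rate |A_j G(z)|^2
        ll += np.sum(f_j * np.log(rate + eps) - rate)
    return ll
```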
Unfortunately, the exact calculation of the posterior distribution above is intractable due to the evidence term, $p_f(\mathbf{f})$. Thus, we leverage a gradient-based Markov chain Monte Carlo algorithm called the unadjusted Langevin algorithm [7] to generate samples from the posterior distribution. The update equation of the corresponding iterative algorithm is given by \begin{equation} \mathbf{z}^{(k+1)} = \mathbf{z}^{(k)} + \gamma \nabla_{\mathbf{z}} \log p_{f | z}( \mathbf{f} | \mathbf{z}^{(k)} ) + \gamma \nabla_{\mathbf{z}} \log p_{z}( \mathbf{z}^{(k)}) + \sqrt{2 \gamma} \pmb{\varepsilon}^{(k)} \label{eq:ula} \end{equation} where $\gamma > 0$ is the step size; $\pmb{\varepsilon}^{(k)} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ is a random additive perturbation; and the gradient of the log-likelihood is given by \begin{equation} \nabla_{\mathbf{z}} \log p_{f|z}( \mathbf{f} | \mathbf{z} ) = 2 \Re \biggl \lbrace \mathbf{J}_G^H(\mathbf{z}) \sum_{j=1}^J \mathbf{A}_j^H \left[ ({\mathbf{A}}_j G(\mathbf{z})) \odot \left( {\mathbf{f}_j} \oslash { | {\mathbf{A}}_j G(\mathbf{z}) |^2 } - \mathbf{1} \right) \right] \biggr \rbrace, \label{eq:gradient-log-likelihood} \end{equation} where $\Re$ calculates the element-wise real part of a given complex vector; $\mathbf{J}_G(\mathbf{z})$ is the Jacobian matrix of the generator $G$ evaluated at $\mathbf{z}$; $(\cdot)^H$ denotes the Hermitian conjugate; and the symbols $\odot$ and $\oslash$ denote the element-wise multiplication and division operations, respectively. It is worth noting that each iteration of \eqref{eq:ula} utilizes the generative model to represent the prior on the object while utilizing the likelihood function $ p_{f | z}( \mathbf{f} | \cdot )$ to enforce data consistency and introduce a physical inductive bias.
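A minimal PyTorch sketch of the update in \eqref{eq:ula} follows, assuming a standard Gaussian latent prior (so $\nabla_{\mathbf{z}} \log p_z(\mathbf{z}) = -\mathbf{z}$) and letting automatic differentiation supply the vector-Jacobian products in \eqref{eq:gradient-log-likelihood}; `generator` and `A_ops` are stand-ins for the trained model and the forward operators.

```python
import torch

def ula_sample(generator, A_ops, f, z0, step=1e-5, n_iters=1000, eps=1e-12):
    """Unadjusted Langevin algorithm over the latent variable z.

    generator: maps a latent tensor z to a complex object G(z)
    A_ops:     list of callables, A_ops[j](u) computes F diag(w_j) S_j u
    f:         list of measured diffraction patterns (real tensors)
    """
    z = z0.detach().clone()
    samples = []
    for _ in range(n_iters):
        z.requires_grad_(True)
        G_z = generator(z)
        # Poisson log-likelihood, up to an additive constant in z
        ll = 0.0
        for A_j, f_j in zip(A_ops, f):
            rate = torch.abs(A_j(G_z)) ** 2
            ll = ll + (f_j * torch.log(rate + eps) - rate).sum()
        grad_ll, = torch.autograd.grad(ll, z)  # vector-Jacobian product via autograd
        with torch.no_grad():
            noise = torch.randn_like(z)
            # grad log p_z(z) = -z for a standard Gaussian latent prior
            z = z + step * grad_ll - step * z + (2 * step) ** 0.5 * noise
        samples.append(z.detach().clone())
    return samples
```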
We run this iterative algorithm for $K$ iterations, computing the required vector-Jacobian products via automatic differentiation, and pass the resulting iterates $\lbrace \mathbf{z}^{(1)}, \dots, \mathbf{z}^{(K)} \rbrace$ through the generator to obtain samples from the posterior distribution of the object. These samples can then be used to form a reconstructed image, e.g., the arithmetic mean of the samples, and an uncertainty map, e.g., the pixel-wise standard deviation of the samples.
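Given the post-burn-in object samples, forming these summaries is straightforward; this sketch assumes the latent iterates have already been pushed through the generator and stacked into one array.

```python
import numpy as np

def posterior_summaries(u_samples):
    """Reconstruction and uncertainty maps from posterior object samples.

    u_samples: complex array of shape (K, H, W) holding G(z^(k))
    for the post-burn-in iterates.
    """
    recon = u_samples.mean(axis=0)            # arithmetic-mean reconstruction
    unc_mag = np.abs(u_samples).std(axis=0)   # pixel-wise std of the magnitude
    unc_phase = np.angle(u_samples).std(axis=0)  # pixel-wise std of the phase
    return recon, unc_mag, unc_phase
```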
Results
We utilized the MNIST dataset [8] to create synthetic ptychographic objects. Initially, we resized each image in the dataset to a $64 \times 64$ resolution. Subsequently, we added a constant offset of $0.2$ to each image and normalized the images so that the pixel intensities fell within the range $(0,1]$. Finally, each ptychographic object was constructed by using one of these images for the magnitude and another for the phase. Since the magnitude and phase images of the ptychographic objects were constructed using identical procedures, we trained a single Wasserstein GAN model [9] (see [2] for its desirable theoretical properties relevant to the posterior sampling problem) on both types of images and used the resulting generator twice to obtain the complex-valued generator $G$.
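The construction of a complex object from a magnitude/phase image pair can be sketched as follows; the mapping of the normalized phase image to radians (here $\pi$) is an illustrative assumption, as the exact scale is not part of the description above.

```python
import numpy as np

def preprocess(img):
    """Offset-and-normalize step; resizing to 64x64 is assumed done upstream."""
    img = img.astype(float) / img.max() + 0.2  # constant offset of 0.2
    return img / img.max()                     # intensities in (0, 1]

def make_object(mag_img, phase_img, phase_scale=np.pi):
    """Complex object with one MNIST image as magnitude and another as phase.

    phase_scale maps the normalized phase image to radians; pi here is an
    illustrative assumption, not the paper's exact setting.
    """
    return preprocess(mag_img) * np.exp(1j * phase_scale * preprocess(phase_img))
```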
To simulate the forward problem, we utilized the Tike software package [10]. We used a disk probe for the simulations, where the probe size was set to $16 \times 16$, and the radius was fixed to $8$ pixels. The scan locations were created by following a raster scan pattern, with random perturbations added to the scan locations. To simulate different data acquisition conditions, we repeated these simulations for different overlap ratios and different probe amplitude values. We ran the proposed method for $1000$ iterations, discarding the first $500$ iterations as the burn-in period. We fixed the step size $\gamma$ to $10^{-5}$ and initialized the latent variable such that $G(\mathbf{z}^{(0)})$ was close to free space in the mean-squared-error sense. We compared our method with the state-of-the-art iterative reconstruction method rPIE [11], which is frequently used in synchrotron facilities.
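One way to generate such a perturbed raster trajectory is sketched below; deriving the step size from the overlap ratio as `probe_size * (1 - overlap)` is one common convention and may differ from the definition used by Tike.

```python
import numpy as np

def perturbed_raster(obj_size=64, probe_size=16, overlap=0.05, jitter=1.0, seed=0):
    """Raster-scan positions with random perturbations.

    step = probe_size * (1 - overlap) is an assumed convention for relating
    the overlap ratio to the raster step size.
    """
    rng = np.random.default_rng(seed)
    step = probe_size * (1.0 - overlap)
    grid = np.arange(0, obj_size - probe_size + 1, step)
    yy, xx = np.meshgrid(grid, grid, indexing="ij")
    pos = np.stack([yy.ravel(), xx.ravel()], axis=1)
    return pos + rng.uniform(-jitter, jitter, size=pos.shape)  # random jitter
```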
Figure 1: Reconstructions obtained by the proposed method and rPIE [11]. The overlap rate is 5%, and the probe amplitude is 100. Uncertainty maps are provided for reference.
Figure 2: (Left) Reconstruction performance of the proposed method and rPIE [11] under various conditions. (Right) Correlation between the error and the uncertainty estimates provided by the proposed method.
Citation
@inproceedings{Ekmekci2024PtychographyUncertaintyQuantification,
  title = {Integrating Generative and Physics-Based Models for Ptychographic Imaging with Uncertainty Quantification},
  author = {Canberk Ekmekci and Tekin Bicer and Zichao (Wendy) Di and Junjing Deng and Mujdat Cetin},
  booktitle = {Machine Learning and the Physical Sciences Workshop @ NeurIPS 2024},
  year = {2024},
  url = {https://openreview.net/forum?id=HyBXHOHTYD}
}
References
[1] J. Adler and O. Oktem, “Deep Bayesian Inversion,” arXiv:1811.05910, 2018.
[2] P. Bohra, T.-A. Pham, J. Dong, and M. Unser, “Bayesian Inversion for Nonlinear Imaging Models Using Deep Generative Priors,” IEEE Transactions on Computational Imaging, 2022.
[3] A. Dasgupta, D. V. Patel, D. Ray, E. A. Johnson, and A. A. Oberai, “A Dimension-Reduced Variational Approach for Solving Physics-Based Inverse Problems Using Generative Adversarial Network Priors and Normalizing Flows,” Computer Methods in Applied Mechanics and Engineering, 2024.
[4] D. V. Patel, D. Ray, and A. A. Oberai, “Solution of Physics-Based Bayesian Inverse Problems with Deep Generative Priors,” Computer Methods in Applied Mechanics and Engineering, 2022.
[5] A. Jalal, S. Karmalkar, A. Dimakis, and E. Price, “Instance-Optimal Compressed Sensing via Posterior Sampling,” International Conference on Machine Learning, 2021.
[6] J. Whang, E. Lindgren, and A. Dimakis, “Composing Normalizing Flows for Inverse Problems,” International Conference on Machine Learning, 2021.
[7] G. O. Roberts and R. L. Tweedie, “Exponential Convergence of Langevin Distributions and Their Discrete Approximations,” Bernoulli, 1996.
[8] Y. LeCun, C. Cortes, and C. J. Burges, “MNIST Handwritten Digit Database,” AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2010.
[9] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein Generative Adversarial Networks,” International Conference on Machine Learning, 2017.
[10] D. Gursoy and D. J. Ching, “Tike,” [Computer Software] https://doi.org/10.11578/dc.20230202.1, 2022.
[11] A. Maiden, D. Johnson, and P. Li, “Further Improvements to the Ptychographical Iterative Engine,” Optica, 2017.