
Applied Physics B         (2024) 130:166  
https://doi.org/10.1007/s00340-024-08280-3

REVIEW

Roadmap on computational methods in optical imaging 
and holography [invited]

Joseph Rosen1,2 · Simon Alford5 · Blake Allan6 · Vijayakumar Anand2,7 · Shlomi Arnon1 · Francis Gracy Arockiaraj1,2 · 
Jonathan Art5 · Bijie Bai8 · Ganesh M. Balasubramaniam1 · Tobias Birnbaum9,10 · Nandan S. Bisht11 · 
David Blinder9,12,13 · Liangcai Cao14 · Qian Chen15 · Ziyang Chen16 · Vishesh Dubey4 · Karen Egiazarian17 · 
Mert Ercan18,19 · Andrew Forbes20 · G. Gopakumar21 · Yunhui Gao14 · Sylvain Gigan3 · Paweł Gocłowski4 · 
Shivasubramanian Gopinath2 · Alon Greenbaum22,23,24 · Ryoichi Horisaki25 · Daniel Ierodiaconou6 · 
Saulius Juodkazis7,26 · Tanushree Karmakar27 · Vladimir Katkovnik17 · Svetlana N. Khonina28,29 · Peter Kner30 · 
Vladislav Kravets1 · Ravi Kumar31 · Yingming Lai32 · Chen Li22,23 · Jiaji Li15,33,34 · Shaoheng Li30 · Yuzhu Li8 · 
Jinyang Liang32 · Gokul Manavalan1 · Aditya Chandra Mandal27 · Manisha Manisha27 · Christopher Mann35,36 · 
Marcin J. Marzejon37 · Chané Moodley20 · Junko Morikawa26 · Inbarasan Muniraj38 · Donatas Narbutis39 · 
Soon Hock Ng7 · Fazilah Nothlawala20 · Jeonghun Oh40,41 · Aydogan Ozcan8  · YongKeun Park40,41,42 · 
Alexey P. Porfirev28 · Mariana Potcoava5 · Shashi Prabhakar43 · Jixiong Pu16 · Mani Ratnam Rai22,23 · 
Mikołaj Rogalski37 · Meguya Ryu44 · Sakshi Choudhary45 · Gangi Reddy Salla31 · Peter Schelkens9,12 · 
Sarp Feykun Şener18,19 · Igor Shevkunov17 · Tomoyoshi Shimobaba13 · Rakesh K. Singh27 · Ravindra P. Singh43 · 
Adrian Stern1 · Jiasong Sun15,33,34 · Shun Zhou15,33,34 · Chao Zuo15,33,34 · Zack Zurawski5 · Tatsuki Tahara46 · 
Vipin Tiwari2 · Maciej Trusiak37 · R. V. Vinu16 · Sergey G. Volotovskiy28 · Hasan Yılmaz18 · Hilton Barbosa De Aguiar3 · 
Balpreet S. Ahluwalia4 · Azeem Ahmad4

Received: 30 January 2024 / Accepted: 10 July 2024 
© The Author(s) 2024

Abstract
Computational methods have become cornerstones of optical imaging and holography in recent years. The dependence of 
optical imaging and holography on computational methods grows significantly every year, to the extent that optical methods 
and components are being replaced entirely, and at low cost, by computational methods. 
This roadmap reviews the current state of four major areas, namely incoherent digital holography, quantitative phase 
imaging, imaging through scattering layers, and super-resolution imaging. In addition to recording the perspectives of the 
modern-day architects of these research areas, the roadmap also reports some of the latest studies on each topic. Compu-
tational codes and pseudocodes are presented in a plug-and-play fashion so that readers can not only 
read and understand them but also apply the latest algorithms to their own data. We believe that this roadmap will be a valuable 
tool for analyzing current trends in computational methods and for predicting and preparing for their future 
in optical imaging and holography.

1  Introduction (Joseph Rosen 
and Vijayakumar Anand)

Light is a powerful means that enables imprinting and 
recording of the characteristics of objects in real-time 
on a rewritable mold. The different properties of light, 
such as intensity and phase distributions, polarization, and
spectrum allow us to sense the reflectivity and thickness 
distributions and the birefringence and spectral absorption 
characteristics of objects [1]. When light interacts with 
an object, the different characteristics of the object are 
imprinted on those of light, and the goal is to measure the 

changes in the characteristics of light after the interaction 
with high accuracy, at a low cost, with fewer resources and 
in a short time. In the past, the abovementioned measure-
ments involved only optical techniques and optical and 
recording elements [2–10]. However, the invention of 
charge-coupled devices, computers, and associated com-
putational techniques revolutionized light-based measure-
ment technologies by sharing the responsibilities between 
optics and computation. The phenomenal work of several 
researchers resulted in the gradual introduction of com-
putational methods to imaging and holography [11–15]. 
This optical-computational association gradually reached 
several milestones in imaging technology in the follow-
ing stages. The first imaging approaches were free of 
computational methods and completely relied on optical 
elements and recording media. The introduction of computational methods to imaging and holography shifted the 
full dependence on optics to a dependence shared between 
optics and computation. Today, the field of imaging relies
significantly more on computations than on optical ele-
ments, with some techniques even free of optical elements 
[16–20]. With the development of deep learning methods, 
new possibilities in imaging technology have arisen [20]. 
The entire imaging process in imaging systems that com-
prises many optical elements, if broken down into indi-
vidual steps, reveals several closely knitted computational 
methods and processes with very few optical methods and 
processes.

The above evolution leads to an important question: 
what is the next step in this evolutionary process? This 
question is neither direct nor easy to answer. To answer this
question, it is necessary to review the current state-of-
the-art imaging technologies used in all associated sub-
fields, such as computational imaging, quantitative phase 
imaging, quantum imaging, incoherent imaging, imaging 
through scattering layers, deep learning and polariza-
tion imaging. This roadmap is a collection of some of the 
widely used computational techniques that assist, improve, 
and replace optical counterparts in today’s imaging tech-
nologies. Unlike other roadmaps, this roadmap focuses 
on computational methods. The roadmap comprises com-
putational techniques developed by some of the leading 
research groups that include prominent researchers and 
architects of modern-day computational imaging tech-
nologies. In the past, as today, the goal has been to 
measure the characteristics of an object, such as intensity, 
phase, and polarization, using light as a real-time mold, 
but better and faster, with fewer resources and at lower 
cost. Although it is impossible to cover the entire domain
of imaging technology, this roadmap aims to provide 
insight into some of the latest computational techniques 
used in advanced imaging technologies. Mini summaries 
of the computational optical techniques, with associated 
supplementary materials as computational codes, are pre-
sented in the subsequent sections.

2  Incoherent digital holography 
with phase‑shifting interferometry 
(Tatsuki Tahara)

2.1  Background

Digital holography (DH) [21–25] is a technique used to 
record an interference fringe image of the light wave dif-
fracted from an object, termed hologram, and to reconstruct 
a three-dimensional (3D) image of the object. A laser light 
source is generally adopted to obtain interference fringes 
with high visibility. However, a digital hologram can also be obtained with daily-use 
light by exploiting incoherent digital hologra-
phy [26–30]. Using incoherent digital holography (IDH), a
speckleless holographic 3D image of the object is obtained. 
Single-pixel holographic fluorescence microscopy [31], 
lensless 3D imaging [32], the improvement of the point 
spread function (PSF) in the in-plane direction [33], and full-
color 3D imaging with sunlight [34] were experimentally 
demonstrated. In IDH, phase-shifting interferometry (PSI) 
[35] and a common-path in-line configuration are frequently 
adopted to obtain a clear holographic 3D image of the object 
and robustness against external vibrations. I introduce IDH 
techniques using PSI in this section.

2.2  Methodology

Figure 1 illustrates the schematic of the phase-shifting 
IDH (PS-IDH) and configurations of frequently adopted 
optical systems. An object wave generated with spatially 
incoherent light is diffracted from an object. In IDH, the self-
interference phenomenon is applied to generate an inco-
herent hologram from spatially incoherent light. Optical 
elements for generating two object waves whose wavefront 
curvature radii are different are set to obtain a self-inter-
ference hologram as shown in Fig. 1a. A phase modulator 
is set to shift the phase of one of the object waves, and an 
image sensor records multiple phase-shifted incoherent 
digital holograms. The complex amplitude distribution in 
the hologram is retrieved by PSI, and the 3D information 
of the object is reconstructed by calculating diffraction 
integrals to the complex amplitude distribution. In PS-
IDH, optical setups of the Fresnel incoherent correlation 
holography (FINCH) type [26, 28, 36], conoscopic holog-
raphy type [37, 38], two-arm interferometer type [34], 
optical scanning holography type [39–41], and two polar-
ization-sensitive phase-only spatial light modulators (TPP-
SLMs) type [42, 43] have been proposed. In a FINCH-type 
optical setup shown in Fig. 1b, a liquid crystal (LC) SLM 




works as both a two-wave generator and a phase modula-
tor for obtaining self-interference phase-shifted incoher-
ent holograms. In FINCH, phase-shifted Fresnel phase-
lens patterns are displayed, and phase-shifted incoherent 
holograms are recorded. Polarizers are frequently set to 
improve the visibility of interference fringes. In a con-
oscopic-holography-type optical setup shown in Fig. 1c, 
instead of an SLM, a solid birefringent material such as 
calcite is adopted as a polarimetric two-wave generator. In 
comparison to the setup of Fig. 1b, this setup suppresses multi-
order diffraction waves when illuminated by a light wave with a 
wide wavelength bandwidth, although the setup size is larger. 
Alternatively, IDH can be implemented
with a classical two-arm interferometer shown in Fig. 1d 
and the wavefront curvature radius of one of the two object 
waves is changed by a concave mirror. Robustness against 
external vibrations is a current research objective. Fig-
ure 1e is a setup adopting optical scanning holography 
[21, 24, 27] and PSI. Phase shifts are introduced using a 
phase shifter such as an SLM [40, 41] before illuminating 
an object, and a phase-shifted Gabor zone plate pattern 
illuminates the object as structured light. The object is 
moved along the in-plane direction, and a photodetector 
records a sequence of temporally varying intensity values 
as the phase shifts are introduced. The structured light pattern
relates the depth position of an object and detected inten-
sity values, and information in the in-plane direction is 
obtained through optical scanning. Spatially incoherent 
phase-shifted holograms are numerically generated from 
the intensity values. The number of pixels and recording
speed are dependent on the optical scanning. As described 
above, PS-IDH systems generally require a polarization 
filter and/or a half mirror. The TPP-SLMs-type optical setup
shown in Fig. 1f does not require these optical elements 
and is constructed to improve the light-use efficiency. Each 
SLM displays the phase distribution containing two spher-
ical waves with different wavefront curvature radii based 
on space-division multiplexing, which is termed spatial 
multiplexing [36]. Phase shifts are introduced to one of 
the two spherical waves to conduct PSI. By introducing 
the same phase distributions and phase shifts for respec-
tive SLMs, self-interference phase-shifted incoherent 
holograms are generated. Phase-shifted incoherent holo-
grams for respective polarization directions are formed 
and multiplexed on the image sensor plane. PS-IDH is 
implemented when the same phase shifts are introduced 
for respective polarization directions. It is noted that 
both 3D and polarization information is simultaneously 
obtained without a polarization filter by introducing differ-
ent phase shifts for respective polarization directions and 
exploiting a holographic multiplexing scheme [42, 43]. 
Single-shot phase shifting (SSPS) [44–46] and the com-
putational coherent superposition (CCS) scheme [47–49] 
are combined with these optical setups when conducting 
single-shot measurement and multidimensional imaging 
with holographic multiplexing, respectively. CCS is a 
multidimension-multiplexed PSI technique, and multiple 
physical quantities such as multiple wavelengths [47–49], 

Fig. 1  Phase-shifting incoherent 
digital holography (PS-IDH). 
a Schematic. b FINCH-type, c 
conoscopic-holography-type, d 
two-arm-interferometer-type, e 
optical-scanning-holography-
type, and f TPP-SLMs-type 
optical setups




multiple wavelength bands [50], and state of polarization 
[42, 43] are selectively extracted by the signal processing 
based on PSI. The detailed explanations for PS-IDH with 
SSPS and CCS are shown in refs. [29, 30].
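
To make the above processing chain concrete, the following minimal Python sketch (written for illustration in this roadmap, not taken from the authors' codes) combines N equally spaced phase-shifted holograms into a complex hologram by PSI and refocuses it numerically with a Fresnel transfer function. The wavelength, distance, and pixel pitch in the commented example are placeholders; the authors' C codes for the multiplexed Fresnel phase lens and for phase shifting are provided in supplementary materials S1 and S2.

import numpy as np

def psi_complex_hologram(frames):
    # Combine N holograms recorded with equally spaced phase shifts 2*pi*k/N (N >= 3).
    n = len(frames)
    thetas = 2 * np.pi * np.arange(n) / n
    return sum(f * np.exp(1j * t) for f, t in zip(frames, thetas))

def fresnel_refocus(hologram, wavelength, z, dx):
    # Propagate the complex hologram by a distance z using the paraxial (Fresnel)
    # transfer function; dx is the pixel pitch.
    npix = hologram.shape[0]
    fx = np.fft.fftfreq(npix, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    H = np.exp(-1j * np.pi * wavelength * z * (FX ** 2 + FY ** 2))
    return np.fft.ifft2(np.fft.fft2(hologram) * H)

# Example with four phase shifts (0, pi/2, pi, 3*pi/2), as used in the experiment below:
# holo = psi_complex_hologram([i0, i90, i180, i270])
# refocused = np.abs(fresnel_refocus(holo, 455e-9, 0.12, 3.45e-6)) ** 2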

2.3  Results

The TPP-SLMs-type optical setup shown in refs. [42, 43] 
was constructed for experimental demonstrations. PS-IDH 
with TPP-SLMs is the IDH system with neither linear 
polarizers nor half mirrors, and a high light-use efficiency 
is achieved. Experiments on PS-IDH and filter-free polari-
metric incoherent holography, termed polarization-filterless 
polarization-sensitive polarization-multiplexed phase-shift-
ing IDH  (P4IDH), which is the combination of PS-IDH and 
CCS, were carried out. Two objects, an origami fan and an 
origami crane, were set at different depths, and a polariza-
tion film was placed in front of the origami fan. The depth 
difference was 140 mm. The transmission axis of the film 
was the vertical direction. In this experiment, I set high-
resolution LCoS-SLMs [51] to display the phase distribution 
of two spatially multiplexed spherical waves whose focal 
lengths are 850 mm and infinity. Four holograms in the 
experiment of PS-IDH and seven holograms in the experi-
ment of P4IDH were obtained with blue LEDs (Thorlabs, 
LED4D201) whose nominal wavelength and full width at 
half maximum were 455 nm and 18 nm, respectively. The 
phase shifts in the horizontal and vertical polarizations of 
the object wave (θ1, θ2) were (0, 0), (π/2, π/2), (π, π), and 
(3π/2, 3π/2) in the experiment of PS-IDH and (3π/2, 0), 
(π, 0), (π/2, 0), (0, 0), (0, π/2), (0, π), and (0, 3π/2) in the 
experiment of  P4IDH, respectively. The magnification set 
by four lenses in the constructed N-shaped self-interference 
interferometer [42, 43] was 0.5, and the field of view for 
an image hologram was 2.66 cm in length. Figure 2 shows
the experimental results. The results indicate that clear 3D 
image information was reconstructed by PS-IDH and both 
3D information and polarization information on the reflec-
tive 3D objects were successfully reconstructed without the 
use of any polarization filter by exploiting  P4IDH. Depth 
information is obtained in the numerical refocusing, and 
quantitative depth-sensing capability is shown. The images 
of the normalized Stokes parameter S1/S0 shown in Fig. 2g 
and h demonstrate the quantitative imaging capability for polarimetric information.

2.4  Conclusion and future perspectives

Both IDH and PSI are long-established 3D measurement 
techniques. Single-shot 3D imaging [52–54] and multidi-
mensional imaging such as multiwavelength-multiplexed 

3D imaging [55, 56], high-speed 3D motion-picture 
imaging [57], and filter-free polarimetric holographic 3D 
imaging [42, 43] have been experimentally demonstrated, 
merging SSPS and CCS into PS-IDH. Although single-
shot 3D imaging with IDH has also been demonstrated 
by off-axis configurations [26, 58–60], an in-line con-
figuration is frequently adopted in IDH, considering the low 
temporal coherence of daily-use light. Research studies
toward filter-free multidimensional motion-picture IDH 
and real-time measurements, the improvement of speci-
fications such as light-use efficiency and 3D resolution, 
and the development of promising applications described in many 
publications [26–30, 42, 43] are identified as future perspec-
tives. The C-codes for generating the multiplexed Fresnel
phase lens and phase shifting are given in supplementary 
materials S1 and S2 respectively.

Fig. 2  Experimental results of a–c PS-IDH and d–h  P4IDH. a One 
of the phase-shifted holograms. Reconstructed images numerically 
focused on b origami fan and c origami crane. d One of the polari-
zation-multiplexed phase-shifted holograms. Reconstructed intensity 
images numerically focused on e origami fan and f origami crane. 
Reconstructed polarimetric images numerically focused on g origami 
fan and h origami crane. Blue and red colors in g and h indicate that 
the normalized Stokes parameter S1/S0 is negative and positive, respectively, 
according to the scale bar. The exposure times per recording of a
phase-shifted hologram were 100 ms in PS-IDH and 50 ms in  P4IDH. 
i Photographs of the objects showing their sizes




3  Transport of amplitude into phase using 
Gerchberg‑Saxton algorithm for design 
of pure phase multifunctional diffractive 
optical elements (Shivasubramanian 
Gopinath, Joseph Rosen and Vijayakumar 
Anand)

3.1  Background

Multiplexing multiple phase-only diffractive optical func-
tions into a single high-efficiency multifunctional diffractive 
optical element (DOE) is essential for many applications, 
such as holography, imaging, and augmented and mixed 
reality applications [61–66]. When multiple phase functions are combined as Σk exp(jΦk), the resulting function 
is a complex function requiring both phase and amplitude
modulations to achieve the expected result. However, most 
of the available modulators, either phase-only or ampli-
tude-only, make the realization of multifunctional diffrac-
tive elements challenging. Advanced phase mask design 
methods and computational optical methods are needed to 
implement multifunctional DOEs. One of the widely used 
methods is the random multiplexing (RM) method, where 
multiple binary random matrices are designed such that 
the binary states of any mask are mutually exclusive to one 
another. One unique binary random matrix is assigned to 
every diffractive function and then summed [36]. This RM 
approach allows the combination of more than two diffrac-
tive functions in a single phase-only DOE [67]. However, 

the disadvantages of the RM include scattering noise and 
low light throughput. The polarization multiplexing (PM) 
method encodes different diffractive functions to orthogo-
nal polarizations; consequently, multiplexing more than 
two functions in a single phase-only DOE [68] is impos-
sible. Compared to RM, PM has a higher signal-to-noise
ratio (SNR) but relatively lower light throughput due to 
the loss of light at polarizers. In this section, we present a 
recently developed computational algorithm called Trans-
port of Amplitude into Phase using the Gerchberg Saxton 
Algorithm (TAP-GSA) for designing multifunctional pure 
phase DOEs [69].

3.2  Methodology

A schematic of the TAP-GSA is shown in Fig. 3. The TAP-GSA consists of two steps. In the first step, the functions of 
the DOEs are summed as C1 = Σk exp(jΦk), where C1 is a complex function at the mask plane. The complex 
function C1 is propagated to a plane of interest via Fresnel propagation to obtain the complex function C2 = Fr(C1), 
where Fr is the Fresnel transform operator. After the first step, the following functions are extracted: Arg(C1), Arg(C2) 
and |C2|. Next, the GSA begins with a complex amplitude C3 = exp[j·Arg(C1)] at the mask plane, and C3 is propagated 
to the sensor plane using the Fresnel transform. At the sensor plane, the magnitude of the resulting complex function C4 is 
replaced by |C2|, and its phase is partly replaced by Arg(C2). The ratio of the number of pixels replaced by Arg(C2) to the 
total number of pixels is defined as the degrees of freedom (DoF). The resulting complex function C5 is backpropagated 
to the mask plane by an inverse Fresnel transform; its phase is retained, while the amplitude is replaced by a 
uniform matrix of ones. This process is iterated until a non-changing phase matrix is obtained at the mask plane.

Fig. 3  Schematic of TAP-GSA demonstrated here for multiplexing four diffractive lenses with different focal lengths

In FINCH or IDH, it is necessary to create two different 
object beams for every object point. In the first versions of 
FINCH, the generation of two object beams was achieved 
using a randomly multiplexed diffractive lens, where two 
diffractive lenses with two different focal lengths are spa-
tially and randomly multiplexed [36]. Spatial random mul-
tiplexing results in scattering noise, resulting in a low SNR. 
Polarization multiplexing was then developed by polarizing 
the input object beam along 45° of the active axis of a bire-
fringent device, resulting in the generation of two different 
object beams with orthogonal polarizations at the birefrin-
gent device [33]. A second polarizer was mounted before 
the image sensor at 45° with respect to the active axis of the 
birefringent device to cause self-interference. Although the SNR 
improved with polarization multiplexing, the light through-
put decreased. TAP-GSA was implemented to design phase
masks for FINCH [36].
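
As an illustration of the two-step procedure described above, the following NumPy sketch implements a TAP-GSA-style iteration. The single-FFT Fresnel propagator, the random selection of the DoF pixels, the default DoF value, and the iteration count are assumptions of this sketch; the authors' commented MATLAB code is provided in supplementary material S3.

import numpy as np

def fresnel_propagate(field, wavelength, z, dx):
    # Propagate a complex field by a distance z with the paraxial Fresnel transfer function.
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    H = np.exp(-1j * np.pi * wavelength * z * (FX ** 2 + FY ** 2))
    return np.fft.ifft2(np.fft.fft2(field) * H)

def tap_gsa(phases, wavelength, z, dx, dof=0.9, iterations=50, seed=None):
    # phases: list of 2D phase profiles Phi_k to be multiplexed into one phase-only DOE.
    rng = np.random.default_rng(seed)
    # Step 1: complex sum at the mask plane and its Fresnel propagation.
    C1 = np.sum([np.exp(1j * p) for p in phases], axis=0)
    C2 = fresnel_propagate(C1, wavelength, z, dx)
    arg_C1, arg_C2, mag_C2 = np.angle(C1), np.angle(C2), np.abs(C2)
    # Pixels whose sensor-plane phase is enforced (their fraction is the DoF).
    replace_mask = rng.random(C1.shape) < dof
    # Step 2: Gerchberg-Saxton-type iteration with partial phase replacement.
    mask_phase = arg_C1
    for _ in range(iterations):
        C3 = np.exp(1j * mask_phase)                       # phase-only mask
        C4 = fresnel_propagate(C3, wavelength, z, dx)      # to the sensor plane
        phase4 = np.angle(C4)
        phase4[replace_mask] = arg_C2[replace_mask]        # partial phase replacement
        C5 = mag_C2 * np.exp(1j * phase4)                  # enforce target magnitude
        back = fresnel_propagate(C5, wavelength, -z, dx)   # back to the mask plane
        mask_phase = np.angle(back)                        # keep phase, unit amplitude
    return mask_phase

For example, phases could hold four diffractive lens phase profiles, −π(x² + y²)∕(λ fk), with different focal lengths fk, as in Fig. 3.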

3.3  Results

The optical configuration of FINCH is shown in Fig. 4. Light 
from an incoherently illuminated object is collected and col-
limated by lens L with a focal length of f1 at a distance of z1. 
The collimated light is modulated by a spatial light modula-
tor (SLM) on which dual diffractive lenses with focal lengths 
f2 = ∞ and f3 = z2/2 are displayed, and the holograms are 
recorded by an image sensor located at a distance of z2 from 
the SLM. The light from every object point is split into two 
waves that self-interfere to obtain the FINCH hologram. Two 
polarizers, one before and one after the SLM, are used so 
that the same setup can implement, one at a time, FINCH with 
spatial multiplexing using RM or TAP-GSA 
and FINCH with polarization multiplexing. For RM
and TAP-GSA, the multiplexed lenses are displayed with P1 
and P2 oriented along the active axis of the SLM. For the 

PM, P1 and P2 are oriented at 45° with respect to the active
axis of the SLM, and a single diffractive lens is displayed on 
the SLM. The experiment was carried out with a high-power 
LED (Thorlabs, 940 mW, λ = 660 nm and Δλ = 20 nm), 
SLM (Thorlabs Exulus HD2, 1920 × 1200 pixels, pixel 
size = 8 μm) and image sensor (Zelux CS165MU/M 1.6 MP 
monochrome CMOS camera, 1440 × 1080 pixels with pixel 
size ~ 3.5 µm) with distances  z1 = 5 cm and  z2 = 17.8 cm. 
The images of the phase masks, FINCH holograms for the 
three phase shifts θ = 0, 120, and 240 degrees, magnitude
and phase of the complex hologram, and reconstruction 
results obtained by Fresnel propagation for the RM, TAP-
GSA, and PM are shown in Fig. 5. The average background 
noise values of RM, TAP-GSA, and PM are 3.27 × 10⁻³, 2.32 × 10⁻³, 
and 0.41 × 10⁻³, respectively. The exposure times needed to
achieve the same signal level in the image sensor for RM, 
TAP-GSA, and PM were 440, 384, and 861 ms, respectively. 
Comparing all three approaches, TAP-GSA has better light 
throughput than both RM and PM and has an SNR better 
than that of RM and close to that of PM.

3.4  Conclusion and future perspectives

The useful computational algorithm TAP-GSA was devel-
oped to combine multiple phase functions into a single 
phase-only function. The algorithm has been demonstrated 
on FINCH to improve both the SNR and the light through-
put. We believe that the developed algorithm will benefit 
many research areas, such as beam shaping, optical trapping, 
holography and augmented reality. The MATLAB code with 
comments is provided in the supplementary materials S3.

4  PSF engineering for Fresnel incoherent 
correlation holography (Francis 
Gracy Arockiaraj, Saulius Juodkazis 
and Vijayakumar Anand)

4.1  Background

In the previous sections, FINCH was implemented based 
on the principles of IDH with self-interference, three to 
four camera shots with phase-shifting and reconstruction 
by back propagation of the complex hologram. FINCH is 
a linear, shift-invariant system and therefore FINCH can 
also be implemented based on the principles of coded aper-
ture imaging (CAI). The FINCH hologram for an object is 
formed by the summation of shifted FINCH point responses. 
Therefore, if the point spread hologram (IPSH) library is 
recorded at different depths, then they can be used as recon-
struction functions of FINCH object hologram (IOH) at those 
depths [70–77]. This FINCH as CAI replaced the multiple 
camera recordings by a one-time calibration procedure 
involving the recording of the IPSH library. However, this
approach of FINCH as CAI has the challenges associated 
with CAI. One of the challenges in the implementation is 
that the lateral resolution in CAI is governed by the size of 
the pinhole used for recording the IPSH instead of the numeri-
cal aperture (NA) [70–79]. It is possible to record the IPSH 
with a pinhole with a smaller diameter that is close to the lat-
eral resolution limit governed by the NA. However, with a smaller 
aperture, fewer photons are collected and the noise increases. 
In this section, we present a recently developed PSH 
engineering technique that improves the reconstruc-
tions of FINCH as CAI [80].

Fig. 4  Optical configuration of FINCH

4.2  Methodology

The optical configuration of FINCH as CAI using Lucy-
Richardson-Rosen algorithm (LRRA) is shown in Fig. 6a. 
The light from the object point is split into two beams differ-
ently modulated using phase masks created from the TAP-
GSA displayed on the SLM, and the two beams are then 
interfered to form a self-interference hologram. The IPSH and 
IOH holograms are required to reconstruct object information 
using the LRRA. In the PSH engineering technique, the IPSH 
is recorded, in the first step, using a pinhole that allows a sufficient 
number of photons to record a hologram with minimal detector 
noise. In the next step, the ideal PSH IIPSH for
a single point is synthesized from the IPSH recorded for the 
large pinhole and direct image of the pinhole using LRRA. 
The engineered PSH IIPSH is given as IIPSH = IPSH ⊛^{α,β}_p ID, 
where ⊛^{α,β}_p is the LRRA operator and ID is the direct image of the 
pinhole. The LRRA operator consists of three parameters, α, 
β and p, which are the powers of the magnitudes of the spectra 
of the matrices and the number of iterations, respectively, 
as shown in Fig. 6b. The synthesized IIPSH and IOH are used 
for reconstructing the object information in the final step as 
IR = IIPSH ⊛^{α,β}_n IOH. With IIPSH and the recorded IOH, the object 
is reconstructed with an improved resolution and signal-to-
noise ratio (SNR).
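
A schematic NumPy sketch of an LRRA-style reconstruction consistent with the description above is given below: a Lucy-Richardson loop in which the correlation step is replaced by a non-linear correlation with spectral-magnitude powers α and β, repeated for p iterations. Initialization and normalization choices are assumptions of this sketch; the authors' MATLAB implementation is provided in supplementary material S4.

import numpy as np

def nlr_correlate(a, b, alpha, beta):
    # Non-linear (Fourier-domain) correlation of a with b, with magnitude powers alpha, beta.
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    spectrum = (np.abs(A) ** alpha) * np.exp(1j * np.angle(A)) * \
               (np.abs(B) ** beta) * np.exp(-1j * np.angle(B))
    return np.fft.ifft2(spectrum)

def lrra(i_oh, i_psh, alpha=0.4, beta=1.0, p=10, eps=1e-12):
    # Reconstruct the object from the object hologram i_oh and the point response i_psh.
    recon = i_oh.astype(float).copy()                      # initial estimate
    otf = np.fft.fft2(i_psh)
    for _ in range(p):
        estimate = np.abs(np.fft.ifft2(np.fft.fft2(recon) * otf))   # forward blur
        ratio = i_oh / (estimate + eps)                    # Lucy-Richardson ratio
        correction = np.abs(nlr_correlate(ratio, i_psh, alpha, beta))
        recon = recon * correction                         # multiplicative update
    return recon

The default values α = 0.4, β = 1, and p = 10 follow the experimental settings quoted in the Results below.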

A simulation study of FINCH as CAI was carried out and 
the results are shown in Fig. 7. The simulation was carried 
out in MATLAB. An image of USAF 1951 (Fig. 7a) was 
used as a test object for the simulation studies. The  IPSH for 
a point object with a size equivalent to the lateral resolu-
tion and a point object 2.5 times larger than the point

Fig. 5  Phase masks and FINCH holograms of the USAF target for 
θ = 0, 120, and 240 degrees, magnitude and phase of the complex 
hologram and reconstruction result by Fresnel propagation. Rows 1, 
2, and 3 are the results for the RM, TAP-GSA, and polarization multi-
plexing, respectively

Fig. 6  a Optical configuration of FINCH as CAI. b Schematic of 
LRRA 




object are shown in Fig. 7b and c respectively. The  IIPSH 
synthesized from Fig. 7c and the direct image of the pinhole 
using LRRA is shown in Fig. 7d. The object hologram  IOH is 
shown in Fig. 7e. The reconstruction results using Fig. 7b–d 
are shown in Fig. 7f–h respectively. As seen from the results, 
the PSH engineering approach recovers more object information and a better 
SNR than the results obtained using the PSH recorded with 
a large pinhole.

4.3  Results

An optical experiment similar to Sect. 3 was carried out 
but instead of three camera shots, single camera shots of a 
pinhole with a diameter of 50 µm and of a USAF object, the digit 
'1' from Group 5, were recorded. The images of the phase
mask designed using TAP-GSA with a 98% DoF, recorded 
IPSH and engineered IIPSH are shown in Fig. 8a–c respec-
tively. The reconstruction results using LRRA for IPSH and 
engineered IIPSH for α = 0.4, β = 1 and p = 10 are shown in 
Fig. 8d and e respectively. The direct imaging result of the 
USAF object is shown in Fig. 8f. From the results shown in 
Fig. 8d–f, it can be seen that the result of PSH engineering 
has better SNR and more object information compared to 
the result obtained for a PSH recorded using a large pinhole. 

4.4  Conclusion and future perspectives

The lateral resolution of all imaging systems is governed 
by the NA of the system. However, in CAI, there is a sec-
ondary resolution limit given by the size of the pinhole 
that is used to record the PSF. This secondary resolution 
is usually lower than the NA defined lateral resolution. 
When FINCH is implemented as CAI, the above limita-
tion ruins one of the most important advantages of FINCH 

Fig. 7  Simulation results of FINCH as CAI. a Test object, b simu-
lated ideal IPSH, c IPSH simulated with a point object 2.5 times that of 
NA defined lateral resolution, d engineered IIPSH, e IOH. Reconstruc-
tion results for f ideal IPSH, g IPSH simulated with a point object 2.5
times that of NA defined lateral resolution and h engineered IIPSH

Fig. 8  Experimental results of FINCH as CAI. a FINCH phase mask 
for DoF 98%, b recorded IPSH for 50 μm, c Engineered IPSH, d recon-
struction result of (b), e reconstruction result of (c), f direct imaging 
result




which is its lateral super-resolution capability. A PSH engineering
method has been developed to shift the resolution limit of 
CAI back to the limit defined by the NA. A recently devel-
oped algorithm LRRA was used for this demonstration. 
However, the developed PSH engineering method can also 
work with other reconstruction methods such as non-linear 
reconstruction [81], Wiener deconvolution [82] and other
advanced non-linear deconvolution methods [83]. While 
the PSH engineering approach improved the reconstruc-
tion results, advanced reconstruction methods are needed 
to minimize the differences in SNR between reconstruc-
tions of ideal PSH and synthesized ideal PSH. The PSH 
engineering method is not limited to FINCH as CAI but 
can be applied to many CAI methods [84]. The MATLAB 
code for implementing the PSH engineering method using 
LRRA is given in the supplementary section S4.

6  Single molecule localization 
from self‑interference digital holography 
(Shaoheng Li and Peter Kner)

6.1  Background

Single Molecule Localization Microscopy (SMLM) has 
emerged as a powerful technique for breaking the diffraction 
limit in optical microscopy, enabling the precise localiza-
tion—typically to less than 20 nm—of individual fluores-
cent molecules within biological samples [85]. However, 
the maximum depth of field for 3D-SMLM so far is still lim-
ited to a few microns. Self-interference digital holography 
(SIDH) can reconstruct images over an extended axial range 
[26]. We have proposed combining SIDH with SMLM to 
perform 3D super-resolution imaging with nanometer preci-
sion over a large axial range without mechanical refocusing. 
Previous work from our group has experimentally demon-
strated localization of fluorescent microspheres using SIDH 
from only a few thousand photons [86–88]. SIDH produces 
a complex hologram from which the full 3D image of the 
emitter can be recreated. By determining the center of this 
3D image, the emitter can be localized. Here, we describe 
the algorithm for localizing emitters from the SIDH data.

6.2  Methodology

Three raw images of one or a few emitters are collected with 
added phase shifts of 120° introduced between the two
arms of the interferometer. The PSH is then calculated using 
the standard formula which eliminates the background and 
twin image [36]. The PSF can then be calculated from the 
PSH by convolution with the kernel exp(jπρ²∕λzr), where zr 
is the reconstruction distance of the emitter image. By recon-
structing 2D images as zr is varied, a 3D image stack can be 

created. Reconstruction of the in-focus PSF requires knowl-
edge of the emitter axial location. Therefore, to reconstruct 
and localize an arbitrary emitter, a coarse axial search must 
first be done by varying zr . The PSF is located by finding the 
approximate intensity maximum over the z-stack [86]; the 
axial step should be chosen to be smaller than the PSF axial width. 
Then, the 3D PSF of the emitter can be reconstructed with a 
finer axial step—100 nm for our experiments. For other 3D 
SMLM techniques, the axial localization is determined by 
PSF shape or by comparing two different PSF images [89, 
90]. Because SIDH provides access to the full 3D PSF, the 
center of emission can be localized in all 3 dimensions using 
the same approach. The 3D centroid can be calculated, or 
maximum likelihood estimation can be used to determine 
the center of a three-dimensional Gaussian approximation to 
the PSF [91]. Here, we localize the center of the PSF by per-
forming two-dimensional curve-fitting. 2D xy and yz slices 
are cut through the maximum intensity pixel and Gauss-
ian fits are performed. The curve-fits yield the center of the 
Gaussian, (xc, yc, zc), the size of the Gaussian, (σx, σy, σz), 
and the total signal.
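
The following Python sketch illustrates the pipeline described above: the complex PSH is built from the three 120°-shifted frames with the standard three-step combination [36], a z-stack is reconstructed by Fresnel propagation, and a 2D Gaussian is fitted around the brightest voxel. The sampling values, the fit-window size, and the omission of the yz fit are simplifications of this sketch; the authors' full Python code is given in supplementary material S5 and on GitHub [92].

import numpy as np
from scipy.optimize import curve_fit

def complex_psh(i0, i120, i240):
    # Standard three-step combination (phase shifts 0, 120, 240 deg) that removes the
    # bias and twin-image terms [36].
    t1, t2, t3 = 0.0, 2 * np.pi / 3, 4 * np.pi / 3
    return (i0 * (np.exp(1j * t3) - np.exp(1j * t2)) +
            i120 * (np.exp(1j * t1) - np.exp(1j * t3)) +
            i240 * (np.exp(1j * t2) - np.exp(1j * t1)))

def fresnel_stack(psh, wavelength, zs, dx):
    # Intensity reconstructions at the distances zs; the transfer-function form used here
    # is equivalent to convolving the PSH with the quadratic-phase kernel exp(j*pi*rho^2/(lambda*z)).
    n = psh.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    PSH = np.fft.fft2(psh)
    return np.array([np.abs(np.fft.ifft2(PSH * np.exp(-1j * np.pi * wavelength * z
                                                      * (FX ** 2 + FY ** 2)))) ** 2
                     for z in zs])

def gauss2d(coords, amp, x0, y0, sx, sy, offset):
    x, y = coords
    return (amp * np.exp(-((x - x0) ** 2 / (2 * sx ** 2)
                           + (y - y0) ** 2 / (2 * sy ** 2))) + offset).ravel()

def localize(psh, wavelength, zs, dx, win=7):
    # Coarse axial search followed by an x-y Gaussian fit through the brightest voxel
    # (the emitter is assumed to lie away from the array edges).
    stack = fresnel_stack(psh, wavelength, zs, dx)
    kz, ky, kx = np.unravel_index(np.argmax(stack), stack.shape)
    sub = stack[kz, ky - win:ky + win + 1, kx - win:kx + win + 1]
    yy, xx = np.mgrid[-win:win + 1, -win:win + 1]
    p0 = [sub.max(), 0.0, 0.0, 2.0, 2.0, sub.min()]
    popt, _ = curve_fit(gauss2d, (xx, yy), sub.ravel(), p0=p0)
    return (kx + popt[1]) * dx, (ky + popt[2]) * dx, zs[kz]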

6.3  Results

Results are shown in Fig. 9. Figure 9a shows a schematic of 
the optical setup. Figure 9b shows the light-sheet illumina-
tion which is used to reduce background. The hologram is 
created by a Mach–Zehnder interferometer consisting of one 
plane mirror and one concave mirror (f = 300 mm, Edmund 
Optics). The plane mirror is mounted on a piezoelectric 
translation stage (Thorlabs NFL5DP20) to create the phase 
shifts necessary for reconstruction. The objective lens is an 
oil immersion lens (Olympus PlanApoN 60x, 1.42 NA), and 
the camera is an EMCCD camera (Andor Ixon-897 Life). 
The focal length of the tube lens is 180 mm. The focal length 
of L2 is set to f2 = 120 mm. The focal lengths of the relay 
lenses L3 and L4 are f3 = 200 mm and f4 = 100 mm, respec-
tively. The distance from the interferometer to the camera is 
set to 100 mm. Figure 9b shows the light-sheet illumination 
path of SIDH, which is used to reduce background noise 
[88]. The excitation laser beams are initially expanded and 
then shaped using a cylindrical lens with a focal length of 
200 mm (not shown). They are then introduced into the illu-
mination objective and subsequently reflected by the glass 
prism. As the excitation lasers enter the imaging chamber, 
the incident angle of the tilted light-sheet is approximately 
5.6°. The light sheet beam waist at the sample is 3.4 µm. A 
more detailed description of the optical system can be found 
in our earlier work [86–88].

Figure 9c shows results of simulations of the localiza-
tion precision over a 10 µm axial range. With no back-
ground, the localization precision is better than 10 nm 
in the lateral plane, and better than 30 nm in the axial 




direction. In Fig. 9d, the results of imaging a 40 nm micro-
sphere emitting ~ 2120 photons are shown. The PSH is 
shown on the left, and the resulting PSF is shown on the 
right. As can be seen, even with only a couple thousand 
photons, an SNR of 5 can be achieved, demonstrating that
the PSF is bright enough to be localized. In Fig. 9e, the 
results of imaging a 100 nm microsphere emitting ~ 8400 
photons are shown. The microsphere was imaged and 
localized 50 times. A representative PSH is shown on the 
left, and scatter plots of the localizations in lateral and 
axial planes are shown on the right. The standard devia-
tion of the localizations was σx = 22 nm, σy = 30 nm, and 
σz = 38 nm. As can be seen from Fig. 9c, the localization
precision is sensitive to the level of background, and we 
estimate the background level in Fig. 9e to be 13 photons/
pixel.

6.4  Conclusion and future perspectives

We have demonstrated a straightforward algorithm for the 
localization of point sources from SIDH images. With low 
background, SMLM-SIDH can achieve better than 10 nm 
precision in all three dimensions over an axial range greater 
than 10 µm. In future work, we will optimize the reconstruc-
tion process by extracting the fluorophore position directly 
from the hologram without explicitly reconstructing the PSF. 
It should also be possible to capture only one hologram and 
then discard the twin-image based on image analysis. Future 
work will also include incorporating aberration correction 
into the reconstruction process. Single fluorophores emit 
several hundred to several thousand photons, and we plan to 
demonstrate localization of single fluorophores. The Python 
codes for SMLM-SIDH are given in supplementary materi-
als S5 and GitHub [92].

Fig. 9  a Detailed schematic of the imaging path of the optimized 
SIDH setup with a Michelson interferometer. b The custom designed 
sample chamber for the tilted light-sheet (LS) illumination pathway. 
c Simulation results of lateral (top) and axial (bottom) localization 
precision of the optimized SIDH setup with the different background 
noise levels across a 10 µm imaging range. d The hologram of a 40 
nm microsphere imaged with light-sheet illumination (left). Lateral 

(top) and axial (bottom) views of the image reconstructed by back-
propagating the hologram. The SNR was calculated as the ratio of 
mean signal to the standard deviation of the background. e The PSH 
of a 100 nm microsphere (left). Scatter plots of the localizations in 
the xy-plane (middle) and yz-plane (right) of images reconstructed by 
back-propagating the hologram




7  Deep learning‑based illumination 
and detection correction in light‑sheet 
microscopy (Mani Ratnam Rai, Chen Li 
and Alon Greenbaum)

7.1  Background

Light-sheet fluorescence microscopy (LSFM) has become 
an essential tool in life sciences due to its fast acquisition 
speed and optical sectioning capability. As such, LSFM
is widely employed for imaging large volumes of tissue 
cleared samples [93]. LSFM operates by projecting a thin 
light sheet into the tissue, exciting fluorophores, and the 
emitted photons are then collected by a wide-field detec-
tion system positioned perpendicular to the illumination 
axis of the light sheet [93, 94]. The quality of LSFM 
images hinges on the performance of both the illumination 
and detection aspects of the microscopy system. On the 
illumination side: challenges arise from the non-coplanar 
alignment of the illumination beam and the focal plane 
of the detection lens, resulting in uneven focus across 
the field of view (FOV) (Fig. 10a) [95]. In the detection 
side, when imaging deep, the tissue components introduce 
aberrations into the imaging system, particularly when 
imaging complex specimens such as cochlea, bones, or 
whole organisms with transitions from soft to hard tissue 
(Fig. 10b) [94]. Most researchers tend to address either 
the illumination or detection errors independently, often 
neglecting their interconnected nature. In this research, 
we systematically quantified the correction procedures for 

both illumination and detection errors. Then, we devel-
oped two distinct deep learning methods: one for illumina-
tion correction and the other for aberration correction on 
the detection side. The proposed system is thoughtfully 
designed to achieve the highest quality 3D imaging with-
out the need for human intervention. 

7.2  Methodology

The initial phase of our research involved establishing the 
order for addressing aberrations, namely, whether to cor-
rect illumination or detection errors first [94]. Following 
this, two distinct deep learning models were developed: one 
for rectifying sample induced detection aberrations and the 
other for addressing illumination errors, simply put, mak-
ing sure that the illumination beam was parallel to and over-
lapped with the detection plane of the objective. In the detection
network, we employed a 13-layer ResNet-based network,
trained and validated on valuable biomedical samples like 
porcine cochlea and mouse brain [96]. During training, data 
are generated by first correcting aberrations using a clas-
sical grid search approach per imaging location. Once the 
aberrations are corrected, a known aberration is introduced 
into the non-aberrated images using a deformable mirror 
(DM), and two defocused images with known aberrations 
are captured. During the testing phase, the network receives 
two defocused images as input and estimates coma, astig-
matism, and spherical aberrations, and the DM is utilized 
to correct the aberrations based on the predictions of the 
network. To correct illumination errors, a U-net-based net-
work was utilized and integrated into our LSFM setup [95]. 

Fig. 10  a Illumination and b detection errors in LSFM. c Experimental schematic for correcting the illumination and detection errors in a 
custom LSFM with a deformable mirror and two galvo mirrors




This algorithm captured two defocused images as well, and 
the images served as input to the deep learning model. The 
network generated a defocus map. Subsequently, this map is 
employed to estimate and rectify angular and defocus aber-
rations through the utilization of two galvo scanners and a 
linear motorized stage (Fig. 10c).
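
As a schematic illustration of the detection-side idea, the following PyTorch sketch defines a small residual network that takes a pair of defocused images (stacked as two channels) and regresses a few Zernike coefficients, which a deformable mirror would then apply. The depth, channel widths, number of coefficients, and training details are placeholders and not the authors' architecture; their code is linked in the Conclusion below and in supplementary material S6.

import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class AberrationNet(nn.Module):
    def __init__(self, n_zernike=5, ch=32):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(2, ch, 7, stride=2, padding=3), nn.ReLU())
        self.blocks = nn.Sequential(*[ResBlock(ch) for _ in range(4)])
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(ch, n_zernike))
    def forward(self, pair):          # pair: (batch, 2, H, W) defocused image pair
        return self.head(self.blocks(self.stem(pair)))

# Training pairs come from the calibration procedure described above: correct the
# aberration with a grid search, then let the deformable mirror add known Zernike
# terms and record two defocused images per known aberration.
model = AberrationNet()
loss_fn = nn.MSELoss()                       # regression against the known coefficients
pred = model(torch.randn(1, 2, 256, 256))    # dummy forward pass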

7.3  Results

The experimental demonstration of the proposed work was 
performed using a custom-built LSFM system (Fig. 10c). 
Tissue cleared brains were used to experimentally demon-
strate the proposed work. We have found that it is better to 
first correct the illumination errors and only then the detec-
tion aberrations. Figure 11a shows the image before and 
after illumination correction. Before the correction, only the 
top portion of the FOV is in focus whereas after the illumi-
nation correction, the entire FOV is in focus as seen in the 
defocus map. The color bar in Fig. 11a shows the defocus 
level. Figure 11b shows the images before and after correc-
tion of detection aberrations.

7.4  Conclusion and future perspectives

In this work, we have developed machine learning-based 
methods to correct illumination and detection errors in
LSFM. The proposed system can estimate errors from two 
defocused images. The developed technique will enable 
fully automated, error-free 3D imaging of large tis-
sue samples without any human intervention. The Python 
codes for illumination correction and detection correction
are given in https://github.com/Chenli235/AngleCorrection_Unet 
and https://github.com/maniRr/Detection-correction
and in supplementary materials S6.

8  Complex amplitude reconstruction 
of objects above and below the objective 
focal plane by IHLLS fluorescence 
microscopy (Christopher Mann, Zack 
Zurawski, Simon Alford, Jonathan Art, 
and Mariana Potcoava)

8.1  Background

The Incoherent Holographic Lattice-Light Sheet (IHLLS) 
technique, which offers essential volumetric data and is 
characterized by its high sensitivity and spatio-temporal 
resolution, contains a diffraction reconstruction package that 
has been developed into a tool, HOLO_LLS, that routinely 
achieves both lateral and depth resolution at least at the micron 
level [28, 97, 98]. The software enables data visualization 
and serves a multitude of purposes ranging from calibra-
tion steps to volumetric imaging of live cells, in which the 
structure and intracellular milieu is rapidly changing, where 
phase imaging gives quantitative information on the state 
and size of subcellular structures [98–100]. This work pre-
sents simple experimental and numerical procedures that
have been incorporated into a program package to highlight 
the imaging capabilities of the IHLLS detection system. This
capability is demonstrated for 200 nm suspension micro-
spheres and the advantages are discussed by comparing 
holographic reconstructions with images taken using
conventional Lattice-Light Sheet (LLS). Our study intro-
duces the two configurations of this optical design: IHLLS 
1L, used for calibration, and IHLLS 2L, used for sample 
imaging. IHLLS 1L, an incoherent version of the LLS, cre-
ates a hologram via a plane wave and a spherical wave using 
the same scanning geometry as the LLS in dithering mode. 
Conversely, IHLLS 2L employs a fixed detection microscope 

Fig. 11  a Illumination correction in LSFM. b Detection correction in LSFM




objective to create a hologram with two spherical waves, 
serving as the incoherent LLS version. By modulating the 
wavefront of the emission beam with two diffractive lenses 
uploaded on the phase SLM, this system can attain full Field 
of View (FOV) and deeper scanning depth with fewer z-gal-
vanometric mirror displacements.

8.2  Methodology

The schematic of the IHLLS detection system is shown in 
Fig. 12. The IHLLS system is home-built hardware 
added to an existing lattice light-sheet instrument. The
IHLLS system is composed of two parts which must both 
operate in order for the system to perform as intended. The 
z-scanning principle in IHLLS 1L, same as in LLS, is that 
both the z-galvanometric mirror (zgalvo) and the detection 
objective (zpiezo) synchronize in motion to scan the sample 
in 3D (Fig. 12a). This case is used for calibration purposes,
to mimic the conventional LLS but using a diffractive lens 
of focal length f_SLM [36, 101]. In the IHLLS 2L case, two 
diffractive lenses of finite focal lengths, with non-shared 
randomly selected pixels (Fig. 12b), are simultaneously
uploaded on the SLM and four phase-shifting intensity 
images with different phase factors are recorded and saved 
in the computer sequentially and numerically processed by 
in-house diffraction software. The complex hologram of an 
object point located at (rs, zs) = (xs, ys, zs), as 
described in [36, 101] but using a four-step phase-shifting 
equation, has the expression HPSH(x, y) = [I(x, y; θ = 0) − I(x, y; θ = π)] − i[I(x, y; θ = π∕2) − I(x, y; θ = 3π∕2)], 
where I(x, y; θk) = C[2 + Q(1∕zr) exp(iθk) + Q(−1∕zr) exp(−iθk)] are the 
intensities of the recorded holograms for each phase shift 
θk, C is a constant, and zr is the reconstruction distance. The 
SLM transparency for the two beams has the expression 
C1Q(−1∕fd1) + C2 exp(iθ)Q(−1∕fd2), where 
Q(b) = exp[iπbλ⁻¹(x² + y²)] is a quadratic phase function, 
C1,2 are constants, fd1 and fd2 are the focal lengths of the two 
diffractive lenses (Fig. 13a and b), designed for a specific emission 
wavelength, and θ is the phase-shift factor of the SLM. The 
two diffractive lenses focus on the planes fp1 and fp2, in 
front of and behind the camera. In the IHLLS 2L technique, 
C1,2 = 0.5 and the phase factor takes four phase shifts, θ = 0, π∕2, π, 3π∕2.
Fig. 12  Schematic of glass-lensless IHLLS detection system. a IHLLS 1L with one diffractive lens; b IHLLS 2L with two diffractive lenses

Fig. 13  Optical configuration of IHLLS [97];  fSLM = 400  mm, 
 fd1 = 435 mm,  fd2 = 550 mm; here, we chose two focal lengths of size 
closer to the calibration focal length




When fd1 = ∞ (Fig. 12a), with an uneven 
distribution of the two constants and only one phase 
factor of θ = 0, the expression becomes 
0.1 + 0.9 exp(iθ)Q(−1∕fSLM); this case refers to the tech-
nique called IHLLS 1L. Phase-shifted intensity images and
hologram reconstructions at multiple z-galvo displacement 
positions from − 40 μm to 40 μm in steps of ∆z = 10 μm were
performed on an experimental dataset of 200 nm polystyrene 
beads acquired with the home-made LLS and IHLLS 
systems.
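
The following NumPy sketch, written for illustration, generates the kind of SLM phase pattern described above, two diffractive lenses Q(−1∕fd1) and Q(−1∕fd2) multiplexed on non-shared, randomly selected pixels with the phase shift θ applied to one of them, and combines four phase-shifted recordings into HPSH. The grid size, pixel pitch, and emission wavelength are assumptions; the authors' MATLAB pseudocodes for the diffractive lenses and the volume reconstruction are given in supplementary material S7.

import numpy as np

def quadratic_phase(n, dx, wavelength, b):
    # Q(b) = exp[i*pi*b*(x^2 + y^2)/lambda] sampled on an n x n grid of pitch dx.
    x = (np.arange(n) - n // 2) * dx
    X, Y = np.meshgrid(x, x)
    return np.exp(1j * np.pi * b * (X ** 2 + Y ** 2) / wavelength)

def ihlls_2l_phase(n, dx, wavelength, fd1, fd2, theta, seed=0):
    # Phase (radians) to upload on the SLM for one phase-shift step of IHLLS 2L.
    # The same seed is used for every theta so the random pixel split stays identical.
    rng = np.random.default_rng(seed)
    pick = rng.random((n, n)) < 0.5                     # non-shared random pixel split
    lens1 = quadratic_phase(n, dx, wavelength, -1.0 / fd1)
    lens2 = quadratic_phase(n, dx, wavelength, -1.0 / fd2) * np.exp(1j * theta)
    return np.angle(np.where(pick, lens1, lens2))

def four_step_psh(i0, i90, i180, i270):
    # H_PSH = [I(0) - I(pi)] - i[I(pi/2) - I(3*pi/2)], as reconstructed above.
    return (i0 - i180) - 1j * (i90 - i270)

# Example: the four SLM patterns for theta = 0, pi/2, pi, 3*pi/2, with fd1 = 435 mm
# and fd2 = 550 mm (Fig. 13); grid, pitch, and wavelength here are placeholders.
patterns = [ihlls_2l_phase(1024, 8e-6, 510e-9, 0.435, 0.550, th)
            for th in (0, np.pi / 2, np.pi, 3 * np.pi / 2)]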

8.3  Results

In this work, we show how to numerically compute IHLLS 
diffraction patterns with the HOLO_LLS package. The entire 
package is implemented in MATLAB or Python. Here, we 
present the MATLAB version. We split the reconstruction 
process into four steps to produce numerical approximation 
of the full electric field (amplitude and phase) of the object: 
(a) addition of all complex fields built by phase shifting 
holography (PSH) at various z_galvo positions to create a 
bigger field; (b) apply parameter optimization to the complex 
wave hologram, a real-space bandpass filter that suppresses 
the pixel noise while retaining information of a character-
istic size, (c) reconstruct the object data from the hologram 
(backpropagate), and (d) 3D volume representation from 
the obtained object data. The diffraction subroutine uses the 
Angular Spectrum Method as the Fresnel and Fraunhofer 
regimes are limited by the requirement of a different grid 

scale and by certain approximations [102]. As an example of 
the methods explained, we present MATLAB pseudocodes 
for making diffractive lenses and for the 3D volume recon-
struction from phase-shift holographic images. We hope this 
software improves the reproducibility of research, enabling 
consistent comparison between research groups of both the data 
and the quality of specific numerical reconstructions. The
recorded intensity distributions, amplitude and phase after 
Fresnel propagation and reconstruction results for different 
scanning positions of the z-galvo mirror from 40 µm to − 40 µm
in steps of 10 µm are shown in Fig. 14.
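
For completeness, a compact Python sketch of the Angular Spectrum propagation step used by the diffraction subroutine is shown below; it is an illustrative stand-in (grid, pitch, and wavelength values are placeholders), not the HOLO_LLS package itself.

import numpy as np

def angular_spectrum_propagate(field, wavelength, z, dx):
    # Propagate a sampled complex field by a distance z (meters, may be negative).
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength ** 2 - FX ** 2 - FY ** 2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z) * (arg > 0)                  # evanescent components removed
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Example: backpropagate a complex hologram H_PSH by the reconstruction distance zr
# (a negative z propagates back toward the object plane).
# field_at_object = angular_spectrum_propagate(h_psh, 510e-9, -0.05, 8e-6)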

8.4  Conclusion and future perspectives

Our approach will enable automated capture of complex data 
volumes over time to achieve spatial and temporal resolu-
tions to track dynamic movements of cellular structure in 
3D over time. It will enable high temporal resolution of the 
spatial relationships between cellular structures and retain 
both amplitude and phase information in the reconstructed 
images. We have theoretically and practically demonstrated 
the feasibility of the approach to provide a working micro-
scope system. Our next steps will automate 3D scanning and 
IHLLS 2L imaging in multiple wavelengths by sweeping 
excitation through hundreds of z-axis planes. We will then 
fully automate reconstruction software. Our overall goals 
are to integrate phase image acquisition in multiple z planes 
and excitation wavelengths into the existing SPIM software 
suite. The MATLAB pseudocodes for the HOLO_LLS are 
provided in the supplementary materials S7.

Fig. 14  a Z-galvo scanning locations. b IHLLS 2L intensity images at θ = 0. c Amplitude and phase obtained by the Angular Spectrum propaga-
tion method. d Holographic reconstruction




9  Sparse‑view computed tomography 
for passive two‑dimensional ultrafast 
imaging (Yingming Lai and Jinyang Liang)

9.1  Background

Sparse-view computed tomography (SV-CT) is an 
advanced computational method to obtain the three-
dimensional (3D) internal spatial structure [i.e., (x, y, z)] 
of an object from a few angle-diverse projections [103]. 
Compared with traditional CT, SV-CT effectively reduces 
acquisition time while minimally compromising imaging
quality. Since its invention, SV-CT has been prominently 
applied in scenarios such as x-ray medical imaging and 
industrial product testing to reduce the radia-
tion dose received by patients and samples [104–106]. 
In recent years, SV-CT has begun to be noticed as an 
advanced imaging strategy for efficiently recording spati-
otemporal information [i.e., (x, y, t)] [107, 108]. Despite 
enabling ultrafast imaging speeds, these techniques are 
based on active laser illumination, making them unsuitable 
for self-luminous and color-selective dynamic scenes. 
In this section, we present a newly developed compressed
ultrafast tomographic imaging (CUTI) method by apply-
ing SV-CT to the spatiotemporal domain with passive 
projections.

9.2  Methodology

CUTI achieves spatiotemporal SV-CT based on streak 
imaging whose typical configuration includes three parts: 
an imaging unit, a temporal shearing unit, and a two-dimen-
sional (2D) detector. As shown in Fig. 15, after being imaged 
by the imaging unit, the dynamic scene I(x, y, t) is deflected 
to different spatial positions on the detector by the temporal 
shearing unit [109]. The multiple-scale sweeping speeds, 
accessible by the shearing unit, enable the passive projec-
tions of the (x, y, t) datacube from different angles in the 
spatiotemporal domain [110]. 

The projection angle is determined by the maximum 
resolving capability of CUTI in both the spatial and tem-
poral dimensions. Particularly, the dynamic information is 
spatiotemporally integrated into each discrete pixel on the 
2D detector after the temporal shearing. Thus, the size of 
discrete pixels (denoted by pc ) and the maximum shearing 
velocity (denoted by vmax ) determine the maximum resolv-
ing capability of CUTI in the t-axis. During the observation 
window of vmax determined by the sweep time (denoted by 
ts ), CUTI’s sequence depth (i.e., the number of frames in 
the recovered movie) is calculated by Lt = |vmax| ts∕pc. In 
the i-th acquisition, the streak length in the spatial direction 
(e.g., the y-axis) is expressed by Ls = vi ts∕pc, where vi is the 
shearing velocity in the i-th acquisition (i = 1, 2, 3, …, N). 

Fig. 15  Operating principle of 
compressed ultrafast tomo-
graphic imaging (CUTI). TTR, 
TwIST-based tomographic 
reconstruction. Inset in the 
dashed box: illustrations of 
the equivalent spatiotemporal 
projections in data acquisition. 
Adapted with the permission 
from Ref. [110]




Hence, the spatiotemporal projection angle, denoted by θi, 
is determined by

θi = tan⁻¹(Ls∕Lt) = tan⁻¹(vi∕|vmax|).  (1)

The sparse projections at different angles of a dynamic 
event I(x, y, t) can be expressed as

E = [FS I(x, y, t)],  (2)

where E = [E1, E2, …, EN]ᵀ is the set of streak measure-
ments, F is the operator of spatiotemporal integration, and 
S = [S1, S2, …, SN]ᵀ is the set of temporal shearing opera-
tions corresponding to the various projection angles.

The image reconstruction of CUTI is based on the 
framework of SV-CT and the two-step iterative shrink-
age/thresholding (TwIST) algorithm [111]. The acquired 
sparse projections are input into a TwIST-based tomo-
graphic reconstruction (TTR) algorithm (detailed in 
the Supplementary information). With an initialization 
Î0 = (FS)ᵀE, the dynamic scene can be recovered by solv-
ing the optimization problem

Î = argmin_I (1∕2)‖E − FS I‖₂² + τΦTV(I),  (3)

where Î is the reconstructed datacube of the dynamic scene, 
τ is the regularization parameter, and ΦTV(⋅) is the 3D total-
variation regularization function [112].
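
To illustrate Eqs. (1) and (2), the following toy NumPy sketch applies the forward model of CUTI to a test datacube: each acquisition shears the (x, y, t) datacube along y in proportion to the shearing velocity vi and integrates over time, producing one streak measurement Ei per projection angle θi. The datacube size and velocities are placeholders, and the TwIST-based TTR reconstruction itself is not reproduced here (it is detailed in the Supplementary information, as noted above).

import numpy as np

def shear_and_integrate(scene, shear_pixels_per_frame):
    # scene: (T, Ny, Nx) datacube. Returns the streak image E_i on an extended y-axis.
    T, Ny, Nx = scene.shape
    max_shift = int(round(abs(shear_pixels_per_frame) * (T - 1)))
    streak = np.zeros((Ny + max_shift, Nx))
    for t in range(T):
        shift = int(round(shear_pixels_per_frame * t))
        if shift < 0:
            shift += max_shift                            # keep indices non-negative
        streak[shift:shift + Ny, :] += scene[t]           # temporal shearing + integration
    return streak

def projection_angle(v_i, v_max):
    # Equation (1): spatiotemporal projection angle (degrees) for shearing velocity v_i.
    return np.degrees(np.arctan2(v_i, abs(v_max)))

# Example: 11 projections with velocities spanning [-v_max, +v_max] (angles spanning
# roughly -45 deg to +45 deg), applied to a random test datacube.
scene = np.random.rand(64, 32, 32)
velocities = np.linspace(-1.0, 1.0, 11)                   # in pixels per frame, v_max = 1
projections = [shear_and_integrate(scene, v) for v in velocities]
angles = [projection_angle(v, 1.0) for v in velocities]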

9.3  Results

The performance of CUTI was demonstrated using an 
image-converter streak camera to capture an ultrafast ultra-
violet (UV) dynamic event [110]. Figure 16a illustrates 
the generation of two spatially and temporally separated 
266-nm, 100-fs laser pulses via a Michelson interferom-
eter, with a 1.6-ns time delay introduced between them. 
These pulses undergo modulation by a resolution target as 


Fig. 16  Capture of two spatially and temporally separated UV pulses by
implementing CUTI to a standard UV streak camera. a Experimental 
setup. M1 − M2: mirrors. Magenta-boxed inset: the reference image 
was captured without using temporal shearing. b Representative 
frames of the reconstruction scenes. c Selected cross-sections of the 

resolution target in the x- and y-direction at t = 150 ps. (d) As (c), but 
shows the profiles t = 1746 ps. e Temporal trace of the reconstruction. 
FWHM full width at half maximum. Adapted with permission from 
Ref. [110]




shown in the inset of Fig. 16a. Subsequently, 11 spatiotem-
poral projections were acquired within the angular range 
θi ∈ [− 45°, + 45°] employing a 9° angular step. By setting 
the regularization parameter to τ = 0.0204, the event was 
successfully reconstructed using the TTR algorithm at an 
imaging speed of 0.5 trillion (0.5 ×  1012) frames per sec-
ond. Figure 16b represents six representative frames in the 
reconstruction of the two pulses. To quantitatively assess 
the image quality, selected cross-sections were extracted 
at the first pulse (at 150 ps) and the second pulse (at 
1746 ps). These results were compared with the reference 
image captured without temporal shearing (Fig. 16c, d). 
Using a 10% contrast threshold, at t = 150 ps, the spatial 
resolutions were determined as 15.6 and 14.1 lp/mm in 
the x- and y-directions, respectively. At t = 1746 ps, the 
values were 13.2 and 14.1 lp/mm. Figure 16e shows the 
reconstructed temporal trace of this event. 

9.4  Conclusion

As a new computational ultrafast imaging method, CUTI 
grafts SV-CT to the spatiotemporal domain. The method 
has been demonstrated in a standard image-converter 
streak camera for passively capturing an ultrafast UV 
dynamic event. In the future, CUTI’s image quality can be 
improved by using an image rotation unit for a larger angu-
lar range [113] and adopting advanced SV-CT algorithms 
[114, 115]. CUTI is expected to contribute to the observa-
tion of many significant transient phenomena [116, 117].

10  Computational reconstruction 
of quantum objects by a modal approach 
(Fazilah Nothlawala, Chané Moodley 
and Andrew Forbes)

10.1  Background

Optical imaging and holography have traditionally been 
based on exploiting correlations in space, for instance, using 
position or pixels as the basis on which to measure. Subse-
quently, structured illumination with computational recon-
struction [118] has exploited the orthogonality in random and 
Walsh-Hadamard masks, implemented for high-quality 3D 
reconstruction of classical objects [119] as well as complex 
amplitude (amplitude and phase) reconstruction of quantum 
objects [120]. Recently, a modal approach has been suggested 
to enhance the resolution in imaging [121], taking the well-
known ability to modally resolve optical fields for their full 
reconstruction [122] to that of physical and digital objects. 
This has been used to infer step heights with nanometer reso-
lution [123], to resolve quantum objects [124], in quantum 
metrology [125], in phase imaging [126] and suggested as a 
means of searching for exoplanets [127]. Here, we will apply 
it to reconstruct quantum images of complex objects and com-
pare it to conventional quantum computational approaches.

10.2  Methodology

The idea is very simple: any complete and orthonormal basis 
can be used to reconstruct a function, and this function can 

Fig. 17  a An object can be reconstructed using any complete and 
orthonormal basis. Three different bases are depicted in this figure: 
the pixel basis, random basis and a modal basis, respectively. b Sim-
ple schematic of the experiment where two entangled photons are 

produced from a nonlinear crystal, one directed to the object and the 
other to the mask that displays the basis projections. c Computational 
reconstructions of a cat using four mask options




represent a physical or digital object. In the present context 
it is the image of the object. This is depicted graphically in 
Fig. 17a evolving from a pixel basis (top), to a random basis 
(middle) and finally to a modal basis (bottom). In the case 
of the latter, the modal function must be chosen with some 
care to minimize the number of terms in the sum.
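
For clarity, the sum referred to here can be written in a generic form (a standard modal decomposition, stated here for completeness rather than quoted from the cited works): an object field O(x, y) is expanded in an orthonormal basis {Φn} with complex weights cn,

```latex
% Generic modal decomposition of an object field in an orthonormal basis.
% The complex weights c_n carry both amplitude and phase information.
O(x, y) = \sum_{n} c_n \, \Phi_n(x, y),
\qquad
c_n = \iint \Phi_n^{*}(x, y)\, O(x, y)\, \mathrm{d}x \,\mathrm{d}y .
```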

Because the right hand side can include modal phases, 
any physical property of the left hand side can be inferred, 
including full phase retrieval. We do exactly this for the rec-
ognition of quantum and classical objects using the experi-
mental set-up shown in Fig. 17b. Two entangled photons 
are produced by spontaneous parametric downconversion 
(SPDC) in a nonlinear crystal and relay imaged from the 
crystal plane to the object plane in one arm, and to the image 
plane in the other arm, the latter with a spatial light modula-
tor as a modal analyzer. Thereafter, each photon is collected 
by optical fibre and detected by single photon avalanche pho-
todiodes (APDs). The spatial light modulator in the imaging 
arm is used to display digital match filters for each mode in 
the basis, while the single mode fibre collection performs an 
optical inner product to return the modal weights. The intra-
modal phase is determined by displaying a superposition of 
modal elements, two of which (sine and cosine) allow the 
phase to be known unambiguously. All three measurements 
together (one for amplitude and two for phase) return the 
complex weighting coefficient. The final image is then com-
putationally reconstructed by adding the terms on the right 
hand side with the correct complex weights. The process can 
be augmented by machine learning and artificial intelligence 
tools to speed up the reconstruction (with fewer projections) 
and/or to enhance the final image quality. A simulation of 

the experiment was performed with computational images 
of a “cat” object shown in Fig. 17c for four bases.
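
As a rough classical illustration of this modal pipeline (a simulation sketch, not the quantum experiment itself), the code below decomposes a hypothetical complex test object into Hermite-Gauss modes and re-synthesizes it from the complex weights; the grid size, waist, and number of mode orders are arbitrary assumptions.

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def hg_mode(n, m, x, y, w=1.0):
    """Sampled 2D Hermite-Gauss mode HG_{n,m} with waist w, normalized on the grid."""
    X, Y = np.meshgrid(x, y)
    cx = np.zeros(n + 1); cx[n] = 1.0          # select H_n
    cy = np.zeros(m + 1); cy[m] = 1.0          # select H_m
    mode = hermval(np.sqrt(2) * X / w, cx) * hermval(np.sqrt(2) * Y / w, cy) \
           * np.exp(-(X**2 + Y**2) / w**2)
    return mode / np.sqrt(np.sum(np.abs(mode) ** 2))

# Hypothetical complex object (a crude amplitude-and-phase test pattern)
N = 128
x = y = np.linspace(-4, 4, N)
X, Y = np.meshgrid(x, y)
obj = ((np.abs(X) < 1.5) & (np.abs(Y) < 1.5)) * np.exp(1j * 0.5 * np.pi * (X > 0))

# Modal analysis: discrete inner products play the role of the optical inner
# product returned by the single-mode fibre (one amplitude and two phase
# measurements per mode in the actual experiment)
max_order = 10
recon = np.zeros_like(obj, dtype=complex)
for n in range(max_order):
    for m in range(max_order):
        phi = hg_mode(n, m, x, y, w=1.0)
        c_nm = np.sum(np.conj(phi) * obj)      # complex modal weight c_{nm}
        recon += c_nm * phi                     # add the weighted term to the image
```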

10.3  Results

To illustrate how this approach can be used for quantum 
objects, we use test cases of (I) an amplitude step and check-
erboard pattern object, and (II) a phase step object and 
checkerboard pattern object for both the Walsh-Hadamard 
and HG mode reconstructions, with experimental images 
shown in Fig. 18a and b. The outer area of the dashed 
white circle for each reconstruction represents the region 
where noise was suppressed due to lack of SPDC signal. 
We see that the reconstructed images of both the amplitude and 
phase objects show high fidelity with both reconstruction 
approaches (Walsh-Hadamard and HG modes); however, the 
phase objects show a higher object-image fidelity overall. 
Figure 18c and d provide a quantitative comparison between 
the object (simulated reconstruction) and the image through a 
cross-section, showing good agreement between the object 
(simulated reconstruction) and the experimental reconstruc-
tions for both the Hadamard (blue) and Hermite-Gauss (red) 
amplitude and phase steps, albeit with a low level of noise 
present, characteristic of quantum experiments.

10.4  Conclusion and future perspectives

While scanning methods employing the pixel, Walsh-Had-
amard and random bases depend directly on the number of 
pixels required within the image, the modal approach proves 
beneficial in that there is no direct correlation between the 

Fig. 18  a Amplitude and b 
phase reconstructions for a 
checkerboard pattern and a 
step object (shown as insets), 
using Hermite-Gauss (HG) and 
Walsh-Hadamard masks. The 
outer area of the dashed white 
circle represents the region 
where noise was suppressed 
due to lack of SPDC signal. 
2D cross-sectional plots of 
the c amplitude and d phase 
step functions with the object 
(simulation), and reconstruc-
tions with the Walsh-Hadamard 
(blue diamonds) and HG (red 
dots) masks




number of scans required and image resolution. The reso-
lution is set by the optical system itself, while the number 
of modes required to image the object is dependent on the 
complexity of the object. The modal approach requires a 
judicious choice of modal basis as well as the number of 
terms required to image the object. The introduction of 
phase-only and amplitude-only scanning through a modal 
approach makes it possible to probe individual proper-
ties of an unknown object. The future prospects for com-
putational methods in optical imaging and holography are 
highly promising, with trends indicating integration of AI 
for enhanced image reconstruction, the advancement of 
3D holography with improved resolution, and the poten-
tial impact of quantum techniques. These developments 
will benefit various fields, including bio-photonics, mate-
rial science, and quantum cryptography. The introduction 
of quantum computing and interdisciplinary collaborations 
will likely act as a catalyst for innovation, expanding the 
applications and accessibility of optical imaging and holog-
raphy across industrial and research domains.

11  Label‑free sensing of bacteria 
and viruses using holography and deep 
learning (Yuzhu Li, Bijie Bai and Aydogan 
Ozcan)

11.1  Background

Microorganisms, like bacteria and viruses, play an indis-
pensable role in our ecosystem. While they serve crucial 
functions, such as facilitating the decomposition of organic 
waste and signaling environmental changes, certain micro-
organisms are pathogenic and can lead to diseases like 

anthrax, tuberculosis, influenza, etc. [128]. The replication 
of bacteria and viruses can be detected using culture-based 
methods [129] and viral plaque assays [130], respectively. 
Though these culture-based methods have the unique abil-
ity to identify live and infectious/replicating bacteria and 
viruses, they are notably time-consuming. Specifically, it 
usually requires > 24 h for bacterial colonies to form [129] 
and > 2 days for viral plaques [131] to grow to sizes dis-
cernible to the naked eye. In addition, these methods are 
labor-intensive, and are subject to human counting errors, 
as experts/microbiologists need to manually count the num-
ber of colony-forming units (CFUs) or plaque-forming units 
(PFUs) within the test plates after the corresponding incuba-
tion period to determine the sample concentrations. There-
fore, a more rapid and automated method for detecting the 
replication of bacteria and viruses is urgently needed.

The combination of time-lapse holographic imaging and 
deep learning algorithms provides a promising solution to 
circumvent these limitations. Holographic imaging, regarded 
as a prominent label-free imaging modality, is effective at 
revealing features of transparent biological specimens by 
exploiting the refractive index as an endogenous imag-
ing contrast [17, 132]. Consequently, it can be employed 
to monitor the growth of colonies or viral plaques during 
their incubation process in a label-free manner. This allows 
for the capture of subtle spatio-temporal changes associated 
with colony or viral plaque growth, enabling early detection 
while they are still imperceptible to the human eye. How-
ever, the presence of other potential artifacts (e.g., bubbles, 
dust, and other random features created by the uncontrolled 
motion of the sample surface) can hinder the accurate detec-
tion of true bacterial colonies or viral plaques. To mitigate 
such false positive events, deep learning algorithms become 
critical in automatically differentiating these randomly 

Fig. 19  Workflows used for label-free sensing of bacteria (CFUs) and viruses (PFUs) using time-lapse holographic imaging and deep learning. a 
Workflow for CFU early detection using holography and deep learning. b Workflow for PFU early detection using holography and deep learning




appearing artifacts from true positive events by leveraging 
the unique spatio-temporal features of CFU or PFU growth. 
In this chapter, we will present how the integration of time-
lapse holographic imaging and deep learning enables the 
early detection of bacterial colonies or viral plaques in an 
automated and label-free manner, achieving significant time 
savings compared to the gold-standard methods [133–135].

11.2  Methodology

The primary workflow for detecting CFUs and PFUs using 
holography and deep learning includes key steps such as 
time-lapse hologram acquisition of test well plates, digital 
holographic reconstruction, image processing, and deep 
learning-based CFU/PFU identification and automatic 
counting, as illustrated in Fig. 19. The specific methodolo-
gies employed in each step differ for CFU and PFU detec-
tion, as detailed below.

For the CFU detection [133], as shown in Fig. 19a, after 
the sample is prepared by inoculating bacteria on a chromog-
enic agar plate, it is positioned on a customized lens-free [17, 
136] holographic microscopy device for time-lapse imaging 
that utilizes a digital in-line holographic microscopy con-
figuration. The sample is illuminated by a coherent laser 
source, and the resulting holograms are scanned across the 
entire sample plate by a complementary metal–oxide–semi-
conductor (CMOS) sensor. These captured time-lapse holo-
grams are digitally stitched and co-registered across various 
timestamps to mitigate the effects of random shifts in the 
mechanical scanning process, and digitally reconstructed 
to retrieve both the amplitude and phase channels of the 
observed sample plate. Subsequently, a differential analy-
sis-based image processing algorithm is employed to select 
colony candidates. These candidates are then fed into a CFU 
detection neural network to identify true colonies from non-
colony candidates (e.g., bubbles, dust and other spatial arti-
facts). Following this, a CFU classification neural network is 
subsequently employed to classify true colonies identified by 
the CFU detection network into their specific species. Note 
that the CMOS image sensor in this workflow can also be 
replaced by a thin-film-transistor (TFT) image sensor with 
a much larger imaging field-of-view (FOV) of ∼ 7–10  cm2 
[134]. In this case, the whole FOV of the sample plate can be 
captured in a single shot using the TFT image sensor and the 
obtained holograms are inherently registered across all the 
timestamps, eliminating the need for mechanical scanning, 
image stitching, and image registration steps that are used 
in the CMOS sensor-based system.
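
A toy version of the differential analysis-based candidate selection described above might look as follows; the threshold and minimum-area values are arbitrary placeholders, and the real systems rely on the trained CFU detection network, not this heuristic, for the final true/false colony decision.

```python
import numpy as np
from scipy import ndimage

def select_colony_candidates(phase_stack, diff_thresh=0.3, min_area=9):
    """Toy differential analysis on a registered time-lapse phase stack.

    phase_stack: array of shape (T, H, W) holding reconstructed phase images.
    Returns a label map of candidate regions that changed between frames.
    Threshold and minimum area are illustrative values only.
    """
    diffs = np.abs(np.diff(phase_stack, axis=0))   # frame-to-frame differences
    growth = diffs.sum(axis=0)                     # accumulate temporal change
    mask = growth > diff_thresh                    # keep strongly changing pixels
    labels, n = ndimage.label(mask)                # group pixels into candidates
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
    for idx, size in enumerate(sizes, start=1):    # discard tiny, noise-like regions
        if size < min_area:
            labels[labels == idx] = 0
    return labels

# Each surviving labelled region would then be cropped and passed to the
# CFU detection neural network for true/false colony classification.
```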

For the PFU detection [135], as shown in Fig. 19b, the 
processes of hologram capture and image preprocessing are 
similar to those used in the CFU detection system. However, 

the candidate selection and identification procedures are 
not employed in the PFU detection task. Instead, the recon-
structed time-lapse phase images of the whole test well are 
directly converted into a PFU probability map by applying 
a PFU detection neural network to the whole well image. 
This PFU probability map is further converted to a binary 
detection mask after thresholding by 0.5, revealing the sizes 
and locations of the detected PFUs at a given time point. 
The neural networks employed in these studies utilized a 
DenseNet architecture [137], with 2D convolutional layers 
replaced by Pseudo3D convolutional blocks [138] to better 
accommodate time-lapse image sequences. Nonetheless, the 
network structures suitable for similar work can be changed 
to more advanced architectures to meet the specific require-
ments of different detection targets.
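
The probability-map-to-mask conversion described above reduces to a single thresholding step; the sketch below (with a placeholder array standing in for the network output) also extracts the sizes and centroid locations of the detected plaques.

```python
import numpy as np
from scipy import ndimage

# prob_map: per-pixel PFU probability predicted by the detection network;
# a random array is used here purely as a stand-in for real network output
prob_map = np.random.rand(512, 512)

detection_mask = prob_map > 0.5                      # threshold at 0.5, as described above
labels, n_pfu = ndimage.label(detection_mask)        # one label per detected PFU
sizes = ndimage.sum(detection_mask, labels, index=np.arange(1, n_pfu + 1))
centroids = ndimage.center_of_mass(detection_mask, labels, index=np.arange(1, n_pfu + 1))
```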

11.3  Results

Following the workflows described above, the presented 
CFU detection system based on the CMOS image sensor 
showcased its capability to detect ~ 90% of the true colo-
nies within ~ 7.8 h of incubation for Escherichia coli (E. 
coli), ~ 7.0 h for Klebsiella pneumoniae (K. pneumoniae), 
and ~ 9.8 h for Klebsiella aerogenes (K. aerogenes) when 
tested on 336, 339, and 280 colonies for E. coli, K. pneu-
moniae, and K. aerogenes, respectively. Compared to the 
gold-standard Environmental Protection Agency (EPA) 
approved culture-based methods (requiring > 24 h of incu-
bation), this system achieved time savings of > 12 h [133]. 
As for the TFT sensor-based CFU detection system with 
simplified hardware and software design [134], its detec-
tion time was slightly longer compared to the CMOS image 
sensor-based system, attributed to the larger pixel size of the 
TFT sensor (~ 321 μm). When tested on 85 E. coli colonies, 
114 K. pneumoniae, and 66 Citrobacter colonies, this TFT 
sensor-based CFU detection system achieved ~ 90% detec-
tion rate within ~ 8.0 h for E. coli, ~ 7.7 h of incubation for 
K. pneumoniae, and ~ 9.0 h for Citrobacter.

Regarding the automated colony classification task, the 
CMOS sensor-based CFU detection system correctly classi-
fied ~ 80% of all the colonies into their species within ~ 8.0 
h, ~ 7.6 h, and ~ 12.0 h for E. coli, K. pneumoniae, and K. 
aerogenes, respectively. In contrast, the TFT sensor-based 
CFU system was able to classify the detected colonies into 
either E. coli or non-E. coli coliforms (K. pneumoniae and 
Citrobacter) with an accuracy of > 85% within ~ 11.3 h 
for E. coli, ~ 10.3 h for K. pneumoniae, and ~ 13.0 h for 
Citrobacter.

Regarding the PFU detection system, when evaluated 
on vesicular stomatitis virus (VSV) plates (containing a 
total of 335 VSV PFUs and five negative control wells), the 




presented PFU detection system was able to detect 90.3% 
of VSV PFUs at 17 h, reducing the detection time by > 24 h 
compared to the traditional viral plaque assays that need 
48 h of incubation, followed by chemical staining—which 
was eliminated through the label-free holographic imaging 
of the plaque assay. Moreover, after simple transfer learn-
ing, this method was demonstrated to successfully general-
ize to new types of viruses, i.e., herpes simplex virus type 
1 (HSV-1) and encephalomyocarditis virus (EMCV). When 
blindly tested on 6-well plates (containing 214 HSV-1 PFUs 
and two negative control wells), it achieved a 90.4% HSV-1 
detection rate at 72 h, marking a 48 h reduction compared 
to the traditional 120-h HSV-1 plaque assay. For EMCV, a 
detection rate of 90.8% was obtained at 52 h of incubation 
when tested on 6-well plates (containing 249 EMCV PFUs 
and two negative control wells), achieving 20 h of time-
saving compared to the traditional 72-h EMCV plaque assay. 
Notably, across all detection time points, no false positives 
were detected in any of the test wells.

11.4  Conclusion and future perspectives

By leveraging deep learning and holography, the CFU and 
PFU detection systems discussed in this chapter achieved 
significant time savings compared to their gold-standard 
methods. The entire detection process was fully automated 
and performed in a label-free manner—without the use of 
any staining chemicals. We believe these automated, label-
free systems are not only advantageous for rapid on-site 
detection but also hold promise in accelerating bacterial and 
virological research, potentially facilitating the development 
of antibiotics, vaccines, and antiviral medications.

12  Accelerating computer‑generated 
holography with sparse signal models 
(David Blinder, Tobias Birnbaum, Peter 
Schelkens)

12.1  Background

Computer-generated holography (CGH) comprises many 
techniques to simulate light diffraction for holography 
numerically. CGH has many applications for holographic 
microscopy and tomography [22], display technology [139], 
and especially for computational imaging [140]. CGH is 
computationally costly because of the properties of diffrac-
tion: every point in the imaged or rendered scene will emit 
waves that can affect all hologram pixels. That is why a mul-
titude of algorithms have been developed to accelerate and 
accurately approximate these calculations [141].

One particular set of techniques of interest is sparse CGH 
algorithms. These encode the wavefield in a well-chosen 
transform space where the holographic signals to be com-
puted are sparse; namely, they only require a small number 
of coefficient updates to be accurate. That way, diffraction 
calculations can be done much faster, as only a fraction of 
the total coefficients will be updated. Examples include the 
use of the sparse FFT [142], wavefront recording planes 
that express zone plate signals in planes close to the virtual 
object, resulting in limited spatial support, and coefficient-
shrinking methods such as WASABI relying on wavelets 
[143].

A transform that has been especially effective in repre-
senting holographic signals is the Short-time Fourier trans-
form (STFT). Unlike the standard Fourier transform, the 
STFT determines the frequency components of localized sig-
nal sections as it changes over time (or space). One impor-
tant reason for its effectiveness in holography is that the 
impulse response of the diffraction operator is highly sparse 
in phase space, expressible as a curve in time–frequency 
space [144]. This has been shown to be effective for STFT-based 
CGH with coefficient shrinking [144] and the use of phase-
added stereograms [145, 146].

Recently, the Fresnel diffraction operator itself was accel-
erated using Gabor frames, relying on the STFT [147]. This 
resulted in a novel Fresnel diffraction algorithm with linear 
time complexity that needs no zero-padding and can be used 
for any propagation distance.

12.2  Methodology

The Fresnel diffraction operator expresses light propagation from a plane z = z1 to z = z2 by

(4)  $U\left(x, y; z_2\right) = \frac{e^{ikd}}{i\lambda d}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} U\left(\xi, \eta; z_1\right)\, \exp\!\left(\frac{ik}{2d}\left[(x-\xi)^2 + (y-\eta)^2\right]\right)\mathrm{d}\xi\,\mathrm{d}\eta,$

relating the evolving complex-valued amplitude U over a distance d = z2 − z1, with wavelength λ, wavenumber k = 2π/λ and imaginary unit i. Because this integral is separable along x and y, we can focus on the operation only along x, namely

(5)  $U\left(x; z_2\right) = \int_{-\infty}^{+\infty} U\left(\xi; z_1\right) \exp\!\left(\frac{ik}{2d}(x-\xi)^2\right)\mathrm{d}\xi = \int_{-\infty}^{+\infty} U\left(\xi; z_1\right) K(x-\xi)\,\mathrm{d}\xi,$

where K is the Fresnel convolution kernel. This expression can be numerically evaluated with many techniques [146], but these essentially boil down to either spatial convolution or frequency-domain convolution using the FFT. We proposed a third approach via chirplets:

(6)  $G = \left\{\exp\left(\alpha t^2 + \beta t + \gamma\right) \,\middle|\, \alpha, \beta, \gamma \in \mathbb{C} \wedge \Re(\alpha) < 0\right\},$

which is a generalization of Gaussians with complex-valued parameters. The set of chirplets G is closed under multiplication and convolution; consider two chirplets $u = \exp\left(at^2 + bt + c\right)$ and $\hat{u} = \exp\left(\hat{a}t^2 + \hat{b}t + \hat{c}\right)$; then, for all $u, \hat{u} \in G$,

(7)  $u \cdot \hat{u} = \exp\left(\left(a + \hat{a}\right)t^2 + \left(b + \hat{b}\right)t + \left(c + \hat{c}\right)\right) \in G,$

and

(8)  $u * \hat{u} = \sqrt{\frac{-\pi}{a + \hat{a}}}\;\exp\!\left(\frac{a\hat{a}}{a + \hat{a}}\,t^2 + \frac{a\hat{b} + \hat{a}b}{a + \hat{a}}\,t - \frac{\left(b - \hat{b}\right)^2}{4\left(a + \hat{a}\right)} + c + \hat{c}\right) \in G.$

Because the Fresnel convolution kernel K can be seen as a degenerate chirplet (where α is purely imaginary), any chirplet that gets multiplied or convolved with K will also result in a chirplet. Finally, chirplets can be integrated over as follows:

(9)  $\forall u \in G: \quad \int_{-\infty}^{+\infty} u(t)\,\mathrm{d}t = \sqrt{\frac{\pi}{-a}}\,\exp\!\left(c - \frac{b^2}{4a}\right).$

Thus, if we express the holographic signal in both source and destination planes in terms of chirplets, we can analytically relate the output chirplet coefficients in the plane z = z2 as a function of their inputs from the plane z = z1.
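
As a quick numerical sanity check on these closure relations (a minimal sketch with arbitrarily chosen chirplet parameters, not code from Ref. [147]), the integral identity of Eq. (9) can be verified directly:

```python
import numpy as np
from scipy.integrate import quad

# Arbitrary chirplet parameters with Re(a) < 0 so the integral converges
a = -1.5 + 2.0j
b = 0.7 - 0.3j
c = 0.2 + 0.1j

u = lambda t: np.exp(a * t**2 + b * t + c)

# Numerical integration of the real and imaginary parts separately
re, _ = quad(lambda t: u(t).real, -np.inf, np.inf)
im, _ = quad(lambda t: u(t).imag, -np.inf, np.inf)
numeric = re + 1j * im

# Closed-form result from Eq. (9)
analytic = np.sqrt(np.pi / -a) * np.exp(c - b**2 / (4 * a))

print(numeric, analytic)   # the two values agree to numerical precision
```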

A Gabor frame with Gaussian windows can serve as a 
representation of a signal by a collection of chirplets, using 
the STFT [148]. This means we can use a Gabor transform 
to obtain the chirplet coefficients, transform them using the 
aforementioned equations, and retrieve the propagated signal 
by applying the inverse Gabor transform, cf. Fig. 20.

12.3  Results

Fig. 20  Chirplet-based Fresnel transform pipeline. Every row and column can be processed independently thanks to the separability of the Fresnel transform. The Gabor coefficients can be processed with the chirplet mapping and transformed back to obtain the propagated hologram

Fig. 21  Side-by-side comparison of an example hologram propagated with different algorithms, at d = 5 mm. a The input hologram, followed by multiple reconstructions, using b the spatial domain method, c the proposed Gabor domain method and d the frequency domain method. The proposed Gabor technique appears to be visually identical to the reference spatial and frequency domain algorithms

Because of the sparsity of chirplets for holograms, each 
input Gabor coefficient will only significantly affect a 
small number of output Gabor coefficients, no matter the 
distance d. Therefore, only a few coefficients need updat-
ing while maintaining high accuracy. Combined with the 
fact that the computational complexity of the Gabor trans-
form is linear as a function of the number of samples, this 
results in an O(n) Fresnel diffraction algorithm.

We calculated a 1024 × 1024-pixel hologram with a 
pixel pitch of 4 μm, distance d = 5 cm, and a wavelength 
λ = 633 nm, cf. Fig. 21. Using a sheared input-coefficient 
neighborhood with a radius of 5 input coefficients per 
output coefficient, we obtain a PSNR of 68.3 dB w.r.t. 
the reference reconstruction. Preliminary experiments on 
a C++/CUDA implementation give a speedup of about 
2 to 4 on a 2048 × 2048-pixel hologram compared to FFT-
based Fresnel diffraction [149]. The sparsity factor can be 
chosen to trade off accuracy and speed.

12.4  Conclusion and future perspectives

We have combined two of Dennis Gabor’s inventions, 
holography, and the Gabor transform, creating a novel 
method for numerical Fresnel diffraction. It is a new, dis-
tinct mathematical way to express discrete Fresnel dif-
fraction with multiple algorithmic benefits. The method 
requires no zero-padding, poses no limits on the distance 
d and inherently supports off-axis propagation and changes 
in pixel pitch or resolution. Its sparsity enables a linear 
complexity algorithm implementation. We plan to inves-
tigate these matters and perform detailed experiments in 
future work. This novel propagation technique may serve 
as a basis for more efficient and flexible propagation oper-
ators for various applications in computational imaging 
with diffraction.

13  Layer‑based hologram calculations: 
practical implementations (Tomoyoshi 
Shimobaba)

13.1  Background

Holographic displays have attracted significant attention as 
one of the most promising technologies for three-dimen-
sional (3D) displays. They require optical systems capable 
of rendering holograms with a large spatial bandwidth, in 
addition to algorithms and computational hardware that 
can efficiently compute these holograms at high speeds 
[139]. Computational algorithms designed for hologram 
generation can be broadly categorized into several meth-
ods, including point cloud, polygon, layer, light field, and 
deep learning-based approaches. Each method has its own 
set of advantages and disadvantages, and currently, there 
is no universally perfect method identified. In this context, 
we focus on the layer method.

The layer method computes holograms from a 3D scene 
represented by an RGB image and a depth image (RGB-D 
image). Depth cameras, such as the Kinect, are now read-
ily available for capturing RGB-D images. Alternatively, 
RGB-D images can be obtained from computer graphics 
generated using 3D graphics libraries such as OpenGL. 
A prototype of a holographic near-eye display utilizing 
RGB-D images has been successfully developed and has 
effectively presented 3D images to viewers without caus-
ing discomfort [150]. The layer method decomposes a 
3D scene into multiple 2D images (referred to as layers) 
or point clouds, from which holograms can be computed 
[151]. This chapter provides a detailed exposition of the 
layer method.

Fig. 22  RGB-D image and layer hologram calculation via the layer method: a RGB-D image [149], b layer hologram calculation by diffraction 
calculation, c layer hologram calculation by point cloud method




13.2  Methodology

A schematic of the hologram computation using the layer 
method is presented in Fig. 22, wherein an example of RGB 
and depth images is illustrated. Figure 22b details the pro-
cess of computing a layer hologram through diffraction cal-
culations. This RGB-D image serves as the basis for decom-
posing the 3D scene into multiple layers. The light waves 
emitted from each of these layers are individually computed 
and subsequently combined on the hologram, resulting in 
the final hologram [152]. This can be expressed as follows:

(10)  $h(\mathbf{x}) = \sum_{j=1}^{L} P_j\!\left\{u(\mathbf{x})\, M_j(\mathbf{x})\, \psi(\mathbf{x})\right\},$

where x is the position vector, L is the number of layers, Pj is the operator representing the diffraction calculation, u(x) is one channel of the RGB image, ψ(x) is an arbitrary phase discussed below, and Mj(x) is a function extracting the j-th layer, set to 1 if a pixel in the depth image d(x) matches the depth index j and to 0 otherwise:

(11)  $M_j(\mathbf{x}) = \begin{cases} 1 & \left(d(\mathbf{x}) = j\right) \\ 0 & \text{(otherwise)} \end{cases}$

Representative diffraction calculations employed for Pj 
involve the utilization of Fresnel diffraction and the angular 
spectrum method. While these calculations can be expedited 
through convolution using fast Fourier transforms (FFTs), 
owing to the cyclic nature of convolution, wraparound noise 
may be introduced into the reproduced image. To mitigate 
this wraparound issue, when the hologram size is N × N pix-
els, the diffraction calculation is extended to 2N × 2N pixels, 
with the extended region being zero-padded. This extension, 
however, leads to increased computation time and greater 
memory usage. To address this challenge, diffraction calcu-
lations using pruned FFT and implicit convolution methods 
have been proposed as means to alleviate this problem [153].

The function ψ(x) represents an arbitrary phase, 
and both random and compensation phases are used in 
layer holograms [154]. The random phase is defined 
as ψR(x) = exp(2πin(x)), where i is the imaginary unit, 
and n(x) is a random number within the range of 0 to 1. 
While the random phase has the drawback of introducing 
speckle noise in the reproduced image, it has the benefit 
of broadening the viewing angle of the reproduced image 
and regulating the depth of field in the reproduction. The 
compensation phase is defined as ψC(x) = exp(iπzj), where 
zj represents the distance between the j-th layer and the 
hologram. Phase differences between layers can lead to 
unwanted diffraction waves (referred to as ringing arti-
facts) [155] that are superimposed on the reproduced 
image. The compensation phase serves the purpose of 
reducing the phase difference between each layer to zero, 
thereby diminishing ringing artifacts.
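
A minimal sketch of the layer summation of Eq. (10) is given below, using an angular spectrum propagation routine with 2N × 2N zero-padding to suppress wraparound and a random phase for ψ(x); the input RGB-D data here are random placeholders, and the full commented implementation is provided in supplementary materials S9.

```python
import numpy as np

def angular_spectrum(u, wavelength, z, pitch):
    """Propagate field u by distance z with the angular spectrum method (2x zero-padding)."""
    n = u.shape[0]
    u_pad = np.pad(u, n // 2)                          # extend N x N to 2N x 2N
    fx = np.fft.fftfreq(2 * n, d=pitch)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    H = np.exp(2j * np.pi * z * np.sqrt(np.maximum(arg, 0)))   # evanescent waves dropped
    out = np.fft.ifft2(np.fft.fft2(u_pad) * H)
    return out[n // 2: n // 2 + n, n // 2: n // 2 + n]          # crop back to N x N

# Placeholder scene: one colour channel and an 8-bit depth map (assumed inputs)
N, L = 512, 32
rgb = np.random.rand(N, N)                    # stand-in for u(x)
depth = np.random.randint(0, 256, (N, N))     # stand-in for d(x)
layer_index = (depth / 256 * L).astype(int)   # quantize depth into L layers

wavelength, pitch = 532e-9, 3.74e-6           # parameters from the Results section
z0, dz = 50e-3, 5e-3 / L                      # nearest layer distance and layer spacing
psi = np.exp(2j * np.pi * np.random.rand(N, N))   # random phase psi(x)

# Eq. (10): propagate each masked layer and accumulate on the hologram plane
hologram = np.zeros((N, N), dtype=complex)
for j in range(L):
    mask = (layer_index == j).astype(float)            # M_j(x)
    hologram += angular_spectrum(rgb * mask * psi, wavelength, z0 + j * dz, pitch)
```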

Layer holograms can also be computed through the 
point cloud method, as depicted schematically in Fig. 22c. 
In this approach, a point cloud is generated from the 
RGB-D image, and subsequently, the following calcula-
tions are executed [156, 157]:

(12)  $w(\mathbf{x}) = \sum_{l=1}^{N} u_l \exp\left(ikr_l\right),$

where ul is the l-th object point, k is the wavenumber, 
and rl is the distance between the object point and a cer-
tain point in the hologram. To expedite the computation, 
a virtual plane, known as the wavefront recording plane 
(WRP), is positioned in close proximity to the point cloud. 
The light waves are subsequently recorded on the WRP, 
denoted as w(x), using Eq. (12) [156, 157]. Once all the 
object point information is recorded on the WRP, the 
hologram can be generated by performing a diffraction 
calculation from the WRP to the hologram. Layer images 
often exhibit sparsity. In such instances, using FFTs in Eq. 
(10) for calculations would be less efficient owing to the 
presence of layers containing many zeros. If each layer is 
sparse, it would be more efficient to calculate the hologram 
by using Eq. (12). The introduction of multiple WRPs is a 
possibility, and optimal values for the number and place-
ment of these WRPs exist [158].

Holograms obtained using Eqs. (10) and (12) are inher-
ently complex-valued. Typical spatial light modulators 
(SLMs) are capable of performing either amplitude or 
phase modulation. Therefore, complex holograms need to 
be converted into a format suitable for SLMs. In the case 
of amplitude modulation SLMs, methods are employed that 
either directly extract the real part of the complex hologram 
to produce an amplitude hologram or utilize techniques such 
as single sideband and half-zone plate processing to achieve 
a complex hologram even when using amplitude modula-
tion SLMs [159, 160]. For phase modulation SLMs, com-
plex holograms are transformed into phase-only holograms 
through methods such as double phase hologram [161, 162], 
error diffusion method [163], binary amplitude encoding 
[164], and bleached hologram [165].





13.3  Results

The results of computing the layer hologram from the 
RGB-D image in Fig. 22a using Eq. (10) are displayed in 
Fig. 23. For these diffraction calculations, the angular spec-
trum method was applied [166]. The following parameters 
were used: a wavelength of 532 nm, a minimum distance of 
50 mm between the hologram and the 3D scene, a pixel pitch 
of 3.74 µm, a thickness of 5 mm for the 3D scene (where a 
zero-pixel value in the depth image corresponds to a distance 
of 50 mm from the hologram, and a pixel value of 255 repre-
sents a distance of 55 mm), and a total of 32 layers.

Figure 23a shows the hologram computed using the compen-
sation phase for ψ(x). Further, Fig. 23b and c 
show the reproduced images derived from the hologram. 
Figure 23b displays the reproduced image with a focus 
on the standlight, while Fig. 23c exhibits the reproduced 
image with emphasis on the shelf. Figure 23d illustrates the 
hologram created using the random phase for ψ(x). Cor-
respondingly, Fig. 23e and f reveal the reproduced images 
from the hologram, with a specific focus on the standlight 
and the shelf, respectively. Notably, the reproduced image 
originating from the hologram utilizing the compensation 
phase manifests a deep depth of field, whereas the repro-
duced image obtained from the hologram using the random 
phase exhibits a shallow depth of field. Furthermore, the 
reproduced image of the random phase hologram displays a 
pronounced presence of speckle noise.

13.4  Conclusion and future perspectives

This section describes the calculation of layer holograms, 
and the Python code with accompanying comments can 
be found in the supplementary material. The layer holo-
grams discussed here are commonly applied in near-eye 
holographic displays [150]. However, this method may not 
be suitable for holographic displays with broad viewing 
angles and expansive fields of view, where the observ-
er's eye position can be freely adjusted [167]. For such 
holographic displays, an alternative approach involves 
calculating layer holograms using multi-view images in 
conjunction with depth images [168]. Many hologram 
computations using deep learning methods also involve 
inferring layer holograms from RGB-D images. The layer 
hologram calculations presented in this chapter serve a 
valuable purpose in generating training datasets for deep 
learning. Additionally, while layer holograms derived 
through deep learning may face challenges in achieving 
deep-depth reproduction images [169], the computational 
approach introduced in this chapter allows for greater flex-
ibility in setting the depth parameter. Commented Python 
code for implementing layer hologram calculation is given 
in the supplementary materials S9.

Fig. 23  Layer holograms and 
reproduced images. a Hologram 
using compensation phase, 
b reproduced image focused 
on standlight, c reproduced 
image focused on shelf, d 
hologram using random phase, 
e reproduced image focused on 
standlight, f reproduced image 
focused on shelf. The contrast 
and brightness of each repro-
duced image were adjusted for 
ease of viewing




14  Learned compressive holography 
(Vladislav Kravets and Adrian Stern)

14.1  Background

In [170] we introduced Compressive Fresnel Holography 
(CFH)—a technique that uses only a small subset of samples 
of the hologram to capture the object’s three-dimensional 
information. The CFH was built upon the Compressive Sens-
ing (CS) [171–173] theory, which states that objects that are 
sparse or have a sparse representation in some mathematical 
domain can be economically sampled and reconstructed by 
employing an appropriate sampling scheme and reconstruc-
tion algorithm. Using this technique, a compression ratio of 
up to 12.5:1 was demonstrated in [170] for Fresnel coherent 
digital holography. The method was extended for incoherent 
holography in [174, 175], reducing the sensing effort by an 
order of magnitude. Theoretical guarantees for CFH were 
derived in [176], and comprehensive sampling conditions 
are summarized in Chapter 9 in [173].

The CS theory considers a linear sensing model 
described as g = Φf, where f ∈ ℂ^N represents the object, 
g ∈ ℂ^M the measured samples, and Φ ∈ ℂ^(M×N) is the sens-
ing matrix. In CS, M < N. There are two common types of 
sensing matrices Φ: Random Modulators (RM) and Partial 
Random Ensemble (PRE). The RM sensing matrix is an M 
by N random matrix with entries commonly drawn from a 
sub-Gaussian distribution (e.g., Gaussian, Bernoulli, etc.). 
The best-known representative of PRE is the Random Par-
tial Fourier (RPF) sensing matrix, which is constructed 
by randomly picking out M rows from a Fourier Basis. 
The CFH sensing model relates to this method, suggesting 
randomly sampling only M samples from a full Fresnel 
transformation ensemble, as we demonstrated in [170]. 
We further have shown that it is advantageous to sample 
Fresnel holograms randomly according to a non-uniform 
pattern [170]. Similar non-uniform CS sampling was also 
studied for other CS settings (e.g., [177, 178]) using theo-
retical analysis of the sensing matrix. In this chapter, we 
present a data-driven deep learning method to determine 
the optimal random-like sampling. We apply the recently 
introduced LPTnet [179] to choose the optimal Fresnel 

samples and to reconstruct the object from the CS samples. 
LPTnet was demonstrated in [179] to push the classical CS 
limits by almost two orders of magnitudes when applied 
to regular 2D imaging. Here, we demonstrate its useful-
ness for CFH.

LPTnet is an innovative CS framework that utilizes 
end-to-end Deep Learning (DL) for jointly optimizing 
the sensing process and the CS reconstruction algorithm. 
Unlike traditional CS methods that often rely on random 
sampling techniques, LPTnet intelligently selects optimal 
samples from a predetermined transformation ensemble 
using deep learning. Figure 24 shows a simplified scheme 
of LPTnet. The sensing process is modeled as a Hadamard 
(pointwise) multiplication of the fully transformed image 
with a binary mask, that is, g = ΦF f ∘ c, where ΦF denotes 
the non-sampled (typically unitary) transform, and c the 
sampling mask. The nonzero values of c serve as indica-
tors of the transformed values to be selected. The zero val-
ues of the mask c effectively null out the fully transformed 
values, leaving only the partially transformed ensemble Φc, 
representing the compressed measurements, g. During the 
training phase, optimization is performed on a recon-
struction scheme along with the selection map c. In the 
reconstruction phase, an inverse transform (Φc)^(−1) is first 
applied to the measurements, g. Due to the partial sam-
pling, the obtained image is highly distorted; therefore, a 
sequence of refinement iterations is performed. In each 
iteration (dashed box in Fig. 24), a deep neural network is 
first applied to reconstruct the image. The estimated image 
is then converted to the transformed domain, and the real 
measurements are reinforced while leaving the (implic-
itly) inferred missing coefficients (MR block in Fig. 24). 
Finally, the data is transformed back with (ΦF)^(−1) to the image 
domain. A detailed description of the LPTnet architecture and 
training process can be found in [179]. 

14.2  Methodology

The schematic of LPTnet-based compressive Fresnel 
holography is shown in Fig. 25a. The relation between 
the object complex field amplitude, f, and the field at the 
image plane is mathematically described in the far field 
regime [173] by

(13)  $g\left(p\Delta_z, q\Delta_z\right) = \exp\!\left\{\frac{j\pi}{\lambda z}\left(p^2\Delta_z^2 + q^2\Delta_z^2\right)\right\} \mathcal{F}_{2D}\!\left[f\left(k\Delta_0, l\Delta_0\right)\exp\!\left\{\frac{j\pi}{\lambda z}\left(k^2\Delta_0^2 + l^2\Delta_0^2\right)\right\}\right],$

where F2D is the 2D Fourier transform, λ is the illumination wavelength, Δz is the sensor pixel size, the object sampling interval is Δ0 = λz/(nΔz), and 0 ≤ p, q, k, l ≤ n − 1. Without loss of generality, we consider the complex field amplitude image to be of size n by n, and N = n² is the total number of samples. Equation (13) can readily be written in vector-matrix form [173], g = ΦFST f, where ΦFST describes the full ensemble of the Fresnel transformation. A partial sample of the transformation, Φ, is found by applying LPTnet [179].

Fig. 24  Schematic description of the learned compressive Fresnel holography: LPTnet is used to determine the hologram sampling points and the parameters of the reconstruction deep neural network

Similar to Fig. 24, we can represent the CS Fresnel 
transform as g = c ∘ ΦFST(f), where ∘ is the Hadamard 
(point-wise) product, c ∈ ℝ^(n×n) is a learned binary 
sampling map (Fig. 25a) which can be found using 
LPTnet [179], and ΦFST is the full Fresnel transform 
ensemble.
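
A minimal simulation sketch of this masked sampling model is shown below; it uses a single-FFT Fresnel transform following Eq. (13) and a uniformly random binary mask as a stand-in for the learned map c, since the actual map and the refinement network come from LPTnet training [179] and are not reproduced here.

```python
import numpy as np

def fresnel_transform(f, wavelength, z, det_pitch):
    """Single-FFT Fresnel transform following Eq. (13); object pitch is λz/(n·Δz)."""
    n = f.shape[0]
    obj_pitch = wavelength * z / (n * det_pitch)       # Δ0 = λz/(nΔz)
    k = np.arange(n) - n / 2                            # centered sample indices
    K, L = np.meshgrid(k, k)
    chirp_in = np.exp(1j * np.pi / (wavelength * z) * ((K * obj_pitch)**2 + (L * obj_pitch)**2))
    chirp_out = np.exp(1j * np.pi / (wavelength * z) * ((K * det_pitch)**2 + (L * det_pitch)**2))
    return chirp_out * np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(f * chirp_in)))

# Illustrative parameters (n and wavelength follow the simulation described in the text)
n, wavelength, z = 64, 550e-9, 25e-3
det_pitch = 30e-3 / n                                   # assumed sensor pitch (W = 30 mm field)
f = np.random.rand(n, n).astype(complex)                # stand-in object field

# Compressive sampling: keep only M of the N = n*n hologram samples
M = 500
c = np.zeros(n * n)
c[np.random.choice(n * n, M, replace=False)] = 1        # random mask instead of a learned map
c = c.reshape(n, n)

g = c * fresnel_transform(f, wavelength, z, det_pitch)  # g = c ∘ Φ_FST(f)
# LPTnet's reconstruction stage (inverse transform followed by iterative
# network refinement) is not reproduced in this sketch.
```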

14.3  Results

We have simulated CFH with W = 30  mm, n = 64, 
λ = 550 nm. The top row in Fig. 25b shows the learned sam-
pling maps, c, for z = 25  mm and for z = 192  mm with 
M = 200 and M = 500 samples each. Notice that each case 
has its own Fresnel transformation ensemble, ΦFST = ΦFST(z, M/N); therefore, LPTnet finds an appropriate 
sampling map. It can be seen that for short distances, where 
the Fresnel transformation is not much different from that of 
the object, the optimal sampling pattern is uniformly ran-
dom, which is in agreement with the universal CS oracles 
that do not employ learning tools. However, as the field 
propagates, the sampling pattern becomes structured. Fig-
ure 25b (center and bottom) demonstrates the image recon-
structions of Augustin-Jean Fresnel portrait (Fig. 25b) from