Physica A 651 (2024) 130015

Permutation invariant Gaussian matrix models for financial correlation matrices

George Barnes a,∗, Sanjaye Ramgoolam a,b, Michael Stephanou c

a Centre for Theoretical Physics, School of Physical and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
b National Institute for Theoretical Physics, School of Physics and Centre for Theoretical Physics, University of the Witwatersrand, Wits, 2050, South Africa
c Rand Merchant Bank, 1 Merchant Place, Fredman Drive, Johannesburg, 2196, South Africa

∗ Corresponding author. E-mail addresses: george.barnes.3108@gmail.com (G. Barnes), s.ramgoolam@qmul.ac.uk (S. Ramgoolam), michael.stephanou@gmail.com (M. Stephanou).

Received 9 July 2023; received in revised form 14 July 2024; available online 8 August 2024. https://doi.org/10.1016/j.physa.2024.130015

Keywords: Permutation invariant matrix models; Symmetric group representation theory; Statistical mechanics; Gaussianity; Financial correlations; High-frequency foreign exchange data

Abstract

We construct an ensemble of correlation matrices from high-frequency foreign exchange market data, with one matrix for every day for 446 days. The matrices are symmetric and have vanishing diagonal elements after subtracting the identity matrix. For such ensembles, we construct the general permutation invariant Gaussian matrix model, which has 4 parameters characterised using the representation theory of symmetric groups. The permutation invariant polynomial functions of the symmetric, diagonally vanishing matrices have a basis labelled by undirected loop-less graphs. Using the expectation values of the general linear and quadratic permutation invariant functions of the matrices in the dataset, the 4 parameters of the matrix model are determined. The model then predicts the expectation values of the cubic and quartic polynomials. These predictions are compared to the data to give strong evidence for a good overall fit of the permutation invariant Gaussian matrix model. The linear, quadratic, cubic and quartic polynomial functions are then used to define low-dimensional feature vectors for the days associated with the matrices. These vectors, with choices informed by the refined structure of small non-Gaussianities, are found to be effective as a tool for anomaly detection in market states: statistically significant correlations are established between atypical days as defined using these feature vectors, and days with significant economic events as recognised in standard foreign exchange economic calendars. They are also shown to be useful as a tool for ranking pairs of days in terms of their similarity, yielding a strongly statistically significant correlation with a ranking based on a higher dimensional proxy for visual similarity.

1. Introduction

Permutation invariant Gaussian matrix models have recently been introduced [1], with motivations coming from the study of the statistics of ensembles of matrices arising in computational linguistics [2]. They have been applied to show the existence of approximate Gaussianity in these ensembles [3,4]. Permutation invariant polynomials of matrix variables, which are the key observables in the study of Gaussianity, have also been used as tools for lexical semantic tasks in computational linguistics [4]. In this paper, we construct and analyse financial correlation matrices for a sequence of days obtained by calculating correlations between price movements in high frequency foreign exchange market data. This is an interesting study in itself and adds to a rather sparse literature on high frequency forex correlation matrices ([5] is one other recent example).
The correlation matrices are symmetric and have vanishing matrix elements along the diagonal (by subtracting the identity matrix). We also briefly note the positive-semidefinite property of these correlation matrices, which we do not explicitly model. The permutation invariant matrix model constructed in [1] uses an integration over general matrices and has 13 parameters: 2 are coefficients in the action of linear invariants while 11 are coefficients of quadratic invariants. For the case of the restricted matrices here, there is a reduction of the 13-parameter model to a 4-parameter model. In this paper we explicitly construct the 4-parameter model and use it to demonstrate approximate permutation invariant Gaussianity in the ensemble of forex correlation matrices we construct.

The permutation invariant polynomial functions of generic matrices of size D and degree k have a basis labelled by directed graphs with k edges and any number of nodes, when D ≥ 2k. This condition is satisfied for the cases of interest in this paper since we consider k up to 8 and D = 19. The nodes of the graph correspond to indices being summed and the edges of the graph correspond to the matrix variables. In the case of permutation invariant polynomial functions of symmetric, diagonally vanishing matrices of degree k, there exists a basis labelled by undirected graphs containing no loops (i.e. no edges starting and ending at the same node). See Appendix B.4 for an explanation of this connection between graphs and polynomial invariant functions, as well as examples of the loopless graphs and the corresponding polynomials.

In part, the motivation from physics for applying the permutation invariant Gaussian matrix (PIGM) models is to establish universality properties of these permutation invariant Gaussianities in diverse matrix ensembles in data sciences. In the case of traditional random matrix theory (RMT) [6,7], which is also Gaussian but has a continuous, as opposed to a discrete, symmetry, the areas of application extend across many-body quantum physics [8] and various areas of data science [9]. This diverse range of applications defines a universality class of random matrix statistics. There are already hints of such a universality in existing applications of PIGM models, where different matrix ensembles in computational linguistics constructed from different kinds of algorithms show approximate Gaussianity.
The Gaussianity analysed in [4] was based on the construction of matrices by linear regression in [2], while [3] extended the analysis of [4] and also analysed the matrices constructed by neural network methods in [10].

Further motivation for investigating the broad theme of Gaussianity within a financial setting comes from the known statistical properties of correlation matrices. Existing results establish the asymptotic D²-variate Gaussianity of the sampling distribution of D × D correlation matrices under fairly general conditions (related, for example, to finite fourth moments of the underlying observations from which the correlation matrices are constructed; see [11], Theorem 3.4.4 and subsequent comments). This differs from our non-asymptotic, i.e. finite sample, setting, but does imply that the distribution of ensembles of correlation matrices approaches a multivariate Gaussian in the large observation sample limit. Other known statistical results include the fact that the non-asymptotic sampling distribution of correlation matrices is the Wishart distribution, under certain specific conditions [12]. In particular, the Wishart distribution only arises in the case where the observations from which the correlation matrix estimates are constructed are themselves multivariate Gaussian. We do not make this assumption. Thus our setting is one of finite observation samples which may not be Gaussian distributed. It has also been noted that the correlation information encoded in correlation matrices is not sensitive to the ordering of basis vectors (in [13] for financial correlation matrices, for example). This motivates a permutation invariant model for correlation matrices. The Wishart distribution is not permutation invariant in general, in contrast to the PIGM model. To summarise then, in our setting we apply the PIGM model as a phenomenological model for ensembles of correlation matrices. The PIGM model is permutation invariant and is able to capture leading Gaussian structure, which forms the basis for quantifying small non-Gaussian corrections.

We note a rich history of applying random matrix theory (RMT) to the study of financial correlation matrices, centring on the eigenvalue distributions of these matrices [14–16]. In particular, random matrix models have been fit to the empirical eigenvalue distributions of such correlation matrices, predominantly the so-called Marchenko–Pastur (M–P) distribution [17] derived for identity correlation matrices, as well as more realistic models [18,19]. Evidence has also been presented that large eigenvalues are associated with overall market and sector correlation structure (see [16] for example). Practical applications of these findings have been developed, such as cleaning/de-noising correlation matrices, amongst others [20–23].

The PIGM model provides a new approach to studying and describing financial correlation matrices that is distinct from existing approaches based on RMT. It focuses on low degree permutation invariant polynomial functions of matrices (which we refer to as observables) instead of eigenvalue distributions, which are the focus of traditional RMT. This perspective is based on the postulate that near-Gaussian permutation invariant sectors of real world matrix data contain useful information. The PIGM model furnishes a parsimonious specification of the probability density function of these matrices using only 4 free parameters for the symmetric, vanishing diagonal model.
This is close to the one or two parameters of the simplest RMT and far smaller than a multivariate Gaussian distribution for a D × D matrix, which has of order D² parameters. It provides an analytical solution for the expectation values of permutation invariant products of matrix elements. The empirical higher order observables (cubic, quartic etc.) that agree closely with the model – which is only fit to linear and quadratic observables – reveal consistency with random matrices implied by the fitted Gaussian model. The empirical higher order observables that depart from theoretical expectations indicate informative structure beyond that encoded in the random model. A vector of observables therefore provides a signature for a particular correlation matrix, which may provide a useful, lower dimensional, permutation invariant representation of correlation matrices. The effectiveness of this representation is explored in anomaly detection and measurement of similarity as initial examples. It should be noted that the PIGM model can be applied beyond symmetric correlation matrices to general cross-correlation matrices with only permutation invariance (e.g. constructed from returns at different times) through the full 13-parameter PIGM model developed in [1]. In this article we lay the foundations for these future studies.

In Section 2 we summarise the theoretical results of the paper on permutation invariant Gaussian models. We define general permutation invariant Gaussian matrix models and consider the restriction of these models appropriate to the financial data described in Section 3, namely that the matrices must be symmetric and have vanishing diagonal elements. We also define the permutation invariant observables of the model and explain a useful bijection between these observables and loopless graphs, examples of which are given. The detailed construction and solution of the PIGM model is contained within Appendix B. It is achieved with the help of representation theory of the symmetric group and builds on the results of [1,24]. We find that these models are characterised by 1 linear and 3 quadratic couplings. Linear and quadratic expectation values of observables can be expressed simply in terms of these coupling parameters, see Eqs. (B.72) and (B.71) respectively. Higher order expectation values are then constructed from these with the application of Wick's theorem.

Section 3 gives details of the high-frequency forex data used to construct the matrix ensemble studied in the remainder of the paper, as well as the method by which the members of this ensemble are constructed from the underlying data.

Section 4 contains a description of the empirical statistical properties of the observables. This includes measures of their Gaussianity and comparison of their properties with those predicted by the model presented in Section 2.¹

¹ We include a further, practical comparison between empirical observations and theoretical results from the PIGM model in section 3 of the supplementary material.

In Section 5 we construct vectors of observables for each correlation matrix. These observable vectors are low dimensional representations of the correlation matrices. There are 31 cubic and quartic observables for general matrix size D (as long as D ≥ 8, a condition which is generally satisfied in large D applications such as the one here).
In general, we find that the observable vectors provide a good representation of the original correlation matrices, performing well in anomaly detection and similarity ranking applications. The best performances are achieved by selecting subsets of the cubic and quartic observables, based on the ranking of their small non-Gaussianities, and on the postulate that the more non-Gaussian observables are most informative of economic factors driving atypicality of the days. We conclude in Section 6 and suggest interesting future research directions arising from this work. We collect some derivations, figures and robustness checks in supplementary material, which are referred to within the main text. The code and data required to reproduce the results in this article are available at https://github.com/pigm-finance/codedata, with certain aspects further described in the supplementary material.

2. Summary of results on the 4-parameter Gaussian matrix model

Here we summarise the main results and outline the key ideas behind the construction of the general PIGM model for an ensemble of symmetric matrices which have vanishing matrix elements along the diagonal. This section is intended to provide, for a reader with a background in mathematical finance or statistics, an understanding of the key theoretical points of the paper, without getting into the full details of the construction presented in Appendix B, which will be more easily accessible to a reader with knowledge of group representation theory at a level covered in mathematical texts such as chapters 3, 5 and 7 of [25]. We will review the description of probability distributions using a Euclidean action which is Gaussian or near-Gaussian, using the simple case of one-variable statistics, and motivate the measure of non-Gaussianity we use later in the case of permutation invariant matrix distributions. We explain the structure of the 4-parameter permutation invariant Gaussian matrix model and the connection between the permutation invariant polynomial functions of the matrices and loopless graphs. Finally we present formulae for the one- and two-point functions of matrix variables, derived in detail in Appendix B.

It is useful to recall that a one-variable Gaussian distribution, with mean μ and standard deviation σ, for a random variable x, is described by a probability density function

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} .   (2.1)

The moments of the distribution are expectation values ⟨x^k⟩, defined as

\langle x^k \rangle = \int_{-\infty}^{\infty} dx\, f(x)\, x^k .   (2.2)

It is also useful to define, by analogy with statistical physics and quantum mechanical path integrals, the action S = (x-μ)²/(2σ²). The partition function is

Z = \int dx\, e^{-S} ,   (2.3)

while the moments are

\langle x^k \rangle = \frac{1}{Z} \int dx\, e^{-S}\, x^k .   (2.4)

The action S is a quadratic function of x.

It is often the case that the action of a theory is approximately Gaussian — deviating from Gaussianity by some small higher order terms. The full action of the system S′ is then written as a Gaussian piece S, plus an additional non-Gaussian piece δS:

S' = S + \lambda\, \delta S .   (2.5)
The smallness of the higher order terms is governed by the interaction strength λ, whose smallness is required to ensure

\langle \mathcal{O}_\alpha \rangle_S - \langle \mathcal{O}_\alpha \rangle_{S'} < \sigma_{\langle \mathcal{O}_\alpha \rangle_{S'}} ,   (2.6)

i.e. the expectation value of some observable 𝒪_α is largely insensitive to the non-Gaussian contribution to the true action governing the theory, δS.

More concretely, take μ = 0 in the simple pure Gaussian, one parameter toy model defined above. The partition function is

Z = \int dx\, e^{-S} = \int dx\, e^{-x^2/(2\sigma^2)} .   (2.7)

View this as an approximation to some true physical partition function which includes some small, non-Gaussian perturbation. We explain, by way of a simple example which captures the mechanism, that the absolute differences between expectation values of low order polynomials in the random variables (observables) in the Gaussian model and the perturbed model are small compared to the standard deviation of the observable. The simple example consists of a Gaussian action perturbed by a small quartic correction, so that the perturbed partition function Z′ is

Z' = \int dx\, e^{-S'} = \int dx\, e^{-\frac{x^2}{2\sigma^2} - \frac{\lambda}{4!} x^4} = Z \left( 1 - \frac{\lambda}{4!} \langle x^4 \rangle + \cdots \right) .   (2.8)

Using (2.4) we calculate the absolute difference between the fourth moment of each of the theories and obtain (see the supplementary material for the derivation)

\frac{ \left| \langle x^4 \rangle - \langle x^4 \rangle_{S'} \right| }{ \sigma_{\langle x^4 \rangle_{S'}} } \sim \lambda \sigma^4 .   (2.9)

Therefore, as long as the physical theory is approximately Gaussian, i.e. λ is small, its normalised fourth moment is well approximated by that of the purely Gaussian theory.

We postulate that real market effects governing the interactions between currency rates – as encoded in the distribution of the correlation matrices we study – are modelled analogously by a Gaussian action (incorporating linear and quadratic permutation invariant polynomial functions of the matrices) plus some small non-Gaussian perturbation (cubic and higher order permutation invariant polynomial functions of the matrices). The smallness of the non-Gaussian terms allows us to approximate expectation values using a purely Gaussian theory. Evidence for this near-Gaussianity is provided primarily by the smallness of the measured observable deviations from those predicted by a purely Gaussian theory. These are listed in Table 4 of Section 4.

Permutation invariant Gaussian matrix models for generic real matrices of size D are described by a partition function

\mathcal{Z} = \int \prod_{i,j=1}^{D} dM_{ij}\, e^{-\mathcal{S}_{\mathrm{PIGMM}}} .   (2.10)

The integration measure is the standard Euclidean measure on ℝ^{D²}. The general Gaussian action is defined by a general quadratic permutation invariant function of the matrix elements M_{ij}, depending on 13 parameters, which is compatible with convergence of the integral [1]. The set of all permutations of D objects forms the symmetric group S_D. The permutations γ ∈ S_D act as

\gamma : M_{ij} \to M_{\gamma(i)\gamma(j)} .   (2.11)

A precise description of the general parameter space for such a Gaussian action is found using the representation theory of the symmetric group.
A key outcome of the representation theoretic treatment is that there exists a convenient set of variables S^{V_A;τ_A}_a, which are linear combinations of the D² variables M_{ij}:

S^{V_A;\tau_A}_a = \sum_{i,j=1}^{D} C^{V_A,\tau_A}_{a;\, ij}\, M_{ij} .   (2.12)

The labels V_A are vector spaces {V_{[D]} = V_0, V_{[D-1,1]} = V_H, V_{[D-2,2]} = V_2, V_{[D-2,1,1]} = V_3}. The meaning of this list is explained further in Appendix B. For now the only important point is that this is a list of 4 elements. The index a ranges over a set of basis elements for each vector space V_A, numbering respectively {1, (D−1), D(D−3)/2, (D−1)(D−2)/2}, which are the dimensions Dim V_A of the vector spaces. The index τ_A runs over {2, 3, 1, 1} values respectively. The key result is that the action for the general permutation invariant Gaussian model takes the form

\mathcal{S}_{\mathrm{PIGMM}} = -\mu_1 S^{V_0;1} - \mu_2 S^{V_0;2} + \sum_{A} \sum_{a=1}^{\mathrm{Dim}\, V_A} \sum_{\tau_A, \tau'_A} g^{(A)}_{\tau_A, \tau'_A}\, S^{V_A;\tau_A}_a\, S^{V_A;\tau'_A}_a ,   (2.13)

where μ_1 and μ_2 are linear coupling parameters. The parameters g^{(A)}_{τ_A,τ'_A} are symmetric matrix parameters of matrix size 2, 3, 1, 1. Thus they define parameter spaces of dimension {3, 6, 1, 1}. Convergence is guaranteed by the condition that these matrices have positive eigenvalues. Thus the general Gaussian model has 2 parameters for the linear invariants and 11 parameters for the quadratic invariants.

We show in Appendix B that the general permutation invariant Gaussian matrix model for symmetric matrices is a 9-parameter model, and for symmetric matrices with diagonally vanishing matrix elements the permutation invariant Gaussian model is a 4-parameter model. The 9-dimensional parameter space for the symmetric matrices and the 4-dimensional parameter space for the symmetric diagonally vanishing matrices are subspaces of the 13-dimensional parameter space for generic matrices. The embedding of the 4-dimensional parameter space in the 13-parameter space is described in Appendix B.

The diagonal action of S_D simultaneously permutes the rows and columns of a matrix as in (2.11). We wish to find an action that is invariant under this group action. We refer to the space of symmetric matrices with vanishing diagonal as the physical subspace V^{phys} of general D × D matrices and label matrices in the space with a superscript, i.e. M^{phys} ∈ V^{phys}. They obey the conditions

M_{ij} = M_{ji}, \qquad M_{ii} = 0, \qquad 1 \le i, j \le D .   (2.14)

Since these conditions are S_D equivariant with respect to the action defined in (2.11), the physical subspace is invariant under S_D:

M_{ij} \in V^{\mathrm{phys}} \;\Rightarrow\; M_{\gamma(i)\gamma(j)} \in V^{\mathrm{phys}}, \qquad \forall\, \gamma \in S_D .   (2.15)

By physical we mean only to restrict to the non-trivial data of interest. In the correlation matrices described in Section 3 all the data is contained within symmetric matrices with vanishing diagonal.

An important ingredient in understanding permutation invariant random matrix models is the structure of the permutation invariant polynomial functions of matrices, which are closely related to graphs.
The key point behind this relation is that permutation invariant polynomial functions can be constructed by summing over the matrix indices. For general matrices there are 2 linear and 11 quadratic invariant functions; these polynomials and their graphs are described in Appendix B.4. For symmetric matrices with vanishing diagonal elements there is 1 linear invariant function and 3 quadratic invariant functions, which can be represented by loopless graphs; the corresponding polynomials are

\sum_{i,j}^{D} M_{ij}, \qquad \sum_{i,j}^{D} M_{ij}^2, \qquad \sum_{i,j,k}^{D} M_{ij} M_{jk}, \qquad \sum_{i,j,k,l}^{D} M_{ij} M_{kl} .   (2.16)

Each index summation is represented by a node of the graph and matrices span these nodes as edges. No loops are permitted, due to the matrices having vanishing diagonal, and the edges are undirected, due to the symmetry of the matrices.

For a fixed degree, the permutation invariant polynomial functions form a vector space. As long as the matrix dimension is larger than twice the degree of the polynomial, the graphs are in one-to-one correspondence with basis elements of this vector space. In our present financial application this degree condition is always satisfied. A detailed discussion of this condition and the effects of going beyond it are given in [24].

The action of the reduced PIGM model is given by the most general combination of permutation invariant linear and quadratic terms

\mathcal{S} = -\mu \sum_{i,j=1}^{D} M_{ij} + \tau_1 \sum_{i,j=1}^{D} M_{ij} M_{ij} + \tau_2 \sum_{i,j,k=1}^{D} M_{ij} M_{jk} + \tau_3 \sum_{i,j,k,l=1}^{D} M_{ij} M_{kl} ,   (2.17)

where μ is the linear coupling strength and τ_1, τ_2 and τ_3 are the quadratic couplings. Label the observables of this theory 𝒪_α(M_{ij}), where α indexes the particular observable; they are permutation invariant polynomials of the random matrix variables of symmetric matrices with vanishing diagonal. Expectation values of these variables are defined as

\langle \mathcal{O}_\alpha(M_{ij}) \rangle = \frac{1}{\mathcal{Z}} \int dM\, \mathcal{O}_\alpha\, e^{-\mathcal{S}} ,   (2.18)

where the measure is defined as

dM \equiv \prod_{i<j} dM_{ij} .   (2.19)

The quadratic, cubic and quartic observables, along with their corresponding graphs, are listed in Appendix B.4. In order to compute (2.18) for any choice of 𝒪_α we must find a change of basis that factorises the RHS. This is possible with the application of appropriate projectors. Given these, the action for the 4-parameter model can be written

\mathcal{S} = \sum_{i,j,k,l=1}^{D} \frac{1}{2} \left( \tau_{V_0} M_{ij} P^{\mathrm{phys};V_0}_{ij,kl} M_{kl} + \tau_{V_H} M_{ij} P^{\mathrm{phys};V_H}_{ij,kl} M_{kl} + \tau_{V_2} M_{ij} P^{\mathrm{phys};V_2}_{ij,kl} M_{kl} \right) - \sum_{i,j=1}^{D} \mu_{V_0} C^{\mathrm{phys};V_0}_{ij} M_{ij} ,   (2.20)

where μ_{V_0}, τ_{V_0}, τ_{V_H} and τ_{V_2} are the couplings in the transformed basis. Performing the projections in (2.20) allows for the solution of the linear and quadratic expectation values via standard techniques of Gaussian integration. These can then be transformed into the original M_{ij} basis to give expressions for the matrix observables.
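For concreteness, the four invariants in Eq. (2.16), and hence the observables built from them, can be evaluated directly for any symmetric matrix with vanishing diagonal. The following is a minimal NumPy sketch; the function name and dictionary keys are our own illustrative choices, not identifiers from the paper's released code.

```python
import numpy as np

def linear_quadratic_invariants(M: np.ndarray) -> dict:
    """The 1 linear and 3 quadratic permutation invariant polynomials of
    Eq. (2.16), for a symmetric matrix M with vanishing diagonal."""
    return {
        "sum_ij M_ij":        M.sum(),                      # linear invariant
        "sum_ij M_ij^2":      np.einsum("ij,ij->", M, M),   # quadratic: one doubled edge
        "sum_ijk M_ij M_jk":  np.einsum("ij,jk->", M, M),   # quadratic: path of length two
        "sum_ijkl M_ij M_kl": M.sum() ** 2,                  # quadratic: two disjoint edges
    }
```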
Defining

\tilde{\mu}_{V_0} \equiv \tau^{-1}_{V_0} \mu_{V_0} ,   (2.21)

the projection procedure described above produces the one-point function

\langle M^{\mathrm{phys}}_{ij} \rangle = C^{\mathrm{phys};V_0}_{ij} \langle S^{\mathrm{phys};V_0} \rangle = \left( \sqrt{\frac{D-1}{D^3}} - \frac{F_{i,j}}{\sqrt{D(D-1)}} \right) \tilde{\mu}_{V_0} ,   (2.22)

and the two-point function

\langle M^{\mathrm{phys}}_{ij} M^{\mathrm{phys}}_{kl} \rangle = \left( \sqrt{\frac{D-1}{D^3}} - \frac{F_{i,j}}{\sqrt{D(D-1)}} \right) \left( \sqrt{\frac{D-1}{D^3}} - \frac{F_{k,l}}{\sqrt{D(D-1)}} \right) \tilde{\mu}^2_{V_0}
  + \frac{1}{D} \left( \frac{1}{D-1} F_{i,j} F_{k,l} - \frac{1}{D} \left( F_{i,j} + F_{k,l} \right) + \frac{D-1}{D^2} \right) \tau^{-1}_{V_0}
  + \frac{1}{2(D-2)} \left( 1 - \delta_{ij} \right) \left( 1 - \delta_{kl} \right) \left( F_{i,k} + F_{j,k} + F_{i,l} + F_{j,l} \right) \tau^{-1}_{V_H}
  + \left( \frac{1}{2} F_{i,k} F_{j,l} + \frac{1}{2} F_{i,l} F_{j,k} - \frac{D}{D-2} \sum_{p,q=1}^{D} F_{i,p} F_{j,p} F_{k,q} F_{l,q} F_{p,q} - \frac{1}{D-1} F_{i,j} F_{k,l} \right) \tau^{-1}_{V_2} .   (2.23)

Identifying and then summing over indices in (2.22) and (2.23) produces analytic expressions in D for the expectation values of the linear and quadratic observables:

\sum_{i,j} \langle M^{\mathrm{phys}}_{ij} \rangle = \sqrt{D(D-1)}\, \tilde{\mu}_{V_0} ,   (2.24)

and

\sum_{i,j} \langle M^{\mathrm{phys}}_{ij} M^{\mathrm{phys}}_{ij} \rangle = \tilde{\mu}^2_{V_0} + \tau^{-1}_{V_0} + (D-1)\, \tau^{-1}_{V_H} + \frac{D(D-3)}{2}\, \tau^{-1}_{V_2} ,   (2.25)

\sum_{i,j,k} \langle M^{\mathrm{phys}}_{ij} M^{\mathrm{phys}}_{jk} \rangle_{\mathrm{conn}} = (D-1)\, \tilde{\mu}^2_{V_0} + (D-1)\, \tau^{-1}_{V_0} + \frac{(D-1)(D-2)}{2}\, \tau^{-1}_{V_H} ,   (2.26)

\sum_{i,j,k,l} \langle M^{\mathrm{phys}}_{ij} M^{\mathrm{phys}}_{kl} \rangle_{\mathrm{conn}} = D(D-1)\, \tilde{\mu}^2_{V_0} + D(D-1)\, \tau^{-1}_{V_0} ,   (2.27)

respectively.

Cubic, quartic and higher expectation values can be calculated from these with the application of Wick's theorem, which allows them to be expressed as sums of products of linear and quadratic expectation values. Details of Wick's theorem as applied to cubic and quartic expectation values are given in Appendix B.4.2.

The equations for the 13 parameters appearing in the action (2.13) in terms of the 4 parameters of the physical action (2.20) are given in (B.80) and (B.81). As expected, both models give consistent results for the expectation values of physical observables.

To determine how well this PIGM model predicts the statistics of the forex correlation data described in Section 3, we first use the correlation matrices to define the Gaussian model, i.e. to fix the linear and quadratic couplings of the model. We then calculate the theoretical expectation values ⟨𝒪_α(M)⟩_T defined by (2.18) using this action with the application of Wick's theorem. These can then be compared to the experimental expectation values ⟨𝒪_α(M)⟩_E calculated from the financial correlation matrices themselves, using the following similarity measure

\Delta_\alpha = \frac{ \left| \langle \mathcal{O}_\alpha(M) \rangle_{\mathrm{T}} - \langle \mathcal{O}_\alpha(M) \rangle_{\mathrm{E}} \right| }{ \sigma_{E,\alpha}(M) } ,   (2.28)

where σ_{E,α} is the standard deviation of the observable with respect to the ensemble of correlation matrices. This measure of similarity is used to identify the observables which deviate most significantly from Gaussianity. In Sections 5.1 and 5.2 these least Gaussian observables are shown to be the optimal candidates for data reduction in a variety of anomaly detection and similarity tests.
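Equations (2.24)–(2.27) are what allow the four couplings to be fixed from data: Eq. (2.24) determines μ̃_{V_0} from the empirical average of the linear invariant, and the remaining three relations are then linear in the inverse quadratic couplings. A minimal sketch of this fit is given below, under the assumption that the empirical analogues of the four left-hand sides have already been averaged over the ensemble of matrices; the variable names are ours, for illustration only.

```python
import numpy as np

def fit_pigm_couplings(e_lin, e_quad_same, e_quad_path_conn, e_quad_disj_conn, D):
    """Solve Eqs. (2.24)-(2.27) for (mu_tilde, tau_V0^-1, tau_VH^-1, tau_V2^-1).

    The four arguments are ensemble averages of the left-hand sides of
    Eqs. (2.24)-(2.27) respectively (the last two being connected averages)."""
    mu = e_lin / np.sqrt(D * (D - 1.0))                                       # Eq. (2.24)
    # Remaining equations are linear in the inverse quadratic couplings.
    A = np.array([
        [1.0,           D - 1.0,                     D * (D - 3.0) / 2.0],    # Eq. (2.25)
        [D - 1.0,       (D - 1.0) * (D - 2.0) / 2.0, 0.0],                    # Eq. (2.26)
        [D * (D - 1.0), 0.0,                         0.0],                    # Eq. (2.27)
    ])
    b = np.array([
        e_quad_same      - mu ** 2,
        e_quad_path_conn - (D - 1.0) * mu ** 2,
        e_quad_disj_conn - D * (D - 1.0) * mu ** 2,
    ])
    tau_V0_inv, tau_VH_inv, tau_V2_inv = np.linalg.solve(A, b)
    return mu, tau_V0_inv, tau_VH_inv, tau_V2_inv
```

The system is triangular (Eq. (2.27) involves only τ_{V_0}^{-1}), so back-substitution would work equally well; the linear solve is used only for brevity.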
3. Daily correlation matrices from high-frequency forex data

3.1. Forex data description

The high-frequency forex data that we analyse pertain to 19 of the most liquidly traded currency pairs and cover the date range from 1 April 2020 to 31 January 2022. The data is sourced from TrueFX [26] and is comprised of all updates/ticks of the best price quotes at which any market participant is willing to buy (top-of-book bid quotes) or sell (top-of-book offer quotes). Market participants providing such price quotes include banks, brokers and asset managers on the Integral OCX platform.² We exclude United States currency settlement holidays (days where no settlements of prior transactions are made) due to the central importance of the US Dollar to FX trading. We also exclude the 24th, 25th, 26th, 31st of December and the 1st, 2nd of January due to reduced liquidity. The rationale behind this is to remove days which would likely have a different data generating process a priori. In total, around one billion pricing updates were analysed.

² This platform is an ECN, i.e. an Electronic Communication Network.

For each currency pair, the mid-price series p^{(I)}_j is calculated from the bid and offer quotes as follows:

p^{(I)}_j = \left( b^{(I)}_j + a^{(I)}_j \right) / 2 , \qquad I \in \{1,\ldots,19\}, \; j \in \{1,\ldots,n_I\} ,   (3.1)

where b^{(I)} and a^{(I)} are contemporaneous bid and offer quotes respectively, I ∈ {1,…,19} indexes the currency pairs, j indexes the quotes and n_I corresponds to the number of quotes for currency pair I per day. See Table 8 for a mapping of these indices to actual currency pair names. These mid-prices are then sampled on a regular time grid using the last-tick methodology, where the most recent quotes in each currency pair are used to calculate the mid-price for that time interval. The regularly sampled mid-prices are then:

p^{(I)}_{t_{(1)}}, \ldots, p^{(I)}_{t_{(n)}} , \qquad t_{(i+1)} - t_{(i)} = 5 \text{ min}, \; i \in \{1,\ldots,n\} ,   (3.2)

where t_{(i)}, i ∈ {1,…,n} are the time stamps on a regularly sampled grid and n is the number of 5 min intervals per day. If we denote the time stamp of each quote as τ_j, j ∈ {1,…,n_I}, then the quote used for each 5 min interval can be described as

p^{(I)}_{t_{(i)}} = p^{(I)}_{\max\{ 1 \le j \le n_I \,\mid\, \tau_j \le t_{(i)} \}} .   (3.3)

We note that the choice of 5 min as a time interval is common in high frequency financial correlation analyses. We obtain the (log) mid-price returns via:

r^{(I)}_{(i)} = \log \frac{ p^{(I)}_{t_{(i+1)}} }{ p^{(I)}_{t_{(i)}} } , \qquad i \in \{1,\ldots,n-1\} ,   (3.4)

where the log function as used here and throughout the article is the natural logarithm. Note that the first time interval of each day, for all currency pairs, begins at 00:00:00.000 UTC/GMT and ends at 00:04:59.999 UTC/GMT. The last time interval begins at 23:55:00.000 UTC/GMT and ends at 23:59:59.999 UTC/GMT. The advantage of determining the calendar date based on UTC/GMT is that the major FX trading sessions are all captured on the same calendar date, namely Asia, then Europe, then North America. There are n = 288 five minute intervals per day. The time intervals are not only regular, but also aligned across all the currency pairs.
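The sampling scheme of Eqs. (3.1)–(3.4) amounts to last-tick sampling of mid-prices on a 5 min grid followed by log differencing. A minimal pandas sketch of one day's processing for a single currency pair is given below; the DataFrame layout (a timestamp index with 'bid' and 'ask' columns) is an assumption made for illustration rather than a description of the authors' pipeline.

```python
import numpy as np
import pandas as pd

def five_minute_log_returns(quotes: pd.DataFrame) -> pd.Series:
    """Last-tick sample mid-prices on a regular 5-minute grid and return
    the log returns of Eq. (3.4) for one currency pair and one day.

    `quotes` is indexed by tick timestamp and has columns 'bid' and 'ask'."""
    mid = (quotes["bid"] + quotes["ask"]) / 2.0      # Eq. (3.1)
    # Most recent quote at or before each grid point (last-tick), Eqs. (3.2)-(3.3)
    grid = mid.resample("5min").last().ffill()
    return np.log(grid).diff().dropna()               # Eq. (3.4)
```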
Indeed, it is important to note that this procedure downsamples the rapidly updating and irregularly spaced tick data to a regular grid on a much larger timescale.

See Table 1 for the statistics on the number of quotes per time interval for each currency pair, aggregated across all days. See Table 2 for the descriptive statistics of the regularly sampled (log) returns per currency pair, again aggregated across all days in the data set. It is readily apparent from the descriptive statistics in Table 2 that the means of the (log) return distributions are very close to zero and that the standard deviations vary between currency pairs. In addition, the returns have high kurtosis, consistent with the expected behaviour of price movements of financial instruments with a calendar time clock. The only currency pair that has a markedly asymmetric distribution is USD/TRY, as evidenced by a large negative skewness (i.e. a left-skewed distribution). The large volatility, kurtosis and negative skewness of the USD/TRY distribution can be related to the sharp depreciation of the Turkish Lira during the Turkish currency and debt crisis which occurred during the period of analysis.

A few further comments are in order. The data we analyse are mid-quote data, to be contrasted with transaction price data. Bid and offer quote data represent the prices at which market participants are willing to trade at any given moment (and mid-quotes are the average of these prices), whereas transaction prices are those prices where trades have actually occurred. Quote data typically present a more continuous view of price formation whereas transaction data represent the most definitive, but less frequent, measurements of the price. The FX market is a highly fragmented, over-the-counter (OTC) market. Trading on any given currency pair is distributed across a number of primary and secondary venues along with bilateral transactions between trading counterparties. For the majority of such venues, market participants do not have access to detailed transaction data other than their own trades and thus quote data is usually the main source of price formation information. Our analyses rely exclusively on quote data for price information.

We take the view that our results may well generalise beyond quote data, however, based on the thesis that "transaction prices and mid-quote prices are both noisy measures of the latent 'efficient prices', polluted by market microstructure effects", as per [27]. In the context of realised volatility, [27] go further and propose that a good estimator is one that yields approximately the same estimates with transaction data and mid-quote data. While [27] present this view with reference to realised volatility, in follow-on work [28], empirical evidence is gathered that correlation estimates based on mid-quotes and transaction prices are similar for many correlation estimators (for equities data at least). This includes the realised covariance measure estimated on a regular time grid with the last-tick sampling scheme (which we utilise in this article). While these results are obtained for equities data, it is plausible that they generalise to forex data.

We make a final note that the data we analyse is obtained from a single venue, i.e. the Integral OCX platform. This should also not preclude generalisability given that quotes are not expected to drastically differ between venues due to arbitrage.
3.2. Correlation matrix methodology

In statistics, various measures of association between two random variables have been defined. In our context, we apply certain measures of correlation to ascertain the degree to which currency (log) returns are concordant or discordant. Intuitively, this should capture an important aspect of the relationship of one currency pair with another.

Calculating correlations on high frequency financial data is complicated by two main effects. The first is the fact that observations occur irregularly in time and, moreover, asynchronously across instruments. The second is the presence of microstructure noise due to various factors such as bid–ask bounce (relevant mainly for transaction based data), minimum tick intervals, latency effects etc. There are several covariance/correlation estimators that have been proposed in the literature to address these issues (i.e. asynchronous updates and market microstructure noise). Given that these estimators are not the focus of this article, and that the literature is extensive, we point the reader to the following (non-exhaustive) list of Refs. [28–34] for more detail on these two complicating issues and various approaches to jointly handle them.

Table 1
Descriptive statistics of the number of quote updates per 5 min time interval.

Currency pair   Mean   Std Dev.   Q1    Med.   Q3
AUD/JPY         419    374.5      187   319    531
AUD/NZD         304    279.3      135   231    383
AUD/USD         404    396.9      165   296    511
CAD/JPY         272    257.0      113   197    348
CHF/JPY         287    271.7      118   212    368
EUR/CHF         269    295.5      90    173    338
EUR/GBP         308    316.7      103   206    407
EUR/JPY         506    435.8      202   387    685
EUR/PLN         277    487.4      47    125    296
EUR/USD         512    506.4      181   376    683
GBP/JPY         553    471.5      240   433    724
GBP/USD         492    472.6      174   361    668
NZD/USD         269    269.3      114   199    333
USD/CAD         373    371.7      152   271    470
USD/CHF         237    255.4      82    161    305
USD/JPY         322    319.9      135   234    399
USD/MXN         447    434.1      143   324    610
USD/TRY         205    580.4      9     36     138
USD/ZAR         445    498.5      139   325    588

Table 2
Summary statistics of regular 5 min (log) mid-price returns.

Currency pair   Mean (×10⁻⁴)   Std Dev. (×10⁻⁴)   Skewness   Kurtosis
AUD/JPY         0.0            3.8                0.0        24.4
AUD/NZD         0.0            2.1                0.2        36.6
AUD/USD         0.0            3.8                0.0        15.3
CAD/JPY         0.0            3.1                0.1        16.2
CHF/JPY         0.0            2.4                0.0        12.5
EUR/CHF         0.0            1.8                −0.6       80.5
EUR/GBP         0.0            2.6                −0.2       37.5
EUR/JPY         0.0            2.4                0.1        16.8
EUR/PLN         0.0            2.7                0.0        31.1
EUR/USD         0.0            2.4                −0.2       32.9
GBP/JPY         0.0            3.1                0.0        20.0
GBP/USD         0.0            3.1                0.0        14.9
NZD/USD         0.0            3.8                0.1        19.8
USD/CAD         0.0            2.7                0.0        18.8
USD/CHF         0.0            2.5                −0.2       19.8
USD/JPY         0.0            2.1                0.2        17.3
USD/MXN         0.0            5.5                0.0        18.8
USD/TRY         0.1            11.7               −7.3       795.7
USD/ZAR         0.0            6.4                −0.1       17.2

In this article we utilise the correlation estimator (3.5) on (log) mid-price returns. This estimator is referred to as the realised correlation estimator in the finance literature and is defined as

\hat{\rho}_{IJ} = \frac{ \sum_{i=1}^{n-1} r^{(I)}_{(i)} r^{(J)}_{(i)} }{ \sqrt{ \sum_{i=1}^{n-1} \left( r^{(I)}_{(i)} \right)^2 \sum_{i=1}^{n-1} \left( r^{(J)}_{(i)} \right)^2 } } , \qquad I, J \in \{1,\ldots,19\} ,   (3.5)

where I, J are currency pair indices. This estimator captures the normalised, aggregated co-movement (i.e. covariance) of two series of returns over a given time period (one day in our case). The resultant ρ̂_{IJ} correlation matrix is a symmetric, real matrix with 19(19−1)/2 = 171 independent entries.
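Given the matrix of 5 min log returns for one trading day, Eq. (3.5) is straightforward to evaluate. The following is a small NumPy sketch that also subtracts the identity, so that the output matches the vanishing-diagonal convention used throughout the paper; the function name is ours, for illustration.

```python
import numpy as np

def realised_correlation(returns: np.ndarray) -> np.ndarray:
    """Realised correlation estimator of Eq. (3.5) for a single day.

    `returns` has shape (n_intervals, n_pairs): one column of 5-minute log
    returns per currency pair.  Note that Eq. (3.5) normalises raw
    co-movements without subtracting mean returns."""
    norms = np.sqrt(np.sum(returns ** 2, axis=0))
    rho = (returns.T @ returns) / np.outer(norms, norms)
    np.fill_diagonal(rho, 0.0)   # subtract the identity: vanishing diagonal
    return rho
```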
The count of 171 independent entries reflects the fact that the diagonal elements are fixed and equal and do not contribute to the degrees of freedom (we subtract the identity to get a correlation matrix with vanishing diagonal elements). As mentioned previously, we are concerned with the ensemble statistics of such matrices. There are several ways to construct such an ensemble. We choose to calculate the correlation matrix for each trading day (aligned with UTC/GMT boundaries), and study the sampling distribution of the matrices. In particular, we study ρ̂^{IJ}_A, where A ∈ {1,…,N_D} indexes trading dates. In our data, there are 446 unique trading days, i.e. N_D = 446. We plot examples of the evolution of two elements of these correlation matrices over time in Fig. 1.

Fig. 1. Examples of realised daily correlation estimates over time.

It is well established that the realised correlation estimator is, in general, sensitive to the issues described above. However, it is widely acknowledged in the literature that the impact of these issues can be mitigated by sampling regularly at a lower frequency, i.e. 5–15 min intervals. We utilise 5 min time intervals in particular, as is common in analysing high frequency financial data. We have also verified empirically that the correlation results are not very sensitive to the choice of the time interval length — see Table 3, where we demonstrate that the correlation values are fairly similar (in aggregate) at different timescales (1, 5, 10 and 15 min) and, moreover, vary in a concordant manner. We have also confirmed that the model fit results are similar for 1, 5, 10 and 15 min, suggesting some robustness to this choice (see Table 4). Finally, we have validated that our anomaly detection results presented in Section 5.1 are qualitatively similar for all timescales beyond a certain value (the effectiveness of the anomaly detection procedure for 5, 10 and 15 min is similar, with 1 min results being slightly worse); see section 5 of the supplementary material.

Table 3
Summary statistics for the values of correlation obtained from the realised correlation estimator with different sampling intervals. Here the mean (Mean) is the average correlation across all currency pair combinations. The standard deviation (Std Dev.) is the average standard deviation of the correlation elements across all currency pair combinations. Finally, Cor(·, 5 mins) gives the average correlation of the individual correlation elements with those obtained for a 5 min sampling period, across all currency pair combinations.

Time scale (mins)   Mean   Std Dev.   Cor(·, 5 mins)
1                   0.05   0.10       0.77
5                   0.04   0.12       1.00
10                  0.04   0.15       0.86
15                  0.04   0.16       0.77

We do acknowledge, however, that the procedure we have applied is not likely to be the most efficient nor the most robust to microstructure noise (see the aforementioned references for approaches that could be more efficient and robust). However, the simplicity and ubiquity of the realised correlation estimator is appealing and it allows us to make contact with asymptotic Gaussian sampling properties as discussed in the introduction. The main focus of the present article is to explore the phenomenological modelling of ensembles of correlation matrices and not particular correlation estimators.
Indeed, the PIGM model can be just as readily applied to correlation matrices arising from other correlation estimators, some of which naturally handle the asynchronous nature of the data and do not require a choice of timescale, e.g. [29,35]. Our results may well generalise to a broader class of correlation estimators. Investigating the degree to which the results generalise to other correlation estimators is an area for future research.

4. Matrix theory and matrix data: near-Gaussianity

In this section we apply the Gaussian model of Section 2 to the ensemble of correlation matrices defined in Section 3. We show that the vast majority of cubic and quartic observables closely match the predictions of the Gaussian theory. We form a compact representation of the original correlation matrix data by defining a vector of observables for each correlation matrix. The observable vectors are shown to perform well in a variety of anomaly detection and similarity ranking tasks in Section 5. Optimal performance is achieved in these tasks by constructing observable vectors from the least Gaussian observables.

4.1. Theory/experiment deviations normalised by standard deviations of the observables

We begin by elucidating some empirical statistical properties of the observables — the permutation invariant polynomials of the correlation matrix elements. The empirical distributions of these observables typically exhibit a right/positive skew along with heavier tails than the normal distribution. It is also noteworthy that all estimated Pearson product-moment correlations between the observables are positive and that most correlations are very strongly positive (see the supplementary material for a plot of these correlations). The strength of the correlations is particularly relevant in our choice of statistical distance measure in the anomaly detection analysis presented in Section 5.1. Indeed, this motivated utilising the Mahalanobis distance, which typically performs well even in the presence of such correlations. In addition, the fact that the empirical distributions of the observables are mostly unimodal suggests this distance measure is appropriate.

Equipped with the Gaussian model and its solution, given in Appendix B, and the financial data described in Section 3, we now perform a variety of tests to assess how well this model describes the statistics of the data. Firstly, we calculate the normalised absolute error

\Delta_\alpha = \frac{ \left| \langle \mathcal{O}_\alpha \rangle_{\mathrm{T}} - \langle \mathcal{O}_\alpha \rangle_{\mathrm{E}} \right| }{ \sigma_{E,\alpha} } ,   (4.1)

between the experimental cubic and quartic observable average values

\langle \mathcal{O}_\alpha \rangle_{\mathrm{E}} = \frac{1}{N_D} \sum_{A=1}^{N_D} \mathcal{O}_\alpha\!\left( \hat{\rho}^{ij}_A \right) ,   (4.2)

and the Gaussian model's prediction of those expectation values ⟨𝒪_α⟩_T. In both equations α labels the observable, and in (4.1) we have normalised by the standard deviation of the experimental observable values.

As we argued in Section 2, we expect these normalised errors to be small where the underlying data is approximately Gaussian. The normalised absolute errors for each observable with a sampling timescale of 1, 5, 10, and 15 min are listed in the third, fourth, fifth, and sixth columns of Table 4 respectively.
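The normalised absolute errors of Eqs. (4.1)–(4.2) are simple to compute once each observable has been evaluated on every day's correlation matrix. A minimal sketch, with illustrative names of our own choosing, is:

```python
import numpy as np

def normalised_absolute_errors(obs_values: np.ndarray,
                               theory_means: np.ndarray) -> np.ndarray:
    """Delta_alpha of Eq. (4.1).

    `obs_values` has shape (n_days, n_observables): each permutation invariant
    observable evaluated on each day's correlation matrix.  `theory_means`
    holds the corresponding predictions of the fitted Gaussian model."""
    emp_means = obs_values.mean(axis=0)          # Eq. (4.2)
    emp_stds = obs_values.std(axis=0, ddof=1)    # sigma_{E,alpha}
    return np.abs(theory_means - emp_means) / emp_stds
```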
In general these normalised absolute errors are in very good agreement: at each timescale 4 or fewer observables differ from the theoretical prediction by more than 1 standard deviation, and the average normalised absolute error of the cubic and quartic observables is at most 0.43 standard deviations. This is evidence for Gaussianity in the permutation invariant sector of the FX-rate correlation matrix data. The fact that the absolute errors are small across the range of sampling timescales considered, from 1 min up to 15 min, suggests that this is a robust feature of the data and not merely an artefact of the choice of a particular sampling timescale. We have chosen to use the 5 min data for the ratio of the experimental and theoretical standard deviation as well as the subsequent statistical analyses.

The model defined in Appendix B can be used to predict the standard deviation of cubic and quartic observables. We call this the theoretical standard deviation, and define it for each observable α as

\sigma_{T,\alpha} \equiv \sqrt{ \left| \langle (\mathcal{O}_\alpha)^2 \rangle_{\mathrm{T}} - \langle \mathcal{O}_\alpha \rangle_{\mathrm{T}}^2 \right| } .   (4.3)

In itself this is an interesting quantity to compare to the experimental observable standard deviations σ_{E,α}. The ratio of the two standard deviations is shown for each observable in the seventh column of Table 4 for data sampled at 5 min intervals. The values of σ_{T,3} and σ_{T,22} provided by the model are much smaller than the observed values. This is consistent with the absolute errors in Table 4, where we see the expectation values of these observables deviating the most from the model. It is these large deviations from Gaussianity that lend these observables their power in the construction of lower-dimensional representations of the correlation matrices (see Section 5.1.3).

We briefly note an alternative approach to estimating the theoretical standard deviation, also employed in [4] to give good theoretical predictions of the experimental standard deviations of observables. This estimate is obtained by calculating the absolute difference between ⟨𝒪_α⟩_T and ⟨𝒪_α⟩_T', where ⟨𝒪_α⟩_T' is the expectation value evaluated with the quadratic couplings that parametrise the model shifted by one standard deviation. Taking the average of this difference over all 8 possible sign combinations for the shifts of the 3 parameters gives us our estimate of the standard deviation. This method was used to estimate the standard deviations of 𝒪_12 and 𝒪_19 due to the prohibitive computational demands of calculating the octic expectation values ⟨𝒪_12²⟩ and ⟨𝒪_19²⟩.

4.2. Absolute errors relative to standard deviations and standard errors of observables

Thus far, we have analysed the differences between theoretical observable mean values and experimental observable mean values, normalised by the experimental standard deviation of the observable values. We have found that 27 out of 31 observables deviate by less than 1 experimental standard deviation.³ These normalised differences have a physical interpretation. Small normalised differences in this case are suggestive of small coupling constants for higher order corrections in the action (see Section 2), which is evidence for near-Gaussianity of the experimental data generating process.
Furthermore, we have found that "physical" tests of the theoretical model versus experiment, such as calculating the proportion of days captured and the balanced accuracy of a typicality classifier, provide additional evidence for the pure Gaussian model being a good approximation (see section 3 of the supplementary material).

Another possible choice for normalising the differences between the theoretical observable mean values and the experimental observable mean values is the experimental standard error. The standard error in this case is the standard deviation of the experimental observable mean values. We denote the standard error σ_x̄. Given the definition of the mean estimator, i.e. x̄ = (1/n) ∑_{i=1}^{n} x_i, the standard error is equal to the standard deviation of the permutation invariant polynomial values for each matrix, σ_E, divided by √n, i.e. σ_x̄ = σ_E/√n. The standard error is useful in determining whether the differences between the theoretical observable mean values and experimental observable mean values are plausibly due to sampling variation. The larger the sample size, the smaller the departures between theory and experiment that can be distinguished from sampling variation, i.e. "experimental error".

³ The four observables that differ by more than one standard deviation are 𝒪_3, 𝒪_17, 𝒪_19 and 𝒪_22.

Table 4
For each observable in the first two columns, the third, fourth, fifth and sixth columns list the absolute difference between the experimental value and the theoretical prediction, normalised by the experimental standard deviation, for data sampled at 1, 5, 10 and 15 min intervals respectively. The seventh column lists the ratio of the experimental and theoretical standard deviations for data sampled at 5 min intervals.
Label Observable ๐›ฅ(1) ๐›ผ ๐›ฅ(5) ๐›ผ ๐›ฅ(10) ๐›ผ ๐›ฅ(15) ๐›ผ ๐œŽ๐ธโˆ•๐œŽ๐‘‡ ๎ˆป1 โˆ‘ ๐‘–,๐‘— ๏ฟฝฬ‚๏ฟฝ 3 ๐‘–๐‘— 0.09 0.02 0.07 0.10 0.90 ๎ˆป2 โˆ‘ ๐‘–,๐‘—,๐‘˜ ๏ฟฝฬ‚๏ฟฝ2๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ 0.43 0.33 0.28 0.23 1.49 ๎ˆป3 โˆ‘ ๐‘–,๐‘—,๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘– 2.02 2.04 1.98 1.83 9.39 ๎ˆป4 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ2๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ 0.05 0.01 0.00 0.01 1.06 ๎ˆป5 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ 0.96 0.97 0.96 0.90 3.36 ๎ˆป6 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘™ 0.36 0.33 0.33 0.28 0.73 ๎ˆป7 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘š 0.11 0.12 0.12 0.13 1.05 ๎ˆป8 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š,๐‘› ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘š๐‘› 0.01 0.01 0.02 0.02 0.65 ๎ˆป9 โˆ‘ ๐‘–,๐‘— ๏ฟฝฬ‚๏ฟฝ4๐‘–๐‘— 0.16 0.54 0.67 0.69 1.10 ๎ˆป10 โˆ‘ ๐‘–,๐‘—,๐‘˜ ๏ฟฝฬ‚๏ฟฝ2๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ 2 ๐‘—๐‘˜ 0.49 0.42 0.38 0.33 2.42 ๎ˆป11 โˆ‘ ๐‘–,๐‘—,๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ 3 ๐‘—๐‘˜ 0.09 0.05 0.12 0.16 1.30 ๎ˆป12 โˆ‘ ๐‘–,๐‘—,๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘˜ ๏ฟฝฬ‚๏ฟฝ 2 ๐‘—๐‘˜ 0.90 0.88 0.85 0.79 3.27a ๎ˆป13 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘— ๏ฟฝฬ‚๏ฟฝ 2 ๐‘™๐‘— 0.23 0.19 0.17 0.15 1.67 ๎ˆป14 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ 3 ๐‘˜๐‘™ 0.12 0.04 0.00 0.03 0.64 ๎ˆป15 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ 2 ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ 0.25 0.21 0.17 0.14 1.07 ๎ˆป16 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ 2 ๐‘˜๐‘™ 0.53 0.49 0.45 0.41 2.45 ๎ˆป17 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ 1.01 1.00 0.96 0.88 7.44 ๎ˆป18 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ2๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ 2 ๐‘˜๐‘™ 0.08 0.07 0.08 0.08 2.06 ๎ˆป19 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘– 1.21 1.24 1.20 1.08 8.57a ๎ˆป20 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘š๐‘˜ 0.37 0.36 0.36 0.29 0.79 ๎ˆป21 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘š๐‘˜ 0.40 0.39 0.37 0.33 1.67 ๎ˆป22 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘š๐‘˜ 1.25 1.31 1.27 1.15 10.1 ๎ˆป23 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š ๏ฟฝฬ‚๏ฟฝ2๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘š 0.07 0.05 0.05 0.05 1.63 ๎ˆป24 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ2๐‘™๐‘š 0.31 0.24 0.20 0.16 0.81 ๎ˆป25 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘š 0.79 0.79 0.78 0.71 5.98 ๎ˆป26 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š,๐‘› ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘› 0.14 0.12 0.12 0.08 0.54 ๎ˆป27 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š,๐‘› ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘› 0.10 0.10 0.11 0.12 1.47 ๎ˆป28 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š,๐‘› ๏ฟฝฬ‚๏ฟฝ2๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘š๐‘› 0.06 0.02 0.00 0.00 0.58 ๎ˆป29 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š,๐‘› ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘š๐‘› 0.65 0.65 0.64 0.57 2.10 ๎ˆป30 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š,๐‘›,๐‘œ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘—๐‘˜ ๏ฟฝฬ‚๏ฟฝ๐‘™๐‘š ๏ฟฝฬ‚๏ฟฝ๐‘›๐‘œ 0.13 0.13 0.13 0.13 0.67 ๎ˆป31 โˆ‘ ๐‘–,๐‘—,๐‘˜,๐‘™,๐‘š,๐‘›,๐‘œ,๐‘ ๏ฟฝฬ‚๏ฟฝ๐‘–๐‘— ๏ฟฝฬ‚๏ฟฝ๐‘˜๐‘™ ๏ฟฝฬ‚๏ฟฝ๐‘š๐‘› ๏ฟฝฬ‚๏ฟฝ๐‘œ๐‘ 0.02 0.03 0.03 0.04 0.38 Average 0.43 0.42 0.41 0.38 2.50 a Values were obtained 
using an estimate of ๐œŽT described at the end of Section 4.1. rrorโ€™โ€™. In particular, genuine departures are sizeable in comparison to standard errors e.g. bigger than 3 standard errors. Such epartures can be interpreted as highly statistically significant differences. In our data set, which has a fairly large sample size of 446, we have observed that 13 out of 31 observables have a difference etween the theoretical observable mean and experimental observable mean value of fewer than 3 standard errors (i.e. 18 out of 1 observables exhibit a departure of more than 3 standard errors). To further explore the statistical significance of differences in he theoretical versus experimental observable mean values we have also calculated the percentile bootstrap confidence intervals f the experimental observable mean values. The bootstrap procedure involved re-sampling from the original set of observable alues, uniformly with replacement, to construct 1000 bootstrap samples of the same size as the original sample (i.e. 446). The ean of each such bootstrap sample was then calculated. Given the asymptotic normality of the mean estimator, it is expected that 9.7th percentile bootstrap confidence intervals will approximately correspond to 3 standard errors on either side of the original xperimental mean estimate. This is reflected in our results, which reveal 12 of 31 observables with theoretical observable mean alues within this confidence interval and 19 of 31 observables with theoretical mean values outside the interval. This is in close greement with the aforementioned basic standard error results where we had 13 of 31 observables with theoretical observable ean values lying within 3 standard errors of the experimental observable mean values. It is also worth noting that statistically significant differences may nonetheless be small in terms of relative error, which we ecall is defined as |โŸจ๎ˆป๐›ผโŸฉT โˆ’ โŸจ๎ˆป๐›ผโŸฉE| โŸจ๎ˆป๐›ผโŸฉE . (4.4) ndeed we have observed that 20 out of 31 observables have a relative error of less than 30% when comparing theoretical versus xperimental mean observable values. The key point from this section is that when we consider the absolute error of observables in comparison to the standard eviation, a measure motivated by consideration of perturbative corrections to the toy Gaussian model, we have 27 of 31 observables hich are within 1 standard deviation (all are within 3 standard deviations). On the other hand 6 of 31 are within 1 standard error 13 are within 3 standard errors). This suggests that developing computations of expectation values in theoretical models which 11 Physica A: Statistical Mechanics and its Applications 651 (2024) 130015G. Barnes et al. p m l s v t i t 5 d t t a t t N s w w r t 5 d v ( T v w contain small cubic and quartic terms, as guided by the data, is likely to give statistically significant improvement, given our current sample sizes, of the agreement between theoretical and experimental expectation values of observables. This is technically more intricate than computing in the Gaussian model and is left for future investigation. 5. Applications of matrix theory to matrix data: anomaly detection and similarity measures In establishing approximate Gaussianity in the matrix data of correlation matrices, observable vectors, formed using lists of ermutation invariant polynomial functions labelled by graphs, provide the key bridge between permutation invariant Gaussian atrix theory and the matrix data. 
The observable vectors associated with the correlation matrices can be regarded as particular, lower-dimensional representations of the correlation matrices. The observable vectors themselves are random vectors, for which the statistics entailed by the PIGM model are a good approximation in general. A natural question to ask is whether the observable vectors provide a more compact representation of correlation matrices which accentuates statistical ‘‘signal’’ in the data as opposed to noise. Such a representation would be closely linked to an accurate characterisation of the market state, and applications could include classification/regression models, clustering analysis, anomaly/outlier detection, etc. In this section, we consider two tasks: anomaly detection and similarity measurement, which we describe in more detail in Sections 5.1 and 5.2 respectively.

5.1. Anomaly detection and economically significant dates

In this section we demonstrate that the observable vectors do indeed constitute a promising representation for anomaly/outlier detection. The task of anomaly/outlier detection pertains to identifying observations that differ significantly from the majority of the data set. In our context, we seek to identify unusual and noteworthy observable vectors, each of which is associated with the correlation matrix of a particular date. To verify that a meaningful result has been obtained, we need a notion of unusual and noteworthy dates in the forex market as a reference. The natural approach we take is to consider special dates in the forex trading calendar corresponding to the highest impact economic news announcements. These announcements often lead to a flurry of trading activity along with associated price movements, market volatility and changes in the relationships between currency pairs. Note that the main purpose of our analysis is to establish whether the observable space has physically/economically meaningful structure (thus constituting a good representation) rather than to introduce a state-of-the-art anomaly detection algorithm, which would require significant tuning and comparison to several existing alternatives. As such, we combine the observable vectors with a straightforward, parameter-free, distance-based approach for anomaly detection. We only compare the observable vector representation to the D(D − 1)/2-dimensional vectors formed by stacking the daily correlation matrix elements, which we refer to as the raw correlation representation.

5.1.1. Anomaly detection algorithm

A common approach to detecting anomalous/outlier observations is to use a statistical distance measure to determine the distance of each random vector from the mean vector. For centred data, one can equivalently think of this as determining the length of the random vectors as measured from the origin. We use the Mahalanobis distance measure to assess these distances (see [36] for example). These distances directly correspond to anomaly scores, allowing the vectors to be ordered by anomalousness. The Mahalanobis distance better treats the case of highly correlated elements in the random vectors, as is the case for the observable vectors (as noted in Section 4.1). Concretely, given a multivariate probability distribution F on R^N (i.e.
generating random vectors y ∈ R^N), with mean vector μ and covariance matrix Σ, the Mahalanobis distance of a point y ∈ R^N from the mean μ is defined as

d(y, μ) = √((y − μ)^T Σ^{−1} (y − μ)), (5.1)

where μ_i = E(y_i) and Σ_ij = E[(y_i − μ_i)(y_j − μ_j)]. In the context of anomaly detection, the distribution F may be regarded as the distribution of the non-anomalous points.

One potential issue in practice is that the covariance matrix estimate used in the calculation above could be ill-conditioned, possibly leading to numerical issues. Another potential issue is that all points being evaluated contribute to the mean and covariance estimates in the Mahalanobis distance calculation. In particular, both ordinary points and those points that are, in fact, anomalous affect the mean and covariance estimates. We have attempted to improve the situation by excluding each point in turn from the estimate of the mean and covariance before evaluating the distance of this point from the mean. We have found that this simple and easy-to-interpret adjustment gives better anomaly detection results. While more sophisticated approaches exist in the literature based on the robust estimation of the mean and covariance matrix, these are more complicated and require further choices to be made. It is also noteworthy that we have found that the procedure of excluding each point in turn from the mean and covariance estimates yields essentially the same result as using the average distance to all other points as an anomaly score (i.e. the average k-nearest neighbours approach with k set to the number of other points in the data set). The average k-NN approach using all other points in the data set is known to be effective in detecting anomalous points that are on the boundaries of the data set [36], whilst not being able to detect isolated points in the interior of the data set, which we regard as a more subtle type of anomaly for future study.

Fig. 2. Distances of observable vectors and raw correlation vectors from the origin using the standardised Euclidean and Mahalanobis metrics.

5.1.2. Economically significant dates

It is well known amongst forex trading practitioners that there are certain currencies and certain types of economic announcements that typically have the highest impact on the forex markets (see [37] for example). These currencies are associated with the countries or blocs with the largest economies, namely the US Dollar, Chinese Renminbi, Japanese Yen, European Union Euro and Great British Pound. Four of the most important classes of economic announcements are described below.

• Central bank meetings and announcements relating to interest rate decisions etc. These include the FOMC (Federal Open Market Committee), ECB (European Central Bank), BoE (Bank of England), PBoC (People's Bank of China) and BoJ (Bank of Japan) meetings associated with the United States, European Union, Great Britain, China and Japan respectively.
• Unemployment data releases. One of the most important examples of this is the US Non-Farm Payrolls release.
• Consumer price index releases. The most important release in this category is the US consumer price index release.
• Unplanned forex news including special central bank meetings and speeches, political speeches etc.
In our subsequent analyses we utilise the economic calendar of high impact events sourced from [38] and filter for only those events pertaining to the aforementioned currencies (and associated economies) and economic announcements specifically. The exact strings used for filtering the events based on event name are: ‘‘ECB Press Conference’’, ‘‘BoE MPC’’, ‘‘FOMC Press Conference’’, ‘‘BoJ Press Conference’’, ‘‘PBoC Interest Rate Decision’’, ‘‘Nonfarm Payrolls’’, ‘‘Consumer Price Index ex Food & Energy’’, ‘‘European Council Meeting’’, ‘‘EU Leaders Special Summit’’ and ‘‘ECB Special Strategy Meeting’’. During the period 2020-04-01 to 2022-01-31, one or more of these high impact events occurred on approximately 27% of business days.

5.1.3. Dimensionality reduction

The construction of lower-dimensional representations of the correlation matrices – namely the observable vectors – is effectively a dimensionality reduction procedure. As is common with such procedures (e.g. Principal Component Analysis, PCA), there is a trade-off between reducing dimensionality and preserving information content. Balancing these trade-offs through a good choice of the number of components often leads to better results in various applications. In PCA, the cumulative variance of the first principal components is typically used as an organising quantity to select how many such components to include. In our case, we take the normalised magnitude of the difference between the empirical expectation values of the cubic and quartic observables and the theoretical predictions of the PIGM model (2.28) as an organising quantity for determining which observables to retain. The thesis is that the empirical higher order observables that depart from theoretical expectations indicate additional information beyond the linear and quadratic structure encoded in the PIGM model. We have empirically determined that the 12 ‘‘least Gaussian’’ observables yield optimal anomaly detection results (as measured by the median odds-ratio).

Table 5
The 12 cubic and quartic observables that have the largest normalised difference from the PIGM predictions.

Observable label   Observable definition   Observable order
๎ˆป3    ∑_{i,j,k} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_jk ๏ฟฝฬ‚๏ฟฝ_ki   Cubic
๎ˆป5    ∑_{i,j,k,l} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_jk ๏ฟฝฬ‚๏ฟฝ_kl   Cubic
๎ˆป9    ∑_{i,j} ๏ฟฝฬ‚๏ฟฝ⁴_ij   Quartic
๎ˆป10   ∑_{i,j,k} ๏ฟฝฬ‚๏ฟฝ²_ij ๏ฟฝฬ‚๏ฟฝ²_jk   Quartic
๎ˆป12   ∑_{i,j,k} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_ik ๏ฟฝฬ‚๏ฟฝ²_jk   Quartic
๎ˆป16   ∑_{i,j,k,l} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_jk ๏ฟฝฬ‚๏ฟฝ²_kl   Quartic
๎ˆป17   ∑_{i,j,k,l} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_jk ๏ฟฝฬ‚๏ฟฝ_ik ๏ฟฝฬ‚๏ฟฝ_kl   Quartic
๎ˆป19   ∑_{i,j,k,l} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_jk ๏ฟฝฬ‚๏ฟฝ_kl ๏ฟฝฬ‚๏ฟฝ_li   Quartic
๎ˆป21   ∑_{i,j,k,l,m} ๏ฟฝฬ‚๏ฟฝ_il ๏ฟฝฬ‚๏ฟฝ_jk ๏ฟฝฬ‚๏ฟฝ_lk ๏ฟฝฬ‚๏ฟฝ_mk   Quartic
๎ˆป22   ∑_{i,j,k,l,m} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_kl ๏ฟฝฬ‚๏ฟฝ_lm ๏ฟฝฬ‚๏ฟฝ_mk   Quartic
๎ˆป25   ∑_{i,j,k,l,m} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_jk ๏ฟฝฬ‚๏ฟฝ_kl ๏ฟฝฬ‚๏ฟฝ_lm   Quartic
๎ˆป29   ∑_{i,j,k,l,m,n} ๏ฟฝฬ‚๏ฟฝ_ij ๏ฟฝฬ‚๏ฟฝ_kl ๏ฟฝฬ‚๏ฟฝ_lm ๏ฟฝฬ‚๏ฟฝ_mn   Quartic
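The selection of the ‘‘least Gaussian’’ observables can be summarised in a short sketch. The normalisation used below, dividing the theory–data gap in each observable mean by the standard error of the experimental mean, is our illustrative assumption standing in for the normalised difference described above; the function and array names are hypothetical.

import numpy as np

def least_gaussian_observables(emp_means, theo_means, emp_std_errs, k=12):
    """Rank cubic/quartic observables by the normalised departure of the
    PIGM prediction from the empirical mean and return the top-k indices.

    emp_means, theo_means, emp_std_errs: 1-D arrays, one entry per observable.
    """
    departure = np.abs(emp_means - theo_means) / emp_std_errs
    order = np.argsort(departure)[::-1]   # largest departure first
    return order[:k], departure[order[:k]]

# Hypothetical example with 23 cubic and quartic observables.
rng = np.random.default_rng(0)
emp = rng.normal(1.0, 0.3, size=23)
theo = emp + rng.normal(0.0, 0.05, size=23)
se = np.full(23, 0.02)
top12, scores = least_gaussian_observables(emp, theo, se, k=12)
print(top12)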
Notably, the results broadly improve as more observables are added starting from a small number of observables, reach a peak, and then decline somewhat as more observables are added. These ‘‘least Gaussian’’ observables are listed in Table 5. We have also observed, however, that useful information remains in the other observables. The cubic and quartic observables that are best predicted by the PIGM model still have reasonable effectiveness in anomaly detection, for example, as do random subsets of observables and the complete set of observables. This aligns with our overarching finding that the PIGM model is a good fit overall and captures meaningful statistical structure. There does, however, appear to be additional information captured in the least well fit observables, as supported by the results below.

5.1.4. Longest observable vectors and economically significant dates

We assess how strongly the lengths of the observable vectors constructed from the observables in Table 5 are associated with the presence or absence of economically significant events. To investigate the utility of the observable representation, we compare to the results obtained for the original correlation matrices, applying both standardised Euclidean and Mahalanobis distance measures. An interesting aside is that the Mahalanobis distance on the original correlation matrices can be regarded as a soft-threshold version of anomaly detection based on evaluating the PCA reconstruction error [36], thus constituting an interesting comparison involving PCA. The methodology is as follows.

1. Calculate the standardised Euclidean and Mahalanobis vector lengths for the observable vector associated with each date in the dataset (representing distance from the mean observable vector, or equivalently the origin in this case). The maximal dimension of the vector space considered here is N = 31, while the optimal number of least Gaussian observables, as stated earlier, is 12.
2. Calculate the standardised Euclidean and Mahalanobis vector lengths for the correlation feature vector associated with each date in the dataset. The correlation feature vector for each date is comprised of the 171 pairwise correlations calculated between all 19 currency pairs. We term these features raw correlations.
3. Rank the dates in the dataset by Euclidean and Mahalanobis vector length, in descending order, for the observable vectors and for the raw correlation feature vectors.
4. Assess whether the top 25, 50 and 100 dates have a statistically significantly higher number of economically significant events than the bottom 25, 50 and 100 dates, with dates ordered by distance in descending order. In addition, we calculate the ratio between the odds of observing an economically significant news event in the top/most anomalous 25, 50, 100 dates and the odds of such an event occurring in the bottom/most typical 25, 50, 100 dates. This odds-ratio (OR) is defined as

OR = [P_T/(1 − P_T)] / [P_B/(1 − P_B)], (5.2)

where P_B corresponds to the proportion of the 25, 50, 100 closest dates to the origin that are associated with economically significant events. Similarly, P_T corresponds to the proportion of the 25, 50, 100 furthest dates from the origin that are associated with economically significant events.

The distances for the respective metrics and features are presented in Fig. 2. Notably, the observable feature vectors appear to have more distinct outlier days and less noise.
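The pipeline in steps 1–4 can be sketched as follows. This is a minimal illustration under our own naming, with the leave-one-out mean/covariance adjustment of Section 5.1.1, a pseudo-inverse as a guard against an ill-conditioned covariance estimate, and a boolean indicator array for event days; it assumes the top/bottom proportions lie strictly between 0 and 1 and is not the authors' code.

import numpy as np
from scipy.stats import fisher_exact

def loo_mahalanobis_lengths(X):
    """Leave-one-out Mahalanobis length of each row of X (n_dates x n_features).

    For each date, the mean and covariance are estimated from all other dates,
    as described in Section 5.1.1.
    """
    n = X.shape[0]
    d = np.empty(n)
    for i in range(n):
        rest = np.delete(X, i, axis=0)
        mu = rest.mean(axis=0)
        cov_inv = np.linalg.pinv(np.cov(rest, rowvar=False))
        diff = X[i] - mu
        d[i] = np.sqrt(diff @ cov_inv @ diff)
    return d

def odds_ratio_and_pvalue(lengths, is_event_day, subset_size=50):
    """Compare event frequency in the most vs least anomalous dates, as in Eq. (5.2)."""
    order = np.argsort(lengths)                       # ascending distance
    bottom, top = order[:subset_size], order[-subset_size:]
    p_b = is_event_day[bottom].mean()
    p_t = is_event_day[top].mean()
    OR = (p_t / (1 - p_t)) / (p_b / (1 - p_b))
    table = [[is_event_day[top].sum(), subset_size - is_event_day[top].sum()],
             [is_event_day[bottom].sum(), subset_size - is_event_day[bottom].sum()]]
    _, p_value = fisher_exact(table, alternative="greater")   # one-sided test
    return OR, p_value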
The results of comparing the top 25, 50, 100 dates by distance from the origin with the bottom 25, 50, 100 dates respectively are collected in Table 6. The best contrast between the number of economic events appearing in the furthest days from the origin and in the closest days to the origin is given by the Mahalanobis distance evaluated on observable vectors. Indeed, for the Mahalanobis distance on observable vectors, there is a higher degree of statistical significance (i.e. smaller p-values) and higher odds-ratios than for the other combinations of metric and features in almost all cases. The high odds-ratios imply that the odds of an economically significant event occurring in the anomalous groups (most distant) are much higher than the odds in the typical groups (least distant). This provides evidence that the observable vectors are a good, low-dimensional characterisation of the market state and accentuate meaningful financial ‘‘signal’’.

Table 6
In-sample anomaly detection results for 5 min sampled data. In the table below, the proportions P_B, P_T and the odds-ratio, OR, are as defined in Eq. (5.2). The p-value is obtained using Fisher's exact one-sided test. The * symbol following a p-value indicates significance at the 0.05 level, ** indicates significance at the 0.01 level and *** indicates significance at the 0.001 level.

Metric        Features           Subset size   P_T    P_B    p-value        OR
Euclidean     Observables        25            0.40   0.20   1.1×10⁻¹       2.67
Euclidean     Observables        50            0.38   0.24   1.0×10⁻¹       1.94
Euclidean     Observables        100           0.38   0.28   8.8×10⁻²       1.58
Mahalanobis   Observables        25            0.44   0.16   3.1×10⁻²*      4.12
Mahalanobis   Observables        50            0.38   0.16   1.2×10⁻²*      3.22
Mahalanobis   Observables        100           0.39   0.13   2.1×10⁻⁵***    4.28
Euclidean     Raw Correlations   25            0.52   0.32   1.3×10⁻¹       2.30
Euclidean     Raw Correlations   50            0.44   0.28   7.2×10⁻²       2.02
Euclidean     Raw Correlations   100           0.39   0.22   6.8×10⁻³**     2.27
Mahalanobis   Raw Correlations   25            0.24   0.16   3.6×10⁻¹       1.66
Mahalanobis   Raw Correlations   50            0.32   0.14   2.8×10⁻²*      2.89
Mahalanobis   Raw Correlations   100           0.28   0.20   1.2×10⁻¹       1.56

Table 7
Out-of-sample anomaly detection results for 5 min sampled data. In the table below, the proportions P_B, P_T and the odds-ratio, OR, are as defined in Eq. (5.2). The p-value is obtained using Fisher's exact one-sided test. The * symbol following a p-value indicates significance at the 0.05 level, ** indicates significance at the 0.01 level and *** indicates significance at the 0.001 level.

Metric        Features           Subset size   P_T    P_B    p-value        OR
Euclidean     Observables        25            0.44   0.24   1.2×10⁻¹       2.49
Euclidean     Observables        50            0.44   0.24   2.8×10⁻²*      2.49
Euclidean     Observables        100           0.29   0.23   2.1×10⁻¹       1.37
Mahalanobis   Observables        25            0.56   0.16   3.6×10⁻³**     6.68
Mahalanobis   Observables        50            0.50   0.10   1.0×10⁻⁵***    9.00
Mahalanobis   Observables        100           0.37   0.12   3.0×10⁻⁵***    4.31
Euclidean     Raw Correlations   25            0.48   0.20   3.6×10⁻²*      3.69
Euclidean     Raw Correlations   50            0.48   0.16   5.6×10⁻⁴***    4.85
Euclidean     Raw Correlations   100           0.42   0.14   7.9×10⁻⁶***    4.45
Mahalanobis   Raw Correlations   25            0.28   0.16   2.5×10⁻¹       2.04
Mahalanobis   Raw Correlations   50            0.36   0.20   5.9×10⁻²       2.25
Mahalanobis   Raw Correlations   100           0.31   0.25   2.2×10⁻¹       1.35

An additional note relates to the correlation matrices and associated observable vectors for February 2022.
During this period, there were several extremely anomalous dates (with extreme vector lengths), coinciding with the beginning of the war in Ukraine. These had the effect of masking the anomalous nature of earlier events and reducing the sensitivity of the detection algorithm. This is a well known consequence of applying the Mahalanobis distance to anomaly detection, known as the masking effect [39]. We therefore excluded February 2022 from all our analyses.

The analysis conducted thus far can be regarded as pertaining to anomaly detection with an in-sample selection of the observables to include. We also conducted an out-of-sample analysis using the same observables as the in-sample analysis, now for the date range 2022-03-01 to 2023-03-31. The results are collected in Table 7 and reveal that the in-sample anomaly detection results generalise well, and the Mahalanobis distance calculated on observable vectors continues to out-perform the other alternatives. Robustness checks that were conducted include rerunning the analysis with correlation matrices constructed with 1, 10 and 15 min sampling intervals for the log returns (as opposed to 5 min). The results of these analyses were qualitatively similar to those already presented, with only the 1 min results being somewhat worse than the longer timescales; see section 5 of the supplementary material.

5.2. Visual similarity

Equipped with the observable vector representation of correlation matrices and the Mahalanobis distance metric, another interesting investigation is to assess how visually similar the correlation matrices are that correspond to the pairs of dates with the closest observable vectors, and to contrast this with the furthest pairs. The procedure involves calculating the Mahalanobis distance between each pair of dates, of which there are 446(445)/2 = 99,235 such pairs, and then sorting by distance in ascending order. The closest 10 pairs and most distant 10 pairs are visualised in figures provided in the supplementary material. It is readily apparent that the proximity of the dates in the space of observables aligns quite closely with the visual similarity or dissimilarity of the correlation matrices.

Finally, we quantitatively assess whether there is a concordance between the closeness of the visual representations of correlation matrices and the closeness of the observable vectors. In particular, we determine the Spearman correlation between the distance ranking of pairs of dates as determined in two ways,

1. Using the Euclidean distance to measure distance between pairs of correlation matrices as a proxy for visual similarity (this is essentially the widely used mean squared error metric for comparing image similarity, known to be closely related to perceived visual similarity).
2. Using the Mahalanobis distance between pairs of observable vectors.

The result is a Spearman correlation of approximately 0.7 across 99,235 pairs of dates. Spearman correlation coefficients of between 0.7 and 1.0 are regarded as very strongly positive, with 1.0 representing a perfect monotonic relationship. Using the exact Spearman correlation test, the p-value is smaller than 2.2 × 10⁻¹⁶ and the hypothesis of the Spearman correlation being zero can be strongly rejected in favour of the alternate hypothesis that the Spearman correlation is positive.
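As an illustration of this ranking comparison, the following sketch computes the two pairwise-distance vectors and their Spearman correlation. The array names are ours, and estimating the Mahalanobis covariance from the observable vectors themselves is our simplifying assumption; this is not the authors' code.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def visual_vs_observable_concordance(raw_corr_vectors, observable_vectors):
    """Spearman correlation between two pairwise-distance rankings of dates.

    raw_corr_vectors:   (n_dates, 171) stacked upper-triangular correlations,
                        compared with Euclidean distance as a visual-similarity proxy.
    observable_vectors: (n_dates, k) selected observable values per date,
                        compared with the Mahalanobis distance.
    """
    visual_proxy = pdist(raw_corr_vectors, metric="euclidean")
    cov_inv = np.linalg.pinv(np.cov(observable_vectors, rowvar=False))
    observable_dist = pdist(observable_vectors, metric="mahalanobis", VI=cov_inv)
    rho, p_value = spearmanr(visual_proxy, observable_dist)
    return rho, p_value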
This implies that despite the much lower dimensionality of the observable vectors, they do indeed capture the essence of what we would regard as visually similar correlation matrices.

6. Discussion and conclusion

We have developed the most general 4-parameter permutation invariant Gaussian matrix model appropriate for ensembles of matrices which are symmetric and diagonally vanishing, as suitable for financial correlation matrices. We have used the models to find near-Gaussianity in ensembles of matrices, one matrix for every day over a period, constructed from high-frequency foreign exchange price quotes. The near-Gaussianity was found to be robust against changes in how the ensemble was constructed: we varied the sampling intervals used to construct the daily estimates as well as the number of days used in our ensemble. The near-Gaussianity is used to motivate a data-reduction technique based on the use of low degree permutation invariant functions of matrices (observables) as characteristics of the entities represented by the matrices in the ensemble, in this case the days in the period under consideration. The small non-Gaussianities of each observable were used to rank the observables in order of decreasing non-Gaussianity and to find an optimal number of least Gaussian observables for data analysis. The degree of non-Gaussianity is thus being used as an analog of the magnitude of singular values in principal component analysis (PCA). The sets of observables considered, either the full set of observables up to quartic degree or the subsets with an optimal number of least Gaussian observables, are much smaller than the number of matrix elements in the matrices. We found successful results in anomaly detection based on the observables to find the most atypical and the most typical days in the ensemble. We demonstrated statistically significant matching between these typicality/atypicality results extracted from the data of financial correlation matrices and corresponding results based on human economic judgement of significant events affecting foreign exchange markets. We propose that the success of the use of a set of least Gaussian observables in anomaly detection should be interpreted as indicating that these ensembles of daily foreign exchange matrices capture an economic reality best described by the Gaussian model perturbed by specific small cubic and quartic couplings in the action. The non-Gaussianities capture system-specific non-universalities while the overall approximate Gaussianity is a universal characteristic which holds across diverse systems, as indeed already evidenced in ensembles of words [3,4]. Related discussions of Gaussianity, universality and non-universality appear in the random matrix theory literature [40–42].

We have thus explored two hypotheses, which are, in principle, independent. The first hypothesis is that the daily correlation matrices estimated from the forex data are well described by the PIGM model. This is supported by Table 4 for our data set. The second hypothesis is that small departures from Gaussianity, as revealed by the PIGM model, are physically/economically meaningful characteristics of the day. We have gathered evidence in favour of this hypothesis on our particular data set in Section 5.1, see Tables 6 and 7 in particular.

Future directions include evaluating how the parameters of the PIGM model and our other key results change as the details of the correlation calculation methodology as well as the particular FX data set used for the analysis change.
This should help definitively answer the question of whether our results capture intrinsic properties of the FX market or are more specific to a particular data set and correlation estimation procedure. Indeed, this includes investigating how non-stationarity in the FX market data affects the fit of the PIGM model and whether the aforementioned hypotheses survive different market regimes. Another interesting avenue of exploration, related to statistical methodology, is the investigation of the impact of using maximum likelihood estimation in place of the method of moments approach that was utilised in fitting the PIGM model in this paper. While we expect the approaches to yield qualitatively similar results, there could be subtle differences.

A distinct research direction involves studying the effect of different choices of time-scale for constructing the ensemble of correlation matrices, e.g. one hour, twelve hours, two days etc., on the linear and quadratic parameters. In addition, it should be illuminating to investigate whether the agreement of empirical observables with PIGM predictions (fit on linear and quadratic observables only) changes with correlation time-scale. Indeed, the evolution of the magnitude of cubic and quartic corrections with time-scale should assist in answering the question of whether the non-Gaussianity increases or decreases with time-scale.

There are also several practical directions to pursue with particular relevance to finance. The first involves clustering the correlation matrices in the observable vector representation and elucidating how the cluster structure changes with correlation time-scale. This should provide insight into the states of the market at different time-scales and how they relate to each other (along the lines of [43,44] for example). Modelling transitions between market states is a related application. A further area of interest is to study the effectiveness of so-called nowcasting of the market state, where the market state is identified as it is formed. These applications could have great utility in risk management for example. Assessing the effectiveness of the PIGM observables as features in machine learning algorithms for predicting quantities such as future price returns would inform viability in trading strategy applications.

While we have studied forex data in this article, another future direction is to apply the PIGM model to stock market data. We expect that some results will be similar and differences would be particularly illuminating.

Finally, generating realistic, random samples of financial correlation matrices has been recognised as having value in a number of contexts including enhancing trading strategies and stress testing portfolios of assets [13]. We are presently investigating drawing samples from the PIGM model. In this case we do not expect sampled matrices to be positive-semidefinite in general, where positive-semidefiniteness is a property of all valid correlation matrices. However, this is readily addressed using approaches such as those in [45]. We are also contrasting to existing approaches [13]. It will be fascinating to determine whether there is a trade-off between the exceptional parsimony of the PIGM model (i.e.
4 parameters) and the ability to capture the known stylised facts of financial correlation matrices, particularly in contrast to the approach in [13], which utilises a generative adversarial neural network with several orders of magnitude more parameters than the PIGM model.

CRediT authorship contribution statement

George Barnes: Writing – review & editing, Writing – original draft, Conceptualization. Sanjaye Ramgoolam: Writing – review & editing, Writing – original draft, Conceptualization. Michael Stephanou: Writing – review & editing, Writing – original draft, Conceptualization.

Data availability

The data and code have been shared via a link given in the article.

Statement and acknowledgements

The views expressed in this article are those of the authors and do not necessarily reflect the views of Rand Merchant Bank. Rand Merchant Bank does not make any representations or give any warranties as to the correctness, accuracy or completeness of the information presented; nor does Rand Merchant Bank assume liability for any losses arising from errors or omissions in the information in this article. We would like to thank Steve Abel, Manuel Accettuli Huber, Stephon Alexander, Adrian Bevan, Graham Brown, Adrian Padellaro, Mehrnoosh Sadrzadeh, Alex Stapleton and Steve Thomas for useful discussions on this project. We would also like to thank the reviewers for their very helpful suggestions. SR is supported by the STFC consolidated grant ST/P000754/1 ‘‘String Theory, Gauge Theory and Duality’’. SR acknowledges the support of the Institut Henri Poincaré (UAR 839 CNRS-Sorbonne Université) and LabEx CARMIN (ANR-10-LABX-59-01). SR also acknowledges the support of the Perimeter Institute for Theoretical Physics during a visit in the final stages of completion of the paper. Research at Perimeter Institute is supported by the Government of Canada through Industry Canada and by the Province of Ontario through the Ministry of Economic Development and Innovation. SR also acknowledges the hospitality of the high energy physics group of Brown University and a Visiting Professorship at the Dublin Institute for Advanced Studies during the completion of this project.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Currency pair mapping

See Table 8 for the mapping between currency pair indices and currency pair names.

Appendix B. 4-parameter Gaussian model: detailed construction

In this appendix we give a detailed account of the construction of the 4-parameter matrix model which was outlined in Section 2. The aim is to model the statistics of the particular ensemble of correlation matrices introduced in Section 3, as well as any matrix ensemble in the same universality class. This class is composed of symmetric square matrices with zeros along the diagonal, for which physical quantities are invariant under simultaneous permutations of the rows and columns. Given the universality of the model we label our matrices M and index them with lowercase indices 1 ≤ i, j ≤ D, to distinguish them from the financial correlation matrices specifically, which we label ๏ฟฝฬ‚๏ฟฝ. We will refer to matrices with zeros on the diagonal as ‘‘diagonally vanishing’’ throughout. We begin by defining the action of permutations on the matrix variables and establishing their irreducible decomposition under this group action.
We then define the most general action of a permutation invariant Gaussian matrix model, containing symmetric matrices with vanishing diagonal. It is parameterised by one linear and three quadratic couplings. Finding projectors that project from the original matrix basis ๐‘€๐‘–๐‘— to a basis transforming according to this irreducible decomposition allows us to rewrite this action in a diagonalised form. In turn, this diagonalisation permits the application of standard multi-dimensional Gaussian integration techniques which, along with the application of Wickโ€™s theorem, produce analytic formulae for the expectation values of observables 17 as a function of ๐ท โ€” the dimension of the matrices. Physica A: Statistical Mechanics and its Applications 651 (2024) 130015G. Barnes et al. W m m ๐ธ Table 8 Currency pair mapping. Index (I) Currency pair 1 AUD/JPY 2 AUD/NZD 3 AUD/USD 4 CAD/JPY 5 CHF/JPY 6 EUR/CHF 7 EUR/GBP 8 EUR/JPY 9 EUR/PLN 10 EUR/USD 11 GBP/JPY 12 GBP/USD 13 NZD/USD 14 USD/CAD 15 USD/CHF 16 USD/JPY 17 USD/MXN 18 USD/TRY 19 USD/ZAR B.1. Symmetric group representation theory and matrix variables The diagonal action of ๐‘†๐ท simultaneously permutes the rows and columns of a matrix ๐‘€๐‘–๐‘— โ†’ ๐‘€๐›พ(๐‘–)๐›พ(๐‘—), โˆ€ ๐›พ โˆˆ ๐‘†๐ท. (B.1) e wish to find an action ๐‘† (which defines the measure) that is invariant under the symmetric group action in Eq. (B.1) on the atrix variables. We wish to find an action that is invariant under this group action, and again refer to the space of symmetric atrices with vanishing diagonal as the physical subspace ๐‘‰ phys of general ๐ท ร— ๐ท matrices and label matrices in the space with a superscript i.e. ๐‘€phys โˆˆ ๐‘‰ phys. The natural representation of the symmetric group ๐‘‰๐ท, is a ๐ท-dimensional representation with basis {๐‘’1, ๐‘’2,โ€ฆ , ๐‘’๐ท}. For each permutation ๐›พ โˆˆ ๐‘†๐ท the linear operator acting on this space is defined by ๐œŒ๐‘‰๐ท (๐›พ)๐‘’๐‘– = ๐‘’๐›พโˆ’1(๐‘–). (B.2) The natural representation is a reducible representation and decomposes as ๐‘‰๐ท โ‰… ๐‘‰0 โŠ• ๐‘‰๐ป , (B.3) where ๐‘‰0 is the trivial representation and ๐‘‰๐ป is the Hook representation [46], both of which are irreducible. This decomposition is given explicitly by forming the following linear combinations of natural basis elements ๐ธ0 = 1 โˆš ๐ท (๐‘’1 + ๐‘’2 +โ‹ฏ + ๐‘’๐ท) , ๐ธ1 = 1 โˆš 2 (๐‘’1 โˆ’ ๐‘’2) , ๐ธ2 = 1 โˆš 6 (๐‘’1 + ๐‘’2 โˆ’ 2๐‘’3) , โ‹ฎ ๐ธ๐‘Ž = 1 โˆš ๐‘Ž(๐‘Ž + 1) (๐‘’1 + ๐‘’2 +โ‹ฏ + ๐‘’๐‘Ž โˆ’ ๐‘Ž๐‘’๐‘Ž+1) , โ‹ฎ ๐ธ๐ทโˆ’1 = 1 โˆš ๐ท(๐ท + 1) (๐‘’1 + ๐‘’2 +โ‹ฏ + ๐‘’๐ทโˆ’1 โˆ’ (๐ท โˆ’ 1)๐‘’๐ท) . (B.4) 0 is the trivial representation, and the remaining ๐ท โˆ’ 1 vectors {๐ธ1, ๐ธ2.โ€ฆ , ๐ธ๐ทโˆ’1} form an irreducible representation of ๐‘†๐ท which is ๐‘‰๐ป . A general matrix ๐‘€๐‘–๐‘— transforming under (B.1) forms the representation ๐‘‰๐ทโŠ—๐‘‰๐ท, the tensor product of two copies of the natural representation of ๐‘†๐ท. A natural basis for this tensor product is given by ๐‘’๐‘– โŠ— ๐‘’๐‘— , 1 โ‰ค ๐‘–, ๐‘— โ‰ค ๐ท, (B.5) on which ๐‘†๐ท acts as ๐œŒ โŠ—2 (๐›พ)๐‘’ โŠ— ๐‘’ = ๐‘’ โˆ’1 โŠ— ๐‘’ โˆ’1 . (B.6) 18 ๐‘‰๐ท ๐‘– ๐‘— ๐›พ (๐‘–) ๐›พ (๐‘—) Physica A: Statistical Mechanics and its Applications 651 (2024) 130015G. Barnes et al. U i i T c m {๐‘’1, ๐‘’2,โ€ฆ , ๐‘’๐ท} is an orthonormal basis of ๐‘‰๐ท under the inner product ( ๐‘’๐‘–, ๐‘’๐‘— ) = ๐›ฟ๐‘–๐‘— . 
(B.7) This inner product is extended to ๐‘‰๐ท โŠ— ๐‘‰๐ท as ( ๐‘’๐‘– โŠ— ๐‘’๐‘— , ๐‘’๐‘˜ โŠ— ๐‘’๐‘™ ) = ๐›ฟ๐‘–๐‘˜๐›ฟ๐‘—๐‘™ . (B.8) Using (B.7) we define the change of basis coefficients ๐ถ0,๐‘– โ‰ก (๐ธ0, ๐‘’๐‘–) = 1 โˆš ๐ท , (B.9) ๐ถ๐‘Ž,๐‘– โ‰ก (๐ธ๐‘Ž, ๐‘’๐‘–). (B.10) ๐‘‰๐ท โŠ— ๐‘‰๐ท is a reducible representation of ๐‘†๐ท which has the following irreducible decomposition ๐‘‰๐ท โŠ— ๐‘‰๐ท โ‰… 2๐‘‰0 โŠ• 3๐‘‰๐ป โŠ• ๐‘‰2 โŠ• ๐‘‰3. (B.11) This decomposition can be deduced using (B.3) along with the tensor product rule described in section 7.13 of [25]. Other useful references for symmetric group representation theory include [46,47]. In terms of Young diagrams, described by listing the number of columns in each row in descending order, the irreducible representations appearing on the right hand side of this decomposition are ๐‘‰0 = [๐ท], ๐‘‰๐ป = [๐ท โˆ’ 1, 1], ๐‘‰2 = [๐ท โˆ’ 2, 2], ๐‘‰3 = [๐ท โˆ’ 2, 1, 1]. (B.12) The space of symmetric matrices with vanishing diagonal form a subspace of the representations in (B.11). Firstly, symmetric matrices transform as Sym2(๐‘‰๐ท) i.e. the symmetric part of the product space ๐‘‰๐ท โŠ— ๐‘‰๐ท. This is a reducible representation with the following decomposition Sym2(๐‘‰๐ท) โ‰… 2๐‘‰0 โŠ• 2๐‘‰๐ป โŠ• ๐‘‰2. (B.13) The matrix elements along the diagonal {๐‘€๐‘–๐‘–|1 โ‰ค ๐‘– โ‰ค ๐ท} transform like the natural representation ๐‘‰๐ท. Removing a copy of ๐‘‰๐ท โ‰… ๐‘‰0 โŠ• ๐‘‰๐ป from the symmetric product of ๐‘‰๐ท in (B.13) gives the following decomposition of the physical subspace ๐‘‰ phys โ‰… Sym2(๐‘‰๐ท)โˆ•๐‘‰๐ท โ‰… ๐‘‰0 โŠ• ๐‘‰๐ป โŠ• ๐‘‰2. (B.14) The decomposition (B.14) tells us that the enforcement of permutation invariance on the action of a Gaussian theory containing sym- metric ๐ท dimensional matrices without diagonal permits a single independent linear term (i.e. the number of trivial representations appearing on the RHS). Quadratic products of physical matrices transform as ๐‘‰ phys โŠ— ๐‘‰ phys โ‰… ( ๐‘‰0 โŠ• ๐‘‰๐ป โŠ• ๐‘‰2 ) โŠ— ( ๐‘‰0 โŠ• ๐‘‰๐ป โŠ• ๐‘‰2 ) . (B.15) sing the orthogonality property of characters as well as the reality property of irreducible representations of the symmetric group, t can be shown that the tensor product of two irreducible representations contains the trivial representation if and only if the rreducible representations are identical โ€” and in this case the decomposition contains exactly one copy of the trivial representation. his enables us to count the number of independent quadratic terms. We find three independent quadratic contributions to the action orresponding to the following three terms in (B.15) ๐‘‰0 โŠ— ๐‘‰0 โ‰… ๐‘‰0 +โ‹ฏ , (B.16) ๐‘‰๐ป โŠ— ๐‘‰๐ป โ‰… ๐‘‰0 +โ‹ฏ , (B.17) ๐‘‰2 โŠ— ๐‘‰2 โ‰… ๐‘‰0 +โ‹ฏ . (B.18) Much of our task in calculating expectation values of observables of the physical model for symmetric diagonally vanishing atrices amounts to finding a change of basis for ๐‘‰๐ท โŠ—๐‘‰๐ท from the original ๐‘’๐‘– โŠ—๐‘’๐‘— to one which transforms in the same manner as the irreducible decomposition of ๐‘‰ phys, i.e. from the LHS of (B.14) to the RHS. Once found it diagonalises the physical action and consequently permits the calculation of expectation values of observables. The coefficients that define this change of basis are called Clebschโ€“Gordon coefficients. 
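For readers who wish to check the change of basis numerically, the following short Python sketch constructs the vectors E_0, …, E_{D−1} of (B.4), verifies their orthonormality, and confirms that the V_0 component built from the coefficients C_{0,i} of (B.9) is invariant under a simultaneous row and column permutation of a symmetric, diagonally vanishing matrix. This is our own illustration rather than code accompanying the paper.

import numpy as np

def representation_basis(D):
    """Rows are E_0, E_1, ..., E_{D-1} of Eq. (B.4) in the natural basis e_1, ..., e_D."""
    E = np.zeros((D, D))
    E[0] = np.ones(D) / np.sqrt(D)        # trivial representation V_0
    for a in range(1, D):
        v = np.zeros(D)
        v[:a] = 1.0
        v[a] = -a
        E[a] = v / np.sqrt(a * (a + 1))   # hook representation V_H
    return E

D = 5
E = representation_basis(D)
assert np.allclose(E @ E.T, np.eye(D))    # orthonormal change of basis

# Build a symmetric, diagonally vanishing matrix and permute rows and columns.
rng = np.random.default_rng(0)
A = rng.normal(size=(D, D))
M = A + A.T
np.fill_diagonal(M, 0.0)
perm = rng.permutation(D)
M_perm = M[np.ix_(perm, perm)]

# The V_0 x V_0 component, sum_{i,j} C_{0,i} C_{0,j} M_ij, is permutation invariant.
assert np.isclose(E[0] @ M @ E[0], E[0] @ M_perm @ E[0])
print("E_a basis is orthonormal; the V_0 component of M is permutation invariant.")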
For each irreducible representation on the RHS of (B.14) define ๐ถphys;๐‘‰0 ๐‘–๐‘— , ๐ถphys;๐‘‰๐ป ๐‘–๐‘—, ๐‘Ž , ๐ถphys;๐‘‰2 ๐‘–๐‘—, ๐‘Ž respectively, where ๐‘Ž is a state index running over the dimension of the irreducible representation. Note that if we only imposed the condition of symmetry ๐‘€๐‘–๐‘— = ๐‘€๐‘—๐‘– on the matrices, we would have two linear couplings corresponding to the two copies of ๐‘‰0 in (B.13). We would have 3 parameters of a 2ร—2 symmetric matrix of couplings for quadratic terms arising from the two copies of ๐‘‰0 in (B.13), 3 parameters of a 2 ร— 2 symmetric matrix of couplings for quadratic terms arising from the two copies of ๐‘‰๐ป in (B.13), and finally one parameter for ๐‘‰2. For symmetric matrices, therefore, there is a 9 parameter permutation invariant matrix model. We will focus, in the following, on the 4-parameter model which incorporates the symmetry 19 condition ๐‘€๐‘–๐‘— = ๐‘€๐‘—๐‘– as well as the condition of vanishing diagonal. Physica A: Statistical Mechanics and its Applications 651 (2024) 130015G. Barnes et al. U B.2. Projector for ๐‘‰ phys The projectors to the trivial representations appearing in the quadratic products of ๐‘€๐‘–๐‘— are given by squaring the relevant Clebsch coefficients and summing over intermediate states ๐‘ƒ phys;๐‘‰0 ๐‘–๐‘—,๐‘˜๐‘™ = ๐ถphys;๐‘‰0 ๐‘–๐‘— ๐ถphys;๐‘‰0 ๐‘˜๐‘™ , (B.19) ๐‘ƒ phys;๐‘‰๐ป ๐‘–๐‘—,๐‘˜๐‘™ = ๐ทโˆ’1 โˆ‘ ๐‘Ž=1 ๐ถphys;๐‘‰๐ป ๐‘–๐‘—, ๐‘Ž ๐ถphys;๐‘‰๐ป ๐‘˜๐‘™, ๐‘Ž , (B.20) ๐‘ƒ phys;๐‘‰2 ๐‘–๐‘—,๐‘˜๐‘™ = dim๐‘‰2 โˆ‘ ๐‘Ž=1 ๐ถphys;๐‘‰2 ๐‘–๐‘—, ๐‘Ž ๐ถphys;๐‘‰2 ๐‘˜๐‘™, ๐‘Ž . (B.21) Here we find explicit formulae for these projectors. The ๐‘‰0 and ๐‘‰๐ป projectors are constructed by finding the Clebschs on the RHS of (B.19) and (B.20) explicitly. In the case of the ๐‘‰2 projector things are not so simple as the Clebsch is not so easily calculable, nonetheless we are able to construct the projector using general properties of Clebsch coefficients and other known projectors, without precise knowledge of the ๐‘‰2 Clebsch itself. To find ๐ถphys;๐‘‰0 ๐‘–๐‘— and ๐ถphys;๐‘‰๐ป ๐‘–๐‘—, ๐‘Ž we first write down a representation theory basis for ๐‘‰๐ท โŠ— ๐‘‰๐ท in terms of the change of basis coefficients given in (B.9) and (B.10) as was done in [1] i.e. a basis that transforms like the RHS of (B.11), ๐‘†๐‘‰0;1 โ‰ก ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ถ0,๐‘–๐ถ0,๐‘—๐‘€๐‘–๐‘— = 1 ๐ท ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐‘’๐‘– โŠ— ๐‘’๐‘— , (B.22) ๐‘†๐‘‰0;2 โ‰ก 1 โˆš ๐ท โˆ’ 1 ๐ทโˆ’1 โˆ‘ ๐‘Ž=1 ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ถ๐‘Ž,๐‘–๐ถ๐‘Ž,๐‘—๐‘€๐‘–๐‘— = 1 โˆš ๐ท โˆ’ 1 ๐ทโˆ’1 โˆ‘ ๐‘Ž=1 ๐ธ๐‘Ž โŠ—๐ธ๐‘Ž , (B.23) ๐‘†๐‘‰๐ป ;1 ๐‘Ž โ‰ก ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ถ0,๐‘–๐ถ๐‘Ž,๐‘—๐‘€๐‘–๐‘— = 1 โˆš ๐ท ๐ท โˆ‘ ๐‘–=1 ๐‘’๐‘– โŠ—๐ธ๐‘Ž , (B.24) ๐‘†๐‘‰๐ป ;2 ๐‘Ž โ‰ก ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ถ๐‘Ž,๐‘–๐ถ0,๐‘—๐‘€๐‘–๐‘— = 1 โˆš ๐ท ๐ท โˆ‘ ๐‘–=1 ๐ธ๐‘Ž โŠ— ๐‘’๐‘– , (B.25) ๐‘†๐‘‰๐ป ;3 ๐‘Ž โ‰ก โˆš ๐ท ๐ท โˆ’ 2 ๐ทโˆ’1 โˆ‘ ๐‘,๐‘=1 ๐ท โˆ‘ ๐‘–,๐‘—,๐‘˜=1 ๐ถ๐‘Ž,๐‘˜๐ถ๐‘,๐‘˜๐ถ๐‘,๐‘˜๐ถ๐‘,๐‘–๐ถ๐‘,๐‘—๐‘€๐‘–๐‘— = โˆš ๐ท ๐ท โˆ’ 2 ๐ทโˆ’1 โˆ‘ ๐‘,๐‘=1 ๐ท โˆ‘ ๐‘–=1 ๐ถ๐‘Ž,๐‘–๐ถ๐‘,๐‘–๐ถ๐‘,๐‘–๐ธ๐‘ โŠ—๐ธ๐‘ . 
(B.26) We also note the orthogonal decomposition ๐‘‰๐ท โŠ— ๐‘‰๐ท โ‰… Sym2(๐‘‰๐ท)โŠ•๐›ฌ2(๐‘‰๐ท) โ‰… ๐‘‰ phys โŠ• ๐‘‰ diag โŠ•๐›ฌ2(๐‘‰๐ท), (B.27) in which ๐‘‰ diag is the subspace of diagonal matrix elements and ๐›ฌ2(๐‘‰๐ท) is the antisymmetric subspace of ๐‘‰๐ท โŠ— ๐‘‰๐ท. Define further representation variables ๐‘†diag;๐‘‰0 and ๐‘†diag;๐‘‰๐ป ๐‘Ž composed of the diagonal elements of ๐‘€๐‘–๐‘— , that transform according to the first and second terms on the RHS of the ๐‘‰ diag decomposition respectively ๐‘†diag;๐‘‰0 โ‰ก 1 โˆš ๐ท ๐ท โˆ‘ ๐‘–=1 ๐‘’๐‘– โŠ— ๐‘’๐‘–, (B.28) ๐‘†diag;๐‘‰๐ป ๐‘Ž โ‰ก ๐ธ๐‘Ž โŠ—๐ธ๐‘Ž. (B.29) sing the inner product on ๐‘‰๐ท โŠ— ๐‘‰๐ท in (B.8) We can express these in terms of the representation variables ๐‘†diag;๐‘‰0 = (๐‘†diag;๐‘‰0 , ๐‘†๐‘‰0;1)๐‘†๐‘‰0;1 + (๐‘†diag;๐‘‰0 , ๐‘†๐‘‰0;2)๐‘†๐‘‰0;2 = 1 โˆš ๐ท ๐‘†๐‘‰0;1 + โˆš ๐ท โˆ’ 1 ๐ท ๐‘†๐‘‰0;2 , (B.30) and ๐‘†diag;๐‘‰๐ป ๐‘Ž = 1 2 ๐ทโˆ’1 โˆ‘ ๐‘=1 ( (๐‘†diag;๐‘‰๐ป ๐‘Ž , ๐‘†๐‘‰๐ป ;1 ๐‘ )๐‘†๐‘‰๐ป ;1 ๐‘ + (๐‘†diag;๐‘‰๐ป ๐‘Ž , ๐‘†๐‘‰๐ป ;2 ๐‘ )๐‘†๐‘‰๐ป ;2 ๐‘ ) + ๐ทโˆ’1 โˆ‘ ๐‘=1 (๐‘†diag;๐‘‰๐ป ๐‘Ž , ๐‘†๐‘‰๐ป ;3 ๐‘ )๐‘†๐‘‰๐ป ;3 ๐‘ = 1 2 ๐ทโˆ’1 โˆ‘ ๐‘=1 ( โˆš 2 ๐ท ๐›ฟ๐‘Ž๐‘๐‘† ๐‘‰๐ป ;1 ๐‘ + โˆš 2 ๐ท ๐›ฟ๐‘Ž๐‘๐‘† ๐‘‰๐ป ;2 ๐‘ ) + ๐ทโˆ’1 โˆ‘ ๐‘=1 โˆš ๐ท โˆ’ 2 ๐ท ๐›ฟ๐‘Ž๐‘๐‘† ๐‘‰๐ป ;3 ๐‘ = 1 โˆš ( ๐‘†๐‘‰๐ป ;1 ๐‘Ž + ๐‘†๐‘‰๐ป ;2 ๐‘Ž ) + โˆš ๐ท โˆ’ 2 ๐ท ๐‘†๐‘‰๐ป ;3 ๐‘Ž . (B.31) 20 2๐ท Physica A: Statistical Mechanics and its Applications 651 (2024) 130015G. Barnes et al. F w A T The physical variables ๐‘†phys;๐‘‰0 and ๐‘†phys;๐‘‰๐ป ๐‘Ž span the orthogonal complement of the diagonal variables in the ๐‘‰0 and ๐‘‰๐ป subspaces (as given in (B.28)) of Sym2(๐‘‰๐ท) ๐‘†phys;๐‘‰0 = โˆš ๐ท โˆ’ 1 ๐ท ๐‘†๐‘‰0;1 โˆ’ 1 โˆš ๐ท ๐‘†๐‘‰0;2 = โˆš ๐ท โˆ’ 1 ๐ท ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ถ0,๐‘–๐ถ0,๐‘—๐‘€๐‘–๐‘— โˆ’ 1 โˆš ๐ท(๐ท โˆ’ 1) ๐ทโˆ’1 โˆ‘ ๐‘Ž=1 ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ถ๐‘Ž,๐‘–๐ถ๐‘Ž,๐‘—๐‘€๐‘–๐‘— , (B.32) ๐‘†phys;๐‘‰๐ป ๐‘Ž = โˆš ๐ท โˆ’ 2 2๐ท ( ๐‘†๐‘‰๐ป ;1 ๐‘Ž + ๐‘†๐‘‰๐ป ;2 ๐‘Ž ) โˆ’ โˆš 2 ๐ท ๐‘†๐‘‰๐ป ;3 ๐‘Ž = โˆš ๐ท โˆ’ 2 2๐ท2 ๐ท โˆ‘ ๐‘–,๐‘—=1 ( ๐ถ๐‘Ž,๐‘– + ๐ถ๐‘Ž,๐‘— ) ๐‘€๐‘–๐‘— โˆ’ โˆš 2 ๐ท ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ทโˆ’1 โˆ‘ ๐‘,๐‘=1 ๐ถ๐‘,๐‘–๐ถ๐‘,๐‘—๐ถ ๐ป๐ปโ†’๐ป ๐‘,๐‘ ๐‘Ž ๐‘€๐‘–๐‘— . (B.33) rom (B.32) and (B.33) we can read off the Clebsch coefficients needed in the construction of the projectors (B.19) and (B.20): ๐‘†phys;๐‘‰0 = โˆ‘ ๐‘–,๐‘— ๐ถphys;๐‘‰0 ๐‘–๐‘— ๐‘€๐‘–๐‘— โ‡’ ๐ถphys;๐‘‰0 ๐‘–๐‘— = โˆš ๐ท โˆ’ 1 ๐ท ๐ถ0,๐‘–๐ถ0,๐‘— โˆ’ 1 โˆš ๐ท(๐ท โˆ’ 1) ๐ทโˆ’1 โˆ‘ ๐‘Ž=1 ๐ถ๐‘Ž,๐‘–๐ถ๐‘Ž,๐‘— , (B.34) ๐‘†phys;๐‘‰๐ป ๐‘Ž = โˆ‘ ๐‘–,๐‘— ๐ถphys;๐‘‰๐ป ๐‘–๐‘—, ๐‘Ž ๐‘€๐‘–๐‘— โ‡’ ๐ถphys;๐‘‰๐ป ๐‘–๐‘—, ๐‘Ž = โˆš (๐ท โˆ’ 2) 2๐ท2 ( ๐ถ๐‘Ž,๐‘– + ๐ถ๐‘Ž,๐‘— ) โˆ’ โˆš 2 (๐ท โˆ’ 2) ๐ทโˆ’1 โˆ‘ ๐‘,๐‘=1 ๐ท โˆ‘ ๐‘˜=1 ๐ถ๐‘,๐‘–๐ถ๐‘,๐‘—๐ถ๐‘Ž,๐‘˜๐ถ๐‘,๐‘˜๐ถ๐‘,๐‘˜ . 
(B.35) Although we do not know the ๐‘‰2 Clebsch coefficients ๐ถ๐ป๐ปโ†’๐‘‰2 ๐‘,๐‘ ๐‘Ž appearing in the remaining physical variables ๐‘†phys;๐‘‰2 ๐‘Ž = ๐‘†๐‘‰2 ๐‘Ž = ๐ท โˆ‘ ๐‘–,๐‘—=1 ๐ทโˆ’1 โˆ‘ ๐‘,๐‘=1 ๐ถ๐‘,๐‘–๐ถ๐‘,๐‘—๐ถ ๐ป๐ปโ†’๐‘‰2 ๐‘,๐‘ ๐‘Ž ๐‘€๐‘–๐‘— , (B.36) e do know that they possess the usual Clebsch orthogonality property ๐ทโˆ’1 โˆ‘ ๐‘,๐‘=1 ๐ถ๐ป๐ปโ†’๐‘‰2 ๐‘ ๐‘, ๐‘Ž1 ๐ถ๐ป๐ปโ†’๐‘‰2 ๐‘ ๐‘, ๐‘Ž2 = ๐›ฟ๐‘Ž1๐‘Ž2 . (B.37) lso, we are able to write an expression for the ๐‘‰2 projector using the decomposition Sym2(๐‘‰๐ป ) = ๐‘‰0 โŠ• ๐‘‰๐ป โŠ• ๐‘‰2. (B.38) his allows us to express the ๐‘‰2 projector in terms of the projectors from ๐‘‰๐ป โŠ— ๐‘‰๐ป to Sym2(๐‘‰๐ป ), ๐‘‰0 and ๐‘‰๐ป ๐‘ƒ phys;๐‘‰2 = ๐‘ƒ ๐‘‰๐ป ,๐‘‰๐ปโ†’๐‘‰2 = ( 1 โˆ’ ๐‘ƒ ๐‘‰๐ป ,๐‘‰๐ปโ†’๐‘‰0 โˆ’ ๐‘ƒ ๐‘‰๐ป ,๐‘‰๐ปโ†’๐‘‰๐ป ) ๐‘ƒ ๐‘‰๐ป ,๐‘‰๐ปโ†’Sym2(๐‘‰๐ป ). (B.39) The projectors on the RHS of this expression are giv