DL_POLY 5: Calculation of system properties on the fly for very large systems via massive parallelism - Devereux H.L., CockrellC., Elena A.M., Bush Ian

DL_POLY, massive parallelism and scalability
System properties on-the-fly
On-the-fly methods
Comparison to simulations and experiments
Author: Devereux H.L. CockrellC. Elena A.M. Bush Ian
Tags: computer modeling
ISBN: 0021-9606
Year: 2025
Similar
Advances in Memristors, Memristive Devices and Systems. Volume 701
Heating and cooling systems in buildings - Method for calculation of the system performance and system design for heat pump systems. Part 2: Energy calculation
Gravitational Waves. Volume 1. Theory and Experiments
Systems Science: Theory, Analysis, Modeling, and Design
Text
                    arXiv:2503.07526v1 [cond-mat.mtrl-sci] 10 Mar 2025

DL POLY 5: Calculation of system properties on the fly
for very large systems via massive parallelism
H. L. Devereux1 , C. Cockrell1,2 , A. M. Elena3 , Ian Bush4,6 , Aidan B. G.
Chalk5 , Jim Madge3,7 , Ivan Scivetti3 , J. S. Wilkins4,6 , I. T. Todorov1,3 , W.
Smith3 , K. Trachenko1
1

School of Physical and Chemical Sciences, Queen Mary University of London, Mile
End Road, London, E1 4NS, UK
2

5

Nuclear Futures Institute, Bangor University, Bangor, LL57 1UT, UK

3

Scientific Computing Department, Science and Technology Facilities Council,
Daresbury Laboratory, Keckwick Lane, Daresbury, WA4 4AD, UK

4

Scientific Computing Department, Science and Technology Facilities Council,
Rutherford Appleton Laboratory, UK - current address

Hartree Centre, Science and Technology Facilities Council, Daresbury Laboratory,
Keckwick Lane, Daresbury, WA4 4AD, UK
6

7

Oxford eResearch Centre, University of Oxford - work carried out

The Alan Turing Institute, British Library, 96 Euston Road, London, NW1 2DB current address

Abstract
Modelling has become a third distinct line of scientific enquiry, alongside
experiments and theory. Molecular dynamics (MD) simulations serve to interpret, predict and guide experiments and to test and develop theories. A
major limiting factor of MD simulations is system size and in particular the
difficulty in handling, storing and processing trajectories of very large systems. This limitation has become significant as the need to simulate large
system sizes of the order of billions of atoms and beyond has been steadily
growing. Examples include interface phenomena, composite materials, biomaterials, melting, nucleation, atomic transport, adhesion, radiation damage and fracture. More generally, accessing new length and energy scales
often brings qualitatively new science, but this has currently reached a bottleneck in MD simulations due to the traditional methods of storing and
post-processing trajectory files. To address this challenge, we propose a new
Preprint submitted to Computer Physics Communications

March 11, 2025

paradigm of running MD simulations: instead of storing and post-processing
trajectory files, we calculate key system properties on-the-fly. Here, we discuss the implementation of this idea and on-the-fly calculation of key system
properties in the general-purpose MD code, DL POLY. We discuss code development, new capabilities and the calculation of these properties, including
correlation functions, viscosity, thermal conductivity and elastic constants.
We give examples of these on-the-fly calculations in very large systems. Our
developments offer a new way to run MD simulations of large systems efficiently in the future.
Keywords: , Molecular dynamics, Materials modelling, online algorithms,
DL POLY
PROGRAM SUMMARY/NEW VERSION PROGRAM SUMMARY
Program Title: DL POLY 5
Developer’s repository link: https://gitlab.com/ccp5/dl-poly
Licensing provisions: L-GPL v3.0
Programming language: Fortran95
Supplementary material:
Journal reference of previous version: Todorov, I.T., Smith, W., Trachenko, K.
and Dove, M.T., 2006. DL POLY 3: new dimensions in molecular dynamics simulations via massive parallelism. Journal of Materials Chemistry, 16(20), pp.19111918.
Does the new version supersede the previous version?: Yes
Reasons for the new version: Development of on-the-fly correlation analysis.
Summary of revisions: General code modularisation, a new control format, SPME
re-implementation for per-particle contributions, and a general on-the-fly correlation framework.
Nature of problem: Molecular dynamics is utilised for modelling in many domains
including physics, chemistry, biology, materials science, and their interfaces. These
applications often call for large scale simulations targeting high-fidelity timescales
and ever larger length scales. In all these cases efficient use of computation, storage, and input/output (I/O) load handling on both the software and hardware
sides is required to facilitate analysis.
Solution method: DL POLY provides an efficient set of algorithms for molecular
simulation alongside a domain-decomposition strategy to distributed computation
efficiently across massively parallel CPU systems. These include parallel I/O handling, link-cells, smooth particle mesh Ewald electrostatics, and now a general

ii

purpose on-the-fly correlation framework. This latter development addresses the
problem of storing and analysing large trajectory files by iteratively computing
correlations at runtime.

1. Introduction
For a few decades now, molecular simulations have grown into a third
distinct line of scientific enquiry in condensed matter, alongside experiment
and theory [1, 2]. Molecular dynamics (MD) simulations give coordinates
and momenta of simulated particles as a function of time and thus provide
the system’s trajectory in phase space. This enables us to calculate most
important system properties including those related to structure, statistics,
dynamics, transport and so on.
MD results are used in a number of ways to provide insight into experiments, from interpretation to prediction and guiding new ones, as well as
to test theories and provide insight at the atomic level resolution for materials, scenarios or processes not yet synthesised by researchers or constructed
by experimentalists. MD is particularly useful in fields where measurements
are inaccessible in experiment, where empirical theory of phenomena at the
larger engineering scales can be examined at short time and length scales,
such as damage in materials [3]. Another popular use is extracting statistics
and analytics that are impossible to capture by experiment such as cheminformatics analysis (formation of hydrogen bond networks, hydrophobicity,
π-π stacking, etc.), the dynamics of defects in materials (defects and faults
in crystalline matrices, amorphisation processes) as well as coarse-grained
modelling using augmented force-fields of complex dynamics of, for example,
proteins. Generally, MD results are also used to generate synthetic benchmark data for fortifying machine-learnt (ML) models.
This interaction between experiments, theory and simulation has proven
to be very productive and has naturally found wide applications in physics,
chemistry, materials and earth sciences, engineering, biological and medical
sciences etc, from purely academic research to real-world industrial applications. These applications include those in the strategic priority areas of
energy, environment, advanced materials and health. It is hard to imagine
the current state of these areas without MD and other types of computer
modelling. Many practising scientists have taken MD simulations on board
iii

to the extent that they consider themselves as combined experimentalistmodellers or theorist-modellers.
As for any method, MD simulations have their limitations and associated domain of applicability. Several important ones have been relaxed or
lifted through the developments of the last decades; others remain unsolved,
including the important issue of small system size.
One of the strengths of MD is its ability to access small length scales of
nanometers and short time scales of picoseconds or beyond where few experimental methods can measure. Early MD simulations were able to simulate
about 100 particles for typically tens of picoseconds. With increasing computer power, simulations of millions of atoms on a personal computer is now
routine, with simulation times extending to nanoseconds and above.
The appetite of the modelling community to simulate larger systems is
constantly growing. In many areas, large systems are necessary to reach the
required length or energy scales and match these to experiments and relevant physical processes. These areas include interface phenomena, composite
materials, biomaterials, melting, nucleation, atomic transport, adhesion, radiation damage and fracture.
About a decade ago, we simulated system sizes approaching 1 billion
atoms with realistic many-body potentials for iron, using over 60,000 parallel processors [4]. This large system size was required in order to contain
radiation damage due to high-energy (in MeV) recoil energy. A representative picture of the damage is shown in Fig. 1. We note that the system size,
in the order of micrometers, is in interesting proximity to the length scales
visible through an optical microscope; simulations and experiments met at
this length-scale for the first time.
We found that the nature of the damage and defects in these structures
were new and could not be extrapolated from lower-energy events. This had
a clear value from scientific perspective and showed that, as is often the case
in physics, exploring new length- and energy-scales brings about new effects
[4].
Running these very large simulations, not only brought scientific insights,
but also computational insights. First, the size of a single configuration of
100 million atoms (position data only, in Å with one significant figure) in
ASCII is ∼3.5GB. For comparison, the standard DL POLY format requires
∼5 times more data for structural data (15 GB) with an additional ∼2 times
more if velocity data is needed (25GB). In an MD simulation, certain structural and dynamical properties (for example, viscosity or velocity correlation
iv

Figure 1: A representative 0.5 MeV collision cascade from our earlier work [4] showing
displaced atoms at 0.2, 1.5 ps and 100 ps. The system size is a cube of 2000 Å size with
0.1 billion atoms. Animations showing the propagation of the collision cascade can be
watched from [5].

functions) often need over 10,000-100,000 configurations to be computed,
requiring usually two orders of magnitude more timesteps in order to get
good-quality data. This means that a single trajectory file of this type for a
1 million atom system will require ∼0.1-1 PB of storage. On the other hand,
the work file system of a high-performance facility such as UK’s ARCHER2
service [6] has a total of 10.2 PB [7] to be shared between all users (a maximum of 60 TB allocated to a single project or team). Therefore, the trajectory files of very large systems become impossible to store, let alone process,
on high-performance computing facilities.
Second, a high-performance computing facility such as ARCHER2 spends
a significant amount of its time writing the data rather than actually carrying
out the numerical simulation, which is an inefficient way of utilising the
HPC resource. Even with efficient parallelisation of writing data to storage,
a best outcome means that writing a single sample configuration costs as
much in CPU time as a timestep. In the case of 100 million atoms run on
65,000 processors, this fraction was in excess of 5% for a full production
run, including reading and writing. This implies that each simulation run
involving writing the trajectory file, even if feasible in terms of file size,
wastes considerable compute power. Furthermore, the larger the simulation
(in model size and time evolution) and the larger the count of processors
used to run it, the larger the waste.
v

The problems outlined above are general and will either prevent users
from running and analysing MD simulations of large systems or make these
simulations inefficient. In our radiation damage case study, we overcame
these problems by implementing a special on-the-fly algorithm in the generalpurpose MD package, DL POLY [8]. This algorithm specifically finds and
analyses displaced and defect atoms in the very large billion-atom system.
This was a non-trivial task as the algorithm had to be made consistent with
the domain-decomposition infrastructure of DL POLY [9, 10]. These simulations have provided a useful case study and a motivation for the development
of a more general code, with a view to benefit a wider modelling community
and to approach the problems in a general way.
We recall that some properties such as energy can be sampled and averaged at each time step, and these are calculated by DL POLY and other
popular MD codes as a standard output. However, there are other important
properties relying on time correlation functions which need to be calculated
over the entire trajectory and which are currently calculated by writing out
the trajectory file and analysing it a posteriori. One common example is
the velocity autocorrelation function related to the phonon density of states
and power spectrum of the system. Other examples include k-space densities
and currents. For very large system sizes, the problems discussed above make
performing these analyses from trajectory files impractical.
Instead of attempting to run a trajectory and subsequently analysing it,
we have enabled the calculation of important physical properties on-the-fly.
This sort of on-the-fly calculation will be crucial to the execution and
analysis of ever-larger MD simulations. Particularly as there is a growing
need to simulate larger systems, length and energy scales, coupled with the
availability of scalable MD codes and massive parallel computing facilities.
Supported by the EPSRC funding, we have extended the UK flagship
MD code, DL POLY, to calculate system properties on-the-fly culminating
in the DL POLY 5 code available at [11]. DL POLY is a general-purpose
massively parallel MD simulation package that uses a highly efficient set
of methods and algorithms, including domain decomposition, linked cells,
Daresbury Advanced Fourier Transform, Trotter derived Velocity Verlet integration and RATTLE [8]. Written to support academic and industrial
research, DL POLY has a wide range of applications and can run on a wide
range of computers; from single processor workstations to massively parallel high-performance facilities. During development a strong emphasis was
placed on efficient utilization of multi-processor power by optimising memvi

ory workload and distribution. This makes it possible to efficiently simulate
large systems of billions of atoms. Simulation of systems of this size has been
tested, and we are not aware of reasons preventing from running larger systems. DL POLY 5 is hosted on GitLab [11] has been a free and open source
project (GPLv3.0) since 2020.
In this paper, we document and detail the developments of DL POLY
related to on-the-fly calculation of some key system properties. This included
general and substantial refactoring across DL POLY’s modules. We discuss
the implementation of on-the-fly calculations of different properties including
correlation functions, viscosity and thermal conductivity, elastic constants,
currents, and rigid body properties. We then discuss the calculations of
these properties and show the results from simulations of large 100-million
systems. These results are compared to smaller-size calculations, experiments
and previous MD simulations. We begin with an overview of DL POLY’s
history up to DL POLY 4, including a discussion on its scalability. Then we
detail the re-engineering of the code, laying the groundwork for our on-the-fly
developments.
2. DL POLY, massive parallelism and scalability
The DL POLY project was conceived in 1993 by W Smith and originally
released in 1994 as DL POLY 2 [12, 13] (now known as DL POLY CLASSIC
[14]) using a replicated data (RD) approach as its underlying parallelisation
strategy [15, 16]. Written in Fortran77 [17] and using MPI [18] for interprocessor communication the project brought significant improvements in both
functionality and internal algorithms performance in the code, establishing
the link-cell (LC) algorithm for efficient search and listing of particle pairs
based on a cutoff [19] and the use of 3D discrete fast Fourier transforms (3D
FFTs) [20] for the central grid-based operation in the evaluation of the longranged contributions of electrostatics interactions [21]. However, by the early
2000s it was clear that RD strategy was rather limited in performance and
scalability gains especially for then very large model systems, of the order
of 100,000 atoms, and large processor counts, of the order of ∼100 CPUs,
thus necessitating a change in parallelisation strategy and reformulation of
its algorithms. The particular deficiencies of the RD approach that needed
addressing were the avalanche of communication which increased with the
size of a model system (creating larger inter-processor messages) as well as
with the processor count (increasing the number of senders and receivers per
vii

message for all-to-all communications), and that there was no memory distribution at all, these were major bottlenecks for running large systems of
the order of millions of atoms.
In 2003, led by I. Todorov, with core work carried out by W. Smith and I.
Bush, DL POLY 3 was released as part of the fundamental work supported by
the eMinerals project [22], brought in as part of the UK’s eScience programme
[23]. Not only did this version bring a new code infrastructure in modularised
Fortran 90, but it also developed the foundation of the domain decomposition
(DD) parallelisation strategy [9, 10], including a Smooth Particle Mesh Ewald
(SPME) method suitable for the DL POLY DD [24], still used in the newer
versions. Based on the two premises of: (i) limited density variation - that
the model system average particle density does not fluctuate significantly - in
space and time, and (ii) the system size/dimensions being much larger than
that of the most characteristic interaction cutoff; this strategy developed and
addressed the two major shortcomings of the RD approach. First, it provided
for near perfect memory distribution as demonstrated in Fig. 2, where three
model systems with increasing complexity were run and compared in a weakscaling manner, i.e. every time the system size doubled, the processor count
used to run the simulation doubled. This opened the door to running multimillion atom simulations from its very first release [25].
The weak scaling provides information about the deviation from perfect
performance, where it is assumed that doubling processors halves the timeper-atom, i.e. constant speed gain. However, we see in Fig. 2, that even
for the most trivial system, Ar, one can see small deviation from the perfect
performance due to inter-process communication, necessary for domain halo
boundary data exchange. It is clear that, as the model’s complexity increases,
so too does the toll on performance, demonstrated by the greater deviation
from perfect scaling. For the second model system, NaCl, this is due to longranged coulombic interactions, bringing the cost of inclusion of electrostatics
into focus. As discussed [24] this is due to communications overheads that are
a necessity of the FFT operations needed for evaluation of the long-ranged
electrostatics interactions. The third model system, SPC water, is subject to
a further performance penalty which is caused by the inclusion of constrained
bond dynamics. The reason for this is simple to understand: as more processors are working together, there are more inter-domain faces and hence more
constrained oxygen-hydrogen bonds crossing inter-domain faces. Constraining these bonds requires multiple small inter-process updates, which become
costly as the model system or processor counts become bigger.
viii

Figure 2: Plot of the relative speed gain (time-per-timestep-per-atom) as a function of
processor count for three generic systems, where the system size scales with processor
count. The solid black line represents the performance scaling of a 32,000 particle per
processor system of solid Ar model with only short-ranged interactions. The solid red
line represents the performance scaling of a 27,000 particle per processor system of solid
NaCl model with short-ranged and coulombic interactions. The solid green line represents
the performance scaling of a 20,736 particle per processor system of water, SPC model,
with with short-ranged and coulombic interactions as well as constrained bonds. The
green dotted line indicates the parallelisation limit also called perfect or embarrassing
parallelisation. The red line indicates the standard for a good parallelisation (time-pertimestep increase of 25% for each doubling of the processor count). This data was first
produced with DL POLY 3.04 in December 2005 on UK’s National HPC service HPCx
(IBM p690).

ix

Figure 3: Strong scaling benchmark all are timed over 1ps simulation time. This data was
first produced with DL POLY 3.06 in December 2006 at Jülich Supercomputing Centre’s
JUGENE (Blue Gene/L).

In strong scaling, we track the performance of the key underlying algorithms on the same system size as we increase the processor count. As shown
in Fig. 3 the domain decomposition parallelisation strategy provides a good
load-balancing and near-perfect (linear) performance, up to to a limit, by
the majority of its core algorithms. The speed gain curve deviates from its
linear form only for the long-range interactions (Ewald k-space). This evaluation is also the ultimate cause for the cumulative overall speed gain to
deviate in a non-linear manner as the processor count increases and decay
with increasing the processor count. More recent strong-scaling benchmarking on Archer 2 [6] demonstrated the same excellent scalability for large Ar
and NaCl systems. A comprehensive survey of DL POLY performance on
modern hardware can be found in [26].
In 2010 DL POLY 4 was released, including a major rewrite providing
an extension of particle dynamics to rigid body dynamics [27], inclusion of

x

two-temperature thermostat functionality [28] for high-energy events and the
inclusion of parallel I/O [29]. The rigid body dynamics provided a natural
extension of the parallel performance capabilities for biological and biochemical systems in which constrained bond dynamics was a bottleneck (as seen in
the fast decaying weak performance of the SPC water system in Fig. 2). The
enhancement in performance of rigid body (RB) dynamics and constrained
bond (CB) dynamics is demonstrated in Fig. 4 where both weak and strong
scaling tests were performed on the same SPC water system and model with
scaled sizes in the case of weak scaling. The difference between RB and CB
dynamics is the way in which bond constraints are applied. In the CB case, it
is based on simple, but iterative, application of small constant force updates
on top of the Newtonian equations of motion, which in DD requires frequent
local inter-process data exchanges. In contrast, in the RB case the Eulerian
equations of RB rotation are solved alongside with the Newtonian ones. However, these further calculations, despite their complexity, are computationally
cheaper (no iterative computation) and require no iterative communication
for a difference with CB dynamics one. Further developments in 2012 led to
the inclusion of Dissipative Particle Dynamics integration [30] and a three
seeded random number generation [31] for using repeatable random series for
stochastically dependent thermostating, kinetic fields and pressure tensors in
a fully deterministic manner across domains and time series.
2.1. Current state
A few years ago an extensive refactoring effort took place that is the
the base of current DL POLY. These changes were introduced mainly due
to the need for being able to add new complex tasks with maximum code
reuse and allow various easy optimisations. The main changes introduced
were: i) using object oriented Fortran – 2018 as the minimal standard required, ii) restructuring the code to be fully modular, with all functions and
subroutines exchanging data only via parameters rather than via modules removal of implicit common blocks. Only one singleton remains for dealing
with I/O for errors. iii) introduced an automated test system with both
regression and unit tests. Testing was extended from 24 tests to over 200
tests. iv) atomic configurations are stored in an array of structures that
is cache friendly with respect to positions and forces, v) electrostatics was
rewritten to allow per-particle decomposition and allowing the computation
of long range contributions from any power potential, see next section for
details. iv) The CONTROL file format was redesigned (see appendix Appendix
xi

Figure 4: Weak and strong scaling benchmarks of SPC water systems. This data was
first produced with DL POLY 4.04 in May 2013 on UK’s National HPC service HECToR
(Cray XE6).

C). In general all entries now are consistently of the form token-value-unit
(where applicable), STATIS, RDF and other output files can be written in
the yaml format [32] for easier postprocessing, v) a modern software infrastructure is in place with full CI/CD via gitlab.com, vi) license was changed
to an open-source license GPL-3.0, vii) dlpoly-py was developed [33] - a companion python package that can read inputs and outputs of DL POLY and
initiate simulations. Without this extensive refactoring, implementation of
the Empirical Valence Bond [34] and the new Chemshell interface [35] would
not have been possible.
2.2. SPME refactoring
The Ewald method [36] is a method for the calculation of long-range
(conditionally convergent until infinity) forces. This can be achieved in a
number of ways, the usual physical description being the introduction of a
screening charge distribution and a correction to account for the screening
so as to accelerate the summation of the long-range component in Fourier
space. This can also be seen as performing a sum from 0 → α in real-space
and from 0 → α′ in reciprocal space to construct a full sum of 0 → ∞ in
finite time (lim x1 = ∞).
x→0

The smooth particle mesh Ewald (SPME) approach by Essmann et al.
xii

[21] uses cardinal B-Splines to interpolate the complex exponentials that
result from the above approach in a regular fashion. This allows the use
of a standard Fast Fourier Transform (FFT), so reducing the scaling of the
method with system size to O(N log(N )). However this is at the cost of some
accuracy due to the interpolation, though this can be negated by using higher
order splines to provide a better description of the exponentials.
The SPME approach in DL POLY has been refactored to enable computation of per-particle contributions to energies, forces and stresses. The
calculation of per-particle contributions has been used to implement the computation of Green-Kubo thermal conductivity with Ewald methods (See section 3.3).
This new method coincidentally enables the expansion to not only alternative orders of r1n potentials, but also derivatives of arbitrary order, thereby
paving the way to correct multipolar expansions of arbitrary order. Direct
computations of higher-order potentials allow “exact” computation of, e.g.
Lennard-Jones (6-12) [37] potentials without cut-offs, enabling calculation
of the effect of these cutoffs on actual dynamics. Currently implemented
are functional forms for: r−1 , r−2 , r−6 , r−12 and a general form for r−n and
their derivatives for both the real- and reciprocal-space parts (see appendix
Appendix A).
As part of the DL POLY5 developments, we have recently benchmarked
2-body and SPME-enabled systems composed of ∼2 million atoms and 2body simulations composed of ∼ 1 billion atoms across increasing MPI process counts on Archer2 [6]. We examine processor counts ranging from 128
in the small systems to 262,144 in the larger system. Representing a 35%
concurrent utilisation of Archer2’s 750,080 CPU cores [38]. The results can
be found in Fig. 5. Primarily the results continue to demonstrate the scalability of DL POLY. But secondly, it also demonstrated the scalability of
the newly refactored MD algorithms, with the reworked DD memory distribution that favoured purely local communications and minimised all-to-all
communications as much as possible. A comprehensive survey of DL POLY
performance on modern hardware can be found in [26].
3. System properties on-the-fly
3.1. On-the-fly methods
Transforming data as it is acquired is a paradigm often termed “online
algorithms” [39]. The basic idea is that, instead of processing an entire data
xiii

Figure 5: Strong scaling benchmarks for DL POLY for Argon (2-body) and NaCl (2-body
and SPME electrostatics) systems. Benchmarks were conducted on Archer2 [6] all are
timed over 1ps simulation time. The system sizes were of the order of millions and billions
for Argon, and millions for NaCl. Processor counts vary from 128 in the smaller systems,
up to 262,144 for the 1 billion atom system. Our largest benchmark represents a 35%
concurrent utilisation of Archer2 [38]

xiv

set a once, an on-the-fly calculation processes the data incrementally, often
as the data are generated, received, or simply by processing a data set in
chunks. This can incur performance advantages by eliminating the requirement to store the full data set on disc or in memory (RAM). Alternatively
input/output (I/O) penalties can be significant when reading from disk, and
RAM may be limited. In some cases, the full data set may be either too large
to store in memory, or perhaps even unbounded. One example of unbounded
data is the internet, where a continuous stream of “big”-data is encountered
on a daily basis [40]. In these applications a sliding-window technique is often
used where statistics and other data transformations are defined in reference
to a fixed-size window [41] which could be temporal or spatial.
In MD, the configuration of atomic positions is mapped through simulation time-steps into a trajectory. In order to calculate any properties defined
by these time-linked configurations the trajectory may be stored in memory,
and an algorithm applied. For example storing atom velocities and calculating the velocity auto-correlation function (VAF). For small systems, it
may be possible to store these data at every step, obtaining a VAF accurate to the time-step dt. As the system size scales this becomes infeasible
both in terms of storage and I/O penalties if storing to disc. By conceptualising the simulation trajectory as a big-data stream, we can apply the
same ideas of online algorithms to calculate system properties on-the-fly. For
modest systems, this presents a convenience. However for large systems, this
methodology can give access to properties which otherwise require infeasible
storage requirements or some accuracy trade-off.
Multiple system properties depend upon the calculation of correlation
functions. As an incomplete list: (1) The VAF is one example which can in
turn be used to calculate the vibrational density of states. (2) From GreenKubo theory [42, 43] the thermal conductivity and viscosity can be calculated
by using correlation functions of the heat flux and stress tensor respectively
[42, 43]. (3) Elastic constants may also be calculated using either strain correlations [44] or stress correlations [45, 46]. (4) k-space density correlations
may be used to determine thermal-conductivity [47], and momentum current
correlations may be used to calculate structure factors and liquid phonon
spectra [48, 49].
Motivated by these examples, and others, in DL POLY we implemented
a general on-the-fly correlation algorithm, so that we are able to calculate
these values at runtime; that is as the system is simulated, without storing
trajectory data to disc.
xv

3.2. Correlation functions at runtime
Given two observable quantities X(s∆t) and Y (s∆t) in MD, where s ∈
1, 2, . . . S are simulation time steps for a time step of ∆t, we can evaluate
the correlation function of these two values at time-lags l = 1, 2, . . . , L ≪ S
as [43],
S−l
1 X
X(s′ ∆t)Y ((s′ + l)∆t)
(1)
CXY (l∆t) =
S − l s′ =1
for one trajectory. Typically statistics are accumulated over multiple trajectories. In any case, directly evaluating equation 1 is cumbersome as S and L
increase. Even more so if the observable is averaged over atoms of molecules
(for example, in the case of the VAF). Therefore, often the discrete Fourier
transforms of X and Y are taken to access the correlation function using the
inverse transform of their product [43]. Whether the sum is taken directly
or a Fourier transform method is used X and Y must be stored in memory
or on disc for the computation.
We use the multiple-tau correlation algorithm [50] within DL POLY to
calculate correlations on-the-fly. The method works by maintaining an hierarchical set of blocks which are updated during the simulation. The blocks
store observed data (velocities, stresses, etc.) in a decreasing resolution. The
first block stores the observed data intact whilst lower blocks store block
averages, computed over the previous level. This is controlled by three parameters, the number of blocks b, the number of points within each block p,
and the size of the block average m < p. See Appendix Appendix D for
algorithmic details of our implementation based upon [50].
This approach is similar to the velocity correlator introduced by Frenkel
and Smit [2] with the additional separation of p (data points stored at each
level) and m the averaging parameter. By increasing b and p (with m fixed),
later correlation times may be collected at a fixed accuracy. Increasing m
has the effect of calculating much longer correlation times for a trade off in
accuracy and increased performance. The maximum lag-time of a correlation
given these parameters is defined by (p−1)ml u∆t where ∆t is the simulation
time step and the frequency the correlation is updated is u. The trade off
). The actual performance will also
in performance terms scales as O(p m+1
m
depend on the values correlated (e.g. velocity across all atoms, or system
stress) and other system options (such as electrostatics).
In DL POLY we support correlating key quantities such as components
of heat-flux, velocity, the stress tensor, rigid body positions, velocites, and
xvi

angular velocities, as well as k-space resolved densities, heat-fluxes, currents,
stress tensors, and energy currents. Additionally common statistical values
reported by DL POLY’s statistics module including, but not limited to volume, kinetic/potential energy, and temperature. We do this by defining each
as an observable. Quantities that are observable may be juxtaposed in the
CONTROL file to define arbitrary correlations. For example a velocity-velocity
correlation may be defined by v x-v x where x indicates the component
correlated, and stress xy-stress xy defines a stress correlation of the xycomponent of the stress tensor. Scalar quantities such as volume can be
requested using e.g. volume-volume. The power of the juxtaposition is that
any pair of observables can be correlated.
For each correlation, independently, we also support specifying the parameters b, p, and m separately, and provide the ability to set the frequency
each correlation is updated separately from the frequency of other statistics
and output in DL POLY. This enables high-resolution correlations to be calculated without additional I/O overhead, and without forcing low-resolution
correlations to be calculated at unnecessarily high time-resolutions. Differences in correlation frequency are handled in the correlation output by automatically accounting for the different effective time-units in the outputted
lag times.
During the simulation, correlators are stored index-able by the unique
names of the observables they correlate, i.e. at any point in the simulation,
we may trivially obtain the current correlation value for say the heat-flux in x,
heat flux x-heat flux x. Not only does this serve to maintain uniqueness,
but also allows us to easily report these values, or process them further, at
runtime. The result of each correlator is reported in a new file COR which
reports the value of each correlation function as well as potentially a set
of derived quantities, at user specified intervals. These derived quantities
include viscosity, thermal conductivity, and elastic constants as detailed in
the next sections.
3.3. Viscosity and thermal conductivity
The viscosity and thermal conductivity are important properties characterising liquids, for example in the evaluation of the performance of molten
salts as nuclear reactor coolant fluids, since this is determined by their transport and thermal diffusion characteristics [51, 52]; these can be calculated
using Green-Kubo theory from correlations of shear-stress and heat-flux respectively [42, 43, 53].
xvii

The shear-viscosity may be calculated using the integral
Z ∞
V
η=
dt⟨σxy (t) σxy (0)⟩,
kB T 0

(2)

where ⟨·⟩ is the ensemble average, T is the system temperature and kB is
Boltzmann’s constant. Finally the stress tensor, for pairwise additive potential U , is defined as
σαβ

ij
1 X i i i
1 X ∂U (rij ) rαij rβ
=
m vα vβ −
,
V i
V i,j̸=i ∂r
rij

(3)

where mi , vi , ri are particle i’s mass, velocity, and position, and rij = ri − rj ,
and V is the system volume. Finally, α and β are orthogonal Cartesian
coordinates.
Similarly the thermal conductivity κ is related to the heat flux by
Z ∞
V
dt⟨q(t) · q(0)⟩.
(4)
κ=
3kB T 2 0
Where the heat flux is

N 
1 X
1 X ij i ij
i i
q=
f ·v r ,
Ev +
V i=1
2 j̸=i

(5)

where E i is the total energy (kinetic plus potential) of particle i, and f ij is
the force exerted on i due to j. In Eq. 4 the dot product has the function of
averaging over the dimensions x, y, and z.
For multi-component systems with significant asymmetry (e.g. mass difference) it is necessary to correct for mass flux effects when calculating κ in
order to compare to experimental data [54]. For example the effect is significant in LiF but not in KCl [55] which have mass ratios 6.941
= 0.37 and
18.99
35.453
= 0.91 respectively. This can be done by forming multiple Green-Kubo
39.098
formulae using the heat flux q and partial momentum densities for particular
species,
jIs =

1 X
mi vi (t),
V i∈I
s

xviii

(6)

where Is is the set of atom indices for atoms of species s. In DL POLY
the partial momentum densities may be calculated, and correlated, for a
selection of species by the user. The values themselves may also be written
to the HEATFLUX file alongside q. In general it is possible to take advantage
of momentum conservation in order to not calculate all partial momentum
densities. A user is then able to form the necessary correlations for the
corrected Green-Kubo formula required.
3.3.1. Comparison to experiment
The correlation functions integrated in equations 2 and 4 can be calculated during an MD run using DL POLY’s on-the-fly correlator. In each
case we can approximate the integral by computing the required correlation
to some maximum time-lag T and performing a numerical integration of the
correlation functions’ values. In each case, when multiple congruent correlation are present (e.g. xy and zx shear stresses), DL POLY automatically
calculates the derived properties (viscosity) listing them for each correlation
and also as an averaged quantity. Care must be taken to collect adequate
statistics for accurate calculations of these derived quantities. For example
Zhang et al. [53] detail a method for choosing an optimal cutoff lag-time.
For argon, there is experimental data for κ and η at a variety of pressure and temperatures compiled by the National Institute of Standards and
Technology (NIST) [56]. DL POLY has been used successfully before to
calculated viscosity for argon by directly correlating the outputted stress
tensor [57]. Here we report new data including thermal conductivity using
DL POLY’s on-the-fly correlator.
For all simulation work, we began by equilibrating for 105 steps in an
NPT ensemble, followed by 104 steps with an NVE ensemble to account
for changing thermostats, and then production runs, where statistics are
calculated, ran for 106 time steps. In all cases the time step was 0.001 ps.
For correlations we collect data for lag times up to 5 ps. For the small
systems (N = 500) we collect data on independently seeded simulations for
20 replicates to ensure adequate averaging of the Green-Kubo results. For
the large-scale systems, we were able to simulate 5 replicates for each data
point.
Figure 6 shows the small- and large-scale system results for viscosity (see
Eq. 2). We find good agreement with the NIST data as expected [57]. Additionally we find similarly consistent results for the large-scale simulations.
Likewise for thermal conductivity, Eq. 4, we also find the calculated results
xix

Figure 6: Viscosity for supercritical Argon, as calculated using DL POLY’s on-the-fly
correlations. In each case experimental data from NIST is compared to data from N = 500
atom simulations averaged over 20 simulations, and large scale simulations with N = 108
atoms averaged over 5 simulations, also compared to NIST [56]

xx

are consistent with the experimental data (see Figure 7). The thermal conductivity data notably deviates more, this is due to increased noise between
different replicates. For viscosity, we find standard deviations, across the different simulation results, of the order 10−6 Pa for both system sizes, whereas
for thermal conductivity the smaller systems’ standard deviations are of the
order 10−3 but for the large-scale systems this rose to 10−2 at 250 K and 10−3
at 500 K and 750 K.
Both viscosity and thermal conductivity show an interesting feature: the
minima which bound these properties from below. The minima are due to
the dynamical crossover of particle dynamics [57]. Interestingly, the values at
the minima are fixed by fundamental physical constants such as the Planck
constant and electron mass [58, 59, 60].
3.4. Elastic constants
Another use case for the on-the-fly correlator is the calculation of elastic
constants. In general, the elastic constants of a system are defined by the
elements of the elasticity tensor Cαβµν . The values of which can be calculated,
in full, using either the stress-fluctuation of strain-fluctuation methods, or by
examining system stress after applying a set of strains. The stress-fluctuation
method has been found to be more reliable and faster converging [61, 62, 63,
64].
For the calculation of elastic constants, we use this method where the
elastic constants are given by,
B
Cαβµν = ⟨Cαβµν
⟩
V
[⟨σαβ σαβ ⟩ − ⟨σαβ ⟩⟨σµν ⟩]
−
kB T
2N kB T
+
(δαµ δβν + δαν δβµ ).
V

(7)

where the parameters N is the atom count, δij is the Kronecker delta, and
σαβ is the microscopic stress tensor as in equation 3. Finally the Born term

xxi

Figure 7: Thermal conductivity for supercritical Argon, as calculated using DL POLY’s
on-the-fly correlations. In each case experimental data from NIST is compared to data
from N = 500 atom simulations averaged over 20 simulations, and large scale simulations
with N = 108 atoms averaged over 5 simulations, also compared to NIST [56]. See also
figure 6.

xxii

Figure 8: Elastic constants for FCC Argon, as calculated by DL POLY compared to
previous simulation results. In general there is good agreement with data that exists,
notably existing data is quite sparse. Our data is averaged over n = 20 initial conditions,
and statistics collected over T = 106 steps, with a 10 ps maximum correlation lag time.

xxiii

is defined as,
B
Cαβµν


1 X ∂ 2 U (rij )
=
V i,j̸=i
∂rij 2

ij
1 ∂U (rij ) rαij rβ rµij rνij
.
− ij
r
∂rij
rij 2

(8)

The 81 components of the elasticity tensor Cαβµν can be reduced to 21 independent values for physical media, and further to 3 for physical media with
cubic symmetry. Using Voigt notation we identify C11 = 13 (C1111 + C2222 +
C3333 ), C12 = 31 (C1122 + C1133 + C2233 ), and C44 = 31 (C2323 + C3131 + C1212 ).
DL POLY automatically reports the values of the elastic constants the 21
possible elastic constants when the required stress correlation terms are
present. To compare with experimental data, from the elastic constants
we can derive the bulk and shear modulus. The bulk modulus is
1
B = (C11 + 2C12 ).
3

(9)

For the shear modulus there is a choice of averaging, K V (Voigt) and K R
(Reuss) which give upper and lower bounds of the experimental results [65].
These take the form
C11 − C12 + 3C44
,
5
5
KR =
.
4/(C11 − C12 ) + 3/C44

KV =

(10)
(11)

Both K and B are reported for experimentally observed argon crystals. Note
that the isothermal compressibility is often reported where χ = B −1 , and the
bulk modulus itself may also be more simply calculated from fluctuations in
volume in an NPT ensemble as [1]
B=

⟨V ⟩kB T
.
⟨V 2 ⟩ − ⟨V ⟩2

(12)

3.4.1. Comparison to simulations and experiments
The following methodology was used in all our simulation results. An
FCC Argon crystal was initialised and equilibrated in NPT, for 105 steps
xxiv

Figure 9: The bulk and shear modulus, derived from DL POLY elastic constants (see
Fig 8 for the raw data), compared with experimental data. Voigt and Reuss averaging
give upper and lower bounds on the shear modulus respectively as expected. The Bulk
modulus is also in good agreement Like [63] we find a linear relationship in C11 and C12
with temperature and hence the same is true for B.

xxv

(timestep 1 fs) with a Nosé-hoover thermostat with thermo- and baro-stat
couplings of 0.1 ps and 0.1 ps respectively, and a cutoff of 12 Å. The potential
was the Lennard-Jones potential with the same parameters as [61].
After equilibration production runs (NVT) were simulated for 106 steps
unless otherwise specified. For these runs, the atoms where re-scaled to the
average simulation cell from the equilibration phase. Statistical averages were
calculated on a rolling window of 10 ps updated every step, and correlations
computed for a maximum lag time of 10 ps, also updated every step using
DL POLY’s on-the-fly correlators. Everything else was kept the same. For
each replicate n = 1, 2, . . . , 20, this procedure was carried out with a different
random seed for equilibration and production. The methodology is similar to
[61], except they generate initial inputs through Monte-Carlo (MC) methods.
Fig. 8 compares DL POLY’s calculated elastic constants with existing
simulation results on argon FCC crystals, including MD and MC results,
which are notably quite sparsely reported along the temperature axis. Previous simulation results are obtained from MC [45] and MD [62, 63, 61, 64].
We find that the values reported by DL POLY are consistent with previous simulation results. The results from Pereverzev [63] and Squire et al.
[45] follow broadly the same trend in temperature, and the remaining results
reported at 60 K are also consistent with DL POLY’s.
Given that we find consistent results for the individual elastic constants
against simulated data we now compare with experimental values. For the
bulk modulus (B) these follow the same averaging scheme. DL POLY’s results, Fig 9, bound the experimental values of Dobbs and Jones [66] as expected. The bulk modulus is also in good agreement with Peterson et al. [67].
Although there is a notable plateau trend at ≲ 20 K, which is not captured
in simulations. Not that Pereverzev [63] also finds a linear relationship at
low temperature consistent with our results (see Fig 8).
3.5. Currents
In DL POLY 5’s currents module is capable of calculating various collective properties resolved to user supplied k-space vectors. To begin, the
most basic functionality is the computation of k-space density, as well as

xxvi

transverse and longitudinal currents. These are defined as [48]
X
i
n(k, t) =
eik·r (t) ,

(13)

i

X
i
jL (k, t) =
(vi (t) · k̂)k̂eik·r (t) ,

(14)

i

jT (k, t) =

X
i
[vi (t) − (vi (t) · k̂)k̂eik·r (t) .

(15)

i

Equation 13 may be used to determine the intermediate scattering function using the correlations
F (k, t) =

1
⟨n(k, t)n(−k, 0)⟩,
N

(16)

which can be used to determine thermal-conductivity in fluids [47], as well
as the static structure factor as the t = 0 correlation value. The dynamic
structure factor may also be obtained using the Fourier transform of F (k, t).
Analogous correlations of jL (k, t) and, jT (k, t) may be used to determine
transverse and longitudinal propagating modes. These collective properties
relate to the overall motion of the particles of the system. Collective excitations form an integral and well known part of the theory of solid and gaseous
states, but for liquids, which combine strong interactions with dynamical
disorder, this has historically not been the case.
Alongside these currents, DL POLY can also calculate the k-space energy
density, energy currents, and k-dependent stress tensor.
The energy density is given by
e(k, t) =

1 X i ik·ri (t)
Ee
.
2 i

xxvii

(17)

and the k-dependent stress and energy currents from equations 18 and 19.

ij
X
1 X rαij rβ
i
ij
i i
σα,β (k) =
P (k, r ) eik·r (t) ,
(18)
mvα vβ −
ij
2
2 j̸=i |r |
i


1 X i i 1 XX i
i
j
ij ij
ij 2
ij
E va −
qa (k) =
(vb + vb )(ra rb /|r | )P (k, r ) eik·r (t) ,
2 i
2 j̸=i b
(19)
where Pk (k, r) = |r|

∂U (|r|) 1 − e−ik·r
∂r
ik · r

(20)

A user may request correlations of combinations of these various currents
(component wise). Each k-point specified by the user will result in one correlation matching these requests in the CONTROL file. One use case for current
correlations is calculating phonon dispersion curves. These dispersions can
be calculated in DL POLY via the calculation of current correlations
CL (k, t) = ⟨jzL (k, t)jzL (k, 0)⟩,
2CT (k, t) = ⟨jxT (k, t)jxT (k, 0)⟩
+ ⟨jyT (k, t)jyT (k, 0)⟩.

(21)
(22)

From these correlation functions, the spectra are found from the Fourier
transforms CL (k, ω) and CT (k, ω). The frequencies corresponding to maxima of these Fourier transforms of each wave-vector are interpreted as the
spectrum of collective modes [48, 49].
In liquids in particular, it is possible to observe an interesting feature of
spectra not seen in solids: k-gap in the solid-like transverse spectrum meaning
a threshold value below which no such propagating modes exist [68, 49]. The
k-gap can be observed by plotting the relationship between these maxima
and k = |k|.
To calculate these data in DL POLY we first equilibrate an argon system
of 4,000 atoms in NPT to the desired temperature. We then expand to other
desired system sizes using DL POLY’s nfold capability. We then calculate
the correlations functions on-the-fly over 106 simulation steps (1 fs per step),
for all physically reasonable k-points, kz. Our correlations are resolved for
10240 points per block, at a correlation frequency of every 10 steps (lags up
to 102400 fs). The resulting correlation functions are first averaged across 20
independent trajectories. The frequency maxima are then found by taking
xxviii

Figure 10: Dispersion curves for Argon at 85 K and 134.5 K for system sizes 4,000, 32,000,
108,000, and 10,976,000.

xxix

Fourier transforms of these averaged correlations. The result of this is shown
in figure 10 where we see the increasing k-gap for the transverse mode for
temperatures of 85 K and 134.5 K.
3.6. Rigid body correlations
In DL POLY rigid bodies can be defined as part of user input in the FIELD
file. Molecules may have single or multiple rigid body components. During
a simulation each rigid body has position, velocity and angular velocity data
which may be correlated in the same manner as per-atom correlations. That
is DL POLY will determine distinct rigid body types based on distinct rigid
body components within distinct molecular types, calculate requested correlations at runtime, and average these results for each.
Using this on-the-fly correlation functionality we are able to reproduce
the results of Brodka and Zerda [69] for SF6 . We do this by simulating rigid
bodies of SF6 molecules with the same Lennard-Jones potential parameters.
For equilibration we set the temperature to 296 K, and pressure to the experimental values listed for the pressure by Brodka and Zerda. We simulate
in NPT for 100 ps to ensure equilibration. The resulting densities were similar to the 1.5 and 1.9 g/cm3 listed in their results. We then rescaled our
resulting configurations to match those densities exactly, and then proceed
in NVE collecting statistics over 1 ns simulation steps. Brodka and Zerda
use 1000 equilibration steps and 4200 production steps at a higher time step
(5 fs vs. 1 fs). Our results for the VAF are shown in Figure 11 which match
well.
The VAF can also be used as a definition of the Frenkel line in supercritical fluids [70, 71], which marks the crossover between liquid-like and gas-like
particle dynamics and thermodynamics. Using the rigid body VAF functionality we perform this analysis for rigid body CH4 molecules. We follow the
process of Yang et al. [72], using the same potential [73]. We first equilibrate
our system in NPT at the desired P (900 bar) and T conditions for 50 ps and
then collect statistics over a further 50 ps. As before we find the equilibrated
simulation densities to be approximately the same as the NIST experimental
data for supercritical CH4 [56], and for increasing T there is a loss of the
VAF minimum as shown in figure 12.

xxx

Figure 11: Comparison to Brodka and Zerda’s results (digitised) for SF6 rigid body VAFs
[69]. Temperature 296 K with pressure and density indicated in the legend. N indicates
the rigid body count.

xxxi

Figure 12: VAF for CH4 rigid molecules at 900 bar for increasing T , showing the crossover
at the Frenkel line.

4. Conclusions
We have used the paradigm of on-the-fly calculations to develop a general purpose correlation module in DL POLY 5. This functionality exposes
key properties derived from simulation data which may be composed into
correlation functions by the user. This method provides a convenient way
to perform correlation analysis as well as eliminating the cumbersome I/O
operations inherent in saving trajectory data for post-processing. It also
presents the continuous availability of correlation function during simulation
time. This saving becomes significant for high temporal resolution correlations defined at the atomic or molecular level such as the VAF.
DL POLY continues to show the same scalability for large CPU counts
as past versions. Our correlation implementation naturally takes advantage
of the DD algorithm and multi-processing architecture used in DL POLY to
distribute the calculation of atomistic correlations. We aim to continue the
development of on-the-fly methods in and beyond DL POLY 5. For example, to date we have exposed rigid body molecular positions, velocities, and
angular velocities. A natural extension is to correlate properties of flexible
xxxii

molecular species and other atomic groupings more generally. In particular
the more complex sub-components of flexible molecules such as atomic bonds
or angle based interactions etc.
Author contributions
H. L. Devereux: Methodology; Software; Validation; Formal analysis;
Investigation; Data Curation; Writing - Original Draft; Writing - Review
& Editing; Visualization. C. Cockrell:
Methodology; Software; Validation; Formal Analysis; Investigation; Writing - Original Draft; Writing Review & Editing. A. M. Elena: Conceptualization; Methodology; Software; Validation; Formal Analysis; Investigation; Resources; Data Curation;
Writing - Original Draft; Writing - Review & Editing; Supervision; Project
administration; Funding Acquisition. Ian Bush: Software; Validation;
Data Curation; Writing - Review & Editing; Aidan B. G. Chalk: Software; Validation; Data Curation. Jim Madge: Software; Validation; Data
Curation. Ivan Scivetti: Software; Validation; Data Curation. J. S.
Wilkins: Software; Validation; Data Curation; Writing - Original Draft;
Writing - Review & Editing. I. T. Todorov: Conceptualization; Methodology; Software; Validation; Formal Analysis; Investigation; Resources; Data
Curation; Writing - Original Draft; Writing - Review & Editing; Supervision;
Project administration; Funding Acquisition. W. Smith: Software; Conceptualization; Methodology; Validation; Data Curation. K. Trachenko:
Conceptualization; Methodology; Validation; Formal analysis; Investigation;
Resources; Writing; Writing-Review & Editing; Supervision; Project administration; Funding acquisition.
Data statement
The DL POLY 5 source code is hosted on GitLab [11] (GPLv3.0). The
input files and submission scripts to reproduce the statistical data for our
on-the-fly methods can be found on Gitlab https://gitlab.com/apw951/
dl_poly_workflows/-/tree/dlpoly_5_preprint?ref_type=tags.
Acknowledgements
We are grateful to EPSRC (for grant No. EP/W029006/1). This research
utilised Queen Mary’s Apocrita HPC facility, supported by QMUL Researchxxxiii

IT http://doi.org/10.5281/zenodo.438045, STFC Scientific Computing Department’s SCARF cluster and the Sulis Tier 2 HPC platform hosted by
the Scientific Computing Research Technology Platform at the University of
Warwick and funded by EPSRC GrantS EP/T022108/1 and the HPC Midlands+ consortium. Via our membership of the UK’s HEC Materials Chemistry Consortium, which is funded by EPSRC (EP/R029431), this work used
the ARCHER UK National Supercomputing Service (http://www.archer.ac.uk)
References
[1] M. P. Allen and D. J. Tildesley. Computer Simulation of Liquids. Oxford
University Press, 2007.
[2] D. Frenkel and B. Smit. Understanding molecular simulation: from
algorithms to applications, volume 2. Academic Press San Diego, 2002.
[3] K Nordlund. Computational materials science of ion irradiation. Nuclear Instruments and Methods in Physics Research Section B: Beam
Interactions with Materials and Atoms, 188(1-4):41–48, 2002.
[4] E. Zarkadoula, S. L. Daraszewicz, D. M. Duffy, M. A. Seaton, I. T.
Todorov, K. Nordlund, M. T. Dove, and K. Trachenko. The nature
of high-energy radiation damage in iron. J. Phys.: Condens. Matt.,
25(12):125402, feb 2013. doi: 10.1088/0953-8984/25/12/125402. URL
https://dx.doi.org/10.1088/0953-8984/25/12/125402.
[5] 0.5 MeV collision cascade animation. https://ccmmp.ph.qmul.ac.uk/
~kostya/rad.html. Accessed: 2025-01-14.
[6] George Beckett, Josephine Beech-Brandt, Kieran Leach, Zöe Payne,
Alan Simpson, Lorna Smith, Andy Turner, and Anne Whiting. Archer2
service description, December 2024. URL https://doi.org/10.5281/
zenodo.14507040.
[7] Archer2 user documentation: Data management and transfer. (https:
//docs.archer2.ac.uk/user-guide/data/), . Accessed: 2025-01-21.
[8] I. T. Todorov, W. Smith, K. Trachenko, and M.T. Dove. DL POLY 3:
new dimensions in molecular dynamics simulations via massive parallelism. J. Mater. Chem., 16:1911, 2006.
xxxiv

[9] IT Todorov and W Smith. DL POLY 3: the CCP5 national UK code
for molecular–dynamics simulations. Philosophical Transactions of the
Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 362(1822):1835–1852, 2004.
[10] Ilian T. Todorov, William Smith, Kostya Trachenko, and Martin T.
Dove. DL POLY 3: new dimensions in molecular dynamics simulations
via massive parallelism. Journal of Materials Chemistry, 16(20):1911–
1918, 2006.
[11] DL POLY, 2024. URL https://gitlab.com/ccp5/dl-poly.
[12] W Smith and TR Forester. DL POLY 2.0: A general-purpose parallel
molecular dynamics simulation package. Journal of molecular graphics,
14(3):136–141, 1996.
[13] W Smith, CW Yong, and PM Rodger. DL POLY: Application to molecular simulation. Molecular Simulation, 28(5):385–471, 2002.
[14] DL POLY CLASSIC. https://gitlab.com/DL_POLY_Classic. Accessed: 2025-01-21.
[15] W Smith and TR Forester. Parallel macromolecular simulations and the
replicated data strategy: I. the computation of atomic forces. Computer
physics communications, 79(1):52–62, 1994.
[16] W Smith and TR Forester. Parallel macromolecular simulations and
the replicated data strategy: II. the RD-SHAKE algorithm. Computer
physics communications, 79(1):63–77, 1994.
[17] Walt Brainerd. Fortran 77. Communications of the ACM, 21(10):806–
820, 1978.
[18] David W. Walker and Jack J. Dongarra. MPI: a standard message
passing interface. Supercomputer, 12:56–68, 1996.
[19] M. R. S. Pinches, D. J. Tildesley, and W. Smith. Large scale molecular
dynamics on parallel computers using the link-cell algorithm. Molecular
Simulation, 6(1-3):51–87, 1991.
[20] E. O. Brigham. The fast fourier transform and its applications, 1988.
xxxv

[21] Ulrich Essmann, Lalith Perera, Max L. Berkowitz, Tom Darden, Hsing
Lee, and Lee G. Pedersen. A smooth particle mesh Ewald method.
The Journal of Chemical Physics, 103(19):8577–8593, 11 1995. ISSN
0021-9606. doi: 10.1063/1.470117. URL https://doi.org/10.1063/
1.470117.
[22] M. T. Dove, M. Calleja, R. Bruin, J. G. Lewis, S. Mehmood Hasan,
V. N. Alexandrov, M. Keegan, S. Ballard, R. P. Tyler, I. T. Todorov,
P. B. Wilson, M. Alfredsson, G. D. Price, C. Chapman, W. Emmerich,
S. A. Wells, A. Marmier, S. C. Parker, and Z. Du. The eminerals collaboratory: tools and experience. Molecular Simulation, 31:329–337, 2005.
doi: https://doi.org/10.1080/10503300500520106.
[23] T. Hey and A.E. Trefethen. The uk e-science core programme and the
grid. in: Sloot, p.m.a., hoekstra, a.g., tan, c.j.k., dongarra, j.j. (eds)
computational science — iccs 2002. Molecular Simulation, 31, 2002.
doi: https://doi.org/10.1007/3-540-46043-8 1.
[24] I. J. Bush, Ilian T. Todorov, and W. Smith. A DAFT DL POLY
distributed memory adaptation of the smoothed particle mesh Ewald
method. Computer physics communications, 175(5):323–329, 2006.
[25] K. Trachenko, M.T. Dove, T. Geisler, I. T. Todorov, and W. Smith.
Radiation damage effects and percolation theory. J. Phys.: Condens.
Matter, 16:S2623–S2627, 2004.
[26] Martyn F. Guest, Alin M. Elena, and Aidan B. G. Chalk. DL POLY - a
performance overview analysing, understanding and exploiting available
hpc technology. Molecular Simulation, 47(2-3):194–227, 2021.
[27] I. Todorov, L. Ellison, and W. Smith. Rigid body molecular dynamics within the domain decomposition framework of DL POLY 4. Parallel Computing Technologies: Lecture Notes in Computer Science (V.
Malyshkin (Ed.)), 7979:429–435, 2013. doi: https://doi.org/10.1007/
978-3-642-39958-9 40.
[28] M. A. Seaton, I. T. Todorov, S. L. Daraszewicz, G. S. Khara, and
D. M. Duffy. Domain decomposition of the two-temperature model
in DL POLY 4. Molecular Simulation, 47:180–187, 2021. doi: https:
//doi.org/10.1080/08927022.2018.1511902.
xxxvi

[29] I. J. Bush, Ilian T. Todorov, and W. Smith. Optimisation of the I/O for
distributed data molecular dynamics applications. Cray User Group
Proceedings 2010, 2010. URL https://cug.org/5-publications/
proceedings_attendee_lists/CUG10CD/.
[30] T. Shardlow. Splitting for dissipative particle dynamics. SIAM Journal
on Scientific Computing, 24:1267–1282, 2003. doi: https://doi.org/10.
1137/S1064827501392879.
[31] M. Seaton, I. Todorov, and A. Yaser. Efficient domain decomposition
of dissipative particle dynamics via choice of pseudorandom number
generator. Parallel Computing Technologies: Lecture Notes in Computer
Science (V. Malyshkin (Ed.)), 7979:250–257, 2013. doi: https://doi.org/
10.1007/978-3-642-39958-9 23.
[32] Oren Ben-Kiki, Clark Evans, and Brian Ingerson. Yaml ain’t markup
language (yaml™) version 1.1. Working Draft 2008, 5(11), 2009.
[33] Alin Elena, Jacob Wilkins, and Harvey Devereux. dlpoly-py, March
2025. URL https://doi.org/10.5281/zenodo.14983396.
[34] Ivan Scivetti, Kakali Sen, Alin M. Elena, and Ilian Todorov. Reactive
molecular dynamics at constant pressure via nonreactive force fields:
Extending the empirical valence bond method to the isothermal-isobaric
ensemble. The Journal of Physical Chemistry A, 124(37):7585–7597,
2020. doi: 10.1021/acs.jpca.0c05461. URL https://doi.org/10.1021/
acs.jpca.0c05461. PMID: 32820921.
[35] You Lu, Matthew R. Farrow, Pierre Fayon, Andrew J. Logsdail,
Alexey A. Sokol, C. Richard A. Catlow, Paul Sherwood, and Thomas W.
Keal. Open-source, Python-based redevelopment of the ChemShell multiscale QM/MM environment. Journal of Chemical Theory and Computation, 15(2):1317–1328, 2019. doi: 10.1021/acs.jctc.8b01036. URL
https://doi.org/10.1021/acs.jctc.8b01036. PMID: 30511845.
[36] P. P. Ewald. Die Berechnung optischer und elektrostatischer Gitterpotentiale [On the calculation of optical and electrostatic grid potentials].
Annalen der Physik, 369(3):253–287, 1921.
[37] J. E. Lennard-Jones. On the determination of molecular fields. Proc. R.
Soc. Lond. A, 106(738):463–477, 1924.
xxxvii

[38] Archer2 user documentation: Archer2 hardware. (https://docs.
archer2.ac.uk/user-guide/hardware/#system-overview), . Accessed: 2025-02-21.
[39] Richard M. Karp. On-line algorithms versus off-line algorithms: How
much. In Algorithms, Software, Architecture: Information Processing
92: Proceedings of the IFIP 12th World Computer Congress, volume 1,
page 416, 1992.
[40] Georg Krempl, Indre Žliobaite, Dariusz Brzeziński, Eyke Hüllermeier,
Mark Last, Vincent Lemaire, Tino Noack, Ammar Shaker, Sonja Sievi,
Myra Spiliopoulou, et al. Open challenges for data stream mining research. ACM SIGKDD explorations newsletter, 16(1):1–10, 2014.
[41] Mayur Datar, Aristides Gionis, Piotr Indyk, and Rajeev Motwani. Maintaining stream statistics over sliding windows. SIAM journal on computing, 31(6):1794–1813, 2002.
[42] Robert Zwanzig and Raymond D Mountain. High-frequency elastic moduli of simple fluids. The Journal of Chemical Physics, 43(12):4464–4471,
1965.
[43] Mark E Tuckerman. Statistical mechanics: theory and molecular simulation. Oxford university press, 2023.
[44] John R. Ray and Aneesur Rahman. Statistical ensembles and molecular
dynamics studies of anisotropic solids. II. The Journal of Chemical
Physics, 82(9):4243–4247, 05 1985. ISSN 0021-9606. doi: 10.1063/1.
448813. URL https://doi.org/10.1063/1.448813.
[45] D. R. Squire, A. C. Holt, and W. G. Hoover. Isothermal elastic constants
for argon. theory and monte carlo calculations. Physica, 42(3):388–397,
1969.
[46] J. F. Lutsko. Generalized expressions for the calculation of elastic constants by computer simulation. Journal of applied physics, 65(8):2991–
2997, 1989.
[47] Bingqing Cheng and Daan Frenkel. Computing the heat conductivity
of fluids from density fluctuations. Physical Review Letters, 125(13):
130602, 2020.
xxxviii

[48] U. Balucani and M. Zoppi. Dynamics of the Liquid State. Oxford University Press, 2003.
[49] C. Yang, M. T. Dove, V. V. Brazhkin, and K. Trachenko. Emergence
and evolution of the k-gap in spectra of liquid and supercritical states.
Phys. Rev. Lett., 118:215502, 2017.
[50] Jorge Ramı́rez, Sathish K. Sukumaran, Bart Vorselaars, and Alexei E.
Likhtman. Efficient on the fly calculation of time correlation functions
in computer simulations. The Journal of Chemical Physics, 133(15):
154103, 10 2010. ISSN 0021-9606. doi: 10.1063/1.3491098. URL https:
//doi.org/10.1063/1.3491098.
[51] P. R. Kasten M. W. Rosenthal and R. B. Briggs. Molten-salt reactors—history, status, and potential. Nuclear Applications and Technology, 8(2):107–117, 1970. doi: 10.13182/NT70-A28619. URL https:
//doi.org/10.13182/NT70-A28619.
[52] Benjamin A. Frandsen, Stella D. Nickerson, Austin D. Clark, Andrew
Solano, Raju Baral, Johnny Williams, Jörg Neuefeind, and Matthew
Memmott. The structure of molten FLiNaK. Journal of Nuclear Materials, 537:152219, 2020. ISSN 0022-3115. doi: https://doi.org/10.
1016/j.jnucmat.2020.152219. URL https://www.sciencedirect.com/
science/article/pii/S0022311520304736.
[53] Yong Zhang, Akihito Otani, and Edward J. Maginn. Reliable viscosity
calculation from equilibrium molecular dynamics simulations: A time
decomposition method. Journal of chemical theory and computation, 11
(8):3537–3546, 2015.
[54] Jeff Armstrong and Fernando Bresme. Thermal conductivity of highly
asymmetric binary mixtures: how important are heat/mass coupling effects? Physical Chemistry Chemical Physics, 16(24):12307–12316, 2014.
[55] C. Cockrell, M. Withington, H. L. Devereux, A. M. Elena, I. T. Todorov,
Z. K. Liu, S. L. Shang, J. S. McCloy, P. A. Bingham, and K. Trachenko.
Thermal conductivity and thermal diffusivity of molten salts: insights
from molecular dynamics simulations and fundamental bounds. arXiv
preprint arXiv:2409.03775, 2024.

xxxix

[56] National Institute of Standards and Technology database. URL https:
//webbook.nist.gov/chemistry/fluid.
[57] C. Cockrell, V. V. Brazhkin, and K. Trachenko. Universal interrelation between dynamics and thermodynamics and a dynamically driven
“c” transition in fluids. Phys. Rev. E, 104:034108, Sep 2021. doi:
10.1103/PhysRevE.104.034108. URL https://link.aps.org/doi/10.
1103/PhysRevE.104.034108.
[58] K. Trachenko and V. V. Brazhkin. Minimal quantum viscosity from
fundamental physical constants. Sci. Adv., 6:eaba3747, 2020.
[59] K. Trachenko and V. V. Brazhkin. The quantum mechanics of viscosity.
Physics Today, 74(12):66, 2021.
[60] K. Trachenko, M. Baggioli, K. Behnia, and V. V. Brazhkin. Universal
lower bounds on energy and momentum diffusion in liquids. Phys. Rev.
B, 103:014311, 2021.
[61] Germain Clavier, Nicolas Desbiens, Emeric Bourasseau, Véronique Lachet, Nadège Brusselle-Dupend, and Bernard Rousseau. Computation
of elastic constants of solids using molecular simulation: comparison of
constant volume and constant pressure ensemble methods. Molecular
Simulation, 43(17):1413–1422, 2017.
[62] D. J. Quesnel, D. S. Rimai, and L. P. DeMejo. Elastic compliances and
stiffnesses of the fcc Lennard-Jones solid. Physical Review B, 48(10):
6795, 1993.
[63] Andrey Pereverzev. Isothermal and adiabatic elastic constants from
virial fluctuations. Physical Review E, 106(4):044110, 2022.
[64] Aidan Thompson and Germain Clavier. A general method for calculating local stress and elastic constants for arbitrary many-body interaction
potentials in LAMMPS. Technical report, Sandia National Lab.(SNLNM), Albuquerque, NM (United States), 2022.
[65] Richard Hill. The elastic behaviour of a crystalline aggregate. Proceedings of the Physical Society. Section A, 65(5):349, 1952.

xl

[66] ER Dobbs and Gwyn Owain Jones. Theory and properties of solid argon.
Reports on Progress in Physics, 20(1):516, 1957.
[67] O. G. Peterson, D. N. Batchelder, and R. O. Simmons. Measurements of X-ray lattice constant, thermal expansivity, and isothermal
compressibility of argon crystals. Phys. Rev., 150:703–711, Oct 1966.
doi: 10.1103/PhysRev.150.703. URL https://link.aps.org/doi/10.
1103/PhysRev.150.703.
[68] M. Baggioli, M. Vasin, V. Brazhkin, and K. Trachenko. Gapped momentum states. Physics Reports, 865:1, 2020.
[69] A Brodka and TW Zerda. A molecular dynamics simulation of sulphur
hexafluoride. Molecular Physics, 76(1):103–112, 1992.
[70] C. Cockrell, V. V. Brazhkin, and K. Trachenko. Transition in the supercritical state of matter: Review of experimental evidence. Physics
Reports, 941:1–27, 12 2021. ISSN 0370-1573. doi: 10.1016/J.PHYSREP.
2021.10.002.
[71] V. V. Brazhkin, Yu. D. Fomin, A. G. Lyapin, V. N. Ryzhov, E. N.
Tsiok, and Kostya Trachenko. “liquid-gas” transition in the supercritical
region: Fundamental changes in the particle dynamics. Phys. Rev. Lett.,
111:145901, 2013.
[72] C. Yang, V. V. Brazhkin, M. T. Dove, and K. Trachenko. Frenkel line
and solubility maximum in supercritical fluids. Phys. Rev. E, 91:012112,
2015.
[73] Ioannis Skarmoutsos, Leonidas I. Kampanakis, and Jannis Samios. Investigation of the vapor–liquid equilibrium and supercritical phase of
pure methane via computer simulations. Journal of molecular liquids,
117(1-3):33–41, 2005.
[74] I. S. Gradshteyn and I. M. Ryzhik. Table of Integrals, Series, and Products. Elsevier, 2007. ISBN 9780123736376.

xli

Appendix A. Kernels and derivation for SPME
Below follows a derivation for the generic kernels for order 1/rp potentials.
gp .
As described in Essmann et al. [21] the general form of the g function for
1/rp is given as:
Z∞

2
 sp−1 exp −s2 ds
(A.1)
gp (x) =
p
Γ 2
x

From Gradshteyn and Ryzhik [74] (pp. 108–109) and [21] we find for r−1
(Coulomb):
Z∞

2
 exp −s2 ds
g1 (x) =
(A.2)
1
Γ 2
x

Z
where:

1
exp (−βx ) dx =
2
n

r

p 
π
erf
βx
β

√
2
π
g1 (x) = √
(erf (∞) − erf (x))
π 2
g1 (x) = erfc (x)

(A.3)
(A.4)

For the general case (1/rp ) we find:
gp =

Z∞

2
Γ

p
2




sp−1 exp −s2 ds

x

where:
Z

xm exp (±axn ) dx =
xm+1−n
na
Z
m+1−n
∓
xm−n exp (±axn ) dx
na
±

xlii

(A.5)

So for our simpler case:
Z

xp−1 exp −x2 dx =
xp−2
2 Z

p−2
+
xp−3 exp −x2 dx
2

−

(A.6)

Which, using a series expansion gives for even powers:
p/2  2n 
X

x
exp −x2
n!
n=0

(A.7)

and for odd powers:
p−1  n
X
2
n=1

n!!

x

2n−1



where:

exp (−x2 )
√
+ erfc (x)
π

(A.8)

n+1

n!! =

2
Y

(2j − 1)

j=1

which gives us the final form of the equation:

"
#
h 2n i
p/2

P

x

exp (−x2 ) ,
p ∈ even
 Γ (2p )
n!
2
n=0
gp (x) =
 p−1


P  2n 2n−1  exp(−x2 )

2

√
x
+ erfc (x) , p ∈ odd
Γ p
π
( 2 ) n=1 n!!
fp .
2xp−3

fp (x) =
Γ p2

Z∞


s2−p exp −s2 ds

(A.9)

(A.10)

x

There are two simpler to solve cases, these are:
where p = 1:
1 exp (−x2 )
f1 = √
,
x2
π
xliii

(A.11)

and where p is odd and p > 3
fp =
p
X
2

1
p!Γ

p
2





(p − 2n)! (−1)n+1 x2n exp −x2 +

n=0

−x

2

 p−3
−1
2


Ei −x , (A.12)
2



The general case of fp is harder to evaluate. Consider
Z∞


st exp −s2 ds

(A.13)

 t−1  

s
s exp −s2 ds

(A.14)

I (t, x) =
x

This can be written
Z∞
I (t, x) =
x

Which can easily be integrated by parts to give
I (t, x) =


t−1
1
I (t − 2, x) + xt−1 exp −x2
2
2

(A.15)

This can be rearranged to give us an upwards recurrence
I (t + 2, x) =


1
t+1
I (t, x) + xt+1 exp −x2
2
2

(A.16)

From which given I(0,x) and I(1,x) we can calculate all values of the
integral for positive t. These starting integrals are straightfoward:
√
π
I(0, x) =
erf c(x)
(A.17)
2

1
exp −x2
(A.18)
2
For negative t we need a downwards recurrence. This can also be obtained
I(1, x) =

xliv

by rearranging the above


2
1 t−1
−x2
I (t − 2, x) =
I (t, x) − x exp
t−1
2

(A.19)

While we can use the above for even negative t it is not suitable for odd
as stepping from t = 1 to t = −1 introduces a division by zero. We can solve
this by noting
1
(A.20)
I (−1, x) = E1 (x)
2
where E1 (x) is the first order exponential integral and using this as the
starting point for odd negative t.
From I (t, x) fp is trivially calculated
dgp
.
dx

g (x) =
g (x) =
where : F [s] =

Z∞

2
Γ


p
2

2
Γ
Z

p
2


sp−1 exp −s2 ds

(A.21)

x

 [F (∞) − F (x)]


sp−1 exp −s2 ds

dg (x)
2
′

=
p [−F (x)]
dx
Γ 2

dg (x)
2  p−1
 −x exp −x2
=
p
dx
Γ 2
dfp
.
dx

xlv

(A.22)
(A.23)
(A.24)
(A.25)

2xp−3

f (x) =
Γ p2

Z∞


s2−p exp −s2 ds

(A.26)

x

p−3

2x
 [F (∞) − F (x)]
Γ p2
Z

where : F [s] = s2−p exp −s2 ds
"
#
df (x)
d 2xp−3
 [F (∞) − F (x)]
=
dx
dx Γ p2
f (x) =

2xp−3 d

[F (∞) − F (x)]
Γ p2 dx
p−3 2
 p−3 [F (∞) − F (x)]
=
p x
x Γ 2
2xp−3
 [−F ′ (x)]
+
Γ p2

2xp−3  2−p
p−3
2

f (x) −
=
x
exp
−x
x
Γ p2

p−3
2
2

=
f (x) −
p exp −x
x
xΓ 2

+

N.B. our x is actually:
x=
so we find that:

πm
β

(A.27)
(A.28)

(A.29)

(A.30)
(A.31)
(A.32)

(A.33)

df (x)
dx df (x)
→
−
dm
dm dx

(A.34)

π df (x)
β dx

(A.35)

which gives:

as our actual stress kernel.

xlvi

Appendix B. Per-particle contributions for SPME
This section follows the notation of the original SPME paper [21]. α, β, γ
label the lattice vectors. Note all is in the basis of the lattice vectors, should
cartesian values be required a basis transformation is needed at the end (see
the end of the force calculation for an example in the code)
Consider
ΩABC =

1 X
B C
S(m)mA
α mβ mγ f (m)
2πV m̸=0

where A, B, C are the derivatives in u, v, w.
It will be shown that all of the energy, force and stresses can be expressed
in this form
Given the definition of the structure factor we can write:
ΩABC =

1 X X 2πim.rj A B C
qj e
mα mβ mγ f (m)
2πV m̸=0 j

Now applying the standard SPME methods.
First start working in scaled coordinates, where the scaling is by the size
of the FFT grid we intend to use. So for a grid of size K 1 *K 2 *K 3

ΩABC

uµj = Kµ rµj
X
Y 2πi mµ uµj
X
1
B C
Kµ
e
qj
mA
=
α mβ mγ f (m)
2πV j
µ
m̸=0

Now applying the SPME approximation:
m
µ

2πi Kµ uµj

e

≈ bµ (mµ )

∞
X

Mn (uµj − kµ )e

m
µ

2πi Kµ kµ

kµ =−∞

Apply periodic boundary conditions:
m
µ

2πi Kµ uµj

e

≈ bµ (m)

∞
X

Kµ −1

X

m
µ

2πi Kµ kµ

Mn (uµj − kµ − nµ Kµ )e

nµ =−∞ kµ =0

However we can go one step further. Differentiating A times on both
sides w.r.t. uµj and rearranging gives

xlvii

m
µ

2πi Kµ uµj

mA
µe


Kµ
2πi
∞
X

≈

A
bµ (m)×
Kµ −1

X

∂ A Mn (uµj − kµ − nµ Kµ )/∂uµj e

m
µ

2πi Kµ kµ

nµ =−∞ kµ =0

The properties of cardinal B splines means this is easy to evaluate as
d
Mn (u) = Mn−1 (u) − Mn−1 (u − 1)
du
and since the splines are evaluated by recursion we already have all the terms
we need already.
Inserting this into the above gives

ABC

Ω

K
∞
1 −1 K
2 −1 K
3 −1
X
X
X
X
 ABC
1 X
Qj (k, n)×
=
2πV j n ,n ,n =−∞ k
k
k
1 2 3
1
"
# 2 #3
X Y
mµ k µ
2πi K
µ
bµ (m)e
f (m)
m̸=0

QABC
(k, n)
j

µ



Kα
2πi

A

∂A
Mn (uαj − kα − nα Kα )×
∂uA
αj



Kβ
2πi

B

∂B
Mn (uβj − kβ − nβ Kβ )×
∂uB
βj



Kγ
2πi

C

∂C
Mn (uγj − kγ − nγ Kγ )
∂uC
γj

= qj

Now the sum over m is almost a discrete Fourier transform. The problems
are missing out zero and the infinite range.
We can make it one if
1. We define f(0)=0 – which it will be for a neutral cell
xlviii

2. We only sum out to K µ -1. This is OK if f(m) has decayed to negligible
values by the edge of the grid. This should be the case if we have
converged the calculation carefully – ultimately the decaying potential
term will kill off the function provided we have chosen the grid size and
Ewald parameter correctly
Thus we can define a per particle contribution as
X
ΩABC =
ωjABC

(B.1)

j

ωjABC

1
=
2πV

∞
X
n1 n2 n3 =−∞

K
1 −1 K
2 −1 K
3 −1
X
X
X
k1

k2

QABC
(k, n)
j

k3

"
X Y
m̸=0

2πi

bµ (m)e

mµ kµ
Kµ

µ

(B.2)
QABC
(k, n) = qj
j



Kα
2πi

A

∂A
Mn (uαj − kα − nα Kα )×
∂uA
αj



Kβ
2πi

B

∂B
Mn (uβj − kβ − nβ Kβ )×
∂uB
βj



Kγ
2πi

C

∂C
Mn (uγj − kγ − nγ Kγ )
∂uC
γj

xlix

(B.3)

#
f (m)

Appendix C. Redefinition of control files
Old Style.
P-cresol 340
temperature 340
ensemble nve
1.0
pressure
0.000
steps
10200000
multiple
1
print
2000
stack
100
stats
2000
trajectory
200000 2000 0
timestep
0.0005
cutoff
10.0
delr
0.5
rvdw cutoff
10.000E+00
ewald precision
1.0000E-06
cap
1.0000E+04
shake tolerance
1.0000E-05
quaternion tolerance
1.0000E-05
job time
2.5000E+05
close time
1.0000E+02
finish

l

New Style.
title P-cresol 340
io_file_config CONFIG
io_file_field FIELD
io_file_statis STATIS
io_file_revive REVIVE
io_file_revcon REVCON
io_file_cor COR
io_file_currents CURRENTS
print_frequency 2000 steps
stats_frequency 2000 steps
stack_size 100 steps
vdw_cutoff 10.0 ang
padding 0.125 ang
cutoff 10.0 ang
coul_method spme
spme_precision 1e-06
ensemble nve
temperature 340.0 K
pressure_hydrostatic 0.0 katm
timestep 0.0005 ps
time_run 10200000 steps
time_equilibration 0 steps
restart clean
traj_calculate ON
traj_start 200000 steps
traj_interval 2000 steps
traj_key pos
initial_minimum_separation 0.0 ang
equilibration_force_cap 10000.0 k_B.temp/ang
li

shake_tolerance 1e-05 ang

lii

Appendix D. Implementation of general correlations
In order to support arbitrary correlations in a modular and extensible
manner, we separate the code into two components. First implementing the
multi-tau algorithm [50], and second supplying an interface to submit data
for correlation at runtime.
For the first, Algorithm 1 details updating a correlator with newly observed data x and y. In this listing the correlator structure includes rolling
shift arrays for the data to be stored in, accumulators, and correlation values,
all index by bi the block index in the hierarchical data structure. Every m
steps the algorithm proceeds to the next block by averaging the accumulated
values of x and y that have previously been submitted. This step results in
the increasing granularity when bi > 1 if m > 1. Every p steps the shift
arrays will again update from the start. The values of x and y may be real
or complex (in which case a conjugate must be taken). In order to read out
the correlation values, the implicit time lags due to any coarse graining need
to be unwrapped as shown in Algorithm 2.
During a simulation multiple correlators are spawned, identified with correlating particular observables. As the simulation progresses, whenever a
correlation must be updated the get method for the observables it tracks are
called and the data submitted with bi = 1 (the first block).
For the second part, the interface calls for a pair of abstract Observable
types to be supplied which each implement a get method. This method
obtains the data to be correlated and acts as a kernel. By designing the
module in this way a single loop over arbitrary pairs of Observable types
may be completed to compute all correlations, without the need for additional
selection logic. The specific functionality is implemented in types deriving
from Observable as opposed to including this logic in the statistics collection
subroutine itself. For some types this is a relatively trivial task. For example
obtaining the velocity of a particular atom. Some a little more complex such
as selecting one of the six current types which are also separated by atomic
species and k-point. In all cases however, computation is separated from
obtaining the data.

liii

input: x: first value correlated, y: second value correlated, bi : the
correlator block to update, cor: correlator data to update.
if bi > cor.blocks then
return
end
cor.blocksUsed ← max(bi , cor.blocksUsed)
cor.shift[bi , cor.shift[bi ], :] ← x, y
cor.validShift[bi ] ← max(cor.shift[bi ], 1)
cor.accumulator[bi , :] ← cor.accumulate[bi , :] +x, y
cor.accumulated[bi ] ← cor.accumulated[bi ]+1
if cor.accumulated[bi ] > cor.m then
u ← cor.accumulator[bi , 1]/cor.accumulated[bi ]
v ← cor.accumulatorbi , 2]/cor.accumulated[bi ]
Update(u, v, bi + 1, cor)
cor.accumulator[bi , :] ← 0; cor.accumulated[bi , :] ← 0
end
i ← cor.shift[bi ]
if i == 1 then
j←i
for n ← 1, cor.validShifts[bi ] do
cor.values[bi , n] ←
cor.values[bi , n]+cor.shift[bi , n, 1]*Conj(cor.shift[bi , j, 2])
cor.count[bi , n] ← cor.count[bi , n] +1
j ←j−1
if j <= 0 then
j ←j+p
end
end
else
j ← i − ⌊cor.p/cor.m⌋ + 1
for n ← ⌊cor.p/cor.m⌋ + 1, cor.p do
if j <= 0 then
j ← j + cor.p
end
cor.values[bi , n] ←
cor.values[bi , n]+cor.shift[bi , n, 1]*Conj(cor.shift[bi , j, 2])
cor.count[bi , n] ← cor.count[bi , n] +1
j ←j−1
end
end
liv
cor.shift[bi ] ← cor.shift[bi ]+1
if cor.shift[bi ] > p then
cor.shift[bi ] ← 1
end
Algorithm 1: Updating a correlator

input : cor: correlator to readout.
output: values: the correlation values, times: the time lags between
correlation values.
τ ←1
for i ← 1, cor.p do
if cor.count[1, i] > 0 then
values[τ ]← values[τ ] + cor.values[1, i]/cor.count[1, i]
times[τ ] ← i − 1
τ ←τ +1
end
end
for k ← 1, cor.blocksUsed do
for i ← ⌊cor.p/cor.m⌋ + 1, cor.p do
if cor.count[k, i] > 0 then
values[τ ]← values[τ ] + cor.values[k, i]/cor.count[k, i]
times[τ ] ← (i − 1) ∗ mk−1
τ ←τ +1
end
end
end
Algorithm 2: Readout of a correlation

lv