Non-equilibrium Fluctuations of Interacting Particle Systems
Speaker: Benjamin Fehrman
Date of Talk: June 23, 2025
Upstream link: UCI PDE Summer School
Let’s begin with a simple random walk as a motivating starting point. Let \(\left\lbrace X_n \right\rbrace\) be a sequence of i.i.d. fair coin flips taking values in \(\left\lbrace -1,1 \right\rbrace\). There are some classic theorems like the law of large numbers and the central limit theorem that describe the behaviour of \(S_n = \frac{1}{n}\left( X_1+\cdots + X_n \right)\). For instance, the central limit theorem suggests the Gaussian tail heuristic \[\mathbb{P} \left[ S_n \geq \delta \right] \approx \exp \left( -\frac{n \delta ^2}{2} \right).\] This is a crude form of the large deviations principle: although it’s quite reasonable when \(\delta \ll 1\), when \(\delta = 1 + \epsilon \) it’s pretty garbage, since it predicts a positive probability. We know for sure \(\left\lvert S_n \right\rvert \leq 1\), after all!
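To see where this heuristic comes from: the central limit theorem says that \(\sqrt{n}\, S_n\) is approximately a standard Gaussian, so \[\mathbb{P} \left[ S_n \geq \delta \right] = \mathbb{P} \left[ \sqrt{n}\, S_n \geq \sqrt{n}\, \delta \right] \approx \frac{1}{\sqrt{2\pi }} \int _{\sqrt{n}\, \delta }^{\infty} e ^{-t^2/2} \, dt \approx e ^{-n \delta ^2/2},\] where the last step is the usual Gaussian tail estimate (up to polynomial factors). The trouble is that the central limit theorem is only guaranteed to be accurate for deviations of size \(O \left( n ^{-1/2} \right)\), and here we are using it far outside that window.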
One can do better by analysing so-called “rate functions”, which (very loosely) quantify how rare deviations of various sizes are, and one can for instance deduce the much more precise
Theorem 1. Cramér's Theorem
With the rate function \[I(x) = \begin{cases} x \tanh ^{-1}(x) - \log \left( \frac{1}{2}\left( e ^{-\tanh ^{-1}(x)} + e ^{\tanh ^{-1}(x)} \right) \right) & \left\lvert x \right\rvert \leq 1, \\ +\infty & \left\lvert x \right\rvert > 1,\end{cases} \] the random variables \(S_n\) satisfy the large deviations principle \[\mathbb{P} \left[ S_n \geq \delta \right] \sim e ^{-n I( \delta )}.\]
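Where does this formula come from? The rate function is the Legendre transform of the log-moment generating function: \[I(x) = \sup _{\lambda \in \mathbb{R}} \left( \lambda x - \log \mathbb{E} \left[ e ^{\lambda X_1} \right] \right) = \sup _{\lambda \in \mathbb{R}} \left( \lambda x - \log \cosh \lambda \right).\] Differentiating in \(\lambda \) shows the supremum is attained where \(\tanh \lambda = x\), i.e. at \(\lambda = \tanh ^{-1}(x)\) when \(\left\lvert x \right\rvert < 1\), and substituting this back in (using \(\cosh \lambda = \frac{1}{2}\left( e ^{\lambda } + e ^{-\lambda } \right)\)) gives exactly the formula above.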
The point is that, although things like the central limit theorem capture the small, typical deviations well, they don’t represent the larger deviations well at all. Somehow this is like doing a first-order Taylor expansion when the leading term is zero, and having another degree of precision would be better…
Here’s another example of a rate function computation. Let \(x \in C \left( [0, T]; \mathbb{R} ^{d} \right)\) with \(x(0) = 0\), and define \[I(x) = \frac{1}{2}\int _{0}^{T} \left\lvert \dot x(t) \right\rvert^2 \, dt\] if \(x\) is absolutely continuous, and \(I(x) = \infty\) otherwise. Then, one has
Theorem 2. Schilder's Theorem
For any \(\epsilon \in (0, 1)\), if \(W^ \epsilon (t) = \sqrt \epsilon B(t)\), then the paths \(\left\lbrace W^ \epsilon \right\rbrace _{\epsilon \in (0, 1)}\) satisfy the large deviations principle \[\mathbb{P} \left[ W^ \epsilon \in A \right] \sim \exp \left( - \frac{1}{ \epsilon } \inf _{x\in A} I(x) \right).\]
Here, \(B(t)\) is a Brownian motion. This says that the probability that \(W^ \epsilon \) tracks a given continuous path closely decays exponentially in \(\frac{1}{\epsilon}\), at a rate given by the \(L^2\) norm of the path’s derivative.
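As a sanity check (my own, not from the talk): take \(A = \left\lbrace x : x(T) \geq a \right\rbrace\) for some \(a > 0\). The path minimising \(I\) over \(A\) is the straight line \(x(t) = \frac{a t}{T}\), with \(I(x) = \frac{a^2}{2T}\), so Schilder predicts \[\mathbb{P} \left[ W^ \epsilon (T) \geq a \right] \sim \exp \left( - \frac{a^2}{2 \epsilon T} \right),\] which matches the Gaussian tail of \(W^ \epsilon (T) \sim N(0, \epsilon T)\).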
The goal is to extend these arguments and the large deviations principle to particle systems with interactions: think statistical physics, voting models, traffic models, and neural networks! The zero range particle process is the basic example underlying all of this.
To define this, let \(g : \mathbb{N}_0 \to \mathbb{N}_0\) be nondecreasing with \(g(0) = 0\) and \(g(k) > 0\) when \(k > 0\). Let \(T(k)\) be independent random clocks, exponentially distributed with rate \(g(k)\) (density \(g(k) \exp \left( -g(k)t \right)\)). Suppose one has \(N\) buckets of particles arranged in a circle (or on a torus), and each bucket gets a clock \(T(k)\), where \(k\) is the number of guys in the bucket. When the first clock rings, one particle from that bucket jumps to a neighbouring bucket chosen uniformly at random. The clocks are then reset, and this repeats: that is the zero range process. Now take a parabolically rescaled limit as \(N\to\infty\). The formalisation of this is called the hydrodynamic limit (see Ferrari, Presutti, Vares; 1998), and the limiting process solves a deterministic PDE!
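To make the dynamics concrete, here is a minimal simulation sketch of the particle system before rescaling (my own illustration, not from the talk). It uses the standard Gillespie trick: the minimum of the independent exponential clocks with rates \(g(\eta_i)\) is itself exponential with the total rate, and it belongs to bucket \(i\) with probability proportional to \(g(\eta_i)\). The choice \(g(k) = \min(k, 1)\) is just one illustrative rate function satisfying the stated assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(k):
    # Jump rates: nondecreasing, g(0) = 0, g(k) > 0 for k > 0.
    # g(k) = min(k, 1) is an illustrative choice, not the only one.
    return np.minimum(k, 1.0)

def zero_range_step(eta, t):
    """Advance the zero range process on a ring by one clock ring.

    eta : occupation numbers, one entry per bucket (modified in place).
    t   : current time.
    Returns the new time (np.inf if no particle can move).
    """
    rates = g(eta)
    total = rates.sum()
    if total == 0.0:
        return np.inf
    t += rng.exponential(1.0 / total)          # time until the first clock rings
    i = rng.choice(len(eta), p=rates / total)  # the bucket whose clock rang
    j = (i + rng.choice([-1, 1])) % len(eta)   # uniformly chosen neighbour
    eta[i] -= 1
    eta[j] += 1
    return t

# Run: N = 100 buckets, 5 particles each, up to time 50.
eta, t = np.full(100, 5.0), 0.0
while t < 50.0:
    t = zero_range_step(eta, t)
print(eta)
```

Parabolic rescaling then means speeding time up by a factor \(N^2\) while shrinking the lattice spacing to \(\frac{1}{N}\).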
Sometimes, this system will have a deviation from the expected deterministic solution, and we are curious about understanding a large deviation principle for this dude. This can be done by describing the deviant distribution as a solution to a similar deterministic PDE with forcing, as if something is influencing the particles to drift in a particular direction. This is called the “skeleton equation”, and we can describe its “energy” as a means to describe the likelihood of such a deviation.
Explicitly, the skeleton equation is \[\begin{align*} \partial_t \rho &= \frac{1}{2} \Delta \Phi (\rho ) - \nabla \cdot \left( \Phi ^{\frac{1}{2}} (\rho ) g \right) \\ &= \nabla\cdot \left( \Phi ^{\frac{1}{2}}(\rho ) \left( \nabla \Phi ^{\frac{1}{2}}(\rho ) - g \right) \right),\end{align*}\] where \(g\in \left( L _{t, x}^{2} \right)^d\) is some “forcing” and the second line uses the chain-rule identity \(\nabla \Phi (\rho) = 2 \Phi ^{\frac{1}{2}}(\rho ) \nabla \Phi ^{\frac{1}{2}}(\rho )\). For fast diffusion and porous medium equations, one actually has \(\Phi (\rho) = \rho ^{\alpha }\) for some \(\alpha > 0\) (fast diffusion for \(\alpha < 1\), porous medium for \(\alpha > 1\)).
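Concretely, with \(\Phi (\rho ) = \rho ^{\alpha }\) the skeleton equation reads \[\partial_t \rho = \frac{1}{2} \Delta \rho ^{\alpha } - \nabla \cdot \left( \rho ^{\frac{\alpha}{2}} g \right),\] i.e. a porous medium or fast diffusion equation with the forcing entering through \(\Phi^{\frac{1}{2}}\).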
Since the skeleton equation can be put in divergence form, and since we’re on the torus, we have conservation of mass for solutions \(\rho \). There is a lengthy technical computation that produces an entropy dissipation inequality: \[\max _{[0, T]} \int \rho \log \rho +\int_0^T \int \left\lvert \nabla \rho ^{\frac{\alpha }{2}} \right\rvert^2 \lesssim \int \rho _0 \log \rho _0 + \int _{0}^{T}\int \left\lvert g \right\rvert^2.\] The second term on the left is often called (a generalisation of) the “Fisher information”. Additionally, the equation satisfies an \(L^1\)-contraction principle and preserves nonnegativity.
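Here is a formal version of where that inequality comes from, in the model case \(\Phi (\rho ) = \rho ^{\alpha }\) (my sketch; the actual proof has to justify all of these manipulations). Testing the equation against \(1 + \log \rho \) and integrating by parts, \[\frac{d}{dt} \int \rho \log \rho = -\int \nabla \log \rho \cdot \rho ^{\frac{\alpha}{2}} \left( \nabla \rho ^{\frac{\alpha}{2}} - g \right) = -\frac{2}{\alpha }\int \left\lvert \nabla \rho ^{\frac{\alpha }{2}} \right\rvert^2 + \frac{2}{\alpha }\int \nabla \rho ^{\frac{\alpha }{2}}\cdot g,\] using the identity \(\rho ^{\frac{\alpha}{2}} \nabla \log \rho = \frac{2}{\alpha } \nabla \rho ^{\frac{\alpha }{2}}\). Young’s inequality \(ab \leq \frac{1}{2}a^2 + \frac{1}{2}b^2\) then absorbs the cross term into the dissipation, leaving \(\frac{1}{\alpha }\int \left\lvert g \right\rvert^2\), and integrating in time gives the stated bound up to constants.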
The goal is to prove uniqueness of solutions to the skeleton equation, it seems. A naïve approach is to simply differentiate in time the \(L^1\) norm of the difference of two solutions with the same initial data, as sketched below. However, when doing so, there is an issue with differentiability of the absolute value at \(0\). Trying to smoothly approximate the absolute value function doesn’t work either, since this completely nukes the sign information near \(0\).
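Schematically (as I understood it), for two solutions \(\rho ^1, \rho ^2\) with the same initial data one would like to compute \[\frac{d}{dt} \int \left\lvert \rho ^1 - \rho ^2 \right\rvert = \int \operatorname{sgn} \left( \rho ^1 - \rho ^2 \right) \partial_t \left( \rho ^1 - \rho ^2 \right)\] and show the right-hand side is \(\leq 0\) after integrating by parts; but this chain rule requires differentiating \(\left\lvert \cdot \right\rvert\) precisely on the set where \(\rho ^1 = \rho ^2\), where it isn’t differentiable.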
Because of this, a “kinetic formulation” of the skeleton equation is developed. One first performs a small elliptic regularisation, then introduces a dummy variable that “cuts out” the values of solutions near \(0\). Then the naïve solution and computations do go through, and taking limits recovers uniqueness (among other things). (To be honest I got off the bus a bit and didn’t understand everything…)
There’s a lot that goes over my head, but there’s a very nice general fact:
Theorem 3. (DiPerna-Lions; 1989)
Suppose \(b \in \left( L _{t}^{1} BV_x \right)^d\) with \(\nabla \cdot b \in L _{t}^{1} L _{x}^{\infty}\). Then, for any initial data \(\rho _0 \in L^\infty \left( \mathbb{T}^d \right)\), the continuity equation \[\partial_t \rho = \nabla \cdot (\rho b)\] has a unique weak solution in \(\left( L^1\cap L^\infty \right) \left( \mathbb{T}^d\times [0, T] \right)\).
Ambrosio (2004) relaxed DiPerna and Lions’ original Sobolev assumptions to the \(BV\) assumptions above. Additionally, Depauw (2003) gave a counterexample showing that the \(BV\) assumption on \(b\) is essentially sharp. The main tool here is a commutator estimate, but I’m not sure how exactly it plays into the argument.
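For the record (this is the classical statement, in the original Sobolev setting of DiPerna and Lions; it wasn’t covered in detail in the talk): if \(\eta _{\epsilon }\) is a standard mollifier, the commutator \[r _{\epsilon } := b\cdot \nabla \left( \rho * \eta _{\epsilon } \right) - \left( b \cdot \nabla \rho \right) * \eta _{\epsilon }\] converges to \(0\) in \(L ^{1} _{loc}\) when \(b \in W ^{1,1} _{loc}\) and \(\rho \in L ^{\infty} _{loc}\). This justifies applying the chain rule to the mollified solution \(\rho * \eta _{\epsilon }\) and passing to the limit (the “renormalisation” step), which in turn yields uniqueness.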