In machine learning, it is often convenient to work with functions that are not fixed but known only up to some probability. For example, suppose we know the weather in Cambridge for the last few years and would like to predict tomorrow’s weather (e.g. the temperature or wind speed, as a function of time $f(t)$). Or we know today’s weather in Cambridge and London, and would like to know what is happening along the M11 motorway between the two cities. Instead of a single predicted value, we would like a probability distribution at each time (or point in space). Since time is continuous, we are looking for a **distribution over functions $f(t)$**.

Gaussian processes (GPs) are a simple and powerful example of a distribution over functions. They are widely used for modeling and prediction (Rasmussen and Williams 2006); a few examples of GPs are shown in Figure 1. Perhaps the most important feature of a GP is its **correlation function** $k(t,t')$, which determines how much information we gain about $f(t')$ by measuring the true value of $f(t)$. The shape of this correlation function is the most important choice to make when using a GP, because it determines what kinds of functions $f(t)$ we are able to represent. The paper by Rutten et al. points out that one limitation of current GPs is that all correlation functions are time-reversible, and **it provides for the first time a model of non-reversible GPs**.
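To make the role of the correlation function concrete, here is a minimal sketch that draws a few functions from a GP. It assumes a squared-exponential correlation function, which is a common textbook choice and not necessarily the one used in Figure 1; the names `squared_exponential`, `sample_gp_prior`, `lengthscale` and `variance` are illustrative.

```python
import numpy as np

def squared_exponential(t1, t2, lengthscale=1.0, variance=1.0):
    """Squared-exponential correlation function k(t, t') (an assumed choice)."""
    diff = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def sample_gp_prior(t, n_samples=3, lengthscale=1.0, seed=0):
    """Draw functions f(t) from a zero-mean GP evaluated on the grid t."""
    K = squared_exponential(t, t, lengthscale)
    # An eigendecomposition is used instead of a Cholesky factor because K is
    # often numerically singular for smooth kernels on a dense grid.
    eigvals, eigvecs = np.linalg.eigh(K)
    root = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))  # root @ root.T == K
    rng = np.random.default_rng(seed)
    return (root @ rng.standard_normal((len(t), n_samples))).T

t = np.linspace(0.0, 10.0, 200)
samples = sample_gp_prior(t, n_samples=3, lengthscale=1.0)
print(samples.shape)  # (3, 200): three sampled functions on 200 time points
```

Changing `lengthscale` changes how quickly the correlation decays with $|t-t'|$, and therefore how wiggly the sampled functions look.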

What does it mean for a process to be non-reversible? It means that if we play the function $f(t)$ in reverse, namely $f(-t)$, we see something that doesn’t look right. For example, we know that hurricanes spin counter-clockwise in the Northern Hemisphere (see Figure 2). If we play the movie of a hurricane in reverse, we see it spinning clockwise and immediately think that’s impossible (or that we are in the Southern Hemisphere). More formally, **a process is time-reversible if the function $f(-t)$ has the same probability as $f(t)$ under the GP**. Conversely, **a process is non-reversible if the probability of $f(-t)$ is different from the probability of $f(t)$**. For an extremely non-reversible GP, a function $f(t)$ that has non-zero probability corresponds to an $f(-t)$ that has zero probability.
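For readers who prefer a formula, this definition can be restated in terms of correlations. The statement below assumes a zero-mean, stationary GP with several outputs collected in a vector $f(t)$; the matrix notation $K(\tau)$ is introduced here for illustration and is not taken from the paper. Define

$$K(\tau) = \mathbb{E}\left[f(t+\tau)\,f(t)^\top\right], \qquad \text{which always satisfies} \qquad K(-\tau) = K(\tau)^\top .$$

The reversed process $f(-t)$ has covariance $K(-\tau)$, and a zero-mean GP is fully determined by its covariance, so the process is time-reversible exactly when

$$K(\tau) = K(-\tau) = K(\tau)^\top \quad \text{for all } \tau .$$

For a single output, $K(\tau)$ is a scalar and the condition holds automatically. Non-reversibility can therefore only show up in the cross-correlations between different outputs, when $K(\tau)$ is not a symmetric matrix.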

The paper by Rutten et al. provides the first non-reversible GP by defining a new correlation function. This correlation function depends on a parameter $a$, the **non-reversibility parameter**, which is equal to zero when the process is reversible and reaches $\pm 1$ when the process is maximally non-reversible. The proposed model can therefore interpolate between reversible and non-reversible models. Figure 3 shows an interactive example of 2-dimensional trajectories sampled from the non-reversible GP, where you can control the parameter and see how it affects the shape of the trajectory. For $a=0$, trajectories don’t have any tendency to rotate. However, when $a=1$ (or $a=-1$), trajectories tend to rotate clockwise (respectively, counter-clockwise). The proposed GP is not limited to 2-dimensional trajectories; it can model functions of any higher dimension as well (see the paper for details).
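The effect of the non-reversibility parameter can be reproduced with a toy model. The sketch below does not use the kernel from the paper: it assumes a simple stand-in, a Gaussian envelope multiplied by a cosine (within each coordinate) and by $a$ times a sine (across coordinates), which is a valid covariance for $|a| \le 1$. The names `planar_kernel`, `lengthscale` and `omega` are illustrative.

```python
import numpy as np

def planar_kernel(t, a, lengthscale=1.0, omega=2.0):
    """Covariance of the stacked vector (x(t_1..n), y(t_1..n)).

    Toy stand-in kernel, NOT the one from the paper.
    """
    tau = t[:, None] - t[None, :]
    envelope = np.exp(-0.5 * (tau / lengthscale) ** 2)
    even = envelope * np.cos(omega * tau)  # symmetric in tau: reversible part
    odd = envelope * np.sin(omega * tau)   # antisymmetric in tau: non-reversible part
    return np.block([[even, a * odd], [-a * odd, even]])

def sample_trajectory(t, a, seed=0):
    """Draw one 2-dimensional trajectory (x(t), y(t))."""
    K = planar_kernel(t, a)
    eigvals, eigvecs = np.linalg.eigh(K)
    root = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    z = root @ np.random.default_rng(seed).standard_normal(2 * len(t))
    return z[:len(t)], z[len(t):]

def expected_swept_area(t, a):
    """E[x_0 * y_1 - y_0 * x_1]: negative means clockwise on average."""
    n = len(t)
    K = planar_kernel(t, a)
    return K[0, n + 1] - K[n, 1]

t = np.linspace(0.0, 10.0, 100)
x, y = sample_trajectory(t, a=1.0)
for a in (-1.0, 0.0, 1.0):
    print(a, expected_swept_area(t, a))
```

With this sign convention, the expected swept area is negative for $a>0$ (clockwise on average), zero for $a=0$, and positive for $a<0$. In this toy kernel, values of $|a|$ above one no longer give a positive semi-definite covariance, which is one way to see why a bound on $a$ arises.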

Fitting the model to data yields an estimate of the non-reversibility parameter, which quantifies **how non-reversible the data is**. Furthermore, there is an **upper limit on how non-reversible a GP can be**, since the non-reversibility parameter is bounded between the values $a=\pm 1$. These bounds make the estimation of this parameter quite robust and informative. An example application of the GP model to real data is shown in Figure 4, where the authors uncovered hidden dynamical structure in the activity of motor neurons.
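As a rough illustration of what estimating a bounded non-reversibility parameter can look like, the sketch below fits $a$ by maximum likelihood over a grid on $[-1, 1]$, using synthetic data. Everything here is an assumption made for illustration: the toy kernel (a Gaussian envelope times cosine and sine terms, not the kernel from the paper), the known `noise_var`, the grid search, and the function names. It is not the fitting procedure used by the authors.

```python
import numpy as np

def planar_kernel(t, a, lengthscale=1.0, omega=2.0):
    """Toy 2-output kernel (NOT the paper's); valid for |a| <= 1."""
    tau = t[:, None] - t[None, :]
    envelope = np.exp(-0.5 * (tau / lengthscale) ** 2)
    even = envelope * np.cos(omega * tau)
    odd = envelope * np.sin(omega * tau)
    return np.block([[even, a * odd], [-a * odd, even]])

def log_likelihood(z, t, a, noise_var=0.01):
    """Gaussian log-likelihood of the stacked observations z = (x, y)."""
    C = planar_kernel(t, a) + noise_var * np.eye(2 * len(t))
    L = np.linalg.cholesky(C)
    w = np.linalg.solve(L, z)  # whitened data, so w @ w = z^T C^{-1} z
    return (-0.5 * w @ w
            - np.log(np.diag(L)).sum()  # equals -0.5 * log det C
            - 0.5 * len(z) * np.log(2 * np.pi))

def estimate_a(z, t, grid=None):
    """Grid-search maximum-likelihood estimate of a, restricted to [-1, 1]."""
    grid = np.linspace(-1.0, 1.0, 41) if grid is None else grid
    scores = [log_likelihood(z, t, a) for a in grid]
    return float(grid[int(np.argmax(scores))])

# Synthetic data drawn from the toy model with a known parameter.
t = np.linspace(0.0, 60.0, 240)
a_true = 0.9
rng = np.random.default_rng(0)
C_true = planar_kernel(t, a_true) + 0.01 * np.eye(2 * len(t))
z = np.linalg.cholesky(C_true) @ rng.standard_normal(2 * len(t))

a_hat = estimate_a(z, t)
print(a_hat)  # estimate of a; the data were generated with a_true = 0.9
```

Because the search is confined to $[-1, 1]$, the estimate cannot run off to arbitrarily large values, which is the sense in which the bounds help keep it well behaved.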