The SI difference equation

Two of the simplest models in epidemiological analysis are the SI and SIS models. In each case, S stands for susceptible and I for infected, and the differences between them will be apparent shortly. Both belong to a class of models called compartmental models, where each individual in the model "moves" from one compartment to another. Think of it like a set of bins for filing paperwork for your personal tasks, with categories to do, in progress, and complete. If you were processing tasks at a certain defined rate, you could call a model that describes your duties a TIC model, covering each of those statuses. Here, the compartments describe each person's infection status, and so in our simplest models, people are either susceptible or infected, hence the abbreviations. Other compartments can be added, such as exposed (but not yet infectious) individuals and recovered individuals, with letters E and R, and many more are possible for building more complex models (e.g. Q - quarantined).

Let's, for now, start with the more basic model of the two - the SI model. And let's consider what's called the discrete case first (as opposed to the continuous case). Difference equations are so called because they update the system at defined time steps, using a difference term between those updates. As you'll see, they aren't differential equations, in that there aren't any derivatives, but there are deep connections between the two types of equations (and, in a real sense, difference equations are how a computer approximates differential equations).

So what would the simplest case SI model look like, as a difference equation? Suppose we have a contained population of a fixed size, $N$, and everyone is either susceptible or infected. That gives us $N = S + I$ as a generally useful sum. Suppose further that once infected, there is no recovery, and that all infected persons are equally infectious upon encountering a susceptible individual. Let's assume the population is well-mixed - individuals intermingle without regard to infection status. Lastly, let's assume that the infection process takes one day, so we can build the equations around a day-to-day update. These are all simplifying assumptions that can make the system of equations much easier to analyze.

Putting these together, we might have something like the following:

$$\begin{aligned} S_{today} &= S_{yesterday} - \beta S_{yesterday} I_{yesterday} \\ I_{today} &= I_{yesterday} + \beta S_{yesterday} I_{yesterday} \end{aligned}$$

This can be a little unwieldy, so let's let $n$ designate the current day (or timestep, more generally) and $n-1$ the previous day. This tidies up the difference equations a little bit into:

$$\begin{aligned} S_{n} &= S_{n-1} - \beta S_{n-1} I_{n-1} \\ I_{n} &= I_{n-1} + \beta S_{n-1} I_{n-1} \end{aligned}$$

This has a few features worth noting. The number of susceptible and infected persons depends entirely on the numbers from the previous day - as stated earlier, we're assuming infection takes one day. We have an infection term, $\beta S_{n-1} I_{n-1}$, scaled by the parameter $\beta$, which can, for now, be thought of as the probability of a susceptible person becoming infected upon encountering an infectious person. We see that this term moves people from $S$ to $I$, as it is subtracted from $S$ and added to $I$.
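As a minimal sketch, one day of the update above can be written as a small Python function. The name `si_step`, its argument names, and the example numbers are illustrative choices, not part of the model itself.

```python
def si_step(s_prev, i_prev, beta):
    """Advance the SI difference equations by one timestep.

    s_prev, i_prev: susceptible and infected values on day n-1.
    beta: the parameter scaling the infection term.
    Returns (s_n, i_n) for day n.
    """
    new_infections = beta * s_prev * i_prev  # the shared difference term
    s_n = s_prev - new_infections            # people leave S ...
    i_n = i_prev + new_infections            # ... and the same amount enters I
    return s_n, i_n


# Made-up example: 990 susceptible, 10 infected, beta = 0.0005.
s1, i1 = si_step(990, 10, 0.0005)
print(s1, i1)
print(s1 + i1)  # the total stays at N = 1000, up to floating-point error
```

Because the same quantity is subtracted from one compartment and added to the other, the sum $S_n + I_n$ stays at $N$ on every step.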

This set of equations needs one more thing: an initial condition. We'll call the initial values for the classes $S_0$ and $I_0$ for day 0. The easiest case to analyze would be something like $S_0 = 1000$ and $I_0 = 0$. What would the system do in that case? Well, let's look at the equations and see what happens in one day. If we have $S_0 = 1000$ and $I_0 = 0$, then $S_1 = S_0 - \beta S_0 I_0$. But regardless of the value of $\beta$, the product $S_0 I_0$ is 0 since $I_0 = 0$. So $S_1 = S_0$. Similarly, $I_1 = I_0$. You can repeat this argument for any number of days. Therefore, the model does something we intuitively expect - if you don't have any infectious individuals in the population at the start, then there will not be any at any point in the future. This property of not changing from one timestep to the next makes it a special solution called a steady state, and in this case we call it the disease-free steady state or the disease-free equilibrium. What if we have $S_0 = 0$ and $I_0 = 1000$? Well, you basically get the same thing: the product $S_0 I_0$ is again 0, so nothing changes. This usually isn't given a special name, but if you like, you could call it the "Oops, all infected" equilibrium. For reasons you might encounter later, this one isn't as generally applicable, because it arises here only because we have a closed population.
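Both steady states can be checked numerically. The sketch below is only an illustration: the helper names `si_step` and `run`, the value $\beta = 0.3$, and the 30-day horizon are arbitrary assumptions, and any other $\beta$ gives the same result.

```python
def si_step(s_prev, i_prev, beta):
    """One timestep of the SI difference equations."""
    new_infections = beta * s_prev * i_prev
    return s_prev - new_infections, i_prev + new_infections


def run(s0, i0, beta, days):
    """Return the list of (S_n, I_n) pairs for n = 0..days."""
    history = [(s0, i0)]
    for _ in range(days):
        s_prev, i_prev = history[-1]
        history.append(si_step(s_prev, i_prev, beta))
    return history


# beta = 0.3 is arbitrary; the product S * I is zero in both cases.
disease_free = run(1000, 0, beta=0.3, days=30)
all_infected = run(0, 1000, beta=0.3, days=30)

print(all(state == (1000, 0) for state in disease_free))  # True
print(all(state == (0, 1000) for state in all_infected))  # True
```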

Now for a formal matter - the model as we've described it so far is for the number of people in each class, but we can also use it to describe proportions. That is, in the previous description, we set $S_0 = 1000$ and $I_0 = 0$. We could instead divide everything by $N$ (remember, $N$ is the total population size) and get the proportion of individuals in each compartment. This works out better in some cases - we aren't guaranteed that the product $\beta S I$ will always be an integer, after all, and then we'd have to make a decision about how to round values. With proportions, we don't have to do any such thing to interpret the output of the model. (Bonus question: How does the old value of $\beta$ relate to the new value of $\beta$ when we normalize the model by population size?)

We already discussed the model behavior where $S_0 = N$, also known as the disease-free steady state, but what happens in a slightly more interesting case? Let's suppose 0.1% of the population is infectious at time 0. For simplicity, let's just say $\beta = 1$. What does the model do in that case? We can tabulate the first several values of the model and see. Note that we're using $n$ as the index for the days in the left column, followed by the values of $S$ and $I$ for that respective day. We'll use decimal representations instead of percents as well, so $S_0 = 0.999$ and $I_0 = 0.001$. The first day's update comes from the equation. Since $\beta = 1$, we have $S_1 = S_0 - S_0 I_0$, or $S_1 = 0.999 - (0.999)(0.001)$. This gives us $S_1 = 0.998001$. We could similarly calculate $I_1$ using the equation, or we could use the fact that $S_n + I_n = 1$ for any timestep, since the values are proportions. You can repeat this process as desired. Below, the model is run out to $n=13$, rounding to the nearest thousandth at each step for simplicity.

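| $n$ | $S_n$ | $I_n$ |
|-----|-------|-------|
| 0 | 0.999 | 0.001 |
| 1 | 0.998 | 0.002 |
| 2 | 0.996 | 0.004 |
| 3 | 0.992 | 0.008 |
| 4 | 0.984 | 0.016 |
| 5 | 0.968 | 0.032 |
| 6 | 0.937 | 0.063 |
| 7 | 0.878 | 0.122 |
| 8 | 0.771 | 0.229 |
| 9 | 0.594 | 0.406 |
| 10 | 0.353 | 0.647 |
| 11 | 0.125 | 0.875 |
| 12 | 0.016 | 0.984 |
| 13 | 0.000 | 1.000 |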
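The table above can be reproduced with a short loop. This is a sketch under one assumption worth flagging: both compartments are rounded to three decimal places after every step and the rounded values are fed into the next step, which is the procedure that matches the tabulated numbers. The function name `rounded_si_table` and its parameters are illustrative.

```python
def rounded_si_table(s0=0.999, i0=0.001, beta=1.0, days=13, places=3):
    """Run the SI update, rounding S and I after every step.

    Returns a list of (n, S_n, I_n) rows for n = 0..days.
    """
    rows = [(0, s0, i0)]
    s, i = s0, i0
    for n in range(1, days + 1):
        new_infections = beta * s * i          # uses the previous day's values
        s = round(s - new_infections, places)  # rounded value carries forward
        i = round(i + new_infections, places)
        rows.append((n, s, i))
    return rows


for n, s, i in rounded_si_table():
    print(f"{n:>2}  {s:.3f}  {i:.3f}")
```

Rounding only the displayed values, while carrying full precision between steps, would give slightly different entries in the later rows.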

We see that, in this model, we essentially exhaust all of the susceptible compartment. If you're particularly astute, you might notice that we only actually got to $S_{13} = 0$ and $I_{13} = 1$ because of the way I was rounding the output. It's possible to show that, with infinite accuracy in the model, you never reach this state. Look closely at how the timesteps update: since $I_n = 1 - S_n$, the product $S_n I_n$ is always less than $S_n$ itself whenever $S_n > 0$, so each update subtracts less than what remains in $S$.
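One way to see this without rounding is to run the same update in exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction` for that purpose; the function name `exact_si` is an illustrative choice, and $\beta = 1$ with $S_0 = 0.999$ are the values from the example above.

```python
from fractions import Fraction


def exact_si(s0, days):
    """Run the beta = 1 SI update exactly and return (S_days, I_days)."""
    s = s0
    i = 1 - s0
    for _ in range(days):
        # The right-hand side is evaluated first, so both updates
        # use the previous day's s and i.
        s, i = s - s * i, i + s * i
    return s, i


s13, i13 = exact_si(Fraction(999, 1000), 13)

print(s13 > 0)     # True: S has not reached zero
print(float(s13))  # small, but strictly positive
print(s13 == Fraction(999, 1000) ** (2 ** 13))  # True
```

The last line reflects a pattern specific to $\beta = 1$: substituting $I_n = 1 - S_n$ into the update gives $S_{n+1} = S_n^2$, so $S_n = S_0^{2^n}$, which is positive for every $n$ as long as $S_0 > 0$.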

A natural question now is, "Does this always happen as long as $I_0 > 0$ and $\beta > 0$?" And the answer, in this case, is yes, at least when $\beta \le 1$ (consistent with reading $\beta$ as a probability), which keeps the proportions between 0 and 1. It takes a little work to generate a formal proof - and for the most part, formal proofs about difference equations are actually harder to obtain than proofs about their cousins, differential equations - but you can start by showing that the sequence of values for $S_n$ is strictly decreasing as $n$ increases. That's not a mathematical guarantee on its own (after all, some infinite series have finite sums), but it's a necessary condition. From there, it requires showing that the difference term, $\beta S_n I_n$, is generally not too small. This would come from the fact that $I_n = 1 - S_n$, so when $I_n$ is smaller, $S_n$ is larger, and vice versa. The formal proof is beyond the scope of this explainer at this time, but these are the types of observations that are useful for intuiting the behavior of the system.
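A numerical experiment is not a proof, but it can build some confidence in the claim. The sketch below counts how many steps it takes for $S_n$ to fall below a small threshold for a handful of starting points. The function name `steps_until_below`, the threshold, the step cap, and the grid of $\beta$ and $I_0$ values are all arbitrary assumptions chosen for illustration.

```python
def steps_until_below(beta, i0, threshold=1e-6, max_steps=10_000):
    """Return the first n with S_n < threshold, or None if never reached.

    Works with proportions, so S_0 = 1 - i0.
    """
    s, i = 1.0 - i0, i0
    for n in range(max_steps + 1):
        if s < threshold:
            return n
        new_infections = beta * s * i
        s, i = s - new_infections, i + new_infections
    return None


for beta in (0.1, 0.5, 1.0):
    for i0 in (1e-6, 1e-3, 0.1):
        n = steps_until_below(beta, i0)
        print(f"beta={beta}, I_0={i0}: S drops below 1e-6 at n={n}")
```

Smaller values of $\beta$ or $I_0$ delay the collapse of the susceptible compartment, but in each of these runs it still happens.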

Under construction