Investigate diffusion through simulation
In this tutorial, we’re going to look at several diffusion processes
and how they operate across various networks. Let’s start by creating
and visualising a few networks that we might be interested in. We’ll
take the ison_networkers dataset from {migraph}, and create or generate
ring, lattice, random, scale-free, and small-world versions with the
same number of nodes.
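# Packages assumed by the code in this tutorial: {migraph} for the network
# functions, {ggplot2} for ggtitle(), and {patchwork} for combining plots
library(migraph)
library(ggplot2)
library(patchwork)
nw <- to_undirected(to_unnamed(ison_networkers))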
rg <- create_ring(nw, width = 2)
la <- create_lattice(nw)
rd <- generate_random(nw, with_attr = FALSE)
sf <- generate_scalefree(nw, 0.025)
sw <- generate_smallworld(nw, 0.025)
autographr(nw) + ggtitle("Networkers") +
autographr(rg) + ggtitle("Ring") +
autographr(la) + ggtitle("Lattice") +
autographr(rd) + ggtitle("Random") +
autographr(sf) + ggtitle("Scale-Free") +
autographr(sw) + ggtitle("Small-World")
Examining diffusion across networks of different structure
Now let’s start by examining a pretty straightforward structure,
that of the ring network. To run a basic diffusion model across this
network, simply pass it to play_diffusion(), then save and plot the
result.
rg1 <- play_diffusion(rg, seeds = 1)
plot(rg1)
The result object, when printed, lists how many of the nodes in the
network, n, are ‘infected’ (I) or not (S) at each step t. The plot
visualises this, with the proportion of S in blue and I in red.
The bar plot behind shows how many nodes are newly ‘infected’ at each
time point.
We can see that there is a pretty constant diffusion across this network, with 2-3 nodes being newly infected at each time-point. The whole network is infected by the eighth time-point.
Varying seed nodes
Since the ring network we constructed is cyclical, it should not matter where the ‘infection’ starts: it should diffuse throughout the whole network in the same way. To see whether this is true, try seeding the sixteenth (middle) node instead and see whether the result is any different.
rg2 <- play_diffusion(rg, seeds = 16)
plot(rg2)
Now what if we seed the network with more than one infected node? Choosing the first four nodes we can see that the process is jump-started, but doesn’t really conclude that much faster.
plot(play_diffusion(rg, seeds = 1:4))
But what if we seed the network at several different places? Here we
can use node_is_random() to randomly select some nodes to seed. Try it
with four randomly selected nodes and see what you get.
plot(play_diffusion(rg, seeds = node_is_random(rg, 4)))
Where the innovation/disease is optimally seeded to accelerate or decelerate diffusions is a crucial question in network intervention studies.
Varying networks
Now let’s see whether the location of the seed matters when the network has a different structure. Play and plot two diffusions on the lattice network, one seeded at the first node and one seeded at the sixteenth.
plot(play_diffusion(la, seeds = 1))/
plot(play_diffusion(la, seeds = 16))
Let’s try one more network type, this time the scale-free network. Play and plot the results over ten steps with four different seedings: node 10, a randomly selected node, the node(s) with maximum degree, and the node(s) with minimum degree.
plot(play_diffusion(sf, seeds = 10, steps = 10)) /
plot(play_diffusion(sf, seeds = node_is_random(sf), steps = 10)) /
plot(play_diffusion(sf, seeds = node_is_max(node_degree(sf)), steps = 10)) /
plot(play_diffusion(sf, seeds = node_is_min(node_degree(sf)), steps = 10))
Why is choosing the minimum degree nodes so much faster?
Varying thresholds
So far, we’ve been using a simple diffusion model where each node needs only to be in contact with one infectious individual to be infected. But what if nodes have higher thresholds, or thresholds that vary from node to node?
Let’s start with our ring network again. Show that whereas a threshold of one will result in complete infection, a threshold of two will not lead to any diffusion unless there are two seeds and both are in the neighbourhood of some other node.
plot(play_diffusion(rg, seeds = 1, thresholds = 1))/
plot(play_diffusion(rg, seeds = 1, thresholds = 2))/
plot(play_diffusion(rg, seeds = 1:2, thresholds = 2))/
plot(play_diffusion(rg, seeds = c(1,16), thresholds = 2))
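The four cases above can also be checked by hand. The following is a minimal base-R sketch, not the way play_diffusion() is implemented. It assumes a 32-node ring in which every node is tied to the two nodes on either side of it (one reading of width = 2), a deterministic update, and a threshold expressed as a count of infected contacts. The helper names ring_neighbours() and count_infected() are hypothetical.

```r
# Neighbours of node i on a ring of n nodes, with ties to nodes up to
# `width` positions away on either side (an assumed reading of width = 2)
ring_neighbours <- function(i, n, width = 2) {
  offsets <- c(-(width:1), 1:width)
  ((i - 1 + offsets) %% n) + 1
}

# Run a deterministic threshold diffusion until nothing changes and
# return the final number of infected nodes
count_infected <- function(seeds, threshold, n = 32, width = 2) {
  infected <- rep(FALSE, n)
  infected[seeds] <- TRUE
  repeat {
    # Number of currently infected neighbours for every node
    exposure <- vapply(seq_len(n), function(i) {
      sum(infected[ring_neighbours(i, n, width)])
    }, integer(1))
    newly <- !infected & exposure >= threshold
    if (!any(newly)) break
    infected[newly] <- TRUE
  }
  sum(infected)
}

count_infected(seeds = 1, threshold = 1)         # one seed, threshold 1
count_infected(seeds = 1, threshold = 2)         # one seed, threshold 2
count_infected(seeds = 1:2, threshold = 2)       # two adjacent seeds
count_infected(seeds = c(1, 16), threshold = 2)  # two distant seeds
```

Under these assumptions only the first and third calls reach the whole ring: with a threshold of two, a susceptible node needs two infected neighbours, which requires the seeds to share a neighbour.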
In our ring network, all nodes have the same degree. But many typical social networks include some variation in degree. A threshold of 2 would be easy to surpass for particularly well connected nodes, but impossible for pendants. Let’s see what happens when we use this threshold on a scale-free network.
plot(play_diffusion(sf, seeds = 1, thresholds = 2))
That’s because there’s variation in degree in a scale-free network. Let’s try again, but this time we’re going to specify the threshold as the proportion of a node’s contacts that must be infected before the node itself becomes infected. Try thresholds of 0.1, 0.25, and 0.5 with two seeds and 10 steps.
plot(play_diffusion(sf, seeds = 1:2, thresholds = 0.1, steps = 10))/
plot(play_diffusion(sf, seeds = 1:2, thresholds = 0.25, steps = 10))/
plot(play_diffusion(sf, seeds = 1:2, thresholds = 0.5, steps = 10))
What’s happening here is that the high degree nodes in this scale-free network are obstructing the diffusion process because it is unlikely that many of their branches are already infected.
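To see why hubs are the bottleneck, it can help to translate a proportional threshold into the number of infected contacts a node would need. The sketch below is base R with made-up degrees, and it assumes the proportion is converted to a count by rounding up; whether play_diffusion() rounds in exactly this way is an assumption.

```r
# Hypothetical degrees, ranging from a pendant to a hub
degree <- c(1, 2, 4, 10, 20)
thresholds <- c(0.1, 0.25, 0.5)

# Infected contacts needed: proportion times degree, rounded up (assumed rule)
needed <- outer(degree, thresholds, function(d, p) ceiling(d * p))
dimnames(needed) <- list(paste0("degree_", degree),
                         paste0("threshold_", thresholds))
needed
```

Reading across a row shows how the requirement grows with the threshold, and reading down a column shows how much more a hub needs than a pendant at the same proportional threshold.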
Lastly, note that it may be that thresholds vary across the network. You could make this depend on some nodal attribute, or just assign some random variation. Try two diffusions, one where the threshold is 0.1 for the first 10 and 0.25 for the latter group of 22 nodes, and another diffusion where the threshold levels are reversed.
plot(play_diffusion(sf, thresholds = c(rep(0.1,10),rep(0.25,22))))/
plot(play_diffusion(sf, thresholds = c(rep(0.25,10),rep(0.1,22))))
Since the first ten nodes are the first to join the scale-free network and are preferentially attached by those who follow, they will have a higher degree and only with a lower threshold will we see complete infection.
Investigate epidemiological models
So far we’ve been looking at variations on a pretty straight-forward diffusion process where nodes can only belong to one of two states or ‘compartments’, Susceptible and Infected (the basic SI model). This has been useful, but sometimes what we are interested in, whether disease, innovation, or some other behaviour, has more complicated and probabilistic dynamics. But before we get into that, let’s see how we can play and plot several simulations to see what the range of outcomes might be like.
Running multiple simulations
To do this, we need to use play_diffusions() (note the plural). It has
all the same arguments as its singular counterpart, along with a couple
of additional parameters: how many simulations it should run, e.g.
times = 50; whether it should use strategy = "multisession" to run the
simulations across multiple cores instead of the default
strategy = "sequential"; and verbose = TRUE if it should inform you of
computational progress. Try this out with our well-mixed random
network, 10 steps, 5 times, and with the transmissibility parameter set
to 0.5 to indicate that contagion is successful in only half of cases.
plot(play_diffusions(rd, transmissibility = 0.5, times = 5, steps = 10))
Note that in this plot the number of new infections is not plotted, and the loess line smooths over the varying trajectories. The blue line is the proportion of nodes in the Susceptible compartment, and the red line is the proportion of nodes in the Infected compartment.
SIR models
Let’s start with an SIR model in which, after some period during which
an infected node is infectious, the node recovers and can no longer
infect others or become reinfected. To add a recovered compartment to
the model, specify the recovery argument. Let’s try a rate of recovery
of 0.20, which means that it’ll take an infected node on average 5
steps (days?) to recover.
plot(play_diffusions(rd, recovery = 0.2))
What we see in these kinds of models is typically a spike in infections towards the start, but as these early infections recover and become immune, they can provide some herd immunity to those who remain susceptible.
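The ‘5 steps on average’ figure for a recovery rate of 0.20 can be checked with a little arithmetic. The base-R sketch below assumes that recovery is an independent chance of 0.2 at every step, so that the time to recovery is geometrically distributed; how play_diffusion() applies the rate internally is not shown here.

```r
set.seed(123)
recovery <- 0.2

# Expected number of steps until recovery under the assumption above
1 / recovery

# rgeom() counts the failed trials before the first success, so add 1 to
# include the step on which recovery actually happens
steps_infected <- rgeom(100000, prob = recovery) + 1
mean(steps_infected)
```

The same reasoning applies to any per-step rate: a rate of 0.25 corresponds to about four steps on average.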
SIRS models
That’s great, but maybe the immunity conferred from having recovered from the contagion doesn’t last forever. In this kind of model, add an additional waning parameter of 0.05. Play a single diffusion so that you can see what’s going on in a particular run.
plot(play_diffusion(rd, recovery = 0.25, waning = 0.05))
SEIR models
Lastly, we’ll consider a compartment for nodes that have been Exposed
but are not yet infectious. This kind of incubation period is due to
some latency. Again, this should be specified as a proportion (try
0.25, approximately four days). Play a single diffusion so that you can
see what’s going on in a particular run.
plot(play_diffusion(rd, latency = 0.25, recovery = 0.25))
Investigate learning through simulation
Lastly, we’re going to consider a different kind of model: a DeGroot learning model. As you will recall, a network that is strongly connected and aperiodic will converge to a consensus of (any) beliefs entered.
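As a reminder of what the model does, here is a minimal base-R sketch of the DeGroot update, in which each node repeatedly replaces its belief with the average belief of the nodes it listens to. The four-node network and the starting beliefs are made up for illustration, and this is not the play_learning() implementation.

```r
# A small made-up directed network: the cycle 1 -> 2 -> 3 -> 4 -> 1, plus a
# tie 3 -> 1 and a self-loop on node 1. The cycle makes it strongly
# connected and the self-loop makes it aperiodic.
adj <- matrix(c(1, 1, 0, 0,
                0, 0, 1, 0,
                1, 0, 0, 1,
                1, 0, 0, 0), nrow = 4, byrow = TRUE)

# Row-normalise so that each row holds one node's weights over its contacts
trust <- adj / rowSums(adj)

# Only node 1 believes at the start
beliefs <- c(1, 0, 0, 0)
for (step in 1:200) {
  beliefs <- as.vector(trust %*% beliefs)
}

beliefs               # every node ends up with (nearly) the same belief
diff(range(beliefs))  # spread of beliefs across nodes, close to zero
```

Removing the self-loop and the 3 -> 1 tie leaves a pure four-cycle, which is periodic; in that case the beliefs simply rotate around the cycle and never settle.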
Expectations of convergence and consensus
Let’s try this out on the ison_networkers dataset included in the
package. First of all, check whether the network is connected and
aperiodic.
# By default is_connected() will check whether a directed network
# is strongly connected.
is_connected(ison_networkers)
is_aperiodic(ison_networkers)
Playing the DeGroot learning model
Now let’s see whether you are right. We want to see whether some random distribution of beliefs converges to a consensus in this network. Let’s play the DeGroot learning game on this network with a vector of random beliefs (one value per node in the network) drawn from the binomial distribution with probability 0.25. Create the distribution of beliefs and graph the network to show where they have been distributed. Then play the learning model with these beliefs, and plot the result.
# Exercise template: replace each ____ before running
beliefs <- rbinom(network_nodes(____),1,prob = 0.25)
____ %>% mutate(____ = beliefs) %>% autographr(node_color = "____")
netlearn <- play_learning(____, ____)
plot(____)
# Solution
beliefs <- rbinom(network_nodes(ison_networkers),1,prob = 0.25)
ison_networkers %>% mutate(beliefs = beliefs) %>% autographr(node_color = "beliefs")
netlearn <- play_learning(ison_networkers, beliefs)
plot(netlearn)
Each line in this plot represents the belief trajectory of a single node at each step. About a quarter of the nodes begin believing, and the other three quarters do not. Then we can see how responsive these nodes are to the random distribution of beliefs across the network. Some revise their beliefs more significantly than others.