epidemic control

  • epidemics
  • information diffusion
  • viral marketing
  • temporal networks

Team

Corey, Sascha, Vincent, Christina, John

Project

Computing tasks:
1. Read Prakash’s paper (click on name)
2. Implement Prakash’s spreading models
3. Import data into MATLAB (or alternative SW)
– Get the data from here: http://ndssl.vbi.vt.edu/ebola/data/SLE_contact_graph-ver1.1.zip
– Build an unweighted graph (for now) using columns 1 and 3 in SLE_contact_graph.txt. Please, disregard the other columns.
4. Extract a subgraph of 1000 nodes using BFS
5. Run Prakash’s models in the subgraph
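
Computing tasks 3 and 4 can be sketched in Python as one alternative to MATLAB. This is a minimal sketch, not a reference solution: it assumes the rows of SLE_contact_graph.txt are whitespace-separated with no header row (neither is confirmed here), and the helper names `build_graph` and `bfs_subgraph` are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(lines, src_col=0, dst_col=2):
    """Build an undirected, unweighted adjacency map from text rows.

    Columns 1 and 3 of each row (indices 0 and 2) are used as edge endpoints.
    Assumption: fields are separated by whitespace.
    """
    adj = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) <= max(src_col, dst_col):
            continue  # skip blank or malformed rows
        u, v = parts[src_col], parts[dst_col]
        if u == v:
            continue  # ignore self-loops
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_subgraph(adj, start, max_nodes=1000):
    """Return the subgraph induced by the first max_nodes nodes reached by BFS."""
    visited = {start}
    queue = deque([start])
    while queue and len(visited) < max_nodes:
        u = queue.popleft()
        for v in sorted(adj[u]):  # sorted only to make the traversal repeatable
            if v not in visited:
                visited.add(v)
                queue.append(v)
                if len(visited) >= max_nodes:
                    break
    # keep only the edges whose two endpoints were both visited
    return {u: adj[u] & visited for u in visited}


# Possible usage (file name taken from the task description):
# with open("SLE_contact_graph.txt") as f:
#     graph = build_graph(f)
# sub = bfs_subgraph(graph, start=next(iter(graph)), max_nodes=1000)
```

The choice of starting node is left open in the task; picking a node inside the largest connected component avoids ending up with fewer than 1000 nodes.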

Policy tasks:
1. Read Valente’s paper (click on name)
2. Think about network research and how it can be used to build recommendations for health intervention

COMM students: have a look at the questions listed in this document and try to articulate a paragraph-long answer to them. Submit your answers to the blog.

NETS students: due on Friday, April 3.
1. Read and review the following paper:
a. “Identification of Influential Spreaders in Complex Networks” by Kitsak et al. in Nature 2010.
2. In the following steps, you need to use the network imported in Phase I.
3. Design a seeding strategy to spread information based on the following criteria:
a. Largest degrees
b. Eigenvector centrality
c. k-core decomposition (see Kitsak’s paper)
d. Cascading size
4. Code algorithms to implement each one of the above criteria:
a. Run your algorithms in the network extracted in Phase I
b. Use a number of seeds that varies from 1 to 20
c. Compare the effectiveness of each strategy
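
Criteria (a) to (c) above can be sketched as node rankings from which the top seeds are taken. This is a minimal pure-Python sketch under stated assumptions: the network is a dict mapping each node to the set of its neighbors, every neighbor also appears as a key, and the function names are hypothetical. Eigenvector centrality is approximated by power iteration on A + I (the shift avoids oscillation on bipartite graphs), and the k-core index is computed by the usual peeling procedure. Criterion (d), cascading size, needs a spreading simulation and is not covered here.

```python
def top_k(scores, k):
    """Return the k nodes with the highest score (ties broken by node name)."""
    return sorted(scores, key=lambda u: (-scores[u], u))[:k]


def degree_scores(adj):
    """Degree of every node."""
    return {u: len(adj[u]) for u in adj}


def eigenvector_centrality(adj, iterations=200):
    """Approximate the principal eigenvector by power iteration on A + I."""
    x = {u: 1.0 for u in adj}
    for _ in range(iterations):
        nxt = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {u: val / norm for u, val in nxt.items()}
    return x


def core_numbers(adj):
    """k-core (k-shell) index of every node.

    Repeatedly removes a node of minimum remaining degree; the running
    maximum of those degrees is the core number of the removed node.
    This simple version is quadratic in the number of nodes.
    """
    degree = {u: len(adj[u]) for u in adj}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        u = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


# Possible usage: seed sets of size 1 to 20 for each criterion
# for n_seeds in range(1, 21):
#     by_degree = top_k(degree_scores(graph), n_seeds)
#     by_eigen = top_k(eigenvector_centrality(graph), n_seeds)
#     by_core = top_k(core_numbers(graph), n_seeds)
```

Many nodes share the same k-core index, so ranking by core number alone leaves ties; breaking them by node name, as done here, is an arbitrary choice.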

Due on Friday April 10.
All the students in the group should collaborate to solve the tasks below.

1. Code an algorithm to visualize the state of the epidemic over time
a. Find a network layout that allows you to visualize the network in a convenient manner
b. Each node should change color, from green to red, depending on its infection level
c. Edges should light up whenever they transmit the disease at a given time step
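
The green-to-red node coloring in item (b) can be sketched as a small mapping from infection level to a color code. This is only an illustration: it assumes the infection level is a number between 0 (healthy) and 1 (fully infected), and the function name `infection_color` is hypothetical. The resulting hex string can be passed to most plotting tools that accept HTML-style colors.

```python
def infection_color(level):
    """Map an infection level in [0, 1] to a hex color from green to red."""
    level = min(1.0, max(0.0, level))  # clamp out-of-range values
    red = round(255 * level)
    green = round(255 * (1.0 - level))
    return f"#{red:02x}{green:02x}00"


# Possible usage: one color per node at a given time step
# colors = {node: infection_color(level) for node, level in levels.items()}
```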

Due on Friday April 17.
All the students in the group should collaborate to solve the tasks below.
1. Consider other spreading criteria to increase influence
2. If you are a health agency and you want to contain the spread of the disease, how would you modify the network, i.e., cut links, in order to contain the spread?
3. If you were trying to maximize the spread throughout the network, how would you modify the network to increase the rate of spreading?
You should prepare, as a group, a slide show and present your work in class on Friday April 24. The presentation should be less than 20 minutes long. Send the slides to Josh (email below) no later than Thursday April 23!

Do not forget to update the blog with your progress! You will have to register your email the first time you post. If you have any figures, send them to Joshua Becker: jbecker[at]asc.upenn.edu. We will add them to the projects portfolio!


Sandra Bailon · epidemic control

Comments 10

    1. John Trueman and Christina Nelson

      The article focuses on how efficiently nodes identified by k-shell decomposition spread a contagion. This does not apply directly to our case, because k-shell decomposition identifies the most efficient spreaders when there is a single spreading origin, and our disease, Ebola, would have multiple origins within the network. That explains why the highest-degree nodes were more efficient spreaders than the nodes chosen by k-shell decomposition. However, the article specifies that highest-degree nodes are most efficient when the epidemic starts in multiple origins simultaneously, whereas Ebola could start in multiple origins and at different times. Is it possible to implement a spreading model that also takes this time-varying aspect into account? At the end of the article it says that in the case of infections that do not confer immunity on recovered individuals, the core of the network in the large k-shell layers forms a reservoir where infection can survive locally. Ebola is an infection that would fall under this category. Areas in the network with large k-shell layers should be one of the first and primary targets of preventative interventions. People can avoid becoming infected with Ebola by avoiding contact with those who already have the virus and by following certain basic hygiene measures, such as disinfecting commonly touched surfaces or washing hands with either soap and water or hand sanitizer. A preventative behavioral intervention that promotes and enables people to maintain the appropriate levels of hygiene could be utilized to prevent Ebola from spreading to areas of the network with large k-shell layers.
      Our intervention recommendations should adequately reflect the fact that our data set represents the spread of Ebola in Sierra Leone. To do so, we should try to recommend intervention strategies that would be feasible given Sierra Leone’s limitations in infrastructure as well as available financial and medical resources. Given this, we recommend that hand sanitizer be made readily and widely available to Sierra Leone citizens, schools, and hospitals. Soap would probably be cheaper to deliver, and easier to get Sierra Leone’s citizens to adopt, than hand sanitizer, but running water is not easily accessible there. Additionally, we recommend sending health-care educators into communities to teach citizens how to avoid contracting Ebola and to promote the normalization of hand sanitizer usage. Partnerships with Sierra Leone’s community and organizational leaders, designed to facilitate positive, comfortable communications and relationships between the health-care educators and Sierra Leone’s citizens, could be established to enhance and assure the efficacy of the health-care educators’ efforts.

  1. Vincent Gubitosi

    We designed seeding strategies to spread information based on largest degrees, eigenvector centrality, and k-core decomposition, and coded algorithms implementing each of the strategies. We ran the algorithms with our SIS model on our network with 50,000 nodes. Our values for the model consisted of beta = 0.1, delta = 0.3, gamma = 0.1, epsilon = 0.3, and theta = 0.1. We ran the model for 1,000 iterations. We simulated seeding the top 1, 5, 10, 15, and 20 nodes with the largest degrees and the greatest eigenvector centrality. We also simulated seeding one node from each of the highest 1, 5, 10, 15, and 20 k-cores. We based the effectiveness of each algorithm on the percent of infected nodes in the network. Here are the results:

    Largest Degrees with Beta = 0.1:
    1 Node: infected ratio: 0.093541749259
    5 Nodes: infected ratio: 0.0940892186291
    10 Nodes: infected ratio: 0.0928243756017
    15 Nodes: infected ratio: 0.0907100111382
    20 Nodes: infected ratio: 0.0918993411489

    Eigenvector Centrality with Beta = 0.1:
    1 Node: infected ratio: 0.089464046365
    5 Nodes: infected ratio: 0.0940514621208
    10 Nodes: infected ratio: 0.090011515735
    15 Nodes: infected ratio: 0.0926922278228
    20 Nodes: infected ratio: 0.0917105586075

    k-Core Decomposition with Beta = 0.1:
    1 Node: infected ratio: 0.0914273847955
    5 Nodes: infected ratio: 0.0924090540107
    10 Nodes: infected ratio: 0.0912763587623
    15 Nodes: infected ratio: 0.0906156198675
    20 Nodes: infected ratio: 0.0911442109834

    The algorithms all returned fairly similar results, but some comparisons can still be made. Below, we list the maximum, minimum, range, and average infected ratio for each algorithm. For each metric, we order the results from greatest to least.

    Maximum Infected Ratio:
    1. Largest Degrees: 0.0940892186291
    2. Eigenvector Centrality: 0.0940514621208
    3. k-Core Decomposition: 0.0924090540107

    Minimum Infected Ratio:
    1. Largest Degrees Min: 0.0907100111382
    2. k-Core Decomposition Min: 0.0906156198675
    3. Eigenvector Centrality Min: 0.089464046365

    Range of Infected Ratios:
    1. Eigenvector Centrality Range: 0.004587416
    2. Largest Degrees Range: 0.0033792075
    3. k-Core Decomposition Range: 0.001793434

    Average Infected Ratio:
    1. Largest Degrees Average: 0.092612939
    2. Eigenvector Centrality Average: 0.091585962
    3. k-Core Decomposition Average: 0.091374526
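
The maximum, minimum, range, and average listed above can be re-derived from the reported infected ratios. A minimal sketch; the values are copied from the results in this comment, and the names `reported` and `summarize` are hypothetical.

```python
# Infected ratios for 1, 5, 10, 15, and 20 seed nodes, as reported above
reported = {
    "Largest Degrees": [0.093541749259, 0.0940892186291, 0.0928243756017,
                        0.0907100111382, 0.0918993411489],
    "Eigenvector Centrality": [0.089464046365, 0.0940514621208, 0.090011515735,
                               0.0926922278228, 0.0917105586075],
    "k-Core Decomposition": [0.0914273847955, 0.0924090540107, 0.0912763587623,
                             0.0906156198675, 0.0911442109834],
}


def summarize(ratios):
    """Maximum, minimum, range, and average of a list of infected ratios."""
    return {
        "max": max(ratios),
        "min": min(ratios),
        "range": max(ratios) - min(ratios),
        "average": sum(ratios) / len(ratios),
    }


summary = {name: summarize(ratios) for name, ratios in reported.items()}

# Print each metric with strategies ordered from greatest to least
for metric in ("max", "min", "range", "average"):
    ranked = sorted(summary, key=lambda name: summary[name][metric], reverse=True)
    print(metric, [(name, round(summary[name][metric], 9)) for name in ranked])
```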

    From the results, it appears that largest degrees performed the best, followed by eigenvector centrality, followed by k-core decomposition. Although we would expect both eigenvector centrality and k-core decomposition to perform better than largest degrees, the structure of our subgraph may explain the results we obtained.
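
The kind of experiment described in this comment can be sketched with a plain discrete-time SIS simulation. This is not the code used for the results above: it uses only beta (infection probability per contact and step) and delta (recovery probability per step) and ignores the gamma, epsilon, and theta parameters mentioned in the comment. The network is assumed to be a dict mapping each node to the set of its neighbors, and `simulate_sis` is a hypothetical name.

```python
import random


def simulate_sis(adj, seeds, beta=0.1, delta=0.3, steps=1000, rng=None):
    """Run a discrete-time SIS process and return the final infected ratio."""
    rng = rng or random.Random(0)
    infected = set(seeds)
    for _ in range(steps):
        if not infected:
            break  # the contagion has died out
        # each infected node tries to infect each susceptible neighbor
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                if v not in infected and rng.random() < beta:
                    newly_infected.add(v)
        # each infected node recovers (becomes susceptible again) with prob. delta
        recovered = {u for u in infected if rng.random() < delta}
        infected = (infected - recovered) | newly_infected
    return len(infected) / len(adj)


# Possible usage: compare seed sets chosen by different criteria
# ratio = simulate_sis(graph, seeds=["node_a", "node_b"], beta=0.1, delta=0.3)
```

A single run is noisy, so averaging the infected ratio over many runs with different random generators gives a fairer comparison between seeding strategies.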

  2. John Trueman, Christina Nelson

    Problem:
    In today’s society, there are a variety of epidemics, from the technological sphere to the medical world. Our job is to develop a network, analyze the flow through the network, and discuss ways in which we can prevent the spread of epidemics. Some key facts that we must consider are the variability and diversity of epidemics. From computer viruses to the bird flu, epidemics come in a variety of forms and may require different network interventions, as discussed by our engineering counterparts for this project. We also have to think about the modes by which the epidemic spreads in order to develop effective interventions that target the right places and people. Additionally, an organization’s or government’s financial constraints and resources must be taken into account when thinking about how to develop the most effective epidemic control methods and interventions. We must also consider the fact that epidemics are considered epidemics because they are extremely hard to control. While networks can be boiled down into basic parts, from nodes to links, there are stronger forces at play that make epidemics extremely difficult for decision-makers to control. From the rate of spread to the types of epidemics we are to consider, all these facts work in tandem to make the process rewarding, yet cumbersome.

    Context:
    With so little research on epidemics and the control of them, there has never been a better time to analyze, on a more sophisticated level, ways to control them. Ultimately, the aim is to be in a world where disease is easily controlled, can be dissipated, and eventually, does not exist. We need to take notice of epidemics, on an individual level, as we consider the awful implications they can have for individuals. Epidemics can cause a high rate of death and sickness, as shown through high-profile diseases such as Ebola and HIV. These epidemics also have serious economic implications, as the cost for the government and medical facilities to control the epidemics is extremely high. The paper we read discussed a variety of network interventions that could prove key to preventing the spread of an epidemic. To place this into context, take for example the Ebola outbreak. Doctors were at a loss as to how to prevent it from spreading for a long period of time. As a result, with no network intervention, the disease was allowed to run its course through a number of countries and continents.

    Method:
    In order to understand epidemics on a deeper level, we must expand our knowledge of not only networks, but also network interventions. As a group, we also need to gain a greater understanding of the epidemic at hand. What causes the epidemic to spread? Who is most prone to contract the spreading disease? What specific disease are we discussing? The goal of our analysis is to not only explore ways in which epidemics can be prevented, but also discover more about how networks function, from individual nodes to the importance and strength of different links. Understanding the structure and function of entire networks and systems is necessary in order to stop or prevent the spread of an epidemic. Understanding the entire network would allow for predictions to be made about how the epidemic will spread, which would enable decision-makers to get ahead of the epidemic, instead of trying to clean up behind it.

    Implications:
    The decision-maker, whether it be an IT expert or a doctor, has to choose among a variety of network interventions. There are a number of different alternatives: we can attack the epidemic on an individual level, segment the network, or alter the network. We would recommend segmenting the network. Big Data can be utilized to segment the network based on various qualitative and quantitative measures. For example, the network could be segmented into those who have been exposed to the disease and those who have not been. This form of segmentation would allow for isolation of affected individuals, slowing further spread of the epidemic. Additionally, the network can be segmented by attitude about adhering to and promoting certain health protocols. Using this type of segmentation could help decision-makers choose whom to target in health communication campaigns geared toward preventing or slowing the spread of the epidemic.

  3. Christina Nelson

    Research on social networks and networking has shown that people can be influenced by their social networks to adopt new practices that affect their personal lives. Network interventions are based on the diffusion of innovations theory, which explains how new ideas and practices spread within and between communities. The article that John and I were assigned to read presented four strategies that capitalize on network data to develop planned change programs:

    1. Identifying individuals: leaders to promote behavior change may be identified by finding those in the network who best span the network. In the case of epidemic control, it would be helpful to identify nodes that would be best for either spreading a behavior that would reduce the spread of the epidemic or for disrupting the spread of a behavior that is contributing to the epidemic’s spread. For our project, this could be achieved by identifying nodes that are leaders and bridges (connect different groups of people) in their network because these individuals are in a good position to lead a change program due to their prominence and diversity. Additionally, peripheral nodes or nodes with low connectivity should be identified in our project because these are the individuals that may be unintentionally excluded from intervention programs or educational services that could prevent them from being affected by the epidemic.

    2. Segmentation, in which interventions are directed toward groups of people and can be carried out simultaneously. If our data shows that the epidemic is disproportionately affecting certain parts of the network or if the epidemic is spreading in different manners among different clusters of the network, I would suggest employing this type of intervention.

    3. Induction, where the network is stimulated such that new links in the networks are formed. Induction interventions stimulate or force peer-to-peer interaction to create cascades of information/behavioral diffusion. Network outreach is a kind of induction method that could be used for our project. Network outreach requires selected individuals to reach out to their personal networks to participate in an intervention together. So by this method, the behavior change message can be delivered to groups of people. Network outreach is expected to be more effective than individual interventions (for example via word of mouth) because the group can reinforce the positive behavior change. The “identifying individuals” method could be used to identify those who would be best for spreading behavior changes that would disrupt the spread of the epidemic among various groups of people.

    4. Alteration, which includes interventions that change the network. Adding nodes, such as lay health advisors or health educators, is a behavior change approach that could be utilized early on, before an epidemic has spread significantly. These new nodes could be added to reach out to individuals on the margins of networks. For our project, node-deletion interventions will probably be necessary to disrupt the spread of the epidemic. For example, it may be useful to remove affected nodes that act as key bridges in networks. “Removing” an individual from a network does not necessarily have to mean quarantining or physically removing him/her. For example, removing a node could mean adding a protective barrier (such as condom use for sexually transmitted epidemics, or a face mask for airborne epidemics) that inhibits the node from transmitting the disease. This method would also be used alongside the “identifying individuals” strategy in order to figure out which node needs to be removed.

  4. John Trueman

    Hey all, this is John Trueman checking in from the ‘communications’ side of the project. While I anticipate that much of what the engineers discuss will be over my head when it comes to the nitty-gritty details of their software, I hope I can shed some light your way in reference to the Valente article we were asked to read and summarize. The article is basically a discussion of network interventions, and was fueled by the fact that research on social networks has shown that individuals can be influenced by their social networks to begin incorporating new practices into their daily routines.
    There are four separate types of network interventions. The first one is an intervention that uses network data to identify an individual as a champion. In more understandable terms, a champion would be a person who can promote behavior change, and these interventions attempt to name him. This can be done through methods as simple as nominations within a network, or mathematical algorithms that identify central nodes. Companies and other entities can use this information for the dissemination of news, ideas, products, and theories.
    The second type is segmentation, in which a network intervention targets certain groups of people to change at the same time. In other terms, this intervention sets out to identify groups of people whose behavior should change simultaneously. When a company begins a new business practice, for example, it might prefer to incorporate it into one location, then another, and another, instead of everywhere at once, to ensure that the business practice will be effective.
    The third type of intervention is known as induction, and forces peers to interact to create a cascade of information diffusion. This is seen most clearly in media marketing campaigns, which are designed to generate a buzz about a specific product, with the long-term goal of increasing sales. When you think of this intervention, think virality!
    The final example of a network intervention is known as alteration, which involves changing the network itself. This can involve adding or deleting nodes or links, and even rewiring the network. The goal of this is to encourage a network to run more smoothly. A good example of this might be the removal of a noisy student from a classroom setting so that work by other students can be completed on time.
    Now that we have established the four types of network interventions, it is important for us to consider exactly which type we want to incorporate. There are pluses and minuses to each, but the reality is if we pick more than one, we may find ourselves overwhelming a network to the point that it gets disrupted negatively. Do we think that health epidemics can be stopped by targeting a specific person, or by altering the group entirely?

  5. Vincent Gubitosi

    The paper attempts to provide a general model for discovering the long-term effects of a new contagion entering a given network. The contagion can be anything that could potentially spread through a given network. This includes viruses, products, and memes, among others. It also gives a threshold condition for discovering these long-term effects. If the threshold condition isn’t met, the contagion dies out. If it is met exactly at its tipping point, then the contagion remains approximately at the same level of influence among the nodes of the network. Otherwise, if the threshold condition is met at a value greater than the tipping point, the contagion blows up and eventually spreads throughout the entire network.
    The generalized model contains nodes that are each in a specific state at any point in time, each state coming from a specific class. The classes are called susceptible, infected, and vigilant. Nodes in a state from the susceptible class can become infected by any neighboring node who is infectious. Their state would convert into an infected state. Nodes in a state from the infected class are capable of spreading the infection to their neighbors who are in a susceptible state, thus converting these states to infected states.
    The generalized model is constructed from an arbitrary number of susceptible and vigilant states, and two infectious states. With this construction, the model is capable of generalizing every practical virus propagation model (VPM). Six of the VPMs mentioned in the paper are SIS, SIR, SIRS, SEIR, SIV, and SEIV. S stands for susceptible, I stands for infected, R stands for recovered (where nodes in the recovered state have some level of immunity), E stands for exposed (where nodes in the exposed state are infected, yet can’t convert other nodes to an infected state), and V stands for vigilant.
    The threshold condition s is the product of the largest eigenvalue of the network and some constant C. The constant C depends on the VPM. If s < 1, the contagion dies out; if s > 1, the contagion will spread throughout the entire network. The threshold condition has some interesting properties. First, it separates the effect of topology (the layout of the network) and the VPM. Second, the actual topology can be arbitrary. This is because the threshold depends only on the largest eigenvalue of the connectivity matrix (not on the rest of the topology). Third, the VPM can be arbitrary. This is because the threshold depends only on a constant that characterizes the VPM (not on the details of the VPM itself).
    The paper includes some numerical simulations involving the generalized model. The two networks that were used come from (1) the Oregon Route Views project and (2) a physical contact graph representing a synthetic population of the city of Portland. One result found from the experiments was that, in order to find out how vulnerable a network is, you should focus on its largest eigenvalue. This value captures the connectivity of the graph, which is why it is the quantity to look at when assessing the vulnerability of a given network. The experiments also led to some counter-intuitive results.
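
The threshold can be illustrated numerically for the simplest case. The sketch below estimates the largest eigenvalue of the adjacency matrix by power iteration (on A + I, a shift added here to avoid oscillation on bipartite graphs) and multiplies it by the constant for the plain SIS model, which is commonly given as C = beta / delta. The constants for the other VPMs differ and are not reproduced here. The network is assumed to be a dict mapping each node to the set of its neighbors, and the function names are hypothetical.

```python
def largest_eigenvalue(adj, iterations=500):
    """Estimate the largest adjacency eigenvalue by power iteration on A + I."""
    n = len(adj)
    x = {u: 1.0 / n ** 0.5 for u in adj}  # unit-length starting vector
    for _ in range(iterations):
        y = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}  # (A + I) x
        norm = sum(val * val for val in y.values()) ** 0.5
        x = {u: val / norm for u, val in y.items()}
    # Rayleigh quotient x^T A x for the unit-length vector x
    return sum(x[u] * sum(x[v] for v in adj[u]) for u in adj)


def sis_threshold(adj, beta, delta):
    """Threshold value s = (largest eigenvalue) * beta / delta for plain SIS."""
    return largest_eigenvalue(adj) * beta / delta


# Possible usage with the beta and delta values used elsewhere on this page:
# s = sis_threshold(graph, beta=0.1, delta=0.3)
# print("above threshold" if s > 1 else "below threshold")
```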
    The first is that the long-term effect of the contagion doesn’t depend on the virus-incubation probability (the probability of the state of a node converting to an exposed state). This probability only changes the speed at which the long-term result of the contagion is reached. The second is that the long-term effect of the contagion doesn’t depend on the rate of the loss of immunity (which affects the transitioning of the state of a node between recovered and susceptible states). The only way that lowering this rate affects the effective strength of the virus (thus changing the threshold condition) is if there is a way to give a node direct immunity before it ever becomes infected.
