Data

GLEAM uses a network approach in which the world is divided into subpopulations, defined around major transportation hubs and connected by the flux of individuals traveling among them. To define this network we integrate real-world population and mobility data. These data layers, combined with the natural history of the disease and the appropriate compartmental model, are the core of the GLEAM engine. When necessary, extra layers of information, such as vector abundance, socio-economic information and environmental data, can be added to the simulator.

GLEAM acquires population data from large scale projects such as the Global Urban-Rural Mapping project, World Pop and census databases.

In these datasets the world is divided into a grid of cells, and each cell is assigned an estimated population value. GLEAM uses cells that are approximately 25 km x 25 km, dividing the globe into over 250,000 populated cells.

We know the coordinates of each cell and also those of all the commercial airports in the World Airport Network. By considering the distance between the cells and airports we assign each cell to a ‘local’ airport; this process generates about 3,300 subpopulations, each centered on a local transportation hub.
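
The cell-to-airport assignment described above can be sketched as a nearest-neighbor search. This is a minimal illustration, not GLEAM's actual procedure: it assumes a pure nearest-airport rule based on great-circle distance (the text only says that distance is considered), and the function names, airport codes, coordinates and populations are hypothetical toy values.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def assign_cells_to_airports(cells, airports):
    """Group grid cells into subpopulations keyed by their nearest airport.

    cells:    dict cell_id -> (lat, lon, population)
    airports: dict airport_code -> (lat, lon)
    Assumption: each cell goes to the single closest airport, with no distance cutoff.
    """
    subpops = defaultdict(lambda: {"cells": [], "population": 0})
    for cell_id, (lat, lon, pop) in cells.items():
        nearest = min(airports, key=lambda code: haversine_km(lat, lon, *airports[code]))
        subpops[nearest]["cells"].append(cell_id)
        subpops[nearest]["population"] += pop
    return dict(subpops)


# Hypothetical toy data: two airports and three populated cells.
airports = {"BOS": (42.36, -71.01), "JFK": (40.64, -73.78)}
cells = {
    "c1": (42.40, -71.10, 50_000),
    "c2": (42.00, -71.50, 20_000),
    "c3": (40.70, -74.00, 80_000),
}
subpopulations = assign_cells_to_airports(cells, airports)
```

With real data the linear scan over airports would likely be replaced by a spatial index, but the resulting partition of cells into hub-centered subpopulations is the same idea.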

GLEAM uses a set of four different flight networks (one for each season) derived from the worldwide booking datasets from the Official Airline Guide (OAG) and the International Air Transport Association (IATA) databases (updated 2019). This database contains more than 3,800 commercial airports in about 230 countries, and includes over 4,000,000 connections representing the estimated bookings between any two of these airports for each month.

The airport network data reveal significant variations in both the number of destinations per airport and the number of passengers per connection. A few airports have many connections and large passenger volumes (these are the hubs where we typically catch our connecting flights), while many airports have few connections and low volumes. This characteristic is sometimes called the “long tail”, and it has a significant impact on how infections spread around the globe.
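
To make the heterogeneity concrete, the sketch below counts destinations and passenger volume per airport from a list of connections. It is only an illustration: the airport names and passenger numbers are invented, and treating each connection as undirected is an assumption made for simplicity.

```python
from collections import defaultdict


def airport_degree_and_traffic(connections):
    """Return {airport: (number_of_destinations, total_passengers)}.

    connections: iterable of (origin, destination, passengers).
    Assumption: each connection is undirected and counts for both endpoints.
    """
    destinations = defaultdict(set)
    traffic = defaultdict(int)
    for origin, dest, passengers in connections:
        destinations[origin].add(dest)
        destinations[dest].add(origin)
        traffic[origin] += passengers
        traffic[dest] += passengers
    return {airport: (len(destinations[airport]), traffic[airport]) for airport in destinations}


# Hypothetical toy network: one hub and four small airports.
connections = [
    ("HUB", "A", 9000),
    ("HUB", "B", 7000),
    ("HUB", "C", 8000),
    ("HUB", "D", 500),
    ("A", "B", 300),
]
stats = airport_degree_and_traffic(connections)
```

Even in this toy network the hub carries several times the traffic of any other airport, which is the kind of imbalance the real network shows at a much larger scale.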

GLEAM also takes into account commuting patterns, drawn from a commuting database compiled from the national statistics offices of more than 40 countries in five continents and covering more than 78,000 administrative regions. These data sources, which use different semantics and organizational structures with varying degrees of detail, have been standardized before being integrated.

Our fully integrated dataset contains over five million commuting connections between GLEAM’s geographic subpopulations, capturing the irregular network structure that affects the local diffusion of infections between neighboring subpopulations.

The GLEAM engine simulates the infection dynamics according to the characteristics of the disease coupled with any prevention and intervention measures. Examples of disease characteristics are: incubation times, the proportion of asymptomatic yet infectious individuals, mortality rates and immunity.

The infection characteristics are defined in a so-called ‘compartmental model’. Each individual fits, at any given point in time, within a certain ‘compartment’ that corresponds to a particular disease-related state (being susceptible, symptomatic or vaccinated, for example). These compartments are connected by paths that define how individuals may pass from one state to another (from susceptible to latent when being infected, for example) while associated parameters determine the likelihood that such transitions take place.
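
One way to picture such a model is as a list of compartments plus a list of transitions with associated parameters. The sketch below is a generic illustration, not GLEAM's internal representation: the four compartments, the rate values, the `Transition` class and the exponential rate-to-probability conversion are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    source: str
    target: str
    rate: float                      # per-day rate (hypothetical value)
    infectious_driven: bool = False  # True if the rate scales with infectious prevalence


# Hypothetical compartments and paths between them.
COMPARTMENTS = ["Susceptible", "Latent", "Infectious", "Recovered"]
TRANSITIONS = [
    Transition("Susceptible", "Latent", rate=0.5, infectious_driven=True),
    Transition("Latent", "Infectious", rate=1 / 3),
    Transition("Infectious", "Recovered", rate=1 / 4),
]


def transition_probability(transition, state, dt=1.0):
    """Probability that one individual in `transition.source` moves within dt days.

    state: dict compartment -> number of individuals.
    Assumption: a rate r is converted to a probability as 1 - exp(-r * dt).
    """
    rate = transition.rate
    if transition.infectious_driven:
        total = sum(state.values())
        rate *= state["Infectious"] / total if total else 0.0
    return 1.0 - math.exp(-rate * dt)


# Example state: 10 infectious individuals in a population of 1,000.
example_state = {"Susceptible": 990, "Latent": 0, "Infectious": 10, "Recovered": 0}
p_infection = transition_probability(TRANSITIONS[0], example_state)
```

Keeping the model as data rather than hard-coded equations makes it straightforward to add compartments, such as vaccinated or asymptomatic individuals, without changing the code that applies the transitions.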

GLEAM uses stochastic algorithms, mathematically defined through individual-based chain binomial and multinomial processes, to calculate the proportion of the population within each compartment for each subpopulation, and how these proportions change over time as individuals transition from one compartment to the next.
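
As a rough illustration of a chain binomial and multinomial update, the sketch below advances a single subpopulation by one time step. It is not GLEAM's implementation: the compartments (S, L, Ia, Is, R), the parameter values and the helper functions are assumptions made for the example, and the binomial sampler is a simple Bernoulli sum that is only practical for small populations.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def multinomial(rng, n, probs):
    """Split n individuals among outcomes with probabilities `probs`.

    Individuals not assigned to any outcome (probability 1 - sum(probs)) stay put.
    Implemented as a sequence of conditional binomial draws.
    """
    counts, remaining, p_left = [], n, 1.0
    for p in probs:
        k = binomial(rng, remaining, min(1.0, p / p_left)) if p_left > 0 else 0
        counts.append(k)
        remaining -= k
        p_left -= p
    return counts


def step(state, rng, beta=0.5, eps=1 / 3, mu=1 / 4, p_asym=0.3, dt=1.0):
    """One stochastic step; all parameter values are hypothetical."""
    s, l, ia, i_s, r = (state[k] for k in ("S", "L", "Ia", "Is", "R"))
    n = s + l + ia + i_s + r
    force = beta * (ia + i_s) / n if n else 0.0
    p_inf = 1 - math.exp(-force * dt)     # S -> L
    p_leave_l = 1 - math.exp(-eps * dt)   # L -> Ia or Is
    p_rec = 1 - math.exp(-mu * dt)        # Ia/Is -> R
    new_l = binomial(rng, s, p_inf)
    # Latent individuals have two exits, hence a multinomial draw.
    to_ia, to_is = multinomial(rng, l, [p_leave_l * p_asym, p_leave_l * (1 - p_asym)])
    rec_a = binomial(rng, ia, p_rec)
    rec_s = binomial(rng, i_s, p_rec)
    return {
        "S": s - new_l,
        "L": l + new_l - to_ia - to_is,
        "Ia": ia + to_ia - rec_a,
        "Is": i_s + to_is - rec_s,
        "R": r + rec_a + rec_s,
    }


# Usage: 30 daily steps from 10 symptomatic cases in a population of 1,000.
rng = random.Random(42)
state = {"S": 990, "L": 0, "Ia": 0, "Is": 10, "R": 0}
history = [state]
for _ in range(30):
    state = step(state, rng)
    history.append(state)
```

Because every draw moves whole individuals, the total population is conserved at each step and no compartment can become negative, which is one practical advantage of this formulation over rounding the output of deterministic equations.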

Diseases do not have the same impact in different parts of the world, or even within the same city. The impact depends not only on how people interact but also on their living conditions, which means that economic factors play an important role in the spread of infectious diseases. GLEAM incorporates GCP (Gross Cell Product) per capita data to estimate the population at risk.

GLEAM integrates vector populations in order to simulate vector-borne diseases. In particular, we consider high-resolution maps of mosquito occurrence [1] for Aedes aegypti and Aedes albopictus. These specific vectors can be used in the modeling of diseases such as dengue, chikungunya, Zika and yellow fever.

GLEAM includes information about the life cycle of these two vectors, providing daily estimates of mosquito abundance based on environmental factors [2].

[1] Kraemer MU, et al. (2015) The global compendium of Aedes aegypti and Ae. albopictus occurrence. Sci Data 2:150035.
[2] Zhang Q, et al. (2017) Spread of Zika virus in the Americas. PNAS 114(22):E4334-E4343.

The environment plays a crucial role in the lifespan of certain vectors (mosquitoes, for instance). GLEAM uses detailed environmental databases to model relevant parameters, such as lifespan and mortality.
