Modelling geospatial distributions of the triatomine vectors of Trypanosoma cruzi in Latin America

  • Andreas Bender ,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    bender.at.research@gmail.com (AB); catherinemoyes@gmail.com (CLM)

    Current address: Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Old Road Campus, Oxford, United Kingdom

    Affiliation Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Old Road Campus, Oxford, United Kingdom

  • Andre Python,

    Roles Methodology, Writing – review & editing

    Affiliation Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Old Road Campus, Oxford, United Kingdom

  • Steve W. Lindsay,

    Roles Funding acquisition, Investigation, Writing – review & editing

    Affiliation Department of Biosciences, Durham University, DH1 3LE, Durham, United Kingdom

  • Nick Golding,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation Department of BioSciences, University of Melbourne, Parkville, Melbourne, Victoria, Australia

  • Catherine L. Moyes

    Roles Conceptualization, Data curation, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    bender.at.research@gmail.com (AB); catherinemoyes@gmail.com (CLM)

    Affiliation Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Old Road Campus, Oxford, United Kingdom

Abstract

Approximately 150 triatomine species are suspected to be infected with the Chagas parasite, Trypanosoma cruzi, but they differ in the risk they pose to human populations. The largest risk comes from species that have a domestic life cycle and these species have been targeted by indoor residual spraying campaigns, which have been successful in many locations. It is now important to consider residual transmission that may be linked to persistent populations of dominant vectors, or to secondary or minor vectors. The aim of this project was to define the geographical distributions of the community of triatomine species across the Chagas endemic region. Presence-only data with over 12,000 observations of triatomine vectors were extracted from a public database and target-group background data were generated to account for sampling bias in the presence data. Geostatistical regression was then applied to estimate species distributions and fine-scale distribution maps were generated for thirty triatomine vector species including those found within one or two countries and species that are more widely distributed from northern Argentina to Guatemala, Bolivia to southern Mexico, and Mexico to the southern United States of America. The results for Rhodnius pictipes, Panstrongylus geniculatus, Triatoma dimidiata, Triatoma gerstaeckeri, and Triatoma infestans are presented in detail, including model predictions and uncertainty in these predictions, and the model validation results for each of the 30 species are presented in full. The predictive maps for all species are made publicly available so that they can be used to assess the communities of vectors present within different regions of the endemic zone. The maps are presented alongside key indicators for the capacity of each species to transmit T. cruzi to humans. These indicators include infection prevalence, evidence for human blood meals, and colonisation or invasion of homes. A summary of the published evidence for these indicators shows that the majority of the 30 species mapped by this study have the potential to transmit T. cruzi to humans.

Author summary

The Pan American Health Organisation’s Strategy and Plan of Action for Chagas Disease Prevention, Control and Care highlights the importance of eliminating those triatomine vector species that colonise homes, and has had great success in many locations. Since indoor residual spraying campaigns have targeted these species, their importance relative to other vectors has diminished and their geographical distributions may also have changed. It is now vital to consider the full community of vector species, including previously dominant vectors as well as secondary or minor vector species, in order to target residual transmission to humans. Our aim was to define the geographical distributions of the most commonly reported triatomine species in the Chagas endemic region from northern Argentina and Chile to the southern United States of America. We extracted reports of triatomine vector species observed at specific locations from a public database and we used a geostatistical model to generate fine-scale predictive maps for thirty triatomine vector species. Data quality and data availability issues necessitate careful interpretation of the results, so (un)certainty intervals are presented alongside each map. We also present these maps alongside a summary of the published evidence for key indicators related to the capacity of each species to transmit the Chagas parasite to humans. This evidence shows that most of the 30 species that we have mapped pose a potential threat to human populations.

Introduction

American trypanosomiasis, or Chagas disease, is one of the 10 neglected diseases addressed by the London Declaration, which calls for control and elimination of these devastating diseases by 2020 [1]. Vectorial transmission of the disease occurs from northern Argentina and Chile to the southern United States of America, and a ‘Strategy and Plan of Action for Chagas Disease Prevention, Control and Care’ has been set out by the Pan American Health Organisation (PAHO) [2]. This strategy includes the elimination of domestic vectors to prevent intra-domiciliary transmission, as well as screening blood donors and pregnant women to prevent transmission via blood donation or the placenta, and implementation of best practice in food handling to prevent oral transmission. Our study focuses on the primary route of infection: the contamination of a vector bite with faeces of that vector.

The Trypanosoma cruzi parasite is transmitted to humans by over 150 different vector species from 18 different genera [3]. The transmission risk that each vector species poses is influenced by how likely it is that the species in question will come into contact with humans. The likelihood of human contact is influenced by short-distance movement (for example, whether the species enters and/or colonises homes) and the larger-scale geographical distribution of that species. Studies assessing vulnerability of individuals to Chagas disease have shown that, while housing, ecotype and socio-economics are all relevant, triatomine presence is the most important indicator [4]. Thus understanding the distribution of these vector species is vital to both target control measures and to assess disease risk.

Before the current intervention era, five vector species were recognised as being dominant in the transmission of T. cruzi to humans based on their habit of colonising houses, behaviour (feeding-defecation interval) and widespread geographical distributions [5]. Since indoor residual spraying (IRS) campaigns have successfully targeted these dominant species in many locations, their importance relative to other vectors has diminished and their geographical distributions may also have changed [6]. It is now vital to understand the full community of vector species, including previously dominant vectors as well as secondary or minor vector species, in order to target residual transmission to humans [6–8].

Several studies have investigated species behaviours that influence short distance travel in and around homes, such as host-seeking, aggregation and dispersal [9–19], but fewer studies have considered the larger-scale geographical distributions of these species. The studies of geographical species distributions that have been conducted typically focus on a single country or a region within a country [19–24]. One earlier study considered the distribution of triatomines infected with a virus across South America, without distinguishing species [25], and previous studies have mapped individual species across their ranges [26–31], but no studies have considered the geographical distributions of multiple, individual dominant and secondary vector species across the Chagas endemic region from northern Argentina and Chile to the southern United States of America. A lack of consistent region-wide information makes it harder to construct an overview for the region as a whole or to compare areas within the endemic zone.

The data recording presence of a species are often sparse and suffer from sampling bias, which makes inter-region comparison of these records difficult. The aim of this study is to use statistical models to produce a comprehensive set of maps predicting the distributions of triatomine vector species while taking into account the limitations of the data. We use an extensive database of reported occurrences of each species, and data on environmental variables that are likely to influence species presence, and we build species distribution models to improve our current understanding of the spatial distribution of vectorial transmission of T. cruzi.

Materials and methods

Study area

The study area was defined as the Chagas endemic region, which extends from northern Argentina and Chile to the southern United States of America. The study area for each individual species was defined as the area encompassing all reports of that species since the year 2000 plus a buffer zone of 5 degrees (approximately 300 km).

Species occurrence and background points

The primary source of vector species data was a database of vector occurrence locations, which was supplemented with additional species presence points derived from a database of infections in vector species. Data on vector occurrence was extracted from DataTri, a publicly available database that reports the presence of a given triatomine species, the date of collection (if available) and geographical coordinates for each collection [32]. Additional vector occurrence data was added using a database of T. cruzi infections in triatomines that also provided the vector species found, the date of collection (if available) and geographical coordinates for each collection [33]. Any data points from DataTri that were duplicated in the second data set were removed before vector occurrence data from the infections database was added to the DataTri data set. Data points before the year 2000 were removed because the aim was to investigate vector distributions in the current era.

The available vector occurrence data is usually referred to as presence-only data. Techniques for modelling such data often involve augmenting the presence data with pseudo-absence or background points, which requires a source of appropriate background data [34–36].

Here we use a target-group background (TGB) approach by choosing background data that exhibits similar sampling bias as the occurrence data [37]. This approach can reduce the bias introduced by preferential sampling of the presence locations. It was successfully used to map geographical distributions of malaria hosts and vectors [38] and predict infection risk zones of yellow fever [39]. In simulation studies this method also performed well when compared to approaches using presence-absence data [37, 40]. As with all models of presence-only data, the maps produced using the TGB approach represent relative rather than absolute probabilities of species occurrence.

We constructed one TGB dataset for each vector species as outlined below and illustrated in Fig 1 for Panstrongylus megistus:

  1. The presence locations of vector species k = 1, …, K (target-group) were extracted from the database and a convex hull containing all presence locations was constructed (panel (A) in Fig 1).
  2. This hull was extended by a constant width of 5 degrees in all directions to allow for uncertainty with respect to the range of the species being modelled (extended hull; panel (B) in Fig 1).
  3. The presence locations of all other species within the extended hull were defined as background points (blue dots in panel (C) of Fig 1).
  4. Duplicate observations at the same site and the same year were removed.
  5. At the modelling stage, the background points were weighted such that their total weight is equal to the number of presence observations (cf. [37, 38]).

Fig 1. Construction of background points.

Illustration of the construction of background points using the TGB approach for species Panstrongylus megistus. Panel (A): A convex hull is constructed around the presence locations of the species. Panel (B): The hull is extended by a fixed width of 5 degrees (extended hull). Panel (C): Background points are added using presence locations of all other species within the extended hull. Panel (D): Blocks of width $w_k$ are allocated randomly across the extended hull. The observations in blocks numbered 1–4 are used as training data, and the observations in blocks numbered 5 are assigned to the test data. One fold consists of all blocks sharing the same number. Figure created by the authors using R package tmap [41].

https://doi.org/10.1371/journal.pntd.0008411.g001

The weighting of background points means that if presence and background points were randomly distributed the predicted relative probability would be about 0.5. Thus probabilities > 0.5 indicate that it is likelier to observe presence than background, rather than an absolute probability of occurrence.
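The weighting described above can be illustrated with a minimal sketch. It assumes that each presence point keeps a weight of 1 and that all background points share one common weight; the function names and the counts (200 presences, 1,800 background points) are hypothetical and are not taken from the study.

```python
def background_weight(n_presence, n_background):
    """Weight per background point so that the background weights sum to n_presence.

    Assumption: every presence point keeps a weight of 1.
    """
    if n_presence <= 0 or n_background <= 0:
        raise ValueError("both counts must be positive")
    return n_presence / n_background


def weighted_presence_share(n_presence, n_background):
    """Weighted share of presence points in the combined presence/background data."""
    w_bg = background_weight(n_presence, n_background)
    total_presence = 1.0 * n_presence
    total_background = w_bg * n_background
    return total_presence / (total_presence + total_background)


# Hypothetical species: 200 presence points and 1,800 background points.
w_bg = background_weight(200, 1800)
share = weighted_presence_share(200, 1800)
```

With equal total weights the weighted share of presences is one half, which is why a predicted value of about 0.5 corresponds to "no better than background" in this setting.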

For some species, only a few observations were available in the dataset. It has been suggested that approximately five [42] or ten [43–45] events (presences) per predictor are required to reliably fit a logistic regression. Given that we use up to 30 predictors, this would imply a sample size of n ≈ 150 and n ≈ 300, respectively. In our data, 14 and 9 species, respectively, fulfilled this (approximate) requirement. For completeness, we fit models for all species with n > 50, but results must be interpreted with increasing care as the sample size (number of presence observations) decreases (see Results section). Species with fewer than fifty observations in the training and test data were not modelled; however, their presence locations were used as background points for the species that were modelled.
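The events-per-predictor arithmetic above is simple enough to write down directly. The helper names below are hypothetical and only restate the rule of thumb; they are not part of the authors' code.

```python
def required_presences(n_predictors, events_per_predictor):
    """Minimum number of presence observations implied by an events-per-predictor rule."""
    return n_predictors * events_per_predictor


def meets_rule(n_presence, n_predictors, events_per_predictor):
    """True if a species has at least the number of presences the rule asks for."""
    return n_presence >= required_presences(n_predictors, events_per_predictor)


lenient = required_presences(30, 5)    # five events per predictor
strict = required_presences(30, 10)    # ten events per predictor
```

For 30 predictors this gives 150 and 300 presences, matching the thresholds quoted in the text.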

Environmental variables

Previous work has shown that vector distributions are influenced by climate, land cover types and rural/urban classifications [19–24]. Environmental variables for these three data types were obtained at a resolution of 5 × 5 kilometres. The climatic variables used were land surface temperature (annual; day, night and diurnal difference) [46], two measures of surface moisture (annual) [47], rainfall (annual) [48], elevation (static) and slope (static) [49]. The variables used for land cover were the 16 IGBP land cover classes (annual) [50] and an enhanced vegetation index [51]. Finally, the variables used to distinguish rural, peri-urban and urban areas were urban footprint (static) [52], night-time lights (static) [53], human population (annual) [54] and accessibility (static, based on road networks and distance to cities) [55]. Annual environmental variables were not always available for all time periods in which occurrence data was available. In these cases, we used values from the closest available year. A full description of each variable is given in the supplement (Table A, S1 File).
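The closest-year substitution for annual covariates can be sketched as follows. The function name is hypothetical, and the tie-breaking rule (prefer the earlier year when two years are equally close) is an assumption, since the text does not state how ties were resolved.

```python
def closest_year(target_year, available_years):
    """Return the available covariate year nearest to the observation year.

    Assumption: ties are broken in favour of the earlier year.
    """
    if not available_years:
        raise ValueError("no covariate years available")
    return min(available_years, key=lambda year: (abs(year - target_year), year))


# Hypothetical covariate available only for 2001-2015.
covariate_years = list(range(2001, 2016))
year_for_2000 = closest_year(2000, covariate_years)
year_for_2016 = closest_year(2016, covariate_years)
```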

Model evaluation

In the context of spatial analysis, the data available for modelling often encompass only a few locations in the areas for which predictions are generated. Therefore, the model is usually evaluated on out-of-sample data to avoid over-fitting, to ensure transferability to new locations and to obtain realistic estimates for the goodness of fit. Standard approaches to model evaluation, however, can yield over-optimistic metrics of the model's predictive ability unless the spatial nature of the data (and the model) is taken into account [56–58]. To address these concerns, the data was initially split randomly into train-test data (80%) and evaluation data (20%), stratified by species. The latter dataset is not utilised during model building but later used to evaluate the model's ability to interpolate and the final prediction. Additionally, the train-test data was split into five folds. Following recommendations in [59] each fold consisted of multiple spatial blocks, where the block size $w_k$ for species k = 1, …, K was set such that approximately 50 blocks (10 per fold) would cover the extended hull of that species and defined as $w_k = \sqrt{a_k/50}$, where $a_k$ is the area of the extended hull of species k. Fig 1 (panel (D)) depicts the resulting blocks and folds for species Panstrongylus megistus. For each species, blocks one through four were assigned to the training data, while blocks numbered five (grey shade) were only used to obtain out-of-sample test errors. Allocation of blocks was spatially random to avoid systematic bias of presence and background locations in any of the folds, but stratified with respect to species presence such that the proportion of presence and background points was approximately equal in all folds. The spatial blocking for all species considered in our analyses is provided in [60]. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), which measures the model's ability to discriminate between presence and background points. After model evaluation as reported in the Results section was complete, the best model was refit on all data for the final prediction.
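A minimal sketch of the spatial blocking idea follows. It assumes square blocks, so that roughly 50 blocks of width w tile a hull of area a when w = sqrt(a / 50), and it assigns whole blocks to folds at random. It does not reproduce the stratification by species presence described above, and it is not the blockCV implementation used by the authors; all function names are hypothetical.

```python
import math
import random


def block_width(hull_area, n_blocks=50):
    """Width of square blocks such that about n_blocks of them cover hull_area."""
    return math.sqrt(hull_area / n_blocks)


def block_id(x, y, x_min, y_min, width):
    """Column/row index of the square block that contains the point (x, y)."""
    return (int((x - x_min) // width), int((y - y_min) // width))


def assign_folds(block_ids, n_folds=5, seed=0):
    """Randomly assign each distinct block to one of n_folds folds (numbered from 1)."""
    rng = random.Random(seed)
    blocks = sorted(set(block_ids))
    rng.shuffle(blocks)
    return {block: i % n_folds + 1 for i, block in enumerate(blocks)}


# Hypothetical extended hull of 5,000 squared degrees, tiled by a 10 x 5 grid of blocks.
width = block_width(5000)
grid = [(col, row) for col in range(10) for row in range(5)]
folds = assign_folds(grid)
```

Because every point in a block shares that block's fold, test points in fold 5 are spatially separated from most training points, which is what makes the resulting error estimate a check on extrapolation rather than interpolation.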

Modelling

To estimate the triatomine species distributions we fit a logistic regression to the target-group background (TGB) data using a generalised additive model (GAM). This modelling framework is comprised of two components: observations (Eq 1) and a linear predictor (Eq 2). We consider a Bernoulli process to model the background/presence ($y_{k,t,i} \in \{0, 1\}$) of each species k in year t ∈ {2000, …, 2016} at location $\mathbf{s}_i$. Within this framework, we specify the Bernoulli model

$$y_{k,t,i} \sim \text{Bernoulli}(\pi_{k,t,i}), \tag{1}$$

where i = 1, …, $n_k$ are the observations per species. For each species k = 1, …, K, we set a spatial domain delimited by the extended hull formed by the spatial locations of the corresponding species (see Fig 1 for further detail). The relative probability of occurrence $\pi_{k,t,i}$ is estimated by a logistic GAM with linear predictor

$$\operatorname{logit}(\pi_{k,t,i}) = \sum_{p=1}^{P_k} f_{k,p}(x_{p,t,i}) + GP_{k,i}(\varrho), \tag{2}$$

where $f_{k,p}(x_{p,t,i})$ is the species specific, potentially non-linear, effect of the p-th covariate estimated by a penalised thin-plate spline [61] and $GP_{k,i}(\varrho)$ is a two-dimensional, species-specific Gaussian process (GP) with range parameter $\varrho$ evaluated at location $\mathbf{s}_i$ (the smoothness parameter was set to 1.5). The number of covariates $P_k$ can vary by species as some of them might not have enough unique values within the spatial extent of the species to be relevant for analysis. Here, covariates were only included if the number of unique values was at least twenty. The correlation function of the GP was defined by $C(x, x') = \rho(\lVert x - x' \rVert)$, where $\rho(d) = (1 + d/\varrho)\exp(-d/\varrho)$ is the simplified Matérn correlation function with range parameter $\varrho = \max_{i,j} \lVert x_i - x_j \rVert$ as suggested in [62] and implemented in [63].
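The simplified Matérn correlation function named above (smoothness 1.5) can be sketched in a few lines. The code assumes the form (1 + d/r) exp(-d/r) with the range r set to the largest pairwise distance between locations; the function names and the three coordinates are hypothetical, and the sketch is independent of the mgcv implementation the authors used.

```python
import math
from itertools import combinations


def max_pairwise_distance(points):
    """Largest Euclidean distance between any two locations (used as the range)."""
    return max(math.dist(p, q) for p, q in combinations(points, 2))


def matern_15(distance, range_param):
    """Simplified Matern correlation with smoothness 1.5: (1 + d/r) * exp(-d/r)."""
    ratio = distance / range_param
    return (1.0 + ratio) * math.exp(-ratio)


# Hypothetical locations in degrees.
locations = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
range_param = max_pairwise_distance(locations)
corr_at_zero = matern_15(0.0, range_param)
corr_at_range = matern_15(range_param, range_param)
```

The correlation is 1 at distance zero and decays smoothly; at a distance equal to the range it has dropped to 2/e, about 0.74, so with this default the spatial term varies slowly across the study area.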

The model was estimated by optimising the penalised restricted maximum likelihood (REML) criterion (Eq 3) using a double shrinkage approach, where ψ is a vector of all coefficients associated with the smooth functions f and GP, D(ψ) is the model deviance and ϕ(⋅) and ϕ*(⋅) are range space and null space penalties of the model coefficients ψ [61, 64]. The first penalty (range space) controls the smoothness of functions $f_{k,p}$ and $GP_k$, while the second penalty (null space) enables the removal of individual terms from the model entirely. The γ parameter can be used to globally increase the penalty and thus to obtain smoother, sparser and therefore potentially more robust models. Practical estimation was performed using techniques introduced in [65–67] to increase computational speed and reduce memory requirements.

Six model specifications (Table 1) were considered for this analysis, varying by the definition of covariate effects in Eq 2 and whether the global GP term $GP_{k,i}(\cdot)$ was included. For each species, the final model (out of the six candidate models in Table 1) was selected based on its performance (AUC) on the test data (fold 5). Model 1 has no tuning parameters and was fit directly to the complete training data (folds 1–4). Models 2 through 6 were first tuned with respect to the global penalty γ ∈ {1, …, 4} based on 4-fold cross-validation on folds 1 through 4. Based on the value of γ that yielded the highest average AUC, the models were refit on the complete training data (blocks 1–4). These models were used to calculate the out-of-sample extrapolation and interpolation error (see “Results” section for details). After model evaluation the best model for each species was refit on all data (blocks 1 through 5 and the random hold-out data) to create final predictions.
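The two-stage selection described above (tune γ by cross-validation, then pick a model by test AUC) can be sketched without fitting any models. The sketch takes already-computed AUC values as input; the function names, the tie-breaking rule (smallest γ, first model name in sorted order) and all numbers are hypothetical.

```python
from statistics import mean


def tune_gamma(cv_auc):
    """Pick the gamma with the highest mean cross-validated AUC.

    cv_auc maps each candidate gamma to a list of per-fold AUC values.
    Assumption: ties go to the smallest gamma.
    """
    return max(sorted(cv_auc), key=lambda gamma: mean(cv_auc[gamma]))


def select_model(test_auc):
    """Pick the candidate model with the highest AUC on the spatial test fold."""
    return max(sorted(test_auc), key=lambda name: test_auc[name])


# Hypothetical 4-fold cross-validation results for one candidate model.
cv_auc = {
    1: [0.80, 0.82, 0.78, 0.84],
    2: [0.83, 0.85, 0.80, 0.86],
    3: [0.81, 0.80, 0.79, 0.82],
    4: [0.75, 0.77, 0.74, 0.78],
}
best_gamma = tune_gamma(cv_auc)

# Hypothetical fold-5 AUC values for three tuned candidate models.
test_auc = {"model 1": 0.79, "model 2": 0.84, "model 3": 0.82}
best_model = select_model(test_auc)
```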

Table 1. Model specifications considered in the analysis.

https://doi.org/10.1371/journal.pntd.0008411.t001

Implementation

All calculations were performed using the R language environment [68]. Thematic maps were created using package tmap [41]. Data munging and pre-processing was performed using packages dplyr [69] and tidyr [70]. Spatial cross-validation was set up using package blockCV [71]. Package mgcv was used to fit the GAMs [63].

Vectorial capacity of the mapped species

For each triatomine species that was mapped, information related to its importance in transmitting T. cruzi to humans was collated. The prevalence of infection with the T. cruzi parasite was calculated using the data from an existing repository [33]. Collections of fewer than twenty individuals of a species were excluded and the mean prevalence was calculated for all species where the remaining number of collections exceeded ten. Relevant behavioural data for each vector species was extracted from the published literature.
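The filtering rules for the prevalence calculation can be sketched as below. The sketch assumes that the mean is an unweighted average of per-collection prevalences, which the text does not specify; the function name and the example collections are hypothetical.

```python
def mean_prevalence(collections, min_individuals=20, min_collections=10):
    """Mean infection prevalence across collections for one species.

    collections is a list of (n_tested, n_infected) pairs. Collections with
    fewer than min_individuals tested are dropped; None is returned unless
    more than min_collections remain.
    Assumption: unweighted mean of per-collection prevalences.
    """
    kept = [(n, k) for n, k in collections if n >= min_individuals]
    if len(kept) <= min_collections:
        return None
    return sum(k / n for n, k in kept) / len(kept)


# Hypothetical species: eleven usable collections plus one that is too small.
usable = [(20, 5)] * 11
too_small = [(19, 19)]
prevalence = mean_prevalence(usable + too_small)
insufficient = mean_prevalence([(20, 5)] * 10)
```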

Results

Species distributions

A total of 30 species were mapped. Triatoma infestans is predicted to occur from northern Argentina and Chile to southern Bolivia and Peru, overlapping in part with the predictions for Triatoma guasayana, although T. guasayana is not predicted in Peru. Panstrongylus lutzi, Psammolestes tertius, Rhodnius nasutus, Rhodnius neglectus, Triatoma brasiliensis and Triatoma pseudomaculata are all predicted to primarily occur within Brazil. Triatoma sordida is predicted to occur in Brazil, Paraguay and Bolivia while Triatoma rubrovaria is mainly predicted to occur in Uruguay. Eratyrus mucronatus, Panstrongylus geniculatus, Panstrongylus rufotuberculatus, Rhodnius pictipes and Rhodnius robustus are predicted to overlap to differing degrees across a broad area that encompasses northern Bolivia and Peru, northwestern Brazil, Ecuador, Colombia, Venezuela, Guyana, Suriname and French Guiana. Triatoma maculata predictions are restricted to the northern part of this area, while Rhodnius prolixus is predicted even further north in Colombia and Venezuela, and Panstrongylus chinai is only predicted at the far west of this area within Peru and Ecuador. The predicted distributions of Panstrongylus geniculatus, Panstrongylus rufotuberculatus, Rhodnius pallescens and Triatoma dimidiata all extend from northern South America to Central America. Triatoma barberi, Triatoma longipennis, Triatoma mazzottii, Triatoma mexicana and Triatoma pallidipennis were all predicted to occur in Mexico only. Triatoma gerstaeckeri, Triatoma protracta and Triatoma rubida were predicted to occur from northern Mexico to the southern United States of America and Triatoma sanguisuga was predicted to occur exclusively in the southern United States of America.

A summary for all species that were modelled is provided in Table 2, including the specification of the model selected on training data as well as the AUC of this model evaluated on test data (fold 5) and the AUC obtained on the 20% randomly selected hold-out data (denoted by AUC*). The former is an indicator of the model’s transferability and ability to predict into new areas with potentially unseen covariate values or combinations, within the area that was modelled (cf. Fig 1). This is important because this is precisely the goal of the TGB approach and other modelling strategies that account for preferential sampling. The latter value indicates how well the model interpolates.

Table 2. Summary table for all species considered in the analysis ordered by number of presence observations.

https://doi.org/10.1371/journal.pntd.0008411.t002

The AUC values were all well above the 0.5 random classification threshold (mean: 0.85, SD: 0.12), indicating the maps' usefulness for identifying areas of higher probability of presence relative to background points. The comparatively low AUC values for species T. brasiliensis and T. pseudomaculata could be partially due to an overlap with many other species, thus making it difficult to discriminate between presence and background. The AUC* values were on average higher and had a lower variance (mean: 0.88, SD: 0.08) but were generally consistent with the AUC values obtained on the spatial hold-out test data (fold 5). The predicted values (including 95% CI) in raster format for all species listed in Table 2 are given in [72] (.gri file format). Respective visualisations, i.e. map images, are available from [73].
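For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen presence point receives a higher predicted value than a randomly chosen background point, with ties counted as one half. The sketch below computes it by direct pairwise comparison; it ignores the observation weights used in the models, and the function name and scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """AUC as the share of presence/background pairs ranked correctly.

    Ties between a presence score and a background score count as half a win.
    """
    if not presence_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical predicted values for two presence and two background points.
example_auc = auc([0.9, 0.4], [0.5, 0.1])
```

In this example three of the four pairs are ranked correctly, giving an AUC of 0.75; a model that cannot separate presence from background at all would score about 0.5.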

Predicted distributions of four example species.

In this section, we present the results for four diverse species that provide examples from different genera including dominant and secondary vectors with domestic, peri-domestic and sylvatic habits. In addition, we provide detailed discussion of the results for one of the most important of these vectors, T. infestans, in the following section. Together, the distributions of these species cover most of the endemic region from northern Argentina and Chile to the south of the United States. Predictions are presented alongside bivariate maps that display prediction and (un)certainty in one map. To do so, predictions and uncertainty (defined by the width of confidence intervals) are divided into intervals (here [0, .25), [.25, .5), [.5, .75), [.75, 1] and [0, .075), [.075, .15), [.15, .3), [.3, 1], respectively). The cut-off for uncertainty was chosen because a CI width of ≥ .3 means that the upper and lower confidence limits fall into different categories of the probability intervals. The legend in the bivariate maps indicates which colours correspond to which combination of prediction and uncertainty.
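The classification behind the bivariate maps can be sketched as a pair of interval lookups. The sketch reads the last uncertainty interval as [.3, 1], treats all intervals as closed on the left, and numbers the classes from 0 (lowest) to 3 (highest); the function name, the class numbering and the example values are hypothetical.

```python
from bisect import bisect_right

# Inner break points of the prediction and uncertainty (CI width) intervals.
PREDICTION_BREAKS = [0.25, 0.5, 0.75]
UNCERTAINTY_BREAKS = [0.075, 0.15, 0.3]


def bivariate_class(prediction, ci_lower, ci_upper):
    """Return (prediction class, uncertainty class), each from 0 to 3.

    Uncertainty is measured as the width of the confidence interval.
    """
    width = ci_upper - ci_lower
    prediction_class = bisect_right(PREDICTION_BREAKS, prediction)
    uncertainty_class = bisect_right(UNCERTAINTY_BREAKS, width)
    return prediction_class, uncertainty_class


# Hypothetical cell: high predicted value with a wide confidence interval.
cell_class = bivariate_class(0.9, 0.5, 1.0)
```

Each grid cell then receives the colour that the map legend assigns to its pair of classes.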

Triatoma gerstaeckeri colonises homes and kennels and is known to bite humans (S2 File). The predicted distribution of this species from the southern USA through much of Mexico is shown in Fig 2a. Uncertainty in these predictions is high at the northern boundaries of this species where sampling is particularly sparse (Fig 2b and [60]). Triatoma dimidiata colonises homes, as well as sylvatic and peri-domestic habitats (S2 File), and is a dominant vector of T. cruzi. Its distribution predicted using data for the time period from 2000 onwards ranges from southern Mexico through Central America into northern Venezuela and Colombia (Fig 2c and 2d). Predictions are high with low uncertainty in Central America whereas predictions are lower with higher uncertainty in northern Venezuela and Colombia. It is important to note that the block width for this species is smaller than an estimation of the range of spatial auto-correlation, thus the AUC values may be optimistic. Rhodnius pictipes colonises palm trees and has been implicated in the contamination of products for human consumption (S2 File). The predicted distribution of R. pictipes in northern Brazil and Bolivia, French Guiana, Suriname, Guyana, Venezuela, Colombia, Ecuador and Peru is shown in Fig 2e. It is important to note the areas of high uncertainty within the region of higher predicted probability of presence for this sylvatic species (Fig 2f). The confidence intervals for this species are wider than those seen for Panstrongylus geniculatus in the same region, which reflects the lower volume of data available for R. pictipes. Panstrongylus geniculatus colonises trees and rodent nests, has been found in urban areas, and is known to bite humans (S2 File). Its predicted distribution, which overlaps with that of R. pictipes in South America and extends further north as far as Honduras in Central America, is shown in Fig 2g. The uncertainty in these predictions is typically low but is higher at the fringes of the area where the predicted relative probability of presence is high (Fig 2h).

Fig 2. Predicted relative probability of occurrence.

Predictions for 4 selected species at a resolution of 5 × 5 km within the respective extended hull of species occurrence (left panel) and bivariate map (right panel), where darker colours indicate higher predicted probabilities while the transition from white/pink to turquoise/blue indicates increased uncertainty. Row 1 (A, B): Triatoma gerstaeckeri; row 2 (C, D): Triatoma dimidiata; row 3 (E, F): Rhodnius pictipes; row 4 (G, H): Panstrongylus geniculatus. Figure created by the authors using R package tmap [41].

https://doi.org/10.1371/journal.pntd.0008411.g002

Distribution of Triatoma infestans.

Arguably, the most important T. cruzi vector species is T. infestans, making it a key target for indoor residual spraying campaigns that have the potential to alter the distribution of this predominantly domestic species. Intervention coverage data was not available to our models, so it is particularly important to consider the uncertainty in the predictions for T. infestans. Fig 3 shows the predicted probabilities for this vector species using data from the year 2000 onwards (left panel) alongside a bivariate map that highlights the (un)certainty of the estimation (right panel). The model predicts areas of high relative probabilities of presence with higher certainty in southern Bolivia, northern Chile and northwestern Argentina, which aligns well with the observed presence points. The uncertainty is usually high in unsampled areas, e.g., along the border of Argentina and Chile, and eastern parts of Paraguay. Low probabilities of occurrence are predicted in Brazil and northeastern Paraguay, with varying levels of certainty.

Fig 3.

Left panel: Final predicted map for T. infestans. Right panel: A bivariate map of the predictions that indicates areas of high vs. low probabilities together with the model uncertainty. Darker colours indicate higher predicted probability. Transitions from white/pink to turquoise/blue indicate higher uncertainty.

https://doi.org/10.1371/journal.pntd.0008411.g003

Ranking importance of the environmental variables

Caution is needed when drawing conclusions from information on which environmental variables were selected by each model because i) many of these variables are highly correlated, for example temperature, rainfall, surface wetness and elevation, and ii) some important variables may not have been made available to the models as discussed above. It is, however, interesting to note the three variables selected as most important by each species model and these rankings are given in Table D, S3 File. Forest cover was one of the three most important variables for three species and Eratyrus mucronatus, Panstrongylus rufotuberculatus and Rhodnius pictipes are all known to colonise trees (S2 File). Eight other species models selected other vegetation cover variables as important and variables that define the urban-rural gradient (urbanicity, accessibility and human population) were selected as being among the most important by seven species models. Looking across all species models (Table E and Figure A, S3 File), the most important variables for predicting T. cruzi vector species distributions were temperature, the Gaussian process (the spatial component), evergreen broadleaf forest cover, the vegetation index, rainfall and elevation, followed by variables defining the urban-rural gradient and other types of vegetation cover. Nine land cover classes were never selected by any model. Unsurprisingly, these were water, needleleaf forest cover, built-up areas (information that was provided to the model by other variables that were selected), snow and ice, barren areas and unclassified land (which is a rare occurrence in the land cover data). For most species, single covariates rarely contribute more than 50%, meaning that each prediction is composed of smaller contributions from many variables (Figure A, S3 File).

Vectorial capacity of the mapped species

The current state of knowledge on factors related to the capacity of each of the mapped species to transmit T. cruzi to humans (vectorial capacity) is summarised in Table 3. Specifically, mean infection prevalence, confirmation of human blood meals in natural vector populations, and confirmation of colonisation or invasion of homes (including in urban areas) are listed. Less information is available on the feeding-defecation interval or defecation location for each species, and these values may be influenced by the different experimental conditions used, so these variables are not included in Table 3 but sources of evidence are listed in the supplement (Table B, S2 File). The 30 most commonly reported species mapped here encompass the five most important dominant vectors that frequently colonise homes (P. megistus, R. prolixus, T. brasiliensis, T. dimidiata and T. infestans) as well as species that often colonise peridomestic habitats such as chicken coops, rat nests, boundary walls, wood piles, palm trees, and livestock housing. These species encompass a range of mean T. cruzi infection prevalences from 0.8% in T. sordida to 55.6% in T. longipennis, although for 13 of the most commonly reported species there were insufficient data to generate a reliable mean infection prevalence value. In addition to differences among the mean values for species, there is also considerable variation within each species, likely resulting from heterogeneities in factors such as intervention deployment and local host species. When viewing the summaries in Table 3, it is important to note that not all regions or species have been sampled or tested equally and a lack of published evidence for a specific component of vectorial capacity cannot be taken as definitive evidence of its absence. For example, no infections have been reported in Eratyrus mucronatus or Psammolestes tertius, but only 28 and 143 individuals have been tested, respectively, compared to 335,467 T. sordida individuals.
In addition to the five important dominant vectors, there is evidence that many of the species mapped in this work are potential vectors of T. cruzi. Almost all of the 28 species that have been found to be infected with T. cruzi are known to invade homes, and at least 17 have been found to have fed on humans (Table 3). The sources of evidence—130 published articles in total—are given in the supplement (Tables B and C, S2 File).
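
The sampling caveat above (no infections among 28 and 143 tested individuals) can be made concrete with a short calculation. The sketch below computes a one-sided exact binomial (Clopper-Pearson) upper bound on prevalence when zero positives are observed, assuming that tested individuals are independent and share a common infection probability. This is an illustrative simplification and not an analysis performed in this study; the function name `upper_bound_zero_positives` is hypothetical.

```python
def upper_bound_zero_positives(n_tested, confidence=0.95):
    """One-sided exact upper bound on prevalence when 0 of n_tested are positive.

    With zero positives, the bound p solves (1 - p) ** n_tested = 1 - confidence.
    Assumes independent individuals with a common infection probability.
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be a positive integer")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_tested)


for n in (28, 143):
    bound = upper_bound_zero_positives(n)
    print(f"0 positives out of {n}: prevalence could still be up to {bound:.1%}")
```

Under these assumptions, zero infections among 28 tested individuals is compatible with a true prevalence of up to roughly 10%, and zero among 143 with a prevalence of up to roughly 2%. Both bounds exceed the lowest mean prevalence reported above (0.8% in T. sordida), which is why an absence of reported infections in sparsely tested species cannot be read as evidence that the species is not infected.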

Table 3. Infection prevalence and behaviour of selected species.

https://doi.org/10.1371/journal.pntd.0008411.t003

Discussion

This study models the contemporary geospatial distributions of the thirty most commonly reported triatomine species and putative vectors of the T. cruzi parasite to humans. Our approach allows the distributions of these different species to be compared, and to be overlaid, which increases our understanding of the community of vector species at different locations in the current intervention era.

Our aim was to consider the most commonly reported species in the Chagas endemic zone. To provide policy makers, stakeholders and researchers with relevant information, we included all species for which distribution maps could be reasonably estimated. However, as can be seen from Table 2, the training (and test) data for many species contained fewer than 300 or even fewer than 150 presence observations reported since the year 2000. AUC values will tend to be less robust and potentially over- or under-optimistic as sample size decreases. In these instances it is particularly important to take into account the uncertainty of the estimates as presented here. There are also locally important species for which maps could not be produced because they are only found within areas where the relevant surveillance records are not publicly available or because their range is limited so only small numbers of observations exist. For example, Rhodnius ecuadoriensis is an important vector in Ecuador [74] but the databases used in this study only provided 11 and 23 records, respectively, for known collection dates after the year 2000.
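
The dependence of AUC stability on sample size can be illustrated with a small simulation. The sketch below computes AUC as the probability that a randomly chosen presence score exceeds a randomly chosen background score, then compares the spread of AUC values across repeated small and large samples. The Gaussian score distributions, the sample sizes and the helper names `auc` and `auc_spread` are assumptions made for illustration only; this is not the spatially blocked validation procedure used in this study.

```python
import random
import statistics


def auc(presence_scores, background_scores):
    """Probability that a presence score exceeds a background score (ties count half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def auc_spread(n_presence, n_background, n_repeats=100, seed=1):
    """Standard deviation of AUC across repeated simulated test sets.

    Scores are drawn from two unit-variance Gaussians whose means differ by 1,
    an arbitrary choice that gives a moderately discriminating model.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_repeats):
        presence = [rng.gauss(1.0, 1.0) for _ in range(n_presence)]
        background = [rng.gauss(0.0, 1.0) for _ in range(n_background)]
        values.append(auc(presence, background))
    return statistics.stdev(values)


print("spread with 15 presences: ", round(auc_spread(15, 15), 3))
print("spread with 150 presences:", round(auc_spread(150, 150), 3))
```

In this simplified setting the same underlying model yields a much wider range of AUC values when evaluated on small test sets, so a single AUC from a sparsely observed species may look better or worse than the model's true discriminative ability.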

Earlier studies [19, 20, 22, 23, 25, 27, 28, 30, 31], most notably [24, 26], have modelled the distributions of some of these species but previous work has often focused on specific regions, states or countries. Additionally, comparisons with the previous work are limited because of differences in methodology, datasets and spatial extent under consideration. Only visual comparison is possible in most cases because the predicted values generated by previous studies are not openly available, precluding quantitative assessment of the different versions. Known T. dimidiata presence locations are predicted by the model presented here, but with higher probabilities for the locations in Central America compared to Colombia and Ecuador. This may imply differentiation between these populations: for example, the subspecies T. dimidiata capitata is only found in Colombia, whereas the subspecies T. dimidiata dimidiata is common to Central America and Ecuador [75]. Within Central America, our predictions align well with a T. dimidiata map published in 2010, predicting this species throughout Central America extending up both the east and west coasts of Mexico [31]. This earlier map also provides a main and maximum distribution for T. infestans. The main distribution from the 2010 map (from La Rioja and northern Cordoba in Argentina up to Santa Cruz and southern Beni in Bolivia, as well as an area around Moquegua in Peru) falls within our area of highest predictions for occurrence with moderate certainty (Fig 3), but an area of higher certainty can be seen running directly west of the previously published main distribution in Argentina and Bolivia, joining the main distribution predicted in Peru in the 2010 work. Comparisons with a T. infestans map published in 2002 [30] demonstrate the dramatic changes in Brazil over the last decades.
The area of lowest probability of occurrence with highest certainty in our current map aligns with the area of absence in the 2002 map, whereas the areas in southwestern Brazil with low probability of occurrence but higher uncertainty in our current map align with areas of species presence in the 2002 map from Paraiba down to Mato Grosso do Sul and Rio Grande do Sul. These areas, where this species is predicted to be no longer present by both the 2010 study [31] and our current work, match PAHO reports of the Southern Cone Initiative (or INCOSUR). Hernandez et al. [22] modelled the joint distribution of T. infestans and Mepraia spinolai in the Coquimbo, Valparaíso and Metropolitana Regions of Chile, which are all regions where our model predicted high relative probability of occurrence of this species. Ceccarelli et al. [25] generated climatic suitability maps for infected T. infestans triatomines using two climatic datasets, the Advanced Very High Resolution Radiometer onboard the National Oceanic and Atmospheric Administration meteorological satellite series (AVHRR) and the WorldClim dataset. Their AVHRR results are closest to our distribution maps but the studies cannot be compared directly due to the different outcomes modelled, i.e. vector occurrence and infected vector occurrence.

In general, our predicted maps show good agreement with respect to regions that highlight higher vs. lower probabilities of occurrence when compared to earlier studies. Curtis-Robles et al. [19] recently investigated the spatial distribution of, among others, T. gerstaeckeri, T. sanguisuga and T. rubida within the state of Texas in the USA and their areas of species occurrence match our areas of high relative probability of occurrence with high certainty for these three species within this region. Garza et al. [23] also mapped the distribution of T. gerstaeckeri and predicted that the current distribution was largely restricted to Texas in the USA whereas our predictions show high relative probability of occurrence with higher certainty in both Texas and the neighbouring Mexican states of Coahuila, Nuevo Leon and Tamaulipas, in agreement with the predictions made in 2015 by the Mexican Atlas of Triatomines [24]. The most comprehensive collection of species distribution maps is provided by the Mexican Atlas of Triatomines [24], which generated predictive maps for 19 species, of which T. rubida, T. gerstaeckeri, T. longipennis, T. mexicana, T. barberi, T. pallidipennis, T. mazzottii and T. protracta were also modelled in our study. A visual comparison shows a reasonable alignment between the predictions made by the Mexican Atlas of Triatomines and our results for these eight species within Mexico. It is also interesting to note that our results for T. dimidiata, within Mexico, most closely align to the Mexican Atlas of Triatomines' results for haplogroup 2 of this species [24].

Arboleda et al. [28] produced a predictive map of the geographical distribution of R. pallescens across Central and South America. Our two studies show broad agreement in Central America but the earlier study predicts high environmental suitability for this species in areas much further south than the region where we predict high probability of occurrence. This result demonstrates a key difference in the two methods used because Arboleda et al. quantified associations with environmental variables only whereas we also incorporated a spatial component (the Gaussian process) in our model. Consequently, the earlier study identified locations that were predicted to be suitable much further south than any known reports of this species. Parra-Henao et al. [20] also used ecological niche modelling, which they applied to P. geniculatus, R. pallescens, R. prolixus and T. maculata in the Caribbean, Pacific, Eastern Plains, Andean and Amazon regions of Colombia. The closest alignment between the results of their study and ours can be seen for T. maculata. Both studies predict occurrence of this species in non-contiguous areas of northern Colombia; one from the Guajira Peninsula heading southwest, and another on the eastern slope of the Eastern Cordillera towards the Orinoquía region.

Gurgel-Gonçalves et al. [26] modelled the ecological niches of 16 triatomine species in Brazil, of which 11 were also modelled in our study. A visual comparison shows good broad agreement between the 2012 study and our results within Brazil for ten of these species (P. megistus, P. lutzi, R. nasutus, R. neglectus, R. pictipes, R. robustus, T. brasiliensis, T. pseudomaculata, T. rubrovaria, T. sordida). However, our results show a lower probability of occurrence for P. geniculatus in southeast Brazil whereas the 2012 study predicts presence across this region. Carbajal de la Fuente et al. [27] also modelled the potential geographic distribution of T. pseudomaculata in 2008 and again our results show good broad agreement.

In conclusion, the maps generated by this study provide a robust summary of the contemporary distributions of the most commonly reported vector species across the Chagas endemic zone. It is important that these maps are viewed within the context of the behaviour and vectorial capacity of each of these species. Summaries of the literature published to date are provided here and the earlier studies show that most of these triatomine species are potentially important vectors of T. cruzi to humans. Each of the indicators of vectorial capacity summarised at a species level here may vary within the range of the species, as well as between species [76, 77]. It is therefore important to map spatial variation in these characteristics, as well as in the species themselves, in order to identify where regions of high vectorial transmission risk are likely to exist.

Supporting information

S1 File. Table of environmental variables.

This file contains Table A, which describes each covariate that went into the models, including the time period for which data were available.

https://doi.org/10.1371/journal.pntd.0008411.s001

(DOCX)

S2 File. Sources of evidence for variables linked to vectorial capacity.

Tables B and C provide the full information summarised in Table 3 together with citations for the sources of evidence used.

https://doi.org/10.1371/journal.pntd.0008411.s002

(DOCX)

S3 File. Importance of each covariate to the species models.

Table D provides the three most important covariate contributions for each species model. Table E ranks the environmental covariates by their relative contributions across all 30 species models. Figure A shows the relative contribution of each covariate to each species model.

https://doi.org/10.1371/journal.pntd.0008411.s003

(DOCX)

References

  1. London Declaration on Neglected Tropical Diseases. 5 Feb 2019; 2012.
  2. Pan American Health Organization. Strategy and plan of action for Chagas disease prevention, control and care; 2010.
  3. World Health Organization. Chagas disease in Latin America: an epidemiological update based on 2010 estimates. Weekly Epidemiological Record. 2015;90:33–44. pmid:25671846
  4. Montenegro D, da Cunha AP, Ladeia-Andrade S, Vera M, Pedroso M, Junqueira A. Multi-criteria decision analysis and spatial statistic: an approach to determining human vulnerability to vector transmission of Trypanosoma cruzi. Memorias Do Instituto Oswaldo Cruz. 2017;112(10):709–718. pmid:28953999
  5. Guhl F. Geographical distribution of Chagas Disease. In: Telleria J, Tibayrenc M, editors. American Trypanosomiasis Chagas Disease: One Hundred Years of Research; 2017. p. 89–106.
  6. Buitrago R, Bosseno MF, Depickere S, Waleckx E, Salas R, Aliaga C, et al. Blood meal sources of wild and domestic Triatoma infestans (Hemiptera: Reduviidae) in Bolivia: connectivity between cycles of transmission of Trypanosoma cruzi. Parasites & Vectors. 2016;9:214.
  7. Hernandez C, Salazar C, Brochero H, Teheran A, Stella Buitrago L, Vera M, et al. Untangling the transmission dynamics of primary and secondary vectors of Trypanosoma cruzi in Colombia: parasite infection, feeding sources and discrete typing units. Parasites & Vectors. 2016;9:620.
  8. Cantillo-Barraza O, Garces E, Gomez-Palacio A, Cortes LA, Pereira A, Marcet PL, et al. Eco-epidemiological study of an endemic Chagas disease region in northern Colombia reveals the importance of Triatoma maculata (Hemiptera: Reduviidae), dogs and Didelphis marsupialis in Trypanosoma cruzi maintenance. Parasites & Vectors. 2015;8:482.
  9. Indacochea A, Gard CC, Hansen IA, Pierce J, Romero A. Short-Range Responses of the Kissing Bug Triatoma rubida (Hemiptera: Reduviidae) to Carbon Dioxide, Moisture, and Artificial Light. Insects. 2017;8(3). pmid:28850059
  10. Weinberg D, Porcasi X, Lanfri S, Abril M, Scavuzzo CM. Spatial analyzes of triatomine infestation indices and their association to the actions of a Chagas disease program and environmental variables during a 5-year intervention period. Acta Tropica. 2018;188:41–49. pmid:30142310
  11. Dantas ES, Gurgel-Goncalves R, Maciel Villela DA, Monteiro FA, Maciel-de Freitas R. Should I stay or should I go? Movement of adult Triatoma sordida within the peridomestic area of a typical Brazilian Cerrado rural household. Parasites & Vectors. 2018;11.
  12. Flores A, Vitek C, Feria-Arroyo TP, Fredensborg BL. Temporal Variation in the Abundance and Timing of daily Activity of Chagas Disease Vector Triatoma gerstaeckeri (Stal, 1859) in a natural Habitat in the lower Rio Grande Valley, South Texas. Journal of Parasitology. 2017;103(5):574–578.
  13. Di Iorio O, Gurtler RE. Seasonality and temperature-dependent Flight Dispersal of Triatoma infestans (Hemiptera: Reduviidae) and Other Vectors of Chagas Disease in Western Argentina. Journal of Medical Entomology. 2017;54(5):1285–1292. pmid:28605522
  14. Brito RN, Gorla DE, Diotaiuti L, Gomes ACF, Souza RCM, Abad-Franch F. Drivers of house invasion by sylvatic Chagas disease vectors in the Amazon-Cerrado transition: A multi-year, state-wide assessment of municipality-aggregated surveillance data. PLoS Neglected Tropical Diseases. 2017;11(11).
  15. Falvo ML, Figueiras ANL, Manrique G. Spatio-temporal analysis of the role of faecal depositions in aggregation behaviour of the triatomine Rhodnius prolixus. Physiological Entomology. 2016;41(1):24–30.
  16. Dias JVL, Queiroz DRM, Martins HR, Gorla DE, Pires HHR, Diotaiuti L. Spatial distribution of triatomines in domiciles of an urban area of the Brazilian Southeast Region. Memorias Do Instituto Oswaldo Cruz. 2016;111(1):43–50. pmid:26814643
  17. Jacome-Pinilla D, Hincapie-Penaloza E, Ortiz MI, David Ramirez J, Guhl F, Molina J. Risks associated with dispersive nocturnal flights of sylvatic Triatominae to artificial lights in a model house in the northeastern plains of Colombia. Parasites & Vectors. 2015;8.
  18. Castillo-Neyra R, Barbu CM, Salazar R, Borrini K, Naquira C, Levy MZ. Host-Seeking Behavior and Dispersal of Triatoma infestans, a Vector of Chagas Disease, under Semi-field Conditions. PLoS Neglected Tropical Diseases. 2015;9(1). pmid:25569228
  19. Curtis-Robles R, Hamer SA, Lane S, Levy MZ, Hamer GL. Bionomics and Spatial Distribution of Triatomine Vectors of Trypanosoma cruzi in Texas and Other Southern States, USA. American Journal of Tropical Medicine and Hygiene. 2018;98(1):113–121. pmid:29141765
  20. Parra-Henao G, Suarez-Escudero LC, Gonzalez-Caro S. Potential Distribution of Chagas Disease Vectors (Hemiptera, Reduviidae, Triatominae) in Colombia, Based on Ecological Niche Modeling. Journal of Tropical Medicine. 2016. pmid:28115946
  21. Ceccarelli S, Rabinovich JE. Global Climate Change Effects on Venezuela’s Vulnerability to Chagas Disease is Linked to the Geographic Distribution of Five Triatomine Species. Journal of Medical Entomology. 2015;52(6):1333–1343.
  22. Hernandez J, Nunez I, Bacigalupo A, Cattan PE. Modeling the spatial distribution of Chagas disease vectors using environmental variables and people’s knowledge. International Journal of Health Geographics. 2013;12.
  23. Garza M, Arroyo TPF, Casillas EA, Sanchez-Cordero V, Rivaldi CL, Sarkar S. Projected Future Distributions of Vectors of Trypanosoma cruzi in North America under Climate Change Scenarios. PLoS Neglected Tropical Diseases. 2014;8(5). pmid:24831117
  24. Ramsey JM, Peterson AT, Carmona-Castro O, Moo-Llanes DA, Nakazawa Y, Butrick M, et al. Atlas of Mexican Triatominae (Reduviidae: Hemiptera) and vector transmission of Chagas disease. Memorias Do Instituto Oswaldo Cruz. 2015;110(3):339–352. pmid:25993505
  25. Ceccarelli S, Balsalobre A, Susevich ML, Echeverria MG, Gorla DE, Marti GA. Modelling the potential geographic distribution of triatomines infected by Triatoma virus in the southern cone of South America. Parasites & Vectors. 2015;8:153.
  26. Gurgel-Gonçalves R, Galvão C, Costa J, Peterson AT. Geographic Distribution of Chagas Disease Vectors in Brazil Based on Ecological Niche Modeling. Journal of Tropical Medicine. 2012. pmid:22523500
  27. Carbajal de la Fuente A, Porcasi X, Noireau F, Diotaiuti L, Gorla D. The association between the geographic distribution of Triatoma pseudomaculata and Triatoma wygodzinskyi (Hemiptera: Reduviidae) with environmental variables recorded by remote sensors. Infection, genetics and evolution: journal of molecular epidemiology and evolutionary genetics in infectious diseases. 2008;9:54–61. pmid:18992369
  28. Arboleda S, Gorla D, Porcasi X, Saldana A, Calzada J, Jaramillo ON. Development of a geographical distribution model of Rhodnius pallescens Barber, 1932 using environmental data recorded by remote sensing. Infection, genetics and evolution: journal of molecular epidemiology and evolutionary genetics in infectious diseases. 2009;9:441–8. pmid:19138764
  29. de Souza RDM, Diotaiuti L, Lorenzo MG, Gorla DE. Analysis of the geographical distribution of Triatoma vitticeps (Stal, 1859) based on data of species occurrence in Minas Gerais, Brazil. Infection Genetics and Evolution. 2010;10(6):720–726.
  30. Gorla D. Variables ambientales registradas por sensores remotos como indicadores de la distribución geográfica de Triatoma infestans (Heteroptera: Reduviidae). Ecologia Austral. 2002;12:117–127.
  31. Who, how, what and where? Nature. 2010;465(S7301):S8–S9. pmid:20571555
  32. Ceccarelli S, Balsalobre A, Medone P, Cano ME, Gurgel Gonçalves R, Feliciangeli D, et al. DataTri, a database of American triatomine species occurrence. Scientific Data. 2018;5:180071. pmid:29688221
  33. Browne AJ, Guerra CA, Alves RV, Costa VMd, Wilson AL, Pigott DM, et al. The contemporary distribution of Trypanosoma cruzi infection in humans, alternative hosts and vectors. Scientific Data. 2017;4:170050. pmid:28398292
  34. Barbet-Massin M, Jiguet F, Albert CH, Thuiller W. Selecting pseudo-absences for species distribution models: how, where and how many? Methods in Ecology and Evolution. 2012;3(2):327–338.
  35. Warton DI, Shepherd LC. Poisson point process models solve the “pseudo-absence problem” for presence-only data in ecology. The Annals of Applied Statistics. 2010;4(3):1383–1402.
  36. Renner IW, Elith J, Baddeley A, Fithian W, Hastie T, Phillips SJ, et al. Point process models for presence-only analysis. Methods in Ecology and Evolution. 2015;6(4):366–379.
  37. Phillips SJ, Dudík M, Elith J, Graham CH, Lehmann A, Leathwick J, et al. Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. Ecological Applications. 2009;19(1):181–197. pmid:19323182
  38. Moyes CL, Shearer FM, Huang Z, Wiebe A, Gibson HS, Nijman V, et al. Predicting the geographical distributions of the macaque hosts and mosquito vectors of Plasmodium knowlesi malaria in forested and non-forested areas. Parasites & Vectors. 2016;9(1):242.
  39. Shearer FM, Longbottom J, Browne AJ, Pigott DM, Brady OJ, Kraemer MUG, et al. Existing and potential infection risk zones of yellow fever worldwide: a modelling analysis. The Lancet Global Health. 2018;6(3):e270–e278. pmid:29398634
  40. Fithian W, Elith J, Hastie T, Keith DA. Bias correction in species distribution models: pooling survey and collection data for multiple species. Methods in Ecology and Evolution. 2015;6(4):424–438. pmid:27840673
  41. Tennekes M. tmap: Thematic Maps in R. Journal of Statistical Software. 2018;84(1):1–39.
  42. Vittinghoff E, McCulloch CE. Relaxing the Rule of Ten Events per Variable in Logistic and Cox Regression. American Journal of Epidemiology. 2007;165(6):710–718. pmid:17182981
  43. Concato J, Peduzzi P, Holford TR, Feinstein AR. Importance of events per independent variable in proportional hazards analysis I. Background, goals, and general strategy. Journal of Clinical Epidemiology. 1995;48(12):1495–1501.
  44. Peduzzi P, Concato J, Feinstein AR, Holford TR. Importance of events per independent variable in proportional hazards regression analysis II. Accuracy and precision of regression estimates. Journal of Clinical Epidemiology. 1995;48(12):1503–1510.
  45. Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology. 1996;49(12):1373–1379. pmid:8970487
  46. Wan Z, Hook S. MOD11A2 MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1km SIN Grid V006. 2015.
  47. Lobser SE, Cohen WB. MODIS tasselled cap: land cover characteristics expressed through transformed MODIS data. International Journal of Remote Sensing. 2007;28:5079–5101.
  48. Funk C, Peterson P, Landsfeld M, Pedreros D, Verdin J, Shukla S, et al. The climate hazards infrared precipitation with stations—a new environmental record for monitoring extremes. Scientific Data. 2015;2:150066.
  49. Jarvis A, Reuter H, Nelson A, Guevara E. CGIAR-CSI SRTM—SRTM 90m DEM Digital Elevation Database; 2008. Available from: http://srtm.csi.cgiar.org/.
  50. M Friedl DSM. MCD12Q1 MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m SIN Grid V006; 2015. Available from: https://lpdaac.usgs.gov/node/1260.
  51. Didan K, Munoz AB, Solano R, Huete A. type [; 2015] Available from: https://vip.arizona.edu/documents/MODIS/MODIS_VI_UsersGuide_June_2015_C6.pdf.
  52. Esch T, Bachofer F, Heldens W, Hirner A, Marconcini M, Palacios-Lopez D, et al. Where we live—a summary of the achievements and planned evolution of the global urban footprint. Remote Sensing. 2018;10:10.
  53. Earth Observation Group. type [; 2015] Available from: https://eogdata.mines.edu/download_dnb_composites.html.
  54. Center for International Earth Science Information Network. Gridded Population of the World, Version 4 (GPWv4): Population Count, Revision 11; 2018. Available from: https://eogdata.mines.edu/download_dnb_composites.html.
  55. Weiss DJ, Nelson A, Gibson HS, Temperley W, Peedell S, Lieber A, et al. A global map of travel time to cities to assess inequalities in accessibility in 2015. Nature. 2018;553(7688):333–336. pmid:29320477
  56. Heikkinen RK, Marmion M, Luoto M. Does the interpolation accuracy of species distribution models come at the expense of transferability? Ecography. 2012;35(3):276–288.
  57. Wenger SJ, Olden JD. Assessing transferability of ecological models: an underappreciated aspect of statistical validation. Methods in Ecology and Evolution. 2012;3(2):260–267.
  58. Trachsel M, Telford RJ. Technical note: Estimating unbiased transfer-function performances in spatially structured environments. Climate of the Past. 2016;12(5):1215–1223.
  59. Roberts DR, Bahn V, Ciuti S, Boyce MS, Elith J, Guillera-Arroita G, et al. Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography. 2017;40(8):913–929.
  60. Bender A. Target-group background and spatial blocking for 30 triatomine species; 2020. Available from: https://figshare.com/articles/Target-group_background_and_spatial_blocking_for_30_triatomine_species/8604080/1.
  61. Wood SN. Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2011;73(1):3–36.
  62. Kammann EE, Wand MP. Geoadditive models. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2003;52(1):1–18.
  63. Wood SN. Generalized Additive Models: An Introduction with R. 2nd ed. Boca Raton: Chapman & Hall/Crc Texts in Statistical Science; 2017.
  64. Marra G, Wood SN. Practical variable selection for generalized additive models. Computational Statistics & Data Analysis. 2011;55(7):2372–2387.
  65. Wood SN, Goude Y, Shaw S. Generalized additive models for large data sets. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2015;64(1):139–155.
  66. Wood SN, Li Z, Shaddick G, Augustin NH. Generalized Additive Models for Gigadata: Modeling the U.K. Black Smoke Network Daily Data. Journal of the American Statistical Association. 2017;112(519):1199–1210.
  67. Li Z, Wood SN. Faster model matrix crossproducts for large generalized linear models with discretized covariates. Statistics and Computing. 2019.
  68. R Core Team. R: A Language and Environment for Statistical Computing; 2018. Available from: https://www.R-project.org/.
  69. Wickham H, François R, Henry L, Müller K. dplyr: A Grammar of Data Manipulation; 2019. Available from: https://CRAN.R-project.org/package=dplyr.
  70. Wickham H, Henry L. tidyr: Easily Tidy Data with ’spread()’ and ’gather()’ Functions; 2019. Available from: https://CRAN.R-project.org/package=tidyr.
  71. Valavi R, Elith J, Lahoz-Monfort JJ, Guillera-Arroita G. blockCV: An R package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models. Methods in Ecology and Evolution. 2018;0(0).
  72. Bender A. Predicted rasters (.gri files) of 30 triatomine vectors; 2020. Available from: https://figshare.com/articles/Predicted_rasters_gri_files_of_30_triatomine_vectors/8598548/1.
  73. Bender A. Visualization of the predicted distribution of 30 triatomine vectors (with confidence intervals).; 2020. Available from: https://figshare.com/articles/Visualization_of_the_predicted_distribution_of_30_triatomine_vectors_with_confidence_intervals_/8617352/2.
  74. Grijalva MJ, Villacis AG, Moncayo AL, Ocana-Mayorga S, Yumiseva CA, Baus EG. Distribution of triatomine species in domestic and peridomestic environments in central coastal Ecuador. PLoS Neglected Tropical Diseases. 2017;11(10).
  75. Bargues MD, Klisionwisc DR, Gonzalez-Candelas F, Ramsey JM, Monroy C, Ponce C, et al. Phylogeography and Genetic Variation of Triatoma dimidiata, the Main Chagas Disease Vector in Central America, and Its Position within the Genus Triatoma. PLoS Neglected Tropical Diseases. 2008;2(3):e233. pmid:18461141
  76. Rodríguez-Planes LI, Gaspe MS, Enriquez GF, Gürtler RE. Habitat-Specific Occupancy and a Metapopulation Model of Triatoma sordida (Hemiptera: Reduviidae), a Secondary Vector of Chagas Disease, in Northeastern Argentina. Journal of Medical Entomology. 2018;55(2):370–381. pmid:29272421
  77. Dorn PL, Monroy C, Curtis A. Triatoma dimidiata (Latreille, 1811): A review of its diversity across its geographic range and the relationship among populations. Infection, Genetics and Evolution. 2007;7(2):343–352. pmid:17097928