The R Developer Community Does Have a Strong Software Engineering Culture.
K. Ram. The R Journal, 2022.
There is a strong software engineering culture in the R developer community. We recommend creating, updating and vetting packages as well as keeping up with community standards. We invite contributions to the rOpenSci project, where participants can gain experience that will shape their work and that of their peers.
When, Where, and What? Characterizing Personal PM2.5 Exposure in Periurban India by Integrating GPS, Wearable Camera, and Ambient and Personal Monitoring Data.
C. Tonne. Environmental Science & Technology, 2019.
Evidence identifying factors that influence personal exposure to air pollutants in low- and middle-income countries is scarce. Our objective was to identify the relative contribution of the time of the day (when?), location (where?), and individuals’ activities (what?) to PM2.5 personal exposure in periurban South India. We conducted a panel study in which 50 participants were monitored in up to six 24-h sessions (n = 227). We integrated data from multiple sources: continuous personal and ambient PM2.5 concentrations; questionnaire, GPS, and wearable camera data; and modeled long-term exposure at residence. Mean 24-h personal exposure was 43.8 μg/m3 (SD 24.6) for men and 39.7 μg/m3 (SD 12.0) for women. Temporal patterns in exposure varied between women (peak exposure in the morning) and men (more exposed throughout the rest of the day). Most exposure occurred at home, 67% for men and 89% for women, which was proportional to the time spent in this location. Ambient daily PM2.5 was an important predictor of 24-h personal exposure for both genders. Among men, activities predictive of higher hourly average exposure included presence near food preparation, in the kitchen, in the vicinity of smoking, or in industry. For women, predictors of exposure were largely related to cooking.
Use of spatiotemporal characteristics of ambient PM2.5 in rural South India to infer local versus regional contributions.
J.D. Marshall. Environmental Pollution, 2018.
This study uses spatiotemporal patterns in ambient concentrations to infer the contribution of regional versus local sources. We collected 12 months of monitoring data for outdoor fine particulate matter (PM2.5) in rural southern India. Rural India includes more than one-tenth of the global population and annually accounts for around half a million air pollution deaths, yet little is known about the relative contribution of local sources to outdoor air pollution. We measured 1-min averaged outdoor PM2.5 concentrations during June 2015–May 2016 in three villages, which varied in population size, socioeconomic status, and type and usage of domestic fuel. The daily geometric-mean PM2.5 concentration was ∼30 μg m−3 (geometric standard deviation: ∼1.5). Concentrations exceeded the Indian National Ambient Air Quality standards (60 μg m−3) during 2–5% of observation days. Average concentrations were ∼25 μg m−3 higher during winter than during monsoon and ∼8 μg m−3 higher during morning hours than the diurnal average. A moving average subtraction method based on 1-min average PM2.5 concentrations indicated that local contributions (e.g., nearby biomass combustion, brick kilns) were greater in the most populated village, and that overall the majority of ambient PM2.5 in our study was regional, implying that local air pollution control strategies alone may have limited influence on local ambient concentrations. We compared the relatively new moving average subtraction method against a more established approach. Both methods broadly agree on the relative contribution of local sources across the three sites. The moving average subtraction method has broad applicability across locations.
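The idea of separating a slowly varying regional baseline from short local spikes can be illustrated with a minimal Python sketch. It is not the procedure used in the paper, whose details the abstract does not give: the function names `moving_average` and `split_local_regional`, the centered window, and the rule of capping the baseline at the observed value are all assumptions made for illustration.

```python
def moving_average(values, window):
    """Centered moving average; the window is truncated at the series edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def split_local_regional(values, window=61):
    """Split a concentration series into (regional, local) parts.

    Assumption (illustrative only): the regional baseline at each time step
    is the smaller of the observation and its moving average, and the local
    contribution is whatever remains above that baseline.
    """
    smooth = moving_average(values, window)
    regional = [min(v, s) for v, s in zip(values, smooth)]
    local = [v - r for v, r in zip(values, regional)]
    return regional, local


def local_fraction(values, window=61):
    """Share of the total concentration attributed to the local part."""
    _, local = split_local_regional(values, window)
    total = sum(values)
    return sum(local) / total if total else 0.0
```

With 1-min data, a longer window treats more of the signal as regional; the default of 61 samples is an arbitrary placeholder, not a value taken from the study.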
Development of land-use regression models for fine particles and black carbon in peri-urban South India.
C. Tonne. Science of The Total Environment, 2018.
Land-use regression (LUR) has been used to model local spatial variability of particulate matter in cities of high-income countries. Performance of LUR models is unknown in less urbanized areas of low-/middle-income countries (LMICs) experiencing complex sources of ambient air pollution and which typically have limited land use data. To address these concerns, we developed LUR models using satellite imagery (e.g., vegetation, urbanicity) and manually-collected data from a comprehensive built-environment survey (e.g., roads, industries, non-residential places) for a peri-urban area outside Hyderabad, India. As part of the CHAI (Cardiovascular Health effects of Air pollution in Telangana, India) project, concentrations of fine particulate matter (PM2.5) and black carbon were measured over two seasons at 23 sites. Annual mean (sd) was 34.1 (3.2) μg/m3 for PM2.5 and 2.7 (0.5) μg/m3 for black carbon. The LUR model for annual black carbon explained 78% of total variance and included both local-scale (energy supply places) and regional-scale (roads) predictors. Explained variance was 58% for annual PM2.5 and the included predictors were only regional (urbanicity, vegetation). During leave-one-out cross-validation and cross-holdout validation, only the black carbon model showed consistent performance. The LUR model for black carbon explained a substantial proportion of the spatial variability that could not be captured by a simpler interpolation technique (ordinary kriging). This is the first study to develop a LUR model for ambient concentrations of PM2.5 and black carbon in a non-urban area of LMICs, supporting the applicability of the LUR approach in such settings. Our results provide insights on the added value of manually-collected built-environment data to improve the performance of LUR models in settings with limited data availability. For both pollutants, LUR models predicted substantial within-village variability, an important feature for future epidemiological studies.
Wearable camera-derived microenvironments in relation to personal exposure to PM2.5.
C. Tonne. Environment International, 2018.
Data regarding which microenvironments drive exposure to air pollution in low and middle income countries are scarce. Our objective was to identify sources of time-resolved personal PM2.5 exposure in peri-urban India using wearable camera-derived microenvironmental information. We conducted a panel study with up to 6 repeated non-consecutive 24 h measurements on 45 participants (186 participant-days). Camera images were manually annotated to derive visual concepts indicative of microenvironments and activities. Men had slightly higher daily mean PM2.5 exposure (43 μg/m3) compared to women (39 μg/m3). Cameras helped identify that men also had higher exposures when near a biomass cooking unit (mean (sd) μg/m3: 119 (383) for men vs 83 (196) for women) and presence in the kitchen (133 (311) for men vs 48 (94) for women). Visual concepts associated in regression analysis with higher 5-minute PM2.5 for both sexes included: smoking (+93% (95% confidence interval: 63%, 129%) in men, +29% (95% CI: 2%, 63%) in women), biomass cooking unit (+57% (95% CI: 28%, 93%) in men, +69% (95% CI: 48%, 93%) in women), visible flame or smoke (+90% (95% CI: 48%, 144%) in men, +39% (95% CI: 6%, 83%) in women), and presence in the kitchen (+49% (95% CI: 27%, 75%) in men, +14% (95% CI: 7%, 20%) in women). Our results indicate wearable cameras can provide objective, high time-resolution microenvironmental data useful for identifying peak exposures and providing insights not evident using standard self-reported time-activity.
A Community of Practice Around Peer Review for Long-Term Research Software Sustainability.
S. Butland. Computing in Science & Engineering, 2018.
Scientific open source projects are responsible for enabling many of the major advances in modern science including recent breakthroughs such as the Laser Interferometer Gravitational-Wave Observatory project recognized in the 2017 Nobel Prize for physics. However, much of this software ecosystem is developed ad hoc with no regard for sustainable software development practices. This problem is further compounded by the fact that researchers who develop software have little in the way of resources or academic recognition for their efforts. The rOpenSci Project, founded in 2011 with the explicit mission of developing software to support reproducible science, has in recent years undertaken an effort to improve the long tail of scientific software. In this paper, we describe our software peer-review system, which brings together the best of traditional academic review with new ideas from industry code review.
Health impact assessment of cycling network expansions in European cities.
A. de Nazelle, L. Int Panis, M. Nieuwenhuijsen. Preventive Medicine, 2018.
We conducted a health impact assessment (HIA) of cycling network expansions in seven European cities. We modeled the association between cycling network length and cycling mode share and estimated health impacts of the expansion of cycling networks. First, we performed a non-linear least square regression to assess the relationship between cycling network length and cycling mode share for 167 European cities. Second, we conducted a quantitative HIA for the seven cities of different scenarios (S) assessing how an expansion of the cycling network [i.e. 10% (S1); 50% (S2); 100% (S3), and all-streets (S4)] would lead to an increase in cycling mode share and estimated mortality impacts thereof. We quantified mortality impacts for changes in physical activity, air pollution and traffic incidents. Third, we conducted a cost–benefit analysis. The cycling network length was associated with a cycling mode share of up to 24.7% in European cities. The all-streets scenario (S4) produced greatest benefits through increases in cycling for London with 1,210 premature deaths (95% CI: 447–1,972) avoidable annually, followed by Rome (433; 95% CI: 170–695), Barcelona (248; 95% CI: 86–410), Vienna (146; 95% CI: 40–252), Zurich (58; 95% CI: 16–100) and Antwerp (7; 95% CI: 3–11). The largest cost–benefit ratios were found for the 10% increase in cycling networks (S1). If all 167 European cities achieved a cycling mode share of 24.7% over 10,000 premature deaths could be avoided annually. In European cities, expansions of cycling networks were associated with increases in cycling and estimated to provide health and economic benefits.
Population Wide Decline in Somatic Growth in Harbor Seals—Early Signs of Density Dependence.
K. C. Hårding, T. Härkönen. Frontiers in Ecology and Evolution, 2018.
The harbor seal populations of Danish and Swedish waters have had turbulent population dynamics during the last century. They were severely depleted by hunting in the beginning of the twentieth century, followed by rapid recovery due to protective measures. They were victims to two mass mortalities caused by Phocine Distemper Virus (PDV) epidemics. Long term monitoring and intensive sampling during the last decades now allow analysis of population level phenomena in response to shifting population size. We compare somatic growth curves from several seal populations including 2,041 specimens with known age, length and population size at birth. Asymptotic body lengths of female harbor seals were 148 cm in all four regions in 1988, when seal abundances had been kept low by hunting for decades. Males were 158 cm, being 10 cm longer. However, in 2002 the asymptotic lengths of seals differed among regions. While seals in the Kattegat showed similar asymptotic lengths as in 1988, seals in the Skagerrak were significantly shorter, where both male and female asymptotic lengths declined by 7 cm. We estimate the area of available feeding grounds in the two sea regions and find the density of seals per square kilometer feeding ground to be three times greater in the Skagerrak compared to the Kattegat. Thus, the shorter body length of seals in the Skagerrak can be an early signal of density dependence. Hampered body growth is known to trigger a suite of changes in life history traits, including delayed age at sexual maturity, higher juvenile mortality and lowered fecundity. These mechanisms all point at a possible ‘smooth route toward carrying capacity’ with gradually reduced population growth rate as the main response to high population density. Recent aerial surveys confirm declining rates of population increase in the Skagerrak.
osmdata.
B. Rudis. The Journal of Open Source Software, 2017.
osmdata imports OpenStreetMap (OSM) data into R as either Simple Features or R Spatial objects, respectively able to be processed with the R packages sf and sp. OSM data are extracted from the Overpass API and processed with very fast C++ routines for return to R. The package enables simple Overpass queries to be constructed without the user necessarily understanding the syntax of the Overpass query language, while retaining the ability to handle arbitrarily complex queries. Functions are also provided to enable recursive searching between different kinds of OSM data (for example, to find all lines which intersect a given point). The package is faster than current alternatives for importing OSM data into R and is the only one compatible with sf.
rtimicropem: an R package supporting the analysis of RTI MicroPEM output files.
C. Tonne. The Journal of Open Source Software, 2017.
Supports the input and reproducible analysis of RTI MicroPEM output files.
Integrated assessment of exposure to PM2.5 in South India and its relation with cardiovascular risk: Design of the CHAI observational cohort study.
J. D. Marshall. Elsevier International Journal of Hygiene and Environmental Health, 2017.
While there is convincing evidence that fine particulate matter causes cardiovascular mortality and morbidity, little of the evidence is based on populations outside of high income countries, leaving large uncertainties at high exposures. India is an attractive setting for investigating the cardiovascular risk of particles across a wide concentration range, including concentrations for which there is the largest uncertainty in the exposure-response relationship. CHAI is a European Research Council funded project that investigates the relationship between particulate air pollution from outdoor and household sources with markers of atherosclerosis, an important cardiovascular pathology. The project aims to (1) characterize the exposure of a cohort of adults to particulate air pollution from household and outdoor sources (2) integrate information from GPS, wearable cameras, and continuous measurements of personal exposure to particles to understand where and through which activities people are most exposed and (3) quantify the association between particles and markers of atherosclerosis. CHAI has the potential to make important methodological contributions to modeling air pollution exposure integrating outdoor and household sources as well as in the application of wearable camera data in environmental exposure assessment.
Predictors of Daily Mobility of Adults in Peri-Urban South India.
R. T. Wilson, J. D. Marshall, C. Tonne. International Journal of Environmental Research and Public Health, 2017.
Daily mobility, an important aspect of environmental exposures and health behavior, has mainly been investigated in high-income countries. We aimed to identify the main dimensions of mobility and investigate their individual, contextual, and external predictors among men and women living in a peri-urban area of South India. We used 192 global positioning system (GPS)-recorded mobility tracks from 47 participants (24 women, 23 men) from the Cardiovascular Health effects of Air pollution in Telangana, India (CHAI) project (mean: 4.1 days/person). The mean age was 44 (standard deviation: 14) years. Half of the population was illiterate and 55% was in unskilled manual employment, mostly agriculture-related. Sex was the largest determinant of mobility. During daytime, time spent at home averaged 13.4 (3.7) h for women and 9.4 (4.2) h for men. Women’s activity spaces were smaller and more circular than men’s. A principal component analysis identified three main mobility dimensions related to the size of the activity space, the mobility in/around the residence, and mobility inside the village, explaining 86% (women) and 61% (men) of the total variability in mobility. Age, socioeconomic status, and urbanicity were associated with all three dimensions. Our results have multiple potential applications for improved assessment of environmental exposures and their effects on health.
Timeliness in the German surveillance system for infectious diseases: Amendment of the infection protection act in 2013 decreased local reporting time to 1 day.
A. Gilsdorf. PLOS ONE, 2017.
Time needed to report surveillance data within the public health service delays public health actions. The amendment to the infection protection act (IfSG) from 29 March 2013 requires local and state public health agencies to report surveillance data within one working day instead of one week. We analysed factors associated with reporting time and evaluated the IfSG amendment. Local reporting time is the time between date of notification and date of export to the state public health agency, and state reporting time is the time between date of arrival at the state public health agency and the date of export. We selected cases reported between 28 March 2012 and 28 March 2014. We calculated the median local and state reporting time, stratified by potentially influential factors, computed a negative binomial regression model and assessed quality and workload parameters. The median local reporting time was 4 days before the IfSG amendment and 1 day afterwards. The state reporting time was 0 days before and after. Influential factors are the individual local public health agency, the notified disease, the notification software and the day of the week. Data quality and workload parameters did not change. The IfSG amendment has decreased local reporting time; no relevant loss of data quality or identifiable workload increase could be detected. State reporting time is negligible. We recommend efforts to harmonise practices of local public health agencies including the exclusive use of software with fully compatible interfaces.
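The definition of local reporting time given in this abstract (days between notification and export to the state agency) can be sketched in a few lines of Python. This is only an illustration: the function name `median_reporting_days`, the `(notified, exported)` tuple layout, and the choice to classify a case by its notification date relative to 29 March 2013 are assumptions, not details from the paper.

```python
from datetime import date
from statistics import median

# Date of the IfSG amendment as stated in the abstract.
AMENDMENT = date(2013, 3, 29)


def median_reporting_days(cases):
    """Median local reporting time in days, before and after the amendment.

    `cases` is an iterable of (notified, exported) date pairs. A case counts
    as "before" when it was notified before the amendment date; this split
    rule is an assumption made for the sketch.
    """
    before, after = [], []
    for notified, exported in cases:
        delay = (exported - notified).days
        (before if notified < AMENDMENT else after).append(delay)
    return median(before), median(after)


# Invented toy records, for illustration only (not study data).
toy_cases = [
    (date(2012, 6, 4), date(2012, 6, 7)),    # 3 days
    (date(2012, 9, 10), date(2012, 9, 12)),  # 2 days
    (date(2013, 1, 7), date(2013, 1, 14)),   # 7 days
    (date(2013, 5, 6), date(2013, 5, 7)),    # 1 day
    (date(2013, 8, 12), date(2013, 8, 12)),  # 0 days
    (date(2014, 2, 3), date(2014, 2, 5)),    # 2 days
]
```

Calling `median_reporting_days(toy_cases)` returns the two medians as a pair; `statistics.median` raises an error if either group is empty, so real data would need a guard for that case.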
To combat air inequality, governments and researchers must open their data (commentary).
C. A. Hasenkopf, D. C. Adukpo, H. L. Dewitt, A. I. Ibrahim, L. Sereeter. Clean Air Journal, 2016.
Monitoring Count Time Series in R: Aberration Detection in Public Health Surveillance.
M. Höhle. Journal of Statistical Software, 2016.
Public health surveillance aims at lessening disease burden by, e.g., timely recognizing emerging outbreaks in case of infectious diseases. Seen from a statistical perspective, this implies the use of appropriate methods for monitoring time series of aggregated case reports. This paper presents the tools for such automatic aberration detection offered by the R package surveillance. We introduce the functionalities for the visualization, modeling and monitoring of surveillance time series. With respect to modeling we focus on univariate time series modeling based on generalized linear models (GLMs), multivariate GLMs, generalized additive models and generalized additive models for location, shape and scale. Applications of such modeling include illustrating implementational improvements and extensions of the well-known Farrington algorithm, e.g., by spline-modeling or by treating it in a Bayesian context. Furthermore, we look at categorical time series and address overdispersion using beta-binomial or Dirichlet-multinomial modeling. With respect to monitoring we consider detectors based either on a Shewhart-like single timepoint comparison between the observed count and the predictive distribution or on likelihood-ratio-based cumulative sum methods. Finally, we illustrate how surveillance can support aberration detection in practice by integrating it into the monitoring workflow of a public health institution. Altogether, the present article shows how well surveillance can support automatic aberration detection in a public health surveillance context.
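A Shewhart-like single timepoint comparison, as mentioned in this abstract, can be sketched roughly as follows in Python. The sketch is a deliberately simplified stand-in and not the surveillance package's API or the Farrington algorithm: it assumes a plain Poisson predictive distribution whose mean is the average of a few historical counts, and the names `poisson_upper_quantile` and `shewhart_alarm` are hypothetical.

```python
import math


def poisson_upper_quantile(mu, alpha=0.01):
    """Smallest k with P(X <= k) >= 1 - alpha for X ~ Poisson(mu)."""
    k = 0
    pmf = math.exp(-mu)  # P(X = 0)
    cdf = pmf
    while cdf < 1 - alpha:
        k += 1
        pmf *= mu / k    # recurrence: P(X = k) = P(X = k - 1) * mu / k
        cdf += pmf
    return k


def shewhart_alarm(history, observed, alpha=0.01):
    """Flag `observed` if it exceeds the predictive upper quantile.

    Assumption (illustrative only): the predictive distribution is Poisson
    with mean equal to the average of the historical counts, ignoring
    trend, seasonality and overdispersion.
    """
    mu = sum(history) / len(history)
    threshold = poisson_upper_quantile(mu, alpha)
    return observed > threshold, threshold
```

Each new count is compared with its own threshold and no evidence is accumulated across time points, which is what distinguishes this kind of detector from the cumulative sum methods the abstract also mentions.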
A system for automated outbreak detection of communicable diseases in Germany.
M. Höhle. Eurosurveillance, 2016.
We describe the design and implementation of a novel automated outbreak detection system in Germany that monitors the routinely collected surveillance data for communicable diseases. Detecting unusually high case counts as early as possible is crucial as an accumulation may indicate an ongoing outbreak. The detection in our system is based on state-of-the-art statistical procedures conducting the necessary data mining task. In addition, we have developed effective methods to improve the presentation of the results of such algorithms to epidemiologists and other system users. The objective was to effectively integrate automatic outbreak detection into the epidemiological workflow of a public health institution. Since 2013, the system has been in routine use at the German Robert Koch Institute.
Bayesian Outbreak Detection in the Presence of Reporting Delays.
M. Höhle. Biometrical Journal, 2015.
One use of infectious disease surveillance systems is the statistical aberration detection performed on time series of counts resulting from the aggregation of individual case reports. However, inherent reporting delays in such surveillance systems make the considered time series incomplete, which can be an impediment to the timely detection and thus to the containment of emerging outbreaks. In this work, we synthesize the outbreak detection algorithms of Noufaily et al. (2013) and Manitz and Höhle (2013) while additionally addressing right truncation caused by reporting delays. We do so by considering the resulting time series as an incomplete two‐way contingency table which we model using negative binomial regression. Our approach is defined in a Bayesian setting allowing a direct inclusion of all sources of uncertainty in the derivation of whether an observed case count is to be considered an aberration. The proposed algorithm is evaluated both on simulated data and on the time series of Salmonella Newport cases in Germany in 2011. Altogether, our method aims at allowing timely aberration detection in the presence of reporting delays and hence underlines the need for statistical modeling to address complications of reporting systems. An implementation of the proposed method is made available in the R package surveillance as the function ‘bodaDelay’.
Management of Phytophthora ramorum at Plot and Landscape Scales for Disease Control, Tanoak Conservation, and Forest Restoration– Insights From Epidemiological and Ecosystem Models.
C.A. Gilligan. Proceedings of the Sudden Oak Death Fifth Science Symposium, 2013.
Nosolink: An Agent-based Approach to Link Patient Flows and Staff Organization with the Circulation of Nosocomial Pathogens in an Intensive Care Unit.
L. Temime. Procedia Computer Science, 2013.
Computational models and simulations are commonly employed to aid decision making in two areas of health care management: optimization of the use of hospital resources and control of the spread of hospital-acquired infections caused by antibiotic-resistant pathogens. We propose a model that combines the operational and the epidemiologic perspectives to size up the effect of understaffing and overcrowding on nosocomial contagion in an intensive-care unit. Specifically, we develop an agent-based model simulating contact-mediated pathogen transmission which allows establishing quantitative relations between patient flow, nurse staffing conditions and pathogen colonization in patients. The results of the model, once calibrated with data from the literature, should indicate under which conditions the variation in pathogen transmission resulting from management decisions can lead to significant increases in the incidence of health care-associated infections in the intensive care unit.