--- title: "The exponential distribution" output: rmarkdown::html_vignette bibliography: references.bib csl: proc_b.csl vignette: > %\VignetteIndexEntry{The exponential distribution} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} header-includes: \usepackage{amsmath} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, message = FALSE} library(anovir) ``` ### Introduction {#top} The mathematical models describing the epidemiology of infectious diseases commonly assume mortality rates remain constant over time. This assumption is sometimes justified by observed data, particularly if they cover relatively short periods of time compared to the organism's expected survival. However assuming constant rate of mortality is more frequently a convenience, making models easier to manipulate and interpret. The hazard function of the exponential distribution describes constant rates of mortality. This is a special case of the Weibull distribution and occurs when _b_ = 1; \begin{align} h(t) &= \frac{1}{bt} \exp \left( \frac{\log t - a}{b} \right) \\ \\ &= \frac{1}{t} \exp \left( \log t - a \right) \\ \\ &= \frac{1}{t} \frac{\exp \left(\log t \right)}{\exp \left(a \right)} \\ \\ &= \exp \left( -a \right) \\ \\ \end{align} where the first line is the hazard function for the Weibull distribution with location and scale parameters _a_ and _b_, respectively. When _b = 1_, the rate of mortality described by the hazard function is constant at, _h(t) = exp(-a)_. The _nll\_functions_ of this package can estimate constant rates of mortality as described by the exponential distribution. This is achieved by choosing the Weibull distribution to describe the pattern of mortality in question and constraining the scale parameter to equal one when specifying the value of variables to be sent for maximum likelihood estimation. ### Example: background rate of mortality constant The following example uses a subset of data from Sy et al [@Sy_2014]; the full dataset is freely available at [@Sy_2014_data]. The data record the survival of blood-fed adult female _Aedes aegypti_ mosquitoes, either infected or not with the microsporidian parasite _Vavraia culicis_, over the first three weeks of the adult life. Uninfected adult _Ae. aegypti_ females can survive much longer than this in laboratory conditions. ```{r include = FALSE} sy_time <- c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23) sy_Suninf <- c(1,0.988,0.988,0.969,0.951,0.941,0.934,0.929,0.92,0.913,0.906,0.903,0.896,0.889,0.88,0.875,0.875,0.868,0.858,0.846,0.842,0.837,0.837,0.831) sy_Sobsinf <- c(1,0.979,0.966,0.942,0.915,0.902,0.887,0.875,0.869,0.859,0.841,0.81,0.788,0.747,0.732,0.696,0.68,0.652,0.616,0.6,0.584,0.574,0.561,0.545) sy_surv <- data.frame(sy_time, sy_Suninf, sy_Sobsinf) ``` ```{r echo = FALSE, fig.width = 4, fig.height = 4, fig.align = 'center'} plot(sy_surv[, 1], sy_surv[, 2], type = 's', col = 'black', xlab = 'time (days)', ylab = 'Survival', xlim = c(0, 23), ylim = c(0, 1), main = 'Cumulative survival', axes = FALSE ) lines(sy_surv[, 3], type = 's', col = 'red') axis(side = 1, seq(0, 28, by = 7)) axis(side = 2, seq(0, 1.1, by = 0.2)) legend("bottomleft", c("Uninfected", "Infected"), lty = c(1, 1), col = c(1, 2)) ``` ```{r include = FALSE} sy_times <- c(1,3,4,5,6,7,8,9,10,11,12,13,14,15,17,18,19,20,21,23,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,7,21,22,23,7,14,21,22,23) sy_censor <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1) sy_inf <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1) sy_fq <- c(5,8,8,4,3,2,4,3,3,1,3,3,4,2,3,4,5,2,2,1,7,4,8,9,4,5,4,2,3,6,10,7,13,5,11,5,9,11,5,5,3,4,2,3,4,205,143,6,7,4,103,66) data_sy <- data.frame(sy_times, sy_inf, sy_censor, sy_fq) data_sy$fq <- data_sy$sy_fq ``` In a first model, the Weibull distribution was chosen to describe both background mortality and mortality due to infection, _d1_ and _d2_, respectively, ```{r include = TRUE} head(data_sy) ``` ```{r eval = TRUE, echo = TRUE, message = FALSE, warning = FALSE} m01_prep_function <- function(a1, b1, a2, b2){ nll_basic( a1, b1, a2, b2, data = data_sy, time = sy_times, censor = sy_censor, infected_treatment = sy_inf, d1 = 'Weibull', d2 = 'Weibull') } m01 <- mle2(m01_prep_function, start = list(a1 = 3, b1 = 0.5, a2 = 3, b2 = 0.5) ) summary(m01) ``` The estimate (± s.e.) for the scale parameter, _b1_, overlaps with 1.0, suggesting the exponential distribution was suitable for describing background mortality. In a second model the scale parameter for background mortality _b1_ was constrained, or fixed, to _b1 = 1.0_ throughout the estimation process. Hence background mortality was estimated according to the exponential distribution. ```{r eval = TRUE, echo = TRUE, message = FALSE, warning = FALSE} m02 <- mle2(m01_prep_function, start = list(a1 = 4.8, a2 = 3.6, b2 = 0.6), fixed = list(b1 = 1.0) ) summary(m02) # compare models by AICc AICc(m01, m02, nobs = 753) ``` The log-likelihoods of the two models were very similar, indicating the data were not less well described when the exponential distribution was chosen to describe the background mortality. This was confirmed by the lower AICc of the second model, as fewer parameters were estimated for essentially the same log-likelihood indicating it was the better of the two models. [back to top](#top) ### References