Estimating disease prevalence from partially-sampled clusters using the conditional linear family for multivariate Bernoulli data

Authors

  • SUSAN L EDWARDS, Department of Biostatistics, University of North Carolina at Chapel Hill Gillings School of Global Public Health, Chapel Hill, NC 27599
  • JOHN S PREISSER, Department of Biostatistics, University of North Carolina at Chapel Hill Gillings School of Global Public Health, Chapel Hill, NC 27599
  • BAHJAT QAQISH, Department of Biostatistics, University of North Carolina at Chapel Hill Gillings School of Global Public Health, Chapel Hill, NC 27599
  • CRISTIANO SUSIN, Department of Biostatistics, University of North Carolina at Chapel Hill Gillings School of Global Public Health, Chapel Hill, NC 27599

DOI:

https://doi.org/10.3329/jsr.v58i1.75408

Keywords:

correlated binary data, missing by design, computer simulation, periodontal disease(s)/periodontitis, partial-recording protocol, surveillance

Abstract

In periodontal disease surveillance in human populations, full-mouth clinical examinations to classify the disease status of individuals are the gold standard for estimating periodontitis prevalence. However, conducting full-mouth exams is resource intensive, time consuming, and costly, especially in studies involving thousands of participants. Partial-recording protocols have been utilized in oral health surveys worldwide to gather correlated binary outcomes of periodontal disease on selected teeth in lieu of full-mouth exams. Since the use of partial-recording protocols tends to underestimate disease prevalence, a statistical distributional approach considering the pattern of tooth-level disease in the mouth was proposed to substantially reduce bias in the estimation of periodontitis prevalence. This approach employed multivariate Bernoulli distributions for observation (tooth)-level disease indicators to define formulae for the prevalence of disease (periodontitis) at the cluster (person)-level for various full-mouth case definitions. In turn, prevalence estimators were based on plug-in estimates of parameters from a conditional linear family for binary data gathered under partial-recording protocols. Work in this article extended existing prevalence estimators for simple case definitions based on single clinical measures of tooth-level periodontal disease to a definition of severe periodontitis using two measures as defined by the Centers for Disease Control and Prevention and the American Academy of Periodontology, and later adopted by the 2017 World Workshop in Periodontology. Simulations evaluated the finite-sample performance of the proposed estimators and their confidence intervals for three established partial-recording protocols.
In general, the prevalence estimators performed well relative to bias and coverage when tooth-level probabilities of disease and within-mouth correlation structures were correctly specified and even when the pattern of tooth-pair correlations was misspecified.
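
The underestimation described in the abstract can be illustrated with a deliberately simplified sketch. It assumes independent teeth, a common marginal disease probability, and the simplest case definition (at least one diseased tooth). None of these assumptions comes from the article, which models within-mouth correlation through the conditional linear family. The function name, the probability 0.05, the 28-tooth mouth, and the six index positions are all hypothetical.

```python
from math import prod

def prevalence_any(tooth_probs):
    """P(at least one diseased tooth), ASSUMING independent teeth.

    Independence is an illustrative simplification; the article
    models correlated tooth-level outcomes instead.
    """
    return 1.0 - prod(1.0 - p for p in tooth_probs)

# Hypothetical marginal disease probabilities for a 28-tooth mouth.
full_mouth = [0.05] * 28

# Hypothetical partial-recording protocol: only six index teeth are examined.
index_positions = [0, 5, 10, 15, 20, 25]
partial = [full_mouth[i] for i in index_positions]

# Prevalence under the full-mouth case definition.
full_prev = prevalence_any(full_mouth)

# Naive prevalence when the same definition is applied to examined teeth only.
partial_prev = prevalence_any(partial)

print(f"full-mouth prevalence:    {full_prev:.3f}")
print(f"partial-recording (naive): {partial_prev:.3f}")
```

Because the naive partial-recording figure counts disease only on examined teeth, it cannot exceed the full-mouth figure. A plug-in approach instead estimates tooth-level parameters from the examined teeth and substitutes them into the full-mouth prevalence formula.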

Journal of Statistical Research 2024, Vol. 58, No. 1, pp. 3-31. 

Published

2024-08-14

How to Cite

EDWARDS, S. L., PREISSER, J. S., QAQISH, B., & SUSIN, C. (2024). Estimating disease prevalence from partially-sampled clusters using the conditional linear family for multivariate Bernoulli data. Journal of Statistical Research, 58(1), 3–31. https://doi.org/10.3329/jsr.v58i1.75408

Section

Articles