Introduction: The increasing accessibility of wearable activity-tracking and health-tracking devices has prompted much research into passive diagnostics and screening that could contribute to infrastructure for population health testing and ultimately mitigate potential pandemics. Elevated resting heart rates have been noted to occur alongside fever.1 This finding has enabled researchers to accurately estimate the prevalence of influenza using data from wearable devices alone.1
In the past 3 years, studies have shown the potential to make individualised predictions of infection. For example, wearable devices have shown promise for population-level tracking of disease prevalence1,2 and for detection before the onset of symptoms.3,4 However, we caution the reader to pay close attention to the design of these studies and the outcomes they estimate. The methods of evaluation proposed in COVID-19 detection studies using machine learning do not replicate a realistic clinical use scenario. Until now, the performance of a model that makes one prediction per participant per day to detect COVID-19 with wearables has not been described.
The following criteria must be satisfied before a study can claim reasonable COVID-19 detection performance using data from wearable devices: (1) the latest training data must predate the earliest testing data when data are non-stationary, otherwise the performance is greatly inflated as a result of data leakage; (2) the evaluation period must not be artificially cropped around the event window, otherwise disease incidence is increased to unrealistic levels; (3) the evaluation period must not exclude participants that always test negative, as such exclusion results in a skewed representation of the population at large; and (4) the model must differentiate between COVID-19 and other conditions with similar characteristics—eg, non-COVID-19 influenza-like illness—when claiming to detect COVID-19.
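Criteria (1) to (3) can be illustrated with a minimal sketch, assuming a hypothetical table of participant-day labels. The toy data, the cutoff date, and the helper names (`temporal_split`, `incidence`, `crop_around_events`) are assumptions made for illustration and are not taken from any of the cited studies. The sketch does not address criterion (4), and no model is trained; it only shows how the choice of evaluation period changes the apparent incidence.

```python
from datetime import date, timedelta


def make_toy_records():
    """Hypothetical participant-day labels: 3 participants, 30 days each."""
    start = date(2021, 1, 1)
    records = []
    for pid in ("p1", "p2", "p3"):
        for offset in range(30):
            # Assumption: only p1 is ever positive, on day offsets 20 to 22.
            positive = pid == "p1" and 20 <= offset <= 22
            records.append(
                {
                    "participant": pid,
                    "day": start + timedelta(days=offset),
                    "label": int(positive),
                }
            )
    return records


def temporal_split(records, cutoff):
    """Criterion 1: every training day strictly precedes every testing day."""
    train = [r for r in records if r["day"] < cutoff]
    test = [r for r in records if r["day"] >= cutoff]
    return train, test


def incidence(records):
    """Fraction of participant-days labelled positive."""
    return sum(r["label"] for r in records) / len(records)


def crop_around_events(records, window_days):
    """Flawed design (violates criteria 2 and 3): keep only days near a
    positive day, which also drops participants who are always negative."""
    positive_days = {}
    for r in records:
        if r["label"] == 1:
            positive_days.setdefault(r["participant"], []).append(r["day"])
    window = timedelta(days=window_days)
    return [
        r
        for r in records
        if any(
            abs(r["day"] - d) <= window
            for d in positive_days.get(r["participant"], [])
        )
    ]


records = make_toy_records()
train, test = temporal_split(records, cutoff=date(2021, 1, 16))
cropped = crop_around_events(test, window_days=3)

print("full test incidence:", incidence(test))
print("cropped test incidence:", incidence(cropped))
```

On this toy data, the uncropped test period has 3 positive participant-days out of 45, whereas the cropped period has 3 out of 9, so the same labels appear five times more prevalent once the window is cropped and the always-negative participants are excluded. A model evaluated on the cropped set would therefore face a much easier task than one evaluated once per participant per day.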