An advanced statistical modeling and analytic tool aims to improve personalized medicine through health care and medical data.
Scientists have developed novel statistical modeling and analytic tools to make health care and medical data more meaningful.
Electronic health records (EHRs) can help improve patient care and fuel the development of individualized treatments; however, analyzing them can also be challenging.
“Powerful statistical analyses and results from these records and databases can be the foundation on which informed medical questions are asked and decisions are made,” said statistician Liangyuan Hu.
One example is physicians seeking an optimal treatment for high-risk cancer patients. They could choose among multiple radical prostatectomy (RP) and radiotherapy (RT) modalities. Unfortunately, it is challenging to conduct randomized controlled trials that produce quality results comparing RP with RT for long-term survival, which ultimately limits physicians to the available data when making decisions.
“Therefore, finding evidence using statistical tools from large, representative national databases is crucial to inform such critical medical decisions,” Hu said.
At the 2017 Joint Statistical Meetings in Baltimore, Maryland, Hu used a case study in chronic diseases to show the challenges associated with drawing inferences from EHR and administrative databases.
Uncontrolled data collection settings, variations in practice among physicians, and missing data can lead to false conclusions if not properly addressed by statistical methods. These methods employ machine learning and flexible models to draw valid inferences from EHRs that cover a representative population and reflect outcomes from actual clinical practice.
“In clinical prediction studies, we show that combining strengths of nonparametric algorithms and parametric models leads to the development of a data-driven and reproducible tool that will not only generate immediate public health impact, but also advance developments in statistical methodology pertaining to drawing valid and useful information from vast data sources,” Hu concluded.