
NIH Explores New Field of Liberation Data Science in All of Us Research Program

The All of Us Research Program looks to enroll more than 1 million US individuals in the next 10 years to enable public access to medical data for research purposes.

Through the All of Us Research Program at the National Institutes of Health (NIH), scientists are able to get data into the hands of community members, not just those in health care fields, explained Karriem Watson, DHSc, MS, MPH, chief engagement officer of the All of Us Research Program at NIH, during a presentation at the 15th American Association for Cancer Research Conference on the Science of Cancer Health Disparities in Racial/Ethnic Minorities and the Medically Underserved. The program encourages members of the public to access and understand their own data so they can better assess its implications for their own communities.

Watson explained that in Chicago, health disparities and life expectancy gaps were broad topics of discussion among health care professionals and community members alike for many years. However, many of the city's elected officials remained unaware of the disparities in cancer incidence and outcomes and their impact on the communities they served, even though these officials were the ones making decisions about the allocation of government support and assistance for health care needs.

To better educate these elected officials on the incidence of cancer and its impact on the city, Watson and his colleagues brought these data to their attention. The researchers also helped to explain and analyze the data and its implications for the officials' communities in layman's terms.

“When you take this data and give it to communities and not only encourage them to utilize it, but also make your faculty and staff accessible to those communities to help them navigate and understand that data—that's really what we like to think of as liberation data science,” Watson said.

As the All of Us Research Program works to provide a broader dataset for public access, Watson explained that the program's goal is to enroll 1 million or more individuals across the United States over the next 10 years.

“We're right at about 523,000 people enrolled today, and of those, about 372,000 participants have donated data that allows us to do a truly deep dive,” Watson said. “Then 80% of those in the 372,000 are in underrepresented groups in biomedical research, and 45% of those are self-identified racial and ethnic minorities.”

Watson explained that the data are available at 3 access levels. The first is the public tier, which is open to anyone with digital access, a requirement Watson acknowledged as a potential limitation on universal accessibility.

“So we use that term anyone, but that's contextualized,” Watson said. “It's available to anyone with internet access. And it's available to those that are able to look and understand the electronic health record (EHR) and see aggregated data about what conditions we have.”

Watson also noted that at the public tier, people are able to look at the research project directory. He explained that this is particularly important for those in the community who donate their data, as they are then able to see how that information is being used for research purposes, including the specific types of questions researchers are asking of their data.

“This is something that we're really excited about,” Watson said.

The other 2 access levels for the All of Us Research Program dataset are the registered tier and the controlled tier, each of which provides progressively more detailed data.

“So for example, at the registered tier data, we have information from our surveys. We recently launched a social determinants of health survey and a survey on COVID-19 where we asked questions about discrimination, questions about food insecurity, questions about social isolation as a result of the pandemic, and imagine that data coupled with EHR data from over 270,000 participants and physical measures, and even data like wearables,” Watson said.

Because not everyone has access to wearable devices, Watson explained that the All of Us Research Program donated wearables to populations who don't typically get to participate in studies of this kind in order to broaden the medical data collected.

At the next access level, the controlled tier, Watson noted that the program offers not only what is available at the registered tier but also expanded demographics, unshifted event dates, and genomic data derived from whole-genome sequencing. Watson added that the program will soon be able to provide 3-digit ZIP code data at the controlled tier access level, with more than 24,300 health conditions and over 15,300 unique lab values.

“Today, we have about 42,000 of our participants who have identified in their EHR that they have been diagnosed with cancer. That's 18% of our participants living with cancer or having a diagnosis of cancer,” Watson said. “You think about some of the top cancers, like skin cancer, breast cancer, prostate cancer, colorectal, and lung—from these projects, we have over 2000 projects in our research tier registry. These are just a snapshot of 162 that are looking at the intersectionality of cancer, asking questions like the impact of nutrition and physical fitness on prostate cancer, social determinants on colorectal cancer, breast cancer genomic variants, and then personal family history of cancer.”

REFERENCE

Watson KS. The All of Us Research Hub: A dataset for all of us. Philadelphia, PA: 15th AACR Conference on the Science of Cancer Health Disparities in Racial/Ethnic Minorities and the Medically Underserved; September 17, 2022.
