Articles | April 25, 2023

Data-Driven Science: How AI and Open Data Can Revolutionize Scientific Discovery

While demand for professionals who can work with big data is at an all-time high, there remains a significant lack of skilled talent.

Scientists have long been perceived, and portrayed in film, as older individuals in white lab coats perched at a bench full of bubbling fluorescent liquids. The present-day reality is quite different. Scientists are increasingly data jockeys in hoodies sitting before monitors, analyzing enormous amounts of data. Modern-day labs are more likely to consist of sterile rows of robots that handle materials, and lab notebooks are now electronic, stored in massive data centers holding vast quantities of information. Today, scientific input comes from data pulled from the cloud, with algorithms fueling scientific discovery the way Bunsen burners once did.

Advances in technology, and especially instrumentation, enable scientists to collect and process data at an unprecedented scale. As a result, scientists now face massive datasets that require sophisticated analysis techniques and computational tools to extract meaningful insights. This also presents significant challenges: how do you store, manage, and share these large datasets, and how do you ensure that the data is reliable and of high quality?

The Impact of Big Data on Science

This growth in data is transforming the way scientists conduct research and enabling new discoveries across many fields, especially genome and protein research. It has fostered the emergence of a new type of scientist, the bioinformatician or data scientist, who works hands-on with big data by developing and applying algorithms. In fact, “data scientist” has been at the top of the list of desirable jobs on career sites for the last few years. However, while demand for professionals who can work with big data is at an all-time high, there is a significant lack of skilled talent.

In medicine, as in other fields, it’s not just the volume and velocity of data generation that are increasing, but also the variety of data being collected to answer research questions. For instance, flow cytometry data is fundamentally different from DNA sequencing data, which in turn is entirely different from 3D models of proteins. The tools and algorithms that work for one data type are not suited to another. Further, flexibility in data storage and modeling is crucial for repurposing data. This is especially true for predictive science, which integrates datasets and data types that were unrelated to the hypotheses of any of the original studies.

Turning to Machine Learning and Artificial Intelligence

Technology can act like a powerful flashlight, illuminating hidden patterns and insights in vast amounts of data and letting us understand things that were previously too dark to see. That’s why, even as generative AI tools like ChatGPT generate headlines and stoke fear about potential risks, drug discovery is one setting where artificial intelligence (AI) and machine learning (ML) are poised to make a significant, positive impact.

For example, during the pandemic, I had the opportunity to collaborate with the team behind the EVE Online video game to create Project Discovery - Flow Cytometry, a free mini-game that enabled tens of thousands of gamers to become citizen scientists. Using data from cell samples of patients with COVID-19 and other immune system diseases, players were trained to identify different cell patterns in data generated by a technology known as flow cytometry. The game used rewards and rankings to make it fun and challenging, but many players said they were motivated by, and took satisfaction in, participating in scientific research, especially where it related to their own interests and experience.

To date, players have solved millions of puzzles, representing hundreds of years of effort. All data from the project will be freely available for open science. Companies like Dotmatics will be able to use the data to develop ML approaches to flow cytometry data analysis, leading to exponentially faster, less expensive, and more significant medical breakthroughs.

Today, both ML and AI are being used in research labs and universities around the world to expedite discoveries. The National Cancer Institute’s Center for Cancer Research has developed deep learning algorithms to improve cancer detection. For example, one model can function as “a virtual expert,” reviewing MRIs in hard-to-detect cancer types, guiding less-experienced radiologists, and minimizing error rates. Similarly, AI is used at the University of Toronto to predict Alzheimer’s disease risk, at Rutgers University to predict cardiovascular disease, and by hundreds of startups applying advanced technology to design cheaper, safer drugs with fewer adverse effects.

Complexities of Big Data

Despite these advances, the complexity of the data and the heterogeneity of the tools required to analyze it can make it difficult for researchers to collaborate effectively to generate the big datasets that AI requires. Efforts such as the FAIR Guiding Principles for scientific data management and stewardship provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets. They are increasingly being adopted and are even being mandated by granting agencies. The threat of withheld funding is a powerful motivator in academia, but it doesn’t directly translate to pharmaceutical companies, which are perhaps even more burdened by the same underlying challenges when trying to find and share massive, complex datasets within global organizations.

While the old way of science using beakers and chemistry is still important, tomorrow’s scientists will be able to explore and understand the world around us and scale ambitious research into areas that are presently economically prohibitive. However, to truly harness the power of AI, we must invest in further improvements to the infrastructure supporting the integration, analysis, and reuse of data that have already become the new frontier of scientific discovery.

About the Author

Ryan Brinkman, PhD, is the vice president and research director for Dotmatics.
