A strength of using machine learning (ML) in oncology is its potential to extract data from unstructured documents, explained Will Shapiro, vice president of Data Science at Flatiron Health, during a session at the Association of Cancer Care Centers (ACCC) Annual Meeting & Cancer Center Business Summit (AMCCBS) in Washington, DC. According to Shapiro, the ML team at Flatiron Health is focused on this endeavor in relation to oncology data and literature.
“There's a ton of really rich information that's only in unstructured documents,” Shapiro said during the session. “We build models to extract things like metastatic status or diagnosis state, which are often not captured in any kind of regular structured way.”
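Flatiron's production models are proprietary, but the flavor of the task can be sketched. The snippet below is a hypothetical, minimal illustration using a generic zero-shot classifier from the Hugging Face transformers library; the model choice, candidate labels, and note text are all assumptions for illustration, not Flatiron's method.

```python
# Hypothetical sketch: a generic zero-shot classifier standing in for the
# purpose-built extraction models Shapiro describes. Not Flatiron's method.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

note = ("Patient with stage IV NSCLC; imaging shows new hepatic lesions "
        "consistent with metastatic spread.")

result = classifier(note, candidate_labels=["metastatic", "not metastatic"])
print(result["labels"][0], round(result["scores"][0], 3))  # top label + score
```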
Shapiro explained further that more recently, his ML team has started working with large language models (LLMs). He noted this space has significant potential within health care.
“[At Flatiron Health] we built out a tool at the point of care that matches practice-authored regimens to NCCN guidelines,” Shapiro said. “That's something that we're really excited about.”
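Flatiron has not described how that matching works. Purely as a loose illustration of the matching idea, a naive string-similarity sketch (with invented regimen names, nothing resembling the real tool) might look like this:

```python
# Naive illustration of matching a practice-authored regimen name against a
# reference list; a real tool would compare drug components, doses, and
# schedules, not just strings. All names are invented for the example.
import difflib

reference_regimens = ["FOLFOX", "FOLFIRI", "CAPOX", "Carboplatin + Paclitaxel"]
practice_entry = "mFOLFOX6"

match = difflib.get_close_matches(practice_entry, reference_regimens,
                                  n=1, cutoff=0.5)
print(match)  # ['FOLFOX']
```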
Notably, Shapiro's background is not in health care; he worked for many years at Spotify, where he built personalized recommendation engines using artificial intelligence (AI) and ML.
“I really got excited about machine learning and AI in the context of building personalized recommendation engines [at Spotify],” Shapiro explained. “While personalizing music for a place like Spotify is radically different from personalizing medicine, I think there's actually some core things that really connect them, and I believe strongly that ML and AI have a key role to play in making truly personalized medicine a reality.”
Shapiro noted that terminology can pose challenges for professionals in health care as they begin to navigate terms that carry decades of research and thousands of dissertations behind them. Terms such as LLM, natural language processing (NLP), generative AI, AI, and ML each represent a deep body of knowledge. Shapiro distinguished this collection of terms from workflow automation, another term in the same field that is often grouped with them; unlike these newer technologies, workflow automation has well-established ways of evaluating quality.
“With something like generative AI—which is, I think, one of the most hyped things out in the world right now—it's so new that there really aren't ways that we can think about quality,” Shapiro said. “That's why I think it's really important to get educated and understand what's going on [around these terms].”
According to Shapiro, a lot of these terms get used interchangeably, which can lead to additional confusion.
“I think that there's a good reason for that, which is that there's a lot of overlap,” Shapiro said. “The same algorithm can be a deep learning algorithm and an NLP algorithm, and a lot of the applications are also the same.”
Shapiro noted that one way of structuring these terms is to think of AI as a very broad category that encompasses ML, deep learning, and generative AI as nested subcategories. NLP, however, overlaps these categories only partially.
“There is an enormous amount of overlap between NLP and AI. A lot of the major advances in ML and AI stemmed from questions from NLP. But then there are also parts of NLP that are really distinct. [For example,] rules-based methods of parsing text are not something that I would think of as AI, and I will caveat this by saying that this is contentious,” Shapiro said. “If you google this, there will be 20 different ways that people try to structure this. My guidance is to not get too bogged down in the labels, but really try to focus on what the algorithm is or the product is that you're trying to understand.”
According to Shapiro, one reason that oncologists should care about these terms is that ChatGPT, the most famous LLM in use today, is used by 1 in 10 doctors in their practice, according to a survey conducted over the summer of 2023. Shapiro noted that by the time of his presentation at the ACCC AMCCBS meeting in February 2024, that number had likely increased.
LLMs are, as the name indicates, a type of language model. According to Shapiro, the technical definition of a language model is a probability distribution over a sequence of words.
“So, basically, given a chunk of text, what is the probability that any word will follow the chunk that you're looking at,” Shapiro said. “LLMs are essentially language models that are trained on the internet, so they're enormous.”
According to Shapiro, language models can also be used to generate text. For instance, given the example “My best friend and I are so close, we finish each other's ___,” it is not difficult for humans to fill in the blank with the appropriate word, which in this case would be “sentences.” Shapiro explained that this is very much how language models work.
“Probabilistically, ‘sentences’ is the missing word [in that example], which is very much at the core of what's happening with a language model,” Shapiro said. “In fact, autocomplete, which you probably don't even think about as you see it every day, is an example [of a language model], and it's one of the motivating examples of generative AI.”
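To make that concrete, here is a toy sketch, orders of magnitude removed from an LLM: a bigram model that turns counts from a tiny invented corpus into a probability distribution over the next word.

```python
# A minimal bigram language model: estimate P(next word | current word) from
# raw counts in a tiny toy corpus. Real LLMs condition on much longer
# contexts and are trained on internet-scale text.
from collections import Counter, defaultdict

corpus = ("we finish each other's sentences . "
          "we finish each other's sentences . "
          "we finish each other's sandwiches .").split()

follow_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follow_counts[current][nxt] += 1

def next_word_distribution(word):
    counts = follow_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_distribution("other's"))
# 'sentences' gets probability 2/3, 'sandwiches' 1/3 -> "sentences" wins
```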
To be clear in terms of definition, Shapiro noted that generative AI refers to AI models that generate new content. Specifically, the “GPT” in ChatGPT (which is both an LLM and generative AI) stands for generative pre-trained transformer. According to Shapiro, pre-trained models can be understood as having a foundational knowledge, in contrast to other kinds of models that do just one task.
“I mentioned my team works on building models that will extract metastatic status from documents, and that's all they do,” Shapiro said. “In contrast, pre-trained models can do a lot of different kinds of things. They can classify the sentiment of reviews, they can flag abusive messages, and they probably are going to write the next 10 Harry Potter novels. They can extract adverse events from charts, and they can also do things that extract metastatic status. So, that's a big part of the appeal—one model can do a lot of different things.”
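One way to picture that versatility: the same pre-trained checkpoint can be pointed at unrelated tasks just by changing the candidate labels. A hypothetical sketch (tasks, labels, and texts are illustrative, not a Flatiron workflow):

```python
# One pre-trained model, several tasks: the same zero-shot checkpoint is
# reused with different label sets. Examples are invented for illustration.
from transformers import pipeline

model = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Task 1: classify the sentiment of a review.
review = "The infusion center staff were wonderful and the wait was short."
print(model(review, candidate_labels=["positive", "negative"])["labels"][0])

# Task 2: flag an abusive message.
message = "You are worthless and everyone knows it."
print(model(message, candidate_labels=["abusive", "benign"])["labels"][0])
```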
However, one model's capacity to do many different things can also come with a trade-off in quality. Shapiro explained that this is something his team at Flatiron Health has found to be true in its work.
“What we've found at Flatiron Health is that generally, purpose-built models can be much better at actually predicting or doing one task. But one thing that's become really exciting, and kind of gets into the weeds of LLMs, is this concept of taking a pre-trained model and fine-tuning it on labeled examples, which is a way to really increase the performance of a pre-trained model.”
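In outline, that fine-tuning loop is straightforward, even though doing it well on clinical data is not. A minimal sketch using the Hugging Face Trainer API, with two invented labeled examples standing in for a real expert-labeled dataset:

```python
# Sketch of fine-tuning a pre-trained model on labeled examples. The base
# model, labels, and texts are assumptions for illustration only.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

examples = Dataset.from_dict({
    "text": ["new hepatic lesions consistent with metastatic spread",
             "no evidence of distant disease on restaging scans"],
    "label": [1, 0],  # 1 = metastatic, 0 = not metastatic (invented labels)
})
tokenized = examples.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding=True),
    batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=tokenized,
)
trainer.train()  # updates the pre-trained weights using the labeled examples
```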
Further, the “T” in ChatGPT stands for “transformer,” a type of deep learning architecture developed at Google in 2017, explained Shapiro. It was originally described in a paper called “Attention Is All You Need.”
“Transformers are actually kind of simple,” Shapiro said. “If you read about the history of deep learning, model architectures tended to get more and more complex, and the transformer actually stripped away a fair amount of this complexity. But what's been really game changing is how big they are, as they're trained on the internet. So things like Wikipedia, Reddit—these huge corpuses of text—have billions of words, and they're really, really expensive to train.”
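The core operation behind that simplicity, scaled dot-product attention from the 2017 paper, fits in a few lines. A toy numpy rendering, with arbitrary shapes chosen for illustration:

```python
# Scaled dot-product attention, the central operation of the transformer
# ("Attention Is All You Need," Vaswani et al., 2017). Toy shapes only.
import numpy as np

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V  # each output is a weighted mix of the values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 tokens, 8-dimensional queries
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(attention(Q, K, V).shape)  # (4, 8): one mixed vector per token
```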
Yet their size is what has led to the incredible breakthroughs in performance and benchmarks that have caused quite a bit of buzz recently, explained Shapiro. With this buzz and attention comes the importance of becoming more educated about what these models are and how they work, especially in areas such as health care.
3 Key Takeaways
- Large language models (LLMs) like ChatGPT have the potential to be valuable tools in oncology. They can be used to extract data from unstructured documents, summarize visit notes, predict patient response to treatment, and discover new drug targets.
- There are challenges associated with using LLMs, such as hallucinations and bias. It is important to be aware of these challenges and to take steps to mitigate them, such as using high-quality data and carefully validating the models.
- Health care professionals need to become more educated about AI and ML. This will help them understand the potential benefits and risks of these technologies and use them safely and effectively.
“With 10% of doctors using ChatGPT, it is something that everyone really needs to get educated about pretty quickly. I also just think there are so many exciting ways that ML and AI have a role to play in the future of oncology,” Shapiro said.
Shapiro explained further that these models create the potential in oncology to conduct research drawing on enormous patient populations, which can be made available at scale. Additionally, there is the potential to summarize visit notes from audio recordings, to predict patient response to a treatment, and to discover new drug targets.
“There are huge opportunities in ML and AI, but there are also a lot of challenges and a lot of open questions. When you see someone like Sam Altman, the CEO of OpenAI, going to Congress and asking it to be regulated, you know that there's something to pay attention to,” Shapiro said. “That's because there's some real problems.”
Such problems include hallucinations, in which models invent answers. Shapiro explained that what makes hallucinations by AI models even more pernicious is that they come from a place of technological authority.
“There's an inherent inclination to trust them,” Shapiro said. “There's a lot of traditional considerations for any type of ML or AI algorithm around whether they are biased, whether they are perpetuating inequity, and whether data shifts affect their quality. For this reason, I think it's more important than ever to really think closely about how you're validating the quality of models. High quality ground truth data, I think, is essential for using any of these types of ML or AI algorithms.”
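In outline, that kind of validation is simple, even if curating trustworthy ground truth is the hard part. A hypothetical sketch comparing model predictions against expert-confirmed labels with scikit-learn (all numbers invented):

```python
# Validating model output against expert-labeled ground truth; the labels
# and predictions here are invented purely for illustration.
from sklearn.metrics import accuracy_score, classification_report

ground_truth = [1, 0, 1, 1, 0, 0, 1, 0]  # abstractor-confirmed labels
model_output = [1, 0, 1, 0, 0, 1, 1, 0]  # model predictions, same charts

print(accuracy_score(ground_truth, model_output))  # 0.75
print(classification_report(ground_truth, model_output,
                            target_names=["not metastatic", "metastatic"]))
```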
Reference
Shapiro W. Deep Dive 6. Artificial and Business Intelligence Technology. Presented at: ACCC AMCCBS; February 28-March 1, 2024; Washington, DC.