How risky is that biotech trial?


There’s nothing like a clinical trial failure to rev up biotech’s Monday-morning quarterbacks.

That was the story with BridgeBio, which lost about two-thirds of its market value in late 2021 after its drug candidate acoramidis failed in a Phase 3 study in transthyretin amyloidosis (ATTR). In the aftermath, one amyloidosis expert claimed on Twitter that the flame-out was “completely anticipated” based on the study design.

But in fact, such a bold claim is almost certainly false.

Every planned clinical trial has many options for the population, endpoints, and other design elements that could be logical and defensible, and experts can (and do) disagree on the “best” choices. As an investor or other external observer, it’s extremely rare to find a trial that’s absolutely certain to fail. (And if you do, you should short the stock!)

In reality, clinical studies in the same indication may have very different odds of success, depending on how the sponsors balanced risk, cost, time, and potential reward when choosing key elements of the design. And beyond trial design, a study’s risk also depends on the underlying science, the strength of the existing data, and the sponsor’s execution capabilities. That’s why even identically designed studies in the same indication of drugs with different mechanisms, run by different companies, might have radically different chances of yielding positive results.

This leads to a key question that plagues drug companies and their investors: How should one handicap the odds of success of a trial before the cards are flipped over?

In this post, I’ll describe a framework for thinking about clinical trial risk I’ve been testing and refining with biopharma and investor clients. This is still a work in progress, but so far I’ve found it to be a good starting point for systematically assessing the risk of clinical-stage drug development programs.

CONTENTS OF THIS POST:

  1. What’s missing from how we currently assess biotech clinical development risk?

  2. A framework for describing clinical trial risk

  3. Some commonly held ideas that this framework explicitly addresses

  4. Next steps

Before we dig in, two quick disclaimers: First, although I don’t have any conflicts to report related to companies mentioned in this post, as a professional consultant I earn most of my livelihood from drugmakers and investors, so you should generally interpret everything I say through that lens. Second, I wrote a book about biotech drug development, The Pharmagellan Guide to Analyzing Biotech Clinical Trials, which aims to help non-experts become more confident readers of press releases, investor decks, meeting presentations, and journal articles. Buy it on Amazon, or visit the Pharmagellan website to learn more and get a free excerpt.

What’s missing from how we currently assess biotech clinical development risk?

Many investors and R&D teams spend a ton of time trying to quantify the risk of clinical programs to make decisions about investing and deal-making. These analyses are often conducted by deep experts in drug development with extensive experience assessing data and trial designs. However, I’ve seen several scenarios in which there’s a need for a more systematic approach. For example:

  • In comparing programs in disparate indications, it can be hard to account for different levels of knowledge about the pathophysiology, mechanisms of action (MOAs), and patients that affect the drug candidates’ different intrinsic levels of risk, independent of their data sets.

  • Some sponsors set a too-permissive bar for advancing a drug candidate into mid- and late-stage clinical studies, but the flimsiness of the supporting data isn’t always explicitly incorporated into the risk assessment of the upcoming trial.

  • Financial and strategic investors may struggle to explicitly quantify why and how much a trial’s odds of success might be affected by the resources, capabilities, and expertise – or lack thereof – of the company running the study.

  • It’s hard to systematically catalog all of the factors that may make a trial “poorly designed,” let alone decide how these factors quantitatively affect the study’s odds of success.

What would a better approach look like? Ideally, it would incorporate all of the important factors that contribute to the riskiness of a clinical study, without any overlaps between categories. It would also be useful in a broad range of use cases: assessing similar or highly diverse programs, within a single company or across several, and for the purposes of supporting internal strategic planning, transactions, or financial investments. And finally, it would provide a strong foundation for converting qualitative, descriptive findings into numerical inputs into valuation models.

A framework for describing clinical trial risk

The rubric I’ve been progressively refining for “clinical trial risk” is an initial step toward filling the gaps outlined above. Its categories are intended to be “mutually exclusive, collectively exhaustive” (MECE), to use the standard consulting lingo:

  • Data-driven risk refers to the strength (quality, quantity, and presence vs. absence) of the evidence for this particular drug candidate.

  • Intrinsic scientific risk boils down to the “strength of the science” in a particular indication, independent of the particular drug candidate: How well do we understand the disease, MOA/target, patients, and how these elements fit together?

  • Sponsor-dependent risk encompasses all of the factors that could cause two teams or companies to have a different probability of success (POS) for the exact same asset and indication. In other words, the risk factors in this category depend on who’s running the clinical development program.

  • Trial design risk includes factors related to key attributes of the upcoming or in-progress study that might impact its odds of success, such as whether the patient population was enriched for likely responders with a biomarker or other clinical characteristics. It also includes the risk of “underpowering,” which means that due to the chosen study size, the minimum difference the trial can detect is larger than the magnitude of the effect that the drug is likely to have on patients (see the sketch after this list).
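To make “underpowering” concrete, here’s a minimal sketch in Python of how one might back out the smallest effect a trial of a given size can reliably detect, using the standard normal approximation for a two-arm comparison of means. The sample size, outcome SD, and expected effect below are hypothetical placeholders, not values from any real trial.

```python
# A minimal sketch of the "underpowering" check, assuming a two-arm trial
# comparing means with equal arms and a known outcome SD (normal
# approximation). All parameter values below are hypothetical.
from scipy.stats import norm

def minimum_detectable_difference(n_per_arm: int, sd: float,
                                  alpha: float = 0.05,
                                  power: float = 0.80) -> float:
    """Smallest true between-arm difference the trial can reliably detect
    (two-sided test, normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = norm.ppf(power)          # quantile for the desired power
    return (z_alpha + z_power) * sd * (2 / n_per_arm) ** 0.5

# Hypothetical example: 60 patients per arm, outcome SD of 10 points.
mdd = minimum_detectable_difference(n_per_arm=60, sd=10.0)
expected_effect = 3.0  # hypothetical plausible drug effect, e.g., from prior data
if expected_effect < mdd:
    print(f"Underpowered: detects >= {mdd:.1f}, but we expect ~{expected_effect}")
```

If the effect you plausibly expect from prior data is smaller than the minimum detectable difference, the trial carries real design risk no matter how good the drug is.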

Before continuing, I’d like to highlight two important points. First, it’s worth reiterating that in this rubric data-driven risk and intrinsic scientific risk are mutually exclusive: the first is about the specific asset, whereas the second is more generally about our knowledge of the clinical and scientific area. In other words, intrinsic scientific risks persist until a drug gets approved, independent of the quantity or quality of data, unless the underlying science advances. A new anti-PD-1 antibody in non-small cell lung cancer that has yet to be clinically tested has high data-driven risk due to the absence of data, but low intrinsic scientific risk, because the MOA is known to work in that indication. In comparison, the intrinsic scientific risk of a Phase 3 anti-amyloid drug for Alzheimer’s disease is persistently high, even with promising Phase 2 data on surrogate endpoints, because our understanding of the disease is abysmal, and when tested in the clinic this MOA hasn’t accomplished much (if anything).

Second, because the focus here is strictly on clinical development risk for a particular trial, this framework doesn’t include regulatory or market access risks. A trial could have decent odds of delivering a positive result, but depending on the endpoints, effect size, and other trial design factors, even positive results may be unlikely to lead to regulatory approval or favorable pricing and coverage. Conversely, a trial could be optimally designed from a regulatory or pricing perspective, but it may still be high-risk due to other factors. In a typical risk-adjusted model, one would handle these different sources of risk (clinical development, regulatory, and market access) separately.
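To illustrate what handling these sources of risk separately looks like in a typical risk-adjusted model, here’s a minimal sketch that multiplies independent probabilities for each bucket into a cumulative POS and a risk-adjusted value. Every number below is an invented placeholder, not an estimate for any real program.

```python
# A minimal sketch of keeping the three risk buckets separate in a
# risk-adjusted model. Every number below is an invented placeholder,
# not an estimate for any real program.
p_clinical   = 0.45  # odds the trial reads out positive
p_regulatory = 0.85  # odds a positive readout leads to approval
p_access     = 0.70  # odds of favorable pricing/coverage, given approval

p_overall = p_clinical * p_regulatory * p_access  # assumes independence

unadjusted_value = 500.0  # hypothetical asset value if all risks resolve ($M)
risk_adjusted_value = p_overall * unadjusted_value
print(f"Cumulative POS: {p_overall:.1%}; "
      f"risk-adjusted value: ${risk_adjusted_value:.0f}M")
```

Keeping the buckets separate makes it obvious which lever is dragging down the overall POS, rather than burying everything in a single blended number.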

Some commonly held ideas that this framework explicitly addresses

I’ve road-tested this model with drug companies and financial investors, and it’s helped us crystallize many factors related to clinical development risk that are generally accepted and intuitively well understood, even though they aren’t always explicitly incorporated into decision-making. Here are three examples:

1.    Data can’t completely overcome intrinsic scientific risk.

This framework highlights two crucial and related issues related to intrinsic scientific risk. The first is that intrinsic risk varies widely across indications and targets. This is an important consideration for financial investors, who often play broadly across medical areas, but it also applies within large pharma organizations that are organized by therapeutic area. If you’re leading business development or R&D in a TA like neurology or “rare diseases,” you’re evaluating and comparing programs in indications that vary widely in terms of our collective knowledge of the underlying pathophysiology, key targets, natural history, patient segments, etc. It’s simply implausible to think that Phase 3 programs in ischemic stroke, Alzheimer’s disease, and Dravet syndrome have similar odds of success just because they’re all in the same broad therapeutic area. The same is also true in more “homogeneous” TAs like oncology, where there’s a wide gap in terms of our understanding of these factors between, say, leiomyosarcoma (overall low-quality knowledge base) and EGFR-mutant non-small cell lung cancer (relatively better-understood). To compare the odds of success of clinical-stage programs, you need to incorporate some appreciation of the fact that some indications and targets are inherently riskier than others.

That leads to the second point, which is that although positive data are better than negative data (or none at all), the common biotech refrain that “data trumps everything” is not entirely true. Intrinsic scientific risks related to a particular indication and target are separate from asset-specific risks, and they don’t completely disappear in the face of supportive experimental evidence. For example, consider two agents for acute ischemic stroke gearing up for Phase 2b trials with identical study designs. One is a novel neuroprotectant with a newly hypothesized mechanism of action that has apparently positive results on a short-term, surrogate imaging endpoint from a single-arm study of 12 patients. The other is a “me-too,” next-generation thrombolytic that has completed Phase 1 but lacks any efficacy data. Although the first drug has more clinical data, I’d argue that the upcoming trial of the second is less risky, all else being equal.

2.    The quality of data supporting superficially similar clinical-stage assets can vary widely.

All experienced biopharma folks know that two clinical-stage programs in the same indication can be very different in terms of the strength of prior trial data. But the differences in the quality of evidence between clinical assets can be extremely hard to evaluate systematically. There has been some great qualitative work looking at predictors of clinical success and failure, and many pharma companies have developed structured approaches to help teams and executives systematically evaluate how a program’s data package impacts its likely odds of success. (If you want to dig into the literature in this area, I recommend starting with the NIH’s report on Phase 3 failures after success in Phase 2; this paper and its sequel on AstraZeneca’s “5R” approach; this discussion of Lilly’s Chorus model; and Robert Plenge’s views on reducing risk in early development, alongside this commentary by Derek Lowe.) Nonetheless, it’s still a challenge to robustly quantify and score an asset’s clinical data in a way that takes into account both the results and the design of the trial.

One approach I’ve been taking recently, focused specifically on the strength of the design of a prior study, is to methodically catalog the most common “red flags” in the setup of a Phase 2 trial that suggest its positive predictive value may be poor. These include a non-RCT design, a non-ITT analysis that biases in favor of the new drug, an overly permissive P value threshold, a high level of fragility, and tests of multiple subgroups without appropriate statistical safeguards. (If you’re not an expert in study design or biostats, these topics and others are covered in my book on analyzing biotech clinical trials.) This is still an early work in progress, but I’m optimistic that it could mature into a set of checklists, modeled after the Cochrane reviews, that would enable one to more fairly and systematically compare the quality of clinical data between disparate assets.
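As a minimal sketch of how such a red-flag catalog might be operationalized, here’s one way to encode the flags listed above as a simple checklist. The flag set mirrors the examples in the text, but the data structure and the example trial are hypothetical, and this is not a validated scoring instrument.

```python
# A minimal sketch of a red-flag checklist for a prior Phase 2 trial. The
# flags mirror the examples in the text; the example trial is hypothetical,
# and this is not a validated scoring instrument.
PHASE2_RED_FLAGS = {
    "not_randomized_controlled": "Non-RCT design (e.g., single-arm study)",
    "non_itt_analysis": "Analysis population biased in favor of the new drug",
    "permissive_p_threshold": "Overly permissive P value threshold",
    "fragile_result": "Result hinges on a small number of patients or events",
    "unadjusted_subgroups": "Multiple subgroups tested without statistical safeguards",
}

def list_red_flags(trial: dict) -> list[str]:
    """Return descriptions of the red flags present in a trial's design."""
    return [desc for flag, desc in PHASE2_RED_FLAGS.items() if trial.get(flag)]

# Hypothetical description of a prior Phase 2 trial:
prior_trial = {"not_randomized_controlled": True, "fragile_result": True}
flags = list_red_flags(prior_trial)
print(f"{len(flags)} red flag(s) found:")
for f in flags:
    print(" -", f)
```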

3.    Many sponsor-dependent factors can impact a trial’s odds of success.

The company-specific risks listed in the rubric could cause two companies to have different odds of success for the exact same program. This has implications for both strategic and financial investors. A biotech asset that looks favorable in terms of the other types of risk (intrinsic, asset-specific, etc.) might live in a company with poor capabilities, expertise, funding, or leadership, which means the clinical program is riskier than it would be if another firm owned it. Such a drug candidate might be attractive to a bigco acquirer that can offset those sponsor-dependent weaknesses. Similarly, a “non-core” asset in a large pharma company might have higher risk than other pipeline programs because it doesn’t benefit as much from the company’s high-functioning execution capabilities, but if it were spun out into a biotech focused on that particular clinical area, its odds of success might improve.

Next steps

I think the framework described above can help biotech investors and drug developers more rigorously codify the various sources of clinical development risk for pipeline programs. I’m excited to improve it based on feedback on its utility and limitations (send me an email!) and also subject it to further road tests on clinical-stage assets. As part of that latter effort, I’ll be posting some analyses of real-life examples, including the one with which I opened this post: How might one have assessed the risk of the Phase 3 trial of BridgeBio’s acoramidis before the study’s readout? (If you haven’t already, sign up to get email updates when new blog posts are released.)

But to be clear, there’s still one large outstanding challenge that needs to be addressed: how should one convert the qualitative findings for a particular asset into an explicit POS percentage that can be incorporated into a risk-adjusted valuation model or an R&D portfolio planning tool (like in this article that I co-wrote)? There’s no obvious answer, but one possible approach might be to use this framework to customize “generic” POS values from the literature (summarized in our biotech forecasting book) based on the particular attributes of the specific asset, indication, and sponsor. This is an active area of interest for me, and I hope to have more to say about it in subsequent posts.
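As a minimal sketch of that possible approach, here’s how one might scale a generic, literature-derived POS by adjustment factors tied to the four risk categories in the rubric. The base rate, multipliers, and cap below are invented placeholders, purely to illustrate the mechanics, not calibrated values.

```python
# A minimal sketch of customizing a "generic" literature-derived POS using
# the four risk categories in the rubric. The base rate, multipliers, and
# cap are invented placeholders, purely to illustrate the mechanics.
def customized_pos(base_pos: float, multipliers: dict[str, float]) -> float:
    """Scale a generic POS by per-category adjustment factors, capped at 95%."""
    pos = base_pos
    for factor in multipliers.values():
        pos *= factor
    return min(pos, 0.95)

# Hypothetical: a generic Phase 3 base rate, adjusted for one asset's profile.
adjusted = customized_pos(
    base_pos=0.55,  # placeholder "generic" Phase 3 success rate
    multipliers={
        "data_driven": 1.10,           # unusually strong prior data package
        "intrinsic_scientific": 0.80,  # poorly understood indication/target
        "sponsor_dependent": 0.95,     # thin late-stage execution track record
        "trial_design": 1.05,          # biomarker-enriched population
    },
)
print(f"Adjusted POS: {adjusted:.0%}")  # ~48% in this invented example
```

The hard part, of course, is calibrating those multipliers, which is exactly the open question flagged above.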

 


 