Rediscovery Is Not Discovery: How We Built an Oncology Target ID Engine That Knows the Difference

Of the roughly 20,000 genes in the human genome, only about 1% have ever crossed the finish line as the target of an approved oncology drug or a viable Phase II candidate. For any Head of Research & Development, that 1% is the only thing that matters. The other 99% is noise: expensive, time-consuming noise.

When we built our internal Target Identification platform, we didn't just want a model that could “find targets”. Anyone can generate a list of 500 genes. We wanted an engine that could rank the known winners at the top alongside exciting novel targets. To prove it works, we gave it a blind test: could it rediscover the industry’s biggest hits without being told where they were?

The Honesty Metric: Why We Use PR-AUC

Standard data-science metrics like ROC-AUC (Receiver Operating Characteristic – Area Under the Curve) don't always tell the full story in biology, and can be deceptively flattering. When 99% of genes aren't good targets, a model can earn a high score largely by ranking the 'noise' correctly. The real challenge, and the real value, lies in the 1% where the signal actually lives.

Instead, we use PR-AUC (Area Under the Precision-Recall Curve). In a discovery landscape where the success rate is only 1%, a random guess gives you a PR-AUC of about 0.01. To make sure the engine is robust, we do not evaluate it on a single disease: we test it across the entire oncology landscape, specifically 33 different oncology indications, including those from The Cancer Genome Atlas (TCGA). When evaluating the 50 top candidate targets, our model achieved an average PR-AUC of 0.062.

On paper, 6.2% might sound humble. In reality, it is more than six times the 1% you would expect from random chance. It means that when our AI flags a Top 50 target, it is roughly six times more likely to be clinically viable than a target picked through traditional, unsupervised search.
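To make the comparison with chance concrete, here is an illustrative sketch in Python, not our production code. It uses average precision, one common estimator of PR-AUC, and the function names and the toy ranking are assumptions made for the example.

```python
def average_precision(ranked_labels):
    """Average precision for a ranked list of 0/1 labels (1 = known target).

    Average precision is one common estimator of PR-AUC.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision measured at this hit
    return precision_sum / hits if hits else 0.0


def fold_over_random(score, prevalence):
    """Ratio of a PR-AUC score to the random baseline (the positive rate)."""
    return score / prevalence


# Toy ranking (hypothetical): known targets sit at ranks 1 and 4 out of 10.
toy_ranking = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
toy_ap = average_precision(toy_ranking)  # (1/1 + 2/4) / 2
toy_fold = fold_over_random(toy_ap, sum(toy_ranking) / len(toy_ranking))

# Figures quoted in this post: PR-AUC of 0.062 against a ~1% base rate.
reported_fold = fold_over_random(0.062, 0.01)

print(f"toy AP = {toy_ap:.2f}, toy fold over random = {toy_fold:.2f}")
print(f"reported fold over random = {reported_fold:.1f}")
```

The point of the baseline division is that a PR-AUC only means something relative to how rare the positives are: the same 0.062 would be unimpressive if a tenth of all genes were good targets.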

Escaping the Cancer Hallmark Trap

Most AI models in oncology are too smart for their own good. They quickly learn the Hallmarks of Cancer and start recommending the same famous genes (like ERBB2 or EGFR) for every single indication.

Because these targets are so well studied and have been tested across dozens of cancers, they become "data magnets": they rank highly almost everywhere, and not necessarily because they are the right precision fit for your specific patient subgroup of interest.

Seeing these famous genes at the top is a reassuring sanity check, but it isn’t Precision Medicine. To be truly effective, our platform operates with two complementary modes:

The Benchmark Mode is designed to rediscover the industry's greatest hits and confirm that the engine can decode the biological patterns behind known clinical success.

The Discovery Mode is where we move into uncharted territory. By masking the dominant signals from cancer hallmarks and penalising the 'data magnets', we force the engine to look deeper into the unique microenvironment of a specific tumor — and surface novel, indication-specific candidates.
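One simple way to picture the 'data magnet' penalty is as a re-ranking step. The sketch below is a hypothetical illustration rather than our actual scoring: the linear penalty, the `magnet_score` input (the fraction of indications in which a gene ranks highly), and all numbers are assumptions chosen for the example.

```python
def discovery_rerank(relevance, magnet_score, penalty=1.0):
    """Rank genes after subtracting a penalty for pan-cancer ubiquity.

    relevance:    gene -> model score for the indication of interest
    magnet_score: gene -> fraction of indications where the gene ranks highly
    penalty:      0 reproduces the unpenalised ranking; larger values push
                  ubiquitous genes further down
    """
    adjusted = {
        gene: score - penalty * magnet_score.get(gene, 0.0)
        for gene, score in relevance.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)


# Hypothetical scores; GENE_A and GENE_B stand in for indication-specific genes.
relevance = {"ERBB2": 0.95, "EGFR": 0.90, "GENE_A": 0.70, "GENE_B": 0.60}
magnet_score = {"ERBB2": 0.90, "EGFR": 0.80, "GENE_A": 0.10, "GENE_B": 0.05}

benchmark_like = discovery_rerank(relevance, magnet_score, penalty=0.0)
discovery_like = discovery_rerank(relevance, magnet_score, penalty=1.0)

print("no penalty:  ", benchmark_like)
print("with penalty:", discovery_like)
```

With the penalty switched off, the famous genes lead the list; with it switched on, the indication-specific placeholders move to the top, which is the behaviour the two modes are meant to contrast.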

Our Case Study: Target ID in Bladder Cancer

Muscle-invasive bladder cancer (urothelial carcinoma) is a perfect example of why we needed to evolve our models. It is a disease defined by clear genomic drivers, such as FGFR3 mutations and cell cycle dysregulation, yet it remains notoriously difficult to treat due to its high degree of genetic heterogeneity. In this context, precision medicine isn't just a buzzword; it refers to the model's ability to distinguish between a general cancer signature and the specific vulnerabilities of a patient’s unique tumor microenvironment.

We first tested our Benchmark Mode, designed to identify the "industry hits." In a blind search among 20,000 genes, the engine successfully prioritized the following in its Top 50:

Immune Checkpoints: CTLA4, TIGIT, IDO1
Known cancer markers: ERBB2 (HER2), TROP2

Figure 1: Example target ranking results for bladder cancer, from the Benchmark Mode

By consistently ranking these established targets in our top tier, the engine demonstrates that it can identify the complex biological patterns associated with existing clinical success.

Switching to Discovery Mode produced a different picture. The famous names dropped out of our top 50 — and that wasn't a failure, it was the goal. With the pan-cancer noise penalised, the engine surfaced novel, bladder-specific candidates that are now undergoing wet-lab validation in our facilities.

From Triage to Wet Lab

Running Discovery Mode on bladder cancer wasn't a one-off experiment — it was the front door of a real R&D pipeline. Combining our AI's computational power with expert oversight in a Human-in-the-Loop approach, we moved from data to validated assets in record time.

Across two bladder cancer patient cohorts (N=217 and N=105), our funnel compressed the search space dramatically:

314 Targets Triaged across the genome: 260 deprioritized, 40 flagged as high-risk, and 14 surfaced as high-probability hits.
9 Cleared for deep dive: our internal expert review committee advanced 9 of the 14 hits to deep mechanistic review, dropping 5 on various biological grounds.
6 Internal Approvals: those mechanistic deep dives produced 6 targets cleared to enter the validation pipeline.
Active progress today: 2 targets in active validation and 1 already in Hit Identification as an Antibody-Drug Conjugate (ADC).
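The arithmetic of the funnel is easy to check. This short sketch uses only the counts quoted above; the variable and function names are ours for illustration.

```python
# Stage counts quoted in this post.
funnel = [
    ("triaged", 314),
    ("high-probability hits", 14),
    ("cleared for deep dive", 9),
    ("internal approvals", 6),
]
triage_breakdown = {
    "deprioritized": 260,
    "high-risk": 40,
    "high-probability hits": 14,
}


def stage_retention(stages):
    """Fraction of candidates kept at each step relative to the previous one."""
    return [
        (name, count, count / prev_count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]


# The three triage buckets should account for every triaged target.
breakdown_total = sum(triage_breakdown.values())
overall_retention = funnel[-1][1] / funnel[0][1]

for name, count, kept in stage_retention(funnel):
    print(f"{name}: {count} ({kept:.1%} of previous stage)")
print(f"overall: {overall_retention:.1%} of triaged targets approved")
```

Put differently, roughly 2% of the triaged targets made it through to internal approval.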

That ADC candidate is where biological discovery runs into its next constraint: druggability. A great target on paper isn't necessarily a great target for an antibody to deliver a payload to.

Beyond Biological Relevance: An ADC-specific engine

Identifying a "hit" is only half the battle. The next question is: How do we drug it? A gene might be a critical driver of bladder cancer, but if its protein remains hidden inside the cell, it is invisible to an ADC. To function as a "smart bomb", an ADC target must be highly expressed on the cell surface, tumor-specific to limit on-target, off-tumor toxicity, and able to internalize the drug to release the warhead.
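Those three requirements can be read as a simple gate. The sketch below is a hypothetical simplification: the field names, the 0-to-1 scales, the thresholds, and the placeholder targets are all assumptions for illustration, not the features our ADC engine actually uses.

```python
from dataclasses import dataclass


@dataclass
class SurfaceCandidate:
    name: str
    surface_expression: float  # 0..1, hypothetical normalised surface level
    tumor_specificity: float   # 0..1, hypothetical tumor-vs-healthy contrast
    internalizes: bool         # does the bound antibody get taken into the cell?


def passes_adc_gate(candidate, min_surface=0.6, min_specificity=0.6):
    """True only if all three ADC requirements are met (thresholds assumed)."""
    return (
        candidate.surface_expression >= min_surface
        and candidate.tumor_specificity >= min_specificity
        and candidate.internalizes
    )


# Placeholder candidates, not real measurements.
candidates = [
    SurfaceCandidate("TARGET_A", 0.9, 0.8, True),   # meets all three criteria
    SurfaceCandidate("TARGET_B", 0.9, 0.3, True),   # also abundant on healthy tissue
    SurfaceCandidate("TARGET_C", 0.1, 0.9, False),  # intracellular driver
]
eligible = [c.name for c in candidates if passes_adc_gate(c)]
print(eligible)
```

TARGET_C shows the case described above: a gene can matter biologically and still fail the gate because an antibody cannot reach it.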

A model designed for biological relevance isn't enough for this. We needed a model calibrated to the drug modality and mechanism of action from day one. So we built a specialized ADC engine, designed to find the best 'delivery hubs' for toxic warheads, even when those targets aren't the primary biological drivers of the tumor.

Our ADC engine ranked the gold standards for bladder cancer — Nectin-4 and TROP2 — in its top tier, demonstrating it can identify the surface-protein patterns that drive ADC clinical success.

But the real differentiator is control. A scientist can manually adjust the weights of specific biological features to match a strategic goal:

In-Licensing: Dial up the weight of established clinical data and known pathways to de-risk an asset under consideration for acquisition.
Strategic Positioning: Find fast follower targets — candidates in a class where a first mover has already validated the biology — that share high-probability signatures with existing hits, but offer differentiation through better safety profiles or different patient subgroups.
Repurposing: Test whether a target validated in lung cancer has the right biological profile for a new trial in bladder cancer.
De Novo Discovery: Dial up the novelty weight to shift focus toward non-coding genes and unexplored proteins—the untapped 1% your competitors haven't noticed.
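Conceptually, each of these strategies is a different set of weights over the same features. The sketch below is a minimal illustration under stated assumptions: the feature names, the preset values, and the placeholder candidates are hypothetical, and a normalised weighted sum is just one simple way to combine them.

```python
# Hypothetical presets: larger numbers mean the feature counts for more.
PRESETS = {
    "in_licensing": {"clinical_evidence": 3.0, "novelty": 0.5, "indication_fit": 1.0},
    "de_novo": {"clinical_evidence": 0.5, "novelty": 3.0, "indication_fit": 1.0},
}


def weighted_score(features, weights):
    """Weighted mean of feature values; missing features count as 0."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * features.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight


def rank_candidates(candidates, preset):
    """Order candidate names from best to worst under the chosen preset."""
    weights = PRESETS[preset]
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Placeholder candidates with made-up feature values in the range 0..1.
candidates = {
    "ESTABLISHED_X": {"clinical_evidence": 0.9, "novelty": 0.1, "indication_fit": 0.6},
    "NOVEL_Y": {"clinical_evidence": 0.1, "novelty": 0.9, "indication_fit": 0.6},
}

print("in-licensing:", rank_candidates(candidates, "in_licensing"))
print("de novo:     ", rank_candidates(candidates, "de_novo"))
```

The same two candidates swap places depending on the preset, which is the kind of control a scientist exercises when matching the ranking to a strategic goal.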

So what’s next?

The 1% problem won't be solved by bigger models or more data. It will be solved by being honest about what success means, building tools that respect the difference between rediscovery and discovery, and trusting expert biologists to steer them. That's what's behind the targets currently in our validation pipeline and the ADC programme moving forward today.

But a methodology that only works inside our walls isn't worth much. The real test is whether other R&D teams, working on different indications, different modalities, and different therapeutic hypotheses, can apply the same approach and compress their own search space the same way. So we’re integrating the models, datasets, and parameters described in this post into K-Pro as AI skills: codified workflows that agents can execute autonomously and that R&D teams can run on their own indications with the same rigor we've applied to ours.

Box 1. What is a "skill" in K-Pro?

Skills are codified scientific methodologies that AI agents can execute autonomously — they're at the core of how K-Pro does real discovery work. Each one packages a validated workflow together with its underlying AI models, curated datasets, and search parameters, so a researcher can run the full method themselves with a single command. The ADC target engine described in this post is one such skill. Other examples include skills to characterize drug targets, investigate drug combinations, and optimize patient subgroups. More are in development — across modalities, indications, and stages of the discovery pipeline.

Figure 2: Example skills in K-Pro (in development)

In practice, that changes what a target ID team can do in a quarter. A lead working on a heterogeneous indication can run a Discovery Mode search in days, not weeks. A team evaluating an in-licensing opportunity can stress-test it against established clinical signatures before committing capital. A scientist hunting for fast-follower opportunities can dial the novelty weight up or down to match the strategic position. What used to require weeks of cross-functional computational biology work is becoming a workflow a target ID team can drive themselves — with the same statistical honesty and hallmark-aware reasoning we've described here.

The next decade of oncology won't be defined by who builds the biggest model. It will be defined by who can tell rediscovery from discovery — and who has the conviction to chase the 1% the rest of the field has missed.