Press release
August 29, 2020

Published in Nature Communications – Owkin's deep learning model predicts RNA-Seq from tumor images

Owkin, the healthcare technology company applying federated learning to medical research, today announces the publication of its HE2RNA paper in ‘Nature Communications’, showcasing its novel tool for genomic analysis. The paper, entitled ‘Transcriptomic Learning for Digital Pathology’, describes how Owkin has developed a detailed and accurate deep learning model to predict RNA-Seq expression of tumours from histology images of digitized biopsies.

Gilles Wainrib, Chief Scientific Officer and Co-Founder of Owkin:

Understanding the relationship between genotype and phenotype is one of the biggest 21st-century challenges in biology. Our research opens a new path to better connect information at the genomic, cellular and tissue levels, and this would not have been possible without recent advances in artificial intelligence.

Massive changes in gene expression are known to occur in many cancers

Understanding and characterizing these disease-related gene signatures can help to clarify disease mechanisms and prioritize targets for novel personalized therapeutic approaches. Traditionally, the only available option for identifying gene expression during carcinogenesis has been to use whole transcriptome sequencing techniques (RNA-Seq) and dedicated bioinformatic tools. However, these analyses are costly and time-consuming. As a result, medical centres do not routinely use them. In oncology, tumour biopsies [or histology whole slide images (‘WSIs’)] are routinely collected in hospitals and research centres as a first step in the diagnostic and treatment pathway. The ready availability of these digitized slides in all research centres makes them a perfect data source for Machine Learning (‘ML’) models.

In recent years, deep ML has had a tremendous impact on various fields in science, driving improvements in areas such as speech recognition and image recognition. Recently, ML models have been applied to histology WSIs to improve the performance of pathologists in determining the diagnosis and grade of cancer patients. While it is becoming clear that the application of such models to tissue-based pathology can be very useful, few attempts have been made to connect specific molecular signatures directly to gene expression patterns within the histology slides.

An ML model that can use these ubiquitously available histology slides to determine gene expression without the need for expensive sequencing techniques has the potential to be an incredibly useful clinical tool. Owkin’s HE2RNA model is named after its capability to predict gene expression (RNA) of numerous tumour genes in 28 different cancers from Hematoxylin-Eosin (HE) stained biopsy slides. The model was also able to highlight (via gene expression) the exact location of each mutation on each WSI, hence creating a Virtual Spatial Transcriptomics map. This interpretability feature, combined with the model’s ability to detect such a broad scope of mutations, offers huge potential to aid patient diagnosis and improve prediction of response to treatment and survival outcome. In this paper, Owkin describes how the model works. The paper also successfully explores the application of HE2RNA to predict genes involved in cancer development and to predict tumour status and response to therapies.

Elodie Pronier, Translational Research Scientist at Owkin and co-author of the paper, on the success of the model:

Our efforts were rewarded by the model’s ability not only to correctly predict the location of a variety of gene expression signatures in each image, but also to transfer the knowledge it learned on bulk data to smaller independent datasets to accurately answer specific clinically relevant molecular questions, such as the identification of tumours with microsatellite instability. We are now excited to explore how HE2RNA’s learning of gene expression can help improve the prediction of other clinical targets on new datasets within our partner research centres, to expand our scope and improve the performance of this model.

Owkin specializes in AI for medical research. Through the application of its technology, the company enables researchers to build ML models on fit-for-AI cohorts, highly curated, multimodal, research-grade longitudinal data, while keeping patient information preserved safely within the hospital’s local infrastructure. Ultimately, this method can result in an acceleration of the clinical research process that offers protected data for patients, exhaustive traceability of computations for institutions, and maximum collaboration for researchers.

Owkin’s proprietary platform, Owkin Studio, integrates these biomedical images, genomics, and clinical data to discover biomarkers and mechanisms associated with disease evolution and treatment outcomes that will propel the next generation of treatment plans and drugs. Owkin’s novel HE2RNA model is available to researchers to apply to their datasets via Owkin Studio and the results published in this paper can be visualized and explored in this demo.

About Owkin

Owkin is the first full-stack TechBio company on a mission to understand complex biology and derive new multimodal biomarkers through AI.

We identify precision therapeutics, de-risk and accelerate clinical trials and develop diagnostics using AI trained on world-class patient data through privacy-enhancing technologies. We merge wet lab experiments with advanced AI techniques to create a powerful feedback loop for accelerated discovery and innovation in oncology, cardiovascular, immunity and inflammation.

Owkin also founded MOSAIC, the world’s largest spatial multi-omics atlas for cancer research across seven cancer indications.

Owkin has raised over $300 million through investments from leading biopharma companies, including Sanofi and BMS, and venture funds like F-Prime, GV and Bpifrance, among others.