September 29-October 1, 2025
Decoding Biology Hackathon
Tackle a large, pre-specified Q&A-based dataset extracted from rich, multimodal patient data. Your mission: push the boundaries of AI reasoning to accelerate biological understanding and fuel the next wave of scientific breakthroughs.

The challenge
Improve artificial intelligence for biomedical research
Owkin has generated a proprietary dataset of more than 300K questions and answers derived from the MOSAIC initiative, the world’s largest spatial omics dataset in oncology.
The MOSAIC initiative includes multimodal data from 2,500+ patients across 10 cancer indications and 6 modalities including bulk RNA-seq, WES, single-cell RNA-seq, H&E, spatial omics and clinical data.
The challenge is to use the set of training questions, along with any public-domain resource, to build or fine-tune agentic systems and improve AI reasoning on biological questions.
The dataset
Spatial Transcriptomics
- Which genes are up in tumor core vs surrounding tissue?
- High-resolution spatial context across multiple cancer types. E.g., Which gene is upregulated in tumor islets versus stroma in lung adenocarcinoma?
Tumor vs Healthy Expression
- Does HERC3 exhibit higher transcript abundance in papillary renal cell carcinoma neoplastic tissue compared to matched non-neoplastic tissue?
- Data from TCGA and Owkin pipelines: built with Owkin’s proprietary Discovery Engine
Gene Indication Features
- Is the gene expression level of PLEKHG6 significantly elevated in bladder urothelial carcinoma tumor tissue compared to normal spleen tissue?
- Dataset generated using TCGA and MOSAIC data
Signature-Based Expression and Similarity Reasoning Across Cancers
- Cancer similarity: Which cancer types look most alike based on signature activity?
- Signature expression and comparison: How do gene programs differ between cancers?
- Signature similarity: Which gene programs show the same activation patterns?
Drug-Target Interactions
- Predict gene deregulation effects in cancer cells: Would a drug inhibiting the activity of the target TUBB induce a deregulation of gene PTPRH in muscle invasive bladder cancer cells?
- Over 25K QA pairs from real perturbation experiments
Drug-Induced Pathway Effects
- Q&A dataset that captures how compounds affect biological systems, based on gene set enrichment results from the Tahoe-100M dataset.
- Which Reactome pathway is most affected by a treatment?
Therapeutic Target Profiling
Participant profiles
Experience with biological data processing and handling, computational biology (single-cell and spatial RNA-seq), or imaging data is a plus.
Examples of relevant profiles:
- AI/ML researchers
- Data scientists
- Bioinformaticians
- Computational biologists
50 participants will work in diverse teams of 3 to 5 experts, competing for first place on a leaderboard scored on a held-out set of Q&A pairs.
Event details and preliminary agenda
September 29, 2025
Hackathon day 1
September 30, 2025
Hackathon day 2
October 1, 2025
Hackathon day 3
Participants will be able to:
Do you want to support the hackathon?
Owkin is hosting a unique three-day event at our Paris headquarters that will bring together professionals from industry and academia to develop new AI tools that can answer important questions about biology and health.
If you’re interested in sponsoring the event, get in touch with us directly to learn more.