
Our most commonly asked questions
K Pro provides filtering capabilities based on a wide range of clinical and genetic variables from the TCGA and the MOSAIC Window datasets.
For example, you can specify: "Show the proportion of immune cell types in MOSAIC Window bladder cancer patients over 50 years old."
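To make the kind of filter behind such a request concrete, here is a minimal sketch in plain Python. It is an illustration only: the record fields (`dataset`, `indication`, `age`, `immune_fractions`), the helper names, and all sample values are hypothetical, and they do not reflect K Pro's internal schema or real patient data.

```python
from collections import defaultdict

# Hypothetical patient records; field names and values are invented for illustration.
PATIENTS = [
    {"dataset": "MOSAIC Window", "indication": "BLCA", "age": 63,
     "immune_fractions": {"T cell": 0.5, "B cell": 0.2, "Macrophage": 0.3}},
    {"dataset": "MOSAIC Window", "indication": "BLCA", "age": 47,
     "immune_fractions": {"T cell": 0.1, "B cell": 0.1, "Macrophage": 0.8}},
    {"dataset": "MOSAIC Window", "indication": "BLCA", "age": 71,
     "immune_fractions": {"T cell": 0.3, "B cell": 0.4, "Macrophage": 0.3}},
    {"dataset": "TCGA", "indication": "BLCA", "age": 66,
     "immune_fractions": {"T cell": 0.9, "B cell": 0.05, "Macrophage": 0.05}},
]


def filter_patients(patients, dataset, indication, older_than):
    """Keep patients from one dataset and indication who are strictly older than `older_than`."""
    return [
        p for p in patients
        if p["dataset"] == dataset
        and p["indication"] == indication
        and p["age"] > older_than
    ]


def mean_immune_fractions(patients):
    """Average each immune cell type's fraction over the selected patients."""
    if not patients:
        return {}
    totals = defaultdict(float)
    for p in patients:
        for cell_type, fraction in p["immune_fractions"].items():
            totals[cell_type] += fraction
    return {cell_type: total / len(patients) for cell_type, total in totals.items()}


cohort = filter_patients(PATIENTS, "MOSAIC Window", "BLCA", older_than=50)
means = mean_immune_fractions(cohort)
print(len(cohort), means)
```

In K Pro the same selection is expressed in natural language; the sketch only shows the variables such a request has to resolve (dataset, indication, age threshold) before any proportion can be computed.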
The integration of K within the customer information system follows these main steps:
- The integration is done on a dedicated AWS account that is part of the customer’s AWS organisation.
- The Owkin team responsible for deploying K will be able to assume a role that Owkin provides to the customer in advance. This role follows a least-privilege approach and will be reviewed by the customer’s security team before the deployment starts.
- The customer will need to allow egress network flows to the necessary dependencies (e.g. Docker image repository, source code, …), the list of which will be provided by the Owkin team.
K Pro is purpose-built for biomedical research, offering unique features and scientific rigor that go far beyond the capabilities of general-purpose LLMs.
Here’s what sets Owkin K apart:
1. Smarter Scientific Reasoning
Unlike generic LLMs that often provide surface-level or generic answers, K Pro delivers scientifically robust reasoning. It supports every analysis with statistical evidence—such as p-values and population-level data—so you can trust the validity of its findings and never have to rely on guesswork.
2. Exclusive Access to Cutting-Edge Data
K Pro provides access to specialized datasets unavailable in mainstream LLMs. For example, you can analyze a subset of the MOSAIC dataset, the largest multiomic spatial oncology dataset, unlocking insights from a unique and powerful data resource.
3. Interactive, User-Driven Visualizations
With K Pro, data exploration is intuitive and interactive. You can iterate on data visualizations in real time using plain language. Just ask to filter charts by new variables, compare groups, or adjust parameters. This makes deep data analysis both faster and more engaging.
4. Publication-Ready Scientific Writing
K Pro generates responses that sound like real scientific writing. Its outputs follow accepted scientific conventions, producing clear, coherent, and publication-ready paragraphs—no more generic or off-topic content.
5. Designed by Scientists, for Scientists
While generic LLMs are designed for broad, everyday tasks (such as writing poems or planning trips) and are based on data from all over the web, K Pro is laser-focused on the specific needs of researchers and only pulls from credible scientific sources. It’s engineered by scientists to fit the workflows, language, and standards of the biomedical community, ensuring relevance and credibility in every interaction. K Pro empowers researchers with rigorous scientific reasoning, exclusive data access, interactive analytics, and precise scientific communication—delivering specialized support that generic LLMs simply can’t match.
6. Tailored agentic workflows for pharma
K Pro isn’t just an LLM with biomedical data. It’s designed with a deep understanding of pharmaceutical research workflows. K Pro provides a set of agentic capabilities: these are smart, modular tools and agents that map directly to the typical tasks and questions a pharma researcher might encounter, from data exploration and hypothesis generation, to biomarker discovery and clinical trial design.
Data uploaded to K Pro can only be seen by the original user and their organization.
We conduct bias audits during model development and testing. Our models are evaluated across diverse demographic datasets, and we partner with academic and clinical collaborators to validate performance in real world settings.
At Owkin, preventing bias in K Pro’s recommendations is a central focus throughout the model development lifecycle. We conduct comprehensive bias audits during both development and testing phases, using a variety of statistical and qualitative checks to identify and mitigate potential sources of bias. Our models are systematically evaluated across diverse demographic and clinical datasets to ensure that recommendations are robust and equitable for all patient groups. To further strengthen our safeguards, we collaborate closely with leading academic and clinical partners who independently assess K Pro’s performance in real-world settings. Their feedback helps us refine our models and address any disparities that might emerge. Additionally, we maintain ongoing monitoring after deployment, so that any new or unforeseen sources of bias can be quickly identified and corrected. Our commitment to transparency means that users can always trace the data sources and rationale behind K Pro’s recommendations. By combining rigorous technical evaluation with diverse, real-world validation and continuous monitoring, we strive to deliver recommendations that are fair, reliable, and trustworthy for everyone.
We have daily monitoring in place (e.g., using a metric like Tool Call Accuracy (TCA)) to track the percentage of times the correct tool is identified and called. This helps us quickly identify systemic issues where an agent might be "hallucinating" the need for a specific tool when another is more suitable, or failing to identify any tool when one is needed.
Parameter Monitoring: Once a tool is called, it's important that the correct parameters are passed to it. Incorrect parameters can lead to wrong actions, all of which manifest as hallucinations in the agent's final response. We monitor the accuracy and completeness of parameters passed to tools via automated tests on a set of evaluation questions.
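As a rough illustration of how these two checks could be computed over a set of evaluation questions, here is a minimal Python sketch. The data model (`ToolCall`, `EvalCase`), the tool names, and the choice to score parameters by exact match are all assumptions made for this example; the source does not specify how Owkin defines or implements these metrics.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCall:
    tool: Optional[str]  # None means "no tool was called"
    params: dict = field(default_factory=dict)


@dataclass
class EvalCase:
    question: str
    expected: ToolCall
    actual: ToolCall


def tool_call_accuracy(cases):
    """Share of evaluation cases where the agent chose the expected tool (or correctly chose none)."""
    if not cases:
        return 0.0
    correct = sum(1 for c in cases if c.actual.tool == c.expected.tool)
    return correct / len(cases)


def parameter_accuracy(cases):
    """Among cases where the right tool was called, share whose parameters match exactly."""
    matched = [
        c for c in cases
        if c.expected.tool is not None and c.actual.tool == c.expected.tool
    ]
    if not matched:
        return 0.0
    exact = sum(1 for c in matched if c.actual.params == c.expected.params)
    return exact / len(matched)


# Hypothetical evaluation set; tool names and parameters are invented.
cases = [
    EvalCase("q1", ToolCall("literature_search", {"query": "BLCA"}),
             ToolCall("literature_search", {"query": "BLCA"})),
    EvalCase("q2", ToolCall("cohort_plot", {"indication": "OV"}),
             ToolCall("cohort_plot", {"indication": "GBM"})),   # right tool, wrong parameter
    EvalCase("q3", ToolCall("cohort_plot", {"indication": "OV"}),
             ToolCall("literature_search", {"query": "OV"})),   # wrong tool
    EvalCase("q4", ToolCall("gene_lookup", {"gene": "TP53"}),
             ToolCall("gene_lookup", {"gene": "TP53"})),
]

tca = tool_call_accuracy(cases)
param_acc = parameter_accuracy(cases)
print(f"TCA={tca:.2f}, parameter accuracy={param_acc:.2f}")
```

Scoring parameters only on cases where the right tool was called keeps the two signals separate: a drop in the first number points to tool selection, a drop in the second to how arguments are filled in.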
These are strategic working sessions where Owkin goes in-house with pharma teams to explore how multimodal data and agentic AI can unlock new insights, break down translational research barriers and optimize clinical trials. The goal is to apply K Pro to real-world challenges and rapidly uncover high-value opportunities. These sessions are part of longer trial periods, which last 2 weeks and are otherwise held remotely.
You have two easy options if you wish to report an issue with K Pro:
1. Submit a request via our Help Center for technical issues or questions.
2. Share feedback on the tool via survey.
You can do that anytime through our in-app survey (look for the blue 'open survey' button on the right side of your chat interface). In addition, enterprise-tier K Pro users have a dedicated account manager they can contact. We value your input—it helps us improve the tool for everyone.
K Pro is an advanced drug portfolio decision-making platform that leverages agentic AI to seamlessly connect deep biological understanding with clinically relevant outcomes, enabling pharmaceutical teams to make faster, more confident, and better-informed R&D decisions. Built for accessibility, it empowers both technical and non-technical users to interrogate vast, complex multimodal datasets, including genomic, imaging, clinical, and literature sources, using natural language, eliminating the need for coding or external tools. By integrating high-quality public biomedical databases, proprietary patient data, and user-uploaded data, K Pro breaks down traditional silos and fosters real-time collaboration across research, clinical, and strategic teams. Its three specialized “Agentic Spaces” (Analyze, Activate, and Amplify) support the full drug development lifecycle: from hypothesis generation, biomarker discovery, and mechanistic exploration, to patient population optimization, competitive landscape analysis, and enriched clinical trial evaluation. With capabilities for producing publication-ready visuals, static insight reports, and comprehensive trial data integration, K Pro will function as a collaborative, always-on intelligence layer. All data and operations are underpinned by rigorous privacy, security, and compliance standards, making it a trusted co-pilot for accelerating discovery and de-risking pipeline decisions in modern drug development.
The K Pro free version supports basic functionalities including literature and gap analysis, smart hypothesis testing, scientific writing assistance, and interactive data exploration and visualization, tapping into 17+ biomedical datasets.
You can get started with K Pro’s Free version immediately. It offers access to the core functionalities so you can explore and experience the platform without delay. Simply sign up to begin using K Pro Free today.
For organizations needing advanced features, integrations, and premium support, we offer an enterprise version. Please contact our team to book a personalized demo. Our demos typically run for two weeks and are conducted primarily remotely for your convenience. As part of the enterprise introduction, we also offer “Katalyst sessions”—dedicated, 2-day workshops where the Owkin team collaborates directly with your organization, in house. These sessions are designed to accelerate your onboarding, answer questions, and tailor K Pro to your specific needs.
Owkin understands the value and importance of interpretability with ML and AI tools. While the codebase of K Pro is proprietary and confidential to Owkin, at runtime, users can inspect the reasoning process undertaken by agents in the system. Where plots are generated, or a coding agent generates code for a complex task, the code is made available to the user.
At Owkin, we recognize that any scientific tool, whether it’s AI-driven, a spreadsheet, or statistical software, can be misused or lead to misleading conclusions if not applied responsibly. That’s why K Pro is built with robust safeguards to promote transparency, accountability, and scientific rigor. Every result generated by K Pro is grounded in evidence from authoritative sources such as PubMed and validated biological knowledge bases. For each output, users can trace back recommendations to their original sources, ensuring full provenance and transparency.
Our explainability features log every data source, model decision, and rationale, making the reasoning process fully visible and auditable. To further reduce the risk of false or hallucinated information, we’ve implemented technical checks to verify that every cited source exists and is relevant to the scientific question at hand. However, we acknowledge that no AI system is infallible—scientific oversight and expert review remain essential.
K Pro is designed to assist researchers, not replace human expertise or decision-making. With these measures, we empower users to make informed, reliable decisions while maintaining the highest standards of scientific integrity.
We believe that accelerating scientific discovery responsibly is a must. BASI is not about replacing human reasoning; it’s about scaling scientific insight through AI and human collaboration. We’re advancing toward this vision carefully, with transparency, oversight, and a commitment to a shared benefit for humanity. At Owkin, we believe that accelerating scientific discovery must always be balanced with ethical responsibility. The concept of Biological Artificial Superintelligence (BASI) is not intended to replace human reasoning or judgment. Instead, BASI is designed to amplify and scale scientific insight by fostering collaboration between advanced AI systems and human experts.
We are fully aware of the ethical complexities that come with developing such transformative technologies. That’s why our approach is grounded in transparency, rigorous oversight, and continuous consultation with diverse stakeholders, including scientists, clinicians, and ethicists. Our development processes integrate ethical review and risk assessment at every stage, ensuring that BASI is aligned with the broader interests of society. We are committed to ensuring that the benefits of BASI are shared equitably, and that its insights are used to advance scientific understanding and improve human health for everyone.
Ultimately, we see BASI as a tool for responsible innovation—one that accelerates discovery while upholding the highest standards of ethics, accountability, and public trust.
Agentic AI in pharma is progressing rapidly from pilots to practical impact. With strong guardrails and human oversight, it reliably accelerates many parts of the puzzle, such as data harmonization, hypothesis iteration and validation, biomarker identification, and clinical trial optimization. At Owkin, maturity is higher because agents are grounded in high‑quality multimodal data and deep biology context, enabling clearer reasoning and more trustworthy recommendations.
K Pro’s agentic system is built on tools and agents designed to handle the diversity of the underlying datasets. The K Pro orchestration captures intent from user questions and calls the right tools to find proof points across natural-language scientific reports (e.g. in literature or patents), structured gene databases (e.g. Open Targets), and multi-omics patient cohorts. Standardization and harmonization are performed by dedicated transformations optimized for use by the tools and agents in K Pro’s agentic system.
K Pro is an Enterprise-level Agentic AI decision-making tool that helps biopharma teams connect deep biological insights with real-world clinical impact. By combining multimodal data, proprietary patient cohorts, and competitive trial intelligence into a single, natural language interface, K Pro empowers researchers, clinicians, and strategists to generate and validate hypotheses faster, optimize patient populations, and de-risk trial and portfolio decisions. With customizable visualizations and secure data integration, K Pro acts as a co-pilot that transforms scattered data and siloed workflows into confident, clinically actionable decisions, accelerating discovery while strengthening competitive advantage.
The free version, K Pro Free, includes the basic functionalities of the AI-powered copilot and enables the rapid generation, testing, and refinement of scientific hypotheses, especially in oncology, making it accessible to everyone, regardless of coding skills.
Trustworthiness is at the heart of K Pro’s design and is ensured in several ways. First, K Pro's reasoning (or thought process) is made accessible to the user. Second, a number of guardrails have been put in place to combat the hallucinations specific to LLMs (see dedicated question), for example by augmenting the literature review with a RAG system to ensure the existence of PubMed IDs returned by K Pro.
More generally, K also relies heavily on tools (modality-specific AI models, etc.) whose proper calling is constantly monitored. Finally, it is important to note that one of K Pro's strengths is that it is based on the analysis of underlying patient data, queried by tools. This data, and therefore the resulting plots, cannot be invented and thus ensure that the response is grounded in reality.
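One simple way to picture the PubMed ID guardrail is a check that every ID cited in an answer was actually present in the documents returned by retrieval. The Python sketch below works under that assumption; the function names, the `PMID: <digits>` citation format, and the sample answer are hypothetical, and the source does not describe how K Pro implements this check. The first ID is the Hallmarks of Cancer reference listed among the datasets; `00000000` is a deliberately fake placeholder.

```python
import re

# Assumed citation format for this sketch: "PMID: <digits>".
PMID_PATTERN = re.compile(r"PMID:\s*(\d+)")


def extract_pmids(text):
    """Return all PubMed IDs cited in a text, in order of appearance."""
    return PMID_PATTERN.findall(text)


def split_citations(answer, retrieved_pmids):
    """Separate cited IDs into those backed by retrieved documents and those that are not."""
    cited = extract_pmids(answer)
    verified = [pmid for pmid in cited if pmid in retrieved_pmids]
    unverified = [pmid for pmid in cited if pmid not in retrieved_pmids]
    return verified, unverified


# IDs of documents actually returned by the (hypothetical) retrieval step.
retrieved = {"21376230"}

answer = (
    "Sustained proliferative signaling is a hallmark of cancer (PMID: 21376230). "
    "A second, unsupported claim cites PMID: 00000000."
)

verified, unverified = split_citations(answer, retrieved)
print("verified:", verified)
print("unverified:", unverified)
```

An answer with a non-empty `unverified` list could be blocked or regenerated before it reaches the user; which of these a real system does is a design decision not covered here.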
For each K Pro user, only the chat history is saved in the database, and it is always associated with the given user and their organization. Only an authenticated user can access their own chat history. The database itself is a managed RDS instance on Owkin’s AWS account and is only accessible to the backend service that needs it, via a secure service account and network policies.
Agentic behavior is strictly managed by role based permissions and safety checks. K Pro is not designed to self-initiate actions beyond its assigned domain and cannot operate without authenticated users and guardrails. At Owkin, we recognize the importance of managing AI systems that can act “agentically”—that is, systems capable of taking independent actions within defined boundaries. To ensure responsible use, Owkin K’s agentic behavior is strictly governed by robust role-based permissions and multiple layers of safety checks. K Pro is deliberately not designed to self-initiate actions outside its clearly assigned domain. Every function it performs is subject to explicit user authentication and operates within carefully established limits. All actions are traceable to authenticated users, and comprehensive guardrails are in place to prevent unauthorized or unintended operations.
In addition, we regularly review and update these controls to keep pace with emerging best practices in AI safety and governance. By combining technical safeguards with clear operational policies, we ensure that agentic capabilities are used solely to support and empower qualified users—never to replace human oversight or operate without accountability. This approach allows us to harness the power of advanced AI while maintaining trust, transparency, and full control over its actions at all times.
Yes, K Pro allows you to discover multimodal datasets that are part of the Owkin data catalog and that could be integrated into your project based on your research needs. These datasets span various modalities (bulk, single cell, spatial transcriptomics, WES) as well as various therapeutic areas such as Oncology, I&I, Cardiovascular and Neurodegenerative diseases.
We see the opposite effect. Researchers are more empowered, more productive, and more collaborative when supported by intelligent tools. K Pro allows teams to tackle big questions by expanding their capabilities.
Yes, we are engaged with regulatory bodies and bioethics groups. We have also invited the academic community to be amongst beta testers and early users, to have scientists help us build K Pro for scientists.
The final outcomes are affected by the ability of the LLM to understand the user question and call the right tools to answer it. More recent versions of LLMs usually perform better on tool-calling tasks, as shown here. Also, switching from one LLM to another requires making the necessary changes in prompting and context engineering to reach the best possible results for a given LLM (to be assessed via our evaluation automations).
Below is the list of public and private datasets currently available in the K Pro platform, along with their respective licenses and versions.
- ChEMBL – Public, https://www.ebi.ac.uk/chembl/
- CollecTRI – Public, https://github.com/saezlab/CollecTRI?tab=readme-ov-file
- COMPARTMENTS – Public, https://compartments.jensenlab.org/Search
- Complex Portal – Public, https://www.ebi.ac.uk/complexportal/home
- DepMap – Public, https://depmap.org/portal/
- Ensembl (BioMart) – Public, https://www.ensembl.org/info/data/biomart/index.html
- gnomAD – Public, https://gnomad.broadinstitute.org/
- GTEx – Public, https://gtexportal.org/home/
- Hallmarks of Cancer – Public, https://pubmed.ncbi.nlm.nih.gov/21376230/
- Human Protein Atlas (HPA) – Public, https://www.proteinatlas.org/
- IntOGen – Public, https://www.intogen.org/search
- MSigDB – Public, https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp
- Open Targets – Public, https://platform.opentargets.org/
- RCSB PDB – Public, https://www.rcsb.org/
- Reactome – Public, https://reactome.org/
- TCGA – Public, https://www.cancer.gov/ccg/research/genome-sequencing/tcga
- TMHMM – Public, https://services.healthtech.dtu.dk/services/TMHMM-2.0/
- UniProt – Public, https://www.uniprot.org/
- Tabula Sapiens – Public, https://tabula-sapiens.sf.czbiohub.org/
- MOSAIC Window – Owkin's proprietary, https://www.mosaic-research.com/mosaic-window
- EMBL-EBI - https://www.ebi.ac.uk/chembl/
At Owkin, keeping your data secure is our highest priority. While much of our technology is developed and managed in-house, we also partner with select, highly reputable vendors who must meet our stringent privacy, security, and ethics standards. Each partner is carefully vetted through rigorous due diligence, including detailed security assessments and contractual requirements aligned with our own commitments. To ensure the highest standards of information protection, we employ robust organizational and technical measures, conduct regular internal and external audits, and perform comprehensive Security Risk Assessments with every major change to our systems. When integrating large language models or other third-party components, we choose hosting options that guarantee privacy and confidentiality for all data and outputs.
This privacy-first approach ensures full compliance with GDPR and HIPAA requirements. Owkin is certified to ISO 27001:2022 for information security and ISO 13485:2016 for medical device quality, reflecting our ongoing dedication to safeguarding your data. With these measures in place, you can be confident that your information is protected at every stage.
MOSAIC (Multi Omic Spatial Atlas In Cancer) is a large-scale spatial and multiomics dataset, an international collaboration between Owkin and leading cancer research institutions, including the University of Pittsburgh, Gustave Roussy, Lausanne University Hospital, Uniklinikum Erlangen/Friedrich-Alexander-Universität Erlangen-Nürnberg, and Charité - Universitätsmedizin Berlin. The initiative combines spatial and single-cell transcriptomics with AI to create comprehensive maps of tumor environments across 6 data modalities, aiming to unlock new treatments for some of the most challenging cancers. To date, it includes data from over 2,600 patients from 10 cancer indications: Bladder, Breast cancer, Diffuse Large B-Cell Lymphoma (DLBCL), Glioblastoma, Mesothelioma, Non-Small Cell Lung Cancer (NSCLC), Ovarian cancer, Head & Neck cancer, Pancreatic cancer, Colorectal cancer. Part of the MOSAIC dataset is available to K Pro users as an add-on.
A smaller subset of the MOSAIC datasets, called MOSAIC Window, is available in the K Pro free tier. MOSAIC Window includes spatial omics and multimodal data from 60 patients across five cancer types: BLCA (15), OV (15), GBM (10), DLBCL (10), and MESO (10). This unique resource enables researchers to explore tumor biology at near single-cell resolution, providing detailed insight into tumor and immune cell interactions.
Yes, your data can be integrated into K Pro. To ensure data compatibility with K Pro, there are two options: clients can prepare datasets following our provided documentation, or our internal teams can handle the data preparation process. We utilize diverse pipelines, models, and expertise that the Owkin team has developed over the years to support various data modalities across multiple therapeutic areas. To maintain quality standards and data integrity, autonomous data uploads to K Pro are not currently available. However, we are actively developing this capability with a strong emphasis on data quality assurance and interoperability.
At Owkin, we are deeply committed to ensuring K Pro is used ethically and responsibly.
While any advanced tool can, in theory, be misused, we have put in place robust safeguards to prevent this. Owkin K features strict role-based access controls, detailed usage monitoring, and transparent sourcing for every insight, making all actions traceable and accountable. We also require all partner institutions to agree to our ethical usage policies, and we reserve the right to suspend access immediately if misuse is detected. These measures, combined with continuous oversight, help ensure K Pro remains a force for good in research and innovation.
Our dedication to ethical standards means you can trust in our ongoing vigilance and commitment to responsible use.