AI x Bio x NY x LifeX: highlights

What we learned. Oh, and you can just listen to the highlights on a run if you want, in podcast form (15 min): https://amol.sarva.co/podcast/ai-x-bio-x-ny-highlights-from-our-conference/

Key Learnings and Insights from the LifeX AI x Bio Conference

Welcome and Overview

  • The event was one of several AI and Bio conferences co-hosted by Fenwick & West and LifeX Ventures.
  • Fenwick & West is a tech and life sciences focused law firm supporting venture-backed companies and investors.
  • LifeX Ventures is an early-stage venture fund (Seed and Series A) focused on “AI Meets Bio,” specifically pure tech companies where the customer is a scientist or doctor, aiming to make things better so people live longer.
  • LifeX Ventures has made 55–56 deals in about two and a half years.

Foundation Models in Biology

  • There is some debate on the exact definition of a foundation model.
  • A key characteristic is formulating tasks in an open-ended way, throwing large amounts of unstructured data at a model to see what it can do.
  • Foundation models are good at many tasks, even those they weren’t specifically designed to do.
  • The term “foundation” suggests serving as a base for multiple applications; it may also signal importance for fundraising.
  • Thomas at VantAI works on all-atom foundation models for protein-small molecule complexes. These models are expensive to train, but not necessarily $500M expensive.
  • Compared to LLMs trained on exabytes of text, biological data models (e.g., AlphaFold) use far less—terabytes at most.
  • Biodata could be nearly infinite if gathered and organized; generating and curating data is essential.
  • Multimodal models are emerging, but their advantages remain debated.
  • Surprising Insight: Sean suggests AlphaFold may be the only “true” foundation model in biomedicine, showing emergent properties.
  • Multimodal LLMs can correlate vision and text to describe images they have never seen; AlphaFold shows an analogous kind of generalization.
  • Multimodal models exist for tasks like disease diagnosis, but true discipline-spanning foundation models remain rare.
  • Agent-based systems may one day abstract away model selection entirely.
  • Medical specialties may be social/reimbursement boundaries, not scientific ones—unlike the medicine–physics divide.

Data in Biology and AI

  • Data is more important than model architecture in biomedicine today.
  • Data hoarding is common due to its value.
  • Compared to finance, bio shows more anxiety around data sharing.
  • Surprising Insight: HIPAA concerns push biopharma to run models on-prem, even with third-party data; it is less about what the regulation actually requires and more about fear over data ownership.
  • Data quality issues persist, especially in legacy phenotypic datasets.
  • Labeled data is a frontier—time-consuming to create but crucial.
  • Outlier biological traits (e.g., insensitivity to pain) are valuable for discovery.
  • Synthetic data can help; models trained on it have sometimes outperformed radiologists, but its quality is bounded by the generation process.
  • Data-scarce strategies include simulation (e.g., virtual cell modeling).
  • Mechanistic models are deterministic but don’t reduce compute burden.
  • Depth > breadth: germline mutation data can offer stronger predictive power than broader but shallower datasets.

Infrastructure

  • Options: pay for APIs, self-host in cloud, or buy GPUs.
  • AWS hosts many models; Bedrock aims to simplify access for bio use cases (see the minimal call sketch after this list).
  • Goal: Make complex platforms usable for less technical scientists.
  • Bio infrastructure mixes cloud (scalable) and on-prem (privacy).
  • Training models is costly; some companies cloud-hop for credits. Inference is increasingly expensive too.
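
To make the "pay for APIs" path concrete, here is a minimal sketch of calling a hosted model through the Bedrock runtime with boto3. It assumes AWS credentials are already configured and that a model is enabled in your account; the model ID and request body are placeholders, since each Bedrock model family defines its own input schema.

```python
import json

import boto3

# Minimal Bedrock runtime call; credentials and region come from your AWS config.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="example.text-model-v1",  # placeholder: use a model enabled in your account
    contentType="application/json",
    accept="application/json",
    # Placeholder request body: the real schema depends on the model family.
    body=json.dumps({"prompt": "Summarize this assay protocol: ..."}),
)

result = json.loads(response["body"].read())  # the response body is a streaming payload
print(result)
```

Self-hosting in the cloud or buying GPUs trades this convenience for more control over cost and data locality, which is the cloud/on-prem tension noted above.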

AI for Accelerating Drug Discovery

  • The panel discussed reversing Eroom's Law (the long-term rise in drug discovery costs).
  • Challenges include combinatorial therapy and multi-target inhibition.
  • AI is used for hypothesis generation and predicting experimental outcomes.
  • AI may replace some experimental work, boosting iteration speed.
  • Surprising Insight: Only ~10% of drugs entering the clinic succeed—improving this is AI’s big opportunity.
  • The future lies in causal biology—differentiated hypotheses via mechanistic insights.
  • AI also helps in clinical ops: patient selection, endpoints, data management.
  • New architectures include flow matching, agentic systems, RL, and chain-of-thought reasoning (a toy flow-matching sketch follows this list).
  • Fragmented data across stages limits end-to-end modeling; bridging these gaps could drive the next wave.
  • Pharma’s in vivo/in vitro data is valuable for improving translation.
  • Surprising Insight: Biotech “sells hope” to pharma, pharma sells it to patients, and investors buy hope; this dynamic drives business models.
  • Some startups sell predictions (anti-hope) vs. molecules (hope).
  • Hype is seen as necessary to attract capital and talent.
  • Should vertical AI tools build their own pipelines (“platform envy”) or stay focused and share in the upside?
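
Flow matching is only name-checked above, so as a rough illustration here is a toy PyTorch sketch of the conditional flow-matching objective; the tiny network, the synthetic 2-D data, and the hyperparameters are illustrative assumptions, not anything presented at the conference. The model learns a velocity field that transports noise samples to data samples along straight-line paths.

```python
import torch
import torch.nn as nn

# Toy velocity-field network: input is (x_t, t) for 2-D points, output is a 2-D velocity.
net = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x1 = torch.randn(256, 2) * 0.5 + 2.0      # stand-in "data" samples (a toy Gaussian blob)
    x0 = torch.randn(256, 2)                  # noise samples
    t = torch.rand(256, 1)                    # random time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # point on the straight-line path from noise to data
    target_v = x1 - x0                        # conditional target velocity along that path
    pred_v = net(torch.cat([xt, t], dim=1))
    loss = ((pred_v - target_v) ** 2).mean()  # regress the predicted velocity onto the target
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sampling: integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data), e.g. with Euler steps.
```

In a drug-discovery setting the 2-D points would be replaced by molecular or structural representations, which is where the real modeling work lives.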

Longevity – Hype vs. Reality

  • Longevity focuses on aging drivers, not just disease prevention.
  • Hallmarks include mitochondrial dysfunction and chronic inflammation.
  • Factions focus on specific aging mechanisms (e.g., mitochondria, immune).
  • Measuring dysfunction is tough outside research settings.
  • Defining “health span” shapes longevity goals.
  • The space has evolved from cosmetic/optimization to biochemistry-based interventions.
  • Surprising Insight/Debunking: Bryan Johnson’s N=1 experiments are viewed skeptically; epigenetic clocks have high variability.
  • Interventions like NAD, metformin, rapamycin, fasting are unproven for lifespan extension.
  • Uncontrolled self-experimentation may have unknown harms.
  • Antioxidants could worsen some cancers in mice.
  • Surprising Insight: Plasma exchange therapy shows promise in mice but isn’t practical/safe for humans.
  • Validated biomarkers remain the standard; most physicals miss deeper aging indicators.
  • Biomarkers help both stratify patients and track intervention impact.
  • Best validated interventions: eat less, sleep more, be happy, exercise.
  • Fasting is a consistent positive signal.
  • GLP-1s could be a “miracle drug” for longevity, pending long-term studies.
  • Mother–daughter asymmetry (an aged mother gives rise to biologically young offspring) shows rejuvenation is biologically possible and worth studying.

Starting a Company as a PhD

  • Surprising Insight: Great academics often make weak founders; bold simplification is punished in academia but necessary in startups.
  • Startups require prioritization, speed, and clarity.
  • Choose the right seed investor, not the first.
  • Bootstrapping can prove pull before raising.
  • Serendipitous conversations lead to funding opportunities.
  • Startup law firms can defer fees and provide crucial early help.
  • Trust and team empowerment are critical.
  • University IP can be a landmine—build off-site, license when needed.
  • Cofounder questionnaires help surface conflicts early.
  • Biotech companies tend to delay hiring a GC (general counsel) until the IPO stage or until regulatory complexity arises.
  • Academic rigor (p-values, caveats) must give way to rapid, imperfect action.
  • Simplifying your pitch is essential.
  • Conviction comes from personal connection to the problem and end user.
  • Surprising Insight: Standard startup structures (stock, boards, investor terms) exist for a reason—over-customization creates drag.
  • Legal simplicity allows focus on harder problems.
