What we learned. Oh, and if you want, you can just listen to the highlights on a run, in podcast form (15 min): https://amol.sarva.co/podcast/ai-x-bio-x-ny-highlights-from-our-conference/

Key Learnings and Insights from the LifeX AI x Bio Conference
Foundation Models in Biology
- There is some debate on the exact definition of a foundation model.
- A key characteristic is formulating tasks in an open-ended way, throwing large amounts of unstructured data at a model to see what it can do.
- Foundation models are good at many tasks, even those they weren’t specifically designed to do.
- The term “foundation” suggests serving as a base for multiple applications; it may also signal importance for fundraising.
- Thomas at Vanta AI works on all-atom foundation models for protein-small molecule complexes. These models are expensive to train, but not necessarily $500M expensive.
- Compared to LLMs trained on exabytes of text, biological data models (e.g., AlphaFold) use far less—terabytes at most.
- Biodata could be nearly infinite if gathered and organized; generating and curating data is essential.
- Multimodal models are emerging, but their advantages remain debated.
- Surprising Insight: Sean suggests AlphaFold may be the only “true” foundation model in biomedicine, showing emergent properties.
- LLMs can correlate vision and text to describe unseen images—this is analogous to AlphaFold’s capabilities.
- Multimodal models exist for tasks like disease diagnosis, but true discipline-spanning foundation models remain rare.
- Agent-based systems may one day abstract away model selection entirely.
- Medical specialties may be social/reimbursement boundaries, not scientific ones—unlike the medicine–physics divide.
Data in Biology and AI
- Data is more important than model architecture in biomedicine today.
- Data hoarding is common due to its value.
- Compared to finance, bio shows more anxiety around data sharing.
- Surprising Insight: HIPAA concerns drive biopharma to run models on-prem, even with third-party data; this goes beyond what regulation requires and reflects fear of losing data ownership.
- Data quality issues persist, especially in legacy phenotypic datasets.
- Labeled data is a frontier—time-consuming to create but crucial.
- Outlier biological traits (e.g., insensitivity to pain) are valuable for discovery.
- Synthetic data can help; sometimes it outperforms radiologists, but its quality is bounded by the process that generates it.
- Data-scarce strategies include simulation (e.g., virtual cell modeling).
- Mechanistic models are deterministic but don’t reduce compute burden.
- Depth > breadth: Germline mutation data can offer stronger predictive power than broader but shallower data.
Infrastructure
- Options: pay for APIs, self-host in cloud, or buy GPUs.
- AWS hosts many models; Bedrock aims to simplify access for bio use cases.
- Goal: Make complex platforms usable for less technical scientists.
- Bio infrastructure mixes cloud (scalable) and on-prem (privacy).
- Training models is costly; some companies cloud-hop for credits. Inference is increasingly expensive too.
AI for Accelerating Drug Discovery
- The panel discussed reversing Eroom’s Law (the trend of rising drug discovery costs).
- Challenges include combinatorial therapy and multi-target inhibition.
- AI is used for hypothesis generation and predicting experimental outcomes.
- AI may replace some experimental work, boosting iteration speed.
- Surprising Insight: Only ~10% of drugs entering the clinic succeed—improving this is AI’s big opportunity.
- The future lies in causal biology—differentiated hypotheses via mechanistic insights.
- AI also helps in clinical ops: patient selection, endpoints, data management.
- New architectures include flow matching, agentic systems, RL, and chain-of-thought reasoning.
- Fragmented data across stages limits end-to-end modeling; bridging these gaps could drive the next wave.
- Pharma’s in vivo/in vitro data is valuable for improving translation.
- Surprising Insight: Biotech “sells hope” to pharma, pharma to patients, investors buy hope—this drives business models.
- Some startups sell predictions (anti-hope) vs. molecules (hope).
- Hype is seen as necessary to attract capital and talent.
- Should vertical AI tools build pipelines (“platform envy”) or stay focused and share in upside?
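To make the ~10% clinical success figure above concrete, here is a minimal arithmetic sketch. It assumes each drug entering the clinic succeeds independently with the same probability, so the expected number of clinical entrants per approval is 1/p. The 15% "improved" rate is a hypothetical value chosen only for illustration and was not quoted by the panel.

```python
def candidates_per_approval(success_rate: float) -> float:
    """Expected clinical entrants needed for one approval.

    Assumes independent candidates with identical success probability,
    so the count follows a geometric distribution with mean 1/p.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


# ~10% is the figure cited by the panel; 15% is a hypothetical improvement.
baseline = candidates_per_approval(0.10)
improved = candidates_per_approval(0.15)

# Fractional reduction in clinical entrants needed per approved drug.
reduction = 1 - improved / baseline

print(f"baseline: {baseline:.1f} entrants per approval")
print(f"improved: {improved:.1f} entrants per approval")
print(f"reduction: {reduction:.0%}")
```

Under these simplifying assumptions, even a modest gain in success rate cuts the number of clinical programs needed per approval by about a third, which is one way to read why the panel framed this as AI's big opportunity.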
Longevity – Hype vs. Reality
- Longevity focuses on aging drivers, not just disease prevention.
- Hallmarks include mitochondrial dysfunction and chronic inflammation.
- Factions focus on specific aging mechanisms (e.g., mitochondria, immune).
- Measuring dysfunction is tough outside research settings.
- Defining “health span” shapes longevity goals.
- The space has evolved from cosmetic/optimization to biochemistry-based interventions.
- Surprising Insight/Debunking: Bryan Johnson’s N=1 experiments are viewed skeptically; epigenetic clocks have high variability.
- Interventions like NAD+, metformin, rapamycin, and fasting are unproven for lifespan extension.
- Uncontrolled self-experimentation may have unknown harms.
- Antioxidants could worsen some cancers in mice.
- Surprising Insight: Plasma exchange therapy shows promise in mice but isn’t practical/safe for humans.
- Validated biomarkers remain the standard; most physicals miss deeper aging indicators.
- Biomarkers help both stratify patients and track intervention impact.
- Best validated interventions: eat less, sleep more, be happy, exercise.
- Fasting is a consistent positive signal.
- GLP-1s could be a “miracle drug” for longevity, pending long-term studies.
- Mother–daughter asymmetry shows rejuvenation is biologically possible and worth studying.
Starting a Company as a PhD
- Surprising Insight: Great academics often make weak founders; bold simplification is punished in academia but necessary in startups.
- Startups require prioritization, speed, and clarity.
- Choose the right seed investor, not the first.
- Bootstrapping can prove pull before raising.
- Serendipitous conversations lead to funding opportunities.
- Startup law firms can defer fees and provide crucial early help.
- Trust and team empowerment are critical.
- University IP can be a landmine—build off-site, license when needed.
- Cofounder questionnaires help surface conflicts early.
- Biotech companies tend to delay hiring a general counsel (GC) until the IPO stage or until regulatory complexity arises.
- Academic rigor (p-values, caveats) must give way to rapid, imperfect action.
- Simplifying your pitch is essential.
- Conviction comes from personal connection to the problem and end user.
- Surprising Insight: Standard startup structures (stock, boards, investor terms) exist for a reason—over-customization creates drag.
- Legal simplicity allows focus on harder problems.
Welcome and Overview
- The event was one of several AI and Bio conferences co-hosted by Fenwick & West and LifeX Ventures.
- Fenwick & West is a tech and life sciences focused law firm supporting venture-backed companies and investors.
- LifeX Ventures is an early-stage venture fund (Seed and Series A) focused on “AI Meets Bio,” specifically pure tech companies where the customer is a scientist or doctor, aiming to make things better so people live longer.
- LifeX Ventures has made 55–56 deals in about two and a half years.