Cancer AI Conversations · May 26, 2026

AI-Assisted Vibe Coding
in Cancer Genomics

Ino de Bruijn, PhD

Data Visualization Team Lead | Cancer Data Science Initiative
Memorial Sloan Kettering Cancer Center

* this presentation was fully vibe-coded

Context

cBioPortal for Cancer Genomics

40K+ monthly users
300+ contributors
15+ years of development
100+ global instances

Open-source platform for exploring cancer genomics data — mutations, copy-number, expression, clinical outcomes.

Early 2025

Simple Web Pages — solved

Landing pages, dashboards, interactive maps — built in hours, not weeks.

🌐 about.cbioportal.org AI-generated landing page, maintained through prompts.

🗺️ Installation Map Originally a summer-long project — rebuilt by a new engineer in a few hours.

Models weren't yet good enough for a large existing codebase, but any WordPress-like site was already fully vibe-codeable.

Mid-2025

Features in a mature codebase

Claude Code became generally available in May 2025. Agents could now navigate and extend 300K+ lines of existing code.

Similarity Maps — embeddings viewer in cBioPortal

Example: New Similarity Maps feature for cBioPortal, fully vibe-coded with Claude Code.

Agents read existing patterns, follow conventions, write tests
Engineers who stopped coding day-to-day can ship features again

Late 2025 / Early 2026

Software rewrites

vibe-vep — Ensembl VEP rewrite:

Perl → Go, single binary (~95MB vs 17GB)
99.8% concordance, 14K variants/sec
Built at a hackathon, refined with Claude Code
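
A concordance figure like the one above is typically the fraction of variants for which the rewrite and the reference tool produce the same annotation. The sketch below shows one way to compute it. It is an illustration only, not how vibe-vep was actually validated: the variant key layout, the function names `concordance` and `discordant`, and the toy data are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical variant key: (chromosome, position, ref allele, alt allele).
VariantKey = Tuple[str, int, str, str]
Annotations = Dict[VariantKey, str]


def concordance(reference: Annotations, candidate: Annotations) -> float:
    """Fraction of reference variants that the candidate annotates identically."""
    if not reference:
        raise ValueError("reference set is empty")
    matches = sum(
        1 for key, consequence in reference.items()
        if candidate.get(key) == consequence
    )
    return matches / len(reference)


def discordant(reference: Annotations, candidate: Annotations) -> List[VariantKey]:
    """Variants where the two tools disagree (or the candidate has no call)."""
    return sorted(
        key for key, consequence in reference.items()
        if candidate.get(key) != consequence
    )


# Toy data for illustration only, not real tool output.
reference = {
    ("1", 100, "A", "T"): "missense_variant",
    ("1", 200, "C", "T"): "synonymous_variant",
    ("2", 300, "G", "A"): "stop_gained",
    ("3", 400, "GT", "G"): "frameshift_variant",
}
candidate = dict(reference)
candidate[("3", 400, "GT", "G")] = "inframe_deletion"  # one deliberate mismatch

score = concordance(reference, candidate)
mismatches = discordant(reference, candidate)
print(f"concordance: {score:.1%}")  # prints "concordance: 75.0%"
```

Listing the discordant variants alongside the score matters in practice: the remaining fraction of a percent is where a reviewer decides whether the rewrite or the original is wrong.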

github.com/inodb/vibe-vep

⚡ fastVEP — independent preprint, same idea

📊 RustQC — MultiQC in Rust, 63× faster

📜 rewrites.bio — community guidelines for responsible rewrites

2026

Agents consuming our resources

💬 chat.cbioportal.org Natural language → SQL over cancer genomics data. Unlocks access for non-programmers.

📈 Uninvited agent traffic Agents don't paginate, don't rate-limit, make redundant calls. Had to spin up separate servers.

[Figure: Memory spike from agent traffic]
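
The three behaviors missing from agent traffic (pagination, rate limiting, avoiding redundant calls) can be sketched on the client side as follows. This is a minimal illustration under stated assumptions, not the cBioPortal API or its client: `PoliteClient`, the `fetch_page` callback signature, the endpoint name, and the page size are all hypothetical, and the demo uses an in-memory fake instead of a real server.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Page = List[dict]
# Hypothetical callback: (endpoint, page_number, page_size) -> records on that page.
FetchPage = Callable[[str, int, int], Page]


class PoliteClient:
    """Paginates, spaces out requests, and caches pages it has already fetched."""

    def __init__(self, fetch_page: FetchPage, page_size: int = 100,
                 min_interval: float = 0.5) -> None:
        self._fetch_page = fetch_page
        self._page_size = page_size
        self._min_interval = min_interval  # minimum seconds between requests
        self._cache: Dict[Tuple[str, int], Page] = {}
        self._last_call: Optional[float] = None

    def _throttle(self) -> None:
        # Sleep just long enough to keep requests min_interval apart.
        if self._last_call is not None:
            wait = self._min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)
        self._last_call = time.monotonic()

    def pages(self, endpoint: str) -> Iterator[Page]:
        page_number = 0
        while True:
            key = (endpoint, page_number)
            if key not in self._cache:  # only hit the server for unseen pages
                self._throttle()
                self._cache[key] = self._fetch_page(
                    endpoint, page_number, self._page_size)
            page = self._cache[key]
            yield page
            if len(page) < self._page_size:  # a short page means we are done
                return
            page_number += 1


# In-memory stand-in for a server, for illustration only.
RECORDS = [{"id": i} for i in range(250)]
calls: List[Tuple[str, int]] = []


def fake_fetch(endpoint: str, page_number: int, page_size: int) -> Page:
    calls.append((endpoint, page_number))
    start = page_number * page_size
    return RECORDS[start:start + page_size]


client = PoliteClient(fake_fetch, page_size=100, min_interval=0.0)
first = [r for page in client.pages("mutations") for r in page]
second = [r for page in client.pages("mutations") for r in page]  # served from cache
print(len(first), len(calls))  # prints "250 3"
```

The second pass over the same endpoint issues no new requests, which is the redundant-call case. A server cannot rely on agents behaving this way, so the same limits usually also need enforcing on the server side.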

The API used to be for programmers. Now any researcher can point an agent at it.

Open Source

Open source under pressure

⚠️ Google Summer of Code 2026: Agents are so good at "good first issues" that evaluating candidates is near-impossible.

Far more PRs — hard to keep up with review volume
Agent-assisted contributions look like expert ones
How do you mentor the next generation?

The Big Question

What should we build?

Before: our team built software for cancer researchers.
Now: researchers can build much of it themselves.

🗄️ Data organization Nobody else will organize our data for us. Curation remains our job.

🔗 Integrated experiences Chat + visualization + data is the new interface. Accuracy in our domain is the hard part.

📏 Benchmarks Comprehensive test suites so any tool can be rewritten and validated. May matter more than the tool itself.

⚠️ The 95% illusion: a vibe-coded prototype looks nearly done — but maintenance, correctness, and collaborative development are the hard part.

Conversation starters

What to build? — data, benchmarks, integrated experiences
Rewrites — benchmarks may matter more than the tool itself
Agent traffic — our APIs are now consumed by machines, not just people
Open source — more PRs, harder to evaluate contributors

cbioportal.org github.com/cbioportal
