Cancer AI Conversations · May 26, 2026

AI-Assisted Vibe Coding in Cancer Genomics

Ino de Bruijn, PhD
Data Visualization Team Lead | Cancer Data Science Initiative
Memorial Sloan Kettering Cancer Center
* This presentation was fully vibe-coded.

cBioPortal for Cancer Genomics

Goal of cBioPortal
  • 40K+ monthly users
  • 300+ contributors
  • 15+ years of development
  • 100+ global instances

Open-source platform for exploring cancer genomics data — mutations, copy-number, expression, clinical outcomes.

Simple Web Pages — solved

Landing pages, dashboards, interactive maps — built in hours, not weeks.

🌐 about.cbioportal.org: AI-generated landing page, maintained through prompts.
🗺️ Installation Map: originally a summer-long project, rebuilt by a new engineer in a few hours.
Models weren't yet good enough for a large existing codebase — but any WordPress-like site was already fully vibe-codeable.

Features in a mature codebase

Claude Code launched May 2025. Agents could now navigate and extend 300K+ lines of existing code.

Example: the new Similarity Maps feature, an embeddings viewer in cBioPortal, fully vibe-coded with Claude Code.
  • Agents read existing patterns, follow conventions, write tests
  • Engineers who stopped coding day-to-day can ship features again

Software rewrites

vibe-vep — Ensembl VEP rewrite:

  • Perl → Go, single binary (~95MB vs 17GB)
  • 99.8% concordance, 14K variants/sec
  • Built at a hackathon, refined with Claude Code

github.com/inodb/vibe-vep
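
A concordance figure like the one above depends on how matches are counted. The following is a minimal sketch of one possible definition: the fraction of reference variants for which a rewrite reports the same consequence. It is not taken from the vibe-vep repository; the variant key layout, the function names, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical key: (chromosome, position, reference allele, alternate allele).
VariantKey = Tuple[str, int, str, str]
Annotations = Dict[VariantKey, str]


def concordance(reference: Annotations, candidate: Annotations) -> float:
    """Fraction of reference variants where the candidate gives the same annotation.

    A variant that is missing from the candidate output counts as a mismatch.
    """
    if not reference:
        raise ValueError("reference set is empty")
    matches = sum(
        1 for key, expected in reference.items() if candidate.get(key) == expected
    )
    return matches / len(reference)


def discordant(
    reference: Annotations, candidate: Annotations
) -> List[Tuple[VariantKey, str, Optional[str]]]:
    """List (key, expected, actual) for every disagreement, for manual review."""
    return [
        (key, expected, candidate.get(key))
        for key, expected in reference.items()
        if candidate.get(key) != expected
    ]


# Toy data, invented for illustration only (not real variants or results).
reference_calls: Annotations = {
    ("1", 100, "A", "T"): "missense_variant",
    ("1", 200, "C", "T"): "synonymous_variant",
    ("2", 300, "G", "A"): "missense_variant",
    ("3", 400, "GT", "G"): "frameshift_variant",
}
rewrite_calls: Annotations = dict(reference_calls)
rewrite_calls[("3", 400, "GT", "G")] = "inframe_deletion"  # one deliberate mismatch

score = concordance(reference_calls, rewrite_calls)
mismatches = discordant(reference_calls, rewrite_calls)
print(f"concordance: {score:.1%}, discordant variants: {len(mismatches)}")
```

Keeping the list of discordant variants alongside the headline number is a deliberate choice in this sketch: the disagreements are usually where a rewrite needs human review.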

⚡ fastVEP — independent preprint, same idea
📊 RustQC — MultiQC in Rust, 63× faster
📜 rewrites.bio — community guidelines for responsible rewrites

Agents consuming our resources

💬 chat.cbioportal.org: natural language → SQL over cancer genomics data. Unlocks access for non-programmers.
[Figure: chat query and chat response]
📈 Uninvited agent traffic: agents don't paginate, don't rate-limit, and make redundant calls. We had to spin up separate servers.
[Figure: memory spike from agent traffic]
The API used to be for programmers. Now any researcher can point an agent at it.
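
For contrast, here is a minimal sketch of what a well-behaved client could look like: it pages through results, spaces out its requests, and caches pages it has already fetched. This is not cBioPortal's API or client code. `PoliteClient`, `fetch_page`, and the paging convention (a short page marks the end) are hypothetical names and assumptions, and the data source is faked so the example runs without a network.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical signature: fetch_page(page_number, page_size) -> list of records.
FetchPage = Callable[[int, int], List[dict]]


class PoliteClient:
    """Wraps a page-fetching function with pagination, rate limiting and caching."""

    def __init__(
        self,
        fetch_page: FetchPage,
        page_size: int = 100,
        min_interval: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch_page = fetch_page
        self._page_size = page_size
        self._min_interval = min_interval  # minimum seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None
        self._cache: Dict[Tuple[int, int], List[dict]] = {}

    def _get(self, page: int) -> List[dict]:
        key = (page, self._page_size)
        if key in self._cache:  # redundant request: answer from the local cache
            return self._cache[key]
        if self._last_call is not None:  # rate limit: wait out the remaining interval
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        records = self._fetch_page(page, self._page_size)
        self._cache[key] = records
        return records

    def iter_records(self) -> Iterator[dict]:
        """Yield all records one page at a time instead of one huge request."""
        page = 0
        while True:
            records = self._get(page)
            yield from records
            if len(records) < self._page_size:  # assumed: a short page is the last one
                return
            page += 1


# Fake in-memory data source standing in for a real API (illustration only).
DATA = [{"id": i} for i in range(250)]
calls: List[int] = []


def fake_fetch(page: int, page_size: int) -> List[dict]:
    calls.append(page)
    start = page * page_size
    return DATA[start:start + page_size]


client = PoliteClient(fake_fetch, page_size=100, min_interval=0.0)
first = list(client.iter_records())
second = list(client.iter_records())  # second pass is served entirely from the cache
print(f"{len(first)} records fetched in {len(calls)} requests")
```

The clock and sleep functions are injected only so the rate limiting can be tested without real waiting; a production client would more likely also honor whatever limits the server itself advertises.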

Open source under pressure

⚠️ Google Summer of Code 2026: Agents are so good at "good first issues" that evaluating candidates is near-impossible.

What should we build?

Before: our team built software for cancer researchers.
Now: researchers can build much of it themselves.

🗄️ Data organization: nobody else will organize our data for us. Curation remains our job.
🔗 Integrated experiences: chat + visualization + data is the new interface. Accuracy in our domain is the hard part.
📏 Benchmarks: comprehensive test suites so any tool can be rewritten and validated. May matter more than the tool itself.
⚠️ The 95% illusion: a vibe-coded prototype looks nearly done — but maintenance, correctness, and collaborative development are the hard part.

Conversation starters

  1. What to build? — data, benchmarks, integrated experiences
  2. Rewrites — benchmarks may matter more than the tool itself
  3. Agent traffic — our APIs are now consumed by machines, not just people
  4. Open source — more PRs, harder to evaluate contributors