The gap between raw biological data and actionable clinical insight is vast, and crossing it is still largely manual.
My work sits precisely in that gap. I build across the full stack of this problem: multi-agent AI pipelines that reason over clinical trial criteria, spatial and single-cell transcriptomics analyses that map tumor microenvironments at cellular resolution, regulatory-grade clinical data pipelines built to CDISC standards, and full-stack applications that surface these insights to the people who need them. The goal is always the same: make the biology computable and the computation trustworthy.
| Status | Project | Focus |
|---|---|---|
| 🛠 Ongoing | SpatioCore-Flow | Gated multi-agent orchestrator for single-cell and spatial transcriptomics with Code-in-the-Loop hallucination prevention |
| 🟢 Active | ClinPilot | Refactoring the orchestration layer into a stateless FastAPI backend for multi-user support |
| 🟢 Active | LabTasker | Token persistence, email deadline reminders, analytics view, and time tracking per task |
| 🟢 Active | CargoURL AI | Supabase integration and live frontend connection to replace mock data with real API responses |
How do you make an LLM-generated clinical trial report trustworthy enough for a patient, a clinician, and a regulatory reviewer to act on?
**Problem:** LLMs can summarize clinical trial eligibility criteria fluently, but they hallucinate, paraphrase, and omit without flagging it. In a regulatory context, an unverified output is not just wrong, it's a liability.
**Solution:** A 5-agent sequential pipeline that retrieves live trial data from ClinicalTrials.gov, cross-references PubMed literature and FDA drug labels, then audits its own outputs before returning a structured 9-section report written simultaneously for patients, clinicians, and regulatory reviewers.
| Agent | Model | Role |
|---|---|---|
| Analyst | gpt-4o-mini | Extracts top inclusion and exclusion criteria from raw eligibility text |
| Researcher | gpt-4o-mini | Summarizes relevant PubMed abstracts; notes support or tension with criteria |
| Advocate | gpt-4o-mini | Produces plain-language patient report (trial overview, checklists, next steps) |
| Auditor | gpt-4o | Self-verification loop: grounds every criterion in verbatim source quotes; prunes unverifiable claims |
| Guardrail | gpt-4o | Compliance & risk review against 21 CFR Part 312 and ICH E6; Fairness & Diversity Audit per NIH/FDA guidelines |
**Tech Stack**
- **Data sources:** ClinicalTrials.gov V2 API · PubMed E-utilities · FDA Drug Label API
- **Verification:** Auditor self-verification loop (verbatim citation binding; unverifiable claims pruned)
- **Output:** 9-section progressive-disclosure report + downloadable PDF certificate
- **Caching:** SQLite (audit cache by NCT ID) · ChromaDB (past audit retrieval)
- **Models:** gpt-4o-mini (extraction/summarization) · gpt-4o (verification/compliance)
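The audit step is the load-bearing part of this design: a claim that cannot be tied to a verbatim quote from the source record is dropped rather than paraphrased. A minimal sketch of that gate, with toy stand-ins for the LLM agents (none of these names are the actual ClinPilot code):

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    source_quote: str = ""  # verbatim excerpt from the ClinicalTrials.gov record

def analyst(record: str) -> list[Claim]:
    # Toy extraction: treat each line of eligibility text as one criterion.
    return [Claim(text=line.strip(), source_quote=line.strip())
            for line in record.splitlines() if line.strip()]

def auditor(claims: list[Claim], record: str) -> list[Claim]:
    # Verification gate: keep only claims whose quote appears verbatim
    # in the source record; everything else is pruned, not paraphrased.
    return [c for c in claims if c.source_quote and c.source_quote in record]

def run_pipeline(record: str) -> list[Claim]:
    claims = analyst(record)
    # Inject a hallucinated claim to show the gate at work.
    claims.append(Claim(text="Age 18-99", source_quote="hallucinated range"))
    return auditor(claims, record)

record = "Inclusion: ECOG 0-1\nExclusion: prior anti-PD-1 therapy"
report = run_pipeline(record)
print([c.text for c in report])  # the unverifiable claim is gone
```

The point of the pattern is that the gate is code, not another prompt: a substring (or citation-binding) check cannot be talked out of its decision.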
What if an AI system could analyze a tumor's spatial architecture without hallucinating biology it never actually saw in the data?
**Problem:** LLM-based bioinformatics tools fail silently: they reason over biological data but have no mechanism to verify their own inferences against the underlying genomics. In a clinical or research context, an undetected hallucination in a spatial deconvolution or cell-type annotation is not a minor error.
**Solution:** A production-grade multi-agent framework for autonomous single-cell genomics and spatial transcriptomics analysis, built on a Verification-First architecture. Every AI-generated inference is programmatically validated against the raw AnnData/Squidpy objects through a Code-in-the-Loop gate before propagation: if the Biological Consistency Score drops below 0.8, the inference is rejected and the agent reruns with adjusted constraints.
| Agent | Role |
|---|---|
| Curator | Maps raw input to the correct biological coordinate system (dissociated vs. in situ) |
| Analyst | Executes foundation models (scGPT, Geneformer, Tangram, cell2location) for deconvolution and ST prediction |
| Validator | Code-driven verification of Analyst claims against source data |
| Synthesizer | Merges "What" (RNA) with "Where" (Spatial) into a tumor microenvironment graph |
| Auditor | Source-to-bit traceability linking every claim to a gene index or pixel coordinate |
| Guardrail | Evaluates outputs against FDA/SaMD risk frameworks and clinical literature |
**Tech Stack**
- **Architecture:** Hybrid DAG orchestration · Sandbox-Gate pattern
- **Validation:** Code-in-the-Loop · Biological Consistency Score (BCS ≥ 0.8)
- **Models:** scGPT · Geneformer · Tangram · cell2location · StarDist
- **Storage:** PostgreSQL (audit trail) · ChromaDB (RAG) · Redis (cache)
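The reject-and-rerun loop is small enough to sketch. Everything here is an illustrative stand-in (the scoring function, the agent, the data shapes), not the framework's real interfaces; only the gating logic is the point:

```python
BCS_THRESHOLD = 0.8  # inferences scoring below this are rejected

def consistency_score(inference: dict, data: dict) -> float:
    # Toy check: fraction of predicted cell types actually supported
    # by marker genes observed in the raw matrix.
    predicted = set(inference["cell_types"])
    observed = set(data["observed_markers"])
    return len(predicted & observed) / max(len(predicted), 1)

def analyst(data: dict, constraints: dict) -> dict:
    # Stand-in for a foundation-model call; constraints narrow the label space.
    labels = [t for t in data["candidate_types"] if t not in constraints["excluded"]]
    return {"cell_types": labels}

def gated_run(data: dict, max_retries: int = 3) -> dict:
    constraints = {"excluded": set()}
    for _ in range(max_retries):
        inference = analyst(data, constraints)
        score = consistency_score(inference, data)
        if score >= BCS_THRESHOLD:
            return {"inference": inference, "bcs": score}
        # Reject, tighten constraints, and rerun instead of propagating.
        unsupported = set(inference["cell_types"]) - set(data["observed_markers"])
        constraints["excluded"] |= unsupported
    raise RuntimeError("inference failed Code-in-the-Loop validation")

data = {"candidate_types": ["T cell", "B cell", "unicorn cell"],
        "observed_markers": ["T cell", "B cell"]}
result = gated_run(data)
print(result["bcs"])  # 1.0 after one rejected pass
```

The first pass scores 2/3 and is rejected; the unsupported label is excluded and the rerun passes the gate.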
**Problem:** Gastric cancer subtypes are clinically heterogeneous, and bulk profiling fails to resolve the malignant, immune, and fibroblast populations driving treatment resistance.
**Solution:** Single-cell RNA-seq pipeline from raw count matrices through clustering, cell-type annotation, differential expression, and trajectory inference, reconstructing the cellular landscape of gastric tumor samples.
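The first step after loading raw counts is depth normalization; this stdlib-only illustration mirrors the standard library-size normalization plus log1p transform (what Scanpy's `sc.pp.normalize_total` followed by `sc.pp.log1p` does) on plain lists rather than an AnnData object:

```python
import math

def normalize_log1p(counts: list[list[float]], target_sum: float = 1e4) -> list[list[float]]:
    """Scale each cell (row) to target_sum total counts, then apply log1p.

    This removes sequencing-depth differences so that downstream
    clustering compares expression fractions, not read totals.
    """
    out = []
    for cell in counts:  # one row per cell, one column per gene
        total = sum(cell) or 1.0
        out.append([math.log1p(c * target_sum / total) for c in cell])
    return out

raw = [[10, 0, 90],   # cell 1: 100 total counts, gene 0 is 10% of them
       [1, 1, 8]]     # cell 2: 10 total counts, gene 0 is also 10%
norm = normalize_log1p(raw)
# Equal expression fractions now map to equal values despite 10x depth difference.
print(norm[0][0] == norm[1][0])  # True
```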
**Tech Stack**
**Problem:** Standard scRNA-seq dissolves tissue architecture: you lose the spatial organization of tumor, immune, and stromal compartments that determines how cancer progresses and resists treatment.
**Solution:** End-to-end spatial transcriptomics pipeline on a 10x Xenium in situ dataset from a human breast cancer FFPE section: cell type mapping, spatially resolved gene expression patterns, and tissue architecture characterization at single-cell resolution.
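Spatial analyses of this kind start from a neighborhood graph over cell centroids (Squidpy's `sq.gr.spatial_neighbors` builds one, typically via a KD-tree). A brute-force sketch of the same idea, with made-up coordinates:

```python
import math

def knn_graph(coords: list[tuple[float, float]], k: int = 2) -> dict[int, list[int]]:
    """Brute-force k-nearest-neighbour graph over cell centroids.

    Each cell is linked to its k closest neighbours; neighbourhood
    enrichment and niche detection operate on this graph.
    """
    graph = {}
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords) if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Four cells: three clustered in a tumour nest, one distant stromal cell.
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
graph = knn_graph(cells, k=2)
print(graph)  # the outlier's nearest neighbours are still the nest cells
```

Brute force is O(n²) and only suitable for a sketch; real datasets with hundreds of thousands of cells need the tree-based construction.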
**Tech Stack**
**Problem:** Regulatory submissions to the FDA and EMA require clinical trial data in CDISC-standard formats. Transforming raw trial datasets into analysis-ready ADaM structures, and from there into auditable TLF outputs, demands both statistical rigor and deep familiarity with submission standards.
**Solution:** End-to-end clinical data pipeline from SDTM source datasets (DM, AE, LB) through ADaM derivation (ADSL subject-level, ADAE adverse events) to regulatory-grade tables, listings, and figures. Fully automated via a single `run_all.R` entry point.
**Tech Stack**
- **Standards:** CDISC SDTM → ADaM (ADSL, ADAE)
- **Outputs:** AE summary tables, demographics TLFs, treatment-emergent AE flags
- **Pipeline:** Modular R scripts · `run_all.R` single-command execution
Research workflows don't fit generic project managers: they need Kanban boards that understand experiments, not sprints.
**Problem:** Lab teams lose track of experiments, deadlines, and task ownership across spreadsheets and email threads. Generic project tools lack the domain-specific structure that research workflows require.
**Solution:** Full-stack research task management application with a drag-and-drop Kanban board (To Do / In Progress / Done), per-project progress tracking, and JWT-authenticated user accounts. Frontend deployed on Render with a live demo.
- **Frontend**
- **Backend**
- **Auth:** JWT · bcrypt · protected routes
- **Features:** Drag-and-drop reorder, due dates, progress bars, per-user scoping
- **Live demo:** https://labtasker-frontend.onrender.com
A link management platform needs more than redirect counts: it needs to tell you when to post, who's clicking, and whether the numbers are moving.
**Problem:** Raw click data from a URL shortener is noise without context. Posting time, audience composition, and CTR deltas require time-series forecasting and segmentation, not just counting.
**Solution:** A Flask REST API that turns raw click event streams into optimization signals: high-engagement posting window prediction via Prophet, audience segmentation via KMeans clustering, and CTR delta reporting. Built as the analytics and AI backend for CargoURL.
**Tech Stack**
- **Pipeline:** Click events → Prophet forecasting → KMeans segmentation → CTR delta
- **Deployment:** Railway / Render (Nixpacks)
- **Status:** API functional and deployed; frontend integration in progress
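The Prophet and KMeans stages need their respective libraries, but the CTR-delta stage is plain arithmetic. A sketch with illustrative window tuples of `(clicks, impressions)`; the shapes and field names are assumptions, not the live API's schema:

```python
def ctr(clicks: int, impressions: int) -> float:
    return clicks / impressions if impressions else 0.0

def ctr_delta(current: tuple[int, int], previous: tuple[int, int]) -> dict:
    """Compare click-through rate across two reporting windows and
    report the relative change as a percentage."""
    cur, prev = ctr(*current), ctr(*previous)
    change = (cur - prev) / prev * 100 if prev else float("inf")
    return {"current_ctr": round(cur, 4),
            "previous_ctr": round(prev, 4),
            "delta_pct": round(change, 1)}

# This week: 90 clicks / 1200 impressions; last week: 60 / 1000.
report = ctr_delta((90, 1200), (60, 1000))
print(report)  # CTR moved from 6% to 7.5%, a +25% relative change
```

Reporting the relative delta rather than the raw difference is what makes small-percentage metrics like CTR legible to non-analysts.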
| Domain | Tools & Methods |
|---|---|
| Single-Cell Genomics | Scanpy, Seurat, scRNA-seq clustering, trajectory inference |
| Spatial Transcriptomics | 10x Xenium (in situ), Squidpy, Tangram, cell2location, spatial gene expression analysis |
| Biological Foundation Models | scGPT, Geneformer, StarDist, spatial deconvolution |
| Clinical Data Standards | CDISC SDTM/ADaM, TLF generation, admiral (R), regulatory pipelines |
| Statistical Analysis | R/Bioconductor, differential expression, survival analysis |
| Data Engineering | Pandas, NumPy, high-performance Python, HPC/SLURM |
| Domain | Tools & Methods |
|---|---|
| Multi-Agent Systems | CrewAI, LiteLLM, Hybrid-DAG orchestration, deliberation frameworks |
| Agentic Validation | Code-in-the-Loop verification, Biological Consistency Scoring, Pydantic schemas |
| Retrieval-Augmented Generation | ChromaDB, vector embeddings, semantic search |
| LLM Integration | OpenAI API (gpt-4o, gpt-4o-mini), prompt engineering |
| Caching & Optimization | SQLite semantic cache, inference cost reduction |
| Regulatory AI | FDA-aware guardrail agents, citation integrity, audit trails |
| Domain | Tools & Methods |
|---|---|
| Frontend | React 18, TypeScript, Tailwind CSS, shadcn/ui, Vite |
| Backend | Node.js, Express, Flask, RESTful API design |
| Database | MongoDB Atlas, PostgreSQL, SQLite, Redis, data modeling |
| Auth & Security | JWT, bcrypt, role-based access control |
| DevOps | Git, GitHub Actions, Linux/Bash, Render, Railway |
"The most consequential code running today is the code that interprets biology."