Papers

Each paper's repo includes source, data, and build tooling. Repos will be made public when the paper is submitted.

arXiv cs.SE Draft

Representation Over Retrieval

How information presentation affects AI performance on structured engineering tasks. The anchor paper for our benchmark findings.

arXiv cs.SE Draft

Beyond the Mean

Per-task analysis reveals hidden structure in tool-augmented LLM evaluation. Why aggregate scores hide the interesting signal.

arXiv cs.SE Draft

sysml-bench

Evaluating tool-augmented LLMs on SysML v2 model comprehension. Covers methodology, tasks, scoring, and replication.

arXiv cs.SE Draft

Grammar Conversion at Scale

From specification notation to parser generators for SysML v2. How kebnf bridges the gap between OMG specs and working parsers.

arXiv cs.SE Draft

Context Asymmetry Is a Representation Problem

Disposition graphs for AI-augmented collaborative development. The formal model behind synthesist's stakeholder tracking.

ICSE NIER 2026 Under review

Anti-Vacuity Enforcement by Construction

A test framework pattern for LLM agents. Currently under double-anonymous review.

GVSETS 2026

AI-Assisted Systems Engineering with SysML v2

Applying the benchmark methodology to defense ground vehicle systems modeling.