Available for Summer 2026 Opportunities

<E.Justice/>

ML Researcher | LLM Inference Optimization | vLLM

Hi, I'm Ethan Justice - a Computer Science student at the University of Michigan researching KV-cache optimization and multi-agent LLM systems. I build production-grade ML infrastructure on HPC clusters.

Years Coding: 8+
Internships: 5
Projects: 11+
Research Focus: LLMs

02. Experience Timeline

From ML research to enterprise software engineering, building production systems at scale.

03. Work & Projects

From ML research to hackathon wins, building solutions across the stack.


Machine Learning Researcher

research | Featured
Nov 2025 - Present
University of Michigan - RobustNet Lab
Ann Arbor, Michigan (hybrid)

Optimization of Large Language Model inference infrastructure focusing on KV-cache management.

  • Productionizing research from the paper 'Compute Or Load KV Cache? Why Not Both?' into the vLLM ecosystem
  • Implementing bidirectional KV prefill logic to minimize Time-to-First-Token (TTFT) latency
  • Porting experimental logic from legacy codebases to current main branches of vLLM and LMCache
  • Optimizing GPU memory transfer overheads on Great Lakes HPC (A100/H100) clusters
Python, vLLM, LMCache, CUDA, Nvidia Nsight, A100, H100

AI Engineering Intern

internship | Featured
Aug 2025 - Nov 2025
PersistOS, Inc.
San Francisco Bay Area (remote)

Core infrastructure engineer for an agentic memory platform; built the complete multimodal interaction pipeline.

  • Engineered the entire end-to-end multimodal pipeline, enabling seamless file uploads and voice interfaces
  • Achieved sub-second voice-to-voice agent interactions through optimized audio processing streams
  • Implemented a semantic caching layer via Convex, reducing API response times by over 99%
  • Built an 'LLM-as-a-Judge' testing suite to quantitatively evaluate agent memory performance
Python, Convex, Redis, GCP, Next.js, TypeScript, OpenAI API

Efficient Heterogeneous LLM Multi-Agent Debate

class project | Featured
Aug 2025 - Dec 2025
EECS 498 - Machine Learning Research

Research framework and class project reducing inference costs for complex reasoning tasks by 40% via heterogeneous agents.

  • Achieved 40% reduction in total FLOPs using a confidence-based gating mechanism for agent responses
  • Deployed high-performance inference pipelines on Great Lakes HPC cluster using Slurm scheduling
  • Implemented a Factory Pattern to modularly switch backends between vLLM and HuggingFace
  • Identified 'Syntactic Determinism' failure modes in token-confidence calibration
Python, PyTorch, vLLM, Slurm, Llama-3, DeepSeek, Mathstral, HuggingFace

Information Digest

project | Featured
Aug 2025 - Aug 2025

Agentic information retrieval pipeline with robust Infrastructure-as-Code and CI/CD automation.

  • Automated complete infrastructure lifecycle using Terraform and GitHub Actions
  • Engineered an LLM agent that autonomously decomposes complex queries and synthesizes answers
  • Developed comprehensive Pytest suites ensuring reliability before deployment to GCP Cloud Run
  • Designed a strongly-typed REST interface guaranteeing deterministic JSON outputs
Python, LangChain, FastAPI, Docker, Terraform, GCP Cloud Run, GitHub Actions

Cycle-Accurate Processor Simulator

class project
Oct 2025 - Dec 2025
EECS 370 - Computer Organization

High-performance microarchitectural simulator modeling a 5-stage pipelined CPU with a configurable cache hierarchy.

  • Engineered a 5-stage pipeline simulator (IF, ID, EX, MEM, WB) achieving cycle-accurate execution of LC-2K binaries
  • Implemented hazard detection logic to handle data dependencies via forwarding units and stall injection
  • Integrated a unified instruction/data cache with configurable associativity and block size, using write-back and write-allocate policies
  • Used Least Recently Used (LRU) eviction in the memory access simulation to reduce miss rates
C, GCC, Make, Linux
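
A minimal sketch of LRU eviction in a set-associative cache, written in Python for brevity rather than the project's C. The class name, parameters, and word-addressed layout are illustrative assumptions, not the simulator's actual code.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy set-associative cache with LRU eviction (hypothetical names)."""

    def __init__(self, block_size, num_sets, ways):
        self.block_size = block_size  # words per block
        self.num_sets = num_sets      # number of sets
        self.ways = ways              # blocks per set (associativity)
        # One OrderedDict per set: keys are tags, ordered oldest -> newest.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Touch a word address; return True on a hit, False on a miss."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)     # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used tag
        lines[tag] = True
        return False


cache = SetAssociativeCache(block_size=4, num_sets=2, ways=2)
for addr in [0, 8, 0, 16, 0, 8]:
    print(addr, "hit" if cache.access(addr) else "miss")
```

In this trace, addresses 0, 8, and 16 all map to set 0, so the third distinct block forces an eviction and the block touched least recently is the one dropped.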

LC-2K Compilation Toolchain

class project
Sep 2025 - Oct 2025
EECS 370 - Computer Organization

End-to-end translation system converting assembly code into executable machine binaries via intermediate object files.

  • Developed a two-pass assembler generating object files with symbol tables and relocation entries for global label resolution
  • Built a static linker capable of combining multiple object files, resolving cross-file dependencies, and managing stack allocation
  • Wrote optimized LC-2K assembly algorithms, including a recursive combination function managing stack frames and return addresses
  • Implemented bitwise simulation of the LC-2K ISA to validate machine code execution behavior
C, LC-2K Assembly, GCC, Make

Software Engineering Intern

internship
May 2025 - Aug 2025
Little Caesars Pizza
Detroit, Michigan (hybrid)

Backend development and DevOps automation for enterprise systems serving 5,000+ franchise locations.

  • Architected a C# .NET microservice to handle SMS automation at the scale of 5,000+ stores
  • Designed and implemented full CI/CD pipelines in Azure DevOps to automate deployment workflows
  • Created a batch-processing API that achieved a 90% reduction in developer time for data management tasks
  • Built compliance tracking tools for franchise store hours using MongoDB and Docker
C#, .NET, Azure DevOps, MongoDB, Docker, Twilio

Deep Metric Learning for Facial Recognition

class project
Mar 2025 - Apr 2025
EECS 442 - Computer Vision

Facial recognition system built using ResNet-18 and non-parametric instance discrimination.

  • Fine-tuned a ResNet-18 backbone using Contrastive Loss to optimize feature embeddings for facial identity
  • Implemented Non-Parametric Instance Discrimination (NPID) to maintain a memory bank of feature vectors
  • Visualized high-dimensional embedding clusters using t-SNE to verify class separation
  • Engineered a custom CNN architecture to benchmark against pre-trained models on a held-out dataset
Python, PyTorch, ResNet-18, Matplotlib, SciPy

Interview Bot Pro

class project
Feb 2025 - Mar 2025
EECS 487 - User Interface Design

AI-powered interview practice application featuring auditory analysis and content grading.

  • Trained a Random Forest ML model on the MIT interview dataset for voice prosody analysis
  • Integrated Google Gemini to grade answers based on STAR method structure and job requirements
  • Developed logic for dynamic follow-up questions tailored to previous user responses
Python, Google Gemini, Random Forest, Scikit-learn, Speech Prosody Analysis

Multi-Resolution Image Blending

class project
Feb 2025 - Feb 2025
EECS 442 - Computer Vision

Seamless image combination engine using Gaussian and Laplacian pyramids.

  • Constructed 4-level Laplacian Pyramids to decompose images into distinct frequency bands
  • Implemented seamless blending masks to combine images without visible seams or artifacts
  • Reconstructed high-fidelity final images by collapsing pyramid levels upsampled via Gaussian kernels
Python, NumPy, SciPy, Matplotlib

Steerable Filter Edge Detection

class project
Jan 2025 - Feb 2025
EECS 442 - Computer Vision

Advanced edge detection system using gradient filters and convolution math.

  • Designed steerable filters to detect edges at arbitrary angles (π/4, π/2, 3π/4)
  • Implemented Gaussian and Box filters to analyze noise reduction trade-offs in edge detection
  • Calculated gradient magnitude and orientation maps to visualize structural image features
Python, NumPy, SciPy
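
The steering idea can be sketched as follows: a first-derivative filter at angle θ is a linear combination of the x and y derivative kernels, cos θ · Gx + sin θ · Gy. This is a sketch under assumptions: the 3x3 Sobel basis, the pure-Python correlation helper, and all function names are illustrative and not taken from the project.

```python
import math

# Basis kernels: 3x3 Sobel derivatives (an assumed choice of basis).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def steer(theta):
    """Return the derivative kernel oriented at angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * gx + s * gy for gx, gy in zip(row_x, row_y)]
            for row_x, row_y in zip(SOBEL_X, SOBEL_Y)]


def apply_at(image, kernel, row, col):
    """Correlate a 3x3 kernel with the image patch centered at (row, col)."""
    return sum(kernel[i][j] * image[row - 1 + i][col - 1 + j]
               for i in range(3) for j in range(3))


# A vertical step edge: dark on the left, bright on the right.
image = [[0, 0, 1, 1]] * 4
for theta in (math.pi / 4, math.pi / 2, 3 * math.pi / 4):
    print(round(apply_at(image, steer(theta), 1, 1), 3))
```

Because the filter is linear in its basis, only two convolutions of the image are needed; the response at any other angle follows from the same weighted sum.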

Multi-Threaded Network File Server

class project
Nov 2024 - Present
EECS 482 - Operating Systems

Concurrent network file server using fine-grained locking and crash-consistent disk logging.

  • Implemented fine-grained reader/writer locks (`boost::shared_mutex`) for high-concurrency file access
  • Ensured file system consistency and crash recovery via strictly ordered disk writes (inode vs. data)
  • Built hierarchical directory management and inode allocation logic
  • Designed thread-safe network communication using TCP sockets
C++, Boost Threads, TCP Sockets

Virtual Memory Pager

class project
Oct 2024 - Present
EECS 482 - Operating Systems

External pager managing virtual address spaces with eviction policies and copy-on-write sharing.

  • Implemented the 'Clock' page replacement algorithm to approximate LRU eviction efficiency
  • Engineered copy-on-write optimizations for `fork` operations to minimize physical memory usage
  • Managed swap-backed vs. file-backed pages and eager swap reservation
  • Handled page faults and memory protection bits to simulate hardware MMU behavior
C++, Linux, MMU Simulation
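
A minimal sketch of the Clock algorithm named above, in Python rather than the project's C++. The class and its fields are hypothetical, and it assumes a newly loaded page starts with its reference bit set.

```python
class ClockPager:
    """Toy Clock (second-chance) page replacement; names are illustrative."""

    def __init__(self, num_frames):
        self.frames = [None] * num_frames       # page held by each frame
        self.referenced = [False] * num_frames  # per-frame reference bit
        self.hand = 0                           # next frame the clock inspects

    def access(self, page):
        """Touch a page; return the evicted page, or None if nothing is evicted."""
        if page in self.frames:
            # Hit: just set the reference bit.
            self.referenced[self.frames.index(page)] = True
            return None
        # Fault: sweep the hand until a frame is free or unreferenced.
        while True:
            if self.frames[self.hand] is None or not self.referenced[self.hand]:
                victim = self.frames[self.hand]
                self.frames[self.hand] = page
                self.referenced[self.hand] = True
                self.hand = (self.hand + 1) % len(self.frames)
                return victim
            # Referenced recently: clear the bit and give a second chance.
            self.referenced[self.hand] = False
            self.hand = (self.hand + 1) % len(self.frames)


pager = ClockPager(num_frames=3)
evictions = [pager.access(p) for p in [1, 2, 3, 4, 2, 5]]
print(evictions)
```

In this trace, page 2 is touched again before page 5 arrives, so the hand clears its bit and skips it, evicting page 3 instead. That skip is how Clock approximates LRU with one bit per frame.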

The Situation Room - Cal Hacks

project
Oct 2024 - Oct 2024

Unity-based simulation game using AI for dialogue analysis in de-escalation scenarios.

  • Integrated Hume AI for real-time sentiment analysis of user voice input
  • Utilized Google Gemini to dynamically generate NPC responses based on conversation history
Unity, C#, Python, Google Gemini, Hume AI

Ribbet - MHacks Project

project
Oct 2024 - Oct 2024

Cross-platform social media application with betting mechanics built during MHacks.

  • Architected a MERN-stack backend to support cross-platform mobile functionality
  • Implemented full CRUD operations for user profiles, feeds, and betting transactions
MongoDB, Express.js, React Native, Node.js

Research Assistant

research
Sep 2024 - Dec 2024
University of Michigan - Interactive Sensing and Computing Lab
Ann Arbor, Michigan (on-site)

Embedded ML development for real-time user activity sensing and identification.

  • Engineered an embedded system on Orange Pi for real-time activity level identification
  • Optimized Llama 3 inference on edge hardware using strategic core routing and multithreading
  • Processed high-frequency sensor data streams for immediate classification
Python, Llama 3, Orange Pi

User-Level Thread Library

class project
Sep 2024 - Present
EECS 482 - Operating Systems

High-performance threading library supporting context switching and preemptive scheduling.

  • Implemented thread initialization and context switching using `getcontext`, `makecontext`, and `swapcontext`
  • Built synchronization primitives including Mutexes and Condition Variables with wait queues
  • Handled timer interrupts for preemptive thread scheduling and CPU time slicing
  • Designed RAII wrappers for automatic resource management and deadlock prevention
C++, Linux

Inner Voice AI

project
Jul 2024 - Jul 2024

Conversational AI therapist focusing on mental health, winner of 1st place at Stemist Hacks.

1st Place - Stemist Hacks
  • Won 1st Place at Stemist Hacks for best overall project
  • Engineered natural conversational flow using OpenAI API and Hume AI emotional analysis
  • Implemented session management and history tracking using Firebase
Python, Flask, OpenAI API, Hume AI, Firebase

Software Development Intern

internship
May 2024 - Aug 2024
United Wholesale Mortgage
Pontiac, Michigan (on-site)

Full-stack modernization of internal employee rewards platforms used by 750+ team leads.

  • Drove a 20% increase in monthly active users by re-engineering the legacy rewards platform
  • Managed backend operations and Apex controllers for a dataset exceeding 50,000 entries
  • Implemented automated testing protocols ensuring high availability for critical internal tools
JavaScript, CSS, Salesforce, Apex, Copado

Flight Systems Developer

project team
Sep 2023 - Sep 2024
M-Fly Aero Design
Ann Arbor, Michigan (on-site)

Autonomous navigation software development for M-Fly Aero Design competition planes.

  • Implemented 3DVFH* algorithm for dynamic aerial obstacle avoidance
  • Developed autonomous control loops using ROS and MAVLink for actuator management
  • Integrated remote identification systems for real-time aircraft tracking
Python, ROS, MAVLink, 3DVFH*

Lead Programmer

project team
Jun 2019 - Jun 2023
Hartland Robotics - FRC Team 3536
Hartland, Michigan (on-site)

Technical lead for FRC Team 3536; architected software for 4 competitive robots (Reaper, Blade, Raptor, NavPod).

  • Designed 'NavPod' custom sensor fusion system combining Optical Flow and IMU data for precise localization
  • Implemented Hermite spline pathing for complex autonomous trajectory generation
  • Programmed advanced drivetrain logic including Differential Swerve and field-oriented control
  • Developed autonomous turret tracking using computer vision and physics-based launch calculations
  • Mentored student engineers in C++, object-oriented design, and control theory
C++, WPILib, OpenCV, Arduino, I2C, Hermite Splines
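
A minimal sketch of cubic Hermite interpolation, the building block behind Hermite spline pathing, in Python rather than the team's C++. The function names and the 2D waypoint/tangent representation are illustrative assumptions, not the robot code.

```python
def hermite_point(p0, p1, m0, m1, t):
    """Evaluate one cubic Hermite segment at t in [0, 1].

    p0, p1 are the endpoints and m0, m1 the tangents at those endpoints,
    each given as a tuple of coordinates with the same dimension.
    """
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1   # weight of p0
    h10 = t3 - 2 * t2 + t       # weight of m0
    h01 = -2 * t3 + 3 * t2      # weight of p1
    h11 = t3 - t2               # weight of m1
    return tuple(h00 * a + h10 * ma + h01 * b + h11 * mb
                 for a, b, ma, mb in zip(p0, p1, m0, m1))


def sample_path(p0, p1, m0, m1, steps):
    """Sample steps + 1 evenly spaced points (in t) along the segment."""
    return [hermite_point(p0, p1, m0, m1, i / steps) for i in range(steps + 1)]


# Leave (0, 0) heading along +x and arrive at (1, 1) heading along +y.
for point in sample_path((0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0), 4):
    print(point)
```

Because each segment matches both position and tangent at its endpoints, consecutive segments that share a waypoint and its tangent join without a kink in heading.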

04. Education

Coursework, projects, and academic achievements at Michigan Engineering.


University of Michigan

Bachelor of Science in Engineering in Computer Science

Aug 2023 - May 2026
Ann Arbor, Michigan

Dean's List, University Honors, Michigander EV and Mobility Scholar, Undergraduate Research Opportunity Program
Activities: M-Fly Aero Design, Michigan Hackers, Claude Builder's Club

05. Contact

Interested in collaborating or have an opportunity to discuss? Let's connect.


Get in Touch

ethanjus@umich.edu
Ann Arbor, Michigan