Hello! I’m Jacob Merizian. I work on technical AI governance at the UK AI Security Institute.
Currently, I’m focused on sandbagging mitigations and evaluations. In the past, I’ve done research in high-performance computing, language model pretraining, interpretability, and hardware-enabled governance.
- Jul 2025 - UK AISI Whitebox Control Progress Update (AF, LW)
- Jul 2025 - Establishing Best Practices for Building Rigorous Agentic Benchmarks (HN)
- Jun 2025 - Seminal Plots of AI Progress
- Mar 2025 - SWE-Bench Visualizer
- Nov 2022 - Interpreting Neural Networks through the Polytope Lens (LW)
- Dec 2021 - GPT-3 knows about its evals
- Sep 2021 - City Circuits: a GPT-2 “neuron explorer”