Papers

Paper summaries, notes, and reading lists.

Multi-Pivot Attribution: Attributing Distributed Privacy Leaks in LLM Agent Trajectories

Privacy violations in LLM agent trajectories often arise through distributed information flow: multiple individually benign steps that collectively leak sensitive data, with no single step bearing full responsibility. We formalize this as a post-violation attribution problem and propose Multi-Pivot Attribution (MPA), a method that selects multiple trajectory steps for sanitization using context-aware risk scoring and greedy ranking. We benchmark five strategies spanning a safety-cost Pareto frontier, from single-pivot to full sanitization, on 180 agent trajectories with step-level violation labels. Results show that multi-pivot strategies substantially outperform single-pivot, and one variant achieves strong coverage at a fraction of the cost of full sanitization. The framework is model-agnostic and fully reproducible.
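
Note: the selection idea described in the abstract (score each step in context, then greedily pick the top-ranked steps to sanitize) can be illustrated with a minimal sketch. This is not the paper's implementation. The `Step` class, the neighbor-averaging score, and the `window`, `weight`, and `k` parameters are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    index: int
    text: str
    base_risk: float  # hypothetical per-step risk estimate in [0, 1]


def context_aware_risk(steps, i, window=1, weight=0.5):
    """Hypothetical score: a step's own risk plus a weighted mean of its neighbors' risk."""
    lo, hi = max(0, i - window), min(len(steps), i + window + 1)
    neighbors = [steps[j].base_risk for j in range(lo, hi) if j != i]
    context = sum(neighbors) / len(neighbors) if neighbors else 0.0
    return steps[i].base_risk + weight * context


def select_pivots(steps, k):
    """Greedy ranking: pick the k highest-scoring steps; ties go to the earlier step."""
    scored = [(context_aware_risk(steps, i), i) for i in range(len(steps))]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return sorted(i for _, i in ranked[:k])


# Toy trajectory with made-up risk values (assumption, not data from the paper).
trajectory = [
    Step(0, "read user profile", 0.1),
    Step(1, "look up home address", 0.6),
    Step(2, "draft email containing address", 0.7),
    Step(3, "check calendar", 0.0),
    Step(4, "send email", 0.2),
]

pivots = select_pivots(trajectory, k=2)
```

In this sketch, `k=1` plays the role of a single-pivot strategy and `k=len(trajectory)` the role of full sanitization, so intermediate values of `k` trade coverage against sanitization cost.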

2026 LLM Agent

PixelTopo-Gen: Teaching Pure-Text LLMs to Understand Space by Generating 0/1 Pixel Art

Recent work (Vision Banana) has demonstrated that visual generation models can "understand" by generating RGB images—proving that their ability to create accurate visual content reflects genuine visual comprehension. This raises a dual question for pure-text large language models (LLMs): Can they demonstrate spatial understanding by generating 0/1 pixel art, the simplest possible representation of 2D space?

2026 NLP LoRA LLM Interpretability