
Paper Breakdowns

Frontier Reports. Broken down for your next ML interview.

Google Brain · Feb 22, 2026

Attention Is All You Need

Ashish Vaswani, Noam Shazeer +6 more

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
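The scaled dot-product attention at the heart of the Transformer fits in a few lines of NumPy. A minimal single-head sketch (the paper's multi-head variant adds learned projections per head; shapes here are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # (L_q, L_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)   # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V                             # weighted sum of values

# Toy example: 4 query positions attend over 4 key/value positions, d_k = 8.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Every query's output is a convex combination of the value rows, with mixing weights set by query-key similarity; that is the whole mechanism the paper builds on.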

foundational · attention · transformers · architecture
DeepSeek · Feb 22, 2026

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

DeepSeek-AI, Aixin Liu +3 more

DeepSeek-V3.2 introduces DeepSeek Sparse Attention (DSA), an efficient attention mechanism that reduces computational complexity from O(L²) to O(Lk) while preserving performance in long-context scenarios. Through a robust reinforcement learning protocol using improved GRPO and scaling post-training compute to over 10% of pre-training cost, V3.2 performs comparably to GPT-5. The high-compute variant DeepSeek-V3.2-Speciale surpasses GPT-5 on math and achieves gold medals at IMO 2025, IOI 2025, and ICPC World Finals 2025. A novel large-scale agentic task synthesis pipeline generates 85,000+ training prompts across 1,800+ environments.
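The O(L²) → O(Lk) reduction comes from each query attending to only k selected keys instead of all L. A minimal top-k sketch (illustrative only: it still computes full scores to pick the keys, whereas DSA selects them with a cheap learned indexer precisely to avoid that quadratic step):

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Each query attends to only its k highest-scoring keys, so the softmax
    and value aggregation cost O(L*k) rather than O(L^2)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # full scores, used here only to pick keys
    L = scores.shape[0]
    out = np.empty_like(Q)
    for i in range(L):
        idx = np.argpartition(scores[i], -k)[-k:]  # indices of the k best keys
        s = scores[i, idx]
        w = np.exp(s - s.max())
        w /= w.sum()                               # softmax over k keys only
        out[i] = w @ V[idx]                        # mix only k value rows
    return out

rng = np.random.default_rng(1)
Q = rng.standard_normal((6, 4))
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 4))
out = topk_sparse_attention(Q, K, V, k=3)
```

With k = L this reduces exactly to dense attention, which is why quality can be preserved when the selected keys capture the mass of the softmax.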

frontier · MoE · sparse-attention · reinforcement-learning · +3
Zhipu AI / Tsinghua University · Feb 22, 2026

GLM-5: From Vibe Coding to Agentic Engineering

Aohan Zeng, Xin Lv +11 more

We present GLM-5, a next-generation foundation model designed to shift the paradigm from vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference costs while maintaining long-context fidelity. To advance model alignment and autonomy, we implement a new asynchronous reinforcement learning infrastructure that drastically improves post-training efficiency by decoupling generation from training. Furthermore, we propose novel asynchronous agent RL algorithms that further improve RL quality, enabling the model to learn from complex, long-horizon interactions more effectively. Through these innovations, GLM-5 achieves state-of-the-art performance on major open benchmarks.
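Decoupling generation from training is, at its core, a producer/consumer pattern: rollout workers keep filling a buffer while the trainer drains it, so neither side idles waiting for the other. A minimal threaded sketch of the general idea (not GLM-5's actual infrastructure; `rollout_fn` and `train_step` are illustrative placeholders):

```python
import queue
import threading

def run_async_rl(n_rollouts, rollout_fn, train_step, buffer_size=8):
    """Generation and training run concurrently: a producer thread pushes
    finished rollouts into a bounded queue while the trainer consumes them."""
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        for i in range(n_rollouts):
            buf.put(rollout_fn(i))  # e.g. sample a long-horizon trajectory
        buf.put(None)               # sentinel: generation finished

    threading.Thread(target=producer, daemon=True).start()
    steps = 0
    while (item := buf.get()) is not None:
        train_step(item)            # gradient update on a completed rollout
        steps += 1
    return steps

log = []
n_steps = run_async_rl(10, rollout_fn=lambda i: i * i, train_step=log.append)
```

The bounded queue also provides backpressure: if the trainer falls behind, generation pauses instead of accumulating stale rollouts without limit.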

advanced · reinforcement-learning · agentic · sparse-attention · +2
OpenAI · Feb 22, 2026

GPT-OSS-120B & GPT-OSS-20B: OpenAI's First Open-Weight Reasoning Models

OpenAI, Sandhini Agarwal +5 more

OpenAI releases gpt-oss-120b (116.8B total, 5.1B active) and gpt-oss-20b (20.9B total, 3.6B active), their first open-weight reasoning models under Apache 2.0. Both use efficient MoE transformer architectures with 128 and 32 experts respectively, trained via large-scale distillation and reinforcement learning similar to o3. MXFP4 quantization enables the 120B model to run on a single 80GB GPU. The models feature variable-effort reasoning (low/medium/high), agentic tool use (browsing, Python, function calling), and a novel Harmony chat format with instruction hierarchy. On AIME 2025, gpt-oss-120b scores 97.9% and gpt-oss-20b scores 98.7%, competitive with o3 and o4-mini.
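The gap between "total" and "active" parameters falls out of top-k expert routing: every token runs only a few of the many experts, selected by a learned gate. A toy sketch of the routing step (expert count, dimensions, and top-k here are illustrative, not the released models' configuration):

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k):
    """Route each token to its top_k experts and mix their outputs with
    softmax gate weights; only top_k experts execute per token."""
    logits = x @ gate_w                         # (n_tokens, n_experts)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        idx = np.argsort(logits[t])[-top_k:]    # chosen experts for this token
        g = np.exp(logits[t, idx] - logits[t, idx].max())
        g /= g.sum()                            # normalize gate weights
        for w, e in zip(g, idx):
            out[t] += w * experts[e](x[t])      # run only the selected experts
    return out

# Toy setup: 8 experts, each a linear map; route each token to 2 of them.
rng = np.random.default_rng(0)
d, n_experts = 16, 8
gate_w = rng.standard_normal((d, n_experts))
experts = [(lambda W: (lambda v: v @ W))(rng.standard_normal((d, d)))
           for _ in range(n_experts)]
x = rng.standard_normal((5, d))
y = moe_forward(x, gate_w, experts, top_k=2)
print(y.shape)  # (5, 16)
```

Scaling the expert count grows total capacity while per-token compute stays pinned to the top_k experts actually executed, which is how a 116.8B-parameter model can run only 5.1B parameters per token.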

advanced · MoE · reasoning · open-weights · +3
Moonshot AI · Feb 22, 2026

Kimi K2: Open Agentic Intelligence

Kimi Team, Yifan Bai +6 more

We introduce Kimi K2, a Mixture-of-Experts large language model with 32 billion activated parameters and 1 trillion total parameters. K2 uses the MuonClip optimizer and was pre-trained on 15.5 trillion tokens with zero loss spikes. Post-training involves a large-scale agentic data synthesis pipeline and a joint reinforcement learning stage. K2 achieves state-of-the-art performance among open-source non-thinking models with scores including 65.8 on SWE-Bench Verified and 53.7 on LiveCodeBench v6.
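MuonClip pairs the Muon optimizer with a logit-capping step commonly described as qk-clip: when the largest attention logit exceeds a threshold, the query and key weights are rescaled to pull it back under, which is the mechanism credited for spike-free pre-training. A hedged sketch of that capping idea only (the Muon update itself and per-head bookkeeping are omitted; `tau` and all shapes are illustrative):

```python
import numpy as np

def qk_clip(W_q, W_k, X, tau=10.0):
    """If the largest |Q K^T / sqrt(d)| logit on batch X exceeds tau, shrink
    W_q and W_k (splitting the factor between them) so logits stay bounded."""
    d = W_q.shape[1]
    Q, K = X @ W_q, X @ W_k
    max_logit = np.abs(Q @ K.T / np.sqrt(d)).max()
    if max_logit > tau:
        gamma = tau / max_logit
        W_q *= np.sqrt(gamma)   # logits scale linearly in W_q * W_k,
        W_k *= np.sqrt(gamma)   # so sqrt(gamma) on each caps them at tau
    return W_q, W_k, max_logit

# Demo: deliberately oversized weights trip the clip on the first pass.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W_q = 50 * rng.standard_normal((8, 8))
W_k = 50 * rng.standard_normal((8, 8))
W_q, W_k, before = qk_clip(W_q, W_k, X, tau=10.0)
_, _, after = qk_clip(W_q, W_k, X, tau=10.0)
```

Because logits are bilinear in the two weight matrices, multiplying each by the square root of the shrink factor caps the maximum logit exactly at the threshold.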

advanced · MoE · optimizer · agentic · +3
