Search the public knowledge base.
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
1 article>Diffusion, flow matching, VAEs, and beyond.
0 articles>Reinforcement Learning for Large Language Models: Group Relative Policy Optimization (GRPO)
1 article>Linear algebra, low-rank methods, and optimization.
0 articles>4 results

Speculative Decoding in LLMs

MoE architecture for efficient LLM scaling via specialized experts

GRPO and RL for LLMs

Language models that recursively refine or compose intermediate reasoning/representations.