Search the public knowledge base.
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
1 article>Diffusion, flow matching, VAEs, and beyond.
0 articles>Reinforcement Learning for Large Language Models: Group Relative Policy Optimization (GRPO)
1 article>Linear algebra, low-rank methods, and optimization.
0 articles>3 results

MoE architecture for efficient LLM scaling via specialized experts

GRPO and RL for LLMs

Language models that recursively refine or compose intermediate reasoning/representations.