Qwen2.5-Max Technical Report

Details on large-scale MoE training with 20T tokens and RLHF optimization (Jan 2025)