Introduced in December 2023, this model uses about 12 billion parameters during inference, which improves throughput across a range of tasks.
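As a rough back-of-envelope check on what that parameter count means in practice, the sketch below estimates the weight-memory footprint of a ~12-billion-parameter model at several common inference precisions. The parameter count is taken from the sentence above; the precisions and byte sizes are standard illustrative assumptions, not figures from the source.

```python
# Back-of-envelope weight-memory estimate for ~12B inference parameters.
# Illustrative assumptions: the source does not specify the model's
# numeric precision, so several common choices are shown.

NUM_PARAMS = 12e9  # ~12 billion parameters used during inference

BYTES_PER_PARAM = {
    "fp32": 4,        # full precision
    "fp16/bf16": 2,   # typical inference precision
    "int8": 1,        # 8-bit quantization
    "int4": 0.5,      # 4-bit quantization
}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = NUM_PARAMS * nbytes / 2**30  # bytes -> GiB
    print(f"{precision:>9}: ~{gib:,.1f} GiB of weight memory")
```

At 16-bit precision this works out to roughly 22 GiB for the weights alone, which is why the number of parameters touched during inference, rather than the total parameter count, tends to dominate serving cost and throughput.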