Microsoft releases DeepSpeed, a deep learning optimization library, and ZeRO (Zero Redundancy Optimizer), the memory optimization technique at its core, which together make training trillion-parameter models feasible. The groundwork for GPT-3-scale training is laid.