Nvidia is breaking the trend of Sesame Street-themed models with the new MegatronLM.

bad DL joke

But also 8.3 billion parameters? Seriously? I can't help but feel like this is designed to sell Tesla GPUs. That's gonna take a lot of VRAM.
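For a rough sense of "a lot", here's a back-of-envelope sketch. The bytes-per-parameter figures are my assumptions (fp16 or fp32 weights, and a common mixed-precision Adam setup), not anything Nvidia has stated, and activations are ignored entirely.

```python
# Back-of-envelope VRAM estimate. The per-parameter byte counts below are
# assumptions for illustration, not figures published for this model.
PARAMS = 8.3e9  # parameter count quoted above


def gigabytes(n_params, bytes_per_param):
    """Memory in GB (10^9 bytes) for n_params at a given per-parameter cost."""
    return n_params * bytes_per_param / 1e9


# Weights only, at two common precisions.
weights_fp16 = gigabytes(PARAMS, 2)
weights_fp32 = gigabytes(PARAMS, 4)

# Assumed mixed-precision Adam training state per parameter:
# fp16 weights (2) + fp16 gradients (2) + fp32 master weights (4)
# + two fp32 Adam moment estimates (4 + 4) = 16 bytes.
training_adam = gigabytes(PARAMS, 16)

print(f"fp16 weights only:      {weights_fp16:.1f} GB")
print(f"fp32 weights only:      {weights_fp32:.1f} GB")
print(f"Adam training state:    {training_adam:.1f} GB (excluding activations)")
```

Under those assumptions the training state alone is well past what a single 32 GB card holds, so it would have to be split across several GPUs.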

