BigCode [Oct 2022 - Present]
- Reducing training time and inference latency for large language models as part of the training and inference team
- Implementing Multi-Query Attention and Flash Attention in Megatron-LM for distributed training