Acceleration

Hardware and software acceleration for LLM training and inference

Papers

2023

  • (2023-02) High-throughput Generative Inference of Large Language Models with a Single GPU. Ying Sheng et al. Paper | GitHub

Useful Resources